AST (Abstract Syntax Tree)

Why
Common uses in mainstream projects and plugins: JavaScript transpilation, code minification, CSS preprocessing, ESLint, Prettier, and similar tools are all built on top of ASTs.
What
According to the grammar of a programming language, each AST node corresponds to an item in the source code.
Demo

Link: astexplorer.net
An AST explorer tool
JavaScript source

function square(n) {
  return n * n;
}

AST output

// Parser acorn-8.0.1
{
  "type": "Program",
  "start": 0,
  "end": 38,
  "body": [
    {
      "type": "FunctionDeclaration",
      "start": 0,
      "end": 38,
      "id": {
        "type": "Identifier",
        "start": 9,
        "end": 15,
        "name": "square"
      },
      "expression": false,
      "generator": false,
      "async": false,
      "params": [
        {
          "type": "Identifier",
          "start": 16,
          "end": 17,
          "name": "n"
        }
      ],
      "body": {
        "type": "BlockStatement",
        "start": 19,
        "end": 38,
        "body": [
          {
            "type": "ReturnStatement",
            "start": 23,
            "end": 36,
            "argument": {
              "type": "BinaryExpression",
              "start": 30,
              "end": 35,
              "left": {
                "type": "Identifier",
                "start": 30,
                "end": 31,
                "name": "n"
              },
              "operator": "*",
              "right": {
                "type": "Identifier",
                "start": 34,
                "end": 35,
                "name": "n"
              }
            }
          }
        ]
      }
    }
  ],
  "sourceType": "module"
}

Getting an AST from plain text (via a compiler)

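  • Lexical analysis

    Also known as the scanner. It reads our code and groups characters into tokens according to predefined rules, discarding whitespace, comments, and the like. In the end the whole program is split into a list of tokens (a one-dimensional array). The scanner reads the source one character at a time; when it encounters whitespace, an operator, or a special symbol, it treats the current token as complete.

  • Syntax analysis, also known as the parser

    It converts the token array produced by lexical analysis into a tree. It also validates the syntax and throws an error when the code is syntactically invalid.
    While building the tree, the parser drops tokens it does not need (for example, redundant parentheses), so the AST does not match the source 100%, but it keeps enough information to work with. As an aside, a tree that covers 100% of the code structure is called a CST (concrete syntax tree).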
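
The two stages can be sketched in a few dozen lines. The following is a toy example, not how acorn or Babel are implemented: it assumes a made-up grammar with only integers, `+`, `*`, and parentheses, and the names `tokenize` and `parse` are chosen here for illustration. It shows the scanner producing a flat token array and the parser turning it into a tree in which parentheses leave no node of their own.

```javascript
// Lexical analysis: split source text into a flat array of tokens,
// dropping whitespace along the way.
function tokenize(source) {
  const tokens = [];
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (/\s/.test(ch)) {
      i++; // whitespace only separates tokens; it is not kept
    } else if (/[0-9]/.test(ch)) {
      let text = "";
      while (i < source.length && /[0-9]/.test(source[i])) text += source[i++];
      tokens.push({ type: "num", value: Number(text) });
    } else if ("+*()".includes(ch)) {
      tokens.push({ type: ch });
      i++;
    } else {
      throw new SyntaxError(`Unexpected character "${ch}" at ${i}`);
    }
  }
  return tokens;
}

// Syntax analysis: a recursive-descent parser for the toy grammar
//   sum     := product ("+" product)*
//   product := primary ("*" primary)*
//   primary := num | "(" sum ")"
function parse(tokens) {
  let pos = 0;
  const peek = () => tokens[pos];
  const next = () => tokens[pos++];

  function primary() {
    const token = next();
    if (token && token.type === "num") {
      return { type: "NumericLiteral", value: token.value };
    }
    if (token && token.type === "(") {
      const inner = sum();
      const close = next();
      if (!close || close.type !== ")") throw new SyntaxError('Expected ")"');
      return inner; // the parentheses themselves leave no node in the AST
    }
    throw new SyntaxError('Expected a number or "("');
  }

  // Parse a left-associative chain of `operand operator operand ...`.
  function binary(operand, operator) {
    let left = operand();
    while (peek() && peek().type === operator) {
      next();
      left = { type: "BinaryExpression", operator, left, right: operand() };
    }
    return left;
  }

  const product = () => binary(primary, "*");
  function sum() {
    return binary(product, "+");
  }

  const ast = sum();
  if (pos < tokens.length) throw new SyntaxError("Unexpected trailing token");
  return ast;
}

const astWithParens = parse(tokenize("(1 + 2) * 3"));
const astRedundant = parse(tokenize("((1 + 2)) * (3)"));
// Both inputs yield the same tree: the extra parentheses are not represented.
console.log(JSON.stringify(astWithParens) === JSON.stringify(astRedundant));
```

Because the grouping is already encoded in the shape of the tree, keeping the parenthesis tokens would add nothing; a CST would keep them.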

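More on compilers

the-super-tiny-compiler repository link

Converts Lisp-style function calls into C-style function calls.

LangSandbox repository link

Create your own language, compile it to C or machine code, and then run it.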

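Generating ASTs with third-party libraries

This section focuses on Babylon.

Babylon
Babylon is a JavaScript parser used in Babel. It supports JSX, Flow, and TypeScript. (Since Babel 7 it is published as @babel/parser.)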

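Babel

Babel is a JavaScript compiler. At a high level, it processes code in three stages: parsing, transforming, and generation. We hand Babel some JavaScript code; it modifies the code and returns newly generated code. In other words, it creates the AST, traverses the tree, modifies its nodes, and finally generates new code from the AST.

Parsing and generating with Babel

1. Parse the code into a syntax tree with babylon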

import * as babylon from "babylon";
const code = `
  const abc = 5;
`;
const ast = babylon.parse(code);

Resulting tree:

{
  "type": "File",
  "start": 0,
  "end": 18,
  "loc": {
    "start": {
      "line": 1,
      "column": 0
    },
    "end": {
      "line": 3,
      "column": 0
    }
  },
  "program": {
    "type": "Program",
    "start": 0,
    "end": 18,
    "loc": {
      "start": {
        "line": 1,
        "column": 0
      },
      "end": {
        "line": 3,
        "column": 0
      }
    },
    "sourceType": "script",
    "body": [
      {
        "type": "VariableDeclaration",
        "start": 3,
        "end": 17,
        "loc": {
          "start": {
            "line": 2,
            "column": 2
          },
          "end": {
            "line": 2,
            "column": 16
          }
        },
        "declarations": [
          {
            "type": "VariableDeclarator",
            "start": 9,
            "end": 16,
            "loc": {
              "start": {
                "line": 2,
                "column": 8
              },
              "end": {
                "line": 2,
                "column": 15
              }
            },
            "id": {
              "type": "Identifier",
              "start": 9,
              "end": 12,
              "loc": {
                "start": {
                  "line": 2,
                  "column": 8
                },
                "end": {
                  "line": 2,
                  "column": 11
                },
                "identifierName": "abc"
              },
              "name": "abc"
            },
            "init": {
              "type": "NumericLiteral",
              "start": 15,
              "end": 16,
              "loc": {
                "start": {
                  "line": 2,
                  "column": 14
                },
                "end": {
                  "line": 2,
                  "column": 15
                }
              },
              "extra": {
                "rawValue": 5,
                "raw": "5"
              },
              "value": 5
            }
          }
        ],
        "kind": "const"
      }
    ],
    "directives": []
  },
  "comments": [],
  "tokens": [
    {
      "type": {
        "label": "const",
        "keyword": "const",
        "beforeExpr": false,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "value": "const",
      "start": 3,
      "end": 8,
      "loc": {
        "start": {
          "line": 2,
          "column": 2
        },
        "end": {
          "line": 2,
          "column": 7
        }
      }
    },
    {
      "type": {
        "label": "name",
        "beforeExpr": false,
        "startsExpr": true,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null
      },
      "value": "abc",
      "start": 9,
      "end": 12,
      "loc": {
        "start": {
          "line": 2,
          "column": 8
        },
        "end": {
          "line": 2,
          "column": 11
        }
      }
    },
    {
      "type": {
        "label": "=",
        "beforeExpr": true,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": true,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "value": "=",
      "start": 13,
      "end": 14,
      "loc": {
        "start": {
          "line": 2,
          "column": 12
        },
        "end": {
          "line": 2,
          "column": 13
        }
      }
    },
    {
      "type": {
        "label": "num",
        "beforeExpr": false,
        "startsExpr": true,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "value": 5,
      "start": 15,
      "end": 16,
      "loc": {
        "start": {
          "line": 2,
          "column": 14
        },
        "end": {
          "line": 2,
          "column": 15
        }
      }
    },
    {
      "type": {
        "label": ";",
        "beforeExpr": true,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "start": 16,
      "end": 17,
      "loc": {
        "start": {
          "line": 2,
          "column": 15
        },
        "end": {
          "line": 2,
          "column": 16
        }
      }
    },
    {
      "type": {
        "label": "eof",
        "beforeExpr": false,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "start": 18,
      "end": 18,
      "loc": {
        "start": {
          "line": 3,
          "column": 0
        },
        "end": {
          "line": 3,
          "column": 0
        }
      }
    }
  ]
}

2. Transform the syntax tree with Babel's traverse

import traverse from "babel-traverse";
traverse(ast, {
  enter(path) {
    if (path.node.type === "Identifier") {
      path.node.name = path.node.name
        .split("")
        .reverse()
        .join("");
    }
  }
});

Resulting tree after the traversal (the identifier's name is now "cba"):

{
  "type": "File",
  "start": 0,
  "end": 18,
  "loc": {
    "start": {
      "line": 1,
      "column": 0
    },
    "end": {
      "line": 3,
      "column": 0
    }
  },
  "program": {
    "type": "Program",
    "start": 0,
    "end": 18,
    "loc": {
      "start": {
        "line": 1,
        "column": 0
      },
      "end": {
        "line": 3,
        "column": 0
      }
    },
    "sourceType": "script",
    "body": [
      {
        "type": "VariableDeclaration",
        "start": 3,
        "end": 17,
        "loc": {
          "start": {
            "line": 2,
            "column": 2
          },
          "end": {
            "line": 2,
            "column": 16
          }
        },
        "declarations": [
          {
            "type": "VariableDeclarator",
            "start": 9,
            "end": 16,
            "loc": {
              "start": {
                "line": 2,
                "column": 8
              },
              "end": {
                "line": 2,
                "column": 15
              }
            },
            "id": {
              "type": "Identifier",
              "start": 9,
              "end": 12,
              "loc": {
                "start": {
                  "line": 2,
                  "column": 8
                },
                "end": {
                  "line": 2,
                  "column": 11
                },
                "identifierName": "abc"
              },
              "name": "cba"
            },
            "init": {
              "type": "NumericLiteral",
              "start": 15,
              "end": 16,
              "loc": {
                "start": {
                  "line": 2,
                  "column": 14
                },
                "end": {
                  "line": 2,
                  "column": 15
                }
              },
              "extra": {
                "rawValue": 5,
                "raw": "5"
              },
              "value": 5
            }
          }
        ],
        "kind": "const"
      }
    ],
    "directives": []
  },
  "comments": [],
  "tokens": [
    {
      "type": {
        "label": "const",
        "keyword": "const",
        "beforeExpr": false,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "value": "const",
      "start": 3,
      "end": 8,
      "loc": {
        "start": {
          "line": 2,
          "column": 2
        },
        "end": {
          "line": 2,
          "column": 7
        }
      }
    },
    {
      "type": {
        "label": "name",
        "beforeExpr": false,
        "startsExpr": true,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null
      },
      "value": "abc",
      "start": 9,
      "end": 12,
      "loc": {
        "start": {
          "line": 2,
          "column": 8
        },
        "end": {
          "line": 2,
          "column": 11
        }
      }
    },
    {
      "type": {
        "label": "=",
        "beforeExpr": true,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": true,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "value": "=",
      "start": 13,
      "end": 14,
      "loc": {
        "start": {
          "line": 2,
          "column": 12
        },
        "end": {
          "line": 2,
          "column": 13
        }
      }
    },
    {
      "type": {
        "label": "num",
        "beforeExpr": false,
        "startsExpr": true,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "value": 5,
      "start": 15,
      "end": 16,
      "loc": {
        "start": {
          "line": 2,
          "column": 14
        },
        "end": {
          "line": 2,
          "column": 15
        }
      }
    },
    {
      "type": {
        "label": ";",
        "beforeExpr": true,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "start": 16,
      "end": 17,
      "loc": {
        "start": {
          "line": 2,
          "column": 15
        },
        "end": {
          "line": 2,
          "column": 16
        }
      }
    },
    {
      "type": {
        "label": "eof",
        "beforeExpr": false,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "start": 18,
      "end": 18,
      "loc": {
        "start": {
          "line": 3,
          "column": 0
        },
        "end": {
          "line": 3,
          "column": 0
        }
      }
    }
  ]
}
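
To see what the traversal step does under the hood, here is a minimal sketch that walks a plain-object AST without Babel. It is an approximation under stated assumptions: `traverse` here is a hand-written helper (not `babel-traverse`), it passes the node itself to `enter` rather than a Babel `path`, and `sampleAst` is a trimmed copy of the tree above with positions omitted.

```javascript
// Depth-first walk: call visitor.enter on every object that has a string `type`.
function traverse(node, visitor) {
  if (Array.isArray(node)) {
    node.forEach(child => traverse(child, visitor));
    return;
  }
  if (node === null || typeof node !== "object") return;
  if (typeof node.type === "string") visitor.enter(node);
  for (const key of Object.keys(node)) {
    if (key === "loc" || key === "tokens") continue; // metadata, not child nodes
    traverse(node[key], visitor);
  }
}

// Trimmed-down version of the tree for `const abc = 5;` (positions omitted).
const sampleAst = {
  type: "File",
  program: {
    type: "Program",
    body: [{
      type: "VariableDeclaration",
      kind: "const",
      declarations: [{
        type: "VariableDeclarator",
        id: { type: "Identifier", name: "abc" },
        init: { type: "NumericLiteral", value: 5 }
      }]
    }]
  }
};

// Same transformation as above: reverse every identifier's name.
traverse(sampleAst, {
  enter(node) {
    if (node.type === "Identifier") {
      node.name = node.name.split("").reverse().join("");
    }
  }
});

console.log(sampleAst.program.body[0].declarations[0].id.name); // "cba"
```

The tree is mutated in place, which is why the token list in the output above still says "abc": only the `Identifier` node was touched.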

3. Generate code with Babel's generator

import generate from "babel-generator"; // Babel 6 package, matching babylon and babel-traverse above (Babel 7+: @babel/generator)
const newCode = generate(ast).code;

// newCode => const cba = 5;
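
Generation is the reverse of parsing: walk the tree and print each node. The sketch below is a hypothetical `generateCode` helper, not Babel's generator; it assumes only the handful of node types that appear in the `const abc = 5;` example and throws on anything else.

```javascript
// Turn a node back into source text. Only the node types that occur in
// the `const abc = 5;` example are handled; anything else throws.
function generateCode(node) {
  switch (node.type) {
    case "File":
      return generateCode(node.program);
    case "Program":
      return node.body.map(statement => generateCode(statement)).join("\n");
    case "VariableDeclaration": {
      const declarations = node.declarations.map(d => generateCode(d)).join(", ");
      return `${node.kind} ${declarations};`;
    }
    case "VariableDeclarator":
      return node.init == null
        ? generateCode(node.id)
        : `${generateCode(node.id)} = ${generateCode(node.init)}`;
    case "Identifier":
      return node.name;
    case "NumericLiteral":
      return String(node.value);
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

// Trimmed version of the transformed tree (positions and tokens omitted).
const transformedAst = {
  type: "File",
  program: {
    type: "Program",
    body: [{
      type: "VariableDeclaration",
      kind: "const",
      declarations: [{
        type: "VariableDeclarator",
        id: { type: "Identifier", name: "cba" },
        init: { type: "NumericLiteral", value: 5 }
      }]
    }]
  }
};

console.log(generateCode(transformedAst)); // const cba = 5;
```

Note that the output is rebuilt purely from node data; the original whitespace is gone, which is why a generator has to make its own formatting decisions.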

Writing Babel plugins (babel-plugins)

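Of the steps above, the first (parsing) and the third (generation) are handled by Babel.
When developing a Babel plugin, we only need to describe the "visitors" that transform the AST nodes.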

// my-babel-plugin.js
module.exports = function() {
  return {
    visitor: {
      Identifier(path) {
        const name = path.node.name;
        console.log(name);
        path.node.name = name
          .split("")
          .reverse()
          .join("");
      }
    }
  };
};
// Register the plugin in babel.config.js; restart the project for it to take effect
// plugins: ["./src/plugins/my-babel-plugin.js"]
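
How a visitor object gets called can be illustrated with a small stand-in for Babel's plugin runner. This is only a sketch: `runPlugin` is a hypothetical helper written for this example, and the "path" it passes is just `{ node, parent }`, far simpler than Babel's real path objects. The sample tree represents the call `foo(bar)`.

```javascript
// Hypothetical mini-runner: walk the tree and, for each node, call the
// visitor method whose name equals the node's type (if there is one).
function runPlugin(plugin, ast) {
  const { visitor } = plugin();
  function walk(node, parent) {
    if (Array.isArray(node)) {
      node.forEach(child => walk(child, parent));
    } else if (node !== null && typeof node === "object" && typeof node.type === "string") {
      const handler = visitor[node.type];
      if (handler) handler({ node, parent }); // a greatly simplified "path"
      Object.values(node).forEach(child => walk(child, node));
    }
  }
  walk(ast, null);
  return ast;
}

// Same shape as the plugin above: a function returning { visitor }.
const reverseIdentifiers = function () {
  return {
    visitor: {
      Identifier(path) {
        path.node.name = path.node.name.split("").reverse().join("");
      }
    }
  };
};

// AST for the expression statement `foo(bar)`.
const pluginAst = {
  type: "ExpressionStatement",
  expression: {
    type: "CallExpression",
    callee: { type: "Identifier", name: "foo" },
    arguments: [{ type: "Identifier", name: "bar" }]
  }
};

runPlugin(reverseIdentifiers, pluginAst);
console.log(pluginAst.expression.callee.name, pluginAst.expression.arguments[0].name); // oof rab
```

Keying handlers by node type is what lets a plugin stay small: it declares interest in `Identifier` and never has to write the traversal itself.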

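Learn how to write Babel plugins: Babel-handbook
Chinese plugin handbook

JSCodeshift, a powerful tool for automated code refactoring

Suppose you want to replace all old-fashioned anonymous functions with lambda expressions (arrow functions).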

// transform
load().then(function(response) {
  return response.data;
});
// to
load().then(response => response.data);

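A code editor probably cannot do this for you, because it is not a simple find-and-replace operation. This is where jscodeshift comes in.
If you want to build something that automatically migrates your code from an old framework to a new one, this is a nice way to do it.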
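
The core of such a codemod is a tree rewrite. The sketch below does not use jscodeshift's API; it is a hand-written function (`toArrowFunctions`, a name invented here) operating on a plain ESTree-style object, and it assumes the simplest case only: an anonymous function whose body is a single `return` with a value.

```javascript
// Convert `function (params) { return expr; }` into `(params) => expr`.
// Returns a new tree; the input is left untouched.
function toArrowFunctions(node) {
  if (Array.isArray(node)) return node.map(child => toArrowFunctions(child));
  if (node === null || typeof node !== "object") return node;

  // Rewrite children first so nested callbacks are converted too.
  const copy = {};
  for (const [key, value] of Object.entries(node)) {
    copy[key] = toArrowFunctions(value);
  }

  const isSingleReturn =
    copy.type === "FunctionExpression" &&
    copy.id === null &&
    !copy.generator &&
    copy.body.body.length === 1 &&
    copy.body.body[0].type === "ReturnStatement" &&
    copy.body.body[0].argument !== null;
  if (!isSingleReturn) return copy;

  return {
    type: "ArrowFunctionExpression",
    id: null,
    generator: false,
    async: Boolean(copy.async),
    expression: true, // the body is an expression, not a block
    params: copy.params,
    body: copy.body.body[0].argument
  };
}

// AST for: load().then(function (response) { return response.data; })
const callbackAst = {
  type: "CallExpression",
  callee: {
    type: "MemberExpression",
    computed: false,
    object: { type: "CallExpression", callee: { type: "Identifier", name: "load" }, arguments: [] },
    property: { type: "Identifier", name: "then" }
  },
  arguments: [{
    type: "FunctionExpression",
    id: null,
    generator: false,
    async: false,
    params: [{ type: "Identifier", name: "response" }],
    body: {
      type: "BlockStatement",
      body: [{
        type: "ReturnStatement",
        argument: {
          type: "MemberExpression",
          computed: false,
          object: { type: "Identifier", name: "response" },
          property: { type: "Identifier", name: "data" }
        }
      }]
    }
  }]
};

const arrowAst = toArrowFunctions(callbackAst);
console.log(arrowAst.arguments[0].type); // ArrowFunctionExpression
```

A production codemod would also have to skip functions that use `this` or `arguments`, since arrow functions bind those differently; this sketch does not check for that.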

jscodeshift

jscodeshift is a toolkit for running codemods over multiple JavaScript or TypeScript files.

react-codemod
This repository contains a collection of codemod scripts for use with JSCodeshift that help update React APIs.

Prettier
// transform
foo(reallyLongArg(), omgSoManyParameters(), IShouldRefactorThis(), isThereSeriouselyAnotherOne());
// to
foo(
  reallyLongArg(),
  omgSoManyParameters(),
  IShouldRefactorThis(),
  isThereSeriouselyAnotherOne()
);
// Prettier formats our code: it wraps long lines and tidies up whitespace, parentheses, and so on.
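
The basic fit-or-break decision can be shown with a deliberately tiny printer. This is not Prettier's algorithm, which is considerably more involved; `printCall` is a hypothetical helper that handles only a single call statement, and the default width of 80 columns is an assumption made for the example.

```javascript
// Print a call on one line if it fits within `width` columns; otherwise
// put each argument on its own indented line.
function printCall(callee, args, width = 80) {
  const oneLine = `${callee}(${args.join(", ")});`;
  if (oneLine.length <= width) return oneLine;
  const lines = args.map((arg, i) => `  ${arg}${i < args.length - 1 ? "," : ""}`);
  return [`${callee}(`, ...lines, ");"].join("\n");
}

const longCallArgs = [
  "reallyLongArg()",
  "omgSoManyParameters()",
  "IShouldRefactorThis()",
  "isThereSeriouselyAnotherOne()"
];

console.log(printCall("foo", ["a", "b"])); // foo(a, b);
console.log(printCall("foo", longCallArgs)); // too wide for one line, so it breaks
```

The printer works from the structure (callee plus argument list), not from the original text, so the input's own spacing has no influence on the output.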

Further reading: "A prettier printer"

Finally

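js2flowchart online demo link
js2flowchart repository link

It converts JavaScript code into an SVG flowchart.
It is a good example because it shows that once you have an AST, you can do whatever you want with it. Turning the AST back into source code is not required; you can draw a flowchart from it, or anything else you like.

What is js2flowchart good for? With flowcharts you can explain or document your code, learn other people's code through visualization, and create a flowchart for any process that can be described in simple JS syntax.
You can use it from your own code or through the CLI, where you just point it at the file you want to turn into an SVG. There is also a VS Code extension (linked in the project README).

First, the code is parsed into an AST. Then we traverse the AST and build another tree, which I call the workflow tree. It drops many unimportant tokens but keeps the key blocks together, such as functions, loops, and conditions. After that, we traverse the workflow tree and build a shape tree, in which each node holds its visual type, position, connections within the tree, and so on. In the last step, we go through all the shapes, generate the SVG for each, and merge them into one file.
More updates to follow; I am still learning.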
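
The pipeline described above (AST, then workflow tree, then shapes, then SVG) can be sketched as three small functions. This is not js2flowchart's code or API; the function names, the box sizes, and the layout rule are all made up for illustration, and the parsing stage is skipped by starting from a hand-written, trimmed AST of the `square` function.

```javascript
// Stage 2: keep only the blocks that matter for a flowchart.
function toFlowTree(node) {
  switch (node.type) {
    case "Program":
      return { label: "program", children: node.body.map(n => toFlowTree(n)) };
    case "FunctionDeclaration":
      return {
        label: `function ${node.id.name}`,
        children: node.body.body.map(n => toFlowTree(n))
      };
    case "ReturnStatement":
      return { label: "return", children: [] };
    default:
      return { label: node.type, children: [] };
  }
}

// Stage 3: give every flow node a box with a position and remember which
// box it hangs off (its parent), so connections can be drawn later.
function toShapes(flowNode, depth = 0, shapes = [], parentIndex = null) {
  const index = shapes.length;
  shapes.push({
    label: flowNode.label,
    x: 20 + depth * 40,
    y: 20 + index * 50,
    width: 160,
    height: 30,
    parentIndex
  });
  flowNode.children.forEach(child => toShapes(child, depth + 1, shapes, index));
  return shapes;
}

// Stage 4: render each shape and its connection, then merge into one SVG.
function toSvg(shapes) {
  const parts = shapes.map(s => {
    const box =
      `<rect x="${s.x}" y="${s.y}" width="${s.width}" height="${s.height}" fill="none" stroke="black"/>` +
      `<text x="${s.x + 8}" y="${s.y + 20}">${s.label}</text>`;
    if (s.parentIndex === null) return box;
    const p = shapes[s.parentIndex];
    const edge = `<line x1="${p.x + 10}" y1="${p.y + p.height}" x2="${s.x + 10}" y2="${s.y}" stroke="black"/>`;
    return edge + box;
  });
  const height = 40 + shapes.length * 50;
  return `<svg xmlns="http://www.w3.org/2000/svg" width="400" height="${height}">${parts.join("")}</svg>`;
}

// Trimmed AST of: function square(n) { return n * n; }
const squareAst = {
  type: "Program",
  body: [{
    type: "FunctionDeclaration",
    id: { type: "Identifier", name: "square" },
    body: { type: "BlockStatement", body: [{ type: "ReturnStatement" }] }
  }]
};

const flowShapes = toShapes(toFlowTree(squareAst));
const flowSvg = toSvg(flowShapes);
console.log(flowShapes.map(s => s.label).join(" -> ")); // program -> function square -> return
```

Splitting the work into separate trees keeps each stage simple: the flow stage knows nothing about coordinates, and the rendering stage knows nothing about JavaScript syntax.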
