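# How Babel Works, and Writing a Plugin in Practice

This article explains how Babel works, what an AST looks like, and how to build a Babel plugin.

Before getting to Babel, let's set Babel and the AST concepts aside and first build a tiny compiler together. The concepts are much easier to understand once you come back to them afterwards.

## tiny-compiler

Imagine we have some new syntax, in which `add` and `subtract` are ordinary function names, that has to be translated into regular JavaScript so that browsers can run it.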


```
(add 2 2)
(subtract 4 2)
(add 2 (subtract 4 2))
```

This should be translated into:

```js
add(2, 2)
subtract(4, 2)
add(2, subtract(4, 2))
```

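Most compilers work in three stages:

1. Parsing

2. Transformation

3. Code Generation

### Parsing

Parsing is split into two sub-stages:

1. Lexical Analysis

2. Syntactic Analysis

First, here is the code we want to convert:

```js
// This is the code we want to convert
(add 2 (subtract 4 2))
```

#### Lexical Analysis

Lexical analysis breaks the code into its smallest independent syntactic units, each described by a token: an operator, a number, a punctuation mark, and so on. The result is an array of tokens.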

```js

// Step 1, lexical analysis: the code becomes tokens like these
[
  { type: 'paren', value: '(' },
  { type: 'name', value: 'add' },
  { type: 'number', value: '2' },
  { type: 'paren', value: '(' },
  { type: 'name', value: 'subtract' },
  { type: 'number', value: '4' },
  { type: 'number', value: '2' },
  { type: 'paren', value: ')' },
  { type: 'paren', value: ')' },
]

```

Let's implement it.

```js

function tokenizer(input) {
  let current = 0;
  let tokens = [];

  while (current < input.length) {
    let char = input[current];

    // Opening parenthesis
    if (char === '(') {
      tokens.push({
        type: 'paren',
        value: '(',
      });
      current++;
      continue;
    }

    // Closing parenthesis
    if (char === ')') {
      tokens.push({
        type: 'paren',
        value: ')',
      });
      current++;
      continue;
    }

    // Whitespace is skipped
    let WHITESPACE = /\s/;
    if (WHITESPACE.test(char)) {
      current++;
      continue;
    }

    // Numbers: collect consecutive digits into one token
    let NUMBERS = /[0-9]/;
    if (NUMBERS.test(char)) {
      let value = '';
      while (NUMBERS.test(char)) {
        value += char;
        char = input[++current];
      }
      tokens.push({ type: 'number', value });
      continue;
    }

    // Strings: collect everything between two double quotes
    if (char === '"') {
      let value = '';
      // Skip the opening quote
      char = input[++current];
      // The undefined check stops the loop if the input ends early
      while (char !== undefined && char !== '"') {
        value += char;
        char = input[++current];
      }
      // Skip the closing quote
      char = input[++current];
      tokens.push({ type: 'string', value });
      continue;
    }

    // Function names: collect consecutive letters
    let LETTERS = /[a-z]/i;
    if (LETTERS.test(char)) {
      let value = '';
      // The undefined check stops the loop if the input ends with a name
      while (char !== undefined && LETTERS.test(char)) {
        value += char;
        char = input[++current];
      }
      tokens.push({ type: 'name', value });
      continue;
    }

    // Anything else is an error
    throw new TypeError('I dont know what this character is: ' + char);
  }

  return tokens;
}

```
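As a cross-check of the token format, the same tokens can be produced with a different technique: one sticky regular expression per token type instead of a character-by-character loop. This is only a sketch under assumptions. `tokenizeWithRegex`, `TOKEN_PATTERNS` and `SKIP_WHITESPACE` are names invented here, and string tokens are left out to keep it short.

```js
// Hypothetical alternative to the hand-written tokenizer: each token type
// is described by one sticky (y flag) regular expression.
const TOKEN_PATTERNS = [
  { type: 'paren', regex: /[()]/y },
  { type: 'number', regex: /[0-9]+/y },
  { type: 'name', regex: /[a-z]+/iy },
];
const SKIP_WHITESPACE = /\s+/y;

function tokenizeWithRegex(input) {
  const tokens = [];
  let current = 0;

  while (current < input.length) {
    // Skip any run of whitespace that starts at the current position
    SKIP_WHITESPACE.lastIndex = current;
    const space = SKIP_WHITESPACE.exec(input);
    if (space) {
      current += space[0].length;
      continue;
    }

    // Try each pattern at the current position; the first match wins
    let matched = false;
    for (const { type, regex } of TOKEN_PATTERNS) {
      regex.lastIndex = current;
      const match = regex.exec(input);
      if (match) {
        tokens.push({ type, value: match[0] });
        current += match[0].length;
        matched = true;
        break;
      }
    }

    if (!matched) {
      throw new TypeError('I dont know what this character is: ' + input[current]);
    }
  }

  return tokens;
}

console.log(tokenizeWithRegex('(add 2 (subtract 4 2))'));
```

The hand-written loop is easier to step through when learning, while the table of patterns is easier to extend with new token types.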

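#### Syntactic Analysis

Syntactic analysis takes the token array from the previous step and turns it into a structure that describes how the pieces of syntax relate to each other. That structure is the Abstract Syntax Tree, usually just called the AST.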

```js

// Step 2, syntactic analysis: the tokens become an AST like this
{
  type: 'Program',
  body: [{
    type: 'CallExpression',
    name: 'add',
    params: [{
      type: 'NumberLiteral',
      value: '2',
    }, {
      type: 'CallExpression',
      name: 'subtract',
      params: [{
        type: 'NumberLiteral',
        value: '4',
      }, {
        type: 'NumberLiteral',
        value: '2',
      }]
    }]
  }]
}

```

Now let's implement a parser that produces the AST.

```js

function parser(tokens) {
  let current = 0;

  function walk() {
    let token = tokens[current];

    // Numbers
    if (token.type === 'number') {
      current++;
      return {
        type: 'NumberLiteral',
        value: token.value,
      };
    }

    // Strings
    if (token.type === 'string') {
      current++;
      return {
        type: 'StringLiteral',
        value: token.value,
      };
    }

    // Parenthesized call expressions
    if (
      token.type === 'paren' &&
      token.value === '('
    ) {
      // Skip the '(' so that token is the function name
      token = tokens[++current];

      let node = {
        type: 'CallExpression',
        name: token.value,
        params: [],
      };

      // Skip the name token
      token = tokens[++current];

      // Collect parameters until the matching ')'
      while (
        (token.type !== 'paren') ||
        (token.type === 'paren' && token.value !== ')')
      ) {
        node.params.push(walk());
        token = tokens[current];
      }

      // Skip the closing ')'
      current++;
      return node;
    }

    throw new TypeError(token.type);
  }

  let ast = {
    type: 'Program',
    body: [],
  };

  while (current < tokens.length) {
    ast.body.push(walk());
  }

  return ast;
}

```

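As the code shows, the root of the AST is a node with `type: 'Program'`, and its `body` is an array of nested AST nodes. `number` and `string` tokens are handled directly, and `walk` calls itself recursively to handle nested parenthesized expressions.

### Transformation

#### traverser

Our end goal is to produce the code we want. How do we get there? By changing the AST we just built. Poking at the tree structure by hand is impractical, so we traverse it depth-first instead. When the traversal reaches a node, it calls the method registered for that node type, and changing the node inside that method becomes easy.

Imagine we have a visitor like the one below. It holds the methods that are called during traversal: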

```js

var visitor = {
  NumberLiteral: {
    enter(node, parent) { },
    exit(node, parent) { },
  }
};

```

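Because the traversal is depth-first, every node has an `enter` moment and an `exit` moment. For a node with children, such as `CallExpression`, `enter` fires when the traversal first reaches the node, before any of its children are visited, and `exit` fires once all of its children have been visited. For the AST above: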

```js

 *   -> Program (enter)
 *     -> CallExpression (enter)
 *       -> NumberLiteral (enter)
 *       <- NumberLiteral (exit)
 *       -> CallExpression (enter)
 *         -> NumberLiteral (enter)
 *         <- NumberLiteral (exit)
 *         -> NumberLiteral (enter)
 *         <- NumberLiteral (exit)
 *       <- CallExpression (exit)
 *     <- CallExpression (exit)
 *   <- Program (exit)

```

Then we need a function that takes the AST and a visitor and performs the traversal, something like:

```js

traverse(ast, {
  CallExpression: {
    enter(node, parent) {
      // ...
    },
    exit(node, parent) {
      // ...
    },
  },
});

```

Let's implement the traverser first.

```js

function traverser(ast, visitor) {
  // Traverse an array of nodes
  function traverseArray(array, parent) {
    array.forEach(child => {
      traverseNode(child, parent);
    });
  }

  // Traverse a single node
  function traverseNode(node, parent) {
    let methods = visitor[node.type];

    // Call enter first
    if (methods && methods.enter) {
      methods.enter(node, parent);
    }

    switch (node.type) {
      // The root node has type Program; continue into its body
      case 'Program':
        traverseArray(node.body, node);
        break;

      // For a CallExpression, continue into its params
      case 'CallExpression':
        traverseArray(node.params, node);
        break;

      // Numbers and strings have no children; enter and exit are enough
      case 'NumberLiteral':
      case 'StringLiteral':
        break;

      // Unknown node types are an error
      default:
        throw new TypeError(node.type);
    }

    // Call exit last
    if (methods && methods.exit) {
      methods.exit(node, parent);
    }
  }

  // Start traversing from the root
  traverseNode(ast, null);
}

```
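To see the enter/exit order in practice, the sketch below records every call while walking a hand-written AST for `(add 2 (subtract 4 2))`. It is an illustration only: `miniTraverse`, `recorder` and `loggingVisitor` are names made up here, and `miniTraverse` is a compact stand-in that visits nodes in the same order as the traverser, not a replacement for it.

```js
// Hand-written AST for (add 2 (subtract 4 2)), in the shape the parser produces
const ast = {
  type: 'Program',
  body: [{
    type: 'CallExpression',
    name: 'add',
    params: [
      { type: 'NumberLiteral', value: '2' },
      {
        type: 'CallExpression',
        name: 'subtract',
        params: [
          { type: 'NumberLiteral', value: '4' },
          { type: 'NumberLiteral', value: '2' },
        ],
      },
    ],
  }],
};

// Compact stand-in for the traverser: enter, then children, then exit
function miniTraverse(node, parent, visitor) {
  const methods = visitor[node.type];
  if (methods && methods.enter) methods.enter(node, parent);
  const children = node.body || node.params || [];
  children.forEach(child => miniTraverse(child, node, visitor));
  if (methods && methods.exit) methods.exit(node, parent);
}

// Build visitor entries that record every enter and exit call
const log = [];
const recorder = type => ({
  enter() { log.push('-> ' + type); },
  exit() { log.push('<- ' + type); },
});
const loggingVisitor = {
  Program: recorder('Program'),
  CallExpression: recorder('CallExpression'),
  NumberLiteral: recorder('NumberLiteral'),
};

miniTraverse(ast, null, loggingVisitor);
console.log(log.join('\n'));
```

The log starts with `-> Program` and ends with `<- Program`, and each `CallExpression` exits only after all of its parameters have exited.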

#### transformer

With the traverser in place we can start traversing. First, compare the AST before and after the transformation.

```js

 * ----------------------------------------------------------------------------
 *   Original AST                     |   Transformed AST
 * ----------------------------------------------------------------------------
 *   {                                |   {
 *     type: 'Program',               |     type: 'Program',
 *     body: [{                       |     body: [{
 *       type: 'CallExpression',      |       type: 'ExpressionStatement',
 *       name: 'add',                 |       expression: {
 *       params: [{                   |         type: 'CallExpression',
 *         type: 'NumberLiteral',     |         callee: {
 *         value: '2'                 |           type: 'Identifier',
 *       }, {                         |           name: 'add'
 *         type: 'CallExpression',    |         },
 *         name: 'subtract',          |         arguments: [{
 *         params: [{                 |           type: 'NumberLiteral',
 *           type: 'NumberLiteral',   |           value: '2'
 *           value: '4'               |         }, {
 *         }, {                       |           type: 'CallExpression',
 *           type: 'NumberLiteral',   |           callee: {
 *           value: '2'               |             type: 'Identifier',
 *         }]                         |             name: 'subtract'
 *       }]                           |           },
 *     }]                             |           arguments: [{
 *   }                                |             type: 'NumberLiteral',
 *                                    |             value: '4'
 * ---------------------------------- |           }, {
 *                                    |             type: 'NumberLiteral',
 *                                    |             value: '2'
 *                                    |           }]
 *  (sorry the other one is longer.)  |         }
 *                                    |       }
 *                                    |     }]
 *                                    |   }
 * ----------------------------------------------------------------------------

```

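Note the new `ExpressionStatement` type on the right. It wraps a call that is not nested inside another call, such as the outer `add(...)`, so that the call can stand on its own as a statement.

The traversal turns the AST on the left into the AST on the right.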

```js

function transformer(ast) {
  let newAst = {
    type: 'Program',
    body: [],
  };

  // Give the node a _context so that child nodes can push
  // their output into parent._context
  ast._context = newAst.body;

  traverser(ast, {
    CallExpression: {
      enter(node, parent) {
        let expression = {
          type: 'CallExpression',
          callee: {
            type: 'Identifier',
            name: node.name,
          },
          arguments: [],
        };

        // Let child nodes push themselves into expression.arguments
        node._context = expression.arguments;

        // If the parent is not a CallExpression, wrap the call in an ExpressionStatement
        if (parent.type !== 'CallExpression') {
          expression = {
            type: 'ExpressionStatement',
            expression: expression,
          };
        }

        parent._context.push(expression);
      },
    },

    NumberLiteral: {
      enter(node, parent) {
        parent._context.push({
          type: 'NumberLiteral',
          value: node.value,
        });
      },
    },

    StringLiteral: {
      enter(node, parent) {
        parent._context.push({
          type: 'StringLiteral',
          value: node.value,
        });
      },
    },
  });

  return newAst;
}

```

### Code Generation

The last stage takes the newly generated AST and produces the final code. It is the reverse of building the AST.

```js

function codeGenerator(node) {
  switch (node.type) {
    // Program: run codeGenerator on each node in body, one result per line
    case 'Program':
      return node.body.map(codeGenerator)
        .join('\n');

    // ExpressionStatement: generate the inner expression and append a semicolon
    case 'ExpressionStatement':
      return (
        codeGenerator(node.expression) +
        ';'
      );

    // CallExpression: the callee, then the arguments inside parentheses
    case 'CallExpression':
      return (
        codeGenerator(node.callee) +
        '(' +
        node.arguments.map(codeGenerator)
          .join(', ') +
        ')'
      );

    // Identifier: return the name as-is
    case 'Identifier':
      return node.name;

    // NumberLiteral: return the value
    case 'NumberLiteral':
      return node.value;

    // StringLiteral: wrap the value in double quotes
    case 'StringLiteral':
      return '"' + node.value + '"';

    // Unknown node types are an error
    default:
      throw new TypeError(node.type);
  }
}

```

### Summary

That completes our tiny-compiler. You can try it out with the code below.

```js

function compiler(input) {
  let tokens = tokenizer(input);
  let ast = parser(tokens);
  let newAst = transformer(ast);
  let output = codeGenerator(newAst);
  return output;
}

```

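As the code shows, transforming code consists of the tokenizer (lexical analysis), the parser (syntactic analysis, which produces the AST), the transformer, and the code generator. When you write a Babel plugin you are really writing the transformer. Babel already implements the other parts.

## Babel plugin concepts

Let's start with a simple Babel plugin: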

```js

module.exports = function ({ types: t }) {
  // Prebuilt nodes for !0 and !1
  const TRUE = t.unaryExpression("!", t.numericLiteral(0), true);
  const FALSE = t.unaryExpression("!", t.numericLiteral(1), true);

  return {
    visitor: {
      // Called for every boolean literal in the source
      BooleanLiteral(path) {
        path.replaceWith(path.node.value ? TRUE : FALSE);
      },
    },
  };
};

```

The effect of this plugin:

```js

// Source code
const x = true;

// Code after transformation
const x = !0;

```

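It turns every boolean value into `!0` or `!1`, a trick used when minifying code.

Let's go through this simple plugin. A plugin is just a function whose argument is the babel object. Here it uses the `types` object, which comes from the `@babel/types` library, and then calls methods on the `path` object to replace nodes.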
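To make the visitor's effect concrete without installing Babel, the replacement can be imitated on plain objects. The sketch below is a toy built on assumptions: `makeToyPath` is a made-up stand-in for Babel's `path`, and the nodes are written by hand in the shape that `t.unaryExpression` and `t.numericLiteral` produce.

```js
// Hand-written nodes in the shape Babel uses for !0 and !1
const TRUE = {
  type: 'UnaryExpression',
  operator: '!',
  prefix: true,
  argument: { type: 'NumericLiteral', value: 0 },
};
const FALSE = {
  type: 'UnaryExpression',
  operator: '!',
  prefix: true,
  argument: { type: 'NumericLiteral', value: 1 },
};

// Toy stand-in for a Babel path (hypothetical): it remembers the container
// object and the key the current node is stored under, so it can swap it out.
function makeToyPath(container, key) {
  return {
    get node() { return container[key]; },
    replaceWith(newNode) { container[key] = newNode; },
  };
}

// The same visitor method as in the plugin
const visitor = {
  BooleanLiteral(path) {
    path.replaceWith(path.node.value ? TRUE : FALSE);
  },
};

// `const x = true;` reduced to the one declarator we care about
const declarator = {
  type: 'VariableDeclarator',
  id: { type: 'Identifier', name: 'x' },
  init: { type: 'BooleanLiteral', value: true },
};

visitor.BooleanLiteral(makeToyPath(declarator, 'init'));
console.log(declarator.init.type, declarator.init.operator, declarator.init.argument.value);
```

After the call, `declarator.init` no longer holds the `BooleanLiteral`; it holds the `!0` node, which is what the real plugin achieves through `path.replaceWith`.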

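### path

`path` is an object you will always end up using. Through `path` you can reach the current node and its parent, and call many other methods for adding, updating, moving, and removing nodes. A few examples: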

```js

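// Access properties of the current node via path.node.<property>
path.node.name
path.node.left

// Change a property of the current node directly
path.node.name = "x";

// The parent node of the current node
path.parent

// The path of the parent node
path.parentPath

// Get the path of a child property
path.get('left')

// Remove a node
path.remove();

// Replace a node
path.replaceWith();

// Replace a node with multiple nodes
path.replaceWithMultiple();

// Insert sibling nodes
path.insertBefore();
path.insertAfter();

// Skip traversing the children of the current node
path.skip();

// Stop the traversal entirely
path.stop();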

```

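### @babel/types

Think of it as a utility library, similar to Lodash, that bundles a great many helper methods. Typical uses are:

* Checking nodes

Prefixing a node type with `is` gives a function that checks whether a node is of that type.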

```js

// Check whether the left child of the current node is an Identifier
if (t.isIdentifier(path.node.left)) {
  // ...
}

```

```js

// Check whether the left child is an Identifier whose name is 'n'
if (t.isIdentifier(path.node.left, { name: "n" })) {
  // ...
}

// The check above is equivalent to
if (
  path.node.left != null &&
  path.node.left.type === "Identifier" &&
  path.node.left.name === "n"
) {
  // ...
}

```
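The equivalence above can be captured in a few lines of plain JavaScript. The helper below is a hypothetical stand-in written for this article, named `isNodeOfType` to make clear it is not part of `@babel/types`. It assumes the optional second argument is compared shallowly, key by key.

```js
// Toy version of an is-check: the type must match, and every key in
// `opts` must be strictly equal on the node (a shallow comparison).
function isNodeOfType(node, type, opts) {
  if (node == null || node.type !== type) return false;
  if (opts === undefined) return true;
  return Object.keys(opts).every(key => node[key] === opts[key]);
}

const left = { type: 'Identifier', name: 'n' };

console.log(isNodeOfType(left, 'Identifier'));                // type only
console.log(isNodeOfType(left, 'Identifier', { name: 'n' })); // type and name
console.log(isNodeOfType(left, 'Identifier', { name: 'm' })); // name differs
console.log(isNodeOfType(null, 'Identifier'));                // missing node
```

The `node == null` guard is what lets plugin code call the check on optional children without testing for their presence first.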

* Building nodes

Writing a complex AST structure by hand is impractical, so there are helper methods that build nodes for you. For example:

```js

// Call the binaryExpression and identifier builders to create an AST
t.binaryExpression("*", t.identifier("a"), t.identifier("b"));

// This produces
{
  type: "BinaryExpression",
  operator: "*",
  left: {
    type: "Identifier",
    name: "a"
  },
  right: {
    type: "Identifier",
    name: "b"
  }
}

// Which, turned back into code, reads
a * b

```

Every node type has its own builder with its own parameters. See the official documentation for details.
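The idea of a builder can be shown with two toy functions. This is a sketch only: `toyTypes` and `printNode` are invented for this example, they cover just the two node types used above, and the real builders also validate their arguments.

```js
// Toy builders mirroring the shape of t.identifier and t.binaryExpression
const toyTypes = {
  identifier(name) {
    return { type: 'Identifier', name };
  },
  binaryExpression(operator, left, right) {
    return { type: 'BinaryExpression', operator, left, right };
  },
};

// Toy printer that turns the two node types back into source text
function printNode(node) {
  switch (node.type) {
    case 'Identifier':
      return node.name;
    case 'BinaryExpression':
      return printNode(node.left) + ' ' + node.operator + ' ' + printNode(node.right);
    default:
      throw new TypeError(node.type);
  }
}

const product = toyTypes.binaryExpression(
  '*',
  toyTypes.identifier('a'),
  toyTypes.identifier('b')
);

console.log(JSON.stringify(product, null, 2));
console.log(printNode(product));
```

Builders keep plugin code short and make it harder to forget a required field than writing the object literal by hand.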

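### scope

Finally, a word on scope. Every function and every variable lives in a scope. Be especially careful about this when writing a Babel plugin: when you change or add code, take care not to break the bindings the original code relies on.

The methods on `path.scope` let you work with scopes. For example: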

```js

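// Check whether the variable n is bound (visible from this scope, parent scopes included)
path.scope.hasBinding("n")

// Check whether this scope itself declares n
path.scope.hasOwnBinding("n")

// Generate a new identifier that does not clash with existing names,
// something like { type: 'Identifier', name: '_n2' }
path.scope.generateUidIdentifier("n");

// Rename a binding and its references
path.scope.rename("n", "x");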

```
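The difference between `hasBinding` and `hasOwnBinding` is easiest to see on a small model of nested scopes. The class below is a toy written for this article. `ToyScope`, `declare` and `generateUid` are hypothetical names, and the model deliberately ignores details such as globals that the real scope object tracks.

```js
// Toy model of nested scopes: each scope knows its parent and its own names
class ToyScope {
  constructor(parent = null) {
    this.parent = parent;
    this.bindings = new Set();
  }

  declare(name) {
    this.bindings.add(name);
  }

  // True only if this scope itself declares the name
  hasOwnBinding(name) {
    return this.bindings.has(name);
  }

  // True if this scope or any enclosing scope declares the name
  hasBinding(name) {
    for (let scope = this; scope; scope = scope.parent) {
      if (scope.hasOwnBinding(name)) return true;
    }
    return false;
  }

  // Pick a name derived from `name` that is not visible from this scope,
  // then record it so the next call returns a different one
  generateUid(name) {
    let counter = 1;
    let candidate = '_' + name;
    while (this.hasBinding(candidate)) {
      counter++;
      candidate = '_' + name + counter;
    }
    this.declare(candidate);
    return candidate;
  }
}

const programScope = new ToyScope();
programScope.declare('n');

const functionScope = new ToyScope(programScope);
functionScope.declare('m');

console.log(functionScope.hasBinding('n'));    // n is visible through the parent
console.log(functionScope.hasOwnBinding('n')); // but not declared here
console.log(functionScope.generateUid('n'));
console.log(functionScope.generateUid('n'));
```

Generating a fresh name instead of hard-coding one is what keeps inserted code from shadowing or colliding with variables the user already declared.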

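## Plugin in practice

What are the steps for writing a custom plugin?

1. Decide what the plugin is for

2. Look at the AST of the source code

3. Look at the AST of the transformed code

Tip: an online AST viewer lets you inspect the AST of a piece of code.

### What the plugin is for

Let's build a custom plugin. When writing application code you can configure an alias in webpack, for example `@` -> `./src`, so that an import resolves directly from the `src` directory. But have you ever had this available when writing components? Providing it is the purpose of our plugin.

### Code

We have the following configuration: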

```json

"alias": {

"@": "./src"

}

```

The source code, and what it should be converted to:

```js
// ./src/index.js
import add from '@/common'; // -> import add from "./common";
```

```js
// ./src/foo/test/index.js
import add from '@/common'; // -> import add from "../../common";
```

### AST

Looking at the AST of the source, all we need to do is find the `ImportDeclaration` node and change its `source` to the converted path.
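For reference, the relevant part of that AST can be written out as a plain object. This is a simplified sketch: location data and other bookkeeping fields are left out, and `importNode` is just a variable name chosen here.

```js
// Simplified AST for: import add from '@/common';
const importNode = {
  type: 'ImportDeclaration',
  specifiers: [
    {
      type: 'ImportDefaultSpecifier',
      local: { type: 'Identifier', name: 'add' },
    },
  ],
  source: { type: 'StringLiteral', value: '@/common' },
};

// The plugin only has to swap this one field for a new string node
importNode.source = { type: 'StringLiteral', value: './common' };

console.log(importNode.source.value);
```

The imported names under `specifiers` are untouched, which is why the plugin can ignore them entirely.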

### Writing the plugin

```js

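const localPath = require('path');

module.exports = function ({ types: t }) {
  return {
    visitor: {
      ImportDeclaration(path, state) {
        // Read the options passed in from the config; here the config sets alias
        const { alias } = state.opts;
        if (!alias) {
          return;
        }

        // Information about the file being compiled
        const { filename, root } = state.file.opts;

        // Directory of the current file, relative to the project root
        const relativePath = localPath.relative(root, localPath.dirname(filename));

        // Read the value of source from the current node; in our example this is '@/common'
        let importSource = path.node.source.value;

        // Go through the configured aliases and replace a matching prefix
        Object.keys(alias).forEach(key => {
          const reg = new RegExp(`^${key}`);
          if (reg.test(importSource)) {
            importSource = importSource.replace(reg, alias[key]);
            importSource = localPath.relative(relativePath, importSource);
            // path.relative yields 'common', not './common', for a file in the
            // same directory; add './' so it is not read as a package name
            if (!importSource.startsWith('.')) {
              importSource = './' + importSource;
            }
          }
        });

        // Build a StringLiteral node with the t.stringLiteral builder and assign it to source
        path.node.source = t.stringLiteral(importSource);
      },
    },
  };
};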

```
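The path arithmetic is the part most worth checking in isolation, and it can be tried without Babel. The function below is a sketch under assumptions: `resolveAlias` is a name invented here, it uses `path.posix` so the output always has forward slashes, it stops at the first matching alias, and it prefixes `./` when `path.relative` returns a bare name such as `common`.

```js
const path = require('path').posix;

// Rewrite an aliased import so it is relative to the importing file.
// filename and root are absolute paths; alias maps a prefix to a directory
// that is relative to root.
function resolveAlias(importSource, filename, root, alias) {
  // Directory of the importing file, relative to the project root
  const fileDir = path.relative(root, path.dirname(filename));

  for (const key of Object.keys(alias)) {
    const reg = new RegExp(`^${key}`);
    if (reg.test(importSource)) {
      // '@/common' -> './src/common' -> path relative to the importing file
      const target = importSource.replace(reg, alias[key]);
      const relative = path.relative(fileDir, target);
      // 'common' would be read as a package name, so make it './common'
      return relative.startsWith('.') ? relative : './' + relative;
    }
  }

  // No alias matched: leave the import untouched
  return importSource;
}

const alias = { '@': './src' };
console.log(resolveAlias('@/common', '/project/src/index.js', '/project', alias));
console.log(resolveAlias('@/common', '/project/src/foo/test/index.js', '/project', alias));
```

Keeping this logic in a plain function also makes it easy to unit test separately from the visitor.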

### Using the plugin

Back in our Babel configuration file, which here is `babel.config.json`:

```json

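{
  "plugins": [
    [
      // A local file is used as the plugin here; you could also publish your plugin as an npm package for others to use
      "./plugin.js",
      // Options for the plugin; the visitor reads them from state.opts on its second argument
      {
        "alias": {
          "@": "./src"
        }
      }
    ]
  ]
}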

```

That completes the whole flow of writing a plugin. Feel free to get in touch and discuss.