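Translated from the original article.
Welcome to the second part of the series on building a custom web editor with TypeScript, React, ANTLR4, and Monaco Editor. Before continuing, I recommend reading part one: Building a custom web editor with TypeScript, React, ANTLR4, and Monaco Editor (1).
In this article I will show how to implement the language service. In the editor, the language service does the heavy lifting of parsing the typed text: we will use the abstract syntax tree (AST) produced by the parser to find syntax or lexical errors, format the text, and offer intelligent suggestions for TODOs as the user types (I will not implement auto-completion in this article). Basically, the language service exposes the following functions: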
format(code: string): string
validate(code: string): Errors[]
autoComplete(code: string, currentPosition: Position): string[]
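As a rough sketch, the three functions could be grouped into one TypeScript interface. The names `LanguageService`, `LanguageError`, and `Position` are assumptions made for illustration (the signatures above mention `Errors` and `Position` without defining them), and the stub returns neutral values instead of doing any real parsing.

```typescript
// Hypothetical types: the article has not defined them at this point.
interface Position {
  lineNumber: number;
  column: number;
}

interface LanguageError {
  message: string;
  startLineNumber: number;
  startColumn: number;
  endLineNumber: number;
  endColumn: number;
}

// The three capabilities listed above, gathered in one contract.
interface LanguageService {
  format(code: string): string;
  validate(code: string): LanguageError[];
  autoComplete(code: string, currentPosition: Position): string[];
}

// Placeholder implementation so the contract can be exercised.
const stubService: LanguageService = {
  format: (code) => code,
  validate: () => [],
  autoComplete: () => [],
};
```

The rest of the article fills in `validate` and `format` with real logic built on the ANTLR parser.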
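I will bring in the ANTLR library and add a script that generates the Parser and Lexer from the TodoLangGrammar.g4 grammar file. First, add the two required libraries: antlr4ts and antlr4ts-cli. Parsers generated by the antlr4 TypeScript target have a runtime dependency on the antlr4ts package; antlr4ts-cli, as its name suggests, is the CLI we will use to generate the Parser and Lexer for the language.

npm add antlr4ts
npm add -D antlr4ts-cli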
Create the file TodoLangGrammar.g4, containing the TodoLang grammar rules, in the project root:

grammar TodoLangGrammar;

todoExpressions : (addExpression)* (completeExpression)*;
addExpression : ADD TODO STRING;
completeExpression : COMPLETE TODO STRING;

ADD : 'ADD';
TODO : 'TODO';
COMPLETE: 'COMPLETE';
STRING: '"' ~ ["]* '"';
EOL: [\r\n] + -> skip;
WS: [ \t] -> skip;
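To get a feel for what the lexer rules do, here is a small hand-written tokenizer that mimics them with regular expressions. It is only an illustration under simplifying assumptions, not what ANTLR generates: the names `tokenize`, `Token`, and `rules` are made up for this sketch, and it does not reproduce ANTLR's longest-match behaviour or error recovery.

```typescript
type TokenType = "ADD" | "TODO" | "COMPLETE" | "STRING";

interface Token {
  type: TokenType;
  text: string;
}

// One anchored regex per lexer rule of the grammar.
const rules: Array<[TokenType, RegExp]> = [
  ["ADD", /^ADD/],
  ["TODO", /^TODO/],
  ["COMPLETE", /^COMPLETE/],
  ["STRING", /^"[^"]*"/],
];

// WS and EOL are marked `-> skip`, so whitespace never becomes a token.
const skipped = /^[ \t\r\n]+/;

function tokenize(input: string): Token[] {
  const tokens: Token[] = [];
  let rest = input;
  while (rest.length > 0) {
    const blank = skipped.exec(rest);
    if (blank) {
      rest = rest.slice(blank[0].length);
      continue;
    }
    let matched = false;
    for (const [type, pattern] of rules) {
      const m = pattern.exec(rest);
      if (m) {
        tokens.push({ type, text: m[0] });
        rest = rest.slice(m[0].length);
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new Error(`Unexpected character: ${rest[0]}`);
    }
  }
  return tokens;
}
```

Feeding it `ADD TODO "Create an editor"` yields three tokens: ADD, TODO, and a STRING that keeps its surrounding quotes, which is also what the generated lexer hands to the parser.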
Now add a script to package.json that generates the Parser and Lexer with antlr4ts-cli:

"antlr4ts": "antlr4ts ./TodoLangGrammar.g4 -o ./src/ANTLR"

Run the antlr4ts script, and the generated TypeScript source of the parser appears in the ./src/ANTLR directory:

npm run antlr4ts
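As we can see, there is a Lexer and a Parser. If you look at the Parser file, you will find that it exports the TodoLangGrammarParser class, whose constructor is constructor(input: TokenStream). That constructor takes as its argument the TokenStream that TodoLangGrammarLexer produces for the given code. TodoLangGrammarLexer, in turn, has a constructor that takes the code as input: constructor(input: CharStream).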
The Parser file contains the method public todoExpressions(): TodoExpressionsContext, which returns the context object for all the TodoExpressions defined in the code. Where does TodoExpressions come from? It derives from the first rule of our grammar file:

todoExpressions : (addExpression)* (completeExpression)*;
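TodoExpressionsContext is the root of the AST. Each node in it is the context of another rule; the tree is made of terminal nodes and rule contexts, and the terminal nodes hold the final tokens (the ADD token, the TODO token, and the token for the todo's name).
TodoExpressionsContext contains the lists of addExpression and completeExpression contexts, which come from the following three rules:

todoExpressions : (addExpression)* (completeExpression)*;
addExpression : ADD TODO STRING;
completeExpression : COMPLETE TODO STRING;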
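Each context class, in turn, contains terminal nodes, which basically hold the text (a code fragment or token, for example ADD, COMPLETE, or the string naming a TODO). The complexity of the AST depends on the grammar rules you write.
Let's look at AddExpressionContext: it contains the ADD, TODO, and STRING terminal nodes, corresponding to the rule:

addExpression : ADD TODO STRING;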
The STRING terminal node holds the text of the todo we want to add. Let's parse a simple piece of TodoLang code to see how the AST works. In the ./src/language-service directory, create a file parser.ts with the following content:

import { TodoLangGrammarParser, TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
import { TodoLangGrammarLexer } from "../ANTLR/TodoLangGrammarLexer";
import { ANTLRInputStream, CommonTokenStream } from "antlr4ts";

export default function parseAndGetASTRoot(code: string): TodoExpressionsContext {
    const inputStream = new ANTLRInputStream(code);
    const lexer = new TodoLangGrammarLexer(inputStream);
    const tokenStream = new CommonTokenStream(lexer);
    const parser = new TodoLangGrammarParser(tokenStream);
    // Parse the input, starting from the `todoExpressions` entry rule of the grammar
    return parser.todoExpressions();
}
parser.ts exports the parseAndGetASTRoot(code) function, which takes TodoLang code and produces the corresponding AST. For example, parse the following TodoLang code:

parseAndGetASTRoot(`
ADD TODO "Create an editor"
COMPLETE TODO "Create an editor"
`)
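The shape of the resulting tree can be pictured with plain objects. This is a hypothetical model written for illustration only: the real nodes are the context classes generated by antlr4ts, and the names `TreeNode`, `terminal`, and `render` do not exist in the project.

```typescript
// Hypothetical plain-object model of a parse tree (not the antlr4ts classes).
type TreeNode =
  | { kind: "rule"; name: string; children: TreeNode[] }
  | { kind: "terminal"; token: string; text: string };

const terminal = (token: string, text: string): TreeNode => ({ kind: "terminal", token, text });

// Tree for: ADD TODO "Create an editor" / COMPLETE TODO "Create an editor"
const tree: TreeNode = {
  kind: "rule",
  name: "todoExpressions",
  children: [
    {
      kind: "rule",
      name: "addExpression",
      children: [terminal("ADD", "ADD"), terminal("TODO", "TODO"), terminal("STRING", '"Create an editor"')],
    },
    {
      kind: "rule",
      name: "completeExpression",
      children: [terminal("COMPLETE", "COMPLETE"), terminal("TODO", "TODO"), terminal("STRING", '"Create an editor"')],
    },
  ],
};

// Depth-first walk that renders one indented line per node.
function render(node: TreeNode, depth = 0): string[] {
  const pad = "  ".repeat(depth);
  if (node.kind === "terminal") {
    return [`${pad}${node.token}: ${node.text}`];
  }
  const lines = [`${pad}${node.name}`];
  for (const child of node.children) {
    lines.push(...render(child, depth + 1));
  }
  return lines;
}
```

Calling `render(tree).join("\n")` prints the root rule, then each expression rule indented beneath it, then that expression's three terminal nodes.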
In this section I will walk you through adding syntax validation to the editor. Out of the box, ANTLR generates lexical and syntax errors for us; we only need to implement the ANTLRErrorListener interface and supply it to the Lexer and Parser, so that we can collect errors while ANTLR parses the code.
In the ./src/language-service directory, create the file TodoLangErrorListener.ts, which exports the TodoLangErrorListener class implementing the ANTLRErrorListener interface:

import { ANTLRErrorListener, RecognitionException, Recognizer } from "antlr4ts";

export interface ITodoLangError {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
    message: string;
    code: string;
}

export default class TodoLangErrorListener implements ANTLRErrorListener<any> {
    private errors: ITodoLangError[] = [];

    syntaxError(recognizer: Recognizer<any, any>, offendingSymbol: any, line: number, charPositionInLine: number, message: string, e: RecognitionException | undefined): void {
        this.errors.push({
            startLineNumber: line,
            endLineNumber: line,
            // ANTLR columns are 0-based, Monaco columns are 1-based
            startColumn: charPositionInLine + 1,
            // Let's suppose the length of the error is only 1 char for simplicity
            endColumn: charPositionInLine + 2,
            message,
            code: "1" // This is the error code; you can customize it as you want
        });
    }

    getErrors(): ITodoLangError[] {
        return this.errors;
    }
}
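One detail worth isolating is the position conversion. ANTLR reports 1-based line numbers but 0-based `charPositionInLine`, while Monaco markers use 1-based lines and columns. A minimal sketch of that mapping follows; `toMarkerRange` and `MarkerRange` are hypothetical helper names introduced here, not part of either library.

```typescript
interface MarkerRange {
  startLineNumber: number;
  startColumn: number;
  endLineNumber: number;
  endColumn: number;
}

// Convert an ANTLR position (1-based line, 0-based column) into a
// single-line, 1-based range. `length` is how many characters to underline.
function toMarkerRange(line: number, charPositionInLine: number, length = 1): MarkerRange {
  const startColumn = charPositionInLine + 1;
  return {
    startLineNumber: line,
    endLineNumber: line,
    startColumn,
    // Always cover at least one character so the marker stays visible.
    endColumn: startColumn + Math.max(1, length),
  };
}
```

For an error at the very start of the first line, `toMarkerRange(1, 0)` gives columns 1 to 2, which underlines the first character.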
Every time ANTLR encounters an error while parsing the code, it calls this TodoLangErrorListener and passes it information about the error. The listener stores the position in the code where the error occurred along with the error message. Now let's attach TodoLangErrorListener to the Lexer and Parser in parser.ts:

import { TodoLangGrammarParser, TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
import { TodoLangGrammarLexer } from "../ANTLR/TodoLangGrammarLexer";
import { ANTLRInputStream, CommonTokenStream } from "antlr4ts";
import TodoLangErrorListener, { ITodoLangError } from "./TodoLangErrorListener";

function parse(code: string): { ast: TodoExpressionsContext, errors: ITodoLangError[] } {
    const inputStream = new ANTLRInputStream(code);
    const lexer = new TodoLangGrammarLexer(inputStream);
    // Replace the default console listeners with ours, on both the lexer and the parser
    lexer.removeErrorListeners();
    const todoLangErrorsListner = new TodoLangErrorListener();
    lexer.addErrorListener(todoLangErrorsListner);
    const tokenStream = new CommonTokenStream(lexer);
    const parser = new TodoLangGrammarParser(tokenStream);
    parser.removeErrorListeners();
    parser.addErrorListener(todoLangErrorsListner);
    const ast = parser.todoExpressions();
    const errors: ITodoLangError[] = todoLangErrorsListner.getErrors();
    return { ast, errors };
}

export function parseAndGetASTRoot(code: string): TodoExpressionsContext {
    const { ast } = parse(code);
    return ast;
}

export function parseAndGetSyntaxErrors(code: string): ITodoLangError[] {
    const { errors } = parse(code);
    return errors;
}
In the ./src/language-service directory, create LanguageService.ts, which exports the following:

import { TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
import { parseAndGetASTRoot, parseAndGetSyntaxErrors } from "./parser";
import { ITodoLangError } from "./TodoLangErrorListener";

export default class TodoLangLanguageService {
    validate(code: string): ITodoLangError[] {
        const syntaxErrors: ITodoLangError[] = parseAndGetSyntaxErrors(code);
        // Later we will append semantic errors
        return syntaxErrors;
    }
}
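Great, we have implemented error parsing for the editor. To use it, I will create the web worker discussed in the previous article and add a worker service proxy, which will call the language service to carry out the editor's advanced features.
First, we call monaco.editor.createWebWorker, which uses built-in ES6 Proxies to create a proxy of TodoLangWorker. TodoLangWorker will use the language service to perform the editor features. Methods that execute in the web worker are proxied by monaco, so calling a method in the web worker amounts to calling the proxied method on the main thread.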
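The idea of proxying method calls can be sketched with a plain ES6 Proxy. This is not Monaco's implementation, only an illustration of the pattern: `EchoWorker`, `createProxy`, and `Send` are hypothetical names, and the "worker" is simulated in-process instead of running in a separate thread.

```typescript
// Hypothetical class standing in for the object that lives inside the worker.
class EchoWorker {
  upper(text: string): string {
    return text.toUpperCase();
  }
}

// Every method of T, but returning a Promise (calls cross a thread boundary).
type Promisified<T> = {
  [K in keyof T]: T[K] extends (...args: infer A) => infer R
    ? (...args: A) => Promise<R>
    : never;
};

// `send` stands in for posting a message to the worker and awaiting the reply.
type Send = (method: string, args: unknown[]) => Promise<unknown>;

function createProxy<T extends object>(send: Send): Promisified<T> {
  const handler: ProxyHandler<object> = {
    // Any property read yields a function that forwards the call by name.
    get: (_target, method) => (...args: unknown[]) => send(String(method), args),
  };
  return new Proxy({}, handler) as unknown as Promisified<T>;
}

// Simulate the worker side in-process: look the method up and call it.
const forwarded: string[] = [];
const target = new EchoWorker();
const proxy = createProxy<EchoWorker>((method, args) => {
  forwarded.push(method);
  const fn = (target as unknown as Record<string, unknown>)[method];
  if (typeof fn !== "function") {
    return Promise.reject(new Error(`Unknown method: ${method}`));
  }
  return Promise.resolve(fn.apply(target, args));
});
```

Calling `proxy.upper("todo")` on the main-thread side never touches `EchoWorker` directly; the call is turned into a method name plus arguments, which is the part a real implementation would send across the worker boundary.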
In the ./src/todo-lang folder, create TodoLangWorker.ts with the following content:

import * as monaco from "monaco-editor-core";
import IWorkerContext = monaco.worker.IWorkerContext;
import TodoLangLanguageService from "../language-service/LanguageService";
import { ITodoLangError } from "../language-service/TodoLangErrorListener";

export class TodoLangWorker {
    private _ctx: IWorkerContext;
    private languageService: TodoLangLanguageService;

    constructor(ctx: IWorkerContext) {
        this._ctx = ctx;
        this.languageService = new TodoLangLanguageService();
    }

    doValidation(): Promise<ITodoLangError[]> {
        const code = this.getTextDocument();
        return Promise.resolve(this.languageService.validate(code));
    }

    private getTextDocument(): string {
        // Only one model is mirrored here, since the editor handles a single file
        const model = this._ctx.getMirrorModels()[0];
        return model.getValue();
    }
}
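We created an instance of the language service and added the doValidation method, which in turn calls the language service's validate method. We also added the getTextDocument method, which fetches the editor's text. The TodoLangWorker class can be extended with many more features, for example if you want to support multi-file editing. _ctx: IWorkerContext is the editor's context object; it holds the models of the open files.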
Now let's create the web worker file todolang.worker.ts in the ./src/todo-lang directory:

import * as worker from 'monaco-editor-core/esm/vs/editor/editor.worker';
import { TodoLangWorker } from './TodoLangWorker';

self.onmessage = () => {
    worker.initialize((ctx) => {
        return new TodoLangWorker(ctx);
    });
};
We use the built-in worker.initialize to initialize our worker and use TodoLangWorker to proxy the necessary methods.
Since this is a web worker, we have to make webpack emit the corresponding worker file:

// webpack.config.js
entry: {
    app: './src/index.tsx',
    "editor.worker": 'monaco-editor-core/esm/vs/editor/editor.worker.js',
    "todoLangWorker": './src/todo-lang/todolang.worker.ts'
},
output: {
    globalObject: 'self',
    filename: (chunkData) => {
        switch (chunkData.chunk.name) {
            case 'editor.worker':
                return 'editor.worker.js';
            case 'todoLangWorker':
                return "todoLangWorker.js"
            default:
                return 'bundle.[hash].js';
        }
    },
    path: path.resolve(__dirname, 'dist')
}
We named the worker file todoLangWorker.js. Now add getWorkerUrl to the editor's setup function:

(window as any).MonacoEnvironment = {
    getWorkerUrl: function (moduleId, label) {
        if (label === languageID)
            return "./todoLangWorker.js";
        return './editor.worker.js';
    }
}
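This is how monaco obtains the URL of the web worker. Note that if the worker's label is the TodoLang language ID, we return the worker file of the same name that webpack bundles. If you build the project now, you should find a file named todoLangWorker.js (or, in dev tools, two workers in the threads section).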
Now create a WorkerManager that manages creating the worker and obtaining the proxied worker client:

import * as monaco from "monaco-editor-core";
import Uri = monaco.Uri;
import { TodoLangWorker } from './TodoLangWorker';
import { languageID } from './config';

export class WorkerManager {
    private worker: monaco.editor.MonacoWebWorker<TodoLangWorker>;
    private workerClientProxy: Promise<TodoLangWorker>;

    constructor() {
        this.worker = null;
    }

    private getClientproxy(): Promise<TodoLangWorker> {
        // Create the worker lazily, the first time a proxy is requested
        if (!this.workerClientProxy) {
            this.worker = monaco.editor.createWebWorker<TodoLangWorker>({
                moduleId: 'TodoLangWorker',
                label: languageID,
                createData: {
                    languageId: languageID,
                }
            });
            this.workerClientProxy = <Promise<TodoLangWorker>><any>this.worker.getProxy();
        }
        return this.workerClientProxy;
    }

    async getLanguageServiceWorker(...resources: Uri[]): Promise<TodoLangWorker> {
        const _client: TodoLangWorker = await this.getClientproxy();
        // Make sure the worker has an up-to-date copy of the requested models
        await this.worker.withSyncedResources(resources);
        return _client;
    }
}
We use createWebWorker to create the web worker proxied by monaco, then obtain and return the proxied client object; workerClientProxy is what we use to call the proxied methods. Next, let's create the DiagnosticsAdapter class, which connects Monaco's markers API to the errors returned by the language service so that parsing errors are marked correctly in monaco:

import * as monaco from "monaco-editor-core";
import { WorkerAccessor } from "./setup";
import { languageID } from "./config";
import { ITodoLangError } from "../language-service/TodoLangErrorListener";

export default class DiagnosticsAdapter {
    constructor(private worker: WorkerAccessor) {
        const onModelAdd = (model: monaco.editor.IModel): void => {
            let handle: any;
            model.onDidChangeContent(() => {
                // Here we are debouncing the user's changes: every time a new change is made,
                // we wait 500ms before validating; if the user is still typing,
                // we cancel the pending validation and start waiting again
                clearTimeout(handle);
                handle = setTimeout(() => this.validate(model.uri), 500);
            });
            this.validate(model.uri);
        };
        monaco.editor.onDidCreateModel(onModelAdd);
        monaco.editor.getModels().forEach(onModelAdd);
    }

    private async validate(resource: monaco.Uri): Promise<void> {
        const worker = await this.worker(resource);
        const errorMarkers = await worker.doValidation();
        const model = monaco.editor.getModel(resource);
        monaco.editor.setModelMarkers(model, languageID, errorMarkers.map(toDiagnostics));
    }
}

function toDiagnostics(error: ITodoLangError): monaco.editor.IMarkerData {
    return {
        ...error,
        severity: monaco.MarkerSeverity.Error,
    };
}
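The debouncing used inside the adapter can be pulled out into a small generic helper. The following is a minimal sketch; `debounce` is a hypothetical utility written for this example and is not provided by Monaco.

```typescript
// Returns a wrapper that runs `fn` only once calls have stopped for `waitMs`.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number
): (...args: A) => void {
  let handle: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    // A new call cancels the run scheduled by the previous call.
    if (handle !== undefined) {
      clearTimeout(handle);
    }
    handle = setTimeout(() => fn(...args), waitMs);
  };
}

// Example: count how often the (pretend) validation actually runs.
let validations = 0;
const scheduleValidation = debounce(() => {
  validations += 1;
}, 10);
```

Calling `scheduleValidation()` several times in quick succession triggers a single run after the quiet period, which is the behaviour the adapter relies on to avoid validating on every keystroke.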
The onDidChangeContent listener watches the model. When the model's content changes, we wait 500 ms after the latest change before calling the web worker to validate the code and add error markers; setModelMarkers tells monaco to add the error markers. To complete the editor's syntax validation, make sure to wire these up in the setup function, and note that we use WorkerManager to obtain the proxied worker:

    // ...inside the setup function
    monaco.languages.onLanguage(languageID, () => {
        monaco.languages.setMonarchTokensProvider(languageID, monarchLanguage);
        monaco.languages.setLanguageConfiguration(languageID, richLanguageConfiguration);
        const client = new WorkerManager();
        const worker: WorkerAccessor = (...uris: monaco.Uri[]): Promise<TodoLangWorker> => {
            return client.getLanguageServiceWorker(...uris);
        };
        // Call the errors provider
        new DiagnosticsAdapter(worker);
    });
} // end of the setup function

export type WorkerAccessor = (...uris: monaco.Uri[]) => Promise<TodoLangWorker>;
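Now everything is ready. Run the project and type some invalid TodoLang code; you will see the errors marked beneath the code.
Now let's add semantic validation to the editor. Recall the two semantic rules I mentioned in the previous article: a TODO that has already been defined with ADD TODO cannot be added again, and COMPLETE cannot be applied to a TODO that has not been defined with ADD TODO.
To check whether a TODO is already defined, we traverse the AST, visit each ADD expression, and push its TODO into definedTodos. Before pushing, we check whether the TODO already exists in definedTodos. If it does, that is a semantic error, so we take the error position from the ADD expression's context and push the error into the array. The second rule works the same way: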

import { AddExpressionContext, CompleteExpressionContext, TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
import { ITodoLangError } from "./TodoLangErrorListener";

function checkSemanticRules(ast: TodoExpressionsContext): ITodoLangError[] {
    const errors: ITodoLangError[] = [];
    const definedTodos: string[] = [];
    // `children` is undefined when the document contains no expressions
    (ast.children || []).forEach(node => {
        if (node instanceof AddExpressionContext) {
            // An Add expression: ADD TODO "STRING"
            const todo = node.STRING().text;
            // If a TODO is defined using the ADD TODO instruction, we cannot re-add it
            if (definedTodos.some(todo_ => todo_ === todo)) {
                // node.stop is the last token (the STRING), which locates the error in the code;
                // ANTLR columns are 0-based and Monaco columns are 1-based, hence the + 1
                errors.push({
                    code: "2",
                    endColumn: node.stop.charPositionInLine + 1 + (node.stop.stopIndex - node.stop.startIndex + 1),
                    endLineNumber: node.stop.line,
                    message: `Todo ${todo} already defined`,
                    startColumn: node.stop.charPositionInLine + 1,
                    startLineNumber: node.stop.line
                });
            } else {
                definedTodos.push(todo);
            }
        } else if (node instanceof CompleteExpressionContext) {
            const todoToComplete = node.STRING().text;
            if (definedTodos.every(todo_ => todo_ !== todoToComplete)) {
                // The todo is not yet defined. We only check the todos defined before this
                // expression, which means the order is important
                errors.push({
                    code: "2",
                    endColumn: node.stop.charPositionInLine + 1 + (node.stop.stopIndex - node.stop.startIndex + 1),
                    endLineNumber: node.stop.line,
                    message: `Todo ${todoToComplete} is not defined`,
                    startColumn: node.stop.charPositionInLine + 1,
                    startLineNumber: node.stop.line
                });
            }
        }
    });
    return errors;
}
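Now call the checkSemanticRules function from the language service's validate method and return the syntax and semantic errors merged together. Our editor now supports semantic validation.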
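The merge itself is not shown in the article, so here is one way it could look. To keep the sketch self-contained, the two error sources are injected as functions: in the project they would presumably wrap `parseAndGetSyntaxErrors(code)` and `checkSemanticRules(parseAndGetASTRoot(code))`. The names `ErrorSource` and `ValidatingLanguageService` are assumptions made for this example.

```typescript
interface ITodoLangError {
  startLineNumber: number;
  startColumn: number;
  endLineNumber: number;
  endColumn: number;
  message: string;
  code: string;
}

// Hypothetical stand-in for "something that produces errors for a piece of code".
type ErrorSource = (code: string) => ITodoLangError[];

class ValidatingLanguageService {
  constructor(
    private readonly syntaxErrors: ErrorSource,
    private readonly semanticErrors: ErrorSource
  ) {}

  validate(code: string): ITodoLangError[] {
    // Syntax errors first, then semantic ones, as a single list for the markers.
    return this.syntaxErrors(code).concat(this.semanticErrors(code));
  }
}
```

Because `concat` returns a new array, neither source's result is mutated, and the diagnostics adapter can keep mapping the combined list to markers unchanged.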
For the editor's auto-formatting feature, you need to provide and register a formatting provider for Monaco by calling the Monaco API registerDocumentFormattingEditProvider; see the monaco-editor documentation for details. Walking the AST gives us everything we need to emit the prettified code:

// LanguageService.ts
format(code: string): string {
    // If the code contains errors, there is no need to format it, because this way of
    // formatting would remove some of the code.
    // To keep things simple, we only allow formatting valid code
    if (this.validate(code).length > 0)
        return code;
    let formattedCode = "";
    const ast: TodoExpressionsContext = parseAndGetASTRoot(code);
    // `children` is undefined when the document contains no expressions
    (ast.children || []).forEach(node => {
        if (node instanceof AddExpressionContext) {
            // An Add expression: ADD TODO "STRING"
            const todo = node.STRING().text;
            formattedCode += `ADD TODO ${todo}\n`;
        } else if (node instanceof CompleteExpressionContext) {
            // A Complete expression: COMPLETE TODO "STRING"
            const todoToComplete = node.STRING().text;
            formattedCode += `COMPLETE TODO ${todoToComplete}\n`;
        }
    });
    return formattedCode;
}
Add a format method to TodoLangWorker; it delegates to the language service's format method.
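A minimal sketch of that method is shown below, following the same pattern as doValidation. To keep it self-contained, the language service is reduced to a hypothetical `FormattingService` interface and the class is called `TodoLangWorkerSketch`; in the project the method would simply be added to the existing TodoLangWorker class.

```typescript
// Hypothetical slice of the language service: only what this sketch needs.
interface FormattingService {
  format(code: string): string;
}

class TodoLangWorkerSketch {
  constructor(private readonly languageService: FormattingService) {}

  // Worker methods are called through a proxy, so they return Promises.
  format(code: string): Promise<string> {
    return Promise.resolve(this.languageService.format(code));
  }
}

// Usage with a stand-in service that just trims the code.
const sketchWorker = new TodoLangWorkerSketch({ format: (code) => code.trim() });
```

The worker takes the code as a parameter here instead of reading the mirrored model, which matches how the formatting provider passes `model.getValue()` to it.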
Now create the TodoLangFormattingProvider class, which implements the DocumentFormattingEditProvider interface:

import * as monaco from "monaco-editor-core";
import { WorkerAccessor } from "./setup";

export default class TodoLangFormattingProvider implements monaco.languages.DocumentFormattingEditProvider {
    constructor(private worker: WorkerAccessor) { }

    provideDocumentFormattingEdits(model: monaco.editor.ITextModel, options: monaco.languages.FormattingOptions, token: monaco.CancellationToken): monaco.languages.ProviderResult<monaco.languages.TextEdit[]> {
        return this.format(model.uri, model.getValue());
    }

    private async format(resource: monaco.Uri, code: string): Promise<monaco.languages.TextEdit[]> {
        // get the worker proxy
        const worker = await this.worker(resource);
        // call the format method proxy of the language service and get the formatted code
        const formattedCode = await worker.format(code);
        // replace the whole document; Monaco lines and columns are 1-based
        const lines = code.split("\n");
        return [
            {
                text: formattedCode,
                range: {
                    startLineNumber: 1,
                    startColumn: 1,
                    endLineNumber: lines.length,
                    endColumn: lines[lines.length - 1].length + 1
                }
            }
        ];
    }
}
TodoLangFormattingProvider calls the format method exposed by the worker, passing model.getValue() as the argument, and gives monaco the formatted code together with the range of code to replace. Now go to the setup function and register the formatting provider with the Monaco registerDocumentFormattingEditProvider API. Rerun the app and the editor supports auto-formatting:

monaco.languages.registerDocumentFormattingEditProvider(languageID, new TodoLangFormattingProvider(worker));

Try clicking Format Document or pressing Shift + Alt + F to see the result.
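To make auto-completion aware of the defined TODOs, all you need to do is collect every defined TODO from the AST and register a completion provider by calling registerCompletionItemProvider in setup. The completion provider gives you the code and the current cursor position, so you can inspect the context the user is typing in; if they are typing a TODO inside a COMPLETE expression, you can suggest the already defined TODOs. Keep in mind that by default Monaco Editor offers completions based on the tokens already present in the code; you may want to disable that and implement your own suggestions to make completion smarter and more context-aware.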
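The suggestion logic could start out as something like the sketch below. It is deliberately simplified and rests on several assumptions: it uses regular expressions as a stand-in for walking the AST, it assumes the caller passes the text of the current line up to the cursor, and the names `getDefinedTodos` and `suggestTodos` are hypothetical. Wiring the result into a Monaco completion provider is left out.

```typescript
// Collect the distinct todo names (quotes included) that appear in ADD TODO expressions.
// Regex stand-in for an AST walk; like the STRING rule, names cannot contain a double quote.
function getDefinedTodos(code: string): string[] {
  const todos: string[] = [];
  const addPattern = /ADD\s+TODO\s+("[^"]*")/g;
  let match: RegExpExecArray | null;
  while ((match = addPattern.exec(code)) !== null) {
    if (todos.indexOf(match[1]) === -1) {
      todos.push(match[1]);
    }
  }
  return todos;
}

// Suggest defined todos only while the user is typing the STRING of a COMPLETE expression.
function suggestTodos(code: string, textBeforeCursor: string): string[] {
  const insideComplete = /COMPLETE\s+TODO\s+"?[^"]*$/.test(textBeforeCursor);
  return insideComplete ? getDefinedTodos(code) : [];
}
```

With two todos defined earlier in the document, typing `COMPLETE TODO ` on a new line would yield both names as suggestions, while typing `ADD TODO ` would yield none.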