項目的完整代碼在 C2j-Compilerjava
在以前完成了描述符號表的數據結構,如今就能夠正式構造符號表了。符號表的建立天然是要根據語法分析過程當中走的,因此符號表的建立就在LRStateTableParser裏的takeActionForReduce方法git
不過在此以前,固然還須要一個方便對這個符號表操做的類了github
這一篇主要的兩個文件是數據結構
操做符號表的方法都在TypeSystem類裏。函數
TypeSystem裏主要有這幾個方法:ui
邏輯都很簡單code
public TypeLink newType(String typeText) { Specifier sp; int type = Specifier.NONE; boolean isLong = false, isSigned = true; switch (typeText.charAt(0)) { case 'c': if (typeText.charAt(1) == 'h') { type = Specifier.CHAR; } break; case 'd': case 'f': System.err.println("Floating point Numbers are not supported"); System.exit(1); break; case 'i': type = Specifier.INT; break; case 'l': isLong = true; break; case 'u': isSigned = false; break; case 'v': if (typeText.charAt(2) == 'i') { type = Specifier.VOID; } break; case 's': //ignore short signed break; default: break; } sp = new Specifier(); sp.setType(type); sp.setLong(isLong); sp.setSign(isSigned); TypeLink link = new TypeLink(false, false, sp); return link; }
其實這一部分有的到後面解釋執行或者代碼生成的時候,如今這個編譯器是不處理的ci
這一部分的邏輯也很簡單作用域
public TypeLink newClass(String classText) { Specifier sp = new Specifier(); sp.setType(Specifier.NONE); setClassType(sp, classText.charAt(0)); TypeLink link = new TypeLink(false, false, sp); return link; } private void setClassType(Specifier sp, char c) { switch(c) { case 0: sp.setStorageClass(Specifier.FIXED); sp.setStatic(false); sp.setExternal(false); break; case 't': sp.setStorageClass(Specifier.TYPEDEF); break; case 'r': sp.setStorageClass(Specifier.REGISTER); break; case 's': sp.setStatic(true); break; case 'e': sp.setExternal(true); break; default: System.err.println("Internal error, Invalid Class type"); System.exit(1); break; } }
addSpecifierToDeclaration是爲當前整個Symbol鏈都加上修飾符,好比碰見int x,y,z;這種狀況get
public Declarator addDeclarator(Symbol symbol, int declaratorType) { Declarator declarator = new Declarator(declaratorType); TypeLink link = new TypeLink(true, false, declarator); symbol.addDeclarator(link); return declarator; } public void addSpecifierToDeclaration(TypeLink specifier, Symbol symbol) { while (symbol != null) { symbol.addSpecifier(specifier); symbol = symbol.getNextSymbol(); } }
剩下用到再提
構造符號表的過程在語法分析的過程裏,也就是進行reduce操做的時候。很好理解問什麼符號表的構建會在reduce操做時發生,由於當發生reduce操做的時候就表明產生了一個變量名或者是產生了一個變量定義,這時候是把它們加入符號表的最好時機
因此在語法分析的過程當中加入一個方法來處理reduce時的操做,此外還須要一個屬性堆棧來輔助操做,屬性堆棧的做用就是用來保存以前的操做,以方便後面使用
好比如今產生了一個修飾符,可是語法分析過程尚未讀入變量名,就先把這個修飾符壓入屬性堆棧,等讀入變量名的時候就能夠建立一個Symbol,再把修飾符彈出堆棧連接到Symbol上
takeActionForReduce方法的參數就是作reduce操做依據的產生式的編號
只看一下比較複雜的:
這是結構定義生成一個結構體類型的declartor,對應的推導式是
* TYPE_SPECIFIER -> STRUCT_SPECIFIER * STTUCT_SPECIFIER -> STRUCT OPT_TAG LC DEF_LIST RC * | STRUCT TAG
先生成一個Specifier再設置它的vStruct屬性,也就是聲明這是一個結構體,以後拿到結構體定義,從這個推導式中咱們能夠看出,結構體定義確定在它的上一步,也就是被放入了屬性堆棧的頂端
這裏對應的推導式是
* SPECIFIERS -> SPECIFIERS TYPE_OR_CLASS
目的其實就是合併多個specifier
case SyntaxProductionInit.ExtDeclList_COMMA_ExtDecl_TO_ExtDeclList: case SyntaxProductionInit.VarList_COMMA_ParamDeclaration_TO_VarList: case SyntaxProductionInit.DeclList_Comma_Decl_TO_DeclList: case SyntaxProductionInit.DefList_Def_TO_DefList:
其實這部分是鏈接一系列變量的定義,好比ExtDeclList_COMMA_ExtDecl_TO_ExtDeclList就是對應像x,y這樣用逗號分割的連續定義,把它們的符號鏈接起來
case SyntaxProductionInit.OptSpecifier_ExtDeclList_Semi_TO_ExtDef: case SyntaxProductionInit.TypeNT_VarDecl_TO_ParamDeclaration: case SyntaxProductionInit.Specifiers_DeclList_Semi_TO_Def:
這裏實際上是完成一個完成的定義,像int x,y;後把拿到Specifier放入到每一個符號去,比較特殊的是拿到Specifier的位置,這是根據reduce次數計算的。
接下來比較須要注意的就是處理struct和function的方法
private void handleStructVariable(Symbol symbol) { if (symbol == null) { return; } boolean isStruct = false; TypeLink typeLink = symbol.typeLinkBegin; Specifier specifier = null; while (typeLink != null) { if (!typeLink.isDeclarator) { specifier = (Specifier) typeLink.getTypeObject(); if (specifier.getType() == Specifier.STRUCTURE) { isStruct = true; break; } } typeLink = typeLink.toNext(); } if (isStruct) { StructDefine structDefine = specifier.getStruct(); Symbol copy = null, headCopy = null, original = structDefine.getFields(); while (original != null) { if (copy != null) { Symbol sym = original.copy(); copy.setNextSymbol(sym); copy = sym; } else { copy = original.copy(); headCopy = copy; } original = original.getNextSymbol(); } symbol.setArgList(headCopy); } }
private void setFunctionSymbol(boolean hasArgs) { Symbol funcSymbol; if (hasArgs) { funcSymbol = (Symbol) valueStack.get(valueStack.size() - 4); } else { funcSymbol = (Symbol) valueStack.get(valueStack.size() - 3); } typeSystem.addDeclarator(funcSymbol, Declarator.FUNCTION); attributeForParentNode = funcSymbol; }
private void takeActionForReduce(int productionNum) { switch (productionNum) { case SyntaxProductionInit.TYPE_TO_TYPE_SPECIFIER: attributeForParentNode = typeSystem.newType(text); break; case SyntaxProductionInit.EnumSpecifier_TO_TypeSpecifier: attributeForParentNode = typeSystem.newType("int"); break; case SyntaxProductionInit.StructSpecifier_TO_TypeSpecifier: attributeForParentNode = typeSystem.newType(text); TypeLink link = (TypeLink) attributeForParentNode; Specifier sp = (Specifier) link.getTypeObject(); sp.setType(Specifier.STRUCTURE); StructDefine struct = (StructDefine) valueStack.get(valueStack.size() - 1); sp.setStruct(struct); break; case SyntaxProductionInit.SPECIFIERS_TypeOrClass_TO_SPECIFIERS: attributeForParentNode = valueStack.peek(); Specifier last = (Specifier) ((TypeLink) valueStack.get(valueStack.size() - 2)).getTypeObject(); Specifier dst = (Specifier) ((TypeLink) attributeForParentNode).getTypeObject(); typeSystem.specifierCopy(dst, last); break; case SyntaxProductionInit.NAME_TO_NewName: case SyntaxProductionInit.Name_TO_NameNT: attributeForParentNode = typeSystem.newSymbol(text, nestingLevel); break; case SyntaxProductionInit.START_VarDecl_TO_VarDecl: case SyntaxProductionInit.Start_VarDecl_TO_VarDecl: typeSystem.addDeclarator((Symbol) attributeForParentNode, Declarator.POINTER); break; case SyntaxProductionInit.VarDecl_LB_ConstExpr_RB_TO_VarDecl: Declarator declarator = typeSystem.addDeclarator((Symbol) valueStack.get(valueStack.size() - 4), Declarator.ARRAY); int arrayNum = (Integer) attributeForParentNode; declarator.setElementNum(arrayNum); attributeForParentNode = valueStack.get(valueStack.size() - 4); break; case SyntaxProductionInit.Name_TO_Unary: attributeForParentNode = typeSystem.getSymbolByText(text, nestingLevel, symbolScope); break; case SyntaxProductionInit.ExtDeclList_COMMA_ExtDecl_TO_ExtDeclList: case SyntaxProductionInit.VarList_COMMA_ParamDeclaration_TO_VarList: case SyntaxProductionInit.DeclList_Comma_Decl_TO_DeclList: case SyntaxProductionInit.DefList_Def_TO_DefList: { Symbol currentSym = (Symbol) attributeForParentNode; Symbol lastSym = null; if (productionNum == SyntaxProductionInit.DefList_Def_TO_DefList) { lastSym = (Symbol) valueStack.get(valueStack.size() - 2); } else { lastSym = (Symbol) valueStack.get(valueStack.size() - 3); } currentSym.setNextSymbol(lastSym); } break; case SyntaxProductionInit.OptSpecifier_ExtDeclList_Semi_TO_ExtDef: case SyntaxProductionInit.TypeNT_VarDecl_TO_ParamDeclaration: case SyntaxProductionInit.Specifiers_DeclList_Semi_TO_Def: Symbol symbol = (Symbol) attributeForParentNode; TypeLink specifier; if (productionNum == SyntaxProductionInit.TypeNT_VarDecl_TO_ParamDeclaration) { specifier = (TypeLink) (valueStack.get(valueStack.size() - 2)); } else { specifier = (TypeLink) (valueStack.get(valueStack.size() - 3)); } typeSystem.addSpecifierToDeclaration(specifier, symbol); typeSystem.addSymbolsToTable(symbol, symbolScope); handleStructVariable(symbol); break; case SyntaxProductionInit.VarDecl_Equal_Initializer_TO_Decl: //Here you need to set the Symbol object for the response, otherwise there will be an error above Symbol symbol = (Symbol)attributeForParentNode; attributeForParentNode = (Symbol) valueStack.get(valueStack.size() - 2); break; case SyntaxProductionInit.NewName_LP_VarList_RP_TO_FunctDecl: setFunctionSymbol(true); Symbol argList = (Symbol) valueStack.get(valueStack.size() - 2); ((Symbol) attributeForParentNode).args = argList; typeSystem.addSymbolsToTable((Symbol) attributeForParentNode, symbolScope); symbolScope = ((Symbol) attributeForParentNode).getName(); Symbol sym = argList; while (sym != null) { sym.addScope(symbolScope); sym = sym.getNextSymbol(); } break; case SyntaxProductionInit.NewName_LP_RP_TO_FunctDecl: setFunctionSymbol(false); typeSystem.addSymbolsToTable((Symbol) attributeForParentNode, symbolScope); symbolScope = ((Symbol) attributeForParentNode).getName(); break; case SyntaxProductionInit.OptSpecifiers_FunctDecl_CompoundStmt_TO_ExtDef: symbol = (Symbol) valueStack.get(valueStack.size() - 2); specifier = (TypeLink) (valueStack.get(valueStack.size() - 3)); typeSystem.addSpecifierToDeclaration(specifier, symbol); symbolScope = GLOBAL_SCOPE; break; case SyntaxProductionInit.Name_To_Tag: symbolScope = text; attributeForParentNode = typeSystem.getStructFromTable(text); if (attributeForParentNode == null) { attributeForParentNode = new StructDefine(text, nestingLevel, null); typeSystem.addStructToTable((StructDefine) attributeForParentNode); } break; case SyntaxProductionInit.Struct_OptTag_LC_DefList_RC_TO_StructSpecifier: Symbol defList = (Symbol) valueStack.get(valueStack.size() - 2); StructDefine structObj = (StructDefine) valueStack.get(valueStack.size() - 4); structObj.setFields(defList); attributeForParentNode = structObj; symbolScope = GLOBAL_SCOPE; break; case SyntaxProductionInit.Enum_TO_EnumNT: enumVal = 0; break; case SyntaxProductionInit.NameNT_TO_Emurator: doEnum(); break; case SyntaxProductionInit.Name_Eequal_ConstExpr_TO_Enuerator: enumVal = (Integer) (valueStack.get(valueStack.size() - 1)); attributeForParentNode = (Symbol) (valueStack.get(valueStack.size() - 3)); doEnum(); break; case SyntaxProductionInit.Number_TO_ConstExpr: case SyntaxProductionInit.Number_TO_Unary: attributeForParentNode = Integer.valueOf(text); break; default: break; } astBuilder.buildSyntaxTree(productionNum, text); }
在上面說的構造符號表的過程還有一個點沒有說,就是每一個符號的做用域問題。
在LRStateTableParser有兩個屬性
nestingLevel在shift操做碰見左右括號時,就會進行相應的加減
symbolScope則是在reduce過程,若是碰見一個函數定義或者完成一個完整的函數定義就會進行相應的設置
這一篇就是利用reduce操做的時候,咱們能夠知道是根據哪一個產生式作的reduce,就能夠在其中插入符號表的構建。過程看着可能比較複雜,但其實就是:
根據reduce的產生式建立信息 -> 根據reduce的產生式來組合信息
另外的github博客:https://dejavudwh.cn