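As an iOS engineer, every time I watch Xcode grind through a long build I can't help wondering how the bugs I wrote by hand actually get generated, so I decided to dig into the compiler we use.
To start exploring, we first need to know that the compiler we use is LLVM.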
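LLVM is a free software project. It is a compiler infrastructure written in C++ that contains a set of modular compiler components and toolchains for developing compiler front ends and back ends. It is designed for programs written in arbitrary programming languages and uses virtualization techniques to optimize at compile time, link time, run time, and "idle time". It originally targeted C/C++, and today it supports languages including ActionScript, Ada, D, Fortran, GLSL, Haskell, Java bytecode, Objective-C, Swift, Python, Ruby, Rust, Scala, and C#.
(The above is excerpted from Wikipedia.)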
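Two compilers are commonly seen today: GCC and LLVM.
We briefly introduced LLVM above; here is Wikipedia's definition of GCC:
The GNU Compiler Collection (GCC) is a set of programming language compilers. It is free software released under the GPL and LGPL licenses, a key part of the GNU Project, and one of the main components of the GNU toolchain. GCC (especially its C compiler) is often regarded as the de facto standard for cross-platform compilers. Richard Matthew Stallman began developing it in 1985, and it is now maintained by the Free Software Foundation.
It was originally named the GNU C Compiler because it could only handle C. After its release GCC was quickly extended to handle C++, and later Fortran, Pascal, Objective-C, Java, Ada, Go, and other languages.
Many operating systems, including many Unix-like systems such as Linux and the BSD family, use GCC as their standard compiler.
The Xcode we use today is built on LLVM, but it used GCC in the past, as the table below shows.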
Xcode version | Compiler |
---|---|
< Xcode 3 | GCC |
Xcode 3 | GCC + LLVM |
Xcode 4.2 | LLVM-Clang by default |
> Xcode 5 | GCC dropped |
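So, given that both are compilers, why did Xcode end up choosing LLVM and abandoning GCC?
The traditional three-phase design splits a compiler into a front end that parses the source code, an optimizer, and a back end that generates machine code.
For iOS developers, the whole pipeline can be summarized briefly: Clang processes the source code and emits an intermediate representation, and LLVM takes Clang's output as its input and generates machine code.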
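To make the three-phase split concrete, here is a minimal C++ sketch. Everything in it is hypothetical and made up for illustration (the `Instr` type, `frontEnd`, `optimize`, `backEndStack`, and the tiny "sum of integers" input language); it is not how Clang or LLVM are implemented. It only shows that the phases communicate through a shared intermediate representation.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical shared IR: a flat list of "push N" and "add" operations.
struct Instr {
    std::string op;
    int operand;
};
using IR = std::vector<Instr>;

// Front end (toy): parses sums such as "1 + 2 + 3" into IR.
IR frontEnd(const std::string &source) {
    IR ir;
    std::istringstream in(source);
    int value = 0;
    char plus = 0;
    if (in >> value) ir.push_back({"push", value});
    while (in >> plus >> value) {
        ir.push_back({"push", value});
        ir.push_back({"add", 0});
    }
    return ir;
}

// Optimizer (toy): folds the whole constant sum into a single push.
IR optimize(const IR &ir) {
    int sum = 0;
    for (const Instr &i : ir)
        if (i.op == "push") sum += i.operand;
    IR folded;
    folded.push_back({"push", sum});
    return folded;
}

// Back end (toy): lowers IR to text for an imaginary stack machine.
// Only this phase would change if we targeted a different machine.
std::string backEndStack(const IR &ir) {
    std::ostringstream out;
    for (const Instr &i : ir) {
        out << i.op;
        if (i.op == "push") out << ' ' << i.operand;
        out << '\n';
    }
    return out.str();
}
```

Calling `backEndStack(optimize(frontEnd("1 + 2 + 3")))` chains the three phases. Supporting a new language would mean writing another front end that produces the same `IR`, and supporting a new target would mean writing another back end that consumes it.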
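That brings us to the focus of this article: Clang, the front end of the LLVM compiler.
The project was started by Apple in 2005 as the front end of the LLVM compiler toolset. Its goal is to produce the abstract syntax tree (AST) for the source code and compile the code to LLVM bitcode. The back end then uses LLVM to compile that into platform-specific machine language. Clang supports C, C++, and Objective-C.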
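In Clang, Stmt represents a statement. The units of Clang code are all statements, and the node type of the syntax tree is Stmt. An expression is also a kind of statement: Clang uses Expr to represent an expression, and Expr itself inherits from Stmt. Each node holds a list of child nodes (sub-node list).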
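The relationship between statements and expressions can be pictured with a toy class hierarchy. This is only a sketch: the class names are borrowed from Clang, but the members, `countNodes`, and `makeReturnZero` are hypothetical and far simpler than Clang's real classes.

```cpp
#include <memory>
#include <string>
#include <vector>

// Toy model only; Clang's real Stmt and Expr classes are much richer.
struct Stmt {
    std::vector<std::unique_ptr<Stmt>> children;  // the sub-node list
    virtual ~Stmt() = default;
    virtual std::string name() const { return "Stmt"; }
};

// An expression is a kind of statement, so Expr derives from Stmt.
struct Expr : Stmt {
    std::string name() const override { return "Expr"; }
};

struct IntegerLiteral : Expr {
    int value;
    explicit IntegerLiteral(int v) : value(v) {}
    std::string name() const override { return "IntegerLiteral"; }
};

struct ReturnStmt : Stmt {
    std::string name() const override { return "ReturnStmt"; }
};

// Counts a node plus all of its descendants.
int countNodes(const Stmt &node) {
    int total = 1;
    for (const auto &child : node.children) total += countNodes(*child);
    return total;
}

// Builds the tree for `return 0;`: a ReturnStmt with one IntegerLiteral child.
std::unique_ptr<Stmt> makeReturnZero() {
    std::unique_ptr<Stmt> ret = std::make_unique<ReturnStmt>();
    ret->children.push_back(std::make_unique<IntegerLiteral>(0));
    return ret;
}
```

Because `IntegerLiteral` is an `Expr` and every `Expr` is a `Stmt`, a single child list of `Stmt` pointers is enough to hold both statements and expressions.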
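Clang itself performs very well: the AST it generates uses only about 20% of the memory that GCC's does. FreeBSD has used Clang/LLVM as its default compiler since version 10.0, released in January 2014 [3].
Compiling with Clang goes through several steps, which we walk through one by one.
Let's create a Command Line Tool project to try this out; the demo is hosted on GitHub.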
First, open Xcode and create the project, choosing Objective-C as the language. Then locate main.m and ask Clang to print its compilation phases:

    clang -ccc-print-phases ClangTest/main.m

The output is:

    0: input, "ClangTest/main.m", objective-c
    1: preprocessor, {0}, objective-c-cpp-output
    2: compiler, {1}, ir
    3: backend, {2}, assembler
    4: assembler, {3}, object
    5: linker, {4}, image
    6: bind-arch, "x86_64", {5}, image
The first phase is preprocessing, that is, macro expansion and header import:

    clang -E ClangTest/main.m

The output looks like this (the earlier part is omitted):

    # 1 "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Foundation.framework/Headers/FoundationLegacySwiftCompatibility.h" 1 3
    # 185 "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Foundation.framework/Headers/Foundation.h" 2 3
    # 10 "ClangTest/main.m" 2
    int main(int argc, const char * argv[]) {
        @autoreleasepool {
            NSLog(@"Hello, World!");
        }
        return 0;
    }
Next comes lexical analysis, which splits the source into tokens:

    clang -fmodules -fsyntax-only -Xclang -dump-tokens ClangTest/main.m

The output is:

    annot_module_include '#import <Foundation/Foundation.h>
    int main(int argc, const char * argv[]) {
        @autoreleasepool {
            // insert code here...' Loc=<ClangTest/main.m:9:1>
    int 'int' [StartOfLine] Loc=<ClangTest/main.m:11:1>
    identifier 'main' [LeadingSpace] Loc=<ClangTest/main.m:11:5>
    l_paren '(' Loc=<ClangTest/main.m:11:9>
    int 'int' Loc=<ClangTest/main.m:11:10>
    identifier 'argc' [LeadingSpace] Loc=<ClangTest/main.m:11:14>
    comma ',' Loc=<ClangTest/main.m:11:18>
    const 'const' [LeadingSpace] Loc=<ClangTest/main.m:11:20>
    char 'char' [LeadingSpace] Loc=<ClangTest/main.m:11:26>
    star '*' [LeadingSpace] Loc=<ClangTest/main.m:11:31>
    identifier 'argv' [LeadingSpace] Loc=<ClangTest/main.m:11:33>
    l_square '[' Loc=<ClangTest/main.m:11:37>
    r_square ']' Loc=<ClangTest/main.m:11:38>
    r_paren ')' Loc=<ClangTest/main.m:11:39>
    l_brace '{' [LeadingSpace] Loc=<ClangTest/main.m:11:41>
    at '@' [StartOfLine] [LeadingSpace] Loc=<ClangTest/main.m:12:5>
    identifier 'autoreleasepool' Loc=<ClangTest/main.m:12:6>
    l_brace '{' [LeadingSpace] Loc=<ClangTest/main.m:12:22>
    identifier 'NSLog' [StartOfLine] [LeadingSpace] Loc=<ClangTest/main.m:14:9>
    l_paren '(' Loc=<ClangTest/main.m:14:14>
    at '@' Loc=<ClangTest/main.m:14:15>
    string_literal '"Hello, World!"' Loc=<ClangTest/main.m:14:16>
    r_paren ')' Loc=<ClangTest/main.m:14:31>
    semi ';' Loc=<ClangTest/main.m:14:32>
    r_brace '}' [StartOfLine] [LeadingSpace] Loc=<ClangTest/main.m:15:5>
    return 'return' [StartOfLine] [LeadingSpace] Loc=<ClangTest/main.m:16:5>
    numeric_constant '0' [LeadingSpace] Loc=<ClangTest/main.m:16:12>
    semi ';' Loc=<ClangTest/main.m:16:13>
    r_brace '}' [StartOfLine] Loc=<ClangTest/main.m:17:1>
    eof '' Loc=<ClangTest/main.m:17:2>

As we can see, the parentheses, symbols, keywords, and so on have all been split out.
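To get a feel for what splitting source into tokens means, here is a toy tokenizer. It is a rough sketch and not Clang's lexer: the `Token` struct and `tokenize` function are hypothetical, the kind names are only loosely modeled on the dump above, and keywords such as `int` are simply reported as identifiers.

```cpp
#include <cctype>
#include <string>
#include <vector>

// A token: its kind (named loosely after Clang's token kinds) and its text.
struct Token {
    std::string kind;
    std::string text;
};

// Toy lexer: splits input into identifiers, numeric constants, and
// single-character punctuation. Whitespace is skipped.
std::vector<Token> tokenize(const std::string &src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
            continue;
        }
        std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            // Consume letters, digits, and underscores.
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_'))
                ++i;
            tokens.push_back({"identifier", src.substr(start, i - start)});
        } else if (std::isdigit(c)) {
            // Consume a run of digits.
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i])))
                ++i;
            tokens.push_back({"numeric_constant", src.substr(start, i - start)});
        } else {
            // Anything else is treated as one punctuation character.
            ++i;
            tokens.push_back({"punctuation", src.substr(start, 1)});
        }
    }
    return tokens;
}
```

Feeding it `return 0;` yields three tokens: `return`, `0`, and `;`. A real lexer additionally records source locations and flags such as StartOfLine, as the Clang dump shows.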
Then syntax analysis turns the tokens into an abstract syntax tree (AST):

    clang -fmodules -fsyntax-only -Xclang -ast-dump ClangTest/main.m

The AST output is:

    TranslationUnitDecl 0x7f95b28032e8 <<invalid sloc>> <invalid sloc>
    |-TypedefDecl 0x7f95b2803b80 <<invalid sloc>> <invalid sloc> implicit __int128_t '__int128'
    | `-BuiltinType 0x7f95b2803880 '__int128'
    |-TypedefDecl 0x7f95b2803be8 <<invalid sloc>> <invalid sloc> implicit __uint128_t 'unsigned __int128'
    | `-BuiltinType 0x7f95b28038a0 'unsigned __int128'
    |-TypedefDecl 0x7f95b2803c80 <<invalid sloc>> <invalid sloc> implicit SEL 'SEL *'
    | `-PointerType 0x7f95b2803c40 'SEL *' imported
    |   `-BuiltinType 0x7f95b2803ae0 'SEL'
    |-TypedefDecl 0x7f95b2803d58 <<invalid sloc>> <invalid sloc> implicit id 'id'
    | `-ObjCObjectPointerType 0x7f95b2803d00 'id' imported
    |   `-ObjCObjectType 0x7f95b2803cd0 'id' imported
    |-TypedefDecl 0x7f95b2803e38 <<invalid sloc>> <invalid sloc> implicit Class 'Class'
    | `-ObjCObjectPointerType 0x7f95b2803de0 'Class' imported
    |   `-ObjCObjectType 0x7f95b2803db0 'Class' imported
    |-ObjCInterfaceDecl 0x7f95b2803e88 <<invalid sloc>> <invalid sloc> implicit Protocol
    |-TypedefDecl 0x7f95b28465e8 <<invalid sloc>> <invalid sloc> implicit __NSConstantString 'struct __NSConstantString_tag'
    | `-RecordType 0x7f95b2846400 'struct __NSConstantString_tag'
    |   `-Record 0x7f95b2803f50 '__NSConstantString_tag'
    |-TypedefDecl 0x7f95b2846680 <<invalid sloc>> <invalid sloc> implicit __builtin_ms_va_list 'char *'
    | `-PointerType 0x7f95b2846640 'char *'
    |   `-BuiltinType 0x7f95b2803380 'char'
    |-TypedefDecl 0x7f95b2846948 <<invalid sloc>> <invalid sloc> implicit __builtin_va_list 'struct __va_list_tag [1]'
    | `-ConstantArrayType 0x7f95b28468f0 'struct __va_list_tag [1]' 1
    |   `-RecordType 0x7f95b2846770 'struct __va_list_tag'
    |     `-Record 0x7f95b28466d0 '__va_list_tag'
    |-ImportDecl 0x7f95b30612f8 <ClangTest/main.m:9:1> col:1 implicit Foundation
    |-FunctionDecl 0x7f95b30615a8 <line:11:1, line:17:1> line:11:5 main 'int (int, const char **)'
    | |-ParmVarDecl 0x7f95b3061348 <col:10, col:14> col:14 argc 'int'
    | |-ParmVarDecl 0x7f95b3061460 <col:20, col:38> col:33 argv 'const char **':'const char **'
    | `-CompoundStmt 0x7f95b2260ae8 <col:41, line:17:1>
    |   |-ObjCAutoreleasePoolStmt 0x7f95b2260aa0 <line:12:5, line:15:5>
    |   | `-CompoundStmt 0x7f95b2260a88 <line:12:22, line:15:5>
    |   |   `-CallExpr 0x7f95b2260a40 <line:14:9, col:31> 'void'
    |   |     |-ImplicitCastExpr 0x7f95b2260a28 <col:9> 'void (*)(id, ...)' <FunctionToPointerDecay>
    |   |     | `-DeclRefExpr 0x7f95b2260910 <col:9> 'void (id, ...)' Function 0x7f95b30616e8 'NSLog' 'void (id, ...)'
    |   |     `-ImplicitCastExpr 0x7f95b2260a70 <col:15, col:16> 'id':'id' <BitCast>
    |   |       `-ObjCStringLiteral 0x7f95b22609b0 <col:15, col:16> 'NSString *'
    |   |         `-StringLiteral 0x7f95b2260978 <col:16> 'char [14]' lvalue "Hello, World!"
    |   `-ReturnStmt 0x7f95b2260ad0 <line:16:5, col:12>
    |     `-IntegerLiteral 0x7f95b2260ab0 <col:12> 'int' 0
    `-<undeserialized declarations>
In this step, CodeGen walks the AST from the top down and produces the intermediate layer, that is, the IR:

    clang -S -fobjc-arc -emit-llvm ClangTest/main.m -o main.ll

The generated main.ll looks like this:

    ; ModuleID = 'ClangTest/main.m'
    source_filename = "ClangTest/main.m"
    target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-apple-macosx10.14.0"

    %struct.__NSConstantString_tag = type { i32*, i32, i8*, i64 }

    @__CFConstantStringClassReference = external global [0 x i32]
    @.str = private unnamed_addr constant [14 x i8] c"Hello, World!\00", section "__TEXT,__cstring,cstring_literals", align 1
    @_unnamed_cfstring_ = private global %struct.__NSConstantString_tag { i32* getelementptr inbounds ([0 x i32], [0 x i32]* @__CFConstantStringClassReference, i32 0, i32 0), i32 1992, i8* getelementptr inbounds ([14 x i8], [14 x i8]* @.str, i32 0, i32 0), i64 13 }, section "__DATA,__cfstring", align 8

    ; Function Attrs: noinline optnone ssp uwtable
    define i32 @main(i32, i8**) #0 {
      %3 = alloca i32, align 4
      %4 = alloca i32, align 4
      %5 = alloca i8**, align 8
      store i32 0, i32* %3, align 4
      store i32 %0, i32* %4, align 4
      store i8** %1, i8*** %5, align 8
      %6 = call i8* @objc_autoreleasePoolPush() #2
      notail call void (i8*, ...) @NSLog(i8* bitcast (%struct.__NSConstantString_tag* @_unnamed_cfstring_ to i8*))
      call void @objc_autoreleasePoolPop(i8* %6)
      ret i32 0
    }

    declare i8* @objc_autoreleasePoolPush()

    declare void @NSLog(i8*, ...) #1

    declare void @objc_autoreleasePoolPop(i8*)

    attributes #0 = { noinline optnone ssp uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+fxsr,+mmx,+sahf,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
    attributes #1 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+fxsr,+mmx,+sahf,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
    attributes #2 = { nounwind }

    !llvm.module.flags = !{!0, !1, !2, !3, !4, !5, !6, !7}
    !llvm.ident = !{!8}

    !0 = !{i32 2, !"SDK Version", [2 x i32] [i32 10, i32 14]}
    !1 = !{i32 1, !"Objective-C Version", i32 2}
    !2 = !{i32 1, !"Objective-C Image Info Version", i32 0}
    !3 = !{i32 1, !"Objective-C Image Info Section", !"__DATA,__objc_imageinfo,regular,no_dead_strip"}
    !4 = !{i32 4, !"Objective-C Garbage Collection", i32 0}
    !5 = !{i32 1, !"Objective-C Class Properties", i32 64}
    !6 = !{i32 1, !"wchar_size", i32 4}
    !7 = !{i32 7, !"PIC Level", i32 2}
    !8 = !{!"Apple LLVM version 10.0.1 (clang-1001.0.46.4)"}
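The idea of walking a tree from the root down and emitting one instruction per node can be sketched with a toy emitter. This is not Clang's CodeGen: the `Node` type and the `lit`, `add`, and `emit` helpers are hypothetical, and the emitted text only imitates the look of LLVM IR.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy expression node: either an integer literal or an addition.
struct Node {
    bool isLiteral;
    int value;                       // used when isLiteral is true
    std::unique_ptr<Node> lhs, rhs;  // used when isLiteral is false
};

// Creates a literal node.
std::unique_ptr<Node> lit(int v) {
    auto n = std::make_unique<Node>();
    n->isLiteral = true;
    n->value = v;
    return n;
}

// Creates an addition node that owns its two operands.
std::unique_ptr<Node> add(std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    auto n = std::make_unique<Node>();
    n->isLiteral = false;
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
}

// Starts at the given node, recurses into its children, and appends one
// IR-like line per addition. Returns the operand text naming this node's
// value: the literal itself, or the temporary that holds the sum.
std::string emit(const Node &n, std::vector<std::string> &out, int &nextTemp) {
    if (n.isLiteral) return std::to_string(n.value);
    std::string l = emit(*n.lhs, out, nextTemp);
    std::string r = emit(*n.rhs, out, nextTemp);
    std::string temp = "%" + std::to_string(nextTemp++);
    out.push_back(temp + " = add i32 " + l + ", " + r);
    return temp;
}
```

For the tree `(1 + 2) + 3`, with the temporary counter starting at 1, this produces two lines, one per addition, with the second line consuming the temporary defined by the first. The numbered temporaries mirror the `%3`, `%4`, and so on in the real IR above.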
Next, the assembly is generated:

    clang -S -fobjc-arc ClangTest/main.m -o main.s

It looks like this:

        .section    __TEXT,__text,regular,pure_instructions
        .build_version macos, 10, 14    sdk_version 10, 14
        .globl  _main                   ## -- Begin function main
        .p2align    4, 0x90
    _main:                              ## @main
        .cfi_startproc
    ## %bb.0:
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register %rbp
        subq    $32, %rsp
        movl    $0, -4(%rbp)
        movl    %edi, -8(%rbp)
        movq    %rsi, -16(%rbp)
        callq   _objc_autoreleasePoolPush
        leaq    L__unnamed_cfstring_(%rip), %rsi
        movq    %rsi, %rdi
        movq    %rax, -24(%rbp)         ## 8-byte Spill
        movb    $0, %al
        callq   _NSLog
        movq    -24(%rbp), %rdi         ## 8-byte Reload
        callq   _objc_autoreleasePoolPop
        xorl    %eax, %eax
        addq    $32, %rsp
        popq    %rbp
        retq
        .cfi_endproc
                                        ## -- End function
        .section    __TEXT,__cstring,cstring_literals
    L_.str:                             ## @.str
        .asciz  "Hello, World!"

        .section    __DATA,__cfstring
        .p2align    3                   ## @_unnamed_cfstring_
    L__unnamed_cfstring_:
        .quad   ___CFConstantStringClassReference
        .long   1992                    ## 0x7c8
        .space  4
        .quad   L_.str
        .quad   13                      ## 0xd

        .section    __DATA,__objc_imageinfo,regular,no_dead_strip
    L_OBJC_IMAGE_INFO:
        .long   0
        .long   64

    .subsections_via_symbols
Then the object file is generated:

    clang -fmodules -c ClangTest/main.m -o main.o

Finally, the object file is linked into an executable:

    clang main.o -o main

Run it:

    ./main
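In this example we did not write a single line of code ourselves, yet it is clear that the compiler does quite a lot for us. Next, I want to add a plugin to the compiler that prints every method name in the project.
With all that said, let's download and build the latest Clang from the official site.
Then run:

    ./bin/clang-9 --version

If Clang prints its version information normally, everything is fine.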
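We will keep using the ClangTest project from before, but here we replace the Clang that Xcode uses with the version we built ourselves.
Click the project, open Build Settings, click the plus button, and choose Add User-Defined Setting.
Add the two settings there,
replacing the path with the path to your own build output.
Then search for Enable Index-While-Building Functionality and change its value to No.
Now you can press Command+B to build.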
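Now let's develop our first plugin, which prints all method names.
First, go to llvm-project/clang/examples and create a new folder named DemoPlugin, so that it matches the add_subdirectory call and the target name used in the rest of this walkthrough.
Inside DemoPlugin, create two files: DemoPlugin.cpp and CMakeLists.txt.
Open llvm-project/clang/examples/CMakeLists.txt and append this line at the end of the file:

    add_subdirectory(DemoPlugin)

Now let's write the DemoPlugin plugin itself.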
CMakeLists.txt:

    # If we don't need RTTI or EH, there's no reason to export anything
    # from the plugin.
    if( NOT MSVC ) # MSVC mangles symbols differently, and
                   # PrintFunctionNames.export contains C++ symbols.
      if( NOT LLVM_REQUIRES_RTTI )
        if( NOT LLVM_REQUIRES_EH )
          set(LLVM_EXPORTED_SYMBOL_FILE ${CMAKE_CURRENT_SOURCE_DIR}/DemoPlugin.exports)
        endif()
      endif()
    endif()

    add_llvm_library(DemoPlugin MODULE DemoPlugin.cpp PLUGIN_TOOL clang)

    if(LLVM_ENABLE_PLUGINS AND (WIN32 OR CYGWIN))
      target_link_libraries(DemoPlugin PRIVATE
        clangAST
        clangBasic
        clangFrontend
        LLVMSupport
        )
    endif()
DemoPlugin.cpp:

    #include <memory>
    #include <set>
    #include <string>
    #include <vector>

    #include "clang/AST/AST.h"
    #include "clang/AST/ASTConsumer.h"
    #include "clang/AST/RecursiveASTVisitor.h"
    #include "clang/Frontend/CompilerInstance.h"
    #include "clang/Frontend/FrontendPluginRegistry.h"

    using namespace clang;

    namespace {

    // Performs a depth-first traversal of the whole AST and visits every node,
    // so that we can handle the node types we care about.
    class DemoPluginVisitor : public RecursiveASTVisitor<DemoPluginVisitor> {
    private:
      CompilerInstance &Instance;
      ASTContext *Context = nullptr;

    public:
      void setASTContext(ASTContext &context) { this->Context = &context; }

      DemoPluginVisitor(CompilerInstance &Instance) : Instance(Instance) {}

      // Report class names.
      bool VisitObjCInterfaceDecl(ObjCInterfaceDecl *declaration) {
        if (isUserSourceCode(declaration)) {
          DiagnosticsEngine &D = Instance.getDiagnostics();
          unsigned diagID = D.getCustomDiagID(DiagnosticsEngine::Warning, "Found a class name: %0");
          D.Report(declaration->getBeginLoc(), diagID) << declaration->getName();
        }
        return true;
      }

      // Report method names.
      bool VisitObjCMethodDecl(ObjCMethodDecl *declaration) {
        if (isUserSourceCode(declaration)) {
          DiagnosticsEngine &D = Instance.getDiagnostics();
          unsigned diagID = D.getCustomDiagID(DiagnosticsEngine::Warning, "Found a method name: %0");
          // D.Report(declaration->getLocStart(), diagID).AddString(declaration->getSelector().getAsString());
          D.Report(declaration->getBeginLoc(), diagID) << declaration->getSelector().getAsString();
        }
        return true;
      }

      // Whether the declaration comes from user code.
      bool isUserSourceCode(Decl *decl) {
        std::string filename = Instance.getSourceManager().getFilename(decl->getSourceRange().getBegin()).str();
        if (filename.empty())
          return false;
        // Treat any source that does not live inside Xcode as user source code.
        if (filename.find("/Applications/Xcode.app/") == 0)
          return false;
        return true;
      }
    };

    class DemoPluginConsumer : public ASTConsumer {
    private:
      DemoPluginVisitor visitor;
      CompilerInstance &Instance;
      std::set<std::string> ParsedTemplates;

    public:
      // Initializers are listed in member declaration order.
      DemoPluginConsumer(CompilerInstance &Instance, std::set<std::string> ParsedTemplates)
          : visitor(Instance), Instance(Instance), ParsedTemplates(ParsedTemplates) {}

      // Called each time a top-level declaration is parsed;
      // returning true tells the parser to continue.
      bool HandleTopLevelDecl(DeclGroupRef DG) override { return true; }

      // Entry point of the ASTConsumer, called once the whole translation unit is parsed.
      void HandleTranslationUnit(ASTContext &context) override {
        visitor.setASTContext(context);
        visitor.TraverseDecl(context.getTranslationUnitDecl());
      }
    };

    class DemoPluginASTAction : public PluginASTAction {
      std::set<std::string> ParsedTemplates;

    protected:
      std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &CI, llvm::StringRef) override {
        return llvm::make_unique<DemoPluginConsumer>(CI, ParsedTemplates);
      }

      // Receives the plugin's command-line arguments; returning true lets the plugin run.
      bool ParseArgs(const CompilerInstance &CI, const std::vector<std::string> &args) override {
        return true;
      }
    };

    } // namespace

    static clang::FrontendPluginRegistry::Add<DemoPluginASTAction> X("DemoPlugin", "demo plugin");
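The declaration `class DemoPluginVisitor : public RecursiveASTVisitor<DemoPluginVisitor>` uses the curiously recurring template pattern (CRTP): the base class owns the traversal and calls back into the derived class for each node. The standalone sketch below imitates that shape without depending on Clang. All names in it (`Decl`, `ToyVisitor`, `MethodCollector`) are hypothetical, and Clang's real visitor has many more hooks.

```cpp
#include <string>
#include <vector>

// Toy declaration node: a kind such as "class" or "method", a name, and children.
struct Decl {
    std::string kind;
    std::string name;
    std::vector<Decl> children;
};

// CRTP base: the traversal lives here, the per-node hook lives in Derived.
template <typename Derived>
class ToyVisitor {
public:
    // Depth-first walk; stops and returns false as soon as a hook returns false.
    bool traverse(const Decl &d) {
        if (!static_cast<Derived *>(this)->visit(d)) return false;
        for (const Decl &child : d.children)
            if (!traverse(child)) return false;
        return true;
    }

    // Default hook: do nothing and keep going.
    bool visit(const Decl &) { return true; }
};

// Collects the names of all method declarations, similar in spirit to
// what VisitObjCMethodDecl does in the plugin.
class MethodCollector : public ToyVisitor<MethodCollector> {
public:
    std::vector<std::string> methods;

    bool visit(const Decl &d) {
        if (d.kind == "method") methods.push_back(d.name);
        return true;
    }
};
```

Because the hook is resolved at compile time through `static_cast<Derived *>`, no virtual functions are needed. In the plugin, returning `true` from a `Visit...` method plays the same role as returning `true` from `visit` here: it tells the traversal to continue.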
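First, we need to build the plugin:

    cmake -DLLVM_ENABLE_PROJECTS=clang -G "Unix Makefiles" ../llvm
    make DemoPlugin

After a successful build, DemoPlugin.dylib can be found in the lib directory under build.
In the project, open the Other C Flags setting and add the following:

    -Xclang -load -Xclang /Users/felix/Documents/llvm-project/build/lib/DemoPlugin.dylib -Xclang -add-plugin -Xclang DemoPlugin

Now we can build the project, and we can see that every method name and class name has been found.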