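In the previous article we looked at how CoreCLR defines Object. In this one we look at how CoreCLR defines and handles new.
new is a keyword every .NET programmer knows well and uses every day, but what does new actually do?
To keep the length manageable and the difficulty from jumping too high, some topics are deliberately left for later articles. Please be patient.
The text below uses a number of terms and abbreviations. If you run into one you do not recognize, look it up in this list: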
BasicBlock: a group of instructions in the same branch, linked together as a doubly linked list
GenTree: statement tree; node types start with GT
Importation: the process of generating GenTree from BasicBlock
Lowering: making the statement tree concrete so that every node can be translated unambiguously to machine code
SSA: Static Single Assignment
R2R: Ready To Run
Phases: the stages the JIT goes through when compiling IL to machine code
JIT: Just In Time
CEE: CLR Execution Engine
ee: Execution Engine
EH: Exception Handling
Cor: CoreCLR
comp: Compiler
fg: FlowGraph
imp: Import
LDLOCA: Load Local Variable Address
gt: GenTree
hlp: Helper
Ftn: Function
MP: Multiprocessor
CER: Constrained Execution Regions
TLS: Thread Local Storage
Look at the code in the figure and the IL generated from it. Although all three statements use new, they produce three different kinds of IL.
Let's first see how the newobj and newarr instructions are defined in coreclr.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/inc/opcode.def#L153

OPDEF(CEE_NEWOBJ, "newobj", VarPop, PushRef, InlineMethod, IObjModel, 1, 0xFF, 0x73, CALL)
OPDEF(CEE_NEWARR, "newarr", PopI, PushRef, InlineType, IObjModel, 1, 0xFF, 0x8D, NEXT)
These are the definitions of the two instructions. Their names are CEE_NEWOBJ and CEE_NEWARR; keep these two names in mind.
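Definition files built from OPDEF-style lines are typically consumed with the "X-macro" pattern: the same list is expanded several times with different meanings for the macro. The following is only a minimal sketch of that pattern. `SKETCH_OPCODES`, `SketchOpcode` and `sketchOpcodeName` are hypothetical names, and each entry keeps only three of the columns shown above; it is not how coreclr itself declares these types.

```cpp
#include <cassert>
#include <string>

// Hypothetical two-entry stand-in for opcode.def. Only the enum name, the
// mnemonic and the opcode byte are kept from the real ten-column entries.
#define SKETCH_OPCODES(OPDEF)         \
    OPDEF(CEE_NEWOBJ, "newobj", 0x73) \
    OPDEF(CEE_NEWARR, "newarr", 0x8D)

// First expansion: every entry becomes an enum member.
enum SketchOpcode
{
#define AS_ENUM(id, name, byte) id,
    SKETCH_OPCODES(AS_ENUM)
#undef AS_ENUM
    CEE_COUNT
};

struct SketchOpcodeInfo
{
    const char*   name;
    unsigned char byte;
};

// Second expansion: the same list becomes a table indexed by the enum.
static const SketchOpcodeInfo g_sketchOpcodes[] = {
#define AS_INFO(id, name, byte) {name, byte},
    SKETCH_OPCODES(AS_INFO)
#undef AS_INFO
};

// Returns the mnemonic registered for an opcode.
std::string sketchOpcodeName(SketchOpcode op)
{
    assert(op < CEE_COUNT);
    return g_sketchOpcodes[op].name;
}
```

Keeping the list in one place means the enum and the lookup table cannot drift apart when an entry is added.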
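Next we look at how coreclr turns the CEE_NEWOBJ instruction into machine code.
Before going on, it helps to have a rough idea of how the JIT works: it compiles one function at a time, and compilation is triggered automatically the first time a function is called.
The code below has been trimmed as much as possible but is still fairly long, so please read it patiently.
We start from the JIT entry function, which is called by the EE (execution engine).
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/inc/corjit.h#L350
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/ee_il_dll.cpp#L279
Note: according to Microsoft's documentation, CILJit is the 32-bit implementation and PreJit is the 64-bit implementation, but I could not actually find where PreJit is.

CorJitResult CILJit::compileMethod(
    ICorJitInfo* compHnd, CORINFO_METHOD_INFO* methodInfo, unsigned flags, BYTE** entryAddress, ULONG* nativeSizeOfCode)
{
    // some code omitted ...
    assert(methodInfo->ILCode);
    result = jitNativeCode(methodHandle, methodInfo->scope, compHnd, methodInfo, &methodCodePtr, nativeSizeOfCode,
                           &jitFlags, nullptr);
    // some code omitted ...
    return CorJitResult(result);
}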
jitNativeCode is a static function responsible for JIT-compiling a single function. Internally it creates a separate Compiler instance for the function being compiled.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/compiler.cpp#L6075

int jitNativeCode(CORINFO_METHOD_HANDLE methodHnd,
                  CORINFO_MODULE_HANDLE classPtr,
                  COMP_HANDLE           compHnd,
                  CORINFO_METHOD_INFO*  methodInfo,
                  void**                methodCodePtr,
                  ULONG*                methodCodeSize,
                  JitFlags*             compileFlags,
                  void*                 inlineInfoPtr)
{
    // some code omitted ...
    pParam->pComp->compInit(pParam->pAlloc, pParam->inlineInfo);
    pParam->pComp->jitFallbackCompile = pParam->jitFallbackCompile;
    // Now generate the code
    pParam->result =
        pParam->pComp->compCompile(pParam->methodHnd, pParam->classPtr, pParam->compHnd, pParam->methodInfo,
                                   pParam->methodCodePtr, pParam->methodCodeSize, pParam->compileFlags);
    // some code omitted ...
    return result;
}
Compiler::compCompile is the entry function provided by the Compiler class; its job is likewise to compile a function.
Note that this function takes 7 parameters. A function with the same name but only 3 parameters appears a little later.
This function mainly calls Compiler::compCompileHelper.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/compiler.cpp#L4693

int Compiler::compCompile(CORINFO_METHOD_HANDLE methodHnd,
                          CORINFO_MODULE_HANDLE classPtr,
                          COMP_HANDLE           compHnd,
                          CORINFO_METHOD_INFO*  methodInfo,
                          void**                methodCodePtr,
                          ULONG*                methodCodeSize,
                          JitFlags*             compileFlags)
{
    // some code omitted ...
    pParam->result = pParam->pThis->compCompileHelper(pParam->classPtr, pParam->compHnd, pParam->methodInfo,
                                                      pParam->methodCodePtr, pParam->methodCodeSize,
                                                      pParam->compileFlags, pParam->instVerInfo);
    // some code omitted ...
    return param.result;
}
Let's continue with Compiler::compCompileHelper.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/compiler.cpp#L5294

int Compiler::compCompileHelper(CORINFO_MODULE_HANDLE            classPtr,
                                COMP_HANDLE                      compHnd,
                                CORINFO_METHOD_INFO*             methodInfo,
                                void**                           methodCodePtr,
                                ULONG*                           methodCodeSize,
                                JitFlags*                        compileFlags,
                                CorInfoInstantiationVerification instVerInfo)
{
    // some code omitted ...
    // Initialize the local variable table
    lvaInitTypeRef();
    // some code omitted ...
    // Find all BasicBlocks
    fgFindBasicBlocks();
    // some code omitted ...
    // Call the 3-parameter compCompile, not the 7-parameter compCompile
    compCompile(methodCodePtr, methodCodeSize, compileFlags);
    // some code omitted ...
    return CORJIT_OK;
}
Now we reach the 3-parameter compCompile. Microsoft's own comment calls it the most interesting top-level function in the JIT.
You may also want to read Microsoft's JIT overview document, which is linked in the comment below.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/compiler.cpp#L4078

//*********************************************************************************************
// #Phases
//
// This is the most interesting 'toplevel' function in the JIT. It goes through the operations of
// importing, morphing, optimizations and code generation. This is called from the EE through the
// code:CILJit::compileMethod function.
//
// For an overview of the structure of the JIT, see:
// https://github.com/dotnet/coreclr/blob/master/Documentation/botr/ryujit-overview.md
//
void Compiler::compCompile(void** methodCodePtr, ULONG* methodCodeSize, JitFlags* compileFlags)
{
    // some code omitted ...
    // Convert BasicBlocks to GenTree (statement trees)
    fgImport();
    // some code omitted ...
    // The individual phases run here, such as inlining and optimization
    // some code omitted ...
    // Convert GT_ALLOCOBJ nodes to GT_CALL nodes (allocating memory = calling a helper function)
    ObjectAllocator objectAllocator(this);
    objectAllocator.Run();
    // some code omitted ...
    // Build the local variable table and compute the reference count of each variable
    lvaMarkLocalVars();
    // some code omitted ...
    // Lower the statement tree
    Lowering lower(this, m_pLinearScan); // PHASE_LOWERING
    lower.Run();
    // some code omitted ...
    // Generate machine code
    codeGen->genGenerateCode(methodCodePtr, methodCodeSize);
}
By now you should have a rough picture of what the JIT does overall.
Next we look at Compiler::fgImport, the function responsible for converting BasicBlocks to GenTree (statement trees).
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/flowgraph.cpp#L6663

void Compiler::fgImport()
{
    // some code omitted ...
    impImport(fgFirstBB);
    // some code omitted ...
}
Now Compiler::impImport.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/importer.cpp#L9207

void Compiler::impImport(BasicBlock* method)
{
    // some code omitted ...
    /* Import blocks in the worker-list until there are no more */
    while (impPendingList)
    {
        PendingDsc* dsc = impPendingList;
        impPendingList  = impPendingList->pdNext;
        // some code omitted ...
        /* Now import the block */
        impImportBlock(dsc->pdBB);
    }
}
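The loop above is a classic worklist: take a pending block, import it, and queue any successor that has not been queued yet. A minimal sketch of that idea follows. `SketchBlock` and `sketchImport` are hypothetical names, and the first-in-first-out order is an assumption made for the sketch; the real JIT keeps far more state per block and may visit blocks in a different order.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical, heavily simplified stand-in for BasicBlock.
struct SketchBlock
{
    int                       id     = 0;
    std::vector<SketchBlock*> successors;
    bool                      queued = false;
};

// Visits every block reachable from 'first' exactly once and returns the
// ids in the order the blocks were taken from the pending list.
std::vector<int> sketchImport(SketchBlock* first)
{
    assert(first != nullptr);
    std::vector<int>         order;
    std::deque<SketchBlock*> pending;

    first->queued = true;
    pending.push_back(first);

    while (!pending.empty())
    {
        SketchBlock* block = pending.front();
        pending.pop_front();
        order.push_back(block->id); // stands in for impImportBlock(block)

        // Queue each successor the first time it is seen.
        for (SketchBlock* succ : block->successors)
        {
            if (!succ->queued)
            {
                succ->queued = true;
                pending.push_back(succ);
            }
        }
    }
    return order;
}
```

The `queued` flag is what keeps a loop in the control flow from being imported forever.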
Now Compiler::impImportBlock.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/importer.cpp#L15321

//***************************************************************
// Import the instructions for the given basic block. Perform
// verification, throwing an exception on failure. Push any successor blocks that are enabled for the first
// time, or whose verification pre-state is changed.
void Compiler::impImportBlock(BasicBlock* block)
{
    // some code omitted ...
    pParam->pThis->impImportBlockCode(pParam->block);
}
In the next function, Compiler::impImportBlockCode, we finally see how the CEE_NEWOBJ instruction is handled.
This function is more than 5000 lines long, so I recommend searching for "case CEE_NEWOBJ" to find the part shown below.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/importer.cpp#L9207

/*****************************************************************************
 * Import the instr for the given basic block
 */
void Compiler::impImportBlockCode(BasicBlock* block)
{
    // some code omitted ...
    // Handle the CEE_NEWOBJ instruction
    case CEE_NEWOBJ:
        // Microsoft's comment lists three cases here:
        // the object is an array, the object has a variable size (e.g. string), or it is an ordinary class.
        // Only the third case is analyzed here.
        // There are three different cases for new
        // Object size is variable (depends on arguments)
        //      1) Object is an array (arrays treated specially by the EE)
        //      2) Object is some other variable sized object (e.g. String)
        //      3) Class Size can be determined beforehand (normal case)
        // In the first case, we need to call a NEWOBJ helper (multinewarray)
        // in the second case we call the constructor with a '0' this pointer
        // In the third case we alloc the memory, then call the constuctor
        // some code omitted ...
        // Create a GT_ALLOCOBJ GenTree (statement tree) node that allocates the memory
        op1 = gtNewAllocObjNode(info.compCompHnd->getNewHelper(&resolvedToken, info.compMethodHnd),
                                resolvedToken.hClass, TYP_REF, op1);
        // some code omitted ...
        // GT_ALLOCOBJ only allocates memory, so the constructor still has to be called.
        // The handling of the CEE_CALL instruction is reused for that.
        goto CALL;
    // some code omitted ...
    CALL: // memberRef should be set.
        // some code omitted ...
        // Create a GT_CALL GenTree (statement tree) node that calls the constructor
        callTyp = impImportCall(opcode, &resolvedToken, constraintCall ? &constrainedResolvedToken : nullptr,
                                newObjThisPtr, prefixFlags, &callInfo, opcodeOffs);
Keep in mind the two GenTree (statement tree) nodes created in the code above.
The code above also passes a newHelper argument when creating the GT_ALLOCOBJ node. This newHelper is an identifier (an index value) of the function that allocates the memory.
CoreCLR has many helper functions (HelperFunc) for JIT-generated code to call.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/jitinterface.cpp#L5894

CorInfoHelpFunc CEEInfo::getNewHelper(CORINFO_RESOLVED_TOKEN * pResolvedToken, CORINFO_METHOD_HANDLE callerHandle)
{
    // some code omitted ...
    MethodTable* pMT = VMClsHnd.AsMethodTable();
    // some code omitted ...
    result = getNewHelperStatic(pMT);
    // some code omitted ...
    return result;
}
Now CEEInfo::getNewHelperStatic.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/jitinterface.cpp#L5941

CorInfoHelpFunc CEEInfo::getNewHelperStatic(MethodTable * pMT)
{
    // some code omitted ...
    // There are many checks here, for example whether the type is a COM object or has a finalizer.
    // By default CORINFO_HELP_NEWFAST is returned.
    // Slow helper is the default
    CorInfoHelpFunc helper = CORINFO_HELP_NEWFAST;
    // some code omitted ...
    return helper;
}
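At this point we know that the two newly created nodes carry the following information.
After fgImport has produced the GenTree (statement tree), the tree cannot be used directly to generate machine code; it has to go through many transformation steps first.
One of those steps converts the GT_ALLOCOBJ node into a GT_CALL node, because allocating memory is really a call to a helper function dedicated to the JIT.
This transformation is implemented in ObjectAllocator, which is one of the phases of JIT compilation.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/objectalloc.cpp#L27

void ObjectAllocator::DoPhase()
{
    // some code omitted ...
    MorphAllocObjNodes();
}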
MorphAllocObjNodes walks all nodes and converts each one that is a GT_ALLOCOBJ.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/objectalloc.cpp#L63

void ObjectAllocator::MorphAllocObjNodes()
{
    // some code omitted ...
    for (GenTreeStmt* stmt = block->firstStmt(); stmt; stmt = stmt->gtNextStmt)
    {
        // some code omitted ...
        bool canonicalAllocObjFound = false;
        // some code omitted ...
        if (op2->OperGet() == GT_ALLOCOBJ)
            canonicalAllocObjFound = true;
        // some code omitted ...
        if (canonicalAllocObjFound)
        {
            // some code omitted ...
            op2 = MorphAllocObjNodeIntoHelperCall(asAllocObj);
        }
    }
}
The definition of MorphAllocObjNodeIntoHelperCall:
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/objectalloc.cpp#L152

// MorphAllocObjNodeIntoHelperCall: Morph a GT_ALLOCOBJ node into an
//                                  allocation helper call.
GenTreePtr ObjectAllocator::MorphAllocObjNodeIntoHelperCall(GenTreeAllocObj* allocObj)
{
    // some code omitted ...
    GenTreePtr helperCall = comp->fgMorphIntoHelperCall(allocObj, allocObj->gtNewHelper, comp->gtNewArgList(op1));
    return helperCall;
}
The definition of fgMorphIntoHelperCall:
This function converts the GT_ALLOCOBJ node into a GT_CALL node and obtains the handle of the function that allocates the memory.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/morph.cpp#L61

GenTreePtr Compiler::fgMorphIntoHelperCall(GenTreePtr tree, int helper, GenTreeArgList* args)
{
    tree->ChangeOper(GT_CALL);
    tree->gtFlags |= GTF_CALL;
    // some code omitted ...
    // If the helper identifier stored in GT_ALLOCOBJ is CORINFO_HELP_NEWFAST,
    // this is eeFindHelper(CORINFO_HELP_NEWFAST).
    // eeFindHelper converts a helper identifier into a helper handle.
    tree->gtCall.gtCallType    = CT_HELPER;
    tree->gtCall.gtCallMethHnd = eeFindHelper(helper);
    // some code omitted ...
    tree = fgMorphArgs(tree->AsCall());
    return tree;
}
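The point of "morphing" is that the node is rewritten in place: its kind changes, but its address stays the same, so whatever already points at it needs no update. A toy sketch of that idea, with hypothetical types (`SketchNode`, `SketchOper`, `SketchHelper`) that only loosely mirror GenTree:

```cpp
#include <cassert>

// Hypothetical node kinds and helper ids, loosely modeled on GT_ALLOCOBJ,
// GT_CALL and the CorInfoHelpFunc values mentioned in the text.
enum SketchOper   { SK_ALLOCOBJ, SK_CALL };
enum SketchHelper { SK_HELP_NEW, SK_HELP_NEWFAST };

struct SketchNode
{
    SketchOper   oper;         // what kind of node this is
    SketchHelper helper;       // which allocation helper the node refers to
    bool         isHelperCall; // set once the node has become a helper call
};

// Rewrites an allocation node in place so that it becomes a helper call.
// The same pointer is returned, so parents of the node stay valid.
SketchNode* sketchMorphIntoHelperCall(SketchNode* node)
{
    assert(node != nullptr && node->oper == SK_ALLOCOBJ);
    node->oper         = SK_CALL;
    node->isHelperCall = true;
    return node;
}
```

The helper id survives the rewrite, which is what later lets code generation find the function to call.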
At this point the two newly created nodes have become the following.
After this the JIT still does a great deal of processing on the GenTree (statement tree). That is skipped here; we go on to machine code generation.
The function CodeGen::genCallInstruction is responsible for turning a GT_CALL node into assembly.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/codegenxarch.cpp#L5934

// Produce code for a GT_CALL node
void CodeGen::genCallInstruction(GenTreePtr node)
{
    // some code omitted ...
    if (callType == CT_HELPER)
    {
        // Convert the handle back to the helper identifier, CORINFO_HELP_NEWFAST by default
        helperNum = compiler->eeGetHelperNum(methHnd);
        // Get a pointer to the helper function.
        // This is equivalent to compiler->compGetHelperFtn(CORINFO_HELP_NEWFAST, ...)
        addr = compiler->compGetHelperFtn(helperNum, (void**)&pAddr);
    }
    else
    {
        // Call an ordinary function
        // Direct call to a non-virtual user function.
        addr = call->gtDirectCallAddress;
    }
}
Let's see which function compGetHelperFtn actually maps CORINFO_HELP_NEWFAST to.
The definition of compGetHelperFtn:
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/compiler.hpp#L1907

void* Compiler::compGetHelperFtn(CorInfoHelpFunc ftnNum,        /* IN  */
                                 void**          ppIndirection) /* OUT */
{
    // some code omitted ...
    addr = info.compCompHnd->getHelperFtn(ftnNum, ppIndirection);
    return addr;
}
The definition of getHelperFtn:
Here we can see that the function is taken from the function table hlpDynamicFuncTable.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/jitinterface.cpp#L10369

void* CEEJitInfo::getHelperFtn(CorInfoHelpFunc ftnNum,         /* IN  */
                               void **         ppIndirection)  /* OUT */
{
    // some code omitted ...
    pfnHelper = hlpDynamicFuncTable[dynamicFtnNum].pfnHelper;
    // some code omitted ...
    result = (LPVOID)GetEEFuncEntryPoint(pfnHelper);
    return result;
}
The hlpDynamicFuncTable function table is built from the definitions in jithelpers.h. The function that corresponds to CORINFO_HELP_NEWFAST is the following.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/inc/jithelpers.h#L78

JITHELPER(CORINFO_HELP_NEWFAST, JIT_New, CORINFO_HELP_SIG_REG_ONLY)
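It corresponds to JIT_New, which is the function JIT-generated code calls to allocate memory. JIT_New is defined as follows.
Note that the JIT_New entry in the function table is replaced by a faster implementation when certain conditions are met. The replacement does the same job as JIT_New; this is covered later.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/jithelpers.cpp#L2908

HCIMPL1(Object*, JIT_New, CORINFO_CLASS_HANDLE typeHnd_)
{
    // some code omitted ...
    MethodTable *pMT = typeHnd.AsMethodTable();
    // some code omitted ...
    // AllocateObject is the function that allocates the memory. It is meant to be called by
    // CoreCLR's internal code or by unmanaged code.
    // JIT_New is a wrapper around it that is only called by JIT-generated code.
    newobj = AllocateObject(pMT);
    // some code omitted ...
    return(OBJECTREFToObject(newobj));
}
HCIMPLEND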
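The helper table described above can be pictured as an array of function pointers indexed by helper id, where a slot may later be overwritten with a faster implementation. The sketch below illustrates only that picture. Every name in it (`SketchHelperId`, `g_sketchHelpers`, `sketchGetHelperFtn`, `sketchSetHelper`) is hypothetical, and the functions return a label instead of allocating anything.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper ids; the real enum is CorInfoHelpFunc.
enum SketchHelperId
{
    SK_HELP_NEWFAST,
    SK_HELP_NEWARR_1_OBJ,
    SK_HELP_COUNT
};

typedef std::string (*SketchHelperFn)();

// Stand-ins for the portable helpers and for one optimized replacement.
std::string sketchJitNew()     { return "JIT_New"; }
std::string sketchJitNewArr1() { return "JIT_NewArr1"; }
std::string sketchFastNew()    { return "fast allocator"; }

// The table that helper ids are resolved against.
static SketchHelperFn g_sketchHelpers[SK_HELP_COUNT] = {sketchJitNew, sketchJitNewArr1};

// Looks up the function currently registered for a helper id.
SketchHelperFn sketchGetHelperFtn(SketchHelperId id)
{
    assert(id < SK_HELP_COUNT);
    return g_sketchHelpers[id];
}

// Overwrites one slot with a different implementation.
void sketchSetHelper(SketchHelperId id, SketchHelperFn fn)
{
    assert(id < SK_HELP_COUNT && fn != nullptr);
    g_sketchHelpers[id] = fn;
}
```

Because generated code only ever goes through the table, swapping a slot changes which allocator runs without touching the code that asked for `SK_HELP_NEWFAST`.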
Summary:
From CEE_NEWOBJ the JIT generates two pieces of code: one calls JIT_New to allocate memory, the other calls the constructor.
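C++ happens to expose the same two-step split directly, which makes for a convenient analogy: raw allocation with `operator new`, followed by running the constructor on that memory with placement new. This is only an analogy written for this article, with a hypothetical `SketchClass`; it says nothing about how the CLR lays out or tracks objects.

```cpp
#include <cassert>
#include <new>

// Hypothetical class standing in for the managed class being created.
struct SketchClass
{
    int value;
    explicit SketchClass(int v) : value(v) {}
};

// Step 1 allocates raw memory (the role JIT_New plays in the article),
// step 2 runs the constructor on that memory (the role of the second call).
SketchClass* sketchNewObj(int v)
{
    void* memory = ::operator new(sizeof(SketchClass)); // allocate only
    return new (memory) SketchClass(v);                 // construct in place
}

// Reverses the two steps: run the destructor, then release the memory.
void sketchDestroy(SketchClass* obj)
{
    assert(obj != nullptr);
    obj->~SketchClass();
    ::operator delete(obj);
}
```

A caller would pair `sketchNewObj(0x12345678)` with `sketchDestroy` on the returned pointer.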
Now let's see how the CEE_NEWARR instruction is handled. Since a lot of space has already gone into CEE_NEWOBJ, only the parts that differ are listed here.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/importer.cpp#L13334

/*****************************************************************************
 * Import the instr for the given basic block
 */
void Compiler::impImportBlockCode(BasicBlock* block)
{
    // some code omitted ...
    // Handle the CEE_NEWARR instruction
    case CEE_NEWARR:
        // some code omitted ...
        args = gtNewArgList(op1, op2);
        // Create a GT_CALL node that calls the helper function
        /* Create a call to 'new' */
        // Note that this only works for shared generic code because the same helper is used for all
        // reference array types
        op1 = gtNewHelperCallNode(info.compCompHnd->getNewArrHelper(resolvedToken.hClass), TYP_REF, 0, args);
}
CEE_NEWARR creates the GT_CALL node directly; unlike CEE_NEWOBJ, no further conversion is needed.
getNewArrHelper returns the helper function to call, so let's look at getNewArrHelper.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/jitinterface.cpp#L6035

/***********************************************************************/
// <REVIEW> this only works for shared generic code because all the
// helpers are actually the same. If they were different then things might
// break because the same helper would end up getting used for different but
// representation-compatible arrays (e.g. one with a default constructor
// and one without) </REVIEW>
CorInfoHelpFunc CEEInfo::getNewArrHelper (CORINFO_CLASS_HANDLE arrayClsHnd)
{
    // some code omitted ...
    TypeHandle arrayType(arrayClsHnd);
    result = getNewArrHelperStatic(arrayType);
    // some code omitted ...
    return result;
}
Now getNewArrHelperStatic. In the usual case it returns CORINFO_HELP_NEWARR_1_OBJ.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/jitinterface.cpp#L6060

CorInfoHelpFunc CEEInfo::getNewArrHelperStatic(TypeHandle clsHnd)
{
    // some code omitted ...
    if (CorTypeInfo::IsGenericVariable(elemType))
    {
        result = CORINFO_HELP_NEWARR_1_OBJ;
    }
    else if (CorTypeInfo::IsObjRef(elemType))
    {
        // It is an array of object refs
        result = CORINFO_HELP_NEWARR_1_OBJ;
    }
    else
    {
        // These cases always must use the slow helper
        // some code omitted ...
    }
    return result;
}
The function that corresponds to CORINFO_HELP_NEWARR_1_OBJ is the following.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/inc/jithelpers.h#L86

DYNAMICJITHELPER(CORINFO_HELP_NEWARR_1_OBJ, JIT_NewArr1, CORINFO_HELP_SIG_REG_ONLY)
It corresponds to JIT_NewArr1, a helper function wrapped for the JIT to call.
As with JIT_New, it is replaced by a faster implementation when certain conditions are met.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/jithelpers.cpp#L3303

HCIMPL2(Object*, JIT_NewArr1, CORINFO_CLASS_HANDLE arrayTypeHnd_, INT_PTR size)
{
    // some code omitted ...
    CorElementType elemType = pArrayClassRef->GetArrayElementTypeHandle().GetSignatureCorElementType();
    if (CorTypeInfo::IsPrimitiveType(elemType))
    {
        // some code omitted ...
        // If the element type is a primitive type (int, double, etc.), use the faster
        // FastAllocatePrimitiveArray function
        newArray = FastAllocatePrimitiveArray(pArrayClassRef->GetMethodTable(), static_cast<DWORD>(size),
                                              bAllocateInLargeHeap);
    }
    else
    {
        // some code omitted ...
        // Otherwise use AllocateArrayEx
        INT32 size32 = (INT32)size;
        newArray = AllocateArrayEx(typeHnd, &size32, 1);
    }
    // some code omitted ...
    return(OBJECTREFToObject(newArray));
}
HCIMPLEND
Summary:
From CEE_NEWARR the JIT generates only one piece of code: the call to JIT_NewArr1.
The third kind of new (new on a struct) allocates its memory on the stack, so no allocation function needs to be called.
In the example at the beginning, myStruct is already defined as a local variable at compile time, and the memory needed for all local variables is allocated together as soon as the function is entered.
Let's first see how the memory needed for local variables is calculated.
Start with Compiler::lvaAssignVirtualFrameOffsetsToLocals.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/lclvars.cpp#L4863

/*****************************************************************************
 * lvaAssignVirtualFrameOffsetsToLocals() : Assign virtual stack offsets to
 * locals, temps, and anything else. These will all be negative offsets
 * (stack grows down) relative to the virtual '0'/return address
 */
void Compiler::lvaAssignVirtualFrameOffsetsToLocals()
{
    // some code omitted ...
    for (cur = 0; alloc_order[cur]; cur++)
    {
        // some code omitted ...
        for (lclNum = 0, varDsc = lvaTable; lclNum < lvaCount; lclNum++, varDsc++)
        {
            // some code omitted ...
            // Reserve the stack space for this variable
            stkOffs = lvaAllocLocalAndSetVirtualOffset(lclNum, lvaLclSize(lclNum), stkOffs);
        }
    }
}
Now Compiler::lvaAllocLocalAndSetVirtualOffset.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/lclvars.cpp#L5537

int Compiler::lvaAllocLocalAndSetVirtualOffset(unsigned lclNum, unsigned size, int stkOffs)
{
    // some code omitted ...
    /* Reserve space on the stack by bumping the frame size */
    lvaIncrementFrameSize(size);
    stkOffs -= size;
    lvaTable[lclNum].lvStkOffs = stkOffs;
    // some code omitted ...
    return stkOffs;
}
Now Compiler::lvaIncrementFrameSize.
The size is finally added to the variable compLclFrameSize, which is the total amount of memory the current function needs to allocate on the stack.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/lclvars.cpp#L3528

inline void Compiler::lvaIncrementFrameSize(unsigned size)
{
    if (size > MAX_FrameSize || compLclFrameSize + size > MAX_FrameSize)
    {
        BADCODE("Frame size overflow");
    }
    compLclFrameSize += size;
}
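Taken together, the three functions above amount to a running sum: each local grows the frame by its size and receives the next, more negative, offset. Here is a minimal sketch of that bookkeeping. `SketchFrame` and `sketchLayoutLocals` are hypothetical names, and alignment, allocation order and the frame size limit are all left out as simplifying assumptions.

```cpp
#include <cassert>
#include <vector>

// Result of laying out the locals of one function.
struct SketchFrame
{
    unsigned         frameSize; // plays the role of compLclFrameSize
    std::vector<int> offsets;   // virtual offset of each local, all negative
};

// Assigns a stack offset to every local and accumulates the total frame size.
SketchFrame sketchLayoutLocals(const std::vector<unsigned>& localSizes)
{
    SketchFrame frame;
    frame.frameSize = 0;
    int stkOffs     = 0;

    for (unsigned size : localSizes)
    {
        assert(size > 0);
        frame.frameSize += size;           // bump the frame size
        stkOffs -= static_cast<int>(size); // the stack grows downwards
        frame.offsets.push_back(stkOffs);  // this local lives at stkOffs
    }
    return frame;
}
```

For locals of 8, 16 and 4 bytes this gives offsets -8, -24 and -28 and a frame size of 28 bytes.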
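Now for the code that generates machine code. The code that allocates memory on the stack is generated in CodeGen::genFnProlog.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/codegencommon.cpp#L8140

void CodeGen::genFnProlog()
{
    // some code omitted ...
    // ARM64 calls this at a different point than other platforms, but the arguments are the same
    genAllocLclFrame(compiler->compLclFrameSize, initReg, &initRegZeroed, intRegState.rsCalleeRegArgMaskLiveIn);
}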
Now CodeGen::genAllocLclFrame. This is the code that allocates the stack memory: it simply subtracts frameSize from rsp (esp).
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/codegencommon.cpp#L5846

/*-----------------------------------------------------------------------------
 *
 * Probe the stack and allocate the local stack frame: subtract from SP.
 * On ARM64, this only does the probing; allocating the frame is done when callee-saved registers are saved.
 */
void CodeGen::genAllocLclFrame(unsigned frameSize, regNumber initReg, bool* pInitRegZeroed, regMaskTP maskArgRegsLiveIn)
{
    // some code omitted ...
    // sub esp, frameSize 6
    inst_RV_IV(INS_sub, REG_SPBASE, frameSize, EA_PTRSIZE);
}
Summary:
For new on a struct the JIT generates code that allocates all the stack memory in one place, so you do not see any "new struct" instruction in the IL.
The code that calls the constructor is generated from the call instruction that follows.
From the analysis above we know that the first kind of new first calls JIT_New to allocate memory and then calls the constructor.
The source of JIT_New above shows that JIT_New calls AllocateObject internally.
Let's look at AllocateObject first.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/gchelpers.cpp#L931

// AllocateObject will throw OutOfMemoryException so don't need to check
// for NULL return value from it.
OBJECTREF AllocateObject(MethodTable *pMT
#ifdef FEATURE_COMINTEROP
                         , bool fHandleCom
#endif
    )
{
    // some code omitted ...
    Object *orObject = NULL;
    // If the type has a critical finalizer, prepare (precompile) all related functions
    // (search for CER for details). Each type is processed only once.
    if (pMT->HasCriticalFinalizer())
        PrepareCriticalFinalizerObject(pMT);
    // some code omitted ...
    DWORD baseSize = pMT->GetBaseSize();
    // Call the GC helper to allocate memory: AllocAlign8 if 8-byte alignment is required, otherwise Alloc
    if (pMT->RequiresAlign8())
    {
        // some code omitted ...
        orObject = (Object *) AllocAlign8(baseSize, pMT->HasFinalizer(), pMT->ContainsPointers(), pMT->IsValueType());
    }
    else
    {
        orObject = (Object *) Alloc(baseSize, pMT->HasFinalizer(), pMT->ContainsPointers());
    }
    // Check that the sync block index (SyncBlock) is 0
    // verify zero'd memory (at least for sync block)
    _ASSERTE( orObject->HasEmptySyncBlockInfo() );
    // Set the type information (MethodTable)
    if ((baseSize >= LARGE_OBJECT_SIZE))
    {
        orObject->SetMethodTableForLargeObject(pMT);
        GCHeap::GetGCHeap()->PublishObject((BYTE*)orObject);
    }
    else
    {
        orObject->SetMethodTable(pMT);
    }
    // some code omitted ...
    return UNCHECKED_OBJECTREF_TO_OBJECTREF(oref);
}
Now the Alloc function.
Source code:

// There are only three ways to get into allocate an object.
//     * Call optimized helpers that were generated on the fly. This is how JIT compiled code does most
//         allocations, however they fall back code:Alloc, when for all but the most common code paths. These
//         helpers are NOT used if profiler has asked to track GC allocation (see code:TrackAllocations)
//     * Call code:Alloc - When the jit helpers fall back, or we do allocations within the runtime code
//         itself, we ultimately call here.
//     * Call code:AllocLHeap - Used very rarely to force allocation to be on the large object heap.
//
// While this is a choke point into allocating an object, it is primitive (it does not want to know about
// MethodTable and thus does not initialize that poitner. It also does not know if the object is finalizable
// or contains pointers. Thus we quickly wrap this function in more user-friendly ones that know about
// MethodTables etc. (see code:FastAllocatePrimitiveArray code:AllocateArrayEx code:AllocateObject)
//
// You can get an exhaustive list of code sites that allocate GC objects by finding all calls to
// code:ProfilerObjectAllocatedCallback (since the profiler has to hook them all).
inline Object* Alloc(size_t size, BOOL bFinalize, BOOL bContainsPointers )
{
    // some code omitted ...
    // We don't want to throw an SO during the GC, so make sure we have plenty
    // of stack before calling in.
    INTERIOR_STACK_PROBE_FOR(GetThread(), static_cast<unsigned>(DEFAULT_ENTRY_PROBE_AMOUNT * 1.5));
    if (GCHeapUtilities::UseAllocationContexts())
        retVal = GCHeapUtilities::GetGCHeap()->Alloc(GetThreadAllocContext(), size, flags);
    else
        retVal = GCHeapUtilities::GetGCHeap()->Alloc(size, flags);
    if (!retVal)
    {
        ThrowOutOfMemory();
    }
    END_INTERIOR_STACK_PROBE;
    return retVal;
}
Summary:
Judging from the code above, the first kind of new mainly does the following: it calls JIT_New (or its faster replacement), which allocates memory from the GC heap through AllocateObject and Alloc and sets the type information (MethodTable), and then it calls the constructor.
The second kind of new only calls JIT_NewArr1. As the source of JIT_NewArr1 above shows,
if the element type is a primitive type (int, double, etc.) it calls FastAllocatePrimitiveArray, otherwise it calls AllocateArrayEx.
Let's look at FastAllocatePrimitiveArray first.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/gchelpers.cpp#L563

/*
 * Allocates a single dimensional array of primitive types.
 */
OBJECTREF FastAllocatePrimitiveArray(MethodTable* pMT, DWORD cElements, BOOL bAllocateInLargeHeap)
{
    // some code omitted ...
    // The number of elements must not exceed a hard limit
    SIZE_T componentSize = pMT->GetComponentSize();
    if (cElements > MaxArrayLength(componentSize))
        ThrowOutOfMemory();
    // The total size must not overflow
    S_SIZE_T safeTotalSize = S_SIZE_T(cElements) * S_SIZE_T(componentSize) + S_SIZE_T(pMT->GetBaseSize());
    if (safeTotalSize.IsOverflow())
        ThrowOutOfMemory();
    size_t totalSize = safeTotalSize.Value();
    // some code omitted ...
    // Call the GC helper to allocate memory
    ArrayBase* orObject;
    if (bAllocateInLargeHeap)
    {
        orObject = (ArrayBase*) AllocLHeap(totalSize, FALSE, FALSE);
    }
    else
    {
        ArrayTypeDesc *pArrayR8TypeDesc = g_pPredefinedArrayTypes[ELEMENT_TYPE_R8];
        if (DATA_ALIGNMENT < sizeof(double) && pArrayR8TypeDesc != NULL && pMT == pArrayR8TypeDesc->GetMethodTable() && totalSize < LARGE_OBJECT_SIZE - MIN_OBJECT_SIZE)
        {
            // Creation of an array of doubles, not in the large object heap.
            // We want to align the doubles to 8 byte boundaries, but the GC gives us pointers aligned
            // to 4 bytes only (on 32 bit platforms). To align, we ask for 12 bytes more to fill with a
            // dummy object.
            // If the GC gives us a 8 byte aligned address, we use it for the array and place the dummy
            // object after the array, otherwise we put the dummy object first, shifting the base of
            // the array to an 8 byte aligned address.
            // Note: on 64 bit platforms, the GC always returns 8 byte aligned addresses, and we don't
            // execute this code because DATA_ALIGNMENT < sizeof(double) is false.
            _ASSERTE(DATA_ALIGNMENT == sizeof(double)/2);
            _ASSERTE((MIN_OBJECT_SIZE % sizeof(double)) == DATA_ALIGNMENT); // used to change alignment
            _ASSERTE(pMT->GetComponentSize() == sizeof(double));
            _ASSERTE(g_pObjectClass->GetBaseSize() == MIN_OBJECT_SIZE);
            _ASSERTE(totalSize < totalSize + MIN_OBJECT_SIZE);
            orObject = (ArrayBase*) Alloc(totalSize + MIN_OBJECT_SIZE, FALSE, FALSE);
            Object *orDummyObject;
            if((size_t)orObject % sizeof(double))
            {
                orDummyObject = orObject;
                orObject = (ArrayBase*) ((size_t)orObject + MIN_OBJECT_SIZE);
            }
            else
            {
                orDummyObject = (Object*) ((size_t)orObject + totalSize);
            }
            _ASSERTE(((size_t)orObject % sizeof(double)) == 0);
            orDummyObject->SetMethodTable(g_pObjectClass);
        }
        else
        {
            orObject = (ArrayBase*) Alloc(totalSize, FALSE, FALSE);
            bPublish = (totalSize >= LARGE_OBJECT_SIZE);
        }
    }
    // Set the type information (MethodTable)
    // Initialize Object
    orObject->SetMethodTable( pMT );
    _ASSERTE(orObject->GetMethodTable() != NULL);
    // Set the array length
    orObject->m_NumComponents = cElements;
    // some code omitted ...
    return( ObjectToOBJECTREF((Object*)orObject) );
}
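The size check in FastAllocatePrimitiveArray computes "element count times element size plus header size" and refuses to continue if the arithmetic overflows. The sketch below shows one plain way to perform the same check without a helper class such as S_SIZE_T. `sketchArrayTotalSize` is a hypothetical function written for this article, not part of coreclr.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Computes cElements * componentSize + baseSize.
// Returns false instead of wrapping around when the result does not fit.
bool sketchArrayTotalSize(std::size_t cElements, std::size_t componentSize,
                          std::size_t baseSize, std::size_t* total)
{
    assert(total != nullptr);
    const std::size_t maxValue = std::numeric_limits<std::size_t>::max();

    // Multiplication overflows when cElements > maxValue / componentSize.
    if (componentSize != 0 && cElements > maxValue / componentSize)
    {
        return false;
    }
    const std::size_t payload = cElements * componentSize;

    // Addition overflows when payload > maxValue - baseSize.
    if (payload > maxValue - baseSize)
    {
        return false;
    }
    *total = payload + baseSize;
    return true;
}
```

Checking before multiplying matters because an unsigned overflow silently wraps, which would otherwise lead to an allocation that is far too small for the requested array.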
Now AllocateArrayEx. Compared with the function above, it additionally handles multi-dimensional arrays.
JIT_NewArr1 passes 3 arguments when it calls AllocateArrayEx; the remaining 2 parameters are optional.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/gchelpers.cpp#L282

// Handles arrays of arbitrary dimensions
//
// If dwNumArgs is set to greater than 1 for a SZARRAY this function will recursively
// allocate sub-arrays and fill them in.
//
// For arrays with lower bounds, pBounds is <lower bound 1>, <count 1>, <lower bound 2>, ...
OBJECTREF AllocateArrayEx(TypeHandle arrayType, INT32 *pArgs, DWORD dwNumArgs, BOOL bAllocateInLargeHeap
                          DEBUG_ARG(BOOL bDontSetAppDomain))
{
    // some code omitted ...
    ArrayBase * orArray = NULL;
    // some code omitted ...
    // Call the GC helper to allocate memory
    if (bAllocateInLargeHeap)
    {
        orArray = (ArrayBase *) AllocLHeap(totalSize, FALSE, pArrayMT->ContainsPointers());
        // Set the type information (MethodTable)
        orArray->SetMethodTableForLargeObject(pArrayMT);
    }
    else
    {
#ifdef FEATURE_64BIT_ALIGNMENT
        MethodTable *pElementMT = arrayDesc->GetTypeParam().GetMethodTable();
        if (pElementMT->RequiresAlign8() && pElementMT->IsValueType())
        {
            // This platform requires that certain fields are 8-byte aligned (and the runtime doesn't provide
            // this guarantee implicitly, e.g. on 32-bit platforms). Since it's the array payload, not the
            // header that requires alignment we need to be careful. However it just so happens that all the
            // cases we care about (single and multi-dim arrays of value types) have an even number of DWORDs
            // in their headers so the alignment requirements for the header and the payload are the same.
            _ASSERTE(((pArrayMT->GetBaseSize() - SIZEOF_OBJHEADER) & 7) == 0);
            orArray = (ArrayBase *) AllocAlign8(totalSize, FALSE, pArrayMT->ContainsPointers(), FALSE);
        }
        else
#endif
        {
            orArray = (ArrayBase *) Alloc(totalSize, FALSE, pArrayMT->ContainsPointers());
        }
        // Set the type information (MethodTable)
        orArray->SetMethodTable(pArrayMT);
    }
    // Set the array length
    // Initialize Object
    orArray->m_NumComponents = cElements;
    // some code omitted ...
    return ObjectToOBJECTREF((Object *) orArray);
}
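Summary:
Judging from the code above, the second kind of new mainly does the following: it calls JIT_NewArr1 (or its faster replacement), which allocates memory from the GC heap through FastAllocatePrimitiveArray or AllocateArrayEx, sets the type information (MethodTable), and sets the array length.
new on a struct neither requests memory from the GC heap nor sets type information (MethodTable), so we can go straight to the summary.
Summary:
Judging from the code above, the third kind of new mainly does the following: the stack memory is reserved together with the other locals when the function is entered, and then the constructor is called.
Open the Disassembly and Memory windows in Visual Studio and let's see what the first kind of new actually does.
The disassembly of the first kind of new is shown below. There are two calls in total.

00007FF919570B53 mov rcx,7FF9194161A0h       // set the first argument (pointer to the MethodTable)
00007FF919570B5D call 00007FF97905E350       // call the allocation function, JIT_New by default
00007FF919570B62 mov qword ptr [rbp+38h],rax // store the address in a temporary (rbp+38)
00007FF919570B66 mov r8,37BFC73068h
00007FF919570B70 mov r8,qword ptr [r8]       // set the third argument ("hello")
00007FF919570B73 mov rcx,qword ptr [rbp+38h] // set the first argument (this)
00007FF919570B77 mov edx,12345678h           // set the second argument (0x12345678)
00007FF919570B7C call 00007FF9195700B8       // call the constructor
00007FF919570B81 mov rcx,qword ptr [rbp+38h]
00007FF919570B85 mov qword ptr [rbp+50h],rcx // copy the temporary into the myClass variable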
The first call goes to the helper function that allocates memory, which by default is JIT_New.
What is actually called here, however, is not JIT_New but JIT_TrialAllocSFastMP_InlineGetThread, an optimized version that allocates memory quickly from the allocation context.
Let's look at the definition of JIT_TrialAllocSFastMP_InlineGetThread.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/amd64/JitHelpers_InlineGetThread.asm#L59

; IN: rcx: MethodTable*
; OUT: rax: new object
LEAF_ENTRY JIT_TrialAllocSFastMP_InlineGetThread, _TEXT
        mov     edx, [rcx + OFFSET__MethodTable__m_BaseSize] ; read the size to allocate from the MethodTable into edx
        ; m_BaseSize is guaranteed to be a multiple of 8.
        PATCHABLE_INLINE_GETTHREAD r11, JIT_TrialAllocSFastMP_InlineGetThread__PatchTLSOffset
        mov     r10, [r11 + OFFSET__Thread__m_alloc_context__alloc_limit] ; limit address of the allocation context, into r10
        mov     rax, [r11 + OFFSET__Thread__m_alloc_context__alloc_ptr]   ; current address of the allocation context, into rax
        add     rdx, rax        ; address + size to allocate, into rdx
        cmp     rdx, r10        ; can the allocation context satisfy the request?
        ja      AllocFailed     ; if (rdx > r10)
        mov     [r11 + OFFSET__Thread__m_alloc_context__alloc_ptr], rdx   ; store the new current address
        mov     [rax], rcx      ; set the MethodTable of the memory just allocated
ifdef _DEBUG
        call    DEBUG_TrialAllocSetAppDomain_NoScratchArea
endif ; _DEBUG
        ret                     ; allocation succeeded, return
    AllocFailed:
        jmp     JIT_NEW         ; allocation failed, call the default JIT_New function
LEAF_END JIT_TrialAllocSFastMP_InlineGetThread, _TEXT

We can see that while the allocation context still has room, memory is allocated from it; once it is used up, JIT_New is called to do more work.
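The assembly above is a bump-pointer allocator: add the size to the current pointer, compare against the limit, and either commit the new pointer or give up and take the slow path. The same logic as a small C++ sketch follows. `SketchAllocContext` and `sketchTryAllocFast` are hypothetical names; the real helper additionally locates the per-thread context and writes the MethodTable pointer into the new object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the per-thread allocation context.
struct SketchAllocContext
{
    std::uintptr_t allocPtr;   // next free address
    std::uintptr_t allocLimit; // end of the region owned by this context
};

// Tries to carve 'size' bytes out of the context.
// Returns the start address on success, or 0 when the context is exhausted
// and the caller has to fall back to the slow allocator.
std::uintptr_t sketchTryAllocFast(SketchAllocContext& ctx, std::size_t size)
{
    assert(size > 0);
    const std::uintptr_t start = ctx.allocPtr;
    const std::uintptr_t end   = start + size; // add rdx, rax

    if (end > ctx.allocLimit)                  // cmp rdx, r10 / ja AllocFailed
    {
        return 0;
    }
    ctx.allocPtr = end;                        // commit the new current address
    return start;
}
```

On the fast path this costs an addition, a comparison and a store, which is why the runtime prefers it over the general-purpose JIT_New whenever it can.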
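The second call invokes the constructor. The call target differs from the constructor's address below, probably because there is a wrapper layer in between; what that wrapper does has not been worked out yet.
The last call goes to JIT_WriteBarrier.
The disassembly shows that the second kind of new has only one call.

00007FF919570B93 mov rcx,7FF9195B4CFAh       // set the first argument (pointer to the MethodTable)
00007FF919570B9D mov edx,378h                // set the second argument (the size of the array)
00007FF919570BA2 call 00007FF97905E440       // call the allocation function, JIT_NewArr1 by default
00007FF919570BA7 mov qword ptr [rbp+30h],rax // store in a temporary (rbp+30)
00007FF919570BAB mov rcx,qword ptr [rbp+30h]
00007FF919570BAF mov qword ptr [rbp+48h],rcx // copy the temporary into the myArray variable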
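The call actually goes to the function JIT_NewArr1VC_MP_InlineGetThread.
Like JIT_TrialAllocSFastMP_InlineGetThread, it allocates memory quickly from the allocation context.
Source code: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/amd64/JitHelpers_InlineGetThread.asm#L207
The code itself is not analyzed here; if you are interested, read the source linked above.
new on a struct allocates memory from the stack when the function is entered, which here means decreasing the value of the rsp register (the top of the stack).

00007FF919570B22 push rsi                 // save the original rsi
00007FF919570B23 sub rsp,60h              // allocate memory from the stack
00007FF919570B27 mov rbp,rsp              // copy the value to rbp
00007FF919570B2A mov rsi,rcx              // save the original rcx in rsi
00007FF919570B2D lea rdi,[rbp+28h]        // rdi = rbp+28h, the start of the region to zero
00007FF919570B31 mov ecx,0Eh              // rcx = 14 (count)
00007FF919570B36 xor eax,eax              // eax = 0
00007FF919570B38 rep stos dword ptr [rdi] // store eax (a 4-byte dword) at rdi until rcx reaches 0, zeroing 14*4=56 (38h) bytes in total
00007FF919570B3A mov rcx,rsi              // restore the original rcx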
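The arithmetic in the prolog is easy to get wrong, so here is a small emulation of `rep stos dword ptr [rdi]` that can be used to check it. A dword is 4 bytes, so a count of 14 clears 56 (38h) bytes; starting at offset 28h, that runs exactly to the end of the 60h-byte frame. `sketchRepStosDword` is a hypothetical helper written for this article, and the byte vector merely stands in for the stack frame.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Emulates `rep stos dword ptr [rdi]`: stores 'value' (eax) 'count' (rcx)
// times starting at 'offset' (rdi) and returns the number of bytes written.
std::size_t sketchRepStosDword(std::vector<std::uint8_t>& memory, std::size_t offset,
                               std::uint32_t value, std::size_t count)
{
    const std::size_t bytes = count * sizeof(std::uint32_t);
    assert(offset + bytes <= memory.size());

    for (std::size_t i = 0; i < count; ++i)
    {
        for (std::size_t b = 0; b < sizeof(std::uint32_t); ++b)
        {
            // Little-endian store of one dword, one byte at a time.
            memory[offset + i * sizeof(std::uint32_t) + b] = static_cast<std::uint8_t>(value >> (8 * b));
        }
    }
    return bytes;
}
```

Filling a 60h-byte buffer with a marker value and then calling `sketchRepStosDword(frame, 0x28, 0, 14)` leaves the first 28h bytes untouched and the remaining 38h bytes zeroed.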
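Because the memory is already on the stack, all that remains is to call the constructor directly.

00007FF919570BBD lea rcx,[rbp+40h]     // first argument (this)
00007FF919570BC1 mov edx,55667788h     // second argument (0x55667788)
00007FF919570BC6 call 00007FF9195700A0 // call the constructor

Disassembly of the constructor:
The call 00007FF97942E260 in the middle goes to JIT_DbgIsJustMyCode.
The memory allocated from the stack is released automatically when the function ends: at the very end rsp is set to rbp + 0x60, which restores rsp to its original value.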
http://stackoverflow.com/questions/1255803/does-the-net-clr-jit-compile-every-method-every-time
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/gchelpers.h
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/gchelpers.cpp#L986
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/jithelpers.cpp#L2908
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/jitinterface.cpp
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/jitinterfacegen.cpp
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/amd64/JitHelpers_InlineGetThread.asm
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gcinterface.h#L230
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gc.h
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gc.cpp
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/inc/opcode.def#L153
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/inc/readytorunhelpers.h#L46
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/inc/readytorun.h#L236
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/inc/corinfo.h#L1147
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/inc/corjit.h#L350
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/ee_il_dll.cpp#L279
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/inc/jithelpers.h
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/compiler.hpp
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/compiler.h
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/compiler.cpp
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/flowgraph.cpp
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/importer.cpp
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/gentree.cpp
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/objectalloc.cpp
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/morph.cpp
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/jit/codegenxarch.cpp#L8404
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/botr/ryujit-overview.md
https://github.com/dotnet/coreclr/blob/master/Documentation/building/viewing-jit-dumps.md
https://github.com/dotnet/coreclr/blob/master/Documentation/building/linux-instructions.md
https://en.wikipedia.org/wiki/Basic_block
https://en.wikipedia.org/wiki/Control_flow_graph
https://en.wikipedia.org/wiki/Static_single_assignment_form
https://msdn.microsoft.com/en-us/library/windows/hardware/ff561499(v=vs.85).aspx
https://msdn.microsoft.com/en-us/library/ms228973(v=vs.110).aspx
https://msdn.microsoft.com/en-us/library/system.runtime.constrainedexecution.criticalfinalizerobject(v=vs.110).aspx
https://msdn.microsoft.com/en-us/library/system.runtime.interopservices.safehandle(v=vs.110).aspx
https://msdn.microsoft.com/en-us/library/system.runtime.interopservices.criticalhandle(v=vs.110).aspx
https://dotnet.myget.org/feed/dotnet-core/package/nuget/runtime.win7-x64.Microsoft.NETCore.Runtime.CoreCLR
http://www.codemachine.com/article_x64deepdive.html
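Compared with the previous article, this one contains a lot more C++ and assembly code and also touches the surface of the JIT, so you may find it hard to follow.
That is normal; I do not fully understand every step mentioned in this article either.
Corrections from experts are welcome, and so are questions from beginners.
Next I will focus on the algorithm the GC uses to allocate memory. Stay tuned.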