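In the previous article I explained how new works, but skipped over everything related to memory allocation.
In this article I'll walk through the internal implementation of the GC memory allocator in detail.
Before reading on, you must first read "Garbage Collection Design" from Microsoft's BOTR documentation.
Original: https://github.com/dotnet/coreclr/blob/master/Documentation/botr/garbage-collection.md
For a Chinese translation, see the one by 知平軟件 or my own later translation.
Please make sure to finish "Garbage Collection Design" first; otherwise you will most likely be unable to follow what comes next.
Plenty of material online already explains the difference between server GC and workstation GC, so I won't repeat it here.
Let's look at how the code distinguishes server GC from workstation GC.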
By default, building CoreCLR compiles the same code twice, once for server GC and once for workstation GC, and places the results in the SVR and WKS namespaces respectively:
Source: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gcsvr.cpp
#define SERVER_GC 1
namespace SVR {
#include "gcimpl.h"
#include "gc.cpp"
}
Source: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gcwks.cpp
#ifdef SERVER_GC
#undef SERVER_GC
#endif
namespace WKS {
#include "gcimpl.h"
#include "gc.cpp"
}
When SERVER_GC is defined, MULTIPLE_HEAPS is defined along with it.
With MULTIPLE_HEAPS defined, multiple heaps are used: by default server GC gets one heap per CPU core, while workstation GC uses a single global heap.
Source: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gcimpl.h
#ifdef SERVER_GC
#define MULTIPLE_HEAPS 1
#endif // SERVER_GC
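The "compile one body into two namespaces" trick can be tried in a single file. The following is only a minimal sketch: DEMO_GC_BODY, SVR_demo, WKS_demo, kMultipleHeaps and heap_count are made-up names, and a macro stands in for the #include "gc.cpp" used by the real sources.

```cpp
#include <cassert>

// Hypothetical stand-in for the shared implementation file (gc.cpp in the article).
// The real code pulls the body in with #include; a macro keeps this sketch in one file.
#define DEMO_GC_BODY                               \
    inline int heap_count(int cpu_cores) {         \
        return kMultipleHeaps ? cpu_cores : 1;     \
    }

namespace SVR_demo {
    // Plays the role of MULTIPLE_HEAPS being defined.
    constexpr bool kMultipleHeaps = true;
    DEMO_GC_BODY
}

namespace WKS_demo {
    // Plays the role of MULTIPLE_HEAPS being undefined.
    constexpr bool kMultipleHeaps = false;
    DEMO_GC_BODY
}

#undef DEMO_GC_BODY
```

Calling SVR_demo::heap_count(8) and WKS_demo::heap_count(8) runs the same body with different compile-time settings, which is the effect the two wrapper files are after.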
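Background GC is compiled in by default for both server GC and workstation GC, but the runtime does not necessarily enable it.
Source: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gcpriv.h
#define BACKGROUND_GC //concurrent background GC (requires WRITE_WATCH)
The CoreCLR packages downloaded from https://www.microsoft.com/net already include support for server GC and background GC, but neither is turned on by default.
To turn them on, edit the "runtimeOptions" section of project.json, for example:
{
  "runtimeOptions": {
    "configProperties": {
      "System.GC.Server": true,
      "System.GC.Concurrent": true
    }
  }
}
After setting this and publishing the project you will see coreapp.runtimeconfig.json; this is the only file the runtime reads.
Microsoft's official documentation: https://docs.microsoft.com/en-us/dotnet/articles/core/tools/project-json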
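I'll start with two diagrams that show how the GC-related classes relate to each other under server GC and workstation GC.
The diagrams contain five types: GCHeap, gc_heap, generation, heap_segment and alloc_context. Each is summarized below.
Source summary of GCHeap:
Definition of GCHeap: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gcimpl.h#L61
I only list the members this article touches.
// WKS::GCHeap or SVR::GCHeap inherits from GCHeap in the global namespace
class GCHeap : public ::GCHeap
{
#ifdef MULTIPLE_HEAPS
    // Under server GC every GCHeap instance is paired with one gc_heap instance
    gc_heap* pGenGCHeap;
#else
    // Under workstation GC all gc_heap fields and functions are static, so they can be reached via ((gc_heap*)nullptr)->xxx
    // Strictly speaking this is UB (undefined behavior), but it works in practice
    #define pGenGCHeap ((gc_heap*)0)
#endif //MULTIPLE_HEAPS
};
The global GCHeap instance: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gc.h#L105
This is the 1.1.0 code. In 1.2.0 the global GCHeap is stored in both gcheaputilities.h (g_pGCHeap) and gc.cpp (g_theGCHeap), and both point to the same instance.
// Equivalent to: extern GCHeap* g_pGCHeap;
GPTR_DECL(GCHeap, g_pGCHeap);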
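Source summary of gc_heap:
Definition of gc_heap: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gcpriv.h#L1079
This class has more than 300 members (starting from ephemeral_low); I only list the ones this article touches.
class gc_heap
{
#ifdef MULTIPLE_HEAPS
    // The corresponding GCHeap instance
    PER_HEAP GCHeap* vm_heap;
    // Heap index
    PER_HEAP int heap_number;
    // Number of times a memory range has been assigned to an allocation context
    PER_HEAP VOLATILE(int) alloc_context_count;
#else //MULTIPLE_HEAPS
    // Under workstation GC this maps to the global GCHeap instance
    #define vm_heap ((GCHeap*) g_pGCHeap)
    // Under workstation GC the index is 0
    #define heap_number (0)
#endif //MULTIPLE_HEAPS

#ifndef MULTIPLE_HEAPS
    // The current ephemeral heap segment (the segment new objects are allocated from)
    SPTR_DECL(heap_segment,ephemeral_heap_segment);
#else
    // Same as above
    PER_HEAP heap_segment* ephemeral_heap_segment;
#endif // !MULTIPLE_HEAPS

    // Global GC lock, static variable
    PER_HEAP_ISOLATED GCSpinLock gc_lock; //lock while doing GC
    // Lock taken when an allocation context is used up and needs a new range
    PER_HEAP GCSpinLock more_space_lock; //lock while allocating more space

#ifdef MULTIPLE_HEAPS
    // Stores information about each generation
    // NUMBERGENERATIONS+1=5; the generations are 0 1 2 3 and the last element is unused
    // Not defined under workstation GC, which uses the global variable generation_table instead
    PER_HEAP generation generation_table [NUMBERGENERATIONS+1];
#endif

#ifdef MULTIPLE_HEAPS
    // Total number of gc_heap instances, static variable
    // Defaults to the number of CPU cores under server GC; 0 under workstation GC
    SVAL_DECL(int, n_heaps);
    // Array of all gc_heap instances, static variable
    SPTR_DECL(PTR_gc_heap, g_heaps);
#endif
};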
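Source summary of generation:
Definition of generation: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gcpriv.h#L754
I only list the members this article touches.
class generation
{
public:
    // The default allocation context
    alloc_context allocation_context;
    // The most recent heap segment used for allocation
    heap_segment* allocation_segment;
    // The first heap segment
    PTR_heap_segment start_segment;
    // Pointer used to tell which generation an object is in; objects after it belong to this generation or a younger one
    uint8_t* allocation_start;
    // Allocator that stores and hands out free objects (Free Object, also called Unused Array; think of them as fragmented space)
    allocator free_list_allocator;
    // Which generation this is
    int gen_num;
};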
Source summary of heap_segment:
Definition of heap_segment: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gcpriv.h#L4166
I only list the members this article touches.
class heap_segment
{
public:
    // Address actually allocated up to (mem + allocated size)
    // Updates may lag behind
    uint8_t* allocated;
    // Address committed to physical memory up to (this + SEGMENT_INITIAL_COMMIT)
    uint8_t* committed;
    // Address reserved up to (this + size)
    uint8_t* reserved;
    // Address used up to (mem + allocated size - object header size)
    uint8_t* used;
    // Initial allocation address (with server GC: this + OS_PAGE_SIZE, otherwise: this + sizeof(*this) + alignment)
    uint8_t* mem;
    // The next heap segment
    PTR_heap_segment next;
    // The gc_heap instance this segment belongs to
    gc_heap* heap;
};
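Source summary of alloc_context:
Definition of alloc_context: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gc.h#L162
This is the 1.1.0 code. In 1.2.0 these members moved to gc_alloc_context in gcinterface.h, but the members themselves are unchanged.
struct alloc_context
{
    // Start address for the next object allocation
    uint8_t* alloc_ptr;
    // Last address that can be allocated up to
    uint8_t* alloc_limit;
    // Running total of bytes allocated for small objects
    int64_t alloc_bytes; //Number of bytes allocated on SOH by this context
    // Running total of bytes allocated for large objects
    int64_t alloc_bytes_loh; //Number of bytes allocated on LOH by this context
#if defined(FEATURE_SVR_GC)
    // The GCHeap used when space runs out and more has to be obtained
    // Splitting alloc_heap and home_heap balances usage across heaps, which reduces the time difference between heaps during parallel collection
    SVR::GCHeap* alloc_heap;
    // The original GCHeap
    SVR::GCHeap* home_heap;
#endif // defined(FEATURE_SVR_GC)
    // Running count of object allocations
    int alloc_count;
};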
To make the code below easier to follow, please look at these two diagrams first.
Remember the AllocateObject function I mentioned in the previous article? It is called by JIT_New and is responsible for allocating an ordinary object.
Let's keep tracing into this function:
AllocateObject: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/gchelpers.cpp#L931
The other overloads of AllocateObject also call AllocAlign8 or Alloc, so I won't paste their code below.
OBJECTREF AllocateObject(MethodTable *pMT
#ifdef FEATURE_COMINTEROP
                         , bool fHandleCom
#endif
    )
{
    // Some code omitted......
    Object *orObject = NULL;
    // Call the GC helper to allocate memory: AllocAlign8 if 8-byte alignment is required, otherwise Alloc
    if (pMT->RequiresAlign8())
    {
        // Some code omitted......
        orObject = (Object *) AllocAlign8(baseSize, pMT->HasFinalizer(), pMT->ContainsPointers(), pMT->IsValueType());
    }
    else
    {
        orObject = (Object *) Alloc(baseSize, pMT->HasFinalizer(), pMT->ContainsPointers());
    }
    // Some code omitted......
    return UNCHECKED_OBJECTREF_TO_OBJECTREF(oref);
}
Alloc: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/gchelpers.cpp#L931
inline Object* Alloc(size_t size, BOOL bFinalize, BOOL bContainsPointers )
{
    // Some code omitted......
    // If allocation contexts are enabled, allocate through the current thread's allocation context
    // Otherwise allocate through the default allocation context in the generation
    // According to the official documentation, allocation contexts are enabled in the vast majority of cases
    // On the machine I tested, UseAllocationContexts returns true directly without any check
    if (GCHeap::UseAllocationContexts())
        retVal = GCHeap::GetGCHeap()->Alloc(GetThreadAllocContext(), size, flags);
    else
        retVal = GCHeap::GetGCHeap()->Alloc(size, flags);
    // Some code omitted......
    return retVal;
}
GetGCHeap: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gc.h#L377
static GCHeap *GetGCHeap()
{
    LIMITED_METHOD_CONTRACT;
    // Return the global GCHeap instance
    // Note that this instance only serves as an interface and is not tied to a specific gc_heap instance
    _ASSERTE(g_pGCHeap != NULL);
    return g_pGCHeap;
}
GetThreadAllocContext: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/gchelpers.cpp#L54
inline alloc_context* GetThreadAllocContext()
{
    WRAPPER_NO_CONTRACT;
    assert(GCHeap::UseAllocationContexts());
    // Get the current thread and return the address of its m_alloc_context member
    return & GetThread()->m_alloc_context;
}
GCHeap::Alloc: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
Object* GCHeap::Alloc(alloc_context* acontext, size_t size, uint32_t flags REQD_ALIGN_DCL)
{
    // Some code omitted......
    Object* newAlloc = NULL;

    // If the allocation context is being used for the first time, call AssignHeap to pair it with a GCHeap instance
#ifdef MULTIPLE_HEAPS
    if (acontext->alloc_heap == 0)
    {
        AssignHeap (acontext);
        assert (acontext->alloc_heap);
    }
#endif //MULTIPLE_HEAPS

    // Trigger a GC if necessary
#ifndef FEATURE_REDHAWK
    GCStress<gc_on_alloc>::MaybeTrigger(acontext);
#endif // FEATURE_REDHAWK

    // Server GC uses the gc_heap paired with the GCHeap; workstation GC uses nullptr
#ifdef MULTIPLE_HEAPS
    gc_heap* hp = acontext->alloc_heap->pGenGCHeap;
#else
    gc_heap* hp = pGenGCHeap;
    // Some code omitted......
#endif //MULTIPLE_HEAPS

    // Small objects go through allocate, large objects through allocate_large_object
    if (size < LARGE_OBJECT_SIZE)
    {
#ifdef TRACE_GC
        AllocSmallCount++;
#endif //TRACE_GC
        // Allocate memory for a small object
        newAlloc = (Object*) hp->allocate (size + ComputeMaxStructAlignPad(requiredAlignment), acontext);
#ifdef FEATURE_STRUCTALIGN
        // Align the pointer
        newAlloc = (Object*) hp->pad_for_alignment ((uint8_t*) newAlloc, requiredAlignment, size, acontext);
#endif // FEATURE_STRUCTALIGN
        // ASSERT (newAlloc);
    }
    else
    {
        // Allocate memory for a large object
        newAlloc = (Object*) hp->allocate_large_object (size + ComputeMaxStructAlignPadLarge(requiredAlignment), acontext->alloc_bytes_loh);
#ifdef FEATURE_STRUCTALIGN
        // Align the pointer
        newAlloc = (Object*) hp->pad_for_alignment_large ((uint8_t*) newAlloc, requiredAlignment, size);
#endif // FEATURE_STRUCTALIGN
    }
    // Some code omitted......
    return newAlloc;
}
Let's look at how memory for small objects is allocated.
allocate: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
This function tries to allocate from the allocation context; on failure it calls allocate_more_space to give the allocation context a new range.
The first half of this logic also has an assembly version; see the JIT_TrialAllocSFastMP_InlineGetThread function analyzed in the previous article.
inline CObjectHeader* gc_heap::allocate (size_t jsize, alloc_context* acontext)
{
    size_t size = Align (jsize);
    assert (size >= Align (min_obj_size));
    {
    retry:
        // Try to place the object at alloc_ptr
        uint8_t* result = acontext->alloc_ptr;
        acontext->alloc_ptr+=size;
        // If alloc_ptr + object size > alloc_limit, the allocation context is either being used for the first time or has run out of space
        if (acontext->alloc_ptr <= acontext->alloc_limit)
        {
            // Success; the address returned is the value of alloc_ptr before += size
            CObjectHeader* obj = (CObjectHeader*)result;
            assert (obj != 0);
            return obj;
        }
        else
        {
            // Failure; subtract size again
            acontext->alloc_ptr -= size;
#ifdef _MSC_VER
#pragma inline_depth(0)
#endif //_MSC_VER
            // Try to assign a new range to the allocation context
            if (! allocate_more_space (acontext, size, 0))
                return 0;
#ifdef _MSC_VER
#pragma inline_depth(20)
#endif //_MSC_VER
            // Retry
            goto retry;
        }
    }
}
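The fast path of allocate is a bump-pointer allocation. Here is a minimal sketch of that idea. DemoAllocContext, demo_align and demo_bump_allocate are illustrative names rather than CoreCLR APIs, the 8-byte alignment is an assumption, and the sketch returns nullptr where the real function would call allocate_more_space and retry. It also compares sizes before advancing the pointer instead of advancing first and rolling back.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-in for alloc_context: only the two pointers the fast path uses.
struct DemoAllocContext {
    uint8_t* alloc_ptr;
    uint8_t* alloc_limit;
};

// Round size up to a multiple of 8 (an assumed alignment for this sketch).
inline size_t demo_align(size_t size) {
    return (size + 7) & ~static_cast<size_t>(7);
}

// Returns the start of the new object, or nullptr when the context is exhausted.
// The real allocate() calls allocate_more_space and retries instead of giving up.
inline uint8_t* demo_bump_allocate(DemoAllocContext& ctx, size_t requested) {
    size_t size = demo_align(requested);
    size_t remaining = static_cast<size_t>(ctx.alloc_limit - ctx.alloc_ptr);
    if (size > remaining) {
        return nullptr;  // not enough room; alloc_ptr is left untouched
    }
    uint8_t* result = ctx.alloc_ptr;  // the object starts at the old alloc_ptr
    ctx.alloc_ptr += size;            // the next allocation starts after it
    return result;
}
```

Because each thread normally owns its allocation context, this path needs no lock; only the refill path takes more_space_lock.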
allocate_more_space: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
When there are multiple heaps, this function calls balance_heaps to even out usage across heaps, and then calls try_allocate_more_space.
BOOL gc_heap::allocate_more_space(alloc_context* acontext, size_t size, int alloc_generation_number)
{
    int status;
    do
    {
        // With multiple heaps, balance their usage first to reduce the time difference between heaps during parallel collection
#ifdef MULTIPLE_HEAPS
        if (alloc_generation_number == 0)
        {
            // Balance usage across heaps
            balance_heaps (acontext);
            // Call try_allocate_more_space
            status = acontext->alloc_heap->pGenGCHeap->try_allocate_more_space (acontext, size, alloc_generation_number);
        }
        else
        {
            // Balance usage across heaps (large objects)
            gc_heap* alloc_heap = balance_heaps_loh (acontext, size);
            // Call try_allocate_more_space
            status = alloc_heap->try_allocate_more_space (acontext, size, alloc_generation_number);
        }
#else
        // With a single heap, call try_allocate_more_space directly
        status = try_allocate_more_space (acontext, size, alloc_generation_number);
#endif //MULTIPLE_HEAPS
    }
    while (status == -1);
    return (status != 0);
}
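The three-valued status convention used here (-1 = retry, 0 = failure, anything else = success) can be isolated into a small sketch. demo_allocate_more_space and the try_once callback are hypothetical names standing in for allocate_more_space and try_allocate_more_space; heap balancing is left out.

```cpp
#include <cassert>
#include <functional>

// Status convention taken from the code above: -1 = retry, 0 = failure, 1 = success.
// try_once is a hypothetical stand-in for try_allocate_more_space.
inline bool demo_allocate_more_space(const std::function<int()>& try_once,
                                     int* attempts = nullptr) {
    int status;
    int count = 0;
    do {
        status = try_once();  // may report "a GC was running, try again" as -1
        ++count;
    } while (status == -1);
    if (attempts) {
        *attempts = count;    // exposed only so the sketch can be observed
    }
    return status != 0;
}
```

A callback that returns -1 twice and then 1 makes the loop run three times and report success, which matches a thread that twice finds a GC in progress before it finally gets its new range.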
try_allocate_more_space: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
This function acquires the MSL (more space lock), checks whether a GC needs to be triggered, and then calls allocate_small or allocate_large depending on the gen_number parameter.
int gc_heap::try_allocate_more_space (alloc_context* acontext, size_t size, int gen_number)
{
    // If a GC has already started, wait for it to finish and retry
    // Returning -1 makes the do-while loop in allocate_more_space call this function again
    if (gc_heap::gc_started)
    {
        wait_for_gc_done();
        return -1;
    }

    // Acquire more_space_lock
    // and record whether acquiring it took a long or a short time
#ifdef SYNCHRONIZATION_STATS
    unsigned int msl_acquire_start = GetCycleCount32();
#endif //SYNCHRONIZATION_STATS
    enter_spin_lock (&more_space_lock);
    add_saved_spinlock_info (me_acquire, mt_try_alloc);
    dprintf (SPINLOCK_LOG, ("[%d]Emsl for alloc", heap_number));
#ifdef SYNCHRONIZATION_STATS
    unsigned int msl_acquire = GetCycleCount32() - msl_acquire_start;
    total_msl_acquire += msl_acquire;
    num_msl_acquired++;
    if (msl_acquire > 200)
    {
        num_high_msl_acquire++;
    }
    else
    {
        num_low_msl_acquire++;
    }
#endif //SYNCHRONIZATION_STATS

    // This part of the code is commented out
    // because holding the msl (more space lock) already prevents the problem
    /*
    // We are commenting this out 'cause we don't see the point - we already
    // have checked gc_started when we were acquiring the msl - no need to check
    // again. This complicates the logic in bgc_suspend_EE 'cause that one would
    // need to release msl which causes all sorts of trouble.
    if (gc_heap::gc_started)
    {
#ifdef SYNCHRONIZATION_STATS
        good_suspension++;
#endif //SYNCHRONIZATION_STATS
        BOOL fStress = (g_pConfig->GetGCStressLevel() & EEConfig::GCSTRESS_TRANSITION) != 0;
        if (!fStress)
        {
            //Rendez vous early (MP scaling issue)
            //dprintf (1, ("[%d]waiting for gc", heap_number));
            wait_for_gc_done();
#ifdef MULTIPLE_HEAPS
            return -1;
#endif //MULTIPLE_HEAPS
        }
    }
    */

    dprintf (3, ("requested to allocate %d bytes on gen%d", size, gen_number));

    // Get the value used for alignment
    // 3 (0b11) or 7 (0b111) for small objects, 7 (0b111) for large objects
    int align_const = get_alignment_constant (gen_number != (max_generation+1));

    // Trigger a GC if necessary
    if (fgn_maxgen_percent)
    {
        check_for_full_gc (gen_number, size);
    }

    // Check again and trigger a GC if necessary
    if (!(new_allocation_allowed (gen_number)))
    {
        if (fgn_maxgen_percent && (gen_number == 0))
        {
            // We only check gen0 every so often, so take this opportunity to check again.
            check_for_full_gc (gen_number, size);
        }

        // If a background GC is running and physical memory usage is above 95%, wait for the background GC to finish
#ifdef BACKGROUND_GC
        wait_for_bgc_high_memory (awr_gen0_alloc);
#endif //BACKGROUND_GC

#ifdef SYNCHRONIZATION_STATS
        bad_suspension++;
#endif //SYNCHRONIZATION_STATS
        dprintf (/*100*/ 2, ("running out of budget on gen%d, gc", gen_number));

        // Trigger a GC in place if necessary
        if (!settings.concurrent || (gen_number == 0))
        {
            vm_heap->GarbageCollectGeneration (0, ((gen_number == 0) ? reason_alloc_soh : reason_alloc_loh));
#ifdef MULTIPLE_HEAPS
            // Triggering a GC releases the MSL, so it has to be acquired again
            enter_spin_lock (&more_space_lock);
            add_saved_spinlock_info (me_acquire, mt_try_budget);
            dprintf (SPINLOCK_LOG, ("[%d]Emsl out budget", heap_number));
#endif //MULTIPLE_HEAPS
        }
    }

    // Call a different function depending on the generation; the callee assigns a new range to the allocation context
    // gen_number can only be 0 or 3
    BOOL can_allocate = ((gen_number == 0) ?
        allocate_small (gen_number, size, acontext, align_const) :
        allocate_large (gen_number, size, acontext, align_const));

    // On success, check whether an ETW (Event Tracing for Windows) event should be fired
    if (can_allocate)
    {
        // Record how many bytes were handed to the allocation context
        //ETW trace for allocation tick
        size_t alloc_context_bytes = acontext->alloc_limit + Align (min_obj_size, align_const) - acontext->alloc_ptr;
        int etw_allocation_index = ((gen_number == 0) ? 0 : 1);
        etw_allocation_running_amount[etw_allocation_index] += alloc_context_bytes;

        // Fire the ETW event once the amount exceeds a threshold
        if (etw_allocation_running_amount[etw_allocation_index] > etw_allocation_tick)
        {
#ifdef FEATURE_REDHAWK
            FireEtwGCAllocationTick_V1((uint32_t)etw_allocation_running_amount[etw_allocation_index],
                ((gen_number == 0) ? ETW::GCLog::ETW_GC_INFO::AllocationSmall : ETW::GCLog::ETW_GC_INFO::AllocationLarge),
                GetClrInstanceId());
#else
            // Unfortunately some of the ETW macros do not check whether the ETW feature is enabled.
            // The ones that do are much less efficient.
#if defined(FEATURE_EVENT_TRACE)
            if (EventEnabledGCAllocationTick_V2())
            {
                fire_etw_allocation_event (etw_allocation_running_amount[etw_allocation_index], gen_number, acontext->alloc_ptr);
            }
#endif //FEATURE_EVENT_TRACE
#endif //FEATURE_REDHAWK
            // Reset the running amount
            etw_allocation_running_amount[etw_allocation_index] = 0;
        }
    }
    return (int)can_allocate;
}
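The alignment constant fetched above is a bit mask (3 or 7), not a byte count. A minimal sketch of how such a mask rounds a size up follows; demo_align_up is an illustrative name, and the formula is the usual mask-based round-up that I assume Align follows.

```cpp
#include <cassert>
#include <cstddef>

// Rounds size up to the next multiple of (align_const + 1).
// align_const is expected to be a mask such as 3 (4-byte) or 7 (8-byte),
// matching the values listed in the comment on get_alignment_constant.
inline size_t demo_align_up(size_t size, size_t align_const) {
    return (size + align_const) & ~align_const;
}
```

With a mask of 7, a request of 13 bytes becomes 16; with a mask of 3, a request of 17 bytes becomes 20. Sizes that are already aligned are returned unchanged.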
allocate_small: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
This function loops, trying various ways of reclaiming memory and calling soh_try_fit; it leaves the loop when soh_try_fit succeeds or every option has been exhausted.
BOOL gc_heap::allocate_small (int gen_number, size_t size, alloc_context* acontext, int align_const)
{
    // Under workstation GC with a background GC running, sleep once every 140 calls (bgc_alloc_spin_count) for 2ms (bgc_alloc_spin)
#if defined (BACKGROUND_GC) && !defined (MULTIPLE_HEAPS)
    if (recursive_gc_sync::background_running_p())
    {
        background_soh_alloc_count++;
        if ((background_soh_alloc_count % bgc_alloc_spin_count) == 0)
        {
            Thread* current_thread = GetThread();
            add_saved_spinlock_info (me_release, mt_alloc_small);
            dprintf (SPINLOCK_LOG, ("[%d]spin Lmsl", heap_number));
            leave_spin_lock (&more_space_lock);
            BOOL cooperative_mode = enable_preemptive (current_thread);
            GCToOSInterface::Sleep (bgc_alloc_spin);
            disable_preemptive (current_thread, cooperative_mode);
            enter_spin_lock (&more_space_lock);
            add_saved_spinlock_info (me_acquire, mt_alloc_small);
            dprintf (SPINLOCK_LOG, ("[%d]spin Emsl", heap_number));
        }
        else
        {
            //GCToOSInterface::YieldThread (0);
        }
    }
#endif //BACKGROUND_GC && !MULTIPLE_HEAPS

    gc_reason gr = reason_oos_soh;
    oom_reason oom_r = oom_no_failure;

    // No variable values should be "carried over" from one state to the other.
    // That's why there are local variable for each state
    allocation_state soh_alloc_state = a_state_start;

    // Start looping through the states; keep an eye on soh_alloc_state
    // If we can get a new seg it means allocation will succeed.
    while (1)
    {
        dprintf (3, ("[h%d]soh state is %s", heap_number, allocation_state_str[soh_alloc_state]));
        switch (soh_alloc_state)
        {
            // Leave the loop on success or failure
            case a_state_can_allocate:
            case a_state_cant_allocate:
            {
                goto exit;
            }
            // At the start, switch to a_state_try_fit
            case a_state_start:
            {
                soh_alloc_state = a_state_try_fit;
                break;
            }
            // Call soh_try_fit
            // On success switch to a_state_can_allocate
            // On failure switch to a_state_trigger_full_compact_gc or a_state_trigger_ephemeral_gc
            case a_state_try_fit:
            {
                BOOL commit_failed_p = FALSE;
                BOOL can_use_existing_p = FALSE;
                can_use_existing_p = soh_try_fit (gen_number, size, acontext, align_const, &commit_failed_p, NULL);
                soh_alloc_state = (can_use_existing_p ?
                                        a_state_can_allocate :
                                        (commit_failed_p ?
                                            a_state_trigger_full_compact_gc :
                                            a_state_trigger_ephemeral_gc));
                break;
            }
            // Call soh_try_fit after the background GC has finished
            // On success switch to a_state_can_allocate
            // On failure switch to a_state_trigger_2nd_ephemeral_gc or a_state_trigger_full_compact_gc
            case a_state_try_fit_after_bgc:
            {
                BOOL commit_failed_p = FALSE;
                BOOL can_use_existing_p = FALSE;
                BOOL short_seg_end_p = FALSE;
                can_use_existing_p = soh_try_fit (gen_number, size, acontext, align_const, &commit_failed_p, &short_seg_end_p);
                soh_alloc_state = (can_use_existing_p ?
                                        a_state_can_allocate :
                                        (short_seg_end_p ?
                                            a_state_trigger_2nd_ephemeral_gc :
                                            a_state_trigger_full_compact_gc));
                break;
            }
            // Call soh_try_fit after a compacting GC has finished
            // If allocation still fails after compaction, switch to a_state_cant_allocate
            // On success switch to a_state_can_allocate
            case a_state_try_fit_after_cg:
            {
                BOOL commit_failed_p = FALSE;
                BOOL can_use_existing_p = FALSE;
                BOOL short_seg_end_p = FALSE;
                can_use_existing_p = soh_try_fit (gen_number, size, acontext, align_const, &commit_failed_p, &short_seg_end_p);
                if (short_seg_end_p)
                {
                    soh_alloc_state = a_state_cant_allocate;
                    oom_r = oom_budget;
                }
                else
                {
                    if (can_use_existing_p)
                    {
                        soh_alloc_state = a_state_can_allocate;
                    }
                    else
                    {
#ifdef MULTIPLE_HEAPS
                        if (!commit_failed_p)
                        {
                            // some other threads already grabbed the more space lock and allocated
                            // so we should attemp an ephemeral GC again.
                            assert (heap_segment_allocated (ephemeral_heap_segment) < alloc_allocated);
                            soh_alloc_state = a_state_trigger_ephemeral_gc;
                        }
                        else
#endif //MULTIPLE_HEAPS
                        {
                            assert (commit_failed_p);
                            soh_alloc_state = a_state_cant_allocate;
                            oom_r = oom_cant_commit;
                        }
                    }
                }
                break;
            }
            // Wait for the background GC to finish
            // If a compaction was performed, switch to a_state_try_fit_after_cg
            // Otherwise switch to a_state_try_fit_after_bgc
            case a_state_check_and_wait_for_bgc:
            {
                BOOL bgc_in_progress_p = FALSE;
                BOOL did_full_compacting_gc = FALSE;
                bgc_in_progress_p = check_and_wait_for_bgc (awr_gen0_oos_bgc, &did_full_compacting_gc);
                soh_alloc_state = (did_full_compacting_gc ?
                                        a_state_try_fit_after_cg :
                                        a_state_try_fit_after_bgc);
                break;
            }
            // Trigger a GC of generations 0 and 1
            // If it compacted, switch to a_state_try_fit_after_cg
            // Otherwise retry soh_try_fit: on success switch to a_state_can_allocate, on failure switch to waiting for the background GC or triggering another GC
            case a_state_trigger_ephemeral_gc:
            {
                BOOL commit_failed_p = FALSE;
                BOOL can_use_existing_p = FALSE;
                BOOL short_seg_end_p = FALSE;
                BOOL bgc_in_progress_p = FALSE;
                BOOL did_full_compacting_gc = FALSE;
                did_full_compacting_gc = trigger_ephemeral_gc (gr);
                if (did_full_compacting_gc)
                {
                    soh_alloc_state = a_state_try_fit_after_cg;
                }
                else
                {
                    can_use_existing_p = soh_try_fit (gen_number, size, acontext, align_const, &commit_failed_p, &short_seg_end_p);
#ifdef BACKGROUND_GC
                    bgc_in_progress_p = recursive_gc_sync::background_running_p();
#endif //BACKGROUND_GC
                    if (short_seg_end_p)
                    {
                        soh_alloc_state = (bgc_in_progress_p ?
                                                a_state_check_and_wait_for_bgc :
                                                a_state_trigger_full_compact_gc);
                        if (fgn_maxgen_percent)
                        {
                            dprintf (2, ("FGN: doing last GC before we throw OOM"));
                            send_full_gc_notification (max_generation, FALSE);
                        }
                    }
                    else
                    {
                        if (can_use_existing_p)
                        {
                            soh_alloc_state = a_state_can_allocate;
                        }
                        else
                        {
#ifdef MULTIPLE_HEAPS
                            if (!commit_failed_p)
                            {
                                // some other threads already grabbed the more space lock and allocated
                                // so we should attemp an ephemeral GC again.
                                assert (heap_segment_allocated (ephemeral_heap_segment) < alloc_allocated);
                                soh_alloc_state = a_state_trigger_ephemeral_gc;
                            }
                            else
#endif //MULTIPLE_HEAPS
                            {
                                soh_alloc_state = a_state_trigger_full_compact_gc;
                                if (fgn_maxgen_percent)
                                {
                                    dprintf (2, ("FGN: failed to commit, doing full compacting GC"));
                                    send_full_gc_notification (max_generation, FALSE);
                                }
                            }
                        }
                    }
                }
                break;
            }
            // Trigger a GC of generations 0 and 1 for the second time
            // If it compacted, switch to a_state_try_fit_after_cg
            // Otherwise retry soh_try_fit: on success switch to a_state_can_allocate, on failure switch to a_state_trigger_full_compact_gc
            case a_state_trigger_2nd_ephemeral_gc:
            {
                BOOL commit_failed_p = FALSE;
                BOOL can_use_existing_p = FALSE;
                BOOL short_seg_end_p = FALSE;
                BOOL did_full_compacting_gc = FALSE;
                did_full_compacting_gc = trigger_ephemeral_gc (gr);
                if (did_full_compacting_gc)
                {
                    soh_alloc_state = a_state_try_fit_after_cg;
                }
                else
                {
                    can_use_existing_p = soh_try_fit (gen_number, size, acontext, align_const, &commit_failed_p, &short_seg_end_p);
                    if (short_seg_end_p || commit_failed_p)
                    {
                        soh_alloc_state = a_state_trigger_full_compact_gc;
                    }
                    else
                    {
                        assert (can_use_existing_p);
                        soh_alloc_state = a_state_can_allocate;
                    }
                }
                break;
            }
            // Trigger a compacting GC of generations 0, 1 and 2
            // On success switch to a_state_try_fit_after_cg, on failure switch to a_state_cant_allocate
            case a_state_trigger_full_compact_gc:
            {
                BOOL got_full_compacting_gc = FALSE;
                got_full_compacting_gc = trigger_full_compact_gc (gr, &oom_r);
                soh_alloc_state = (got_full_compacting_gc ? a_state_try_fit_after_cg : a_state_cant_allocate);
                break;
            }
            default:
            {
                assert (!"Invalid state!");
                break;
            }
        }
    }

exit:
    // Handle OOM (Out Of Memory) when the allocation failed
    if (soh_alloc_state == a_state_cant_allocate)
    {
        assert (oom_r != oom_no_failure);
        handle_oom (heap_number, oom_r, size, heap_segment_allocated (ephemeral_heap_segment), heap_segment_reserved (ephemeral_heap_segment));
        dprintf (SPINLOCK_LOG, ("[%d]Lmsl for oom", heap_number));
        add_saved_spinlock_info (me_release, mt_alloc_small_cant);
        leave_spin_lock (&more_space_lock);
    }
    return (soh_alloc_state == a_state_can_allocate);
}
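The state machine is easier to reason about when a few transitions are written as pure functions. The sketch below covers only three of the cases (a_state_try_fit, a_state_try_fit_after_bgc and a_state_trigger_full_compact_gc). DemoState, DemoFitResult and the demo_after_* functions are illustrative names, and no actual GC or allocation work is modeled.

```cpp
#include <cassert>

// A reduced set of states; the names loosely follow the a_state_* values above.
enum class DemoState {
    CanAllocate,
    CantAllocate,
    TriggerEphemeralGC,
    Trigger2ndEphemeralGC,
    TriggerFullCompactGC,
    TryFitAfterCG
};

// The three outputs of soh_try_fit that drive the transitions.
struct DemoFitResult {
    bool can_use_existing;
    bool commit_failed;
    bool short_seg_end;
};

// Transition taken in a_state_try_fit.
inline DemoState demo_after_try_fit(const DemoFitResult& r) {
    if (r.can_use_existing) return DemoState::CanAllocate;
    return r.commit_failed ? DemoState::TriggerFullCompactGC
                           : DemoState::TriggerEphemeralGC;
}

// Transition taken in a_state_try_fit_after_bgc.
inline DemoState demo_after_try_fit_after_bgc(const DemoFitResult& r) {
    if (r.can_use_existing) return DemoState::CanAllocate;
    return r.short_seg_end ? DemoState::Trigger2ndEphemeralGC
                           : DemoState::TriggerFullCompactGC;
}

// Transition taken in a_state_trigger_full_compact_gc.
inline DemoState demo_after_full_compact_gc(bool got_full_compacting_gc) {
    return got_full_compacting_gc ? DemoState::TryFitAfterCG
                                  : DemoState::CantAllocate;
}
```

Read this way, the escalation order is visible: a cheap retry first, then an ephemeral GC, and a full compacting GC as the last resort before reporting out of memory.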
soh_try_fit: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
This function first tries a_fit_free_list_p to allocate from the free object list, then tries a_fit_segment_end_p to allocate from the end of the heap segment.
BOOL gc_heap::soh_try_fit (int gen_number, size_t size, alloc_context* acontext, int align_const,
    BOOL* commit_failed_p, // out parameter: whether committing virtual memory to physical memory failed (out of physical memory)
    BOOL* short_seg_end_p) // out parameter: whether the end of the heap segment is too short
{
    BOOL can_allocate = TRUE;

    // If short_seg_end_p was passed in, set it to false first
    if (short_seg_end_p)
    {
        *short_seg_end_p = FALSE;
    }

    // First try to allocate from the free object list
    can_allocate = a_fit_free_list_p (gen_number, size, acontext, align_const);
    if (!can_allocate)
    {
        // Could not allocate from the free object list; try the end of the heap segment
        // Check whether the space at the end of ephemeral_heap_segment is sufficient
        if (short_seg_end_p)
        {
            *short_seg_end_p = short_on_end_of_seg (gen_number, ephemeral_heap_segment, align_const);
        }

        // If there is enough space, or the caller did not pass short_seg_end_p (passed nullptr), call a_fit_segment_end_p
        // If the caller doesn't care, we always try to fit at the end of seg;
        // otherwise we would only try if we are actually not short at end of seg.
        if (!short_seg_end_p || !(*short_seg_end_p))
        {
            can_allocate = a_fit_segment_end_p (gen_number, ephemeral_heap_segment, size, acontext, align_const, commit_failed_p);
        }
    }
    return can_allocate;
}
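The control flow of soh_try_fit (free list first, then the segment end, with an optional "short on space" check) can be sketched with callbacks. DemoFitSources and demo_try_fit are made-up names; the three callbacks stand in for a_fit_free_list_p, short_on_end_of_seg and a_fit_segment_end_p and do no real allocation.

```cpp
#include <cassert>
#include <functional>

// Hypothetical callbacks standing in for a_fit_free_list_p, short_on_end_of_seg
// and a_fit_segment_end_p; none of these wrapper names exist in CoreCLR.
struct DemoFitSources {
    std::function<bool()> fit_free_list;
    std::function<bool()> short_on_segment_end;
    std::function<bool(bool* commit_failed)> fit_segment_end;
};

// Mirrors the control flow of soh_try_fit.
// short_seg_end may be nullptr, meaning "the caller does not care".
inline bool demo_try_fit(const DemoFitSources& src,
                         bool* commit_failed,
                         bool* short_seg_end) {
    if (short_seg_end) {
        *short_seg_end = false;
    }
    if (src.fit_free_list()) {
        return true;  // satisfied from the free list
    }
    if (short_seg_end) {
        *short_seg_end = src.short_on_segment_end();
    }
    // Try the segment end unless the caller asked and the answer was "too short".
    if (!short_seg_end || !*short_seg_end) {
        return src.fit_segment_end(commit_failed);
    }
    return false;
}
```

Passing nullptr for short_seg_end reproduces the a_state_try_fit call site, where the segment end is always attempted.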
a_fit_free_list_p: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
This function tries to find a large enough space in the free object list; if it finds one, it points the allocation context at that space.
inline BOOL gc_heap::a_fit_free_list_p (int gen_number, size_t size, alloc_context* acontext, int align_const)
{
    BOOL can_fit = FALSE;
    // Get the free object list of the given generation
    generation* gen = generation_of (gen_number);
    allocator* gen_allocator = generation_allocator (gen);

    // The list is split by size into several buckets (each one a linked list)
    // The size doubles from bucket to bucket: if first_bucket is 256, the second bucket is 512
    size_t sz_list = gen_allocator->first_bucket_size();
    for (unsigned int a_l_idx = 0; a_l_idx < gen_allocator->number_of_buckets(); a_l_idx++)
    {
        if ((size < sz_list) || (a_l_idx == (gen_allocator->number_of_buckets()-1)))
        {
            uint8_t* free_list = gen_allocator->alloc_list_head_of (a_l_idx);
            uint8_t* prev_free_item = 0;
            while (free_list != 0)
            {
                dprintf (3, ("considering free list %Ix", (size_t)free_list));
                size_t free_list_size = unused_array_size (free_list);
                if ((size + Align (min_obj_size, align_const)) <= free_list_size)
                {
                    dprintf (3, ("Found adequate unused area: [%Ix, size: %Id", (size_t)free_list, free_list_size));
                    // Large enough: pop it from this bucket's linked list
                    gen_allocator->unlink_item (a_l_idx, free_list, prev_free_item, FALSE);

                    // We ask for more Align (min_obj_size)
                    // to make sure that we can insert a free object
                    // in adjust_limit will set the limit lower
                    size_t limit = limit_from_size (size, free_list_size, gen_number, align_const);
                    uint8_t* remain = (free_list + limit);
                    size_t remain_size = (free_list_size - limit);

                    // If space is left over after the allocation, turn it into a free object and put it back on the free object list
                    if (remain_size >= Align(min_free_list, align_const))
                    {
                        make_unused_array (remain, remain_size);
                        gen_allocator->thread_item_front (remain, remain_size);
                        assert (remain_size >= Align (min_obj_size, align_const));
                    }
                    else
                    {
                        //absorb the entire free list
                        limit += remain_size;
                    }
                    generation_free_list_space (gen) -= limit;

                    // Assign the new range to the allocation context
                    adjust_limit_clr (free_list, limit, acontext, 0, align_const, gen_number);

                    // Allocation succeeded; leave the loop
                    can_fit = TRUE;
                    goto end;
                }
                else if (gen_allocator->discard_if_no_fit_p())
                {
                    assert (prev_free_item == 0);
                    dprintf (3, ("couldn't use this free area, discarding"));
                    generation_free_obj_space (gen) += free_list_size;
                    gen_allocator->unlink_item (a_l_idx, free_list, prev_free_item, FALSE);
                    generation_free_list_space (gen) -= free_list_size;
                }
                else
                {
                    prev_free_item = free_list;
                }
                // The next free object in the same bucket
                free_list = free_list_slot (free_list);
            }
        }
        // The current bucket is too small; the next bucket is twice its size
        sz_list = sz_list * 2;
    }
end:
    return can_fit;
}
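The bucket selection at the top of the loop can be pulled out into a small function. This sketch only computes the index of the first bucket that would be searched. demo_first_bucket_index is an illustrative name, the 256-byte first bucket reuses the example from the comment above, and num_buckets is assumed to be at least 1.

```cpp
#include <cassert>
#include <cstddef>

// Returns the index of the first bucket the search would look into:
// the first bucket whose size limit exceeds `size`, or the last bucket,
// which acts as a catch-all. Bucket limits double from one bucket to the next.
// Assumes num_buckets >= 1.
inline unsigned demo_first_bucket_index(size_t size,
                                        size_t first_bucket_size,
                                        unsigned num_buckets) {
    size_t limit = first_bucket_size;
    for (unsigned i = 0; i + 1 < num_buckets; ++i) {
        if (size < limit) {
            return i;
        }
        limit *= 2;  // next bucket holds sizes up to twice as large
    }
    return num_buckets - 1;
}
```

Note that the real loop does not stop at this bucket: if nothing in it fits, the condition size < sz_list stays true for every later bucket, so those are searched as well.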
a_fit_segment_end_p: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
This function tries to find a large enough space at the end of the heap segment; if it finds one, it points the allocation context at that space.
BOOL gc_heap::a_fit_segment_end_p (int gen_number, heap_segment* seg, size_t size, alloc_context* acontext, int align_const, BOOL* commit_failed_p)
{
    *commit_failed_p = FALSE;
    size_t limit = 0;
#ifdef BACKGROUND_GC
    int cookie = -1;
#endif //BACKGROUND_GC

    // The address allocation starts from
    uint8_t*& allocated = ((gen_number == 0) ? alloc_allocated : heap_segment_allocated(seg));
    size_t pad = Align (min_obj_size, align_const);
#ifdef FEATURE_LOH_COMPACTION
    if (gen_number == (max_generation + 1))
    {
        pad += Align (loh_padding_obj_size, align_const);
    }
#endif //FEATURE_LOH_COMPACTION

    // Highest address that can be allocated up to = address committed to physical memory - padding
    uint8_t* end = heap_segment_committed (seg) - pad;

    // If there is enough space, jump to found_fit
    if (a_size_fit_p (size, allocated, end, align_const))
    {
        limit = limit_from_size (size, (end - allocated), gen_number, align_const);
        goto found_fit;
    }

    // The committed range is not enough; more memory has to be committed
    // Highest address that can be allocated up to = reserved end of the heap segment - padding
    end = heap_segment_reserved (seg) - pad;

    // If there is enough space, call grow_heap_segment
    // If grow_heap_segment succeeds jump to found_fit, otherwise set commit_failed_p to true
    if (a_size_fit_p (size, allocated, end, align_const))
    {
        limit = limit_from_size (size, (end - allocated), gen_number, align_const);
        if (grow_heap_segment (seg, allocated + limit))
        {
            goto found_fit;
        }
        else
        {
            dprintf (2, ("can't grow segment, doing a full gc"));
            *commit_failed_p = TRUE;
        }
    }
    goto found_no_fit;

found_fit:
    // If background GC is enabled and a large object is being allocated, check whether the background GC is currently marking objects
#ifdef BACKGROUND_GC
    if (gen_number != 0)
    {
        cookie = bgc_alloc_lock->loh_alloc_set (allocated);
    }
#endif //BACKGROUND_GC

    uint8_t* old_alloc;
    old_alloc = allocated;

    // For generation 3 (large objects), put a free object into the padding space
#ifdef FEATURE_LOH_COMPACTION
    if (gen_number == (max_generation + 1))
    {
        size_t loh_pad = Align (loh_padding_obj_size, align_const);
        make_unused_array (old_alloc, loh_pad);
        old_alloc += loh_pad;
        allocated += loh_pad;
        limit -= loh_pad;
    }
#endif //FEATURE_LOH_COMPACTION

    // Clear the SyncBlock
    // Normally not needed, because the previous object has already zeroed and reserved the space
#if defined (VERIFY_HEAP) && defined (_DEBUG)
    ((void**) allocated)[-1] = 0; //clear the sync block
#endif //VERIFY_HEAP && _DEBUG

    // Advance the allocation start address; the next allocation starts from here
    // Note that this is a reference, not a local variable
    allocated += limit;
    dprintf (3, ("found fit at end of seg: %Ix", old_alloc));

#ifdef BACKGROUND_GC
    if (cookie != -1)
    {
        // If the background GC is marking objects, call bgc_loh_alloc_clr to assign the new range to the allocation context
        // That function is explained in the next section (code flow for large-object allocation)
        bgc_loh_alloc_clr (old_alloc, limit, acontext, align_const, cookie, TRUE, seg);
    }
    else
#endif //BACKGROUND_GC
    {
        // Assign the new range to the allocation context
        adjust_limit_clr (old_alloc, limit, acontext, seg, align_const, gen_number);
    }
    return TRUE;

found_no_fit:
    return FALSE;
}
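The two-step check in a_fit_segment_end_p (use committed memory if possible, otherwise commit more of the reserved range) can be sketched with plain offsets. DemoSegment, DemoFit and demo_fit_segment_end are made-up names; the commit_succeeds flag replaces the real call to grow_heap_segment, and the sketch hands out exactly `size` bytes instead of the larger limit computed by limit_from_size.

```cpp
#include <cassert>
#include <cstddef>

// Offsets from the start of a hypothetical segment: allocated <= committed <= reserved.
struct DemoSegment {
    size_t allocated;
    size_t committed;
    size_t reserved;
};

enum class DemoFit { FromCommitted, AfterCommit, CommitFailed, NoFit };

// pad plays the role of Align(min_obj_size): room kept free after the allocation.
// commit_succeeds simulates the outcome of growing the committed range.
inline DemoFit demo_fit_segment_end(DemoSegment& seg, size_t size, size_t pad,
                                    bool commit_succeeds) {
    // Step 1: does it fit in memory that is already committed?
    if (seg.committed >= pad && seg.allocated + size <= seg.committed - pad) {
        seg.allocated += size;
        return DemoFit::FromCommitted;
    }
    // Step 2: does it fit in the reserved range if more memory is committed?
    if (seg.reserved >= pad && seg.allocated + size <= seg.reserved - pad) {
        if (!commit_succeeds) {
            return DemoFit::CommitFailed;  // the caller escalates to a full GC
        }
        seg.committed = seg.allocated + size + pad;
        seg.allocated += size;
        return DemoFit::AfterCommit;
    }
    return DemoFit::NoFit;
}
```

The CommitFailed outcome corresponds to commit_failed_p being set, which is what steers allocate_small toward a full compacting GC.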
adjust_limit_clr函數的內容: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
這個函數會給分配上下文設置新的範圍
不論是從自由列表仍是堆段的結尾分配都會調用這個函數, 從自由列表分配時seg參數會是nullptr
調用完這個函數之後分配上下文就有足夠的空間了, 回到gc_heap::allocate的retry就能夠成功的分配到對象的內存
void gc_heap::adjust_limit_clr (uint8_t* start, size_t limit_size, alloc_context* acontext, heap_segment* seg, int align_const, int gen_number) { size_t aligned_min_obj_size = Align(min_obj_size, align_const); //probably should pass seg==0 for free lists. if (seg) { assert (heap_segment_used (seg) <= heap_segment_committed (seg)); } dprintf (3, ("Expanding segment allocation [%Ix, %Ix[", (size_t)start, (size_t)start + limit_size - aligned_min_obj_size)); // 若是分配上下文的開始地址改變了, 而且原來的空間未用完(只是不夠用), 應該在這個空間建立一個自由對象 // 這裏就是BOTR中說的若是剩下30bytes可是要分配40bytes時會在原來的30bytes建立一個自由對象 // 但若是隻是結束地址改變了, 開始地址未改變則不須要 if ((acontext->alloc_limit != start) && (acontext->alloc_limit + aligned_min_obj_size)!= start) { uint8_t* hole = acontext->alloc_ptr; if (hole != 0) { size_t size = (acontext->alloc_limit - acontext->alloc_ptr); dprintf (3, ("filling up hole [%Ix, %Ix[", (size_t)hole, (size_t)hole + size + Align (min_obj_size, align_const))); // when we are finishing an allocation from a free list // we know that the free area was Align(min_obj_size) larger acontext->alloc_bytes -= size; size_t free_obj_size = size + aligned_min_obj_size; make_unused_array (hole, free_obj_size); generation_free_obj_space (generation_of (gen_number)) += free_obj_size; } // 設置新的開始地址 acontext->alloc_ptr = start; } // 設置新的結束地址 acontext->alloc_limit = (start + limit_size - aligned_min_obj_size); // 添加已分配的字節數 acontext->alloc_bytes += limit_size - ((gen_number < max_generation + 1) ? aligned_min_obj_size : 0); #ifdef FEATURE_APPDOMAIN_RESOURCE_MONITORING if (g_fEnableARM) { AppDomain* alloc_appdomain = GetAppDomain(); alloc_appdomain->RecordAllocBytes (limit_size, heap_number); } #endif //FEATURE_APPDOMAIN_RESOURCE_MONITORING uint8_t* saved_used = 0; if (seg) { saved_used = heap_segment_used (seg); } // 若是傳入了seg參數, 調整heap_segment::used的位置 if (seg == ephemeral_heap_segment) { //Sometimes the allocated size is advanced without clearing the //memory. 
Let's catch up here if (heap_segment_used (seg) < (alloc_allocated - plug_skew)) { #ifdef MARK_ARRAY #ifndef BACKGROUND_GC clear_mark_array (heap_segment_used (seg) + plug_skew, alloc_allocated); #endif //BACKGROUND_GC #endif //MARK_ARRAY heap_segment_used (seg) = alloc_allocated - plug_skew; } } #ifdef BACKGROUND_GC else if (seg) { uint8_t* old_allocated = heap_segment_allocated (seg) - plug_skew - limit_size; #ifdef FEATURE_LOH_COMPACTION old_allocated -= Align (loh_padding_obj_size, align_const); #endif //FEATURE_LOH_COMPACTION assert (heap_segment_used (seg) >= old_allocated); } #endif //BACKGROUND_GC // 對設置的空間進行清0 // plug_skew其實就是SyncBlock的大小, 這裏會把start前面的一個SyncBlock也清0 // 對大塊內存的清0會比較耗費時間, 清0以前會釋放掉MSL鎖 if ((seg == 0) || (start - plug_skew + limit_size) <= heap_segment_used (seg)) { dprintf (SPINLOCK_LOG, ("[%d]Lmsl to clear memory(1)", heap_number)); add_saved_spinlock_info (me_release, mt_clr_mem); leave_spin_lock (&more_space_lock); dprintf (3, ("clearing memory at %Ix for %d bytes", (start - plug_skew), limit_size)); memclr (start - plug_skew, limit_size); } else { uint8_t* used = heap_segment_used (seg); heap_segment_used (seg) = start + limit_size - plug_skew; dprintf (SPINLOCK_LOG, ("[%d]Lmsl to clear memory", heap_number)); add_saved_spinlock_info (me_release, mt_clr_mem); leave_spin_lock (&more_space_lock); if ((start - plug_skew) < used) { if (used != saved_used) { FATAL_GC_ERROR (); } dprintf (2, ("clearing memory before used at %Ix for %Id bytes", (start - plug_skew), (plug_skew + used - start))); memclr (start - plug_skew, used - (start - plug_skew)); } } // 設置BrickTable // BrickTable中屬於start的塊會設置爲alloc_ptr距離塊開始地址的大小 // 以後一直到start + limit的塊會設置爲-1 //this portion can be done after we release the lock if (seg == ephemeral_heap_segment) { #ifdef FFIND_OBJECT if (gen0_must_clear_bricks > 0) { //set the brick table to speed up find_object size_t b = brick_of (acontext->alloc_ptr); set_brick (b, acontext->alloc_ptr - brick_address (b)); b++; dprintf (3, 
("Allocation Clearing bricks [%Ix, %Ix[", b, brick_of (align_on_brick (start + limit_size)))); volatile short* x = &brick_table [b]; short* end_x = &brick_table [brick_of (align_on_brick (start + limit_size))]; for (;x < end_x;x++) *x = -1; } else #endif //FFIND_OBJECT { gen0_bricks_cleared = FALSE; } } // verifying the memory is completely cleared. //verify_mem_cleared (start - plug_skew, limit_size); }
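The hole-filling rule at the top of adjust_limit_clr can be sketched in isolation. This is a minimal model, not the CoreCLR implementation: AllocContextSketch, move_context and the constant 24 are hypothetical names and values chosen for illustration, and addresses are plain integers so nothing is dereferenced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-in for CoreCLR's alloc_context.
// Addresses are plain integers so the sketch never dereferences them.
struct AllocContextSketch {
    std::uintptr_t alloc_ptr = 0;
    std::uintptr_t alloc_limit = 0;
};

// Illustrative value only; the real code uses Align(min_obj_size, align_const).
constexpr std::size_t kAlignedMinObjSize = 24;

// Points the context at [start, start + limit_size) and returns the size of the
// free object that has to cover the abandoned tail of the old range (0 if none).
std::size_t move_context(AllocContextSketch& ctx, std::uintptr_t start, std::size_t limit_size) {
    std::size_t free_obj_size = 0;
    // The new range directly continues the old one: only the limit moves.
    const bool contiguous = (ctx.alloc_limit == start) ||
                            (ctx.alloc_limit + kAlignedMinObjSize == start);
    if (!contiguous) {
        if (ctx.alloc_ptr != 0) {
            assert(ctx.alloc_limit >= ctx.alloc_ptr);
            // Unused tail plus the min-object slot that was held back from the limit.
            free_obj_size = (ctx.alloc_limit - ctx.alloc_ptr) + kAlignedMinObjSize;
        }
        ctx.alloc_ptr = start;
    }
    ctx.alloc_limit = start + limit_size - kAlignedMinObjSize;
    return free_obj_size;
}
```

With 30 bytes left in the old range, moving the context elsewhere yields a free object of 30 bytes plus the aligned minimum object size, which matches the "30 bytes left, 40 bytes requested" case described in the BOTR document.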
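Summary of the code flow for small-object allocation
Now let's look at how memory for large objects is allocated.
For small objects we started tracing from gc_heap::allocate; here we start from gc_heap::allocate_large_object.
Source of allocate_large_object: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
Unlike allocate, this function does not try to allocate from the thread's allocation context. It builds an empty, temporary allocation context and allocates directly from a heap segment.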
CObjectHeader* gc_heap::allocate_large_object (size_t jsize, int64_t& alloc_bytes) { // 建立一個空的分配上下文 //create a new alloc context because gen3context is shared. alloc_context acontext; acontext.alloc_ptr = 0; acontext.alloc_limit = 0; acontext.alloc_bytes = 0; #ifdef MULTIPLE_HEAPS acontext.alloc_heap = vm_heap; #endif //MULTIPLE_HEAPS #ifdef MARK_ARRAY uint8_t* current_lowest_address = lowest_address; uint8_t* current_highest_address = highest_address; #ifdef BACKGROUND_GC if (recursive_gc_sync::background_running_p()) { current_lowest_address = background_saved_lowest_address; current_highest_address = background_saved_highest_address; } #endif //BACKGROUND_GC #endif // MARK_ARRAY // 檢查對象大小是否超過了最大容許的對象大小 // 超過期分配失敗 size_t maxObjectSize = (INT32_MAX - 7 - Align(min_obj_size)); #ifdef BIT64 if (g_pConfig->GetGCAllowVeryLargeObjects()) { maxObjectSize = (INT64_MAX - 7 - Align(min_obj_size)); } #endif if (jsize >= maxObjectSize) { if (g_pConfig->IsGCBreakOnOOMEnabled()) { GCToOSInterface::DebugBreak(); } #ifndef FEATURE_REDHAWK ThrowOutOfMemoryDimensionsExceeded(); #else return 0; #endif } // 計算對齊 size_t size = AlignQword (jsize); int align_const = get_alignment_constant (FALSE); #ifdef FEATURE_LOH_COMPACTION size_t pad = Align (loh_padding_obj_size, align_const); #else size_t pad = 0; #endif //FEATURE_LOH_COMPACTION // 調用allocate_more_space函數 // 由於分配上下文是空的, 這裏咱們給分配上下文指定的空間就是這個大對象使用的空間 assert (size >= Align (min_obj_size, align_const)); #ifdef _MSC_VER #pragma inline_depth(0) #endif //_MSC_VER if (! allocate_more_space (&acontext, (size + pad), max_generation+1)) { return 0; } #ifdef _MSC_VER #pragma inline_depth(20) #endif //_MSC_VER #ifdef FEATURE_LOH_COMPACTION // The GC allocator made a free object already in this alloc context and // adjusted the alloc_ptr accordingly. 
#endif //FEATURE_LOH_COMPACTION // 對象分配到剛纔獲取到的空間的開始地址 uint8_t* result = acontext.alloc_ptr; // 空間大小應該等於對象大小 assert ((size_t)(acontext.alloc_limit - acontext.alloc_ptr) == size); // 返回結果 CObjectHeader* obj = (CObjectHeader*)result; #ifdef MARK_ARRAY if (recursive_gc_sync::background_running_p()) { // 若是對象不在掃描範圍中清掉標記的bit if ((result < current_highest_address) && (result >= current_lowest_address)) { dprintf (3, ("Clearing mark bit at address %Ix", (size_t)(&mark_array [mark_word_of (result)]))); mark_array_clear_marked (result); } #ifdef BACKGROUND_GC //the object has to cover one full mark uint32_t assert (size > mark_word_size); if (current_c_gc_state == c_gc_state_marking) { dprintf (3, ("Concurrent allocation of a large object %Ix", (size_t)obj)); // 若是對象在掃描範圍中則設置標記bit防止它被回收 //mark the new block specially so we know it is a new object if ((result < current_highest_address) && (result >= current_lowest_address)) { dprintf (3, ("Setting mark bit at address %Ix", (size_t)(&mark_array [mark_word_of (result)]))); mark_array_set_marked (result); } } #endif //BACKGROUND_GC } #endif //MARK_ARRAY assert (obj != 0); assert ((size_t)obj == Align ((size_t)obj, align_const)); alloc_bytes += acontext.alloc_bytes; return obj; }
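We have already looked at allocate_more_space earlier; scroll back if you have forgotten it.
It calls try_allocate_more_space.
When allocating a large object, try_allocate_more_space calls allocate_large.
Source of allocate_large: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
This function is structured like allocate_small, but the details it handles are different.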
BOOL gc_heap::allocate_large (int gen_number, size_t size, alloc_context* acontext, int align_const) { // 後臺GC運行時且不在計劃階段 // 原來是16次處理1次可是如今if被註釋了 #ifdef BACKGROUND_GC if (recursive_gc_sync::background_running_p() && (current_c_gc_state != c_gc_state_planning)) { background_loh_alloc_count++; //if ((background_loh_alloc_count % bgc_alloc_spin_count_loh) == 0) { // 若是合適在後臺GC完成前分配對象 if (bgc_loh_should_allocate()) { // 若是記錄的LOH(Large Object Heap)增加比較大則這個線程須要暫停一下, 先安排其餘線程工做 // 釋放MSL鎖並調用YieldThread, 若是switchCount參數(bgc_alloc_spin_loh)較大還有可能休眠1ms if (!bgc_alloc_spin_loh) { Thread* current_thread = GetThread(); add_saved_spinlock_info (me_release, mt_alloc_large); dprintf (SPINLOCK_LOG, ("[%d]spin Lmsl loh", heap_number)); leave_spin_lock (&more_space_lock); BOOL cooperative_mode = enable_preemptive (current_thread); GCToOSInterface::YieldThread (bgc_alloc_spin_loh); disable_preemptive (current_thread, cooperative_mode); enter_spin_lock (&more_space_lock); add_saved_spinlock_info (me_acquire, mt_alloc_large); dprintf (SPINLOCK_LOG, ("[%d]spin Emsl loh", heap_number)); } } // 不合適時等待後臺GC完成 else { wait_for_background (awr_loh_alloc_during_bgc); } } } #endif //BACKGROUND_GC gc_reason gr = reason_oos_loh; generation* gen = generation_of (gen_number); oom_reason oom_r = oom_no_failure; size_t current_full_compact_gc_count = 0; // No variable values should be "carried over" from one state to the other. // That's why there are local variable for each state allocation_state loh_alloc_state = a_state_start; #ifdef RECORD_LOH_STATE EEThreadId current_thread_id; current_thread_id.SetToCurrentThread(); #endif //RECORD_LOH_STATE // 開始循環切換狀態, 請關注loh_alloc_state // If we can get a new seg it means allocation will succeed. 
while (1) { dprintf (3, ("[h%d]loh state is %s", heap_number, allocation_state_str[loh_alloc_state])); #ifdef RECORD_LOH_STATE add_saved_loh_state (loh_alloc_state, current_thread_id); #endif //RECORD_LOH_STATE switch (loh_alloc_state) { // 成功或失敗時跳出循環 case a_state_can_allocate: case a_state_cant_allocate: { goto exit; } // 開始時切換狀態到a_state_try_fit case a_state_start: { loh_alloc_state = a_state_try_fit; break; } // 調用loh_try_fit函數 // 成功時切換狀態到a_state_can_allocate // 失敗時切換狀態到a_state_trigger_full_compact_gc或a_state_acquire_seg case a_state_try_fit: { BOOL commit_failed_p = FALSE; BOOL can_use_existing_p = FALSE; can_use_existing_p = loh_try_fit (gen_number, size, acontext, align_const, &commit_failed_p, &oom_r); loh_alloc_state = (can_use_existing_p ? a_state_can_allocate : (commit_failed_p ? a_state_trigger_full_compact_gc : a_state_acquire_seg)); assert ((loh_alloc_state == a_state_can_allocate) == (acontext->alloc_ptr != 0)); break; } // 在建立了一個新的堆段之後調用loh_try_fit函數 // 成功時切換狀態到a_state_can_allocate // 失敗時切換狀態到a_state_try_fit case a_state_try_fit_new_seg: { BOOL commit_failed_p = FALSE; BOOL can_use_existing_p = FALSE; can_use_existing_p = loh_try_fit (gen_number, size, acontext, align_const, &commit_failed_p, &oom_r); // 即便咱們建立了一個新的堆段也不表明分配必定會成功,例如被其餘線程搶走了,若是這樣咱們須要重試 // Even after we got a new seg it doesn't necessarily mean we can allocate, // another LOH allocating thread could have beat us to acquire the msl so // we need to try again. loh_alloc_state = (can_use_existing_p ? 
a_state_can_allocate : a_state_try_fit); assert ((loh_alloc_state == a_state_can_allocate) == (acontext->alloc_ptr != 0)); break; } // 在壓縮GC後建立一個新的堆段成功, 調用loh_try_fit函數在這個堆段上分配 // 成功時切換狀態到a_state_can_allocate // 失敗時若是提交到物理內存失敗(物理內存不足)則切換狀態到a_state_cant_allocate // 不然再嘗試一次建立一個新的堆段 case a_state_try_fit_new_seg_after_cg: { BOOL commit_failed_p = FALSE; BOOL can_use_existing_p = FALSE; can_use_existing_p = loh_try_fit (gen_number, size, acontext, align_const, &commit_failed_p, &oom_r); // Even after we got a new seg it doesn't necessarily mean we can allocate, // another LOH allocating thread could have beat us to acquire the msl so // we need to try again. However, if we failed to commit, which means we // did have space on the seg, we bail right away 'cause we already did a // full compacting GC. loh_alloc_state = (can_use_existing_p ? a_state_can_allocate : (commit_failed_p ? a_state_cant_allocate : a_state_acquire_seg_after_cg)); assert ((loh_alloc_state == a_state_can_allocate) == (acontext->alloc_ptr != 0)); break; } // 這個狀態目前不會被其餘狀態切換到 // 簡單的調用loh_try_fit函數成功則切換到a_state_can_allocate失敗則切換到a_state_cant_allocate case a_state_try_fit_no_seg: { BOOL commit_failed_p = FALSE; BOOL can_use_existing_p = FALSE; can_use_existing_p = loh_try_fit (gen_number, size, acontext, align_const, &commit_failed_p, &oom_r); loh_alloc_state = (can_use_existing_p ? a_state_can_allocate : a_state_cant_allocate); assert ((loh_alloc_state == a_state_can_allocate) == (acontext->alloc_ptr != 0)); assert ((loh_alloc_state != a_state_cant_allocate) || (oom_r != oom_no_failure)); break; } // 壓縮GC完成後調用loh_try_fit函數 // 成功時切換狀態到a_state_can_allocate // 若是壓縮後仍分配失敗, 而且提交內存到物理內存失敗(物理內存不足)則切換狀態到a_state_cant_allocate // 若是壓縮後仍分配失敗, 可是提交內存到物理內存並沒有失敗則嘗試再次建立一個新的堆段 case a_state_try_fit_after_cg: { BOOL commit_failed_p = FALSE; BOOL can_use_existing_p = FALSE; can_use_existing_p = loh_try_fit (gen_number, size, acontext, align_const, &commit_failed_p, &oom_r); loh_alloc_state = (can_use_existing_p ? 
a_state_can_allocate : (commit_failed_p ? a_state_cant_allocate : a_state_acquire_seg_after_cg)); assert ((loh_alloc_state == a_state_can_allocate) == (acontext->alloc_ptr != 0)); break; } // 在後臺GC完成後調用loh_try_fit函數 // 成功時切換狀態到a_state_can_allocate // 若是提交內存到物理內存失敗(物理內存不足)則切換狀態到a_state_trigger_full_compact_gc // 若是提交內存到物理內存並沒有失敗則嘗試建立一個新的堆段 case a_state_try_fit_after_bgc: { BOOL commit_failed_p = FALSE; BOOL can_use_existing_p = FALSE; can_use_existing_p = loh_try_fit (gen_number, size, acontext, align_const, &commit_failed_p, &oom_r); loh_alloc_state = (can_use_existing_p ? a_state_can_allocate : (commit_failed_p ? a_state_trigger_full_compact_gc : a_state_acquire_seg_after_bgc)); assert ((loh_alloc_state == a_state_can_allocate) == (acontext->alloc_ptr != 0)); break; } // 嘗試建立一個新的堆段 // 成功時切換狀態到a_state_try_fit_new_seg // 失敗時若是已執行了壓縮則切換狀態到a_state_check_retry_seg, 不然切換狀態到a_state_check_and_wait_for_bgc case a_state_acquire_seg: { BOOL can_get_new_seg_p = FALSE; BOOL did_full_compacting_gc = FALSE; current_full_compact_gc_count = get_full_compact_gc_count(); can_get_new_seg_p = loh_get_new_seg (gen, size, align_const, &did_full_compacting_gc, &oom_r); loh_alloc_state = (can_get_new_seg_p ? a_state_try_fit_new_seg : (did_full_compacting_gc ? a_state_check_retry_seg : a_state_check_and_wait_for_bgc)); break; } // 嘗試在壓縮GC後建立一個新的堆段 // 成功時切換狀態到a_state_try_fit_new_seg_after_cg // 失敗時切換狀態到a_state_check_retry_seg case a_state_acquire_seg_after_cg: { BOOL can_get_new_seg_p = FALSE; BOOL did_full_compacting_gc = FALSE; current_full_compact_gc_count = get_full_compact_gc_count(); can_get_new_seg_p = loh_get_new_seg (gen, size, align_const, &did_full_compacting_gc, &oom_r); // Since we release the msl before we try to allocate a seg, other // threads could have allocated a bunch of segments before us so // we might need to retry. loh_alloc_state = (can_get_new_seg_p ? 
a_state_try_fit_new_seg_after_cg : a_state_check_retry_seg); break; } // 後臺GC完成後嘗試建立一個新的堆段 // 成功時切換狀態到a_state_try_fit_new_seg // 失敗時若是已執行了壓縮則切換狀態到a_state_check_retry_seg, 不然切換狀態到a_state_trigger_full_compact_gc case a_state_acquire_seg_after_bgc: { BOOL can_get_new_seg_p = FALSE; BOOL did_full_compacting_gc = FALSE; current_full_compact_gc_count = get_full_compact_gc_count(); can_get_new_seg_p = loh_get_new_seg (gen, size, align_const, &did_full_compacting_gc, &oom_r); loh_alloc_state = (can_get_new_seg_p ? a_state_try_fit_new_seg : (did_full_compacting_gc ? a_state_check_retry_seg : a_state_trigger_full_compact_gc)); assert ((loh_alloc_state != a_state_cant_allocate) || (oom_r != oom_no_failure)); break; } // 等待後臺GC完成 // 若是後臺GC不在運行狀態中則切換狀態到a_state_trigger_full_compact_gc // 若是執行了壓縮則切換狀態到a_state_try_fit_after_cg, 不然切換狀態到a_state_try_fit_after_bgc case a_state_check_and_wait_for_bgc: { BOOL bgc_in_progress_p = FALSE; BOOL did_full_compacting_gc = FALSE; if (fgn_maxgen_percent) { dprintf (2, ("FGN: failed to acquire seg, may need to do a full blocking GC")); send_full_gc_notification (max_generation, FALSE); } bgc_in_progress_p = check_and_wait_for_bgc (awr_loh_oos_bgc, &did_full_compacting_gc); loh_alloc_state = (!bgc_in_progress_p ? a_state_trigger_full_compact_gc : (did_full_compacting_gc ? a_state_try_fit_after_cg : a_state_try_fit_after_bgc)); break; } // 觸發第0和1和2代的壓縮GC // 成功時切換狀態到a_state_try_fit_after_cg, 失敗時切換狀態到a_state_cant_allocate case a_state_trigger_full_compact_gc: { BOOL got_full_compacting_gc = FALSE; got_full_compacting_gc = trigger_full_compact_gc (gr, &oom_r); loh_alloc_state = (got_full_compacting_gc ? 
a_state_try_fit_after_cg : a_state_cant_allocate); assert ((loh_alloc_state != a_state_cant_allocate) || (oom_r != oom_no_failure)); break; } // 檢查是否應該重試GC或申請新的堆段 // 應該重試GC時切換狀態到a_state_trigger_full_compact_gc // 應該重試申請新的堆段時切換狀態到a_state_acquire_seg_after_cg // 不然切換狀態到a_state_cant_allocate // 若是不能獲取一個新的堆段, 可是對原來的堆段執行了壓縮GC那就應該重試 case a_state_check_retry_seg: { BOOL should_retry_gc = retry_full_compact_gc (size); BOOL should_retry_get_seg = FALSE; if (!should_retry_gc) { size_t last_full_compact_gc_count = current_full_compact_gc_count; current_full_compact_gc_count = get_full_compact_gc_count(); if (current_full_compact_gc_count > (last_full_compact_gc_count + 1)) { should_retry_get_seg = TRUE; } } loh_alloc_state = (should_retry_gc ? a_state_trigger_full_compact_gc : (should_retry_get_seg ? a_state_acquire_seg_after_cg : a_state_cant_allocate)); assert ((loh_alloc_state != a_state_cant_allocate) || (oom_r != oom_no_failure)); break; } default: { assert (!"Invalid state!"); break; } } } exit: // 分配失敗時處理OOM(Out Of Memory) if (loh_alloc_state == a_state_cant_allocate) { assert (oom_r != oom_no_failure); handle_oom (heap_number, oom_r, size, 0, 0); add_saved_spinlock_info (me_release, mt_alloc_large_cant); dprintf (SPINLOCK_LOG, ("[%d]Lmsl for loh oom", heap_number)); leave_spin_lock (&more_space_lock); } return (loh_alloc_state == a_state_can_allocate); }
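The state machine above is easier to follow once reduced to its main path. The following is a heavily simplified sketch under stated assumptions: it keeps only six of the states, drops everything related to background GC and retries, and the Hooks callbacks (try_fit, get_new_seg, full_compact_gc) are hypothetical stand-ins for loh_try_fit, loh_get_new_seg and trigger_full_compact_gc.

```cpp
#include <cassert>
#include <functional>

// Reduced set of states, named after their a_state_* counterparts.
enum class State {
    Start, TryFit, AcquireSeg, TryFitNewSeg,
    TriggerFullCompactGC, TryFitAfterGC, CanAllocate, CantAllocate
};

// Hypothetical callbacks standing in for the real helper functions.
struct Hooks {
    std::function<bool()> try_fit;          // stands in for loh_try_fit
    std::function<bool()> get_new_seg;      // stands in for loh_get_new_seg
    std::function<bool()> full_compact_gc;  // stands in for trigger_full_compact_gc
};

bool allocate_large_sketch(const Hooks& h) {
    State s = State::Start;
    while (true) {
        switch (s) {
        case State::Start:
            s = State::TryFit;
            break;
        case State::TryFit:
            s = h.try_fit() ? State::CanAllocate : State::AcquireSeg;
            break;
        case State::AcquireSeg:
            s = h.get_new_seg() ? State::TryFitNewSeg : State::TriggerFullCompactGC;
            break;
        case State::TryFitNewSeg:
            // Another thread may have taken the new segment, so fall back to TryFit.
            s = h.try_fit() ? State::CanAllocate : State::TryFit;
            break;
        case State::TriggerFullCompactGC:
            s = h.full_compact_gc() ? State::TryFitAfterGC : State::CantAllocate;
            break;
        case State::TryFitAfterGC:
            // Simplification: the real code may still try to acquire a segment here.
            s = h.try_fit() ? State::CanAllocate : State::CantAllocate;
            break;
        case State::CanAllocate:
            return true;
        case State::CantAllocate:
            return false;
        }
    }
}
```

Each case only decides the next state and carries no data over, which mirrors the source comment "No variable values should be carried over from one state to the other".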
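Source of loh_try_fit: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
It works much like soh_try_fit: it first calls a_fit_free_list_large_p to try allocating from the free list, and if that fails it calls loh_a_fit_segment_end_p to try allocating at the end of a heap segment.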
BOOL gc_heap::loh_try_fit (int gen_number,
                           size_t size,
                           alloc_context* acontext,
                           int align_const,
                           BOOL* commit_failed_p,
                           oom_reason* oom_r)
{
    BOOL can_allocate = TRUE;

    // Try to allocate from the free list first
    if (!a_fit_free_list_large_p (size, acontext, align_const))
    {
        // Otherwise try to allocate at the end of a heap segment
        can_allocate = loh_a_fit_segment_end_p (gen_number, size,
                                                acontext, align_const,
                                                commit_failed_p, oom_r);

        // While a background GC is running, record the size allocated at segment ends
#ifdef BACKGROUND_GC
        if (can_allocate && recursive_gc_sync::background_running_p())
        {
            bgc_loh_size_increased += size;
        }
#endif //BACKGROUND_GC
    }
#ifdef BACKGROUND_GC
    else
    {
        // While a background GC is running, record the size allocated from the free list
        if (recursive_gc_sync::background_running_p())
        {
            bgc_loh_allocated_in_free += size;
        }
    }
#endif //BACKGROUND_GC

    return can_allocate;
}
Source of a_fit_free_list_large_p: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
It is largely the same as a_fit_free_list_p, but when LOH compaction is supported it creates a padding object, and it may call bgc_loh_alloc_clr.
BOOL gc_heap::a_fit_free_list_large_p (size_t size, alloc_context* acontext, int align_const) { // 若是後臺GC在計劃階段, 等待計劃完成 #ifdef BACKGROUND_GC wait_for_background_planning (awr_loh_alloc_during_plan); #endif //BACKGROUND_GC // 獲取第3代的自由對象列表 BOOL can_fit = FALSE; int gen_number = max_generation + 1; generation* gen = generation_of (gen_number); allocator* loh_allocator = generation_allocator (gen); // 支持LOH壓縮時須要在大對象前塞一個填充對象 #ifdef FEATURE_LOH_COMPACTION size_t loh_pad = Align (loh_padding_obj_size, align_const); #endif //FEATURE_LOH_COMPACTION #ifdef BACKGROUND_GC int cookie = -1; #endif //BACKGROUND_GC // 列表會按大小分爲多個bucket(用鏈表形式連接) // 大小會*2遞增, 例如first_bucket的大小是256那第二個bucket的大小則爲512 size_t sz_list = loh_allocator->first_bucket_size(); for (unsigned int a_l_idx = 0; a_l_idx < loh_allocator->number_of_buckets(); a_l_idx++) { if ((size < sz_list) || (a_l_idx == (loh_allocator->number_of_buckets()-1))) { uint8_t* free_list = loh_allocator->alloc_list_head_of (a_l_idx); uint8_t* prev_free_item = 0; while (free_list != 0) { dprintf (3, ("considering free list %Ix", (size_t)free_list)); size_t free_list_size = unused_array_size(free_list); #ifdef FEATURE_LOH_COMPACTION if ((size + loh_pad) <= free_list_size) #else if (((size + Align (min_obj_size, align_const)) <= free_list_size)|| (size == free_list_size)) #endif //FEATURE_LOH_COMPACTION { // 若是啓用了後臺GC, 而且正在分配大對象, 須要檢測後臺GC是否正在標記對象 #ifdef BACKGROUND_GC cookie = bgc_alloc_lock->loh_alloc_set (free_list); #endif //BACKGROUND_GC // 大小足夠時從該bucket的鏈表中pop出來 //unlink the free_item loh_allocator->unlink_item (a_l_idx, free_list, prev_free_item, FALSE); // Substract min obj size because limit_from_size adds it. 
Not needed for LOH size_t limit = limit_from_size (size - Align(min_obj_size, align_const), free_list_size, gen_number, align_const); // 支持LOH壓縮時須要在大對象前塞一個填充對象 #ifdef FEATURE_LOH_COMPACTION make_unused_array (free_list, loh_pad); limit -= loh_pad; free_list += loh_pad; free_list_size -= loh_pad; #endif //FEATURE_LOH_COMPACTION // 若是分配完還有剩餘空間, 在剩餘空間生成一個自由對象並塞回自由對象列表 uint8_t* remain = (free_list + limit); size_t remain_size = (free_list_size - limit); if (remain_size != 0) { assert (remain_size >= Align (min_obj_size, align_const)); make_unused_array (remain, remain_size); } if (remain_size >= Align(min_free_list, align_const)) { loh_thread_gap_front (remain, remain_size, gen); assert (remain_size >= Align (min_obj_size, align_const)); } else { generation_free_obj_space (gen) += remain_size; } generation_free_list_space (gen) -= free_list_size; dprintf (3, ("found fit on loh at %Ix", free_list)); #ifdef BACKGROUND_GC if (cookie != -1) { // 若是後臺GC正在標記對象須要調用bgc_loh_alloc_clr給分配上下文設置新的範圍 bgc_loh_alloc_clr (free_list, limit, acontext, align_const, cookie, FALSE, 0); } else #endif //BACKGROUND_GC { // 給分配上下文設置新的範圍 adjust_limit_clr (free_list, limit, acontext, 0, align_const, gen_number); } //fix the limit to compensate for adjust_limit_clr making it too short acontext->alloc_limit += Align (min_obj_size, align_const); can_fit = TRUE; goto exit; } // 同一bucket的下一個自由對象 prev_free_item = free_list; free_list = free_list_slot (free_list); } } // 當前bucket的大小不夠, 下一個bucket的大小會是當前bucket的兩倍 sz_list = sz_list * 2; } exit: return can_fit; }
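The bucket walk in a_fit_free_list_large_p can be shown on its own. This is a sketch under simplifying assumptions: FreeItem, BucketedFreeList and take_fit are hypothetical names, each bucket is a std::list instead of an intrusive linked list of free objects, and the fit test is reduced to "the item is at least as large as the request" (the real code also accounts for the minimum object size or the LOH padding object).

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical free-list entry: address and size of a free block.
struct FreeItem {
    std::size_t addr;
    std::size_t size;
};

// Buckets whose size class doubles from one bucket to the next.
struct BucketedFreeList {
    std::size_t first_bucket_size = 256;
    std::vector<std::list<FreeItem>> buckets;
};

// Removes and returns the first free item that can hold `size` bytes.
bool take_fit(BucketedFreeList& fl, std::size_t size, FreeItem& out) {
    std::size_t sz_list = fl.first_bucket_size;
    for (std::size_t i = 0; i < fl.buckets.size(); ++i) {
        const bool last = (i + 1 == fl.buckets.size());
        // Only search buckets whose size class exceeds the request;
        // the last bucket has no upper bound, so it is always searched.
        if (size < sz_list || last) {
            auto& bucket = fl.buckets[i];
            for (auto it = bucket.begin(); it != bucket.end(); ++it) {
                if (it->size >= size) {
                    out = *it;
                    bucket.erase(it);  // unlink the item from its bucket
                    return true;
                }
            }
        }
        sz_list *= 2;  // the next bucket covers twice the size
    }
    return false;
}
```

Skipping the buckets whose size class is too small avoids scanning entries that can never fit, at the cost of having to scan every entry of the last, unbounded bucket.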
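We have already seen adjust_limit_clr when walking through the small-object code flow.
Here is the source of bgc_loh_alloc_clr: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
This function is used to allocate large objects while a background GC is running, so it has to take the running background GC into account.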
#ifdef BACKGROUND_GC void gc_heap::bgc_loh_alloc_clr (uint8_t* alloc_start, size_t size, alloc_context* acontext, int align_const, int lock_index, BOOL check_used_p, heap_segment* seg) { // 一開始就在這片空間建立一個自由對象 // 由於等會要釋放在bgc_alloc_lock中的鎖再清0內存因此要先建立一個自由對象防止GC使用這塊空間 // 這個自由對象在最後從新上鎖後會被重置回空白的空間 make_unused_array (alloc_start, size); #ifdef FEATURE_APPDOMAIN_RESOURCE_MONITORING if (g_fEnableARM) { AppDomain* alloc_appdomain = GetAppDomain(); alloc_appdomain->RecordAllocBytes (size, heap_number); } #endif //FEATURE_APPDOMAIN_RESOURCE_MONITORING size_t size_of_array_base = sizeof(ArrayBase); // 釋放cookie對應的鎖 (設置數組中lock_index位置的值爲0) bgc_alloc_lock->loh_alloc_done_with_index (lock_index); // 開始對內存進行清0 // 計算清0的的範圍 // clear memory while not holding the lock. size_t size_to_skip = size_of_array_base; size_t size_to_clear = size - size_to_skip - plug_skew; size_t saved_size_to_clear = size_to_clear; if (check_used_p) { uint8_t* end = alloc_start + size - plug_skew; uint8_t* used = heap_segment_used (seg); if (used < end) { if ((alloc_start + size_to_skip) < used) { size_to_clear = used - (alloc_start + size_to_skip); } else { size_to_clear = 0; } // 調整heap_segment::used的位置 dprintf (2, ("bgc loh: setting used to %Ix", end)); heap_segment_used (seg) = end; } dprintf (2, ("bgc loh: used: %Ix, alloc: %Ix, end of alloc: %Ix, clear %Id bytes", used, alloc_start, end, size_to_clear)); } else { dprintf (2, ("bgc loh: [%Ix-[%Ix(%Id)", alloc_start, alloc_start+size, size)); } #ifdef VERIFY_HEAP // since we filled in 0xcc for free object when we verify heap, // we need to make sure we clear those bytes. 
if (g_pConfig->GetHeapVerifyLevel() & EEConfig::HEAPVERIFY_GC) { if (size_to_clear < saved_size_to_clear) { size_to_clear = saved_size_to_clear; } } #endif //VERIFY_HEAP // 釋放MSL鎖並清0內存 dprintf (SPINLOCK_LOG, ("[%d]Lmsl to clear large obj", heap_number)); add_saved_spinlock_info (me_release, mt_clr_large_mem); leave_spin_lock (&more_space_lock); memclr (alloc_start + size_to_skip, size_to_clear); // 從新找一個鎖鎖上 // 這裏的鎖會在PublishObject時釋放 bgc_alloc_lock->loh_alloc_set (alloc_start); // 設置分配上下文指向的範圍 acontext->alloc_ptr = alloc_start; acontext->alloc_limit = (alloc_start + size - Align (min_obj_size, align_const)); // 把自由對象從新變回一塊空白的空間 // need to clear the rest of the object before we hand it out. clear_unused_array(alloc_start, size); } #endif //BACKGROUND_GC
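The check_used_p branch of bgc_loh_alloc_clr works out how many bytes really need to be zeroed: memory past heap_segment_used has never been touched, so it does not need clearing. A minimal sketch of that arithmetic follows; bytes_to_clear is a hypothetical helper, addresses are plain integers, and the values passed for the array header size and plug_skew in the usage are illustrative, not the real constants.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns how many bytes after the skipped header must be zeroed.
// `skip` stands in for sizeof(ArrayBase); `used` stands in for heap_segment_used(seg).
std::size_t bytes_to_clear(std::uintptr_t alloc_start, std::size_t size,
                           std::size_t skip, std::size_t plug_skew,
                           std::uintptr_t used) {
    assert(size >= skip + plug_skew);
    std::size_t to_clear = size - skip - plug_skew;
    const std::uintptr_t end = alloc_start + size - plug_skew;
    if (used < end) {
        if (alloc_start + skip < used) {
            // Only the part below `used` may contain stale data.
            to_clear = used - (alloc_start + skip);
        } else {
            // The whole range lies beyond `used`; nothing to clear.
            to_clear = 0;
        }
    }
    return to_clear;
}
```

For a 500-byte allocation at address 1000 with a 16-byte header and a plug_skew of 8, the function clears 476 bytes when the segment was used past the allocation, 184 bytes when it was used up to address 1200, and nothing when the allocation starts at the used mark.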
Source of loh_a_fit_segment_end_p: https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
This function walks the linked list of generation 3 heap segments and calls a_fit_segment_end_p on each one to try to allocate.
BOOL gc_heap::loh_a_fit_segment_end_p (int gen_number,
                                       size_t size,
                                       alloc_context* acontext,
                                       int align_const,
                                       BOOL* commit_failed_p,
                                       oom_reason* oom_r)
{
    *commit_failed_p = FALSE;
    // Get the first heap segment of the generation to allocate from
    heap_segment* seg = generation_allocation_segment (generation_of (gen_number));
    BOOL can_allocate_p = FALSE;

    while (seg)
    {
        // Call a_fit_segment_end_p to try allocating at the end of this segment
        if (a_fit_segment_end_p (gen_number, seg, (size - Align (min_obj_size, align_const)),
                                 acontext, align_const, commit_failed_p))
        {
            acontext->alloc_limit += Align (min_obj_size, align_const);
            can_allocate_p = TRUE;
            break;
        }
        else
        {
            if (*commit_failed_p)
            {
                // The segment still has space but it cannot be committed
                // to physical memory: report out of memory
                *oom_r = oom_cant_commit;
                break;
            }
            else
            {
                // The segment has no space left: try the next segment in the list
                seg = heap_segment_next_rw (seg);
            }
        }
    }

    return can_allocate_p;
}
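The segment walk distinguishes two kinds of failure: a segment that is simply full (move on to the next one) and a segment that has room but cannot be committed (stop and report out of memory). A small sketch of that control flow, assuming a hypothetical SegmentSketch type with a commit_ok flag in place of a real commit call:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical segment: free bytes at its end, and whether committing them would succeed.
struct SegmentSketch {
    std::size_t free_at_end;
    bool commit_ok;
    SegmentSketch* next;
};

enum class FitResult { Allocated, NoSpace, CommitFailed };

// Walks the list and allocates `size` bytes at the end of the first segment that fits.
FitResult fit_segment_end(SegmentSketch* seg, std::size_t size) {
    for (; seg != nullptr; seg = seg->next) {
        if (seg->free_at_end >= size) {
            if (!seg->commit_ok) {
                // Space exists but cannot be backed by physical memory: give up.
                return FitResult::CommitFailed;
            }
            seg->free_at_end -= size;
            return FitResult::Allocated;
        }
        // This segment is too full; try the next one.
    }
    return FitResult::NoSpace;
}
```

Running out of segments is not fatal in the real allocator, since allocate_large can still acquire a new segment, whereas a failed commit leads towards a full compacting GC.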
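Summary of the code flow for large-object allocation
By now we know that allocation contexts, small objects, and large objects all get their memory from heap segments. So where do heap segments get their memory?
The GC creates the default heap segments when the program starts; the call flow is init_gc_heap => get_initial_segment => make_heap_segment
If the default heap segments are not enough, new ones are created.
Heap segments for small objects are created via gc1 => plan_phase => soh_get_segment_to_expand => get_segment => make_heap_segment
Heap segments for large objects are created via allocate_large => loh_get_new_seg => get_large_segment => get_segment_for_loh => get_segment => make_heap_segment
The default heap segments get their memory from next_initial_memory; that memory is requested by reserve_initial_memory when the program starts.
Both reserve_initial_memory and make_heap_segment call virtual_alloc.
Because the call chain is long I won't paste the code of every function here; trace it yourself if you are interested.
The call flow of virtual_alloc is
virtual_alloc => GCToOSInterface::VirtualReserve => ClrVirtualAllocAligned => ClrVirtualAlloc => CExecutionEngine::ClrVirtualAlloc => EEVirtualAlloc => VirtualAlloc
On Windows, VirtualAlloc is the Windows API of the same name.
On Linux or macOS, the call flow is VirtualAlloc => VIRTUALReserveMemory => ReserveVirtualMemory
ReserveVirtualMemory calls mmap.
Source of ReserveVirtualMemory: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/pal/src/map/virtual.cpp#L894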
static LPVOID ReserveVirtualMemory(
                IN CPalThread *pthrCurrent, /* Currently executing thread */
                IN LPVOID lpAddress,        /* Region to reserve or commit */
                IN SIZE_T dwSize)           /* Size of Region */
{
    UINT_PTR StartBoundary = (UINT_PTR)lpAddress;
    SIZE_T MemSize = dwSize;

    TRACE( "Reserving the memory now.\n");

    // Most platforms will only commit memory if it is dirtied,
    // so this should not consume too much swap space.
    int mmapFlags = 0;

#if HAVE_VM_ALLOCATE
    // Allocate with vm_allocate first, then map at the fixed address.
    int result = vm_allocate(mach_task_self(),
                             &StartBoundary,
                             MemSize,
                             ((LPVOID) StartBoundary != nullptr) ? FALSE : TRUE);

    if (result != KERN_SUCCESS)
    {
        ERROR("vm_allocate failed to allocated the requested region!\n");
        pthrCurrent->SetLastError(ERROR_INVALID_ADDRESS);
        return nullptr;
    }

    mmapFlags |= MAP_FIXED;
#endif // HAVE_VM_ALLOCATE

    mmapFlags |= MAP_ANON | MAP_PRIVATE;

    LPVOID pRetVal = mmap((LPVOID) StartBoundary,
                          MemSize,
                          PROT_NONE,
                          mmapFlags,
                          -1 /* fd */,
                          0  /* offset */);

    if (pRetVal == MAP_FAILED)
    {
        ERROR( "Failed due to insufficient memory.\n" );

#if HAVE_VM_ALLOCATE
        vm_deallocate(mach_task_self(), StartBoundary, MemSize);
#endif // HAVE_VM_ALLOCATE

        pthrCurrent->SetLastError(ERROR_NOT_ENOUGH_MEMORY);
        return nullptr;
    }

    /* Check to see if the region is what we asked for. */
    if (lpAddress != nullptr && StartBoundary != (UINT_PTR)pRetVal)
    {
        ERROR("We did not get the region we asked for from mmap!\n");
        pthrCurrent->SetLastError(ERROR_INVALID_ADDRESS);
        munmap(pRetVal, MemSize);
        return nullptr;
    }

#if MMAP_ANON_IGNORES_PROTECTION
    if (mprotect(pRetVal, MemSize, PROT_NONE) != 0)
    {
        ERROR("mprotect failed to protect the region!\n");
        pthrCurrent->SetLastError(ERROR_INVALID_ADDRESS);
        munmap(pRetVal, MemSize);
        return nullptr;
    }
#endif // MMAP_ANON_IGNORES_PROTECTION

    return pRetVal;
}
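When CoreCLR requests memory from the operating system it uses VirtualAlloc, or a VirtualAlloc emulated with mmap.
The result is a block of virtual memory that has not yet been committed to physical memory (note that the protection is PROT_NONE, meaning the block cannot be read, written, or executed, so the kernel does not need to set up page table entries for it).
If you are interested, look at CoreCLR's virtual memory usage: workstation GC already takes more than 1 GB at startup, and server GC takes 20 GB.
CoreCLR then gradually commits the parts it actually uses to physical memory; the flow is
GCToOSInterface::VirtualCommit => ClrVirtualAlloc => CExecutionEngine::ClrVirtualAlloc => EEVirtualAlloc => VirtualAlloc
On Windows, VirtualAlloc is the Windows API of the same name; the address is specified explicitly and the page protection is read/write (PAGE_READWRITE).
On Linux or macOS, VirtualAlloc calls VIRTUALCommitMemory, which internally calls mprotect to make the pages readable and writable (PROT_READ|PROT_WRITE).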
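The reserve-then-commit pattern can be reproduced with plain POSIX calls. This is a minimal sketch for Linux or macOS, not the PAL code: reserve_region and commit_region are hypothetical helper names, and the bookkeeping the PAL keeps per region is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Reserves address space without making it accessible (PROT_NONE).
// Returns nullptr on failure.
void* reserve_region(std::size_t size) {
    void* p = mmap(nullptr, size, PROT_NONE, MAP_ANON | MAP_PRIVATE, -1, 0);
    return (p == MAP_FAILED) ? nullptr : p;
}

// "Commits" part of a reserved region by making it readable and writable.
// addr must be page-aligned; the kernel backs the pages lazily on first touch.
bool commit_region(void* addr, std::size_t size) {
    return mprotect(addr, size, PROT_READ | PROT_WRITE) == 0;
}

// Size of one page on the current system.
std::size_t page_size() {
    return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
}
```

A caller would reserve a large range once and then commit only the leading pages it needs; touching a page before committing it would raise SIGSEGV.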
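When the GC has reclaimed garbage objects and no longer needs part of the memory, it returns that memory to the system. For example, after collecting small objects the flow is
gc1 => decommit_ephemeral_segment_pages => decommit_heap_segment_pages => GCToOSInterface::VirtualDecommit
The call flow of GCToOSInterface::VirtualDecommit is
GCToOSInterface::VirtualDecommit => ClrVirtualFree => CExecutionEngine::ClrVirtualFree => EEVirtualFree => VirtualFree
On Windows, VirtualFree is the Windows API of the same name; it tells the kernel that this part of virtual memory is no longer in use and that its page table entries can be reset.
On Linux or macOS, VirtualFree with MEM_DECOMMIT is emulated by mapping the range again with mmap (MAP_FIXED | MAP_ANON | MAP_PRIVATE) and protection PROT_NONE; on platforms where MMAP_ANON_IGNORES_PROTECTION is set it additionally calls mprotect with PROT_NONE. MEM_RELEASE is implemented with munmap.
Source of VirtualFree: https://github.com/dotnet/coreclr/blob/release/1.1.0/src/pal/src/map/virtual.cpp#L1291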
BOOL PALAPI VirtualFree( IN LPVOID lpAddress, /* Address of region. */ IN SIZE_T dwSize, /* Size of region. */ IN DWORD dwFreeType ) /* Operation type. */ { BOOL bRetVal = TRUE; CPalThread *pthrCurrent; PERF_ENTRY(VirtualFree); ENTRY("VirtualFree(lpAddress=%p, dwSize=%u, dwFreeType=%#x)\n", lpAddress, dwSize, dwFreeType); pthrCurrent = InternalGetCurrentThread(); InternalEnterCriticalSection(pthrCurrent, &virtual_critsec); /* Sanity Checks. */ if ( !lpAddress ) { ERROR( "lpAddress cannot be NULL. You must specify the base address of\ regions to be de-committed. \n" ); pthrCurrent->SetLastError( ERROR_INVALID_ADDRESS ); bRetVal = FALSE; goto VirtualFreeExit; } if ( !( dwFreeType & MEM_RELEASE ) && !(dwFreeType & MEM_DECOMMIT ) ) { ERROR( "dwFreeType must contain one of the following: \ MEM_RELEASE or MEM_DECOMMIT\n" ); pthrCurrent->SetLastError( ERROR_INVALID_PARAMETER ); bRetVal = FALSE; goto VirtualFreeExit; } /* You cannot release and decommit in one call.*/ if ( dwFreeType & MEM_RELEASE && dwFreeType & MEM_DECOMMIT ) { ERROR( "MEM_RELEASE cannot be combined with MEM_DECOMMIT.\n" ); bRetVal = FALSE; goto VirtualFreeExit; } if ( dwFreeType & MEM_DECOMMIT ) { UINT_PTR StartBoundary = 0; SIZE_T MemSize = 0; if ( dwSize == 0 ) { ERROR( "dwSize cannot be 0. \n" ); pthrCurrent->SetLastError( ERROR_INVALID_PARAMETER ); bRetVal = FALSE; goto VirtualFreeExit; } /* * A two byte range straddling 2 pages caues both pages to be either * released or decommitted. So round the dwSize up to the next page * boundary and round the lpAddress down to the next page boundary. 
*/ MemSize = (((UINT_PTR)(dwSize) + ((UINT_PTR)(lpAddress) & VIRTUAL_PAGE_MASK) + VIRTUAL_PAGE_MASK) & ~VIRTUAL_PAGE_MASK); StartBoundary = (UINT_PTR)lpAddress & ~VIRTUAL_PAGE_MASK; PCMI pUnCommittedMem; pUnCommittedMem = VIRTUALFindRegionInformation( StartBoundary ); if (!pUnCommittedMem) { ASSERT( "Unable to locate the region information.\n" ); pthrCurrent->SetLastError( ERROR_INTERNAL_ERROR ); bRetVal = FALSE; goto VirtualFreeExit; } TRACE( "Un-committing the following page(s) %d to %d.\n", StartBoundary, MemSize ); // Explicitly calling mmap instead of mprotect here makes it // that much more clear to the operating system that we no // longer need these pages. if ( mmap( (LPVOID)StartBoundary, MemSize, PROT_NONE, MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0 ) != MAP_FAILED ) { #if (MMAP_ANON_IGNORES_PROTECTION) if (mprotect((LPVOID) StartBoundary, MemSize, PROT_NONE) != 0) { ASSERT("mprotect failed to protect the region!\n"); pthrCurrent->SetLastError(ERROR_INTERNAL_ERROR); munmap((LPVOID) StartBoundary, MemSize); bRetVal = FALSE; goto VirtualFreeExit; } #endif // MMAP_ANON_IGNORES_PROTECTION SIZE_T index = 0; SIZE_T nNumOfPagesToChange = 0; /* We can now commit this memory by calling VirtualAlloc().*/ index = (StartBoundary - pUnCommittedMem->startBoundary) / VIRTUAL_PAGE_SIZE; nNumOfPagesToChange = MemSize / VIRTUAL_PAGE_SIZE; VIRTUALSetAllocState( MEM_RESERVE, index, nNumOfPagesToChange, pUnCommittedMem ); goto VirtualFreeExit; } else { ASSERT( "mmap() returned an abnormal value.\n" ); bRetVal = FALSE; pthrCurrent->SetLastError( ERROR_INTERNAL_ERROR ); goto VirtualFreeExit; } } if ( dwFreeType & MEM_RELEASE ) { PCMI pMemoryToBeReleased = VIRTUALFindRegionInformation( (UINT_PTR)lpAddress ); if ( !pMemoryToBeReleased ) { ERROR( "lpAddress must be the base address returned by VirtualAlloc.\n" ); pthrCurrent->SetLastError( ERROR_INVALID_ADDRESS ); bRetVal = FALSE; goto VirtualFreeExit; } if ( dwSize != 0 ) { ERROR( "dwSize must be 0 if you are releasing the 
memory.\n" ); pthrCurrent->SetLastError( ERROR_INVALID_PARAMETER ); bRetVal = FALSE; goto VirtualFreeExit; } TRACE( "Releasing the following memory %d to %d.\n", pMemoryToBeReleased->startBoundary, pMemoryToBeReleased->memSize ); if ( munmap( (LPVOID)pMemoryToBeReleased->startBoundary, pMemoryToBeReleased->memSize ) == 0 ) { if ( VIRTUALReleaseMemory( pMemoryToBeReleased ) == FALSE ) { ASSERT( "Unable to remove the PCMI entry from the list.\n" ); pthrCurrent->SetLastError( ERROR_INTERNAL_ERROR ); bRetVal = FALSE; goto VirtualFreeExit; } pMemoryToBeReleased = NULL; } else { ASSERT( "Unable to unmap the memory, munmap() returned an abnormal value.\n" ); pthrCurrent->SetLastError( ERROR_INTERNAL_ERROR ); bRetVal = FALSE; goto VirtualFreeExit; } } VirtualFreeExit: LogVaOperation( (dwFreeType & MEM_DECOMMIT) ? VirtualMemoryLogging::VirtualOperation::Decommit : VirtualMemoryLogging::VirtualOperation::Release, lpAddress, dwSize, dwFreeType, 0, NULL, bRetVal); InternalLeaveCriticalSection(pthrCurrent, &virtual_critsec); LOGEXIT( "VirtualFree returning %s.\n", bRetVal == TRUE ? "TRUE" : "FALSE" ); PERF_EXIT(VirtualFree); return bRetVal; }
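Two details of the MEM_DECOMMIT path are worth isolating: the rounding of the range to page boundaries, and the use of mmap with MAP_FIXED to drop the pages. The sketch below assumes Linux or macOS; round_to_pages and decommit_region are hypothetical helpers that mirror the arithmetic shown above rather than the PAL's actual functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

struct PageRange {
    std::uintptr_t start;  // rounded down to a page boundary
    std::size_t size;      // rounded up to a whole number of pages
};

// Same arithmetic as the MemSize / StartBoundary computation in VirtualFree.
// page_size must be a power of two.
PageRange round_to_pages(std::uintptr_t addr, std::size_t size, std::size_t page_size) {
    const std::uintptr_t mask = page_size - 1;
    return PageRange{addr & ~mask, (size + (addr & mask) + mask) & ~mask};
}

// Drops the pages covering [addr, addr + size) by mapping fresh PROT_NONE
// anonymous memory over them at the same address.
bool decommit_region(void* addr, std::size_t size) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const PageRange r = round_to_pages(reinterpret_cast<std::uintptr_t>(addr), size, page);
    void* p = mmap(reinterpret_cast<void*>(r.start), r.size, PROT_NONE,
                   MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);
    return p != MAP_FAILED;
}
```

With 4 KB pages, a 2-byte range starting at 0x1FFF straddles two pages, so it rounds to start 0x1000 and size 0x2000. After a decommit the old contents are gone: committing the range again yields zero-filled pages.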
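As we can see, CoreCLR manages system memory at a fairly low level.
On Windows it uses VirtualAlloc and VirtualFree.
On Linux it uses mmap and mprotect,
rather than the traditional malloc and new.
This gives better performance, but it also raises the cost of porting to other platforms.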
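It is hard to study CoreCLR in depth by reading the code alone; for example gc.cpp, the source of most of this article, has nearly 37,000 lines of code.
To understand how CoreCLR works, this time I built CoreCLR myself and debugged it locally with lldb. Here I share the build and debugging process.
I used Ubuntu 16.04 LTS, because setting up a build environment on Linux is much simpler than on Windows.
Download CoreCLR:
git clone https://github.com/dotnet/coreclr.git
Switch to the version you are actually using; be sure to switch rather than building the master branch directly.
git checkout v1.1.0
Install the required packages by following Microsoft's instructions.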
# https://github.com/dotnet/coreclr/blob/master/Documentation/building/linux-instructions.md
echo "deb http://llvm.org/apt/trusty/ llvm-toolchain-trusty-3.6 main" | sudo tee /etc/apt/sources.list.d/llvm.list
wget -O - http://llvm.org/apt/llvm-snapshot.gpg.key | sudo apt-key add -
sudo apt-get update
sudo apt-get install cmake llvm-3.5 clang-3.5 lldb-3.6 lldb-3.6-dev libunwind8 libunwind8-dev gettext libicu-dev liblttng-ust-dev libcurl4-openssl-dev libssl-dev uuid-dev
cd coreclr
./build.sh
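Running build.sh downloads some files from Microsoft's site; if the downloads keep failing for a long time, consider using a proxy.
The build takes tens of minutes. When it finishes, the output is under coreclr/bin/Product/Linux.x64.Debug.
Next, create a new executable project with dotnet and add a runtimes section to project.json.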
{
    "runtimes": {
        "ubuntu.16.04-x64": {}
    }
}
Program.cs can contain anything; write code for whichever part you want to test. Mine allocates memory on many threads and then releases it.
using System;
using System.Threading;
using System.Collections.Generic;

namespace ConsoleApplication
{
    public class A
    {
        public int a;
        public byte[] padding;
    }

    public class Program
    {
        public static void ThreadBody()
        {
            Thread.Sleep(1000);
            var list = new List<A>();
            for (long x = 0; x < 1000000; ++x)
            {
                list.Add(new A());
            }
        }

        public static void Main(string[] args)
        {
            var threads = new List<Thread>();
            for (var x = 0; x < 100; ++x)
            {
                var thread = new Thread(ThreadBody);
                threads.Add(thread);
                thread.Start();
            }
            foreach (var thread in threads)
            {
                thread.Join();
            }
            GC.Collect();
            Console.WriteLine("memory released");
            Console.ReadLine();
        }
    }
}
Once the code is written, build and publish it:
dotnet restore
dotnet publish
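After publishing, the final output files appear under bin/Debug/netcoreapp1.1/ubuntu.16.04-x64/publish.
Copy all files from coreclr/bin/Product/Linux.x64.Debug (the CoreCLR build output from earlier) into the publish directory, overwriting the existing files.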
Microsoft's official debugging documentation is at https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/building/debugging-instructions.md
Start the process under lldb. My project is named coreapp, so the executable under publish is also named coreapp:
lldb-3.6 ./coreapp
Once lldb has started, you can type commands to debug. To interrupt (pause) the running program, press Ctrl+C.
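The commands in this screenshot:
b allocate_small
    Set a breakpoint on a function. The full name of allocate_small is SVR::gc_heap::allocate_small or WKS::gc_heap::allocate_small, but lldb accepts the short name and breaks on every function that matches.
r
    Run the program. Breakpoints that were still pending are actually inserted once their address can be resolved after the program starts.
bt
    Show the current call stack (backtrace), that is, which functions led to the function currently executing.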
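The commands in this screenshot:
n
    Step over: function calls are not entered. Use s to step into a call. The instruction-level equivalents are ni and si.
fr v
    Show the variables in the current stack frame, that is, the arguments and local variables.
p acontext->alloc_ptr
p *acontext
    Print the value of a global or local variable. This is the command you will use constantly while debugging; it can evaluate expressions as well as show variables.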
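The commands in this screenshot:
c
    Continue the interrupted process until it exits or hits the next breakpoint.
br del
    Delete all breakpoints set so far.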
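This screenshot shows the allocation context of the first thread in the thread list. The value 0x168 can be computed with p &((Thread*)nullptr)->m_Link (in other words, it is the offsetof of m_Link within Thread).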
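The pointer arithmetic behind that 0x168 can be sketched in a few lines of C++. The list stores pointers to a link field embedded inside each object, so subtracting the field's offset from the link address gives back the owning object. The types below (Link, FakeThread) and the function thread_from_link are hypothetical stand-ins, not the real Thread class, whose layout is much larger.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration; the real Thread class is far larger.
struct Link {
    Link* next;
};

struct FakeThread {
    long id;
    char padding[64];   // stands in for the fields that precede the link
    Link link;          // embedded list node, playing the role of Thread::m_Link
    int alloc_context;  // stands in for Thread::m_alloc_context
};

// Recover the owning object from a pointer to its embedded link by
// subtracting the link's offset, as the lldb expression does with 0x168.
FakeThread* thread_from_link(Link* link) {
    assert(link != nullptr);
    return reinterpret_cast<FakeThread*>(
        reinterpret_cast<char*>(link) - offsetof(FakeThread, link));
}
```

In the lldb session the offset is written as a literal because the expression is typed by hand; in code, offsetof keeps the subtraction correct if the layout changes.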
The commands in this screenshot:
me re -s4 -fx -c12 0x00007fff5c006f00
    Read memory starting at 0x00007fff5c006f00, in units of 4 bytes, formatted as hex, showing 12 units.
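To show what those three options mean, here is a small C++ sketch that formats a memory range the same way: a fixed unit size of 4 bytes, hex formatting, and a unit count. The function dump_words is a hypothetical helper, not part of lldb, and its output only resembles lldb's (lldb also prints the address at the start of each row).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Format `count` 4-byte units starting at `addr` as hex, similar in spirit
// to `me re -s4 -fx -c<count>`. Hypothetical helper, not part of lldb.
std::string dump_words(const void* addr, std::size_t count) {
    assert(addr != nullptr || count == 0);
    std::string out;
    const unsigned char* bytes = static_cast<const unsigned char*>(addr);
    for (std::size_t i = 0; i < count; ++i) {
        std::uint32_t word;
        // memcpy reads each unit in native byte order and avoids unaligned access.
        std::memcpy(&word, bytes + i * sizeof(word), sizeof(word));
        char buf[16];
        std::snprintf(buf, sizeof(buf), "0x%08x", static_cast<unsigned>(word));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

Switching to -s8, as in the command list at the end of this article, corresponds to reading std::uint64_t units instead, which is the convenient width for looking at pointers on x64.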
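lldb can debug not only CoreCLR's own code but also the user's managed program, which requires Microsoft's SOS plugin.
For details see Microsoft's official documentation: https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/building/debugging-instructions.md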
Finally, here are the lldb commands I used most often during this analysis.
To learn lldb, see the official Tutorial and "GDB and LLDB command examples".
plugin load libsosplugin.so
process launch -s
process handle -s false SIGUSR1 SIGUSR2
breakpoint set -n LoadLibraryExW
c
sos DumpHeap
bpmd coreapp.dll ConsoleApplication.Program.Main
p g_pGCHeap
p n_heaps
p g_heaps[0]
p *WKS::gc_heap::ephemeral_heap_segment
p g_heaps[0]->ephemeral_heap_segment
p s_pThreadStore->m_ThreadList
p &((Thread*)nullptr)->m_Link
p ((Thread*)((char*)s_pThreadStore->m_ThreadList.m_link.m_pNext-0x168))->m_alloc_context
p ((Thread*)((char*)s_pThreadStore->m_ThreadList.m_link.m_pNext->m_pNext-0x168))->m_alloc_context
me re -s4 -fx -c100 0x00007fff5c027fe0
p generation_table
p generation_table[0]
p generation_table[2].free_list_allocator
p generation_table[2].free_list_allocator.first_bucket.head
p (generation_table[2].free_list_allocator.buckets)->head
p (generation_table[2].free_list_allocator.buckets+1)->head
p *generation_table[2].free_list_allocator.buckets
wa s v generation_table[2].free_list_allocator.first_bucket.head
me re -s8 -fx -c3 0x00007fff5bfff018
https://github.com/dotnet/coreclr/blob/master/Documentation/botr/garbage-collection.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gcsvr.cpp
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gcwks.cpp
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gcimpl.h
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gcpriv.h
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/gc/gc.h#L162
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/vm/gchelpers.cpp#L931
https://raw.githubusercontent.com/dotnet/coreclr/release/1.1.0/src/gc/gc.cpp
https://github.com/dotnet/coreclr/blob/release/1.1.0/src/pal/src/map/virtual.cpp#L894
https://github.com/dotnet/coreclr/blob/master/Documentation/building/linux-instructions.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/building/debugging-instructions.md
https://docs.microsoft.com/en-us/dotnet/articles/core/tools/project-json
https://github.com/dotnet/coreclr/issues/8959
https://github.com/dotnet/coreclr/issues/8995
https://github.com/dotnet/coreclr/issues/9053
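Because the GC code is huge and sparsely commented, for this analysis I not only asked questions on the official GitHub repository but also had to resort to lldb to reach even an initial understanding.
In the next article I will explain the internal implementation of the GC's collector. It may take longer to write, so please be patient.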