Understanding Go Memory Allocation in Depth

The Go language has a built-in runtime. It abandons the traditional memory allocation approach in favor of self-management: the allocator was originally based on tcmalloc, although it has since diverged considerably. Self-management enables better memory usage patterns, such as memory pools and pre-allocation, avoiding the performance cost of frequent system calls.

Before digging into Go's allocator, let's look at the basic strategy behind most memory allocators; it will help us understand Go's design.

The basic strategy (a toy sketch follows the list):

  1. Request one large chunk of memory from the operating system at a time, to reduce system calls
  2. Pre-split the large chunk into small blocks of particular sizes, linked into free lists
  3. To allocate memory for an object, simply take a block from a free list of a suitable size
  4. When an object is destroyed, return its memory to the original free list for reuse
  5. If too much memory sits idle, try returning part of it to the operating system to lower the overall footprint
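
To make the strategy concrete, here is a toy sketch of such an allocator (illustrative only, not the runtime's code): one big chunk is carved into fixed-size blocks, allocation pops a block from a free list, and freeing pushes it back.

type block [64]byte // a fixed block size, chosen arbitrarily for this sketch

type pool struct {
    free []*block // free list; a slice stands in for a linked list here
}

func newPool(nblocks int) *pool {
    chunk := make([]block, nblocks) // one large allocation up front
    p := &pool{free: make([]*block, 0, nblocks)}
    for i := range chunk {
        p.free = append(p.free, &chunk[i]) // pre-split into fixed-size blocks
    }
    return p
}

func (p *pool) alloc() *block {
    if len(p.free) == 0 {
        return nil // a real allocator would request another chunk here
    }
    b := p.free[len(p.free)-1]
    p.free = p.free[:len(p.free)-1]
    return b
}

func (p *pool) release(b *block) {
    p.free = append(p.free, b) // return the block for reuse
}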

Below we analyze Go's memory allocation from the source code and see how it compares with this basic strategy.

Preparation

Before tracing the source, we need to get familiar with a few concepts and structs.

  • span: a large block of memory composed of multiple address-contiguous pages
  • object: a span is cut into small blocks of a particular size; each small block can store one object

Object categories

  • tiny objects: size < 16 bytes
  • normal objects: 16 bytes – 32 KB
  • large objects: size > 32 KB (the classification is sketched below)
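
In code, the classification that mallocgc (analyzed later) performs boils down to the following sketch, assuming the runtime's thresholds maxTinySize = 16 and maxSmallSize = 32768, and with noscan meaning the type contains no pointers:

func classify(size uintptr, noscan bool) string {
    switch {
    case noscan && size < 16: // only pointer-free objects qualify as tiny
        return "tiny"
    case size <= 32<<10: // up to 32 KB goes through the size-class tables
        return "normal"
    default: // everything larger is allocated straight from the heap
        return "large"
    }
}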

Size conversions
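
The runtime converts between object size, size class, and span page count through lookup tables (size_to_class8, size_to_class128, class_to_size, class_to_allocnpages). A minimal sketch of the table-driven indexing used by mallocgc below; the declarations are real in shape, but the generated table contents are elided:

// Abbreviated declarations; the real tables are generated by
// mksizeclasses.go and filled with the runtime's size classes (67 in this era).
var (
    size_to_class8   [1024/8 + 1]uint8           // sizes <= 1016 B, in steps of 8
    size_to_class128 [(32768-1024)/128 + 1]uint8 // sizes up to 32 KB, in steps of 128
    class_to_size    [67]uint16                  // class -> rounded-up object size
)

func sizeToClass(size uintptr) uint8 {
    if size <= 1024-8 {
        return size_to_class8[(size+8-1)/8] // round up to a multiple of 8
    }
    return size_to_class128[(size-1024+128-1)/128] // round up to a multiple of 128
}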

Structs

mheap

Represents all heap space held by the Go program. The runtime manages the heap through a single global mheap object, mheap_.

type mheap struct {
    lock      mutex
    free      [_MaxMHeapList]mSpanList // free spans with 127 or fewer pages
    freelarge mTreap                   // treap of free spans with more than 127 pages
    busy      [_MaxMHeapList]mSpanList // in-use spans with 127 or fewer pages
    busylarge mSpanList                // in-use spans with more than 127 pages

    // allspans is a slice of all mspans ever created. Each mspan
    // appears exactly once.
    // 全部建立過的mspan的slice
    allspans []*mspan // all spans out there

    // arenas is the heap arena map. It points to the metadata for
    // the heap for every arena frame of the entire usable virtual
    // address space.
    //
    // Use arenaIndex to compute indexes into this array.
    //
    // For regions of the address space that are not backed by the
    // Go heap, the arena map contains nil.
    //
    // Modifications are protected by mheap_.lock. Reads can be
    // performed without locking; however, a given entry can
    // transition from nil to non-nil at any time when the lock
    // isn't held. (Entries never transitions back to nil.)
    //
    // In general, this is a two-level mapping consisting of an L1
    // map and possibly many L2 maps. This saves space when there
    // are a huge number of arena frames. However, on many
    // platforms (even 64-bit), arenaL1Bits is 0, making this
    // effectively a single-level map. In this case, arenas[0]
    // will never be nil.
    // A set of heapArenas; each heapArena covers pagesPerArena pages of spans.
    // This mainly serves mheap's span management and the garbage collector;
    // heapArena is described below.
    arenas [1 << arenaL1Bits]*[1 << arenaL2Bits]*heapArena

    // heapArenaAlloc is pre-reserved space for allocating heapArena
    // objects. This is only used on 32-bit, where we pre-reserve
    // this space to avoid interleaving it with the heap itself.
    heapArenaAlloc linearAlloc

    // arenaHints is a list of addresses at which to attempt to
    // add more heap arenas. This is initially populated with a
    // set of general hint addresses, and grown with the bounds of
    // actual heap arena ranges.
    arenaHints *arenaHint

    // arena is a pre-reserved space for allocating heap arenas
    // (the actual arenas). This is only used on 32-bit.
    arena linearAlloc

    //_ uint32 // ensure 64-bit alignment of central

    // central free lists for small size classes.
    // the padding makes sure that the MCentrals are
    // spaced CacheLineSize bytes apart, so that each MCentral.lock
    // gets its own cache line.
    // central is indexed by spanClass.
    // mcentral is the central allocation hub: when an mcache does not have
    // enough memory to allocate from, it refills from mcentral.
    central [numSpanClasses]struct {
        mcentral mcentral
        pad      [sys.CacheLineSize - unsafe.Sizeof(mcentral{})%sys.CacheLineSize]byte
    }

    spanalloc             fixalloc // allocator for span*
    cachealloc            fixalloc // allocator for mcache*
    treapalloc            fixalloc // allocator for treapNodes* used by large objects
    specialfinalizeralloc fixalloc // allocator for specialfinalizer*
    specialprofilealloc   fixalloc // allocator for specialprofile*
    speciallock           mutex    // lock for special record allocators.
    arenaHintAlloc        fixalloc // allocator for arenaHints

    unused *specialfinalizer // never set, just here to force the specialfinalizer type into DWARF
}

mSpanList

A linked list of mspans; the spans on free, busy, and busylarge are all chained together through this list.

type mSpanList struct {
    first *mspan // first span in list, or nil if none
    last  *mspan // last span in list, or nil if none
}
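
For intuition, a simplified sketch of how an operation like insertBack (used for busy and busylarge above) maintains first and last; the runtime's version additionally sanity-checks and records span.list:

func (list *mSpanList) insertBack(span *mspan) {
    span.prev = list.last
    if list.last != nil {
        list.last.next = span // non-empty list: hang span after the old tail
    } else {
        list.first = span // empty list: span becomes the head as well
    }
    list.last = span
}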

mspan

The basic unit of memory management in Go: a large block of memory consisting of contiguous 8 KB pages. Note that these pages are not the operating system's pages; a runtime page is typically a multiple of the OS page size. In one sentence: an mspan is a doubly-linked-list node carrying a start address, a size class, a page count, and related bookkeeping.

type mspan struct {
    next *mspan     // next span in list, or nil if none
    prev *mspan     // previous span in list, or nil if none
    list *mSpanList // For debugging. TODO: Remove.

    startAddr uintptr // address of first byte of span aka s.base()
    npages    uintptr // number of pages in span

    manualFreeList gclinkptr // list of free objects in _MSpanManual spans

    // freeindex is the slot index between 0 and nelems at which to begin scanning
    // for the next free object in this span.
    // Each allocation scans allocBits starting at freeindex until it encounters a 0
    // indicating a free object. freeindex is then adjusted so that subsequent scans begin
    // just past the newly discovered free object.
    //
    // If freeindex == nelem, this span has no free objects.
    //
    // allocBits is a bitmap of objects in this span.
    // If n >= freeindex and allocBits[n/8] & (1<<(n%8)) is 0
    // then object n is free;
    // otherwise, object n is allocated. Bits starting at nelem are
    // undefined and should never be referenced.
    //
    // Object n starts at address n*elemsize + (start << pageShift).
    // locates the next free object; ranges over 0..nelems
    freeindex uintptr
    // TODO: Look up nelems from sizeclass and remove this field if it
    // helps performance.
    nelems uintptr // number of object in the span.

    // Cache of the allocBits at freeindex. allocCache is shifted
    // such that the lowest bit corresponds to the bit freeindex.
    // allocCache holds the complement of allocBits, thus allowing
    // ctz (count trailing zero) to use it directly.
    // allocCache may contain bits beyond s.nelems; the caller must ignore
    // these.
    // caches the bitmap starting at freeindex, with the bits inverted
    // relative to allocBits, so that ctz can directly compute the index
    // of the next free object
    allocCache uint64

    // allocation bitmap: each bit records whether the corresponding slot is allocated
    allocBits  *gcBits

    allocCount  uint16     // number of allocated objects

    elemsize    uintptr    // computed from sizeclass or from npages

}
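
The comments above encode two relationships worth spelling out. A small illustration, treating allocBits as a plain byte slice instead of the runtime's gcBits:

// objectFree reports whether object n is free; valid for n in [freeindex, nelems).
func objectFree(allocBits []byte, freeindex, n uintptr) bool {
    if n < freeindex {
        return false // slots below freeindex are known to be allocated
    }
    return allocBits[n/8]&(1<<(n%8)) == 0 // a 0 bit means the slot is free
}

// objectAddr computes where object n starts: s.base() + n*s.elemsize.
func objectAddr(base, elemsize, n uintptr) uintptr {
    return base + n*elemsize
}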

spanClass

The class ID into the size-class table; it corresponds to the size classes described above.

type spanClass uint8

mTreap

A tree structure containing mspans, used mainly by freelarge: when looking up a large span of a given page count, searching the tree is faster than walking a linked list.

type mTreap struct {
    treap *treapNode
}

treapNode

A node of the mTreap; each node carries an mspan plus its left and right children and related keys.

type treapNode struct {
    right     *treapNode // all treapNodes > this treap node
    left      *treapNode // all treapNodes < this treap node
    parent    *treapNode // direct parent of this node, nil if root
    npagesKey uintptr    // number of pages in spanKey, used as primary sort key
    spanKey   *mspan     // span of size npagesKey, used as secondary sort key
    priority  uint32     // random number used by treap algorithm to keep tree probabilistically balanced
}

heapArena

heapArena stores an arena's metadata. The heap's arenas field consists of heapArenas, and all allocated memory lives inside the arenas: roughly, arenas[L1][L2] = heapArena. Given the address of an allocation, arenaIndex computes L1 and L2, which locate that address's arenas[L1][L2], i.e. its heapArena.

type heapArena struct {
    // bitmap stores the pointer/scalar bitmap for the words in
    // this arena. See mbitmap.go for a description. Use the
    // heapBits type to access this.
    bitmap [heapArenaBitmapBytes]byte

    // spans maps from virtual address page ID within this arena to *mspan.
    // For allocated spans, their pages map to the span itself.
    // For free spans, only the lowest and highest pages map to the span itself.
    // Internal pages map to an arbitrary span.
    // For pages that have never been allocated, spans entries are nil.
    //
    // Modifications are protected by mheap.lock. Reads can be
    // performed without locking, but ONLY from indexes that are
    // known to contain in-use or stack spans. This means there
    // must not be a safe-point between establishing that an
    // address is live and looking it up in the spans array.
    spans [pagesPerArena]*mspan
}
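
A sketch of the address-to-metadata mapping just described, ignoring arenaBaseOffset (which is non-zero on some platforms). On 64-bit Linux heapArenaBytes is 64 MB and arenaL1Bits is 0, so l1 is always 0 there:

func arenaL1L2(p uintptr) (l1, l2 uintptr) {
    ai := p / heapArenaBytes       // the arenaIndex of address p
    l1 = ai >> arenaL2Bits         // index into the L1 map
    l2 = ai & (1<<arenaL2Bits - 1) // index into the L2 map
    return
}

// The metadata for p is then mheap_.arenas[l1][l2], and the containing span
// is found in its spans array via the page index within the arena.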

arenaHint

Records the candidate addresses at which to try growing new heap arenas.

type arenaHint struct {
    addr uintptr
    // if down is true, the arena grows downward (toward lower addresses) from addr
    down bool
    next *arenaHint
}

mcentral

mcentral is a global resource serving multiple threads: when a thread runs short of memory it requests spans from mcentral, and when a thread frees memory it is reclaimed into mcentral.

type mcentral struct {
    lock      mutex
    spanclass spanClass
    nonempty  mSpanList // list of spans with a free object, ie a nonempty free list
    empty     mSpanList // list of spans with no free objects (or cached in an mcache)

    // nmalloc is the cumulative count of objects allocated from
    // this mcentral, assuming all spans in mcaches are
    // fully-allocated. Written atomically, read under STW.
    nmalloc uint64
}

Structure diagram

Next, let's use a high-level diagram to relate the structs above and preview the allocation flow. After working through the rest of this section, it is worth coming back to this figure; Go's memory allocation should then look much clearer.

Initialization

func mallocinit() {
    // Initialize the heap.
    mheap_.init()
    _g_ := getg()
    // fetch the mcache of the m that the current g runs on, and initialize it
    _g_.m.mcache = allocmcache()

    for i := 0x7f; i >= 0; i-- {
        var p uintptr
        switch {
        case GOARCH == "arm64" && GOOS == "darwin":
            p = uintptr(i)<<40 | uintptrMask&(0x0013<<28)
        case GOARCH == "arm64":
            p = uintptr(i)<<40 | uintptrMask&(0x0040<<32)
        case raceenabled:
            // The TSAN runtime requires the heap
            // to be in the range [0x00c000000000,
            // 0x00e000000000).
            p = uintptr(i)<<32 | uintptrMask&(0x00c0<<32)
            if p >= uintptrMask&0x00e000000000 {
                continue
            }
        default:
            p = uintptr(i)<<40 | uintptrMask&(0x00c0<<32)
        }
        // record the hint address for growing the arena
        hint := (*arenaHint)(mheap_.arenaHintAlloc.alloc())
        hint.addr = p
        hint.next, mheap_.arenaHints = mheap_.arenaHints, hint
    }
}

mheap.init

func (h *mheap) init() {
    h.treapalloc.init(unsafe.Sizeof(treapNode{}), nil, nil, &memstats.other_sys)
    h.spanalloc.init(unsafe.Sizeof(mspan{}), recordspan, unsafe.Pointer(h), &memstats.mspan_sys)
    h.cachealloc.init(unsafe.Sizeof(mcache{}), nil, nil, &memstats.mcache_sys)
    h.specialfinalizeralloc.init(unsafe.Sizeof(specialfinalizer{}), nil, nil, &memstats.other_sys)
    h.specialprofilealloc.init(unsafe.Sizeof(specialprofile{}), nil, nil, &memstats.other_sys)
    h.arenaHintAlloc.init(unsafe.Sizeof(arenaHint{}), nil, nil, &memstats.other_sys)

    // Don't zero mspan allocations. Background sweeping can
    // inspect a span concurrently with allocating it, so it's
    // important that the span's sweepgen survive across freeing
    // and re-allocating a span to prevent background sweeping
    // from improperly cas'ing it from 0.
    //
    // This is safe because mspan contains no heap pointers.
    h.spanalloc.zero = false

    // h->mapcache needs no init
    for i := range h.free {
        h.free[i].init()
        h.busy[i].init()
    }

    h.busylarge.init()
    for i := range h.central {
        h.central[i].mcentral.init(spanClass(i))
    }
}

mcentral.init

Initializes the mcentral for a single span class.

// Initialize a single central free list.
func (c *mcentral) init(spc spanClass) {
    c.spanclass = spc
    c.nonempty.init()
    c.empty.init()
}

allocmcache

Initialization of the mcache.

func allocmcache() *mcache {
    lock(&mheap_.lock)
    c := (*mcache)(mheap_.cachealloc.alloc())
    unlock(&mheap_.lock)
    for i := range c.alloc {
        c.alloc[i] = &emptymspan
    }
    c.next_sample = nextSample()
    return c
}

fixalloc.alloc

fixalloc is a fixed-size allocator. It is mainly used to allocate the wrapper structures of memory management, such as mspan and mcache, while the memory those structures describe is obtained from the other allocators. The main idea: allocate one large chunk up front, hand out one small fixed-size block per request, and push freed blocks onto a free list; because the size never changes, no fragmentation occurs.

func (f *fixalloc) alloc() unsafe.Pointer {
    if f.size == 0 {
        print("runtime: use of FixAlloc_Alloc before FixAlloc_Init\n")
        throw("runtime: internal error")
    }
    
    // if the free list is not empty, take a block directly from it
    if f.list != nil {
        v := unsafe.Pointer(f.list)
        f.list = f.list.next
        f.inuse += f.size
        if f.zero {
            memclrNoHeapPointers(v, f.size)
        }
        return v
    }
    // if the remaining chunk is too small, grab a new chunk from the system (persistentalloc)
    if uintptr(f.nchunk) < f.size {
        f.chunk = uintptr(persistentalloc(_FixAllocChunk, 0, f.stat))
        f.nchunk = _FixAllocChunk
    }
    // carve one fixed-size block off the chunk; when freed, it will return to the list
    v := unsafe.Pointer(f.chunk)
    if f.first != nil {
        f.first(f.arg, v)
    }
    f.chunk = f.chunk + f.size
    f.nchunk -= uint32(f.size)
    f.inuse += f.size
    return v
}
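
The release path mentioned above (freed blocks return to f.list) is the mirror image of alloc; it looks essentially like this sketch of fixalloc.free from the same runtime era:

func (f *fixalloc) free(p unsafe.Pointer) {
    f.inuse -= f.size
    v := (*mlink)(p)
    v.next = f.list // push the freed block onto the free list
    f.list = v
}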

The initialization work is simple:

  1. Initialize the heap: the per-size free and busy span lists and the busylarge list
  2. Initialize the mcentral for each span class
  3. Initialize the mcache, pointing each of its per-class slots at the empty span
  4. Initialize arenaHints, populating it with a set of candidate addresses, to be grown later along the bounds of the actual heap arenas

Allocation

newobject

func newobject(typ *_type) unsafe.Pointer {
    return mallocgc(typ.size, typ, true)
}
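
newobject is the entry point for ordinary user-level heap allocations; new(T) and escaping composite literals are lowered by the compiler to calls to it. For example:

type point struct{ x, y int }

func alloc() *point {
    return new(point) // compiled to a runtime.newobject call with point's type descriptor
}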

mallocgc

func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
    
    // Set mp.mallocing to keep from being preempted by GC.
    mp := acquirem()
    if mp.mallocing != 0 {
        throw("malloc deadlock")
    }
    if mp.gsignal == getg() {
        throw("malloc during signal")
    }
    mp.mallocing = 1

    shouldhelpgc := false
    dataSize := size
    // get the mcache of the current thread
    c := gomcache()
    var x unsafe.Pointer
    
    // noscan: the type is nil or contains no pointers
    noscan := typ == nil || typ.kind&kindNoPointers != 0
    if size <= maxSmallSize {
        if noscan && size < maxTinySize {
            // tiny-object allocation starts here
            
            off := c.tinyoffset
            // Align tiny pointer for required (conservative) alignment.
            if size&7 == 0 {
                off = round(off, 8)
            } else if size&3 == 0 {
                off = round(off, 4)
            } else if size&1 == 0 {
                off = round(off, 2)
            }
            // if the tiny block bound to the current mcache has enough room, allocate from it and return
            if off+size <= maxTinySize && c.tiny != 0 {
                // The object fits into existing tiny block.
                x = unsafe.Pointer(c.tiny + off)
                c.tinyoffset = off + size
                c.local_tinyallocs++
                mp.mallocing = 0
                releasem(mp)
                return x
            }
            // Allocate a new maxTinySize block.
            // the current mcache's tiny block is too small; allocate a new tiny block
            span := c.alloc[tinySpanClass]
            
            // try to get a slot from the span's allocCache; returns 0 if none is found
            v := nextFreeFast(span)
            if v == 0 {
                // allocCache had no free slot: nextFree fetches a new span of this class from mcentral, swaps out the exhausted span, and allocates from the new one; nextFree is analyzed below
                v, _, shouldhelpgc = c.nextFree(tinySpanClass)
            }
            x = unsafe.Pointer(v)
            (*[2]uint64)(x)[0] = 0
            (*[2]uint64)(x)[1] = 0
            // See if we need to replace the existing tiny block with the new one
            // based on amount of remaining free space.
            if size < c.tinyoffset || c.tiny == 0 {
                c.tiny = uintptr(x)
                c.tinyoffset = size
            }
            size = maxTinySize
        } else {
            // normal-object allocation starts here
            
            // first look up the tables to determine the size class
            var sizeclass uint8
            if size <= smallSizeMax-8 {
                sizeclass = size_to_class8[(size+smallSizeDiv-1)/smallSizeDiv]
            } else {
                sizeclass = size_to_class128[(size-smallSizeMax+largeSizeDiv-1)/largeSizeDiv]
            }
            size = uintptr(class_to_size[sizeclass])
            spc := makeSpanClass(sizeclass, noscan)
            // find the span for this size class
            span := c.alloc[spc]
            // as with tiny objects, try the allocCache first; 0 means no slot
            v := nextFreeFast(span)
            if v == 0 {
                v, span, shouldhelpgc = c.nextFree(spc)
            }
            x = unsafe.Pointer(v)
            if needzero && span.needzero != 0 {
                memclrNoHeapPointers(unsafe.Pointer(v), size)
            }
        }
    } else {
        // large-object allocation starts here
        // unlike tiny and normal objects, large objects are allocated directly from the mheap
        var s *mspan
        shouldhelpgc = true
        systemstack(func() {
            s = largeAlloc(size, needzero, noscan)
        })
        s.freeindex = 1
        s.allocCount = 1
        x = unsafe.Pointer(s.base())
        size = s.elemsize
    }
    
    // bitmap marking ...
    // check the trigger condition and start garbage collection if needed ...

    return x
}

The basic flow of this code (a quick example of the three paths follows the list):

  1. First decide whether the object is large, normal, or tiny
  2. For a tiny object:

    1. Find the mspan of the matching size class in the mcache's alloc array
    2. If the current mspan has enough room, allocate and update the mspan's bookkeeping (done in nextFreeFast)
    3. If the current mspan does not have enough room, fetch a new mspan of the matching class from mcentral, swap it in, then allocate and update its bookkeeping
  3. For a normal object, the logic is much the same as for a tiny object:

    1. First look up the tables to determine the object's size class and find the corresponding mspan
    2. If the current mspan has enough room, allocate and update the mspan's bookkeeping (done in nextFreeFast)
    3. If the current mspan does not have enough room, fetch a new mspan of the matching class from mcentral, swap it in, then allocate and update its bookkeeping
  4. For a large object, allocate directly from the mheap; this relies on largeAlloc, which we trace next
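
As a quick, illustrative example of the three paths (assuming the thresholds above, and that the slices escape to the heap):

b := make([]byte, 8)      // 8 B, pointer-free -> tiny path
s := make([]int64, 100)   // 800 B             -> normal path, via the size-class tables
l := make([]byte, 64<<10) // 64 KB > 32 KB     -> large path, straight from mheap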

largeAlloc

func largeAlloc(size uintptr, needzero bool, noscan bool) *mspan {
    // print("largeAlloc size=", size, "\n")
    
    // overflow check
    if size+_PageSize < size {
        throw("out of memory")
    }
  
    // compute the number of pages the object needs
    npages := size >> _PageShift
    if size&_PageMask != 0 {
        npages++
    }

    // Deduct credit for this span allocation and sweep if
    // necessary. mHeap_Alloc will also sweep npages, so this only
    // pays the debt down to npage pages.
    deductSweepCredit(npages*_PageSize, npages)
    
    // the actual allocation, implemented by mheap.alloc
    s := mheap_.alloc(npages, makeSpanClass(0, noscan), true, needzero)
    if s == nil {
        throw("out of memory")
    }
    s.limit = s.base() + size
    // record the allocated span in the bitmap
    heapBitsForAddr(s.base()).initSpan(s)
    return s
}
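
The page-count computation above simply rounds the size up to whole pages. The same arithmetic as a standalone sketch, assuming 8 KB pages (_PageShift = 13):

const (
    pageShift = 13 // 8 KB pages, matching the runtime's _PageShift
    pageSize  = 1 << pageShift
    pageMask  = pageSize - 1
)

func pagesFor(size uintptr) uintptr {
    npages := size >> pageShift
    if size&pageMask != 0 {
        npages++ // a partial trailing page still costs a whole page
    }
    return npages
}

// pagesFor(40<<10) == 5; pagesFor(33<<10) == 5 (4 full pages plus a 1 KB tail).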

mheap.alloc

func (h *mheap) alloc(npage uintptr, spanclass spanClass, large bool, needzero bool) *mspan {
    // Don't do any operations that lock the heap on the G stack.
    // It might trigger stack growth, and the stack growth code needs
    // to be able to allocate heap.
    var s *mspan
    systemstack(func() {
        s = h.alloc_m(npage, spanclass, large)
    })

    if s != nil {
        if needzero && s.needzero != 0 {
            memclrNoHeapPointers(unsafe.Pointer(s.base()), s.npages<<_PageShift)
        }
        s.needzero = 0
    }
    return s
}

mheap.alloc_m

Allocates a new span of the requested page count from the heap and records the object size class in the heap's metadata maps.

func (h *mheap) alloc_m(npage uintptr, spanclass spanClass, large bool) *mspan {
    _g_ := getg()
    if _g_ != _g_.m.g0 {
        throw("_mheap_alloc not on g0 stack")
    }
    lock(&h.lock)

    // garbage sweeping and memory-state marking omitted ...
    
    // get a span with the requested number of pages from the heap
    s := h.allocSpanLocked(npage, &memstats.heap_inuse)
    if s != nil {
        // Record span info, because gc needs to be
        // able to map interior pointer to containing span.
        atomic.Store(&s.sweepgen, h.sweepgen)
        h.sweepSpans[h.sweepgen/2%2].push(s) // Add to swept in-use list. (not covered here)
        s.state = _MSpanInUse
        s.allocCount = 0
        s.spanclass = spanclass
        // reset the span's state
        if sizeclass := spanclass.sizeclass(); sizeclass == 0 {
            s.elemsize = s.npages << _PageShift
            s.divShift = 0
            s.divMul = 0
            s.divShift2 = 0
            s.baseMask = 0
        } else {
            s.elemsize = uintptr(class_to_size[sizeclass])
            m := &class_to_divmagic[sizeclass]
            s.divShift = m.shift
            s.divMul = m.mul
            s.divShift2 = m.shift2
            s.baseMask = m.baseMask
        }

        // update stats, sweep lists
        h.pagesInUse += uint64(npage)
        if large {
            // update the mheap's large-object statistics
            memstats.heap_objects++
            mheap_.largealloc += uint64(s.elemsize)
            mheap_.nlargealloc++
            atomic.Xadd64(&memstats.heap_live, int64(npage<<_PageShift))
            // Swept spans are at the end of lists.
            // append to busy or busylarge at the end, depending on the page count
            if s.npages < uintptr(len(h.busy)) {
                h.busy[s.npages].insertBack(s)
            } else {
                h.busylarge.insertBack(s)
            }
        }
    }
    // gc trace marking omitted ...
    unlock(&h.lock)
    return s
}

mheap.allocSpanLocked

Allocates a span of the given page count and removes it from the free list.

func (h *mheap) allocSpanLocked(npage uintptr, stat *uint64) *mspan {
    var list *mSpanList
    var s *mspan

    // Try in fixed-size lists up to max.
    // try the list for the requested page count first; if it is empty, try lists with more pages
    for i := int(npage); i < len(h.free); i++ {
        list = &h.free[i]
        if !list.isEmpty() {
            s = list.first
            list.remove(s)
            goto HaveSpan
        }
    }
    // Best fit in list of large spans.
    // find a suitable span in freelarge; this function is analyzed below
    s = h.allocLarge(npage) // allocLarge removed s from h.freelarge for us
    if s == nil {
        // if freelarge has no suitable span, the only option left is to
        // allocate fresh memory from the system; grow is analyzed below
        if !h.grow(npage) {
            return nil
        }
        // after growing from the system, search freelarge again for a suitable span
        s = h.allocLarge(npage)
        if s == nil {
            return nil
        }
    }

HaveSpan:
    // we now hold a span with enough pages
    // Mark span in use. (omitted ...)
    
    if s.npages > npage {
        // Trim extra and put it back in the heap.
        // split off a new span of s.npages - npage pages and put it back in the heap
        t := (*mspan)(h.spanalloc.alloc())
        t.init(s.base()+npage<<_PageShift, s.npages-npage)
        // update the properties of the span s that we keep
        s.npages = npage
        h.setSpan(t.base()-1, s)
        h.setSpan(t.base(), t)
        h.setSpan(t.base()+t.npages*pageSize-1, t)
        t.needzero = s.needzero
        s.state = _MSpanManual // prevent coalescing with s
        t.state = _MSpanManual
        h.freeSpanLocked(t, false, false, s.unusedsince)
        s.state = _MSpanFree
    }
    s.unusedsince = 0
    // record s in the spans and arenas metadata
    h.setSpans(s.base(), npage, s)

    *stat += uint64(npage << _PageShift)
    memstats.heap_idle -= uint64(npage << _PageShift)

    //println("spanalloc", hex(s.start<<_PageShift))
    if s.inList() {
        throw("still in list")
    }
    return s
}

mheap.allocLarge

Finds a span with the requested page count in the mheap's freelarge treap and removes it from the tree; returns nil if none is found.

func (h *mheap) allocLarge(npage uintptr) *mspan {
    // Search treap for smallest span with >= npage pages.
    return h.freelarge.remove(npage)
}

// h.freelarge.remove above resolves to this function,
// a typical binary-search-tree lookup
func (root *mTreap) remove(npages uintptr) *mspan {
    t := root.treap
    for t != nil {
        if t.spanKey == nil {
            throw("treap node with nil spanKey found")
        }
        if t.npagesKey < npages {
            t = t.right
        } else if t.left != nil && t.left.npagesKey >= npages {
            t = t.left
        } else {
            result := t.spanKey
            root.removeNode(t)
            return result
        }
    }
    return nil
}

Note: in 《Go語言學習筆記》, the lookup here was still a linear walk over a linked list; the treap is a later change.

mheap.grow

In mheap.allocSpanLocked, if no suitable span can be found on freelarge, the only option left is to allocate fresh memory from the system. Let's continue by analyzing how this function does that.

func (h *mheap) grow(npage uintptr) bool {
    ask := npage << _PageShift
    // request memory from the system; sysAlloc is traced next
    v, size := h.sysAlloc(ask)
    if v == nil {
        print("runtime: out of memory: cannot allocate ", ask, "-byte block (", memstats.heap_sys, " in use)\n")
        return false
    }

    // Create a fake "in use" span and free it, so that the
    // right coalescing happens.
    // create a span to manage the memory we just acquired
    s := (*mspan)(h.spanalloc.alloc())
    s.init(uintptr(v), size/pageSize)
    h.setSpans(s.base(), s.npages, s)
    atomic.Store(&s.sweepgen, h.sweepgen)
    s.state = _MSpanInUse
    h.pagesInUse += uint64(s.npages)
    // free the fake span so it lands in the free lists and the arenas/spans metadata
    h.freeSpanLocked(s, false, true, 0)
    return true
}

mheap.sysAlloc

func (h *mheap) sysAlloc(n uintptr) (v unsafe.Pointer, size uintptr) {
   n = round(n, heapArenaBytes)

   // First, try the arena pre-reservation.
   // try the pre-reserved arena space first; returns nil if it cannot satisfy n
   v = h.arena.alloc(n, heapArenaBytes, &memstats.heap_sys)
   if v != nil {
       // got what we need from the arena reservation; jump to mapped
       size = n
       goto mapped
   }

   // Try to grow the heap at a hint address.
   for h.arenaHints != nil {
       hint := h.arenaHints
       p := hint.addr
       if hint.down {
           p -= n
       }
       if p+n < p {
           // We can't use this, so don't ask.
           // p+n overflowed uintptr, so this hint cannot be used
           v = nil
       } else if arenaIndex(p+n-1) >= 1<<arenaBits {
           // Outside addressable heap. Can't use.
           v = nil
       } else {
           // the hint looks usable; reserve the region from the system (mmap on unix)
           v = sysReserve(unsafe.Pointer(p), n)
       }
       if p == uintptr(v) {
           // Success. Update the hint.
           if !hint.down {
               p += n
           }
           hint.addr = p
           size = n
           break
       }
       // Failed. Discard this hint and try the next.
       //
       // TODO: This would be cleaner if sysReserve could be
       // told to only return the requested address. In
       // particular, this is already how Windows behaves, so
       // it would simply things there.
       if v != nil {
           sysFree(v, n, nil)
       }
       h.arenaHints = hint.next
       h.arenaHintAlloc.free(unsafe.Pointer(hint))
   }

   if size == 0 {
       if raceenabled {
           // The race detector assumes the heap lives in
           // [0x00c000000000, 0x00e000000000), but we
           // just ran out of hints in this region. Give
           // a nice failure.
           throw("too many address space collisions for -race mode")
       }

       // All of the hints failed, so we'll take any
       // (sufficiently aligned) address the kernel will give
       // us.
       v, size = sysReserveAligned(nil, n, heapArenaBytes)
       if v == nil {
           return nil, 0
       }

       // Create new hints for extending this region.
       hint := (*arenaHint)(h.arenaHintAlloc.alloc())
       hint.addr, hint.down = uintptr(v), true
       hint.next, mheap_.arenaHints = mheap_.arenaHints, hint
       hint = (*arenaHint)(h.arenaHintAlloc.alloc())
       hint.addr = uintptr(v) + size
       hint.next, mheap_.arenaHints = mheap_.arenaHints, hint
   }

   // Check for bad pointers or pointers we can't use.
   {
       var bad string
       p := uintptr(v)
       if p+size < p {
           bad = "region exceeds uintptr range"
       } else if arenaIndex(p) >= 1<<arenaBits {
           bad = "base outside usable address space"
       } else if arenaIndex(p+size-1) >= 1<<arenaBits {
           bad = "end outside usable address space"
       }
       if bad != "" {
           // This should be impossible on most architectures,
           // but it would be really confusing to debug.
           print("runtime: memory allocated by OS [", hex(p), ", ", hex(p+size), ") not in usable address space: ", bad, "\n")
           throw("memory reservation exceeds address space limit")
       }
   }

   if uintptr(v)&(heapArenaBytes-1) != 0 {
       throw("misrounded allocation in sysAlloc")
   }

   // Back the reservation.
   sysMap(v, size, &memstats.heap_sys)

mapped:
   // Create arena metadata.
   // compute the arenas L1/L2 indexes from v's address
   for ri := arenaIndex(uintptr(v)); ri <= arenaIndex(uintptr(v)+size-1); ri++ {
       l2 := h.arenas[ri.l1()]
       if l2 == nil {
           // Allocate an L2 arena map.
           l2 = (*[1 << arenaL2Bits]*heapArena)(persistentalloc(unsafe.Sizeof(*l2), sys.PtrSize, nil))
           if l2 == nil {
               throw("out of memory allocating heap arena map")
           }
           atomic.StorepNoWB(unsafe.Pointer(&h.arenas[ri.l1()]), unsafe.Pointer(l2))
       }
       
        // if arenas[ri.l1()][ri.l2()] is non-nil, it has already been initialized
       if l2[ri.l2()] != nil {
           throw("arena already initialized")
       }
       var r *heapArena
        // allocate the heapArena metadata, preferring the pre-reserved space
       r = (*heapArena)(h.heapArenaAlloc.alloc(unsafe.Sizeof(*r), sys.PtrSize, &memstats.gc_sys))
       if r == nil {
           r = (*heapArena)(persistentalloc(unsafe.Sizeof(*r), sys.PtrSize, &memstats.gc_sys))
           if r == nil {
               throw("out of memory allocating heap arena metadata")
           }
       }

       // Store atomically just in case an object from the
       // new heap arena becomes visible before the heap lock
       // is released (which shouldn't happen, but there's
       // little downside to this).
       atomic.StorepNoWB(unsafe.Pointer(&l2[ri.l2()]), unsafe.Pointer(r))
   }
   // some code omitted ...
   return
}

This completes the large-object allocation path. Next, let's look at how tiny and normal objects are allocated.

Tiny- and normal-object allocation

The snippet below is the core lookup-and-allocation step for tiny and normal objects; we already saw it in mallocgc above, and now we analyze these two functions in detail.

span := c.alloc[spc]
v := nextFreeFast(span)
if v == 0 {
    v, _, shouldhelpgc = c.nextFree(spc)
}

nextFreeFast

This function returns the address of a free slot in the span, or 0 if none can be found.

func nextFreeFast(s *mspan) gclinkptr {
    // count how many trailing zero bits s.allocCache has
    theBit := sys.Ctz64(s.allocCache) // Is there a free object in the allocCache?
    if theBit < 64 {
    
        result := s.freeindex + uintptr(theBit)
        if result < s.nelems {
            freeidx := result + 1
            if freeidx%64 == 0 && freeidx != s.nelems {
                return 0
            }
            // advance the cached bitmap and the free-slot index
            s.allocCache >>= uint(theBit + 1)
            s.freeindex = freeidx
            s.allocCount++
            // return the address of the slot we found
            return gclinkptr(result*s.elemsize + s.base())
        }
    }
    return 0
}
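
The allocCache trick is easy to replicate outside the runtime: allocCache holds the complement of allocBits, so a 1 bit means "free", and counting trailing zeros finds the first free slot. A sketch using math/bits in place of sys.Ctz64:

import "math/bits"

// firstFree returns the index, relative to freeindex, of the first free slot
// recorded in allocCache; 64 means the cached window holds no free slot.
func firstFree(allocCache uint64) int {
    return bits.TrailingZeros64(allocCache)
}

// Example: allocCache = 0xb0 (bits 4, 5 and 7 set) -> firstFree returns 4.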

mcache.nextFree

If nextFreeFast cannot find a suitable slot, allocation enters this function.

nextFree returns an unused object from the cached span if there is one; otherwise it calls refill to fetch a span of the matching class from mcentral, then returns an unused object from the new span.

func (c *mcache) nextFree(spc spanClass) (v gclinkptr, s *mspan, shouldhelpgc bool) {
    // find the span of this span class in the mcache
    s = c.alloc[spc]
    shouldhelpgc = false
    // find a free slot index in the current span
    freeIndex := s.nextFreeIndex()
    if freeIndex == s.nelems {
        // The span is full.
        // freeIndex == nelems means the current span is full
        if uintptr(s.allocCount) != s.nelems {
            println("runtime: s.allocCount=", s.allocCount, "s.nelems=", s.nelems)
            throw("s.allocCount != s.nelems && freeIndex == s.nelems")
        }
        // refill fetches a usable span from mcentral and swaps it into the mcache
        systemstack(func() {
            c.refill(spc)
        })
        shouldhelpgc = true
        s = c.alloc[spc]
        
        // search the new span for a free index
        freeIndex = s.nextFreeIndex()
    }

    if freeIndex >= s.nelems {
        throw("freeIndex is not valid")
    }
    
    // compute the address and update the span's bookkeeping
    v = gclinkptr(freeIndex*s.elemsize + s.base())
    s.allocCount++
    if uintptr(s.allocCount) > s.nelems {
        println("s.allocCount=", s.allocCount, "s.nelems=", s.nelems)
        throw("s.allocCount > s.nelems")
    }
    return
}

mcache.refill

refill obtains a span for the given span class and installs it as the mcache's new span for that class.

func (c *mcache) refill(spc spanClass) {
    _g_ := getg()

    _g_.m.locks++
    // Return the current cached span to the central lists.
    s := c.alloc[spc]

    if uintptr(s.allocCount) != s.nelems {
        throw("refill of span with free space remaining")
    }
    
    // check whether s is the placeholder emptymspan
    if s != &emptymspan {
        s.incache = false
    }
    // Get a new cached span from the central lists.
    s = mheap_.central[spc].mcentral.cacheSpan()
    if s == nil {
        throw("out of memory")
    }

    if uintptr(s.allocCount) == s.nelems {
        throw("span has no free space")
    }
    // install the new span in the mcache
    c.alloc[spc] = s
    _g_.m.locks--
}

mcentral.cacheSpan

func (c *mcentral) cacheSpan() *mspan {
    // Deduct credit for this span allocation and sweep if necessary.
    spanBytes := uintptr(class_to_allocnpages[c.spanclass.sizeclass()]) * _PageSize
    // garbage sweeping omitted ...
    lock(&c.lock)

    sg := mheap_.sweepgen
retry:
    var s *mspan
    for s = c.nonempty.first; s != nil; s = s.next {
    // if sweepgen == h->sweepgen - 2, the span needs sweeping
    // if sweepgen == h->sweepgen - 1, the span is currently being swept
    // if sweepgen == h->sweepgen, the span is swept and ready to use
    // h->sweepgen is incremented by 2 after every GC
        // this span needs sweeping
        if s.sweepgen == sg-2 && atomic.Cas(&s.sweepgen, sg-2, sg-1) {
            c.nonempty.remove(s)
            c.empty.insertBack(s)
            unlock(&c.lock)
            s.sweep(true)
            goto havespan
        }
        if s.sweepgen == sg-1 {
            // the span is being swept by background sweeper, skip
            continue
        }
        // we have a nonempty span that does not require sweeping, allocate from it
        // found a span that does not require sweeping; take it and continue at havespan
        c.nonempty.remove(s)
        c.empty.insertBack(s)
        unlock(&c.lock)
        goto havespan
    }
    
    // spans on the empty list may still need sweeping, and a swept span can
    // turn out to have free objects, so scan the empty list here as well
    for s = c.empty.first; s != nil; s = s.next {
        if s.sweepgen == sg-2 && atomic.Cas(&s.sweepgen, sg-2, sg-1) {
            // we have an empty span that requires sweeping,
            // sweep it and see if we can free some space in it
            c.empty.remove(s)
            // swept spans are at the end of the list
            c.empty.insertBack(s)
            unlock(&c.lock)
            s.sweep(true)
            freeIndex := s.nextFreeIndex()
            if freeIndex != s.nelems {
                s.freeindex = freeIndex
                goto havespan
            }
            lock(&c.lock)
            // the span is still empty after sweep
            // it is already in the empty list, so just retry
            goto retry
        }
        if s.sweepgen == sg-1 {
            // the span is being swept by background sweeper, skip
            continue
        }
        // already swept empty span,
        // all subsequent ones must also be either swept or in process of sweeping
        break
    }

    unlock(&c.lock)

    // Replenish central list if empty.
    // no suitable span found: replenish the central list with a new span of this
    // class; mcentral.grow calls mheap.alloc, which was analyzed above
    s = c.grow()
    if s == nil {
        return nil
    }
    lock(&c.lock)
    // insert at the back of the empty span list
    c.empty.insertBack(s)
    unlock(&c.lock)

    // At this point s is a non-empty span, queued at the end of the empty list,
    // c is unlocked.
havespan:

    cap := int32((s.npages << _PageShift) / s.elemsize)
    n := cap - int32(s.allocCount)
    if n == 0 || s.freeindex == s.nelems || uintptr(s.allocCount) == s.nelems {
        throw("span has no free objects")
    }
    // Assume all objects from this span will be allocated in the
    // mcache. If it gets uncached, we'll adjust this.
    atomic.Xadd64(&c.nmalloc, int64(n))
    usedBytes := uintptr(s.allocCount) * s.elemsize
    atomic.Xadd64(&memstats.heap_live, int64(spanBytes)-int64(usedBytes))
    // mark the span as in use (cached in an mcache)
    s.incache = true
    freeByteBase := s.freeindex &^ (64 - 1)
    whichByte := freeByteBase / 8
    // Init alloc bits cache.
    s.refillAllocCache(whichByte)

    // Adjust the allocCache so that s.freeindex corresponds to the low bit in
    // s.allocCache.
    s.allocCache >>= s.freeindex % 64

    return s
}

At this point, if mcentral cannot supply a suitable span, the memory-growth journey begins, i.e. the mheap.alloc path we analyzed above; the rest of the analysis is the same as before.

Allocation summary

Putting it all together, Go's memory allocation flows roughly as follows (a small observation probe follows the list):

  1. First decide whether the object is large, normal, or tiny
  2. For a tiny object:

    1. Find the mspan of the matching size class in the mcache's alloc array
    2. If the current mspan has enough room, allocate and update the mspan's bookkeeping (done in nextFreeFast)
    3. If the current mspan does not have enough room, fetch a new mspan of the matching class from mcentral, swap it in, then allocate and update its bookkeeping
    4. If mcentral has no span of the matching class left, request one from mheap
    5. If mheap has no span of the matching size, find a span of a nearby (larger) size, split it, and allocate from the piece
    6. If no nearby span can be found either, request more memory from the system and add it to mheap
  3. For a normal object, the logic is much the same as for a tiny object:

    1. First look up the tables to determine the object's size class and find the corresponding mspan
    2. If the current mspan has enough room, allocate and update the mspan's bookkeeping (done in nextFreeFast)
    3. If the current mspan does not have enough room, fetch a new mspan of the matching class from mcentral, swap it in, then allocate and update its bookkeeping
    4. If mcentral has no span of the matching class left, request one from mheap
    5. If mheap has no span of the matching size, find a span of a nearby (larger) size, split it, and allocate from the piece
    6. If no nearby span can be found either, request more memory from the system and add it to mheap
  4. For a large object, allocate directly from mheap:

    1. If mheap has no span of the matching size, find a span of a nearby (larger) size, split it, and allocate from the piece
    2. If no nearby span can be found either, request more memory from the system and add it to mheap
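
Finally, the effect of all this machinery can be observed from user code: runtime.ReadMemStats exposes counters (HeapAlloc, HeapSys, HeapObjects, ...) maintained by the paths above. A small probe:

import (
    "fmt"
    "runtime"
)

func printHeap(tag string) {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)
    fmt.Printf("%s: HeapAlloc=%d HeapSys=%d HeapObjects=%d\n",
        tag, m.HeapAlloc, m.HeapSys, m.HeapObjects)
}

// Usage: printHeap("before"); data := make([]byte, 64<<10); printHeap("after")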

References

《Go語言學習筆記》

《圖解Go語言內存分配》

《探索Go內存管理(分配)》

《Golang 內存管理》
