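OC Underlying Principles 05: Class Structure cache

The earlier analysis of the class structure mentioned cache, which uses a hash table to cache methods. Here we explore cache in depth.

cache source analysis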

struct cache_t {
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED
    explicit_atomic<struct bucket_t *> _buckets;
    explicit_atomic<mask_t> _mask;
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
    explicit_atomic<uintptr_t> _maskAndBuckets;
    mask_t _mask_unused;
    // partially omitted...
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
    // _maskAndBuckets stores the mask shift in the low 4 bits, and
    // the buckets pointer in the remainder of the value. The mask
    // shift is the value where (0xffff >> shift) produces the correct
    // mask. This is equal to 16 - log2(cache_size).
    explicit_atomic<uintptr_t> _maskAndBuckets;
    mask_t _mask_unused;
    // partially omitted...
#else
#error Unknown cache mask storage type.
#endif

#if __LP64__
    uint16_t _flags;     // flag bits, read from outside
#endif
    uint16_t _occupied;  // number of occupied slots

    // some methods omitted...
public:
    struct bucket_t *buckets();   // get buckets
    mask_t mask();                // get the mask
    mask_t occupied();            // get occupied
    void incrementOccupied();     // increment occupied by one
    void setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask);
    void initializeToEmpty();

    unsigned capacity();          // cache capacity
    bool isConstantEmptyCache();
    bool canBeFreed();

    // allocate memory
    void reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld);
    // insert sel and imp
    void insert(Class cls, SEL sel, IMP imp, id receiver);
    // remaining members omitted...
};

1. CACHE_MASK_STORAGE

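CACHE_MASK_STORAGE_OUTLINED: used when the runtime environment is macOS or the simulator
CACHE_MASK_STORAGE_HIGH_16: used when the runtime environment is a 64-bit device
CACHE_MASK_STORAGE_LOW_4: used when the runtime environment is a non-64-bit device

The code in this article runs on macOS, so CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED is fixed at compile time. That is why jumping to a definition inside the other branches finds nothing.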

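explicit_atomic: cache is a method cache, so it is constantly being written, cleared, and read. explicit_atomic makes each load and store of the wrapped field atomic, which is what the cache relies on for thread safety.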

2. _buckets

From the source we can see that _buckets is a value of type struct bucket_t *. The bucket_t source:

struct bucket_t {
private:
    // IMP-first is better for arm64e ptrauth and no worse for arm64.
    // SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__  // arm64 devices
    explicit_atomic<uintptr_t> _imp;
    explicit_atomic<SEL> _sel;
#else          // everything else
    explicit_atomic<SEL> _sel;
    explicit_atomic<uintptr_t> _imp;
#endif
    // some methods omitted
public:
    // get sel
    inline SEL sel() const {
        // ...
    }
    // get imp; the class must be passed as an argument
    inline IMP imp(Class cls) const {
        // ...
    }
    // remaining members omitted...
};

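Whatever the runtime environment, bucket_t has two data members, _imp and _sel; only their order differs.

sel and imp
- sel is the method's identifier; think of it as an entry title in a table of contents
- imp is a pointer to the function that implements the method; think of it as the page number for that entry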

Debugging cache

Debug code, built on top of the runtime source:

@interface LGPerson : NSObject
@property (nonatomic, copy) NSString *lgName;
@property (nonatomic, strong) NSString *nickName;

- (void)sayHello;
- (void)sayCode;
- (void)sayMaster;
- (void)sayNB;
+ (void)sayHappy;

@end

// main
LGPerson *p = [LGPerson alloc];
[p sayHello];
[p sayCode];
[p sayMaster];

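First, stop at a breakpoint right after [LGPerson alloc], before any method has been called on p:

At this point occupied and capacity are both 0. Step forward and look again after the first method, sayHello, has been called:

Because the code has now called sayHello, the runtime stores that method in the cache so that the next call is faster. So we can see: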

occupied = 1: one method is cached
capacity = 4: the cache has 4 slots (room for 4 bucket_t structs)
sel = "sayHello"
imp = 0x0000000100000c00 - [LGPerson sayHello]

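We did find the called method in cache. What happens when we call a few more? Move the breakpoint to just after sayMaster; by then the cache should hold three methods.

In practice, however, the debugger shows only one method in the cache, while the capacity has grown to 8. Walking through every entry in buckets, we find the cached method sayMaster, the last method the code called, in the second slot. Why?

Why was the cache emptied after the third method call?
Why did capacity still grow even though the cache was emptied?
Why are methods stored in the cache out of order?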

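How a method is inserted into the cache

With these questions in mind, let's study how a method actually gets into the cache. That process should answer them.

Reading the source again, we see these two methods in cache_t: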

void reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld);
void insert(Class cls, SEL sel, IMP imp, id receiver);

insert source

void cache_t::insert(Class cls, SEL sel, IMP imp, id receiver)
{
#if CONFIG_USE_CACHE_LOCK
    cacheUpdateLock.assertLocked();
#else
    runtimeLock.assertLocked();
#endif

    ASSERT(sel != 0 && cls->isInitialized());

    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = occupied() + 1;
    unsigned oldCapacity = capacity(), capacity = oldCapacity;
    if (slowpath(isConstantEmptyCache())) {
        // Cache is read-only. Replace it.
        if (!capacity) capacity = INIT_CACHE_SIZE;
        reallocate(oldCapacity, capacity, /* freeOld */false);
    }
    else if (fastpath(newOccupied + CACHE_END_MARKER <= capacity / 4 * 3)) {
        // Cache is less than 3/4 full. Use it as-is.
        // e.g. capacity 4: the threshold is 4 / 4 * 3 = 3
    }
    else {
        capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;  // double the capacity, or start at 4
        if (capacity > MAX_CACHE_SIZE) {
            capacity = MAX_CACHE_SIZE;
        }
        reallocate(oldCapacity, capacity, true);  // growth done
    }

    bucket_t *b = buckets();
    mask_t m = capacity - 1;
    mask_t begin = cache_hash(sel, m);
    mask_t i = begin;

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the
    // minimum size is 4 and we resized at 3/4 full.
    do {
        if (fastpath(b[i].sel() == 0)) {
            incrementOccupied();
            b[i].set<Atomic, Encoded>(sel, imp, cls);
            return;
        }
        if (b[i].sel() == sel) {
            // The entry was added to the cache by some other thread
            // before we grabbed the cacheUpdateLock.
            return;
        }
    } while (fastpath((i = cache_next(i, m)) != begin));

    cache_t::bad_cache(receiver, (SEL)sel, cls);
}

Let's go through it one piece at a time:

1. When buckets is empty

// Use the cache as-is if it is less than 3/4 full
mask_t newOccupied = occupied() + 1;
unsigned oldCapacity = capacity(), capacity = oldCapacity;
if (slowpath(isConstantEmptyCache())) {
    // Cache is read-only. Replace it.
    if (!capacity) capacity = INIT_CACHE_SIZE;
    reallocate(oldCapacity, capacity, /* freeOld */false);
}

// isConstantEmptyCache source
bool cache_t::isConstantEmptyCache()
{
    return
        occupied() == 0 &&
        buckets() == emptyBucketsForCapacity(capacity(), false);
}

The branch is guarded by `isConstantEmptyCache`. When that condition holds, that is, when buckets is empty:

if (!capacity) capacity = INIT_CACHE_SIZE;

capacity is given the initial value INIT_CACHE_SIZE, which is 1 << 2 = 0b100 = 4, so the initial capacity is 4.
reallocate(oldCapacity, capacity, false);

ALWAYS_INLINE
void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld)
{
    bucket_t *oldBuckets = buckets();
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    // Cache's old contents are not propagated.
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    ASSERT(newCapacity > 0);
    ASSERT((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    setBucketsAndMask(newBuckets, newCapacity - 1);

    if (freeOld) {
        cache_collect_free(oldBuckets, oldCapacity);
    }
}

allocateBuckets

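In reallocate, the method that allocates the memory, look first at bucket_t *newBuckets = allocateBuckets(newCapacity);

Then step into the set method, which stores the method's SEL and IMP:

We now have newBuckets. Back in reallocate, once newBuckets is available the code goes on to call setBucketsAndMask, passing newBuckets and newCapacity - 1 as arguments:

setBucketsAndMask writes its arguments with store calls whose exact form depends on the runtime environment. The _buckets, _mask, and _occupied it touches are the corresponding fields of the cache_t struct.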

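That completes a quick pass through the first if branch of insert. In short: on the first insert the cache is still the constant empty cache, so capacity becomes INIT_CACHE_SIZE (4), reallocate allocates new buckets without freeing the old ones, and setBucketsAndMask stores the new buckets and mask.

2. When buckets is not empty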

else if (fastpath(newOccupied + CACHE_END_MARKER <= capacity / 4 * 3)) {
    // Cache is less than 3/4 full. Use it as-is.
    // CACHE_END_MARKER is a macro: #define CACHE_END_MARKER 1
    // i.e. is newOccupied + 1 less than or equal to three quarters of capacity?
}
else {
    capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;  // double the capacity
    if (capacity > MAX_CACHE_SIZE) {
        capacity = MAX_CACHE_SIZE;
    }
    reallocate(oldCapacity, capacity, true);  // growth done
}

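If newOccupied (the method count after this insert) plus CACHE_END_MARKER, which is 1, is less than or equal to three quarters of capacity, execution simply continues.
If it is greater than three quarters, capacity is doubled and reallocate is called again. This time freeOld is true, so cache_collect_free is called on the old buckets. As the comment in reallocate says, the old contents are not propagated to the new buckets.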

Let's look at the cache_collect_free source

3. Determining the insertion position

bucket_t *b = buckets();
mask_t m = capacity - 1;
mask_t begin = cache_hash(sel, m);
mask_t i = begin;

// cache_hash
static inline mask_t cache_hash(SEL sel, mask_t mask)
{
    return (mask_t)(uintptr_t)sel & mask;
}

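A bucket_t is not inserted sequentially. Storing entries in call order would make lookup slower than going straight to the slot that the hash selects.

We can verify this:

After calling the first method, sayHello, look at where it is stored.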

4. Checking the insertion position

Different methods can hash to the same slot, so before inserting we have to check whether the target slot already holds data.

do {
    if (fastpath(b[i].sel() == 0)) {
        incrementOccupied();
        b[i].set<Atomic, Encoded>(sel, imp, cls);
        return;
    }
    if (b[i].sel() == sel) {
        // The entry was added to the cache by some other thread
        // before we grabbed the cacheUpdateLock.
        return;
    }
} while (fastpath((i = cache_next(i, m)) != begin));

// cache_next
static inline mask_t cache_next(mask_t i, mask_t mask) {
    return (i+1) & mask;
}

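The check happens in the do...while loop, whose condition is fastpath((i = cache_next(i, m)) != begin). Note that cache_next does not hash again: it steps to the next slot, (i + 1) & mask, wrapping around at the end of the table. The loop therefore probes consecutive slots starting from the hashed position and stops once it is back at begin.

Inside the loop:

When the slot is empty, fastpath(b[i].sel() == 0): incrementOccupied is called first to bump the count of cached methods, then set is called with sel, imp, and cls to fill the slot. bucket_t only has the _sel and _imp fields, so cls is an input to set rather than a third stored field.
When the slot's sel equals the sel being inserted: another thread may already have cached this method, so nothing more is stored. In the words of the source comment: The entry was added to the cache by some other thread before we grabbed the cacheUpdateLock.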

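A brief summary of insert

With that, the earlier questions are no longer puzzling. The third insert would push the cache past three quarters full, so reallocate installs new buckets and the old contents are not carried over. The new buckets are twice as large, which is why capacity is 8. And the slot for each method comes from hashing its sel, not from the order of the calls.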