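This is day 7 of my participation in the August Writing Challenge; see the August Writing Challenge page for event details.
In an earlier article we saw that the structure behind a class such as NSObject is objc_class, which contains the following members: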
Class ISA;
Class superclass;
cache_t cache; // formerly cache pointer and vtable
class_data_bits_t bits; // class_rw_t * plus custom rr/alloc flags
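Today we will explore the cache, cache_t.
An array is a collection that stores multiple elements of the same type. Access by index is fast, but inserting and deleting elements is slow.
A linked list is a storage structure that is non-contiguous and non-sequential in physical memory; the logical order of its elements is given by the order in which the pointers link the nodes. Inserting and deleting nodes is cheap, but finding an element requires traversing the list.
A hash table is a data structure that is accessed directly by key value. Lookup, insertion, and deletion are all fast on average, at the cost of extra space and a hash computation.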
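Structure of a class: the objc_class struct is made up of isa, superclass, cache, and bits. isa and superclass are both struct pointers, 8 bytes each. The cache data structure can therefore be reached with a memory offset: first address + 16 bytes.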
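As a rough illustration of that offset arithmetic, the sketch below mirrors the field order with a hypothetical MockClass and a placeholder MockCache. Both are assumptions made for this example, not the runtime's real types. It lets the compiler report where cache sits; on a 64-bit target the two pointers take 16 bytes, so the offset is 16.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins; the real types live in the objc4 runtime sources.
struct MockCache { std::uint64_t words[2]; };  // assumed 16-byte placeholder for cache_t

struct MockClass {
    void*          isa;         // pointer: 8 bytes on a 64-bit target
    void*          superclass;  // pointer: 8 bytes on a 64-bit target
    MockCache      cache;       // begins right after the two pointers
    std::uintptr_t bits;
};

// Byte offset of `cache` from the start of the class structure.
inline std::size_t cacheOffset() {
    return offsetof(MockClass, cache);
}

// Address arithmetic equivalent to "first address + offset".
inline MockCache* cacheOf(MockClass* cls) {
    auto* base = reinterpret_cast<unsigned char*>(cls);
    return reinterpret_cast<MockCache*>(base + cacheOffset());
}
```

The same idea applies in a debugger: take the class address, add the offset, and cast the result to a cache pointer.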
Find the definition of cache_t:
struct cache_t {
private: explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
union {
struct {
explicit_atomic<mask_t> _maybeMask;
#if __LP64__
uint16_t _flags;
#endif
uint16_t _occupied;
};
explicit_atomic<preopt_cache_t *> _originalPreoptCache;
};
...
};
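To get a feel for the size of the layout shown above, here is a minimal sketch using a hypothetical MockCacheT. It replaces explicit_atomic with plain integers, gives the inner struct a name (standard C++ has no anonymous structs), and assumes mask_t is a 32-bit integer. Because the union members share storage, the whole structure comes to 16 bytes on a 64-bit build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using mask_t = std::uint32_t;  // assumption: 32-bit mask type on 64-bit targets

// Hypothetical mirror of the layout; not the runtime's real cache_t.
struct MockCacheT {
    std::uintptr_t bucketsAndMaybeMask;  // 8 bytes on a 64-bit target
    union {
        struct {
            mask_t        maybeMask;     // 4 bytes
            std::uint16_t flags;         // 2 bytes
            std::uint16_t occupied;      // 2 bytes
        } fields;                        // 8 bytes in total
        void* originalPreoptCache;       // shares the same storage as `fields`
    } u;
};

// Size of the shared union region.
inline std::size_t unionSize() { return sizeof(MockCacheT::u); }

// Size of the whole mock structure.
inline std::size_t cacheSize() { return sizeof(MockCacheT); }
```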
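_bucketsAndMaybeMask: explicit_atomic is a template; instantiated with uintptr_t, it occupies 8 bytes.
union: contains a struct and a struct pointer, _originalPreoptCache.
struct: contains the three member variables _maybeMask, _flags, and _occupied. It shares storage with _originalPreoptCache, so the two are mutually exclusive.
We have found the data structure of cache_t, but its purpose is not yet clear. Judging from cache_t's methods, it performs insert, delete, update, and lookup operations on bucket_t.
Find the definition of bucket_t: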
struct bucket_t {
private:
// IMP-first is better for arm64e ptrauth and no worse for arm64.
// SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
explicit_atomic<uintptr_t> _imp;
explicit_atomic<SEL> _sel;
#else
explicit_atomic<SEL> _sel;
explicit_atomic<uintptr_t> _imp;
#endif
...
};
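bucket_t contains sel and imp.
The order of sel and imp differs between architectures.
From sel and imp it is easy to see that what cache_t caches is methods.
The insert function
In the cache_t struct, find the insert function: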
struct cache_t {
...
void insert(SEL sel, IMP imp, id receiver);
...
};
In the insert function, when the cache is empty, the buckets are created:
INIT_CACHE_SIZE_LOG2 = 2,
INIT_CACHE_SIZE = (1 << INIT_CACHE_SIZE_LOG2),
mask_t newOccupied = occupied() + 1;
unsigned oldCapacity = capacity(), capacity = oldCapacity;
if (slowpath(isConstantEmptyCache())) {
// Cache is read-only. Replace it.
if (!capacity) capacity = INIT_CACHE_SIZE;
reallocate(oldCapacity, capacity, /* freeOld */false);
}
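newOccupied: the number of entries already cached, plus 1.
capacity: 4 (1 << 2), the initial capacity of the cache.
reallocate: called with freeOld set to false, because the buckets are being created for the first time.
The reallocate function creates the buckets and calls setBucketsAndMask: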
bucket_t *newBuckets = allocateBuckets(newCapacity);
setBucketsAndMask(newBuckets, newCapacity - 1);
The code of setBucketsAndMask differs between architectures. Taking the non-device build that is running here (not a physical device) as an example:
void cache_t::setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask)
{
#ifdef __arm__
// ensure other threads see buckets contents before buckets pointer
mega_barrier();
_bucketsAndMaybeMask.store((uintptr_t)newBuckets, memory_order_relaxed);
// ensure other threads see new buckets before new mask
mega_barrier();
_maybeMask.store(newMask, memory_order_relaxed);
_occupied = 0;
#elif __x86_64__ || i386
// ensure other threads see buckets contents before buckets pointer
_bucketsAndMaybeMask.store((uintptr_t)newBuckets, memory_order_release);
// ensure other threads see new buckets before new mask
_maybeMask.store(newMask, memory_order_release);
_occupied = 0;
#else
#error Don't know how to do setBucketsAndMask on this architecture.
#endif
}
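newMask: the capacity of the cache minus 1, used as the mask.
buckets: stored into _bucketsAndMaybeMask. The value is cast to uintptr_t, so only the pointer, i.e. the first address of the buckets, is stored.
newMask: stored into _maybeMask.
_occupied: set to 0, because the buckets are still empty at this point.
If newOccupied + 1 is at most 75% of the capacity, no expansion is needed: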
#define CACHE_END_MARKER 1
if (fastpath(newOccupied + CACHE_END_MARKER <= cache_fill_ratio(capacity))) {
// Cache is less than 3/4 or 7/8 full. Use it as-is.
}
// Historical fill ratio of 75% (since the new objc runtime was introduced).
static inline mask_t cache_fill_ratio(mask_t capacity) {
return capacity * 3 / 4;
}
CACHE_END_MARKER: an end marker inserted by the system, which acts as a boundary.
If the 75% threshold is exceeded, the capacity is doubled:
MAX_CACHE_SIZE_LOG2 = 16,
MAX_CACHE_SIZE = (1 << MAX_CACHE_SIZE_LOG2),
capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;
if (capacity > MAX_CACHE_SIZE) {
capacity = MAX_CACHE_SIZE;
}
reallocate(oldCapacity, capacity, true);
capacity is doubled, but may not exceed 65536 (1 << 16).
When expanding, reallocate is called with freeOld set to true.
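The threshold check and the doubling rule can be traced with concrete numbers. The sketch below restates the arithmetic from the snippets above as standalone functions. The names needsGrow and nextCapacity are made up for this example, and mask_t is assumed to be a 32-bit integer.

```cpp
#include <cassert>
#include <cstdint>

using mask_t = std::uint32_t;              // assumption: 32-bit mask type

const mask_t kInitCacheSize  = 1u << 2;    // INIT_CACHE_SIZE
const mask_t kMaxCacheSize   = 1u << 16;   // MAX_CACHE_SIZE
const mask_t kCacheEndMarker = 1;          // CACHE_END_MARKER

// Same arithmetic as cache_fill_ratio: three quarters of the capacity.
inline mask_t fillRatio(mask_t capacity) {
    return capacity * 3 / 4;
}

// True when one more entry (plus the end marker) would exceed the threshold.
inline bool needsGrow(mask_t occupied, mask_t capacity) {
    mask_t newOccupied = occupied + 1;
    return newOccupied + kCacheEndMarker > fillRatio(capacity);
}

// Doubling rule with the upper bound applied.
inline mask_t nextCapacity(mask_t capacity) {
    mask_t grown = capacity ? capacity * 2 : kInitCacheSize;
    return grown > kMaxCacheSize ? kMaxCacheSize : grown;
}
```

With a capacity of 4 the threshold is 3, so the first two insertions fit (1 + 1 and 2 + 1 are both at most 3), while the third (3 + 1 = 4) triggers an expansion to 8.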
In the reallocate function, when freeOld is true:
bucket_t *oldBuckets = buckets();
bucket_t *newBuckets = allocateBuckets(newCapacity);
setBucketsAndMask(newBuckets, newCapacity - 1);
if (freeOld) {
collect_free(oldBuckets, oldCapacity);
}
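New buckets are created to replace the original buckets; their capacity is the expanded size.
The old buckets are released.
The methods cached in the old buckets are all discarded; they are not copied into the new buckets.
The insert function calls the hash function to compute the index for sel: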
mask_t m = capacity - 1;
mask_t begin = cache_hash(sel, m);
mask_t i = begin;
capacity - 1 is the mask for the hash function and is used to compute the index.
The insert function then gets the buckets:
bucket_t *b = buckets();
The buckets function applies an & operation and returns a pointer of type bucket_t *, i.e. the first address of the buckets:
static constexpr uintptr_t bucketsMask = ~0ul;
struct bucket_t *cache_t::buckets() const
{
uintptr_t addr = _bucketsAndMaybeMask.load(memory_order_relaxed);
return (bucket_t *)(addr & bucketsMask);
}
~0ul: 0b1111111111111111111111111111111111111111111111111111111111111111 (64 ones)
& operation: a result bit is 1 only when both corresponding bits are 1.
addr & ~0ul: the result is still addr.
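A small sketch makes the effect of the mask visible. It applies the all-ones mask from the snippet above and, for contrast, a narrower mask. The 48-bit mask and the sample value are purely hypothetical and only show what a mask does when it is not all ones.

```cpp
#include <cassert>
#include <cstdint>

// All 64 bits set, mirroring bucketsMask = ~0ul in the snippet above.
const std::uint64_t kAllOnes = ~0ull;

// Hypothetical narrower mask (low 48 bits), for contrast only.
const std::uint64_t kLow48 = (1ull << 48) - 1;

// Keeps only the bits of `word` that are also set in `mask`.
inline std::uint64_t applyMask(std::uint64_t word, std::uint64_t mask) {
    return word & mask;
}
```

With the all-ones mask every bit survives, so the stored word is returned unchanged; with the narrower mask the top 16 bits are cleared.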
The bucket is fetched by index, which amounts to a memory offset. If the bucket holds no sel, the entry is written into the cache:
if (fastpath(b[i].sel() == 0)) {
incrementOccupied();
b[i].set<Atomic, Encoded>(b, sel, imp, cls());
return;
}
incrementOccupied: increments _occupied.
set: writes sel and imp into the bucket.
If a sel is present and it is the same as the current sel, return directly:
if (b[i].sel() == sel) {
// The entry was added to the cache by some other thread
// before we grabbed the cacheUpdateLock.
return;
}
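Otherwise, there is a hash collision.
The algorithm of cache_next differs between architectures. Taking the non-device build that is running here as an example: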
static inline mask_t cache_next(mask_t i, mask_t mask) {
return (i+1) & mask;
}
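The index is incremented by 1 and then combined with mask using &, so it wraps back to 0 after the last slot.
Inside the do...while loop, cache_next is called until the hash collision is resolved or the probe comes back to begin: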
do {
...
} while (fastpath((i = cache_next(i, m)) != begin));
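To see the order in which slots are visited, the sketch below reuses the same (i + 1) & mask step and records each index. probeOrder is a hypothetical helper written for this example, and mask_t is assumed to be a 32-bit integer.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using mask_t = std::uint32_t;  // assumption: 32-bit mask type

// Same arithmetic as the cache_next shown above.
inline mask_t cacheNext(mask_t i, mask_t mask) {
    return (i + 1) & mask;
}

// Hypothetical helper: lists the slots in visit order, starting at `begin`
// and stopping once the probe has wrapped around to `begin` again.
inline std::vector<mask_t> probeOrder(mask_t begin, mask_t mask) {
    std::vector<mask_t> order;
    mask_t i = begin;
    do {
        order.push_back(i);
    } while ((i = cacheNext(i, mask)) != begin);
    return order;
}
```

For a capacity of 4 (mask 3) and a starting index of 2, the slots are visited in the order 2, 3, 0, 1, so every slot is checked exactly once.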
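Conclusion:
capacity: the capacity of the cache.
occupied: the number of entries already cached.
maybeMask: the value capacity - 1, used as a mask to compute the index, both in the hash function and when resolving hash collisions.
Expansion: when the size after the write plus the end marker exceeds 75% of the capacity, the cache is expanded.
On expansion, new buckets are created and the old buckets are released, so the previously cached methods are discarded.
Write: look at the sel in the bucket; if no sel is present, write the entry.
If a sel is present and it is the same as the current sel, return directly.
Otherwise there is a hash collision: the index is incremented by 1 and combined with mask using &.
This repeats in the do...while loop until the hash collision is resolved.
A hash table has two parameters that affect its performance: the initial capacity and the load factor.
When the number of entries in a hash table exceeds the product of the load factor and the current capacity, the table is rehashed, i.e. its internal data structure is rebuilt, so that it has roughly twice as many buckets. A load factor of 3/4 offers a good trade-off between time and space costs.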
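The conclusions can be pulled together in a toy model. This is a sketch under stated assumptions, not the runtime's implementation: selectors are plain integers with 0 meaning an empty slot, the hash is simply sel & mask (a stand-in for cache_hash, whose body is not shown in this article), and find is a hypothetical lookup helper added for testing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: integer "selectors" (0 = empty slot) and integer "imps".
struct ToyBucket {
    std::uintptr_t sel;
    int            imp;
};

class ToyCache {
public:
    std::size_t capacity() const { return buckets_.size(); }
    std::size_t occupied() const { return occupied_; }

    void insert(std::uintptr_t sel, int imp) {
        std::size_t newOccupied = occupied_ + 1;
        std::size_t cap = buckets_.size();
        if (cap == 0) {
            cap = 4;                                    // initial capacity
            reallocate(cap);
        } else if (newOccupied + 1 > cap * 3 / 4) {     // +1 is the end marker
            cap = (cap * 2 > 65536) ? 65536 : cap * 2;  // double, with upper bound
            reallocate(cap);                            // old entries are dropped
        }
        std::size_t m = cap - 1;
        std::size_t begin = sel & m;                    // assumed hash: sel & mask
        std::size_t i = begin;
        do {
            if (buckets_[i].sel == 0) {                 // empty slot: write the entry
                buckets_[i].sel = sel;
                buckets_[i].imp = imp;
                ++occupied_;
                return;
            }
            if (buckets_[i].sel == sel) return;         // already cached
        } while ((i = (i + 1) & m) != begin);           // collision: probe next slot
    }

    // Hypothetical helper: returns the cached imp, or -1 on a miss.
    int find(std::uintptr_t sel) const {
        if (buckets_.empty()) return -1;
        std::size_t m = buckets_.size() - 1;
        std::size_t begin = sel & m;
        std::size_t i = begin;
        do {
            if (buckets_[i].sel == sel) return buckets_[i].imp;
            if (buckets_[i].sel == 0) return -1;
        } while ((i = (i + 1) & m) != begin);
        return -1;
    }

private:
    // Replaces the storage with empty buckets; nothing is copied over.
    void reallocate(std::size_t newCapacity) {
        buckets_.assign(newCapacity, ToyBucket{0, 0});
        occupied_ = 0;
    }

    std::vector<ToyBucket> buckets_;
    std::size_t occupied_ = 0;
};
```

Inserting selectors 1 and 2 fills two of the four initial slots. Inserting selector 3 crosses the 3/4 threshold, so the toy cache grows to 8 slots and afterwards holds only selector 3. Inserting 4 and then 8 into a fresh cache shows a collision: both hash to slot 0, so 8 is placed in slot 1.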