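Welcome to the iOS Exploration series (best read in order).
The previous article covered the structure of a class in full, but one member, cache_t cache, was left without a detailed look. This article analyzes cache_t at the source level.
Here is how a class is laid out under the hood: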
struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    class_rw_t *data() {
        return bits.data();
    }
    ...
};
The structure of cache_t is as follows:
struct cache_t {
    struct bucket_t *_buckets;
    mask_t _mask;
    mask_t _occupied;
    ...
};
As noted in the previous article, this structure shows that cache_t is made up of two uint32_t members, _mask and _occupied, and a pointer to a bucket_t struct:
struct bucket_t {
private:
    // IMP-first is better for arm64e ptrauth and no worse for arm64.
    // SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
    MethodCacheIMP _imp;
    cache_key_t _key;
#else
    cache_key_t _key;
    MethodCacheIMP _imp;
#endif
public:
    inline cache_key_t key() const { return _key; }
    inline IMP imp() const { return (IMP)_imp; }
    inline void setKey(cache_key_t newKey) { _key = newKey; }
    inline void setImp(IMP newImp) { _imp = newImp; }
    void set(cache_key_t newKey, IMP newImp);
};
The members and methods of bucket_t suggest that it is tied to imp. In fact, a bucket_t is a bucket that holds a method implementation imp together with its key.
Taken literally, _buckets, _mask and _occupied in cache_t mean buckets, mask and occupied, but it is not yet clear whether their roles match their names. Let's start by printing some information in LLDB.
Prepare the following code in the objc source project:
#import <objc/runtime.h>

@interface FXPerson : NSObject
- (void)doFirst;
- (void)doSecond;
- (void)doThird;
@end

@implementation FXPerson
- (void)doFirst {}
- (void)doSecond {}
- (void)doThird {}
@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        FXPerson *p = [[FXPerson alloc] init];
        Class cls = object_getClass(p);
        [p doFirst];
        [p doSecond];
        [p doThird];
    }
    return 0;
}
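_buckets is a bucket for method implementations, so let's set a breakpoint at the method calls. (As the previous article explained, the isa pointer in a class takes 8 bytes and the superclass pointer takes 8 bytes, so the address of cache_t is the class's base address + 16 bytes.)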
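The "+ 16 bytes" rule can be checked with a small layout sketch. The structs below are hypothetical stand-ins (ClassSketch, CacheSketch and BucketSketch are names made up for this example, not the runtime's types) that only copy the member order quoted earlier, and the 16-byte figure assumes a 64-bit platform with 8-byte pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins that copy only the member order of the runtime
// structs quoted in this article; they are not the real objc types.
struct BucketSketch;                 // contents do not matter for the layout

struct CacheSketch {
    BucketSketch *buckets;           // one pointer
    std::uint32_t mask;              // 4 bytes
    std::uint32_t occupied;          // 4 bytes
};

struct ClassSketch {
    void *isa;                       // comes from objc_object
    void *superclass;
    CacheSketch cache;
    std::uintptr_t bits;
};

// Offset of the cache member from the start of the class.
inline std::size_t cacheOffset() {
    return offsetof(ClassSketch, cache);
}

// Given a class's base address, compute where its cache would sit.
inline std::uintptr_t cacheAddress(std::uintptr_t classBase) {
    assert(classBase != 0);          // a null class has no cache to look at
    return classBase + cacheOffset();
}
```

On a 64-bit build cacheOffset() is two pointers wide, that is 16 bytes, which is the offset added to the class address in LLDB.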
At the [p doFirst] breakpoint, _mask is 3 and _occupied is 1. Next we print _buckets.
Printing several entries of $3 turns up only one cached method, -[NSObject init], which suggests a hypothesis.
Move the breakpoint to the [p doSecond]; line (I re-ran the project here).
Move the breakpoint to the [p doThird]; line. Together the three stops give the following data:
Breakpoint | _occupied | Methods in _buckets |
---|---|---|
[p doFirst] | 1 | -[NSObject init] |
[p doSecond] | 2 | -[NSObject init], -[FXPerson doFirst] |
[p doThird] | 3 | -[NSObject init], -[FXPerson doFirst], -[FXPerson doSecond] |
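From this data we can conclude that _buckets is a bucket of method implementations and _occupied is the number of method implementations in it.
Some readers will wonder why alloc, which FXPerson also called, was never cached. As the previous article explained, alloc is a class method, so it lives on the FXPerson metaclass rather than on the class itself.
Just when everything seemed to be going smoothly, something unexpected happened when the breakpoint moved on to the next line:
_mask and _occupied both changed in a surprising way. What did the runtime do underneath? And why were the entries empty when bucket[0] was printed earlier?
We know that _mask grew, so we look up mask_t mask() in cache_t, only to find that it simply returns _mask: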
mask_t cache_t::mask()
{
    return _mask;
}
Searching further for mask(), we find it used in capacity, though the purpose is not yet obvious:
mask_t cache_t::capacity()
{
    return mask() ? mask()+1 : 0;
}
Searching next for capacity(), we find a meaningful call to it in expand:
void cache_t::expand()
{
    cacheUpdateLock.assertLocked();
    uint32_t oldCapacity = capacity();
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;
    if ((uint32_t)(mask_t)newCapacity != newCapacity) {
        // mask overflow - can't grow further
        // fixme this wastes one bit of mask
        newCapacity = oldCapacity;
    }
    reallocate(oldCapacity, newCapacity);
}
expand looks like the method that grows the cache. Tracing further up the call chain leads to cache_fill_nolock:
static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver) {
    cacheUpdateLock.assertLocked();
    // Never cache before +initialize is done
    if (!cls->isInitialized()) return;
    // Make sure the entry wasn't added to the cache by some other thread
    // before we grabbed the cacheUpdateLock.
    if (cache_getImp(cls, sel)) return;
    cache_t *cache = getCache(cls);
    cache_key_t key = getKey(sel);
    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = cache->occupied() + 1;
    mask_t capacity = cache->capacity();
    if (cache->isConstantEmptyCache()) {
        // Cache is read-only. Replace it.
        cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
    }
    else if (newOccupied <= capacity / 4 * 3) {
        // Cache is less than 3/4 full. Use it as-is.
    }
    else {
        // Cache is too full. Expand it.
        cache->expand();
    }
    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the
    // minimum size is 4 and we resized at 3/4 full.
    bucket_t *bucket = cache->find(key, receiver);
    if (bucket->key() == 0) cache->incrementOccupied();
    bucket->set(key, imp);
}
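Setting a breakpoint and checking the call stack confirms that we are on the right track.
cache_fill_nolock is fairly involved, so let's go through it step by step.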
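① if (!cls->isInitialized()) return;
Check whether the class has been initialized; if not, return.
② if (cache_getImp(cls, sel)) return;
Given cls and sel, return if the imp is already in the cache; otherwise go on to the next step.
③ cache_t *cache = getCache(cls);
Call getCache to get the cache object of cls.
④ cache_key_t key = getKey(sel);
Call getKey to get the cache key, which is simply the SEL cast to cache_key_t.
⑤ mask_t newOccupied = cache->occupied() + 1;
Add 1 to the number of slots already occupied in cache to get the new occupancy, newOccupied.
⑥ mask_t capacity = cache->capacity();
Read the current capacity of the cache.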
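Then check how full the cache would be:
if (cache->isConstantEmptyCache()) {
    // Cache is read-only. Replace it.
    cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
}
else if (newOccupied <= capacity / 4 * 3) {
    // Cache is less than 3/4 full. Use it as-is.
}
else {
    // Cache is too full. Expand it.
    cache->expand();
}
If the cache is still the constant empty cache, fresh buckets are allocated for it. If the new occupancy is <= three quarters of the cache capacity, the cache is used as-is. Otherwise the cache is expanded first.
⑦ bucket_t *bucket = cache->find(key, receiver);
Use key to find the corresponding bucket_t in the cache.
⑧ if (bucket->key() == 0) cache->incrementOccupied();
If the key of the bucket found in ⑦ is 0, increment _occupied.
⑨ bucket->set(key, imp);
Store key and imp together in the bucket.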
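Summary:
cache_fill_nolock first gets the class's cache. If the cache is empty, it allocates new buckets to replace it. If the target occupancy (newOccupied, the occupancy after this entry is cached) is greater than three quarters of the capacity, it expands the cache and then stores the entry in the bucket matching key. Otherwise it stores the entry in the bucket matching key directly.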
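The flow summarized here can be sketched as a toy model. ToyCache and ToyBucket are hypothetical names, the model is single-threaded, uses plain integers in place of SEL and IMP, and skips the isInitialized and cache_getImp checks, so it only illustrates the occupancy logic and is not the runtime's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for bucket_t: key 0 means "empty slot".
struct ToyBucket {
    std::uintptr_t key = 0;
    std::uintptr_t imp = 0;
};

struct ToyCache {
    std::vector<ToyBucket> buckets;   // an empty vector models the empty cache
    std::uint32_t mask = 0;
    std::uint32_t occupied = 0;

    std::uint32_t capacity() const { return mask ? mask + 1 : 0; }

    // Like reallocate(): fresh zeroed buckets, old entries are not copied.
    void reallocate(std::uint32_t newCapacity) {
        assert(newCapacity > 0 && (newCapacity & (newCapacity - 1)) == 0);
        buckets.assign(newCapacity, ToyBucket{});
        mask = newCapacity - 1;
        occupied = 0;
    }

    // Follows the branch structure of cache_fill_nolock().
    void fill(std::uintptr_t key, std::uintptr_t imp) {
        assert(key != 0);
        const std::uint32_t newOccupied = occupied + 1;
        const std::uint32_t cap = capacity();
        if (buckets.empty()) {
            reallocate(cap ? cap : 4);            // 4 plays INIT_CACHE_SIZE
        } else if (newOccupied <= cap / 4 * 3) {
            // At most 3/4 full after this insert: use the cache as-is.
        } else {
            reallocate(cap * 2);                  // too full: expand
        }
        // Probe from key & mask; a free slot exists because the cache
        // is never allowed to fill beyond 3/4.
        std::uint32_t i = static_cast<std::uint32_t>(key & mask);
        while (buckets[i].key != 0 && buckets[i].key != key) {
            i = (i + 1) & mask;
        }
        if (buckets[i].key == 0) ++occupied;
        buckets[i].key = key;
        buckets[i].imp = imp;
    }
};
```

Filling four distinct keys into an empty ToyCache leaves mask at 3 and occupied at 3 after the third fill. The fourth fill expands the cache, so mask becomes 7 and occupied drops back to 1, which would be consistent with _mask growing once doThird is cached.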
With the main flow of cache_fill_nolock covered, let's look more closely at some of the methods it relies on.
void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity)
{
    bool freeOld = canBeFreed();
    bucket_t *oldBuckets = buckets();
    bucket_t *newBuckets = allocateBuckets(newCapacity);
    // Cache's old contents are not propagated.
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this
    assert(newCapacity > 0);
    assert((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);
    setBucketsAndMask(newBuckets, newCapacity - 1);
    if (freeOld) {
        cache_collect_free(oldBuckets, oldCapacity);
        cache_collect(false);
    }
}

bucket_t *allocateBuckets(mask_t newCapacity)
{
    // Allocate one extra bucket to mark the end of the list.
    // This can't overflow mask_t because newCapacity is a power of 2.
    // fixme instead put the end mark inline when +1 is malloc-inefficient
    bucket_t *newBuckets = (bucket_t *)
        calloc(cache_t::bytesForCapacity(newCapacity), 1);
    bucket_t *end = cache_t::endMarker(newBuckets, newCapacity);
#if __arm__
    // End marker's key is 1 and imp points BEFORE the first bucket.
    // This saves an instruction in objc_msgSend.
    end->setKey((cache_key_t)(uintptr_t)1);
    end->setImp((IMP)(newBuckets - 1));
#else
    // End marker's key is 1 and imp points to the first bucket.
    end->setKey((cache_key_t)(uintptr_t)1);
    end->setImp((IMP)newBuckets);
#endif
    if (PrintCaches) recordNewCache(newCapacity);
    return newBuckets;
}

void cache_t::setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask)
{
    // objc_msgSend uses mask and buckets with no locks.
    // It is safe for objc_msgSend to see new buckets but old mask.
    // (It will get a cache miss but not overrun the buckets' bounds).
    // It is unsafe for objc_msgSend to see old buckets and new mask.
    // Therefore we write new buckets, wait a lot, then write new mask.
    // objc_msgSend reads mask first, then buckets.
    // ensure other threads see buckets contents before buckets pointer
    mega_barrier();
    _buckets = newBuckets;
    // ensure other threads see new buckets before new mask
    mega_barrier();
    _mask = newMask;
    _occupied = 0;
}
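oldBuckets holds the current buckets.
allocateBuckets allocates the new bucket_t array, which is kept in newBuckets.
setBucketsAndMask stores the newly created buckets, sets mask to newCapacity - 1, and resets occupied to zero (no methods are cached yet).
If the old buckets can be freed, cache_collect_free releases the old buckets and capacity.
Why wipe the old cache with cache_collect_free instead of re-reading and rewriting it or copying its memory? First, rewriting in place is not safe; second, wiping is fast.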
void cache_t::expand()
{
    cacheUpdateLock.assertLocked();
    uint32_t oldCapacity = capacity();
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;
    if ((uint32_t)(mask_t)newCapacity != newCapacity) {
        // mask overflow - can't grow further
        // fixme this wastes one bit of mask
        newCapacity = oldCapacity;
    }
    reallocate(oldCapacity, newCapacity);
}

enum {
    INIT_CACHE_SIZE_LOG2 = 2,
    INIT_CACHE_SIZE = (1 << INIT_CACHE_SIZE_LOG2)
};

mask_t cache_t::capacity()
{
    return mask() ? mask()+1 : 0;
}
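oldCapacity is mask + 1 (or 0 when mask is 0).
If oldCapacity is nonzero, newCapacity is twice oldCapacity; otherwise it is INIT_CACHE_SIZE.
INIT_CACHE_SIZE is binary 100, which is decimal 4.
reallocate is then called with oldCapacity and newCapacity.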
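The growth rule can be tried out in isolation. The sketch below is a rough model rather than the runtime's code: kInitCacheSize, capacityFromMask and nextCapacity are hypothetical names, and the MaskT template parameter is an assumption added here so that the overflow guard can be exercised with a narrow 16-bit mask type.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kInitCacheSizeLog2 = 2;
constexpr std::uint32_t kInitCacheSize = 1u << kInitCacheSizeLog2;  // binary 100 == 4

// Same rule as capacity(): mask + 1, or 0 while the mask is still 0.
inline std::uint32_t capacityFromMask(std::uint32_t mask) {
    return mask ? mask + 1 : 0;
}

// Same rule as expand(): double the capacity, start at kInitCacheSize,
// and stop growing once the new capacity no longer fits in MaskT.
template <typename MaskT>
std::uint32_t nextCapacity(std::uint32_t oldCapacity) {
    // Capacities are 0 or a power of two.
    assert(oldCapacity == 0 || (oldCapacity & (oldCapacity - 1)) == 0);
    std::uint32_t newCapacity = oldCapacity ? oldCapacity * 2 : kInitCacheSize;
    if (static_cast<std::uint32_t>(static_cast<MaskT>(newCapacity)) != newCapacity) {
        newCapacity = oldCapacity;   // mask overflow: keep the old capacity
    }
    return newCapacity;
}
```

Starting from an empty cache the capacities run 4, 8, 16 and so on. With a 16-bit mask type the guard kicks in at 32768, because 65536 does not survive the round trip through MaskT.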
cache_t::find looks for the bucket that matches a key:
bucket_t * cache_t::find(cache_key_t k, id receiver)
{
    assert(k != 0);
    bucket_t *b = buckets();
    mask_t m = mask();
    mask_t begin = cache_hash(k, m);
    mask_t i = begin;
    do {
        if (b[i].key() == 0 || b[i].key() == k) {
            return &b[i];
        }
    } while ((i = cache_next(i, m)) != begin);
    // hack
    Class cls = (Class)((uintptr_t)this - offsetof(objc_class, cache));
    cache_t::bad_cache(receiver, (SEL)k, cls);
}
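buckets() returns all the cache buckets of the current cache_t.
mask() returns the mask_t value of the current cache_t, which is the cache capacity minus one.
cache_hash computes key & mask, which serves as the starting index of the loop.
The do-while loop walks the bucket_t array. If a key is 0, nothing has been cached at that index yet, so the loop stops and returns the bucket_t at that position. If a key equals the k being looked up, the cache has hit and that bucket is returned directly.
cache_next returns (i+1) & mask to advance the index.
If the loop wraps around to begin without finding a slot, bad_cache is called.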
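The probing loop can be reproduced over a plain array of keys. This is a minimal sketch under stated assumptions: findSlot is a hypothetical helper, keys are bare integers with 0 meaning "empty", the step is the (i+1) & mask form described here, and a return value of -1 stands in for the point where the runtime would call bad_cache.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the index of the slot that holds k, or of the first empty slot
// met while probing, or -1 if the probe wraps around without finding one.
inline int findSlot(const std::vector<std::uintptr_t> &keys, std::uintptr_t k) {
    assert(k != 0);
    // The table size must be a power of two so that size - 1 works as a mask.
    assert(!keys.empty() && (keys.size() & (keys.size() - 1)) == 0);
    const std::uint32_t mask = static_cast<std::uint32_t>(keys.size() - 1);
    const std::uint32_t begin = static_cast<std::uint32_t>(k & mask);  // cache_hash
    std::uint32_t i = begin;
    do {
        if (keys[i] == 0 || keys[i] == k) {
            return static_cast<int>(i);
        }
    } while ((i = (i + 1) & mask) != begin);                           // cache_next
    return -1;  // table is full and k is absent
}
```

Two keys that hash to the same index end up in neighbouring slots: the second one walks forward until it meets an empty slot, wrapping from the last index back to 0 if needed.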
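LRU stands for Least Recently Used. The core idea of this policy is to evict the least recently used content first. The method cache applies a similar idea: when it expands, the old entries are dropped and only methods called afterwards are cached again.
mask exists as a member of cache_t and represents the cache capacity minus one.
For the buckets, mask is mainly used by the hash during cache lookup.
capacity changes mainly when the cache is expanded by cache->expand(). When the cache is three quarters full, it is expanded to twice its previous size, which is done to avoid hash collisions.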
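In a hash table, the load factor expresses how full the table is. The larger the load factor, the fewer free slots remain, the more collisions occur, and the worse the table performs.
With a load factor of 3/4, space utilization is fairly high while a good share of hash collisions is still avoided, which improves space efficiency.
For details, see "Why does HashMap default to a load factor of 0.75?"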
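To make the 3/4 rule concrete, here is a small sketch of the threshold arithmetic. fillThreshold and needsExpand are hypothetical helpers that restate the comparison newOccupied <= capacity / 4 * 3 from cache_fill_nolock; the integer division happens first, which is exact here because capacities are powers of two no smaller than 4.

```cpp
#include <cassert>
#include <cstdint>

// Largest occupancy the cache accepts without expanding.
inline std::uint32_t fillThreshold(std::uint32_t capacity) {
    // Capacities in this model are powers of two, starting at 4.
    assert(capacity >= 4 && (capacity & (capacity - 1)) == 0);
    return capacity / 4 * 3;
}

// True when caching one more entry would push the cache past 3/4 full.
inline bool needsExpand(std::uint32_t occupied, std::uint32_t capacity) {
    return occupied + 1 > fillThreshold(capacity);
}
```

A capacity of 4 therefore holds at most 3 entries, 8 holds 6 and 16 holds 12, so at least a quarter of the slots stay free for probing.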
static inline mask_t cache_hash(cache_key_t key, mask_t mask) {
    return (mask_t)(key & mask);
}
The method cache is unordered, because the slot index is computed by a hash and depends on the values of key and mask.
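A quick way to see the lack of ordering is to hash a few keys under two different masks. slotFor is a hypothetical wrapper around the same key & mask computation, and the key values used in the note after the code are made up for illustration; real keys are selector addresses.

```cpp
#include <cassert>
#include <cstdint>

// Same computation as cache_hash: keep only the low bits selected by mask.
inline std::uint32_t slotFor(std::uintptr_t key, std::uint32_t mask) {
    assert((mask & (mask + 1)) == 0);   // mask must have the form 2^n - 1
    return static_cast<std::uint32_t>(key & mask);
}
```

With mask 3, the keys 0x1f3e, 0x1f20 and 0x1f15 land in slots 2, 0 and 1; with mask 7 the same keys land in slots 6, 0 and 5. The slot order follows the low bits of each key, not the order in which the methods were called.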
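cls has a cache_t member, and the buckets in cache_t contain multiple bucket entries, each storing a method implementation imp and the cache_key_t key obtained by casting the selector sel.
For the buckets, mask is mainly used by the hash during cache lookup.
capacity gives the number of buckets in cache_t.
The main purpose of the cache is to make message sending faster through a series of strategies.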
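There is not much to cache_t, but it is fairly convoluted, and reading the source a few more times will deepen your understanding. The next article covers objc_msgSend, which runs before cache_fill_nolock, so it should help to some extent in understanding cache_t.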