Source Code Analysis of the Objective-C Method Cache on iOS

Date: 2020-06-06


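Preface

I have put together a series of articles on Objective-C internals, and I hope they help. This one walks through the source code behind the method cache. Earlier articles in the series:

1. How alloc creates Objective-C objects on iOS

2. Memory alignment of Objective-C objects on iOS

3. How isa works under the hood in Objective-C on iOS

4. Objective-C source code analysis on iOS: class structure

In day-to-day development, have you ever wondered whether Apple caches methods that have already been used so that frequent calls stay fast? And if it does, how exactly does that cache work? To answer these questions, this article analyzes the source of cache_t.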

1. cache_t

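The previous article, "Objective-C source code analysis on iOS: class structure", showed that cache_t is a member of the objc_class struct and occupies 16 bytes. The source of cache_t is as follows: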

struct cache_t {
    struct bucket_t *_buckets;
    mask_t _mask;
    mask_t _occupied;
    ...
};

struct bucket_t {
private:
    // IMP-first is better for arm64e ptrauth and no worse for arm64.
    // SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
    MethodCacheIMP _imp;
    cache_key_t _key;
#else
    cache_key_t _key;
    MethodCacheIMP _imp;
#endif
};

using MethodListIMP = IMP;
using MethodCacheIMP = IMP;   // when pointer authentication is not enabled
typedef uintptr_t cache_key_t;

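From the source we can see that a method's selector (SEL) and function address (IMP) are cached in a bucket_t (also called a hash bucket). For the rest of this article I defined a TestObject class as follows: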

#import <Foundation/Foundation.h>

NS_ASSUME_NONNULL_BEGIN

@interface TestObject : NSObject{
    NSString *nickName;
}

@property(nonatomic,copy) NSString *name;

-(void)sayName;
-(void)sayHello;
-(void)sayTest;
+(void)sayNickName;

@end

NS_ASSUME_NONNULL_END

#import "TestObject.h"

@implementation TestObject

-(void)sayName{
    // __func__ is a C string, so it is printed with %s rather than %p
    NSLog(@"%s",__func__);
}

-(void)sayHello{
    NSLog(@"%s",__func__);
}

-(void)sayTest{
    NSLog(@"%s",__func__);
}

+(void)sayNickName{
    NSLog(@"%s",__func__);
}

@end

// Calling code
TestObject *testObject = [[TestObject alloc] init];
Class tClass = object_getClass(testObject);
[testObject sayName];
[testObject sayHello];
NSLog(@"%@",testObject);


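Because an instance's methods are looked up in its class, we can check whether instance methods end up in cache_t by using lldb commands to locate the class's cache_t and inspect it, as shown in the figure below.

The figure shows that after calling three methods on TestObject (including init), both _mask and _occupied are 3. Now let's call one more method: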

TestObject *testObject = [[TestObject alloc] init];
Class tClass = object_getClass(testObject);
[testObject sayName];
[testObject sayHello];
[testObject sayTest];
NSLog(@"%@",testObject);

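Inspecting again with lldb, _mask is now 7 but _occupied is 1. The buckets array contains only the sayTest method, it is not stored in call order, and the other methods are gone. So the cache does not simply store one entry per call; it applies some processing to the cached methods.

1.1 The fields of cache_t

_buckets: an array of bucket_t structs. Each bucket_t stores a method's SEL (as a memory address) and its IMP.
_mask: the array capacity minus one, used as a bit mask. Because the array size is always a power of two, _mask has a binary form such as 000011, 000111 or 001111, which makes it exactly the mask needed to take the hash modulo the capacity. ANDing a key with it is guaranteed to give an index inside the cache.
_occupied: the number of methods currently cached, that is, how many slots of the array are in use.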
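To make the role of _mask concrete, here is a minimal C++ sketch, not taken from the runtime. The helper names mask_for_capacity and bucket_index are hypothetical; mask_t and cache_key_t follow the typedefs in the runtime source quoted in this article. It shows that for a power-of-two capacity, key & (capacity - 1) equals key % capacity and therefore always falls inside the array.

```cpp
#include <cassert>
#include <cstdint>

using mask_t = uint32_t;        // LP64 typedef from the runtime source
using cache_key_t = uintptr_t;  // a SEL cast to an integer

// Hypothetical helper: the mask that belongs to a power-of-two capacity.
inline mask_t mask_for_capacity(mask_t capacity) {
    // A power of two has exactly one bit set.
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    return capacity - 1;
}

// Hypothetical helper with the same shape as the runtime's cache_hash:
// keep only the low bits selected by the mask.
inline mask_t bucket_index(cache_key_t key, mask_t mask) {
    return static_cast<mask_t>(key & mask);
}
```

With a capacity of 4 the mask is 3 (binary 011) and with a capacity of 8 it is 7 (binary 111), which matches the _mask values seen in lldb above.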

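2. How the method cache works

An Objective-C method call is, at bottom, a message send (objc_msgSend): the runtime uses the method's SEL to look up its IMP. The cache_t cache is read during the objc_msgSend lookup, and writes to it start in the cache_fill function, as the following comment from the source shows: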

* Cache readers (PC-checked by collecting_in_critical())
 * objc_msgSend*
 * cache_getImp
 *
 * Cache writers (hold cacheUpdateLock while reading or writing; not PC-checked)
 * cache_fill         (acquires lock)
 * cache_expand       (only called from cache_fill)
 * cache_create       (only called from cache_expand)
 * bcopy               (only called from instrumented cache_expand)
 * flush_caches        (acquires lock)
 * cache_flush        (only called from cache_fill and flush_caches)
 * cache_collect_free (only called from cache_expand and cache_flush)

2.1 cache_fill

Caching a method starts in the cache_fill function, whose source is:

void cache_fill(Class cls, SEL sel, IMP imp, id receiver)
{
#if !DEBUG_TASK_THREADS
    mutex_locker_t lock(cacheUpdateLock);
    cache_fill_nolock(cls, sel, imp, receiver);
#else
    _collecting_in_critical();
    return;
#endif
}

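cache_fill receives the class (cls), the method's SEL and IMP, and the receiver. It takes cacheUpdateLock and hands the work to cache_fill_nolock.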

2.2 cache_fill_nolock

static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver)
{
    cacheUpdateLock.assertLocked();

    // Never cache before +initialize is done
    if (!cls->isInitialized()) return;

    // Make sure the entry wasn't added to the cache by some other thread
    // before we grabbed the cacheUpdateLock.
    if (cache_getImp(cls, sel)) return;

    cache_t *cache = getCache(cls);
    cache_key_t key = getKey(sel);

    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = cache->occupied() + 1;
    mask_t capacity = cache->capacity();
    if (cache->isConstantEmptyCache()) {
        // Cache is read-only. Replace it.
        cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
    }
    else if (newOccupied <= capacity / 4 * 3) {
        // Cache is less than 3/4 full. Use it as-is.
    }
    else {
        // Cache is too full. Expand it.
        cache->expand();
    }

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the
    // minimum size is 4 and we resized at 3/4 full.
    bucket_t *bucket = cache->find(key, receiver);
    if (bucket->key() == 0) cache->incrementOccupied();
    bucket->set(key, imp);
}

cache_t *getCache(Class cls)
{
    assert(cls);
    return &cls->cache;
}

cache_key_t getKey(SEL sel)
{
    assert(sel);
    return (cache_key_t)sel;
}

/* Initial cache bucket count. INIT_CACHE_SIZE must be a power of two. */
enum {
    INIT_CACHE_SIZE_LOG2 = 2,
    INIT_CACHE_SIZE      = (1 << INIT_CACHE_SIZE_LOG2)
};

#if __LP64__
typedef uint32_t mask_t;  // x86_64 & arm64 asm are less efficient with 16-bits
#else
typedef uint16_t mask_t;
#endif
typedef uintptr_t cache_key_t;

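Let's go through the functions used in this source. getCache(cls) obtains the class's cache_t from cls, and getKey(sel) converts the SEL to the cache_key_t type. Below is the source of cache->occupied() and cache->capacity().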

mask_t cache_t::occupied() 
{
    return _occupied;
}

mask_t cache_t::capacity() 
{
    return mask() ? mask()+1 : 0; 
}

mask_t cache_t::mask() 
{
    return _mask; 
}


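_occupied is the number of cached methods and starts at 0, so on the first call newOccupied is 1: this fill is about to take one slot. capacity() returns the number of buckets in the cache. It is also 0 initially, and once mask() has a value it is mask() + 1. Three branches follow. The first, isConstantEmptyCache(), checks whether the class is still using the read-only empty cache; if so, reallocate is called. The second checks whether newOccupied is at most 3/4 of the capacity; if so, nothing needs to be done. Otherwise the cache has to grow through expand(). Because INIT_CACHE_SIZE is 4, the first call to reallocate receives 0 and 4.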
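The three branches can be isolated in a small C++ sketch. This is only an illustration under stated assumptions: FillAction and decide are hypothetical names, and the function keeps just the comparison logic from cache_fill_nolock, including the integer arithmetic of capacity / 4 * 3.

```cpp
#include <cassert>
#include <cstdint>

using mask_t = uint32_t;

// Hypothetical labels for the three branches of cache_fill_nolock.
enum class FillAction { Reallocate, UseAsIs, Expand };

// Hypothetical helper: which branch would be taken for the current state.
inline FillAction decide(bool isConstantEmptyCache, mask_t occupied, mask_t capacity) {
    // The capacity is either 0 (no buckets yet) or a power of two.
    assert(capacity == 0 || (capacity & (capacity - 1)) == 0);
    mask_t newOccupied = occupied + 1;          // the slot this fill will take
    if (isConstantEmptyCache) return FillAction::Reallocate;
    if (newOccupied <= capacity / 4 * 3) return FillAction::UseAsIs;
    return FillAction::Expand;
}
```

With a capacity of 4 the threshold is 3, so the first three methods fit and the fourth one triggers an expansion, which is the behaviour observed with lldb in section 1.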

2.2.1 reallocate

As the name suggests, this function reinitializes the cache. Its source is as follows:

void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity)
{
    // Flag: can the old cache be freed?
    bool freeOld = canBeFreed();
    // Get the old buckets
    bucket_t *oldBuckets = buckets();
    // Create the new buckets
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    // Cache's old contents are not propagated.
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    assert(newCapacity > 0);
    assert((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    // Install the new buckets and set the mask
    setBucketsAndMask(newBuckets, newCapacity - 1);

    if (freeOld) {
        // Free the old buckets
        cache_collect_free(oldBuckets, oldCapacity);
        cache_collect(false);
    }
}

bool cache_t::canBeFreed()
{
    return !isConstantEmptyCache();
}

bucket_t *allocateBuckets(mask_t newCapacity)
{
    // Allocate one extra bucket to mark the end of the list.
    // This can't overflow mask_t because newCapacity is a power of 2.
    // fixme instead put the end mark inline when +1 is malloc-inefficient
    bucket_t *newBuckets = (bucket_t *)
        calloc(cache_t::bytesForCapacity(newCapacity), 1);

    bucket_t *end = cache_t::endMarker(newBuckets, newCapacity);

#if __arm__
    // End marker's key is 1 and imp points BEFORE the first bucket.
    // This saves an instruction in objc_msgSend.
    end->setKey((cache_key_t)(uintptr_t)1);
    end->setImp((IMP)(newBuckets - 1));
#else
    // End marker's key is 1 and imp points to the first bucket.
    end->setKey((cache_key_t)(uintptr_t)1);
    end->setImp((IMP)newBuckets);
#endif
    
    if (PrintCaches) recordNewCache(newCapacity);

    return newBuckets;
}


void cache_t::setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask)
{
    // objc_msgSend uses mask and buckets with no locks.
    // It is safe for objc_msgSend to see new buckets but old mask.
    // (It will get a cache miss but not overrun the buckets' bounds).
    // It is unsafe for objc_msgSend to see old buckets and new mask.
    // Therefore we write new buckets, wait a lot, then write new mask.
    // objc_msgSend reads mask first, then buckets.

    // ensure other threads see buckets contents before buckets pointer
    mega_barrier();

    _buckets = newBuckets;

    // ensure other threads see new buckets before new mask
    mega_barrier();

    _mask = newMask;
    _occupied = 0;
}

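The source shows that reallocate fetches the old buckets and creates new ones; the old buckets are freed when canBeFreed() allows it, and their contents are not copied over. allocateBuckets uses calloc to allocate zeroed memory for the bucket_t array plus one extra end-marker bucket, and it sets the key and imp of that end marker. setBucketsAndMask then assigns _buckets and _mask. On the first call newMask is 3, and _occupied is set to 0 because nothing has been cached yet; the values are only being initialized. This explains why the first lldb inspection showed a mask of 3.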

2.2.2 expand

When newOccupied is greater than 3/4 of the capacity, the cache has to grow, and the expand() function runs:

void cache_t::expand()
{
    cacheUpdateLock.assertLocked();
    
    uint32_t oldCapacity = capacity();
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;

    if ((uint32_t)(mask_t)newCapacity != newCapacity) {
        // mask overflow - can't grow further
        // fixme this wastes one bit of mask
        newCapacity = oldCapacity;
    }

    reallocate(oldCapacity, newCapacity);
}

mask_t cache_t::capacity()
{
    return mask() ? mask()+1 : 0;
}

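At the point where expansion is needed, capacity() is already 4, so oldCapacity is 4 and newCapacity is 8. reallocate is then called with 4 and 8. Following the reallocate flow described above, the old buckets are discarded, mask becomes 7, and occupied is reset to 0. So why did lldb show occupied as 1? Because once the branch has finished, execution continues with: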

// Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the 
    // minimum size is 4 and we resized at 3/4 full.
    bucket_t *bucket = cache->find(key, receiver);
    if (bucket->key() == 0) cache->incrementOccupied();
    bucket->set(key, imp);
    
    void cache_t::incrementOccupied() 
{
    _occupied++;
}


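Here find uses the key and receiver to look up a bucket_t. If the returned bucket's key() is 0, the slot was empty, so _occupied is incremented by 1. The bucket is then filled with the key and imp.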
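Putting reallocate, expand and the final insert together, the following toy model replays the lldb experiment from section 1. It is a sketch under assumptions, not the runtime's code: ToyCache, Bucket and after_fills are hypothetical names, an int stands in for the IMP, an empty vector stands in for the read-only empty cache, and locking, the end marker and mask overflow handling are left out. The probe steps forward here for simplicity; the stepping direction does not affect the counts.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using mask_t = uint32_t;
using cache_key_t = uintptr_t;

struct Bucket { cache_key_t key = 0; int imp = 0; };  // int stands in for IMP

// Toy model of cache_t (hypothetical, for illustration only).
struct ToyCache {
    std::vector<Bucket> buckets;  // empty == still the shared empty cache
    mask_t mask = 0;
    mask_t occupied = 0;

    mask_t capacity() const { return mask ? mask + 1 : 0; }

    // Like reallocate: fresh zeroed buckets, old entries are not copied.
    void reallocate(mask_t newCapacity) {
        buckets.assign(newCapacity, Bucket{});
        mask = newCapacity - 1;
        occupied = 0;
    }

    // Returns the bucket holding k, or the first empty bucket on its probe path.
    Bucket *find(cache_key_t k) {
        mask_t begin = static_cast<mask_t>(k & mask);
        mask_t i = begin;
        do {
            if (buckets[i].key == 0 || buckets[i].key == k) return &buckets[i];
            i = (i + 1) & mask;
        } while (i != begin);
        return nullptr;  // cannot happen while the cache stays under 3/4 full
    }

    bool contains(cache_key_t k) {
        if (buckets.empty()) return false;
        Bucket *b = find(k);
        return b != nullptr && b->key == k;
    }

    void fill(cache_key_t key, int imp) {
        assert(key != 0);
        if (contains(key)) return;             // already cached
        mask_t newOccupied = occupied + 1;
        mask_t cap = capacity();
        if (buckets.empty())                   reallocate(cap ? cap : 4);
        else if (newOccupied <= cap / 4 * 3)   { /* use as-is */ }
        else                                   reallocate(cap * 2);  // expand
        Bucket *b = find(key);
        if (b->key == 0) ++occupied;
        b->key = key;
        b->imp = imp;
    }
};

// Hypothetical driver: cache n distinct made-up selectors 0x1011, 0x1022, ...
inline ToyCache after_fills(int n) {
    ToyCache c;
    for (int i = 1; i <= n; ++i) c.fill(0x1000 + 0x11 * i, i);
    return c;
}
```

After three fills the model has a mask of 3 and three occupied slots. The fourth fill doubles the capacity, drops the first three entries, and leaves a mask of 7 with a single occupied slot, which is the state lldb showed after calling sayTest.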

2.2.3 The find function

bucket_t * cache_t::find(cache_key_t k, id receiver)
{
    assert(k != 0);

    bucket_t *b = buckets();
    mask_t m = mask();
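    // cache_hash computes begin = k & m, the index for key k; it is the starting index of the search
    mask_t begin = cache_hash(k, m);
    // i starts at begin and is the index that moves through the table
    mask_t i = begin;
    do {
        if (b[i].key() == 0  ||  b[i].key() == k) {
            // If the bucket at index i has key == k, the lookup succeeded; return that bucket.
            // If key == 0, nothing has been cached at index i yet; return that bucket too, which ends the search.
            return &b[i];
        }
    } while ((i = cache_next(i, m)) != begin);
    
    // On arm64, cache_next is effectively i = i - 1 (other architectures step forward instead):
    // each pass of the do loop moves to the previous slot of the hash table and compares its key with k again.
    // When i reaches 0, the first slot, it is reset to mask so that it points at the last slot and the backward scan continues.
    // In other words the table is treated as a ring: starting from begin and decrementing, the index must return to begin after one full lap.
    // If neither the bucket for key k nor an empty bucket has been found by then, the loop ends, the lookup has failed, and bad_cache is called.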
 
    // hack
    Class cls = (Class)((uintptr_t)this - offsetof(objc_class, cache));
    cache_t::bad_cache(receiver, (SEL)k, cls);
}

static inline mask_t cache_hash(cache_key_t key, mask_t mask) 
{
    return (mask_t)(key & mask);
}

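find ANDs the key with mask, which is the hash function, to get the starting index begin, and returns the address of the bucket_t it locates from there. Because the index is derived from the selector's address and not from the order of the calls, the entries in buckets are not stored in order.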
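The probing can be tried out on a plain array of keys. The sketch below is a simplified model under assumptions: probe is a hypothetical helper that returns a slot index instead of a bucket pointer, and because the source of cache_next is not quoted in this article, the backward stepping rule is taken from the description in the comments of find. Where the runtime calls bad_cache, the sketch returns -1.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using mask_t = uint32_t;
using cache_key_t = uintptr_t;

inline mask_t cache_hash(cache_key_t key, mask_t mask) {
    return static_cast<mask_t>(key & mask);
}

// Assumed stepping rule: walk backwards and wrap from slot 0 to the last slot.
inline mask_t cache_next(mask_t i, mask_t mask) {
    return i ? i - 1 : mask;
}

// Hypothetical helper: slot where key k lives or would be inserted,
// or -1 if a full lap finds neither k nor an empty slot.
// keys.size() must be a power of two; 0 marks an empty slot.
inline int probe(const std::vector<cache_key_t>& keys, cache_key_t k) {
    assert(k != 0 && !keys.empty());
    mask_t m = static_cast<mask_t>(keys.size()) - 1;
    mask_t begin = cache_hash(k, m);
    mask_t i = begin;
    do {
        if (keys[i] == 0 || keys[i] == k) return static_cast<int>(i);
    } while ((i = cache_next(i, m)) != begin);
    return -1;
}
```

For example, the made-up keys 0x14 and 0x18 both hash to slot 0 in a table of four. If 0x14 already sits in slot 0, probing for 0x18 wraps around to slot 3, so two colliding selectors end up in slots unrelated to the order in which they were called.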

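Class methods cannot be found in the class's cache_t because they are cached in the metaclass. To look up a class method with lldb, first follow isa to the metaclass, then repeat the steps above to verify that the class methods are stored there.

3. Summary

An Objective-C method call is, at bottom, a message send (objc_msgSend): the runtime uses the method's SEL to look up its IMP.

1. Methods are cached in cache_t: buckets points to the array of cached methods, mask holds the array capacity minus one, and occupied holds the number of slots currently in use.
2. When newOccupied, the number of slots that would be in use after the insert, is greater than 3/4 of capacity(), the cache has to be expanded.
3. During expansion the new mask is set to capacity() * 2 - 1, that is, twice the old capacity minus 1: the mask is 3 after the first allocation and 7 after the first expansion. The old buckets are discarded every time, and the method that triggered the expansion is then inserted into the new buckets.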