Objective-C Runtime: Data Structures

Date: 2020-05-31


Objects

When the type of an object is not known in advance, it is usually declared as id.

- (id)copy;

id denotes an object: it is a pointer to an instance.

typedef struct objc_object *id;

More precisely, id is an alias for a pointer to the objc_object struct.

struct objc_object {
    isa_t isa;
};

The objc_object struct has a single member, isa, of type isa_t, which carries information about the class the object belongs to.

isa_t

union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    Class cls;
    uintptr_t bits;
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h
    };
#endif
};

isa_t is a union, so cls and bits share the same storage: an isa_t can hold either a Class pointer or a 64-bit bits value, but only one of the two at any given time.
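
To make the "same storage" point concrete, here is a minimal sketch of a two-member union. FakeClass, IsaSketch, and bitsAfterStoringPointer are hypothetical names invented for illustration; this is not the runtime's definition of isa_t.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the runtime's class structure.
struct FakeClass { int placeholder; };

// Simplified model of isa_t: both members occupy the same bytes.
union IsaSketch {
    FakeClass *cls;
    std::uintptr_t bits;
};

// Stores a class pointer, then reads the same storage back as an integer.
// memcpy is used for the read because reading an inactive union member
// directly is undefined behaviour in standard C++.
inline std::uintptr_t bitsAfterStoringPointer(FakeClass *cls) {
    assert(cls != nullptr);
    IsaSketch isa;
    isa.cls = cls;
    std::uintptr_t raw;
    std::memcpy(&raw, &isa, sizeof raw);
    return raw;
}
```

Writing cls and then looking at the raw storage yields the pointer's own bit pattern, which is why a raw isa and a packed isa can live in the same field.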

# if __arm64__
# define ISA_MASK 0x0000000ffffffff8ULL
# define ISA_MAGIC_MASK 0x000003f000000001ULL
# define ISA_MAGIC_VALUE 0x000001a000000001ULL
# define ISA_BITFIELD \
      uintptr_t nonpointer        : 1;                                       \
      uintptr_t has_assoc         : 1;                                       \
      uintptr_t has_cxx_dtor      : 1;                                       \
      uintptr_t shiftcls          : 33; /*MACH_VM_MAX_ADDRESS 0x1000000000*/ \
      uintptr_t magic             : 6;                                       \
      uintptr_t weakly_referenced : 1;                                       \
      uintptr_t deallocating      : 1;                                       \
      uintptr_t has_sidetable_rc  : 1;                                       \
      uintptr_t extra_rc          : 19
# define RC_ONE (1ULL<<45)
# define RC_HALF (1ULL<<18)

# elif __x86_64__
# define ISA_MASK 0x00007ffffffffff8ULL
# define ISA_MAGIC_MASK 0x001f800000000001ULL
# define ISA_MAGIC_VALUE 0x001d800000000001ULL
# define ISA_BITFIELD \
      uintptr_t nonpointer        : 1;                                         \
      uintptr_t has_assoc         : 1;                                         \
      uintptr_t has_cxx_dtor      : 1;                                         \
      uintptr_t shiftcls          : 44; /*MACH_VM_MAX_ADDRESS 0x7fffffe00000*/ \
      uintptr_t magic             : 6;                                         \
      uintptr_t weakly_referenced : 1;                                         \
      uintptr_t deallocating      : 1;                                         \
      uintptr_t has_sidetable_rc  : 1;                                         \
      uintptr_t extra_rc          : 8
# define RC_ONE (1ULL<<56)
# define RC_HALF (1ULL<<7)

The memory layout of bits differs between architectures, but the members of the struct and their meanings are the same; only the field widths and positions vary.

isa initialization

inline void 
objc_object::initClassIsa(Class cls)
{
    if (DisableNonpointerIsa  ||  cls->instancesRequireRawIsa()) {
        initIsa(cls, false/*not nonpointer*/, false);
    } else {
        initIsa(cls, true/*nonpointer*/, false);
    }
}

inline void 
objc_object::initInstanceIsa(Class cls, bool hasCxxDtor)
{
    initIsa(cls, true, hasCxxDtor);
}


Whether the isa being initialized belongs to a class or to an instance, the following method is what ends up being called:

inline void 
objc_object::initIsa(Class cls, bool nonpointer, bool hasCxxDtor) 
{   
    if (!nonpointer) {
        isa = isa_t((uintptr_t)cls);
    } else {
        isa_t newisa(0);
        newisa.bits = ISA_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
        newisa.has_cxx_dtor = hasCxxDtor;
        newisa.shiftcls = (uintptr_t)cls >> 3;
        isa = newisa;
    }
}

With a raw (older-style) isa, isa simply stores the Class pointer. With a nonpointer (newer-style) isa, the 64-bit bits value is configured field by field.

ISA_MAGIC_VALUE in effect sets the nonpointer and magic members of the struct.

nonpointer: 0 means isa stores a Class pointer; 1 means it stores bits.
magic: lets the debugger tell whether this is a real object or uninitialized memory.

Next, has_cxx_dtor is set. It indicates whether the object has a C++ or ObjC destructor; an object without one can be deallocated faster.

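Finally, the class pointer is shifted right by 3 bits and stored in shiftcls. Class pointers are aligned to 8 bytes (not 8 bits), so their lowest three bits are always 0 and carry no information; dropping them lets the pointer fit into the narrower shiftcls field.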
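
The packing step can be sketched with plain integer arithmetic. The constants below are copied from the x86_64 branch of the listing above; packIsa and classAddressOf are hypothetical helper names, addresses are modelled as integers, and the bit positions assume the compiler allocates bit-fields starting from the least significant bit.

```cpp
#include <cassert>
#include <cstdint>

// Constants copied from the x86_64 branch of the listing above.
constexpr std::uint64_t kIsaMask       = 0x00007ffffffffff8ULL;
constexpr std::uint64_t kIsaMagicValue = 0x001d800000000001ULL;

// Builds nonpointer isa bits in the spirit of initIsa: start from the
// magic value (which already sets nonpointer and magic), then add
// has_cxx_dtor (bit 2) and the class address (shiftcls, bits 3..46).
inline std::uint64_t packIsa(std::uint64_t classAddress, bool hasCxxDtor) {
    // The address must be 8-byte aligned and fit in the shiftcls field.
    assert((classAddress & ~kIsaMask) == 0);
    std::uint64_t bits = kIsaMagicValue;
    bits |= static_cast<std::uint64_t>(hasCxxDtor) << 2;
    bits |= (classAddress >> 3) << 3;   // same as storing address >> 3 in shiftcls
    return bits;
}

// Recovers the class address: ISA_MASK keeps exactly bits 3..46.
inline std::uint64_t classAddressOf(std::uint64_t isaBits) {
    return isaBits & kIsaMask;
}
```

Because shiftcls starts at bit 3, shifting the pointer right by 3 and placing it in the field puts every address bit back at its original position, so a single AND with ISA_MASK is enough to get the class pointer back.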

isa_t members

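Member	Meaning
nonpointer	0: isa holds a raw `Class cls` pointer; 1: isa is a packed `uintptr_t bits` value
has_assoc	The object has, or once had, associated references; objects without them can be deallocated faster
has_cxx_dtor	The object has a C++ or ObjC destructor; objects without one can be deallocated faster
shiftcls	The class pointer, shifted right by 3 bits
magic	Lets the debugger tell a real object from uninitialized memory
weakly_referenced	The object is, or once was, pointed to by an ARC weak variable; objects without weak references can be deallocated faster
deallocating	The object is being deallocated
has_sidetable_rc	The reference count is too large for extra_rc; the overflow is kept in a side table
extra_rc	The reference count minus 1; a reference count of 10 is stored as 9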
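
The "reference count minus 1" rule for extra_rc can be sketched as follows. The bit positions come from the x86_64 ISA_BITFIELD listing, assuming bit-fields are allocated from the least significant bit; inlineRetainCount, retainOnce, and usesSideTable are hypothetical names, and overflow into the side table is deliberately not modelled.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kSidetableBit = 55;          // has_sidetable_rc
constexpr unsigned kExtraRcShift = 56;          // extra_rc occupies bits 56..63
constexpr std::uint64_t kRcOne = 1ULL << 56;    // RC_ONE on x86_64

// Retain count as seen from the isa alone: the stored value plus one.
inline std::uint64_t inlineRetainCount(std::uint64_t isaBits) {
    return 1 + (isaBits >> kExtraRcShift);
}

// One retain adds RC_ONE to the packed value.
inline std::uint64_t retainOnce(std::uint64_t isaBits) {
    // This sketch stops before extra_rc would overflow.
    assert((isaBits >> kExtraRcShift) != 0xff);
    return isaBits + kRcOne;
}

// True when part of the count is kept outside the isa.
inline bool usesSideTable(std::uint64_t isaBits) {
    return ((isaBits >> kSidetableBit) & 1) != 0;
}
```

Starting from ISA_MAGIC_VALUE (extra_rc of 0, count of 1), nine calls to retainOnce leave 9 in extra_rc and a count of 10, matching the last row of the table.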

Classes

When asking an object for its class, the return type is Class.

- (Class)class;

Class denotes a class: it is a pointer to a class object.

typedef struct objc_class *Class;

More precisely, Class is an alias for a pointer to the objc_class struct.

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
};

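objc_class declares three members of its own, in addition to the isa it inherits.

First, objc_class inherits from objc_object, so a class is itself an object. The inherited isa member carries information about the metaclass that the class object belongs to.
Second, superclass points to the class's superclass.
Third, cache caches recently called methods.
Finally, bits is, according to its comment, a pointer to a class_rw_t struct plus custom rr/alloc flags.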

cache_t

struct cache_t {
    explicit_atomic<struct bucket_t *> _buckets;
    explicit_atomic<mask_t> _mask;
    uint16_t _occupied;
};

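Called methods are cached in a cache_t struct, which has three members.

The first member, _buckets, points to an array of bucket_t.

The second member, _mask, is the number of allocated buckets minus 1; it doubles as the mask applied to a selector's hash to pick a bucket.

The third member, _occupied, is the number of buckets currently in use.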

bucket_t

struct bucket_t {
#if __arm64__
    explicit_atomic<uintptr_t> _imp;
    explicit_atomic<SEL> _sel;
#else
    explicit_atomic<SEL> _sel;
    explicit_atomic<uintptr_t> _imp;
#endif
};

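A bucket_t stores one key-value pair: _imp is a function pointer to the method's implementation, and _sel is the method's name or identifier. In other words, _buckets is a hash table of previously called methods, keyed by SEL with IMP values.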
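
How _buckets, _mask, and _occupied cooperate can be sketched as a small open-addressing hash table. Everything here (CacheSketch, Bucket, Sel, Imp, the pointer-based hash, linear probing) is an assumption made for illustration; the real cache also resizes itself and uses its own hashing and probing rules, none of which is reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Sel = const char *;   // stand-in for SEL
using Imp = void (*)();     // stand-in for IMP

struct Bucket {
    Sel sel = nullptr;
    Imp imp = nullptr;
};

struct CacheSketch {
    std::vector<Bucket> buckets;   // capacity must be a power of two
    std::uint32_t mask;            // capacity - 1
    std::uint32_t occupied = 0;    // buckets currently in use

    explicit CacheSketch(std::uint32_t capacity)
        : buckets(capacity), mask(capacity - 1) {
        assert(capacity != 0 && (capacity & mask) == 0);
    }

    // Masking the hash always yields a valid bucket index.
    std::uint32_t indexFor(Sel sel) const {
        auto raw = reinterpret_cast<std::uintptr_t>(sel);
        return static_cast<std::uint32_t>(raw) & mask;
    }

    // Inserts or updates an entry; returns false when the table is full.
    bool insert(Sel sel, Imp imp) {
        if (occupied == buckets.size()) return false;
        std::uint32_t i = indexFor(sel);
        while (buckets[i].sel != nullptr && buckets[i].sel != sel) {
            i = (i + 1) & mask;    // linear probing, wrapping via the mask
        }
        if (buckets[i].sel == nullptr) ++occupied;
        buckets[i].sel = sel;
        buckets[i].imp = imp;
        return true;
    }

    // Returns the cached implementation, or nullptr on a miss.
    Imp find(Sel sel) const {
        std::uint32_t i = indexFor(sel);
        for (std::uint32_t probes = 0; probes <= mask; ++probes) {
            if (buckets[i].sel == sel) return buckets[i].imp;
            if (buckets[i].sel == nullptr) return nullptr;
            i = (i + 1) & mask;
        }
        return nullptr;
    }
};
```

Keeping the capacity a power of two is what allows a single AND with the mask to replace a modulo operation when picking a bucket.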

SEL

typedef struct objc_selector *SEL;

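By its declaration, SEL is an alias for a pointer to an objc_selector struct. In the implementation, a SEL turns out to point at a char * value: it is really a C string mapped to a method, and can be thought of as an ID that distinguishes methods.

A SEL depends only on the method name:

Methods with the same name in different classes have the same selector.
Methods with the same name but different parameter types also have the same selector.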
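
One way to picture "a SEL depends only on the name" is a table that hands out a single canonical pointer per name. The sketch below is loosely modelled on the idea behind sel_registerName, but registerName and SelSketch are hypothetical, and the real runtime's selector table is implemented differently.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

using SelSketch = const char *;   // stand-in for SEL

// Returns one canonical pointer per distinct method name.
// std::unordered_set is node-based, so the stored strings (and the
// pointers returned from c_str()) stay valid as the table grows.
inline SelSketch registerName(const std::string &name) {
    assert(!name.empty());
    static std::unordered_set<std::string> table;
    return table.insert(name).first->c_str();
}
```

Since the class and the parameter types never enter the lookup, two methods that share a name necessarily map to the same pointer, and selectors can be compared with a plain pointer comparison.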

IMP

#if !OBJC_OLD_DISPATCH_PROTOTYPES
typedef void (*IMP)(void /* id, SEL, ... */ ); 
#else
typedef id _Nullable (*IMP)(id _Nonnull, SEL _Nonnull, ...); 
#endif

IMP is a function pointer that points to the concrete implementation of a method.
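
With the first typedef, an IMP has the generic type void (*)(void), so it has to be cast to the method's real signature before it is called. The following sketch shows that pattern with hypothetical stand-ins (ObjectSketch, IdSketch, SelSketch, ImpSketch, addToValue, callAddImp); it does not use the runtime headers.

```cpp
#include <cassert>

struct ObjectSketch { int value; };
using IdSketch  = ObjectSketch *;   // stand-in for id
using SelSketch = const char *;     // stand-in for SEL
using ImpSketch = void (*)(void);   // mirrors the first IMP typedef

// A "method implementation": self and _cmd come first, then the
// method's own arguments.
inline int addToValue(IdSketch self, SelSketch /*_cmd*/, int amount) {
    return self->value + amount;
}

// Casts the generic pointer back to the real signature, then calls it.
inline int callAddImp(ImpSketch imp, IdSketch self, SelSketch cmd, int amount) {
    assert(imp != nullptr);
    auto typed = reinterpret_cast<int (*)(IdSketch, SelSketch, int)>(imp);
    return typed(self, cmd, amount);
}
```

The cast is only safe when the target signature matches the function the pointer really refers to; calling through a mismatched type is undefined behaviour.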

class_data_bits_t

bits holds information about the class. It is a class_data_bits_t struct:

struct class_data_bits_t {
    // Values are the FAST_ flags above.
    uintptr_t bits;
};

Like isa_t, this struct stores its information in a single 64-bit bits value.

// class is a Swift class from the pre-stable Swift ABI
#define FAST_IS_SWIFT_LEGACY (1UL<<0)
// class is a Swift class from the stable Swift ABI
#define FAST_IS_SWIFT_STABLE (1UL<<1)
// class or superclass has default retain/release/autorelease/retainCount/
//   _tryRetain/_isDeallocating/retainWeakReference/allowsWeakReference
#define FAST_HAS_DEFAULT_RR (1UL<<2)
// data pointer
#define FAST_DATA_MASK 0x00007ffffffffff8UL

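The macros show what each of these bits means:

FAST_IS_SWIFT_LEGACY: whether the class is a Swift class from the pre-stable Swift ABI
FAST_IS_SWIFT_STABLE: whether the class is a Swift class from the stable Swift ABI
FAST_HAS_DEFAULT_RR: the class or a superclass has the default retain/release/autorelease/retainCount/_tryRetain/_isDeallocating/retainWeakReference/allowsWeakReference methods
FAST_DATA_MASK: the data pointer

Searching for uses of FAST_DATA_MASK turns up a getter/setter pair: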

class_rw_t* data() const {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}
void setData(class_rw_t *newData)
{
    ASSERT(!data()  ||  (newData->flags & (RW_REALIZING | RW_FUTURE)));
    // Set during realization or construction only. No locking needed.
    // Use a store-release fence because there may be concurrent
    // readers of data and data's contents.
    uintptr_t newBits = (bits & ~FAST_DATA_MASK) | (uintptr_t)newData;
    atomic_thread_fence(memory_order_release);
    bits = newBits;
}

These two methods show that the data pointer is a pointer to class_rw_t.
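
The way a pointer and a few flags share one word can be sketched like this. The flag and mask values are copied from the macros above; DataBitsSketch and its methods are hypothetical, and addresses are modelled as plain integers so the sketch does not depend on where real objects happen to live in memory.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kFastIsSwiftLegacy = 1ULL << 0;
constexpr std::uint64_t kFastIsSwiftStable = 1ULL << 1;
constexpr std::uint64_t kFastHasDefaultRR  = 1ULL << 2;
constexpr std::uint64_t kFastDataMask      = 0x00007ffffffffff8ULL;

struct DataBitsSketch {
    std::uint64_t bits = 0;

    // Masking strips the flag bits and leaves the data address.
    std::uint64_t dataAddress() const { return bits & kFastDataMask; }

    // Replaces the address while keeping the flag bits untouched.
    void setDataAddress(std::uint64_t address) {
        // The address must be 8-byte aligned and inside the masked range.
        assert((address & ~kFastDataMask) == 0);
        bits = (bits & ~kFastDataMask) | address;
    }

    void setFlag(std::uint64_t flag) { bits |= flag; }
    bool hasFlag(std::uint64_t flag) const { return (bits & flag) != 0; }
};
```

The three low bits are free for flags for the same reason as in isa_t: an 8-byte-aligned address never uses them.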

class_rw_t

struct class_rw_t {
    // Be warned that Symbolication knows the layout of this structure.
    uint32_t flags;
    uint16_t version;
    uint16_t witness;

    const class_ro_t *ro;

    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;

    Class firstSubclass;
    Class nextSiblingClass;

    char *demangledName;
    // remaining members and methods omitted
};

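The rw in class_rw_t stands for readwrite. Its ro member is a pointer to class_ro_t, where ro stands for readonly. ro stores the properties, methods, and adopted protocols that were fixed for the class at compile time. class_rw_t is what lets a class be extended at run time: its methods, properties, and protocols members are where the methods, properties, and protocols added by categories at run time are stored (in this version of the runtime, the compile-time lists from ro are attached to them as well).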

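The types of methods, properties, and protocols (method_array_t, property_array_t, and protocol_array_t) all derive from list_array_tt<Element, List>, which can hold one of three kinds of value:

nothing;
a pointer to a single list;
an array of pointers to lists.

The third form, effectively a two-dimensional array, is what makes the data extensible. For example, method_array_t is an array whose elements are method_list_t, and each method_list_t is in turn an array whose elements are method_t.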
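
The three states and the move from one to the next can be sketched as below. ListArraySketch and MethodList are hypothetical simplifications (method names stand in for method_t, and std::vector replaces the runtime's hand-managed storage); the only behaviour borrowed from the attachLists listing later in this article is that newly attached lists go in front of the existing ones.

```cpp
#include <cassert>
#include <string>
#include <vector>

using MethodList = std::vector<std::string>;   // stand-in for method_list_t

class ListArraySketch {
    const MethodList *single_ = nullptr;       // state 2: one list
    std::vector<const MethodList *> array_;    // state 3: array of lists

public:
    bool isEmpty() const { return single_ == nullptr && array_.empty(); }
    bool hasArray() const { return !array_.empty(); }

    // Attaches one list in front of whatever is already stored.
    void attach(const MethodList *added) {
        assert(added != nullptr);
        if (hasArray()) {
            array_.insert(array_.begin(), added);   // many -> many
        } else if (single_ == nullptr) {
            single_ = added;                        // empty -> one
        } else {
            array_ = {added, single_};              // one -> many
            single_ = nullptr;
        }
    }

    // Method names in the order a front-to-back search would visit them.
    std::vector<std::string> flattened() const {
        std::vector<std::string> out;
        if (single_) out.insert(out.end(), single_->begin(), single_->end());
        for (const MethodList *list : array_) {
            out.insert(out.end(), list->begin(), list->end());
        }
        return out;
    }
};
```

Attaching a class's own list first and two category lists afterwards leaves the most recently attached category at the front, so a search from the front finds its methods before the class's own.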

class_ro_t

By contrast, class_ro_t holds the data that is fixed at compile time.

struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
#ifdef __LP64__
    uint32_t reserved;
#endif

    const uint8_t * ivarLayout;
    
    const char * name;
    method_list_t * baseMethodList;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;

    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;
};

The types of the members baseMethodList, baseProperties, and ivars (method_list_t, property_list_t, and ivar_list_t) all derive from entsize_list_tt, a generic array structure with no ability to grow. protocol_list_t, the type of baseProtocols, is likewise a fixed list.

Class initialization

The first time a class is initialized at run time, the realizeClass... family of functions is called.

ro = (const class_ro_t *)cls->data();
if (ro->flags & RO_FUTURE) {
    // This was a future class. rw data is already allocated.
    rw = cls->data();
    ro = cls->data()->ro;
    cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);
} else {
    // Normal class. Allocate writeable class data.
    rw = (class_rw_t *)calloc(sizeof(class_rw_t), 1);
    rw->ro = ro;
    rw->flags = RW_REALIZED|RW_REALIZING;
    cls->setData(rw);
}

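This implementation shows that, as emitted by the compiler, the data pointer in the class's class_data_bits_t actually points to a class_ro_t. When the function runs for a normal class, it first allocates memory for a class_rw_t, then assigns the class_ro_t pointer to rw->ro, then sets the class_rw_t flags to record its state, and finally stores the new class_rw_t back into the class.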
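
The four steps of the normal-class branch can be sketched as follows. RoSketch, RwSketch, ClassSketch, and realize are hypothetical simplifications, the flag values are placeholders chosen for the sketch, and new stands in for the calloc call in the listing.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder flag values for illustration only.
constexpr std::uint32_t kRwRealized  = 1u << 31;
constexpr std::uint32_t kRwRealizing = 1u << 19;

struct RoSketch { std::uint32_t flags; const char *name; };   // compile-time data
struct RwSketch { std::uint32_t flags; const RoSketch *ro; }; // run-time data

// data points at a RoSketch before realization and at a RwSketch after.
struct ClassSketch { void *data; };

// Normal-class path only; the caller owns the returned RwSketch.
inline RwSketch *realize(ClassSketch &cls) {
    assert(cls.data != nullptr);
    auto *ro = static_cast<const RoSketch *>(cls.data);
    auto *rw = new RwSketch{};                  // 1. allocate zeroed rw
    rw->ro = ro;                                // 2. keep the ro reachable
    rw->flags = kRwRealized | kRwRealizing;     // 3. record the state
    cls.data = rw;                              // 4. swap rw into the class
    return rw;
}
```

After the swap, everything that was fixed at compile time is still reachable through rw->ro, while the rw itself gives the runtime somewhere writable to hang additions.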

Categories

A Category makes it possible to add methods, protocols, and properties to an existing class dynamically at run time.

typedef struct category_t *Category;

struct category_t {
    const char *name;
    classref_t cls;
    struct method_list_t *instanceMethods;
    struct method_list_t *classMethods;
    struct protocol_list_t *protocols;
    struct property_list_t *instanceProperties;
    // Fields below this point are not always present on disk.
    struct property_list_t *_classProperties;
};

Loading categories

static void
attachCategories(Class cls, const locstamped_category_t *cats_list, uint32_t cats_count,
                 int flags)
{
    constexpr uint32_t ATTACH_BUFSIZ = 64;
    method_list_t   *mlists[ATTACH_BUFSIZ];

    uint32_t mcount = 0;
    bool fromBundle = NO;
    bool isMeta = (flags & ATTACH_METACLASS);
    auto rw = cls->data();

    for (uint32_t i = 0; i < cats_count; i++) {
        auto& entry = cats_list[i];

        method_list_t *mlist = entry.cat->methodsForMeta(isMeta);
        if (mlist) {
            if (mcount == ATTACH_BUFSIZ) {
                prepareMethodLists(cls, mlists, mcount, NO, fromBundle);
                rw->methods.attachLists(mlists, mcount);
                mcount = 0;
            }
            mlists[ATTACH_BUFSIZ - ++mcount] = mlist;
            fromBundle |= entry.hi->isBundle();
        }
    }

    if (mcount > 0) {
        prepareMethodLists(cls, mlists + ATTACH_BUFSIZ - mcount, mcount, NO, fromBundle);
        rw->methods.attachLists(mlists + ATTACH_BUFSIZ - mcount, mcount);
        if (flags & ATTACH_EXISTING) flushCaches(cls);
    }
}

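The original function also handles the protocols and properties of categories, but the principle is the same, so to save space only the method handling is kept:

Declare a temporary array mlists of 64 method_list_t * pointers (in effect a two-dimensional array).
Iterate over the categories, get each category's method list mlist, and add it to mlists from back to front.
If mlists fills up during the iteration, merge it into rw->methods right away; otherwise merge once the iteration finishes.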
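
The effect of filling the buffer from the back can be sketched with strings standing in for method lists. attachOrder is a hypothetical helper, the buffer holds 4 entries instead of 64 so that the flush path is easy to exercise, and "merging" is modelled as prepending to the existing lists.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the final order of lists after attaching the categories,
// given in load order, in front of the existing lists.
inline std::vector<std::string> attachOrder(
        const std::vector<std::string> &categoriesInLoadOrder,
        std::vector<std::string> existing) {
    constexpr std::size_t kBufSize = 4;   // 64 in the listing above
    std::string buffer[kBufSize];
    std::size_t count = 0;

    // Prepends the last n buffer entries to the existing lists.
    auto flush = [&](std::size_t n) {
        existing.insert(existing.begin(),
                        buffer + kBufSize - n, buffer + kBufSize);
    };

    for (const std::string &category : categoriesInLoadOrder) {
        if (count == kBufSize) {          // buffer full: merge early
            flush(count);
            count = 0;
        }
        buffer[kBufSize - ++count] = category;   // fill from the back
    }
    if (count > 0) flush(count);          // merge whatever remains
    return existing;
}
```

Because each new entry lands in front of the previous one, the category processed last ends up first after the merge, even when the buffer has to be flushed part-way through.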

Merging method lists

void attachLists(List* const * addedLists, uint32_t addedCount) {
        if (addedCount == 0) return;

        if (hasArray()) {
            // many lists -> many lists
            uint32_t oldCount = array()->count;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
            array()->count = newCount;
            memmove(array()->lists + addedCount, array()->lists, 
                    oldCount * sizeof(array()->lists[0]));
            memcpy(array()->lists, addedLists, 
                   addedCount * sizeof(array()->lists[0]));
        }
        else if (!list  &&  addedCount == 1) {
            // 0 lists -> 1 list
            list = addedLists[0];
        } 
        else {
            // 1 list -> many lists
            List* oldList = list;
            uint32_t oldCount = oldList ? 1 : 0;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)malloc(array_t::byteSize(newCount)));
            array()->count = newCount;
            if (oldList) array()->lists[addedCount] = oldList;
            memcpy(array()->lists, addedLists, 
                   addedCount * sizeof(array()->lists[0]));
        }
    }

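Merging falls into roughly two cases: if the current array is empty, it simply points at the newly added list; otherwise the current array already holds data: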