iOS Internals Series: Exploring alloc, init, and new

Date: 2020-01-06



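1. Exploring alloc

We call alloc all the time when writing code, as shown below. We know it allocates memory for an object, but what exactly does the allocation flow look like? Let's dig in.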

XDPerson *person = [XDPerson alloc];

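1 Preparation

Set a breakpoint and debug.

Enable Debug -> Debug Workflow -> Always Show Disassembly.
Run on a real device.

We can now see the assembly, and at the top of the disassembly view we see libobjc.A.dylib objc_alloc.

The objc source, plus the configuration steps needed to build it.

A note on the source: it comes in two versions, legacy -> objc1 and modern -> objc2.
What we use today is objc2.

The llvm source. It is fairly large; download it and browse it yourself.

2 Exploration process

We explore by writing code inside the source project, using the same single line.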

XDPerson *person = [XDPerson alloc];

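Observation: with a breakpoint on the alloc line, running the project takes us into the source. We land directly in the alloc class method, and the next step is the function _objc_rootAlloc.
This raises a question: shouldn't the step after the alloc class method be objc_alloc?

That's right: the compiler has already optimized this step for us. It also indirectly suggests that objc4 is only partially open source. The process splits into two stages: the compile stage (llvm source) and the run stage (objc source).

2.1 Exploring the `llvm` source

llvm is friendly: it ships a set of test code that helps us reason about its behavior. Let's start exploring.

Test code: search for test_alloc_class_ptr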

// Make sure we get a bitcast on the return type as the
// call will return i8* which we have to cast to A*
// CHECK-LABEL: define {{.*}}void @test_alloc_class_ptr
A* test_alloc_class_ptr() {
  // CALLS: {{call.*@objc_alloc}}
  // CALLS-NEXT: bitcast i8*
  // CALLS-NEXT: ret
  return [B alloc];
}

// Make sure we get a bitcast on the return type as the
// call will return i8* which we have to cast to A*
// CHECK-LABEL: define {{.*}}void @test_allocWithZone_class_ptr
A* test_allocWithZone_class_ptr() {
  // CALLS: {{call.*@objc_allocWithZone}}
  // CALLS-NEXT: bitcast i8*
  // CALLS-NEXT: ret
  return [B allocWithZone:nil];
}

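A global search for test_alloc_class_ptr turns up this test code together with its explanatory comments. In short, they tell the developer that [B alloc] is compiled into a call to objc_alloc, after which the alloc flow continues.

Exploring objc_alloc

Target: search for objc_alloc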

llvm::Value *CodeGenFunction::EmitObjCAlloc(llvm::Value *value,
                                            llvm::Type *resultType) {
  return emitObjCValueOperation(*this, value, resultType,
                                CGM.getObjCEntrypoints().objc_alloc,
                                "objc_alloc");
}

Tracing upward: search for EmitObjCAlloc

static Optional<llvm::Value *>
tryGenerateSpecializedMessageSend(CodeGenFunction &CGF, QualType ResultType,
                                  llvm::Value *Receiver,
                                  const CallArgList& Args, Selector Sel,
                                  const ObjCMethodDecl *method,
                                  bool isClassMessage) {
  auto &CGM = CGF.CGM;
  if (!CGM.getCodeGenOpts().ObjCConvertMessagesToRuntimeCalls)
    return None;

  // objc_alloc
  // (the checks below only read the selector name as a string)
  auto &Runtime = CGM.getLangOpts().ObjCRuntime;
  switch (Sel.getMethodFamily()) {
  case OMF_alloc:
    if (isClassMessage &&
        Runtime.shouldUseRuntimeFunctionsForAlloc() &&
        ResultType->isObjCObjectPointerType()) {
        // [Foo alloc] -> objc_alloc(Foo) or
        // [self alloc] -> objc_alloc(self)
        if (Sel.isUnarySelector() && Sel.getNameForSlot(0) == "alloc")
          return CGF.EmitObjCAlloc(Receiver, CGF.ConvertType(ResultType));

        // [Foo allocWithZone:nil] -> objc_allocWithZone(Foo) or
        // [self allocWithZone:nil] -> objc_allocWithZone(self)
        if (Sel.isKeywordSelector() && Sel.getNumArgs() == 1 &&
            Args.size() == 1 && Args.front().getType()->isPointerType() &&
            Sel.getNameForSlot(0) == "allocWithZone") {
          const llvm::Value* arg = Args.front().getKnownRValue().getScalarVal();
          if (isa<llvm::ConstantPointerNull>(arg))
            return CGF.EmitObjCAllocWithZone(Receiver,
                                             CGF.ConvertType(ResultType));
          return None;
        }
    }
    break;

  case OMF_autorelease:
    if (ResultType->isObjCObjectPointerType() &&
        CGM.getLangOpts().getGC() == LangOptions::NonGC &&
        Runtime.shouldUseARCFunctionsForRetainRelease())
      return CGF.EmitObjCAutorelease(Receiver, CGF.ConvertType(ResultType));
    break;

  case OMF_retain:
    if (ResultType->isObjCObjectPointerType() &&
        CGM.getLangOpts().getGC() == LangOptions::NonGC &&
        Runtime.shouldUseARCFunctionsForRetainRelease())
      return CGF.EmitObjCRetainNonBlock(Receiver, CGF.ConvertType(ResultType));
    break;

  case OMF_release:
    if (ResultType->isVoidType() &&
        CGM.getLangOpts().getGC() == LangOptions::NonGC &&
        Runtime.shouldUseARCFunctionsForRetainRelease()) {
      CGF.EmitObjCRelease(Receiver, ARCPreciseLifetime);
      return nullptr;
    }
    break;

  default:
    break;
  }
  return None;
}

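This code looks very familiar: alloc, autorelease, retain, release. This is in fact where, at compile time, these messages are bound to runtime symbols.

Tracing further up: search for tryGenerateSpecializedMessageSend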

CodeGen::RValue CGObjCRuntime::GeneratePossiblySpecializedMessageSend(
    CodeGenFunction &CGF, ReturnValueSlot Return, QualType ResultType,
    Selector Sel, llvm::Value *Receiver, const CallArgList &Args,
    const ObjCInterfaceDecl *OID, const ObjCMethodDecl *Method,
    bool isClassMessage) {
    
  if (Optional<llvm::Value *> SpecializedResult =
          tryGenerateSpecializedMessageSend(CGF, ResultType, Receiver, Args,
                                            Sel, Method, isClassMessage)) {
    return RValue::get(SpecializedResult.getValue());
  }
    
  return GenerateMessageSend(CGF, Return, ResultType, Sel, Receiver, Args, OID,
                             Method);
}

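This is where tracing back to the origin ends.

First, the compiler tries the specialized message send for special selector names whenever it can.

From the source we can roughly piece together the flow: when a message with the special name alloc arrives, it takes the objc_alloc call path.
Otherwise, it falls back to the general message send.

Inside the objc4 source, objc_alloc eventually calls [cls alloc]. Although that is again the special name alloc, the if condition at the point we traced to does not hold there, so the call goes straight through the general message lookup.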
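The "specialized first, general second" decision can be sketched in a few lines of C++. This is only a toy model of the flow described above, covering the alloc family alone: `trySpecializedSend` and `emittedCall` are hypothetical names, strings stand in for emitted calls, and none of this is the real clang API.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the compile-time decision; not the real clang API.
// Returns the runtime entry point a message send would be rewritten to,
// or nullptr when no specialization applies.
static const char *trySpecializedSend(const std::string &selector,
                                      bool isClassMessage,
                                      bool zoneArgIsNil) {
  if (!isClassMessage) return nullptr;
  if (selector == "alloc") return "objc_alloc";
  if (selector == "allocWithZone:" && zoneArgIsNil) return "objc_allocWithZone";
  return nullptr;
}

// Try the specialized form first, then fall back to the general send.
static std::string emittedCall(const std::string &selector,
                               bool isClassMessage,
                               bool zoneArgIsNil = false) {
  if (const char *special =
          trySpecializedSend(selector, isClassMessage, zoneArgIsNil))
    return special;
  return "objc_msgSend";
}
```

In this model, `emittedCall("alloc", true)` yields the objc_alloc entry point, while any selector that fails the checks ends up on the ordinary objc_msgSend path.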

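This is only a brief pass over the llvm flow; there are many more details in there worth exploring.

2.2 Exploring the `objc` source

Now let's analyze the objc source we care about. With the llvm stage understood, we can go straight into the source and follow alloc. The analysis below is split into two parts: the main alloc flow and the individual function branches.

Main `alloc` flow

Entry point of alloc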

+ (id)alloc {
    return _objc_rootAlloc(self);
}

A simple hop from an OC method into a C function.

_objc_rootAlloc analysis

id
_objc_rootAlloc(Class cls)
{
    return callAlloc(cls, false/*checkNil*/, true/*allocWithZone*/);
}

A plain call chain: it calls callAlloc with cls as the class, false for checkNil, and true for allocWithZone.

callAlloc analysis

static ALWAYS_INLINE id
callAlloc(Class cls, bool checkNil, bool allocWithZone=false)
{
    if (slowpath(checkNil && !cls)) return nil;

#if __OBJC2__
    if (fastpath(!cls->ISA()->hasCustomAWZ())) {
        // No alloc/allocWithZone implementation. Go straight to the allocator.
        // fixme store hasCustomAWZ in the non-meta class and 
        // add it to canAllocFast's summary
        if (fastpath(cls->canAllocFast())) {
            // No ctors, raw isa, etc. Go straight to the metal.
            bool dtor = cls->hasCxxDtor();
            id obj = (id)calloc(1, cls->bits.fastInstanceSize());
            if (slowpath(!obj)) return callBadAllocHandler(cls);
            obj->initInstanceIsa(cls, dtor);
            return obj;
        }
        else {
            // Has ctor or raw isa or something. Use the slower path.
            id obj = class_createInstance(cls, 0);
            if (slowpath(!obj)) return callBadAllocHandler(cls);
            return obj;
        }
    }
#endif

    // No shortcuts available.
    if (allocWithZone) return [cls allocWithZone:nil];
    return [cls alloc];
}

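if (slowpath(checkNil && !cls)) return nil; is a guard; slowpath marks the condition as most likely false.
if (fastpath(!cls->ISA()->hasCustomAWZ())): fastpath marks the condition as most likely true.
if (fastpath(cls->canAllocFast())) checks whether the current cls can be allocated fast.
bool dtor = cls->hasCxxDtor(); checks whether the current cls has a C++ destructor.
id obj = (id)calloc(1, cls->bits.fastInstanceSize()); asks the system to allocate memory.
callBadAllocHandler(cls); signals that alloc failed.
obj->initInstanceIsa(cls, dtor); initializes the isa.
id obj = class_createInstance(cls, 0); creates the instance; this function is covered in detail below.
if (allocWithZone) return [cls allocWithZone:nil]; is reached only when no shortcut applies (for example, the class has a custom allocWithZone:). If the allocWithZone argument is true, the class's allocWithZone: method is called.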
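To make the branch hints and the order of the checks concrete, here is a minimal C++ sketch. The macros follow the usual `__builtin_expect` pattern (a GCC/Clang builtin); `FakeClass`, `AllocPath`, and `pickAllocPath` are hypothetical names that model only which path callAlloc would pick, not the allocation itself.

```cpp
#include <cassert>

// Branch hints in the style of the runtime's macros (GCC/Clang builtin).
#define fastpath(x) (__builtin_expect(bool(x), 1))
#define slowpath(x) (__builtin_expect(bool(x), 0))

// Hypothetical stand-in for the class flags consulted by callAlloc.
struct FakeClass {
  bool hasCustomAWZ;
  bool canAllocFast;
};

enum class AllocPath { Nil, FastCalloc, CreateInstance, MessageSend };

// Mirrors only the order of the checks in callAlloc, not the allocation.
static AllocPath pickAllocPath(const FakeClass *cls, bool checkNil) {
  if (slowpath(checkNil && !cls)) return AllocPath::Nil;
  if (fastpath(!cls->hasCustomAWZ)) {
    if (fastpath(cls->canAllocFast)) return AllocPath::FastCalloc;
    return AllocPath::CreateInstance;
  }
  return AllocPath::MessageSend;  // [cls allocWithZone:nil] or [cls alloc]
}
```

The hints never change the result of a condition; they only tell the compiler which branch to lay out as the likely one.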

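Following the calls inside callAlloc gives us a rough picture of the main alloc flow.

Individual function branches

Now let's look at specific functions.

cls->ISA()->hasCustomAWZ()
- cls->ISA() takes the object's isa value and applies an & operation, isa.bits & ISA_MASK, to find the metaclass of the current class.
- hasCustomAWZ() checks whether the metaclass has a custom allocWithZone; this is not the same thing as the allocWithZone we write in a class.
cls->canAllocFast()
- In the source we can see that this ultimately returns false. This part depends on details of the objc_class struct, so we won't expand on it here.
[cls allocWithZone:nil]
- Calls _objc_rootAllocWithZone.
- Since we are on objc2, this goes straight into class_createInstance(cls, 0).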

class_createInstance(cls, 0)

We will focus on this function; note that _objc_rootAllocWithZone ends up calling it as well.

Flow analysis:

id 
class_createInstance(Class cls, size_t extraBytes)
{
    return _class_createInstanceFromZone(cls, extraBytes, nil);
}

The actual work is done by _class_createInstanceFromZone, called with cls as the class and extraBytes equal to 0.

static __attribute__((always_inline)) 
id
_class_createInstanceFromZone(Class cls, size_t extraBytes, void *zone, 
                              bool cxxConstruct = true, 
                              size_t *outAllocatedSize = nil)
{
    if (!cls) return nil;

    assert(cls->isRealized());

    // Read class's info bits all at once for performance
    bool hasCxxCtor = cls->hasCxxCtor();
    bool hasCxxDtor = cls->hasCxxDtor();
    bool fast = cls->canAllocNonpointer();

    size_t size = cls->instanceSize(extraBytes);
    if (outAllocatedSize) *outAllocatedSize = size;
    id obj;
    if (!zone  &&  fast) {
        obj = (id)calloc(1, size);
        if (!obj) return nil;
        obj->initInstanceIsa(cls, hasCxxDtor);
    } 
    else {
        if (zone) {
            obj = (id)malloc_zone_calloc ((malloc_zone_t *)zone, 1, size);
        } else {
            obj = (id)calloc(1, size);
        }
        if (!obj) return nil;

        // Use raw pointer isa on the assumption that they might be 
        // doing something weird with the zone or RR.
        obj->initIsa(cls);
    }

    if (cxxConstruct && hasCxxCtor) {
        obj = _objc_constructOrFree(obj, cls);
    }

    return obj;
}

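Let's first look at the parameters at the time this function is called:

size_t extraBytes is 0. void *zone is nil.

Basic condition checks

bool hasCxxCtor = cls->hasCxxCtor(); whether there is a C++ constructor.

bool hasCxxDtor = cls->hasCxxDtor(); whether there is a C++ destructor.

bool fast = cls->canAllocNonpointer(); whether a nonpointer isa can be created. The result here is true; we will cover this in a later article.

size_t size = cls->instanceSize(extraBytes); This step is important: it computes how much memory to request.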

size_t instanceSize(size_t extraBytes) {
        size_t size = alignedInstanceSize() + extraBytes;
        // CF requires all objects be at least 16 bytes.
        if (size < 16) size = 16;
        return size;
    }
    
uint32_t alignedInstanceSize() {
    return word_align(unalignedInstanceSize());
}

uint32_t unalignedInstanceSize() {
    assert(isRealized());
    return data()->ro->instanceSize;
}

//WORD_MASK 7UL
static inline uint32_t word_align(uint32_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}

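The last step is the memory alignment algorithm.

Working through ( 8 + 7 ) & (~7) in binary gives 8. This also shows that the size an object requests is 8-byte aligned, that is, always a multiple of 8.

if (size < 16) size = 16; tells us that any request smaller than 16 bytes is rounded up to 16 bytes. This keeps the size consistent with how the system allocates memory later.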
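The two rules above (round up to a multiple of 8, then enforce a 16-byte floor) can be tried out with a small standalone C++ sketch. The names `kWordMask`, `wordAlign`, and `instanceSizeFor` are illustrative stand-ins for the runtime's WORD_MASK, word_align, and instanceSize, and the sketch assumes a 64-bit target where the mask is 7.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// WORD_MASK is 7 on 64-bit targets, as noted in the excerpt above.
static const uint32_t kWordMask = 7;

// Round x up to the next multiple of 8.
static inline uint32_t wordAlign(uint32_t x) {
  return (x + kWordMask) & ~kWordMask;
}

// Same two rules as instanceSize(): 8-byte alignment, then a 16-byte floor.
static inline size_t instanceSizeFor(uint32_t unalignedSize,
                                     size_t extraBytes = 0) {
  size_t size = wordAlign(unalignedSize) + extraBytes;
  if (size < 16) size = 16;
  return size;
}
```

For example, an unaligned size of 12 (an 8-byte isa plus a 4-byte int) rounds up to 16, an unaligned size of 20 rounds up to 24, and a bare 8-byte isa is lifted to the 16-byte minimum.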

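calloc and malloc_zone_calloc ask the system to allocate memory for obj according to the requested size.

This involves another code base, libmalloc, which we won't analyze here. What we need to know is that malloc allocates memory with 16-byte alignment.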
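As a rough model of the gap between the size the runtime requests and the size the allocator hands out, the sketch below simply rounds a request up to a multiple of 16. It is an assumption-based illustration of the 16-byte rule stated above, not libmalloc's actual code; `kAllocQuantum` and `allocatedSizeFor` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Assumed allocator granularity, taken from the 16-byte claim in the text.
static const size_t kAllocQuantum = 16;

// Round a requested size up to the next multiple of 16.
static inline size_t allocatedSizeFor(size_t requested) {
  return (requested + kAllocQuantum - 1) & ~(kAllocQuantum - 1);
}
```

Under this model an object that requests 24 bytes (8-byte aligned) would occupy 32 bytes, and one that requests 40 bytes would occupy 48.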

initInstanceIsa()

This calls initIsa(cls, true, hasCxxDtor);

isa_t newisa(0);
newisa.bits = ISA_MAGIC_VALUE;
// isa.magic is part of ISA_MAGIC_VALUE
// isa.nonpointer is part of ISA_MAGIC_VALUE
newisa.has_cxx_dtor = hasCxxDtor;
newisa.shiftcls = (uintptr_t)cls >> 3;
isa = newisa;

After simplification, the code boils down to these few steps, all of which initialize the isa.
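The packing done by these steps, and the masking that ISA() later uses to get the class back, can be sketched with plain integers. The constants below are assumed to be the x86_64 values of ISA_MAGIC_VALUE and ISA_MASK, and the bit position of has_cxx_dtor is likewise an assumption about that layout; other architectures differ. `packIsa` and `classFromIsa` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed x86_64 constants; other architectures use different values.
static const uint64_t kIsaMagicValue = 0x001d800000000001ULL;
static const uint64_t kIsaMask       = 0x00007ffffffffff8ULL;
static const uint64_t kHasCxxDtorBit = 1ULL << 2;

// Pack a class address into isa bits: magic + nonpointer flag, the
// destructor flag, and the class pointer shifted right by 3 into shiftcls.
static inline uint64_t packIsa(uint64_t cls, bool hasCxxDtor) {
  uint64_t bits = kIsaMagicValue;
  if (hasCxxDtor) bits |= kHasCxxDtorBit;
  bits |= (cls >> 3) << 3;  // shiftcls occupies bits 3..46 in this layout
  return bits;
}

// Recover the class address, as ISA() does with isa.bits & ISA_MASK.
static inline uint64_t classFromIsa(uint64_t bits) {
  return bits & kIsaMask;
}
```

Because class addresses are at least 8-byte aligned, their low three bits are zero, which is why shifting right by 3 loses nothing and leaves room for the flag bits.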

This concludes the analysis of alloc.

[Figure: flow chart of the main alloc flow]

2. Exploring init

Starting from the source

- (id)init {
    return _objc_rootInit(self);
}

id
_objc_rootInit(id obj)
{
    // In practice, it will be hard to rely on this function.
    // Many classes do not properly chain -init calls.
    return obj;
}

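This one is simple: _objc_rootInit returns obj right away, which shows that init does nothing here and returns the obj produced by alloc.

3. Exploring new

Starting from the source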

+ (id)new {
    return [callAlloc(self, false/*checkNil*/) init];
}

new simply runs the steps of the alloc flow and then sends -init to the result.

4. Summary

Question: how do the outputs below differ?

XDPerson *p1 = [XDPerson alloc];
XDPerson *p2 = [p1 init];
XDPerson *p3 = [p1 init];

NSLog(@"p1 - %@,%p",p1, &p1);
NSLog(@"p2 - %@,%p",p2, &p2);
NSLog(@"p3 - %@,%p",p3, &p3);

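Answer:

As analyzed, init returns the very object that alloc produced, so p1, p2, and p3 point to the same memory address, while the addresses of the pointer variables themselves differ.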

2019-12-29 20:09:02.971814+0800 XDTest[1809:186909] p1 - <XDPerson: 0x100f33010>,0x7ffeefbff5b8
2019-12-29 20:09:02.971978+0800 XDTest[1809:186909] p2 - <XDPerson: 0x100f33010>,0x7ffeefbff5b0
2019-12-29 20:09:02.972026+0800 XDTest[1809:186909] p3 - <XDPerson: 0x100f33010>,0x7ffeefbff5a8
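The same distinction between "the address stored in a pointer" and "the address of the pointer variable" can be reproduced in plain C++. This is only an analogy to the log above: `Person` and `initPerson` are hypothetical stand-ins for XDPerson and -init.

```cpp
#include <cassert>

struct Person { int age; };  // hypothetical stand-in for XDPerson

// Stand-in for -init as analyzed above: hands back the same object.
static Person *initPerson(Person *obj) { return obj; }

// True when all three variables hold the same object address
// while the variables themselves live at different addresses.
static bool sameObjectDifferentVariables() {
  Person storage{0};
  Person *p1 = &storage;
  Person *p2 = initPerson(p1);
  Person *p3 = initPerson(p1);
  bool sameObject = (p1 == p2) && (p2 == p3);
  bool distinctVariables = (&p1 != &p2) && (&p2 != &p3) && (&p1 != &p3);
  return sameObject && distinctVariables;
}
```

Printing `p1` corresponds to the `%@` column in the log (one shared object), while `&p1`, `&p2`, and `&p3` correspond to the `%p` column (three separate stack slots).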

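What does init do? What is the benefit of this design? Explain if (self = [super init]).

init simply returns the obj created by alloc.
This design gives developers more room for factory-style design: for example, we often override initWith... methods to customize initialization.
self = [super init] runs the init methods of the superclasses along the inheritance chain, so that anything special a superclass sets up is not lost in our own class. The if adds a safety check in case a superclass init returns nil.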