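iOS Launch Optimization: Binary Reordering

Introduction

At the end of last year ByteDance made the idea of binary reordering popular. For learning purposes I gathered the material already available online, summarized it, and implemented it myself, to get a better understanding of launch optimization.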

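App Launch and Memory Loading

On Linux, a process that requests memory is not handed physical memory right away. The system only marks the range as belonging to the process, and physical memory is allocated when the range is actually used. Until then the process only holds virtual memory. iOS uses the same demand-paging model.

When we access a memory address whose virtual page has no physical memory behind it yet, the CPU raises a page fault. The system then loads the page from disk into physical memory and, for App Store builds, verifies its signature.

During launch the app calls many functions. Because these functions are scattered across non-contiguous pages of the __TEXT segment, launch triggers many page faults, each of which maps a page and reads its code into physical memory, and part of the code on those pages is never called during launch. As the figure shows, if launch needs to call Func A, B, and C, it takes 3 page faults (including the first read) and uses 3 pages.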
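The lazy allocation described above can be observed from user space. The following is a minimal C sketch, assuming a POSIX system (Linux or macOS) whose getrusage reports page-fault counters. The helper names faults_so_far and faults_when_touching are hypothetical and exist only for this illustration. It uses anonymous memory rather than code pages, so it shows the mechanism, not the exact cost of loading an app binary.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1  /* exposes MAP_ANON on glibc in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Total page faults (minor + major) taken by this process so far. */
static long faults_so_far(void) {
    struct rusage usage;
    int rc = getrusage(RUSAGE_SELF, &usage);
    assert(rc == 0);
    (void)rc;
    return usage.ru_minflt + usage.ru_majflt;
}

/*
 * Reserves `page_count` pages of anonymous memory, writes one byte to each
 * page, and returns how many page faults were counted while doing so.
 * Returns -1 if the mapping could not be created.
 */
long faults_when_touching(size_t page_count) {
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    size_t length = page_count * page_size;

    /* mmap only reserves virtual address space; no physical page yet. */
    void *mapping = mmap(NULL, length, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANON, -1, 0);
    if (mapping == MAP_FAILED) {
        return -1;
    }

    volatile unsigned char *region = mapping;
    long before = faults_so_far();
    for (size_t i = 0; i < page_count; i++) {
        region[i * page_size] = 1;  /* first write to a page faults it in */
    }
    long after = faults_so_far();

    munmap(mapping, length);
    return after - before;
}
```

Calling faults_when_touching(64) usually reports roughly one fault per page, but the exact number depends on the kernel and on how the platform fills in the counters.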

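How to Optimize?

The idea is simple: lay out the functions needed during launch next to each other, in order. This reduces the number of page faults and the number of pages loaded, and makes the remaining page faults hit adjacent pages. As the figure shows, this saves one page fault and one page load compared with the previous layout. In a complex project the gain becomes considerable.

Xcode's linker provides an Order File setting: the symbols listed in the file are written into the binary in that order. We can list the functions called during launch in this file to achieve the optimization.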
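The effect of the layout on the number of pages can be estimated with a short calculation. The sketch below is an illustration in C, not part of the original project: FunctionRange and count_pages_touched are hypothetical names, and the addresses used with it are made up. It counts the distinct pages overlapped by a set of functions for a given page size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint64_t address;  /* start address of the function */
    uint64_t size;     /* size of the function in bytes, must be > 0 */
} FunctionRange;

static int compare_u64(const void *a, const void *b) {
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Returns the number of distinct pages overlapped by the given functions. */
size_t count_pages_touched(const FunctionRange *funcs, size_t count,
                           uint64_t page_size) {
    assert(page_size > 0);
    if (count == 0) {
        return 0;
    }

    /* First pass: how many page indices will be recorded in total. */
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        assert(funcs[i].size > 0);
        uint64_t first = funcs[i].address / page_size;
        uint64_t last = (funcs[i].address + funcs[i].size - 1) / page_size;
        total += (size_t)(last - first + 1);
    }

    uint64_t *pages = malloc(total * sizeof *pages);
    if (pages == NULL) {
        return 0;
    }

    /* Second pass: record every page index each function overlaps. */
    size_t n = 0;
    for (size_t i = 0; i < count; i++) {
        uint64_t first = funcs[i].address / page_size;
        uint64_t last = (funcs[i].address + funcs[i].size - 1) / page_size;
        for (uint64_t page = first; page <= last; page++) {
            pages[n++] = page;
        }
    }

    /* Sort, then count the runs of equal page indices. */
    qsort(pages, n, sizeof *pages, compare_u64);
    size_t distinct = 1;
    for (size_t i = 1; i < n; i++) {
        if (pages[i] != pages[i - 1]) {
            distinct++;
        }
    }
    free(pages);
    return distinct;
}
```

With a 16 KB page (0x4000), three small functions placed at 0x100004000, 0x10000C000, and 0x100014000 overlap three pages, while the same three functions packed back to back from 0x100004000 overlap one.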

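Implementation Details

Understanding Link Order with the Link Map

The Link Map is an intermediate product of the build that records the layout of the binary. With it we can analyze what the executable is made of, what it contains, which libraries take up the most space, and so on. It has to be enabled manually by setting Write Link Map File to Yes in Build Settings.

By default the Link Map file is generated under the build directory. Change Path to Link Map File to store it somewhere else.

Taking the demo as an example, the file looks like this. The comments explain each part: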

// Path of the binary that this Link Map describes
# Path: /Users/yehuangbin/Library/Developer/Xcode/DerivedData/IOSDevelopTools-bpjwhcswecoziihayzwjgxztowne/Build/Products/Debug-iphoneos/IOSDevelopTools.app/IOSDevelopTools

// Target architecture
# Arch: arm64

// List of the .o files produced by compilation, including system and user-defined classes, the UIKit library, and so on.
# Object files:
[  0] linker synthesized
[  1] /Users/yehuangbin/Library/Developer/Xcode/DerivedData/IOSDevelopTools-bpjwhcswecoziihayzwjgxztowne/Build/Intermediates.noindex/IOSDevelopTools.build/Debug-iphoneos/IOSDevelopTools.build/Objects-normal/arm64/YECallMonitor.o
[  2] /Users/yehuangbin/Library/Developer/Xcode/DerivedData/IOSDevelopTools-bpjwhcswecoziihayzwjgxztowne/Build/Intermediates.noindex/IOSDevelopTools.build/Debug-iphoneos/IOSDevelopTools.build/Objects-normal/arm64/YECallRecordCell.o
[  3] /Users/yehuangbin/Library/Developer/Xcode/DerivedData/IOSDevelopTools-bpjwhcswecoziihayzwjgxztowne/Build/Intermediates.noindex/IOSDevelopTools.build/Debug-iphoneos/IOSDevelopTools.build/Objects-normal/arm64/YECallRecordModel.o
[  4] /Users/yehuangbin/Library/Developer/Xcode/DerivedData/IOSDevelopTools-bpjwhcswecoziihayzwjgxztowne/Build/Intermediates.noindex/IOSDevelopTools.build/Debug-iphoneos/IOSDevelopTools.build/Objects-normal/arm64/YECallTraceCore.o
[  5] /Users/yehuangbin/Library/Developer/Xcode/DerivedData/IOSDevelopTools-bpjwhcswecoziihayzwjgxztowne/Build/Intermediates.noindex/IOSDevelopTools.build/Debug-iphoneos/IOSDevelopTools.build/Objects-normal/arm64/fishhook.o
[  6] /Users/yehuangbin/Library/Developer/Xcode/DerivedData/IOSDevelopTools-bpjwhcswecoziihayzwjgxztowne/Build/Intermediates.noindex/IOSDevelopTools.build/Debug-iphoneos/IOSDevelopTools.build/Objects-normal/arm64/ViewController.o
...

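// A section is the region that holds one kind of data. Sections fall into two main groups, __TEXT and __DATA: __TEXT holds program code, and __DATA holds initialized variables and the like.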
# Sections:
# Address Size Segment Section
0x10000572C	0x0000B184	__TEXT	__text
0x1000108B0	0x000002C4	__TEXT	__stubs
0x100010B74	0x000002DC	__TEXT	__stub_helper
0x100010E50	0x00000088	__TEXT	__const
0x100010ED8	0x000006EC	__TEXT	__cstring
0x1000115C4	0x000019EF	__TEXT	__objc_methname
0x100012FB4	0x00000134	__TEXT	__ustring
0x1000130E8	0x000000F6	__TEXT	__objc_classname
0x1000131DE	0x00000CBF	__TEXT	__objc_methtype
0x100013EA0	0x00000160	__TEXT	__unwind_info
0x100014000	0x00000030	__DATA	__got
0x100014030	0x000001D8	__DATA	__la_symbol_ptr
0x100014208	0x000001C0	__DATA	__const
0x1000143C8	0x000004A0	__DATA	__cfstring
0x100014868	0x00000038	__DATA	__objc_classlist
0x1000148A0	0x00000008	__DATA	__objc_catlist
0x1000148A8	0x00000028	__DATA	__objc_protolist
...

// Symbol table: variable names, class names, method names, and so on
# Symbols:
# Address Size File Name
0x10000572C	0x00000080	[  1] +[YECallMonitor shareInstance]
0x1000057AC	0x0000005C	[  1] ___30+[YECallMonitor shareInstance]_block_invoke
0x100005808	0x00000024	[  1] -[YECallMonitor start]
0x10000582C	0x00000024	[  1] -[YECallMonitor stop]
0x100005850	0x00000200	[  1] -[YECallMonitor getThreadCallRecord]
0x100005A50	0x000002F8	[  1] ___36-[YECallMonitor getThreadCallRecord]_block_invoke
0x100005D48	0x000000A4	[  1] ___copy_helper_block_e8_32s40s48s
0x100005DEC	0x00000068	[  1] ___destroy_helper_block_e8_32s40s48s
0x100005E54	0x0000002C	[  1] -[YECallMonitor setDepth:]
0x100005E80	0x0000002C	[  1] -[YECallMonitor setMinTime:]
0x100005EAC	0x00000024	[  1] -[YECallMonitor clear]
0x100005ED0	0x00000028	[  1] -[YECallMonitor enable]
0x100005EF8	0x0000026C	[  1] -[YECallMonitor setFilterClassNames:]
0x100006164	0x00000230	[  1] -[YECallMonitor findStartDepthIndex:arr:]
0x100006394	0x00000610	[  1] -[YECallMonitor recursive_getRecord:]
0x1000069A4	0x00000240	[  1] -[YECallMonitor setRecordDic:record:]
...


# Dead Stripped Symbols:
# Size File Name
<<dead>> 	0x00000008	[  2] 8-byte-literal
<<dead>> 	0x00000006	[  2] literal string: depth
<<dead>> 	0x00000012	[  2] literal string: stringWithFormat:
<<dead>> 	0x00000007	[  2] literal string: string
<<dead>> 	0x00000034	[  2] literal string: stringByPaddingToLength:withString:startingAtIndex:
<<dead>> 	0x0000000E	[  2] literal string: appendString:
<<dead>> 	0x00000004	[  2] literal string: cls
<<dead>> 	0x0000000E	[  2] literal string: .cxx_destruct
<<dead>> 	0x00000002	[  2] literal string: +
<<dead>> 	0x00000002	[  2] literal string: -
<<dead>> 	0x00000020	[  2] CFString
<<dead>> 	0x00000020	[  2] CFString
<<dead>> 	0x0000000B	[  2] literal string: v24@0:8@16
<<dead>> 	0x00000008	[  2] literal string: @16@0:8
<<dead>> 	0x00000008	[  2] literal string: v16@0:8
<<dead>> 	0x00000005	[  3] literal string: init
<<dead>> 	0x0000000A	[  3] literal string: setDepth:
<<dead>> 	0x00000006	[  3] literal string: class
<<dead>> 	0x00000004	[  3] literal string: cls
<<dead>> 	0x00000004	[  3] literal string: sel
<<dead>> 	0x00000009	[  3] literal string: costTime
<<dead>> 	0x00000006	[  3] literal string: depth
<<dead>> 	0x00000006	[  3] literal string: total
<<dead>> 	0x0000000A	[  3] literal string: callCount
<<dead>> 	0x00000022	[  3] literal string: initWithCls:sel:time:depth:total:
...



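As we can see, the symbols in the Symbols table are not laid out in the order in which functions run at launch. They are laid out in the order in which the object files were compiled and linked.

Collecting Called Functions with SanitizerCoverage

SanitizerCoverage is built into LLVM and can insert calls to user-defined callbacks at the function, basic-block, and edge levels. See the Clang 11 documentation for details.

In Build Settings, add -fsanitize-coverage=func,trace-pc-guard to "Other C Flags". If the project contains Swift code, also add -sanitize-coverage=func and -sanitize=undefined to "Other Swift Flags". Note that every binary linked into the app must be built with SanitizerCoverage enabled. Only then are all calls covered.

Once enabled, calling an instrumented function invokes the callback void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {}. We can put our own bookkeeping in this callback to collect function names, then export the data once launch has finished. The code below is adapted, with small changes, from 玉令天下's implementation. It is available as AppCallCollecter if you need it: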
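Lines in the # Symbols section have a regular shape (address, size, object file index, name), so they are easy to process with a small tool. The following C sketch parses one such line. LinkMapSymbol and parse_symbol_line are hypothetical names introduced for this example, and the layout is assumed from the sample above rather than from a formal specification of the Link Map format.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    uint64_t address;     /* address of the symbol */
    uint64_t size;        /* size of the symbol in bytes */
    unsigned file_index;  /* index into the "# Object files" list */
    char name[256];       /* symbol name, e.g. "-[YECallMonitor start]" */
} LinkMapSymbol;

/*
 * Parses a line such as
 *   "0x10000572C\t0x00000080\t[  1] +[YECallMonitor shareInstance]"
 * Returns true and fills `out` on success. Returns false for any line that
 * does not have this shape (headers, section rows, dead-stripped entries).
 */
bool parse_symbol_line(const char *line, LinkMapSymbol *out) {
    assert(line != NULL && out != NULL);

    LinkMapSymbol parsed;
    memset(&parsed, 0, sizeof parsed);

    /* %x accepts the optional 0x prefix; spaces in the format skip tabs. */
    int matched = sscanf(line, "%" SCNx64 " %" SCNx64 " [ %u ] %255[^\n]",
                         &parsed.address, &parsed.size,
                         &parsed.file_index, parsed.name);
    if (matched != 4) {
        return false;
    }
    *out = parsed;
    return true;
}
```

Feeding every line of the # Symbols section through parse_symbol_line gives the symbols in layout order, which is convenient for comparing the layout before and after reordering.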

#import <Foundation/Foundation.h>
#import <dlfcn.h>
#import <libkern/OSAtomic.h>

static OSQueueHead qHead = OS_ATOMIC_QUEUE_INIT;
static BOOL stopCollecting = NO;

typedef struct {
    void *pointer;
    void *next;
} PointerNode;

void __sanitizer_cov_trace_pc_guard_init(uint32_t *start,
                                         uint32_t *stop) {
    static uint32_t N;  // Counter for the guards.
    if (start == stop || *start) return;  // Initialize only once.
    printf("INIT: %p %p\n", start, stop);
    for (uint32_t *x = start; x < stop; x++)
        *x = ++N;  // Guards should start from 1.
}

// This callback is inserted by the compiler on every edge in the
// control flow (some optimizations apply).
// Typically, the compiler will emit the code like this:
//    if(*guard)
//      __sanitizer_cov_trace_pc_guard(guard);
// But for large functions it will emit a simple call:
//    __sanitizer_cov_trace_pc_guard(guard);
void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
    // If initialization has not occurred yet (meaning that guard is uninitialized), that means that initial functions like +load are being run. These functions will only be run once anyways, so we should always allow them to be recorded and ignore guard
    if (stopCollecting) {
        return;
    }
    // If you set *guard to 0 this code will not be called again for this edge.
    // Now you can get the PC and do whatever you want:
    //   store it somewhere or symbolize it and print right away.
    // The values of `*guard` are as you set them in
    // __sanitizer_cov_trace_pc_guard_init and so you can make them consecutive
    // and use them to dereference an array or a bit vector.
    *guard = 0;
    // __builtin_return_address(0) returns the return address of this callback, i.e. an address inside the instrumented function that called it
    void *PC = __builtin_return_address(0);
    PointerNode *node = malloc(sizeof(PointerNode));
    *node = (PointerNode){PC, NULL};
    OSAtomicEnqueue(&qHead, node, offsetof(PointerNode, next));

    
}

extern NSArray <NSString *> *getAllFunctions(NSString *currentFuncName) {
    NSMutableSet<NSString *> *unqSet = [NSMutableSet setWithObject:currentFuncName];
    NSMutableArray <NSString *> *functions = [NSMutableArray array];
    while (YES) {
        PointerNode *front = OSAtomicDequeue(&qHead, offsetof(PointerNode, next));
        if(front == NULL) {
            break;
        }
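        Dl_info info = {0};
        // dladdr looks up the symbol that contains the given address
        int found = dladdr(front->pointer, &info);
        free(front);  // the node was malloc'ed in __sanitizer_cov_trace_pc_guard
        // Skip addresses that have no symbol name; @(NULL) would throw
        if (found == 0 || info.dli_sname == NULL) {
            continue;
        }
        NSString *name = @(info.dli_sname);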
        if([unqSet containsObject:name]) {
            continue;
        }
        BOOL isObjc = [name hasPrefix:@"+["] || [name hasPrefix:@"-["];
        NSString *symbolName = isObjc ? name : [@"_" stringByAppendingString:name];
        [unqSet addObject:name];
        [functions addObject:symbolName];
    }
    return [[functions reverseObjectEnumerator] allObjects];

}

#pragma mark - public

extern NSArray <NSString *> *getAppCalls(void) {
    
    stopCollecting = YES;
    // Full memory barrier: keeps the write to stopCollecting from being reordered with the memory accesses that follow
    __sync_synchronize();
    NSString* curFuncationName = [NSString stringWithUTF8String:__FUNCTION__];
    return getAllFunctions(curFuncationName);
}




extern void appOrderFile(void(^completion)(NSString* orderFilePath)) {
    
    stopCollecting = YES;
    __sync_synchronize();
    NSString* curFuncationName = [NSString stringWithUTF8String:__FUNCTION__];
    
    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(0.01 * NSEC_PER_SEC)), dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        NSArray *functions = getAllFunctions(curFuncationName);
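        // Note: getAllFunctions already reverses the dequeued list, so reversing it again here writes the file in dequeue order (most recent call first)
        NSString *orderFileContent = [functions.reverseObjectEnumerator.allObjects componentsJoinedByString:@"\n"];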
        NSLog(@"[orderFile]: %@",orderFileContent);
        NSString *filePath = [NSTemporaryDirectory() stringByAppendingPathComponent:@"orderFile.order"];
        [orderFileContent writeToFile:filePath
                           atomically:YES
                             encoding:NSUTF8StringEncoding
                                error:nil];
        if(completion){
            completion(filePath);
        }
    });
}
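The OSAtomicEnqueue / OSAtomicDequeue pair used above behaves as a LIFO: the element enqueued last is dequeued first, which is why the collected list has to be reversed to recover call order. The C11 sketch below reproduces that behavior with a simple lock-free push. It is an illustration only, not Apple's implementation: LifoQueue, lifo_push, and lifo_pop are hypothetical names, and lifo_pop assumes a single consumer that runs after all producers have stopped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Node {
    void *pointer;      /* recorded value, e.g. a return address */
    struct Node *next;  /* next older entry */
} Node;

typedef struct {
    _Atomic(Node *) head;  /* most recently pushed node, or NULL */
} LifoQueue;

/* Pushes a value; safe to call from several threads at once. */
void lifo_push(LifoQueue *queue, void *pointer) {
    Node *node = malloc(sizeof *node);
    assert(node != NULL);
    node->pointer = pointer;
    node->next = atomic_load(&queue->head);
    /* On failure, node->next is refreshed with the current head; retry. */
    while (!atomic_compare_exchange_weak(&queue->head, &node->next, node)) {
    }
}

/*
 * Pops the most recently pushed value, or returns NULL when empty.
 * Single consumer only: call it after all producers have stopped.
 */
void *lifo_pop(LifoQueue *queue) {
    Node *node = atomic_load(&queue->head);
    if (node == NULL) {
        return NULL;
    }
    atomic_store(&queue->head, node->next);
    void *pointer = node->pointer;
    free(node);
    return pointer;
}
```

Pushing A, B, C and then popping three times yields C, B, A, so a list built by popping is in reverse call order until it is reversed once.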

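Call appOrderFile once the app has launched to write the call list into the sandbox. The list can then be retrieved by downloading the xcappdata container from Xcode's Devices window.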

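Its contents are the functions called during launch. Note that this listing is ordered most recent call first: _main, which runs first, is on the last line.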

_getThreadMethodStack
_after_objc_msgSend
_before_objc_msgSend
-[YECallMonitor ignoreClassArr]
-[YECallMonitor setFilterClassNames:]
_get_protection
_perform_rebinding_with_section
_rebind_symbols_for_image
__rebind_symbols_for_image
_prepend_rebindings
_rebind_symbols
___startMonitor_block_invoke
_startMonitor
-[YECallMonitor start]
_setMinConsumeTime
-[YECallMonitor setMinTime:]
___30+[YECallMonitor shareInstance]_block_invoke
+[YECallMonitor shareInstance]
-[AppDelegate application:didFinishLaunchingWithOptions:]
-[AppDelegate setWindow:]
-[AppDelegate window]
_main

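Finally, point the Order File build setting at this file, then rebuild and repackage.

Results

The Symbols section of the Link Map after reordering shows that the layout order now matches our order file.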

...
# Symbols:
# Address Size File Name
0x100007CCC	0x000000AC	[  4] _getThreadMethodStack
0x100007D78	0x00000234	[  4] _after_objc_msgSend
0x100007FAC	0x0000016C	[  4] _before_objc_msgSend
0x100008118	0x000001AC	[  1] -[YECallMonitor ignoreClassArr]
0x1000082C4	0x00000298	[  1] -[YECallMonitor setFilterClassNames:]
0x10000855C	0x000000A0	[  5] _get_protection
0x1000085FC	0x000003D0	[  5] _perform_rebinding_with_section
0x1000089CC	0x00000320	[  5] _rebind_symbols_for_image
0x100008CEC	0x00000058	[  5] __rebind_symbols_for_image
0x100008D44	0x00000104	[  5] _prepend_rebindings
0x100008E48	0x000000F8	[  5] _rebind_symbols
0x100008F40	0x000000E0	[  4] ___startMonitor_block_invoke
0x100009020	0x00000074	[  4] _startMonitor
0x100009094	0x00000044	[  1] -[YECallMonitor start]
0x1000090D8	0x00000044	[  4] _setMinConsumeTime
0x10000911C	0x00000054	[  1] -[YECallMonitor setMinTime:]
0x100009170	0x00000074	[  1] ___30+[YECallMonitor shareInstance]_block_invoke
0x1000091E4	0x0000009C	[  1] +[YECallMonitor shareInstance]
0x100009280	0x00000208	[ 11] -[AppDelegate application:didFinishLaunchingWithOptions:]
0x100009488	0x00000070	[ 11] -[AppDelegate setWindow:]
0x1000094F8	0x00000058	[ 11] -[AppDelegate window]
0x100009550	0x000000D4	[  9] _main
...
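The comparison between the order file and the reordered Link Map can also be automated. The C sketch below is a hypothetical helper (first_order_mismatch is not part of the original project). It assumes the symbol names have already been extracted from both files and that the ordered symbols appear consecutively at the start of the extracted Link Map list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Compares the order file entries with the symbol names taken from the
 * Link Map, position by position. Returns the index of the first entry
 * that differs or is missing, or `order_count` when every entry matches.
 */
size_t first_order_mismatch(const char *const *order, size_t order_count,
                            const char *const *linked, size_t linked_count) {
    assert(order != NULL && linked != NULL);
    for (size_t i = 0; i < order_count; i++) {
        if (i >= linked_count || strcmp(order[i], linked[i]) != 0) {
            return i;
        }
    }
    return order_count;
}
```

A return value equal to order_count means the linker placed every listed symbol in the requested position. A smaller value points at the first entry worth investigating, for example a symbol name that the linker could not find.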

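We compared launch speed before and after the optimization with the System Trace instrument. The demo project is too small to show a clear difference, so a company project is used for the comparison:

Page faults dropped by nearly 1/3 and launch became about 1/4 faster, which shows that the technique does help launch time, especially in large projects.

Summary

There are other ways to implement binary reordering, and they all rest on much the same principle. Douyin collects symbol data through manual instrumentation, which covers C++ static initializers, +load, blocks, and so on, and is therefore more accurate, but given its complexity it does not seem cost-effective. The approach used by Mobile Taobao (手淘) is unusual: it performs static instrumentation by modifying the .o object files, which requires good familiarity with object code and is not very general.

Because a page on iOS is 16 KB (4 KB on Intel Macs), a single page holds a lot of code, so launch does not trigger that many page faults. Compared with other techniques, binary reordering brings a modest gain, so other launch optimizations should come first (see the author's earlier article "iOS性能提高全面探索(啓動優化)"). Once those higher-value optimizations are done, consider whether reordering is still worth doing. As a learning exercise, though, the topic is well worth studying 😁.

Related Code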

AppCallCollecter


About Me 🐝

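This year I plan to finish source-code walkthroughs of 10 excellent third-party libraries and will publish them to iOS-Framework-Analysis one by one. You are welcome to star the project and improve together with me. If you spot any shortcomings, please let me know 🏆.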
