iOS Code Slimming in Practice: Removing Unused Classes

This article presents a static-analysis approach for finding unused classes in a Mach-O executable. Source code: xuezhulian/classunref

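In a Mach-O file, the __objc_classrefs section of the __DATA segment records the addresses of the classes that are referenced, and the __objc_classlist section records the addresses of all classes. The set difference of the two gives the addresses of the unused classes; symbolicating those addresses then yields the names of the unreferenced classes.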
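
The idea can be illustrated with a toy example before touching a real binary. The addresses and class names below are placeholders chosen for illustration (they are assumptions, not the contents of an actual executable); the sketch only shows the set difference followed by a lookup in an address-to-name table.

```python
# Hypothetical addresses standing in for the contents of the two sections.
class_list = {"0000000103113f68", "000000010313bb80", "000000010313bc20"}
class_refs = {"000000010313bb80"}

# Hypothetical address -> class name table, as symbolication would produce it.
symbols = {
    "0000000103113f68": "EpisodeStatusDetailItemView",
    "000000010313bb80": "TTBaseControl",
    "000000010313bc20": "LegacyBannerView",
}

# Classes that are listed but never referenced.
unref_pointers = class_list - class_refs
unref_symbols = sorted(symbols[p] for p in unref_pointers if p in symbols)
print(unref_symbols)
```

Addresses without a matching symbol are simply skipped, which mirrors how the later steps treat pointers that cannot be symbolicated.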

Addresses of referenced classes

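The otool utility that ships with macOS can print the contents of the sections in a Mach-O file. Note that the dump looks different for simulator and device executables: for x86_64 the data is printed byte by byte, while for arm it is printed as 32-bit words, so the two cases must be handled separately. The architecture can be obtained with the file command.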

#binary_file_arch: decides how otool's hex dump is decoded (x86_64 and arm are both little-endian, but otool prints them differently)
#file -b output example: Mach-O 64-bit executable arm64
binary_file_arch = os.popen('file -b ' + path).read().split(' ')[-1].strip()
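
Separating the parsing from the command invocation makes this step easy to check against the sample output quoted in the comment above. The following is a minimal sketch; `arch_from_file_output` is a hypothetical helper name, and output for multi-architecture binaries is not covered here.

```python
def arch_from_file_output(output):
    """Return the last whitespace-separated token of `file -b` output."""
    tokens = output.split()
    return tokens[-1] if tokens else ""

# Sample output taken from the comment in the snippet above.
print(arch_from_file_output("Mach-O 64-bit executable arm64\n"))
```

Returning an empty string for empty output lets the caller detect a failed `file` call instead of hitting an IndexError.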

When extracting class addresses, distinguish between x86_64 and arm.

def pointers_from_binary(line, binary_file_arch):
    # Drop the 16-character address column and split the remaining hex fields.
    fields = line[16:].strip().split()
    pointers = set()
    if binary_file_arch == 'x86_64':
        #untreated line example:00000001030cec80 d8 75 15 03 01 00 00 00 68 77 15 03 01 00 00 00
        # Bytes are printed in memory order; reverse each complete group of 8.
        for i in range(0, len(fields) - 7, 8):
            pointers.add(''.join(fields[i:i + 8][::-1]))
        return pointers
    #arm64 confirmed,armv7 arm7s unconfirmed
    if binary_file_arch.startswith('arm'):
        #untreated line example:00000001030bcd20 03138580 00000001 03138878 00000001
        # 32-bit words are printed low word first; join each complete pair.
        for i in range(0, len(fields) - 1, 2):
            pointers.add(fields[i + 1] + fields[i])
        return pointers
    # Unsupported architecture: return an empty set so callers can still union.
    return pointers
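
To see why the two dump layouts produce the same kind of address string, the two sample lines from the comments above can be decoded numerically. This is an illustrative sketch rather than the script's own code: `decode_x86_64_bytes` and `decode_arm_words` are hypothetical names, and the decoding uses `int.from_bytes` and bit shifting instead of string reversal.

```python
def decode_x86_64_bytes(byte_fields):
    """Decode 8 hex byte strings, printed in memory order, as one little-endian pointer."""
    raw = bytes.fromhex("".join(byte_fields))
    return "%016x" % int.from_bytes(raw, "little")

def decode_arm_words(low_word, high_word):
    """Join two 32-bit hex words (low word printed first) into one 64-bit pointer."""
    return "%016x" % ((int(high_word, 16) << 32) | int(low_word, 16))

# First pointer of each sample line quoted in the comments above.
print(decode_x86_64_bytes("d8 75 15 03 01 00 00 00".split()))  # 00000001031575d8
print(decode_arm_words("03138580", "00000001"))                # 0000000103138580
```

Both results are 16-digit lowercase hex strings, the same format as the address column printed by nm, which is what makes the later dictionary lookup possible.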

Use otool -v -s __DATA __objc_classrefs to obtain the addresses of the referenced classes.

def class_ref_pointers(path, binary_file_arch):
    ref_pointers = set()
    lines = os.popen('/usr/bin/otool -v -s __DATA __objc_classrefs %s' % path).readlines()
    for line in lines:
        pointers = pointers_from_binary(line, binary_file_arch)
        # Skip lines that yield no pointers (the helper may return None or an empty set).
        if pointers:
            ref_pointers = ref_pointers.union(pointers)
    return ref_pointers

Addresses of all classes

Use otool -v -s __DATA __objc_classlist to obtain the addresses of all classes.

def class_list_pointers(path, binary_file_arch):
    list_pointers = set()
    lines = os.popen('/usr/bin/otool -v -s __DATA __objc_classlist %s' % path).readlines()
    for line in lines:
        pointers = pointers_from_binary(line, binary_file_arch)
        # Skip lines that yield no pointers (the helper may return None or an empty set).
        if pointers:
            list_pointers = list_pointers.union(pointers)
    return list_pointers
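
The two functions above differ only in the section name, and both build a shell command by string formatting, which may break on paths that contain spaces. One possible alternative, sketched here under the assumption that Python 3.7+ is available, passes the arguments as a list through `subprocess.run`. The helper names `otool_section_command` and `read_section_lines` are hypothetical and not part of the original script.

```python
import subprocess

def otool_section_command(path, segment, section):
    """Build the argument list for dumping one section with otool."""
    return ["/usr/bin/otool", "-v", "-s", segment, section, path]

def read_section_lines(path, segment, section):
    """Run otool and return its output lines (needs otool, i.e. a macOS toolchain)."""
    result = subprocess.run(
        otool_section_command(path, segment, section),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()

# The path is passed as a single argument, so no quoting is needed.
print(otool_section_command("/tmp/My App", "__DATA", "__objc_classlist"))
```

With such a helper, both pointer sets could be collected by the same loop, called once with `__objc_classrefs` and once with `__objc_classlist`.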

Taking the difference

Subtracting the referenced classes from the full class list gives the addresses of the unused classes.

unref_pointers = class_list_pointers(path, binary_file_arch) - class_ref_pointers(path, binary_file_arch)

Symbolication

The nm -nm command lists each address together with its class name.

def class_symbols(path):
    symbols = {}
    #class symbol format from nm: 0000000103113f68 (__DATA,__objc_data) external _OBJC_CLASS_$_EpisodeStatusDetailItemView
    re_class_name = re.compile(r'(\w{16}) .* _OBJC_CLASS_\$_(.+)')
    lines = os.popen('nm -nm %s' % path).readlines()
    for line in lines:
        result = re_class_name.findall(line)
        if result:
            (address, symbol) = result[0]
            symbols[address] = symbol
    return symbols
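
The regular expression can be checked against the sample nm line quoted in the comment above without running nm at all. In this sketch `parse_class_symbols` is a hypothetical helper that takes the lines as input, and the second sample line (a non-class symbol) is made up to show that such lines are ignored.

```python
import re

RE_CLASS_SYMBOL = re.compile(r"(\w{16}) .* _OBJC_CLASS_\$_(.+)")

def parse_class_symbols(nm_lines):
    """Map 16-digit addresses to class names for lines that declare a class symbol."""
    symbols = {}
    for line in nm_lines:
        match = RE_CLASS_SYMBOL.search(line)
        if match:
            address, name = match.groups()
            symbols[address] = name.strip()
    return symbols

sample = [
    "0000000103113f68 (__DATA,__objc_data) external _OBJC_CLASS_$_EpisodeStatusDetailItemView\n",
    "0000000100004000 (__TEXT,__text) external _main\n",  # made-up non-class symbol
]
print(parse_class_symbols(sample))
```

Because `.` does not match a newline, the captured class name never includes the trailing line break.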

Filtering

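In practice we found that when a subclass is instantiated but its superclass is not, the superclass does not appear in the __objc_classrefs section, so these superclasses have to be filtered out of the unused classes. otool -oV prints the inheritance relationships between classes.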

def filter_super_class(unref_symbols):
    re_subclass_name = re.compile(r"\w{16} 0x\w{9} _OBJC_CLASS_\$_(.+)")
    re_superclass_name = re.compile(r"\s*superclass 0x\w{9} _OBJC_CLASS_\$_(.+)")
    #subclass example: 0000000102bd8070 0x103113f68 _OBJC_CLASS_$_TTEpisodeStatusDetailItemView
    #superclass example: superclass 0x10313bb80 _OBJC_CLASS_$_TTBaseControl
    lines = os.popen("/usr/bin/otool -oV %s" % path).readlines()
    subclass_name = ""
    superclass_name = ""
    for line in lines:
        subclass_match_result = re_subclass_name.findall(line)
        if subclass_match_result:
            subclass_name = subclass_match_result[0]
        superclass_match_result = re_superclass_name.findall(line)
        if superclass_match_result:
            superclass_name = superclass_match_result[0]

        if len(subclass_name) > 0 and len(superclass_name) > 0:
            if superclass_name in unref_symbols and subclass_name not in unref_symbols:
                unref_symbols.remove(superclass_name)
            superclass_name = ""
            subclass_name = ""
    return unref_symbols
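
The filtering rule can also be written as two small functions that work on text lines, which makes it testable without a binary. This is a sketch under assumptions, not the original script: the helper names are hypothetical, the sample lines are the ones quoted in the comments above, `TTOrphanView` is a made-up class, and the loop is repeated until nothing changes so that chains of superclasses are handled regardless of the order in which classes are listed.

```python
import re

RE_SUBCLASS = re.compile(r"\w{16} 0x\w{9} _OBJC_CLASS_\$_(.+)")
RE_SUPERCLASS = re.compile(r"\s*superclass 0x\w{9} _OBJC_CLASS_\$_(.+)")

def inheritance_pairs(lines):
    """Yield (subclass, superclass) pairs from otool -oV style lines."""
    subclass = None
    for line in lines:
        sub = RE_SUBCLASS.match(line)
        if sub:
            subclass = sub.group(1).strip()
            continue
        sup = RE_SUPERCLASS.match(line)
        if sup and subclass:
            yield subclass, sup.group(1).strip()
            subclass = None

def drop_used_superclasses(unref_symbols, pairs):
    """Remove classes that are superclasses of a class still considered used."""
    remaining = set(unref_symbols)
    pairs = list(pairs)
    changed = True
    while changed:  # repeat so that chains of superclasses are handled
        changed = False
        for subclass, superclass in pairs:
            if superclass in remaining and subclass not in remaining:
                remaining.discard(superclass)
                changed = True
    return remaining

sample = [
    "0000000102bd8070 0x103113f68 _OBJC_CLASS_$_TTEpisodeStatusDetailItemView",
    "    superclass 0x10313bb80 _OBJC_CLASS_$_TTBaseControl",
]
pairs = list(inheritance_pairs(sample))
print(drop_used_superclasses({"TTBaseControl", "TTOrphanView"}, pairs))
```

Here `TTBaseControl` is dropped from the unused set because its subclass is in use, while `TTOrphanView` stays.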

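To avoid flagging third-party libraries by mistake, you can also filter out classes with certain prefixes, or keep only the classes that carry certain prefixes.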

for unref_pointer in unref_pointers:
    if unref_pointer in symbols:
        unref_symbol = symbols[unref_pointer]
        if len(reserved_prefix) > 0 and not unref_symbol.startswith(reserved_prefix):
            continue
        if len(filter_prefix) > 0 and unref_symbol.startswith(filter_prefix):
            continue
        unref_symbols.add(unref_symbol)
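
The two prefix rules can be isolated in a small predicate to make their effect easier to see. The sketch below uses a hypothetical helper `keep_symbol`; the class names are illustrative only.

```python
def keep_symbol(symbol, reserved_prefix="", filter_prefix=""):
    """Return True if the class name passes both prefix rules."""
    if reserved_prefix and not symbol.startswith(reserved_prefix):
        return False  # not one of the prefixes we want to keep
    if filter_prefix and symbol.startswith(filter_prefix):
        return False  # explicitly excluded prefix
    return True

names = ["TTBaseControl", "FBSDKLoginManager", "TTOrphanView"]
print([n for n in names if keep_symbol(n, filter_prefix="FBSDK")])
print([n for n in names if keep_symbol(n, reserved_prefix="TT")])
```

Since `str.startswith` also accepts a tuple of strings, several prefixes can be passed at once, for example `reserved_prefix=("TT", "XY")`.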

The final result is saved in the script's directory.

script_path = sys.path[0].strip()
# Write the count first, then one class name per line.
with open(script_path + "/result.txt", "w") as f:
    f.write("unref class number: %d\n" % len(unref_symbols))
    f.write("\n")
    for unref_symbol in unref_symbols:
        f.write(unref_symbol + "\n")

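This approach can reduce redundant code and shrink the package size to some extent. Because it is a static analysis, it cannot account for dynamic usage, such as classes that are looked up by name at runtime, so every class flagged for deletion needs further confirmation.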


Author: xuezhulian  Link: https://juejin.im/post/5d5d1a92e51d45620923886a  Source: Juejin (掘金)