A look at hooking techniques on iOS

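On iOS, Objective-C methods are usually hooked through the runtime, but the runtime cannot hook C functions. fishhook fills that gap: it can hook C functions that are resolved through dynamic linking. This post walks through both approaches and reads through the source of the third-party libraries behind them.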

Hooking Objective-C methods with the runtime

The runtime provides two functions for hooking Objective-C methods: class_replaceMethod and method_exchangeImplementations.

/** 
 * Replaces the implementation of a method for a given class.
 * 
 * @param cls The class you want to modify.
 * @param name A selector that identifies the method whose implementation you want to replace.
 * @param imp The new implementation for the method identified by name for the class identified by cls.
 * @param types An array of characters that describe the types of the arguments to the method. 
 *  Since the function must take at least two arguments—self and _cmd, the second and third characters
 *  must be "@:" (the first character is the return type).
 * 
 * @return The previous implementation of the method identified by \e name for the class identified by \e cls.
 * 
 * @note This function behaves in two different ways:
 *  - If the method identified by \e name does not yet exist, it is added as if \c class_addMethod were called. 
 *    The type encoding specified by \e types is used as given.
 *  - If the method identified by \e name does exist, its \c IMP is replaced as if \c method_setImplementation were called.
 *    The type encoding specified by \e types is ignored.
 */
OBJC_EXPORT IMP _Nullable
class_replaceMethod(Class _Nullable cls, SEL _Nonnull name, IMP _Nonnull imp, 
                    const char * _Nullable types);

/** 
 * Exchanges the implementations of two methods.
 * 
 * @param m1 Method to exchange with second method.
 * @param m2 Method to exchange with first method.
 * 
 * @note This is an atomic version of the following:
 *  \code 
 *  IMP imp1 = method_getImplementation(m1);
 *  IMP imp2 = method_getImplementation(m2);
 *  method_setImplementation(m1, imp2);
 *  method_setImplementation(m2, imp1);
 *  \endcode
 */
OBJC_EXPORT void
method_exchangeImplementations(Method _Nonnull m1, Method _Nonnull m2) 
    OBJC_AVAILABLE(10.5, 2.0, 9.0, 1.0, 2.0);

For example, the following code hooks UIButton's sendAction:to:forEvent: method so that custom logic runs around it.

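#import <UIKit/UIKit.h>
#import <objc/runtime.h>

@implementation UIButton (MyHook)

+ (void)load {
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        Class cls = self;
        SEL originalSEL = @selector(sendAction:to:forEvent:);
        SEL hookedSEL   = @selector(cs_sendAction:to:forEvent:);
        Method before   = class_getInstanceMethod(cls, originalSEL);
        Method after    = class_getInstanceMethod(cls, hookedSEL);
        // sendAction:to:forEvent: is declared on UIControl, so UIButton may only inherit it.
        // Try to add it to UIButton first, so the swap never touches the superclass's method.
        if (class_addMethod(cls, originalSEL, method_getImplementation(after), method_getTypeEncoding(after))) {
            class_replaceMethod(cls, hookedSEL, method_getImplementation(before), method_getTypeEncoding(before));
        } else {
            method_exchangeImplementations(before, after);
        }
    });
}

- (void)cs_sendAction:(SEL)action to:(id)target forEvent:(UIEvent *)event {
    /// Custom hook logic goes here.

    /// This calls the swizzled selector, whose implementation is now the original method.
    [self cs_sendAction:action to:target forEvent:event];
}

@end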

method_exchangeImplementations is implemented as follows:

void method_exchangeImplementations(Method m1, Method m2) {
    if (!m1  ||  !m2) return;

    mutex_locker_t lock(runtimeLock);

    IMP m1_imp = m1->imp;
    m1->imp = m2->imp;
    m2->imp = m1_imp;


    // RR/AWZ updates are slow because class is unknown
    // Cache updates are slow because class is unknown
    // fixme build list of classes whose Methods are known externally?

    flushCaches(nil);

    // Update custom RR and AWZ when a method changes its IMP
    updateCustomRR_AWZ(nil, m1);
    updateCustomRR_AWZ(nil, m2);
}
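
To make the pointer swap concrete, here is a minimal sketch in plain C. It is a toy model, not the Objective-C runtime: toy_method, toy_send and the other toy_ names are hypothetical, and the sketch leaves out the locking and cache flushing that the real function performs.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the runtime's IMP and Method types (hypothetical names). */
typedef const char *(*toy_imp)(void);

typedef struct {
    const char *selector;  /* method name */
    toy_imp imp;           /* implementation currently bound to that name */
} toy_method;

static const char *original_impl(void) { return "original"; }
static const char *hooked_impl(void)   { return "hooked"; }

/* A two-entry "method list" for one class. */
static toy_method toy_table[] = {
    {"sendAction", original_impl},
    {"cs_sendAction", hooked_impl},
};

/* Look a method up by selector name, roughly what class_getInstanceMethod does. */
static toy_method *toy_get_method(toy_method *table, size_t count, const char *selector) {
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].selector, selector) == 0) return &table[i];
    }
    return NULL;
}

/* Swap the two implementation pointers; the selectors stay where they are. */
static void toy_exchange_implementations(toy_method *m1, toy_method *m2) {
    if (!m1 || !m2) return;
    toy_imp tmp = m1->imp;
    m1->imp = m2->imp;
    m2->imp = tmp;
}

/* "Send a message": find the method and call whatever IMP it holds right now. */
static const char *toy_send(toy_method *table, size_t count, const char *selector) {
    toy_method *m = toy_get_method(table, count, selector);
    return m ? m->imp() : "unrecognized selector";
}
```

After the exchange, sending "sendAction" runs the hook, and sending "cs_sendAction" runs the original code, which is why the hook can call its own selector to reach the original implementation.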

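Some caveats

  1. Runtime hooks are usually installed in +load, which runs when the class is loaded. Installing them in +initialize is not necessarily thread-safe, because +initialize runs lazily on whichever thread first messages the class.
  2. If the implementation of the hooked method depends on _cmd, the hook may break it: after the exchange, the original implementation is reached through the swizzled selector, so the _cmd it sees is no longer the original selector.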
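
The _cmd caveat can be seen in a small toy model written in plain C. Nothing here is runtime API: cmd_method, cmd_send and the selector names are hypothetical, and the selector is passed as a string to stand in for _cmd.

```c
#include <string.h>

/* Toy IMP that receives the selector name, standing in for _cmd. */
typedef const char *(*cmd_imp)(const char *cmd);

typedef struct {
    const char *selector;
    cmd_imp imp;
} cmd_method;

static const char *original_impl_using_cmd(const char *cmd);
static const char *hook_impl_calling_itself(const char *cmd);

static cmd_method cmd_original = {"viewDidLoad", original_impl_using_cmd};
static cmd_method cmd_hook     = {"xx_viewDidLoad", hook_impl_calling_itself};

/* "Send" a message: the implementation is handed the selector that was used. */
static const char *cmd_send(const cmd_method *m) {
    return m->imp(m->selector);
}

/* The original implementation depends on _cmd; here it simply reports it. */
static const char *original_impl_using_cmd(const char *cmd) {
    return cmd;
}

/* The hook calls its own selector, as swizzled code usually does. */
static const char *hook_impl_calling_itself(const char *cmd) {
    (void)cmd;
    return cmd_send(&cmd_hook);
}

/* Swap the implementations of two methods. */
static void cmd_exchange(cmd_method *m1, cmd_method *m2) {
    cmd_imp tmp = m1->imp;
    m1->imp = m2->imp;
    m2->imp = tmp;
}
```

Before the exchange, the original implementation sees "viewDidLoad". After it, a call to viewDidLoad goes through the hook and arrives at the original implementation with "xx_viewDidLoad", so any logic keyed on _cmd now sees a different selector.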

The Aspects library

Aspects is widely used on iOS for AOP (aspect-oriented programming).

Think of Aspects as method swizzling on steroids. It allows you to add code to existing methods per class or per instance, whilst thinking of the insertion point e.g. before/instead/after. Aspects automatically deals with calling super and is easier to use than regular method swizzling.

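Aspects can hook a single instance, which plain runtime swizzling does not offer out of the box.

However, the official Aspects documentation also lists some problems and advises against using Aspects in production.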

Aspects hooks deep into the class hierarchy and creates dynamic subclasses, much like KVO. There's known issues with this approach, and to this date (February 2019) I STRICTLY DO NOT RECOMMEND TO USE Aspects IN PRODUCTION CODE. We use it for partial test mocks in, PSPDFKit, an iOS PDF framework that ships with apps like Dropbox or Evernote, it's also very useful for quickly hacking something up.

Aspects hooks methods through Objective-C message forwarding, rather than applying method_exchangeImplementations directly as ordinary runtime swizzling does.

Aspects uses Objective-C message forwarding to hook into messages. This will create some overhead. Don't add aspects to methods that are called a lot. Aspects is meant for view/controller code that is not called 1000 times per second.

Page-view tracking with AOP

[UIViewController aspect_hookSelector:@selector(viewDidLoad) withOptions:AspectPositionAfter usingBlock:^(id<AspectInfo> aspectInfo) {
    NSLog(@"statistics: viewDidLoad has been hooked.");
    NSLog(@"View Controller %@ didLoad\n\n", aspectInfo.instance);
} error:nil];

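How Aspects works

Aspects is very lightweight: its only two public entry points live in a category on NSObject, and hooks can be removed again. It is a capable library, and its source is well worth studying.

The public interface of Aspects

The hook API is offered both as a class method, which hooks the selector for a class, and as an instance method, which hooks it for a single instance.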

/**
 Aspects uses Objective-C message forwarding to hook into messages. This will create some overhead. Don't add aspects to methods that are called a lot. Aspects is meant for view/controller code that is not called a 1000 times per second.

 Adding aspects returns an opaque token which can be used to deregister again. All calls are thread safe.
 */
@interface NSObject (Aspects)

/// Adds a block of code before/instead/after the current `selector` for a specific class.
///
/// @param block Aspects replicates the type signature of the method being hooked.
/// The first parameter will be `id<AspectInfo>`, followed by all parameters of the method.
/// These parameters are optional and will be filled to match the block signature.
/// You can even use an empty block, or one that simple gets `id<AspectInfo>`.
///
/// @note Hooking static methods is not supported.
/// @return A token which allows to later deregister the aspect.
+ (id<AspectToken>)aspect_hookSelector:(SEL)selector
                      withOptions:(AspectOptions)options
                       usingBlock:(id)block
                            error:(NSError **)error
{
    return aspect_add((id)self, selector, options, block, error);
}

/// Adds a block of code before/instead/after the current `selector` for a specific instance.
- (id<AspectToken>)aspect_hookSelector:(SEL)selector
                      withOptions:(AspectOptions)options
                       usingBlock:(id)block
                            error:(NSError **)error
{
    return aspect_add(self, selector, options, block, error);
}

@end

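Aspects data structures

AspectOptions

AspectOptions controls when the hook runs relative to the original implementation; the default is after it. A hook can also be marked one-shot, in which case it is removed after its first execution.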

typedef NS_OPTIONS(NSUInteger, AspectOptions) {
    AspectPositionAfter   = 0,            /// Called after the original implementation (default)
    AspectPositionInstead = 1,            /// Will replace the original implementation.
    AspectPositionBefore  = 2,            /// Called before the original implementation.
    
    AspectOptionAutomaticRemoval = 1 << 3 /// Will remove the hook after the first execution.
};
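
Note that the three position values are plain numbers rather than independent bit flags (AspectPositionAfter is 0), so a position cannot be tested with a bare `&`. The sketch below shows one way to decode an options value in plain C. The enum values are copied from the declaration above; POSITION_MASK and the two helper functions are hypothetical names, and the 0x07 mask assumes that the low three bits carry the position.

```c
#include <stdbool.h>

/* Values copied from the AspectOptions declaration above. */
enum {
    AspectPositionAfter          = 0,
    AspectPositionInstead        = 1,
    AspectPositionBefore         = 2,
    AspectOptionAutomaticRemoval = 1 << 3
};

/* Assumption: the low three bits hold the position, the rest are flags. */
#define POSITION_MASK 0x07u

/* Extract the position (after / instead / before) from an options value. */
static unsigned aspect_position(unsigned options) {
    return options & POSITION_MASK;
}

/* Report whether the one-shot flag is set. */
static bool aspect_removes_itself(unsigned options) {
    return (options & AspectOptionAutomaticRemoval) != 0;
}
```

With this, `AspectPositionBefore | AspectOptionAutomaticRemoval` decodes to the position AspectPositionBefore with the one-shot flag set, while a bare `AspectOptionAutomaticRemoval` decodes to the default position AspectPositionAfter.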

AspectToken

Calling aspect_hookSelector returns an id that conforms to the AspectToken protocol. The protocol has a single remove method, which deregisters the hook that was added.

/// Opaque Aspect Token that allows to deregister the hook.
@protocol AspectToken <NSObject>

/// Deregisters an aspect.
/// @return YES if deregistration is successful, otherwise NO.
- (BOOL)remove;

@end

AspectInfo

AspectInfo carries the information available when a hooked Objective-C method is invoked. NSInvocation clearly plays the central role.

/// The AspectInfo protocol is the first parameter of our block syntax.
@protocol AspectInfo <NSObject>

/// The instance that is currently hooked.
- (id)instance;

/// The original invocation of the hooked method.
- (NSInvocation *)originalInvocation;

/// All method arguments, boxed. This is lazily evaluated.
- (NSArray *)arguments;

@end

AspectIdentifier

AspectIdentifier holds the information for a single aspect.

// Tracks a single aspect.
@interface AspectIdentifier : NSObject
+ (instancetype)identifierWithSelector:(SEL)selector object:(id)object options:(AspectOptions)options block:(id)block error:(NSError **)error;
- (BOOL)invokeWithInfo:(id<AspectInfo>)info;
@property (nonatomic, assign) SEL selector;
@property (nonatomic, strong) id block;
@property (nonatomic, strong) NSMethodSignature *blockSignature;
@property (nonatomic, weak) id object;
@property (nonatomic, assign) AspectOptions options;
@end

AspectsContainer

AspectsContainer holds all the aspects registered for an object or a class.

// Tracks all aspects for an object/class.
@interface AspectsContainer : NSObject
- (void)addAspect:(AspectIdentifier *)aspect withOptions:(AspectOptions)injectPosition;
- (BOOL)removeAspect:(id)aspect;
- (BOOL)hasAspects;
@property (atomic, copy) NSArray *beforeAspects;
@property (atomic, copy) NSArray *insteadAspects;
@property (atomic, copy) NSArray *afterAspects;
@end

AspectTracker

Used to track aspects: it records which selector names have been hooked on a class.

@interface AspectTracker : NSObject
- (id)initWithTrackedClass:(Class)trackedClass parent:(AspectTracker *)parent;
@property (nonatomic, strong) Class trackedClass;
@property (nonatomic, strong) NSMutableSet *selectorNames;
@property (nonatomic, weak) AspectTracker *parentEntry;
@end

_AspectBlock

typedef struct _AspectBlock {
	__unused Class isa;
	AspectBlockFlags flags;
	__unused int reserved;
	void (__unused *invoke)(struct _AspectBlock *block, ...);
	struct {
		unsigned long int reserved;
		unsigned long int size;
		// requires AspectBlockFlagsHasCopyDisposeHelpers
		void (*copy)(void *dst, const void *src);
		void (*dispose)(const void *);
		// requires AspectBlockFlagsHasSignature
		const char *signature;
		const char *layout;
	} *descriptor;
	// imported variables
} *AspectBlockRef;

aspect_add and aspect_remove

static id aspect_add(id self, SEL selector, AspectOptions options, id block, NSError **error) {
    NSCParameterAssert(self);
    NSCParameterAssert(selector);
    NSCParameterAssert(block);

    __block AspectIdentifier *identifier = nil;
    // A spin lock guards the registration; this is where Aspects' claimed thread safety comes from.
    aspect_performLocked(^{
        // Check whether the selector may be hooked (Aspects keeps a blacklist) and whether the requested position is valid.
        if (aspect_isSelectorAllowedAndTrack(self, selector, options, error)) {
            // The AspectsContainer is stored on the object as an associated object.
            AspectsContainer *aspectContainer = aspect_getContainerForObject(self, selector);
            // Wrap the aspect's details in an AspectIdentifier.
            identifier = [AspectIdentifier identifierWithSelector:selector object:self options:options block:block error:error];
            if (identifier) {
                // Store the identifier in the container; the position in options selects which list it joins.
                [aspectContainer addAspect:identifier withOptions:options];

                // Modify the class to allow message interception.
                aspect_prepareClassAndHookSelector(self, selector, error);
            }
        }
    });
    return identifier;
}

static BOOL aspect_remove(AspectIdentifier *aspect, NSError **error) {
    NSCAssert([aspect isKindOfClass:AspectIdentifier.class], @"Must have correct type.");

    __block BOOL success = NO;
    aspect_performLocked(^{
        id self = aspect.object; // strongify
        if (self) {
            AspectsContainer *aspectContainer = aspect_getContainerForObject(self, aspect.selector);
            success = [aspectContainer removeAspect:aspect];

            aspect_cleanupHookedClassAndSelector(self, aspect.selector);
            // destroy token
            aspect.object = nil;
            aspect.block = nil;
            aspect.selector = NULL;
        }else {
            NSString *errrorDesc = [NSString stringWithFormat:@"Unable to deregister hook. Object already deallocated: %@", aspect];
            AspectError(AspectErrorRemoveObjectAlreadyDeallocated, errrorDesc);
        }
    });
    return success;
}

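This code surrounds the actual hooking and handles validation and error cases, for example:

  1. retain, release, autorelease, and forwardInvocation: may not be hooked.
  2. dealloc may only be hooked with AspectPositionBefore. That makes sense, since the object no longer exists once dealloc has run.
  3. It checks whether the method has already been hooked, so that a repeated hook does not cause an exception.
  4. The method must exist on the class.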

aspect_prepareClassAndHookSelector

This is the core of the actual hook. It also uses the runtime's class_replaceMethod, except that what gets replaced here is the system's message-forwarding path.

static void aspect_prepareClassAndHookSelector(NSObject *self, SEL selector, NSError **error) {
    NSCParameterAssert(selector);
    Class klass = aspect_hookClass(self, error);
    Method targetMethod = class_getInstanceMethod(klass, selector);
    IMP targetMethodIMP = method_getImplementation(targetMethod);
    if (!aspect_isMsgForwardIMP(targetMethodIMP)) {
        // Make a method alias for the existing method implementation, it not already copied.
        const char *typeEncoding = method_getTypeEncoding(targetMethod);
        SEL aliasSelector = aspect_aliasForSelector(selector);
        if (![klass instancesRespondToSelector:aliasSelector]) {
            /// e.g. the alias selector's implementation now points at the original viewDidLoad.
            __unused BOOL addedAlias = class_addMethod(klass, aliasSelector, method_getImplementation(targetMethod), typeEncoding);
            NSCAssert(addedAlias, @"Original implementation for %@ is already copied to %@ on %@", NSStringFromSelector(selector), NSStringFromSelector(aliasSelector), klass);
        }

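        /// e.g. klass now has a dynamically added aspects__viewDidLoad method.
        /// aspect_getMsgForwardIMP(self, selector) evaluates to (IMP) msgForwardIMP = 0x00007fff513f8400 (libobjc.A.dylib`_objc_msgForward)
        /// Point selector's implementation at _objc_msgForward, so calling the original method goes straight into message forwarding.
        /// The subclass's forwardInvocation: has already been replaced with __ASPECTS_ARE_BEING_CALLED__.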
        // We use forwardInvocation to hook in.
        class_replaceMethod(klass, selector, aspect_getMsgForwardIMP(self, selector), typeEncoding);
        AspectLog(@"Aspects: Installed hook for -[%@ %@].", klass, NSStringFromSelector(selector));
    }
}

There are two key points here:

Class klass = aspect_hookClass(self, error);

static Class aspect_hookClass(NSObject *self, NSError **error) {
    NSCParameterAssert(self);
	  Class statedClass = self.class;
	  Class baseClass = object_getClass(self);
	  NSString *className = NSStringFromClass(baseClass);

    // Already subclassed
	  if ([className hasSuffix:AspectsSubclassSuffix]) {
		    return baseClass;

        // We swizzle a class object, not a single object.
	  }else if (class_isMetaClass(baseClass)) {
        return aspect_swizzleClassInPlace((Class)self);
        // Probably a KVO'ed class. Swizzle in place. Also swizzle meta classes in place.
    }else if (statedClass != baseClass) {
        return aspect_swizzleClassInPlace(baseClass);
    }

    // Default case. Create dynamic subclass.
	  const char *subclassName = [className stringByAppendingString:AspectsSubclassSuffix].UTF8String;
	  Class subclass = objc_getClass(subclassName);

	  if (subclass == nil) {
        // Create the class dynamically
		    subclass = objc_allocateClassPair(baseClass, subclassName, 0);
		    if (subclass == nil) {
            NSString *errrorDesc = [NSString stringWithFormat:@"objc_allocateClassPair failed to allocate class %s.", subclassName];
            AspectError(AspectErrorFailedToAllocateClassPair, errrorDesc);
            return nil;
        }

		    aspect_swizzleForwardInvocation(subclass);
		    aspect_hookedGetClass(subclass, statedClass);
		    aspect_hookedGetClass(object_getClass(subclass), statedClass);
        // Register the new class
		    objc_registerClassPair(subclass);
	  }

    // isa swizzling
	  object_setClass(self, subclass);
	  return subclass;
}

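This long function uses several familiar runtime APIs: object_getClass, objc_allocateClassPair, objc_registerClassPair, and object_setClass.

It dynamically creates a subclass of the object's class. aspect_hookedGetClass(subclass, statedClass); then makes @selector(class) on instances of that subclass return the original class, and aspect_hookedGetClass(object_getClass(subclass), statedClass); does the same on the subclass's metaclass, so that sending class to the subclass itself also reports the original class. This part is the hardest to follow.

object_setClass(self, subclass); is the line that stands out: it is the same isa swizzling that KVO uses. From then on, object_getClass(self) returns the subclass whose name carries the Aspects suffix. The benefit is that inspecting the isa of an instance or class shows directly whether Aspects has hooked it, while callers keep treating it as the original object. All hooking happens in the dynamically generated subclass, without unnecessary changes to the object itself.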

class_replaceMethod(klass, selector, aspect_getMsgForwardIMP(self, selector), typeEncoding)

If instances of klass do not yet respond to the alias selector, it is added first:

__unused BOOL addedAlias = class_addMethod(klass, aliasSelector, method_getImplementation(targetMethod), typeEncoding);

This step is the typical runtime hook operation:

// We use forwardInvocation to hook in.
class_replaceMethod(klass, selector, aspect_getMsgForwardIMP(self, selector), typeEncoding);

aspect_swizzleForwardInvocation

aspect_swizzleForwardInvocation(klass) replaces the implementation of klass's forwardInvocation: method with __ASPECTS_ARE_BEING_CALLED__.

static NSString *const AspectsForwardInvocationSelectorName = @"__aspects_forwardInvocation:";
static void aspect_swizzleForwardInvocation(Class klass) {
    NSCParameterAssert(klass);
    // If there is no method, replace will act like class_addMethod.
    IMP originalImplementation = class_replaceMethod(klass, @selector(forwardInvocation:), (IMP)__ASPECTS_ARE_BEING_CALLED__, "v@:@");
    if (originalImplementation) {
        class_addMethod(klass, NSSelectorFromString(AspectsForwardInvocationSelectorName), originalImplementation, "v@:@");
    }
    AspectLog(@"Aspects: %@ is now aspect aware.", NSStringFromClass(klass));
}

__ASPECTS_ARE_BEING_CALLED__

__ASPECTS_ARE_BEING_CALLED__ is implemented as follows:

/// This is where the hook code actually runs.
// This is the swizzled forwardInvocation: method.
static void __ASPECTS_ARE_BEING_CALLED__(__unsafe_unretained NSObject *self, SEL selector, NSInvocation *invocation) {
    NSCParameterAssert(self);
    NSCParameterAssert(invocation);
    SEL originalSelector = invocation.selector;
	  SEL aliasSelector = aspect_aliasForSelector(invocation.selector);
    invocation.selector = aliasSelector;
    AspectsContainer *objectContainer = objc_getAssociatedObject(self, aliasSelector);
    AspectsContainer *classContainer = aspect_getContainerForClass(object_getClass(self), aliasSelector);
    AspectInfo *info = [[AspectInfo alloc] initWithInstance:self invocation:invocation];
    NSArray *aspectsToRemove = nil;

    /// aspect_invoke runs the block stored in each AspectIdentifier.
    // Before hooks.
    aspect_invoke(classContainer.beforeAspects, info);
    aspect_invoke(objectContainer.beforeAspects, info);

    // Instead hooks.
    BOOL respondsToAlias = YES;
    if (objectContainer.insteadAspects.count || classContainer.insteadAspects.count) {
        aspect_invoke(classContainer.insteadAspects, info);
        aspect_invoke(objectContainer.insteadAspects, info);
    }else {
        Class klass = object_getClass(invocation.target);
        do {
            if ((respondsToAlias = [klass instancesRespondToSelector:aliasSelector])) {
                [invocation invoke];
                break;
            }
        }while (!respondsToAlias && (klass = class_getSuperclass(klass)));
    }

    // After hooks.
    aspect_invoke(classContainer.afterAspects, info);
    aspect_invoke(objectContainer.afterAspects, info);

    // If no hooks are installed, call original implementation (usually to throw an exception)
    if (!respondsToAlias) {
        invocation.selector = originalSelector;
        SEL originalForwardInvocationSEL = NSSelectorFromString(AspectsForwardInvocationSelectorName);
        if ([self respondsToSelector:originalForwardInvocationSEL]) {
            ((void( *)(id, SEL, NSInvocation *))objc_msgSend)(self, originalForwardInvocationSEL, invocation);
        }else {
            [self doesNotRecognizeSelector:invocation.selector];
        }
    }

    // Remove any hooks that are queued for deregistration.
    [aspectsToRemove makeObjectsPerformSelector:@selector(remove)];
}
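
The ordering that this function enforces (before hooks, then either the instead hooks or the original implementation, then after hooks) can be sketched in plain C. This is a simplified toy: hook_container, toy_forward_invocation and the log_ helpers are hypothetical names, a single container stands in for the separate class and object containers, and automatic removal is left out.

```c
#include <stddef.h>
#include <string.h>

#define MAX_HOOKS 4

/* A hook (or the original method) appends a word to a log buffer. */
typedef void (*hook_fn)(char *log, size_t cap);

typedef struct {
    hook_fn before[MAX_HOOKS];  size_t before_count;
    hook_fn instead[MAX_HOOKS]; size_t instead_count;
    hook_fn after[MAX_HOOKS];   size_t after_count;
} hook_container;

/* Append word to log without overflowing the buffer of size cap. */
static void log_append(char *log, size_t cap, const char *word) {
    size_t used = strlen(log);
    if (used < cap) strncat(log, word, cap - used - 1);
}

static void log_before(char *log, size_t cap)   { log_append(log, cap, "before "); }
static void log_instead(char *log, size_t cap)  { log_append(log, cap, "instead "); }
static void log_after(char *log, size_t cap)    { log_append(log, cap, "after "); }
static void log_original(char *log, size_t cap) { log_append(log, cap, "original "); }

/* Run every hook in one list, in registration order. */
static void run_hooks(const hook_fn *hooks, size_t count, char *log, size_t cap) {
    for (size_t i = 0; i < count; i++) hooks[i](log, cap);
}

/* Before hooks, then instead hooks OR the original, then after hooks. */
static void toy_forward_invocation(const hook_container *c, hook_fn original,
                                   char *log, size_t cap) {
    run_hooks(c->before, c->before_count, log, cap);
    if (c->instead_count > 0) {
        run_hooks(c->instead, c->instead_count, log, cap);
    } else {
        original(log, cap);
    }
    run_hooks(c->after, c->after_count, log, cap);
}
```

The key design point carried over from the real function is that the original implementation runs only when no instead hook is registered.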

aspect_invoke

aspect_invoke runs, in order, the hooks wrapped in the AspectIdentifier objects passed in.

// This is a macro so we get a cleaner stack trace.
#define aspect_invoke(aspects, info) \
for (AspectIdentifier *aspect in aspects) {\
    [aspect invokeWithInfo:info];\
    if (aspect.options & AspectOptionAutomaticRemoval) { \
        aspectsToRemove = [aspectsToRemove?:@[] arrayByAddingObject:aspect]; \
    } \
}

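invokeWithInfo: in turn makes full use of NSInvocation. For a refresher on NSInvocation, see the blog post on message-forwarding patterns in iOS.

NSInvocation can send a message to any Objective-C object, and using it follows a fixed sequence of steps:

  1. Create an NSMethodSignature for the selector.
  2. Create the NSInvocation from that NSMethodSignature; this must be done with invocationWithMethodSignature:.
  3. Set the target and the selector.
  4. Set the arguments. Argument indexes start at 2, because 0 and 1 are taken by the target and the selector. An out-of-range index raises an error.
  5. Call invoke on the NSInvocation.
  6. If there is a return value, read it with getReturnValue:.
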
- (BOOL)invokeWithInfo:(id<AspectInfo>)info {
    NSInvocation *blockInvocation = [NSInvocation invocationWithMethodSignature:self.blockSignature];
    NSInvocation *originalInvocation = info.originalInvocation;
    NSUInteger numberOfArguments = self.blockSignature.numberOfArguments;

    // Be extra paranoid. We already check that on hook registration.
    if (numberOfArguments > originalInvocation.methodSignature.numberOfArguments) {
        AspectLogError(@"Block has too many arguments. Not calling %@", info);
        return NO;
    }

    // The `self` of the block will be the AspectInfo. Optional.
    if (numberOfArguments > 1) {
        // For a block invocation, argument 0 is the block itself, so the AspectInfo goes into slot 1.
        [blockInvocation setArgument:&info atIndex:1];
    }
    
	  void *argBuf = NULL;
    // The remaining arguments (everything after target and selector) start at index 2.
    for (NSUInteger idx = 2; idx < numberOfArguments; idx++) {
        const char *type = [originalInvocation.methodSignature getArgumentTypeAtIndex:idx];
		    NSUInteger argSize;
		    NSGetSizeAndAlignment(type, &argSize, NULL);
        
        /// reallocf resizes argBuf to argSize bytes, and frees the old buffer if it fails.
		    if (!(argBuf = reallocf(argBuf, argSize))) {
            AspectLogError(@"Failed to allocate memory for block invocation.");
			      return NO;
		    }
        
		    [originalInvocation getArgument:argBuf atIndex:idx];
		    [blockInvocation setArgument:argBuf atIndex:idx];
    }
    
    [blockInvocation invokeWithTarget:self.block];
    
    if (argBuf != NULL) {
        free(argBuf);
    }
    return YES;
}

Hooking C functions with fishhook

FOUNDATION_EXPORT void NSLog(NSString *format, ...) NS_FORMAT_FUNCTION(1,2) NS_NO_TAIL_CALL;

NSLog, for instance, is a C function rather than an Objective-C method, so the runtime cannot hook it. This is where fishhook comes in.

Hooking NSLog

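// Declare a function pointer that will receive the address of the real NSLog; its signature must match the original function.
// The replacement still needs to call the original after hooking, otherwise the original behavior is lost.
static void (*orig_nslog)(NSString *format, ...);
void my_nslog(NSString *format, ...) {
    // Format the caller's variadic arguments first, so that "%d" and friends are not dropped.
    va_list args;
    va_start(args, format);
    NSString *message = [[NSString alloc] initWithFormat:format arguments:args];
    va_end(args);
    // After rebinding, orig_nslog points at the real NSLog implementation.
    orig_nslog(@"My NSLog: %@", message);
}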

struct rebinding rebinding_nslog = {"NSLog", my_nslog, (void *)&orig_nslog};
rebind_symbols((struct rebinding [1]){rebinding_nslog}, 1);

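The address of the original implementation has to be saved in orig_nslog, and the replacement my_nslog has to call it so that the system function keeps working as before.

A rebinding struct describes one hook, and rebind_symbols performs the symbol rebinding.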

NSLog(@"123");
struct rebinding rebinding_nslog = {"NSLog", my_nslog, (void *)&orig_nslog};
rebind_symbols((struct rebinding [1]){rebinding_nslog}, 1);
NSLog(@"123");
NSLog([NSString stringWithFormat:@"456 %d", 789]);

Output:

123
My NSLog: 123
My NSLog: 456 789

Hooking open/close

fishhook's sample code hooks the C functions open and close, so that custom code runs whenever the app opens a file (the app binary, plist files, .data files, and so on).

struct rebinding rebinding_close = {"close", my_close, (void *)&orig_close};
struct rebinding rebinding_open = {"open", my_open, (void *)&orig_open};

// rebinding is a struct that describes the symbol to rebind
rebind_symbols((struct rebinding[2]){rebinding_close, rebinding_open}, 2);

As before, the original implementations must be saved in orig_open and orig_close, and the replacement functions then call them.

static int (*orig_close)(int);
static int (*orig_open)(const char *, int, ...);
int my_close(int fd) {
    printf("Calling real close(%d)\n", fd);
    return orig_close(fd);
}
int my_open(const char *path, int oflag, ...) {
    va_list ap = {0};
    mode_t mode = 0;
    
    if ((oflag & O_CREAT) != 0) {
        // mode only applies to O_CREAT
        va_start(ap, oflag);
        mode = va_arg(ap, int);
        va_end(ap);
        printf("Calling real open('%s', %d, %d)\n", path, oflag, mode);
        return orig_open(path, oflag, mode);
    } else {
        printf("Calling real open('%s', %d)\n", path, oflag);
        return orig_open(path, oflag, mode);
    }
}

This is the example from the official fishhook repository, and it is fairly self-explanatory.
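
The variadic part of my_open is the piece that is easiest to get wrong, so here is a minimal sketch of just that pattern in portable POSIX C, without fishhook. The names real_open, counting_open and wrapped_open_calls are hypothetical; real_open is initialized directly to open, where fishhook would fill in orig_open during rebinding.

```c
#include <fcntl.h>
#include <stdarg.h>
#include <sys/types.h>
#include <unistd.h>

/* Number of calls that went through the wrapper. */
static int wrapped_open_calls = 0;

/* Stand-in for orig_open: here it simply points at the real open. */
static int (*real_open)(const char *, int, ...) = open;

/* Wrapper with the same signature as open(2). */
static int counting_open(const char *path, int oflag, ...) {
    mode_t mode = 0;
    if (oflag & O_CREAT) {
        /* The third argument only exists when O_CREAT is set. */
        va_list ap;
        va_start(ap, oflag);
        mode = (mode_t)va_arg(ap, int);  /* mode_t is read back as int */
        va_end(ap);
    }
    wrapped_open_calls++;
    return real_open(path, oflag, mode);
}
```

Reading the mode only when O_CREAT is set matters because calling va_arg for an argument the caller never passed is undefined behavior.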

Custom functions cannot be hooked

fishhook cannot hook functions defined in your own code.

static void (*myFuncImp)(void);
void myFunc() {
    NSLog(@"myFunc");
}
void hookMyFunc() {
    NSLog(@"hookMyFunc");
}

The call site, before and after the rebinding attempt:

myFunc();
struct rebinding rebinding_myFunc = {"myFunc", hookMyFunc, (void *)&myFuncImp};
rebind_symbols((struct rebinding [1]){rebinding_myFunc}, 1);
myFunc();

Output:

myFunc
myFunc

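The reason lies in how calls to system functions are made. When the app needs to call a system function, a pointer is set aside for it in the __DATA segment, and dyld binds that pointer to the function's implementation at runtime. For NSLog, a pointer slot exists in the app's image, and when dyld loads the Foundation framework it points that slot at NSLog's implementation. fishhook rewrites the slot to point at the replacement function, which is how it hooks a C function. A function defined in the same image is called directly, with no such pointer in between, so there is nothing for fishhook to rewrite.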
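
The difference between the two kinds of call can be modeled in a few lines of plain C. This is only a toy: symbol_slot, toy_slots and toy_rebind are hypothetical names, the slot table stands in for the symbol pointer sections, and the -1 return for an unknown name is a choice made for this sketch, not fishhook's behavior.

```c
#include <stddef.h>
#include <string.h>

typedef const char *(*greet_fn)(void);

static const char *system_greet(void) { return "system"; }
static const char *local_greet(void)  { return "local"; }
static const char *hook_greet(void)   { return "hook"; }

/* Toy "symbol pointer" slot: calls to imported functions go through it. */
typedef struct {
    const char *name;
    greet_fn target;
} symbol_slot;

/* Only the imported function has a slot; local_greet has none. */
static symbol_slot toy_slots[] = { {"system_greet", system_greet} };

/* Imported call: indirect, through the slot. */
static const char *call_imported(void) { return toy_slots[0].target(); }

/* Local call: direct, no slot involved. */
static const char *call_local(void) { return local_greet(); }

/* Rewrite the slot for name; hand back the old target through replaced. */
static int toy_rebind(const char *name, greet_fn replacement, greet_fn *replaced) {
    for (size_t i = 0; i < sizeof toy_slots / sizeof toy_slots[0]; i++) {
        if (strcmp(toy_slots[i].name, name) == 0) {
            if (replaced) *replaced = toy_slots[i].target;
            toy_slots[i].target = replacement;
            return 0;
        }
    }
    return -1;  /* no slot with that name: nothing to rewrite */
}
```

Rebinding "system_greet" changes what call_imported reaches, while an attempt to rebind "local_greet" finds no slot and call_local keeps running the original code, which mirrors the myFunc result above.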

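How fishhook works

fishhook's source relies on fairly deep knowledge of Mach-O. If you are not familiar with it, read an introductory post on the Mach-O file format first. fishhook hooks C functions by rebinding symbols.

For C functions that live in dynamic libraries, the address of the implementation is stored in two sections: __DATA.__la_symbol_ptr (lazy symbol pointers) and __DATA.__nl_symbol_ptr (non-lazy symbol pointers). Calls to such functions find the implementation through these two sections, and fishhook hooks C functions by changing the addresses stored there.

The official fishhook explanation, How it works: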

dyld binds lazy and non-lazy symbols by updating pointers in particular sections of the __DATA segment of a Mach-O binary. fishhook re-binds these symbols by determining the locations to update for each of the symbol names passed to rebind_symbols and then writing out the corresponding replacements.

For a given image, the __DATA segment may contain two sections that are relevant for dynamic symbol bindings: __nl_symbol_ptr and __la_symbol_ptr. __nl_symbol_ptr is an array of pointers to non-lazily bound data (these are bound at the time a library is loaded) and __la_symbol_ptr is an array of pointers to imported functions that is generally filled by a routine called dyld_stub_binder during the first call to that symbol (it's also possible to tell dyld to bind these at launch). In order to find the name of the symbol that corresponds to a particular location in one of these sections, we have to jump through several layers of indirection. For the two relevant sections, the section headers (struct sections from <mach-o/loader.h>) provide an offset (in the reserved1 field) into what is known as the indirect symbol table. The indirect symbol table, which is located in the __LINKEDIT segment of the binary, is just an array of indexes into the symbol table (also in __LINKEDIT) whose order is identical to that of the pointers in the non-lazy and lazy symbol sections. So, given struct section nl_symbol_ptr, the corresponding index in the symbol table of the first address in that section is indirect_symbol_table[nl_symbol_ptr->reserved1]. The symbol table itself is an array of struct nlists (see <mach-o/nlist.h>), and each nlist contains an index into the string table in __LINKEDIT which where the actual symbol names are stored. So, for each pointer __nl_symbol_ptr and __la_symbol_ptr, we are able to find the corresponding symbol and then the corresponding string to compare against the requested symbol names, and if there is a match, we replace the pointer in the section with the replacement.

The process of looking up the name of a given entry in the lazy or non-lazy pointer tables looks like this:

In short: dyld binds lazy and non-lazy symbols by updating pointers in particular sections of the __DATA segment of a Mach-O binary. For every symbol name passed to rebind_symbols, fishhook works out which pointer locations to update and writes the replacement address there, which rebinds the symbol.

Put step by step: for a given image, the __DATA segment may contain two sections related to dynamic symbol binding, __nl_symbol_ptr (bound when the library is loaded) and __la_symbol_ptr (generally bound by dyld_stub_binder on the first call, or at launch if dyld is told to). Finding the symbol name for a slot in one of these sections takes several hops. The section header (struct section in <mach-o/loader.h>) gives, in its reserved1 field, the section's starting offset into the indirect symbol table. The indirect symbol table, stored in the __LINKEDIT segment, is an array of indexes into the symbol table (also in __LINKEDIT), in the same order as the pointers in the section. So the symbol-table index for the first pointer in nl_symbol_ptr is indirect_symbol_table[nl_symbol_ptr->reserved1]. The symbol table is an array of struct nlist (see <mach-o/nlist.h>), and each nlist holds an index into the string table in __LINKEDIT, where the actual symbol names are stored. For each pointer in __nl_symbol_ptr and __la_symbol_ptr, fishhook can therefore find the symbol, then its name, compare the name with the requested ones, and replace the pointer on a match.

See also the article "巧用符號表 - 探求 fishhook 原理(一)" (Using the symbol table: exploring how fishhook works, part 1).

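__nl_symbol_ptr and __la_symbol_ptr are both arrays of indirect pointers. Their entries decide which code actually runs when a function is called. For each indirect pointer, fishhook recovers the symbol name, and when it matches a name passed in the rebindings, it overwrites the address the pointer holds, which completes the rebind.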
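
The chain of lookups (pointer slot, indirect symbol table, symbol table, string table) is easier to follow with small arrays in hand. The sketch below models it in plain C. It does not use the real Mach-O headers: toy_nlist, toy_section and the toy_ tables are hypothetical, simplified stand-ins, and skipping the leading underscore before comparing reflects my understanding of how fishhook matches C symbol names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified symbol-table entry: only the string-table offset is kept. */
typedef struct { uint32_t strx; } toy_nlist;

/* Simplified section header plus the pointer array it describes. */
typedef struct {
    uint32_t reserved1;  /* start index into the indirect symbol table */
    size_t count;        /* number of pointers in the section */
    void **pointers;     /* the symbol pointer slots themselves */
} toy_section;

/* String table: "_close" at offset 1, "_open" at 8, "_printf" at 14. */
static const char toy_strtab[] = "\0_close\0_open\0_printf";
/* Symbol table: index 0 = _close, 1 = _open, 2 = _printf. */
static const toy_nlist toy_symtab[] = { {1}, {8}, {14} };
/* Indirect symbol table: one symbol-table index per pointer slot. */
static const uint32_t toy_indirect[] = { 2, 0, 1, 0 };

/* Dummy targets so the slots have distinct addresses to hold. */
static int impl_close, impl_open, impl_hook;
static void *toy_lazy_pointers[] = { &impl_close, &impl_open };
static toy_section toy_lazy = { 1, 2, toy_lazy_pointers };

/* Slot i -> indirect table -> symbol table -> string table. */
static const char *toy_symbol_name(const toy_section *s, size_t i) {
    uint32_t symtab_index = toy_indirect[s->reserved1 + i];
    uint32_t strtab_offset = toy_symtab[symtab_index].strx;
    return toy_strtab + strtab_offset;
}

/* Overwrite the slot whose symbol matches name; returns 1 on a match. */
static int toy_rebind_section(toy_section *s, const char *name,
                              void *replacement, void **replaced) {
    for (size_t i = 0; i < s->count; i++) {
        const char *sym = toy_symbol_name(s, i);
        /* C symbols carry a leading underscore, so compare from sym + 1. */
        if (sym[0] == '_' && strcmp(sym + 1, name) == 0) {
            if (replaced) *replaced = s->pointers[i];
            s->pointers[i] = replacement;
            return 1;
        }
    }
    return 0;
}
```

In this toy, slot 1 of the section resolves to "_open", so rebinding "open" replaces only that slot and returns the previous pointer through replaced.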

[Figure: looking up the name of a given entry in the lazy or non-lazy pointer table]

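The two sections __DATA.__la_symbol_ptr and __DATA.__nl_symbol_ptr are central here. __la_symbol_ptr holds lazily bound symbol pointers: the address is not resolved at load time but on the first call to the function, through a stub that plays the role of an ELF PLT (Procedure Linkage Table) entry. Pointers in __nl_symbol_ptr are not bound lazily.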

The public interface of fishhook

The rebinding struct bundles what is needed to hook one function.

/*
 * A structure representing a particular intended rebinding from a symbol
 * name to its replacement
 */
struct rebinding {
  const char *name;
  void *replacement;
  void **replaced;
};

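replacement points to the replacement implementation, and replaced receives the address of the original implementation.

rebind_symbols takes an array of rebinding structs describing the hooks to install, together with the number of elements.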

/*
 * For each rebinding in rebindings, rebinds references to external, indirect
 * symbols with the specified name to instead point at replacement for each
 * image in the calling process as well as for all future images that are loaded
 * by the process. If rebind_functions is called more than once, the symbols to
 * rebind are added to the existing list of rebindings, and if a given symbol
 * is rebound more than once, the later rebinding will take precedence.
 */
int rebind_symbols(struct rebinding rebindings[], size_t rebindings_nel);

Its implementation:

int rebind_symbols(struct rebinding rebindings[], size_t rebindings_nel) {
  int retval = prepend_rebindings(&_rebindings_head, rebindings, rebindings_nel);
  if (retval < 0) {
    return retval;
  }
  // If this was the first call, register callback for image additions (which is also invoked for
  // existing images, otherwise, just run on existing images
  if (!_rebindings_head->next) {
    _dyld_register_func_for_add_image(_rebind_symbols_for_image);
  } else {
    uint32_t c = _dyld_image_count();
    for (uint32_t i = 0; i < c; i++) {
      _rebind_symbols_for_image(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));
    }
  }
  return retval;
}

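The implementation of rebind_symbols has two parts. First, prepend_rebindings builds a linked list from the rebindings array that was passed in, with _rebindings_head as its head.

It then calls either _dyld_register_func_for_add_image or _rebind_symbols_for_image, depending on whether this is the first call.

The linked list

The list head is _rebindings_head. Each node stores a pointer to an array of rebinding structs, the element count rebindings_nel, and a next pointer to the following node.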

struct rebindings_entry {
  struct rebinding *rebindings;
  size_t rebindings_nel;
  struct rebindings_entry *next;
};

static struct rebindings_entry *_rebindings_head;

Consider this call:

int retval = prepend_rebindings(&_rebindings_head, rebindings, rebindings_nel);

The source of prepend_rebindings is:

static int prepend_rebindings(struct rebindings_entry **rebindings_head,
                              struct rebinding rebindings[],
                              size_t nel) {
  struct rebindings_entry *new_entry = malloc(sizeof(struct rebindings_entry));
  if (!new_entry) {
    return -1;
  }
  /// Allocate room for nel rebinding structs
  new_entry->rebindings = malloc(sizeof(struct rebinding) * nel);
  if (!new_entry->rebindings) {
    free(new_entry);
    return -1;
  }
  /// Copy the caller's rebindings array into new_entry->rebindings; the third argument is the number of bytes to copy
  memcpy(new_entry->rebindings, rebindings, sizeof(struct rebinding) * nel);
  new_entry->rebindings_nel = nel;

  /// A familiar linked-list operation: insert a node at the head,
  /// so the most recent rebindings end up at the front of the list
  new_entry->next = *rebindings_head;
  *rebindings_head = new_entry;
  
  return 0;
}
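
Prepending is what makes "the later rebinding will take precedence" work: a search from the head reaches the most recent registration first. The sketch below shows this in plain C with a reduced struct. The toy_ names are hypothetical, replacement_id stands in for the replacement function pointer, and toy_lookup is a simplified model of walking the list, not fishhook's actual matching loop.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Reduced rebinding: an id stands in for the replacement pointer. */
struct toy_rebinding { const char *name; int replacement_id; };

struct toy_entry {
    struct toy_rebinding *rebindings;
    size_t nel;
    struct toy_entry *next;
};

/* Copy the array into a new node and insert that node at the head. */
static int toy_prepend(struct toy_entry **head,
                       const struct toy_rebinding rebindings[], size_t nel) {
    struct toy_entry *entry = malloc(sizeof *entry);
    if (!entry) return -1;
    entry->rebindings = malloc(sizeof(struct toy_rebinding) * nel);
    if (!entry->rebindings) {
        free(entry);
        return -1;
    }
    memcpy(entry->rebindings, rebindings, sizeof(struct toy_rebinding) * nel);
    entry->nel = nel;
    entry->next = *head;
    *head = entry;
    return 0;
}

/* Walk from the head; the first match is the most recent registration. */
static int toy_lookup(const struct toy_entry *head, const char *name) {
    for (const struct toy_entry *cur = head; cur; cur = cur->next) {
        for (size_t j = 0; j < cur->nel; j++) {
            if (strcmp(cur->rebindings[j].name, name) == 0) {
                return cur->rebindings[j].replacement_id;
            }
        }
    }
    return -1;  /* name was never registered */
}

/* Release every node and its copied array. */
static void toy_free_list(struct toy_entry *head) {
    while (head) {
        struct toy_entry *next = head->next;
        free(head->rebindings);
        free(head);
        head = next;
    }
}
```

The same structure also explains the first-call check in rebind_symbols: after the first prepend the head's next pointer is still NULL, and it becomes non-NULL from the second call on.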

_dyld_register_func_for_add_image

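If _rebindings_head->next is NULL, this is the first call. _dyld_register_func_for_add_image(_rebind_symbols_for_image); then registers _rebind_symbols_for_image as the callback dyld invokes after loading an image, so the function runs every time dyld loads one. During registration the callback is also invoked for every image that is already loaded.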

/*
 * The following functions allow you to install callbacks which will be called
 * by dyld whenever an image is loaded or unloaded. During a call to _dyld_register_func_for_add_image()
 * the callback func is called for every existing image. Later, it is called as each new image
 * is loaded and bound (but initializers not yet run). The callback registered with
 * _dyld_register_func_for_remove_image() is called after any terminators in an image are run
 * and before the image is un-memory-mapped.
 */
extern void _dyld_register_func_for_add_image(void (*func)(const struct mach_header* mh, intptr_t vmaddr_slide))    __OSX_AVAILABLE_STARTING(__MAC_10_1, __IPHONE_2_0);
extern void _dyld_register_func_for_remove_image(void (*func)(const struct mach_header* mh, intptr_t vmaddr_slide)) __OSX_AVAILABLE_STARTING(__MAC_10_1, __IPHONE_2_0);

_rebind_symbols_for_image(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));

This function is where fishhook rebinds symbols: it iterates over the images that dyld has loaded and performs the rebind on each one in turn.

_rebind_symbols_for_image(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));

static void _rebind_symbols_for_image(const struct mach_header *header,
                                      intptr_t slide) {
    rebind_symbols_for_image(_rebindings_head, header, slide);
}

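Both _dyld_get_image_header and _dyld_get_image_vmaddr_slide take image_index, the index of the image, and return the image's header and its virtual memory address slide respectively. The slide exists because of ASLR (address space layout randomization): when the kernel loads a Mach-O into virtual memory, it shifts the load address by a random offset for security reasons.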

/*
 * The following functions allow you to iterate through all loaded images.
 * This is not a thread safe operation. Another thread can add or remove
 * an image during the iteration.
 *
 * Many uses of these routines can be replace by a call to dladdr() which
 * will return the mach_header and name of an image, given an address in
 * the image. dladdr() is thread safe.
 */
extern uint32_t                    _dyld_image_count(void)                              __OSX_AVAILABLE_STARTING(__MAC_10_1, __IPHONE_2_0);
extern const struct mach_header* _dyld_get_image_header(uint32_t image_index) __OSX_AVAILABLE_STARTING(__MAC_10_1, __IPHONE_2_0);
extern intptr_t                    _dyld_get_image_vmaddr_slide(uint32_t image_index)   __OSX_AVAILABLE_STARTING(__MAC_10_1, __IPHONE_2_0);
extern const char*                 _dyld_get_image_name(uint32_t image_index)           __OSX_AVAILABLE_STARTING(__MAC_10_1, __IPHONE_2_0);

Data structures used by fishhook

dl_info

Given an address (here the address of the mach_header), dladdr fills a Dl_info struct with information about the image that contains it.

/*
 * Structure filled in by dladdr().
 */
typedef struct dl_info {
        const char      *dli_fname;     /* Pathname of shared object */
        void            *dli_fbase;     /* Base address of shared object */
        const char      *dli_sname;     /* Name of nearest symbol */
        void            *dli_saddr;     /* Address of nearest symbol */
} Dl_info;

extern int dladdr(const void *, Dl_info *);

segment_command_t

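The segment_command_64 struct corresponds to an LC_SEGMENT_64 load command in a Mach-O file. Each such command describes one segment of the file that is to be mapped into the task's address space.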

typedef struct segment_command_64 segment_command_t;

/*
 * The 64-bit segment load command indicates that a part of this file is to be
 * mapped into a 64-bit task's address space. If the 64-bit segment has
 * sections then section_64 structures directly follow the 64-bit segment
 * command and their size is reflected in cmdsize.
 */
struct segment_command_64 { /* for 64-bit architectures */
	uint32_t	cmd;		/* LC_SEGMENT_64 */
	uint32_t	cmdsize;	/* includes sizeof section_64 structs */
	char		segname[16];	/* segment name */
	uint64_t	vmaddr;		/* memory address of this segment */
	uint64_t	vmsize;		/* memory size of this segment */
	uint64_t	fileoff;	/* file offset of this segment */
	uint64_t	filesize;	/* amount to map from the file */
	vm_prot_t	maxprot;	/* maximum VM protection */
	vm_prot_t	initprot;	/* initial VM protection */
	uint32_t	nsects;		/* number of sections in segment */
	uint32_t	flags;		/* flags */
};

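This matches what MachOView shows for a Mach-O file. For example, the LC_SEGMENT_64(__TEXT) command loads the code segment: its type is LC_SEGMENT_64, its cmdsize is 472, and its segment name is __TEXT.

A section is a collection of identical or similar information; .text, .data and .bss, for instance, are all different sections. A segment is made up of several sections that share the same attributes. What we usually call the code segment and the data segment are in fact segments.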

typedef struct section_64 section_t;

struct section_64 { /* for 64-bit architectures */
	char		sectname[16];	/* name of this section */
	char		segname[16];	/* segment this section goes in */
	uint64_t	addr;		/* memory address of this section */
	uint64_t	size;		/* size in bytes of this section */
	uint32_t	offset;		/* file offset of this section */
	uint32_t	align;		/* section alignment (power of 2) */
	uint32_t	reloff;		/* file offset of relocation entries */
	uint32_t	nreloc;		/* number of relocation entries */
	uint32_t	flags;		/* flags (section type and attributes)*/
	uint32_t	reserved1;	/* reserved (for offset or index) */
	uint32_t	reserved2;	/* reserved (for count or sizeof) */
	uint32_t	reserved3;	/* reserved */
};
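
Because section headers directly follow their segment command, as the header comment on segment_command_64 states, locating the j-th section header is a matter of skipping the segment command and then stepping by whole section headers. The sketch below uses deliberately tiny, hypothetical layouts (demo_segment, demo_section) rather than the real structs from <mach-o/loader.h>; only the pointer arithmetic is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature layouts; the real structs live in <mach-o/loader.h>. */
struct demo_segment {
  uint32_t cmd;
  uint32_t cmdsize;
  uint32_t nsects;
};

struct demo_section {
  char sectname[16];
  uint32_t flags;
  uint32_t reserved1;
};

/* Return the j-th section header stored right after the segment command. */
static const struct demo_section *demo_section_at(const void *segment_cmd, uint32_t j) {
  const struct demo_segment *seg = segment_cmd;
  assert(j < seg->nsects);
  /* Skip the segment command itself (byte arithmetic on an integer address) ... */
  const struct demo_section *first =
      (const struct demo_section *)((uintptr_t)segment_cmd + sizeof(struct demo_segment));
  /* ... then advance by whole section headers (typed pointer arithmetic). */
  return first + j;
}
```

The same two-step pattern appears in fishhook as (section_t *)(cur + sizeof(segment_command_t)) + j: the sizeof skips the command, and + j is scaled by the size of a section header.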

symtab_command

Corresponds to the LC_SYMTAB command, which describes where the symbol table and its string table are located.

/*
 * The symtab_command contains the offsets and sizes of the link-edit 4.3BSD
 * "stab" style symbol table information as described in the header files
 * <nlist.h> and <stab.h>.
 */
struct symtab_command {
	uint32_t	cmd;		/* LC_SYMTAB */
	uint32_t	cmdsize;	/* sizeof(struct symtab_command) */
	uint32_t	symoff;		/* symbol table offset */
	uint32_t	nsyms;		/* number of symbol table entries */
	uint32_t	stroff;		/* string table offset */
	uint32_t	strsize;	/* string table size in bytes */
};

dysymtab_command

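The source of this struct is long, so it is not reproduced here. Its key fields are again offsets and entry counts, this time for the dynamic symbol table.

It corresponds to the LC_DYSYMTAB command, which carries the symbol table information the dynamic linker needs.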

/*
 * This is the second set of the symbolic information which is used to support
 * the data structures for the dynamically link editor.
 */

nlist_t

/*
 * This is the symbol table entry structure for 64-bit architectures.
 */
struct nlist_64 {
    union {
        uint32_t  n_strx; /* index into the string table */
    } n_un;
    uint8_t n_type;        /* type flag, see below */
    uint8_t n_sect;        /* section number or NO_SECT */
    uint16_t n_desc;       /* see <mach-o/stab.h> */
    uint64_t n_value;      /* value of this symbol (or stab offset) */
};

String Table

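strtab is the string table that stores names (function names, variable names and other symbol names), each terminated by \0. The address of a symbol's name = the base address of strtab + the offset recorded for that symbol in the symbol table (n_un.n_strx).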
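
The formula above can be sketched in a few lines of portable C. demo_nlist is a hypothetical, trimmed-down stand-in for nlist_64 that keeps only the string-table offset, and the string table in the usage note is hand-written for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical trimmed-down symbol entry; only the string-table offset matters here. */
struct demo_nlist {
  uint32_t n_strx;
};

/* Name address = string table base + offset stored in the symbol entry. */
static const char *demo_symbol_name(const char *strtab, const struct demo_nlist *symtab,
                                    uint32_t symtab_index) {
  assert(strtab != NULL && symtab != NULL);
  return strtab + symtab[symtab_index].n_strx;
}
```

With a string table laid out as "\0_open\0_close\0_printf", the names start at offsets 1, 7 and 14, so a symbol entry whose n_strx is 7 resolves to "_close".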

rebind_symbols_for_image(_rebindings_head, header, slide);

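Next comes the core of the rebind. The two for loops below are the genuinely hard part; you need a solid understanding of the Mach-O file format to follow them.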

static void rebind_symbols_for_image(struct rebindings_entry *rebindings, const struct mach_header *header, intptr_t slide) {
  Dl_info info;
  if (dladdr(header, &info) == 0) {
    return;
  }

  /// Declare a few variables up front; they are used again in the second loop.
  segment_command_t *cur_seg_cmd;
  segment_command_t *linkedit_segment = NULL;
  struct symtab_command* symtab_cmd = NULL;
  struct dysymtab_command* dysymtab_cmd = NULL;
  
  /// First skip past the Mach-O header
  uintptr_t cur = (uintptr_t)header + sizeof(mach_header_t);
  /// Then walk every load command; cmdsize is the size of the load command in bytes.
  for (uint i = 0; i < header->ncmds; i++, cur += cur_seg_cmd->cmdsize) {
    /// Fetch the current load command
    cur_seg_cmd = (segment_command_t *)cur;
    if (cur_seg_cmd->cmd == LC_SEGMENT_ARCH_DEPENDENT) {
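      /// LC_SEGMENT_ARCH_DEPENDENT is a fishhook macro: LC_SEGMENT_64 on 64-bit builds, LC_SEGMENT otherwise.
      if (strcmp(cur_seg_cmd->segname, SEG_LINKEDIT) == 0) {
        /// __LINKEDIT holds metadata for functions and variables (locations, offsets), code signature data, and so on:
        /// the raw data used by the dynamic linker.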
        linkedit_segment = cur_seg_cmd;
      }
    } else if (cur_seg_cmd->cmd == LC_SYMTAB) {
      /// Symbol table
      symtab_cmd = (struct symtab_command*)cur_seg_cmd;
    } else if (cur_seg_cmd->cmd == LC_DYSYMTAB) {
      /// Dynamic symbol table
      dysymtab_cmd = (struct dysymtab_command*)cur_seg_cmd;
    }
  }

  if (!symtab_cmd || !dysymtab_cmd || !linkedit_segment ||
      !dysymtab_cmd->nindirectsyms) {
    return;
  }

  // Find base symbol/string table addresses
  /// Compute the base address of the Mach-O in virtual memory.
  /// linkedit_segment->vmaddr is the virtual address of the __LINKEDIT segment.
  /// linkedit_segment->fileoff is the offset of the __LINKEDIT segment within the Mach-O file.
  /// So linkedit_segment->vmaddr - linkedit_segment->fileoff is the link-time base address,
  /// and slide is the address shift of this image (Mach-O file) in virtual memory, introduced by ASLR.
  uintptr_t linkedit_base = (uintptr_t)slide + linkedit_segment->vmaddr - linkedit_segment->fileoff;
  /// The offsets recorded in LC_SYMTAB and LC_DYSYMTAB are file offsets, so adding them to linkedit_base yields in-memory addresses.
  /// linkedit_base + symtab_cmd->symoff is the address of the symbol table.
  /// linkedit_base + symtab_cmd->stroff is the address of the symbol table's string table.
  /// See the fields of symtab_command for the details.
  /// Interpret this memory as an array of nlist_t; nlist_t is a symbol table entry.
  nlist_t *symtab = (nlist_t *)(linkedit_base + symtab_cmd->symoff);
  /// Get the string table
  char *strtab = (char *)(linkedit_base + symtab_cmd->stroff);

  // Get indirect symbol table (array of uint32_t indices into symbol table)
  /// linkedit_base + dysymtab_cmd->indirectsymoff is the address of the indirect symbol table
  uint32_t *indirect_symtab = (uint32_t *)(linkedit_base + dysymtab_cmd->indirectsymoff);

  /// Walk every load command a second time
  cur = (uintptr_t)header + sizeof(mach_header_t);
  for (uint i = 0; i < header->ncmds; i++, cur += cur_seg_cmd->cmdsize) {
    cur_seg_cmd = (segment_command_t *)cur;
    if (cur_seg_cmd->cmd == LC_SEGMENT_ARCH_DEPENDENT) {
      if (strcmp(cur_seg_cmd->segname, SEG_DATA) != 0 &&
          strcmp(cur_seg_cmd->segname, SEG_DATA_CONST) != 0) {
        continue;
      }
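      /// Found a __DATA (or __DATA_CONST) segment.
      /// A segment load command is followed by several sections.
      /// Only the lazy and non-lazy symbol pointer sections matter, because they store the addresses of function implementations.
      for (uint j = 0; j < cur_seg_cmd->nsects; j++) {
        /// The section headers directly follow the segment command, so sizeof(segment_command_t) skips to the first one;
        /// after the cast to section_t *, "+ j" advances by j * sizeof(section_t).
        section_t *sect =
          (section_t *)(cur + sizeof(segment_command_t)) + j;
        /// Lazy symbol pointer section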
        if ((sect->flags & SECTION_TYPE) == S_LAZY_SYMBOL_POINTERS) {
          perform_rebinding_with_section(rebindings, sect, slide, symtab, strtab, indirect_symtab);
        }
        /// Non-lazy symbol pointer section
        if ((sect->flags & SECTION_TYPE) == S_NON_LAZY_SYMBOL_POINTERS) {
          perform_rebinding_with_section(rebindings, sect, slide, symtab, strtab, indirect_symtab);
        }
      }
    }
  }
}

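The comments on this computation sit next to the corresponding code. The linkedit_base computed there is the base address of the Mach-O in virtual memory, that is, the address that file offset 0 of the image maps to. It is used to compute the addresses of the symbol table, the string table, and the indirect symbol table referenced by the dynamic symbol table.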

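The indirect symbol table (uint32_t indirectsymoff;) is harder to understand. Sections that contain symbol pointers and routine stubs have an index into the indirect symbol table for each pointer and stub, plus an implied count derived from the section size and the fixed entry size. For every section of these two types, the starting index into the indirect symbol table is stored in the reserved1 field of the section header (note this field; it is used later). An indirect symbol table entry is simply a 32-bit index into the symbol table, identifying the symbol that the pointer or stub refers to. The indirect symbol table is ordered to match the entries in the section.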

/*
 * The sections that contain "symbol pointers" and "routine stubs" have
 * indexes and (implied counts based on the size of the section and fixed
 * size of the entry) into the "indirect symbol" table for each pointer
 * and stub. For every section of these two types the index into the
 * indirect symbol table is stored in the section header in the field
 * reserved1. An indirect symbol table entry is simply a 32bit index into
 * the symbol table to the symbol that the pointer or stub is referring to.
 * The indirect symbol table is ordered to match the entries in the section.
 */
uint32_t indirectsymoff; /* file offset to the indirect symbol table */
uint32_t nindirectsyms;  /* number of indirect symbol table entries */
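
The chain of lookups (section slot, then indirect symbol table, then symbol table, then string table) can be sketched with hand-built tables. demo_sym and demo_sect are hypothetical reductions of nlist_64 and section_64, and nslots stands in for section->size / sizeof(void *).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical trimmed-down structures for illustration only. */
struct demo_sym {
  uint32_t n_strx;    /* offset of the name in the string table */
};

struct demo_sect {
  uint32_t reserved1; /* first index of this section in the indirect symbol table */
  uint32_t nslots;    /* number of pointer slots in the section */
};

/* Name of the symbol bound to slot i of a symbol-pointer section. */
static const char *demo_slot_name(const struct demo_sect *sect, uint32_t i,
                                  const uint32_t *indirect_symtab,
                                  const struct demo_sym *symtab, const char *strtab) {
  assert(i < sect->nslots);
  /* reserved1 is where this section's run of entries starts in the indirect table. */
  const uint32_t *indices = indirect_symtab + sect->reserved1;
  uint32_t symtab_index = indices[i];          /* index into the symbol table */
  return strtab + symtab[symtab_index].n_strx; /* offset into the string table */
}
```

Two sections can share one indirect symbol table this way: a section with reserved1 = 0 and one slot owns entry 0, and a following section with reserved1 = 1 and three slots owns entries 1 to 3.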

INDIRECT_SYMBOL_LOCAL and INDIRECT_SYMBOL_ABS are two special indirect symbol table entries.

/*
 * An indirect symbol table entry is simply a 32bit index into the symbol table
 * to the symbol that the pointer or stub is refering to. Unless it is for a
 * non-lazy symbol pointer section for a defined symbol which strip(1) as
 * removed. In which case it has the value INDIRECT_SYMBOL_LOCAL. If the
 * symbol was also absolute INDIRECT_SYMBOL_ABS is or'ed with that.
 */
#define INDIRECT_SYMBOL_LOCAL 0x80000000
#define INDIRECT_SYMBOL_ABS 0x40000000
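
Since these values are not symbol table indices, they have to be filtered out before an entry is used to index the symbol table. A minimal sketch of that filter follows; the DEMO_ constants copy the two values above, and demo_is_special_entry and demo_count_bindable are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_INDIRECT_SYMBOL_LOCAL 0x80000000u
#define DEMO_INDIRECT_SYMBOL_ABS   0x40000000u

/* True when the entry is one of the special values rather than a symbol table index. */
static bool demo_is_special_entry(uint32_t entry) {
  return entry == DEMO_INDIRECT_SYMBOL_ABS ||
         entry == DEMO_INDIRECT_SYMBOL_LOCAL ||
         entry == (DEMO_INDIRECT_SYMBOL_LOCAL | DEMO_INDIRECT_SYMBOL_ABS);
}

/* Count the entries that can safely be used as symbol table indices. */
static size_t demo_count_bindable(const uint32_t *entries, size_t n) {
  assert(entries != NULL || n == 0);
  size_t count = 0;
  for (size_t i = 0; i < n; i++) {
    if (!demo_is_special_entry(entries[i])) count++;
  }
  return count;
}
```

Using one of the special values as an index would read far outside the symbol table, which is why the rebinding loop skips them with continue.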

perform_rebinding_with_section(rebindings, sect, slide, symtab, strtab, indirect_symtab);

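In the previous step, perform_rebinding_with_section(rebindings, sect, slide, symtab, strtab, indirect_symtab); is called for both the lazy and the non-lazy symbol pointer sections to carry out the rebinding. The sect passed in is one of the two sections __nl_symbol_ptr and __la_symbol_ptr.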

static void perform_rebinding_with_section(struct rebindings_entry *rebindings, section_t *section, intptr_t slide, nlist_t *symtab, char *strtab, uint32_t *indirect_symtab) {
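  /// indirect_symtab + reserved1 is where this section's entries begin in the indirect symbol table: an array of uint32_t.
  /// For symbol pointer sections, reserved1 holds that starting index (the header describes it as "reserved (for offset or index)").
  uint32_t *indirect_symbol_indices = indirect_symtab + section->reserved1;
  /// section->addr + slide is the actual address of the section (e.g. __la_symbol_ptr) in the mapped virtual memory.
  /// slide is again the ASLR shift of this image (Mach-O file).
  /// Viewed as void **, the section is an array of pointers, one slot per imported symbol.
  void **indirect_symbol_bindings = (void **)((uintptr_t)slide + section->addr);
  /// Iterate one pointer at a time (4 or 8 bytes, depending on the CPU word size)
  for (uint i = 0; i < section->size / sizeof(void *); i++) {
    /// Fetch this slot's entry from the indirect symbol table, i.e. its index into the symbol table
    uint32_t symtab_index = indirect_symbol_indices[i];
    /// Skip these special entries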
    if (symtab_index == INDIRECT_SYMBOL_ABS || symtab_index == INDIRECT_SYMBOL_LOCAL ||
        symtab_index == (INDIRECT_SYMBOL_LOCAL   | INDIRECT_SYMBOL_ABS)) {
      continue;
    }
    
    /// Look up the symbol table entry; n_un.n_strx is its index into the string table, which is effectively a byte offset
    uint32_t strtab_offset = symtab[symtab_index].n_un.n_strx;
    /// Adding that offset to the string table base gives the symbol name
    char *symbol_name = strtab + strtab_offset;
    struct rebindings_entry *cur = rebindings;
    while (cur) {
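      /// Check every element of this rebindings array
      for (uint j = 0; j < cur->rebindings_nel; j++) {
        /// Does the symbol name match the requested name? &symbol_name[1] skips the leading underscore that C symbols carry in Mach-O.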
        if (strlen(symbol_name) > 1 &&
            strcmp(&symbol_name[1], cur->rebindings[j].name) == 0) {
          if (cur->rebindings[j].replaced != NULL &&
              indirect_symbol_bindings[i] != cur->rebindings[j].replacement) {
            /// Save the original implementation first
            *(cur->rebindings[j].replaced) = indirect_symbol_bindings[i];
          }
          /// Install the replacement
          indirect_symbol_bindings[i] = cur->rebindings[j].replacement;
          goto symbol_loop;
        }
      }
      cur = cur->next;
    }
  symbol_loop:;
  }
}

A few questions about fishhook remain open for me:

  1. Why does the comparison use &symbol_name[1]?
  2. My understanding of indirect_symbol_bindings is still incomplete.
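
On the first question, a likely answer is that C symbols are stored in the Mach-O symbol table with a leading underscore (open appears as _open), so the comparison starts at the second character. On the second, each element of indirect_symbol_bindings can be thought of as one pointer slot that calls to an imported function go through, so overwriting the slot redirects those calls. The sketch below models both points with plain data pointers; demo_hook, demo_names_match and demo_rebind_slot are hypothetical names, and nothing here touches a real Mach-O image.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record mirroring the name / replacement / replaced fields. */
struct demo_hook {
  const char *name;  /* C-level name, without the leading underscore */
  void *replacement; /* pointer to install */
  void **replaced;   /* where to store the original pointer, may be NULL */
};

/* Compare a Mach-O symbol name such as "_open" with a C-level name such as "open". */
static int demo_names_match(const char *symbol_name, const char *c_name) {
  return strlen(symbol_name) > 1 && strcmp(&symbol_name[1], c_name) == 0;
}

/* Rebind one pointer slot if its symbol matches; returns 1 when the slot was rewritten. */
static int demo_rebind_slot(void **slot, const char *symbol_name,
                            const struct demo_hook *hook) {
  assert(slot != NULL && hook != NULL);
  if (!demo_names_match(symbol_name, hook->name)) return 0;
  /* Save the original only if asked, and never overwrite it with the replacement itself. */
  if (hook->replaced != NULL && *slot != hook->replacement) {
    *hook->replaced = *slot;
  }
  *slot = hook->replacement;
  return 1;
}
```

The *slot != hook->replacement check matters when the same slot is processed twice: without it, the second pass would store the replacement as the "original" and the real implementation would be lost.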

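Hooking objc_msgSend

Hooking objc_msgSend is another, deeper topic that involves a fair amount of assembly, and I still have too many open questions about it. For now, here are some reference materials: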

  1. objc_msgSend_hook
  2. MTHawkeye
  3. amd64-and-va_arg
  4. IHI0055B_aapcs64
  5. ARM64FunctionCallingConventions

References

  1. Aspects
  2. fishhook
  3. Hook 原理之 fishhook 源碼解析
  4. 巧用符號表 - 探求 fishhook 原理(一)
  5. 驗證試驗 - 探求 fishhook 原理(二)