iOS Out-Of-Memory: How It Works and a Survey of Solutions

What is OOM?

OOM stands for Out-Of-Memory. It is an "unusual" kind of crash caused by iOS's Jetsam mechanism. Unlike regular crashes, OOM events cannot be caught by crash-monitoring approaches such as signal handling.

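Why does OOM happen?

My current guess is that two situations cause an OOM:

  1. Overall system memory usage is high, and the system kills lower-priority Apps based on priority.
  2. The App in use reaches the "high water mark", i.e. the system's memory limit for a single App, and the system kills it.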

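Verification approach 1:

XNU's opensource.apple.com/source/xnu/… and opensource.apple.com/source/xnu/… provide some functions and macros. With root privileges we can use them to read the OOM memory threshold of every App in the current state and, given a PID, even modify a process's memory threshold to raise its OOM limit.

The most useful pieces for us are: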

// Per-process pid, priority, state, memory limit and related information
typedef struct memorystatus_priority_entry {
    pid_t pid;
    int32_t priority;
    uint64_t user_data;
    int32_t limit;
    uint32_t state;
} memorystatus_priority_entry_t;
 
 
// The commands below can be used to query the memory limit and related information, and also to modify it
/* Commands */
#define MEMORYSTATUS_CMD_GET_PRIORITY_LIST 1
#define MEMORYSTATUS_CMD_SET_PRIORITY_PROPERTIES 2
#define MEMORYSTATUS_CMD_GET_JETSAM_SNAPSHOT 3
#define MEMORYSTATUS_CMD_GET_PRESSURE_STATUS 4
#define MEMORYSTATUS_CMD_SET_JETSAM_HIGH_WATER_MARK 5 /* Set active memory limit = inactive memory limit, both non-fatal */
#define MEMORYSTATUS_CMD_SET_JETSAM_TASK_LIMIT 6 /* Set active memory limit = inactive memory limit, both fatal */
#define MEMORYSTATUS_CMD_SET_MEMLIMIT_PROPERTIES 7 /* Set memory limits plus attributes independently */
#define MEMORYSTATUS_CMD_GET_MEMLIMIT_PROPERTIES 8 /* Get memory limits plus attributes */
#define MEMORYSTATUS_CMD_PRIVILEGED_LISTENER_ENABLE 9 /* Set the task's status as a privileged listener w.r.t memory notifications */
#define MEMORYSTATUS_CMD_PRIVILEGED_LISTENER_DISABLE 10 /* Reset the task's status as a privileged listener w.r.t memory notifications */
/* Commands that act on a group of processes */
#define MEMORYSTATUS_CMD_GRP_SET_PROPERTIES 100


We can write a program like the following:

#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include "kern_memorystatus.h"

#define NUM_ENTRIES 1024

// Convert kMemorystatus* state flags to a textual representation.
// Returns a pointer to a static buffer, so each call overwrites the previous result.
char *state_to_text(int State)
{
    static char returned[80];

    snprintf(returned, sizeof(returned), "0x%02x ", State);

    if (State & kMemorystatusSuspended) strcat(returned, "Suspended,");
    if (State & kMemorystatusFrozen) strcat(returned, "Frozen,");
    if (State & kMemorystatusWasThawed) strcat(returned, "WasThawed,");
    if (State & kMemorystatusTracked) strcat(returned, "Tracked,");
    if (State & kMemorystatusSupportsIdleExit) strcat(returned, "IdleExit,");
    if (State & kMemorystatusDirty) strcat(returned, "Dirty,");

    // Drop the trailing comma, if any
    size_t len = strlen(returned);
    if (returned[len - 1] == ',')
        returned[len - 1] = '\0';

    return returned;
}

int main (int argc, char **argv)
{
    struct memorystatus_priority_entry memstatus[NUM_ENTRIES];
    size_t count = sizeof(struct memorystatus_priority_entry) * NUM_ENTRIES;

    // Ask the kernel for the jetsam priority list. The code below treats a
    // non-negative return value as the number of bytes written to the buffer.
    int rc = memorystatus_control(MEMORYSTATUS_CMD_GET_PRIORITY_LIST, // 1 - only supported command on OS X
                                  0,         // pid
                                  0,         // flags
                                  memstatus, // buffer
                                  count);    // buffersize

    if (rc < 0) { perror("memorystatus_control"); exit(EXIT_FAILURE); }

    // One entry per sizeof(struct memorystatus_priority_entry) bytes returned
    size_t entries = (size_t)rc / sizeof(struct memorystatus_priority_entry);

    for (size_t entry = 0; entry < entries; entry++)
    {
        printf("PID: %5d\tPriority:%2d\tUser Data: %llx\tLimit:%2d\tState:%s\n",
               memstatus[entry].pid,
               memstatus[entry].priority,
               (unsigned long long)memstatus[entry].user_data,
               memstatus[entry].limit,
               state_to_text(memstatus[entry].state));
    }

    return 0;
}

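Then use the Command-line Tool template provided by MonkeyDev to put the program on a jailbroken device (the test environment at the time was a 5s on iOS 9.1), connect to the device over SSH, and run the program from the terminal. This produces a dump like the following: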

PID:  9967	Priority: 3	User Data: 0	Limit: 6	State:0x38 Tracked,IdleExit,Dirty
PID: 11151	Priority: 3	User Data: 0	Limit: 6	State:0x38 Tracked,IdleExit,Dirty
PID: 11154	Priority: 3	User Data: 0	Limit:10	State:0x38 Tracked,IdleExit,Dirty
PID: 11165	Priority: 3	User Data: 0	Limit: 6	State:0x38 Tracked,IdleExit,Dirty
PID: 11499	Priority: 3	User Data: 0	Limit:18	State:0x28 Tracked,Dirty
PID: 10039	Priority: 4	User Data: 2100	Limit:108	State:0x00
PID:  9981	Priority: 7	User Data: 0	Limit:10	State:0x08 Tracked
PID:  9977	Priority: 7	User Data: 0	Limit:20	State:0x08 Tracked
PID:  9979	Priority: 7	User Data: 0	Limit:25	State:0x38 Tracked,IdleExit,Dirty
PID: 10021	Priority: 7	User Data: 0	Limit: 6	State:0x08 Tracked
PID: 11575	Priority:10	User Data: 10100	Limit:650	State:0x00
PID:   103	Priority:11	User Data: 0	Limit:96	State:0x08 Tracked
PID: 11442	Priority:11	User Data: 0	Limit:38	State:0x08 Tracked
PID:    67	Priority:12	User Data: 0	Limit:24	State:0x28 Tracked,Dirty
PID:    31	Priority:14	User Data: 0	Limit:650	State:0x08 Tracked
PID:    45	Priority:14	User Data: 0	Limit: 9	State:0x08 Tracked

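In the output above, the process with Priority:10 is the App I was testing, 好好學習. The App was in the foreground and active at that moment, so its priority is 10, and its OOM memory threshold is 650.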
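Reading the dump by eye works, but the same lookup can be sketched in code. The helper names `find_entry` and `band_name` are hypothetical, and the band names are my recollection of the JETSAM_PRIORITY_* constants in XNU's kern_memorystatus.h, so treat them as assumptions that may differ between OS versions. The limit appears to be in MB, since 650 is close to the 646MB measured for the iPhone 5s in the Stack Overflow list further down.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Same field layout as memorystatus_priority_entry in the excerpt above. */
typedef struct {
    pid_t    pid;
    int32_t  priority;
    uint64_t user_data;
    int32_t  limit;    /* memory limit, apparently in MB */
    uint32_t state;
} priority_entry_t;

/* Hypothetical helper: return the entry for `pid`, or NULL if it is absent. */
const priority_entry_t *find_entry(const priority_entry_t *entries,
                                   size_t n, pid_t pid)
{
    for (size_t i = 0; i < n; i++) {
        if (entries[i].pid == pid)
            return &entries[i];
    }
    return NULL;
}

/* Assumed names for a few jetsam priority bands (not exhaustive). */
const char *band_name(int32_t priority)
{
    switch (priority) {
    case 0:  return "idle";
    case 3:  return "background";
    case 4:  return "mail";
    case 5:  return "phone";
    case 10: return "foreground";
    case 12: return "audio and accessory";
    case 16: return "home";
    case 19: return "critical";
    default: return "unnamed";
    }
}
```

Fed with the entries returned by memorystatus_control, `find_entry(memstatus, entries, 11575)` would return the row for the test App, whose `limit` is 650 and whose priority maps to "foreground".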

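Verification approach 2:

When our App is killed by jetsam, the phone keeps a system log. Under Settings > Privacy > Analytics you can find logs whose names start with JetsamEvent. These logs contain some memory information about the App. Taking my 6s as an example, pageSize * rpages gives the threshold, and the log also states the reason as "reason" : "per-process-limit". (Not every JetsamEvent yields an accurate threshold; some are off.)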

"pageSize" : 16384
{
    "uuid" : "b8d6682c-5903-3007-b9c2-561d1e6ca9d5",
    "states" : [
      "frontmost",
      "resume"
    ],
    "killDelta" : 18859,
    "genCount" : 0,
    "age" : 1775369503,
    "purgeable" : 0,
    "fds" : 50,
    "coalition" : 691,
    "rpages" : 89600,
    "reason" : "per-process-limit",
    "pid" : 960,
    "cpuTime" : 1.6920809999999999,
    "name" : "MemoryLimitTest",
    "lifetimeMax" : 34182
}
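The arithmetic behind pageSize * rpages for the log above can be sketched as follows. The function names are hypothetical; the only inputs are the two values read from the JetsamEvent log.

```c
#include <stdint.h>

/* Resident bytes at kill time: page size times resident page count. */
uint64_t jetsam_resident_bytes(uint64_t page_size, uint64_t rpages)
{
    return page_size * rpages;
}

/* Convert bytes to MiB, rounding down. */
uint64_t bytes_to_mib(uint64_t bytes)
{
    return bytes / (1024u * 1024u);
}
```

With pageSize 16384 and rpages 89600 this gives 1,468,006,400 bytes, i.e. 1400 MiB, which is in the same range as the 1396MB listed for the iPhone 6s in the Stack Overflow results below.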

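Verification approach 3:

You can also run a large number of tests to find the OOM memory threshold. Stack Overflow already has a list of OOM thresholds for common devices. Those values deviate from the real thresholds, and I suspect two reasons. First, the moment at which the test samples memory cannot coincide exactly with the moment of the OOM; it can only get as close as possible. Second, the way it measures memory differs from the way the jetsam mechanism in XNU measures it. The correct way to measure memory is described below.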

Results of testing with the utility Split wrote (link is in his answer):
device: (crash amount/total amount/percentage of total)
iPad1: 127MB/256MB/49%
iPad2: 275MB/512MB/53%
iPad3: 645MB/1024MB/62%
iPad4: 585MB/1024MB/57% (iOS 8.1)
iPad Mini 1st Generation: 297MB/512MB/58%
iPad Mini retina: 696MB/1024MB/68% (iOS 7.1)
iPad Air: 697MB/1024MB/68%
iPad Air 2: 1383MB/2048MB/68% (iOS 10.2.1)
iPad Pro 9.7": 1395MB/1971MB/71% (iOS 10.0.2 (14A456))
iPad Pro 10.5": 3057/4000/76% (iOS 11 beta4)
iPad Pro 12.9" (2015): 3058/3999/76% (iOS 11.2.1)
iPad Pro 12.9" (2017): 3057/3974/77% (iOS 11 beta4)
iPod touch 4th gen: 130MB/256MB/51% (iOS 6.1.1)
iPod touch 5th gen: 286MB/512MB/56% (iOS 7.0)
iPhone4: 325MB/512MB/63%
iPhone4s: 286MB/512MB/56%
iPhone5: 645MB/1024MB/62%
iPhone5s: 646MB/1024MB/63%
iPhone6: 645MB/1024MB/62% (iOS 8.x)
iPhone6+: 645MB/1024MB/62% (iOS 8.x)
iPhone6s: 1396MB/2048MB/68% (iOS 9.2)
iPhone6s+: 1392MB/2048MB/68% (iOS 10.2.1)
iPhoneSE: 1395MB/2048MB/69% (iOS 9.3)
iPhone7: 1395/2048MB/68% (iOS 10.2)
iPhone7+: 2040MB/3072MB/66% (iOS 10.2.1)
iPhone X: 1392/2785/50% (iOS 11.2.1)

https://stackoverflow.com/questions/5887248/ios-app-maximum-memory-budget/15200855#15200855

How to measure an App's memory usage correctly

A common way to get an App's memory usage is resident_size, with code like this:

#import <mach/mach.h>

- (int64_t)memoryUsage {
    int64_t memoryUsageInByte = 0;
    struct task_basic_info taskBasicInfo;
    mach_msg_type_number_t count = TASK_BASIC_INFO_COUNT; // count is in natural_t units, not bytes
    kern_return_t kernelReturn = task_info(mach_task_self(), TASK_BASIC_INFO, (task_info_t) &taskBasicInfo, &count);

    if(kernelReturn == KERN_SUCCESS) {
        memoryUsageInByte = (int64_t) taskBasicInfo.resident_size;
        NSLog(@"Memory in use (in bytes): %lld", memoryUsageInByte);
    } else {
        NSLog(@"Error with task_info(): %s", mach_error_string(kernelReturn));
    }

    return memoryUsageInByte;
}

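The correct way is to use phys_footprint, because that is the metric Apple itself uses; the numbers only mean something if they are measured the same way Apple measures them. You can verify this in the source: opensource.apple.com/source/xnu/…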

#import <mach/mach.h>

- (int64_t)memoryUsage {
    int64_t memoryUsageInByte = 0;
    task_vm_info_data_t vmInfo;
    mach_msg_type_number_t count = TASK_VM_INFO_COUNT;
    kern_return_t kernelReturn = task_info(mach_task_self(), TASK_VM_INFO, (task_info_t) &vmInfo, &count);
    if(kernelReturn == KERN_SUCCESS) {
        memoryUsageInByte = (int64_t) vmInfo.phys_footprint;
        NSLog(@"Memory in use (in bytes): %lld", memoryUsageInByte);
    } else {
        NSLog(@"Error with task_info(): %s", mach_error_string(kernelReturn));
    }

    return memoryUsageInByte;
}

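Approaches to locating OOMs

Approach 1:

The earliest OOM approach I came across is described in a Facebook blog post, code.facebook.com/posts/11469…, which uses a process of elimination to estimate the OOM rate. The result will of course deviate somewhat from reality, for example when ApplicationState is inaccurate or when watchdog kills are counted as OOMs.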
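The process of elimination can be sketched roughly as below. This is a minimal sketch, not Facebook's code: the struct, its field names, and the exact set of checks are my assumptions about what such a classifier records at launch about the previous run.

```c
#include <stdbool.h>

typedef enum {
    LAST_EXIT_EXPLAINED,       /* some known cause accounts for the exit */
    LAST_EXIT_BACKGROUND_OOM,  /* unexplained exit while in the background */
    LAST_EXIT_FOREGROUND_OOM   /* unexplained exit while in the foreground */
} last_exit_t;

/* Hypothetical facts persisted during the previous run and read at launch. */
typedef struct {
    bool app_upgraded;       /* App version changed since the last run */
    bool os_upgraded;        /* OS version changed, which implies a reboot */
    bool crash_reported;     /* the crash reporter caught a regular crash */
    bool exited_on_purpose;  /* the App itself called exit() or abort() */
    bool user_terminated;    /* a normal termination callback was observed */
    bool was_in_background;  /* last recorded application state */
} last_run_t;

/* Rule out every known cause; whatever is left is counted as an OOM. */
last_exit_t classify_last_exit(const last_run_t *r)
{
    if (r->app_upgraded || r->os_upgraded || r->crash_reported ||
        r->exited_on_purpose || r->user_terminated)
        return LAST_EXIT_EXPLAINED;

    /* Watchdog kills leave no trace here, so they land in the OOM buckets,
       and a stale was_in_background puts the exit in the wrong bucket. */
    return r->was_in_background ? LAST_EXIT_BACKGROUND_OOM
                                : LAST_EXIT_FOREGROUND_OOM;
}
```

The two comments in the fallback branch correspond to the two error sources mentioned above: anything the checks cannot see is attributed to OOM.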

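Approach 2:

Tencent also recently open-sourced its own OOM locating solution, the OOMDetector component: github.com/Tencent/OOM… . It uses the malloc_logger function pointer in libmalloc to record allocation stacks, which helps developers locate large allocations. It has a drawback, though: dumping stacks frequently hurts App performance, so it can only be enabled for a small gray-release group of users to collect data and locate problems.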

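Approach 3:

Based on the recent findings, an App's high water mark, i.e. its OOM memory threshold, can be obtained offline. That leads to approach 3:

  • Monitor memory growth and, when it gets close to the high water mark, dump memory information: object names, object counts, and the memory used by each kind of object. If this proves stable, it can be enabled for all users without performance problems.
  • OOMDetector can capture the allocation stacks, which is more effective for pinpointing the code responsible; it can be rolled out to a gray-release group.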
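The "dump when close to the high water mark" rule in the first bullet might look like the sketch below. The type, the function name, and the percentage-based trigger are hypothetical choices for illustration; the footprint would come from phys_footprint and the limit from the offline measurement.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t limit_bytes;      /* high water mark measured offline */
    unsigned trigger_percent;  /* hypothetical: dump at this share of the limit */
    bool     dumped;           /* true once a dump was taken for this climb */
} oom_monitor_t;

/* Returns true exactly once each time the footprint climbs past the trigger. */
bool oom_monitor_should_dump(oom_monitor_t *m, uint64_t footprint_bytes)
{
    uint64_t trigger = m->limit_bytes / 100 * m->trigger_percent;

    if (footprint_bytes < trigger) {
        m->dumped = false;  /* re-arm after memory drops back down */
        return false;
    }
    if (m->dumped)
        return false;       /* already dumped during this climb */

    m->dumped = true;
    return true;
}
```

Dumping once per climb instead of on every sample keeps the cost low, which is what makes enabling it for all users plausible. With a 650 MiB limit and a 90% trigger, the first sample at or above roughly 585 MiB triggers a dump.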