Wakeup in XNU

Date: 2021-04-15

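Author: 段家順

In iOS 13, Apple added a new performance metric to the kernel: wakeups. Countless apps have since been killed by the system for exceeding it, including everyday apps such as WeChat and Taobao. Because the metric is tracked entirely inside the XNU kernel, it is hard to pinpoint the problem through logs or other ordinary means, so this article takes a different approach to solving it.

Why count wakeups

To track down the problem, we first need to understand what this metric is for.

XNU tracks performance limits for CPU, memory, and I/O. Wakeups are a CPU metric, alongside CPU usage. The following definitions from XNU set the limits.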

/*
 * Default parameters for CPU usage monitor.
 *
 * Default setting is 50% over 3 minutes.
 */
#define         DEFAULT_CPUMON_PERCENTAGE 50
#define         DEFAULT_CPUMON_INTERVAL   (3 * 60)

#define TASK_WAKEUPS_MONITOR_DEFAULT_LIMIT              150 /* wakeups per second */
#define TASK_WAKEUPS_MONITOR_DEFAULT_INTERVAL   300 /* in seconds. */

/*
 * Level (in terms of percentage of the limit) at which the wakeups monitor triggers telemetry.
 *
 * (ie when the task's wakeups rate exceeds 70% of the limit, start taking user
 *  stacktraces, aka micro-stackshots)
 */
#define TASK_WAKEUPS_MONITOR_DEFAULT_USTACKSHOTS_TRIGGER        70

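In short: if average CPU usage exceeds 50% over 3 minutes, the task is considered to be overusing the CPU; if wakeups average more than 150 per second over 300 seconds, the task is considered to be waking up too often. Once the wakeup rate passes 70% of the limit, the kernel starts collecting telemetry (micro-stackshots).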
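
To make the arithmetic concrete, here is a minimal sketch that classifies a wakeup count against these defaults. The three constants are copied from the XNU definitions quoted above; the enum and the function `classify_wakeups` are hypothetical names used only for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the XNU definitions quoted above. */
#define TASK_WAKEUPS_MONITOR_DEFAULT_LIMIT               150 /* wakeups per second */
#define TASK_WAKEUPS_MONITOR_DEFAULT_INTERVAL            300 /* in seconds */
#define TASK_WAKEUPS_MONITOR_DEFAULT_USTACKSHOTS_TRIGGER 70  /* percent of the limit */

/* Hypothetical classification, not a kernel type. */
typedef enum {
    WAKEUPS_OK = 0,         /* below the telemetry trigger */
    WAKEUPS_TELEMETRY = 1,  /* above 70% of the limit */
    WAKEUPS_OVER_LIMIT = 2  /* above the limit itself */
} wakeups_status_t;

/* Classify a wakeup count observed over interval_seconds. */
static wakeups_status_t classify_wakeups(uint64_t wakeups, uint64_t interval_seconds)
{
    assert(interval_seconds > 0);

    /* 150 wakeups/s over 300 s allows 45000 wakeups in total. */
    const uint64_t limit = (uint64_t)TASK_WAKEUPS_MONITOR_DEFAULT_LIMIT * interval_seconds;
    /* 70% of that budget is where telemetry starts. */
    const uint64_t trigger = limit * TASK_WAKEUPS_MONITOR_DEFAULT_USTACKSHOTS_TRIGGER / 100;

    if (wakeups > limit) {
        return WAKEUPS_OVER_LIMIT;
    }
    if (wakeups > trigger) {
        return WAKEUPS_TELEMETRY;
    }
    return WAKEUPS_OK;
}
```

Over the default 300-second interval, the budget works out to 45000 wakeups, with telemetry starting above 31500.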

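CPU usage is easy to understand: the higher the usage, the shorter the battery life, and the relationship is not linear. But how do wakeups affect battery life?

First, let's look at how the ARM architecture documentation describes CPU power consumption: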

Many ARM systems are mobile devices and powered by batteries. In such systems, optimization of power use, and total energy use, is a key design constraint. Programmers often spend significant amounts of time trying to save battery life in such systems.

Because ARM is widely used in low-power devices, which are usually battery powered, ARM addresses power consumption at the hardware design level.

Energy use can be divided into two components: 

- Static 
Static power consumption, also often called leakage, occurs whenever the core logic or RAM blocks have power applied to them. In general terms, the leakage currents are proportional to the total silicon area, meaning that the bigger the chip, the higher the leakage. The proportion of power consumption from leakage gets significantly higher as you move to smaller fabrication geometries. 

- Dynamic 
Dynamic power consumption occurs because of transistor switching and is a function of the core clock speed and the numbers of transistors that change state per cycle. Clearly, higher clock speeds and more complex cores consume more power.

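Power consumption falls into two types: static and dynamic.

Static power consumption exists whenever the CPU is powered: a chip cannot be perfectly insulated, so some current always leaks, and the larger the chip, the worse the leakage, which is one reason chip vendors work so hard on smaller chips. This part is determined by the hardware itself, so we cannot control it, and it accounts for a relatively small share of total power.

Dynamic power consumption is the extra cost of executing instructions while the CPU is running with its clock enabled. It depends on the clock frequency: the higher the frequency, the more power is drawn. This explains why Apple limits CPU usage. Related research (Facebook has done some as well) also shows that energy consumption roughly doubles between CPU usage below 20% and above 20%.

Limiting CPU usage already curbs battery drain to some extent. So what does the wakeup metric measure?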

What is a wakeup

To understand wakeups, we first need to know two key instructions for ARM's low-power mode: WFI and WFE.

ARM assembly language includes instructions that can be used to place the core in a low-power state. The architecture defines these instructions as hints, meaning that the core is not required to take any specific action when it executes them. In the Cortex-A processor family, however, these instructions are implemented in a way that shuts down the clock to almost all parts of the core. This means that the power consumption of the core is significantly reduced so that only static leakage currents are drawn, and there is no dynamic power consumption.

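After one of these two instructions puts the core into low-power mode, its clock is shut off and the core stops executing instructions, so its dynamic power consumption disappears. This capability is implemented by the idle thread, which is tightly bound to a CPU core. Interestingly, the XNU implementation is fairly complex, while Zircon's is blunt and direct: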

__NO_RETURN int arch_idle_thread_routine(void*) {
  for (;;) {
    __asm__ volatile("wfi");
  }
}

In XNU, the life cycle of a CPU core is summarized by the following state machine:

/*
 *           -------------------- SHUTDOWN
 *          /                     ^     ^
 *        _/                      |      \
 *  OFF_LINE ---> START ---> RUNNING ---> IDLE ---> DISPATCHING
 *         \_________________^   ^ ^______/           /
 *                                \__________________/
 */

A wakeup, then, counts the number of times a core is woken from low-power mode into the running state.
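
As a rough mental model only, the idea can be sketched as a toy state machine that counts exits from the idle state. The state names follow the XNU diagram above; the struct `toy_cpu_t`, its `idle_exits` field, and the function `toy_cpu_transition` are hypothetical and do not reflect how the kernel actually records wakeups.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* State names follow the XNU diagram; everything else here is hypothetical. */
typedef enum {
    CPU_OFF_LINE,
    CPU_START,
    CPU_RUNNING,
    CPU_IDLE,
    CPU_DISPATCHING,
    CPU_SHUTDOWN
} cpu_state_t;

typedef struct {
    cpu_state_t state;
    uint64_t idle_exits; /* times the core left IDLE to do work */
} toy_cpu_t;

/* Move the toy core to the next state and count wakeups from IDLE. */
static void toy_cpu_transition(toy_cpu_t *cpu, cpu_state_t next)
{
    assert(cpu != NULL);

    /* IDLE -> RUNNING and IDLE -> DISPATCHING both mean the core was woken.
     * IDLE -> SHUTDOWN is not counted: the core does no work afterwards. */
    if (cpu->state == CPU_IDLE &&
        (next == CPU_RUNNING || next == CPU_DISPATCHING)) {
        cpu->idle_exits++;
    }
    cpu->state = next;
}
```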

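How wakeups are counted

The ARM exception system

Once the CPU clock is shut off, how does the core wake up again? This is where the CPU's exception system comes in.

On ARM, the line between exceptions and interrupts is blurry: every event that changes the CPU's execution state is called an exception, including software interrupts, debug interrupts, and hardware interrupts.

By trigger timing, exceptions are either synchronous or asynchronous. These terms do not carry their application-level meaning here. Synchronous means the exception has a well-defined trigger point: system calls and page faults, for example, occur at a specific instruction. Asynchronous exceptions ignore the instruction flow entirely and forcibly interrupt execution; FIQ and IRQ are examples, with the timer interrupt being the typical case.

The exception system has many capabilities, and an important one is switching between user mode and kernel mode. ARM defines four exception levels: EL0, EL1, EL2, and EL3. EL0 is user mode and EL1 is kernel mode. Code in user mode must go through the exception system to enter kernel mode, and taking an exception can only move to the same or a higher exception level.

With so many exception types, how is each one handled? This is done through an exception vector table (exception table), which must be registered early during system boot. In XNU, the table looks like this: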

.section __DATA_CONST,__const
    .align 3
    .globl EXT(exc_vectors_table)
LEXT(exc_vectors_table)
    /* Table of exception handlers.
         * These handlers sometimes contain deadloops. 
         * It's nice to have symbols for them when debugging. */
    .quad el1_sp0_synchronous_vector_long
    .quad el1_sp0_irq_vector_long
    .quad el1_sp0_fiq_vector_long
    .quad el1_sp0_serror_vector_long
    .quad el1_sp1_synchronous_vector_long
    .quad el1_sp1_irq_vector_long
    .quad el1_sp1_fiq_vector_long
    .quad el1_sp1_serror_vector_long
    .quad el0_synchronous_vector_64_long
    .quad el0_irq_vector_64_long
    .quad el0_fiq_vector_64_long
    .quad el0_serror_vector_64_long

Counting wakeups

With that background, let's return to the place where wakeups are counted:

/*
 *  thread_unblock:
 *
 *  Unblock thread on wake up.
 *  Returns TRUE if the thread should now be placed on the runqueue.
 *  Thread must be locked.
 *  Called at splsched().
 */
boolean_t
thread_unblock(
    thread_t                thread,
    wait_result_t   wresult)
{
      // . . .
    boolean_t aticontext, pidle;
    ml_get_power_state(&aticontext, &pidle);

     /* Obtain power-relevant interrupt and "platform-idle exit" statistics.
     * We also account for "double hop" thread signaling via
     * the thread callout infrastructure.
     * DRK: consider removing the callout wakeup counters in the future
     * they're present for verification at the moment.
     */

    if (__improbable(aticontext /* . . . */)) {
        // wakeup ++
    }
    // . . .
}

Here aticontext is obtained through ml_at_interrupt_context, and it indicates whether the code is running in interrupt context.

/*
 *  Routine:        ml_at_interrupt_context
 *  Function:   Check if running at interrupt context
 */
boolean_t
ml_at_interrupt_context(void)
{
    /* Do not use a stack-based check here, as the top-level exception handler
     * is free to use some other stack besides the per-CPU interrupt stack.
     * Interrupts should always be disabled if we're at interrupt context.
     * Check that first, as we may be in a preemptible non-interrupt context, in
     * which case we could be migrated to a different CPU between obtaining
     * the per-cpu data pointer and loading cpu_int_state.  We then might end
     * up checking the interrupt state of a different CPU, resulting in a false
     * positive.  But if interrupts are disabled, we also know we cannot be
     * preempted. */
    return !ml_get_interrupts_enabled() && (getCpuDatap()->cpu_int_state != NULL);
}

So when is cpu_int_state set? It is only updated in locore.S:

str        x0, [x23, CPU_INT_STATE]            // Saved context in cpu_int_state

The following routines set it:

el1_sp0_irq_vector_long
el1_sp1_irq_vector_long
el0_irq_vector_64_long
el1_sp0_fiq_vector_long
el0_fiq_vector_64_long

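Comparing the positions of these handlers in the exception vector table above with the vector layout in the official ARM documentation shows that they are all FIQ or IRQ vectors, that is, hardware interrupts. We can therefore conclude that a wakeup is always caused by a hardware interrupt; system calls, thread switches, and page faults do not cause wakeups.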
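
The position-to-type mapping can be sketched in a few lines of C. The AArch64 vector table has four groups of four entries (synchronous, IRQ, FIQ, SError), each entry 0x80 bytes apart; XNU's exc_vectors_table lists handler pointers in the same group-then-type order for the three groups it uses. The enum and function names below are hypothetical helpers for illustration, not XNU or ARM identifiers.

```c
#include <assert.h>

/* Where the exception was taken from (order matches the AArch64 vector table). */
typedef enum {
    FROM_CURRENT_EL_SP0 = 0,
    FROM_CURRENT_EL_SPX = 1,
    FROM_LOWER_EL_AARCH64 = 2,
    FROM_LOWER_EL_AARCH32 = 3
} vector_group_t;

/* Kind of exception within a group. */
typedef enum {
    EXC_SYNCHRONOUS = 0,
    EXC_IRQ = 1,
    EXC_FIQ = 2,
    EXC_SERROR = 3
} vector_type_t;

/* Byte offset of a vector from the vector base address. */
static unsigned vector_offset(vector_group_t group, vector_type_t type)
{
    assert((unsigned)group <= 3u && (unsigned)type <= 3u);
    return (unsigned)group * 0x200u + (unsigned)type * 0x80u;
}

/* Index into a flat, four-entries-per-group pointer table. */
static unsigned vector_table_index(vector_group_t group, vector_type_t type)
{
    assert((unsigned)group <= 3u && (unsigned)type <= 3u);
    return (unsigned)group * 4u + (unsigned)type;
}

/* Only IRQ and FIQ are asynchronous hardware interrupts. */
static int is_hardware_interrupt(vector_type_t type)
{
    return type == EXC_IRQ || type == EXC_FIQ;
}
```

For example, el0_irq_vector_64_long is the tenth entry of the table shown earlier, which corresponds to group FROM_LOWER_EL_AARCH64 and type EXC_IRQ.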

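Per-process accounting

From the above, a wakeup is really a count of how often a CPU core is woken, which seems unrelated to application-level threads and processes. But think about it from the program's point of view: a program that runs continuously never enters a wait state, and when it is woken from a wait state, the cause must be some interrupt, such as network activity or vsync.

After a CPU core is woken, the thread that runs on that core has its wakeup counter incremented. The system reports the statistic per application, that is, per process, so it sums the wakeup counts of all threads in the process.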

queue_iterate(&task->threads, thread, thread_t, task_threads) {
        info->task_timer_wakeups_bin_1 += thread->thread_timer_wakeups_bin_1;
        info->task_timer_wakeups_bin_2 += thread->thread_timer_wakeups_bin_2;
}
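
The same aggregation can be tried outside the kernel with plain arrays. This is a minimal sketch: `toy_thread_t`, `toy_task_power_info_t`, and `sum_thread_wakeups` are hypothetical stand-ins for the kernel's thread structure, task_power_info, and the queue_iterate loop, kept only close enough to show the per-task summation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-thread counters. */
typedef struct {
    uint64_t timer_wakeups_bin_1;
    uint64_t timer_wakeups_bin_2;
} toy_thread_t;

/* Hypothetical stand-in for the per-task totals. */
typedef struct {
    uint64_t task_timer_wakeups_bin_1;
    uint64_t task_timer_wakeups_bin_2;
} toy_task_power_info_t;

/* Sum the counters of every thread in the task, bin by bin. */
static toy_task_power_info_t sum_thread_wakeups(const toy_thread_t *threads, size_t count)
{
    assert(threads != NULL || count == 0);

    toy_task_power_info_t info = { 0, 0 };
    for (size_t i = 0; i < count; i++) {
        info.task_timer_wakeups_bin_1 += threads[i].timer_wakeups_bin_1;
        info.task_timer_wakeups_bin_2 += threads[i].timer_wakeups_bin_2;
    }
    return info;
}
```

Because the totals are plain sums, every additional thread that takes wakeups adds directly to the process-level count.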

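So in our code, if two different threads each start the same timer, the wakeup count is twice that of a single thread starting two such timers (identical timers are stored in a tree at the lower level, so registering the same timer twice actually registers only one).

User space can read these statistics as follows: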

#import <Foundation/Foundation.h>
#include <mach/mach.h>
#include <mach/task.h>

// Reads the wakeup counters of the current task from the kernel.
// Either output pointer may be NULL. Returns YES on success; on failure
// the outputs are set to 0 and NO is returned.
BOOL GetSystemWakeup(NSInteger *interrupt_wakeup, NSInteger *timer_wakeup) {
    struct task_power_info info = {0};
    mach_msg_type_number_t count = TASK_POWER_INFO_COUNT;
    // mach_task_self() is the usual user-space name for the current task port.
    kern_return_t ret = task_info(mach_task_self(), TASK_POWER_INFO, (task_info_t)&info, &count);
    BOOL ok = (ret == KERN_SUCCESS);
    if (interrupt_wakeup) {
        *interrupt_wakeup = ok ? (NSInteger)info.task_interrupt_wakeups : 0;
    }
    if (timer_wakeup) {
        // Timer wakeups are reported in two bins; sum them.
        *timer_wakeup = ok ? (NSInteger)(info.task_timer_wakeups_bin_1 + info.task_timer_wakeups_bin_2) : 0;
    }
    return ok;
}
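
The counters returned this way are cumulative, so a rate has to be derived from two samples. The helper below is a minimal sketch under that assumption; `wakeups_per_second` is a hypothetical name, and the article does not establish which of the counters the kernel's 150-per-second limit is applied to, so treat the result as an indicator rather than the kernel's own figure.

```c
#include <assert.h>
#include <stdint.h>

/* Average wakeups per second between two readings of a cumulative counter.
 * earlier_count and later_count are two samples of the same counter,
 * taken elapsed_seconds apart. */
static double wakeups_per_second(uint64_t earlier_count,
                                 uint64_t later_count,
                                 double elapsed_seconds)
{
    assert(elapsed_seconds > 0.0);
    assert(later_count >= earlier_count); /* the counter only grows */
    return (double)(later_count - earlier_count) / elapsed_seconds;
}
```

Sampling once when a feature starts and once when it ends gives a per-feature rate that can be compared across builds.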

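Reducing wakeups

Based on the analysis above, we only need to investigate events that are tied to hardware.

In practice, our investigations found that timers, and types with timing capabilities, are by far the most common source.

Examples include NSTimer, CADisplayLink, dispatch_semaphore_wait, and pthread_cond_timedwait.

For timers, reuse them as much as possible and avoid creating the same timing capability on different threads. Also stop timers that are not needed when the app moves to the background: most timers are UI-related, and stopping them in the background is standard practice.

For wait-type APIs, avoid polling designs or increase the polling interval. For example, try_wait, the runloop, or EventKit can be used to optimize such code.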

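Monitoring and preventing regressions

Once the cause is known, fixing the problem is fairly simple. What is needed afterwards is lasting measures such as continuous monitoring.

Here we can define a few simple rules and build them into our offline monitoring:

Timers with a period shorter than 1s must be paused when the app enters the background
Waits with a timeout shorter than 1s that are used 10 or more times in a row need to be optimized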
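
The two rules above can be written down as simple predicates for an offline check. This is a sketch only: the function names and the two constants are hypothetical, and how timer periods and wait timeouts are collected at runtime (for example by hooking the APIs listed earlier) is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Thresholds taken from the two rules above; the names are hypothetical. */
#define SHORT_PERIOD_SECONDS        1.0
#define MAX_CONSECUTIVE_SHORT_WAITS 10u

/* Rule 1: a timer firing more often than once per second
 * must be paused when the app enters the background. */
static bool timer_must_pause_in_background(double period_seconds)
{
    assert(period_seconds > 0.0);
    return period_seconds < SHORT_PERIOD_SECONDS;
}

/* Rule 2: a wait with a sub-second timeout that is used
 * 10 or more times in a row should be flagged for optimization. */
static bool wait_needs_optimization(double timeout_seconds, unsigned consecutive_uses)
{
    assert(timeout_seconds >= 0.0);
    return timeout_seconds < SHORT_PERIOD_SECONDS &&
           consecutive_uses >= MAX_CONSECUTIVE_SHORT_WAITS;
}
```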

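Summary

Because wakeups are a statistic kept inside the XNU kernel, locating and troubleshooting the problem directly is very difficult, so approaching it from a different angle turns out to be a better way.

The granularity at which XNU controls CPU power consumption also shows how far Apple goes in optimization, and how high a bar it sets within its own software ecosystem. Battery technology is unlikely to see a breakthrough in the short term, so we also need to think more about how to reduce battery drain ourselves.

This article was published by the NetEase Cloud Music front-end team (網易雲音樂大前端團隊). Reproduction in any form without authorization is prohibited. We hire front-end, iOS, and Android engineers year-round; if you are looking for a new job and happen to like Cloud Music, join us at grp.music-fe(at)corp.netease.com!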
