The Android Message Mechanism

Date: 2019-11-07

This analysis is based on the Android P (9.0) source code.

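1 Overview

The message mechanism is an important means of inter-thread communication in Android.

It lets one thread notify another thread to do some work. But why would one thread need another thread to work on its behalf?

Consider a common scenario: UI updates. Google's official documentation states the rule as follows: "The Android UI toolkit is not thread-safe and the view must always be manipulated on the UI thread." Because UI updates are not thread-safe, Android sidesteps potentially unsafe operations by developers by requiring that all UI updates happen on the main thread. In this scenario, other threads need to ask the main thread to update the UI for them.

Besides UI updates, the implementation of some design patterns also relies on the message mechanism.

The figure below shows the most basic way the message mechanism works. Thread A sends messages to thread B's message queue, and thread B keeps taking new messages out of the queue and processing them.

Thread A behaves like a hands-off boss who only sends messages and never does the work, while thread B is the laborer who keeps processing the messages as they arrive.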

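2 Detailed Process

The figure below shows the message mechanism in detail. It consists of two main parts:

Sending messages
Processing messages

A message is sent through a Handler to the MessageQueue of another thread. That thread uses its Looper to poll the message queue continuously, takes messages out, and hands each one back to the Handler that originally sent it for processing.

This description assumes that thread B already has a Looper and a MessageQueue. In fact, neither exists by default. The complete process therefore consists of three parts:

Preparing the message queue
Sending messages
Processing messages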

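2.1 Preparing the Message Queue

In an Android application, the main thread comes with a Looper and a MessageQueue. Any other thread that wants to use the message mechanism must first call Looper.prepare().

Why does the main thread come with a Looper and a MessageQueue?

The main thread of every Android application corresponds to an ActivityThread. Because all Activity callbacks run on the main thread, Google uses ActivityThread to represent it.

The main method of ActivityThread is the entry point of every Android application. Line 6642 shows that the main thread does not inherently own a Looper and a MessageQueue; they are simply created for us in advance in ActivityThread.main. Line 6642 creates the main thread's Looper and MessageQueue (details below), and line 6669 starts the Looper's loop: it keeps taking messages from the MessageQueue and executing them, suspends the thread when the queue is empty, and resumes when a new message arrives. The cycle repeats without end.

That is the basic working model of the Android main thread. The familiar onCreate and onDestroy callbacks are also driven by the message mechanism behind the scenes (with Binder involved as well).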

/frameworks/base/core/java/android/app/ActivityThread.java

6623    public static void main(String[] args) {
6624        Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "ActivityThreadMain");
6625
6626        // CloseGuard defaults to true and can be quite spammy. We
6627        // disable it here, but selectively enable it later (via
6628        // StrictMode) on debug builds, but using DropBox, not logs.
6629        CloseGuard.setEnabled(false);
6630
6631        Environment.initForCurrentUser();
6632
6633        // Set the reporter for event logging in libcore
6634        EventLogger.setReporter(new EventLoggingReporter());
6635
6636        // Make sure TrustedCertificateStore looks in the right place for CA certificates
6637        final File configDir = Environment.getUserConfigDirectory(UserHandle.myUserId());
6638        TrustedCertificateStore.setDefaultUserDirectory(configDir);
6639
6640        Process.setArgV0("<pre-initialized>");
6641
6642        Looper.prepareMainLooper();
6643
6644        // Find the value for {@link #PROC_START_SEQ_IDENT} if provided on the command line.
6645        // It will be in the format "seq=114"
6646        long startSeq = 0;
6647        if (args != null) {
6648            for (int i = args.length - 1; i >= 0; --i) {
6649                if (args[i] != null && args[i].startsWith(PROC_START_SEQ_IDENT)) {
6650                    startSeq = Long.parseLong(
6651                            args[i].substring(PROC_START_SEQ_IDENT.length()));
6652                }
6653            }
6654        }
6655        ActivityThread thread = new ActivityThread();
6656        thread.attach(false, startSeq);
6657
6658        if (sMainThreadHandler == null) {
6659            sMainThreadHandler = thread.getHandler();
6660        }
6661
6662        if (false) {
6663            Looper.myLooper().setMessageLogging(new
6664                    LogPrinter(Log.DEBUG, "ActivityThread"));
6665        }
6666
6667        // End of event ActivityThreadMain.
6668        Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
6669        Looper.loop();
6670
6671        throw new RuntimeException("Main thread loop unexpectedly exited");
6672    }
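
Looper.prepareMainLooper is a static method that creates a Looper and a MessageQueue for the main thread. It ultimately calls prepare, which creates a new Looper and stores it in the sThreadLocal field.

sThreadLocal is a static variable, so one would normally expect a single copy in memory that every thread can access. However, sThreadLocal relies on TLS (thread-local storage): the value each thread reads from sThreadLocal is independent of the others. So when the main thread calls prepareMainLooper, it creates a Looper private to that thread and also assigns it to the static variable sMainLooper, so that other threads can obtain the main thread's Looper.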
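
The per-thread behaviour of a static ThreadLocal can be reproduced in plain Java. The sketch below is not Android code; the class name ThreadLocalDemo and the field sLocal are made up for illustration, and a String stands in for the Looper.

```java
import java.util.concurrent.atomic.AtomicReference;

final class ThreadLocalDemo {
    // A single static field, yet every thread reads and writes its own slot.
    static final ThreadLocal<String> sLocal = new ThreadLocal<>();

    /** Stores a value from a new thread and returns what that thread reads back. */
    static String valueSeenByNewThread(String value) {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread t = new Thread(() -> {
            sLocal.set(value);
            seen.set(sLocal.get());
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        sLocal.set("main");
        String worker = valueSeenByNewThread("worker");
        // The worker's write did not overwrite the value stored by the main thread.
        System.out.println(worker + " / " + sLocal.get());
    }
}
```

Running main prints "worker / main": the two threads used the same static field but never saw each other's value, which is why one static sThreadLocal can hold a different Looper for each thread.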

/frameworks/base/core/java/android/os/Looper.java

    public static void prepareMainLooper() {
        prepare(false);
        synchronized (Looper.class) {
            if (sMainLooper != null) {
                throw new IllegalStateException("The main Looper has already been prepared.");
            }
            sMainLooper = myLooper();
        }
    }

/frameworks/base/core/java/android/os/Looper.java

    public static void prepare() {
        prepare(true);
    }

    private static void prepare(boolean quitAllowed) {
        if (sThreadLocal.get() != null) {
            throw new RuntimeException("Only one Looper may be created per thread");
        }
        sThreadLocal.set(new Looper(quitAllowed));
    }

The Looper constructor creates a MessageQueue, so calling Looper.prepare creates the Looper and MessageQueue that belong uniquely to the calling thread.

/frameworks/base/core/java/android/os/Looper.java

    private Looper(boolean quitAllowed) {
        mQueue = new MessageQueue(quitAllowed);
        mThread = Thread.currentThread();
    }

The MessageQueue constructor is shown below. It calls nativeInit to perform some initialization in the native layer.

/frameworks/base/core/java/android/os/MessageQueue.java

    MessageQueue(boolean quitAllowed) {
        mQuitAllowed = quitAllowed;
        mPtr = nativeInit();
    }

    private native static long nativeInit();

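The JNI function behind nativeInit is android_os_MessageQueue_nativeInit. It creates a NativeMessageQueue object and returns its pointer to the Java layer as a long. Android contains many one-to-one mappings between Java objects and native objects; the usual approach is a long field in the Java object that records the pointer value of the native object.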

/frameworks/base/core/jni/android_os_MessageQueue.cpp

static jlong android_os_MessageQueue_nativeInit(JNIEnv* env, jclass clazz) {
    NativeMessageQueue* nativeMessageQueue = new NativeMessageQueue();
    if (!nativeMessageQueue) {
        jniThrowRuntimeException(env, "Unable to allocate native queue");
        return 0;
    }

    nativeMessageQueue->incStrong(env);
    return reinterpret_cast<jlong>(nativeMessageQueue);
}

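The NativeMessageQueue constructor obtains the native Looper of the current thread, creating one if the thread does not have one yet, and binds it to the thread through TLS.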

/frameworks/base/core/jni/android_os_MessageQueue.cpp

NativeMessageQueue::NativeMessageQueue() :
        mPollEnv(NULL), mPollObj(NULL), mExceptionObj(NULL) {
    mLooper = Looper::getForThread();
    if (mLooper == NULL) {
        mLooper = new Looper(false);
        Looper::setForThread(mLooper);
    }
}

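In the native Looper constructor, line 67 is crucial. It initializes mWakeEventFd, and the resulting eventfd is registered with epoll in rebuildEpollLocked (line 151). epoll is an event notification mechanism in the Linux kernel that can monitor multiple file descriptors at once. If a monitored event occurs while a thread is suspended in epoll_wait, the thread leaves the suspended state and resumes running. In effect, it is an interrupt-style wait/notify mechanism. For more detail, see these two blog posts: blog 1 and blog 2. Blog 1 covers the basic concepts of epoll, and blog 2 explains Level Trigger and Edge Trigger very clearly.

Take lines 149 to 151 as an example. EPOLLIN means monitoring readable events on mWakeEventFd. When the thread calls epoll_wait, it returns immediately if mWakeEventFd is readable; otherwise it is suspended. If another thread writes new data to mWakeEventFd while this thread is suspended, the thread receives the event and goes from suspended back to running.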

/system/core/libutils/Looper.cpp

63 Looper::Looper(bool allowNonCallbacks) :
64         mAllowNonCallbacks(allowNonCallbacks), mSendingMessage(false),
65         mPolling(false), mEpollFd(-1), mEpollRebuildRequired(false),
66         mNextRequestSeq(0), mResponseIndex(0), mNextMessageUptime(LLONG_MAX) {
67     mWakeEventFd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
68     LOG_ALWAYS_FATAL_IF(mWakeEventFd < 0, "Could not make wake event fd: %s",
69                         strerror(errno));
70
71     AutoMutex _l(mLock);
72     rebuildEpollLocked();
73 }

/system/core/libutils/Looper.cpp

134 void Looper::rebuildEpollLocked() {
135     // Close old epoll instance if we have one.
136     if (mEpollFd >= 0) {
137 #if DEBUG_CALLBACKS
138         ALOGD("%p ~ rebuildEpollLocked - rebuilding epoll set", this);
139 #endif
140         close(mEpollFd);
141     }
142
143     // Allocate the new epoll instance and register the wake pipe.
144     mEpollFd = epoll_create(EPOLL_SIZE_HINT);
145     LOG_ALWAYS_FATAL_IF(mEpollFd < 0, "Could not create epoll instance: %s", strerror(errno));
146
147     struct epoll_event eventItem;
148     memset(& eventItem, 0, sizeof(epoll_event)); // zero out unused members of data field union
149     eventItem.events = EPOLLIN;
150     eventItem.data.fd = mWakeEventFd;
151     int result = epoll_ctl(mEpollFd, EPOLL_CTL_ADD, mWakeEventFd, & eventItem);
152     LOG_ALWAYS_FATAL_IF(result != 0, "Could not add wake event fd to epoll instance: %s",
153                         strerror(errno));
154
155     for (size_t i = 0; i < mRequests.size(); i++) {
156         const Request& request = mRequests.valueAt(i);
157         struct epoll_event eventItem;
158         request.initEventItem(&eventItem);
159
160         int epollResult = epoll_ctl(mEpollFd, EPOLL_CTL_ADD, request.fd, & eventItem);
161         if (epollResult < 0) {
162             ALOGE("Error adding epoll events for fd %d while rebuilding epoll set: %s",
163                   request.fd, strerror(errno));
164         }
165     }
166 }

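In summary, a thread that can process messages always has exactly one Looper and exactly one MessageQueue.

2.2 Sending Messages

Messages are sent through a Handler.

Calling Handler.sendMessage sends a message. sendMessage ultimately calls sendMessageAtTime. The uptimeMillis parameter is the desired delivery time, expressed as milliseconds since boot. For example, if the phone booted at 15:00:00 and the sender wants the message delivered at exactly 15:00:01, the uptimeMillis passed in is 1000 (uptimeMillis does not count time spent in deep sleep, so this assumes the device stayed awake).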

/frameworks/base/core/java/android/os/Handler.java

    public final boolean sendMessage(Message msg) {
        return sendMessageDelayed(msg, 0);
    }

    public final boolean sendMessageDelayed(Message msg, long delayMillis) {
        if (delayMillis < 0) {
            delayMillis = 0;
        }
        return sendMessageAtTime(msg, SystemClock.uptimeMillis() + delayMillis);
    }

    public boolean sendMessageAtTime(Message msg, long uptimeMillis) {
        MessageQueue queue = mQueue;
        if (queue == null) {
            RuntimeException e = new RuntimeException(
                    this + " sendMessageAtTime() called with no mQueue");
            Log.w("Looper", e.getMessage(), e);
            return false;
        }
        return enqueueMessage(queue, msg, uptimeMillis);
    }

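sendMessageAtTime reads the Handler's mQueue field and calls enqueueMessage, whose job is to add the message to the message queue. First, it sets the message's target field to the sending Handler, so that the same Handler processes the message once it is received. Then it marks the message as asynchronous if the Handler is asynchronous. Finally, it calls MessageQueue.enqueueMessage.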

/frameworks/base/core/java/android/os/Handler.java

740    private boolean enqueueMessage(MessageQueue queue, Message msg, long uptimeMillis) {
741        msg.target = this;
742        if (mAsynchronous) {
743            msg.setAsynchronous(true);
744        }
745        return queue.enqueueMessage(msg, uptimeMillis);
746    }

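
The queue on line 745 is passed in from Handler.sendMessageAtTime. It is the Handler's mQueue field, which is assigned in the Handler constructor. To find out where mQueue comes from, we need to look at the Handler constructors.

2.2.1 Which Thread Is the Message Sent To?

Handler has many overloaded constructors, but they all end up in one of these two: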

/frameworks/base/core/java/android/os/Handler.java

    public Handler(Callback callback, boolean async) {
        if (FIND_POTENTIAL_LEAKS) {
            final Class<? extends Handler> klass = getClass();
            if ((klass.isAnonymousClass() || klass.isMemberClass() || klass.isLocalClass()) &&
                    (klass.getModifiers() & Modifier.STATIC) == 0) {
                Log.w(TAG, "The following Handler class should be static or leaks might occur: " +
                    klass.getCanonicalName());
            }
        }

        mLooper = Looper.myLooper();
        if (mLooper == null) {
            throw new RuntimeException(
                "Can't create handler inside thread " + Thread.currentThread()
                        + " that has not called Looper.prepare()");
        }
        mQueue = mLooper.mQueue;
        mCallback = callback;
        mAsynchronous = async;
    }

    public Handler(Looper looper, Callback callback, boolean async) {
        mLooper = looper;
        mQueue = looper.mQueue;
        mCallback = callback;
        mAsynchronous = async;
    }

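The main difference between the two is that one takes a Looper and the other does not.

When a Looper is passed in, the Handler's mQueue equals looper.mQueue. Suppose the Handler is created on thread A but constructed with thread B's Looper; messages sent through this Handler are then processed by thread B.

When no Looper is passed in, the Handler's mQueue is the mQueue of the Looper that belongs to the creating thread. Again suppose the Handler is created on thread A, this time without a Looper; messages sent through this Handler are then processed by thread A.

Note the difference between these two cases carefully.

A few more words on the case without a Looper. Once created, a Handler object lives on the Java heap, so any thread can access and use it. Messages that any thread sends through it all end up in the MessageQueue of the creating thread, including messages sent from the creating thread itself.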
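
The point that the queue, not the sending thread, decides where a message runs can be modelled without Android. In the sketch below, MiniHandler and QueueOwnerDemo are hypothetical names, and a BlockingQueue of Runnables stands in for the MessageQueue; it is a simplified model, not how Handler is implemented.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

final class QueueOwnerDemo {
    /** Hypothetical stand-in for a Handler: it only remembers its target queue. */
    static final class MiniHandler {
        private final BlockingQueue<Runnable> queue;
        MiniHandler(BlockingQueue<Runnable> queue) { this.queue = queue; }
        void post(Runnable r) { queue.add(r); }
    }

    /** Posts from thread "A" into the queue drained by thread "B"; returns the name of the executing thread. */
    static String runDemo() {
        BlockingQueue<Runnable> queueOfB = new LinkedBlockingQueue<>();
        AtomicReference<String> ranOn = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);

        Thread b = new Thread(() -> {
            try {
                queueOfB.take().run(); // thread B takes one task from its own queue and runs it
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "B");
        b.start();

        MiniHandler handler = new MiniHandler(queueOfB);
        Thread a = new Thread(() -> handler.post(() -> {
            ranOn.set(Thread.currentThread().getName());
            done.countDown();
        }), "A");
        a.start();

        try {
            done.await();
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ranOn.get();
    }

    public static void main(String[] args) {
        System.out.println("ran on thread " + runDemo());
    }
}
```

The task is posted from thread A but reports that it ran on thread B, because B is the thread that drains the queue the handler was bound to.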

Next, let's look at what MessageQueue.enqueueMessage does.

/frameworks/base/core/java/android/os/MessageQueue.java

536    boolean enqueueMessage(Message msg, long when) {
537        if (msg.target == null) {
538            throw new IllegalArgumentException("Message must have a target.");
539        }
540        if (msg.isInUse()) {
541            throw new IllegalStateException(msg + " This message is already in use.");
542        }
543
544        synchronized (this) {
545            if (mQuitting) {
546                IllegalStateException e = new IllegalStateException(
547                        msg.target + " sending message to a Handler on a dead thread");
548                Log.w(TAG, e.getMessage(), e);
549                msg.recycle();
550                return false;
551            }
552
553            msg.markInUse();
554            msg.when = when;
555            Message p = mMessages;
556            boolean needWake;
557            if (p == null || when == 0 || when < p.when) {
558                // New head, wake up the event queue if blocked.
559                msg.next = p;
560                mMessages = msg;
561                needWake = mBlocked;
562            } else {
563                // Inserted within the middle of the queue. Usually we don't have to wake
564                // up the event queue unless there is a barrier at the head of the queue
565                // and the message is the earliest asynchronous message in the queue.
566                needWake = mBlocked && p.target == null && msg.isAsynchronous();
567                Message prev;
568                for (;;) {
569                    prev = p;
570                    p = p.next;
571                    if (p == null || when < p.when) {
572                        break;
573                    }
574                    if (needWake && p.isAsynchronous()) {
575                        needWake = false;
576                    }
577                }
578                msg.next = p; // invariant: p == prev.next
579                prev.next = msg;
580            }
581
582            // We can assume mPtr != 0 because mQuitting is false.
583            if (needWake) {
584                nativeWake(mPtr);
585            }
586        }
587        return true;
588    }

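
Setting aside the sanity checks, enqueueMessage has one core job: adding the new message to the MessageQueue's linked list of messages. Messages in a MessageQueue are managed as a linked list ordered by delivery time. Holding the head is enough to traverse every element, so MessageQueue uses a single field (mMessages) to record the head of the list.

2.2.2 Where Should the Message Be Inserted in the List?

Lines 557 and 562 correspond to two ways of handling a new message: the first inserts it at the head of the list, the second inserts it in the middle (or at the tail).

First, the head case.

p == null means the message list is empty, that is, all messages have been delivered, so the new message naturally becomes the head.
when == 0 means the message was enqueued with an uptime of 0, for example through sendMessageAtTime with 0 (Handler.sendMessageAtFrontOfQueue also enqueues with 0). Such messages have the highest priority: whatever the state of the list, the new message is inserted at the head.
when < p.when means the new message is due earlier than the current head. Since earlier messages go first, the new message is inserted at the head.

In every case other than these three, the message is inserted in the middle (or at the tail). The for loop on line 568 simply walks the list and inserts the message according to its delivery time.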
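
The insertion rule can be tried out in isolation. The following is a minimal sketch that copies only the ordering logic of enqueueMessage into a plain Java class; SortedInsertDemo and Msg are hypothetical names, and locking, wake-ups and barriers are left out.

```java
import java.util.ArrayList;
import java.util.List;

final class SortedInsertDemo {
    /** Hypothetical, stripped-down message: a label, a due time, and a link. */
    static final class Msg {
        final String name;
        final long when;
        Msg next;
        Msg(String name, long when) { this.name = name; this.when = when; }
    }

    private Msg head; // plays the role of mMessages

    void enqueue(Msg msg) {
        Msg p = head;
        if (p == null || msg.when == 0 || msg.when < p.when) {
            // Empty list, when == 0, or earlier than the current head: new head.
            msg.next = p;
            head = msg;
            return;
        }
        // Otherwise walk until the first message that is due strictly later.
        Msg prev;
        for (;;) {
            prev = p;
            p = p.next;
            if (p == null || msg.when < p.when) {
                break;
            }
        }
        msg.next = p;
        prev.next = msg;
    }

    List<String> names() {
        List<String> out = new ArrayList<>();
        for (Msg m = head; m != null; m = m.next) {
            out.add(m.name);
        }
        return out;
    }

    static List<String> demo() {
        SortedInsertDemo q = new SortedInsertDemo();
        q.enqueue(new Msg("a", 30));
        q.enqueue(new Msg("b", 10));
        q.enqueue(new Msg("c", 20));
        q.enqueue(new Msg("d", 10)); // same time as b: goes after b
        q.enqueue(new Msg("e", 0));  // when == 0: always the new head
        return q.names();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The resulting order is e, b, d, c, a. Because both comparisons use a strict "<", a message with the same due time as an existing one is placed after it, so messages with equal times keep their sending order.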

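2.2.3 Should the Thread Be Woken After the Message Is Enqueued?

Besides inserting the new message at the right position, enqueueMessage must also decide whether to wake the thread that owns the MessageQueue. The mBlocked field of MessageQueue records whether that thread is currently blocked (suspended); it is assigned during message processing.

When the new message is inserted at the head, needWake = mBlocked:

If the MessageQueue is blocked at that moment, the blocked thread must be woken so that it can decide what to do based on the new head (it may process it immediately or wait until it is due).
If the MessageQueue is not blocked, nothing more is needed after inserting at the head. The message just waits there, and the thread will reach it once it finishes the messages it is working on.

When the new message is inserted in the middle (or at the tail), the assignment of needWake becomes more complicated, mainly because of asynchronous messages and sync barriers.

A sync barrier acts like a guard. When the head of the message list is a sync barrier, no synchronous message behind it is let through, even if its delivery time has arrived. Asynchronous messages in the list are unaffected and are processed at their delivery times as usual.

A sync barrier is a special Message whose target is null, which indicates that it does not need to be handled, whereas the target of an ordinary message is the Handler that will eventually process it. A sync barrier is posted with MessageQueue.postSyncBarrier, but this method is hidden (@hide), and starting with Android P, reflective access to non-SDK interfaces is restricted. Some workarounds circulate online, but Google's intent is presumably that developers should no longer use sync barriers. The matching method for removing a barrier is removeSyncBarrier.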
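
The guarding effect of a barrier can be sketched in a few lines. The model below follows the skip loop in MessageQueue.next (skip ahead to the first asynchronous message when the head has no target) but ignores delivery times entirely; BarrierDemo, Msg and pickNext are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class BarrierDemo {
    /** Hypothetical, simplified message: a barrier is modelled as hasTarget == false. */
    static final class Msg {
        final String name;
        final boolean hasTarget;
        final boolean async;
        Msg(String name, boolean hasTarget, boolean async) {
            this.name = name;
            this.hasTarget = hasTarget;
            this.async = async;
        }
    }

    /** Returns the name of the message that would be considered next, or null if none is eligible. */
    static String pickNext(List<Msg> queue) {
        if (queue.isEmpty()) {
            return null;
        }
        int i = 0;
        if (!queue.get(0).hasTarget) {
            // Head is a barrier: skip everything up to the first asynchronous message.
            do {
                i++;
            } while (i < queue.size() && !queue.get(i).async);
        }
        return i < queue.size() ? queue.get(i).name : null;
    }

    public static void main(String[] args) {
        Msg barrier = new Msg("barrier", false, false);
        Msg sync = new Msg("sync", true, false);
        Msg async = new Msg("async", true, true);
        System.out.println(pickNext(Arrays.asList(sync, async)));          // no barrier: head wins
        System.out.println(pickNext(Arrays.asList(barrier, sync, async))); // barrier: sync is skipped
        System.out.println(pickNext(Arrays.asList(barrier, sync)));        // barrier, no async: nothing eligible
    }
}
```

The three calls print sync, async and null respectively: with a barrier at the head, a synchronous message is never selected, no matter how long it has been waiting.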

/frameworks/base/core/java/android/os/MessageQueue.java

    public int postSyncBarrier() {
        return postSyncBarrier(SystemClock.uptimeMillis());
    }

    private int postSyncBarrier(long when) {
        // Enqueue a new sync barrier token.
        // We don't need to wake the queue because the purpose of a barrier is to stall it.
        synchronized (this) {
            final int token = mNextBarrierToken++;
            final Message msg = Message.obtain();
            msg.markInUse();
            msg.when = when;
            msg.arg1 = token;

            Message prev = null;
            Message p = mMessages;
            if (when != 0) {
                while (p != null && p.when <= when) {
                    prev = p;
                    p = p.next;
                }
            }
            if (prev != null) { // invariant: p == prev.next
                msg.next = p;
                prev.next = msg;
            } else {
                msg.next = p;
                mMessages = msg;
            }
            return token;
        }
    }

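The only difference between synchronous and asynchronous messages is whether FLAG_ASYNCHRONOUS is set in the Message's flags. This flag is changed only in setAsynchronous. If the Handler's mAsynchronous is true, messages sent through that Handler are asynchronous by default; otherwise they are synchronous by default. In addition, setAsynchronous can be called on an individual message to make that one message asynchronous or not.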

/frameworks/base/core/java/android/os/Message.java

    public boolean isAsynchronous() {
        return (flags & FLAG_ASYNCHRONOUS) != 0;
    }

    public void setAsynchronous(boolean async) {
        if (async) {
            flags |= FLAG_ASYNCHRONOUS;
        } else {
            flags &= ~FLAG_ASYNCHRONOUS;
        }
    }

Back to needWake when the new message is inserted in the middle (or at the tail). Before the traversal starts, needWake is initialized as follows:

/frameworks/base/core/java/android/os/MessageQueue.java

566                needWake = mBlocked && p.target == null && msg.isAsynchronous();

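
needWake is true only when the thread that owns the MessageQueue is blocked, the head of the list is a sync barrier, and the new message is asynchronous. All three conditions are required.

mBlocked is false: the thread is not blocked, so there is nothing to wake.
p.target != null: the head is an ordinary message. Even if mBlocked is true, the block has a timeout; the thread wakes by itself when the timeout expires, so no external wake-up is needed.
msg.isAsynchronous() is false: the new message is synchronous. If the head is a sync barrier, the new message cannot be let through anyway, so waking the thread would be pointless.

In addition, if the traversal finds an asynchronous message ahead of the new message, needWake is reset to false. In that case the existing asynchronous message has already put the thread into a timed block: its delivery time has not arrived yet, which is why mBlocked is true. Because this block has a timeout, no external wake-up is needed.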
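
The wake-up rules above reduce to a small boolean function. The sketch below restates them in plain Java; NeedWakeDemo and its parameter names are hypothetical, and the real code derives these inputs from mBlocked, the head message and the traversal rather than taking them as arguments.

```java
final class NeedWakeDemo {
    /** Head insertion: wake exactly when the owning thread is blocked. */
    static boolean needWakeForHeadInsert(boolean blocked) {
        return blocked;
    }

    /**
     * Middle or tail insertion: wake only if the thread is blocked, the head is a
     * sync barrier, the new message is asynchronous, and no asynchronous message
     * sits ahead of the new one in the list.
     */
    static boolean needWakeForMiddleInsert(boolean blocked, boolean headIsBarrier,
            boolean newMsgAsync, boolean asyncAheadOfNewMsg) {
        boolean needWake = blocked && headIsBarrier && newMsgAsync;
        if (needWake && asyncAheadOfNewMsg) {
            // An earlier async message already gave the thread a timed wait.
            needWake = false;
        }
        return needWake;
    }

    public static void main(String[] args) {
        System.out.println(needWakeForMiddleInsert(true, true, true, false));  // all conditions met
        System.out.println(needWakeForMiddleInsert(true, true, true, true));   // earlier async message exists
        System.out.println(needWakeForMiddleInsert(true, false, true, false)); // head is an ordinary message
        System.out.println(needWakeForMiddleInsert(true, true, false, false)); // new message is synchronous
    }
}
```

Only the first call prints true; each of the other three fails one of the conditions listed above.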

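A blocked thread is like a sleeping person. There are two ways out of the blocked state: a timeout or an external wake-up. Likewise, a sleeping person can be woken in two ways: by an alarm clock they set themselves, or by someone else shaking them awake (waking up naturally is not a separate case, because it is essentially an alarm as well, only the alarm is the body clock).

2.2.4 How Is the Thread Woken?

The previous section covered whether the thread should be woken. If the answer is yes, how do we actually wake it?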

/frameworks/base/core/java/android/os/MessageQueue.java

584                nativeWake(mPtr);

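
The native method nativeWake wakes the thread that owns the MessageQueue. Its JNI counterpart is android_os_MessageQueue_nativeWake. The mPtr argument is the pointer to the native object; it is stored in a Java field and connects the Java layer with the native layer.

mPtr is cast to a pointer to the NativeMessageQueue object (a C++ object), and then the wake method of that object is called.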

/frameworks/base/core/jni/android_os_MessageQueue.cpp

static void android_os_MessageQueue_nativeWake(JNIEnv* env, jclass clazz, jlong ptr) {
    NativeMessageQueue* nativeMessageQueue = reinterpret_cast<NativeMessageQueue*>(ptr);
    nativeMessageQueue->wake();
}

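Tracing further, this calls wake on the mLooper member of NativeMessageQueue. In the end it does only one thing: it writes a "1" to mWakeEventFd of the native Looper. Recalling the description of epoll in 2.1, once mWakeEventFd has readable data, the epoll instance detects the event and the thread goes from suspended back to running.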

/frameworks/base/core/jni/android_os_MessageQueue.cpp

void NativeMessageQueue::wake() {
    mLooper->wake();
}

/system/core/libutils/Looper.cpp

void Looper::wake() {
#if DEBUG_POLL_AND_WAKE
    ALOGD("%p ~ wake", this);
#endif

    uint64_t inc = 1;
    ssize_t nWrite = TEMP_FAILURE_RETRY(write(mWakeEventFd, &inc, sizeof(uint64_t)));
    if (nWrite != sizeof(uint64_t)) {
        if (errno != EAGAIN) {
            LOG_ALWAYS_FATAL("Could not write wake signal to fd %d: %s",
                    mWakeEventFd, strerror(errno));
        }
    }
}

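2.3 Processing Messages

A thread that wants to run the message mechanism must not only call Looper.prepare to create its own Looper and MessageQueue, but also call Looper.loop to actually poll and process messages.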

/frameworks/base/core/java/android/os/Looper.java

127    public static Looper getMainLooper() {
128        synchronized (Looper.class) {
129            return sMainLooper;
130        }
131    }
132
133    /**
134     * Run the message queue in this thread. Be sure to call
135     * {@link #quit()} to end the loop.
136     */
137    public static void loop() {
138        final Looper me = myLooper();
139        if (me == null) {
140            throw new RuntimeException("No Looper; Looper.prepare() wasn't called on this thread.");
141        }
142        final MessageQueue queue = me.mQueue;
143
144        // Make sure the identity of this thread is that of the local process,
145        // and keep track of what that identity token actually is.
146        Binder.clearCallingIdentity();
147        final long ident = Binder.clearCallingIdentity();
148
149        // Allow overriding a threshold with a system prop. e.g.
150        // adb shell 'setprop log.looper.1000.main.slow 1 && stop && start'
151        final int thresholdOverride =
152                SystemProperties.getInt("log.looper."
153                        + Process.myUid() + "."
154                        + Thread.currentThread().getName()
155                        + ".slow", 0);
156
157        boolean slowDeliveryDetected = false;
158
159        for (;;) {
160            Message msg = queue.next(); // might block
161            if (msg == null) {
162                // No message indicates that the message queue is quitting.
163                return;
164            }
165
166            // This must be in a local variable, in case a UI event sets the logger
167            final Printer logging = me.mLogging;
168            if (logging != null) {
169                logging.println(">>>>> Dispatching to " + msg.target + " " +
170                        msg.callback + ": " + msg.what);
171            }
172
173            final long traceTag = me.mTraceTag;
174            long slowDispatchThresholdMs = me.mSlowDispatchThresholdMs;
175            long slowDeliveryThresholdMs = me.mSlowDeliveryThresholdMs;
176            if (thresholdOverride > 0) {
177                slowDispatchThresholdMs = thresholdOverride;
178                slowDeliveryThresholdMs = thresholdOverride;
179            }
180            final boolean logSlowDelivery = (slowDeliveryThresholdMs > 0) && (msg.when > 0);
181            final boolean logSlowDispatch = (slowDispatchThresholdMs > 0);
182
183            final boolean needStartTime = logSlowDelivery || logSlowDispatch;
184            final boolean needEndTime = logSlowDispatch;
185
186            if (traceTag != 0 && Trace.isTagEnabled(traceTag)) {
187                Trace.traceBegin(traceTag, msg.target.getTraceName(msg));
188            }
189
190            final long dispatchStart = needStartTime ? SystemClock.uptimeMillis() : 0;
191            final long dispatchEnd;
192            try {
193                msg.target.dispatchMessage(msg);
194                dispatchEnd = needEndTime ? SystemClock.uptimeMillis() : 0;
195            } finally {
196                if (traceTag != 0) {
197                    Trace.traceEnd(traceTag);
198                }
199            }
200            if (logSlowDelivery) {
201                if (slowDeliveryDetected) {
202                    if ((dispatchStart - msg.when) <= 10) {
203                        Slog.w(TAG, "Drained");
204                        slowDeliveryDetected = false;
205                    }
206                } else {
207                    if (showSlowLog(slowDeliveryThresholdMs, msg.when, dispatchStart, "delivery",
208                            msg)) {
209                        // Once we write a slow delivery log, suppress until the queue drains.
210                        slowDeliveryDetected = true;
211                    }
212                }
213            }
214            if (logSlowDispatch) {
215                showSlowLog(slowDispatchThresholdMs, dispatchStart, dispatchEnd, "dispatch", msg);
216            }
217
218            if (logging != null) {
219                logging.println("<<<<< Finished to " + msg.target + " " + msg.callback);
220            }
221
222            // Make sure that during the course of dispatching the
223            // identity of the thread wasn't corrupted.
224            final long newIdent = Binder.clearCallingIdentity();
225            if (ident != newIdent) {
226                Log.wtf(TAG, "Thread identity changed from 0x"
227                        + Long.toHexString(ident) + " to 0x"
228                        + Long.toHexString(newIdent) + " while dispatching to "
229                        + msg.target.getClass().getName() + " "
230                        + msg.callback + " what=" + msg.what);
231            }
232
233            msg.recycleUnchecked();
234        }
235    }

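
The for loop that starts on line 159 never exits under normal conditions unless quit is called on the Looper or the MessageQueue. Each iteration does three things:

1. Take the processable head message from the message list.
2. Run dispatchMessage on the Handler that the message belongs to, and record the delivery time and dispatch time of the message to monitor whether the queue is running normally.
3. Recycle the message.

Of these three steps, 1 and 2 need detailed analysis. Step 1 takes more space to explain, so we start with step 2.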

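2.3.1 What Are Delivery Time and Dispatch Time?

Delivery Time:

A pending message usually has a scheduled delivery time, namely the when field of the message. When the message is taken from the list, another timestamp is recorded, called dispatchStart. Normally dispatchStart and msg.when are the same, meaning the message was taken out on schedule. Otherwise, if earlier messages take too long to process, later messages are delayed, because the message list is processed serially. The situation is much like waiting in line.

Delivery time = dispatchStart - msg.when, the gap between the moment the message is taken out and its scheduled time. A small value means the message was taken out roughly on schedule. A large value means the queue is congested: there may be too many messages ahead, or one of them may have taken too long to process. Either way, the current message was not taken out on schedule but lagged behind.

Dispatch Time:

The processing time of the message, that is, the running time of dispatchMessage on the Handler that the message belongs to. Every message has its own handling logic, which may include time-consuming operations. Recording the dispatch time and issuing a warning when it exceeds a threshold helps developers understand the performance of the program and the load it is under at run time.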
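
Both quantities are plain subtractions, as the worked example below shows. TimingDemo and its method names are hypothetical, the timestamps are made-up values in milliseconds, and the threshold check is only an assumed stand-in for the comparison done by showSlowLog.

```java
final class TimingDemo {
    /** How late the message was taken out relative to its scheduled time. */
    static long deliveryTime(long when, long dispatchStart) {
        return dispatchStart - when;
    }

    /** How long dispatchMessage ran for this message. */
    static long dispatchTime(long dispatchStart, long dispatchEnd) {
        return dispatchEnd - dispatchStart;
    }

    /** Assumed threshold rule: report as slow once the measured time reaches the threshold. */
    static boolean isSlow(long thresholdMs, long measuredMs) {
        return thresholdMs > 0 && measuredMs >= thresholdMs;
    }

    public static void main(String[] args) {
        long when = 1000;          // scheduled at uptime 1000 ms
        long dispatchStart = 1250; // actually taken out at 1250 ms
        long dispatchEnd = 1290;   // handler returned at 1290 ms

        long delivery = deliveryTime(when, dispatchStart);
        long dispatch = dispatchTime(dispatchStart, dispatchEnd);
        System.out.println("delivery=" + delivery + "ms slow=" + isSlow(100, delivery));
        System.out.println("dispatch=" + dispatch + "ms slow=" + isSlow(100, dispatch));
    }
}
```

With these numbers the message was delivered 250 ms late but handled in 40 ms, so with a 100 ms threshold only the delivery would be flagged. That points at congestion caused by earlier messages, not at this message's own handler.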

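2.3.2 Which Method Finally Handles the Message?

Message processing calls Handler.dispatchMessage. Inside this method, a message can be handled in one of three ways. The way is not chosen arbitrarily; the three follow a priority order.

1. If the callback field of the message itself is non-null, the message is handled as that callback specifies.
2. If condition 1 is not met and the Handler's mCallback field is non-null, the message is handled as mCallback specifies.
3. If neither condition 1 nor condition 2 is met, the message is handled by the Handler's handleMessage method. As the code below shows, handleMessage also runs when mCallback.handleMessage returns false.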

/frameworks/base/core/java/android/os/Handler.java

    public void dispatchMessage(Message msg) {
        if (msg.callback != null) {
            handleCallback(msg);
        } else {
            if (mCallback != null) {
                if (mCallback.handleMessage(msg)) {
                    return;
                }
            }
            handleMessage(msg);
        }
    }

Here is one example for each of the three ways:

If the callback field of the message itself is non-null, the message is handled as that callback specifies.

/frameworks/base/core/java/android/speech/tts/TextToSpeechService.java

            Runnable runnable = new Runnable() {
                @Override
                public void run() {
                    if (setCurrentSpeechItem(speechItem)) {
                        speechItem.play();
                        removeCurrentSpeechItem();
                    } else {
                        // The item is already flushed. Stopping.
                        speechItem.stop();
                    }
                }
            };
            Message msg = Message.obtain(this, runnable);

If condition 1 is not met and the Handler's mCallback field is non-null, the message is handled as mCallback specifies.

/frameworks/base/services/core/java/com/android/server/GraphicsStatsService.java

        mWriteOutHandler = new Handler(bgthread.getLooper(), new Handler.Callback() {
            @Override
            public boolean handleMessage(Message msg) {
                switch (msg.what) {
                    case SAVE_BUFFER:
                        saveBuffer((HistoricalBuffer) msg.obj);
                        break;
                    case DELETE_OLD:
                        deleteOldBuffers();
                        break;
                }
                return true;
            }
        });

If neither condition 1 nor condition 2 is met, the message is handled by the Handler's handleMessage method.

/frameworks/base/services/core/java/com/android/server/pm/ProcessLoggingHandler.java

public final class ProcessLoggingHandler extends Handler {
......
......
    @Override
    public void handleMessage(Message msg) {
        switch (msg.what) {
            case LOG_APP_PROCESS_START_MSG: {
                Bundle bundle = msg.getData();
                String processName = bundle.getString("processName");
                int uid = bundle.getInt("uid");
                String seinfo = bundle.getString("seinfo");
                String apkFile = bundle.getString("apkFile");
                int pid = bundle.getInt("pid");
                long startTimestamp = bundle.getLong("startTimestamp");
                String apkHash = computeStringHashOfApk(apkFile);
                SecurityLog.writeEvent(SecurityLog.TAG_APP_PROCESS_START, processName,
                        startTimestamp, uid, pid, seinfo, apkHash);
                break;
            }
            case INVALIDATE_BASE_APK_HASH_MSG: {
                Bundle bundle = msg.getData();
                mProcessLoggingBaseApkHashes.remove(bundle.getString("apkFile"));
                break;
            }
        }
    }

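Developers define subclasses of Handler (such as ProcessLoggingHandler above). If a message is ultimately to be handled by handleMessage, the subclass must override that method. Otherwise the message is not handled at all, because handleMessage in the base class (Handler) is an empty method.

This tiered design for handling messages gives developers more flexibility.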
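
The three tiers can be exercised together in a small stand-alone model. The sketch below copies the branching of dispatchMessage into plain Java; DispatchDemo, MiniHandler and Msg are hypothetical stand-ins for the Android classes, and the callback here arbitrarily claims only messages whose what is 2.

```java
import java.util.ArrayList;
import java.util.List;

final class DispatchDemo {
    interface Callback {
        boolean handleMessage(Msg msg);
    }

    /** Hypothetical message with an optional Runnable callback. */
    static final class Msg {
        final int what;
        final Runnable callback;
        Msg(int what, Runnable callback) { this.what = what; this.callback = callback; }
    }

    static class MiniHandler {
        private final Callback mCallback;
        MiniHandler(Callback callback) { mCallback = callback; }

        /** Empty by default, like the base-class method described above. */
        void handleMessage(Msg msg) { }

        void dispatchMessage(Msg msg) {
            if (msg.callback != null) {
                msg.callback.run();             // tier 1: the message's own callback
            } else {
                if (mCallback != null && mCallback.handleMessage(msg)) {
                    return;                     // tier 2: the callback consumed the message
                }
                handleMessage(msg);             // tier 3: the overridable method
            }
        }
    }

    static List<String> demo() {
        List<String> trace = new ArrayList<>();
        MiniHandler handler = new MiniHandler(m -> {
            trace.add("Callback:" + m.what);
            return m.what == 2;                 // consume only what == 2
        }) {
            @Override
            void handleMessage(Msg msg) {
                trace.add("handleMessage:" + msg.what);
            }
        };
        handler.dispatchMessage(new Msg(1, () -> trace.add("msg.callback:1")));
        handler.dispatchMessage(new Msg(2, null));
        handler.dispatchMessage(new Msg(3, null));
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The trace is msg.callback:1, Callback:2, Callback:3, handleMessage:3. Message 3 appears twice because the callback returned false for it, so the message fell through to handleMessage.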

2.3.3 How Is the Next Message Retrieved?

Next we focus on how the processable head message is taken from the message list. Let's step into MessageQueue.next.

/frameworks/base/core/java/android/os/Looper.java

160            Message msg = queue.next(); // might block


/frameworks/base/core/java/android/os/MessageQueue.java

310    Message next() {
311        // Return here if the message loop has already quit and been disposed.
312        // This can happen if the application tries to restart a looper after quit
313        // which is not supported.
314        final long ptr = mPtr;
315        if (ptr == 0) {
316            return null;
317        }
318
319        int pendingIdleHandlerCount = -1; // -1 only during first iteration
320        int nextPollTimeoutMillis = 0;
321        for (;;) {
322            if (nextPollTimeoutMillis != 0) {
323                Binder.flushPendingCommands();
324            }
325
326            nativePollOnce(ptr, nextPollTimeoutMillis);
327
328            synchronized (this) {
329                // Try to retrieve the next message. Return if found.
330                final long now = SystemClock.uptimeMillis();
331                Message prevMsg = null;
332                Message msg = mMessages;
333                if (msg != null && msg.target == null) {
334                    // Stalled by a barrier. Find the next asynchronous message in the queue.
335                    do {
336                        prevMsg = msg;
337                        msg = msg.next;
338                    } while (msg != null && !msg.isAsynchronous());
339                }
340                if (msg != null) {
341                    if (now < msg.when) {
342                        // Next message is not ready. Set a timeout to wake up when it is ready.
343                        nextPollTimeoutMillis = (int) Math.min(msg.when - now, Integer.MAX_VALUE);
344                    } else {
345                        // Got a message.
346                        mBlocked = false;
347                        if (prevMsg != null) {
348                            prevMsg.next = msg.next;
349                        } else {
350                            mMessages = msg.next;
351                        }
352                        msg.next = null;
353                        if (DEBUG) Log.v(TAG, "Returning message: " + msg);
354                        msg.markInUse();
355                        return msg;
356                    }
357                } else {
358                    // No more messages.
359                    nextPollTimeoutMillis = -1;
360                }
361
362                // Process the quit message now that all pending messages have been handled.
363                if (mQuitting) {
364                    dispose();
365                    return null;
366                }
367
368                // If first time idle, then get the number of idlers to run.
369                // Idle handles only run if the queue is empty or if the first message
370                // in the queue (possibly a barrier) is due to be handled in the future.
371                if (pendingIdleHandlerCount < 0
372                        && (mMessages == null || now < mMessages.when)) {
373                    pendingIdleHandlerCount = mIdleHandlers.size();
374                }
375                if (pendingIdleHandlerCount <= 0) {
376                    // No idle handlers to run. Loop and wait some more.
377                    mBlocked = true;
378                    continue;
379                }
380
381                if (mPendingIdleHandlers == null) {
382                    mPendingIdleHandlers = new IdleHandler[Math.max(pendingIdleHandlerCount, 4)];
383                }
384                mPendingIdleHandlers = mIdleHandlers.toArray(mPendingIdleHandlers);
385            }
386
387            // Run the idle handlers.
388            // We only ever reach this code block during the first iteration.
389            for (int i = 0; i < pendingIdleHandlerCount; i++) {
390                final IdleHandler idler = mPendingIdleHandlers[i];
391                mPendingIdleHandlers[i] = null; // release the reference to the handler
392
393                boolean keep = false;
394                try {
395                    keep = idler.queueIdle();
396                } catch (Throwable t) {
397                    Log.wtf(TAG, "IdleHandler threw exception", t);
398                }
399
400                if (!keep) {
401                    synchronized (this) {
402                        mIdleHandlers.remove(idler);
403                    }
404                }
405            }
406
407            // Reset the idle handler count to 0 so we do not run them again.
408            pendingIdleHandlerCount = 0;
409
410            // While calling an idle handler, a new message could have been delivered
411            // so go back and look again for a pending message without waiting.
412            nextPollTimeoutMillis = 0;
413        }
414    }


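We start with the nativePollOnce call on line 326. It blocks the thread until the next message is due, or suspends it indefinitely when no message is pending. The corresponding JNI function is android_os_MessageQueue_nativePollOnce, which in turn calls the pollOnce function of NativeMessageQueue.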

/frameworks/base/core/jni/android_os_MessageQueue.cpp

188 static void android_os_MessageQueue_nativePollOnce(JNIEnv* env, jobject obj,
189         jlong ptr, jint timeoutMillis) {
190    NativeMessageQueue* nativeMessageQueue = reinterpret_cast<NativeMessageQueue*>(ptr);
191    nativeMessageQueue->pollOnce(env, obj, timeoutMillis);
192 }


The pollOnce function of NativeMessageQueue then calls the pollOnce function of Looper, passing along the timeoutMillis parameter.

/frameworks/base/core/jni/android_os_MessageQueue.cpp

107 void NativeMessageQueue::pollOnce(JNIEnv* env, jobject pollObj, int timeoutMillis) {
108    mPollEnv = env;
109    mPollObj = pollObj;
110    mLooper->pollOnce(timeoutMillis);
111    mPollObj = NULL;
112    mPollEnv = NULL;
113
114    if (mExceptionObj) {
115        env->Throw(mExceptionObj);
116        env->DeleteLocalRef(mExceptionObj);
117        mExceptionObj = NULL;
118    }
119 }


Following the calls down layer by layer, we eventually reach Looper's pollInner function, which enters kernel mode through the epoll_wait system call.

/system/core/libutils/Looper.cpp

242    int eventCount = epoll_wait(mEpollFd, eventItems, EPOLL_MAX_EVENTS, timeoutMillis);


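The timeoutMillis argument passed to epoll_wait directly determines how epoll behaves. There are three cases:

timeoutMillis = 0: no waiting. epoll_wait checks whether any event is pending on epfd and returns immediately either way, so execution continues with the code that follows.
timeoutMillis > 0: epoll_wait has a timeout. For level-triggered fd events (which is the case here), epoll_wait first checks whether the event is already pending. If it is, the call returns immediately; otherwise the thread is suspended in a blocked state and resumes running once the timeout expires. If a new message is inserted at the head of the list during the wait, the sending thread wakes this thread so that timeoutMillis can be recomputed.
timeoutMillis = -1: epoll_wait first checks whether the monitored event is already pending. If it is, the call returns immediately; otherwise it waits indefinitely until a new message arrives and another thread wakes this one.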
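
The three cases can be mimicked in plain Java. The sketch below is only an analogy, not the platform implementation: it assumes a BlockingQueue stands in for the epoll file descriptor, and the class and method names (PollTimeoutSketch, pollOnce) are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical analogy: maps the three timeoutMillis cases onto BlockingQueue calls. */
final class PollTimeoutSketch {
    /** Returns the next event, or null if none arrived within the allowed wait. */
    static String pollOnce(BlockingQueue<String> events, int timeoutMillis) {
        try {
            if (timeoutMillis == 0) {
                return events.poll();   // check once, never block
            } else if (timeoutMillis > 0) {
                // block for at most timeoutMillis, return early if an event arrives
                return events.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            } else {
                return events.take();   // -1: block until another thread adds an event
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        BlockingQueue<String> events = new LinkedBlockingQueue<>();
        System.out.println(pollOnce(events, 0));   // nothing pending, returns at once
        System.out.println(pollOnce(events, 50));  // gives up after the timeout
        // A second thread plays the role of the sender that wakes the waiting thread.
        Thread sender = new Thread(() -> events.add("wake"));
        sender.start();
        System.out.println(pollOnce(events, -1));  // returns once the sender has added an event
        sender.join();
    }
}
```

As with level-triggered epoll, an event that is already queued is returned immediately in all three branches; the timeout only matters when nothing is pending.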

Line 320 shows that nextPollTimeoutMillis is initialized to 0, so the first loop iteration does not wait in epoll_wait and goes straight to checking the state of the message list.

/frameworks/base/core/java/android/os/MessageQueue.java

320        int nextPollTimeoutMillis = 0;
321        for (;;) {
322            if (nextPollTimeoutMillis != 0) {
323                Binder.flushPendingCommands();
324            }
325
326            nativePollOnce(ptr, nextPollTimeoutMillis);


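Lines 330-339 pick the first message in the list that can be processed. As mentioned earlier, MessageQueue uses a single field (mMessages) to record the head of the message list, so line 332 obtains the head message. If the head is a sync barrier (a message whose target is null), the loop walks the list to find the first asynchronous message.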

/frameworks/base/core/java/android/os/MessageQueue.java

330                final long now = SystemClock.uptimeMillis();
331                Message prevMsg = null;
332                Message msg = mMessages;
333                if (msg != null && msg.target == null) {
334                    // Stalled by a barrier. Find the next asynchronous message in the queue.
335                    do {
336                        prevMsg = msg;
337                        msg = msg.next;
338                    } while (msg != null && !msg.isAsynchronous());
339                }


When the candidate message is null, nothing in the list can currently be processed, so nextPollTimeoutMillis is set to -1 and the next iteration of next() suspends the thread in epoll_wait.

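Otherwise further handling is needed, and there are two cases:

Current time < the message's scheduled time: the message must not be processed yet. nextPollTimeoutMillis is set to the difference between the scheduled time and the current time, so that the loop polls this message again after the timeout and handles it then.
Current time ≥ the message's scheduled time: the message is due and should be processed. mBlocked is set to false, indicating that the thread is runnable and about to handle the message. The list is then relinked to remove the message, and the message is returned to Looper's loop method, where it is actually processed.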

/frameworks/base/core/java/android/os/MessageQueue.java

340                if (msg != null) {
341                    if (now < msg.when) {
342                        // Next message is not ready. Set a timeout to wake up when it is ready.
343                        nextPollTimeoutMillis = (int) Math.min(msg.when - now, Integer.MAX_VALUE);
344                    } else {
345                        // Got a message.
346                        mBlocked = false;
347                        if (prevMsg != null) {
348                            prevMsg.next = msg.next;
349                        } else {
350                            mMessages = msg.next;
351                        }
352                        msg.next = null;
353                        if (DEBUG) Log.v(TAG, "Returning message: " + msg);
354                        msg.markInUse();
355                        return msg;
356                    }
357                } else {
358                    // No more messages.
359                    nextPollTimeoutMillis = -1;
360                }
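
The timeout decision in lines 340-360 reduces to a small pure function. The sketch below assumes a hypothetical helper named timeoutFor that takes the candidate's scheduled time (null when there is no candidate); returning 0 for a due message is a simplification of this sketch, since the real next() returns the message itself in that case.

```java
/** Hypothetical sketch of how the next poll timeout is chosen. */
final class NextTimeoutSketch {
    /**
     * @param when scheduled time of the candidate message, or null if there is none
     * @param now  current time on the same clock
     * @return -1 to wait indefinitely, a positive wait in milliseconds,
     *         or 0 when the message is already due
     */
    static int timeoutFor(Long when, long now) {
        if (when == null) {
            return -1; // no more messages: sleep until another thread wakes us
        }
        if (now < when) {
            // Not ready yet: wait for the remaining time, clamped to the int range.
            return (int) Math.min(when - now, Integer.MAX_VALUE);
        }
        return 0; // due now: the real code returns the message instead of polling again
    }

    public static void main(String[] args) {
        System.out.println(timeoutFor(null, 1000));           // -1
        System.out.println(timeoutFor(1500L, 1000));          // 500
        System.out.println(timeoutFor(900L, 1000));           // 0
        System.out.println(timeoutFor(5_000_000_000L, 1000)); // 2147483647
    }
}
```

The clamp matters because when and now are long values while epoll_wait takes an int timeout; a message scheduled far in the future simply causes several maximum-length waits in a row.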


2.3.4 What is IdleHandler for?

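The next method of MessageQueue also handles IdleHandlers. As the name suggests, an IdleHandler represents work that should run only when the thread is idle. If the head of the list is null or its delivery time has not yet arrived, the thread is idle and can take care of these chores (the work inside the IdleHandlers).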

Line 319 shows that pendingIdleHandlerCount is initialized to -1.

/frameworks/base/core/java/android/os/MessageQueue.java

319        int pendingIdleHandlerCount = -1; // -1 only during first iteration


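So the first time execution reaches line 371, pendingIdleHandlerCount is necessarily less than 0. Lines 373 to 384 copy the elements of mIdleHandlers (an ArrayList) into mPendingIdleHandlers (an array). mIdleHandlers is not iterated directly because running the idle handlers does not require holding the MessageQueue monitor lock. The lock is therefore released, which lets other threads add new elements to mIdleHandlers while the entries in mPendingIdleHandlers are being processed.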

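If no IdleHandler needs to run, or mIdleHandlers contains nothing to process, mBlocked is set to true (line 377) and the next loop iteration suspends the thread in epoll_wait. Note that if this call to next() can take out a valid message, execution never reaches line 371 or anything below it; the method returns directly at line 355.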

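Next, the snapshot in mPendingIdleHandlers is traversed and each element's queueIdle method is called. If queueIdle returns false, the IdleHandler is meant to run only once and is removed from the mIdleHandlers list after it has run.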

/frameworks/base/core/java/android/os/MessageQueue.java

371                if (pendingIdleHandlerCount < 0
372                        && (mMessages == null || now < mMessages.when)) {
373                    pendingIdleHandlerCount = mIdleHandlers.size();
374                }
375                if (pendingIdleHandlerCount <= 0) {
376                    // No idle handlers to run. Loop and wait some more.
377                    mBlocked = true;
378                    continue;
379                }
380
381                if (mPendingIdleHandlers == null) {
382                    mPendingIdleHandlers = new IdleHandler[Math.max(pendingIdleHandlerCount, 4)];
383                }
384                mPendingIdleHandlers = mIdleHandlers.toArray(mPendingIdleHandlers);
385            }
386
387            // Run the idle handlers.
388            // We only ever reach this code block during the first iteration.
389            for (int i = 0; i < pendingIdleHandlerCount; i++) {
390                final IdleHandler idler = mPendingIdleHandlers[i];
391                mPendingIdleHandlers[i] = null; // release the reference to the handler
392
393                boolean keep = false;
394                try {
395                    keep = idler.queueIdle();
396                } catch (Throwable t) {
397                    Log.wtf(TAG, "IdleHandler threw exception", t);
398                }
399
400                if (!keep) {
401                    synchronized (this) {
402                        mIdleHandlers.remove(idler);
403                    }
404                }
405            }
406
407            // Reset the idle handler count to 0 so we do not run them again.
408            pendingIdleHandlerCount = 0;
409
410            // While calling an idle handler, a new message could have been delivered
411            // so go back and look again for a pending message without waiting.
412            nextPollTimeoutMillis = 0;
413        }
414    }
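
The snapshot-then-run pattern can be reproduced in plain Java. This is a minimal sketch under assumptions: IdleSnapshotSketch and its nested IdleHandler interface are hypothetical stand-ins that only mirror the shape of MessageQueue.IdleHandler, and the sketch runs handlers on demand instead of from a message loop.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: copy the handlers under the lock, then run them without it. */
final class IdleSnapshotSketch {
    /** Same shape as MessageQueue.IdleHandler: return true to stay registered. */
    interface IdleHandler {
        boolean queueIdle();
    }

    private final List<IdleHandler> idleHandlers = new ArrayList<>();

    void addIdleHandler(IdleHandler handler) {
        synchronized (this) {
            idleHandlers.add(handler);
        }
    }

    int size() {
        synchronized (this) {
            return idleHandlers.size();
        }
    }

    /** Runs every currently registered handler once and returns how many were run. */
    int runIdleHandlers() {
        IdleHandler[] pending;
        synchronized (this) {
            pending = idleHandlers.toArray(new IdleHandler[0]); // snapshot under the lock
        }
        for (IdleHandler idler : pending) {
            boolean keep = false;
            try {
                keep = idler.queueIdle(); // executed without holding the lock
            } catch (RuntimeException e) {
                System.err.println("IdleHandler threw exception: " + e);
            }
            if (!keep) {
                synchronized (this) {
                    idleHandlers.remove(idler); // one-shot handlers are dropped
                }
            }
        }
        return pending.length;
    }

    public static void main(String[] args) {
        IdleSnapshotSketch queue = new IdleSnapshotSketch();
        queue.addIdleHandler(() -> { System.out.println("one-shot"); return false; });
        queue.addIdleHandler(() -> { System.out.println("repeating"); return true; });
        // A handler may register another handler while the snapshot is being iterated.
        queue.addIdleHandler(() -> { queue.addIdleHandler(() -> false); return false; });
        System.out.println(queue.runIdleHandlers()); // 3
        System.out.println(queue.size());            // 2
    }
}
```

Iterating the ArrayList directly would throw ConcurrentModificationException as soon as the third handler registered a new one; iterating the array snapshot avoids that and keeps the lock free while handler code runs.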


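3 Summary

This article has described the Android message mechanism in detail from three angles:

Message queue preparation
Message sending
Message processing

It examined where messages come from and where they go. Along this main thread, it also covered some lesser-known parts of the message mechanism: sync barriers, the epoll mechanism, delivery time, and when IdleHandlers are run. I hope this analysis is helpful.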

Original link: banshan.tech/Android消息機制…
