Android ANR日誌分析指南

當你的項目越作越複雜,或者你的用戶達到某個數量級的時候,你的代碼不當心出現細小的問題,你會收到各類各樣的bug,其中ANR的問題你必定不會陌生。本文將詳細講解ANR的類型、出現的緣由、ANR案例詳細分析、經典的案例。php

定義

ANR(Application Not Responding) 應用程序無響應。若是你應用程序在UI線程被阻塞太長時間,就會出現ANR,一般出現ANR,系統會彈出一個提示提示框,讓用戶知道,該程序正在被阻塞,是否繼續等待仍是關閉。java

ANR類型

出現ANR的通常有如下幾種類型:
1:KeyDispatchTimeout(常見)
input事件在5S內沒有處理完成發生了ANR。
logcat日誌關鍵字:Input event dispatching timed out
android

2:BroadcastTimeout
前臺Broadcast:onReceiver在10S內沒有處理完成發生ANR。
後臺Broadcast:onReceiver在60s內沒有處理完成發生ANR。
logcat日誌關鍵字:Timeout of broadcast BroadcastRecord
sql

3:ServiceTimeout
前臺Service:onCreateonStartonBind等生命週期在20s內沒有處理完成發生ANR。
後臺Service:onCreateonStartonBind等生命週期在200s內沒有處理完成發生ANR
logcat日誌關鍵字:Timeout executing service
數據庫

4:ContentProviderTimeout
ContentProvider 在10S內沒有處理完成發生ANR。 logcat日誌關鍵字:timeout publishing content providers
bash

ANR出現的緣由

1:主線程頻繁進行耗時的IO操做:如數據庫讀寫
2:多線程操做的死鎖,主線程被block;
3:主線程被Binder 對端block;
4:System Server中WatchDog出現ANR;
5:service binder的鏈接達到上線沒法和和System Server通訊
6:系統資源已耗盡(管道、CPU、IO)
多線程

ANR案例分析過程

咱們將一步一步分析ANR,這個過程更加理解如何找到問題、分析問題以及解決問題。app

1、 查看events_log

查看mobilelog文件夾下的events_log,從日誌中搜索關鍵字:am_anr,找到出現ANR的時間點、進程PID、ANR類型。ide

如日誌:函數

07-20 15:36:36.472  1000  1520  1597 I am_anr  : [0,1480,com.xxxx.moblie,952680005,Input dispatching timed out (AppWindowToken{da8f666 token=Token{5501f51 ActivityRecord{15c5c78 u0 com.xxxx.moblie/.ui.MainActivity t3862}}}, Waiting because no window has focus but there is a focused application that may eventually add a window when it finishes starting up.)]
複製代碼

從上面的log咱們能夠看出: 應用com.xxxx.moblie07-20 15:36:36.472時間,發生了一次KeyDispatchTimeout類型的ANR,它的進程號是1480. 把關鍵的信息整理一下:
ANR時間:07-20 15:36:36.472
進程pid:1480
進程名:com.xxxx.moblie
ANR類型:KeyDispatchTimeout

咱們已經知道了發生KeyDispatchTimeout的ANR是由於 input事件在5秒內沒有處理完成。那麼在這個時間07-20 15:36:36.472 的前5秒,也就是(15:36:30 ~15:36:31)時間段左右程序到底作了什麼事情?這個簡單,由於咱們已經知道pid了,再搜索一下pid = 1480的日誌.這些日誌表示該進程所運行的軌跡,關鍵的日誌以下:

07-20 15:36:29.749 10102  1480  1737 D moblie-Application: [Thread:17329] receive an intent from server, action=com.ttt.push.RECEIVE_MESSAGE
07-20 15:36:30.136 10102  1480  1737 D moblie-Application: receiving an empty message, drop
07-20 15:36:35.791 10102  1480  1766 I Adreno  : QUALCOMM build                   : 9c9b012, I92eb381bc9
07-20 15:36:35.791 10102  1480  1766 I Adreno  : Build Date                       : 12/31/17
07-20 15:36:35.791 10102  1480  1766 I Adreno  : OpenGL ES Shader Compiler Version: EV031.22.00.01
07-20 15:36:35.791 10102  1480  1766 I Adreno  : Local Branch                     : 
07-20 15:36:35.791 10102  1480  1766 I Adreno  : Remote Branch                    : refs/tags/AU_LINUX_ANDROID_LA.UM.6.4.R1.08.00.00.309.049
07-20 15:36:35.791 10102  1480  1766 I Adreno  : Remote Branch                    : NONE
07-20 15:36:35.791 10102  1480  1766 I Adreno  : Reconstruct Branch               : NOTHING
07-20 15:36:35.826 10102  1480  1766 I vndksupport: sphal namespace is not configured for this process. Loading /vendor/lib64/hw/gralloc.msm8998.so from the current namespace instead.
07-20 15:36:36.682 10102  1480  1480 W ViewRootImpl[MainActivity]: Cancelling event due to no window focus: KeyEvent { action=ACTION_UP, keyCode=KEYCODE_PERIOD, scanCode=0, metaState=0, flags=0x28, repeatCount=0, eventTime=16099429, downTime=16099429, deviceId=-1, source=0x101 }
複製代碼

從上面咱們能夠知道,在時間 07-20 15:36:29.749 程序收到了一個action消息。

07-20 15:36:29.749 10102  1480  1737 D moblie-Application: [Thread:17329] receive an intent from server, action=com.ttt.push.RECEIVE_MESSAGE。
複製代碼

原來是應用com.xxxx.moblie 收到了一個推送消息(com.ttt.push.RECEIVE_MESSAGE)致使了阻塞,咱們再串聯一下目前所獲取到的信息:在時間07-20 15:36:29.749 應用com.xxxx.moblie 收到了一下推送信息action=com.ttt.push.RECEIVE_MESSAGE發生阻塞,5秒後發生了KeyDispatchTimeout的ANR

雖然知道了是怎麼開始的,可是具體緣由尚未找到,是否是當時CPU很緊張、各路APP再搶佔資源? 咱們再看看CPU的信息,。搜索關鍵字關鍵字: ANR IN

07-20 15:36:58.711  1000  1520  1597 E ActivityManager: ANR in com.xxxx.moblie (com.xxxx.moblie/.ui.MainActivity) (進程名)
07-20 15:36:58.711  1000  1520  1597 E ActivityManager: PID: 1480 (進程pid)
07-20 15:36:58.711  1000  1520  1597 E ActivityManager: Reason: Input dispatching timed out (AppWindowToken{da8f666 token=Token{5501f51 ActivityRecord{15c5c78 u0 com.xxxx.moblie/.ui.MainActivity t3862}}}, Waiting because no window has focus but there is a focused application that may eventually add a window when it finishes starting up.)
07-20 15:36:58.711  1000  1520  1597 E ActivityManager: Load: 0.0 / 0.0 / 0.0 (Load代表是1分鐘,5分鐘,15分鐘CPU的負載)
07-20 15:36:58.711  1000  1520  1597 E ActivityManager: CPU usage from 20ms to 20286ms later (2018-07-20 15:36:36.170 to 2018-07-20 15:36:56.436):
07-20 15:36:58.711  1000  1520  1597 E ActivityManager:   42% 6774/pressure: 41% user + 1.4% kernel / faults: 168 minor
07-20 15:36:58.711  1000  1520  1597 E ActivityManager:   34% 142/kswapd0: 0% user + 34% kernel
07-20 15:36:58.711  1000  1520  1597 E ActivityManager:   31% 1520/system_server: 13% user + 18% kernel / faults: 58724 minor 1585 major
07-20 15:36:58.711  1000  1520  1597 E ActivityManager:   13% 29901/com.ss.android.article.news: 7.7% user + 6% kernel / faults: 56007 minor 2446 major
07-20 15:36:58.711  1000  1520  1597 E ActivityManager:   13% 32638/com.android.quicksearchbox: 9.4% user + 3.8% kernel / faults: 48999 minor 1540 major
07-20 15:36:58.711  1000  1520  1597 E ActivityManager:   11% (CPU的使用率)1480/com.xxxx.moblie: 5.2%(用戶態的使用率) user + (內核態的使用率) 6.3% kernel / faults: 76401 minor 2422 major
07-20 15:36:58.711  1000  1520  1597 E ActivityManager:   8.2% 21000/kworker/u16:12: 0% user + 8.2% kernel
07-20 15:36:58.711  1000  1520  1597 E ActivityManager:   0.8% 724/mtd: 0% user + 0.8% kernel / faults: 1561 minor 9 major
07-20 15:36:58.711  1000  1520  1597 E ActivityManager:   8% 29704/kworker/u16:8: 0% user + 8% kernel
07-20 15:36:58.711  1000  1520  1597 E ActivityManager:   7.9% 24391/kworker/u16:18: 0% user + 7.9% kernel
07-20 15:36:58.711  1000  1520  1597 E ActivityManager:   7.1% 30656/kworker/u16:14: 0% user + 7.1% kernel
07-20 15:36:58.711  1000  1520  1597 E ActivityManager:   7.1% 9998/kworker/u16:4: 0% user + 7.1% kernel
複製代碼

我已經在log 中標誌了相關的含義。com.xxxx.moblie 佔用了11%的CPU,其實這並不算多。如今的手機基本都是多核CPU。假如你的CPU是4核,那麼上限是400%,以此類推。

既然不是CPU負載的緣由,那麼究竟是什麼緣由呢? 這時就要看咱們的終極大殺器——traces.txt

2、 traces.txt 日誌分析

當APP不響應、響應慢了、或者WatchDog的監視沒有獲得迴應時,系統就會dump出一個traces.txt文件,存放在文件目錄:/data/anr/traces.txt,經過traces文件,咱們能夠拿到線程名、堆棧信息、線程當前狀態、binder call等信息。
經過adb命令拿到該文件:adb pull /data/anr/traces.txt
trace: Cmd line:com.xxxx.moblie

"main" prio=5 tid=1 Runnable
  | group="main" sCount=0 dsCount=0 obj=0x73bcc7d0 self=0x7f20814c00
  | sysTid=20176 nice=-10 cgrp=default sched=0/0 handle=0x7f251349b0
  | state=R schedstat=( 0 0 0 ) utm=12 stm=3 core=5 HZ=100
  | stack=0x7fdb75e000-0x7fdb760000 stackSize=8MB
  | held mutexes= "mutator lock"(shared held)
  // java 堆棧調用信息,能夠查看調用的關係,定位到具體位置
  at ttt.push.InterceptorProxy.addMiuiApplication(InterceptorProxy.java:77)
  at ttt.push.InterceptorProxy.create(InterceptorProxy.java:59)
  at android.app.Activity.onCreate(Activity.java:1041)
  at miui.app.Activity.onCreate(SourceFile:47)
  at com.xxxx.moblie.ui.b.onCreate(SourceFile:172)
  at com.xxxx.moblie.ui.MainActivity.onCreate(SourceFile:68)
  at android.app.Activity.performCreate(Activity.java:7050)
  at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1214)
  at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2807)
  at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2929)
  at android.app.ActivityThread.-wrap11(ActivityThread.java:-1)
  at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1618)
  at android.os.Handler.dispatchMessage(Handler.java:105)
  at android.os.Looper.loop(Looper.java:171)
  at android.app.ActivityThread.main(ActivityThread.java:6699)
  at java.lang.reflect.Method.invoke(Native method)
  at com.android.internal.os.Zygote$MethodAndArgsCaller.run(Zygote.java:246)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:783)
複製代碼

我詳細解析一下traces.txt裏面的一些字段,看看它到底能給咱們提供什麼信息.
main:main標識是主線程,若是是線程,那麼命名成「Thread-X」的格式,x表示線程id,逐步遞增。
prio:線程優先級,默認是5
tid:tid不是線程的id,是線程惟一標識ID
group:是線程組名稱
sCount:該線程被掛起的次數
dsCount:是線程被調試器掛起的次數
obj:對象地址
self:該線程Native的地址
sysTid:是線程號(主線程的線程號和進程號相同)
nice:是線程的調度優先級
sched:分別標誌了線程的調度策略和優先級
cgrp:調度歸屬組
handle:線程處理函數的地址。
state:是調度狀態
schedstat:從 /proc/[pid]/task/[tid]/schedstat讀出,三個值分別表示線程在cpu上執行的時間、線程的等待時間和線程執行的時間片長度,不支持這項信息的三個值都是0;
utm:是線程用戶態下使用的時間值(單位是jiffies)
stm:是內核態下的調度時間值
core:是最後執行這個線程的cpu核的序號。

Java的堆棧信息是咱們最關心的,它可以定位到具體位置。從上面的traces,咱們能夠判斷ttt.push.InterceptorProxy.addMiuiApplicationInterceptorProxy.java:77 致使了com.xxxx.moblie發生了ANR。這時候能夠對着源碼查看,找到出問題,而且解決它。

總結一下這分析流程:首先咱們搜索am_anr,找到出現ANR的時間點、進程PID、ANR類型、而後再找搜索PID,找前5秒左右的日誌。過濾ANR IN 查看CPU信息,接着查看traces.txt,找到java的堆棧信息定位代碼位置,最後查看源碼,分析與解決問題。這個過程基本能找到發生ANR的前因後果。

ANR 案例整理

1、主線程被其餘線程lock,致使死鎖

waiting on <0x1cd570> (a android.os.MessageQueue)

DALVIK THREADS:
"main" prio=5 tid=3 TIMED_WAIT
  | group="main" sCount=1 dsCount=0 s=0 obj=0x400143a8
  | sysTid=691 nice=0 sched=0/0 handle=-1091117924
  at java.lang.Object.wait(Native Method)
  - waiting on <0x1cd570> (a android.os.MessageQueue)
  at java.lang.Object.wait(Object.java:195)
  at android.os.MessageQueue.next(MessageQueue.java:144)
  at android.os.Looper.loop(Looper.java:110)
  at android.app.ActivityThread.main(ActivityThread.java:3742)
  at java.lang.reflect.Method.invokeNative(Native Method)
  at java.lang.reflect.Method.invoke(Method.java:515)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:739)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:497)
  at dalvik.system.NativeStart.main(Native Method)

"Binder Thread #3" prio=5 tid=15 NATIVE
  | group="main" sCount=1 dsCount=0 s=0 obj=0x434e7758
  | sysTid=734 nice=0 sched=0/0 handle=1733632
  at dalvik.system.NativeStart.run(Native Method)

"Binder Thread #2" prio=5 tid=13 NATIVE
  | group="main" sCount=1 dsCount=0 s=0 obj=0x1cd570
  | sysTid=696 nice=0 sched=0/0 handle=1369840
  at dalvik.system.NativeStart.run(Native Method)

"Binder Thread #1" prio=5 tid=11 NATIVE
  | group="main" sCount=1 dsCount=0 s=0 obj=0x433aca10
  | sysTid=695 nice=0 sched=0/0 handle=1367448
  at dalvik.system.NativeStart.run(Native Method)

----- end 691 -----
複製代碼

2、主線程作耗時的操做:好比數據庫讀寫。

"main" prio=5 tid=1 Native
held mutexes=
kernel: (couldn't read /proc/self/task/11003/stack) native: #00 pc 000492a4 /system/lib/libc.so (nanosleep+12) native: #01 pc 0002dc21 /system/lib/libc.so (usleep+52) native: #02 pc 00009cab /system/lib/libsqlite.so (???) native: #03 pc 00011119 /system/lib/libsqlite.so (???) native: #04 pc 00016455 /system/lib/libsqlite.so (???) native: #16 pc 0000fa29 /system/lib/libsqlite.so (???) native: #17 pc 0000fad7 /system/lib/libsqlite.so (sqlite3_prepare16_v2+14) native: #18 pc 0007f671 /system/lib/libandroid_runtime.so (???) native: #19 pc 002b4721 /system/framework/arm/boot-framework.oat (Java_android_database_sqlite_SQLiteConnection_nativePrepareStatement__JLjava_lang_String_2+116) at android.database.sqlite.SQLiteConnection.setWalModeFromConfiguration(SQLiteConnection.java:294) at android.database.sqlite.SQLiteConnection.open(SQLiteConnection.java:215) at android.database.sqlite.SQLiteConnection.open(SQLiteConnection.java:193) at android.database.sqlite.SQLiteConnectionPool.openConnectionLocked(SQLiteConnectionPool.java:463) at android.database.sqlite.SQLiteConnectionPool.open(SQLiteConnectionPool.java:185) at android.database.sqlite.SQLiteConnectionPool.open(SQLiteConnectionPool.java:177) at android.database.sqlite.SQLiteDatabase.openInner(SQLiteDatabase.java:808) locked <0x0db193bf> (a java.lang.Object) at android.database.sqlite.SQLiteDatabase.open(SQLiteDatabase.java:793) at android.database.sqlite.SQLiteDatabase.openDatabase(SQLiteDatabase.java:696) at android.app.ContextImpl.openOrCreateDatabase(ContextImpl.java:690) at android.content.ContextWrapper.openOrCreateDatabase(ContextWrapper.java:299) at android.database.sqlite.SQLiteOpenHelper.getDatabaseLocked(SQLiteOpenHelper.java:223) at android.database.sqlite.SQLiteOpenHelper.getWritableDatabase(SQLiteOpenHelper.java:163) locked <0x045a4a8c> (a com.xxxx.video.common.data.DataBaseHelper) at com.xxxx.video.common.data.DataBaseORM.<init>(DataBaseORM.java:46) at com.xxxx.video.common.data.DataBaseORM.getInstance(DataBaseORM.java:53) locked <0x017095d5> (a java.lang.Class<com.xxxx.video.common.data.DataBaseORM>) 複製代碼

3、binder數據量過大

07-21 04:43:21.573  1000  1488 12756 E Binder  : Unreasonably large binder reply buffer: on android.content.pm.BaseParceledListSlice$1@770c74f calling 1 size 388568 (data: 1, 32, 7274595)
07-21 04:43:21.573  1000  1488 12756 E Binder  : android.util.Log$TerribleFailure: Unreasonably large binder reply buffer: on android.content.pm.BaseParceledListSlice$1@770c74f calling 1 size 388568 (data: 1, 32, 7274595)
07-21 04:43:21.607  1000  1488  2951 E Binder  : Unreasonably large binder reply buffer: on android.content.pm.BaseParceledListSlice$1@770c74f calling 1 size 211848 (data: 1, 23, 7274595)
07-21 04:43:21.607  1000  1488  2951 E Binder  : android.util.Log$TerribleFailure: Unreasonably large binder reply buffer: on android.content.pm.BaseParceledListSlice$1@770c74f calling 1 size 211848 (data: 1, 23, 7274595)
07-21 04:43:21.662  1000  1488  6258 E Binder  : Unreasonably large binder reply buffer: on android.content.pm.BaseParceledListSlice$1@770c74f calling 1 size 259300 (data: 1, 33, 7274595)
複製代碼

4、binder 通訊失敗

07-21 06:04:35.580 <6>[32837.690321] binder: 1698:2362 transaction failed 29189/-3, size 100-0 line 3042
07-21 06:04:35.594 <6>[32837.704042] binder: 1765:4071 transaction failed 29189/-3, size 76-0 line 3042
07-21 06:04:35.899 <6>[32838.009132] binder: 1765:4067 transaction failed 29189/-3, size 224-8 line 3042
07-21 06:04:36.018 <6>[32838.128903] binder: 1765:2397 transaction failed 29189/-22, size 348-0 line 2916
複製代碼

推薦

最新源碼彙總:每週分享新的開源代碼,有效果圖,更直觀。

相關文章
相關標籤/搜索