當CPU飆升的時候,咱們須要知道CPU此時在幹嗎,具體什麼進程、什麼線程讓CPU飆升html
線程是操做系統可以進行運算調度的最小單位,跟蹤佔用CPU高的線程 能夠了解線程的運行狀況,定位可能或者問題代碼java
The first line of each thread represents the thread summary, which contains the following items:git
SECTION | EXAMPLE | DESCRIPTION |
---|---|---|
Name | "Reference Handler" |
Human-readable name of the thread. This name can be set by calling the setName method on a Thread object and be obtained by calling getName on the object. |
ID | #2 |
A unique ID associated with each Thread object. This number is generated, starting at 1 , for all threads in the system. Each time a Thread object is created, the sequence number is incremented and then assigned to the newly created Thread . This ID is read-only and can be obtained by calling getId on a Thread object. |
Daemon status | daemon |
A tag denoting if the thread is a daemon thread. If the thread is a daemon, this tag will be present; if the thread is a non-daemon thread, no tag will be present. For example, Thread-0 is not a daemon thread and therefore has no associated daemon tag in its summary: Thread-0" #12 prio=5... . |
Priority | prio=10 |
The numeric priority of the Java thread. Note that this does not necessarily correspond to the priority of the OS thread to with the Java thread is dispatched. The priority of a Thread object can be set using the setPriority method and obtained using the getPriority method. |
OS Thread Priority | os_prio=2 |
The OS thread priority. This priority can differ from the Java thread priority and corresponds to the OS thread on which the Java thread is dispatched. |
Address | tid=0x00000250e4979000 |
The address of the Java thread. This address represents the pointer address of the Java Native Interface (JNI) native Thread object (the C++ Thread object that backs the Java thread through the JNI). This value is obtained by converting the pointer to this (of the C++ object that backs the Java Thread object) to an integer on line 879 of hotspot/share/runtime/thread.cpp : st->print("tid=" INTPTR_FORMAT " ", p2i(this)); Although the key for this item (tid ) may appear to be the thread ID, it is actually the address of the underlying JNI C++ Thread object and thus is not the ID returned when calling getId on a Java Thread object. |
OS Thread ID | nid=0x3c28 |
The unique ID of the OS thread to which the Java Thread is mapped. This value is printed on line 42 of hotspot/share/runtime/osThread.cpp : st->print("nid=0x%x ", thread_id()); |
Status | waiting on condition |
A human-readable string depicting the current status of the thread. This string provides supplementary information beyond the basic thread state (see below) and can be useful in discovering the intended actions of a thread (i.e. was the thread trying to acquire a lock or waiting on a condition when it blocked). |
Last Known Java Stack Pointer | [0x000000b82a9ff000] |
The last known Stack Pointer (SP) for the stack associated with the thread. This value is supplied using native C++ code and is interlaced with the Java Thread class using the JNI. This value is obtained using the last_Java_sp() native method and is formatted into the thread dump on line 2886 of hotspot/share/runtime/thread.cpp: st->print_cr("[" INTPTR_FORMAT "]", (intptr_t)last_Java_sp() & ~right_n_bits(12)); For simple thread dumps, this information may not be useful, but for more complex diagnostics, this SP value can be used to trace lock acquisition through a program. |
The second line represents the current state of the thread. The possible states for a thread are captured in the Thread.State
enumeration:github
NEW
api
RUNNABLE
tomcat
BLOCKED
多線程
WAITING
併發
TIMED_WAITING
oracle
TERMINATED
app
以上內容來源:https://dzone.com/articles/how-to-read-a-thread-dump
定位高cpu進程、線程參考上文 Java運行狀態分析2:線程狀態及堆棧信息
ps -mp {pid} -o THREAD,tid,time|sort -k 3 -n
root 0.4 19 - futex_ - - 8063 00:12:37 root 0.4 19 - futex_ - - 8072 00:11:25 root 0.4 19 - futex_ - - 8074 00:11:28 root 0.4 19 - futex_ - - 8083 00:11:50 root 0.4 19 - futex_ - - 8086 00:10:44 root 0.4 19 - futex_ - - 8087 00:10:38 root 0.4 19 - futex_ - - 8091 00:11:04 root 0.4 19 - futex_ - - 8093 00:11:35 root 0.4 19 - futex_ - - 8094 00:11:57 root 0.4 19 - futex_ - - 8095 00:11:01 root 0.4 19 - futex_ - - 8096 00:10:17 root 0.4 19 - futex_ - - 8099 00:12:08 root 0.4 19 - futex_ - - 8100 00:12:11 root 0.4 19 - futex_ - - 8101 00:12:26 root 0.5 19 - ep_pol - - 7071 00:12:42 root 0.5 19 - ep_pol - - 7073 00:12:53 root 0.5 19 - ep_pol - - 7078 00:14:19 root 0.5 19 - futex_ - - 8000 00:12:42 root 0.5 19 - futex_ - - 8021 00:13:38 root 0.5 19 - futex_ - - 8032 00:14:43 root 0.5 19 - futex_ - - 8098 00:13:17 root 0.5 19 - futex_ - - 8103 00:12:54 root 0.8 19 - ep_pol - - 7069 00:22:14 root 0.9 19 - ep_pol - - 7149 00:24:44 root 1.3 19 - futex_ - - 7100 00:35:06 root 2.2 19 - futex_ - - 20595 00:01:03
找到佔用CPU高的線程id( 線程多,正序sort方便查佔用CPU最大的線程)
將tid轉16進制
printf %x {tid}
printf %x 18849 49a1
查對應高load線程堆棧信息
jstack {進程id}|grep -A 30 {線程id}
"AsyncAppender-Worker-ASYNC-IS-FILE" #9635 daemon prio=5 os_prio=0 tid=0x00007fcee0f55800 nid=0x36c9 waiting on condition [0x00007fd09f9fc000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x0000000576dd0688> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403) at ch.qos.logback.core.AsyncAppenderBase$Worker.run(AsyncAppenderBase.java:264)
看了下 是logback日誌打印佔用,跟蹤了下日誌打印,這塊日誌打印的量相對較多,正常穩定高峯期間每秒大概產生10000條日誌(7個節點),是其餘服務的4倍左右
"RxIoScheduler-172" #9646 daemon prio=5 os_prio=0 tid=0x00007fcd98b5c000 nid=0x36dc waiting on condition [0x00007fcd2fe93000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x0000000588b75208> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)
"RxIoScheduler-174" #9864 daemon prio=5 os_prio=0 tid=0x00007fcd99158000 nid=0x40a9 waiting on condition [0x00007fce0f1f0000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000006cb53ddf0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)
"hystrix-commonability-core-297" #9960 daemon prio=5 os_prio=0 tid=0x00007fd07c016000 nid=0x41f5 waiting on condition [0x00007fcd2a940000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x0000000243e8fff8> (a java.util.concurrent.SynchronousQueue$TransferStack) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362) at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)
"RxIoScheduler-175" #10015 daemon prio=5 os_prio=0 tid=0x00007fcd991b5800 nid=0x45e8 waiting on condition [0x00007fcd55cdb000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x0000000732e899a8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)
"http-nio-18002-exec-123" #2505 daemon prio=5 os_prio=0 tid=0x00007fce90042800 nid=0xafa waiting on condition [0x00007fcd2932a000] -- "RxComputationScheduler-13" #277 daemon prio=5 os_prio=0 tid=0x00007fcec4214000 nid=0x128 runnable [0x00007fce0d9d8000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x000000023c4df068> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)
"RxComputationScheduler-30" #350 daemon prio=5 os_prio=0 tid=0x00007fcf34093000 nid=0x171 runnable [0x00007fcd54bcc000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x000000023cd13530> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)
hystrix rxjava CPU較高,這塊後續分析
推薦使用:
https://fastthread.io
可本地安裝,區分收費免費,簡單調試跟蹤能夠直接在線使用
上傳dump文件,進入分析界面,界面比較直觀的統計了線程狀況,經過線程狀態作了分組
線程概覽
主要須要關心下線程的總數 是否符合咱們的預期,blocked線程可能會致使死鎖狀況
目前咱們系統線程數相對較多,200 Tomcat worker線程、+ 10個系統hystrix線程 + 異步任務線程 + 其餘線程,這邊800多個在預期內,一些等待線程也是正常狀態
關於blocked線程,須要看下量是否多,是不是正常範圍內的,同線程狀態角度,涉及同步操做,是會產生blocked的,工程這邊kafka在poll期間代碼實現經過synchronized實現,這塊是正常場景
waiting : nio Tomcat在壓力小的時候,worker等待新請求到來
Timed_waiting: nio tomcat 壓力大場景 worker經過線程池實現,請求進入隊列,新請求場景會等待釋放線程,同時處理默認200,出現這種場景,能夠跟蹤下當時的流量和rt狀況,查看釋放由於請求量和響應時間問題致使這塊阻塞,主機性能好能夠適當提高下併發處理量,並從接口性能層面跟蹤排查
異步任務場景,進入條件等待也將產生timed_waiting
線程分組統計
這塊能夠比較方便的看各個線程組的線程數量狀況,經過這邊看是否符合預期場景
線程數量原則上是儘量的保持小,多線程能夠更多的佔用CPU,必定程度減小響應時間,但並不是是提示系統性能的手段,過多的線程將代理CPU上下文切換時間,線程上下文拷貝也帶來系統資源的開銷