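When a Java web application runs slowly or fails, you need thread dumps to find out why. If thread dumps look complicated to you, this article may help. It covers threads in Java, how they are created and managed, how to dump the threads of a running application, and finally how to analyze a dump to find blocked threads and bottlenecks. The results shown here were obtained while debugging applications.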
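Java and Threads
A web server uses tens to hundreds of threads to serve a large number of concurrent users. When several threads use the same shared resources, contention for data between threads is unavoidable, and sometimes a deadlock occurs.
Thread contention is a state in which one thread waits for another thread to release a lock, because different threads of the web application access the same shared resource. For example, to write a log entry, a thread must first obtain the lock and only then access the shared resource.
Deadlock is a special kind of thread contention in which two or more threads each have to wait for the others to finish their tasks before they can finish their own.
Thread contention causes many different problems. To analyze them, you need a thread dump, which records the actual state of every thread.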
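Background Information for Java Threads
Thread Synchronization
Several threads can run at the same time. To keep shared resources consistent when multiple threads use them, thread synchronization ensures that only one thread at a time can access a shared resource.
In Java, thread synchronization is done with monitors. Every Java object has one monitor, and a monitor can be owned by only one thread at a time. A thread that wants a monitor owned by another thread has to wait in the entry queue until the owner releases it.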
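As an illustration of the monitor described above, here is a minimal sketch in which several threads update one counter through synchronized methods. The class MonitorCounter and its method names are made up for this example and do not come from the original article.

```java
// Hypothetical example class: each synchronized method acquires the monitor
// of the MonitorCounter instance, so only one thread at a time can run it.
final class MonitorCounter {
    private int value;

    synchronized void increment() {
        value++;
    }

    synchronized int get() {
        return value;
    }

    // Starts `threads` workers that each call increment() `perThread` times
    // and returns the final counter value once all workers have finished.
    static int runWorkers(int threads, int perThread) {
        MonitorCounter counter = new MonitorCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            }, "counter-worker-" + i);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 10_000)); // prints 40000
    }
}
```

Without the synchronized keyword the increments could interleave and updates could be lost. With it, a worker that finds the monitor taken waits for it, which is the situation a thread dump reports as BLOCKED.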
Thread Status
To analyze a thread dump, you first need to know the thread states. They are defined in java.lang.Thread.State.
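NEW: The thread has been created but has not started running yet.
RUNNABLE: The thread is executing in the JVM. It is usually using the CPU to process a task, but it may also be waiting for an operating system resource such as I/O.
BLOCKED: The thread is waiting for another thread to release a lock so that it can obtain the monitor.
WAITING: The thread is waiting indefinitely after calling wait, join, or park.
TIMED_WAITING: The thread is waiting for a limited time after calling sleep, wait, join, or park with a timeout.
TERMINATED: The thread has finished running.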
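The states can also be observed from code with Thread.getState(). The following sketch walks one thread through NEW, WAITING, TIMED_WAITING, and finally TERMINATED, a state that java.lang.Thread.State defines for threads that have finished. ThreadStateDemo and its helper methods are hypothetical names, and the polling helper is only one simple way to wait for a state change.

```java
import java.util.ArrayList;
import java.util.List;

final class ThreadStateDemo {
    // Polls until the thread reaches the wanted state, giving up after about 5 seconds.
    static boolean awaitState(Thread thread, Thread.State wanted) {
        for (int i = 0; i < 500 && thread.getState() != wanted; i++) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return thread.getState() == wanted;
    }

    // Returns the states observed, in order, over the life of one thread.
    static List<Thread.State> observe() {
        List<Thread.State> seen = new ArrayList<>();
        Object signal = new Object();
        Thread thread = new Thread(() -> {
            synchronized (signal) {
                try {
                    signal.wait();        // no timeout: WAITING
                } catch (InterruptedException e) {
                    return;
                }
            }
            try {
                Thread.sleep(60_000);     // with timeout: TIMED_WAITING
            } catch (InterruptedException e) {
                // interrupted by observe(): fall through and finish
            }
        }, "state-demo");
        thread.setDaemon(true);
        seen.add(thread.getState());      // NEW: created but not started
        thread.start();
        if (awaitState(thread, Thread.State.WAITING)) {
            seen.add(Thread.State.WAITING);
        }
        synchronized (signal) {
            signal.notifyAll();           // lets the thread leave wait()
        }
        if (awaitState(thread, Thread.State.TIMED_WAITING)) {
            seen.add(Thread.State.TIMED_WAITING);
        }
        thread.interrupt();               // ends the sleep early
        if (awaitState(thread, Thread.State.TERMINATED)) {
            seen.add(Thread.State.TERMINATED);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observe());
    }
}
```

BLOCKED is left out here because it needs a second thread that holds a monitor.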
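Thread Types
Java threads can be divided into two types:
1. Daemon threads
2. Non-daemon threads
Daemon threads stop when no non-daemon threads are left running. Even if you do not create any threads yourself, a Java application creates several by default. Most of them are daemon threads that perform tasks such as garbage collection or JMX.
The thread that runs the 'static void main(String[] args)' method is a non-daemon thread. Once it and every other non-daemon thread have stopped, all remaining daemon threads stop as well.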
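The difference between the two thread types can be made explicit in code. This is a small sketch with made-up names (DaemonDemo, "worker", "background") that sets the daemon flag with Thread.setDaemon, which has to be called before the thread is started.

```java
final class DaemonDemo {
    // Returns the daemon flags of a non-daemon and a daemon thread.
    static boolean[] daemonFlags() {
        Runnable idle = () -> { };

        Thread worker = new Thread(idle, "worker");
        worker.setDaemon(false);      // non-daemon: keeps the JVM alive while it runs

        Thread background = new Thread(idle, "background");
        background.setDaemon(true);   // daemon: does not keep the JVM alive

        return new boolean[] { worker.isDaemon(), background.isDaemon() };
    }

    public static void main(String[] args) {
        boolean[] flags = daemonFlags();
        System.out.println("worker daemon=" + flags[0] + ", background daemon=" + flags[1]);
    }
}
```

A new thread inherits the daemon flag of the thread that creates it, so setting the flag explicitly keeps the behavior independent of where the thread is created. In a thread dump, daemon threads carry the word daemon after the thread name.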
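Getting a Thread Dump
This section introduces the most commonly used methods. Note that there are many other ways to get a thread dump. A single thread dump shows only the thread states at the moment it was taken, so to see how the states change, take 5 to 10 dumps at 5-second intervals.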
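One of the other ways is to collect the stack traces from inside the application itself. The sketch below uses Thread.getAllStackTraces() and repeats the capture at a fixed interval. InProcessDump and its method names are assumptions made for this example, and the text it produces only loosely resembles a real thread dump (it has no tid, nid, or lock information).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class InProcessDump {
    // Formats the stack traces of all live threads of this JVM as text.
    static String dumpOnce() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(thread.isDaemon() ? " daemon" : "")
               .append(" prio=").append(thread.getPriority())
               .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("        at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    // Takes `count` dumps, pausing `intervalMillis` between two captures.
    static List<String> dumpRepeatedly(int count, long intervalMillis) {
        List<String> dumps = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            dumps.add(dumpOnce());
            if (i < count - 1) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return dumps;
    }

    public static void main(String[] args) {
        // Five dumps at 5-second intervals, following the recommendation above.
        for (String dump : dumpRepeatedly(5, 5_000)) {
            System.out.println(dump);
        }
    }
}
```

This only works while the application can still run the capturing code, so it complements the external tools rather than replacing them.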
Getting a Thread Dump by Using jstack
Use the jps command to find the PID of the Java application that is currently running.
[user@linux ~]$ jps -v
25780 RemoteTestRunner -Dfile.encoding=UTF-8
25590 sun.rmi.registry.RegistryImpl 2999 -Dapplication.home=/home1/user/java/jdk.1.6.0_24 -Xms8m
26300 sun.tools.jps.Jps -mlvV -Dapplication.home=/home1/user/java/jdk.1.6.0_24 -Xms8m
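Pass the PID to jstack to obtain a thread dump. The -F option forces a dump when the process does not respond to a plain jstack call.
[user@linux ~]$ jstack -F 5824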
Generating a Thread Dump in a Linux Terminal
Use the ps -ef command to find the PID of the Java process that is currently running.
[user@linux ~]$ ps -ef | grep java
user 2477 1 0 Dec23 ? 00:10:45 ...
user 25780 25361 0 15:02 pts/3 00:00:02 ./jstatd -J -Djava.security.policy=jstatd.all.policy -p 2999
user 26335 25361 0 15:49 pts/3 00:00:00 grep java
Use the extracted PID as the argument of kill -SIGQUIT (signal 3) to obtain a thread dump. The JVM writes the dump to its standard output. Each thread appears in the dump as an entry like the following:
"pool-1-thread-13" prio=6 tid=0x000000000729a000 nid=0x2fb4 runnable [0x0000000007f0f000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
        - locked <0x0000000780b7e688> (a java.io.InputStreamReader)
        at java.io.InputStreamReader.read(InputStreamReader.java:167)
        at java.io.BufferedReader.fill(BufferedReader.java:136)
        at java.io.BufferedReader.readLine(BufferedReader.java:299)
        - locked <0x0000000780b7e688> (a java.io.InputStreamReader)
        at java.io.BufferedReader.readLine(BufferedReader.java:362)
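Thread name: A thread created with the java.lang.Thread class is named Thread-(Number). A thread created through the default java.util.concurrent.ThreadFactory is named pool-(Number)-thread-(Number).
Priority: The priority of the thread.
Thread ID: The unique ID of the thread. (The thread ID can be used to look up useful information such as the CPU or memory usage of the thread.)
Thread status: The state of the thread.
Thread callstack: The call stack of the thread.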
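To make these fields concrete, the following sketch pulls them out of the first line of a thread entry with a regular expression. ThreadHeader is a hypothetical helper, and the pattern assumes the older header layout shown in this article (name, optional daemon, prio, tid, nid, status). Newer JDKs print additional fields, so the pattern would need to be adapted for them.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ThreadHeader {
    // "name" [daemon] prio=N tid=0x... nid=0x... status [0x...]
    private static final Pattern HEADER = Pattern.compile(
        "^\"([^\"]+)\"( daemon)? prio=(\\d+) tid=(0x[0-9a-fA-F]+) nid=(0x[0-9a-fA-F]+) (.*?)(?: \\[(0x[0-9a-fA-F]+)\\])?$");

    final String name;
    final boolean daemon;
    final int priority;
    final String tid;
    final long nativeId;   // nid converted from hexadecimal to decimal
    final String status;

    private ThreadHeader(Matcher m) {
        name = m.group(1);
        daemon = m.group(2) != null;
        priority = Integer.parseInt(m.group(3));
        tid = m.group(4);
        nativeId = Long.parseLong(m.group(5).substring(2), 16);
        status = m.group(6);
    }

    // Returns the parsed header, or null if the line is not a thread header.
    static ThreadHeader parse(String line) {
        Matcher m = HEADER.matcher(line.trim());
        return m.matches() ? new ThreadHeader(m) : null;
    }

    public static void main(String[] args) {
        ThreadHeader h = parse("\"pool-1-thread-13\" prio=6 tid=0x000000000729a000 "
            + "nid=0x2fb4 runnable [0x0000000007f0f000]");
        System.out.println(h.name + " prio=" + h.priority + " nid=" + h.nativeId + " " + h.status);
    }
}
```

The decimal nativeId is the value that operating system tools such as ps report as the LWP, which is what links a thread in the dump to its CPU usage.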
Thread Dump Patterns by Type
When Unable to Obtain a Lock (BLOCKED)
This pattern appears when one thread holds a lock that other threads cannot obtain, and the overall performance of the application drops. In the following example, the BLOCKED_TEST pool-1-thread-1 thread is running while holding lock <0x0000000780a000b0>, and the BLOCKED_TEST pool-1-thread-2 and BLOCKED_TEST pool-1-thread-3 threads are waiting to obtain lock <0x0000000780a000b0>.
"BLOCKED_TEST pool-1-thread-1" prio=6 tid=0x0000000006904800 nid=0x28f4 runnable [0x000000000785f000]
   java.lang.Thread.State: RUNNABLE
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:282)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        - locked <0x0000000780a31778> (a java.io.BufferedOutputStream)
        at java.io.PrintStream.write(PrintStream.java:432)
        - locked <0x0000000780a04118> (a java.io.PrintStream)
        at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)
        at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:272)
        at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:85)
        - locked <0x0000000780a040c0> (a java.io.OutputStreamWriter)
        at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:168)
        at java.io.PrintStream.newLine(PrintStream.java:496)
        - locked <0x0000000780a04118> (a java.io.PrintStream)
        at java.io.PrintStream.println(PrintStream.java:687)
        - locked <0x0000000780a04118> (a java.io.PrintStream)
        at com.nbp.theplatform.threaddump.ThreadBlockedState.monitorLock(ThreadBlockedState.java:44)
        - locked <0x0000000780a000b0> (a com.nbp.theplatform.threaddump.ThreadBlockedState)
        at com.nbp.theplatform.threaddump.ThreadBlockedState$1.run(ThreadBlockedState.java:7)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
        - <0x0000000780a31758> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

"BLOCKED_TEST pool-1-thread-2" prio=6 tid=0x0000000007673800 nid=0x260c waiting for monitor entry [0x0000000008abf000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at com.nbp.theplatform.threaddump.ThreadBlockedState.monitorLock(ThreadBlockedState.java:43)
        - waiting to lock <0x0000000780a000b0> (a com.nbp.theplatform.threaddump.ThreadBlockedState)
        at com.nbp.theplatform.threaddump.ThreadBlockedState$2.run(ThreadBlockedState.java:26)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
        - <0x0000000780b0c6a0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

"BLOCKED_TEST pool-1-thread-3" prio=6 tid=0x00000000074f5800 nid=0x1994 waiting for monitor entry [0x0000000008bbf000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at com.nbp.theplatform.threaddump.ThreadBlockedState.monitorLock(ThreadBlockedState.java:42)
        - waiting to lock <0x0000000780a000b0> (a com.nbp.theplatform.threaddump.ThreadBlockedState)
        at com.nbp.theplatform.threaddump.ThreadBlockedState$3.run(ThreadBlockedState.java:34)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
        - <0x0000000780b0e1b8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
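The same relationship between a BLOCKED thread and the owner of the lock can be queried at runtime. The following sketch assumes a made-up class BlockedOwnerDemo: the calling thread holds a monitor, a second thread tries to enter it, and ThreadInfo.getLockOwnerName() reports who is holding the lock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;

final class BlockedOwnerDemo {
    // Makes a helper thread block on a monitor held by the caller and
    // describes the helper's state and the owner of the monitor.
    static String describeBlockedThread() {
        final Object lock = new Object();
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                // entered only after the owner has released the monitor
            }
        }, "demo-waiter");
        waiter.setDaemon(true);

        synchronized (lock) {   // the calling thread owns the monitor
            waiter.start();
            long deadline = System.nanoTime() + 5_000_000_000L;
            while (waiter.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
                Thread.yield();
            }
            ThreadInfo info = ManagementFactory.getThreadMXBean().getThreadInfo(waiter.getId());
            if (info == null) {
                return "thread not alive";
            }
            return info.getThreadState() + " on monitor owned by " + info.getLockOwnerName();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeBlockedThread());
    }
}
```

In a thread dump the same information is spread over two entries: the waiting thread shows "waiting to lock <address>" and the owner shows "locked <address>" with the same address.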
When in Deadlock Status
This pattern appears when thread A needs the lock held by thread B to continue its task, while thread B needs the lock held by thread A. In the thread dump below, the DEADLOCK_TEST-1 thread holds lock 0x00000007d58f5e48 and is trying to obtain lock 0x00000007d58f5e60. The DEADLOCK_TEST-2 thread holds lock 0x00000007d58f5e60 and is trying to obtain lock 0x00000007d58f5e78. The DEADLOCK_TEST-3 thread holds lock 0x00000007d58f5e78 and is trying to obtain lock 0x00000007d58f5e48. Each thread is waiting for a lock held by another thread, and this state will not change until one of the threads gives up its lock.
"DEADLOCK_TEST-1" daemon prio=6 tid=0x000000000690f800 nid=0x1820 waiting for monitor entry [0x000000000805f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.goMonitorDeadlock(ThreadDeadLockState.java:197)
        - waiting to lock <0x00000007d58f5e60> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
        at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.monitorOurLock(ThreadDeadLockState.java:182)
        - locked <0x00000007d58f5e48> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
        at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.run(ThreadDeadLockState.java:135)

   Locked ownable synchronizers:
        - None

"DEADLOCK_TEST-2" daemon prio=6 tid=0x0000000006858800 nid=0x17b8 waiting for monitor entry [0x000000000815f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.goMonitorDeadlock(ThreadDeadLockState.java:197)
        - waiting to lock <0x00000007d58f5e78> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
        at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.monitorOurLock(ThreadDeadLockState.java:182)
        - locked <0x00000007d58f5e60> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
        at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.run(ThreadDeadLockState.java:135)

   Locked ownable synchronizers:
        - None

"DEADLOCK_TEST-3" daemon prio=6 tid=0x0000000006859000 nid=0x25dc waiting for monitor entry [0x000000000825f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.goMonitorDeadlock(ThreadDeadLockState.java:197)
        - waiting to lock <0x00000007d58f5e48> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
        at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.monitorOurLock(ThreadDeadLockState.java:182)
        - locked <0x00000007d58f5e78> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
        at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.run(ThreadDeadLockState.java:135)

   Locked ownable synchronizers:
        - None
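A deadlock like this one can also be detected from code. The sketch below is not the test program behind the dump above. It is a reduced, hypothetical two-thread version (DeadlockDetectDemo) that takes two monitors in opposite order and then asks ThreadMXBean.findMonitorDeadlockedThreads() for the threads involved. The threads are daemon threads so that the deliberate deadlock does not keep the JVM from exiting.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

final class DeadlockDetectDemo {
    // Starts a daemon thread that locks `first`, waits until both threads
    // hold their first lock, and then tries to lock `second`.
    private static Thread startLocker(String name, Object first, Object second,
                                      CountDownLatch bothReady) {
        Thread thread = new Thread(() -> {
            synchronized (first) {
                bothReady.countDown();
                try {
                    bothReady.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // never reached: the other thread holds `second`
                }
            }
        }, name);
        thread.setDaemon(true);
        thread.start();
        return thread;
    }

    // Returns how many of the two demo threads the JVM reports as deadlocked.
    static int countDeadlocked() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothReady = new CountDownLatch(2);
        Thread t1 = startLocker("DEMO_DEADLOCK-1", lockA, lockB, bothReady);
        Thread t2 = startLocker("DEMO_DEADLOCK-2", lockB, lockA, bothReady);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 500; attempt++) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when none found
            int ours = 0;
            if (ids != null) {
                for (long id : ids) {
                    if (id == t1.getId() || id == t2.getId()) {
                        ours++;
                    }
                }
            }
            if (ours == 2) {
                return ours;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return ours;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked demo threads: " + countDeadlocked());
    }
}
```

Taking the locks in the same order in every thread would avoid the cycle, which is the usual fix once a dump has shown which locks are involved.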
When Continuously Waiting to Receive Messages from a Remote Server
The thread looks normal because its state stays RUNNABLE. However, when you line up the thread dumps in chronological order, you can see that the socketReadThread thread keeps waiting to read from the socket.
"socketReadThread" prio=6 tid=0x0000000006a0d800 nid=0x1b40 runnable [0x00000000089ef000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
        - locked <0x00000007d78a2230> (a java.io.InputStreamReader)
        at sun.nio.cs.StreamDecoder.read0(StreamDecoder.java:107)
        - locked <0x00000007d78a2230> (a java.io.InputStreamReader)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:93)
        at java.io.InputStreamReader.read(InputStreamReader.java:151)
        at com.nbp.theplatform.threaddump.ThreadSocketReadState$1.run(ThreadSocketReadState.java:27)
        at java.lang.Thread.run(Thread.java:662)
When Waiting
The thread stays in the WAITING state. In the thread dump, the IoWaitThread thread is waiting to receive a message from a LinkedBlockingDeque. If no message ever arrives in the LinkedBlockingDeque, the state of the thread will not change.
"IoWaitThread" prio=6 tid=0x0000000007334800 nid=0x2b3c waiting on condition [0x000000000893f000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000007d5c45850> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:440)
        at java.util.concurrent.LinkedBlockingDeque.take(LinkedBlockingDeque.java:629)
        at com.nbp.theplatform.threaddump.ThreadIoWaitState$IoWaitHandler2.run(ThreadIoWaitState.java:89)
        at java.lang.Thread.run(Thread.java:662)
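When Thread Resources Are Not Cleaned Up Properly
When thread resources are not cleaned up properly, threads that are no longer needed pile up. If this happens, monitor how threads are created and released, or check the conditions under which threads terminate.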
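How to Solve Problems by Using Thread Dumps
Example 1: When CPU Usage Is Abnormally High
1. Extract the thread that has the highest CPU usage.
[user@linux ~]$ ps -mo pid,lwp,stime,time,pcpu -C java
  PID   LWP STIME     TIME %CPU
10029     - Dec07 00:02:02 99.5
    - 10039 Dec07 00:00:00  0.1
    - 10040 Dec07 00:00:00 95.5
From the output, find out which thread of the application uses the most CPU.
Take the LWP (light weight process) that uses the most CPU and convert its number to hexadecimal. This example follows LWP 10039, which is 0x2737.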
2. After obtaining the thread dump, check what the thread is doing.
Take a thread dump of the application with PID 10029, then find the thread whose nid is 0x2737.
"NioProcessor-2" prio=10 tid=0x0a8d2800 nid=0x2737 runnable [0x49aa5000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
        - locked <0x74c52678> (a sun.nio.ch.Util$1)
        - locked <0x74c52668> (a java.util.Collections$UnmodifiableSet)
        - locked <0x74c501b0> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
        at external.org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:65)
        at external.org.apache.mina.common.AbstractPollingIoProcessor$Worker.run(AbstractPollingIoProcessor.java:708)
        at external.org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Take thread dumps several times an hour and compare how the state of the thread changes to pin down the problem.
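The lookup in Example 1, from the decimal LWP number to the nid in the dump, can be sketched as follows. NidLookup and its methods are hypothetical helpers. They assume the dump is available as text and that every thread entry starts with a quoted thread name, as in the dumps shown in this article.

```java
final class NidLookup {
    // Converts a decimal LWP number from ps into the nid notation of a thread dump.
    static String toNid(long lwp) {
        return "nid=0x" + Long.toHexString(lwp);
    }

    // Returns the header line of the thread with the given LWP, or null if absent.
    static String findThreadHeader(String dump, long lwp) {
        String nid = toNid(lwp);
        for (String line : dump.split("\\R")) {
            boolean isHeader = line.startsWith("\"");
            // Match the whole nid token so that 0x2737 does not match 0x27371.
            if (isHeader && (line.contains(nid + " ") || line.endsWith(nid))) {
                return line;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String dump =
            "\"NioProcessor-1\" prio=10 tid=0x0a8d2000 nid=0x27371 runnable [0x49aa4000]\n"
            + "   java.lang.Thread.State: RUNNABLE\n"
            + "\"NioProcessor-2\" prio=10 tid=0x0a8d2800 nid=0x2737 runnable [0x49aa5000]\n"
            + "   java.lang.Thread.State: RUNNABLE\n";
        System.out.println(toNid(10039));                  // nid=0x2737
        System.out.println(findThreadHeader(dump, 10039)); // the NioProcessor-2 header
    }
}
```

The NioProcessor-1 line in main is invented sample data that only checks that a longer nid is not matched by accident.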
Example 2: When Processing Performance Is Abnormally Slow
After taking thread dumps several times, look for the group of threads in the BLOCKED state.
"DB-Processor-13" daemon prio=5 tid=0x003edf98 nid=0xca waiting for monitor entry [0x000000000825f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at beans.ConnectionPool.getConnection(ConnectionPool.java:102)
        - waiting to lock <0xe0375410> (a beans.ConnectionPool)
        at beans.cus.ServiceCnt.getTodayCount(ServiceCnt.java:111)
        at beans.cus.ServiceCnt.insertCount(ServiceCnt.java:43)

"DB-Processor-14" daemon prio=5 tid=0x003edf98 nid=0xca waiting for monitor entry [0x000000000825f020]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at beans.ConnectionPool.getConnection(ConnectionPool.java:102)
        - waiting to lock <0xe0375410> (a beans.ConnectionPool)
        at beans.cus.ServiceCnt.getTodayCount(ServiceCnt.java:111)
        at beans.cus.ServiceCnt.insertCount(ServiceCnt.java:43)

"DB-Processor-3" daemon prio=5 tid=0x00928248 nid=0x8b waiting for monitor entry [0x000000000825d080]
   java.lang.Thread.State: RUNNABLE
        at oracle.jdbc.driver.OracleConnection.isClosed(OracleConnection.java:570)
        - waiting to lock <0xe03ba2e0> (a oracle.jdbc.driver.OracleConnection)
        at beans.ConnectionPool.getConnection(ConnectionPool.java:112)
        - locked <0xe0386580> (a java.util.Vector)
        - locked <0xe0375410> (a beans.ConnectionPool)
        at beans.cus.Cue_1700c.GetNationList(Cue_1700c.java:66)
        at org.apache.jsp.cue_1700c_jsp._jspService(cue_1700c_jsp.java:120)
If the threads are BLOCKED, extract the threads related to the lock that the threads are trying to obtain.
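The thread dump confirms that the threads stay BLOCKED because lock <0xe0375410> cannot be obtained. The problem can be solved by analyzing the stack trace of the thread that currently holds the lock, which is DB-Processor-3 in this dump.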
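There are two reasons why this pattern appears frequently in applications that use a DBMS.
The first is inadequate configuration. The threads are still working, but they cannot deliver their best performance because the database connection pool is not configured adequately. If you take and compare thread dumps several times, you will see that some of the threads that were BLOCKED earlier are now in a different state.
The second is an abnormal connection. When the connection to the DBMS stays abnormal, the threads wait until they time out. In this case, even after taking and comparing thread dumps several times, you will see that the threads related to the DBMS remain BLOCKED. Setting appropriate values, such as a timeout, shortens the time during which the problem occurs.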
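To illustrate the effect of a timeout, here is a minimal sketch of a pool whose borrow operation gives up after a limited time instead of waiting indefinitely on a monitor. TimedPool, borrow, and giveBack are names invented for this example. It is not the ConnectionPool class from the dump, and it is not a real connection pool implementation.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class TimedPool<T> {
    private final BlockingQueue<T> idle;

    TimedPool(Collection<T> resources) {
        idle = new LinkedBlockingQueue<>(resources);
    }

    // Returns an idle resource, or null if none became free within the timeout.
    T borrow(long timeout, TimeUnit unit) {
        try {
            return idle.poll(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    // Puts a resource back so that waiting threads can take it.
    void giveBack(T resource) {
        idle.offer(resource);
    }

    public static void main(String[] args) {
        TimedPool<String> pool = new TimedPool<>(Arrays.asList("conn-1"));
        String first = pool.borrow(100, TimeUnit.MILLISECONDS);
        String second = pool.borrow(100, TimeUnit.MILLISECONDS); // pool is empty
        System.out.println(first + ", " + second);               // conn-1, null
        pool.giveBack(first);
    }
}
```

A thread waiting in borrow shows up in a thread dump as TIMED_WAITING and returns to the caller when the timeout expires, so the caller can report an error instead of adding to a growing group of BLOCKED threads.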
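Coding for Easy Thread Dump Analysis
Naming Threads
A thread created with the java.lang.Thread class is named Thread-(Number). A thread created by the default thread factory of java.util.concurrent (Executors.defaultThreadFactory()) is named pool-(Number)-thread-(Number). When you analyze the tens to hundreds of threads of an application, the analysis becomes very difficult if all of them still carry their default names, because the threads are hard to tell apart.
Therefore, make it a habit to name threads whenever you create them.
When you create a thread with java.lang.Thread, you can pass a custom name as a constructor argument.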
public Thread(Runnable target, String name);
public Thread(ThreadGroup group, String name);
public Thread(ThreadGroup group, Runnable target, String name);
public Thread(ThreadGroup group, Runnable target, String name, long stackSize);
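A short usage sketch of these constructors follows. The class NamedThreadDemo and the thread names are invented for illustration. The second thread shows that a thread created without a name gets a default Thread-(Number) name and can still be renamed with setName.

```java
final class NamedThreadDemo {
    // Returns the names of a thread named at creation and a thread renamed later.
    static String[] names() {
        Runnable task = () -> { };

        // Name passed through the constructor.
        Thread named = new Thread(task, "order-import-worker");

        // No name given: the JVM assigns a default name such as Thread-0.
        Thread renamed = new Thread(task);
        String defaultName = renamed.getName();
        renamed.setName("cache-refresh-worker");

        return new String[] { named.getName(), defaultName, renamed.getName() };
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

A name that states the purpose of the thread makes it much easier to find the relevant entries among hundreds of threads in a dump.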
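When you create threads through java.util.concurrent.ThreadFactory, you can name them by writing your own ThreadFactory. If you do not need any special functionality, you can use MyThreadFactory as shown below.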
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class MyThreadFactory implements ThreadFactory {
    // One pool counter per pool name, shared by all factory instances.
    private static final ConcurrentHashMap<String, AtomicInteger> POOL_NUMBER =
            new ConcurrentHashMap<String, AtomicInteger>();
    private final ThreadGroup group;
    private final AtomicInteger threadNumber = new AtomicInteger(1);
    private final String namePrefix;

    public MyThreadFactory(String threadPoolName) {
        if (threadPoolName == null) {
            throw new NullPointerException("threadPoolName");
        }
        POOL_NUMBER.putIfAbsent(threadPoolName, new AtomicInteger());

        SecurityManager securityManager = System.getSecurityManager();
        group = (securityManager != null) ? securityManager.getThreadGroup()
                : Thread.currentThread().getThreadGroup();

        AtomicInteger poolCount = POOL_NUMBER.get(threadPoolName);
        if (poolCount == null) {
            namePrefix = threadPoolName + " pool-00-thread-";
        } else {
            namePrefix = threadPoolName + " pool-" + poolCount.getAndIncrement() + "-thread-";
        }
    }

    @Override
    public Thread newThread(Runnable runnable) {
        // Thread names look like "<poolName> pool-<n>-thread-<m>".
        Thread thread = new Thread(group, runnable,
                namePrefix + threadNumber.getAndIncrement(), 0);
        if (thread.isDaemon()) {
            thread.setDaemon(false);
        }
        if (thread.getPriority() != Thread.NORM_PRIORITY) {
            thread.setPriority(Thread.NORM_PRIORITY);
        }
        return thread;
    }
}
Obtaining More Detailed Information by Using MBean
You can obtain ThreadInfo objects through an MBean. ThreadInfo also gives you information that is hard to get from a thread dump.
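import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadInfoPrinter {
    public static void main(String[] args) {
        ThreadMXBean mxBean = ManagementFactory.getThreadMXBean();
        long[] threadIds = mxBean.getAllThreadIds();
        ThreadInfo[] threadInfos = mxBean.getThreadInfo(threadIds);

        for (ThreadInfo threadInfo : threadInfos) {
            if (threadInfo == null) {
                continue; // the thread ended after its ID was collected
            }
            System.out.println(threadInfo.getThreadName());
            System.out.println(threadInfo.getBlockedCount());
            // The two time values are -1 unless thread contention monitoring is enabled.
            System.out.println(threadInfo.getBlockedTime());
            System.out.println(threadInfo.getWaitedCount());
            System.out.println(threadInfo.getWaitedTime());
        }
    }
}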
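With the methods of ThreadInfo you can find out how long threads have been WAITING or BLOCKED, and from that you can build a list of threads that have been inactive for an abnormally long time.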
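The time values returned by ThreadInfo are only collected while thread contention monitoring is switched on. The sketch below, with a made-up class name ContentionMonitorDemo, enables it through ThreadMXBean when the JVM supports it and then reads the blocked time of the current thread.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class ContentionMonitorDemo {
    // Returns the accumulated blocked time of the current thread in
    // milliseconds, or -1 if contention monitoring is not available.
    static long blockedMillisOfCurrentThread() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (mx.isThreadContentionMonitoringSupported()
                && !mx.isThreadContentionMonitoringEnabled()) {
            mx.setThreadContentionMonitoringEnabled(true);
        }
        ThreadInfo info = mx.getThreadInfo(Thread.currentThread().getId());
        return info == null ? -1 : info.getBlockedTime();
    }

    public static void main(String[] args) {
        System.out.println("blocked ms: " + blockedMillisOfCurrentThread());
    }
}
```

The times are measured from the moment monitoring is enabled, so it is best switched on early in the life of the application if these values are to be compared between threads.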