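The worker I develop has hung in production once every few months, and I had never managed to track down the cause. Today I finally put the matter to rest, so here is a summary of how I solved it.
First, use the jstack command to print the stacks of all threads in the process. Once you have the thread dump file, search it for the name of your own worker.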
"DefaultQuartzScheduler_Worker-10" prio=10 tid=0x00007f55cd54d800 nid=0x3e2e waiting for monitor entry [0x00007f51ab8f7000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.jd.chat.worker.service.impl.NewPopAccountSyncServiceImpl.addAccounts(NewPopAccountSyncServiceImpl.java:86)
    - waiting to lock <0x0000000782359268> (a com.jd.chat.worker.service.impl.NewPopAccountSyncServiceImpl)
    at com.jd.chat.worker.service.timer.AccountIncSyncTimer.run(AccountIncSyncTimer.java:114)
    at com.jd.chat.worker.service.timer.AbstractTimer.start(AbstractTimer.java:44)
    at com.jd.chat.worker.service.timer.AbstractTimer.doJob(AbstractTimer.java:49)
    at com.jd.chat.worker.web.context.StartAppListener$TimerJob.execute(StartAppListener.java:188)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
    - locked <0x0000000783641c68> (a java.lang.Object)
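The "waiting to lock" target in this dump is the NewPopAccountSyncServiceImpl instance itself, which is consistent with addAccounts being a synchronized method (or using a synchronized(this) block). As a rough illustration of how that produces a BLOCKED thread, here is a minimal sketch. SyncService and the demo thread names are hypothetical stand-ins, not the author's code, and a latch stands in for a call that never returns.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.LockSupport;

final class MonitorEntryDemo {

    /** Hypothetical stand-in for the worker service; not the author's class. */
    static final class SyncService {
        private final CountDownLatch entered = new CountDownLatch(1);
        private final CountDownLatch release = new CountDownLatch(1);

        // synchronized: only one thread per instance may be inside at a time.
        synchronized void addAccounts() {
            entered.countDown();
            try {
                release.await();              // simulates a call that does not return
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Starts a holder and a second caller, and returns the second caller's state. */
    static Thread.State stateOfSecondCaller() {
        SyncService service = new SyncService();
        Thread holder = new Thread(service::addAccounts, "demo-worker-1");
        Thread waiter = new Thread(service::addAccounts, "demo-worker-10");
        holder.start();
        try {
            service.entered.await();          // the holder now owns the monitor
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        waiter.start();
        // Poll (for at most 5 seconds) until the second caller is stuck at the monitor.
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (waiter.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
            LockSupport.parkNanos(1_000_000L);
        }
        Thread.State observed = waiter.getState();
        service.release.countDown();          // let both threads finish
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("second caller: " + stateOfSecondCaller());
    }
}
```

The point of the sketch is that the BLOCKED thread is only a symptom: it says nothing about why the thread that owns the monitor has not left the method.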
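The dump quickly shows which line the thread is blocked on. That alone, however, is not enough to find the real cause. Here I recommend a tool called tda.bat. A colleague gave it to me, and it should be available for download online. Import the dump file into tda and find the blocked thread; blocked threads are shown in red.
The reason I like this tool is that once you have found the blocked thread, the lower part of the window shows a more detailed thread stack explaining the block. Here is part of that stack: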
    at org.mariadb.jdbc.MySQLPreparedStatement.execute(MySQLPreparedStatement.java:141)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
    at com.ibatis.sqlmap.engine.execution.SqlExecutor.executeUpdate(SqlExecutor.java:80)
    at com.ibatis.sqlmap.engine.mapping.statement.MappedStatement.sqlExecuteUpdate(MappedStatement.java:216)
    at com.ibatis.sqlmap.engine.mapping.statement.MappedStatement.executeUpdate(MappedStatement.java:94)
    at com.ibatis.sqlmap.engine.impl.SqlMapExecutorDelegate.update(SqlMapExecutorDelegate.java:457)
    at com.ibatis.sqlmap.engine.impl.SqlMapSessionImpl.update(SqlMapSessionImpl.java:90)
    at org.springframework.orm.ibatis.SqlMapClientTemplate$9.doInSqlMapClient(SqlMapClientTemplate.java:380)
    at org.springframework.orm.ibatis.SqlMapClientTemplate$9.doInSqlMapClient(SqlMapClientTemplate.java:1)
    at org.springframework.orm.ibatis.SqlMapClientTemplate.execute(SqlMapClientTemplate.java:200)
    at org.springframework.orm.ibatis.SqlMapClientTemplate.update(SqlMapClientTemplate.java:378)
    at com.jd.im.data.dataresource.ImSqlMapClientTemplate.retriedWithoutAnyInterventionUpdate(ImSqlMapClientTemplate.java:169)
    at com.jd.im.data.dataresource.ImSqlMapClientTemplate.update(ImSqlMapClientTemplate.java:137)
    at com.jd.chat.dao.impl.WriteDaoImpl.update(WriteDaoImpl.java:21)
    at com.jd.chat.zone.service.impl.GroupServiceImpl.updateRoute(GroupServiceImpl.java:766)
    at com.jd.chat.worker.service.impl.NewPopAccountSyncServiceImpl.addAccounts(NewPopAccountSyncServiceImpl.java:267)
    - locked <0x0000000782359268> (a com.jd.chat.worker.service.impl.NewPopAccountSyncServiceImpl)
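This is the stack that is actually useful! It belongs to the thread that holds the monitor <0x0000000782359268>, the very lock the blocked thread is waiting for. It told me that this thread was executing SQL, the SQL ran into a deadlock and never returned, and so the other thread stayed blocked. It also provides something even more useful: which SQL caused the deadlock. The third line from the bottom of the stack (GroupServiceImpl.updateRoute) points to that SQL.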
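When code can be run inside the affected JVM, a similar "blocked thread plus the stack of the lock holder" view can be produced with the standard java.lang.management API. The following is only a sketch under that assumption. The class name LockOwnerReport is made up for illustration, and this is not a description of how tda works internally.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class LockOwnerReport {

    /** For every BLOCKED thread, reports the lock it wants and the owner's stack. */
    static String describeBlockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        for (ThreadInfo waiter : mx.dumpAllThreads(false, false)) {
            if (waiter == null || waiter.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            out.append('"').append(waiter.getThreadName()).append("\" waiting to lock ")
               .append(waiter.getLockName()).append('\n');
            long ownerId = waiter.getLockOwnerId();      // -1 when no owner is known
            if (ownerId < 0) {
                continue;
            }
            ThreadInfo owner = mx.getThreadInfo(ownerId, Integer.MAX_VALUE);
            if (owner == null) {
                continue;                                // owner exited in the meantime
            }
            out.append("  held by \"").append(owner.getThreadName()).append("\" (")
               .append(owner.getThreadState()).append(")\n");
            // The owner's frames show what it is doing while holding the lock.
            for (StackTraceElement frame : owner.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describeBlockedThreads());
    }
}
```

Run on its own, main will usually print nothing because no thread is blocked; the method is meant to be called from inside a process that is actually hanging, for example from some diagnostic hook.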
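But do you really need this tool to find the specific cause? Of course not!
Let me show you how to find the real cause of the block without any tool.
We just found the thread stack by searching for the keyword "BLOCKED", which gave us the thread name "DefaultQuartzScheduler_Worker-10". Now change the trailing 10 to 1, giving "DefaultQuartzScheduler_Worker-1", and search the whole thread dump for that name: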
"DefaultQuartzScheduler_Worker-1" prio=10 tid=0x00007f55cd2aa000 nid=0x3e25 runnable [0x00007f51b02c0000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    - locked <0x0000000791370d50> (a java.io.BufferedInputStream)
    at org.mariadb.jdbc.internal.common.packet.buffer.ReadUtil.readFully(ReadUtil.java:82)
    at org.mariadb.jdbc.internal.common.packet.buffer.ReadUtil.readFully(ReadUtil.java:92)
    at org.mariadb.jdbc.internal.common.packet.RawPacket.nextPacket(RawPacket.java:77)
    at org.mariadb.jdbc.internal.common.packet.SyncPacketFetcher.getRawPacket(SyncPacketFetcher.java:67)
    at org.mariadb.jdbc.internal.mysql.MySQLProtocol.getResult(MySQLProtocol.java:891)
    at org.mariadb.jdbc.internal.mysql.MySQLProtocol.executeQuery(MySQLProtocol.java:982)
    at org.mariadb.jdbc.MySQLStatement.execute(MySQLStatement.java:280)
    - locked <0x0000000791370678> (a org.mariadb.jdbc.internal.mysql.MySQLProtocol)
    at org.mariadb.jdbc.MySQLPreparedStatement.execute(MySQLPreparedStatement.java:141)
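The above is part of that thread's stack. It is the same stack that tda displayed in the lower part of its window: the real stack behind the blocked thread! This thread is in the RUNNABLE state, sitting in socketRead0 and waiting for a response from MySQL, but unfortunately MySQL is locked up. In other words, the thread is stuck inside MySQL.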
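Guessing the thread name works here, but a more general way to do the same search by hand is to take the address from the "waiting to lock <...>" line and look for the thread whose stack contains "locked <...>" with the same address. A minimal sketch of that pairing over a saved jstack dump follows. The class name JstackLockFinder is hypothetical, and the parsing is deliberately simplistic: it assumes the usual jstack text layout and keeps only the last thread seen holding each address.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JstackLockFinder {
    // A thread section starts with the quoted thread name.
    private static final Pattern THREAD_HEADER = Pattern.compile("^\"([^\"]+)\".*");
    private static final Pattern WAITING =
            Pattern.compile("-\\s*waiting to lock\\s*<(0x[0-9a-fA-F]+)>");
    private static final Pattern LOCKED =
            Pattern.compile("-\\s*locked\\s*<(0x[0-9a-fA-F]+)>");

    /** Maps each waiting thread to the thread that reports holding the monitor it wants. */
    static Map<String, String> findOwners(List<String> dumpLines) {
        Map<String, String> ownerByAddress = new LinkedHashMap<>();
        Map<String, String> wantedByThread = new LinkedHashMap<>();
        String current = null;
        for (String line : dumpLines) {
            Matcher header = THREAD_HEADER.matcher(line);
            if (header.matches()) {
                current = header.group(1);
                continue;
            }
            if (current == null) {
                continue;                      // preamble before the first thread
            }
            Matcher waiting = WAITING.matcher(line);
            if (waiting.find()) {
                wantedByThread.put(current, waiting.group(1));
            }
            Matcher locked = LOCKED.matcher(line);
            if (locked.find()) {
                ownerByAddress.put(locked.group(1), current);
            }
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : wantedByThread.entrySet()) {
            String owner = ownerByAddress.get(entry.getValue());
            if (owner != null) {
                result.put(entry.getKey(), owner);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        findOwners(lines).forEach((waiter, owner) ->
                System.out.println(waiter + " is waiting for a monitor held by " + owner));
    }
}
```

On Java 11 or later this can be run directly as `java JstackLockFinder.java dump.txt`, where dump.txt is whatever file the jstack output was saved to. The owner's own stack then shows what it is doing while holding the lock.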
It feels like a story in which the investigation of Zhang San's murder leads to the discovery of Li Si's.