咱們線上5臺solr讀服務器,配置同樣,可是相同的請求,其中一臺響應時間明顯比其餘4臺慢,咱們想經過arthas來定位具體哪裏執行慢。html
阿里開源的java調試工具,能解決以下的問題:java
從咱們實際使用經驗來講,能夠解決兩個常見的問題:git
咱們分別在2臺測試環境和1臺線上環境測試arthas,發如今2臺測試環境中出現以下兩個問題:github
建議: 上述2問題,在線上生產環境沒遇到,可是不建議在線上正常運行環境中使用arthas;但線上java程序遇到重大問題,能夠考慮使用arthas快速定位問題,可是前提是能接受咱們遇到的兩個坑。apache
# 安裝 curl -L https://alibaba.github.io/arthas/install.sh | sh # 若是無法正常安裝,提示SSL connect error,執行:yum -y update nss,更新nss # 打開 ./as.sh # 若是java進程使用非root帳戶啓動,如solr用戶啓動,則使用以下啓動腳本 sudo -u solr -EH ./as.sh # 查看cpu佔用高的3個線程 相似top -H,可是能看到堆棧信息 thread -n 3 # 跟蹤某個方法的調用時間 trace 包名.類名 方法名 # 退出arthas exit
solr全部請求入口都是org.apache.solr.handler.StandardRequestHandler,咱們跟蹤此入口,可是arthas無法跟蹤到完整代碼堆棧,尤爲是遇到抽象方法之類,得從新跟蹤。tomcat
# 啓動 [root@ arthas]# sudo -u solr -EH ./as.sh Arthas script version: 3.0.4.2 ERROR: '' does not exist. JAVA_HOME: /usr/local/java/jdk1.8.0_05 Found existing java process, please choose one and hit RETURN. * [1]: 10433 org.apache.catalina.startup.Bootstrap [2]: 10786 org.apache.catalina.startup.Bootstrap [3]: 11170 org.apache.catalina.startup.Bootstrap [4]: 11372 org.apache.catalina.startup.Bootstrap [5]: 10606 org.apache.catalina.startup.Bootstrap [6]: 10383 org.apache.catalina.startup.Bootstrap [7]: 64118 org.apache.catalina.startup.Bootstrap [8]: 10967 org.apache.catalina.startup.Bootstrap [9]: 11037 org.apache.catalina.startup.Bootstrap [10]: 77756 org.apache.catalina.startup.Bootstrap [11]: 11646 org.apache.catalina.startup.Bootstrap [12]: 11487 org.apache.catalina.startup.Bootstrap # 選擇定位問題的進程 7 Calculating attach execution time... Attaching to 64118 using version 3.0.4... real 0m0.294s user 0m1.354s sys 0m0.143s Attach success. telnet connecting to arthas server... current timestamp is 1542786758 Trying 127.0.0.1... Connected to 127.0.0.1. Escape character is '^]'. ,---. ,------. ,--------.,--. ,--. ,---. ,---. / O \ | .--. ''--. .--'| '--' | / O \ ' .-' | .-. || '--'.' | | | .--. || .-. |`. `-. | | | || |\ \ | | | | | || | | |.-' | `--' `--'`--' '--' `--' `--' `--'`--' `--'`-----' wiki: https://alibaba.github.io/arthas version: 3.0.4 pid: 64118 timestamp: 1542786758978 $ trace org.apache.solr.handler.StandardRequestHandler The argument 'method-pattern' is required # 跟蹤org.apache.solr.handler.StandardRequestHandler handleRequest $ trace org.apache.solr.handler.StandardRequestHandler handleRequest No class or method is affected, try: 1. sm CLASS_NAME METHOD_NAME to make sure the method you are tracing actually exists (it might be in your parent class). 2. reset CLASS_NAME and try again, your method body might be too large. 3. visit middleware-container/arthas/issues/278 for more detail $ trace org.apache.solr.handler.RequestHandlerBase handleRequest Press Ctrl+C to abort. Affect(class-cnt:21 , method-cnt:1) cost in 8197 ms. `---ts=2018-11-21 15:53:08;thread_name=http-nio-10070-exec-51;id=3a973;is_daemon=true;priority=5;TCCL=org.apache.catalina.loader.WebappClassLoader@67c5a31d `---[193.784245ms] org.apache.solr.handler.RequestHandlerBase:handleRequest() +---[0.026761ms] java.util.concurrent.atomic.AtomicLong:incrementAndGet() +---[0.010342ms] org.apache.solr.util.stats.Timer:time() +---[0.010381ms] org.apache.solr.util.SolrPluginUtils:setDefaults() +---[0.010746ms] org.apache.solr.response.SolrQueryResponse:setHttpCaching() # 該方法最耗時,須要進一步跟蹤 +---[193.214175ms] org.apache.solr.handler.RequestHandlerBase:handleRequestBody() +---[0.01285ms] org.apache.solr.response.SolrQueryResponse:getResponseHeader() +---[0.036761ms] org.apache.solr.common.util.NamedList:get() `---[0.063972ms] org.apache.solr.util.stats.TimerContext:stop() # 繼續跟蹤 $ trace org.apache.solr.handler.RequestHandlerBase handleRequestBody Press Ctrl+C to abort. Affect(class-cnt:21 , method-cnt:16) cost in 8933 ms. `---ts=2018-11-21 15:53:35;thread_name=http-nio-10070-exec-35;id=2f44f;is_daemon=true;priority=5;TCCL=org.apache.catalina.loader.WebappClassLoader@67c5a31d `---[175.443917ms] org.apache.solr.handler.component.SearchHandler:handleRequestBody() +---[0.019261ms] org.apache.solr.request.SolrQueryRequest:getContentStreams() +---[0.009544ms] org.apache.solr.handler.component.ResponseBuilder:<init>() +---[0.015031ms] org.apache.solr.request.SolrRequestInfo:setResponseBuilder() +---[min=6.36E-4ms,max=0.003173ms,total=0.005898ms,count=3] org.apache.solr.request.SolrQueryRequest:getParams() +---[min=7.02E-4ms,max=0.024157ms,total=0.024859ms,count=2] org.apache.solr.common.params.SolrParams:getBool() +---[0.004623ms] org.apache.solr.handler.component.ResponseBuilder:setDebug() +---[0.012386ms] org.apache.solr.common.params.SolrParams:getParams() +---[0.006289ms] org.apache.solr.util.SolrPluginUtils:getDebugInterests() +---[min=7.88E-4ms,max=0.003469ms,total=0.004257ms,count=2] org.apache.solr.handler.component.ResponseBuilder:isDebug() +---[0.021305ms] org.apache.solr.handler.component.ShardHandlerFactory:getShardHandler() +---[0.040045ms] org.apache.solr.handler.component.ShardHandler:checkDistributed() +---[min=7.07E-4ms,max=0.015151ms,total=0.015858ms,count=2] java.util.List:iterator() +---[min=5.35E-4ms,max=0.019616ms,total=0.029199ms,count=16] java.util.Iterator:hasNext() +---[min=4.93E-4ms,max=0.004724ms,total=0.011896ms,count=14] java.util.Iterator:next() +---[min=9.53E-4ms,max=0.69593ms,total=0.735595ms,count=7] org.apache.solr.handler.component.SearchComponent:prepare() `---[min=8.73E-4ms,max=173.993692ms,total=174.025895ms,count=7] org.apache.solr.handler.component.SearchComponent:process() ...... 重複上面的工做
最後定位到是org.apache.lucene.search.Weight類的scorer方法慢,該方法功能是計算排序分。bash