OOM (OutOfMemoryError) problems ultimately come down to three causes:
Applied to the analysis of a Java service, those three causes can also be read as:
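So, following that approach, there are three quick checks for pinning down an OOM problem; they are worked through as steps (1) to (3) below.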
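Take a production tomcat as the example. It runs 5 Java projects built on the SSM stack, needs about 60 seconds to start, and occasionally throws an OOM after running for a while. Let's check each point in turn:
(1) Confirm whether the allocated memory is simply too small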
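On the server (8 cores, 16 GB), run top to watch how java's memory changes during startup and, while there, note the java process ID: 10397
Then run jmap -heap 10397 and look at the usage of the heap, the young generation and the old generation. In the captured output below (taken from a run where the process ID was 1246), Eden is about 58% used, the old generation about 57% and the permanent generation about 50%, so each is roughly half full. That rules out memory simply being allocated too small.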
wen@S189919:/opt/tomcat8$ jmap -heap 1246
Attaching to process ID 1246, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 24.65-b04

using thread-local object allocation.
Parallel GC with 8 thread(s)

Heap Configuration:
   MinHeapFreeRatio = 0
   MaxHeapFreeRatio = 100
   MaxHeapSize      = 4208984064 (4014.0MB)
   NewSize          = 1310720 (1.25MB)
   MaxNewSize       = 17592186044415 MB
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2
   SurvivorRatio    = 8
   PermSize         = 21757952 (20.75MB)
   MaxPermSize      = 85983232 (82.0MB)
   G1HeapRegionSize = 0 (0.0MB)

Heap Usage:
PS Young Generation
Eden Space:
   capacity = 1172307968 (1118.0MB)
   used     = 679248008 (647.781379699707MB)
   free     = 493059960 (470.21862030029297MB)
   57.94108941857845% used
From Space:
   capacity = 85983232 (82.0MB)
   used     = 0 (0.0MB)
   free     = 85983232 (82.0MB)
   0.0% used
To Space:
   capacity = 115343360 (110.0MB)
   used     = 0 (0.0MB)
   free     = 115343360 (110.0MB)
   0.0% used
PS Old Generation
   capacity = 259522560 (247.5MB)
   used     = 147065016 (140.25212860107422MB)
   free     = 112457544 (107.24787139892578MB)
   56.667526707504734% used
PS Perm Generation
   capacity = 63963136 (61.0MB)
   used     = 32219528 (30.72693634033203MB)
   free     = 31743608 (30.27306365966797MB)
   50.37202678742956% used

16612 interned Strings occupying 2080416 bytes.
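The same kind of figures can also be read from inside the JVM through the standard java.lang.management API, which may help when attaching jmap to a production process is not convenient. The following is only a minimal sketch: the class name HeapUsageReport and the percentUsed helper are made up for illustration, the code has to run inside the JVM you want to inspect, and the pool names it prints depend on the garbage collector in use. The percentage is computed against the committed size, which appears to correspond to the "capacity" figure in the jmap output above rather than to MaxHeapSize.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class HeapUsageReport {

    // Percentage of `limit` taken up by `used`; returns -1 when the limit is undefined or zero.
    static double percentUsed(long used, long limit) {
        if (limit <= 0) {
            return -1.0;
        }
        return 100.0 * used / limit;
    }

    public static void main(String[] args) {
        // Whole-heap figures; getMax() may be -1 when the JVM reports no defined maximum.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("heap: used=%d MB, committed=%d MB, max=%d MB%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);

        // One line per memory pool (Eden, survivor, old generation, and so on).
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            System.out.printf("%-28s %6.1f%% of committed%n",
                    pool.getName(), percentUsed(usage.getUsed(), usage.getCommitted()));
        }
    }
}
```

If every pool sits well below its committed size, and the committed size is itself far below the configured maximum, an undersized heap is an unlikely explanation.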
(2) Find the objects that use the most memory
jmap -histo:live 1246 | more
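The command prints the live objects as a table sorted by the memory they occupy, with these columns:
Number of instances
Memory occupied (bytes)
Class name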
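Looking at the output, the largest entry is [B, which is the JVM's internal name for byte[], and even that occupies only about 72 MB. Against the memory available on this machine that is negligible.
If one class of objects turns out to occupy a very large amount of memory (several GB, for example), the likely cause is that too many instances are being created and never released. For example:
A resource is acquired but never released with close() or dispose()
Consumers are slow (or have stopped consuming) while producers keep putting tasks on the queue, so tasks pile up in the queue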
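Both patterns above have well-known counter-measures in plain Java, sketched below under some assumptions: TrackedResource is a hypothetical stand-in for any resource with a close() method, and the queue capacity of 3 is arbitrary. try-with-resources guarantees that close() runs, and a bounded LinkedBlockingQueue makes offer() return false once the queue is full instead of letting the backlog grow until the heap is exhausted.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BoundedQueueSketch {

    // Hypothetical resource used only for illustration; close() records that it ran.
    static class TrackedResource implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    // Opens a resource in try-with-resources and reports whether close() was called.
    static boolean closedAfterUse() {
        TrackedResource resource = new TrackedResource();
        try (TrackedResource held = resource) {
            // Work with `held` here; close() runs even if this block throws.
        }
        return resource.closed;
    }

    // Tries to enqueue `count` tasks and returns how many the queue accepted.
    static int submitAll(BlockingQueue<String> queue, int count) {
        int accepted = 0;
        for (int i = 0; i < count; i++) {
            // offer() returns false when a bounded queue is full, so the backlog cannot grow.
            if (queue.offer("task-" + i)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // No consumer is running, so only `capacity` tasks fit.
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(3);
        System.out.println("accepted: " + submitAll(queue, 10)); // prints "accepted: 3"
        System.out.println("closed: " + closedAfterUse());       // prints "closed: true"
    }
}
```

What to do with a rejected task (retry, drop, or block the producer) is a design decision this sketch leaves open.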
wen@S189919:/opt/tomcat8$ jmap -histo:live 1246 | more

 num     #instances         #bytes  class name
----------------------------------------------
   1:         79073       72095344  [B
   2:        103049       13630576  [C
   3:         57516        8155328  <constMethodKlass>
   4:         57516        7373456  <methodKlass>
   5:          5413        6128216  <constantPoolKlass>
   6:          5413        3861128  <instanceKlassKlass>
   7:          4455        3264960  <constantPoolCacheKlass>
   8:        101128        2427072  java.lang.String
   9:         46704        1868160  java.lang.ref.Finalizer
  10:          5314        1486584  [Ljava.util.HashMap$Entry;
  11:         22264        1419160  [Ljava.lang.Object;
  12:         17286        1382880  java.lang.reflect.Method
  13:         20810        1165360  java.util.zip.ZipFile$ZipFileInputStream
  14:         20389        1141784  java.util.zip.ZipFile$ZipFileInflaterInputStream
  15:         34592        1106944  java.util.HashMap$Entry
  16:          1963        1075048  <methodDataKlass>
  17:          1762         943992  [I
  18:         22136         708352  java.util.concurrent.ConcurrentHashMap$HashEntry
  19:          5866         704008  java.lang.Class
  20:         14549         581960  java.util.LinkedHashMap$Entry
  21:         21158         507792  java.util.ArrayList
  22:          7742         453448  [S
  23:          8839         450464  [[I
  24:          7362         412272  java.util.LinkedHashMap
  25:          3735         328416  [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
  26:         14544         322536  [Ljava.lang.Class;
  27:          7350         294000  com.sun.org.apache.xerces.internal.dom.DeferredTextImpl
  28:          2973         273488  [Ljava.util.WeakHashMap$Entry;
  29:          6660         266400  com.sun.org.apache.xerces.internal.dom.DeferredAttrImpl
  30:          5394         258912  java.util.HashMap
  31:          6441         257640  javax.servlet.jsp.tagext.TagAttributeInfo
  32:           436         237184  <objArrayKlassKlass>
  33:         14200         227200  java.lang.Object
  34:          2783         222640  sun.net.www.protocol.jar.URLJarFile
  35:          3914         219184  com.sun.org.apache.xerces.internal.dom.DeferredElementImpl
  36:          6016         192512  java.util.concurrent.locks.ReentrantLock$NonfairSync
  37:          4328         173120  java.lang.ref.SoftReference
  38:          2970         166320  java.util.WeakHashMap
  39:          3200         153184  [Ljava.lang.String;
  40:          3735         149400  java.util.concurrent.ConcurrentHashMap$Segment
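The bracketed names in the histogram ([B, [C, [[I, [Ljava.lang.Object; and so on) are JVM array type descriptors: each leading [ is one array dimension, a single letter stands for a primitive type, and L...; wraps a class name. As a small illustration, the sketch below turns them into the form you would write in source code. The class and method names are made up for this example, and entries in angle brackets such as <methodKlass> are JVM-internal metadata that it leaves untouched.

```java
public class HistoClassNames {

    // Converts a JVM array descriptor such as "[B" into "byte[]"; other names are returned as-is.
    static String readable(String name) {
        int dims = 0;
        while (dims < name.length() && name.charAt(dims) == '[') {
            dims++;
        }
        if (dims == 0 || dims == name.length()) {
            return name; // not an array descriptor
        }
        String element = name.substring(dims);
        String base;
        switch (element.charAt(0)) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'Z': base = "boolean"; break;
            case 'L':
                // Object arrays look like "Ljava.lang.String;" after the brackets.
                if (!element.endsWith(";")) {
                    return name;
                }
                base = element.substring(1, element.length() - 1);
                break;
            default:
                return name; // unknown form, leave untouched
        }
        StringBuilder result = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            result.append("[]");
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String[] samples = {"[B", "[C", "[[I", "[Ljava.lang.Object;", "java.lang.String"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + readable(sample));
        }
    }
}
```

Read this way, the two largest entries in the table are byte[] and char[], which is typical for a web application that holds strings and I/O buffers.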
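(3) Confirm whether a resource has been exhausted
Taking the sshd process as the example, the open handles (file descriptors) and the threads of a process can be read from: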
/proc/${PID}/fd
/proc/${PID}/task
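The two wc -l results are 8 and 4. Because ll also prints a "total" line and the ./ and ../ entries, that really means 5 open file descriptors and 1 thread, which is nowhere near enough to cause an out-of-memory error.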
root@S189919:/home/wen# ps -aux | grep sshd
Warning: bad ps syntax, perhaps a bogus '-'? See http://procps.sf.net/faq.html
root       749  0.0  0.0  50036  2928 ?        Ss   19:01   0:00 /usr/sbin/sshd -D
root      1321  0.0  0.0  73440  3608 ?        Ss   19:15   0:00 sshd: wen [priv]
wen       1464  0.0  0.0  73440  1528 ?        S    19:15   0:00 sshd: wen@pts/0
root      1585  0.0  0.0   9388   940 pts/0    S+   19:20   0:00 grep --color=auto sshd
root@S189919:/home/wen# ll /proc/749/fd
total 0
dr-x------ 2 root root  0 Sep  4 19:01 ./
dr-xr-xr-x 8 root root  0 Sep  4 19:01 ../
lrwx------ 1 root root 64 Sep  4 19:01 0 -> /dev/null
lrwx------ 1 root root 64 Sep  4 19:01 1 -> /dev/null
lrwx------ 1 root root 64 Sep  4 19:01 2 -> /dev/null
lr-x------ 1 root root 64 Sep  4 19:01 3 -> socket:[7330]
lrwx------ 1 root root 64 Sep  4 19:21 4 -> socket:[7332]
root@S189919:/home/wen# ll /proc/749/task
total 0
dr-xr-xr-x 3 root root 0 Sep  4 19:21 ./
dr-xr-xr-x 8 root root 0 Sep  4 19:01 ../
dr-xr-xr-x 6 root root 0 Sep  4 19:21 749/
root@S189919:/home/wen# ll /proc/749/fd | wc -l
8
root@S189919:/home/wen# ll /proc/749/task | wc -l
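Counting lines of ll output overstates the result by three (the "total" line plus ./ and ../). A direct count of the directory entries avoids that, and for an OOM investigation the process to look at would normally be the java process itself rather than sshd. The sketch below is one way to do this on Linux; the class and method names are made up for illustration, and it reads /proc/self when no PID is given, which is the JVM running the sketch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class ProcCounts {

    // Counts the entries of a directory; Files.list never returns "." or "..".
    static long countEntries(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Converts a line count from `ls -al <dir> | wc -l` into the real number of entries
    // by removing the "total" line and the "." and ".." entries.
    static int entriesFromLongListing(int wcLines) {
        return wcLines - 3;
    }

    public static void main(String[] args) {
        // Pass the PID of the process to inspect; "self" is the current JVM (Linux only).
        String pid = args.length > 0 ? args[0] : "self";
        Path proc = Paths.get("/proc", pid);
        System.out.println("open file descriptors: " + countEntries(proc.resolve("fd")));
        System.out.println("threads: " + countEntries(proc.resolve("task")));
    }
}
```

Applied to the session above, entriesFromLongListing(8) gives 5 file descriptors and entriesFromLongListing(4) gives 1 thread. Reading another user's /proc/<pid>/fd generally requires root or the same user.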
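(4) Merge identical jar packages
In the end, after thinking it over, the most likely explanation seemed to be that the projects load too many third-party jars at startup. So I merged the jars of the 5 SSM projects, letting identical ones overwrite each other, and put them in tomcat's shared lib directory. To do this, edit ${TOMCAT_HOME}/conf/catalina.properties and set shared.loader=${catalina.base}/shared/lib,${catalina.base}/shared/lib/*.jar. Alternatively, all the common jars can be placed under ${TOMCAT_HOME}/lib. After restarting tomcat, loading took 37 seconds. Hopefully this also resolves the OOM problem, so the boss has nothing more to complain about.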
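Before moving jars into a shared directory, it can help to see which ones are actually duplicated across the deployed projects. The sketch below assumes the usual layout of one directory per web application with its jars under WEB-INF/lib; the class name, the method name and the default "webapps" path are made up for illustration. It matches jars by file name only, so two different versions of the same library (with different file names) are not reported as duplicates.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DuplicateJars {

    // Maps each jar file name to the web applications that contain it,
    // keeping only jars that appear in at least two applications.
    static Map<String, List<String>> sharedJars(File webappsDir) {
        Map<String, List<String>> appsByJar = new TreeMap<>();
        File[] apps = webappsDir.listFiles(File::isDirectory);
        if (apps == null) {
            return appsByJar; // directory missing or unreadable
        }
        Arrays.sort(apps);
        for (File app : apps) {
            File[] jars = new File(app, "WEB-INF/lib")
                    .listFiles((dir, name) -> name.endsWith(".jar"));
            if (jars == null) {
                continue; // this application has no WEB-INF/lib
            }
            for (File jar : jars) {
                appsByJar.computeIfAbsent(jar.getName(), key -> new ArrayList<>())
                        .add(app.getName());
            }
        }
        appsByJar.values().removeIf(appNames -> appNames.size() < 2);
        return appsByJar;
    }

    public static void main(String[] args) {
        // Assumed default location; pass the real webapps directory as the first argument.
        File webapps = new File(args.length > 0 ? args[0] : "webapps");
        for (Map.Entry<String, List<String>> entry : sharedJars(webapps).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Jars that show up in every application are the natural candidates for the shared directory; jars whose versions differ between projects need a decision about which version wins before they are merged.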