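People often ask why their application triggered a CMS GC even though Old Gen occupancy had not reached the threshold set by CMSInitiatingOccupancyFraction. It looks baffling, and it is hard to tell where the problem lies.
In fact, CMS GC has many trigger conditions, and the CMSInitiatingOccupancyFraction threshold is only one of them. This article walks through the source code to cover the conditions that trigger a CMS GC, which should help explain the odd CMS GC behavior you run into day to day.
To start, here are a few questions to keep in mind.
- Why did a CMS GC run when Old Gen occupancy was only 50%?
- Can Metaspace usage trigger a CMS GC too?
- Why did a CMS GC run when Old Gen occupancy was very small?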
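Trigger conditions
CMS GC is implemented with two collectors: a foreground collector and a background collector. The foreground collector is relatively simple; the background collector is more complex and has more cases.
The sections below describe the trigger conditions of each in turn.
Note: this article is based on JDK 8.
Note: this article covers only the trigger conditions of CMS GC. The details of the algorithm, and when MSC (mark sweep compact) runs, are out of scope.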
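foreground collector
The foreground collector's trigger condition is simple: generally, when an object allocation finds that there is not enough space, a GC is triggered directly to reclaim space immediately. The algorithm used is mark sweep, without compaction.
background collector
Before going through the trigger conditions of the background collector, here is how it runs. A CMS background thread polls continuously, and on each pass it checks whether any background trigger condition holds. As soon as one does, it performs a background collection.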
    void ConcurrentMarkSweepThread::run() {
      ... // omitted
      while (!_should_terminate) {
        sleepBeforeNextCycle();
        if (_should_terminate) break;
        GCCause::Cause cause = _collector->_full_gc_requested ?
          _collector->_full_gc_cause : GCCause::_cms_concurrent_mark;
        _collector->collect_in_background(false, cause);
      }
      ... // omitted
    }
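On each pass, the thread first waits for CMSWaitDuration and then calls shouldConcurrentCollect to check whether a background collection should start. CMSWaitDuration defaults to 2 s (2000 ms). Services often see frequent CMS GCs; check the interval between consecutive CMS GCs, and if it is about 2 s, you can be fairly sure they come from the CMS background collector.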
    void ConcurrentMarkSweepThread::sleepBeforeNextCycle() {
      while (!_should_terminate) {
        if (CMSIncrementalMode) {
          icms_wait();
          if(CMSWaitDuration >= 0) {
            // Wait until the next synchronous GC, a concurrent full gc
            // request or a timeout, whichever is earlier.
            wait_on_cms_lock_for_scavenge(CMSWaitDuration);
          }
          return;
        } else {
          if(CMSWaitDuration >= 0) {
            // Wait until the next synchronous GC, a concurrent full gc
            // request or a timeout, whichever is earlier.
            wait_on_cms_lock_for_scavenge(CMSWaitDuration);
          } else {
            // Wait until any cms_lock event or check interval not to call shouldConcurrentCollect permanently
            wait_on_cms_lock(CMSCheckInterval);
          }
        }
        // Check if we should start a CMS collection cycle
        if (_collector->shouldConcurrentCollect()) {
          return;
        }
        // .. collection criterion not yet met, let's go back
        // and wait some more
      }
    }
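The "about 2 s between cycles" rule of thumb can be turned into a small check over GC log timestamps. The sketch below is only an illustration: gaps_match_wait_duration, its parameters and the 0.5 s tolerance are hypothetical names and values chosen here, not anything from HotSpot, and the function assumes you have already extracted the start time of each CMS cycle (in seconds) from your logs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true when every gap between consecutive
// CMS cycle start times is within tolerance_s of the polling interval.
// wait_s defaults to 2.0 to match the default CMSWaitDuration (2000 ms).
bool gaps_match_wait_duration(const std::vector<double>& start_times_s,
                              double wait_s = 2.0,
                              double tolerance_s = 0.5) {
  assert(wait_s > 0.0 && tolerance_s >= 0.0);
  if (start_times_s.size() < 2) {
    return false;  // not enough cycles to measure a gap
  }
  for (std::size_t i = 1; i < start_times_s.size(); ++i) {
    double gap = start_times_s[i] - start_times_s[i - 1];
    if (std::fabs(gap - wait_s) > tolerance_s) {
      return false;  // this gap does not look like the polling interval
    }
  }
  return true;
}
```

A real cycle also takes time to run, so observed gaps tend to be somewhat longer than CMSWaitDuration; widen the tolerance accordingly when applying this to real logs.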
So what conditions does shouldConcurrentCollect() check?
    bool CMSCollector::shouldConcurrentCollect() {
      // Trigger case 1
      if (_full_gc_requested) {
        if (Verbose && PrintGCDetails) {
          gclog_or_tty->print_cr("CMSCollector: collect because of explicit "
                                 " gc request (or gc_locker)");
        }
        return true;
      }
      // For debugging purposes, change the type of collection.
      // If the rotation is not on the concurrent collection
      // type, don't start a concurrent collection.
      NOT_PRODUCT(
        if (RotateCMSCollectionTypes &&
            (_cmsGen->debug_collection_type() !=
              ConcurrentMarkSweepGeneration::Concurrent_collection_type)) {
          assert(_cmsGen->debug_collection_type() !=
                   ConcurrentMarkSweepGeneration::Unknown_collection_type,
                 "Bad cms collection type");
          return false;
        }
      )
      FreelistLocker x(this);
      // ------------------------------------------------------------------
      // Print out lots of information which affects the initiation of
      // a collection.
      if (PrintCMSInitiationStatistics && stats().valid()) {
        gclog_or_tty->print("CMSCollector shouldConcurrentCollect: ");
        gclog_or_tty->stamp();
        gclog_or_tty->print_cr("");
        stats().print_on(gclog_or_tty);
        gclog_or_tty->print_cr("time_until_cms_gen_full %3.7f",
          stats().time_until_cms_gen_full());
        gclog_or_tty->print_cr("free="SIZE_FORMAT, _cmsGen->free());
        gclog_or_tty->print_cr("contiguous_available="SIZE_FORMAT,
                               _cmsGen->contiguous_available());
        gclog_or_tty->print_cr("promotion_rate=%g", stats().promotion_rate());
        gclog_or_tty->print_cr("cms_allocation_rate=%g", stats().cms_allocation_rate());
        gclog_or_tty->print_cr("occupancy=%3.7f", _cmsGen->occupancy());
        gclog_or_tty->print_cr("initiatingOccupancy=%3.7f", _cmsGen->initiating_occupancy());
        gclog_or_tty->print_cr("metadata initialized %d",
          MetaspaceGC::should_concurrent_collect());
      }
      // ------------------------------------------------------------------
      // Trigger case 2
      // If the estimated time to complete a cms collection (cms_duration())
      // is less than the estimated time remaining until the cms generation
      // is full, start a collection.
      if (!UseCMSInitiatingOccupancyOnly) {
        if (stats().valid()) {
          if (stats().time_until_cms_start() == 0.0) {
            return true;
          }
        } else {
          // We want to conservatively collect somewhat early in order
          // to try and "bootstrap" our CMS/promotion statistics;
          // this branch will not fire after the first successful CMS
          // collection because the stats should then be valid.
          if (_cmsGen->occupancy() >= _bootstrap_occupancy) {
            if (Verbose && PrintGCDetails) {
              gclog_or_tty->print_cr(
                " CMSCollector: collect for bootstrapping statistics:"
                " occupancy = %f, boot occupancy = %f", _cmsGen->occupancy(),
                _bootstrap_occupancy);
            }
            return true;
          }
        }
      }
      // Trigger case 3
      // Otherwise, we start a collection cycle if
      // old gen want a collection cycle started. Each may use
      // an appropriate criterion for making this decision.
      // XXX We need to make sure that the gen expansion
      // criterion dovetails well with this. XXX NEED TO FIX THIS
      if (_cmsGen->should_concurrent_collect()) {
        if (Verbose && PrintGCDetails) {
          gclog_or_tty->print_cr("CMS old gen initiated");
        }
        return true;
      }
      // Trigger case 4
      // We start a collection if we believe an incremental collection may fail;
      // this is not likely to be productive in practice because it's probably too
      // late anyway.
      GenCollectedHeap* gch = GenCollectedHeap::heap();
      assert(gch->collector_policy()->is_two_generation_policy(),
             "You may want to check the correctness of the following");
      if (gch->incremental_collection_will_fail(true /* consult_young */)) {
        if (Verbose && PrintGCDetails) {
          gclog_or_tty->print("CMSCollector: collect because incremental collection will fail ");
        }
        return true;
      }
      // Trigger case 5
      if (MetaspaceGC::should_concurrent_collect()) {
        if (Verbose && PrintGCDetails) {
          gclog_or_tty->print("CMSCollector: collect for metadata allocation ");
        }
        return true;
      }
      return false;
    }
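As the code above shows, the background collector has five broad trigger cases:
1. Whether a concurrent Full GC has been requested
This applies when the GC cause is _gc_locker and GCLockerInvokesConcurrent is set, or when the GC cause is _java_lang_system_gc (that is, a System.gc() call) and ExplicitGCInvokesConcurrent is set. In either case a background collection is triggered.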
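The condition for case 1 can be written as a small predicate. This is a standalone sketch rather than HotSpot code: the Cause enum, the Flags struct and requests_concurrent_full_gc are simplified stand-ins introduced here for illustration, and UseConcMarkSweepGC is assumed to be on because the article is about CMS.

```cpp
#include <cassert>

// Simplified stand-ins for HotSpot's GCCause values (not the real enum).
enum class Cause { gc_locker, java_lang_system_gc, allocation_failure };

// Stand-in for the relevant JVM flags.
struct Flags {
  bool UseConcMarkSweepGC = true;            // assumed on for this article
  bool GCLockerInvokesConcurrent = false;    // off unless configured
  bool ExplicitGCInvokesConcurrent = false;  // off unless configured
};

// True when a full GC request with this cause should be served by a
// concurrent (background) CMS cycle instead of a stop-the-world full GC.
bool requests_concurrent_full_gc(Cause cause, const Flags& f) {
  return f.UseConcMarkSweepGC &&
         ((cause == Cause::gc_locker && f.GCLockerInvokesConcurrent) ||
          (cause == Cause::java_lang_system_gc && f.ExplicitGCInvokesConcurrent));
}
```

With both flags left off, neither a GC locker request nor System.gc() takes this path, which is why case 1 only shows up on JVMs that opt in through one of the two flags.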
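2. Dynamic calculation from statistics (only when UseCMSInitiatingOccupancyOnly is not set)
When UseCMSInitiatingOccupancyOnly is not set, the JVM uses runtime statistics to decide dynamically whether a CMS GC is needed.
The logic is: if the predicted time to complete a CMS GC is greater than the predicted time until the old generation fills up, start a GC. This decision depends on statistics from past CMS GCs. Before the first CMS GC, however, those statistics do not exist yet and are invalid, so the decision falls back to Old Gen occupancy.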
    if (!UseCMSInitiatingOccupancyOnly) {
      if (stats().valid()) {
        if (stats().time_until_cms_start() == 0.0) {
          return true;
        }
      } else {
        // We want to conservatively collect somewhat early in order
        // to try and "bootstrap" our CMS/promotion statistics;
        // this branch will not fire after the first successful CMS
        // collection because the stats should then be valid.
        if (_cmsGen->occupancy() >= _bootstrap_occupancy) {
          if (Verbose && PrintGCDetails) {
            gclog_or_tty->print_cr(
              " CMSCollector: collect for bootstrapping statistics:"
              " occupancy = %f, boot occupancy = %f", _cmsGen->occupancy(),
              _bootstrap_occupancy);
          }
          return true;
        }
      }
    }
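To make the "time to finish a CMS cycle versus time until Old Gen is full" comparison concrete, here is a deliberately simplified model. It is an assumption-laden sketch, not the HotSpot implementation: CmsStatsModel, its fields and both functions are hypothetical names, and the safety factors that the real statistics code applies are left out.

```cpp
#include <cassert>

// Hypothetical, simplified inputs for the statistics-based decision.
struct CmsStatsModel {
  double cms_duration_s;           // estimated time to finish a CMS cycle
  double young_gc_period_s;        // estimated time between young collections
  double old_free_bytes;           // free space left in the old generation
  double consumption_bytes_per_s;  // estimated rate at which old gen fills up
};

// Estimated seconds until the old generation is full.
double time_until_old_gen_full(const CmsStatsModel& s) {
  assert(s.old_free_bytes >= 0.0 && s.consumption_bytes_per_s >= 0.0);
  // Adding 1.0 keeps the division defined when the rate is zero.
  return s.old_free_bytes / (s.consumption_bytes_per_s + 1.0);
}

// Start a cycle when the work ahead (one CMS cycle plus the delay until
// the next check) would not finish before the old generation fills up.
bool should_start_by_stats(const CmsStatsModel& s) {
  double work = s.cms_duration_s + s.young_gc_period_s;
  return work > time_until_old_gen_full(s);
}
```

For example, with 100 MB free and 50 MB/s of promotion, the old generation fills in about 2 s, so a cycle expected to take 3 s has to start now. With 1000 MB free and 10 MB/s, there are about 100 s left and no cycle is started.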
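At what occupancy does that bootstrap collection start (in other words, what is the value of _bootstrap_occupancy)? The answer is 50%. You may have seen this yourself: with UseCMSInitiatingOccupancyOnly not set, a CMS GC ran as soon as the old generation reached 50%, and it was not obvious why.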
    _bootstrap_occupancy = ((double)CMSBootstrapOccupancy)/(double)100;
    // default value of the flag
    product(uintx, CMSBootstrapOccupancy, 50,
            "Percentage CMS generation occupancy at which to initiate CMS collection for bootstrapping collection stats")
3. Decide based on the state of Old Gen
    bool ConcurrentMarkSweepGeneration::should_concurrent_collect() const {
      assert_lock_strong(freelistLock());
      if (occupancy() > initiating_occupancy()) {
        if (PrintGCDetails && Verbose) {
          gclog_or_tty->print(" %s: collect because of occupancy %f / %f ",
            short_name(), occupancy(), initiating_occupancy());
        }
        return true;
      }
      if (UseCMSInitiatingOccupancyOnly) {
        return false;
      }
      if (expansion_cause() == CMSExpansionCause::_satisfy_allocation) {
        if (PrintGCDetails && Verbose) {
          gclog_or_tty->print(" %s: collect because expanded for allocation ",
            short_name());
        }
        return true;
      }
      if (_cmsSpace->should_concurrent_collect()) {
        if (PrintGCDetails && Verbose) {
          gclog_or_tty->print(" %s: collect because cmsSpace says so ",
            short_name());
        }
        return true;
      }
      return false;
    }
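Judging from the source, there are two main categories here:
(a) Compare Old Gen occupancy with a threshold and run a CMS GC if it is above the threshold. This is the "occupancy() > initiating_occupancy()" check. occupancy() is clearly the current occupancy of Old Gen, but what is initiating_occupancy()?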
    _cmsGen ->init_initiating_occupancy(CMSInitiatingOccupancyFraction, CMSTriggerRatio);
    ...
    void ConcurrentMarkSweepGeneration::init_initiating_occupancy(intx io, uintx tr) {
      assert(io <= 100 && tr <= 100, "Check the arguments");
      if (io >= 0) {
        _initiating_occupancy = (double)io / 100.0;
      } else {
        _initiating_occupancy = ((100 - MinHeapFreeRatio) +
                                 (double)(tr * MinHeapFreeRatio) / 100.0)
                                / 100.0;
      }
    }
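As the code shows, when CMSInitiatingOccupancyFraction is set to a value greater than or equal to 0, the threshold is "io / 100.0".
When CMSInitiatingOccupancyFraction is less than 0 (note that the default is -1), the threshold is "((100 - MinHeapFreeRatio) + (double)(tr * MinHeapFreeRatio) / 100.0) / 100.0". What does that come to? It is 92% under the default settings. You may have read in a book or blog that CMSInitiatingOccupancyFraction is 92 when it is not configured. In fact the unconfigured value is -1, so the threshold comes from the second branch and works out to 92%; the value of CMSInitiatingOccupancyFraction itself is not 92.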
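The 92% figure can be checked by hand. Assuming the JDK 8 defaults MinHeapFreeRatio = 40 and CMSTriggerRatio = 80, the second branch gives ((100 - 40) + (80 * 40) / 100.0) / 100.0 = (60 + 32) / 100.0 = 0.92. The sketch below repeats the same arithmetic as a free function; initiating_occupancy and its parameter names are chosen here for illustration and are not the HotSpot member function.

```cpp
#include <cassert>

// Same arithmetic as init_initiating_occupancy, written as a free function.
// io: CMSInitiatingOccupancyFraction, tr: CMSTriggerRatio,
// min_heap_free_ratio: MinHeapFreeRatio (all percentages).
double initiating_occupancy(long io, unsigned long tr,
                            unsigned long min_heap_free_ratio) {
  assert(io <= 100 && tr <= 100 && min_heap_free_ratio <= 100);
  if (io >= 0) {
    // An explicit fraction was configured: use it directly.
    return (double)io / 100.0;
  }
  // No fraction configured (io < 0): derive the threshold from the
  // minimum free ratio and the trigger ratio.
  return ((100 - min_heap_free_ratio) +
          (double)(tr * min_heap_free_ratio) / 100.0) / 100.0;
}
```

Calling initiating_occupancy(-1, 80, 40) reproduces the default threshold, while initiating_occupancy(75, 80, 40) ignores the two ratios and yields 0.75.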
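(b) The checks that apply only when UseCMSInitiatingOccupancyOnly is not set
There are two sub-cases here:
- Old Gen has just been expanded to satisfy an object allocation, and the allocation succeeded. In this case a CMS GC is considered.
- A check based on the CMS Gen free lists. This one is somewhat complicated and I have not fully worked it out yet. Fortunately it returns false under the default configuration, so by default this trigger can be ignored.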
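4. Whether an incremental GC may fail (pessimistic policy)
What does this mean? In a two-generation GC system, it mainly refers to whether a Young GC will fail. If a Young GC has already failed or may fail, the JVM considers a CMS GC necessary.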
    bool incremental_collection_will_fail(bool consult_young) {
      // Assumes a 2-generation system; the first disjunct remembers if an
      // incremental collection failed, even when we thought (second disjunct)
      // that it would not.
      assert(heap()->collector_policy()->is_two_generation_policy(),
             "the following definition may not be suitable for an n(>2)-generation system");
      return incremental_collection_failed() ||
             (consult_young && !get_gen(0)->collection_attempt_is_safe());
    }
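Consider the two conditions, "incremental_collection_failed()" and "!get_gen(0)->collection_attempt_is_safe()".
incremental_collection_failed() means a Young GC has already failed. The usual reason is that Old Gen did not have enough space to hold the promoted objects.
!get_gen(0)->collection_attempt_is_safe() asks whether promotion from the young generation is unsafe. It checks whether the space remaining in Old Gen is enough to hold the objects a Young GC would promote. How much a Young GC will promote cannot be known in advance, so the remaining space is compared against both the average amount promoted per Young GC and the maximum amount the current Young GC could promote.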
    // av_promo is the average amount promoted per Young GC; max_promotion_in_bytes is the
    // largest amount that could be promoted now (the space currently used in eden + from)
    bool res = (available >= av_promo) || (available >= max_promotion_in_bytes);
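The two conditions combine as shown in the sketch below. It is a standalone model for illustration: the function names echo the HotSpot ones, but the signatures, which take plain sizes and a flag instead of reading heap state, are assumptions made here.

```cpp
#include <cassert>
#include <cstddef>

// Promotion is treated as safe when the free space in Old Gen covers
// either the average promotion size or the worst case (eden + from in use).
bool promotion_attempt_is_safe(std::size_t available,
                               std::size_t av_promo,
                               std::size_t max_promotion_in_bytes) {
  return (available >= av_promo) || (available >= max_promotion_in_bytes);
}

// A Young GC is expected to fail if one already failed, or if the young
// generation is consulted and promotion does not look safe.
bool incremental_collection_will_fail(bool already_failed,
                                      bool consult_young,
                                      std::size_t available,
                                      std::size_t av_promo,
                                      std::size_t max_promotion_in_bytes) {
  assert(av_promo <= max_promotion_in_bytes);  // average cannot exceed worst case
  return already_failed ||
         (consult_young &&
          !promotion_attempt_is_safe(available, av_promo, max_promotion_in_bytes));
}
```

Because the two comparisons are joined with "or", promotion counts as safe as soon as the free space covers the average promotion size, even when it is well below the worst case. Case 4 therefore fires only when Old Gen cannot hold even an average promotion, or when a Young GC has already failed.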
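5. Decide based on the state of Metaspace
This case depends on the _should_concurrent_collect flag of MetaspaceGC. The flag is set before Metaspace is expanded, provided CMSClassUnloadingEnabled is configured. When the flag is set, a CMS GC runs. This is why an application often runs a CMS GC shortly after startup while Old Gen occupancy is still very small, which can be puzzling until you know the cause.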
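Summary
This article went through the trigger conditions of the CMS GC foreground collector and background collector. The foreground collector's trigger condition is relatively simple. The background collector has many more: five broad cases, several of which contain smaller branches. When UseCMSInitiatingOccupancyOnly is not set in particular, many additional triggers become possible. In production it is strongly recommended to set UseCMSInitiatingOccupancyOnly, so that CMS GC runs more predictably and the cause of each GC is easier to track down.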