Presto Memory Management and Tuning

Memory Pools

Presto has three memory pools: GENERAL_POOL, RESERVED_POOL, and SYSTEM_POOL. Their sizes are allocated by the following code:

builder.put(RESERVED_POOL, new MemoryPool(RESERVED_POOL, config.getMaxQueryMemoryPerNode()));

builder.put(SYSTEM_POOL, new MemoryPool(SYSTEM_POOL, systemMemoryConfig.getReservedSystemMemory()));

long maxHeap = Runtime.getRuntime().maxMemory();
maxMemory = new DataSize(maxHeap - systemMemoryConfig.getReservedSystemMemory().toBytes(), BYTE);
DataSize generalPoolSize = new DataSize(Math.max(0, maxMemory.toBytes() - config.getMaxQueryMemoryPerNode().toBytes()), BYTE);
builder.put(GENERAL_POOL, new MemoryPool(GENERAL_POOL, generalPoolSize));

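Tracing this code back to the configuration files gives the following. The size of RESERVED_POOL is set by query.max-memory-per-node in config.properties. The size of SYSTEM_POOL is set by resources.reserved-system-memory in config.properties; if it is not set, the default is Runtime.getRuntime().maxMemory() * 0.4, that is, 0.4 * Xmx. The size of GENERAL_POOL is the total memory (Xmx) minus the reserved part (query.max-memory-per-node) minus the system part (0.4 * Xmx by default).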
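The sizing rule above can be sketched as a small standalone class. This is only an illustration of the arithmetic, not Presto's own code: the class name PoolSizes and its fields are hypothetical, the heap size is passed in rather than read from the JVM, and the 0.4 default is computed as 2/5 in integer arithmetic.

```java
final class PoolSizes {
    final long reservedBytes;
    final long systemBytes;
    final long generalBytes;

    /**
     * @param maxHeapBytes          stand-in for Runtime.getRuntime().maxMemory() (roughly Xmx)
     * @param maxQueryMemoryPerNode value of query.max-memory-per-node, in bytes
     * @param reservedSystemMemory  value of resources.reserved-system-memory in bytes, or null if unset
     */
    PoolSizes(long maxHeapBytes, long maxQueryMemoryPerNode, Long reservedSystemMemory) {
        // Default of 0.4 * Xmx, written as 2/5 with truncating integer division.
        systemBytes = (reservedSystemMemory != null) ? reservedSystemMemory : maxHeapBytes / 5 * 2;
        reservedBytes = maxQueryMemoryPerNode;
        // GENERAL_POOL gets whatever is left, clamped at zero as in the Math.max call above.
        generalBytes = Math.max(0, maxHeapBytes - systemBytes - reservedBytes);
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        // Hypothetical node: 50 GiB heap, 5 GiB per-node query limit, system pool left at its default.
        PoolSizes p = new PoolSizes(50 * gib, 5 * gib, null);
        System.out.println("reserved=" + p.reservedBytes / gib + " GiB");
        System.out.println("system=" + p.systemBytes / gib + " GiB");
        System.out.println("general=" + p.generalBytes / gib + " GiB");
    }
}
```

Note that a large query.max-memory-per-node combined with the default system share can shrink GENERAL_POOL to zero, which is why the three values need to be chosen together.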

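The three pools serve different purposes. From reading the code and the Presto developer documentation, they can be roughly characterized as follows: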

  • GENERAL_POOL is the memory pool used by the physical operators in a query.
  • SYSTEM_POOL is mostly used by the exchange buffers and readers/writers.
  • RESERVED_POOL is for running a large query when the general pool becomes full.

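In short, GENERAL_POOL serves the physical operators of ordinary queries, and SYSTEM_POOL serves read/write buffers. RESERVED_POOL is special: most of the time it takes no part in execution. It is used only when both of the conditions below hold. Presto then picks the query that occupies the most memory among all queries and moves it into RESERVED_POOL to run. Note that RESERVED_POOL can serve only one query at a time.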

1. A node in GENERAL_POOL is blocked (a block node), meaning that node has run out of memory
2. RESERVED_POOL is not in use
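The promotion rule described above can be sketched as follows. This is a simplified illustration under stated assumptions, not Presto's implementation: the class and method names are hypothetical, and each query's memory is given as a single total in a map keyed by query id.

```java
import java.util.Map;
import java.util.Optional;

final class ReservedPoolPromotion {
    /**
     * Returns the id of the query that would be moved into RESERVED_POOL,
     * or empty when the two conditions from the text do not both hold.
     */
    static Optional<String> pickQueryToPromote(
            boolean generalPoolHasBlockedNode,
            boolean reservedPoolInUse,
            Map<String, Long> totalBytesByQuery) {
        // Condition 1: some node is blocked in GENERAL_POOL.
        // Condition 2: RESERVED_POOL is free (it holds at most one query).
        if (!generalPoolHasBlockedNode || reservedPoolInUse) {
            return Optional.empty();
        }
        // Pick the query that occupies the most memory.
        return totalBytesByQuery.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        Map<String, Long> usage = Map.of("q1", 100L, "q2", 300L, "q3", 200L);
        System.out.println(pickQueryToPromote(true, false, usage));
    }
}
```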

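GENERAL_POOL, RESERVED_POOL, and SYSTEM_POOL should be sized sensibly relative to one another. If concurrency is high, it is advisable to keep SYSTEM_POOL at its default or make it slightly larger. My own working configuration sets SYSTEM_POOL to 1/3 * Xmx (we lowered it even though our concurrency is fairly high) and RESERVED_POOL to 1/9 * Xmx.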

You can also query /v1/status on each worker over HTTP to estimate how much memory to configure. The response below shows the usage of each memory pool.

http://10.84.99.3:8080/v1/status
{
    "nodeId": "10-84-99-3", 
    "nodeVersion": {
        "version": "2bcd31d-dirty"
    }, 
    "environment": "product", 
    "coordinator": true, 
    "uptime": "13.23d", 
    "externalAddress": "10.84.99.53", 
    "internalAddress": "10.84.99.53", 
    "memoryInfo": {
        "totalNodeMemory": "35433480192B", 
        "pools": {
            "reserved": {
                "maxBytes": 10737418240, 
                "reservedBytes": 0, 
                "reservedRevocableBytes": 0, 
                "queryMemoryReservations": { }, 
                "queryMemoryRevocableReservations": { }, 
                "freeBytes": 10737418240
            }, 
            "general": {
                "maxBytes": 24696061952, 
                "reservedBytes": 0, 
                "reservedRevocableBytes": 0, 
                "queryMemoryReservations": { }, 
                "queryMemoryRevocableReservations": { }, 
                "freeBytes": 24696061952
            }, 
            "system": {
                "maxBytes": 16106127360, 
                "reservedBytes": 0, 
                "reservedRevocableBytes": 0, 
                "queryMemoryReservations": { }, 
                "queryMemoryRevocableReservations": { }, 
                "freeBytes": 16106127360
            }
        }
    }, 
    "processors": 40, 
    "processCpuLoad": 0.0004106776180698152, 
    "systemCpuLoad": 0.00041050903119868636, 
    "heapUsed": 9741942512, 
    "heapAvailable": 51539607552, 
    "nonHeapUsed": 264888168
}
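As a sanity check, the figures in this response can be compared against the pool-sizing rule. The sketch below copies the numbers from the response and assumes that heapAvailable corresponds to the maximum heap (Xmx); the class and method names are hypothetical.

```java
final class StatusCheck {
    // Values copied from the /v1/status response above.
    static final long HEAP_AVAILABLE = 51539607552L;
    static final long TOTAL_NODE_MEMORY = 35433480192L;
    static final long RESERVED_MAX = 10737418240L;
    static final long GENERAL_MAX = 24696061952L;
    static final long SYSTEM_MAX = 16106127360L;

    /** GENERAL_POOL size implied by the sizing rule: heap minus system minus reserved. */
    static long expectedGeneral(long heap, long system, long reserved) {
        return Math.max(0, heap - system - reserved);
    }

    /** Fraction of a pool currently in use, from its maxBytes and freeBytes fields. */
    static double usedFraction(long maxBytes, long freeBytes) {
        return maxBytes == 0 ? 0.0 : (double) (maxBytes - freeBytes) / maxBytes;
    }

    public static void main(String[] args) {
        // totalNodeMemory matches heap minus the system pool.
        System.out.println(HEAP_AVAILABLE - SYSTEM_MAX == TOTAL_NODE_MEMORY);
        // The general pool matches heap minus system minus reserved.
        System.out.println(expectedGeneral(HEAP_AVAILABLE, SYSTEM_MAX, RESERVED_MAX) == GENERAL_MAX);
        // The node is idle here, so the general pool is entirely free.
        System.out.println(usedFraction(GENERAL_MAX, GENERAL_MAX));
    }
}
```

Polling usedFraction for each pool under a representative workload gives a rough basis for deciding whether a pool is oversized or too tight.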

 

Memory Limits and Management

Per-Node Level

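  • GENERAL_POOL: on every memory request, Presto checks whether the memory usage exceeds the maximum. If it does, the request fails with the error "Query exceeded local memory limit of x". This keeps Presto from requesting memory without bound, and only the current query fails. In addition, if a node's GENERAL_POOL has zero free memory and zero revocable memory, the node is considered a block node.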

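  • RESERVED_POOL can be thought of as holding the largest query. Since that query already satisfied the GENERAL_POOL memory limit policy, it necessarily satisfies the RESERVED_POOL policy as well (which reuses the GENERAL_POOL policy).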

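  • RESERVED_POOL: in the current version I have found no way to limit its memory, so when concurrency is very high and the scanned data is very large, there is a small chance of an OOM. With Resource Groups and sensible memory settings, however, OOM can largely be avoided.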
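The per-request check and the block-node test for GENERAL_POOL can be sketched like this. It is a minimal model under stated assumptions, not Presto's MemoryPool: the class, its fields, and the separate per-query limit parameter are hypothetical, and revocable memory is tracked as one pool-wide counter.

```java
import java.util.HashMap;
import java.util.Map;

final class GeneralPoolSketch {
    private final long maxBytes;           // pool capacity
    private final long perQueryLimitBytes; // local per-query limit (hypothetical parameter)
    private final Map<String, Long> reservedByQuery = new HashMap<>();
    private long reservedBytes;
    private long revocableBytes;

    GeneralPoolSketch(long maxBytes, long perQueryLimitBytes) {
        this.maxBytes = maxBytes;
        this.perQueryLimitBytes = perQueryLimitBytes;
    }

    /** Checks the query's usage on every request and fails only that query when it is over the limit. */
    void reserve(String queryId, long bytes) {
        long newTotal = reservedByQuery.getOrDefault(queryId, 0L) + bytes;
        if (newTotal > perQueryLimitBytes) {
            throw new IllegalStateException(
                    "Query exceeded local memory limit of " + perQueryLimitBytes + "B");
        }
        reservedByQuery.put(queryId, newTotal);
        reservedBytes += bytes;
    }

    /** Records memory that could be reclaimed, for example by spilling. */
    void reserveRevocable(long bytes) {
        revocableBytes += bytes;
    }

    long freeBytes() {
        return maxBytes - reservedBytes - revocableBytes;
    }

    /** A node counts as blocked when nothing is free and nothing can be revoked. */
    boolean isBlocked() {
        return freeBytes() <= 0 && revocableBytes == 0;
    }
}
```

For example, with a 100-byte pool and an 80-byte per-query limit, reserving 60 bytes for one query and 40 for another leaves the node blocked, while a further 30-byte request from the first query fails with the local memory limit error.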

Cluster Level

Presto considers the cluster to have exceeded its memory when both of the following hold:

  • A node in GENERAL_POOL is blocked (a block node)
  • RESERVED_POOL is already in use

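Once the cluster is judged to have exceeded its cluster memory, there are two ways of managing memory:

1. Iterate over every query and check whether the total memory it occupies exceeds query.max-memory (set in config.properties). If it does, the query is failed.
2. If query.max-memory is set unreasonably high, the first case may still not apply after 5 seconds (the default delay), and the second method takes over. It comes in two variants, chosen by LowMemoryKillerPolicy: total-reservation and total-reservation-on-blocked-nodes. total-reservation kills the query that uses the most memory of all queries, whereas total-reservation-on-blocked-nodes kills the query that uses the most memory on the nodes that are out of memory (blocked).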
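The two management steps can be sketched as follows. This is a simplified reading of the description above, not Presto's ClusterMemoryManager: the class, enum, and method names are hypothetical, usage is modeled as a map from query id to per-node bytes, and for total-reservation-on-blocked-nodes the sketch simply sums each query's usage over all blocked nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

final class LowMemoryKillerSketch {
    enum Policy { TOTAL_RESERVATION, TOTAL_RESERVATION_ON_BLOCKED_NODES }

    /** Sums a query's bytes over all nodes, or only over the given nodes when a filter is set. */
    static long total(Map<String, Long> bytesByNode, Set<String> onlyNodes) {
        long sum = 0;
        for (Map.Entry<String, Long> e : bytesByNode.entrySet()) {
            if (onlyNodes == null || onlyNodes.contains(e.getKey())) {
                sum += e.getValue();
            }
        }
        return sum;
    }

    /** Step 1: queries whose cluster-wide total exceeds query.max-memory are failed. */
    static List<String> queriesOverLimit(Map<String, Map<String, Long>> usage, long maxQueryMemoryBytes) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Map<String, Long>> e : usage.entrySet()) {
            if (total(e.getValue(), null) > maxQueryMemoryBytes) {
                failed.add(e.getKey());
            }
        }
        return failed;
    }

    /** Step 2: after the delay, pick one query to kill according to the configured policy. */
    static Optional<String> chooseVictim(
            Map<String, Map<String, Long>> usage, Set<String> blockedNodes, Policy policy) {
        Set<String> filter = (policy == Policy.TOTAL_RESERVATION) ? null : blockedNodes;
        String victim = null;
        long largest = 0;
        for (Map.Entry<String, Map<String, Long>> e : usage.entrySet()) {
            long bytes = total(e.getValue(), filter);
            if (bytes > largest) {
                largest = bytes;
                victim = e.getKey();
            }
        }
        return Optional.ofNullable(victim);
    }
}
```

The two policies can pick different victims: a query spread evenly across nodes may have the largest total, while a smaller query concentrated on the blocked node is the one actually causing the pressure there.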
