Memcached Memory Storage

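I had long heard about Memcached's distinctive approach to memory management. The purpose of this article is to understand how Memcached manages memory and to study its source code.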

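1. What is the Slab Allocator

By default, memcached allocates and manages memory with a mechanism called the Slab Allocator. The basic idea is to divide allocated memory into blocks of predetermined, fixed lengths, with the aim of eliminating memory fragmentation. The slab allocator also aims to reuse memory that has already been allocated: memory, once obtained, is not released but recycled.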

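2. Main Slab Allocation terminology

Page        Memory space assigned to a slab class, 1 MB by default; once assigned, it is cut into chunks of that class's chunk size
Chunk       Memory space used to cache one record
Slab Class  A group of chunks of one specific size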

3. Slab initialization

When Memcached starts, it calls the slab initialization code (see the call to slabs_init in the main function of memcached.c).

Declaration of slabs_init:

/** Init the subsystem. 1st argument is the limit on no. of bytes to allocate,
    0 if no limit. 2nd argument is the growth factor; each slab will use a chunk
    size equal to the previous slab's chunk size times this factor.
    3rd argument specifies if the slab allocator should allocate all memory
    up front (if true), or allocate memory in chunks as it is needed (if false)
*/
void slabs_init(const size_t limit, const double factor, const bool prealloc);

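Here limit is the maximum amount of memory memcached may use, and factor is the growth factor for chunk sizes: the chunk size of each slab class equals the chunk size of the previous class multiplied by factor.

The call to slabs_init in the main function of memcached.c: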

slabs_init(settings.maxbytes, settings.factor, preallocate);

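settings.maxbytes defaults to 64 MB and is set with the -m option when starting memcached. settings.factor defaults to 1.25 and is set with -f. preallocate controls whether memcached allocates one page for every slab class at startup; it defaults to false: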

settings.maxbytes = 64 * 1024 * 1024; /* default is 64MB */
...
settings.factor = 1.25;
...
preallocate = false;

Implementation of slabs_init:

/**
 * Determines the chunk sizes and initializes the slab class descriptors
 * accordingly.
 */
void slabs_init(const size_t limit, const double factor, const bool prealloc) {
    int i = POWER_SMALLEST - 1;
    // smallest chunk size = item header size + settings.chunk_size (48 by default)
    unsigned int size = sizeof(item) + settings.chunk_size;

    mem_limit = limit;

    // with preallocation enabled, request the whole limit (64 MB by default) up front
    if (prealloc) {
        /* Allocate everything in a big chunk with malloc */
        mem_base = malloc(mem_limit);
        if (mem_base != NULL) {
            mem_current = mem_base;
            mem_avail = mem_limit;
        } else {
            fprintf(stderr, "Warning: Failed to allocate requested memory in"
                    " one large chunk.\nWill allocate in smaller chunks\n");
        }
    }

    // zero out every slab class descriptor
    memset(slabclass, 0, sizeof(slabclass));

    while (++i < POWER_LARGEST && size <= settings.item_size_max / factor) {
        /* Make sure items are always n-byte aligned */
        if (size % CHUNK_ALIGN_BYTES)
            size += CHUNK_ALIGN_BYTES - (size % CHUNK_ALIGN_BYTES);

        slabclass[i].size = size;
        slabclass[i].perslab = settings.item_size_max / slabclass[i].size;
        size *= factor;
        if (settings.verbose > 1) {
            fprintf(stderr, "slab class %3d: chunk size %9u perslab %7u\n",
                    i, slabclass[i].size, slabclass[i].perslab);
        }
    }

    // the last slab class has the largest chunk size, settings.item_size_max (1 MB by default)
    power_largest = i;
    slabclass[power_largest].size = settings.item_size_max;
    slabclass[power_largest].perslab = 1;
    if (settings.verbose > 1) {
        fprintf(stderr, "slab class %3d: chunk size %9u perslab %7u\n",
                i, slabclass[i].size, slabclass[i].perslab);
    }

    // record how much memory has already been allocated
    /* for the test suite:  faking of how much we've already malloc'd */
    {
        char *t_initial_malloc = getenv("T_MEMD_INITIAL_MALLOC");
        if (t_initial_malloc) {
            mem_malloced = (size_t)atol(t_initial_malloc);
        }
    }

    // with preallocation enabled, give every slab class one page
    if (prealloc) {
        slabs_preallocate(power_largest);
    }
}

settings.chunk_size defaults to 48:

settings.chunk_size = 48;         /* space for a modest key and value */

POWER_LARGEST is the maximum number of slab classes. Its default value is 200, defined in memcached.h:

#define POWER_LARGEST  200

settings.item_size_max is the size of each page, 1 MB by default. It is initialized in memcached.c:

settings.item_size_max = 1024 * 1024; /* The famous 1MB upper limit. */

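Preallocation is off by default because Memcached often stores only one kind of data (items of roughly the same size). In that case the pages preallocated for the other slab classes would simply be wasted.

The preallocation logic starts from the smallest slab class and allocates one page to each class; if memory runs out, it prints an error and exits: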

static void slabs_preallocate (const unsigned int maxslabs) {
    int i;
    unsigned int prealloc = 0;

    /* pre-allocate a 1MB slab in every size class so people don't get
       confused by non-intuitive "SERVER_ERROR out of memory"
       messages.  this is the most common question on the mailing
       list.  if you really don't want this, you can rebuild without
       these three lines.  */

    for (i = POWER_SMALLEST; i <= POWER_LARGEST; i++) {
        if (++prealloc > maxslabs)
            return;
        if (do_slabs_newslab(i) == 0) {
            fprintf(stderr, "Error while preallocating slab memory!\n"
                "If using -L or other prealloc options, max memory must be "
                "at least %d megabytes.\n", power_largest);
            exit(1);
        }
    }

}

do_slabs_newslab allocates a page for one slab class and splits it into fixed-size chunks:

static int do_slabs_newslab(const unsigned int id) {
    slabclass_t *p = &slabclass[id];
    int len = settings.slab_reassign ? settings.item_size_max
        : p->size * p->perslab;
    char *ptr;

    if ((mem_limit && mem_malloced + len > mem_limit && p->slabs > 0) ||
        (grow_slab_list(id) == 0) ||
        ((ptr = memory_allocate((size_t)len)) == 0)) { // request the memory

        MEMCACHED_SLABS_SLABCLASS_ALLOCATE_FAILED(id);
        return 0;
    }

    memset(ptr, 0, (size_t)len);
    // split the page into chunks
    split_slab_page_into_freelist(ptr, id);

    // maintain the slab list
    p->slab_list[p->slabs++] = ptr;
    mem_malloced += len;
    MEMCACHED_SLABS_SLABCLASS_ALLOCATE(id);

    return 1;
}

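split_slab_page_into_freelist walks the page (already zeroed by do_slabs_newslab) chunk by chunk and passes each chunk to do_slabs_free, which makes it available as a free chunk: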

static void split_slab_page_into_freelist(char *ptr, const unsigned int id) {
    slabclass_t *p = &slabclass[id];
    int x;
    for (x = 0; x < p->perslab; x++) {
        do_slabs_free(ptr, 0, id);
        ptr += p->size;
    }
}

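Memcached's memory allocation strategy is this: pages are assigned to slab classes on demand, and each slab class uses its chunks to store items as needed.

On demand means that if no object falls into a slab class, no page is assigned to it (outside preallocate mode). If a slab class stores many objects, it is assigned multiple pages, which are tracked in its slab list.

A few characteristics are worth noting:

1. Pages that Memcached has assigned are not reclaimed or reassigned;
2. Memory that Memcached has obtained is never released;
3. Free chunks of one slab class are not lent to any other slab class (newer versions of memcached provide the slab_reassign and slab_automove features);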

[Figure: slab memory layout, a two-dimensional array of lists]

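4. Caching records in a slab

memcached picks the slab class that best fits the size of the incoming data. It keeps a list of free chunks for each slab class, picks a chunk from that list, and caches the data in it.

The code is as follows: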

/*
 * Figures out which slab class (chunk size) is required to store an item of
 * a given size.
 *
 * Given object size, return id to use when allocating/freeing memory for object
 * 0 means error: can't store such a large object
 */
unsigned int slabs_clsid(const size_t size) {
    int res = POWER_SMALLEST; // smallest slab class id

    if (size == 0)
        return 0;
    while (size > slabclass[res].size)
        if (res++ == power_largest)     /* won't fit in the biggest slab */
            return 0;
    return res;
}

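The argument is the size of the object to be stored. Starting from the smallest chunk size, the function looks for the first (that is, the smallest) chunk size that can hold an object of that size. If there is none (size exceeds the largest chunk size), it returns 0, which is why slab class ids start at 1 rather than 0.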

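If a slab class has no free chunk left, the system assigns it a new page. If no page is available either, the LRU mechanism kicks in and evicts cold data to make room for the new data. One point to note here: the LRU is not global, it works per slab class.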

[Figure: slab memory allocation example]

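5. Drawbacks of the Slab Allocator

Because the Slab Allocator hands out memory in fixed lengths, it cannot make full use of the memory it allocates. For example, caching 100 bytes of data in a 128-byte chunk wastes the remaining 28 bytes.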

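6. Reducing memory waste in Memcached

6.1: Tuning the growth factor

(1). Estimate the size of our items:
key length + suffix + value length + item structure size (48 bytes)
(2). Adjust the growth factor step by step until the chunk size of some slab class is close to our item size (it must not be smaller than the item size)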

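7. Expired data

(1). An LRU expiration policy is used;
(2). The LRU policy is applied at the slab class level;
(3). Whether a record has expired is checked at get time, that is, lazily;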

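8. The memcached-tool script

The memcached-tool script makes it easy to see how slabs are being used (it formats memcached's responses so they are easy to read). The script can be downloaded from: http://www.netingcn.com/demo/memcached-tool.zip

Usage is very simple: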

perl memcached-tool server_ip:port option

For example:

perl memcached-tool 10.0.0.5:11211 display    # shows slabs
perl memcached-tool 10.0.0.5:11211            # same.  (default is display)
perl memcached-tool 10.0.0.5:11211 stats      # shows general stats
perl memcached-tool 10.0.0.5:11211 move 7 9   # takes 1MB slab from class #7
                                              # to class #9.

Sample output:

#  Item_Size   Max_age  1MB_pages     Count  Full?
1      104 B  1394292 s      1215  12249628    yes
2      136 B  1456795 s        52    400919    yes
...

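The columns mean:

#           Slab class number
Item_Size   Chunk size
Max_age     Age of the oldest record in the LRU
1MB_pages   Number of pages assigned to the slab class
Count       Number of records in the slab class
Full?       Whether the slab class is full (yes means it has no free chunks)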