An in-depth look at Redis data eviction policies

Overview

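Redis lets you cap the amount of memory it uses through server.maxmemory, which is useful when memory is limited. For example, on a machine with 8 GB of RAM running four Redis instances, giving each instance 1.5 GB (6 GB in total) eases memory pressure and makes the service more stable.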

Once the in-memory dataset grows to the configured limit, Redis applies a data eviction policy. Redis provides six policies:

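  1. volatile-lru: evict the least recently used data among the keys that have an expiration time set (server.db[i].expires)
  2. volatile-ttl: evict the data closest to expiring among the keys that have an expiration time set (server.db[i].expires)
  3. volatile-random: evict randomly chosen data among the keys that have an expiration time set (server.db[i].expires)
  4. allkeys-lru: evict the least recently used data in the whole dataset (server.db[i].dict)
  5. allkeys-random: evict randomly chosen data in the whole dataset (server.db[i].dict)
  6. noeviction: never evict data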
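
The list above boils down to two questions: which dictionary are candidates drawn from, and is eviction allowed at all. The sketch below models just that mapping. The enum and function names are hypothetical and exist only for this illustration; Redis itself uses the REDIS_MAXMEMORY_* constants shown later in this article.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for this sketch only. */
typedef enum {
    POLICY_VOLATILE_LRU,
    POLICY_VOLATILE_TTL,
    POLICY_VOLATILE_RANDOM,
    POLICY_ALLKEYS_LRU,
    POLICY_ALLKEYS_RANDOM,
    POLICY_NOEVICTION,
    POLICY_UNKNOWN
} eviction_policy;

/* Map a policy name to its enum value; the array order matches the enum. */
eviction_policy policy_from_name(const char *name) {
    static const char *names[] = {
        "volatile-lru", "volatile-ttl", "volatile-random",
        "allkeys-lru", "allkeys-random", "noeviction"
    };
    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        if (strcmp(name, names[i]) == 0) return (eviction_policy)i;
    }
    return POLICY_UNKNOWN;
}

/* Returns 1 when candidates come from the keys with an expiration time,
 * 0 when they come from the whole dataset, and -1 when nothing is evicted
 * (noeviction or an unrecognized policy). */
int samples_from_expires(eviction_policy p) {
    switch (p) {
    case POLICY_VOLATILE_LRU:
    case POLICY_VOLATILE_TTL:
    case POLICY_VOLATILE_RANDOM:
        return 1;
    case POLICY_ALLKEYS_LRU:
    case POLICY_ALLKEYS_RANDOM:
        return 0;
    default:
        return -1;
    }
}
```

For instance, samples_from_expires(policy_from_name("volatile-ttl")) yields 1, which corresponds to sampling from server.db[i].expires.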

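Once Redis has decided to evict a key-value pair, it deletes the data and propagates the change both locally (AOF persistence) and to its slaves (over the master-slave connection).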

LRU eviction mechanism

The server state holds an LRU clock, server.lruclock, which the periodic task serverCron() refreshes on a timer. Its value is computed from server.unixtime.

In addition, struct redisObject shows that every Redis object carries its own lru field. As you would expect, redisObject.lru is updated each time the data is accessed.

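LRU eviction works like this: Redis randomly samples a few key-value pairs from the dataset and evicts the one that has been idle the longest, that is, the one with the oldest lru timestamp. So Redis does not guarantee that it evicts the least recently used key-value pair of the whole dataset, only the least recently used among the few it sampled.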
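
To make the "approximate" part concrete, here is a minimal sketch of sampled LRU next to an exact scan. It assumes a flat array of idle times in place of the Redis dictionary, and uses rand() as a stand-in for random key selection; the function names are hypothetical and not part of Redis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Sampled LRU: draw `samples` random keys and return the index of the one
 * that has been idle the longest. idle[i] is the idle time of key i. */
size_t pick_lru_victim(const unsigned long *idle, size_t nkeys, size_t samples) {
    assert(nkeys > 0 && samples > 0);
    size_t best = (size_t)rand() % nkeys;
    for (size_t k = 1; k < samples; k++) {
        size_t candidate = (size_t)rand() % nkeys;
        if (idle[candidate] > idle[best]) best = candidate;
    }
    return best;
}

/* Exact LRU for comparison: scan every key. */
size_t exact_lru_victim(const unsigned long *idle, size_t nkeys) {
    assert(nkeys > 0);
    size_t best = 0;
    for (size_t i = 1; i < nkeys; i++) {
        if (idle[i] > idle[best]) best = i;
    }
    return best;
}
```

The sampled victim is never idler than the exact one, and the two agree only when the truly idlest key happens to be drawn. Raising the sample count narrows the gap at the cost of more work per eviction.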

// redisServer holds the LRU clock
struct redisServer {
    ...
    unsigned lruclock:22;       /* Clock incrementing every minute, for LRU */
    ...
};

// Every Redis object stores its own lru value
#define REDIS_LRU_CLOCK_MAX ((1<<21)-1) /* Max value of obj->lru */
#define REDIS_LRU_CLOCK_RESOLUTION 10 /* LRU clock resolution in seconds */
typedef struct redisObject {
    // The four bit-fields below add up to exactly 32 bits

    // Object type: string / list / set / hash table
    unsigned type:4;
    // Two unused bits
    unsigned notused:2;     /* Not used */
    // Encoding: to save space, Redis can store the same data in several ways,
    // e.g. the string "123456789" is stored as the integer 123456789
    unsigned encoding:4;
    unsigned lru:22;        /* lru time (relative to server.lruclock) */

    // Reference count
    int refcount;

    // Pointer to the data
    void *ptr;
} robj;

// Redis periodic task (compare: Linux cron)
int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    ......
    /* We have just 22 bits per object for LRU information.
     * So we use an (eventually wrapping) LRU clock with 10 seconds resolution.
     * 2^22 bits with 10 seconds resolution is more or less 1.5 years.
     *
     * Note that even if this will wrap after 1.5 years it's not a problem,
     * everything will still work but just some object will appear younger
     * to Redis. But for this to happen a given object should never be touched
     * for 1.5 years.
     *
     * Note that you can change the resolution altering the
     * REDIS_LRU_CLOCK_RESOLUTION define.
     */
    updateLRUClock();
    ......
}

// Update the server's LRU clock
void updateLRUClock(void) {
    server.lruclock = (server.unixtime / REDIS_LRU_CLOCK_RESOLUTION) &
                      REDIS_LRU_CLOCK_MAX;
}
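
The listing shows how the clock is produced but not how an object's idle time is derived from it. The sketch below is a standalone model of that calculation, including the case where the clock has wrapped around since the object was last touched. It is an assumption-laden illustration, not the Redis source: the function names are hypothetical, and only the mask and resolution values are taken from the defines above.

```c
#include <assert.h>

#define LRU_CLOCK_MAX ((1u << 21) - 1) /* same value as REDIS_LRU_CLOCK_MAX above */
#define LRU_CLOCK_RESOLUTION 10u       /* seconds per clock tick */

/* Derive the wrapping LRU clock from a Unix timestamp in seconds. */
unsigned lru_clock_from_unixtime(unsigned long unixtime) {
    return (unsigned)((unixtime / LRU_CLOCK_RESOLUTION) & LRU_CLOCK_MAX);
}

/* Estimate how many seconds an object has been idle, given the current
 * clock and the clock value stored in the object at its last access. */
unsigned long idle_seconds(unsigned clock_now, unsigned obj_lru) {
    assert(clock_now <= LRU_CLOCK_MAX && obj_lru <= LRU_CLOCK_MAX);
    if (clock_now >= obj_lru) {
        return (unsigned long)(clock_now - obj_lru) * LRU_CLOCK_RESOLUTION;
    }
    /* The clock wrapped: count the ticks up to the maximum, then the
     * ticks elapsed since the clock restarted. */
    return (unsigned long)((LRU_CLOCK_MAX - obj_lru) + clock_now) *
           LRU_CLOCK_RESOLUTION;
}
```

For example, idle_seconds(100, 90) is 100 seconds (10 ticks of 10 seconds). Note that with the mask exactly as defined in the listing, the clock wraps after 2^21 ticks, which is roughly 243 days at a 10-second resolution, whereas the quoted comment's 1.5-year figure corresponds to a full 22 bits.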

 

TTL eviction mechanism

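The Redis database structure keeps a table of key expiration times, redisDb.expires. Much like LRU eviction, TTL eviction works like this: Redis randomly samples a few key-value pairs from the expires table and evicts the one with the smallest expiration timestamp, that is, the one closest to expiring. Again, Redis does not guarantee that it evicts the soonest-to-expire key-value pair of the whole expires table, only the soonest among the few it sampled.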
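
The comparison at the heart of the TTL policy can be sketched as follows. This assumes the random samples have already been drawn and are held in a small array; the ttl_sample struct and the function name are hypothetical and only serve this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One sampled entry of the expires table (hypothetical layout). */
typedef struct {
    const char *key;
    long long expire_at; /* absolute expiration time as a Unix timestamp */
} ttl_sample;

/* Return the index of the sample that expires soonest, i.e. the one with
 * the smallest expiration timestamp. */
size_t soonest_to_expire(const ttl_sample *samples, size_t n) {
    assert(n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (samples[i].expire_at < samples[best].expire_at) best = i;
    }
    return best;
}
```

Because the table stores absolute timestamps, a smaller value means less remaining time to live, so the comparison is "less than" rather than "greater than".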

Summary

Every time Redis executes a command for a client, it checks whether memory usage exceeds the limit. If it does, Redis evicts data.

// Execute a command
int processCommand(redisClient *c) {
    ......
    // Check whether memory is over the limit
    /* Handle the maxmemory directive.
     *
     * First we try to free some memory if possible (if there are volatile
     * keys in the dataset). If there are not the only thing we can do
     * is returning an error. */
    if (server.maxmemory) {
        int retval = freeMemoryIfNeeded();
        if ((c->cmd->flags & REDIS_CMD_DENYOOM) && retval == REDIS_ERR) {
            flagTransaction(c);
            addReply(c, shared.oomerr);
            return REDIS_OK;
        }
    }
    ......
}
 
// Free some memory if needed
int freeMemoryIfNeeded(void) {
    size_t mem_used, mem_tofree, mem_freed;
    int slaves = listLength(server.slaves);

    // Slave output buffers and AOF buffers do not count towards Redis memory usage
    /* Remove the size of slaves output buffers and AOF buffer from the
     * count of used memory. */
    mem_used = zmalloc_used_memory();

    // Subtract the size of each slave's output buffer
    if (slaves) {
        listIter li;
        listNode *ln;

        listRewind(server.slaves, &li);
        while ((ln = listNext(&li))) {
            redisClient *slave = listNodeValue(ln);
            unsigned long obuf_bytes = getClientOutputBufferMemoryUsage(slave);
            if (obuf_bytes > mem_used)
                mem_used = 0;
            else
                mem_used -= obuf_bytes;
        }
    }
    // Subtract server.aof_buf and server.aof_rewrite_buf_blocks
    if (server.aof_state != REDIS_AOF_OFF) {
        mem_used -= sdslen(server.aof_buf);
        mem_used -= aofRewriteBufferSize();
    }
 
    // Is memory usage above the configured limit?
    /* Check if we are over the memory limit. */
    if (mem_used <= server.maxmemory) return REDIS_OK;

    // The behaviour when over the limit is configurable
    if (server.maxmemory_policy == REDIS_MAXMEMORY_NO_EVICTION)
        return REDIS_ERR; /* We need to free memory, but policy forbids. */

    /* Compute how much memory we need to free. */
    mem_tofree = mem_used - server.maxmemory;
    mem_freed = 0;
    while (mem_freed < mem_tofree) {
        int j, k, keys_freed = 0;

        // Iterate over every database
        for (j = 0; j < server.dbnum; j++) {
            long bestval = 0; /* just to prevent warning */
            sds bestkey = NULL;
            struct dictEntry *de;
            redisDb *db = server.db + j;
            dict *dict;

            // Different policies draw candidates from different dictionaries
            if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
                server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_RANDOM)
            {
                dict = server.db[j].dict;
            } else {
                dict = server.db[j].expires;
            }

            // This dictionary is empty, move on to the next database
            if (dictSize(dict) == 0) continue;
 
            // Random policies: pick one key at random
            /* volatile-random and allkeys-random policy */
            if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_RANDOM ||
                server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_RANDOM)
            {
                de = dictGetRandomKey(dict);
                bestkey = dictGetKey(de);
            }
 
            // LRU policies: pick the least recently used data
            /* volatile-lru and allkeys-lru policy */
            else if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
                server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
            {
                // server.maxmemory_samples is the number of random samples to draw.
                // Sample that many key-value pairs and evict the least recently used one
                for (k = 0; k < server.maxmemory_samples; k++) {
                    sds thiskey;
                    long thisval;
                    robj *o;

                    // Pick a random key-value pair
                    de = dictGetRandomKey(dict);

                    // Get its key
                    thiskey = dictGetKey(de);

                    /* When policy is volatile-lru we need an additional lookup
                     * to locate the real key, as dict is set to db->expires. */
                    if (server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
                        de = dictFind(db->dict, thiskey);
                    o = dictGetVal(de);

                    // Compute how long the object has been idle
                    thisval = estimateObjectIdleTime(o);

                    // Remember this key if it has been idle longer than the best so far
                    /* Higher idle time is better candidate for deletion */
                    if (bestkey == NULL || thisval > bestval) {
                        bestkey = thiskey;
                        bestval = thisval;
                    }
                }
            }
 
            // TTL policy: pick the data closest to expiring
            /* volatile-ttl */
            else if (server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_TTL) {
                // server.maxmemory_samples is the number of random samples to draw.
                // Sample that many key-value pairs and evict the one that expires soonest
                for (k = 0; k < server.maxmemory_samples; k++) {
                    sds thiskey;
                    long thisval;

                    de = dictGetRandomKey(dict);
                    thiskey = dictGetKey(de);
                    thisval = (long) dictGetVal(de);

                    /* Expire sooner (minor expire unix timestamp) is better
                     * candidate for deletion */
                    if (bestkey == NULL || thisval < bestval) {
                        bestkey = thiskey;
                        bestval = thisval;
                    }
                }
            }
 
            // Delete the selected key-value pair
            /* Finally remove the selected key. */
            if (bestkey) {
                long long delta;

                robj *keyobj = createStringObject(bestkey, sdslen(bestkey));

                // Propagate the deletion, mainly to the AOF and to the slaves
                propagateExpire(db, keyobj);

                // Note: propagateExpire() may itself allocate memory. It is called
                // before the measurement starts because Redis counts only the memory
                // released by dbDelete(). If the memory allocated by propagateExpire()
                // were counted as well, and it exceeded the memory released, the loop
                // might never exit.
                // The code below would count both the memory released by dbDelete()
                // and the memory allocated by propagateExpire():
                // delta = (long long) zmalloc_used_memory();
                // propagateExpire(db,keyobj);
                // dbDelete(db,keyobj);
                // delta -= (long long) zmalloc_used_memory();
                // mem_freed += delta;

                /* We compute the amount of memory freed by dbDelete() alone.
                 * It is possible that actually the memory needed to propagate
                 * the DEL in AOF and replication link is greater than the one
                 * we are freeing removing the key, but we can't account for
                 * that otherwise we would never exit the loop.
                 *
                 * AOF and Output buffer memory will be freed eventually so
                 * we only care about memory used by the key space. */
                // Count only the memory released by dbDelete()
                delta = (long long) zmalloc_used_memory();
                dbDelete(db, keyobj);
                delta -= (long long) zmalloc_used_memory();
                mem_freed += delta;

                server.stat_evictedkeys++;

                // Notify all subscribed clients that the key was evicted
                notifyKeyspaceEvent(REDIS_NOTIFY_EVICTED, "evicted",
                    keyobj, db->id);
                decrRefCount(keyobj);
                keys_freed++;

                // Send the data pending in the slave output buffers right away
                /* When the memory to free starts to be big enough, we may
                 * start spending so much time here that is impossible to
                 * deliver data to the slaves fast enough, so we force the
                 * transmission here inside the loop. */
                if (slaves) flushSlavesOutputBuffers();
            }
        }

        // Nothing could be freed while memory is still over the limit: fail
        if (!keys_freed) return REDIS_ERR; /* nothing to free... */
    }
    return REDIS_OK;
}
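
The memory accounting at the top of freeMemoryIfNeeded() can be isolated into a small standalone model. The sketch below assumes plain byte counts instead of live Redis structures, and its function names are hypothetical. Unlike the listing, it also clamps the AOF subtraction at zero; that is a choice made for this sketch, not something the listing does.

```c
#include <stddef.h>

/* Subtract b from a, clamping at zero instead of wrapping around. */
static size_t sub_clamped(size_t a, size_t b) {
    return b > a ? 0 : a - b;
}

/* Memory attributed to the dataset: total allocation minus the slave
 * output buffers and the AOF buffers. */
size_t dataset_bytes(size_t allocated, const size_t *slave_bufs,
                     size_t nslaves, size_t aof_bytes) {
    size_t used = allocated;
    for (size_t i = 0; i < nslaves; i++) {
        used = sub_clamped(used, slave_bufs[i]);
    }
    return sub_clamped(used, aof_bytes);
}

/* How many bytes must be evicted to get back under the limit.
 * A maxmemory of 0 is treated as "no limit", matching the
 * `if (server.maxmemory)` guard in processCommand(). */
size_t bytes_to_free(size_t used, size_t maxmemory) {
    if (maxmemory == 0) return 0;
    return sub_clamped(used, maxmemory);
}
```

With 1000 bytes allocated, slave buffers of 100 and 50 bytes, and 200 bytes of AOF buffers, dataset_bytes() gives 650; against a limit of 600, bytes_to_free() then asks for 50 bytes, which plays the role of mem_tofree in the listing.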
 