Redis Data Expiration Strategies

Date: 2019-11-12

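1. Expiration time of keys in Redis

The EXPIRE key seconds command sets a key's expiration time. It returns 1 if the timeout was set and 0 if the key does not exist or the timeout could not be set. Once a timeout is set, the key is deleted automatically after the given number of seconds. In Redis terminology, a key with an associated timeout is called volatile.

The timeout is cleared when the key is deleted with DEL or overwritten with SET or GETSET. Commands that only alter the stored value, such as SETRANGE in the session below, leave the timeout untouched: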

127.0.0.1:6379> setex s 20 1
OK
127.0.0.1:6379> ttl s
(integer) 17
127.0.0.1:6379> setex s 200 1
OK
127.0.0.1:6379> ttl s
(integer) 195
127.0.0.1:6379> setrange s 3 100
(integer) 6
127.0.0.1:6379> ttl s
(integer) 152
127.0.0.1:6379> get s 
"1\x00\x00100"
127.0.0.1:6379> ttl s
(integer) 108
127.0.0.1:6379> getset s 200
"1\x00\x00100"
127.0.0.1:6379> get s 
"200"
127.0.0.1:6379> ttl s
(integer) -1

PERSIST removes the timeout and makes the key persistent again:

127.0.0.1:6379> setex s 100 test
OK
127.0.0.1:6379> get s
"test"
127.0.0.1:6379> ttl s
(integer) 94
127.0.0.1:6379> type s
string
127.0.0.1:6379> strlen s
(integer) 4
127.0.0.1:6379> persist s
(integer) 1
127.0.0.1:6379> ttl s
(integer) -1
127.0.0.1:6379> get s
"test"

RENAME only changes the key's name; the remaining timeout is carried over to the new name:

127.0.0.1:6379> expire s 200
(integer) 1
127.0.0.1:6379> ttl s
(integer) 198
127.0.0.1:6379> rename s ss
OK
127.0.0.1:6379> ttl ss
(integer) 187
127.0.0.1:6379> type ss
string
127.0.0.1:6379> get ss
"test"

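Note: since Redis 2.6 the expire error is between 0 and 1 millisecond. Expiration information is stored as an absolute Unix timestamp (with millisecond precision since Redis 2.6), so when data is synchronized across multiple servers, the servers' clocks must be kept in sync.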
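The effect of clock skew on absolute timestamps can be sketched as follows. This is a minimal illustration, not Redis code: the helper names expire_at_ms and remaining_ms are hypothetical, and the clock readings are passed in explicitly so the arithmetic is easy to follow.

```c
#include <stdint.h>

/* Convert a relative TTL in seconds into an absolute expire time in
 * milliseconds, based on the local clock reading now_ms.
 * Hypothetical helper, for illustration only. */
int64_t expire_at_ms(int64_t now_ms, int64_t ttl_seconds) {
    return now_ms + ttl_seconds * 1000;
}

/* Remaining lifetime as seen by a server whose clock reads now_ms.
 * A result <= 0 means that server already treats the key as expired. */
int64_t remaining_ms(int64_t expire_ms, int64_t now_ms) {
    return expire_ms - now_ms;
}
```

For example, a key given a 20 second timeout on a server whose clock reads 1000000 ms gets the absolute expire time 1020000 ms. A second server whose clock is 25 seconds ahead computes a remaining lifetime of -5000 ms for the same key and would consider it expired immediately, which is why the clocks need to agree.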

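2. Redis expired-key deletion strategies

Redis keys are expired in three ways:

Passive deletion: reading or writing a key that has already expired triggers the lazy deletion strategy, which deletes the expired key on the spot.
Active deletion: because lazy deletion cannot guarantee that cold data is removed promptly, Redis periodically evicts a batch of expired keys on its own.
maxmemory eviction: when used memory exceeds the maxmemory limit, an active cleanup policy is triggered.

Passive deletion

Redis checks whether a key has expired only when the key is operated on (for example by GET); if it has, Redis deletes it and returns nil.

1. This strategy is CPU-friendly: deletion happens only when it cannot be avoided, so no CPU time is wasted on other expired keys.

2. It is not memory-friendly, however: an expired key stays in memory until it is next accessed. If there are many expired keys that are rarely accessed, a large amount of memory is wasted.

The check is implemented by expireIfNeeded(redisDb *db, robj *key) in src/db.c: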

/*-----------------------------------------------------------------------------
 * Expires API
 *----------------------------------------------------------------------------*/

int removeExpire(redisDb *db, robj *key) {
    /* An expire may only be removed if there is a corresponding entry in the
     * main dict. Otherwise, the key will never be freed. */
    redisAssertWithInfo(NULL,key,dictFind(db->dict,key->ptr) != NULL);
    return dictDelete(db->expires,key->ptr) == DICT_OK;
}

void setExpire(redisDb *db, robj *key, long long when) {
    dictEntry *kde, *de;

    /* Reuse the sds from the main dict in the expire dict */
    kde = dictFind(db->dict,key->ptr);
    redisAssertWithInfo(NULL,key,kde != NULL);
    de = dictReplaceRaw(db->expires,dictGetKey(kde));
    dictSetSignedIntegerVal(de,when);
}

/* Return the expire time of the specified key, or -1 if no expire
 * is associated with this key (i.e. the key is non volatile) */
long long getExpire(redisDb *db, robj *key) {
    dictEntry *de;

    /* No expire? return ASAP */
    if (dictSize(db->expires) == 0 ||
       (de = dictFind(db->expires,key->ptr)) == NULL) return -1;

    /* The entry was found in the expire dict, this means it should also
     * be present in the main dict (safety check). */
    redisAssertWithInfo(NULL,key,dictFind(db->dict,key->ptr) != NULL);
    return dictGetSignedIntegerVal(de);
}

/* Propagate expires into slaves and the AOF file.
 * When a key expires in the master, a DEL operation for this key is sent
 * to all the slaves and the AOF file if enabled.
 *
 * This way the key expiry is centralized in one place, and since both
 * AOF and the master->slave link guarantee operation ordering, everything
 * will be consistent even if we allow write operations against expiring
 * keys. */
void propagateExpire(redisDb *db, robj *key) {
    robj *argv[2];

    argv[0] = shared.del;
    argv[1] = key;
    incrRefCount(argv[0]);
    incrRefCount(argv[1]);

    if (server.aof_state != REDIS_AOF_OFF)
        feedAppendOnlyFile(server.delCommand,db->id,argv,2);
    replicationFeedSlaves(server.slaves,db->id,argv,2);

    decrRefCount(argv[0]);
    decrRefCount(argv[1]);
}

int expireIfNeeded(redisDb *db, robj *key) {
    mstime_t when = getExpire(db,key);
    mstime_t now;

    if (when < 0) return 0; /* No expire for this key */

    /* Don't expire anything while loading. It will be done later. */
    if (server.loading) return 0;

    /* If we are in the context of a Lua script, we claim that time is
     * blocked to when the Lua script started. This way a key can expire
     * only the first time it is accessed and not in the middle of the
     * script execution, making propagation to slaves / AOF consistent.
     * See issue #1525 on Github for more information. */
    now = server.lua_caller ? server.lua_time_start : mstime();

    /* If we are running in the context of a slave, return ASAP:
     * the slave key expiration is controlled by the master that will
     * send us synthesized DEL operations for expired keys.
     *
     * Still we try to return the right information to the caller,
     * that is, 0 if we think the key should be still valid, 1 if
     * we think the key is expired at this time. */
    if (server.masterhost != NULL) return now > when;

    /* Return when this key has not expired */
    if (now <= when) return 0;

    /* Delete the key */
    server.stat_expiredkeys++;
    propagateExpire(db,key);
    notifyKeyspaceEvent(REDIS_NOTIFY_EXPIRED,
        "expired",key,db->id);
    return dbDelete(db,key);
}

/*-----------------------------------------------------------------------------
 * Expires Commands
 *----------------------------------------------------------------------------*/

/* This is the generic command implementation for EXPIRE, PEXPIRE, EXPIREAT
 * and PEXPIREAT. Because the commad second argument may be relative or absolute
 * the "basetime" argument is used to signal what the base time is (either 0
 * for *AT variants of the command, or the current time for relative expires).
 *
 * unit is either UNIT_SECONDS or UNIT_MILLISECONDS, and is only used for
 * the argv[2] parameter. The basetime is always specified in milliseconds. */
void expireGenericCommand(redisClient *c, long long basetime, int unit) {
    robj *key = c->argv[1], *param = c->argv[2];
    long long when; /* unix time in milliseconds when the key will expire. */

    if (getLongLongFromObjectOrReply(c, param, &when, NULL) != REDIS_OK)
        return;

    if (unit == UNIT_SECONDS) when *= 1000;
    when += basetime;

    /* No key, return zero. */
    if (lookupKeyRead(c->db,key) == NULL) {
        addReply(c,shared.czero);
        return;
    }

    /* EXPIRE with negative TTL, or EXPIREAT with a timestamp into the past
     * should never be executed as a DEL when load the AOF or in the context
     * of a slave instance.
     *
     * Instead we take the other branch of the IF statement setting an expire
     * (possibly in the past) and wait for an explicit DEL from the master. */
    if (when <= mstime() && !server.loading && !server.masterhost) {
        robj *aux;

        redisAssertWithInfo(c,key,dbDelete(c->db,key));
        server.dirty++;

        /* Replicate/AOF this as an explicit DEL. */
        aux = createStringObject("DEL",3);
        rewriteClientCommandVector(c,2,aux,key);
        decrRefCount(aux);
        signalModifiedKey(c->db,key);
        notifyKeyspaceEvent(REDIS_NOTIFY_GENERIC,"del",key,c->db->id);
        addReply(c, shared.cone);
        return;
    } else {
        setExpire(c->db,key,when);
        addReply(c,shared.cone);
        signalModifiedKey(c->db,key);
        notifyKeyspaceEvent(REDIS_NOTIFY_GENERIC,"expire",key,c->db->id);
        server.dirty++;
        return;
    }
}

void expireCommand(redisClient *c) {
    expireGenericCommand(c,mstime(),UNIT_SECONDS);
}

void expireatCommand(redisClient *c) {
    expireGenericCommand(c,0,UNIT_SECONDS);
}

void pexpireCommand(redisClient *c) {
    expireGenericCommand(c,mstime(),UNIT_MILLISECONDS);
}

void pexpireatCommand(redisClient *c) {
    expireGenericCommand(c,0,UNIT_MILLISECONDS);
}

void ttlGenericCommand(redisClient *c, int output_ms) {
    long long expire, ttl = -1;

    /* If the key does not exist at all, return -2 */
    if (lookupKeyRead(c->db,c->argv[1]) == NULL) {
        addReplyLongLong(c,-2);
        return;
    }
    /* The key exists. Return -1 if it has no expire, or the actual
     * TTL value otherwise. */
    expire = getExpire(c->db,c->argv[1]);
    if (expire != -1) {
        ttl = expire-mstime();
        if (ttl < 0) ttl = 0;
    }
    if (ttl == -1) {
        addReplyLongLong(c,-1);
    } else {
        addReplyLongLong(c,output_ms ? ttl : ((ttl+500)/1000));
    }
}

void ttlCommand(redisClient *c) {
    ttlGenericCommand(c, 0);
}

void pttlCommand(redisClient *c) {
    ttlGenericCommand(c, 1);
}

void persistCommand(redisClient *c) {
    dictEntry *de;

    de = dictFind(c->db->dict,c->argv[1]->ptr);
    if (de == NULL) {
        addReply(c,shared.czero);
    } else {
        if (removeExpire(c->db,c->argv[1])) {
            addReply(c,shared.cone);
            server.dirty++;
        } else {
            addReply(c,shared.czero);
        }
    }
}

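Lazy deletion alone is not enough, though. Some keys may never be accessed again, and keys with a timeout still need to be removed once they expire. This can even be regarded as a kind of memory leak: useless garbage occupies a large amount of memory that the server never frees on its own, which is bad news for a Redis server whose operation depends so heavily on memory.

Active deletion

First, a word about time events. A long-running server needs to check and tidy up its own resources and state at regular intervals in order to stay healthy and stable. These operations are collectively called cron jobs.

In Redis, cron jobs are implemented by redis.c/serverCron, which mainly does the following:

Updates the server's statistics, such as time, memory usage and database usage.
Cleans up expired key-value pairs in the databases.
Resizes databases whose size is unreasonable.
Closes and cleans up clients whose connections are no longer valid.
Attempts AOF or RDB persistence.
If the server is a master, periodically synchronizes its slaves.
In cluster mode, periodically synchronizes with the cluster and tests its connections.

Redis runs serverCron as a time event so that it fires automatically at regular intervals. Because serverCron has to keep running for as long as the server is up, it is a recurring time event: it is executed periodically until the server shuts down.

In Redis 2.6, serverCron is hard-coded to run 10 times per second, that is, once every 100 milliseconds on average. Since Redis 2.8, the number of runs per second can be adjusted with the hz option; see the description of hz in redis.conf for details.

Active deletion is also called periodic deletion. "Periodic" refers to the cleanup that Redis triggers at regular intervals, performed by the activeExpireCycle(void) function in src/redis.c.

serverCron is a periodic task driven by Redis's event framework. It calls activeExpireCycle, which for each db deletes as many expired keys as possible within the time limit REDIS_EXPIRELOOKUPS_TIME_LIMIT. The time limit exists so that long blocking does not affect normal operation. Active deletion makes up for the poor memory behaviour of passive deletion.

Redis therefore periodically tests a random batch of keys that have a timeout and deletes those found to be expired. Typically, Redis does the following 10 times per second:

1. Test 100 random keys that have a timeout set.
2. Delete all keys found to be expired.
3. If more than 25 keys were deleted, repeat from step 1.

This is a simple probabilistic algorithm. It assumes that the sample is representative of the whole key space, and Redis keeps expiring keys until the percentage of keys that are likely to be expired drops below 25%. It also means that at any given moment, the number of already expired keys still occupying memory is at most the number of write operations per second divided by 4.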
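The sampling loop can be sketched as a small simulation. This is a simplified model under stated assumptions, not the actual activeExpireCycle: the key space is a plain array of absolute expire times, a tiny linear congruential generator stands in for random key selection, max_iterations stands in for the time limit, and SAMPLE_SIZE is set to 20 (the per-call figure used later in this article; set it to 100 to match the step list).

```c
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_SIZE 20            /* keys tested per iteration (assumed) */
#define REPEAT_THRESHOLD_PERC 25  /* repeat while more than 25% expired */

/* Tiny linear congruential generator so the sketch is reproducible. */
static uint32_t next_rand(uint32_t *state) {
    *state = *state * 1664525u + 1013904223u;
    return *state >> 8;
}

/* expire_ms[i] is the absolute expire time of key i; 0 marks a slot that
 * has already been deleted. Returns the number of keys deleted. */
size_t active_expire_cycle(int64_t *expire_ms, size_t nkeys, int64_t now_ms,
                           int max_iterations, uint32_t *rng) {
    size_t deleted = 0;
    if (nkeys == 0) return 0;
    for (int iter = 0; iter < max_iterations; iter++) {
        int expired = 0;
        for (int i = 0; i < SAMPLE_SIZE; i++) {
            size_t idx = next_rand(rng) % nkeys;
            if (expire_ms[idx] != 0 && expire_ms[idx] < now_ms) {
                expire_ms[idx] = 0;   /* "delete" the expired key */
                expired++;
            }
        }
        deleted += (size_t)expired;
        /* Stop once 25% or fewer of the sampled keys were expired. */
        if (expired * 100 <= SAMPLE_SIZE * REPEAT_THRESHOLD_PERC) break;
    }
    return deleted;
}
```

Keys that have not yet expired are never touched, and the loop ends as soon as a sample suggests that expired keys make up 25% or less of the key space, which is the stopping rule described above.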

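In Redis 3.0.0 the default value of hz is 10, meaning the background task is invoked 10 times per second.

Besides the frequency of active expiration, Redis also caps the duration of each run so that active expiration does not block client requests for too long. The cap is computed as follows: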

#define ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC 25 /* CPU max % for keys collection */  
...  
timelimit = 1000000*ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC/server.hz/100;

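Raising hz increases the frequency of active expiration. If your Redis instance holds a lot of cold data that uses too much memory, consider raising this value, although the Redis author recommends not going above 100. In our production environment we raised it to 100 and observed a CPU increase of about 2%, while cold data was released noticeably faster (judged by the number of keys in the keyspace and by used_memory).

As the formula shows, timelimit is inversely proportional to server.hz: the larger hz is, the smaller timelimit becomes. In other words, the more often active expiration is expected to run per second, the shorter each run may take. The maximum time spent on expiration per second is fixed at 250 ms (1000000*ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC/100 microseconds); hz controls how that budget is split into individual runs.

From this analysis, as long as the proportion of expired keys stays below 25%, raising hz clearly raises the minimum number of keys scanned. With hz at 10, at least 200 keys are scanned per second (10 calls per second times at least 20 randomly sampled keys per call); with hz at 100, at least 2000 keys are scanned per second. If the proportion of expired keys exceeds 25%, on the other hand, there is no upper bound on the number of keys scanned, but CPU time is capped at 250 ms per second. Note that the figure of 20 keys per call differs from the 100 quoted in the step list earlier; the two appear to come from different descriptions of the algorithm.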
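The arithmetic above can be checked with a few lines of C. This is only a sketch of the quoted formula: the function names are hypothetical, SLOW_TIME_PERC mirrors ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC, and KEYS_PER_CALL is the assumed minimum sample of 20 keys per call mentioned in the text.

```c
#define SLOW_TIME_PERC 25   /* mirrors ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC */
#define KEYS_PER_CALL 20    /* assumed minimum sample size per call */

/* Time budget of a single run, in microseconds (the quoted formula). */
long long expire_timelimit_us(int hz) {
    return 1000000LL * SLOW_TIME_PERC / hz / 100;
}

/* Total time budget per second, in microseconds. */
long long expire_budget_per_second_us(int hz) {
    return expire_timelimit_us(hz) * hz;
}

/* Minimum number of keys sampled per second. */
long long min_keys_sampled_per_second(int hz) {
    return (long long)hz * KEYS_PER_CALL;
}
```

With hz set to 10 each run may take up to 25000 microseconds, and with hz set to 100 up to 2500 microseconds; in both cases the per-second budget is 250000 microseconds, while the minimum number of sampled keys grows from 200 to 2000. Because the formula uses integer division, values of hz that do not divide the budget evenly yield a per-second total slightly below 250 ms.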

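When Redis runs in master-slave mode, only the master performs the two expiration strategies above; it then propagates each deletion to the slaves as a "del key" operation.

maxmemory

When used memory exceeds the maxmemory limit, an active cleanup policy is triggered:

volatile-lru: evict by LRU among keys that have a timeout set (marked as the default in the original write-up, although the redis.conf excerpt below lists noeviction as the default)
allkeys-lru: evict any key according to the LRU algorithm
volatile-random: evict a random key among those that have a timeout set
allkeys-random: evict a random key, any key
volatile-ttl: evict the key with the nearest expire time
noeviction: never evict; return an error on write operations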
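The difference between the policies is mainly which keys are eligible for eviction. The following sketch expresses that as a predicate; the enum and function names are hypothetical and are not taken from the Redis source.

```c
#include <stdbool.h>

/* Hypothetical names for the six maxmemory-policy values. */
typedef enum {
    POLICY_VOLATILE_LRU,
    POLICY_ALLKEYS_LRU,
    POLICY_VOLATILE_RANDOM,
    POLICY_ALLKEYS_RANDOM,
    POLICY_VOLATILE_TTL,
    POLICY_NOEVICTION
} eviction_policy;

/* Returns true if a key may be considered for eviction under the policy.
 * has_expire tells whether the key has a timeout set. */
bool is_eviction_candidate(eviction_policy policy, bool has_expire) {
    switch (policy) {
    case POLICY_ALLKEYS_LRU:
    case POLICY_ALLKEYS_RANDOM:
        return true;           /* any key is eligible */
    case POLICY_VOLATILE_LRU:
    case POLICY_VOLATILE_RANDOM:
    case POLICY_VOLATILE_TTL:
        return has_expire;     /* only keys with a timeout */
    case POLICY_NOEVICTION:
    default:
        return false;          /* nothing is evicted; writes fail instead */
    }
}
```

Under the volatile-* policies a data set in which no key has a timeout offers no candidates at all, so writes fail just as they do with noeviction.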

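Once mem_used exceeds the maxmemory setting, every read or write request triggers redis.c/freeMemoryIfNeeded(void) to reclaim the excess memory. Note that this cleanup is blocking and runs until enough memory has been freed. If maxmemory has been reached and callers keep writing, the cleanup may be triggered over and over, adding latency to requests.

The cleanup follows the configured maxmemory-policy (usually LRU or TTL). These LRU and TTL policies do not consider every key in Redis; instead they draw a sample pool of maxmemory-samples keys, as set in the configuration file, and evict from that sample.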
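Sample-based selection can be sketched as follows. This is an illustration of the idea rather than the Redis implementation: the function name, the last_access_ms array and the explicit list of sampled indexes are assumptions made for the example, and the caller is expected to pass at least one sample.

```c
#include <stddef.h>
#include <stdint.h>

/* Among the sampled key indexes, return the index of the key whose last
 * access time is the oldest. Requires nsamples >= 1. */
size_t pick_lru_from_sample(const uint64_t *last_access_ms,
                            const size_t *sample, size_t nsamples) {
    size_t best = sample[0];
    for (size_t i = 1; i < nsamples; i++) {
        if (last_access_ms[sample[i]] < last_access_ms[best]) {
            best = sample[i];
        }
    }
    return best;
}
```

With last access times {50, 10, 40, 30, 20, 60}, a sample of keys 0, 2 and 5 picks key 2, even though key 1 is the least recently used overall; only a sample that happens to include key 1 finds the true LRU key. A larger sample makes that more likely, at the cost of more comparisons per eviction. The same selection applied to expire times instead of access times gives the TTL variant.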

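maxmemory-samples defaults to 5 in redis-3.0.0. Increasing it improves the accuracy of LRU and TTL; according to the Redis author's tests, a value of 10 already comes very close to true LRU. A larger maxmemory-samples also costs more CPU time during cleanup. Recommendations:

Try not to hit maxmemory at all. Once mem_used reaches a certain fraction of maxmemory, consider raising hz to speed up expiration, or scale out the cluster.
If memory is kept under control, there is no need to change maxmemory-samples. If Redis itself serves as an LRU cache (such a service typically sits at maxmemory for long periods and relies on Redis for LRU eviction), maxmemory-samples can be raised moderately.

The configuration parameters mentioned above are described as follows: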

# Redis calls an internal function to perform many background tasks, like  
# closing connections of clients in timeout, purging expired keys that are  
# never requested, and so forth.  
#  
# Not all tasks are performed with the same frequency, but Redis checks for  
# tasks to perform according to the specified "hz" value.  
#  
# By default "hz" is set to 10. Raising the value will use more CPU when  
# Redis is idle, but at the same time will make Redis more responsive when  
# there are many keys expiring at the same time, and timeouts may be  
# handled with more precision.  
#  
# The range is between 1 and 500, however a value over 100 is usually not  
# a good idea. Most users should use the default of 10 and raise this up to  
# 100 only in environments where very low latency is required.  
hz 10  

# MAXMEMORY POLICY: how Redis will select what to remove when maxmemory  
# is reached. You can select among five behaviors:  
#  
# volatile-lru -> remove the key with an expire set using an LRU algorithm  
# allkeys-lru -> remove any key according to the LRU algorithm  
# volatile-random -> remove a random key with an expire set  
# allkeys-random -> remove a random key, any key  
# volatile-ttl -> remove the key with the nearest expire time (minor TTL)  
# noeviction -> don't expire at all, just return an error on write operations  
#  
# Note: with any of the above policies, Redis will return an error on write  
#       operations, when there are no suitable keys for eviction.  
#  
#       At the date of writing these commands are: set setnx setex append  
#       incr decr rpush lpush rpushx lpushx linsert lset rpoplpush sadd  
#       sinter sinterstore sunion sunionstore sdiff sdiffstore zadd zincrby  
#       zunionstore zinterstore hset hsetnx hmset hincrby incrby decrby  
#       getset mset msetnx exec sort  
#  
# The default is:  
#  
maxmemory-policy noeviction  

# LRU and minimal TTL algorithms are not precise algorithms but approximated  
# algorithms (in order to save memory), so you can tune it for speed or  
# accuracy. For default Redis will check five keys and pick the one that was  
# used less recently, you can change the sample size using the following  
# configuration directive.  
#  
# The default of 5 produces good enough results. 10 Approximates very closely  
# true LRU but costs a bit more CPU. 3 is very fast but not very accurate.  
#  
maxmemory-samples 5

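Expiration handling in the replication link and the AOF file

To behave correctly without causing consistency problems, when a key expires a DEL operation is written to the AOF file and sent to all attached slaves. Expiration is thus centralized in the master instance and propagated downstream instead of being controlled by each slave separately, so no inconsistency arises. A slave connected to a master does not delete expired keys on its own (it waits for the DEL from the master), but it still tracks the expiration state of its data set so that, when promoted to master, it can expire keys independently just as a master does.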
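The division of labour between master and slave can be condensed into a small decision function. This is a simplified sketch modelled on the expireIfNeeded listing earlier in the article; the enum and function names are hypothetical, and the actual deletion and propagation are reduced to a return value.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    KEY_VALID,                   /* no timeout, or not yet expired */
    KEY_LOGICALLY_EXPIRED,       /* slave: report expired, wait for DEL */
    KEY_DELETED_AND_PROPAGATED   /* master: delete and send DEL downstream */
} expire_action;

/* when_ms is the absolute expire time, or a negative value if the key has
 * no timeout. now_ms is the current time on this instance. */
expire_action check_expire(bool is_slave, int64_t when_ms, int64_t now_ms) {
    if (when_ms < 0) return KEY_VALID;
    if (now_ms <= when_ms) return KEY_VALID;
    if (is_slave) return KEY_LOGICALLY_EXPIRED;
    return KEY_DELETED_AND_PROPAGATED;
}
```

A slave therefore keeps enough state to answer whether a key has expired, but leaves the deletion itself to the DEL it receives from the master.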