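Learning Redis: Expired-Key Cleanup Strategy on the Master Node

Redis expired-key deletion

Redis uses server.dbnum to control how many DBs an instance contains. Each DB is represented by a redisDb structure, defined as follows: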

/* Redis database representation. There are multiple databases identified
 * by integers from 0 (the default database) up to the max configured
 * database. The database number is the 'id' field in the structure. */
typedef struct redisDb {
    dict *dict;                 /* The keyspace for this DB */
    dict *expires;              /* Timeout of keys with a timeout set */
    dict *blocking_keys;        /* Keys with clients waiting for data (BLPOP)*/
    dict *ready_keys;           /* Blocked keys that received a PUSH */
    dict *watched_keys;         /* WATCHED keys for MULTI/EXEC CAS */
    int id;                     /* Database ID */
    long long avg_ttl;          /* Average TTL, just for stats */
    list *defrag_later;         /* List of key names to attempt to defrag one by one, gradually. */
} redisDb;

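The dict dictionary stores all keys of the DB, and the expires dictionary stores every key of the DB that has an expiration time set. The value stored in expires is the expiration time of the corresponding key, as an absolute UNIX timestamp in milliseconds.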
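As a rough illustration of what the expires dictionary holds, the sketch below models it as a small array that maps key names to absolute expiration times in milliseconds. The names expire_entry, lookup_expire and is_expired are hypothetical and are not Redis APIs. The -1 result for a key without an expire and the strict "now > when" comparison are assumptions chosen to match the checks in the Redis source quoted in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the expires dictionary:
 * key name -> absolute expiration time (UNIX time in milliseconds). */
typedef struct {
    const char *key;
    long long when_ms;
} expire_entry;

/* Return the expiration time of 'key', or -1 if the key has no expire.
 * A linear scan replaces the real hash-table lookup for brevity. */
long long lookup_expire(const expire_entry *tab, size_t n, const char *key) {
    assert(key != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tab[i].key, key) == 0) return tab[i].when_ms;
    }
    return -1;
}

/* A key is logically expired once the current time is past its timestamp. */
int is_expired(long long when_ms, long long now_ms) {
    if (when_ms < 0) return 0; /* no expire set */
    return now_ms > when_ms;
}
```

A key that is missing from the table simply never expires, which is why only keys with a timeout need an entry in expires.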

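There are three strategies for deleting expired keys:

  • Timed deletion: when an expiration is set on a key, a timer is created as well, and the key is deleted as soon as the timer fires. The advantage is that a key is removed immediately once it expires; the drawback is that the timers consume too much CPU.
  • Lazy deletion: on every request for a key, check whether the key has expired and delete it if so. The advantage is low CPU cost; the drawback is poor timeliness, since an expired key that is never accessed can stay in memory for a long time.
  • Periodic deletion: triggered by a scheduled task that walks the RedisDBs, randomly samples keys from each DB's expires dictionary, and deletes the ones that have already expired.

In production, Redis cleans up expired keys mainly through the combination of lazy deletion and periodic deletion.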

Lazy deletion strategy

Lazy deletion relies mainly on the function expireIfNeeded. Operations such as lookupKeyRead, lookupKeyWrite, and dbRandomKey all call expireIfNeeded to check whether a key has expired.

/* This function is called when we are going to perform some operation
 * in a given key, but such key may be already logically expired even if
 * it still exists in the database. The main way this function is called
 * is via lookupKey*() family of functions.
 *
 * The behavior of the function depends on the replication role of the
 * instance, because slave instances do not expire keys, they wait
 * for DELs from the master for consistency matters. However even
 * slaves will try to have a coherent return value for the function,
 * so that read commands executed in the slave side will be able to
 * behave like if the key is expired even if still present (because the
 * master has yet to propagate the DEL).
 *
 * In masters as a side effect of finding a key which is expired, such
 * key will be evicted from the database. Also this may trigger the
 * propagation of a DEL/UNLINK command in AOF / replication stream.
 *
 * The return value of the function is 0 if the key is still valid,
 * otherwise the function returns 1 if the key is expired. */
int expireIfNeeded(redisDb *db, robj *key) {
    if (!keyIsExpired(db,key)) return 0;

    /* If we are running in the context of a slave, instead of
     * evicting the expired key from the database, we return ASAP:
     * the slave key expiration is controlled by the master that will
     * send us synthesized DEL operations for expired keys.
     *
     * Still we try to return the right information to the caller,
     * that is, 0 if we think the key should be still valid, 1 if
     * we think the key is expired at this time. */
    if (server.masterhost != NULL) return 1;

    /* Delete the key */
    server.stat_expiredkeys++;
    propagateExpire(db,key,server.lazyfree_lazy_expire);
    notifyKeyspaceEvent(NOTIFY_EXPIRED,
        "expired",key,db->id);
    return server.lazyfree_lazy_expire ? dbAsyncDelete(db,key) :
                                         dbSyncDelete(db,key);
}
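The role-dependent behaviour of expireIfNeeded can be condensed into a small model. This is a minimal sketch, not the Redis implementation: toy_key and toy_expire_if_needed are hypothetical names, a single struct stands in for the keyspace, and propagation, statistics and keyspace notifications are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-key "database": present flag plus expiration time.
 * when_ms < 0 means the key has no expire set. */
typedef struct {
    int present;
    long long when_ms;
} toy_key;

/* Returns 1 if the key is logically expired, 0 if it is still valid.
 * Only a master (is_replica == 0) actually removes the key; a replica
 * just reports the key as expired and waits for the master's DEL. */
int toy_expire_if_needed(toy_key *k, long long now_ms, int is_replica) {
    assert(k != NULL);
    if (!k->present || k->when_ms < 0 || now_ms <= k->when_ms) return 0;
    if (is_replica) return 1; /* report expired, but do not delete */
    k->present = 0;           /* master: delete on access */
    return 1;
}
```

On a replica the key stays in memory after the call, yet the caller still sees it as expired, which matches the "coherent return value" described in the comment of the original function.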

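Active (periodic) deletion strategy

The active deletion strategy relies on activeExpireCycleTryExpire to delete a single key. Expired keys are cleaned up by activeExpireCycle on a master instance and by expireSlaveKeys on a replica instance. Note that expireSlaveKeys only handles keys that were given an expire directly on a writable replica; keys replicated from the master are removed by the DELs that the master propagates.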

/*-----------------------------------------------------------------------------
 * Incremental collection of expired keys.
 *
 * When keys are accessed they are expired on-access. However we need a
 * mechanism in order to ensure keys are eventually removed when expired even
 * if no access is performed on them.
 *----------------------------------------------------------------------------*/

/* Helper function for the activeExpireCycle() function.
 * This function will try to expire the key that is stored in the hash table
 * entry 'de' of the 'expires' hash table of a Redis database.
 *
 * If the key is found to be expired, it is removed from the database and
 * 1 is returned. Otherwise no operation is performed and 0 is returned.
 *
 * When a key is expired, server.stat_expiredkeys is incremented.
 *
 * The parameter 'now' is the current time in milliseconds as is passed
 * to the function to avoid too many gettimeofday() syscalls. */
int activeExpireCycleTryExpire(redisDb *db, dictEntry *de, long long now) {
    long long t = dictGetSignedIntegerVal(de);
    if (now > t) {
        sds key = dictGetKey(de);
        robj *keyobj = createStringObject(key,sdslen(key));

        propagateExpire(db,keyobj,server.lazyfree_lazy_expire);
        if (server.lazyfree_lazy_expire)
            dbAsyncDelete(db,keyobj);
        else
            dbSyncDelete(db,keyobj);
        notifyKeyspaceEvent(NOTIFY_EXPIRED,
            "expired",keyobj,db->id);
        decrRefCount(keyobj);
        server.stat_expiredkeys++;
        return 1;
    } else {
        return 0;
    }
}

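The function activeExpireCycle provides two working modes:

  • ACTIVE_EXPIRE_CYCLE_FAST, the fast expire mode: a run lasts no longer than ACTIVE_EXPIRE_CYCLE_FAST_DURATION microseconds (1000 by default, i.e. 1 ms), and a new fast cycle is not started until twice that duration (2 ms) has passed since the previous one.
  • ACTIVE_EXPIRE_CYCLE_SLOW, the normal expire mode: the run time is capped at 1000000*ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC/server.hz/100 microseconds. ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC defaults to 25 and server.hz defaults to 10, so the default cap is 25000 microseconds (25 ms).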
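The arithmetic behind the slow-mode cap can be checked with a few lines of C. This is a sketch under assumptions: slow_cycle_budget_us and the SLOW_TIME_PERC macro are hypothetical names that mirror the formula quoted above, with the 25 percent default hard-coded.

```c
#include <assert.h>

/* Assumed default of ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC. */
#define SLOW_TIME_PERC 25

/* Time budget of one slow expire cycle in microseconds, for a given hz.
 * Same integer arithmetic as the formula in the text, including the
 * clamp to at least 1 microsecond. */
long long slow_cycle_budget_us(int hz) {
    assert(hz > 0);
    long long limit = 1000000LL * SLOW_TIME_PERC / hz / 100;
    return limit <= 0 ? 1 : limit;
}
```

With hz = 10 the function returns 25000 (25 ms per call); raising hz to 100 shrinks each call to 2500 microseconds, while the total share of each second stays at 25 percent.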

/* Try to expire a few timed out keys. The algorithm used is adaptive and
 * will use few CPU cycles if there are few expiring keys, otherwise
 * it will get more aggressive to avoid that too much memory is used by
 * keys that can be removed from the keyspace.
 *
 * No more than REDIS_DBCRON_DBS_PER_CALL databases are tested at every
 * iteration. (REDIS_DBCRON_DBS_PER_CALL is hard-coded to 16 and cannot
 * be configured.)
 *
 * This kind of call is used when Redis detects that timelimit_exit is
 * true, so there is more work to do, and we do it more incrementally from
 * the beforeSleep() function of the event loop.
 *
 * Expire cycle type:
 *
 * If type is ACTIVE_EXPIRE_CYCLE_FAST the function will try to run a
 * "fast" expire cycle that takes no longer than EXPIRE_FAST_CYCLE_DURATION
 * microseconds, and is not repeated again before the same amount of time.
 *
 * If type is ACTIVE_EXPIRE_CYCLE_SLOW, that normal expire cycle is
 * executed, where the time limit is a percentage of the REDIS_HZ period
 * as specified by the REDIS_EXPIRELOOKUPS_TIME_PERC define.
 */
 
void activeExpireCycle(int type) {
    /* This function has some global state in order to continue the work
     * incrementally across calls. */
    // Static variables that carry state across successive calls
    static unsigned int current_db = 0; /* Last DB tested. */
    static int timelimit_exit = 0;      /* Time limit hit in previous call? */
    static long long last_fast_cycle = 0; /* When last fast cycle ran. */
 
    unsigned int j, iteration = 0;
    // Default number of databases processed per call
    unsigned int dbs_per_call = REDIS_DBCRON_DBS_PER_CALL;
    // Time at which this call started, in microseconds
    long long start = ustime(), timelimit;

    // Fast mode
    if (type == ACTIVE_EXPIRE_CYCLE_FAST) {
        /* Don't start a fast cycle if the previous cycle did not exit
         * because of the time limit. Also don't repeat a fast cycle for the
         * same period as the fast cycle total duration itself. */
        // If the previous call did not hit the time limit, skip the fast cycle
        if (!timelimit_exit) return;
        // If less than twice the fast-cycle duration has elapsed since the
        // last fast cycle, skip it
        if (start < last_fast_cycle + ACTIVE_EXPIRE_CYCLE_FAST_DURATION*2) return;
        // Reaching this point means a fast cycle will run; record the current time
        last_fast_cycle = start;
    }
 
    /* We usually should test REDIS_DBCRON_DBS_PER_CALL per iteration, with
     * two exceptions:
     *
     * 1) Don't test more DBs than we have.
     * 2) If last time we hit the time limit, we want to scan all DBs
     * in this iteration, as there is work to do in some DB and we don't want
     * expired keys to use memory for too much time.
     */
    if (dbs_per_call > server.dbnum || timelimit_exit)
        dbs_per_call = server.dbnum;
 
    /* We can use at max ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC percentage of CPU time
     * per iteration. Since this function gets called with a frequency of
     * server.hz times per second, the following is the max amount of
     * microseconds we can spend in this function. */
    // Upper bound, in microseconds, on the time this function may spend
    // ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC defaults to 25, i.e. 25% of CPU time
    timelimit = 1000000*ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC/server.hz/100;
    timelimit_exit = 0;
    if (timelimit <= 0) timelimit = 1;
 
    // In fast mode the function may run for at most
    // ACTIVE_EXPIRE_CYCLE_FAST_DURATION microseconds
    // (default 1000 microseconds)
    if (type == ACTIVE_EXPIRE_CYCLE_FAST)
        timelimit = ACTIVE_EXPIRE_CYCLE_FAST_DURATION; /* in microseconds. */
 
    // Iterate over the databases
    for (j = 0; j < dbs_per_call; j++) {
        int expired;
        // Point to the database to process
        redisDb *db = server.db+(current_db % server.dbnum);
 
        /* Increment the DB now so we are sure if we run out of time
         * in the current DB we'll restart from the next. This allows to
         * distribute the time evenly across DBs. */
        // Increment the DB counter now: if the do loop below exits because of
        // the time limit, the next call starts directly from the next DB
        current_db++;
 
        /* Continue to expire if at the end of the cycle more than 25%
         * of the keys were expired. */
        do {
            unsigned long num, slots;
            long long now, ttl_sum;
            int ttl_samples;
 
            /* If there is nothing to expire try next DB ASAP. */
            // Get the number of keys with an expire set in this database;
            // if it is 0, skip this database
            if ((num = dictSize(db->expires)) == 0) {
                db->avg_ttl = 0;
                break;
            }
            // Get the number of slots (buckets) in the expires hash table
            slots = dictSlots(db->expires);
            // Current time in milliseconds
            now = mstime();
 
            /* When there are less than 1% filled slots getting random
             * keys is expensive, so stop here waiting for better times...
             * The dictionary will be resized asap. */
            // The expires table is less than 1% full, so random sampling would
            // mostly MISS; skip it and wait for the dictionary to be shrunk
            if (num && slots > DICT_HT_INITIAL_SIZE &&
                (num*100/slots < 1)) break;
 
            /* The main collection cycle. Sample random keys among keys
             * with an expire set, checking for expired ones. */
            // Number of expired keys removed in this round
            expired = 0;
            // Sum of the TTLs of the sampled keys
            ttl_sum = 0;
            // Number of keys sampled
            ttl_samples = 0;

            // Check at most ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP keys per round
            // (the constant is hard-coded to 20)
            if (num > ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP)
                num = ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP;
 
            // Sample keys from this database
            while (num--) {
                dictEntry *de;
                long long ttl;
 
                // Pick a random key with an expire set from expires
                if ((de = dictGetRandomKey(db->expires)) == NULL) break;
                // Compute the TTL
                ttl = dictGetSignedIntegerVal(de)-now;
                // If the key has expired, delete it and increment expired
                if (activeExpireCycleTryExpire(db,de,now)) expired++;
                if (ttl < 0) ttl = 0;
                // Accumulate the TTL
                ttl_sum += ttl;
                // Count the sampled key
                ttl_samples++;
            }
 
            /* Update the average TTL stats for this database. */
            // Update the average TTL statistic for this database
            if (ttl_samples) {
                // Average TTL of the current sample
                long long avg_ttl = ttl_sum/ttl_samples;

                // If this is the first time the average TTL is set, initialize it
                if (db->avg_ttl == 0) db->avg_ttl = avg_ttl;
                /* Smooth the value averaging with the previous one. */
                // Take the mean of the previous average TTL and the current one
                db->avg_ttl = (db->avg_ttl+avg_ttl)/2;
            }
 
            /* We can't block forever here even if there are many keys to
             * expire. So after a given amount of milliseconds return to the
             * caller waiting for the other active expire cycle. */
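            // We cannot spend too long processing expired keys,
            // so the function must return after running for a while

            // Update the iteration count
            iteration++;

            // Check the elapsed time only once every 16 iterations
            if ((iteration & 0xf) == 0 && /* check once every 16 iterations. */
                (ustime()-start) > timelimit)
            {
                // The iteration count is a multiple of 16
                // and the elapsed time exceeds timelimit,
                // so set timelimit_exit
                timelimit_exit = 1;
            }

            // Time limit reached, return
            if (timelimit_exit) return;

            /* We don't repeat the cycle if there are less than 25% of keys
             * found expired in the current DB. */
            // Repeat the loop only if more than 25% of the sampled keys were
            // expired (more than ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP/4, i.e. 5);
            // otherwise move on to the next database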
        } while (expired > ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP/4);
    }
}
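The adaptive part of the loop is easier to see in isolation: sample up to 20 keys, delete the expired ones, and repeat only while more than a quarter of a sample was expired. The following is a simplified model, not the Redis code. toy_expire_cycle and next_rand are hypothetical names, a plain array of expiration times stands in for the expires dictionary, and the time-limit check, the 1% fill test and the TTL statistics are omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value of ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP. */
#define LOOKUPS_PER_LOOP 20

/* Tiny linear congruential generator so the sketch needs no global state. */
static unsigned int next_rand(unsigned int *state) {
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fff;
}

/* Run the sampling loop over one toy "expires" table.
 * when_ms[0..*count-1] holds absolute expiration times in milliseconds.
 * Expired entries are removed by swapping in the last element.
 * Returns the total number of entries removed. */
size_t toy_expire_cycle(long long *when_ms, size_t *count,
                        long long now_ms, unsigned int seed) {
    assert(when_ms != NULL && count != NULL);
    size_t total = 0;
    size_t expired = 0;
    do {
        size_t num = *count;
        if (num == 0) break; /* nothing left to sample */
        if (num > LOOKUPS_PER_LOOP) num = LOOKUPS_PER_LOOP;
        expired = 0;
        while (num-- && *count > 0) {
            size_t i = next_rand(&seed) % *count; /* random sample */
            if (now_ms > when_ms[i]) {
                when_ms[i] = when_ms[*count - 1];
                (*count)--;
                expired++;
            }
        }
        total += expired;
        /* Keep going only while the sample was "dirty" enough. */
    } while (expired > LOOKUPS_PER_LOOP / 4);
    return total;
}
```

When every key is expired the loop keeps running until the table is empty, and when none is expired it stops after a single round of 20 samples. In the real function the time limit bounds the first case.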

ACTIVE_EXPIRE_CYCLE_SLOW (normal expire mode) is invoked through serverCron --> databasesCron. Its frequency is controlled by redisServer.hz, whose default value is 10, i.e. 10 runs per second.

/* This function handles 'background' operations we are required to do
 * incrementally in Redis databases, such as active key expiring, resizing,
 * rehashing. */
void databasesCron(void) {
    /* Expire keys by random sampling. Not required for slaves
     * as master will synthesize DELs for us. */
    if (server.active_expire_enabled) {
        if (server.masterhost == NULL) {
            activeExpireCycle(ACTIVE_EXPIRE_CYCLE_SLOW);
        } else {
            expireSlaveKeys();
        }
    }
}

/* This is our timer interrupt, called server.hz times per second.
 * Here is where we do a number of things that need to be done asynchronously.
 * For instance:
 *
 * - Active expired keys collection (it is also performed in a lazy way on
 *   lookup).
 * - Software watchdog.
 * - Update some statistic.
 * - Incremental rehashing of the DBs hash tables.
 * - Triggering BGSAVE / AOF rewrite, and handling of terminated children.
 * - Clients timeout of different kinds.
 * - Replication reconnection.
 * - Many more...
 *
 * Everything directly called here will be called server.hz times per second,
 * so in order to throttle execution of things we want to do less frequently
 * a macro is used: run_with_period(milliseconds) { .... }
 */
int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    ......
    /* We need to do a few operations on clients asynchronously. */
    clientsCron();

    /* Handle background operations on Redis databases. */
    databasesCron();
    ......
}

ACTIVE_EXPIRE_CYCLE_FAST (fast expire mode) is invoked from beforeSleep, which the main function binds to the server.el (aeEventLoop) event loop. The fast expire mode runs more frequently, but each run is short (at most 1 ms).

/* This function gets called every time Redis is entering the
 * main loop of the event driven library, that is, before to sleep
 * for ready file descriptors. */
void beforeSleep(struct aeEventLoop *eventLoop) {
    UNUSED(eventLoop);

    /* Call the Redis Cluster before sleep function. Note that this function
     * may change the state of Redis Cluster (from ok to fail or vice versa),
     * so it's a good idea to call it before serving the unblocked clients
     * later in this function. */
    if (server.cluster_enabled) clusterBeforeSleep();

    /* Run a fast expire cycle (the called function will return
     * ASAP if a fast cycle is not needed). */
    if (server.active_expire_enabled && server.masterhost == NULL)
        activeExpireCycle(ACTIVE_EXPIRE_CYCLE_FAST);
}
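Because beforeSleep runs on every pass of the event loop, the fast cycle depends on the two guard conditions at the top of activeExpireCycle to avoid running too often. They can be sketched as a small gate function. cycle_state and fast_cycle_may_run are hypothetical names; the struct stands in for the static variables of the real function, and the 1000-microsecond duration is the default quoted in this article.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed default of ACTIVE_EXPIRE_CYCLE_FAST_DURATION, in microseconds. */
#define FAST_DURATION_US 1000

/* Stand-in for the static state kept inside activeExpireCycle. */
typedef struct {
    int timelimit_exit;        /* did the previous cycle hit its time limit? */
    long long last_fast_cycle; /* start time of the last fast cycle (us) */
} cycle_state;

/* Returns 1 if a fast cycle may start at time now_us, 0 otherwise.
 * When it returns 1 it also records now_us as the last fast cycle. */
int fast_cycle_may_run(cycle_state *st, long long now_us) {
    assert(st != NULL);
    /* No pending work: the previous cycle finished within its budget. */
    if (!st->timelimit_exit) return 0;
    /* Too soon: less than twice the fast duration since the last run. */
    if (now_us < st->last_fast_cycle + FAST_DURATION_US * 2) return 0;
    st->last_fast_cycle = now_us;
    return 1;
}
```

So a fast cycle only runs when an earlier cycle was cut short by its time limit, and at most once every 2 ms.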

In the periodic cleanup strategy, most of the work is still done by ACTIVE_EXPIRE_CYCLE_SLOW, the normal expire mode.
