Redis: Data Deletion Strategies and Eviction Policies

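Introduction to Redis

Redis is a high-performance in-memory key-value database written in C. It can be used as a database, a cache, and a message broker.

Features

  1. As an in-memory database it performs very well: data lives in memory, so reads and writes are fast, on the order of 100,000 QPS (queries per second). Commands are executed by a single thread in a single process, so each command runs atomically, and network I/O is handled with I/O multiplexing.

  2. Rich data types: strings, hashes, lists, sets, sorted sets, and more. Data persistence is supported: in-memory data can be saved to disk and reloaded on restart.

  3. Master-replica replication, Sentinel, and high availability. Redis can be used to build distributed locks, and it can act as a message broker with publish/subscribe support.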


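Deletion Strategies and Eviction Policies

What is expired data?

Redis is an in-memory database: all data is stored in memory, and the state of a key can be checked with the TTL command, which returns one of the following:

  • XX (a positive number): data with a limited lifetime; XX is the remaining time to live. An expiration is set with the following commands:
    • setex key seconds value
    • expire key seconds
    • expireat key timestamp
    • pexpire key milliseconds
    • pexpireat key milliseconds-timestamp
  • -1: data that is valid permanently (no expiration set)
  • -2: data that has expired, was never defined, or has been deleted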
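The three kinds of TTL reply can be summarized in a few lines of C. This is a minimal sketch, not the Redis implementation: the function name, the constants, and the rounding to the nearest second are assumptions made for illustration.

```c
#include <assert.h>

/* Illustrative constants; the names are invented for this sketch. */
#define DEMO_TTL_NO_KEY    (-2)  /* key missing: expired, deleted, or never set */
#define DEMO_TTL_NO_EXPIRE (-1)  /* key exists but has no expiration */
#define DEMO_NO_EXPIRE_SET (-1)  /* marker: no entry in the expires space */

/* Returns the value a TTL-style query would report, in seconds.
 * key_exists:   1 if the key is present in the keyspace, else 0
 * expire_at_ms: absolute expiration time in ms, or DEMO_NO_EXPIRE_SET
 * now_ms:       current time in ms */
long long demo_ttl_reply(int key_exists, long long expire_at_ms, long long now_ms) {
    assert(now_ms >= 0);
    if (!key_exists) return DEMO_TTL_NO_KEY;
    if (expire_at_ms == DEMO_NO_EXPIRE_SET) return DEMO_TTL_NO_EXPIRE;
    long long remaining_ms = expire_at_ms - now_ms;
    if (remaining_ms <= 0) return DEMO_TTL_NO_KEY;  /* already expired */
    return (remaining_ms + 500) / 1000;             /* assumed rounding */
}
```

For example, a key that expires at 11000 ms queried at 1000 ms yields 10, while the same key queried after its expiration time yields -2.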

How Redis stores key-value data:

Code:

/* Redis database representation. There are multiple databases identified
 * by integers from 0 (the default database) up to the max configured
 * database. The database number is the 'id' field in the structure. */
typedef struct redisDb {
    dict *dict;                 /* The keyspace for this DB: holds every key-value pair */
    dict *expires;              /* Timeout of keys with a timeout set */
    dict *blocking_keys;        /* Keys with clients waiting for data (BLPOP)*/
    dict *ready_keys;           /* Blocked keys that received a PUSH */
    dict *watched_keys;         /* WATCHED keys for MULTI/EXEC CAS */
    int id;                     /* Database ID: identifies which database this is */
    long long avg_ttl;          /* Average TTL, just for stats */
    unsigned long expires_cursor; /* Cursor of the active expire cycle. */
    list *defrag_later;         /* List of key names to attempt to defrag one by one, gradually. */
} redisDb;

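The fields of the redisDb struct that matter here are:

  1. dict *dict; -> the database keyspace, which holds every key-value pair
  2. dict *expires; -> the expiration times of keys
  3. int id; -> the database number

Figure 1 (image not reproduced)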

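Is expired data really deleted?

Expired data is data that had an expiration time set and has now reached that time, so it is no longer valid.

When Redis has to process a single command, the CPU handles it easily and it takes very little time. But when many clients send a large number of create, read, update, and delete commands at once, CPU pressure rises sharply, performance drops, and every operation has to queue for CPU time. Can this be optimized? Reads, inserts, and updates still have to be processed as they arrive, but removing expired data is not urgent. If memory is not tight and usage has not reached its threshold, expired data can stay in memory and be deleted later when the server has time to spare. In other words, data that has expired is in fact still held in memory until the moment it gets deleted. How that deletion happens is decided by the deletion strategies Redis provides.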


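Deletion strategies provided by Redis

Three deletion strategies are discussed here: 1. timed deletion | 2. lazy deletion | 3. periodic deletion. Of these, the Redis source shown in this article implements lazy deletion (expireIfNeeded) and periodic deletion (activeExpireCycle); timed deletion is included for comparison.

The goal of a deletion strategy is to strike a balance between memory usage and CPU usage, so that pressure on either side does not degrade overall performance or, worse, crash the server or exhaust memory.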


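Timed deletion

When an expiration time is set on a key, a timer event is created. When the key's expiration time arrives, the timer task deletes the key immediately: it first removes the data from the keyspace, then removes the key's entry from the expires space.

Pros: saves memory. Keys are deleted as soon as they expire, so memory that is no longer needed is released quickly.

Cons: heavy CPU pressure. However loaded the CPU is at that moment, it still has to spend time deleting keys, which hurts the Redis server's response time and throughput. It is a relatively inefficient approach.

Conclusion: CPU time is traded for memory, i.e. time for space.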


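Lazy deletion | passive deletion

When data reaches its expiration time, nothing is done right away. It is deleted the next time it is accessed (the access path calls expireIfNeeded() to check).

Pros: saves CPU. Expiration is checked only when a key is accessed; if the key has expired it is deleted. Only the key being accessed is removed, and all other keys are left as they are.

Cons: high memory usage. A key that is never accessed again keeps occupying memory indefinitely, and a large number of unused keys wastes a large amount of memory, which is unfriendly for an in-memory database.

Conclusion: memory is traded for CPU time, i.e. space for time.

The main functions called for expiry deletion, in db.c: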

  • int expireIfNeeded(redisDb *db, robj *key)
    • int keyIsExpired(redisDb *db, robj *key)
      • long long getExpire(redisDb *db, robj *key)
    • notifyKeyspaceEvent(NOTIFY_EXPIRED,"expired",key,db->id);
    • server.lazyfree_lazy_expire ? dbAsyncDelete(db,key) : dbSyncDelete(db,key);
int expireIfNeeded(redisDb *db, robj *key) {
    if (!keyIsExpired(db,key)) return 0; /* the key has not expired */

    /* If we are running in the context of a slave, instead of
     * evicting the expired key from the database, we return ASAP:
     * the slave key expiration is controlled by the master that will
     * send us synthesized DEL operations for expired keys.
     *
     * Still we try to return the right information to the caller,
     * that is, 0 if we think the key should be still valid, 1 if
     * we think the key is expired at this time. */
    if (server.masterhost != NULL) return 1;

    /* Delete the key */
    server.stat_expiredkeys++;
    propagateExpire(db,key,server.lazyfree_lazy_expire);
    notifyKeyspaceEvent(NOTIFY_EXPIRED,
        "expired",key,db->id);
    /* Perform the deletion. */
    int retval = server.lazyfree_lazy_expire ? dbAsyncDelete(db,key) :
                                               dbSyncDelete(db,key);
    if (retval) signalModifiedKey(NULL,db,key);
    return retval;
}
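The idea behind lazy deletion can be shown on a toy store. The sketch below is an illustration only: DemoStore, demo_set, demo_get, and demo_count are hypothetical names, and a fixed-size array stands in for the dict and expires spaces. The point is that an expired key keeps its slot until something tries to read it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_CAPACITY 8
#define DEMO_NO_EXPIRE (-1)

/* One slot of a toy key-value store (hypothetical, not a Redis type). */
typedef struct {
    int used;                /* 1 if the slot holds a key */
    char key[16];
    int value;
    long long expire_at_ms;  /* absolute expiry time, or DEMO_NO_EXPIRE */
} DemoEntry;

typedef struct {
    DemoEntry slots[DEMO_CAPACITY];
} DemoStore;

/* Store or overwrite a key; returns 0 on success, -1 if the table is full. */
int demo_set(DemoStore *s, const char *key, int value, long long expire_at_ms) {
    assert(strlen(key) < sizeof(s->slots[0].key));
    DemoEntry *target = NULL;
    for (size_t i = 0; i < DEMO_CAPACITY; i++) {
        DemoEntry *e = &s->slots[i];
        if (e->used && strcmp(e->key, key) == 0) { target = e; break; }
        if (!e->used && target == NULL) target = e;  /* first free slot */
    }
    if (target == NULL) return -1;
    target->used = 1;
    strcpy(target->key, key);
    target->value = value;
    target->expire_at_ms = expire_at_ms;
    return 0;
}

/* Lazy deletion: expiry is checked only when the key is accessed.
 * Returns 1 and writes *out if the key is live. Returns 0 if the key is
 * missing or expired; an expired key is removed on the spot. */
int demo_get(DemoStore *s, const char *key, long long now_ms, int *out) {
    for (size_t i = 0; i < DEMO_CAPACITY; i++) {
        DemoEntry *e = &s->slots[i];
        if (!e->used || strcmp(e->key, key) != 0) continue;
        if (e->expire_at_ms != DEMO_NO_EXPIRE && now_ms > e->expire_at_ms) {
            e->used = 0;  /* expired: delete now and report a miss */
            return 0;
        }
        *out = e->value;
        return 1;
    }
    return 0;
}

/* Occupied slots, including expired keys nobody has accessed yet. */
size_t demo_count(const DemoStore *s) {
    size_t n = 0;
    for (size_t i = 0; i < DEMO_CAPACITY; i++) n += s->slots[i].used ? 1 : 0;
    return n;
}
```

If a key with expiry 1000 is stored and never read again, demo_count keeps counting it no matter how much time passes; the slot is released only when demo_get is called for that key after time 1000. This is the memory cost described in the cons above.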

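Periodic deletion | active deletion

The two approaches above, trading time for space and trading space for time, are the two extremes. To avoid their drawbacks, Redis introduces periodic deletion as a compromise between them.

Redis periodically polls the keys that have an expiration set, samples them at random, and uses the proportion of expired keys in each sample to control how often deletion runs.

  1. When the Redis server initializes, it reads the value of server.hz, which defaults to 10.
    • The server is polled on a timer: serverCron() runs server.hz times per second.
    • databasesCron() handles background work across the Redis databases (16 by default), including the processing of expired keys.
    • activeExpireCycle() checks the expires space of each database; each run is limited to 250ms/server.hz.
  2. A batch of keys is picked at random from the expires space (Redis has 16 databases by default; the scan goes from database 0 through database 15).
    • The keys in the batch that have expired are deleted.
    • If more than 25% of the batch had expired, the sampling step is repeated on the same database, until the proportion drops to 25% or below.
    • If 25% or less of the batch had expired, the next database's expires space is checked (current_db++).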
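The sampling loop in step 2 can be sketched for a single database as follows. This is a simplified model under stated assumptions, not the Redis code: the batch size of 20 and the names DemoExpires and demo_expire_cycle are made up, the 25% threshold comes from the description above, and a batch counter stands in for the time limit Redis applies to each run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_SAMPLE_SIZE 20    /* keys examined per batch (assumed value) */
#define DEMO_STALE_PERCENT 25  /* repeat while more than this share expired */

/* Toy expires space: a compact array of absolute expiry times in ms. */
typedef struct {
    long long *expire_at_ms;
    size_t count;
} DemoExpires;

/* One periodic-deletion run over a single database.
 * Returns the number of keys removed. max_batches bounds the work and
 * plays the role of the per-run time limit. */
size_t demo_expire_cycle(DemoExpires *db, long long now_ms, int max_batches) {
    assert(max_batches > 0);
    size_t removed = 0;
    for (int batch = 0; batch < max_batches && db->count > 0; batch++) {
        size_t sampled = 0, expired = 0;
        while (sampled < DEMO_SAMPLE_SIZE && db->count > 0) {
            size_t i = (size_t)rand() % db->count;  /* random pick */
            sampled++;
            if (db->expire_at_ms[i] < now_ms) {
                /* swap-remove keeps the array compact */
                db->expire_at_ms[i] = db->expire_at_ms[db->count - 1];
                db->count--;
                expired++;
            }
        }
        removed += expired;
        /* Few expired keys in the sample: assume the rest is mostly fresh
         * and stop until the next scheduled run. */
        if (expired * 100 <= sampled * DEMO_STALE_PERCENT) break;
    }
    return removed;
}
```

When every key has expired, each batch finds 100% stale keys and the loop keeps going until the table is empty or max_batches is reached. When no key has expired, the first batch finds nothing and the run ends after a single sample, which is how the expired ratio throttles the deletion frequency.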

Use the INFO command to check related settings, such as the hz value behind server.hz.

Code location:
server.c

/* This is our timer interrupt, called server.hz times per second.
 * Here is where we do a number of things that need to be done
 * asynchronously. For instance:
 *
 * - Active expired keys collection (it is also performed in a lazy way
 *   on lookup).
 * ... */
int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData){
    /* Handle background operations on Redis databases. */
    databasesCron();
    /* ... (rest omitted) */
}

server.c

/* This function handles 'background' operations we are required to do
 * incrementally in Redis databases, such as active key expiring, resizing,
 * rehashing. */
void databasesCron(void) {
    /* Expire keys by random sampling. Not required for slaves
     * as master will synthesize DELs for us. */
    if (server.active_expire_enabled) {
        if (iAmMaster()) {
            activeExpireCycle(ACTIVE_EXPIRE_CYCLE_SLOW);
        } else {
            expireSlaveKeys();
        }
    }
    /* ... (rest omitted) */
}

expire.c

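void activeExpireCycle(int type){
    /* The function is too long to reproduce here. Its main flow is:
     *
     * 1. Pick a batch of keys at random from the expires space
     *    (starting at database 0 and going through database 15).
     * 2. Delete the keys in the batch that have expired.
     * 3. If more than 25% of the batch had expired, repeat from step 1
     *    on the same database until the proportion is 25% or below.
     * 4. Otherwise move on to the next database's expires space
     *    (current_db++). */
}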

Besides the frequency of active expiration, Redis also limits how long each run may take, so that a run cannot block client requests for too long. The limit is computed as follows:

#define ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC 25 /* Max % of CPU to use. */
/* Adjust the running parameters according to the configured expire
 * effort. The default effort is 1, and the maximum configurable effort
 * is 10. */
unsigned long effort = server.active_expire_effort-1; /* Rescale from 0 to 9. */
unsigned long config_cycle_slow_time_perc = ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC + 2*effort;
/* We can use at max 'config_cycle_slow_time_perc' percentage of CPU
 * time per iteration. Since this function gets called with a frequency of
 * server.hz times per second, the following is the max amount of
 * microseconds we can spend in this function. */
timelimit = config_cycle_slow_time_perc*1000000/server.hz/100;
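To make the formula concrete, the helper below evaluates it for given settings. It is a small sketch that copies the arithmetic quoted above; the function name and the DEMO_ constant are hypothetical, and the range check on the effort value follows the comment in the excerpt (1 to 10).

```c
#include <assert.h>

#define DEMO_SLOW_TIME_PERC 25  /* same value as ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC */

/* Maximum microseconds one slow cycle may run for a given hz and
 * configured expire effort (1..10), following the quoted formula. */
long long demo_slow_cycle_timelimit_us(int hz, int effort_config) {
    assert(hz > 0);
    assert(effort_config >= 1 && effort_config <= 10);
    long long effort = effort_config - 1;              /* rescale to 0..9 */
    long long perc = DEMO_SLOW_TIME_PERC + 2 * effort; /* % of CPU time */
    return perc * 1000000 / hz / 100;
}
```

With the defaults (hz = 10, effort = 1) the result is 25000 microseconds, i.e. 25 ms per call. Ten calls per second add up to 250 ms, which is the 250ms/server.hz budget mentioned in the step list. At the maximum effort of 10 the share rises to 43%, or 43000 microseconds per call at hz = 10.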

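Conclusion: CPU usage has a ceiling and the check frequency can be configured; memory pressure stays moderate, because cold data that would otherwise sit in memory for a long time is cleaned up continuously (periodic random sampling, with extra passes where many keys have expired).


Comparison of deletion strategies

  1. Timed deletion (time for space)
    • Saves memory: nothing expired is left occupying it
    • Uses CPU regardless of the time or load, at high frequency
  2. Lazy deletion (space for time)
    • High memory usage
    • Deferred execution: does not keep the CPU busy, low CPU pressure, low frequency
  3. Periodic deletion (periodic random sampling)
    • Memory is cleaned at random on a regular schedule
    • A fixed share of CPU is spent every second maintaining memory (removing expired data)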

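Eviction policies

Redis is constantly creating, reading, updating, and deleting data. What happens if memory runs out while data is being added? Can the deletion strategies above prevent that?

In fact, the deletion strategies above apply only to data with a limited lifetime, the kind set with commands such as expire: data that has already expired but still occupies memory. Eviction deals with the other case: keys that have not expired, or a dataset in which no key has an expiration at all and everything is permanent. The deletion strategies cannot help there, so what happens when memory is full and more data is written?


Overview

Before executing each command, Redis calls freeMemoryIfNeeded(void) to check whether memory is sufficient. If memory cannot meet the minimum requirement for storing the new data, Redis has to delete some existing data to make room. The scheme used to pick the data to remove is called the eviction algorithm.

Eviction is not guaranteed to free enough usable memory. If an attempt does not free enough, it is repeated; if the requirement still cannot be met after all data has been tried, an error is returned.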
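The check-then-evict behaviour described above can be modelled in a few lines. The sketch below is a deliberately naive illustration: DemoDataset and demo_free_memory_if_needed are hypothetical names, the victim is simply the first remaining key rather than one chosen by a policy, and the used field may include overhead that eviction cannot reclaim.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_OK 0
#define DEMO_ERR (-1)

/* Toy dataset: sizes[i] is the memory held by key i, 0 once evicted. */
typedef struct {
    size_t *sizes;
    size_t count;
    size_t used;         /* bytes in use, may include non-key overhead */
    size_t maxmemory;    /* limit in bytes, 0 means unlimited */
    int allow_eviction;  /* 0 models the noeviction policy */
} DemoDataset;

/* Called before a command runs. Evicts keys until usage is within the
 * limit. Returns DEMO_OK if usage is (now) within the limit, DEMO_ERR if
 * not enough memory could be freed. */
int demo_free_memory_if_needed(DemoDataset *d) {
    assert(d != NULL);
    if (d->maxmemory == 0 || d->used <= d->maxmemory) return DEMO_OK;
    if (!d->allow_eviction) return DEMO_ERR;  /* policy forbids eviction */
    for (size_t i = 0; i < d->count && d->used > d->maxmemory; i++) {
        d->used -= d->sizes[i];  /* evict key i */
        d->sizes[i] = 0;
    }
    return d->used <= d->maxmemory ? DEMO_OK : DEMO_ERR;
}
```

With three 40-byte keys and a 100-byte limit, one eviction brings usage down to 80 and the call succeeds. With eviction disabled, or when most of the memory is overhead that no key accounts for, the call reports an error, which corresponds to the case where the command is rejected.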

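The configuration steps are roughly as follows:

  1. In redis.conf (redis.windows-service.conf on Windows), set maxmemory <bytes> to limit memory usage. For example, maxmemory 100mb limits it to 100 MB. The default value 0 means memory is not limited. It is usually set to 50% or more of physical memory.
  2. In the same file, set maxmemory-samples x, the number of keys sampled each time a key is chosen for deletion. Redis does not scan the whole database when choosing, since that would be expensive and hurt read/write performance; instead it picks candidates at random.
  3. In the same file, set maxmemory-policy, the eviction policy. The default is noeviction. When Redis exceeds its memory limit, the eviction mechanism is triggered and the selected data is deleted.

Code flow:

Redis handles every command in int processCommand(client *c), and that function calls int freeMemoryIfNeededAndSafe(void) to check memory: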

int processCommand(client *c) {
    /* ... (omitted) */
    /* Handle the maxmemory directive.
     *
     * Note that we do not want to reclaim memory if we are here re-entering
     * the event loop since there is a busy Lua script running in timeout
     * condition, to avoid mixing the propagation of scripts with the
     * propagation of DELs due to eviction. */
    if (server.maxmemory && !server.lua_timedout) {
        int out_of_memory = freeMemoryIfNeededAndSafe() == C_ERR;
        /* freeMemoryIfNeeded may flush slave output buffers. This may result
         * into a slave, that may be the active client, to be freed. */
        if (server.current_client == NULL) return C_ERR;

        /* It was impossible to free enough memory, and the command the client
         * is trying to execute is denied during OOM conditions or the client
         * is in MULTI/EXEC context? Error. */
        if (out_of_memory &&
            (c->cmd->flags & CMD_DENYOOM ||
             (c->flags & CLIENT_MULTI &&
              c->cmd->proc != execCommand &&
              c->cmd->proc != discardCommand)))
        {
            flagTransaction(c);
            addReply(c, shared.oomerr);
            return C_OK;
        }

        /* Save out_of_memory result at script start, otherwise if we check OOM
         * until first write within script, memory used by lua stack and
         * arguments might interfere. */
        if (c->cmd->proc == evalCommand || c->cmd->proc == evalShaCommand) {
            server.lua_oom = out_of_memory;
        }
    }
    /* ... (omitted) */
}

int freeMemoryIfNeededAndSafe(void) is a wrapper: when it is safe to do so, it calls freeMemoryIfNeeded(), the function that actually checks whether current memory usage exceeds the configured maximum.

/* This is a wrapper for freeMemoryIfNeeded() that only really calls the
 * function if right now there are the conditions to do so safely:
 *
 * - There must be no script in timeout condition.
 * - Nor we are loading data right now.
 *
 */
int freeMemoryIfNeededAndSafe(void) {
    if (server.lua_timedout || server.loading) return C_OK;
    return freeMemoryIfNeeded();
}

int freeMemoryIfNeeded(void) computes memory usage and then selects the keys to evict:

/* This function is periodically called to see if there is memory to free
 * according to the current "maxmemory" settings. In case we are over the
 * memory limit, the function will try to free some memory to return back
 * under the limit.
 *
 * The function returns C_OK if we are under the memory limit or if we
 * were over the limit, but the attempt to free memory was successful.
 * Otherwise if we are over the memory limit, but not enough memory
 * was freed to return back under the limit, the function returns C_ERR. */
int freeMemoryIfNeeded(void) {
    int keys_freed = 0;
    /* By default replicas should ignore maxmemory
     * and just be masters exact copies. */
    if (server.masterhost && server.repl_slave_ignore_maxmemory) return C_OK;

    size_t mem_reported, mem_tofree, mem_freed;
    mstime_t latency, eviction_latency, lazyfree_latency;
    long long delta;
    int slaves = listLength(server.slaves);
    int result = C_ERR;

    /* When clients are paused the dataset should be static not just from the
     * POV of clients not being able to write, but also from the POV of
     * expires and evictions of keys not being performed. */
    if (clientsArePaused()) return C_OK;
    if (getMaxmemoryState(&mem_reported,NULL,&mem_tofree,NULL) == C_OK)
        return C_OK;

    mem_freed = 0;

    latencyStartMonitor(latency);
    if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION)
        goto cant_free; /* We need to free memory, but policy forbids. */

    while (mem_freed < mem_tofree) {
        int j, k, i;
        static unsigned int next_db = 0;
        sds bestkey = NULL;
        int bestdbid;
        redisDb *db;
        dict *dict;
        dictEntry *de;

        if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) || server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)
        {
            struct evictionPoolEntry *pool = EvictionPoolLRU;

            while(bestkey == NULL) {
                unsigned long total_keys = 0, keys;

                /* We don't want to make local-db choices when expiring keys,
                 * so to start populate the eviction pool sampling keys from
                 * every DB. */
                for (i = 0; i < server.dbnum; i++) {
                    db = server.db+i;
                    dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ? db->dict : db->expires;
                    if ((keys = dictSize(dict)) != 0) {
                        evictionPoolPopulate(i, dict, db->dict, pool);
                        total_keys += keys;
                    }
                }
                if (!total_keys) break; /* No keys to evict. */

                /* Go backward from best to worst element to evict. */
                for (k = EVPOOL_SIZE-1; k >= 0; k--) {
                    if (pool[k].key == NULL) continue;
                    bestdbid = pool[k].dbid;

                    if (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) {
                        de = dictFind(server.db[pool[k].dbid].dict,
                            pool[k].key);
                    } else {
                        de = dictFind(server.db[pool[k].dbid].expires,
                            pool[k].key);
                    }

                    /* Remove the entry from the pool. */
                    if (pool[k].key != pool[k].cached)
                        sdsfree(pool[k].key);
                    pool[k].key = NULL;
                    pool[k].idle = 0;

                    /* If the key exists, is our pick. Otherwise it is
                     * a ghost and we need to try the next element. */
                    if (de) {
                        bestkey = dictGetKey(de);
                        break;
                    } else {
                        /* Ghost... Iterate again. */
                    }
                }
            }
        }

        /* volatile-random and allkeys-random policy */
        else if (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM ||
                 server.maxmemory_policy == MAXMEMORY_VOLATILE_RANDOM)
        {
            /* When evicting a random key, we try to evict a key for
             * each DB, so we use the static 'next_db' variable to
             * incrementally visit all DBs. */
            for (i = 0; i < server.dbnum; i++) {
                j = (++next_db) % server.dbnum;
                db = server.db+j;
                dict = (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM) ?
                        db->dict : db->expires;
                if (dictSize(dict) != 0) {
                    de = dictGetRandomKey(dict);
                    bestkey = dictGetKey(de);
                    bestdbid = j;
                    break;
                }
            }
        }

        /* Finally remove the selected key. */
        if (bestkey) {
            db = server.db+bestdbid;
            robj *keyobj = createStringObject(bestkey,sdslen(bestkey));
            propagateExpire(db,keyobj,server.lazyfree_lazy_eviction);
            /* We compute the amount of memory freed by db*Delete() alone.
             * It is possible that actually the memory needed to propagate
             * the DEL in AOF and replication link is greater than the one
             * we are freeing removing the key, but we can't account for
             * that otherwise we would never exit the loop.
             *
             * AOF and Output buffer memory will be freed eventually so
             * we only care about memory used by the key space. */
            delta = (long long) zmalloc_used_memory();
            latencyStartMonitor(eviction_latency);
            if (server.lazyfree_lazy_eviction)
                dbAsyncDelete(db,keyobj);
            else
                dbSyncDelete(db,keyobj);
            signalModifiedKey(NULL,db,keyobj);
            latencyEndMonitor(eviction_latency);
            latencyAddSampleIfNeeded("eviction-del",eviction_latency);
            delta -= (long long) zmalloc_used_memory();
            mem_freed += delta;
            server.stat_evictedkeys++;
            notifyKeyspaceEvent(NOTIFY_EVICTED, "evicted",
                keyobj, db->id);
            decrRefCount(keyobj);
            keys_freed++;

            /* When the memory to free starts to be big enough, we may
             * start spending so much time here that is impossible to
             * deliver data to the slaves fast enough, so we force the
             * transmission here inside the loop. */
            if (slaves) flushSlavesOutputBuffers();

            /* Normally our stop condition is the ability to release
             * a fixed, pre-computed amount of memory. However when we
             * are deleting objects in another thread, it's better to
             * check, from time to time, if we already reached our target
             * memory, since the "mem_freed" amount is computed only
             * across the dbAsyncDelete() call, while the thread can
             * release the memory all the time. */
            if (server.lazyfree_lazy_eviction && !(keys_freed % 16)) {
                if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                    /* Let's satisfy our stop condition. */
                    mem_freed = mem_tofree;
                }
            }
        } else {
            goto cant_free; /* nothing to free... */
        }
    }
    result = C_OK;

cant_free:
    /* We are here if we are not able to reclaim memory. There is only one
     * last thing we can try: check if the lazyfree thread has jobs in queue
     * and wait... */
    if (result != C_OK) {
        latencyStartMonitor(lazyfree_latency);
        while(bioPendingJobsOfType(BIO_LAZY_FREE)) {
            if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                result = C_OK;
                break;
            }
            usleep(1000);
        }
        latencyEndMonitor(lazyfree_latency);
        latencyAddSampleIfNeeded("eviction-lazyfree",lazyfree_latency);
    }
    latencyEndMonitor(latency);
    latencyAddSampleIfNeeded("eviction-cycle",latency);
    return result;
}

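Eviction policies and their configuration

  • random: evict at random, from either the expires space or the dict space.
  • volatile: only keys in the expires space (keys with an expiration set) are candidates.
  • allkeys: candidates are looked up in the dict space, i.e. among all keys.
  • Approximated LRU (Least Recently Used)
  • Approximated LFU (Least Frequently Used)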

1. Evict only among keys that have an expiration set (the expires space of database i)
  • volatile-lru: among keys with an expiration set, evict the least recently used (Evict using approximated LRU, only keys with an expire set.)
  • volatile-lfu: among keys with an expiration set, evict the least frequently used (Evict using approximated LFU, only keys with an expire set.)
  • volatile-random: among keys with an expiration set, evict one at random (Remove a random key having an expire set.)
  • volatile-ttl: among keys with an expiration set, evict the one that will expire soonest, i.e. with the shortest TTL (Remove the key with the nearest expire time (minor TTL))

2. Evict among all keys (the dict space of database i)
  • allkeys-lru: among all keys, evict the least recently used (Evict any key using approximated LRU.)
  • allkeys-lfu: among all keys, evict the least frequently used (Evict any key using approximated LFU.)
  • allkeys-random: among all keys, evict one at random (Remove a random key, any key.)
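The word "approximated" in the LRU policies refers to sampling: instead of tracking a global ordering, a few keys are sampled and the one idle for the longest time is evicted. The sketch below illustrates that idea only. DemoKeyMeta and demo_pick_lru_victim are hypothetical names, the sample is passed in by the caller so the behaviour is easy to follow, and the candidate pool that Redis keeps between iterations (EvictionPoolLRU in the code above) is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Per-key metadata for the sketch (not a Redis type). */
typedef struct {
    long long last_access_ms;  /* last time the key was touched */
} DemoKeyMeta;

/* Among the sampled key indexes, return the one that has been idle for the
 * longest time. In a real server the sample would be drawn at random,
 * with maxmemory-samples controlling its size. */
size_t demo_pick_lru_victim(const DemoKeyMeta *keys, const size_t *sample,
                            size_t sample_len, long long now_ms) {
    assert(sample_len > 0);
    size_t victim = sample[0];
    long long best_idle = now_ms - keys[victim].last_access_ms;
    for (size_t i = 1; i < sample_len; i++) {
        long long idle = now_ms - keys[sample[i]].last_access_ms;
        if (idle > best_idle) {
            best_idle = idle;
            victim = sample[i];
        }
    }
    return victim;
}
```

Suppose four keys were last accessed at 100, 900, 50, and 500 ms and the current time is 1000 ms. If the sample contains keys 0, 1, and 3, key 0 is evicted even though key 2 has been idle longer, because key 2 was not sampled. A larger sample makes the choice closer to true LRU at the cost of more work per eviction.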

Different policies operate on different sets of keys. Whether keys are removed from the expires space or from the dict space can be seen in these two fragments:

if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) ||
    server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)
{
    /* Choose the dict space or the expires space according to the policy. */
    dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ? db->dict : db->expires;
}
/* volatile-random and allkeys-random policy */
else if (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM || server.maxmemory_policy == MAXMEMORY_VOLATILE_RANDOM)
{
    /* Choose the dict space or the expires space according to the policy. */
    dict = (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM) ? db->dict : db->expires;
}
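The fragments above test bit flags on the policy value. The following sketch shows how such an encoding can work. All DEMO_ names and numeric values are invented for illustration and are only assumed to resemble the layout behind MAXMEMORY_FLAG_ALLKEYS and the related constants; the actual definitions live in the Redis headers.

```c
#include <assert.h>

/* Illustrative flag bits; names and values are made up for this sketch. */
enum {
    DEMO_FLAG_LRU     = 1 << 0,
    DEMO_FLAG_LFU     = 1 << 1,
    DEMO_FLAG_ALLKEYS = 1 << 2
};

/* Each policy is a distinct id in the high bits plus descriptive flags. */
enum {
    DEMO_VOLATILE_LRU    = (0 << 8) | DEMO_FLAG_LRU,
    DEMO_VOLATILE_LFU    = (1 << 8) | DEMO_FLAG_LFU,
    DEMO_VOLATILE_TTL    = (2 << 8),
    DEMO_VOLATILE_RANDOM = (3 << 8),
    DEMO_ALLKEYS_LRU     = (4 << 8) | DEMO_FLAG_LRU | DEMO_FLAG_ALLKEYS,
    DEMO_ALLKEYS_LFU     = (5 << 8) | DEMO_FLAG_LFU | DEMO_FLAG_ALLKEYS,
    DEMO_ALLKEYS_RANDOM  = (6 << 8) | DEMO_FLAG_ALLKEYS,
    DEMO_NO_EVICTION     = (7 << 8)
};

typedef enum {
    DEMO_SPACE_NONE,     /* nothing may be evicted */
    DEMO_SPACE_EXPIRES,  /* only keys with an expiration set */
    DEMO_SPACE_DICT      /* every key */
} DemoSpace;

/* Which key space a policy draws its eviction candidates from. */
DemoSpace demo_candidate_space(int policy) {
    assert(policy >= 0);
    if (policy == DEMO_NO_EVICTION) return DEMO_SPACE_NONE;
    return (policy & DEMO_FLAG_ALLKEYS) ? DEMO_SPACE_DICT : DEMO_SPACE_EXPIRES;
}
```

With this layout a single bit test separates the allkeys-* policies from the volatile-* ones, so the selection code does not need one branch per policy.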

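3. No eviction (NO_EVICTION)
  • noeviction: nothing is evicted; write operations simply return an error (Don't evict anything, just return an error on write operations.). In the version checked here (redis_version:3.2.100), noeviction is the default policy. It easily leads to OOM errors.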
if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION)
        goto cant_free; /* We need to free memory, but policy forbids. */



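Series part 1: Redis cache database basics and data types tutorial

Series part 2: Redis general-purpose key commands

Series part 3: Redis RDB and AOF persistence

Series part 4: Redis transaction control and distributed lock implementation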