Redis Source Code Analysis: Memory Layout

Date: 2019-11-11

1. Introduction

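As is well known, Redis is an open-source, compact, and efficient key-value store. Compared with memcached, Redis supports a richer set of data structures, including:

String (string)
Hash table (map)
List (list)
Set (set)
Sorted set (zset)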

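Mainstream key-value stores maintain a hash table internally, because hash table operations run in O(1) time on average. If the data grows to the point where collisions become severe and operations slow down, the table can be rehashed to restore constant-time behavior.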

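So how does a key-value store built on a hash table offer such a rich set of data structures, and how are those structures laid out in memory? This article uses a series of diagrams to walk through Redis's memory layout and data storage.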

2. redisServer

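Inside Redis there is a global variable named server, of type struct redisServer. It holds all of the server-side state, including the PID of the current process, the server port, statistics, and so on. It also holds the database information: the number of databases and an array of redisDb.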

struct redisServer {
    /* ... */
    redisDb *db;
    int dbnum;                      /* Total number of configured DBs */
    /* ... */
};

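Clearly, dbnum is the length of the redisDb array, and each database corresponds to one redisDb. In a Redis client, SELECT N chooses which database to use. The databases are independent of one another; for example, a key named "redis" can exist in several databases at the same time.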

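As the analysis above shows, server is a global variable that contains several redisDb instances. Each redisDb is a keyspace, and the keyspaces are independent and do not interfere with one another.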
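The independence of keyspaces can be illustrated with a small sketch. Everything prefixed with toy below is hypothetical and is not Redis code: each toy database is a tiny fixed-size array rather than a dict, and TOY_DBNUM is assumed to be 16 to mirror the stock "databases 16" setting. The point is only that SELECT N amounts to repointing the client at another element of the server's database array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_DBNUM 16      /* assumption: mirrors the stock "databases 16" setting */
#define TOY_MAX_KEYS 8    /* hypothetical capacity of each toy keyspace */

/* Toy stand-in for redisDb: a flat array instead of a dict. */
typedef struct toyDb {
    const char *keys[TOY_MAX_KEYS];
    const char *vals[TOY_MAX_KEYS];
    int used;
    int id;
} toyDb;

/* Toy stand-in for redisServer: an array of databases plus its length. */
typedef struct toyServer {
    toyDb db[TOY_DBNUM];
    int dbnum;
} toyServer;

/* Each client remembers which database it has selected. */
typedef struct toyClient {
    toyDb *db;
} toyClient;

static toyServer server;

void toyInit(void) {
    memset(&server, 0, sizeof(server));
    server.dbnum = TOY_DBNUM;
    for (int i = 0; i < server.dbnum; i++) server.db[i].id = i;
}

/* SELECT N: returns 0 on success, -1 if N is out of range. */
int toySelect(toyClient *c, int n) {
    if (n < 0 || n >= server.dbnum) return -1;
    c->db = &server.db[n];
    return 0;
}

/* Store or overwrite key in the client's current database only. */
int toySet(toyClient *c, const char *key, const char *val) {
    assert(c->db != NULL);
    toyDb *db = c->db;
    for (int i = 0; i < db->used; i++) {
        if (strcmp(db->keys[i], key) == 0) { db->vals[i] = val; return 0; }
    }
    if (db->used == TOY_MAX_KEYS) return -1;   /* toy keyspace is full */
    db->keys[db->used] = key;
    db->vals[db->used] = val;
    db->used++;
    return 0;
}

/* Look up key in the client's current database; NULL if absent. */
const char *toyGet(const toyClient *c, const char *key) {
    assert(c->db != NULL);
    const toyDb *db = c->db;
    for (int i = 0; i < db->used; i++) {
        if (strcmp(db->keys[i], key) == 0) return db->vals[i];
    }
    return NULL;
}
```

After toySelect(&c, 0) and toySet(&c, "redis", "zero"), switching with toySelect(&c, 1) makes toyGet(&c, "redis") return NULL, because database 1 has its own keyspace.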

Here is the definition of redisDb:

/* Redis database representation. There are multiple databases identified
 * by integers from 0 (the default database) up to the max configured
 * database. The database number is the 'id' field in the structure. */
typedef struct redisDb {
    dict *dict;                 /* The keyspace for this DB */
    dict *expires;              /* Timeout of keys with a timeout set */
    dict *blocking_keys;        /* Keys with clients waiting for data (BLPOP) */
    dict *ready_keys;           /* Blocked keys that received a PUSH */
    dict *watched_keys;         /* WATCHED keys for MULTI/EXEC CAS */
    struct evictionPoolEntry *eviction_pool;    /* Eviction pool of keys */
    int id;                     /* Database ID */
    long long avg_ttl;          /* Average TTL, just for stats */
} redisDb;

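Each Redis database is an independent keyspace, so it is natural to assume that a database is simply a hash table. The definition of redisDb shows otherwise: it is not a single hash table but a structure that contains several of them. The reason is that Redis has to offer more than SET and GET, for example key expiration. Today we focus only on the most important field: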

typedef struct redisDb {
    dict *dict;                 /* The keyspace for this DB */
    /* ... */
} redisDb;

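The relationship between redisDb and redisServer is as follows:

[Figure: relationship between redisDb and redisServer]

Next, the definition of dict: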

typedef struct dict {
    /* ... */
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    /* ... */
} dict;

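A dict contains two hash tables in order to support incremental rehashing. Most of the time only the first table, ht[0], is in use. When it holds too many entries, a rehash is started and entries are migrated step by step into the second table; rehashidx is -1 whenever no rehash is in progress.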
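The idea of incremental rehashing can be sketched as follows. This is a simplified toy, not the code in Redis's dict.c: all toy names are hypothetical, keys are unsigned integers used directly as their own hash, a rehash is assumed to start when the load factor reaches 1, duplicate keys are not checked, and exactly one bucket is migrated on each insert.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toyEntry {
    unsigned long key;
    int val;
    struct toyEntry *next;    /* chaining for collisions */
} toyEntry;

typedef struct toyTable {
    toyEntry **table;
    unsigned long size;       /* a power of two, or 0 if unused */
    unsigned long sizemask;   /* size - 1 */
    unsigned long used;
} toyTable;

typedef struct toyDict {
    toyTable ht[2];
    long rehashidx;           /* -1 when no rehash is in progress */
} toyDict;

static void tableInit(toyTable *t, unsigned long size) {
    t->table = size ? calloc(size, sizeof(toyEntry *)) : NULL;
    assert(size == 0 || t->table != NULL);
    t->size = size;
    t->sizemask = size ? size - 1 : 0;
    t->used = 0;
}

void toyDictInit(toyDict *d, unsigned long size) {
    assert(size != 0 && (size & (size - 1)) == 0);
    tableInit(&d->ht[0], size);
    tableInit(&d->ht[1], 0);
    d->rehashidx = -1;
}

/* Migrate one non-empty bucket from ht[0] to ht[1]; finish when ht[0] is empty. */
void toyRehashStep(toyDict *d) {
    if (d->rehashidx == -1) return;
    toyTable *src = &d->ht[0], *dst = &d->ht[1];
    while (src->used > 0 && src->table[d->rehashidx] == NULL) d->rehashidx++;
    if (src->used > 0) {
        toyEntry *e = src->table[d->rehashidx];
        while (e) {
            toyEntry *next = e->next;
            unsigned long idx = e->key & dst->sizemask;
            e->next = dst->table[idx];
            dst->table[idx] = e;
            src->used--;
            dst->used++;
            e = next;
        }
        src->table[d->rehashidx++] = NULL;
    }
    if (src->used == 0) {     /* done: the new table becomes ht[0] */
        free(src->table);
        d->ht[0] = d->ht[1];
        tableInit(&d->ht[1], 0);
        d->rehashidx = -1;
    }
}

void toyDictAdd(toyDict *d, unsigned long key, int val) {
    if (d->rehashidx == -1 && d->ht[0].used >= d->ht[0].size) {
        tableInit(&d->ht[1], d->ht[0].size * 2);   /* start a rehash */
        d->rehashidx = 0;
    }
    toyRehashStep(d);
    /* While rehashing, new entries go only into ht[1]. */
    toyTable *t = (d->rehashidx != -1) ? &d->ht[1] : &d->ht[0];
    toyEntry *e = malloc(sizeof(*e));
    assert(e != NULL);
    e->key = key;
    e->val = val;
    unsigned long idx = key & t->sizemask;
    e->next = t->table[idx];
    t->table[idx] = e;
    t->used++;
}

/* Look in ht[0], and also in ht[1] while a rehash is in progress. */
toyEntry *toyDictFind(toyDict *d, unsigned long key) {
    for (int i = 0; i < 2; i++) {
        toyTable *t = &d->ht[i];
        if (t->size == 0) continue;
        for (toyEntry *e = t->table[key & t->sizemask]; e; e = e->next) {
            if (e->key == key) return e;
        }
        if (d->rehashidx == -1) break;
    }
    return NULL;
}
```

Spreading the migration over many operations means no single insert pays the cost of moving the whole table, which is why lookups have to consult both tables while rehashidx is not -1.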

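The relationship among dict, redisDb, and redisServer is as follows:

[Figure: relationship among dict, redisDb, and redisServer]

Now the definition of dictht. Here we finally meet Redis's hash table, which looks much like most other hash table implementations: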

/* This is our hash table structure. Every dictionary has two of this as we
* implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;
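The size and sizemask fields work together when a bucket is chosen. As far as I understand dict.c, size is kept at a power of two and sizemask equals size - 1, so a hash value can be reduced to a bucket index with a bitwise AND instead of a modulo. The helper below is a hypothetical illustration of that arithmetic, not a function from Redis.

```c
#include <assert.h>

/* Hypothetical helper: bucket index for a hash value in a table whose
 * size is a power of two (assumed, as described in the lead-in). */
unsigned long toyBucketIndex(unsigned long hash, unsigned long size) {
    assert(size != 0 && (size & (size - 1)) == 0);
    unsigned long sizemask = size - 1;
    return hash & sizemask;   /* same result as hash % size for powers of two */
}
```

For example, with size 8 the mask is 7 (binary 111), so a hash of 13 (binary 1101) lands in bucket 5.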

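The relationship among dictht, dict, redisDb, and redisServer is as follows:

[Figure: relationship among dictht, dict, redisDb, and redisServer]

Redis also wraps hash table nodes in a small structure: every node in the table is a dictEntry. A Redis hash table looks like this:

[Figure: a Redis hash table and its dictEntry nodes]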

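Summary: Redis keeps a global variable, redisServer server, which contains several databases. Each database is represented by a redisDb, which contains several dictionaries; the one that stores the data is dict *dict. A dict contains two hash tables. Normally only ht[0] is used, and both tables are used while a rehash is in progress. Every entry in a hash table is a dictEntry structure.

From a high-level view, Redis's data storage looks like this:

[Figure: high-level view of Redis data storage]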

3. Storing Different Data Types

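The previous section covered Redis's hash table and the relationships among its core data structures in detail, which gives a first impression of how Redis stores data. It has not yet answered the question posed at the start of the article: how does Redis store different data structures?

To understand that, first look at the definition of redisObject: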

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:REDIS_LRU_BITS; /* lru time (relative to server.lruclock) */
    int refcount;
    void *ptr;
} robj;
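The bit-field declarations mean that type and encoding each occupy only 4 bits, so each can take at most 16 distinct values. The sketch below mirrors the layout with a hypothetical toyObject; TOY_LRU_BITS is assumed to be 24, which is the value of REDIS_LRU_BITS in the versions I am aware of, and toyMakeObject is an illustrative helper rather than a Redis function.

```c
#include <assert.h>

#define TOY_LRU_BITS 24   /* assumption: stands in for REDIS_LRU_BITS */

/* Same field order as redisObject above, under a hypothetical name. */
typedef struct toyObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:TOY_LRU_BITS;
    int refcount;
    void *ptr;
} toyObject;

/* Hypothetical constructor: a new object starts with one reference. */
toyObject toyMakeObject(unsigned type, unsigned encoding, void *ptr) {
    assert(type < 16 && encoding < 16);   /* each field is only 4 bits wide */
    toyObject o;
    o.type = type;
    o.encoding = encoding;
    o.lru = 0;
    o.refcount = 1;
    o.ptr = ptr;
    return o;
}
```

With 4 + 4 + 24 bits, the three bit-fields add up to 32 bits, so the header in front of refcount and ptr stays compact; the exact struct size still depends on the compiler and platform.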

Here, type is the logical data type, that is, the string, list, hash, and so on that Redis exposes to users. The possible values of type are:

/* Object types */
#define REDIS_STRING 0
#define REDIS_LIST 1
#define REDIS_SET 2
#define REDIS_ZSET 3
#define REDIS_HASH 4

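Although type is important, this article is more concerned with the encoding field, which identifies the concrete implementation behind a logical data type. The possible values of encoding are: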

#define REDIS_ENCODING_RAW 0     /* Raw representation */
#define REDIS_ENCODING_INT 1     /* Encoded as integer */
#define REDIS_ENCODING_HT 2      /* Encoded as hash table */
#define REDIS_ENCODING_ZIPMAP 3  /* Encoded as zipmap */
#define REDIS_ENCODING_LINKEDLIST 4 /* Encoded as regular linked list */
#define REDIS_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
#define REDIS_ENCODING_INTSET 6  /* Encoded as intset */
#define REDIS_ENCODING_SKIPLIST 7  /* Encoded as skiplist */
#define REDIS_ENCODING_EMBSTR 8  /* Embedded sds string encoding */
#define REDIS_ENCODING_QUICKLIST 9 /* Encoded as linked list of ziplists */

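For example, the list type can be implemented internally as a ziplist (which saves memory) or as a linkedlist. This choice applies to Redis versions from before quicklist became the list encoding.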

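A ziplist is used when both of the following conditions hold; otherwise a linkedlist is used:

Every string element stored in the list object is shorter than 64 bytes
The list object holds fewer than 512 elements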
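The two conditions above can be written down as a small decision function. This is only a sketch of the rule as stated in the text, not the code Redis uses: toyChooseListEncoding and the TOY_ constants are hypothetical names, and the thresholds 64 and 512 are taken directly from the two conditions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ZIPLIST_VALUE 64      /* element length limit from the text */
#define TOY_MAX_ZIPLIST_ENTRIES 512   /* element count limit from the text */

/* Same numeric values as REDIS_ENCODING_LINKEDLIST and REDIS_ENCODING_ZIPLIST. */
enum { TOY_ENCODING_LINKEDLIST = 4, TOY_ENCODING_ZIPLIST = 5 };

/* Return the encoding the rule above would pick for a list of C strings. */
int toyChooseListEncoding(const char **elems, size_t count) {
    assert(elems != NULL || count == 0);
    if (count >= TOY_MAX_ZIPLIST_ENTRIES) return TOY_ENCODING_LINKEDLIST;
    for (size_t i = 0; i < count; i++) {
        if (strlen(elems[i]) >= TOY_MAX_ZIPLIST_VALUE) return TOY_ENCODING_LINKEDLIST;
    }
    return TOY_ENCODING_ZIPLIST;
}
```

A list of a few short strings yields TOY_ENCODING_ZIPLIST, while a single 64-byte element or a 512th element is enough to tip the result to TOY_ENCODING_LINKEDLIST.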

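To stress the point again: for a single data type, Redis provides several internal implementations, each suited to different scenarios. Users have only limited control through redis.conf; which implementation is actually used is decided entirely by Redis. You can run OBJECT ENCODING key to see the internal encoding, that is, the internal implementation, of a key.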

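Since this article is about Redis's memory layout, we care more about the concrete implementation than about the logical data type. Both the logical type (type) and the concrete implementation (encoding) are stored in redisObject, which acts as a kind of base class for all data structures. Every entry in a Redis hash table is a dictEntry, and every dictEntry points to a redisObject.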

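When reading or writing data, Redis first uses the key to find the corresponding dictEntry, then obtains the redisObject from the dictEntry, and finally casts the redisObject's ptr pointer to the appropriate type based on the value of encoding.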

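For example: for a short list, Redis is quite likely to use a quicklist. When reading the list's data, Redis first uses the key to find the dictEntry, then follows the dictEntry to the redisObject, and then casts ptr according to the redisObject's encoding. In this case ptr is cast to a quicklist, which exposes the head and tail pointers that are used to access the data.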
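The last step, casting ptr according to encoding, can be sketched like this. The structures are hypothetical simplifications and not the real Redis definitions: a real quicklist node holds a ziplist of several elements, whereas each toy node here holds a single string. Only the numeric values of TOY_LIST and TOY_ENCODING_QUICKLIST are taken from the definitions listed earlier.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LIST 1                 /* same value as REDIS_LIST */
#define TOY_ENCODING_QUICKLIST 9   /* same value as REDIS_ENCODING_QUICKLIST */

/* Simplified node: one string per node instead of a ziplist. */
typedef struct toyQuicklistNode {
    struct toyQuicklistNode *prev;
    struct toyQuicklistNode *next;
    const char *value;
} toyQuicklistNode;

typedef struct toyQuicklist {
    toyQuicklistNode *head;
    toyQuicklistNode *tail;
    unsigned long count;
} toyQuicklist;

/* Reduced stand-in for redisObject: just the fields needed for dispatch. */
typedef struct toyObject {
    unsigned type:4;
    unsigned encoding:4;
    void *ptr;
} toyObject;

/* Return the first (or last) element of a list object, or NULL if the
 * object is not a quicklist-encoded list or the list is empty. */
const char *toyListEdge(const toyObject *o, int from_tail) {
    assert(o != NULL);
    if (o->type != TOY_LIST || o->encoding != TOY_ENCODING_QUICKLIST) return NULL;
    /* The encoding tells us what ptr really points to. */
    const toyQuicklist *ql = (const toyQuicklist *)o->ptr;
    const toyQuicklistNode *node = from_tail ? ql->tail : ql->head;
    return node ? node->value : NULL;
}
```

Checking type and encoding before the cast is the important part: ptr is an untyped void pointer, so the encoding field is the only record of what it actually refers to.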