Redis Hash (2)

Date: 2020-03-12


Storage type

An unordered hash table of field-value pairs. Each value can only be a string; other types cannot be nested inside it.

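Both store strings, so what are the main differences between Hash and String?

1. A hash gathers all related values under a single key, which saves memory.

2. Only one key is used, which reduces key collisions.

3. When values need to be fetched in bulk, a single command is enough, which reduces memory, I/O, and CPU consumption.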

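Scenarios where Hash is not a good fit:

1. A field cannot be given its own expiration time (this holds for the Redis versions current when this was written; Redis 7.4 later added per-field expiration commands such as HEXPIRE).

2. There are no bit operations.

3. Data distribution has to be considered: a hash lives under a single key, so when it grows very large it cannot be spread across multiple nodes.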

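Storage (implementation) principle

A Redis Hash is itself a key-value structure, similar to HashMap in Java.

The outer hash (the implementation of the Redis key space) uses only hashtable. When a hash data type is stored, we call it the inner hash. The inner hash can be backed by one of two data structures:

ziplist: OBJ_ENCODING_ZIPLIST (compressed list)

hashtable: OBJ_ENCODING_HT (hash table)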


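ziplist (compressed list)

A ziplist is a specially encoded doubly linked list stored in one contiguous block of memory. Instead of pointers to the previous and next nodes, each node stores the length of the previous node and its own length. It gives up some read/write performance in exchange for efficient memory use, trading time for space, and is used only when a hash has few fields and small values.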

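Internal structure of a ziplist

The layout is described in the comment at line 16 of ziplist.c; a single entry is represented by the zlentry struct: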

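typedef struct zlentry {
    unsigned int prevrawlensize; /* bytes needed to encode the previous node's length */
    unsigned int prevrawlen;     /* length of the previous node */
    unsigned int lensize;        /* bytes needed to encode the current node's length */
    unsigned int len;            /* length of the current node's data */
    unsigned int headersize;     /* header size of the current node (prevrawlensize + lensize), i.e. the non-data part */
    unsigned char encoding;      /* encoding type */
    unsigned char *p;            /* the ziplist is stored as a byte string; this points to the start of the current node */
} zlentry;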

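Encoding (ziplist.c, line 204)

#define ZIP_STR_06B (0 << 6)   /* string, length stored in 6 bits (up to 63 bytes) */
#define ZIP_STR_14B (1 << 6)   /* string, length stored in 14 bits (up to 16383 bytes) */
#define ZIP_STR_32B (2 << 6)   /* string, length stored in 32 bits */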
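
To make the space saving concrete, here is a minimal sketch that estimates how many header bytes a string entry needs. The function names (prevlen_field_size, str_encoding_size, entry_header_size) are invented for this example, and the thresholds (254 for the previous-length field, 63 and 16383 for the string-length field) reflect my reading of the ziplist format, so treat it as an illustration, not the Redis code.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to record the previous entry's length:
 * 1 byte when it is below 254, otherwise a marker byte plus 4 length bytes. */
size_t prevlen_field_size(size_t prev_entry_len) {
    return prev_entry_len < 254 ? 1 : 5;
}

/* Bytes needed for the encoding/length field of a string payload. */
size_t str_encoding_size(size_t len) {
    assert(len <= 0xFFFFFFFFu);    /* ZIP_STR_32B holds at most a 32-bit length */
    if (len <= 0x3f) return 1;     /* ZIP_STR_06B: 6-bit length */
    if (len <= 0x3fff) return 2;   /* ZIP_STR_14B: 14-bit length */
    return 5;                      /* ZIP_STR_32B: 1 marker byte + 4 length bytes */
}

/* headersize = prevrawlensize + lensize, matching the zlentry fields. */
size_t entry_header_size(size_t prev_entry_len, size_t len) {
    return prevlen_field_size(prev_entry_len) + str_encoding_size(len);
}
```

For example, a 5-byte value that follows a 10-byte entry needs only a 2-byte header, whereas the two pointers of an ordinary doubly linked list node take 16 bytes on a 64-bit system.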

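When is ziplist used for storage?

A hash object uses ziplist encoding when both of the following conditions hold:

1. Every field and every value is a string of at most 64 bytes (one English letter takes one byte).

2. The hash object holds at most 512 field-value pairs.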

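# redis.conf settings
# Largest field or value length, in bytes, that a ziplist-encoded hash may hold
hash-max-ziplist-value 64
# Largest number of entries that a ziplist-encoded hash may hold
hash-max-ziplist-entries 512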

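Once a hash object exceeds either configured threshold (a field or value longer than 64 bytes, or more than 512 field-value pairs), it is converted to a hash table (hashtable).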
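
The two conditions can be sketched as a single check. This is only an illustration under the default thresholds quoted above: the enum, the macro names, and choose_encoding are hypothetical, and Redis itself performs the check as fields are written, not over a finished array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative constants mirroring the redis.conf defaults. */
#define HASH_MAX_ZIPLIST_VALUE   64
#define HASH_MAX_ZIPLIST_ENTRIES 512

typedef enum { ENC_ZIPLIST, ENC_HASHTABLE } hash_encoding;  /* hypothetical names */

/* Decide which encoding a hash with `count` field-value pairs would need.
 * fields[i] and values[i] are NUL-terminated strings. */
hash_encoding choose_encoding(const char **fields, const char **values, size_t count) {
    assert(count == 0 || (fields != NULL && values != NULL));
    /* Too many pairs: no need to look at the strings at all. */
    if (count > HASH_MAX_ZIPLIST_ENTRIES) return ENC_HASHTABLE;
    for (size_t i = 0; i < count; i++) {
        /* A single oversized field or value is enough to force a hash table. */
        if (strlen(fields[i]) > HASH_MAX_ZIPLIST_VALUE ||
            strlen(values[i]) > HASH_MAX_ZIPLIST_VALUE)
            return ENC_HASHTABLE;
    }
    return ENC_ZIPLIST;
}
```

A hash with two short fields stays ENC_ZIPLIST; replacing one value with a 65-byte string, or growing the hash to 513 pairs, yields ENC_HASHTABLE.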

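hashtable (source: dict.h)

In Redis, the hashtable is called a dictionary (dict). It is an array-plus-linked-list structure.

As we saw earlier, the Redis key-value structure is implemented with dictEntry.

Redis wraps dictEntry in several further layers.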

typedef struct dictEntry {
    void *key;         /* key */
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;               /* value */
    struct dictEntry *next;  /* next entry in the same bucket (chaining) */
} dictEntry;

dictEntry is placed inside dictht (the hash table):

/* This is our hash table structure. Every dictionary has two of this as we
 * implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
    dictEntry **table;     /* hash table array (buckets) */
    unsigned long size;    /* size of the hash table (number of buckets) */
    unsigned long sizemask;/* mask used to compute the bucket index; always size-1 */
    unsigned long used;    /* number of entries stored */
} dictht;

dictht is placed inside dict:

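typedef struct dict {
    dictType *type; /* dictionary type (type-specific functions) */
    void *privdata; /* private data */
    dictht ht[2];   /* each dict has two hash tables */
    long rehashidx; /* rehash index; -1 when no rehash is in progress */
    unsigned long iterators; /* number of iterators currently running */
} dict;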

From the lowest layer to the highest: dictEntry -> dictht -> dict -> OBJ_ENCODING_HT
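
To show how the bucket array, sizemask, and next pointers cooperate, here is a small self-contained sketch of a chained hash table. All names (mini_entry, mini_table, mini_hash, and so on) are invented for the example, and the djb2 string hash is only a stand-in; Redis uses its own hash function and entry layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct mini_entry {
    const char *key;              /* not copied: the caller keeps the string alive */
    const char *val;
    struct mini_entry *next;      /* next entry in the same bucket */
} mini_entry;

typedef struct mini_table {
    mini_entry **table;           /* bucket array */
    unsigned long size;           /* number of buckets, a power of two */
    unsigned long sizemask;       /* size - 1 */
    unsigned long used;           /* number of stored entries */
} mini_table;

/* djb2 string hash, used here only as a simple stand-in. */
unsigned long mini_hash(const char *s) {
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Create a table with `size` buckets; returns 0 on success, -1 on failure. */
int mini_init(mini_table *t, unsigned long size) {
    assert(size > 0 && (size & (size - 1)) == 0);   /* must be a power of two */
    t->table = calloc(size, sizeof(mini_entry *));
    if (t->table == NULL) return -1;
    t->size = size;
    t->sizemask = size - 1;
    t->used = 0;
    return 0;
}

/* Insert without checking for duplicates; new entries go to the head of the chain. */
int mini_add(mini_table *t, const char *key, const char *val) {
    mini_entry *e = malloc(sizeof(*e));
    if (e == NULL) return -1;
    unsigned long idx = mini_hash(key) & t->sizemask;   /* bucket index */
    e->key = key;
    e->val = val;
    e->next = t->table[idx];
    t->table[idx] = e;
    t->used++;
    return 0;
}

/* Walk the bucket's chain and return the value, or NULL if the key is absent. */
const char *mini_find(const mini_table *t, const char *key) {
    for (mini_entry *e = t->table[mini_hash(key) & t->sizemask]; e; e = e->next)
        if (strcmp(e->key, key) == 0) return e->val;
    return NULL;
}
```

Because size is a power of two, hash & sizemask picks a bucket without a modulo operation, and entries that land in the same bucket are simply linked through next.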

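Storage structure of the hash

Why are two hash tables defined (ht[2])?

By default a Redis hash uses ht[0]; ht[1] is not initialized and no space is allocated for it.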

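The dictht hash table resolves collisions by separate chaining. Its performance therefore depends on the ratio between the number of nodes it holds (the used field) and its size (the size field):

When the ratio is 1:1 (each bucket holds a single entry), the hash table performs best;
If the number of nodes is much larger than the table size (call this ratio; a ratio of 5 means each bucket holds 5 entries on average), the hash table degrades into a set of linked lists and loses its performance advantage.

In that case the table has to be expanded. Redis calls this operation rehash.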

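Steps of a rehash:

1. Allocate space for the dict's ht[1] hash table. Its size depends on the operation being performed and on the number of key-value pairs ht[0] currently holds.

Expansion: the size of ht[1] is the first power of two greater than or equal to ht[0].used*2.

2. Rehash every node in ht[0] into ht[1]: recompute its hash and index, then place it in the corresponding bucket.

3. Once everything in ht[0] has been migrated to ht[1], free ht[0], make ht[1] the new ht[0], and create a fresh empty ht[1] in preparation for the next rehash.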
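
The three steps can be sketched as follows. This is a simplified model with invented names (rh_entry, rh_table, rh_dict, rh_rehash_all, and so on): keys are plain integers that serve as their own hash value, and all entries are moved in one pass, whereas Redis migrates them incrementally. RH_INITIAL_SIZE is assumed to be 4.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define RH_INITIAL_SIZE 4         /* assumed smallest table size */

typedef struct rh_entry {
    unsigned long key;            /* the key doubles as its own hash value */
    struct rh_entry *next;
} rh_entry;

typedef struct rh_table {
    rh_entry **table;
    unsigned long size, sizemask, used;
} rh_table;

typedef struct rh_dict {
    rh_table ht[2];               /* ht[1] is only populated during a rehash */
} rh_dict;

void rh_init(rh_dict *d) {
    d->ht[0] = (rh_table){0};
    d->ht[1] = (rh_table){0};
}

/* Smallest power of two that is >= size (and >= RH_INITIAL_SIZE). */
unsigned long rh_next_power(unsigned long size) {
    assert(size <= ULONG_MAX / 2 + 1);   /* otherwise the doubling would overflow */
    unsigned long i = RH_INITIAL_SIZE;
    while (i < size) i *= 2;
    return i;
}

/* Step 1: allocate ht[1]. Step 2: move every entry. Step 3: swap the tables.
 * Returns 0 on success, -1 if allocation fails (the dict is left unchanged). */
int rh_rehash_all(rh_dict *d) {
    rh_table *old = &d->ht[0], *neu = &d->ht[1];
    unsigned long size = rh_next_power(old->used * 2);

    neu->table = calloc(size, sizeof(rh_entry *));
    if (neu->table == NULL) return -1;
    neu->size = size;
    neu->sizemask = size - 1;
    neu->used = 0;

    for (unsigned long i = 0; i < old->size; i++) {
        rh_entry *e = old->table[i];
        while (e != NULL) {
            rh_entry *next = e->next;
            unsigned long idx = e->key & neu->sizemask;   /* recompute the index */
            e->next = neu->table[idx];
            neu->table[idx] = e;
            neu->used++;
            e = next;
        }
    }

    free(old->table);             /* release the old bucket array */
    d->ht[0] = d->ht[1];          /* ht[1] becomes the new ht[0] */
    d->ht[1] = (rh_table){0};     /* reset ht[1] for the next rehash */
    return 0;
}

/* Add a key (no duplicate check), growing the table once used reaches size. */
int rh_add(rh_dict *d, unsigned long key) {
    if (d->ht[0].used >= d->ht[0].size && rh_rehash_all(d) != 0) return -1;
    rh_entry *e = malloc(sizeof(*e));
    if (e == NULL) return -1;
    rh_table *t = &d->ht[0];
    unsigned long idx = key & t->sizemask;
    e->key = key;
    e->next = t->table[idx];
    t->table[idx] = e;
    t->used++;
    return 0;
}

/* Returns 1 if the key is stored in ht[0], 0 otherwise. */
int rh_contains(const rh_dict *d, unsigned long key) {
    const rh_table *t = &d->ht[0];
    if (t->size == 0) return 0;
    for (const rh_entry *e = t->table[key & t->sizemask]; e; e = e->next)
        if (e->key == key) return 1;
    return 0;
}
```

Adding ten keys to an empty rh_dict grows the table from 4 to 8 to 16 buckets; after each pass ht[1] is empty again and every key is still reachable through ht[0].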

When is expansion triggered?

Load factor (source: dict.c)

static int dict_can_resize = 1;
static unsigned int dict_force_resize_ratio = 5;

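ratio = used / size, the ratio of stored nodes to the size of the table. When dict_can_resize is 1, expansion is triggered as soon as the ratio reaches 1:1 (used >= size).

When dict_can_resize is 0, expansion is forced only once the ratio of stored nodes to table size exceeds dict_force_resize_ratio, that is, 5:1.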

Expansion check: _dictExpandIfNeeded (source: dict.c)

/* Expand the hash table if needed */
static int _dictExpandIfNeeded(dict *d)
{
    /* Incremental rehashing already in progress. Return. */
    if (dictIsRehashing(d)) return DICT_OK;

    /* If the hash table is empty expand it to the initial size. */
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);

    /* If we reached the 1:1 ratio, and we are allowed to resize the hash
     * table (global setting) or we should avoid it but the ratio between
     * elements/buckets is over the "safe" threshold, we resize doubling
     * the number of buckets. */
    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
    {
        return dictExpand(d, d->ht[0].used*2);
    }
    return DICT_OK;
}
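
The condition in _dictExpandIfNeeded can be isolated as a small predicate to try out concrete numbers. The sketch below is illustrative only: should_expand and FORCE_RESIZE_RATIO are names made up here, and the empty-table branch is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define FORCE_RESIZE_RATIO 5   /* same value as dict_force_resize_ratio */

/* Returns true when a table holding `used` entries in `size` buckets should grow.
 * `can_resize` plays the role of dict_can_resize. */
bool should_expand(unsigned long used, unsigned long size, bool can_resize) {
    assert(size > 0);                  /* the empty-table case is handled separately */
    if (used < size) return false;     /* fewer entries than buckets: nothing to do */
    /* Integer division, as in the original expression. */
    return can_resize || used / size > FORCE_RESIZE_RATIO;
}
```

With 4 buckets and resizing allowed, the fourth entry already triggers expansion. With resizing disallowed, the integer division means expansion is forced only from 24 entries onward (24 / 4 = 6 > 5), not at 21.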

Expansion function: dictExpand (source: dict.c)

/* Expand or create the hash table */
int dictExpand(dict *d, unsigned long size)
{
    /* the size is invalid if it is smaller than the number of
     * elements already inside the hash table */
    if (dictIsRehashing(d) || d->ht[0].used > size)
        return DICT_ERR;

    dictht n; /* the new hash table */
    unsigned long realsize = _dictNextPower(size);

    /* Rehashing to the same table size is not useful. */
    if (realsize == d->ht[0].size) return DICT_ERR;

    /* Allocate the new hash table and initialize all pointers to NULL */
    n.size = realsize;
    n.sizemask = realsize-1;
    n.table = zcalloc(realsize*sizeof(dictEntry*));
    n.used = 0;

    /* Is this the first initialization? If so it's not really a rehashing
     * we just set the first hash table so that it can accept keys. */
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    /* Prepare a second hash table for incremental rehashing */
    d->ht[1] = n;
    d->rehashidx = 0;
    return DICT_OK;
}

Shrinking (source: server.c)

int htNeedsResize(dict *dict) {
    long long size, used;

    size = dictSlots(dict);
    used = dictSize(dict);
    return (size > DICT_HT_INITIAL_SIZE &&
            (used*100/size < HASHTABLE_MIN_FILL));
}
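
As a worked example of the shrink condition, the sketch below reproduces the same arithmetic on plain numbers. It assumes DICT_HT_INITIAL_SIZE is 4 and HASHTABLE_MIN_FILL is 10 (a 10% fill), which are not shown in the excerpt; needs_shrink and the macro names are invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define INITIAL_SIZE 4    /* assumed value of DICT_HT_INITIAL_SIZE */
#define MIN_FILL     10   /* assumed value of HASHTABLE_MIN_FILL, in percent */

/* Returns true when a table with `size` buckets and `used` entries is
 * larger than the initial size and less than MIN_FILL percent full. */
bool needs_shrink(long long size, long long used) {
    assert(size >= 0 && used >= 0);
    /* size > INITIAL_SIZE is tested first, so the division never sees size == 0. */
    return size > INITIAL_SIZE && used * 100 / size < MIN_FILL;
}
```

Under these assumptions a 128-bucket table qualifies for shrinking with 12 entries (1200 / 128 = 9) but not with 13 (1300 / 128 = 10), and a table at the initial size of 4 is never shrunk.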
