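Yesterday I was discussing Redis clustering with some colleagues. When Redis Cluster came up, I casually showed off how it works: "Redis Cluster uses virtual slot partitioning and maps each key through a hash function onto one of 16384 slots...", and so on.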
Colleague A immediately asked: "Why does Redis Cluster use 16384 slots?"
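Good question. Redis Cluster computes the slot as slot = CRC16(key) & 16383, which is the same as CRC16(key) mod 16384. The hash function crc16() produces a 16-bit value, so it can yield 2^16 = 65536 distinct values; in other words, the hash falls in the range 0-65535. By that logic we would expect the modulus to be 65536, so why 16384?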
I looked it up, and sure enough someone had already asked the same thing (https://github.com/redis/redis/issues/2576), and the author gave an explanation:
The reason is:
Normal heartbeat packets carry the full configuration of a node, that can be replaced in an idempotent way with the old in order to update an old config. This means they contain the slots configuration for a node, in raw form, that uses 2k of space with 16k slots, but would use a prohibitive 8k of space using 65k slots. At the same time it is unlikely that Redis Cluster would scale to more than 1000 master nodes because of other design tradeoffs. So 16k was in the right range to ensure enough slots per master with a max of 1000 masters, but a small enough number to propagate the slot configuration as a raw bitmap easily. Note that in small clusters the bitmap would be hard to compress because when N is small the bitmap would have slots/N bits set that is a large percentage of bits set.
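To summarize, there are two main reasons:
- Message size. The more slots there are, the more space the slot information takes up, which wastes bandwidth and can congest the network.
In Redis Cluster, a node joins the cluster through CLUSTER MEET ip port, which performs the handshake. After that, nodes exchange information through periodic ping/pong messages, whose header is the following struct: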
    #define CLUSTER_SLOTS 16384
    typedef struct {
        char sig[4];        /* Signature "RCmb" (Redis Cluster message bus). */
        uint32_t totlen;    /* Total length of this message */
        uint16_t ver;       /* Protocol version, currently set to 1. */
        uint16_t port;      /* TCP base port number. */
        uint16_t type;      /* Message type */
        uint16_t count;     /* Only used for some kind of messages. */
        uint64_t currentEpoch;  /* The epoch accordingly to the sending node. */
        uint64_t configEpoch;   /* The config epoch if it's a master, or the last
                                   epoch advertised by its master if it is a
                                   slave. */
        uint64_t offset;    /* Master replication offset if node is a master or
                               processed replication offset if node is a slave. */
        char sender[CLUSTER_NAMELEN]; /* Name of the sender node */
        unsigned char myslots[CLUSTER_SLOTS/8];
        char slaveof[CLUSTER_NAMELEN];
        char myip[NET_IP_STR_LEN];    /* Sender IP, if not all zeroed. */
        char notused1[34];  /* 34 bytes reserved for future usage. */
        uint16_t cport;     /* Sender TCP cluster bus port */
        uint16_t flags;     /* Sender node flags */
        unsigned char state; /* Cluster state from the POV of the sender */
        unsigned char mflags[3]; /* Message flags: CLUSTERMSG_FLAG[012]_... */
        union clusterMsgData data;
    } clusterMsg;
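The field unsigned char myslots[CLUSTER_SLOTS/8]; is a bitmap of the slots held by the sending node. Each bit stands for one slot, and a bit set to 1 means that slot belongs to this node. Because CLUSTER_SLOTS is defined as 16384, myslots takes 16384/8/1024 = 2 KB; if CLUSTER_SLOTS were 65536 instead, it would take 8 KB.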
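The message body also carries information about other nodes for the exchange. The number of other nodes included is roughly 1/10 of the nodes in the cluster, and never fewer than 3. So the more nodes the cluster has, the larger each message becomes.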
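- A Redis cluster is very unlikely to have more than 1000 master nodes.
The more nodes there are, the larger the exchanged messages become, and according to the author other design tradeoffs already make it unlikely that a cluster would scale past about 1000 masters. Even at that size, 16384 slots still leave roughly 16 slots per master. On the other hand, each node's slot information is kept as a bitmap, and the author points out that this bitmap would be hard to compress: a node's bitmap has about slots/N bits set (N being the number of masters), so when N is small a large percentage of the bits are set and compression gains little. A slot count small enough to send as a raw 2 KB bitmap sidesteps that problem.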
Weighing all of this, the author concluded that 16384 slots are enough in practice.
This article was originally shared from the WeChat public account 光華路程序猿 (syd3600520).