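One of the biggest changes in Redis 3 is the official release of the cluster feature. Previously, building a Redis cluster meant doing your own sharding with consistent hashing. Now it is much easier: just use the built-in cluster feature, which also supports adding nodes dynamically, HA, and redistributing cached data after nodes are added or removed (resharding).
Below is the process of setting up a cluster on a Mac, following the official cluster-tutorial:
The latest version at the moment is 3.0.7. Download: http://www.redis.io/download
Compiling is simple, a single make command is enough. If you are unsure, see my earlier note: Redis Study Notes (1): Compiling, Starting, Stopping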
mkdir ~/app/redis-cluster/ # create a root directory first
cd ~/app/redis-cluster/ # create the per-port directories inside the root
mkdir 7000 7001 7002 7003 7004 7005
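Note: like most distributed middleware, Redis Cluster relies on voting to keep the cluster highly available: a majority of masters has to agree before a node is marked as failed or a slave is promoted. So, as with ZooKeeper, an odd number of masters is typical (fewer than N/2 of them may fail), which means at least 3 masters. Since each master also needs a slave as its backup, a Redis cluster needs at least 6 nodes.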
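The majority rule in the note above can be checked with a little arithmetic. This is only a sketch under the assumption that a strict majority of masters (N/2 + 1, integer division) must stay reachable; tolerated_failures is a hypothetical helper name, not part of Redis.

```shell
#!/usr/bin/env bash
# Hypothetical helper: how many masters may fail while a strict
# majority (masters / 2 + 1, integer division) is still reachable.
tolerated_failures() {
  local masters=$1
  local majority=$(( masters / 2 + 1 ))
  echo $(( masters - majority ))
}

# Compare odd and even master counts.
for n in 3 4 5 6; do
  echo "masters=$n tolerated_failures=$(tolerated_failures "$n")"
done
```

Under this assumption 3 and 4 masters both tolerate one failure, and 5 and 6 both tolerate two, which is why adding a master to reach an even count buys no extra fault tolerance.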
Then copy the Redis build compiled in step 1 into each of these 6 directories.
port 7000
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
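Save the snippet above as redis-cluster.conf and put it in the redis directory under each port directory. Remember to change the port: port 7000 in the 7000 directory, port 7001 in the 7001 directory, and so on.
cluster-node-timeout is the maximum number of milliseconds a node may be unreachable during inter-node communication. The setting above is 5 seconds: if a node cannot be reached by the other nodes for more than 5 seconds, it is considered down.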
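Editing the port by hand in six files is easy to get wrong, so the per-port configs can also be generated. The following is a minimal sketch, not the method used in this post: write_cluster_conf and BASE_DIR are hypothetical names, and the file is written directly into each port directory, so adjust the target path if your redis directory sits one level deeper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a minimal cluster config for one port.
write_cluster_conf() {
  local dir=$1 port=$2
  mkdir -p "$dir"
  cat > "$dir/redis-cluster.conf" <<EOF
port $port
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
EOF
}

# BASE_DIR is an assumption; it defaults to a throwaway temp directory
# so that the sketch never touches a real installation.
BASE_DIR="${BASE_DIR:-$(mktemp -d)}"
for port in 7000 7001 7002 7003 7004 7005; do
  write_cluster_conf "$BASE_DIR/$port" "$port"
done
echo "configs written under $BASE_DIR"
```

Setting BASE_DIR=~/app/redis-cluster before running would target the directories created earlier.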
In the src subdirectory of the redis directory under each port directory, run:
./redis-server ../redis-cluster.conf
This starts the 6 nodes 7000~7005.
brew update
brew install ruby
sudo gem install redis # note: a proxy is recommended for this step if the gem source is slow or unreachable from your network
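Explanation: although step 4 started the 6 Redis servers successfully, they are still completely independent of each other, and a separate tool is needed to join them into a cluster. That tool is redis-trib.rb, a Ruby script shipped with Redis (my guess is that the author of Redis is fond of Ruby). The Mac comes with Ruby 2.0 but without the redis gem, so the gem has to be installed. Otherwise creating the cluster in the next step will fail.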
./redis-trib.rb create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 \
127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005
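Staying in the src subdirectory of any of the nodes, run the command above and the cluster is created. --replicas 1 means one replica (i.e. a slave) is created for each master, so of 127.0.0.1:7000~127.0.0.1:7005, 3 nodes are assigned as masters and the other 3 as slaves.
Note: creating the cluster with redis-trib only needs to be done once. If the machine is shut down and all 6 nodes are stopped, they come back in cluster mode automatically once they are started again (each node keeps the cluster state in its nodes.conf), with no need to run redis-trib.rb create again.
At this point, if you look at the redis processes with ps, you will see the word cluster appended to each one.
To find out which ports are masters and which are slaves, use the following command: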
./redis-trib.rb check 127.0.0.1:7000
The output is as follows:
>>> Performing Cluster Check (using node 127.0.0.1:7000)
S: 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e 127.0.0.1:7000
   slots: (0 slots) slave
   replicates 38910c5baafea02c5303505acfd9bd331c608cfc
M: e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 88e16f91609c03277f2ee6ce5285932f58c221c1 127.0.0.1:7005
   slots: (0 slots) slave
   replicates ec964a7c7cd53b986f54318a190c1426fc53a5fa
S: be7e9fd3b7d096b037306bc14e1017150fa59d7a 127.0.0.1:7004
   slots: (0 slots) slave
   replicates e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa
M: 38910c5baafea02c5303505acfd9bd331c608cfc 127.0.0.1:7003
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: ec964a7c7cd53b986f54318a190c1426fc53a5fa 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
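From the output above, 7000, 7004 and 7005 are slaves, while 7001, 7003 and 7002 are masters (if you have done some manual failover testing, for example stopping a node by hand and bringing it back, your output may differ). Besides check, another commonly used subcommand is info: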
./redis-trib.rb info 127.0.0.1:7000
The output is as follows:
127.0.0.1:7001 (e0e8dfdd...) -> 2 keys | 5462 slots | 1 slaves.
127.0.0.1:7003 (38910c5b...) -> 2 keys | 5461 slots | 1 slaves.
127.0.0.1:7002 (ec964a7c...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 4 keys in 3 masters.
0.00 keys per slot on average.
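It prints information about every master: how many keys it holds, how many slaves it has, the total number of keys across all masters, and the average number of keys per slot. To learn about the other redis-trib subcommands, run: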
./redis-trib.rb help
The output is as follows:
Usage: redis-trib <command> <options> <arguments ...>

  create          host1:port1 ... hostN:portN
                  --replicas <arg>
  check           host:port
  info            host:port
  fix             host:port
                  --timeout <arg>
  reshard         host:port
                  --from <arg>
                  --to <arg>
                  --slots <arg>
                  --yes
                  --timeout <arg>
                  --pipeline <arg>
  rebalance       host:port
                  --weight <arg>
                  --auto-weights
                  --use-empty-masters
                  --timeout <arg>
                  --simulate
                  --pipeline <arg>
                  --threshold <arg>
  add-node        new_host:new_port existing_host:existing_port
                  --slave
                  --master-id <arg>
  del-node        host:port node_id
  set-timeout     host:port milliseconds
  call            host:port command arg arg .. arg
  import          host:port
                  --from <arg>
                  --copy
                  --replace
  help            (show this help)

For check, fix, reshard, del-node, set-timeout you can specify the host and port of any working node in the cluster.
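The word slot has come up several times already, so here is a brief explanation:
As the figure above shows, redis-cluster divides the key space of the whole cluster into 16384 slots. With 6 nodes split into 3 masters and 3 slaves, the cluster effectively has 3 HA groups, and the 3 masters share all the slots evenly. For every operation on a key (for example reading or writing a cached value), Redis runs the CRC16 algorithm over the key to get a number and takes it modulo 16384. The remainder determines which slot the entry falls in, and the slot in turn determines which master stores it. When the cluster is expanded or nodes are removed, only the slots need to be reassigned (that is, some slots are moved from some nodes to others).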
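The key-to-slot mapping can be sketched in a few lines of bash. The Redis Cluster specification defines the slot as CRC16(key) mod 16384, using the XMODEM variant of CRC16 (polynomial 0x1021, initial value 0). The function names crc16 and key_slot below are my own, the sketch only handles plain ASCII keys, and it ignores the {...} hash-tag rule, so treat it as an illustration rather than a replacement for CLUSTER KEYSLOT.

```shell
#!/usr/bin/env bash
# CRC16/XMODEM over an ASCII string: polynomial 0x1021, initial value 0,
# bits processed most-significant first, no final XOR.
crc16() {
  local s=$1 crc=0 i j byte
  for (( i = 0; i < ${#s}; i++ )); do
    printf -v byte '%d' "'${s:i:1}"        # ASCII code of the character
    crc=$(( crc ^ (byte << 8) ))
    for (( j = 0; j < 8; j++ )); do
      if (( crc & 0x8000 )); then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
    done
  done
  echo "$crc"
}

# Slot = CRC16(key) mod 16384 (hash tags are not handled here).
key_slot() {
  echo $(( $(crc16 "$1") % 16384 ))
}

key_slot user1
```

The specification gives 0x31C3 (12739) as the CRC16 of "123456789", which makes a handy self-check. If the sketch is right, key_slot user1 should agree with the slot that redis-cli reports when the key is written.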
./redis-cli -c -h localhost -p 7000
Note the -c flag, which enables cluster mode. Let's add a cached value to try it out:
localhost:7000> set user1 jimmy
-> Redirected to slot [8106] located at 127.0.0.1:7001
OK
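Note the second line of the output: after the calculation, the key user1 falls in slot 8106, which lives on the node at port 7001 (explanation: slot 8106 is in the range 5461-10922 owned by master 7001, while 7000 is a slave, and only masters accept writes). If you repeat the operation on 7001, the second line does not appear (explanation: 7001 is the master that owns the slot, so no redirect is needed).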
➜ src ./redis-cli -c -h localhost -p 7001
localhost:7001> set user1 yang
OK
localhost:7001>
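The redirect above is just a range lookup: the client is sent to whichever master currently owns the slot. A minimal sketch follows, with the slot ranges hard-coded from the check output earlier in this post. slot_owner is a hypothetical helper, and the ranges change after a failover or reshard.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a slot to the port of the master that owns it.
# Ranges are copied from the earlier `redis-trib.rb check` output and are
# only valid for that moment in time.
slot_owner() {
  local slot=$1
  if   (( slot >= 0     && slot <= 5460  )); then echo 7003
  elif (( slot >= 5461  && slot <= 10922 )); then echo 7001
  elif (( slot >= 10923 && slot <= 16383 )); then echo 7002
  else
    echo "invalid slot: $slot" >&2
    return 1
  fi
}

slot_owner 8106   # prints 7001, matching the redirect shown above
```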
First use redis-trib.rb to look at the current masters and slaves:
➜ src ./redis-trib.rb check localhost:7000
>>> Performing Cluster Check (using node localhost:7000)
S: 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e localhost:7000
   slots: (0 slots) slave
   replicates 38910c5baafea02c5303505acfd9bd331c608cfc
M: ec964a7c7cd53b986f54318a190c1426fc53a5fa 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
M: e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: be7e9fd3b7d096b037306bc14e1017150fa59d7a 127.0.0.1:7004
   slots: (0 slots) slave
   replicates e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa
S: 88e16f91609c03277f2ee6ce5285932f58c221c1 127.0.0.1:7005
   slots: (0 slots) slave
   replicates ec964a7c7cd53b986f54318a190c1426fc53a5fa
M: 38910c5baafea02c5303505acfd9bd331c608cfc 127.0.0.1:7003
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
The output shows that 7000 is the slave of 7003 (38910c5baafea02c5303505acfd9bd331c608cfc). Now let's kill the redis process on 7003 by hand and watch the terminal output of 7000:
872:S 21 Mar 10:55:55.663 * Connecting to MASTER 127.0.0.1:7003
3872:S 21 Mar 10:55:55.663 * MASTER <-> SLAVE sync started
3872:S 21 Mar 10:55:55.663 # Error condition on socket for SYNC: Connection refused
3872:S 21 Mar 10:55:55.771 * Marking node 38910c5baafea02c5303505acfd9bd331c608cfc as failing (quorum reached).
3872:S 21 Mar 10:55:55.771 # Cluster state changed: fail
3872:S 21 Mar 10:55:55.869 # Start of election delayed for 954 milliseconds (rank #0, offset 183).
3872:S 21 Mar 10:55:56.703 * Connecting to MASTER 127.0.0.1:7003
3872:S 21 Mar 10:55:56.703 * MASTER <-> SLAVE sync started
3872:S 21 Mar 10:55:56.703 # Error condition on socket for SYNC: Connection refused
3872:S 21 Mar 10:55:56.909 # Starting a failover election for epoch 10.
3872:S 21 Mar 10:55:56.911 # Failover election won: I'm the new master.
3872:S 21 Mar 10:55:56.911 # configEpoch set to 10 after successful failover
3872:M 21 Mar 10:55:56.911 * Discarding previously cached master state.
3872:M 21 Mar 10:55:56.911 # Cluster state changed: ok
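Note log entries 5, 6 and 11. Entry 5 (Cluster state changed: fail) shows that the cluster switched to the fail state because 7003 went down. Entry 6 (Start of election delayed) shows an election being scheduled. Entry 11 (Failover election won) shows the node on port 7000 being elected as the new master.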
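As the business grows, expanding the cluster is only a matter of time. The following shows how to add 2 more nodes. First make two copies of 7000, named 7006 and 7007, then go into the src subdirectory of the redis directory under 7006/7007: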
rm nodes.conf dump.rdb appendonly.aof
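Because 7000 has already been running, it already contains some data. The data file, the append-only log file (appendonly.aof) and the cluster's nodes.conf must therefore be deleted to turn the copy into an empty standalone Redis node. Otherwise it cannot join the cluster.
Then edit redis-cluster.conf: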
port 7000
cluster-enabled yes
cluster-config-file "nodes.conf"
cluster-node-timeout 10000
appendonly yes
# Generated by CONFIG REWRITE
dir "/Users/yjmyzz/app/redis-cluster/7000/redis-3.0.7/src"
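Two things need changing: first, the port on the first line, which should match 7006/7007; second, the last 2 lines, which were added automatically after 7000 had run and should be deleted.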
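The two edits just described can also be scripted with sed. This is a sketch under the assumption that the copied config looks like the one shown above. prepare_node_conf and the file names in the usage comment are hypothetical, and source and destination must be different files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a clean config for a new node from a copied one.
#   - rewrite the port line to the new port
#   - drop the "# Generated by CONFIG REWRITE" marker and the dir line after it
prepare_node_conf() {
  local src=$1 dst=$2 port=$3
  sed -e "s/^port .*/port $port/" \
      -e '/^# Generated by CONFIG REWRITE/d' \
      -e '/^dir /d' \
      "$src" > "$dst"
}

# Usage (paths are placeholders, adjust to your layout):
#   prepare_node_conf redis-cluster.conf.orig redis-cluster.conf 7006
```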
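Once this is done, start the two Redis nodes 7006 and 7007. At this point the 2 new nodes have nothing to do with the cluster. Use the following command to add 7006 to the cluster as a master: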
./redis-trib.rb add-node 127.0.0.1:7006 127.0.0.1:7000
Note: the first argument is the new node's "IP:port"; the second is any working node already in the cluster.
If all goes well, the output looks like this:
>>> Adding node 127.0.0.1:7006 to cluster 127.0.0.1:7000
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e 127.0.0.1:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: be7e9fd3b7d096b037306bc14e1017150fa59d7a 127.0.0.1:7004
   slots: (0 slots) slave
   replicates e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa
M: ec964a7c7cd53b986f54318a190c1426fc53a5fa 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 88e16f91609c03277f2ee6ce5285932f58c221c1 127.0.0.1:7005
   slots: (0 slots) slave
   replicates ec964a7c7cd53b986f54318a190c1426fc53a5fa
S: 38910c5baafea02c5303505acfd9bd331c608cfc 127.0.0.1:7003
   slots: (0 slots) slave
   replicates 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:7006 to make it join the cluster.
[OK] New node added correctly.
Use check again to confirm the state:
➜ src ./redis-trib.rb check 127.0.0.1:7000
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e 127.0.0.1:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: be7e9fd3b7d096b037306bc14e1017150fa59d7a 127.0.0.1:7004
   slots: (0 slots) slave
   replicates e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa
M: 226d1af3c95bf0798ea9fed86373b89347f889da 127.0.0.1:7006
   slots: (0 slots) master
   0 additional replica(s)
M: ec964a7c7cd53b986f54318a190c1426fc53a5fa 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 88e16f91609c03277f2ee6ce5285932f58c221c1 127.0.0.1:7005
   slots: (0 slots) slave
   replicates ec964a7c7cd53b986f54318a190c1426fc53a5fa
S: 38910c5baafea02c5303505acfd9bd331c608cfc 127.0.0.1:7003
   slots: (0 slots) slave
   replicates 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Lines 12-14 (the entry for 127.0.0.1:7006) show that 7006 is now a new master of the cluster. Next, use the following command to add 7007 as a slave:
./redis-trib.rb add-node --slave --master-id 226d1af3c95bf0798ea9fed86373b89347f889da 127.0.0.1:7007 127.0.0.1:7000
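There are two extra options here: --slave means the new node joins as a slave, and --master-id xxxxx specifies whose slave it becomes. The xxx part is the ID of 7006 from the earlier check output. Once done, confirm the state again: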
➜ src ./redis-trib.rb check 127.0.0.1:7000
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e 127.0.0.1:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 792bcccf35845c4922dd33d7f9827420ebb89bc9 127.0.0.1:7007
   slots: (0 slots) slave
   replicates 226d1af3c95bf0798ea9fed86373b89347f889da
M: e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa 127.0.0.1:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: be7e9fd3b7d096b037306bc14e1017150fa59d7a 127.0.0.1:7004
   slots: (0 slots) slave
   replicates e0e8dfddd4e9d855090d6efd18e55ea9c0e1f7aa
M: 226d1af3c95bf0798ea9fed86373b89347f889da 127.0.0.1:7006
   slots: (0 slots) master
   1 additional replica(s)
M: ec964a7c7cd53b986f54318a190c1426fc53a5fa 127.0.0.1:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 88e16f91609c03277f2ee6ce5285932f58c221c1 127.0.0.1:7005
   slots: (0 slots) slave
   replicates ec964a7c7cd53b986f54318a190c1426fc53a5fa
S: 38910c5baafea02c5303505acfd9bd331c608cfc 127.0.0.1:7003
   slots: (0 slots) slave
   replicates 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
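Lines 6-8 and 15-17 (the entries for 7007 and 7006) show that 7007 is now a slave of 7006.
After adding the new nodes, a problem appears: the 16384 slots are already all taken by the other 3 groups, so the new nodes have no slots and cannot hold any cached data. The slots therefore need to be redistributed.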
➜ src ./redis-trib.rb info 127.0.0.1:7000
127.0.0.1:7000 (0b7e0d53...) -> 4 keys | 5461 slots | 1 slaves.
127.0.0.1:7001 (e0e8dfdd...) -> 4 keys | 5462 slots | 1 slaves.
127.0.0.1:7006 (226d1af3...) -> 0 keys | 0 slots | 1 slaves. # 7006 has no slots at all
127.0.0.1:7002 (ec964a7c...) -> 9 keys | 5461 slots | 1 slaves.
[OK] 17 keys in 4 masters.
0.00 keys per slot on average.
Use the following command to reassign slots:
./redis-trib.rb reshard 127.0.0.1:7000
The IP:port after reshard can be any working node in the cluster.
➜ src ./redis-trib.rb reshard 127.0.0.1:7000
>>> Performing Cluster Check (using node 127.0.0.1:7000)
M: 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e 127.0.0.1:7000
   slots:1792-4095 (2304 slots) master
   0 additional replica(s)
...
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 1000 # enter how many slots to move
What is the receiving node ID? 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e # enter the ID of the target node
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:all # use all nodes as source nodes
...
    Moving slot 4309 from ec964a7c7cd53b986f54318a190c1426fc53a5fa
    Moving slot 4310 from ec964a7c7cd53b986f54318a190c1426fc53a5fa
    Moving slot 4311 from ec964a7c7cd53b986f54318a190c1426fc53a5fa
    Moving slot 4312 from ec964a7c7cd53b986f54318a190c1426fc53a5fa
    Moving slot 4313 from ec964a7c7cd53b986f54318a190c1426fc53a5fa
Do you want to proceed with the proposed reshard plan (yes/no)? yes # confirm
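Note: think carefully about the first prompt, how many slots to move. Entering 16384 moves every slot to a single node, which makes the distribution even more unbalanced! I suggest moving 500~1000 at a time, which has less impact on a live system.
Also, when entering the source nodes, besides all you can enter the ID of a source node directly, that is: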
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 300
What is the receiving node ID? 0b7e0d5337e87ac7b59bba4c1248e5c9e8d1905e
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:226d1af3c95bf0798ea9fed86373b89347f889da # enter the source node's ID here
Source node #2:done # done means no more source nodes will be added
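reshard can be run repeatedly until the desired distribution is reached (note: personally I find Redis's reshard a bit tedious here, since the number of slots to move has to be worked out by hand; it would be nice to have an option that spreads the 16384 slots evenly automatically). After the adjustment, take another look at the distribution: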
➜ src ./redis-trib.rb info 127.0.0.1:7000
127.0.0.1:7000 (0b7e0d53...) -> 4 keys | 4072 slots | 0 slaves.
127.0.0.1:7001 (e0e8dfdd...) -> 5 keys | 4099 slots | 0 slaves.
127.0.0.1:7006 (226d1af3...) -> 5 keys | 4132 slots | 4 slaves.
127.0.0.1:7002 (ec964a7c...) -> 3 keys | 4081 slots | 0 slaves.
[OK] 17 keys in 4 masters.
0.00 keys per slot on average.
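The manual arithmetic for an even split is simple enough to script. The sketch below only computes how many slots a master still needs in order to reach an equal share. It does not talk to Redis, and slots_to_move is a hypothetical helper name. The help output shown earlier also lists a rebalance subcommand, which may cover this case, but it is not tried in this post.

```shell
#!/usr/bin/env bash
TOTAL_SLOTS=16384

# Hypothetical helper.
#   $1 = number of masters after the expansion
#   $2 = slots the receiving master currently holds
# Prints how many more slots it needs for an equal share (never negative).
slots_to_move() {
  local masters=$1 current=$2
  local target=$(( TOTAL_SLOTS / masters ))
  local need=$(( target - current ))
  if (( need < 0 )); then
    need=0
  fi
  echo "$need"
}

slots_to_move 4 0   # prints 4096: an empty 4th master needs a quarter of the slots
```

Moving that amount in several smaller reshard runs, as suggested above, keeps the impact on a live system low.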
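Where there is expansion, there is also the opposite need: nodes that are no longer needed can be removed with del-node. For example, after my messing around just now, 7006 ended up with 4 slaves while the other masters have none, which is clearly unreasonable.
The command to delete a node: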
./redis-trib.rb del-node 127.0.0.1:7006 88e16f91609c03277f2ee6ce5285932f58c221c1
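The ip:port after del-node can be any working node in the cluster, and the last argument is the ID of the node to delete. Note: only slaves and empty masters can be deleted. If a master is not empty, first use reshard to move its slots to other nodes and then delete it. For a master-slave pair, if you move all of the master's slots to other nodes and then delete the master, the remaining slave finds itself a new master and becomes the slave of another master.
Also: deleting a node does not just remove it from the cluster, it also stops the Redis service on the target node.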