Redis Cluster: automated installation, scale-out, and scale-in

Date: 2019-12-19

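I previously wrote up a Python-based implementation of automated Redis cluster installation. Building a cluster from raw commands is quite tedious, which is why the official project provides the redis-trib.rb tool.
Although the official redis-trib.rb offers command-line tools for creating, checking, repairing, and rebalancing a cluster, I personally cannot accept it, because redis-trib.rb cannot customize the master-slave relationships among the nodes of the cluster.
Take six nodes A, B, C, D, E, and F: creating the cluster necessarily means stating which are masters, which are slaves, and how they pair up. Unfortunately redis-trib.rb gives no way to control this, as the screenshot in the original post shows.
More often than not you need to say explicitly which machines act as masters and which as slaves, and redis-trib.rb cannot give you that control over which machines (instances) become masters and which become slaves.
Using redis-trib.rb also means dealing with the Ruby environment dependency, so I am not keen on building clusters with it.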

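Quoting the book 《Redis開發與運維》:
If the nodes are deployed on different IP addresses, redis-trib.rb tries as far as possible to keep a master and its slave off the same machine, and therefore reorders the node list.
The order of the node list determines the master and slave roles: masters first, then slaves.
This shows that with redis-trib.rb you cannot fully control by hand how master and slave roles are assigned.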

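Later, in Redis 5.0, redis-cli --cluster took over cluster creation, so neither redis-trib.rb nor a Ruby environment is needed; redis-cli --cluster in Redis 5.0 implements the cluster-management functions itself.
Working from raw commands is still fairly complex, though, especially in a more complicated production environment: creating, scaling out, or scaling in a cluster by hand involves a series of manual steps and some safety risks.
So automating cluster creation, scale-out, and scale-in is worthwhile.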

Test environment

Here, Python 3 and the redis-cli --cluster command are used to automate Redis cluster creation, scale-out, and scale-in.

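The test environment uses multiple instances on a single machine, eight nodes in total:
1. Automated cluster creation: six nodes (10001~10006) form a cluster of three masters (10001~10003) and three slaves (10004~10006).
2. Automated scale-out: add the new node 10007 as a master, and add 10008 as the slave of 10007.
3. Automated scale-in: the reverse of 2; remove 10007 and its slave 10008 from the cluster.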
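
As a rough illustration, the layout above can be written down in the list-of-dict shape (host, port, password keys) that the script later in this post reads. The helper name make_nodes and the placeholder password are assumptions made for this sketch, not part of the original script.

```python
# Sketch of the test layout: 8 single-host instances on ports 10001-10008.
# make_nodes and PASSWORD are illustrative names, not from the original script.
PASSWORD = "******"  # placeholder; substitute the real requirepass value


def make_nodes(ports, host="127.0.0.1", password=PASSWORD):
    """Build node dicts in the {"host", "port", "password"} shape."""
    return [{"host": host, "port": p, "password": password} for p in ports]


list_master_node = make_nodes(range(10001, 10004))  # masters 10001-10003
list_slave_node = make_nodes(range(10004, 10007))   # slaves 10004-10006
new_master_node = make_nodes([10007])[0]            # master added on scale-out
new_slave_node = make_nodes([10008])[0]             # slave of 10007

# Masters and slaves pair up by list position.
pairs = [(m["port"], s["port"]) for m, s in zip(list_master_node, list_slave_node)]
print(pairs)
```

Pairing by list position is what lets the operator decide exactly which instance becomes the slave of which master.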

Creating the Redis cluster

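Creating the cluster boils down to running two groups of commands: one that joins the master nodes into a cluster, and one that adds a slave to each master in turn.
Along the way you have to look up each node's ID, so doing it by hand is tedious.
The main commands are: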

################# create cluster #################
redis-cli --cluster create 127.0.0.1:10001 127.0.0.1:10002 127.0.0.1:10003 -a ****** --cluster-yes
################# add slave nodes #################
redis-cli --cluster add-node 127.0.0.1:10004 127.0.0.1:10001 --cluster-slave --cluster-master-id 6164025849a8ff9297664fc835bc851af5004f61 -a ******
redis-cli --cluster add-node 127.0.0.1:10005 127.0.0.1:10002 --cluster-slave --cluster-master-id 64e634307bdc339b503574f5a77f1b156c021358 -a ******
redis-cli --cluster add-node 127.0.0.1:10006 127.0.0.1:10003 --cluster-slave --cluster-master-id 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a -a ******
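
The two command groups above can be assembled programmatically once the master IDs are known. The following is a minimal sketch that only builds the argument lists; the function names build_create_command and build_add_slave_command are assumptions for illustration, and the flags are exactly the ones shown in the commands above.

```python
# Sketch: assemble the two groups of redis-cli commands as argument lists.
# build_create_command / build_add_slave_command are illustrative names.

def build_create_command(masters, password):
    """masters: list of "host:port" strings for the master nodes."""
    return ["redis-cli", "--cluster", "create", *masters,
            "-a", password, "--cluster-yes"]


def build_add_slave_command(slave, master, master_id, password):
    """The node being added comes first, then a node already in the cluster."""
    return [
        "redis-cli", "--cluster", "add-node", slave, master,
        "--cluster-slave", "--cluster-master-id", master_id,
        "-a", password,
    ]


masters = ["127.0.0.1:10001", "127.0.0.1:10002", "127.0.0.1:10003"]
create_cmd = build_create_command(masters, "******")
add_cmd = build_add_slave_command(
    "127.0.0.1:10004", "127.0.0.1:10001",
    "6164025849a8ff9297664fc835bc851af5004f61", "******",
)
print(" ".join(create_cmd))
print(" ".join(add_cmd))
```

Keeping the command as a list means it could be handed to subprocess.run(cmd, check=True) without a shell, which avoids quoting problems with passwords; the script in this post uses os.system with a formatted string instead.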

Below is the redis-cli --cluster log output printed while the Python script creates the cluster:

[root@JD redis_install]# python3 create_redis_cluster.py
################# flush master/slave slots #################
################# create cluster #################
redis-cli --cluster create 127.0.0.1:10001 127.0.0.1:10002 127.0.0.1:10003   -a ****** --cluster-yes
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 3 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
.
>>> Performing Cluster Check (using node 127.0.0.1:10001)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
0
################# add slave nodes #################
redis-cli --cluster add-node 127.0.0.1:10004 127.0.0.1:10001 --cluster-slave --cluster-master-id 6164025849a8ff9297664fc835bc851af5004f61 -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10004 to cluster 127.0.0.1:10001
>>> Performing Cluster Check (using node 127.0.0.1:10001)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10004 to make it join the cluster.
Waiting for the cluster to join

>>> Configure node as replica of 127.0.0.1:10001.
[OK] New node added correctly.
0
redis-cli --cluster add-node 127.0.0.1:10005 127.0.0.1:10002 --cluster-slave --cluster-master-id 64e634307bdc339b503574f5a77f1b156c021358 -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10005 to cluster 127.0.0.1:10002
>>> Performing Cluster Check (using node 127.0.0.1:10002)
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004
   slots: (0 slots) slave
   replicates 6164025849a8ff9297664fc835bc851af5004f61
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10005 to make it join the cluster.
Waiting for the cluster to join

>>> Configure node as replica of 127.0.0.1:10002.
[OK] New node added correctly.
0
redis-cli --cluster add-node 127.0.0.1:10006 127.0.0.1:10003 --cluster-slave --cluster-master-id 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10006 to cluster 127.0.0.1:10003
>>> Performing Cluster Check (using node 127.0.0.1:10003)
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005
   slots: (0 slots) slave
   replicates 64e634307bdc339b503574f5a77f1b156c021358
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004
   slots: (0 slots) slave
   replicates 6164025849a8ff9297664fc835bc851af5004f61
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10006 to make it join the cluster.
Waiting for the cluster to join

>>> Configure node as replica of 127.0.0.1:10003.
[OK] New node added correctly.
0
################# cluster nodes info: #################
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 myself,master - 0 1575947748000 53 connected 10923-16383
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575947748000 52 connected 5461-10922
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358 0 1575947746000 52 connected
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 master - 0 1575947748103 51 connected 0-5460
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575947749000 51 connected
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 0 1575947749105 53 connected

[root@JD redis_install]#

Scaling out the Redis cluster

Scaling out Redis takes two main steps:
1. Add the new master node, and add a slave for it.
2. Reassign slots to the newly added master.
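
How many slots each existing master gives up in step 2 follows from integer arithmetic over the 16384 hash slots of a Redis Cluster. A minimal sketch, in which the name slots_to_migrate is an assumption for illustration (the script later in this post contains an equivalent helper):

```python
TOTAL_SLOTS = 16384  # fixed number of hash slots in Redis Cluster


def slots_to_migrate(current_masters, added_masters=1):
    """Slots each existing master hands over so all masters end up roughly even."""
    before = TOTAL_SLOTS // current_masters
    after = TOTAL_SLOTS // (current_masters + added_masters)
    return before - after


per_master = slots_to_migrate(3, 1)
print(per_master)      # 5461 - 4096 = 1365
print(per_master * 3)  # slots the new master receives in total: 4095
```

Because of integer division the result is only approximately even: with three masters growing to four, the new master ends up with 4095 slots rather than 4096.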

The main commands are:

Add the master node to the cluster
redis-cli --cluster add-node 127.0.0.1:10007 127.0.0.1:10001 -a ******
Add a slave for the new master
redis-cli --cluster add-node 127.0.0.1:10008 127.0.0.1:10007 --cluster-slave --cluster-master-id 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 -a ******

Reshard the slots
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10001 --cluster-from 6164025849a8ff9297664fc835bc851af5004f61 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10002 --cluster-from 64e634307bdc339b503574f5a77f1b156c021358 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10003 --cluster-from 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1

################# cluster nodes info: #################
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575960493000 64 connected
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 0 1575960493849 66 connected
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575960494852 65 connected 6826-10922
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358 0 1575960492000 65 connected
4854375c501c3dbfb4e2d94d50e62a47520c4f12 127.0.0.1:10008@20008 slave 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 0 1575960493000 67 connected
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 master - 0 1575960493000 66 connected 12288-16383
3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:10007@20007 myself,master - 0 1575960493000 67 connected 0-1364 5461-6825 10923-12287
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 master - 0 1575960492848 64 connected 1365-5460
The newly added node has been assigned slots, so the scale-out succeeded.
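
One way to check the result rather than reading it by eye is to count the slots per master from the `cluster nodes` text. The sketch below does that for the master lines shown above; count_slots is an illustrative helper and not part of the original script.

```python
# Sketch: count slots per master from `cluster nodes` text output.
# count_slots is an illustrative helper, not part of the original script.

def count_slots(cluster_nodes_text):
    """Return {"host:port": slot_count} for every master line."""
    result = {}
    for line in cluster_nodes_text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8 or "master" not in fields[2].split(","):
            continue  # skip slaves and malformed lines
        addr = fields[1].split("@")[0]
        total = 0
        for token in fields[8:]:
            if token.startswith("["):  # migrating/importing marker, not a range
                continue
            start, _, end = token.partition("-")
            total += int(end or start) - int(start) + 1
        result[addr] = total
    return result


sample = """\
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575960494852 65 connected 6826-10922
4854375c501c3dbfb4e2d94d50e62a47520c4f12 127.0.0.1:10008@20008 slave 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 0 1575960493000 67 connected
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 master - 0 1575960493000 66 connected 12288-16383
3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:10007@20007 myself,master - 0 1575960493000 67 connected 0-1364 5461-6825 10923-12287
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 master - 0 1575960492848 64 connected 1365-5460
"""
counts = count_slots(sample)
print(counts)
print(sum(counts.values()))
```

For this output the four masters together cover all 16384 slots, with the new master 10007 holding three ranges of 1365 slots each.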

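Two points need attention when automating this:
1. After add-node (whether for a master or a slave), sleep long enough (20 seconds here) for every node in the cluster to meet the new node; otherwise the scale-out fails.
2. After each reshard to the new node, sleep long enough (20 seconds here); otherwise continuing to reshard slots from the next node causes the previous reshard to fail.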
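
The fixed 20-second sleep is what the script in this post uses. As an alternative sketch (not the author's method), one could poll until the cluster view has converged and keep a timeout as a safety net. wait_until and known_by_all are hypothetical names; the predicate here is a stub, and a real one would have to query each node, for example through its `cluster nodes` output.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0):
    """Call predicate() every `interval` seconds until it returns True.

    Returns True on success, False if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stub standing in for "every node already lists the new node ID".
# A real check would connect to each node and inspect its `cluster nodes` output.
views = iter([False, False, True])


def known_by_all():
    return next(views)


converged = wait_until(known_by_all, timeout=5.0, interval=0.01)
print(converged)
```

Polling returns as soon as the condition holds, whereas a fixed sleep is simpler and is what the logs in this post were produced with.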

The whole process looks like this:

[root@JD redis_install]# python3 create_redis_cluster.py
#########################cleanup instance#################################
#########################add node into cluster#################################
 redis-cli --cluster add-node 127.0.0.1:10007 127.0.0.1:10001  -a redis@password
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10007 to cluster 127.0.0.1:10001
>>> Performing Cluster Check (using node 127.0.0.1:10001)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006
   slots: (0 slots) slave
   replicates 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004
   slots: (0 slots) slave
   replicates 6164025849a8ff9297664fc835bc851af5004f61
S: 23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005
   slots: (0 slots) slave
   replicates 64e634307bdc339b503574f5a77f1b156c021358
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10007 to make it join the cluster.
[OK] New node added correctly.
0
 redis-cli --cluster add-node 127.0.0.1:10008 127.0.0.1:10007 --cluster-slave --cluster-master-id 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10008 to cluster 127.0.0.1:10007
>>> Performing Cluster Check (using node 127.0.0.1:10007)
M: 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:10007
   slots: (0 slots) master
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004
   slots: (0 slots) slave
   replicates 6164025849a8ff9297664fc835bc851af5004f61
S: 9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006
   slots: (0 slots) slave
   replicates 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005
   slots: (0 slots) slave
   replicates 64e634307bdc339b503574f5a77f1b156c021358
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10008 to make it join the cluster.
Waiting for the cluster to join

>>> Configure node as replica of 127.0.0.1:10007.
[OK] New node added correctly.
0
#########################reshard slots#################################
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10001 --cluster-from 6164025849a8ff9297664fc835bc851af5004f61 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000   --cluster-replace  >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10002 --cluster-from 64e634307bdc339b503574f5a77f1b156c021358 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000   --cluster-replace  >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10003 --cluster-from 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000   --cluster-replace  >/dev/null 2>&1
################# cluster nodes info: #################
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575960493000 64 connected
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 0 1575960493849 66 connected
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575960494852 65 connected 6826-10922
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358 0 1575960492000 65 connected
4854375c501c3dbfb4e2d94d50e62a47520c4f12 127.0.0.1:10008@20008 slave 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 0 1575960493000 67 connected
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 master - 0 1575960493000 66 connected 12288-16383
3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:10007@20007 myself,master - 0 1575960493000 67 connected 0-1364 5461-6825 10923-12287
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 master - 0 1575960492848 64 connected 1365-5460

[root@JD redis_install]#

Scaling in the Redis cluster

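In principle, scaling in is the reverse of scaling out.
That much is visible from this command: del-node host:port node_id  # removes the given node and, on success, shuts down that node's service.
But scaling in only needs to remove a master from the cluster (cluster forget nodeid), so why shut the node down as well? For that reason this article does not scale in with redis-cli --cluster del-node, and uses ordinary commands instead.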

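The custom scale-in here really has two steps:
1. Move the slots of the master being removed back to the other nodes in the cluster. The test shrinks four masters to three; the commands actually executed are shown below.
2. Run cluster forget master_node_id (and slave_node_id) on each node in the cluster in turn.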

############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10001 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 6164025849a8ff9297664fc835bc851af5004f61 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10002 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 64e634307bdc339b503574f5a77f1b156c021358 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10003 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
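
The reshard commands above pass a fixed 1365 slots per target, which works out in this test because the removed master holds 4095 slots. A sketch of computing the shares for an arbitrary slot count, assuming that count is already known (split_slots is an illustrative name, not part of the original script):

```python
def split_slots(slot_count, target_count):
    """Split slot_count slots across target_count masters as evenly as possible.

    The first (slot_count % target_count) targets receive one extra slot.
    """
    base, extra = divmod(slot_count, target_count)
    return [base + 1 if i < extra else base for i in range(target_count)]


print(split_slots(4095, 3))  # the case in this post
print(split_slots(4096, 3))  # an uneven case
```

The shares always add up to the removed master's slot count, so no slot is left behind on the node being removed.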

{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12

The full run looks like this:

[root@JD redis_install]# python3 create_redis_cluster.py
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10001 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 6164025849a8ff9297664fc835bc851af5004f61 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10002 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 64e634307bdc339b503574f5a77f1b156c021358 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10003 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
################# cluster nodes info: #################
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358 0 1575968426000 76 connected
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575968422619 75 connected
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 myself,master - 0 1575968426000 75 connected 0-5460
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 0 1575968425000 77 connected
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 master - 0 1575968427626 77 connected 10923-16383
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575968426000 76 connected 5461-10922

[root@JD redis_install]#

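That is not quite the end, though: after the scale-in, every node in the cluster must successfully run cluster forget master_node_id (and slave_node_id).
Otherwise the other nodes still hold heartbeat information for node 10007, and after more than one minute the already-removed node 10007 (and its slave 10008) is added back.
I ran into exactly this odd problem at the start: because cluster forget was not run on the slave nodes of the shrunken cluster, the removed nodes kept being added back.
See: http://www.redis.cn/commands/cluster-forget.html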
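
The requirement above amounts to a full cross product: every remaining node, masters and slaves alike, forgets every removed node ID. A minimal sketch that only builds that plan (build_forget_plan is an illustrative name; sending the commands is left to the script that follows):

```python
def build_forget_plan(remaining_nodes, removed_ids):
    """Return (node, command) pairs: every remaining node forgets every removed ID."""
    return [
        (node, "cluster forget {0}".format(node_id))
        for node in remaining_nodes
        for node_id in removed_ids
    ]


# Remaining nodes include the slaves 10004-10006, not only the masters.
remaining = [{"host": "127.0.0.1", "port": p} for p in range(10001, 10007)]
removed = [
    "3645e00a8ec3a902bd6effb4fc20c56a00f2c982",  # master 10007
    "4854375c501c3dbfb4e2d94d50e62a47520c4f12",  # slave 10008
]
plan = build_forget_plan(remaining, removed)
for node, command in plan:
    print("{0}:{1} ---> {2}".format(node["host"], node["port"], command))
```

For six remaining nodes and two removed IDs this yields twelve commands, matching the log above. The linked CLUSTER FORGET documentation describes a 60-second ban list, so the whole plan should reach every node within that window.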

The complete implementation is as follows:

import os
import time
import redis
from time import ctime,sleep


def create_redis_cluster(list_master_node,list_slave_node):
    print('################# flush master/slave slots #################')
    for node in list_master_node:
        currenrt_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        currenrt_conn.execute_command('flushall')
        currenrt_conn.execute_command('cluster reset')

    for node in list_slave_node:
        currenrt_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        #currenrt_conn.execute_command('flushall')
        currenrt_conn.execute_command('cluster reset')

    print('################# create cluster #################')
    master_nodes = ''
    for node in list_master_node:
        master_nodes = master_nodes + node["host"] + ':' + str(node["port"]) + ' '
    command = "redis-cli --cluster create {0}  -a ****** --cluster-yes".format(master_nodes)
    print(command)
    msg = os.system(command)
    print(msg)
    time.sleep(5)

    print('################# add slave nodes #################')
    counter = 0
    for node in list_master_node:
        currenrt_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        current_master_node = node["host"] + ':' + str(node["port"])
        current_slave_node = list_slave_node[counter]["host"] + ':' + str(list_slave_node[counter]["port"])
        myid = currenrt_conn.cluster('myid')
        # slave node first, master node second
        command = "redis-cli --cluster add-node {0} {1} --cluster-slave --cluster-master-id {2} -a ****** ". format(current_slave_node,current_master_node,myid)
        print(command)
        msg = os.system(command)
        counter = counter + 1
        print(msg)
    # show cluster nodes info
    time.sleep(10)
    print("################# cluster nodes info: #################")
    cluster_nodes = currenrt_conn.execute_command('cluster nodes')
    print(cluster_nodes)

# Return how many slots each original master must migrate out when n new masters are added
def get_migrated_slot(list_master_node,n):
    migrated_slot_count = int(16384/len(list_master_node)) - int(16384/(len(list_master_node)+n))
    return migrated_slot_count

def redis_cluster_expansion(list_master_node,dict_master_node,dict_slave_node):
    new_master_node =  dict_master_node["host"] + ':' + str(dict_master_node["port"])
    new_slave_node = dict_slave_node["host"] + ':' + str(dict_slave_node["port"])

    print("#########################cleanup instance#################################")
    new_master_conn = redis.StrictRedis(host=dict_master_node["host"], port=dict_master_node["port"], password=dict_master_node["password"], decode_responses=True)
    new_master_conn.execute_command('flushall')
    new_master_conn.execute_command('cluster reset')
    new_master_id = new_master_conn.cluster('myid')

    new_slave_conn = redis.StrictRedis(host=dict_slave_node["host"], port=dict_slave_node["port"], password=dict_slave_node["password"], decode_responses=True)
    new_slave_conn.execute_command('cluster reset')
    new_slave_id = new_slave_conn.cluster('myid')
    #new_slave_conn.execute_command('slaveof no one')

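    # Check whether the nodes being added already belong to the current cluster.
    # If a node already belongs to the cluster and holds no slots, either kick it out first (cluster forget nodeid) or abort with a warning; pick whichever policy suits you.
    # Connect to any node in the cluster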
    cluster_node_conn = redis.StrictRedis(host=list_master_node[0]["host"], port=list_master_node[0]["port"], password=list_master_node[0]["password"],decode_responses=True)
    dict_node_info = cluster_node_conn.cluster('nodes')
    '''dict_node_info format example :
    {
    '127.0.0.1:10008@20008': {'node_id': '1d10c3ce3b9b7f956a26122980827fe6ce623d22', 'flags': 'master', 'master_id': '-','last_ping_sent': '0', 'last_pong_rcvd': '1575599442000', 'epoch': '8', 'slots': [], 'connected': True}, 
    '127.0.0.1:10002@20002': {'node_id': '64e634307bdc339b503574f5a77f1b156c021358', 'flags': 'master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599442000', 'epoch': '7', 'slots': [['5461', '10922']], 'connected': True}, 
    '127.0.0.1:10001@20001': {'node_id': '6164025849a8ff9297664fc835bc851af5004f61', 'flags': 'myself,master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599438000', 'epoch': '6', 'slots': [['0', '5460']], 'connected': True}, 
    '127.0.0.1:10007@20007': {'node_id': '307f589ec7b1eb7bd65c680527afef1e30ce2303', 'flags': 'master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599443599', 'epoch': '5', 'slots': [], 'connected': True}, 
    '127.0.0.1:10005@20005': {'node_id': '23e1871c4e1dc1047ce567326e74a6194589146c', 'flags': 'slave', 'master_id': '64e634307bdc339b503574f5a77f1b156c021358', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599441000', 'epoch': '7', 'slots': [], 'connected': True}, 
    '127.0.0.1:10004@20004': {'node_id': '026f0179631f50ca858d46c2b2829b3af71af2c8', 'flags': 'slave', 'master_id': '6164025849a8ff9297664fc835bc851af5004f61', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599440000', 'epoch': '6', 'slots': [], 'connected': True}, 
    '127.0.0.1:10006@20006': {'node_id': '9f265545ebb799d2773cfc20c71705cff9d733ae', 'flags': 'slave', 'master_id': '8b75325c59a7242344d0ebe5ee1e0068c66ffa2a', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599442000', 'epoch': '8', 'slots': [], 'connected': True}, 
    '127.0.0.1:10003@20003': {'node_id': '8b75325c59a7242344d0ebe5ee1e0068c66ffa2a', 'flags': 'master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599442599', 'epoch': '8', 'slots': [['10923', '16383']], 'connected': True}
    }
    '''
    dict_master_node_in_cluster = 0
    dict_slave_node_in_cluster = 0

    for key_node in dict_node_info:
        if new_master_node in key_node:
            dict_master_node_in_cluster = 1
            if len(dict_node_info[key_node]['slots']) > 0:
                print('error: ' + new_master_node + ' already exists in the cluster and has slots allotted, aborting......')
                return
        if new_slave_node in key_node:
            dict_slave_node_in_cluster = 1
            if len(dict_node_info[key_node]['slots']) > 0:
                print('error: ' + new_slave_node + ' already exists in the cluster and has slots allotted, aborting......')
                return

    if dict_master_node_in_cluster == 1:
        for master_node in list_master_node:
            key_node_conn = redis.StrictRedis(host=master_node["host"], port=master_node["port"],password=master_node["password"], decode_responses=True)
            print('warning: ' + new_master_node + ' already exists in the cluster, cluster forget it......')
            forget_command = 'cluster forget {0}'.format(new_master_id)
            key_node_conn.execute_command(forget_command)
    if dict_slave_node_in_cluster == 1:
        for master_node in list_master_node:
            key_node_conn = redis.StrictRedis(host=master_node["host"], port=master_node["port"],password=master_node["password"], decode_responses=True)
            print('warning: ' + new_slave_node + ' already exists in the cluster, cluster forget it......')
            forget_command = 'cluster forget {0}'.format(new_slave_id)
            key_node_conn.execute_command(forget_command)

    print("#########################add node into cluster#################################")
    try:
        cluster_node = list_master_node[0]["host"] + ':' + str(list_master_node[0]["port"])
        # the node being added comes first; the second argument is any node already in the cluster
        add_node_command = " redis-cli --cluster add-node {0} {1}  -a ****** ".format(new_master_node,cluster_node)
        print(add_node_command)
        print(os.system(add_node_command))
        time.sleep(20)
        # slave node first, master node second
        add_node_command = " redis-cli --cluster add-node {0} {1} --cluster-slave --cluster-master-id {2} -a ****** ". format(new_slave_node,new_master_node,new_master_id)
        print(add_node_command)
        print(os.system(add_node_command))
        time.sleep(20)
    except Exception as e:
        print('add new node error,the reason is:')
        print(e)

    print("#########################reshard slots#################################")
    migrated_slot_count = get_migrated_slot(list_master_node,1)
    for node in list_master_node:
        current_master_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        current_master_node = node["host"] + ':' + str(node["port"])
        current_master_node_id = current_master_conn.cluster('myid')
        '''
        example: scaling out from 3 masters to 4, each existing master migrates 1365 slots
        '''
        try:
            command = r'''redis-cli -a ****** --cluster reshard {0} --cluster-from {1} --cluster-to {2} --cluster-slots {3} --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000   --cluster-replace  >/dev/null 2>&1 '''. format(current_master_node,current_master_node_id,new_master_id,migrated_slot_count)
            print('############################ execute reshard #########################################')
            print(command)
            msg = os.system(command)
            time.sleep(20)
        except Exception as e:
            print('reshard slots error,the reason is:')
            print(e)

    print("################# cluster nodes info: #################")
    cluster_nodes = new_master_conn.execute_command('cluster nodes')
    print(cluster_nodes)


def redis_cluster_shrinkage(list_master_node,list_slave_node,dict_master_node,dict_slave_node):
    # Check that the nodes to be removed belong to the current cluster;
    # if either of them does not, exit without changing anything
    cluster_node_conn = redis.StrictRedis(host=list_master_node[0]["host"], port=list_master_node[0]["port"], password=list_master_node[0]["password"],decode_responses=True)
    dict_node_info = cluster_node_conn.cluster('nodes')

    # The membership check below assumes the keys of cluster('nodes') have the form
    # 'host:port@bus_port', where the cluster bus port is the data port + 10000
    removed_master_node = dict_master_node["host"] + ':' + str(dict_master_node["port"]) + '@' + str(dict_master_node["port"] + 10000)
    removed_slave_node = dict_slave_node["host"] + ':' + str(dict_slave_node["port"]) + '@' + str(dict_slave_node["port"] + 10000)

    if removed_master_node not in dict_node_info.keys():
        print('Error: ' + str(removed_master_node) + ' not in cluster, exiting')
        return
    if removed_slave_node not in dict_node_info.keys():
        print('Error: ' + str(removed_slave_node) + ' not in cluster, exiting')
        return

    removed_master_conn = redis.StrictRedis(host=dict_master_node["host"], port=dict_master_node["port"], password=dict_master_node["password"], decode_responses=True)
    removed_master_id = removed_master_conn.cluster('myid')
    removed_slave_conn = redis.StrictRedis(host=dict_slave_node["host"], port=dict_slave_node["port"], password=dict_slave_node["password"], decode_responses=True)
    removed_slave_id = removed_slave_conn.cluster('myid')

    for node in list_master_node:
        current_master_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        current_master_node = node["host"] + ':' + str(node["port"])
        current_master_node_id = current_master_conn.cluster('myid')
        # Shrinking from 4 masters to 3: hand the removed master's slots back evenly
        # to the three remaining masters (1365 slots each, hard-coded for this example)
        try:
            command = r'''redis-cli -a ****** --cluster reshard {0} --cluster-from {1} --cluster-to {2} --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1'''.format(current_master_node, removed_master_id, current_master_node_id)
            print('############################ execute reshard #########################################')
            print(command)
            msg = os.system(command)
            time.sleep(10)
        except Exception as e:
            print('reshard slots error,the reason is:')
            print(e)

    removed_master_conn.execute_command('cluster reset')
    removed_slave_conn.execute_command('cluster reset')

    # Make every remaining master forget the two removed nodes
    for master_node in list_master_node:
        master_node_conn = redis.StrictRedis(host=master_node["host"], port=master_node["port"], password=master_node["password"], decode_responses=True)
        forget_master_command = 'cluster forget {0}'.format(removed_master_id)
        forget_slave_command = 'cluster forget {0}'.format(removed_slave_id)
        print(str(master_node) + '--->' + forget_master_command)
        print(str(master_node) + '--->' + forget_slave_command)
        master_node_conn.execute_command(forget_master_command)
        master_node_conn.execute_command(forget_slave_command)

    # Make every remaining replica forget the two removed nodes as well
    for slave_node in list_slave_node:
        slave_node_conn = redis.StrictRedis(host=slave_node["host"], port=slave_node["port"], password=slave_node["password"], decode_responses=True)
        forget_master_command = 'cluster forget {0}'.format(removed_master_id)
        forget_slave_command = 'cluster forget {0}'.format(removed_slave_id)
        print(str(slave_node) + '--->' + forget_master_command)
        print(str(slave_node) + '--->' + forget_slave_command)
        slave_node_conn.execute_command(forget_master_command)
        slave_node_conn.execute_command(forget_slave_command)

    print("################# cluster nodes info: #################")
    cluster_nodes = cluster_node_conn.execute_command('cluster nodes')
    print(cluster_nodes)


if __name__ == '__main__':
    # master
    node_1 = {'host': '127.0.0.1', 'port': 10001, 'password': '******'}
    node_2 = {'host': '127.0.0.1', 'port': 10002, 'password': '******'}
    node_3 = {'host': '127.0.0.1', 'port': 10003, 'password': '******'}
    # slave
    node_4 = {'host': '127.0.0.1', 'port': 10004, 'password': '******'}
    node_5 = {'host': '127.0.0.1', 'port': 10005, 'password': '******'}
    node_6 = {'host': '127.0.0.1', 'port': 10006, 'password': '******'}
    # The number of master nodes and slave nodes must be the same
    list_master_node = [node_1, node_2, node_3]
    list_slave_node = [node_4, node_5, node_6]
    
    # Automated cluster creation
    #create_redis_cluster(list_master_node,list_slave_node)

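    # Automated expansion: 10007 joins as a new master and 10008 as its slave
    new_master = {'host': '127.0.0.1', 'port': 10007, 'password': '******'}
    new_slave = {'host': '127.0.0.1', 'port': 10008, 'password': '******'}
    redis_cluster_expansion(list_master_node, new_master, new_slave)

    # Automated shrink: the reverse of the expansion, removing 10007 and its slave 10008
    #redis_cluster_shrinkage(list_master_node, list_slave_node, new_master, new_slave)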
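
The listing above moves 1365 slots per master when going from 3 to 4 masters, and hard-codes the same number when shrinking back. A minimal sketch of how those counts could be derived for other cluster sizes follows. The names `slots_per_donor` and `split_slots` are hypothetical helpers introduced here for illustration; they are not the author's `get_migrated_slot`, whose body is not shown in this part of the script.

```python
TOTAL_SLOTS = 16384  # fixed number of hash slots in a Redis Cluster


def slots_per_donor(current_master_count, added_master_count=1):
    """Slots each existing master gives up so all masters end up roughly even."""
    if current_master_count < 1 or added_master_count < 1:
        raise ValueError("master counts must be positive")
    target_master_count = current_master_count + added_master_count
    # Slots the new masters should own in total after the expansion.
    slots_for_new_masters = TOTAL_SLOTS // target_master_count * added_master_count
    # Split that total across the existing masters (integer division).
    return slots_for_new_masters // current_master_count


def split_slots(owned_slot_count, receiver_count):
    """Split a removed master's slots across the receivers, spreading the remainder."""
    if receiver_count < 1:
        raise ValueError("receiver_count must be positive")
    base, remainder = divmod(owned_slot_count, receiver_count)
    # The first `remainder` receivers take one extra slot so no slot is left behind.
    return [base + 1 if i < remainder else base for i in range(receiver_count)]
```

For the 3-to-4 expansion in this article, `slots_per_donor(3)` gives 1365, so the new master ends up with 4095 slots, and `split_slots(4095, 3)` returns them as three batches of 1365. When the removed master owns a count that does not divide evenly, `split_slots` hands the leftover slots to the first receivers, which matters because a master that still owns slots cannot be removed cleanly.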

Reference: http://www.javashuo.com/article/p-aynmxqmx-dd.html
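
One point about the script above: `os.system` returns the exit status instead of raising, so the `try/except` blocks around the `redis-cli` calls will not catch a failed reshard or add-node. The sketch below shows one possible alternative, building the same reshard call as an argument list and running it with `subprocess.run`. The helper names `build_reshard_command` and `run_checked` are hypothetical, the flags are the ones already used in the script, and it assumes `redis-cli` reports failure through a non-zero exit code.

```python
import subprocess


def build_reshard_command(entry_node, from_id, to_id, slot_count, password):
    """Return the redis-cli reshard call as an argument list (no shell involved)."""
    return [
        "redis-cli", "-a", password,
        "--cluster", "reshard", entry_node,
        "--cluster-from", from_id,
        "--cluster-to", to_id,
        "--cluster-slots", str(slot_count),
        "--cluster-yes",
        "--cluster-timeout", "50000",
        "--cluster-pipeline", "10000",
        "--cluster-replace",
    ]


def run_checked(argv):
    """Run a command, capture its output, and raise CalledProcessError on failure."""
    completed = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    # Raises subprocess.CalledProcessError when the exit code is non-zero.
    completed.check_returncode()
    return completed.stdout
```

A call such as `run_checked(build_reshard_command("127.0.0.1:10001", source_id, target_id, 1365, password))` would then stop the expansion or shrink at the first failed step instead of sleeping and moving on to the next master. Passing the arguments as a list also avoids shell quoting problems if the password contains special characters.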
