Redis Sentinel High Availability Implementation Notes

Date: 2019-11-06

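Background:

An earlier article, "Redis Replication, Sentinel Setup and Principles", covered how Sentinel works, how it is implemented, and how it is set up. This article goes through the Redis Sentinel setup in detail.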

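Installation:

Redis is built from source here. The test system is Ubuntu 16.04 and the Redis version is 3.2.0. The installation steps are:

Download the source package: wget http://download.redis.io/releases/redis-3.2.0.tar.gz
Install the dependencies: sudo apt-get install gcc tcl
Extract and compile: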
```
# tar zxvf redis-3.2.0.tar.gz
... ...
# make
...
Hint: It's a good idea to run 'make test' ;)
# make test
...
\o/ All tests passed without errors!
...
# make install
```
Note: the make test step may well fail with this error:

[err]: Test replication partial resync: ok psync (diskless: yes, reconnect: 1) in tests/integration/replication-psync.tcl

Expected condition '[s -1 sync_partial_ok] > 0' to be true ([s -1 sync_partial_ok] > 0)

The likely cause is that this test times out on low-powered machines; the environment used here is an LXC virtual machine. There are two ways to avoid it:
```
1: In the extracted source directory, edit the test file and change "after 100" to "after 500":
# vi tests/integration/replication-psync.tcl

2: Run make test under taskset:
# taskset -c 1 make test
```
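At this point Redis is compiled and installed.
The build directory contains two configuration files:
redis.conf and sentinel.conf. For a description of the options, see the referenced article.
Architecture of the test environment:
Three Redis instances (1 master, 2 slaves) and 3 sentinels. M: 10.0.3.110; S: 10.0.3.92 and 10.0.3.66. Each Redis host also runs one sentinel instance. The modified configuration files: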
redis.conf

# Redis configuration file example.
# ./redis-server /path/to/redis.conf

################################## INCLUDES ###################################

# include /path/to/local.conf
# include /path/to/other.conf

################################## NETWORK #####################################

bind 10.0.3.110

protected-mode yes

port 6379

tcp-backlog 511

unixsocket "/tmp/redis.sock"
unixsocketperm 700

timeout 0

tcp-keepalive 0

################################# GENERAL #####################################

daemonize yes

pidfile "/var/run/redis6379.pid"

loglevel notice

logfile "/var/log/redis/redis_6379.log"

# syslog-enabled no
# syslog-ident redis
# syslog-facility local0

databases 16
supervised no

################################ SNAPSHOTTING  ################################

save 900 1
save 300 10
save 60 10000

stop-writes-on-bgsave-error yes

rdbcompression yes

rdbchecksum yes

dbfilename "dump_6379.rdb"

dir "/var/lib/redis_6379"

################################# REPLICATION #################################

# slaveof <masterip> <masterport>
masterauth "dxydxy"

slave-serve-stale-data yes
slave-read-only yes

repl-diskless-sync no
repl-diskless-sync-delay 5

# repl-ping-slave-period 10
# repl-timeout 60

repl-disable-tcp-nodelay no
repl-backlog-size 5mb
repl-backlog-ttl 3600

slave-priority 100

#min-slaves-to-write 3
#min-slaves-max-lag 10

################################## SECURITY ###################################

requirepass "dxydxy"
# rename-command CONFIG b840fc02d524045429941cc15f59e41cb7be6c52
# rename-command CONFIG ""

################################### LIMITS ####################################

maxclients 1000
#maxmemory <bytes>
maxmemory-policy noeviction
# maxmemory-samples 5

############################## APPEND ONLY MODE ###############################

appendonly yes

appendfilename "appendonly_6379.aof"

# appendfsync always
appendfsync everysec
# appendfsync no

no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes

################################ LUA SCRIPTING  ###############################

lua-time-limit 5000

################################ REDIS CLUSTER  ###############################

# cluster-enabled yes
# cluster-config-file nodes-6379.conf
# cluster-node-timeout 15000
# cluster-slave-validity-factor 10
# cluster-migration-barrier 1
# cluster-require-full-coverage yes

################################## SLOW LOG ###################################

slowlog-log-slower-than 10000
slowlog-max-len 128

################################ LATENCY MONITOR ##############################

latency-monitor-threshold 0

############################# EVENT NOTIFICATION ##############################

notify-keyspace-events ""

############################### ADVANCED CONFIG ###############################

hash-max-ziplist-entries 512
hash-max-ziplist-value 64

list-max-ziplist-entries 512
list-max-ziplist-value 64

list-compress-depth 0
set-max-intset-entries 512

zset-max-ziplist-entries 128
zset-max-ziplist-value 64

hll-sparse-max-bytes 3000

activerehashing yes

client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60

hz 10
aof-rewrite-incremental-fsync yes

list-max-ziplist-size -2

sentinel.conf

port 16379

dir "/var/lib/sentinel_16379"

logfile "/var/log/redis/sentinel_16379.log"

daemonize yes

protected-mode no

sentinel monitor dxy 10.0.3.110 6379 2

sentinel auth-pass dxy dxydxy

sentinel down-after-milliseconds dxy 15000

sentinel failover-timeout dxy 120000

# Custom scripts to run after a failover, e.g. to send email or switch the VIP
#sentinel notification-script <master-name> <script-path>
#sentinel client-reconfig-script <master-name> <script-path>
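
The trailing 2 in `sentinel monitor dxy 10.0.3.110 6379 2` is the quorum: the number of sentinels that must agree the master is unreachable before it is flagged objectively down. Starting the failover additionally needs votes from a majority of all sentinels. The arithmetic can be sketched in Python 3 as follows; the function names are illustrative and not part of Redis.

```python
def parse_monitor(line):
    """Parse a 'sentinel monitor <name> <ip> <port> <quorum>' directive."""
    parts = line.split()
    if parts[:2] != ["sentinel", "monitor"] or len(parts) != 6:
        raise ValueError("not a sentinel monitor directive: %r" % line)
    return {
        "name": parts[2],
        "ip": parts[3],
        "port": int(parts[4]),
        "quorum": int(parts[5]),
    }


def can_flag_odown(sdown_reports, quorum):
    """ODOWN needs at least `quorum` sentinels reporting the master as down."""
    return sdown_reports >= quorum


def can_authorize_failover(votes, total_sentinels, quorum):
    """The leader needs a majority of all sentinels and at least `quorum` votes."""
    majority = total_sentinels // 2 + 1
    return votes >= max(majority, quorum)
```

With the three sentinels used here and a quorum of 2, two agreeing sentinels are enough both to flag ODOWN and to authorize the failover, so the setup tolerates the loss of one sentinel.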

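The configuration files are kept under /etc/redis/; create the directories they refer to. Unlike the setup in "Redis Replication, Sentinel Setup and Principles", every Redis instance here is password protected (requirepass).
Note: when a master requires a password, both clients and slaves must supply it when connecting. The master sets its own password with requirepass, and nothing can use the master without it. A slave sets the password it uses to reach the master with masterauth. Clients supply the password with AUTH. With Sentinel, however, a master may become a slave and a slave may become a master, so both options must be set on every instance. Sentinel also has to connect to the master and the slaves, so it needs the parameter: sentinel auth-pass <master_name> xxxxx.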
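
Since any instance can end up as master or slave, the three passwords have to agree on every host. A minimal Python 3 sketch of that check follows; the helper names are hypothetical, and the parser only handles simple `directive value` lines like the ones in the redis.conf shown earlier.

```python
def read_directives(conf_text):
    """Return {directive: value} for the non-comment lines of a redis.conf."""
    directives = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        # Later occurrences of a directive overwrite earlier ones.
        directives[key.lower()] = value.strip().strip('"')
    return directives


def passwords_consistent(redis_conf_text, sentinel_auth_pass):
    """True when requirepass, masterauth and sentinel auth-pass all match."""
    d = read_directives(redis_conf_text)
    password = d.get("requirepass")
    return (password is not None
            and d.get("masterauth") == password
            and sentinel_auth_pass == password)
```

Running such a check against each host's redis.conf before starting the sentinels catches the case where a promoted slave cannot be reached by its former master.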

Create the redis user and group, and give them ownership of every directory named in the configuration files.

# useradd redis
# groupadd redis
# chown -R redis.redis redis/
# chown -R redis.redis /etc/redis/

Start each Redis instance:
```
redis-server /etc/redis/redis.conf
```

Note: on startup the Redis log reports several WARNINGs:

29407:M 14 Jun 14:36:42.186 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
Fix: add the line net.core.somaxconn = 1024 to /etc/sysctl.conf, then run: sysctl -p

29407:M 14 Jun 14:36:42.186 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
Fix: echo 1 > /proc/sys/vm/

29407:M 14 Jun 14:36:42.187 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
Fix: echo never > /sys/kernel/mm/transparent_hugepage/enabled

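About these WARNINGs:

net.core.somaxconn is a Linux kernel parameter that sets the upper limit on the backlog of a listening socket.
The backlog is the socket's listen queue: a connection request that has not yet been handled or established waits there.
The server takes requests off the backlog as it accepts them, and accepted requests leave the queue.
When the server accepts connections too slowly and the queue fills up, new requests are rejected.
In other words, net.core.somaxconn caps the size of the listen queue for new TCP connections.
For a heavily loaded web-serving environment that constantly handles new connections, the default of 128 is too small. In most environments it is advisable to raise it to 1024 or more.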
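
The first warning boils down to a minimum: the kernel silently caps the backlog an application requests at net.core.somaxconn. A small Python 3 sketch of that relationship, with illustrative function names:

```python
def effective_backlog(tcp_backlog, somaxconn):
    """Backlog actually applied: the requested value capped at somaxconn."""
    return min(tcp_backlog, somaxconn)


def read_somaxconn(path="/proc/sys/net/core/somaxconn"):
    """Read the current limit on a Linux host."""
    with open(path) as f:
        return int(f.read().strip())
```

With the values from the log, `effective_backlog(511, 128)` gives 128, which is why Redis reports that its tcp-backlog of 511 cannot be enforced; after raising somaxconn to 1024 the full 511 applies.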


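The overcommit_memory parameter:
It sets the kernel's memory allocation policy (optional; set it according to the server's actual situation).
/proc/sys/vm/overcommit_memory
Possible values: 0, 1, 2.
0: heuristic overcommit. The kernel checks whether enough memory is available for the requesting process; if so, the allocation succeeds, otherwise it fails and the error is returned to the process.
1: always overcommit. The kernel grants every allocation regardless of the current memory state.
2: never overcommit. The kernel refuses allocations that would push the total committed memory past swap plus a configurable percentage (vm.overcommit_ratio) of physical memory.
Note: when Redis dumps its data it forks a child process. In theory the child occupies as much memory as the parent: if the parent uses 8 GB, another 8 GB has to be accounted for the child. If the system cannot cover that, the Redis server tends to go down, or the I/O load climbs and performance drops. The better allocation policy here is therefore 1 (the kernel grants every allocation regardless of the current memory state).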
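
The two remaining warnings can be checked before starting Redis. The Python 3 sketch below is only an illustration: the function names are made up, and treating every THP mode other than "never" as a problem is an assumption based on the fix suggested in the log message.

```python
def active_thp_mode(text):
    """The active mode is the bracketed word, e.g. 'always madvise [never]'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    raise ValueError("no active mode found in %r" % text)


def kernel_warnings(overcommit_memory, thp_text):
    """Name the startup warnings these kernel settings would trigger."""
    warnings = []
    if overcommit_memory == 0:
        warnings.append("overcommit_memory")
    # Assumption: anything other than 'never' counts as THP being enabled.
    if active_thp_mode(thp_text) != "never":
        warnings.append("transparent_hugepage")
    return warnings
```

The inputs would come from /proc/sys/vm/overcommit_memory and /sys/kernel/mm/transparent_hugepage/enabled; an empty result means neither warning should appear.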

Once replication is established (slaveof), start each sentinel instance:

redis-sentinel /etc/redis/sentinel.conf

Note: a problem showed up here, and the culprit is the protected-mode parameter. Look at the log:

2208:X 14 Jun 23:13:09.185 * +sentinel sentinel ebf9b1b4a5cc98bffead5d0996b8f43deb806641 10.0.3.92 16379 @ dxy 10.0.3.110 6379
2208:X 14 Jun 23:13:24.234 # +sdown sentinel ebf9b1b4a5cc98bffead5d0996b8f43deb806641 10.0.3.92 16379 @ dxy 10.0.3.110 6379
2208:X 14 Jun 23:14:18.888 * +sentinel sentinel 07e189ae6c30d4951d3eb48e9effd948de026c3b 10.0.3.66 16379 @ dxy 10.0.3.110 6379
2208:X 14 Jun 23:14:33.962 # +sdown sentinel 07e189ae6c30d4951d3eb48e9effd948de026c3b 10.0.3.66 16379 @ dxy 10.0.3.110 6379

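The log shows that, apart from the local sentinel, the other two sentinels were flagged subjectively down (SDOWN), each about 15 seconds after it was discovered (down-after-milliseconds 15000). A sentinel sends PING heartbeats to the master to check that it is alive. If the master does not answer PONG within a certain time window, or answers with an error, that sentinel subjectively (unilaterally) considers the master unavailable (subjectively down, SDOWN for short). down-after-milliseconds defines that time window, in milliseconds.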
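
The SDOWN rule can be written down as a short Python 3 sketch. It is a simplification with an illustrative function name: it treats the +sentinel discovery time as the last valid contact, which is only approximately what Sentinel tracks internally.

```python
from datetime import datetime


def is_sdown(last_valid_reply, now, down_after_ms):
    """SDOWN: no valid reply for longer than down-after-milliseconds."""
    elapsed_ms = (now - last_valid_reply).total_seconds() * 1000
    return elapsed_ms > down_after_ms


# Timestamps of the first +sentinel / +sdown pair in the log above.
fmt = "%H:%M:%S.%f"
seen = datetime.strptime("23:13:09.185", fmt)
flagged = datetime.strptime("23:13:24.234", fmt)
print(is_sdown(seen, flagged, 15000))
```

The gap between the two log lines is 15.049 seconds, just past the configured 15000 ms, which matches the moment +sdown was logged.
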
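The timestamps show that the sentinels could not reach each other, which led to SDOWN (see the discovery mechanism described in "Redis Replication, Sentinel Setup and Principles"). Because there was no error message, a long search turned up nothing. Finally, logging in to a sentinel revealed the cause: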

# redis-cli -h 10.0.3.110 -p 16379
10.0.3.110:16379> info
DENIED Redis is running in protected mode because protected mode is enabled, no bind address was specified, no authentication password is requested to clients. In this mode connections are only accepted from the loopback interface. If you want to connect from external computers to Redis you may adopt one of the following solutions: 1) Just disable protected mode sending the command 'CONFIG SET protected-mode no' from the loopback interface by connecting to Redis from the same host the server is running, however MAKE SURE Redis is not publicly accessible from internet if you do so. Use CONFIG REWRITE to make this change permanent. 2) Alternatively you can just disable the protected mode by editing the Redis configuration file, and setting the protected mode option to 'no', and then restarting the server. 3) If you started the server manually just for testing, restart it with the '--protected-mode no' option. 4) Setup a bind address or an authentication password. NOTE: You only need to do one of the above things in order for the server to start accepting connections from the outside.

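This long message says that protected mode is active because Redis has neither a bind address nor a password configured. In that state Redis only accepts connections from the IPv4 and IPv6 loopback addresses and rejects external ones, replying with this error so that the user knows what went wrong: the idea is to give the user a hint rather than silently refuse the connection. The details are in the author's discussion, whose final recommendation is to disable protected mode: --protected-mode no. That explains the error seen here: the sentinels had neither bind nor a password configured, so protected mode was active and refused connections from the other sentinels, which is why they were flagged SDOWN. Add the following to sentinel.conf: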

protected-mode no

That solved the problem. protected-mode was introduced in 3.2 and is enabled by default. The details:

# Protected mode is a layer of security protection, in order to avoid that
# Redis instances left open on the internet are accessed and exploited.
#
# When protected mode is on and if:
#
# 1) The server is not binding explicitly to a set of addresses using the
#    "bind" directive.
# 2) No password is configured.
#
# The server only accepts connections from clients connecting from the
# IPv4 and IPv6 loopback addresses 127.0.0.1 and ::1, and from Unix domain
# sockets.
#
# By default protected mode is enabled. You should disable it only if
# you are sure you want clients from other hosts to connect to Redis
# even if no authentication is configured, nor a specific set of interfaces
# are explicitly listed using the "bind" directive.
protected-mode yes
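
The rule in the comment above fits in a few lines. This Python 3 sketch uses a hypothetical function name and ignores Unix domain sockets, which are always accepted:

```python
def accepts_connection(protected_mode, has_bind, has_password, from_loopback):
    """Decide whether a TCP client is accepted under the protected-mode rule."""
    if protected_mode and not has_bind and not has_password:
        # Default-protected instance: only 127.0.0.1 and ::1 may connect.
        return from_loopback
    return True
```

The sentinels in this setup started with protected mode on, no bind and no password, so `accepts_connection(True, False, False, False)` is False for the other two sentinels; with `protected-mode no` the same call returns True.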

Start the sentinels and check the log (started successfully):

2253:X 14 Jun 23:48:05.477 # Sentinel ID is 68fdb1e07c0998b119e4678f7aead7742a7b1f64
2253:X 14 Jun 23:48:05.477 # +monitor master dxy 10.0.3.110 6379 quorum 2
2253:X 14 Jun 23:48:05.478 * +slave slave 10.0.3.92:6379 10.0.3.92 6379 @ dxy 10.0.3.110 6379
2253:X 14 Jun 23:48:05.512 * +slave slave 10.0.3.66:6379 10.0.3.66 6379 @ dxy 10.0.3.110 6379
2253:X 14 Jun 23:48:14.894 * +sentinel sentinel b2fb07a1cce853ddec86a993428fb09edf15b6c1 10.0.3.92 16379 @ dxy 10.0.3.110 6379
2253:X 14 Jun 23:48:23.346 * +sentinel sentinel d9b198d75ede190fc63d95af8a7ca58e1a395c9b 10.0.3.66 16379 @ dxy 10.0.3.110 6379

Check the status to verify that Sentinel is set up correctly (log in to any sentinel):

10.0.3.92:16379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=dxy,status=ok,address=10.0.3.110:6379,slaves=2,sentinels=3

The master0 line (status=ok, slaves=2, sentinels=3) shows that Sentinel is up and running.
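
For monitoring, the master0 line of INFO sentinel is easy to parse. A minimal Python 3 sketch with illustrative helper names, assuming the line format shown in the output above:

```python
def parse_master_line(line):
    """Split 'master0:name=...,status=...,...' into a label and a field dict."""
    label, _, body = line.partition(":")
    fields = dict(item.split("=", 1) for item in body.split(","))
    fields["slaves"] = int(fields["slaves"])
    fields["sentinels"] = int(fields["sentinels"])
    return label, fields


def is_healthy(fields, expected_slaves, expected_sentinels):
    """True when the master is ok and every slave and sentinel is accounted for."""
    return (fields["status"] == "ok"
            and fields["slaves"] == expected_slaves
            and fields["sentinels"] == expected_sentinels)
```

For this deployment the expected counts are 2 slaves and 3 sentinels; a lower sentinel count would point to the kind of discovery problem described earlier.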

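Testing:

Note: the virtual machines above cannot reach the mail server, so the tests use a different environment: version 2.8.4, three Redis instances (1 master, 2 slaves) and 3 sentinels. M: 192.168.200.208 (port 6379); S: 192.168.200.199 and 192.168.200.73. Each Redis host also runs one sentinel instance (port 7379).

① Check with info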

192.168.200.208:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.200.199,port=6379,state=online,offset=354835,lag=0
slave1:ip=192.168.200.73,port=6379,state=online,offset=354835,lag=0
master_repl_offset:354974
repl_backlog_active:1
repl_backlog_size:5242880
repl_backlog_first_byte_offset:2
repl_backlog_histlen:354973
192.168.200.208:6379>

192.168.200.208:7379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0

192.168.200.208:7379> sentinel master dxy
 1) "name"
 2) "dxy"
 3) "ip"
 4) "192.168.200.208"
 5) "port"
 6) "6379"
 7) "runid"
 8) "50ad7cfe6676fc1a1e671ead4a780958942879fc"
 9) "flags"
10) "master"
11) "pending-commands"
12) "0"
13) "last-ok-ping-reply"
14) "682"
15) "last-ping-reply"
16) "682"
17) "info-refresh"
18) "3301"
19) "role-reported"
20) "master"
21) "role-reported-time"
22) "1930980"
23) "config-epoch"
24) "4"
25) "num-slaves"
26) "2"
27) "num-other-sentinels"
28) "2"
29) "quorum"
30) "2"
31) "down-after-milliseconds"
32) "30000"
33) "failover-timeout"
34) "180000"
35) "parallel-syncs"
36) "1"
37) "client-reconfig-script"
38) "/opt/bin/notify.py"

192.168.200.208:7379> sentinel slaves dxy
1)  1) "name"
    2) "192.168.200.199:6379"
    3) "ip"
    4) "192.168.200.199"  
    5) "port"
    6) "6379"
    7) "runid"
    8) "c4e7bf53f7cee3c28bc369e1db656f879bf41947"
    9) "flags"
   10) "slave"
   11) "pending-commands" 
   12) "0"
   13) "last-ok-ping-reply"
   14) "591"
   15) "last-ping-reply"  
   16) "591"
   17) "info-refresh"
   18) "3606"
   19) "role-reported"
   20) "slave"
   21) "role-reported-time"
   22) "1971346"
   23) "master-link-down-time"
   24) "0"
   25) "master-link-status"
   26) "ok"
   27) "master-host"
   28) "192.168.200.208"
   29) "master-port"
   30) "6379"
   31) "slave-priority"
   32) "100"
   33) "slave-repl-offset"
   34) "400362"
2)  1) "name"
    2) "192.168.200.73:6379"
    3) "ip"
    4) "192.168.200.73"
    5) "port"
    6) "6379"
    7) "runid"
    8) "64ad290c43bba2b062220029c4c91274bb4465b9"
    9) "flags"
   10) "slave"
   11) "pending-commands"
   12) "0"
   13) "last-ok-ping-reply"
   14) "591"
   15) "last-ping-reply"
   16) "591"
   17) "info-refresh"
   18) "4817"
   19) "role-reported"
   20) "slave"
   21) "role-reported-time"
   22) "326006"
   23) "master-link-down-time"
   24) "0"
   25) "master-link-status"
   26) "ok"
   27) "master-host"
   28) "192.168.200.208"
   29) "master-port"
   30) "6379"
   31) "slave-priority"
   32) "100"
   33) "slave-repl-offset"
   34) "400085"
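
Both `sentinel master` and `sentinel slaves` reply with flat lists that alternate field names and values (one such list per slave for `sentinel slaves`). In client code it is usually handier to turn each list into a dictionary. A small Python 3 sketch with an illustrative function name:

```python
def pairs_to_dict(reply):
    """Turn ['name', 'dxy', 'ip', '192.168.200.208', ...] into a dict."""
    if len(reply) % 2 != 0:
        raise ValueError("expected an even number of elements")
    return dict(zip(reply[0::2], reply[1::2]))


# A few fields taken from the 'sentinel master dxy' output above.
master = pairs_to_dict([
    "name", "dxy",
    "ip", "192.168.200.208",
    "port", "6379",
    "flags", "master",
    "num-slaves", "2",
    "quorum", "2",
])
print(master["ip"], master["num-slaves"])
```

All values arrive as strings, so numeric fields such as num-slaves or quorum need an explicit int() before comparison.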

② Verify failover

Kill the master and follow the failover in the sentinel log:

[7637] 17 Jun 12:11:08.728 # +sdown master dxy 192.168.200.208 6379   # master enters subjectively down (SDOWN)
[7637] 17 Jun 12:11:08.819 # +odown master dxy 192.168.200.208 6379   #quorum 2/2 # quorum reached, master enters objectively down (ODOWN)
[7637] 17 Jun 12:11:08.819 # +new-epoch 5                             # new epoch (configuration version)
[7637] 17 Jun 12:11:08.819 # +try-failover master dxy 192.168.200.208 6379  # failover conditions met, waiting for the other sentinels' votes
[7637] 17 Jun 12:11:08.819 # +vote-for-leader 38da843c4ad8baf95dcfdcd968ae6c2f05ab995c 5  # this sentinel votes for a leader
[7637] 17 Jun 12:11:08.820 # 192.168.200.199:7379 voted for 38da843c4ad8baf95dcfdcd968ae6c2f05ab995c 5
[7637] 17 Jun 12:11:08.820 # 192.168.200.73:7379 voted for 38da843c4ad8baf95dcfdcd968ae6c2f05ab995c 5
[7637] 17 Jun 12:11:08.909 # +elected-leader master dxy 192.168.200.208 6379 # this sentinel is elected leader
[7637] 17 Jun 12:11:08.909 # +failover-state-select-slave master dxy 192.168.200.208 6379 # select a slave to become the new master
[7637] 17 Jun 12:11:08.965 # +selected-slave slave 192.168.200.73:6379 192.168.200.73 6379 @ dxy 192.168.200.208 6379 # slave 192.168.200.73 is selected as the new master
[7637] 17 Jun 12:11:08.965 * +failover-state-send-slaveof-noone slave 192.168.200.73:6379 192.168.200.73 6379 @ dxy 192.168.200.208 6379 # send SLAVEOF NO ONE to switch the selected slave's role to master
[7637] 17 Jun 12:11:09.017 * +failover-state-wait-promotion slave 192.168.200.73:6379 192.168.200.73 6379 @ dxy 192.168.200.208 6379 # wait for the promotion to be confirmed
[7637] 17 Jun 12:11:09.867 # +promoted-slave slave 192.168.200.73:6379 192.168.200.73 6379 @ dxy 192.168.200.208 6379 # promotion confirmed
[7637] 17 Jun 12:11:09.867 # +failover-state-reconf-slaves master dxy 192.168.200.208 6379 # failover state changes to reconf-slaves
[7637] 17 Jun 12:11:09.957 * +slave-reconf-sent slave 192.168.200.199:6379 192.168.200.199 6379 @ dxy 192.168.200.208 6379 # sentinel sends SLAVEOF to point the slave at the new master
[7637] 17 Jun 12:11:10.887 * +slave-reconf-inprog slave 192.168.200.199:6379 192.168.200.199 6379 @ dxy 192.168.200.208 6379 # slave is being reconfigured as a slave of the new master; replication has not happened yet
[7637] 17 Jun 12:11:10.887 * +slave-reconf-done slave 192.168.200.199:6379 192.168.200.199 6379 @ dxy 192.168.200.208 6379 # slave is reconfigured and in sync with the new master
[7637] 17 Jun 12:11:10.946 # -odown master dxy 192.168.200.208 6379 # old master leaves objectively down
[7637] 17 Jun 12:11:10.946 # +failover-end master dxy 192.168.200.208 6379 # failover completed successfully
[7637] 17 Jun 12:11:10.946 # +switch-master dxy 192.168.200.208 6379 192.168.200.73 6379 # start monitoring the new master
[7637] 17 Jun 12:11:10.946 * +slave slave 192.168.200.199:6379 192.168.200.199 6379 @ dxy 192.168.200.73 6379 # slave discovered
[7637] 17 Jun 12:11:10.947 * +slave slave 192.168.200.208:6379 192.168.200.208 6379 @ dxy 192.168.200.73 6379
[7637] 17 Jun 12:11:40.960 # +sdown slave 192.168.200.208:6379 192.168.200.208 6379 @ dxy 192.168.200.73 6379
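
To see how long the switch itself took, the event names and timestamps can be pulled out of the log. The Python 3 sketch below assumes the 2.8-style line format shown above (`[pid] day month time marker event ...`); the helper names are illustrative.

```python
import re
from datetime import datetime

# [pid] <day> <month> <time> <# or *> <+event or -event> ...
EVENT_RE = re.compile(r"^\[\d+\] (\d+ \w+ \d+:\d+:\d+\.\d+) [#*] ([+-][a-z-]+)")


def parse_events(log_text):
    """Return (timestamp, event_name) pairs for lines that carry an event."""
    events = []
    for line in log_text.splitlines():
        match = EVENT_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%d %b %H:%M:%S.%f")
            events.append((stamp, match.group(2)))
    return events


def seconds_between(events, first, last):
    """Seconds from the first occurrence of `first` to that of `last`."""
    start = next(t for t, name in events if name == first)
    end = next(t for t, name in events if name == last)
    return (end - start).total_seconds()
```

Applied to the log above, the time from +sdown (12:11:08.728) to +switch-master (12:11:10.946) is about 2.2 seconds. That figure excludes the down-after-milliseconds wait that precedes +sdown, which usually dominates the total outage.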

Start the old master again and check the log:

[98910] 17 Jun 12:29:01.856 # -sdown slave 192.168.200.208:6379 192.168.200.208 6379 @ dxy 192.168.200.73 6379
[98910] 17 Jun 12:29:11.793 * +convert-to-slave slave 192.168.200.208:6379 192.168.200.208 6379 @ dxy 192.168.200.73 6379  # failover succeeded!

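More log events are described in the previous article. Sentinel has an option called client-reconfig-script, explained next.

Failover script: for high availability, the client-reconfig-script parameter names a script that is executed when a failover occurs.

The parameter is documented as follows: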

# When the master changed because of a failover a script can be called in
# order to perform application-specific tasks to notify the clients that the
# configuration has changed and the master is at a different address.
# 
# The following arguments are passed to the script:
#
# <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
#
# <state> is currently always "failover"
# <role> is either "leader" or "observer"
# 
# The arguments from-ip, from-port, to-ip, to-port are used to communicate
# the old address of the master and the new address of the elected slave
# (now a master).
#
# This script should be resistant to multiple invocations.
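
A script hooked up this way receives the seven positional arguments listed in the comment above. A minimal Python 3 sketch of unpacking them follows; the FailoverEvent type and the function name are illustrative, not part of Redis.

```python
from collections import namedtuple

FailoverEvent = namedtuple(
    "FailoverEvent",
    "master_name role state from_ip from_port to_ip to_port",
)


def parse_failover_args(argv):
    """Build a FailoverEvent from the seven arguments Sentinel passes."""
    if len(argv) != 7:
        raise ValueError("expected 7 arguments, got %d" % len(argv))
    name, role, state, from_ip, from_port, to_ip, to_port = argv
    return FailoverEvent(name, role, state,
                         from_ip, int(from_port), to_ip, int(to_port))
```

In a real script the input would be `sys.argv[1:]`. Because the script may be invoked more than once for the same failover, anything it does with the event should be safe to repeat.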

Arguments passed to the script:

<master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>

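The script's purpose is to send an email alert after a failover and move the VIP to the new master, somewhat like MHA for MySQL. The script is simple and makes no additional checks; it can be hardened for more complex situations. How it works:

①: First, set up key-based SSH trust (passwordless login) between the three Redis hosts.

Create a key pair with ssh-keygen, accepting the default at every prompt; this generates id_rsa.pub under .ssh/:
ssh-keygen -t rsa

Copy id_rsa.pub to the other two machines and import the public key there:
cat id_rsa.pub >> /root/.ssh/authorized_keys

Note: in this test the sentinel and Redis instances share hosts. For the local sentinel to operate on the local Redis host (bring the VIP down or up), the host must also be able to SSH to itself, so each host's own public key goes into its own authorized_keys as well. In the end every server's authorized_keys contains all three keys (three lines).

②: Before the first run the VIP must already be set on the master: once Redis Sentinel is up, configure the VIP on the master.

③: Get the required IPs from the information passed to the script.

④: Send the email, write the log, and remotely bring the VIP up and down.

Before that, install the paramiko module: easy_install paramiko, which needs these packages: apt-get install python-setuptools python-dev build-essential libffi-dev libssl-dev; or install it directly with: apt-get install python-paramiko.

The script is as follows (written for Python 2; it uses the logging module):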

#!/usr/bin/env python
#-*-encoding:utf8-*-
#------------------------------------------------
# Name:        notify.py
# Purpose:     Actions to run after a failover
# Author:      zhoujy
# Created:     2016-06-17
#------------------------------------------------
import os
import sys
import time
import datetime
import smtplib
import subprocess
import fileinput
import logging
import paramiko
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
from email.Utils import COMMASPACE, formatdate

reload(sys)
sys.setdefaultencoding('utf8')

def send_mail(to, subject, text, from_mail, server="localhost"):
    message = MIMEMultipart()
    message['From'] = from_mail
    message['To'] = COMMASPACE.join(to)
    message['Date'] = formatdate(localtime=True)
    message['Subject'] = subject
    message.attach(MIMEText(text,_charset='utf-8'))
    smtp = smtplib.SMTP(server)
    smtp.sendmail(from_mail, to, message.as_string())
    smtp.close()

# Bring the VIP down on the old master
def down_vip(hostname,port):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(hostname=hostname,port=port)
    stdin, stdout, stderr = ssh.exec_command("ifconfig eth0:0 down")
#    print stdout.readlines()
    # Read stderr only once: a second readlines() call would return an empty list
    errors = stderr.readlines()
    if not errors:
        print "down vip ok..."
    else:
        print errors
    ssh.close()

# Bring the VIP up on the new master and announce it with gratuitous ARP
def up_vip(hostname,port,vip):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(hostname=hostname,port=port)
    stdin, stdout, stderr = ssh.exec_command("ifconfig eth0:0 %s;arping -c 3 -A %s;hash -r" %(vip,vip))
#    print stdout.readlines()
    # Read stderr only once: a second readlines() call would return an empty list
    errors = stderr.readlines()
    if not errors:
        print "up vip ok..."
    else:
        print errors
    ssh.close()

if __name__ == "__main__":
    # SSH port of the servers
    ssh_port = 22
    # The VIP to move
    vip      = '192.168.200.2'
    # Configure the log format and destination with logging.basicConfig
    logging.basicConfig(level=logging.INFO,
                format=':::%(levelname)s::: \n%(message)s',
                datefmt='%a, %d %b %Y %H:%M:%S',
                filename='/var/log/redis/failover.txt',
                filemode='a')
    # Add a StreamHandler that also prints INFO-level messages to standard error
    console = logging.StreamHandler()
    console.setLevel(logging.INFO)
    formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s')
    console.setFormatter(formatter)
    logging.getLogger('').addHandler(console)

    # Timestamp for the log and mail text (named "now" so it does not shadow the time module)
    now =  (datetime.datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    # Arguments passed by Sentinel:
    # <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
    message = sys.argv[1:]
    master_name = sys.argv[1]
    role = sys.argv[2]
    stats = sys.argv[3]
    from_ip = sys.argv[4]
    from_port = sys.argv[5]
    to_ip = sys.argv[6]
    to_port = sys.argv[7]
    messages = "++++++++++++++++++++++++++"+now+" failover++++++++++++++++++++++++++"+'\n'+' '.join(message)
    subject = ''' Redis 【%s】 Failover ''' %master_name
    info = ''' %s : Redis Master %s failover %s(%s:%s) to %s(%s:%s) succeeded ! '''  %(now,master_name,from_ip,from_ip,from_port,to_ip,to_ip,to_port)
    mail_list =['zjy@dxyer.com']
    if role == 'leader':
        logging.info(messages)
        down_vip(from_ip,ssh_port)
        up_vip(to_ip,ssh_port,vip)
        send_mail(mail_list, subject.encode("utf8"), info +' and VIP do sucessed !!', "Redis_failover_report@ls.xxx.net", server="192.168.xxx.xxx")

When a failover happens, the alert email reads:

2016-06-17 19:06:42 : Redis Master dxy failover 192.168.200.73(192.168.200.73:6379) to 192.168.200.208(192.168.200.208:6379) succeeded !  and VIP do sucessed !!

The log file records the following:

:::INFO:::
++++++++++++++++++++++++++2016-06-17 19:06:42 failover++++++++++++++++++++++++++
dxy leader start 192.168.200.73 6379 192.168.200.208 6379
:::INFO:::
Connected (version 2.0, client OpenSSH_6.6.1p1)
:::INFO:::
Authentication (publickey) successful!
:::INFO:::
Connected (version 2.0, client OpenSSH_6.6.1p1)
:::INFO:::
Authentication (publickey) successful!

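BTW: applications can connect to Redis through the VIP, which gives a degree of high availability. While the VIP is being switched the service is interrupted; how long it stays unavailable depends mainly on the configured detection time (down-after-milliseconds: 30 seconds by default, and it can be lowered, e.g. 5000 for 5 seconds) and on how quickly the application reconnects. Alternatively, a client such as Jedis for Java can provide high availability directly: it gets the master from the sentinels' information and then connects to that master.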
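
The client-side alternative amounts to asking the sentinels for the current master before connecting. The Python 3 sketch below shows only the lookup loop. The `ask` callable is a hypothetical stand-in for whatever sends `SENTINEL get-master-addr-by-name <master-name>` to one sentinel and returns an (ip, port) tuple or None; it is injected so that the loop itself needs no Redis library.

```python
def discover_master(sentinels, master_name, ask):
    """Ask each sentinel in turn; the first usable answer wins."""
    for host, port in sentinels:
        try:
            address = ask(host, port, master_name)
        except OSError:
            continue  # this sentinel is unreachable, try the next one
        if address is not None:
            return address
    raise RuntimeError("no sentinel could name a master for %r" % master_name)
```

Client libraries with Sentinel support, such as Jedis or the redis.sentinel module of redis-py, wrap this kind of lookup and repeat it when the connection to the master drops.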

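Summary:

Together with "Redis Replication, Sentinel Setup and Principles", this article gives a general picture of how Redis Sentinel provides high availability. Sentinel is fairly simple; when the load is modest and a single machine can meet the requirements, Redis Sentinel is a good choice.

References:

Redis Replication, Sentinel Setup and Principles (original title: Redis 複製、Sentinel的搭建和原理說明)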