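The Sentinel process monitors the state of the Master server in a Redis cluster. When the Master fails, Sentinel switches roles between the Master and a Slave to keep the system highly available. Sentinel has been bundled with Redis since version 2.6 and became stable in version 2.8, so production environments should run Redis 2.8 or later.

Sentinel is a distributed system: several Sentinel processes can run in one architecture. These processes use gossip protocols to exchange information about whether the Master is offline, and agreement protocols (voting) to decide whether to perform an automatic failover and which Slave to promote to Master.

Each Sentinel process periodically sends messages to the other Sentinels, the Master, and the Slaves to confirm that they are alive. If a peer does not respond within the configured time, the Sentinel provisionally treats it as offline. This state is called Subjectively Down (SDOWN): it is subjective because it is the view of a single Sentinel, which the other Sentinels may or may not share.

Where there is a subjective down state there is also an objective one. When enough Sentinel processes in the group (at least the configured quorum) have judged the Master to be SDOWN and have exchanged that judgment through the SENTINEL is-master-down-by-addr command, the Master is declared Objectively Down (ODOWN): objective because the conclusion no longer depends on any single Sentinel's view. The Sentinels then run a vote to pick one of the remaining Slave nodes, promote it to Master, update the related configuration automatically, and carry out the failover.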
The Sentinel mechanism solves the problem of switching the master and slave roles.
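The two thresholds involved in a failover can be sketched in a few lines of shell. This is only an illustration under stated assumptions: the function names `is_odown` and `failover_majority` are hypothetical and not part of Redis. It reflects the documented Sentinel behavior that the quorum decides when a master is flagged ODOWN, while the failover itself must be authorized by a majority of all Sentinels.

```shell
# Hypothetical helpers, not part of Redis.

# is_odown VOTES QUORUM
# Succeeds when at least QUORUM Sentinels report the master as down.
is_odown() {
    [ "$1" -ge "$2" ]
}

# failover_majority TOTAL
# Prints how many of TOTAL Sentinels must authorize a failover (more than half).
failover_majority() {
    echo $(( $1 / 2 + 1 ))
}

# Three Sentinels with quorum 2, as in the setup used in this article.
if is_odown 2 2; then
    echo "ODOWN"
else
    echo "SDOWN only"
fi
echo "majority of 3 sentinels: $(failover_majority 3)"
```

With three Sentinels and a quorum of 2, two agreeing Sentinels are enough both to flag ODOWN and to form the majority that authorizes the failover.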
Hostname | IP address |
---|---|
Master | 192.168.36.110 |
Slave-1 | 192.168.36.111 |
Slave-2 | 192.168.36.112 |
    [root@Master ~]#ss -ntl
    State      Recv-Q Send-Q Local Address:Port    Peer Address:Port
    LISTEN     0      128    *:6379                *:*

    [root@Slave-1 ~]#ss -ntl
    State      Recv-Q Send-Q Local Address:Port    Peer Address:Port
    LISTEN     0      128    *:6379                *:*

    [root@Slave-2 ~]#ss -ntl
    State      Recv-Q Send-Q Local Address:Port    Peer Address:Port
    LISTEN     0      128    *:6379                *:*
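A Redis server is a master by default. After choosing the master, configure each of the other servers as a slave of that master, because Sentinel requires a master-slave replication environment that has already been set up by hand.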
    [root@Slave-1 ~]#vim /apps/redis/etc/redis.conf
    ....
    # point slaveof at the master
    slaveof 192.168.36.110 6379
    masterauth 123456
    ....
    [root@Slave-1 ~]#ps -ef | grep redis
    root       7397      1  0 10:45 ?        00:00:01 redis-server 0.0.0.0:6379
    root       7484   7349  0 10:54 pts/0    00:00:00 grep --color=auto redis
    [root@Slave-1 ~]#kill -9 7397    # terminate the process
    [root@Slave-1 ~]#redis-server /apps/redis/etc/redis.conf    # restart with the updated config file
    [root@Slave-2 ~]#vim /apps/redis/etc/redis.conf
    ....
    slaveof 192.168.36.110 6379
    masterauth 123456
    ....
    [root@Slave-2 ~]#ps -ef | grep redis
    root       8017      1  0 10:44 ?        00:00:01 redis-server 0.0.0.0:6379
    root       8173   7926  0 10:56 pts/0    00:00:00 grep --color=auto redis
    [root@Slave-2 ~]#kill 8017
    [root@Slave-2 ~]#redis-server /apps/redis/etc/redis.conf    # restart with the updated config file
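The manual edit above can also be scripted so that both slaves receive identical settings. The following is a minimal sketch, assuming a helper named `configure_replica` (hypothetical, not shipped with Redis) and a config file that uses the `slaveof` and `masterauth` directives shown above. It drops any active copies of those two directives and appends fresh ones; the demo runs against a temporary file rather than the real redis.conf.

```shell
# configure_replica CONF MASTER_IP MASTER_PORT PASSWORD   (hypothetical helper)
# Replaces any active slaveof/masterauth lines in CONF with the given values.
configure_replica() {
    cr_tmp=$(mktemp) || return 1
    # Keep every line that is not an active slaveof/masterauth directive.
    grep -Ev '^[[:space:]]*(slaveof|masterauth)[[:space:]]' "$1" > "$cr_tmp"
    printf 'slaveof %s %s\nmasterauth %s\n' "$2" "$3" "$4" >> "$cr_tmp"
    # Write back in place so the original file keeps its owner and mode.
    cat "$cr_tmp" > "$1" && rm -f "$cr_tmp"
}

# Demo on a throwaway file that already contains a stale slaveof line.
demo_conf=$(mktemp)
printf 'port 6379\nslaveof 10.0.0.1 6379\n' > "$demo_conf"
configure_replica "$demo_conf" 192.168.36.110 6379 123456
cat "$demo_conf"
rm -f "$demo_conf"
```

As in the transcript above, the Redis process still has to be restarted with the edited file before the new settings take effect.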
    # Slave-1 status
    [root@Slave-1 ~]#redis-cli
    127.0.0.1:6379> AUTH 123456
    OK
    127.0.0.1:6379> INFO replication
    # Replication
    role:slave    # now a slave
    master_host:192.168.36.110
    master_port:6379
    master_link_status:up    # the replication link is up
    master_last_io_seconds_ago:8
    master_sync_in_progress:0
    slave_repl_offset:84
    slave_priority:100
    slave_read_only:1
    connected_slaves:0
    master_replid:99a1dcabb930a97bbdea90450b2f891778c83e37
    master_replid2:0000000000000000000000000000000000000000    # holds the previous master_replid; after a failover it records the replication ID that was in use before the switch
    master_repl_offset:84
    second_repl_offset:-1
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:1
    repl_backlog_histlen:84

    # Slave-2 status
    [root@Slave-2 ~]#redis-cli
    127.0.0.1:6379> AUTH 123456
    OK
    127.0.0.1:6379> INFO replication
    # Replication
    role:slave    # now a slave
    master_host:192.168.36.110
    master_port:6379
    master_link_status:up    # the replication link is up
    master_last_io_seconds_ago:1
    master_sync_in_progress:0
    slave_repl_offset:224
    slave_priority:100
    slave_read_only:1
    connected_slaves:0
    master_replid:99a1dcabb930a97bbdea90450b2f891778c83e37
    master_replid2:0000000000000000000000000000000000000000
    master_repl_offset:224
    second_repl_offset:-1
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:155
    repl_backlog_histlen:70

    # Master status
    [root@Master ~]#redis-cli
    127.0.0.1:6379> AUTH 123456
    OK
    127.0.0.1:6379> INFO replication
    # Replication
    role:master
    connected_slaves:2    # 2 slaves: Slave-1 and Slave-2 have joined
    slave0:ip=192.168.36.111,port=6379,state=online,offset=336,lag=1
    slave1:ip=192.168.36.112,port=6379,state=online,offset=336,lag=1
    master_replid:99a1dcabb930a97bbdea90450b2f891778c83e37
    master_replid2:0000000000000000000000000000000000000000
    master_repl_offset:336
    second_repl_offset:-1
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:1
    repl_backlog_histlen:336

    # Both slaves now replicate the master's data; they can read it but cannot write
    127.0.0.1:6379> KEYS *
    1) "key3"
    2) "key2"
    3) "key1"
    127.0.0.1:6379> SET key5 value5
    (error) READONLY You can't write against a read only slave.
    127.0.0.1:6379> GET key3
    "value4"
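Reading these fields by eye gets tedious once there are several nodes, so the check can be scripted. Below is a small sketch, assuming a helper named `info_field` (hypothetical) that reads `INFO` output on standard input and prints the value of one key. The sample text is copied from the Slave-1 output above, so the block runs without a Redis server.

```shell
# info_field KEY   (hypothetical helper)
# Prints the value of KEY from "key:value" INFO output read on stdin.
# Only the text up to the next ':' is returned, which is enough for
# simple fields such as role or master_link_status.
info_field() {
    awk -F: -v key="$1" '$1 == key { sub(/\r$/, "", $2); print $2; exit }'
}

sample='# Replication
role:slave
master_host:192.168.36.110
master_port:6379
master_link_status:up'

role=$(printf '%s\n' "$sample" | info_field role)
link=$(printf '%s\n' "$sample" | info_field master_link_status)
echo "role=$role link=$link"
```

Against a live node the same helper could be fed from `redis-cli -a 123456 INFO replication`; the trailing carriage return that redis-cli emits on each line is stripped by the `sub` call.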
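    # Redis was compiled from source, so the sentinel config file has to be copied by hand
    # With a yum install the sentinel config file already exists and no copy is needed
    [root@Master ~]#cp /root/redis-4.0.14/sentinel.conf /apps/redis/etc/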
    [root@Master ~]#vim /apps/redis/etc/sentinel.conf
    [root@Master ~]#grep "^[a-zA-Z]" /apps/redis/etc/sentinel.conf
    bind 0.0.0.0
    port 26379
    daemonize yes
    #pidfile "redis-sentinel.pid"
    logfile "sentinel_26379.log"
    dir "/apps/redis/"
    # quorum: how many Sentinels must agree that the master is down before it is flagged ODOWN
    sentinel monitor mymaster 192.168.36.110 6379 2
    sentinel auth-pass mymaster 123456
    # time without a valid reply before the master is considered SDOWN, in milliseconds
    sentinel down-after-milliseconds mymaster 10000
    # number of slaves that resync with the new master at the same time during a failover;
    # the smaller the number, the longer the total sync time
    sentinel parallel-syncs mymaster 1
    # timeout for pointing all slaves at the new master, in milliseconds
    sentinel failover-timeout mymaster 180000
    # forbid changing the notification and reconfiguration scripts at runtime
    sentinel deny-scripts-reconfig yes

    # scp the config file to the two slave nodes
    [root@Master redis-4.0.14]#scp /apps/redis/etc/sentinel.conf 192.168.36.111:/apps/redis/etc/
    root@192.168.36.111's password:
    sentinel.conf                                 100%  282   214.2KB/s   00:00
    [root@Master redis-4.0.14]#scp /apps/redis/etc/sentinel.conf 192.168.36.112:/apps/redis/etc/
    root@192.168.36.112's password:
    sentinel.conf                                 100%  282   267.0KB/s   00:00
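Because the same file is copied to every node, it can help to print the monitoring parameters in readable form before distributing it. The sketch below assumes a helper named `summarize_sentinel_conf` (hypothetical); it reads sentinel.conf-style text from standard input or from files given as arguments and converts the millisecond values to seconds. The sample lines mirror the directives used in this article.

```shell
# summarize_sentinel_conf [FILE...]   (hypothetical helper)
# Prints the monitored master, quorum and timing directives in readable form.
summarize_sentinel_conf() {
    awk '
        $1 != "sentinel" { next }
        $2 == "monitor" {
            printf "%s: master %s:%s, quorum %s\n", $3, $4, $5, $6
        }
        $2 == "down-after-milliseconds" {
            printf "%s: SDOWN after %d s\n", $3, $4 / 1000
        }
        $2 == "failover-timeout" {
            printf "%s: failover timeout %d s\n", $3, $4 / 1000
        }
    ' "$@"
}

summarize_sentinel_conf <<'EOF'
port 26379
sentinel monitor mymaster 192.168.36.110 6379 2
sentinel down-after-milliseconds mymaster 10000
sentinel failover-timeout mymaster 180000
EOF
```

On a real node the helper could be pointed at the file directly, for example `summarize_sentinel_conf /apps/redis/etc/sentinel.conf`.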
    [root@Master ~]#redis-sentinel /apps/redis/etc/sentinel.conf
    [root@Master ~]#ss -ntl
    State      Recv-Q Send-Q Local Address:Port    Peer Address:Port
    LISTEN     0      511    *:26379               *:*

    [root@Slave-1 ~]#redis-sentinel /apps/redis/etc/sentinel.conf
    [root@Slave-2 ~]#redis-sentinel /apps/redis/etc/sentinel.conf
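The `ss -ntl` check above can be turned into a pass/fail test, which is convenient when starting Sentinel on several hosts. This is a sketch only: `port_listening` is a hypothetical helper that reads `ss -ntl`-style output on standard input and assumes the column layout shown above, where the fourth field is the local address and port.

```shell
# port_listening PORT   (hypothetical helper)
# Succeeds if the ss -ntl output on stdin has a LISTEN socket on PORT.
port_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Sample output copied from the transcript above.
ss_sample='State      Recv-Q Send-Q Local Address:Port    Peer Address:Port
LISTEN     0      511    *:26379               *:*'

if printf '%s\n' "$ss_sample" | port_listening 26379; then
    echo "sentinel port is listening"
else
    echo "sentinel port is not listening"
fi
```

On a live host the input would come from `ss -ntl | port_listening 26379`.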
    [root@Master ~]#tail -f /apps/redis/logs/sentinel_26379.log
    14129:X 14 Jun 16:23:34.697 # Sentinel is now ready to exit, bye bye...
    14134:X 14 Jun 16:23:40.985 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
    14134:X 14 Jun 16:23:40.985 # Redis version=4.0.14, bits=64, commit=00000000, modified=0, pid=14134, just started
    14134:X 14 Jun 16:23:40.985 # Configuration loaded
    14134:X 14 Jun 16:23:40.986 * Increased maximum number of open files to 10032 (it was originally set to 1024).
    14134:X 14 Jun 16:23:40.987 * Running mode=sentinel, port=26379.
    14134:X 14 Jun 16:23:40.987 # Sentinel ID is 69d6647e2c6236b5b72d8e943b5d5707db47b9a4
    14134:X 14 Jun 16:23:40.987 # +monitor master mymaster 192.168.36.110 6379 quorum 2
    14134:X 14 Jun 16:23:43.015 * +sentinel sentinel abeb0c89a25c690b5cbe09491de6ab822deee15e 192.168.36.112 26379 @ mymaster 192.168.36.110 6379
    14134:X 14 Jun 16:23:43.050 * +sentinel sentinel 4d3b7eb172aaef1a58b35c1a567534c67f3977ef 192.168.36.111 26379 @ mymaster 192.168.36.110 6379
    [root@Master ~]#redis-cli -h 192.168.36.110 -p 26379    # query through Sentinel port 26379
    192.168.36.110:26379> INFO sentinel
    # Sentinel
    sentinel_masters:1
    sentinel_tilt:0
    sentinel_running_scripts:0
    sentinel_scripts_queue_length:0
    sentinel_simulate_failure_flags:0
    master0:name=mymaster,status=ok,address=192.168.36.110:6379,slaves=2,sentinels=3
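The `master0:` line packs several values into one comma-separated record. The following sketch pulls a single value out of it; `sentinel_master_field` is a hypothetical helper name, and the sample line is the one shown in the output above.

```shell
# sentinel_master_field FIELD   (hypothetical helper)
# Prints FIELD (name, status, address, slaves, sentinels) from the
# master0: line of INFO sentinel output read on stdin.
sentinel_master_field() {
    awk -v field="$1" '
        /^master0:/ {
            sub(/^master0:/, "")
            sub(/\r$/, "")
            n = split($0, pairs, ",")
            for (i = 1; i <= n; i++) {
                eq = index(pairs[i], "=")
                if (substr(pairs[i], 1, eq - 1) == field) {
                    print substr(pairs[i], eq + 1)
                    exit
                }
            }
        }
    '
}

line='master0:name=mymaster,status=ok,address=192.168.36.110:6379,slaves=2,sentinels=3'
printf '%s\n' "$line" | sentinel_master_field address
printf '%s\n' "$line" | sentinel_master_field sentinels
```

Running the same extraction before and after a failover shows the `address` value moving from the old master to the promoted slave.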
    [root@Master ~]#tail -f /apps/redis/logs/sentinel_26379.log
    14232:X 14 Jun 16:29:58.189 # +sdown master mymaster 192.168.36.110 6379
    14232:X 14 Jun 16:29:58.218 # +new-epoch 1
    14232:X 14 Jun 16:29:58.219 # +vote-for-leader 4d3b7eb172aaef1a58b35c1a567534c67f3977ef 1
    14232:X 14 Jun 16:29:58.266 # +odown master mymaster 192.168.36.110 6379 #quorum 3/2
    14232:X 14 Jun 16:29:58.266 # Next failover delay: I will not start a failover before Fri Jun 14 16:35:58 2019
    14232:X 14 Jun 16:29:59.468 # +config-update-from sentinel 4d3b7eb172aaef1a58b35c1a567534c67f3977ef 192.168.36.111 26379 @ mymaster 192.168.36.110 6379
    14232:X 14 Jun 16:29:59.468 # +switch-master mymaster 192.168.36.110 6379 192.168.36.111 6379
    14232:X 14 Jun 16:29:59.469 * +slave slave 192.168.36.112:6379 192.168.36.112 6379 @ mymaster 192.168.36.111 6379
    14232:X 14 Jun 16:29:59.469 * +slave slave 192.168.36.110:6379 192.168.36.110 6379 @ mymaster 192.168.36.111 6379
    14232:X 14 Jun 16:30:29.507 # +sdown slave 192.168.36.110:6379 192.168.36.110 6379 @ mymaster 192.168.36.111 6379
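The "Next failover delay" line follows from the configuration: according to the Redis Sentinel documentation, a Sentinel waits twice the failover-timeout before retrying a failover of the same master. With failover-timeout set to 180000 ms that is 360 seconds after the vote at 16:29:58. The arithmetic can be checked with a short sketch; `add_seconds` is a hypothetical helper that works on a plain HH:MM:SS value and wraps at midnight.

```shell
# add_seconds HH:MM:SS SECONDS   (hypothetical helper)
# Prints the wall-clock time SECONDS after HH:MM:SS, wrapping at 24 hours.
add_seconds() {
    echo "$1" | awk -F: -v add="$2" '{
        total = ($1 * 3600 + $2 * 60 + $3 + add) % 86400
        printf "%02d:%02d:%02d\n", int(total / 3600), int(total % 3600 / 60), total % 60
    }'
}

failover_timeout_ms=180000
retry_delay_s=$(( 2 * failover_timeout_ms / 1000 ))
echo "retry delay: ${retry_delay_s} s"
add_seconds 16:29:58 "$retry_delay_s"
```

The printed time matches the timestamp in the log line, 16:35:58.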
    [root@Master ~]#redis-cli -h 192.168.36.110 -p 26379
    192.168.36.110:26379> INFO sentinel
    # Sentinel
    sentinel_masters:1
    sentinel_tilt:0
    sentinel_running_scripts:0
    sentinel_scripts_queue_length:0
    sentinel_simulate_failure_flags:0
    master0:name=mymaster,status=ok,address=192.168.36.111:6379,slaves=2,sentinels=3
    # After a failover, the master IP on the slaveof line in redis.conf is rewritten,
    # and so is the IP on the sentinel monitor line in sentinel.conf
    [root@Slave-1 ~]#cat /apps/redis/etc/sentinel.conf
    bind 0.0.0.0
    port 26379
    logfile "sentinel_26379.log"
    dir "/apps/redis/logs"
    sentinel myid 4d3b7eb172aaef1a58b35c1a567534c67f3977ef
    sentinel deny-scripts-reconfig yes
    sentinel monitor mymaster 192.168.36.111 6379 2
    sentinel auth-pass mymaster 123456
    sentinel config-epoch mymaster 1
    # Generated by CONFIG REWRITE
    sentinel leader-epoch mymaster 1
    sentinel known-slave mymaster 192.168.36.110 6379
    sentinel known-slave mymaster 192.168.36.112 6379
    sentinel known-sentinel mymaster 192.168.36.110 26379 69d6647e2c6236b5b72d8e943b5d5707db47b9a4
    sentinel known-sentinel mymaster 192.168.36.112 26379 abeb0c89a25c690b5cbe09491de6ab822deee15e
    sentinel current-epoch 1
    [root@Slave-1 ~]#redis-cli
    127.0.0.1:6379> AUTH 123456
    OK
    127.0.0.1:6379> INFO replication
    # Replication
    role:master    # Slave-1 is now the master node
    connected_slaves:1
    slave0:ip=192.168.36.112,port=6379,state=online,offset=162954,lag=1
    master_replid:e95e0241596bd1073ca558fc7cb892a7a6b4dbe6    # the current master_replid after the failover
    master_replid2:305f29a1bce5172f4c7e263de0d346fd33362d4d    # the master_replid from before the failover
    master_repl_offset:163240
    second_repl_offset:72111
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:1
    repl_backlog_histlen:163240

    [root@Slave-2 ~]#redis-cli
    127.0.0.1:6379> AUTH 123456
    OK
    127.0.0.1:6379> INFO replication
    # Replication
    role:slave
    master_host:192.168.36.111    # IP address of the new master after the failover
    master_port:6379
    master_link_status:up
    master_last_io_seconds_ago:1
    master_sync_in_progress:0
    slave_repl_offset:187718
    slave_priority:100
    slave_read_only:1
    connected_slaves:0
    master_replid:e95e0241596bd1073ca558fc7cb892a7a6b4dbe6
    master_replid2:305f29a1bce5172f4c7e263de0d346fd33362d4d
    master_repl_offset:187718
    second_repl_offset:72111
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:71
    repl_backlog_histlen:187648