A Complete Analysis of Redis Master-Replica Replication

Date: 2020-05-22


How does Redis master-replica replication work? How does it keep data in sync while still delivering high performance?

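- https://redis.io/topics/replication
  Note: what follows is based on Redis 5, where the term "slave" and the related configuration options have been officially renamed to "replica". Both names mean the same thing: the replica node.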

The basic replication flow

# Master-Replica replication. Use replicaof to make a Redis instance a copy of
# another Redis server. A few things to understand ASAP about Redis replication.
#
#   +------------------+      +---------------+
#   |      Master      | ---> |    Replica    |
#   | (receive writes) |      |  (exact copy) |
#   +------------------+      +---------------+
#
# 1) Redis replication is asynchronous, but you can configure a master to
# stop accepting writes if it appears to be not connected with at least
# a given number of replicas.
# 2) Redis replicas are able to perform a partial resynchronization with the
# master if the replication link is lost for a relatively small amount of
# time. You may want to configure the replication backlog size (see the next
# sections of this file) with a sensible value depending on your needs.
# 3) Replication is automatic and does not need user intervention. After a
# network partition replicas automatically try to reconnect to masters
# and resynchronize with them.
#
# replicaof <masterip> <masterport>

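The basic replication flow between the `Master` and a `replica`:

While the link between the Master and a replica is stable, the Master continuously streams its write commands to the replica (incremental replication). The replica applies them to its own data set and reports how much of the stream it has processed by sending `REPLCONF ACK <offset>` to the Master once per second.
When a replica reconnects after losing the link, it sends a PSYNC command to the Master. If the conditions are met (the replica refers to a replication history the Master still knows, and the backlog still holds the data the replica missed), replication continues with a partial resync. Otherwise the Master performs a full resync for that replica.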

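From this flow we can see that network problems may trigger a full resync, which severely slows down a replica that is trying to catch up with the master.
How can this be mitigated?
There are two angles: the master-replica response-time strategy and the master-replica backlog strategy.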

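Master-replica response-time strategy

1. Every repl-ping-replica-period seconds the Master sends a PING to its replicas, which lets a replica detect that the Master has gone down.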

repl-ping-replica-period 10

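2. The replication timeout between a replica (slave) and the Master, 60 s by default, applies in three cases:
a) From the replica's point of view: no RDB data is received from the master during the bulk transfer of a full sync (SYNC).
b) From the replica's point of view: no data or PINGs are received from the master.
c) From the master's point of view: no REPLCONF ACK (carrying the replication offset) is received from the replica.
When Redis detects that repl-timeout (default 60 s) has expired, it closes the connection between master and replica, and the replica then requests a new connection to the master.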

repl-timeout 60
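
The relationship between the two settings above can be sketched in a few lines of Python. This is only an illustration of the rule, not Redis source code: the names `ReplTimeouts`, `is_sane`, and `link_timed_out` are hypothetical, and the check assumes that a link counts as timed out once nothing has been received for more than repl-timeout seconds.

```python
from dataclasses import dataclass


@dataclass
class ReplTimeouts:
    """Hypothetical holder for the two settings discussed above (seconds)."""
    ping_period: int = 10  # repl-ping-replica-period
    timeout: int = 60      # repl-timeout

    def is_sane(self) -> bool:
        # repl-timeout should be larger than the ping period; otherwise a
        # quiet but healthy link would be reported as timed out between PINGs.
        return self.timeout > self.ping_period


def link_timed_out(last_interaction: float, now: float, cfg: ReplTimeouts) -> bool:
    """True when nothing was received from the peer for more than repl-timeout."""
    return (now - last_interaction) > cfg.timeout
```

With the defaults shown, up to five PINGs can be lost in a row before the link is declared dead, which is why the timeout is normally kept several times larger than the ping period.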

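Master-replica backlog strategy

After accepting a write, the Master appends it to the replication buffer (the buffer used to transmit data to replicas) and also to the replication backlog.
When a replica reconnects and sends PSYNC (carrying the replication ID and the offset it has processed so far), a partial resync is triggered if the required history can still be found in the replication backlog; otherwise the Master performs a full resync for that replica.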
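
The backlog check described above can be sketched as follows. This is a simplified model, not the actual Redis implementation: the function name is hypothetical, and its parameters borrow the field names that `info replication` prints (`repl_backlog_active`, `repl_backlog_first_byte_offset`, `repl_backlog_histlen`). It covers only the offset window; the replication ID check is a separate condition.

```python
def offset_in_backlog(psync_offset: int,
                      repl_backlog_active: bool,
                      repl_backlog_first_byte_offset: int,
                      repl_backlog_histlen: int) -> bool:
    """Return True if the backlog still holds everything from psync_offset on.

    psync_offset is the next byte the replica asks for. It may be one past
    the last byte in the backlog, meaning the replica missed nothing.
    """
    if not repl_backlog_active:
        return False  # no backlog allocated: only a full resync is possible
    first = repl_backlog_first_byte_offset
    end = first + repl_backlog_histlen  # one past the last stored byte
    return first <= psync_offset <= end
```

For instance, with a backlog that starts at offset 1 and holds 437 bytes, a replica asking for offset 438 can resume, while one asking for offset 0 would need a full resync.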

# Set the replication backlog size. The backlog is a buffer that accumulates
# replica data when replicas are disconnected for some time, so that when a replica
# wants to reconnect again, often a full resync is not needed, but a partial
# resync is enough, just passing the portion of data the replica missed while
# disconnected.
#
# The bigger the replication backlog, the longer the time the replica can be
# disconnected and later be able to perform a partial resynchronization.
#
# The backlog is only allocated once there is at least a replica connected.
#
# repl-backlog-size 1mb

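Parameters related to the replication backlog:

# Partial resync window
repl-backlog-size 1mb
# Seconds the master keeps the backlog after the last replica disconnects
repl-backlog-ttl 3600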
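
A rough way to choose repl-backlog-size is to multiply the master's write throughput by the longest disconnection you want to survive. The sketch below is a rule of thumb under that assumption, not an official Redis formula; `min_backlog_bytes` and its `safety_factor` parameter are hypothetical names introduced for illustration.

```python
def min_backlog_bytes(write_bytes_per_sec: int,
                      max_disconnect_secs: int,
                      safety_factor: float = 2.0) -> int:
    """Estimate a backlog size that covers a disconnection of the given length.

    safety_factor is an assumed margin for write bursts.
    """
    return int(write_bytes_per_sec * max_disconnect_secs * safety_factor)


# A master writing about 20 KiB/s that should tolerate a 60-second outage:
estimate = min_backlog_bytes(20 * 1024, 60)  # 2457600 bytes, above the 1mb default
```

In this example the default 1mb backlog would be too small, so a reconnecting replica could fall back to a full resync.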

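Full resync workflow

The full resync workflow:

1. The replica sends PSYNC.
2. (Assume the conditions for a full resync are met.)
3. The Master starts a BGSAVE, forking a child process that writes the snapshot dump.rdb. Meanwhile, the Master buffers all new write commands received from clients in the replication buffer.
4. Once the snapshot is ready, the Master sends the RDB data to the replica over the network, followed by the buffered write commands.
5. The replica saves the RDB data to disk and then loads it into memory (it deletes its old data, and the load blocks).
6. (From here on, replication is incremental.)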
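
The steps above can be modelled with plain dictionaries to show why the replica ends up identical to the master even though writes continue during the snapshot. This is a toy simulation under simplifying assumptions (a single replica, SET-style writes only); `full_resync` is a hypothetical name and nothing here talks to a real Redis server.

```python
def full_resync(master_data: dict, writes_during_snapshot: list) -> dict:
    """Simulate a full resync and return the replica's resulting data set."""
    # Step 3: the forked child sees a point-in-time copy of the data set.
    snapshot = dict(master_data)

    # Step 3, continued: the master keeps serving writes and buffers them.
    replication_buffer = []
    for key, value in writes_during_snapshot:
        master_data[key] = value
        replication_buffer.append((key, value))

    # Steps 4 and 5: the replica drops its old data and loads the snapshot.
    replica_data = {}
    replica_data.update(snapshot)

    # Step 6: the buffered commands are replayed, then the stream continues.
    for key, value in replication_buffer:
        replica_data[key] = value
    return replica_data
```

Replaying the buffer after the snapshot is what closes the gap between the moment of the fork and the end of the transfer.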

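If the master has slow disks but good bandwidth, diskless mode can be used (note that it is experimental):

# Default is no; set to yes to enable diskless mode
repl-diskless-sync yes
# Seconds to wait before starting the transfer, so that more replicas can join
repl-diskless-sync-delay 5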

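By default, a replica keeps serving requests during a full resync or while it is disconnected from the master, possibly with stale data.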

replica-serve-stale-data yes

While the replica is loading the RDB data into memory, it blocks client connections.

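How to ensure safe data delivery (Allow writes only with N attached replicas)

By default the Master replicates asynchronously: it replies to the client once it has applied the write itself, without waiting for any replica. With the options below, the Master accepts a write only if at least N replicas are connected with a lag of at most M seconds; otherwise it returns an error to the client.

# Disabled by default
min-replicas-to-write <number of replicas>
min-replicas-max-lag <seconds>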
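
The acceptance rule can be sketched as a small predicate. This is an illustrative model rather than Redis code: `write_allowed` is a hypothetical name, `replica_lags` stands for the `lag` values reported per replica in `info replication`, and the sketch assumes that setting either option to 0 disables the check.

```python
def write_allowed(replica_lags: list,
                  min_replicas_to_write: int,
                  min_replicas_max_lag: int) -> bool:
    """Decide whether the master would accept a write.

    replica_lags holds, for each connected replica, the seconds since its
    last acknowledgement.
    """
    if min_replicas_to_write == 0 or min_replicas_max_lag == 0:
        return True  # feature disabled
    good = sum(1 for lag in replica_lags if lag <= min_replicas_max_lag)
    return good >= min_replicas_to_write
```

Note that this is a best-effort guard based on recent acknowledgements; it does not turn replication into a synchronous protocol.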

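In addition, a client can use the WAIT command, which works like an ACK mechanism, to make sure that the specified number of replicas in other Redis instances have acknowledged the write.

127.0.0.1:9001> set a x
OK
127.0.0.1:9001> wait 1 1000
(integer) 1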

Failover

The replication ID identifies the history of the data set produced by the current master.
There are two replication IDs: master_replid and master_replid2.

127.0.0.1:9001> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=9011,state=online,offset=437,lag=1
master_replid:9ab608f7590f0e5898c4574299187a52ad0db7ec
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:437
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:437

When the master goes down and one of its replicas is promoted to master, the promoted node starts a new era and generates a new replication ID, master_replid.
The old master_replid is moved into master_replid2.

# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=9021,state=online,offset=34874,lag=0
slave1:ip=127.0.0.1,port=9001,state=online,offset=34741,lag=0
master_replid:dfa343264a79179c1061f8fb81d49077db8e4e5f
master_replid2:9ab608f7590f0e5898c4574299187a52ad0db7ec
master_repl_offset:34874
second_repl_offset:6703
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:34874

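This way, the other replicas do not need another full resync when they connect to the new master: they can finish syncing the remainder of the old history and then continue with data from the new era.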
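
How the two replication IDs let a replica of the old master resume against the promoted one can be sketched like this. It is a simplified model of the ID check only (the backlog window is a separate condition); `MasterState` and `replid_accepted` are hypothetical names, and the fields mirror those printed by `info replication`.

```python
from dataclasses import dataclass


@dataclass
class MasterState:
    master_replid: str       # ID of the current history
    master_replid2: str      # ID of the previous history
    second_repl_offset: int  # offset up to which the previous ID is valid


def replid_accepted(m: MasterState, replica_replid: str, psync_offset: int) -> bool:
    """True if the replica's history is one this master can continue."""
    if replica_replid == m.master_replid:
        return True
    # A replica of the old master may continue only for the part of the
    # history that the promoted node shared with the old master.
    return (replica_replid == m.master_replid2
            and psync_offset <= m.second_repl_offset)


new_master = MasterState(
    master_replid="dfa343264a79179c1061f8fb81d49077db8e4e5f",
    master_replid2="9ab608f7590f0e5898c4574299187a52ad0db7ec",
    second_repl_offset=6703,
)
```

A replica that still carries the old ID and asks for an offset beyond `second_repl_offset` holds data the promoted node never saw, so it falls back to a full resync.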

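How does a replica handle expired keys?

A replica does not expire keys on its own. Only when the Master removes a key, either through a memory eviction policy such as LRU or because the key has expired, and synthesizes a DEL command for its replicas does the replica delete it.
This leaves a time gap. To cover it, the replica uses its logical clock: when a client tries to read a key that has already expired, the replica reports that it does not exist.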
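
The behaviour can be illustrated with a small in-memory model. This is a sketch of the idea only: `ReplicaStore` and its methods are hypothetical, and the current time is passed in explicitly as `now` so the example stays deterministic.

```python
class ReplicaStore:
    """Toy replica that hides expired keys but deletes them only on DEL."""

    def __init__(self):
        self._data = {}  # key -> (value, expire_at or None)

    def apply_set(self, key, value, expire_at=None):
        # Command replicated from the master.
        self._data[key] = (value, expire_at)

    def apply_del(self, key):
        # DEL synthesized by the master when it expires or evicts the key.
        self._data.pop(key, None)

    def get(self, key, now):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expire_at = entry
        if expire_at is not None and now > expire_at:
            return None  # logically expired: hidden from clients, not deleted
        return value

    def holds(self, key):
        # Whether the key still occupies memory on the replica.
        return key in self._data
```

The key keeps occupying memory on the replica until the master's DEL arrives, even though clients already see it as missing.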