PostgreSQL 9.5: Quickly Recovering a Standby Node with pg_rewind

Date: 2019-11-16


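Anyone familiar with PG knows that a primary/standby switchover is not trivial and the steps are strict: the primary has to be shut down deliberately before the standby is promoted, otherwise it is difficult to bring the old primary back up as a standby. An earlier post on this blog, "Primary/standby switchover with PostgreSQL HOT-Standby", covered the switchover itself. PG 9.5 adds pg_rewind to the source tree. After a switchover, the old primary can be resynchronized with the new primary instead of being rebuilt as a standby from scratch. For large databases this saves a great deal of rebuild time.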

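pg_rewind copies data files and configuration files from the source server (the new primary) into the local data directory. Because pg_rewind does not read data blocks that have not changed, it is considerably faster than rebuilding the standby.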

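1 Environment setup

Streaming replication environment
192.168.2.37/1931 primary node (hostname db1)
192.168.2.38/1931 standby node (hostname db2)
Note: for setting up the streaming replication environment, see "PostgreSQL: Setting up streaming replication with pg_basebackup"; it is not repeated here.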

--pg_rewind prerequisites
1 full_page_writes must be on
2 wal_log_hints must be set to on, or data checksums must have been enabled when the PG cluster was initialized

2 Primary/standby switchover

--Standby recovery.conf configuration: run on db2

[pg95@db2 pg_root]$ grep ^[a-z] recovery.conf 
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=192.168.2.37 port=1931 user=repuser'           # e.g. 'host=localhost port=5432'

--Promote the standby: run on db2

[pg95@db2 pg_root]$ pg_ctl promote -D $PGDATA
server promoting

[pg95@db2 pg_root]$ pg_controldata | grep cluster
Database cluster state:               in production

--After the standby is promoted, create a test table and insert some test data

[pg95@db2 pg_root]$ psql
psql (9.5alpha1)
Type "help" for help.

postgres=# create table test_2(id int4);
CREATE TABLE
                   
postgres=# insert into test_2(id) select n from generate_series(1,10000) n;
INSERT 0 10000

--Stop the old primary: run on db1

[pg95@db1 ~]$ pg_controldata | grep cluster
Database cluster state:               in production

[pg95@db1 ~]$ pg_ctl stop -m fast -D $PGDATA
waiting for server to shut down....... done
server stopped

Note: after stopping the old primary, do not immediately start it again as a standby. Otherwise pg_rewind fails with the error "target server must be shut down cleanly".
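One way to guard against that error is to check the control file state before running pg_rewind. The sketch below is an illustration and not part of the original session: the function name `is_cleanly_shut_down` is hypothetical, it relies on the `Database cluster state:` line that `pg_controldata` prints in the output above, and it assumes that only the state `shut down` counts as clean. It reads the text on standard input, so it can be tried without a running server.

```shell
# Hypothetical helper: succeeds only if the pg_controldata output on stdin
# reports a cleanly shut down cluster.
is_cleanly_shut_down() {
    # Extract the value after "Database cluster state:" without leading blanks.
    state=$(sed -n 's/^Database cluster state:[[:space:]]*//p')
    # Assumption of this sketch: only "shut down" is treated as clean.
    [ "$state" = "shut down" ]
}

# Possible usage on db1, with pg_controldata invoked as in the session above:
#   pg_controldata | is_cleanly_shut_down && echo "safe to run pg_rewind"
```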

--pg_rewind: run on db1

[pg95@db1 pg_root]$ pg_ctl stop -m fast -D $PGDATA
waiting for server to shut down......... done
server stopped

[pg95@db1 pg_root]$ pg_rewind --target-pgdata $PGDATA --source-server='host=192.168.2.38 port=1931 user=postgres dbname=postgres' -P 
connected to server
target server needs to use either data checksums or "wal_log_hints = on"

Note: pg_rewind fails with the error above, and the message is self-explanatory.

--pg_rewind source code analysis

    /*
     * Target cluster need to use checksums or hint bit wal-logging, this to
     * prevent from data corruption that could occur because of hint bits.
     */
    if (ControlFile_target.data_checksum_version != PG_DATA_CHECKSUM_VERSION &&
        !ControlFile_target.wal_log_hints)
    {
        pg_fatal("target server needs to use either data checksums or \"wal_log_hints = on\"\n");
    }

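Note: the database must either have checksums enabled at initdb time or have "wal_log_hints = on" set. Next, set the wal_log_hints parameter on both the primary and the standby node and restart the databases.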
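The same condition can be checked from the shell before attempting the rewind. The following is a minimal sketch and not taken from the original post: the function name `target_can_be_rewound` is hypothetical, and the labels `Data page checksum version:` and `wal_log_hints setting:` are assumed to match what `pg_controldata` prints on this version. The C code compares against `PG_DATA_CHECKSUM_VERSION`; this sketch simplifies that by treating only version 0 as "checksums disabled".

```shell
# Hypothetical helper: approximates the C check using pg_controldata text on stdin.
# Succeeds if data checksums are enabled or wal_log_hints is on.
target_can_be_rewound() {
    info=$(cat)
    checksum=$(printf '%s\n' "$info" | sed -n 's/^Data page checksum version:[[:space:]]*//p')
    hints=$(printf '%s\n' "$info" | sed -n 's/^wal_log_hints setting:[[:space:]]*//p')
    # Simplification: any non-zero checksum version counts as "checksums enabled".
    if [ -n "$checksum" ] && [ "$checksum" != "0" ]; then
        return 0
    fi
    [ "$hints" = "on" ]
}

# Possible usage on db1:
#   pg_controldata | target_can_be_rewound || echo "enable wal_log_hints first"
```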

--Run pg_rewind again: run on db1

[pg95@db1 pg_root]$ pg_rewind --target-pgdata $PGDATA --source-server='host=192.168.2.38 port=1931 user=postgres dbname=postgres' -P
connected to server
The servers diverged at WAL position 0/1300CEB0 on timeline 5.
Rewinding from last common checkpoint at 0/1200008C on timeline 5
reading source file list
reading target file list
reading WAL in target
need to copy 59 MB (total source directory size is 76 MB)
61185/61185 kB (100%) copied
creating backup label and updating control file
Done!

Note: pg_rewind succeeded.

--Adjust the recovery.conf file: run on db1
[pg95@db1 ~]$ cd $PGDATA
[pg95@db1 pg_root]$ mv recovery.done recovery.conf

Note: check whether primary_conninfo needs to be changed; it must point at the new primary, db2.

[pg95@db1 pg_root]$ grep ^[a-z] recovery.conf 
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=192.168.2.38 port=1931 user=repuser'           # e.g. 'host=localhost port=5432'
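As an alternative to renaming recovery.done and editing it by hand, the file can be written out directly. This is only a sketch: the function name `write_recovery_conf` and its argument order are hypothetical, and it emits just the three settings shown above.

```shell
# Hypothetical helper: write a minimal recovery.conf into a data directory.
# Arguments: data directory, new primary host, port, replication user.
write_recovery_conf() {
    datadir=$1 host=$2 port=$3 user=$4
    cat > "$datadir/recovery.conf" <<EOF
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=$host port=$port user=$user'
EOF
}

# Possible usage on db1, pointing at the new primary db2:
#   write_recovery_conf "$PGDATA" 192.168.2.38 1931 repuser
```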

--Start the old primary: run on db1

[pg95@db1 pg_root]$ pg_ctl start -D $PGDATA
server starting

[pg95@db1 pg_root]$ pg_controldata | grep cluster
Database cluster state:               in archive recovery

--Verify the data: run on db1

[pg95@db1 pg_root]$ psql
psql (9.5alpha1)
Type "help" for help.

postgres=# select count(*) from test_2;
 count 
-------
 10000
(1 row)

Note: pg_rewind succeeded. The old primary is now running as a standby, and the table test_2 has been replicated to it.

3 How pg_rewind works

The basic idea is to copy everything from the new cluster to the old cluster, except for the blocks that we know to be the same.

    1) Scan the WAL log of the old cluster, starting from the last checkpoint before the point where the new cluster's timeline history forked off from the old cluster. For each WAL record, make a note of the data blocks that were touched. This yields a list of all the data blocks that were changed in the old cluster, after the new cluster forked off.

    2) Copy all those changed blocks from the new cluster to the old cluster.

    3) Copy all other files like clog, conf files etc. from the new cluster to old cluster. Everything except the relation files.

    4) Apply the WAL from the new cluster, starting from the checkpoint created at failover. (Strictly speaking, pg_rewind doesn't apply the WAL, it just creates a backup label file indicating that when PostgreSQL is started, it will start replay from that checkpoint and apply all the required WAL.)
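The idea behind steps 1) to 3) can be illustrated with a toy model. This is not how pg_rewind is implemented: here each cluster is a plain directory, every file under base/ stands for one data block, top-level files stand for the non-relation files, and a text file of block names stands in for the list gathered from the old cluster's WAL. All names (`rewind_sketch`, the directory layout, the list file) are hypothetical, and step 4) is left out because WAL replay happens when the server starts.

```shell
# Toy model only: directories play the role of clusters, files under base/
# play the role of data blocks, top-level files are "everything else".
# Arguments: old cluster dir, new cluster dir, file listing touched blocks.
rewind_sketch() {
    old=$1 new=$2 touched=$3

    # Step 2: copy only the blocks that the old cluster changed after the fork.
    sort -u "$touched" | while IFS= read -r block; do
        if [ -n "$block" ]; then
            cp "$new/base/$block" "$old/base/$block"
        fi
    done

    # Step 3: copy all non-relation files (here: regular files at the top level).
    for f in "$new"/*; do
        if [ -f "$f" ]; then
            cp "$f" "$old/"
        fi
    done
}
```

Blocks that are not in the list are never read or copied, which is where the time saving over a full rebuild comes from.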