Official example
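We commonly use pgpool-II to provide high availability for streaming replication. The standby is read-only and cannot accept writes, so when the primary fails, the standby has to be promoted to primary automatically and take over the service.
Other high-availability tools offer this as well. pgpool-II provides the failover_command setting in its configuration file pgpool.conf, which lets the user configure a script that is executed whenever a failover occurs.
This example uses PostgreSQL 12 + pgpool-II 4.
Exercise goals: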
Hostname | Role | IP | Port | Data directory |
---|---|---|---|---|
node3 | pgpool | 192.168.1.221 | 9999 | |
node3 | primary | 192.168.1.221 | 6000 | /data1/postgres/data |
node4 | standby | 192.168.1.202 | 6000 | /data1/postgres/data |
User | Password | Purpose |
---|---|---|
postgres | 123456 | Used for online recovery |
replica | replica | Streaming replication user |
pgpool | 123456 | Pgpool-II health check (health_check_user) and replication delay check (sr_check_user) |
N/A
See "pgpool-II installation".
This example involves online recovery, so pgpool_recovery must be installed.
    # Run on the primary
    psql -c "CREATE EXTENSION pgpool_recovery" template1
Run the following on the primary node.
Create the database users
    ALTER USER postgres PASSWORD '123456';
    CREATE ROLE pgpool WITH LOGIN PASSWORD '123456';
    CREATE ROLE replica WITH REPLICATION LOGIN PASSWORD 'replica';
    -- To show the "replication_state" and "replication_sync_state" columns in the
    -- SHOW POOL NODES result, the pgpool role must be a PostgreSQL superuser
    -- or a member of the pg_monitor group (Pgpool-II 4.1 or later).
    GRANT pg_monitor TO pgpool;
Configure archiving
Setting up streaming replication does not require archiving, but online recovery needs the archived WAL.
    $ mkdir /data1/archivedir
    $ vi postgresql.conf
    archive_mode = on
    archive_command = 'cp %p /data1/archivedir/%f'
    wal_log_hints = on
    # Run on the standby
    # As the root OS user on 192.168.1.202, create the PostgreSQL working directory
    mkdir -p /data1/postgres/data
    chown -R postgres:postgres /data1/postgres/data
    chmod 700 /data1/postgres/data
    # As the postgres OS user, run pg_basebackup to copy the primary
    pg_basebackup -F p -R --progress -D /data1/postgres/data -h 192.168.1.221 -p 6000 -U replica
    # As the postgres OS user, start the standby
    pg_ctl start
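As described under the implementation principle above, Pgpool-II's automatic failover and online recovery both require the pgpool service to run commands on every machine without a password prompt. Here we use the postgres OS user for this.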
    # Run on the pgpool node
    $ cd ~/.ssh
    $ ssh-keygen -t rsa -f id_rsa_pgpool
    $ ssh-copy-id -i id_rsa_pgpool.pub postgres@node3
    $ ssh-copy-id -i id_rsa_pgpool.pub postgres@node4
    # Verify passwordless login
    $ ssh postgres@serverX -i ~/.ssh/id_rsa_pgpool
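The verification step can be repeated for every backend in one go. The following is a minimal sketch, assuming the key file and host names used in this walkthrough; `check_ssh_nodes` is a hypothetical helper name, not something shipped with pgpool-II.

```shell
# Sketch: confirm that key-based SSH works for every backend host.
# check_ssh_nodes is a hypothetical helper; key path and hosts are passed in.
check_ssh_nodes() {
    key=$1
    shift
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 -i "$key" "postgres@$host" true </dev/null; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done
    # Non-zero exit status if any host could not be reached without a password.
    return "$failed"
}
```

It could be called as `check_ssh_nodes ~/.ssh/id_rsa_pgpool node3 node4`; any host reported as FAILED would also break the failover script later on.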
See "pgpool configuration" for reference. Here we install as the postgres OS user.
Set the environment variables
    export PGHOME=/opt/pg12
    export PGDATA=/data1/postgres/data
    export PGPOOLHOME=/opt/pgpool
    export PATH=$PGHOME/bin:$PATH:$HOME/bin:$PGPOOLHOME/bin
1. Set up pcp.conf, the user/password file for pcp administration
The user/password used here is "pcpadm/pgpool123".
    # 1. Enter the configuration directory
    [postgres@node3 ~]$ cd /opt/pgpool/etc
    [postgres@node3 etc]$ cp pcp.conf.sample pcp.conf
    # Each line of this file holds one user/password pair:
    # USERID:MD5PASSWD
    # 2. Hash the password with pg_md5; the password here is pgpool123
    [postgres@node3 etc]$ pg_md5 pgpool123
    fa039bd52c3b2090d86b0904021a5e33
    # 3. Edit pcp.conf; the user configured here is pcpadm
    [postgres@node3 etc]$ vi pcp.conf
    # USERID:MD5PASSWD
    pcpadm:fa039bd52c3b2090d86b0904021a5e33
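To make the USERID:MD5PASSWD format concrete, here is a minimal sketch that builds such a line without pg_md5. It assumes that the plain form of pg_md5 prints the unsalted MD5 hex digest of the password and that GNU coreutils `md5sum` is available; `pcp_conf_entry` is a hypothetical helper name.

```shell
# Sketch: build a pcp.conf line (USERID:MD5PASSWD).
# Assumption: md5sum from GNU coreutils is on PATH.
pcp_conf_entry() {
    user=$1
    password=$2
    # printf avoids the trailing newline that echo would add to the hashed input.
    hash=$(printf '%s' "$password" | md5sum | cut -d' ' -f1)
    printf '%s:%s\n' "$user" "$hash"
}

pcp_conf_entry demo hello
# prints: demo:5d41402abc4b2a76b9719d911017c592
```

The output line can be appended to pcp.conf as it is. In practice pg_md5 remains the safer choice, since it ships with pgpool-II.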
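2. Configure pool_hba.conf
This file controls how users are authenticated, for example by restricting client IPs, and is similar to PostgreSQL's pg_hba.conf.
    [postgres@node3 ~]$ cd /opt/pgpool/etc/
    [postgres@node3 etc]$ vi pool_hba.conf
    # Add the following line
    host all all 0.0.0.0/0 md5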
3. Generate pool_passwd
This is pgpool's password file; connections made through pgpool require user authentication.
For now we use the database user pgpool.
    [postgres@node3 ~]$ cd /opt/pgpool/etc/
    [postgres@node3 etc]$ pg_md5 --md5auth -u pgpool -p
    password:
    [postgres@node3 etc]$ ll pool_passwd
    -rw-r--r--. 1 postgres postgres 132 Nov 30 10:43 pool_passwd
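For md5 authentication, a pool_passwd entry follows PostgreSQL's md5 scheme: the literal prefix md5 followed by the MD5 digest of the password concatenated with the user name. The sketch below reproduces that format by hand, assuming `md5sum` is available; `pool_passwd_entry` is a hypothetical helper name, and `pg_md5 --md5auth` remains the supported way to write the file.

```shell
# Sketch: build a "username:md5<hash>" line in PostgreSQL's md5 password format.
# Assumption: md5sum from GNU coreutils is on PATH.
pool_passwd_entry() {
    user=$1
    password=$2
    # The digest covers the password followed by the user name.
    hash=$(printf '%s%s' "$password" "$user" | md5sum | cut -d' ' -f1)
    printf '%s:md5%s\n' "$user" "$hash"
}
```

Because the user name is part of the hashed input, the same password yields a different entry for every user, so an entry cannot be copied from one user to another.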
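4. Configure .pgpass
To use pgpool-II for automatic failover or online recovery, the PostgreSQL servers must be able to connect to one another's databases without a password prompt. (Online recovery means that after the primary fails and a failover happens, the old primary is repaired and rejoins as a standby. Note that it is online recovery, not automatic recovery: the recovery command has to be run by hand.) To satisfy this, we create a .pgpass file in the postgres user's home directory on every PostgreSQL server and set its permissions to 600.
    # su - postgres
    $ vi /var/lib/pgsql/.pgpass
    server1:5432:replication:repl:<repl user password>
    server2:5432:replication:repl:<repl user password>
    server3:5432:replication:repl:<repl user password>
    $ chmod 600 /var/lib/pgsql/.pgpass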
If pg_hba.conf is set to trust (no password check) for this subnet, this step can be skipped.
    host replication replica 192.168.1.0/24 trust
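The .pgpass example uses placeholder host names. As a rough sketch for the two-node layout of this walkthrough (port 6000, user replica), the file could be generated like this; `write_pgpass` is a hypothetical helper name and the target path is up to the reader.

```shell
# Sketch: write one replication entry per host into a .pgpass-style file.
# Line format: hostname:port:database:username:password
write_pgpass() {
    file=$1 port=$2 user=$3 password=$4
    shift 4
    (
        umask 077          # a newly created file is readable by its owner only
        : > "$file"        # start from an empty file
        for host in "$@"; do
            printf '%s:%s:replication:%s:%s\n' "$host" "$port" "$user" "$password" >> "$file"
        done
    )
    # libpq ignores a .pgpass file whose permissions are looser than 0600.
    chmod 600 "$file"
}
```

For example, `write_pgpass ~/.pgpass 6000 replica replica 192.168.1.221 192.168.1.202` would cover both nodes. Note that it overwrites any existing file at that path.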
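5. Configure .pcppass for pcp
When a follow_master_command script is needed, that script has to run pcp commands without entering a password, so we create .pcppass in the postgres user's home directory.
    # Format: hostname:port:username:password
    # The user and password must match an entry in pcp.conf (pcpadm/pgpool123 in this setup)
    $ echo 'localhost:9898:pcpadm:pgpool123' > ~/.pcppass
    $ chmod 600 ~/.pcppass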
6. Configure pgpool.conf
    listen_addresses = '*'
    port = 9999
    backend_hostname0 = '192.168.1.221'
    backend_port0 = 6000
    backend_weight0 = 1
    backend_data_directory0 = '/data1/postgres/data'
    backend_flag0 = 'ALLOW_TO_FAILOVER'
    backend_application_name0 = 'server0'
    backend_hostname1 = '192.168.1.202'
    backend_port1 = 6000
    backend_weight1 = 1
    backend_data_directory1 = '/data1/postgres/data'
    backend_flag1 = 'ALLOW_TO_FAILOVER'
    backend_application_name1 = 'server1'
    enable_pool_hba = on
    pool_passwd = 'pool_passwd'
    pid_file_name = '/opt/pgpool/pgpool.pid'
    logdir = '/opt/pgpool'
    replication_mode = off
    load_balance_mode = on
    master_slave_mode = on
    master_slave_sub_mode = 'stream'
    sr_check_period = 10
    sr_check_user = 'pgpool'
    sr_check_password = '123456'
    sr_check_database = 'postgres'
    delay_threshold = 10000000
    health_check_period = 5
    health_check_user = 'pgpool'
    health_check_password = '123456'
    health_check_database = 'postgres'
    health_check_max_retries = 3
    failover_command = '/opt/pgpool/failover.sh %d %h %p %D %m %H %M %P %r %R %N %S'
    # With 3 PostgreSQL servers, follow_primary_command must be specified; it runs after a failover of the primary node.
    # With two PostgreSQL servers, the follow_primary_command setting is not necessary.
    # (Before Pgpool-II 4.2 this parameter is named follow_master_command.)
    # follow_primary_command = '/opt/pgpool/follow_primary.sh %d %h %p %D %m %H %M %P %r %R'
    # online recovery
    recovery_user = 'postgres'
    recovery_password = '123456'
    recovery_1st_stage_command = ''
    recovery_2nd_stage_command = ''
    recovery_timeout = 90
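With numbered backend_* settings, a mistyped index or port is easy to miss. The following is a small sketch that lists the backend id, host, and port found in a pgpool.conf so they can be compared with the server table; `list_backends` is a hypothetical helper name, and it assumes each setting starts at the beginning of a line with the host in single quotes, as in the configuration above.

```shell
# Sketch: print "id host port" for each backend_hostnameN / backend_portN pair.
list_backends() {
    conf=$1
    # Extract the numeric suffix and the quoted host name.
    sed -n "s/^backend_hostname\([0-9][0-9]*\) *= *'\([^']*\)'.*/\1 \2/p" "$conf" |
    while read -r id host; do
        # Look up the port that carries the same numeric suffix.
        port=$(sed -n "s/^backend_port$id *= *\([0-9][0-9]*\).*/\1/p" "$conf")
        printf '%s %s %s\n' "$id" "$host" "$port"
    done
}
```

Running `list_backends /opt/pgpool/etc/pgpool.conf` against the configuration above should list nodes 0 and 1 with port 6000; an empty third column would point at a missing backend_portN line.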
7. Configure the failover_command script
    [postgres@node3 ~]$ cd $PGPOOLHOME
    [postgres@node3 pgpool]$ cp etc/failover.sh.sample failover.sh
    # Edit the PGHOME variable in the script
    [postgres@node3 pgpool]$ vi failover.sh
    [postgres@node3 pgpool]$ chmod +x failover.sh
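To show what the script receives, here is a simplified sketch of the decision step only. It is not the shipped failover.sh.sample: `failover_action` is a hypothetical function, it prints the promote command instead of running it, and it assumes the arguments arrive in the order of the placeholders configured in failover_command (%d %h %p %D %m %H %M %P %r %R %N %S), with %m negative when no node is left.

```shell
# Sketch: decide what a failover script should do from its positional arguments.
failover_action() {
    failed_node_id=$1        # %d  id of the node that went down
    new_main_node_id=$5      # %m  id of the node chosen as the new main node
    new_main_host=$6         # %H  host name of the new main node
    old_primary_node_id=$8   # %P  id of the primary before the failure
    new_main_pgdata=${10}    # %R  data directory of the new main node

    if [ "$new_main_node_id" -lt 0 ]; then
        echo "skip: all nodes are down"
    elif [ "$failed_node_id" != "$old_primary_node_id" ]; then
        echo "skip: a standby went down, the primary is still running"
    else
        # A real script would execute this over passwordless SSH.
        echo "promote: ssh postgres@$new_main_host pg_ctl -D $new_main_pgdata -w promote"
    fi
}
```

With the two nodes of this walkthrough, a failure of node 0 (the primary) would lead to a promote command for 192.168.1.202, while a failure of node 1 would be skipped.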
    [postgres@node3 ~]$ pgpool -n > /tmp/pgpool.log &
    [postgres@node3 ~]$ psql -p 9999 postgres pgpool
    2020-12-01 14:50:09: pid 2422: LOG: new connection received
    2020-12-01 14:50:09: pid 2422: DETAIL: connecting host=[local]
    psql (12.2)
    Type "help" for help.
    postgres=> show pool_nodes;
     node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
    ---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
     0       | 192.168.1.221 | 6000 | up     | 0.500000  | primary | 0          | false             | 0                 |                   |                        | 2020-12-01 14:38:09
     1       | 192.168.1.202 | 6000 | up     | 0.500000  | standby | 0          | true              | 0                 |                   |                        | 2020-12-01 14:38:09
    (2 rows)
First we stop the primary and check whether the standby is promoted to primary.
    [postgres@node3 ~]$ pg_ctl stop
    waiting for server to shut down..... done
    server stopped
    # Check the node information again
    [postgres@node3 ~]$ psql -p 9999 postgres pgpool
    2020-12-01 14:53:57: pid 2591: LOG: new connection received
    2020-12-01 14:53:57: pid 2591: DETAIL: connecting host=[local]
    psql (12.2)
    Type "help" for help.
    postgres=> show pool_nodes;
     node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
    ---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
     0       | 192.168.1.221 | 6000 | down   | 0.500000  | standby | 0          | false             | 0                 |                   |                        | 2020-12-01 14:53:07
     1       | 192.168.1.202 | 6000 | up     | 0.500000  | primary | 0          | true              | 0                 |                   |                        | 2020-12-01 14:53:07
    (2 rows)
Test result: the standby was successfully promoted to the new primary.
The query result above shows that the role of node_id=1 has changed to "primary".
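The same check can be scripted instead of read by eye. This is a minimal sketch that picks the node whose status is up and whose role is primary from unaligned psql output; `current_primary` is a hypothetical helper name, and the column positions are an assumption taken from the `show pool_nodes` output shown in this article (1 node_id, 2 hostname, 3 port, 4 status, 6 role).

```shell
# Sketch: print "node_id hostname port" of the node that is up and primary.
# Input: "show pool_nodes" rows in unaligned form (fields separated by "|") on stdin.
current_primary() {
    awk -F'|' '$4 == "up" && $6 == "primary" { print $1, $2, $3 }'
}
```

It could be fed with `psql -At -p 9999 -c 'show pool_nodes' postgres pgpool | current_primary`. After the failover above, this would be expected to report node 1 on 192.168.1.202.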
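Now we add the old primary back to the cluster as a standby. Online recovery will be demonstrated later; for now we do it by hand.
1. Synchronize the timeline
When the standby on 192.168.1.202 was promoted to the new primary, its timeline was incremented by 1, so it has diverged from 192.168.1.221. pg_rewind is needed to bring the old primary's data back in sync.
    [postgres@node3 ~]$ pg_rewind --target-pgdata $PGDATA --source-server='host=192.168.1.202 port=6000 user=postgres dbname=postgres password=123456'
    pg_rewind: servers diverged at WAL location 0/18000000 on timeline 1
    pg_rewind: rewinding from last common checkpoint at 0/17000148 on timeline 1
    pg_rewind: Done!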
2. Configure postgresql.conf
    # On 192.168.1.221
    $ cd $PGDATA
    $ touch standby.signal
    $ vi postgresql.conf
    primary_conninfo = 'host=192.168.1.202 port=6000 user=replica'
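The two manual edits can also be wrapped in a small function. This is only a sketch of the step above, assuming PostgreSQL 12 or later (where standby.signal replaces recovery.conf); `configure_standby` is a hypothetical helper name, and it appends to postgresql.conf instead of opening it in an editor.

```shell
# Sketch: mark a data directory as a standby and point it at the new primary.
configure_standby() {
    pgdata=$1 host=$2 port=$3 user=$4
    # An empty standby.signal makes the server start in standby mode.
    touch "$pgdata/standby.signal"
    # Appended at the end, so it takes precedence over an earlier primary_conninfo line.
    printf "primary_conninfo = 'host=%s port=%s user=%s'\n" "$host" "$port" "$user" >> "$pgdata/postgresql.conf"
}
```

For the old primary in this walkthrough the call would be `configure_standby "$PGDATA" 192.168.1.202 6000 replica`, run while the server is stopped.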
3. Start PostgreSQL
    [postgres@node3 ~]$ pg_ctl start
Online recovery will be covered in a later part. To be continued...