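We use repmgr in production to build high-availability clusters for PostgreSQL databases. While converting one production database to a high-availability setup, standby clone failed and the standby could not be cloned. The error is reproduced in the test environment below.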
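The cause comes first, because readers who know PostgreSQL and pg_basebackup well can probably work out a fix on their own and need not read further. The tablespace was created with its location inside the $PGDATA directory. repmgr's standby clone calls pg_basebackup without specifying an output format, so the default, plain, is used. When it copies the primary, pg_basebackup copies every file and directory under PGDATA and also the tablespace directory itself, so the same path is created twice and the clone fails with File exists.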
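A quick way to spot this situation before cloning is to compare each tablespace location with the data directory. The following is a minimal sketch, not part of repmgr: `path_is_inside` is a hypothetical helper name, and it assumes both paths are absolute and already normalized.

```shell
#!/bin/sh
# Return 0 when path $1 lies strictly inside directory $2, 1 otherwise.
# Pure string comparison: both paths are assumed to be absolute and
# already normalized (no "..", no symlinks).
path_is_inside() {
    child=${1%/}    # drop one trailing slash, if any
    parent=${2%/}
    case "$child" in
        "$parent"/*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Hypothetical usage on the primary: list every tablespace location and
# flag the ones under the data directory. Connection options are omitted.
#
#   pgdata=$(psql -Atc "show data_directory")
#   psql -Atc "select pg_tablespace_location(oid) from pg_tablespace" |
#   while read -r loc; do
#       [ -n "$loc" ] && path_is_inside "$loc" "$pgdata" &&
#           echo "inside PGDATA: $loc"
#   done
```

With the paths from this article, `path_is_inside /home/postgres/data/pg_tbs/tbs_mydb /home/postgres/data` succeeds, which is exactly the layout that breaks the clone.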
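Option 1 (moving the tablespace out of the data directory, outlined at the end) requires changes on the primary, so it is not recommended in production unless the impact on applications is acceptable. The test below therefore takes the other route: clone into a temporary directory, then move the files into place.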
Add a tablespace, create a database, and write to a table
    postgres=# create user pguser login password 'pguser';
    CREATE ROLE
    postgres=# create tablespace tbs_mydb owner pguser location '/home/postgres/data/pg_tbs/tbs_mydb';
    WARNING: tablespace location should not be inside the data directory
    CREATE TABLESPACE
    postgres=# create database mydb with owner=pguser template=template0 encoding='UTF8' tablespace =tbs_mydb;
    CREATE DATABASE
    postgres=# grant all on database mydb to pguser with grant option;
    GRANT
    postgres=# grant all on tablespace tbs_mydb to pguser;
    GRANT
    postgres=# \c mydb pguser
    You are now connected to database "mydb" as user "pguser".
    mydb=> create table t1 (id int);
    CREATE TABLE
    mydb=> insert into t1 values(1);
    INSERT 0 1
    mydb=> select * from t1;
     id
    ----
      1
    (1 row)
First standby clone attempt: it fails with the same error as in production
    INFO: checking and correcting permissions on existing directory "/home/postgres/data"
    NOTICE: starting backup (using pg_basebackup)...
    HINT: this may take some time; consider using the -c/--fast-checkpoint option
    INFO: executing:
      /usr/local/pgsql/bin/pg_basebackup -l "repmgr base backup" -D /home/postgres/data -h 192.168.56.111 -p 6000 -U repmgr -X stream
    pg_basebackup: could not create directory "/home/postgres/data/pg_tbs": File exists
    pg_basebackup: removing contents of data directory "/home/postgres/data"
    pg_basebackup: changes to tablespace directories will not be undone
    ERROR: unable to take a base backup of the source server
    HINT: data directory ("/home/postgres/data") may need to be cleaned up manually
In repmgr.conf, point data_directory at a temporary location: data_directory='/home/postgres/repmgr'
Second standby clone attempt: it succeeds
    [postgres@repmgr2 ~]$ repmgr -h 192.168.56.111 -U repmgr -d repmgr -f ~/repmgr.conf standby clone -p6000
    NOTICE: destination directory "/home/postgres/repmgr" provided
    INFO: connecting to source node
    DETAIL: connection string is: host=192.168.56.111 user=repmgr port=6000 dbname=repmgr
    DETAIL: current installation size is 45 MB
    DEBUG: 1 node records returned by source node
    DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=192.168.56.111 port=6000 fallback_application_name=repmgr options=-csearch_path="
    DEBUG: upstream_node_id determined as 111
    INFO: replication slot usage not requested; no replication slot will be set up for this standby
    NOTICE: checking for available walsenders on the source node (2 required)
    NOTICE: checking replication connections can be made to the source server (2 required)
    INFO: checking and correcting permissions on existing directory "/home/postgres/repmgr"
    NOTICE: starting backup (using pg_basebackup)...
    HINT: this may take some time; consider using the -c/--fast-checkpoint option
    INFO: executing:
      /usr/local/pgsql/bin/pg_basebackup -l "repmgr base backup" -D /home/postgres/repmgr -h 192.168.56.111 -p 6000 -U repmgr -X stream
    DEBUG: create_recovery_file(): creating "/home/postgres/repmgr/recovery.conf"...
    DEBUG: recovery.conf line: standby_mode = 'on'
    DEBUG: recovery.conf line: primary_conninfo = 'host=192.168.56.111 user=repmgr port=6000 application_name=repmgr2 connect_timeout=2'
    DEBUG: recovery.conf line: recovery_target_timeline = 'latest'
    NOTICE: standby clone (using pg_basebackup) complete
    NOTICE: you can now start your PostgreSQL server
    HINT: for example: pg_ctl -D /home/postgres/repmgr start
    HINT: after starting the server, you need to register this standby with "repmgr standby register"
Restore the original setting in repmgr.conf, then mv everything under the repmgr directory into the data directory
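    data_directory='/home/postgres/data'

    [postgres@repmgr2 repmgr]$ mv * ~/data/
    mv: cannot move ‘pg_tbs’ to ‘/home/postgres/data/pg_tbs’: File exists

The mv error for pg_tbs is expected: in plain format pg_basebackup writes a tablespace to the same absolute path as on the primary, so /home/postgres/data/pg_tbs already existed before the move.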
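The move step can be made a little more explicit. The sketch below is one possible approach, not what the original test ran: `move_missing_entries` is a hypothetical helper that moves each top-level entry and reports, instead of failing on, names that already exist in the target. It assumes that an entry already present in the target (here pg_tbs) should be left alone; any skipped entry is worth inspecting by hand.

```shell
#!/bin/sh
# Move every top-level entry of $1 into $2, skipping (and reporting) names
# that already exist in $2 instead of letting mv fail on them.
# Note: the glob does not match dotfiles.
move_missing_entries() {
    src=$1
    dst=$2
    for entry in "$src"/*; do
        [ -e "$entry" ] || continue          # empty directory: glob did not expand
        name=$(basename "$entry")
        if [ -e "$dst/$name" ]; then
            echo "skipped (already exists): $name"
        else
            mv "$entry" "$dst/"
        fi
    done
}
```

For the directories used in this article the call would be `move_missing_entries /home/postgres/repmgr /home/postgres/data`.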
Change the cluster_name parameter in the configuration file and start the database
    [postgres@repmgr2 data]$ pg_ctl -D /home/postgres/data/ start
    waiting for server to start....2021-02-28 10:09:15.905 CST [3498] LOG: listening on IPv4 address "0.0.0.0", port 6000
    2021-02-28 10:09:15.912 CST [3498] LOG: listening on Unix socket "/tmp/.s.PGSQL.6000"
    2021-02-28 10:09:15.949 CST [3498] LOG: redirecting log output to logging collector process
    2021-02-28 10:09:15.949 CST [3498] HINT: Future log output will appear in directory "log".
    . done
    server started
Register the standby: registration succeeds
    [postgres@repmgr2 data]$ repmgr -f ../repmgr.conf standby register
    INFO: connecting to local node "repmgr2" (ID: 113)
    DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=192.168.56.113 port=6000 fallback_application_name=repmgr options=-csearch_path="
    INFO: connecting to primary database
    DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=192.168.56.111 port=6000 fallback_application_name=repmgr options=-csearch_path="
    WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 111)
    INFO: standby registration complete
    NOTICE: standby node "repmgr2" (ID: 113) successfully registered
Check the cluster status
    [postgres@repmgr2 data]$ repmgr -f ../repmgr.conf cluster show
    DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=192.168.56.113 port=6000 fallback_application_name=repmgr options=-csearch_path="
    DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=192.168.56.111 port=6000 fallback_application_name=repmgr options=-csearch_path="
    DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=192.168.56.113 port=6000 fallback_application_name=repmgr options=-csearch_path="
    DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=192.168.56.111 port=6000 fallback_application_name=repmgr options=-csearch_path="
     ID  | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
    -----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------
     111 | repmgr1 | primary | * running |          | default  | 100      | 5        | host=192.168.56.111 port=6000 user=repmgr dbname=repmgr connect_timeout=2
     113 | repmgr2 | standby |   running | repmgr1  | default  | 100      | 5        | host=192.168.56.113 port=6000 user=repmgr dbname=repmgr connect_timeout=2
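If this status check needs to run from a script, the table printed by cluster show can be tested mechanically. The sketch below is only an illustration: `all_nodes_running` is a hypothetical helper, and it assumes the pipe-separated layout shown in this article, with Status as the fourth column and a healthy node reported as running (optionally prefixed with `*`).

```shell
#!/bin/sh
# Read "repmgr cluster show" output on stdin; exit 0 only if at least one
# node row was seen and every Status column is exactly "running"
# (a leading "*" marker is ignored).
all_nodes_running() {
    awk -F'|' '
        # Node rows start with a numeric ID in the first column.
        $1 ~ /^ *[0-9]+ *$/ {
            rows++
            status = $4
            gsub(/^[ *]+| +$/, "", status)   # trim spaces and the "*" marker
            if (status != "running") bad++
        }
        END { exit !(rows > 0 && bad == 0) }
    '
}
```

A possible call is `repmgr -f ../repmgr.conf cluster show | all_nodes_running && echo "cluster healthy"`; DEBUG lines and the header are ignored because their first field is not a numeric ID.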
Insert test data on the primary
    mydb=> insert into t1 values(2);
    INSERT 0 1
    mydb=> select * from t1;
     id
    ----
      1
      2
    (2 rows)
Query the standby
    [postgres@repmgr2 data]$ psql
    psql (10.11)
    Type "help" for help.

    postgres=# \c mydb pguser
    You are now connected to database "mydb" as user "pguser".
    mydb=> select * from t1;
     id
    ----
      1
      2
    (2 rows)
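In fact, PostgreSQL already warned, when the tablespace was created, that a tablespace location should not be inside the data directory. The error above is simply a trap left behind by a predecessor.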
WARNING: tablespace location should not be inside the data directory
For anyone who wants to try option 1, here is a rough outline:
    # Create a new tablespace
    postgres=# create tablespace zhijian owner pguser location '/data/pgdata/11/pg_tbs/tbs_zhijian';
    CREATE TABLESPACE
    # Move the database to the new tablespace
    mydb=> \c postgres postgres
    postgres=# alter database mydb set tablespace zhijian;
    ALTER DATABASE