Building a highly available database cluster with PostgreSQL 11.3 + etcd + Patroni

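1 Summary

Etcd and Patroni can be used to build a highly available PostgreSQL cluster.

Etcd is used to share information among the Patroni nodes.

Patroni monitors the state of the local PostgreSQL instance. If the primary fails, Patroni promotes a standby to become the new primary. If a failed PostgreSQL instance is later recovered, it can rejoin the cluster automatically or manually.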

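1.1 About etcd

Etcd is built on the Raft algorithm and protocol. It is a strongly consistent, distributed key-value database that gives distributed systems a reliable mechanism for storing and accessing data.

Only one etcd node is elected Leader; the other etcd nodes act as Followers.

Data in etcd is identified by key. For example, the following data

key   = /service/postgresql/leader
value = postgresql1

indicates that the primary of a PostgreSQL cluster is 'postgresql1'.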
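
As a rough illustration of how such a key can be read over HTTP, the sketch below builds the URL of the etcd v2 keys endpoint and extracts the value from a response body. It assumes the v2 API, which matches the etcdctl commands used later in this article. The helper names and the sample response are made up for illustration and were not captured from a running cluster.

```python
import json


def etcd_v2_key_url(client_url: str, key: str) -> str:
    """Build the etcd v2 HTTP URL for a key such as /service/postgresql/leader."""
    return client_url.rstrip("/") + "/v2/keys/" + key.lstrip("/")


def value_from_v2_response(body: str) -> str:
    """Extract the stored value from the JSON body of an etcd v2 GET response."""
    return json.loads(body)["node"]["value"]


# Hand-written sample in the shape of an etcd v2 GET response (an assumption,
# not output captured from the cluster described here).
SAMPLE_RESPONSE = (
    '{"action": "get", "node": '
    '{"key": "/service/postgresql/leader", "value": "postgresql1"}}'
)

if __name__ == "__main__":
    print(etcd_v2_key_url("http://192.168.56.10:2379", "/service/postgresql/leader"))
    print(value_from_v2_response(SAMPLE_RESPONSE))
```

The same URL could be fetched with curl or urllib once the cluster built in the following sections is running.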

figure1: an etcd cluster including three etcd nodes 
===================================================

     |---------------|                          |-------------|
     |etcd1<follower>| <----------+-----------> |etcd2<leader>|
     |---------------|            |             |-------------|
                                  |
                                  |
                                  |
                          |-------V-------|                            
                          |etcd3<follower>|                             
                          |---------------|

1.2 How etcd, Patroni and PostgreSQL work together

In the diagram below (figure2), three hosts (host1, host2, host3) form a PostgreSQL cluster. Etcd, Patroni and PostgreSQL are installed and deployed on every host.

figure2: a PostgreSQL cluster with 3 hosts, each host having etcd, Patroni and PostgreSQL
=====================================================================================


          .........................................................................
          . <host1>                                                               .
          .     |------------|         |-------------|           |-------------|  .  
   +--<-------->|    etcd1   |<------->|  patroni1   +---------->| postgresql1 |  .
   |      .     |------------|         |-------------|           |-------------|  .
   |      .                                                                       .
   |      .........................................................................
   |
   |
   |
   |      .........................................................................
   |      . <host2>                                                               .
   |      .     |------------|         |-------------|           |-------------|  .  
   +-<--------->|    etcd2   |<------->|  patroni2   +---------->| postgresql2 |  .
   |      .     |------------|         |-------------|           |-------------|  .
   |      .                                                                       .
   |      .........................................................................
   |
   |
   |
   |      ..........................................................................
   |      . <host3>                                                                .
   |      .     |------------|         |-------------|           |-------------|   . 
   +--<-------->|    etcd3   |<------->|  patroni3   +---------->| postgresql3 |   .
          .     |------------|         |-------------|           |-------------|   .
          .                                                                        .
          ..........................................................................

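1.2.1 Etcd: a distributed key-value database

etcd1, etcd2 and etcd3 form a distributed key-value database that patroni1, patroni2 and patroni3 read and write in order to share and pass information. Every Patroni instance can read and write the data in etcd.

1.2.2 Patroni: controls/monitors the local PostgreSQL and writes its information/state to etcd

Each Patroni instance monitors and controls its local PostgreSQL and writes the local PostgreSQL's information/state to etcd.

A Patroni instance can learn the information/state of the PostgreSQL instances on the other hosts by reading etcd.

1.2.3 Electing the PostgreSQL primary

Patroni determines whether the local PostgreSQL is eligible to act as the primary. If it is, Patroni tries to elect the local PostgreSQL as primary (Leader). The election works by updating a key in etcd (e.g. /service/postgresql/leader) to the name of the local PostgreSQL (e.g. postgresql1).

If several Patroni instances try to change the same key at the same time, only one succeeds, and its PostgreSQL becomes the primary (Leader).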
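
The "only one writer succeeds" behaviour can be pictured with a small simulation. The sketch below is not Patroni's code and does not talk to etcd: TinyKV is a hypothetical in-memory stand-in whose create_if_absent method plays the role of an atomic "create the key only if it does not exist yet" operation. Three candidates race for the leader key and exactly one of them wins.

```python
import threading


class TinyKV:
    """Hypothetical in-memory stand-in for a store with atomic create-if-absent."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def create_if_absent(self, key, value):
        """Write value only when key does not exist yet; return True on success."""
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def get(self, key):
        with self._lock:
            return self._data.get(key)


def run_election(candidates, key="/service/postgresql/leader"):
    """Let every candidate race for the key; return (store, list of winners)."""
    store = TinyKV()
    winners = []
    winners_lock = threading.Lock()

    def campaign(name):
        # Each candidate tries to put its own name into the leader key.
        if store.create_if_absent(key, name):
            with winners_lock:
                winners.append(name)

    threads = [threading.Thread(target=campaign, args=(c,)) for c in candidates]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store, winners


if __name__ == "__main__":
    store, winners = run_election(["postgresql1", "postgresql2", "postgresql3"])
    print("leader key:", store.get("/service/postgresql/leader"), "winners:", winners)
```

Which candidate wins depends on thread scheduling; the point is only that the list of winners always has exactly one entry.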

2 Test plan

2.1 OS / software / versions

  • CentOS 7.6
  • PostgreSQL 11.3
  • etcd: 3.3.11
  • python: Python 3.7.4
  • Patroni: 1.6.0
  • database user:
    • superuser:postgres
    • replication user: replicator

2.2 Hostname / IP address plan

NO.  |     IP          | HOSTNAME
-----+-----------------+---------------
 1   |  192.168.56.10  |  host1
 2   |  192.168.56.11  |  host2
 3   |  192.168.56.12  |  host3
[rudi@host1 ~]$ more /etc/hosts
192.168.56.10    host1
192.168.56.11    host2
192.168.56.12    host3

2.3 OS user / directories / files

  • Linux user: rudi
  • main folder: /home/rudi
    • Postgres data folder: /home/rudi/pgdata/
    • folder for scripts: /home/rudi/scripts/

3 Install software/modules on all hosts (host1/host2/host3)

3.1 Install etcd

yum -y install etcd libyaml

3.2 Install PostgreSQL 11.3 from source

yum -y install flex bison readline-devel zlib-devel
wget https://ftp.postgresql.org/pub/source/v11.3/postgresql-11.3.tar.gz
tar zxvf postgresql-11.3.tar.gz
cd postgresql-11.3/
./configure
make
su
make install

3.3 Install Python 3 from source and install the required Python modules

yum install libffi-devel openssl-devel

wget https://www.python.org/ftp/python/3.7.4/Python-3.7.4.tgz
tar zxvf Python-3.7.4.tgz 
cd Python-3.7.4/
./configure
make
su 
make install
python3 -m pip install --upgrade pip
python3 -m pip install psycopg2_binary
python3 -m pip install patroni[etcd]

4 Prepare directories/files

4.1 Create directories on all hosts (host1/host2/host3): the database directory and the configuration/script directory

cd ~
mkdir ~/scripts
mkdir ~/pgdata

4.2 Create/edit the etcd startup scripts

4.2.1 host1: etcd1.sh

cd ~/scripts
vim etcd1.sh
etcd --name etcdnode1 --initial-advertise-peer-urls http://192.168.56.10:2380 \
  --listen-peer-urls http://192.168.56.10:2380 \
  --listen-client-urls http://192.168.56.10:2379,http://127.0.0.1:2379 \
  --advertise-client-urls http://192.168.56.10:2379 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-cluster etcdnode1=http://192.168.56.10:2380,etcdnode2=http://192.168.56.11:2380,etcdnode3=http://192.168.56.12:2380 \
  --initial-cluster-state new

4.2.2 host2: etcd2.sh

cd ~/scripts
vim etcd2.sh
etcd --name etcdnode2 --initial-advertise-peer-urls http://192.168.56.11:2380 \
--listen-peer-urls http://192.168.56.11:2380 \
--listen-client-urls http://192.168.56.11:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://192.168.56.11:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster etcdnode1=http://192.168.56.10:2380,etcdnode2=http://192.168.56.11:2380,etcdnode3=http://192.168.56.12:2380 \
--initial-cluster-state new

4.2.3 host3: etcd3.sh

cd ~/scripts
vim etcd3.sh
etcd --name etcdnode3 --initial-advertise-peer-urls http://192.168.56.12:2380 \
--listen-peer-urls http://192.168.56.12:2380 \
--listen-client-urls http://192.168.56.12:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://192.168.56.12:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster etcdnode1=http://192.168.56.10:2380,etcdnode2=http://192.168.56.11:2380,etcdnode3=http://192.168.56.12:2380 \
--initial-cluster-state new
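
The three scripts differ only in the node name and the IP address. As a sketch, the argument list for each node can be derived from a single node plan. The flag names and values are copied from the scripts above, while the function name etcd_args and the NODES dictionary are assumptions made for this illustration.

```python
# IP addresses from the plan in section 2.2; node names as used in the scripts.
NODES = {
    "etcdnode1": "192.168.56.10",
    "etcdnode2": "192.168.56.11",
    "etcdnode3": "192.168.56.12",
}


def etcd_args(name, nodes=NODES, token="etcd-cluster-1"):
    """Return the etcd command line (as a list) for one node of the plan."""
    ip = nodes[name]
    # Every node gets the same --initial-cluster string listing all peers.
    cluster = ",".join(f"{n}=http://{addr}:2380" for n, addr in nodes.items())
    return [
        "etcd",
        "--name", name,
        "--initial-advertise-peer-urls", f"http://{ip}:2380",
        "--listen-peer-urls", f"http://{ip}:2380",
        "--listen-client-urls", f"http://{ip}:2379,http://127.0.0.1:2379",
        "--advertise-client-urls", f"http://{ip}:2379",
        "--initial-cluster-token", token,
        "--initial-cluster", cluster,
        "--initial-cluster-state", "new",
    ]


if __name__ == "__main__":
    for node in NODES:
        print(" ".join(etcd_args(node)))
```

Generating the arguments from one table makes it harder for the three scripts to drift apart, for example through a typo in one copy of the --initial-cluster string.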

4.3 Create/edit the Patroni configuration files

4.3.1 host1: postgresql1.yml

cd ~/scripts
vim postgresql1.yml
scope: postgresql
namespace: /service/
name: postgresql1

restapi:
    listen: 192.168.56.10:8008
    connect_address: 192.168.56.10:8008

etcd:
    host: 192.168.56.10:2379

bootstrap:
    dcs:
        ttl: 30
        loop_wait: 10
        retry_timeout: 10
        maximum_lag_on_failover: 1048576
        postgresql:
            use_pg_rewind: true

    initdb:
    - encoding: UTF8
    - data-checksums

    pg_hba:
    - host replication replicator 127.0.0.1/32 md5
    - host replication replicator 192.168.56.10/32 md5
    - host replication replicator 192.168.56.11/32 md5
    - host replication replicator 192.168.56.12/32 md5
    - host all all 0.0.0.0/0 md5

    users:
        admin:
            password: admin
            options:
                - createrole
                - createdb
postgresql:
    listen: 192.168.56.10:15432
    connect_address: 192.168.56.10:15432
    bin_dir: /usr/local/pgsql/bin
    data_dir: /home/rudi/pgdata
    pgpass: /tmp/pgpass1
    authentication:
        replication:
            username: replicator
            password: rep-pass
        superuser:
            username: postgres
            password: secretpassword
    parameters:
        unix_socket_directories: '.'
        synchronous_commit: "on"
        synchronous_standby_names: "*"

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false
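
The mask length in a pg_hba entry decides how many client addresses the rule covers. The sketch below uses Python's ipaddress module purely as an illustration (PostgreSQL does its own matching, and rule_matches is a hypothetical helper): a /32 mask covers exactly one host, while a /0 mask covers every IPv4 address regardless of the address written in front of it.

```python
import ipaddress


def rule_matches(rule_cidr: str, client_ip: str) -> bool:
    """Return True when client_ip falls inside the network given by rule_cidr.

    strict=False accepts an address with host bits set, e.g. "192.168.56.10/0".
    """
    network = ipaddress.ip_network(rule_cidr, strict=False)
    return ipaddress.ip_address(client_ip) in network


if __name__ == "__main__":
    for cidr in ("192.168.56.10/32", "192.168.56.10/0"):
        for client in ("192.168.56.10", "192.168.56.11", "10.0.0.5"):
            print(cidr, client, rule_matches(cidr, client))
```

For replication rules that should admit only the three cluster hosts, a /32 mask per host is the narrower choice.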

4.3.2 host2: postgresql2.yml

cd ~/scripts
vim postgresql2.yml
scope: postgresql
namespace: /service/
name: postgresql2

restapi:
    listen: 192.168.56.11:8008
    connect_address: 192.168.56.11:8008

etcd:
    host: 192.168.56.11:2379

bootstrap:
    dcs:
        ttl: 30
        loop_wait: 10
        retry_timeout: 10
        maximum_lag_on_failover: 1048576
        postgresql:
            use_pg_rewind: true

    initdb:
    - encoding: UTF8
    - data-checksums

    pg_hba:
    - host replication replicator 127.0.0.1/32 md5
    - host replication replicator 192.168.56.10/32 md5
    - host replication replicator 192.168.56.11/32 md5
    - host replication replicator 192.168.56.12/32 md5
    - host all all 0.0.0.0/0 md5

    users:
        admin:
            password: admin
            options:
                - createrole
                - createdb

postgresql:
    listen: 192.168.56.11:15432
    connect_address: 192.168.56.11:15432
    bin_dir: /usr/local/pgsql/bin
    data_dir: /home/rudi/pgdata
    pgpass: /tmp/pgpass1
    authentication:
        replication:
            username: replicator
            password: rep-pass
        superuser:
            username: postgres
            password: secretpassword
    parameters:
        unix_socket_directories: '.'
        synchronous_commit: "on"
        synchronous_standby_names: "*"

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false

4.3.3 host3: postgresql3.yml

cd ~/scripts
vim postgresql3.yml
scope: postgresql
namespace: /service/
name: postgresql3

restapi:
    listen: 192.168.56.12:8008
    connect_address: 192.168.56.12:8008

etcd:
    host: 192.168.56.12:2379

bootstrap:
    dcs:
        ttl: 30
        loop_wait: 10
        retry_timeout: 10
        maximum_lag_on_failover: 1048576
        postgresql:
            use_pg_rewind: true

    initdb:
    - encoding: UTF8
    - data-checksums

    pg_hba:
    - host replication replicator 127.0.0.1/32 md5
    - host replication replicator 192.168.56.10/32 md5
    - host replication replicator 192.168.56.11/32 md5
    - host replication replicator 192.168.56.12/32 md5
    - host all all 0.0.0.0/0 md5

    users:
        admin:
            password: admin
            options:
                - createrole
                - createdb

postgresql:
    listen: 192.168.56.12:15432
    connect_address: 192.168.56.12:15432
    bin_dir: /usr/local/pgsql/bin
    data_dir: /home/rudi/pgdata
    pgpass: /tmp/pgpass1
    authentication:
        replication:
            username: replicator
            password: rep-pass
        superuser:
            username: postgres
            password: secretpassword
    parameters:
        unix_socket_directories: '.'
        synchronous_commit: "on"
        synchronous_standby_names: "*"

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false
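
The numbers under bootstrap.dcs are related to each other. To my understanding, Patroni's documentation asks that loop_wait + 2 * retry_timeout stay at or below ttl, and maximum_lag_on_failover is expressed in bytes. The sketch below checks the values used in this article under those assumptions; the function names are made up for illustration and the rule should be verified against the documentation of the Patroni version in use.

```python
# Values copied from the bootstrap.dcs section of the configuration files.
DCS = {
    "ttl": 30,
    "loop_wait": 10,
    "retry_timeout": 10,
    "maximum_lag_on_failover": 1048576,
}


def timing_ok(ttl, loop_wait, retry_timeout):
    """Check the assumed rule loop_wait + 2 * retry_timeout <= ttl."""
    return loop_wait + 2 * retry_timeout <= ttl


def bytes_to_mib(n):
    """Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes)."""
    return n / (1024 * 1024)


if __name__ == "__main__":
    print("timing ok:", timing_ok(DCS["ttl"], DCS["loop_wait"], DCS["retry_timeout"]))
    print("max lag on failover (MiB):", bytes_to_mib(DCS["maximum_lag_on_failover"]))
```

With the values above the rule holds exactly (10 + 2 * 10 = 30), and a standby lagging by more than 1 MiB would not be considered for promotion.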

5 Start the cluster

5.1 Stop and disable the firewall on all hosts (host1/host2/host3)

systemctl stop firewalld
systemctl disable firewalld

5.2 Start etcd on all hosts (host1/host2/host3), in order

host1:

cd ~
source ./scripts/etcd1.sh

host2:

cd ~
source ./scripts/etcd2.sh

host3:

cd ~
source ./scripts/etcd3.sh

Check the etcd status:

[rudi@host1 ~]$ curl -L http://host2:2379/version
{"etcdserver":"3.3.11","etcdcluster":"3.3.0"}
[rudi@host1 ~]$ curl -L http://host3:2379/version
{"etcdserver":"3.3.11","etcdcluster":"3.3.0"}

[rudi@host1 ~]$ curl -L http://host1:2379/version
{"etcdserver":"3.3.11","etcdcluster":"3.3.0"}

[rudi@host1 scripts]$ etcdctl member list
51b08bf82e03e049: name=etcdnode1 peerURLs=http://192.168.56.10:2380 clientURLs=http://192.168.56.10:2379 isLeader=true
6d36a224cc993604: name=etcdnode2 peerURLs=http://192.168.56.11:2380 clientURLs=http://192.168.56.11:2379 isLeader=false
bb961ca5e3abf011: name=etcdnode3 peerURLs=http://192.168.56.12:2380 clientURLs=http://192.168.56.12:2379 isLeader=false

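5.3 Start Patroni on all hosts (host1/host2/host3)

5.3.1 Start Patroni on host1

  • Patroni1 writes the information of the local PostgreSQL (postgresql1) to etcd.
  • Patroni1 detects that the data directory (/home/rudi/pgdata/) is empty and therefore initializes the database (initdb -D /home/rudi/pgdata)
  • Patroni1 sets up the configuration files of the local database, e.g. postgresql.conf, pg_hba.conf
  • Patroni1 starts the local database (postgresql1): pg_ctl -D /home/rudi/pgdata start
  • Patroni1 makes the local database (postgresql1) the primary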
cd ~
patroni ./scripts/postgresql1.yml

5.3.2 Start Patroni on host2/host3

  • Patroni2/Patroni3 take a base backup of postgresql1 (pg_basebackup) to create their own local databases
  • Patroni2/Patroni3 set up the configuration files of their local databases, e.g. postgresql.conf, pg_hba.conf
  • Patroni2 starts postgresql2 as a standby
  • Patroni3 starts postgresql3 as a standby

host2:

cd ~
patroni ./scripts/postgresql2.yml

host3:

cd ~
patroni ./scripts/postgresql3.yml

5.3.3 Check the cluster status through the Patroni interface:

[rudi@host1 ~]$ curl -L http://host1:8008/
{"state": "running", "postmaster_start_time": "2019-09-17 04:15:30.959 EDT", "role": "master", "server_version": 110003, "cluster_unlocked": false, "xlog": {"location": 83886400}, "timeline": 1, "replication": [{"usename": "replicator", "application_name": "postgresql2", "client_addr": "192.168.56.11", "state": "streaming", "sync_state": "sync", "sync_priority": 1}, {"usename": "replicator", "application_name": "postgresql3", "client_addr": "192.168.56.12", "state": "streaming", "sync_state": "potential", "sync_priority": 1}], "database_system_identifier": "6737550116166691859", "patroni": {"version": "1.6.0", "scope": "postgresql"}}
[rudi@host1 ~]$ patronictl version
patronictl version 1.6.0

[rudi@host1 ~]$ patronictl -c ./scripts/postgresql1.yml list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 | Leader | running |  1 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  1 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 |        | running |  1 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+
[rudi@host1 ~]$ 
[rudi@host1 ~]$ patronictl -d etcd://host1:2379 list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 | Leader | running |  1 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  1 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 |        | running |  1 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+
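
The JSON document returned by the Patroni REST API at the start of this section can also be processed by a script. The sketch below reduces such a document to the role, the state and the sync state of each replica. SAMPLE is a shortened copy of the response shown above, and summarize is a hypothetical helper name.

```python
import json

# Shortened copy of the REST API response shown in this section.
SAMPLE = """{"state": "running", "role": "master", "timeline": 1,
 "replication": [
  {"application_name": "postgresql2", "state": "streaming", "sync_state": "sync"},
  {"application_name": "postgresql3", "state": "streaming", "sync_state": "potential"}],
 "patroni": {"version": "1.6.0", "scope": "postgresql"}}"""


def summarize(body):
    """Reduce a Patroni status document to role, state and replica sync states."""
    status = json.loads(body)
    # A standby reports no "replication" list, so default to an empty one.
    replicas = {
        r["application_name"]: r["sync_state"]
        for r in status.get("replication", [])
    }
    return {"role": status["role"], "state": status["state"], "replicas": replicas}


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

In the sample, postgresql2 is the synchronous standby and postgresql3 is a potential one, which matches synchronous_standby_names: "*" in the configuration.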

5.3.4 Check the cluster status through the etcd interface:

[rudi@host1 scripts]$ etcdctl ls --recursive --sort -p /service
/service/postgresql/
/service/postgresql/config
/service/postgresql/initialize
/service/postgresql/leader
/service/postgresql/members/
/service/postgresql/members/postgresql1
/service/postgresql/members/postgresql2
/service/postgresql/members/postgresql3
/service/postgresql/optime/
/service/postgresql/optime/leader

[rudi@host1 ~]$ etcdctl get /service/postgresql/leader
postgresql1

[rudi@host1 ~]$ etcdctl get /service/postgresql/members/postgresql1
{"conn_url":"postgres://192.168.56.10:15432/postgres","api_url":"http://192.168.56.10:8008/patroni","state":"running","role":"master","version":"1.6.0","xlog_location":83888056,"timeline":1}

6 Read/write test

The database can be accessed from any of the hosts (host1, host2, host3).

6.1 Write to the primary and read the data back:

export PATH=$PATH:/usr/local/pgsql/bin
psql -U postgres -d postgres -p 15432 -h host1
postgres=# create table test (id int, name varchar(100));
CREATE TABLE
postgres=# insert into test values ( 1,'1');
INSERT 0 1
postgres=# select * from test;
 id | name 
----+------
  1 | 1
(1 row)

6.2 Try to write to a standby

psql -U postgres -d postgres -p 15432 -h  host2
postgres=# insert into test values ( 1,'1');
ERROR:  cannot execute INSERT in a read-only transaction

6.3 Try to read from a standby

psql -U postgres -d postgres -p 15432 -h  host3
postgres=# select * from test;
 id | name 
----+------
  1 | 1
(1 row)

7 Kill the postmaster process on the primary

7.1 State before the kill:

  • The primary is postgresql1/host1
[rudi@host1 ~]$ patronictl -d etcd://host1:2379 list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 | Leader | running |  1 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  1 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 |        | running |  1 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+

7.2 Run the kill on host1:

[rudi@host1 ~]$ ps -ef|grep postgres
rudi      3908  3759  0 11:35 pts/5    00:00:01 /usr/local/bin/python3 /usr/local/bin/patroni ./scripts/postgresql1.yml
rudi      3929     1  0 11:35 ?        00:00:00 /usr/local/pgsql/bin/postgres -D /home/rudi/pgdata --config-file=/home/rudi/pgdata/postgresql.conf --listen_addresses=192.168.56.10 --port=15432 --cluster_name=postgresql --wal_level=replica --hot_standby=on --max_connections=100 --max_wal_senders=10 --max_prepared_transactions=0 --max_locks_per_transaction=64 --track_commit_timestamp=off --max_replication_slots=10 --max_worker_processes=8 --wal_log_hints=on
rudi      3935  3929  0 11:35 ?        00:00:00 postgres: postgresql: checkpointer
rudi      3936  3929  0 11:35 ?        00:00:00 postgres: postgresql: background writer
rudi      3937  3929  0 11:35 ?        00:00:00 postgres: postgresql: walwriter
rudi      3938  3929  0 11:35 ?        00:00:00 postgres: postgresql: autovacuum launcher
rudi      3939  3929  0 11:35 ?        00:00:00 postgres: postgresql: stats collector
rudi      3940  3929  0 11:35 ?        00:00:00 postgres: postgresql: logical replication launcher
rudi      3944  3929  0 11:35 ?        00:00:00 postgres: postgresql: postgres postgres 192.168.56.10(44044) idle
rudi      3954  3929  0 11:35 ?        00:00:00 postgres: postgresql: walsender replicator 192.168.56.11(42620) streaming 0/4019F60
rudi      3958  3929  0 11:35 ?        00:00:00 postgres: postgresql: walsender replicator 192.168.56.12(46540) streaming 0/4019F60

[rudi@host1 ~]$ kill -9 3929
[rudi@host1 ~]$

7.3 On host1, Patroni starts postgresql1 again; postgresql1 is still the primary

  • postgresql1 is starting
  • postgresql1 is still the primary; no failover took place
[rudi@host1 ~]$ patronictl -d etcd://host1:2379 list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 | Leader | running |    |   unknown |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  2 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 |        | running |  2 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+

7.4 Check the PIDs on the primary; all PostgreSQL processes have new PIDs:

[rudi@host1 ~]$ ps -ef|grep postgres
rudi      3908  3759  0 11:35 pts/5    00:00:01 /usr/local/bin/python3 /usr/local/bin/patroni ./scripts/postgresql1.yml
rudi      4034     1  0 11:46 ?        00:00:00 /usr/local/pgsql/bin/postgres -D /home/rudi/pgdata --config-file=/home/rudi/pgdata/postgresql.conf --listen_addresses=192.168.56.10 --port=15432 --cluster_name=postgresql --wal_level=replica --hot_standby=on --max_connections=100 --max_wal_senders=10 --max_prepared_transactions=0 --max_locks_per_transaction=64 --track_commit_timestamp=off --max_replication_slots=10 --max_worker_processes=8 --wal_log_hints=on
rudi      4037  4034  0 11:46 ?        00:00:00 postgres: postgresql: checkpointer
rudi      4038  4034  0 11:46 ?        00:00:00 postgres: postgresql: background writer   
rudi      4039  4034  0 11:46 ?        00:00:00 postgres: postgresql: stats collector   
rudi      4044  4034  0 11:46 ?        00:00:00 postgres: postgresql: postgres postgres 192.168.56.10(44742) idle
rudi      4049  4034  0 11:46 ?        00:00:00 postgres: postgresql: walwriter   
rudi      4050  4034  0 11:46 ?        00:00:00 postgres: postgresql: autovacuum launcher   
rudi      4051  4034  0 11:46 ?        00:00:00 postgres: postgresql: logical replication launcher   
rudi      4054  4034  0 11:46 ?        00:00:00 postgres: postgresql: walsender replicator 192.168.56.11(43266) streaming 0/50001A8
rudi      4055  4034  0 11:46 ?        00:00:00 postgres: postgresql: walsender replicator 192.168.56.12(47174) streaming 0/50001A8

7.5 Check the cluster information: postgresql1 is the primary and works normally

[rudi@host1 ~]$ patronictl -d etcd://host1:2379 list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 | Leader | running |  3 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  3 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 |        | running |  3 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+

8 Manual switchover

8.1 State before the switchover

[rudi@host1 ~]$ patronictl -d etcd://host1:2379 list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 | Leader | running |  3 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  3 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 |        | running |  3 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+

8.2 Perform the manual switchover

  • Current primary: postgresql1/host1
  • New primary to select: postgresql3/host3
[rudi@host1 ~]$ patronictl -d etcd://host1:2379 switchover postgresql
Master [postgresql1]: 
Candidate ['postgresql2', 'postgresql3'] []: postgresql3
When should the switchover take place (e.g. 2019-09-17T12:53 )  [now]: 
Current cluster topology
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 | Leader | running |  3 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  3 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 |        | running |  3 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+
Are you sure you want to switchover cluster postgresql, demoting current master postgresql1? [y/N]: y
2019-09-17 11:53:12.19439 Successfully switched over to "postgresql3"
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 |        | stopped |    |   unknown |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  3 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 | Leader | running |  3 |           |
+------------+-------------+---------------------+--------+---------+----+-----------+

8.3 Keep watching the cluster status:

  • The new primary is postgresql3/host3
  • Patroni restarted postgresql1/host1
  • Finally, postgresql1/host1 rejoined the cluster as a standby and works normally
[rudi@host1 ~]$ patronictl -d etcd://host1:2379 list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 |        | stopped |    |   unknown |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  4 |           |
| postgresql | postgresql3 | 192.168.56.12:15432 | Leader | running |  4 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+

[rudi@host1 ~]$ patronictl -d etcd://host1:2379 list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 |        | running |  4 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  4 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 | Leader | running |  4 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+
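
When the status is polled repeatedly like this, it can help to turn the patronictl table into data. The sketch below parses the text table into a list of dictionaries and answers two questions: who is the Leader, and are all members running. SAMPLE_TABLE is a copy of the last table above, and the function names are assumptions made for this illustration. It relies on the column layout printed by Patroni 1.6.0 as shown in this article.

```python
# Copy of the final patronictl output shown above.
SAMPLE_TABLE = """\
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 |        | running |  4 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  4 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 | Leader | running |  4 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+
"""


def parse_table(text):
    """Turn the text table into a list of dicts keyed by the header cells."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")  # skip the +----+ border lines
    ]
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


def leader_of(members):
    """Return the member name whose Role is Leader, or None."""
    return next((m["Member"] for m in members if m["Role"] == "Leader"), None)


def all_running(members):
    """Return True when every member reports the state 'running'."""
    return all(m["State"] == "running" for m in members)


if __name__ == "__main__":
    members = parse_table(SAMPLE_TABLE)
    print(leader_of(members), all_running(members))
```

A loop that calls patronictl, parses its output this way and stops once all_running is true would automate the "keep watching" step.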

9 Reboot the host that runs the primary

9.1 Cluster information before the reboot

[rudi@host1 ~]$ patronictl -d etcd://host1:2379 list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 |        | running |  4 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  4 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 | Leader | running |  4 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+

9.2 Reboot host3 (primary database)

[root@host3 ~]# reboot
Connection to host3 closed by remote host.
Connection to host3 closed.

9.3 Check the cluster status

  • postgresql3/host3 has stopped
  • postgresql1/host1 has become the primary
[rudi@host1 ~]$ patronictl -d etcd://host1:2379 list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 | Leader | running |  5 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  5 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 |        | stopped |    |   unknown |
+------------+-------------+---------------------+--------+---------+----+-----------+

9.4 After host3 has booted, manually start etcd3 and then postgresql3

[rudi@host3 ~]$ source ./scripts/etcd3.sh
[rudi@host3 ~]$ patroni ./scripts/postgresql3.yml

9.5 After etcd3/postgresql3 have started, check the cluster status

  • postgresql3/host3 has become a standby and works normally
[rudi@host1 ~]$ patronictl -d etcd://host1:2379 list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 | Leader | running |  5 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  5 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 |        | running |  5 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+

10 Reboot a standby host

10.1 Cluster information before the reboot

  • Standbys: postgresql2, postgresql3
  • Primary: postgresql1
[rudi@host1 ~]$ patronictl -d etcd://host1:2379 list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 | Leader | running |  5 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  5 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 |        | running |  5 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+

10.2 Reboot host2 (standby)

[root@host2 ~]# reboot
Connection to host2 closed by remote host.
Connection to host2 closed.

10.3 Check the cluster information

  • postgresql2 has stopped
[rudi@host1 ~]$ patronictl -d etcd://host1:2379 list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 | Leader | running |  5 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | stopped |    |   unknown |
| postgresql | postgresql3 | 192.168.56.12:15432 |        | running |  5 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+

10.4 After host2 has finished booting, manually start etcd2 and then postgresql2, in that order

[rudi@host2 ~]$ source ./scripts/etcd2.sh
[rudi@host2 ~]$ patroni ./scripts/postgresql2.yml
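
Because the start order matters (etcd first, then Patroni), a small helper can wait until the local etcd client port accepts connections before Patroni is launched. The sketch below only checks that a TCP connection can be opened. The function name wait_for_port and its default values are assumptions made for illustration, and an open port does not prove that etcd has already rejoined the cluster.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or timeout expires.

    Returns True as soon as a connection can be opened, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out; retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, a wrapper script could call wait_for_port("127.0.0.1", 2379) and start patroni ./scripts/postgresql2.yml only when it returns True.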

10.5 After etcd2 and postgresql2 have started, check the cluster status

  • postgresql2 is still a standby and works normally
[rudi@host1 ~]$ patronictl -d etcd://host1:2379 list postgresql
+------------+-------------+---------------------+--------+---------+----+-----------+
|  Cluster   |    Member   |         Host        |  Role  |  State  | TL | Lag in MB |
+------------+-------------+---------------------+--------+---------+----+-----------+
| postgresql | postgresql1 | 192.168.56.10:15432 | Leader | running |  5 |         0 |
| postgresql | postgresql2 | 192.168.56.11:15432 |        | running |  5 |         0 |
| postgresql | postgresql3 | 192.168.56.12:15432 |        | running |  5 |         0 |
+------------+-------------+---------------------+--------+---------+----+-----------+