pg-xl 的基本方式添加節點

os: centos 7.4
pgxl:pg.version ‘10.3 (Postgres-XL 10alpha2)html

添加節點、刪除節點在平常運維中是很常見的操做。
本次記錄的是 pgxl 添加 datanode 類型的節點，典型的橫向擴展。node

node4節點準備

安裝依賴包
關閉防火牆，selinux
建立用戶，修改環境變量
建立目錄
git獲取pgxl
編譯安裝pgxl
配置ssh信任
ntp同步時間linux

這些步驟均可以參考上一篇博客。git

node4節點上 datanode

$ initdb -D /var/lib/pgxl/data --nodename datanode3 -E UTF8 --locale=C -U postgres -W
Success. You can now start the database server of the Postgres-XL coordinator using:

    pg_ctl -D /var/lib/pgxl/data -l logfile start -Z coordinator

or
 You can now start the database server of the Postgres-XL datanode using:

    pg_ctl -D /var/lib/pgxl/data -l logfile start -Z datanode

$ vi /var/lib/pgxl/data/pg_hba.conf

host    all             all             192.168.56.101/32         trust
host    all             all             192.168.56.102/32         trust
host    all             all             192.168.56.103/32         trust
host    all             all             192.168.56.104/32         trust

$ vi /var/lib/pgxl/data/postgresql.conf

listen_addresses = '*'
port = 5432
max_connections = 100

pooler_port = 6667
max_pool_size = 100

gtm_host = 'node1'
gtm_port = 6668
pgxc_node_name = 'datanode3'

啓動 datanode3redis

$ pg_ctl -D /var/lib/pgxl/data -l logfile start -Z datanode

添加 coordinator，datanode信息sql

postgres=# alter node datanode3 with (type=datanode,host='node4', port=5432);
postgres=# create node coordinator1 with (type=coordinator,host='node1', port=5432);
postgres=# create node coordinator2 with (type=coordinator,host='node1', port=5433);

postgres=# create node datanode1 with (type=datanode, host='node2',port=5432,primary,preferred);
postgres=# create node datanode2 with (type=datanode, host='node3',port=5432);

postgres=# select pgxc_pool_reload();
postgres=# select * from pgxc_node;

coordinator，datanode添加 datanode3信息

coordinator1，coordinator2，datanode1，datanode2 都須要添加centos

postgres=# create node datanode3 with (type=datanode, host='node4',port=5432);
postgres=# select pgxc_pool_reload();
postgres=# select * from pgxc_node;

若是使用pgxc_ctl就會方便不少運維

表從新sharding

登陸node1節點的 coordinator1, redistribute tables ssh

$ psql -p 5432
psql (PGXL 10alpha2, based on PG 10.3 (Postgres-XL 10alpha2))
Type "help" for help.

postgres=# select * from pgxc_node;
  node_name   | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id   
--------------+-----------+-----------+-----------+----------------+------------------+-------------
 coordinator2 | C         |      5433 | node1     | f              | f                | -2089598990
 datanode1    | D         |      5432 | node2     | t              | t                |   888802358
 datanode2    | D         |      5432 | node3     | f              | f                |  -905831925
 coordinator1 | C         |      5432 | node1     | f              | f                |  1938253334
 datanode3    | D         |      5432 | node4     | f              | f                | -1894792127
(5 rows)

postgres=# EXECUTE DIRECT ON (datanode3) 'create database peiybdb;';
postgres=# \c peiybdb
peiybdb=# EXECUTE DIRECT ON (datanode3) ' create table tmp_t0 (c0 varchar(100),c1 varchar(100));';

peiybdb=# ALTER TABLE tmp_t0 ADD NODE (datanode3);
peiybdb=# select xc_node_id,count(1) from tmp_t0 group by xc_node_id;
 xc_node_id  | count 
-------------+-------
 -1894792127 |  3220
  -905831925 |  3429
   888802358 |  3351
(3 rows)

控制表數據分佈的節點post

ALTER TABLE tmp_t0 ADD NODE (datanode3);
ALTER TABLE tmp_t0 DELETE NODE (datanode3);

ALTER TABLE tmp_t0 DISTRIBUTE BY HASH(c0);
ALTER TABLE tmp_t0 DISTRIBUTE BY REPLICATION;

下面是幫助文檔的截取

To change the distribution type and the list of nodes where table data is located:

ALTER TABLE distributors TO NODE (dn1, dn7), DISTRIBUTE BY HASH(dist_id);

To add a node where data of table is distributed:

ALTER TABLE distributors ADD NODE (dn9, dn14);

To remove a node where data of table is distributed:

ALTER TABLE distributors DELETE NODE (dn4, dn0);

下面是簡單的建表分析

CREATE TABLE table_name(...)
DISTRIBUTE BY 
HASH(col)|MODULO(col)|ROUNDROBIN|REPLICATION
TO NODE(nodename1,nodename2...)

能夠看到，若是DISTRIBUTE BY 後面有以下選項：
REPLICATION，則是複製模式，其他則是分片模式，
HASH 指的是按照指定列的哈希值分佈數據，
MODULO 指的是按照指定列的取摩運算分佈數據，
ROUNDROBIN 指的是按照輪詢的方式分佈數據

TO NODE指定了數據分佈的節點範圍，若是沒有指定則默認全部數據節點參與數據分佈。若是沒有指定分佈模式，即便用普通的CREATE TABLE語句，PGXL會默認採用分片模式將數據分佈到全部數據節點。

參考：
https://www.postgres-xl.org/documentation/tutorial-createcluster.html

https://www.postgres-xl.org/documentation/sql-createtable.html
https://www.postgres-xl.org/documentation/sql-altertable.html