ClickHouse-010: Incremental synchronization of MySQL data to ClickHouse

1. Demo environment

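node01 runs the MySQL database service, MySQL version 5.7.22.
node02 runs the ClickHouse service, proxysql-2.0.13-1-clickhouse-centos7.x86_64.rpm, and the synchronization program.
Note: the synchronization program is written in Python and currently supports only Python 2.7.5.
Server IPs: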

node01   172.16.0.246
node02   172.16.0.197

ClickHouse client and server versions:

[root@node01 ~]# clickhouse-server -V
ClickHouse server version 20.8.3.18.
[root@node02 ~]# clickhouse-client -V
ClickHouse client version 20.8.3.18.

Server operating system and Python versions:

[root@node02 ~]# cat /etc/redhat-release 
CentOS Linux release 7.6.1810 (Core) 
[root@node02 ~]#  python -V
Python 2.7.5
[root@node01 ~]# cat /etc/redhat-release 
CentOS Linux release 7.6.1810 (Core) 
[root@node01 ~]# python -V
Python 2.7.5

Why proxysql-2.0.13-1-clickhouse-centos7.x86_64.rpm is installed on node02:
mainly so that ClickHouse can be accessed over the MySQL protocol.

2. Server installation

2.1 Installing standalone ClickHouse on node02

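The standalone ClickHouse installation on node02 is not demonstrated here.
For the installation steps, see: https://blog.51cto.com/wujianwei/2949877
A brief look at the standalone ClickHouse service after installation: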

[root@node02 soft]# clickhouse-client -h 127.0.0.1 -m -q "show databases;"
_temporary_and_external_tables
default
system

The default database is empty, much like the test database in MySQL. The system database is, as its name suggests, for system information.

A supplement to the ClickHouse installation: the official documentation makes a few recommendations:

  1. Disable transparent huge pages
  2. Adjust memory overcommit
  3. Disable CPU power-saving mode
    echo 'performance' | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    echo 0 > /proc/sys/vm/overcommit_memory
    echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled

2.2 Installing MySQL on node01

Deploy MySQL yourself; it is not covered here.
To synchronize data from MySQL, the binlog format must be ROW, and binlog_row_image must be FULL.
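
The two binlog requirements can be checked before starting any synchronization. The following is a minimal sketch, not part of the synchronization program: check_binlog_settings is a hypothetical helper, and it assumes the variables have already been fetched from MySQL (for example with SHOW GLOBAL VARIABLES LIKE 'binlog%') into a plain dict.

```python
def check_binlog_settings(variables):
    """Return a list of problems found in MySQL binlog variables.

    `variables` is assumed to be a dict of variable name -> value,
    e.g. built from the rows of SHOW GLOBAL VARIABLES LIKE 'binlog%'.
    """
    # The two requirements stated in this article.
    required = [("binlog_format", "ROW"), ("binlog_row_image", "FULL")]
    problems = []
    for name, expected in required:
        # MySQL reports these values case-insensitively, so normalize.
        actual = str(variables.get(name, "")).upper()
        if actual != expected:
            problems.append("%s is %r, expected %r" % (name, actual, expected))
    return problems


if __name__ == "__main__":
    print(check_binlog_settings({"binlog_format": "MIXED",
                                 "binlog_row_image": "FULL"}))
```

An empty list means both settings match what row-based synchronization needs.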

2.3 Installing the synchronization program on node02

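Install the packages that the synchronization program depends on. The program can run on the ClickHouse server or on a separate server.
The program is started with pypy, which speeds up data synchronization,
so pypy and its related packages and dependencies must be installed.
The installation commands are as follows: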

yum -y install pypy-libs pypy pypy-devel
wget https://bootstrap.pypa.io/get-pip.py
Note: running pypy get-pip.py with the file downloaded above fails because that script requires Python 3. For Python 2.7, download the version below instead:
wget https://bootstrap.pypa.io/pip/2.7/get-pip.py
pypy get-pip.py

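Next, install the modules below. Pasting these commands in directly produced a string of errors, and, not being very familiar with Python, I could not proceed at first: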

/usr/lib64/pypy-5.0.1/bin/pip install MySQL-python
/usr/lib64/pypy-5.0.1/bin/pip install mysql-replication
/usr/lib64/pypy-5.0.1/bin/pip install clickhouse-driver==0.0.20
/usr/lib64/pypy-5.0.1/bin/pip install redis

2.4 Installation steps that work

After several attempts, installing the Python modules in the following order succeeds:

[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install MySQL-python
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Collecting MySQL-python
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/a5/e9/51b544da85a36a68debe7a7091f068d802fc515a3a202652828c73453cad/MySQL-python-1.2.5.zip (108 kB)
     |████████████████████████████████| 108 kB 2.6 MB/s 
Building wheels for collected packages: MySQL-python
  Building wheel for MySQL-python (setup.py) ... done
  Created wheel for MySQL-python: filename=MySQL_python-1.2.5-pp27-pypy_41-linux_x86_64.whl size=49118 sha256=ff86f8fba2433c5d623d1bf2158b8d9f8ab346b8f09dcfa9acfc074130e07bcb
  Stored in directory: /root/.cache/pip/wheels/21/03/d9/41cbcc2b332380d24663723922354ab876fffe2224b259a834
Successfully built MySQL-python
Installing collected packages: MySQL-python
Successfully installed MySQL-python-1.2.5
[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install mysql-replication==0.24 
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Collecting mysql-replication==0.24
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/ec/5e/36b87b6068210f1fbd606768e0c2541727a229ac0ebf557d65fd31bc79e9/mysql-replication-0.24.tar.gz (33 kB)
Collecting pymysql>=0.6
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/2b/c4/3c3e7e598b1b490a2525068c22f397fda13f48623b7bd54fb209cd0ab774/PyMySQL-1.0.0.tar.gz (45 kB)
     |████████████████████████████████| 45 kB 37.2 MB/s 
    ERROR: Command errored out with exit status 1:
     command: /usr/bin/pypy -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-XgbqzK/pymysql/setup.py'"'"'; __file__='"'"'/tmp/pip-install-XgbqzK/pymysql/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-fnr4Dg
         cwd: /tmp/pip-install-XgbqzK/pymysql/
    Complete output (5 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-XgbqzK/pymysql/setup.py", line 6, in <module>
        with open("./README.rst", encoding="utf-8") as f:
    TypeError: __init__() got an unexpected keyword argument 'encoding'
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

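The command above fails. Reading the output closely, pip resolves the dependency pymysql>=0.6 to PyMySQL 1.0.0, whose setup.py calls open() with an encoding argument that Python 2.7 does not accept. Installing a version that still works on Python 2.7 first, here pymysql 0.6, which satisfies the >=0.6 requirement, succeeds: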

[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install pymysql==0.6
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Collecting pymysql==0.6
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/0c/3b/17407490b878d2abbc0c544ff71491e08932d1d44225b84a103eae317b7c/PyMySQL-0.6.tar.gz (52 kB)
     |████████████████████████████████| 52 kB 748 kB/s 
Building wheels for collected packages: pymysql
  Building wheel for pymysql (setup.py) ... done
  Created wheel for pymysql: filename=PyMySQL-0.6-py2-none-any.whl size=60771 sha256=6947b8d7c9e24e3d13982b4871f06c97923231b2223d8e2442f5ccce41fb4548
  Stored in directory: /root/.cache/pip/wheels/e0/b8/37/bbe7db22c5f90fb4dc04e9766ca49aa05a5f76acd0956e62ce
Successfully built pymysql
Installing collected packages: pymysql
Successfully installed pymysql-0.6
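
The TypeError in the failed install comes from the builtin open() on Python 2.7 having no encoding argument. As a small illustration only (read_text is a hypothetical helper, not code from PyMySQL or the synchronization program), io.open accepts encoding on both Python 2.7 and Python 3:

```python
import io


def read_text(path):
    """Read a UTF-8 text file on both Python 2.7 and Python 3."""
    # io.open accepts encoding= on Python 2.7 as well; the builtin open()
    # on Python 2.7 has no such argument, which is what raises
    # "unexpected keyword argument 'encoding'".
    with io.open(path, encoding="utf-8") as handle:
        return handle.read()
```

This does not change how PyMySQL 1.0.0 installs; pinning a release that still supports Python 2.7 remains the practical fix.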

Next, run the following install command:

[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install mysql-replication==0.24 
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Collecting mysql-replication==0.24
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/ec/5e/36b87b6068210f1fbd606768e0c2541727a229ac0ebf557d65fd31bc79e9/mysql-replication-0.24.tar.gz (33 kB)
Requirement already satisfied: pymysql>=0.6 in /usr/lib64/pypy-5.0.1/site-packages (from mysql-replication==0.24) (0.6)
Building wheels for collected packages: mysql-replication
  Building wheel for mysql-replication (setup.py) ... done
  Created wheel for mysql-replication: filename=mysql_replication-0.24-py2-none-any.whl size=42153 sha256=8c9ba52edb99fc8c17c07b329f0a1b00ac535cd87b4562ec90fbc6cc1f367512
  Stored in directory: /root/.cache/pip/wheels/11/a5/cd/912029dfb7e8a159dc3d439416f6cf8bccc65cc010cf124fff
Successfully built mysql-replication
Installing collected packages: mysql-replication
Successfully installed mysql-replication-0.24

Next, run the following install command:

[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install clickhouse-driver==0.0.20 
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Collecting clickhouse-driver==0.0.20
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/9e/a4/bc945ee53254b6f38fd9c7ee6e97a5834c116a68220d1910bf0850c7bc64/clickhouse-driver-0.0.20.tar.gz (36 kB)
Collecting pytz
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/70/94/784178ca5dd892a98f113cdd923372024dc04b8d40abe77ca76b5fb90ca6/pytz-2021.1-py2.py3-none-any.whl (510 kB)
     |████████████████████████████████| 510 kB 19.9 MB/s 
Collecting enum34
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/6f/2c/a9386903ece2ea85e9807e0e062174dc26fdce8b05f216d00491be29fad5/enum34-1.1.10-py2-none-any.whl (11 kB)
Collecting ipaddress
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/c2/f8/49697181b1651d8347d24c095ce46c7346c37335ddc7d255833e7cde674d/ipaddress-1.0.23-py2.py3-none-any.whl (18 kB)
Building wheels for collected packages: clickhouse-driver
  Building wheel for clickhouse-driver (setup.py) ... done
  Created wheel for clickhouse-driver: filename=clickhouse_driver-0.0.20-py2-none-any.whl size=50313 sha256=828b07473b373d9b9ef0538e76b192cd2af592951d41175acfd1cc5b68206ed5
  Stored in directory: /root/.cache/pip/wheels/ef/f1/f0/1926c46953bd8f9d65f1176efc995c223006504c4fbfe37a73
Successfully built clickhouse-driver
Installing collected packages: pytz, enum34, ipaddress, clickhouse-driver
Successfully installed clickhouse-driver-0.0.20 enum34-1.1.10 ipaddress-1.0.23 pytz-2021.1
[root@node02 soft]#

Next, run the following install command:

[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip install redis
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
Collecting redis
  Downloading http://mirrors.cloud.aliyuncs.com/pypi/packages/a7/7c/24fb0511df653cf1a5d938d8f5d19802a88cef255706fdda242ff97e91b7/redis-3.5.3-py2.py3-none-any.whl (72 kB)
     |████████████████████████████████| 72 kB 3.0 MB/s 
Installing collected packages: redis
Successfully installed redis-3.5.3
[root@node02 soft]#

Note: the redis module is installed because the synchronized binlog position can be stored in Redis; the program also supports storing it in a file.

Check the installed modules:

[root@node02 soft]# /usr/lib64/pypy-5.0.1/bin/pip  list|egrep -i "MySQL-python|mysql-replication|clickhouse-driver|redis"
DEPRECATION: pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
clickhouse-driver 0.0.20
MySQL-python      1.2.5
mysql-replication 0.24
redis             3.5.3
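
pip list only shows that the packages are installed, not that they can be imported. A minimal sketch of an import check follows; missing_modules is a hypothetical helper, and the names in REQUIRED are the import names these four packages normally expose (MySQLdb, pymysqlreplication, clickhouse_driver, redis). Run it with the same interpreter that will run the synchronization program.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Also raised when a compiled extension cannot load a
            # shared library it depends on.
            missing.append(name)
    return missing


# Import names for MySQL-python, mysql-replication, clickhouse-driver, redis.
REQUIRED = ["MySQLdb", "pymysqlreplication", "clickhouse_driver", "redis"]

if __name__ == "__main__":
    print(missing_modules(REQUIRED))
```

Saved under a name of your choice and run as pypy thatfile.py, it prints an empty list when every module loads.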

2.5 Installing proxysql on node02

Installing proxysql (mainly so that ClickHouse can be accessed over the MySQL protocol):

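Download proxysql here: https://github.com/sysown/proxysql/releases. Choose a package with clickhouse in its name; otherwise ClickHouse is not supported.
P.S. ClickHouse server version 20.8.3.18 already supports the MySQL protocol natively. However, it imposes strict format requirements when synchronizing MySQL data, and it cannot yet be configured well to synchronize data from an existing MySQL database into ClickHouse.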

proxysql installation and configuration:

[root@node02 soft]# rpm -ivh proxysql-2.0.13-1-clickhouse-centos7.x86_64.rpm 
Preparing...                          ################################# [100%]
Updating / installing...
   1:proxysql-2.0.13-1                warning: group proxysql does not exist - using root
warning: group proxysql does not exist - using root
################################# [100%]
Created symlink from /etc/systemd/system/multi-user.target.wants/proxysql.service to /etc/systemd/system/proxysql.service.

Start it (it must be started this way, otherwise ClickHouse is not supported):

proxysql --clickhouse-server
[root@node02 soft]# proxysql --clickhouse-server 
2021-07-15 12:54:28 [INFO] Using config file /etc/proxysql.cnf
2021-07-15 12:54:28 [INFO] Using OpenSSL version: OpenSSL 1.1.1d  10 Sep 2019
2021-07-15 12:54:28 [INFO] No SSL keys/certificates found in datadir (/var/lib/proxysql). Generating new keys/certificates.
[root@node02 soft]#
[root@node02 soft]# ss -lntup|grep proxysql
tcp    LISTEN     0      128       *:6090                  *:*                   users:(("proxysql",pid=20648,fd=28))
tcp    LISTEN     0      128       *:6032                  *:*                   users:(("proxysql",pid=20648,fd=27))
tcp    LISTEN     0      1024      *:6033                  *:*                   users:(("proxysql",pid=20648,fd=26))
tcp    LISTEN     0      1024      *:6033                  *:*                   users:(("proxysql",pid=20648,fd=25))
tcp    LISTEN     0      1024      *:6033                  *:*                   users:(("proxysql",pid=20648,fd=24))
tcp    LISTEN     0      1024      *:6033                  *:*                   users:(("proxysql",pid=20648,fd=23))

Log in to the proxysql server:

[root@node02 soft]# mysql -uadmin -padmin -h127.0.0.1 -P6032 -e "show databases;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----+---------------+-------------------------------------+
| seq | name          | file                                |
+-----+---------------+-------------------------------------+
| 0   | main          |                                     |
| 2   | disk          | /var/lib/proxysql/proxysql.db       |
| 3   | stats         |                                     |
| 4   | monitor       |                                     |
| 5   | stats_history | /var/lib/proxysql/proxysql_stats.db |
+-----+---------------+-------------------------------------+

Log in to proxysql and create the clicku account, which is used to log in to the backend ClickHouse service:

mysql -uadmin -padmin -h127.0.0.1 -P6032
admin@node02 12:57:  [(none)]> select * from clickhouse_users;
Empty set (0.00 sec)

admin@node02 12:57:  [(none)]> 

INSERT INTO clickhouse_users VALUES ('clicku','clickp',1,100);
LOAD CLICKHOUSE USERS TO RUNTIME;
SAVE CLICKHOUSE USERS TO DISK;

admin@node02 12:57:  [(none)]> select * from clickhouse_users;
+----------+----------+--------+-----------------+
| username | password | active | max_connections |
+----------+----------+--------+-----------------+
| clicku   | clickp   | 1      | 100             |
+----------+----------+--------+-----------------+
1 row in set (0.00 sec)

Connect to ClickHouse through proxysql:

[root@node02 soft]# mysql -u clicku -pclickp -h 127.0.0.1 -P6090 -e "show databases;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+--------------------------------+
| name                           |
+--------------------------------+
| _temporary_and_external_tables |
| default                        |
| system                         |
+--------------------------------+

3. Synchronizing MySQL data on node01 to ClickHouse on node02

Synchronizing data from MySQL to ClickHouse

3.1 Case 1: MySQL has a database test001 containing a table tb1; synchronize this table to ClickHouse

3.1.1 Log in to MySQL on node01 and create the test database and table to synchronize

root@node01 13:05:  [(none)]> create database test001;
Query OK, 1 row affected (0.00 sec)

root@node01 13:05:  [(none)]> 
root@node01 13:05:  [(none)]> use test001;
Database changed
root@node01 13:05:  [test001]> CREATE TABLE `tb1` (   `id` int(10) unsigned NOT NULL AUTO_INCREMENT,   `pay_money` decimal(20,2) NOT NULL DEFAULT '0.00',   `pay_day` date NOT NULL,   `pay_time` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',   PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Query OK, 0 rows affected (0.02 sec)
Create the account used to replicate from node01:
 GRANT REPLICATION SLAVE, REPLICATION CLIENT, SELECT ON *.* TO 'click_rep'@'172.16.0.197' identified by 'jwts996';flush privileges;
root@node01 13:05:  [test001]> GRANT REPLICATION SLAVE, REPLICATION CLIENT, SELECT ON *.* TO 'click_rep'@'172.16.0.197' identified by 'jwts996';flush privileges;
Query OK, 0 rows affected, 1 warning (0.00 sec)

Query OK, 0 rows affected (0.01 sec)

root@node01 13:09:  [test001]>

3.1.2 Log in to ClickHouse on node02 and create the database and table matching MySQL

1. Create the database and table in ClickHouse:

[root@node02 soft]# clickhouse-client -h 127.0.0.1 -m -q "show databases;"
_temporary_and_external_tables
default
system

node02 :) create database test001;

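2. Create the table. ClickHouse's CREATE TABLE syntax and column types are completely different from MySQL's. With only a few columns you can write the table definition by hand; with many columns that is painful, so you can use ClickHouse's built-in command for importing data from MySQL to create the table. The MySQL account must be granted privileges before the table is created, because the synchronization program also pulls data by acting as a replica.
Log in to ClickHouse and create the table: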

CREATE TABLE tb1
ENGINE = MergeTree
PARTITION BY toYYYYMM(pay_time)
ORDER BY pay_time AS
SELECT *
FROM mysql('172.16.0.246:3306', 'test001', 'tb1', 'click_rep', 'jwts996');
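
Because the same statement shape is needed for every table that is synchronized, it can be generated. This is a minimal sketch under stated assumptions: build_ctas is a hypothetical helper, it reproduces only the statement shown above, and it does no escaping, so it is meant for trusted, hand-written arguments.

```python
def build_ctas(table, time_column, mysql_addr, database, user, password):
    """Build the CREATE TABLE ... AS SELECT statement used in this article.

    Arguments are inserted verbatim (no escaping), so pass trusted values only.
    """
    return (
        "CREATE TABLE {table}\n"
        "ENGINE = MergeTree\n"
        "PARTITION BY toYYYYMM({col})\n"
        "ORDER BY {col} AS\n"
        "SELECT *\n"
        "FROM mysql('{addr}', '{db}', '{table}', '{user}', '{pw}');"
    ).format(table=table, col=time_column, addr=mysql_addr,
             db=database, user=user, pw=password)


if __name__ == "__main__":
    print(build_ctas("tb1", "pay_time", "172.16.0.246:3306",
                     "test001", "click_rep", "jwts996"))
```

The printed statement can then be pasted into clickhouse-client in the target database.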

Notes on the ClickHouse table structure:

[root@node02 soft]# mysql -u clicku -pclickp -h 127.0.0.1 -P6090 -e " show create table test001.tb1;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| statement                                                                                                                                                                                                                |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| CREATE TABLE test001.tb1
(
    `id` UInt32,
    `pay_money` String,
    `pay_day` Date,
    `pay_time` DateTime
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(pay_time)
ORDER BY pay_time
SETTINGS index_granularity = 8192

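 The MergeTree engine is used here. MergeTree is the most capable engine in ClickHouse: it supports very large data volumes, indexes, partitions, and updates and deletes. PARTITION BY toYYYYMM(pay_time) partitions the data by pay_time at monthly granularity.
 ORDER BY pay_time stores rows sorted by pay_time, which also serves as the index. If the MySQL table already contains data, the CREATE TABLE command above copies that data into ClickHouse as well.
 index_granularity = 8192 is the index granularity. Unless the data volume reaches ten billion rows, it usually does not need to be changed.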
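
To make the monthly partitioning concrete, the value that toYYYYMM produces for a timestamp can be reproduced in a few lines. This is an illustration only; to_yyyymm is a hypothetical Python stand-in, not a ClickHouse API.

```python
import datetime


def to_yyyymm(ts):
    """Mirror ClickHouse's toYYYYMM: year * 100 + month as an integer."""
    return ts.year * 100 + ts.month


if __name__ == "__main__":
    # The pay_time value used in this article's test rows.
    print(to_yyyymm(datetime.datetime(2019, 6, 29, 14, 0)))  # 201906
```

All rows whose pay_time falls in the same calendar month get the same value and therefore land in the same partition.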

3.1.3 Running the synchronization program

[root@node02 sync]# pypy mysql-clickhouse-replication.py --help 
Traceback (most recent call last):
  File "mysql-clickhouse-replication.py", line 10, in <module>
    import MySQLdb
  File "/usr/lib64/pypy-5.0.1/site-packages/MySQLdb/__init__.py", line 19, in <module>
    import _mysql
ImportError: unable to load extension module '/usr/lib64/pypy-5.0.1/site-packages/_mysql.pypy-41.so': libmysqlclient.so.20: cannot open shared object file: No such file or directory
[root@node02 sync]#

Solution: link libmysqlclient.so.20 from the MySQL installation into /usr/lib64:

[root@node02 sync]# ln -sv /usr/local/mysql-5.7.22-linux-glibc2.12-x86_64/lib/libmysqlclient.so.20 /usr/lib64/
‘/usr/lib64/libmysqlclient.so.20’ -> ‘/usr/local/mysql-5.7.22-linux-glibc2.12-x86_64/lib/libmysqlclient.so.20’
[root@node02 sync]# pypy mysql-clickhouse-replication.py --help 
usage: Data Replication to clikhouse [-h] [-c CONF] [-d] [-l]

mysql data is copied to clikhouse

optional arguments:
  -h, --help            show this help message and exit
  -c CONF, --conf CONF  Data synchronization information file
  -d, --debug           Display SQL information
  -l, --logtoredis      log position to redis ,default file
By dengyayun @2019

At this point the synchronization program is installed.

3.1.4 Writing the synchronization program's configuration file

With the table structure created, configure the synchronization program's configuration file, metainfo.conf.
Its contents are as follows:

[root@node02 sync]# cat metainfo.conf
# Source server to synchronize data from
[master_server]
host='172.16.0.246'
port=3306
user='click_rep'
passwd='jwts996'
server_id=172160246

# Redis settings, used to store the binlog position
[redis_server]
host='127.0.0.1'
port=6379
passwd='xx'
log_pos_prefix='log_pos_'
## This demo does not use Redis to store the binlog file and position
# Record log_position to a file
[log_position]
file='./repl_pos.log'
## This demo records the binlog file and position in the file repl_pos.log
#[root@node02 soft]# cat sync/repl_pos.log 
#[log_position]
#filename = mysql-bin.000111
#position = 360752645
###################################

# ClickHouse server settings; synchronized data is written here
[clickhouse_server]
host=127.0.0.1
port=9000
passwd=''
user='default'
# Column name case: 1 = upper case, 0 = lower case
column_lower_upper=0

# Databases to synchronize
[only_schemas]
schemas='test001'

# Tables to synchronize
[only_tables]
tables='tb1'

# Skip DML statements for specific tables (update, delete optional)
[skip_dmls_sing]
skip_delete_tb_name = ''
skip_update_tb_name = ''

# Skip DML statements for all tables (update, delete optional)
[skip_dmls_all]
#skip_type = 'delete'
#skip_type = 'delete,update'
skip_type = ''

[bulk_insert_nums]
# Number of rows per commit; 20,000 is recommended when running under pypy
insert_nums=20000
# Synchronization interval in seconds; a negative value disables it
#interval=60
interval=1

# Alert email settings
[failure_alarm]
mail_host= 'smtp.xx.com'
mail_port= 25
mail_user= 'xx'
mail_pass= 'xxx'
mail_send_from = 'xxx'
# Alert recipients
alarm_mail = 'yymysql@gmail.com'

# Log file path
[repl_log]
log_dir="/tmp/relication_mysql_clickhouse.log"
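
The commented sample of repl_pos.log in the configuration above is an INI-style file with a [log_position] section holding filename and position. The following is a minimal sketch of reading and writing that layout with the standard library; save_position and load_position are hypothetical helpers, and how the synchronization program itself handles the file is not shown in this article.

```python
try:
    import configparser                      # Python 3
except ImportError:
    import ConfigParser as configparser      # Python 2.7


def save_position(path, filename, position):
    """Write the binlog file name and position in repl_pos.log layout."""
    parser = configparser.RawConfigParser()
    parser.add_section("log_position")
    parser.set("log_position", "filename", filename)
    parser.set("log_position", "position", str(position))
    with open(path, "w") as handle:
        parser.write(handle)


def load_position(path):
    """Return (binlog file name, position) read from `path`."""
    parser = configparser.RawConfigParser()
    parser.read(path)
    return (parser.get("log_position", "filename"),
            parser.getint("log_position", "position"))
```

Editing this file by hand, or with a helper like the one above, is one way to make a restart resume from a chosen binlog position, assuming the program reads the file at startup as its log output suggests.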

3.1.5 Starting the synchronization program

By default the binlog position is recorded in a file, so there is no need to specify how to record it. Start the synchronization program:

[root@node02 sync]# pypy mysql-clickhouse-replication.py --conf metainfo.conf --debug 
13:26:55 INFO     開始同步數據時間 2021-07-15 13:26:55
13:26:55 INFO     同步binlog pos點從文件讀取
13:26:55 INFO     從服務器 172.16.0.246:3306 同步數據
13:26:55 INFO     讀取binlog: mysql-bin.000111:360750299
13:26:55 INFO     同步到clickhouse server 127.0.0.1:9000
13:26:55 INFO     同步到clickhouse的數據庫: ['test001']
13:26:55 INFO     同步到clickhouse的表: ['tb1']

13:27:59 INFO     INSERT 數據插入SQL: INSERT INTO test001.tb1 VALUES, [{u'id': 1, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 2, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 3, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 4, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 5, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 6, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 7, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 8, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 9, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 10, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 11, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}] 
13:28:31 INFO     INSERT 數據插入SQL: INSERT INTO test001.tb1 VALUES, [{u'id': 12, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 13, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 14, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 15, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 16, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 17, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 18, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 19, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 20, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 21, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}]
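
The insert_nums and interval settings in metainfo.conf describe batching: rows are written to ClickHouse in groups rather than one at a time. The sketch below shows one plausible reading of those two settings. RowBuffer and its flush callback are hypothetical, and the real program's logic may differ.

```python
import time


class RowBuffer(object):
    """Collect rows and flush them by count or by elapsed time (a sketch)."""

    def __init__(self, flush, insert_nums=20000, interval=1, clock=time.time):
        self.flush = flush              # callable that receives a list of rows
        self.insert_nums = insert_nums  # flush once this many rows are buffered
        self.interval = interval        # seconds between flushes; negative disables
        self.clock = clock
        self.rows = []
        self.last_flush = clock()

    def add(self, row):
        self.rows.append(row)
        self.maybe_flush()

    def maybe_flush(self):
        now = self.clock()
        full = len(self.rows) >= self.insert_nums
        due = self.interval >= 0 and now - self.last_flush >= self.interval
        if self.rows and (full or due):
            self.flush(self.rows)
            self.rows = []
            self.last_flush = now
```

In this sketch the time-based flush is only checked when a row arrives; a long-running process would also call maybe_flush periodically so that a quiet source does not leave rows waiting in the buffer.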

3.1.6 Verifying the synchronization result

[root@node01 soft]# mysql -e "select * from test001.tb1 where 1=1 order by id limit 20;"
+----+-----------+------------+---------------------+
| id | pay_money | pay_day    | pay_time            |
+----+-----------+------------+---------------------+
|  1 |     66.22 | 2019-06-29 | 2019-06-29 14:00:00 |
|  2 |     66.22 | 2019-06-29 | 2019-06-29 14:00:00 |
|  3 |     66.22 | 2019-06-29 | 2019-06-29 14:00:00 |
+----+-----------+------------+---------------------+

[root@node02 sync]# clickhouse-client -h 127.0.0.1 -m -q "select * from test001.tb1 where 1=1 order by id limit 20;"
1   66.22   2019-06-29  2019-06-29 14:00:00
2   66.22   2019-06-29  2019-06-29 14:00:00
3   66.22   2019-06-29  2019-06-29 14:00:00

3.2 Adding another MySQL table to synchronize to ClickHouse

Add another table to the test001 database on node01:

CREATE TABLE `t_call_log1` (
  `id` bigint(20) NOT NULL COMMENT '記錄標識',
  `user_id` bigint(20) NOT NULL COMMENT '用戶標識',
  `customer_id` bigint(20) DEFAULT NULL COMMENT '客戶標識',
  `city_id` bigint(20) DEFAULT NULL COMMENT '城市標識',
  `phone` varchar(20) COLLATE utf8mb4_unicode_ci NOT NULL COMMENT '對方電話',
  `name` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT '對方名稱',
  `is_recorded` bit(1) NOT NULL COMMENT '是否錄音',
  `file_size` bigint(20) DEFAULT NULL COMMENT '文件大小(字節)',
  `file_name` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT '文件名稱',
  `created_time` datetime NOT NULL COMMENT '建立時間',
  `modified_time` datetime DEFAULT NULL COMMENT '修改時間',
  `call_type` tinyint(4) DEFAULT '1' COMMENT '呼叫方式(1,手機 2,呼叫中心)',
  `call_id` varchar(50) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT '智齒id',
  `status_id` tinyint(4) DEFAULT '-1' COMMENT '當前客戶狀態1.未授信;2.已授信;3.已成單;4.全退租',
  `contact_id` bigint(20) DEFAULT '0' COMMENT '聯繫人id',
  PRIMARY KEY (`id`),
  KEY `index_phone` (`phone`),
  KEY `fk_clog_user_id` (`user_id`) USING BTREE,
  KEY `index_customer_id` (`customer_id`),
  KEY `index_call_id` (`call_id`) USING BTREE,
  KEY `idx_created_time` (`created_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT='電話記錄表'

Create the matching table t_call_log1 in ClickHouse on node02:

[root@node02 data]# mysql -u clicku -pclickp -h 127.0.0.1 -P6090

CREATE TABLE t_call_log1
ENGINE = MergeTree
PARTITION BY toYYYYMM(created_time)
ORDER BY created_time AS
SELECT *
FROM mysql('172.16.0.246:3306', 'test001', 't_call_log1', 'click_rep', 'jwts996');

Or as follows:
[root@node02 proxysql]#  mysql -u clicku -pclickp -h 127.0.0.1 -P6090
clicku@node02 12:56:  [test001]> CREATE TABLE t_call_log1 ENGINE = MergeTree PARTITION BY toYYYYMM(created_time) ORDER BY created_time AS SELECT * FROM mysql('172.16.0.246:3306', 'test001', 't_call_log1', 'click_rep', 'jwts996'); 
Query OK, 0 rows affected (0.01 sec)

clicku@node02 16:14:  [(none)]> show create test001.t_call_log1\G
*************************** 1. row ***************************
statement: CREATE TABLE test001.t_call_log1
(
    `id` Int64,
    `user_id` Int64,
    `customer_id` Nullable(Int64),
    `city_id` Nullable(Int64),
    `phone` String,
    `name` Nullable(String),
    `is_recorded` String,
    `file_size` Nullable(Int64),
    `file_name` Nullable(String),
    `created_time` DateTime,
    `modified_time` Nullable(DateTime),
    `call_type` Nullable(Int8),
    `call_id` String,
    `status_id` Nullable(Int8),
    `contact_id` Nullable(Int64)
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(created_time)
ORDER BY created_time
SETTINGS index_granularity = 8192
1 row in set (0.00 sec)
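
Comparing the MySQL definitions with the SHOW CREATE output in this article gives a small type mapping: columns that allow NULL are wrapped in Nullable(...), and decimal and bit(1) both arrive as String. The sketch below records only what is observed here. OBSERVED_TYPES and clickhouse_type are hypothetical names, and this is not a complete or official mapping.

```python
import re

# MySQL base type -> ClickHouse type, as observed in this article's
# SHOW CREATE TABLE output only.
OBSERVED_TYPES = {
    "int unsigned": "UInt32",
    "bigint": "Int64",
    "tinyint": "Int8",
    "decimal": "String",
    "varchar": "String",
    "bit": "String",
    "date": "Date",
    "datetime": "DateTime",
}


def clickhouse_type(mysql_type, nullable):
    """Map a MySQL column type such as 'bigint(20)' to the observed type."""
    # Drop display width / precision, e.g. "(20)" or "(20,2)".
    base = re.sub(r"\(\s*\d+\s*(,\s*\d+\s*)?\)", "", mysql_type.lower())
    base = " ".join(base.split())
    ch_type = OBSERVED_TYPES[base]  # KeyError for types not seen here
    return "Nullable(%s)" % ch_type if nullable else ch_type
```

For example, clickhouse_type("bigint(20)", True) gives "Nullable(Int64)", matching customer_id above. Because decimal columns arrive as String, amounts such as pay_money need an explicit conversion before any arithmetic in ClickHouse.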

3.2.1 Verifying inserts, updates, deletes, and queries on test001.t_call_log1 in MySQL on node01

Add the new table to the configuration file:

[root@tidb04 ~]# egrep "t_call_log1|test001" /data/soft/sync/metainfo.conf
schemas='test001'
tables='tb1,t_call_log1'

Start the synchronization program:

[root@node02 sync]# pypy mysql-clickhouse-replication.py --conf metainfo.conf --debug 
16:19:02 INFO     開始同步數據時間 2021-07-18 16:19:02
16:19:02 INFO     同步binlog pos點從文件讀取
16:19:02 INFO     從服務器 172.16.0.246:3306 同步數據
16:19:02 INFO     讀取binlog: mysql-bin.000111:360767728
16:19:02 INFO     同步到clickhouse server 127.0.0.1:9000
16:19:02 INFO     同步到clickhouse的數據庫: ['test001']
16:19:02 INFO     同步到clickhouse的表: ['tb1', 't_call_log1']

Insert data into the test001.t_call_log1 table in MySQL:

insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(1,001,1,0001,18535001234,'小花',0,null,null,now(),now(),1,1,1,0);
insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(2,001,1,0001,18535001234,'張婉',0,null,null,now(),now(),1,1,1,0);
insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(3,001,1,0001,18535001234,'李四',0,null,null,now(),now(),1,1,1,0);
insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(4,001,1,0001,18535001234,'王五',0,null,null,now(),now(),1,1,1,0);
insert into t_call_log1 (id,user_id,customer_id,city_id,phone,name,is_recorded,file_size,file_name,created_time,modified_time,call_type,call_id,status_id,contact_id)values(5,001,1,0001,18535001234,'趙六',0,null,null,now(),now(),1,1,1,0);

Check the synchronization log:

[root@node02 sync]# pypy mysql-clickhouse-replication.py --conf metainfo.conf --debug 
16:19:02 INFO     開始同步數據時間 2021-07-18 16:19:02
16:19:02 INFO     同步binlog pos點從文件讀取
16:19:02 INFO     從服務器 172.16.0.246:3306 同步數據
16:19:02 INFO     讀取binlog: mysql-bin.000111:360767728
16:19:02 INFO     同步到clickhouse server 127.0.0.1:9000
16:19:02 INFO     同步到clickhouse的數據庫: ['test001']
16:19:02 INFO     同步到clickhouse的表: ['tb1', 't_call_log1']

16:19:47 INFO     INSERT 數據插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 1, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u5c0f\u82b1', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 19, 47), u'modified_time': datetime.datetime(2021, 7, 18, 16, 19, 47), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}] 
16:20:46 INFO     INSERT 數據插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 2, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u5f20\u5a49', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 20, 46), u'modified_time': datetime.datetime(2021, 7, 18, 16, 20, 46), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}] 
16:21:33 INFO     INSERT 數據插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 3, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u674e\u56db', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 21, 33), u'modified_time': datetime.datetime(2021, 7, 18, 16, 21, 33), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}] 
16:21:33 INFO     INSERT 數據插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 4, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u738b\u4e94', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 21, 33), u'modified_time': datetime.datetime(2021, 7, 18, 16, 21, 33), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}] 
16:21:34 INFO     INSERT 數據插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 5, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u8d75\u516d', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 21, 34), u'modified_time': datetime.datetime(2021, 7, 18, 16, 21, 34), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}]

Verify the data in the ClickHouse table:

[root@node02 proxysql]#  clickhouse-client -h 127.0.0.1 -m -q "select * from test001.t_call_log1;"
5   1   1   1   18535001234 趙六  0   \N  \N  2021-07-18 16:21:34 2021-07-18 16:21:34 1   1   1   0
1   1   1   1   18535001234 小花  0   \N  \N  2021-07-18 16:19:47 2021-07-18 16:19:47 1   1   1   0
2   1   1   1   18535001234 張婉  0   \N  \N  2021-07-18 16:20:46 2021-07-18 16:20:46 1   1   1   0
3   1   1   1   18535001234 李四  0   \N  \N  2021-07-18 16:21:33 2021-07-18 16:21:33 1   1   1   0
4   1   1   1   18535001234 王五  0   \N  \N  2021-07-18 16:21:33 2021-07-18 16:21:33 1   1   1   0

Update a row in the MySQL table:

root@node01 16:23:  [test001]> update t_call_log1 set name='百萬' where id=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

root@node01 16:24:  [test001]> select * from t_call_log1;
+----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+
| id | user_id | customer_id | city_id | phone       | name   | is_recorded | file_size | file_name | created_time        | modified_time       | call_type | call_id | status_id | contact_id |
+----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+
|  1 |       1 |           1 |       1 | 18535001234 | 百萬   |             |      NULL | NULL      | 2021-07-18 16:19:47 | 2021-07-18 16:19:47 |         1 | 1       |         1 |          0 |
|  2 |       1 |           1 |       1 | 18535001234 | 張婉   |             |      NULL | NULL      | 2021-07-18 16:20:46 | 2021-07-18 16:20:46 |         1 | 1       |         1 |          0 |
|  3 |       1 |           1 |       1 | 18535001234 | 李四   |             |      NULL | NULL      | 2021-07-18 16:21:33 | 2021-07-18 16:21:33 |         1 | 1       |         1 |          0 |
|  4 |       1 |           1 |       1 | 18535001234 | 王五   |             |      NULL | NULL      | 2021-07-18 16:21:33 | 2021-07-18 16:21:33 |         1 | 1       |         1 |          0 |
|  5 |       1 |           1 |       1 | 18535001234 | 趙六   |             |      NULL | NULL      | 2021-07-18 16:21:34 | 2021-07-18 16:21:34 |         1 | 1       |         1 |          0 |
+----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+
5 rows in set (0.00 sec)

The sync log shows the following (the update is logged as an INSERT of the full new row):

16:24:12 INFO     INSERT 數據插入SQL: INSERT INTO test001.t_call_log1 VALUES, [{u'id': 1, u'user_id': 1, u'customer_id': 1, u'city_id': 1, u'phone': u'18535001234', u'name': u'\u767e\u4e07', u'is_recorded': '0', u'file_size': None, u'file_name': None, u'created_time': datetime.datetime(2021, 7, 18, 16, 19, 47), u'modified_time': datetime.datetime(2021, 7, 18, 16, 19, 47), u'call_type': 1, u'call_id': u'1', u'status_id': 1, u'contact_id': 0}]

Verify in ClickHouse:

[root@node02 proxysql]# clickhouse-client -h 127.0.0.1 -m -q "select * from test001.t_call_log1 where name='百萬';"
1   1   1   1   18535001234 百萬  0   \N  \N  2021-07-18 16:19:47 2021-07-18 16:19:47 1   1   1   0
[root@node02 proxysql]#
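
The log above shows that a MySQL UPDATE reaches ClickHouse as an INSERT carrying the complete post-update row. A minimal sketch of that mapping follows. It assumes the rows are dicts shaped like the row events of python-mysql-replication ("values" for inserts, "before_values" and "after_values" for updates). The function name and the "insert"/"update" labels are hypothetical and are not taken from the sync program itself. How the old row version is removed (a preceding delete, or a deduplicating table engine) is not visible in this log excerpt, so the sketch does not cover it.

```python
def row_images_to_insert(event_kind, rows):
    """Pick the row image that should be written to ClickHouse.

    event_kind: "insert" or "update" (hypothetical labels for this sketch).
    rows: list of dicts, {"values": ...} for inserts and
          {"before_values": ..., "after_values": ...} for updates.
    """
    images = []
    for row in rows:
        if event_kind == "insert":
            images.append(row["values"])
        elif event_kind == "update":
            # Only the post-update image is inserted, as in the log above.
            images.append(row["after_values"])
        else:
            raise ValueError("unsupported event kind: %r" % (event_kind,))
    return images


# The update from this walkthrough, reduced to two columns for brevity.
update_rows = [{
    "before_values": {"id": 1, "name": u"\u5c0f\u82b1"},
    "after_values": {"id": 1, "name": u"\u767e\u4e07"},
}]
print(row_images_to_insert("update", update_rows))
```

Because the full row image is needed on the ClickHouse side, this approach depends on binlog_row_image=full, as required earlier in the article.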

Delete rows from the MySQL table:

root@node01 16:26:  [test001]> delete from t_call_log1  where id in(4,5);
Query OK, 2 rows affected (0.00 sec)

root@node01 16:27:  [test001]> select * from t_call_log1;
+----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+
| id | user_id | customer_id | city_id | phone       | name   | is_recorded | file_size | file_name | created_time        | modified_time       | call_type | call_id | status_id | contact_id |
+----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+
|  1 |       1 |           1 |       1 | 18535001234 | 百萬   |             |      NULL | NULL      | 2021-07-18 16:19:47 | 2021-07-18 16:19:47 |         1 | 1       |         1 |          0 |
|  2 |       1 |           1 |       1 | 18535001234 | 張婉   |             |      NULL | NULL      | 2021-07-18 16:20:46 | 2021-07-18 16:20:46 |         1 | 1       |         1 |          0 |
|  3 |       1 |           1 |       1 | 18535001234 | 李四   |             |      NULL | NULL      | 2021-07-18 16:21:33 | 2021-07-18 16:21:33 |         1 | 1       |         1 |          0 |
+----+---------+-------------+---------+-------------+--------+-------------+-----------+-----------+---------------------+---------------------+-----------+---------+-----------+------------+
3 rows in set (0.00 sec)

The sync log shows the following:

16:27:18 INFO     DELETE 數據刪除SQL: alter table test001.t_call_log1 delete where id in (4) 
16:27:18 INFO     DELETE 數據刪除SQL: alter table test001.t_call_log1 delete where id in (5)
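
The log shows one "alter table ... delete" statement per deleted row. The sketch below builds a statement in the same format from a list of primary key values. The function name is hypothetical, and the sketch assumes an integer primary key, as with the id column in this example. Passing several keys at once yields a single statement, which may be worth considering because each ALTER TABLE ... DELETE in ClickHouse runs as a separate mutation.

```python
def build_delete_sql(database, table, pk_column, pk_values):
    """Build a ClickHouse delete statement in the format seen in the sync log.

    Assumes integer primary keys; int() rejects anything that is not a number.
    """
    if not pk_values:
        raise ValueError("pk_values must not be empty")
    ids = ", ".join(str(int(value)) for value in pk_values)
    return "alter table {0}.{1} delete where {2} in ({3})".format(
        database, table, pk_column, ids)


# One statement per row, as in the log above.
print(build_delete_sql("test001", "t_call_log1", "id", [4]))
# One statement covering both deleted rows.
print(build_delete_sql("test001", "t_call_log1", "id", [4, 5]))
```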

Verify in ClickHouse:

[root@node02 proxysql]# clickhouse-client -h 127.0.0.1 -m -q "select * from test001.t_call_log1 ;"
1   1   1   1   18535001234 百萬  0   \N  \N  2021-07-18 16:19:47 2021-07-18 16:19:47 1   1   1   0
2   1   1   1   18535001234 張婉  0   \N  \N  2021-07-18 16:20:46 2021-07-18 16:20:46 1   1   1   0
3   1   1   1   18535001234 李四  0   \N  \N  2021-07-18 16:21:33 2021-07-18 16:21:33 1   1   1   0

The results match: MySQL and ClickHouse both contain only the rows with id 1, 2, and 3.
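
The comparison was done here by eye. For larger tables, a check along the following lines could be scripted. This is only a sketch: the function name is hypothetical, the id lists are assumed to have been fetched from each database beforehand, and only primary keys are compared, not column values.

```python
def compare_primary_keys(mysql_ids, clickhouse_ids):
    """Return (missing, extra): ids missing from ClickHouse, and ids
    present in ClickHouse but no longer present in MySQL."""
    source = set(mysql_ids)
    target = set(clickhouse_ids)
    return sorted(source - target), sorted(target - source)


# The state after the delete test in this walkthrough.
missing, extra = compare_primary_keys([1, 2, 3], [1, 2, 3])
if not missing and not extra:
    print("primary keys match")
else:
    print("missing in ClickHouse: %s, extra in ClickHouse: %s" % (missing, extra))
```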

References:
http://www.javashuo.com/article/p-mwthqvpm-be.html

Special thanks to my senior colleague 鄧亞運 for providing this production solution.
