gravity 是摩拜單車出票的一個 異構/同構 數據複製通道軟件,提供主流軟件的支持,並支持k8s雲原生。比較看好它的發展。mysql
項目地址: https://github.com/moiot/gravitygit
官方文檔:https://github.com/moiot/gravity/blob/master/docs/2.0/01-quick-start.mdgithub
gravity的編譯和部署不是這裏的重點,咱們直接跳過。sql
gravity的部署:數據庫
cd /root/ git clone https://github.com/moiot/gravity.git cd gravity && make mkdir /usr/local/gravity/ cd /usr/local/gravity/ cp /root/gravity/bin/gravity /usr/local/gravity/ 配置文件這裏先忽略,
下面是個人架構圖:bash
業務場景:架構
一個老表,隨着業務量增大,考慮到分表,按照 user_id 作hash取模拆分,而後業務層面去作數據CRUD操做。ide
數據表以下:測試
create database testdb; use testdb; CREATE TABLE `gravity_t1` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT '自增id', `user_id` int(10) unsigned NOT NULL DEFAULT '0' COMMENT '用戶id', `s_status` tinyint(1) unsigned NOT NULL DEFAULT '0' COMMENT '狀態', PRIMARY KEY (`id`), KEY `idx_uid` (`user_id`) USING BTREE ) COMMENT = '測試表' ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4; 準備拆分後的4個分表: use testdb; create table t1_shard1 LIKE gravity_t1 ; create table t1_shard2 LIKE gravity_t1 ; create table t1_shard3 LIKE gravity_t1 ; create table t1_shard4 LIKE gravity_t1 ;
測試數據庫鏈接方式:ui
數據庫地址:192.168.2.4 超級帳號: dts 密碼: dts 假設業務用的普通帳號叫rd ,密碼無所謂。
造些測試用的數據:
for i in {1..10000} ; do mysql -hdts -pdts -h 192.168.2.4 -e "insert into testdb.gravity_t1 (user_id,s_status) values (\"$RANDOM\",'0');" done
結果大體這樣:
[test] > select count(*) from gravity_t1 ; +----------+ | count(*) | +----------+ | 10000 | +----------+ 1 row in set (0.007 sec) [testdb] > select (user_id%4) as hash_id,count(*) FROM gravity_t1 group by (user_id%4); +---------+----------+ | hash_id | count(*) | +---------+----------+ | 0| 2537 | | 1 | 2419 | | 2 | 2509 | | 3| 2535 | +---------+----------+ 4 rows in set (0.009 sec)
shard1的配置文件,內容以下:
cat config_shard1.toml
# name 必填,這裏保持每一個配置文件的惟一性 name = "shard1" # 內部用於保存位點、心跳等事項的庫名,默認爲 _gravity , 實測發現這裏改了名字也沒用,保持默認便可 internal-db-name = "_gravity" # # Input 插件的定義,此處定義使用 mysql # [input] type = "mysql" mode = "replication" [input.config.source] host = "192.168.2.4" username = "dts" password = "dts" port = 3306 # # Output 插件的定義,此處使用 mysql # [output] type = "mysql" [output.config.target] host = "192.168.2.4" username = "dts" password = "dts" port = 3306 # 路由規則的定義 [[output.config.routes]] match-schema = "testdb" match-table = "gravity_t1" target-schema = "testdb" target-table = "t1_shard1" # 這個target-table 表明的是須要寫入到的分片名稱,每一個gravity實例的配置中都須要修改
開4個窗口演示:
cd /usr/local/gravity/ ./bin/gravity -config config_shard1.toml -http-addr ":8083" ./bin/gravity -config config_shard2.toml -http-addr ":8184" ./bin/gravity -config config_shard3.toml -http-addr ":8185" ./bin/gravity -config config_shard4.toml -http-addr ":8186"
TIPS:
若是咱們此時開了數據庫的general_log的話, 能看到gravity到dest端是使用replace into方式插入全量數據的。而後再根據啓動時候監聽的binlog 實現增量數據的追平操做。
而後,看下 gravity 自動生成的庫,存放都是和數據複製相關的信息:
[testdb] > show tables from _gravity ; +----------------------+ | Tables_in__gravity | +----------------------+ | gravity_heartbeat_v2 | | gravity_positions | +----------------------+ 2 rows in set (0.000 sec) [testdb] > select * from _gravity.gravity_heartbeat_v2; +--------+--------+----------------------------+----------------------------+ | name | offset | update_time_at_gravity | update_time_at_source | +--------+--------+----------------------------+----------------------------+ | shard1 | 57 | 2020-03-26 16:19:08.070483 | 2020-03-26 16:19:08.070589 | | shard2 | 51 | 2020-03-26 16:19:07.469721 | 2020-03-26 16:19:07.469811 | | shard3 | 50 | 2020-03-26 16:19:09.135751 | 2020-03-26 16:19:09.135843 | | shard4 | 48 | 2020-03-26 16:19:08.448371 | 2020-03-26 16:19:08.448450 | +--------+--------+----------------------------+----------------------------+ 4 rows in set (0.001 sec) [testdb] > select * from _gravity.gravity_positions\G *************************** 1. row *************************** name: shard1 stage: stream position: {"current_position":{"binlog-name":"mysql-bin.000014","binlog-pos":28148767,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2600359"},"start_position":{"binlog-name":"mysql-bin.000014","binlog-pos":12866955,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2559919"}} created_at: 2020-03-26 16:16:14 updated_at: 2020-03-26 16:19:26 *************************** 2. row *************************** name: shard2 stage: stream position: {"current_position":{"binlog-name":"mysql-bin.000014","binlog-pos":28155813,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2600366"},"start_position":{"binlog-name":"mysql-bin.000014","binlog-pos":16601348,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2569941"}} created_at: 2020-03-26 16:16:31 updated_at: 2020-03-26 16:19:29 *************************** 3. row *************************** name: shard3 stage: stream position: {"current_position":{"binlog-name":"mysql-bin.000014","binlog-pos":28151964,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2600363"},"start_position":{"binlog-name":"mysql-bin.000014","binlog-pos":20333055,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2579960"}} created_at: 2020-03-26 16:16:35 updated_at: 2020-03-26 16:19:29 *************************** 4. row *************************** name: shard4 stage: stream position: {"current_position":{"binlog-name":"mysql-bin.000014","binlog-pos":28152473,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2600364"},"start_position":{"binlog-name":"mysql-bin.000014","binlog-pos":24076960,"binlog-gtid":"fd2adbd9-e263-11e8-847a-141877487b3d:1-2589987"}} created_at: 2020-03-26 16:16:40 updated_at: 2020-03-26 16:19:29 4 rows in set (0.000 sec)
TIPS:
到這一步,咱們的4個分表的數據同步都配好了,咱們能夠再插入幾條數據測試下。
-- insert into testdb.gravity_t1(user_id,s_status) values ('11111','0'); -- insert into testdb.gravity_t1(user_id,s_status) values ('11112','0'); -- 我這裏演示就不插了
原始和拆分表的數據條數對比:
[testdb] > select (user_id%4) as hash_id,count(*) FROM gravity_t1 group by (user_id%4); +---------+----------+ | hash_id | count(*) | +---------+----------+ | 0 | 2537 | | 1 | 2419 | | 2 | 2509 | | 3 | 2535 | +---------+----------+ 4 rows in set (0.009 sec
select count(*) FROM t1_shard1 where user_id%4=0; select count(*) FROM t1_shard2 where user_id%4=1; select count(*) FROM t1_shard3 where user_id%4=2; select count(*) FROM t1_shard4 where user_id%4=3;
先作一次對分表中不須要的數據的刪除操做,防止後期切換後刪除數據量過大:
delete from t1_shard1 where user_id %4!=0; delete from t1_shard2 where user_id %4!=1; delete from t1_shard3 where user_id %4!=2; delete from t1_shard4 where user_id %4!=3; ## 注意:生產環境大表的刪除操做,建議使用pt-archiver進行
而後,再到原始表和分表中查詢對比下數據是否一致:
select (user_id%4),count(*) as hash_id FROM gravity_t1 group by (user_id%4); select count(*) FROM t1_shard1 where user_id%4=0; select count(*) FROM t1_shard2 where user_id%4=1; select count(*) FROM t1_shard3 where user_id%4=2; select count(*) FROM t1_shard4 where user_id%4=3;
而後,等低峯期進行操做。
一、dba對涉及到的業務帳號 對這個大表寫權限回收掉
revoke insert,update,delete on testdb.gravity_t1 from rd@'%'; flush hosts; flush tables;
二、通知業務方發版,切換數據庫鏈接到4個新表
三、切換完成後,dba再執行一次刪除各個分表髒數據的操做,
delete from t1_shard1 where user_id %4!=0; delete from t1_shard2 where user_id %4!=1; delete from t1_shard3 where user_id %4!=2; delete from t1_shard4 where user_id %4!=3;
四、打開4個新表的寫權限
GRANT select,insert,update,delete on testdb.t1_shard1 TO rd@'%'; GRANT select,insert,update,delete on testdb.t1_shard2 TO rd@'%'; GRANT select,insert,update,delete on testdb.t1_shard3 TO rd@'%'; GRANT select,insert,update,delete on testdb.t1_shard4 TO rd@'%';
五、而後,通知業務方測試。
六、業務方驗證無問題後收工。至此,單表 拆分爲分表的操做所有完成。
七、回退方案,待補充 (打開gravity的雙向複製??)