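Background
As PXC has gradually been rolled out, replication on our production databases has shifted from STATEMENT format to ROW format. That change in replication format has caused a number of replication problems.
Test objective
To resolve, to a certain extent, master-slave replication problems under ROW format, and to serve as the operations document for manual repair the next time a PXC cluster goes down.
Test environment
masterold02:7301
masterold03:7302
skavetest178:7303
Master configuration
Edit my.cnf (vim my.cnf) and add the following line:
binlog_format=ROW    # write the binlog in ROW format for replication
On each master, grant privileges to the replication user that the slave connects with: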
grant all on *.* to okooo_rep@'192.168.%.%' identified by 'Bjfcmlc@Mhxzkhl';
flush privileges;
Tests
Basic replication test
1. Make test178 a slave of old02 and replicate its data:
CHANGE MASTER TO MASTER_HOST='192.168.8.72',MASTER_USER='okooo_rep',MASTER_PASSWORD='Bjfcmlc@Mhxzkhl',
MASTER_PORT=7301,MASTER_LOG_FILE='logbin.000001',MASTER_LOG_POS=4;
start slave;
2. Check the replication status. test178 quickly catches up with old02:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.8.72
Master_User: okooo_rep
Master_Port: 7301
Connect_Retry: 60
Master_Log_File: logbin.000006
Read_Master_Log_Pos: 332
Relay_Log_File: relay.000007
Relay_Log_Pos: 475
Relay_Master_Log_File: logbin.000006
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
3. Make test178 a slave of old03 instead. test178 quickly catches up with old03 as well:
stop slave;
CHANGE MASTER TO MASTER_HOST='192.168.8.73',MASTER_USER='okooo_rep',MASTER_PASSWORD='Bjfcmlc@Mhxzkhl',MASTER_PORT=7302,MASTER_LOG_FILE='logbin.000001',MASTER_LOG_POS=4;
start slave;
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.8.73
Master_User: okooo_rep
Master_Port: 7302
Connect_Retry: 60
Master_Log_File: logbin.000005
Read_Master_Log_Pos: 332
Relay_Log_File: relay.000006
Relay_Log_Pos: 475
Relay_Master_Log_File: logbin.000005
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Summary: the basic replication test passes. When the databases are freshly built and hold identical data, test178 can replicate normally from either master, old02 or old03.
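The health check done by eye above (both Slave_IO_Running and Slave_SQL_Running must be Yes) can be sketched in a few lines of Python. This is only an illustration: the function names parse_slave_status and replication_healthy are hypothetical, and the sample text is a shortened copy of the output shown earlier.

```python
def parse_slave_status(text):
    """Parse the vertical (\\G) output of SHOW SLAVE STATUS into a dict."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "*** 1. row ***" separator.
        if not line or line.startswith("*"):
            continue
        # Split on the first colon only; error messages contain more colons.
        key, sep, value = line.partition(":")
        if sep:
            status[key.strip()] = value.strip()
    return status


def replication_healthy(status):
    """True only when both the I/O thread and the SQL thread are running."""
    return (status.get("Slave_IO_Running") == "Yes"
            and status.get("Slave_SQL_Running") == "Yes")


sample = """
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.8.72
Master_Port: 7301
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
"""

status = parse_slave_status(sample)
print(status["Master_Host"], replication_healthy(status))
```

A status where Slave_SQL_Running is No, as in the delete test later on, makes replication_healthy return False.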
Insert test
1. Create a new database and table on both old02 and old03:
create database row_slave;
use row_slave;
CREATE TABLE `row_test` (
`id` int(10) unsigned NOT NULL,
`hostname` varchar(20) NOT NULL default '',
`create_time` datetime NOT NULL default '0000-00-00 00:00:00',
`update_time` datetime NOT NULL default '0000-00-00 00:00:00',
PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=1 ;
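2. Insert data on old02 (the same rows were evidently loaded on old03 as well, since the next step finds them on all three nodes):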
insert into row_test values(1,'old02','2013-12-11 00:00:00','2013-12-11 00:00:00');
insert into row_test values(2,'old02','2013-12-11 00:00:00','2013-12-11 00:00:00');
insert into row_test values(3,'old03','2013-12-11 01:00:00','2013-12-11 01:00:00');
insert into row_test values(4,'old03','2013-12-11 01:00:00','2013-12-11 01:00:00');
3. Query old02, old03 and test178. All three return the rows:
mysql> select * from row_test;
+----+----------+---------------------+---------------------+
| id | hostname | create_time | update_time |
+----+----------+---------------------+---------------------+
| 1 | old02 | 2013-12-11 00:00:00 | 2013-12-11 00:00:00 |
| 2 | old02 | 2013-12-11 00:00:00 | 2013-12-11 00:00:00 |
| 3 | old03 | 2013-12-11 01:00:00 | 2013-12-11 01:00:00 |
| 4 | old03 | 2013-12-11 01:00:00 | 2013-12-11 01:00:00 |
+----+----------+---------------------+---------------------+
4. Insert data on old03. At this point test178 (slave) is replicating from old03 (master):
insert into row_test values(5,'old03','2013-12-11 02:00:00','2013-12-11 02:00:00');
insert into row_test values(6,'old03','2013-12-11 02:00:00','2013-12-11 02:00:00');
5. Query old03 and test178. Both return the new rows. test178 is now inconsistent with old02: the slave has two more rows than old02, id=5 and id=6.
+----+----------+---------------------+---------------------+
| id | hostname | create_time | update_time |
+----+----------+---------------------+---------------------+
| 1 | old02 | 2013-12-11 00:00:00 | 2013-12-11 00:00:00 |
| 2 | old02 | 2013-12-11 00:00:00 | 2013-12-11 00:00:00 |
| 3 | old03 | 2013-12-11 01:00:00 | 2013-12-11 01:00:00 |
| 4 | old03 | 2013-12-11 01:00:00 | 2013-12-11 01:00:00 |
| 5 | old03 | 2013-12-11 02:00:00 | 2013-12-11 02:00:00 |
| 6 | old03 | 2013-12-11 02:00:00 | 2013-12-11 02:00:00 |
+----+----------+---------------------+---------------------+
6. Insert data on old02. test178 is still replicating from old03 at this point, so old02 is not involved:
insert into row_test values(7,'old02','2013-12-11 03:00:00','2013-12-11 03:00:00');
insert into row_test values(8,'old02','2013-12-11 03:00:00','2013-12-11 03:00:00');
7. Read old02's binlog to find the position (pos) at which the rows with id=7 and id=8 were inserted:
cd /home/okooo/apps/tmp_slave01/logs
../bin/mysqlbinlog --no-defaults --base64-output=decode-rows -v -v ./logbin.000007
# at 1399
#131211 11:36:42 server id 1287301 end_log_pos 1472 Query thread_id=5 exec_time=0 error_code=0
SET TIMESTAMP=1386733002/*!*/;
BEGIN
/*!*/;
# at 1472
# at 1529
#131211 11:36:42 server id 1287301 end_log_pos 1529 Table_map: `row_slave`.`row_test` mapped to number 33
#131211 11:36:42 server id 1287301 end_log_pos 1585 Write_rows: table id 33 flags: STMT_END_F
### INSERT INTO row_slave.row_test
### SET
### @1=7 /* INT meta=0 nullable=0 is_null=0 */
### @2='old02' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
### @3=2013-12-11 03:00:00 /* DATETIME meta=0 nullable=0 is_null=0 */
### @4=2013-12-11 03:00:00 /* DATETIME meta=0 nullable=0 is_null=0 */
# at 1585
#131211 11:36:42 server id 1287301 end_log_pos 1612 Xid = 40
COMMIT/*!*/;
# at 1612
#131211 11:36:43 server id 1287301 end_log_pos 1685 Query thread_id=5 exec_time=0 error_code=0
SET TIMESTAMP=1386733003/*!*/;
BEGIN
/*!*/;
# at 1685
# at 1742
#131211 11:36:43 server id 1287301 end_log_pos 1742 Table_map: `row_slave`.`row_test` mapped to number 33
#131211 11:36:43 server id 1287301 end_log_pos 1798 Write_rows: table id 33 flags: STMT_END_F
### INSERT INTO row_slave.row_test
### SET
### @1=8 /* INT meta=0 nullable=0 is_null=0 */
### @2='old02' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
### @3=2013-12-11 03:00:00 /* DATETIME meta=0 nullable=0 is_null=0 */
### @4=2013-12-11 03:00:00 /* DATETIME meta=0 nullable=0 is_null=0 */
# at 1798
#131211 11:36:43 server id 1287301 end_log_pos 1825 Xid = 41
COMMIT/*!*/;
DELIMITER ;
# End of log file
8. Repoint test178 at old02, starting from that position:
stop slave;
CHANGE MASTER TO MASTER_HOST='192.168.8.72',MASTER_USER='okooo_rep',MASTER_PASSWORD='Bjfcmlc@Mhxzkhl',MASTER_PORT=7301,MASTER_LOG_FILE='logbin.000007',MASTER_LOG_POS=1399;
start slave;
show slave status\G
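9. After the repoint, the slave applies old02's changes, and test178 (the slave) now holds all the data: id in (1,2,3,4) is common to all three databases, id in (5,6) exists only on old03, and id in (7,8) exists only on old02.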
mysql> select * from row_test;
+----+----------+---------------------+---------------------+
| id | hostname | create_time | update_time |
+----+----------+---------------------+---------------------+
| 1 | old02 | 2013-12-11 00:00:00 | 2013-12-11 00:00:00 |
| 2 | old02 | 2013-12-11 00:00:00 | 2013-12-11 00:00:00 |
| 3 | old03 | 2013-12-11 01:00:00 | 2013-12-11 01:00:00 |
| 4 | old03 | 2013-12-11 01:00:00 | 2013-12-11 01:00:00 |
| 5 | old03 | 2013-12-11 02:00:00 | 2013-12-11 02:00:00 |
| 6 | old03 | 2013-12-11 02:00:00 | 2013-12-11 02:00:00 |
| 7 | old02 | 2013-12-11 03:00:00 | 2013-12-11 03:00:00 |
| 8 | old02 | 2013-12-11 03:00:00 | 2013-12-11 03:00:00 |
+----+----------+---------------------+---------------------+
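Summary: this confirms that differences between the rows in the slave's table and the master's table do not prevent newly inserted rows from replicating, as long as their primary keys do not already exist on the slave.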
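Picking MASTER_LOG_POS by reading the mysqlbinlog output means finding the "# at" offset of the event that carries the BEGIN of the transaction containing the row. A minimal Python sketch of that lookup follows. It assumes the decoded -v output format shown above and that the primary key is the first column (@1). The function name find_transaction_start is hypothetical, and the sample is an abbreviated copy of the log above.

```python
import re

AT_RE = re.compile(r"^# at (\d+)$")
PK_RE = re.compile(r"^###\s+@1=(\S+)")


def find_transaction_start(binlog_text, pk_value):
    """Return the offset of the BEGIN event of the first transaction
    whose row image has @1 equal to pk_value, or None if not found."""
    last_at = None      # most recent "# at N" offset seen
    begin_pos = None    # offset of the event carrying the current BEGIN
    for raw in binlog_text.splitlines():
        line = raw.strip()
        match = AT_RE.match(line)
        if match:
            last_at = int(match.group(1))
            continue
        if line == "BEGIN":
            begin_pos = last_at
            continue
        match = PK_RE.match(line)
        if match and match.group(1) == str(pk_value):
            return begin_pos
    return None


sample = """
# at 1399
#131211 11:36:42 server id 1287301 end_log_pos 1472 Query thread_id=5
BEGIN
/*!*/;
# at 1472
# at 1529
### INSERT INTO row_slave.row_test
### SET
### @1=7 /* INT meta=0 nullable=0 is_null=0 */
# at 1585
COMMIT/*!*/;
# at 1612
#131211 11:36:43 server id 1287301 end_log_pos 1685 Query thread_id=5
BEGIN
/*!*/;
# at 1685
# at 1742
### INSERT INTO row_slave.row_test
### SET
### @1=8 /* INT meta=0 nullable=0 is_null=0 */
# at 1798
COMMIT/*!*/;
"""

print(find_transaction_start(sample, 7))
print(find_transaction_start(sample, 8))
```

For id=7 this yields 1399, the value used as MASTER_LOG_POS in the CHANGE MASTER TO statement above.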
Update test
1. Change a row that exists on both old02 and test178. test178 is replicating from old02 at this point, and replication keeps working:
update row_test set update_time =now() ,hostname ='old021' where id=7;
2. Change a row that exists on both old03 and test178. test178 is replicating from old02, not old03, so this change on old03 only prepares for the next steps:
update row_test set update_time =now() ,hostname ='old031' where id=5;
3. Read old03's binlog to find the position to replicate from:
../bin/mysqlbinlog --no-defaults --base64-output=decode-rows -v -v ./logbin.000006
# at 1825
#131211 15:20:16 server id 1807302 end_log_pos 1906 Query thread_id=4 exec_time=0 error_code=0
SET TIMESTAMP=1386746416/*!*/;
SET @@session.time_zone='SYSTEM'/*!*/;
BEGIN
/*!*/;
# at 1906
# at 1963
#131211 15:20:16 server id 1807302 end_log_pos 1963 Table_map: `row_slave`.`row_test` mapped to number 33
#131211 15:20:16 server id 1807302 end_log_pos 2048 Update_rows: table id 33 flags: STMT_END_F
### UPDATE row_slave.row_test
### WHERE
### @1=5 /* INT meta=0 nullable=0 is_null=0 */
### @2='old03' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
### @3=2013-12-11 02:00:00 /* DATETIME meta=0 nullable=0 is_null=0 */
### @4=2013-12-11 02:00:00 /* DATETIME meta=0 nullable=0 is_null=0 */
### SET
### @1=5 /* INT meta=0 nullable=0 is_null=0 */
### @2='old031' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
### @3=2013-12-11 02:00:00 /* DATETIME meta=0 nullable=0 is_null=0 */
### @4=2013-12-11 15:20:16 /* DATETIME meta=0 nullable=0 is_null=0 */
# at 2048
#131211 15:20:16 server id 1807302 end_log_pos 2075 Xid = 32
COMMIT/*!*/;
DELIMITER ;
# End of log file
4. Repoint test178 at old03, starting from that position:
stop slave;
CHANGE MASTER TO MASTER_HOST='192.168.8.73',MASTER_USER='okooo_rep',MASTER_PASSWORD='Bjfcmlc@Mhxzkhl',MASTER_PORT=7302,MASTER_LOG_FILE='logbin.000006',MASTER_LOG_POS=1825;
start slave;
show slave status\G
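5. Query test178: the update has been applied. (This confirms that when different rows are modified, replicating from several masters in turn does not make them interfere with one another. At a deeper level, replication does not verify that table contents or row contents match between master and slave. The following steps test this further.)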
mysql> select * from row_test;
+----+----------+---------------------+---------------------+
| id | hostname | create_time | update_time |
+----+----------+---------------------+---------------------+
| 1 | old02 | 2013-12-11 00:00:00 | 2013-12-11 00:00:00 |
| 2 | old02 | 2013-12-11 00:00:00 | 2013-12-11 00:00:00 |
| 3 | old03 | 2013-12-11 01:00:00 | 2013-12-11 01:00:00 |
| 4 | old03 | 2013-12-11 01:00:00 | 2013-12-11 01:00:00 |
| 5 | old031 | 2013-12-11 02:00:00 | 2013-12-11 15:20:16 |
| 6 | old03 | 2013-12-11 02:00:00 | 2013-12-11 02:00:00 |
| 7 | old021 | 2013-12-11 03:00:00 | 2013-12-11 15:15:34 |
| 8 | old02 | 2013-12-11 03:00:00 | 2013-12-11 03:00:00 |
+----+----------+---------------------+---------------------+
6. Modify a row that exists on all three databases. First change the id=1 row on old03:
update row_test set update_time =now() ,hostname ='old032' where id=1;
7. test178 is replicating from old03, so after replication the slave shows:
mysql> select * from row_test where id=1;
+----+----------+---------------------+---------------------+
| id | hostname | create_time | update_time |
+----+----------+---------------------+---------------------+
| 1 | old032 | 2013-12-11 00:00:00 | 2013-12-11 15:49:53 |
+----+----------+---------------------+---------------------+
8. Modify the same row on old02:
update row_test set update_time =now() ,hostname ='old022' where id=1;
9. Read old02's binlog:
../bin/mysqlbinlog --no-defaults --base64-output=decode-rows -v -v ./logbin.000007
# at 2075
#131211 15:51:15 server id 1287301 end_log_pos 2156 Query thread_id=9 exec_time=0 error_code=0
SET TIMESTAMP=1386748275/*!*/;
BEGIN
/*!*/;
# at 2156
# at 2213
#131211 15:51:15 server id 1287301 end_log_pos 2213 Table_map: `row_slave`.`row_test` mapped to number 33
#131211 15:51:15 server id 1287301 end_log_pos 2298 Update_rows: table id 33 flags: STMT_END_F
### UPDATE row_slave.row_test
### WHERE
### @1=1 /* INT meta=0 nullable=0 is_null=0 */
### @2='old02' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
### @3=2013-12-11 00:00:00 /* DATETIME meta=0 nullable=0 is_null=0 */
### @4=2013-12-11 00:00:00 /* DATETIME meta=0 nullable=0 is_null=0 */
### SET
### @1=1 /* INT meta=0 nullable=0 is_null=0 */
### @2='old022' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
### @3=2013-12-11 00:00:00 /* DATETIME meta=0 nullable=0 is_null=0 */
### @4=2013-12-11 15:51:15 /* DATETIME meta=0 nullable=0 is_null=0 */
# at 2298
#131211 15:51:15 server id 1287301 end_log_pos 2325 Xid = 73
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
10. Repoint test178 at old02 (the slave now replicates from old02):
stop slave;
CHANGE MASTER TO MASTER_HOST='192.168.8.72',MASTER_USER='okooo_rep',MASTER_PASSWORD='Bjfcmlc@Mhxzkhl',MASTER_PORT=7301,MASTER_LOG_FILE='logbin.000007',MASTER_LOG_POS=2075;
start slave;
show slave status\G
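11. The change replicates: old02's data overwrites old03's data. As the first binlog we analysed already showed, a ROW-format update carries the full row image. Note that the before image in old02's binlog (hostname 'old02') no longer matches the slave's row (hostname 'old032'), yet the event is still applied, because on this table the row is located by its primary key.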
mysql> select * from row_test where id=1;
+----+----------+---------------------+---------------------+
| id | hostname | create_time | update_time |
+----+----------+---------------------+---------------------+
| 1 | old022 | 2013-12-11 00:00:00 | 2013-12-11 15:51:15 |
+----+----------+---------------------+---------------------+
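Summary: replicating from several masters in turn does not make them interfere with one another. At a deeper level, replication does not verify that table contents or row contents match between master and slave. A ROW-format update writes the full row image and is applied blindly, without checking the row's original contents.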
Delete test
1. Delete the id=1 row on test178:
delete from row_test where id=1;
2. Update the id=1 row on old02 (test178 is replicating from old02) and check the result on old02:
update row_test set update_time =now() ,hostname ='old023' where id=1;
mysql> select * from row_test where id=1;
+----+----------+---------------------+---------------------+
| id | hostname | create_time | update_time |
+----+----------+---------------------+---------------------+
| 1 | old023 | 2013-12-11 00:00:00 | 2013-12-11 16:09:12 |
+----+----------+---------------------+---------------------+
3. Check the replication status on test178:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.8.72
Master_User: okooo_rep
Master_Port: 7301
Connect_Retry: 60
Master_Log_File: logbin.000007
Read_Master_Log_Pos: 3078
Relay_Log_File: relay.000002
Relay_Log_Pos: 500
Relay_Master_Log_File: logbin.000007
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB: mysql
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Could not execute Update_rows event on table row_slave.row_test; Can't find record in 'row_test', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log logbin.000007, end_log_pos 2549
Skip_Counter: 0
Exec_Master_Log_Pos: 2325
Relay_Log_Space: 1399
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Update_rows event on table row_slave.row_test; Can't find record in 'row_test', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log logbin.000007, end_log_pos 2549
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1287301
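Error explanation: the table data on the master and the slave is inconsistent. In the experiments above, this error appeared only once the target row had been deleted from the slave.
At this point we have reproduced the replication failure seen on the backup database after the schedule PXC cluster went down.
Summary: when the row does not exist on the slave, the master's update cannot be applied.
Test summary: when the slave's table data differs from the master's, inserts still apply. An update overwrites the slave's row with the most recently applied row image. If the target row of an update or delete does not exist on the slave, the event fails with error 1032 and replication stops.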
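The behaviour summarised above can be captured in a toy model. The Python sketch below is not MySQL code. It only mimics what the tests observed for a table with a primary key: the slave finds the row by key, ignores the rest of the before image, and stops with error 1032 when the key is missing. The class and method names are hypothetical. The duplicate-key branch (error 1062) was not exercised in the tests above and is included as an assumption about default settings.

```python
class RowApplyError(Exception):
    """Raised when a row event cannot be applied (the SQL thread stops)."""

    def __init__(self, errno, message):
        super().__init__(message)
        self.errno = errno


class SlaveTable:
    """Toy model of a slave table keyed by primary key."""

    def __init__(self):
        self.rows = {}

    def write_row(self, pk, after):
        # Assumption: an insert for an existing key fails with 1062.
        if pk in self.rows:
            raise RowApplyError(1062, "Duplicate entry for key 'PRIMARY'")
        self.rows[pk] = after

    def update_row(self, pk, before, after):
        # The before image is used only to locate the row by primary key;
        # its other columns are never compared with the slave's row.
        if pk not in self.rows:
            raise RowApplyError(1032, "Can't find record")
        self.rows[pk] = after

    def delete_row(self, pk, before):
        if pk not in self.rows:
            raise RowApplyError(1032, "Can't find record")
        del self.rows[pk]


slave = SlaveTable()
# The slave's copy of id=1 came from old03.
slave.write_row(1, ("old032", "2013-12-11 15:49:53"))

# old02's update has a different before image, but it is applied anyway.
slave.update_row(1,
                 before=("old02", "2013-12-11 00:00:00"),
                 after=("old022", "2013-12-11 15:51:15"))
row_after_update = slave.rows[1]

# The row is deleted locally on the slave, then old02 updates it again.
del slave.rows[1]
failed_errno = None
try:
    slave.update_row(1,
                     before=("old022", "2013-12-11 15:51:15"),
                     after=("old023", "2013-12-11 16:09:12"))
except RowApplyError as exc:
    failed_errno = exc.errno

print(row_after_update, failed_errno)
```

In this model, re-inserting any row with id=1 before retrying the update makes it succeed, which is the idea behind repairing the slave's data by hand.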
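Repair methods
1. The brute-force method, which also has a significant effect on the data: skip the failing event. The skipped change is never applied on the slave.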
stop slave;
SET GLOBAL sql_slave_skip_counter=1;  -- skip one replicated event on the slave
start slave;
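2. For a small amount of data, the better method is to fix the slave's data by hand. As shown above, ROW format does not verify data consistency; it simply overwrites the row. So we only need to add the missing row back.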
insert into row_test values(1,'new_row',now(),now());
mysql> select * from row_test where id=1;  -- the row we added is made up: hostname='new_row'
+----+----------+---------------------+---------------------+
| id | hostname | create_time | update_time |
+----+----------+---------------------+---------------------+
| 1 | new_row | 2013-12-11 08:49:37 | 2013-12-11 08:49:37 |
+----+----------+---------------------+---------------------+
stop slave;
start slave;
mysql> select * from row_test where id=1;  -- after replication resumes, the row holds the replicated data
+----+----------+---------------------+---------------------+
| id | hostname | create_time | update_time |
+----+----------+---------------------+---------------------+
| 1 | old023 | 2013-12-11 00:00:00 | 2013-12-11 16:09:12 |
+----+----------+---------------------+---------------------+
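3. The safest method, and the one to use when the amount of data is large: find the point in the master's binlog where the id=1 row was written, and replay replication from there. (This takes longer; it is essentially point-in-time incremental recovery.)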
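For methods 2 and 3, the first step is to look at the failing event on the master. The error text names the binlog file and the event's end_log_pos, and Exec_Master_Log_Pos gives the last position the slave applied. The Python sketch below, under those assumptions, builds a mysqlbinlog command that brackets the failing transaction. The function name binlog_command_for_error is hypothetical, and the command is only built here, not executed.

```python
import re

# Matches the tail of Last_SQL_Error, e.g.
# "... the event's master log logbin.000007, end_log_pos 2549"
ERR_RE = re.compile(r"master log ([^,\s]+), end_log_pos (\d+)")


def binlog_command_for_error(last_sql_error, exec_master_log_pos):
    """Build a mysqlbinlog argument list covering the failing transaction."""
    match = ERR_RE.search(last_sql_error)
    if match is None:
        raise ValueError("no binlog coordinates found in the error message")
    log_file = match.group(1)
    end_pos = int(match.group(2))
    return [
        "mysqlbinlog",
        "--no-defaults",
        "--base64-output=decode-rows",
        "-v",
        "--start-position=%d" % exec_master_log_pos,
        "--stop-position=%d" % end_pos,
        log_file,
    ]


last_sql_error = (
    "Could not execute Update_rows event on table row_slave.row_test; "
    "Can't find record in 'row_test', Error_code: 1032; "
    "handler error HA_ERR_KEY_NOT_FOUND; "
    "the event's master log logbin.000007, end_log_pos 2549"
)

command = binlog_command_for_error(last_sql_error, 2325)
print(" ".join(command))
```

Run on the master in the binlog directory, the resulting command should print the row image of the update that failed, which shows the primary key of the row missing on the slave.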