做者:趙黎明
愛可生 MySQL DBA 團隊成員,Oracle 10g OCM,MySQL 5.7 OCP,擅長數據庫性能問題診斷、事務與鎖問題的分析等,負責處理客戶 MySQL 及我司自研 DMP 平臺平常運維中的問題,對開源數據庫相關技術很是感興趣。
本文來源:原創投稿
*愛可生開源社區出品,原創內容未經受權不得隨意使用,轉載請聯繫小編並註明來源。
大部分 DBA 應該都已經熟悉並使用過一些閃回工具,諸如:binlog2sql、MyFlash。
今天要介紹的是另外一款基於 Go 編寫的 binlog 解析工具:my2sql,他的同門師兄還有 my2fback、binlog_inspector(binlog_rollback)。
爲何不直接稱其爲閃回工具呢?由於閃回只是它其中一個功能,除此以外,還可用於從 binlog 中解析出執行過的 SQL 語句用於數據補償,或者對線上執行的事務進行分析(捕獲大/長事務)。
Github 地址:https://github.com/liuhr/my2sqlmysql
-- 建立測試表 mysql> show create table t\G *************************** 1. row *************************** Table: t Create Table: CREATE TABLE `t` ( `id` int(11) NOT NULL AUTO_INCREMENT, `k` int(11) NOT NULL DEFAULT '0', `c` char(120) NOT NULL DEFAULT '', `pad` char(60) NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `k_1` (`k`) ) ENGINE=InnoDB AUTO_INCREMENT=10001 DEFAULT CHARSET=utf8 1 row in set (0.00 sec) mysql> select count(*) from t; +----------+ | count(*) | +----------+ | 10000 | +----------+ 1 row in set (0.00 sec) -- 查看測試表校驗值 mysql> checksum table t; +-------+------------+ | Table | Checksum | +-------+------------+ | zlm.t | 1966401574 | +-------+------------+ 1 row in set (0.02 sec) -- 查看當前binlog mysql> show master status; +------------------+----------+--------------+------------------+----------------------------------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+----------+--------------+------------------+----------------------------------------------+ | mysql-bin.000003 | 15994082 | | | 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-23123 | +------------------+----------+--------------+------------------+----------------------------------------------+ 1 row in set (0.00 sec) -- 刪除5000條記錄 mysql> delete from t where id<5001; Query OK, 5000 rows affected (0.04 sec) -- 執行my2sql生成回滾語句(rollback模式) 04:38 PM dmp1 (master) ~# ./my2sql -user zlm -password zlm -host 10.186.60.62 -port 3332 -work-type rollback -start-file mysql-bin.000003 -start-pos 15994082 --add-extraInfo -output-dir /tmp/my2sql_test [2020/12/25 16:38:17] [info] binlogsyncer.go:144 create BinlogSyncer with config {1113306 mysql 10.186.60.62 3332 zlm utf8 false false <nil> false Local false 0 0s 0s 0 false false 0} [2020/12/25 16:38:17] [info] events.go:58 start thread 1 to generate redo/rollback sql [2020/12/25 16:38:17] [info] events.go:58 start thread 2 to generate redo/rollback sql [2020/12/25 16:38:17] [info] events.go:208 start thread to write redo/rollback sql into file [2020/12/25 16:38:17] [info] binlogsyncer.go:360 begin to sync binlog from position (mysql-bin.000003, 15994082) [2020/12/25 16:38:17] [info] stats_process.go:166 start thread to analyze statistics from binlog [2020/12/25 16:38:17] [info] repl.go:15 start to get binlog from mysql [2020/12/25 16:38:17] [info] binlogsyncer.go:777 rotate to (mysql-bin.000003, 15994082) [2020/12/25 16:38:17] [info] events.go:242 finish processing mysql-bin.000003 16002473 [2020/12/25 16:38:22] [info] repl.go:83 deadline exceeded. [2020/12/25 16:38:22] [info] repl.go:17 finish getting binlog from mysql [2020/12/25 16:38:22] [info] stats_process.go:266 exit thread to analyze statistics from binlog [2020/12/25 16:38:22] [info] events.go:183 exit thread 1 to generate redo/rollback sql [2020/12/25 16:38:22] [info] events.go:183 exit thread 2 to generate redo/rollback sql [2020/12/25 16:38:22] [info] events.go:257 finish writing rollback sql into tmp files, start to revert content order of tmp files [2020/12/25 16:38:22] [info] rollback_process.go:15 start thread 1 to revert rollback sql files [2020/12/25 16:38:22] [info] rollback_process.go:41 start to revert tmp file /tmp/my2sql_test/.rollback.3.sql into /tmp/my2sql_test/rollback.3.sql [2020/12/25 16:38:22] [info] rollback_process.go:156 finish reverting tmp file /tmp/my2sql_test/.rollback.3.sql into /tmp/my2sql_test/rollback.3.sql [2020/12/25 16:38:22] [info] rollback_process.go:25 exit thread 1 to revert rollback sql files [2020/12/25 16:38:22] [info] events.go:270 finish reverting content order of tmp files [2020/12/25 16:38:22] [info] events.go:275 exit thread to write redo/rollback sql into file -- 查看生成的文件 04:40 PM dmp1 /tmp/my2sql_test# ll total 1228 -rw-r--r-- 1 root root 251 Dec 25 16:38 biglong_trx.txt -rw-r--r-- 1 root root 288 Dec 25 16:38 binlog_status.txt -rw-r--r-- 1 root root 1246880 Dec 25 16:38 rollback.3.sql ## 其中rollback.x.sql就是咱們閃回數據須要的sql文件(x對應mysql-binlog.00000x的文件編號x),因爲執行命令時只指定了binlog開始的位置,實例中後續執行的DML事務也都會被記錄 ## 另外2個文件分別是從binlog中獲取到的binlog狀態和事務信息,以後的案例會詳細展現說明,此處略過 -- 查看生成的回滾SQL文本文件 04:42 PM dmp1 /tmp/my2sql_test# tail -5 rollback.3.sql INSERT INTO `zlm`.`t` (`id`,`k`,`c`,`pad`) VALUES (4,5027,'54133149494-75722987476-23015721680-47254589498-40242947469-55055884969-23675271222-20181439230-74473404563-55407972672','88488171626-98596569412-94026374972-58040528656-38000028170'); INSERT INTO `zlm`.`t` (`id`,`k`,`c`,`pad`) VALUES (3,4990,'51185622598-89397522786-28007882305-52050087550-68686337807-48942386476-96555734557-05264042377-33586177817-31986479495','00592560354-80393027097-78244247549-39135306455-88936868384'); INSERT INTO `zlm`.`t` (`id`,`k`,`c`,`pad`) VALUES (2,5025,'13241531885-45658403807-79170748828-69419634012-13605813761-77983377181-01582588137-21344716829-87370944992-02457486289','28733802923-10548894641-11867531929-71265603657-36546888392'); INSERT INTO `zlm`.`t` (`id`,`k`,`c`,`pad`) VALUES (1,5015,'68487932199-96439406143-93774651418-41631865787-96406072701-20604855487-25459966574-28203206787-41238978918-19503783441','22195207048-70116052123-74140395089-76317954521-98694025897'); # datetime=2020-12-25_16:33:30 database=zlm table=t binlog=mysql-bin.000003 startpos=15994218 stoppos=16002473 05:10 PM dmp1 /tmp/my2sql_test# cat rollback.3.sql |grep "INSERT INTO"|wc -l 5000 ## 能夠看到,該閃回SQL文件中有5000個INSERT語句,正好對應以前刪除的5000條記錄 ## 閃回SQL文件中的「# datetime=...」這行的內容就是加了參數-add-extrainfo後加入的附加信息,能夠獲取每一個語句執行的具體時間和點位 ## 在實際狀況中,從binlog中解析出來的事務會很複雜,爲了便於分析,建議加上過濾庫、表的參數-databases和-tables,這樣生成的SQL文件也會小不少 -- 數據恢復(將誤刪數據導入) 05:17 PM dmp1 (master) ~# mysql32 < /tmp/my2sql_test/rollback.3.sql mysql: [Warning] Using a password on the command line interface can be insecure. 05:19 PM dmp1 (master) ~# mysql32 -Ne "select count(*) from zlm.t;checksum table zlm.t;" mysql: [Warning] Using a password on the command line interface can be insecure. +-------+ | 10000 | +-------+ +-------+------------+ | zlm.t | 1966401574 | +-------+------------+ ## 5000條刪除的數據導入表後,表的checksum與刪除前一致,說明該在表上沒有進行過其餘DML操做 ## 若是記錄數一致而checksum不一致,則認爲恢復後的數據仍然是不一致的,須要確認是否須要作更多的閃回操做
-- 用sysbench給一個持續寫入 01:34 AM dmp1 /usr/local/sysbench/share/sysbench# sysbench /usr/local/sysbench/share/sysbench/oltp_read_write.lua --db-driver=mysql --tables=1 --table_size=10000 --mysql-host=10.186.60.62 --mysql-port=3332 --mysql-db=zlm --mysql-user=zlm --mysql-password=zlm --report-interval=2 --threads=10 --time=600 --skip-trx=on --mysql-ignore-errors=1062,1213 --db-ps-mode=disable run sysbench 1.0.17 (using bundled LuaJIT 2.1.0-beta2) Running the test with following options: Number of threads: 10 Report intermediate results every 2 second(s) Initializing random number generator from current time Initializing worker threads... Threads started! [ 2s ] thds: 10 tps: 712.84 qps: 12964.69 (r/w/o: 10095.03/2855.81/13.85) lat (ms,95%): 27.17 err/s: 5.44 reconn/s: 0.00 [ 4s ] thds: 10 tps: 733.05 qps: 13289.89 (r/w/o: 10342.52/2939.79/7.57) lat (ms,95%): 28.16 err/s: 4.54 reconn/s: 0.00 ... 略 -- 確認從庫正常複製後中止IO線程 mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.186.60.62 Master_User: zlm Master_Port: 3332 Connect_Retry: 60 Master_Log_File: mysql-bin.000013 Read_Master_Log_Pos: 60687481 Relay_Log_File: relay-bin.000009 Relay_Log_Pos: 60685775 Relay_Master_Log_File: mysql-bin.000013 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 60685562 Relay_Log_Space: 594023093 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 623332 Master_UUID: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e Master_Info_File: /data/mysql/mysql3332/data/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Reading event from the relay log Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:58752-915239 Executed_Gtid_Set: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-915235 Auto_Position: 1 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: 1 row in set (0.00 sec) mysql> stop slave io_thread; Query OK, 0 rows affected (0.01 sec) mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Master_Host: 10.186.60.62 Master_User: zlm Master_Port: 3332 Connect_Retry: 60 Master_Log_File:mysql-bin.000013 Read_Master_Log_Pos: 78769827 Relay_Log_File: relay-bin.000009 Relay_Log_Pos: 78770040 Relay_Master_Log_File: mysql-bin.000013 Slave_IO_Running: No Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos:78769827 Relay_Log_Space: 612105439 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 623332 Master_UUID: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e Master_Info_File: /data/mysql/mysql3332/data/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:58752-941653 Executed_Gtid_Set: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-941653 Auto_Position: 1 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: 1 row in set (0.00 sec) -- 中止sysbench寫入 [ 120s ] thds: 10 tps: 820.27 qps: 14836.91 (r/w/o: 11540.32/3284.59/12.00) lat (ms,95%): 22.28 err/s: 4.00 reconn/s: 0.00 [ 122s ] thds: 10 tps: 718.49 qps: 13009.74 (r/w/o: 10126.80/2874.44/8.50) lat (ms,95%): 24.83 err/s: 4.50 reconn/s: 0.00 [ 124s ] thds: 10 tps: 781.74 qps: 14178.87 (r/w/o: 11029.89/3137.97/11.00) lat (ms,95%): 23.52 err/s: 6.50 reconn/s: 0.00 ^C ## 此時主庫的數據是比從庫多的(能夠看做是異步複製中的IO線程延遲),假設又正好發生了高可用切換,從庫被切成了主庫,咱們須要手動補償新主上缺失的這部分數據(若是高可用切換工具沒有這部分實現邏輯的話) -- 確認後續(缺失)事務的起始點位 從show slave status的輸出,咱們能夠獲知如下信息: 一、新主已執行過的事務集合: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-941653 二、對應舊主上的點位: Master_Log_File:mysql-bin.000013 Exec_Master_Log_Pos: 78769827 -- 查看舊主當前GTID信息 mysql> show master status; +------------------+-----------+--------------+------------------+------------------------------------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+-----------+--------------+------------------+------------------------------------------------+ | mysql-bin.000013 | 151283749 | | | 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-1047585 | +------------------+-----------+--------------+------------------+------------------------------------------------+ 1 row in set (0.00 sec) ## 相差了105932個事務,binlog從77M增加到150M左右 -- 也能夠離線解析binlog文件來確認file+pos的點位與GTID是不是對應 /usr/local/mysql5732/bin/mysqlbinlog -vv --base64-output=decode-rows mysql-bin.000013|less ... 略 #201226 1:36:31 server id 623332 end_log_pos 78769827 CRC32 0xe5032fcf Xid = 4003981 COMMIT/*!*/; # at78769827 //當1d7ef0f4-4593-11eb-9f04-02000aba3c3e:941653的事務COMMIT後,pos點位就停在這裏,是一致的 #201226 1:36:31 server id 623332 end_log_pos 78769892 CRC32 0x6cec07fc GTID last_committed=115069 sequence_number=115070 rbr_only=yes /*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/; SET @@SESSION.GTID_NEXT= '1d7ef0f4-4593-11eb-9f04-02000aba3c3e:941654'/*!*/; # at 78769892 #201226 1:36:31 server id 623332 end_log_pos 78769963 CRC32 0x0b9b5557 Query thread_id=89 exec_time=0 error_code=0 SET TIMESTAMP=1608917791/*!*/; BEGIN /*!*/; # at 78769963 #201226 1:36:31 server id 623332 end_log_pos 78770228 CRC32 0x75e40978 Rows_query # INSERT INTO sbtest1 (id, k, c, pad) VALUES (5020, 5013, '79652507036-05590009094-10370692577-33401396318-81508361252-10613546461-82822929332-17272183925-71915791860-00345159222', '25450417435-19336936168-49193845527-09907338597-56878802246') # at 78770228 #201226 1:36:31 server id 623332 end_log_pos 78770284 CRC32 0xfc0d0d16 Table_map: `zlm`.`sbtest1` mapped to number 204 # at 78770284 #201226 1:36:31 server id 623332 end_log_pos 78770509 CRC32 0xe1d44365 Write_rows: table id 204 flags: STMT_END_F ### INSERT INTO `zlm`.`sbtest1` ### SET ### @1=5020 /* INT meta=0 nullable=0 is_null=0 */ ### @2=5013 /* INT meta=0 nullable=0 is_null=0 */ ### @3='79652507036-05590009094-10370692577-33401396318-81508361252-10613546461-82822929332-17272183925-71915791860-00345159222' /* STRING(360) meta=61032 nullable=0 is_null=0 */ ### @4='25450417435-19336936168-49193845527-09907338597-56878802246' /* STRING(180) meta=65204 nullable=0 is_null=0 */ # at 78770509 #201226 1:36:31 server id 623332 end_log_pos 78770540 CRC32 0x1adbcf14 Xid = 4003993 COMMIT/*!*/; # at 78770540 #201226 1:36:31 server id 623332 end_log_pos 78770605 CRC32 0x52d9646f GTID last_committed=115070 sequence_number=115071 rbr_only=yes /*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/; SET @@SESSION.GTID_NEXT= '1d7ef0f4-4593-11eb-9f04-02000aba3c3e:941655'/*!*/; # at 78770605 #201226 1:36:31 server id 623332 end_log_pos 78770676 CRC32 0x7ed98e8d Query thread_id=85 exec_time=0 error_code=0 SET TIMESTAMP=1608917791/*!*/; BEGIN /*!*/; # at 78770676 #201226 1:36:31 server id 623332 end_log_pos 78770738 CRC32 0x28584bc5 Rows_query # UPDATE sbtest1 SET k=k+1 WHERE id=6248 -- 生成前滾SQL(從mysql-bin.000013的78769827開始) 01:55 PM dmp1 (master) ~# ./my2sql -user zlm -password zlm -host 10.186.60.62 -port 3332 -work-type 2sql -start-file mysql-bin.000013 -start-pos78769827 --add-extraInfo -output-dir /tmp/my2sql_test [2020/12/26 01:54:51] [info] binlogsyncer.go:144 create BinlogSyncer with config {1113306 mysql 10.186.60.62 3332 zlm utf8 false false <nil> false Local false 0 0s 0s 0 false false 0} [2020/12/26 01:54:51] [info] binlogsyncer.go:360 begin to sync binlog from position (mysql-bin.000013, 78769827) [2020/12/26 01:54:51] [info] events.go:208 start thread to write redo/rollback sql into file [2020/12/26 01:54:51] [info] events.go:58 start thread 1 to generate redo/rollback sql [2020/12/26 01:54:51] [info] events.go:58 start thread 2 to generate redo/rollback sql [2020/12/26 01:54:51] [info] stats_process.go:166 start thread to analyze statistics from binlog [2020/12/26 01:54:51] [info] repl.go:15 start to get binlog from mysql [2020/12/26 01:54:51] [info] binlogsyncer.go:777 rotate to (mysql-bin.000013, 78769827) [2020/12/26 01:54:51] [info] events.go:242 finish processing mysql-bin.000013 78770509 [2020/12/26 01:54:52] [info] events.go:242 finish processing mysql-bin.000013 89256973 [2020/12/26 01:54:53] [info] events.go:242 finish processing mysql-bin.000013 99742835 [2020/12/26 01:54:54] [info] events.go:242 finish processing mysql-bin.000013 110229245 [2020/12/26 01:54:55] [info] events.go:242 finish processing mysql-bin.000013 120715455 [2020/12/26 01:54:56] [info] events.go:242 finish processing mysql-bin.000013 131201391 [2020/12/26 01:54:57] [info] events.go:242 finish processing mysql-bin.000013 141687468 [2020/12/26 01:55:03] [info] repl.go:83 deadline exceeded. [2020/12/26 01:55:03] [info] repl.go:17 finish getting binlog from mysql [2020/12/26 01:55:03] [info] stats_process.go:266 exit thread to analyze statistics from binlog [2020/12/26 01:55:03] [info] events.go:183 exit thread 1 to generate redo/rollback sql [2020/12/26 01:55:03] [info] events.go:183 exit thread 2 to generate redo/rollback sql [2020/12/26 01:55:03] [info] events.go:272 finish writing redo/forward sql into file [2020/12/26 01:55:03] [info] events.go:275 exit thread to write redo/rollback sql into file -- 檢查生成的新文件 01:54 AM dmp1 /tmp/my2sql_test# ll -lrt total 25732 -rw-r--r-- 1 root root 107 Dec 26 01:54 biglong_trx.txt -rw-r--r-- 1 root root 26339348 Dec 26 01:55 forward.13.sql -rw-r--r-- 1 root root 432 Dec 26 01:55 binlog_status.txt 01:56 AM dmp1 /tmp/my2sql_test# cat binlog_status.txt binlog starttime stoptime startpos stoppos inserts updates deletes database table mysql-bin.000013 2020-12-26_01:36:31 2020-12-26_01:37:00 78770228 143726338 23713 47466 23715 zlm sbtest1 mysql-bin.000013 2020-12-26_01:37:01 2020-12-26_01:37:04 143726770 151283718 2755 5530 2753 zlm sbtest1 01:57 AM dmp1 /tmp/my2sql_test# cat biglong_trx.txt binlog starttime stoptime startpos stoppos rows duration tables -- 計算事務個數 01:57 AM dmp1 /tmp/my2sql_test# mysql -Ne "select 23713+47466+23715+2755+5530+2753;" +--------+ | 105932 | +--------+ 02:01 AM dmp1 /tmp/my2sql_test# cat forward.13.sql |grep INSERT|wc -l 26468 02:03 AM dmp1 /tmp/my2sql_test# cat forward.13.sql |grep UPDATE|wc -l 52996 02:03 AM dmp1 /tmp/my2sql_test# cat forward.13.sql |grep DELETE|wc -l 26468 02:04 AM dmp1 /tmp/my2sql_test# mysql32 -Ne "select 26468+52996+26468;" mysql: [Warning] Using a password on the command line interface can be insecure. +--------+ | 105932 | +--------+ 01:56 AM dmp1 /tmp/my2sql_test# scp forward.13.sql 10.186.60.68:/tmp forward.13.sql 100% 25MB 140.9MB/s 00:00 ## 前滾SQL文件有26M,拷貝到新主(60.68) ## 查看生成的binlog_status.txt文件,會統計每一個時間段(POS)區間內相關庫表所產生的DML次數 ## biglong_trx.txt文件記錄的是大/長事務,此文件爲空,說明沒有大/長事務 ## 由文件中記錄的DML總執行次數可知,確實是執行了105932 個事務,與以前估算的一致 -- 將差別數據導入新主 02:05 AM dmp2 (master) ~# mysql32 < /tmp/forward.13.sql mysql: [Warning] Using a password on the command line interface can be insecure. 02:05 AM dmp2 (master) ~# -- 查看從庫的事務寫入狀況 mysql> show slave status\G ... 略 Retrieved_Gtid_Set: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:58752-941653 Executed_Gtid_Set: 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-941653, 3f4e72ab-46af-11eb-9bac-02000aba3c44:1-105932 //新主寫入了105932個事務,與以前統計的值一致 Auto_Position: 1 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: 1 row in set (0.00 sec) -- 校驗舊主和新主的測試表 ## 舊主 mysql> checksum table sbtest1; +-------------+-----------+ | Table | Checksum | +-------------+-----------+ | zlm.sbtest1 | 670442058 | +-------------+-----------+ 1 row in set (0.01 sec) ## 新主 mysql> checksum table zlm.sbtest1; +-------------+-----------+ | Table | Checksum | +-------------+-----------+ | zlm.sbtest1 | 670442058 | +-------------+-----------+ 1 row in set (0.02 sec) ## 因爲本次測試中僅僅是對sbtest1表執行了DML操做,能夠認爲新主缺失的數據已經獲得補償 ## 在真實環境中,補償數據可能會比較麻煩一些,由於會涉及到不少庫表的操做,但原理是同樣的
3.1 準備linux
-- 建立一張500w行的大表並持續寫入 02:38 AM dmp1 /usr/local/sysbench/share/sysbench# sysbench /usr/local/sysbench/share/sysbench/oltp_read_write.lua --db-driver=mysql --tables=1 --table_size=5000000 --mysql-host=10.186.60.68 --mysql-port=3332 --mysql-db=zlm --mysql-user=zlm --mysql-password=zlm --report-interval=2 --threads=10 --time=600 --skip-trx=on --mysql-ignore-errors=1062,1213 --db-ps-mode=disable prepare sysbench 1.0.17 (using bundled LuaJIT 2.1.0-beta2) Initializing worker threads... Creating table 'sbtest1'... Inserting 5000000 records into 'sbtest1' ... 略 -- 查看主從延遲 ## 主庫正在執行的事務 mysql> show master status; +------------------+-----------+--------------+------------------+-----------------------------------------------------------------------------------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+-----------+--------------+------------------+-----------------------------------------------------------------------------------------------+ | mysql-bin.000018 | 215205384 | | | 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-941653, 3f4e72ab-46af-11eb-9bac-02000aba3c44:1-1335429 | +------------------+-----------+--------------+------------------+-----------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) ## 從庫回放狀態 11:22 AM dmp1 (master) ~# mysql32 -e "show slave status\G"|grep Running |grep -v "Running_State"|awk '{print $2}' && mysql32 -e "show slave status\G"|grep Behind |awk '{print $2}' mysql: [Warning] Using a password on the command line interface can be insecure. Yes Yes mysql: [Warning] Using a password on the command line interface can be insecure. 0 ## 此時並無發現SQL線程有延遲(Senconds_Behind_Master=0) -- 用my2sql工具分析mysql-bin.000018這個binlog 11:30 AM dmp1 (master) ~# ./my2sql -user zlm -password zlm -host 10.186.60.68 -port 3332 -work-type stats -start-file mysql-bin.000018 -start-pos 194 -output-dir /tmp/my2sql_test [2020/12/28 11:30:35] [info] binlogsyncer.go:144 create BinlogSyncer with config {1113306 mysql 10.186.60.68 3332 zlm utf8 false false <nil> false Local false 0 0s 0s 0 false false 0} [2020/12/28 11:30:35] [info] binlogsyncer.go:360 begin to sync binlog from position (mysql-bin.000018, 194) [2020/12/28 11:30:35] [info] stats_process.go:166 start thread to analyze statistics from binlog [2020/12/28 11:30:35] [info] repl.go:15 start to get binlog from mysql [2020/12/28 11:30:35] [info] binlogsyncer.go:777 rotate to (mysql-bin.000018, 194) [2020/12/28 11:30:44] [info] repl.go:83 deadline exceeded. [2020/12/28 11:30:44] [info] repl.go:17 finish getting binlog from mysql [2020/12/28 11:30:44] [info] stats_process.go:266 exit thread to analyze statistics from binlog -- 查看生成文件的內容 11:32 AM dmp1 /tmp/my2sql_test# cat binlog_status.txt binlog starttime stoptime startpos stoppos inserts updates deletes database table mysql-bin.000018 2020-12-28_11:24:43 2020-12-28_11:25:12 390 40472959 14704 29399 14703 zlm sbtest1 mysql-bin.000018 2020-12-28_11:25:13 2020-12-28_11:25:42 40473397 83094395 15482 30964 15481 zlm sbtest1 mysql-bin.000018 2020-12-28_11:25:43 2020-12-28_11:26:12 83094627 124446683 15020 30043 15020 zlm sbtest1 mysql-bin.000018 2020-12-28_11:26:13 2020-12-28_11:26:42 124446910 165109718 14771 29540 14771 zlm sbtest1 mysql-bin.000018 2020-12-28_11:26:43 2020-12-28_11:27:12 165110068 205873372 14806 29615 14807 zlm sbtest1 mysql-bin.000018 2020-12-28_11:27:13 2020-12-28_11:27:19 205873604 215205353 3391 6778 3390 zlm sbtest1 11:32 AM dmp1 /tmp/my2sql_test# cat biglong_trx.txt binlog starttime stoptime startpos stoppos rows duration tables ## 在18的binlog中寫入了大量的事務,從11:24:43(390)開始,到11:27:19(215205353),因爲沒有指定中止的位置,打印的內容就截止到執行my2sql的時間點
3.2 大事務分析git
-- 將主庫上測試表的主鍵刪除 mysql> show create table sbtest1\G *************************** 1. row *************************** Table: sbtest1 Create Table: CREATE TABLE `sbtest1` ( `id` int(11) NOT NULL AUTO_INCREMENT, `k` int(11) NOT NULL DEFAULT '0', `c` char(120) NOT NULL DEFAULT '', `pad` char(60) NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `k_1` (`k`) ) ENGINE=InnoDB AUTO_INCREMENT=5000001 DEFAULT CHARSET=utf8 1 row in set (0.01 sec) mysql> alter table sbtest1 modify id int not null,drop primary key; Query OK, 5000000 rows affected (1 min 33.31 sec) Records: 5000000 Duplicates: 0 Warnings: 0 mysql> show create table sbtest1\G *************************** 1. row *************************** Table: sbtest1 Create Table: CREATE TABLE `sbtest1` ( `id` int(11) NOT NULL, `k` int(11) NOT NULL DEFAULT '0', `c` char(120) NOT NULL DEFAULT '', `pad` char(60) NOT NULL DEFAULT '', KEY `k_1` (`k`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec) -- 主庫執行刪除400萬行記錄 mysql> select count(*) from sbtest1; +----------+ | count(*) | +----------+ | 5000000 | +----------+ 1 row in set (3.05 sec) mysql> delete from sbtest1 where id<4000001; Query OK, 4000000 rows affected (3 min 7.85 sec) -- 肯定主庫當前事務執行到的點位 mysql> show master status;select @@server_uuid; +------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------------+ | mysql-bin.000019 | 194 | | | 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-941653, 3f4e72ab-46af-11eb-9bac-02000aba3c44:1-1398898 | +------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) +--------------------------------------+ | @@server_uuid | +--------------------------------------+ | 3f4e72ab-46af-11eb-9bac-02000aba3c44 | +--------------------------------------+ 1 row in set (0.00 sec) ## 很明顯,這是一個大事務,主庫執行查詢全表記錄用了3s,執行刪除400w行記錄用了3.785s ## 由於要模擬主從延遲,先把測試表的主鍵刪除,再進行數據刪除 -- 查看主從延遲 12:57 PM dmp1 (master) ~# mysql32 -e "show slave status\G"|grep Running |grep -v "Running_State"|awk '{print $2}' && mysql32 -e "show slave status\G"|grep Behind |awk '{print $2}' mysql: [Warning] Using a password on the command line interface can be insecure. Yes Yes mysql: [Warning] Using a password on the command line interface can be insecure. 281 ... 略 12:58 PM dmp1 (master) ~# mysql32 -e "show slave status\G"|grep Running |grep -v "Running_State"|awk '{print $2}' && mysql32 -e "show slave status\G"|grep Behind |awk '{print $2}' mysql: [Warning] Using a password on the command line interface can be insecure. Yes Yes mysql: [Warning] Using a password on the command line interface can be insecure. 441 ## 如今已經開始出現延遲,而且Seconds_Behind_Master的值會愈來愈大(當主庫上的表沒有主鍵或惟一鍵時,從庫回放時須要全表掃描來定位每一行記錄,記錄越多,這個過程越慢,最終致使主從延遲) -- 執行my2sql,指定工做模式爲事務分析(指定10000爲大事務閾值) 01:10 PM dmp1 (master) ~# ./my2sql -user zlm -password zlm -host 10.186.60.68 -port 3332 -work-type stats -start-file mysql-bin.000018 -big-trx-row-limit 10000 -output-dir /tmp/my2sql_test [2020/12/28 13:11:18] [info] stats_process.go:166 start thread to analyze statistics from binlog [2020/12/28 13:11:18] [info] binlogsyncer.go:144 create BinlogSyncer with config {1113306 mysql 10.186.60.68 3332 zlm utf8 false false <nil> false Local false 0 0s 0s 0 false false 0} [2020/12/28 13:11:18] [info] binlogsyncer.go:360 begin to sync binlog from position (mysql-bin.000018, 4) [2020/12/28 13:11:18] [info] repl.go:15 start to get binlog from mysql [2020/12/28 13:11:18] [info] binlogsyncer.go:777 rotate to (mysql-bin.000018, 4) [2020/12/28 13:11:27] [info] binlogsyncer.go:777 rotate to (mysql-bin.000019, 4) [2020/12/28 13:11:27] [info] binlogsyncer.go:777 rotate to (mysql-bin.000019, 4) [2020/12/28 13:11:32] [info] repl.go:83 deadline exceeded. [2020/12/28 13:11:32] [info] repl.go:17 finish getting binlog from mysql [2020/12/28 13:11:32] [info] stats_process.go:266 exit thread to analyze statistics from binlog -- 查看生成的文件內容 01:13 PM dmp1 /tmp/my2sql_test# ll total 8 -rw-r--r-- 1 root root 260 Dec 28 13:11 biglong_trx.txt -rw-r--r-- 1 root root 1440 Dec 28 13:11 binlog_status.txt 01:13 PM dmp1 /tmp/my2sql_test# cat binlog_status.txt binlog starttime stoptime startpos stoppos inserts updates deletes database table mysql-bin.000018 2020-12-28_11:24:43 2020-12-28_11:25:12 390 40472959 14704 29399 14703 zlm sbtest1 mysql-bin.000018 2020-12-28_11:25:13 2020-12-28_11:25:42 40473397 83094395 15482 30964 15481 zlm sbtest1 mysql-bin.000018 2020-12-28_11:25:43 2020-12-28_11:26:12 83094627 124446683 15020 30043 15020 zlm sbtest1 mysql-bin.000018 2020-12-28_11:26:13 2020-12-28_11:26:42 124446910 165109718 14771 29540 14771 zlm sbtest1 mysql-bin.000018 2020-12-28_11:26:43 2020-12-28_11:27:12 165110068 205873372 14806 29615 14807 zlm sbtest1 mysql-bin.000018 2020-12-28_11:27:13 2020-12-28_11:27:19 205873604 215205353 3391 6778 3390 zlm sbtest1 mysql-bin.000018 2020-12-28_12:37:02 2020-12-28_12:37:31 215205990 249240812 30410 11715 5787 sbtest sbtest1 mysql-bin.000018 2020-12-28_12:37:32 2020-12-28_12:37:40 249241253 260168297 7263 5533 2758 sbtest sbtest1 mysql-bin.000018 2020-12-28_12:53:51 2020-12-28_12:53:51 260168687 1023424583 0 0 4000000 zlm sbtest1 01:13 PM dmp1 /tmp/my2sql_test# cat biglong_trx.txt binlog starttime stoptime startpos stoppos rows duration tables mysql-bin.000018 2020-12-28_12:53:51 2020-12-28_12:53:51 260168556 1023424614 4000000 0 [zlm.sbtest1(inserts=0, updates=0, deletes=4000000)] ## 因爲主庫仍然在寫mysql-bin.000018這個binlog文件,還未切換,咱們繼續以它爲起始位置分析 ## 執行命令的時候用了-big-trx-row-limit 10000來指定超過1w行的DML爲大事務;若是不指定該參數,默認超過500行就會被統計 ## 由生成的結果文件得知,在12:53:51的時候,在zml.sbtest1表上有一個deletes=4000000的大事務操做,與以前的操做相對應
3.3 長事務分析github
-- 建立一個測試表,顯式開啓10s事務 mysql> show master status; +------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------------+ | mysql-bin.000019 | 380 | | | 1d7ef0f4-4593-11eb-9f04-02000aba3c3e:1-941653, 3f4e72ab-46af-11eb-9bac-02000aba3c44:1-1398899 | +------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> create table t1(id int); Query OK, 0 rows affected (0.01 sec) mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> insert into t1(id) values(1);select sleep(10);commit; Query OK, 1 row affected (0.00 sec) +-----------+ | sleep(10) | +-----------+ | 0 | +-----------+ 1 row in set (10.00 sec) Query OK, 0 rows affected (0.00 sec) mysql> select count(*) from t1; +----------+ | count(*) | +----------+ | 1 | +----------+ 1 row in set (0.00 sec) -- 執行my2sql,工做模式選擇事務分析(指定5s爲長事務閾值) 02:03 PM dmp1 (master) ~# ./my2sql -user zlm -password zlm -host 10.186.60.68 -port 3332 -work-type stats -start-file mysql-bin.000019 -long-trx-seconds 5 -output-dir /tmp/my2sql_test [2020/12/28 14:10:01] [info] binlogsyncer.go:144 create BinlogSyncer with config {1113306 mysql 10.186.60.68 3332 zlm utf8 false false <nil> false Local false 0 0s 0s 0 false false 0} [2020/12/28 14:10:01] [info] binlogsyncer.go:360 begin to sync binlog from position (mysql-bin.000019, 4) [2020/12/28 14:10:01] [info] stats_process.go:166 start thread to analyze statistics from binlog [2020/12/28 14:10:01] [info] repl.go:15 start to get binlog from mysql [2020/12/28 14:10:01] [info] binlogsyncer.go:777 rotate to (mysql-bin.000019, 4) [2020/12/28 14:10:06] [info] repl.go:83 deadline exceeded. [2020/12/28 14:10:06] [info] repl.go:17 finish getting binlog from mysql [2020/12/28 14:10:06] [info] stats_process.go:266 exit thread to analyze statistics from binlog -- 查看生成的文件內容 02:10 PM dmp1 /tmp/my2sql_test# ll total 8 -rw-r--r-- 1 root root 249 Dec 28 14:10 biglong_trx.txt -rw-r--r-- 1 root root 288 Dec 28 14:10 binlog_status.txt 02:10 PM dmp1 /tmp/my2sql_test# cat binlog_status.txt binlog starttime stoptime startpos stoppos inserts updates deletes database table mysql-bin.000019 2020-12-28_14:09:02 2020-12-28_14:09:02 728 812 1 0 0 zlm t1 02:10 PM dmp1 /tmp/my2sql_test# cat biglong_trx.txt binlog starttime stoptime startpos stoppos rows duration tables mysql-bin.000019 2020-12-28_14:09:02 2020-12-28_14:09:12 605 843 1 10 [zlm.t1(inserts=1, updates=0, deletes=0)] ## 首先,肯定事務開始的binlog位置爲mysql-bin.000019,從這個文件開始解析 ## 用參數-long-trx-seconds指定長事務的閾值爲10s,只要超過這個值的事務就會被統計;若是不指定該參數,默認執行超過5min的事務會被統計,建議實際使用的時候指定這個參數,並設置較小的閾值進行分析 ## 生成的統計文件中記錄了剛纔執行的10s事務(即zlm.t1表上的insert操做)