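There are generally three ways to recover MySQL data:
1. The officially recommended full backup + binlog approach. The usual practice is to restore the most recent full backup first, then replay the binlog into the target database with mysqlbinlog --start-position --stop-position binlog.000xxx | mysql -uroot -p xxx -S database.
2. Recovery based on master-slave replication. The usual practice is to restore the most recent full backup first, attach the restored instance as a slave to the existing master, and then use start slave sql_thread until master_log_pos to recover up to a position just before the failure.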
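For reference, the UNTIL clause used in method 2 takes a master binlog file name together with a position. The sketch below only builds the statement text; the function name until_statement and the position passed to it are hypothetical, and the real stop position has to come from inspecting the binlog around the failure.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the START SLAVE ... UNTIL statement for a given
# master binlog file ($1) and stop position ($2). It does not connect to MySQL.
until_statement() {
    printf "START SLAVE SQL_THREAD UNTIL MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" "$1" "$2"
}

# Example with placeholder values (not taken from a real incident):
until_statement 3307-binlog.000002 12345
```

The printed statement can then be pasted into a mysql session on the restored slave.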
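This article tries a third recovery method: using the binlogs from the original master to recover all the data onto the slave.
Approach:
Since a relay log and a binlog are essentially the same thing, can we use MySQL's own sql_thread to apply the binlogs incrementally?
1) Initialize a new instance and restore the full backup.
2) Find the starting position in the first binlog file, along with all the remaining binlogs.
3) Disguise the binlogs as relay logs and recover incrementally through the SQL thread.
Use cases:
1. The most recent full backup is far from the failure point, so recovery with the two methods above is too slow.
2. A dual-master keepalived cluster. Unlike MHA, keepalived has no log-completion mechanism, so data can be lost on failure. If a failover to the slave happens while replication is badly delayed, the data becomes inconsistent and the missing logs have to be filled in.
1. Set up master-slave replication (this experiment uses traditional position-based replication; GTID would work as well)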
M1 :
root@localhost:mysql3307.sock [(none)]>select * from restore.t1;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    3 |
|  3 |    2 |
|  4 |    3 |
|  5 |    6 |
|  6 |    7 |
|  7 |    9 |
| 10 | NULL |
| 11 |   10 |
+----+------+
9 rows in set (0.00 sec)
M2 (slave):
root@localhost:mysql3307.sock [(none)]>select * from restore.t1;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    3 |
|  3 |    2 |
|  4 |    3 |
|  5 |    6 |
|  6 |    7 |
|  7 |    9 |
| 10 | NULL |
| 11 |   10 |
+----+------+
9 rows in set (0.00 sec)
root@localhost:mysql3307.sock [restore]>show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: m1
Master_User: repl
Master_Port: 3307
Connect_Retry: 60
Master_Log_File: 3307-binlog.000002
Read_Master_Log_Pos: 154
Relay_Log_File: M2-relay-bin.000004
Relay_Log_Pos: 371
Relay_Master_Log_File: 3307-binlog.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 154
Relay_Log_Space: 624
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 13307
Master_UUID: afeab8d6-b871-11e7-9b2a-005056b643b3
Master_Info_File: /data/mysql/3307/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
Record the slave's relay log information at this point:
[root@M2 data]# more M2-relay-bin.index
./M2-relay-bin.000003
./M2-relay-bin.000004
[root@M2 data]# more relay-log.info
7
./M2-relay-bin.000004
371
3307-binlog.000002
154
0
0
1
2. Use sysbench to simulate out-of-sync data
[root@M1 logs]# mysqladmin create sbtest
[root@M1 sysbench]# sysbench --db-driver=mysql --mysql-host=m1 --mysql-port=3307 --mysql-user=sbtest --mysql-password='sbtest' /usr/share/sysbench/oltp_common.lua --tables=4 --table-size=100000 --threads=2 --time=60 --report-interval=10 prepare
While the data is being loaded on the master, stop replication on the slave to create a data inconsistency:
root@localhost:mysql3307.sock [mysql]>stop slave
3. After sysbench finishes, compare the data on the master and the slave
Master:
root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest1;
+----------+
| count(1) |
+----------+
|   100000 |
+----------+
1 row in set (0.05 sec)
root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest2;
+----------+
| count(1) |
+----------+
|   100000 |
+----------+
1 row in set (0.05 sec)
root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest3;
+----------+
| count(1) |
+----------+
|   100000 |
+----------+
1 row in set (0.05 sec)
root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest4;
+----------+
| count(1) |
+----------+
|   100000 |
+----------+
1 row in set (0.05 sec)
Slave:
root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest4;
+----------+
| count(1) |
+----------+
|    67550 |
+----------+
1 row in set (0.06 sec)
root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest3;
+----------+
| count(1) |
+----------+
|    70252 |
+----------+
1 row in set (0.04 sec)
The master and slave are clearly out of sync.
4. Check the slave status at this point:
root@localhost:mysql3307.sock [(none)]>show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: m1
Master_User: repl
Master_Port: 3307
Connect_Retry: 60
Master_Log_File: 3307-binlog.000002
Read_Master_Log_Pos: 76364214
Relay_Log_File: M2-relay-bin.000004
Relay_Log_Pos: 64490301
Relay_Master_Log_File: 3307-binlog.000002
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 64490084
Relay_Log_Space: 76364861
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 0
Master_UUID: afeab8d6-b871-11e7-9b2a-005056b643b3
Master_Info_File: /data/mysql/3307/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
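Because the local relay log has not been fully applied yet (Exec_Master_Log_Pos is still behind Read_Master_Log_Pos), we first let the local relay log finish executing with start slave sql_thread, to keep the experiment accurate.
Check again: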
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: m1
Master_User: repl
Master_Port: 3307
Connect_Retry: 60
Master_Log_File: 3307-binlog.000002
Read_Master_Log_Pos: 76364214
Relay_Log_File: M2-relay-bin.000005
Relay_Log_Pos: 4
Relay_Master_Log_File: 3307-binlog.000002
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 76364214
Relay_Log_Space: 154
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 0
Master_UUID: afeab8d6-b871-11e7-9b2a-005056b643b3
Master_Info_File: /data/mysql/3307/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
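Rather than comparing the two status dumps by eye, the check can be scripted. The following is a minimal sketch, assuming the SHOW SLAVE STATUS\G text is piped in on stdin; the function name relay_fully_applied is hypothetical. It treats the relay log as fully applied when Exec_Master_Log_Pos equals Read_Master_Log_Pos and both refer to the same master binlog file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `SHOW SLAVE STATUS\G` output on stdin and succeed
# (exit 0) only if the SQL thread has executed everything the I/O thread fetched.
relay_fully_applied() {
    awk -F': *' '
        {
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # keys are right-aligned in real output
            gsub(/[ \t\r]+$/, "", val)
        }
        key == "Master_Log_File"       { read_file = val }
        key == "Read_Master_Log_Pos"   { read_pos  = val }
        key == "Relay_Master_Log_File" { exec_file = val }
        key == "Exec_Master_Log_Pos"   { exec_pos  = val }
        END {
            ok = (read_file != "" && read_file == exec_file && read_pos == exec_pos)
            exit (ok ? 0 : 1)
        }
    '
}
```

A typical call would pipe the output of the mysql client into the function, for example mysql -e 'show slave status\G' | relay_fully_applied, with connection options added as needed for the instance.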
The local relay log has now been fully applied. Record the latest relay log information:
[root@M2 data]# more relay-log.info
7
./M2-relay-bin.000005
4
3307-binlog.000002
76364214
0
0
1
0
0
1
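The information above is important: it shows that the slave has executed up to position 76364214 in the master's binlog 000002. Next we copy the master's binlogs over, disguise them as relay logs, and resume recovery from this position.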
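The coordinates can also be pulled out of relay-log.info with a small script. This is a sketch under one assumption about the file layout, consistent with the listing above: line 1 is a field count, line 2 the relay log file, line 3 the relay log position, line 4 the master binlog file, and line 5 the master binlog position. The function name read_relay_info is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the relay and master coordinates stored in a
# relay-log.info file ($1). Assumes the line layout described in the lead-in.
read_relay_info() {
    awk '
        NR == 2 { relay_file  = $0 }
        NR == 3 { relay_pos   = $0 }
        NR == 4 { master_file = $0 }
        NR == 5 { master_pos  = $0 }
        END {
            printf "relay=%s:%s master=%s:%s\n", relay_file, relay_pos, master_file, master_pos
        }
    ' "$1"
}
```

Running it against the file in the data directory, for example read_relay_info relay-log.info, prints both pairs on one line, which is convenient to keep in a recovery log.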
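5. Copy the binlogs to the target and disguise them as relay logs
Before copying, shut down the slave and add skip-slave-start to the cnf so that the slave does not start replicating automatically after a restart.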
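The cnf change can be made idempotent with a few lines of shell. This is a minimal sketch, assuming the option belongs in a [mysqld] section that already exists in the file; the function name ensure_skip_slave_start is hypothetical, and the cnf path is whatever the instance actually uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add skip-slave-start right after the [mysqld] header of
# the cnf file ($1) unless the option is already present.
# If the file has no [mysqld] section, nothing is added.
ensure_skip_slave_start() {
    local cnf=$1 tmp
    if grep -Eq '^[[:space:]]*skip[-_]slave[-_]start' "$cnf"; then
        return 0
    fi
    tmp=$(mktemp)
    awk '{ print } /^\[mysqld\]/ { print "skip-slave-start" }' "$cnf" > "$tmp" \
        && cat "$tmp" > "$cnf"   # overwrite in place to keep owner and mode
    rm -f "$tmp"
}
```

Calling the function twice leaves a single skip-slave-start line, so it is safe to rerun before each restart of the slave.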
[root@M2 data]# ll
total 185248
-rw-r----- 1 root root 461 Oct 24 17:14 3307-binlog.000001
-rw-r----- 1 root root 76364609 Oct 24 17:14 3307-binlog.000002
-rw-r----- 1 root root 203 Oct 24 17:14 3307-binlog.000003
-rw-r----- 1 root root 419 Oct 24 17:14 3307-binlog.000004
-rw-r----- 1 root root 164 Oct 24 17:14 3307-binlog.index
-rw-r----- 1 mysql mysql 56 Oct 24 15:08 auto.cnf
-rw-r----- 1 mysql mysql 4720 Oct 24 17:14 ib_buffer_pool
-rw-r----- 1 mysql mysql 12582912 Oct 24 17:14 ibdata1
-rw-r----- 1 mysql mysql 50331648 Oct 24 17:14 ib_logfile0
-rw-r----- 1 mysql mysql 50331648 Oct 24 17:11 ib_logfile1
-rw-r----- 1 mysql mysql 177 Oct 24 17:14 M2-relay-bin.000005
-rw-r----- 1 mysql mysql 22 Oct 24 17:11 M2-relay-bin.index
-rw-r----- 1 mysql mysql 122 Oct 24 17:14 master.info
drwxr-x--- 2 mysql mysql 4096 Oct 24 15:07 mysql
-rw------- 1 root root 0 Oct 24 15:08 nohup.out
drwxr-x--- 2 mysql mysql 4096 Oct 24 15:07 performance_schema
-rw-r----- 1 mysql mysql 68 Oct 24 17:14 relay-log.info
drwxr-x--- 2 mysql mysql 4096 Oct 24 15:07 restore
drwxr-x--- 2 mysql mysql 4096 Oct 24 16:47 sbtest
drwxr-x--- 2 mysql mysql 12288 Oct 24 15:07 sys
-rw-r----- 1 mysql mysql 24 Oct 24 15:07 xtrabackup_binlog_pos_innodb
-rw-r----- 1 mysql mysql 577 Oct 24 15:07 xtrabackup_info
Copy them under relay log names:
[root@M2 data]# cp 3307-binlog.000001 relay.000001
[root@M2 data]# cp 3307-binlog.000002 relay.000002
[root@M2 data]# cp 3307-binlog.000003 relay.000003
[root@M2 data]# cp 3307-binlog.000004 relay.000004
Change the ownership:
[root@M2 data]# chown mysql.mysql -R *
Edit the relay log index file so that the server recognizes the new files:
[root@M2 data]# cat M2-relay-bin.index
./relay.000001
./relay.000002
./relay.000003
./relay.000004
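With more than a handful of binlogs, the copy and the index edit are easier to script. The sketch below assumes the binlogs already sit in the data directory and are named <base>.<sequence>; the function name stage_binlogs_as_relay is hypothetical, and the ownership change is left out because it needs root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy every <base>.<sequence> binlog in the data
# directory ($1) to relay.<sequence> and rewrite the relay index file ($3)
# so it lists exactly those copies. $2 is the binlog base name.
stage_binlogs_as_relay() {
    local datadir=$1 base=$2 index=$3 f seq
    : > "$datadir/$index"                      # start from an empty index
    for f in "$datadir/$base".[0-9]*; do       # skips <base>.index
        [ -e "$f" ] || continue
        seq=${f##*.}
        cp "$f" "$datadir/relay.$seq"
        printf './relay.%s\n' "$seq" >> "$datadir/$index"
    done
}
```

For the layout in this experiment the call would be stage_binlogs_as_relay /data/mysql/3307/data 3307-binlog M2-relay-bin.index, run while the slave is shut down.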
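Edit the relay-log.info file to tell the server where to start applying. Because relay.000002 is a copy of 3307-binlog.000002, the relay log position is set to the same value as the master binlog position:
[root@M2 data]# cat relay-log.info
7
./relay.000002
76364214
3307-binlog.000002
76364214
0
0
1
0
0
1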
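The same edit can be derived from the values already in the file. This is a sketch, assuming the layout used throughout this article (line 2 relay log file, line 3 relay log position, line 4 master binlog file, line 5 master binlog position) and assuming the copies are named relay.<sequence>; the function name point_relay_info_at_copy is hypothetical. It relies on the copied file being byte-identical to the master binlog, so an offset in one is valid in the other.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite lines 2 and 3 of relay-log.info ($1) so the SQL
# thread resumes inside the copied binlog at the already-executed master position.
point_relay_info_at_copy() {
    local info=$1 tmp
    tmp=$(mktemp)
    awk '
        { line[NR] = $0 }
        END {
            n = split(line[4], part, ".")   # e.g. 3307-binlog.000002 -> 000002
            line[2] = "./relay." part[n]    # the copy with the same sequence number
            line[3] = line[5]               # same offset, the file is a plain copy
            for (i = 1; i <= NR; i++) print line[i]
        }
    ' "$info" > "$tmp" && cat "$tmp" > "$info"
    rm -f "$tmp"
}
```

All other lines are written back unchanged, so the field count and the trailing settings stay as the server left them.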
Finally, start the sql_thread to begin the fast recovery:
start slave sql_thread
6. Check whether the data is consistent
Slave:
root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest4;
+----------+
| count(1) |
+----------+
|   100000 |
+----------+
1 row in set (0.05 sec)
root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest3;
+----------+
| count(1) |
+----------+
|   100000 |
+----------+
1 row in set (0.05 sec)
The slave has recovered all of the missing data.
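Checking every table by hand does not scale, so the comparison can be scripted once the counts are exported. The sketch below assumes two text files with one "table count" pair per line, one from the master and one from the slave; how those files are produced is left open, and the function name diff_counts is hypothetical. Matching row counts are only a coarse check and do not prove the rows are identical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare "table count" lines from the master ($1) and
# the slave ($2). Prints each mismatch and exits non-zero if any is found.
diff_counts() {
    awk '
        NR == FNR { m[$1] = $2; next }          # first file: master counts
        {
            seen[$1] = 1
            if (m[$1] != $2) {
                printf "%s master=%s slave=%s\n", $1, m[$1], $2
                bad = 1
            }
        }
        END {
            for (t in m) if (!(t in seen)) {
                printf "%s master=%s slave=missing\n", t, m[t]
                bad = 1
            }
            exit (bad ? 1 : 0)
        }
    ' "$1" "$2"
}
```

With the counts from step 3, the function would report sbtest3 and sbtest4 as mismatched; after the recovery it prints nothing and exits with status 0.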