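An earlier article described how, after MySQL's lower_case_table_names parameter is changed, tables previously stored under uppercase names can no longer be recognized and need special handling.
Recently I ran into a case where an application developer, after changing this parameter, tried to clean up the tables that had been stored in uppercase and made a mistake that left a master-master pair out of sync. The test setup below starts from a table created under an uppercase name: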
root@mysqldb 22:43: [(none)]> create database test;
Query OK, 1 row affected (0.01 sec)

root@mysqldb 22:43: [(none)]> use test;
Database changed
root@mysqldb 22:43: [test]> create table TT(id int);
Query OK, 0 rows affected (0.07 sec)

root@mysqldb 22:43: [test]> show tables;
+----------------+
| Tables_in_test |
+----------------+
| TT             |
+----------------+
1 row in set (0.00 sec)
After lower_case_table_names is set to 1, dropping TT fails:

root@mysqldb 22:27: [test]> drop table TT;
ERROR 1051 (42S02): Unknown table 'test.tt'
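The lowercase name in the error message hints at the cause. Below is a rough sketch of the lookup, assuming that the server folds the requested name to lowercase and that the data directory sits on a case-sensitive filesystem. The temporary directory and variable names are illustrative only and are not MySQL internals:

```shell
#!/bin/sh
# Simplified model: fold the requested table name to lowercase, then
# look for a matching file on a case-sensitive filesystem.
datadir=$(mktemp -d)                 # stand-in for the real schema directory
touch "$datadir/TT.frm" "$datadir/TT.ibd"

requested="TT"
folded=$(printf '%s' "$requested" | tr '[:upper:]' '[:lower:]')

if [ -e "$datadir/$folded.frm" ]; then
  lookup="found"
else
  lookup="unknown table"
fi
echo "lookup for '$folded': $lookup"

rm -rf "$datadir"
```

The files are still on disk under their uppercase names; only the name used for the lookup has changed.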
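This is where the mistake happened. According to the person involved, a web search for this error suggested deleting the table's files at the OS level, so that is what they did.
I reproduce the same mistake here in a test environment: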
[root@test01 test]# rm TT.*
rm: remove regular file `TT.frm'? y
rm: remove regular file `TT.ibd'? y
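From the failure symptoms I infer the following sequence: the operator first did this on only one master node, then dropped the database on that node, and finally created a new database and recreated the tables. Only then did they notice that the other master was out of sync. After failing to fix it themselves, they reported the incident to the customer's DBA.
The symptom at this point: the database was dropped successfully on Master1, but replication on Master2 stopped with error 1010, an error while dropping the database. Details: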
root@mysqldb 23:04: [test]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.121
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mybinlog.000013
Read_Master_Log_Pos: 756
Relay_Log_File: test02-relay-bin.000034
Relay_Log_Pos: 532
Relay_Master_Log_File: mybinlog.000013
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1010
Last_Error: Error 'Error dropping database (can't rmdir './test', errno: 39)' on query. Default database: 'test'. Query: 'drop database test'
Skip_Counter: 0
Exec_Master_Log_Pos: 601
Relay_Log_Space: 1060
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1010
Last_SQL_Error: Error 'Error dropping database (can't rmdir './test', errno: 39)' on query. Default database: 'test'. Query: 'drop database test'
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1121
Master_UUID: 08c887bf-98ab-11ea-b70c-080027c2997a
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 200702 23:04:11
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 08c887bf-98ab-11ea-b70c-080027c2997a:549-550
Executed_Gtid_Set: 08c887bf-98ab-11ea-b70c-080027c2997a:5-549, 5d3f3359-98ab-11ea-8101-080027763d24:1-13
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)

root@mysqldb 23:04: [test]> \q
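On Linux, errno 39 is ENOTEMPTY ("Directory not empty"). The most likely reading is that the TT.* files, removed by hand only on the first master, are still sitting in ./test on Master2, so the directory cannot be removed. The sketch below reproduces just the OS-level part in a throwaway directory; the paths are made up for illustration and no MySQL server is involved:

```shell
#!/bin/sh
# Reproduce "can't rmdir ... Directory not empty" outside of MySQL.
base=$(mktemp -d)
mkdir "$base/test"
touch "$base/test/TT.frm" "$base/test/TT.ibd"   # leftover table files

# rmdir only removes empty directories, so this attempt fails.
if rmdir "$base/test" 2>/dev/null; then
  status="removed"
else
  status="not removed"
fi
echo "rmdir: $status"

rm -rf "$base"
```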
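To get replication running again, skip the failing transaction by committing an empty transaction under its GTID:

set gtid_next='$Master_UUID:$gno';
begin;
commit;
set gtid_next=automatic;
start slave;

In this case that means Master_UUID 08c887bf-98ab-11ea-b70c-080027c2997a and gno 550. (Executed_Gtid_Set ends at 549, so the failing transaction should be 549 or 550; the aim is to replace it with an empty transaction and thereby skip it.)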
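The gno to skip can also be worked out by comparing Retrieved_Gtid_Set with Executed_Gtid_Set for the master's UUID. The helper below is a minimal sketch under a strong assumption: each set is a single first-last interval for the same UUID, as in the output above. The function name missing_gnos is hypothetical. On a live server, MySQL's built-in GTID_SUBTRACT() function can do the same comparison on full GTID sets.

```shell
#!/bin/sh
# Print every gno in the retrieved interval that lies outside the executed
# interval. Both arguments are single "first-last" intervals for one UUID.
missing_gnos() {
  r_first=${1%-*}; r_last=${1#*-}
  e_first=${2%-*}; e_last=${2#*-}
  gno=$r_first
  while [ "$gno" -le "$r_last" ]; do
    if [ "$gno" -lt "$e_first" ] || [ "$gno" -gt "$e_last" ]; then
      echo "$gno"
    fi
    gno=$((gno + 1))
  done
}

# Retrieved 549-550, executed 5-549 for 08c887bf-98ab-11ea-b70c-080027c2997a
missing_gnos "549-550" "5-549"
```

With the values from the status output this prints 550, the one transaction that was received but not yet applied.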
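Note that gno values are consecutive. My first attempt used gtid_next='08c887bf-98ab-11ea-b70c-080027c2997a:549' and did not help, which is expected because 549 is already in Executed_Gtid_Set. So I tried 550 next:

set gtid_next='08c887bf-98ab-11ea-b70c-080027c2997a:550';
begin;
commit;
set gtid_next=automatic;
start slave;

After running this, check the slave status again to confirm that replication has recovered: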
root@mysqldb 23:11: [(none)]> set gtid_next='08c887bf-98ab-11ea-b70c-080027c2997a:550';
Query OK, 0 rows affected (0.00 sec)

root@mysqldb 23:11: [(none)]> begin;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb 23:11: [(none)]> commit;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb 23:11: [(none)]> set gtid_next=automatic;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb 23:11: [(none)]> start slave;
Query OK, 0 rows affected (0.01 sec)

root@mysqldb 23:11: [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.121
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mybinlog.000013
Read_Master_Log_Pos: 951
Relay_Log_File: test02-relay-bin.000034
Relay_Log_Pos: 687
Relay_Master_Log_File: mybinlog.000013
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 951
Relay_Log_Space: 1060
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1121
Master_UUID: 08c887bf-98ab-11ea-b70c-080027c2997a
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 08c887bf-98ab-11ea-b70c-080027c2997a:549-550
Executed_Gtid_Set: 08c887bf-98ab-11ea-b70c-080027c2997a:5-550, 5d3f3359-98ab-11ea-8101-080027763d24:1-14
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)

root@mysqldb 23:11: [(none)]>
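The skip statements follow a fixed pattern, so they can be generated from the UUID and gno instead of being typed by hand. This is only a sketch: the function name skip_gtid_sql is made up here, and the output is plain text to be reviewed before it is fed to a server.

```shell
#!/bin/sh
# Print the statements that commit an empty transaction under one GTID
# and then restart replication. Arguments: source UUID, gno.
skip_gtid_sql() {
  uuid=$1
  gno=$2
  cat <<EOF
SET gtid_next='${uuid}:${gno}';
BEGIN;
COMMIT;
SET gtid_next=AUTOMATIC;
START SLAVE;
EOF
}

skip_gtid_sql 08c887bf-98ab-11ea-b70c-080027c2997a 550
```

After checking the printed statements, they could be piped into the mysql client on the broken node; how that client connects and authenticates depends on the environment and is not shown here.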
Of course, remember to clean up the test database left behind on Master2; otherwise it remains a source of future problems.
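Before cleaning up, it helps to see what is actually left in the schema directory. The helper below is a hedged sketch that lists files whose names contain uppercase letters, which is what a table created before the parameter change would look like on disk. The function name uppercase_files is hypothetical and the demo runs against a throwaway directory, not a real data directory. It only lists files and deletes nothing inside the directory it inspects.

```shell
#!/bin/sh
# List entries in a directory whose names contain an uppercase letter.
uppercase_files() {
  for path in "$1"/*; do
    [ -e "$path" ] || continue          # directory is empty
    name=${path##*/}
    case $name in
      *[[:upper:]]*) printf '%s\n' "$name" ;;
    esac
  done
}

# Demo against a temporary directory that mimics the leftover schema.
demo=$(mktemp -d)
touch "$demo/TT.frm" "$demo/TT.ibd" "$demo/db.opt"
uppercase_files "$demo"
rm -rf "$demo"
```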