Case Study: Advancing the GTID to Fix Out-of-Sync MySQL Master-Master Replication

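An earlier article explained that after changing MySQL's lower_case_table_names parameter, tables that were stored with uppercase names can no longer be recognized and need special handling.
Recently I ran into a case where an application developer, after changing this parameter, made a mistake while trying to clean up the tables stored with uppercase names, which left a master-master pair out of sync.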

1. Reproducing the failure

With lower_case_table_names=0, create the test database test and the table TT:
root@mysqldb 22:43:  [(none)]> create database test;
Query OK, 1 row affected (0.01 sec)

root@mysqldb 22:43:  [(none)]> use test;
Database changed
root@mysqldb 22:43:  [test]> create table TT(id int);
Query OK, 0 rows affected (0.07 sec)

root@mysqldb 22:43:  [test]> show tables;
+----------------+
| Tables_in_test |
+----------------+
| TT             |
+----------------+
1 row in set (0.00 sec)

After switching to lower_case_table_names=1, dropping TT fails, because the server now looks for the lowercased name tt:

root@mysqldb 22:27:  [test]> drop table TT;
ERROR 1051 (42S02): Unknown table 'test.tt'

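This is where the mistake happened. According to the person who did it, a web search for this error suggested deleting the table's files at the OS level, so that is what they did.
I reproduce the same mistake here in a test environment: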

[root@test01 test]# rm TT.*
rm: remove regular file `TT.frm'? y
rm: remove regular file `TT.ibd'? y

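Judging from the symptoms afterwards, the operator at first did this on only one master node, then dropped the database on that node, and finally created a new database and recreated the tables. Only then did they notice that the other master was no longer in sync. After failing to fix it themselves, they escalated the incident to the customer's DBA.
The symptom at this point: the database was dropped successfully on Master1, but Master2 failed to replicate the statement with error 1010 (error dropping database), as shown below: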

root@mysqldb 23:04:  [test]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.121
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mybinlog.000013
          Read_Master_Log_Pos: 756
               Relay_Log_File: test02-relay-bin.000034
                Relay_Log_Pos: 532
        Relay_Master_Log_File: mybinlog.000013
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1010
                   Last_Error: Error 'Error dropping database (can't rmdir './test', errno: 39)' on query. Default database: 'test'. Query: 'drop database test'
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 601
              Relay_Log_Space: 1060
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1010
               Last_SQL_Error: Error 'Error dropping database (can't rmdir './test', errno: 39)' on query. Default database: 'test'. Query: 'drop database test'
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1121
                  Master_UUID: 08c887bf-98ab-11ea-b70c-080027c2997a
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: 
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 200702 23:04:11
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 08c887bf-98ab-11ea-b70c-080027c2997a:549-550
            Executed_Gtid_Set: 08c887bf-98ab-11ea-b70c-080027c2997a:5-549,
5d3f3359-98ab-11ea-8101-080027763d24:1-13
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

root@mysqldb 23:04:  [test]> \q
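When checking many nodes, the few fields that matter in this output (Slave_SQL_Running, Last_SQL_Errno and the two GTID sets) can be pulled out with a small script. The following is only a sketch: the function names and the shortened SAMPLE text are my own illustration, and it assumes the text passed in is the body of the vertical output shown above.

```python
import re

# Field names in the vertical output consist of letters and underscores only.
FIELD_NAME = re.compile(r"[A-Za-z_]+")

# Shortened copy of the status output above (illustrative subset of fields).
SAMPLE = """\
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
               Last_SQL_Errno: 1010
           Retrieved_Gtid_Set: 08c887bf-98ab-11ea-b70c-080027c2997a:549-550
            Executed_Gtid_Set: 08c887bf-98ab-11ea-b70c-080027c2997a:5-549,
5d3f3359-98ab-11ea-8101-080027763d24:1-13
                Auto_Position: 0
"""


def parse_slave_status(text):
    """Turn 'Field: value' lines into a dict, joining wrapped GTID sets."""
    status = {}
    last_key = None
    for line in text.splitlines():
        stripped = line.strip()
        key, sep, value = stripped.partition(":")
        if sep and FIELD_NAME.fullmatch(key):
            # A normal 'Field: value' line.
            last_key = key
            status[key] = value.strip()
        elif last_key and status[last_key].endswith(","):
            # A GTID set wrapped onto the next line: append it to the field.
            status[last_key] += stripped
    return status


def sql_thread_failed(status):
    """True when the SQL thread is stopped with a non-zero error number."""
    return (status.get("Slave_SQL_Running") == "No"
            and status.get("Last_SQL_Errno", "0") != "0")


status = parse_slave_status(SAMPLE)
print(status["Last_SQL_Errno"], sql_thread_failed(status))
```

The wrapped second line of Executed_Gtid_Set is the only multi-line value in the output, so the parser treats a line as a continuation only when the previous value ends with a comma.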

2. Fixing it by advancing the GTID

At this point an empty transaction can be used to skip the GTID (Global Transaction Identifier) of the failing transaction:
set gtid_next='$Master_UUID:$gno';
begin;
commit;
set gtid_next=automatic;
start slave;

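In this case the values are Master_UUID 08c887bf-98ab-11ea-b70c-080027c2997a and gno 550. For this UUID, Executed_Gtid_Set ends at 549 and Retrieved_Gtid_Set is 549-550, so the failing transaction should be 549 or 550, and the goal is to replace it with an empty transaction.
Note: the gno values are consecutive. My first attempt with gtid_next='08c887bf-98ab-11ea-b70c-080027c2997a:549' did not work, because 549 is already in Executed_Gtid_Set, so I then tried 550: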

set gtid_next='08c887bf-98ab-11ea-b70c-080027c2997a:550';
begin;
commit;
set gtid_next=automatic;
start slave;
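Rather than guessing between 549 and 550, the GTID to skip can be worked out by subtracting Executed_Gtid_Set from Retrieved_Gtid_Set. On the server, MySQL's built-in GTID_SUBTRACT() function performs this calculation. The sketch below does the same thing offline in Python; the function names are hypothetical helpers of mine, and it expands every interval into individual numbers, which is fine for small sets like this one but not for very large ranges.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:3' into {uuid: set of transaction numbers}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid, set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            # A single number such as '7' has no end; treat it as '7-7'.
            numbers.update(range(int(start), int(end or start) + 1))
    return result


def pending_gtids(retrieved, executed):
    """GTIDs that were retrieved from the master but not yet executed."""
    done = parse_gtid_set(executed)
    pending = []
    for uuid, numbers in parse_gtid_set(retrieved).items():
        for gno in sorted(numbers - done.get(uuid, set())):
            pending.append(f"{uuid}:{gno}")
    return pending


def skip_statements(gtid):
    """The empty-transaction statements from this article for one GTID."""
    return [
        f"set gtid_next='{gtid}';",
        "begin;",
        "commit;",
        "set gtid_next=automatic;",
        "start slave;",
    ]


# Values copied from the failing 'show slave status' output above.
RETRIEVED = "08c887bf-98ab-11ea-b70c-080027c2997a:549-550"
EXECUTED = ("08c887bf-98ab-11ea-b70c-080027c2997a:5-549,\n"
            "5d3f3359-98ab-11ea-8101-080027763d24:1-13")

for gtid in pending_gtids(RETRIEVED, EXECUTED):
    print("\n".join(skip_statements(gtid)))
```

Note that this only identifies transactions that have not been executed yet. Whether it is safe to skip one still has to be judged from Last_SQL_Error, as was done here for the failed drop database.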

After running these statements, check the slave status again to confirm that replication is back to normal:

root@mysqldb 23:11:  [(none)]> set gtid_next='08c887bf-98ab-11ea-b70c-080027c2997a:550';
Query OK, 0 rows affected (0.00 sec)

root@mysqldb 23:11:  [(none)]> begin;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb 23:11:  [(none)]> commit;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb 23:11:  [(none)]> set gtid_next=automatic;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb 23:11:  [(none)]> start slave;
Query OK, 0 rows affected (0.01 sec)

root@mysqldb 23:11:  [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.121
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mybinlog.000013
          Read_Master_Log_Pos: 951
               Relay_Log_File: test02-relay-bin.000034
                Relay_Log_Pos: 687
        Relay_Master_Log_File: mybinlog.000013
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 951
              Relay_Log_Space: 1060
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1121
                  Master_UUID: 08c887bf-98ab-11ea-b70c-080027c2997a
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 08c887bf-98ab-11ea-b70c-080027c2997a:549-550
            Executed_Gtid_Set: 08c887bf-98ab-11ea-b70c-080027c2997a:5-550,
5d3f3359-98ab-11ea-8101-080027763d24:1-14
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

root@mysqldb 23:11:  [(none)]>

Of course, remember to clean up the test database left behind on Master2, otherwise it remains a hidden risk for later.
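For context on what is left behind: errno 39 in the replication error is ENOTEMPTY ("Directory not empty") on Linux, meaning the ./test directory still contained files when the server tried to remove it. My assumption, consistent with the scenario but not verified here, is that these are the TT.frm and TT.ibd files, which were only deleted by hand on Master1. The sketch below reproduces the same OS-level behaviour in a throwaway temporary directory; it does not touch a real MySQL data directory, and the helper names are hypothetical.

```python
import errno
import os
import tempfile


def try_rmdir(path):
    """Return None if rmdir succeeds, otherwise the symbolic errno name."""
    try:
        os.rmdir(path)
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))


def leftover_files(path):
    """List what is still inside a database directory."""
    return sorted(os.listdir(path))


with tempfile.TemporaryDirectory() as datadir:
    # Stand-in for the 'test' database directory on Master2 (assumption).
    dbdir = os.path.join(datadir, "test")
    os.mkdir(dbdir)
    for name in ("TT.frm", "TT.ibd"):
        open(os.path.join(dbdir, name), "w").close()

    first = try_rmdir(dbdir)           # fails while files remain
    remaining = leftover_files(dbdir)  # shows what blocks the rmdir
    for name in remaining:
        os.remove(os.path.join(dbdir, name))
    second = try_rmdir(dbdir)          # succeeds once the directory is empty

print(first, remaining, second)
```

On a real server, inspect the directory first and make sure nothing in it is still needed before removing anything.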
