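Background: the MySQL master and slave run on KVM virtual machines. The physical host is heavily oversubscribed, and under high load the VM using the most resources gets killed. After the VM was started again, master-slave replication was broken.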
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.0.0.230
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.002783
Read_Master_Log_Pos: 812026
Relay_Log_File: 10-0-0-236-relay-bin.000002
Relay_Log_Pos: 83853
Relay_Master_Log_File: master-bin.002781
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 973991562
Relay_Log_Space: 9657278
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 1128
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: f7c24af7-a54a-11e6-88b4-525400169c04
Master_Info_File: /Data/work/local/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: update
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)
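From the error information above, we can see that either the master's binlog or the slave's relay log is probably corrupted, so the slave fails when it reads the log and the replication thread reports an error.
The main cause is that the server lost power abruptly (here, the VM was killed), which corrupted the slave's relay log and in turn made the slave's replication thread fail.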
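As a rough illustration of how the status output can be checked for this kind of failure, here is a minimal Python sketch. It assumes you have captured the text of show slave status\G (one field per line) into a string. The function names `parse_slave_status` and `replication_problems` are made up for this example and are not part of any MySQL tool.

```python
def parse_slave_status(text):
    """Parse the vertical output of SHOW SLAVE STATUS into a dict of field -> value."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the "*** 1. row ***" separator, and lines without a field name.
        if not line or line.startswith("*") or ":" not in line:
            continue
        key, _, value = line.partition(":")  # split on the first colon only
        status[key.strip()] = value.strip()
    return status


def replication_problems(status):
    """Return a list of problem descriptions; an empty list means no problem was found."""
    problems = []
    # Both replication threads must report "Yes".
    for thread in ("Slave_IO_Running", "Slave_SQL_Running"):
        if status.get(thread) != "Yes":
            problems.append(f"{thread} is {status.get(thread)!r}")
    # A non-zero error number means the corresponding thread hit an error.
    for errno_field, msg_field in (("Last_IO_Errno", "Last_IO_Error"),
                                   ("Last_SQL_Errno", "Last_SQL_Error")):
        if status.get(errno_field, "0") != "0":
            problems.append(
                f"{errno_field}={status[errno_field]}: {status.get(msg_field, '')}"
            )
    return problems
```

Calling `replication_problems(parse_slave_status(output))` on a healthy slave should give an empty list. With a damaged relay log, you would expect the SQL thread to be stopped and a non-zero Last_SQL_Errno to show up.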
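Find the following fields in the output of show slave status\G:

Relay_Master_Log_File: master-bin.002781   # master binlog file containing the last event executed by the slave
Exec_Master_Log_Pos: 968089314             # position in that file up to which the slave has executed

Stop the slave, then set up replication again, starting from that binlog file and the position already executed. When CHANGE MASTER TO is given a new log file and position, it discards the existing relay logs, so the slave fetches the events from the master again instead of reading the damaged relay log.

mysql> stop slave;
Query OK, 0 rows affected (0.08 sec)

mysql> change master to master_log_file='master-bin.002781', master_log_pos=968089314;
Query OK, 0 rows affected (1.03 sec)

mysql> start slave;
Query OK, 0 rows affected (0.41 sec)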
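The recovery steps above can be sketched as a small Python helper that reads the two fields from the status text and produces the statements to run. This is only an illustration: `build_resync_statements` is a hypothetical name, the input is assumed to have one field per line, and the script only builds the SQL text. It does not connect to MySQL.

```python
import re


def build_resync_statements(status_text):
    """Build the STOP / CHANGE MASTER TO / START statements from SHOW SLAVE STATUS text."""
    wanted = {"Relay_Master_Log_File": None, "Exec_Master_Log_Pos": None}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in wanted:
            wanted[key.strip()] = value.strip()

    log_file = wanted["Relay_Master_Log_File"]
    log_pos = wanted["Exec_Master_Log_Pos"]
    # Refuse to build a statement from missing or suspicious values.
    if not log_file or not re.fullmatch(r"[\w.-]+", log_file):
        raise ValueError("Relay_Master_Log_File is missing or not a plain file name")
    if not log_pos or not log_pos.isdigit():
        raise ValueError("Exec_Master_Log_Pos is missing or not a number")

    return [
        "STOP SLAVE;",
        f"CHANGE MASTER TO MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={log_pos};",
        "START SLAVE;",
    ]


if __name__ == "__main__":
    # Values taken from the status fields quoted in this post.
    sample = "Relay_Master_Log_File: master-bin.002781\nExec_Master_Log_Pos: 968089314\n"
    for statement in build_resync_statements(sample):
        print(statement)
```

Generating the statements from the captured output, rather than typing the file name and position by hand, reduces the chance of a copy mistake. It is still worth reviewing them before running them in the mysql client.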