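1. Description
MySQL 5.7 master-slave replication: during batch operations the slave reports a replication lag of tens of thousands of seconds.
2. Symptoms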
(1) High I/O utilization

# iostat -dxm 1 1000
Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await    svctm    %util
scd0         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
vda          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
vdb          0.00    96.00     0.00  2596.00     0.00     8.54     6.74     1.33     0.51     0.37    95.30
vdc          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
vdd          0.00     0.00     0.00    11.00     0.00     0.06    11.64     0.00     0.09     0.09     0.10
vde          0.00     0.00     0.00     7.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
vdf          0.00     0.00     0.00   511.00     0.00     0.00     0.00     0.05     0.09     0.09     4.60
vdg          0.00     0.00     0.00   511.00     0.00     0.00     0.00     0.05     0.09     0.09     4.80
dm-0         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
dm-1         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
dm-2         0.00     0.00     0.00    34.00     0.00     0.23    13.65     0.02     0.59     0.38     1.30
dm-3         0.00     0.00     0.00  2144.00     0.00     8.38     8.00     1.40     0.65     0.45    97.20
dm-4         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
dm-5         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00

(2) dm-3 is the partition that holds the relay log and the binlog

$ ls -l /dev/mapper
total 0
lrwxrwxrwx 1 root root      7 Jul 23 23:20 backup-backup -> ../dm-0
crw-rw---- 1 root root 10, 58 Jul 23 23:20 control
lrwxrwxrwx 1 root root      7 Jul 23 23:20 VG00-lv_root -> ../dm-4
lrwxrwxrwx 1 root root      7 Jul 23 23:20 zxmysql-zxdba -> ../dm-1
lrwxrwxrwx 1 root root      7 Jul 23 23:20 zxmysql-zxlog -> ../dm-3

(3) Slave status

mysql> show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Queueing master event to the relay log
(omitted) .........................................
Connect_Retry: 60
Master_Log_File: mysql-bin.011494
Read_Master_Log_Pos: 21037034
Relay_Log_File: relay-log.001904
Relay_Log_Pos: 3154097
Relay_Master_Log_File: mysql-bin.011494
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 3153884
Relay_Log_Space: 21037535
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 471
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 400011
Master_UUID: 0f8507ea-6da1-11e8-8646-005056873c4a
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Reading event from the relay log
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 0f8507ea-6da1-11e8-8646-005056873c4a:14137114-19288497
Executed_Gtid_Set: 0f8507ea-6da1-11e8-8646-005056873c4a:1-19288446
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.01 sec)

ERROR: No query specified
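The Retrieved_Gtid_Set and Executed_Gtid_Set values in the slave status give a rough idea of how many transactions have been fetched by the I/O thread but not yet applied by the SQL thread. The sketch below does that subtraction for the values shown here. The function names gtid_last and gtid_backlog are made up for illustration, and the parsing assumes each set contains a single UUID with a single interval, as in this output; it would not handle multi-interval sets.

```shell
#!/bin/sh
# Last transaction number in a GTID set such as "uuid:14137114-19288497".
# Assumption: one UUID and one interval only (hypothetical helper).
gtid_last() {
    set_str=$1
    range=${set_str##*:}    # drop "uuid:"  -> "14137114-19288497"
    echo "${range##*-}"     # drop "start-" -> "19288497"
}

# Transactions retrieved into the relay log but not yet executed.
gtid_backlog() {
    retrieved=$(gtid_last "$1")
    executed=$(gtid_last "$2")
    echo $((retrieved - executed))
}

# Values copied from the slave status above; prints 51.
gtid_backlog \
    "0f8507ea-6da1-11e8-8646-005056873c4a:14137114-19288497" \
    "0f8507ea-6da1-11e8-8646-005056873c4a:1-19288446"
```

For this snapshot the result is 19288497 - 19288446 = 51 transactions waiting in the relay log.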
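3. Analysis
The symptoms above show that I/O utilization on the slave is too high, above 90%. The busy disk is the log disk, which holds the relay log and the binlog. The I/O thread is continuously writing the relay log and calling fdatasync to flush it to disk. The parameter involved here is sync_relay_log, which controls how many events are written to the relay log between syncs. Its default value is 10000, but the current value on this system is 1, so the relay log is synced to disk after every event.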
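As a rough cross-check of this reading, the iostat output in section 2 can be filtered for devices above a utilization threshold, and the average size of each write estimated as wMB/s divided by w/s. This is only a sketch: busy_devices is a made-up helper name, and the column positions assume the 12-column `iostat -dxm` layout shown earlier.

```shell
#!/bin/sh
# Print devices whose %util exceeds a threshold (default 90),
# together with the average size of each write in KB.
# Assumption: 12-column `iostat -dxm` layout (w/s = $5, wMB/s = $7, %util = $12).
busy_devices() {
    threshold=${1:-90}
    awk -v t="$threshold" '
        NF == 12 && $1 != "Device:" && $12 + 0 > t && $5 + 0 > 0 {
            # wMB/s divided by w/s, converted to KB per write
            printf "%s util=%.1f%% writes/s=%d avg_write_kb=%.1f\n",
                   $1, $12, $5, $7 * 1024 / $5
        }'
}

# Three rows copied from the iostat output in section 2.
busy_devices 90 <<'EOF'
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
vdb 0.00 96.00 0.00 2596.00 0.00 8.54 6.74 1.33 0.51 0.37 95.30
vdd 0.00 0.00 0.00 11.00 0.00 0.06 11.64 0.00 0.09 0.09 0.10
dm-3 0.00 0.00 0.00 2144.00 0.00 8.38 8.00 1.40 0.65 0.45 97.20
EOF
```

On dm-3 this works out to about 4 KB per write at more than 2000 writes per second, which is consistent with many small synchronous writes rather than large sequential ones. It does not by itself prove which thread issues them.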
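4. Solution
Tune both the I/O thread and the SQL thread: set sync_relay_log back to its default value, and use MTS (multi-threaded slave) to speed up the SQL thread.
stop slave;
-- SQL thread: apply transactions in parallel (MTS)
set global slave_parallel_type=logical_clock;
set global slave_parallel_workers=8;
-- I/O thread: restore the default sync intervals
set global sync_master_info=10000;
set global sync_relay_log=10000;
set global sync_relay_log_info=10000;
start slave;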
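After the change, one way to confirm that the slave is catching up is to watch Seconds_Behind_Master. The sketch below pulls a single field out of `show slave status \G` text read from stdin. The helper names slave_field and lag_ok and the 60-second threshold are illustrative assumptions, not part of the original procedure.

```shell
#!/bin/sh
# Print the value of one field from `show slave status \G` text on stdin.
slave_field() {
    awk -v key="$1:" '$1 == key { print $2; exit }'
}

# Succeed only if Seconds_Behind_Master is a number no greater than the limit.
# A NULL or missing value (replication not running) counts as failure.
lag_ok() {
    max=${1:-60}    # hypothetical threshold in seconds
    lag=$(slave_field Seconds_Behind_Master)
    case $lag in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$lag" -le "$max" ]
}

# Sample lines taken from the slave status in section 2; prints 471.
slave_field Seconds_Behind_Master <<'EOF'
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 471
EOF
```

Against a live server the same helpers could be fed from the client, for example `mysql -e 'show slave status\G' | lag_ok 60 && echo "caught up"`, with connection options added as needed for the environment.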