Original author: 洪斌
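A DELETE without a WHERE clause was executed on the master against a table with more than 500,000 rows. binlog_format was MIXED, but transaction_isolation was READ-COMMITTED (RC), so DML statements were logged in row format. The table has no primary key, only non-unique indexes. After more than 10 hours, the slave still had not finished replaying it.
First, let's look at how the slave replays the relay log in row mode. In row mode, the binlog records for each DML change the event description, the BEFORE IMAGE (BI), and the AFTER IMAGE (AI).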
+----------------------------+
|     Header: table id       |
|  column information etc.   |
+--------------+-------------+
| BEFORE IMAGE | AFTER IMAGE |
+--------------+-------------+
| BEFORE IMAGE | AFTER IMAGE |
+--------------+-------------+
Matrix of DML event types and the images they carry
+-------------------+--------------+-------------+
| EVENT TYPE        | BEFORE IMAGE | AFTER IMAGE |
+-------------------+--------------+-------------+
| WRITE_ROWS_EVENT  | No           | Yes         |
+-------------------+--------------+-------------+
| DELETE_ROWS_EVENT | Yes          | No          |
+-------------------+--------------+-------------+
| UPDATE_ROWS_EVENT | Yes          | Yes         |
+-------------------+--------------+-------------+
DELETE and UPDATE include a lookup step: the slave searches on the BI content to find the matching row and then applies the operation to it.
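The matrix above can be written down as a small lookup, as a minimal sketch. The names `IMAGES` and `needs_lookup` are hypothetical and exist only for illustration; they are not MySQL identifiers.

```python
# Which row images each row event type carries, per the matrix above.
# IMAGES and needs_lookup are illustrative names, not part of MySQL.
IMAGES = {
    "WRITE_ROWS_EVENT": {"before": False, "after": True},
    "DELETE_ROWS_EVENT": {"before": True, "after": False},
    "UPDATE_ROWS_EVENT": {"before": True, "after": True},
}


def needs_lookup(event_type):
    """An event needs a row lookup on the slave when it carries a BEFORE IMAGE."""
    return IMAGES[event_type]["before"]


for name in IMAGES:
    print(name, "needs lookup:", needs_lookup(name))
```

Only events that carry a BI have to find an existing row first, which is why the lookup strategy matters for deletes and updates but not for inserts.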
Replay of row-format binlog events happens mainly in Rows_log_event::do_apply_event, which calls the matching do_before_row_operations for the event type. Take a delete as an example:
Delete_rows_log_event::do_before_row_operations updates the SQL command counter (com_delete).
Next, Rows_log_event::row_operations_scan_and_key_setup is called to allocate the memory it needs:
Prepare memory structures for search operations. If search is performed:
1. using hash search => initialize the hash
2. using key => decide on key to use and allocate mem structures
3. using table scan => do nothing
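Which search strategy is chosen depends on the result of Rows_log_event::decide_row_lookup_algorithm_and_key. Its decision matrix depends on the table's index information and on the slave_rows_search_algorithms setting. Decision table: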
|--------------+-----------+------+------+------|
| Index\Option | I , T , H | I, T | I, H | T, H |
|--------------+-----------+------+------+------|
| PK / UK      | I         | I    | I    | Hi   |
| K            | Hi        | I    | Hi   | Hi   |
| No Index     | Ht        | T    | Ht   | Ht   |
|--------------+-----------+------+------+------|
I = index scan, T = table scan, H = hash scan, Hi = hash scan over index, Ht = hash scan over the entire table.
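To make the table easier to query, here is a minimal sketch that encodes it as a dictionary. `DECISION` and `decide_row_lookup` are hypothetical names, and the index kinds `PK_UK`, `K`, and `NONE` are labels invented for this example. It only mirrors the table and is not the server's implementation.

```python
# Illustrative encoding of the decision table; not MySQL's actual code.
# Keys are the option combinations listed in the table's header row.
DECISION = {
    frozenset({"INDEX_SCAN", "TABLE_SCAN", "HASH_SCAN"}): {"PK_UK": "I", "K": "Hi", "NONE": "Ht"},
    frozenset({"INDEX_SCAN", "TABLE_SCAN"}): {"PK_UK": "I", "K": "I", "NONE": "T"},
    frozenset({"INDEX_SCAN", "HASH_SCAN"}): {"PK_UK": "I", "K": "Hi", "NONE": "Ht"},
    frozenset({"TABLE_SCAN", "HASH_SCAN"}): {"PK_UK": "Hi", "K": "Hi", "NONE": "Ht"},
}


def decide_row_lookup(index_kind, algorithms):
    """Return 'I', 'T', 'Hi' or 'Ht' for an index kind and a comma-separated option string.

    Only the four combinations shown in the table are covered; anything
    else raises KeyError.
    """
    options = frozenset(part.strip().upper() for part in algorithms.split(","))
    return DECISION[options][index_kind]


# The table in this article has only non-unique keys (row "K").
print(decide_row_lookup("K", "TABLE_SCAN,INDEX_SCAN"))
print(decide_row_lookup("K", "INDEX_SCAN,HASH_SCAN"))
```

For a table with only non-unique keys, the two settings compared later in this article select I and Hi respectively.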
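The default value of slave_rows_search_algorithms is TABLE_SCAN,INDEX_SCAN; when an index is used, the corresponding function is Rows_log_event::do_index_scan_and_update.
With INDEX_SCAN,HASH_SCAN, the corresponding function is Rows_log_event::do_hash_scan_and_update.
Without a primary key, the slave iterates over every row in the binlog event, uses that row's BI to find the matching record, and then changes it to the corresponding AI.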
for each row in the event do
{
  search for the correct row to be modified using BI
  replace the row in the table with the corresponding AI
}
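The loop above can be sketched for a delete event as follows. This is a simplified model under stated assumptions: the table is a plain Python list of tuples, the search step is a linear scan standing in for whatever index or table scan the server performs, and `apply_delete_row_by_row` is a hypothetical name.

```python
def apply_delete_row_by_row(table, before_images):
    """Delete each BEFORE IMAGE from `table` with one separate search per event row.

    `table` is a list of tuples and is modified in place. The return value is
    the number of row comparisons made, as a rough cost measure.
    """
    comparisons = 0
    for bi in before_images:
        # One independent search for every row in the event.
        for pos, row in enumerate(table):
            comparisons += 1
            if row == bi:
                del table[pos]  # a delete has no AI; removing the row is the change
                break
    return comparisons


rows = [(1, "x"), (2, "y"), (3, "z")]
cost = apply_delete_row_by_row(rows, [(3, "z"), (2, "y"), (1, "x")])
print(rows, cost)
```

The point of the sketch is the shape of the cost: every row in the event pays for its own search, so the total work grows with the number of event rows times the cost of one search.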
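With HASH SCAN over table, the rows in the binlog event are hashed first and stored in a hash table. Each row in the table is then hashed and compared against the hash table, and where it matches, the AI is applied.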
for each row in the event do
{
  hash the row.
}

for each row in the table do
{
  key= hash the row;

  if (key is present in the hash)
  {
    apply the AI to the row.
  }
}
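A minimal sketch of hash scan over the whole table, again for a delete event. It assumes rows are hashable tuples and uses `collections.Counter` as the hash table so that duplicate rows in the event are matched the right number of times. `apply_delete_hash_over_table` is a hypothetical name.

```python
from collections import Counter


def apply_delete_hash_over_table(table, before_images):
    """Return the rows of `table` that survive deleting `before_images`.

    Phase 1 hashes every event row; phase 2 makes a single pass over the table.
    """
    # Phase 1: hash the event rows (the count handles duplicate rows).
    pending = Counter(before_images)

    # Phase 2: one pass over the table, probing the hash for each row.
    kept = []
    for row in table:
        if pending[row] > 0:
            pending[row] -= 1  # matched: this row is deleted
        else:
            kept.append(row)
    return kept


table = [(1, "a"), (2, "b"), (2, "b"), (3, "c")]
print(apply_delete_hash_over_table(table, [(2, "b"), (3, "c")]))
```

Compared with the row-by-row loop, the table is read once no matter how many rows the event contains, at the price of holding the event rows in memory.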
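With HASH SCAN over index, when a non-unique index exists, hashing the rows in the binlog event also stores each row's key in a de-duplicated key list. Rows are then fetched through the index using that key list. Each row found is hashed and compared against the hash table, and if it matches, the AI is applied.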
for each row in the event do
{
  hash the row.
  store the key in a list of distinct key.
}

for each row corresponding key values in the key list do
{
  key= hash the row;

  if (key is present in the hash)
  {
    apply the AI to the row.
  }
}
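Hash scan over index can be sketched the same way. The assumptions here are loud ones: the table is a dict from a made-up row id to a tuple, the non-unique index is rebuilt inside the function only so the example is self-contained (a real server already has it), and `apply_delete_hash_over_index` is a hypothetical name.

```python
from collections import Counter, defaultdict


def apply_delete_hash_over_index(table, index_col, before_images):
    """Delete `before_images` from `table` (dict: row_id -> tuple) in place.

    Returns the number of index lookups, which is one per distinct key value.
    """
    # Stand-in for an existing non-unique index on column `index_col`.
    index = defaultdict(list)
    for row_id, row in table.items():
        index[row[index_col]].append(row_id)

    # Hash the event rows and collect the distinct key values.
    pending = Counter(before_images)
    distinct_keys = {bi[index_col] for bi in before_images}

    index_lookups = 0
    for key in distinct_keys:
        index_lookups += 1  # one index range read per distinct key
        for row_id in index.get(key, []):
            row = table[row_id]
            if pending[row] > 0:
                pending[row] -= 1
                del table[row_id]  # matched: delete the row
    return index_lookups


table = {1: ("A", "c1"), 2: ("A", "c2"), 3: ("B", "c3"), 4: ("A", "c1")}
lookups = apply_delete_hash_over_index(
    table, 0, [("A", "c1"), ("A", "c2"), ("A", "c1")]
)
print(table, lookups)
```

In this sketch three event rows share one key value, so the index is probed once instead of three times. The saving depends on how many event rows share a key.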
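From the analysis above, we can infer that without a primary key, Hi should be faster than both Ht and index scan.
The test below compares two settings of slave_rows_search_algorithms, TABLE_SCAN,INDEX_SCAN and INDEX_SCAN,HASH_SCAN, to see which is more efficient when deleting a large table.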
CREATE TABLE `ants_bnzbw_temp` (
  `accrued_status` varchar(1) COLLATE utf8_bin DEFAULT NULL,
  `contract_no` varchar(32) COLLATE utf8_bin DEFAULT NULL,
  `business_date` date DEFAULT NULL,
  `prin_bal` int(11) DEFAULT NULL,
  `ovd_prin_bal` int(11) DEFAULT NULL,
  `ovd_int_bal` int(11) DEFAULT NULL,
  `int_amt` int(11) DEFAULT NULL,
  `ovd_prin_pnlt_amt` int(11) DEFAULT NULL,
  `ovd_int_pnlt_amt` int(11) DEFAULT NULL,
  KEY `accrued_status` (`accrued_status`) USING BTREE,
  KEY `contract_no` (`contract_no`) USING BTREE,
  KEY `business_date` (`business_date`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

master [localhost] {msandbox} (test) > select count(*) from ants_bnzbw_temp;
+----------+
| count(*) |
+----------+
|   522490 |
+----------+
1 row in set (0.15 sec)

master [localhost] {msandbox} (test) > delete from ants_bnzbw_temp;
Query OK, 522490 rows affected (25.86 sec)
slave_rows_search_algorithms='INDEX_SCAN,HASH_SCAN'
The transaction took about 2000 s (its execution time was not tracked in real time).
SET @@SESSION.GTID_NEXT= '00020594-1111-1111-1111-111111111111:237'/*!*/;
# at 221356832
#180102 14:04:48 server id 1  end_log_pos 221356895 CRC32 0xafdd018f  Query  thread_id=20  exec_time=25  error_code=0

---TRANSACTION 5582, ACTIVE 1447 sec
mysql tables in use 1, locked 1
2581 lock struct(s), heap size 319696, 799680 row lock(s), undo log entries 399840
Call stack sample
frame #2: 0x000000010f86d27e mysqld`ha_innobase::index_read(unsigned char*, unsigned char const*, unsigned int, ha_rkey_function) + 734
frame #3: 0x000000010f036b6c mysqld`handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function) + 140
frame #4: 0x000000010f6d5a94 mysqld`Rows_log_event::next_record_scan(bool) + 324
frame #5: 0x000000010f6d66cf mysqld`Rows_log_event::do_scan_and_update(Relay_log_info const*) + 159
frame #6: 0x000000010f6d7198 mysqld`Rows_log_event::do_apply_event(Relay_log_info const*) + 1064
frame #7: 0x000000010f718d42 mysqld`apply_event_and_update_pos(Log_event**, THD*, Relay_log_info*) + 530
frame #8: 0x000000010f711f46 mysqld`handle_slave_sql + 4438
slave_rows_search_algorithms='TABLE_SCAN,INDEX_SCAN'
The transaction had been running for more than 11145 s and still had not finished.
---TRANSACTION 4520, ACTIVE 11145 sec
mysql tables in use 1, locked 1
622 lock struct(s), heap size 90320, 191792 row lock(s), undo log entries 95896
Call stack sample
* frame #0: 0x0000000109fd9c3a mysqld`btr_search_s_lock(dict_index_t const*) + 58
frame #1: 0x0000000109fdb37f mysqld`btr_search_guess_on_hash(dict_index_t*, btr_search_t*, dtuple_t const*, unsigned long, unsigned long, btr_cur_t*, unsigned long, mtr_t*) + 479
frame #2: 0x0000000109fc84a9 mysqld`btr_cur_search_to_nth_level(dict_index_t*, unsigned long, dtuple_t const*, page_cur_mode_t, unsigned long, btr_cur_t*, unsigned long, char const*, unsigned long, mtr_t*) + 649
frame #3: 0x000000010a177324 mysqld`row_search_on_row_ref(btr_pcur_t*, unsigned long, dict_table_t const*, dtuple_t const*, mtr_t*) + 164
frame #4: 0x000000010a17746f mysqld`row_get_clust_rec(unsigned long, unsigned char const*, dict_index_t*, dict_index_t**, mtr_t*) + 175
frame #5: 0x000000010a1988e5 mysqld`row_vers_impl_x_locked(unsigned char const*, dict_index_t*, unsigned long const*) + 293
frame #6: 0x000000010a0f39db mysqld`lock_rec_convert_impl_to_expl(buf_block_t const*, unsigned char const*, dict_index_t*, unsigned long const*) + 603
frame #7: 0x000000010a0f4914 mysqld`lock_sec_rec_read_check_and_lock(unsigned long, buf_block_t const*, unsigned char const*, dict_index_t*, unsigned long const*, lock_mode, unsigned long, que_thr_t*) + 596
frame #8: 0x000000010a1802f1 mysqld`sel_set_rec_lock(btr_pcur_t*, unsigned char const*, dict_index_t*, unsigned long const*, unsigned long, unsigned long, que_thr_t*, mtr_t*) + 193
frame #9: 0x000000010a17e280 mysqld`row_search_mvcc(unsigned char*, page_cur_mode_t, row_prebuilt_t*, unsigned long, unsigned long) + 6720
frame #10: 0x000000010a0a027e mysqld`ha_innobase::index_read(unsigned char*, unsigned char const*, unsigned int, ha_rkey_function) + 734
frame #11: 0x0000000109869b6c mysqld`handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function) + 140
frame #12: 0x0000000109f09065 mysqld`Rows_log_event::do_index_scan_and_update(Relay_log_info const*) + 821
frame #13: 0x0000000109f0a198 mysqld`Rows_log_event::do_apply_event(Relay_log_info const*) + 1064
frame #14: 0x0
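The two InnoDB transaction snapshots can be turned into a rough throughput comparison. This is a back-of-the-envelope sketch under two assumptions that the source does not state: that "undo log entries" approximates the number of rows deleted so far, and that each run progresses at a constant rate. The helper name `rows_per_second` is hypothetical.

```python
def rows_per_second(undo_entries, active_seconds):
    """Average delete rate at the moment of the snapshot (assumes 1 undo entry per row)."""
    return undo_entries / active_seconds


TOTAL_ROWS = 522490  # row count reported by select count(*) on the master

# Figures copied from the two ---TRANSACTION snapshots.
hash_rate = rows_per_second(399840, 1447)    # INDEX_SCAN,HASH_SCAN
index_rate = rows_per_second(95896, 11145)   # TABLE_SCAN,INDEX_SCAN

# Linear extrapolation to the full table; an estimate only.
est_hash_seconds = TOTAL_ROWS / hash_rate
est_index_hours = TOTAL_ROWS / index_rate / 3600

print(f"hash scan rate:  {hash_rate:.1f} rows/s")
print(f"index scan rate: {index_rate:.1f} rows/s")
print(f"speed-up:        {hash_rate / index_rate:.1f}x")
print(f"estimated total: {est_hash_seconds:.0f} s vs {est_index_hours:.1f} h")
```

Under these assumptions the hash scan run is a few hundred rows per second against single digits for the index scan run, which is in line with the roughly 2000 s and the more than 10 hours reported in this article.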
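Testing shows that setting slave_rows_search_algorithms=INDEX_SCAN,HASH_SCAN greatly improves binlog replay performance in this scenario. This approach has some memory overhead, so the performance gain only appears when there is enough memory to build the hash table.
Suggested improvements for this problem:
Avoid running DELETE or UPDATE without a WHERE clause on large tables. If you need to delete the whole table, use TRUNCATE instead.
With row-format binlog, tables should have a primary key where possible.
Setting slave_rows_search_algorithms to INDEX_SCAN,HASH_SCAN gives some performance improvement.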