Copyright notice: This is an original article by Zhang Qinglin; please credit the source when reposting.
Original article: https://www.qcloud.com/community/article/181html
Source: Tengyun Ge (Tencent Cloud community) https://www.qcloud.com/communitynode
Coming in to work on Monday, I first asked my colleagues how last week's testing had gone, and was told that in a multi-instance scenario the MySQL Server hung and testing could not continue. The vanilla version did not have this problem; it appeared only in our new build. My heart sank. Fortunately the live environment was still up, which gave us a good setting for troubleshooting, so I plunged straight into the investigation... The problem instance looked like this:
pstack $(pidof mysqld) > 20002.info    # dump all thread stacks of the hung mysqld
[root@TENCENT64 ~]# pt-pmp 20002.info
    305 select(libc.so.6),os_thread_sleep(os0thread.cc:261),srv_conc_enter_innodb_with_atomics(srv0conc.cc:276),srv_conc_enter_innodb(srv0conc.cc:276),innobase_srv_conc_enter_innodb(ha_innodb.cc:1173),ha_innobase::index_read(ha_innodb.cc:1173),handler::index_read_idx_map(handler.cc:6846),handler::ha_index_read_idx_map(handler.cc:2787),join_read_const(handler.cc:2787),join_read_const_table(handler.cc:2787),make_join_statistics(sql_optimizer.cc:3592),JOIN::optimize(sql_optimizer.cc:363),mysql_execute_select(sql_select.cc:1087),mysql_select(sql_select.cc:1087),handle_select(sql_select.cc:110),execute_sqlcom_select(sql_parse.cc:5156),mysql_execute_command(sql_parse.cc:2656),mysql_parse(sql_parse.cc:6408),dispatch_command(sql_parse.cc:1340),do_command(sql_parse.cc:1037),do_handle_one_connection(sql_connect.cc:990),handle_one_connection(sql_connect.cc:906),start_thread(libpthread.so.0),clone(libc.so.6)
     68 pthread_cond_wait,os_cond_wait(os0sync.cc:214),os_event_wait_low(os0sync.cc:214),sync_array_wait_event(sync0arr.cc:424),mutex_spin_wait(sync0sync.cc:580),mutex_enter_func(sync0sync.ic:218),pfs_mutex_enter_func(sync0sync.ic:218),log_reserve_and_write_fast(sync0sync.ic:218),mtr_log_reserve_and_write(sync0sync.ic:218),mtr_commit(sync0sync.ic:218),trx_prepare(trx0trx.cc:2206),trx_prepare_for_mysql(trx0trx.cc:2258),innobase_xa_prepare(ha_innodb.cc:13618),ha_prepare_low(handler.cc:2104),ha_commit_trans(handler.cc:1434),trans_commit(transaction.cc:228),mysql_execute_command(sql_parse.cc:4302),mysql_parse(sql_parse.cc:6408),dispatch_command(sql_parse.cc:1340),do_command(sql_parse.cc:1037),do_handle_one_connection(sql_connect.cc:990),handle_one_connection(sql_connect.cc:906),start_thread(libpthread.so.0),clone(libc.so.6)
     61 select(libc.so.6),os_thread_sleep(os0thread.cc:261),srv_conc_enter_innodb_with_atomics(srv0conc.cc:276),srv_conc_enter_innodb(srv0conc.cc:276),innobase_srv_conc_enter_innodb(ha_innodb.cc:1173),ha_innobase::index_read(ha_innodb.cc:1173),ha_index_read_map(handler.cc:2752),handler::read_range_first(handler.cc:2752),handler::multi_range_read_next(handler.cc:5864),QUICK_RANGE_SELECT::get_next(opt_range.cc:10644),rr_quick(records.cc:369),mysql_update(sql_update.cc:742),mysql_execute_command(sql_parse.cc:3336),mysql_parse(sql_parse.cc:6408),dispatch_command(sql_parse.cc:1340),do_command(sql_parse.cc:1037),do_handle_one_connection(sql_connect.cc:990),handle_one_connection(sql_connect.cc:906),start_thread(libpthread.so.0),clone(libc.so.6)
     14 pthread_cond_wait,os_cond_wait(os0sync.cc:214),os_event_wait_low(os0sync.cc:214),sync_array_wait_event(sync0arr.cc:424),mutex_spin_wait(sync0sync.cc:580),mutex_enter_func(sync0sync.ic:218),pfs_mutex_enter_func(sync0sync.ic:218),log_reserve_and_write_fast(sync0sync.ic:218),mtr_log_reserve_and_write(sync0sync.ic:218),mtr_commit(sync0sync.ic:218),trx_undo_assign_undo(trx0undo.cc:1813),trx_undo_report_row_operation(trx0rec.cc:1282),btr_cur_upd_lock_and_undo(btr0cur.cc:1710),btr_cur_update_in_place(btr0cur.cc:1710),btr_cur_optimistic_update(btr0cur.cc:2170),row_upd_clust_rec(row0upd.cc:2132),row_upd_clust_step(row0upd.cc:2466),row_upd(row0upd.cc:2521),row_upd_step(row0upd.cc:2521),row_update_for_mysql(row0mysql.cc:1779),ha_innobase::update_row(ha_innodb.cc:7091),handler::ha_update_row(handler.cc:7305),mysql_update(sql_update.cc:819),mysql_execute_command(sql_parse.cc:3336),mysql_parse(sql_parse.cc:6408),dispatch_command(sql_parse.cc:1340),do_command(sql_parse.cc:1037),do_handle_one_connection(sql_connect.cc:990),handle_one_connection(sql_connect.cc:906),start_thread(libpthread.so.0),clone(libc.so.6)
     10 __io_getevents_0_4(libaio.so.1),os_aio_linux_collect(os0file.cc:4975),os_aio_linux_handle(os0file.cc:4975),fil_aio_wait(fil0fil.cc:5796),io_handler_thread(srv0start.cc:492),start_thread(libpthread.so.0),clone(libc.so.6)
      5 pthread_cond_wait,inline_mysql_cond_wait(mysql_thread.h:1162),block_until_new_connection(mysql_thread.h:1162),one_thread_per_connection_end(mysql_thread.h:1162),do_handle_one_connection(sql_connect.cc:997),handle_one_connection(sql_connect.cc:906),start_thread(libpthread.so.0),clone(libc.so.6)
      4 pthread_cond_wait,os_cond_wait(os0sync.cc:214),os_event_wait_low(os0sync.cc:214),sync_array_wait_event(sync0arr.cc:424),mutex_spin_wait(sync0sync.cc:580),mutex_enter_func(sync0sync.ic:218),pfs_mutex_enter_func(sync0sync.ic:218),log_reserve_and_write_fast(sync0sync.ic:218),mtr_log_reserve_and_write(sync0sync.ic:218),mtr_commit(sync0sync.ic:218),btr_pcur_commit_specify_mtr(btr0pcur.ic:386),ibuf_btr_pcur_commit_specify_mtr(btr0pcur.ic:386),ibuf_merge_or_delete_for_page(btr0pcur.ic:386),buf_page_io_complete(buf0buf.cc:4313),fil_aio_wait(fil0fil.cc:5832),io_handler_thread(srv0start.cc:492),start_thread(libpthread.so.0),clone(libc.so.6)
      3 pthread_cond_wait,os_cond_wait(os0sync.cc:214),os_event_wait_low(os0sync.cc:214),sync_array_wait_event(sync0arr.cc:424),mutex_spin_wait(sync0sync.cc:580),mutex_enter_func(sync0sync.ic:218),pfs_mutex_enter_func(sync0sync.ic:218),log_reserve_and_write_fast(sync0sync.ic:218),mtr_log_reserve_and_write(sync0sync.ic:218),mtr_commit(sync0sync.ic:218),btr_pcur_commit_specify_mtr(btr0pcur.ic:386),ibuf_btr_pcur_commit_specify_mtr(btr0pcur.ic:386),ibuf_delete_rec(btr0pcur.ic:386),ibuf_merge_or_delete_for_page(ibuf0ibuf.cc:4884),buf_page_io_complete(buf0buf.cc:4313),fil_aio_wait(fil0fil.cc:5832),io_handler_thread(srv0start.cc:492),start_thread(libpthread.so.0),clone(libc.so.6)
      3 pthread_cond_wait,os_cond_wait(os0sync.cc:214),os_event_wait_low(os0sync.cc:214),srv_worker_thread(srv0srv.cc:2499),start_thread(libpthread.so.0),clone(libc.so.6)
      1 sigwait(libpthread.so.0),signal_hand(mysqld.cc:3217),start_thread(libpthread.so.0),clone(libc.so.6)
      1 select(libc.so.6),os_thread_sleep(os0thread.cc:261),srv_conc_enter_innodb_with_atomics(srv0conc.cc:276),srv_conc_enter_innodb(srv0conc.cc:276),innobase_srv_conc_enter_innodb(ha_innodb.cc:1173),ha_innobase::index_read(ha_innodb.cc:1173),ha_innobase::index_first(ha_innodb.cc:7883),handler::ha_index_first(handler.cc:2861),join_read_first(sql_executor.cc:2539),sub_select(sql_executor.cc:1259),do_select(sql_executor.cc:936),JOIN::exec(sql_executor.cc:936),mysql_execute_select(sql_select.cc:1101),mysql_select(sql_select.cc:1101),handle_select(sql_select.cc:110),execute_sqlcom_select(sql_parse.cc:5156),mysql_execute_command(sql_parse.cc:2656),mysql_parse(sql_parse.cc:6408),dispatch_command(sql_parse.cc:1340),do_command(sql_parse.cc:1037),do_handle_one_connection(sql_connect.cc:990),handle_one_connection(sql_connect.cc:906),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_cond_wait(os0sync.cc:214),os_event_wait_low(os0sync.cc:214),sync_array_wait_event(sync0arr.cc:424),rw_lock_s_lock_spin(sync0rw.cc:429),rw_lock_s_lock_func(sync0rw.ic:369),pfs_rw_lock_s_lock_func(sync0rw.ic:369),buf_page_get_gen(sync0rw.ic:369),trx_undo_page_get(trx0undo.ic:170),trx_undo_get_first_rec(trx0undo.ic:170),trx_purge_read_undo_rec(trx0purge.cc:741),trx_purge_choose_next_log(trx0purge.cc:782),trx_purge_get_next_rec(trx0purge.cc:888),trx_purge_fetch_next_rec(trx0purge.cc:960),trx_purge_attach_undo_recs(trx0purge.cc:960),trx_purge(trx0purge.cc:960),srv_do_purge(srv0srv.cc:2602),srv_purge_coordinator_thread(srv0srv.cc:2602),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_cond_wait(os0sync.cc:214),os_event_wait_low(os0sync.cc:214),sync_array_wait_event(sync0arr.cc:424),rw_lock_s_lock_spin(sync0rw.cc:429),rw_lock_s_lock_func(sync0rw.ic:369),pfs_rw_lock_s_lock_func(sync0rw.ic:369),buf_page_get_gen(sync0rw.ic:369),btr_block_get_func(btr0btr.ic:60),btr_cur_latch_leaves(btr0btr.ic:60),btr_cur_open_at_rnd_pos_func(btr0cur.cc:1047),btr_pcur_open_at_rnd_pos_func(btr0pcur.ic:548),ibuf_merge_pages(btr0pcur.ic:548),ibuf_merge(ibuf0ibuf.cc:2759),ibuf_contract_in_background(ibuf0ibuf.cc:2759),srv_master_do_active_tasks(srv0srv.cc:2116),srv_master_thread(srv0srv.cc:2116),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_cond_wait(os0sync.cc:214),os_event_wait_low(os0sync.cc:214),sync_array_wait_event(sync0arr.cc:424),mutex_spin_wait(sync0sync.cc:580),mutex_enter_func(sync0sync.ic:218),pfs_mutex_enter_func(sync0sync.ic:218),mtr_add_dirtied_pages_to_flush_list(sync0sync.ic:218),mtr_log_reserve_and_write(mtr0mtr.cc:270),mtr_commit(mtr0mtr.cc:270),row_upd_sec_index_entry(row0upd.cc:1819),row_upd_sec_step(row0upd.cc:1861),row_upd(row0upd.cc:1861),row_upd_step(row0upd.cc:1861),row_update_for_mysql(row0mysql.cc:1779),ha_innobase::update_row(ha_innodb.cc:7091),handler::ha_update_row(handler.cc:7305),mysql_update(sql_update.cc:819),mysql_execute_command(sql_parse.cc:3336),mysql_parse(sql_parse.cc:6408),dispatch_command(sql_parse.cc:1340),do_command(sql_parse.cc:1037),do_handle_one_connection(sql_connect.cc:990),handle_one_connection(sql_connect.cc:906),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_cond_wait(os0sync.cc:214),os_event_wait_low(os0sync.cc:214),sync_array_wait_event(sync0arr.cc:424),mutex_spin_wait(sync0sync.cc:580),mutex_enter_func(sync0sync.ic:218),pfs_mutex_enter_func(sync0sync.ic:218),log_write_up_to(sync0sync.ic:218),log_write_up_to(log0log.cc:1651),buf_flush_write_block_low(buf0flu.cc:919),buf_flush_page(buf0flu.cc:919),buf_flush_try_neighbors(buf0flu.cc:1276),buf_flush_page_and_try_neighbors(buf0flu.cc:1350),buf_do_flush_list_batch(buf0flu.cc:1350),buf_flush_batch(buf0flu.cc:1700),buf_flush_list(buf0flu.cc:1700),page_cleaner_do_flush_batch(buf0flu.cc:2166),page_cleaner_flush_pages_if_needed(buf0flu.cc:2166),buf_flush_page_cleaner_thread(buf0flu.cc:2166),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_cond_wait(os0sync.cc:214),os_event_wait_low(os0sync.cc:214),sync_array_wait_event(sync0arr.cc:424),mutex_spin_wait(sync0sync.cc:580),mutex_enter_func(sync0sync.ic:218),pfs_mutex_enter_func(sync0sync.ic:218),log_reserve_and_write_fast(sync0sync.ic:218),mtr_log_reserve_and_write(sync0sync.ic:218),mtr_commit(sync0sync.ic:218),row_ins_sec_index_entry_low(row0ins.cc:2779),row_ins_sec_index_entry(row0ins.cc:2930),row_upd_sec_index_entry(row0upd.cc:1834),row_upd_sec_step(row0upd.cc:1861),row_upd(row0upd.cc:1861),row_upd_step(row0upd.cc:1861),row_update_for_mysql(row0mysql.cc:1779),ha_innobase::update_row(ha_innodb.cc:7091),handler::ha_update_row(handler.cc:7305),mysql_update(sql_update.cc:819),mysql_execute_command(sql_parse.cc:3336),mysql_parse(sql_parse.cc:6408),dispatch_command(sql_parse.cc:1340),do_command(sql_parse.cc:1037),do_handle_one_connection(sql_connect.cc:990),handle_one_connection(sql_connect.cc:906),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_cond_wait(os0sync.cc:214),os_event_wait_low(os0sync.cc:214),sync_array_wait_event(sync0arr.cc:424),mutex_spin_wait(sync0sync.cc:580),mutex_enter_func(sync0sync.ic:218),pfs_mutex_enter_func(sync0sync.ic:218),log_reserve_and_write_fast(sync0sync.ic:218),mtr_log_reserve_and_write(sync0sync.ic:218),mtr_commit(sync0sync.ic:218),ibuf_mtr_commit(ibuf0ibuf.ic:58),ibuf_merge_or_delete_for_page(ibuf0ibuf.ic:58),buf_page_io_complete(buf0buf.cc:4313),fil_aio_wait(fil0fil.cc:5832),io_handler_thread(srv0start.cc:492),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_cond_wait(os0sync.cc:214),os_event_wait_low(os0sync.cc:214),sync_array_wait_event(sync0arr.cc:424),mutex_spin_wait(sync0sync.cc:580),mutex_enter_func(sync0sync.ic:218),pfs_mutex_enter_func(sync0sync.ic:218),log_get_lsn(sync0sync.ic:218),srv_error_monitor_thread(sync0sync.ic:218),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,os_cond_wait(os0sync.cc:214),os_event_wait_low(os0sync.cc:214),buf_dump_thread(buf0dump.cc:594),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_cond_wait_timed(os0sync.cc:177),os_event_wait_time_low(os0sync.cc:177),srv_monitor_thread(srv0srv.cc:1551),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_cond_wait_timed(os0sync.cc:177),os_event_wait_time_low(os0sync.cc:177),lock_wait_timeout_thread(lock0wait.cc:503),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_cond_wait_timed(os0sync.cc:177),os_event_wait_time_low(os0sync.cc:177),ib_wqueue_timedwait(ut0wqueue.cc:154),fts_optimize_thread(fts0opt.cc:3000),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_timedwait,os_cond_wait_timed(os0sync.cc:177),os_event_wait_time_low(os0sync.cc:177),dict_stats_thread(dict0stats_bg.cc:349),start_thread(libpthread.so.0),clone(libc.so.6)
      1 poll(libc.so.6),vio_io_wait(viosocket.c:771),vio_socket_io_wait(viosocket.c:68),vio_read(viosocket.c:123),net_read_raw_loop(net_serv.cc:669),net_read_packet_header(net_serv.cc:751),net_read_packet(net_serv.cc:751),my_net_read(net_serv.cc:894),do_command(sql_parse.cc:970),do_handle_one_connection(sql_connect.cc:990),handle_one_connection(sql_connect.cc:906),start_thread(libpthread.so.0),clone(libc.so.6)
      1 poll(libc.so.6),handle_connections_sockets(mysqld.cc:6320),mysqld_main(mysqld.cc:5730),__libc_start_main(libc.so.6),_start
From these stacks we can identify several categories of threads:

- user threads sleeping in srv_conc_enter_innodb, waiting for permission to enter the InnoDB kernel;
- user threads blocked on a mutex inside mtr_commit (via log_reserve_and_write_fast / mtr_log_reserve_and_write), i.e. waiting to write redo log;
- I/O handler threads, some idle in os_aio_linux_handle and some likewise blocked in mtr_commit while doing change-buffer merges;
- background threads (purge coordinator, master, page cleaner, error monitor, buf dump, dict stats, FTS optimize, etc.), several of them also stuck on a mutex;
- idle threads waiting for network input or for a new connection.
From this classification it is not hard to see that every thread that needs to write redo log is waiting on log_sys->mutex. So which thread is holding this mutex that protects the redo log buffer? Following that lead, two questions had to be answered:

1. Which thread acquired log_sys->mutex?
2. Why has the thread holding log_sys->mutex not made progress? Is it waiting on another lock, or on something else?

When searching for the owner of log_sys->mutex, two observations help narrow it down quickly:

1. log_sys->mutex can be held by only one thread at a time, so any stack shared by more than one thread in the pt-pmp output can be ruled out;
2. the owner must be in a stack that contains mtr_log_reserve_and_write or log_write_up_to.

Following this reasoning, the following thread quickly stood out in the pstack output:
Thread 446 (Thread 0x7fbea736d700 (LWP 32591)):
#0  0x00007fd1b72c8705 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000009f47f1 in os_cond_wait (fast_mutex=0x7defd680, cond=0x7defd6c0) at ../storage/innobase/os/os0sync.cc:214
#2  os_event_wait_low (event=event@entry=0x7defd680, reset_sig_count=<optimized out>) at ../storage/innobase/os/os0sync.cc:592
#3  0x0000000000a5ca1e in sync_array_wait_event (arr=0x340b850, index=1) at ../storage/innobase/sync/sync0arr.cc:424
#4  0x0000000000a5e750 in mutex_spin_wait (mutex=mutex@entry=0x7defd410, file_name=file_name@entry=0xd2d348 "../storage/innobase/mtr/mtr0mtr.cc", line=line@entry=215) at ../storage/innobase/sync/sync0sync.cc:580
#5  0x00000000009ebc27 in mutex_enter_func (line=215, file_name=0xd2d348 "../storage/innobase/mtr/mtr0mtr.cc", mutex=0x7defd410) at ../storage/innobase/include/sync0sync.ic:218
#6  pfs_mutex_enter_func (line=215, file_name=0xd2d348 "../storage/innobase/mtr/mtr0mtr.cc", mutex=0x7defd410) at ../storage/innobase/include/sync0sync.ic:250
#7  mtr_add_dirtied_pages_to_flush_list (mtr=0x7fbea736ae10) at ../storage/innobase/mtr/mtr0mtr.cc:215
#8  0x00000000009ec34c in mtr_log_reserve_and_write (mtr=<optimized out>) at ../storage/innobase/mtr/mtr0mtr.cc:270
#9  mtr_commit (mtr=mtr@entry=0x7fbea736ae10) at ../storage/innobase/mtr/mtr0mtr.cc:324
#10 0x0000000000a4a477 in row_upd_sec_index_entry (node=node@entry=0x7fbb9c01cd18, thr=thr@entry=0x7fbb9c01d018) at ../storage/innobase/row/row0upd.cc:1819
#11 0x0000000000a4e73c in row_upd_sec_step (thr=<optimized out>, node=0x7fbb9c01cd18) at ../storage/innobase/row/row0upd.cc:1861
#12 row_upd (thr=0x7fbb9c01d018, node=0x7fbb9c01cd18) at ../storage/innobase/row/row0upd.cc:2556
#13 row_upd_step (thr=thr@entry=0x7fbb9c01d018) at ../storage/innobase/row/row0upd.cc:2673
#14 0x0000000000a339c8 in row_update_for_mysql (mysql_rec=mysql_rec@entry=0x7fbb9c017528 "\377\366\r4\001\350e/\001\065\071\062\067\060\067\064\067\065\063\063-46485490867-60617341003-10452679142-20155412750-96257700347-89668812213-35633691718-21407253672-69113394932", ' ' <repeats 72 times>..., prebuilt=0x7fbb9c01c568) at ../storage/innobase/row/row0mysql.cc:1779
#15 0x00000000009ac99e in ha_innobase::update_row (this=0x7fbb9c016f10, old_row=0x7fbb9c017528 "\377\366\r4\001\350e/\001\065\071\062\067\060\067\064\067\065\063\063-46485490867-60617341003-10452679142-20155412750-96257700347-89668812213-35633691718-21407253672-69113394932", ' ' <repeats 72 times>..., new_row=0x7fbb9c017300 "\377\366\r4\001\351e/\001\065\071\062\067\060\067\064\067\065\063\063-46485490867-60617341003-10452679142-20155412750-96257700347-89668812213-35633691718-21407253672-69113394932", ' ' <repeats 72 times>...) at ../storage/innobase/handler/ha_innodb.cc:7091
#16 0x00000000005b4332 in handler::ha_update_row (this=0x7fbb9c016f10, old_data=0x7fbb9c017528 "\377\366\r4\001\350e/\001\065\071\062\067\060\067\064\067\065\063\063-46485490867-60617341003-10452679142-20155412750-96257700347-89668812213-35633691718-21407253672-69113394932", ' ' <repeats 72 times>..., new_data=0x7fbb9c017300 "\377\366\r4\001\351e/\001\065\071\062\067\060\067\064\067\065\063\063-46485490867-60617341003-10452679142-20155412750-96257700347-89668812213-35633691718-21407253672-69113394932", ' ' <repeats 72 times>...) at ../sql/handler.cc:7305
#17 0x000000000076b145 in mysql_update (thd=thd@entry=0x7e604470, table_list=<optimized out>, fields=..., values=..., conds=0x7fbe2800aca0, order_num=<optimized out>, order=<optimized out>, limit=18446744073709551615, handle_duplicates=DUP_ERROR, ignore=false, found_return=found_return@entry=0x7fbea736bac0, updated_return=updated_return@entry=0x7fbea736c020) at ../sql/sql_update.cc:819
#18 0x00000000006fc2dd in mysql_execute_command (thd=thd@entry=0x7e604470) at ../sql/sql_parse.cc:3336
#19 0x0000000000700618 in mysql_parse (thd=thd@entry=0x7e604470, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x7fbea736c770) ...
Here is a brief outline of how MySQL writes redo log (leaving out the undo & buffer pool parts). When data is modified, MySQL first records redo log entries whose type depends on the operation. The main steps are:

1. The modification is first recorded in the mtr's local log buffer; the redo record types are defined in src/storage/innobase/include/mtr0mtr.h, and the function that applies each type of record can be seen in recv_parse_or_apply_log_rec_body();
2. At mtr_commit time the mtr's local log is copied into the redo buffer, and the dirty pages touched by this mtr are added to the buffer_pool's flush list (see the sketch after this paragraph);
3. At transaction commit, the value of innodb_flush_log_at_trx_commit determines whether a sync is performed.

The stack above shows a thread that hung after writing its redo, while adding the dirty pages to the flush list: having acquired log_sys->mutex, it hung trying to acquire log_sys->log_flush_order_mutex, and a large number of other threads were waiting for it to release log_sys->mutex. Question 1 now has its answer. So what exactly is log_sys->log_flush_order_mutex, and which thread is holding it?
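To make the ordering concrete, here is a minimal sketch of the lock sequence inside mtr_commit as described above. This is my own illustration, not the InnoDB source: the stand-in types and the helpers copy_mtr_log_to_redo_buffer / add_dirty_pages_to_flush_list are invented for readability.

/* Stand-ins for mtr_t, ib_mutex_t, log_t; illustrative only. */
struct Mutex { void enter() {} void exit() {} };
struct Mtr   { bool made_dirty; bool modifications; };
struct LogSys { Mutex mutex; Mutex log_flush_order_mutex; };

LogSys log_sys_sketch;

void copy_mtr_log_to_redo_buffer(Mtr*)   { /* append mtr-local records */ }
void add_dirty_pages_to_flush_list(Mtr*) { /* LSN-ordered insert */ }

void mtr_commit_sketch(Mtr* mtr)
{
	log_sys_sketch.mutex.enter();              // serialize redo-buffer writes
	copy_mtr_log_to_redo_buffer(mtr);          // step 2: mtr log -> redo buffer

	if (mtr->made_dirty) {
		// Thread 446 hung right here, still holding log_sys->mutex.
		log_sys_sketch.log_flush_order_mutex.enter();
	}

	log_sys_sketch.mutex.exit();               // released before the list insert

	add_dirty_pages_to_flush_list(mtr);        // order protected by the second mutex

	if (mtr->made_dirty) {
		log_sys_sketch.log_flush_order_mutex.exit();
	}
}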
Note: MySQL's buffer pool maintains an ordered list of dirty pages (the flush list, ordered by LSN), so that during checkpoint & log_free_check the position up to which the redo log can be advanced is found quickly. Adding a dirty page to the flush list must be done under a lock to preserve the LSN order of the list. If log_sys->mutex were used for this, it would suffer heavy contention under high concurrency and hurt performance, so a separate mutex was added to protect the LSN ordering of dirty pages. The code comment explains it:
	ib_mutex_t	log_flush_order_mutex;/*!< mutex to serialize access to
					the flush list when we are putting
					dirty blocks in the list. The idea
					behind this mutex is to be able
					to release log_sys->mutex during
					mtr_commit and still ensure that
					insertions in the flush_list happen
					in the LSN order. */
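Why the LSN order matters: since the flush list is kept newest-first by oldest_modification, its last element bounds how far the checkpoint may advance. A hedged sketch follows (stand-in types; in InnoDB the equivalent logic lives around buf_pool_get_oldest_modification, and field names are simplified here):

#include <list>
#include <cstdint>

typedef uint64_t lsn_t;

struct Page { lsn_t oldest_modification; };  // stand-in for buf_page_t

std::list<Page> flush_list;  // newest-first, like buf_pool->flush_list
lsn_t current_lsn = 0;       // stand-in for log_get_lsn()

// The checkpoint may advance at most to the oldest_modification of the
// oldest dirty page; with no dirty pages it may advance to the current LSN.
lsn_t checkpoint_low_water_sketch()
{
	return flush_list.empty() ? current_lsn
	                          : flush_list.back().oldest_modification;
}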
While investigating Question 1 we identified the owner of log_sys->mutex: a thread that hung while acquiring log_sys->log_flush_order_mutex. The thread stacks can therefore be divided into these categories:

1. the thread waiting on log_sys->log_flush_order_mutex in order to add dirty pages to the buffer_pool's flush list;
2. threads waiting on log_sys->mutex in order to write log or read log information;
3. the thread that holds log_sys->log_flush_order_mutex but hung while doing something else.

The key, then, is to find which thread acquired log_sys->log_flush_order_mutex. To find it, we took the following steps:

1. Search for every place that acquires log_sys->log_flush_order_mutex:
[root@TENCENT64 /data/workplace/CDB]# grep log_flush_order_mutex_enter storage/innobase/* -r --color
storage/innobase/buf/buf0buf.cc:	log_flush_order_mutex_enter();
storage/innobase/include/log0log.h:#define log_flush_order_mutex_enter() do {		\
storage/innobase/log/log0recv.cc:		log_flush_order_mutex_enter();
storage/innobase/mtr/mtr0mtr.cc:		log_flush_order_mutex_enter();
2. Cross-checking the thread information in the existing pstack output against the code at those call sites, we found that essentially no thread was holding log_sys->log_flush_order_mutex;

3. Printing log_sys->log_flush_order_mutex with gdb showed {waiters = 1, lock_word = 0}!!! That is, Thread 446 was waiting on a free mutex, and the mutex did indeed record a waiter. Since ours is a Release build, a lot of useful information was unavailable, and under a debug build the problem is very hard to reproduce. log_flush_order_mutex is defined as follows:

/** InnoDB mutex */
struct ib_mutex_t {
	os_event_t	event;	/*!< Used by sync0arr.cc for the wait queue */
	volatile lock_word_t	lock_word;
				/*!< lock_word is the target of the atomic
				test-and-set instruction when atomic
				operations are enabled. */
#if !defined(HAVE_ATOMIC_BUILTINS)
	os_fast_mutex_t	os_fast_mutex;
				/*!< We use this OS mutex in place of lock_word
				when atomic operations are not enabled */
#endif
	ulint		waiters;
				/*!< This ulint is set to 1 if there are (or
				may be) threads waiting in the global wait
				array for this mutex to be released.
				Otherwise, this is 0. */
	UT_LIST_NODE_T(ib_mutex_t)	list;
				/*!< All allocated mutexes are put into a
				list. Pointers to the next and prev. */
#ifdef UNIV_SYNC_DEBUG
	const char*	file_name;	/*!< File where the mutex was locked */
	ulint		line;		/*!< Line where the mutex was locked */
	ulint		level;		/*!< Level in the global latching
					order */
#endif /* UNIV_SYNC_DEBUG */
	const char*	cfile_name;	/*!< File name where mutex created */
	ulint		cline;		/*!< Line where created */
	ulong		count_os_wait;	/*!< count of os_wait */
#ifdef UNIV_DEBUG
/** Value of mutex_t::magic_n */
# define MUTEX_MAGIC_N	979585UL
	os_thread_id_t	thread_id;	/*!< The thread id of the thread
					which locked the mutex. */
	ulint		magic_n;	/*!< MUTEX_MAGIC_N */
	const char*	cmutex_name;	/*!< mutex name */
	ulint		ib_mutex_type;	/*!< 0=usual mutex, 1=rw_lock mutex */
#endif /* UNIV_DEBUG */
#ifdef UNIV_PFS_MUTEX
	struct PSI_mutex*	pfs_psi;
				/*!< The performance schema
				instrumentation hook */
#endif
};
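For illustration, the gdb inspection looked roughly like this (a reconstructed session, not a verbatim capture; the addresses are made up):

(gdb) print log_sys->log_flush_order_mutex
$1 = {event = 0x7defd680, lock_word = 0 '\000', waiters = 1, ...}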
From the above analysis we can answer Question 2:

1. Only two threads are related to log_sys->log_flush_order_mutex: one is Thread 446, the other is whichever thread most recently called log_flush_order_mutex_exit();
2. Some thread failed to wake Thread 446 while releasing log_sys->log_flush_order_mutex, leaving Thread 446 hung; that in turn prevented other threads from acquiring log_sys->mutex and made the instance unusable;
3. log_sys->log_flush_order_mutex is not currently held by any thread.

The analysis of Question 2 tells us that log_sys->log_flush_order_mutex is held by nobody, so why was Thread 446 never woken up? A lost signal, or a program bug? If it were a lost signal, why is the hang stably reproducible? There was no similar bug in the official bug list, and a search of the community turned up very little usable information. At this point the analysis seemed to have hit a dead end, and the pressure quietly mounted... There seemed to be no way forward, but every problem has a cause, and once the cause is found there is a solution. Turning back to Thread 446's stack, I looked at the function:
/************************************************************//**
Append the dirty pages to the flush list. */
static
void
mtr_add_dirtied_pages_to_flush_list(
/*================================*/
	mtr_t*	mtr)	/*!< in/out: mtr */
{
	ut_ad(!srv_read_only_mode);

	/* No need to acquire log_flush_order_mutex if this mtr has
	not dirtied a clean page. log_flush_order_mutex is used to
	ensure ordered insertions in the flush_list. We need to
	insert in the flush_list iff the page in question was clean
	before modifications. */
	if (mtr->made_dirty) {
		log_flush_order_mutex_enter();
	}

	/* It is now safe to release the log mutex because the
	flush_order mutex will ensure that we are the first one
	to insert into the flush list. */
	log_release();

	if (mtr->modifications) {
		mtr_memo_note_modifications(mtr);
	}

	if (mtr->made_dirty) {
		log_flush_order_mutex_exit();
	}
}
The analysis of Question 2 showed that some thread failed to wake Thread 446 on its way out of log_flush_order_mutex_exit, so I followed that function down to see how it wakes other threads. With no better lead, stepping through the code like this was the only option, hoping something would turn up. As the call chain deepened, my attention settled on mutex_exit_func, where the comment in the function caught my eye:
/******************************************************************//**
NOTE! Use the corresponding macro mutex_exit(), not directly this function!
Unlocks a mutex owned by the current thread. */
UNIV_INLINE
void
mutex_exit_func(
/*============*/
	ib_mutex_t*	mutex)	/*!< in: pointer to mutex */
{
	ut_ad(mutex_own(mutex));

	ut_d(mutex->thread_id = (os_thread_id_t) ULINT_UNDEFINED);

#ifdef UNIV_SYNC_DEBUG
	sync_thread_reset_level(mutex);
#endif
	mutex_reset_lock_word(mutex);

	/* A problem: we assume that mutex_reset_lock word
	is a memory barrier, that is when we read the waiters
	field next, the read must be serialized in memory
	after the reset. A speculative processor might
	perform the read first, which could leave a waiting
	thread hanging indefinitely.

	Our current solution call every second
	sync_arr_wake_threads_if_sema_free()
	to wake up possible hanging threads if they are missed
	in mutex_signal_object. */

	if (mutex_get_waiters(mutex) != 0) {
		mutex_signal_object(mutex);
	}

#ifdef UNIV_SYNC_PERF_STAT
	mutex_exit_count++;
#endif
}
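To see the hazard the comment describes, here is a minimal standalone sketch (my own illustration in std::atomic terms, not InnoDB code; wake_waiters stands in for mutex_signal_object). The unlocking thread resets the lock word and then reads waiters, but without a barrier the load may be satisfied before the store becomes visible, so a waiter that registers itself in that window is never signalled.

#include <atomic>

std::atomic<int> lock_word{1};   // 1 = locked, 0 = free
std::atomic<int> waiters{0};     // set to 1 by a thread entering the wait array

void wake_waiters() { /* stands in for mutex_signal_object() */ }

void unlock_sketch()             // stands in for mutex_exit_func()
{
	// mutex_reset_lock_word(): release the mutex.
	lock_word.store(0, std::memory_order_relaxed);

	// Without a full barrier here, the compiler or CPU may hoist the load
	// below above the store: we then read waiters == 0 even though a waiter
	// is just registering itself, and nobody signals it. That is exactly
	// the {lock_word = 0, waiters = 1} state captured in gdb.
	if (waiters.load(std::memory_order_relaxed) != 0) {  // mutex_get_waiters()
		wake_waiters();
	}
}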
The comment yields two pieces of information:

1. The calls to mutex_get_waiters & mutex_reset_lock_word may execute in the opposite of program order, and when that happens a hang can result;
2. As a workaround, sync_arr_wake_threads_if_sema_free() is called once a second to wake up any threads that mutex_signal_object missed.

So it is not a lost signal: without a memory barrier, the instructions of concurrent threads may be reordered. The problem certainly exists, but if sync_arr_wake_threads_if_sema_free() is there to paper over it, why did we still hang? With this lead things suddenly looked more hopeful... A search showed that sync_arr_wake_threads_if_sema_free is called only from srv_error_monitor_thread, the thread dedicated to monitoring MySQL's internal anomalies and printing error information; the notorious 600-second suicide is also its doing. So the question becomes: since sync_arr_wake_threads_if_sema_free can wake Thread 446, why didn't it?

Following this thought to the stack of srv_error_monitor_thread in the pstack output, we find that this thread itself hung while acquiring log_sys->mutex, so it could never run sync_arr_wake_threads_if_sema_free() or its routine anomaly checks, which answers the question exactly. The full stack:
Thread 470 (Thread 0x7fbeb0a2e700 (LWP 20032)):
#0  0x00007fd1b72c8705 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000009f47f1 in os_cond_wait (fast_mutex=0x7defd5f0, cond=0x7defd630) at ../storage/innobase/os/os0sync.cc:214
#2  os_event_wait_low (event=event@entry=0x7defd5f0, reset_sig_count=<optimized out>) at ../storage/innobase/os/os0sync.cc:592
#3  0x0000000000a5ca1e in sync_array_wait_event (arr=0x340b850, index=86) at ../storage/innobase/sync/sync0arr.cc:424
#4  0x0000000000a5e750 in mutex_spin_wait (mutex=mutex@entry=0x7defd3c8, file_name=file_name@entry=0xd2d380 "../storage/innobase/include/log0log.ic", line=line@entry=409) at ../storage/innobase/sync/sync0sync.cc:580
#5  0x0000000000a552e9 in mutex_enter_func (line=409, file_name=0xd2d380 "../storage/innobase/include/log0log.ic", mutex=0x7defd3c8) at ../storage/innobase/include/sync0sync.ic:218
#6  pfs_mutex_enter_func (line=409, file_name=0xd2d380 "../storage/innobase/include/log0log.ic", mutex=0x7defd3c8) at ../storage/innobase/include/sync0sync.ic:250
#7  log_get_lsn () at ../storage/innobase/include/log0log.ic:409
#8  srv_error_monitor_thread (arg=<optimized out>) at ../storage/innobase/srv/srv0srv.cc:1722
#9  0x00007fd1b72c4df3 in start_thread () from /lib64/libpthread.so.0
#10 0x00007fd1b63ac1bd in clone () from /lib64/libc.so.6
With the above analysis the picture became clear. The sequence of events boils down to:

1. Thread 446 acquired log_sys->mutex, but was never woken up while waiting for log_sys->log_flush_order_mutex;
2. Some thread hit the memory barrier problem while releasing log_sys->log_flush_order_mutex and failed to wake Thread 446;
3. Thread 470 (srv_error_monitor_thread) hung while acquiring log_sys->mutex, so it could not run sync_arr_wake_threads_if_sema_free(), and the whole instance hung;
4. Thread 470 needs the log_sys->mutex held by Thread 446, while Thread 446 must be woken by Thread 470 before it will release log_sys->mutex, a circular wait.

Combined with the observed state of log_sys->log_flush_order_mutex, the whole course of the hang looks like this (time flows downward; the four columns of the original table are shown as labelled steps):
THD_1:                      mutex_enter(log_sys->mutex)
THD_1:                      ...
THD_1:                      mutex_enter(log_sys->log_flush_order_mutex)
THD_1:                      mutex_exit(log_sys->mutex)
THD_2 (Thread 446):         mutex_enter(log_sys->mutex)
THD_N:                      mutex_enter(log_sys->mutex)                 <- blocks
srv_error_monitor_thread:   mutex_enter(log_sys->mutex)                 <- blocks (Thread 470)
THD_1:                      mutex_exit(log_sys->log_flush_order_mutex) begins:
THD_1:                        mutex_get_waiters()                       <- speculatively reordered, reads 0
THD_2 (Thread 446):         mutex_enter(log_sys->log_flush_order_mutex) begins:
THD_2 (Thread 446):           mutex_set_waiters(mutex, 1)
THD_2 (Thread 446):           sync_array_wait_event()                   <- sleeps, never woken
THD_1:                        mutex_reset_lock_word()                   <- waiters looked 0, nobody signalled
For an introduction to memory barriers, see reference ①.
Now that the cause is known, the problem can be solved. There are two approaches:

1. Insert a real memory barrier in mutex_exit_func() between mutex_reset_lock_word() and mutex_get_waiters(), so the read of waiters can no longer be reordered before the reset;
2. Make srv_error_monitor_thread never block on a mutex, so that sync_arr_wake_threads_if_sema_free() always gets a chance to run and wake any missed waiters.

A sketch of the first approach follows.
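As an illustration of approach 1, continuing the earlier std::atomic example (my own sketch, not the InnoDB patch):

#include <atomic>

std::atomic<int> lock_word{1};
std::atomic<int> waiters{0};

void wake_waiters() { /* signal the wait array */ }

void unlock_fixed_sketch()
{
	lock_word.store(0, std::memory_order_relaxed);

	// Full barrier: the reset above must become globally visible before
	// the read below may be performed, closing the lost-wakeup window.
	std::atomic_thread_fence(std::memory_order_seq_cst);

	if (waiters.load(std::memory_order_relaxed) != 0) {
		wake_waiters();
	}
}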
Although our problem was solved, the official version certainly contains the same bug, so why was there nothing in the bug list? Checking the latest upstream code, I found the problem had already been fixed there, using the second method listed above. The commit message reads:
commit 79032a7ae1b4e000028a64ab0d2f216d4c23767b
Author: Shaohua Wang <shaohua.wang@oracle.com>
Date:   Tue Jan 12 15:08:09 2016 +0800

    BUG#22530768 Innodb freeze running REPLACE statements

    we can see from the hang stacktrace, srv_monitor_thread is blocked
    when getting log_sys::mutex, so that sync_arr_wake_threads_if_sema_free
    cannot get a change to break the mutex deadlock.

    The fix is simply removing any mutex wait in srv_monitor_thread.

    Patch is reviewed by Sunny over IM.
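The idea behind the upstream fix, sketched below with stand-in types (the try_enter helper is hypothetical; the actual patch differs in detail): the monitor thread only try-locks log_sys->mutex and skips the LSN check when the mutex is busy, so the wake-up pass always runs.

#include <cstdint>

typedef uint64_t lsn_t;

struct Mutex {
	bool try_enter() { return true; }  // stand-in for a non-blocking mutex_enter_nowait()
	void exit() {}
};

struct LogSys { Mutex mutex; lsn_t lsn; };
LogSys log_sys_sketch;

void sync_arr_wake_threads_if_sema_free_sketch() { /* wake missed waiters */ }

// One iteration of the error-monitor loop: never block on log_sys->mutex.
void error_monitor_iteration_sketch()
{
	if (log_sys_sketch.mutex.try_enter()) {
		lsn_t new_lsn = log_sys_sketch.lsn;
		(void) new_lsn;  // ... the usual LSN sanity checks would go here ...
		log_sys_sketch.mutex.exit();
	}

	// Runs even when the LSN check was skipped, so hung waiters get woken.
	sync_arr_wake_threads_if_sema_free_sketch();
}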
Scope of the bug: MySQL 5.6.28 and all earlier versions are affected.