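When operating MySQL, we regularly run into situations that abnormally break the session between an application and MySQL, for example:
The application is forcibly shut down
The client machine suddenly crashes or loses power
The network jitters or a network card fails
An entire data center loses network connectivity
So what happens to a transaction that is executing in MySQL at that moment?
Let's design a test case that simulates a client running a transaction in MySQL when the client machine suddenly goes down, abnormally breaking the session.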
client 192.168.56.102
MySQL 192.168.56.101
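The client connects to the database and runs a select for update to lock a record. In practice this could just as well be an update or delete that locks the record.
Then shut down the client machine to simulate a power or network outage.
At this point, on the server side the connection state at the network layer is still ESTABLISHED, and the transaction in the database is still in the running state.
Open another session and try to lock table t1: it has to wait, which shows that the transaction is still active after the network was cut.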
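The server-side check described above can be sketched as a small shell filter. This is only an illustration: the function name `count_established_from`, the sample `ss -tn` lines, the client port 40812, and the server port 3306 (the MySQL default, which the test case does not state) are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch: count ESTABLISHED connections from one peer IP to one local port,
# reading `ss -tn`-style lines from stdin. `ss` prints the state as "ESTAB".

count_established_from() {
    peer_ip="$1"      # client address, e.g. 192.168.56.102
    local_port="$2"   # server port, assumed to be 3306 here
    awk -v peer="$peer_ip" -v port="$local_port" '
        BEGIN { suffix = ":" port }
        $1 == "ESTAB" &&
        index($5, peer ":") == 1 &&
        substr($4, length($4) - length(suffix) + 1) == suffix { n++ }
        END { print n + 0 }
    '
}

# Illustrative sample of what `ss -tn` might print on the server (made-up ports).
sample_output() {
    cat <<'EOF'
State  Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
ESTAB  0       0       192.168.56.101:3306  192.168.56.102:40812
ESTAB  0       0       192.168.56.101:22    192.168.56.1:51234
EOF
}

sample_output | count_established_from 192.168.56.102 3306   # prints 1
```

On the real server one would run `ss -tn | count_established_from 192.168.56.102 3306`, and inspect the orphaned transaction with a query such as `select trx_id, trx_state, trx_started, trx_mysql_thread_id from information_schema.INNODB_TRX;`.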
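OK, the demonstration is over. Next, let's analyze why the transaction did not exit after the network was cut.
3.1 Why doesn't the server terminate this transaction?
An ordinary MySQL session has no keepalive mechanism of its own: it does not tune any keepalive socket options and has no heartbeat. If the network connection breaks abnormally, the server cannot detect it promptly. To go one step further, consider the four-way handshake that closes a TCP connection.
During the network failure the TCP connection is still ESTABLISHED, which means neither the server nor the client has sent a FIN packet. The server is still waiting for the client to send data, so the MySQL transaction cannot simply exit.
3.2 How the transaction is handled after the network is cut
A transaction statement is still executing
Once a connection has started a transaction, if one of its statements is still executing when the network drops, the work is rolled back after the statement finishes executing. The status packet cannot be delivered to the client, so the server notices the network error. See stack 1 for the debug backtrace.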
if (thd->is_error() || (thd->variables.option_bits & OPTION_MASTER_SQL_ERROR))
trans_rollback_stmt(thd);
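The transaction's statements have finished but it has not been committed
If the SQL in the transaction has finished executing but has not been committed when the network drops, the transaction stays on the server and has to be killed manually. The path of a client connection on the server side is: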
socket->listen->poll(socketfd)->accept->newthread->poll(newfd,wait_timeout)
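Once new data arrives, any read or write that cannot proceed because of a network problem still waits in poll until it times out. The read_timeout/write_timeout parameters (the net_read_timeout and net_write_timeout system variables) bound network reads and writes; if the network is unavailable, the session is held for as long as the server waits for the network to become usable. In other words, wait_timeout and read_timeout/write_timeout are all implemented through the poll timeout. See stack 2: the vio_io_wait function handles the various timeouts, mainly by calling poll.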
net_write_packet
->net_write_raw_loop (a single packet is at most 16 MB)
->vio_write ->mysql_socket_send: inline_mysql_socket_send calls send(); if the send fails with SOCKET_EAGAIN, vio_socket_io_wait decides how long to wait /* Wait for the output buffer to become writable.*/
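As a rough summary of the paragraph above, the sketch below maps each session state to the timeout that bounds the server's poll wait and converts it to the millisecond value poll receives (stack 2 shows timeout=60000 for a write). The function names are hypothetical, and the numbers are MySQL's documented defaults, not values read from a live server.

```shell
#!/usr/bin/env bash
# Sketch: which timeout bounds the poll() wait in each session state.
# Defaults in seconds: wait_timeout=28800, net_read_timeout=30, net_write_timeout=60.

# Map a session state (idle/read/write) to the governing system variable.
timeout_variable_for() {
    case "$1" in
        idle)  echo "wait_timeout" ;;
        read)  echo "net_read_timeout" ;;
        write) echo "net_write_timeout" ;;
        *)     return 1 ;;
    esac
}

# Documented default value, in seconds, for each variable.
default_seconds_for() {
    case "$1" in
        wait_timeout)      echo 28800 ;;
        net_read_timeout)  echo 30 ;;
        net_write_timeout) echo 60 ;;
        *)                 return 1 ;;
    esac
}

# poll() takes milliseconds, so the value in seconds is multiplied by 1000.
poll_timeout_ms() {
    var="$(timeout_variable_for "$1")" || return 1
    secs="$(default_seconds_for "$var")" || return 1
    echo $(( secs * 1000 ))
}

for state in idle read write; do
    echo "$state: $(timeout_variable_for "$state") -> $(poll_timeout_ms "$state") ms"
done
```

On a real instance the effective values can be checked with `show variables like '%timeout%';` since any of them may have been changed from the defaults.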
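3.3 When does it exit?
This depends on the situation.
While the connection is idle
Here the session lifetime is determined by the MySQL parameter wait_timeout, whose default is 8 hours (28800 seconds): a session that has been idle for more than 8 hours is closed automatically by MySQL. For details, see the following article:
A Brief Analysis of interactive_timeout and wait_timeout
Waiting for the TCP timeout
By default the session is kept for 2 hours + 9 probes × 75 seconds (7875 seconds, roughly 2 hours 11 minutes). This is determined by the TCP keepalive settings: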
/proc/sys/net/ipv4/tcp_keepalive_time = 7200 (idle time before the first probe, in seconds)
/proc/sys/net/ipv4/tcp_keepalive_intvl = 75 (interval between probes, in seconds)
/proc/sys/net/ipv4/tcp_keepalive_probes = 9 (number of probes)
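The arithmetic behind that figure can be sketched as follows. The helper names are hypothetical; the script reads the three kernel settings from /proc when they are readable and otherwise falls back to the defaults listed above.

```shell
#!/usr/bin/env bash
# Sketch: approximate time until the kernel declares a silent peer dead,
# computed as tcp_keepalive_time + tcp_keepalive_intvl * tcp_keepalive_probes.

keepalive_detect_seconds() {
    idle="$1"     # seconds of idleness before the first probe
    intvl="$2"    # seconds between probes
    probes="$3"   # number of unanswered probes before giving up
    echo $(( idle + intvl * probes ))
}

# Read one setting from /proc, or fall back to the given default value.
read_keepalive_setting() {
    file="/proc/sys/net/ipv4/$1"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "$2"
    fi
}

total="$(keepalive_detect_seconds \
    "$(read_keepalive_setting tcp_keepalive_time 7200)" \
    "$(read_keepalive_setting tcp_keepalive_intvl 75)" \
    "$(read_keepalive_setting tcp_keepalive_probes 9)")"
echo "a dead peer is detected after roughly ${total} seconds"
```

With the default values, `keepalive_detect_seconds 7200 75 9` gives 7875 seconds. Lowering these kernel settings shortens detection for every TCP connection on the host, so it is a host-wide trade-off rather than a MySQL-only knob.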
Actively killing the abnormal session
kill $thread_id;
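Before killing anything, it can help to preview which thread IDs would be affected. The filter below is a minimal dry-run sketch: the function name `gen_kill_statements` and the tab-separated input format (thread id, command, idle seconds) are assumptions chosen for illustration, for example as produced by a `mysql --batch` query against the process list.

```shell
#!/usr/bin/env bash
# Sketch: turn tab-separated rows "thread_id<TAB>command<TAB>idle_seconds"
# into KILL statements for sessions sleeping at least <threshold> seconds.
# Nothing is executed against MySQL; the statements are only printed.

gen_kill_statements() {
    threshold="${1:-60}"   # minimum idle time in seconds, 60 if omitted
    awk -F'\t' -v min="$threshold" '
        $2 == "Sleep" && $3 + 0 >= min { printf "KILL %s;\n", $1 }
    '
}

# Illustrative input: only thread 11 is both sleeping and idle long enough.
printf '11\tSleep\t120\n12\tQuery\t300\n13\tSleep\t5\n' | gen_kill_statements 60
```

Reviewing the printed statements first, and only then feeding them to the mysql client, avoids killing a session that merely looks idle.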
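We once experienced a data-center-level network outage. After the applications reconnected they ran into a large number of SQL lock waits, so we wrote a script that periodically kills transactions that have stayed open for a long time. It is provided for reference only.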
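#!/bin/bash
# For system operations: kill sessions that hold an open transaction
# but have been idle (COMMAND='Sleep') for at least 60 seconds.
# Usage: pass a port to handle one instance; with no argument, all local mysqld ports are handled.
Kill_SQL="select concat('kill ', trx_mysql_thread_id ,' ;') from information_schema.processlist a,information_schema.INNODB_TRX b where a.ID=b.trx_mysql_thread_id and trx_started <=SUBDATE(now(),interval 60 second) and STATE='' and COMMAND='Sleep' and TIME>=60"
LOGFILE='/data/logs/zandb_agent/kill_long_trx.log'
run_user=$(whoami)

# Append a timestamped message to the log file.
ret_log()
{
    msg="$1"
    echo "$(date +%Y%m%d_%H:%M:%S) [info]  $msg" >> "${LOGFILE}"
}

# Discover the ports of running mysqld instances unless a port was given.
if [ $# -lt 1 ]; then
    ports=$(ps -ef | grep mysqld | grep -v mysqld_safe | grep port= | awk -F"port=" '{print $NF}' | awk '{print $1}' | sort)
else
    ports=$1
fi

for port in ${ports};
do
    # A lock file lets an operator exempt an instance from this job.
    if [ -f "/tmp/long_trx_${port}.lock" ]; then
        msg="there is a lock file, so we skip instance ${port}.."
        ret_log "${msg}"
        continue
    fi
    if [ -S "/srv/my${port}/run/mysql.sock" ]; then
        SOCKET="/srv/my${port}/run/mysql.sock"
    else
        msg="socket file for instance ${port} does not exist, please check."
        ret_log "${msg}"
        # Skip this instance; otherwise the previous instance's socket would be reused.
        continue
    fi
    MYSQL="/opt/mysql/bin/mysql -uroot -S ${SOCKET} "
    # Generate the kill statements into a per-port, per-user temporary file.
    ${MYSQL} --skip-column-names -e "$Kill_SQL" > "/tmp/kill_trx_${port}.sql.${run_user}" 2> "/tmp/kill_trx_${port}.log.${run_user}"
    num=$(grep -c kill "/tmp/kill_trx_${port}.sql.${run_user}")
    if [[ ${num} -gt 0 ]]; then
        msg="${port} ${num} long trx was killed "
        ret_log "${msg}"
        # Execute the generated kill statements.
        ${MYSQL} -e "source /tmp/kill_trx_${port}.sql.${run_user}"
    fi
done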
Stack 1: rollback stack
(gdb) bt
#0 innobase_rollback (hton=0x2e12440, thd=0x7ffefc000950, rollback_trx=false) at /home/mysql/soft/percona-server-5.7.29-32/storage/innobase/handler/ha_innodb.cc:5452
#1 0x0000000000ea6ab8 in ha_rollback_low (thd=0x7ffefc000950, all=false) at /home/mysql/soft/percona-server-5.7.29-32/sql/handler.cc:2019
#2 0x00000000017f0f23 in MYSQL_BIN_LOG::rollback (this=0x2d668a0 <mysql_bin_log>, thd=0x7ffefc000950, all=false) at /home/mysql/soft/percona-server-5.7.29-32/sql/binlog.cc:2532
#3 0x0000000000ea6d40 in ha_rollback_trans (thd=0x7ffefc000950, all=false) at /home/mysql/soft/percona-server-5.7.29-32/sql/handler.cc:2106
#4 0x00000000015c6a13 in trans_rollback_stmt (thd=0x7ffefc000950) at /home/mysql/soft/percona-server-5.7.29-32/sql/transaction.cc:515
#5 0x00000000014c08de in mysql_execute_command (thd=0x7ffefc000950, first_level=true) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_parse.cc:5325
#6 0x00000000014c2025 in mysql_parse (thd=0x7ffefc000950, parser_state=0x7fffe88824a0, update_userstat=false) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_parse.cc:5927
#7 0x00000000014b6c5f in dispatch_command (thd=0x7ffefc000950, com_data=0x7fffe8882c90, command=COM_QUERY) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_parse.cc:1539
#8 0x00000000014b5a94 in do_command (thd=0x7ffefc000950) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_parse.cc:1060
#9 0x00000000015e9d32 in handle_connection (arg=0x3c09eb0) at /home/mysql/soft/percona-server-5.7.29-32/sql/conn_handler/connection_handler_per_thread.cc:325
#10 0x00000000018b97f2 in pfs_spawn_thread (arg=0x3b784b0) at /home/mysql/soft/percona-server-5.7.29-32/storage/perfschema/pfs.cc:2198
#11 0x00007ffff7bc6ea5 in start_thread () from /lib64/libpthread.so.0
#12 0x00007ffff5fa08dd in clone () from /lib64/libc.so.6
Stack 2: waiting on a write with timeout=60000, i.e. the default write_timeout of 60 s
#0 0x00007ffff5f95c3d in poll () from /lib64/libc.so.6
#1 0x0000000001da0e3b in vio_io_wait (vio=0x7ffefc005690, event=VIO_IO_EVENT_WRITE, timeout=60000) at /home/mysql/soft/percona-server-5.7.29-32/vio/viosocket.c:1173
#2 0x0000000001d9f352 in vio_socket_io_wait (vio=0x7ffefc005690, event=VIO_IO_EVENT_WRITE) at /home/mysql/soft/percona-server-5.7.29-32/vio/viosocket.c:127
#3 0x0000000001d9f74a in vio_write (vio=0x7ffefc005690, buf=0x7ffefc013838 "\001\061\001\060\004", size=12872) at /home/mysql/soft/percona-server-5.7.29-32/vio/viosocket.c:260
#4 0x00000000016f92a3 in net_write_raw_loop (net=0x7ffefc002528, buf=0x7ffefc013838 "\001\061\001\060\004", count=12872) at /home/mysql/soft/percona-server-5.7.29-32/sql/net_serv.cc:522
#5 0x00000000016f9588 in net_write_packet (net=0x7ffefc002528, packet=0x7ffefc012a80 "\001\061\001\060\004", length=16384) at /home/mysql/soft/percona-server-5.7.29-32/sql/net_serv.cc:661
#6 0x00000000016f9177 in net_write_buff (net=0x7ffefc002528, packet=0x7ffefc936f70 "\001\061\001\060", len=4) at /home/mysql/soft/percona-server-5.7.29-32/sql/net_serv.cc:474
#7 0x00000000016f8e1c in my_net_write (net=0x7ffefc002528, packet=0x7ffefc936f70 "\001\061\001\060", len=4) at /home/mysql/soft/percona-server-5.7.29-32/sql/net_serv.cc:347
#8 0x0000000001745635 in Protocol_classic::end_row (this=0x7ffefc001c68) at /home/mysql/soft/percona-server-5.7.29-32/sql/protocol_classic.cc:1204
#9 0x00000000014571d7 in Query_result_send::send_data (this=0x7ffefc0097a0, items=...) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_class.cc:2937
#10 0x0000000001477a0d in end_send (join=0x7ffefc009a70, qep_tab=0x7ffefc93b290, end_of_records=false) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_executor.cc:2946
#11 0x000000000147461b in evaluate_join_record (join=0x7ffefc009a70, qep_tab=0x7ffefc93b118) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_executor.cc:1652
#12 0x0000000001473a5b in sub_select (join=0x7ffefc009a70, qep_tab=0x7ffefc93b118, end_of_records=false) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_executor.cc:1304
#13 0x00000000014732dc in do_select (join=0x7ffefc009a70) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_executor.cc:957
#14 0x0000000001471243 in JOIN::exec (this=0x7ffefc009a70) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_executor.cc:206
#15 0x000000000150d2d5 in handle_query (thd=0x7ffefc000950, lex=0x7ffefc003000, result=0x7ffefc0097a0, added_options=0, removed_options=0)
at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_select.cc:192
#16 0x00000000014c1097 in execute_sqlcom_select (thd=0x7ffefc000950, all_tables=0x7ffefc009160) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_parse.cc:5490
#17 0x00000000014ba323 in mysql_execute_command (thd=0x7ffefc000950, first_level=true) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_parse.cc:3016
#18 0x00000000014c2025 in mysql_parse (thd=0x7ffefc000950, parser_state=0x7fffe88824a0, update_userstat=false) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_parse.cc:5927
#19 0x00000000014b6c5f in dispatch_command (thd=0x7ffefc000950, com_data=0x7fffe8882c90, command=COM_QUERY) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_parse.cc:1539
#20 0x00000000014b5a94 in do_command (thd=0x7ffefc000950) at /home/mysql/soft/percona-server-5.7.29-32/sql/sql_parse.cc:1060
#21 0x00000000015e9d32 in handle_connection (arg=0x3c09eb0) at /home/mysql/soft/percona-server-5.7.29-32/sql/conn_handler/connection_handler_per_thread.cc:325
#22 0x00000000018b97f2 in pfs_spawn_thread (arg=0x3b784b0) at /home/mysql/soft/percona-server-5.7.29-32/storage/perfschema/pfs.cc:2198
#23 0x00007ffff7bc6ea5 in start_thread () from /lib64/libpthread.so.0
#24 0x00007ffff5fa08dd in clone () from /lib64/libc.so.6
(gdb)
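Thanks for reading this far. If you spot any shortcomings in this article, please point them out; if you think it is well written, please give it a like.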