MySQL Long Transactions Explained

Preface:

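The 『入門MySQL』 (Getting Started with MySQL) series is now complete. From here on my posts will still center on MySQL, mainly recording scenarios I run into at work and in study, plus my own thoughts. Later posts may be less connected to one another, but I hope you will keep reading. Back to the topic: this article covers long transactions in MySQL. What happens if we open a transaction and never commit or roll it back? How should we handle transactions that are stuck waiting? This article answers both questions.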

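Note: this article does not focus on transaction isolation levels or their properties. It covers the harm long transactions cause and how to monitor and handle them. All experiments were run on MySQL 5.7.23 at the REPEATABLE READ (RR) isolation level.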

1. What is a long transaction

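First we need to know what a long transaction is. As the name suggests, it is a transaction that runs for a long time and stays uncommitted for a long time; it is sometimes also called a large transaction. Such transactions tend to cause heavy blocking and lock-wait timeouts and can easily cause replication lag between master and replica, so avoid them wherever possible.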

Below I show how to open a transaction and simulate a long one:

# Suppose we have a table stu_tb with the following structure and data
mysql> show create table stu_tb\G
*************************** 1. row ***************************
       Table: stu_tb
Create Table: CREATE TABLE `stu_tb` (
  `increment_id` int(11) NOT NULL AUTO_INCREMENT COMMENT '自增主鍵',
  `stu_id` int(11) NOT NULL COMMENT '學號',
  `stu_name` varchar(20) DEFAULT NULL COMMENT '學生姓名',
  `create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '建立時間',
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改時間',
  PRIMARY KEY (`increment_id`),
  UNIQUE KEY `uk_stu_id` (`stu_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8 COMMENT='測試學生表'
1 row in set (0.01 sec)

mysql> select * from stu_tb;
+--------------+--------+----------+---------------------+---------------------+
| increment_id | stu_id | stu_name | create_time         | update_time         |
+--------------+--------+----------+---------------------+---------------------+
|            1 |   1001 | from1    | 2019-09-15 14:27:34 | 2019-09-15 14:27:34 |
|            2 |   1002 | dfsfd    | 2019-09-15 14:27:34 | 2019-09-15 14:27:34 |
|            3 |   1003 | fdgfg    | 2019-09-15 14:27:34 | 2019-09-15 14:27:34 |
|            4 |   1004 | sdfsdf   | 2019-09-15 14:27:34 | 2019-09-15 14:27:34 |
|            5 |   1005 | dsfsdg   | 2019-09-15 14:27:34 | 2019-09-15 14:27:34 |
|            6 |   1006 | fgd      | 2019-09-15 14:27:34 | 2019-09-15 14:27:34 |
|            7 |   1007 | fgds     | 2019-09-15 14:27:34 | 2019-09-15 14:27:34 |
|            8 |   1008 | dgfsa    | 2019-09-15 14:27:34 | 2019-09-15 14:27:34 |
+--------------+--------+----------+---------------------+---------------------+
8 rows in set (0.00 sec)

# Open a transaction explicitly with BEGIN or START TRANSACTION
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from stu_tb where stu_id = 1006 for update;
+--------------+--------+----------+---------------------+---------------------+
| increment_id | stu_id | stu_name | create_time         | update_time         |
+--------------+--------+----------+---------------------+---------------------+
|            6 |   1006 | fgd      | 2019-09-15 14:27:34 | 2019-09-15 14:27:34 |
+--------------+--------+----------+---------------------+---------------------+
1 row in set (0.01 sec)

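# If we do not commit or roll back this transaction promptly, it becomes a long transaction. Any other session that tries to modify or lock this row will block until the lock is released or innodb_lock_wait_timeout (50 seconds by default) expires.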

2. How to find long transactions

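When transactions are stuck waiting, the first step is to find the transactions that are currently running. The information_schema.INNODB_TRX table holds information about every transaction currently active inside InnoDB, including its start time, so a little arithmetic gives us how long each transaction has been running.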

mysql> select t.*,to_seconds(now())-to_seconds(t.trx_started) idle_time from INFORMATION_SCHEMA.INNODB_TRX t \G
*************************** 1. row ***************************
                    trx_id: 6168
                 trx_state: RUNNING
               trx_started: 2019-09-16 11:08:27
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 3
       trx_mysql_thread_id: 11
                 trx_query: NULL
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 1
          trx_lock_structs: 3
     trx_lock_memory_bytes: 1136
           trx_rows_locked: 2
         trx_rows_modified: 0
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
                 idle_time: 170

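In this result, idle_time is a computed column: the number of seconds since the transaction started, that is, how long it has been open. trx_query is NULL, but that does not mean the transaction did nothing. A transaction can contain several statements, and trx_query shows only the statement that is executing right now; once a statement finishes it is no longer shown. The transaction is still open, and InnoDB cannot know whether more statements will follow or when it will commit. So trx_query gives no useful information here.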
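The age filter can also be pushed into the query itself, so that only transactions older than a threshold come back. The following is a minimal sketch rather than part of the original article: the function name build_long_trx_sql and the column alias age_secs are hypothetical, and it only prints the SQL text.

```shell
# Print a query listing InnoDB transactions older than N seconds (default 30).
build_long_trx_sql() {
  threshold="${1:-30}"
  # Accept digits only, so the value can be embedded in the SQL text safely.
  case "$threshold" in
    ''|*[!0-9]*) echo "threshold must be a non-negative integer" >&2; return 1 ;;
  esac
  cat <<SQL
SELECT trx_id, trx_mysql_thread_id, trx_started,
       TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS age_secs
FROM information_schema.innodb_trx
WHERE TIMESTAMPDIFF(SECOND, trx_started, NOW()) > ${threshold}
ORDER BY trx_started;
SQL
}

# Usage (needs a reachable server and credentials):
#   mysql -e "$(build_long_trx_sql 60)"
```

Keeping the SQL generation separate from the mysql call makes the threshold easy to change and the query easy to inspect before it is run.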

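What if we want to see the SQL this transaction has executed, to judge whether the long transaction can be killed? We can join other system tables, as in the following query: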

mysql> select now(),(UNIX_TIMESTAMP(now()) - UNIX_TIMESTAMP(a.trx_started)) diff_sec,b.id,b.user,b.host,b.db,d.SQL_TEXT from information_schema.innodb_trx a inner join
    -> information_schema.PROCESSLIST b
    -> on a.TRX_MYSQL_THREAD_ID=b.id and b.command = 'Sleep'
    -> inner join performance_schema.threads c ON b.id = c.PROCESSLIST_ID
    -> inner join performance_schema.events_statements_current d ON d.THREAD_ID = c.THREAD_ID;
+---------------------+----------+----+------+-----------+--------+-----------------------------------------------------+
| now()               | diff_sec | id | user | host      | db     | SQL_TEXT                                            |
+---------------------+----------+----+------+-----------+--------+-----------------------------------------------------+
| 2019-09-16 14:06:26 |       54 | 17 | root | localhost | testdb | select * from stu_tb where stu_id = 1006 for update |
+---------------------+----------+----+------+-----------+--------+-----------------------------------------------------+

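In the result above, diff_sec means the same as idle_time earlier: the number of seconds the transaction has been open. SQL_TEXT is the statement the session executed most recently. However, this query returns only the last statement, and a transaction may contain many. Can we see the statements an uncommitted transaction has already run? Yes, by joining the events_statements_history table. Note that this table keeps only the most recent statements per thread (10 by default, controlled by performance_schema_events_statements_history_size), and it records everything the session ran, including statements issued before the transaction started. The following query lists the recorded statements for each session that has an open transaction: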

mysql> SELECT
    ->   ps.id 'PROCESS ID',
    ->   ps.USER,
    ->   ps.HOST,
    ->   esh.EVENT_ID,
    ->   trx.trx_started,
    ->   esh.event_name 'EVENT NAME',
    ->   esh.sql_text 'SQL',
    ->   ps.time
    -> FROM
    ->   PERFORMANCE_SCHEMA.events_statements_history esh
    ->   JOIN PERFORMANCE_SCHEMA.threads th ON esh.thread_id = th.thread_id
    ->   JOIN information_schema.PROCESSLIST ps ON ps.id = th.processlist_id
    ->   LEFT JOIN information_schema.innodb_trx trx ON trx.trx_mysql_thread_id = ps.id
    -> WHERE
    ->   trx.trx_id IS NOT NULL
    ->   AND ps.USER != 'system user'
    -> ORDER BY
    ->   esh.EVENT_ID;
+------------+------+-----------+----------+---------------------+------------------------------+-----------------------------------------------------+------+
| PROCESS ID | USER | HOST      | EVENT_ID | trx_started         | EVENT NAME                   | SQL                                                 | time |
+------------+------+-----------+----------+---------------------+------------------------------+-----------------------------------------------------+------+
|         20 | root | localhost |        1 | 2019-09-16 14:18:44 | statement/sql/select         | select @@version_comment limit 1                    |   60 |
|         20 | root | localhost |        2 | 2019-09-16 14:18:44 | statement/sql/begin          | start transaction                                   |   60 |
|         20 | root | localhost |        3 | 2019-09-16 14:18:44 | statement/sql/select         | SELECT DATABASE()                                   |   60 |
|         20 | root | localhost |        4 | 2019-09-16 14:18:44 | statement/com/Init DB        | NULL                                                |   60 |
|         20 | root | localhost |        5 | 2019-09-16 14:18:44 | statement/sql/show_databases | show databases                                      |   60 |
|         20 | root | localhost |        6 | 2019-09-16 14:18:44 | statement/sql/show_tables    | show tables                                         |   60 |
|         20 | root | localhost |        7 | 2019-09-16 14:18:44 | statement/com/Field List     | NULL                                                |   60 |
|         20 | root | localhost |        8 | 2019-09-16 14:18:44 | statement/com/Field List     | NULL                                                |   60 |
|         20 | root | localhost |        9 | 2019-09-16 14:18:44 | statement/sql/select         | select * from stu_tb                                |   60 |
|         20 | root | localhost |       10 | 2019-09-16 14:18:44 | statement/sql/select         | select * from stu_tb where stu_id = 1006 for update |   60 |
+------------+------+-----------+----------+---------------------+------------------------------+-----------------------------------------------------+------+

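From this result we can see the statements the session has run so far, up to the history limit. Once we have the relevant information about the transaction, we can judge whether it can be killed so that it stops making other transactions wait.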
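Once the sessions have been reviewed, the KILL statements can be generated rather than typed by hand. The following is a minimal sketch, assuming input lines of the form `<processlist id><TAB><transaction age in seconds>`; the function name gen_kill_statements is hypothetical, and it only prints statements for review without executing anything.

```shell
# Read "<processlist id><TAB><age in seconds>" lines on stdin and print a
# KILL statement for every session whose transaction is older than the
# threshold (default 30 seconds). Nothing is executed.
gen_kill_statements() {
  threshold="${1:-30}"
  awk -F'\t' -v limit="$threshold" '$2 + 0 > limit + 0 { printf "KILL %d;\n", $1 }'
}

# Usage (needs a reachable server and credentials):
#   mysql -N -e "select trx_mysql_thread_id, to_seconds(now())-to_seconds(trx_started) from information_schema.innodb_trx" | gen_kill_statements 300
```

Printing the statements instead of running them leaves the final decision with the operator, which matches the review step described above.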

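A brief digression: long transactions very easily cause blocking or deadlocks. Usually we can first query the sys.innodb_lock_waits view to check whether any transaction is being blocked: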

# Suppose one transaction runs: select * from stu_tb where stu_id = 1006 for update
# and another transaction runs: update stu_tb set stu_name = 'wang' where stu_id = 1006

mysql> select * from sys.innodb_lock_waits\G
*************************** 1. row ***************************
                wait_started: 2019-09-16 14:34:32
                    wait_age: 00:00:03
               wait_age_secs: 3
                locked_table: `testdb`.`stu_tb`
                locked_index: uk_stu_id
                 locked_type: RECORD
              waiting_trx_id: 6178
         waiting_trx_started: 2019-09-16 14:34:32
             waiting_trx_age: 00:00:03
     waiting_trx_rows_locked: 1
   waiting_trx_rows_modified: 0
                 waiting_pid: 19
               waiting_query: update stu_tb set stu_name = 'wang' where stu_id = 1006
             waiting_lock_id: 6178:47:4:7
           waiting_lock_mode: X
             blocking_trx_id: 6177
                blocking_pid: 20
              blocking_query: NULL
            blocking_lock_id: 6177:47:4:7
          blocking_lock_mode: X
        blocking_trx_started: 2019-09-16 14:18:44
            blocking_trx_age: 00:15:51
    blocking_trx_rows_locked: 2
  blocking_trx_rows_modified: 0
     sql_kill_blocking_query: KILL QUERY 20
sql_kill_blocking_connection: KILL 20

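The result shows the blocked statement and the lock type and, even better, provides ready-made statements for killing the blocker. Note that KILL QUERY only interrupts a statement that is currently executing; when the blocking session is idle inside an open transaction, as here (blocking_query is NULL), only KILL on the connection rolls the transaction back and releases its locks. What the view does not show is the SQL the blocking session executed. For more detail we can use the query below, which relies on information_schema.innodb_lock_waits (available in 5.7, removed in MySQL 8.0):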

mysql> SELECT
    ->   tmp.*,
    ->   c.SQL_Text blocking_sql_text,
    ->   p.HOST blocking_host
    -> FROM
    ->   (
    ->   SELECT
    ->     r.trx_state waiting_trx_state,
    ->     r.trx_id waiting_trx_id,
    ->     r.trx_mysql_thread_Id waiting_thread,
    ->     r.trx_query waiting_query,
    ->     b.trx_state blocking_trx_state,
    ->     b.trx_id blocking_trx_id,
    ->     b.trx_mysql_thread_id blocking_thread,
    ->     b.trx_query blocking_query
    ->   FROM
    ->     information_schema.innodb_lock_waits w
    ->     INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
    ->     INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id
    ->   ) tmp,
    ->   information_schema.PROCESSLIST p,
    ->   PERFORMANCE_SCHEMA.events_statements_current c,
    ->   PERFORMANCE_SCHEMA.threads t
    -> WHERE
    ->   tmp.blocking_thread = p.id
    ->   AND t.thread_id = c.THREAD_ID
    ->   AND t.PROCESSLIST_ID = p.id \G
*************************** 1. row ***************************
 waiting_trx_state: LOCK WAIT
    waiting_trx_id: 6180
    waiting_thread: 19
     waiting_query: update stu_tb set stu_name = 'wang' where stu_id = 1006
blocking_trx_state: RUNNING
   blocking_trx_id: 6177
   blocking_thread: 20
    blocking_query: NULL
 blocking_sql_text: select * from stu_tb where stu_id = 1006 for update
     blocking_host: localhost

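This result is clearer: we can see the statements executed by both the blocking and the blocked transaction, which helps us investigate and confirm whether the blocking session can be killed.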
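When several sessions queue up behind one another, killing a session that is itself waiting does not help; the session to look at is the one at the head of the chain. The following is a minimal sketch, assuming input lines of `waiting_pid blocking_pid` pairs such as the two columns of sys.innodb_lock_waits; the function name root_blockers is hypothetical.

```shell
# Read "waiting_pid blocking_pid" pairs on stdin and print the pids that
# block others without waiting on anyone themselves (the root blockers).
root_blockers() {
  awk '
    { waiting[$1] = 1; blocking[$2] = 1 }
    END { for (pid in blocking) if (!(pid in waiting)) print pid }
  ' | sort -n
}

# Usage (needs a reachable server and credentials):
#   mysql -N -e "select waiting_pid, blocking_pid from sys.innodb_lock_waits" | root_blockers
```

In the example from this article, session 19 waits on session 20, so the only root blocker is 20.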

3. Monitoring long transactions

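In practice we need to monitor long transactions. Define a threshold, say 30s: any transaction that has been open for more than 30s counts as a long transaction and should be logged and alerted on, so that an administrator can deal with it. The monitoring script below is a starting point; adapt it to your needs: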

#!/bin/bash
# -------------------------------------------------------------------------------
# FileName:    long_trx.sh
# Describe:    monitor long transaction
# Revision:    1.0
# Date:        2019/09/16
# Author:      wang

# -N suppresses column headers; when piped, mysql prints one tab-separated row per line.
# Replace -pxxxxxx with real credentials; a password on the command line is visible to other local users.
/usr/local/mysql/bin/mysql -N -uroot -pxxxxxx -e "select now(),(UNIX_TIMESTAMP(now()) - UNIX_TIMESTAMP(a.trx_started)) diff_sec,b.id,b.user,b.host,b.db,d.SQL_TEXT from information_schema.innodb_trx a inner join
information_schema.PROCESSLIST b
on a.TRX_MYSQL_THREAD_ID=b.id and b.command = 'Sleep'
inner join performance_schema.threads c ON b.id = c.PROCESSLIST_ID
inner join performance_schema.events_statements_current d ON d.THREAD_ID = c.THREAD_ID;" | while IFS=$'\t' read -r db_now diff_sec id user host db sql_text
do
  # Split on tabs only, so now() stays one field and SQL_TEXT keeps its spaces;
  # read -r keeps backslashes in the SQL text intact.
  if [ "$diff_sec" -gt 30 ]
  then
    date +"%Y-%m-%d %H:%M:%S"
    echo "processid[$id] $user@$host in db[$db] hold transaction time $diff_sec SQL:$sql_text"
  fi
done >> /tmp/longtransaction.txt

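A brief note: -gt 30 here means 30 seconds. Any transaction that has been open for more than 30 seconds is treated as a long transaction, and you can change the value to suit your needs. Add the script to a scheduled job such as cron to run it periodically.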
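The parsing half of the monitor can be tried without a live server by feeding it a canned row. The following is a minimal sketch, assuming tab-separated input in the same column order as the monitoring query (now, diff_sec, id, user, host, db, SQL_TEXT); the function name report_long_trx is hypothetical.

```shell
# Read tab-separated rows (now, diff_sec, id, user, host, db, sql_text) on
# stdin and print one alert line per row older than the threshold (default 30).
report_long_trx() {
  threshold="${1:-30}"
  tab=$(printf '\t')
  while IFS="$tab" read -r db_now diff_sec id user host db sql_text; do
    if [ "$diff_sec" -gt "$threshold" ]; then
      printf '%s\n' "processid[$id] $user@$host in db[$db] hold transaction time $diff_sec SQL:$sql_text"
    fi
  done
}

# Dry run with a sample row taken from the query output shown earlier:
#   printf '2019-09-16 14:06:26\t54\t17\troot\tlocalhost\ttestdb\tselect 1\n' | report_long_trx 30
```

Separating the parsing from the mysql call makes it possible to check the alert format and the threshold logic before the job is scheduled.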

Summary:

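This article covered long transactions: how to find them, how to handle them, and how to monitor them. If transactions are still somewhat unfamiliar to you, I hope it helps. Because the article lists quite a few transaction-related queries, they are collected below: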

# List all running transactions and how long they have been open
select t.*,to_seconds(now())-to_seconds(t.trx_started) idle_time from INFORMATION_SCHEMA.INNODB_TRX t \G

# Show transaction details and the most recently executed SQL
select now(),(UNIX_TIMESTAMP(now()) - UNIX_TIMESTAMP(a.trx_started)) diff_sec,b.id,b.user,b.host,b.db,d.SQL_TEXT from information_schema.innodb_trx a inner join information_schema.PROCESSLIST b
on a.TRX_MYSQL_THREAD_ID=b.id and b.command = 'Sleep'
inner join performance_schema.threads c ON b.id = c.PROCESSLIST_ID
inner join performance_schema.events_statements_current d ON d.THREAD_ID = c.THREAD_ID;

# Show the recorded statement history of sessions with an open transaction
SELECT
  ps.id 'PROCESS ID',
  ps.USER,
  ps.HOST,
  esh.EVENT_ID,
  trx.trx_started,
  esh.event_name 'EVENT NAME',
  esh.sql_text 'SQL',
  ps.time 
FROM
  PERFORMANCE_SCHEMA.events_statements_history esh
  JOIN PERFORMANCE_SCHEMA.threads th ON esh.thread_id = th.thread_id
  JOIN information_schema.PROCESSLIST ps ON ps.id = th.processlist_id
  LEFT JOIN information_schema.innodb_trx trx ON trx.trx_mysql_thread_id = ps.id 
WHERE
  trx.trx_id IS NOT NULL 
  AND ps.USER != 'system user'
ORDER BY
  esh.EVENT_ID;

# Quick check for lock waits
select * from sys.innodb_lock_waits\G

# Detailed lock-wait information
SELECT
  tmp.*,
  c.SQL_Text blocking_sql_text,
  p.HOST blocking_host 
FROM
  (
  SELECT
    r.trx_state waiting_trx_state,
    r.trx_id waiting_trx_id,
    r.trx_mysql_thread_Id waiting_thread,
    r.trx_query waiting_query,
    b.trx_state blocking_trx_state,
    b.trx_id blocking_trx_id,
    b.trx_mysql_thread_id blocking_thread,
    b.trx_query blocking_query 
  FROM
    information_schema.innodb_lock_waits w
    INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
    INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id 
  ) tmp,
  information_schema.PROCESSLIST p,
  PERFORMANCE_SCHEMA.events_statements_current c,
  PERFORMANCE_SCHEMA.threads t 
WHERE
  tmp.blocking_thread = p.id 
  AND t.thread_id = c.THREAD_ID 
  AND t.PROCESSLIST_ID = p.id \G