MySQL Deadlocks

Date: 2019-12-20

Tags: mysql, deadlock

https://dev.mysql.com/doc/refman/5.7/en/innodb-deadlocks.html

What is a deadlock in MySQL?

A deadlock is a situation where different transactions are unable to proceed because each holds a lock that the other needs. Because both transactions are waiting for a resource to become available, neither ever release the locks it holds.

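Put simply, this boils down to two terms: circular wait (each holds a lock that the other needs) and no preemption (neither ever release the locks it holds).

In fact, the four classic necessary conditions for deadlock can be reduced to these two. The remaining conditions, mutual exclusion and hold-and-wait, are just well-known supplements.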
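
Circular wait can be made concrete with a wait-for graph: an edge from one transaction to another means the first is waiting for a lock the second holds, and a deadlock is a cycle in that graph. The following is a minimal sketch of that idea, not InnoDB's actual detector; the transaction names and the `find_wait_cycle` helper are made up for illustration.

```python
from typing import Dict, List, Optional


def find_wait_cycle(waits_for: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in the wait-for graph as a list of transactions, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {txn: WHITE for txn in waits_for}
    path: List[str] = []

    def visit(txn: str) -> Optional[List[str]]:
        color[txn] = GREY
        path.append(txn)
        for holder in waits_for.get(txn, []):
            state = color.get(holder, WHITE)
            if state == GREY:
                # holder is already on the current path, so holder..txn is a cycle
                return path[path.index(holder):]
            if state == WHITE:
                cycle = visit(holder)
                if cycle is not None:
                    return cycle
        path.pop()
        color[txn] = BLACK
        return None

    for txn in list(waits_for):
        if color[txn] == WHITE:
            cycle = visit(txn)
            if cycle is not None:
                return cycle
    return None


# Hypothetical transactions: A waits for a lock held by B, and B waits for A.
deadlocked = {"A": ["B"], "B": ["A"]}
# B waits for A, but A waits for nothing, so A can finish and release its locks.
blocked_only = {"A": [], "B": ["A"]}

print(find_wait_cycle(deadlocked))    # ['A', 'B']
print(find_wait_cycle(blocked_only))  # None
```

The second case is plain lock waiting rather than a deadlock: without a cycle, some transaction can always make progress.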

1. A simple deadlock example:

Session A:

mysql> CREATE TABLE t (i INT) ENGINE = InnoDB;
Query OK, 0 rows affected (1.07 sec)

mysql> INSERT INTO t (i) VALUES(1);
Query OK, 1 row affected (0.09 sec)

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM t WHERE i = 1 LOCK IN SHARE MODE;
+------+
| i    |
+------+
| 1    |
+------+

Session B:

mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

mysql> DELETE FROM t WHERE i = 1;

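At this point session B blocks (until the lock wait times out, which is governed by innodb_lock_wait_timeout).

Session A then continues: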

DELETE FROM t WHERE i = 1;

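Because this creates a deadlock, session B is rolled back immediately. The client whose transaction is rolled back receives ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction. The most recent deadlock can be inspected with SHOW ENGINE INNODB STATUS\G.

With the innodb_print_all_deadlocks parameter enabled, deadlock information is also written to the error log. This example is too simple to be worth analyzing in detail here.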

set @@global.innodb_print_all_deadlocks=on;

InnoDB chooses the transaction that is cheaper to roll back, judged by the number of rows it has inserted, updated, or deleted.
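
Since the victim is rolled back rather than paused, the application has to be ready to run the whole transaction again. Below is a minimal sketch of such a retry wrapper. `DatabaseError` is a hypothetical stand-in for whatever exception a real driver raises, and `run_with_deadlock_retry` is an illustrative name; only the error code 1213 comes from MySQL itself.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

ER_LOCK_DEADLOCK = 1213  # MySQL error code for "Deadlock found when trying to get lock"


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception that carries a MySQL errno."""

    def __init__(self, errno: int, message: str) -> None:
        super().__init__(message)
        self.errno = errno


def run_with_deadlock_retry(work: Callable[[], T], attempts: int = 3, backoff: float = 0.05) -> T:
    """Run work(); if it fails as a deadlock victim, wait briefly and run it again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return work()
        except DatabaseError as exc:
            # Only deadlock errors are retried; other errors, or the last attempt, propagate.
            if exc.errno != ER_LOCK_DEADLOCK or attempt == attempts:
                raise
            time.sleep(backoff * attempt)
    raise RuntimeError("unreachable")


calls = {"count": 0}


def delete_row() -> str:
    """Pretend transaction: chosen as the deadlock victim once, then succeeds."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise DatabaseError(ER_LOCK_DEADLOCK, "Deadlock found when trying to get lock; try restarting transaction")
    return "committed"


result = run_with_deadlock_retry(delete_row, backoff=0.0)
print(result, calls["count"])  # committed 2
```

In real code `work` would open a transaction, run every statement, and commit, because the rollback discards everything the victim did, not just the last statement.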

2. A real-world deadlock example:

The deadlock entry in the error log was:

InnoDB: transactions deadlock detected, dumping detailed information.
*** (1) TRANSACTION:
TRANSACTION 209262583957, ACTIVE 1 sec starting index read
mysql tables in use 2, locked 2
LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s)
MySQL thread id 129183854, OS thread handle 0x7f1aeae7a700, query id 68320628504 <server A info> updating
update  tb_authorize_info set account_balance=account_balance-  100.00 
     where (SELECT a.account_balance from 
(select account_balance from tb_authorize_info a where appId =  '49E5BD695F853DC3' )a)  -  100.00 > 0 
 and appId = '49E5BD695F853DC3'
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1845 page no 4 n bits 96 index `PRIMARY` of table `xxx`.`tb_authorize_info` trx id 209262583957 lock_mode X locks rec but not gap waiting
Record lock, heap no 18 PHYSICAL RECORD: n_fields 32; compact format; info bits 0
......

*** (2) TRANSACTION:
TRANSACTION 209262584968, ACTIVE 1 sec starting index read
mysql tables in use 2, locked 2
4 lock struct(s), heap size 1184, 2 row lock(s)
MySQL thread id 129183879, OS thread handle 0x7f198b208700, query id 68320632234 <server B info> updating
update  tb_authorize_info set account_balance=account_balance-  100.00 
     where (SELECT a.account_balance from 
(select account_balance from tb_authorize_info a where appId =  '49E5BD695F853DC3' )a)  -  100.00 > 0 
 and appId = '49E5BD695F853DC3'
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1845 page no 4 n bits 96 index `PRIMARY` of table `xxx`.`tb_authorize_info` trx id 209262584968 lock mode S locks rec but not gap
Record lock, heap no 18 PHYSICAL RECORD: n_fields 32; compact format; info bits 0
......

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1845 page no 4 n bits 96 index `PRIMARY` of table `xxx`.`tb_authorize_info` trx id 209262584968 lock_mode X locks rec but not gap waiting
Record lock, heap no 18 PHYSICAL RECORD: n_fields 32; compact format; info bits 0
......

*** WE ROLL BACK TRANSACTION (2)

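This is a simple deadlock. Because of network or other latency, the application request was sent to two load-balanced application servers, and both applications asked the database to run the SQL at the same time. Each first acquired an S lock through the WHERE condition and then tried to upgrade it to an X lock in order to update, but each was blocked by the other's S lock, which formed a deadlock. MySQL resolved it quickly: transaction 1 completed normally and transaction 2 was rolled back, so apart from a slight delay the front-end business was essentially unaffected.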
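
The S-to-X upgrade problem can be reproduced with a toy model of the locks on one row. This is only a sketch under simplifying assumptions: `RecordLock` is an illustrative class, not InnoDB's data structure, and it ignores the queue of waiting requests that the real engine also takes into account.

```python
from typing import Dict, List

# Keyed by (requested mode, held mode): only S with S is compatible.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False, ("X", "S"): False, ("X", "X"): False}


class RecordLock:
    """Toy model of the locks on a single row."""

    def __init__(self) -> None:
        self.held: Dict[str, str] = {}  # transaction name -> granted mode

    def request(self, txn: str, mode: str) -> List[str]:
        """Grant the lock and return [] if possible; otherwise return the blocking transactions."""
        blockers = [
            other for other, held_mode in self.held.items()
            if other != txn and not COMPATIBLE[(mode, held_mode)]
        ]
        if not blockers and self.held.get(txn) != "X":
            # Granted (an X lock already held is never downgraded by a later S request).
            self.held[txn] = mode
        return blockers


row = RecordLock()
print(row.request("txn1", "S"))  # []        both readers get their S lock
print(row.request("txn2", "S"))  # []
print(row.request("txn1", "X"))  # ['txn2']  txn1 cannot upgrade while txn2 holds S
print(row.request("txn2", "X"))  # ['txn1']  txn2 cannot upgrade either: circular wait

# Taking X up front removes the cycle: the second transaction just waits its turn.
row2 = RecordLock()
print(row2.request("txn1", "X"))  # []
print(row2.request("txn2", "X"))  # ['txn1']
```

The second half suggests one possible remedy, which is an assumption on my part rather than something taken from the incident: acquire the exclusive lock first (in MySQL terms, for example, SELECT ... FOR UPDATE before the UPDATE) so that concurrent requests queue up instead of deadlocking.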

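3. Another deadlock, from Stack Overflow:

Someone posted the following deadlock output on Stack Overflow. Being able to read this kind of output directly helps when analyzing slow or stuck SQL under high concurrency, so I tried to interpret it myself. It was a long time ago, and I can no longer find the original link.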

LATEST DETECTED DEADLOCK
------------------------
130409  0:40:58
*** (1) TRANSACTION:
TRANSACTION 3D61D41F, ACTIVE 3 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 43 lock struct(s), heap size 6960, 358 row lock(s), undo log entries 43
MySQL thread id 17241690, OS thread handle 0x7ffd3469a700, query id 860259163 localhost root update
#############
INSERT INTO `notification` (`other_grouped_notifications_count`, `user_id`, `notifiable_type`, `action_item`, `action_id`, `created_at`, `status`, `updated_at`) 
VALUES (0, 4442, 'MATCH', 'MATCH', 224716, 1365448255, 1, 1365448255)
#############
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 272207 n bits 1272 index `user_id` of table `notification` trx id 3D61D41F lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 69 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 4; hex 8000115b; asc    [;;
 1: len 4; hex 0005e0bb; asc     ;;
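-- Transaction 1 wants to insert a row with user_id=4442. It first obtained the insert intention lock on the corresponding primary key range, then tried to place an insert intention lock on the gap before user_id=4443 in the secondary index (hex 8000115b is 4443), but was blocked. This implies that another transaction already holds a lock covering that range
-- Transaction 1 must obtain both insert intention locks before it starts the insert; acquiring the two is indivisible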
*** (2) TRANSACTION:
TRANSACTION 3D61C472, ACTIVE 15 sec starting index read
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1248, 2 row lock(s)
MySQL thread id 17266704, OS thread handle 0x7ffd34b01700, query id 860250374 localhost root Updating
#############
UPDATE `notification` SET `status`=0 WHERE user_id = 4443 and status=1
#############
*** (2) HOLDS THE LOCK(S):
-- Transaction 2's UPDATE targets the rows with user_id=4443, so it first placed an X-mode next-key lock on the range (lower_bound,4443] of the user_id index; this next-key lock is what blocks transaction 1
RECORD LOCKS space id 0 page no 272207 n bits 1272 index `user_id` of table `notification` trx id 3D61C472 lock_mode X
Record lock, heap no 69 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 4; hex 8000115b; asc    [;;
 1: len 4; hex 0005e0bb; asc     ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
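-- When transaction 2 goes on to update the primary key record, it needs the row lock on the primary key entry for user_id=4443, but finds that transaction 1 already holds a lock there (read here as its insert intention lock), so it is blocked
-- Likewise, transaction 2 acquiring the next-key lock on the secondary index and the record lock on the primary key is indivisible; the UPDATE can proceed only once both are held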
RECORD LOCKS space id 0 page no 261029 n bits 248 index `PRIMARY` of table `notification` trx id 3D61C472 lock_mode X locks rec but not gap waiting
Record lock, heap no 161 PHYSICAL RECORD: n_fields 16; compact format; info bits 0
 0: len 4; hex 0005e0bb; asc     ;;
 1: len 6; hex 00000c75178f; asc    u  ;;
 2: len 7; hex 480007c00c1d10; asc H      ;;
 3: len 4; hex 8000115b; asc    [;;
 4: len 8; hex 5245474953544552; asc REGISTER;;
 5: SQL NULL;
 6: SQL NULL;
 7: SQL NULL;
 8: len 4; hex d117dd91; asc     ;;
 9: len 4; hex d117dd91; asc     ;;
 10: len 1; hex 80; asc  ;;
 11: SQL NULL;
 12: SQL NULL;
 13: SQL NULL;
 14: SQL NULL;
 15: len 4; hex 80000000; asc     ;;

*** WE ROLL BACK TRANSACTION (2)

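So this deadlock is easy to understand. Transaction 1 first obtained the insert intention lock for user_id 4442 on the primary key, then failed to obtain the insert intention lock on the secondary index because it was blocked by the next-key lock of transaction 2's UPDATE. Transaction 2, having obtained the next-key lock on the index, tried to update the primary key (that is, take a non-gap record lock on it) and was blocked by a lock of transaction 1. One caveat: the log does not list the locks transaction 1 holds, and an insert intention lock is a gap lock that does not conflict with a rec-but-not-gap request, so the lock blocking transaction 2 on the primary key may well be a record lock left by an earlier statement in transaction 1, which already has 358 row locks and 43 undo log entries.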

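Neither transaction can give up the resources it already holds, and each requests a lock that is incompatible with the other's. That is no preemption plus circular wait, hence a deadlock.

The root cause of this deadlock is that transaction 2's UPDATE ran for too long (ACTIVE 15 sec), which left the subsequent INSERT stuck.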
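
The fields read by eye in this section can also be pulled out mechanically. The sketch below parses an excerpt of the log above with regular expressions. The patterns are assumptions fitted to the output format shown in this article (other MySQL versions may print slightly different lines), and `summarize_deadlock` is an illustrative helper, not part of any MySQL tool.

```python
import re

TXN_RE = re.compile(r"^TRANSACTION (\w+), ACTIVE (\d+) sec")
WAIT_RE = re.compile(r"index `([^`]+)` of table (\S+) trx id (\w+) (.+?) waiting$")
VICTIM_RE = re.compile(r"^\*\*\* WE ROLL BACK TRANSACTION \((\d+)\)")


def summarize_deadlock(log: str) -> dict:
    """Collect transaction ages, the lock each one waits for, and the rollback victim."""
    summary = {"transactions": {}, "waits": [], "victim": None}
    for raw in log.splitlines():
        line = raw.strip()
        m = TXN_RE.match(line)
        if m:
            summary["transactions"][m.group(1)] = int(m.group(2))
            continue
        m = WAIT_RE.search(line)
        if m:
            index, table, trx, mode = m.groups()
            summary["waits"].append(
                {"trx": trx, "table": table.replace("`", ""), "index": index, "mode": mode}
            )
            continue
        m = VICTIM_RE.match(line)
        if m:
            summary["victim"] = int(m.group(1))
    return summary


# Excerpt of the deadlock log from this section.
SAMPLE = """
*** (1) TRANSACTION:
TRANSACTION 3D61D41F, ACTIVE 3 sec inserting
RECORD LOCKS space id 0 page no 272207 n bits 1272 index `user_id` of table `notification` trx id 3D61D41F lock_mode X locks gap before rec insert intention waiting
*** (2) TRANSACTION:
TRANSACTION 3D61C472, ACTIVE 15 sec starting index read
RECORD LOCKS space id 0 page no 261029 n bits 248 index `PRIMARY` of table `notification` trx id 3D61C472 lock_mode X locks rec but not gap waiting
*** WE ROLL BACK TRANSACTION (2)
"""

summary = summarize_deadlock(SAMPLE)
print(summary["transactions"])  # {'3D61D41F': 3, '3D61C472': 15}
print(summary["victim"])        # 2
for wait in summary["waits"]:
    print(wait["trx"], "waits on", wait["index"], "for", wait["mode"])

# Flag transactions that had been open for a long time (the 10 s threshold is arbitrary).
long_running = [trx for trx, secs in summary["transactions"].items() if secs >= 10]
print(long_running)  # ['3D61C472']
```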

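4. How to avoid deadlocks?

The official documentation has a complete write-up: https://dev.mysql.com/doc/refman/5.7/en/innodb-deadlocks-handling.html

It is fairly long, though, so I prefer to sum it up in a few sentences:

(1) Optimize SQL query performance as far as possible so that transactions stay short.

(2) If phantom reads are acceptable, use the READ COMMITTED isolation level, which disables gap locking for searches and index scans.

(3) If neither of the above is possible, or there is little room left for SQL optimization, shard tables and databases where you can. Adding resources (or rather spreading them out) reduces the chance of contention.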
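
As a rough illustration of the third point, the sketch below routes each key to one of several tables with a stable hash, so that unrelated keys stop sharing the same index pages and gaps. The table naming scheme and the helper names are hypothetical, and sharding only spreads different keys apart: two requests for the same row still contend with each other.

```python
import hashlib


def shard_for(app_id: str, shard_count: int) -> int:
    """Map a key to a shard index with a hash that is stable across processes."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha256(app_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def table_for(app_id: str, shard_count: int = 4) -> str:
    # Hypothetical naming scheme: tb_authorize_info_0 .. tb_authorize_info_3
    return f"tb_authorize_info_{shard_for(app_id, shard_count)}"


# The same key always lands in the same table.
print(table_for("49E5BD695F853DC3"))
print(table_for("49E5BD695F853DC3") == table_for("49E5BD695F853DC3"))  # True
```

Python's built-in `hash()` is deliberately avoided here because string hashes are randomized per process, which would send the same key to different shards on different application servers.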

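5. Summary:

Because of InnoDB's particular row-locking mechanism, deadlocks usually involve insert intention locks and next-key locks. Both are range locks, and range locks exist to prevent phantom reads, which means they lock some records the statement does not actually need to touch.

That said, deadlocks have never been a major problem in MySQL. A deadlock is usually a symptom of a slow database rather than the cause, and thanks to the deadlock detection that databases generally provide, a deadlock is resolved soon after it occurs.

What can really cause database performance problems is long transactions under high concurrency. Such transactions cause contention for resources such as undo, and they occupy the binlog commit queue so that later transactions stall in the commit phase. Even a forced KILL then triggers a lengthy rollback.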