Deadlocks and Common Deadlock Patterns

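Ⅰ. The concept of a deadlock

A deadlock is a situation in which two or more transactions, while executing, end up waiting on each other because they are competing for the same resources.

Most common: AB-BA

Slightly more complex: A-B, B-C, C-A, which forms a cycle.

Tip:
A lock wait timeout and a deadlock are not the same thing.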

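1.1 How deadlocks are handled

  • Lock timeout
    --innodb_lock_wait_timeout: let one transaction time out so that the other can run. In practice the database does not rely on this mechanism to break deadlocks.
  • Automatic deadlock detection
    A wait-for graph is built from the lock information list and the transaction wait list.

    t1, t2, t3 and t4 are the transactions, and each edge between them represents a wait relationship. Every time a node (transaction) or an edge is added, the graph is checked for a cycle; if there is one, a deadlock has been found.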
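The cycle check described above can be sketched in a few lines. This is only a toy model, not InnoDB's actual code: the class name `WaitForGraph` and the method `add_wait` are made up for illustration, and the search is written as an iterative depth-first search with an explicit stack.

```python
from collections import defaultdict


class WaitForGraph:
    """Toy wait-for graph: an edge a -> b means transaction a waits for b."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_wait(self, waiter, holder):
        """Add an edge and report whether it closes a cycle (a deadlock)."""
        self.edges[waiter].add(holder)
        return self._reaches(holder, waiter)

    def _reaches(self, start, target):
        # Iterative depth-first search with an explicit stack (no recursion).
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get(node, ()))
        return False


g = WaitForGraph()
print(g.add_wait("t1", "t2"))  # False
print(g.add_wait("t2", "t3"))  # False
print(g.add_wait("t3", "t1"))  # True: t1 -> t2 -> t3 -> t1
```

Checking only at the moment an edge is added keeps the work small: a new cycle, if any, must pass through the edge that was just inserted.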

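Once a deadlock is detected, the database picks one of the transactions to roll back. Which one?

It decides by the amount of undo: the transaction with less to roll back is chosen. I do not remember whether this mechanism arrived in 5.5 or 5.6; earlier versions simply rolled back the later transaction, which was rather crude.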
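As a rough illustration of "roll back whoever has less undo", here is a minimal sketch. The function name `choose_victim` and the undo counts are invented for the example; the real engine's weighting may take more into account than a single counter.

```python
def choose_victim(cycle, undo_entries):
    """Return the transaction in the cycle with the fewest undo log entries.

    cycle:        list of transaction ids that form the deadlock
    undo_entries: dict mapping transaction id -> number of undo entries
    """
    return min(cycle, key=lambda trx: undo_entries[trx])


undo = {"t1": 120, "t2": 3, "t3": 45}
print(choose_victim(["t1", "t2", "t3"], undo))  # t2
```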

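Version 5.6 optimized the core algorithm of the graph-based deadlock detection. It used to be recursive and was rewritten as a non-recursive implementation, which improved performance. This is one of the reasons 5.6 performs much better than 5.5 under high concurrency.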

Ⅱ. Deadlock demonstration

2.1 Simulating the scenario

session1:
begin;
select a for update;

session2:
begin;
select b for update;
select a for update;
(blocks here, waiting)

session1:
select b for update;

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

(root@localhost) [test]> show engine innodb status\G
...
------------------------
LATEST DETECTED DEADLOCK
------------------------
2018-06-15 01:27:47 0x7f2cb6acc700
*** (1) TRANSACTION:
TRANSACTION 31220816, ACTIVE 25 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 1448, OS thread handle 139830020597504, query id 8810 localhost root statistics
select * from l where a = 4 for update
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1358 page no 3 n bits 80 index PRIMARY of table `test`.`l` trx id 31220816 lock_mode X locks rec but not gap waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 6; compact format; info bits 0
 0: len 4; hex 80000004; asc     ;;
 1: len 6; hex 000001c1b93a; asc      :;;
 2: len 7; hex e1000001a90110; asc        ;;
 3: len 4; hex 80000006; asc     ;;
 4: len 4; hex 80000008; asc     ;;
 5: len 4; hex 8000000a; asc     ;;

*** (2) TRANSACTION:
TRANSACTION 31220817, ACTIVE 11 sec starting index read
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 1449, OS thread handle 139830020065024, query id 8811 localhost root statistics
select * from l where a = 10 for update
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1358 page no 3 n bits 80 index PRIMARY of table `test`.`l` trx id 31220817 lock_mode X locks rec but not gap
Record lock, heap no 3 PHYSICAL RECORD: n_fields 6; compact format; info bits 0
 0: len 4; hex 80000004; asc     ;;
 1: len 6; hex 000001c1b93a; asc      :;;
 2: len 7; hex e1000001a90110; asc        ;;
 3: len 4; hex 80000006; asc     ;;
 4: len 4; hex 80000008; asc     ;;
 5: len 4; hex 8000000a; asc     ;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1358 page no 3 n bits 80 index PRIMARY of table `test`.`l` trx id 31220817 lock_mode X locks rec but not gap waiting
Record lock, heap no 6 PHYSICAL RECORD: n_fields 6; compact format; info bits 0
 0: len 4; hex 8000000a; asc     ;;
 1: len 6; hex 000001dc637f; asc     c ;;
 2: len 7; hex b30000019d0110; asc        ;;
 3: len 4; hex 8000000c; asc     ;;
 4: len 4; hex 8000000e; asc     ;;
 5: len 4; hex 80000010; asc     ;;

*** WE ROLL BACK TRANSACTION (2)
...

The output above only records the most recent deadlock. To write every deadlock to the error log, enable the following parameter:

innodb_print_all_deadlocks     (recommended)

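2.2 A new parameter in 5.7

set global innodb_deadlock_detect=0;    -- disable deadlock detection

With detection disabled, how does the database get out of a deadlock? It waits for the lock wait timeout, which defaults to 50s.

When the lock wait times out on both sides, neither transaction can make progress and neither is rolled back. They are left in an indeterminate state and need manual intervention: either commit or rollback. Otherwise, running the statement again will just block again.

With a deadlock, by contrast, one of the transactions is rolled back.

Generally, when a statement fails inside a transaction the transaction is not rolled back; the user decides whether to roll back or commit. Only a deadlock triggers an automatic rollback. Without a rollback, the locks held by the transaction are not released.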
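The difference between the two errors matters to application code. Below is a minimal sketch of how a caller might treat them, assuming a driver that exposes the MySQL error number. `DbError`, `run_transaction`, `work` and `rollback` are hypothetical names, not part of any real driver; 1213 is the deadlock error shown earlier, and 1205 is MySQL's lock wait timeout error.

```python
ER_LOCK_WAIT_TIMEOUT = 1205  # lock wait timeout: the transaction stays open
ER_LOCK_DEADLOCK = 1213      # deadlock: the server already rolled it back


class DbError(Exception):
    """Hypothetical stand-in for a driver exception carrying a MySQL errno."""

    def __init__(self, errno):
        super().__init__(f"MySQL error {errno}")
        self.errno = errno


def run_transaction(work, rollback, max_attempts=3):
    """Run work(); retry after a deadlock, roll back explicitly on timeout."""
    for attempt in range(1, max_attempts + 1):
        try:
            return work()
        except DbError as exc:
            if exc.errno == ER_LOCK_DEADLOCK and attempt < max_attempts:
                continue  # already rolled back by the server; just retry
            if exc.errno == ER_LOCK_WAIT_TIMEOUT:
                rollback()  # locks stay held until we end the transaction
            raise


attempts = []


def flaky():
    # Fails with a deadlock on the first call, succeeds on the second.
    attempts.append(1)
    if len(attempts) == 1:
        raise DbError(ER_LOCK_DEADLOCK)
    return "committed"


print(run_transaction(flaky, rollback=lambda: None))  # committed
```

The point of the sketch is the asymmetry: after a deadlock the whole transaction has to be replayed, while after a timeout it is the caller's job to end the transaction so its locks are released.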

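When should deadlock detection be set to 0?

This feature was first requested from upstream by Taobao, and upstream accepted it.

Setting it to 0 makes sense in flash-sale scenarios: everyone is waiting anyway, so deadlock detection is not needed. Disabling it gives a small performance gain, but the benefit is limited.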

2.3 Two new locking syntaxes in 8.0

session1:

(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> select * from l where a = 2 for update;
+---+------+------+------+
| a | b    | c    | d    |
+---+------+------+------+
| 2 |    4 |    6 |    8 |
+---+------+------+------+
1 row in set (0.00 sec)

session2:

(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> select * from l where a = 2 for update nowait;
ERROR 3572 (HY000): Statement aborted because lock(s) could not be acquired immediately and NOWAIT is set.

If session2 uses the other syntax instead, no error is raised:

(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> select * from l where a = 2 for update skip locked;
Empty set (0.00 sec)

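If the row is found to be locked, it is skipped: an empty set is returned and no error is raised.

Both syntaxes are quite useful, and SQL Server and PostgreSQL have them too. In application code they can replace the Taobao approach mentioned above: in a flash sale, if an empty set comes back, the front end simply waits.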
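To make the difference between the two modes concrete, here is a small in-memory model. It does not talk to MySQL; `select_for_update`, `LockNotAvailable` and the `locked` set are invented for illustration and only mimic the behavior seen in the sessions above.

```python
class LockNotAvailable(Exception):
    """Raised by the toy NOWAIT read when a wanted row is locked."""


def select_for_update(rows, wanted, locked, mode):
    """Toy model of the two 8.0 locking reads.

    rows:   dict of primary key -> row
    wanted: keys the query matches
    locked: keys already locked by another session
    mode:   "nowait" or "skip locked"
    """
    matched = [k for k in wanted if k in rows]
    if mode == "nowait":
        # Fail immediately instead of waiting for the other session.
        if any(k in locked for k in matched):
            raise LockNotAvailable("lock(s) could not be acquired immediately")
        return [rows[k] for k in matched]
    if mode == "skip locked":
        # Silently leave out rows that another session has locked.
        return [rows[k] for k in matched if k not in locked]
    raise ValueError(f"unknown mode: {mode}")


rows = {2: (2, 4, 6, 8), 4: (4, 6, 8, 10)}
locked_by_session1 = {2}
print(select_for_update(rows, [2], locked_by_session1, "skip locked"))     # []
print(select_for_update(rows, [2, 4], locked_by_session1, "skip locked"))  # [(4, 6, 8, 10)]
```

With "nowait" the same call raises instead of returning, which matches ERROR 3572 in the transcript.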

Ⅲ. Classic deadlock cases

3.1 Shopping-cart deadlock: deducting stock

sku_id (the actual id of each item, at the finest level of classification)

session1:
begin;
update stock set count=count-1 where sku_id=1;
update stock set count=count-1 where sku_id=2;
update stock set count=count-1 where sku_id=30;

session2:
begin;
update stock set count=count-1 where sku_id=30;
update stock set count=count-1 where sku_id=1;
update stock set count=count-1 where sku_id=2;

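Under concurrency this deadlocks, and the problem cannot be solved at the database layer.

A deadlock in itself is not a problem; it is normal database behavior. A DBA only needs to step in when deadlocks start to affect the business.

Solution:
Have the front end or the API layer sort the item ids in the order before sending them to the database. After sorting there are no deadlocks, but there will still be lock waits, so set the lock wait timeout to 3s rather than the default 50s. If concurrency is very high, though, performance will still be fairly poor.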
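A minimal sketch of the sorting idea at the API layer. The function name `stock_updates` is hypothetical, and the `%s` placeholder assumes a driver that uses that parameter style; adapt it to whatever driver is actually in use.

```python
def stock_updates(sku_ids):
    """Return (sql, params) pairs that deduct stock in ascending sku_id order."""
    sql = "update stock set count=count-1 where sku_id=%s"
    # Every order touches rows in the same global order, so two orders
    # can wait on each other but can never form an AB-BA cycle.
    return [(sql, (sku_id,)) for sku_id in sorted(sku_ids)]


order1 = stock_updates([1, 2, 30])
order2 = stock_updates([30, 1, 2])
print([params[0] for _, params in order1])  # [1, 2, 30]
print(order1 == order2)                     # True
```

Both orders from the example above now lock sku 1 first, so the second one simply waits for the first instead of deadlocking with it.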

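Many workloads turn into hot spots suddenly and unexpectedly. At that point a message queue alone is not enough, and a thread pool is needed to throttle the load. For very large workloads there is no way around a thread pool. Strong recommendation: use a message queue at the application layer and enable a thread pool at the database layer.

This step is the most critical one in e-commerce and also the one with the highest concurrency. It cannot be served from a cache: it is all insert and update operations, plus a small number of select for update statements that lock stock.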

3.2 Unique-index deadlock

First, let us get clear on how inserts into a unique index work.

(root@localhost) [test]> show create table l\G
*************************** 1. row ***************************
       Table: l
Create Table: CREATE TABLE `l` (
  `a` int(11) NOT NULL,
  `b` int(11) DEFAULT NULL,
  `c` int(11) DEFAULT NULL,
  `d` int(11) DEFAULT NULL,
  PRIMARY KEY (`a`),
  UNIQUE KEY `c` (`c`),
  KEY `b` (`b`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

(root@localhost) [test]> select * from l;
+---+------+------+------+
| a | b    | c    | d    |
+---+------+------+------+
| 2 |    4 |    6 |    8 |
| 4 |    6 |    8 |   10 |
| 6 |    8 |   10 |   12 |
| 8 |   10 |   12 |   14 |
+---+------+------+------+
4 rows in set (0.00 sec)

(root@localhost) [test]> show variables like 'tx_isolation';  
+---------------+----------------+
| Variable_name | Value          |
+---------------+----------------+
| tx_isolation  | READ-COMMITTED |
+---------------+----------------+
1 row in set (0.00 sec)

session1:

(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> delete from l where c = 10;
Query OK, 1 row affected (0.00 sec)

session2:

(root@localhost) [test]> insert into l values (10,12,10,16);
hang~~~

session3:

(root@localhost) [(none)]> show engine innodb status\G
...
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421305875783280, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 31220861, ACTIVE 11 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s), undo log entries 1
MySQL thread id 1561, OS thread handle 139830452774656, query id 8980 localhost root update
insert into l values(10,12,10,16)
------- TRX HAS BEEN WAITING 11 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1358 page no 4 n bits 80 index c of table `test`.`l` trx id 31220861 lock mode S waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 4; hex 8000000a; asc     ;;
 1: len 4; hex 80000006; asc     ;;

------------------
TABLE LOCK table `test`.`l` trx id 31220861 lock mode IX
RECORD LOCKS space id 1358 page no 4 n bits 80 index c of table `test`.`l` trx id 31220861 lock mode S waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 4; hex 8000000a; asc     ;;
 1: len 4; hex 80000006; asc     ;;

---TRANSACTION 31220860, ACTIVE 18 sec
3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 1560, OS thread handle 139830453040896, query id 8978 localhost root
TABLE LOCK table `test`.`l` trx id 31220860 lock mode IX
RECORD LOCKS space id 1358 page no 4 n bits 80 index c of table `test`.`l` trx id 31220860 lock_mode X locks rec but not gap
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 4; hex 8000000a; asc     ;;
 1: len 4; hex 80000006; asc     ;;

RECORD LOCKS space id 1358 page no 3 n bits 80 index PRIMARY of table `test`.`l` trx id 31220860 lock_mode X locks rec but not gap
Record lock, heap no 4 PHYSICAL RECORD: n_fields 6; compact format; info bits 32
 0: len 4; hex 80000006; asc     ;;
 1: len 6; hex 000001dc647c; asc     d|;;
 2: len 7; hex 3800000ff21de8; asc 8      ;;
 3: len 4; hex 80000008; asc     ;;
 4: len 4; hex 8000000a; asc     ;;
 5: len 4; hex 8000000c; asc     ;;
...

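The insert is waiting for the lock on the record with value 10, and the lock type is S, which looks odd.

With an S lock requested on 10, inserting 9 at this point would be blocked as well. So even at the RC isolation level, as long as there is a unique index, gap locks still exist.

lock in share mode is rarely used at the application layer (for example to lock stock), but the database layer uses S locks to guarantee uniqueness.

If the table has no unique index, or the only one it has is an auto-increment primary key, inserts cannot conflict. In that case all inserts at the RC isolation level run in parallel and are never blocked.

Reason: under RC there are no gap locks, so an insert does not lock a range and inserts can run in parallel without blocking. The primary key is both unique and auto-increment, so there are no conflicts and no problem.
But if the table has other unique indexes besides the primary key, inserts can end up waiting.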

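So how does an insert actually proceed?
Example:
existing records 1 3 5 7, insert 3

① Find the first record greater than 3 (next_rec).

② Check whether that record has a gap lock or a next-key lock on it. If it does, the insert cannot proceed; if not, it can. An insert-intention lock or a plain record lock on it does not block the insert.

If there is a unique index, there is a further step:

③ If previous_rec = current_rec, there is a conflict. This is how uniqueness is checked.

If previous_rec has no lock on it, the conflict is reported immediately: duplicate key.

If previous_rec is locked, inserting 3 requires an S lock on that record. Duplicate key cannot be reported right away: in this example the other transaction may be running delete 3, and if it succeeds the insert is allowed. So an S lock is requested on the record before 5, and once the lock on 3 is released the waiting S lock is woken up.

That explains where the S lock comes from.

Summary:
Inserting into a unique index requires an extra step (the unique-constraint check). Even under RC there are still gap locks, so do not be careless.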
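The three steps can be walked through with a toy function. This is a simplified model of the decision only, not the engine's code; `insert_decision`, `gap_locked` and `rec_locked` are invented names, and locks are represented as plain sets of keys.

```python
import bisect


def insert_decision(records, value, gap_locked=(), rec_locked=(), unique=True):
    """Toy walk-through of the insert steps for a sorted index.

    records:    sorted list of existing keys
    gap_locked: keys carrying a gap or next-key lock from another transaction
    rec_locked: keys carrying a record lock from another transaction
    """
    pos = bisect.bisect_right(records, value)
    # Step 1: the first record greater than the value (next_rec).
    next_rec = records[pos] if pos < len(records) else None
    # Step 2: a gap or next-key lock on next_rec blocks the insert.
    if next_rec in gap_locked:
        return "wait: gap locked"
    if unique:
        # Step 3: uniqueness check against the record just before the position.
        previous_rec = records[pos - 1] if pos > 0 else None
        if previous_rec == value:
            if previous_rec in rec_locked:
                return "wait: S lock on existing record"
            return "error: duplicate key"
    return "insert"


existing = [1, 3, 5, 7]
print(insert_decision(existing, 3))                  # error: duplicate key
print(insert_decision(existing, 3, rec_locked={3}))  # wait: S lock on existing record
print(insert_decision(existing, 4))                  # insert
print(insert_decision(existing, 4, gap_locked={5}))  # wait: gap locked
```

The second call is the interesting one: the duplicate cannot be reported yet because the lock holder might be deleting 3, so the insert queues an S lock and waits.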

Next, let us reproduce this deadlock with a minimal example.

(root@localhost) [test]> show create table dl\G
*************************** 1. row ***************************
       Table: dl
Create Table: CREATE TABLE `dl` (
  `a` int(11) NOT NULL,
  UNIQUE KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

(root@localhost) [test]> select * from dl;
+---+
| a |
+---+
| 1 |
+---+
1 row in set (0.00 sec)

(root@localhost) [test]> show variables like 'tx_isolation';  
+---------------+----------------+
| Variable_name | Value          |
+---------------+----------------+
| tx_isolation  | READ-COMMITTED |
+---------------+----------------+
1 row in set (0.00 sec)

session1:

(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> delete from dl where a = 1;
Query OK, 1 row affected (0.00 sec)

session2:

(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> insert into dl values(1);
hang~~~

session3:

(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> insert into dl values(1);
hang~~~

session1:

(root@localhost) [test]> commit;
Query OK, 0 rows affected (0.01 sec)

session2:

Query OK, 1 row affected (29.35 sec)

session3:

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

Take a look at the deadlock information:

(root@localhost) [test]> show engine innodb status\G
...
------------------------
LATEST DETECTED DEADLOCK
------------------------
2018-06-15 02:39:12 0x7f2cd008e700
*** (1) TRANSACTION:
TRANSACTION 31220875, ACTIVE 30 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 1561, OS thread handle 139830452774656, query id 9000 localhost root update
insert into dl values(1)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1630 page no 3 n bits 72 index a of table `test`.`dl` trx id 31220875 lock_mode X locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 4; hex 80000001; asc     ;;
 1: len 6; hex 000001dc6486; asc     d ;;
 2: len 7; hex 4100000fce1353; asc A     S;;

*** (2) TRANSACTION:
TRANSACTION 31220876, ACTIVE 4 sec inserting
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 1562, OS thread handle 139830445532928, query id 9010 localhost root update
insert into dl values (1)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1630 page no 3 n bits 72 index a of table `test`.`dl` trx id 31220876 lock mode S locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 4; hex 80000001; asc     ;;
 1: len 6; hex 000001dc6486; asc     d ;;
 2: len 7; hex 4100000fce1353; asc A     S;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1630 page no 3 n bits 72 index a of table `test`.`dl` trx id 31220876 lock_mode X locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 4; hex 80000001; asc     ;;
 1: len 6; hex 000001dc6486; asc     d ;;
 2: len 7; hex 4100000fce1353; asc A     S;;

*** WE ROLL BACK TRANSACTION (2)
...

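The delete takes an X lock, the first insert requests an S lock, and the second insert requests another S lock. After the commit, the first insert succeeds and the second one hits the deadlock.

Walking through the flow: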

thd1                         thd2                                        thd3       
 del--->X
                           ins--->S--->wait
                                                                        ins--->S--->wait
 commit--->release

                        ins--->needs X--->waits for thd3's S      <===>       ins--->needs X--->waits for thd2's S
                                                        deadlock
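The last row of the diagram follows directly from lock compatibility: S is compatible with S, but X is compatible with nothing. A small sketch, with the hypothetical helper `blockers` and a hand-written compatibility table, shows why the two inserts end up waiting on each other.

```python
# (held mode, requested mode) -> can both be granted at the same time?
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}


def blockers(requester, mode, held):
    """Return the transactions whose held locks conflict with the request.

    held maps transaction -> lock mode it holds on the record.
    """
    return sorted(trx for trx, held_mode in held.items()
                  if trx != requester and not COMPATIBLE[(held_mode, mode)])


# After thd1 commits, thd2 and thd3 both hold S on the same record.
held = {"thd2": "S", "thd3": "S"}
print(blockers("thd2", "X", held))  # ['thd3']
print(blockers("thd3", "X", held))  # ['thd2']
```

Each transaction is blocked by the other's S lock while asking for X, which is exactly the two-node cycle the deadlock detector reports.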

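Most deadlocks you run into are caused by unique indexes. In the deadlock section of show engine innodb status\G, check whether the locked index is a unique index.
Remember that S locks are there for uniqueness checks; they are rarely used for anything else.

Tip: the unique key must be NOT NULL, otherwise this deadlock cannot be reproduced. I do not know why and have not looked into it.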
