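A deadlock is a situation in which two or more transactions, while executing, end up waiting on each other because they compete for the same resources.
Most common: AB-BA
Slightly more complex: A-B, B-C, C-A, which closes into a cycle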
tips:
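A lock wait timeout and a deadlock are not the same thing.
Once a deadlock is detected, InnoDB picks one of the transactions and rolls it back. Which one?
It decides by the amount of undo: the transaction with less to roll back is chosen. I don't remember whether this mechanism arrived in 5.5 or 5.6; earlier versions simply rolled back the later transaction, which was rather crude.
Version 5.6 optimized the kernel algorithm for graph-based deadlock detection. It used to be recursive; it has been rewritten to be non-recursive, which improved performance. This is one of the reasons 5.6 performs much better than 5.5 under high concurrency.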
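To make the idea concrete, here is a minimal Python sketch of non-recursive cycle detection on a wait-for graph, followed by picking the victim with the least undo. It is not InnoDB's actual code: the names `find_cycle` and `choose_victim`, and the dictionaries describing who waits for whom and how many undo entries each transaction has, are assumptions made for illustration.

```python
def find_cycle(wait_for):
    """Return one cycle in the wait-for graph as a list, or None.

    wait_for maps a transaction to the transactions it is waiting on.
    The depth-first search uses an explicit stack instead of recursion.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    for start in wait_for:
        if color.get(start, WHITE) != WHITE:
            continue
        color[start] = GREY
        path = [start]
        stack = [iter(wait_for.get(start, ()))]
        while stack:
            for nxt in stack[-1]:
                state = color.get(nxt, WHITE)
                if state == GREY:
                    # nxt is already on the current path: cycle found
                    return path[path.index(nxt):]
                if state == WHITE:
                    color[nxt] = GREY
                    path.append(nxt)
                    stack.append(iter(wait_for.get(nxt, ())))
                    break
            else:
                # all outgoing edges explored: backtrack
                color[path.pop()] = BLACK
                stack.pop()
    return None


def choose_victim(cycle, undo_entries):
    """Pick the transaction in the cycle with the fewest undo entries."""
    return min(cycle, key=lambda trx: undo_entries[trx])


if __name__ == "__main__":
    graph = {"A": ["B"], "B": ["C"], "C": ["A"]}  # A-B, B-C, C-A
    cycle = find_cycle(graph)
    print(cycle, choose_victim(cycle, {"A": 10, "B": 3, "C": 7}))
```

The explicit stack keeps memory use predictable when many transactions wait on each other, which is the practical motivation for a non-recursive rewrite.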
session1:
begin;
select a for update;
session2:
begin;
select b for update;
select a for update;    (waits here)
session1:
select b for update;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

(root@localhost) [test]> show engine innodb status\G
...
------------------------
LATEST DETECTED DEADLOCK
------------------------
2018-06-15 01:27:47 0x7f2cb6acc700
*** (1) TRANSACTION:
TRANSACTION 31220816, ACTIVE 25 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 1448, OS thread handle 139830020597504, query id 8810 localhost root statistics
select * from l where a = 4 for update
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1358 page no 3 n bits 80 index PRIMARY of table `test`.`l` trx id 31220816 lock_mode X locks rec but not gap waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 6; compact format; info bits 0
 0: len 4; hex 80000004; asc ;;
 1: len 6; hex 000001c1b93a; asc :;;
 2: len 7; hex e1000001a90110; asc ;;
 3: len 4; hex 80000006; asc ;;
 4: len 4; hex 80000008; asc ;;
 5: len 4; hex 8000000a; asc ;;
*** (2) TRANSACTION:
TRANSACTION 31220817, ACTIVE 11 sec starting index read
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 1449, OS thread handle 139830020065024, query id 8811 localhost root statistics
select * from l where a = 10 for update
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1358 page no 3 n bits 80 index PRIMARY of table `test`.`l` trx id 31220817 lock_mode X locks rec but not gap
Record lock, heap no 3 PHYSICAL RECORD: n_fields 6; compact format; info bits 0
 0: len 4; hex 80000004; asc ;;
 1: len 6; hex 000001c1b93a; asc :;;
 2: len 7; hex e1000001a90110; asc ;;
 3: len 4; hex 80000006; asc ;;
 4: len 4; hex 80000008; asc ;;
 5: len 4; hex 8000000a; asc ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1358 page no 3 n bits 80 index PRIMARY of table `test`.`l` trx id 31220817 lock_mode X locks rec but not gap waiting
Record lock, heap no 6 PHYSICAL RECORD: n_fields 6; compact format; info bits 0
 0: len 4; hex 8000000a; asc ;;
 1: len 6; hex 000001dc637f; asc c ;;
 2: len 7; hex b30000019d0110; asc ;;
 3: len 4; hex 8000000c; asc ;;
 4: len 4; hex 8000000e; asc ;;
 5: len 4; hex 80000010; asc ;;
*** WE ROLL BACK TRANSACTION (2)
...
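The output above only records the most recent deadlock. To record every deadlock in the error log, turn on the following parameter:
innodb_print_all_deadlocks (recommended to enable)
set global innodb_deadlock_detect=0 turns deadlock detection off
With detection off, how does the database resolve a deadlock? It simply waits for the lock wait timeout (innodb_lock_wait_timeout), 50s by default.
When both sides time out, neither transaction can make progress and neither is rolled back. They are left in an undetermined state and need manual action, either commit or rollback; otherwise any further statement just runs into the lock again.
With a deadlock, by contrast, one of the transactions is rolled back.
Generally, when one operation inside a transaction fails, the transaction is not rolled back; the user decides whether to roll back or commit. Only a deadlock rolls the transaction back automatically. Without a rollback, the locks held by the transaction are not released.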
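The difference matters for application code, so here is a small Python sketch of how a caller might treat the two errors. Everything in it is hypothetical: `DeadlockError` and `LockWaitTimeoutError` stand in for MySQL errors 1213 and 1205, and `txn` and `rollback` are callables supplied by the application, not part of any real driver API.

```python
class DeadlockError(Exception):
    """Stands in for error 1213: the transaction was already rolled back."""


class LockWaitTimeoutError(Exception):
    """Stands in for error 1205: the transaction is still open."""


def run_with_retry(txn, rollback, max_attempts=3):
    """Run txn(), retrying on deadlock and rolling back on lock wait timeout.

    txn      -- hypothetical callable that runs the whole transaction
    rollback -- hypothetical callable that issues ROLLBACK on the connection
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DeadlockError:
            # The database already rolled us back, so a fresh retry is safe.
            if attempt == max_attempts:
                raise
        except LockWaitTimeoutError:
            # Nothing was rolled back for us: release our locks explicitly.
            rollback()
            raise
```

A deadlock victim can be retried as is, while a timed-out transaction keeps its locks until the application commits or rolls back.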
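When should deadlock detection be set to 0?
This requirement was first raised with the MySQL developers by Taobao, and it was accepted.
Setting it to 0 makes sense in flash-sale scenarios: everyone is waiting anyway, so deadlock detection is not needed. It gives a small performance gain, but nothing significant.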
session1:
(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> select * from l where a = 2 for update;
+---+------+------+------+
| a | b    | c    | d    |
+---+------+------+------+
| 2 |    4 |    6 |    8 |
+---+------+------+------+
1 row in set (0.00 sec)
session2:
(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> select * from l where a = 2 for update nowait;
ERROR 3572 (HY000): Statement aborted because lock(s) could not be acquired immediately and NOWAIT is set.
If session2 uses the other syntax instead, no error is raised:
(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> select * from l where a = 2 for update skip locked;
Empty set (0.00 sec)
If the record is locked, it is skipped: an empty result is returned and no error is raised.
These two syntaxes are quite useful; SQL Server and PostgreSQL have equivalents too. In application code they can replace the Taobao setting mentioned above: during a flash sale, if the query returns empty, the front end simply waits.
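The two behaviours can be modelled in a few lines of Python. This is only a toy model of the semantics, not how MySQL implements them: `select_for_update`, `LockNotAvailable`, and the in-memory `rows` and `locked_keys` structures are all made up for the illustration.

```python
class LockNotAvailable(Exception):
    """Stands in for ERROR 3572, raised when NOWAIT meets a locked row."""


def select_for_update(rows, locked_keys, wanted_keys, mode):
    """Toy model of FOR UPDATE NOWAIT / SKIP LOCKED.

    rows        -- dict mapping primary key to row
    locked_keys -- keys currently locked by other sessions
    wanted_keys -- keys matched by the WHERE clause
    mode        -- "nowait" or "skip locked"
    """
    if mode not in ("nowait", "skip locked"):
        raise ValueError("unsupported mode: %r" % (mode,))
    result = []
    for key in wanted_keys:
        if key not in rows:
            continue
        if key in locked_keys:
            if mode == "nowait":
                raise LockNotAvailable(key)  # fail at once instead of waiting
            continue                         # skip locked: drop the row silently
        result.append(rows[key])
    return result


if __name__ == "__main__":
    table = {2: (2, 4, 6, 8), 4: (4, 6, 8, 10)}
    print(select_for_update(table, {2}, [2, 4], "skip locked"))
```

With `skip locked` the caller has to treat an empty result as "try again later", which is the front-end waiting described above.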
sku_id (the real id of each item, at the finest-grained category level)
session1:
begin;
update stock set count=count-1 where sku_id=1;
update stock set count=count-1 where sku_id=2;
update stock set count=count-1 where sku_id=30;
session2:
begin;
update stock set count=count-1 where sku_id=30;
update stock set count=count-1 where sku_id=1;
update stock set count=count-1 where sku_id=2;
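Under concurrency this deadlocks, and the database layer cannot solve the problem.
A deadlock is not a problem in itself; it is normal database behavior. A DBA only needs to step in when deadlocks start affecting the business.
Solution:
Have the front end or the interface layer sort the item ids in the order before sending them to the database layer. After sorting there is no deadlock, but there is still waiting. Set the lock wait timeout to 3s instead of the default 50s. If concurrency is very high, though, performance will still be poor.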
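A minimal Python sketch of the sorting step, assuming an order arrives as a list of (sku_id, quantity) pairs. The function names `lock_order` and `build_updates` are hypothetical, and the SQL string is only the `stock` update from the example rewritten with placeholders.

```python
def lock_order(order_items):
    """Return (sku_id, quantity) pairs sorted by sku_id, merging duplicates.

    If every session updates rows in ascending sku_id order, no two
    sessions can hold locks in opposite orders, so AB-BA cannot occur.
    """
    merged = {}
    for sku_id, qty in order_items:
        merged[sku_id] = merged.get(sku_id, 0) + qty
    return sorted(merged.items())


def build_updates(order_items):
    """Build (sql, params) pairs to execute in deadlock-free order."""
    sql = "update stock set count=count-%s where sku_id=%s"
    return [(sql, (qty, sku_id)) for sku_id, qty in lock_order(order_items)]


if __name__ == "__main__":
    session1 = [(1, 1), (2, 1), (30, 1)]
    session2 = [(30, 1), (1, 1), (2, 1)]
    # Both sessions now touch sku_id 1, 2, 30 in the same order.
    print([p[1] for _, p in build_updates(session1)])
    print([p[1] for _, p in build_updates(session2)])
```

Merging duplicate sku_ids also means each row is locked once per order, which shortens the time the hot rows stay locked.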
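Many workloads turn into hot spots suddenly and unpredictably. At that point a message queue is not enough, and a thread pool is needed to throttle the load. For very large businesses the thread pool is unavoidable. Strong recommendation: use a message queue at the application layer and enable a thread pool at the database layer.
This step is the most critical one in e-commerce and has the highest concurrency. It cannot go through a write cache: it is all insert and update operations, plus a few select for update statements to lock stock.
First, understand how inserts into a unique index work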
(root@localhost) [test]> show create table l\G
*************************** 1. row ***************************
       Table: l
Create Table: CREATE TABLE `l` (
  `a` int(11) NOT NULL,
  `b` int(11) DEFAULT NULL,
  `c` int(11) DEFAULT NULL,
  `d` int(11) DEFAULT NULL,
  PRIMARY KEY (`a`),
  UNIQUE KEY `c` (`c`),
  KEY `b` (`b`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

(root@localhost) [test]> select * from l;
+---+------+------+------+
| a | b    | c    | d    |
+---+------+------+------+
| 2 |    4 |    6 |    8 |
| 4 |    6 |    8 |   10 |
| 6 |    8 |   10 |   12 |
| 8 |   10 |   12 |   14 |
+---+------+------+------+
4 rows in set (0.00 sec)

(root@localhost) [test]> show variables like 'tx_isolation';
+---------------+----------------+
| Variable_name | Value          |
+---------------+----------------+
| tx_isolation  | READ-COMMITTED |
+---------------+----------------+
1 row in set (0.00 sec)
session1:
(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> delete from l where c = 10;
Query OK, 1 row affected (0.00 sec)
session2:
(root@localhost) [test]> insert into l values (10,12,10,16);
(hangs)
session3:
(root@localhost) [(none)]> show engine innodb status\G
...
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421305875783280, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 31220861, ACTIVE 11 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s), undo log entries 1
MySQL thread id 1561, OS thread handle 139830452774656, query id 8980 localhost root update
insert into l values(10,12,10,16)
------- TRX HAS BEEN WAITING 11 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1358 page no 4 n bits 80 index c of table `test`.`l` trx id 31220861 lock mode S waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 4; hex 8000000a; asc ;;
 1: len 4; hex 80000006; asc ;;
------------------
TABLE LOCK table `test`.`l` trx id 31220861 lock mode IX
RECORD LOCKS space id 1358 page no 4 n bits 80 index c of table `test`.`l` trx id 31220861 lock mode S waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 4; hex 8000000a; asc ;;
 1: len 4; hex 80000006; asc ;;
---TRANSACTION 31220860, ACTIVE 18 sec
3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 1560, OS thread handle 139830453040896, query id 8978 localhost root
TABLE LOCK table `test`.`l` trx id 31220860 lock mode IX
RECORD LOCKS space id 1358 page no 4 n bits 80 index c of table `test`.`l` trx id 31220860 lock_mode X locks rec but not gap
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 4; hex 8000000a; asc ;;
 1: len 4; hex 80000006; asc ;;
RECORD LOCKS space id 1358 page no 3 n bits 80 index PRIMARY of table `test`.`l` trx id 31220860 lock_mode X locks rec but not gap
Record lock, heap no 4 PHYSICAL RECORD: n_fields 6; compact format; info bits 32
 0: len 4; hex 80000006; asc ;;
 1: len 6; hex 000001dc647c; asc d|;;
 2: len 7; hex 3800000ff21de8; asc 8 ;;
 3: len 4; hex 80000008; asc ;;
 4: len 4; hex 8000000a; asc ;;
 5: len 4; hex 8000000c; asc ;;
...
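It is waiting for the lock on record 10, and the lock type is S, which looks odd.
With an S lock on 10, inserting 9 now would be blocked as well. So even at the RC isolation level, as long as there is a unique index, gap locks still exist.
lock in share mode is rarely used at the application layer (for locking stock), but the database layer relies on S locks to guarantee uniqueness.
If the table has no unique index, or its only unique index is an auto-increment primary key, inserts cannot conflict. In that case all inserts at the RC isolation level run in parallel and are never blocked.
Reason: under RC there are no gap locks, so an insert locks no range and inserts proceed in parallel without blocking. The primary key is unique and auto-incremented, so there are no conflicts and therefore no problem.
But if the table has other unique indexes besides the primary key, inserts can end up waiting.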
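How does an insert actually proceed?
Example:
Existing records: 1 3 5 7; insert 3
① Find the first record greater than 3 (next_rec).
② Check whether that record carries a gap lock or a next-key lock. If it does, the insert has to wait; if not, it can go ahead. An insert-intention lock or a plain record lock on it does not block the insert.
If there is a unique index, there are further steps:
③ If previous_rec equals current_rec (the value being inserted), there is a conflict. This is how uniqueness is checked.
If previous_rec carries no lock at this point, the conflict is reported immediately: duplicate key.
If previous_rec is locked, inserting 3 requires an S lock on that record. It cannot report duplicate key right away: in this example another transaction may be running delete 3, and if that delete succeeds, the insert can go ahead. So an S lock request is placed on the record just before 5, that is, on 3. When the lock on 3 is released, the waiting S lock is woken up.
That explains where the S lock comes from.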
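The steps above can be written down as a small decision function. This Python sketch follows the order of the notes, not InnoDB's source: the function `check_insert`, its parameters, and the outcome strings are all invented for the illustration, and the index is modelled as a sorted list of keys.

```python
import bisect

INSERT = "insert"
WAIT_GAP = "wait: gap or next-key lock on next_rec"
WAIT_S = "wait: S lock on previous_rec"
DUPLICATE = "duplicate key"


def check_insert(keys, value, range_locked=(), record_locked=(), unique=True):
    """Decide what happens when `value` is inserted into a sorted index.

    keys          -- sorted existing keys (delete-marked ones included)
    range_locked  -- keys carrying a gap or next-key lock of another trx
    record_locked -- keys carrying a record lock of another trx
    unique        -- whether the index is a unique index
    """
    pos = bisect.bisect_right(keys, value)
    next_rec = keys[pos] if pos < len(keys) else None   # step 1
    prev_rec = keys[pos - 1] if pos > 0 else None

    if next_rec in range_locked:                        # step 2
        return WAIT_GAP
    if unique and prev_rec == value:                    # step 3
        if prev_rec in record_locked:
            return WAIT_S      # e.g. a pending delete of the same key
        return DUPLICATE
    return INSERT


if __name__ == "__main__":
    idx = [1, 3, 5, 7]
    print(check_insert(idx, 3))                     # duplicate key
    print(check_insert(idx, 3, record_locked={3}))  # wait: S lock ...
```

A record lock on next_rec alone returns `INSERT`, matching the remark in step ② that only gap and next-key locks block the insert.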
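Summary:
Inserting into a unique index needs one extra check (the unique-constraint check). Even under RC there are still gap locks, so do not be careless.
Next, let's reproduce this deadlock with a minimal example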
(root@localhost) [test]> show create table dl\G
*************************** 1. row ***************************
       Table: dl
Create Table: CREATE TABLE `dl` (
  `a` int(11) NOT NULL,
  UNIQUE KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

(root@localhost) [test]> select * from dl;
+---+
| a |
+---+
| 1 |
+---+
1 row in set (0.00 sec)

(root@localhost) [test]> show variables like 'tx_isolation';
+---------------+----------------+
| Variable_name | Value          |
+---------------+----------------+
| tx_isolation  | READ-COMMITTED |
+---------------+----------------+
1 row in set (0.00 sec)
session1:
(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> delete from l where a = 1;
Query OK, 0 rows affected (0.00 sec)
session2:
(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> insert into dl values(1);
(hangs)
session3:
(root@localhost) [test]> begin;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [test]> insert into dl values(1);
(hangs)
session1:
(root@localhost) [test]> commit;
Query OK, 0 rows affected (0.01 sec)
session2:
Query OK, 1 row affected (29.35 sec)
session3:
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Look at the deadlock information
(root@localhost) [test]> show engine innodb status\G
...
------------------------
LATEST DETECTED DEADLOCK
------------------------
2018-06-15 02:39:12 0x7f2cd008e700
*** (1) TRANSACTION:
TRANSACTION 31220875, ACTIVE 30 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 1561, OS thread handle 139830452774656, query id 9000 localhost root update
insert into dl values(1)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1630 page no 3 n bits 72 index a of table `test`.`dl` trx id 31220875 lock_mode X locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 4; hex 80000001; asc ;;
 1: len 6; hex 000001dc6486; asc d ;;
 2: len 7; hex 4100000fce1353; asc A S;;
*** (2) TRANSACTION:
TRANSACTION 31220876, ACTIVE 4 sec inserting
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 1562, OS thread handle 139830445532928, query id 9010 localhost root update
insert into dl values (1)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1630 page no 3 n bits 72 index a of table `test`.`dl` trx id 31220876 lock mode S locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 4; hex 80000001; asc ;;
 1: len 6; hex 000001dc6486; asc d ;;
 2: len 7; hex 4100000fce1353; asc A S;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1630 page no 3 n bits 72 index a of table `test`.`dl` trx id 31220876 lock_mode X locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 4; hex 80000001; asc ;;
 1: len 6; hex 000001dc6486; asc d ;;
 2: len 7; hex 4100000fce1353; asc A S;;
*** WE ROLL BACK TRANSACTION (2)
...
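The delete takes an X lock; the first insert requests an S lock; the second insert requests another S lock. After the commit, the first insert succeeds and the second one hits a deadlock.
To walk through the flow: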
thd1                 thd2                         thd3
del--->X
                     ins--->S--->wait
                                                  ins--->S--->wait
commit--->release
                     ins--->needs X--->           ins--->needs X--->
                     waits for thd3's S   <===>   waits for thd2's S
                                        deadlock
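The same flow can be checked with a toy S/X compatibility model in Python. It is a simplification made for illustration: `COMPATIBLE` and `wait_for_edges` are hypothetical names, only one record is modelled, and lock queue ordering is ignored.

```python
# (held mode, requested mode) -> can both be granted at the same time?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def wait_for_edges(held, requested):
    """Return {waiter: [blockers]} for locks on one record.

    held      -- {transaction: mode} for locks already granted
    requested -- {transaction: mode} for locks being asked for
    A transaction never blocks on its own lock.
    """
    edges = {}
    for waiter, want in requested.items():
        blockers = sorted(
            holder for holder, have in held.items()
            if holder != waiter and not COMPATIBLE[(have, want)]
        )
        if blockers:
            edges[waiter] = blockers
    return edges


if __name__ == "__main__":
    # While thd1 holds X, both inserts wait for their S lock.
    print(wait_for_edges({"thd1": "X"}, {"thd2": "S", "thd3": "S"}))
    # After thd1 commits, both hold S and both need X: a cycle.
    print(wait_for_edges({"thd2": "S", "thd3": "S"},
                         {"thd2": "X", "thd3": "X"}))
```

In the second call each insert is blocked by the other's S lock, which is exactly the thd2 <===> thd3 cycle in the table.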
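Most deadlocks you run into are caused by unique indexes. In show engine innodb status\G, check whether the index locked in the deadlock section is a unique index.
Remember: S locks are there for uniqueness checks; they are rarely used for anything else.
tips: the unique key must be NOT NULL, otherwise this deadlock cannot be reproduced. I don't know why; I haven't looked into it.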