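Today a colleague reported that a SQL DELETE statement had been running for an hour without finishing and had to be forcibly interrupted. When you first see EXISTS in a statement like this, keep the following in mind: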
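In Oracle, EXISTS is commonly regarded as more efficient than IN. In MySQL that is not necessarily the case. The rule of thumb for choosing between EXISTS and IN in MySQL is: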
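- When the table in the subquery is large, EXISTS can reduce the total number of loop iterations and so run faster;
- When the table in the outer query is large, IN can reduce the number of passes over the outer table and so run faster.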
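In essence, EXISTS uses the outer query as the driving table, while IN uses the subquery as the driving table (the driving table determines which result set the nested loop iterates over and compares against).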
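To make the two shapes concrete, here is a minimal sketch of the same filter written both ways. The tables `big_t` and `small_t` and the column `id` are hypothetical names used only for illustration. The plan MySQL actually picks depends on the server version and optimizer settings, so check it with EXPLAIN rather than assume it.

```sql
-- Hypothetical tables: big_t (many rows) and small_t (few rows), related by id.

-- EXISTS form: a correlated subquery, conceptually evaluated once
-- for every row of the outer table big_t.
SELECT b.*
FROM big_t AS b
WHERE EXISTS (SELECT 1 FROM small_t AS s WHERE s.id = b.id);

-- IN form: the subquery does not reference the outer row, so its result
-- can be produced first and then used to look up matching rows in big_t.
SELECT b.*
FROM big_t AS b
WHERE b.id IN (SELECT s.id FROM small_t AS s);
```

Both queries return the same rows. Only the order in which the tables are visited may differ.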
3.1.1 SQL
DELETE t FROM o.`AI_AD_U_L` t WHERE EXISTS (SELECT 1 FROM o.`AI_AD_U_L_TEMP` AS a WHERE a.`prod_inst_id` = t.`prod_inst_id`);
3.1.2 Analysis
Check the indexes on the tables
mysql> show index from AI_AD_U_L;
+-----------+------------+---------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table     | Non_unique | Key_name                        | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+---------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| AI_AD_U_L |          0 | PRIMARY                         |            1 | prod_inst_id | A         |    21162012 |     NULL | NULL   |      | BTREE      |         |               |
| AI_AD_U_L |          1 | ai_sync_prod_level_cust_addr_id |            1 | cust_addr_id | A         |     8266746 |     NULL | NULL   | YES  | BTREE      |         |               |
| AI_AD_U_L |          1 | ai_sync_prod_level_mac          |            1 | mac          | A         |    12227460 |     NULL | NULL   | YES  | BTREE      |         |               |
+-----------+------------+---------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)

mysql> show index from AI_AD_U_L_TEMP;
+----------------+------------+-------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table          | Non_unique | Key_name          | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------+------------+-------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| AI_AD_U_L_TEMP |          1 | idx_cust_addr_id2 |            1 | cust_addr_id | A         |        2366 |     NULL | NULL   | YES  | BTREE      |         |               |
| AI_AD_U_L_TEMP |          1 | idx_prod_inst_id  |            1 | prod_inst_id | A         |        3791 |     NULL | NULL   |      | BTREE      |         |               |
+----------------+------------+-------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)
Both tables already have an index on the join column (prod_inst_id). If such an index did not exist, it would need to be created.
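If the join column had turned out to be unindexed, the check and the fix could look roughly like the sketch below. It assumes both tables live in schema `o`, as the statement being tuned suggests, and it reuses the index name from the output above. The CREATE INDEX is not needed in this case and is shown only for illustration.

```sql
-- List the indexes that cover the join column on both tables.
-- Assumption: the tables are in schema `o`.
SELECT TABLE_NAME, INDEX_NAME, SEQ_IN_INDEX, COLUMN_NAME
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'o'
  AND TABLE_NAME IN ('AI_AD_U_L', 'AI_AD_U_L_TEMP')
  AND COLUMN_NAME = 'prod_inst_id';

-- Only needed when the query above returns no row for a table.
-- Here the index already exists, so this statement is illustrative.
CREATE INDEX idx_prod_inst_id ON o.`AI_AD_U_L_TEMP` (prod_inst_id);
```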
Check the execution plan
mysql> explain DELETE t FROM o.`AI_AD_U_L` t WHERE EXISTS (SELECT 1 FROM o.`AI_AD_U_L_TEMP` AS a WHERE a.prod_inst_id = t.prod_inst_id);
+----+--------------------+-------+------------+------+------------------+------------------+---------+-----------------------+----------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys    | key              | key_len | ref                   | rows     | filtered | Extra       |
+----+--------------------+-------+------------+------+------------------+------------------+---------+-----------------------+----------+----------+-------------+
|  1 | DELETE             | t     | NULL       | ALL  | NULL             | NULL             | NULL    | NULL                  | 21162122 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | a     | NULL       | ref  | idx_prod_inst_id | idx_prod_inst_id | 8       | o.t.prod_inst_id      |        1 |   100.00 | Using index |
+----+--------------------+-------+------------+------+------------------+------------------+---------+-----------------------+----------+----------+-------------+
2 rows in set, 1 warning (0.01 sec)
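The execution plan reveals two problems:
- The outer table is large (an estimated 21,162,122 rows), so it is read row by row and the subquery runs roughly 21,162,122 times, while each run of the subquery touches only about one row through the index.
- The subquery uses an index, but the outer table does not (type ALL, a full table scan).
Together, these two points show that the outer query is acting as the driving table.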
Check the row count of the table in the subquery
mysql> select count(*) from AI_AD_U_L_TEMP;
+----------+
| count(*) |
+----------+
|     3791 |
+----------+
1 row in set (0.00 sec)
The table in the subquery is small, so the subquery should be the driving table. EXISTS should therefore be replaced with IN.
Rewrite the SQL statement, changing EXISTS to IN, and check the execution plan again.
mysql> explain DELETE t FROM o.`AI_AD_U_L` t WHERE t.prod_inst_id in (SELECT prod_inst_id FROM o.`AI_AD_U_L_TEMP` AS a );
+----+-------------+-------+------------+--------+------------------+------------------+---------+-----------------------+------+----------+------------------------+
| id | select_type | table | partitions | type   | possible_keys    | key              | key_len | ref                   | rows | filtered | Extra                  |
+----+-------------+-------+------------+--------+------------------+------------------+---------+-----------------------+------+----------+------------------------+
|  1 | SIMPLE      | a     | NULL       | index  | idx_prod_inst_id | idx_prod_inst_id | 8       | NULL                  | 3791 |   100.00 | Using index; LooseScan |
|  1 | DELETE      | t     | NULL       | eq_ref | PRIMARY          | PRIMARY          | 8       | o.a.prod_inst_id      |    1 |   100.00 | NULL                   |
+----+-------------+-------+------------+--------+------------------+------------------+---------+-----------------------+------+----------+------------------------+
2 rows in set (0.00 sec)
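The execution plan shows that both tables now use an index: the subquery table is read through idx_prod_inst_id, and each of its values is looked up in the outer table by primary key (eq_ref). The number of accesses to the outer table drops to the number of rows in the subquery table (about 3,791), which greatly reduces the looping over the outer table.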
Execute the SQL statement
mysql> DELETE t FROM o.`AI_AD_U_L` t WHERE t.prod_inst_id in (SELECT prod_inst_id FROM o.`AI_AD_U_L_TEMP` AS a );
Query OK, 3525 rows affected (0.44 sec)
The improvement is dramatic: the statement that could not finish in an hour now completes in 0.44 seconds.
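As a complement, the sketch below previews how many rows would be affected and shows another way to write the same delete, as a multi-table DELETE with an explicit join. This join form is a swapped-in alternative and does not come from the original investigation. It was not run against the data above, so whether the optimizer drives it from the small table should be confirmed with EXPLAIN before relying on it.

```sql
-- Preview: count the rows in the large table that have a match,
-- using the same IN predicate as the tuned DELETE.
SELECT COUNT(*)
FROM o.`AI_AD_U_L` AS t
WHERE t.prod_inst_id IN (SELECT a.prod_inst_id FROM o.`AI_AD_U_L_TEMP` AS a);

-- Alternative formulation (an assumption, not the statement used above):
-- a multi-table DELETE that joins the two tables directly.
-- EXPLAIN only shows the plan; it does not delete anything.
EXPLAIN
DELETE t
FROM o.`AI_AD_U_L` AS t
JOIN o.`AI_AD_U_L_TEMP` AS a ON a.prod_inst_id = t.prod_inst_id;
```

Running the COUNT first gives a figure to compare against the "rows affected" number reported by the DELETE.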