Analyzing the efficiency of MySQL IN with EXPLAIN EXTENDED

Date: 2019-11-07

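Viewing an execution plan with EXPLAIN EXTENDED shows one more column than plain EXPLAIN: filtered.
The filtered column gives a percentage. Used together with the value in the rows column (rows × filtered / 100), it estimates how many rows will be joined with the previous table in the EXPLAIN output.
The previous table is the one whose value in the id column is smaller than the id of the current table.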
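
As a minimal sketch of that estimate, the arithmetic can be run directly in the MySQL client. The formula rows × filtered / 100 is a reading of the description above, and the two input values are the rows and filtered figures from the plan in section 1.

```sql
-- Estimated rows passed on from one EXPLAIN row: rows * filtered / 100.
-- 5 and 100.00 are the rows and filtered values from the plan in section 1.
SELECT 5 * (100.00 / 100) AS estimated_rows;
```

With filtered at 100.00 the estimate is simply the rows value itself.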

1. In a MySQL query, IN can use an index:

mysql> explain extended select *,sleep(0.2) from testinfo where id in (1232,232,324,2342,23);
+----+-------------+----------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+----------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | testinfo | range | PRIMARY       | PRIMARY | 4       | NULL |    5 |   100.00 | Using where |
+----+-------------+----------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> show warnings \G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: select `test`.`testinfo`.`id` AS `id`,`test`.`testinfo`.`idtest` AS `idtest`,`test`.`testinfo`.`nametest` AS `nametest`,`test`.`testinfo`.`author` AS `author`,`test`.`testinfo`.`typetest` AS `typetest`,sleep(0.2) AS `sleep(0.2)` from `test`.`testinfo` where (`test`.`testinfo`.`id` in (1232,232,324,2342,23))
1 row in set (0.00 sec)
mysql> select *,sleep(0.2) from testinfo where id in (1232,232,324,2342,23);
5 rows in set (1.02 sec)
# Time: 130725 11:47:51
# User@Host: root[root] @ localhost []
# Query_time: 1.017450 Lock_time: 0.000219 Rows_sent: 5 Rows_examined: 5
SET timestamp=1374724071;
select *,sleep(0.2) from testinfo where id in (1232,232,324,2342,23);

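As shown, id in (1232,232,324,2342,23) uses the primary key index and works well: the slow query log reports Rows_examined: 5, so the result is produced after scanning only 5 rows.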
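
The article does not show the table definition, so the following is only a hypothetical schema for reproducing the test. The column names come from the SHOW WARNINGS output above, while every type, length, and the character set are assumptions. They were chosen so that an INT primary key gives key_len 4 and a NOT NULL VARCHAR(20) in utf8 gives key_len 62 for key_idtest, matching the plans in this article.

```sql
-- Hypothetical schema: column names are from the SHOW WARNINGS output,
-- but all types, lengths, and the charset are assumptions.
CREATE TABLE testinfo (
  id       INT NOT NULL AUTO_INCREMENT,
  idtest   VARCHAR(20) NOT NULL,
  nametest VARCHAR(50),
  author   VARCHAR(50),
  typetest VARCHAR(20),
  PRIMARY KEY (id),
  KEY key_idtest (idtest)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Check how the IN list is resolved; a range scan on PRIMARY is the
-- access path reported in the article.
EXPLAIN SELECT * FROM testinfo WHERE id IN (1232, 232, 324, 2342, 23);
```

Plain EXPLAIN is used here because the EXTENDED keyword was deprecated in MySQL 5.7, where the filtered column is shown by default, and removed in MySQL 8.0.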

2. Now look at the efficiency of this SQL: select count(*) from testinfo where id not in (select id from testinfo group by idtest);

mysql> explain extended select count(*) from testinfo where id not in (select id from testinfo group by idtest);
+----+--------------------+----------+-------+---------------+------------+---------+------+------+-----------+--------------------------+
| id | select_type        | table    | type  | possible_keys | key        | key_len | ref  | rows | filtered  | Extra                    |
+----+--------------------+----------+-------+---------------+------------+---------+------+------+-----------+--------------------------+
|  1 | PRIMARY            | testinfo | index | NULL          | key_idtest | 62      | NULL | 8761 |    100.00 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | testinfo | index | NULL          | key_idtest | 62      | NULL |    1 | 876100.00 | Using index              |
+----+--------------------+----------+-------+---------------+------------+---------+------+------+-----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)

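At first glance both rows use an index, but look closely: filtered for the subquery row is 876100.00, far above the 100 that a percentage would normally top out at. A value this large directly affects the number of rows scanned when the SQL executes.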

mysql> show warnings \G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: select count(0) AS `count(*)` from `test`.`testinfo` where (not(<in_optimizer>(`test`.`testinfo`.`id`,<exists>(select `test`.`testinfo`.`id` from `test`.`testinfo` group by `test`.`testinfo`.`idtest` having (<cache>(`test`.`testinfo`.`id`) = <ref_null_helper>(`test`.`testinfo`.`id`))))))
1 row in set (0.00 sec)
As shown, after passing through the MySQL optimizer the IN has been converted into an EXISTS form. Actually executing the SQL once took 36 seconds:
mysql> select count(*) from testinfo where id not in (select id from testinfo group by idtest);
+----------+
| count(*) |
+----------+
|     1059 |
+----------+
1 row in set (36.79 sec)
Based on the execution plan above, the estimated number of scanned rows is roughly 76755121:
mysql> select 8761*((876100*1)/100);
+-----------------------+
| 8761*((876100*1)/100) |
+-----------------------+
|         76755121.0000 |
+-----------------------+
1 row in set (0.00 sec)
The number of rows actually examined, according to the slow query log, is 50910026:
# User@Host: root[root] @ localhost []
# Query_time: 36.793302 Lock_time: 0.000227 Rows_sent: 1 Rows_examined: 50910026
SET timestamp=1374723426;
select count(*) from testinfo where id not in (select id from testinfo group by idtest);

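The test above shows that the subquery inside IN is not evaluated once up front, with the outer query then running against its result. When the subquery inside IN contains GROUP BY, check whether it leads to a very large number of scanned rows and therefore very poor SQL performance.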

3. Rewrite the SQL above:
First create a temporary table:
create table wjlcn_temp(id int auto_increment primary key);
Then insert the intermediate result into the temporary table:
insert into wjlcn_temp select id from testinfo group by idtest;
Then query the result: select count(*) from testinfo where id not in (select id from wjlcn_temp);

mysql> explain extended select count(*) from testinfo where id not in (select id from wjlcn_temp);
+----+--------------------+------------+-----------------+---------------+------------+---------+------+------+----------+--------------------------+
| id | select_type        | table      | type            | possible_keys | key        | key_len | ref  | rows | filtered | Extra                    |
+----+--------------------+------------+-----------------+---------------+------------+---------+------+------+----------+--------------------------+
|  1 | PRIMARY            | testinfo   | index           | NULL          | key_idtest | 62      | NULL | 8761 |   100.00 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | wjlcn_temp | unique_subquery | PRIMARY       | PRIMARY    | 4       | func |    1 |   100.00 | Using index              |
+----+--------------------+------------+-----------------+---------------+------------+---------+------+------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
mysql> show warnings \G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: select count(0) AS `count(*)` from `test`.`testinfo` where (not(<in_optimizer>(`test`.`testinfo`.`id`,<exists>(<primary_index_lookup>(<cache>(`test`.`testinfo`.`id`) in wjlcn_temp on PRIMARY)))))
1 row in set (0.00 sec)
mysql> select count(*),sleep(1) from testinfo where id not in (select id from wjlcn_temp);
+----------+----------+
| count(*) | sleep(1) |
+----------+----------+
|     1059 |        0 |
+----------+----------+
1 row in set (1.02 sec)
# Time: 130725 11:41:04
# User@Host: root[root] @ localhost []
# Query_time: 1.026054 Lock_time: 0.000231 Rows_sent: 1 Rows_examined: 9999
SET timestamp=1374723664;
select count(*),sleep(1) from testinfo where id not in (select id from wjlcn_temp);

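The execution plan above shows filtered at 100 for both rows, which is very different from the previous SQL.
In addition, unique_subquery appears in the EXPLAIN output.

The documentation explains: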
unique_subquery
This type replaces ref for some IN subqueries of the following form:

value IN (SELECT primary_key FROM single_table WHERE some_expr)
unique_subquery is just an index lookup function that replaces the subquery completely for better efficiency.

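When unique_subquery appears in the plan, MySQL automatically replaces the subquery after IN with an index lookup. The SHOW WARNINGS output above shows the SQL that is actually executed:
select count(0) AS `count(*)` from `test`.`testinfo` where (not(<in_optimizer>(`test`.`testinfo`.`id`,<exists>(<primary_index_lookup>(<cache>(`test`.`testinfo`.`id`) in wjlcn_temp on PRIMARY)))))
When a query uses primary_index_lookup, its execution efficiency is also fairly good.
The slow query log shows 9999 rows examined, and the real execution time is Query_time minus the one-second sleep: 1.026054 - 1 = 0.026054 seconds. That is far better than the earlier 36 seconds, and the gap grows as the table gets larger. So when using an IN subquery in SQL, it is best to check the execution plan. Frequently running SQL such as select count(*) from testinfo where id not in (select id from testinfo group by idtest); against a large production table may cause server performance problems.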
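
The steps in section 3 can be collected into one script, sketched below under a few assumptions. The table and column names are the article's. The DROP TABLE statements and the use of MIN(id) are additions: MIN(id) picks one deterministic id per idtest group and also runs under the ONLY_FULL_GROUP_BY SQL mode, which rejects the bare id used in the article. The final count is unchanged by that choice, because it depends only on how many groups there are.

```sql
-- Sketch of the temp-table workaround as one script (names from the article).
DROP TABLE IF EXISTS wjlcn_temp;
CREATE TABLE wjlcn_temp (id INT AUTO_INCREMENT PRIMARY KEY);

-- Assumption: MIN(id) replaces the article's bare `id` so that the statement
-- is deterministic and accepted when ONLY_FULL_GROUP_BY is enabled.
INSERT INTO wjlcn_temp SELECT MIN(id) FROM testinfo GROUP BY idtest;

-- Check the plan before running the query. On the MySQL version used in the
-- article the subquery row shows type = unique_subquery; other versions may
-- choose a different strategy.
EXPLAIN SELECT COUNT(*) FROM testinfo WHERE id NOT IN (SELECT id FROM wjlcn_temp);
SELECT COUNT(*) FROM testinfo WHERE id NOT IN (SELECT id FROM wjlcn_temp);

-- Clean up the intermediate table.
DROP TABLE wjlcn_temp;
```

Materializing the grouped ids once means the outer query only performs a primary key lookup per row instead of re-running the GROUP BY subquery.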
