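In day-to-day development, most of us probably use LEFT JOIN and JOIN in much the same way: given two tables A and B, use JOIN when you want matching data from both tables, and use LEFT JOIN when you want every row of one table and the other table's data is optional. (This description is not entirely accurate, of course, but it matches my everyday business development.)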
As we saw in the article MySQL LEFT JOIN Explained, with LEFT JOIN you choose the driving table yourself, whereas with JOIN the MySQL optimizer chooses it.
So when we write a LEFT JOIN statement, will MySQL optimize it into a JOIN statement?
And if it does, when does that happen?
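In fact, this is exactly a production problem I ran into. Let's take a look together.
In production we had the following slow SQL statement (since fixed), which took more than 0.5 seconds to execute.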
select
count(distinct `order`.`order_id`)
from `order` force index(shop_id)
left join `order_extend`
on `order`.`order_id` = `order_extend`.`order_id`
where `order`.`create_time` >= "2020-08-01 00:00:00"
and `order`.`create_time` <= "2020-08-01 23:59:59"
and `order`.`shop_id` = 328449726569069326
and `order`.`status` = 1
and `order_extend`.`shop_id` = 328449726569069326
and `order_extend`.`status` = 1
The EXPLAIN result is as follows:
+----+-------------+--------------+------------+--------+------------------+----------+---------+------------------------+------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+------------+--------+------------------+----------+---------+------------------------+------+-------------+
| 1 | SIMPLE | order_extend | NULL | ref | order_id,shop_id | shop_id | 8 | const | 3892 | Using where |
| 1 | SIMPLE | order | NULL | eq_ref | shop_id | shop_id | 16 | example.order.order_id | 1 | Using where |
+----+-------------+--------------+------------+--------+------------------+----------+---------+------------------------+------+-------------+
2 rows in set, 1 warning (0.00 sec)
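From the EXPLAIN output, together with the MySQL join algorithms we covered earlier, the driving table is order_extend and the outer loop runs about 3892 times, which is neither a lot nor a little. The driven table is accessed with type eq_ref, so each lookup should not be too slow. The problem therefore seems to lie in the 3892 iterations, and we just need to find a way to bring that number down.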
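Wait! Why is the driving table order_extend? I explicitly used LEFT JOIN, so the driving table should be order. Why did it become order_extend? Did MySQL optimize the statement internally?
Following this line of thought: since the driving table changed, the SQL statement must have been turned into a JOIN statement.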
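Let's analyze this statement the way we would analyze a JOIN statement. (PS: this requires some understanding of how MySQL executes a JOIN internally. If you are not familiar with it, please read this article first → MySQL Join Algorithms.)
MySQL chose order_extend as the driving table, which suggests that order_extend returns fewer rows under the WHERE conditions: the optimizer prefers to drive the join from the table with the smaller filtered row count.
Let's apply the WHERE conditions above to each table separately in a standalone count query and see roughly how many rows each table involves.
To stay close to what the optimizer sees, we use EXPLAIN, so every number below is an estimate that mimics MySQL's own analysis.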
mysql> explain select
count(distinct `order`.`order_id`)
from `order` force index(shop_id)
where `order`.`create_time` >= "2020-08-01 00:00:00"
and `order`.`create_time` <= "2020-08-01 23:59:59"
and `order`.`shop_id` = 328449726569069326
and `order`.`status` = 1;
+----+-------------+-------+------------+------+--------------------------------+---------+---------+-------+--------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------------+------+--------------------------------+---------+---------+-------+--------+-------------+
| 1 | SIMPLE | order | NULL | ref | PRIMARY,shop_id,create_time... | shop_id | 8 | const | 320372 | Using where |
+----+-------------+-------+------------+------+--------------------------------+---------+---------+-------+--------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select
count(distinct `order_extend`.`order_id`)
from `order_extend`
where `order_extend`.`shop_id` = 328449726569069326
and `order_extend`.`status` = 1;
+----+-------------+--------------+------------+------+------------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------+------------+------+------------------+---------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | order_extend | NULL | ref | order_id,shop_id | shop_id | 8 | const | 3892 | 10.00 | Using where |
+----+-------------+--------------+------------+------+------------------+---------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
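As we can see, under the WHERE conditions above, order_extend is estimated to read only 3892 rows, while order is estimated to read 320372 rows, so choosing order_extend as the driving table is entirely reasonable.
But why does the order table scan so many rows? There probably were not that many orders on 2020-08-01. At this point it is easy to suspect the forced index: the query above forces the shop_id index, which may not be the best index. Let's remove force index(shop_id) and try again.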
mysql> explain select
count(distinct `order`.`order_id`)
from `order`
where `order`.`create_time` >= "2020-08-01 00:00:00"
and `order`.`create_time` <= "2020-08-01 23:59:59"
and `order`.`shop_id` = 328449726569069326
and `order`.`status` = 1;
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+-------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+-------+----------+--------------------------+
| 1 | SIMPLE | order | NULL | ref | create_time | create_time | 8 | const | <3892 | 10.00 | Using where; Using index |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+-------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
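As we can see, without the forced shop_id index MySQL uses the create_time index and scans far fewer rows. Suppose it is 100 rows: the outer loop then runs only 100 times and, as a rough worst-case estimate, the join touches about 100 x 3892 rows, whereas the earlier plan loops 3892 times and, by the same estimate, touches about 3892 x 300000 rows.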
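So the cause of this slow SQL is now clear: forcing the shop_id index made MySQL scan more rows, and all we need to do is remove the index hint. Most of the time MySQL chooses an appropriate index on its own, so be careful whenever you force one.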
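As a sketch of that fix, the original statement with the index hint removed might look like the following. Table and column names are taken from the query above, and the backticks around order are needed because ORDER is a reserved word in MySQL. Which index the optimizer then picks depends on the real data distribution, so it is worth re-running EXPLAIN on your own data to confirm the plan.

```sql
-- Same query as before, minus the FORCE INDEX hint, so the optimizer
-- is free to choose the index it estimates to be cheapest.
explain select
  count(distinct `order`.`order_id`)
from `order`
left join `order_extend`
  on `order`.`order_id` = `order_extend`.`order_id`
where `order`.`create_time` >= '2020-08-01 00:00:00'
  and `order`.`create_time` <= '2020-08-01 23:59:59'
  and `order`.`shop_id` = 328449726569069326
  and `order`.`status` = 1
  and `order_extend`.`shop_id` = 328449726569069326
  and `order_extend`.`status` = 1;
```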
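Now that the slow SQL is solved, let's return to the question from the beginning of the article: can a LEFT JOIN be optimized into a JOIN?
The answer is yes. So when does this happen?
Let's revisit the content of the article MySQL LEFT JOIN Explained.
For ease of reading, part of it is pasted here.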
mysql> select * from goods left join goods_category on goods.category_id = goods_category.category_id;
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
| 1 | 男鞋1 | 1 | 1 | 鞋 |
| 2 | 男鞋2 | 1 | 1 | 鞋 |
| 3 | 男鞋3 | 3 | 3 | 羽絨服 |
| 4 | T恤1 | 2 | 2 | T恤 |
| 5 | T恤2 | 2 | 2 | T恤 |
+----------+------------+-------------+-------------+---------------+
5 rows in set (0.00 sec)
mysql> select * from goods left join goods_category on goods.category_id = goods_category.category_id;
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
| 1 | 男鞋1 | 1 | 1 | 鞋 |
| 2 | 男鞋2 | 1 | 1 | 鞋 |
| 3 | 男鞋3 | 4 | NULL | NULL |
| 4 | T恤1 | 2 | 2 | T恤 |
| 5 | T恤2 | 2 | 2 | T恤 |
+----------+------------+-------------+-------------+---------------+
5 rows in set (0.00 sec)
mysql> select * from goods g left join goods_category c on (g.category_id = c.category_id and g.goods_name = 'T恤1');
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
| 1 | 男鞋1 | 1 | NULL | NULL |
| 2 | 男鞋2 | 1 | NULL | NULL |
| 3 | 男鞋3 | 4 | NULL | NULL |
| 4 | T恤1 | 2 | 2 | T恤 |
| 5 | T恤2 | 2 | NULL | NULL |
+----------+------------+-------------+-------------+---------------+
5 rows in set (0.00 sec)
mysql> select * from goods g left join goods_category c on (g.category_id = c.category_id and c.category_name = 'T恤');
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
| 1 | 男鞋1 | 1 | NULL | NULL |
| 2 | 男鞋2 | 1 | NULL | NULL |
| 3 | 男鞋3 | 4 | NULL | NULL |
| 4 | T恤1 | 2 | 2 | T恤 |
| 5 | T恤2 | 2 | 2 | T恤 |
+----------+------------+-------------+-------------+---------------+
5 rows in set (0.00 sec)
mysql> select * from goods g left join goods_category c on (g.category_id = c.category_id) where c.category_name = '鞋';
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
| 1 | 男鞋1 | 1 | 1 | 鞋 |
| 2 | 男鞋2 | 1 | 1 | 鞋 |
+----------+------------+-------------+-------------+---------------+
2 rows in set (0.00 sec)
mysql> select * from goods g left join goods_category c on (g.category_id = c.category_id) where g.goods_name = 'T恤1';
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
| 4 | T恤1 | 2 | 2 | T恤 |
+----------+------------+-------------+-------------+---------------+
1 row in set (0.00 sec)
mysql> select * from goods g left join goods_category c on (g.category_id = c.category_id and g.goods_name = 'T恤2') where g.goods_name = 'T恤1';
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
| 4 | T恤1 | 2 | NULL | NULL |
+----------+------------+-------------+-------------+---------------+
1 row in set (0.00 sec)
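To reproduce the result sets above locally, a minimal setup could look like this. The column names and row values are read off the output above, while the column types and keys are assumptions. The data matches the state from the second query onward, where goods 3 points at a category that does not exist.

```sql
-- Hypothetical schema: column names come from the result sets above,
-- types and keys are assumptions.
create table goods_category (
  category_id   int primary key,
  category_name varchar(32) not null
);

create table goods (
  goods_id    int primary key,
  goods_name  varchar(32) not null,
  category_id int not null
);

insert into goods_category (category_id, category_name) values
  (1, '鞋'), (2, 'T恤'), (3, '羽絨服');

-- Goods 3 points at category 4, which does not exist, so a LEFT JOIN
-- pads its goods_category columns with NULL.
insert into goods (goods_id, goods_name, category_id) values
  (1, '男鞋1', 1), (2, '男鞋2', 1), (3, '男鞋3', 4),
  (4, 'T恤1', 2), (5, 'T恤2', 2);
```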
As we can see, when the WHERE clause contains a condition on the driven table, the result is the same as that of a JOIN, and no NULL-padded rows appear.
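So we can infer the condition under which LEFT JOIN is optimized into JOIN: when the WHERE clause contains a condition on the driven table that rejects NULL values, LEFT JOIN is equivalent to JOIN.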
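This is not hard to understand. LEFT JOIN returns every row of the driving table, but a WHERE condition on the driven table filters out the rows padded with NULL. The result is then identical to that of a JOIN, so MySQL converts the LEFT JOIN into a JOIN, which leaves it free to choose the driving table itself.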
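What matters is whether the WHERE condition discards the NULL-padded rows. The sketch below contrasts the cases using the goods and goods_category tables from the excerpt above. It only illustrates result-set equivalence and makes no claim about the plan MySQL will choose.

```sql
-- Null-rejecting: c.category_name = '鞋' is never true for a NULL-padded
-- row, so these two statements return the same rows.
select * from goods g
left join goods_category c on g.category_id = c.category_id
where c.category_name = '鞋';

select * from goods g
join goods_category c on g.category_id = c.category_id
where c.category_name = '鞋';

-- Not null-rejecting: IS NULL is true exactly for the NULL-padded rows,
-- so the outer join has to be kept. This finds goods with no category.
select * from goods g
left join goods_category c on g.category_id = c.category_id
where c.category_id is null;

-- Filter moved into ON: every goods row is still returned, and only the
-- matched side is restricted.
select * from goods g
left join goods_category c
  on g.category_id = c.category_id and c.category_name = '鞋';
```

If the intent really is "all rows of the left table", the last form is usually the one to reach for, since a filter in WHERE silently turns the outer join into an inner one.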
Let's write a test case to verify this conclusion.
CREATE TABLE `A` (
`id` int(11) auto_increment,
`a` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `a` (`a`)
) ENGINE=InnoDB;
delimiter ;;
create procedure idata()
begin
declare i int;
set i=1;
while(i<=100)do
insert into A (`a`) values(i);
set i=i+1;
end while;
end;;
delimiter ;
call idata();
CREATE TABLE `B` (
`id` int(11) auto_increment,
`b` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `b` (`b`)
) ENGINE=InnoDB;
drop procedure if exists idata;
delimiter ;;
create procedure idata()
begin
declare i int;
set i=1;
while(i<=100)do
insert into B (`b`) values(i);
set i=i+1;
end while;
end;;
delimiter ;
call idata();
We created two identical tables with 100 rows each. Now let's run some LEFT JOIN statements.
mysql> explain select * from A left join B on A.id = B.id where A.a <= 100;
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
| 1 | SIMPLE | A | NULL | index | a | a | 5 | NULL | 100 | 100.00 | Using where; Using index |
| 1 | SIMPLE | B | NULL | eq_ref | PRIMARY | PRIMARY | 4 | example2.A.id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
mysql> explain select * from A left join B on A.id = B.id where A.a <= 100 and B.b <= 50;
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
| 1 | SIMPLE | B | NULL | range | PRIMARY,b | b | 5 | NULL | 50 | 100.00 | Using where; Using index |
| 1 | SIMPLE | A | NULL | eq_ref | PRIMARY,a | PRIMARY | 4 | example2.B.id | 1 | 100.00 | Using where |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
mysql> explain select * from A left join B on A.id = B.id where A.a <= 100 and B.b <= 100;
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
| 1 | SIMPLE | A | NULL | index | PRIMARY,a | a | 5 | NULL | 100 | 100.00 | Using where; Using index |
| 1 | SIMPLE | B | NULL | eq_ref | PRIMARY,b | PRIMARY | 4 | example2.A.id | 1 | 100.00 | Using where |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
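As shown above, once a WHERE condition is added on table B, the driving table may switch to B if B is estimated to scan fewer rows. This also shows that the LEFT JOIN statement was optimized into a JOIN statement.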
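The "1 warning" line under each EXPLAIN result points to a more direct check: running SHOW WARNINGS right after EXPLAIN prints the statement as rewritten by the optimizer. The sketch below uses tables A and B from the test case. The exact text of the note varies between MySQL versions, so no output is shown here.

```sql
explain select * from A left join B on A.id = B.id
where A.a <= 100 and B.b <= 50;

-- Note 1003 carries the rewritten statement. If the outer join was
-- converted, it is expected to show a plain join rather than a left join.
show warnings;
```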
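Above we analyzed a slow SQL problem. The analysis touched on quite a few topics, and I hope you will take the time to study them carefully.
We also reached one conclusion: when the WHERE clause contains a null-rejecting condition on the driven table, MySQL optimizes the LEFT JOIN statement into a JOIN statement.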