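What, LEFT JOIN Can Turn Into JOIN?

Preface

In day-to-day development, most people probably use LEFT JOIN and JOIN in much the same way: given two tables A and B, use JOIN when you want data from both tables, and use LEFT JOIN when you want every row of one table and the other table's data is optional. (This description is not entirely accurate, but it matches my everyday business development.)

From the article "MySQL LEFT JOIN Explained in Detail" we already know that with LEFT JOIN you choose the driving table yourself, whereas with JOIN the MySQL optimizer chooses it.

So when we write a LEFT JOIN statement, will MySQL optimize it into a JOIN statement?

If it does, when does that happen?

In fact, this is exactly a production issue I ran into. Let's take a look together.

Problem Description

We had the following slow SQL in production (since fixed); it took more than 0.5 seconds to execute.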

select 
    count(distinct `order`.`order_id`) 
from `order` force index(shop_id) 
left join `order_extend`
on `order`.`order_id` = `order_extend`.`order_id` 
where `order`.`create_time` >= "2020-08-01 00:00:00" 
and `order`.`create_time` <= "2020-08-01 23:59:59" 
and `order`.`shop_id` = 328449726569069326 
and `order`.`status` = 1 
and `order_extend`.`shop_id` = 328449726569069326 
and `order_extend`.`status` = 1

The EXPLAIN result is as follows:

+----+-------------+--------------+------------+--------+------------------+----------+---------+------------------------+------+-------------+
| id | select_type | table        | partitions | type   | possible_keys    | key      | key_len | ref                    | rows | Extra       |
+----+-------------+--------------+------------+--------+------------------+----------+---------+------------------------+------+-------------+
|  1 | SIMPLE      | order_extend | NULL       | ref    | order_id,shop_id | shop_id  | 8       | const                  | 3892 | Using where |
|  1 | SIMPLE      | order        | NULL       | eq_ref | shop_id          | shop_id  | 16      | example.order.order_id |    1 | Using where |
+----+-------------+--------------+------------+--------+------------------+----------+---------+------------------------+------+-------------+
2 rows in set, 1 warning (0.00 sec)

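Problem Analysis

From the EXPLAIN output, combined with the MySQL join algorithm we covered earlier, the driving table is order_extend and the loop runs about 3892 times, which is neither a lot nor a little. The driven table is accessed with type eq_ref, so each lookup should not be slow. The problem therefore seems to lie in those 3892 iterations, and we just need to find a way to bring that number down.

Wait! Why is order_extend the driving table? I clearly used LEFT JOIN, so the driving table should be order. Why did it become order_extend? Did MySQL optimize the query internally?

Following this line of thought: since the driving table changed, this SQL must have been turned into a JOIN statement.

Let's analyze this statement the way we would analyze a JOIN. (PS: this requires some understanding of how MySQL executes a JOIN internally; if you are not familiar with it, please read the article "MySQL Join Algorithms" first.)

MySQL choosing order_extend as the driving table suggests that, under the WHERE conditions, order_extend yields fewer rows; MySQL prefers the smaller table as the driving table.

Let's apply those WHERE conditions to each table separately in a standalone select count(...) statement to see roughly how many rows each table involves.

So as not to distort the analysis, we use EXPLAIN, so that everything we see is an estimate, mimicking MySQL's own analysis.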

mysql> explain select 
    count(distinct `order`.`order_id`) 
from `order` force index(shop_id) 
where `order`.`create_time` >= "2020-08-01 00:00:00" 
and `order`.`create_time` <= "2020-08-01 23:59:59" 
and `order`.`shop_id` = 328449726569069326 
and `order`.`status` = 1;


+----+-------------+-------+------------+------+--------------------------------+---------+---------+-------+--------+-------------+
| id | select_type | table | partitions | type | possible_keys                  | key     | key_len | ref   | rows   | Extra       |
+----+-------------+-------+------------+------+--------------------------------+---------+---------+-------+--------+-------------+
|  1 | SIMPLE      | order | NULL       | ref  | PRIMARY,shop_id,create_time... | shop_id | 8       | const | 320372 | Using where |
+----+-------------+-------+------------+------+--------------------------------+---------+---------+-------+--------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select 
    count(distinct order_extend.order_id) 
from `order_extend` 
where `order_extend`.`shop_id` = 328449726569069326 
and `order_extend`.`status` = 1;

+----+-------------+--------------+------------+------+------------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table        | partitions | type | possible_keys    | key     | key_len | ref   | rows | filtered | Extra       |
+----+-------------+--------------+------------+------+------------------+---------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | order_extend | NULL       | ref  | order_id,shop_id | shop_id | 8       | const | 3892 |    10.00 | Using where |
+----+-------------+--------------+------------+------+------------------+---------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

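As we can see, under these WHERE conditions order_extend is estimated to read only 3892 rows, whereas order is estimated to read 320372 rows, so choosing order_extend as the driving table is perfectly reasonable.

So why does the order table scan so many rows? There were probably not that many orders on 2020-08-01. At this point it is natural to suspect the forced index: the query forces the shop_id index, which may not be the best index. Let's remove force index(shop_id) and try again: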

mysql> explain select 
    count(distinct `order`.`order_id`) 
from `order` 
where `order`.`create_time` >= "2020-08-01 00:00:00" 
and `order`.`create_time` <= "2020-08-01 23:59:59" 
and `order`.`shop_id` = 328449726569069326 
and `order`.`status` = 1;


+----+-------------+-------+------------+------+---------------+-------------+---------+-------+-------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key         | key_len | ref   | rows  | filtered | Extra                    |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+-------+----------+--------------------------+
|  1 | SIMPLE      | order | NULL       | ref  | create_time   | create_time | 8       | const | <3892 |    10.00 | Using where; Using index |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+-------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

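As we can see, without forcing the shop_id index MySQL uses the create_time index and scans far fewer rows. Suppose it is 100 rows: by this rough estimate the loop runs only 100 times and scans about 100 x 3892 rows, whereas before it ran 3892 times and scanned about 3892 x 300000 rows.

Conclusion

So the cause of this slow SQL is now clear: because we forced the shop_id index, MySQL had to scan more rows. All we need to do is remove the forced index. Most of the time MySQL picks the right index, so be very careful when forcing one.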
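
As a minimal sketch of that fix, the query without the index hint might look like the following. Table and column names come from the slow query above. Writing an explicit JOIN is my own assumption, not something the original fix is known to have done; it is safe here only because the WHERE conditions on order_extend already discard NULL-extended rows.

```sql
-- No FORCE INDEX: the optimizer is free to choose create_time or another index.
-- `order` is a reserved word in MySQL, so the table name is quoted with backticks.
SELECT COUNT(DISTINCT `order`.`order_id`)
FROM `order`
JOIN `order_extend`
  ON `order`.`order_id` = `order_extend`.`order_id`
WHERE `order`.`create_time` >= '2020-08-01 00:00:00'
  AND `order`.`create_time` <= '2020-08-01 23:59:59'
  AND `order`.`shop_id` = 328449726569069326
  AND `order`.`status` = 1
  AND `order_extend`.`shop_id` = 328449726569069326
  AND `order_extend`.`status` = 1;
```

Prefixing the statement with EXPLAIN before deploying it is a cheap way to confirm which index and join order the optimizer actually picks on your data.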

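Further Discussion

We have solved the slow SQL problem. Now let's revisit the question from the beginning of the article: can LEFT JOIN be optimized into JOIN?

The answer is yes. So when does this happen?

Let's review the content of the article "MySQL LEFT JOIN Explained in Detail".

For ease of reading, part of it is reproduced here.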

mysql> select * from goods left join goods_category on goods.category_id = goods_category.category_id;
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
|        1 | 男鞋1      |           1 |           1 | 鞋            |
|        2 | 男鞋2      |           1 |           1 | 鞋            |
|        3 | 男鞋3      |           3 |           3 | 羽絨服        |
|        4 | T恤1       |           2 |           2 | T恤           |
|        5 | T恤2       |           2 |           2 | T恤           |
+----------+------------+-------------+-------------+---------------+
5 rows in set (0.00 sec)

mysql> select * from goods left join goods_category on goods.category_id = goods_category.category_id;
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
|        1 | 男鞋1      |           1 |           1 | 鞋            |
|        2 | 男鞋2      |           1 |           1 | 鞋            |
|        3 | 男鞋3      |           4 |        NULL | NULL          |
|        4 | T恤1       |           2 |           2 | T恤           |
|        5 | T恤2       |           2 |           2 | T恤           |
+----------+------------+-------------+-------------+---------------+
5 rows in set (0.00 sec)

mysql> select * from goods g left join goods_category c on (g.category_id = c.category_id and g.goods_name = 'T恤1');
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
|        1 | 男鞋1      |           1 |        NULL | NULL          |
|        2 | 男鞋2      |           1 |        NULL | NULL          |
|        3 | 男鞋3      |           4 |        NULL | NULL          |
|        4 | T恤1       |           2 |           2 | T恤           |
|        5 | T恤2       |           2 |        NULL | NULL          |
+----------+------------+-------------+-------------+---------------+
5 rows in set (0.00 sec)

mysql> select * from goods g left join goods_category c on (g.category_id = c.category_id and c.category_name = 'T恤');
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
|        1 | 男鞋1      |           1 |        NULL | NULL          |
|        2 | 男鞋2      |           1 |        NULL | NULL          |
|        3 | 男鞋3      |           4 |        NULL | NULL          |
|        4 | T恤1       |           2 |           2 | T恤           |
|        5 | T恤2       |           2 |           2 | T恤           |
+----------+------------+-------------+-------------+---------------+
5 rows in set (0.00 sec)

mysql> select * from goods g left join goods_category c on (g.category_id = c.category_id) where c.category_name = '鞋';
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
|        1 | 男鞋1      |           1 |           1 | 鞋            |
|        2 | 男鞋2      |           1 |           1 | 鞋            |
+----------+------------+-------------+-------------+---------------+
2 rows in set (0.00 sec)

mysql> select * from goods g left join goods_category c on (g.category_id = c.category_id) where g.goods_name = 'T恤1';
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
|        4 | T恤1       |           2 |           2 | T恤           |
+----------+------------+-------------+-------------+---------------+
1 row in set (0.00 sec)

mysql> select * from goods g left join goods_category c on (g.category_id = c.category_id and g.goods_name = 'T恤2') where g.goods_name = 'T恤1';
+----------+------------+-------------+-------------+---------------+
| goods_id | goods_name | category_id | category_id | category_name |
+----------+------------+-------------+-------------+---------------+
|        4 | T恤1       |           2 |        NULL | NULL          |
+----------+------------+-------------+-------------+---------------+
1 row in set (0.00 sec)

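We can see that when the WHERE clause contains a condition on the driven table, the result is the same as that of JOIN: no NULL-extended rows appear.

So we can infer the condition under which LEFT JOIN is optimized into JOIN: when the WHERE clause contains a non-NULL condition on the driven table (one that can never be true for a NULL value), LEFT JOIN is equivalent to JOIN.

This is easy to understand: LEFT JOIN returns every row of the driving table, but a WHERE condition on the driven table filters out the rows whose driven-table columns are NULL, so the result matches that of JOIN. MySQL then optimizes LEFT JOIN into JOIN, which lets it choose the driving table itself.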
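
The rule above can be sketched with the goods and goods_category tables from the earlier examples. The three statements below are illustrative only. The MySQL manual calls a condition like the first one "null-rejected"; the third statement is my own addition, showing a driven-table condition that does not reject NULLs.

```sql
-- 1. WHERE condition on the driven table: NULL-extended rows can never
--    satisfy it, so the result equals that of an inner join.
SELECT g.goods_id, c.category_name
FROM goods g
LEFT JOIN goods_category c ON g.category_id = c.category_id
WHERE c.category_name = '鞋';

-- 2. Same condition moved into ON: every goods row is kept and unmatched
--    rows get NULLs, so outer-join semantics are preserved.
SELECT g.goods_id, c.category_name
FROM goods g
LEFT JOIN goods_category c
  ON g.category_id = c.category_id AND c.category_name = '鞋';

-- 3. IS NULL is satisfied precisely by the NULL-extended rows, so it is
--    not a non-NULL condition and the statement must stay an outer join.
SELECT g.goods_id
FROM goods g
LEFT JOIN goods_category c ON g.category_id = c.category_id
WHERE c.category_id IS NULL;
```

If the intent is really "all rows of the left table, with optional data from the right", the filter on the right table belongs in ON (statement 2) rather than in WHERE.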

Test Case

Let's write a test case to verify our conclusion.

CREATE TABLE `A` (
  `id` int(11) auto_increment,
  `a` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `a` (`a`)
) ENGINE=InnoDB;

delimiter ;;
create procedure idata()
begin
  declare i int;
  set i=1;
  while(i<=100)do
    insert into A (`a`) values(i);
    set i=i+1;
  end while;
end;;
delimiter ;
call idata();

CREATE TABLE `B` (
  `id` int(11) auto_increment,
  `b` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `b` (`b`)
) ENGINE=InnoDB;

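-- drop the earlier procedure first, otherwise the second create procedure fails
drop procedure if exists idata;
delimiter ;;
create procedure idata()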
begin
  declare i int;
  set i=1;
  while(i<=100)do
    insert into B (`b`) values(i);
    set i=i+1;
  end while;
end;;
delimiter ;
call idata();

We created two identical tables, each with 100 rows, and then ran the following LEFT JOIN statements.

mysql> explain select * from A left join B on A.id = B.id where A.a <= 100;
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref           | rows | filtered | Extra                    |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
|  1 | SIMPLE      | A     | NULL       | index  | a             | a       | 5       | NULL          |  100 |   100.00 | Using where; Using index |
|  1 | SIMPLE      | B     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | example2.A.id |    1 |   100.00 | NULL                     |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
mysql> explain select * from A left join B on A.id = B.id where A.a <= 100 and B.b <= 50;
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref           | rows | filtered | Extra                    |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
|  1 | SIMPLE      | B     | NULL       | range  | PRIMARY,b     | b       | 5       | NULL          |   50 |   100.00 | Using where; Using index |
|  1 | SIMPLE      | A     | NULL       | eq_ref | PRIMARY,a     | PRIMARY | 4       | example2.B.id |    1 |   100.00 | Using where              |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
mysql> explain select * from A left join B on A.id = B.id where A.a <= 100 and B.b <= 100;
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref           | rows | filtered | Extra                    |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
|  1 | SIMPLE      | A     | NULL       | index  | PRIMARY,a     | a       | 5       | NULL          |  100 |   100.00 | Using where; Using index |
|  1 | SIMPLE      | B     | NULL       | eq_ref | PRIMARY,b     | PRIMARY | 4       | example2.A.id |    1 |   100.00 | Using where              |
+----+-------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)

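From the above, after adding a WHERE condition on table B, if B is estimated to scan fewer rows, the driving table may be switched. This also shows that the LEFT JOIN statement was optimized into a JOIN statement.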
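
One way to check this more directly, sketched here under the assumption of a MySQL version whose EXPLAIN output looks like the tables above: the "1 warning" reported under each EXPLAIN result is a note carrying the statement as the optimizer rewrote it, and SHOW WARNINGS prints it. The exact text of the message varies by version, so treat what you see as a hint rather than a guarantee.

```sql
-- Condition on the driven table B: in the rewritten statement the join
-- is expected to appear as a plain join if the conversion happened.
EXPLAIN SELECT * FROM A LEFT JOIN B ON A.id = B.id WHERE A.a <= 100 AND B.b <= 50;
SHOW WARNINGS;

-- Condition only on the driving table A: the rewritten statement is
-- expected to keep the left join.
EXPLAIN SELECT * FROM A LEFT JOIN B ON A.id = B.id WHERE A.a <= 100;
SHOW WARNINGS;
```

Comparing the two messages separates "the outer join was converted" from "the optimizer happened to keep A as the driving table", which the table order in EXPLAIN alone cannot distinguish (as in the B.b <= 100 case above).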

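Summary

Above we analyzed a slow SQL problem. The analysis touched on quite a few concepts, and I hope you will study them carefully.

We also reached a conclusion: when the WHERE clause contains a non-NULL condition on the driven table, MySQL optimizes the LEFT JOIN statement into a JOIN statement.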
