Alibaba programmers are nothing special after all: stumped by a simple SQL query

Date: 2020-05-09


(Forgive the clickbait title just this once. Spend a few minutes reading; it may help you.)

Background

I recently ran into a "magical" problem at work. It may be useful to others, hence this post.

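Roughly: I have two tables, TableA and TableB. TableA holds on the order of millions of rows (existing business data), while TableB holds only a few rows (a new business scenario whose data has not grown yet). Semantically, TableA.columnA = TableB.columnA, and columnA is indexed. Yet the join query was painfully slow, typically taking 5-6 seconds, which is clearly not what you would expect.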

Below I use a concrete example that simulates the SQL query in question.

Reproducing the scenario

The user_info table. To keep the scenario as simple as possible, I mocked only three columns.

mysql> desc user_info;
+-------+--------------+------+-----+---------+----------------+
| Field | Type         | Null | Key | Default | Extra          |
+-------+--------------+------+-----+---------+----------------+
| id    | int(11)      | NO   | PRI | NULL    | auto_increment |
| uid   | varchar(64)  | NO   | MUL | NULL    |                |
| name  | varchar(255) | YES  |     | NULL    |                |
+-------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

The user_score table, whose uid has the same meaning as user_info.uid:

mysql> desc user_score;
+-------+--------------+------+-----+---------+----------------+
| Field | Type         | Null | Key | Default | Extra          |
+-------+--------------+------+-----+---------+----------------+
| id    | int(11)      | NO   | PRI | NULL    | auto_increment |
| uid   | varchar(64)  | NO   | MUL | NULL    |                |
| score | float        | YES  |     | NULL    |                |
+-------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

The data looks like this, all very ordinary:

mysql> select * from user_score limit 2;
+----+--------------------------------------+-------+
| id | uid                                  | score |
+----+--------------------------------------+-------+
|  5 | 111111111                            |   100 |
|  6 | 55116d58-be26-4eb7-8f7e-bd2d49fbb968 |   100 |
+----+--------------------------------------+-------+
2 rows in set (0.00 sec)

mysql> select * from user_info limit 2;
+----+--------------------------------------+-------------+
| id | uid                                  | name        |
+----+--------------------------------------+-------------+
|  1 | 111111111                            | tanglei     |
|  2 | 55116d58-be26-4eb7-8f7e-bd2d49fbb968 | hudsonemily |
+----+--------------------------------------+-------------+
2 rows in set (0.00 sec)

mysql> select count(*) from user_score
    -> union
    -> select count(*) from user_info;
+----------+
| count(*) |
+----------+
|        4 |
|  3000003 |
+----------+
2 rows in set (1.39 sec)

The indexes are:

mysql> show index from user_score;
+------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table      | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| user_score |          0 | PRIMARY   |            1 | id          | A         |           4 |     NULL | NULL   |      | BTREE      |         |               |
| user_score |          1 | index_uid |            1 | uid         | A         |           4 |     NULL | NULL   | YES  | BTREE      |         |               |
+------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)

mysql> show index from user_info;
+-----------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table     | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| user_info |          0 | PRIMARY   |            1 | id          | A         |     2989934 |     NULL | NULL   |      | BTREE      |         |               |
| user_info |          1 | index_uid |            1 | uid         | A         |     2989934 |     NULL | NULL   |      | BTREE      |         |               |
+-----------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)

The business query: given user_score.id, fetch the matching user_info rows with a join. (Set aside for now whether this scenario is reasonable.) The natural SQL is:

mysql> select * from user_score us
    -> inner join user_info ui on us.uid = ui.uid
    -> where us.id = 5;
+----+-----------+-------+---------+-----------+---------+
| id | uid       | score | id      | uid       | name    |
+----+-----------+-------+---------+-----------+---------+
|  5 | 111111111 |   100 |       1 | 111111111 | tanglei |
|  5 | 111111111 |   100 | 3685399 | 111111111 | tanglei |
|  5 | 111111111 |   100 | 3685400 | 111111111 | tanglei |
|  5 | 111111111 |   100 | 3685401 | 111111111 | tanglei |
|  5 | 111111111 |   100 | 3685402 | 111111111 | tanglei |
|  5 | 111111111 |   100 | 3685403 | 111111111 | tanglei |
+----+-----------+-------+---------+-----------+---------+
6 rows in set (1.18 sec)

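Please ignore the data itself: I first mocked 1 million rows and then imported them two more times, so there are duplicates. With 3 million rows, the query takes 1.18 seconds. By rights it should be much faster. As usual, let's see what explain has to say.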

mysql> explain
    -> select * from user_score us
    -> inner join user_info ui on us.uid = ui.uid
    -> where us.id = 5;
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
| id | select_type | table | type  | possible_keys     | key     | key_len | ref   | rows    | Extra       |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
|  1 | SIMPLE      | us    | const | PRIMARY,index_uid | PRIMARY | 4       | const |       1 | NULL        |
|  1 | SIMPLE      | ui    | ALL   | NULL              | NULL    | NULL    | NULL  | 2989934 | Using where |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
2 rows in set (0.00 sec)

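The user_info table is not using its index: a full table scan over nearly 3 million rows. That is the symptom, but why?

Take a moment to think about it: if you hit this situation, how would you investigate?

I went through a flurry of furious activity that came to nothing, trying several different ways of writing the SQL.

For example, swapping the join order (driving table / driven table):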

mysql> explain select * from user_info ui inner join user_score us on us.uid = ui.uid where us.id = 5;
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
| id | select_type | table | type  | possible_keys     | key     | key_len | ref   | rows    | Extra       |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
|  1 | SIMPLE      | us    | const | PRIMARY,index_uid | PRIMARY | 4       | const |       1 | NULL        |
|  1 | SIMPLE      | ui    | ALL   | NULL              | NULL    | NULL    | NULL  | 2989934 | Using where |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
2 rows in set (0.00 sec)

Or using a subquery:

mysql> explain select * from user_info where uid in  (select uid from user_score where id = 5);
+----+-------------+------------+-------+-------------------+---------+---------+-------+---------+-------------+
| id | select_type | table      | type  | possible_keys     | key     | key_len | ref   | rows    | Extra       |
+----+-------------+------------+-------+-------------------+---------+---------+-------+---------+-------------+
|  1 | SIMPLE      | user_score | const | PRIMARY,index_uid | PRIMARY | 4       | const |       1 | NULL        |
|  1 | SIMPLE      | user_info  | ALL   | NULL              | NULL    | NULL    | NULL  | 2989934 | Using where |
+----+-------------+------------+-------+-------------------+---------+---------+-------+---------+-------------+
2 rows in set (0.00 sec)

Still no luck. Yet a plain single-table query does use the index:

mysql> select * from user_info where uid = '111111111';
+---------+-----------+---------+
| id      | uid       | name    |
+---------+-----------+---------+
|       1 | 111111111 | tanglei |
| 3685399 | 111111111 | tanglei |
| 3685400 | 111111111 | tanglei |
| 3685401 | 111111111 | tanglei |
| 3685402 | 111111111 | tanglei |
| 3685403 | 111111111 | tanglei |
+---------+-----------+---------+
6 rows in set (0.01 sec)

mysql> explain select * from user_info where uid = '111111111';
+----+-------------+-----------+------+---------------+-----------+---------+-------+------+-----------------------+
| id | select_type | table     | type | possible_keys | key       | key_len | ref   | rows | Extra                 |
+----+-------------+-----------+------+---------------+-----------+---------+-------+------+-----------------------+
|  1 | SIMPLE      | user_info | ref  | index_uid     | index_uid | 194     | const |    6 | Using index condition |
+----+-------------+-----------+------+---------------+-----------+---------+-------+------+-----------------------+
1 row in set (0.01 sec)

Solving the problem

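I tried changing the search condition, for example filtering on uid directly in the join query, and the index still went unused. I was close to giving up. Before asking a DBA for help, I looked at the CREATE TABLE statements of the two tables.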

mysql> show create table user_info;
+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table     | Create Table                                                                                                                                                                                                                                                 |
+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| user_info | CREATE TABLE `user_info` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `uid` varchar(64) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `index_uid` (`uid`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=3685404 DEFAULT CHARSET=utf8 |
+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> show create table user_score;
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table      | Create Table                                                                                                                                                                                                                             |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| user_score | CREATE TABLE `user_score` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `uid` varchar(64) NOT NULL,
  `score` float DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `index_uid` (`uid`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8mb4 |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

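There is every reason to suspect that the mismatched character sets (utf8 on user_info, utf8mb4 on user_score) are what defeats the index. So I changed the small table's character set to match the big table (don't do this casually in a real production environment) and tested again.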
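The post does not show the statement used for this change. A minimal sketch, assuming the conversion was done at the table level, might look like the following. It rewrites the table, and converting from utf8mb4 down to utf8 can lose any 4-byte characters, so treat it as illustrative only.

```sql
-- Assumption: convert every character column of the small table to utf8
-- so that it matches user_info. This statement is not from the original post.
ALTER TABLE user_score CONVERT TO CHARACTER SET utf8;

-- Confirm the resulting table default and column character sets.
SHOW CREATE TABLE user_score;
```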

mysql> select * from user_score us
    -> inner join user_info ui on us.uid = ui.uid
    -> where us.id = 5;
+----+-----------+-------+---------+-----------+---------+
| id | uid       | score | id      | uid       | name    |
+----+-----------+-------+---------+-----------+---------+
|  5 | 111111111 |   100 |       1 | 111111111 | tanglei |
|  5 | 111111111 |   100 | 3685399 | 111111111 | tanglei |
|  5 | 111111111 |   100 | 3685400 | 111111111 | tanglei |
|  5 | 111111111 |   100 | 3685401 | 111111111 | tanglei |
|  5 | 111111111 |   100 | 3685402 | 111111111 | tanglei |
|  5 | 111111111 |   100 | 3685403 | 111111111 | tanglei |
+----+-----------+-------+---------+-----------+---------+
6 rows in set (0.00 sec)

mysql> explain
    -> select * from user_score us
    -> inner join user_info ui on us.uid = ui.uid
    -> where us.id = 5;
+----+-------------+-------+-------+-------------------+-----------+---------+-------+------+-------+
| id | select_type | table | type  | possible_keys     | key       | key_len | ref   | rows | Extra |
+----+-------------+-------+-------+-------------------+-----------+---------+-------+------+-------+
|  1 | SIMPLE      | us    | const | PRIMARY,index_uid | PRIMARY   | 4       | const |    1 | NULL  |
|  1 | SIMPLE      | ui    | ref   | index_uid         | index_uid | 194     | const |    6 | NULL  |
+----+-------------+-------+-------+-------------------+-----------+---------+-------+------+-------+
2 rows in set (0.00 sec)

Sure enough, it worked.

Digging for the root cause

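Digging deeper, the cause is the rule found in just about every MySQL coding guideline: "don't let an indexed column take part in a computation." In this case, had I known about explain extended + show warnings (I never knew explain accepted an extended keyword), the penny might have dropped much sooner. (From MySQL 5.7 the extended information is produced by default, and MySQL 8.0 removed the extended keyword, so there a plain explain followed by show warnings is enough.)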
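For readers on MySQL 5.7 or 8.0, the same check can be sketched without the extended keyword. This assumes the schema from this post and has not been run against the author's data:

```sql
-- Plain EXPLAIN already records the extended information on 5.7+.
EXPLAIN
SELECT *
FROM user_score us
INNER JOIN user_info ui ON us.uid = ui.uid
WHERE us.id = 5;

-- Must be issued right after the EXPLAIN in the same session; it prints the
-- statement as rewritten by the optimizer, including any implicit convert().
SHOW WARNINGS;
```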

Let's see it in action. (Ah, I have to change the character set back first!)

mysql> explain extended select * from user_score us  inner join user_info ui on us.uid = ui.uid where us.id = 5;
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+----------+-------------+
| id | select_type | table | type  | possible_keys     | key     | key_len | ref   | rows    | filtered | Extra       |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+----------+-------------+
|  1 | SIMPLE      | us    | const | PRIMARY,index_uid | PRIMARY | 4       | const |       1 |   100.00 | NULL        |
|  1 | SIMPLE      | ui    | ALL   | NULL              | NULL    | NULL    | NULL  | 2989934 |   100.00 | Using where |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
mysql> show warnings;
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                                              |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select '5' AS `id`,'111111111' AS `uid`,'100' AS `score`,`test`.`ui`.`id` AS `id`,`test`.`ui`.`uid` AS `uid`,`test`.`ui`.`name` AS `name` from `test`.`user_score` `us` join `test`.`user_info` `ui` where (('111111111' = convert(`test`.`ui`.`uid` using utf8mb4))) |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

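The indexed column takes part in a computation: ui.uid is wrapped in convert(... using utf8mb4), so every row has to be converted before it can be compared, index_uid cannot be used, and the result is a full table scan. How could that possibly be fast?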
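When altering either table is not an option, one possible workaround is to put the conversion on the small table's side explicitly, so that the indexed column ui.uid is compared as-is. This is a sketch rather than something tested in the original post. It assumes uid never contains 4-byte characters and that user_info.uid uses the default utf8 collation; whether the optimizer then picks index_uid should be confirmed with explain.

```sql
-- Assumption: uid values fit in 3-byte utf8, so converting the utf8mb4 side
-- down to utf8 loses nothing.
EXPLAIN
SELECT *
FROM user_score us
INNER JOIN user_info ui
        ON ui.uid = CONVERT(us.uid USING utf8)  -- convert the small side only
WHERE us.id = 5;
```

The intent is that ui.uid appears bare in the comparison, which is the condition the "indexed column must not take part in a computation" rule asks for.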

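As for why this happened: for historical reasons, the original table from the old business uses the "fake" utf8 (MySQL's utf8 stores at most 3 bytes per character), while the new table for the new business uses the real thing, utf8mb4.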

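When the new table was designed, nobody compared its character set with the existing database. In fact, different tables in the same database turned out to use different character sets, each creator apparently choosing by personal preference. This shows how much development standards matter.
Although I knew that indexed columns must not take part in a computation, both columns here have the same type, varchar(64), and a conversion still happened during the query. So a mismatch in column character sets should be treated as a mismatch in column types.
If MySQL applied a fail-fast philosophy to this case and refused to join columns whose character sets differ, instead of converting silently, would that be better?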
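To catch this kind of mismatch before it bites, the column character sets can be compared via information_schema. A minimal sketch, assuming the schema is named test (as it appears in the show warnings output above) and that the join column is called uid everywhere:

```sql
-- List the character set and collation of every uid column in the schema.
-- Rows that differ are candidates for an implicit convert() in joins.
SELECT table_name, column_name, character_set_name, collation_name
FROM information_schema.columns
WHERE table_schema = 'test'      -- assumption: schema name
  AND column_name = 'uid'        -- assumption: join column name
ORDER BY table_name;
```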

A question to think about

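Can you explain the following? Why do the two queries perform so differently? Pay attention to the SQL execution order, how the query optimizer works, and the Using join buffer (Block Nested Loop) shown in the plans. I recommend reading the official MySQL manual to understand the underlying principles.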

mysql> select * from user_info ui
    -> inner join user_score us on us.uid = ui.uid
    -> where us.uid = '111111111';
+---------+-----------+---------+----+-----------+-------+
| id      | uid       | name    | id | uid       | score |
+---------+-----------+---------+----+-----------+-------+
|       1 | 111111111 | tanglei |  5 | 111111111 |   100 |
| 3685399 | 111111111 | tanglei |  5 | 111111111 |   100 |
| 3685400 | 111111111 | tanglei |  5 | 111111111 |   100 |
| 3685401 | 111111111 | tanglei |  5 | 111111111 |   100 |
| 3685402 | 111111111 | tanglei |  5 | 111111111 |   100 |
| 3685403 | 111111111 | tanglei |  5 | 111111111 |   100 |
+---------+-----------+---------+----+-----------+-------+
6 rows in set (1.14 sec)

mysql> select * from user_info ui
    -> inner join user_score us on us.uid = ui.uid
    -> where ui.uid = '111111111';
+---------+-----------+---------+----+-----------+-------+
| id      | uid       | name    | id | uid       | score |
+---------+-----------+---------+----+-----------+-------+
|       1 | 111111111 | tanglei |  5 | 111111111 |   100 |
| 3685399 | 111111111 | tanglei |  5 | 111111111 |   100 |
| 3685400 | 111111111 | tanglei |  5 | 111111111 |   100 |
| 3685401 | 111111111 | tanglei |  5 | 111111111 |   100 |
| 3685402 | 111111111 | tanglei |  5 | 111111111 |   100 |
| 3685403 | 111111111 | tanglei |  5 | 111111111 |   100 |
+---------+-----------+---------+----+-----------+-------+
6 rows in set (0.00 sec)

mysql> explain
    -> select * from user_info ui
    -> inner join user_score us on us.uid = ui.uid
    -> where us.uid = '111111111';
+----+-------------+-------+------+---------------+-----------+---------+-------+---------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key       | key_len | ref   | rows    | Extra                                              |
+----+-------------+-------+------+---------------+-----------+---------+-------+---------+----------------------------------------------------+
|  1 | SIMPLE      | us    | ref  | index_uid     | index_uid | 258     | const |       1 | Using index condition                              |
|  1 | SIMPLE      | ui    | ALL  | NULL          | NULL      | NULL    | NULL  | 2989934 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+-----------+---------+-------+---------+----------------------------------------------------+
2 rows in set (0.00 sec)

mysql> explain
    -> select * from user_info ui
    -> inner join user_score us on us.uid = ui.uid
    -> where ui.uid = '111111111';
+----+-------------+-------+------+---------------+-----------+---------+-------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key       | key_len | ref   | rows | Extra                                              |
+----+-------------+-------+------+---------------+-----------+---------+-------+------+----------------------------------------------------+
|  1 | SIMPLE      | ui    | ref  | index_uid     | index_uid | 194     | const |    6 | Using index condition                              |
|  1 | SIMPLE      | us    | ALL  | index_uid     | NULL      | NULL    | NULL  |    4 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+-----------+---------+-------+------+----------------------------------------------------+
2 rows in set (0.01 sec)


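Note: the tests in this article were run on MySQL 5.6. The examples are only meant to illustrate the problem; the SQL is not exemplary (avoid select *, for instance), so please don't copy it (and I take no responsibility if you do). Writing this post took quite a bit of time, creating the database, loading mock data and so on, so if you found it useful, please give it a "Wow" and share it. The question above is left open for discussion; leave a comment with your thoughts.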
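For anyone who wants to reproduce the scenario, here is a minimal setup sketch. The table definitions follow the show create table output above; the data-loading approach (doubling the table with random UUIDs) is an assumption, not the author's actual mock-data procedure.

```sql
CREATE TABLE user_info (
  id   INT NOT NULL AUTO_INCREMENT,
  uid  VARCHAR(64) NOT NULL,
  name VARCHAR(255) DEFAULT NULL,
  PRIMARY KEY (id),
  KEY index_uid (uid)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE user_score (
  id    INT NOT NULL AUTO_INCREMENT,
  uid   VARCHAR(64) NOT NULL,
  score FLOAT DEFAULT NULL,
  PRIMARY KEY (id),
  KEY index_uid (uid)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

INSERT INTO user_info (uid, name) VALUES ('111111111', 'tanglei');
INSERT INTO user_score (uid, score) VALUES ('111111111', 100);

-- Assumption: each run doubles user_info with random uids; repeat until the
-- table is large enough for the full scan to become noticeable.
INSERT INTO user_info (uid, name)
SELECT UUID(), 'mock' FROM user_info;
```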

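A quick plug

Alibaba Cloud ECS (Elastic Compute Service) is one of Alibaba Cloud's most important cloud products. It is a simple, efficient computing service whose processing capacity scales elastically. We are committed to using and creating the industry's latest technology so that more customers can easily benefit from it, build more stable and secure applications in the cloud quickly, improve operations efficiency, reduce IT costs, and focus on innovating in their own core business. Elastic computing has redefined how people use computing resources, and this new model is shaping, and will keep shaping, the ecosystem and economy around computing resources. We are making history, and we sincerely invite you to join us.

The team has recently opened up quite a few positions and is hiring at the P6/P7/P8 levels. My group is mainly hiring backend engineers (the JD is linked in the original post); if you are interested, scan the QR code in the original post to contact me.

In addition, campus recruiting and internship hiring for the class of 2021 is under way (details are linked in the original post). If you graduate between 2020-11 and 2021-07 and are interested in Alibaba, you are also welcome to contact me for a referral.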