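(Forgive the clickbait title just this once. Spend a few minutes on this; it may help you.)
I recently ran into a "mysterious" problem at work. It may be useful to others, hence this post.
The problem, roughly: I have two tables, TableA and TableB. TableA has millions of rows (existing business data), while TableB has only a few rows (a new business scenario whose data has not grown yet). Semantically, TableA.columnA = TableB.columnA, and columnA is indexed. Yet the query was painfully slow, taking 5 to 6 seconds, which was clearly not what I expected.
Let me illustrate with a concrete example that simulates the SQL query in question.
The user_info table. To keep the scenario as simple as possible, I mocked only three of its columns.
mysql> desc user_info;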
+-------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| uid | varchar(64) | NO | MUL | NULL | |
| name | varchar(255) | YES | | NULL | |
+-------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
The user_score table, whose uid has the same meaning as user_info.uid:
mysql> desc user_score;
+-------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| uid | varchar(64) | NO | MUL | NULL | |
| score | float | YES | | NULL | |
+-------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
mysql> select * from user_score limit 2;
+----+--------------------------------------+-------+
| id | uid | score |
+----+--------------------------------------+-------+
| 5 | 111111111 | 100 |
| 6 | 55116d58-be26-4eb7-8f7e-bd2d49fbb968 | 100 |
+----+--------------------------------------+-------+
2 rows in set (0.00 sec)
mysql> select * from user_info limit 2;
+----+--------------------------------------+-------------+
| id | uid | name |
+----+--------------------------------------+-------------+
| 1 | 111111111 | tanglei |
| 2 | 55116d58-be26-4eb7-8f7e-bd2d49fbb968 | hudsonemily |
+----+--------------------------------------+-------------+
2 rows in set (0.00 sec)
mysql> select count(*) from user_score
-> union
-> select count(*) from user_info;
+----------+
| count(*) |
+----------+
| 4 |
| 3000003 |
+----------+
2 rows in set (1.39 sec)
mysql> show index from user_score;
+------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| user_score | 0 | PRIMARY | 1 | id | A | 4 | NULL | NULL | | BTREE | | |
| user_score | 1 | index_uid | 1 | uid | A | 4 | NULL | NULL | YES | BTREE | | |
+------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)
mysql> show index from user_info;
+-----------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| user_info | 0 | PRIMARY | 1 | id | A | 2989934 | NULL | NULL | | BTREE | | |
| user_info | 1 | index_uid | 1 | uid | A | 2989934 | NULL | NULL | | BTREE | | |
+-----------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)
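Given a user_score.id, we need to join to user_info to fetch the corresponding user information. (Please ignore for now whether this particular business scenario is reasonable.) The natural SQL looks like this:
mysql> select * from user_score us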
-> inner join user_info ui on us.uid = ui.uid
-> where us.id = 5;
+----+-----------+-------+---------+-----------+---------+
| id | uid | score | id | uid | name |
+----+-----------+-------+---------+-----------+---------+
| 5 | 111111111 | 100 | 1 | 111111111 | tanglei |
| 5 | 111111111 | 100 | 3685399 | 111111111 | tanglei |
| 5 | 111111111 | 100 | 3685400 | 111111111 | tanglei |
| 5 | 111111111 | 100 | 3685401 | 111111111 | tanglei |
| 5 | 111111111 | 100 | 3685402 | 111111111 | tanglei |
| 5 | 111111111 | 100 | 3685403 | 111111111 | tanglei |
+----+-----------+-------+---------+-----------+---------+
6 rows in set (1.18 sec)
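Ignore the data itself: I first mocked 1 million rows and then re-imported them twice, so there are duplicates. With 3 million rows, the query took 1.18 seconds, when it should have been much faster. As usual, let's see what explain says.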
mysql> explain
-> select * from user_score us
-> inner join user_info ui on us.uid = ui.uid
-> where us.id = 5;
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
| 1 | SIMPLE | us | const | PRIMARY,index_uid | PRIMARY | 4 | const | 1 | NULL |
| 1 | SIMPLE | ui | ALL | NULL | NULL | NULL | NULL | 2989934 | Using where |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
2 rows in set (0.00 sec)
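It turns out that user_info does not use its index at all and does a full table scan of nearly 3 million rows. That is the symptom, but why?
Take a moment to think: if you ran into this, how would you troubleshoot it?
At the time I threw everything I had at it, to no avail, trying many different ways of writing the SQL to do the same thing.
For example, swapping the join order (driving table / driven table):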
mysql> explain select * from user_info ui inner join user_score us on us.uid = ui.uid where us.id = 5;
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
| 1 | SIMPLE | us | const | PRIMARY,index_uid | PRIMARY | 4 | const | 1 | NULL |
| 1 | SIMPLE | ui | ALL | NULL | NULL | NULL | NULL | 2989934 | Using where |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+-------------+
2 rows in set (0.00 sec)
Or using a subquery:
mysql> explain select * from user_info where uid in (select uid from user_score where id = 5);
+----+-------------+------------+-------+-------------------+---------+---------+-------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+-------------------+---------+---------+-------+---------+-------------+
| 1 | SIMPLE | user_score | const | PRIMARY,index_uid | PRIMARY | 4 | const | 1 | NULL |
| 1 | SIMPLE | user_info | ALL | NULL | NULL | NULL | NULL | 2989934 | Using where |
+----+-------------+------------+-------+-------------------+---------+---------+-------+---------+-------------+
2 rows in set (0.00 sec)
In the end, still no luck. Yet a plain single-table query does use the index.
mysql> select * from user_info where uid = '111111111';
+---------+-----------+---------+
| id | uid | name |
+---------+-----------+---------+
| 1 | 111111111 | tanglei |
| 3685399 | 111111111 | tanglei |
| 3685400 | 111111111 | tanglei |
| 3685401 | 111111111 | tanglei |
| 3685402 | 111111111 | tanglei |
| 3685403 | 111111111 | tanglei |
+---------+-----------+---------+
6 rows in set (0.01 sec)
mysql> explain select * from user_info where uid = '111111111';
+----+-------------+-----------+------+---------------+-----------+---------+-------+------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+------+---------------+-----------+---------+-------+------+-----------------------+
| 1 | SIMPLE | user_info | ref | index_uid | index_uid | 194 | const | 6 | Using index condition |
+----+-------------+-----------+------+---------------+-----------+---------+-------+------+-----------------------+
1 row in set (0.01 sec)
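I tried changing the filter condition, for example filtering by uid directly in the join query, but the index still was not used, and I nearly gave up. Before turning to a DBA for help, I took a look at the tables' CREATE TABLE statements.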
mysql> show create table user_info;
+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| user_info | CREATE TABLE `user_info` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`uid` varchar(64) NOT NULL,
`name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_uid` (`uid`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=3685404 DEFAULT CHARSET=utf8 |
+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> show create table user_score;
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| user_score | CREATE TABLE `user_score` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`uid` varchar(64) NOT NULL,
`score` float DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_uid` (`uid`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8mb4 |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
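There was every reason to suspect that the character set mismatch (utf8 on user_info, utf8mb4 on user_score) was what kept the index from being used. So I changed the small table's character set to match the large table's (do not do this casually in a real production environment) and tested again.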
mysql> select * from user_score us
-> inner join user_info ui on us.uid = ui.uid
-> where us.id = 5;
+----+-----------+-------+---------+-----------+---------+
| id | uid | score | id | uid | name |
+----+-----------+-------+---------+-----------+---------+
| 5 | 111111111 | 100 | 1 | 111111111 | tanglei |
| 5 | 111111111 | 100 | 3685399 | 111111111 | tanglei |
| 5 | 111111111 | 100 | 3685400 | 111111111 | tanglei |
| 5 | 111111111 | 100 | 3685401 | 111111111 | tanglei |
| 5 | 111111111 | 100 | 3685402 | 111111111 | tanglei |
| 5 | 111111111 | 100 | 3685403 | 111111111 | tanglei |
+----+-----------+-------+---------+-----------+---------+
6 rows in set (0.00 sec)
mysql> explain
-> select * from user_score us
-> inner join user_info ui on us.uid = ui.uid
-> where us.id = 5;
+----+-------------+-------+-------+-------------------+-----------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+-------------------+-----------+---------+-------+------+-------+
| 1 | SIMPLE | us | const | PRIMARY,index_uid | PRIMARY | 4 | const | 1 | NULL |
| 1 | SIMPLE | ui | ref | index_uid | index_uid | 194 | const | 6 | NULL |
+----+-------------+-------+-------+-------------------+-----------+---------+-------+------+-------+
2 rows in set (0.00 sec)
Sure enough, it worked.
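The statement used to change the small table's character set is not shown here. As a rough sketch of how such a change could be made, assuming the uid values in user_score contain only characters that utf8 can store (4-byte characters such as emoji cannot be stored in MySQL's utf8), either of the following would align the join column with user_info:

```sql
-- Sketch only; rehearse on a copy of the table before touching production.
-- Option 1: convert every character column of the small table and
-- change its default character set. This rebuilds the table.
ALTER TABLE user_score CONVERT TO CHARACTER SET utf8;

-- Option 2 (narrower): change only the join column and leave the
-- table default character set as it is.
ALTER TABLE user_score
  MODIFY uid varchar(64) CHARACTER SET utf8 NOT NULL;
```

Converting the small table is cheap, but it moves the new table onto the more limited character set. Whether to do that, or to migrate the large table to utf8mb4 instead, is a judgment call that depends on the data and the maintenance window available.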
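Digging into the root cause, this is exactly what the various MySQL "rules" and conventions found online warn about: "do not let indexed columns take part in computation". In this case, had I known about the explain extended + show warnings tool (I never knew explain could take an extended keyword), I might have had my "aha" moment much sooner. (In MySQL 8.0 the extra keyword is no longer needed: extended was removed, and a plain explain followed by show warnings gives the same information.)
Let's see it in action. (Ah, I have to change the character set back first!)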
mysql> explain extended select * from user_score us inner join user_info ui on us.uid = ui.uid where us.id = 5;
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+----------+-------------+
| 1 | SIMPLE | us | const | PRIMARY,index_uid | PRIMARY | 4 | const | 1 | 100.00 | NULL |
| 1 | SIMPLE | ui | ALL | NULL | NULL | NULL | NULL | 2989934 | 100.00 | Using where |
+----+-------------+-------+-------+-------------------+---------+---------+-------+---------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
mysql> show warnings;
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select '5' AS `id`,'111111111' AS `uid`,'100' AS `score`,`test`.`ui`.`id` AS `id`,`test`.`ui`.`uid` AS `uid`,`test`.`ui`.`name` AS `name` from `test`.`user_score` `us` join `test`.`user_info` `ui` where (('111111111' = convert(`test`.`ui`.`uid` using utf8mb4))) |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
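The indexed column is taking part in a computation: ui.uid is wrapped in convert(... using utf8mb4), so every row's value has to be converted to the other character set before it can be compared, which means a full table scan. How could that possibly be fast?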
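When altering a table is not an option, another possible way to keep the indexed column out of the conversion is to convert the other side of the comparison explicitly. The query below is a minimal sketch, assuming the uid values in user_score contain only characters that utf8 can represent. It has not been run against the data set used here, so the resulting plan should be checked with explain:

```sql
-- Sketch: convert the small-table value instead of the indexed column,
-- so that ui.uid is compared in its own character set (utf8).
EXPLAIN
SELECT *
FROM user_score us
INNER JOIN user_info ui
        ON ui.uid = CONVERT(us.uid USING utf8)
WHERE us.id = 5;
```

If the plan for ui shows type ref with key index_uid, the conversion has moved off the indexed column. The trade-off is that every query joining these two tables has to remember the explicit CONVERT, which is easy to forget.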
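As for why this happened: in short, for historical reasons the original table from the old business scenario uses the "fake" utf8 (MySQL's utf8 stores at most 3 bytes per character), while the new table for the new business uses the real thing, utf8mb4. The difference is also visible in the key_len column of the explain output: 194 (64 × 3 + 2) for user_info.uid versus 258 (64 × 4 + 2) for user_score.uid.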
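Because the cause is a character set difference between columns that are meant to be joined, a query against information_schema could surface such mismatches before they reach production. The following is a sketch under two assumptions: the tables live in a schema named test (as the show warnings output suggests), and columns meant to be joined share the same name, as uid does here.

```sql
-- Sketch: list same-named columns in one schema whose character sets differ.
-- Non-string columns have a NULL character set and are filtered out
-- by the <> comparison.
SELECT a.TABLE_NAME         AS table_a,
       b.TABLE_NAME         AS table_b,
       a.COLUMN_NAME,
       a.CHARACTER_SET_NAME AS charset_a,
       b.CHARACTER_SET_NAME AS charset_b
FROM information_schema.COLUMNS a
JOIN information_schema.COLUMNS b
  ON  b.TABLE_SCHEMA = a.TABLE_SCHEMA
  AND b.COLUMN_NAME  = a.COLUMN_NAME
  AND b.TABLE_NAME   > a.TABLE_NAME   -- report each pair once
WHERE a.TABLE_SCHEMA = 'test'
  AND a.CHARACTER_SET_NAME <> b.CHARACTER_SET_NAME;
```

For the two tables above, this should report the uid pair (utf8 versus utf8mb4). Matching by column name is only a heuristic: join columns with different names will be missed, and unrelated columns that happen to share a name will be reported.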
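Both columns are declared as varchar(64), yet a type conversion still happened during the query. A character set mismatch between columns should therefore be treated the same as a data type mismatch.
If MySQL followed a fail-fast philosophy, would it be better to refuse the join outright when it detects the mismatch? (In practice MySQL is permissive here: it also joins char and varchar columns without complaint.)
Can you explain the following? Why do the two queries perform so differently? Pay attention to the SQL execution order, how the query optimizer works, and the Using join buffer (Block Nested Loop) note in the plans. I recommend reading the official MySQL manual to understand the underlying mechanics.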
mysql> select * from user_info ui
-> inner join user_score us on us.uid = ui.uid
-> where us.uid = '111111111';
+---------+-----------+---------+----+-----------+-------+
| id | uid | name | id | uid | score |
+---------+-----------+---------+----+-----------+-------+
| 1 | 111111111 | tanglei | 5 | 111111111 | 100 |
| 3685399 | 111111111 | tanglei | 5 | 111111111 | 100 |
| 3685400 | 111111111 | tanglei | 5 | 111111111 | 100 |
| 3685401 | 111111111 | tanglei | 5 | 111111111 | 100 |
| 3685402 | 111111111 | tanglei | 5 | 111111111 | 100 |
| 3685403 | 111111111 | tanglei | 5 | 111111111 | 100 |
+---------+-----------+---------+----+-----------+-------+
6 rows in set (1.14 sec)
mysql> select * from user_info ui
-> inner join user_score us on us.uid = ui.uid
-> where ui.uid = '111111111';
+---------+-----------+---------+----+-----------+-------+
| id | uid | name | id | uid | score |
+---------+-----------+---------+----+-----------+-------+
| 1 | 111111111 | tanglei | 5 | 111111111 | 100 |
| 3685399 | 111111111 | tanglei | 5 | 111111111 | 100 |
| 3685400 | 111111111 | tanglei | 5 | 111111111 | 100 |
| 3685401 | 111111111 | tanglei | 5 | 111111111 | 100 |
| 3685402 | 111111111 | tanglei | 5 | 111111111 | 100 |
| 3685403 | 111111111 | tanglei | 5 | 111111111 | 100 |
+---------+-----------+---------+----+-----------+-------+
6 rows in set (0.00 sec)
mysql> explain
-> select * from user_info ui
-> inner join user_score us on us.uid = ui.uid
-> where us.uid = '111111111';
+----+-------------+-------+------+---------------+-----------+---------+-------+---------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-----------+---------+-------+---------+----------------------------------------------------+
| 1 | SIMPLE | us | ref | index_uid | index_uid | 258 | const | 1 | Using index condition |
| 1 | SIMPLE | ui | ALL | NULL | NULL | NULL | NULL | 2989934 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+-----------+---------+-------+---------+----------------------------------------------------+
2 rows in set (0.00 sec)
mysql> explain
-> select * from user_info ui
-> inner join user_score us on us.uid = ui.uid
-> where ui.uid = '111111111';
+----+-------------+-------+------+---------------+-----------+---------+-------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-----------+---------+-------+------+----------------------------------------------------+
| 1 | SIMPLE | ui | ref | index_uid | index_uid | 194 | const | 6 | Using index condition |
| 1 | SIMPLE | us | ALL | index_uid | NULL | NULL | NULL | 4 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+-----------+---------+-------+------+----------------------------------------------------+
2 rows in set (0.01 sec)
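Note: the tests here were run on MySQL 5.6. Also, the examples exist only to illustrate the problem; the SQL is not exemplary (for instance, avoid select * where you can), so please do not copy it (and I take no responsibility if you do). Writing this up took quite some time, what with setting up the DB and loading mock data; if you found it useful, please help by liking and sharing it. Finally, I leave the question above open for discussion, and welcome your thoughts in the comments.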