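Original author: 楊濤濤
Before the demonstration, let me introduce two concepts: cardinality and HINT.
Before the query optimizer generates candidate execution plans, it has to pull data from the table statistics so that it can estimate how many rows each step will touch. That data is the cardinality. Put simply, it is the number of distinct values a column holds.
For example, suppose table t1 has 100 rows and one of its columns is f1. The number of distinct values in f1 could be 100, could be 1, or could be anything in between. That count of distinct values is the column's cardinality.
This explains why indexes belong on high-cardinality columns, while an index on a low-cardinality column can end up slower than a full table scan. That is only one aspect of the topic; a deeper discussion is beyond the scope of this post.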
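As a rough illustration, the real cardinality of f1 can be compared with what the statistics report. The first query counts distinct values directly. SHOW INDEX reports an estimated Cardinality only for indexed columns, so the second statement is informative only if f1 has an index, which is an assumption here.

```sql
-- Exact number of distinct values in f1 versus the total row count.
SELECT COUNT(DISTINCT f1) AS distinct_values,
       COUNT(*)           AS total_rows
FROM t1;

-- Estimated cardinality as stored in the index statistics
-- (assumes f1 is indexed; the Cardinality column is an estimate, not an exact count).
SHOW INDEX FROM t1;
```

The closer distinct_values is to total_rows, the more selective an index on that column is likely to be.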
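Next, what a HINT is and when to use one.
Put simply, a HINT is a way to manually assist the MySQL optimizer in specific situations so that it produces the best execution plan. In general the plan the optimizer picks is already the best one, but in certain scenarios it may not be.
For example: table t1 has gone through a large number of frequent changes (UPDATE, DELETE, INSERT), and its cardinality is now badly out of date. If a SQL statement happens to run at that point, its execution plan may not be optimal. Why only "may"?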
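Take the following two statements:
A: select * from t1 where f1 = 20;
B: select * from t1 where f1 = 30;
Suppose 30 happens to be the value of f1 that is being updated frequently, and the threshold at which MySQL automatically refreshes the cardinality has not been reached, or the user has switched to manual updates, or the user has reduced the number of sample pages. In that case, of these two statements, the one whose plan may be inaccurate is B.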
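As an aside, MySQL provides both automatic and manual ways to update a table's cardinality. Space is limited here, so consult the manual if you need the details.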
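A minimal sketch of both approaches for an InnoDB table, assuming persistent statistics are in use. The sample-page value 64 is an arbitrary illustration, not a recommendation.

```sql
-- Manual refresh: recompute the index statistics for t1 right now.
ANALYZE TABLE t1;

-- Inspect the server-wide InnoDB statistics settings
-- (for example innodb_stats_auto_recalc and innodb_stats_persistent_sample_pages).
SHOW VARIABLES LIKE 'innodb_stats%';

-- Per-table settings: turn automatic recalculation off (manual mode)
-- and sample more pages per index whenever statistics are computed.
ALTER TABLE t1 STATS_AUTO_RECALC = 0, STATS_SAMPLE_PAGES = 64;
```

With automatic recalculation off, the statistics for t1 change only when ANALYZE TABLE is run, which is exactly the situation in which a stale cardinality can mislead the optimizer.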
Back to the main topic. MySQL 8.0 brings several new HINTs; today I will use index_merge as the example.
Sample table structure:
mysql> desc t1;
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| id         | int(11)      | NO   | PRI | NULL    | auto_increment |
| rank1      | int(11)      | YES  | MUL | NULL    |                |
| rank2      | int(11)      | YES  | MUL | NULL    |                |
| log_time   | datetime     | YES  | MUL | NULL    |                |
| prefix_uid | varchar(100) | YES  |     | NULL    |                |
| desc1      | text         | YES  |     | NULL    |                |
| rank3      | int(11)      | YES  | MUL | NULL    |                |
+------------+--------------+------+-----+---------+----------------+
7 rows in set (0.00 sec)
Row count:
mysql> select count(*) from t1;
+----------+
| count(*) |
+----------+
|    32768 |
+----------+
1 row in set (0.01 sec)
Here are two classic SQL statements:
C: select * from t1 where rank1 = 1 or rank2 = 2 or rank3 = 2;
D: select * from t1 where rank1 = 100 and rank2 = 100 and rank3 = 100;
Table t1 has a separate secondary index on each of the three columns rank1, rank2, and rank3.
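For readers who want to reproduce the test, the table could be recreated roughly as follows. This is a reconstruction from the desc output and from the index names that appear in the query plans, not the author's actual DDL. The name of the log_time index is not shown anywhere in the source, so idx_log_time is a guess, and the storage engine and character set are left at the server defaults.

```sql
-- Reconstructed sketch of t1; int(11) in the desc output is written as INT here.
CREATE TABLE t1 (
  id         INT NOT NULL AUTO_INCREMENT,
  rank1      INT DEFAULT NULL,
  rank2      INT DEFAULT NULL,
  log_time   DATETIME DEFAULT NULL,
  prefix_uid VARCHAR(100) DEFAULT NULL,
  desc1      TEXT,
  rank3      INT DEFAULT NULL,
  PRIMARY KEY (id),
  KEY idx_rank1 (rank1),
  KEY idx_rank2 (rank2),
  KEY idx_rank3 (rank3),
  KEY idx_log_time (log_time)  -- hypothetical index name
);
```

The data distribution is not described in the source, so plans and costs on a freshly populated copy will differ from the ones shown in this post.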
Let's look at the query plan for SQL C.
mysql> explain format=json select * from t1 where rank1 =1 or rank2 = 2 or rank3 = 2\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "3243.65"
    },
    "table": {
      "table_name": "t1",
      "access_type": "ALL",
      "possible_keys": [
        "idx_rank1",
        "idx_rank2",
        "idx_rank3"
      ],
      "rows_examined_per_scan": 32034,
      "rows_produced_per_join": 115,
      "filtered": "0.36",
      "cost_info": {
        "read_cost": "3232.07",
        "eval_cost": "11.58",
        "prefix_cost": "3243.65",
        "data_read_per_join": "49K"
      },
      "used_columns": [
        "id",
        "rank1",
        "rank2",
        "log_time",
        "prefix_uid",
        "desc1",
        "rank3"
      ],
      "attached_condition": "((`ytt`.`t1`.`rank1` = 1) or (`ytt`.`t1`.`rank2` = 2) or (`ytt`.`t1`.`rank3` = 2))"
    }
  }
}
1 row in set, 1 warning (0.00 sec)
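Clearly no index is used: the access type is ALL, 32034 rows are scanned, and the cost is 3243.65.
Now let's add the hint to the same query and look at the plan again.
This time index_merge is used, taking the union of the three indexes. The number of rows scanned is 1103 and the cost is 441.09, so the estimated cost is roughly seven times lower than before.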
mysql> explain format=json select /*+ index_merge(t1) */ * from t1 where rank1 =1 or rank2 = 2 or rank3 = 2\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "441.09"
    },
    "table": {
      "table_name": "t1",
      "access_type": "index_merge",
      "possible_keys": [
        "idx_rank1",
        "idx_rank2",
        "idx_rank3"
      ],
      "key": "union(idx_rank1,idx_rank2,idx_rank3)",
      "key_length": "5,5,5",
      "rows_examined_per_scan": 1103,
      "rows_produced_per_join": 1103,
      "filtered": "100.00",
      "cost_info": {
        "read_cost": "330.79",
        "eval_cost": "110.30",
        "prefix_cost": "441.09",
        "data_read_per_join": "473K"
      },
      "used_columns": [
        "id",
        "rank1",
        "rank2",
        "log_time",
        "prefix_uid",
        "desc1",
        "rank3"
      ],
      "attached_condition": "((`ytt`.`t1`.`rank1` = 1) or (`ytt`.`t1`.`rank2` = 2) or (`ytt`.`t1`.`rank3` = 2))"
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Now let's look at the plans for SQL D, first without the hint and then with it:
mysql> explain format=json select * from t1 where rank1 =100 and rank2 =100 and rank3 =100\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "534.34"
    },
    "table": {
      "table_name": "t1",
      "access_type": "ref",
      "possible_keys": [
        "idx_rank1",
        "idx_rank2",
        "idx_rank3"
      ],
      "key": "idx_rank1",
      "used_key_parts": [
        "rank1"
      ],
      "key_length": "5",
      "ref": [
        "const"
      ],
      "rows_examined_per_scan": 555,
      "rows_produced_per_join": 0,
      "filtered": "0.07",
      "cost_info": {
        "read_cost": "478.84",
        "eval_cost": "0.04",
        "prefix_cost": "534.34",
        "data_read_per_join": "176"
      },
      "used_columns": [
        "id",
        "rank1",
        "rank2",
        "log_time",
        "prefix_uid",
        "desc1",
        "rank3"
      ],
      "attached_condition": "((`ytt`.`t1`.`rank3` = 100) and (`ytt`.`t1`.`rank2` = 100))"
    }
  }
}
1 row in set, 1 warning (0.00 sec)
mysql> explain format=json select /*+ index_merge(t1)*/ * from t1 where rank1 =100 and rank2 =100 and rank3 =100\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "5.23"
    },
    "table": {
      "table_name": "t1",
      "access_type": "index_merge",
      "possible_keys": [
        "idx_rank1",
        "idx_rank2",
        "idx_rank3"
      ],
      "key": "intersect(idx_rank1,idx_rank2,idx_rank3)",
      "key_length": "5,5,5",
      "rows_examined_per_scan": 1,
      "rows_produced_per_join": 1,
      "filtered": "100.00",
      "cost_info": {
        "read_cost": "5.13",
        "eval_cost": "0.10",
        "prefix_cost": "5.23",
        "data_read_per_join": "440"
      },
      "used_columns": [
        "id",
        "rank1",
        "rank2",
        "log_time",
        "prefix_uid",
        "desc1",
        "rank3"
      ],
      "attached_condition": "((`ytt`.`t1`.`rank3` = 100) and (`ytt`.`t1`.`rank2` = 100) and (`ytt`.`t1`.`rank1` = 100))"
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Comparing the two, the estimated cost with the HINT is about 100 times lower than without it (5.23 versus 534.34).
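The hint can also be narrowed to specific indexes, or inverted with NO_INDEX_MERGE. The following is a sketch using the same table and index names; which form suits a given query depends on the data, so each variant is worth checking with EXPLAIN rather than assuming it helps.

```sql
-- Restrict index merge to two of the three indexes.
EXPLAIN FORMAT=JSON
SELECT /*+ INDEX_MERGE(t1 idx_rank1, idx_rank2) */ *
FROM t1
WHERE rank1 = 100 AND rank2 = 100 AND rank3 = 100;

-- Forbid index merge on t1, to compare against the hinted plan.
EXPLAIN FORMAT=JSON
SELECT /*+ NO_INDEX_MERGE(t1) */ *
FROM t1
WHERE rank1 = 1 OR rank2 = 2 OR rank3 = 2;
```

Naming the indexes keeps the hint from pulling in an index that happens to be unselective for a particular query.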
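To sum up, a table's cardinality affects the query plans for that table. If the value has not been refreshed properly, you may need to add a HINT by hand. I expect future MySQL versions to bring more HINTs.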