Let's start by looking at an execution plan.
(root@localhost) [test]> desc select * from l;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
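id: there is an unwritten rule: rows with the same id are read from top to bottom; rows with different ids are read from bottom to top.
type: the main optimization targets are index and ALL. There are two situations in which index can reasonably be left alone:
  the query reads only indexed columns and never goes back to the table
  the index is used for sorting or aggregation
possible_keys: the indexes the optimizer could use
key: the index the optimizer actually chose
key_len: the number of bytes of the index that are used
rows: the number of rows the optimizer estimates
filtered: the percentage of rows that remain after the conditions are applied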
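The two cases in which a full index scan (type index) is acceptable can be checked directly with DESC. The statements below are a minimal sketch, assuming the dbt3 schema used later in this post and assuming that i_l_shipdate is an index on lineitem(l_shipdate) only; the exact plan depends on the server version and statistics, so no output is shown.

```sql
-- Case 1: only an indexed column is selected, so the rows never have to be
-- fetched from the table. Extra would be expected to show "Using index".
DESC SELECT l_shipdate
FROM lineitem;

-- Case 2: the index order is used for aggregation, so no separate
-- sort or temporary table should be needed for the GROUP BY.
DESC SELECT l_shipdate, COUNT(*)
FROM lineitem
GROUP BY l_shipdate;

-- Counter-example: selecting every column means each row has to be read
-- in full, so the same index no longer covers the query.
DESC SELECT *
FROM lineitem
ORDER BY l_shipdate;
```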
(root@localhost) [dbt3]> DESC SELECT
    ->     *
    -> FROM
    ->     part
    -> WHERE
    ->     p_partkey IN (SELECT
    ->             l_partkey
    ->         FROM
    ->             lineitem
    ->         WHERE
    ->             l_shipdate BETWEEN '1997-01-01' AND '1997-02-01')
    -> ORDER BY p_retailprice DESC
    -> LIMIT 10;
+----+--------------+-------------+------------+--------+----------------------------------------------+--------------+---------+---------------------+--------+----------+----------------------------------+
| id | select_type  | table       | partitions | type   | possible_keys                                | key          | key_len | ref                 | rows   | filtered | Extra                            |
+----+--------------+-------------+------------+--------+----------------------------------------------+--------------+---------+---------------------+--------+----------+----------------------------------+
|  1 | SIMPLE       | part        | NULL       | ALL    | PRIMARY                                      | NULL         | NULL    | NULL                | 197706 |   100.00 | Using where; Using filesort      |
|  1 | SIMPLE       | <subquery2> | NULL       | eq_ref | <auto_key>                                   | <auto_key>   | 5       | dbt3.part.p_partkey |      1 |   100.00 | NULL                             |
|  2 | MATERIALIZED | lineitem    | NULL       | range  | i_l_shipdate,i_l_suppkey_partkey,i_l_partkey | i_l_shipdate | 4       | NULL                | 138672 |   100.00 | Using index condition; Using MRR |
+----+--------------+-------------+------------+--------+----------------------------------------------+--------------+---------+---------------------+--------+----------+----------------------------------+
3 rows in set, 1 warning (0.01 sec)
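id  execution order
1   ②  part (the outer table) is joined with <subquery2> (the table of roughly 140,000 rows produced by id=2). Every row of part has to be joined, about 190,000 rows in total, matched on l_partkey, and the final sort shows up as Using filesort.
1   ③  The inner table needs an index, so the MySQL optimizer automatically adds a unique index to the data produced in step ①. The values inside IN are deduplicated (this is in effect a materialization), which is why the index is unique. eq_ref means the join goes through that unique index, matched against p_partkey of the outer table.
2   ①  lineitem is read first, as a range scan on the i_l_shipdate index. l_shipdate is a DATE, which is stored in 3 bytes; the key_len of 4 presumably includes one extra byte for the NULL flag of a nullable column. The estimate is about 140,000 rows, with filtered at 100%. MATERIALIZED means an actual table is produced, and a unique index is added on l_partkey (the values inside IN are deduplicated).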
Note one detail:
(root@localhost) [dbt3]> DESC SELECT
    ->     *
    -> FROM
    ->     part
    -> WHERE
    ->     p_partkey IN (SELECT
    ->             l_partkey
    ->         FROM
    ->             lineitem
    ->         WHERE
    ->             l_shipdate BETWEEN '1997-01-01' AND '1997-01-07')
    -> ORDER BY p_retailprice DESC
    -> LIMIT 10;
+----+--------------+-------------+------------+--------+----------------------------------------------+--------------+---------+-----------------------+-------+----------+----------------------------------------------+
| id | select_type  | table       | partitions | type   | possible_keys                                | key          | key_len | ref                   | rows  | filtered | Extra                                        |
+----+--------------+-------------+------------+--------+----------------------------------------------+--------------+---------+-----------------------+-------+----------+----------------------------------------------+
|  1 | SIMPLE       | <subquery2> | NULL       | ALL    | NULL                                         | NULL         | NULL    | NULL                  |  NULL |   100.00 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE       | part        | NULL       | eq_ref | PRIMARY                                      | PRIMARY      | 4       | <subquery2>.l_partkey |     1 |   100.00 | NULL                                         |
|  2 | MATERIALIZED | lineitem    | NULL       | range  | i_l_shipdate,i_l_suppkey_partkey,i_l_partkey | i_l_shipdate | 4       | NULL                  | 29148 |   100.00 | Using index condition; Using MRR             |
+----+--------------+-------------+------------+--------+----------------------------------------------+--------------+---------+-----------------------+-------+----------+----------------------------------------------+
3 rows in set, 1 warning (0.00 sec)
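With the narrower date range (an estimated 29,148 rows instead of 138,672), the driving table becomes <subquery2>: this time the optimizer uses the subquery as the outer table, which shows that the optimizer is quite smart.
For an IN subquery, the optimizer rewrites the query as a join for you and also decides whether the subquery ends up as the inner table or the outer table.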
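The "1 warning" reported after each DESC above is where the rewritten statement can be read. As a minimal sketch (the exact text of the note varies by MySQL version, so it is not reproduced here), running SHOW WARNINGS right after the DESC prints the query as the optimizer sees it after its transformations, which is one way to confirm that the IN subquery was turned into a join:

```sql
-- Same statement as above; DESC is a synonym for EXPLAIN.
DESC SELECT *
FROM part
WHERE p_partkey IN (SELECT l_partkey
                    FROM lineitem
                    WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-01-07')
ORDER BY p_retailprice DESC
LIMIT 10;

-- Must be run in the same session, immediately after the DESC.
-- The Message column of the resulting note holds the rewritten query.
SHOW WARNINGS;
```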
(root@localhost) [dbt3]> DESC select
    ->     a.*
    -> from
    ->     part a,
    ->     (select distinct
    ->         l_partkey
    ->     from
    ->         lineitem
    ->     where l_shipdate between '1997-01-01' and '1997-02-01') b
    -> where
    ->     a.p_partkey=b.l_partkey
    -> order by a.p_retailprice desc
    -> limit 10;
+----+-------------+------------+------------+--------+----------------------------------------------+--------------+---------+-------------+--------+----------+---------------------------------------------------+
| id | select_type | table      | partitions | type   | possible_keys                                | key          | key_len | ref         | rows   | filtered | Extra                                             |
+----+-------------+------------+------------+--------+----------------------------------------------+--------------+---------+-------------+--------+----------+---------------------------------------------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL    | NULL                                         | NULL         | NULL    | NULL        | 138672 |   100.00 | Using where; Using temporary; Using filesort      |
|  1 | PRIMARY     | a          | NULL       | eq_ref | PRIMARY                                      | PRIMARY      | 4       | b.l_partkey |      1 |   100.00 | NULL                                              |
|  2 | DERIVED     | lineitem   | NULL       | range  | i_l_shipdate,i_l_suppkey_partkey,i_l_partkey | i_l_shipdate | 4       | NULL        | 138672 |   100.00 | Using index condition; Using MRR; Using temporary |
+----+-------------+------------+------------+--------+----------------------------------------------+--------------+---------+-------------+--------+----------+---------------------------------------------------+
3 rows in set, 1 warning (0.00 sec)
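Rewritten this way, b is always the outer table. The subquery only produces a derived table, and you cannot define an index on it yourself. If the subquery returns a large result set, this performs worse than IN, because with IN the optimizer can use the subquery result as the inner table.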
(root@localhost) [dbt3]> DESC select max(l_extendedprice)
    -> from orders,lineitem
    -> where o_orderdate between '1995-01-01' and '1995-01-31'
    -> and l_orderkey=o_orderkey;
+----+-------------+----------+------------+-------+--------------------------------------------+---------------+---------+------------------------+-------+----------+--------------------------+
| id | select_type | table    | partitions | type  | possible_keys                              | key           | key_len | ref                    | rows  | filtered | Extra                    |
+----+-------------+----------+------------+-------+--------------------------------------------+---------------+---------+------------------------+-------+----------+--------------------------+
|  1 | SIMPLE      | orders   | NULL       | range | PRIMARY,i_o_orderdate                      | i_o_orderdate | 4       | NULL                   | 40696 |   100.00 | Using where; Using index |
|  1 | SIMPLE      | lineitem | NULL       | ref   | PRIMARY,i_l_orderkey,i_l_orderkey_quantity | PRIMARY       | 4       | dbt3.orders.o_orderkey |     3 |   100.00 | NULL                     |
+----+-------------+----------+------------+-------+--------------------------------------------+---------------+---------+------------------------+-------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
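There is an index on orderkey (i_l_orderkey), but it is not used; the primary key is used instead. orders is the outer table: its rows are filtered by the date condition first, and the result is then joined to lineitem through lineitem's primary key, with orders.o_orderkey as the join column.
Forcing the orderkey index would be expensive, because every match would need a lookup back into the table; going through the primary key avoids that lookup.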
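To see what forcing the secondary index would look like, an index hint can be added to the same query. This is only a sketch for comparing the two plans: i_l_orderkey is taken from the possible_keys column above, and the resulting plan and cost on any given server are not shown here because they depend on the data and version.

```sql
-- Same join as above, but lineitem is told to use the secondary index
-- i_l_orderkey instead of letting the optimizer pick the primary key.
DESC SELECT MAX(l_extendedprice)
FROM orders,
     lineitem FORCE INDEX (i_l_orderkey)
WHERE o_orderdate BETWEEN '1995-01-01' AND '1995-01-31'
  AND l_orderkey = o_orderkey;
```

Since l_extendedprice is presumably not part of i_l_orderkey, each index match would have to fetch the full row through the primary key, which is the extra lookup described above.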
(root@localhost) [dbt3]> DESC select *
    -> from
    ->     lineitem
    -> where
    ->     l_shipdate <= '1995-12-32'
    -> union
    -> select
    ->     *
    -> from
    ->     lineitem
    -> where
    ->     l_shipdate >= '1997-01-01';
+----+--------------+------------+------------+------+---------------+------+---------+------+---------+----------+-----------------+
| id | select_type  | table      | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra           |
+----+--------------+------------+------------+------+---------------+------+---------+------+---------+----------+-----------------+
|  1 | PRIMARY      | lineitem   | NULL       | ALL  | i_l_shipdate  | NULL | NULL    | NULL | 5409799 |    33.33 | Using where     |
|  2 | UNION        | lineitem   | NULL       | ALL  | i_l_shipdate  | NULL | NULL    | NULL | 5409799 |    50.00 | Using where     |
|NULL| UNION RESULT | <union1,2> | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    NULL |     NULL | Using temporary |
+----+--------------+------------+------------+------+---------------+------+---------+------+---------+----------+-----------------+
3 rows in set, 3 warnings (0.10 sec)
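UNION RESULT merges the two result sets and shows Using temporary. Because UNION deduplicates, a temporary table is created with a unique index on it. Each branch of the UNION is also planned on its own and can pick its own index (both list i_l_shipdate under possible_keys here, although the optimizer chose full scans in this particular plan), so the claim that one SQL statement can only use one index is wrong.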
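If the two branches cannot return the same row, the deduplication step is unnecessary and UNION ALL can be used instead. The sketch below assumes that '1995-12-31' was the intended upper bound (the recorded session uses '1995-12-32', which is not a valid date) and that lineitem has a primary key, so SELECT * cannot produce duplicate rows across the two disjoint date ranges. Whether the temporary table disappears from the plan depends on the MySQL version.

```sql
-- UNION ALL keeps every row from both branches and skips deduplication,
-- so no unique index on a temporary table is needed for that purpose.
DESC SELECT *
FROM lineitem
WHERE l_shipdate <= '1995-12-31'
UNION ALL
SELECT *
FROM lineitem
WHERE l_shipdate >= '1997-01-01';
```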
(root@localhost) [employees]> DESC SELECT
    ->     emp_no,
    ->     dept_no,
    ->     (SELECT
    ->             COUNT(1)
    ->         FROM
    ->             dept_emp t2
    ->         WHERE
    ->             t1.emp_no <= t2.emp_no) AS row_num
    -> FROM
    ->     dept_emp t1;
+----+--------------------+-------+------------+-------+----------------+--------+---------+------+--------+----------+------------------------------------------------+
| id | select_type        | table | partitions | type  | possible_keys  | key    | key_len | ref  | rows   | filtered | Extra                                          |
+----+--------------------+-------+------------+-------+----------------+--------+---------+------+--------+----------+------------------------------------------------+
|  1 | PRIMARY            | t1    | NULL       | index | NULL           | emp_no | 4       | NULL | 331570 |   100.00 | Using index                                    |
|  2 | DEPENDENT SUBQUERY | t2    | NULL       | ALL   | PRIMARY,emp_no | NULL   | NULL    | NULL | 331570 |    33.33 | Range checked for each record (index map: 0x3) |
+----+--------------------+-------+------------+-------+----------------+--------+---------+------+--------+----------+------------------------------------------------+
2 rows in set, 2 warnings (0.00 sec)
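For this query, id 1 is executed before id 2. id 2 is a DEPENDENT SUBQUERY, meaning it depends on values from the outer query, so id 1 has to run first: t1 is the outer table and t2 is the inner table. Each outer row is matched against an estimated 330,000 × 33% ≈ 110,000 inner rows, and this is repeated for all 330,000 outer rows, so the total is roughly 330,000 × 110,000 row comparisons.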
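The correlated COUNT above is re-evaluated for every row of t1, which is where the quadratic cost comes from. As a sketch of one alternative, assuming MySQL 8.0 or later (window functions are not available in the 5.7-style session shown in this post), the same number can be computed in a single pass:

```sql
-- COUNT(*) OVER (ORDER BY emp_no DESC) uses the default frame
-- RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which includes peer
-- rows, so each row gets the number of rows whose emp_no is >= its own.
-- That matches the condition t1.emp_no <= t2.emp_no in the original query.
SELECT emp_no,
       dept_no,
       COUNT(*) OVER (ORDER BY emp_no DESC) AS row_num
FROM dept_emp;
```

If a strictly unique numbering is wanted instead (rows sharing an emp_no get the same value in both the original and this version), ROW_NUMBER() OVER (ORDER BY emp_no DESC) would be the usual choice, at the price of no longer being an exact equivalent.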
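This is the row-number problem, and its performance is very poor.
Normally the result of a query does not need to be materialized. Materialization means that the subquery produces a table, deduplicated, with a unique key added, so there is an index on it.
Suppose A is joined to B and B comes from a subquery. Ordinarily B's rows would just be held in memory and joined directly to A, and B would have to be the outer table because it has no index. With IN, the result is deduplicated and given a unique key, so it can serve as either the outer table or the inner table. That is materialization.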
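One way to see how much the materialization strategy matters is to switch it off for the current session and compare plans. This is a sketch using the optimizer_switch system variable; which alternative strategy the optimizer falls back to is version dependent, so no output is shown.

```sql
-- Disable subquery materialization for this session only.
SET SESSION optimizer_switch = 'materialization=off';

-- The MATERIALIZED row and <subquery2> should no longer appear in the plan.
DESC SELECT *
FROM part
WHERE p_partkey IN (SELECT l_partkey
                    FROM lineitem
                    WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-02-01')
ORDER BY p_retailprice DESC
LIMIT 10;

-- Restore the server default for this flag.
SET SESSION optimizer_switch = 'materialization=default';
```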
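This is quite handy: it lets you look at the execution cost of a SQL statement.
The MySQL optimizer picks the plan with the lowest cost as the execution plan.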
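Two built-in ways to look at the optimizer's cost estimate are sketched below, reusing the orders/lineitem join from earlier in this post. The numbers are internal cost units rather than seconds, and they are not reproduced here because they depend on the data and server version.

```sql
-- EXPLAIN FORMAT=JSON reports the estimated cost of the chosen plan
-- (look for "query_cost" under "cost_info" in the JSON output).
EXPLAIN FORMAT=JSON
SELECT MAX(l_extendedprice)
FROM orders, lineitem
WHERE o_orderdate BETWEEN '1995-01-01' AND '1995-01-31'
  AND l_orderkey = o_orderkey;

-- After actually running a simple query, the session status variable
-- Last_query_cost holds the cost the optimizer computed for it.
SELECT MAX(l_extendedprice)
FROM orders, lineitem
WHERE o_orderdate BETWEEN '1995-01-01' AND '1995-01-31'
  AND l_orderkey = o_orderkey;

SHOW SESSION STATUS LIKE 'Last_query_cost';
```

Last_query_cost is only maintained for simple, flat queries; for statements with subqueries or UNION it is reported as 0, so the JSON plan is the more general of the two.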