MySQL: using optimizer_trace to inspect the execution flow, then analyze and verify optimization ideas

Date: 2019-11-09



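This post is a set of practice notes I wrote after reading 《MySQL實戰45講》. It is fairly dry. If you run into unfamiliar keywords here, I strongly suggest running the experiments yourself: you only really understand the material once you have verified where each number comes from.

Background

MySQL version: 5.7
Business requirement: find the 10 most-read articles of the past month
To compare the experiments later on, I added 3 indexes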

CREATE TABLE `article_rank` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `aid` int(11) unsigned NOT NULL,
  `pv` int(11) unsigned NOT NULL DEFAULT '1',
  `day` int(11) NOT NULL COMMENT 'date, e.g. 20171016',
  PRIMARY KEY (`id`),
  KEY `idx_day` (`day`),
  KEY `idx_day_aid_pv` (`day`,`aid`,`pv`),
  KEY `idx_aid_day_pv` (`aid`,`day`,`pv`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

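How the experiment works

Optimizer Trace is a feature added in MySQL 5.6.3. It writes the decisions and execution steps of the MySQL optimizer out as text in JSON format, which makes the result convenient both for programs to analyze and for people to read.

Use session_status in the performance_schema database to count the rows read by InnoDB
Use optimizer_trace in the information_schema database to view the details of how a statement was executed

All the experiments below follow these steps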

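#0. If optimizer_trace was enabled earlier, turn it off first
SET optimizer_trace="enabled=off";

#1. Enable optimizer_trace
SET optimizer_trace='enabled=on';

#2. Record the number of rows already read before running the target SQL
select VARIABLE_VALUE into @a from performance_schema.session_status where variable_name = 'Innodb_rows_read';

#3. Run the SQL under test (one of the four experiment statements)
todo

#4. Query the optimizer_trace details (\G already ends the statement, so no trailing semicolon)
select trace from `information_schema`.`optimizer_trace`\G

#5. Record the number of rows read after running the target SQL
select VARIABLE_VALUE into @b from performance_schema.session_status where variable_name = 'Innodb_rows_read';

#6. Rows read between step 2 and step 5
select @b-@a;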

Official documentation: https://dev.mysql.com/doc/int...

Experiments

I ran four experiments. The SQL executed in step 3 of each one is listed below.

Experiment	SQL
Experiment 1	select `aid`,sum(`pv`) as num from article_rank force index(idx_day_aid_pv) where `day`>20190115 group by aid order by num desc LIMIT 10;
Experiment 2	select `aid`,sum(`pv`) as num from article_rank force index(idx_day) where `day`>20190115 group by aid order by num desc LIMIT 10;
Experiment 3	select `aid`,sum(`pv`) as num from article_rank force index(idx_aid_day_pv) where `day`>20190115 group by aid order by num desc LIMIT 10;
Experiment 4	select `aid`,sum(`pv`) as num from article_rank force index(PRI) where `day`>20190115 group by aid order by num desc LIMIT 10;

Experiment 1

mysql> select `aid`,sum(`pv`) as num from article_rank force index(idx_day_aid_pv) where `day`>'20190115' group by aid order by num desc LIMIT 10;
# results omitted
10 rows in set (25.05 sec)

{
  "steps": [
    {
      "join_preparation": "omitted"
    },
    {
      "join_optimization": "omitted"
    },
    {
      "join_execution": {
        "select#": 1,
        "steps": [
          {
            "creating_tmp_table": {
              "tmp_table_info": {
                "table": "intermediate_tmp_table",
                "row_length": 20,
                "key_length": 4,
                "unique_constraint": false,
                "location": "memory (heap)",
                "row_limit_estimate": 838860
              }
            }
          },
          {
            "converting_tmp_table_to_ondisk": {
              "cause": "memory_table_size_exceeded",
              "tmp_table_info": {
                "table": "intermediate_tmp_table",
                "row_length": 20,
                "key_length": 4,
                "unique_constraint": false,
                "location": "disk (InnoDB)",
                "record_format": "fixed"
              }
            }
          },
          {
            "filesort_information": [
              {
                "direction": "desc",
                "table": "intermediate_tmp_table",
                "field": "num"
              }
            ],
            "filesort_priority_queue_optimization": {
              "limit": 10,
              "rows_estimate": 1057,
              "row_size": 36,
              "memory_available": 262144,
              "chosen": true
            },
            "filesort_execution": [
            ],
            "filesort_summary": {
              "rows": 11,
              "examined_rows": 649091,
              "number_of_tmp_files": 0,
              "sort_buffer_size": 488,
              "sort_mode": "<sort_key, additional_fields>"
            }
          }
        ]
      }
    }
  ]
}

mysql> select VARIABLE_VALUE into @b from performance_schema.session_status where variable_name = 'Innodb_rows_read';
Query OK, 1 row affected (0.00 sec)

mysql> select @b-@a;
+---------+
| @b-@a   |
+---------+
| 6417027 |
+---------+
1 row in set (0.01 sec)
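
The comparison later in this post relies on only a handful of fields from each trace. As a rough sketch (not part of the original experiment), they could be pulled out of the JSON with a few lines of Python. The TRACE string below is a hand-abbreviated copy of the Experiment 1 trace, and summarize_trace is a hypothetical helper name.

```python
import json

# Hand-abbreviated copy of the Experiment 1 trace (only the fields used below).
TRACE = """
{
  "steps": [
    {"join_preparation": "omitted"},
    {"join_optimization": "omitted"},
    {"join_execution": {"select#": 1, "steps": [
      {"creating_tmp_table": {"tmp_table_info": {
          "location": "memory (heap)", "row_length": 20, "key_length": 4,
          "row_limit_estimate": 838860}}},
      {"converting_tmp_table_to_ondisk": {
          "cause": "memory_table_size_exceeded",
          "tmp_table_info": {"location": "disk (InnoDB)"}}},
      {"filesort_priority_queue_optimization": {
          "limit": 10, "rows_estimate": 1057, "row_size": 36, "chosen": true},
       "filesort_summary": {
          "rows": 11, "examined_rows": 649091, "number_of_tmp_files": 0,
          "sort_mode": "<sort_key, additional_fields>"}}
    ]}}
  ]
}
"""

def summarize_trace(trace_text):
    """Collect the join_execution fields that the comparison relies on."""
    summary = {"converted_to_disk": False}
    for step in json.loads(trace_text)["steps"]:
        execution = step.get("join_execution")
        if execution is None:
            continue  # preparation and optimization steps are skipped
        for sub in execution["steps"]:
            if "creating_tmp_table" in sub:
                info = sub["creating_tmp_table"]["tmp_table_info"]
                summary["key_length"] = info["key_length"]
            if "converting_tmp_table_to_ondisk" in sub:
                summary["converted_to_disk"] = True
            if "filesort_summary" in sub:
                summary["examined_rows"] = sub["filesort_summary"]["examined_rows"]
                summary["sort_mode"] = sub["filesort_summary"]["sort_mode"]
                pq = sub["filesort_priority_queue_optimization"]
                summary["rows_estimate"] = pq["rows_estimate"]
    return summary

print(summarize_trace(TRACE))
```

Running the same helper over the other three traces would give one row per experiment for a side-by-side comparison.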

Experiment 2

mysql> select `aid`,sum(`pv`) as num from article_rank force index(idx_day) where `day`>'20190115' group by aid order by num desc LIMIT 10;
# results omitted
10 rows in set (42.06 sec)

{
  "steps": [
    {
      "join_preparation": "omitted"
    },
    {
      "join_optimization": "omitted"
    },
    {
      "join_execution": {
        "select#": 1,
        "steps": [
          {
            "creating_tmp_table": {
              "tmp_table_info": {
                "table": "intermediate_tmp_table",
                "row_length": 20,
                "key_length": 4,
                "unique_constraint": false,
                "location": "memory (heap)",
                "row_limit_estimate": 838860
              }
            }
          },
          {
            "converting_tmp_table_to_ondisk": {
              "cause": "memory_table_size_exceeded",
              "tmp_table_info": {
                "table": "intermediate_tmp_table",
                "row_length": 20,
                "key_length": 4,
                "unique_constraint": false,
                "location": "disk (InnoDB)",
                "record_format": "fixed"
              }
            }
          },
          {
            "filesort_information": [
              {
                "direction": "desc",
                "table": "intermediate_tmp_table",
                "field": "num"
              }
            ],
            "filesort_priority_queue_optimization": {
              "limit": 10,
              "rows_estimate": 1057,
              "row_size": 36,
              "memory_available": 262144,
              "chosen": true
            },
            "filesort_execution": [
            ],
            "filesort_summary": {
              "rows": 11,
              "examined_rows": 649091,
              "number_of_tmp_files": 0,
              "sort_buffer_size": 488,
              "sort_mode": "<sort_key, additional_fields>"
            }
          }
        ]
      }
    }
  ]
}

mysql> select @b-@a;
+---------+
| @b-@a   |
+---------+
| 9625540 |
+---------+
1 row in set (0.00 sec)

Experiment 3

mysql> select `aid`,sum(`pv`) as num from article_rank force index(idx_aid_day_pv) where `day`>'20190115' group by aid order by num desc LIMIT 10;
# results omitted
10 rows in set (5.38 sec)

{
  "steps": [
    {
      "join_preparation": "omitted"
    },
    {
      "join_optimization": "omitted"
    },
    {
      "join_execution": {
        "select#": 1,
        "steps": [
          {
            "creating_tmp_table": {
              "tmp_table_info": {
                "table": "intermediate_tmp_table",
                "row_length": 20,
                "key_length": 0,
                "unique_constraint": false,
                "location": "memory (heap)",
                "row_limit_estimate": 838860
              }
            }
          },
          {
            "filesort_information": [
              {
                "direction": "desc",
                "table": "intermediate_tmp_table",
                "field": "num"
              }
            ],
            "filesort_priority_queue_optimization": {
              "limit": 10,
              "rows_estimate": 649101,
              "row_size": 24,
              "memory_available": 262144,
              "chosen": true
            },
            "filesort_execution": [
            ],
            "filesort_summary": {
              "rows": 11,
              "examined_rows": 649091,
              "number_of_tmp_files": 0,
              "sort_buffer_size": 352,
              "sort_mode": "<sort_key, rowid>"
            }
          }
        ]
      }
    }
  ]
}

mysql> select VARIABLE_VALUE into @b from performance_schema.session_status where variable_name = 'Innodb_rows_read';
Query OK, 1 row affected (0.00 sec)

mysql> select @b-@a;
+----------+
| @b-@a    |
+----------+
| 14146056 |
+----------+
1 row in set (0.00 sec)

Experiment 4

mysql> select `aid`,sum(`pv`) as num from article_rank force index(PRI) where `day`>'20190115' group by aid order by num desc LIMIT 10;
# results omitted
10 rows in set (21.90 sec)

{
  "steps": [
    {
      "join_preparation": "omitted"
    },
    {
      "join_optimization": "omitted"
    },
    {
      "join_execution": {
        "select#": 1,
        "steps": [
          {
            "creating_tmp_table": {
              "tmp_table_info": {
                "table": "intermediate_tmp_table",
                "row_length": 20,
                "key_length": 4,
                "unique_constraint": false,
                "location": "memory (heap)",
                "row_limit_estimate": 838860
              }
            }
          },
          {
            "converting_tmp_table_to_ondisk": {
              "cause": "memory_table_size_exceeded",
              "tmp_table_info": {
                "table": "intermediate_tmp_table",
                "row_length": 20,
                "key_length": 4,
                "unique_constraint": false,
                "location": "disk (InnoDB)",
                "record_format": "fixed"
              }
            }
          },
          {
            "filesort_information": [
              {
                "direction": "desc",
                "table": "intermediate_tmp_table",
                "field": "num"
              }
            ],
            "filesort_priority_queue_optimization": {
              "limit": 10,
              "rows_estimate": 1057,
              "row_size": 36,
              "memory_available": 262144,
              "chosen": true
            },
            "filesort_execution": [
            ],
            "filesort_summary": {
              "rows": 11,
              "examined_rows": 649091,
              "number_of_tmp_files": 0,
              "sort_buffer_size": 488,
              "sort_mode": "<sort_key, additional_fields>"
            }
          }
        ]
      }
    }
  ]
}

mysql> select VARIABLE_VALUE into @b from performance_schema.session_status where variable_name = 'Innodb_rows_read';
Query OK, 1 row affected (0.00 sec)

mysql> select @b-@a;
+----------+
| @b-@a    |
+----------+
| 17354569 |
+----------+
1 row in set (0.00 sec)

Walking through the execution flow with an example

Here is the SQL from this case with the forced index removed:

select `aid`,sum(`pv`) as num from article_rank where `day`>20190115 group by aid order by num desc LIMIT 10;

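Let's take Experiment 1 as the example.

Step 1

Because this SQL uses group by, the optimizer_trace shows that execution (join_execution) always begins by creating a temporary table (creating_tmp_table) to hold the result of the group by clause.

It stores two columns, aid and num. How is this temporary table stored, and why is row_length 20? I wrote three separate posts on that question: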
https://mengkang.net/1334.html
https://mengkang.net/1335.html
https://mengkang.net/1336.html

Step 2

Because of memory_table_size_exceeded, the temporary table intermediate_tmp_table has to be stored on disk using the InnoDB engine.

mysql> show global variables like '%table_size';
+---------------------+----------+
| Variable_name       | Value    |
+---------------------+----------+
| max_heap_table_size | 16777216 |
| tmp_table_size      | 16777216 |
+---------------------+----------+

https://dev.mysql.com/doc/ref...
https://dev.mysql.com/doc/ref...
max_heap_table_size
This variable sets the maximum size to which user-created MEMORY tables are permitted to grow. The value of the variable is used to calculate MEMORY table MAX_ROWS values. Setting this variable has no effect on any existing MEMORY table, unless the table is re-created with a statement such as CREATE TABLE or altered with ALTER TABLE or TRUNCATE TABLE. A server restart also sets the maximum size of existing MEMORY tables to the global max_heap_table_size value.

tmp_table_size
The maximum size of internal in-memory temporary tables. This variable does not apply to user-created MEMORY tables.
The actual limit is determined from whichever of the values of tmp_table_size and max_heap_table_size is smaller. If an in-memory temporary table exceeds the limit, MySQL automatically converts it to an on-disk temporary table. The internal_tmp_disk_storage_engine option defines the storage engine used for on-disk temporary tables.

In other words, the temporary table limit here is 16 MB and one row takes 20 bytes, so it can hold at most floor(16777216/20) = 838860 rows, which is why row_limit_estimate is 838860.
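
The arithmetic can be checked with a small Python sketch. It only reproduces the calculation described here, using the smaller of the two variables as the quoted documentation states. It is not taken from the MySQL source, and the function name is made up for illustration.

```python
def row_limit_estimate(tmp_table_size, max_heap_table_size, row_length):
    # The effective in-memory limit is the smaller of the two variables.
    limit_bytes = min(tmp_table_size, max_heap_table_size)
    # Integer division plays the role of floor() in the text.
    return limit_bytes // row_length

print(row_limit_estimate(16777216, 16777216, 20))  # 838860
```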

Let's count the total number of rows after the group by.

mysql> select count(distinct aid) from article_rank where `day`>'20190115';
+---------------------+
| count(distinct aid) |
+---------------------+
|              649091 |
+---------------------+

649091 < 838860

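Question: why is memory_table_size_exceeded triggered, then?

Data is written into the temporary table as follows:

The temporary table is created with two columns, aid and num (per the trace it starts in memory and is converted to disk once the limit is exceeded). Because the query groups by aid, aid is the primary key of the temporary table.
Experiment 1 scans the index idx_day_aid_pv and reads the aid and pv values from the leaf nodes one by one.
If the temporary table has no row for that aid, a row is inserted; if the aid already exists, the pv of the incoming row is added to the existing row.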
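
The insert-or-accumulate logic of these steps can be sketched in Python, with a dict standing in for the temporary table keyed by aid. The sample rows are made up for illustration and group_sum is a hypothetical name; this shows the idea only, not how MySQL implements it.

```python
def group_sum(index_rows):
    """index_rows yields (aid, pv) pairs, as read from the index leaf nodes."""
    tmp_table = {}  # stands in for the temporary table: aid -> num
    for aid, pv in index_rows:
        if aid in tmp_table:
            tmp_table[aid] += pv   # aid already present: accumulate pv
        else:
            tmp_table[aid] = pv    # first time this aid is seen: insert a row
    return tmp_table

# Made-up sample data, not taken from the article_rank table.
rows = [(7, 3), (9, 1), (7, 2), (5, 4), (9, 6)]
print(group_sum(rows))  # {7: 5, 9: 7, 5: 4}
```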

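Step 3

Sort intermediate_tmp_table by the num column in descending order.

filesort_summary.examined_rows

This is the number of rows scanned by the sort, that is, the total number of rows after the group by (computed above as 649091).

So filesort_summary.examined_rows is 649091 in every experiment.
filesort_summary.number_of_tmp_files is 0, which means no temporary files were used for sorting.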

filesort_summary.sort_mode

MySQL allocates each thread a block of memory for sorting, called the sort_buffer. Its size is set by sort_buffer_size.

mysql> show global variables like 'sort_buffer_size';
+------------------+--------+
| Variable_name    | Value  |
+------------------+--------+
| sort_buffer_size | 262144 |
+------------------+--------+
1 row in set (0.01 sec)

That is, the default sort_buffer_size is 256 KB.

https://dev.mysql.com/doc/ref...
Default Value (Other, 64-bit platforms, >= 5.6.4) 262144

There are several sort modes:

<sort_key, rowid>
<sort_key, additional_fields>
<sort_key, packed_additional_fields>

additional_fields

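Initialize the sort_buffer and decide which columns go into it. Since we sort by num here, sort_key is num and additional_fields is aid.
Load the rows (aid, num) of the temporary table produced by the group by clause (intermediate_tmp_table) into the sort_buffer. number_of_tmp_files is 0, so we know memory was sufficient and no external files were used for a merge sort.
Quick-sort the data in the sort_buffer by num.
Return the first 10 rows of the sorted result to the client.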
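
As a minimal sketch of the additional_fields idea, the Python below keeps the extra column (aid) next to the sort key (num) in every buffer entry, so the result can be returned straight from the buffer. Python's built-in sort stands in for the quicksort, and the function name and sample data are assumptions for illustration.

```python
def sort_additional_fields(tmp_table, limit):
    """tmp_table maps aid -> num; returns the top `limit` rows as (aid, num)."""
    # Each buffer entry carries the sort key plus the additional field.
    sort_buffer = [(num, aid) for aid, num in tmp_table.items()]
    # Sort by the sort key only, in descending order.
    sort_buffer.sort(key=lambda entry: entry[0], reverse=True)
    # No lookup is needed: aid is already in the buffer entry.
    return [(aid, num) for num, aid in sort_buffer[:limit]]

# Made-up sample data.
tmp_table = {7: 5, 9: 7, 5: 4, 2: 9}
print(sort_additional_fields(tmp_table, 3))  # [(2, 9), (9, 7), (7, 5)]
```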

rowid

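Using an index or a full table scan, fetch the sort column values and row IDs of the rows that match the filter.
Store each sort column value and row ID as a key-value pair in the sort buffer.
If the sort buffer is larger than the memory these pairs need, no temporary file is created. Otherwise, each time the sort buffer fills up, its contents are sorted in memory (quicksort) and written to a temporary file.
Repeat the steps above until all rows have been read.
If temporary files were used, run an external sort on disk and write the row IDs to a result file.
Read the data to return to the user in the order of the row IDs in the result file. Because the row IDs are not sequential, these lookups are random I/O. To optimize further (turning them into sequential I/O), MySQL reads a batch of row IDs and inserts the fetched data into a buffer in sort-column order (the buffer size is read_rnd_buffer_size).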
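
The rowid steps can be sketched roughly in Python as follows. Sorted in-memory lists stand in for the temporary files, heapq.merge stands in for the external merge, and a dict lookup stands in for fetching a row by its row ID. The read_rnd_buffer_size batching is not modeled, and all names and data are illustrative assumptions.

```python
import heapq

def rowid_sort(table, buffer_rows):
    """table maps row ID -> (aid, num); buffer_rows is how many
    (num, row ID) pairs fit in the sort buffer at once."""
    runs, buffer = [], []
    for rowid, (_aid, num) in table.items():
        buffer.append((num, rowid))        # only sort key + row ID are buffered
        if len(buffer) == buffer_rows:     # buffer full: sort it and spill a run
            runs.append(sorted(buffer, reverse=True))
            buffer = []
    if buffer:
        runs.append(sorted(buffer, reverse=True))
    # Merge the sorted runs (the "temporary files") into one descending stream.
    merged = heapq.merge(*runs, reverse=True)
    # Look each row up by its row ID to get the columns to return.
    return [table[rowid] for _num, rowid in merged]

# Made-up sample data.
table = {101: (7, 5), 102: (9, 7), 103: (5, 4), 104: (2, 9), 105: (3, 1)}
print(rowid_sort(table, buffer_rows=2))
# [(2, 9), (9, 7), (7, 5), (5, 4), (3, 1)]
```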

Analysis of the results

After going through the experiment results, I summarized the more important figures for comparison.

Experiment	index	query_time (s)	filesort_summary.examined_rows	filesort_summary.sort_mode	filesort_priority_queue_optimization.rows_estimate	converting_tmp_table_to_ondisk	Innodb_rows_read
Experiment 1	idx_day_aid_pv	25.05	649091	additional_fields	1057	true	6417027
Experiment 2	idx_day	42.06	649091	additional_fields	1057	true	9625540
Experiment 3	idx_aid_day_pv	5.38	649091	rowid	649101	false	14146056
Experiment 4	PRI	21.90	649091	additional_fields	1057	true	17354569

filesort_summary.examined_rows

Already analyzed in the Experiment 1 walkthrough.

mysql> select count(distinct aid) from article_rank where `day`>'20190115';
+---------------------+
| count(distinct aid) |
+---------------------+
|              649091 |
+---------------------+

filesort_summary.sort_mode

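Same columns, same number of rows: why do some runs sort by additional_fields and others by rowid?

For InnoDB tables, additional_fields avoids the lookups back to the table that rowid needs, which means fewer disk accesses, so it is preferred. Note that this holds for InnoDB. In Experiment 3 the temporary table is an in-memory table using the MEMORY engine. There, a lookup just accesses memory directly by row position and involves no disk access (think of indexing into an in-memory array). The fewer columns being sorted the better, since less memory is used, so rowid sorting is chosen.

Another reason is that limit 10 keeps the heap small, so it does not take much memory. Keep in mind that choosing the priority queue sort algorithm is still subject to sort_buffer_size.

For a detailed look at sorting in-memory tables, see lecture 17 of 《MySQL實戰45講》, 如何正確地顯示隨機消息 (how to correctly display random messages).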

filesort_priority_queue_optimization

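For the filesort_priority_queue_optimization algorithm, see http://www.javashuo.com/article/p-ctofdnun-np.html

The priority queue sort runs as follows:

Take the first 10 rows from the (unsorted) temporary table and build a min-heap of 10 elements from their num (which comes from sum(pv)) and rowid, so the smallest num sits at the top of the heap.
Take the next row and compare its num with the value at the top of the heap. If it is larger, replace the top, then restore the heap order.
Repeat step 2 until row 649091 has been compared.
Finally, look up aid and num for the remaining 10 rows by rowid.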
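
The four steps amount to a bounded min-heap top-N selection. Here is a small Python sketch using the standard heapq module, where the list index stands in for the rowid. The sample data and function name are made up for illustration, and this is not MySQL's actual implementation.

```python
import heapq

def top_n_by_num(tmp_rows, limit):
    """tmp_rows is a list of (aid, num); the list index plays the role of rowid."""
    heap = []  # min-heap of (num, rowid): the smallest num sits at heap[0]
    for rowid, (_aid, num) in enumerate(tmp_rows):
        if len(heap) < limit:
            heapq.heappush(heap, (num, rowid))     # step 1: fill the heap
        elif num > heap[0][0]:
            heapq.heapreplace(heap, (num, rowid))  # step 2: replace top, re-heapify
    # Step 4: look the surviving rows up by rowid, largest num first.
    winners = sorted(heap, reverse=True)
    return [tmp_rows[rowid] for _num, rowid in winners]

# Made-up sample data.
rows = [(7, 5), (9, 7), (5, 4), (2, 9), (3, 1), (8, 6)]
print(top_n_by_num(rows, 3))  # [(2, 9), (9, 7), (8, 6)]
```

The heap never holds more than `limit` entries, which is why a small LIMIT keeps the memory needed for this algorithm low.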

rows_estimate

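By the analysis above, 649091 rows are read first and then 10 more are read in the lookup, for 649101 rows in total.
The result of Experiment 3 matches this, but all the others show 1057 rows. How is that number computed?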

row_size

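I have not figured this one out.

When stored in the temporary table, the row consists of the aid and num columns, whose widths add up to 4 + 15 = 19 bytes.
Experiment 3 uses rowid sorting, that is, 15 bytes for num + 6 bytes for the row ID, which should be 21 bytes, but the actual value is 24 bytes.
The others use additional_fields sorting, that is, 15 + 4 + 6 = 25 bytes, but the actual value is 36 bytes.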

converting_tmp_table_to_ondisk

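Whether the temporary table was converted to an on-disk table. All four experiments write the same 649091 rows into an in-memory temporary table, so why do the other three run out of memory?
One thing to note: in Experiment 3 the temporary table is created with key_length 0, while in the others it is 4.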

Innodb_rows_read

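In the experiments above, each time we measured @b-@a we also queried the OPTIMIZER_TRACE table, which needs a temporary table, and the default value of internal_tmp_disk_storage_engine is InnoDB. With the InnoDB engine, reading the data out of the temporary table increases Innodb_rows_read by 1.

First we query the following two figures, which are needed below.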

mysql> select count(*) from article_rank;
+----------+
| count(*) |
+----------+
| 14146055 |
+----------+

mysql> select count(*) from article_rank where `day`>'20190115';
+----------+
| count(*) |
+----------+
|  3208513 |
+----------+

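Experiment 1

3208513 rows match the condition. Because the idx_day_aid_pv index is used and the queried values are aid and pv, it is a covering index and no lookup back to the table is needed.
However, after the temporary table is created (creating_tmp_table), it exceeds the in-memory limit (memory_table_size_exceeded), so the data of these 3208513 rows is written to an on-disk temporary table, which again uses the InnoDB engine.
So the final result for Experiment 1 is 3208513*2 + 1 = 6417027.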

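Experiment 2

Compared with Experiment 1, Experiment 2 not only has to store the temporary table on disk; because the index is idx_day, it cannot use a covering index and has to look up every row in the table. So the final result is 3208513*3 + 1 = 9625540.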

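Experiment 3

In Experiment 3 the leftmost column is aid, so the index cannot be used to filter on day>20190115 and the whole index has to be traversed (covering every row).
But the temporary table created in this run (MEMORY engine) is handled entirely in memory, so the final result is 14146055 + 1 = 14146056.

Note that if the slow query log is enabled, the rows-examined count in the slow query log differs from the figure here, because it also counts rows scanned in the in-memory temporary table.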

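Experiment 4

Experiment 4 first traverses the primary table, scanning 14146055 rows, and then puts the 3208513 matching rows into the temporary table, so the result is 14146055 + 3208513 + 1 = 17354569.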
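
The four formulas can be checked against the measured @b-@a values with a few lines of Python. All figures are the ones reported in this post; the variable names are only for illustration.

```python
TOTAL_ROWS = 14146055     # select count(*) from article_rank
MATCHING_ROWS = 3208513   # rows with day > 20190115
TRACE_READ = 1            # extra read caused by querying OPTIMIZER_TRACE

expected = {
    "Experiment 1": MATCHING_ROWS * 2 + TRACE_READ,           # index scan + on-disk temp table
    "Experiment 2": MATCHING_ROWS * 3 + TRACE_READ,           # plus one table lookup per row
    "Experiment 3": TOTAL_ROWS + TRACE_READ,                  # full index scan, in-memory temp table
    "Experiment 4": TOTAL_ROWS + MATCHING_ROWS + TRACE_READ,  # full table scan + on-disk temp table
}

measured = {
    "Experiment 1": 6417027,
    "Experiment 2": 9625540,
    "Experiment 3": 14146056,
    "Experiment 4": 17354569,
}

for name, value in expected.items():
    print(name, value, value == measured[name])
```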

References

《 MySQL實戰45講》
https://time.geekbang.org/column/article/73479
https://time.geekbang.org/column/article/73795
https://dev.mysql.com/doc/ref...
https://juejin.im/entry/59019...