MySQL Duplicate Indexes: A Discussion (continuously updated...)

Date: 2019-11-17

Reference: http://xiezhenye.com/2015/01/%E6%89%BE%E5%88%B0-mysql-%E6%95%B0%E6%8D%AE%E5%BA%93%E4%B8%AD%E7%9A%84%E4%B8%8D%E8%89%AF%E7%B4%A2%E5%BC%95.html

<1> Create the 'problematic' tables

1. Create table test1

CREATE TABLE test1 (
  id int(11) NOT NULL,
  f1 int(11) DEFAULT NULL,
  f2 int(11) DEFAULT NULL,
  f3 int(11) DEFAULT NULL,
  PRIMARY KEY (id),
  KEY k1 (f1,id),
  KEY k2 (id,f1),
  KEY k3 (f1),
  KEY k4 (f1,f3),
  KEY k5 (f1,f3,f2)
) ENGINE=InnoDB;

2. Create table test2

CREATE TABLE test2 (
  id1 int(11) NOT NULL DEFAULT 0,
  id2 int(11) NOT NULL DEFAULT 0,
  b int(11) DEFAULT NULL,
  PRIMARY KEY (id1,id2),
  KEY k1 (b)
) ENGINE=InnoDB;
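All of the detection queries in this post aggregate rows of INFORMATION_SCHEMA.STATISTICS, so it may help to look at the raw rows first. A minimal sketch, assuming test1 was created in the current default schema:

```sql
-- One row per (index, column); seq_in_index is the column's position in the index.
select index_name, seq_in_index, column_name, non_unique, cardinality
  from INFORMATION_SCHEMA.STATISTICS
 where table_schema = database()   -- assumption: test1 lives in the current schema
   and table_name = 'test1'
 order by index_name, seq_in_index;
```

The later queries collapse these rows into one string per index with group_concat, ordered by seq_in_index.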

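<2> Problematic indexes

1. Indexes that contain the primary key

InnoDB tables are clustered, so every secondary index already contains the primary key. An index like (f1, id) does no real harm, but it suggests that its author does not understand how MySQL indexes work. A redundant index like (id, f1), on the other hand, wastes storage space and slows down data updates. All indexes that contain the primary key can be found with a single SQL statement: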

-- Secondary indexes whose column list contains the whole primary key.
-- Column lists are rendered as '|col1|col2|' so that LIKE compares whole column names.
select c.*, pk from
       (select table_schema, table_name, index_name,
               concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') cols
          from INFORMATION_SCHEMA.STATISTICS
         where index_name != 'PRIMARY' and table_schema != 'mysql'
         group by table_schema, table_name, index_name) c,
       (select table_schema, table_name,
               concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') pk
          from INFORMATION_SCHEMA.STATISTICS
         where index_name = 'PRIMARY' and table_schema != 'mysql'
         group by table_schema, table_name) p
 where c.table_name = p.table_name and c.table_schema = p.table_schema
   and c.cols like concat('%', pk, '%');

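Result: for the two tables above, the query flags k1 (f1,id) and k2 (id,f1) of test1; test2 has no such index.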
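The claim that every InnoDB secondary index already carries the primary key can be checked on test1. A minimal sketch, assuming the table from section <1> exists; forcing k3 is only done here to make the experiment deterministic:

```sql
-- k3 is declared on (f1) only, yet the query also asks for id.
explain select id from test1 force index (k3) where f1 = 1;
```

If the Extra column reports "Using index", the query was answered from k3 alone, which means id was available in the index without being listed in its definition.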

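2. Duplicate indexes

An index whose columns form a leading prefix of another index can be completely replaced by that other index, so it is redundant. Redundant indexes waste storage space and slow down data updates. Such indexes can likewise be found with a single SQL statement.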

-- Pairs of secondary indexes where c2's column list is a leading prefix of c1's:
-- c2 is the redundant index, c1 the index that covers it.
select c1.table_schema, c1.table_name, c1.index_name, c1.cols, c2.index_name, c2.cols from
       (select table_schema, table_name, index_name,
               concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') cols
          from INFORMATION_SCHEMA.STATISTICS
         where table_schema != 'mysql' and index_name != 'PRIMARY'
         group by table_schema, table_name, index_name) c1,
       (select table_schema, table_name, index_name,
               concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') cols
          from INFORMATION_SCHEMA.STATISTICS
         where table_schema != 'mysql' and index_name != 'PRIMARY'
         group by table_schema, table_name, index_name) c2
 where c1.table_name = c2.table_name and c1.table_schema = c2.table_schema
   and c1.cols like concat(c2.cols, '%') and c1.index_name != c2.index_name;

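Result: for test1, the query pairs k3 (f1) with k1, k4 and k5, since k3 is a prefix of each, and pairs k4 (f1,f3) with k5.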
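Before dropping an index that the query reports, it can be worth checking that nothing depends on it. A cautious sketch, assuming MySQL 8.0 or later (where invisible indexes are available) and using k3 of test1 as the candidate:

```sql
-- Step 1: hide k3 from the optimizer while still maintaining it.
alter table test1 alter index k3 invisible;

-- Step 2: after observing the workload for a while, either drop it for good ...
alter table test1 drop index k3;

-- ... or, if some query regressed, bring it back instead of dropping it:
-- alter table test1 alter index k3 visible;
```

Making the index invisible first keeps the rollback cheap, because no rebuild is needed to restore it.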

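2.2 Additional notes on redundant indexes

If you create index (A,B) and then create index (A), the index (A) is redundant, because it is only a prefix of the first index.

Index (A,B) can therefore also be used as index (A). (This kind of redundancy applies only to B-Tree indexes.)

If you then create index (B,A), however, it is not redundant, and neither is index (B), because B is not the leftmost prefix column of index (A,B).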
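The leftmost-prefix rule can be observed with EXPLAIN. The table and index names below (prefix_demo, idx_ab) are hypothetical and only serve as an illustration:

```sql
create table prefix_demo (          -- hypothetical demo table
  id int not null primary key,
  a  int,
  b  int,
  c  int,
  key idx_ab (a, b)
) engine=InnoDB;

-- a is the leftmost column of idx_ab, so both lookups can seek in the index.
explain select * from prefix_demo where a = 1;
explain select * from prefix_demo where a = 1 and b = 2;

-- b alone is not a leftmost prefix, so idx_ab cannot be used for a seek here.
explain select * from prefix_demo where b = 2;
```

For the first two statements the key column is expected to show idx_ab; for the last one the optimizer has no index it can seek into with b alone.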

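In addition, indexes of other types (such as hash indexes or full-text indexes) are never redundant with respect to a B-Tree index, no matter which columns they cover.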
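The duplicate-index query in section 2 compares column lists only. One possible refinement, sketched here and not part of the original post, restricts the comparison to B-Tree indexes and leaves unique indexes alone, since a unique index enforces a constraint that a longer non-unique index does not. The output aliases (covering_index, redundant_index and so on) are made up for readability:

```sql
select c1.table_schema, c1.table_name,
       c1.index_name as covering_index,  c1.cols as covering_cols,
       c2.index_name as redundant_index, c2.cols as redundant_cols
  from (select table_schema, table_name, index_name,
               min(index_type) as index_type,
               concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') as cols
          from INFORMATION_SCHEMA.STATISTICS
         where table_schema != 'mysql' and index_name != 'PRIMARY'
         group by table_schema, table_name, index_name) c1
  join (select table_schema, table_name, index_name,
               min(index_type) as index_type,
               min(non_unique) as non_unique,
               concat('|', group_concat(column_name order by seq_in_index separator '|'), '|') as cols
          from INFORMATION_SCHEMA.STATISTICS
         where table_schema != 'mysql' and index_name != 'PRIMARY'
         group by table_schema, table_name, index_name) c2
    on c1.table_schema = c2.table_schema
   and c1.table_name = c2.table_name
   and c1.index_name != c2.index_name
 where c1.index_type = 'BTREE'
   and c2.index_type = 'BTREE'
   and c2.non_unique = 1                 -- never report a unique index as redundant
   and c1.cols like concat(c2.cols, '%');
```

This sketch still ignores prefix lengths (the sub_part column), so an index on a column prefix may be reported as if it covered the whole column.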

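3. Low-selectivity indexes

Because such an index still scans a large number of rows, it is usually ignored when queries are actually executed. In some cases it is still useful, though, so each hit needs further analysis against the real workload. The query below lists indexes whose selectivity (index cardinality divided by primary key cardinality) is below 10%; adjust the threshold as needed.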

-- Secondary indexes whose cardinality is below 10% of the primary key's cardinality.
select p.table_schema, p.table_name, c.index_name, c.car, p.car total from
       (select table_schema, table_name, index_name, max(cardinality) car
          from INFORMATION_SCHEMA.STATISTICS
         where index_name != 'PRIMARY'
         group by table_schema, table_name, index_name) c,
       (select table_schema, table_name, max(cardinality) car
          from INFORMATION_SCHEMA.STATISTICS
         where index_name = 'PRIMARY' and table_schema != 'mysql'
         group by table_schema, table_name) p
 where c.table_name = p.table_name and c.table_schema = p.table_schema
   and p.car > 0 and c.car / p.car < 0.1;   -- 0.1 is the adjustable threshold

Result:
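The cardinality values in INFORMATION_SCHEMA.STATISTICS are estimates for InnoDB, so a hit from this query is best confirmed against the data itself. A minimal sketch, using column f1 of test1 as the example:

```sql
-- Refresh the index statistics before trusting the cardinality column.
analyze table test1;

-- Measure the actual selectivity of f1: distinct values divided by row count.
-- nullif avoids a division by zero on an empty table.
select count(distinct f1) / nullif(count(*), 0) as selectivity
  from test1;
```

A value close to 1 means nearly every row has its own f1 value; a value close to 0 means the column has few distinct values. Note that the second statement scans the data, so on a large table it is better run off-peak.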

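4. Composite primary keys

Because InnoDB tables are clustered, every secondary index contains the primary key value. A composite primary key makes secondary indexes large, which hurts both secondary index query performance and update performance. This also needs further analysis case by case.

The SQL is: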

-- Tables whose primary key spans more than one column.
select table_schema, table_name,
       group_concat(column_name order by seq_in_index separator ',') cols, max(seq_in_index) len
  from INFORMATION_SCHEMA.STATISTICS
 where index_name = 'PRIMARY' and table_schema != 'mysql'
 group by table_schema, table_name
having len > 1;

Result: of the two tables above, only test2 is listed, with cols = id1,id2 and len = 2.
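One common way to shrink the secondary indexes of such a table is to introduce a short surrogate primary key and keep the old key as a unique index. The sketch below is only one option; the names test2_surrogate and uk_id1_id2 are hypothetical, and whether the change pays off depends on how the table is queried:

```sql
create table test2_surrogate (            -- hypothetical replacement for test2
  id  int unsigned not null auto_increment,
  id1 int not null default 0,
  id2 int not null default 0,
  b   int default null,
  primary key (id),                       -- 4-byte key appended to every secondary index
  unique key uk_id1_id2 (id1, id2),       -- preserves the original uniqueness rule
  key k1 (b)
) engine=InnoDB;
```

With this layout each entry of k1 carries one 4-byte id instead of the two 4-byte columns id1 and id2, at the price of one extra unique index and an additional lookup step for queries that search by (id1, id2).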