Clustered and Non-Clustered Indexes in MySQL

Date: 2019-11-08

Tags: mysql, index. Category: MySQL.



This article introduces clustered and non-clustered indexes in MySQL's InnoDB engine. Let's first look at what the two concepts mean and what they are used for.

The concepts of clustered and non-clustered indexes

Let's start with a passage from the official documentation to see what they do:

Every InnoDB table has a special index called the clustered index where the data for the rows is stored. Typically, the clustered index is synonymous with the primary key. To get the best performance from queries, inserts, and other database operations, you must understand how InnoDB uses the clustered index to optimize the most common lookup and DML operations for each table.

When you define a PRIMARY KEY on your table, InnoDB uses it as the clustered index. Define a primary key for each table that you create. If there is no logical unique and non-null column or set of columns, add a new auto-increment column, whose values are filled in automatically.

If you do not define a PRIMARY KEY for your table, MySQL locates the first UNIQUE index where all the key columns are NOT NULL and InnoDB uses it as the clustered index.

If the table has no PRIMARY KEY or suitable UNIQUE index, InnoDB internally generates a hidden clustered index named GEN_CLUST_INDEX on a synthetic column containing row ID values. The rows are ordered by the ID that InnoDB assigns to the rows in such a table. The row ID is a 6-byte field that increases monotonically as new rows are inserted. Thus, the rows ordered by the row ID are physically in insertion order.

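Readers with the patience can translate the passage themselves. Here is a rough translation and summary of what it says:
Every InnoDB table has a special index, called the clustered index, in which the row data is stored.
1. If you define a primary key, InnoDB uses it as the clustered index. (If the table has no logically unique, non-null column or set of columns that could serve as the key, add an auto-increment column.)
2. If the table has no primary key, MySQL finds the first UNIQUE index whose key columns are all NOT NULL, and InnoDB uses it as the clustered index.
3. If the table has neither a PRIMARY KEY nor a suitable UNIQUE index, InnoDB internally generates a hidden clustered index named GEN_CLUST_INDEX on a synthetic column containing row ID values. (Hidden means you cannot see it: it does not show up in desc table. The row ID is a 6-byte field that increases monotonically as new rows are inserted.)
Taken together, these three cases mean that whether or not you create a primary key, InnoDB always gives the table a clustered index. If you define a primary key, it is used as the clustered index; if you do not, a unique NOT NULL index or a hidden index takes its place.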
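The three cases can be tried out with a small sketch like the one below. The table names demo_pk, demo_uk and demo_none are hypothetical, and the final query assumes MySQL 8.0, where the InnoDB metadata views are called INNODB_TABLES and INNODB_INDEXES (in 5.7 they are INNODB_SYS_TABLES and INNODB_SYS_INDEXES) and reading them requires the PROCESS privilege.

```sql
-- Case 1: an explicit primary key becomes the clustered index.
CREATE TABLE demo_pk (
  id INT NOT NULL AUTO_INCREMENT,
  v  INT,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Case 2: no primary key, but a UNIQUE index whose columns are all NOT NULL.
CREATE TABLE demo_uk (
  code INT NOT NULL,
  v    INT,
  UNIQUE KEY uk_code (code)
) ENGINE=InnoDB;

-- Case 3: neither a primary key nor a suitable UNIQUE index.
CREATE TABLE demo_none (
  v INT
) ENGINE=InnoDB;

-- List the indexes InnoDB actually created for the three demo tables.
SELECT t.NAME AS table_name, i.NAME AS index_name, i.TYPE
FROM INFORMATION_SCHEMA.INNODB_INDEXES AS i
JOIN INFORMATION_SCHEMA.INNODB_TABLES AS t ON t.TABLE_ID = i.TABLE_ID
WHERE t.NAME LIKE '%/demo%';
```

For demo_none, the result should list an index named GEN_CLUST_INDEX even though desc demo_none shows no key at all.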

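A clustered index (also called the primary key index) is the index that carries the row data; a non-clustered index is any index other than the clustered index. That may sound a bit dry, so let's work through an example.
Suppose there is a table named test: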

create table test(
  id int not null,
  age int not null,
  name varchar(16),
  -- the primary key is declared once, here, and becomes the clustered index
  PRIMARY KEY (`id`),
  KEY `idx_age` (`age`) USING BTREE,
  KEY `idx_name` (`name`) USING BTREE
) engine=InnoDB;

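The primary key is id, and there are two ordinary indexes, idx_age and idx_name (both B-tree indexes), on the InnoDB engine. So the index on id is the clustered index, while idx_age and idx_name are non-clustered indexes. Suppose the table holds three rows: (1,11,'甲'), (2,12,'乙'), (3,13,'丙'). They are stored as follows.

Clustered index (keyed by id, each entry carries the whole row):
  1 -> (1, 11, '甲')
  2 -> (2, 12, '乙')
  3 -> (3, 13, '丙')

Non-clustered index (idx_age as the example, each entry carries the primary key value):
  11 -> 1
  12 -> 2
  13 -> 3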

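As you can see, an entry in the clustered index is followed directly by the row data, while an entry in a non-clustered index points to the key of the clustered index. A query through a non-clustered index therefore first finds the clustered index key, then uses that key to fetch the actual row (this second step is called going back to the table). In other words, a non-clustered index needs two lookups: one in the secondary index, one in the clustered index.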
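A minimal sketch for observing the two access paths, assuming the test table above exists and is still empty. The exact EXPLAIN output depends on the MySQL version and on table statistics, and with only three rows the optimizer may choose a different plan than it would on a large table.

```sql
-- Load the three example rows.
INSERT INTO test (id, age, name)
VALUES (1, 11, '甲'), (2, 12, '乙'), (3, 13, '丙');

-- Lookup by primary key: one descent of the clustered index reaches the row.
EXPLAIN SELECT * FROM test WHERE id = 3;

-- Lookup by age: idx_age yields the id, then the clustered index is
-- searched with that id to fetch the remaining columns.
EXPLAIN SELECT * FROM test WHERE age = 13;
```

In the second plan, a key column showing idx_age together with an Extra column that does not contain "Using index" indicates that the row still has to be fetched from the clustered index.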

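So when a query can go through the clustered index (that is, the primary key), prefer it: both paths use an index, but the primary key lookup is faster because it needs no second lookup. In addition, an auto-increment int primary key is short, so it takes up little space, including in the non-clustered indexes that store its value.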

Covering indexes

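As described above, a lookup through a non-clustered index has to go back to the table, so it takes two lookups. But if every column the query needs is already in the index, going back to the table is unnecessary. Such an index is called a covering index. For example, suppose we want to query the age and name columns of the test table above: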

select id,age,name from test where age = 13;

Run as is, the query uses the index on age to find the id key, then uses that id to fetch the row. But if we create a composite index on (age, name), things are different.

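Every value the query returns is available in that index: id is the primary key value that each composite index entry points to, and age and name together make up the composite index itself. All the data is therefore in the (age, name) composite index, and there is no need to go back to the table for a second lookup, which can greatly improve query efficiency.
Of course, this only pays off if the query is run frequently, since creating and maintaining an index also costs resources. In practice, decide based on query frequency and index size.
When a composite index exists, being able to use it as a covering index is the best case.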
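The composite index can be sketched as follows. The index name idx_age_name is a hypothetical choice, and the test table with its example rows is assumed to exist.

```sql
-- Composite index on (age, name); each entry also carries the primary key id.
ALTER TABLE test ADD INDEX idx_age_name (age, name);

-- Every selected column (id, age, name) is available in idx_age_name.
EXPLAIN SELECT id, age, name FROM test WHERE age = 13;
```

If the optimizer picks idx_age_name, the Extra column of the plan should contain "Using index", which is how MySQL reports that the query was answered from the index alone. Because idx_age still exists on the same leading column, the optimizer is free to choose either index, so the plan is worth checking rather than assuming.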
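Note: a covering index can also optimize some uses of the count aggregate function, for example select count(age) from test.
Because age is indexed and age is the only column the query uses, the index covers the query and there is no need to go back to the table.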
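A quick way to check this, again assuming the test table above exists:

```sql
-- age is the only column referenced, so an index containing age can
-- answer the count without reading the clustered index rows.
EXPLAIN SELECT COUNT(age) FROM test;
```

An Extra column containing "Using index" indicates that the count was computed from an index alone. Which index the optimizer scans may vary.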