MySQL 16: Indexes

Date: 2019-11-30

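1. What Is an Index?

Review: data is stored on disk, so querying it inevitably involves I/O operations.

In MySQL an index is also called a "key". It is a data structure that the storage engine uses to find records quickly.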

primary key
unique key
index key

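Note that foreign key does not exist to speed up queries, so it is outside our scope here. Of the three keys above, the first two add a constraint on top of speeding up queries (primary key: non-null and unique; unique key: unique), whereas index key imposes no constraint at all and only speeds up queries.

An index is a data structure, much like the table of contents of a book. Once it exists, a lookup should consult the table of contents first and then go to the data, instead of paging through the whole book.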

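2. The Essence of an Index

An index works by repeatedly narrowing the range that can contain the data until only the wanted result is left, and by turning random events into sequential ones. In other words, with an index in place we can always use the same search procedure to locate data.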
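To make "narrowing the range" concrete, here is a minimal Python sketch that compares checking rows one by one with halving the candidate range on every step. It is only an illustration over a sorted list; the function names are made up for this example and it does not model how MySQL actually stores an index.

```python
def linear_scan(ids, target):
    """Check rows one by one; returns (found, comparisons)."""
    comparisons = 0
    for value in ids:
        comparisons += 1
        if value == target:
            return True, comparisons
    return False, comparisons


def narrowing_search(sorted_ids, target):
    """Halve the candidate range on every step; returns (found, steps)."""
    lo, hi, steps = 0, len(sorted_ids), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_ids[mid] < target:
            lo = mid + 1   # target can only be to the right of mid
        else:
            hi = mid       # target is at mid or to its left
    found = lo < len(sorted_ids) and sorted_ids[lo] == target
    return found, steps


ids = list(range(1, 1_000_001))          # one million sorted ids
print(linear_scan(ids, 900_000))         # hundreds of thousands of comparisons
print(narrowing_search(ids, 900_000))    # about twenty steps
```

The second function always follows the same procedure regardless of which id is requested, which is the point the paragraph above makes.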

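3. The Impact of Indexes

When a table already holds a large amount of data, creating an index is slow.
Once the index has been created, query performance on the table can improve substantially, but write performance drops, because every insert, update, and delete must also maintain the index.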

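4. B+ Trees

https://images2017.cnblogs.com/blog/1036857/201709/1036857-20170912011123500-158121126.png

Only the leaf nodes hold real data. The root and branch nodes hold only placeholder entries (keys that guide the search), not actual rows.

The number of lookups is determined by the height of the tree: the fewer levels, the fewer lookups.

A disk block has a fixed size, so the amount of data it can hold is fixed too. How do we keep the tree as shallow as possible? By storing data items that take up little space, so that each block holds more of them.

Think about it: which column of a table should we index in order to keep the tree shallow? >>> The primary key id column.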
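The relationship between key size and tree height can be estimated with a little arithmetic. The sketch below is a rough back-of-the-envelope model, not InnoDB's real page layout: the 16 KB page matches InnoDB's default page size, while the 6-byte child pointer, the 200-byte row, and the function name `btree_height` are assumptions chosen for illustration.

```python
def btree_height(row_count, page_bytes=16 * 1024, key_bytes=4,
                 pointer_bytes=6, row_bytes=200):
    """Estimate how many levels a B+ tree needs to index row_count rows.

    Assumed model: internal pages hold (key, child pointer) pairs,
    leaf pages hold whole rows, and pages are completely full.
    """
    fanout = page_bytes // (key_bytes + pointer_bytes)  # children per internal page
    rows_per_leaf = max(1, page_bytes // row_bytes)
    nodes = -(-row_count // rows_per_leaf)              # leaf pages (ceiling division)
    height = 1
    while nodes > 1:                                    # add levels until one root remains
        nodes = -(-nodes // fanout)
        height += 1
    return height


# Small integer key (like an int id) versus a wide 255-byte key.
print(btree_height(3_000_000, key_bytes=4))    # 3
print(btree_height(3_000_000, key_bytes=255))  # 4
```

Under these assumptions a 4-byte key needs one level fewer than a 255-byte key for the same three million rows, which is why a compact id column is a good choice for the index.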

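4.1 Clustered Index (primary key)

The clustered index is, in practice, the table's primary key. Every InnoDB table has a clustered index: if you do not define a primary key, InnoDB uses the first unique index whose columns are all NOT NULL, and if there is none it generates a hidden one. First, a quick review of storage engines.

How many files does a MyISAM table correspond to on disk? (Three: .frm for the table structure, .MYD for the data, .MYI for the indexes.)

How many files does an InnoDB table correspond to on disk? (Two.) The .frm file stores only the table structure and cannot hold indexes, which means that InnoDB keeps both the indexes and the data in the .ibd table data file.

Characteristic: the leaf nodes hold complete records, one row after another.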

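4.2 Secondary Index (unique, index)

Secondary index: queries cannot always filter on id. They may filter on name, password, or other columns, and in that case the clustered index cannot speed them up. Those other columns need indexes of their own, and these are called secondary indexes.

Characteristic: the leaf nodes hold the primary key value of the record that the indexed value belongs to. For example, if an index is created on the name column, each leaf entry holds {value of name: primary key of the row containing that name}.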

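4.3 Covering Index

select name from user where name='jason';

With an index on name, the statement above is served by a covering index: everything we asked for is already found in the leaf nodes of the secondary index.

4.4 Non-Covering Index

select age from user where name='jason';

The statement above is a non-covering index query. The lookup does hit the indexed column name, but the column being selected is age, so the primary key found in the secondary index must then be used to look up the row in the clustered index.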
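The difference between the two queries can be sketched with two dictionaries standing in for the two trees. This is a toy model only: the sample rows (including the names egon and tank), the assumption that names are unique, and the helper `select` are all invented for the illustration.

```python
# Clustered index: primary key -> full row (the leaf holds the whole record).
clustered = {
    1: {"id": 1, "name": "jason", "age": 18},
    2: {"id": 2, "name": "egon", "age": 20},
    3: {"id": 3, "name": "tank", "age": 22},
}

# Secondary index on name: name -> primary key (the leaf holds only the key).
idx_name = {row["name"]: pk for pk, row in clustered.items()}


def select(column, name):
    """Model `select <column> from user where name = <name>`.

    Returns (value, clustered_lookups): the second item counts how many
    times the clustered index had to be visited after the secondary index.
    """
    pk = idx_name.get(name)
    if pk is None:
        return None, 0
    if column == "name":
        return name, 0                 # covered: the index entry already has it
    if column == "id":
        return pk, 0                   # covered: the entry stores the primary key
    return clustered[pk][column], 1    # not covered: go back to the full row


print(select("name", "jason"))  # ('jason', 0)
print(select("age", "jason"))   # (18, 1)
```

The extra visit to `clustered` in the second call is the additional primary key lookup that section 4.4 describes.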

Index Testing Exercise

#1. Prepare the table
create table s1(
id int,
name varchar(20),
gender char(6),
email varchar(50)
);

#2. Create a stored procedure that inserts records in bulk
# Declare $$ as the statement terminator while the procedure is defined
delimiter $$
create procedure auto_insert1()
BEGIN
    declare i int default 1;
    while(i<3000000)do
        insert into s1 values(i,'jason','male',concat('jason',i,'@oldboy'));
        set i=i+1;
    end while;
END$$
# Restore the semicolon as the statement terminator
delimiter ;

#3. Inspect the stored procedure
show create procedure auto_insert1\G

#4. Call the stored procedure
call auto_insert1();

# With no index on the table at all
select * from s1 where id=30000;
# Use count() so that printing rows does not distort the timing
select count(id) from s1 where id = 30000;
select count(id) from s1 where id = 1;

# Make id the primary key
alter table s1 add primary key(id);  # very slow

select count(id) from s1 where id = 1;  # orders of magnitude faster than before the index existed
select count(id) from s1 where name = 'jason';  # still very slow

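
# Range conditions
# Adding an index does not guarantee that every query on that column will be fast.
# The wider the range a condition matches, the less the index helps.
select count(id) from s1 where id > 1;  # much slower than id = 1
select count(id) from s1 where id >1 and id < 3;
select count(id) from s1 where id > 1 and id < 10000;
select count(id) from s1 where id != 3;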

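alter table s1 drop primary key;  # drop the primary key to study the name column on its own
select count(id) from s1 where name = 'jason';  # slow again

create index idx_name on s1(name);  # create an index on the name column of s1
select count(id) from s1 where name = 'jason';  # still very slow!
# Think back to how a B+ tree works: the data must have high selectivity.
# Every row in this table is 'jason', so the index cannot tell rows apart
# and the tree is effectively "a single stick".
select count(id) from s1 where name = 'xxx';
# This one is fast: on that stick the very first entry fails to match, so there is no need to go any further
select count(id) from s1 where name like 'xxx';
select count(id) from s1 where name like 'xxx%';
select count(id) from s1 where name like '%xxx';  # slow: leftmost-prefix matching cannot use a leading wildcard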

# Do not index columns with low selectivity
drop index idx_name on s1;

# Create an ordinary index on the id column
create index idx_id on s1(id);
select count(id) from s1 where id = 3;  # fast
select count(id) from s1 where id*12 = 3;  # slow: never let the indexed column take part in a calculation

drop index idx_id on s1;
select count(id) from s1 where name='jason' and gender = 'male' and id = 3 and email = 'xxx';
# For a chain of AND conditions like this one, MySQL looks for an indexed column with high selectivity
# (the order in which the conditions are written does not matter), uses it to shrink the candidate set,
# and only then checks the remaining conditions
create index idx_name on s1(name);
select count(id) from s1 where name='jason' and gender = 'male' and id = 3 and email = 'xxx';  # no speed-up

drop index idx_name on s1;
# Indexing low-selectivity columns such as name and gender does not speed up the query

create index idx_id on s1(id);
select count(id) from s1 where name='jason' and gender = 'male' and id = 3 and email = 'xxx';  # fast: id alone narrows the data down to a single row
select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';  # slow: id > 3 still matches a huge number of rows, and each one must be checked against the other columns

drop index idx_id on s1;

create index idx_email on s1(email);
select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';  # fast: the email column settles it in one stroke
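
The experiments above all come down to selectivity: the share of distinct values in a column. The following Python sketch computes that ratio for data shaped like the s1 table. It uses a smaller row count than the 3,000,000-iteration loop in the stored procedure, and the helper name `selectivity` is an assumption for this example, not a MySQL function.

```python
def selectivity(values):
    """Distinct values divided by total rows; 1.0 means every value is unique."""
    values = list(values)
    return len(set(values)) / len(values) if values else 0.0


n = 10_000  # far fewer rows than the stored procedure inserts, same shape
rows = [(i, "jason", "male", f"jason{i}@oldboy") for i in range(1, n + 1)]
columns = ["id", "name", "gender", "email"]

report = {col: selectivity(row[k] for row in rows) for k, col in enumerate(columns)}
for col, ratio in report.items():
    print(f"{col:<7}{ratio:.4f}")
```

Columns whose ratio is close to 1 (id and email here) are the ones worth indexing. Columns whose ratio is close to 0 (name and gender here) give the index almost nothing to narrow down.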

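Composite Indexes

select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';
# If all four columns had high selectivity, an index on any one of them would speed up the query
# Suppose email is indexed, but the query does not use email
select count(id) from s1 where name='jason' and gender = 'male' and id > 3;
# Suppose name is indexed, but the query does not use name
select count(id) from s1 where gender = 'male' and id > 3;
# Suppose gender is indexed, but the query does not use gender
select count(id) from s1 where id > 3;

# The problem: every column ends up with its own index, a given query may still use none of them,
# and the cost of building an index has been paid four times
create index idx_all on s1(email,name,gender,id);  # leftmost-prefix principle: put the most selective column on the left
select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';  # faster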
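
The leftmost-prefix principle follows from how a composite index is ordered: entries are sorted by the first column, then by the second, and so on. The Python sketch below imitates a two-column index on (email, name) with a sorted list of tuples. The sample data and the function names are invented for illustration, and a real B+ tree is not a flat list.

```python
from bisect import bisect_left, bisect_right

# A composite index on (email, name) behaves like tuples sorted left to right.
entries = sorted(
    (f"user{i}@oldboy", name)
    for i, name in enumerate(["jason", "egon", "tank", "jason", "egon"])
)


def seek_by_email(email):
    """Leading column known: binary search jumps straight to the matching range."""
    lo = bisect_left(entries, (email,))                  # first entry with this email
    hi = bisect_right(entries, (email, chr(0x10FFFF)))   # just past the last one
    return entries[lo:hi]


def scan_by_name(name):
    """Leading column unknown: the sort order is no help, so every entry is examined."""
    return [entry for entry in entries if entry[1] == name]


print(seek_by_email("user3@oldboy"))  # [('user3@oldboy', 'jason')]
print(scan_by_name("egon"))           # [('user1@oldboy', 'egon'), ('user4@oldboy', 'egon')]
```

A condition on email can use the sort order directly, while a condition on name alone has to look at every entry. This is why idx_all puts the most selective column first.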
