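We often run into this question on forums and in interviews: in MySQL, does WHERE ... IN use an index?
To get to the bottom of it, I ran some tests and found that the number of rows in the table affects whether the index is used. Let's take a look.
The MySQL version used is 5.7, the storage engine is the default InnoDB, the index type is the default B+ tree index, and EXPLAIN execution plans are used to confirm whether an index is hit.
First we create a table.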
create table staffs(
  id int primary key auto_increment,
  name varchar(24) not null default '' comment 'name',
  age int not null default 0 comment 'age',
  pos varchar(20) not null default '' comment 'position',
  add_time timestamp not null default current_timestamp comment 'hire time'
) charset utf8 comment 'staff records';
Then insert three rows.
insert into staffs(name,age,pos,add_time) values('z3',22,'manager',now());
insert into staffs(name,age,pos,add_time) values('July',23,'dev',now());
insert into staffs(name,age,pos,add_time) values('2000',23,'dev',now());
alter table staffs add index idx_staffs_name(name);
mysql> explain select * from staffs where name in ('z3', '2000');
+----+-------------+--------+------------+------+-----------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys   | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+-----------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | idx_staffs_name | NULL | NULL    | NULL |    3 |    66.67 | Using where |
+----+-------------+--------+------------+------+-----------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
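As you can see, the index is not used. The row count is 3, and filtered is 66.67, the estimated percentage of rows returned by the storage engine that remain after the server layer applies the condition. In other words, the storage engine returns 3 rows, the MySQL server layer filters out 1, and 2 remain, so filtered is 66.67%. (For details on explain, see the earlier post: http://www.javashuo.com/article/p-nawevcyl-ds.html)
Prepare the composite index.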
alter table staffs drop index idx_staffs_name;
alter table staffs add index idx_staffs_nameAgePos(name, age, pos);
mysql> explain select * from staffs where name = 'z3';
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys         | key                   | key_len | ref   | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | staffs | NULL       | ref  | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 74      | const |    1 |   100.00 | NULL  |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select * from staffs where name in ('z3', '2000');
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys         | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | idx_staffs_nameAgePos | NULL | NULL    | NULL |    3 |    66.67 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.04 sec)
As you can see, the = query uses the index thanks to the leftmost-prefix rule, while the IN query does not.
mysql> explain select * from staffs where name = 'z3' and age = 22;
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys         | key                   | key_len | ref         | rows | filtered | Extra |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
|  1 | SIMPLE      | staffs | NULL       | ref  | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 78      | const,const |    1 |   100.00 | NULL  |
+----+-------------+--------+------------+------+-----------------------+-----------------------+---------+-------------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select * from staffs where name = 'z3' and age in (22, 23);
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys         | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | staffs | NULL       | ALL  | idx_staffs_nameAgePos | NULL | NULL    | NULL |    3 |    66.67 | Using where |
+----+-------------+--------+------------+------+-----------------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
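Likewise, with = on both columns the composite index columns are used in order, but when the second column is queried with IN, even the first column is dragged down and no index is used at all.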
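One way to check whether the full scan above is a cost-based choice on a three-row table, rather than IN being unable to use the index, is to ask for the index explicitly and compare the two plans. The following is a minimal sketch, assuming the staffs table and the idx_staffs_nameAgePos index defined above. FORCE INDEX is standard MySQL syntax; no output is shown because this query was not part of the original test.

```sql
-- Plan chosen freely by the optimizer (type ALL on the three-row table above).
EXPLAIN SELECT * FROM staffs
WHERE name = 'z3' AND age IN (22, 23);

-- Same query, but a table scan is now treated as very expensive,
-- so the optimizer uses the named index if it is usable at all.
EXPLAIN SELECT * FROM staffs FORCE INDEX (idx_staffs_nameAgePos)
WHERE name = 'z3' AND age IN (22, 23);
```

If the second plan shows the index in the key column, the index is usable for IN and the optimizer simply judged the scan cheaper for so few rows.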
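To insert a large amount of data quickly and build the index afterwards, we first drop the original table and create an identical one without any indexes, so that no time is spent maintaining indexes during the insert. A stored procedure does the inserting.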
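DELIMITER $$
CREATE PROCEDURE test_insert()
BEGIN
  declare i int;
  set i = 1;
  -- Inserts rows a1 .. a9999 (the loop stops before i reaches 10000).
  WHILE (i < 10000) DO
    -- Note: (100 - i + 1) goes negative once i > 101, so most generated ages
    -- are at or below 20 and often negative; kept as in the original test.
    INSERT INTO staffs(`name`,`age`,`pos`)
    VALUES(CONCAT('a', i), FLOOR(20 + RAND() * (100 - i + 1)), 'dev');
    set i = i + 1;
  END WHILE;
  commit;
END$$
DELIMITER ;
CALL test_insert();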
Query OK, 0 rows affected (8 min 7.84 sec)
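9,999 rows took just over 8 minutes, which is rather slow.
Following the same steps as before, create the index (the commands are the same as above and are omitted to save space, likewise below), then query.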
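A likely cause of the slow insert, though it was not measured here, is that with autocommit enabled every INSERT in the loop is committed on its own, so the final commit has nothing left to do. The sketch below wraps the whole loop in one explicit transaction. The procedure name test_insert_batched is hypothetical, and the value expressions are copied unchanged from the procedure above.

```sql
DROP PROCEDURE IF EXISTS test_insert_batched;
DELIMITER $$
CREATE PROCEDURE test_insert_batched()
BEGIN
  DECLARE i INT DEFAULT 1;
  -- One explicit transaction for the whole loop instead of one per row.
  START TRANSACTION;
  WHILE i < 10000 DO
    INSERT INTO staffs(`name`, `age`, `pos`)
    VALUES (CONCAT('a', i), FLOOR(20 + RAND() * (100 - i + 1)), 'dev');
    SET i = i + 1;
  END WHILE;
  COMMIT;
END$$
DELIMITER ;
CALL test_insert_batched();
```

How much time this saves depends on the server's durability settings and the disk, so treat it as something to try rather than a guaranteed fix.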
mysql> explain select * from staffs where name in ('a1', 'a2000');
+----+-------------+--------+------------+-------+-----------------+-----------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys   | key             | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------+-----------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_name | idx_staffs_name | 74      | NULL |    2 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------+-----------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
The index is hit: 2 rows, filtered 100%.
Likewise, drop the single-column index first and create the composite index.
mysql> explain select * from staffs where name in ('a1', 'a2000');
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 74      | NULL |    2 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
The index is hit.
mysql> explain select * from staffs where name in ('a1', 'a2000') and age = 23;
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 78      | NULL |    2 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
Adding another condition after the IN column also hits the index.
mysql> explain select * from staffs where name = 'a1' and age in (22, 23);
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 78      | NULL |    2 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.01 sec)
mysql> explain select * from staffs where name in ('a1', 'a2000') and age in (22, 23);
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys         | key                   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | staffs | NULL       | range | idx_staffs_nameAgePos | idx_staffs_nameAgePos | 78      | NULL |    4 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+-----------------------+-----------------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
IN on a middle column has no adverse effect either; the index is still hit.
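3.1 When the table is small, equality conditions use the composite index in column order, but IN queries used neither the single-column index nor the composite index. A likely reason is that MySQL considers the table so small that a full table scan is cheaper than going through the index.
3.2 When the table is large, the single-column index was used for IN in these tests, and the composite index was also used in column order.
3.3 Of course, the IN lists here are short. If the list were a very long array, so that the result covered a large share of the table, MySQL would probably skip the index and do a full table scan instead.
3.4 The case where the IN condition is a subquery has not been tested here, nor have versions below 5.7. Here is a passage quoted from another blogger: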
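Versions before 5.5 indeed do not use the index; MySQL optimized this in later versions. In version 5.5, released in 2010, the optimizer can optimize the in operator automatically: columns that have an index can use it, while columns without an index still get a full table scan.
Take a version before 5.5 as an example (everything below refers to versions before 5.5). select * from a where id in (select id from b); The execution plan for this statement does not actually fetch all the ids from table b first and then compare them with the ids in table a. MySQL converts the in subquery into a correlated exists subquery, so it is effectively equivalent to this statement: select * from a where exists(select * from b where b.id=a.id);
A correlated exists subquery works like this: loop over every row of table a and compare it against table b on the condition a.id=b.id, checking whether the id of each row in a exists in b. If it does, that row of a is returned.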
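Since the subquery case was not tested above, here is a minimal sketch of how it could be checked on whatever version is at hand. The tables a and b are hypothetical, named after those in the quoted passage and reduced to a single id column. No output is shown because the plan depends on the MySQL version and on table size.

```sql
-- Hypothetical tables, named after a and b in the quoted passage.
CREATE TABLE a (id INT PRIMARY KEY);
CREATE TABLE b (id INT PRIMARY KEY);

-- The IN-subquery form discussed in the quote.
EXPLAIN SELECT * FROM a WHERE id IN (SELECT id FROM b);

-- The correlated EXISTS form it is said to be rewritten into.
EXPLAIN SELECT * FROM a WHERE EXISTS (SELECT * FROM b WHERE b.id = a.id);
```

Comparing the select_type and key columns of the two plans shows whether the server treats the two forms the same way.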