大部分開發者都知道加索引會快。但實際過程當中,咱們常碰到一些疑問&困難:git
本文嘗試去解釋索引的基本知識&解答上述的疑問。redis
大部分開發者接觸索引,大概知道索引相似書的目錄,你要找到想要的內容,經過目錄找到限定的關鍵字,進而找到對應的章節的pageno,再找到具體的內容。
在數據結構裏面,最簡單的索引實現相似hashmap,經過關鍵字key,映射到具體的位置,找到具體的內容。但除了hash的方式,還有多種的方式實現索引。算法
hash / b-tree / b+-tree redis HSET / MongoDB&PostgreSQL / MySQLsql
hashmap數組
一圖見b-tree & b+-tree 差異:bash
算法查找複雜度上來講:數據結構
至於爲什麼MongoDB 的實現選擇b-tree 而非 b+-tree ?
網上多篇文章有闡述,非本文重點。less
insert/update/delete 會觸發rebalance tree,因此,增刪改數據,索引會觸發修改,性能會有損耗,索引不是越多越好。既然如此,選哪些字段做爲索引呢?當查詢用到這些條件,怎麼辦?
拿一個最簡單的hashmap來說,爲何複雜度不是O(1),而是所謂接近 O(1)。由於有key 衝突/重複,DB 去找的時候,key 衝突的數據一大堆的話,仍是得輪着繼續找。b-tree 看鍵(key)的選擇也是如此。
所以一個大部分開發常常犯的錯就是對沒有區分度的key建索引。例如:不少就只有集中類別的 type/status 的 documents count 達幾十萬以上的collection,一般這種索引沒什麼幫助。性能
若是不想多建多餘的索引,開發的同事在複合 & 單個字段選擇上有時候挺糾結的。 根據典型碰到的場景,來作幾個實驗:
這裏建立了個loans collection。簡化只有100條數據。這個是借貸的表有 _id, userId, status(借貸狀態), amount(金額).優化
db.loans.count()100
db.loans.find({ "userId" : "59e022d33f239800129c61c7", "status" : "repayed", }).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "cashLoan.loans",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"status" : {
"$eq" : "repayed"
}
},
{
"userId" : {
"$eq" : "59e022d33f239800129c61c7"
}
}
]
},
"queryHash" : "15D5A9A1",
"planCacheKey" : "15D5A9A1",
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"$and" : [
{
"status" : {
"$eq" : "repayed"
}
},
{
"userId" : {
"$eq" : "59e022d33f239800129c61c7"
}
}
]
},
"direction" : "forward"
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
"host" : "RMBAP",
"port" : 27017,
"version" : "4.1.11",
"gitVersion" : "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"
},
"ok" : 1
}
複製代碼
注意上面 COLLSCAN 全表掃描了,由於沒有索引。接下來咱們分別創建幾個索引。
step 1 先創建 {userId:1, status:1}
db.loans.createIndex({userId:1, status:1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
複製代碼
db.loans.find({ "userId" : "59e022d33f239800129c61c7", "status" : "repayed", }).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "cashLoan.loans",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"status" : {
"$eq" : "repayed"
}
},
{
"userId" : {
"$eq" : "59e022d33f239800129c61c7"
}
}
]
},
"queryHash" : "15D5A9A1",
"planCacheKey" : "BB87F2BA",
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"userId" : 1,
"status" : 1
},
"indexName" : "userId_1_status_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"userId" : [ ],
"status" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"userId" : [
"["59e022d33f239800129c61c7", "59e022d33f239800129c61c7"]"
],
"status" : [
"["repayed", "repayed"]"
]
}
}
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
"host" : "RMBAP",
"port" : 27017,
"version" : "4.1.11",
"gitVersion" : "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"
},
"ok" : 1
}
複製代碼
結果:如願命中 {userId:1, status:1} 做爲 winning plan。
step2:再創建個典型的索引 userId
db.loans.createIndex({userId:1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 2,
"numIndexesAfter" : 3,
"ok" : 1
}
複製代碼
db.loans.find({ "userId" : "59e022d33f239800129c61c7", "status" : "repayed", }).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "cashLoan.loans",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"status" : {
"$eq" : "repayed"
}
},
{
"userId" : {
"$eq" : "59e022d33f239800129c61c7"
}
}
]
},
"queryHash" : "15D5A9A1",
"planCacheKey" : "1B1A4861",
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"userId" : 1,
"status" : 1
},
"indexName" : "userId_1_status_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"userId" : [ ],
"status" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"userId" : [
"[\"59e022d33f239800129c61c7\", \"59e022d33f239800129c61c7\"]"
],
"status" : [
"[\"repayed\", \"repayed\"]"
]
}
}
},
"rejectedPlans" : [
{
"stage" : "FETCH",
"filter" : {
"status" : {
"$eq" : "repayed"
}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"userId" : 1
},
"indexName" : "userId_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"userId" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"userId" : [
"["59e022d33f239800129c61c7", "59e022d33f239800129c61c7"]"
]
}
}
}
]
},
"serverInfo" : {
"host" : "RMBAP",
"port" : 27017,
"version" : "4.1.11",
"gitVersion" : "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"
},
"ok" : 1
}
複製代碼
留意到 DB 檢測到 {userId:1, status:1} 爲更優執行的方案.
db.loans.find({ "userId" : "59e022d33f239800129c61c7" }).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "cashLoan.loans",
"indexFilterSet" : false,
"parsedQuery" : {
"userId" : {
"$eq" : "59e022d33f239800129c61c7"
}
},
"queryHash" : "B1777DBA",
"planCacheKey" : "1F09D68E",
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"userId" : 1
},
"indexName" : "userId_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"userId" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"userId" : [
"["59e022d33f239800129c61c7", "59e022d33f239800129c61c7"]"
]
}
}
},
"rejectedPlans" : [
{
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"userId" : 1,
"status" : 1
},
"indexName" : "userId_1_status_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"userId" : [ ],
"status" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"userId" : [
"["59e022d33f239800129c61c7", "59e022d33f239800129c61c7"]"
],
"status" : [
"[MinKey, MaxKey]"
]
}
}
}
]
},
"serverInfo" : {
"host" : "RMBAP",
"port" : 27017,
"version" : "4.1.11",
"gitVersion" : "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"
},
"ok" : 1
}
複製代碼
留意到 DB 檢測到 {userId:1} 爲更優執行的方案,嗯~,如咱們所料.
db.loans.find({ "status" : "repayed" }).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "cashLoan.loans",
"indexFilterSet" : false,
"parsedQuery" : {
"status" : {
"$eq" : "repayed"
}
},
"queryHash" : "E6304EB6",
"planCacheKey" : "7A94191B",
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"status" : {
"$eq" : "repayed"
}
},
"direction" : "forward"
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
"host" : "RMBAP",
"port" : 27017,
"version" : "4.1.11",
"gitVersion" : "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"
},
"ok" : 1
}
複製代碼
有趣的部分:status不命中索引,全表掃描
接下來的步驟,加個sort :
db.loans.find({ "userId" : "59e022d33f239800129c61c7" }).sort({status:1}).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "cashLoan.loans",
"indexFilterSet" : false,
"parsedQuery" : {
"userId" : {
"$eq" : "59e022d33f239800129c61c7"
}
},
"queryHash" : "F5ABB1AA",
"planCacheKey" : "764CBAA8",
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"userId" : 1,
"status" : 1
},
"indexName" : "userId_1_status_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"userId" : [ ],
"status" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"userId" : [
"["59e022d33f239800129c61c7", "59e022d33f239800129c61c7"]"
],
"status" : [
"[MinKey, MaxKey]"
]
}
}
},
"rejectedPlans" : [
{
"stage" : "SORT",
"sortPattern" : {
"status" : 1
},
"inputStage" : {
"stage" : "SORT_KEY_GENERATOR",
"inputStage" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"userId" : 1
},
"indexName" : "userId_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"userId" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"userId" : [
"["59e022d33f239800129c61c7", "59e022d33f239800129c61c7"]"
]
}
}
}
}
}
]
},
"serverInfo" : {
"host" : "RMBAP",
"port" : 27017,
"version" : "4.1.11",
"gitVersion" : "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"
},
"ok" : 1
}
複製代碼
有趣的部分:status 不命中索引
db.loans.find({ "status" : "repayed","userId" : "59e022d33f239800129c61c7", }).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "cashLoan.loans",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{
"status" : {
"$eq" : "repayed"
}
},
{
"userId" : {
"$eq" : "59e022d33f239800129c61c7"
}
}
]
},
"queryHash" : "15D5A9A1",
"planCacheKey" : "1B1A4861",
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"userId" : 1,
"status" : 1
},
"indexName" : "userId_1_status_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"userId" : [ ],
"status" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"userId" : [
"[\"59e022d33f239800129c61c7\", \"59e022d33f239800129c61c7\"]"
],
"status" : [
"[\"repayed\", \"repayed\"]"
]
}
}
},
"rejectedPlans" : [
{
"stage" : "FETCH",
"filter" : {
"status" : {
"$eq" : "repayed"
}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"userId" : 1
},
"indexName" : "userId_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"userId" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"userId" : [
"["59e022d33f239800129c61c7", "59e022d33f239800129c61c7"]"
]
}
}
}
]
},
"serverInfo" : {
"host" : "RMBAP",
"port" : 27017,
"version" : "4.1.11",
"gitVersion" : "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"
},
"ok" : 1
}
複製代碼
命中索引,跟query的各個字段順序不相關,如咱們猜想。
有趣部分再來, 咱們刪掉索引{userId:1}
db.loans.dropIndex({"userId":1})
{ "nIndexesWas" : 3, "ok" : 1 }
db.loans.find({"userId" : "59e022d33f239800129c61c7", }).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "cashLoan.loans",
"indexFilterSet" : false,
"parsedQuery" : {
"userId" : {
"$eq" : "59e022d33f239800129c61c7"
}
},
"queryHash" : "B1777DBA",
"planCacheKey" : "5776AB9C",
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"userId" : 1,
"status" : 1
},
"indexName" : "userId_1_status_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"userId" : [ ],
"status" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"userId" : [
"["59e022d33f239800129c61c7", "59e022d33f239800129c61c7"]"
],
"status" : [
"[MinKey, MaxKey]"
]
}
}
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
"host" : "RMBAP",
"port" : 27017,
"version" : "4.1.11",
"gitVersion" : "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"
},
"ok" : 1
}
複製代碼
DB 執行分析器以爲索引{userId:1, status:1} 能更優,沒有命中複合索引,這個是由於status不是leading field。
db.loans.find({ "status" : "repayed" }).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "cashLoan.loans",
"indexFilterSet" : false,
"parsedQuery" : {
"status" : {
"$eq" : "repayed"
}
},
"queryHash" : "E6304EB6",
"planCacheKey" : "7A94191B",
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"status" : {
"$eq" : "repayed"
}
},
"direction" : "forward"
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
"host" : "RMBAP",
"port" : 27017,
"version" : "4.1.11",
"gitVersion" : "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"
},
"ok" : 1
}
複製代碼
再換個角度sort 一遍, 與前面query & sort互換,以前是:
db.loans.find({userId:1}).sort({ "status" : "repayed" })
複製代碼
看看有啥不同?
db.loans.find({ "status" : "repayed" }).sort({userId:1}).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "cashLoan.loans",
"indexFilterSet" : false,
"parsedQuery" : {
"status" : {
"$eq" : "repayed"
}
},
"queryHash" : "56EA6313",
"planCacheKey" : "2CFCDA7F",
"winningPlan" : {
"stage" : "FETCH",
"filter" : {
"status" : {
"$eq" : "repayed"
}
},
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"userId" : 1,
"status" : 1
},
"indexName" : "userId_1_status_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"userId" : [ ],
"status" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"userId" : [
"[MinKey, MaxKey]"
],
"status" : [
"[MinKey, MaxKey]"
]
}
}
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
"host" : "RMBAP",
"port" : 27017,
"version" : "4.1.11",
"gitVersion" : "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"
},
"ok" : 1
}
複製代碼
如猜想,命中索引。
再來玩玩,確認下leading filed試驗:
db.loans.dropIndex("userId_1_status_1")
{ "nIndexesWas" : 2, "ok" : 1 }
複製代碼
db.loans.getIndexes()
[
{
"v" : 2,
"key" : {
"id" : 1
},
"name" : "id_",
"ns" : "cashLoan.loans"
}
]
複製代碼
db.loans.createIndex({status:1, userId:1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
複製代碼
db.loans.getIndexes()
[
{
"v" : 2,
"key" : {
"id" : 1
},
"name" : "id_",
"ns" : "cashLoan.loans"
},
{
"v" : 2,
"key" : {
"status" : 1,
"userId" : 1
},
"name" : "status_1_userId_1",
"ns" : "cashLoan.loans"
}
]
複製代碼
db.loans.find({ "status" : "repayed" }).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "cashLoan.loans",
"indexFilterSet" : false,
"parsedQuery" : {
"status" : {
"$eq" : "repayed"
}
},
"queryHash" : "E6304EB6",
"planCacheKey" : "7A94191B",
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"status" : 1,
"userId" : 1
},
"indexName" : "status_1_userId_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"status" : [ ],
"userId" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"status" : [
"["repayed", "repayed"]"
],
"userId" : [
"[MinKey, MaxKey]"
]
}
}
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
"host" : "RMBAP",
"port" : 27017,
"version" : "4.1.11",
"gitVersion" : "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"
},
"ok" : 1
}
複製代碼
db.loans.getIndexes()
[
{
"v" : 2,
"key" : {
"id" : 1
},
"name" : "id_",
"ns" : "cashLoan.loans"
},
{
"v" : 2,
"key" : {
"status" : 1,
"userId" : 1
},
"name" : "status_1_userId_1",
"ns" : "cashLoan.loans"
}
]
複製代碼
db.loans.find({"userId" : "59e022d33f239800129c61c7", }).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "cashLoan.loans",
"indexFilterSet" : false,
"parsedQuery" : {
"userId" : {
"$eq" : "59e022d33f239800129c61c7"
}
},
"queryHash" : "B1777DBA",
"planCacheKey" : "5776AB9C",
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"userId" : {
"$eq" : "59e022d33f239800129c61c7"
}
},
"direction" : "forward"
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
"host" : "RMBAP",
"port" : 27017,
"version" : "4.1.11",
"gitVersion" : "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"
},
"ok" : 1
}
複製代碼
看完這個試驗,明白了 {userId:1, status:1} vs {status:1,userId:1} 的差異了嗎?
PS:這個case 裏面其實status 區分度不高,這裏只是做爲實例展現。
DB 通常都有執行器優化的分析,MySQL & MongoDB 都是 用explain 來作分析。
語法上MySQL :
explain your_sql
MongoDB:
yoursql.explain()
總結典型:理想的查詢是結合explain 的指標,他們一般是多個的混合:
文末,還有最開頭1個問題沒回答:若是個人索引改加的都加了,還不夠快,怎麼辦? 留個懸念,以後再寫一篇。