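Indexes are fundamental to database design, and they tell the developer using the database a great deal about the designer's intentions. Unfortunately, indexes are too often added as an afterthought when performance problems appear. Here at last is a simple series of articles that should bring any database professional rapidly "up to speed" with them.

Level 1 in the SQL Server Indexes Stairway introduced SQL Server indexes in general and nonclustered indexes in particular. As our first case study, we demonstrated the potential benefit of an index when retrieving a single row from a table. In this level, we continue our investigation of nonclustered indexes, examining their contribution to good query performance in cases that go beyond retrieving a single row from a table.

As in most of these levels, we introduce a small amount of theory, examine some index internals to help explain the theory, and then execute some queries. These queries are executed with and without an index, with performance reporting statistics turned on, so that we can see the impact of the index.

We will be using the subset of tables from the AdventureWorks database that we used in Level 1, concentrating on the Contact table throughout this level. We will use just one index, the FullName index that we used in Level 1, to illustrate our points. To ensure that we control the indexing on the Contact table, we will make two copies of the table in the dbo schema and build the FullName index on only one of them. This gives us our controlled environment: two copies of the table, one with a single nonclustered index and one without any indexes.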
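Note: All the T-SQL code shown in this Stairway level can be downloaded at the bottom of the article.

The code in Listing 2.1 makes the copies of the Person.Contact table. We can rerun this batch any time we wish to start with a "clean slate".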
-- Drop the indexed copy if it is left over from an earlier run.
IF EXISTS (
    SELECT *
    FROM sys.tables
    WHERE OBJECT_ID = OBJECT_ID('dbo.Contacts_index'))
    DROP TABLE dbo.Contacts_index;
GO
-- Drop the non-indexed copy as well.
IF EXISTS (
    SELECT *
    FROM sys.tables
    WHERE OBJECT_ID = OBJECT_ID('dbo.Contacts_noindex'))
    DROP TABLE dbo.Contacts_noindex;
GO
-- Recreate both copies; SELECT ... INTO copies the rows but not the indexes.
SELECT * INTO dbo.Contacts_index
FROM Person.Contact;
SELECT * INTO dbo.Contacts_noindex
FROM Person.Contact;
Listing 2.1: Making copies of the Person.Contact table

A fragment of the Contact table is shown here:
ContactID  FirstName  MiddleName  LastName   EmailAddress
1288       Laura      F           Norman     laura1@adventure-works.com
651        Michael                Patten     michael20@adventure-works.com
1652       Isabella   R           James      isabella6@adventure-works.com
1015       David      R           Campbell   david8@adventure-works.com
1379       Balagane               Swaminath  balaganesan0@adventure-works.c
742        Steve                  Schmidt    steve3@adventure-works.com
1743       Shannon    C           Guo        shannon16@adventure-works.com
1106       John       Y           Chen       john2@adventure-works.com
1470       Blaine                 Dockter    blaine1@adventure-works.com
833        Clarence   R.          Tatman     clarence0@adventure-works.com
1834       Heather    M           Wu         heather6@adventure-works.com
1197       Denise     H           Smith      denise0@adventure-works.com
560        Jennifer   J.          Maxham     jennifer1@adventure-works.com
1561       Ido                    Ben-Sacha  ido1@adventure-works.com
924        Becky      R.          Waters     becky0@adventure-works.com
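After running Listing 2.1, a quick sanity check along the following lines may be useful. This is only a sketch, assuming the two copies were created exactly as above. `sys.tables` and `sys.indexes` are standard catalog views; the column aliases (`TableName`, `ContactRows`, `IndexName`) are ours and do not come from the article. A table built with SELECT ... INTO is expected to appear as a heap, that is, a single `sys.indexes` row with `index_id` 0.

```sql
-- Row counts of the two copies; both should match Person.Contact.
SELECT 'Contacts_index' AS TableName, COUNT(*) AS ContactRows
FROM dbo.Contacts_index
UNION ALL
SELECT 'Contacts_noindex', COUNT(*)
FROM dbo.Contacts_noindex;

-- Indexes currently present on each copy.
-- Immediately after Listing 2.1, each table should show only a HEAP row.
SELECT t.name      AS TableName,
       i.index_id,
       i.name      AS IndexName,
       i.type_desc
FROM sys.tables AS t
JOIN sys.indexes AS i
  ON i.object_id = t.object_id
WHERE SCHEMA_NAME(t.schema_id) = 'dbo'
  AND t.name IN ('Contacts_index', 'Contacts_noindex')
ORDER BY t.name, i.index_id;
```

Once the FullName index has been created later in this level, rerunning the second query should show one additional NONCLUSTERED row for dbo.Contacts_index only.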
Nonclustered Index Entries

The following statement creates our FullName nonclustered index on the Contacts_index table.
CREATE INDEX FullName
ON Contacts_index
( LastName, FirstName );
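Listing 2.2: Creating a nonclustered index

Remember that a nonclustered index stores the index keys in order, along with a bookmark that is used to access the actual data in the table itself. You can think of the bookmark as a kind of pointer. Future levels will describe the bookmark, its form, and its use in more detail.

A fragment of the FullName index is shown here, consisting of last name and first name as the key columns, plus the bookmark: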
Search Key Columns      : Bookmark
Russell    Zachary      =>
Ruth       Andy         =>
Ruth       Andy         =>
Ryan       David        =>
Ryan       Justin       =>
Sabella    Deanna       =>
Sackstede  Lane         =>
Sackstede  Lane         =>
Saddow     Peter        =>
Sai        Cindy        =>
Sai        Kaitlin      =>
Sai        Manuel       =>
Salah      Tamer        =>
Salanki    Ajay         =>
Salavaria  Sharon       =>
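The bookmark itself cannot be selected with ordinary T-SQL, but the key columns and their left-to-right order can be confirmed from the catalog. The sketch below assumes the FullName index was created on dbo.Contacts_index as in Listing 2.2. `sys.indexes`, `sys.index_columns`, and `sys.columns` are standard catalog views; the aliases `IndexName` and `KeyColumn` are ours.

```sql
-- Key columns of the FullName index, in key order.
SELECT i.name        AS IndexName,
       i.type_desc,
       ic.key_ordinal,
       c.name        AS KeyColumn
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
  ON ic.object_id = i.object_id
 AND ic.index_id  = i.index_id
JOIN sys.columns AS c
  ON c.object_id = ic.object_id
 AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID('dbo.Contacts_index')
  AND i.name = 'FullName'
  AND ic.is_included_column = 0
ORDER BY ic.key_ordinal;

-- Key values in the same sequence as the index entries,
-- limited to the range of last names shown in the fragment.
SELECT LastName, FirstName
FROM dbo.Contacts_index
WHERE LastName >= 'Russell'
  AND LastName <= 'Salavaria'
ORDER BY LastName, FirstName;
```

The first query should list LastName with key_ordinal 1 and FirstName with key_ordinal 2. The second may return more rows than the fragment, which is only an excerpt.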
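Each entry contains the index key columns and the bookmark value. In addition, a SQL Server nonclustered index entry has some internal-use-only header information and may contain some optional data values. Both of these will be covered in later levels; neither is important at this point for a basic understanding of nonclustered indexes.

For now, all we need to know is that the key value enables SQL Server to find the appropriate index entry or entries, and that the entry's bookmark value enables SQL Server to access the corresponding data row in the table.

The Benefit of Index Entries Being in Sequence

The entries of an index are sequenced by index key value, so SQL Server can rapidly traverse the entries in either direction. This scan of the sequenced entries can begin at the start of the index, at the end of the index, or at any entry within the index.

Thus, if a request asks for all contacts whose last name begins with the letter "S" (WHERE LastName LIKE 'S%'), SQL Server can quickly navigate to the first "S" entry ("Sabella, Deanna") and then traverse the index, using the bookmarks to access the rows, until it reaches the first "T" entry. At that point it knows that it has retrieved all the "S" entries.

The above request would execute even faster if all the selected columns were in the index. Thus, if we issue: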
SELECT FirstName, LastName
FROM Contact
WHERE LastName LIKE 'S%';
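SQL Server can quickly navigate to the first "S" entry and then traverse the index entries, ignoring the bookmarks and retrieving the data values directly from the index entries, until it reaches the first "T" entry. In relational database terminology, the index has "covered" the query.

Any SQL operator that benefits from sequenced data can benefit from an index. This includes ORDER BY, GROUP BY, DISTINCT, UNION (not UNION ALL), and JOIN ... ON.

For example, if a request asks for a count of contacts by last name, SQL Server can begin counting at the first entry and proceed down the index. Every time the value of last name changes, SQL Server outputs the current count and begins a new count. As with the previous request, this is a covered query; SQL Server accesses just the index, ignoring the table completely.

Note the importance of the left-to-right sequence of the key columns. Our index is very helpful if a request asks for everyone whose last name is "Ashton", but it is of little or no help if the request is for everyone whose first name is "Ashton".

Testing Some Sample Queries

If you want to execute the test queries that follow, make sure you run the script that creates both versions of the new Contact table, dbo.Contacts_index and dbo.Contacts_noindex, and also run the script that creates the LastName, FirstName index on dbo.Contacts_index.

To verify the assertions in the previous section, we turn on the same performance statistics that we used in Level 1 and run some queries, with and without an index.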
SET STATISTICS io ON
SET STATISTICS time ON
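With both settings on, two of the assertions from the previous section can be checked directly. The following is a sketch rather than part of the original test set: the queries are ours, they assume dbo.Contacts_index carries the FullName index on (LastName, FirstName), and no read counts are quoted because we have not measured them here.

```sql
-- Predicate on the leading key column: SQL Server can navigate
-- straight to the 'Ashton' entries in the FullName index.
SELECT FirstName, LastName
FROM dbo.Contacts_index
WHERE LastName = 'Ashton';

-- Predicate on the second key column only: matching entries are
-- scattered throughout the index, so it cannot navigate to them
-- and must read every entry (or every row of the table) instead.
SELECT FirstName, LastName
FROM dbo.Contacts_index
WHERE FirstName = 'Ashton';

-- Count of contacts by last name: only LastName is referenced,
-- so the index covers the query and is already in grouping order.
SELECT LastName, COUNT(*) AS Contacts
FROM dbo.Contacts_index
GROUP BY LastName;
```

Comparing the "logical reads" figure reported for the first two queries is one way to see the effect of key column order. Running the third query against dbo.Contacts_noindex as well shows what the same count costs without the index.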
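Because the Contact table in the AdventureWorks database has only 19972 rows, it is difficult to obtain meaningful values for statistics time. Most of our queries show a CPU time value of 0, so we do not show the output from statistics time; we show only the output from statistics IO, which reflects the number of pages that had to be read. These values allow us to compare queries in a relative sense, to determine which queries, under which indexes, perform better than others. If you want a larger table for more realistic timing tests, a script that builds a million-row version of the Contact table is available with this article. All the discussion that follows assumes you are using the standard 19972-row table.

Testing a Covered Query

Our first query is one that will be covered by the index; it retrieves a limited set of columns for all contacts whose last name begins with "S". Query execution information is given in Table 2.1.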
SQL           | SELECT FirstName, LastName
              | FROM dbo.Contacts  -- execute with both Contacts_noindex and Contacts_index
              | WHERE LastName LIKE 'S%'
Without Index | (2130 row(s) affected)
              | Table 'Contacts_noindex'. Scan count 1, logical reads 568.
With Index    | (2130 row(s) affected)
              | Table 'Contacts_index'. Scan count 1, logical reads 14.
Index Impact  | IO reduced from 568 reads to 14 reads.
Comments      | An index that covers the query is a good thing to have. Without an index, the entire table is scanned to find the rows. The "2130 rows" statistic indicates that "S" is a popular initial letter for last names, occurring in roughly ten percent of all contacts.
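Table 2.1: Execution results when running a covered query

Testing a Non-Covered Query

Next, we modify our query to request the same rows as before, but include columns that are not in the index. Query execution information is given in Table 2.2.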
SQL           | SELECT *
              | FROM dbo.Contacts  -- execute with both Contacts_noindex and Contacts_index
              | WHERE LastName LIKE 'S%'
Without Index | Same as the previous query (568 logical reads), because it is a table scan.
With Index    | (2130 row(s) affected)
              | Table 'Contacts_index'. Scan count 1, logical reads 568.
Index Impact  | No impact at all.
Comments      | The index was never used during the execution of the query! SQL Server decided that jumping from an index entry to the corresponding row in the table 2130 times (once for each row) was more work than scanning the entire table of 19972 rows to find the 2130 rows that it needed.
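Table 2.2: Execution results when running a non-covered query

Testing a Non-Covered Query, but Being More Selective

This time, we make our query more selective; that is, we narrow down the number of rows being requested. This increases the probability that the index will benefit the query. Query execution information is given in Table 2.3.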
SQL           | SELECT *
              | FROM dbo.Contacts  -- execute with both Contacts_noindex and Contacts_index
              | WHERE LastName LIKE 'Ste%'
Without Index | Same IO as the previous query (568 logical reads), because it is a table scan.
With Index    | (107 row(s) affected)
              | Table 'Contacts_index'. Scan count 1, logical reads 111.
Index Impact  | IO reduced from 568 reads to 111 reads.
Comments      | SQL Server accessed the 107 "Ste%" entries, all of which are located consecutively within the index. Each entry's bookmark was then used to retrieve the corresponding row. The rows are not located consecutively within the table. The index benefitted this query, but not as much as it benefitted the first query, the "covered" query, especially in terms of the number of IOs required to retrieve each row. You might expect that reading 107 index entries plus 107 rows would require 107 + 107 reads. The reason why only 111 reads were required will be covered at a higher level. For now, we will say that very few of the reads were used to access the index entries; most were used to access the rows. Since the previous query, which requested 2130 rows, did not benefit from the index, and this query, which requested 107 rows, did benefit from it, you might also wonder "where does the tipping point lie?" The calculations behind SQL Server's decision will also be covered in a future level.
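Table 2.3: Execution results when running a more selective non-covered query

Testing a Covered Aggregate Query

Our last sample query is an aggregate query; that is, a query that involves counting, totaling, averaging, and so on. In this case, it is a query that tells us the extent to which names are duplicated within the Contact table. The results, in part, look like this: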
LastName  FirstName  Contacts
Steel     Merrill    1
Steele    Joan       1
Steele    Laura      2
Steelman  Shanay     1
Steen     Heidi      2
Stefani   Stefano    1
Steiner   Alan       1

Query execution information is given in Table 2.4.
SQL           | SELECT LastName, FirstName, COUNT(*) as 'Contacts'
              | FROM dbo.Contacts  -- execute with both Contacts_noindex and Contacts_index
              | WHERE LastName LIKE 'Ste%'
              | GROUP BY LastName, FirstName
Without Index | Same IO as the previous query (568 logical reads), because it is a table scan.
With Index    | (104 row(s) affected)
              | Table 'Contacts_index'. Scan count 1, logical reads 4.
Index Impact  | IO reduced from 568 reads to 4 reads.
Comments      | All the information needed by the query is in the index, and it is in the index in the ideal sequence for calculating the counts. All the "last name begins with 'Ste'" entries are consecutive within the index, and within that group, all the entries for a single FirstName / LastName value are grouped together. No access to the table was required, nor was any sorting of intermediate results needed. Again, an index that covers the query is a good thing to have.
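Table 2.4: Execution results when running a covered aggregate query

Testing a Non-Covered Aggregate Query

If we change the query to include columns that are not in the index, we get the performance results shown in Table 2.5.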
SQL           | SELECT LastName, FirstName, MiddleName, COUNT(*) as 'Contacts'
              | FROM dbo.Contacts  -- execute with both Contacts_noindex and Contacts_index
              | WHERE LastName LIKE 'Ste%'
              | GROUP BY LastName, FirstName, MiddleName
Without Index | Same IO as the previous query (568 logical reads), because it is a table scan.
With Index    | (105 row(s) affected)
              | Table 'Contacts_index'. Scan count 1, logical reads 111.
Index Impact  | IO reduced from 568 reads to 111 reads; same as the previous non-covered query.
Comments      | Intermediate work done while processing the query does not always appear in the statistics. Techniques that use memory or tempdb to sort and merge data are examples of this. In reality, the benefit of an index may be greater than that shown by the statistics.
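The queries in the tables above are written against dbo.Contacts as a placeholder, with a comment asking for each to be run against both copies. A minimal sketch of what that looks like for the last query follows. It assumes the two tables and the FullName index from Listings 2.1 and 2.2; the alias is written as `Contacts` without quotes, which is our choice rather than the article's.

```sql
SET STATISTICS IO ON;

-- Run 1: the copy without any index (expected to be a full table scan).
SELECT LastName, FirstName, MiddleName, COUNT(*) AS Contacts
FROM dbo.Contacts_noindex
WHERE LastName LIKE 'Ste%'
GROUP BY LastName, FirstName, MiddleName;

-- Run 2: the copy with the FullName index. MiddleName is not in the
-- index, so each qualifying entry's bookmark must be followed to the row.
SELECT LastName, FirstName, MiddleName, COUNT(*) AS Contacts
FROM dbo.Contacts_index
WHERE LastName LIKE 'Ste%'
GROUP BY LastName, FirstName, MiddleName;

SET STATISTICS IO OFF;
```

The "logical reads" values appear on the Messages tab in SQL Server Management Studio, one line per table touched. The same pattern applies to each of the earlier test queries.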