Every round of database performance tuning is a fresh lesson in performance optimization and a forceful reminder of the gaps in my own knowledge; only by continually summarizing and learning can we avoid repeating detours.
Summary of contents:
1. Performance problem description
2. Monitoring and analysis
3. Wait-type analysis
4. Optimization plan
5. Optimization results
The application side reported that system queries were slow, often taking a very long time to return results. The SQL Server database server lacked throughput and CPU headroom, with CPU frequently spiking to 100%.
Performance data was collected in two ways: continuous collection over a period of time, and real-time collection during peak hours.
Collecting performance counters continuously for a full day (hereafter "continuous monitoring")
Goal: get an overall picture of CPU/memory/disk/SQL Server and identify, at the macro level, the server's main performance bottlenecks.
Tools: Performance Monitor (Perfmon) + the PAL log analyzer (see my other blog post for how to use these tools)
Configuration:
The main performance counters configured in Perfmon are listed in the table below
Perfmon sampling interval: 15 seconds (don't make it too short, or the collection itself puts extra pressure on the server)
Collection window: business hours, 8:00-20:00, for one full day
Analyzing the monitoring results
After collection, PAL (a log-analysis tool; see my other blog post) produced the analysis automatically and surfaced the main performance problems:
During business peaks, CPU approaches 100%, accompanied by many Latch waits and large numbers of table scans during queries. These are only macro-level, symptom-like findings; they do not by themselves prove that CPU resources are insufficient, so further evidence is needed.
PAL highlighted several prominent performance problems:
1. CPU approaches its limit during business peaks: around 60% on average, above 80% at peak, hitting 100% in extreme cases
2. Latch waits persist throughout, averaging >500; Non-Page Latch waits are severe
3. SQL compilations and recompilations are higher than normal
4. PLE (Page Life Expectancy, how long a page lives in memory) falls off a cliff at a certain point in time
After dropping at some point in the morning, it stays low until around 4 PM, meaning pages churned in memory throughout that period and large volumes of pages were read from disk into memory, very likely caused by widespread table scans.
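PLE can also be checked from inside SQL Server rather than through Perfmon/PAL; a minimal sketch against sys.dm_os_performance_counters (on a named instance the object name carries an MSSQL$<instance> prefix, hence the LIKE):

-- Current Page Life Expectancy, in seconds
SELECT [object_name], counter_name, cntr_value AS ple_seconds
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE N'%Buffer Manager%'
  AND counter_name LIKE N'Page life expectancy%';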
Real-time monitoring of performance counters
Goal: using the business peak window (PeakTime) already identified by the continuous monitoring, watch the counters in real time during that window to confirm the problems.
Tool: SQLCheck (see my other blog post for usage)
Configuration: configure the client connection in SQLCheck
Tip: don't run SQLCheck on the server being monitored; run it from a different machine
Analyzing the monitoring results
Real-time monitoring showed severe Non-Page Latch waits, consistent with the "continuous monitoring" results above
Blocking between sessions happened frequently; analysis showed that queries returning large result sets were blocking other queries, updates, and deletes
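The blocking chains can also be confirmed directly from the DMVs alongside SQLCheck; a minimal sketch (the column selection is mine, not part of the original monitoring setup):

-- Requests that are currently blocked, and who is blocking them
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time AS wait_time_ms,
       r.command,
       t.text AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;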
Detailed analysis
The database performs large numbers of table scans, so the buffer cache cannot satisfy queries and data must be read from disk, producing IO waits and, in turn, blocking.
1. Non-Page Latch waits last a long time
2. While Non-Page Latch waits are happening, real-time monitoring shows large queries executing
3. Blocking between sessions accompanies the large queries, with CPU spiking above 95% at the same time
Solution
Find the problem statements, create indexes targeted at the query predicates to reduce scanning, and update statistics.
If that is not enough, consider moving the affected data to a faster IO subsystem and adding memory.
Next, use wait types to analyze from another angle which resources are actually the bottleneck
Tools: DMVs/DMOs
Steps:
1. First clear the historical wait statistics
Run the following statement at around 8 AM:
DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);
2. At around 8 PM, run the following statement to collect statistics on the top wait types.
WITH [Waits] AS
    (SELECT
        [wait_type],
        [wait_time_ms] / 1000.0 AS [WaitS],
        ([wait_time_ms] - [signal_wait_time_ms]) / 1000.0 AS [ResourceS],
        [signal_wait_time_ms] / 1000.0 AS [SignalS],
        [waiting_tasks_count] AS [WaitCount],
        100.0 * [wait_time_ms] / SUM([wait_time_ms]) OVER() AS [Percentage],
        ROW_NUMBER() OVER(ORDER BY [wait_time_ms] DESC) AS [RowNum]
    FROM sys.dm_os_wait_stats
    WHERE [wait_type] NOT IN (
        N'CLR_SEMAPHORE', N'LAZYWRITER_SLEEP', N'RESOURCE_QUEUE',
        N'SQLTRACE_BUFFER_FLUSH', N'SLEEP_TASK', N'SLEEP_SYSTEMTASK',
        N'WAITFOR', N'HADR_FILESTREAM_IOMGR_IOCOMPLETION', N'CHECKPOINT_QUEUE',
        N'REQUEST_FOR_DEADLOCK_SEARCH', N'XE_TIMER_EVENT', N'XE_DISPATCHER_JOIN',
        N'LOGMGR_QUEUE', N'FT_IFTS_SCHEDULER_IDLE_WAIT', N'BROKER_TASK_STOP',
        N'CLR_MANUAL_EVENT', N'CLR_AUTO_EVENT', N'DISPATCHER_QUEUE_SEMAPHORE',
        N'TRACEWRITE', N'XE_DISPATCHER_WAIT', N'BROKER_TO_FLUSH',
        N'BROKER_EVENTHANDLER', N'FT_IFTSHC_MUTEX', N'SQLTRACE_INCREMENTAL_FLUSH_SLEEP',
        N'DIRTY_PAGE_POLL', N'SP_SERVER_DIAGNOSTICS_SLEEP')
    )
SELECT
    [W1].[wait_type] AS [WaitType],
    CAST([W1].[WaitS] AS DECIMAL(14, 2)) AS [Wait_S],
    CAST([W1].[ResourceS] AS DECIMAL(14, 2)) AS [Resource_S],
    CAST([W1].[SignalS] AS DECIMAL(14, 2)) AS [Signal_S],
    [W1].[WaitCount] AS [WaitCount],
    CAST([W1].[Percentage] AS DECIMAL(4, 2)) AS [Percentage],
    CAST(([W1].[WaitS] / [W1].[WaitCount]) AS DECIMAL(14, 4)) AS [AvgWait_S],
    CAST(([W1].[ResourceS] / [W1].[WaitCount]) AS DECIMAL(14, 4)) AS [AvgRes_S],
    CAST(([W1].[SignalS] / [W1].[WaitCount]) AS DECIMAL(14, 4)) AS [AvgSig_S]
FROM [Waits] AS [W1]
INNER JOIN [Waits] AS [W2] ON [W2].[RowNum] <= [W1].[RowNum]
GROUP BY [W1].[RowNum], [W1].[wait_type], [W1].[WaitS], [W1].[ResourceS],
    [W1].[SignalS], [W1].[WaitCount], [W1].[Percentage]
HAVING SUM([W2].[Percentage]) - [W1].[Percentage] < 95; -- percentage threshold
GO
3. Extract the information
The query results rank the waits as follows:
1. CXPACKET
2. LATCH_EX
3. IO_COMPLETION
4. SOS_SCHEDULER_YIELD
5. ASYNC_NETWORK_IO
6. PAGELATCH_XX
7/8. PAGEIOLATCH_XX
Grouped by the main resource each wait relates to:
CPU-related: CXPACKET and SOS_SCHEDULER_YIELD
IO-related: PAGEIOLATCH_XX, IO_COMPLETION
Memory-related: PAGELATCH_XX, LATCH_EX
The current top three are CXPACKET, LATCH_EX, and IO_COMPLETION; let's analyze the cause behind each one in turn.
Tip: to learn more about wait types, see Paul Randal's series of articles.
CXPACKET ranks 1st and SOS_SCHEDULER_YIELD 4th, accompanied by the PAGEIOLATCH_XX waits at 7th/8th: parallel workers are being blocked
What it indicates:
1. Large-scale table scans are happening
2. Some parallel threads run for too long; check this by looking at PAGEIOLATCH_XX together with the non-page latch ACCESS_METHODS_DATASET_PARENT (more on this below)
3. Possibly unreasonable execution plans
Analysis:
1. First look at the time split between signal (runnable) waits and resource waits
2. Whether PAGEIOLATCH_XX waits exist, PAGEIOLATCH_SH in particular, which implies large-scale scans
3. Whether ACCESS_METHODS_DATASET_PARENT or ACCESS_METHODS_SCAN_RANGE_GENERATOR latch waits occur at the same time
4. Whether the execution plans are reasonable
Extracted information:
Get the share of CPU time spent on signal waits versus resource waits
Run the following statements:
--CPU Wait Queue (threshold:<=6)
SELECT scheduler_id, idle_switches_count, context_switches_count,
       current_tasks_count, active_workers_count
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255

SELECT SUM(signal_wait_time_ms) AS total_signal_wait_time_ms,
       SUM(wait_time_ms - signal_wait_time_ms) AS total_resource_wait_time_ms,
       SUM(signal_wait_time_ms) * 1.0 / SUM(wait_time_ms) * 100 AS signal_wait_percent,
       SUM(wait_time_ms - signal_wait_time_ms) * 1.0 / SUM(wait_time_ms) * 100 AS resource_wait_percent
FROM sys.dm_os_wait_stats
Conclusion: the data collected (table below) shows the CPU's time goes mostly to resource waits, while signal waits during execution are a small share, so we cannot rashly conclude that CPU resources are insufficient.
Causes:
Missing clustered indexes, inaccurate execution plans, parallel threads running too long, possible implicit conversions, TempDB resource contention
Solution:
Focus on reducing the time the CPU spends waiting on resources
1. Set MAXDOP for queries to a value that suits the CPU core count (fixes the weakest-link effect in multi-CPU parallel processing)
2. Check the "cost threshold for parallelism" value and set it to something more reasonable
3. Reduce full table scans: build appropriate clustered and nonclustered indexes
4. Inaccurate execution plans: steer the optimizer toward better plans
5. Statistics: make sure statistics are up to date
6. Add multiple TempDB data files to reduce latch contention; best practice: with more than 8 cores, add 4 or 8 equally sized data files
LATCH_EX waits rank 2nd.
What it indicates:
There are many non-page latch waits; first identify which latch class waits the longest and whether CXPACKET waits occur at the same time.
Analysis:
Querying all latch wait information showed ACCESS_METHODS_DATASET_PARENT waiting the longest; the available references attribute this wait to reading large volumes of data from disk into the buffer cache. Combined with the earlier Perfmon results, the judgment is that large-scale scans exist.
Run the script:
SELECT * FROM sys.dm_os_latch_stats
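To go straight to the worst offenders, the same DMV can be filtered and sorted; a minimal sketch (the BUFFER class is excluded because buffer-page latches are reported separately as PAGELATCH/PAGEIOLATCH waits):

-- Top non-page latch classes by accumulated wait time
SELECT latch_class,
       waiting_requests_count,
       wait_time_ms,
       max_wait_time_ms
FROM sys.dm_os_latch_stats
WHERE latch_class <> N'BUFFER'
ORDER BY wait_time_ms DESC;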
Extracted information:
Causes:
Heavy parallel-processing waits and IO page waits further point to large-scale table scan operations.
The developers confirmed that the stored procedures make heavy use of temporary tables, and monitoring showed the workload frequently uses temp tables and scalar-valued functions and constantly creates user objects; when TempDB manages the memory-related PFS/GAM/SGAM pages, there are many latch waits caused by contention for internal resource allocations.
Solution:
1. Optimize TempDB
2. Create nonclustered indexes to reduce scanning
3. Update statistics
4. If the above still doesn't resolve it, move the affected data to a faster IO subsystem and consider adding memory
Symptom:
IO_COMPLETION waits rank 3rd
What it indicates:
An IO latency problem: data takes a long time to get from disk into memory
Analysis:
Analyze which database files read and write the slowest, then combine that with the results of the CXPACKET wait analysis.
TempDB IO read/write efficiency:
1. The average IO stall on TempDB's data files is around 80 ms, above typical values; TempDB has severe latency.
2. Read latency on the disk hosting TempDB is 65 ms, also higher than typical.
Run the scripts:
--Database file read/write IO performance
SELECT DB_NAME(fs.database_id) AS [Database Name],
       CAST(fs.io_stall_read_ms / (1.0 + fs.num_of_reads) AS NUMERIC(10,1)) AS [avg_read_stall_ms],
       CAST(fs.io_stall_write_ms / (1.0 + fs.num_of_writes) AS NUMERIC(10,1)) AS [avg_write_stall_ms],
       CAST((fs.io_stall_read_ms + fs.io_stall_write_ms) / (1.0 + fs.num_of_reads + fs.num_of_writes) AS NUMERIC(10,1)) AS [avg_io_stall_ms],
       CONVERT(DECIMAL(18,2), mf.size / 128.0) AS [File Size (MB)],
       mf.physical_name, mf.type_desc,
       fs.io_stall_read_ms, fs.num_of_reads,
       fs.io_stall_write_ms, fs.num_of_writes,
       fs.io_stall_read_ms + fs.io_stall_write_ms AS [io_stalls],
       fs.num_of_reads + fs.num_of_writes AS [total_io]
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS fs
INNER JOIN sys.master_files AS mf WITH (NOLOCK)
    ON fs.database_id = mf.database_id
    AND fs.[file_id] = mf.[file_id]
ORDER BY avg_io_stall_ms DESC OPTION (RECOMPILE);

--Per-drive IO latency
SELECT [Drive],
       CASE WHEN num_of_reads = 0 THEN 0
            ELSE (io_stall_read_ms / num_of_reads) END AS [Read Latency],
       CASE WHEN io_stall_write_ms = 0 THEN 0
            ELSE (io_stall_write_ms / num_of_writes) END AS [Write Latency],
       CASE WHEN (num_of_reads = 0 AND num_of_writes = 0) THEN 0
            ELSE (io_stall / (num_of_reads + num_of_writes)) END AS [Overall Latency],
       CASE WHEN num_of_reads = 0 THEN 0
            ELSE (num_of_bytes_read / num_of_reads) END AS [Avg Bytes/Read],
       CASE WHEN io_stall_write_ms = 0 THEN 0
            ELSE (num_of_bytes_written / num_of_writes) END AS [Avg Bytes/Write],
       CASE WHEN (num_of_reads = 0 AND num_of_writes = 0) THEN 0
            ELSE ((num_of_bytes_read + num_of_bytes_written) / (num_of_reads + num_of_writes)) END AS [Avg Bytes/Transfer]
FROM (SELECT LEFT(mf.physical_name, 2) AS Drive,
             SUM(num_of_reads) AS num_of_reads,
             SUM(io_stall_read_ms) AS io_stall_read_ms,
             SUM(num_of_writes) AS num_of_writes,
             SUM(io_stall_write_ms) AS io_stall_write_ms,
             SUM(num_of_bytes_read) AS num_of_bytes_read,
             SUM(num_of_bytes_written) AS num_of_bytes_written,
             SUM(io_stall) AS io_stall
      FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
      INNER JOIN sys.master_files AS mf WITH (NOLOCK)
          ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
      GROUP BY LEFT(mf.physical_name, 2)) AS tab
ORDER BY [Overall Latency] OPTION (RECOMPILE);
Extracted information:
Per-file IO/CPU/buffer access: TempDB's share of IO (IO Rank) exceeds 53%
Solution:
Add multiple TempDB data files to reduce latch contention. Best practice: with more than 8 cores, add 4 or 8 equally sized data files.
Analysis:
The wait types show the IO-related PAGEIOLATCH_XX values are very high: the database performs large numbers of table scans, the buffer cache cannot satisfy queries, and data has to be read from disk, producing IO waits.
Solution:
Create appropriate nonclustered indexes to reduce scanning, and update statistics
If that still doesn't resolve it, consider moving the affected data to a faster IO subsystem and adding memory.
Based on the monitoring and analysis above, the actual optimization work follows an "order of optimization" and a set of "implementation principles".
1. Start with database configuration
Rationale: lowest cost; per the monitoring analysis, configuration changes alone promise a sizeable improvement.
2. Index optimization
Rationale: indexes don't touch tables or other structures tightly coupled to the business, so there is no risk at the business level.
Steps: given the very large tables in this database (over 100 GB), index optimization also has to proceed in stages. Order of index work: unused indexes -> duplicate indexes -> adding missing indexes -> clustered indexes -> index defragmentation.
3. Query optimization
Rationale: statement optimization has to take the business into account and requires close communication with developers to settle on the final approach
Steps: the DBA captures the TOP SQL statements/stored procedures by execution time, CPU, IO, and memory usage, hands them to the developers, and helps find optimizations such as adding indexes or rewriting the statements.
The whole diagnosis and optimization plan is validated in a test environment first; only what passes testing and is confirmed there gets rolled out step by step to production.
1. The database server has more than 24 cores, yet MAXDOP is 0, an unreasonable configuration that causes heavy parallel waits when concurrent work is scheduled (the weakest-link effect)
Recommendation: change the MAXDOP value; best practice for more than 8 cores is to start with 4
2. COST THRESHOLD FOR PARALLELISM is at its default value of 5
Recommendation: raise COST THRESHOLD FOR PARALLELISM to a more reasonable value, for example 15, so that only the more expensive queries are parallelized (a sketch of both settings follows)
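Both options can be changed online with sp_configure; a minimal sketch applying the values recommended above (4 and 15), assuming sufficient permissions to RECONFIGURE:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;
EXEC sp_configure 'cost threshold for parallelism', 15;
RECONFIGURE;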
3. Monitoring showed the workload frequently uses temporary tables and scalar-valued functions and constantly creates user objects; when TempDB manages the memory-related PFS/GAM/SGAM pages there are many latch waits, hurting performance
Recommendation: add multiple TempDB data files to reduce latch contention. Best practice: with more than 8 cores, add 4 or 8 equally sized data files.
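Each extra file is a one-time ALTER DATABASE; a minimal sketch for one file (the logical name, path, and sizes here are hypothetical; in practice match the existing TempDB files so all files stay equally sized):

ALTER DATABASE tempdb
ADD FILE (
    NAME = N'tempdev2',                    -- hypothetical logical name
    FILENAME = N'T:\TempDB\tempdev2.ndf',  -- hypothetical path, alongside the other TempDB files
    SIZE = 4096MB,
    FILEGROWTH = 512MB
);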
4. Enable optimize for ad hoc workloads
5. Enable Ad Hoc Distributed Queries
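A minimal sketch enabling both ('Ad Hoc Distributed Queries' is an advanced option, so 'show advanced options' has to be on first):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'optimize for ad hoc workloads', 1;
EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;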
1. Unused-index cleanup
The database currently holds a large number of unused indexes; a script can find them so they can be dropped, reducing the system's index-maintenance cost and improving update performance. In addition, for tables whose read ratio is below 1%, confirm with the business whether their indexes can also be dropped.
For the detailed list see: 性能調優數據收集_索引.xlsx (unused indexes sheet)
Unused indexes, reference queries:
SELECT OBJECT_NAME(i.object_id) AS table_name,
       COALESCE(i.name, SPACE(0)) AS index_name,
       ps.partition_number,
       ps.row_count,
       CAST((ps.reserved_page_count * 8) / 1024. AS DECIMAL(12, 2)) AS size_in_mb,
       COALESCE(ius.user_seeks, 0) AS user_seeks,
       COALESCE(ius.user_scans, 0) AS user_scans,
       COALESCE(ius.user_lookups, 0) AS user_lookups,
       i.type_desc
FROM sys.all_objects t
INNER JOIN sys.indexes i ON t.object_id = i.object_id
INNER JOIN sys.dm_db_partition_stats ps ON i.object_id = ps.object_id
    AND i.index_id = ps.index_id
LEFT OUTER JOIN sys.dm_db_index_usage_stats ius ON ius.database_id = DB_ID()
    AND i.object_id = ius.object_id
    AND i.index_id = ius.index_id
WHERE i.type_desc NOT IN ('HEAP', 'CLUSTERED')
    AND i.is_unique = 0
    AND i.is_primary_key = 0
    AND i.is_unique_constraint = 0
    AND COALESCE(ius.user_seeks, 0) <= 0
    AND COALESCE(ius.user_scans, 0) <= 0
    AND COALESCE(ius.user_lookups, 0) <= 0
ORDER BY OBJECT_NAME(i.object_id), i.name

--1. Finding unused non-clustered indexes.
SELECT OBJECT_SCHEMA_NAME(i.object_id) AS SchemaName,
       OBJECT_NAME(i.object_id) AS TableName,
       i.name,
       ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates
FROM sys.dm_db_index_usage_stats AS ius
JOIN sys.indexes AS i ON i.index_id = ius.index_id
    AND i.object_id = ius.object_id
WHERE ius.database_id = DB_ID()
    AND i.is_unique_constraint = 0 -- no unique indexes
    AND i.is_primary_key = 0
    AND i.is_disabled = 0
    AND i.type > 1 -- don't consider heaps/clustered index
    AND ((ius.user_seeks + ius.user_scans + ius.user_lookups) < ius.user_updates
         OR (ius.user_seeks = 0 AND ius.user_scans = 0))
Table read/write ratio, reference query:
DECLARE @dbid int
SELECT @dbid = DB_ID()
SELECT TableName = OBJECT_NAME(s.object_id),
       Reads = SUM(user_seeks + user_scans + user_lookups),
       Writes = SUM(user_updates),
       ReadPercent = CONVERT(BIGINT, SUM(user_seeks + user_scans + user_lookups)) * 100
           / (SUM(user_updates) + SUM(user_seeks + user_scans + user_lookups))
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i
    ON s.object_id = i.object_id
    AND i.index_id = s.index_id
WHERE OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
    AND s.database_id = @dbid
GROUP BY OBJECT_NAME(s.object_id)
ORDER BY Writes DESC
2. Remove and merge duplicate indexes
The system currently has many duplicate indexes; merging them reduces index-maintenance cost and thereby improves update performance.
Duplicate indexes, reference query:
WITH MyDuplicate AS (
    SELECT Sch.[name] AS SchemaName,
           Obj.[name] AS TableName,
           Idx.[name] AS IndexName,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 1) AS Col1,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 2) AS Col2,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 3) AS Col3,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 4) AS Col4,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 5) AS Col5,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 6) AS Col6,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 7) AS Col7,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 8) AS Col8,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 9) AS Col9,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 10) AS Col10,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 11) AS Col11,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 12) AS Col12,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 13) AS Col13,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 14) AS Col14,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 15) AS Col15,
           INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 16) AS Col16
    FROM sys.indexes Idx
    INNER JOIN sys.objects Obj ON Idx.[object_id] = Obj.[object_id]
    INNER JOIN sys.schemas Sch ON Sch.[schema_id] = Obj.[schema_id]
    WHERE index_id > 0 AND Obj.[name] = 'DOC_INVPLU')
SELECT MD1.SchemaName, MD1.TableName, MD1.IndexName,
       MD2.IndexName AS OverLappingIndex,
       MD1.Col1, MD1.Col2, MD1.Col3, MD1.Col4,
       MD1.Col5, MD1.Col6, MD1.Col7, MD1.Col8,
       MD1.Col9, MD1.Col10, MD1.Col11, MD1.Col12,
       MD1.Col13, MD1.Col14, MD1.Col15, MD1.Col16
FROM MyDuplicate MD1
INNER JOIN MyDuplicate MD2 ON MD1.TableName = MD2.TableName
    AND MD1.IndexName <> MD2.IndexName
    AND MD1.Col1 = MD2.Col1
    AND (MD1.Col2 IS NULL OR MD2.Col2 IS NULL OR MD1.Col2 = MD2.Col2)
    AND (MD1.Col3 IS NULL OR MD2.Col3 IS NULL OR MD1.Col3 = MD2.Col3)
    AND (MD1.Col4 IS NULL OR MD2.Col4 IS NULL OR MD1.Col4 = MD2.Col4)
    AND (MD1.Col5 IS NULL OR MD2.Col5 IS NULL OR MD1.Col5 = MD2.Col5)
    AND (MD1.Col6 IS NULL OR MD2.Col6 IS NULL OR MD1.Col6 = MD2.Col6)
    AND (MD1.Col7 IS NULL OR MD2.Col7 IS NULL OR MD1.Col7 = MD2.Col7)
    AND (MD1.Col8 IS NULL OR MD2.Col8 IS NULL OR MD1.Col8 = MD2.Col8)
    AND (MD1.Col9 IS NULL OR MD2.Col9 IS NULL OR MD1.Col9 = MD2.Col9)
    AND (MD1.Col10 IS NULL OR MD2.Col10 IS NULL OR MD1.Col10 = MD2.Col10)
    AND (MD1.Col11 IS NULL OR MD2.Col11 IS NULL OR MD1.Col11 = MD2.Col11)
    AND (MD1.Col12 IS NULL OR MD2.Col12 IS NULL OR MD1.Col12 = MD2.Col12)
    AND (MD1.Col13 IS NULL OR MD2.Col13 IS NULL OR MD1.Col13 = MD2.Col13)
    AND (MD1.Col14 IS NULL OR MD2.Col14 IS NULL OR MD1.Col14 = MD2.Col14)
    AND (MD1.Col15 IS NULL OR MD2.Col15 IS NULL OR MD1.Col15 = MD2.Col15)
    AND (MD1.Col16 IS NULL OR MD2.Col16 IS NULL OR MD1.Col16 = MD2.Col16)
ORDER BY MD1.SchemaName, MD1.TableName, MD1.IndexName
3. Add missing indexes
Based on statement frequency and each table's read/write ratio, create the missing indexes in consultation with the business.
Missing indexes, reference query:
-- Missing Indexes in current database by Index Advantage
SELECT user_seeks * avg_total_user_cost * (avg_user_impact * 0.01) AS [index_advantage],
       migs.last_user_seek,
       mid.[statement] AS [Database.Schema.Table],
       mid.equality_columns,
       mid.inequality_columns,
       mid.included_columns,
       migs.unique_compiles,
       migs.user_seeks,
       migs.avg_total_user_cost,
       migs.avg_user_impact,
       N'CREATE NONCLUSTERED INDEX [IX_'
       + SUBSTRING(mid.statement,
                   CHARINDEX('.', mid.statement, CHARINDEX('.', mid.statement) + 1) + 2,
                   LEN(mid.statement) - 3
                   - CHARINDEX('.', mid.statement, CHARINDEX('.', mid.statement) + 1) + 1)
       + '_'
       + REPLACE(REPLACE(REPLACE(
             CASE WHEN mid.equality_columns IS NOT NULL
                       AND mid.inequality_columns IS NOT NULL
                       AND mid.included_columns IS NOT NULL
                  THEN mid.equality_columns + '_' + mid.inequality_columns + '_Includes'
                  WHEN mid.equality_columns IS NOT NULL
                       AND mid.inequality_columns IS NOT NULL
                       AND mid.included_columns IS NULL
                  THEN mid.equality_columns + '_' + mid.inequality_columns
                  WHEN mid.equality_columns IS NOT NULL
                       AND mid.inequality_columns IS NULL
                       AND mid.included_columns IS NOT NULL
                  THEN mid.equality_columns + '_Includes'
                  WHEN mid.equality_columns IS NOT NULL
                       AND mid.inequality_columns IS NULL
                       AND mid.included_columns IS NULL
                  THEN mid.equality_columns
                  WHEN mid.equality_columns IS NULL
                       AND mid.inequality_columns IS NOT NULL
                       AND mid.included_columns IS NOT NULL
                  THEN mid.inequality_columns + '_Includes'
                  WHEN mid.equality_columns IS NULL
                       AND mid.inequality_columns IS NOT NULL
                       AND mid.included_columns IS NULL
                  THEN mid.inequality_columns
             END, ', ', '_'), ']', ''), '[', '')
       + '] '
       + N'ON ' + mid.[statement]
       + N' (' + ISNULL(mid.equality_columns, N'')
       + CASE WHEN mid.equality_columns IS NULL
              THEN ISNULL(mid.inequality_columns, N'')
              ELSE ISNULL(', ' + mid.inequality_columns, N'')
         END + N') '
       + ISNULL(N'INCLUDE (' + mid.included_columns + N');', ';') AS CreateStatement
FROM sys.dm_db_missing_index_group_stats AS migs WITH (NOLOCK)
INNER JOIN sys.dm_db_missing_index_groups AS mig WITH (NOLOCK)
    ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS mid WITH (NOLOCK)
    ON mig.index_handle = mid.index_handle
WHERE mid.database_id = DB_ID()
ORDER BY index_advantage DESC;
4. Index defragmentation
Index fragmentation needs to be cleaned up (with ALTER INDEX REORGANIZE/REBUILD) to make queries more efficient.
Note: many tables in this database are large (>50 GB), so rebuilding their indexes can take a very long time; a 1 TB database generally takes 8+ hours. Draw up a detailed plan and defragment table by table.
Index fragmentation, reference query:
SELECT '[' + DB_NAME() + '].[' + OBJECT_SCHEMA_NAME(ddips.[object_id], DB_ID()) + '].['
       + OBJECT_NAME(ddips.[object_id], DB_ID()) + ']' AS [statement],
       i.[name] AS [index_name],
       ddips.[index_type_desc],
       ddips.[partition_number],
       ddips.[alloc_unit_type_desc],
       ddips.[index_depth],
       ddips.[index_level],
       CAST(ddips.[avg_fragmentation_in_percent] AS SMALLINT) AS [avg_frag_%],
       CAST(ddips.[avg_fragment_size_in_pages] AS SMALLINT) AS [avg_frag_size_in_pages],
       ddips.[fragment_count],
       ddips.[page_count]
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'limited') ddips
INNER JOIN sys.[indexes] i ON ddips.[object_id] = i.[object_id]
    AND ddips.[index_id] = i.[index_id]
WHERE ddips.[avg_fragmentation_in_percent] > 15
    AND ddips.[page_count] > 500
ORDER BY ddips.[avg_fragmentation_in_percent],
         OBJECT_NAME(ddips.[object_id], DB_ID()),
         i.[name]
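Once the fragmented indexes are identified, remediation happens per index; a minimal sketch on a hypothetical table and index name (the usual guidance: REORGANIZE for moderate fragmentation, REBUILD above roughly 30%):

-- Moderate fragmentation (roughly 15-30%): reorganize, always an online operation
ALTER INDEX [IX_SomeIndex] ON [dbo].[SomeTable] REORGANIZE;
-- Heavy fragmentation (roughly >30%): rebuild (ONLINE = ON needs Enterprise Edition; drop it otherwise)
ALTER INDEX [IX_SomeIndex] ON [dbo].[SomeTable] REBUILD WITH (ONLINE = ON);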
5. Review tables without a clustered index or primary key
Many tables in the database have no clustered index; investigate whether this is a business requirement, and if there is no special reason, add one. A quick inventory query follows.
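A minimal sketch to inventory the heaps for this review (index_id 0 in sys.indexes means the table has no clustered index):

-- User tables without a clustered index, and whether they at least have a primary key
SELECT OBJECT_SCHEMA_NAME(t.object_id) AS schema_name,
       t.name AS table_name,
       OBJECTPROPERTY(t.object_id, 'TableHasPrimaryKey') AS has_primary_key
FROM sys.tables AS t
INNER JOIN sys.indexes AS i ON t.object_id = i.object_id
WHERE i.type = 0   -- 0 = heap
ORDER BY schema_name, table_name;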
1. From the history SQL Server keeps, via DMVs
Reference queries to fetch the Top 100 statements:
--Statements with the longest execution time
SELECT TOP 100
    execution_count,
    total_worker_time / 1000 AS total_worker_time,
    total_logical_reads,
    total_logical_writes,
    max_elapsed_time,
    [text]
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY max_elapsed_time DESC

--Statements consuming the most CPU
SELECT TOP 100
    execution_count,
    total_worker_time / 1000 AS total_worker_time,
    total_logical_reads,
    total_logical_writes,
    [text]
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY total_worker_time DESC

--Statements with the most IO reads
SELECT TOP 100
    execution_count,
    total_worker_time / 1000 AS total_worker_time,
    total_logical_reads,
    total_logical_writes,
    [text]
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY total_logical_reads DESC

--Statements with the most IO writes
SELECT TOP 100
    execution_count,
    total_worker_time / 1000 AS total_worker_time,
    total_logical_reads,
    total_logical_writes,
    [text]
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY total_logical_writes DESC

--Average IO per statement
SELECT TOP 100
    [Total IO] = (qs.total_logical_writes + qs.total_logical_reads),
    [Average IO] = (qs.total_logical_writes + qs.total_logical_reads) / qs.execution_count,
    qs.execution_count,
    SUBSTRING(qt.text, (qs.statement_start_offset / 2) + 1,
        ((CASE WHEN qs.statement_end_offset = -1
               THEN LEN(CONVERT(NVARCHAR(MAX), qt.text)) * 2
               ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS [Individual Query],
    qt.text AS [Parent Query],
    DB_NAME(qt.dbid) AS DatabaseName,
    qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE DB_NAME(qt.dbid) = 'tyyl_sqlserver'
    AND execution_count > 3
    AND qs.total_logical_writes + qs.total_logical_reads > 10000
    --AND qt.text LIKE '%POSCREDIT%'
ORDER BY [Average IO] DESC

--Average logical reads per statement
SELECT TOP 100
    deqs.execution_count,
    deqs.total_logical_reads / deqs.execution_count AS [Avg Logical Reads],
    deqs.total_elapsed_time / deqs.execution_count AS [Avg Elapsed Time],
    deqs.total_worker_time / deqs.execution_count AS [Avg Worker Time],
    SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1,
        ((CASE deqs.statement_end_offset
               WHEN -1 THEN DATALENGTH(dest.text)
               ELSE deqs.statement_end_offset
          END - deqs.statement_start_offset) / 2) + 1) AS query,
    dest.text AS [Parent Query],
    qp.query_plan
FROM sys.dm_exec_query_stats deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) dest
CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) qp
WHERE dest.encrypted = 0
    --AND dest.text LIKE '%INCOMINGTRANS%'
ORDER BY [Avg Logical Reads] DESC

--Average logical writes per statement
SELECT TOP 100
    [Total WRITES] = (qs.total_logical_writes),
    [Average WRITES] = (qs.total_logical_writes) / qs.execution_count,
    qs.execution_count,
    SUBSTRING(qt.text, (qs.statement_start_offset / 2) + 1,
        ((CASE WHEN qs.statement_end_offset = -1
               THEN LEN(CONVERT(NVARCHAR(MAX), qt.text)) * 2
               ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS [Individual Query],
    qt.text AS [Parent Query],
    DB_NAME(qt.dbid) AS DatabaseName,
    qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE DB_NAME(qt.dbid) = 'DRSDataCN'
    AND qt.text LIKE '%POSCREDIT%'
ORDER BY [Average WRITES] DESC

--Average CPU time per statement
SELECT SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1,
        ((CASE deqs.statement_end_offset
               WHEN -1 THEN DATALENGTH(dest.text)
               ELSE deqs.statement_end_offset
          END - deqs.statement_start_offset) / 2) + 1) AS query,
    deqs.execution_count,
    deqs.total_logical_reads / deqs.execution_count AS [Avg Logical Reads],
    deqs.total_elapsed_time / deqs.execution_count AS [Avg Elapsed Time],
    deqs.total_worker_time / deqs.execution_count AS [Avg Worker Time],
    deqs.last_execution_time,
    deqs.creation_time
FROM sys.dm_exec_query_stats deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) dest
WHERE dest.encrypted = 0
    AND deqs.total_logical_reads / deqs.execution_count > 50
ORDER BY query, [Avg Worker Time] DESC
2. Use a tool to capture, in real time, the statements executed during the business peak window
Collection tool:
SQLTrace or Extended Events are recommended (a minimal session sketch follows); Profiler is not recommended
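A minimal Extended Events session sketch for this kind of capture (the session name, output path, and 1-second duration filter are my assumptions, not from the original setup):

CREATE EVENT SESSION [PeakTimeCapture] ON SERVER
ADD EVENT sqlserver.sql_batch_completed
(
    ACTION (sqlserver.session_id, sqlserver.sql_text)
    WHERE ([duration] > 1000000)   -- only batches longer than 1 second (duration is in microseconds)
)
ADD TARGET package0.event_file
(SET filename = N'C:\Temp\PeakTimeCapture.xel');   -- hypothetical output path
GO
ALTER EVENT SESSION [PeakTimeCapture] ON SERVER STATE = START;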
Collection content:
Analysis tool:
ClearTrace is recommended; it is free. See my other blog post for detailed usage.
3. Go through the statements collected in the two steps above one by one, and use execution-plan analysis to find better-optimized alternatives
4. This round of optimization targets the current database, paying special attention to the following performance killers
1. Statements averaging more than 30,000 ms of CPU time: reduced from 20 to 3
2. Statement executions using more than 10,000 ms of CPU: reduced from 1,500 to 500
3. CPU now holds around 20%, 40%-60% at peak, rarely exceeding 60% and very seldom reaching 80%
4. Batch Requests/sec rose from 1,500 to 4,000
To close, a before-and-after comparison shows a clear performance improvement, though it only resolves the immediate bottleneck.
Database-level optimization is just one layer; it may solve the resource bottleneck in front of you, but many of the architecture design problems we found could not be touched because of business constraints, so this article has to stop here, which seems to be a kind of normal, too. This experience also raises another thought: when a performance bottleneck strikes, the company's instinct is to quickly find someone to put out the fire, and once the fire is out, there seems to be no "afterwards". With a different mindset, doing routine monitoring, early warning, and standards well might make this kind of firefighting rarer.
Thank you, 2016!
To repost, please include a link to this article and credit the source: http://www.cnblogs.com/SameZhao/p/6238997.html. Thanks.