DPA監控發現一臺SQL Server服務器最近兩天執行系統存儲過程msdb.dbo.sp_MailItemResultSets中的某個SQL時,出現較嚴重的ASYNC_NETWORK_IO等待。以下截圖所示html
進一步分析發現,主要是執行存儲過程msdb.dbo.sp_MailItemResultSets中下面這段SQL語句出現ASYNC_NETWORK_IO等待sql
SELECT
mi.mailitem_id,
mi.profile_id,
(SELECT name FROM msdb.dbo.sysmail_profile p WHERE p.profile_id = mi.profile_id) as 'profile_name',
mi.recipients,
mi.copy_recipients,
mi.blind_copy_recipients,
mi.subject,
mi.body,
mi.body_format,
mi.importance,
mi.sensitivity,
ISNULL(sr.send_attempts, 0) as retry_attempt,
ISNULL(mi.from_address, '') as from_address,
ISNULL(mi.reply_to, '') as reply_to
FROM sysmail_mailitems as mi
LEFT JOIN sysmail_send_retries as sr
ON sr.mailitem_id = mi.mailitem_id
WHERE mi.mailitem_id = @mailitem_id
進一步分析,發現隨便一個與表sysmail_allitems有關的SQL都會出現嚴重的ASYNC_NETWORK_IO等待:數據庫
SELECT * FROM msdb.dbo.sysmail_allitems WITH(NOLOCK)
WHERE sent_status != 'sent'
ORDER BY sent_date DESC;
另外,分析過程當中發現sysmail_mailitems表只有7萬多條記錄,可是表的Size大小接近10G大小,以下截圖所示:服務器
這顯然明顯不正常,sysmail_mailitems中確定有一些超大的郵件記錄,由於這個系統常常有經過SQL,生成一些報表數據發送給用戶。因而我想檢查一下是否真的有一些超大的郵件。這裏就必須查看sysmail_mailitems表的行大小,因而用下面腳本查看錶sysmail_mailitems中行記錄的大小。app
USE msdb;
GO
IF object_id('sp_GetRowSize') is not null
drop procedure sp_GetRowSize
GO
CREATE procedure sp_GetRowSize(@Tablename varchar(100),@pkcol varchar(100))
AS
BEGIN
declare @dynamicsql varchar(MAX)
-- A @pkcol can be used to identify max/min length row
set @dynamicsql = 'select ' + @PkCol +' , (0'
-- traverse each record and calculate the datalength
select @dynamicsql = @dynamicsql + ' + isnull(datalength(' + name + '), 1)'
from syscolumns where id = object_id(@Tablename)
set @dynamicsql = @dynamicsql + ') as rowsize from ' + @Tablename + ' order by 2 desc'
print (@dynamicsql)
END
以下截圖所示,還真的有一些郵件記錄的rowsize超級大,正常狀況下,rowsize只有1122個字節左右大小,而mailitem_id=5146768 這條記錄竟然有1352285196字節,若是換算成大小的話SELECT 1352285196.0/1024/1024 ~=1290M, 真是無語了!!ide
EXEC sp_GetRowSize 'sysmail_mailitems', 'mailitem_id'性能
緣由倒也不復雜,就是生成郵件的SQL出現邏輯錯誤,致使郵件的Body變得無比巨大,致使msdb.dbo.sysmail_allitems變得很是大,與之相關的SQL語句IO性能變差,出現ASYNC_NETWORK_IO等待。其實之前也遇到過相似案例,請見SQL Server 2008 R2執行存儲過程sp_MailItemResultSets引發大量PREEMPTIVE_OS_WAITFORSINGLEOBJEC等待,只是當時沒有繼續深挖Root Cause而已!spa
解決方案3d
那麼如何解決這個問題呢? 很簡單,就是刪除表msdb.dbo.sysmail_allitems中的記錄,讓其Size變小。也能夠刪除那些mail_id很是大的記錄。能夠用下面腳本處理code
/******************************************************************************************************
Script Function : 如下示例按條件刪除數據庫郵件系統中的電子郵件
*******************************************************************************************************/
DECLARE @GETDATE datetime
SET @GETDATE = GETDATE()-2;
EXECUTE msdb.dbo.sysmail_delete_mailitems_sp @sent_before = @GETDATE;
GO
其實sysmail_delete_mailitems_sp中的邏輯也是去刪除msdb.dbo.sysmail_allitems 中的記錄,以下所示,在處理的過程當中,須要在業務空閒的時候處理,不然會引發大量阻塞。
DELETE
FROM msdb.dbo.sysmail_allitems
WHERE ((@sent_before IS NULL)
OR ( send_request_date < @sent_before))
AND ((@sent_status IS NULL)
OR (sent_status = @sent_status))
CREATE PROCEDURE
sysmail_delete_mailitems_sp
@sent_before DATETIME = NULL, -- sent before
@sent_status varchar(8) = NULL -- sent status
AS
BEGIN
SET @sent_status = LTRIM(RTRIM(@sent_status))
IF @sent_status = ''
SET @sent_status = NULL
IF ( (@sent_status IS NOT NULL) AND
(LOWER(@sent_status collate SQL_Latin1_General_CP1_CS_AS) NOT IN ( 'unsent', 'sent', 'failed'
, 'retrying') ) )
BEGIN
RAISERROR(14266, -1, -1, '@sent_status', 'unsent, sent, failed, retrying')
RETURN(1) -- Failure
END
IF ( @sent_before IS NULL AND @sent_status IS NULL )
BEGIN
RAISERROR(14608, -1, -1, '@sent_before', '@sent_status')
RETURN(1) -- Failure
END
/* BEGIN ACTIVE SECTION (comment inserted by DPA) */
DELETE
FROM msdb.dbo.sysmail_allitems
WHERE ((@sent_before IS NULL)
OR ( send_request_date < @sent_before))
AND ((@sent_status IS NULL)
OR (sent_status = @sent_status))
/* END ACTIVE SECTION (comment inserted by DPA) */
DECLARE @localmessage nvarchar(255)
SET @localmessage = FORMATMESSAGE(14665, SUSER_SNAME(), @@ROWCOUNT) exec
msdb.dbo.sysmail_logmailevent_sp @event_type=1,
@description=@localmessage
END