[20191125]探究等待事件的本源.txt

[20191125]探究等待事件的本源.txt

--//當工做中遇到oracle的性能問題時,查看awr報表提供很好的解決問題途徑.可是有時候很容易想固然.
--//好比之前我一看到 log file sync等待事件就很主觀的認爲redo 磁盤IO不行,實際上真實的狀況可能有許多緣由.
--//好比提交進程太多,cpu調度不過來.總之須要wait chains肯定問題的本源.
--//上午看了大師Tanel Poder的dash_wait_chains.sql與ash_wait_chains.sql腳本,簡單學習它的使用.
--//dash_wait_chains.sql 是基於DBA_HIST_ACTIVE_SESS_HISTORY視圖.而ash_wait_chains.sql基於V$ACTIVE_SESSION_HISTORY視圖.
--//兩者命令執行格式差很少,僅僅學習ash_wait_chains.sql就ok了.

--//實際上根本不須要學習與記憶,看看腳本的開頭就知道如何執行與使用:
-- Usage:
--     @ash_wait_chains <grouping_cols> <filters> <fromtime> <totime>
--
-- Example:
--     @ash_wait_chains username||':'||program2||event2 session_type='FOREGROUND' sysdate-1/24 sysdate
--
-- Other:
--     This script uses only the in-memory V$ACTIVE_SESSION_HISTORY, use
--     @dash_wait_chains.sql for accessiong the DBA_HIST_ACTIVE_SESS_HISTORY archive
--
--     Oracle 10g does not  have the BLOCKING_INST_ID column in ASH so you'll need
--     to comment out this column in this script. This may give you somewhat
--     incorrect results in RAC environment with global blockers.
--
--------------------------------------------------------------------------------

--//我作一點點簡單的修改:
COL wait_chain FOR A300 WORD_WRAP
--//修改成:
COL wait_chain FOR A200 WORD_WRAP

--//個人工做環境最大顯示寬度是271列.若是設置再小,字體顯示就過小,我同事許多設置才237.
$ echo $COLUMNS
271

--//由於我還須要管理10g的數據庫,根據前面的說明,能夠拷貝腳本命名爲ash_wait_chains10g.sql.
--//註解以下,固然對於rac環境可能不正確.
--         AND PRIOR d.blocking_inst_id = d.inst_id

--//做者還從新定義字段program,event爲program2,event2.這樣顯示更加直觀,佔用寬度更小一些.
--//如下是測試部分:

1.環境:
qqqqqqqqqqq> @ ver1
PORT_STRING         VERSION    BANNER
------------------- ---------- ----------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.4.0 Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

qqqqqqqqqqq> @ tpt/ash/ash_wait_chains  program2||':'||event2 1=1 trunc(sysdate)+9/24 trunc(sysdate)+10/24
-- Display ASH Wait Chain Signatures script v0.2 BETA by Tanel Poder ( http://blog.tanelpoder.com )
%This     SECONDS        AAS WAIT_CHAIN
------ ---------- ---------- -------------------------------------------------------------------
  33%        4362         .1 -> (XXXYYY.EXE) :ON CPU
  10%        1368          0 -> (NSSn) :LNS wait on SENDREQ
  10%        1323          0 -> (LGWR) :LGWR-LNS wait on channel
   6%         735          0 -> (XXXYYY.EXE) :log file sync  -> (LGWR) :LGWR-LNS wait on channel
   5%         605          0 -> (CAPAA-PIPE) :ON CPU
   5%         604          0 -> (XXXYYY.exe) :ON CPU
   4%         573          0 -> (wnwp.exe) :ON CPU
   3%         378          0 -> (DIAn) :ON CPU
   2%         274          0 -> (sqlplus) :ON CPU
   2%         246          0 -> (httpd.exe) :ON CPU
   2%         230          0 -> (PSPn) :ON CPU
   1%         115          0 -> (CKPT) :ON CPU
   1%         111          0 -> (LMSn) :ON CPU
   1%         111          0 -> (wnwp.exe) :log file sync  -> (LGWR) :LGWR-LNS wait on channel
   1%          89          0 -> (routine.exe) :ON CPU
   1%          87          0 -> (oracle) :ON CPU
   1%          86          0 -> (sqlplus) :control file sequential read
   1%          83          0 -> (LGWR) :ON CPU
   1%          76          0 -> (routine.exe) :log file sync  -> (LGWR) :LGWR-LNS wait on channel
   1%          72          0 -> (ARCn) :ON CPU
   1%          69          0 -> (XXXYYY.EXE) :gc current block 2-way
   1%          68          0 -> (XXXYYY.EXE) :gc cr block 2-way
   0%          63          0 -> (LGWR) :log file parallel write
   0%          55          0 -> (LMON) :ON CPU
   0%          55          0 -> () :null event
   0%          47          0 -> (XXXYYY.EXE) :log file sync
   0%          46          0 -> (nnnn.exe) :log file sync  -> (LGWR) :LGWR-LNS wait on channel
   0%          39          0 -> (LMSn) :gcs log flush sync  -> (LGWR) :LGWR-LNS wait on channel
   0%          38          0 -> (XXXYYY.EXE) :direct path read
   0%          38          0 -> (XXXYYY.EXE) :gc cr block busy
30 rows selected.

--//開始不理解AAS表示什麼.表示以下
ROUND(COUNT(*) / ((CAST(&4 AS DATE) - CAST(&3 AS DATE)) * 86400), 1) AAS
--//至關於每秒取樣幾回.
--//參數3,參數4是時間範圍,我建議使用相似例子的格式.你僅僅須要知道1/24表示1小時,1/1440表示1分鐘就ok了.
--//好比你須要查看當天8:15到9:10的狀況,時間能夠寫成:
trunc(sysdate)+8/24+15/1440 trunc(sysdate)+9/24+10/1440
--//另外注意一點:V$ACTIVE_SESSION_HISTORY保留信息有限制,時間過久的可能查詢不到.要改用dash_wait_chains.sql腳本查詢.
--//參數2能夠加入一些過濾例子:session_type='FOREGROUND' 或者使用 1=1 表示所有.

--//你能夠看到這臺服務器的log file sync等待事件實際上因爲(LGWR) :LGWR-LNS wait on channel.
--//問題在與安裝配置dg時的參數:

qqqqqqqqqqq> show parameter log_archive_dest_2
NAME               TYPE   VALUE
------------------ ------ ----------------------------------------------------------------------------------------------------
log_archive_dest_2 string service=rzdbra lgwr sync reopen=15 max_failure=10 net_timeout=30 optional noaffirm db_unique_name=XXXXXX
--//對方配置了sync,這樣在提交時要肯定dg方已經寫入日誌文件纔算提交.
--//若是改爲async,相關等待LNS wait on SENDREQ,LGWR-LNS wait on channel,log file sync都會消失或者減小.

--//再來看看另外1個10g的數據庫:
WWWWWWWWW> @ ver1
PORT_STRING         VERSION        BANNER
------------------- -------------- ----------------------------------------------------------------
x86_64/Linux 2.4.xx 10.2.0.4.0     Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi

WWWWWWWWW> @ tpt/ash/ash_wait_chains10g  program2||':'||event2 1=1 trunc(sysdate)+9/24 trunc(sysdate)+10/24
-- Display ASH Wait Chain Signatures script v0.2 BETA by Tanel Poder ( http://blog.tanelpoder.com )
%This     SECONDS        AAS WAIT_CHAIN
------ ---------- ---------- ------------------------------------------------------------------------
  30%        3602         .1 -> (xxxzzzz.exe) :db file sequential read
  19%        2244          0 -> (DBWn) :ON CPU
  11%        1321          0 -> (LGWR) :log file parallel write
   9%        1113          0 -> (xxxzzzz.exe) :ON CPU
   7%         890          0 -> (xxxzzzz.exe) :db file scattered read
   6%         784          0 -> (xxxzzzz.exe) :log file sync  -> (LGWR) :log file parallel write
   5%         633          0 -> (aaatobbbb.exe) :log file sync  -> (LGWR) :log file parallel write
   4%         490          0 -> (CKPT) :control file parallel write
   1%         106          0 -> (wnwp.exe) :db file sequential read
   1%          88          0 -> (JDBC Thin Client) :db file sequential read
   1%          86          0 -> (aaatobbbb.exe) :db file scattered read
   1%          82          0 -> (mnnn) :db file sequential read
   0%          59          0 -> (xxxzzzz.exe) :read by other session
   0%          55          0 -> (wnwp.exe) :ON CPU
   0%          54          0 -> (JDBC Thin Client) :ON CPU
   0%          45          0 -> (plsqldev.exe) :enq: TX - row lock contention
   0%          44          0 -> (sqlplus) :ON CPU
   0%          27          0 -> (wnwp.exe) :log file sync  -> (LGWR) :log file parallel write
   0%          27          0 -> (aaatobbbb.exe) :ON CPU
   0%          25          0 -> (xxxzzzz.exe) :SQL*Net more data from dblink
   0%          25          0 -> (mnnn) :ON CPU
   0%          24          0 -> (ARCn) :log file sequential read
   0%          19          0 -> (xxxzzzz.exe) :db file parallel read
   0%          13          0 -> (CTWR) :ON CPU
   0%          13          0 -> (aaatobbbb.exe) :db file sequential read
   0%          12          0 -> (Jnnn) :db file sequential read
   0%          10          0 -> (aaatobbbb.exe) :log file sync
   0%          10          0 -> (SMON) :db file sequential read
   0%          10          0 -> (ARCn) :ON CPU
   0%          10          0 -> (CJQn) :ON CPU
30 rows selected.
--//能夠發現這臺系統磁盤io已經不行.若是負荷在增長就不行.順便說一下這臺機器硬件配置不好.

WWWWWWWWW> @ tpt/ash/ash_wait_chains10g  event2 1=1 trunc(sysdate)+7/24 sysdate
-- Display ASH Wait Chain Signatures script v0.2 BETA by Tanel Poder ( http://blog.tanelpoder.com )
%This     SECONDS        AAS WAIT_CHAIN
------ ---------- ---------- -----------------------------------------------------------
  32%        4910         .1 -> db file sequential read
  32%        4887         .1 -> ON CPU
  12%        1896          0 -> log file sync  -> log file parallel write
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  11%        1761          0 -> log file parallel write
   7%        1078          0 -> db file scattered read
   4%         650          0 -> control file parallel write
   0%          59          0 -> read by other session
   0%          45          0 -> enq: TX - row lock contention
   0%          34          0 -> SQL*Net more data to client
   0%          34          0 -> SQL*Net more data from dblink
   0%          24          0 -> log file sequential read
   0%          23          0 -> db file parallel read
   0%          21          0 -> log file sync
   0%           7          0 -> log file switch completion
   0%           6          0 -> null event
   0%           5          0 -> change tracking file synchronous write
   0%           4          0 -> SQL*Net more data from client
   0%           4          0 -> control file sequential read
   0%           3          0 -> os thread startup
   0%           2          0 -> enq: CF - contention  -> control file parallel write
   0%           1          0 -> SQL*Net break/reset to client
   0%           1          0 -> Log archive I/O
   0%           1          0 -> latch: library cache
   0%           1          0 -> log file sync  -> ON CPU
24 rows selected.

--//注意看下劃線,log file sync主要是因爲log file parallel write太慢形成的,與前面的不一樣.這樣定位問題就不會太盲目亂猜.sql

相關文章
相關標籤/搜索