Case Study: Resolving a Performance Problem Caused by a Self-Referencing Foreign Key

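A key customer in the Guiyang area saw performance drop sharply after a ZLHIS service pack (SP) upgrade. Operations on medical orders (both creating and modifying them) were especially slow. On-site staff also reported that even an UPDATE of a single row in the medical order table was very slow. This had gone on for more than an hour and was seriously affecting the system; order-related business had essentially ground to a halt.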

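After talking with the site by phone, it looked as though we had run into a table-level lock that was preventing transactions from acquiring the TM lock. On-site staff extracted the AWR report for the period, and we did a brief analysis: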


DB Name  DB Id       Instance  Inst num  Release     RAC  Host
ORCL     1160490627  orcl      1         10.2.0.1.0  NO   ZYHOSPIT-C55630

             Snap Id  Snap Time           Sessions  Cursors/Session
Begin Snap:  38900    17-Feb-12 08:01:04  277       59.6
End Snap:    38901    17-Feb-12 09:00:52  414       61.4
Elapsed:              59.81 (mins)
DB Time:              1,768.12 (mins)


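DB time is nearly 30 times the elapsed time (1,768.12 / 59.81 ≈ 29.6), so system performance was very poor. Next we looked at the Top 5 Timed Events to find out what DB time mainly consisted of: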

Top 5 Timed Events

Event                          Waits      Time(s)  Avg Wait(ms)  % Total Call Time  Wait Class
enq: TM - contention           25,550     75,077   2,938         70.8               Application
CPU time                                  16,319                 15.4
db file sequential read        1,987,166  8,702    4             8.2                User I/O
db file scattered read         1,386,286  2,933    2             2.8                User I/O
enq: TX - row lock contention  411        1,094    2,662         1.0                Application


The enq: TM - contention wait accounts for 70.8% of total DB time, and its average wait is 2,938 ms (about 2.94 s). This is severe TM enqueue waiting, so we queried the v$ views with a script.
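The original post does not show the script it used at this point. As a minimal sketch, assuming a 10g instance and SELECT privilege on the v$ views, the sessions currently stuck on this event can be listed from V$SESSION_WAIT. For enqueue waits, P1 encodes the lock type and requested mode and P2 holds the object ID; the column aliases below are illustrative names, not part of any standard script.

```sql
-- Sessions currently waiting on TM enqueue contention.
-- The two high-order bytes of P1 are the lock type (e.g. 'TM');
-- the low-order 16 bits are the requested lock mode.
SELECT sid,
       CHR(BITAND(p1, -16777216) / 16777215) ||
       CHR(BITAND(p1, 16711680) / 65535) AS lock_type,
       BITAND(p1, 65535)                 AS requested_mode,
       p2                                AS object_id,
       seconds_in_wait
  FROM v$session_wait
 WHERE event = 'enq: TM - contention'
 ORDER BY seconds_in_wait DESC;
```

If most waiters report the same object_id, the contention is concentrated on one table, and that ID can be resolved through DBA_OBJECTS.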
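A TM lock ensures that the structure of a table does not change while its contents are being modified. For example, if you have updated a table, you hold a TM lock on it. This prevents another user from running DROP or ALTER commands on that table. If you hold a TM lock on a table and another user tries to run DDL against it, that user gets the following error message: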
drop table dept
*
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified

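If a transaction modifies several tables, it acquires a TM lock on each of them. The enqueue lock modes seen most often are 3 and 6, so which mode was being held here? We used the following SQL to find out: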
      
Select Decode(Request, 0, 'Holder: ', 'Waiter: ') || Sid Sess, Id1, Id2, Lmode, Request, Type
From V$lock
Where (Id1, Id2, Type) In (Select Id1, Id2, Type From V$lock Where Request > 0)
Order By Id1, Request;

SESS         ID1    ID2  LMODE  REQUEST  TYPE
Holder: 220  52074  0    3      0        TM
Waiter: 224  52074  0    0      2        TM
Waiter: 138  52074  0    0      2        TM
Waiter: 125  52074  0    0      2        TM
Waiter: 243  52074  0    0      2        TM
Waiter: 401  52074  0    0      2        TM
Waiter: 136  52074  0    0      2        TM
Waiter: 506  52074  0    0      2        TM
Waiter: 502  52074  0    0      2        TM
Waiter: 61   52074  0    0      2        TM
Waiter: 7    52074  0    0      2        TM
Waiter: 99   52074  0    0      2        TM
Waiter: 207  52074  0    0      2        TM
Waiter: 491  52074  0    0      2        TM
Waiter: 245  52074  0    0      2        TM
Waiter: 140  52074  0    0      3        TM
Waiter: 150  52074  0    0      3        TM
Waiter: 66   52074  0    0      3        TM
Waiter: 116  52074  0    0      3        TM
Waiter: 132  52074  0    0      3        TM
Waiter: 106  52074  0    0      3        TM

           
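The holder has a TM lock in mode 3, while the waiters are requesting mode 2 or mode 3. For a TM enqueue, the ID1 column holds the table's object_id; querying dba_objects shows that OBJECT_ID 52074 is exactly the 病人醫囑記錄 (patient medical order records) table.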
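The lookup described above can be sketched as follows. This is an illustration rather than the exact statements the author ran: the object ID 52074 comes from the V$LOCK output, and the second query simply decodes the numeric LMODE and REQUEST values into their conventional names so the output is easier to read.

```sql
-- Resolve the object ID reported in V$LOCK.ID1 to a table name.
SELECT owner, object_name, object_type
  FROM dba_objects
 WHERE object_id = 52074;

-- Same lock rows as before, with lock modes spelled out.
SELECT sid,
       DECODE(lmode,   0, 'None', 1, 'Null', 2, 'Row Share (RS)',
                       3, 'Row Exclusive (RX)', 4, 'Share (S)',
                       5, 'Share Row Exclusive (SRX)', 6, 'Exclusive (X)') AS held,
       DECODE(request, 0, 'None', 1, 'Null', 2, 'Row Share (RS)',
                       3, 'Row Exclusive (RX)', 4, 'Share (S)',
                       5, 'Share Row Exclusive (SRX)', 6, 'Exclusive (X)') AS requested
  FROM v$lock
 WHERE type = 'TM'
   AND id1  = 52074
 ORDER BY request, sid;
```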
 
On mode 3 locks, the OWI (Oracle Wait Interface) book has the following to say:

Wait for TM Enqueue in Mode 3

Unindexed foreign key columns are the primary cause of TM lock contention in mode 3. However, this only applies to databases prior to Oracle9i Database. Depending on the operation, when foreign key columns are not indexed, Oracle either takes up a DML share lock (S – mode 4) or share row exclusive lock (SRX – mode 5) on the child table whenever the parent key or row is modified. (The share row exclusive lock is taken on the child table when the parent row is deleted and the foreign key constraint is created with the ON DELETE CASCADE option. Without this option, Oracle takes the share lock.) The share lock or share row exclusive lock on the child table prohibits other processes from getting a row exclusive lock (RX – mode 3) on the table. The waiting session will wait until the blocking session commits or rolls back its transaction.

Here is a philosophical question for you: Are you going to start building new indexes for all the foreign key columns in your databases? DBAs are divided on this. Our take is that you should hold your horses and don’t get carried away building new indexes just yet. If you do, you will introduce many new indexes to the database, some that are unnecessary. For example, you don’t need to create new indexes on foreign key columns when the parent tables they reference are static. You only need to create indexes on foreign key columns of the child table that is being identified by the   enqueue  wait event. The object ID for the child table is recorded in the P2 column, which corresponds to the ID1 column of the V$LOCK view. Query the DBA_OBJECTS view using the object ID and you will see the name of the child table. Yes, you will be operating in reactive mode, but it beats creating unnecessary indexes in the database, which not only wastes storage and increases maintenance, but may open up another can of worms for SQL tuning.
 
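In short, the passage says that unindexed foreign key columns are the main cause of mode 3 TM lock contention, although this applies only to databases earlier than 9i. Depending on the operation, when a foreign key column is not indexed, Oracle takes a share lock (S, mode 4) or a share row exclusive lock (SRX, mode 5) on the child table whenever the parent key or parent row is modified. That lock on the child table stops other sessions from getting a row exclusive lock (RX, mode 3) on it, and the waiting session keeps waiting until the blocking session commits or rolls back its transaction.
Our database is Oracle 10g, so this explanation would seem not to apply to us. Even so, we used the following SQL to look for foreign key columns that have no index: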
 
SELECT TABLE_NAME,
       CONSTRAINT_NAME,
       CNAME1 || NVL2(CNAME2, ',' || CNAME2, NULL) ||
       NVL2(CNAME3, ',' || CNAME3, NULL) ||
       NVL2(CNAME4, ',' || CNAME4, NULL) ||
       NVL2(CNAME5, ',' || CNAME5, NULL) ||
       NVL2(CNAME6, ',' || CNAME6, NULL) ||
       NVL2(CNAME7, ',' || CNAME7, NULL) ||
       NVL2(CNAME8, ',' || CNAME8, NULL) COLUMNS
FROM (SELECT B.TABLE_NAME,
               B.CONSTRAINT_NAME,
               MAX(DECODE(POSITION, 1, COLUMN_NAME, NULL)) CNAME1,
               MAX(DECODE(POSITION, 2, COLUMN_NAME, NULL)) CNAME2,
               MAX(DECODE(POSITION, 3, COLUMN_NAME, NULL)) CNAME3,
               MAX(DECODE(POSITION, 4, COLUMN_NAME, NULL)) CNAME4,
               MAX(DECODE(POSITION, 5, COLUMN_NAME, NULL)) CNAME5,
               MAX(DECODE(POSITION, 6, COLUMN_NAME, NULL)) CNAME6,
               MAX(DECODE(POSITION, 7, COLUMN_NAME, NULL)) CNAME7,
               MAX(DECODE(POSITION, 8, COLUMN_NAME, NULL)) CNAME8,
               COUNT(*) COL_CNT
          FROM (SELECT SUBSTR(TABLE_NAME, 1, 30) TABLE_NAME,
                       SUBSTR(CONSTRAINT_NAME, 1, 30) CONSTRAINT_NAME,
                       SUBSTR(COLUMN_NAME, 1, 30) COLUMN_NAME,
                       POSITION
                  FROM USER_CONS_COLUMNS) A,
               USER_CONSTRAINTS B
         WHERE A.CONSTRAINT_NAME = B.CONSTRAINT_NAME
           AND B.CONSTRAINT_TYPE = 'R'
         GROUP BY B.TABLE_NAME, B.CONSTRAINT_NAME) CONS
WHERE COL_CNT > ALL
(SELECT COUNT(*)
          FROM USER_IND_COLUMNS I
         WHERE I.TABLE_NAME = CONS.TABLE_NAME
           AND I.COLUMN_NAME IN (CNAME1, CNAME2, CNAME3, CNAME4, CNAME5,
                CNAME6, CNAME7, CNAME8)
           AND I.COLUMN_POSITION <= CONS.COL_CNT
         GROUP BY I.INDEX_NAME)
 
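This query uses the DECODE function to pivot rows into columns and so assemble each foreign key's column list. Looking through the results for the order-related tables, we can see that the table does have unindexed foreign keys: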
 
  病人醫囑記錄 病人醫囑記錄_FK_前提ID 前提ID
  病人醫囑記錄 病人醫囑記錄_FK_病人科室ID 病人科室ID
  病人醫囑記錄 病人醫囑記錄_FK_開囑科室ID 開囑科室ID
  病人醫囑記錄 病人醫囑記錄_FK_執行科室ID 執行科室ID
 
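Attention focused on 前提ID, because the other foreign key columns all reference the department table, and as a base table its data rarely changes. 前提ID, by contrast, is a self-referencing foreign key, not a plain parent/child foreign key. We found the constraint definition in the upgrade script: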
 
  ALTER TABLE 病人醫囑記錄
    ADD CONSTRAINT 病人醫囑記錄_FK_前提ID
    FOREIGN KEY (前提ID)
    REFERENCES 病人醫囑記錄(ID);
 
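前提ID references the table's own primary key column (ID). ID is hardly ever updated, but rows are inserted very frequently. In our tests, this kind of self-referencing foreign key caused TM locking on the table whenever rows were updated or inserted, even on 10g, as long as the foreign key column had no index. The next step was therefore to create the index: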
 
CREATE INDEX 病人醫囑記錄_IX_前提ID
    ON 病人醫囑記錄(前提ID)
    PCTFREE 10 
    TABLESPACE zl9CisRec
    online nologging;
 
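Because this is a production database, the index was created with the ONLINE option, and NOLOGGING was added to minimize redo generation and speed up the build. If you use the PARALLEL option while building an index, remember to set the degree of parallelism back to 1 once the build finishes, so that queries do not spawn large numbers of parallel processes.
Once the index was built, the affected operations returned to normal.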
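The post-build cleanup mentioned above can be sketched as follows. The NOPARALLEL reset matters only if PARALLEL was used for the build (the CREATE INDEX shown here did not use it), and switching the index back to LOGGING is an optional step assumed here for illustration; the original post does not say whether it was done.

```sql
-- Reset attributes that were useful only while building the index.
ALTER INDEX 病人醫囑記錄_IX_前提ID NOPARALLEL;  -- degree of parallelism back to 1
ALTER INDEX 病人醫囑記錄_IX_前提ID LOGGING;     -- optional: log future maintenance normally

-- Confirm the index is usable and covers the foreign key column.
SELECT i.index_name, i.status, i.degree, i.logging,
       c.column_name, c.column_position
  FROM user_indexes i
  JOIN user_ind_columns c ON c.index_name = i.index_name
 WHERE i.index_name = '病人醫囑記錄_IX_前提ID';
```

A STATUS of VALID with 前提ID at COLUMN_POSITION 1 indicates that the foreign key column is now the leading column of a usable index.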
 
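Summary: as the OWI passage makes clear, not every foreign key needs an index. Whether to create one depends on how often the referenced primary key changes and on whether an index on the foreign key column would actually improve performance; the aim is to avoid creating a "zombie index" that is never or rarely used. In this case we did not create indexes for the foreign keys that reference the department table. As always, each problem has to be analyzed on its own terms rather than handled by a blanket rule. The case also shows that, even on 10g, a self-referencing foreign key whose referenced primary key changes often (inserts included) must be indexed.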