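I recently went through a rather painful disaster recovery. During a maintenance window I needed to shut down the entire database, but the shutdown immediate command never finished, so in the end I had to force the database down with shutdown abort. When I tried to open it again, the following ORA-00600 error appeared: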
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1158738710], [0], [1128825042], [], [], []
Wed Apr 20 21:36:40 2011
Errors in file /u01/app/oracle/admin/orcl/udump/orcl1_ora_5622.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1158738710], [0], [1128825042], [], [], []
Wed Apr 20 21:36:40 2011
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
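So much for shutdown abort; a hard lesson!! Now I had no choice but to grit my teeth and recover the database. At first Oracle reported that system01 needed media recovery, and a query showed that the SCN in the controlfile did not match the one in the datafile headers. I tried recover database until cancel, which reported that recovery was complete, but alter database open resetlogs still failed with the same error!
I searched Metalink for [kclchkblk_4]; note [ID 275902.1] describes this situation: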
1) Error, ORA-600[KCLCHKBLK_4], is signaled because the SCN in a tempfile block
is too high. The same reason caused the ORA-600[2662]s in the alert logs.
2) This issue is because the tempfiles may not get reinitialized during open
resetlogs.
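In other words, after resetlogs the SCN in the temporary tablespace is inconsistent with the system SCN. The fix is to delete all the physical tempfiles while the database is in mount state, and then add the temp files back once the database is open.
After doing this, the open failed with a new error: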
Wed Apr 20 22:34:54 2011
SMON: enabling cache recovery
Wed Apr 20 22:34:54 2011
Errors in file /u01/app/oracle/admin/orcl/udump/orcl1_ora_30165.trc:
ORA-00600: internal error code, arguments: [2662], [0], [1128985090], [0], [1158738710], [8388617], [], []
Wed Apr 20 22:34:58 2011
Errors in file /u01/app/oracle/admin/orcl/udump/orcl1_ora_30165.trc:
ORA-00600: internal error code, arguments: [2662], [0], [1128985090], [0], [1158738710], [8388617], [], []
Wed Apr 20 22:34:58 2011
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Wed Apr 20 22:34:58 2011
Errors in file /u01/app/oracle/admin/orcl/bdump/orcl1_lmon_30042.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Instance terminated by USER, pid = 30165
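ORA-600 [2662] is a common error after an incomplete recovery that used parameters such as _allow_resetlogs_corruption. The main cause is that the SCN recorded in a data block is ahead of the database's current SCN. Oracle compares the current SCN with the dependent SCN held in a UGA variable, and if the current SCN is the smaller of the two, it raises ORA-600 [2662], as in the alert log excerpt above.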
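To make the comparison concrete, here is a minimal Python sketch that pulls the arguments out of an ORA-600 [2662] line and compares the two SCNs. It assumes the commonly cited argument layout (current SCN wrap, current SCN base, dependent SCN wrap, dependent SCN base, then the data block address) and the usual smallfile DBA split of 10 bits for the relative file number and 22 bits for the block number. Neither layout is stated in this post, so treat both as assumptions; the function names are made up for illustration.

```python
import re

# Alert log line from this post (message text shown in English).
LOG_LINE = ("ORA-00600: internal error code, arguments: [2662], [0], [1128985090], "
            "[0], [1158738710], [8388617], [], []")


def parse_ora600_args(line):
    """Return the bracketed arguments of an ORA-00600 line as a list of strings."""
    return re.findall(r"\[([^\]]*)\]", line)


def decode_2662(line):
    """Decode an ORA-600 [2662] line under the assumed argument layout."""
    args = parse_ora600_args(line)
    if not args or args[0] != "2662":
        raise ValueError("not an ORA-600 [2662] line")
    cur_wrap, cur_base, dep_wrap, dep_base, dba = (int(a) for a in args[1:6])
    return {
        # Assumption: full SCN = wrap * 2**32 + base.
        "current_scn": (cur_wrap << 32) | cur_base,
        "dependent_scn": (dep_wrap << 32) | dep_base,
        # Assumption: DBA = 10-bit relative file number + 22-bit block number.
        "rel_file_no": dba >> 22,
        "block_no": dba & 0x3FFFFF,
    }


info = decode_2662(LOG_LINE)
gap = info["dependent_scn"] - info["current_scn"]
print(info, "gap:", gap)
```

Under those assumptions the line decodes to relative file 2, block 9, and the current SCN trails the dependent SCN by 29753620.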
The usual fix for a 2662 error is to advance the SCN with event 10015:
alter session set events '10015 trace name adjust_scn level x';
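Here x is the level. Level 1 advances the SCN by 1 billion (1024*1024*1024), and level 1 is usually enough, but it can be adjusted as the situation requires. In our case the error shows that 1128985090 is less than 1158738710. With level 1 the adjusted SCN would be 1073741824, which is still below the current SCN; when the adjustment is too small, a different error, 2256, is raised instead. So I used level 2.
Based on past experience with 8i/9i, the database should have opened at this point, but the open failed with the same error, and a query against V$DATABASE showed that the SCN had not changed. Evidently the SCN adjustment had not taken effect, which made things rather troublesome.
A close look at the generated trace file showed that an ORA-01031 insufficient privileges error had been raised before the 2662 error: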
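The level arithmetic can be checked with a few lines of Python. This is only a sketch of the reasoning above: it assumes, as the text describes, that level x targets an SCN of x * 1024*1024*1024, and that the target has to exceed the larger SCN reported in the 2662 error. The helper name min_adjust_level is made up for illustration.

```python
SCN_PER_LEVEL = 1024 ** 3  # 1,073,741,824 per level, as described in the text


def min_adjust_level(dependent_scn):
    """Smallest level whose target SCN (level * 2**30) exceeds dependent_scn."""
    return dependent_scn // SCN_PER_LEVEL + 1


current_scn = 1128985090    # second SCN-related argument of the 2662 error
dependent_scn = 1158738710  # fourth argument, the SCN that must be passed

level = min_adjust_level(dependent_scn)
print(level, level * SCN_PER_LEVEL)  # prints: 2 2147483648
```

Level 1 would give 1073741824, which is below both SCNs in the error, so level 2 is the smallest value that works here.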
Clearing ORA-1031 thrown by trace 'ADJUST_SCN'
----- Dump for trace 'ADJUST_SCN': -----
*** 2011-04-20 23:54:19.034
ksedmp: internal or fatal error
ORA-01031: insufficient privileges
Current SQL statement for this session:
alter database open
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
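So some kind of privilege problem really was causing the SCN adjustment to fail. Yet this method was used routinely on 8i/9i and should have worked, so I could only guess that Oracle 10g had changed something about event 10015. After asking around, including some friends and QQ groups, I finally learned of a parameter from one friend: _allow_error_simulation. Event 10015 can adjust the SCN only when this parameter is set to true. Asking others for help is a good habit, but I am firmly against asking for help late at night!!!
I set this parameter in init.ora, used event 10015 again, and the database finally opened. After that came an exp/imp rebuild, and the job wrapped up smoothly.
The lesson from this job: use shutdown abort with great care, and then more care still!!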