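Oracle RAC was being installed on AIX 7100-02-03-1334, and both grid and oracle had already been installed. When creating the database with dbca, however, the instance crashed. Below is the alert.log from the creation attempt: the database reports ORA-07445, and the dbca log shows that the failure occurred during the Create database step.
No matching document was found on MOS, so I tried other approaches.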
/oraapp/oracle/diag/rdbms/rmbtodb/rmbtodb1/trace/alert_rmbtodb1.log
MMNL started with pid=26, OS id=7733452
Exception [type: SIGILL, Illegal opcode] [ADDR:0x103E2AFA0] [PC:0x103E2AFA0, {empty}] [flags: 0x0, count: 1]
Errors in file /oraapp/oracle/diag/rdbms/rmbtodb/rmbtodb1/trace/rmbtodb1_asmb_6357148.trc (incident=105793):
ORA-07445: exception encountered: core dump [PC:0x103E2AFA0] [SIGILL] [ADDR:0x103E2AFA0] [PC:0x103E2AFA0] [Illegal opcode] []
Incident details in: /oraapp/oracle/diag/rdbms/rmbtodb/rmbtodb1/incident/incdir_105793/rmbtodb1_asmb_6357148_i105793.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
lmon registered with NM - instance number 1 (internal mem no 0)
Reconfiguration started (old inc 0, new inc 2)
List of instances:
1 (myinst: 1)
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Thu Dec 11 11:19:18 2014
LCK0 started with pid=27, OS id=10420304
Starting background process RSMN
Thu Dec 11 11:19:18 2014
RSMN started with pid=28, OS id=9306256
ORACLE_BASE from environment = /oraapp/oracle
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x496568BB8] [PC:0x10029B4D0, {empty}] [flags: 0x8, count: 3]
Errors in file /oraapp/oracle/diag/rdbms/rmbtodb/rmbtodb1/trace/rmbtodb1_asmb_6357148.trc (incident=105794):
ORA-07445: exception encountered: core dump [PC:0x10029B4D0] [SIGSEGV] [ADDR:0x496568BB8] [PC:0x10029B4D0] [Address not mapped to object] []
ORA-07445: exception encountered: core dump [PC:0x103E2AFA0] [SIGILL] [ADDR:0x103E2AFA0] [PC:0x103E2AFA0] [Illegal opcode] []
Incident details in: /oraapp/oracle/diag/rdbms/rmbtodb/rmbtodb1/incident/incdir_105794/rmbtodb1_asmb_6357148_i105794.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Thu Dec 11 11:19:21 2014
Sweep [inc][105794]: completed
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sweep [inc][105793]: completed
Sweep [inc2][93794]: completed
Sweep [inc2][105794]: completed
PMON (ospid: 16318602): terminating the instance due to error 486
System state dump requested by (instance=1, osid=16318602 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /oraapp/oracle/diag/rdbms/rmbtodb/rmbtodb1/trace/rmbtodb1_diag_14352568.trc
Dumping diagnostic data in directory=[cdmp_20141211111922], requested by (instance=1, osid=16318602 (PMON)), summary=[abnormal instance termination].
Instance terminated by PMON, pid = 16318602
oracle@urmbtodb1:/oraapp/oracle/diag/rdbms/rmbtodb/rmbtodb1/trace>1/incident/incdir_105794/rmbtodb1_asmb_6357148_i105794.trc <
"/oraapp/oracle/diag/rdbms/rmbtodb/rmbtodb1/incident/incdir_105794/rmbtodb1_asmb_6357148_i105794.trc" 2832 lines, 161159 characters
Dump file /oraapp/oracle/diag/rdbms/rmbtodb/rmbtodb1/incident/incdir_105794/rmbtodb1_asmb_6357148_i105794.trc
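My first suspicion was that the oracle user had no write permission on the ASM disks. I tried creating an spfile on ASM as oracle, and it was created successfully. I then checked the oracle executable under CRS_HOME and ORACLE_HOME and found no permission problem.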
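The permission check on the oracle executable can be sketched as a small shell helper. This is only an illustration: the function names `has_setuid_setgid` and `check_oracle_binary` are hypothetical, and it assumes the commonly expected mode for the binary is `-rwsr-s--x` (setuid and setgid both set).

```shell
#!/bin/sh
# Hypothetical helper: succeed if a mode string (first field of `ls -l`)
# has both the setuid and the setgid bit set.
has_setuid_setgid() {
    mode=$1
    owner_x=$(printf '%s' "$mode" | cut -c4)   # owner execute slot
    group_x=$(printf '%s' "$mode" | cut -c7)   # group execute slot
    case "$owner_x$group_x" in
        ss) return 0 ;;
        *)  return 1 ;;
    esac
}

# Hypothetical helper: report the mode of one oracle binary.
check_oracle_binary() {
    bin=$1
    if [ ! -f "$bin" ]; then
        echo "MISSING $bin"
        return 2
    fi
    mode=$(ls -l "$bin" | awk '{print $1}')
    if has_setuid_setgid "$mode"; then
        echo "OK $mode $bin"
    else
        echo "CHECK $mode $bin"
        return 1
    fi
}

# Example usage (paths depend on the environment):
#   check_oracle_binary "$ORACLE_HOME/bin/oracle"
#   check_oracle_binary /oraapp/grid/gridhome/bin/oracle
```

A `CHECK` result only means the mode differs from the assumed one; it is a prompt to look closer, not proof of a fault.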
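1. First, I tried creating the database manually on node 1: I wrote a pfile and tried to start the instance in nomount. The instance crashed immediately after reaching nomount.
2. I then tried creating the database with dbca on node 2. The errors reported were as follows: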
/oraapp/oracle/cfgtoollogs/dbca/rmbtodb/trace.log
[Thread-178] [ 2014-12-11 12:47:49.813 CST ] [PostDBCreationStep.executeImpl:889] Starting Database HA Resource
[Thread-178] [ 2014-12-11 12:48:16.318 CST ] [CRSNative.internalStartResource:389] Failed to start resource: Name: ora.rmbtodb.db, node: null, filter: null,
msg CRS-5017: The resource action "ora.rmbtodb.db start" encountered the following error:
ORA-03113: end-of-file on communication channel
Process ID: 14287060
Session ID: 126 Serial number: 1
. For details refer to "(:CLSN00107:)" in
"/oraapp/grid/gridhome/log/urmbtodb1/agent/crsd/oraagent_oracle/oraagent_oracle.log".
CRS-2674: Start of 'ora.rmbtodb.db' on 'urmbtodb1' failed
CRS-2632: There are no more servers to try to place resource 'ora.rmbtodb.db' on that would satisfy its placement policy
[Thread-178] [ 2014-12-11 12:48:16.319 CST ] [PostDBCreationStep.executeImpl:897] Exception while Starting with HA Database Resource PRCR-1079 : Failed to s
tart resource ora.rmbtodb.db
CRS-5017: The resource action "ora.rmbtodb.db start" encountered the following error:
ORA-03113: end-of-file on communication channel
Process ID: 14287060
Session ID: 126 Serial number: 1
. For details refer to "(:CLSN00107:)" in "/oraapp/grid/gridhome/log/urmbtodb1/agent/crsd/oraagent_oracle/oraagent_oracle.log".
CRS-2674: Start of 'ora.rmbtodb.db' on 'urmbtodb1' failed
CRS-2632: There are no more servers to try to place resource 'ora.rmbtodb.db' on that would satisfy its placement policy
ora.rmbtodb.db failed to start on rmbtodb1, but the database itself could be created successfully on node 2.
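When a trace like the one above repeats the same errors several times, tallying the error codes can make the pattern easier to see. The sketch below is a hypothetical helper (`summarize_error_codes` is not an Oracle tool) that counts ORA-, CRS- and PRCR- codes using plain awk, so it should not depend on GNU grep options.

```shell
#!/bin/sh
# Hypothetical helper: count distinct ORA-/CRS-/PRCR- codes in the given
# files (or on stdin), most frequent first.
summarize_error_codes() {
    awk '{
        line = $0
        # Pull out every code on the line, not just the first one.
        while (match(line, /(ORA|CRS|PRCR)-[0-9]+/)) {
            count[substr(line, RSTART, RLENGTH)]++
            line = substr(line, RSTART + RLENGTH)
        }
    }
    END { for (code in count) print count[code], code }' "$@" |
        sort -k1,1nr -k2,2
}

# Example usage:
#   summarize_error_codes /oraapp/oracle/cfgtoollogs/dbca/rmbtodb/trace.log
```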
Looking at oraagent_oracle.log in detail:
/oraapp/grid/gridhome/log/urmbtodb1/agent/crsd/oraagent_oracle/oraagent_oracle.log
2014-12-10 22:48:11.505: [ USRTHRD][1800] {2:52141:473} Value of LOCAL_LISTENER is
2014-12-10 22:48:11.549: [ USRTHRD][1800] {2:52141:473} ORA-01405: fetched column value is NULL
2014-12-10 22:48:11.549: [ USRTHRD][1800] {2:52141:473} Value of LISTENER_NETWORKS is
2014-12-10 22:48:11.549: [ USRTHRD][1800] {2:52141:473} sqlStmt =
ALTER SYSTEM SET LOCAL_LISTENER=' (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=200.31.155.225)(PORT=1521))))' SCOPE=MEMORY SID='rmbtodb1' /* db agent *//* {2:52141:473} */
2014-12-10 22:48:13.011: [ USRTHRD][1800] {2:52141:473} ORA-03113: end-of-file on communication channel
Process ID: 14287060
Session ID: 126 Serial number: 1
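The log shows that the database crashes while LOCAL_LISTENER is being set. At this point the picture was quite clear: the problem had to be on the network side.
The AIX administrator mentioned that the NIC bonding mode on node 1 had been changed earlier.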
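A bonding change like this can be inspected from the AIX side with `lsattr`. The sketch below rests on assumptions not confirmed by the original text: that interface en9 sits on adapter ent9, and that ent9 is an EtherChannel pseudo-adapter exposing a `mode` attribute. The helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one attribute from
# `lsattr -El <device>` style output read on stdin. Each line is expected
# to look like: <attribute> <value> <description...> <user_settable>
lsattr_value() {
    awk -v want="$1" '$1 == want { print $2 }'
}

# Hypothetical helper: show the EtherChannel mode of an adapter.
# Assumption: the adapter is an EtherChannel device with a "mode" attribute.
show_adapter_mode() {
    adapter=$1
    lsattr -El "$adapter" | lsattr_value mode
}

# Example usage on the AIX node (assumes en9 runs on ent9):
#   show_adapter_mode ent9
#   ifconfig en9
```

Comparing the output between node 1 and node 2 would show whether the two nodes ended up with different bonding configurations.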
grid@urmbtodb1:/home/grid>oifcfg getif -global
en10 192.168.4.0 global cluster_interconnect
en9 200.31.155.0 global public
The public and private network definitions showed nothing abnormal. I tried re-registering the public network anyway.
Delete the en9 entry:
grid@urmbtodb1:/home/grid>oifcfg delif -global en9
grid@urmbtodb1:/home/grid>oifcfg getif -global
en10 192.168.4.0 global cluster_interconnect
Re-add the public network:
grid@urmbtodb1:/home/grid>oifcfg setif -global en9/200.31.155.0:public
grid@urmbtodb1:/home/grid>oifcfg getif -global
en10 192.168.4.0 global cluster_interconnect
en9 200.31.155.0 global public
After that, I restarted crs and ran dbca on node 1 again. The earlier problem did not recur.
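Because the delete step leaves the cluster temporarily without a public interface definition, it may be worth confirming that both interface types are registered before restarting crs. The following is a minimal sketch: `has_required_interfaces` is a hypothetical helper that only parses `oifcfg getif` output, and the restart commands in the comments are an assumption about how crs was restarted, since the original does not show them.

```shell
#!/bin/sh
# Hypothetical helper: read `oifcfg getif` output on stdin and succeed only
# if at least one public and one cluster_interconnect interface are listed.
has_required_interfaces() {
    awk '
        $NF == "public"               { pub = 1 }
        $NF == "cluster_interconnect" { priv = 1 }
        END { exit !(pub && priv) }
    '
}

# Example usage as the grid user:
#   oifcfg getif -global | has_required_interfaces \
#       && echo "interfaces registered" \
#       || echo "public or interconnect definition missing"
#
# Assumed restart sequence, run as root on the affected node:
#   crsctl stop crs
#   crsctl start crs
```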