oracle數據庫日誌報錯ORA-27300 ORA-27301 ORA-27302 ORA-27157處理記錄:數據庫
應用鏈接數據庫失敗,先鏈接數據庫服務器,啓動數據庫服務恢復應用業務,而後排查數據庫crush緣由:
1)查看messages日誌是否有與Oracle用戶相關的出錯信息vim
# cat /etc/redhat-release CentOS Linux release 7.2.1511 (Core) # cat /var/log/messages |grep failed
2)查看用戶郵件日誌是否有與Oracle用戶相關的出錯信息服務器
# tail -n 200 /var/mail/oracle
3)檢查跟蹤日誌文件oracle
SQL> select value from v$diag_info where name ='Diag Trace'; VALUE -------------------------------- /app/oracle/diag/rdbms/crm/crm/trace $ cat /app/oracle/diag/rdbms/crm/crm/trace/alert_crm.log |grep ORA- $ cat /app/oracle/diag/rdbms/crm/crm/trace/alert_crm.log |grep err $ cat /app/oracle/diag/rdbms/crm/crm/trace/alert_crm.log |grep fail SQL> select name,value from v$diag_info; NAME VALUE -------------------------------- Diag Enabled TRUE ADR Base /app/oracle/ ADR Home /app/oracle/diag/rdbms/crm/crm Diag Trace /app/oracle/diag/rdbms/crm/crm/trace Diag Alert /app/oracle/diag/rdbms/crm/crm/alert Diag Incident /app/oracle/diag/rdbms/crm/crm/incident Diag Cdump /app/oracle/diag/rdbms/crm/crm/cdump Health Monitor /app/oracle/diag/rdbms/crm/crm/hm Default Trace File /app/oracle/diag/rdbms/crm/crm/trace/crm_ora_10974.trc Active Problem Count 1 Active Incident Count 1
查看log.xml搜索問題出現時間點app
$ less /app/oracle/diag/rdbms/crm/crm/alert/log.xml Errors in file /app/oracle/diag/rdbms/crm/crm/trace/crm_dbw2_80900.trc: ORA-27157: OS post/wait facility removed ORA-27300: OS system dependent operation:semop failed with status: 43 ORA-27301: OS failure message: Identifier removed ORA-27302: failure occurred at: sskgpwwait1 Errors in file /app/oracle/diag/rdbms/crm/crm/trace/crm_dbw3_80902.trc: ORA-27157: OS post/wait facility removed ORA-27300: OS system dependent operation:semop failed with status: 43 ORA-27301: OS failure message: Identifier removed ORA-27302: failure occurred at: sskgpwwait1
This could be due to some outside user or application removing the semaphores/shared memory. To monitor the semaphore/shared memory state we can use the following methods: Setup a cronjob to run every 5-10min and dump the output of 'ipcs' and 'ps - ef' to a file with a timestamp. Rotate your logs every 4-7 days to build a history. Then if the problem re-occurs, we can at least try to make sure 'ipcrm' wasn't the culprit and get some general information of the state of the IPC resources plus the processes running. You can also consult with your sysadmin to check if there is any OS level auditing that can be turned on to audit the usage of commands like 'ipcrm' which can remove shared memory segments /semaphore sets. Note: This issue can happen on different platform, but in case you encounter the issue in RHEL7.2, then please also check below RHEL7.2 specific information. In RHEL7.2 operating system setting RemoveIPC=YES crashes the database.The default value for RemoveIPC in RHEL7.2 is YES. Workaround : 1) Set RemoveIPC=no in /etc/systemd/logind.conf if it is not in that file 2) Reboot the server or restart systemd-logind as follows: # systemctl daemon-reload # systemctl restart systemd-logind
Centos7.2有個特性,默認該配置是爲yesless
# vim /etc/systemd/logind.conf RemoveIPC=no # systemctl daemon-reload # systemctl restart systemd-logind