On RHEL 7.2, setting RemoveIPC=yes crashes Oracle ASM instances and Oracle database instances. The same problem occurs in any application that uses a Shared Memory Segment (SHM) or Semaphores (SEM).
Source:
ALERT: Setting RemoveIPC=yes on Redhat 7.2 Crashes ASM and Database Instances as Well as Any Application That Uses a Shared Memory Segment (SHM) or Semaphores (SEM) (Doc ID 2081410.1)
Applies to:
Oracle Database - Standard Edition
Oracle Database - Enterprise Edition
Linux x86-64
Linux x86
Description:
In RHEL 7.2, the systemd-logind service introduced a new feature: when a user fully logs out of the OS, all of that user's IPC objects are removed.
The feature is controlled by the RemoveIPC option in the /etc/systemd/logind.conf configuration file. See man logind.conf(5) for details.
In RHEL 7.2, the default value of RemoveIPC is yes.
As a result, when the last oracle or Grid user logs out, the operating system removes that user's shared memory segments and semaphores.
Because Oracle ASM and the database use shared memory segments, removing them crashes the Oracle ASM and database instances.
Please refer to Redhat bug 1264533 - https://bugzilla.redhat.com/show_bug.cgi?id=1264533
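To see which value a host is actually configured with, the option can be read straight out of the configuration file. The following is a minimal sketch, not an official tool: the function name removeipc_setting is made up for this example, it reads only the single file passed to it, and it assumes that the last uncommented assignment in the file is the one that takes effect.

```shell
#!/bin/sh
# Hypothetical helper: print the RemoveIPC value set in a logind.conf-style file.
# Prints "unset" when the option is absent or commented out, in which case the
# built-in default applies (yes on RHEL 7.2, according to this article).
removeipc_setting() {
    conf="${1:-/etc/systemd/logind.conf}"
    # Keep only uncommented "RemoveIPC = value" lines and take the last one.
    value=$(sed -n 's/^[[:space:]]*RemoveIPC[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$conf" 2>/dev/null | tail -n 1)
    echo "${value:-unset}"
}
```

Calling removeipc_setting with no argument reads /etc/systemd/logind.conf. A result of "yes" or "unset" on RHEL 7.2 means the host is exposed to the problem described here.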
Occurrence:
The problem affects every application that uses shared memory segments and semaphores, so both Oracle ASM instances and Oracle Database instances are affected.
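To judge whether a given account is exposed, it helps to count the IPC objects it currently owns. The sketch below filters the output of ipcs -m or ipcs -s. The function name count_ipc_owned_by is made up for this example, and it assumes the usual Linux ipcs layout, in which each data row begins with a 0x key and the owner is the third column.

```shell
#!/bin/sh
# Hypothetical helper: count IPC objects owned by a user.
# Reads `ipcs -m` (shared memory) or `ipcs -s` (semaphores) output on stdin.
# Assumption: data rows start with a 0x... key and column 3 is the owner.
count_ipc_owned_by() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { n++ } END { print n + 0 }'
}
```

For example, ipcs -m | count_ipc_owned_by oracle reports how many shared memory segments the oracle user holds, and ipcs -s | count_ipc_owned_by oracle does the same for semaphore arrays. Any non-zero count means a logout of that user's last session could remove objects that are still in use.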
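Oracle Linux 7.2 avoids the problem by explicitly setting RemoveIPC to no in the /etc/systemd/logind.conf configuration file.
However, if /etc/systemd/logind.conf was modified before the OS upgrade, yum update writes the correct configuration file (with RemoveIPC=no) as logind.conf.rpmnew and leaves the existing file in place. If the user keeps using the original configuration file, the failures described in this article will occur.
To avoid the problem, be sure to edit logind.conf and set RemoveIPC=no after the OS upgrade. This is documented in the Oracle Linux 7.2 release notes.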
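After an upgrade, a quick check for a leftover logind.conf.rpmnew shows whether the packaged configuration was set aside. This is a rough sketch under the assumption that the .rpmnew file sits next to the original, as described above. The function name check_logind_rpmnew is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: report whether an .rpmnew copy of logind.conf exists.
# Returns 0 when there is none, 1 when one is present and needs reviewing.
check_logind_rpmnew() {
    conf="${1:-/etc/systemd/logind.conf}"
    if [ ! -f "$conf.rpmnew" ]; then
        echo "no $conf.rpmnew found"
        return 0
    fi
    echo "found $conf.rpmnew; RemoveIPC lines in each file:"
    # grep exits non-zero when nothing matches; that is not an error here.
    grep 'RemoveIPC' "$conf" "$conf.rpmnew" || true
    return 1
}
```

A return status of 1 only says that the two files should be compared by hand. It does not by itself mean the active file is wrong.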
Symptoms:
1) Installing 11.2 and 12c GI/CRS fails, because ASM crashes towards the end of the installation.
2) Upgrading to 11.2 and 12c GI/CRS fails.
3) After Redhat Linux is upgraded to 7.2, 11.2 and 12c ASM and database instances crash.
Because systemd-logind can remove IPC objects at any time, the failure can show up in very different ways. A few examples follow.
The most common error is that the following appears in the ASM or database alert.log:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
The second observed error occurs during installation and upgrade, when asmca fails with the following error:
KFOD-00313: No ASM instances available. CSS group services were successfully initilized by kgxgncin
KFOD-00105: Could not open pfile 'init@.ora'
The third observed error occurred during installation and upgrade:
Creation of ASM password file failed. Following error occurred: Error in Process: /u01/app/12.1.0/grid/bin/orapwd
Enter password for SYS:
OPW-00009: Could not establish connection to Automatic Storage Management instance
2015/11/20 21:38:45 CLSRSC-184: Configuration of ASM failed
2015/11/20 21:38:46 CLSRSC-258: Failed to configure and start ASM
The fourth observed error is the following message, found in the /var/log/messages file around the time that the ASM or database instance crashed:
Nov 20 21:38:43 testc201 kernel: traps: oracle[24861] trap divide error ip:3896db8 sp:7ffef1de3c40 error:0 in oracle[400000+ef57000]
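The first signature above is easy to search for in an alert log. The sketch below is only a starting point: the function name find_ipc_removal_errors is made up for this example, the log path must be supplied by the caller, and a match means the log is worth reading, not that RemoveIPC is confirmed as the cause.

```shell
#!/bin/sh
# Hypothetical helper: print alert-log lines (with line numbers) that carry the
# most common signature listed above: ORA-27157 or the OS message
# "Identifier removed".
find_ipc_removal_errors() {
    grep -n -E 'ORA-27157|Identifier removed' "$1"
}
```

It is called with the path to the ASM or database alert.log as its only argument. grep returns a non-zero status when nothing matches, so the function can also be used directly in an if test.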
Workaround:
1) Set RemoveIPC=no in /etc/systemd/logind.conf
2) Reboot the server or restart systemd-logind as follows:
# systemctl daemon-reload
# systemctl restart systemd-logind
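Step 1 can be scripted so that it is safe to run more than once. The following is a sketch rather than a vetted procedure: the function name set_removeipc_no is made up for this example, it keeps a .bak copy of the file, and when no RemoveIPC line exists it appends one to the end of the file, which assumes [Login] is the last (or only) section there.

```shell
#!/bin/sh
# Hypothetical helper: force RemoveIPC=no in a logind.conf-style file.
set_removeipc_no() {
    conf="${1:-/etc/systemd/logind.conf}"
    if grep -q '^[[:space:]]*#\{0,1\}[[:space:]]*RemoveIPC[[:space:]]*=' "$conf"; then
        # Rewrite the existing line, whether active or commented out.
        # A backup of the original is kept as "$conf.bak".
        sed -i.bak 's/^[[:space:]]*#\{0,1\}[[:space:]]*RemoveIPC[[:space:]]*=.*/RemoveIPC=no/' "$conf"
    else
        # No RemoveIPC line at all: append one.
        # Assumption: the file ends inside the [Login] section.
        printf 'RemoveIPC=no\n' >> "$conf"
    fi
}
```

Run as root, set_removeipc_no with no argument edits /etc/systemd/logind.conf. The change still needs the reboot or the systemd-logind restart from step 2 to take effect.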
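Patches:
Migrating from RHEL 7.2 to Oracle Linux 7.2 resolves the problem.
If migrating to Oracle Linux 7.2 is not possible, use the workaround described above.
History:
This article was created on November 23, 2015.
Reposted from: https://blog.csdn.net/msdnchina/article/details/50864065?tdsourcetag=s_pctim_aiomsg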