Oracle 12C CRS-5013

1. Background

OS: SUSE 12 SP3
DB: 12.2.0.1.190115, 2-node RAC
Q: The CRS alert log keeps filling with the following errors
2019-02-12 12:46:18.163 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:46:21.161 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:46:24.168 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:46:27.167 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:46:39.163 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:46:42.161 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:46:48.167 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:46:51.165 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:46:51.724 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:46:51.789 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"

2019-02-12 12:46:54.166 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:46:57.167 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:47:00.158 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:47:06.160 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:47:09.167 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:47:12.166 [ORAAGENT(91622)]CRS-5013: Agent "ORAAGENT" failed to start process "/oracle/app/12.2.0/grid/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:47:13.588 [ORAAGENT(91622)]CRS-5016: Process "/oracle/app/12.2.0/grid/opmn/bin/onsctli" spawned by agent "ORAAGENT" for action "check" failed: details at "(:CLSN00010:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:47:18.743 [ORAAGENT(91622)]CRS-5016: Process "/oracle/app/12.2.0/grid/opmn/bin/onsctli" spawned by agent "ORAAGENT" for action "start" failed: details at "(:CLSN00010:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"
2019-02-12 12:47:23.852 [ORAAGENT(91622)]CRS-5016: Process "/oracle/app/12.2.0/grid/opmn/bin/onsctli" spawned by agent "ORAAGENT" for action "check" failed: details at "(:CLSN00010:)" in "/oracle/app/grid/diag/crs/ssng3mcs-db2/crs/trace/crsd_oraagent_grid.trc"

2. Check the DB and ASM alert logs

DB alert log

Errors in file /oracle/app/oracle/diag/rdbms/mcsdb/MCSDB2/trace/MCSDB2_psp0_93951.trc:
ORA-27300: OS system dependent operation:fork failed with status: 11
ORA-27301: OS failure message: Resource temporarily unavailable
ORA-27302: failure occurred at: skgpspawn5

ASM alert log

2019-02-12T16:06:08.891616+08:00
Process startup failed, error stack:
2019-02-12T16:06:08.891871+08:00
Errors in file /oracle/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_psp0_92477.trc:
ORA-27300: OS system dependent operation:fork failed with status: 11
ORA-27301: OS failure message: Resource temporarily unavailable
ORA-27302: failure occurred at: skgpspawn3
2019-02-12T16:06:09.889920+08:00
Process m000 died, see its trace file

3. Search MOS

  • MOS
1. Database And ASM Instance Ora-27300 OS System Dependent Operation Fork Failed With Status 11 (Doc ID 2331884.1)
2. SLES 12: Database Startup Error with ORA-27300 ORA-27301 ORA-27303 While Starting using Srvctl (Doc ID 2340986.1)
  • Reference link
http://feed.askmaclean.com/archives/suse-12-redhat-7%E4%B8%AD%E7%9A%84ora-27300-os-system-dependent-operationfork-failed-with-status-11%E7%9A%84%E6%95%85%E9%9A%9C%E5%A4%84%E7%90%86.html
  • Based on the references above, check this environment
cat /etc/security/limits.conf
ps h -Led -o user | sort | uniq -c | sort -n
cat /etc/systemd/system.conf|grep DefaultTasksMax
#DefaultTasksMax=512

# systemctl status ohasd
● ohasd.service - LSB: Start and Stop Oracle High Availability Service
   Loaded: loaded (/etc/init.d/ohasd; bad; vendor preset: disabled)
   Active: active (exited) since Tue 2019-02-12 12:42:37 CST; 4h 3min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 78777 ExecStart=/etc/init.d/ohasd start (code=exited, status=0/SUCCESS)
    Tasks: 512 (limit: 512)    ========> the limit is 512

Feb 12 12:42:32 SSNG3MCS-DB2 systemd[1]: Starting LSB: Start and Stop Oracle High Availability Service...
Feb 12 12:42:32 SSNG3MCS-DB2 ohasd[78777]: Starting ohasd:
Feb 12 12:42:37 SSNG3MCS-DB2 ohasd[78777]: CRS-4123: Oracle High Availability Services has been started.
Feb 12 12:42:37 SSNG3MCS-DB2 systemd[1]: Started LSB: Start and Stop Oracle High Availability Service.
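
The Tasks line in the status output above is what gives the problem away. As a minimal sketch, that check can be scripted by parsing the line. The function name tasks_at_limit and its output labels are hypothetical, and the parser assumes the "Tasks: N (limit: M)" layout shown above.

```shell
# Hypothetical helper: read `systemctl status <unit>` output on stdin and
# report whether the unit's task count has reached its limit.
# Assumes a line shaped like "Tasks: 512 (limit: 512)" or "Tasks: 959".
tasks_at_limit() {
    awk '
        $1 == "Tasks:" {
            found = 1
            cur = $2 + 0
            lim = ""
            if ($3 == "(limit:") {
                lim = $4
                sub(/\)$/, "", lim)      # drop the closing parenthesis
            }
            if (lim !~ /^[0-9]+$/)       # no limit shown, or not numeric
                print "no-limit " cur
            else if (cur >= lim + 0)
                print "at-limit " cur "/" lim
            else
                print "ok " cur "/" lim
        }
        END { if (!found) print "unknown" }
    '
}
```

On the affected node this could be run as `systemctl status ohasd | tasks_at_limit`. For the status output above it prints at-limit 512/512.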

4. Solution

1. Edit /etc/systemd/system.conf (for example with vi) and set DefaultTasksMax to 'infinity'.
2. Reboot the OS.
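
The manual edit in step 1 can also be scripted. The following is only a sketch: the function name set_default_tasks_max is hypothetical, it relies on GNU sed's -i option, and it assumes the file contains only the [Manager] section, so a line appended at the end still lands in that section.

```shell
# Hypothetical helper: set DefaultTasksMax in a systemd manager config file.
# Usage: set_default_tasks_max [file] [value]
# Assumes GNU sed and a file whose only section is [Manager].
set_default_tasks_max() {
    dtm_conf="${1:-/etc/systemd/system.conf}"
    dtm_value="${2:-infinity}"

    # Keep a backup copy next to the original before touching it.
    cp -p "$dtm_conf" "$dtm_conf.bak" || return 1

    if grep -q '^DefaultTasksMax=' "$dtm_conf"; then
        # An active setting already exists: replace its value in place.
        sed -i "s/^DefaultTasksMax=.*/DefaultTasksMax=$dtm_value/" "$dtm_conf"
    else
        # Only the commented default (or nothing) is present: append an
        # active line and leave the commented line untouched.
        printf 'DefaultTasksMax=%s\n' "$dtm_value" >> "$dtm_conf"
    fi
}
```

Called as root with no arguments, it edits /etc/systemd/system.conf itself. The reboot in step 2 is still required.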

5. Root cause

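On Linux 7 or SUSE 12, services are started by systemd, which Linux 6 and SUSE 11 did not use. Processes that systemd starts as services do not pick up the settings in /etc/security/limits.conf, because those limits are applied by PAM at login.
  In addition, the DefaultTasksMax parameter in /etc/systemd/system.conf is left at its default value (512), which caps the number of tasks (processes and threads) that each service unit may create. Here the ohasd service hit exactly that cap: Tasks: 512 (limit: 512). According to the references above, this setting also affects the maxpid value on the OS.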

6. Modification steps and parameter check on SUSE 12 SP3

#################Add limits parameters###################
Add the following parameters to /etc/security/limits.conf:
###################For HW  soft #################
*               soft     nofile         1200000
*               hard     nofile         1220000
*               soft     memlock        32
*               soft     core           10485760
*               soft     data           -1
*               soft     nproc          148270
*               soft     stack          -1
*               soft     as             -1
*               soft     rss            -1
*               hard     nproc          1200000

############Add kernel parameters##################
Add the following parameters to /etc/sysctl.conf:
fs.aio-max-nr = 1048576
fs.file-max = 6815744
fs.nr_open = 2000000
fs.inotify.max_user_watches = 2000000
kernel.acct = 100 100 30
kernel.msgmax = 1048576
kernel.msgmnb = 8388608
kernel.msgmni = 256
kernel.sem = 1250 320000 100 256
##############512G#####################
kernel.shmmax = 433517314048
kernel.shmmni = 4096
##############512G#####################
kernel.shmall = 549755813888
kernel.suid_dumpable = 1
kernel.sysrq = 8
kernel.core_pattern = /corefiles/core.%p.%e
vm.min_free_kbytes = 3000000
vm.swappiness = 10
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_default = 8388608
net.core.wmem_max = 41943040
net.ipv4.ip_local_port_range = 50000 65000
net.ipv4.tcp_fin_timeout = 5
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_rmem = 8388608 8388608 33554432
net.ipv4.tcp_sack = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_wmem = 524288 524288 33554432
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_retries2 = 5
net.ipv4.tcp_syn_retries = 1
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.promote_secondaries = 1
net.ipv4.conf.all.promote_secondaries = 1
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.default.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 1
net.ipv4.conf.default.arp_announce = 1
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_synack_retries = 5
vm.overcommit_memory = 1
vm.drop_caches = 1
vm.zone_reclaim_mode = 0
vm.max_map_count = 655360
vm.dirty_background_ratio = 60
vm.dirty_ratio = 60
vm.page-cluster = 3
vm.dirty_writeback_centisecs = 360000
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.icmp_ignore_bogus_error_responses = 1
kernel.sched_child_runs_first = 1
kernel.sched_latency_ns = 40000000
kernel.sched_nr_migrate = 64
net.ipv4.tcp_moderate_rcvbuf = 1
kernel.sched_compat_yield = 1
net.ipv4.tcp_max_tw_buckets = 5000
kernel.sched_migration_cost = 250000
kernel.sched_min_granularity_ns = 8000000
kernel.sched_wakeup_granularity_ns = 2500000
kernel.sched_rt_period_us = 2000000
kernel.pid_max = 1310720

/sbin/sysctl -p
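
After /sbin/sysctl -p it can be worth confirming that the running kernel really holds the values from the file. The following is a rough sketch of such a check. The function names sysctl_conf_pairs and sysctl_conf_check are hypothetical, and the parser assumes simple "key = value" lines with comments starting at the beginning of a line.

```shell
# Hypothetical helper: print normalized "key=value" lines from a
# sysctl-style file, skipping comment lines and blank lines.
sysctl_conf_pairs() {
    awk '
        /^[ \t]*[#;]/ { next }           # comment line
        /^[ \t]*$/    { next }           # blank line
        {
            pos = index($0, "=")
            if (pos == 0) next
            key = substr($0, 1, pos - 1)
            val = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            gsub(/[ \t]+/, " ", val)     # collapse runs of whitespace
            print key "=" val
        }
    ' "$1"
}

# Hypothetical helper: report every key whose running value differs
# from the value in the file (default /etc/sysctl.conf).
sysctl_conf_check() {
    sysctl_conf_pairs "${1:-/etc/sysctl.conf}" |
    while IFS='=' read -r key want; do
        # sysctl -n separates multi-value settings with tabs; normalize them.
        have=$(sysctl -n "$key" 2>/dev/null | tr -s '[:space:]' ' ' | sed 's/ $//')
        if [ "$have" != "$want" ]; then
            printf '%s: file has "%s", kernel has "%s"\n' "$key" "$want" "$have"
        fi
    done
}
```

Running sysctl_conf_check with no arguments compares /etc/sysctl.conf against the live values. A key that the running kernel does not expose shows up with an empty kernel value.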

#####################Modify the system.conf parameter#####################
In /etc/systemd/system.conf, change #DefaultTasksMax=512 to DefaultTasksMax=infinity

...reboot the OS...
# systemctl status ohasd
● ohasd.service - LSB: Start and Stop Oracle High Availability Service
   Loaded: loaded (/etc/init.d/ohasd; bad; vendor preset: disabled)
   Active: active (exited) since Thu 2019-02-21 14:27:32 CST; 2h 9min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 86737 ExecStart=/etc/init.d/ohasd start (code=exited, status=0/SUCCESS)
    Tasks: 959

Feb 21 14:27:26 SSNG3MCS-DB1 systemd[1]: Starting LSB: Start and Stop Oracle High Availability Service...
Feb 21 14:27:26 SSNG3MCS-DB1 ohasd[86737]: Starting ohasd:
Feb 21 14:27:31 SSNG3MCS-DB1 ohasd[86737]: CRS-4123: Oracle High Availability Services has been started.
Feb 21 14:27:32 SSNG3MCS-DB1 systemd[1]: Started LSB: Start and Stop Oracle High Availability Service.

# cat /etc/systemd/system.conf|grep DefaultTasksMax
#DefaultTasksMax=512
DefaultTasksMax=infinity