In this Document: Symptoms, Cause, Solution
APPLIES TO:
Oracle Exadata Storage Server Software - Version 18.1.0.0.0 to 18.1.6.0.0 [Release 12.2]
Exadata X7-2 Hardware - Version All Versions and later
Exadata X7-8 Hardware - Version All Versions and later
Information in this document applies to any platform.
SYMPTOMS
On X7-2 and X7-8 Exadata database servers running 18.1.0 through 18.1.6, a third-party tool used by CheckHWnFWProfile leaks one semaphore array per invocation. CheckHWnFWProfile typically runs once per day. In an OVM configuration the leak occurs in dom0 only. The maximum number of semaphore arrays (semmni) is typically set to 256. Once the number of allocated semaphore arrays reaches 256, a variety of failures may occur, such as the following:
In a non-virtualized configuration, new database instances fail to start with an error indicating insufficient semaphores, such as the following:
    SQL> startup nomount
    ORA-27154: post/wait create failed
    ORA-27300: OS system dependent operation:semget failed with status: 28
    ORA-27301: OS failure message: No space left on device
    ORA-27302: failure occurred at: sskgpcreates
Already running database instances are unaffected.
In a virtualized (OVM) configuration, OneCommand (OEDA) step "Create Virtual Machine" fails with the following error:
    Error: Command [/opt/exadata_ovm/exadata.img.domu_maker start-domain /EXAVMIMAGES/conf/final-vm.xml] run on node n1v1.oracle.com as user root did not execute successfully...
    Error running oracle.onecommand.deploy.machines.VmUtils method createVms
/var/log/exadata.img.domu_maker.trc contains the following error:
    [WARNING][/opt/exadata_ovm/exadata.img.domu_maker - 6214][exadata_img_domu_maker_start_domain][] [CMD: kpartx -a -v /EXAVMIMAGES/GuestImages/texa2b3npv07adm.de.t-internal.com/System.img] [CMD_STATUS: 3]
    ----- START STDERR -----
    Limit for the maximum number of semaphores reached. You can check and set the limits in /proc/sys/kernel/sem.
    create/reload failed on loop6p1
    Limit for the maximum number of semaphores reached. You can check and set the limits in /proc/sys/kernel/sem.
    create/reload failed on loop6p2
    Limit for the maximum number of semaphores reached. You can check and set the limits in /proc/sys/kernel/sem.
    create/reload failed on loop6p3
Already running domUs are unaffected.
CheckHWnFWProfile does not properly identify the firmware version of 25G Ethernet devices.
Customer-installed software that relies on the ability to allocate a semaphore array to run will fail.
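To see how close a server is to the semmni limit described above, the number of allocated semaphore arrays can be compared with the fourth field of /proc/sys/kernel/sem. The following is a minimal sketch, not part of any Oracle tooling: the helper names count_sem_arrays and semmni_from_line are hypothetical, and it assumes the usual Linux `ipcs -s` layout in which the second column of each data row is the numeric semid.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not Oracle-provided.

# Count semaphore arrays in `ipcs -s` style output read from stdin.
# Data rows are assumed to be the lines whose second column (semid) is numeric.
count_sem_arrays() {
    awk '$2 ~ /^[0-9]+$/ { n++ } END { print n + 0 }'
}

# Print semmni, the fourth field of a /proc/sys/kernel/sem style line
# (field order: semmsl semmns semopm semmni).
semmni_from_line() {
    printf '%s\n' "$1" | awk '{ print $4 }'
}

# Live check, run only when ipcs and the proc file are both available.
if command -v ipcs >/dev/null 2>&1 && [ -r /proc/sys/kernel/sem ]; then
    used=$(ipcs -s | count_sem_arrays)
    limit=$(semmni_from_line "$(cat /proc/sys/kernel/sem)")
    echo "semaphore arrays in use: ${used} of ${limit}"
fi
```

A count that climbs by roughly one per day toward the limit would be consistent with the leak described in this note, although other software may also allocate semaphore arrays.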
Additional symptoms include the following:
/var/log/messages contains the following errors:
    BROADCOM[32717]: ERROR SemCreate() semget() failed! No space left on device
    BROADCOM[32717]: ERROR ngBmapiInitialize() LockCreate() failed!
    BROADCOM[32717]: ERROR /usr/share/hwdata/pci.ids file should be updated
    BROADCOM[32717]: ERROR GetSriovInfo() fopen() /sys/bus/pci/devices/0000:5e:00.0/virtfn0/uevent failed! 2
A large number of semaphore arrays exist containing 3 semaphores per array, but are not associated with any running process. This can be determined by running the following bash code as the root user. The code identifies semaphore arrays that have been allocated but whose associated process no longer exists.
    # for semid in $(ipcs -s | egrep ' 3[ ]*$' | awk '{print $2}'); do
        for pid in $(ipcs -s -p -i $semid | awk '/^[0-9]/{print $NF}'|sort -u); do
          if ! ps -p $pid >/dev/null 2>&1; then
            echo "safe to remove semid $semid - no pid $pid"
          fi
        done
      done
    safe to remove semid 98306 - no pid 15494
    safe to remove semid 1703939 - no pid 33680
    safe to remove semid 3309572 - no pid 57755
    safe to remove semid 5373957 - no pid 269913
    safe to remove semid 8814598 - no pid 138286
    safe to remove semid 10878983 - no pid 30220
    safe to remove semid 13860872 - no pid 175465
    safe to remove semid 17301513 - no pid 268284
    safe to remove semid 18907146 - no pid 271790
    safe to remove semid 22347787 - no pid 156966
    safe to remove semid 24412172 - no pid 191491
    safe to remove semid 29229069 - no pid 322416
    safe to remove semid 32669710 - no pid 369570
    safe to remove semid 34275343 - no pid 77291
    safe to remove semid 35880976 - no pid 179308
    safe to remove semid 38862865 - no pid 81631
    safe to remove semid 40468498 - no pid 172204
    safe to remove semid 42991635 - no pid 139704
    safe to remove semid 45056020 - no pid 195145
    safe to remove semid 47120405 - no pid 32180
    safe to remove semid 50561046 - no pid 207504
    safe to remove semid 52166679 - no pid 374637
    safe to remove semid 53772312 - no pid 204800
    safe to remove semid 55377945 - no pid 137219
    safe to remove semid 56983578 - no pid 237474
    safe to remove semid 58589211 - no pid 278120
    safe to remove semid 60653596 - no pid 146935
    safe to remove semid 62259229 - no pid 394885
    safe to remove semid 64323614 - no pid 208690
    safe to remove semid 65929247 - no pid 269162
    safe to remove semid 67534880 - no pid 323781
    safe to remove semid 69599265 - no pid 31314
    safe to remove semid 71204898 - no pid 115285
    safe to remove semid 72810531 - no pid 178516
    safe to remove semid 74416164 - no pid 251490
    safe to remove semid 76021797 - no pid 323671
    safe to remove semid 77627430 - no pid 366789
    safe to remove semid 80150567 - no pid 21204
    safe to remove semid 81756200 - no pid 21632
    safe to remove semid 83361833 - no pid 164005
CAUSE
Bug 28027670