CRS-4123 CRS-4124 CRS-4000


Issue:   The cluster start command "crsctl start crs" fails to start the CRS services

CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-4124: Oracle High Availability Services startup failed
CRS-4000: Command Start failed, or completed with errors


Cause:

Recent changes to the cluster hosts, such as:



  1. OS patching
  2. Network subnet changes
  3. OS reboot
  4. File system permission changes
  5. ASM disk unavailability
  6. Private network unreliability


Possible Solution:


1. Verify that the cluster interconnect is reachable: obtain the private interconnect IPs from /etc/hosts and check the ping response from all cluster nodes.
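
A minimal sketch of that check, assuming the private interconnect entries in /etc/hosts are named node1-priv and node2-priv (placeholder names, adjust to your environment):

grep -i priv /etc/hosts        # confirm the private interconnect entries exist
ping -c 3 node1-priv           # repeat from every cluster node, against every private IP
ping -c 3 node2-priv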

2. Compare ifconfig -a output from before and after the OS changes to confirm that the Ethernet interface naming (e.g. eth0/eth1) has not changed; if the earlier output is not available, the output from a second, running node can be used as a reference.
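
For example, assuming a pre-change snapshot was saved to /tmp/ifconfig_before.txt and node2 stands for the surviving node (both are only illustrative), the comparison could look like:

ifconfig -a > /tmp/ifconfig_after.txt
diff /tmp/ifconfig_before.txt /tmp/ifconfig_after.txt   # look for renamed interfaces (eth0 vs eth1 etc.)
ssh node2 ifconfig -a                                   # or compare against the surviving node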


3. Verify ASM disk availability using kfod disks=all 
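
A hedged example, run as the Grid Infrastructure owner with GRID_HOME set (the ASM disk path below is only illustrative and depends on how the disks are presented):

$GRID_HOME/bin/kfod disks=all
ls -l /dev/oracleasm/disks/    # also check OS-level ownership and permissions of the ASM disks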


4. If the OS kernel has been upgraded as a recent change, take strace output as below:

sudo strace $GRID_HOME/bin/crsctl start crs
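
Because crsctl forks the OHASD daemons, a variant that follows child processes and writes the trace to a file can be easier to review (the output file name is arbitrary):

sudo strace -f -o /tmp/crsctl_start.trc $GRID_HOME/bin/crsctl start crs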

If the errors below are observed in the trace output, a possible cause is that the OHASD process is unable to spawn the CRS start daemons and is stuck in "ohasd run". In this scenario the socket directories /var/tmp/.oracle, /tmp/.oracle and /usr/tmp/.oracle need to be backed up and cleaned out, followed by a clean OS reboot and CRS restart (see the cleanup sketch after the trace excerpt).

sendto(23, "\4", 1, MSG_NOSIGNAL, NULL, 0) = 1

connect(27, {sa_family=AF_LOCAL, sun_path="/var/tmp/.oracle/sprocr_local_conn_0_PROL"}, 110) = -1 E
ioctl(20, FIONBIO, [1])                 = 0

connect(20, {sa_family=AF_LOCAL, sun_path="/var/tmp/.oracle/sOHASD_UI_SOCKET"}, 110) = -1 ENOENT (No such file or directory)
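
A hedged outline of that cleanup, assuming the stack is down on the node and using /root/oracle_socket_backup as a placeholder backup location (skip any of the three directories that do not exist on your platform):

sudo $GRID_HOME/bin/crsctl stop crs -f                  # make sure the stack is fully stopped
sudo mkdir -p /root/oracle_socket_backup
sudo cp -rp /var/tmp/.oracle /tmp/.oracle /usr/tmp/.oracle /root/oracle_socket_backup/
sudo rm -rf /var/tmp/.oracle/* /tmp/.oracle/* /usr/tmp/.oracle/*
sudo reboot                                             # clean OS restart
sudo $GRID_HOME/bin/crsctl start crs                    # after the reboot, start CRS again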

5. If none of the above steps work, the recently upgraded kernel or OS patch needs to be rolled back.
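
Before a rollback, it can help to confirm which kernel the node is actually running versus what is installed; on an RPM-based distribution this might be:

uname -r           # kernel the node booted with
rpm -q kernel      # kernel packages currently installed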


6. You may use cluvfy stage -pre crsinst -n node1,node2 as part of troubleshooting.
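
For example, running it as the Grid Infrastructure owner with the standard -verbose option for per-check detail (node1,node2 are placeholders for your host names):

$GRID_HOME/bin/cluvfy stage -pre crsinst -n node1,node2 -verbose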

Refer:

Top 5 Grid Infrastructure Startup Issues (Doc ID 1368382.1)
