Unable to discover any voting files, retrying discovery in 15 seconds


Errors:
 
From: $GRID_HOME/log/<hostname>/alert<<hostname>>.log

[ohasd(13154)]CRS-2112:The OLR service started on node indbprd1.
[ohasd(13154)]CRS-1301:Oracle High Availability Service started on node indbprd1.
[ohasd(13154)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
[/oravl01/ora11g/112_GRID/bin/orarootagent.bin(14076)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
[ohasd(13154)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
[gpnpd(14181)]CRS-2328:GPNPD started on node indbprd1.
[/oravl01/ora11g/112_GRID/bin/orarootagent.bin(14196)]CRS-5013:Agent "/oravl01/ora11g/112_GRID/bin/orarootagent.bin" failed to start process "/oravl01/ora11g/112_GRID/bin/osysmond" for action "start": details at "(:CLSN00008:)" in "/oravl01/ora11g/112_GRID/log/indbprd1/agent/ohasd/orarootagent_root/orarootagent_root.log"
[cssd(14283)]CRS-1713:CSSD daemon is started in clustered mode
[ohasd(13154)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
[cssd(14283)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /oravl01/ora11g/112_GRID/log/indbprd1/cssd/ocssd.log
 
From: $GRID_HOME/log/<hostname>/cssd/ocssd.log

[    CSSD][4290766592]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(0x7fc1f0045110) client((nil))
[    CSSD][4290766592]clssgmDeadProc: proc 0x7fc1f0045110
[    CSSD][4290766592]clssgmDestroyProc: cleaning up proc(0x7fc1f0045110) con(0x1025) skgpid  ospid 14199 with 0 clients, refcount 0
[    CSSD][4290766592]clssgmDiscEndpcl: gipcDestroy 0x1025
[    GPNP][4288517888]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2104] get-profile call to url "ipc://GPNPD_indbprd1" disco "" [f=0 claimed- host: cname: seq: auth:]
[    GPNP][4288517888]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2234] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote "ipc://GPNPD_indbprd1" disco ""
[    CSSD][4288517888]clssnmReadDiscoveryProfile: voting file discovery string()
[    CSSD][4288517888]clssnmvDDiscThread: using discovery string  for initial discovery
[   SKGFD][4288517888]Discovery with str::
[   SKGFD][4288517888]UFS discovery with ::
[   SKGFD][4288517888]Fetching UFS disk :/dev/raw/rawctl:
[   SKGFD][4288517888]OSS discovery with ::
[    CSSD][4288517888]clssnmvDiskVerify: Successful discovery of 0 disks
[    CSSD][4288517888]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
[    CSSD][4288517888]clssnmvFindInitialConfigs: No voting files found
[    CSSD][4288517888](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds
[    CSSD][4290766592]clssscSelect: cookie accept request 0x2130390
[    CSSD][4290766592]clssgmAllocProc: (0x7fc1f005eeb0) allocated
[    CSSD][4290766592]clssgmClientConnectMsg: properties of cmProc 0x7fc1f005eeb0 - 1,2,3,4,5
[    CSSD][4290766592]clssgmClientConnectMsg: Connect from con(0x10b1) proc(0x7fc1f005eeb0) pid(14199) version 11:2:1:4, properties: 1,2,3,4,5
[    CSSD][4290766592]clssgmClientConnectMsg: msg flags 0x0000
[    CSSD][4290766592]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(0x7fc1f005eeb0) client((nil))
 

From: $GRID_HOME/log/<hostname>/gpnpd/gpnpd.log

[    GPNP][1800144672]clsgpnpdRCB: [at clsgpnpd.c:3933] GPnPD endpoint url "mdns:gpnp._tcp://indbprd1:60496/agent=gpnpd,cname=cramprd-cluster,host=indbprd1,pid=57230/gpnpd h:indbprd1 c:cramprd-cluster" successfully advertised with RD
[    GPNP][1543501568]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2104] put-profile call to url "tcp://indbprd2:63402" disco "mdns:service:gpnp._tcp.local.://indbprd2:63402/agent=gpnpd,cname=cramprd-cluster,host=indbprd2,pid=34530/gpnpd h:indbprd2 c:cramprd-cluster" [f=0 claimed- host:indbprd1 cname:cramprd-cluster seq:4 auth:CN=GPnP_peer]
[    GPNP][1543501568]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2234] Result: (25) CLSGPNP_DUPLICATE. Successful put-profile CALL to remote "tcp://indbprd2:63402" disco "mdns:service:gpnp._tcp.local.://indbprd2:63402/agent=gpnpd,cname=cramprd-cluster,host=indbprd2,pid=34530/gpnpd h:indbprd2 c:cramprd-cluster"
[    GPNP][1543501568]clsgpnpd_pushThread: [at clsgpnpd.c:4770] START gpnpd start serving clients after profile updates
[  OCRMSG][1790699264]GIPC error [29] msg [gipcretConnectionRefused]
[    GPNP][1790699264]clsgpnp_ocrDetectThread: [at clsgpnp0.c:4508] OCR client init SUCCEEDED. OCR shared cache is now available
[  OCRMSG][1782527744]GIPC error [29] msg [gipcretConnectionRefused]
[    GPNP][1782527744]clsgpnpd_reconcile: [at clsgpnpd.c:2449] Result: (59) CLSGPNP_OCR_NOSRV. Did not get a shared profile, OCR provider 0x1801ad0
[      RD][1800144672]srd_apple_concheck: *** Got error kDNSServiceErr_Unknown (-65537) from mdns in srd_apple_cxWork:1227
[    GPNP][1788598016]CLSDM requested exit
[ default][1788598016]GPNPD on node indbprd1 shutdown.

When:

After creating spfile for ASM in OCR asm disk

Background:

1. A 2 Node RAC 11gr2 on Linux with OCR/Voting on ASM

2. When checked for "show parameter pfile" shown null, thought that spfile was not created for asm & asm alert was also showing warning as "using default parameter settings without any parameter"


SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options.
ORACLE_HOME = /oravl01/ora11g/112_GRID
System name: Linux
Node name: indbprd1
Release: 2.6.32-279.el6.x86_64
Version: #1 SMP Wed Jun 13 18:24:36 EDT 2012
Machine: x86_64
WARNING: using default parameter settings without any parameter

3. So asm pfile created forcefully, in first attempt "create spfile from pfile;" fired resulted in error ORA-01565: Unable to open Spfile ?/dbs/spfile@.ora

So, from asmcmd command prompt a directory created manually in asm disk & "create spfile='+OCR1/cramprd-cluster/asmparameterfile/asm_cramprd.ora’ from pfile;"fired which got executed successfully

After which everything went wrong, cluster unable to come up


SQL> /*ASMCMD */alter diskgroup /*ASMCMD*/ "OCR1" add directory '+OCR1/cramprd-cluster/asmparameterfile'
SUCCESS: /* ASMCMD */alter diskgroup /*ASMCMD*/ "OCR1" add directory '+OCR1/cramprd-cluster/asmparameterfile'
Sat Apr 27 16:11:04 2013

alert_+ASM1.log:
============
SUCCESS: diskgroup OCR_VOTE was mounted
SUCCESS: ALTER DISKGROUP ALL MOUNT /* asm agent call crs *//* {0:0:2327} */
SQL> ALTER DISKGROUP ALL ENABLE VOLUME ALL /* asm agent *//* {0:0:2327} */
SUCCESS: ALTER DISKGROUP ALL ENABLE VOLUME ALL /* asm agent *//* {0:0:2327} */
NOTE: Attempting voting file refresh on diskgroup OCR_VOTE
NOTE: Voting file relocation is required in diskgroup OCR_VOTE
NOTE: Attempting voting file relocation on diskgroup OCR_VOTE
Fri Apr 26 09:49:18 2013
WARNING: failed to online diskgroup resource ora.OCR1.dg (unable to communicate with CRSD/OHASD)
Fri Apr 26 09:49:18 2013
NOTE: [crsd.bin@indbprd1 (TNS V1-V3) 52647] opening OCR file
Starting background process ASMB
Fri Apr 26 09:49:18 2013
ASMB started with pid=25, OS id=52661
Fri Apr 26 09:49:18 2013
NOTE: client +ASM1:+ASM registered, osid 52663, mbr 0x0
Fri Apr 26 09:49:18 2013
NOTE: [crsd.bin@indbprd1 (TNS V1-V3) 52647] opening OCR file
WARNING: failed to online diskgroup resource ora.OCR2.dg (unable to communicate with CRSD/OHASD)
WARNING: failed to online diskgroup resource ora.OCR_VOTE.dg (unable to communicate with CRSD/OHASD)
Fri Apr 26 09:49:24 2013
SQL> ALTER DISKGROUP FRA MOUNT /* asm agent *//* {1:9791:2} */
Fri Apr 26 09:49:24 2013
SQL> ALTER DISKGROUP DATA1 MOUNT /* asm agent *//* {1:9791:2} */
NOTE: cache registered group FRA number=4 incarn=0xc817349e
...........................
...........................
SUCCESS: ALTER DISKGROUP DATA1 MOUNT /* asm agent *//* {1:9791:2} */
NOTE: diskgroup resource ora.DATA1.dg is updated
Fri Apr 26 09:49:55 2013
NOTE: client CRAMPRD1:CRAMPRD registered, osid 53733, mbr 0x1
Fri Apr 26 09:50:22 2013
ALTER SYSTEM SET local_listener=' (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=121.244.255.41)(PORT=1527))))' SCOPE=MEMORY SID='+ASM1';
Sat Apr 27 15:59:47 2013
ORA-01565: Unable to open Spfile ?/dbs/spfile@.ora.
Sat Apr 27 16:10:17 2013
SQL> /* ASMCMD */alter diskgroup /*ASMCMD*/ "OCR1" add directory '+OCR1/cramprd-cluster/asmparameterfile'
SUCCESS: /* ASMCMD */alter diskgroup /*ASMCMD*/ "OCR1" add directory '+OCR1/cramprd-cluster/asmparameterfile'
Sat Apr 27 16:11:04 2013
NOTE: updated gpnp profile ASM diskstring:
NOTE: updated gpnp profile ASM SPFILE to +OCR1/cramprd-cluster/asmparameterfile/asm_cramprd.ora
........................
......................
Sat Apr 27 05:24:40 2013
ORA-15032: not all alterations performed
Sat Apr 27 05:24:40 2013
ORA-15032: not all alterations performed
ORA-15027: active use of diskgroup "OCR1" precludes its dismount
ORA-15027: active use of diskgroup "OCR2" precludes its dismount
ERROR: ALTER DISKGROUP OCR1 DISMOUNT /* asm agent *//* {1:9791:4123} */
ERROR: ALTER DISKGROUP OCR2 DISMOUNT /* asm agent *//* {1:9791:4123} */
Sat Apr 27 05:24:50 2013
Sat Apr 27 05:24:50 2013
SQL> ALTER DISKGROUP OCR2 DISMOUNT /* asm agent *//* {1:9791:4123} */
SQL> ALTER DISKGROUP OCR1 DISMOUNT /* asm agent *//* {1:9791:4123} */
................................
............................
Sat Apr 27 05:26:01 2013
ORA-15032: not all alterations performed
ORA-15027: active use of diskgroup "OCR2" precludes its dismount
ERROR: ALTER DISKGROUP OCR2 DISMOUNT /* asm agent *//* {1:9791:4123} */
Sat Apr 27 05:26:11 2013
NOTE: client exited [52647]
Sat Apr 27 05:26:11 2013
NOTE: client exited [52647]
Sat Apr 27 05:26:12 2013
NOTE: ASMB process exiting due to lack of ASM file activity for 1 seconds
Sat Apr 27 05:26:12 2013
Shutting down instance (immediate)
Shutting down instance: further logons disabled
Stopping background process MMNL
Stopping background process MMON
License high water mark = 22
SQL> ALTER DISKGROUP ALL DISMOUNT /* asm agent *//* {0:0:4736} */

4. Only problem was spfile created from a pfile which was created from memory & diskstring parameter value was missing


5. Cluster unable to come up or go down unless killed or stop by "crsctl stop crs –f" specifying below voting disk error continuously in logs


[cssd(14283)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /oravl01/ora11g/112_GRID/log/indbprd1/cssd/ocssd.log


Analysis:

1. In first sight it seemed like OCR/Voting corruption has happen as spfile was created in manually created directory, so we moved towards restoration, but no! It wasn’t corruption until we identified something in "gpnp" profile.xml


2. Due inappropriate creation of spfile profile.xml got changed to reflect wrong diskstring & spfile parameter values


Location:
/oravl01/ora11g/112_GRID/gpnp/indbprd1/profiles/peer/profile.xml


profile.xml before modification when everything was working fine picking up default parameters:

<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="1" ClusterUId="1693915e3a5ecf1abf7a6bbc3ebe88d8" ClusterName="cramprd-cluster" PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network id="net1" IP="121.244.255.0" Adapter="bond0" Use="public"/><gpnp:Network id="net2" IP="10.255.253.0" Adapter="bond1" Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css" DiscoveryString="" LeaseDuration="400"/><orcl:ASM-Profile id="asm" DiscoveryString="/dev/mapper/*" SPFile=""/><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms><ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"><InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms><ds:DigestMethodAlgorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestV
alue>IiV8dW5jOV2Nzsgq18oLCxrLjFE=</ds:DigestValue></ds:Reference></ds:SignedInfo><ds:SignatureValue>bpAkaFECi2VONAjby
MN9rN79I2anvGnjKX4KfMrtXWHXeEc0N1Bc5jHUrb2dFTxzFv8zoG1GOp5kvb5pko9IKBiRo7yC9bgrPHJayrAau7ZNJD/YxRK7+n1aupdx4VStA12
X57O7R3h9CUNwhh6ByrTaqpp+A5cKgJ2OMgVn4Do=</ds:SignatureValue></ds:Signature></gpnp:GPnPProfile>

Profile.xml after first attempt of spfile creation:


 

<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0"  
xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-
pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile"  
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  
xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd"  
ProfileSequence="5" ClusterUId="1693915e3a5ecf1abf7a6bbc3ebe88d8" ClusterName="cramprd-cluster"  
PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network  
id="net1" IP="121.244.255.0" Adapter="bond0" Use="public"/><gpnp:Network id="net2" IP="10.255.253.0" Adapt
er="bond1" Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css"
DiscoveryString="+asm" LeaseDuration="400"/><orcl:ASM-Profile id="asm"
DiscoveryString="" SPFile="+OCR1/cramprd-cluster/asmparameterfile/registry.253.811077707"/>
<ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:Si
gnedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod  

Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms><ds:Transform  
Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml
-exc-c14n#"> <InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform>

</ds:Transforms><ds:DigestMethod Algorithm=
http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>D9cngcwfJ4NIFUT5YdPKoldg6zg=</ds:DigestValue>

</ds:Reference></ds:SignedInfo><ds:SignatureValue>YfOPKgl8WLciUaqkrkkGXwiD7q5Xa6gp3BKhqOc/v5Gd3xOHTCN/
dppe37V2qskeekvykpwWDQyL4oBEv0y7ySwGS3hc364I1AcwaiEI856KqPpHNpLW3hsnSrc8uDxagc7Px4OA9tvqyMJEHDih99DY/PqnLTqGpUUlIfRGiY=
</ds:SignatureValue></ds:Signature></gpnp:GPnP-Profile>

Profile.xml after last attempt (as first two attempts given error) of spfile creation when manually directory created in ASM OCR disk group:

<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="6" ClusterUId="1693915e3a5ecf1abf7a6bbc3ebe88d8" ClusterName="cramprd-cluster" PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network id="net1" IP="121.244.255.0" Adapter="bond0" Use="public"/><gpnp:Network id="net2" IP="10.255.253.0" Adapter="bond1" Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/><orcl:ASM-Profile id="asm" DiscoveryString="" SPFile="+OCR1/cramprd-cluster/asmparameterfile/asm_cramprd.ora"/><ds:Signaturexmlns:ds="http://www.w3.org/2000/09/xmldsig#">
<ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms><ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"> <InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms><ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>nULuHrErz0l+qzHaW9XxVppCotE=</ds:DigestValue>
</ds:Reference></ds:SignedInfo><ds:SignatureValue>HcikyQW24FNY9m14jcv16LYcQzN0dh0+t04iqgOOxCeHxB5D4p5bO2rZ9R8nGdein
BaLUCQ78vYCTTejwamNjSasZqqaN+Circl71MnpGveAJ6ZriHMYDL/i8MVhpe7YANYcCh1JNdmtvcaAzw8zcWv6I3aomkvEhZY6RiW2sXw=</ds:SignatureValue>
</ds:Signature></gpnp:GPnPProfile>

3. As it was clear that something wrong being picked up from profile.xml, the backup profile.xml  (From older GRID_HOME backup) was replaced but still it didn’t help

4. Below were the ocr/voting disk configuration before issue being faced


$crsctl query css votedisk
##STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 338bd735aec74fd6bf0d14ebc7e8b259 (/dev/mapper/mpathbb) [OCR_VOTE]
2. ONLINE 05cda287b1fb4f4cbf9d86addfef1a61 (/dev/mapper/mpathbc) [OCR_VOTE]
3. ONLINE a3aed1cd33bb4ff4bf3f512466e2945c (/dev/mapper/mpathbe) [OCR_VOTE]

$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3288
Available space (kbytes) : 258832
ID : 845475396
Device/File Name : +OCR2
Device/File integrity check succeeded
Device/File Name : +OCR1
Device/File integrity check succeeded

$ more /etc/oracle/ocr.loc
#Device/file getting replaced by device +OCR1
ocrconfig_loc=+OCR2
ocrmirrorconfig_loc=+OCR1
local_only=false

1 3 CACHED MEMBER NORMAL 51200 51200 8350 0 42850 DATA1_0003 /dev/mapper/DATA4 26-MAR-13
1 2 CACHED MEMBER NORMAL 51200 51200 8349 0 42851 DATA1_0002 /dev/mapper/DATA3 26-MAR-13
1 1 CACHED MEMBER NORMAL 51200 51200 8350 0 42850 DATA1_0001 /dev/mapper/DATA2 26-MAR-13
3 0 CACHED MEMBER NORMAL 51200 51200 50836 0 364 OCR1_0000 /dev/mapper/OCR1 26-MAR-13
2 0 CACHED MEMBER NORMAL 51200 51200 10981 0 40219 FRA_0000 /dev/mapper/ARCH1 26-MAR-13
1 0 CACHED MEMBER NORMAL 51200 51200 8351 0 42849 DATA1_0000 /dev/mapper/DATA1 26-MAR-13
2 1 CACHED MEMBER NORMAL 51200 51200 10987 0 40213 FRA_0001 /dev/mapper/mpatha 26-MAR-13
4 0 CACHED MEMBER NORMAL 1024 1024 661 0 363 OCR2_0000 /dev/mapper/mpathba 25-APR-13
5 0 CACHED MEMBER NORMAL 1024 1024 897 0 127 OCR_VOTE_0000 /dev/mapper/mpathbb 24-APR-13
5 1 CACHED MEMBER NORMAL 1024 1024 897 0 127 OCR_VOTE_0001 /dev/mapper/mpathbc 24-APR-13
5 2 CACHED MEMBER NORMAL 1024 1024 897 0 127 OCR_VOTE_0002 /dev/mapper/mpathbe 24-APR-13
2 2 CACHED MEMBER NORMAL 51200 51200 10995 0 40205 FRA_0002 /dev/mapper/mpathaa 26-MAR-13
1 4 CACHED MEMBER NORMAL 51200 51200 8353 0 42847 DATA1_0004 /dev/mapper/mpatho 12-APR-13
2 3 CACHED MEMBER NORMAL 51200 51200 11001 0 40199 FRA_0003 /dev/mapper/mpathp 12-APR-13


5. An attempt was made to startup asm by using pfile , as the default parameters were used for asm instance previously but luckily a pfile was created from memory "create pfile from memory;" before spfile creation havoc was attempted. Below is the outcome (No Luck)


SQL> startup nomount pfile='/oravl01/ora11g/112_GRID/dbs/init+ASM1.ora';
ORA-29709: Communication failure with Cluster Synchronization Services
SQL> exit
Disconnected

$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager

$ crs_stat -t
CRS-0184: Cannot communicate with the CRS daemon.
$ cat /oravl01/ora11g/112_GRID/dbs/init+ASM1.ora
large_pool_size=20971520
#+ASM1.asm_diskgroups='FRA','DATA1','OCR1','OCR2','OCR_VOTE'
*.asm_diskstring='/dev/mapper/*'
*.asm_power_limit=1
audit_file_dest='/oradump/diag/admin/audit/ASM1'
*.diagnostic_dest='/oradump'
*.instance_type='asm'
*.large_pool_size=12M
*.remote_login_passwordfile='EXCLUSIVE'



Solution:


As asm requires CRS services should be UP before starting up ASM in RAC which was the measure bottleneck where OCR/Voting disk were not been recognised due to missing diskstring in spfile file

So, just to bypass crs check the cluster has been started in exclusive mode without crs check


$ sudo ~/112_GRID/bin/crsctl start crs -excl -nocrs
[sudo] password for ora11g:
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'indbprd1'
CRS-2676: Start of 'ora.mdnsd' on '
indbprd1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on '
indbprd1'
CRS-2676: Start of 'ora.gpnpd' on '
indbprd1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on '
indbprd1'
CRS-2672: Attempting to start 'ora.gipcd' on '
indbprd1'
CRS-2676: Start of 'ora.cssdmonitor' on '
indbprd1' succeeded
CRS-2676: Start of 'ora.gipcd' on '
indbprd1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on '
indbprd1'
CRS-2672: Attempting to start 'ora.diskmon' on 'indbprd1'
CRS-2676: Start of 'ora.diskmon' on '
indbprd1' succeeded
CRS-2676: Start of 'ora.cssd' on '
indbprd1' succeeded
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on '
indbprd1'
CRS-2672: Attempting to start 'ora.ctssd' on '
indbprd1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on '
indbprd1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on '
indbprd1'
CRS-2676: Start of 'ora.ctssd' on '
indbprd1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on '
indbprd1' succeeded
CRS-2679: Attempting to clean 'ora.asm' on '
indbprd1'
CRS-2681: Clean of 'ora.asm' on '
indbprd1' succeeded
CRS-2672: Attempting to start 'ora.asm' on '
indbprd1'
CRS-2676: Start of 'ora.asm' on '
indbprd1' succeeded

$ ps -ef| grep -i pmon
ora11g 14758 1 0 01:39 ? 00:00:00 asm_pmon_+ASM1
ora11g 15607 12039 0 01:40 pts/1 00:00:00 grep -i pmon

$ sqlplus "/ as sysdba"
SQL*Plus: Release 11.2.0.3.0 Production on Sun Apr 28 01:40:59 2013
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> show parameter pfile
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
spfile string

ASM instance came up but due to missing diskstring it was unable to recognise diskgroup paths completely so, new spfile created (this time in +DATA1 asm diskgroup) using memory parameter file with "*.asm_diskstring='/dev/mapper/*'" appended in pfile


Hurrey !! Able to bring up all disk groups & further normal cluster start/ stop attempted from both the nodes.


Got my cluster back in place & working this time.

Exclusive mode without crs helped to resolve the issue acting as if it is a single instance database

No comments:

Post a Comment

Popular Posts