Despite all the updates to the Oracle RAC installer, it is still an extremely brittle install process that frequently goes pear-shaped.
A common cause is pilot error, and a regular mistake is running the root.sh script on the secondary nodes before it has fully completed on the primary node.
To back out of a failed configuration, Oracle now provides the rootcrs.pl script in the crs/install directory of the Grid home.
Connect to your first node as root and run the script as follows:
[root@tbird1 install]# ./rootcrs.pl -deconfig -verbose -force
2012-10-28 17:04:38: Parsing the host name
2012-10-28 17:04:38: Checking for super user privileges
2012-10-28 17:04:38: User has super user privileges
Using configuration parameter file: ./crsconfig_params
VIP exists.:tbird1
<output removed to aid clarity>
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'tbird1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Successfully deconfigured Oracle clusterware stack on this node
Now connect to the other nodes and execute the same script as before. On the last node in your cluster, you should add the -lastnode directive:
[root@tbird1 install]# ./rootcrs.pl -deconfig -verbose -force -lastnode
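The ordering above is easy to get wrong on a larger cluster, so here is a minimal sketch that only prints the command to run on each node and adds -lastnode for the final one. The function name deconfig_plan is made up for this post and nothing is executed; you still run each printed command as root on the named node, from the Grid home crs/install directory.

```shell
# Print the rootcrs.pl deconfig command for each node, in the order given.
# Nothing is executed here: the output is a checklist to follow by hand.
# deconfig_plan is a hypothetical helper name, not an Oracle tool.
deconfig_plan() {
    local total="$#"
    local i=0
    local node
    for node in "$@"; do
        i=$((i + 1))
        if [ "$i" -eq "$total" ]; then
            # The last node in the list gets the extra -lastnode option
            echo "$node: ./rootcrs.pl -deconfig -verbose -force -lastnode"
        else
            echo "$node: ./rootcrs.pl -deconfig -verbose -force"
        fi
    done
}

# Example: deconfig_plan tbird1 tbird2
```

Pass the nodes in the order you intend to work through them, with the node you will do last at the end of the list.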
If that doesn’t work, we need to resort to somewhat more belligerent methods.
The following will wipe out the Oracle Grid install completely, allowing you to start over with the install media.
First, make sure any CRS software is shut down. If it is not shut down then use the crsctl command to stop all the clusterware software:
[root@tbird2 oraInventory]# . oraenv
ORACLE_SID = [root] ? +ASM1
The Oracle base for ORACLE_HOME=/u01/app/11.2.0/grid is /u01/app/oracle
[root@tbird2 oraInventory]# crsctl stop has
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'tbird2'
CRS-2673: Attempting to stop 'ora.crsd' on 'tbird2'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'tbird2'
CRS-2673: Attempting to stop 'ora.tbird2.vip' on 'tbird2'
<output removed to aid clarity>
CRS-2677: Stop of 'ora.gipcd' on 'tbird2' succeeded
CRS-2677: Stop of 'ora.diskmon' on 'tbird2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'tbird2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Make sure that nothing is running as Oracle:
[root@tbird2 oraInventory]# ps -ef | grep oracle
root     19214  4529  0 16:51 pts/1    00:00:00 grep oracle
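Note that the grep always matches itself, which is the single line of output shown above. As an alternative sketch, the filter below keeps only the lines of ps -ef output whose owner column is the given user. The function name oracle_owned is made up for this post, and it assumes your software owner is called oracle (pass another name, for example grid, if yours differs). Unlike the grep, it will not catch processes that merely mention oracle on their command line.

```shell
# Keep only the `ps -ef` lines whose first column (the owning user) matches.
# Reads ps output on stdin; the user name defaults to "oracle".
# oracle_owned is a hypothetical helper name for illustration.
oracle_owned() {
    awk -v u="${1:-oracle}" '$1 == u'
}

# Usage: ps -ef | oracle_owned
# No output means nothing is running as that user.
```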
Now we can remove the Oracle install as follows:
Disable the OHASD Daemon from starting on reboot – do this on all nodes:
[root@tbird2 etc]# cat /etc/inittab
# Run xdm in runlevel 5
x:5:respawn:/etc/X11/prefdm -nodaemon
h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null
Remove the last line that spawns the ohasd daemon, and save the file.
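If you would rather not edit the file by hand on every node, the same change can be sketched with sed. The function name disable_ohasd is made up for this post. It takes the file path as an argument so you can try it on a copy first, and it keeps a .bak backup next to the file before rewriting it.

```shell
# Delete every line that references init.ohasd from the given inittab file.
# A backup is written to <file>.bak before the file is rewritten in place.
# disable_ohasd is a hypothetical helper name for illustration.
disable_ohasd() {
    local inittab="$1"
    cp -p "$inittab" "$inittab.bak" &&
        sed '/init\.ohasd/d' "$inittab.bak" > "$inittab"
}

# Try it on a copy first:
#   cp /etc/inittab /tmp/inittab.test && disable_ohasd /tmp/inittab.test
```

Compare the result with the backup using diff before running it against /etc/inittab itself.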
Now locate the Oracle Inventory and the location of the current Oracle installs. I am assuming in this case you want to remove everything.
The Oracle inventory location is stored in the oraInst.loc file:
[root@tbird2 etc]# cat /etc/oraInst.loc
inventory_loc=/u01/app/oraInventory
inst_group=oinstall
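If you are scripting the cleanup, the inventory path can be pulled out of that file rather than typed in. This is a small sketch; the function name inventory_loc is made up for this post, and it defaults to the /etc/oraInst.loc path shown above.

```shell
# Print the value of inventory_loc from an oraInst.loc file.
# The file defaults to /etc/oraInst.loc; pass another path to override it.
# inventory_loc is a hypothetical helper name for illustration.
inventory_loc() {
    local file="${1:-/etc/oraInst.loc}"
    sed -n 's/^inventory_loc=//p' "$file"
}

# Usage: inv="$(inventory_loc)"; ls "$inv/ContentsXML"
```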
Navigate to the Oracle Inventory, listed here as /u01/app/oraInventory, and inspect the contents of the ContentsXML/inventory.xml file – do this on all nodes:
[root@tbird2 oraInventory]# cat /u01/app/oraInventory/ContentsXML/inventory.xml
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2009, Oracle. All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
  <VERSION_INFO>
    <SAVED_WITH>184.108.40.206.0</SAVED_WITH>
    <MINIMUM_VER>220.127.116.11.0</MINIMUM_VER>
  </VERSION_INFO>
  <HOME_LIST>
    <HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0/grid" TYPE="O" IDX="1" CRS="true">
      <NODE_LIST>
        <NODE NAME="tbird1"/>
        <NODE NAME="tbird2"/>
      </NODE_LIST>
    </HOME>
  </HOME_LIST>
</INVENTORY>
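On a box with several homes it can be easier to list just the LOC values. The sketch below does that with grep and sed rather than a real XML parser, so it assumes each HOME element keeps its attributes inside one tag, as in the file above. The function name list_oracle_homes is made up for this post.

```shell
# Print the LOC attribute of every <HOME ...> element in an inventory.xml.
# Plain text matching, not XML parsing: it relies on LOC="..." appearing
# inside the opening HOME tag. list_oracle_homes is a hypothetical name.
list_oracle_homes() {
    local xml="$1"
    grep -o '<HOME [^>]*>' "$xml" | sed -n 's/.* LOC="\([^"]*\)".*/\1/p'
}

# Usage: list_oracle_homes /u01/app/oraInventory/ContentsXML/inventory.xml
```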
We can see we have a Grid install at /u01/app/11.2.0/grid. We can remove this as follows:
[root@tbird2 oraInventory]# rm -R /u01/app/11.2.0
Now we can remove the inventory directory – do this on all nodes:
[root@tbird2 oraInventory]# rm -R /u01/app/oraInventory
Now we can remove the Oracle directory and files under /etc – do this on all nodes.
[root@tbird2 ~]# rm -R /etc/oracle
[root@tbird2 ~]# rm /etc/oraInst.loc
[root@tbird2 ~]# rm /etc/oratab
Now we delete the files added to /usr/local/bin – do this on all nodes.
[root@tbird2 ~]# rm /usr/local/bin/dbhome
[root@tbird2 ~]# rm /usr/local/bin/oraenv
[root@tbird2 ~]# rm /usr/local/bin/coraenv
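With this many removals repeated on every node, it is easy to miss one. As a sketch, the check below prints whichever of the paths removed so far still exist. The function name leftover_paths and the optional prefix argument are made up for this post, and the path list assumes the same locations used in this example, so adjust it to match your own install.

```shell
# Print each path removed in the steps above that still exists on this node.
# An optional prefix is prepended to every path, which allows a dry run
# against a scratch directory. leftover_paths is a hypothetical helper name.
leftover_paths() {
    local prefix="${1:-}"
    local p
    for p in /u01/app/11.2.0 /u01/app/oraInventory /etc/oracle \
             /etc/oraInst.loc /etc/oratab /usr/local/bin/dbhome \
             /usr/local/bin/oraenv /usr/local/bin/coraenv; do
        if [ -e "$prefix$p" ]; then
            echo "$p"
        fi
    done
}

# Usage: leftover_paths    # no output means the node is clean
```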
Reset the permissions on /u01/app – do this on all nodes.
[root@tbird2 ~]# chown oracle:dba /u01/app
Now we need to clear the ASM devices we created – do this on both nodes.
[root@tbird2 ~]# oracleasm deletedisk DATA
Clearing disk header: done
Dropping disk: done
Finally re-stamp the devices for ASM.
[root@tbird1 ~]# oracleasm createdisk DATA /dev/sdc1
Writing disk header: done
Instantiating disk: done
And scan it on the secondary nodes:
[root@tbird2 ~]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
Instantiating disk "DATA"
Now that Oracle is completely removed, you can start your Grid install again.