PRVG-11850 : The system call “connect” failed with error “111” while executing exectask on node

During an Oracle RAC 12c (12.1.0.2) install, the installer (or possibly the Cluster Verification Utility) reports the following error:

PRVG-11850 : The system call "connect" failed with error "111" while executing exectask on node ...

A check of the MOS site and other Oracle blogs shows that the common approach to resolving this issue is to verify that the firewall and SELinux are disabled:

Check that the Linux firewall is disabled on all nodes:

[root@DSSDserver1 ~]# service iptables status
iptables: Firewall is not running.
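
If root ssh equivalence is already configured between the nodes (the Grid install needs it anyway), a quick loop like the one below checks the firewall on every node in one pass instead of logging into each node individually:

# assumes root ssh equivalence between the cluster nodes is already set up
$ for node in dssdserver1 dssdserver2 dssdserver3 dssdserver4; do echo "== $node =="; ssh $node "service iptables status"; done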

And check that SELinux is disabled:

[root@DSSDserver2 ~]# cat /etc/selinux/config

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
#SELINUX=enforcing
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted 
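
Note that /etc/selinux/config only takes effect at boot time, so it is worth confirming the runtime state as well. The getenforce command reports the current mode, and a loop similar to the firewall check above should report "Disabled" on every node:

# getenforce shows the current SELinux mode: Enforcing, Permissive or Disabled
$ for node in dssdserver1 dssdserver2 dssdserver3 dssdserver4; do echo -n "$node: "; ssh $node getenforce; done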

In my case, both the firewall and SELinux were disabled on all nodes, yet I still got the same error sporadically, and the Grid prerequisite check took three hours to complete on a four-node cluster before ultimately failing.

One option is to run the runcluvfy.sh script to narrow down the problem. The script is located in the grid install directory, and the following syntax tests inter-node connectivity.

In my example the nodes are named dssdserver1 through dssdserver4, and the network adapter for the public subnet is p5p1.

$ ./runcluvfy.sh comp nodecon -i p5p1 -n dssdserver1,dssdserver2,dssdserver3,dssdserver4 -verbose

I repeated the test for the second adapter, the one used for the private interconnect, and in both cases I got random PRVG-11850 errors but no new clues as to what was causing them.
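
For completeness, the second run looked like the command below. The adapter name p5p2 is only an example for the private interconnect; substitute whatever adapter your nodes actually use.

# p5p2 is an example name for the private interconnect adapter
$ ./runcluvfy.sh comp nodecon -i p5p2 -n dssdserver1,dssdserver2,dssdserver3,dssdserver4 -verbose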

Then I wondered whether my NFS-mounted Oracle software directory was the problem:

[oracle@DSSDserver1 grid]$ mount 
/dev/mapper/vg_dssdserver1-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext4 (rw)
/dev/mapper/vg_dssdserver1-lv_home on /home type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
gvfs-fuse-daemon on /root/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
10.5.22.60:/oramnt on /oramnt type nfs (rw,bg,soft,nointr,tcp,vers=3,timeo=600,rsize=32768,wsize=32768,actimeo=0,addr=10.5.22.60)

The Oracle install binaries were on an NFS export from another machine, and pinging the NFS server's IP address revealed an extremely slow connection:

[oracle@DSSDserver1 grid]$ ping 10.5.22.60
PING 10.5.22.60 (10.5.22.60) 56(84) bytes of data.
64 bytes from 10.5.22.60: icmp_seq=1 ttl=58 time=10.9 ms
64 bytes from 10.5.22.60: icmp_seq=2 ttl=58 time=11.8 ms
64 bytes from 10.5.22.60: icmp_seq=3 ttl=58 time=10.9 ms
64 bytes from 10.5.22.60: icmp_seq=4 ttl=58 time=11.2 ms
^C
--- 10.5.22.60 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3367ms
rtt min/avg/max/mdev = 10.965/11.255/11.868/0.376 ms
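
A round-trip time above 10 ms looks more like a WAN hop than a local link. To get a rough feel for the actual read throughput from the NFS mount, a simple dd read of any large file on the export gives a ballpark figure; the file name below is only an example.

# rough sequential read test over the NFS mount; the file name is illustrative
$ dd if=/oramnt/grid/grid_home.zip of=/dev/null bs=1M count=512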

So I moved the install binaries to a local disk on the node from which the install was being run. After that, the inter-node connectivity errors went away, and the Grid install prerequisite check completed in about 30 seconds.
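
For anyone hitting the same thing, the change amounted to nothing more than the commands below; the local staging path is only an example.

# /u01/stage is an example local staging location
$ cp -rp /oramnt/grid /u01/stage/grid
$ cd /u01/stage/grid
$ ./runcluvfy.sh comp nodecon -i p5p1 -n dssdserver1,dssdserver2,dssdserver3,dssdserver4 -verbose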
