Grid Infrastructure 19c installation fails

User_Q79A7, May 5 2021

Hello Experts,

I have been trying to set up 19c GI for an SEHA implementation, but the GI installation fails while starting the ora.cluster_interconnect.haip resource. The setup information is below.

Platform -
Oracle Cloud VMs
Oracle Linux 7.9
16 GB Memory

Network-
I created a new Virtual Cloud Network (CIDR 100.120.0.0/16) with two subnets-
Public Subnet - 100.120.21.0/24
Private Subnet - 100.120.20.0/24

/etc/hosts-
=============
100.120.21.123 rac1.sub05031238420.racvcn.oraclevcn.com rac1
100.120.21.186 rac2.sub05031238420.racvcn.oraclevcn.com rac2
# Private
100.120.20.18 rac1-priv.sub05031238421.racvcn.oraclevcn.com rac1-priv
100.120.20.149 rac2-priv.sub05031238421.racvcn.oraclevcn.com rac2-priv
# Virtual
100.120.21.65 rac1-vip.sub05031238420.racvcn.oraclevcn.com rac1-vip
100.120.21.66 rac2-vip.sub05031238420.racvcn.oraclevcn.com rac2-vip
# SCAN
100.120.21.131 mycluster-scan mycluster-scan
100.120.21.132 mycluster-scan mycluster-scan
100.120.21.133 mycluster-scan mycluster-scan
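
As a quick sanity check on the addressing above, here is a minimal Python sketch that confirms each /etc/hosts entry sits in the subnet it is meant for. The host entries are copied from the listing; the `expected_subnet` helper and its rule (names ending in `-priv` belong to the private subnet, everything else to the public one) are assumptions made for illustration, not something the installer or CVU does.

```python
import ipaddress

PUBLIC = ipaddress.ip_network("100.120.21.0/24")
PRIVATE = ipaddress.ip_network("100.120.20.0/24")

HOSTS = """\
100.120.21.123 rac1.sub05031238420.racvcn.oraclevcn.com rac1
100.120.21.186 rac2.sub05031238420.racvcn.oraclevcn.com rac2
# Private
100.120.20.18 rac1-priv.sub05031238421.racvcn.oraclevcn.com rac1-priv
100.120.20.149 rac2-priv.sub05031238421.racvcn.oraclevcn.com rac2-priv
# Virtual
100.120.21.65 rac1-vip.sub05031238420.racvcn.oraclevcn.com rac1-vip
100.120.21.66 rac2-vip.sub05031238420.racvcn.oraclevcn.com rac2-vip
# SCAN
100.120.21.131 mycluster-scan mycluster-scan
100.120.21.132 mycluster-scan mycluster-scan
100.120.21.133 mycluster-scan mycluster-scan
"""


def parse_hosts(text):
    """Return (ip, short_name) pairs, skipping comments and blank lines."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        entries.append((ipaddress.ip_address(fields[0]), fields[-1]))
    return entries


def expected_subnet(name):
    """Hypothetical rule: '-priv' names live on the private subnet."""
    return PRIVATE if name.endswith("-priv") else PUBLIC


def misplaced(entries):
    """List entries whose address is outside the subnet expected for the name."""
    return [(str(ip), name) for ip, name in entries
            if ip not in expected_subnet(name)]


if __name__ == "__main__":
    for ip, name in misplaced(parse_hosts(HOSTS)):
        print(f"{name}: {ip} is outside {expected_subnet(name)}")
```

With the entries as posted, the script prints nothing, meaning every address matches its intended subnet.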

The Cluster Verification Utility completes with just one failed check, for insufficient swap space (expected 16 GB, actual 8 GB). In the installer as well, all prerequisites are met apart from swap space.
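
For reference, the swap figure can be confirmed outside CVU with a small Python sketch like the one below. It parses `/proc/meminfo`-style text and reports how far `SwapTotal` falls short of a required size. The 16 GB requirement is taken from the CVU message quoted above; the `SAMPLE` text is a made-up stand-in built from the figures in this post (16 GB memory, 8 GB swap), not output captured from the VM, and the function names are hypothetical.

```python
def meminfo_kib(text, key):
    """Return the value (in kB) of one /proc/meminfo field."""
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == key:
            return int(rest.split()[0])
    raise KeyError(key)


def swap_shortfall_gib(meminfo_text, required_gib):
    """How many GiB of swap are missing relative to required_gib (0 if none)."""
    swap_gib = meminfo_kib(meminfo_text, "SwapTotal") / (1024 * 1024)
    return max(0.0, required_gib - swap_gib)


# Illustrative sample only; on a real host read /proc/meminfo instead.
SAMPLE = """\
MemTotal:       16777216 kB
SwapTotal:       8388608 kB
"""

if __name__ == "__main__":
    print(swap_shortfall_gib(SAMPLE, required_gib=16))
```

On the actual node, the text would come from `open("/proc/meminfo").read()` rather than the sample string.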

While executing root.sh on the first node, the script fails at step 16, and the following errors are reported in the CRS alert log-

2021-05-05 08:13:25.552 [OCSSD(14355)]CRS-1709: Lease acquisition failed for node rac1 because no voting file has been configured; Details at (:CSSNM00031:) in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ocssd.trc
2021-05-05 08:13:26.788 [OCSSD(14355)]CRS-1621: The IPMI configuration data for this node stored in the Oracle registry is incomplete; details at (:CSSNK00002:) in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ocssd.trc
2021-05-05 08:13:26.789 [OCSSD(14355)]CRS-1617: The information required to do node kill for node rac1 is incomplete; details at (:CSSNM00004:) in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ocssd.trc
2021-05-05 08:13:34.316 [OCSSD(14355)]CRS-1601: CSSD Reconfiguration complete. Active nodes are rac1 .
2021-05-05 08:13:34.334 [OCSSD(14355)]CRS-1720: Cluster Synchronization Services daemon (CSSD) is ready for operation.
2021-05-05 08:13:36.255 [OCTSSD(14569)]CRS-8500: Oracle Clusterware OCTSSD process is starting with operating system process ID 14569
2021-05-05 08:13:37.004 [OCTSSD(14569)]CRS-2403: The Cluster Time Synchronization Service on host rac1 is in observer mode.
2021-05-05 08:13:38.635 [OCTSSD(14569)]CRS-2407: The new Cluster Time Synchronization Service reference node is host rac1.
2021-05-05 08:13:38.636 [OCTSSD(14569)]CRS-2401: The Cluster Time Synchronization Service started on host rac1.
2021-05-05 08:13:50.735 [CRSCTL(14947)]CRS-1013: The OCR location in an ASM disk group is inaccessible. Details in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/crsctl_14947.trc.
2021-05-05 08:14:59.158 [ORAROOTAGENT(14005)]CRS-5818: Aborted command 'start' for resource 'ora.cluster_interconnect.haip'. Details at (:CRSAGF00113:) {0:0:106} in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ohasd_orarootagent_root.trc.
2021-05-05 08:15:05.522 [OHASD(13885)]CRS-2757: Command 'Start' timed out waiting for response from the resource 'ora.cluster_interconnect.haip'. Details at (:CRSPE00221:) {0:0:106} in /u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ohasd.trc.
2021-05-05 08:15:05.518 [ORAROOTAGENT(14005)]CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:
2021-05-05 08:15:05.518+Start action for HAIP aborted. For details refer to "(:CLSN00107:)" in "/u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ohasd_orarootagent_root.trc".
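
To make the alert log excerpt above easier to scan, here is a rough Python sketch that splits each entry into timestamp, process, PID, CRS code, and message. The regular expression is derived only from the line layout visible in this excerpt, so it is an assumption about the format rather than a documented one; continuation lines (such as the one with `+` after the timestamp) do not match and are returned as `None`. The sample lines are copied from the excerpt.

```python
import re
from datetime import datetime

# Layout inferred from the excerpt: "<timestamp> [<PROC>(<pid>)]CRS-<n>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<proc>\w+)\((?P<pid>\d+)\)\]"
    r"(?P<code>CRS-\d+): (?P<msg>.*)$"
)


def parse_alert_line(line):
    """Parse one alert log line; return None if it is not a primary entry."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "ts": datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f"),
        "proc": m["proc"],
        "pid": int(m["pid"]),
        "code": m["code"],
        "msg": m["msg"],
    }


SAMPLE = """\
2021-05-05 08:13:34.334 [OCSSD(14355)]CRS-1720: Cluster Synchronization Services daemon (CSSD) is ready for operation.
2021-05-05 08:15:05.518 [ORAROOTAGENT(14005)]CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:
2021-05-05 08:15:05.518+Start action for HAIP aborted. For details refer to "(:CLSN00107:)" in "/u01/app/grid/grid_base/diag/crs/rac1/crs/trace/ohasd_orarootagent_root.trc".
"""

if __name__ == "__main__":
    for raw in SAMPLE.splitlines():
        entry = parse_alert_line(raw)
        if entry:
            print(entry["ts"], entry["proc"], entry["code"])
```

Sorting the parsed entries by `ts` can help when lines from different daemons are written slightly out of order, as with the CRS-2757 and CRS-5017 entries here.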

ocssd.trc-
============
2021-05-05 08:13:25.550 : CSSD:4155763968: [ INFO] clsssclsnrsetup: endp 0x723 for gipcha://rac1:nm2_mycluster
2021-05-05 08:13:25.550 : CSSD:4155763968: [ INFO] clssnmOpenGIPCEndp: listening on gipcha://rac1:nm2_mycluster
2021-05-05 08:13:25.552 : CLSF:4155763968: Allocated CLSF context
2021-05-05 08:13:25.552 : CSSD:4155763968: [ INFO] clssnmlalloccx:phyname rac1
2021-05-05 08:13:25.552 : CSSD:4155763968: [ INFO] clssnmlGetLease:Node does not have a valid lease going for lease acquistion
2021-05-05 08:13:25.552 : CSSD:4155763968: [ INFO] clssnmlpickslot:Optimize the lease acquisition for Fixed configuration slot provided by root scripts with slot 1
2021-05-05 08:13:25.552 : CSSD:4155763968: [ INFO] (:CSSNM00031:)clssnmlgetslot:No voting files available on node rac1
2021-05-05 08:13:25.553 : CSSD:4155763968: [ ERROR] clssnml_acqlease: failed to get a lease slot
2021-05-05 08:13:25.553 : CSSD:4155763968: [ ERROR] clssnmvInit: Failed to acquire lease
2021-05-05 08:13:25.553 : CSSD:4155763968: [ INFO] clssscUpdateInitState: Set state to 0x008c1e47, based on prior state of 0x008c1e46 and requested change of 0x00000001
2021-05-05 08:13:25.553 : CSSD:4155763968: [ INFO] clssnmInitNodeDB: Initializing with OCR id 0
2021-05-05 08:13:25.553 : CSSD:3973330688: [ INFO] clssscWaitOnInitState: returning 1, requested state 0x00000001, current state 0x008c1e47
2021-05-05 08:13:25.553 : CSSD:2751461120: [ INFO] clssscWaitOnInitState: returning 1, requested state 0x00000001, current state 0x008c1e47
2021-05-05 08:13:25.553 : CSSD:2751461120: [ INFO] clssgmclientlsnr: The event hdlr is client
2021-05-05 08:13:25.553 : CSSD:2751461120: [ INFO] clssscWaitOnInitState: Waiting on requested state 0x00008000, current state 0x008c1e47, timeout 4294967295
2021-05-05 08:13:25.554 : CSSD:2729383680: [ INFO] clssscthrdmain: Starting thread skgxnmon
2021-05-05 08:13:25.554 : CSSD:3973330688: clssscqueue_init: queue(0x7f00b80b0a10), max(0)
2021-05-05 08:13:25.554 : CSSD:3973330688: [ INFO] clssscWaitOnInitState: Waiting on requested state 0x00000100, current state 0x008c1e47, timeout 4294967295

crsctl_14947.trc-
===================
default:1956045184: u_set_comp_error: comptype '103' : error '29'
2021-05-05 08:13:44.865 : OCRRAW:1956045184: kgfnInitEnv env=0x7ffdf40b86b8 flags=0x0

2021-05-05 08:13:44.865 : OCRRAW:1956045184: kgfoCreateCtxExt2 trcflg: 0 [trclvl_in:3] ctx:0x5586e673e8c0

2021-05-05 08:13:45.323 : OCRRAW:1956045184: kgfnFindLocalNode03: kgfn_find_node_sid found no members

2021-05-05 08:13:45.323*:kgfn.c@1412: kgfnFindLocalNode: found no members
2021-05-05 08:13:45.324 : OCRRAW:1956045184: kgfnFindLocalNode: not ok

2021-05-05 08:13:45.324*:kgfn.c@1466: kgfnFindLocalNode: not ok
2021-05-05 08:13:45.324 : OCRRAW:1956045184: kgfnTgtInit: local node not found, free kgfnpds

2021-05-05 08:13:45.324*:kgfn.c@2252: kgfnTgtInit: not found
2021-05-05 08:13:45.324 : OCRRAW:1956045184: kgfnGetBeqData failed init target; inst=(null) flags=0x2000

2021-05-05 08:13:45.324*:kgfn.c@5941: kgfnGetBeqData: kgfnTgtInit failed, inst=NULL flags=0x2000

ohasd_orarootagent_root.trc-
==============================
2021-05-05 08:14:00.232 : USRTHRD:3483309824: [ INFO] {0:0:106} Thread:[NetHAWork] PROBE: got conflicting target ip 0.0.0.0, source ip 169.254.22.237, addr 00-00-17-85-81-3f, myAddr 02-00-17-01-58-bf
2021-05-05 08:14:00.232 : USRTHRD:3483309824: [ INFO] {0:0:106} Thread:[NetHAMain] HAIP: add IP 169.254.22.237 in Conflict IP List
2021-05-05 08:14:00.232 : USRTHRD:3483309824: [ INFO] {0:0:106} Thread:[NetHAMain] HAIP: IP 169.254.22.237 is in Conflict IP List
2021-05-05 08:14:00.232 : USRTHRD:3483309824: [ INFO] {0:0:106} Thread:[NetHAWork] PROBE: conflict detected src { 169.254.22.237, 00-00-17-85-81-3f }, target { 0.0.0.0, 02-00-17-01-58-bf }
2021-05-05 08:14:00.232 : USRTHRD:3483309824: [ INFO] {0:0:106} Thread:[NetHAMain] HAIP: delete the IP from Conflict IP List, 169.254.22.237
2021-05-05 08:14:00.232 : USRTHRD:3483309824: [ INFO] {0:0:106} Thread:[NetHAWork] ProcessInitial, ip '', subnetNum 0, numSubnets 1, generateIp 1
2021-05-05 08:14:00.232 : USRTHRD:3483309824: [ INFO] {0:0:106} Thread:[NetHAWork] HAIP: subnetRange 0, 65193
2021-05-05 08:14:00.232 : USRTHRD:3483309824: [ INFO] {0:0:106} Thread:[NetHAMain] HAIP: getSubnetRange 1, 1, 0, 8192, 1, 0, 8192, 0
2021-05-05 08:14:00.232 : USRTHRD:3483309824: [ INFO] {0:0:106} Thread:[NetHAWork] HAIP: base 0, len 8192
2021-05-05 08:14:00.232 : USRTHRD:3483309824: [ INFO] {0:0:106} Thread:[NetHAWork] HAIP: ipNum 3978100393, num 7405
2021-05-05 08:14:00.232 : USRTHRD:3483309824: [ INFO] {0:0:106} Thread:[NetHAWork] HAIP: my ip 169.254.28.237
2021-05-05 08:14:00.333 : USRTHRD:3483309824: [ INFO] {0:0:106} Failed to check 169.254.28.237 on ens5
2021-05-05 08:14:00.333 : USRTHRD:3483309824: [ INFO] {0:0:106} (null) category: 0, operation: , loc: , OS error: 0, other:
2021-05-05 08:14:00.333 : USRTHRD:3483309824: [ INFO] {0:0:106} Thread:[NetHAWork] Starting Probe for ip 169.254.28.237
2021-05-05 08:14:00.333 : USRTHRD:3483309824: [ INFO] {0:0:106} Thread:[NetHAWork] Transitioning to Probe State
2021-05-05 08:14:00.652 : USRTHRD:3483309824: [ INFO] {0:0:106} Arp::sProbe {
2021-05-05 08:14:00.652 : USRTHRD:3483309824: [ INFO] {0:0:106} Arp::sSend: sending type 1
2021-05-05 08:14:00.652 : USRTHRD:3483309824: [ INFO] {0:0:106} Arp::sProbe }
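
A note on the numbers in the HAIP lines above: going purely by arithmetic on the trace values, `ipNum 3978100393` looks like the address `169.254.28.237` stored as a 32-bit integer in little-endian byte order, and `num 7405` looks like that address's offset within 169.254.0.0/16. This reading is inferred from the values themselves, not from Oracle documentation. A minimal Python sketch of the check (function names are hypothetical):

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def decode_ipnum(ip_num):
    """Interpret a 32-bit integer as an IPv4 address stored little-endian."""
    raw = ip_num.to_bytes(4, "little")  # lowest byte is the first octet
    return ipaddress.IPv4Address(raw)


def host_offset(addr, network=LINK_LOCAL):
    """Offset of addr from the start of the network."""
    return int(addr) - int(network.network_address)


if __name__ == "__main__":
    addr = decode_ipnum(3978100393)
    print(addr, host_offset(addr), addr.is_link_local)
```

Under this reading, the candidate address matches the "my ip" line in the trace, and its offset of 7405 falls below the `len 8192` value logged just before it.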

ohasd.trc-
=============
2021-05-05 08:15:05.520 : AGFW:3527337728: [ INFO] {0:0:106} Received the reply to the message: RESOURCE_START[ora.cluster_interconnect.haip 1 1] ID 4098:452 from the agent /u01/app/grid/19.3/gridhome_1/bin/orarootagent_root
2021-05-05 08:15:05.521 : AGFW:3527337728: [ INFO] {0:0:106} Agfw Proxy Server sending the reply to PE for message:RESOURCE_START[ora.cluster_interconnect.haip 1 1] ID 4098:448
2021-05-05 08:15:05.522 : CRSPE:3514730240: [ INFO] {0:0:106} Received reply to action [Start] message ID: 448
2021-05-05 08:15:05.523 : CRSMAIN:3514730240: [ NONE] {0:0:106} {0:0:106} Created alert : (:CRSPE00221:) : Start action timed out!
2021-05-05 08:15:05.523 : CRSPE:3514730240: [ INFO] {0:0:106} Start action failed with error code: 3
2021-05-05 08:15:05.531 : CRSRPT:3512628992: [ INFO] {0:0:106} Published to EVM CRS_ACTION_FAILURE for ora.cluster_interconnect.haip
2021-05-05 08:15:06.179 :UiServer:3508426496: [ INFO] {0:0:111} Sending to PE. ctx= 0x7f5878072cb0, ClientPID=14196 set Properties (grid,116594)
2021-05-05 08:15:06.179 : CRSPE:3514730240: [ INFO] {0:0:111} Processing PE command id=131 origin:rac1. Description: [Stat Resource : 0x7f5884245dd0]
2021-05-05 08:15:06.183 :UiServer:3508426496: [ INFO] {0:0:111} Done for ctx=0x7f5878072cb0

I have tried to include the relevant lines from the logs and traces. If they don't help, kindly let me know and I will provide more data.

Best Regards,
Udit
