Bug 6719988: MOUNT.OCFS2 FAILS WITH UNKNOWN CODE B 0 still with us on Version 1.8.5

4285969, Jul 13 2020

Hello,

I've run into this bug on Ubuntu 18.04, kernel 5.3.0-1028-oracle.

dmesg shows:

[ 2593.523291] o2net: Connected to node exacto-t4-ad2-fd1 (num 7) at 10.1.1.25:7777

[ 2593.523349] o2net: Connected to node exacto-t6-ad2-fd3 (num 9) at 10.1.1.27:7777

[ 2593.523493] o2net: Connected to node exacto-t5-ad2-fd2 (num 8) at 10.1.1.26:7777

[ 2691.649837] (mount.ocfs2,22008,1):dlm_join_domain:1911 Timed out joining dlm domain 386D6FD2075E4761A409BFA2339E4949 after 90200 msecs

This output is incomplete, though: it is missing the most recently added node, as can be seen by running:

sudo mounted.ocfs2 -f

on each of the 3 referenced nodes; it correctly returns:

Device Stack Cluster F Nodes

/dev/sdb o2cb exacto-t4-ad2-fd1, exacto-t5-ad2-fd2, exacto-t6-ad2-fd3, exacto-t8-ad2-fd2

I can get a consistent result even on the last node:

ssh exacto-t8-ad2-fd2 sudo mounted.ocfs2 -f

Device Stack Cluster F Nodes

/dev/sdb o2cb exacto-t4-ad2-fd1, exacto-t5-ad2-fd2, exacto-t6-ad2-fd3, exacto-t8-ad2-fd2

On the affected node, the one exhibiting the bug, the same command produces very strange output:

ssh exacto-t7-ad2-fd1 sudo mounted.ocfs2 -f

Device Stack Cluster F Nodes

/dev/sdb o2cb exacto-t4-ad2-fd1, exacto-t5-ad2-fd2, exacto-t6-ad2-fd3, 4
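To spot this kind of odd entry automatically, a minimal sketch follows. It assumes the row layout shown above (device, stack, then a comma-separated node list with the Cluster and F columns empty); the function name `unexpected_nodes` is hypothetical and not part of ocfs2-tools.

```shell
#!/bin/sh
# Hypothetical helper, not part of ocfs2-tools.
# Reads "mounted.ocfs2 -f" output on stdin and prints every node entry
# that is not among the expected hostnames given as arguments.
# Assumption: data rows start with /dev/ and look like
#   <device> <stack> <node>, <node>, ...
unexpected_nodes() {
    grep '^/dev/' |
    sed -e 's/^[^ ]* *[^ ]* *//' |          # drop device and stack fields
    tr ',' '\n' |                           # one node entry per line
    sed -e 's/^ *//' -e 's/ *$//' |         # trim surrounding spaces
    while IFS= read -r node; do
        if [ -z "$node" ]; then
            continue
        fi
        found=no
        for expected in "$@"; do
            if [ "$node" = "$expected" ]; then
                found=yes
            fi
        done
        if [ "$found" = no ]; then
            echo "$node"
        fi
    done
}

# Example usage (commented out because it needs a live cluster node):
# sudo mounted.ocfs2 -f | unexpected_nodes exacto-t4-ad2-fd1 exacto-t5-ad2-fd2 \
#     exacto-t6-ad2-fd3 exacto-t7-ad2-fd1 exacto-t8-ad2-fd2
```

Fed the output from the affected node, this would print the stray `4` entry; fed the output from any of the healthy nodes, it would print nothing.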

Moreover, here is the status of the ocfs2 service:

sudo systemctl status ocfs2

● ocfs2.service - Load ocfs2 Modules

Loaded: loaded (/lib/systemd/system/ocfs2.service; enabled; vendor preset: enabled)

Active: active (exited) since Mon 2020-07-13 09:46:18 UTC; 17ms ago

Process: 7476 ExecStop=/etc/init.d/ocfs2 stop (code=exited, status=0/SUCCESS)

Process: 7686 ExecStart=/etc/init.d/ocfs2 start (code=exited, status=0/SUCCESS)

Main PID: 7686 (code=exited, status=0/SUCCESS)

Jul 13 09:44:35 exacto-t7-ad2-fd1 systemd[1]: Starting Load ocfs2 Modules...

Jul 13 09:46:18 exacto-t7-ad2-fd1 ocfs2[7686]: Starting Oracle Cluster File System (OCFS2) mount.ocfs2: Unknown code B 0 while mounting /dev/sdb on /mnt/ocfs2. Check 'dmesg' for more information on this error.

Jul 13 09:46:18 exacto-t7-ad2-fd1 ocfs2[7686]: Failed

/etc/ocfs2/cluster.conf is identical across all nodes, as I'm building the cluster following the official docs.
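For anyone wanting to double-check the same thing, one way to confirm the file really is identical is to compare checksums gathered from each node. This is only a sketch: the helper name `same_checksum` is hypothetical, and the commented usage assumes passwordless ssh to each host and that `sha256sum` is installed there.

```shell
#!/bin/sh
# Hypothetical helper: reads checksum lines ("<hash>  <file>") on stdin
# and succeeds only if every line carries the same hash.
same_checksum() {
    distinct=$(awk '{ print $1 }' | sort -u | wc -l | tr -d ' ')
    [ "$distinct" -eq 1 ]
}

# Example usage (commented out because it needs ssh access to the nodes):
# for host in exacto-t4-ad2-fd1 exacto-t5-ad2-fd2 exacto-t6-ad2-fd3 \
#             exacto-t7-ad2-fd1 exacto-t8-ad2-fd2; do
#     ssh "$host" sha256sum /etc/ocfs2/cluster.conf
# done | same_checksum && echo "cluster.conf is identical on all nodes"
```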

Added on Jul 13 2020

#clustering, #oracle-solaris
