From fe52fc33d00eafe7d52ffff1b2dab846374f1d4a Mon Sep 17 00:00:00 2001
From: Raghavendra G <rgowdapp@redhat.com>
Date: Tue, 13 Mar 2018 16:39:44 +0530
Subject: tests/bug-1110262.t: fix a race condition

This test does:

1. mount a volume
2. kill a brick in the volume
3. mkdir (/somedir)

In my local tests and in [1], I see that mkdir in step 3 fails because
there is no dht-layout on the root directory.

I think the reason is that by the time the first lookup on "/" hit dht,
a brick had already been killed in step 2. This means the layout was
not healed for "/", and since this is a new volume, no layout is
present on it. Note that the first lookup done on "/" by fuse-bridge
is not synchronized with the completion of the parent process of the
daemonized glusterfs mount. IOW, by the time the glusterfs command
returns, there is no guarantee that the lookup on "/" is complete. So,
if step 2 races ahead of fuse_first_lookup on "/", we end up with an
invalid dht-layout on "/", resulting in failures.

Doing an operation like ls makes sure that the lookup on "/" is
completed before we kill a brick.
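
The ordering problem can be illustrated with a toy model in plain
shell. This is only a sketch under loud assumptions: it does not talk
to GlusterFS at all, every function and file name in it is
hypothetical, and a marker file stands in for the dht layout on "/".
It assumes a lookup can write the layout only while all bricks are up,
and that the mount command returns before the first lookup lands.

```shell
#!/bin/sh
# Toy model of the race. Nothing here talks to GlusterFS; all names
# (lookup_root, mount_volume, kill_brick_model, ...) are hypothetical.

lookup_root() {
    # Assumption: a lookup on "/" can only heal (write) the layout
    # while all bricks are up.
    if [ -e "$state/brick_up" ]; then
        : > "$state/layout"
    fi
}

mount_volume() {
    : > "$state/brick_up"
    # The mount command returns at once; the first lookup lands later.
    ( sleep 1; lookup_root ) &
    first_lookup=$!
}

kill_brick_model() {
    rm -f "$state/brick_up"
}

run_scenario() {
    # $1 = "yes" to do a synchronous lookup (the role played by
    # `ls $M0`) before the brick is killed.
    state=$(mktemp -d)
    mount_volume
    if [ "$1" = "yes" ]; then
        lookup_root
    fi
    kill_brick_model
    wait "$first_lookup"
    # mkdir is modelled as succeeding only if a layout exists on "/".
    if [ -e "$state/layout" ]; then
        echo "mkdir ok"
    else
        echo "mkdir fails: no layout on /"
    fi
    rm -rf "$state"
}

echo "without ls: $(run_scenario no)"
echo "with ls:    $(run_scenario yes)"
```

In this model the scenario without the synchronous lookup loses the
race, because the background lookup only lands after the brick is
gone. The scenario with it writes the layout before the brick is
killed, so the later mkdir has something to work with.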

Change-Id: Ie0c4e442c4c629fad6f7ae850437e3d63fe4bea9
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
BUG: 1543279
---
 tests/bugs/bug-1110262.t | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tests/bugs/bug-1110262.t b/tests/bugs/bug-1110262.t
index be785f4f3f7..90b101fc98d 100644
--- a/tests/bugs/bug-1110262.t
+++ b/tests/bugs/bug-1110262.t
@@ -23,6 +23,12 @@ TEST $CLI volume start $V0;
 EXPECT 'Started' volinfo_field $V0 'Status';
 TEST glusterfs -s $H0 --volfile-id=$V0 $M0
 
+#do some operation on mount, so that kill_brick is guaranteed to be
+#done _after_ first lookup on root and dht has a proper layout on
+#it. Otherwise mkdir done in later stages of script might fail due to
+#lack of layout on "/" as dht-self-heal won't proceed if any of its
+#subvolumes are down.
+TEST ls $M0
 #kill one of the brick process
 TEST kill_brick $V0 $H0 $B0/${V0}2
 
-- 
cgit