From fe52fc33d00eafe7d52ffff1b2dab846374f1d4a Mon Sep 17 00:00:00 2001
From: Raghavendra G
Date: Tue, 13 Mar 2018 16:39:44 +0530
Subject: tests/bug-1110262.t: fix a race condition

This test does:
1. mount a volume
2. kill a brick in the volume
3. mkdir (/somedir)

In my local tests and in [1], I see that the mkdir in step 3 fails
because there is no dht-layout on the root directory. The reason, I
think, is that by the time the first lookup on "/" hit dht, a brick had
already been killed as per step 2. This means the layout was not healed
for "/", and since this is a new volume, no layout is present on it.
Note that the first lookup done on "/" by fuse-bridge is not
synchronized with the parent process of the daemonized glusterfs mount
completing. IOW, by the time the glusterfs command has returned, there
is no guarantee that the lookup on "/" is complete. So, if step 2 races
ahead of fuse_first_lookup on "/", we end up with an invalid dht-layout
on "/", resulting in failures.

Doing an operation like ls makes sure that the lookup on "/" is
completed before we kill a brick.

Change-Id: Ie0c4e442c4c629fad6f7ae850437e3d63fe4bea9
Signed-off-by: Raghavendra G
BUG: 1543279
---
 tests/bugs/bug-1110262.t | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tests/bugs/bug-1110262.t b/tests/bugs/bug-1110262.t
index be785f4f3f7..90b101fc98d 100644
--- a/tests/bugs/bug-1110262.t
+++ b/tests/bugs/bug-1110262.t
@@ -23,6 +23,12 @@ TEST $CLI volume start $V0;
 EXPECT 'Started' volinfo_field $V0 'Status';
 TEST glusterfs -s $H0 --volfile-id=$V0 $M0
+#do some operation on mount, so that kill_brick is guaranteed to be
+#done _after_ first lookup on root and dht has a proper layout on
+#it. Otherwise mkdir done in later stages of script might fail due to
+#lack of layout on "/" as dht-self-heal won't proceed if any of its
+#subvolumes are down.
+TEST ls $M0
 #kill one of the brick process
 TEST kill_brick $V0 $H0 $B0/${V0}2
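
The race described in the commit message can be illustrated in miniature without GlusterFS. What follows is only a plain bash simulation under stated assumptions, not gluster code: `fake_mount`, `sync_on_lookup`, `fake_kill_brick`, `fake_mkdir`, and the `layout` and `brick_down` marker files are all hypothetical stand-ins for the daemonized mount, the `ls $M0` added by this patch, `kill_brick`, the later mkdir, and the dht layout on "/". The one-second sleep is an arbitrary stand-in for the delay before the first lookup runs.

```shell
#!/bin/bash
# Hypothetical simulation of the race; none of these names are GlusterFS APIs.
workdir=$(mktemp -d)

# Stand-in for the daemonized mount: returns immediately while the
# "first lookup" keeps running in the background. The lookup writes a
# layout only if no brick is down when it runs.
fake_mount() {
    ( sleep 1; [ -e "$workdir/brick_down" ] || touch "$workdir/layout" ) &
}

# Stand-in for `ls $M0`: blocks until the background lookup has finished.
sync_on_lookup() { wait; }

# Stand-in for kill_brick: marks one brick as down.
fake_kill_brick() { touch "$workdir/brick_down"; }

# Stand-in for mkdir: succeeds only if the root has a layout.
fake_mkdir() { [ -e "$workdir/layout" ]; }

# $1 = "sync" to wait for the first lookup before killing the brick.
run_scenario() {
    rm -f "$workdir/layout" "$workdir/brick_down"
    fake_mount
    if [ "$1" = "sync" ]; then
        sync_on_lookup
    fi
    fake_kill_brick
    wait    # let any still-running lookup finish before the mkdir
    if fake_mkdir; then echo ok; else echo fail; fi
}

racy_result=$(run_scenario nosync)   # brick killed before the lookup runs
fixed_result=$(run_scenario sync)    # lookup completes first, as with `ls $M0`

echo "racy:  $racy_result"    # prints "racy:  fail"
echo "fixed: $fixed_result"   # prints "fixed: ok"
rm -rf "$workdir"
```

In the unsynchronized run the brick is marked down before the background lookup gets to write a layout, so the simulated mkdir fails. Waiting on the lookup first, which is the role `TEST ls $M0` plays in the patch, makes the ordering deterministic.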