glusterd: make sure that brickinfo->uuid is not null

Problem: After an upgrade from a version where the shared-brick-count
option is not present to a version that introduces it, the volume size
reported at the mount point is wrong: it is reduced by a factor equal to
the shared-brick-count value.

Cause: shared-brick-count is the number of bricks that share a file
system. gd_set_shared_brick_count() calculates the shared-brick-count
value from the uuid of the node and the fsid of the brick.
https://review.gluster.org/#/c/glusterfs/+/19484 handles setting the
fsid properly on the upgrade path. That patch assumed that
brickinfo->uuid is non-null when its code path is reached. However,
brickinfo->uuid was null for all bricks, so the code added by
https://review.gluster.org/#/c/glusterfs/+/19484 never reached the point
where it sets the fsid for the bricks. As a result, fsid was 0 for all
bricks, and gd_set_shared_brick_count() calculated shared-brick-count
incorrectly, i.e. the logic in gd_set_shared_brick_count() did not work
as expected because every fsid was 0.

Solution: Before control reaches the code path added by
https://review.gluster.org/#/c/glusterfs/+/19484, check whether
brickinfo->uuid is null. If it is, call glusterd_resolve_brick(), which
sets brickinfo->uuid to a proper value. With a proper uuid, the fsid for
the bricks is set properly and the shared-brick-count value is
calculated correctly.

Please take a look at the bug
https://bugzilla.redhat.com/show_bug.cgi?id=1632889 for the complete RCA.

Steps followed to test the fix:
1. Created a 2 node cluster running a binary that doesn't have the
   shared-brick-count option.
2. Created a 2x(2+1) volume and started it.
3. Mounted the volume and checked the size of the volume using df.
4. Upgraded to a version where shared-brick-count is introduced
   (upgraded the nodes one by one, i.e. stop glusterd, upgrade the node
   and start glusterd).
5. After upgrading both nodes, bumped up the cluster.op-version.
6. At the mount point, df shows the correct size for the volume.

> backport of https://review.gluster.org/#/c/glusterfs/+/21278/
> Change-Id: Ib9f078aafb15e899a01086eae113270657ea916b
> Signed-off-by: Sanju Rakonde <srakonde@redhat.com>

(cherry picked from commit f1e9b878ce2067db83a0baa5f384eda87287719d)

fixes: bz#1633242
Change-Id: Ib9f078aafb15e899a01086eae113270657ea916b
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
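
The cause described in the message above can be illustrated with a minimal sketch. This is not the glusterd implementation: `struct brick` and `shared_brick_count()` are hypothetical, simplified stand-ins that only model the idea of counting bricks that have the same node uuid and the same fsid.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for glusterd's brickinfo. */
struct brick {
    unsigned char uuid[16]; /* node that hosts the brick */
    unsigned long fsid;     /* file system id of the brick, 0 if never set */
};

/*
 * Count the bricks that sit on the same node and the same file system
 * as bricks[idx]. The brick itself is included, so the result is >= 1.
 */
int
shared_brick_count(const struct brick *bricks, size_t n, size_t idx)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        if (memcmp(bricks[i].uuid, bricks[idx].uuid,
                   sizeof(bricks[i].uuid)) == 0 &&
            bricks[i].fsid == bricks[idx].fsid)
            count++;
    }
    return count;
}
```

In this model, two bricks on the same node whose fsid was never set (both 0) compare as sharing one file system, so the count is 2 for each of them even if they sit on separate file systems. Once each brick carries its real, distinct fsid, the count drops to 1.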
author: Sanju Rakonde <srakonde@redhat.com> 2018-09-25 23:36:48 +0530
committer: Sanju Rakonde <srakonde@redhat.com> 2018-09-27 06:18:39 +0000
commit: 7ab0f691f0bd08585d06a26f23dc24f62ddd6251 (patch)
tree: 0b736132c3b555481b787e311e1a02fc9ac74a10
parent: 112b50070861101be2d6cc8d8e96af75359a8ca3 (diff)
1 files changed, 2 insertions, 1 deletions
diff --git a/xlators/mgmt/glusterd/src/glusterd-store.c b/xlators/mgmt/glusterd/src/glusterd-store.c
index b5de958f8be..2300a548e5b 100644
--- a/xlators/mgmt/glusterd/src/glusterd-store.c
+++ b/xlators/mgmt/glusterd/src/glusterd-store.c
@@ -2724,6 +2724,8 @@ glusterd_store_retrieve_bricks(glusterd_volinfo_t *volinfo)
          * snapshot or snapshot restored volume this would be done post
          * creating the brick mounts
          */
+        if (gf_uuid_is_null(brickinfo->uuid))
+            (void)glusterd_resolve_brick(brickinfo);
         if (brickinfo->real_path[0] == '\0' && !volinfo->is_snap_volume &&
             gf_uuid_is_null(volinfo->restored_from_snap)) {
             /* By now if the brick is a local brick then it will be
@@ -2732,7 +2734,6 @@ glusterd_store_retrieve_bricks(glusterd_volinfo_t *volinfo)
              * with MY_UUID for realpath check. Hence do not handle
              * error
              */
-            (void)glusterd_resolve_brick(brickinfo);
             if (!gf_uuid_compare(brickinfo->uuid, MY_UUID)) {
                 if (!realpath(brickinfo->path, abspath)) {
                     gf_msg(this->name, GF_LOG_CRITICAL, errno,