| author | Samikshan Bairagya <samikshan@gmail.com> | 2017-05-23 19:32:24 +0530 | 
|---|---|---|
| committer | Atin Mukherjee <amukherj@redhat.com> | 2017-05-24 05:38:40 +0000 | 
| commit | a8624b8b13a1f4222e4d3e33fa5836d7b45369bc (patch) | |
| tree | 6f8f1a7ae413366961768fb1c5617000614a370a /xlators/mgmt | |
| parent | 461888bb63b2409f8245c7766aa799ca22f734e6 (diff) | |
glusterd: Eliminate race in brick compatibility checking stage
In https://review.gluster.org/17307/, while looking for compatible
bricks for multiplexing, glusterd checks that the brick pidfile exists
before checking whether the corresponding brick process is running.
However, checking whether the brick process is running right after
checking that the pidfile exists isn't enough: there is a race
window in which the pidfile has been created but hasn't been
updated with a pid value yet. This commit closes that window by
waiting iteratively until the pid value has been written as well.
Change-Id: Ib7a158f95566486f7c1f84b6357c9b89e4c797ae
BUG: 1451248
Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-on: https://review.gluster.org/17375
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Diffstat (limited to 'xlators/mgmt')
| -rw-r--r-- | xlators/mgmt/glusterd/src/glusterd-utils.c | 7 | 
1 files changed, 5 insertions, 2 deletions
diff --git a/xlators/mgmt/glusterd/src/glusterd-utils.c b/xlators/mgmt/glusterd/src/glusterd-utils.c
index ea8d60cd87b..0a89535211f 100644
--- a/xlators/mgmt/glusterd/src/glusterd-utils.c
+++ b/xlators/mgmt/glusterd/src/glusterd-utils.c
@@ -5225,13 +5225,16 @@ find_compat_brick_in_vol (glusterd_conf_t *conf,
                  * wait for the pidfile to be populated with a value before
                  * checking if the service is running */
                 while (retries > 0) {
-                        if (sys_access (pidfile2, F_OK) == 0)
+                        if (sys_access (pidfile2, F_OK) == 0 &&
+                            gf_is_service_running (pidfile2, &pid2)) {
                                 break;
+                        }
+
                         sleep (1);
                         retries--;
                 }
 
-                if (!gf_is_service_running (pidfile2, &pid2)) {
+                if (retries == 0) {
                         gf_log (this->name, GF_LOG_INFO,
                                 "cleaning up dead brick %s:%s",
                                 other_brick->hostname, other_brick->path);
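The retry pattern this patch introduces can be sketched outside glusterd roughly as follows. This is a minimal standalone illustration, not the glusterd implementation: `pidfile_has_live_pid` and `wait_for_live_pid` are hypothetical names, and the liveness check via `kill (pid, 0)` is an assumption standing in for `gf_is_service_running`, whose internals are not shown in this commit. The point is that "file exists" and "file holds the pid of a live process" are tested together on every attempt, so an empty, freshly created pidfile simply causes another retry.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (NOT glusterd's gf_is_service_running): returns true
 * only if the pidfile can be opened, contains a positive pid, and a process
 * with that pid currently exists. */
bool
pidfile_has_live_pid (const char *pidfile, pid_t *pid_out)
{
        FILE *fp = fopen (pidfile, "r");
        long  pid = 0;
        int   matched = 0;

        if (!fp)
                return false;   /* pidfile not created yet */

        matched = fscanf (fp, "%ld", &pid);
        fclose (fp);

        if (matched != 1 || pid <= 0)
                return false;   /* created, but no pid written yet */

        /* Signal 0 delivers nothing; it only checks that the pid exists.
         * EPERM means the process exists but belongs to another user. */
        if (kill ((pid_t) pid, 0) != 0 && errno != EPERM)
                return false;

        *pid_out = (pid_t) pid;
        return true;
}

/* Hypothetical wrapper: poll until the pidfile names a live process or the
 * retries run out. Returns false only if every attempt failed. */
bool
wait_for_live_pid (const char *pidfile, int retries, unsigned delay_secs,
                   pid_t *pid_out)
{
        while (retries > 0) {
                if (pidfile_has_live_pid (pidfile, pid_out))
                        return true;

                sleep (delay_secs);
                retries--;
        }

        return false;
}
```

A caller would invoke something like `wait_for_live_pid (pidfile, 5, 1, &pid)` and treat a `false` return the way the patched code treats `retries == 0`, that is, as a dead brick to clean up. The retry count and delay here are illustrative values, not ones taken from the patch.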
