author Pranith Kumar K <pkarampu@redhat.com> 2015-11-10 09:06:54 +0530
committer Xavier Hernandez <xhernandez@datalab.es> 2015-11-11 05:48:02 -0800
commit 9a69ad2c8438b9fbdcb133404a5d205f809bbb5a
tree 63cea0a2eb5df48133b7f16681b65aee16af2a83
parent d025d954a5c593ccb0838788e165c36cb3537b25
cluster/ec: fix bug in update_good
Problem:
Bricks that didn't participate in the fops are considered to be good.
This happens in two ways.

Examples:
Case-1:
1) 2+1 volume. 'd1' directory on Brick-0 is bad.
2) readdir takes locks and lock->good_mask is '7'.
3) readdir does xattrop and fop->mask is '6'.
4) Because fop->expected is '1', lock->good_mask remains '7'.

Case-2:
1) When all the bricks are up, it does lock + xattrop before the op and
   figures out that all the bricks are good.
2) By the time the second operation starts, brick-0 is down. Now
   lock->good_mask will always have the '0' bit set as long as
   operations are happening on it, because in
   "lock->good_mask &= ~fop->mask | fop->remaining"
   fop->mask doesn't have the '0'th bit set.
3) When it comes time to perform the final xattrop in
   update_size_version, brick-0 comes back online, so it is given the
   same version as the other bricks, as if it had participated in all
   the transactions till then, even though it didn't.

Fix:
Case-1's fix:
Update lock->good_mask in ec_prepare_update_cbk with the latest good/bad
bricks.
Case-2's fix:
Consider a non-participating brick as bad.

Change-Id: Ic01a733f8180131ded6a3cc784fcb1960758cf23
BUG: 1276989
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/12561
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
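
The mask arithmetic behind Case-2 can be replayed outside of GlusterFS with a small standalone sketch. It is not code from the patch: the type `brick_mask_t` and the functions `update_good_old`, `update_good_new` and `replay_case2` are hypothetical names introduced here, and `uint64_t` is an assumed stand-in for the real mask type. Only the two update formulas are taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for the per-brick bitmask type; bit N represents brick-N. */
typedef uint64_t brick_mask_t;

/* Update formula removed by this commit (hypothetical wrapper). */
brick_mask_t update_good_old(brick_mask_t good_mask, brick_mask_t mask,
                             brick_mask_t good, brick_mask_t remaining)
{
    good_mask &= ~mask | remaining; /* keeps every bit outside fop->mask */
    good_mask |= good;              /* re-adds bricks where the fop succeeded */
    return good_mask;
}

/* Update formula added by this commit (hypothetical wrapper). */
brick_mask_t update_good_new(brick_mask_t good_mask, brick_mask_t good,
                             brick_mask_t remaining)
{
    /* A brick stays good only if the fop succeeded on it or is still pending. */
    good_mask &= good | remaining;
    return good_mask;
}

/* Replays Case-2 from the commit message on a 2+1 volume. */
void replay_case2(void)
{
    /* All three bricks were good when the lock was taken. */
    const brick_mask_t locked = 0x7;
    /* The second fop runs while brick-0 is down: only bricks 1 and 2 take part. */
    const brick_mask_t mask = 0x6, good = 0x6, remaining = 0x0;

    /* Old formula: brick-0 never took part, yet its bit survives. */
    assert(update_good_old(locked, mask, good, remaining) == 0x7);
    /* New formula: the non-participating brick is treated as bad. */
    assert(update_good_new(locked, good, remaining) == 0x6);

    /* A later fop that succeeds on all three bricks under the same lock:
     * the old formula marks brick-0 good again, the new one keeps it bad. */
    assert(update_good_old(0x6, 0x7, 0x7, 0x0) == 0x7);
    assert(update_good_new(0x6, 0x7, 0x0) == 0x6);
}
```

Calling `replay_case2()` from any test harness passes silently if the formulas behave as the commit message describes. The last two assertions correspond to the new code comment: once a brick is dropped from good_mask, it is not set back to good until the lock is released and reacquired.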
-rwxr-xr-x  run-tests.sh                        1
-rw-r--r--  xlators/cluster/ec/src/ec-common.c  12
2 files changed, 7 insertions, 6 deletions
diff --git a/run-tests.sh b/run-tests.sh
index f94f86c35ad..0d764a4e8ce 100755
--- a/run-tests.sh
+++ b/run-tests.sh
@@ -205,7 +205,6 @@ function is_bad_test ()
./tests/geo-rep/georep-basic-dr-rsync.t \
./tests/geo-rep/georep-basic-dr-tarssh.t \
./tests/bugs/replicate/bug-1221481-allow-fops-on-dir-split-brain.t \
- ./tests/basic/ec/ec-readdir.t \
; do
[ x"$name" = x"$bt" ] && return 0 # bash: zero means true/success
done
diff --git a/xlators/cluster/ec/src/ec-common.c b/xlators/cluster/ec/src/ec-common.c
index a7d6da4038f..da67dbc0f95 100644
--- a/xlators/cluster/ec/src/ec-common.c
+++ b/xlators/cluster/ec/src/ec-common.c
@@ -154,11 +154,12 @@ void ec_lock_update_good(ec_lock_t *lock, ec_fop_data_t *fop)
return;
}
- /* When updating the good mask of the lock, we only take into
- * consideration those bits corresponding to the bricks where
- * the fop has been executed. */
- lock->good_mask &= ~fop->mask | fop->remaining;
- lock->good_mask |= fop->good;
+ /* When updating the good mask of the lock, we only take into consideration
+ * those bits corresponding to the bricks where the fop has been executed.
+ * Bad bricks are removed from good_mask, but once marked as bad it's never
+ * set to good until the lock is released and reacquired */
+
+ lock->good_mask &= fop->good | fop->remaining;
}
void __ec_fop_set_error(ec_fop_data_t * fop, int32_t error)
@@ -981,6 +982,7 @@ unlock:
/* We don't allow the main fop to be executed on bricks that have not
* succeeded the initial xattrop. */
parent->mask &= fop->good;
+ ec_lock_update_good (lock, fop);
/*As of now only data healing marks bricks as healing*/
lock->healing |= fop->healing;