From fcb5b70b1099d0379b40c81f35750df8bb9545a5 Mon Sep 17 00:00:00 2001
From: Krutika Dhananjay
Date: Thu, 28 Jul 2016 21:29:59 +0530
Subject: cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order

When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file.

To get around this, the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, we still have one good copy left despite the split-brain
situation which when it is back online, will be chosen as source to
do the heal.

Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
BUG: 1363721
Signed-off-by: Krutika Dhananjay
Reviewed-on: http://review.gluster.org/15080
Tested-by: Pranith Kumar Karampuri
Smoke: Gluster Build System
CentOS-regression: Gluster Build System
Reviewed-by: Ravishankar N
Reviewed-by: Oleksandr Natalenko
NetBSD-regression: NetBSD Build System
Reviewed-by: Pranith Kumar Karampuri
---
 xlators/cluster/ec/src/ec-locks.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'xlators/cluster/ec/src/ec-locks.c')

diff --git a/xlators/cluster/ec/src/ec-locks.c b/xlators/cluster/ec/src/ec-locks.c
index 0253b51bf5e..ed835f1aadc 100644
--- a/xlators/cluster/ec/src/ec-locks.c
+++ b/xlators/cluster/ec/src/ec-locks.c
@@ -52,7 +52,7 @@ int32_t ec_lock_check(ec_fop_data_t *fop, uintptr_t *mask)
         }
 
         if (error == -1) {
-                if (ec_bits_count(locked | notlocked) >= ec->fragments) {
+                if (gf_bits_count(locked | notlocked) >= ec->fragments) {
                         if (notlocked == 0) {
                                 if (fop->answer == NULL) {
                                         fop->answer = cbk;
--
cgit