glusterfs.git/tests/bugs/replicate, branch v3.8.4

cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order

2016-08-22T10:22:36+00:00

        Backport of: http://review.gluster.org/15080

When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, we still have one good copy left despite the split-brain situation
which when it is back online, will be chosen as source to do the heal.

> Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
> BUG: 1363721
> Signed-off-by: Krutika Dhananjay 
> Reviewed-on: http://review.gluster.org/15080
> Tested-by: Pranith Kumar Karampuri 
> Smoke: Gluster Build System 
> CentOS-regression: Gluster Build System 
> Reviewed-by: Ravishankar N 
> Reviewed-by: Oleksandr Natalenko 
> NetBSD-regression: NetBSD Build System 
> Reviewed-by: Pranith Kumar Karampuri 
(cherry picked from commit fcb5b70b1099d0379b40c81f35750df8bb9545a5)

Change-Id: I157f1025aebd6624fa3d412abc69a4ae6f2fe9e0
BUG: 1367272
Signed-off-by: Krutika Dhananjay 
Signed-off-by: Oleksandr Natalenko 
Reviewed-on: http://review.gluster.org/15221
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Smoke: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri

glusterd: Convert volume to replica after adding brick self heal is not triggered

2016-08-18T16:55:17+00:00

Problem:  After add brick to a distribute volume to convert to replica is not
          triggering self heal.

Solution: Modify the condition in brick_graph_add_index to set trusted.afr.dirty
          attribute in xlator.

Test    : To verify the patch followd below steps
          1) Create a single node volume
             gluster volume create  
          2) Start volume and create mount point
             mount -t glusterfs :/DIS /mnt
          3) Touch some file and write some data on file
          4) Add another brick along with replica 2
             gluster volume add-brick DIS replica 2 :/dist2/brick2
          5) Before apply the patch file size is 0 bytes in mount point.
Backport of commit 87bb8d0400d4ed18dd3954b1d9e5ca6ee0fb9742
BUG: 1366440
Signed-off-by: Mohit Agrawal 

> Change-Id: Ief0ccbf98ea21b53d0e27edef177db6cabb3397f
> Signed-off-by: Mohit Agrawal 
> Reviewed-on: http://review.gluster.org/15118
> NetBSD-regression: NetBSD Build System 
> Reviewed-by: Ravishankar N 
> Reviewed-by: Anuradha Talur 
> Smoke: Gluster Build System 
> CentOS-regression: Gluster Build System 
> Reviewed-by: Atin Mukherjee 
> (cherry picked from commit 87bb8d0400d4ed18dd3954b1d9e5ca6ee0fb9742)

Change-Id: Icd104cf5a2152a9c606dac209746e2953c4d293e
Reviewed-on: http://review.gluster.org/15151
Smoke: Gluster Build System 
Reviewed-by: Ravishankar N 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Anuradha Talur 
Reviewed-by: Atin Mukherjee 
Tested-by: Atin Mukherjee

afr: some coverity fixes

2016-07-28T13:54:49+00:00

Note: This is a backport of http://review.gluster.org/14895.
It contains:
i) fixes that prevent deadlocks (afr-common.c).
ii) fixes over-writing op-errno=ENOMEM with possible other values
(afr-inode-read.c).
iii) prevents doing further operations with a NULL dictionary if
allocation fails (afr-self-heal-data.c).
iv) prevents falsely marking a sink as healed if metadata heal fails
midway(afr-self-heal-metadata.c).
v) other minor fixes.

Considering the above are not trivial fixes, the patch is a good
candidate for merging in 3.8 branch.

Thanks to Krutika for a cleaner way to track inode refs in
afr_set_split_brain_choice().

Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df
BUG: 1360556
Signed-off-by: Ravishankar N 
Reviewed-on: http://review.gluster.org/15018
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Krutika Dhananjay 
Reviewed-by: Pranith Kumar Karampuri

afr:Don't wind reads for files in metadata split-brain

2016-06-27T12:19:55+00:00

Backport of http://review.gluster.org/#/c/13389/

Problem: For a read on  a file in metadata split-brain:
1.lookup_done resets event_generation to zero.
2. readv is issued, goes to inode refresh due to mismatching event_gen.
3. After refresh is successful, we update event_generation, data and
metdata readable.
3. We then call afr_read_txn_refresh_done() which in turn calls
afr_inode_get_readable() but doesn't check for EIO. So afr_readv_wind
is called with local->readable (which is populated with data_readable),
thus winding the read to a brick.
4. Also, further parallel reads that come directly go to the wind path
because there is no inode_refresh needed.

Fix:
1.For any afr_read_txn(), readable must be an intersection of data and metadata
readable.
2.Check for EIO in afr_read_txn_refresh_done().

Change-Id: I22dd221fdfaf96d7aced2f474e28ed1337d69f0e
BUG: 1349879
Signed-off-by: Ravishankar N 
(cherry picked from commit 7a1c1e2904701496968ed14b6d7479fb706c3188)
Reviewed-on: http://review.gluster.org/14790
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri

afr: Consider ENOSPC and EDQUOT as symmetric errors

2016-06-13T10:14:49+00:00

Backport of http://review.gluster.org/#/c/14604/

Problem:
Since commit 8eaa3506ead4f11b81b146a9e56575c79f3aad7b, in replica 3, if a
brick is down and a create fails on the other 2 brick with EDQUOT, we consider
it an unsymmetric error and hence do not do post-op. So the dirty xattr
remains set on the parent dir, leading to conservative merges during heal when
all bricks are up. i.e. a file deleted on the source might re-appear after heal.

Fix:
Consider ENOSPC and  EDQUOT as symmetric errors since there is no
possibility of partial inode or entry modification operations possible when
quota is enabled. IOW, if quota reports EDQUOT, the no. of bytes written
(or not written) will be the same on all bricks of the replica.
Likewise, the entry operation (create, mkdir...) will either succeed or
not succeed on all bricks.

Change-Id: Iacb1108e9ef4a918e36242fb4a957455133744e9
BUG: 1344559
Signed-off-by: Ravishankar N 
Reviewed-on: http://review.gluster.org/14687
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri 
Reviewed-by: Niels de Vos

cluster/afr: Do not inode_link in afr

2016-05-25T10:31:27+00:00

Race is explained at
https://bugzilla.redhat.com/show_bug.cgi?id=1337405#c0

This patch also handles performing of self-heal with shd-pid.
Also performs the healing with this->itable's inode rather than
main itable.

 >BUG: 1337405
 >Change-Id: Id657a6623b71998b027b1dff6af5bbdf8cab09c9
 >Signed-off-by: Pranith Kumar K 
 >Reviewed-on: http://review.gluster.org/14422
 >Smoke: Gluster Build System 
 >NetBSD-regression: NetBSD Build System 
 >CentOS-regression: Gluster Build System 
 >Reviewed-by: Krutika Dhananjay 

BUG: 1337870
Change-Id: Ifb476eeed2ff73a44e481d64074599ab0707c725
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/14455
Smoke: Gluster Build System 
Reviewed-by: Krutika Dhananjay 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Niels de Vos

cluster/afr : Do post-op in case of symmetric errors

2016-05-24T08:37:10+00:00

        Backport of: http://review.gluster.org/#/c/14310/

In afr_changelog_post_op_now(), if there was any error,
meaning op_ret < 0, post-op was not being done even when
the errors were symmetric and there were no "failed
subvols".

Fix:
When the errors are symmetric, perform post-op.

How was the bug found :
In a 1 X 3 volume with shard and write behind on
when writes were done into a file with one brick down,
the trusted.afr.dirty xattr's value for .shard directory
would keep increasing as post op was not done but pre-op was.
This incorrectly showed .shard to be in split-brain.

RCA:
When WB is on, due to multiple writes being sent on
offset lying in the same shard, chances are that
same shard file will be created more than once
with the second one failing with op_ret < 0
and op_errno = EEXIST.

As op_ret was negative, afr wouldn't do post-op,
leading to no decrement of trusted.afr.dirty xattr.
Thus showing .shard directory to be in split-brain.

        >Change-Id: I711bdeaa1397244e6a7790e96f0c84501798fc59
        >BUG: 1335652
        >Signed-off-by: Anuradha Talur 

Change-Id: I711bdeaa1397244e6a7790e96f0c84501798fc59
BUG: 1335829
Signed-off-by: Anuradha Talur 
Reviewed-on: http://review.gluster.org/14331
NetBSD-regression: NetBSD Build System 
Smoke: Gluster Build System 
Reviewed-by: Ravishankar N 
CentOS-regression: Gluster Build System 
Reviewed-by: Niels de Vos

glusterd: default value of nfs.disable, change from false to true

2016-04-27T07:59:30+00:00

Next step in eventual deprecation of glusterfs nfs server in favor
of ganesha.nfsd.

Also replace several open-coded strings with constant.

Change-Id: If52f5e880191a14fd38e69b70a32b0300dd93a50
BUG: 1092414
Signed-off-by: Kaleb S KEITHLEY 
Reviewed-on: http://review.gluster.org/13738
NetBSD-regression: NetBSD Build System 
Smoke: Gluster Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Atin Mukherjee 
Tested-by: Atin Mukherjee 
Reviewed-by: Niels de Vos

cluster/ec: Rebalance hangs during rename

2016-03-30T08:51:11+00:00

Problem:
During the rename of a particular file (ec
is holding blocking inodelk on the parent
directory), if the rename of another file
under the same directory comes. EC does not
release the lock and goes ahead and renames
the "new" file with the "already held lock".

That causes rebalance process to be blocked
on a lock which has been acquired by rename.

Solution:
While rename fop comes, ec takes blocking inodelk
on old and new parent of the file. Before releasing,
every lock held by ec, it waits for some "time" to
see if that lock can be reused by the next fop.
If within this "time" some other request comes,
it releases this lock based on condition
"lock count > 1"

To get this "lock count" for rename fop, we have
implemented "pl_rename" in feature/lock. Also,
on ec side, changed the condition to release the lock
based on the type of fop and old and new parent
directories.

Change-Id: I979dbab1185df962e8f305a6074ae1186ffe7db0
Bug: 1304988
Signed-off-by: Ashish Pandey 
Reviewed-on: http://review.gluster.org/13460
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri 
Reviewed-by: Krutika Dhananjay

cluster/ec: Provide an option to enable/disable eager lock

2016-03-16T04:54:28+00:00

Problem: If a fop takes lock, and completes its operation,
it waits for 1 second before releasing the lock. However,
If ec find any lock contention within this time period,
it release the lock immediately before time expires. As we
take lock on first brick, for few operations, like read, it
might happen that discovery of lock contention might take
long time and can degrades the performance.

Solution: Provide an option to enable/disable eager lock.
If eager lock is disabled, lock will be released as soon
as fop completes.

gluster v set  disperse.eager-lock on
gluster v set  disperse.eager-lock off

Change-Id: I000985a787eba3c190fdcd5981dfbf04e64af166
BUG: 1314649
Signed-off-by: Ashish Pandey 
Reviewed-on: http://review.gluster.org/13605
Smoke: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri 
Tested-by: Pranith Kumar Karampuri 
CentOS-regression: Gluster Build System 
NetBSD-regression: NetBSD Build System