glusterfs.git/xlators/cluster/ec, branch v3.7.14

cluster/ec: Unlock stale locks when inodelk/entrylk/lk fails

2016-07-30T01:04:22+00:00

Thanks to Rafi for hinting a while back that he had once seen this
kind of problem. I didn't think the theory was valid at the time.
This could have been caught earlier if I had tested his theory.

 >Change-Id: Iac6ffcdba2950aa6f8cf94f8994adeed6e6a9c9b
 >BUG: 1344836
 >Signed-off-by: Pranith Kumar K 
 >Reviewed-on: http://review.gluster.org/14703
 >Reviewed-by: Xavier Hernandez 
 >Smoke: Gluster Build System 
 >Tested-by: mohammed rafi  kc 
 >NetBSD-regression: NetBSD Build System 
 >CentOS-regression: Gluster Build System 

BUG: 1361402
Change-Id: If9ccf0b3db7159b87ddcdc7b20e81cde8c3c76f0
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/15040
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
Reviewed-by: Xavier Hernandez 
CentOS-regression: Gluster Build System

cluster/ec: Handle absence of keys in some callback dict

2016-07-27T07:00:04+00:00

Problem: This issue arises during a rolling update
from 3.7.5 to 3.7.9.
For a 4+2 volume running 3.7.5, the problem can be seen
if we update 2 nodes and, after heal completion, kill 2 of
the older nodes. After the update and the killing of
bricks, 2 nodes will return the inodelk count key in the dict
while the other 2 nodes will not have it.
This is also true for get-link-count.
During the dictionary comparison in ec_dict_compare, this
leads to a mismatch of answers, and the file operation
on the mount point fails with an I/O error.

Solution:
Don't match inode, entry and link count keys while
comparing two dictionaries. However, while combining the
data in ec_dict_combine, go through all the dictionaries
and select the maximum values received in different dicts
for these keys.
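
For illustration only, a minimal Python sketch of the compare/combine
rule described above. The key names and function names are placeholders
chosen for this example; they are not the real GlusterFS keys, and this
is not the code in ec_dict_compare or ec_dict_combine.

```python
# Keys whose presence or value may differ between bricks.
# Placeholder names, not the actual GlusterFS dictionary keys.
IGNORED_KEYS = {"inodelk-count", "entrylk-count", "link-count"}


def dicts_match(a, b):
    """Compare two answers while skipping the count keys."""
    def strip(d):
        return {k: v for k, v in d.items() if k not in IGNORED_KEYS}
    return strip(a) == strip(b)


def combine(answers):
    """Merge matching answers, keeping the maximum seen for each count key."""
    combined = dict(answers[0])
    for answer in answers[1:]:
        for key in IGNORED_KEYS:
            if key in answer:
                combined[key] = max(combined.get(key, 0), answer[key])
    return combined
```

With this rule, an answer from an older brick that lacks a count key
still matches an answer from an updated brick that carries it, and the
combined result reports the largest count received.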

master-
http://review.gluster.org/#/c/14761/

Change-Id: I33546e3619fe8f909286ee48fb0df2009cd3d22f
BUG: 1360152
Signed-off-by: Ashish Pandey 
Reviewed-on: http://review.gluster.org/14761
Reviewed-by: Xavier Hernandez 
Smoke: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri 
CentOS-regression: Gluster Build System 
NetBSD-regression: NetBSD Build System 
Signed-off-by: Ashish Pandey 
Reviewed-on: http://review.gluster.org/15012

cluster/ec: Fix race in timer cancellation

2016-07-18T06:29:05+00:00

A race in timer cancellation for delayed unlock could cause a crash
if the cancelling thread fails to cancel the timer because it has
already been fired but not executed, and the callback is scheduled
out of the CPU, delaying it until the thread has released important
resources needed by the callback.

This patch improves the handling of this case to make it robust.
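
As a rough sketch of the kind of race described above, the toy Python
model below uses a generation token so that a callback which has already
fired, but runs late, can detect that it was cancelled. This is one
common technique and is an assumption made for this example; the class
and method names are hypothetical and it is not claimed to be what the
patch does.

```python
import threading


class DelayedUnlock:
    """Toy model of a lock whose release is deferred by a timer."""

    def __init__(self):
        self.mutex = threading.Lock()
        self.generation = 0   # bumped whenever a pending timer is invalidated
        self.released = False

    def arm(self):
        """Start a delayed unlock; return the token its callback must present."""
        with self.mutex:
            self.generation += 1
            return self.generation

    def cancel(self):
        """Invalidate any pending callback, even one that has already fired."""
        with self.mutex:
            self.generation += 1

    def on_timer(self, token):
        """Timer callback: act only if nobody cancelled it in the meantime."""
        with self.mutex:
            if token != self.generation:
                return False  # stale callback, do not touch shared state
            self.released = True
            return True
```

A callback that was scheduled out of the CPU and runs after cancel()
sees a stale token and returns without acting.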

Backport of:
> Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4
> BUG: 1345855
> Signed-off-by: Xavier Hernandez 
> Reviewed-on: http://review.gluster.org/14712
> Smoke: Gluster Build System 
> NetBSD-regression: NetBSD Build System 
> CentOS-regression: Gluster Build System 
> Reviewed-by: Pranith Kumar Karampuri 

Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4
BUG: 1346156
Signed-off-by: Xavier Hernandez 
Reviewed-on: http://review.gluster.org/14724
Smoke: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System

cluster/ec: Fix invalid __fd_unref() call

2016-06-14T01:03:50+00:00

__fd_unref() doesn't do any cleanup, so it cannot be called to release
fd references, especially if it's the last reference.

The code has been changed to avoid a call to this function.

In the previous version we always tried to keep the newest fd in the
ec_lock_t structure. However this is not necessary. We'll always keep
one reference to an open file on the same inode. It's irrelevant if
the reference is new or old.

The function __fd_unref() has also been removed from fd.h so that it is
not used in the future, since it's useless as currently defined.
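
To make the distinction concrete, here is a small Python model of a
reference count with and without cleanup. The Fd class and both helper
names are hypothetical and only mirror the behaviour described in this
message; they are not the libglusterfs functions.

```python
class Fd:
    """Toy object with a reference count and a 'destroyed' marker."""

    def __init__(self):
        self.refcount = 1
        self.destroyed = False


def raw_unref(fd):
    """Decrement only; nothing is cleaned up when the count reaches zero."""
    fd.refcount -= 1


def unref(fd):
    """Decrement and destroy the object when the last reference goes away."""
    fd.refcount -= 1
    if fd.refcount == 0:
        fd.destroyed = True
```

Dropping the last reference through raw_unref() leaves an object with a
zero count that was never destroyed, which is the kind of leak a
decrement-only helper invites.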

Backport of http://review.gluster.org/14683

Change-Id: Ia728777fc8e464758d5ea4d3bf020f0603919039
BUG: 1344422
Signed-off-by: Xavier Hernandez 
Reviewed-on: http://review.gluster.org/14685
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri

cluster/ec: Pass xdata to dht in case of error

2016-06-13T08:30:20+00:00

Problem: In case of mkdir failure, dht expects
error information so that it can act accordingly.
After adding bricks and rebalancing, the layout gets
changed. The "mkdir" fop with the old layout returns EIO.
EC gets this error in xdata but does not pass it
back to dht. In this case dht will not be able to
take corrective action.

Solution: Return xdata back to dht

master -
http://review.gluster.org/#/c/14679/

Change-Id: I24def8038e6880607689b7b046dc6428f564c6ab
BUG: 1344595
Signed-off-by: Ashish Pandey 
Reviewed-on: http://review.gluster.org/14689
Reviewed-by: Xavier Hernandez 
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System

cluster/ec: Restrict the launch of replace brick heal

2016-06-09T09:04:07+00:00

Problem: When features.cache-invalidation is ON, ec_notify
gets called very often, which launches too many heals.
As a result no heal completes, and heals accumulate.

Solution: ec_launch_replace_heal should not be launched
for every event. Replace brick will trigger a child up
event, and only then should this heal function be called.
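
A minimal Python sketch of the filtering idea, assuming a simplified
event model. The Event enum and function names are illustrative only
and do not correspond to the actual ec_notify code.

```python
from enum import Enum, auto


class Event(Enum):
    CHILD_UP = auto()
    CHILD_DOWN = auto()
    UPCALL = auto()  # stands in for cache-invalidation notifications


def should_launch_replace_heal(event):
    """Launch the replace-brick heal only when a child comes up."""
    return event is Event.CHILD_UP


def count_heal_launches(events):
    """Count how many heals a stream of notifications would launch."""
    return sum(1 for event in events if should_launch_replace_heal(event))
```

For example, a burst of one hundred upcall notifications followed by a
single child up event launches one heal instead of 101.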

master -
http://review.gluster.org/#/c/14649/

Change-Id: I57b44c6a279d57230daea1d93229be6069245b7d
BUG: 1342964
Signed-off-by: Ashish Pandey 
Reviewed-on: http://review.gluster.org/14652
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Smoke: Gluster Build System 
Reviewed-by: Xavier Hernandez

cluster/ec: Fix issues with eager locking

2016-05-04T11:29:04+00:00

Due to a race in timer cancellation, in some cases it was possible
to unlock the lock while another concurrent fop that needed it
continues execution as if it were not released.

This patch also fixes an issue that caused a lock to not be released
if an error was found while preparing ec_update_size_version().

> Change-Id: I1344a3f5ecfc333f05a09e62653838264c9c26b1
> BUG: 1331254
> Signed-off-by: Xavier Hernandez 
> Reviewed-on: http://review.gluster.org/14112
> Smoke: Gluster Build System 
> CentOS-regression: Gluster Build System 
> Reviewed-by: Chen Chen 
> NetBSD-regression: NetBSD Build System 

Change-Id: I21edd17d914dfa8d2f98e6bbde50830496e12a92
BUG: 1330132
Signed-off-by: Xavier Hernandez 
Reviewed-on: http://review.gluster.org/14174
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Jeff Darcy

cluster/afr: Don't lookup/forget inodes

2016-04-17T14:11:08+00:00

Problem:
All inodes that are looked up are always forgotten without fail in
afr, removing the benefit of having them in the lru. The same code can
cause crashes if the top xlator does inode_forget(0) between the
inode_lookup and inode_forget calls in afr.

Fix:
Don't use lookup/forget in afr. There is no benefit at the moment
in keeping this code. It is impossible to prevent top xlators from
doing inode_forget(0). Similar instances were found in ec
and removed, even though those code paths are not going to
be executed anywhere other than the heal-daemon.

 >BUG: 1321554
 >Change-Id: Ia4cb236178f7f129cc898d53f0bbd26f494a2a8d
 >Signed-off-by: Pranith Kumar K 
 >Reviewed-on: http://review.gluster.org/13834
 >Smoke: Gluster Build System 
 >NetBSD-regression: NetBSD Build System 
 >CentOS-regression: Gluster Build System 
 >Reviewed-by: Anuradha Talur 

BUG: 1327864
Change-Id: I3507ed88cd75e069ed302525bfa259cf407871fb
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/14009
Smoke: Gluster Build System 
CentOS-regression: Gluster Build System 
NetBSD-regression: NetBSD Build System

cluster/ec: Do not ref dictionary in lookup

2016-04-09T18:51:08+00:00

Problem:
1) dict_for_each loops over the elements without any locks, so the members of
   the dictionary can be ref/unrefed while dict_for_each is executed by another
   thread leading to crashes.

Basically, this happens with distributed ec and distributed replicate as the
cold and hot tiers. Tier sends a lookup which fails on ec. (By this time the
dict already contains the ec xattrs.) After this, the lookup_everywhere code
path is hit in tier, which triggers the hashed lookup on each distribute, but
that fails too. This leads to lookup_everywhere in the cold and hot dhts in two
parallel epoll threads. When ec then tries to set trusted.ec.version/dirty/size
as keys in the dictionary, the older values against the same keys get erased.
If, while this erasing is going on, the thread that is doing the lookup on
afr's subvolume accesses these keys, either in dict_copy_with_ref or in the
client xlator trying to serialize, that can lead to either a crash or a hang,
depending on whether the spin/mutex lock is called on invalid memory.

2) EC deletes GF_CONTENT_KEY from the dictionary. This may lead to extra reads
   in case of lookup-everywhere for tiered volumes.

Fix:
Do dict_copy_with_ref() for the lookup dictionary.
This avoids the 1st problem rather than actually fixing it.
The 2nd problem will be fixed.
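
The effect of working on a private copy can be sketched in Python as
follows. Plain dicts stand in for dict_t, "content-key" is a placeholder
for GF_CONTENT_KEY, and both function names are hypothetical; this only
illustrates why copying isolates the caller's dictionary, not how
dict_copy_with_ref() is implemented.

```python
def lookup_on_shared(xdata):
    """Modify the caller's dictionary in place (the problematic pattern)."""
    xdata["trusted.ec.version"] = "requested"
    xdata.pop("content-key", None)
    return xdata


def lookup_on_copy(xdata):
    """Work on a private copy so the caller's dictionary is left untouched."""
    private = dict(xdata)
    private["trusted.ec.version"] = "requested"
    private.pop("content-key", None)
    return private
```

With the copy, another thread still holding the original dictionary
never sees keys being replaced or removed underneath it.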

 >Change-Id: I5427aa14c48cb7572977d4de9a28c5ffff2b4b95
 >BUG: 1315560
 >Signed-off-by: Pranith Kumar K 
 >Reviewed-on: http://review.gluster.org/13680
 >Smoke: Gluster Build System 
 >NetBSD-regression: NetBSD Build System 
 >CentOS-regression: Gluster Build System 
 >Reviewed-by: Xavier Hernandez 
 >(cherry picked from commit 64cba025b13aad7fb3020a04930cfa22fbfcb859)

Change-Id: I2828a0d9e730bc4b0ea6cee037365131767ae43e
BUG: 1322520
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/13859
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Ravishankar N 
Reviewed-by: Krutika Dhananjay 
Smoke: Gluster Build System

cluster/ec: Provide an option to enable/disable eager lock

2016-03-21T05:52:53+00:00

Problem: If a fop takes a lock and completes its operation,
it waits for 1 second before releasing the lock. However,
if ec finds any lock contention within this time period,
it releases the lock immediately, before the time expires. As we
take the lock on the first brick, for a few operations, like read, it
might happen that discovery of lock contention takes a
long time, which can degrade performance.

Solution: Provide an option to enable/disable eager lock.
If eager lock is disabled, the lock will be released as soon
as the fop completes.

gluster v set <VOLNAME> disperse.eager-lock on
gluster v set <VOLNAME> disperse.eager-lock off
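
The timing behaviour described above can be modelled with a short
Python sketch. The function and parameter names are assumptions made
for this example, and the 1 second delay is the value quoted in this
message; it is not the ec implementation.

```python
EAGER_LOCK_DELAY = 1.0  # seconds, as quoted in the commit message


def release_time(fop_done_at, eager_lock, contention_seen_at=None):
    """Return the time at which the lock would be released in this model."""
    if not eager_lock:
        # Eager lock disabled: release as soon as the fop completes.
        return fop_done_at
    deadline = fop_done_at + EAGER_LOCK_DELAY
    if contention_seen_at is None:
        return deadline
    # Contention noticed before the deadline releases the lock early.
    return min(deadline, max(fop_done_at, contention_seen_at))
```

In this model a fop finishing at t=5.0 holds the lock until t=6.0 with
eager lock on and no contention, and releases it at t=5.0 with eager
lock off.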

master-
http://review.gluster.org/13605

Change-Id: I000985a787eba3c190fdcd5981dfbf04e64af166
BUG: 1318965
Signed-off-by: Ashish Pandey 
Reviewed-on: http://review.gluster.org/13773
Smoke: Gluster Build System 
CentOS-regression: Gluster Build System 
NetBSD-regression: NetBSD Build System 
Reviewed-by: Pranith Kumar Karampuri