glusterfs.git/xlators/cluster/afr/src, branch v3.8.8

afr: Ignore event_generation checks post inode refresh for write txns

2017-01-08T11:15:27+00:00

Backport of http://review.gluster.org/#/c/16205/

Before http://review.gluster.org/#/c/16091/, after inode refresh, we
failed read txns in case of EIO or event_generation being zero. For
write transactions, the check was only for EIO. 16091 re-factored the
code to fail both read and write when event_generation=0. This seems to
have caused a regression as explained in the BZ.

This patch restores the earlier behaviour in afr_txn_refresh_done():
after inode refresh, write transactions are failed only on EIO.
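
A minimal sketch of the restored check, assuming a hypothetical
txn_kind_t enum and helper name (neither is taken from the AFR source).
It only models the rule described above: EIO fails any transaction,
while event_generation == 0 fails read transactions alone.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical transaction kinds; not the real AFR transaction enum. */
typedef enum { TXN_READ, TXN_WRITE } txn_kind_t;

/*
 * Returns true if the transaction should be failed after inode refresh.
 * refresh_err is 0 or a positive errno value; event_generation is the
 * value observed once the refresh has completed.
 */
static bool
txn_should_fail_after_refresh(txn_kind_t kind, int refresh_err,
                              int event_generation)
{
    assert(refresh_err >= 0);

    /* EIO fails both read and write transactions. */
    if (refresh_err == EIO)
        return true;

    /* A zero event_generation is fatal for reads only. */
    if (kind == TXN_READ && event_generation == 0)
        return true;

    return false;
}
```

With this rule a write transaction that sees event_generation == 0 but
no EIO is allowed to proceed, which is the case the regression broke.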

Change-Id: Id763ed2d420b6d045d4505893a18959d998c91a3
BUG: 1378547
Signed-off-by: Ravishankar N 
Reviewed-on: http://review.gluster.org/16322
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Niels de Vos 
Smoke: Gluster Build System

afr: allow I/O when favorite-child-policy is enabled

2017-01-08T10:48:49+00:00

Problem:
Currently, even when favorite-child-policy is set, I/O on a split-brained
file fails until the self-heal is complete.

Fix:
If a valid 'source' is found using the set favorite-child-policy, inspect
and reset the afr pending xattrs on the 'sinks' (inside appropriate locks),
refresh the inode and then proceed with the read or write transaction.

The resetting itself happens in the self-heal code and hence can also
happen in the client side background-heal or by the shd's index-heal in
addition to the txn code path explained above. When it happens via
heal, we also add checks in undo-pending to not reset the sink xattrs
again.
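
The flow in the Fix section can be sketched roughly as below. This is
an illustration under stated assumptions, not the AFR code: the helper
names are hypothetical, "latest mtime wins" stands in for whichever
favorite-child-policy is configured, and the pending xattrs are
modelled as an n x n row-major matrix where pending[i * n + j] != 0
means brick i blames brick j. Locking and the inode refresh are left
out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical policy helper: returns the index of the brick with the
 * latest mtime among those that replied, or -1 if none replied.
 * Ties go to the lowest index.
 */
static int
pick_source_by_mtime(const int64_t *mtime, const unsigned char *replied,
                     size_t n)
{
    int source = -1;

    for (size_t i = 0; i < n; i++) {
        if (!replied[i])
            continue;
        if (source < 0 || mtime[i] > mtime[source])
            source = (int)i;
    }
    return source;
}

/*
 * Clears the blame that every sink holds against the chosen source.
 * In the real code this would happen inside the appropriate locks.
 */
static void
reset_sink_pending(int *pending, size_t n, int source)
{
    assert(source >= 0 && (size_t)source < n);

    for (size_t sink = 0; sink < n; sink++) {
        if ((int)sink != source)
            pending[sink * n + (size_t)source] = 0;
    }
}
```

If no source is found (the helper returns -1), the transaction would
still have to fail as before.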

> Reviewed-on: http://review.gluster.org/15673
> Tested-by: Pranith Kumar Karampuri 
> Smoke: Gluster Build System 
> Reviewed-by: Pranith Kumar Karampuri 
> NetBSD-regression: NetBSD Build System 
> CentOS-regression: Gluster Build System 

Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626
BUG: 1378547
Signed-off-by: Ravishankar N 
Reported-by: Simon Turcotte-Langevin 
Reviewed-on: http://review.gluster.org/16091
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Smoke: Gluster Build System 
Reviewed-by: Niels de Vos

cluster/afr: Fix missing name indices due to EEXIST error

2016-12-29T15:09:10+00:00

        Backport of: http://review.gluster.org/16286

PROBLEM:
Consider a volume with granular-entry-heal and sharding enabled. When
a replica is down and a shard is created as part of a write, the name
index is correctly created under indices/entry-changes/.
Now when a read on the same region triggers another MKNOD, the fop
fails on the online bricks with EEXIST. By virtue of this being a
symmetric error, the failed_subvols[] array is reset to all zeroes.
Because of this, before post-op, the GF_XATTROP_ENTRY_OUT_KEY will be
set, causing the name index, which was created in the previous MKNOD
operation, to be wrongly deleted in THIS MKNOD operation.

FIX:
The ideal fix would have been for a transaction to delete the name
index ONLY if it knows it is the one that created the index in the first
place. This would involve gathering information from individual bricks
as to whether THIS xattrop created the index, aggregating their responses
and, based on the various possible combinations of responses, deciding
whether to delete the index or not. This is rather complex. A simpler fix
is for post-op to examine local->op_ret when there are no failed_subvols
to figure out whether to delete the name index or not. This can occasionally
lead to creation of stale name indices, but they won't affect the IO path
or interfere with pending changelogs in any way, and self-heal, in its crawl
of the "entry-changes" directory, takes care of deleting such indices.

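The simpler fix described above could look roughly like the following
sketch. The function name and signature are hypothetical, and it
assumes the usual convention that a negative op_ret means the fop
failed; it is not the actual post-op code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Decides whether post-op may delete the name index.
 * failed_subvols[i] != 0 means the fop failed on brick i.
 * op_ret < 0 is assumed to mean the fop failed as a whole.
 */
static bool
should_delete_name_index(const unsigned char *failed_subvols,
                         size_t child_count, int op_ret)
{
    assert(failed_subvols != NULL);

    /* A failure on any brick means the index is still needed. */
    for (size_t i = 0; i < child_count; i++) {
        if (failed_subvols[i])
            return false;
    }

    /*
     * No failed subvols: this is also what a symmetric error such as
     * EEXIST looks like, so consult op_ret before deleting anything.
     */
    return op_ret >= 0;
}
```

In the EEXIST scenario from the PROBLEM section, failed_subvols is all
zeroes but op_ret is negative, so the index created by the earlier
MKNOD is left alone.
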
Change-Id: Icc642a987d1b6a5097562315aecf1263ed35ceb6
BUG: 1408786
Signed-off-by: Krutika Dhananjay 
Reviewed-on: http://review.gluster.org/16293
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri

afr: use accused matrix instead of readable matrix for deciding heals

2016-12-28T09:13:23+00:00

Problem:
afr_replies_interpret() used the 'readable' matrix to trigger client
side heals after inode refresh. But for arbiter, readable is always
zero. So when `dd` is run with a data brick down, spurious data heals
are triggered. These heals open an fd, causing eager lock to be
disabled (open fd count > 1) in afr transactions, leading to extra
FXATTROPs.

Fix:
Use the accused matrix (derived from interpreting the afr pending
xattrs) to decide whether we can start heal or not.
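
As a rough illustration of the idea, and not the code in
afr_replies_interpret(), the sketch below derives an accused[] array
from a simplified pending matrix and starts a heal only if some brick
is accused. The matrix layout (n x n, row-major, pending[i * n + j] != 0
meaning the xattrs on brick i blame brick j) and both function names
are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Marks brick j as accused when the xattrs on any brick blame it. */
static void
compute_accused(const int *pending, size_t n, bool *accused)
{
    assert(pending != NULL && accused != NULL);

    for (size_t j = 0; j < n; j++) {
        accused[j] = false;
        for (size_t i = 0; i < n; i++) {
            if (pending[i * n + j] != 0)
                accused[j] = true;
        }
    }
}

/* A heal is worth starting only if at least one brick is accused. */
static bool
heal_needed(const bool *accused, size_t n)
{
    for (size_t j = 0; j < n; j++) {
        if (accused[j])
            return true;
    }
    return false;
}
```

Unlike a check on readability, this does not fire for an arbiter brick
merely because it is never readable; it fires only when the pending
xattrs actually blame a brick.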

> Reviewed-on: http://review.gluster.org/16277
> NetBSD-regression: NetBSD Build System 
> CentOS-regression: Gluster Build System 
> Smoke: Gluster Build System 
> Reviewed-by: Pranith Kumar Karampuri 
> Tested-by: Pranith Kumar Karampuri 
(cherry picked from commit 5a7c86e578f5bbd793126a035c30e6b052177a9f)

Change-Id: Ibbd56c9aed6026de6ec42422e60293702aaf55f9
BUG: 1408772
Signed-off-by: Ravishankar N 
Reviewed-on: http://review.gluster.org/16291
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri

cluster/afr: Fix per-txn optimistic changelog initialisation

2016-12-13T09:03:15+00:00

        Backport of: http://review.gluster.org/16075

Incorrect initialisation of local->optimistic_change_log was causing
pre-op and post-op to be skipped even when a brick didn't participate in
the txn because it was down.
The result: a missing granular name index, so some entries never got
healed.

FIX:
Initialise local->optimistic_change_log just before pre-op.

Also fixed granular entry heal to create the granular name index in
pre-op as opposed to post-op. This is to prevent loss of granular
information when, during an entry txn, the good (src) brick goes
offline before the post-op is done. This would cause self-heal to
do a conservative merge (since the dirty xattr is the only information
available), which, when granular-entry-heal is enabled, expects
granular indices, the lack of which can lead to loss of data in
the worst case.
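
The invariant behind the first fix can be sketched as below: skipping
pre-op and post-op is only safe when every brick takes part in the
transaction, so the decision has to be made just before pre-op, when
the set of participating bricks is known. The function name and
parameters are hypothetical and do not mirror the AFR structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if the changelog pre-op/post-op may be skipped.
 * optimistic_enabled is the configured option; child_up[i] != 0 means
 * brick i participates in this transaction.
 */
static bool
can_skip_changelog(bool optimistic_enabled, const unsigned char *child_up,
                   size_t n)
{
    assert(child_up != NULL);

    if (!optimistic_enabled)
        return false;

    /* One missing brick is enough to require the changelog. */
    for (size_t i = 0; i < n; i++) {
        if (!child_up[i])
            return false;
    }
    return true;
}
```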

Change-Id: Ibc0fbfb3fa21c578e28868d9e30b274e33c12064
BUG: 1403646
Signed-off-by: Krutika Dhananjay 
Reviewed-on: http://review.gluster.org/16105
Reviewed-by: Pranith Kumar Karampuri 
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System

selfheal: fix memory leak on client side healing queue

2016-12-05T01:30:28+00:00

> Reviewed-on: http://review.gluster.org/15968
> Tested-by: Pranith Kumar Karampuri 
> NetBSD-regression: NetBSD Build System 
> CentOS-regression: Gluster Build System 
> Reviewed-by: Pranith Kumar Karampuri 
> Reviewed-by: Ravishankar N 
> Smoke: Gluster Build System 
(cherry picked from commit fb95eb4da6f4fc0b9c69e3b159a2214fe47e6d1d)

Change-Id: I2beaba829710565a3246f7449a5cd21755cf5f7d
BUG: 1400927
Signed-off-by: Mateusz Slupny 
Reviewed-on: http://review.gluster.org/16012
Tested-by: Ravishankar N 
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Ravishankar N 
Reviewed-by: Pranith Kumar Karampuri

cluster/afr: When failing fop due to lack of quorum, also log error string

2016-11-11T12:23:54+00:00

        Backport of: http://review.gluster.org/#/c/15800/

Change-Id: I2dd7ed69a456e8b9e54a4093f14dc16950bef081
BUG: 1393630
Signed-off-by: Krutika Dhananjay 
Reviewed-on: http://review.gluster.org/15813
Smoke: Gluster Build System 
CentOS-regression: Gluster Build System 
NetBSD-regression: NetBSD Build System 
Reviewed-by: Pranith Kumar Karampuri

features/shard: Fill loc.pargfid too for named lookups on individual shards

2016-11-09T07:28:19+00:00

        Backport of: http://review.gluster.org/#/c/15788/

On a sharded volume when a brick is replaced while IO is going on, named
lookup on individual shards as part of read/write was failing with
ENOENT on the replaced brick, and as a result AFR initiated name heal in
lookup callback. But since pargfid was empty (which is what this patch
attempts to fix), the resolution of the shards by protocol/server used
to fail and the following pattern of logs was seen:

Brick-logs:

[2016-11-08 07:41:49.387127] W [MSGID: 115009]
[server-resolve.c:566:server_resolve] 0-rep-server: no resolution type
for (null) (LOOKUP)
[2016-11-08 07:41:49.387157] E [MSGID: 115050]
[server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null)
(00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82)
==> (Invalid argument) [Invalid argument]

Client-logs:
[2016-11-08 07:41:27.497687] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.497755] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.498500] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.499680] E [MSGID: 133010]

Also, this patch makes AFR by itself choose a non-NULL pargfid even if
its ancestors fail to initialize all pargfid placeholders.
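
A minimal sketch of "choose a non-NULL pargfid", assuming two
hypothetical candidates (the pargfid carried in the loc and the gfid of
the parent inode) and a 16-byte gfid. The helper names are made up for
the example and are not the GlusterFS uuid API.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define GFID_LEN 16

/* True when all 16 bytes of the gfid are zero. */
static bool
gfid_is_null(const unsigned char *gfid)
{
    static const unsigned char zero[GFID_LEN];

    return memcmp(gfid, zero, GFID_LEN) == 0;
}

/*
 * Returns the first non-null candidate, or NULL if both are null.
 * loc_pargfid and parent_gfid are assumed placeholders for the values
 * that the layers above AFR may or may not have filled in.
 */
static const unsigned char *
choose_pargfid(const unsigned char *loc_pargfid,
               const unsigned char *parent_gfid)
{
    assert(loc_pargfid != NULL && parent_gfid != NULL);

    if (!gfid_is_null(loc_pargfid))
        return loc_pargfid;
    if (!gfid_is_null(parent_gfid))
        return parent_gfid;
    return NULL;
}
```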

Change-Id: Ica9e1b5b196ac37aafe6128e7aa0694a07245fdb
BUG: 1392846
Signed-off-by: Krutika Dhananjay 
Reviewed-on: http://review.gluster.org/15796
Reviewed-by: Pranith Kumar Karampuri 
Reviewed-by: Ravishankar N 
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System

afr,ec: Heal device files with correct major, minor numbers

2016-10-27T06:23:45+00:00

Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the
fix.

 >BUG: 1384297
 >Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
 >Signed-off-by: Pranith Kumar K 
 >Reviewed-on: http://review.gluster.org/15728
 >Smoke: Gluster Build System 
 >NetBSD-regression: NetBSD Build System 
 >Reviewed-by: Xavier Hernandez 
 >CentOS-regression: Gluster Build System 

Change-Id: I7646adc3771ff76cdf9c979b575bbcd0b3bc1b9a
BUG: 1388948
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/15735
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Xavier Hernandez

cluster/afr: Prevent dict_set() on NULL dict

2016-10-17T10:04:45+00:00

When afr receives a NULL dict in lookup, it is supposed to set all the
xattrs it requires in a new dict that it creates. For 'link-count',
however, it was trying to set the key on the dict passed in to lookup,
which can be NULL. This led to error logs. This patch fixes that.
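
The pattern can be illustrated with a toy request dict. req_dict_t and
its helpers are hypothetical stand-ins, not the GlusterFS dict_t API;
the point is only that the key is set on the dict the function creates
and never on the (possibly NULL) dict it was handed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define REQ_MAX_KEYS 8

/* Hypothetical stand-in for a request dict. */
typedef struct {
    const char *keys[REQ_MAX_KEYS];
    int count;
} req_dict_t;

/* Adds a key; fails on a NULL dict, like the error the patch avoids. */
static int
req_set(req_dict_t *req, const char *key)
{
    if (req == NULL || req->count >= REQ_MAX_KEYS)
        return -1;
    req->keys[req->count++] = key;
    return 0;
}

/* Returns 1 if the key is present in the dict, 0 otherwise. */
static int
req_has(const req_dict_t *req, const char *key)
{
    assert(req != NULL && key != NULL);

    for (int i = 0; i < req->count; i++) {
        if (strcmp(req->keys[i], key) == 0)
            return 1;
    }
    return 0;
}

/*
 * Builds the dict sent with lookup. incoming may be NULL. The caller
 * frees the result with free().
 */
static req_dict_t *
build_lookup_req(const req_dict_t *incoming)
{
    req_dict_t *req = calloc(1, sizeof(*req));

    if (req == NULL)
        return NULL;
    if (incoming != NULL)
        *req = *incoming; /* start from the caller's keys, if any */

    /* Set the key on the new dict, never on incoming. */
    if (req_set(req, "link-count") != 0) {
        free(req);
        return NULL;
    }
    return req;
}
```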

 >BUG: 1385104
 >Change-Id: I679af89cfc410cbc35557ae0691763a05eb5ed0e
 >Signed-off-by: Pranith Kumar K 
 >Reviewed-on: http://review.gluster.org/15646
 >Smoke: Gluster Build System 
 >NetBSD-regression: NetBSD Build System 
 >CentOS-regression: Gluster Build System 
 >Reviewed-by: Ravishankar N 

 >BUG: 1385236
 >Change-Id: I802e74e7ad24e183b6653101ad7bf5ab0bf6e55b
 >Signed-off-by: Pranith Kumar K 
 >Reviewed-on: http://review.gluster.org/15650
 >Smoke: Gluster Build System 
 >CentOS-regression: Gluster Build System 
 >NetBSD-regression: NetBSD Build System 
 >Reviewed-by: Ravishankar N 

BUG: 1385442
Change-Id: Ie93b25e8cf52b0d58ef335929dfa10a78a0dd734
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/15651
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Ravishankar N