glusterfs.git/xlators/cluster, branch v8.0

afr: more quorum checks in lookup and new entry marking

2020-06-29T12:51:32+00:00

Problem: See github issue for details.

Fix:
-In lookup if the entry exists in 2 out of 3 bricks, don't fail the
lookup with ENOENT just because there is an entrylk on the parent.
Consider quorum before deciding.

-If entry FOP does not succeed on quorum no. of bricks, do not perform
new entry mark.

Fixes: #1303
Change-Id: I56df8c89ad53b29fa450c7930a7b7ccec9f4a6c5
Signed-off-by: Ravishankar N 
(cherry picked from commit c4a6748f25d2c1ab3ebcf89952278ebf94c8d371)

locks: prevent deletion of locked entries

2020-06-29T12:51:20+00:00

To keep consistency inside transactions started by locking an entry or
an inode, this change delays the removal of entries that are currently
locked by one or more clients. Once all locks are released, the removal
is processed.

It has also been improved the detection of stale inodes in the locking
code of EC.

Fixes: #990
Change-Id: Ic8ba23d9480f80c7f74e7a310bf8a15922320fd5
Signed-off-by: Xavi Hernandez

cluster/afr: Prioritize ENOSPC over other errors

2020-06-16T04:56:19+00:00

Problem:
In a replicate/arbiter volume if file creations or writes fails on
quorum number of bricks and on one brick it is due to ENOSPC and
on other brick it fails for a different reason, it may fail with
errors other than ENOSPC in some cases.

Fix:
Prioritize ENOSPC over other lesser priority errors and do not set
op_errno in posix_gfid_set if op_ret is 0 to avoid receiving any
error_no which can be misinterpreted by __afr_dir_write_finalize().

Also removing the function afr_has_arbiter_fop_cbk_quorum() which
might consider a successful reply form a single brick as quorum
success in some cases, whereas we always need fop to be successful
on quorum number of bricks in arbiter configuration.

Change-Id: I106e267f8b9451f681022f1cccb410d9bc824c08
Fixes: #1254
Signed-off-by: karthik-us 
(cherry picked from commit fa63b45ca5edf172b1b89b28b5db3c5129cc57b6)

cluster/ec: Return correct error code and log message

2020-05-28T07:16:40+00:00

In case of readdir was send with an FD on which opendir
was failed, this FD will be useless and we return it with error.
For now, we are returning it with EINVAL without logging any
message in log file.

Return a correct error code and also log the message to improve thing to debug.

fixes: #1220
Change-Id: Iaf035254b9c5aa52fa43ace72d328be622b06169
(cherry picked from commit af70cb5eedd80207cd184e69f2a4fb252b72d070)

afr: event gen changes

2020-04-24T12:50:13+00:00

The general idea of the changes is to prevent resetting event generation
to zero in the inode ctx, since event gen is something that should
follow 'causal order'.

Change #1:
For a read txn, in inode refresh cbk, if event_generation is
found zero, we are failing the read fop. This is not needed
because change in event gen is only a marker for the next inode refresh to
happen and should not be taken into account by the current read txn.

Change #2:
The event gen being zero above can happen if there is a racing lookup,
which resets even get (in afr_lookup_done) if there are non zero afr
xattrs. The resetting is done only to trigger an inode refresh and a
possible client side heal on the next lookup. That can be acheived by
setting the need_refresh flag in the inode ctx. So replaced all
occurences of resetting even gen to zero with a call to
afr_inode_need_refresh_set().

Change #3:
In both lookup and discover path, we are doing an inode refresh which is
not required since all 3 essentially do the same thing- update the inode
ctx with the good/bad copies from the brick replies. Inode refresh also
triggers background heals, but I think it is okay to do it when we call
refresh during the read and write txns and not in the lookup path.

The .ts which relied on inode refresh in lookup path to trigger heals are
now changed to do read txn so that inode refresh and the heal happens.

Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec
Fixes: #1179
Signed-off-by: Ravishankar N 
Reported-by: Erik Jacobson 
(cherry picked from commit f0fcd909ad4535b60c9208d4804ebe6afe421a09)

dht - Remove "tier" code (part 1)

2020-04-17T04:59:18+00:00

This patch is removing some of the "tier" code in dht xlator, as it is no longer
being used.
Not all of the not-needed code is removed at once, so reviewing is easier.
Follow up patches removing additional unused code will follow.

This is based in the work done in https://review.gluster.org/#/c/glusterfs/+/23935/

Change-Id: I3cb6a0c5d8f14afcd87cf021ef8f74b91c0f908a
updates: #1097
Signed-off-by: Barak Sason Rofman

dht - fixing a permission update issue

2020-04-08T06:57:53+00:00

When bringing back a downed brick and performing lookup from the client
side, the permission on said brick aren't updated on the first lookup,
but only on the second.

This patch modifies permission update logic so the first lookup will
trigger a permission update on the downed brick.

LIMITATIONS OF THE PATCH:
As the choice of source depends on whether the directory has layout or not.
Even the directories on the newly added brick will have layout xattr[zeroed], but the same is not true for a root directory.
Hence, in case in the entire cluster only the newly added bricks are up [and others are down], then any change in permission during this time will be overwritten by the older permissions when the cluster is restarted.

fixes: #999
Change-Id: Ieb70246d41e59f9cae9f70bc203627a433dfbd33
Signed-off-by: Barak Sason Rofman

cluster/afr: Removing unsupported options from code base to improve coverage

2020-04-07T04:26:33+00:00

Support for gluster volume heal  info healed/heal-failed
was removed by commit bb02cfb56ae08f56df4452c2b948fa962ae1212b in
release-3.6. cli parser will display the usage message in all the
supported versions whenever these clis are run, leading to some
dead code in the latest branches. Since support for these clis
were removed long back, this should not give any backward
compatibility issues as well. Hence removing the dead code from
the code base which will lead to better code coverage by the
regression runs as well.

Updates: #1052
Change-Id: I0c2b061469caf233c06d9699b0d159ce48e240b9
Signed-off-by: karthik-us

afr: mark pending xattrs as a part of metadata heal

2020-04-02T04:32:57+00:00

...if pending xattrs are zero for all children.

Problem:
If there are no pending xattrs and a metadata heal needs to be
performed, it can be possible that we end up with xattrs inadvertendly
deleted from all bricks, as explained in the  BZ.

Fix:
After picking one among the sources as the good copy, mark pending xattrs on
all sources to blame the sinks. Now even if this metadata heal fails midway,
a subsequent heal will still choose one of the valid sources that it
picked previously.

Fixes: #1067
Change-Id: If1b050b70b0ad911e162c04db4d89b263e2b8d7b
Signed-off-by: Ravishankar N

dht: gf_defrag_process_dir is called even if gf_defrag_fix_layout has failed

2020-03-24T05:14:07+00:00

Currently even though gf_defrag_fix_layout fails with ENOENT or ESTALE, a
subsequent call is made to gf_defrag_process_dir leading to rebalance failure.

fixes: #1102
Change-Id: Ib0c309fd78e89a000fed3feb4bbe2c5b48e61478
Signed-off-by: Susant Palai