glusterfs.git/xlators/cluster/afr/src, branch v7.2

afr: expose cluster.optimistic-change-log to CLI.

2020-01-09T13:22:22+00:00

Backport of https://review.gluster.org/#/c/glusterfs/+/23960/

This volume option was not made available to the `gluster volume set` CLI.

Reported-by: epolakis(https://github.com/kinsu) in
https://github.com/gluster/glusterfs/issues/781

fixes: bz#1788785
Change-Id: I7141bdd4e53ee99e22b354edde8d023bfc0b2cd7
Signed-off-by: Ravishankar N

afr: make heal info lockless

2019-12-16T05:38:25+00:00

Changes in locks xlator:
Added support for per-domain inodelk count requests.
The caller needs to set the GLUSTERFS_MULTIPLE_DOM_LK_CNT_REQUESTS key in
the dict and then set each key with name
'GLUSTERFS_INODELK_DOM_PREFIX:'.
In the response dict, the xlator sends the per-domain count as the
value for each of these keys.
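
As a rough illustration of the request side, the sketch below builds one
per-domain key. It assumes the text after the colon is the lock domain
name, which the message above does not spell out. The prefix string and
the helper name are placeholders, not the real macro value or a locks
xlator function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Placeholder string (assumption): the real macro value lives in the
 * GlusterFS headers and is not reproduced here. */
#define EXAMPLE_INODELK_DOM_PREFIX "example.inodelk-dom-prefix"

/* Hypothetical helper: writes "<prefix>:<domain>" into buf.
 * Returns 0 on success, -1 if the key does not fit in buf. */
static int
example_build_dom_count_key(char *buf, size_t len, const char *domain)
{
    int n;

    assert(buf != NULL && domain != NULL && len > 0);

    n = snprintf(buf, len, "%s:%s", EXAMPLE_INODELK_DOM_PREFIX, domain);
    if (n < 0 || (size_t)n >= len)
        return -1; /* encoding error or truncated key */
    return 0;
}
```

A caller would build one such key per domain of interest, add each to the
request dict, and read the count back from the same key in the response.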

Changes in AFR:
Replaced afr_selfheal_locked_inspect() with afr_lockless_inspect(). Logic has
been added to make the latter behave the same as the former, thus not
breaking the current heal info output behaviour.

fixes: bz#1783858
Change-Id: Ie9e83c162aa77f44a39c2ba7115de558120ada4d
Signed-off-by: Ravishankar N 
(cherry picked from commit d7e049160a9dea988ded5816491c2234d40ab6b3)

cluster/afr: Heal entries when there is a source & no healed_sinks

2019-11-14T13:18:45+00:00

Problem:
In a situation where B1 blames B2, B2 blames B1 and B3 doesn't blame
anything for entry heal, heal will not complete even though we have
clear source and sinks. This will happen because while doing
afr_selfheal_find_direction() only the bricks which are blamed by
non-accused bricks are considered as sinks. Later in
__afr_selfheal_entry_finalize_source() when it tries to mark all the
non-sources as sinks it fails to do so because there won't be any
healed_sinks marked, no witness present and there will be a source.

Fix:
If there is a source and no healed_sinks, then reset all the locked
sources to 0 and healed sinks to 1 to do conservative merge.
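
The fix can be pictured with the following minimal sketch. It is not the
AFR code itself: the function name and the plain byte arrays are
assumptions made for illustration, standing in for the per-brick
sources, healed_sinks and locked_on arrays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not the actual AFR function): if some locked brick
 * is a source but no locked brick is a healed sink, clear sources[] and set
 * healed_sinks[] on every locked brick so that a conservative merge is done.
 * Returns true when this fallback was applied. */
static bool
example_fallback_to_conservative_merge(unsigned char *sources,
                                       unsigned char *healed_sinks,
                                       const unsigned char *locked_on,
                                       size_t child_count)
{
    bool have_source = false;
    bool have_sink = false;
    size_t i;

    assert(sources != NULL && healed_sinks != NULL && locked_on != NULL);

    for (i = 0; i < child_count; i++) {
        if (!locked_on[i])
            continue;
        if (sources[i])
            have_source = true;
        if (healed_sinks[i])
            have_sink = true;
    }

    if (!have_source || have_sink)
        return false; /* nothing to fix up */

    for (i = 0; i < child_count; i++) {
        if (locked_on[i]) {
            sources[i] = 0;
            healed_sinks[i] = 1;
        }
    }
    return true;
}
```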

Change-Id: If40d8bc95d52a52b2730f55bdcf135109b421548
Fixes: bz#1760699
Signed-off-by: karthik-us

afr: support split-brain CLI for replica 3

2019-11-13T05:04:07+00:00

Ever since we added quorum checks for lookups in afr via commit
bd44d59741bb8c0f5d7a62c5b1094179dd0ce8a4, the split-brain resolution
commands would not work for replica 3 because there would be no
readables for the lookup fop.

The argument was that split-brains do not occur in replica 3, but we do
see (data/metadata) split-brain cases once in a while, which indicates
that there are a few bugs/corner cases yet to be discovered and fixed.

Fortunately, commit 8016d51a3bbd410b0b927ed66be50a09574b7982 added
GF_CLIENT_PID_GLFS_HEALD as the pid for all fops made by glfsheal. If we
leverage this and allow lookups in afr when pid is GF_CLIENT_PID_GLFS_HEALD,
split-brain resolution commands will work for replica 3 volumes too.
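
A minimal sketch of the idea, under stated assumptions: the numeric pid
below is a placeholder (the real constant is defined in the GlusterFS
headers), the helper name is hypothetical, and the -EIO return is only an
assumed stand-in for whatever error the lookup would otherwise fail with.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder value for illustration only (assumption); the real
 * GF_CLIENT_PID_GLFS_HEALD is defined in the GlusterFS headers. */
#define EXAMPLE_PID_GLFS_HEALD (-7)

/* Hypothetical decision helper: returns 0 if the lookup may proceed, or a
 * negative errno when there are no readable bricks and the caller is not
 * glfsheal. */
static int
example_lookup_allowed(int readable_count, int client_pid)
{
    assert(readable_count >= 0);

    if (readable_count > 0)
        return 0; /* normal case: at least one readable brick */
    if (client_pid == EXAMPLE_PID_GLFS_HEALD)
        return 0; /* let glfsheal look up files in split-brain */
    return -EIO;  /* assumed error for the no-readables case */
}
```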

Likewise, the check is added in shard_lookup as well to permit resolving
split-brains by specifying "/.shard/shard-file.xx" as the file name
(which previously used to fail with EPERM).

Change-Id: I3c543dea79caf7cfbc1633e9089cb1cdd2538ba9
Fixes: bz#1760791
Signed-off-by: Ravishankar N 
(cherry picked from commit 47dbd753187f69b3835d2e42fdbe7485874c4b3e)

ctime/rebalance: Heal ctime xattr on directory during rebalance

2019-09-16T10:54:21+00:00

After add-brick and rebalance, the ctime xattr is not present
on rebalanced directories on the new brick. This patch heals the
xattr on those directories during rebalance.

Note that ctime still doesn't support consistent time across
distribute sub-volumes.

This patch also fixes the in-memory inconsistency of time attributes
when metadata is self healed.

Backport of:

 > Patch: https://review.gluster.org/23127/
 > Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
 > BUG: 1734026
 > Signed-off-by: Kotresh HR 
(cherry picked from commit 304640e55c0f3c6d15f4e230dc6376e4f5020fea)

Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
Signed-off-by: Kotresh HR 
fixes: bz#1752429

afr/lookup: Pass xattr_req in while doing a selfheal in lookup

2019-09-11T05:04:47+00:00

We were not passing xattr_req when doing a name self-heal
or a metadata heal. Because of this, some xdata
was missing, which caused I/O errors.

Backport of > https://review.gluster.org/#/c/glusterfs/+/23024/


>Change-Id: Ibfb1205a7eb0195632dc3820116ffbbb8043545f
>Fixes: bz#1728770
>Signed-off-by: Mohammed Rafi KC 

Fixes: bz#1749305
Change-Id: Ibfb1205a7eb0195632dc3820116ffbbb8043545f
Signed-off-by: Mohammed Rafi KC 
(cherry picked from commit d026f0bcfd301712e4f0671ccf238f43f2e6dd30)

afr: wake up index healer threads

2019-08-30T05:04:56+00:00

...whenever shd is re-enabled after disabling or there is a change in
`cluster.heal-timeout`, without needing to restart shd or waiting for the
current `cluster.heal-timeout` seconds to expire.
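
One common way to get this behaviour is a timed wait on a condition
variable that an option-change handler can signal. The sketch below shows
that general pattern only; the struct and function names are hypothetical
and it does not claim to mirror the actual shd code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical healer state; names are illustrative, not the AFR structs. */
struct example_healer {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    bool wake_requested;
};

static void
example_healer_init(struct example_healer *h)
{
    assert(h != NULL);
    pthread_mutex_init(&h->mutex, NULL);
    pthread_cond_init(&h->cond, NULL);
    h->wake_requested = false;
}

/* Would be called when shd is re-enabled or the timeout option changes. */
static void
example_healer_wake(struct example_healer *h)
{
    pthread_mutex_lock(&h->mutex);
    h->wake_requested = true;
    pthread_cond_signal(&h->cond);
    pthread_mutex_unlock(&h->mutex);
}

/* Sleeps for up to timeout_sec seconds. Returns true if woken early by
 * example_healer_wake(), false if the timeout expired. */
static bool
example_healer_wait(struct example_healer *h, int timeout_sec)
{
    struct timespec deadline;
    bool woken;
    int rc = 0;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;

    pthread_mutex_lock(&h->mutex);
    while (!h->wake_requested && rc != ETIMEDOUT)
        rc = pthread_cond_timedwait(&h->cond, &h->mutex, &deadline);
    woken = h->wake_requested;
    h->wake_requested = false; /* consume the wake-up request */
    pthread_mutex_unlock(&h->mutex);

    return woken;
}
```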

See BZ 1743988 for more details.

Change-Id: Ia5ebd7c8e9f5b54cba3199c141fdd1af2f9b9bfe
fixes: bz#1747301
Reported-by: Glen Kiessling 
Signed-off-by: Ravishankar N 
(cherry picked from commit 600ba94183333c4af9b4a09616690994fd528478)

afr: restore timestamp of parent dir during entry-heal

2019-08-21T11:45:24+00:00

Fixes: bz#1741041
Change-Id: I29e338bac62104233a6f80212df8d0fb016affda
Signed-off-by: Ravishankar N 
(cherry picked from commit 8e9c53ebf16705b9a1db2fc486dc24a5cb244ddd)

cluster/ta: Notify the clients only if there are pending heals

2019-07-24T11:01:54+00:00

Problem:
In case of thin arbiter, before index healer starts crawling the
indices at every heal-timeout interval, even if there is nothing to
be healed it will send an upcall notification to all the clients to
release any AFR_TA_DOM_NOTIFY locks that they hold. SHD will wait
for the upcall to return before proceeding with the heal even though
there is nothing to be healed. This also invalidates the cached
information about the brick states on the clients, which leads to
extra calls to TA from the clients for the next reads & writes if
needed, and so hurts IO performance.

Fix:
- Before sending the upcall to the clients, check for any pending heals
on TA without taking any locks.
- If there is nothing marked bad on TA, then continue with the index
crawl to heal any dirty markings present on the files due to any post-op
failure.
- If there is a brick marked as bad on TA, then take the
AFR_TA_DOM_NOTIFY lock on TA from SHD, get the state on TA and
continue with the current healing process.
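
The decision described in the list above can be sketched as follows. The
struct, its fields and the function name are hypothetical; the sketch
only models which steps are taken and performs no locking or network
calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical plan for one index-healer pass on a thin-arbiter volume. */
struct example_ta_plan {
    bool notify_clients; /* take AFR_TA_DOM_NOTIFY, so clients get the upcall */
    bool read_ta_state;  /* fetch the bad-brick state from TA under the lock */
    bool crawl_index;    /* crawl the indices and heal */
};

/* bad_bricks_on_ta is the result of the lockless check on TA. */
static struct example_ta_plan
example_plan_ta_crawl(int bad_bricks_on_ta)
{
    /* Default: nothing is marked bad, so only crawl to clear dirty markings. */
    struct example_ta_plan plan = {false, false, true};

    assert(bad_bricks_on_ta >= 0);

    if (bad_bricks_on_ta > 0) {
        plan.notify_clients = true;
        plan.read_ta_state = true;
    }
    return plan;
}
```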

Change-Id: Ieb477bc6cb18bbdfd4e7a0453c5ed79b574ec9d6
fixes: bz#1729483
Signed-off-by: karthik-us

cluster/afr: Fix incorrect reporting of gfid & type mismatch

2019-07-20T07:35:44+00:00

Problems:
1. When checking for type and gfid mismatch, if the type or gfid
is unknown because the gfid handle and the gfid xattr are missing,
it will be reported as a type or gfid mismatch and the heal will
not complete.

2. If the source selected during entry heal has a null gfid, the same
null gfid is passed to afr_lookup_and_heal_gfid(). When that function
tries to assign the gfid on the bricks where it does not exist, it
uses this null gfid, and the assignment fails in posix_gfid_set().

Fix:
If the gfid sent to afr_lookup_and_heal_gfid() is null choose a
valid gfid before proceeding to assign the gfid on the bricks
where it is missing.

In afr_selfheal_detect_gfid_and_type_mismatch(), do not report
type/gfid mismatch if the type/gfid is unknown or not set.
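
Both parts of the fix can be illustrated with the sketch below, limited
to the gfid case. The struct and helper names are assumptions for
illustration and are not the AFR functions; a gfid is modelled as 16 raw
bytes, with all zeroes meaning "unknown or not set".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_GFID_LEN 16

/* Hypothetical gfid holder; an all-zero value means unknown/not set. */
struct example_gfid {
    unsigned char bytes[EXAMPLE_GFID_LEN];
};

static bool
example_gfid_is_null(const struct example_gfid *g)
{
    static const unsigned char zero[EXAMPLE_GFID_LEN];

    return memcmp(g->bytes, zero, EXAMPLE_GFID_LEN) == 0;
}

/* Returns the index of the first brick with a non-null gfid, or -1. */
static int
example_pick_valid_gfid(const struct example_gfid *gfids, size_t count)
{
    size_t i;

    assert(gfids != NULL);
    for (i = 0; i < count; i++) {
        if (!example_gfid_is_null(&gfids[i]))
            return (int)i;
    }
    return -1;
}

/* Reports a mismatch only when two known (non-null) gfids differ. */
static bool
example_gfid_mismatch(const struct example_gfid *gfids, size_t count)
{
    int ref = example_pick_valid_gfid(gfids, count);
    size_t i;

    if (ref < 0)
        return false; /* nothing known, nothing to compare */
    for (i = (size_t)ref + 1; i < count; i++) {
        if (example_gfid_is_null(&gfids[i]))
            continue; /* unknown gfid is not a mismatch */
        if (memcmp(gfids[i].bytes, gfids[ref].bytes, EXAMPLE_GFID_LEN) != 0)
            return true;
    }
    return false;
}
```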

Change-Id: Ia06552e4dc4a9f89cb7f5302833604bd21bbf7da
fixes: bz#1729481
Signed-off-by: karthik-us