<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/cluster/afr, branch v9dev</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>cluster/afr: Removing unsupported options from code base to improve coverage</title>
<updated>2020-04-07T04:26:33+00:00</updated>
<author>
<name>karthik-us</name>
<email>ksubrahm@redhat.com</email>
</author>
<published>2020-04-02T07:33:35+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=e1ad1483e0e82d70fcc32311b7842bc9a77c04aa'/>
<id>e1ad1483e0e82d70fcc32311b7842bc9a77c04aa</id>
<content type='text'>
Support for gluster volume heal &lt;volname&gt; info healed/heal-failed
was removed by commit bb02cfb56ae08f56df4452c2b948fa962ae1212b in
release-3.6. The CLI parser displays the usage message in all
supported versions whenever these commands are run, leaving dead
code in the latest branches. Since support for these commands was
removed long ago, removing the code should not introduce any
backward compatibility issues. Hence removing the dead code from
the code base, which will also improve the code coverage reported
by the regression runs.

Updates: #1052
Change-Id: I0c2b061469caf233c06d9699b0d159ce48e240b9
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;
</content>
</entry>
<entry>
<title>afr: mark pending xattrs as a part of metadata heal</title>
<updated>2020-04-02T04:32:57+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2018-12-24T07:30:19+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=2d5ba449e9200b16184b1e7fc84cabd015f1f779'/>
<id>2d5ba449e9200b16184b1e7fc84cabd015f1f779</id>
<content type='text'>
...if pending xattrs are zero for all children.

Problem:
If there are no pending xattrs and a metadata heal needs to be
performed, we can end up with xattrs inadvertently deleted from
all bricks, as explained in the BZ.

Fix:
After picking one among the sources as the good copy, mark pending xattrs on
all sources to blame the sinks. Now even if this metadata heal fails midway,
a subsequent heal will still choose one of the valid sources that it
picked previously.
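
To illustrate the idea, a minimal standalone sketch (not the actual
AFR code; the pending-matrix layout and the mark_pending_on_sources()
name are hypothetical):

#include &lt;stdbool.h&gt;
#include &lt;stdio.h&gt;

#define CHILDREN 3

/* Model of AFR-style pending bookkeeping: pending[i][j] &gt; 0 means
 * brick i blames brick j, i.e. j is a sink that must be healed from
 * i.  Marking this before the heal starts means a heal that fails
 * midway still leaves markers pointing at the sources picked. */
static void
mark_pending_on_sources(int pending[CHILDREN][CHILDREN],
                        const bool sources[CHILDREN])
{
    for (int i = 0; i &lt; CHILDREN; i++) {
        if (!sources[i])
            continue;            /* only sources record blame */
        for (int j = 0; j &lt; CHILDREN; j++)
            if (!sources[j])
                pending[i][j]++; /* source i blames sink j */
    }
}

int
main(void)
{
    int pending[CHILDREN][CHILDREN] = {{0}};
    bool sources[CHILDREN] = {true, true, false}; /* brick 2 is the sink */

    mark_pending_on_sources(pending, sources);

    for (int i = 0; i &lt; CHILDREN; i++)
        for (int j = 0; j &lt; CHILDREN; j++)
            if (pending[i][j])
                printf("brick %d blames brick %d\n", i, j);
    return 0;
}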

Fixes: #1067
Change-Id: If1b050b70b0ad911e162c04db4d89b263e2b8d7b
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</content>
</entry>
<entry>
<title>cluster/afr: Fixes for halo</title>
<updated>2020-03-13T13:20:37+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2020-02-04T13:12:33+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=b164a74884becef281b57ef93428bb740e3e342e'/>
<id>b164a74884becef281b57ef93428bb740e3e342e</id>
<content type='text'>
The current implementation assumes that the ping event arrives after
the connect event, but that may not hold when fds need to be
re-opened after the socket connects, which takes extra time. So
handle the ping and child-up events in any order.
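
A minimal standalone sketch of order-independent handling (the flag
names and helpers are hypothetical, not the halo implementation):

#include &lt;stdbool.h&gt;
#include &lt;stdio.h&gt;

/* Per-child state: act only once both facts are known, whichever
 * event arrives first. */
struct child_state {
    bool child_up;  /* connect/child-up event seen */
    bool ping_done; /* first ping latency measured */
};

static void
maybe_activate(struct child_state *cs)
{
    if (cs-&gt;child_up &amp;&amp; cs-&gt;ping_done)
        printf("child usable: both events received\n");
}

static void
on_child_up(struct child_state *cs)
{
    cs-&gt;child_up = true;
    maybe_activate(cs);
}

static void
on_ping(struct child_state *cs)
{
    cs-&gt;ping_done = true;
    maybe_activate(cs);
}

int
main(void)
{
    struct child_state a = {0}, b = {0};
    on_ping(&amp;a);
    on_child_up(&amp;a); /* ping arrived first  */
    on_child_up(&amp;b);
    on_ping(&amp;b);     /* ping arrived second */
    return 0;
}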

fixes: bz#1800583
Change-Id: I6bcdc0caa503bdc039ef2b4739fbf4afae121f05
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
</entry>
<entry>
<title>cluster/afr: fix race when bricks come up</title>
<updated>2020-03-02T07:13:55+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@redhat.com</email>
</author>
<published>2020-03-01T18:49:04+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=d0ac63e462a71c267436d0bbc8fa638c39b9c69f'/>
<id>d0ac63e462a71c267436d0bbc8fa638c39b9c69f</id>
<content type='text'>
There was a problem when self-heal sent lookups at the same time
that one of the bricks was coming up. In this case there was a
chance that the number of 'up' bricks changed in the middle of
sending the requests to the subvolumes, which caused a discrepancy
between the expected number of replies and the actual number of
sent requests.

This discrepancy caused AFR to continue executing before all
requests were complete. Eventually, the frame of the pending
request was destroyed when the operation terminated, causing a
use-after-free when the answer was finally received.

In theory the same thing could happen in reverse, i.e. AFR could
wait for more replies than requests sent, causing a hang.
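
A minimal standalone sketch of the fix pattern (up_mask and
dispatch() are illustrative names, not the actual AFR code): snapshot
the set of up bricks once, and derive both the requests sent and the
replies expected from that same snapshot.

#include &lt;stdatomic.h&gt;
#include &lt;stdio.h&gt;

#define CHILDREN 3

static atomic_uint up_mask = 7; /* live view; may change at any time */

static void
send_lookup(int child)
{
    printf("lookup to child %d\n", child);
}

static void
dispatch(void)
{
    /* One snapshot: a brick coming up mid-loop can no longer make
     * the expected reply count disagree with the requests sent. */
    unsigned mask = atomic_load(&amp;up_mask);
    int expected = __builtin_popcount(mask);

    printf("expecting %d replies\n", expected);
    for (int i = 0; i &lt; CHILDREN; i++)
        if (mask &amp; (1u &lt;&lt; i))
            send_lookup(i);
}

int
main(void)
{
    dispatch();
    return 0;
}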

Change-Id: I7ed6108554ca379d532efb1a29b2de8085410b70
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
Fixes: bz#1808875
</content>
</entry>
<entry>
<title>afr: prevent spurious entry heals leading to gfid split-brain</title>
<updated>2020-02-18T09:11:40+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2020-02-11T09:04:48+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=06453d77d056fbaa393a137ca277a20e38d2f67e'/>
<id>06453d77d056fbaa393a137ca277a20e38d2f67e</id>
<content type='text'>
Problem:
In a hyperconverged setup with granular-entry-heal enabled, if a file is
recreated while one of the bricks is down, and an index heal is triggered
(with the brick still down), entry-self heal was doing a spurious heal
with just the 2 good bricks. It was doing a post-op leading to removal
of the filename from .glusterfs/indices/entry-changes as well as
erroneous setting of afr xattrs on the parent. When the brick came up,
the xattrs were cleared, resulting in the renamed file not getting
healed and leading to gfid split-brain and EIO on the mount.

Fix:
Proceed with entry heal only when shd can connect to all bricks of the replica,
just like in data and metadata heal.
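
A minimal sketch of the guard (standalone model; the function and
parameter names are illustrative, not the actual shd code):

#include &lt;errno.h&gt;

/* Entry heal needs every replica's view of the directory: with any
 * brick down, the remaining "good" copies cannot be trusted, so bail
 * out instead of healing from a partial view. */
int
entry_heal_precheck(int up_children, int total_children)
{
    if (up_children != total_children)
        return -ENOTCONN; /* retry once all bricks are connected */
    return 0;             /* safe to proceed */
}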

fixes: bz#1801624
Change-Id: I916ae26ad1fabf259bc6362da52d433b7223b17e
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</content>
</entry>
<entry>
<title>cluster/thin-arbiter: Wait for TA connection before ta-file lookup</title>
<updated>2020-02-17T06:24:18+00:00</updated>
<author>
<name>Ashish Pandey</name>
<email>aspandey@redhat.com</email>
</author>
<published>2020-01-03T11:24:33+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=a7fa54ddea3fe429f143b37e4de06a93b49d776a'/>
<id>a7fa54ddea3fe429f143b37e4de06a93b49d776a</id>
<content type='text'>
Problem:
When we mount a thin-arbiter (ta) volume, as soon as the 2 data
bricks are connected we consider the mount done and send a
lookup/create on the ta file on the ta node. However, the
connection with the ta node might not have completed yet. Because
of this delay, the ta replica id file will not be created and we
will see an ENOTCONN error in the log file when we do the lookup.

Solution:
Since the ta node can have higher latency, wait a reasonable time
for the connection to happen before sending the lookup/create on
the replica id file.
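
A minimal sketch of a bounded wait (standalone POSIX model; the
ta_conn structure and timeout handling are hypothetical):

#include &lt;pthread.h&gt;
#include &lt;time.h&gt;

struct ta_conn {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int connected; /* set by the connect notification */
};

/* Wait up to 'secs' seconds for the thin-arbiter connection before
 * the first lookup/create on the replica id file.  Returns 0 when
 * connected, ETIMEDOUT otherwise. */
int
ta_wait_for_conn(struct ta_conn *c, int secs)
{
    struct timespec ts;
    int ret = 0;

    clock_gettime(CLOCK_REALTIME, &amp;ts);
    ts.tv_sec += secs;

    pthread_mutex_lock(&amp;c-&gt;lock);
    while (!c-&gt;connected &amp;&amp; ret == 0)
        ret = pthread_cond_timedwait(&amp;c-&gt;cond, &amp;c-&gt;lock, &amp;ts);
    pthread_mutex_unlock(&amp;c-&gt;lock);

    return c-&gt;connected ? 0 : ret;
}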

fixes: bz#1720463
Change-Id: I36f90865afe617e4e84cee57fec832a16f5dd6cc
</content>
</entry>
<entry>
<title>cluster/afr: Check for lock on source &amp; sink before doing data heal</title>
<updated>2020-02-13T09:38:14+00:00</updated>
<author>
<name>karthik-us</name>
<email>ksubrahm@redhat.com</email>
</author>
<published>2019-03-13T07:30:47+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=a3c8a6180a53fb96cc090c5552c64088f68ec4dc'/>
<id>a3c8a6180a53fb96cc090c5552c64088f68ec4dc</id>
<content type='text'>
Problem:
In afr_selfheal_data_block(), we only check that the lock count is
equal to or greater than the number of sinks. With 2 source bricks
and one sink, locking can succeed on only the source bricks; in
that case we continue healing the sink without holding a lock on
it, which is not correct.

Fix:
Check for a lock on at least one source &amp; one sink before starting
the data heal.
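
A minimal sketch of the stronger check (standalone model; locked_on,
sources, and sinks are illustrative bitmaps, not the variables used
in afr_selfheal_data_block()):

#include &lt;stdbool.h&gt;

#define CHILDREN 3

/* Counting locks is not enough: with sources {0,1} and sink {2},
 * locks on bricks 0 and 1 alone satisfy "count &gt;= sinks" while the
 * sink itself stays unlocked.  Require at least one locked source
 * AND at least one locked sink. */
bool
data_heal_lock_ok(const bool locked_on[CHILDREN],
                  const bool sources[CHILDREN],
                  const bool sinks[CHILDREN])
{
    bool src_locked = false, sink_locked = false;

    for (int i = 0; i &lt; CHILDREN; i++) {
        if (locked_on[i] &amp;&amp; sources[i])
            src_locked = true;
        if (locked_on[i] &amp;&amp; sinks[i])
            sink_locked = true;
    }
    return src_locked &amp;&amp; sink_locked;
}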

Change-Id: Iebcb57dcaa4b31831fedfee63d6ca16e9d6c8df8
fixes: bz#1688115
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;
</content>
</entry>
<entry>
<title>tests: Fix spurious self-heald.t failure</title>
<updated>2020-02-11T13:19:13+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2020-02-11T10:12:05+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=d27df94016b5526c18ee964d4a47508326329dda'/>
<id>d27df94016b5526c18ee964d4a47508326329dda</id>
<content type='text'>
Problem:
The heal-info code assumes that every index in the xattrop
directory definitely needs heal. There is one corner case: the
very first xattrop on a file adds the gfid to the 'xattrop' index
in the fop path, and the _cbk path removes it again because the
fop is a zero-xattr xattrop in the success case. In that window,
heal-info could read these gfids and show them as needing heal.

Fix:
Check the pending flag to decide whether the file definitely needs
heal, instead of relying on which index is being crawled at the
moment.
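
A minimal sketch of the decision (standalone model; the scan_reply
structure and pending flag are hypothetical):

#include &lt;stdbool.h&gt;

/* Hypothetical per-file state gathered while crawling the indices. */
struct scan_reply {
    bool pending_set; /* a pending xattr is actually recorded */
};

/* Decide from the recorded pending flag, not from which index the
 * gfid was found in: a gfid transiently added to 'xattrop' by a
 * zero-xattr xattrop carries no pending marker and needs no heal. */
bool
needs_heal(const struct scan_reply *r)
{
    return r-&gt;pending_set;
}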

fixes: bz#1801623
Change-Id: I79f00dc7366fedbbb25ec4bec838dba3b34c7ad5
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
</entry>
<entry>
<title>afr: restore timestamp of files during metadata heal</title>
<updated>2020-01-24T04:35:38+00:00</updated>
<author>
<name>Sheetal Pamecha</name>
<email>spamecha@redhat.com</email>
</author>
<published>2020-01-02T06:35:12+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=7dc4122b84c93cfa6355002fa651a806706e4990'/>
<id>7dc4122b84c93cfa6355002fa651a806706e4990</id>
<content type='text'>
For files: during metadata heal, we restore timestamps only for
non-regular (char, block, etc.) files. Extending this to regular
files as well, since the timestamp can also be updated via the
touch command.
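
A minimal sketch of restoring timestamps from the source copy (plain
POSIX, not the AFR implementation; names are illustrative):

#include &lt;fcntl.h&gt;
#include &lt;sys/stat.h&gt;

/* Copy atime/mtime from the healthy source onto the healed file, so
 * a 'touch' done while a brick was down survives metadata heal for
 * regular files too. */
int
restore_times(const char *sink_path, const struct stat *src)
{
    struct timespec times[2] = {src-&gt;st_atim, src-&gt;st_mtim};

    return utimensat(AT_FDCWD, sink_path, times, 0);
}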

fixes: bz#1787274
Change-Id: I26fe4fb6dff679422ba4698a7f828bf62ca7ca18
Signed-off-by: Sheetal Pamecha &lt;spamecha@redhat.com&gt;
</content>
</entry>
<entry>
<title>multiple xlators: reduce key length</title>
<updated>2020-01-14T17:11:22+00:00</updated>
<author>
<name>Yaniv Kaul</name>
<email>ykaul@redhat.com</email>
</author>
<published>2019-12-09T19:28:00+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=9969d1dc2a3e815b161ce8a3dc5d08f84cfe011f'/>
<id>9969d1dc2a3e815b161ce8a3dc5d08f84cfe011f</id>
<content type='text'>
In many cases, we were freely allocating long keys with no need.
Smaller char arrays are fine almost everywhere, so went ahead and
used smaller ones where possible.

In some cases, annotated functions as static and the passed
prefixes as const, as that made the code easier to read and
understand.

Where relevant, converted the dict functions to use a known key
length.
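
A minimal sketch of the pattern (illustrative only; set_key_n() is a
hypothetical stand-in for a dict call that accepts a precomputed key
length):

#include &lt;stdio.h&gt;

/* A dict-style setter taking the key length, so the callee can skip
 * a strlen() of its own. */
static void
set_key_n(const char *key, size_t keylen, int value)
{
    printf("set %.*s = %d\n", (int)keylen, key, value);
}

static void
example(int client_id)
{
    /* A small stack buffer replaces a heap-allocated key, and
     * snprintf's return value doubles as the key length. */
    char key[64];
    int keylen = snprintf(key, sizeof(key), "trusted.afr.client-%d",
                          client_id);

    set_key_n(key, (size_t)keylen, 1);
}

int
main(void)
{
    example(0);
    return 0;
}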

Change-Id: I882ab33ea20d90b63278336cd1370c09ffdab7f2
updates: bz#1193929
Signed-off-by: Yaniv Kaul &lt;ykaul@redhat.com&gt;
</content>
</entry>
</feed>
