glusterfs.git/tests/bugs/replicate, branch v3.12.14

afr: heal gfids when file is not present on all bricks

2018-07-11T14:03:47+00:00

Backport of https://review.gluster.org/#/c/20271/ (only change is in .t)

commit 20fa80057eb430fd72b4fa31b9b65598b8ec1265 introduced a regression
wherein if a file is present in only 1 brick of replica *and* doesn't
have a gfid associated with it, it doesn't get healed upon the next
lookup from the client. Fix it.

Change-Id: I7d1111dcb45b1b8b8340a7d02558f05df70aa599
BUG: 1598121
fixes: bz#1598121
Signed-off-by: Ravishankar N 
(cherry picked from commit eb472d82a083883335bc494b87ea175ac43471ff)

afr: fix bug-1363721.t failure

2018-07-09T10:03:07+00:00

Backport of https://review.gluster.org/#/c/20036/
Note: We need to update the inode context's write_subvol even in the case of
compound fops. This is not present in master and 4.1, since compound FOPs were
removed there.

Problem:
In the .t, when the only good brick was brought down, writes on the fd were
still succeeding on the bad bricks. The inflight split-brain check was
marking the write as failure but since the write succeeded on all the
bad bricks, afr_txn_nothing_failed() was set to true and we were
unwinding writev with success to DHT and then catching the failure in
post-op in the background.

Fix:
Don't wind the FOP phase if the write_subvol (which is populated with readable
subvols obtained in pre-op cbk) does not have at least 1 good brick which was up
when the transaction started.
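
The check described in the fix can be sketched roughly as below. This is a
simplified Python model, not the AFR C code: should_wind_fop() and the two
per-brick boolean lists are names assumed purely for illustration.

```python
def should_wind_fop(write_subvol, up_at_start):
    """Return True if at least one good brick was up when the txn started.

    write_subvol[i]: True if brick i was readable (good) in the pre-op cbk.
    up_at_start[i]:  True if brick i was up when the transaction started.
    Both lists are a hypothetical stand-in for AFR's per-brick state.
    """
    return any(good and up for good, up in zip(write_subvol, up_at_start))


# Brick 0 is the only good brick and it is down: do not wind the write.
print(should_wind_fop([True, False, False], [False, True, True]))  # False
# Brick 0 is good and up: the FOP phase may proceed.
print(should_wind_fop([True, False, False], [True, True, False]))  # True
```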

Change-Id: I4a1fef4569609c31cffeaef591a64c10870e8d0b
BUG: 1598720
Signed-off-by: Ravishankar N

afr: add quorum checks in post-op

2018-07-04T04:04:22+00:00

afr relies on pending changelog xattrs to identify sources and sinks, and the
setting of these xattrs happens in post-op. So if post-op fails, we need to
unwind the write txn with a failure.
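
As a rough illustration only (a Python model, not the actual AFR code), the
post-op quorum check amounts to counting the bricks on which the post-op
succeeded. post_op_result() and the quorum_count parameter are hypothetical;
how AFR derives the required count from its quorum options is not modelled.

```python
def post_op_result(post_op_ok, quorum_count):
    """Return 0 if post-op succeeded on enough bricks, else -1.

    post_op_ok[i]: True if the post-op (xattr update) succeeded on brick i.
    quorum_count:  number of successes required; supplied by the caller
                   here because the real quorum calculation is not modelled.
    """
    succeeded = sum(1 for ok in post_op_ok if ok)
    return 0 if succeeded >= quorum_count else -1


# Post-op landed on only one of three bricks while two are required,
# so the write txn is unwound with failure.
print(post_op_result([True, False, False], quorum_count=2))  # -1
```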

Change-Id: I0f019ac03890108324ee7672883d774918b20be1
BUG: 1597120
Signed-off-by: Ravishankar N 
(cherry picked from commit a40a87ec3b226ae86a6ed8f4af25b45965a20cad)

afr: don't treat all cases of all bricks being blamed as split-brain

2018-07-04T04:03:45+00:00

Problem:
We currently don't have a roll-back/undoing of post-ops if quorum is not
met. Though the FOP is still unwound with failure, the xattrs remain on
the disk. Due to these partial post-ops and partial heals (healing only when
2 bricks are up), we can end up in split-brain purely from the afr
xattrs point of view, i.e. each brick is blamed by at least one of the
others. These scenarios are hit when there is frequent
connect/disconnect of the client/shd to the bricks while I/O or heal
are in progress.

Fix:
Instead of undoing the post-op, pick a source based on the xattr values.
If 2 bricks blame one, the blamed one must be treated as sink.
If there is no majority, all are sources. Once we pick a source,
self-heal will then do the heal instead of erroring out due to
split-brain.
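
The source selection rule above can be sketched for a 3-brick replica as
follows. This is a simplified Python model under stated assumptions, not the
self-heal code: pick_sources() and the boolean blame matrix are hypothetical,
standing in for non-zero pending xattrs.

```python
def pick_sources(blames):
    """Split bricks into sources and sinks from a blame matrix.

    blames[i][j] is True if brick i holds a pending xattr blaming brick j.
    A brick blamed by 2 other bricks is a sink; every other brick is a
    source. The description does not cover the case where every brick is
    blamed by 2 others; this sketch then returns no sources and leaves
    that decision to the caller.
    """
    n = len(blames)
    sinks = set()
    for j in range(n):
        accusers = sum(1 for i in range(n) if i != j and blames[i][j])
        if accusers >= 2:
            sinks.add(j)
    sources = [b for b in range(n) if b not in sinks]
    return sources, sorted(sinks)


# Bricks 0 and 1 blame brick 2; brick 2 blames both of them.
majority = [[False, False, True],
            [False, False, True],
            [True, True, False]]
print(pick_sources(majority))  # ([0, 1], [2])

# Circular blame, no majority: all bricks are sources.
circular = [[False, True, False],
            [False, False, True],
            [True, False, False]]
print(pick_sources(circular))  # ([0, 1, 2], [])
```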

Change-Id: I3d0224b883eb0945785ade0e9697a1c828aec0ae
BUG: 1597123
Signed-off-by: Ravishankar N 
(cherry picked from commit 0e6e8216823c2d9dafb81aae0f6ee3497c23d140)

storage/posix: Fix posix_symlinks_match()

2018-07-04T04:03:10+00:00

1) snprintf into linkname_expected should use PATH_MAX as the buffer size
2) linkname_actual should be compared against the complete
   linkname_expected string

fixes bz#1595528
Change-Id: Ic3b3c362dc6c69c046b9a13e031989be47ecff14
BUG: 1595528
Signed-off-by: Pranith Kumar K

mount/fuse: use fstat in getattr implementation if any opened fd is available

2018-01-02T08:26:42+00:00

The restriction of using only fds opened by the same Pid means fds cannot
be shared across threads of a multithreaded application. Note that fops
from the kernel have a different Pid for each thread. Imagine the
following sequence of operations:

* Turn off performance.open-behind
* Thread t1 opens an fd - fd1 - on file "file". Let's assume nodeid of
  "file" is "nodeid-file".
* Thread t2 does RENAME ("newfile", "file"). Let's assume nodeid of
  "newfile" as "nodeid-newfile".
* t2 proceeds to do fstat (fd1)

The above set of operations can sometimes result in ESTALE/ENOENT
errors. RENAME overwrites "file" with "newfile", changing its nodeid
from "nodeid-file" to "nodeid-newfile", and after the RENAME "nodeid-file"
is removed from the backend. fstat can still carry nodeid-file as its
argument if lookup has not yet refreshed the nodeid of "file". Since t2
doesn't have an fd opened, fuse_getattr_resume then uses STAT, which
fails because "nodeid-file" no longer exists.

Since the above set of operations and sharing of fds across
multiple threads are valid, this is a bug.

The fix is to use any fd opened on the inode. In this specific example
fuse_getattr_resume finds fd1 and winds the call down as fstat
(fd1), which won't fail.
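
A minimal sketch of that decision, as a Python model rather than the fuse
bridge code: choose_getattr_call() and its open_fds argument are hypothetical
names, standing in for the lookup of fds opened on the inode.

```python
def choose_getattr_call(open_fds):
    """Decide how a getattr request is wound down.

    open_fds: fds currently opened on the inode by any Pid/thread.
    Returns ("fstat", fd) if any fd is available, otherwise ("stat", None).
    """
    if open_fds:
        # Any fd opened on the inode will do, not only one opened by the caller.
        return ("fstat", open_fds[0])
    return ("stat", None)


# t2 has no fd of its own, but fd1 opened by t1 is found and used.
print(choose_getattr_call(["fd1"]))  # ('fstat', 'fd1')
# No fd is open on the inode at all: fall back to a path-based stat.
print(choose_getattr_call([]))       # ('stat', None)
```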

Cross-checked with "Miklos Szeredi" for
any security issues with this solution and he approves the solution.

Thanks to "Miklos Szeredi" for all the
pointers and discussions.

>Change-Id: I88dd29b3607cd2594eee9d72a1637b5346c8d49c
>BUG: 1510401
>Signed-off-by: Raghavendra G 
(cherry picked from commit 8b57378e5596f287a7b9d106dd6fb56a624b42ee)
Change-Id: I88dd29b3607cd2594eee9d72a1637b5346c8d49c
BUG: 1529085
Signed-off-by: Raghavendra G

cluster/afr: Make choose-local "reconfigurable"

2017-10-12T18:46:10+00:00

        Backport of:
        > Change-Id: Ibab292ba705d993b475cd0303fb3318211fb2500
        > Reviewed-on: https://review.gluster.org/18026
        > BUG: 1480525
        > cherry-picked from commit 1e2d6537875d16b783e3c50ada7ee61487c6d796

With this change, enabling choose-local (i.e. switching its state from
"off" to "on") takes effect after the first gfid-lookup on "/" that
follows the volume-set.

Change-Id: Ibab292ba705d993b475cd0303fb3318211fb2500
BUG: 1501022
Signed-off-by: Krutika Dhananjay

afr: heal gfid as a part of entry heal

2017-10-10T05:33:15+00:00

Problem:
If a brick crashes after an entry (file or dir) is created but before
gfid is assigned, the good bricks will have pending entry heal xattrs
but the heal won't complete because afr_selfheal_recreate_entry() tries
to create the entry again and it fails with EEXIST.

Fix:
We could have fixed posix_mknod/mkdir etc. to assign the gfid if the file
already exists, but the right thing to do seems to be to trigger a lookup
on the bad brick and let it heal the gfid instead of winding an
mknod/mkdir in the first place.

(cherry picked from commit 20fa80057eb430fd72b4fa31b9b65598b8ec1265)
Change-Id: I82f76665a7541f1893ef8d847b78af6466aff1ff
BUG: 1499202
Signed-off-by: Ravishankar N

glusterd: fix client io-threads option for replicate volumes

2017-10-09T12:21:08+00:00

Backport of https://review.gluster.org/#/c/18430/

Problem:
Commit ff075a3d6f9b142911d25c27fd209838782bfff0 disabled loading
client-io-threads for replicate volumes (it was set to on by default in
commit e068c1997314046658dd502e9118dab32decf879) due to performance
issues but in doing so, inadvertently failed to load the xlator even if
the user explicitly enabled the option using the volume set command.
This was despite returning success for the volume set.

Fix:
Modify the check in perfxl_option_handler() and add checks in volume
create/add-brick/remove-brick code paths, tying it all to
GD_OP_VERSION_3_12_2.

Change-Id: Ib612973a999a7da818cc926f5c2601b1f0794fcf
BUG: 1499158
Signed-off-by: Ravishankar N

afr: auto-resolve split-brains for zero-byte files

2017-10-05T12:17:05+00:00

Problems:
As described in BZ 1491670, renaming hardlinks can result in data/mdata
split-brain of the DHT link-to files (T files) without any mismatch of
data and metadata.

As described in BZ 1486063, for a zero-byte file with only dirty bits
set, arbiter brick will likely be chosen as the source brick.

Fix:
For zero-byte files in split-brain, pick the first brick as
a) data source if file size is zero on all bricks.
b) metadata source if metadata is the same on all bricks.

In arbiter case, if file size is zero on all bricks and there are no
pending afr xattrs, pick 1st brick as data source.
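
Rules a) and b) can be sketched as below. This is a simplified Python model
and not the AFR implementation: zero_byte_sources() and its arguments are
hypothetical, and the arbiter case (which additionally requires that no
pending afr xattrs exist) is not modelled.

```python
def zero_byte_sources(sizes, metadata):
    """Pick sources for a file in split-brain using rules a) and b).

    sizes[i]:    file size reported by brick i.
    metadata[i]: any comparable summary of brick i's metadata
                 (for example a tuple of mode, uid and gid).
    Returns (data_source, metadata_source); each is 0 (the first brick)
    when its rule applies, otherwise None.
    """
    data_source = 0 if all(size == 0 for size in sizes) else None
    same_metadata = all(m == metadata[0] for m in metadata)
    metadata_source = 0 if same_metadata else None
    return data_source, metadata_source


# Zero bytes everywhere and identical metadata: first brick is both sources.
same = (0o644, 0, 0)
print(zero_byte_sources([0, 0, 0], [same, same, same]))  # (0, 0)
# A non-empty copy exists on brick 1: rule a) does not apply.
print(zero_byte_sources([0, 4096, 0], [same, same, same]))  # (None, 0)
```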

(cherry picked from commit 1719cffa911c5287715abfdb991bc8862f0c994e)
Change-Id: I0270a9a2f97c3b21087e280bb890159b43975e04
BUG: 1496317
Signed-off-by: Ravishankar N 
Reported-by: Rahul Hinduja 
Reported-by: Mabi