glusterfs.git/tests/bugs, branch v3.12.9

features/index: Choose different base file on EMLINK error

2018-04-12T05:20:36+00:00

Change-Id: I4648816af908539efdc2528608aa2ebf7f0d0e2f
fixes: bz#1565655
Signed-off-by: Pranith Kumar K 
(cherry picked from commit bb12f2109a01856e8184e13cf984210d20155b13)

cluster/ec: avoid delays in self-heal

2018-04-06T12:48:38+00:00

Self-heal creates a thread per brick to sweep the index looking for
files that need to be healed. These threads are started before the
volume comes online, so nothing is done but waiting for the next
sweep. This happens once per minute.

When a replace brick command is executed, the new graph is loaded and
all index sweeper threads started. When all bricks have reported, a
getxattr request is sent to the root directory of the volume. This
causes a heal on it (because the new brick doesn't have good data),
and marks its contents as pending to be healed. This is done by the
index sweeper thread on the next round, one minute later.

This patch solves this problem by waking all index sweeper threads
after a successful check on the root directory.

Additionally, the index sweep thread scans the index directory
sequentially, but it might happen that after healing a directory entry
more index entries are created but skipped by the current directory
scan. This causes the remaining entries to be processed on the next
round, one minute later. The same can happen in the next round, so
the heal is running in bursts and taking a lot to finish, specially
on volumes with many directory levels.

This patch solves this problem by immediately restarting the index
sweep if a directory has been healed.

Backport of:
> BUG: 1547662

Change-Id: I58d9ab6ef17b30f704dc322e1d3d53b904e5f30e
BUG: 1555201
Signed-off-by: Xavi Hernandez

nfs: Fixing the failure in bug-974972.t test case

2018-02-12T10:13:32+00:00

Problem:
gNFS servers is restarted even before the pending marker is set
becuase of eager lock being on, leading to a file in split-brain
to get healed.

Fix:
Switch off the eager lock.
This is fixed in the later versions by the patch:
https://review.gluster.org/#/c/13075/

Change-Id: If59e5ede2de1cbfcdeac01cca38dc0d046f1993c
BUG: 1542475
Signed-off-by: karthik-us

tests: mark test bad

2018-02-07T06:10:46+00:00

Please remove after fixing the test.

Change-Id: If6c36dc6c395730dfb17b5b4df6f24629d904926
BUG: 1542826
Signed-off-by: N Balachandran

tests: Spurious failure multiplex-limit-issue-151.t

2018-02-06T16:34:25+00:00

A timing issue caused the remove-brick commit to fail.
Replaced 'remove-brick commit' with 'remove-brick force'.

> Change-Id: I69144b2f7be34095dbd3a7d182e0bf01b27fb0a4
> BUG: 1517904
> Signed-off-by: N Balachandran 

Change-Id: I69144b2f7be34095dbd3a7d182e0bf01b27fb0a4
BUG: 1542615
Signed-off-by: N Balachandran

tests: Fix a broken test case

2018-02-05T13:29:58+00:00

Issue: When using statedump command to take statedump of the
gfapi process, we specify the following things:

$gluster volume statedump  client :
   pid: Pid of the gfapi application
   host: This should be the IP/hostname as seen by the glusterd,
         the gfapi application is connected to.

In this test case, if gfapi application is running locally,
and is connected to $H1 glusterd, the  need not be $H1.
 could be localhost, 127.0.0.1, 127.1.1.1 etc. based on
the configuration of the system. Hence use netstat to find the
right  value.

>mainline patch : https://review.gluster.org/#/c/18893/

Change-Id: I6efb9d1ccaf9c6841a9ab7c9ebfecafc03c0bc5e
BUG: 1542054
Signed-off-by: Poornima G

cluster/ec: OpenFD heal implementation for EC

2018-02-02T06:50:35+00:00

Existing EC code doesn't try to heal the OpenFD to
avoid unnecessary healing of the data later.

Fix implements the healing of open FDs before
carrying out file operations on them by making an
attempt to open the FDs on required up nodes.

Backport of:
>BUG: 1431955

BUG: 1536334
Change-Id: Ib696f59c41ffd8d5678a484b23a00bb02764ed15
Signed-off-by: Sunil Kumar Acharya

posix: delete stale gfid handles in nameless lookup

2018-01-16T04:54:21+00:00

..in order for self-heal of symlinks to work properly (see BZ for
details).

Backport of https://review.gluster.org/#/c/19070/
Signed-off-by: Ravishankar N 

Change-Id: I9a011d00b07a690446f7fd3589e96f840e8b7501
BUG: 1534847

cluster/dht: Use percentages for space check

2018-01-12T05:37:34+00:00

With heterogenous bricks now being supported in DHT
we could run into issues where files are not migrated
even though there is sufficient space in newly added bricks
which just happen to be considerably smaller than older
bricks. Using percentages instead of absolute available
space for space checks can mitigate that to some extent.

Marking bug-1247563.t as that used to depend on the easier
code to prevent a file from migrating. This will be removed
once we find a way to force a file migration failure.

> Change-Id: I3452520511f304dbf5af86f0632f654a92fcb647
> BUG: 1529440
> Signed-off-by: N Balachandran 

Change-Id: I3452520511f304dbf5af86f0632f654a92fcb647
BUG: 1530455
Signed-off-by: N Balachandran

mount/fuse: use fstat in getattr implementation if any opened fd is available

2018-01-02T08:26:42+00:00

The restriction of using fds opened by the same Pid means fds cannot
be shared across threads of multithreaded application. Note that fops
from kernel have different Pid for different threads. Imagine
following sequence of operations:

* Turn off performance.open-behind
* Thread t1 opens an fd - fd1 - on file "file". Let's assume nodeid of
  "file" is "nodeid-file".
* Thread t2 does RENAME ("newfile", "file"). Let's assume nodeid of
  "newfile" as "nodeid-newfile".
* t2 proceeds to do fstat (fd1)

The above set of operations can sometimes result in ESTALE/ENOENT
errors. RENAME overwrites "file" with "newfile" changing its nodeid
from "nodeid-file" to "nodeid-newfile" and post RENAME, "nodeid-file" is
removed from the backend. If fstat carries nodeid-file as argument,
which can happen if lookup has not refreshed the nodeid of "file" and
since t2 doesn't have an fd opened, fuse_getattr_resume uses STAT
which will fail as "nodeid-file" no longer exists.

Since the above set of operations and sharing of fds across
multiple threads are valid, this is a bug.

The fix is to use any fd opened on the inode. In this specific example
fuse_getattr_resume will find fd1 and winds down the call as fstat
(fd1) which won't fail.

Cross-checked with "Miklos Szeredi"  for
any security issues with this solution and he approves the solution.

Thanks to "Miklos Szeredi"  for all the
pointers and discussions.

>Change-Id: I88dd29b3607cd2594eee9d72a1637b5346c8d49c
>BUG: 1510401
>Signed-off-by: Raghavendra G 
(cherry picked from commit 8b57378e5596f287a7b9d106dd6fb56a624b42ee)
Change-Id: I88dd29b3607cd2594eee9d72a1637b5346c8d49c
BUG: 1529085
Signed-off-by: Raghavendra G