glusterfs.git/tests/bugs/replicate, branch v3.9dev

glusterd: default value of nfs.disable, change from false to true

2016-04-27T07:59:30+00:00

Next step in eventual deprecation of glusterfs nfs server in favor
of ganesha.nfsd.

Also replace several open-coded strings with constants.

Change-Id: If52f5e880191a14fd38e69b70a32b0300dd93a50
BUG: 1092414
Signed-off-by: Kaleb S KEITHLEY 
Reviewed-on: http://review.gluster.org/13738
NetBSD-regression: NetBSD Build System 
Smoke: Gluster Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Atin Mukherjee 
Tested-by: Atin Mukherjee 
Reviewed-by: Niels de Vos

cluster/ec: Rebalance hangs during rename

2016-03-30T08:51:11+00:00

Problem:
While a file is being renamed, ec holds a blocking
inodelk on the parent directory. If a rename of
another file under the same directory arrives in the
meantime, ec does not release the lock; it goes ahead
and renames the "new" file using the already held lock.

This blocks the rebalance process on a lock that
has been acquired by rename.

Solution:
When a rename fop arrives, ec takes a blocking inodelk
on the old and new parent of the file. Before releasing
any lock it holds, ec waits for some "time" to see
whether that lock can be reused by the next fop.
If another request arrives within this "time",
ec releases the lock based on the condition
"lock count > 1".

To get this "lock count" for the rename fop, we have
implemented "pl_rename" in features/locks. On the ec
side, the condition for releasing the lock now depends
on the type of fop and on the old and new parent
directories.
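
The "lock count > 1" condition can be sketched as below. This is only an
illustrative Python model, not the ec or locks translator code: the function
name and the dictionary of counts are hypothetical, and it assumes each held
parent-directory lock is judged by its own count.

```python
def locks_to_release_early(held_locks):
    """Return the directories whose lock should be given up before the wait ends.

    held_locks maps a directory name to the lock count reported for it
    (hypothetical representation). A count above 1 means some other
    request is also waiting for that lock.
    """
    return [directory for directory, count in held_locks.items() if count > 1]
```

For a rename, the mapping would hold the old and the new parent directory;
a fop on a single directory would pass one entry.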

Change-Id: I979dbab1185df962e8f305a6074ae1186ffe7db0
Bug: 1304988
Signed-off-by: Ashish Pandey 
Reviewed-on: http://review.gluster.org/13460
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri 
Reviewed-by: Krutika Dhananjay

cluster/ec: Provide an option to enable/disable eager lock

2016-03-16T04:54:28+00:00

Problem: If a fop takes a lock and completes its operation,
it waits for 1 second before releasing the lock. However,
if ec finds any lock contention within this period, it
releases the lock immediately, before the time expires. As we
take the lock on the first brick, for a few operations, like read,
discovering the lock contention might take a long time and
can degrade performance.

Solution: Provide an option to enable/disable eager lock.
If eager lock is disabled, the lock is released as soon
as the fop completes.

gluster v set <VOLNAME> disperse.eager-lock on
gluster v set <VOLNAME> disperse.eager-lock off
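
The effect of the option on the release timing can be modelled as below.
This is a minimal sketch, not the ec implementation: the function and
constant names are hypothetical, and only the 1 second wait comes from the
description above.

```python
EAGER_LOCK_WAIT_SECONDS = 1.0  # wait described in the commit message


def release_delay(eager_lock_enabled, contention_seen):
    """Return how long to keep the lock after the fop completes, in seconds."""
    # With eager lock off, or once contention is noticed, release at once.
    if not eager_lock_enabled or contention_seen:
        return 0.0
    # Otherwise hold the lock so the next fop can reuse it.
    return EAGER_LOCK_WAIT_SECONDS
```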

Change-Id: I000985a787eba3c190fdcd5981dfbf04e64af166
BUG: 1314649
Signed-off-by: Ashish Pandey 
Reviewed-on: http://review.gluster.org/13605
Smoke: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri 
Tested-by: Pranith Kumar Karampuri 
CentOS-regression: Gluster Build System 
NetBSD-regression: NetBSD Build System

afr: Add throttled background client-side heals

2016-03-01T11:23:20+00:00

If a heal is needed after inode refresh (lookup, read_txn), launch it in
the background instead of blocking the fop (that triggered the refresh) until
the heal happens.

afr_replies_interpret() is modified such that the heal is
launched only if at least one sink brick is up.

The maximum number of heals that can happen in parallel is configurable via
the 'background-self-heal-count' volume option. Any heals beyond that number
are put in a wait queue whose length is configurable via the
'heal-wait-queue-length' volume option. If the wait queue is also full,
further heals are ignored.

Default values: background-self-heal-count=8, heal-wait-queue-length=128
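
The throttling policy can be sketched as follows. This is a simplified Python
model rather than the AFR code: the class and method names are hypothetical,
and promoting the oldest queued heal when a running one finishes is an
assumption. The two defaults are the values given above.

```python
from collections import deque


class HealThrottle:
    """Bounded set of running heals plus a bounded wait queue (illustrative)."""

    def __init__(self, max_running=8, max_waiting=128):
        self.max_running = max_running  # background-self-heal-count
        self.max_waiting = max_waiting  # heal wait queue length
        self.running = set()
        self.waiting = deque()

    def submit(self, heal_id):
        """Start the heal, queue it, or ignore it when both limits are hit."""
        if len(self.running) < self.max_running:
            self.running.add(heal_id)
            return "started"
        if len(self.waiting) < self.max_waiting:
            self.waiting.append(heal_id)
            return "queued"
        return "ignored"

    def finish(self, heal_id):
        """Mark a heal done; return the queued heal started in its place, if any."""
        self.running.discard(heal_id)
        if self.waiting and len(self.running) < self.max_running:
            next_heal = self.waiting.popleft()  # assumption: FIFO promotion
            self.running.add(next_heal)
            return next_heal
        return None
```

With max_running=2 and max_waiting=1, four submissions in a row yield
"started", "started", "queued" and "ignored".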

Change-Id: I1d4a52814cdfd43d90591b6d2ad7b6219937ce70
BUG: 1297172
Signed-off-by: Ravishankar N 
Reviewed-on: http://review.gluster.org/13207
Smoke: Gluster Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri 
Tested-by: Pranith Kumar Karampuri 
NetBSD-regression: NetBSD Build System

heal: Remove sleep()

2016-02-11T17:05:46+00:00

I wrote this program from a sample gfapi program which had a sleep() call.
I am not sure why this sleep was needed, so I am removing it now.

Changed tests/bugs/replicate/bug-1190069-afr-stale-index-entries.t
to execute count_sh_entries every second, instead of comparing
the same value over and over.
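
The idea of re-running the probe on every check, rather than comparing one
captured value repeatedly, can be sketched as below. This is an illustrative
Python helper, not the shell test framework: wait_for_value and its
parameters are hypothetical names, and probe stands in for a command such as
count_sh_entries.

```python
import time


def wait_for_value(expected, probe, timeout=10.0, interval=1.0, sleep=time.sleep):
    """Call probe() once per interval until it returns expected or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        # Re-evaluate the probe on every pass so a changing value is noticed.
        if probe() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

The sleep parameter is injectable so the helper can be exercised without
real delays.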

Change-Id: I7b89d6cab3e50bb7bf4d40a6064f2d8734155bea
BUG: 1306199
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/13421
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Krutika Dhananjay

cluster/afr: Fix heal-info slow response while IO is in progress

2016-02-04T06:58:37+00:00

heal-info now does an open() on the file being examined. The client
then at some point sees the open-fd count exceed 1 and releases
the eager-lock, so heal-info no longer remains blocked
until the IO completes.

Change-Id: Icc478098e2bc7234408728b54d8185102b3540dc
BUG: 1297695
Signed-off-by: Krutika Dhananjay 
Reviewed-on: http://review.gluster.org/13326
Reviewed-by: Ravishankar N 
Smoke: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri 
Tested-by: Pranith Kumar Karampuri 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System

tests: Fix spurious failure in bug-1221481-allow-fops-on-dir-split-brain.t.

2016-01-23T06:02:04+00:00

Occasionally, when ls is executed, a STAT is wound on the operand directory
prior to READDIRP. AFR fails STAT with EIO if the directory is in metadata
split-brain, which "dir" is in the test case in question. As a result, ls also
fails with EIO, causing test 20 to return a non-zero exit status.
The fix is in the test script: the parts that cause the dir to go into
metadata split-brain have been removed. Now "dir" will only be in entry
split-brain.

Change-Id: I4e4e6ba0a2401c7168719cd44e5f4f4bcb8fdd89
BUG: 1295702
Signed-off-by: Krutika Dhananjay 
Reviewed-on: http://review.gluster.org/13172
Reviewed-by: Pranith Kumar Karampuri 
Tested-by: Pranith Kumar Karampuri 
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System

tests: handle bad objects during lookup/inode_refresh

2015-12-28T11:03:21+00:00

Change-Id: I1848f0e9243c9376e0deba6738757350fe8b704a
BUG: 1290965
Signed-off-by: Ravishankar N 
Reviewed-on: http://review.gluster.org/13044
Tested-by: NetBSD Build System 
Tested-by: Gluster Build System 
Reviewed-by: Venky Shankar 
Reviewed-by: Pranith Kumar Karampuri

cluster/afr: Fix data loss due to race between sh and ongoing write

2015-12-22T08:29:07+00:00

Problem:

When IO is happening on a file and a brick goes down and comes back up
during this time, the protocol/client translator attempts to reopen the
fd on the gfid handle of the file. But if another client renames this
file while the brick was down and writes were in progress on it, then once
the brick is back up there can be a race between the reopening of the fd
and entry self-heal replaying the effect of the rename() on the sink brick.
If the reopening of the fd happens first, the application's writes
continue to go into the data blocks associated with the gfid.
Entry self-heal then deletes 'src' and creates the 'dst' file on the sink,
marking dst as a 'newentry'. Data self-heal is also completed on 'dst'
as a result, and self-heal terminates. If at this point the application
is still writing to this fd, all writes on the file after self-heal
go into the data blocks associated with this fd, which are
lost once the fd is closed. The result: the 'dst' file on the source
and the sink are not the same and there is no pending heal on the file,
leading to silent corruption on the sink.

Fix:

Leverage http://review.gluster.org/#/c/12816/ to ensure the gfid handle
path gets saved in .glusterfs/unlink until the fd is closed on the file.
During this time, when self-heal sends mknod() with the gfid of the file,
do the following:
1. link() the gfid handle under .glusterfs/unlink to the new path to be
   created in mknod(), and
2. rename() the gfid handle to go back under .glusterfs/ab/cd/.
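
The link() followed by rename() sequence can be tried out on an ordinary
filesystem as below. This is only a sketch of the two system calls, not the
posix translator code: the function names, the "example-gfid" file name and
the scratch directory layout are hypothetical stand-ins for a brick.

```python
import os
import tempfile


def restore_gfid_handle(unlink_handle, new_path, handle_path):
    """Re-attach a gfid handle that was parked under .glusterfs/unlink."""
    # Step 1: hard-link the parked handle to the path mknod() would create.
    os.link(unlink_handle, new_path)
    # Step 2: move the handle back to its regular location.
    os.makedirs(os.path.dirname(handle_path), exist_ok=True)
    os.rename(unlink_handle, handle_path)


def demo():
    """Run both steps in a scratch directory and report the outcome."""
    with tempfile.TemporaryDirectory() as brick:
        parked = os.path.join(brick, ".glusterfs", "unlink", "example-gfid")
        handle = os.path.join(brick, ".glusterfs", "ab", "cd", "example-gfid")
        new_path = os.path.join(brick, "dst")
        os.makedirs(os.path.dirname(parked))
        with open(parked, "wb") as f:
            f.write(b"data written through the open fd")
        restore_gfid_handle(parked, new_path, handle)
        with open(new_path, "rb") as f:
            data = f.read()
        # Same inode under both names, and nothing left under unlink/.
        return os.path.samefile(new_path, handle), os.path.exists(parked), data
```

Because both names refer to the same inode afterwards, data already written
through the still-open fd is visible through the newly created path.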

Change-Id: I86ef1f97a76ffe11f32653bb995f575f7648f798
BUG: 1292379
Signed-off-by: Krutika Dhananjay 
Reviewed-on: http://review.gluster.org/13001
Reviewed-by: Pranith Kumar Karampuri 
Tested-by: NetBSD Build System 
Tested-by: Gluster Build System

cluster/afr : Examine data/metadata readable for read-subvol

2015-08-25T17:32:49+00:00

During lookup and discover, currently read_subvol is based
only on data_readable. read_subvol should be decided based
on both data_readable and metadata_readable.
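
The selection rule can be sketched as follows. This is an illustrative Python
model, not the AFR function: pick_read_subvol and its list arguments are
hypothetical, and falling back to the first child that is up when no child is
readable for both is an assumption.

```python
def pick_read_subvol(data_readable, metadata_readable, child_up):
    """Return the index of the child to read from, or -1 if none qualifies."""
    # Prefer a child that is readable for both data and metadata.
    for index, (data_ok, meta_ok) in enumerate(zip(data_readable, metadata_readable)):
        if data_ok and meta_ok:
            return index
    # Assumed fallback: the first child that is up.
    for index, is_up in enumerate(child_up):
        if is_up:
            return index
    return -1
```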

Credits to Ravishankar N for the logic of afr_first_up_child
from http://review.gluster.org/10905/ .

Change-Id: I98580b23c278172ee2902be08eeaafb6722e830c
BUG: 1240244
Signed-off-by: Anuradha Talur 
Reviewed-on: http://review.gluster.org/11551
Reviewed-by: Ravishankar N 
Tested-by: Gluster Build System 
Reviewed-by: Krutika Dhananjay 
Reviewed-by: Pranith Kumar Karampuri