| Commit message (Collapse) | Author | Age | Files | Lines |
| |
DHT expects GF_PREOP_CHECK_FAILED to be present in xdata_rsp in case of mkdir
failures because of stale layout. But AFR was unwinding null xdata_rsp in case
of failures. This was leading to mkdir failures just after remove-brick. Unwind
the xdata_rsp in case of failures to make sure the response from brick reaches
dht.
BUG: 1340623
Change-Id: Idd3f7b95730e8ea987b608e892011ff190e181d1
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14553
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
BUG: 1336612
Change-Id: Ife1ce4b11776a303df04321b4a8fc5de745389d6
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14545
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| |
See also
> Change-Id: I567a4be8f0f31f6285550f243fe802895f6bc43b
Reported-by: Patrick Matthäi <pmatthaei@debian.org>
BUG: 1336793
Change-Id: Icb9a6ff94d86663a5bca4ba931d810439c02556e
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/14526
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| |
Change-Id: Ief71cc68a4fbf8113e15b4254ebcabf7e30f74e2
BUG: 1339181
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/14516
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
Introduce cluster.favorite-child-policy which when enabled with
[ctime|mtime|size|majority], automatically heals files that are in
split-brain.
The majority policy will not pick a source if there is no majority.
The other three policies pick the first brick with a valid reply and
non-zero ctime/mtime/size as source.
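The selection rule for the ctime, mtime and size policies, as worded above, can be sketched roughly as follows. This is only an illustration of the description in this message, not the AFR code: `struct brick_reply`, `enum fav_policy` and `pick_first_valid_source()` are hypothetical names, and the majority policy is left out because its comparison criterion is not described here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-brick reply; not the actual AFR structure. */
struct brick_reply {
    int      valid;   /* non-zero if the brick returned a usable reply */
    uint64_t ctime;
    uint64_t mtime;
    uint64_t size;
};

/* Hypothetical policy selector for the three attribute-based policies. */
enum fav_policy { FAV_CTIME, FAV_MTIME, FAV_SIZE };

/* Return the index of the first brick with a valid reply and a non-zero
 * value for the chosen attribute, or -1 if no brick qualifies. */
int pick_first_valid_source(const struct brick_reply *r, size_t n,
                            enum fav_policy policy)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t value;

        if (!r[i].valid)
            continue;

        switch (policy) {
        case FAV_CTIME: value = r[i].ctime; break;
        case FAV_MTIME: value = r[i].mtime; break;
        default:        value = r[i].size;  break;
        }

        if (value != 0)
            return (int)i;
    }
    return -1; /* no source: the file stays in split-brain */
}
```

A caller would treat a negative return value as "no source picked" and leave the file untouched.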
Change-Id: I3c099a0404082213860f74f2c9b4d207cfaedb76
BUG: 1328224
Original-author: Richard Wareing <rwareing@fb.com>
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/14026
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| |
Change-Id: I7140e50263b5f28b900829592c664fa1d79f3f99
BUG: 1338634
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/14496
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| |
Since the rebalance (not remove-brick) process does not migrate hardlinks,
mark them as skipped rather than failed, because reporting them as failed
confuses users.
Change-Id: I5d469d10146274f00bb91482d0373c5235a9b8b2
BUG: 1339071
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/14493
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
| |
Problem :
Misleading messages are getting logged in mount logs
and bricks log.
"Mismatching xdata" and "Heal failed" are getting logged
Solution :
Reduce the level of logs from INFO, WARNING and NOTICE
to DEBUG level wherever applicable OR use fop_log_level
to get proper log level.
Change-Id: Ia824c71e75ab683d3cb8949e1966ea09c9ccce72
BUG: 1231224
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/13266
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
Problem:
Parallel rmdir operations on the same directory result in ENOTCONN messages
even though there was no network disconnect.
In the blocking entry lock during rmdir, AFR takes two sets of locks on all its
children: one on (parentdir, name of dir to be deleted), the other a full lock
on the dir being deleted. We proceed to the pre-op stage even if only a single
lock (but not all the needed locks) was obtained, only to fail it with ENOTCONN
because afr_locked_nodes_get() returns zero nodes in afr_changelog_pre_op().
Fix:
After we get replies for all blocking lock requests, if we don't have
the minimum number of locks to carry out the FOP, unlock and fail the
FOP. The op_errno will be that of the last failed reply we got, i.e.
whatever is set in afr_lock_cbk().
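The check described in the fix can be sketched as below. The names `struct lock_reply` and `check_lock_replies()` are hypothetical, and the ENOTCONN fallback for the case where no reply carried an error is an assumption made for this sketch only.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical summary of one blocking-lock reply from a child. */
struct lock_reply {
    int op_ret;   /* 0 on success, -1 on failure */
    int op_errno; /* meaningful only when op_ret < 0 */
};

/* Return 0 if at least `needed` locks were granted. Otherwise return the
 * errno of the last failed reply, so the caller can unlock and fail the FOP
 * instead of proceeding to pre-op. */
int check_lock_replies(const struct lock_reply *replies, size_t n,
                       size_t needed)
{
    size_t locked = 0;
    int last_errno = ENOTCONN; /* assumed fallback, see lead-in */

    for (size_t i = 0; i < n; i++) {
        if (replies[i].op_ret == 0)
            locked++;
        else
            last_errno = replies[i].op_errno;
    }
    return (locked >= needed) ? 0 : last_errno;
}
```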
Change-Id: Ibef25e65b468ebb5ea6ae1f5121a5f1201072293
BUG: 1336381
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/14358
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
shards
Change-Id: I0606b74f11f5412c4d9af44a6505635ed9022c15
BUG: 1335858
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/14334
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
Race is explained at
https://bugzilla.redhat.com/show_bug.cgi?id=1337405#c0
This patch also handles performing of self-heal with shd-pid.
Also performs the healing with this->itable's inode rather than
main itable.
BUG: 1337405
Change-Id: Id657a6623b71998b027b1dff6af5bbdf8cab09c9
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14422
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
Problem:
If a named fresh-lookup is done on a loc and the fop fails on one of the
bricks or is not sent to one of the bricks, but the brick is up by the time
the response comes to afr, 'can_interpret' will be set to false in
afr_lookup_done(). As a result the inode-ctx for that inode is not set, which
can lead to EIO in a transaction, as it depends on the 'readable' array being
available by that point.
Fix:
Refresh inode for inode-write fops for the ctx to be set if it is not already
done at the time of named fresh-lookup or if the file is in split-brain where
we need to perform one more refresh before failing the fop to check if the file
is still in split-brain or not.
BUG: 1336612
Change-Id: I5c50b62c8de06129b8516039f7c252e5008c47a5
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14368
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
In case of 3 way replication with quorum enabled with sharding,
if one brick is brought down and brought back up, fops sometimes
fail with EROFS because the mknod of the shard file fails on
two good nodes with EEXIST. So even when quorum is not met, it
makes sense to unwind with the errno returned by lower xlators
as much as possible.
Change-Id: Iabd91cd7c270f5dfe6cbd18c50e59c299a331552
BUG: 1336612
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14369
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| |
For directory rename, if the destination exists, the source directory
is created as a child of the given destination directory. Since
the new child directory does not exist yet, take the lock on the parent
of the child directory.
Change-Id: I24a34605a2cd65984910643ff5462f35e8fc7e71
BUG: 1336698
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/14371
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
Also missing bang (!) in #!/bin/bash in shell scripts.
Change-Id: I567a4be8f0f31f6285550f243fe802895f6bc43b
BUG: 1336793
Reported-by: Patrick Matthäi <pmatthaei@debian.org>
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/14398
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
BUG: 1334164
Change-Id: I4259d88f2b6e4f9d4ad689bc4e438f1db9cfd177
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/14365
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
This is needed for following reasons:
* healing is done in lookup and mkdir codepath where inode is not
linked _yet_ as normally linking is done in interface layers
(fuse-bridge, gfapi, nfsv3 etc).
* healing consists of non-lookup fops like inodelk, setattr, setxattr
etc. All non-lookup fops expect a linked inode.
Change-Id: I1bd8157abbae58431b7f6f6fffee0abfe5225342
BUG: 1334164
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/14295
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Reviewed-by: mohammed rafi kc <rkavunga@redhat.com>
| |
The "max cycle time" log message was incorrectly logged as
an error. Downgrade it to INFO.
Change-Id: Ia7d074423019fa79443bc6ea694148b7b8da455d
BUG: 1335973
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/14336
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: N Balachandran <nbalacha@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
Previously we had wrongly placed the clearing tier-fix-layout-complete
xattr before the joining of migration threads. This would lead to
situations where failure of clearing the xattr would cause the
premature death of migration threads.
Now we clear the xattr only after the data movement threads join,
ensuring that all migration is done.
Change-Id: I829b671efa165ae13dbff7b00707434970b37a09
BUG: 1334839
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/14285
Smoke: Gluster Build System <jenkins@build.gluster.com>
Tested-by: Joseph Fernandes
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
| |
In afr_changelog_post_op_now(), if there was any error,
meaning op_ret < 0, post-op was not being done even when
the errors were symmetric and there were no "failed
subvols".
Fix:
When the errors are symmetric, perform post-op.
How was the bug found :
In a 1 X 3 volume with shard and write behind on
when writes were done into a file with one brick down,
the trusted.afr.dirty xattr's value for .shard directory
would keep increasing as post op was not done but pre-op was.
This incorrectly showed .shard to be in split-brain.
RCA:
When WB is on, due to multiple writes being sent on
offset lying in the same shard, chances are that
same shard file will be created more than once
with the second one failing with op_ret < 0
and op_errno = EEXIST.
As op_ret was negative, afr wouldn't do post-op,
leading to no decrement of trusted.afr.dirty xattr.
Thus showing .shard directory to be in split-brain.
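A minimal sketch of the decision described above, assuming "symmetric" means that every replying child failed with the same errno. `struct child_reply`, `errors_are_symmetric()` and `should_do_post_op()` are hypothetical names, not the functions touched by this patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-child result of the fop. */
struct child_reply {
    int op_ret;   /* negative on failure */
    int op_errno; /* meaningful only when op_ret < 0 */
};

/* True if every child failed and all of them reported the same errno. */
bool errors_are_symmetric(const struct child_reply *r, size_t n)
{
    if (n == 0)
        return false;

    for (size_t i = 0; i < n; i++) {
        if (r[i].op_ret >= 0 || r[i].op_errno != r[0].op_errno)
            return false;
    }
    return true;
}

/* Post-op runs on success, and also on failure when the errors are
 * symmetric, so that the dirty marker set by pre-op gets undone. */
bool should_do_post_op(int op_ret, const struct child_reply *r, size_t n)
{
    return op_ret >= 0 || errors_are_symmetric(r, n);
}
```

With two bricks both returning EEXIST for the duplicate shard creation, `should_do_post_op()` returns true in this sketch, which corresponds to the dirty xattr being decremented again.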
Change-Id: I711bdeaa1397244e6a7790e96f0c84501798fc59
BUG: 1335652
Signed-off-by: Anuradha Talur <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/14310
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
Problem:
Spurious entries are reported in heal info when the mount is on second/third
brick of the replica pair because local-child is given preference in selecting
source. The code is supposed to suggest the file needs heal if the (source < 0)
(failure code path), but instead it is written as if any non-zero value
is considered failure.
Fix:
Treat +ve source as success case
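The difference between the two checks can be illustrated as follows. `needs_heal_on_source_failure()` is a hypothetical helper written for this note; it only assumes that `source` is a brick index (0, 1, 2, ...) on success and negative on failure.

```c
#include <stdbool.h>

/* `source` is the index of the brick chosen as heal source, or a negative
 * value when no source could be determined. */
bool needs_heal_on_source_failure(int source)
{
    /* The faulty form was equivalent to `return source != 0;`, which
     * reported indices 1 and 2 as failures. Only negative values are. */
    return source < 0;
}
```

With the mount on the second or third brick the local child has index 1 or 2, which is why the spurious entries only showed up there.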
BUG: 1335429
Change-Id: I1be7f9defef2ae03be7eec8d7d49bf34adeca82c
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14302
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
This patch rearranges the code to guard against pthread_create()
failures, i.e. only when pthread_create() succeeds do we call
pthread_join().
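The pattern can be sketched as below. `run_workers()`, `worker()` and `MAX_WORKERS` are hypothetical and only illustrate the "join only what was created" rule; they are not the tier thread functions.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 8 /* arbitrary bound for this sketch */

/* Trivial thread body used only for illustration. */
static void *worker(void *arg)
{
    int *counter = arg;
    (*counter)++;
    return NULL;
}

/* Start up to n threads and join only those whose pthread_create()
 * succeeded. Returns the number of threads that were joined. */
int run_workers(int *counters, size_t n)
{
    pthread_t tids[MAX_WORKERS];
    int started[MAX_WORKERS] = {0};
    int joined = 0;

    if (n > MAX_WORKERS)
        n = MAX_WORKERS;

    for (size_t i = 0; i < n; i++) {
        /* pthread_create() returns 0 on success, an error number otherwise. */
        if (pthread_create(&tids[i], NULL, worker, &counters[i]) == 0)
            started[i] = 1;
    }

    for (size_t i = 0; i < n; i++) {
        /* Joining a thread id that was never initialised is undefined. */
        if (started[i] && pthread_join(tids[i], NULL) == 0)
            joined++;
    }
    return joined;
}
```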
Change-Id: I0836bc950a210574cfdc755a666c6ac5df6ab430
BUG: 1332219
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Reviewed-on: http://review.gluster.org/14152
Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Joseph Fernandes
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
| |
During detach, check whether background fixlayout is done; if it is not,
ignore the case and continue the detach.
Change-Id: I5d5cfc0e73d0eb217fdeab54c432dc4af8bc598d
BUG: 1332136
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/14147
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
| |
During locking we send the lock request to the cached subvol,
and normally we send the unlock to the cached subvol as well.
But with a parallel fresh lookup on a directory, there
is a race window where the cached subvol can change
and the unlock can go to a different subvol from the one
on which we took the lock.
This will result in a stale lock held on one of the
subvols.
So we will store the details of the subvol on which we took the lock
and will unlock from the same subvol
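The idea can be sketched with a small model. `struct subvol`, `struct lock_record`, `take_lock()` and `release_lock()` are hypothetical stand-ins; the counter simply models a lock being held on a subvol.

```c
#include <stddef.h>

/* Hypothetical stand-in for a subvolume; lock_count models held locks. */
struct subvol {
    int lock_count;
};

/* Hypothetical per-call record of where the lock was taken. */
struct lock_record {
    struct subvol *locked_on;
};

/* Lock on the currently cached subvol and remember which one it was. */
void take_lock(struct lock_record *rec, struct subvol *cached)
{
    cached->lock_count++;
    rec->locked_on = cached;
}

/* Unlock on the recorded subvol, not on whatever is cached by now. */
void release_lock(struct lock_record *rec)
{
    if (rec->locked_on != NULL) {
        rec->locked_on->lock_count--;
        rec->locked_on = NULL;
    }
}
```

Even if the cached subvol changes between the two calls, the unlock reaches the subvol that holds the lock, so no stale lock is left behind.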
Change-Id: I47df99491671b10624eb37d1d17e40bacf0b15eb
BUG: 1311002
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/13492
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
Multi-threaded healing doesn't create the synctask with shd pid; this
leads to healing problems when the quota is exceeded.
BUG: 1332994
Change-Id: I80f57c1923756f3298730b8820498127024e1209
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14211
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| |
.. to prevent unnecessary logs from gf_msg_callingfn()
Change-Id: I367628fee2f6783ba9ed6f918deabd034df820c9
BUG: 1333043
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/14212
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
Problem:
Afr does post-ops after write but the stat buffer it unwinds is at the
time of write, so if nfs client caches this, it will see different
ctime when it does stat on it after post-op is done. From NFS client's
perspective it thinks the file is changed. Tar which depends on this
to be correct keeps giving 'file changed as we read it' warning.
If Afr instead chose to unwind after post-op, eager-lock and
delayed-post-op would have to be disabled, which would lead to bad
performance for all write use cases.
Fix:
Don't let client cache stat after write.
Change-Id: Ic6062acc6e5cdd97a9c83c56bd529ec83cee8a23
BUG: 1302948
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Signed-off-by: Anuradha Talur <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/13785
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
| |
Due to a race in timer cancellation, in some cases it was possible
to unlock the lock while another concurrent fop that needed it
continues execution as if it were not released.
This patch also fixes an issue that caused a lock to not be released
if an error was found while preparing ec_update_size_version().
Change-Id: I1344a3f5ecfc333f05a09e62653838264c9c26b1
BUG: 1331254
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/14112
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Chen Chen <chenchen@smartquerier.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| |
DHT did not handle rmdir failures on non-hashed subvols
correctly in a 2x2 dist-rep volume, causing the
directory to be deleted from the hashed subvol.
Also fixed an issue where the dht_selfheal_restore
errcodes were overwriting the rmdir error codes.
Change-Id: If2c6f8dc8ee72e3e6a7e04a04c2108243faca468
BUG: 1330032
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/14060
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
| |
Change-Id: Id0e7400c8ae950c90d42a3ddf8b558a14959a1f8
BUG: 1326085
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/14074
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
With lock-migration, we need to send requests to the destination
brick post migration. Once the source brick marks the lock
structure as already migrated, the requests will be redirected
to the destination brick by dht_lk2/flush2.
Change-Id: I50b14011c5ab68c34826fb7ba7f8c8d42a68ad97
BUG: 1326085
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/13493
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
Change-Id: I48c6f9cdda47503615ba65882acd5eedf0a70c89
BUG: 1326085
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/14024
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
Problem: When we spawn the promote and demote threads, query files
are built. Only the query file with index 0 is picked for migration
as the first query file. This may not be suitable for scenarios
where the files in that query file are too big to move in the first cycle;
as a result, files in the other query files always get missed. We need to
shuffle so that other query files also get a chance.
Fix: Remember the previous first query file and shift it by one index,
before the migration starts.
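The shift can be sketched as a simple rotation. `rotate_query_order()` is a hypothetical helper: it takes the previous first index and the number of query files, and fills in the order in which the files would be visited in the next cycle.

```c
#include <stddef.h>

/* Fill order[] with the sequence in which n query files are visited,
 * starting one index past the previous first query file and wrapping
 * around. Returns the new first index (0 when n is 0). */
size_t rotate_query_order(size_t prev_first, size_t n, size_t *order)
{
    size_t first;

    if (n == 0)
        return 0;

    first = (prev_first + 1) % n;
    for (size_t i = 0; i < n; i++)
        order[i] = (first + i) % n;

    return first;
}
```

With three query files and index 0 used first in the previous cycle, the next cycle visits them in the order 1, 2, 0.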
Change-Id: I704947bcf4bab6b20b1179a6d9ae4a15a3d51bd9
BUG: 1330353
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/14068
Tested-by: Joseph Fernandes
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| |
Implements new indices type ENTRY_CHANGES where other
xlators can add/delete names.
Change-Id: I01c5568997085e11d22ba36a4376c70b78fb3827
BUG: 1269461
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/12482
Tested-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
Change-Id: I52da41dff5619492b656c2217f4716a6cdadebe0
BUG: 1269461
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/12442
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
Do not allow directory creations without gfids as
after the directories are created, operations
on them fail anyway. So it is better to fail mkdir.
BUG: 1317361
Change-Id: I8f8e3b38bbded1960b7215bac0432500f7e78038
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/13690
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| |
It has become very difficult to identify the xlator which returned
negative op_ret. Being able to just change the log level and
visualize the stack is helpful in such cases.
Change-Id: I6545b4802c1ab4d0d230d5e9e036afb2384882e1
BUG: 1330052
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-on: http://review.gluster.org/13448
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
BUG: 1329501
Change-Id: Id402c20f2fa19b22bc402295e03e7a0ea96b0c40
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14048
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| |
Problem: During mount, afr waits for response from all its children before
notifying the parent xlator. In a 1x2 replica volume, if one of the nodes is
down, the mount will hang for more than a minute until child down is received
from the client xlator for that node.
Fix:
When parent up is received by afr, start a 10 second timer. In the timer
callback, if we have received a successful child up from at least one brick, propagate the
event to the parent xlator.
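The decision taken in the timer callback can be sketched as follows. `enum afr_event_sketch` and `on_timer_expiry()` are hypothetical names; the real event constants and the timer plumbing are not shown.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result of the timer callback. */
enum afr_event_sketch { SKETCH_EVENT_NONE, SKETCH_EVENT_CHILD_UP };

/* Called when the timer started on parent up expires. child_up[i] is true
 * if brick i has reported a successful child up so far. */
enum afr_event_sketch on_timer_expiry(const bool *child_up, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (child_up[i])
            return SKETCH_EVENT_CHILD_UP; /* one brick is enough to go on */
    }
    return SKETCH_EVENT_NONE; /* nothing is up yet; keep waiting */
}
```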
Change-Id: I31e57c8802c1a03a4a5d581ee4ab82f3a9c8799d
BUG: 1054694
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11113
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| |
When an ongoing rebalance completion check task has been
triggered by dht, there is a possibility of a race
between afr setting the subvol as non-readable and dht updating
the cached subvol. In this window a write can fail with EIO.
Change-Id: I42638e6d4104c0dbe893d1bc73e1366188458c5d
BUG: 1329503
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/14049
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| |
Problem: On a rebalance stop, the migrator threads don't notify the
crawler thread to wake up in case it is waiting on a signal from a
migrator thread.
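The message states only the problem. One common way to avoid such a missed wake-up, shown here purely as an assumption about the shape of a fix, is to set a stop flag and broadcast on the condition variable under the same mutex the waiter uses. `struct migr_state`, `state_init()`, `signal_stop()` and `crawler_wait()` are hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state between crawler and migrator threads. */
struct migr_state {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             queued;  /* entries waiting to be migrated */
    bool            stopped; /* set when a stop was requested */
};

void state_init(struct migr_state *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->cond, NULL);
    s->queued = 0;
    s->stopped = false;
}

/* Migrator side: on stop, set the flag and wake any waiting crawler. */
void signal_stop(struct migr_state *s)
{
    pthread_mutex_lock(&s->lock);
    s->stopped = true;
    pthread_cond_broadcast(&s->cond);
    pthread_mutex_unlock(&s->lock);
}

/* Crawler side: wait while the queue is full, unless a stop was requested.
 * Returns false if the crawler should exit. */
bool crawler_wait(struct migr_state *s, int limit)
{
    bool keep_going;

    pthread_mutex_lock(&s->lock);
    while (s->queued >= limit && !s->stopped)
        pthread_cond_wait(&s->cond, &s->lock);
    keep_going = !s->stopped;
    pthread_mutex_unlock(&s->lock);

    return keep_going;
}
```

Because the predicate includes the stop flag, a crawler that starts waiting after the stop was signalled returns immediately instead of sleeping forever.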
Change-Id: I3cc4be41a4db25f48fee059ebb79a97ee99dcd00
BUG: 1327507
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/14004
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| |
Change-Id: I0bbc2c2ef115c78393f6570815a5b80316e7e4be
BUG: 1319992
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/11720
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| |
Thanks to Olia-Kremmyda for finding the bug on github review,
https://github.com/gluster/glusterfs/commit/b8106d1127f034ffa88b5dd322c23a10e023b9b6
Change-Id: Ib8640ed0c331a635971d5d12052f0959c24f76a2
BUG: 1329773
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14052
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
dht_mkdir ()
{
    first-hashed-subvol = hashed-subvol for "bname" in in-memory
                          layout of "parent";
    inodelk (SETLKW, parent, "LAYOUT_HEAL_DOMAIN", "can be any
             subvol, but we choose first-hashed-subvol randomly");
    {
begin:
        hashed-subvol = hashed-subvol for "bname" in in-memory
                        layout of "parent";
        hash-range = extract hash-range from layout of "parent";
        ret = mkdir (parent/bname, hashed-subvol, hash-range);
        if (ret == "hash-value doesn't fall into layout stored on
                    the brick (this error is returned by posix-mkdir)")
        {
            refresh_parent_layout ();
            goto begin;
        }
    }
    inodelk (UNLCK, parent, "LAYOUT_HEAL_DOMAIN",
             "first-hashed-subvol");
    proceed with other parts of dht_mkdir;
}
posix_mkdir (parent/bname, client-hash-range)
{
    disk-hash-range = getxattr (parent, "dht-layout-key");
    if (disk-hash-range != client-hash-range) {
        fail-with-error ("hash-value doesn't fall into layout
                         stored on the brick");
        return 0;
    }
    continue-with-posix-mkdir;
}
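The retry loop in the pseudocode above can be modelled in plain C roughly as follows. This is a toy model, not the DHT or posix code: `struct hash_range`, `struct brick`, `brick_mkdir()` and `client_mkdir()` are hypothetical, the inodelk calls are omitted, and the `max_retries` bound is an assumption added so the sketch cannot loop forever.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hash range of a layout. */
struct hash_range {
    uint32_t start;
    uint32_t stop;
};

/* Hypothetical brick: the layout stored on disk plus a creation counter. */
struct brick {
    struct hash_range disk_range;
    int               dirs_created;
};

static bool range_equal(struct hash_range a, struct hash_range b)
{
    return a.start == b.start && a.stop == b.stop;
}

/* Brick side: refuse the mkdir when the client's view of the layout
 * differs from what is stored on the brick. Returns 0 or -1. */
int brick_mkdir(struct brick *b, struct hash_range client_range)
{
    if (!range_equal(b->disk_range, client_range))
        return -1; /* stale layout on the client */

    b->dirs_created++;
    return 0;
}

/* Client side: on a stale-layout failure, refresh the cached layout and
 * retry, at most max_retries times. */
int client_mkdir(struct brick *b, struct hash_range *cached, int max_retries)
{
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        if (brick_mkdir(b, *cached) == 0)
            return 0;
        *cached = b->disk_range; /* stands in for refresh_parent_layout() */
    }
    return -1;
}
```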
Similar changes need to be done for dentry operations like create,
symlink, link, unlink, rmdir, rename. These will be addressed in
subsequent patches. This patch addresses only mkdir codepath.
This change breaks stripe tests, as on some striped subvols dht layout
xattrs are not set for some reason. This results in failure of
mkdir. Since striped volumes are always created with dht, some tests
associated with stripe also fail. So, I am making the following test
changes (since stripe is out of maintenance):
* modify ./tests/basic/rpc-coverage.t to not use striped volumes
* mark all (2) tests in tests/bugs/stripe/ as bad tests
Change-Id: Idd1ae879f24a48303dc743c1bb4d91f89a629e25
BUG: 1323040
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/13885
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
| |
Problem:
Locking schemes in afr-v1 locked the directory/file completely during
self-heal. Newer locking schemes don't require full directory or file locking.
But afr-v2 still has compatibility code to work with older clients: in
entry-self-heal it takes a lock on a special 256-character name which can't
be created on the fs, and similarly for data self-heal there used to be a lock
on (LLONG_MAX-2, 1). The old locking scheme requires heal-info to take
sh-domain locks before examining heal-state. If it doesn't take sh-domain
locks, heal-info can hang till self-heal completes because of the
compatibility locks. But the problem with heal-info taking sh-domain locks is
that if two heal-info processes, or shd and heal-info, try to inspect heal
state in parallel using trylocks on sh-domain, both of them may assume that
a heal is in progress. This was leading to spurious entries being shown in
heal-info.
Fix:
As long as the afr-v1 way of locking is in use, we can't fix this problem with
simple solutions. If we know that the cluster is running the newer locking
schemes, we can give accurate information in heal-info.
So introduce a new option called 'locking-scheme' which, if it is set to
'granular', will give correct information in heal-info. In addition, the extra
network hops for taking compatibility locks and sh-domain locks in heal-info
will no longer be necessary, which improves performance.
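The spurious-entry race described under "Problem" can be illustrated with a toy model. This is a sketch under assumptions, not AFR code: the `Brick` class, the two-brick setup and the particular interleaving are hypothetical, and a failed trylock is simply read as "a heal is in progress".

```python
class Brick:
    def __init__(self):
        self.sh_domain_owner = None  # who holds the sh-domain lock, if anyone

    def trylock(self, who):
        # Non-blocking lock attempt: succeeds only if nobody holds the lock.
        if self.sh_domain_owner is None:
            self.sh_domain_owner = who
            return True
        return False


def heal_in_progress_guess(got_locks):
    # Legacy reasoning: any failed trylock is taken to mean a self-heal holds it.
    return not all(got_locks)


bricks = [Brick(), Brick()]

# Interleaving: A locks brick 0, B locks brick 1, then each tries the other brick.
a_on_brick0 = bricks[0].trylock("heal-info-A")
b_on_brick1 = bricks[1].trylock("heal-info-B")
a_on_brick1 = bricks[1].trylock("heal-info-A")
b_on_brick0 = bricks[0].trylock("heal-info-B")

a_says_healing = heal_in_progress_guess([a_on_brick0, a_on_brick1])
b_says_healing = heal_in_progress_guess([b_on_brick0, b_on_brick1])
```

No self-heal runs anywhere in this model, yet both inspectors conclude that one is in progress, which is the kind of spurious report the 'granular' scheme avoids by not needing these locks.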
BUG: 1322850
Change-Id: Ia563c5f096b5922009ff0ec1c42d969d55d827a3
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/13873
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
When quota is enabled, the quota enforcer tries to get the size of the
source directory by sending a nameless lookup to quotad. But if the rename
has succeeded on even one subvol, or the source layout has anomalies, this
nameless lookup in quotad tries to heal the directory, which requires
a lock on as many subvols as it can get. But src is already locked as part of
rename. For rename to proceed on the brick, a cluster-wide lookup needs to
complete. But the cluster-wide lookup in quotad is blocked on locks held by
rename, hence a deadlock. To avoid this, quota sends an option in xdata which
instructs DHT not to heal.
Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0
BUG: 1252244
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/13988
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
| |
Problem:
When there are 2 sources and one sink, and two self-heal daemons
try to acquire locks at the same time, there is a chance that one of
them gets a lock on only one source and the sink, leading to a partial
heal. This will need one more heal from the remaining source to the
sink for the self-heal to be complete. This is not optimal.
Fix:
Upgrade non-blocking locks to blocking locks on all the subvolumes if
the number of locks acquired is a majority and some lock attempts
failed with EAGAIN.
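The upgrade rule in the fix can be sketched as a small decision function. This is a simplified model, not the AFR implementation: the function name, the "ok"/"eagain" result strings and the "give-up" branch for the non-majority case are assumptions, since the message does not say what happens there.

```python
def next_lock_step(results):
    """Decide what to do after non-blocking lock attempts.

    results holds one entry per subvolume, either "ok" or "eagain".
    """
    acquired = sum(r == "ok" for r in results)
    contended = any(r == "eagain" for r in results)
    majority = acquired > len(results) // 2

    if not contended:
        return "proceed"   # every lock was granted without blocking
    if majority:
        return "upgrade"   # retake the locks on all subvolumes in blocking mode
    return "give-up"       # assumption: the other locker holds the majority


all_granted = next_lock_step(["ok", "ok", "ok"])
partial_majority = next_lock_step(["ok", "ok", "eagain"])
partial_minority = next_lock_step(["ok", "eagain", "eagain"])
```

With three subvolumes, holding two locks and seeing one EAGAIN triggers the upgrade, so the heal waits for the third lock instead of healing from a single source.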
BUG: 1318751
Change-Id: Iae10b8d3402756c4164b98cc49876056ff7a61e5
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/13766
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| |
Problem:
In afr-v1 pre-op, xattrop increments the self xattr first and then increments
the value on the rest. In post-op, the xattr value is decremented first on the
rest and last on self. So for an operation to possibly have been witnessed,
i.e. for a fop to have been seen by the brick, there must be at least 1
pending op, because the fop won't arrive without pre-op completing. The other
possibility is that the fop completes, but during post-op the brick dies after
decrementing the pending counts on the others and just before decrementing its
own pending count.
Fix:
Fix witness detection code in afr_self_heal_find_direction()
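The ordering argument in the problem statement can be checked with a toy model. It reflects one possible reading of the description and is not the code changed in afr_self_heal_find_direction(): the counter vector, the `pre_op`/`post_op` generators and the three-brick setup are hypothetical.

```python
def pre_op(pending, self_idx):
    """Yield the counter vector after each step: self first, then the rest."""
    pending = list(pending)
    pending[self_idx] += 1
    yield tuple(pending)
    for i in range(len(pending)):
        if i != self_idx:
            pending[i] += 1
            yield tuple(pending)


def post_op(pending, self_idx):
    """Yield the counter vector after each step: the rest first, self last."""
    pending = list(pending)
    for i in range(len(pending)):
        if i != self_idx:
            pending[i] -= 1
            yield tuple(pending)
    pending[self_idx] -= 1
    yield tuple(pending)


SELF = 0
states = list(pre_op((0, 0, 0), SELF))
states += list(post_op(states[-1], SELF))

# Every state the xattrs could be left in if the brick died mid-transaction.
crash_states = states[:-1]
self_always_pending = all(s[SELF] >= 1 for s in crash_states)
```

In this model the brick's own count is at least 1 in every intermediate state, including the one where the counts on the others have already been decremented, which matches the two possibilities listed above.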
BUG: 1322253
Change-Id: Ia7e76482c0a46e775e269bb96ec1b9490a3ac18f
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/13811
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| |
Problem: The throughput for a 'dd' workload was much lower for an arbiter
configuration than for a normal replica-3 volume. There were 2
issues:
i) arbiter_writev was using the request dict as the response dict while
unwinding, leading to incorrect GLUSTERFS_WRITE_IS_APPEND and
GLUSTERFS_OPEN_FD_COUNT values (=4), leading to immediate post-ops
because is_afr_delayed_changelog_post_op_needed() failed due to the
afr_are_multiple_fds_opened() check.
ii) The arbiter code in afr was setting local->transaction.{start and len} = 0
to take full file locks. What this meant was that even for simultaneous but
non-overlapping writevs, afr_transaction_eager_lock_init() was not
happening because afr_locals_overlap() always returned true. Consequently
is_afr_delayed_changelog_post_op_needed() failed due to
local->delayed_post_op not being set.
Fix:
i) Send appropriate response dict values in arbiter_writev.
ii) Modify flock params instead of local->transaction.{start and len} to
take full file locks in the transaction.
Also changed _fill_writev_xdata() in posix to fill rsp_xdata for
whichever key is requested.
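Issue ii) can be illustrated with a simplified overlap check. This is a sketch, not the afr_locals_overlap() source: the helper names are hypothetical, and it assumes the usual lock convention that a length of 0 means "from start to end of file".

```python
FULL_FILE_END = 2**64 - 1  # stand-in for "up to end of file"


def region_end(start, length):
    # Assumed convention: length 0 means the region runs to end of file.
    return start + length - 1 if length else FULL_FILE_END


def regions_overlap(a, b):
    """a and b are (start, len) pairs describing two transactions."""
    return region_end(*a) >= b[0] and region_end(*b) >= a[0]


# Two simultaneous writes to adjacent, non-overlapping 4 KiB regions.
w1 = (0, 4096)
w2 = (4096, 4096)
disjoint_writes_overlap = regions_overlap(w1, w2)

# What the arbiter path effectively recorded before the fix: start = len = 0.
full_file = (0, 0)
forced_overlap = regions_overlap(full_file, full_file)
```

The two real writes do not overlap, but once both transactions are recorded as (0, 0) they always do, which is why the fix moves the full-file range into the flock parameters instead.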
Change-Id: I1c5fc5e98aba49ade540bb441a022e65b753432a
BUG: 1324004
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Robert Rauch <robert.rauch@gns-systems.de>
Reported-by: Russel Purinton <russell.purinton@gmail.com>
Reviewed-on: http://review.gluster.org/13906
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| |
This turns a special xattr into an rmdir with flags set. When that hits
the posix translator on the server side, that causes the file/directory
to be moved into the special "landfill" directory. From there, the
posix janitor thread will take care of deleting it entirely on the
server side - traversing it recursively if necessary. A couple of
secondary issues were fixed to make this effective.
* FUSE now ensures that setxattr values are NUL terminated.
* The janitor thread now gets woken up immediately when something is
placed in 'landfill' instead of only when file descriptors need to be
closed.
* The default landfill-emptying interval was reduced to 10s.
To use the feature, issue a setxattr something like this:
setfattr -n glusterfs.dht.nuke -v "" /mnt/glusterfs/vol/some_dir
The value doesn't actually matter; the mere receipt of a request with
this key is sufficient. Some day it might be useful to allow setting a
required value as a sort of password, so that only those who know it can
access the underlying special functionality.
Change-Id: I8a343c2cdb40a76d5a06c707191fb67babb8514f
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/13878
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|