| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In some cases we see that readdir keeps winding to the brick that doesn't have
any blocked locks i.e. first brick. This is leading to the client assuming that
there are no blocking locks on the inode so it won't give away the lock. Other
clients end up blocked on the lock as if the command hung.
Fix:
Proper way to fix this issue is to use infra present in
http://review.gluster.org/14736 This is a stop gap fix where we start taking
inodelks in opendir which goes to all the bricks, this will detect if there is
any contention.
cherry picked from commit f013335400d033a9677797377b90b968803135f4:
>BUG: 1346719
>Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/15309
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Ashish Pandey <aspandey@redhat.com>
Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9
BUG: 1373392
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/15406
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In a distributed-replicate volume attribute
"replica.split-brain-status" value does not display split-brain
condition though directory is in split-brain.
If directory is in split brain on mutiple replica-pairs
it does not show full list of replica pairs.
Solution: Update the dht_aggregate code to aggregate the xattr
value in this specific condition.
Fix: 1) function getChoices returns the choices from split-brain
status string.
2) function add_opt adding the choices to local buffer to
store in dictionary
3) For the key "replica.split-brain-status" function dht_aggregate
call dht_aggregate_split_brain_xattr to prepare the list.
Test: To verify the patch followed below steps
1) Create a distributed replica volume and create mount point
2) Stop heal daemon
3) Touch file and directories on mount point
mkdir test{1..5};touch tmp{1..5}
4) Down brick process on one of the replica set
pkill -9 glusterfsd
5) Change permission of dir on mount point
chmod 755 test{1..5}
6) Restart brick process on node with force option
7) kill brick process on other node in same replica set
8) Change permission of dir again on mount point
chmod 766 test{1..5}
9) Reexecute same step from 4-9 on other replica set also
10) After check heal status on server it will show dir's are
in split brain on all replica sets
11) After check the replica.split-brain-status attr on mount
point it will show wrong status of split brain.
12) After apply the patch the attribute shows correct value.
> Change-Id: Icdfd72005a4aa82337c342762775a3d1761bbe4a
> Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
> Reviewed-on: http://review.gluster.org/15201
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
> (cherry picked from commit c4e9ec653c946002ab6d4c71ee8e6df056438a04)
Change-Id: Ia5234e8a2291a7e8a7211c82368f4df1c99fa099
Backport of commit c4e9ec653c946002ab6d4c71ee8e6df056438a04
BUG: 1375099
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Reviewed-on: http://review.gluster.org/15466
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cyclic order
Backport of: http://review.gluster.org/15080
When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.
This way, we still have one good copy left despite the split-brain situation
which when it is back online, will be chosen as source to do the heal.
Change-Id: I7c13c6ddd5b8fe88b0f2684e8ce5f4a9c3a24a08
BUG: 1367270
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15222
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When io-threads is enabled on the client side, io-threads destroys the
call-stub in which the loc is stored as soon as the c-stack unwinds.
Because afr is creating a syncop with the address of loc passed in
setxattr by the time syncop tries to access it, io-threads would have
already freed the call-stub. This will lead to crash.
Fix:
Copy loc to frame->local and use it's address.
> Reviewed-on: http://review.gluster.org/15070
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
BUG: 1367305
Change-Id: I16987e491e24b0b4e3d868a6968e802e47c77f7a
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-on: http://review.gluster.org/15168
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/15145
AFR sets transaction.pre_op[] array even before actually doing the
pre-op on-disk. Therefore, AFR must not only consider the pre_op[] array
but also the failed_subvols[] information before setting the pre_op_done[]
flag. This patch fixes that.
Change-Id: I8163256a6de254be43a7a526c6d2f9dc30e0e1df
BUG: 1367270
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15162
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise inode-link failures in selfheal codepath will result in a
crash.
This regression was introduced in master as fix to 1334164. But, that
patch never made into 3.7. Hence, in essence this patch is 3.7 version
of fix to 1334164, minus the regression.
> Change-Id: I9061629ae9d1eb1ac945af5f448d0d8b397a5022
> BUG: 1345748
> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
> Reviewed-on: http://review.gluster.org/14707
> Reviewed-by: N Balachandran <nbalacha@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: Poornima G <pgurusid@redhat.com>
> Reviewed-by: Susant Palai <spalai@redhat.com>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Change-Id: I9061629ae9d1eb1ac945af5f448d0d8b397a6022
BUG: 1366483
Reviewed-on: http://review.gluster.org/15163
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problems: The maximum number of migratior threads created was static set
to "40". And the number of these threads get created in rebalance depends
on the number of cores user has. If the number of cores exceeds 40, a
crash or memory corruption can be seen.
Fix: Make the migratior thread pool dynamic.
> Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839
> BUG: 1362070
> Reviewed-on: http://review.gluster.org/15000
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
(cherry picked from commit b8e8bfc7e4d3eaf76bb637221bc6392ec10ca54b)
Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839
BUG: 1362070
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/15062
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks to Rafi for hinting a while back that this kind of
problem he saw once. I didn't think the theory was valid.
Could have caught it earlier if I had tested his theory.
>Change-Id: Iac6ffcdba2950aa6f8cf94f8994adeed6e6a9c9b
>BUG: 1344836
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/14703
>Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>Tested-by: mohammed rafi kc <rkavunga@redhat.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
BUG: 1361402
Change-Id: If9ccf0b3db7159b87ddcdc7b20e81cde8c3c76f0
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15040
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/#/c/14895/
Thanks to Krutika for a cleaner way to track inode refs in
afr_set_split_brain_choice().
Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df
BUG: 1360549
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/15017
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: This issue arises when we do a rolling update
from 3.7.5 to 3.7.9.
For 4+2 volume running 3.7.5, if we update 2 nodes
and after heal completion kill 2 older nodes, this
problem can be seen. After update and killing of
bricks, 2 nodes will return inodelk count key in dict
while other 2 nodes will not have inodelk count in dict.
This is also true for get-link-count.
During dictionary match , ec_dict_compare, this will
lead to mismatch of answers and the file operation
on mount point will fail with IO error.
Solution:
Don't match inode, entry and link count keys while
comparing two dictionaries. However, while combining the
data in ec_dict_combine, go through all the dictionaries
and select the maximum values received in different dicts
for these keys.
master-
http://review.gluster.org/#/c/14761/
Change-Id: I33546e3619fe8f909286ee48fb0df2009cd3d22f
BUG: 1360152
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/14761
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/15012
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A race in timer cancellation for delayed unlock could cause a crash
if the cancelling thread fails to cancel the timer because it has
already been fired but not executed, and the callback is scheduled
out of the CPU, delaying it until the thread has released important
resources needed by the callback.
This patch improves the handling of this case to make it robust.
Backport of:
> Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4
> BUG: 1345855
> Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
> Reviewed-on: http://review.gluster.org/14712
> Smoke: Gluster Build System <jenkins@build.gluster.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4
BUG: 1346156
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/14724
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/#/c/13389/
Problem: For a read on a file in metadata split-brain:
1.lookup_done resets event_generation to zero.
2. readv is issued, goes to inode refresh due to mismatching event_gen.
3. After refresh is successful, we update event_generation, data and
metdata readable.
3. We then call afr_read_txn_refresh_done() which in turn calls
afr_inode_get_readable() but doesn't check for EIO. So afr_readv_wind
is called with local->readable (which is populated with data_readable),
thus winding the read to a brick.
4. Also, further parallel reads that come directly go to the wind path
because there is no inode_refresh needed.
Fix:
1.For any afr_read_txn(), readable must be an intersection of data and metadata
readable.
2.Check for EIO in afr_read_txn_refresh_done().
Change-Id: I22dd221fdfaf96d7aced2f474e28ed1337d69f0e
BUG: 1349881
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 7a1c1e2904701496968ed14b6d7479fb706c3188)
Reviewed-on: http://review.gluster.org/14791
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When DHT traverses the inode->fd_list, it does that in an unsafe
way that can generate races with fd_unref() called from other threads.
This patch fixes this problem taking the inode->lock and adding a
reference to the fd while it's being used outside of the mutex
protected region.
A minor change in storage/posix has been done to also access the
inode->fd_list in a safe way.
Backport of:
> Change-Id: I10d469ca6a8f76e950a8c9779ae9c8b70f88ef93
> BUG: 1344340
> Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
> Reviewed-on: http://review.gluster.org/14682
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Change-Id: I10d469ca6a8f76e950a8c9779ae9c8b70f88ef93
BUG: 1346751
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/14734
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ic559c220a1f0051e531314d13940604e2dead08c
BUG: 1336284
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/14349
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not allow directory creations without gfids as
after the directories are created, operations
on them fail anyway. So it is better to fail mkdir.
>BUG: 1317361
>Change-Id: I8f8e3b38bbded1960b7215bac0432500f7e78038
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/13690
>Smoke: Gluster Build System <jenkins@build.gluster.com>
>Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>(cherry picked from commit b246b07896fefb261c9fb07f3f29f0d03b81b88d)
Change-Id: Ibf9c84add7265e3e1755a37958e1de38307624b2
BUG: 1332372
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14188
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/14191
If the application opens a file with O_DIRECT, the shards'
anon fds would also need to inherit the flag. Towards this,
shard xl would be passing the odirect flag in the @flags parameter
to the WRITEV fop. This will be used in anon fd resolution
and subsequent opening by posix xl.
Change-Id: I3a0593fa46cc25e390a5762a0354b469c2a1532d
BUG: 1342903
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/14663
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
__fd_unref() doesn't do any cleanup, so it cannot be called to release
fd references, specially if it's the last reference.
The code has been changed to avoid a call to this function.
In the previous version we always tried to keep the newest fd in the
ec_lock_t structure. However this is not necessary. We'll always keep
one reference to an open file on the same inode. It's irrelevant if
the reference is new or old.
The function __fd_unref() has also been removed from fd.h to avoid being
used in the future since it's useless as it's defined now.
Backport of http://review.gluster.org/14683
Change-Id: Ia728777fc8e464758d5ea4d3bf020f0603919039
BUG: 1344422
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/14685
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In case of mkdir failure, dht expects
error information so that it can act accordingly.
Aftre adding bricks and re balance, layout gets
changed. Fop "mkdir" with old layout returns EIO.
EC gets this error in xdata but does not pass it
back to dht. In this case dht will not be able to
take corrective action.
Solution: Return xdata back to dht
master -
http://review.gluster.org/#/c/14679/
Change-Id: I24def8038e6880607689b7b046dc6428f564c6ab
BUG: 1344595
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/14689
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/#/c/14604/
Problem:
Since commit 8eaa3506ead4f11b81b146a9e56575c79f3aad7b, in replica 3, if a
brick is down and a create fails on the other 2 brick with EDQUOT, we consider
it an unsymmetric error and hence do not do post-op. So the dirty xattr
remains set on the parent dir, leading to conservative merges during heal when
all bricks are up. i.e. a file deleted on the source might re-appear after heal.
Fix:
Consider ENOSPC and EDQUOT as symmetric errors since there is no
possibility of partial inode or entry modification operations possible when
quota is enabled. IOW, if quota reports EDQUOT, the no. of bytes written
(or not written) will be the same on all bricks of the replica.
Likewise, the entry operation (create, mkdir...) will either succeed or
not succeed on all bricks.
Change-Id: Iacb1108e9ef4a918e36242fb4a957455133744e9
BUG: 1344561
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/14688
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: When features.cache-invalidation is ON, a lot of
ec_notify function gets called which leads to launch of
too many heals. This leads to no heal completion,
which causes accumulation of heals.
Solution: ec_launch_replace_heal should not be launch
for every event. Replace brick will trigger a child up
event and then only this heal function should be called.
master -
http://review.gluster.org/#/c/14649/
Change-Id: I57b44c6a279d57230daea1d93229be6069245b7d
BUG: 1342964
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/14652
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DHT expects GF_PREOP_CHECK_FAILED to be present in xdata_rsp in case of mkdir
failures because of stale layout. But AFR was unwinding null xdata_rsp in case
of failures. This was leading to mkdir failures just after remove-brick. Unwind
the xdata_rsp in case of failures to make sure the response from brick reaches
dht.
>BUG: 1340623
>Change-Id: Idd3f7b95730e8ea987b608e892011ff190e181d1
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/14553
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Ravishankar N <ravishankar@redhat.com>
>Smoke: Gluster Build System <jenkins@build.gluster.com>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
>Reviewed-by: Anuradha Talur <atalur@redhat.com>
>Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
BUG: 1340992
Change-Id: I2641d35a851be692aa223dfea5d082245ac6c2bc
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14633
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Afr does post-ops after write but the stat buffer it unwinds is at the
time of write, so if nfs client caches this, it will see different
ctime when it does stat on it after post-op is done. From NFS client's
perspective it thinks the file is changed. Tar which depends on this
to be correct keeps giving 'file changed as we read it' warning.
If Afr instead has to choose to unwind after post-op, eager-lock,
delayed-post-op will have to be disabled which will lead to bad
performance for all write usecases.
Fix:
Don't let client cache stat after write.
>Change-Id: Ic6062acc6e5cdd97a9c83c56bd529ec83cee8a23
>BUG: 1302948
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Signed-off-by: Anuradha Talur <atalur@redhat.com>
>Reviewed-on: http://review.gluster.org/13785
>Smoke: Gluster Build System <jenkins@build.gluster.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
>Reviewed-by: Niels de Vos <ndevos@redhat.com>
BUG: 1312721
Change-Id: I42a5d524bcf2a2034fe48ee8454812ca26a98c37
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14454
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
If a named fresh-lookup is done on an loc and the fop fails on one of the
bricks or not sent on one of the bricks, but by the time response comes to afr,
if the brick is up, 'can_interpret' will be set to false in afr_lookup_done(),
this will lead to inode-ctx for that inode to be not set, this can lead to EIO
in case of a transaction as it depends on 'readable' array to be available by
that point.
Fix:
Refresh inode for inode-write fops for the ctx to be set if it is not already
done at the time of named fresh-lookup or if the file is in split-brain where
we need to perform one more refresh before failing the fop to check if the file
is still in split-brain or not.
>BUG: 1336612
>Change-Id: I5c50b62c8de06129b8516039f7c252e5008c47a5
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/14368
>Smoke: Gluster Build System <jenkins@build.gluster.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Ravishankar N <ravishankar@redhat.com>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
>Backport of http://review.gluster.org/14545
BUG: 1337831
Change-Id: If4465ab8fc506e1f905b623b82a53bdab8f5cffd
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14453
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/#/c/14496/
> Change-Id: I7140e50263b5f28b900829592c664fa1d79f3f99
> BUG: 1338634
> Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Change-Id: I7140e50263b5f28b900829592c664fa1d79f3f99
BUG: 1338668
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/14497
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Race is explained at
https://bugzilla.redhat.com/show_bug.cgi?id=1337405#c0
This patch also handles performing of self-heal with shd-pid.
Also performs the healing with this->itable's inode rather than
main itable.
>BUG: 1337405
>Change-Id: Id657a6623b71998b027b1dff6af5bbdf8cab09c9
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/14422
>Smoke: Gluster Build System <jenkins@build.gluster.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
>Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
BUG: 1337872
Change-Id: I6d8e79a44e4cc1c5489d81f05c82510e4e90546f
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14456
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/14358
Problem:
Parallel rmdir operations on the same directory results in ENOTCONN messages
eventhough there was no network disconnect.
In blocking entry lock during rmdir, AFR takes 2 set of locks on all its
children-One (parentdir,name of dir to be deleted), the other (full lock
on the dir being deleted). We proceed to pre-op stage even if only a single
lock (but not all the needed locks) was obtained, only to fail it with ENOTCONN
because afr_locked_nodes_get() returns zero nodes in afr_changelog_pre_op().
Fix:
After we get replies for all blocking lock requests, if we don't have
the minimum number of locks to carry out the FOP, unlock and fail the
FOP. The op_errno will be that of the last failed reply we got, i.e.
whatever is set in afr_lock_cbk().
Change-Id: I9fcb6bec0335dd9cdd851a92cb08605b4a959e64
BUG: 1339446
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/14528
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Multi-threaded healing doesn't create synctask with shd pid, this
leads to healing problems when quota exceeds.
>BUG: 1332994
>Change-Id: I80f57c1923756f3298730b8820498127024e1209
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/14211
>Smoke: Gluster Build System <jenkins@build.gluster.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
>Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Change-Id: Id3f3ee44b27db7dbf94f3e7a9a6bfd7412d44ab8
BUG: 1335686
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14313
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
shards
Backport of http://review.gluster.org/#/c/14334/
Change-Id: I41321d8b00a10f1bd5b0a7b008f673b1aa240d0c
BUG: 1337837
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/14450
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/#/c/14310/
In afr_changelog_post_op_now(), if there was any error,
meaning op_ret < 0, post-op was not being done even when
the errors were symmetric and there were no "failed
subvols".
Fix:
When the errors are symmetric, perform post-op.
How was the bug found :
In a 1 X 3 volume with shard and write behind on
when writes were done into a file with one brick down,
the trusted.afr.dirty xattr's value for .shard directory
would keep increasing as post op was not done but pre-op was.
This incorrectly showed .shard to be in split-brain.
RCA:
When WB is on, due to multiple writes being sent on
offset lying in the same shard, chances are that
same shard file will be created more than once
with the second one failing with op_ret < 0
and op_errno = EEXIST.
As op_ret was negative, afr wouldn't do post-op,
leading to no decrement of trusted.afr.dirty xattr.
Thus showing .shard directory to be in split-brain.
>Change-Id: I711bdeaa1397244e6a7790e96f0c84501798fc59
>BUG: 1335652
>Signed-off-by: Anuradha Talur <atalur@redhat.com>
Change-Id: I711bdeaa1397244e6a7790e96f0c84501798fc59
BUG: 1335836
Signed-off-by: Anuradha Talur <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/14332
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "max cycle time" log message was incorrectly logged as
an error. Downgrade it to INFO.
This is a backport of 14336
> Change-Id: Ia7d074423019fa79443bc6ea694148b7b8da455d
> BUG: 1335973
> Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Change-Id: I29514c66781f49d5c36a0d3ad5dee6ab0c0368cd
BUG: 1336470
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/14361
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In case of 3 way replication with quorum enabled with sharding,
if one bricks is brought down and brought back up sometimes
fops fail with EROFS because the mknod of shard file fails with
two good nodes with EEXIST. So even when quorum is not met, it
makes sense to unwind with the errno returned by lower xlators
as much as possible.
>Change-Id: Iabd91cd7c270f5dfe6cbd18c50e59c299a331552
>BUG: 1336612
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/14369
>Smoke: Gluster Build System <jenkins@build.gluster.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
>Reviewed-by: Ravishankar N <ravishankar@redhat.com>
BUG: 1337831
Change-Id: I18979db118911e588da318094b2d22f5d426efd5
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14452
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For directory rename if destination exists the source directory
is created as a child of the given destination directory. Since
the new child directory does not exist take lock on parent of the
child directory.
Backport of http://review.gluster.org/#/c/14371/
> Change-Id: I24a34605a2cd65984910643ff5462f35e8fc7e71
> BUG: 1336698
> Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Change-Id: I24a34605a2cd65984910643ff5462f35e8fc7e71
BUG: 1337022
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/14407
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we had wrongly placed the clearing tier-fix-layout-complete
xattr before the joining of migration threads. This would lead to
situations where failure of clearing the xattr would cause the
premature death of migration threads.
Now we clear the xattr only after the data movement threads join,
ensuring that all migration is done.
This is a backport of 14285
> Change-Id: I829b671efa165ae13dbff7b00707434970b37a09
> BUG: 1334839
> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Change-Id: I475242e6a05cacd2252dc5c29b160e7abc5d1791
BUG: 1336148
Reviewed-on: http://review.gluster.org/14341
Smoke: Gluster Build System <jenkins@build.gluster.com>
Tested-by: N Balachandran <nbalacha@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During detach check if background fixlayout is done, if not done ignore
the case and continue detach.
Backport of http://review.gluster.org/14147
> Change-Id: I5d5cfc0e73d0eb217fdeab54c432dc4af8bc598d
> BUG: 1332136
> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
> Reviewed-on: http://review.gluster.org/14147
> Smoke: Gluster Build System <jenkins@build.gluster.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: N Balachandran <nbalacha@redhat.com>
> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
>Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Change-Id: I2161673cf6861b02a8e323366a13a13587258bef
BUG: 1333934
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/14246
Smoke: Gluster Build System <jenkins@build.gluster.com>
Tested-by: Joseph Fernandes
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/14302
Problem:
Spurious entries are reported in heal info when the mount is on second/third
brick of the replica pair because local-child is given preference in selecting
source. The code is supposed to suggest the file needs heal if the (source < 0)
(failure code path), but instead it is written as if any non-zero value
is considered failure.
Fix:
Treat +ve source as success case
BUG: 1334566
Change-Id: Iac6d68cc429496756a9d8f6e21e71aa5f6b932ee
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14304
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During locking we send lock request to cached subvol,
and normally we unlock to the cached subvol
But with parallel fresh lookup on a directory, there
is a race window where the cached subvol can change
and the unlock can go into a different subvol from
which we took lock.
This will result in a stale lock held on one of the
subvol.
So we will store the details of subvol which we took the lock
and will unlock from the same subvol
back port of>
>Change-Id: I47df99491671b10624eb37d1d17e40bacf0b15eb
>BUG: 1311002
>Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
>Reviewed-on: http://review.gluster.org/13492
>Reviewed-by: N Balachandran <nbalacha@redhat.com>
>Smoke: Gluster Build System <jenkins@build.gluster.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Change-Id: Ia847e7115d2296ae9811b14a956f3b6bf39bd86d
BUG: 1333645
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/14236
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/14212
.. to prevent unnecessary logs from gf_msg_callingfn()
Change-Id: I443322d26f2b5238320bc14f1ddc94affe030943
BUG: 1333241
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/14216
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Due to a race in timer cancellation, in some cases it was possible
to unlock the lock while another concurrent fop that needed it
continues execution as if it were not released.
This patch also fixes an issue that caused a lock to not be released
if an error was found while preparing ec_update_size_version().
> Change-Id: I1344a3f5ecfc333f05a09e62653838264c9c26b1
> BUG: 1331254
> Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
> Reviewed-on: http://review.gluster.org/14112
> Smoke: Gluster Build System <jenkins@build.gluster.com>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Chen Chen <chenchen@smartquerier.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Change-Id: I21edd17d914dfa8d2f98e6bbde50830496e12a92
BUG: 1330132
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/14174
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DHT did not handle rmdir failures on non-hashed subvols
correctly in a 2x2 dist-rep volume, causing the
directory do be deleted from the hashed subvol.
Also fixed an issue where the dht_selfheal_restore
errcodes were overwriting the rmdir error codes.
Change-Id: If2c6f8dc8ee72e3e6a7e04a04c2108243faca468
BUG: 1331933
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/14123
Smoke: Gluster Build System <jenkins@build.gluster.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/13448
commit c4d67a8338b42d6485f49999f310cbb9ed5359c5
It has become very difficult to identify the xlator which returned
negative op_ret. Being able to just change the log level and
visualize the stack is helpful in such cases.
Change-Id: I6545b4802c1ab4d0d230d5e9e036afb2384882e1
BUG: 1330739
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-on: http://review.gluster.org/14099
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
>BUG: 1329501
>Change-Id: Id402c20f2fa19b22bc402295e03e7a0ea96b0c40
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/14048
>Reviewed-by: Ravishankar N <ravishankar@redhat.com>
>Smoke: Gluster Build System <jenkins@build.gluster.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
>Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
>(cherry picked from commit 302e218f68ef5edab6b369411d6f06cafea08ce1)
Change-Id: Ifbf693f8de6765fca90a9ef3c11c1912c2e9885f
BUG: 1331342
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14104
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an ongoing rebalance completion check task been
triggered by dht, there is a possibility of a race
between afr setting subvol as non-readable and dht updates
the cached subvol. In this window a write can fail with EIO.
Back port of>
>Change-Id: I42638e6d4104c0dbe893d1bc73e1366188458c5d
>BUG: 1329503
>Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
>Reviewed-on: http://review.gluster.org/14049
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Smoke: Gluster Build System <jenkins@build.gluster.com>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
>Reviewed-by: N Balachandran <nbalacha@redhat.com>
>Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
(cherry picked from commit a9ccd0c8ea6989c72073028b296f73a6fcb6b896)
Change-Id: Ifc741f05a9cfe5c1d9c03085312ebbf21b8b798b
BUG: 1330428
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/14070
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/11113
Problem: During mount, afr waits for response from all its children before
notifying the parent xlator. In a 1x2 replica volume , if one of the nodes is
down, the mount will hang for more than a minute until child down is received
from the client xlator for that node.
Fix:
When parent up is received by afr, start a 10 second timer. In the timer call
back, if we receive a successful child up from atleast one brick, propagate the
event to the parent xlator.
Change-Id: I31e57c8802c1a03a4a5d581ee4ab82f3a9c8799d
BUG: 1330855
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14088
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/11433
The above mentioned patch was backported but it was not complete.
This patch backports whatever was left out. For more info, refer to
previous backport at http://review.gluster.org/#/c/11652.
Change-Id: I6ccda8ca8e43485b9b354341bbfcb302496f632c
BUG: 1330765
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-on: http://review.gluster.org/14084
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a part of CHILD_MODIFIED event DHT forgets the current layout and
performs fresh lookup. However this is not required when a replica pair
goes offline as the xattrs can be read from other replica pairs. Hence
setting different event to handle replica pair going down.
> Backport of http://review.gluster.org/#/c/12573/
> Change-Id: I5ede2a6398e63f34f89f9d3c9bc30598974402e3
> BUG: 1281230
> Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
> Reviewed-on: http://review.gluster.org/12573
> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
> Reviewed-by: Susant Palai <spalai@redhat.com>
> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Change-Id: Ida30240d1ad8b8730af7ab50b129dfb05264fdf9
BUG: 1283972
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/12767
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When quota is enabled the quota enforcer tries to get the size of the
source directory by sending nameless lookup to quotad. But if the rename
is successful even on one subvol or the source layout has anomalies then
this nameless lookup in quotad tries to heal the directory which requires
a lock on as many subvols as it can. But src is already locked as part of
rename. For rename to proceed in brick it needs to complete a cluster-wide
lookup. But cluster-wide lookup in quotad is blocked on locks held by rename,
hence a deadlock. To avoid this quota sends an option in xdata which instructs
DHT not to heal.
Backport of http://review.gluster.org/#/c/13988/
> Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0
> BUG: 1252244
> Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
> Reviewed-on: http://review.gluster.org/13988
> Smoke: Gluster Build System <jenkins@build.gluster.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0
BUG: 1328473
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/14031
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/14004
Problem: On a rebal stop, the migrator threads don't intimate the
crawler thread to wake up in case it is waiting on signal from
migrator thread.
BUG: 1330529
Change-Id: I9019a715c7b4673b8bb5a75d7d33a18add85ce33
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/14076
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a backport of the following two patches (of which the second is a
trivial adjustment to a timeout for a test added by the first).
http://review.gluster.org/13878
http://review.gluster.org/13935
This turns a special xattr into an rmdir with flags set. When that hits
the posix translator on the server side, that causes the file/directory
to be moved into the special "landfill" directory. From there, the
posix janitor thread will take care of deleting it entirely on the
server side - traversing it recursively if necessary. A couple of
secondary issues were fixed to make this effective.
* FUSE now ensures that setxattr values are NUL terminated.
* The janitor thread now gets woken up immediately when something is
placed in 'landfill' instead of only when file descriptors need to be
closed.
* The default landfill-emptying interval was reduced to 10s.
To use the feature, issue a setxattr something like this:
setfattr -n glusterfs.dht.nuke -v "" /mnt/glusterfs/vol/some_dir
The value doesn't actually matter; the mere receipt of a request with
this key is sufficient. Some day it might be useful to allow setting a
required value as a sort of password, so that only those who know it can
access the underlying special functionality.
Change-Id: I4132a30d1faa53a6682399ad1d9041e2c4519951
BUG: 1330241
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/14065
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks to Olia-Kremmyda for finding the bug on github review,
https://github.com/gluster/glusterfs/commit/b8106d1127f034ffa88b5dd322c23a10e023b9b6
>Change-Id: Ib8640ed0c331a635971d5d12052f0959c24f76a2
>BUG: 1329773
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/14052
>Smoke: Gluster Build System <jenkins@build.gluster.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
>Reviewed-by: Ravishankar N <ravishankar@redhat.com>
>Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
BUG: 1329779
Change-Id: I3d77f0b445fdedf2c582ea88f8d89e1da525638f
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14053
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dht_mkdir ()
{
first-hashed-subvol = hashed-subvol for "bname" in in-memory
layout of "parent";
inodelk (SETLKW, parent, "LAYOUT_HEAL_DOMAIN", "can be any
subvol, but we choose first-hashed-subvol randomly");
{
begin:
hashed-subvol = hashed-subvol for "bname" in in-memory
layout of "parent";
hash-range = extract hashe-range from layout of "parent";
ret = mkdir (parent/bname, hashed-subvol, hash-range);
if (ret == "hash-value doesn't fall into layout stored on
the brick (this error is returned by posix-mkdir)")
{
refresh_parent_layout ();
goto begin;
}
}
inodelk (UNLCK, parent, "LAYOUT_HEAL_DOMAIN",
"first-hashed-subvol");
proceed with other parts of dht_mkdir;
}
posix_mkdir (parent/bname, client-hash-range)
{
disk-hash-range = getxattr (parent, "dht-layout-key");
if (disk-hash-range != client-hash-range) {
fail-with-error ("hash-value doesn't fall into layout
stored on the brick");
return 0;
}
continue-with-posix-mkdir;
}
Similar changes need to be done for dentry operations like create,
symlink, link, unlink, rmdir, rename. These will be addressed in
subsequent patches. This patch addresses only mkdir codepath.
This change breaks stripe tests, as on some striped subvols dht layout
xattrs are not set for some reason. This results in failure of
mkdir. Since striped volumes are always created with dht, some tests
associated with stripe also fail. So, I am making following tests
changes (since stripe is out of maintainance):
* modify ./tests/basic/rpc-coverage.t to not to use striped volumes
* mark all (2) tests in tests/bugs/stripe/ as bad tests
Change-Id: Idd1ae879f24a48303dc743c1bb4d91f89a629e25
BUG: 1329062
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/14040
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|