| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Qsorted latencies must also take into account that a host can be
marked down yet still have a (slightly) better latency. This can
happen if max-replicas is reached, and there are multiple bricks which
meet the halo threshold. In such a case we must prefer the brick
which is marked up albeit with a perhaps slightly higher latency.
It's not worth doing any sort of swap in this case as it's going to be
jarring to the cluster and introduce flapping behavior.
Test Plan: - Halo prove tests @ https://phabricator.fb.com/P56251020
Reviewers: sshreyas, kvigor
Reviewed By: kvigor
Blame Revision: Change-Id: I858bbf44638a4978093dc56d4a98c96be4f8b45e
Change-Id: Ie3b79700ec63398bc30e7bcf75fb05931dd5d0d0
FB-commit-id: deb2f35db6f8b979857b4e3779b2a41c59a2e416
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16307
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Implements "Hybrid" halo mounts which perform write FOPs synchronously
to all regions and read FOPs asynchronously to those nodes within the
region.
- Leverages the fact that the inode cache stores "hints" as to the
clean/dirty state of upto 16 replicas in the cluster; upon refresh of
an inode we can do a fresh lookup to get a "wise" node if none exist
in our region
- Activated via the cluster.halo-hybrid-mode option
Test Plan:
- Run AFR prove tests
- Run halo prove tests
Reviewers: kvigor, sshreyas
Reviewed By: kvigor, sshreyas
FB-commit-id: aca760757afd45e4de2e28dc31b87a73ee52f12d
Change-Id: Ic6728ce93b7b96e3151dccdebccd30e007f4750c
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16306
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- SHD is now excluded from the max-replicas policy. We'd need
to make an SHD specific tunable for this to make tests reliably
pass, and frankly it probably makes things more intuitive having
SHD excluded (i.e. SHD can always see everything).
- Updated the halo-failover-enabled test, I think it's a bit more clear
now, and works reliably. halo.t fixed after fixing the SHD
max-replicas bug.
Test Plan: - Run prove tests -> https://phabricator.fb.com/P19872728
Reviewers: dph, sshreyas
Reviewed By: sshreyas
FB-commit-id: e425e6651cd02691d36427831b6b8ca206d0f78f
Change-Id: I57855ef99628146c32de59af475b096bd91d6012
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16305
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Changes halo-decision to be based on the lowest halo value observed
- Adds halo-min-sample option to wait until N latency samples have been
gathered prior to activating halos.
- Fixed 3 edge cases where halo's weren't being correctly
config'd, or not configured as quickly as is possible. Namely:
1. Don't mark a child down if there's no better alternative (and you'd
no longer satisfy min/max replicas); fixes unneccessary flapping.
2. If a child goes down and this causes us to fall below max_replicas,
swap in a warm child immediately if it is within our halo latency
(don't wait around for the next "ping"); swaps in a new child
immediately helping with resiliency.
3. If the child latency is within the halo, and it's currently marked
up, mark it down if it's the highest latency child and the number of
children is > max_replicas; this will allow us to support the
SHD use-case where we can "beam" a single copy to a geo and have it
replicate within the geo after that.
- More commenting
Test Plan:
- Run halo prove tests
- Pointed compiled code at gfsglobal.prn2, tested out an NFS daemon and
FUSE mounts to ensure they worked as expected on a large scale
cluster.
Reviewers: dph, jackl, cjh, mmckeen
Reviewed By: mmckeen
FB-commit-id: 7e2e8ae6b8ec62a5e0b31c9fd6100c81795b3424
Change-Id: Iba2b2f1bc848b4546cb96117ff1895f83953a4f8
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16304
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Realtime latencies in practice have far too much jitter under real
loading conditions, instead let's use a running average which will get
very "heavy" over time such that temp spikes in brick latency will not
affect halo decisions.
Test Plan: - Run prove tests
Reviewed By: mmckeen
Change-Id: I5ebf9bc93c67d9a226287796dd7ca5eeb7b1cfa5
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16301
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Adds "halo-failover-enabled" option to enable/disable failing over to a brick outside of the defined halo to satisfy min-replicas
- There are some use-cases where failing over to a brick which is out of region will be undesirable. I such cases we will more than likely opt to have more replicas within the region to tolerate the loss of a single replica in that region without losing quorum.
- Fixed quorum accounting problem as well, now correctly goes RO in case where we lose a brick and aren't able to swap one in for some reason (fail-over not enabled or otherwise)
Test Plan:
- run prove -v tests/basic/halo.t
- run prove -v tests/basic/halo-disable.t
- run prove -v tests/basic/halo-failover-enabled.t
- run prove -v tests/basic/halo-failover-disabled.t
Reviewers: dph, cjh, jackl, mmckeen
Reviewed By: mmckeen
Conflicts:
xlators/cluster/afr/src/afr.h
xlators/mount/fuse/utils/mount.glusterfs.in
Change-Id: Ia3ebf83f34b53118ca4491a3c4b66a178cc9795e
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16275
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- This option was broken (or shall we say....not fully implemented? :) ),
it needs to set the halo-latency to FLT_MAX to
gaurentee no bricks will be marked down, and all those presently
marked down @ runtime shall be marked back up upon the transition from
true to false
- Also fixes tests/bugs/fb2518260.t
Test Plan: Prove tests (paste coming)
Reviewers: cjh, dph, mmckeen
Reviewed By: mmckeen
Differential Revision: https://phabricator.fb.com/D1435264
Conflicts:
xlators/cluster/afr/src/afr-common.c
Change-Id: I81209d6f2cc9ea7e562eedf44bf3efbe87e01bf7
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16227
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
Several prove tests use the 'launch_cluster' function to set up a
clustered volume. This replies on using multiple local IP
addresses, one for each server. Since IPV6 provides only ::1 as
a local address, as opposed to IPv4's complete 127.x.x.x subnet,
this cannot work in a pure IPv6 environment.
However, FB systems do at least have enough IPv4 stack to talk
locally, so fix launch_cluster to work properly when default
transport is IPv6.
To do this:
1) explicitly set transport.address-family volume option to inet in
launch_cluster().
2) teach glusterd to honor transport.address-family when connecting
to peer glusterds in glusterd_friend_rpc_create(). Previously
transport.address-family was used only for binding local socket,
not for communicating with peers.
Test Plan:
prove -f --timer ./tests/basic/glusterd/arbiter-volume-probe.t
Reviewers:
Subscribers:
Tasks:
Blame Revision:
Change-Id: I077d8549dcdbe4919ac7df34856a4b2d1428cdb6
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16225
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- We have disabled the use of rmtab in our environment due to the its
impact on performance in NFS.
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: I6c2db19e49791aa4938f38a55dbb8ee3e17661e9
Reviewed-on: http://review.gluster.org/16220
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- We've seen lots of issues when the ping timeout is too low @ 60 seconds.
- This diff defaults the value to 180 seconds.
- This is a cherry-pick of D3753765 to 3.8.
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: I70b96b027ac024df63af4ca1aa768f973295b7e4
Reviewed-on: http://review.gluster.org/16219
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Enforces FUSE/gNFSd/SHD/rebalance rejection of writes when all subvolumes are
beyond the value set in "cluster.min-free-disk"
- Fixes existing code paths to be more intuitive & straightforward
- Write path now honors min-free-disk
- Adds test to ensure feature doesn't break in future
- This is a port of D2981282 to 3.8
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: I76923bf76178fe589aa1a26bd1970cf8d009642a
Reviewed-on: http://review.gluster.org/16153
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Tested-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- After the afr-common.c refactor the quorum accounting broke, this diff
fixes things so when an alternate brick is swapped in to provide the
min-replicas quorum accounting works correctly and the FS doesn't go
RO. In short, the magic which makes halo clusters seamlessly fail-over
to a remote region for writes broke :).
Test Plan:
prove -v tests/basic/halo-failover.t ->
https://phabricator.fb.com/P12179467
Reviewers: dph, jackl, cjh
Reviewed By: cjh
Subscribers: meyering
Differential Revision: https://phabricator.fb.com/D1390670
Tasks: 4117827
Change-Id: I2d7fb8ca1e80cd1b21cc12b11b0a3db812321080
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16203
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Add a configurable minimum free space for bricks, using the new
options storage.min-free-disk (analagous to cluster.min-free-disk,
and using the same units: either a percentage or an
absolute number of bytes) and storage.freespace-check-interval
(how frequently to check free space, in seconds).
- This is a cherry-pick of D2920210 to 3.8
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: I4b87e421aad023e49b5972c6e61539670a818411
Reviewed-on: http://review.gluster.org/16176
Tested-by: Shreyas Siravara <sshreyas@fb.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Captures the error of an operation in the FOP sample.
- Cherry-pick of D3306106 to io-stats
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: Ia32a5b34bbd36981ac693a8829c70fa74b02d38d
Reviewed-on: http://review.gluster.org/16175
Tested-by: Shreyas Siravara <sshreyas@fb.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- The result needs to be cast to (struct sockaddr_in *)
- This diff is a cherry-pick of D3111554 to 3.8
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: If4c27dbe6c032f9e278ea08cd3c96a4d07bcc5f9
Reviewed-on: http://review.gluster.org/16179
Tested-by: Shreyas Siravara <sshreyas@fb.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- The NFS server would occasionally send requests to the bricks which contained
a null parent gfid and a name, even though a proper parent inode was available.
This caused name resolution to fail on the brick. When this occurred while trying
to obtain a lock on a file, it could lead to the NFS server concluding there was
a lack of consensus, and hence returning EROFS.
- This is a cherry-pick of D3064740 into 3.8
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: Iae63c1f39abe74d4101058f494d1e14fda1c1912
Reviewed-on: http://review.gluster.org/16180
Tested-by: Shreyas Siravara <sshreyas@fb.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Master option for halo geo-replication
- Added prove test for halo mode
- Updated options do values which should work "out of the box" for most use-cases, just run "enable" halo mode and you are done, the options can be tweaked as needed from there.
Test Plan:
- Enabled option, verified halo works, disabled option verified async
behavior is disabled
- Ran "prove -v tests/basic/halo.t" -> https://phabricator.fb.com/P12074204
Reviewers: jackl, dph, cjh
Reviewed By: cjh
Subscribers: meyering
Differential Revision: https://phabricator.fb.com/D1386675
Tasks: 4117827
Conflicts:
xlators/cluster/afr/src/afr-self-heal-common.c
xlators/cluster/afr/src/afr.h
Change-Id: Ib704528d438c3a42150e30974eb6bb01d9e795ae
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16172
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Our keys are prefixed by storage.gluster, not gluster.
- This is a cherry-pick of D1184318.
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: Iaf3d3b4ccd285853c7750759db20404607941c0e
Reviewed-on: http://review.gluster.org/16151
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- A race condition exists when reconnecting to a brick after connection
has been lost; it is possible for the client translator to believe the
connection is down while the socket layer believes the connection is up.
This situation is permanent and eventually leads to loss of quorum
and EROFS errors.
- This is a cherry-pick of D3490020 to 3.8
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: Ida7afbafd3dceadf9ca7ea8b350aa88db382dd88
Reviewed-on: http://review.gluster.org/16174
Reviewed-by: Kevin Vigor <kvigor@fb.com>
Tested-by: Shreyas Siravara <sshreyas@fb.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
initialized
Summary:
- Disconnects RPC transports for requests that cannot be serviced
because volumes are not ready.
- This is a cherry-pick of D2991403
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: I07ff0795b81d25624541ff981b5f2586d078e9a6
Reviewed-on: http://review.gluster.org/16154
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- This diff adds the ability to track paths in our FOP samples.
- It adds a function called `attach_iosstat_to_inode()` which
attaches a special struct containing the filename, to the inode.
- This diff attaches a struct, `ios_local` to each frame, and tracks
paths, locs, fds, and inodes depending on the fop that it is
executing.
- Operations done on this inode can then reference this path.
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: Ie43b2193f66d8c7f59b5d07293e07d6120e3b20a
Reviewed-on: http://review.gluster.org/16149
Tested-by: Shreyas Siravara <sshreyas@fb.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Prevent crashes for the case where "getattr" actually failed and
returned with a NULL buf, but the op_errno was set to 0.
- This is a cherry-pick of D1571184
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: Ia2d6ad7539df714f9420dcf063c7c14e727bb7e3
Reviewed-on: http://review.gluster.org/16152
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summmary:
- Adds a new server-side daemon called gfproxyd & a new FUSE client
called gfproxy-client
- Adds a new translator called AHA (Advanced High-Availability) that
manages failover between gfproxy servers & FOP replay for failed FOPS.
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: Iba0bd54e6b4035b8d7914aab64bcac9e93089dd7
Reviewed-on: http://review.gluster.org/16136
Tested-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
Halo Geo-replication is a feature which allows Gluster or NFS clients to write locally to their region (as defined by a latency "halo" or threshold if you like), and have their writes asynchronously propagate from their origin to the rest of the cluster. Clients can also write synchronously to the cluster simply by specifying a halo-latency which is very large (e.g. 10seconds) which will include all bricks.
In other words, it allows clients to decide at mount time if they desire synchronous or asynchronous IO into a cluster and the cluster can support both of these modes to any number of clients simultaneously.
There are a few new volume options due to this feature:
halo-shd-latency: The threshold below which self-heal daemons will
consider children (bricks) connected.
halo-nfsd-latency: The threshold below which NFS daemons will consider
children (bricks) connected.
halo-latency: The threshold below which all other clients will
consider children (bricks) connected.
halo-min-replicas: The minimum number of replicas which are to
be enforced regardless of latency specified in the above 3 options.
If the number of children falls below this threshold the next
best (chosen by latency) shall be swapped in.
New FUSE mount options:
halo-latency & halo-min-replicas: As descripted above.
This feature combined with multi-threaded SHD support (D1271745) results in some pretty cool geo-replication possibilities.
Operational Notes:
- Global consistency is gaurenteed for synchronous clients, this is provided by the existing entry-locking mechanism.
- Asynchronous clients on the other hand and merely consistent to their region. Writes & deletes will be protected via entry-locks as usual preventing concurrent writes into files which are undergoing replication. Read operations on the other hand should never block.
- Writes are allowed from _any_ region and propagated from the origin to all other regions. The take away from this is care should be taken to ensure multiple writers do not write the same files resulting in a gfid split-brain which will require resolution via split-brain policies (majority, mtime & size). Recommended method for preventing this is using the nfs-auth feature to define which region for each share has RW permissions, tiers not in the origin region should have RO perms.
TODO:
- Synchronous clients (including the SHD) should choose clients from their own region as preferred sources for reads. Most of the plumbing is in place for this via the child_latency array.
- Better GFID split brain handling & better dent type split brain handling (i.e. create a trash can and move the offending files into it).
- Tagging in addition to latency as a means of defining which children you wish to synchronously write to
Test Plan:
- The usual suspects, clang, gcc w/ address sanitizer & valgrind
- Prove tests
Reviewers: jackl, dph, cjh, meyering
Reviewed By: meyering
Subscribers: ethanr
Differential Revision: https://phabricator.fb.com/D1272053
Tasks: 4117827
Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16099
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
page_size and window_size
Summary:
- It adds a configurable option for "trickling-writes".
- Makes `__wb_preprocess_winds()` use `wb_inode->window_conf` rather
than `page_size`, so that the window-size option is actually
respected.
- This is a port of D3576122 & D3738605 to 3.8.
Test Plan:
- Prove test which looks @ brick-level FOPs and ensures
that they fall in the right write-size bucket.
Reviewed By: rwareing
Signature: t1:3576122:1468892648:6923a6a19b18888577ce5173b5c9cb9531f941e7
Change-Id: I379a9f2f0c4768c9052b7e9dd71c5f0469cb2d68
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Reviewed-on: http://review.gluster.org/16079
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- We add an option to cache all extended attributes for an inode
- This is an option to bypass the whitelisted xattrs to cache
Test Plan:
- Prove tests
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: Ia52bed22aa8d84f953fe1d022df929674d716e9e
Reviewed-by: Kevin Vigor <kvigor@fb.com>
Tested-by: Shreyas Siravara <sshreyas@fb.com>
Reviewed-on: http://review.gluster.org/16126
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Fixes a bug where data-self-heal-window was ignored and instead
hard-coded to 128k
- Cherry-pick of D2752781
Test Plan:
- Prove tests
Reviewed By: sshreyas
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: Ie38456ce9ad90921f7456fe02aaace88393433a9
Reviewed-on: http://review.gluster.org/16083
Tested-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Reduce the amount of unnecessary timing calls
in iot_worker servicing.
- The current logic is unnecessarily accurate and
hurts performance for many small FOPS.
- This is a cherry-pick of D3156588 for 3.8
Test Plan:
- Prove tests
Change-Id: I6db4f1ad9a48d9d474bb251a2204969061021954
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Reviewed-on: http://review.gluster.org/16081
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- The maximum block size, `DHT_REBALANCE_BLKSIZE`, is set to 128 KB.
- As a result, migrating files in the megabytes to gigabytes can take much longer than necessary.
- Some preliminary results by bumping the blocksize:
With 128 KB:
[2016-08-04 11:40:19.251167] I [MSGID: 109028] [dht-rebalance.c:2196:gf_defrag_status_get] 0-glusterfs: Rebalance is completed. Time taken is 15.00 secs
[2016-08-04 11:40:19.251189] I [MSGID: 109028] [dht-rebalance.c:2200:gf_defrag_status_get] 0-glusterfs: Files migrated: 49, size: 2569011200, lookups: 149, failures: 0, skipped: 0
With 1 MB:
[2016-08-04 11:41:21.093662] I [MSGID: 109028] [dht-rebalance.c:2196:gf_defrag_status_get] 0-glusterfs: Rebalance is completed. Time taken is 7.00 secs
[2016-08-04 11:41:21.093687] I [MSGID: 109028] [dht-rebalance.c:2200:gf_defrag_status_get] 0-glusterfs: Files migrated: 49, size: 2569011200, lookups: 149, failures: 0, skipped: 0
- This is a cherry-pick of D3670927 to 3.8.
Test Plan: Tested rebalance on devserver.
Reviewed By: dph, rwareing
Change-Id: Ide2edbf87ef9ae2b32a03f189c57b63e2f233fc8
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Reviewed-on: http://review.gluster.org/16078
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Motivation: Prevents cluster instability by mis-behaving clients
causing bricks to OOM due to inode/entry lock pile-ups.
- Adds option to strip clients of entry/inode locks after N seconds
- Adds option to clear ALL locks should the revocation threshold get hit
- Adds option to clear all or granted locks should the max-blocked
threshold get hit (can be used in combination w/ revocation-clear-all).
- Options are:
features.locks-revocation-secs <integer; 0 to disable>
features.locks-revocation-clear-all [on/off]
features.locks-revocation-max-blocked <integer>
- Adds monkey-locking option to ignore 1% of unlock requests (dev only)
features.locks-monkey-unlocking [on/off]
- Adds logging to indicate revocation event & reason
Test Plan:
First you will need TWO fuse mounts for this repro. Call them /mnt/patchy1 & /mnt/patchy2.
1. Enable monkey unlocking on the volume:
gluster vol set patchy features.locks-monkey-unlocking on
2. From the "patchy1", use DD or some other utility to begin writing to a file,
eventually the dd will hang due to the dropped unlocked requests. This now
simulates the broken client. Run:
for i in {1..1000};do dd if=/dev/zero of=/mnt/patchy1/testfile bs=1k count=10;done'
...this will eventually hang as the unlock request has been lost.
3. Goto another window and setup the mount "patchy2" @ /mnt/patchy2, and
observe that 'echo "hello" >> /mnt/patchy2/testfile" will hang due to the
inability of the client to take out the required lock.
4. Next, re-start the test this time enabling lock revocation; use a timeout of
2-5 seconds for testing:
'gluster vol set patchy features.locks-revocation-secs <2-5>'
5. Wait 2-5 seconds before executing step 3 above this time. Observe that this
time the access to the file will succeed, and the writes on patchy1 will
unblock until they hit another failed unlock request due to
"monkey-unlocking".
Change-Id: I814b9f635fec53834a26db634d1300d9a61057d8
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14816
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-on: http://review.gluster.org/16086
Tested-by: Shreyas Siravara <sshreyas@fb.com>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- Background: Frequently spinlock is observed on busy GFS clusters,
which wastes CPU and destroys the performance of the cluster.
Current solutions to this problem involve under-provisioning the thread
pool, but this is problematic as during busy periods there may not be
enough threads to service the queue.
- This patch introduces a technique to avoid the stampeding herd problem with
the io-threads workers. This is done by dynamically tuning the
threads by a ratio of threads to queue depth, there-by keeping
already running threads sufficiently busy by a tunable FOP to thread
ratio. Ratio is controllable by the
performanace.io-threads-fops-per-threads-ratio option.
- More detailed reading on this approach can be found here:
https://h21007.www2.hp.com/portal/download/files/unprot/hpux/MakingConditionVariablesPerform.pdf
- Cherry-pick of D2530504 for 3.8
Test Plan:
- Stress teston my dev server
- shadow testing
Reviewed By: moox, sshreyas
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: I771ae783aa4ca5a6fd0449db64e07d1f4bff0d04
Reviewed-on: http://review.gluster.org/16080
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Shreyas Siravara <sshreyas@fb.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
- `is_mdc_key_satisfied()` is returning 0 when it has not checked any of the keys
- This causes the cache'd value for the root inode to always be invalid (`mdc_xattr_satisfied()` returns 0, which causes us to jump to `uncached').
- In this diff we add a new option called "strict-xattrs", when enabled winds getxattr calls for those keys not present in our cache.
- This allows "special" getxattr commands (quota cli commands for example) to work when md-cache is enabled.
- This is a port of D4135452
Test Plan:
- Test on devserver and see latency improvements for root inode.
- Prove tests
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: I8ff75595e821d7a714224b3b3dded23f0a93560a
Reviewed-on: http://review.gluster.org/16060
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
Tested-by: Shreyas Siravara <sshreyas@fb.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- This diff changes all locations in the code to prefer inet6 family
instead of inet. This will allow change GlusterFS to operate
via IPv6 instead of IPv4 for all internal operations while still
being able to serve (FUSE or NFS) clients via IPv4.
- The changes apply to NFS as well.
- This diff ports D1892990, D1897341 & D1896522 to the 3.8 branch.
Test Plan: Prove tests!
Reviewers: dph, rwareing
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Change-Id: I34fdaaeb33c194782255625e00616faf75d60c33
Reviewed-on: http://review.gluster.org/16059
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Tested-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/#/c/15800/
Change-Id: I2dd7ed69a456e8b9e54a4093f14dc16950bef081
BUG: 1393630
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15813
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
open-behind is taking fd->lock then inode->lock where as statedump is taking
inode->lock then fd->lock, so it is leading to deadlock
In open-behind, following code exists:
void
ob_fd_free (ob_fd_t *ob_fd)
{
loc_wipe (&ob_fd->loc); <<--- this takes (inode->lock)
.......
}
int
ob_wake_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
int op_ret, int op_errno, fd_t *fd_ret, dict_t *xdata)
{
.......
LOCK (&fd->lock); <<---- fd->lock
{
.......
__fd_ctx_del (fd, this, NULL);
ob_fd_free (ob_fd); <<<---------------
}
UNLOCK (&fd->lock);
.......
}
=================================================================
In statedump this code exists:
inode_dump (inode_t *inode, char *prefix)
{
.......
ret = TRY_LOCK(&inode->lock); <<---- inode->lock
.......
fd_ctx_dump (fd, prefix); <<<-----
.......
}
fd_ctx_dump (fd_t *fd, char *prefix)
{
.......
LOCK (&fd->lock); <<<------------------ this takes fd-lock
{
.......
}
Fix:
Make sure open-behind doesn't call ob_fd_free() inside fd->lock
>BUG: 1393259
>Change-Id: I4abdcfc5216270fa1e2b43f7b73445f49e6d6e6e
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/15808
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Poornima G <pgurusid@redhat.com>
>Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
BUG: 1393682
Change-Id: I45a0fbed683ef6acb7900df87534927f332fdaaa
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15818
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If extended attributes are not present in md-cache it returns NULL as xattr.
posix acl xlator should check for NULL before using xattr.
If normal and default ACLs are not set on file then md-cache will not contain
system.posix_acl_access and system.posix_acl_default extended attributes in
its cache.
Therefore posix_acl_lookup_cbk should check xattr before using it, otherwise
the logs will get filled with dictionary errors.
> Reviewed-on: http://review.gluster.org/15769
> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
(cherry picked from commit de7fe24663713fff364cfc2b52b675e3e979ee68)
Change-Id: Icebf73cf0b313bd3e82ca8cbda63786dd0fa47da
BUG: 1392868
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-on: http://review.gluster.org/15799
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/#/c/15788/
On a sharded volume when a brick is replaced while IO is going on, named
lookup on individual shards as part of read/write was failing with
ENOENT on the replaced brick, and as a result AFR initiated name heal in
lookup callback. But since pargfid was empty (which is what this patch
attempts to fix), the resolution of the shards by protocol/server used
to fail and the following pattern of logs was seen:
Brick-logs:
[2016-11-08 07:41:49.387127] W [MSGID: 115009]
[server-resolve.c:566:server_resolve] 0-rep-server: no resolution type
for (null) (LOOKUP)
[2016-11-08 07:41:49.387157] E [MSGID: 115050]
[server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null)
(00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82)
==> (Invalid argument) [Invalid argument]
Client-logs:
[2016-11-08 07:41:27.497687] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.497755] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.498500] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.499680] E [MSGID: 133010]
Also, this patch makes AFR by itself choose a non-NULL pargfid even if
its ancestors fail to initialize all pargfid placeholders.
Change-Id: Ica9e1b5b196ac37aafe6128e7aa0694a07245fdb
BUG: 1392846
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15796
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem
=======
When quota is enabled on 3.6, it will have quota conf version in quota.conf
as v1.1. This node gets upgraded to 3.7 but it will still have quota conf
version as v1.1 until a quota enable/disable/set limit is initiated. When
this is not initiated and when this node tries to peer probe a node which
is a fresh install of 3.7 (which will have quota conf version as v1.2), then this
will result in "Peer rejected" state. This patch fixes the issue.
Solution
========
When an upgrade happens from 3.6 to 3.7, quota.conf file needs
to be modified as well. With 3.6, in quota.conf the version will be
v1.1 and it needs to be changed to v1.2 from 3.7. This is because in
3.7, inode quota feature is introduced. So when an op-version bumpup
happens quota.conf needs to be upgraded with quota conf version v1.2
and all the 16 byte uuid needs to be changed to 17 bytes uuid as well.
Previously, when the cluster version is upgraded to 3.7, the quota.conf
got upgraded as well. But, the upgradation was done only when quota
enable/disable/set limit is done. With this patch, the upgradation is done
during a cluster op version bump up as well.
> Reviewed-on: http://review.gluster.org/15352
> Tested-by: Atin Mukherjee <amukherj@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
(cherry picked from commit 4b2cff614462508eef529c5d128e0974720e3f50)
Change-Id: Idb5ba29d3e1ea0e45c85d87c952c75da9e0f99f0
BUG: 1392716
Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-on: http://review.gluster.org/15791
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com>
Tested-by: Manikandan Selvaganesh <manikandancs333@gmail.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Check if S32gluster_enable_shared_storage.sh is present
at /var/lib/glusterd/hooks/1/set/post/ at staging
before proceeding with the command. Fail the command
with the appropriate error message in case it is not
present.
> Reviewed-on: http://review.gluster.org/15718
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
(cherry picked from commit 29587a91716e1e55bd172d63340c40249fb343c9)
Change-Id: I84e3912f1cdffb927f8a40d74d52be43ee69388b
BUG: 1377448
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/15741
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
> Reviewed-on: http://review.gluster.org/15668
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
(cherry picked from commit 48a4c4e525665115f7e8c478d3bf51764427378d)
Change-Id: Idc2cb16574d166e3c0ee1f7c3a485f1acb19fc8c
BUG: 1388354
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/15720
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a file is opened with O_TRUNC flag set, its size gets
set to '0'. This case needs to be handled in md-cache to
avoid sending incorrect cached stat.
This is backport of below mainline patch -
http://review.gluster.org/#/c/15618/
> Change-Id: I95d1f8a6634734898883ede010c3e7b0b7eb97d9
> BUG: 1382266
> Signed-off-by: Soumya Koduri <skoduri@redhat.com>
> Reviewed-on: http://review.gluster.org/15618
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
> Tested-by: jiffin tony Thottan <jthottan@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
> (cherry picked from commit 6ca5d6382f03685b31b12accb095093cf1486603)
Change-Id: I92349f5b48aef07f3790db7aae25bfa2ddb5947e
BUG: 1391450
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/15771
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the issue is happened in this case:
assume a file is opened with fd1 and fd2.
1. some WRITE opto fd1 got error, they were add back to 'todo' queue
because of those error.
2. fd2 closed, a FLUSH op is send to write-behind.
3. FLUSH can not be unwind because it's not a legal waiter for those
failed write(as func __wb_request_waiting_on() say). and those failed
WRITE also can not be ended if fd1 is not closed. fd2 stuck in close
syscall.
to resolve this issue, we can change the way we determine 2 requests is
'conflict': flush/fsync is not conflict with those write that is not
belonged to them. so __wb_pick_winds() can wind the FLUSH op.
below is some information when the stuck issue happen:
glusterdump logs:
[xlator.performance.write-behind.wb_inode]
path=/ltp-F9eG0ZSOME/rw-buffered-16436
inode=0x7fdbe8039b9c
window_conf=1048576
window_current=249856
transit-size=0
dontsync=0
[.WRITE]
request-ptr=0x7fdbe8020200
refcount=1
wound=no
generation-number=4
req->op_ret=-1
req->op_errno=116
sync-attempts=3
sync-in-progress=no
size=131072
offset=1220608
lied=-1
append=0
fulfilled=0
go=0
[.WRITE]
request-ptr=0x7fdbe8068c30
refcount=1
wound=no
generation-number=5
req->op_ret=-1
req->op_errno=116
sync-attempts=2
sync-in-progress=no
size=118784
offset=1351680
lied=-1
append=0
fulfilled=0
go=0
[.FLUSH]
request-ptr=0x7fdbe8021cd0
refcount=1
wound=no
generation-number=6
req->op_ret=0
req->op_errno=0
sync-attempts=0
gdb detail about above 3 requests:
(gdb) print *((wb_request_t *)0x7fdbe8021cd0)
$2 = {all = {next = 0x7fdbe803a608, prev = 0x7fdbe8068c30}, todo = {next
= 0x7fdbe803a618, prev = 0x7fdbe8068c40}, lie = {next = 0x7fdbe8021cf0,
prev = 0x7fdbe8021cf0}, winds = {next = 0x7fdbe8021d00, prev =
0x7fdbe8021d00}, unwinds = {next = 0x7fdbe8021d10, prev =
0x7fdbe8021d10}, wip = {
next = 0x7fdbe8021d20, prev = 0x7fdbe8021d20}, stub =
0x7fdbe80224dc, write_size = 0, orig_size = 0, total_size = 0, op_ret =
0, op_errno = 0,
refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_FLUSH, lk_owner
= {len = 8, data = "W\322T\f\271\367y$", '\000' <repeats 1015 times>},
iobref = 0x0, gen = 6, fd = 0x7fdbe800f0dc, wind_count = 0, ordering =
{size = 0, off = 0, append = 0, tempted = 0, lied = 0, fulfilled = 0,
go = 0}}
(gdb) print *((wb_request_t *)0x7fdbe8020200)
$3 = {all = {next = 0x7fdbe8068c30, prev = 0x7fdbe803a608}, todo = {next
= 0x7fdbe8068c40, prev = 0x7fdbe803a618}, lie = {next = 0x7fdbe8068c50,
prev = 0x7fdbe803a628}, winds = {next = 0x7fdbe8020230, prev =
0x7fdbe8020230}, unwinds = {next = 0x7fdbe8020240, prev =
0x7fdbe8020240}, wip = {
next = 0x7fdbe8020250, prev = 0x7fdbe8020250}, stub =
0x7fdbe8062c3c, write_size = 131072, orig_size = 4096, total_size = 0,
op_ret = -1,
op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop =
GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>},
iobref = 0x7fdbe80311a0, gen = 4, fd = 0x7fdbe805c89c, wind_count = 3,
ordering = {size = 131072, off = 1220608, append = 0, tempted = -1,
lied = -1, fulfilled = 0, go = 0}}
(gdb) print *((wb_request_t *)0x7fdbe8068c30)
$4 = {all = {next = 0x7fdbe8021cd0, prev = 0x7fdbe8020200}, todo = {next
= 0x7fdbe8021ce0, prev = 0x7fdbe8020210}, lie = {next = 0x7fdbe803a628,
prev = 0x7fdbe8020220}, winds = {next = 0x7fdbe8068c60, prev =
0x7fdbe8068c60}, unwinds = {next = 0x7fdbe8068c70, prev =
0x7fdbe8068c70}, wip = {
next = 0x7fdbe8068c80, prev = 0x7fdbe8068c80}, stub =
0x7fdbe806746c, write_size = 118784, orig_size = 4096, total_size = 0,
op_ret = -1,
op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop =
GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>},
iobref = 0x7fdbe8052b10, gen = 5, fd = 0x7fdbe805c89c, wind_count = 2,
ordering = {size = 118784, off = 1351680, append = 0, tempted = -1,
lied = -1, fulfilled = 0, go = 0}}
you can see they are all on 'todo' queue, and FLUSH op fd is not the
same WRITE op fd.
> Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab
> BUG: 1372211
> Signed-off-by: Ryan Ding <ryan.ding@open-fs.com>
Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab
BUG: 1390838
Signed-off-by: Ryan Ding <ryan.ding@open-fs.com>
Reviewed-on: http://review.gluster.org/15762
Tested-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was a misleading message from logs about
available disk space while rebalancing of bricks
while calculating free space.
Bug: 1390870
Backprt of http://review.gluster.org/#/c/15345/
Change-Id: Ie9df0b2cbf00faaf13a0a3f0dbd657377a082755
Signed-off-by: ankitraj <anraj@redhat.com>
Reviewed-on: http://review.gluster.org/15765
Tested-by: ankitraj
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fedora has updated openssl-1.1.0b in/for Fedora 26
HMAC_CTX is now an opaque type and instances of it must be
created and released by calling HMAC_CTX_new() and
HMAC_CTX_free().
Change-Id: I3a09751d7b0d9fc25fe18aac6527e5431e9ab19a
BUG: 1388580
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/15727
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the
fix.
>BUG: 1384297
>Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/15728
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Change-Id: I7646adc3771ff76cdf9c979b575bbcd0b3bc1b9a
BUG: 1388948
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15735
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When one thread is in the process of creating a file/directory
and the other thread is doing readdirp, there is a chance that
posix_pstat, creation fops race in the following manner which
will lead to wrong stat values to be read by parent xlators
like posix-acl.
Creation fops posix_pstat() as part of readdirp
1) file is created with uid/gid 0/0 1) does stat of the path that
is created just now.
2) Does chown to set the correct
uid/gid
3) Sets the acl/user/internal xattrs
4) Sets the gfid on the entry and
completes the creation of the file/dir
2) fills the gfid in the iatt
If unwind of readdirp hits server xlator before creation fop, then
posix-acl remembers uid/gid of the file to be root/root and fails
fops like open etc on it.
Fix:
Reverse the order of filling gfid and filling lstat() values in
posix_pstat() so that if there is gfid in iatt buffer uid/gid
are valid.
>Change-Id: I46caa7f6da7abfa40a0b1d70e35b88de9c64959c
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/15564
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
BUG: 1386071
Change-Id: I81c4c6e6b8d4037cee9b23da262b254c126c0a19
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15665
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
wb_fulfill_request
Before this patch, a request is removed from liability queue only when
ref count of request hits 0. Though, wb_fulfill_request does an unref,
it need not be the last unref and hence the request may survive in
liability queue till the last unref. Let,
T1: the time at which wb_fulfill_request is invoked
T2: the time at which last unref is done on request
Let's consider a case of T2 > T1. In the time window between T1 and
T2, any other request (waiter) conflicting with request in liability
queue (blocker - basically a write which has been lied) is blocked
from winding. If T2 happens to be when wb_do_unwinds is invoked, no
further processing of request list happens and "waiter" would get
blocked forever. An example imaginary sequence of events is given
below:
1. A write request w1 is picked up for unwinding in __wb_pick_unwinds
(but unwind is not done _yet_ and hence reference
remains). However, w1 is moved to liability queue. Let's call this
invocation of wb_process_queue by wb_writev as PQ1.
2. A flush (f1) request hits write behind. Since the liability queue
of inode is not empty, f1 is not picked for unwinding. Let's call
the invocation of wb_process_queue by wb_flush as PQ2.
3. PQ2 continues and picks w1 for fulfilling and invokes
wb_fulfill. As part of successful wb_fulfill_cbk,
wb_fulfill_request (w1) is invoked. But, w1 is not freed (and hence
not removed from liability queue) as w1 is not unwound _yet_ and a
ref remains (PQ1 has not invoked wb_do_unwinds _yet_).
4. wb_fulfill_cbk (triggered by PQ2) invokes a wb_process_queue (let's
say PQ3). f1 is not resumed in PQ3 as w1 is still in liability
queue. At this time, PQ2 and PQ3 are complete.
5. PQ1 continues, unwinds w1 and does last unref on w1 and w1 is freed
(and removed from liability queue). Since PQ1 didn't invoke
wb_fulfill on any other write requests, there won't be any future
codepaths that would invoke wb_process_queue and f1 is stuck
forever.
With this fix, w1 is removed from liability queue in step 3 above and
PQ3 resumes f1 in step 4 (as there are no requests conflicting with f1
in liability queue during execution of PQ3).
> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
> BUG: 1379655
> Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb
> Reviewed-on: http://review.gluster.org/15579
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
(cherry picked from commit a8b2a981881221925bb5edfe7bb65b25ad855c04)
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
BUG: 1385620
Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb
Reviewed-on: http://review.gluster.org/15658
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In afr lookup when NULL dict is received in lookup, afr
is supposed to set all the xattrs it requires in a new dict
it creates, but for 'link-count' it is trying to set to the
dict that is passed in lookup which can be NULL sometimes.
This is leading to error logs. Fixed the same in this patch.
>BUG: 1385104
>Change-Id: I679af89cfc410cbc35557ae0691763a05eb5ed0e
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/15646
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Ravishankar N <ravishankar@redhat.com>
>BUG: 1385236
>Change-Id: I802e74e7ad24e183b6653101ad7bf5ab0bf6e55b
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/15650
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Ravishankar N <ravishankar@redhat.com>
BUG: 1385442
Change-Id: Ie93b25e8cf52b0d58ef335929dfa10a78a0dd734
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15651
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Sharding exposed a bug in arbiter config. where `dd` throughput was
extremely slow. Shard xlator was sending a fxattrop to update the file
size immediately after a writev. Arbiter was incorrectly over-riding the
LLONGMAX-1 start offset (for metadata domain locks) for this fxattrop,
causing the inodelk to be taken on the data domain. And since the
preceeding writev hadn't released the lock (afr does a 'lazy'
unlock if write succeeds on all bricks), this degraded to a blocking
lock causing extra lock/unlock calls and delays.
Fix:
Modify flock.l_len and flock.l_start to take full locks only for data
transactions.
> Reviewed-on: http://review.gluster.org/15641
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
(cherry picked from commit 3a97486d7f9d0db51abcb13dcd3bc9db935e3a60)
Change-Id: I906895da2f2d16813607e6c906cb4defb21d7c3b
BUG: 1375125
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Max Raba <max.raba@comsysto.com>
Reviewed-on: http://review.gluster.org/15647
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In some cases we see that readdir keeps winding to the brick that doesn't have
any blocked locks i.e. first brick. This is leading to the client assuming that
there are no blocking locks on the inode so it won't give away the lock. Other
clients end up blocked on the lock as if the command hung.
Fix:
Proper way to fix this issue is to use infra present in
http://review.gluster.org/14736 This is a stop gap fix where we start taking
inodelks in opendir which goes to all the bricks, this will detect if there is
any contention.
cherry picked from commit f013335400d033a9677797377b90b968803135f4:
>BUG: 1346719
>Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/15309
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Ashish Pandey <aspandey@redhat.com>
>Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9
BUG: 1371397
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/15405
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|