glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
...
* \|	cluster/afr: Hybrid mounts must honor _marked_ up/down states	Richard Wareing	2016-12-29	1	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Qsorted latencies must also take into account that a host can be marked down yet still have a (slightly) better latency. This can happen if max-replicas is reached, and there are multiple bricks which meet the halo threshold. In such a case we must prefer the brick which is marked up albeit with a perhaps slightly higher latency. It's not worth doing any sort of swap in this case as it's going to be jarring to the cluster and introduce flapping behavior. Test Plan: - Halo prove tests @ https://phabricator.fb.com/P56251020 Reviewers: sshreyas, kvigor Reviewed By: kvigor Blame Revision: Change-Id: I858bbf44638a4978093dc56d4a98c96be4f8b45e Change-Id: Ie3b79700ec63398bc30e7bcf75fb05931dd5d0d0 FB-commit-id: deb2f35db6f8b979857b4e3779b2a41c59a2e416 Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: http://review.gluster.org/16307 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* \|	cluster/afr: Hybrid Halo mounts	Richard Wareing	2016-12-29	4	-20/+260
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Implements "Hybrid" halo mounts which perform write FOPs synchronously to all regions and read FOPs asynchronously to those nodes within the region. - Leverages the fact that the inode cache stores "hints" as to the clean/dirty state of upto 16 replicas in the cluster; upon refresh of an inode we can do a fresh lookup to get a "wise" node if none exist in our region - Activated via the cluster.halo-hybrid-mode option Test Plan: - Run AFR prove tests - Run halo prove tests Reviewers: kvigor, sshreyas Reviewed By: kvigor, sshreyas FB-commit-id: aca760757afd45e4de2e28dc31b87a73ee52f12d Change-Id: Ic6728ce93b7b96e3151dccdebccd30e007f4750c Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: http://review.gluster.org/16306 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* \|	Fix Halo tests in v3.6.3 of GlusterFS + minor SHD bug fix	Richard Wareing	2016-12-29	1	-7/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - SHD is now excluded from the max-replicas policy. We'd need to make an SHD specific tunable for this to make tests reliably pass, and frankly it probably makes things more intuitive having SHD excluded (i.e. SHD can always see everything). - Updated the halo-failover-enabled test, I think it's a bit more clear now, and works reliably. halo.t fixed after fixing the SHD max-replicas bug. Test Plan: - Run prove tests -> https://phabricator.fb.com/P19872728 Reviewers: dph, sshreyas Reviewed By: sshreyas FB-commit-id: e425e6651cd02691d36427831b6b8ca206d0f78f Change-Id: I57855ef99628146c32de59af475b096bd91d6012 Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: http://review.gluster.org/16305 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* \|	Add halo-min-samples option, better swap logic, edge case fixes	Richard Wareing	2016-12-28	5	-44/+136
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Changes halo-decision to be based on the lowest halo value observed - Adds halo-min-sample option to wait until N latency samples have been gathered prior to activating halos. - Fixed 3 edge cases where halo's weren't being correctly config'd, or not configured as quickly as is possible. Namely: 1. Don't mark a child down if there's no better alternative (and you'd no longer satisfy min/max replicas); fixes unneccessary flapping. 2. If a child goes down and this causes us to fall below max_replicas, swap in a warm child immediately if it is within our halo latency (don't wait around for the next "ping"); swaps in a new child immediately helping with resiliency. 3. If the child latency is within the halo, and it's currently marked up, mark it down if it's the highest latency child and the number of children is > max_replicas; this will allow us to support the SHD use-case where we can "beam" a single copy to a geo and have it replicate within the geo after that. - More commenting Test Plan: - Run halo prove tests - Pointed compiled code at gfsglobal.prn2, tested out an NFS daemon and FUSE mounts to ensure they worked as expected on a large scale cluster. Reviewers: dph, jackl, cjh, mmckeen Reviewed By: mmckeen FB-commit-id: 7e2e8ae6b8ec62a5e0b31c9fd6100c81795b3424 Change-Id: Iba2b2f1bc848b4546cb96117ff1895f83953a4f8 Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: http://review.gluster.org/16304 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* \|	Make Halo calculate & use average latencies, not realtime	Richard Wareing	2016-12-27	2	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Realtime latencies in practice have far too much jitter under real loading conditions, instead let's use a running average which will get very "heavy" over time such that temp spikes in brick latency will not affect halo decisions. Test Plan: - Run prove tests Reviewed By: mmckeen Change-Id: I5ebf9bc93c67d9a226287796dd7ca5eeb7b1cfa5 Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: http://review.gluster.org/16301 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* \|	Add option to toggle x-halo fail-over	Richard Wareing	2016-12-27	5	-9/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Adds "halo-failover-enabled" option to enable/disable failing over to a brick outside of the defined halo to satisfy min-replicas - There are some use-cases where failing over to a brick which is out of region will be undesirable. I such cases we will more than likely opt to have more replicas within the region to tolerate the loss of a single replica in that region without losing quorum. - Fixed quorum accounting problem as well, now correctly goes RO in case where we lose a brick and aren't able to swap one in for some reason (fail-over not enabled or otherwise) Test Plan: - run prove -v tests/basic/halo.t - run prove -v tests/basic/halo-disable.t - run prove -v tests/basic/halo-failover-enabled.t - run prove -v tests/basic/halo-failover-disabled.t Reviewers: dph, cjh, jackl, mmckeen Reviewed By: mmckeen Conflicts: xlators/cluster/afr/src/afr.h xlators/mount/fuse/utils/mount.glusterfs.in Change-Id: Ia3ebf83f34b53118ca4491a3c4b66a178cc9795e Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: http://review.gluster.org/16275 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* \|	Fix halo-enabled option	Richard Wareing	2016-12-21	1	-8/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - This option was broken (or shall we say....not fully implemented? :) ), it needs to set the halo-latency to FLT_MAX to gaurentee no bricks will be marked down, and all those presently marked down @ runtime shall be marked back up upon the transition from true to false - Also fixes tests/bugs/fb2518260.t Test Plan: Prove tests (paste coming) Reviewers: cjh, dph, mmckeen Reviewed By: mmckeen Differential Revision: https://phabricator.fb.com/D1435264 Conflicts: xlators/cluster/afr/src/afr-common.c Change-Id: I81209d6f2cc9ea7e562eedf44bf3efbe87e01bf7 Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: http://review.gluster.org/16227 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* \|	Repair cluster prove tests for FB environment	Kevin Vigor	2016-12-21	2	-4/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Several prove tests use the 'launch_cluster' function to set up a clustered volume. This replies on using multiple local IP addresses, one for each server. Since IPV6 provides only ::1 as a local address, as opposed to IPv4's complete 127.x.x.x subnet, this cannot work in a pure IPv6 environment. However, FB systems do at least have enough IPv4 stack to talk locally, so fix launch_cluster to work properly when default transport is IPv6. To do this: 1) explicitly set transport.address-family volume option to inet in launch_cluster(). 2) teach glusterd to honor transport.address-family when connecting to peer glusterds in glusterd_friend_rpc_create(). Previously transport.address-family was used only for binding local socket, not for communicating with peers. Test Plan: prove -f --timer ./tests/basic/glusterd/arbiter-volume-probe.t Reviewers: Subscribers: Tasks: Blame Revision: Change-Id: I077d8549dcdbe4919ac7df34856a4b2d1428cdb6 Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: http://review.gluster.org/16225 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* \|	nfs: [FB-ONLY] Disable rmtab	Shreyas Siravara	2016-12-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - We have disabled the use of rmtab in our environment due to the its impact on performance in NFS. Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: I6c2db19e49791aa4938f38a55dbb8ee3e17661e9 Reviewed-on: http://review.gluster.org/16220 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Shreyas Siravara <sshreyas@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com>
* \|	client: Increase default ping-timeout to 180 seconds	Shreyas Siravara	2016-12-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - We've seen lots of issues when the ping timeout is too low @ 60 seconds. - This diff defaults the value to 180 seconds. - This is a cherry-pick of D3753765 to 3.8. Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: I70b96b027ac024df63af4ca1aa768f973295b7e4 Reviewed-on: http://review.gluster.org/16219 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Shreyas Siravara <sshreyas@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com>
* \|	cluster/dht: Bug fixes to cluster.min-free-disk	Richard Wareing	2016-12-20	9	-36/+151
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Enforces FUSE/gNFSd/SHD/rebalance rejection of writes when all subvolumes are beyond the value set in "cluster.min-free-disk" - Fixes existing code paths to be more intuitive & straightforward - Write path now honors min-free-disk - Adds test to ensure feature doesn't break in future - This is a port of D2981282 to 3.8 Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: I76923bf76178fe589aa1a26bd1970cf8d009642a Reviewed-on: http://review.gluster.org/16153 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com> Tested-by: Shreyas Siravara <sshreyas@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* \|	Fixes halo multi-region fail-over regression	Richard Wareing	2016-12-20	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - After the afr-common.c refactor the quorum accounting broke, this diff fixes things so when an alternate brick is swapped in to provide the min-replicas quorum accounting works correctly and the FS doesn't go RO. In short, the magic which makes halo clusters seamlessly fail-over to a remote region for writes broke :). Test Plan: prove -v tests/basic/halo-failover.t -> https://phabricator.fb.com/P12179467 Reviewers: dph, jackl, cjh Reviewed By: cjh Subscribers: meyering Differential Revision: https://phabricator.fb.com/D1390670 Tasks: 4117827 Change-Id: I2d7fb8ca1e80cd1b21cc12b11b0a3db812321080 Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: http://review.gluster.org/16203 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* \|	storage/posix: Add free space limits to bricks	Kevin Vigor	2016-12-19	4	-1/+152
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Add a configurable minimum free space for bricks, using the new options storage.min-free-disk (analagous to cluster.min-free-disk, and using the same units: either a percentage or an absolute number of bytes) and storage.freespace-check-interval (how frequently to check free space, in seconds). - This is a cherry-pick of D2920210 to 3.8 Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: I4b87e421aad023e49b5972c6e61539670a818411 Reviewed-on: http://review.gluster.org/16176 Tested-by: Shreyas Siravara <sshreyas@fb.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* \|	debug/io-stats: Add errors to FOP samples	Shreyas Siravara	2016-12-18	1	-50/+63
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Captures the error of an operation in the FOP sample. - Cherry-pick of D3306106 to io-stats Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: Ia32a5b34bbd36981ac693a8829c70fa74b02d38d Reviewed-on: http://review.gluster.org/16175 Tested-by: Shreyas Siravara <sshreyas@fb.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com>
* \|	nfs: Fix compiler warning when calling svc_getcaller	Kevin Vigor	2016-12-18	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - The result needs to be cast to (struct sockaddr_in *) - This diff is a cherry-pick of D3111554 to 3.8 Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: If4c27dbe6c032f9e278ea08cd3c96a4d07bcc5f9 Reviewed-on: http://review.gluster.org/16179 Tested-by: Shreyas Siravara <sshreyas@fb.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* \|	nfs: Fill in pargfid in NFS requests	Kevin Vigor	2016-12-17	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - The NFS server would occasionally send requests to the bricks which contained a null parent gfid and a name, even though a proper parent inode was available. This caused name resolution to fail on the brick. When this occurred while trying to obtain a lock on a file, it could lead to the NFS server concluding there was a lack of consensus, and hence returning EROFS. - This is a cherry-pick of D3064740 into 3.8 Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: Iae63c1f39abe74d4101058f494d1e14fda1c1912 Reviewed-on: http://review.gluster.org/16180 Tested-by: Shreyas Siravara <sshreyas@fb.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* \|	Adding halo-enable option	Richard Wareing	2016-12-17	3	-7/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Master option for halo geo-replication - Added prove test for halo mode - Updated options do values which should work "out of the box" for most use-cases, just run "enable" halo mode and you are done, the options can be tweaked as needed from there. Test Plan: - Enabled option, verified halo works, disabled option verified async behavior is disabled - Ran "prove -v tests/basic/halo.t" -> https://phabricator.fb.com/P12074204 Reviewers: jackl, dph, cjh Reviewed By: cjh Subscribers: meyering Differential Revision: https://phabricator.fb.com/D1386675 Tasks: 4117827 Conflicts: xlators/cluster/afr/src/afr-self-heal-common.c xlators/cluster/afr/src/afr.h Change-Id: Ib704528d438c3a42150e30974eb6bb01d9e795ae Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: http://review.gluster.org/16172 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* \|	debug/io-stats: [FB-Only] Update JSON key name to 'storage.gluster'	Shreyas Siravara	2016-12-17	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Our keys are prefixed by storage.gluster, not gluster. - This is a cherry-pick of D1184318. Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: Iaf3d3b4ccd285853c7750759db20404607941c0e Reviewed-on: http://review.gluster.org/16151 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* \|	protocol/client: Fix race in brick reconnection	Kevin Vigor	2016-12-16	1	-5/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - A race condition exists when reconnecting to a brick after connection has been lost; it is possible for the client translator to believe the connection is down while the socket layer believes the connection is up. This situation is permanent and eventually leads to loss of quorum and EROFS errors. - This is a cherry-pick of D3490020 to 3.8 Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: Ida7afbafd3dceadf9ca7ea8b350aa88db382dd88 Reviewed-on: http://review.gluster.org/16174 Reviewed-by: Kevin Vigor <kvigor@fb.com> Tested-by: Shreyas Siravara <sshreyas@fb.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* \|	nfs: Tear down transports for requests that arrive before the volume is	Shreyas Siravara	2016-12-16	1	-0/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	initialized Summary: - Disconnects RPC transports for requests that cannot be serviced because volumes are not ready. - This is a cherry-pick of D2991403 Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: I07ff0795b81d25624541ff981b5f2586d078e9a6 Reviewed-on: http://review.gluster.org/16154 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* \|	debug/io-stats: Track path of operations in FOP samples	Shreyas Siravara	2016-12-16	1	-124/+447
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - This diff adds the ability to track paths in our FOP samples. - It adds a function called `attach_iosstat_to_inode()` which attaches a special struct containing the filename, to the inode. - This diff attaches a struct, `ios_local` to each frame, and tracks paths, locs, fds, and inodes depending on the fop that it is executing. - Operations done on this inode can then reference this path. Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: Ie43b2193f66d8c7f59b5d07293e07d6120e3b20a Reviewed-on: http://review.gluster.org/16149 Tested-by: Shreyas Siravara <sshreyas@fb.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* \|	nfs: Check for null buf, and set op_errno to EIO not 0	Richard Wareing	2016-12-16	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Prevent crashes for the case where "getattr" actually failed and returned with a NULL buf, but the op_errno was set to 0. - This is a cherry-pick of D1571184 Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: Ia2d6ad7539df714f9420dcf063c7c14e727bb7e3 Reviewed-on: http://review.gluster.org/16152 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com>
* \|	gfproxy: Introduce new server-side daemon called GFProxy	Shreyas Siravara	2016-12-16	19	-21/+2730
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summmary: - Adds a new server-side daemon called gfproxyd & a new FUSE client called gfproxy-client - Adds a new translator called AHA (Advanced High-Availability) that manages failover between gfproxy servers & FOP replay for failed FOPS. Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: Iba0bd54e6b4035b8d7914aab64bcac9e93089dd7 Reviewed-on: http://review.gluster.org/16136 Tested-by: Shreyas Siravara <sshreyas@fb.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com>
* \|	Halo Replication feature for AFR translator	Richard Wareing	2016-12-15	12	-99/+545
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Halo Geo-replication is a feature which allows Gluster or NFS clients to write locally to their region (as defined by a latency "halo" or threshold if you like), and have their writes asynchronously propagate from their origin to the rest of the cluster. Clients can also write synchronously to the cluster simply by specifying a halo-latency which is very large (e.g. 10seconds) which will include all bricks. In other words, it allows clients to decide at mount time if they desire synchronous or asynchronous IO into a cluster and the cluster can support both of these modes to any number of clients simultaneously. There are a few new volume options due to this feature: halo-shd-latency: The threshold below which self-heal daemons will consider children (bricks) connected. halo-nfsd-latency: The threshold below which NFS daemons will consider children (bricks) connected. halo-latency: The threshold below which all other clients will consider children (bricks) connected. halo-min-replicas: The minimum number of replicas which are to be enforced regardless of latency specified in the above 3 options. If the number of children falls below this threshold the next best (chosen by latency) shall be swapped in. New FUSE mount options: halo-latency & halo-min-replicas: As descripted above. This feature combined with multi-threaded SHD support (D1271745) results in some pretty cool geo-replication possibilities. Operational Notes: - Global consistency is gaurenteed for synchronous clients, this is provided by the existing entry-locking mechanism. - Asynchronous clients on the other hand and merely consistent to their region. Writes & deletes will be protected via entry-locks as usual preventing concurrent writes into files which are undergoing replication. Read operations on the other hand should never block. - Writes are allowed from _any_ region and propagated from the origin to all other regions. The take away from this is care should be taken to ensure multiple writers do not write the same files resulting in a gfid split-brain which will require resolution via split-brain policies (majority, mtime & size). Recommended method for preventing this is using the nfs-auth feature to define which region for each share has RW permissions, tiers not in the origin region should have RO perms. TODO: - Synchronous clients (including the SHD) should choose clients from their own region as preferred sources for reads. Most of the plumbing is in place for this via the child_latency array. - Better GFID split brain handling & better dent type split brain handling (i.e. create a trash can and move the offending files into it). - Tagging in addition to latency as a means of defining which children you wish to synchronously write to Test Plan: - The usual suspects, clang, gcc w/ address sanitizer & valgrind - Prove tests Reviewers: jackl, dph, cjh, meyering Reviewed By: meyering Subscribers: ethanr Differential Revision: https://phabricator.fb.com/D1272053 Tasks: 4117827 Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1 Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: http://review.gluster.org/16099 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* \|	write-behind: Allow trickling-writes to be configurable, fix usage of ↵	Shreyas Siravara	2016-12-14	2	-4/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	page_size and window_size Summary: - It adds a configurable option for "trickling-writes". - Makes `__wb_preprocess_winds()` use `wb_inode->window_conf` rather than `page_size`, so that the window-size option is actually respected. - This is a port of D3576122 & D3738605 to 3.8. Test Plan: - Prove test which looks @ brick-level FOPs and ensures that they fall in the right write-size bucket. Reviewed By: rwareing Signature: t1:3576122:1468892648:6923a6a19b18888577ce5173b5c9cb9531f941e7 Change-Id: I379a9f2f0c4768c9052b7e9dd71c5f0469cb2d68 Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Reviewed-on: http://review.gluster.org/16079 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* \|	performance/md-cache: Add an option to cache all xattrs for an inode	Shreyas Siravara	2016-12-14	1	-11/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - We add an option to cache all extended attributes for an inode - This is an option to bypass the whitelisted xattrs to cache Test Plan: - Prove tests Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: Ia52bed22aa8d84f953fe1d022df929674d716e9e Reviewed-by: Kevin Vigor <kvigor@fb.com> Tested-by: Shreyas Siravara <sshreyas@fb.com> Reviewed-on: http://review.gluster.org/16126 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* \|	afr/cluster: Restore data-self-heal-window option	Richard Wareing	2016-12-09	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Fixes a bug where data-self-heal-window was ignored and instead hard-coded to 128k - Cherry-pick of D2752781 Test Plan: - Prove tests Reviewed By: sshreyas Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: Ie38456ce9ad90921f7456fe02aaace88393433a9 Reviewed-on: http://review.gluster.org/16083 Tested-by: Shreyas Siravara <sshreyas@fb.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com>
* \|	performance/io-threads: Reduce the number of timing calls in iot_worker	Max Rijevski	2016-12-09	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Reduce the amount of unnecessary timing calls in iot_worker servicing. - The current logic is unnecessarily accurate and hurts performance for many small FOPS. - This is a cherry-pick of D3156588 for 3.8 Test Plan: - Prove tests Change-Id: I6db4f1ad9a48d9d474bb251a2204969061021954 Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Reviewed-on: http://review.gluster.org/16081 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com>
* \|	dht/rebalance: Increase maximum read block size from 128 KB to 1 MB	Shreyas Siravara	2016-12-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - The maximum block size, `DHT_REBALANCE_BLKSIZE`, is set to 128 KB. - As a result, migrating files in the megabytes to gigabytes can take much longer than necessary. - Some preliminary results by bumping the blocksize: With 128 KB: [2016-08-04 11:40:19.251167] I [MSGID: 109028] [dht-rebalance.c:2196:gf_defrag_status_get] 0-glusterfs: Rebalance is completed. Time taken is 15.00 secs [2016-08-04 11:40:19.251189] I [MSGID: 109028] [dht-rebalance.c:2200:gf_defrag_status_get] 0-glusterfs: Files migrated: 49, size: 2569011200, lookups: 149, failures: 0, skipped: 0 With 1 MB: [2016-08-04 11:41:21.093662] I [MSGID: 109028] [dht-rebalance.c:2196:gf_defrag_status_get] 0-glusterfs: Rebalance is completed. Time taken is 7.00 secs [2016-08-04 11:41:21.093687] I [MSGID: 109028] [dht-rebalance.c:2200:gf_defrag_status_get] 0-glusterfs: Files migrated: 49, size: 2569011200, lookups: 149, failures: 0, skipped: 0 - This is a cherry-pick of D3670927 to 3.8. Test Plan: Tested rebalance on devserver. Reviewed By: dph, rwareing Change-Id: Ide2edbf87ef9ae2b32a03f189c57b63e2f233fc8 Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Reviewed-on: http://review.gluster.org/16078 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com>
* \|	features/locks: Add lock revocation functionality to posix locks translator	Richard Wareing	2016-12-09	8	-9/+343
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Motivation: Prevents cluster instability by mis-behaving clients causing bricks to OOM due to inode/entry lock pile-ups. - Adds option to strip clients of entry/inode locks after N seconds - Adds option to clear ALL locks should the revocation threshold get hit - Adds option to clear all or granted locks should the max-blocked threshold get hit (can be used in combination w/ revocation-clear-all). - Options are: features.locks-revocation-secs <integer; 0 to disable> features.locks-revocation-clear-all [on/off] features.locks-revocation-max-blocked <integer> - Adds monkey-locking option to ignore 1% of unlock requests (dev only) features.locks-monkey-unlocking [on/off] - Adds logging to indicate revocation event & reason Test Plan: First you will need TWO fuse mounts for this repro. Call them /mnt/patchy1 & /mnt/patchy2. 1. Enable monkey unlocking on the volume: gluster vol set patchy features.locks-monkey-unlocking on 2. From the "patchy1", use DD or some other utility to begin writing to a file, eventually the dd will hang due to the dropped unlocked requests. This now simulates the broken client. Run: for i in {1..1000};do dd if=/dev/zero of=/mnt/patchy1/testfile bs=1k count=10;done' ...this will eventually hang as the unlock request has been lost. 3. Goto another window and setup the mount "patchy2" @ /mnt/patchy2, and observe that 'echo "hello" >> /mnt/patchy2/testfile" will hang due to the inability of the client to take out the required lock. 4. Next, re-start the test this time enabling lock revocation; use a timeout of 2-5 seconds for testing: 'gluster vol set patchy features.locks-revocation-secs <2-5>' 5. Wait 2-5 seconds before executing step 3 above this time. Observe that this time the access to the file will succeed, and the writes on patchy1 will unblock until they hit another failed unlock request due to "monkey-unlocking". Change-Id: I814b9f635fec53834a26db634d1300d9a61057d8 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14816 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-on: http://review.gluster.org/16086 Tested-by: Shreyas Siravara <sshreyas@fb.com> Reviewed-by: Kevin Vigor <kvigor@fb.com>
* \|	performance/io-threads: Eliminate spinlock contention via fops-per-thread-ratio	Richard Wareing	2016-12-09	3	-5/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Background: Frequently spinlock is observed on busy GFS clusters, which wastes CPU and destroys the performance of the cluster. Current solutions to this problem involve under-provisioning the thread pool, but this is problematic as during busy periods there may not be enough threads to service the queue. - This patch introduces a technique to avoid the stampeding herd problem with the io-threads workers. This is done by dynamically tuning the threads by a ratio of threads to queue depth, there-by keeping already running threads sufficiently busy by a tunable FOP to thread ratio. Ratio is controllable by the performanace.io-threads-fops-per-threads-ratio option. - More detailed reading on this approach can be found here: https://h21007.www2.hp.com/portal/download/files/unprot/hpux/MakingConditionVariablesPerform.pdf - Cherry-pick of D2530504 for 3.8 Test Plan: - Stress teston my dev server - shadow testing Reviewed By: moox, sshreyas Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: I771ae783aa4ca5a6fd0449db64e07d1f4bff0d04 Reviewed-on: http://review.gluster.org/16080 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Tested-by: Shreyas Siravara <sshreyas@fb.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com>
* \|	performance/md-cache: Fix caching for root inode	Shreyas Siravara	2016-12-08	1	-4/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - `is_mdc_key_satisfied()` is returning 0 when it has not checked any of the keys - This causes the cache'd value for the root inode to always be invalid (`mdc_xattr_satisfied()` returns 0, which causes us to jump to `uncached'). - In this diff we add a new option called "strict-xattrs", when enabled winds getxattr calls for those keys not present in our cache. - This allows "special" getxattr commands (quota cli commands for example) to work when md-cache is enabled. - This is a port of D4135452 Test Plan: - Test on devserver and see latency improvements for root inode. - Prove tests Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: I8ff75595e821d7a714224b3b3dded23f0a93560a Reviewed-on: http://review.gluster.org/16060 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kevin Vigor <kvigor@fb.com> Tested-by: Shreyas Siravara <sshreyas@fb.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* \|	gluster: IPv6 single stack support	Richard Wareing	2016-12-07	4	-4/+43
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - This diff changes all locations in the code to prefer inet6 family instead of inet. This will allow change GlusterFS to operate via IPv6 instead of IPv4 for all internal operations while still being able to serve (FUSE or NFS) clients via IPv4. - The changes apply to NFS as well. - This diff ports D1892990, D1897341 & D1896522 to the 3.8 branch. Test Plan: Prove tests! Reviewers: dph, rwareing Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: I34fdaaeb33c194782255625e00616faf75d60c33 Reviewed-on: http://review.gluster.org/16059 Reviewed-by: Shreyas Siravara <sshreyas@fb.com> Tested-by: Shreyas Siravara <sshreyas@fb.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	cluster/afr: When failing fop due to lack of quorum, also log error string	Krutika Dhananjay	2016-11-11	1	-11/+12
\| \| \| \| \| \| \| \| \| \| \| \| \|	Backport of: http://review.gluster.org/#/c/15800/ Change-Id: I2dd7ed69a456e8b9e54a4093f14dc16950bef081 BUG: 1393630 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15813 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	performance/open-behind: Avoid deadlock in statedump	Pranith Kumar K	2016-11-10	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: open-behind is taking fd->lock then inode->lock where as statedump is taking inode->lock then fd->lock, so it is leading to deadlock In open-behind, following code exists: void ob_fd_free (ob_fd_t ob_fd) { loc_wipe (&ob_fd->loc); <<--- this takes (inode->lock) ....... } int ob_wake_cbk (call_frame_t frame, void cookie, xlator_t this, int op_ret, int op_errno, fd_t fd_ret, dict_t xdata) { ....... LOCK (&fd->lock); <<---- fd->lock { ....... __fd_ctx_del (fd, this, NULL); ob_fd_free (ob_fd); <<<--------------- } UNLOCK (&fd->lock); ....... } ================================================================= In statedump this code exists: inode_dump (inode_t inode, char prefix) { ....... ret = TRY_LOCK(&inode->lock); <<---- inode->lock ....... fd_ctx_dump (fd, prefix); <<<----- ....... } fd_ctx_dump (fd_t fd, char prefix) { ....... LOCK (&fd->lock); <<<------------------ this takes fd-lock { ....... } Fix: Make sure open-behind doesn't call ob_fd_free() inside fd->lock >BUG: 1393259 >Change-Id: I4abdcfc5216270fa1e2b43f7b73445f49e6d6e6e >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15808 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Poornima G <pgurusid@redhat.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1393682 Change-Id: I45a0fbed683ef6acb7900df87534927f332fdaaa Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15818 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	posix-acl: check dictionary before using it	Rajesh Joseph	2016-11-09	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If extended attributes are not present in md-cache it returns NULL as xattr. posix acl xlator should check for NULL before using xattr. If normal and default ACLs are not set on file then md-cache will not contain system.posix_acl_access and system.posix_acl_default extended attributes in its cache. Therefore posix_acl_lookup_cbk should check xattr before using it, otherwise the logs will get filled with dictionary errors. > Reviewed-on: http://review.gluster.org/15769 > Reviewed-by: Raghavendra Talur <rtalur@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Vijay Bellur <vbellur@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit de7fe24663713fff364cfc2b52b675e3e979ee68) Change-Id: Icebf73cf0b313bd3e82ca8cbda63786dd0fa47da BUG: 1392868 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/15799 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
*	features/shard: Fill loc.pargfid too for named lookups on individual shards	Krutika Dhananjay	2016-11-08	2	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Backport of: http://review.gluster.org/#/c/15788/ On a sharded volume when a brick is replaced while IO is going on, named lookup on individual shards as part of read/write was failing with ENOENT on the replaced brick, and as a result AFR initiated name heal in lookup callback. But since pargfid was empty (which is what this patch attempts to fix), the resolution of the shards by protocol/server used to fail and the following pattern of logs was seen: Brick-logs: [2016-11-08 07:41:49.387127] W [MSGID: 115009] [server-resolve.c:566:server_resolve] 0-rep-server: no resolution type for (null) (LOOKUP) [2016-11-08 07:41:49.387157] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null) (00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82) ==> (Invalid argument) [Invalid argument] Client-logs: [2016-11-08 07:41:27.497687] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.497755] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.498500] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.499680] E [MSGID: 133010] Also, this patch makes AFR by itself choose a non-NULL pargfid even if its ancestors fail to initialize all pargfid placeholders. Change-Id: Ica9e1b5b196ac37aafe6128e7aa0694a07245fdb BUG: 1392846 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15796 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	glusterd/quota: upgrade quota.conf file during an upgrade	Manikandan Selvaganesh	2016-11-08	4	-16/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem ======= When quota is enabled on 3.6, it will have quota conf version in quota.conf as v1.1. This node gets upgraded to 3.7 but it will still have quota conf version as v1.1 until a quota enable/disable/set limit is initiated. When this is not initiated and when this node tries to peer probe a node which is a fresh install of 3.7 (which will have quota conf version as v1.2), then this will result in "Peer rejected" state. This patch fixes the issue. Solution ======== When an upgrade happens from 3.6 to 3.7, quota.conf file needs to be modified as well. With 3.6, in quota.conf the version will be v1.1 and it needs to be changed to v1.2 from 3.7. This is because in 3.7, inode quota feature is introduced. So when an op-version bumpup happens quota.conf needs to be upgraded with quota conf version v1.2 and all the 16 byte uuid needs to be changed to 17 bytes uuid as well. Previously, when the cluster version is upgraded to 3.7, the quota.conf got upgraded as well. But, the upgradation was done only when quota enable/disable/set limit is done. With this patch, the upgradation is done during a cluster op version bump up as well. > Reviewed-on: http://review.gluster.org/15352 > Tested-by: Atin Mukherjee <amukherj@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 4b2cff614462508eef529c5d128e0974720e3f50) Change-Id: Idb5ba29d3e1ea0e45c85d87c952c75da9e0f99f0 BUG: 1392716 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/15791 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com> Tested-by: Manikandan Selvaganesh <manikandancs333@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	glusterd/shared storage: Check for hook-script at staging	Avra Sengupta	2016-11-06	2	-6/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Check if S32gluster_enable_shared_storage.sh is present at /var/lib/glusterd/hooks/1/set/post/ at staging before proceeding with the command. Fail the command with the appropriate error message in case it is not present. > Reviewed-on: http://review.gluster.org/15718 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 29587a91716e1e55bd172d63340c40249fb343c9) Change-Id: I84e3912f1cdffb927f8a40d74d52be43ee69388b BUG: 1377448 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15741 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	snapshot: Fix for memory leaks in snapshot code path	Avra Sengupta	2016-11-03	3	-12/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	> Reviewed-on: http://review.gluster.org/15668 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> (cherry picked from commit 48a4c4e525665115f7e8c478d3bf51764427378d) Change-Id: Idc2cb16574d166e3c0ee1f7c3a485f1acb19fc8c BUG: 1388354 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15720 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
*	md-cache: Invalidate cache entry for open() with O_TRUNC	Soumya Koduri	2016-11-03	1	-0/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a file is opened with O_TRUNC flag set, its size gets set to '0'. This case needs to be handled in md-cache to avoid sending incorrect cached stat. This is backport of below mainline patch - http://review.gluster.org/#/c/15618/ > Change-Id: I95d1f8a6634734898883ede010c3e7b0b7eb97d9 > BUG: 1382266 > Signed-off-by: Soumya Koduri <skoduri@redhat.com> > Reviewed-on: http://review.gluster.org/15618 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> > Tested-by: jiffin tony Thottan <jthottan@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > (cherry picked from commit 6ca5d6382f03685b31b12accb095093cf1486603) Change-Id: I92349f5b48aef07f3790db7aae25bfa2ddb5947e BUG: 1391450 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/15771 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	performance/write-behind: fix flush stuck by former failed writes	Ryan Ding	2016-11-03	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the issue is happened in this case: assume a file is opened with fd1 and fd2. 1. some WRITE opto fd1 got error, they were add back to 'todo' queue because of those error. 2. fd2 closed, a FLUSH op is send to write-behind. 3. FLUSH can not be unwind because it's not a legal waiter for those failed write(as func __wb_request_waiting_on() say). and those failed WRITE also can not be ended if fd1 is not closed. fd2 stuck in close syscall. to resolve this issue, we can change the way we determine 2 requests is 'conflict': flush/fsync is not conflict with those write that is not belonged to them. so __wb_pick_winds() can wind the FLUSH op. below is some information when the stuck issue happen: glusterdump logs: [xlator.performance.write-behind.wb_inode] path=/ltp-F9eG0ZSOME/rw-buffered-16436 inode=0x7fdbe8039b9c window_conf=1048576 window_current=249856 transit-size=0 dontsync=0 [.WRITE] request-ptr=0x7fdbe8020200 refcount=1 wound=no generation-number=4 req->op_ret=-1 req->op_errno=116 sync-attempts=3 sync-in-progress=no size=131072 offset=1220608 lied=-1 append=0 fulfilled=0 go=0 [.WRITE] request-ptr=0x7fdbe8068c30 refcount=1 wound=no generation-number=5 req->op_ret=-1 req->op_errno=116 sync-attempts=2 sync-in-progress=no size=118784 offset=1351680 lied=-1 append=0 fulfilled=0 go=0 [.FLUSH] request-ptr=0x7fdbe8021cd0 refcount=1 wound=no generation-number=6 req->op_ret=0 req->op_errno=0 sync-attempts=0 gdb detail about above 3 requests: (gdb) print ((wb_request_t )0x7fdbe8021cd0) $2 = {all = {next = 0x7fdbe803a608, prev = 0x7fdbe8068c30}, todo = {next = 0x7fdbe803a618, prev = 0x7fdbe8068c40}, lie = {next = 0x7fdbe8021cf0, prev = 0x7fdbe8021cf0}, winds = {next = 0x7fdbe8021d00, prev = 0x7fdbe8021d00}, unwinds = {next = 0x7fdbe8021d10, prev = 0x7fdbe8021d10}, wip = { next = 0x7fdbe8021d20, prev = 0x7fdbe8021d20}, stub = 0x7fdbe80224dc, write_size = 0, orig_size = 0, total_size = 0, op_ret = 0, op_errno = 0, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_FLUSH, lk_owner = {len = 8, data = "W\322T\f\271\367y$", '\000' <repeats 1015 times>}, iobref = 0x0, gen = 6, fd = 0x7fdbe800f0dc, wind_count = 0, ordering = {size = 0, off = 0, append = 0, tempted = 0, lied = 0, fulfilled = 0, go = 0}} (gdb) print ((wb_request_t )0x7fdbe8020200) $3 = {all = {next = 0x7fdbe8068c30, prev = 0x7fdbe803a608}, todo = {next = 0x7fdbe8068c40, prev = 0x7fdbe803a618}, lie = {next = 0x7fdbe8068c50, prev = 0x7fdbe803a628}, winds = {next = 0x7fdbe8020230, prev = 0x7fdbe8020230}, unwinds = {next = 0x7fdbe8020240, prev = 0x7fdbe8020240}, wip = { next = 0x7fdbe8020250, prev = 0x7fdbe8020250}, stub = 0x7fdbe8062c3c, write_size = 131072, orig_size = 4096, total_size = 0, op_ret = -1, op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>}, iobref = 0x7fdbe80311a0, gen = 4, fd = 0x7fdbe805c89c, wind_count = 3, ordering = {size = 131072, off = 1220608, append = 0, tempted = -1, lied = -1, fulfilled = 0, go = 0}} (gdb) print ((wb_request_t )0x7fdbe8068c30) $4 = {all = {next = 0x7fdbe8021cd0, prev = 0x7fdbe8020200}, todo = {next = 0x7fdbe8021ce0, prev = 0x7fdbe8020210}, lie = {next = 0x7fdbe803a628, prev = 0x7fdbe8020220}, winds = {next = 0x7fdbe8068c60, prev = 0x7fdbe8068c60}, unwinds = {next = 0x7fdbe8068c70, prev = 0x7fdbe8068c70}, wip = { next = 0x7fdbe8068c80, prev = 0x7fdbe8068c80}, stub = 0x7fdbe806746c, write_size = 118784, orig_size = 4096, total_size = 0, op_ret = -1, op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>}, iobref = 0x7fdbe8052b10, gen = 5, fd = 0x7fdbe805c89c, wind_count = 2, ordering = {size = 118784, off = 1351680, append = 0, tempted = -1, lied = -1, fulfilled = 0, go = 0}} you can see they are all on 'todo' queue, and FLUSH op fd is not the same WRITE op fd. > Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab > BUG: 1372211 > Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab BUG: 1390838 Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> Reviewed-on: http://review.gluster.org/15762 Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	dht: Proper log message if data migration is skipped	ankitraj	2016-11-02	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There was a misleading message from logs about available disk space while rebalancing of bricks while calculating free space. Bug: 1390870 Backprt of http://review.gluster.org/#/c/15345/ Change-Id: Ie9df0b2cbf00faaf13a0a3f0dbd657377a082755 Signed-off-by: ankitraj <anraj@redhat.com> Reviewed-on: http://review.gluster.org/15765 Tested-by: ankitraj CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	crypt: changes needed for openssl-1.1 (coming in Fedora 26)	Kaleb S. KEITHLEY	2016-10-27	1	-4/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fedora has updated openssl-1.1.0b in/for Fedora 26 HMAC_CTX is now an opaque type and instances of it must be created and released by calling HMAC_CTX_new() and HMAC_CTX_free(). Change-Id: I3a09751d7b0d9fc25fe18aac6527e5431e9ab19a BUG: 1388580 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15727 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	afr,ec: Heal device files with correct major, minor numbers	Pranith Kumar K	2016-10-26	2	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the fix. >BUG: 1384297 >Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15728 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Change-Id: I7646adc3771ff76cdf9c979b575bbcd0b3bc1b9a BUG: 1388948 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15735 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
*	storage/posix: Fix race in posix_pstat	Pranith Kumar K	2016-10-25	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When one thread is in the process of creating a file/directory and the other thread is doing readdirp, there is a chance that posix_pstat, creation fops race in the following manner which will lead to wrong stat values to be read by parent xlators like posix-acl. Creation fops posix_pstat() as part of readdirp 1) file is created with uid/gid 0/0 1) does stat of the path that is created just now. 2) Does chown to set the correct uid/gid 3) Sets the acl/user/internal xattrs 4) Sets the gfid on the entry and completes the creation of the file/dir 2) fills the gfid in the iatt If unwind of readdirp hits server xlator before creation fop, then posix-acl remembers uid/gid of the file to be root/root and fails fops like open etc on it. Fix: Reverse the order of filling gfid and filling lstat() values in posix_pstat() so that if there is gfid in iatt buffer uid/gid are valid. >Change-Id: I46caa7f6da7abfa40a0b1d70e35b88de9c64959c >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15564 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> BUG: 1386071 Change-Id: I81c4c6e6b8d4037cee9b23da262b254c126c0a19 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15665 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	performance/write-behind: remove the request from liability queue in	Raghavendra G	2016-10-17	1	-2/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	wb_fulfill_request Before this patch, a request is removed from liability queue only when ref count of request hits 0. Though, wb_fulfill_request does an unref, it need not be the last unref and hence the request may survive in liability queue till the last unref. Let, T1: the time at which wb_fulfill_request is invoked T2: the time at which last unref is done on request Let's consider a case of T2 > T1. In the time window between T1 and T2, any other request (waiter) conflicting with request in liability queue (blocker - basically a write which has been lied) is blocked from winding. If T2 happens to be when wb_do_unwinds is invoked, no further processing of request list happens and "waiter" would get blocked forever. An example imaginary sequence of events is given below: 1. A write request w1 is picked up for unwinding in __wb_pick_unwinds (but unwind is not done _yet_ and hence reference remains). However, w1 is moved to liability queue. Let's call this invocation of wb_process_queue by wb_writev as PQ1. 2. A flush (f1) request hits write behind. Since the liability queue of inode is not empty, f1 is not picked for unwinding. Let's call the invocation of wb_process_queue by wb_flush as PQ2. 3. PQ2 continues and picks w1 for fulfilling and invokes wb_fulfill. As part of successful wb_fulfill_cbk, wb_fulfill_request (w1) is invoked. But, w1 is not freed (and hence not removed from liability queue) as w1 is not unwound _yet_ and a ref remains (PQ1 has not invoked wb_do_unwinds _yet_). 4. wb_fulfill_cbk (triggered by PQ2) invokes a wb_process_queue (let's say PQ3). f1 is not resumed in PQ3 as w1 is still in liability queue. At this time, PQ2 and PQ3 are complete. 5. PQ1 continues, unwinds w1 and does last unref on w1 and w1 is freed (and removed from liability queue). Since PQ1 didn't invoke wb_fulfill on any other write requests, there won't be any future codepaths that would invoke wb_process_queue and f1 is stuck forever. With this fix, w1 is removed from liability queue in step 3 above and PQ3 resumes f1 in step 4 (as there are no requests conflicting with f1 in liability queue during execution of PQ3). > Signed-off-by: Raghavendra G <rgowdapp@redhat.com> > BUG: 1379655 > Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb > Reviewed-on: http://review.gluster.org/15579 > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit a8b2a981881221925bb5edfe7bb65b25ad855c04) Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1385620 Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb Reviewed-on: http://review.gluster.org/15658 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	cluster/afr: Prevent dict_set() on NULL dict	Pranith Kumar K	2016-10-17	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In afr lookup when NULL dict is received in lookup, afr is supposed to set all the xattrs it requires in a new dict it creates, but for 'link-count' it is trying to set to the dict that is passed in lookup which can be NULL sometimes. This is leading to error logs. Fixed the same in this patch. >BUG: 1385104 >Change-Id: I679af89cfc410cbc35557ae0691763a05eb5ed0e >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15646 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >BUG: 1385236 >Change-Id: I802e74e7ad24e183b6653101ad7bf5ab0bf6e55b >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15650 >Smoke: Gluster Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> BUG: 1385442 Change-Id: Ie93b25e8cf52b0d58ef335929dfa10a78a0dd734 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15651 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
*	afr: Take full locks in arbiter only for data transactions	Ravishankar N	2016-10-15	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Sharding exposed a bug in arbiter config. where `dd` throughput was extremely slow. Shard xlator was sending a fxattrop to update the file size immediately after a writev. Arbiter was incorrectly over-riding the LLONGMAX-1 start offset (for metadata domain locks) for this fxattrop, causing the inodelk to be taken on the data domain. And since the preceeding writev hadn't released the lock (afr does a 'lazy' unlock if write succeeds on all bricks), this degraded to a blocking lock causing extra lock/unlock calls and delays. Fix: Modify flock.l_len and flock.l_start to take full locks only for data transactions. > Reviewed-on: http://review.gluster.org/15641 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 3a97486d7f9d0db51abcb13dcd3bc9db935e3a60) Change-Id: I906895da2f2d16813607e6c906cb4defb21d7c3b BUG: 1375125 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Max Raba <max.raba@comsysto.com> Reviewed-on: http://review.gluster.org/15647 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	cluster/ec: Use locks for opendir	Pranith Kumar K	2016-10-14	1	-1/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In some cases we see that readdir keeps winding to the brick that doesn't have any blocked locks i.e. first brick. This is leading to the client assuming that there are no blocking locks on the inode so it won't give away the lock. Other clients end up blocked on the lock as if the command hung. Fix: Proper way to fix this issue is to use infra present in http://review.gluster.org/14736 This is a stop gap fix where we start taking inodelks in opendir which goes to all the bricks, this will detect if there is any contention. cherry picked from commit f013335400d033a9677797377b90b968803135f4: >BUG: 1346719 >Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15309 >Smoke: Gluster Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Ashish Pandey <aspandey@redhat.com> >Signed-off-by: Ashish Pandey <aspandey@redhat.com> Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9 BUG: 1371397 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15405 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>