glusterfs.git -

	Commit message	Author	Age	Files	Lines
*	cluster/dht: unwind if dht_selfheal_dir_mkdir returns an error	Raghavendra G	2018-05-03	1	-1/+5
	If dht_selfheal_dir_mkdir returns an error, the cbk passed to dht_selfheal_directory is not invoked. The current codepath therefore leaves a frame that is never unwound, resulting in a fop that hangs forever. Change-Id: I422308b8a34a074301ca46b029ffe676f5e0f66c fixes: bz#1574305 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
*	protocol/server : unwind as per op version	Ashish Pandey	2018-05-03	3	-3/+11
	Change-Id: Id6717640ac14881b490e512c4682e45ffffa7f5b fixes: bz#1570538 BUG: 1570538 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
*	glusterd/geo-rep: Fix UNUSED_VALUE coverity issue	Varsha Rao	2018-05-03	2	-12/+12
	The return value of glusterd_get_local_brickpaths is unused because it is reinitialized outside the if block, so add a goto statement. Also change the if condition to check the failure case, when the return value is -1 and path_list is NULL. Change-Id: I6b47d7751263f704bd69a6452a7e71bfcf226d49 updates: bz#789278 Signed-off-by: Varsha Rao <varao@redhat.com>
*	core/various: python3 compat, prepare for python2 -> python3	Kaleb S. KEITHLEY	2018-05-02	32	-341/+377
	See https://review.gluster.org/#/c/19788/ (use the print function from __future__). Change-Id: If5075d8d9ca9641058fbc71df8a52aa35804cda4 updates: #411 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	block-profile: enable cluster.eager-lock in block-profile	Prasanna Kumar Kalever	2018-05-01	1	-1/+1
	Eager-lock gave a 2.5X perf improvement. On top of that, with the batching fix in tcmu-runner and client-io-threads, we are seeing close to a 3X perf improvement. We don't want to include client-io-threads in the default profile, but rather enable it on a case-by-case basis, so that option is not added here. BUG: 1573119 Fixes: bz#1573119 Change-Id: Ida53c3ef9a041a73b65fdd06158ac082da437206 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
*	glusterd: Fix for memory leak in get-state detail	Sanju Rakonde	2018-05-01	1	-1/+8
	Fixes: bz#1573066 Change-Id: I76fe3bdde7351736b32eb3d6c4cc5f8f276257ed Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	dht: gf_defrag_settle_hash should ignore ENOENT and ESTALE error	Susant Palai	2018-04-30	1	-1/+8
	Problem: A directory deletion can happen just before gf_defrag_settle_hash, which internally does a setxattr operation on the directory. Solution: Ignore ENOENT and ESTALE errors. Fixes: bz#1572581 Change-Id: I2f91809f3b5e02976c4c3a5a596406a8b2f8f6f2 Signed-off-by: Susant Palai <spalai@redhat.com>
*	cluster/afr: shd changes for thin arbiter	karthik-us	2018-04-30	1	-0/+184
	Updates #352 Change-Id: I1bbb3c652ba33cec6aa37f3700370674077fb17d Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	afr: initial changes for thin arbiter	Ravishankar N	2018-04-30	6	-8/+229
	1. Create thin arbiter index file during mount. 2. Set pending marker in thin arbiter id file in case of failure. Change-Id: I269eb8d069f0323f1fc616175e5e5eb7b91d5f82 updates: #352 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	server/resolver: don't trust inode-table for RESOLVE_NOT	Raghavendra G	2018-04-30	1	-4/+23
	There have been known races between fops that add a dentry (lookup, create, mknod, etc.) and fops that remove a dentry (rename, unlink, rmdir, etc.), due to which stale dentries are left in the inode table even though the dentry doesn't exist on the backend. For example, consider a lookup (parent/bname) and an unlink (parent/bname) racing in the following order:
	* lookup hits storage/posix and finds that the dentry exists
	* unlink removes the dentry on storage/posix
	* unlink reaches protocol/server, where the dentry (parent/bname) is unlinked from the inode
	* lookup reaches protocol/server and creates a dentry (parent/bname) on the inode
	Now we have a stale dentry (parent/bname) associated with the inode in the itable. This is bad for fops like link, create, etc., which invoke the resolver with type RESOLVE_NOT. These fops fail with EEXIST even though there is no such dentry on the backend fs.
	This issue can be solved in two ways:
	* Enable the "dentry fop serializer" xlator [1].
	  # gluster volume set <volname> features.sdfs on
	* Make sure the resolver does a lookup on the backend when it finds a dentry in the itable, and validates the state of the itable.
	  - If the dentry is not found, unlink the stale dentries from the itable and continue with the fop
	  - If the dentry is found, fail the fop with EEXIST
	This patch implements the second solution, as sdfs is not enabled by default in the brick xlator stack. Once sdfs is enabled by default, this patch can be reverted.
	[1] https://github.com/gluster/glusterfs/issues/397
	Change-Id: Ia8bb0cf97f97cb0e72639bce8aadb0f6d3f4a34a updates: bz#1543279 BUG: 1543279 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
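The second solution described in this commit message can be sketched roughly as follows. This is an illustration only, not the actual resolver code: `resolve_not`, the `itable` set and the `backend` set are hypothetical stand-ins for the server inode table and the brick file system.

```python
import errno


def resolve_not(itable, backend, parent, bname):
    """Return 0 if the fop may proceed, or EEXIST if the dentry really exists.

    itable and backend are plain sets of (parent, bname) pairs; both are
    hypothetical stand-ins used only for this sketch.
    """
    key = (parent, bname)
    if key not in itable:
        return 0                 # nothing cached: RESOLVE_NOT is satisfied
    if key in backend:
        return errno.EEXIST      # cached dentry confirmed by the backend
    itable.discard(key)          # stale dentry: unlink it and continue
    return 0


itable = {("parent", "a"), ("parent", "stale")}
backend = {("parent", "a")}
result_existing = resolve_not(itable, backend, "parent", "a")
result_stale = resolve_not(itable, backend, "parent", "stale")
```

The point of the design is that a hit in the cache is treated as a hint that must be confirmed, while a miss is trusted as before.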
*	libglusterfs: Capture the dict response in syncop_xattrop_cbk	karthik-us	2018-04-27	5	-9/+30
	Problem: Currently it is not possible to capture the xattr values which are set on the bricks by calling syncop_(f)xattrop, because the response dict is not being assigned to any of the dictionaries. Fix: In the xattrop callback, capture the response dict and send it back to the caller if it is requested. Change-Id: I9de9bcd97d6008091c9b060bcca3676cb9ae8ef9 fixes: bz#1572076 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	feature/thin-arbiter: Implement thin-arbiter translator	Ashish Pandey	2018-04-25	11	-1/+843
	Updates #352 Change-Id: I3d8caa6479dc8e48bec62a09b056971bb061f0cf Signed-off-by: Ashish Pandey <aspandey@redhat.com>
*	performance/md-cache: purge cache on ENOENT/ESTALE errors	Raghavendra G	2018-04-25	1	-87/+438
	If not, the next lookup could be served from the cache and succeed, which is wrong. This can affect the retry logic of VFS when it receives an ESTALE. Change-Id: Iad8e564d666aa4172823343f19a60c11e4416ef6 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1566303
*	cluster/afr: Keep child-up until ping-event	Pranith Kumar K	2018-04-25	3	-25/+40
	Problem: Suppose we have 2 bricks, brick-A and brick-B, with brick-A within halo-max-latency and brick-B above halo-max-latency, and both halo-min and halo-max replicas set to '1'. Brick-A comes online and its ping-latency is updated. When brick-B comes online, we have 2 up-bricks, so the code tries to find the brick with the worst latency to mark it down. Since brick-B just came online, it always had '0' latency, so brick-A used to be marked offline and brick-B would end up being the one online even though brick-A is better suited. Fix: Consider the latency of a just-up child to be HALO_MAX_LATENCY, so that until its ping-latency is known the just-up brick is found as the worst child. Also keep ping-latency as -1 until child-up during initialization. BUG: 1567881 fixes bz#1567881 Change-Id: I148262fe505468190f0eb99225d0f6d57cdb6f04 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
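A minimal sketch of the selection rule in this fix, under stated assumptions: the numeric value of `HALO_MAX_LATENCY` is a placeholder, the helper names are hypothetical, and -1 is used here simply to mean "no ping result yet", which simplifies the initialization detail mentioned in the message.

```python
HALO_MAX_LATENCY = 99999  # placeholder value, assumed for this sketch


def effective_latency(measured):
    # A child without a ping result is treated as the worst candidate.
    return HALO_MAX_LATENCY if measured < 0 else measured


def worst_child(latencies):
    """latencies maps child name -> measured ping latency (-1 if unknown)."""
    return max(latencies, key=lambda child: effective_latency(latencies[child]))


just_up = worst_child({"brick-A": 5, "brick-B": -1})
measured = worst_child({"brick-A": 5, "brick-B": 2})
```

With the old behaviour the unknown latency would have counted as 0, so the already-measured brick would have looked worse than the brick that had only just come up.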
*	libglusterfs/syncop: Handle barrier_{init/destroy} in error cases	Pranith Kumar K	2018-04-23	2	-4/+27
	BUG: 1568521 updates: bz#1568521 Change-Id: I53e60cfcaa7f8edfa5eca47307fa99f10ee64505 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	features/shard: Add option to barrier parallel lookup and unlink of shards	Krutika Dhananjay	2018-04-23	2	-28/+89
	Also move the common parallel unlink callback for GF_FOP_TRUNCATE and GF_FOP_FTRUNCATE into a separate function. Change-Id: Ib0f90a5f62abdfa89cda7bef9f3ff99f349ec332 updates: bz#1568521 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	cluster/dht: Fix dht_rename lock order	N Balachandran	2018-04-23	1	-18/+47
	Fixed dht_order_rename_lock to use the same inodelk ordering as that of the dht selfheal locks (dictionary order of lock subvolumes). Change-Id: Ia3f8353b33ea2fd3bc1ba7e8e777dda6c1d33e0d fixes: bz#1568348 Signed-off-by: N Balachandran <nbalacha@redhat.com>
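The idea behind a shared lock order can be illustrated with a small sketch. It assumes that "dictionary order" means plain lexicographic order of subvolume names; `order_locks` and the subvolume names are hypothetical.

```python
def order_locks(subvol_names):
    """Return lock targets in dictionary (lexicographic) order of subvolume name."""
    return sorted(subvol_names)


# Two code paths that need the same locks but discover them in a different
# order end up acquiring them in the same sequence, which avoids deadlock.
rename_order = order_locks(["vol-client-2", "vol-client-0"])
selfheal_order = order_locks(["vol-client-0", "vol-client-2"])
```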
*	server/auth: add option for strict authentication	Mohammed Rafi KC	2018-04-20	6	-12/+81
	When this option is enabled, we check for a matching username and password; if none is found, the connection is rejected. This also does a checksum validation of the volfile.
	The option is invalid when SSL/TLS is in use, at which point the SSL/TLS certificate user name is used to validate and hence authorize the right user. This expects TLS allow rules to be set up correctly rather than the default *.
	This option is not settable; as a result, it cannot be enabled for volumes using the CLI. It is used with the shared storage volume, to restrict access to it in non-SSL/TLS environments to the gluster peers only.
	Tested:
	./tests/bugs/protocol/bug-1321578.t
	./tests/features/ssl-authz.t
	- Ran tests on volumes with and without strict auth checking (as the brick vol file needed to be edited to test, or rather to enable the option)
	- Ran tests on volumes to ensure existing mounts are disconnected when we enable strict checking
	Change-Id: I2ac4f0cfa5b59cc789cc5a265358389b04556b59 fixes: bz#1568844 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	shared storage: Prevent mounting shared storage from non-trusted client	Mohammed Rafi KC	2018-04-20	1	-0/+21
	Gluster shared storage is a volume used for internal storage by various features including ganesha, geo-rep and snapshot. As it is a special volume for internal use, it should not be exposed to clients. This fix won't generate a non-trusted volfile for the shared storage volume. Change-Id: I8ffe30ae99ec05196d75466210b84db311611a4c fixes: bz#1568844 BUG: 1568844 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	server: fix unresolved symbols by moving them to libglusterfs	Mohit Agrawal	2018-04-20	5	-104/+106
	Problem: The glusterd2 build fails due to undefined symbols (xlator_mem_cleanup, glusterfsd_ctx) in server.so. Solution: Two changes resolve this: 1) Move the xlator_mem_cleanup code from glusterfsd-mgmt.c to xlator.c so that it is part of libglusterfs.so. 2) Replace glusterfsd_ctx with this->ctx, because the symbol glusterfsd_ctx is not part of server.so. BUG: 1544090 Change-Id: Ie5e6fba9ed458931d08eb0948d450aa962424ae5 fixes: bz#1544090 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	cluster/afr: Need heal-timeout to be configured as low as 5 seconds	Pranith Kumar K	2018-04-20	1	-1/+1
	In Halo replication, there are pending heals more often than not. It makes sense to give users the capability to configure heal-timeout as low as 5 seconds. BUG: 1569489 fixes bz#1569489 Change-Id: I451c1975827f66398b903f659c981ef3121d5376 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	features/bitrot: show the corresponding brick for the corrupted objects	Raghavendra Bhat	2018-04-20	1	-3/+8
	Currently the "gluster volume bitrot <volume name> scrub status" command shows the corrupted objects of a node, but not the brick to which each corrupted object belongs. Showing the brick of the corrupted object will help in situations where a node hosts multiple bricks of a volume. Change-Id: I7fbdea1e0072b9d3487eb10757468bc02d24df21 fixes: bz#1569198 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
*	eventsapi: Handle Unicode string during signing	Aravinda VK	2018-04-20	1	-1/+1
	Python 2.7 HMAC does not support Unicode strings. The secret is read from a file, so it is possible that glustereventsd reads the content as Unicode. This patch converts the secret to `str` type before generating the HMAC signature. Fixes: bz#1568820 Change-Id: I7daa64499ac4ca02544405af26ac8af4b6b0bd95 Signed-off-by: Aravinda VK <avishwan@redhat.com>
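The same class of problem exists in Python 3, where `hmac.new()` requires a bytes key. The sketch below shows the normalisation step in Python 3 terms; it is not the glustereventsd code, the `sign` helper is hypothetical, and the choice of SHA-256 is an assumption made only for the example.

```python
import hashlib
import hmac


def sign(secret, message):
    # Normalise text to bytes first; hmac.new() rejects text keys.
    if not isinstance(secret, bytes):
        secret = secret.encode("utf-8")
    if not isinstance(message, bytes):
        message = message.encode("utf-8")
    # SHA-256 is an assumption for this sketch, not taken from the commit.
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


from_text = sign(u"s3cret", u"payload")
from_bytes = sign(b"s3cret", b"payload")
```

Normalising at the point of use means the caller does not have to care how the secret file was read.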
*	Make glusterfsd binary print statedump & xlator dir	Prashanth Pai	2018-04-19	5	-7/+49
	glusterd2 needs the following options, some of which are provided by the gluster CLI today:
	--print-xlatordir
	--print-statedumpdir
	--print-logdir
	However, the CLI package need not be present on the machine running glusterd2. This change adds the above CLI options to the glusterfsd binary, which glusterd2 depends on.
	Reverts 9a1ae47c8d60836ae0628a04a153f28c1085c0e8
	Related changes: https://review.gluster.org/#/c/19882/ https://github.com/gluster/glusterd2/pull/663
	Updates: bz#1193929 Change-Id: I18c123b0d3350d2bd4f2400783e3b94e402a4e29 Signed-off-by: Prashanth Pai <ppai@redhat.com>
*	gluster: Sometimes Brick process is crashed at the time of stopping brick	Mohit Agrawal	2018-04-19	20	-112/+365
	Problem: The brick process sometimes crashes when a brick is stopped while brick mux is enabled. Solution: The brick process was crashing because the rpc connection was not cleaned up properly while brick mux is enabled. With this patch, after sending the GF_EVENT_CLEANUP notification to the server xlator, we wait for all rpc client connections for that xlator to be destroyed. Once the rpc connections of all clients associated with that brick are destroyed in server_rpc_notify, xlator_mem_cleanup is called for the brick xlator as well as all its child xlators. To avoid races during cleanup, two new flags are introduced in each xlator: cleanup_starting and call_cleanup. BUG: 1544090 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Note: All test cases were run in a separate build (https://review.gluster.org/#/c/19700/) with the same patch after forcefully enabling brick mux; all test cases passed. Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf updates: bz#1544090
*	glusterd: volume inode/fd status broken with brick mux	hari gowtham	2018-04-19	9	-87/+119
	Problem: The values for inode/fd were populated from the ctx received from the server xlator. Without brickmux, each process hosted a single brick of a volume, so searching the server and populating it worked. With brickmux, a number of bricks can be confined to a single process. These bricks can be from different volumes too (if we use the max-bricks-per-process option). If they are from different volumes, using the server xlator to populate causes problems. Fix: Use the brick to validate and populate the inode/fd status. Signed-off-by: hari gowtham <hgowtham@redhat.com> Change-Id: I2543fa5397ea095f8338b518460037bba3dfdbfd fixes: bz#1566067
*	features/shard: Make operations on internal directories generic	Krutika Dhananjay	2018-04-18	2	-92/+206
	Change-Id: Iea7ad2102220c6d415909f8caef84167ce2d6818 updates: bz#1568521 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	fuse: do fd_resolve in fuse_getattr if fd is received	Susant Palai	2018-04-18	2	-7/+10
	Problem: With the current code, after a graph switch the old fd is received for fuse_getattr, and since it is associated with the old inode, it does not have the inode ctx across xlators in the new graph. Hence dht errored out with "no layout" for the fstat call, resulting in EINVAL.
	Solution: If an fd is passed, init and resolve the fd to carry on with getattr.
	Test case:
	- Created a single-brick distributed volume
	- Started untar
	- Added a new brick
	Without this fix, untar used to abort with ERROR.
	Change-Id: I5805c463fb9a04ba5c24829b768127097ff8b9f9 fixes: bz#1566207 Signed-off-by: Susant Palai <spalai@redhat.com>
*	glusterd: update listen-backlog value to 1024	Milind Changire	2018-04-18	1	-1/+1
	Update the default value of listen-backlog to 1024 to reflect the changes in socket.c. This keeps the actual implementation in socket.c and the help text in glusterd-volume-set.c consistent. Change-Id: If04c9e0bb5afb55edcc7ca57bbc10922b85b7075 fixes: bz#1564600 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	cluster/afr: Make sure latency-arg is passed to afr	Pranith Kumar K	2018-04-18	4	-3/+6
	xlator_notify doesn't pass the extra arguments that come in the input function, so the XLATOR_NOTIFY macro should be used instead to pass the extra arguments to the function. BUG: 1567881 fixes bz#1567881 Change-Id: Ic15b6c446638cbacf3149693147a754219037c47 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	libglusterfs: fix comparison of a NULL dict with a non-NULL dict	Xavi Hernandez	2018-04-18	1	-8/+8
	Function are_dicts_equal() had a bug when the first argument was NULL and the second one wasn't NULL. In this case it incorrectly returned that the dicts were different when they could be equal. Fixes: bz#1566732 Change-Id: I0fc245c2e7d1395865a76405dbd05e5d34db3273 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
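The asymmetry described here can be illustrated with a toy comparison. This is only a sketch of the property being fixed, assuming that a NULL dict should compare equal to an empty one; the real C function has a different signature, and the Python function below merely borrows its name.

```python
def are_dicts_equal(one, two):
    # Treat None like an empty dict so the result does not depend on
    # which argument happens to be missing.
    return (one or {}) == (two or {})


none_vs_empty = are_dicts_equal(None, {})
empty_vs_none = are_dicts_equal({}, None)
none_vs_filled = are_dicts_equal(None, {"key": "value"})
```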
*	Add CLI option to print XLATORDIR	Prashanth Pai	2018-04-18	2	-0/+7
	glusterfs gets the path to the xlator dir from a compile-time flag named XLATORDIR, which gets passed through a -D flag to GCC. This path is used to find and load xlator shared objects. The XLATORDIR path isn't easily accessible to glusterd2. Glusterd2 currently uses the following command (hack) to get the value of XLATORDIR:
	$ strings -d `which glusterfsd` | awk '/glusterfs/*/xlator$/'
	This change introduces the "print-xlatordir" CLI option to expose XLATORDIR. The option is intentionally not documented.
	Updates: bz#1193929 Change-Id: Ic7247457600f11cd8d68eb3d0ad2526fdfda0b02 Signed-off-by: Prashanth Pai <ppai@redhat.com>
*	afr: fixes to afr-eager locking	Ravishankar N	2018-04-18	2	-0/+26
	1. If pre-op fails on all bricks, set lock->release to true in afr_handle_lock_acquire_failure so that the GF_ASSERT in afr_unlock() does not crash. 2. Added a missing 'return' after handling pre-op failure in afr_transaction_perform_fop(), fixing a use-after-free issue. Change-Id: If0627a9124cb5d6405037cab3f17f8325eed2d83 fixes: bz#1561129 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	Revert "storage/posix: add pgfid in readdirp if needed"	Nigel Babu	2018-04-18	1	-38/+8
	This reverts commit d206fab73f6815c927a84171ee9361c9b31557b1. Change-Id: I5b43fdcf916bc844437c9d60f6957bc40936e3c2 Updates: bz#1560319 Signed-off-by: Nigel Babu <nigelb@redhat.com>
*	build: exclude '--with-previous-options' to prevent infinite loop	Xie Changlong	2018-04-16	1	-1/+1
	Reproducible steps:
	1. cd glusterfs/; rm -rf *; git reset --hard #clean repo
	2. cd extras/LinuxRPM/; ./make_glusterrpms #it's ok here
	3. ./make_glusterrpms #infinite loop
	4. cd ../../; make distclean #infinite loop
	Change-Id: I162953d4576cedea7c6f6c631a77163a5cca023e updates: #439 Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
*	maintainers: promote Deepshikha to maintainer	Nigel Babu	2018-04-16	1	-1/+1
	Deepshikha has been doing excellent work across the CI system. She is now ready to co-maintain the Continuous Integration module and be responsible for the CI ecosystem in its entirety. Fixes: bz#1567880 Change-Id: If204301d26731f93b2dccfe8b6571ee748a47b26 Signed-off-by: Nigel Babu <nigelb@redhat.com>
*	fuse: retire statvfs tweak	Csaba Henk	2018-04-16	1	-13/+0
	The fuse xlator used to override the filesystem block size of the storage backend to indicate its preferences. Now we retire this tweak and pass on what we get from the backend. This fixes the anomaly reported in the referred BUG. For more background, see the following email, which was sent out to the gluster-devel and gluster-users mailing lists to gauge if anyone sees any use of this tweak: http://lists.gluster.org/pipermail/gluster-devel/2018-March/054660.html http://lists.gluster.org/pipermail/gluster-users/2018-March/033775.html No one vetoed its removal, and it got an endorsement: http://lists.gluster.org/pipermail/gluster-devel/2018-March/054686.html BUG: 1523219 Change-Id: I3b7111d3037a1b91a288c1589f407b2c48d81bfa Signed-off-by: Csaba Henk <csaba@redhat.com>
*	geo-rep: Fix syncing of symlink	Kotresh HR	2018-04-13	1	-1/+1
	Problem: If a symlink is created on the master pointing to the current directory (e.g. symlink -> ".") with a non-root uid or gid, the geo-rep worker crashes with ENOTSUP. Cause: Geo-rep creates the symlink on the slave and fixes the uid and gid using the chown cmd. os.chown dereferences the symlink, which points to ".gfid", and that is not supported. Note that geo-rep operates on the aux-gfid-mount (e.g. "/mnt/.gfid/<gfid-of-symlink-file>"). Solution: The uid or gid change is actually on the symlink file, so use os.lchown, i.e. don't dereference. BUG: 1567209 Change-Id: I63575fc589d71f987bef1d350c030987738c78ad updates: bz#1567209 Signed-off-by: Kotresh HR <khiremat@redhat.com>
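The difference between the two calls can be shown on an ordinary file system. The sketch below is not geo-rep code: it creates a throwaway symlink to "." in a temporary directory and changes ownership of the link itself to the current user, which needs no privileges; `fix_owner` is a hypothetical helper.

```python
import os
import tempfile


def fix_owner(path, uid, gid):
    # os.lchown changes the link itself; os.chown would follow the link
    # and operate on whatever it points to.
    os.lchown(path, uid, gid)


with tempfile.TemporaryDirectory() as tmp:
    link = os.path.join(tmp, "symlink")
    os.symlink(".", link)                      # link pointing at the current dir
    fix_owner(link, os.getuid(), os.getgid())
    link_is_symlink = os.path.islink(link)
    link_uid = os.lstat(link).st_uid           # lstat does not follow links either
```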
*	extras: Disable choose-local in groups virt and gluster-block	Krutika Dhananjay	2018-04-13	2	-0/+2
	Change-Id: Icba68406d86623195d59d6ee668e0850c037c63a fixes: bz#1566386 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	rpc: set listen-backlog to high value	Milind Changire	2018-04-13	1	-1/+1
	Problem: On node reboot, when glusterd starts volumes rapidly, there's a flood of connections from the bricks to glusterd and from the self-heal daemons to the bricks. This causes SYN flooding and dropped connections when the listen-backlog is not enough to hold the pending connections to compensate for the rate at which connections are accepted by the RPC layer. Solution: Increase the listen-backlog value to 1024. This is a partial solution. Part of the solution is to rearm the listener socket early for quicker accept() of connections. See commit 6964640a977cb10c0c95a94e03c229918fa6eca8 (change 19833). Change-Id: I62283d1f4990dd43839f9a6932cf8a36effd632c fixes: bz#1564600 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	cluster/dht: Handle file migrations when brick down	N Balachandran	2018-04-13	1	-5/+51
	The decision as to which node would migrate a file was based on the gfid of the file. Files were divided among the nodes for the replica/disperse set. However, if a brick was down when rebalance started, the nodeuuids would be saved as NULL and a set of files would not be migrated. Now, if the nodeuuid is NULL, the first non-null entry in the set is the node responsible for migrating the file. Change-Id: I72554c107792c7d534e0f25640654b6f8417d373 fixes: bz#1564198 Signed-off-by: N Balachandran <nbalacha@redhat.com>
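The fallback rule can be sketched as below. The commit only says the choice is "based on the gfid"; the modulo over a numeric hash used here is an assumption made for illustration, and `migrating_node` is a hypothetical name.

```python
def migrating_node(gfid_hash, nodeuuids):
    """Pick the node that migrates a file.

    nodeuuids has one entry per brick of the replica/disperse set; None marks
    a brick that was down when rebalance started.
    """
    # Assumed selection rule for this sketch: index by hash modulo set size.
    chosen = nodeuuids[gfid_hash % len(nodeuuids)]
    if chosen is not None:
        return chosen
    # Fallback from the commit: the first non-null entry takes over.
    return next((uuid for uuid in nodeuuids if uuid is not None), None)


nodes = ["uuid-a", None, "uuid-c"]
normal_pick = migrating_node(2, nodes)
fallback_pick = migrating_node(1, nodes)
```

Without the fallback, every file whose hash landed on the down brick's slot would have been skipped by all nodes.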
*	core/build/various: python3 compat, prepare for python2 -> python3	Kaleb S. KEITHLEY	2018-04-12	59	-102/+108
	Note 1) We're not supposed to be using #!/usr/bin/env python, see https://fedoraproject.org/wiki/Packaging:Guidelines?rd=Packaging/Guidelines#Shebang_lines
	Note 2) We're also not supposed to be using #!/usr/bin/python, see https://fedoraproject.org/wiki/Changes/Avoid_usr_bin_python_in_RPM_Build#Quick_Opt-Out
	The previous patch (https://review.gluster.org/19767) tried to do too much in one patch, so it was abandoned.
	This patch does two things:
	1) minor cleanup of configure(.ac) to explicitly use python2
	2) change all the shebang lines to #!/usr/bin/python2 and add them where they were missing, based on warnings emitted during rpmbuild.
	In a follow-up patch python2 will eventually be changed to python3. Before that, python2-isms (e.g. print, string.join(), etc.) need to be converted to python3. Some of those can be rewritten in version-agnostic python, e.g. print statements become print() with "from __future__ import print_function". The python 2to3 utility will be used for some of those. Aravinda has also given guidance on changes in the comments to the first patch.
	updates: #411 Change-Id: I471730962b2526022115a1fc33629fb078b74338 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	cluster/dht: Wind open to all subvols	N Balachandran	2018-04-11	1	-10/+5
	dht_opendir should wind the open to all subvols whether or not local->subvols is set. This is because dht_readdirp winds the calls to all subvols. Change-Id: I67a96b06dad14a08967c3721301e88555aa01017 updates: bz#1564198 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	xlators/performance: Add pass-through option	Varsha Rao	2018-04-11	9	-10/+139
	Add a pass-through option to the performance translators. Set the option in GF_OPTION_INIT() and GF_OPTION_RECONF(). Updates: #304 Change-Id: If1537450147d154905831e36f7162a32866d7ad6 Signed-off-by: Varsha Rao <varao@redhat.com>
*	posix: reserve option behavior is not correct while using fallocate	Mohit Agrawal	2018-04-11	2	-0/+11
	Problem: The storage.reserve option does not work correctly when disk space is allocated through fallocate. Solution: posix_disk_space_check_thread_proc calls posix_disk_space_check every 5 seconds to monitor disk space and set the flag in posix priv. Within that 5-second window a user can create a big file with fallocate that reaches the posix reserve limit, and no error is shown on the terminal even though the limit has been reached. To resolve this, call posix_disk_space for every fallocate fop instead of relying on the call made by the thread every 5 seconds. BUG: 1560411 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Change-Id: I39ba9390e2e6d084eedbf3bcf45cd6d708591577
*	storage/posix: add pgfid in readdirp if needed	Kinglong Mee	2018-04-10	1	-8/+38
	Change-Id: I6745428fd9d4e402bf2cad52cee8ab46b7fd822f fixes: bz#1560319 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
*	posix: check file state before continuing with fops	Susant Palai	2018-04-10	5	-16/+756
	In the context of Cloudsync: a scenario where a data modification fop, e.g. a write, lands in POSIX on the assumption that the file is local while the file is actually remote can be dangerous. Of course we don't want to take an inodelk for every read/write operation to check the archival status or to coordinate with an upload or a download of a file. To avoid the inodelk, we check the status of the file in POSIX itself before we resume the fop. This helps us avoid the races mentioned above.
	For example, if a write reaches POSIX for a file which is actually remote, POSIX can check the status of the file and learn that the file is remote. It can error out with this status "remote", and the cloudsync xlator will retry the same operation once it has finished downloading the file.
	This patch includes the setxattr changes to do the post-processing of an upload, i.e. truncate and setting the remote xattr "trusted.glusterfs.cs.remote" to indicate that the file is REMOTE.
	Each file will have no xattr if the file is LOCAL, one remote xattr if the file is REMOTE, and a combination of the REMOTE and DOWNLOADING xattrs if the file is being downloaded. There is healing logic for these xattrs to recover from crash inconsistencies.
	Fixes: #387 Change-Id: Ie93c2d41aa8d6a798a39bdbef9d1669f057e5fdb Signed-off-by: Susant Palai <spalai@redhat.com>
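The xattr-to-state mapping described in the last paragraph can be sketched as follows. Only "trusted.glusterfs.cs.remote" is named in the commit message; the name of the downloading xattr and the `NEEDS_HEAL` result are assumptions introduced for this illustration.

```python
REMOTE_XATTR = "trusted.glusterfs.cs.remote"            # named in the commit
DOWNLOADING_XATTR = "trusted.glusterfs.cs.downloading"  # hypothetical name


def file_state(xattrs):
    """Classify a file from the set of xattr names present on it."""
    has_remote = REMOTE_XATTR in xattrs
    has_downloading = DOWNLOADING_XATTR in xattrs
    if not has_remote and not has_downloading:
        return "LOCAL"
    if has_remote and not has_downloading:
        return "REMOTE"
    if has_remote and has_downloading:
        return "DOWNLOADING"
    # A downloading marker without the remote marker is not one of the three
    # documented states; this sketch assumes the healing logic would handle it.
    return "NEEDS_HEAL"


state_local = file_state(set())
state_remote = file_state({REMOTE_XATTR})
state_downloading = file_state({REMOTE_XATTR, DOWNLOADING_XATTR})
```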
*	cluster/dht: act as passthrough for renames on single child DHT	Raghavendra G	2018-04-10	1	-7/+15
	The various synchronization present in dht_rename while handling directories and files is necessary only if we have more than one child. Change-Id: Ie21ad419125504ca2f391b1ae2e5c1d166fee247 fixes: bz#1563511 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
*	experimental/cloudsync: Download xlator for archival feature	Susant Palai	2018-04-10	21	-4/+2468
	spec-files: https://review.gluster.org/#/c/18854/
	Overview:
	* Cloudsync maintains three file states in its inode-ctx: 1 - LOCAL, 2 - REMOTE, 3 - DOWNLOADING.
	* A data-modifying fop is allowed only if the state is LOCAL. If the state is REMOTE or DOWNLOADING, the client will download the file or wait for a download initiated by another client to finish.
	* Multiple downloads and uploads from different clients are synchronized by inodelk.
	* In POSIX a state check is done (part of a different commit) before allowing the fop to continue. If the state is remote/downloading, the fop is unwound with EREMOTE. The client will then download the file and continue with the fop again.
	* Basic algo for a fop (let's say a write fop):
	  - If LOCAL -> resume fop
	  - If REMOTE ->
	    - INODELK
	    - STAT (this gets the state and heals it if needed)
	    - DOWNLOAD
	    - resume fop
	Note:
	* Developers will need to write plugins for download, based on the remote store they choose. In phase-1, support will be added for one remote store per volume. In future, more options for multiple remote stores will be explored.
	TODOs:
	- Implement stat/lookup/readdirp to return size info from xattr
	- Make plugins configurable
	- Implement unlink fop
	- Add metrics collection
	- Add sharding support
	Design Contributions:
	Aravinda V K <avishwan@redhat.com>
	Amar Tumballi <amarts@redhat.com>
	Ram Ankireddypalle <areddy@commvault.com>
	Susant Palai <spalai@redhat.com>
	updates: #387 Change-Id: Iddf711ee7ab4e946ae3e472ff62791a7b85e6d4b Signed-off-by: Susant Palai <spalai@redhat.com>
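The basic algorithm for a write fop listed in this message can be sketched in a few lines. This is an in-memory illustration, not the cloudsync xlator: the `File` class, `download` and `write_fop` are hypothetical, a `threading.Lock` stands in for the inodelk, and re-reading the state under the lock stands in for the STAT step.

```python
import threading

LOCAL, REMOTE = "LOCAL", "REMOTE"


class File:
    def __init__(self, state):
        self.state = state
        self.lock = threading.Lock()   # stands in for the inodelk
        self.downloads = 0


def download(f):
    # Placeholder for a remote-store plugin call.
    f.downloads += 1
    f.state = LOCAL


def write_fop(f, do_write):
    if f.state != LOCAL:
        with f.lock:                   # INODELK
            # STAT: re-read the state under the lock; another client may
            # already have finished the download.
            if f.state != LOCAL:
                download(f)            # DOWNLOAD
    return do_write()                  # resume fop


remote_file = File(REMOTE)
first = write_fop(remote_file, lambda: "written")
second = write_fop(remote_file, lambda: "written again")
```

Checking the state again after taking the lock is what keeps two clients from downloading the same file twice.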
*	quota: allow writes when with EINVAL on pgfid isnot exist	Kinglong Mee	2018-04-09	1	-0/+21
	NFS client gets "Invalid argument" when writing a file through nfs-ganesha.
	1. With quota disabled, the nfs client mounts the nfs-ganesha share and does 'll' in the testing directory.
	2. Enable quota;
	   getfattr: Removing leading '/' from absolute path names
	   trusted.gfid=0xe2edaac0eca8420ebbbcba7e56bbd240
	   trusted.gfid2path.b3250af8fa558e66=0x39663134343566662d653530332d343831352d396635312d3236633565366332633137642f7465737466696c653932
	   trusted.glusterfs.quota.9f1445ff-e503-4815-9f51-26c5e6c2c17d.contri.3=0x00000000000002000000000000000001
	   Notice: testfile92 has no trusted.pgfid xattr.
	3. Restart the glusterfs volume with "gluster volume stop/start gvtest"
	4. echo somedata > testfile92
	5. ll testfile92
	   -rw-r--r-- 1 root root 0 Mar 6 21:43 testfile92
	BUG: 1560319 Change-Id: Iaa4dd1e891c99069fb85b7b11bb0482cbf2303b1 fixes: bz#1560319 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>