glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	If bind-address is IPv6 return it successfully	Amgad Saleh	2019-05-28	1	-6/+11
\| \| \| \| \| \|	Change-Id: Ibd37b6ea82b781a1a266b95f7596874134f30079 fixes: bz#1713730 Signed-off-by: Amgad Saleh <amgad.saleh@nokia.com>
*	glusterd: bulkvoldict thread is not handling all volumes	Mohit Agrawal	2019-05-27	2	-7/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In commit ac70f66c5805e10b3a1072bd467918730c0aeeb4 I missed one condition to populate the volume dictionary in multiple threads while brick_multiplex is enabled. Due to that, glusterd is not sending the volume dictionary for all volumes to the peer. Solution: Update the condition in the code and update the test case to avoid the issue. Change-Id: I06522dbdfee4f7e995d9cc7b7098fdf35340dc52 fixes: bz#1711250 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	tests: Add changelog api tests	Kotresh HR	2019-05-27	2	-0/+135
\| \| \| \| \| \|	updates: bz#1193929 Change-Id: Iee9aab8140882069165621189741f189fb2cc884 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	glusterd/tier: remove tier related code from glusterd	Hari Gowtham	2019-05-27	30	-5484/+126
\| \| \| \| \| \| \| \| \| \| \| \| \|	The handler functions are pointed to dummy functions. The switch case handling for tier also have been moved to point default case to avoid issues, if reintroduced. The tier changes in DHT still remain as such. updates: bz#1693692 Change-Id: I80d80c9a3eb862b4440a36b31ae82b2e9d92e4dc Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
*	tests: Add history api tests	Kotresh HR	2019-05-27	5	-0/+171
\| \| \| \| \| \|	updates: bz#1193929 Change-Id: Ic26ab5277f720c734f083150c1c541763dfa64aa Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	gfapi: add missing api to increase code coverage	Sheetal Pamecha	2019-05-26	1	-18/+340
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	add test for async Read/Write combinations glfs_read_async/write_async glfs_pread_async/pwrite_async glfs_readv_async/writev_async glfs_preadv_async/pwritev_async ftruncate/ftruncate_async fsync/fsync_async fdatasync/fdatasync_async Updates: #655 Change-Id: I12beb97029fd60bce79650a376d8fcd8d383ef16 Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
*	api/glfsxmp.c: minor fixes	Sheetal Pamecha	2019-05-26	2	-63/+266
\| \| \| \| \| \| \| \| \| \| \|	* add more fops: f{get,set,list,remove}xattr(), access(), fstat(), fsetattr(), getxattr(), lgetxattr(), llistxattr(), lsetxattr(), fgetxattr() * handle some error cases (like volume not found) Updates: #655 Change-Id: I3334bdf3090eafd83a54e1be12036ea01b181089 Signed-off-by: Amar Tumballi <amarts@redhat.com> Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
*	Fix some "Null pointer dereference" coverity issues	Xavi Hernandez	2019-05-26	12	-17/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes the following CID's: * 1124829 * 1274075 * 1274083 * 1274128 * 1274135 * 1274141 * 1274143 * 1274197 * 1274205 * 1274210 * 1274211 * 1288801 * 1398629 Change-Id: Ia7c86cfab3245b20777ffa296e1a59748040f558 Updates: bz#789278 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	cluster/ec: honor contention notifications for partially acquired locks	Xavi Hernandez	2019-05-25	2	-1/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	EC was ignoring lock contention notifications received while a lock was being acquired. When a lock is partially acquired (some bricks have granted the lock but some others not yet) we can receive notifications from acquired bricks, which should be honored, since we may not receive more notifications after that. Since EC was ignoring them, once the lock was acquired, it was not released until the eager-lock timeout, causing unnecessary delays on other clients. This fix takes into consideration the notifications received before having completed the full lock acquisition. After that, the lock will be released as soon as possible. Fixes: bz#1708156 Change-Id: I2a306dbdb29fb557dcab7788a258bd75d826cc12 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	tests: Fix spurious failures in ta-write-on-bad-brick.t	Pranith Kumar K	2019-05-24	5	-17/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: afr_child_up_status_meta works only when LOOKUP on $M0 is successful. There are cases where quorum is not met and LOOKUP fails on $M0 which leads to failures similar to: grep: /mnt/glusterfs/0/.meta/graphs/active/patchy-replicate-0/private: Transport endpoint is not connected This was happening once in a while based on attribute-timeout and md-cache not serving the lookup. Fix: Find child-up status based on statedump instead. Also changed mount options to include --entry-timeout=0 and --attribute-timeout=0 updates bz#1193929 Change-Id: Ic0de72c3006d7399a5feb3e4d10d4748949b2ab3 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	tests: Test openfd heal doesn't truncate files	Pranith Kumar K	2019-05-24	2	-0/+218
\| \| \| \| \| \|	fixes bz#1706603 Change-Id: I0bfd30f787f157b7a54f71088f767ccfd7621208 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	glusterd: coverity fix	Sanju Rakonde	2019-05-23	1	-1/+1
\| \| \| \| \| \| \| \| \|	CID: 1401345 - Unused value updates: bz#789278 Change-Id: I6b8f2611151ce0174042384b7632019c312ebae3 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	geo-rep: Geo-rep help text issue	Shwetha K Acharya	2019-05-23	1	-2/+2
\| \| \| \| \| \| \| \| \|	Modified Geo-rep help text for better sanity. fixes: bz#1652887 Change-Id: I40ef7ef709eaecf0125ab4b4a7517e2c5d1ef4a0 Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
*	glusterd-utils.c: skip checksum when possible.	Yaniv Kaul	2019-05-23	1	-22/+18
\| \| \| \| \| \| \| \| \| \| \| \|	We only need to calculate and write the checksum in case of !is_quota_conf . Align the code in accordance. Also, use a smaller buffer (to write few chars). Change-Id: I40c83ce10447df77ff9975d314d768ec2c0087c2 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	cli: Fixed typos	N Balachandran	2019-05-23	1	-2/+2
\| \| \| \| \| \|	Change-Id: I14957c5161f31d5dfc6cf56f8d7ccf4d39372f39 fixes: bz#1711820 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	inode: fix wrong loop count in __inode_ctx_free	Xie Changlong	2019-05-23	1	-5/+6
\| \| \| \| \| \| \| \|	Avoid serious memory leak fixes: bz#1711240 Change-Id: Ic61a8fdd0e941e136c98376a87b5a77fa8c22316 Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
*	cluster/dht: Lookup all files when processing directory	N Balachandran	2019-05-23	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A rebalance process currently only looks up files that it is supposed to migrate. This could cause issues when lookup-optimize is enabled as the dir layout can be updated with the commit hash before all files are looked up. This is especially problematic if one of the rebalance processes fails to complete, as clients will try to access files whose linkto files might not have been created. Each process will now look up every file in the directory it is processing. Pros: Less likely that files will be inaccessible. Cons: More lookup requests sent to the bricks and a potential performance hit. Note: this does not handle races such as when a layout is updated on disk just as the create fop is sent by the client. Change-Id: I22b55846effc08d3b827c3af9335229335f67fb8 fixes: bz#1711764 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	lock: check null value of dict to avoid log flooding	Susant Palai	2019-05-23	1	-1/+1
\| \| \| \| \| \|	updates: bz#1712322 Change-Id: I120a1d23506f9ebcf88c7ea2f2eff4978a61cf4a Signed-off-by: Susant Palai <spalai@redhat.com>
*	ec/fini: Fix race with ec_fini and ec_notify	Mohammed Rafi KC	2019-05-21	6	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During a graph cleanup, we first send a PARENT_DOWN and wait for a child down to ultimately free the xlator and the graph. In the ec xlator, we clean up the threads when we get a PARENT_DOWN event. But a racing event like CHILD_UP or even an xl_op may trigger healing threads after the thread cleanup. So there is a chance that the threads might access a freed private variable. Change-Id: I252d10181bb67b95900c903d479de707a8489532 fixes: bz#1703948 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	tests/quick-read-with-upcall.t: increase the timeout	Amar Tumballi	2019-05-21	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Running with 2 second sleep at this place caused failures like: `not ok 14 [ 2014/ 7] < 41> 'test-message1 cat /mnt/glusterfs/1/test.txt' -> 'Got "test-message0" instead of "test-message1"'` in few runs in 100 iterations. But when increased to higher than sleep 3, have not seen any failures in 100 runs. While I don't know the exact reasons for the behavior yet, looks like this increase in wait helps to pass the regression without failures. updates: bz#1693692 Change-Id: I0610b79bea53e36de3eea6c11234b7fc9dfd6232 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	afr/frame: Destroy frame after afr_selfheal_entry_granular	Mohammed Rafi KC	2019-05-21	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \|	In function "afr_selfheal_entry_granular", after completing the heal we are not destroying the frame. This will lead to a crash when we execute a statedump operation, where it tries to access the xlator object. If this xlator object is freed as part of the graph destroy, this will lead to an invalid memory access. Change-Id: I0a5e78e704ef257c3ac0087eab2c310e78fbe36d fixes: bz#1708926 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	Revert "rpc: implement reconnect back-off strategy"	Amar Tumballi	2019-05-21	2	-18/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 59841f7e1ff0511b04884015441a181a56d07bea. This revert is done as a 'possible' fix for frequent regression failures, which are random in nature too (i.e., different tests fail in different runs). Why exactly this patch? Because this patch seemed like the most probable candidate that got merged in the last 15 days, and after which regressions are failing more often. Updates: bz#1711827 Change-Id: I35333162fcd4064f9609525ca93c666053c6d959
*	tests: change usleep() to sleep()	Sanju Rakonde	2019-05-16	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	While running a test case, the following warning messages are seen on the display. To avoid such warnings, change usleep() to sleep(). warning: usleep is deprecated, and will be removed in near future! warning: use "sleep 0.25" instead... updates: bz#1193929 Signed-off-by: Sanju Rakonde <srakonde@redhat.com> Change-Id: I48b79ede1c70b101f654635dd4cc83e50ea55b73
*	features/shard: Fix crash during background shard deletion in a specific case	Krutika Dhananjay	2019-05-16	4	-4/+164
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Consider the following case - 1. A file gets FALLOCATE'd such that > "shard-lru-limit" number of shards are created. 2. And then it is deleted after that. The unique thing about FALLOCATE is that unlike WRITE, all of the participant shards are resolved and created and fallocated in a single batch. This means, in this case, after the first "shard-lru-limit" number of shards are resolved and added to lru list, as part of resolution of the remaining shards, some of the existing shards in lru list will need to be evicted. So these evicted shards will be inode_unlink()d as part of eviction. Now once the fop gets to the actual FALLOCATE stage, the lru'd-out shards get added to fsync list. 2 things to note at this point: i. the lru'd out shards are only part of fsync list, so each holds 1 ref on base shard ii. and the more recently used shards are part of both fsync and lru list. So each of these shards holds 2 refs on base inode - one for being part of fsync list, and the other for being part of lru list. FALLOCATE completes successfully and then this very file is deleted, and background shard deletion launched. Here's where the ref counts get mismatched. First as part of inode_resolve()s during the deletion, the lru'd-out inodes return NULL, because they are inode_unlink()'d by now. So these inodes need to be freshly looked up. But as part of linking them in lookup_cbk (precisely in shard_link_block_inode()), inode_link() returns the lru'd-out inode object. And its inode ctx is still valid and ctx->base_inode valid from the last time it was added to list. But shard_common_lookup_shards_cbk() passes NULL in the place of base_pointer to __shard_update_shards_inode_list(). This means, as part of adding the lru'd out inode back to lru list, base inode is not ref'd since its NULL. 
Whereas post unlinking this shard, during shard_unlink_block_inode(), ctx->base_inode is accessible and is unref'd because the shard was found to be part of LRU list, although the matching ref didn't occur. This at some point leads to base_inode refcount becoming 0 and it getting destroyed and released back while some of its associated shards are continuing to be unlinked in parallel and the client crashes whenever it is accessed next. Fix is to pass base shard correctly, if available, in shard_link_block_inode(). Also, the patch fixes the ret value check in tests/bugs/shard/shard-fallocate.c Change-Id: Ibd0bc4c6952367608e10701473cbad3947d7559f Updates: bz#1696136 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	geo-rep: Convert gfid conflict resolution logs into debug	Kotresh HR	2019-05-14	1	-9/+12
\| \| \| \| \| \| \| \| \| \| \| \|	The gfid conflict resolution code path is not supposed to be hit in the generic code path. But some heavy rename workloads (BUG: 1694820) make it a generic case. So logging the entries to be fixed as INFO floods the log in these particular workloads. Hence convert them to DEBUG. fixes: bz#1709653 Change-Id: I4d5e102b87be5fe5b54f78f329e588882d72b9d9 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	geo-rep: Fix sync hang with tarssh	Kotresh HR	2019-05-13	3	-4/+163
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Geo-rep sync hangs when tarssh is used as the sync engine under heavy workload. Analysis and Root cause: The tar process was found to be hung. Further debugging showed that the stderr buffer of the tar process on master was full, i.e., 64k. When the buffer was copied to a file from /proc/pid/fd/2, the hang was resolved. This can happen when files picked by the tar process to sync don't exist on master anymore. If this count increases to around 1k, the stderr buffer fills up. Fix: The tar process is executed using Popen with stderr as PIPE. The final execution is something like below. tar \| ssh <args> root@slave tar --overwrite -xf - -C <path> It was waiting on the ssh process first using communicate() and then on tar. Note that communicate() reads stdout and stderr. So when the stderr of the tar process fills up, there is no one to read it until the untar via ssh is completed. That can't happen, which leads to a deadlock. Hence we should wait on both processes in parallel, so that stderr is read on both processes. Change-Id: I609c7cc5c07e210c504771115b4d551a2e891adf fixes: bz#1707728 Signed-off-by: Kotresh HR <khiremat@redhat.com>
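The deadlock described in this commit message can be illustrated in a few lines of Python. This is a minimal sketch, not the geo-rep code: `run_pipeline` and `drain` are hypothetical names, and using one reader thread per stderr pipe is just one way to read both processes in parallel; the actual patch may do it differently. The stand-in commands in the demo replace tar and ssh.

```python
import subprocess
import sys
import threading


def run_pipeline(producer_cmd, consumer_cmd):
    """Run `producer | consumer` and drain both stderr pipes concurrently.

    Hypothetical helper: each stderr is read in its own thread, so neither
    child can block on a full stderr pipe while we wait on the other one.
    """
    producer = subprocess.Popen(
        producer_cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    consumer = subprocess.Popen(
        consumer_cmd,
        stdin=producer.stdout,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
    )
    # Drop our copy of the pipe so the consumer sees EOF when the producer exits.
    producer.stdout.close()

    captured = {}

    def drain(name, stream):
        # read() returns only at EOF, i.e. once the child closes its stderr.
        with stream:
            captured[name] = stream.read()

    readers = [
        threading.Thread(target=drain, args=("producer", producer.stderr)),
        threading.Thread(target=drain, args=("consumer", consumer.stderr)),
    ]
    for reader in readers:
        reader.start()
    for reader in readers:
        reader.join()

    return producer.wait(), consumer.wait(), captured


if __name__ == "__main__":
    # Stand-ins for tar and ssh: the producer writes far more than 64k to stderr.
    noisy = [sys.executable, "-c",
             "import sys; sys.stderr.write('x' * 200000); sys.stdout.write('data')"]
    sink = [sys.executable, "-c", "import sys; sys.stdin.read()"]
    rc_producer, rc_consumer, errs = run_pipeline(noisy, sink)
    print(rc_producer, rc_consumer, len(errs["producer"]))
```

Waiting on the consumer first with a blocking call, while nobody reads the producer's stderr, is the ordering that hangs once the producer's stderr pipe fills.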
*	ec/shd: Cleanup self heal daemon resources during ec fini	Mohammed Rafi KC	2019-05-13	6	-13/+124
\| \| \| \| \| \| \| \| \| \|	We were not properly cleaning self-heal daemon resources during ec fini. With shd multiplexing, it is absolutely necessary to cleanup all the resources during ec fini. Change-Id: Iae4f1bce7d8c2e1da51ac568700a51088f3cc7f2 fixes: bz#1703948 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	rpc: implement reconnect back-off strategy	Xavier Hernandez	2019-05-11	2	-16/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a connection failure happens, gluster tries to reconnect every 3 seconds. In some cases the failure is spurious, so a delay of 3 seconds could be unnecessarily long. This patch implements a back-off strategy that tries a reconnect as soon as 1 tenth of a second. If this fails, the time is doubled until it's around 3 seconds. After that, the reconnect is attempted every 3 seconds as before. Change-Id: Icb3fbe20d618f50cbbb599dce542b4e871c22149 Updates: bz#1193929 Signed-off-by: Xavier Hernandez <xhernandez@redhat.com>
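The schedule described in this commit message can be sketched as a generator. This is an illustration only: `reconnect_delays` and its parameters are hypothetical names, and clamping at exactly 3 seconds is an assumption (the message only says "around 3 seconds"). Note that the log above shows this commit was reverted on 2019-05-21.

```python
import itertools


def reconnect_delays(initial=0.1, ceiling=3.0):
    """Yield reconnect delays in seconds: start small, double, cap at ceiling."""
    delay = min(initial, ceiling)
    while True:
        yield delay
        # Double after each failed attempt, but never exceed the ceiling.
        delay = min(delay * 2, ceiling)


if __name__ == "__main__":
    # Print the first few delays of the sketched schedule.
    print(list(itertools.islice(reconnect_delays(), 7)))
```

With these assumed defaults, the sixth attempt already waits the full 3 seconds, so a spurious failure is retried quickly while a persistent one settles at the old interval.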
*	libglusterfs: Remove decompounder helper routines from symbol export	Anoop C S	2019-05-11	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	decompounder and related sources were removed via the following commits: https://review.gluster.org/#/c/glusterfs/+/22627/ https://review.gluster.org/#/c/glusterfs/+/22629/ Therefore taking out symbol exports for those removed routines. Change-Id: I2ef99a318de1e4b512cabd2fa923225c5b79b1e5 updates: bz#1193929 Signed-off-by: Anoop C S <anoopcs@redhat.com>
*	core: Capture process memory usage at the time of call gf_msg_nomem	Mohit Agrawal	2019-05-11	1	-9/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: All gluster processes call gf_msg_nomem when calloc/malloc/realloc fails, but the message does not capture the current memory usage of the gluster process. Solution: Call getrusage to capture the current memory usage of the gluster process. Change-Id: I2e0319da1f33b177fa042fdc9e7268068576c9c3 fixes: bz#1708051 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
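The idea of attaching resource usage to an out-of-memory message can be sketched with Python's `resource` module, which wraps the same getrusage call (Unix only). The function name `nomem_message`, the message format, and the choice of `ru_maxrss` are assumptions for illustration; the commit message does not say which fields the C patch logs.

```python
import resource


def nomem_message(requested_bytes):
    """Build a log line that includes the process's peak resident set size.

    Hypothetical format. ru_maxrss is reported in kilobytes on Linux and in
    bytes on macOS, so the unit is left out of the message here.
    """
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return (
        f"no memory available for size ({requested_bytes}) "
        f"current memory usage: ru_maxrss={usage.ru_maxrss}"
    )


if __name__ == "__main__":
    print(nomem_message(4096))
```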
*	cli: Validate invalid slave url	Kotresh HR	2019-05-11	4	-5/+31
\| \| \| \| \| \| \| \| \|	This patch validates the invalid slave url in cli itself and throws appropriate error. fixes: bz#1098991 Change-Id: I278e2a04a4d619d2c2d1db0dd56ab5bdf7e7f469 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	glusterd: Add gluster volume stop operation to glusterd_validate_quorum()	Vishal Pandey	2019-05-11	2	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	ISSUE: gluster volume stop succeeds even if quorum is not met. Fix: Add GD_OP_STOP_VOLUME to glusterd_validate_quorum in glusterd_mgmt_v3_pre_validate(). Since the volume stop command has been ported from synctask to mgmt_v3, the quorum check was missed out. Change-Id: I7a634ad89ec2e286ea262d7952061efad5360042 fixes: bz#1690753 Signed-off-by: Vishal Pandey <vpandey@redhat.com>
*	tests: fix bug-1319374.c compile warnings.	Ravishankar N	2019-05-10	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I was looking at a downstream failure of bug-1319374-THIS-crash.t when I saw the compiler was throwing a warning while running the test: tests/bugs/gfapi/bug-1319374.c:17:61: warning: implicit declaration of function ‘strerror’; did you mean ‘perror’? [-Wimplicit-function-declaration] fprintf(stderr, "\nglfs_new: returned NULL (%s)\n", strerror(errno)); ^~~~~~~~ perror So I compiled the .c with -Wall and saw many more warnings, all due to a missing header. This patch fixes it. fixes: bz#1708163 Change-Id: I8b6dd8e1404178a3d99b2d92d01f4575f5203e58 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	shd/glusterd: Serialize shd manager to prevent race condition	Mohammed Rafi KC	2019-05-10	4	-0/+72
\| \| \| \| \| \| \| \| \| \| \|	At the time of a glusterd restart, while doing a handshake there is a possibility that multiple shd manager might get executed. Because of this, there is a chance that multiple shd get spawned during a glusterd restart Change-Id: Ie20798441e07d7d7a93b7d38dfb924cea178a920 fixes: bz#1707081 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	afr: thin-arbiter lock release fixes	Ravishankar N	2019-05-10	3	-47/+93
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- pass fop state instead of afr local to afr_ta_dom_lock_check_and_release() - avoid afr_lock_release_synctask() being called simultaneously from notify code path and transaction (post-op) code path due to races. - Check if the post-op on TA is valid based on event_gen checks. - Invalidate in-memory information when we get TA child down. Note: This patch addresses some pending review comments of commit 053b1309dc8fbc05fcde5223e734da9f694cf5cc (https://review.gluster.org/#/c/glusterfs/+/20095/) fixes: bz#1698449 Change-Id: I2ccd7e1b53362f9f3fed8680aecb23b5011eb18c Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	glusterd: fix inconsistent global option output in volume get	Atin Mukherjee	2019-05-09	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	volume get all all \| grep <key> & volume get <volname> all \| grep <key> dumps two different output value for cluster.brick-multiplex and cluster.server-quorum-ratio Fixes: bz#1707700 Change-Id: Id131734e0502aa514b84768cf67fce3c22364eae Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd: reduce some work in glusterd-utils.c	Yaniv Kaul	2019-05-09	1	-125/+128
\| \| \| \| \| \| \| \| \| \| \| \|	Similar to https://review.gluster.org/#/c/glusterfs/+/22652/ , reduce some of the work by using smaller buffers and less conversion of parameters when snprintf()'ing them. On the way, remove some clang warnings, mainly on dead assignment. Change-Id: Ie51e6d6f14df6b2ccbebba314cf937af08839741 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	tests: improve and fix some test scripts	Xavier Hernandez	2019-05-09	19	-82/+179
\| \| \| \| \| \|	Change-Id: Iceefe22af754096c599dc570d4894d14fce4deae Updates: bz#1193929 Signed-off-by: Xavier Hernandez <xhernandez@redhat.com>
*	geo-rep: Fix sync-method config	Kotresh HR	2019-05-09	4	-9/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When 'use_tarssh' is set to true, it exits with successful message but the default 'rsync' was used as sync-engine. The new config 'sync-method' is not allowed to set from cli. Analysis and Fix: The 'use_tarssh' config is deprecated with new config framework and 'sync-method' is the new config to choose sync-method i.e. tarssh or rsync. This patch fixes the 'sync-method' config. The allowed values are tarssh and rsync. Change-Id: I0edb0319cad0455b29e49f2f08a64ce324735e84 fixes: bz#1707686 Signed-off-by: Kotresh HR <khiremat@redhat.com>
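The validation rule stated in this commit message (only tarssh and rsync are accepted for 'sync-method') can be sketched as follows. The names `ALLOWED_SYNC_METHODS` and `validate_sync_method` are hypothetical and do not come from the geo-rep config framework; this only illustrates rejecting unknown values instead of silently falling back to a default.

```python
# The two values the commit message names as allowed.
ALLOWED_SYNC_METHODS = ("rsync", "tarssh")


def validate_sync_method(value):
    """Return the value if it is an allowed sync method, else raise ValueError."""
    if value not in ALLOWED_SYNC_METHODS:
        allowed = ", ".join(ALLOWED_SYNC_METHODS)
        raise ValueError(f"invalid sync-method {value!r}; allowed values: {allowed}")
    return value
```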
*	tests/geo-rep: Fix arequal checksum comparison	Kotresh HR	2019-05-09	5	-9/+10
\| \| \| \| \| \| \| \| \|	The arequal checksum comparison was always returning success, even when the checksums did not match. Fixed the same. Change-Id: I5083da25c0954126e452d06311d2d376f8540555 fixes: bz#1707742 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	glusterd: improve logging in __server_getspec()	Sanju Rakonde	2019-05-08	2	-2/+15
\| \| \| \| \| \| \|	updates: bz#1193929 Change-Id: Idad745d5869c92e6bed71842f14bc1a3362ca4bd Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	afr: log before attempting data self-heal.	Ravishankar N	2019-05-08	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I was working on a blog about troubleshooting AFR issues and I wanted to copy the messages logged by self-heal for my blog. I then realized that AFR-v2 is not logging before attempting data heal while it logs it for metadata and entry heals. I [MSGID: 108026] [afr-self-heal-entry.c:883:afr_selfheal_entry_do] 0-testvol-replicate-0: performing entry selfheal on d120c0cf-6e87-454b-965b-0d83a4c752bb I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed entry selfheal on d120c0cf-6e87-454b-965b-0d83a4c752bb. sources=[0] 2 sinks=1 I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed data selfheal on a9b5f183-21eb-4fb3-a342-287d3a7dddc5. sources=[0] 2 sinks=1 I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-testvol-replicate-0: performing metadata selfheal on a9b5f183-21eb-4fb3-a342-287d3a7dddc5 I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 0-testvol-replicate-0: Completed metadata selfheal on a9b5f183-21eb-4fb3-a342-287d3a7dddc5. sources=[0] 2 sinks=1 Adding it in this patch. Now there is a 'performing' and a corresponding 'Completed' message for every type of heal. fixes: bz#1707746 Change-Id: I0b954cf1e17b48280aefa76640b5119b92133d61 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	tests: enhance the auth.allow test to validate all failures of 'login' module	Amar Tumballi	2019-05-08	1	-4/+49
\| \| \| \| \| \| \| \|	now the enhanced test covers most of the code in auth.login and auth.addr module. updates: bz#1693692 Change-Id: I1f43c7dc414e2e4d443a93e9a37051359fd46ea4 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	geo-rep: fix incorrectly formatted authorized_keys	Sunny Kumar	2019-05-08	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are two ways for creating secret pem pub file during geo-rep setup. 1. gluster-georep-sshkey generate 2. gluster system:: execute gsec_create Below patch solves this problem for `gluster-georep-sshkey generate` method. Patch link: https://review.gluster.org/#/c/glusterfs/+/22246/ This patch is added to support old way of creating secret pem pub file `gluster system:: execute gsec_create`. Problem: While Geo-rep setup when creating an ssh authorized_keys the geo-rep setup inserts an extra space before the "ssh-rsa" label. This gets flagged by an enterprise customer's security scan as a security violation. Solution: Remove extra space while creating secret key. fixes: bz#1679401 Change-Id: I92ba7e25aaa5123dae9ebe2f3c68d14315aa5f0e Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
*	glusterd/store: store all key-values in one shot	Yaniv Kaul	2019-05-08	5	-323/+343
\| \| \| \| \| \| \| \| \| \| \| \|	Instead of saving each key-value separately, which is slow (especially as we fflush() after each!), store them all as one string and write all together. Implements https://github.com/gluster/glusterfs/issues/629 Change-Id: Ie77a272446b0b6785584b710a4fdd9c613dd9578 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
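The difference between the two approaches in this commit message can be sketched in Python. Both function names are hypothetical, and the `key=value` line format is an assumption made for the example; the real change is in glusterd's C store code.

```python
import os
import tempfile


def store_per_key(path, items):
    """Old pattern: one write and one flush per key-value pair."""
    with open(path, "w") as fh:
        for key, value in items.items():
            fh.write(f"{key}={value}\n")
            fh.flush()


def store_one_shot(path, items):
    """New pattern: build the whole payload in memory, then write it once."""
    payload = "".join(f"{key}={value}\n" for key, value in items.items())
    with open(path, "w") as fh:
        fh.write(payload)


if __name__ == "__main__":
    # Write a small sample store into a throwaway directory and show it.
    options = {"type": "2", "count": "3", "status": "1"}
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "info")
        store_one_shot(target, options)
        with open(target) as fh:
            print(fh.read(), end="")
```

Both functions produce the same file contents; the one-shot version trades a little memory for far fewer write and flush calls.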
*	dht: Custom xattrs are not healed in case of add-brick	root	2019-05-08	2	-8/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: If any custom xattrs are set on the directory before adding a brick, the xattrs are not healed on the directory after adding a brick. Solution: xattrs are not healed because dht_selfheal_dir_mkdir_lookup_cbk checks the value of MDS, and if the MDS value is not negative, the selfheal code path does not take a reference of the MDS xattrs. Change the condition to take a reference of the MDS xattrs so that custom xattrs are populated on the newly added brick. Updates: bz#1702299 Change-Id: Id14beedb98cce6928055f294e1594b22132e811c Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	afr : fix Coverity CID 1398627	Rinku Kothiya	2019-05-07	1	-2/+9
\| \| \| \| \| \| \| \| \| \| \|	Fixed coverity error, "Unchecked return value (CHECKED_RETURN)". Checking return value & logging error message if afr_set_pending_dict fails. updates: bz#789278 Change-Id: Iab7da6b4f3cd0622b95b8e1c412b007a330467e5 Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
*	libglusterfs: Fix compilation when --disable-mempool is used	Pranith Kumar K	2019-05-07	1	-0/+5
\| \| \| \| \| \|	updates bz#1193929 Change-Id: I245c065b209bcce5db939b6a0a934ba6fd393b47 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	performance/write-behind: remove request from wip list in wb_writev_cbk	Raghavendra G	2019-05-06	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a race in the way O_DIRECT writes are handled. Assume two overlapping write requests w1 and w2. * w1 is issued and is in wb_inode->wip queue as the response is still pending from bricks. Also wb_request_unref in wb_do_winds is not yet invoked. list_for_each_entry_safe (req, tmp, tasks, winds) { list_del_init (&req->winds); if (req->op_ret == -1) { call_unwind_error_keep_stub (req->stub, req->op_ret, req->op_errno); } else { call_resume_keep_stub (req->stub); } wb_request_unref (req); } * w2 is issued and wb_process_queue is invoked. w2 is not picked up for winding as w1 is still in wb_inode->wip. w2 is added to todo list and wb_writev for w2 returns. * response to w1 is received and invokes wb_request_unref. Assume wb_request_unref in wb_do_winds (see point 1) is not invoked yet. Since there is one more refcount, wb_request_unref in wb_writev_cbk of w1 doesn't remove w1 from wip. * wb_process_queue is invoked as part of wb_writev_cbk of w1. But, it fails to wind w2 as w1 is still in wip. * wb_request_unref is invoked on w1 as part of wb_do_winds. w1 is removed from all queues including wip. * After this point there is no invocation of wb_process_queue unless a new request is issued from the application, causing w2 to be hung till the next request. This bug is similar to bz 1626780 and bz 1379655. Change-Id: Iaa47437613591699d4c8ad18bc0b32de6affcc31 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1705865
*	mem-pool.{c\|h}: minor changes	Yaniv Kaul	2019-05-06	1	-25/+12
\| \| \| \| \| \| \| \| \| \| \|	1. Removed some code that was not needed. It did not really do anything. 2. CALLOC -> MALLOC in one place. Compile-tested only! updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: I4419161e1bb636158e32b5d33044b06f1eef2449