glusterfs.git -

	Commit message	Author	Age	Files	Lines
*	socket: socket event handlers now return void	Milind Changire	2019-02-18	6	-27/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Returning any value from socket event handlers to the event sub-system doesn't make sense since event sub-system cannot handle socket sub-system errors. Solution: Change return type of all socket event handlers to 'void' Change-Id: I70dc2c57f12b7ea2fae41120f71aa0d7fe0b2b6f Fixes: bz#1651246 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	md-cache.c: minor reduction of work under lock.	Yaniv Kaul	2019-02-18	1	-4/+3
\| \| \| \| \| \| \| \| \| \|	Take the time before taking the lock, not under lock. Compile-tested only! updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: I6cd05d8556a9bcc015e1be53f6ba46854e52a380
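The idea of taking the time before taking the lock can be sketched as follows. This is a minimal illustration, not the md-cache code: `struct cache_entry` and `cache_entry_update()` are hypothetical names invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical cache entry; this is NOT the md-cache structure. */
struct cache_entry {
    pthread_mutex_t lock;
    time_t last_updated;
    int value;
};

/* Read the clock first, then hold the lock only for the two stores. */
static void
cache_entry_update(struct cache_entry *entry, int value)
{
    time_t now;

    assert(entry != NULL);

    now = time(NULL); /* done outside the critical section */

    pthread_mutex_lock(&entry->lock);
    entry->value = value;
    entry->last_updated = now;
    pthread_mutex_unlock(&entry->lock);
}
```

The timestamp may be marginally older than the moment the lock is acquired, which is the usual trade-off for a shorter critical section.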
*	protocol/server: Use SERVER_REQ_SET_ERROR correctly for dicts	Pranith Kumar K	2019-02-15	1	-275/+236
\| \| \| \| \| \| \| \| \| \|	Removed op_errno based SERVER_REQ_SET_ERROR() calls, which were dead code. xdr_to_dict() calls have this check which is used in 4.0 version of xdr-to-dict. fixes bz#1676797 Change-Id: I6f56907c85576f1263a6ec04ed7e37f723b01ac3 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	dht-shared.c: minor reduction of work under lock.	Yaniv Kaul	2019-02-14	1	-6/+7
\| \| \| \| \| \| \| \| \| \|	Minor changes to reduce work done under a lock. Compile-tested only! updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: Ia58adfb5125129e5d1f3bbf2202f38520fdbc29f
*	storage/posix: print the actual file path	Raghavendra Bhat	2019-02-14	1	-54/+75
\| \| \| \| \| \| \| \| \| \| \| \| \|	posix converts incoming operations on files to operations on corresponding gfid handles. While this in itself is not a problem, logging of those gfid handles in place of actual file paths can create confusion during debugging. The best way would be to print both the actual file (received as an argument) for path based operations and the gfid handle associated with it. Change-Id: I408c36ca6456f2e3981b93151c19ef7f60085ad6 fixes: bz#1675076 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
*	Fix compilation for fops-sanity.c	Pranith Kumar K	2019-02-14	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Without this patch the following error is seen: .... warning: implicit declaration of function ‘makedev’ [-Wimplicit-function-declaration] ret = mknod("cspecial", S_IFCHR \| S_IRWXU \| S_IRWXG, makedev(2, 3)); ^~~~~~~ /usr/bin/ld: /tmp/ccIVwT46.o: in function `path_based_fops': /home/pk/workspace/gerrit-repo/tests/basic/fops-sanity.c:478: undefined reference to `makedev' .... updates bz#1676797 Change-Id: I8a17c38fdfd458dd2dc75f4c7e2bf20ce555a042 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	docs: fix typo in Developer Guide Readme	Csaba Henk	2019-02-14	1	-1/+1
\| \| \| \| \| \|	updates: bz#1193929 Change-Id: I3e13e5a2d7347cf2a4e3717e93b5e97325e2de97 Signed-off-by: Csaba Henk <csaba@redhat.com>
*	cluster/dht: Request linkto xattrs in dht_rmdir opendir	N Balachandran	2019-02-13	1	-1/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If parallel-readdir is enabled, the rda xlator is loaded below dht in the graph and proactively lists and caches entries when an opendir is performed. dht_rmdir checks if the directory being deleted contains stale linkto files by performing a readdirp on its child subvols. However, as the entries are actually read in during the opendir operation, which does not request the linkto xattr, no linkto xattrs are present for the entries, causing dht to incorrectly identify them as data files and fail the rmdir operation with ENOTEMPTY. DHT now always adds the linkto xattr in the list of xattrs requested in the opendir. Change-Id: I0711198e66c59146282eb8b88084170bedfb4018 fixes: bz#1672851 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	dht: fix double extra unref of inode at heal path	Kinglong Mee	2019-02-13	1	-1/+1
\| \| \| \| \| \| \| \| \|	The loc_wipe is done in the _out_ section, so inode_unref(loc.parent) here causes a double extra unref of loc.parent. Change-Id: I2dc809328d3d34bf7b02c7df9a4f97788af511e6 updates: bz#1651439 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
*	cluster/dht: Fix lookup selfheal and rmdir race	N Balachandran	2019-02-13	1	-9/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	A race between the lookup selfheal and rmdir can cause directories to be healed only on non-hashed subvols. This can prevent the directory from being listed from the mount point and in turn causes rm -rf to fail with ENOTEMPTY. Fix: Update the layout information correctly and reduce the call count only after processing the response. Change-Id: I812779aaf3d7bcf24aab1cb158cb6ed50d212451 fixes: bz#1676400 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	inode: make critical section smaller	Amar Tumballi	2019-02-13	3	-217/+111
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	do all the 'static' tasks outside of locked region. * hash_dentry() and hash_gfid() are now called outside locked region. * remove extra __dentry_hash exported in libglusterfs.sym * avoid checks in locked functions, if the check is done in calling function. * implement dentry_destroy(), which handles freeing of dentry separately, from that of dentry_unset (which takes care of separating dentry from inode, and table) Updates: bz#1670031 Change-Id: I584213e0748464bb427fbdef3c4ab6615d7d5eb0 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	tests/dht: Stop volume before unmounting bricks	N Balachandran	2019-02-13	1	-1/+7
\| \| \| \| \| \| \| \| \|	The bricks are loopback devices. Unmounting them is done before the cleanup and leads to "target is busy" messages. Change-Id: Ia808c2c9580273e1bf0595ecf53c210847c44577 fixes: bz#1676736 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	rfc.sh: fix the missing rebase issue	Amar Tumballi	2019-02-13	1	-38/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we decided to use clang-format for coding-style, and not bother about local `check-patch.pl`, we commented out the coding style check method in `./rfc.sh`. While that was intended and valid, we missed one major issue: the `git fetch` command, which fetches the latest code of the project, was inside the check_patches method and so was disabled along with it. Even though we had an explicit 'rebase_orgin' method, it did nothing because the git fetch was not done before this. This patch calls git fetch explicitly and removes the dead check-patches code, as we have all been fine with the coding-style changes for the last 4+ months. updates: bz#1193929 Change-Id: I3779096a527b93e780858ada8d988fdcdd6e2928 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	afr/shd: Cleanup self heal daemon resources during afr fini	Mohammed Rafi KC	2019-02-12	3	-0/+67
\| \| \| \| \| \| \| \| \|	We were not properly cleaning self-heal daemon resources during afr fini. This patch will clean the same. Change-Id: I597860be6f781b195449e695d871b8667a418d5a updates: bz#1659708 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	performance/md-cache: change the op-version of "global-cache-invalidation"	Raghavendra Gowdappa	2019-02-12	2	-2/+2
\| \| \| \| \| \| \| \| \|	Since release-6 is not done yet, this option can be introduced with GD_OP_VERSION_6_0. Change-Id: I8a0867e5b8b23d0d485704a2fc7a3efc4a90f637 Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com> updates: bz#1664934
*	clnt/rpc: ref leak during disconnect.	Mohammed Rafi KC	2019-02-12	3	-12/+47
\| \| \| \| \| \| \| \| \| \|	During disconnect cleanup, we are not cancelling the reconnect timer, which causes a ref leak each time a disconnect happens. Change-Id: I9d05d1f368d080e04836bf6a0bb018bf8f7b5b8a updates: bz#1659708 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	fuse: reflect the actual default for lru-limit option	Amar Tumballi	2019-02-11	2	-2/+2
\| \| \| \| \| \| \| \|	in both `--help` text and man page updates: bz#1193929 Change-Id: I9aa9367c6863ac8e2403255280697c9e6be26cf0 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	performance/md-cache: introduce an option to control invalidation of inodes	Raghavendra Gowdappa	2019-02-11	2	-10/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Explicit invalidation by calling inode_invalidate is necessary when same (meta)data is shared/access across multiple mounts. Without an explicit inode_invalidate call, caches in the mount which didn't witness writes wouldn't be aware of changes as writes wouldn't have passed through them. However, if (meta)data is not shared, all relevant I/O goes through the cache of single mount and hence is coherent with (meta)data on bricks always. So, explicit inode invalidation can be disabled for this case which gives a huge performance boost for workloads that write data and then immediately read the data they just wrote. Note that otherwise, local writes (which pass through the cache) will change ctime and cause unnecessary invalidations. The name of the option that controls this behavior is "performance.global-cache-invalidation". This option is global and it purges caches both in glusterfs and kernel stack for native FUSE mounts. For non-native FUSE mounts, it purges cache only from glusterfs stack. This option is effective only when performance.stat-prefetch is on. Note that there is a similar option "performance.cache-invalidation", but the scope of that option is limited to quick-read and md-cache. Change-Id: I462bb4b65ff9aae1f6ba76f50b1f2f94fb10323b Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com> updates: bz#1664934
*	inode: create inode outside locked region	Amar Tumballi	2019-02-11	1	-11/+13
\| \| \| \| \| \| \| \| \|	Only linking of inode to the table, and inserting it in a list needs to be in locked region. Updates: bz#1670031 Change-Id: I6ea7e956b80cf2765c2233d761909c4bf9c7253c Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	mount/fuse: fix bug related to --auto-invalidation in mount script	Raghavendra Gowdappa	2019-02-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	When "auto-invalidation" option was not specified for mount script, glusterfs cmdline ended with "--auto-invalidation=" option. This patch fixes that bug in mount script. Thanks to Amar for reporting it. Change-Id: Ie5cd4c6ffb3ac644d9d2b032035f914a935d05a8 Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com> updates: bz#1664934
*	glusterd: improve logging	Atin Mukherjee	2019-02-08	1	-1/+3
\| \| \| \| \| \| \| \| \|	glusterd_resolve_all_bricks failure log should highlight the brick identifier. Updates: bz#1193929 Change-Id: I035b4650ef6a14bb1e1221d3bad1c40f9d43dbdd Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	api: Update all future API versions to rel-6	ShyamsundarR	2019-02-07	5	-75/+73
\| \| \| \| \| \| \| \| \| \|	As release 6 is branched, all future APIs now become 6.0 This change implements the same. Change-Id: I6db368b4dc8585278ec11d4a411adcd04635de53 Updates: bz#1193929 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	fuse: correctly handle setxattr values	Xavi Hernandez	2019-02-07	3	-8/+42
\| \| \| \| \| \| \| \| \| \|	The setxattr function receives a pointer to raw data, which may not be null-terminated. When this data needs to be interpreted as a string, an explicit null termination needs to be added before using the value. Change-Id: Id110f9b215b22786da5782adec9449ce38d0d563 updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
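A minimal sketch of the explicit null termination described above. The helper name `xattr_value_to_string()` is hypothetical and is not taken from the fuse bridge code; it only shows how a raw (pointer, size) value can be turned into a C string before string functions are used on it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy `size` raw bytes into a new buffer and append
 * a terminating NUL. Returns NULL on allocation failure or size overflow.
 * The caller frees the result.
 */
static char *
xattr_value_to_string(const void *value, size_t size)
{
    char *str;

    assert(value != NULL || size == 0);

    if (size == SIZE_MAX)
        return NULL; /* size + 1 would overflow */

    str = malloc(size + 1);
    if (str == NULL)
        return NULL;

    if (size > 0)
        memcpy(str, value, size);
    str[size] = '\0'; /* the raw value carries no terminator of its own */

    return str;
}
```

Copying into a separate buffer avoids writing past the end of the caller's data, which may be exactly `size` bytes long.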
*	Bump up timeout for tests on AWS	Nigel Babu	2019-02-07	3	-3/+4
\| \| \| \| \| \|	Fixes: bz#1672727 Change-Id: I2b9be45f199f6436b858536c6f49be85902217f0 Signed-off-by: Nigel Babu <nigelb@redhat.com>
*	glusterd: Update op-version for release 7	ShyamsundarR	2019-02-05	1	-1/+3
\| \| \| \| \| \|	Change-Id: I0f3978d7e603e6e767dc7aa2a23ef35b1f2b43f7 Updates: bz#1193929 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	glusterd: get-state command should not fail if any brick is gone bad (tag: v7dev)	Sanju Rakonde	2019-02-05	2	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: get-state command will error out if any of the underlying brick(s) of volume(s) in the cluster go bad. It is expected that get-state command should not error out, but should generate an output successfully. Solution: In glusterd_get_state(), a statfs call is made on the brick path for every brick of the volumes to calculate the total and free memory available. If the statfs call fails on any brick, we should not error out and should report total memory and free memory of that brick as 0. This patch also handles a statfs failure scenario in glusterd_store_retrieve_bricks(). fixes: bz#1672205 Change-Id: Ia9e8a1d8843b65949d72fd6809bd21d39b31ad83 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	geo-rep: Fix configparser import issue	Kotresh HR	2019-02-05	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	'configparser' is backported to python2 and can be installed using pip (pip install configparser). So trying to import 'configparser' first and later 'ConfigParser' can cause issues w.r.t unicode strings. Always try importing 'ConfigParser' first and then 'configparser'. This solves python2/python3 compat issues. Change-Id: I2a87c3fc46476296b8cb547338f35723518751cc fixes: bz#1671637 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	cli: Added the group option for volume set	Rinku Kothiya	2019-02-04	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \|	gluster volume set <VOLUME> group <GROUP> is used for setting multiple pre-defined volume options, but this was undocumented. This patch documents this feature. fixes: bz#1243991 Change-Id: Id346cf2537f85179caff32479f09555ce2e72e76 Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
*	glusterd: manage upgrade to current master	Amar Tumballi	2019-02-04	6	-20/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Scenarios tested: * Upgrade the node when there are stripe / tiering and regular type of volumes are present. - All volumes are started fine (as the change was not on brick volfile) - For tier, the functionality may not even work, as changetimerecorder is not present. - 'gluster volume info' properly shows as 'NOT SUPPORTED' for stripe and tier type of volume. * Upgrade in a rolling upgrade scenario, where an old version is able to connect to higher master. - on a normal volume, if the volfile-server was new, the newer client volfiles needed to have utime xlator conditionally. - with this one change, all other changes seem to work fine. Change-Id: Ib2d3b69dafa02b2c695a735b13c1aa70aba07cb8 updates: bz#1635688 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	cluster/dht: Do not use gfid-req in fresh lookup	N Balachandran	2019-02-02	2	-8/+60
\| \| \| \| \| \| \| \| \| \| \| \|	Fuse sets a random gfid-req value for a fresh lookup. Posix lookup will set this gfid on entries with missing gfids causing a GFID mismatch for directories. DHT will now ignore the Fuse provided gfid-req and use the GFID returned from other subvols to heal the missing gfid. Change-Id: I5f541978808f246ba4542564251e341ec490db14 fixes: bz#1670259 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	glusterfind: python2 to python3 compat	Shwetha K Acharya	2019-02-02	9	-25/+94
\| \| \| \| \| \| \| \|	Made necessary modifications to ensure python3 compatibility. fixes: bz#1658116 Change-Id: I5cf1d0447eaf3c44eb444245d1f67aadd60705c3 Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
*	mount/fuse: expose auto-invalidation as a mount option	Raghavendra Gowdappa	2019-02-02	10	-10/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Auto invalidation is necessary when same (meta)data is shared/access across multiple mounts. However, if (meta)data is not shared, all relevant I/O goes through the cache of single mount and hence is coherent with (meta)data on bricks always. So, fuse-auto-invalidation can be disabled for this case which gives a huge performance boost for workloads that write data and then immediately read the data they just wrote. From glusterfs --help, <snip> --auto-invalidation[=BOOL] controls whether fuse-kernel can auto-invalidate attribute, dentry and page-cache. Disable this only if same files/directories are not accessed across two different mounts concurrently [default: "on"] </snip> Details on how disabling auto-invalidation helped to reduce pgbench init times can be found at [1]. Time taken for pgbench init of scale 8000 was 8340s. That will be an improvement of 86% (59280s vs 8340s) with auto-invalidations turned off along with other optimizations. Just disabling auto-invalidation contributed 56% improvement by reducing the total time taken by 33260s. [1] https://www.spinics.net/lists/gluster-devel/msg25907.html Change-Id: I0ed730dba9064bd9c576ad1800170a21e100e1ce Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com> updates: bz#1664934
*	core: make gf_thread_create() easier to use	Xavi Hernandez	2019-02-01	9	-82/+110
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch creates a specific function to set the thread name using a string format and a variable argument list, like printf(). This function is used to set the thread name from gf_thread_create(), which now accepts a variable argument list to create the full name. It's not necessary anymore to use a local array to build the name of the thread. This is done automatically. Change-Id: Idd8d01fd462c227359b96e98699f8c6d962dc17c Updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
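The printf()-style naming described above can be sketched with a variable argument list and `vsnprintf()`. The function `format_thread_name()` and the constant `THREAD_NAME_MAX` are hypothetical and only illustrate the technique; they are not the signatures added by this patch. The value 16 reflects the Linux limit on thread names (including the terminating NUL).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Assumption for the example: Linux thread names fit in 16 bytes with NUL. */
#define THREAD_NAME_MAX 16

/*
 * Build a thread name from a format string and arguments, like printf().
 * Returns the length the full name would have had; a value >= len means
 * the name stored in buf was truncated.
 */
static int
format_thread_name(char *buf, size_t len, const char *fmt, ...)
{
    va_list args;
    int ret;

    assert(buf != NULL && len > 0 && fmt != NULL);

    va_start(args, fmt);
    ret = vsnprintf(buf, len, fmt, args);
    va_end(args);

    return ret;
}
```

With a helper of this shape, a caller no longer needs a local array and a separate snprintf() call just to build a name such as "worker3" before creating the thread.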
*	cluster/thin-arbiter: Consider thin-arbiter before marking new entry changelog	Ashish Pandey	2019-02-01	5	-25/+103
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a fop to create an entry fails on one of the data bricks, we mark the pending changelog on the entry on the brick for which it was successful. This is done as part of post op phase to make sure that entry gets healed even if it gets renamed to some other path where its parent was not marked as bad. As it happens as part of post op, we should consider thin-arbiter to check if the brick, which was successful, is the good brick or not. This will avoid split brain and other issues. Change-Id: I12686675be98f02f70a5186b3ed748c541514d53 updates: bz#1662264 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
*	cluster/dht: Remove internal permission bits	N Balachandran	2019-02-01	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	Rebalance sets the sgid and t bits on a file that is being migrated. These permissions are not removed in dht_readdirp_cbk when listing files causing them to show up on the mountpoint. We now remove these permissions if a non-linkto file has the linkto xattr set. Change-Id: I5c69b2ecfe2df804fe50faea903b242d01729596 fixes: bz#1669937 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	core: move "dict is NULL" logs to DEBUG log level	Milind Changire	2019-02-01	1	-2/+2
\| \| \| \| \| \| \| \| \|	Too many logs get printed if dict_ref() and dict_unref() are passed NULL pointer. fixes: bz#1671213 Change-Id: I18afd849d64318f68baa7b549ee310dac0e1e786 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	syncop: remove unnecessary call to gf_backtrace_save()	Xavi Hernandez	2019-01-31	2	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \|	A call to gf_backtrace_save() was done on each context switch of a synctask. The backtrace is generated writing to the filesystem, so it can have an important impact on latency. The generated backtrace was not used anywhere, so it's been removed. Change-Id: I399a93b932c5b6e981c696c72c3e1ef44710ba52 Updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	feature/bitrot: Avoid thread creation if xlator is not enabled	Mohit Agrawal	2019-01-31	1	-8/+64
\| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Avoid thread creation for bitrot-stub for a volume if feature is not enabled Solution: Before thread creation check the flag if feature is enabled Updates: #475 Change-Id: I2c6cc35bba142d4b418cc986ada588e558512c8e Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	core: heketi-cli is throwing error "target is busy"	Mohit Agrawal	2019-01-31	5	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When rpc-transport-disconnect happens, server_connection_cleanup_flush_cbk() is supposed to call rpc_transport_unref() after open-files on that transport are flushed per transport. But open-fd-count is maintained in bound_xl->fd_count, which can be incremented/decremented cumulatively in server_connection_cleanup() by all transport disconnect paths. So instead of rpc_transport_unref() happening per transport, it ends up doing it only once after all the files on all the transports for the brick are flushed, leading to rpc-leaks. Solution: To avoid races, maintain fd_cnt at the client instead of maintaining it on the brick. Credits: Pranith Kumar Karampuri Change-Id: I6e8ea37a61f82d9aefb227c5b3ab57a7a36850e6 fixes: bz#1668190 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	readdir-ahead: do not zero-out iatt in fop cbk	Ravishankar N	2019-01-31	2	-20/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	...when ctime is zero. ia_type and ia_gfid always need to be non-zero for things to work correctly. Problem: Commit c9bde3021202f1d5c5a2d19ac05a510fc1f788ac zeroed out the iatt buffer in the cbks of modification fops before unwinding if the ctime in the buffer was zero. This was causing the fops to fail: noticeable when AFR's 'consistent-metadata' option was enabled. (AFR zeros out the ctime when the option is set. See commit 4c4624c9bad2edf27128cb122c64f15d7d63bbc8). Fixes: - Do not zero out the ia_type and ia_gfid of the iatt buff under any circumstance. - Also, fixed _rda_inode_ctx_update_iatts() to always update these values from the incoming buf when ctime is zero. Otherwise we end up with zero ia_type and ia_gfid the first time the function is called and the incoming buf has ctime set to zero. fixes: bz#1670253 Reported-By: Michael Hanselmann <public@hansmi.ch> Change-Id: Ib72228892d42c3513c19fc6dfb543f2aa3489eca Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	api: bad GFAPI_4.1.6 block	Kaleb S. KEITHLEY	2019-01-30	1	-2/+3
\| \| \| \| \| \| \| \|	missing global: line, tabs not spaces Change-Id: Icdbc23b4e4cd608da1d764e81757201c4b1269a6 updates: bz#1193929 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	features/sdfs: disable by default	Amar Tumballi	2019-01-29	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With the feature enabled, some of the performance testing results, especially those which create millions of small files, got approximately 4x regression compared to version before enabling this. On master without this patch: 765 creates/sec On master with this patch : 3380 creates/sec Also there seems to be regression caused by this in 'ls -l' workload. On master without this patch: 3030 files/sec On master with this patch : 16610 files/sec This is a feature added to handle multiple clients operating in parallel (especially those which race for file creates with same name) on a single namespace/directory. Considering that is < 3% of Gluster's usecase right now, it makes sense to disable the feature by default, so we don't penalize the default users who don't care about this usecase. Also note that the client side translators, especially distribute, replicate and disperse, already handle the issue in up to 99.5% of the cases without SDFS, so it makes sense to keep the feature disabled by default. Credits: Shyamsunder <srangana@redhat.com> for running the tests and getting the numbers. Change-Id: Iec49ce1d82e621e9db25eb633fcb1d932e74f4fc Updates: bz#1670031 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	Multiple files: reduce work while under lock.	Yaniv Kaul	2019-01-29	19	-152/+155
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Mostly, unlock before logging. In some cases, moved different code that was not needed to be under lock (for example, taking time, or malloc'ing) to be executed before taking the lock. Note: logging might be slightly less accurate in order, since it may not be done now under the lock, so order of logs is racy. I think it's a reasonable compromise. Compile-tested only! updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: I2438710016afc9f4f62a176ef1a0d3ed793b4f89
*	socket: fix issue on concurrent handle of a socket	Zhang Huan	2019-01-28	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Found an issue with concurrent invocation of the event handler on the same socket fd, causing memory corruption. This issue arises after applying commit "socket: Remove redundant in_lock in incoming message handling", which removes priv->in_lock that serialized socket reads. The following call sequence describes how concurrent socket event handling happens. thread 1: epoll_wait() returns (slot->in_handler is 0); thread 2: calls select_on_epoll() and epoll_ctl() on the fd; thread 3: epoll_wait() returns; thread 1: slot->in_handler++ (slot->in_handler is 1); thread 3: slot->in_handler++ (slot->in_handler is 2); thread 1 and thread 3: both call handler(). Fix this issue by skipping the invocation of the handler if there is already a handler in progress. Change-Id: I437126ac772debcadb00993a948919c931cd607b updates: bz#1467614 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
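The "skip if a handler is already in progress" rule can be sketched as below. This is not the patch itself: `struct slot` here is a reduced, hypothetical stand-in with only a lock and an `in_handler` counter, and `slot_try_enter()` / `slot_leave()` are names invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, reduced event slot: only the fields the example needs. */
struct slot {
    pthread_mutex_t lock;
    int in_handler; /* number of handlers currently running for this fd */
};

/*
 * Claim the slot for one handler. Returns false when another handler is
 * already in progress, in which case the caller must skip this event.
 */
static bool
slot_try_enter(struct slot *s)
{
    bool claimed = false;

    assert(s != NULL);

    pthread_mutex_lock(&s->lock);
    if (s->in_handler == 0) {
        s->in_handler = 1;
        claimed = true;
    }
    pthread_mutex_unlock(&s->lock);

    return claimed;
}

/* Release the slot once the handler has finished. */
static void
slot_leave(struct slot *s)
{
    assert(s != NULL);

    pthread_mutex_lock(&s->lock);
    s->in_handler = 0;
    pthread_mutex_unlock(&s->lock);
}
```

Because the check and the increment happen under the same lock, two threads that both returned from epoll_wait() for the same fd cannot both proceed to the handler.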
*	features/shard: Ref shard inode while adding to fsync list	Krutika Dhananjay	2019-01-24	2	-8/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PROBLEM: Lot of the earlier changes in the management of shards in lru, fsync lists assumed that if a given shard exists in fsync list, it must be part of lru list as well. This was found to be not true. Consider this - a file is FALLOCATE'd to a size which would make the number of participant shards to be greater than the lru list size. In this case, some of the resolved shards that are to participate in this fop will be evicted from lru list to give way to the rest of the shards. And once FALLOCATE completes, these shards are added to fsync list but without a ref. After the fop completes, these shard inodes are unref'd and destroyed while their inode ctxs are still part of fsync list. Now when an FSYNC is called on the base file and the fsync-list traversed, the client crashes due to illegal memory access. FIX: Hold a ref on the shard inode when adding to fsync list as well. And unref under following conditions: 1. when the shard is evicted from lru list 2. when the base file is fsync'd 3. when the shards are deleted. Change-Id: Iab460667d091b8388322f59b6cb27ce69299b1b2 fixes: bz#1669077 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	tests: run nfs tests only if --enable-gnfs is provided	Amar Tumballi	2019-01-24	59	-2/+139
\| \| \| \| \| \|	Fixes: bz#1665358 Change-Id: Idbf88ec3ac683733b32c313377eeb72f2819bf0d Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	afr/self-heal:Fix wrong type checking	Ravishankar N	2019-01-24	6	-33/+37
\| \| \| \| \| \| \| \| \| \|	gf_dirent struct has a d_type variable, which should be checked against DT_DIR instead of IA_IFDIR; alternatively, IA_IFDIR has to be compared with entry->d_stat.ia_type. Change-Id: Idf1059ce2a590734bc5b6adaad73604d9a708804 updates: bz#1653359 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	core: heketi-cli is throwing error "target is busy"	Mohit Agrawal	2019-01-24	3	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: At the time of deleting a block hosting volume through heketi-cli, it is throwing an error "target is busy". The cli is throwing an error because the brick is not detached successfully, and the brick is not detached due to a race condition while cleaning up the xprt associated with the detached brick. Solution: To avoid the xprt specific race condition, introduce an atomic flag on rpc_transport. Change-Id: Id4ff1fe8375a63be71fb3343f455190a1b8bb6d4 fixes: bz#1668190 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	tests/bug-brick-mux-restart: add extra information	Amar Tumballi	2019-01-24	1	-1/+12
\| \| \| \| \| \| \| \| \| \|	so that we can understand more about process memory and thread consumptions With this, we will also be able to understand more about the process details with brick-mux. updates: bz#1193929 Change-Id: I147a3e3814fc37dfb635217d0a0f0184fae0994f Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	core: move logs which are only developer relevant to DEBUG level	Amar Tumballi	2019-01-23	4	-6/+6
\| \| \| \| \| \| \| \| \| \| \|	We had only changed the log level to DEBUG in release branch earlier. But considering 90%+ of our deployments happen in same env, we can look at these specific logs on need basis. With this change, the master branch will be easier to debug with lesser logs. Change-Id: I4157a7ec7d5ec9c2948b2bbc1e4cb8317f28d6b8 Updates: bz#1666833 Signed-off-by: Amar Tumballi <amarts@redhat.com>