glusterfs.git

	Commit message	Author	Age	Files	Lines
*	mount/fuse: Fix graph-switch when reader-thread-count is set	Pranith Kumar K	2020-10-05	1	-0/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: The current graph-switch code sets priv->handle_graph_switch to false even when graph-switch is in progress which leads to crashes in some cases Fix: priv->handle_graph_switch should be set to false only when graph-switch completes. fixes: #1539 Change-Id: I5b04f7220a0a6e65c5f5afa3e28d1afe9efcdc31 Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
*	cluster/afr: Heal directory rename without rmdir/mkdir	Pranith Kumar K	2020-10-01	8	-46/+750
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem1: When a directory is renamed while a brick is down entry-heal always did an rm -rf on that directory on the sink on old location and did mkdir and created the directory hierarchy again in the new location. This is inefficient. Problem2: Renamedir heal order may lead to a scenario where directory in the new location could be created before deleting it from old location leading to 2 directories with same gfid in posix. Fix: As part of heal, if oldlocation is healed first and is not present in source-brick always rename it into a hidden directory inside the sink-brick so that when heal is triggered in new-location shd can rename it from this hidden directory to the new-location. If new-location heal is triggered first and it detects that the directory already exists in the brick, then it should skip healing the directory until it appears in the hidden directory. Credits: Ravi for rename-data-loss.t script Fixes: #1211 Change-Id: I0cba2006f35cd03d314d18211ce0bd530e254843 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	gfapi: Move the SECURE_ACCESS_FILE check out of glfs_mgmt_init	Môshe van der Sterre	2020-09-29	3	-0/+218
\| \| \| \| \| \| \| \|	glfs_mgmt_init is only called for glfs_set_volfile_server, but secure_mgmt is also required to use glfs_set_volfile with SSL. fixes: #829 Change-Id: Ibc769fe634d805e085232f85ce6e1c48bf4acc66
*	glusterd: Fix Add-brick with increasing replica count failure	Sheetal Pamecha	2020-09-24	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: add-brick operation fails with multiple bricks on same server error when replica count is increased. This was happening because of extra runs in a loop to compare hostnames and if bricks supplied were less than "replica" count, the bricks will get compared to itself resulting in above error. Fixes: #1508 Change-Id: I8668e964340b7bf59728bb838525d2db062197ed Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
*	fuse: fetch arbitrary number of groups from /proc/[pid]/status	Csaba Henk	2020-08-21	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Glusterfs so far constrained itself with an arbitrary limit (32) for the number of groups read from /proc/[pid]/status (this was the number of groups shown there prior to Linux commit v3.7-9553-g8d238027b87e (v3.8-rc1~74^2~59); since this commit, all groups are shown). With this change we'll read groups up to the number Glusterfs supports in general (64k). Note: the actual number of groups that are made use of in a regular Glusterfs setup shall still be capped at ~93 due to limitations of the RPC transport. To be able to handle more groups than that, brick side gid resolution (server.manage-gids option) can be used along with NIS, LDAP or other such networked directory service (see https://github.com/gluster/glusterdocs/blob/5ba15a2/docs/Administrator%20Guide/Handling-of-users-with-many-groups.md#limit-in-the-glusterfs-protocol ). Also adding some diagnostic messages to frame_fill_groups(). Change-Id: I271f3dc3e6d3c44d6d989c7a2073ea5f16c26ee0 fixes: #1075 Signed-off-by: Csaba Henk <csaba@redhat.com>
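The group lookup described above can be pictured with a small sketch. This is not the frame_fill_groups() code; it only assumes the documented `Groups:` line of /proc/[pid]/status, and the names `parse_groups`, `read_groups` and `MAX_GROUPS` are hypothetical stand-ins (the cap value simply mirrors the "64k" figure in the message).

```python
# Hypothetical cap standing in for the "64k" general limit mentioned above.
MAX_GROUPS = 64 * 1024


def parse_groups(status_text, max_groups=MAX_GROUPS):
    """Return the supplementary gids listed on the 'Groups:' line.

    status_text is the content of /proc/<pid>/status. At most
    max_groups entries are returned; an absent line yields [].
    """
    for line in status_text.splitlines():
        if line.startswith("Groups:"):
            gids = [int(tok) for tok in line[len("Groups:"):].split()]
            return gids[:max_groups]
    return []


def read_groups(pid):
    """Read /proc/<pid>/status (Linux only) and parse its group list."""
    with open(f"/proc/{pid}/status") as f:
        return parse_groups(f.read())
```

Reading the whole line and truncating afterwards keeps the cap a policy decision rather than a parsing limit, which is the spirit of the change; the ~93-group RPC limit mentioned above would still apply further down the stack.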
*	metadisp: new translator for data and metadata separation	Sheena Artrip	2020-08-21	6	-0/+641
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: feature/metadisp is an xlator for performing "metadata dispersal" across multiple children. it does this by flattening the complex POSIX paths into /$GFID style paths, then forwarding the metadata operations to its first child and forwarding the data operations to its second child. The purpose of this xlator is to allow separation of data and metadata, in cases where metadata might be stored in another format (embedded kv?), on another disk (ssd), on another host (dht2). Change-Id: I392c8bd0c867a3237d144aea327323f700a2728d Updates: #816 Signed-Off-By: Sheena Artrip <sheenobu@fb.com> Tested-By: Amar Tumballi <amar@kadalu.io>
*	tests: provide an option to mark tests as 'flaky'	Amar Tumballi	2020-08-20	14	-36/+35
\| \| \| \| \| \| \| \| \| \| \| \| \|	* also add some time gap in other tests to see if we get things properly * create a directory 'tests/000/', which can host any tests, which are flaky. * move all the tests mentioned in the issue to above directory. * as the above dir gets tested first, all flaky tests would be reported quickly. * change `run-tests.sh` to continue tests even if flaky tests fail. Reference: gluster/project-infrastructure#72 Updates: #1000 Change-Id: Ifdafa38d083ebd80f7ae3cbbc9aa3b68b6d21d0e Signed-off-by: Amar Tumballi <amar@kadalu.io>
*	features/shard: optimization over shard lookup in case of prealloc	Vinayakswami Hariharmath	2020-08-20	1	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Assume that we are preallocating a VM of size 1TB with a shard block size of 64MB; there will then be ~16k shards. Creation happens in 2 steps in the shard_fallocate() path: 1. look up any shards that are already present, and 2. mknod the shards that do not exist. But in the case of fresh creation, we don't have to look up the shards, since none are present and the file size will be 0. This way we save the lookups on all shards that are not present. This optimization is quite useful when preallocating a big VM. Also, if the file is already present and the call is to extend it to a bigger size, then we need not look up non-existent shards. Just look up the preexisting shards, populate the inodes, and issue mknod for the extended size. Fixes: #1425 Change-Id: I60036fe8302c696e0ca80ff11ab0ef5bcdbd7880 Signed-off-by: Vinayakswami Hariharmath <vharihar@redhat.com>
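The arithmetic behind the "~16k shards" figure, and the split between shards that need a lookup and shards that only need a mknod, can be sketched as follows. The function `shard_plan` is hypothetical and ignores details of the real shard translator (such as how the base file is treated); it assumes binary units and `new_size >= old_size`.

```python
def shard_plan(old_size, new_size, block_size):
    """Split shard indices into (to_lookup, to_create) ranges.

    Shards covering the current file size may already exist and are
    looked up; shards beyond it cannot exist yet and are only created.
    Assumes new_size >= old_size.
    """
    def count(size):
        # Ceiling division: shards needed to cover `size` bytes.
        return (size + block_size - 1) // block_size

    existing = count(old_size)
    total = max(count(new_size), existing)
    return range(0, existing), range(existing, total)


TIB = 1 << 40
MIB = 1 << 20

# Fresh preallocation: file size is 0, so nothing needs a lookup.
lookup, create = shard_plan(0, 1 * TIB, 64 * MIB)
print(len(lookup), len(create))
```

For a fresh 1TB file with 64MB blocks this gives 0 lookups and 16384 creations, which matches the "~16k" estimate in the message.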
*	afr/split-brain: fix client side split-brain resolution when quorum is enabled	Mohammed Rafi KC	2020-08-13	1	-0/+124
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: If we set favourite child policy, then automatic split-brain resolution should work in all cases. This was failing when quorum count was set to a non-zero value. The initial lookup before the read txn was failing with ENOTCONN. Since we don't have a readable subvol, we were failing it. We were only looking to the split brain resolution choice set through the cli command. Fix: We will now consider the favourite child policy if split-brain choice has not been set via cli command. Change-Id: Id2016c3a90d0763ac6f1a0131571053f595576f0 Fixes: #1404 Signed-off-by: Mohammed Rafi KC <rafi.kavungal@iternity.com>
*	tests: Fix regression failures of 01-georep-glusterd-tests.t	Shwetha K Acharya	2020-08-03	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	TEST $GEOREP_CLI $master $slave1 create push-pem force times out on Centos 7 builders. Increasing the GEO_REP_TIMEOUT and SCRIPT_TIMEOUT to address the same. Fixes: #1410 Change-Id: I81b5590e33f40ea4210cc56d18e2b9fa34033cd8 Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
*	test: ./tests/features/ssl-ciphers.t fail on centos 8	Mohit Agrawal	2020-07-31	1	-2/+9
\| \| \| \| \| \| \| \| \| \|	Check the tlsv1 openssl connection based on openssl version. If openssl version is 1.1 it supports tls1 protocol otherwise it supports tlsv1_2 protocol. Fixes: #1403 Change-Id: I3ca286492049e6f84de70e3b969fa41db10378ab Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	glusterd: getspec() returns wrong response when volfile not found	Tamar Shacked	2020-07-23	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In a cluster env: getspec() detects that the volfile is not found, but further on this return code is overwritten by another call, so the error is lost and not handled. As a result the server responds with an ambiguous message: {op_ret = -1, op_errno = 0..}, which causes the client to get stuck. Fix: server side: don't override the failure error. fixes: #1375 Change-Id: Id394954d4d0746570c1ee7d98969649c305c6b0d Signed-off-by: Tamar Shacked <tshacked@redhat.com>
*	dht - fixing xattr inconsistency	Barak Sason Rofman	2020-07-22	1	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The scenario of setting an xattr to a dir, killing one of the bricks, removing the xattr, bringing back the brick results in xattr inconsistency - The downed brick will still have the xattr, but the rest won't. This patch add a mechanism that will remove the extra xattrs during lookup. This patch is a modification to a previous patch based on comments that were made after merge: https://review.gluster.org/#/c/glusterfs/+/24613/ fixes: #1324 Change-Id: Ifec0b7aea6cd40daa8b0319b881191cf83e031d1 Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
*	tests/features/interrupt.t: fixes	Csaba Henk	2020-07-19	1	-5/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Modify the patterns for which we grep the logs so that they don't match themselves. The test runner inserts the invocation of the cases to the log, thus the patterns will occur in the logs verbatim. So if the pattern matches itself, the test case will be moot (always reporting success). - Invoke the test utility (open-and-sleep) on unique paths so that the file at the passed path shall be created on each invocation. The kernel does not send an interrupt if the file is extant. (This was shadowed by the above mistake with result evaluation.) - Modify the pattern for which we grep the log in the test case where interrupt handling is expected so that it asserts that the interrupt was handled. (So far we did not exclude the possibility of the interrupt triggered but not handled due to a race; however, it seems to be the case that this theoretical race does not have the potential to prevent interrupt handling. And if this ever changes in the future we'd rather be notified about that.) Change-Id: I606da2b4064c1ecc4781c7dfdefed95a433478ce Updates: #1374 Signed-off-by: Csaba Henk <csaba@redhat.com>
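The self-matching problem in the first point can be shown with a short sketch. The patterns and log lines below are invented for illustration and are not the ones used in interrupt.t; the bracket trick shown is one common way to write a pattern that does not match its own text, not necessarily the one the test adopted.

```python
import re


def count_matches(pattern, log_lines):
    """Count the log lines in which the regular expression occurs."""
    return sum(1 for line in log_lines if re.search(pattern, line))


# Hypothetical patterns: the second one wraps a single character in a
# bracket expression, so the pattern text no longer contains a match.
NAIVE = "handled interrupt"
SAFE = "handled [i]nterrupt"


def log_with_invocation(pattern, real_lines):
    """Model a log into which the test runner echoes the grep command."""
    return [f"TEST grep -q '{pattern}' logfile"] + list(real_lines)
```

With an otherwise empty log, `count_matches(NAIVE, log_with_invocation(NAIVE, []))` is non-zero because the echoed invocation matches, while the same call with `SAFE` finds nothing until a genuine log line appears.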
*	dht: Heal missing dir entry on brick in revalidate path	Susant Palai	2020-07-09	1	-0/+33
\| \| \| \| \| \| \| \| \|	Mark dir as missing in layout structure to be healed in dht_selfheal_directory. fixes: #1327 Change-Id: If2c69294bd8107c26624cfe220f008bc3b952a4e Signed-off-by: Susant Palai <spalai@redhat.com>
*	tests: added volume operations to increase code coverage	nik-redhat	2020-07-06	5	-14/+144
\| \| \| \| \| \| \| \| \| \| \| \|	Added test for volume options like localtime-logging, fixed enable-shared-storage to include function coverage and few negative tests for other volume options to increase the code coverage in the glusterd component. Change-Id: Ib1706c1fd5bc98a64dcb5c8b15a121d639a597d7 Updates: #1052 Signed-off-by: nik-redhat <nladha@redhat.com>
*	Revert "dht - fixing xattr inconsistency"	Barak Sason Rofman	2020-06-25	1	-54/+0
\| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 620158475f462251c996901a8e24306ef6cb4c42. The patch to revert is https://review.gluster.org/#/c/glusterfs/+/24613/ Reverting is required as comments regarding a more efficient implementation were made after the patch was merged. A new patch will be posted to address the comments. updates: #1324 Change-Id: I59205baefe1cada033c736d41ce9c51b21727d3f Signed-off-by: Barak Sason Rofman <redhat@gmail.com>
*	dht - fixing xattr inconsistency	Barak Sason Rofman	2020-06-25	1	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \|	The scenario of setting an xattr to a dir, killing one of the bricks, removing the xattr, bringing back the brick results in xattr inconsistency - The downed brick will still have the xattr, but the rest won't. This patch add a mechanism that will remove the extra xattrs during lookup. fixes: #1324 Change-Id: Ibcc449bad6c7cb46bcae380e42e4496d733b453d Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
*	glusterd: add-brick command failure	Sanju Rakonde	2020-06-21	2	-3/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: add-brick operation is failing when replica or disperse count is not mentioned in the add-brick command. Reason: with commit a113d93 we are checking brick order while doing add-brick operation for replica and disperse volumes. If replica count or disperse count is not mentioned in the command, the dict get is failing and resulting add-brick operation failure. fixes: #1306 Change-Id: Ie957540e303bfb5f2d69015661a60d7e72557353 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	mount/fuse: use cookies to get fuse-interrupt-record instead of xdata	Pranith Kumar K	2020-06-18	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: On executing tests/features/flock_interrupt.t the following error log appears [2020-06-16 11:51:54.631072 +0000] E [fuse-bridge.c:4791:fuse_setlk_interrupt_handler_cbk] 0-glusterfs-fuse: interrupt record not found This happens because fuse-interrupt-record is never sent on the wire by getxattr fop and there is no guarantee that in the cbk it will be available in case of failures. Fix: wind getxattr fop with fuse-interrupt-record as cookie and recover it in the cbk Fixes: #1310 Change-Id: I4cfff154321a449114fc26e9440db0f08e5c7daa Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	tests/glusterd: spurious failure of tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t	Sanju Rakonde	2020-06-17	1	-3/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Test Summary Report ------------------- tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t (Wstat: 0 Tests: 23 Failed: 3) Failed tests: 21-23 After glusterd restart, volume start is failing. It looks like it needs some time to sync the data. Adding sleep for the same. Note: All other changes are made to avoid spurious failures in the future. fixes: #1272 Change-Id: Ib184757fb936e03b5b6208465e44a8e790b71c1c Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	afr: more quorum checks in lookup and new entry marking	Ravishankar N	2020-06-16	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: See github issue for details. Fix: -In lookup if the entry exists in 2 out of 3 bricks, don't fail the lookup with ENOENT just because there is an entrylk on the parent. Consider quorum before deciding. -If entry FOP does not succeed on quorum no. of bricks, do not perform new entry mark. Fixes: #1303 Change-Id: I56df8c89ad53b29fa450c7930a7b7ccec9f4a6c5 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	Indicate timezone offsets in timestamps	Csaba Henk	2020-06-15	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Logs and other output carrying timestamps will now have timezone offsets indicated, e.g.: [2020-03-12 07:01:05.584482 +0000] I [MSGID: 106143] [glusterd-pmap.c:388:pmap_registry_remove] 0-pmap: removing brick (null) on port 49153 To this end, - gf_time_fmt() now inserts timezone offset via %z strftime(3) template. - A new utility function has been added, gf_time_fmt_tv(), that takes a struct timeval pointer (tv) instead of a time_t value to specify the time. If tv->tv_usec is negative, gf_time_fmt_tv(... tv ...) is equivalent to gf_time_fmt(... tv->tv_sec ...) Otherwise it also inserts tv->tv_usec to the formatted string. - Building timestamps of usec precision has been converted to gf_time_fmt_tv, which is necessary because the method of appending a period and the usec value to the end of the timestamp does not work if the timestamp has zone offset, but it's also beneficial in terms of eliminating repetition. - The buffer passed to gf_time_fmt/gf_time_fmt_tv has been unified to be of GF_TIMESTR_SIZE size (256). We need slightly larger buffer space to accommodate the zone offset and it's preferable to use a buffer which is undisputedly large enough. This change does not do the following: - Retaining a method of timestamp creation without timezone offset. As to my understanding we don't need such backward compatibility as the code just emits timestamps to logs and other diagnostic texts, and doesn't do any later processing on them that would rely on their format. An exception to this, ie. a case where timestamp is built for internal use, is graph.c:fill_uuid(). As far as I can see, what matters in that case is the uniqueness of the produced string, not the format. - Implementing a single-token (space free) timestamp format.
While some timestamp formats used to be single-token, now all of them will include a space preceding the offset indicator. Again, I did not see a use case where this could be significant in terms of representation. - Moving the codebase to a single unified timestamp format and dropping the fmt argument of gf_time_fmt/gf_time_fmt_tv. While the gf_timefmt_FT format is almost ubiquitous, there are a few cases where different formats are used. I'm not convinced there is any reason to not use gf_timefmt_FT in those cases too, but I did not want to make a decision in this regard. Change-Id: I0af73ab5d490cca7ed8d07a2ce7ac22a6df2920a Updates: #837 Signed-off-by: Csaba Henk <csaba@redhat.com>
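The formatting rule described for gf_time_fmt_tv() can be mimicked in a few lines. This is a sketch of the behaviour as stated in the message, not the C implementation; `format_timestamp` and its parameters are hypothetical, and it uses Python's `datetime` rather than strftime(3) directly.

```python
from datetime import datetime, timedelta, timezone


def format_timestamp(sec, usec=-1, tz=timezone.utc):
    """Format seconds since the epoch with a trailing zone offset.

    A negative usec means "no sub-second part", mirroring the rule
    stated above. The microseconds go between the seconds and the
    offset, which is why they cannot simply be appended at the end.
    """
    dt = datetime.fromtimestamp(sec, tz=tz)
    text = dt.strftime("%Y-%m-%d %H:%M:%S")
    if usec >= 0:
        text += f".{usec:06d}"
    return text + dt.strftime(" %z")


# The example timestamp from the log line quoted above.
print(format_timestamp(1583996465, 584482))

# Same instant rendered with a +05:30 offset and no microseconds.
print(format_timestamp(1583996465, tz=timezone(timedelta(hours=5, minutes=30))))
```

Note that the result always contains a space before the offset, which is the "no single-token format" trade-off the message mentions.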
*	features/shard: Use fd lookup post file open	Vinayakswami Hariharmath	2020-06-11	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue: When a process has the open fd and the same file is unlinked in middle of the operations, then file based lookup fails with ENOENT or stale file Solution: When the file already open and fd is available, use fstat to get the file attributes Change-Id: I0e83aee9f11b616dcfe13769ebfcda6742e4e0f4 Fixes: #1281 Signed-off-by: Vinayakswami Hariharmath <vharihar@redhat.com>
*	test: Test case brick-mux-validation-in-cluster.t is failing on RHEL-8	Mohit Agrawal	2020-06-09	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Brick processes are not properly attached on a cluster node when some volume options are changed on a peer node while glusterd is down on that specific node. Solution: When glusterd restarts, it gets a friend update request from a peer node if the peer node has changes for a volume. If the brick process is started before the friend update request is received, brick_mux does not behave properly: all bricks are attached to the same process even though the volume options are not the same. To avoid the issue, introduce an atomic flag volpeerupdate and update its value when glusterd has received a friend update request from a peer for a specific volume. If the volpeerupdate flag is 1, the volume is started by the glusterd_import_friend_volume synctask. Change-Id: I4c026f1e7807ded249153670e6967a2be8d22cb7 Credit: Sanju Rakonde <srakonde@redhat.com> fixes: #1290 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	cluster/afr: Delay post-op for fsync	Pranith Kumar K	2020-06-08	3	-0/+175
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: AFR doesn't delay post-op for fsync fop. For fsync heavy workloads this leads to un-necessary fxattrop/finodelk for every fsync leading to bad performance. Fix: Have delayed post-op for fsync. Add special flag in xdata to indicate that afr shouldn't delay post-op in cases where either the process will terminate or graph-switch would happen. Otherwise it leads to un-necessary heals when the graph-switch/process-termination happens before delayed-post-op completes. Fixes: #1253 Change-Id: I531940d13269a111c49e0510d49514dc169f4577 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	cluster/afr: Prioritize ENOSPC over other errors	karthik-us	2020-06-05	1	-0/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In a replicate/arbiter volume, if file creations or writes fail on a quorum number of bricks, and on one brick the failure is due to ENOSPC while on another brick it fails for a different reason, the fop may fail with an error other than ENOSPC in some cases. Fix: Prioritize ENOSPC over other lesser priority errors and do not set op_errno in posix_gfid_set if op_ret is 0 to avoid receiving any error_no which can be misinterpreted by __afr_dir_write_finalize(). Also removing the function afr_has_arbiter_fop_cbk_quorum() which might consider a successful reply from a single brick as quorum success in some cases, whereas we always need the fop to be successful on a quorum number of bricks in an arbiter configuration. Change-Id: I106e267f8b9451f681022f1cccb410d9bc824c08 Fixes: #1254 Signed-off-by: karthik-us <ksubrahm@redhat.com>
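The idea of preferring one error over others when bricks fail for different reasons can be sketched like this. The function `pick_errno` and the `PRIORITY` tuple are hypothetical; the real AFR code has its own ordering of errors, which is not reproduced here.

```python
import errno

# Hypothetical priority list for this sketch: earlier entries win.
PRIORITY = (errno.ENOSPC,)


def pick_errno(brick_errnos):
    """Choose the single errno to report for a failed fop.

    brick_errnos holds one value per brick, 0 meaning success.
    Returns 0 if no brick failed, a prioritized errno if any brick
    reported one, and otherwise the first failure seen.
    """
    failures = [e for e in brick_errnos if e != 0]
    if not failures:
        return 0
    for preferred in PRIORITY:
        if preferred in failures:
            return preferred
    return failures[0]
```

So a reply set such as `[errno.EIO, errno.ENOSPC, 0]` would be reported as ENOSPC regardless of which brick answered first, which is the user-visible effect the fix is after.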
*	open-behind: rewrite of internal logic	Xavi Hernandez	2020-06-04	5	-0/+872
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There was a critical flaw in the previous implementation of open-behind. When an open is done in the background, it's necessary to take a reference on the fd_t object because once we "fake" the open answer, the fd could be destroyed. However as long as there's a reference, the release function won't be called. So, if the application closes the file descriptor without having actually opened it, there will always remain at least 1 reference, causing a leak. To avoid this problem, the previous implementation didn't take a reference on the fd_t, so there were races where the fd could be destroyed while it was still in use. To fix this, I've implemented a new xlator cbk that gets called from fuse when the application closes a file descriptor. The whole logic of handling background opens have been simplified and it's more efficient now. Only if the fop needs to be delayed until an open completes, a stub is created. Otherwise no memory allocations are needed. Correctly handling the close request while the open is still pending has added a bit of complexity, but overall normal operation is simpler. Change-Id: I6376a5491368e0e1c283cc452849032636261592 Fixes: #1225 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	io-cache,quick-read: deprecate volume options with flawed semantics or naming	Csaba Henk	2020-06-02	4	-7/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- performance.cache-size has a flawed semantics, as it's dispatched on two independent translators, io-cache and quick-read. - performance.qr-cache-timeout has a confusing name, as other options affecting quick-read have an unabbreviated "quick-read-..." prefix in their names. We keep these options with unchanged operation, but in the help output we indicate their deprecation. The following better alternatives are introduced: - performance.io-cache-size to tune cache-size option of io-cache - performance.quick-read-cache-size to tune cache-size option of quick-read - performance.quick-read-cache-timeout as a preferred synonym for performance.qr-cache-timeout Fixes: #952 Change-Id: Ibd04fb638de8cac450ba992ad8a415154f9f4281 Signed-off-by: Csaba Henk <csaba@redhat.com>
*	dht - sparse files rebalance enhancements	Barak Sason Rofman	2020-06-01	1	-0/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently data migration in rebalance reads sparse file sequentially, disregarding which segments are holes and which are data. This can lead to extremely long migration time for large sparse file. Data migration mechanism needs to be enhanced so only data segments are read and migrated. This can be achieved using lseek to seek for holes and data in the file. This enhancement is a consequence of https://bugzilla.redhat.com/show_bug.cgi?id=1823703 fixes: #1222 Change-Id: If5f448a0c532926464e1f34f504c5c94749b08c3 Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
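The lseek-based approach mentioned above relies on the SEEK_DATA and SEEK_HOLE whence values. A minimal sketch of walking only the data regions of a file follows; `data_segments` is a hypothetical helper and not the rebalance code. It assumes a platform where Python exposes `os.SEEK_DATA`/`os.SEEK_HOLE` (such as Linux); on filesystems without hole support the kernel may simply report the whole file as one data region.

```python
import errno
import os


def data_segments(fd):
    """Yield (offset, length) for each data region of an open file."""
    end = os.fstat(fd).st_size
    offset = 0
    while offset < end:
        try:
            # Next offset >= `offset` that contains data.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:
                break  # only a hole remains up to end of file
            raise
        # Next hole after `start`; end of file counts as a hole.
        hole = os.lseek(fd, start, os.SEEK_HOLE)
        yield start, hole - start
        offset = hole
```

A migration loop would then read and write only the yielded ranges and recreate the holes on the destination (for example by truncating it to the full size), instead of reading every byte sequentially.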
*	features/shard: Aggregate file size, block-count before unwinding removexattr	Krutika Dhananjay	2020-05-26	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \|	Posix translator returns pre and postbufs in the dict in {F}REMOVEXATTR fops. These iatts are further cached at layers like md-cache. Shard translator, in its current state, simply returns these values without updating the aggregated file size and block-count. This patch fixes this problem. Change-Id: I4b2dd41ede472c5829af80a67401ec5a6376d872 Fixes: #1243 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	features/shard: Aggregate size, block-count in iatt before unwinding setxattr	Krutika Dhananjay	2020-05-21	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \|	Posix translator returns pre and postbufs in the dict in {F}SETXATTR fops. These iatts are further cached at layers like md-cache. Shard translator, in its current state, simply returns these values without updating the aggregated file size and block-count. This patch fixes this problem. Change-Id: I4da0eceb4235b91546df79270bcc0af8cd64e9ea Fixes: #1243 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	mgmt/glusterd: Stop old shd before increasing replica count	Pranith Kumar K	2020-05-16	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In add-brick that increases replica count SHD was restarted after pending xattrs are set on the new bricks and adding bricks. But before restarting SHD there is a possibility that old SHD would do a scan on root-directory see no heal is needed and delete index for root-dir leading to no heals until lookup is executed on the mount Fix: Stop shd, perform pending-xattr setting/adding new bricks and then restart shd Fixes: #1240 Change-Id: I94fd7c6c909211b597185dfe097a559db6c0d00f Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	tests: Disable client-heals	Pranith Kumar K	2020-05-15	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: ok 32 [ 11/ 9] < 46> 'gf_rm_file_and_gfid_link /d/backends/patchy0 del-file' not ok 33 [ 13/ 131] < 48> '! dd if=/dev/zero of=/mnt/glusterfs/0/del-file bs=1M count=1 oflag=direct' -> '' The assumption in the test above is that the file wouldn't exist when dd happens. But heal can lead to creation of the file in some cases leading to spurious failures. Fix: Disable client side heal. Fixes: #1245 Change-Id: I96b2b45528f9dfb3199d503a467cafafba9b387f Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	Fix ./tests/basic/fencing/afr-lock-heal-basic.t failure	Pranith Kumar K	2020-05-13	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In brick-mux tests, all bricks of the volume have same pid. "generate_brick_statedump" cleans up the older statedumps with same brick pid. So successive calls to this function will delete previous brick's statedump as all bricks share same pid. So grep calls to the statedump were failing leading to failure of the .t To fix this, stored the result we need from statedump before calling next brick's statedump Fixes: #1234 Change-Id: I824ed4dff79e7242b3e980364836b9af0e87a6ee Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	tests: skip tests on absence of reflink in xfs	Pranith Kumar K	2020-05-06	1	-7/+9
\| \| \| \| \| \|	Fixes: #1223 Change-Id: I36cb72d920ffd77405051546615c5262c392daef Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	test: improve geo-rep non root test	Sunny Kumar	2020-05-02	2	-9/+23
\| \| \| \| \| \| \| \| \|	Make sure bricks are up before mounting volume. Also made sure that mount is available before creating test data. Change-Id: I4915b837df46e43be5678dac8ae5602021c52685 Updates: #1197 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
*	tests: fix georep-upgrade.t failure	Sunny Kumar	2020-04-29	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes georep-upgrade.t test failure. Problem: `TEST upgrade_script=$(find / -type f -name glusterfs-georep-upgrade.py)` Multiple files with the same name can exist for temp fix picking only the 1st result. The proper fix should be finding a proper place for this upgrade script and use that. Change-Id: I8b388e30a30bc4a9a2f392bed42ceee7e8bc250a Updates: #1209 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
*	tests: Fix bug-1101647.t test case failure	karthik-us	2020-04-29	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: tests/bugs/replicate/bug-1101647.t test case fails sporadically in the volume heal since connection to the bricks with shd was not being checked before running the index heal. Build link: https://build.gluster.org/job/regression-test-burn-in/5007/ Fix: Check for the connection status of the bricks with shd before performing the index heal. Change-Id: Ie7060f379b63bef39fd4f9804f6e22e0a25680c1 Updates: #1154 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	extras: upgrade script for geo-rep	Shwetha K Acharya	2020-04-27	1	-0/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The patch https://review.gluster.org/#/c/glusterfs/+/23733/ (which optimizes the changelog) introduces a change in the directory structure above the changelog files. Thus, before upgrade, the old version should be updated to match the changes made by the above-quoted patch. This upgrade script 1) creates a temp htime file with updated paths from the htime file, then installs the temp htime file as the htime file; 2) places the changelog files under the required directory structure. Updates: #154 Change-Id: I4b5a6cb9a9266a65972b419b329bc958de8fdf8a Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
*	afr: event gen changes	Ravishankar N	2020-04-24	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The general idea of the changes is to prevent resetting event generation to zero in the inode ctx, since event gen is something that should follow 'causal order'. Change #1: For a read txn, in inode refresh cbk, if event_generation is found zero, we are failing the read fop. This is not needed because change in event gen is only a marker for the next inode refresh to happen and should not be taken into account by the current read txn. Change #2: The event gen being zero above can happen if there is a racing lookup, which resets event gen (in afr_lookup_done) if there are non zero afr xattrs. The resetting is done only to trigger an inode refresh and a possible client side heal on the next lookup. That can be achieved by setting the need_refresh flag in the inode ctx. So replaced all occurrences of resetting event gen to zero with a call to afr_inode_need_refresh_set(). Change #3: In both lookup and discover path, we are doing an inode refresh which is not required since all 3 essentially do the same thing- update the inode ctx with the good/bad copies from the brick replies. Inode refresh also triggers background heals, but I think it is okay to do it when we call refresh during the read and write txns and not in the lookup path. The .ts which relied on inode refresh in lookup path to trigger heals are now changed to do read txn so that inode refresh and the heal happens. Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec Fixes: #1179 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Erik Jacobson <erik.jacobson at hpe.com>
*	tests: Fix spurious failure of tests/basic/quick-read-with-upcall.t	Pranith Kumar K	2020-04-22	1	-10/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: The test is failing at 14:56:41 ok 13, LINENUM:38 14:56:41 not ok 14 Got "test-message0" instead of "test-message1", LINENUM:41 14:56:41 FAILED COMMAND: test-message1 cat /mnt/glusterfs/1/test.txt This happens because fuse sometimes doesn't send 'read' fop to glusterfs and is served from cache. Fix: Mount with direct-io-mode=yes so that read is always received by gluster Fixes: #1190 Change-Id: I369e2024a85dc492dc24c7579b161fb965f55d19 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	tests: Fix for spurious failure for some test cases	Mohit Agrawal	2020-04-16	5	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Sometimes test case is failing at the time of creating files on mount point after mounting the volume Solution: After started the volume need to wait to make sure all bricks instances are completely started so put a online_brick_count check after just started the volume Change-Id: I5020e7e417539377277ca00189f9c51d2cf877a6 Fixes: #1162 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	tests: do not truncate file offsets and sizes to 32-bit	Dmitry Antipov	2020-04-15	4	-9/+13
\| \| \| \| \| \| \| \| \| \|	Do not truncate file offsets and sizes to 32-bit to prevent tests from spurious failures on >2Gb files. Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Change-Id: I2a77ea5f9f415249b23035eecf07129f19194ac2 Fixes: #1161
*	tests: Fix spurious failure of worm.t	Rinku Kothiya	2020-04-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	When the output of date command is a single digit number it is preceded by zero which is getting considered as an octal number. Removing the leading zero from the number solved the problem. Fixes: #1156 Change-Id: Iac4fa20607c0bb90d94dd8ff157ef6b60932c560 Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
*	test: Fix test "bug-1064148" to pass in mux	Barak Sason Rofman	2020-04-13	1	-10/+11
\| \| \| \| \| \| \| \|	Parts of the test weren't designed to run in mux mode, this is now fixed Change-Id: I428c2fcce6d047e324ca5dcaef677ee1794e3dfe updates: #1154 Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
*	tests: Fix spurious failure of tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t	Mohit Agrawal	2020-04-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Sometimes volume status fails after restarting glusterd on one cluster node. Solution: Wait for the glusterd handshake to finish on the down cluster node. Change-Id: Ib23ca41c943caf2903c61ebf42dc437c1b9d6054 Fixes: #1158 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	Adding basic glusterfind test case	Shwetha K Acharya	2020-04-09	2	-1/+85
\| \| \| \| \| \| \| \|	This test case includes all the basic glusterfind scenarios. fixes: #1044 Change-Id: I6021443729e35769fe855c5cc41bb3fbc6365ef0 Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
*	posix: Avoid dict_del logs in posix_is_layout_stale while key is NULL	Mohit Agrawal	2020-04-09	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Problem: The key "GF_PREOP_PARENT_KEY" is populated by dht; for a non-distribute volume such as 1x3 the key is not populated, so posix_is_layout_stale logs a message whenever a file is created. Solution: To avoid the log, check a condition before deleting the key. Change-Id: I813ee7960633e7f9f5e9ad2f42f288053d9eb71f Fixes: #1150 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	mount/fuse: Wait for 'mount' child to exit before dying	Pranith Kumar K	2020-04-09	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: tests/bugs/protocol/bug-1433815-auth-allow.t fails sometimes because of stale mount. This stale mount comes into picture when parent process dies without waiting for the child process which mounts fuse fs to die Fix: Wait for mounting child process to die before dying. Fixes: #1152 Change-Id: I8baee8720e88614fdb762ea822d5877973eef8dc Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>