glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	ctr: skip ctr xlator init if ctr is not enabled	Mohit Agrawal	2018-08-27	1	-6/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Problem: If ctr xlator is not required it consumes resources unnecessarily Solution: Call ctr xlator init only while feature is enabled Fixes: bz#1524323 Change-Id: I378113a390a286be20c4ade1b1bac170a8ef1b14 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	tests: Preserve tarball of tests when they timeout	ShyamsundarR	2018-08-27	1	-18/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When tests timeout, the timeout command sends TERM signal to the command being executed. In the case of run-tests.sh it invokes prove, which further invokes perl and finally the test is run using bash. The TERM signal does not seem to be reachnig the end bash that is actually executing the tests, and hence when any test is terminated due to a timeout, the cleanup routine in include.rc does not get a chance to run and preserve the tarball. Further, cleanup invokes tarball generation, but is invoked at the beginning and end of every test, and at times in beteween as well. This caused way too many tarballs in case we decide to preserve the same whenever generated by cleanup. This patch hence moves the tarball generation to run-tests.sh instead, and further stores them named <test>-iteration-<n>.tar and also prints tarball name generated and stored per iteration. This should help relate failed runs to the tarball iteration # and to look at relevant logs. Further the patch also provides a -p option to run-tests.sh for unit testing purposes, where running a test in a loop without the option will generate as many tarballs, and using the option will reduce this to preserving the last tarball, saving space in smaller unit test setups. Fixes: bz#1614062 Change-Id: I0aee76c89df0691cf4d0c1fcd4c04dffe0d7c896 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	features/snapview-server: validate the fs instance before doing fop there	Raghavendra Bhat	2018-08-24	1	-0/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PROBLEM: ======== USS design depends on snapview-server translator communicating with each individual snapshot via gfapi. So, the snapview-server xlator maintains the glfs instance (thus the snapshot) to which a inode belongs to by storing it inside the inode context. Suppose, a file from a snapshot is opened by a application, and the fd is still valid from application's point of view (i.e. application has not yet closed fd). Now, if the snapshot to which the opened file belongs to is deleted, then the glfs_t instance corresponding to the snapshot is destroyed by snapview-server as part of snap deletion. But now, if the application does IO on the fd it has kept open, then snapview server tries to send that request to the corresponding snap via glfs instance for that snapshot stored in the inode context for the file on which the application is sending the fop. And this results in freed up glfs_t pointer being accessed and causes a segfault. FIX: === For fd based operations, check whether the glfs instance that the inode contains in its context, is still valid or not. For non fd based operations, usually lookup should guarantee that. But if the file was already looked up, and the client accessing the snap data (either NFS, or native glusterfs fuse) does not bother to send a lookup and directly sends a path based fop, then that path based fop should ensure that the fs instance is valid. Change-Id: I881be15ec46ecb51aa844d7fd41d5630f0d644fb updates: bz#1602070 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
*	storage/posix: Increment trusted.pgfid in posix_mknod	N Balachandran	2018-08-23	1	-0/+57
\| \| \| \| \| \| \| \| \| \|	The value of trusted.pgfid.xx was always set to 1 in posix_mknod. This is incorrect if posix_mknod calls posix_create_link_if_gfid_exists. Change-Id: Ibe87ca6f155846b9a7c7abbfb1eb8b6a99a5eb68 fixes: bz#1619720 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	xlators/playground: fix the template files with latest requirements	Amar Tumballi	2018-08-22	2	-3/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Make use of xlator_api * Make use of gf_msg() * Make use of mem-pool * Add a sample metrics dump function * Provide an dummy option, which can be initialized, and reconfigured * Add a test case to make sure template xlator is built and used with default fops * Make a change in rpc-coverage to run without lock tests. Updates: bz#1193929 Change-Id: I377dd67b656f440f9bc7c0098e21c0c1934e9096 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	cli: Fix invalid access of option_str variable	ShyamsundarR	2018-08-20	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In function cli_cmd_volume_statedump_options_parse if the wordcount of arguments is exactly 3, then option_str would remain NULL, and hence the function will generate a segmentation fault on the strstr check in its body. This can be triggered when we run the command, `gluster volume statedump <volname>` The fix is to check if option_str is non-NULL before use and also to pass in a duplicated empty string to the dict key "options" when this is NULL. Fixes: bz#1619423 Change-Id: Ic029ab60b64890d92c7a0876a638929495d3aa59 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	cluster/afr: Fix bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t	karthik-us	2018-08-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In line #13 of the test case, it checks whether the file is present on first 2 bricks or not. If it is not present on even one of the bricks it will break the loop and checks for the dirty marking on the parent on the 3rd brick and checks for file not present on the 1st and 2nd bricks. The below scenario can happen in this case: - File gets created on 1st and 3rd bricks - In line #13 it sees file is not present on both 1st & 2nd bricks and breaks the loop - In line #51 test fails because the file will be present on the 1st brick - In line #53 test will fail because the file creation was not failed on quorum bricks and dirty marking will not be there on the parent on 3rd brick Fix: Don't break from the loop if file is present on either brick 1 or brick 2. Change-Id: I918068165e4b9124c1de86cfb373801b5b432bd9 fixes: bz#1612054 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	posix: Delete the entry if gfid link creation fails	karthik-us	2018-08-20	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: If the gfid link file inside .glusterfs is not present for a file, the operations which are dependent on the gfid will fail, complaining the link file does not exists inside .glusterfs. Fix: If the link file creation fails, fail the entry creation operation and delete the original file. Change-Id: Id767511de2da46b1f45aea45cb68b98d965ac96d fixes: bz#1612037 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	snapshot/handshake: store description after strdup	Mohammed Rafi KC	2018-08-20	1	-0/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	problem: During a handshake, when we import a friend data snap description variable was just referenced to dictionary value. Solution: snap description should have a separate memory allocated through gf_strdup Change-Id: I94da0c57919e1228919231d1563a001362b100b8 fixes: bz#1618004 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	snapshot:Fail snapshot creation if an empty description provided	Mohammed Rafi KC	2018-08-19	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Snapshot description should have a valid string. Creating a snapshot with null value will cause reading from info file to fail with a null exception Change-Id: I9f84154b8e3e7ffefa5438807b3bb9b4e0d964ca updates: bz#1618004 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	performance/readdir-ahead: keep stats of cached dentries in sync with ↵	Krutika Dhananjay	2018-08-18	3	-1/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	modifications PROBLEM: Stats of dentries that are readdirp'd ahead can become stale due to fops like writes, truncate etc that modify the file pointed by dentries. When a readdir is finally wound at offset corresponding to these entries, the iatts that are returned to the application come from readdir-ahead's cache, which are stale by now. This problem gets further aggravated when caching translators/modules cache and continue to serve this stale information. FIX: * Store the iatt in context of the inode pointed by dentry. * Whenever the inode pointed by dentry undergoes modification, in cbk of modification fop, update the iatt stored in inode-ctx to reflect the modification. * When serving a readdirp response from application, update iatts of dentries with the iatts stored in the context of inodes pointed by these dentries. * Some fops don't have valid iatts in their responses. For eg., write response whose data is still cached in write-behind will have zeroed out stat. In this case keep only ia_type and ia_gfid and reset rest of the iatt members to zero. - fuse-bridge in this case just sends "entry" information back to kernel and attr is not sent. - gfapi sets entry->inode to NULL and zeroes out the entire stat * There is one tiny race between the entry creation and a readdirp on its parent dir, which could cause the inode-ctx setting and inode ctx reading to happen on two different inode objects. To prevent this, when entry->inode doesn't eqaul to linked_inode, - fuse-bridge is made to send only "entry" information without attributes - gfapi sets entry->inode to NULL and zeroes out the entire stat. Change-Id: Ia27ff49a61922e88c73a1547ad8aacc9968a69df BUG: 1390050 Updates: bz#1390050 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
*	tests: Add thin-arbiter.rc for writing tests for thin-arbiter	Pranith Kumar K	2018-08-18	2	-0/+472
\| \| \| \| \| \|	fixes bz#1615789 Change-Id: I1f42e78fec5ddaf2a425dc4b82c9a20472aa146d Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	tests: Increase timeout to 30 minutes to handle lcov slowness	Pranith Kumar K	2018-08-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	This script on a normal setup takes 15 minutes. With lcov it needs to be increased. Considering we did 1.5X of the default $run_timeout in run-tests.sh, I am doing the same for this script. fixes bz#1614718 Change-Id: Ia571b33ff13deb8cbd8e48561769e876aa0b1aff Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	tests: Fix spurious failures in stats-dump.t test	ShyamsundarR	2018-08-16	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The test fails to grep and find queue_size, in a brick stats dump, having succesfully found aggr.* values in the same. The troubleshot is that, the writer thread in io-stats, that dumps this in a particular interval, truncates the file just before the grep attempts to read the contents, and hence the failure. The fix is to stop the dumper thread, and then wait for a couple of seconds and then check the output, so that the thread writer does not interfere with the test. Fixes: bz#1615582 Change-Id: I29f95488a2ad693abe1dd525b1d87a9d1eee29a2 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	contrib/xxhash: update to latest xxHash (0.6.5)	Yaniv Kaul	2018-08-16	2	-19/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update to latest xxHash, which is supposed to faster with small keys. Specifically, updated to 3064d42e7d74b0921bdd1818395d9cb37bb8976a, which is a bit higher than 0.6.5. Compiled hopefully with namespace (XXH_NAMESPACE=GF_), which allows to use XXH() funcs with no fear they'll 'leak' from our library. Only compile tested! xxhsum is modified to display messages which was conflicting with regression tests (TAP harness). So modified the gfid2path_fuse.t and gfid2path_nfs.t to adhere to that. updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: I35cea5cc93f338c1023ac2c9bc6d7d13225a967b
*	features/shard: Fix crash and test case in RENAME fop	Krutika Dhananjay	2018-08-14	1	-40/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Setting the refresh flag in inode ctx in shard_rename_src_cbk() is applicable only when the dst file exists and is sharded and has a hard link > 1 at the time of rename. But this piece of code is exercised even when dst doesn't exist. In this case, the mount crashes because local->int_inodelk.loc.inode is NULL. Change-Id: Iaf85a5ee3dff8b01a76e11972f10f2bb9dcbd407 Updates: bz#1611692 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	tests: Fix for gfid-mismatch-resolution-with-fav-child-policy.t failure	karthik-us	2018-08-14	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This test was retried once on build https://build.gluster.org/job/regression-on-demand-multiplex/174/ (logs for the first try is not available with this build) Test case was failing in line #47 where it was was checking for the heal count to be 0. Line #51 had passed that means file got the gfid split brain resolved, and both the bricks had same gfids. At line #54 it again failed which checks for the md5sum on both the bricks. At this point the md5sum of the brick where the file got impunged had the md5sum same as the newly created empty file. This means the data heal has not happened for the file. At line #64 enabling granular-entry-heal faild, but without the logs it is not possible to debug this issue. Change-Id: I56d854dbb9e188cafedfd24a9d463603ae79bd06 fixes: bz#1615331 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	tests: fix brick check orders	Atin Mukherjee	2018-08-13	9	-43/+66
\| \| \| \| \| \| \| \| \| \| \| \|	fix brick checks for validating-server-quorum.t & quorum-validation.t ...and make brick_up_status_1 function more generic. Also fix a timing issue in bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t Change-Id: I797ef4cec5b160aafa979bae7151b1e99fcb48ac Updates: bz#1603063 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	tests/quick-read/bug-846240.t: fix a wrong test	Raghavendra G	2018-08-13	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Earlier this test did following things on M0 and M1 mounted on same volume: 1 create file M0/testfile 2 open an fd on M0/testfile 3 remove the file from M1, M1/testfile 4 echo "data" >> M0/testfile The test expects appending data to M0/testfile to fail. However, redirector ">>" creates a file if it doesn't exist. So, the only reason test succeeded was due to lookup succeeding due to stale stat in md-cache. This hypothesis is verified by two experiments: * Add a sleep of 10 seconds before append operation. md-cache cache expires and lookup fails followed by creation of file and hence append succeeds to new file. * set md-cache timeout to 600 seconds and test never fails even with sleep 10 before append operation. Reason is stale stat in md-cache survives sleep 10. So, the spurious nature of failure was dependent on whether lookup is done when stat is present in md-cache or not. The actual test should've been to write to the fd opened in step 2 above. I've changed the test accordingly. Note that this patch also remounts M0 after initial file creation as open-behind disables opening-behind on witnessing a setattr on the inode and touch involves a setattr. On remount, create operation is not done and hence file is opened-behind. Change-Id: I739f255e0a62ff0024f0824dad3539974955df99 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1615096
*	cluster/afr: Fix bug-1433571-undo-pending-only-on-up-bricks.t	karthik-us	2018-08-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: The test case was checking for the entry pending marker reset on the root after performing client side lookup at line #60-63. But sometimes the entry heal was not getting completed immediately. Fix: Wait for the entry heal to complete before checking the changelog. Change-Id: I42fde21b04a126ab044ce58373a996d72f125d96 fixes: bz#1614730 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	tests: potential fixes to bugs/replicate/bug-1408712.t	Ravishankar N	2018-08-13	1	-2/+15
\| \| \| \| \| \| \| \|	See BZ for details. Change-Id: I2cc2064f14d80271ebcc21747103ce4cee848cbf fixes: bz#1615078 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	tests: fix replace-brick-self-heal.t failure	Ravishankar N	2018-08-13	1	-1/+1
\| \| \| \| \| \| \| \|	Please see BZ for details. Change-Id: Id9273432874bc6a452ac96b2b8c7a61ea6c5b98d Fixes: bz#1615239 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	tests: fix tests/bugs/shard/configure-lru-limit.t	Atin Mukherjee	2018-08-13	1	-0/+4
\| \| \| \| \| \| \| \|	Check for the bricks to be up before attempting to mount. Change-Id: I1224908137016df3007f4467aa9760967ce0694d Fixes: bz#1615092 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	tests: potential fixes for tests/basic/afr/add-brick-self-heal.t	Ravishankar N	2018-08-13	1	-0/+7
\| \| \| \| \| \| \| \|	Please see bug description for details. Change-Id: Ieb6bce6d1d5c4c31f1878dd1a1c3d007d8ff81d5 fixes: bz#1614654 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	tests: Fix cleanup routine for some mux tests	ShyamsundarR	2018-08-13	4	-10/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some of the mux tests, set a trap to catch test exit and call cleanup. This will cause cleanup to be invoked twice in case the test times out, or even otherwise, as include.rc also sets a trap to cleanup on exit (TERM and others). This leads to the tarballs generated on failures for these tests to be empty and does not aid debugging. This patch corrects this pattern across the tests to the more standard cleanup at the end. Fixes: bz#1615037 Change-Id: Ib83aeb09fac2aa591b390b9fb9e1f605bfef9a8b Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	Make sure EXPECT_WITHIN executes the statement multiple times	Pranith Kumar K	2018-08-12	2	-6/+14
\| \| \| \| \| \| \| \| \| \| \| \| \|	When we pass a command to be executed in EXPECT_WITHIN and we use `` the value is passed by value, so if the first execution gives a result that is different from the expected value, EXPECT_WITHIN test will fail because the command will not be re-evaluated. Changed the expression with `` to a function. Added sleep(3) in afr.c for reconfigure to both RC and re-test after the change. fixes bz#1614662 Change-Id: I3bc8a75b996729261aa48067f6ed8da9c6273b13 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	glusterd: Compare volume_id before start/attach a brick	Mohit Agrawal	2018-08-10	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: After reboot a node brick is not coming up because fsid comparison is failed before start a brick Solution: Instead of comparing fsid compare volume_id to resolve the same because fsid is changed after reboot a node but volume_id persist as a xattr on brick_root path at the time of creating a volume. Change-Id: Ic289aab1b4ebfd83bbcae8438fee26ae61a0fff4 fixes: bz#1612418 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	tests: kill_brick should wait for brick status to become offline	Atin Mukherjee	2018-08-10	1	-10/+10
\| \| \| \| \| \|	Change-Id: I52e8eec7f334af37de433c444f4ddfc876fa56cc Fixes: bz#1614088 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	tests: Set heal-timeout to 5 seconds	Pranith Kumar K	2018-08-09	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Shd keeps doing heals in a loop until it heals at least one entry in the previous run. A heal is termed successful only if it heals both metadata and entry/data heal i.e. the entry needs to be completely healed by just that healer. In tests/basic/afr/granular-esh/replace-brick.t test, brick-0 is old and brick-1 is new. After replace-brick only root-gfid will be present in brick-0's index 1) shd-thread corresponding to brick-0 does metadata heal, this creates root-gfid in brick-0's 'dirty' index. 2) Both healer threads corresponding to brick-0 and brick-1 now try to heal root-gfid and brick-1 gets the heal-domain lock. brick-0's shd-thread will experience a failure and it goes back to waiting for 10 minutes (cluster.heal-timeout). 3) When brick-1's healer-thread completes healing root-gfid it creates 5 files which create indices in brick-0, so until brick-0 doesn't trigger one more heal, heal won't happen. $HEAL_TIMEOUT is set at 120 seconds, which is lesser than cluster.heal-timeout, so decreasing this to 5 seconds so that the next heal is triggered which will do the heals. fixes bz#1613807 Change-Id: I881133fc28880d8615fbc4558a0dfa0dc63d7798 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	tests: Increase timeout for mpx restart crash test	ShyamsundarR	2018-08-09	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In lcov based regression testing environments, all tests take more time than what occurs in centos7 regressions. Possibly due to code instrumentation for lcov purposes. Due to this the test, bug-1432542-mpx-restart-crash.t constantly times out. This patch increases the timeout for the same to enable lcov tests to pass on a more regular basis. It was also noted by Nithya that the test at times generated an OOM kill on the regression machines. In order to reduce runtime memory foot print of the tests, FUSE mounts are unmounted as soon as the required test is complete. Fixes: bz#1608568 Change-Id: I37f8d4b45807a69c52c7c7df4923c0fc33fab4e4 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	tests/bitrot: Fix tests/bitrot/bug-1373520.t	Kotresh HR	2018-08-09	1	-4/+13
\| \| \| \| \| \| \| \| \| \| \|	The test was failing with brick-mux enabled intermittently. As the test depends on lookup to recover file via heal, it's advisable to disable all perf xlators. Hence doing the same. fixes: bz#1611566 Change-Id: Ib7705e7951d53c435b8e390298164d73c6d71745 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	tests: fix online_brick_count function	Atin Mukherjee	2018-08-03	1	-1/+4
\| \| \| \| \| \| \| \|	online_brick_count should discard Bitrot and Scrubber daemon. Change-Id: I301373ccdbeec1d1a5e6c6b137f48ed997f22556 Fixes: bz#1611103 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	Revert "performance/readdir-ahead: Invalidate cached dentries if they're ↵	Raghavendra G	2018-08-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	modified while in cache" This reverts commit 7131de81f72dda0ef685ed60d0887c6e14289b8c. With the latest master, I created a single brick volume and some files inside it. [root@rhgs313-6 ~]# umount -f /mnt/fuse1; mount -t glusterfs -s 192.168.122.6:/thunder /mnt/fuse1; ls -l /mnt/fuse1/; echo "Trying again"; ls -l /mnt/fuse1 umount: /mnt/fuse1: not mounted total 0 ----------. 0 root root 0 Jan 1 1970 file-1 ----------. 0 root root 0 Jan 1 1970 file-2 ----------. 0 root root 0 Jan 1 1970 file-3 ----------. 0 root root 0 Jan 1 1970 file-4 ----------. 0 root root 0 Jan 1 1970 file-5 d---------. 0 root root 0 Jan 1 1970 subdir Trying again total 3 -rw-r--r--. 1 root root 33 Aug 3 14:06 file-1 -rw-r--r--. 1 root root 33 Aug 3 14:06 file-2 -rw-r--r--. 1 root root 33 Aug 3 14:06 file-3 -rw-r--r--. 1 root root 33 Aug 3 14:06 file-4 -rw-r--r--. 1 root root 33 Aug 3 14:06 file-5 d---------. 0 root root 0 Jan 1 1970 subdir [root@rhgs313-6 ~]# Conversation can be followed on gluster-devel on thread with subj: tests/bugs/distribute/bug-1122443.t - spurious failure. git-bisected pointed this patch as culprit. Change-Id: I1eb46f6c196f44fde8ce991840a0e724e6f50862 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Updates: bz#1390050
*	glusterd: Bricks of a normal volumes should not attach to ↵	Sanju Rakonde	2018-08-03	2	-33/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gluster_shared_storage bricks Problem: In a brick multiplexing environment, Bricks of a normal volume created by user are getting attached to the bricks of a volume "gluster_shared_storage" which is created by enabling the enable-shared-storage option. Mounting gluster_shared_storage has strict authentication checks. when we attach bricks of a normal volume to bricks of gluster_shared_storage, mounting the normal volume created by user will fail due to strict authentication checks. Solution: We should not attach bricks of a normal volume to brick process of gluster_shared_storage volume and vice versa. fixes: bz#1610726 Change-Id: If1b5a2a02675789a2915ba480fb48c145449163d Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	coverity: Fix remaining SECURE_TEMP issues reported	ShyamsundarR	2018-08-03	1	-3/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Two pending SECURE_TEMP issues still exist in the coverity reports, these are fixed by this patch. In both instances (where functions actually seem to be duplicates of each other) the need was for a FILE * and not an fd. Applied the same pattern in both places as in other parts of the code where mkstemp was used and later a FILE * was created from the resulting fd for use. Coverity report: https://download.gluster.org/pub/gluster/ glusterfs/static-analysis/master/glusterfs-coverity/ 2018-07-30-4d3c62e7/html/ Issues numbered: 382, 383 (named SECURE_TEMP) Further added tmpfile to the blacklist, so that future code changes do not add the same, into symbol-check.sh. Also corrected shellcheck errors in symbol-check.sh as a result of updating the same. Updates: bz#789278 Change-Id: I1d572a16ca5b5df2f597aeaa5f454fad34c8296e Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	geo-rep/hook-script: Fix ssh/scp options	Kotresh HR	2018-08-03	4	-11/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Always use ssh and scp with "-oPasswordAuthentication=no" and "-oStrictHostKeyChecking=no" options. It might hang the post script otherwise leading geo-rep setup failure Also increased geo-rep timeout. Occasionally, it's taking more time to reach Active/Passive status. Especially, the first start after create. fixes: bz#1610405 Change-Id: I9560d64dbe0edf5db73446a9fc97dda19b88d233 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	performance/open-behind: don't use anonymous fds for reads by default	Raghavendra G	2018-08-02	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	anonymous fds interfere with working of read-ahead as read-ahead won't be able to store its cache in fd. Also, as seen in bz 1455872, anonymous fds also affect performance of large file sequential reads as the cost of opening fd for each read on brick stack is significant. So, have a proper fd which enables read-ahead to store its cache and brick stack to reuse the fd during reads. With this change test tests/bugs/snapshot/bug-1167580-set-proper-uid-and-gid-during-nfs-access.t fails consistently. The failure can also be seen with open-behind off. bz 1611532 has been filed to track the issue with test. Thanks to Rafi <rkavunga@redhat.com> for assistance provided in debugging test failure. Change-Id: Ifa52d8ff017f115e83247f3396b9d27f0295ce3f Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1455872
*	performance/md-cache: update cache only from fops issued after previous ↵	Raghavendra G	2018-08-02	5	-2/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	invalidation Invalidations are triggered mainly by two codepaths - upcall and write-behind unwinding a cached write with zeroed out stat. For the case of upcall, following race can happen: * stat s1 is fetched from brick * invalidation is detected on brick * invalidation is propagated to md-cache and cache is invalidated * s1 updates md-cache with a stale state For the case of write-behind, imagine following sequence of operations, * A stat s1 was issued from application thread t1 when size of file was s1 * stat s1 completes on brick stack, but yet to reach md-cache * A write w1 from application thread t2 extends file to size s2 is cached in write-behind and response is unwound with zeroed out stat * md-cache while handling write-cbk, invalidates cache * md-cache receives response for s1, updates cache with stale stat with size of s1 overwriting invalidation state Fix is to remember when s1 was incident on md-cache and update cache with results of s1 only if the it was incident after invalidation of cache. This patch identified some bugs in regression tests which is tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1608158. As a stop gap measure I am marking following tests as bad basic/afr/split-brain-resolution.t bugs/bug-1368312.t bugs/replicate/bug-1238398-split-brain-resolution.t bugs/replicate/bug-1417522-block-split-brain-resolution.t bugs/replicate/bug-1438255-do-not-mark-self-accusing-xattrs.t Change-Id: Ia4bb9dd36494944e2d91e9e71a79b5a3974a8c77 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Updates: bz#1512691
*	build: remove bundled arg-standalone	Niels de Vos	2018-07-28	4	-8/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	libargp or argp-standalone is available on all commonly used distributions. There is no need to bundle an unmaintained version of argp-standalone in this repository anymore. FreeBSD places the argp.h file in /usr/local/include when argp-standalone is installed. This path is not added to CPPFLAGS by default, so thats done in configure.ac as well. Change-Id: I384a53ab0a008ec9d48fd83afeaf8fbc197e91ee Fixes: bz#1609337 Signed-off-by: Niels de Vos <ndevos@redhat.com>
*	performance/readdir-ahead: Invalidate cached dentries if they're modified ↵	Krutika Dhananjay	2018-07-28	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	while in cache PROBLEM: Entries that are readdirp'd ahead can undergo modification in terms of writes, truncates which could modify their iatts. When a readdir is finally wound at offset corresponding to these entries, the iatts that are returned to the application come from readdir-ahead's cache, which are stale by now. This problem gets further aggravated when caching translators/modules cache and continue to serve this stale information. FIX: Whenever a dentry undergoes modification, in the cbk of the modification fop, a "dirty" flag (default 0) is set in its inode ctx. When it's time for readdir-ahead to serve these entries, it will read the inode ctx and check if the entry is "dirty", and if it is, set the entry's attrs to all zeroes, as an indicator to fuse, md-cache etc not to cache these attributes. Also there is one tiny race between the entry creation and a readdirp on its parent dir, which could cause the inode-ctx setting and inode ctx reading to happen on two different inode objects. To prevent this, fuse-bridge is made to drop entries for which dentry->inode is not the same as linked inode, in readdirp cbk. Change-Id: If7396507632b5268442ca580473d5155fee9cbef BUG: 1390050 Updates: bz#1390050 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
*	glusterd: Add multiple checks before attach/start a brick	Mohit Agrawal	2018-07-27	6	-1/+100
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In brick mux scenario sometime glusterd is not able to start/attach a brick and gluster v status shows brick is already running Solution: 1) To make sure brick is running check brick_path in /proc/<pid>/fd , if a brick is consumed by the brick process it means brick stack is come up otherwise not 2) Before start/attach a brick check if a brick is mounted or not 3) At the time of printing volume status check brick is consumed by any brick process Test: To test the same followed procedure 1) Setup brick mux environment on a vm 2) Put a breaking point in gdb in function posix_health_check_thread_proc at the time of notify GF_EVENT_CHILD_DOWN event 3) unmount anyone brick path forcefully 4) check gluster v status it will show N/A for the brick 5) Try to start volume with force option, glusterd throw message "No device available for mount brick" 6) Mount the brick_root path 7) Try to start volume with force option 8) down brick is started successfully Change-Id: I91898dad21d082ebddd12aa0d1f7f0ed012bdf69 fixes: bz#1595320 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	features/shard: Make lru limit of inode list configurable	Krutika Dhananjay	2018-07-23	1	-0/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently this lru limit is hard-coded to 16384. This patch makes it configurable to make it easier to hit the lru limit and enable testing of different cases that arise when the limit is reached. The option is features.shard-lru-limit. It is by design allowed to be configured only in init() but not in reconfigure(). This is to avoid all the complexity associated with eviction of least recently used shards when the list is shrunk. Change-Id: Ifdcc2099f634314fafe8444e2d676e192e89e295 updates: bz#1605056 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	All: run codespell on the code and fix issues.	Yaniv Kaul	2018-07-22	13	-20/+20
\| \| \| \| \| \| \| \| \| \| \| \|	Please review, it's not always just the comments that were fixed. I've had to revert of course all calls to creat() that were changed to create() ... Only compile-tested! Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	build/core: location of xattr.h	Kaleb S. KEITHLEY	2018-07-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	starting with libattr-devel-2.4.48-1 in Fedora 28 <attr/xattr.h> has been removed from the package. On Fedora, RHEL/CentOS, and SUSE, the glibc-headers package has provided a near identical file, <sys/xattr.h>, in all the releases that we care about. On Debian, libc6-dev provides <sys/xattr.h> all the way back to 8/Jessie and presumably all Ubuntu derivatives since then. Note that on Debian the sys headers are installed in /usr/include/$DEB_HOST_GNU_TYPE/sys/... Change-Id: Id07c4b225bdaa6556bd54772604e75b8f346fb60 updates: bz#1193929 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	cluster/afr: Mark dirty for entry transactions for quorum failures	karthik-us	2018-07-16	1	-0/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: If an entry creation transaction fails on quprum number of bricks it might end up setting the pending changelogs on the file itself on the brick where it got created. But the parent does not have any entry pending marker set. This will lead to the entry not getting healed by the self heal daemon automatically. Fix: For entry transactions mark dirty on the parent if it fails on quorum number of bricks, so that the heal can do conservative merge and entry gets healed by shd. Change-Id: I56448932dd409b3ddb095e2ae32e037b6157a607 fixes: bz#1586020 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	glusterd: To find a compatible brick ignore diagnostics.brick-log-level option	Mohit Agrawal	2018-07-13	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: glusterd start a volume as a separate process instead of attaching with the already running process if volume option has different brick-log-level. There is no functionality impact on a brick if the option has different brick-log-level so glusterd should attach a brick with the already running process. Solution: Ignore brick-log-level option in unsafe_option BUG: 1599628 Change-Id: I72638ff2026fcd9332bc38e1144b1ef4a708820b fixes: bz#1599628 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	tests/geo-rep: increase the timeout	Amar Tumballi	2018-07-13	2	-0/+4
\| \| \| \| \| \| \| \| \| \|	since some time, geo-rep tests were in the border of 180-190 seconds to complete. As actual test timeout is 200 seconds by default, giving these tests some buffer time to complete properly. updates: bz#1193929 Change-Id: I9f501a02b52585dff7d0473824bdbb229e124278 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	tests/geo-rep: Add rsnapshot and hardlink rename test case	Kotresh HR	2018-07-13	3	-33/+153
\| \| \| \| \| \| \| \| \| \| \| \|	1. rsnapshot creates a I/O pattern which involves create, rename, symlink, hardlink in specific order. 2. Hardlink rename on master with source unlinked use case fixes: bz#1597540 Change-Id: Iedade47e5bf9905424a974df6ec33bc6f6695082 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	Quota: Fix crawling of files	Sanoj Unnikrishnan	2018-07-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Running "find ." does not crawl files. It goes over the directories and lists all dentries with getdents system call. Hence the files are not looked up. Solution: explicitly triggerr stat on files with find . -exec stat {} \; since crawl can take slightly longer, updating timeout in test case Change-Id: If3c1fba2ed8e300c9cc08c1b5c1ba93cb8e4d6b6 fixes: bz#1533000 Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com>
*	snapshot : remove stale entry	Sunny Kumar	2018-07-12	1	-0/+57
\| \| \| \| \| \| \| \| \| \| \|	During snap delete after removing brick-path we should remove snap-path too i.e. /var/run/gluster/snaps/<snap-name>. During snap deactivate also we should remove snap-path. Change-Id: Ib80b5d8844d6479d31beafa732e5671b0322248b fixes: bz#1597662 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>