summaryrefslogtreecommitdiffstats
path: root/tests
Commit message (Collapse)AuthorAgeFilesLines
* afr: prevent winding inodelks twice for arbiter volumesRavishankar N2018-10-101-0/+44
| | | | | | | | | | | | | | | | | | Problem: In an arbiter volume, if there is a pending data heal of a file only on arbiter brick, self-heal takes inodelks twice due to a code-bug but unlocks it only once, leaving behind a stale lock on the brick. This causes the next write to the file to hang. Fix: Fix the code-bug to take lock only once. This bug was introduced master with commit eb472d82a083883335bc494b87ea175ac43471ff Thanks to Pranith Kumar K <pkarampu@redhat.com> for finding the RCA. fixes: bz#1637802 Change-Id: I15ad969e10a6a3c4bd255e2948b6be6dcddc61e1 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* cli: memory leak issues reported by asanAmar Tumballi2018-10-091-1/+1
| | | | | | | | | With this fix, a run on 'rpc-coverage.t' passes properly. This should help to get started with other fixes soon! Change-Id: I257ae4e28b9974998a451d3b490cc18c02650ba2 updates: bz#1633930 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* tests: add get-state command to testSanju Rakonde2018-10-071-0/+7
| | | | | | | | | | | | | | When geo-replication session is running, run "gluster get-state" command to test. https://review.gluster.org/#/c/glusterfs/+/20461/ patch fixes glusterd crash, when we run get-state command with geo-rep session configured. Adding the test now. Fixes: bz#1598345 Change-Id: I56283fba2c782f83669923ddfa4af3400255fed6 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* Reduce execution time of bug-1559004-EMLINK-handling.tXavi Hernandez2018-10-041-12/+51
| | | | | | | | | | | This patch reduces the execution time of bug-1559004-EMLINK-handling.t from ~14 minutes to ~90 seconds. To do so, it creates some fake hard links directly on the brick instead of creating them through the volume. Change-Id: I9715ff1a4eba47574c733d4f28e68f42f56a7d3f updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* rpc: make binding to port 0 as the default if no option is providedAmar Tumballi2018-10-021-2/+5
| | | | | | | | | | | | | | | | | | | | | | | Right now, if no option is provided, the default port is assumed, which is 24007. Ideally, for 'glusterfsd' processes, it is better to not assume there are any ports given, so it can start listening on any port which is available. This helps us to cleanup the dependencies on glusterd from glusterfsd at the moment. No changes would be done to glusterd code, but making the right defaults helps to make glusterfsd more independent process later. NOTE: This patch is a reduced version of below set of patches: * https://review.gluster.org/14613/ & * https://review.gluster.org/14670/ & * https://review.gluster.org/14671/ Credits: Prasanna Kumar Kalever <pkalever@redhat.com> updates: bz#1343926 Change-Id: Ib874e10505e7366dc56ba754458252b67052e653 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* python3: assume python3 unless building _packages_ on sys without py3Kaleb S. KEITHLEY2018-09-2710-10/+1
| | | | | | | | | | | | | | | The jenkins release-new job runs on a CentOS 7 box, which does not have python3. As a result it runs (autogen.sh and) configure before producing the dist tar file, converting all the python3 shebangs to python2 shebangs in the dist tar file. Then when that tar file is "carried" to, e.g. Fedora koji build system to build packages, the shebangs are incorrect, despite having originally been correct in the git repo. Change-Id: I5154baba3f6d29d3c4823bafc2b57abecbf90e5b updates: #411 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* ctime: Provide noatime optionKotresh HR2018-09-251-0/+51
| | | | | | | | | | | | | | | | | Most of the applications are {c|m}time dependant and very few are atime dependant. So provide noatime option to not update atime when ctime feature is enabled. Also this option has to be enabled with ctime feature to avoid unnecessary self heal. Since AFR/EC reads data from single subvolume, atime is only updated in one subvolume triggering self heal. updates: bz#1593538 Change-Id: I085fb33c882296545345f5df194cde7b6cbc337e Signed-off-by: Kotresh HR <khiremat@redhat.com>
* cluster/ec: Fix failure of tests/basic/ec/ec-1468261.tAshish Pandey2018-09-251-0/+1
| | | | | | | | | | | | | | | | Problem: In this test we are relying on eager-lock time duration of 1 second to delay the post op + unlock phase of an entry fop so that in this 1 second we can kill 2 bricks and dirty on directory could be set. Solution: To fix this issue, we should set the others.eager-lock option to "ON" explicitly in the beginning of this test. Change-Id: I19bbb9c15d7bdf96a96b20587c618192d0b740ef fixes bz#1632161 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* afr: fix incorrect reporting of directory split-brainRavishankar N2018-09-212-1/+63
| | | | | | | | | | | | | | | | | Problem: When a directory has dirty xattrs due to failed post-ops or when replace/reset brick is performed, AFR does a conservative merge as expected, but heal-info reports it as split-brain because there are no clear sources. Fix: Modify pending flag to contain information about pending heals and split-brains. For directories, if spit-brain flag is not set,just show them as needing heal and not being in split-brain. Fixes: bz#1626994 Change-Id: I09ef821f6887c87d315ae99e6b1de05103cd9383 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* tests: fix test case failureSanju Rakonde2018-09-201-1/+1
| | | | | | | | | | | | | | | | tests/bugs/glusterd/bug-1595320.t is failing in downstream. In downstream repo, enabling the brick multiplexing made interactive, so it will throw an prompt for the user input. As no input is provided during the test case execution, the test is failing. Using macro CLI instead of using gluster command, will bypass the interacive commands. so replacing the gluster command with CLI macro will address the issue. Change-Id: I6b39052d8e415a8ed08de7c80a91dadce155146a updates: bz#1193929 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* cluster/afr: Use 2 domain locking in SHD for thin-arbiterkarthik-us2018-09-202-0/+230
| | | | | | | | | | | | | | | | | | | | With this change when SHD starts the index crawl it requests all the clients to release the AFR_TA_DOM_NOTIFY lock so that clients will know the in memory state is no more valid and any new operations needs to query the thin-arbiter if required. When SHD completes healing all the files without any failure, it will again take the AFR_TA_DOM_NOTIFY lock and gets the xattrs on TA to see whether there are any new failures happened by that time. If there are new failures marked on TA, SHD will start the crawl immediately to heal those failures as well. If there are no new failures, then SHD will take the AFR_TA_DOM_MODIFY lock and unsets the xattrs on TA, so that both the data bricks will be considered as good there after. Change-Id: I037b89a0823648f314580ba0716d877bd5ddb1f1 fixes: bz#1579788 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* dht: utilize the framework to pass-through xlator tasksAmar Tumballi2018-09-193-4/+3
| | | | | | | | | | | | | | | | | | | | | Also fixes the issue caused due to not converting back the fn function to after getting its address. We wanted the value of the field, not the address of the pt_fop field. With this patch, DHT will always be started in pass-through mode if the number of subvols is just 1. Fixes some tests to make sure DHT is in full config (ie, subvols > 1). - increased timeout of brick-mux test as it was bordering on 300 seconds. - Also change the volume type to supported 'replica 3' from 'replica 2'. - also no DHT tests should assume presence of DHT when there is just 1 brick in volume Credits: Nithya B <nbalacha@redhat.com> fixes: #405 Change-Id: I8e55239ce58d6ac6ae1901e2e384be1ecbd33d6e Signed-off-by: Amar Tumballi <amarts@redhat.com>
* tests/dht: Uncomment cleanup stepsN Balachandran2018-09-181-5/+5
| | | | | | | | | I had forgotten to uncomment the cleanup steps for file-create.t. Fixed. Change-Id: Id702b99b8e09f56b7333491a477828b4a37b2687 updates: bz#1628194 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* tests: fixes to bug-1015990-rep.tRavishankar N2018-09-181-14/+7
| | | | | | | | | | | - check that the shd is connected to brick before running statistics command - remove sleep statements - remove unneeded ($count-$value==0) test when it is known that both values will be same Fixes: bz#1625850 Change-Id: Ifcd4887f0238031e5bca803cd9bfdb75a6e6c01b Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* geo-rep: Fix issues related config setKotresh HR2018-09-181-0/+4
| | | | | | | | | | | | | | | | | | | | 1. '--ignore-mising-args' option for rsync is not being used even though the rsync version is greater than 3.1.0. Fixed the same. 2. '--existing' option for rsync is also not being used. Fixed the same. 3. geo-rep config fails to set rsync-options as the value contains '--'. Interestingly, python argsparse treats the value with '--' (e.g., --ignore-missing-args) as option. But when passed with something like --value=--ignore-missing-args, it succeeds. Fixed the same. Change-Id: Iaeb838acaff1c2920fee9c7f920c99edce13a0a1 Signed-off-by: Kotresh HR <khiremat@redhat.com> fixes: bz#1629561
* tests/dht: Add tests for file createN Balachandran2018-09-172-0/+159
| | | | | | | | Test dht file creates Change-Id: I7aba710f4911432bd3b86834efecae8f01e4052f updates: bz#1628194 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* Land part 2 of clang-format changesGluster Ant2018-09-1265-7508/+7651
| | | | | Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
* Land clang-format changesGluster Ant2018-09-122-83/+108
| | | | Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
* misc: fix misc. shebangsKaleb S. KEITHLEY2018-09-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * One #!/usr/bin/env python and three #!/usr/bin/python were overlooked in all the other python fixups. Ugh. * Two new python files missed the memo about #!/usr/bin/python3. * One #!/usr/bin/env bash. Various distribution packaging policies have strong wording about the use of #!/usr/bin/env ... Note: this patch does not change the use of #!/usr/bin/env bash in the two files extras/{clang-checker.sh,check_goto.pl} as these are not included in any packages. (Although I'm not actually sure why anyone would ever use '/usr/bin/env {sh,bash}' as I'm not aware of any version-specific differences like there are with, e.g., python.) * One #!/usr/bin/bash. On Fedora and CentOS > 6, /bin is a symlink to /usr/bin, so it makes little difference. But Debian & Ubuntu still have separate /bin and /usr/bin; and sh and bash are in /bin, not /usr/bin. (Historically, in BSD and SYSV Unix it was /bin/sh.) Note: Fedora and CentOS package build runs a script that converts all /bin/sh and /bin/bash to /usr/bin/sh and /usr/bin/bash. Change-Id: I9171265829af78dd0cd7622c22b56d22179ff8a3 updates: bz#1193929 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* glusterd: avoid using glusterd's working directory as a brickSanju Rakonde2018-09-081-0/+9
| | | | | | | | | Adding checks for avoiding glusterd's working directory used as a brick for volume creation. fixes: bz#853601 Change-Id: I4b16a05f752e92216aa628f542a4fdbf59b3c669 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* cluster/dht: Rework the debug xattr to get hashed subvolN Balachandran2018-09-071-0/+54
| | | | | | | | | | | | | | | | | | | The earlier implementation required the file to already exist when trying to get the hashed subvol. The reworked implementation allows a user to get the hashed subvol for any filename, whether it exists or not. Usage: getfattr -n "dht.file.hashed-subvol.<filename>" <parent dir> Eg:To get the hashed subvol for file-1 inside dir-1 getfattr -n "dht.file.hashed-subvol.file-1" /mnt/gluster/dir1 credit: rgowdapp@redhat.com Change-Id: Iae20bd5f56d387ef48c1c0a4ffa9f692866bf739 fixes: bz#1624244 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* io-stats: dump io-stats info in /var/run/glusterAmar Tumballi2018-09-051-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | It wouldn't make sense to allow iostats file to be written in *any* directory. While the formating makes sure we try to append io-stats-name for the file, so overwriting existing file is slim, but in any case it makes sense to restrict dumping to one directory. Below are the sample commands, and files created for the corresponding values: $ setfattr -n trusted.io-stats-dump -v file-for-dump $M0 In this case, the file would be in /var/run/gluster/file-for-dump $ setfattr -n trusted.io-stats-dump -v /dir1/dir2/file-for-dump $M0 In this case, then the dump file is in /var/run/gluster/dir1-dir2-file-for-dump Note that the value passed for this virtual xattr would be treated as a file, and even if the value has '/' in it, it would be changed to '-' for sanity. Fixes: bz#1625106 Change-Id: Id9ae6a40a190b8937c51662e6e1c2a0f6c86a0e0 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* afr: thin-arbiter read txn changesRavishankar N2018-09-052-0/+65
| | | | | | | | | | | | | | | | If both data bricks are up, read subvol will be based on read_subvols. If only one data brick is up: - First qeury the data-brick that is up. If it blames the other brick, allow the reads. - If if doesn't, query the TA to obtain the source of truth. TODO: See if in-memory state can be maintained for read txns (BZ 1624358). updates: bz#1579788 Change-Id: I61eec35592af3a1aaf9f90846d9a358b2e4b2fcc Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* New flag to glusterfsd binary to print libexec dirAravinda VK2018-09-052-0/+8
| | | | | | | | | | | | New CLI option for `glusterfsd` binary to get the path of libexec directory. This helps glusterd2 to detect the installed path of `gsyncd` and other binaries. Usage: `glusterfsd --print-libexecdir` Updates: bz#1193929 Change-Id: I8c1a74afd9acec7ee7bd3deabed9d9f20fe3fb5f Signed-off-by: Aravinda VK <avishwan@redhat.com>
* multiple files: calloc -> mallocYaniv Kaul2018-09-043-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xlators/cluster/stripe/src/stripe-helpers.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/tier.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/dht-layout.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/dht-helper.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/dht-common.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/afr/src/afr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/afr/src/afr-inode-read.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible tests/bugs/replicate/bug-1250170-fsync.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible tests/basic/gfapi/gfapi-async-calls-test.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible tests/basic/ec/ec-fast-fgetxattr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible rpc/xdr/src/glusterfs3.h: Move to GF_MALLOC() instead of GF_CALLOC() when possible rpc/rpc-transport/socket/src/socket.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible rpc/rpc-lib/src/rpc-clnt.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible extras/geo-rep/gsync-sync-gfid.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-xml-output.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-rpc-ops.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-volume.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-system.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-snapshot.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-peer.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-global.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible It doesn't make sense to calloc (allocate and clear) memory when the code right away fills that memory with data. It may be optimized by the compiler, or have a microscopic performance improvement. In some cases, also changed allocation size to be sizeof some struct or type instead of a pointer - easier to read. In some cases, removed redundant strlen() calls by saving the result into a variable. 1. Only done for the straightforward cases. There's room for improvement. 2. Please review carefully, especially for string allocation, with the terminating NULL string. Only compile-tested! updates: bz#1193929 Original-Author: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Amar Tumballi <amarts@redhat.com> Change-Id: I16274dca4078a1d06ae09a0daf027d734b631ac2
* cluster/afr: Delegate name-heal when possiblePranith Kumar K2018-09-041-0/+112
| | | | | | | | | | | | | | | | | | | | | | | | | Problem: When name-self-heal is triggered on the mount, it blocks lookup until name-self-heal completes. But that can lead to hangs when lot of clients are accessing a directory which needs name heal and all of them trigger heals waiting for other clients to complete heal. Fix: When a name-heal is needed but quorum number of names have the file and pending xattrs exist on the parent, then better to delegate the heal to SHD which will be completed as part of entry-heal of the parent directory. We could also do the same for quorum-number of names not present but we don't have any known use-case where this is a frequent occurrence so not changing that part at the moment. When there is a gfid mismatch or missing gfid it is important to complete the heal so that next rename doesn't assume everything is fine and perform a rename etc fixes bz#1622821 Change-Id: I8b002c85dffc6eb6f2833e742684a233daefeb2c Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* core: python3Kaleb S. KEITHLEY2018-09-038-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | see https://review.gluster.org/#/c/19788/, https://review.gluster.org/#/c/19871/, https://review.gluster.org/#/c/19952/, https://review.gluster.org/#/c/20104/, https://review.gluster.org/#/c/20162/, https://review.gluster.org/#/c/20185/, https://review.gluster.org/#/c/20207/, https://review.gluster.org/#/c/20227/, https://review.gluster.org/#/c/20307/, https://review.gluster.org/#/c/20320/, https://review.gluster.org/#/c/20332/, https://review.gluster.org/#/c/20364/, https://review.gluster.org/#/c/20441/, and https://review.gluster.org/#/c/20484 shebangs changed from /usr/bin/python2 to /usr/bin/python3. (Reminder, various distribution packaging guidelines require use of explicit python version and don't allow '#!/usr/bin/env python', regardless of how handy that idiom may be.) glusterfs.spec(.in) package python{2,3}-gluster and python2 or python3 dependencies as appropriate. configure(.ac): + test for and use python2 or python3 as appropriate. If build machine has python2 and python3, use python3. Override by setting PYTHON=/usr/bin/python2 when running configure. + PYTHONDEV_CPPFLAGS from python[23]-config --includes is a better match to the original python sysconfig.get_python_inc(). All those other extraneous flags breaks the build. + Only change the shebangs once. Changing them over and over again, e.g., during a `make glusterrpms` in extras/LinuxRPM just sends make (is it really make that's looping?) into an infinite loop. If you figure out why, let me know. + Oldest python2 is python2.6 on CentOS 6 and Debian 8 (Jessie). Everything else has 2.7 or 3.x + logic from https://review.gluster.org/c/glusterfs/+/21050, which needs to be removed/merged after that patch is merged. Builds on CentOS 6, CentOS 7, Fedora 28, Fedora rawhide, and the mysterious RHEL > 7. Change-Id: Idae21d3b6f58b32372e1daa0d234e491e563198f updates: #411 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* cluster/dht: In rename, unlink after creating linkto fileN Balachandran2018-09-031-1/+2
| | | | | | | | | | | | | | The linkto file creation for the dst was done in parallel with the unlink of the old src linkto. If these operations reached the brick out of order, we end up with a dst linkto file without a .glusterfs handle. Fixed by the unlinking only after the linkto file creation has completed. Change-Id: I4246f7655f5bc180f5ded7fd34d263b7828a8110 fixes: bz#1621981 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* libgfchangelog: Fix changelog history APIKotresh HR2018-08-313-0/+279
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: If requested start time and end time doesn't fall into first HTIME file, then history API fails even though continuous changelogs are avaiable for the requested range in other HTIME files. This is induced by changelog disable and enable which creates fresh HTIME index file. Cause and Analysis: Each HTIME index file represents the availability of continuous changelogs. If changelog is disabled and enabled, a new HTIME index file is created represents non availability of continuous changelogs. So as long as the requested start and end falls into single HTIME index file and not across, history API should succeed. But History API checks for the changelogs only in first HTIME index file and errors out if not available. Fix: Check in all HTIME index files for availability of continuous changelogs for requested change. fixes: bz#1622549 Change-Id: I80eeceb5afbd1b89f86a9dc4c320e161907d3559 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* cluster/afr: Delegate metadata heal with pending xattrs to SHDPranith Kumar K2018-08-282-13/+25
| | | | | | | | | | | | | | | | | | | Problem: When metadata-self-heal is triggered on the mount, it blocks lookup until metadata-self-heal completes. But that can lead to hangs when lot of clients are accessing a directory which needs metadata heal and all of them trigger heals waiting for other clients to complete heal. Fix: Only when the heal is needed but the pending xattrs are not set, trigger metadata heal that could block lookup. This is the only case where different clients may give different metadata to the clients without heals, which should be avoided. Updates bz#1622821 Change-Id: I6089e9fda0770a83fb287941b229c882711f4e66 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* posix-acl: skip acl_permits check when the owner setting GF_POSIX_ACL_xxxxKinglong Mee2018-08-272-0/+117
| | | | | | Change-Id: Iaeea470d040587027f37e0760ae27c4fc205a189 fixes: bz#1613098 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* ctr: skip ctr xlator init if ctr is not enabledMohit Agrawal2018-08-271-6/+0
| | | | | | | | | | | | Problem: If ctr xlator is not required it consumes resources unnecessarily Solution: Call ctr xlator init only while feature is enabled Fixes: bz#1524323 Change-Id: I378113a390a286be20c4ade1b1bac170a8ef1b14 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* tests: Preserve tarball of tests when they timeoutShyamsundarR2018-08-271-18/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When tests timeout, the timeout command sends TERM signal to the command being executed. In the case of run-tests.sh it invokes prove, which further invokes perl and finally the test is run using bash. The TERM signal does not seem to be reachnig the end bash that is actually executing the tests, and hence when any test is terminated due to a timeout, the cleanup routine in include.rc does not get a chance to run and preserve the tarball. Further, cleanup invokes tarball generation, but is invoked at the beginning and end of every test, and at times in beteween as well. This caused way too many tarballs in case we decide to preserve the same whenever generated by cleanup. This patch hence moves the tarball generation to run-tests.sh instead, and further stores them named <test>-iteration-<n>.tar and also prints tarball name generated and stored per iteration. This should help relate failed runs to the tarball iteration # and to look at relevant logs. Further the patch also provides a -p option to run-tests.sh for unit testing purposes, where running a test in a loop without the option will generate as many tarballs, and using the option will reduce this to preserving the last tarball, saving space in smaller unit test setups. Fixes: bz#1614062 Change-Id: I0aee76c89df0691cf4d0c1fcd4c04dffe0d7c896 Signed-off-by: ShyamsundarR <srangana@redhat.com>
* features/snapview-server: validate the fs instance before doing fop thereRaghavendra Bhat2018-08-241-0/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: ======== USS design depends on snapview-server translator communicating with each individual snapshot via gfapi. So, the snapview-server xlator maintains the glfs instance (thus the snapshot) to which a inode belongs to by storing it inside the inode context. Suppose, a file from a snapshot is opened by a application, and the fd is still valid from application's point of view (i.e. application has not yet closed fd). Now, if the snapshot to which the opened file belongs to is deleted, then the glfs_t instance corresponding to the snapshot is destroyed by snapview-server as part of snap deletion. But now, if the application does IO on the fd it has kept open, then snapview server tries to send that request to the corresponding snap via glfs instance for that snapshot stored in the inode context for the file on which the application is sending the fop. And this results in freed up glfs_t pointer being accessed and causes a segfault. FIX: === For fd based operations, check whether the glfs instance that the inode contains in its context, is still valid or not. For non fd based operations, usually lookup should guarantee that. But if the file was already looked up, and the client accessing the snap data (either NFS, or native glusterfs fuse) does not bother to send a lookup and directly sends a path based fop, then that path based fop should ensure that the fs instance is valid. Change-Id: I881be15ec46ecb51aa844d7fd41d5630f0d644fb updates: bz#1602070 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* storage/posix: Increment trusted.pgfid in posix_mknodN Balachandran2018-08-231-0/+57
| | | | | | | | | | The value of trusted.pgfid.xx was always set to 1 in posix_mknod. This is incorrect if posix_mknod calls posix_create_link_if_gfid_exists. Change-Id: Ibe87ca6f155846b9a7c7abbfb1eb8b6a99a5eb68 fixes: bz#1619720 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* xlators/playground: fix the template files with latest requirementsAmar Tumballi2018-08-222-3/+37
| | | | | | | | | | | | | | | * Make use of xlator_api * Make use of gf_msg() * Make use of mem-pool * Add a sample metrics dump function * Provide an dummy option, which can be initialized, and reconfigured * Add a test case to make sure template xlator is built and used with default fops * Make a change in rpc-coverage to run without lock tests. Updates: bz#1193929 Change-Id: I377dd67b656f440f9bc7c0098e21c0c1934e9096 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* cli: Fix invalid access of option_str variableShyamsundarR2018-08-201-0/+3
| | | | | | | | | | | | | | | | | | In function cli_cmd_volume_statedump_options_parse if the wordcount of arguments is exactly 3, then option_str would remain NULL, and hence the function will generate a segmentation fault on the strstr check in its body. This can be triggered when we run the command, `gluster volume statedump <volname>` The fix is to check if option_str is non-NULL before use and also to pass in a duplicated empty string to the dict key "options" when this is NULL. Fixes: bz#1619423 Change-Id: Ic029ab60b64890d92c7a0876a638929495d3aa59 Signed-off-by: ShyamsundarR <srangana@redhat.com>
* cluster/afr: Fix bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.tkarthik-us2018-08-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Problem: In line #13 of the test case, it checks whether the file is present on first 2 bricks or not. If it is not present on even one of the bricks it will break the loop and checks for the dirty marking on the parent on the 3rd brick and checks for file not present on the 1st and 2nd bricks. The below scenario can happen in this case: - File gets created on 1st and 3rd bricks - In line #13 it sees file is not present on both 1st & 2nd bricks and breaks the loop - In line #51 test fails because the file will be present on the 1st brick - In line #53 test will fail because the file creation was not failed on quorum bricks and dirty marking will not be there on the parent on 3rd brick Fix: Don't break from the loop if file is present on either brick 1 or brick 2. Change-Id: I918068165e4b9124c1de86cfb373801b5b432bd9 fixes: bz#1612054 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* posix: Delete the entry if gfid link creation failskarthik-us2018-08-201-1/+2
| | | | | | | | | | | | | | | Problem: If the gfid link file inside .glusterfs is not present for a file, the operations which are dependent on the gfid will fail, complaining the link file does not exists inside .glusterfs. Fix: If the link file creation fails, fail the entry creation operation and delete the original file. Change-Id: Id767511de2da46b1f45aea45cb68b98d965ac96d fixes: bz#1612037 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* snapshot/handshake: store description after strdupMohammed Rafi KC2018-08-201-0/+48
| | | | | | | | | | | | | | | problem: During a handshake, when we import a friend data snap description variable was just referenced to dictionary value. Solution: snap description should have a separate memory allocated through gf_strdup Change-Id: I94da0c57919e1228919231d1563a001362b100b8 fixes: bz#1618004 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* snapshot:Fail snapshot creation if an empty description providedMohammed Rafi KC2018-08-191-0/+1
| | | | | | | | | | Snapshot description should have a valid string. Creating a snapshot with null value will cause reading from info file to fail with a null exception Change-Id: I9f84154b8e3e7ffefa5438807b3bb9b4e0d964ca updates: bz#1618004 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* performance/readdir-ahead: keep stats of cached dentries in sync with ↵Krutika Dhananjay2018-08-183-1/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | modifications PROBLEM: Stats of dentries that are readdirp'd ahead can become stale due to fops like writes, truncate etc that modify the file pointed by dentries. When a readdir is finally wound at offset corresponding to these entries, the iatts that are returned to the application come from readdir-ahead's cache, which are stale by now. This problem gets further aggravated when caching translators/modules cache and continue to serve this stale information. FIX: * Store the iatt in context of the inode pointed by dentry. * Whenever the inode pointed by dentry undergoes modification, in cbk of modification fop, update the iatt stored in inode-ctx to reflect the modification. * When serving a readdirp response from application, update iatts of dentries with the iatts stored in the context of inodes pointed by these dentries. * Some fops don't have valid iatts in their responses. For eg., write response whose data is still cached in write-behind will have zeroed out stat. In this case keep only ia_type and ia_gfid and reset rest of the iatt members to zero. - fuse-bridge in this case just sends "entry" information back to kernel and attr is not sent. - gfapi sets entry->inode to NULL and zeroes out the entire stat * There is one tiny race between the entry creation and a readdirp on its parent dir, which could cause the inode-ctx setting and inode ctx reading to happen on two different inode objects. To prevent this, when entry->inode doesn't eqaul to linked_inode, - fuse-bridge is made to send only "entry" information without attributes - gfapi sets entry->inode to NULL and zeroes out the entire stat. Change-Id: Ia27ff49a61922e88c73a1547ad8aacc9968a69df BUG: 1390050 Updates: bz#1390050 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
* tests: Add thin-arbiter.rc for writing tests for thin-arbiterPranith Kumar K2018-08-182-0/+472
| | | | | | fixes bz#1615789 Change-Id: I1f42e78fec5ddaf2a425dc4b82c9a20472aa146d Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* tests: Increase timeout to 30 minutes to handle lcov slownessPranith Kumar K2018-08-161-1/+1
| | | | | | | | | | This script on a normal setup takes 15 minutes. With lcov it needs to be increased. Considering we did 1.5X of the default $run_timeout in run-tests.sh, I am doing the same for this script. fixes bz#1614718 Change-Id: Ia571b33ff13deb8cbd8e48561769e876aa0b1aff Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* tests: Fix spurious failures in stats-dump.t testShyamsundarR2018-08-161-0/+8
| | | | | | | | | | | | | | | | | The test fails to grep and find queue_size, in a brick stats dump, having succesfully found aggr.* values in the same. The troubleshot is that, the writer thread in io-stats, that dumps this in a particular interval, truncates the file just before the grep attempts to read the contents, and hence the failure. The fix is to stop the dumper thread, and then wait for a couple of seconds and then check the output, so that the thread writer does not interfere with the test. Fixes: bz#1615582 Change-Id: I29f95488a2ad693abe1dd525b1d87a9d1eee29a2 Signed-off-by: ShyamsundarR <srangana@redhat.com>
* contrib/xxhash: update to latest xxHash (0.6.5)Yaniv Kaul2018-08-162-19/+19
| | | | | | | | | | | | | | | | | | | | | | Update to latest xxHash, which is supposed to faster with small keys. Specifically, updated to 3064d42e7d74b0921bdd1818395d9cb37bb8976a, which is a bit higher than 0.6.5. Compiled hopefully with namespace (XXH_NAMESPACE=GF_), which allows to use XXH() funcs with no fear they'll 'leak' from our library. Only compile tested! xxhsum is modified to display messages which was conflicting with regression tests (TAP harness). So modified the gfid2path_fuse.t and gfid2path_nfs.t to adhere to that. updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: I35cea5cc93f338c1023ac2c9bc6d7d13225a967b
* features/shard: Fix crash and test case in RENAME fopKrutika Dhananjay2018-08-141-40/+56
| | | | | | | | | | | | | | Setting the refresh flag in inode ctx in shard_rename_src_cbk() is applicable only when the dst file exists and is sharded and has a hard link > 1 at the time of rename. But this piece of code is exercised even when dst doesn't exist. In this case, the mount crashes because local->int_inodelk.loc.inode is NULL. Change-Id: Iaf85a5ee3dff8b01a76e11972f10f2bb9dcbd407 Updates: bz#1611692 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* tests: Fix for gfid-mismatch-resolution-with-fav-child-policy.t failurekarthik-us2018-08-141-0/+1
| | | | | | | | | | | | | | | | | | | This test was retried once on build https://build.gluster.org/job/regression-on-demand-multiplex/174/ (logs for the first try is not available with this build) Test case was failing in line #47 where it was was checking for the heal count to be 0. Line #51 had passed that means file got the gfid split brain resolved, and both the bricks had same gfids. At line #54 it again failed which checks for the md5sum on both the bricks. At this point the md5sum of the brick where the file got impunged had the md5sum same as the newly created empty file. This means the data heal has not happened for the file. At line #64 enabling granular-entry-heal faild, but without the logs it is not possible to debug this issue. Change-Id: I56d854dbb9e188cafedfd24a9d463603ae79bd06 fixes: bz#1615331 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* tests: fix brick check ordersAtin Mukherjee2018-08-139-43/+66
| | | | | | | | | | | | fix brick checks for validating-server-quorum.t & quorum-validation.t ...and make brick_up_status_1 function more generic. Also fix a timing issue in bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t Change-Id: I797ef4cec5b160aafa979bae7151b1e99fcb48ac Updates: bz#1603063 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* tests/quick-read/bug-846240.t: fix a wrong testRaghavendra G2018-08-131-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Earlier this test did following things on M0 and M1 mounted on same volume: 1 create file M0/testfile 2 open an fd on M0/testfile 3 remove the file from M1, M1/testfile 4 echo "data" >> M0/testfile The test expects appending data to M0/testfile to fail. However, redirector ">>" creates a file if it doesn't exist. So, the only reason test succeeded was due to lookup succeeding due to stale stat in md-cache. This hypothesis is verified by two experiments: * Add a sleep of 10 seconds before append operation. md-cache cache expires and lookup fails followed by creation of file and hence append succeeds to new file. * set md-cache timeout to 600 seconds and test never fails even with sleep 10 before append operation. Reason is stale stat in md-cache survives sleep 10. So, the spurious nature of failure was dependent on whether lookup is done when stat is present in md-cache or not. The actual test should've been to write to the fd opened in step 2 above. I've changed the test accordingly. Note that this patch also remounts M0 after initial file creation as open-behind disables opening-behind on witnessing a setattr on the inode and touch involves a setattr. On remount, create operation is not done and hence file is opened-behind. Change-Id: I739f255e0a62ff0024f0824dad3539974955df99 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1615096