Commit message | Author | Age | Files | Lines
...
* nfs-server-mount : fix coverity issues in mount3.c | Sunny Kumar | 2018-08-17 | 1 | -5/+9
  Fixes CID 1389033, 1388767, 1288782.

  Change-Id: I244f88b2ca8487f8926da45d886982558ad45c7a
  updates: bz#789278
  Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* meta : fix coverity in meta-helpers.c | Sunny Kumar | 2018-08-17 | 1 | -13/+13
  This fixes CID 1214627 and 1257625.

  updates: bz#789278
  Change-Id: I6eb1ccf7b498948d1c41ff830e65437ef818cd55
  Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* features/acl: Fix a possible null dereference | Vijay Bellur | 2018-08-17 | 1 | -2/+3
  Addresses CID 1370952

  Change-Id: I1f157dbede32e74e38aed8a1a162e38107f2628d
  updates: bz#789278
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: coverity defects fix introduced by commit 1f3bfe7 | Atin Mukherjee | 2018-08-17 | 2 | -2/+2
  Commit 1f3bfe7 reduced the total size of certain path-related variables in glusterd_brickinfo_t, which led to a couple of new coverity defects. Fix the following:

  CID : 1394969 Destination buffer too small
  CID : 1394968 Out-of-bounds access

  Change-Id: Ibc30eac4680cc6c83bd89d248f1435cb6a3d1b75
  updates: bz#789278
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* statedump : fix coverity issues | Sunny Kumar | 2018-08-17 | 1 | -2/+2
  Comparing an array to NULL is not useful; the test always evaluates as true.

  Fixes CID 1325566 and 1389371.

  updates: bz#789278
  Change-Id: Id51f84cc62767a432de1d12851ae2669c1596a94
  Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
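  To illustrate the class of defect described here, the following is a minimal sketch, not the statedump code itself. The struct and function names are hypothetical. An array embedded in a struct always has a non-NULL address, so the useful test is on its contents.

```c
#include <stddef.h>

/* Hypothetical struct for illustration; not the real statedump type. */
struct dump_opts {
    char path[64];   /* embedded array: its address can never be NULL */
};

/* Returns 1 when a dump path has been configured, 0 otherwise. */
int has_dump_path(const struct dump_opts *opts)
{
    if (opts == NULL)   /* a pointer can be NULL, so this check is meaningful */
        return 0;

    /* 'opts->path != NULL' would always be true; test the contents instead. */
    return opts->path[0] != '\0';
}
```

  Calling has_dump_path() on a zero-initialized struct returns 0, whereas a comparison of opts->path against NULL would report the path as present.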
* jbr : fix coverity issues in jbr | Sunny Kumar | 2018-08-17 | 1 | -0/+6
  This patch fixes CID 1357875 and 1357869.

  Change-Id: Ief88523e5ad92a2c884ff1b85cd613992bba0dad
  updates: bz#789278
  Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* glusterd: ignore importing volume which is undergoing a delete operation | Atin Mukherjee | 2018-08-16 | 5 | -6/+42
  Problem explanation:

  Assume a 3-node cluster. If N1 originates a delete operation and, while N1's commit phase completes, the glusterd service of either N2 or N3 gets disconnected from N1 (before completing the commit phase), N1 will end up importing the volume, which is in-flight for a delete on the other nodes, as a fresh volume, resulting in an incorrect configuration state.

  Fix:

  Mark a volume as stage deleted once a volume delete operation passes its staging phase, and reset this flag during the unlock phase. If the same volume gets imported to other peers during this intermediate phase, it should not be considered for recreation.

  An automated .t is quite tough to implement with the current infra.

  Test Case:

  1. Keep creating and deleting volumes in a loop on a 3-node cluster
  2. Simulate n/w failure between the peers (ifdown followed by ifup)
  3. Check if output of 'gluster v list | wc -l' is same across all 3 nodes during 1 & 2.

  Change-Id: Ifdd5dc39699120258d7fdd42fe2deb9de25c6246
  Fixes: bz#1605077
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
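  The flag lifecycle described in the fix can be sketched as a toy model. All names below are hypothetical stand-ins and not glusterd's actual structures or functions; the sketch only shows that an import is skipped between the staging and unlock phases.

```c
#include <stdbool.h>

/* Toy stand-in for a volume record; NOT glusterd's real volinfo type. */
struct toy_volinfo {
    bool stage_deleted;   /* set between delete staging and unlock */
};

/* Delete passed its staging phase: mark the volume. */
void toy_delete_staged(struct toy_volinfo *v)   { v->stage_deleted = true; }

/* Unlock phase: reset the marker. */
void toy_delete_unlocked(struct toy_volinfo *v) { v->stage_deleted = false; }

/* A peer import must not recreate a volume that is mid-delete. */
bool toy_should_import(const struct toy_volinfo *v)
{
    return !v->stage_deleted;
}
```

  In this model an import request that arrives after toy_delete_staged() and before toy_delete_unlocked() is ignored, which mirrors the intent stated in the commit message.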
* build: use standard PKG_CHECK_MODULES for libxml2 availability | Niels de Vos | 2018-08-16 | 4 | -10/+7
  When the development parts of libxml2 were not installed, it was necessary to re-run ./autogen.sh to clean up the cached values for the check. This is not nice towards users.

  By using the standard PKG_CHECK_MODULES for libxml-2.0, the results of the check are not cached and will be probed again when running ./configure.

  Change-Id: I3c4586e5555a521be5d4fb61bdb873ae0317311a
  Fixes: bz#1599219
  Reported-by: Sachidananda Urs <surs@redhat.com>
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
* trash : fix coverity issues in trash.c | Sunny Kumar | 2018-08-16 | 1 | -3/+6
  This patch fixes CID : 1382380 and 1382428.

  Change-Id: Ice3c8f5c2d97a0b541665bff744f32fbea9e294f
  updates: bz#789278
  Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* libglusterfs: FORWARD_NULL coverity fix | Sunil Kumar Acharya | 2018-08-16 | 2 | -1/+7
  Fixing FORWARD_NULL coverity errors in libglusterfs.

  CID: 1391407, 1391410
  BUG: 789278
  Change-Id: I3d20523005e4418759c8a72620edff7c977d2d00
  updates: bz#789278
  Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
* cluster/ec: FORWARD_NULL coverity fix | Sunil Kumar Acharya | 2018-08-16 | 2 | -1/+5
  Fixing FORWARD_NULL coverity errors with EC.

  CID: 1394650
  BUG: 789278
  Change-Id: I52c99dac3483ca31a86cd7e3a959d4010b195f32
  updates: bz#789278
  Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
* uss : fix coverity issues | Sunny Kumar | 2018-08-16 | 1 | -2/+1
  This patch fixes coverity issues in snapview-server.c

  CID : 1274119, 1325525

  Scan details at [1].

  [1]. https://scan6.coverity.com/reports.htm#v42401/p10714/fileInstanceId=84476369&defectInstanceId=25631967&mergedDefectId=778645

  Change-Id: I825f09eabf84a2262a079c1f920a673727c5792b
  updates: bz#789278
  Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* tests: Increase timeout to 30 minutes to handle lcov slowness | Pranith Kumar K | 2018-08-16 | 1 | -1/+1
  This script takes 15 minutes on a normal setup. With lcov the timeout needs to be increased. Since we used 1.5X of the default $run_timeout in run-tests.sh, I am doing the same for this script.

  fixes bz#1614718
  Change-Id: Ia571b33ff13deb8cbd8e48561769e876aa0b1aff
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* contrib: Remove gf_mkostemp copied from GLibC | ShyamsundarR | 2018-08-16 | 4 | -114/+13
  gf_mkostemp was borrowed from glibc a long time back. We now have mkostemp or mkstemp alternatives on all distributions and versions that we care about.

  As a result, this removes it from the contrib directory and modifies the one instance that was still using it.

  This is part of the code cleanup done while fixing coverity SECURE_TEMP errors.

  Updates: bz#1193929
  Change-Id: I1ad7863043cdb0845c53748f5a0522e767079130
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
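  For readers unfamiliar with the libc alternatives mentioned here, this is a small sketch that uses the standard mkstemp() call. The helper name write_temp_file is hypothetical, and the commit message does not say which of mkostemp or mkstemp the remaining call site switched to.

```c
#include <stdlib.h>
#include <unistd.h>

/* Create a unique temporary file and write 'len' bytes of 'data' to it.
 * 'tmpl' must be a writable buffer ending in "XXXXXX"; mkstemp() rewrites
 * it in place with the generated file name.
 * Returns 0 on success, -1 on failure (the file is removed on a short write). */
int write_temp_file(char *tmpl, const char *data, size_t len)
{
    int fd = mkstemp(tmpl);   /* creates and opens the file atomically */
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, data, len);
    close(fd);

    if (n != (ssize_t)len) {
        unlink(tmpl);
        return -1;
    }
    return 0;
}
```

  A caller passes a modifiable array such as char tmpl[] = "/tmp/example-XXXXXX"; and is responsible for unlinking the file when done.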
* tests: Fix spurious failures in stats-dump.t test | ShyamsundarR | 2018-08-16 | 1 | -0/+8
  The test fails to grep and find queue_size in a brick stats dump, having successfully found aggr.* values in the same dump.

  The cause is that the writer thread in io-stats, which dumps this at a particular interval, truncates the file just before the grep attempts to read the contents, hence the failure.

  The fix is to stop the dumper thread, wait for a couple of seconds, and then check the output, so that the writer thread does not interfere with the test.

  Fixes: bz#1615582
  Change-Id: I29f95488a2ad693abe1dd525b1d87a9d1eee29a2
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
* performance/md-cache: Use bitwise AND instead of logical AND | Vijay Bellur | 2018-08-16 | 1 | -1/+1
  Addresses CID: 1394640

  Change-Id: I1139222301569d17760df74624acd301594063b9
  updates: bz#789278
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
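  The difference between the two operators can be shown with a short sketch. The flag names and values below are made up for illustration and are not the md-cache definitions.

```c
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define FLAG_STAT   0x1u
#define FLAG_XATTR  0x2u

/* Buggy form: '&&' only asks whether both operands are non-zero,
 * so any non-zero 'flags' value makes this return 1. */
int has_xattr_logical(uint32_t flags)
{
    return flags && FLAG_XATTR;
}

/* Intended form: '&' tests the specific bit. */
int has_xattr_bitwise(uint32_t flags)
{
    return (flags & FLAG_XATTR) != 0;
}
```

  With flags set to FLAG_STAT only, the logical version wrongly reports the xattr bit as set, while the bitwise version reports it as clear.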
* contrib/xxhash: update to latest xxHash (0.6.5) | Yaniv Kaul | 2018-08-16 | 8 | -499/+730
  Update to the latest xxHash, which is supposed to be faster with small keys.

  Specifically, updated to 3064d42e7d74b0921bdd1818395d9cb37bb8976a, which is a bit newer than 0.6.5.

  Compiled, hopefully, with a namespace (XXH_NAMESPACE=GF_), which allows using the XXH() functions with no fear that they will 'leak' from our library.

  Only compile tested!

  The updated xxhsum displays messages that conflicted with the regression tests (TAP harness), so gfid2path_fuse.t and gfid2path_nfs.t were modified to cope with that.

  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
  Change-Id: I35cea5cc93f338c1023ac2c9bc6d7d13225a967b
* mountbroker : fix coverity issue in glusterd-mountbroker.c | Sunny Kumar | 2018-08-15 | 1 | -1/+4
  Fixes CID : 1124789

  updates: bz#789278
  Change-Id: I61c70f05e6377d7ddc8961556274714dd356a117
  Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* features/changelog: Fix a resource leak | Vijay Bellur | 2018-08-15 | 1 | -0/+1
  Fixes CID 1382359

  Change-Id: Iaafbdb9a45496091327e3dc9092e09148fa9a5c5
  updates: bz#789278
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* Bash integration script should namespace variables | Mark Mielke | 2018-08-15 | 1 | -20/+20
  In the originally submitted script, it looks like effort was put into namespacing all global variables. However, a few mistakes remained.

  GLUSTER_TOP_SUBOPTIONSx were defined, but TOP_SUBOPTIONSx were referenced. This was likely an unrecognized defect in the original code submission. These are now corrected to refer to GLUSTER_TOP_SUBOPTIONSx.

  FINAL_LIST, LIST, and TOP were leaked into all Bash shells and used by the command completion functions. The most problematic of these was TOP, which was declared with "-i", making it an integer. This caused other code that used TOP to define a path to fail like this:

    $ bash
    $ TOP=/abc
    bash: /abc: syntax error: operand expected (error token is "/abc")

  These are now qualified as GLUSTER_FINAL_LIST, GLUSTER_LIST, and GLUSTER_TOP to reduce the impact on scripts that might choose to use these extremely common variable names.

  Change-Id: Ic96eda8efd1f3238bbade6c6ddb69118e8d82158
  Fixes: bz#1425325
  Signed-off-by: Mark Mielke <mark.mielke@gmail.com>
* glusterd: fix gcc7 warnings | Amar Tumballi | 2018-08-14 | 3 | -22/+54
  [sh]$ gcc --version
  gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)

  Warnings were of the type below:

    xlators/mgmt/glusterd/src/glusterd-store.c:3285:33: warning: ‘/options’ directive output may be truncated writing 8 bytes into a region of size between 1 and 4096 [-Wformat-truncation=]
      snprintf (path, len, "%s/options", conf->workdir);
                              ^~~~~~~~
    xlators/mgmt/glusterd/src/glusterd-store.c:1280:39: warning: ‘/snaps/’ directive output may be truncated writing 7 bytes into a region of size between 1 and 4096 [-Wformat-truncation=]
      snprintf (snap_fpath, len, "%s/snaps/%s/%s", priv->workdir,
                                    ^~~~~~~

  * Also changed some places where there were issues with key size
  * Made sure all the 'char buf[SOMESIZE] = {0,};' are changed to 'char buf[SOMESIZE] = "";' in the files I changed
  * Also edited the coding standard to reflect that.

  updates: bz#1193929
  Change-Id: I04c652624ac63199cea2077e46b3a5def37c3689
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
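  One common way to handle this class of warning is to check the value snprintf() returns, since a result greater than or equal to the buffer size means the output was cut short. The sketch below assumes that approach; the commit message does not say exactly how each call site was changed, and build_options_path is a hypothetical helper name.

```c
#include <stdio.h>
#include <stddef.h>

/* Build "<workdir>/options" into 'path'.
 * Returns 0 on success, -1 if the result would not fit in 'len' bytes. */
int build_options_path(char *path, size_t len, const char *workdir)
{
    int ret = snprintf(path, len, "%s/options", workdir);

    /* ret is the length the full string needs, excluding the NUL byte. */
    if (ret < 0 || (size_t)ret >= len)
        return -1;   /* encoding error or truncated output */

    return 0;
}
```

  The caller can then fail the operation cleanly instead of continuing with a silently truncated path.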
* mgmt/glusterd: Fix possible use after free in glusterd_op_ac_commit_op() | Vijay Bellur | 2018-08-14 | 1 | -1/+3
  Fixes CID 1391418

  Change-Id: I60ce6cd3b2528369f4dc1be81c0c15a1a806982a
  updates: bz#789278
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: Fix buffer length to prevent a memory overrun | Vijay Bellur | 2018-08-14 | 1 | -2/+2
  Fixes CID 1394647, 1394658

  Change-Id: I30cf6e793919a08e0a3fe10622351b8316d7767c
  updates: bz#789278
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: remove the unused databuf in rebalance structure | Amar Tumballi | 2018-08-14 | 1 | -1/+0
  While it is a one-line fix, it prevents a significant amount of unwanted memory from being allocated for the defrag structure.

  Updates: bz#1193929
  Change-Id: Idda70d1d3dc0e7be56c35e872aa6edfaf752290d
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* features/shard: Fix crash and test case in RENAME fop | Krutika Dhananjay | 2018-08-14 | 2 | -42/+61
  Setting the refresh flag in the inode ctx in shard_rename_src_cbk() is applicable only when the dst file exists, is sharded, and has a hard link count > 1 at the time of rename.

  But this piece of code is exercised even when dst doesn't exist. In this case, the mount crashes because local->int_inodelk.loc.inode is NULL.

  Change-Id: Iaf85a5ee3dff8b01a76e11972f10f2bb9dcbd407
  Updates: bz#1611692
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* cluster/dht: Fixed rebalanced files | N Balachandran | 2018-08-14 | 1 | -1/+1
  An error caused skipped files to be counted as rebalanced files.

  Change-Id: I02333f099fb8b73ba953f41a2922021a1e4da7be
  fixes: bz#1615474
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
* cluster/dht: fix inode ref management in dht_heal_path | Susant Palai | 2018-08-14 | 1 | -2/+7
  In dht_heal_path, the inodes are created and looked up from the top down. If the path is "a/b/c", then lookup will be done on a, then b, and so on. Here is a rough snippet of the function "dht_heal_path".

  <snippet>
  if (bname) {                                    ref_count
      - loc.inode = create/grep inode                 1
      - syncop_lookup (loc.inode)
      - linked_inode = inode_link (loc.inode)         2
      /* clean up current loc */
      - loc_wipe (&loc)                               1
      /* set up parent and bname for next child */
      - loc.parent = inode
      - bname = next_child_name
  }
  out:
      - inode_ref (linked_inode)                      2
      - loc_wipe (&loc)                               1
  </snippet>

  The problem with the above code is that if _bname_ is empty, i.e. the chain lookup is done, we populate loc.parent for the next iteration anyway. Now that bname is empty, the loc_wipe is done in the _out_ section as well. Since loc.parent was set to the previous inode, we lose a ref unintentionally. A dht_local_wipe as part of the DHT_STACK_UNWIND then takes away the last ref, leading to inode_destroy.

  This problem is currently observed with nfs-ganesha with the nameless lookup. After the inode_purge, gfapi does not get the new inode to link and hence links the inode it sent in the lookup fop, which does not have any dht-related context (layout), leading to an "invalid argument" error in the lookup path when done in parallel with a tar operation.

  Test done in the following way:
  - create two nfs clients connected to two different nfs servers.
  - run untar on one client and run lookup continuously on the other.
  - Prior to this patch, the invalid argument error was seen; it is fixed with the current patch.

  Change-Id: Ifb90c178a2f3c16604068c7da8fa562b877f5c61
  fixes: bz#1610256
  Signed-off-by: Susant Palai <spalai@redhat.com>
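  The ref counts in the snippet can be replayed with a toy model. Everything below is a hypothetical stand-in (toy_inode, toy_loc, toy_loc_wipe) and not the libglusterfs inode or loc API; the 'guard' parameter models one possible fix, which is to leave loc.parent unset once there is no next path component. The actual patch may manage the refs differently.

```c
#include <stddef.h>

/* Toy stand-ins for inode_t and loc_t; NOT the real libglusterfs types. */
struct toy_inode { int refs; };
struct toy_loc   { struct toy_inode *inode; struct toy_inode *parent; };

static void toy_unref(struct toy_inode *i) { if (i) i->refs--; }

/* Like loc_wipe() in the snippet: drops one ref per inode the loc points to. */
static void toy_loc_wipe(struct toy_loc *loc)
{
    toy_unref(loc->inode);
    toy_unref(loc->parent);
    loc->inode = NULL;
    loc->parent = NULL;
}

/* Replays the last iteration of the walk, where bname is empty afterwards.
 * guard == 0 reproduces the snippet; guard != 0 skips setting loc.parent.
 * Returns the ref count left on the inode when the function exits. */
int toy_heal_last_component(int guard)
{
    struct toy_inode inode = { 0 };
    struct toy_loc loc = { NULL, NULL };
    int more_children = 0;     /* bname is empty: the chain lookup is done */

    inode.refs++;              /* create/grep inode           -> 1 */
    loc.inode = &inode;
    inode.refs++;              /* inode_link() returns a ref  -> 2 */
    toy_loc_wipe(&loc);        /* clean up current loc        -> 1 */

    if (!guard || more_children)
        loc.parent = &inode;   /* pointer stored, no extra ref taken */

    /* out: */
    inode.refs++;              /* inode_ref(linked_inode)     -> 2 */
    toy_loc_wipe(&loc);        /* unguarded: -> 1, guarded: stays 2 */
    return inode.refs;
}
```

  In this model the unguarded path exits with one ref fewer than the guarded path, which corresponds to the ref the commit message describes as lost before dht_local_wipe drops the last one.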
* mgmt/glusterd: Fix a memory leak in volgen | Vijay Bellur | 2018-08-14 | 1 | -0/+1
  Fixes CID 1325557

  Change-Id: I5e33ae19ddf4c44a49a2b3b3dea0c739bc96d3a7
  updates: bz#789278
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* All: remove memset() before sprintf() | Yaniv Kaul | 2018-08-14 | 31 | -761/+96
  It's not needed.

  There's a good chance the compiler is smart enough to remove it anyway, but it can't hurt - I hope.

  Compile-tested only!

  Change-Id: Id7c054e146ba630227affa591007803f3046416b
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
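  Why the memset() is redundant can be seen in a small sketch. The helper name format_brick_id and the format string are made up for illustration. The formatting call writes from the start of the buffer and adds its own terminating NUL, so whatever the buffer held before does not matter.

```c
#include <stdio.h>
#include <stddef.h>

/* Format "<host>-brick-<idx>" into 'buf'.
 * No memset(buf, 0, len) is needed beforehand: snprintf() overwrites the
 * buffer from buf[0] and always NUL-terminates when len > 0.
 * Returns the value of snprintf(): the length the full string requires. */
int format_brick_id(char *buf, size_t len, const char *host, int idx)
{
    return snprintf(buf, len, "%s-brick-%d", host, idx);
}
```

  Calling it on a buffer that already contains other characters still yields a correctly terminated string.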
* error-gen, locks: Fix a typo in comments | Vijay Bellur | 2018-08-14 | 2 | -3/+3
  s/coverty/coverity/

  Change-Id: Iac7c13176162eace4247dd3236373aa76d906380
  updates: bz#789278
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* tests: Fix for gfid-mismatch-resolution-with-fav-child-policy.t failure | karthik-us | 2018-08-14 | 1 | -0/+1
  This test was retried once on build https://build.gluster.org/job/regression-on-demand-multiplex/174/ (logs for the first try are not available with this build).

  The test case was failing at line #47, where it was checking for the heal count to be 0. Line #51 had passed, which means the gfid split brain was resolved for the file and both bricks had the same gfids. At line #54 it failed again; that line checks the md5sum on both bricks. At this point the file on the brick where it got impunged had the same md5sum as a newly created empty file, which means the data heal had not happened for the file.

  At line #64 enabling granular-entry-heal failed, but without the logs it is not possible to debug this issue.

  Change-Id: I56d854dbb9e188cafedfd24a9d463603ae79bd06
  fixes: bz#1615331
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
* libglusterfs: Fix a resource leak in graph.c | Vijay Bellur | 2018-08-13 | 1 | -0/+1
  Fixes CID 1382367

  Change-Id: I02678fc71716ab0046ea2ef437c6594a8a34a4fc
  updates: bz#789278
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* cloudsync: fix -Werror=format-truncation error on gcc8 | Susant Palai | 2018-08-13 | 1 | -13/+51
  Here is the gcc8 warning:

    libcloudsyncs3.c: In function ‘aws_download_s3’:
    libcloudsyncs3.c:480:48: error: ‘%s’ directive output may be truncated writing up to 4095 bytes into a region of size 1015 [-Werror=format-truncation=]
      snprintf(buf, sizeof(buf), "https://%s/%s", priv->hostname, resource);
    libcloudsyncs3.c:480:9: note: ‘snprintf’ output 10 or more bytes (assuming 4105) into a destination of size 1024
      snprintf(buf, sizeof(buf), "https://%s/%s", priv->hostname, resource);

  Memleak: This also fixes a memory leak where sign_req in aws_form_request was not freed. The calloc size for sign_req was adjusted as well to match what is needed.

  Test: The local cloudsync regression test was run to validate the changes. Smoke validation will be sufficient for the gcc8 warning fixes.

  Fixes: bz#1609126
  Change-Id: I1c537b30168f2e0b54862344a951843e86b0b488
  Signed-off-by: Susant Palai <spalai@redhat.com>
* tests: fix brick check orders | Atin Mukherjee | 2018-08-13 | 9 | -43/+66
  Fix brick checks for validating-server-quorum.t & quorum-validation.t, and make the brick_up_status_1 function more generic.

  Also fix a timing issue in bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t

  Change-Id: I797ef4cec5b160aafa979bae7151b1e99fcb48ac
  Updates: bz#1603063
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* posix: Mark 'shared-brick-count' as settable | Prashanth Pai | 2018-08-13 | 1 | -0/+1
  updates: #302
  Change-Id: I9c1b9c9751c21866b074ac5d3ef15a58ae7aa707
  Signed-off-by: Prashanth Pai <ppai@redhat.com>
* Fix a grammar error in the logs | Nigel Babu | 2018-08-13 | 1 | -1/+1
  Change-Id: Ie4fe18d5094c051fa20de71f7fc841085cc6aaee
  Fixes: bz#1614142
  Signed-off-by: Nigel Babu <nigelb@redhat.com>
* coverity: last of the secure temp fixes | ShyamsundarR | 2018-08-13 | 1 | -3/+1
  The Coverity ignore directive does not work if the comment is split across lines (or has an empty line at the end). This can be seen in this report:

  https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-08-06-b982e09f/html/1/384glusterfsd-mgmt.c.html#error

  In other places the same pattern has kept coverity from flagging the same call, except here.

  Updates: bz#789278
  Change-Id: Ic35ff0fc91d0a42904630728ef7c18215aa277f3
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
* tests/quick-read/bug-846240.t: fix a wrong test | Raghavendra G | 2018-08-13 | 1 | -2/+3
  Earlier this test did the following on M0 and M1, mounted on the same volume:

  1. create file M0/testfile
  2. open an fd on M0/testfile
  3. remove the file from M1, M1/testfile
  4. echo "data" >> M0/testfile

  The test expects appending data to M0/testfile to fail. However, the redirector ">>" creates a file if it doesn't exist. So the only reason the test succeeded was that lookup succeeded because of a stale stat in md-cache. This hypothesis is verified by two experiments:

  * Add a sleep of 10 seconds before the append operation. The md-cache entry expires and lookup fails, followed by creation of the file, and hence the append succeeds on the new file.
  * Set the md-cache timeout to 600 seconds, and the test never fails even with sleep 10 before the append operation. The reason is that the stale stat in md-cache survives the sleep 10.

  So the spurious nature of the failure depended on whether lookup is done while the stat is present in md-cache or not.

  The actual test should have been to write to the fd opened in step 2 above. I've changed the test accordingly. Note that this patch also remounts M0 after the initial file creation, as open-behind disables opening-behind on witnessing a setattr on the inode, and touch involves a setattr. On remount, the create operation is not done and hence the file is opened-behind.

  Change-Id: I739f255e0a62ff0024f0824dad3539974955df99
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  Fixes: bz#1615096
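  The point about ">>" can be reproduced on a local filesystem with a short sketch. The assumption here is that the shell implements the redirection by opening the target with O_WRONLY | O_APPEND | O_CREAT; shell_style_append is a hypothetical helper name. Because of O_CREAT, a missing file is simply created, so the append cannot fail merely because the file was removed.

```c
#include <fcntl.h>
#include <unistd.h>

/* Mimic `echo data >> path`: open for append, creating the file if absent.
 * Returns 0 if all 'len' bytes were written, -1 otherwise. */
int shell_style_append(const char *path, const char *data, size_t len)
{
    int fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0666);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, data, len);
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}
```

  Removing a file and then calling shell_style_append() on the same path succeeds and leaves a new file behind, which is why the original step 4 was not a valid check.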
* cluster/afr: Fix bug-1433571-undo-pending-only-on-up-bricks.t | karthik-us | 2018-08-13 | 1 | -2/+2
  Problem:
  The test case was checking for the entry pending marker reset on the root after performing a client-side lookup at lines #60-63. But sometimes the entry heal was not completing immediately.

  Fix:
  Wait for the entry heal to complete before checking the changelog.

  Change-Id: I42fde21b04a126ab044ce58373a996d72f125d96
  fixes: bz#1614730
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
* tests: potential fixes to bugs/replicate/bug-1408712.t | Ravishankar N | 2018-08-13 | 1 | -2/+15
  See BZ for details.

  Change-Id: I2cc2064f14d80271ebcc21747103ce4cee848cbf
  fixes: bz#1615078
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* tests: fix replace-brick-self-heal.t failure | Ravishankar N | 2018-08-13 | 1 | -1/+1
  Please see BZ for details.

  Change-Id: Id9273432874bc6a452ac96b2b8c7a61ea6c5b98d
  Fixes: bz#1615239
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* gfapi : Handle the path == "" glfs_resolve_at | Jiffin Tony Thottan | 2018-08-13 | 1 | -7/+10
  Currently there is no check for path == "" in glfs_resolve_at. So if an application sends an empty path, the function resolves to the parent inode, which is incorrect. Also replaced "path" with "origpath" where possible in the same function.

  Change-Id: Ie5ff9ce4b771607b7dbb3fe00704fe670421792a
  fixes: bz#1610236
  Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
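  An up-front check of this kind could look like the sketch below. The function name validate_resolve_path is hypothetical and this is not the gfapi code. Using ENOENT is an assumption borrowed from POSIX pathname resolution, which reports that error for an empty pathname; the commit message does not say which errno the patch sets.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical pre-check for a resolve routine; not the gfapi implementation.
 * Rejects NULL and empty paths before any component is resolved.
 * Returns 0 if the path may be resolved, -1 with errno set otherwise. */
int validate_resolve_path(const char *path)
{
    if (path == NULL || path[0] == '\0') {
        errno = ENOENT;   /* assumption: mirror POSIX behaviour for "" */
        return -1;
    }
    return 0;
}
```

  With such a check in place, an empty path is rejected instead of silently resolving to the parent inode.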
* performance/quick-read: handle rollover of generation counter | Raghavendra G | 2018-08-13 | 2 | -36/+108
  Change-Id: I37a6e0efda430b70d03dd431c35bef23b3d16361
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  Updates: bz#1512691
* tests: fix tests/bugs/shard/configure-lru-limit.t | Atin Mukherjee | 2018-08-13 | 1 | -0/+4
  Check for the bricks to be up before attempting to mount.

  Change-Id: I1224908137016df3007f4467aa9760967ce0694d
  Fixes: bz#1615092
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* tests: potential fixes for tests/basic/afr/add-brick-self-heal.t | Ravishankar N | 2018-08-13 | 1 | -0/+7
  Please see bug description for details.

  Change-Id: Ieb6bce6d1d5c4c31f1878dd1a1c3d007d8ff81d5
  fixes: bz#1614654
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* geo-rep: Fix deadlock during worker start | Kotresh HR | 2018-08-13 | 2 | -4/+15
  Analysis:
  The monitor process spawns monitor threads (one per brick). Each monitor thread forks worker and agent processes. Each monitor thread, while initializing, updates the monitor status file. This is synchronized using flock. The race is that one thread can fork a worker while another thread has the status file open, resulting in the worker process holding a reference to the fd.

  Cause:
  flock gets unlocked either by explicitly unlocking it or by closing all duplicate fds referring to the file. The code was relying on fd close, hence a reference held in the worker/agent process through fork could cause the deadlock.

  Fix:
  1. flock is unlocked explicitly.
  2. Also made sure to update the status file in appropriate places so that the reference is not leaked to the worker/agent process.

  With this fix, both the deadlock and the possible fd leaks are solved.

  fixes: bz#1614799
  Change-Id: I0d1ce93072dab07d0dbcc7e779287368cd9f093d
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
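  The flock behaviour described under "Cause" can be demonstrated in a single process. The sketch below is a C illustration under stated assumptions and not the geo-rep code: dup() stands in for the descriptor copy that a forked worker would inherit, and the function names are hypothetical.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Returns 1 if a fresh open of 'path' can take an exclusive flock right now,
 * 0 if the lock is held elsewhere, -1 if the file cannot be opened. */
int can_lock(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    int ok = (flock(fd, LOCK_EX | LOCK_NB) == 0);
    close(fd);   /* sole descriptor of this open, so closing releases the lock */
    return ok;
}

/* Take the lock, leak a duplicate descriptor, then release.
 * explicit_unlock != 0 uses LOCK_UN; otherwise only close(fd) is relied on.
 * Returns 1 if the lock is free afterwards, 0 if it is still held. */
int release_with_leaked_dup(const char *path, int explicit_unlock)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX) != 0) {
        close(fd);
        return -1;
    }

    int leaked = dup(fd);   /* stands in for the copy a forked child holds */

    if (explicit_unlock)
        flock(fd, LOCK_UN); /* releases the lock for every duplicate */
    close(fd);

    int is_free = can_lock(path);
    close(leaked);
    return is_free;
}
```

  When only close() is used, the leaked duplicate keeps the lock alive and the next locker would block; with an explicit LOCK_UN the lock is free even though the duplicate is still open.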
* glusterd: compare friend data within mutex | Atin Mukherjee | 2018-08-13 | 3 | -41/+48
  During friend handshake, if glusterd receives more than one friend update, it is quite possible that two threads end up working on two different volinfo references, and glusterd might end up updating the store with an old volinfo reference. This was observed while debugging a glusterd crash in the validating-server-quorum.t test file from the line-coverage regression.

  The solution is to run glusterd_compare_friend_data under a mutex.

  Test:
  As the crash was more visible in the line-coverage run (given that lcov does some instrumentation and exposes the races), 6 manual lcov runs were triggered, from https://build.gluster.org/job/line-coverage/443 to https://build.gluster.org/job/line-coverage/449/, and no crash was observed from validating-server-quorum.t

  Change-Id: I86fce473a76fd24742d51bf17a685d28b90a8941
  Fixes: bz#1603063
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
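  The general pattern, doing the comparison and the store update as one critical section, can be sketched as follows. The variables and functions are hypothetical stand-ins; glusterd's real data is a volinfo structure and its comparison is far more involved than the integer version used here.

```c
#include <pthread.h>

/* Hypothetical shared state standing in for the stored volume version. */
static pthread_mutex_t friend_data_lock = PTHREAD_MUTEX_INITIALIZER;
static int stored_version = 0;

/* Compare the peer's version with ours and update the store if it is newer.
 * Both steps happen under the same mutex, so two concurrent friend updates
 * cannot interleave and write back a stale value.
 * Returns 1 if the store was updated, 0 otherwise. */
int compare_and_import(int peer_version)
{
    int updated = 0;

    pthread_mutex_lock(&friend_data_lock);
    if (peer_version > stored_version) {
        stored_version = peer_version;
        updated = 1;
    }
    pthread_mutex_unlock(&friend_data_lock);

    return updated;
}

/* Read the stored version under the same lock. */
int current_version(void)
{
    pthread_mutex_lock(&friend_data_lock);
    int v = stored_version;
    pthread_mutex_unlock(&friend_data_lock);
    return v;
}
```

  Holding the lock across both the check and the write is the important part; locking each step separately would still allow an older update to overwrite a newer one.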
* tests: Fix cleanup routine for some mux tests | ShyamsundarR | 2018-08-13 | 4 | -10/+7
  Some of the mux tests set a trap to catch test exit and call cleanup. This causes cleanup to be invoked twice if the test times out, or even otherwise, as include.rc also sets a trap to clean up on exit (TERM and others).

  As a result, the tarballs generated on failure for these tests are empty, which does not aid debugging.

  This patch corrects this pattern across the tests to the more standard cleanup at the end.

  Fixes: bz#1615037
  Change-Id: Ib83aeb09fac2aa591b390b9fb9e1f605bfef9a8b
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
* Make sure EXPECT_WITHIN executes the statement multiple times | Pranith Kumar K | 2018-08-12 | 2 | -6/+14
  When we pass a command to be executed in EXPECT_WITHIN and we use ``, the command's output is passed by value. So if the first execution gives a result that is different from the expected value, the EXPECT_WITHIN test will fail because the command is not re-evaluated. Changed the expression with `` to a function. Added sleep(3) in afr.c for reconfigure, both to find the root cause (RC) and to re-test after the change.

  fixes bz#1614662
  Change-Id: I3bc8a75b996729261aa48067f6ed8da9c6273b13
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* glusterd: Compare volume_id before start/attach a brick | Mohit Agrawal | 2018-08-10 | 2 | -24/+32
  Problem: After a node reboots, a brick does not come up because the fsid comparison fails before the brick is started.

  Solution: Compare volume_id instead of fsid, because fsid changes after a node reboot, whereas volume_id persists as an xattr on the brick_root path from the time the volume is created.

  Change-Id: Ic289aab1b4ebfd83bbcae8438fee26ae61a0fff4
  fixes: bz#1612418
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>