summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/dht
Commit message (Collapse)AuthorAgeFilesLines
* cluster/dht: create request dictionary if necessary during refreshRaghavendra G2015-03-061-10/+17
| | | | | | | | | layout. Change-Id: I5a5d793c86ee5de345608eede5618e4e6c02af9f BUG: 1195668 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/9733
* testing: Switch to cmocka the successor of cmockery2Niels de Vos2015-03-053-18/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | This uses https://cmocka.org/ as the unit testing framework. With this change, unit testing is made optional as well. We assume there is no cmocka available while building. cmocka will be enabled by default later on. For now, to build with cmocka run: $ ./configure --enable-cmocka This change is based on the work of Andreas (replacing cmockery2 with cmocka) and Kaleb (make cmockery2 an optional build dependency). The only modifications I made, are additional #defines in unittest.h for making sure the unit tests function as expected. Change-Id: Iea4cbcdaf09996b49ffcf3680c76731459cb197e BUG: 1067059 Merged-change: http://review.gluster.org/9762/ Signed-off-by: Andreas Schneider <asn@samba.org> Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Change-Id: Ia2e955481c102d5dce17695a9205395a6030e985 Reviewed-on: http://review.gluster.org/9738 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* xlators/dht : divide by zero coverity fixManikandan Selvaganesh2015-03-051-1/+2
| | | | | | | | | | | | | CID:1226163. BUG: 789278 Change-Id: Ie31d65da236d7029784defad963672b2ded2676a BUG:1192435 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com> Reviewed-on: http://review.gluster.org/9563 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: fixes to should_layout_fix logicRaghavendra G2015-03-052-12/+53
| | | | | | | | | | | | | | * Don't consider "dir-spread-count" option. This option is not supported. * Consider transition to weighted to equal distribution or vice-versa a valid case for fixing the layout. Change-Id: I0dcfe555dae9269ce20a41611cfdaa4f96c9e98b BUG: 1196615 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/9809 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: Fixes to should_fix_layout logicRaghavendra G2015-03-031-6/+38
| | | | | | | | | | | | | | | | | * With recent introduction of locking in self-heal codepath, fix layout was not allowed to progress during remove-brick. This patch fixes the issue. * dht_should_fix_layout also considers "dir-spread-count" option if set, to determine whether we should proceed with fix-layout or not. Change-Id: Icd96986f7af705744131d62e7f1456114ac1ee53 BUG: 1196615 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/9764 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* dht : logically dead code removedManikandan Selvaganesh2015-03-032-12/+4
| | | | | | | | | | | | | CID :1124378 1124401 Change-Id: Ib48e4a8d3fb12c4e0323a3946afb46eeb3926984 BUG: 789278 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/9584 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Propagate an event only after hearing the same from all subvolumesAnoop C S2015-02-261-0/+3
| | | | | | | | | | | | | | | | | In dht_notify(), we propagate each event without checking whether all subvolumes have reported the same event earlier. As a result separate events are being forwarded for each dht-subvolume. This change is to make sure that we propagate a particular event only if all other subvolumes have already reported the same event once earlier. Change-Id: I6c73fa105e967f29648af9e2030f91a94f2df130 BUG: 1176543 Signed-off-by: Anoop C S <achiraya@redhat.com> Reviewed-on: http://review.gluster.org/9322 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* dht: fix for dht_lock_count() compile errorNiels de Vos2015-02-262-2/+2
| | | | | | | | | | | | | | | dht-common.h includes a function definition with "inline", but the function is not declared in the header. Dropping the "inline" compile directive so that linking against .o files works correctly. BUG: 1196650 Change-Id: I105be591125b29cd455769b0c4ff22d6e139227d Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9760 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: serialize execution of dht_discover_complete andRaghavendra G2015-02-251-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | STACK_DESTROY (frame). In the current code, dht_discover_complete can be invoked because of: 1. attempt_unwind is true 2. we are processing reply from the last subvolume In scenario 1, following race is possible: T1: calls dht_frame_return. T2: calls dht_frame_return. This happens to be last call and hence it invokes dht_discover_complete, goes ahead and destroys frame T1: since attempt_unwind is true, calls dht_discover_complete. However, since frame is already freed, call to dht_discover_complete can result in a crash. The fix is to make sure that destruction of the frame is done only by the thread executing dht_discover_complete. Change-Id: I45765b90c4a9d0af0b33f8911b564d99e12d099e BUG: 1195120 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/9729 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* cluster/dht: synchronize with other concurrent healers while healing layout.Raghavendra G2015-02-205-41/+562
| | | | | | | | | | | | | | | | | | | | | | | | | | Current layout heal code assumes layout setting is idempotent. This allowed multiple concurrent healers to set the layout without any synchronization. However, this is not the case as different healers can come up with different layout for same directory and making layout setting non-idempotent. So, we bring in synchronization among healers to 1. Not to overwrite an ondisk well-formed layout. 2. Refresh the in-memory layout with the ondisk layout if in-memory layout needs healing and ondisk layout is well formed. This patch can synchronize 1. among multiple healers. 2. among multiple fix-layouts (which extends layout to consider added or removed brick) 3. (but) not between healers and fix-layouts. So, the problem of in-memory stale layouts (not matching with layout ondisk), is not _completely_ fixed by this patch. Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: Ia285f25e8d043bb3175c61468d0d11090acee539 BUG: 1176008 Reviewed-on: http://review.gluster.org/9302 Reviewed-by: N Balachandran <nbalacha@redhat.com>
* Storage/posix : Adding error checks in path formationNithya Balachandran2015-02-181-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Renaming directories can cause the size of the buffer required for posix_handle_path to increase between the first call, which calculates the size, and the second call which forms the path in the buffer allocated based on the size calculated in the first call. The path created in the second call overflows the allocated buffer and overwrites the stack causing the brick process to crash. The fix adds a buffer size check to prevent the buffer overflow. It also checks and returns an error if the posix_handle_path call is unable to form the path instead of working on the incomplete path, which is likely to cause subsequent calls using the path to fail with ELOOP. Preventing buffer overflow and handling errors BUG: 1113960 Change-Id: If3d3c1952e297ad14f121f05f90a35baf42923aa Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/9289 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/dht: Fix dht_link to follow files under migrationShyam2015-02-171-19/+124
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently if a file is under migration, a hardlink to that file is lost post migration of the file. This is due to the fact that the hard link is created against the cached subvol of the source and as the source is under migration, it shifts to a linkto file post migration. Thus losing the hardlink. This change follows the stat information that triggers a phase1/2 detection for a file under migration, to create the link on the new subvol that the source file is migrating to. Thereby preserving the hard link post migration. NOTES: The test case added create a ~1GB file, so that we can catch the file during migration, smaller files may not capture this state and the test may fail. Even if migration of the file fails, we would only be left with stale linkto files on the subvol that the source was migrating to, which is not a problem. This change would create a double linkto, i.e new target hashed subvol would point to old source cached subol, which would point to the real cached subvol. This double redirection although not handled directly in DHT, works as lookup searches everywhere on hitting linkto files. The downside is that it never heals the new target hashed subvol linkto file, which is another bug to be resolved (does not cause functional impact). Change-Id: I871e6885b15e65e05bfe70a0b0180605493cb534 BUG: 1161311 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/9105 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: susant palai <spalai@redhat.com> Reviewed-by: venkatesh somyajulu <vsomyaju@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: Modify dht-rebalance.c to use trusted.distribute.migrate-dataDan Lambright2015-02-133-5/+25
| | | | | | | | | | | | | | | rather than distribute.migrate-data and work with CLI. The change makes this work both when it is internally driven and from the shell. The problem is further described in bugzilla # 1147107. Change-Id: I4fe04cae661dca25432530ddf5ac6ff2c957d6b3 BUG: 1147107 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/9284 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* Compare key with GF_XATTR_LOCKINFO_KEY with length of GF_XATTR_LOCKINFO_KEY ↵Dennis Schafroth2015-02-021-1/+1
| | | | | | | | | | | | | | | | instead of length of 0 BUG: 1187952 Change-Id: I0a97c553e85a0f9260ab01d4b48c64831bf67c18 Signed-off-by: Dennis Schafroth <dennis@schafroth.com> Reviewed-on: http://review.gluster.org/9518 Reviewed-by: Joe Julian <joe@julianfamily.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: In MKDIR(), aggregate iatts from non-hashed subvols tooKrutika Dhananjay2015-01-231-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: Gathering iatts from ONLY the hashed subvol during MKDIR and unwinding them can cause md-cache to cache and serve these values for a while to the application. And then, at a later point of time, when a LOOKUP on either the dir or its parent gathers attributes from all subvolumes of dht and things are evened out as part of DHT_UPDATE_TIME, the application could be getting a different set of [cm]times (i.e., one of the non-hashed subvolumes' times could be selected by virtue of having the highest values), causing it to think the directory underwent modification even when it might not have. The effect of this bug becomes apparent in programs like tar, which rely on the ctime of the files before and after archiving a file to ascertain that the file remained unchanged during this time. FIX: Aggregate iatts from ALL sub-volumes of DHT during MKDIR. Change-Id: I04c4ca3e3b9552772e2b089be680f8afeb72089e BUG: 1179169 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9465 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: Fix incorrect updates to parent timesKrutika Dhananjay2015-01-191-6/+5
| | | | | | | | | | | | | | | | | | | | | In directory write FOPs, as far as updates to timestamps associated with parent by DHT is concerned, there are three possibilities: a) time (in sec) gotten from child of DHT < time (in sec) in inode ctx b) time (in sec) gotten from child of DHT = time (in sec) in inode ctx c) time (in sec) gotten from child of DHT > time (in sec) in inode ctx In case (c), for time in nsecs, it is the value returned by DHT's child that must be selected. But what DHT_UPDATE_TIME ends up doing is to choose the maximum of (time in nsec gotten from DHT's child, time in nsec in inode ctx). Change-Id: I535a600b9f89b8d9b6714a73476e63ce60e169a8 BUG: 1179169 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9457 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: Don't restore entry when only one subvolume is presentPranith Kumar K2015-01-181-3/+6
| | | | | | | | | | | | | | | | | | | | | | Problem: When rmdir fails with op_errno other than ENOENT/EACCES then self-heal is attempted with zeroed-out stbuf. Only ia_type is filled from inode, when the self-heal progresses, it sees that the directory is still present and performs setattr with all valid flags set to '1' so the file will be owned by root:root and the time goes to epoch Fix: This fixes the problem only in dht with single subvolume. Just don't perform self-heal when there is a single subvolume. Change-Id: I6c85b845105bc6bbe7805a14a48a2c5d7bc0c5b6 BUG: 1181367 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9435 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/afr: split-brain resolution CLIRavishankar N2015-01-151-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the AFR heal command to include automated split-brain resolution. This patch [3/3] is the final patch for afr automated split-brain resolution implementation. "gluster volume heal <VOLNAME> [full | statistics [heal-count [replica <HOSTNAME:BRICKNAME>]] |info [healed | heal-failed | split-brain]| split-brain {bigger-file <FILE> |source-brick <HOSTNAME:BRICKNAME> [<FILE>]}]" The new additions being: 1.gluster volume heal <VOLNAME> split-brain bigger-file <FILE> Locates the replica containing the FILE, selects bigger-file as source and completes heal. 2.gluster volume heal <VOLNAME> split-brain source-brick <HOSTNAME:BRICKNAME> <FILE> Selects <FILE> present in <HOSTNAME:BRICKNAME> as source and completes heal. 3.gluster volume heal <VOLNAME> split-brain <HOSTNAME:BRICKNAME> Selects all split-brained files in <HOSTNAME:BRICKNAME> as source and completes heal. Note: <FILE> can be either the full file name as seen from the root of the volume (or) the gfid-string representation of the file, which sometimes gets displayed in the heal info command's output. Entry/gfid split-brain resolution is not supported. Example can be found in the test case. Change-Id: I4649733922d406f14f28ee9033a5cb627b9538b3 BUG: 1136769 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/9377 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* libglusterfs: change signature of syncop_(f)getxattrRavishankar N2015-01-052-6/+6
| | | | | | | | | | | | | | | | | Pass xdata dict to syncop_(f)getxattr calls. This patch [1/3] is required as a part of afr automated split-brain resolution implementation. Change-Id: I3970b3dd6daf64681a031e37f8e9afb14fb3d668 BUG: 1136769 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/9375 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* core: fix Ubuntu code audit (cppcheck) resultsKaleb S. KEITHLEY2014-11-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | See also http://review.gluster.org/#/c/7693/, BZ 1091677 AFAICT these are false positives: [geo-replication/src/gsyncd.c:100]: (error) Memory leak: str [geo-replication/src/gsyncd.c:403]: (error) Memory leak: argv [xlators/nfs/server/src/nlm4.c:1201]: (error) Possible null pointer dereference: fde [xlators/cluster/afr/src/afr-self-heal-common.c:138]: (error) Possible null pointer dereference: __ptr [xlators/cluster/afr/src/afr-self-heal-common.c:140]: (error) Possible null pointer dereference: __ptr [xlators/cluster/afr/src/afr-self-heal-common.c:331]: (error) Possible null pointer dereference: __ptr Test program: [extras/test/test-ffop.c:27]: (error) Buffer overrun possible for long command line arguments. [tests/basic/fops-sanity.c:55]: (error) Buffer overrun possible for long command line arguments. the remainder are fixed with this change-set: [cli/src/cli-rpc-ops.c:8883]: (error) Possible null pointer dereference: local [cli/src/cli-rpc-ops.c:8886]: (error) Possible null pointer dereference: local [contrib/uuid/gen_uuid.c:369]: (warning) %ld in format string (no. 2) requires 'long *' but the argument type is 'unsigned long *'. [contrib/uuid/gen_uuid.c:369]: (warning) %ld in format string (no. 3) requires 'long *' but the argument type is 'unsigned long *'. [xlators/cluster/dht/src/dht-rebalance.c:1734]: (error) Possible null pointer dereference: ctx [xlators/cluster/stripe/src/stripe.c:4940]: (error) Possible null pointer dereference: local [xlators/mgmt/glusterd/src/glusterd-geo-rep.c:1718]: (error) Possible null pointer dereference: command [xlators/mgmt/glusterd/src/glusterd-replace-brick.c:942]: (error) Resource leak: file [xlators/mgmt/glusterd/src/glusterd-replace-brick.c:1026]: (error) Resource leak: file [xlators/mgmt/glusterd/src/glusterd-sm.c:249]: (error) Possible null pointer dereference: new_ev_ctx [xlators/mgmt/glusterd/src/glusterd-snapshot.c:6917]: (error) Possible null pointer dereference: volinfo [xlators/mgmt/glusterd/src/glusterd-utils.c:4517]: (error) Possible null pointer dereference: this [xlators/mgmt/glusterd/src/glusterd-utils.c:6662]: (error) Possible null pointer dereference: this [xlators/mgmt/glusterd/src/glusterd-utils.c:7708]: (error) Possible null pointer dereference: this [xlators/mount/fuse/src/fuse-bridge.c:4687]: (error) Uninitialized variable: finh [xlators/mount/fuse/src/fuse-bridge.c:3080]: (error) Possible null pointer dereference: state [xlators/nfs/server/src/nfs-common.c:89]: (error) Dangerous usage of 'volname' (strncpy doesn't always null-terminate it). [xlators/performance/quick-read/src/quick-read.c:586]: (error) Possible null pointer dereference: iobuf Rerunning cppcheck after fixing the above: As before, test program: [extras/test/test-ffop.c:27]: (error) Buffer overrun possible for long command line arguments. [tests/basic/fops-sanity.c:55]: (error) Buffer overrun possible for long command line arguments. As before, false positive: [geo-replication/src/gsyncd.c:100]: (error) Memory leak: str [geo-replication/src/gsyncd.c:403]: (error) Memory leak: argv [xlators/nfs/server/src/nlm4.c:1201]: (error) Possible null pointer dereference: fde [xlators/cluster/afr/src/afr-self-heal-common.c:138]: (error) Possible null pointer dereference: __ptr [xlators/cluster/afr/src/afr-self-heal-common.c:140]: (error) Possible null pointer dereference: __ptr [xlators/cluster/afr/src/afr-self-heal-common.c:331]: (error) Possible null pointer dereference: __ptr False positive after fix: [xlators/performance/quick-read/src/quick-read.c:584]: (error) Possible null pointer dereference: iobuf Change-Id: I20e0e3ac1d600b2f2120b8d8536cd6d9e17023e8 BUG: 1109180 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/8064 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Fix subvol check, to correctly determine cached file renameShyam2014-11-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | The check to treat rename as a critical failure ignored when the cached file is being renamed to new name, as the new name falls on the same subvol as the cached file. This is in addition to when the target of the rename does not exist. The current change is simpler, as the rename logic, renames the cached file in case the target exists and falls on the same subvol as source name, OR the target does not exist and the hash of target falls on the same subvol as source cached. These conditions mean we are renaming the source, other conditions mean we are renaming the source linkto file which we do not want to treat as a critical failure (and we also instruct marker that it is an internal FOP and to not account for the same). Change-Id: I4414e61a0d2b28a429fa747e545ef953e48cfb5b BUG: 1161156 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/9063 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: susant palai <spalai@redhat.com> Reviewed-by: venkatesh somyajulu <vsomyaju@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Cluster/DHT : Rebalance skipped file count fixNithya Balachandran2014-11-111-4/+5
| | | | | | | | | | | | | | | The return value in dht_migrate_file is used to indicate the status of the file migration. This value was being masked by the lock operation causing the skipped and failure file counts to be incorrectly calculated. Change-Id: Ice3d2f5d57766e18aa52659f22a76867d188dc65 BUG: 1161518 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/9070 Reviewed-by: susant palai <spalai@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Cluster/DHT : Fixed crash due to null derefNithya Balachandran2014-11-031-2/+3
| | | | | | | | | | | | | | | | A lookup on a linkto file whose trusted.glusterfs.dht.linkto xattr points to a subvol that is not part of the volume can cause the brick process to segfault due to a null dereference. Modified to check for a non-null value before attempting to access the variable. Change-Id: Ie8f9df058f842cfc0c2b52a8f147e557677386fa BUG: 1159571 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/9034 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: venkatesh somyajulu <vsomyaju@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rebalance: ``check_free_space`` should ignore quota_statfsHarshavardhana2014-10-311-10/+33
| | | | | | | | | | | | | | | | | | | | | quota_statfs() returns aggregated details of space usage of bricks this causes distribute to be confused during ``rebalance``, where ``statfs()`` values are used to schedule file migration. We can make sure the values of ``statfs`` are from individual bricks by selectively instructing ``quota_statfs()`` to return non aggregated values. Change-Id: I1397faeee66a1b9c26709cfda693286d227a4170 BUG: 1158262 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/8996 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Modified the calculation of brick_countVenkatesh Somyajulu2014-09-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whenever new_layout is calculated for a directory, we calculate the number of childs of dht, who will get the actual(Non-zero) layout-range, and assign range to only those subvolume and other will get 0 as their layout->start and layout->stop value. This calculation is based on either a) weight_by_size or b) number of brick who will be assigned the non-zero range So if in case we are not assigning the layout based on weight_by_size, we should choose the "bricks_to_use" instead of "bricks_used". In regression test, we found that priv->du_stat[0].chunks was zero. In this case "bricks_used" variable will be zero, which will cause crash for chunk = ((unsigned long) 0xffffffff) / bricks__used; calculation. Change-Id: I6f1b21eff972a80d9eb22771087c1e2f53e7e724 BUG: 1143835 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8792 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/dht: Fix dict_t leaks in rebalance process' execution pathKrutika Dhananjay2014-09-191-4/+7
| | | | | | | | | | | | | | | | | | | Two dict_t objects are leaked for every file migrated in success codepath. It is the caller's responsibility to unref dict that it gets from calls to syncop_getxattr(); and rebalance performs two syncop_getxattr()s per file without freeing them. Also, syncop_getxattr() on GF_XATTR_LINKINFO_KEY doesn't seem to be using the response dict. Hence, NULL is now passed as opposed to @dict to syncop_getxattr(). Change-Id: I5a4b5ab834df3633dea994f239bbdbc34cbe9259 BUG: 1142052 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/8763 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Changed log level to DEBUGVenkatesh Somyajulu2014-09-161-2/+4
| | | | | | | | | | Change-Id: Ia4dde95367b44d63f57f0840c2a2f351b1cfb072 BUG: 1138602 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8697 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Changed log level to DEBUGVenkatesh Somyajulu2014-09-111-4/+4
| | | | | | | | | | | Change-Id: I7a4ee0c5a6a94bd4f31aff510a2971750913ed45 BUG: 1138602 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8621 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: susant palai <spalai@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: fix memory corruption in locking api.Raghavendra G2014-09-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | <man 3 qsort> The contents of the array are sorted in ascending order according to a comparison function pointed to by compar, which is called with two arguments that "point to the objects being compared". </man 3 qsort> qsort passes "pointers to members of the array" to comparision function. Since the members of the array happen to be (dht_lock_t *), the arguments passed to dht_lock_request_cmp are of type (dht_lock_t **). Previously we assumed them to be of type (dht_lock_t *), which resulted in memory corruption. Change-Id: Iee0758704434beaff3c3a1ad48d549cbdc9e1c96 BUG: 1139506 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8659 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Fixed double UNWIND in lookup everywhere codeShyam2014-09-091-4/+4
| | | | | | | | | | | | | | | | | | | | | | In dht_lookup_everywhere_done: Line: 1194 we call DHT_STACK_UNWIND and in the same if condition we go ahead and call, goto unwind_hashed_and_cached; which at Line 1371 calls another UNWIND. As is obvious, higher frames could cleanup their locals and on receiving the next unwind could cause a coredump of the process. Fixed the same by calling the required return post the first unwind Change-Id: Ic5d57da98255b8616a65b4caaedabeba9144fd49 BUG: 1139812 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/8666 Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: susant palai <spalai@redhat.com>
* cluster/dht: Added code to capture races in dht-lookup pathVenkatesh Somyajulu2014-09-031-10/+146
| | | | | | | | | Change-Id: I9270d2d40ebd4b113ff961583dfda7754741f15b BUG: 1116150 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8430 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Cluster/DHT: Changing rename log severityNithya Balachandran2014-09-031-6/+5
| | | | | | | | | | | | Changing log level for a rename message from debug to info to improve debuggability Change-Id: I53031fcf97fffd62095692477330ecde0cf47dcd BUG: 1130888 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/8582 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: remove specifying cached-subvol as part of name inRaghavendra G2014-09-021-12/+1
| | | | | | | | | | | | | | | | | | | | | | unlink. commit 667b2496c3f29e24ed359a05b0f44df0d1894969 introduced a functionality where we can specify the subvol where file is stored. As part of same commit, dht_unlink was also changed to accept cached-subvol as part of name. While it makes sense to specify subvol while creating file, there is no necessity for specifying the subvol during unlink, since the default unlink logic works fine with this functionality too. Also, this code in unlink doesn't work well when files get migrated by rebalance process. Hence removing it. Change-Id: Ic3fc32ad657e2dcd677a4c80b04a618029eddd89 BUG: 1130888 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8548 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Rename should not fail post hardlink creationShyam2014-09-022-41/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | In the rename path, we wind the creation of newname hardlink and linkto file in dst hashed a the same time. If the linkto creation fails, but the link creation succeeds, we enter the failure code and cleanup the created newname hardlink. In the interim if another client looks up newname and finds it as a hardlink from FUSE, it could send an unlink for oldname instead of a rename. This combined with the above cleanup code could end up losing all the files copies, and thereby losing data. This fix separates these steps into 2 parts, creating the linkto first and then the link file, so that post link file creation no failures would cleanup the newname file. If linkto fails then link is not attempted, thereby not polluting the name space with newname. Change-Id: I61da8e906060da16a31ea1076eec2f01fd617f44 BUG: 1130888 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/8570 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Treat linkto file rename failure as non-critial errorShyam2014-09-021-8/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It is a critical failure iff we fail to rename the cached file if the rename of the linkto failed, it is not a critical failure, and we do not want to lose the created hard link for the new name as that could have been read by other clients. NOTE: If another client is attempting the same oldname -> newname rename, and finds both file names as existing, and are hard links to each other, then FUSE would send in an unlink for oldname. In this time duration if we treat the linkto as a critical error and unlink the newname we created, we would have effectively lost the file to rename operations. Repercussions of treating this as a non-critical error is that we could leave behind a stale linkto file and/or not create the new linkto file, the second case would be rectified by a subsequent lookup, the first case by a rebalance, like for all stale linkto files Change-Id: Ia53ad8b43c3cf8f48ef5b43fd1fec4274e807556 BUG: 1130888 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/8563 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* dht: Avoid using inline, if necessary use it with 'static inline'Harshavardhana2014-08-291-1/+1
| | | | | | | | | | This avoids flat namespace problems on OSX and with clang Change-Id: Id80d94d71b120c6b1166218caa8cf9cf7f2da03a BUG: 1130888 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/8547 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: synchronize rename and file-migrationRaghavendra G2014-08-283-34/+293
| | | | | | | | | Change-Id: I4f243c946f76d440680b651235f925e3d0ebf0fd BUG: 1130888 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8523 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: invoke callback when there are no locks to be unlocked.Raghavendra G2014-08-281-0/+4
| | | | | | | | | | Change-Id: I375cb68f1075c2d58cf9d09ed6bd5e2746e1637d BUG: 1130888 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8549 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: introduce locking api.Raghavendra G2014-08-265-1/+658
| | | | | | | | | | Change-Id: I41389ba91951d3e63e617aa32cd0bee848261c72 BUG: 1130888 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8521 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Cluster/DHT : Additional log messagesNithya Balachandran2014-08-242-5/+8
| | | | | | | | | | | | Adding log messages in the rename and lookup calls to help with debugging. Change-Id: I13b1c6f98fb49ead45362550c46359ab1f9028c0 BUG: 1130888 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/8516 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: Fix dht_access treating directory like filesShyam2014-08-181-4/+5
| | | | | | | | | | | | | | | | | | | | | | | When the cluster topology changes due to add-brick, all sub volumes of DHT will not contain the directories till a rebalance is completed. Till the rebalance is run, if a caller bypasses lookup and calls access due to saved/cached inode information (like NFS server does) then, dht_access misreads the error (ESTALE/ENOENT) from the new subvolumes and incorrectly tries to handle the inode as a file. This results in the directories in memory state in DHT to be corrupted and not heal even post a rebalance. This commit fixes the problem in dht_access thereby preventing DHT from misrepresenting a directory as a file in the case presented above. Change-Id: Idcdaa3837db71c8fe0a40ec0084a6c3dbe27e772 BUG: 1125824 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/8462 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Added keys in dht_lookup_everywhere_doneVenkatesh Somyajulu2014-08-141-4/+72
| | | | | | | | | | | | | | | | | | | Case where both cached (C1) and hashed file are found, but hash does not point to above cached node (C1), then dont unlink if either fd-is-open on hashed or linkto-xattr is not found. Change-Id: I7ef49b88d2c88bf9d25d3aa7893714e6c0766c67 BUG: 1116150 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Change-Id: I86d0a21d4c0501c45d837101ced4f96d6fedc5b9 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8429 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: susant palai <spalai@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Modified logic of linkto file deletion on non-hashedVenkatesh Somyajulu2014-07-312-21/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently whenever dht_lookup_everywhere gets called, if in dht_lookup_everywhere_cbk, a linkto file is found on non-hashed subvolume, file is unlinked. But there are cases when this file is under migration. Under such condition, we should avoid deletion of file. When some other rebalance process changes the layout of parent such that dst_file (w.r.t. migration) falls on non-hashed node, then may be lookup could have found it as linkto file but just before unlink, file is under migration or already migrated In such cased unlink can be avoided. Race: ------- If we have two bricks (brick-1 and brick-2) with initial file "a" under BaseDir which is hashed as well as cached on (brick-1). Assume "a" hashing gives 44. Brick-1 Brick-2 Initial Setup: BaseDir/a BaseDir [1-50] [51-100] Now add new-brick Brick-3. 1. Rebalance-1 on node Node-1 (Brick-1 node) will reset the BaseDir Layout. 2. After that it will perform a) Create linkto file on new-hashed (brick-2) b) Perform file migration. 1.Rebalance-1 Fixes the base-layout: Brick-1 Brick-2 Brick-3 --------- ---------- ------------ BaseDir/a BaseDir BaseDir [1-33] [34-66] [67-100] 2. Only a) is BaseDir/a BaseDir/a(linkto) BaseDir performed Create linktofile Now rebalance 2 on node-2 jumped in and it will perform step 1 and 2-a. After (rebal-2, step-1), it changes the layout of the BaseDir. BaseDir/a BaseDir/a(link) BaseDir [67-100] [1-33] [34-66] For (rebale-2, step-2), It will perform lookup at Brick-3 as w.r.t new layout 44 falls for brick-3. But lookup will fail. So dht_lookup_everywhere gets called. NOTE: On brick-2 by rebalance-1, a linkto file was created. Currently that linkto files gets deleted by rebalance-2 lookup as it is considered as stale linkto file. But with patch if rebalance is already in progress or rebalance is over, linkto file will not be unlinked. If rebalance is in progress fd will be open and if rebalance is over then linkto file wont be set. Change-Id: I3fee0d28de3c76197325536a9e30099d2413f079 BUG: 1116150 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8345 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Symlink mtime changes when rebalancingAnders Blomdell2014-07-311-1/+2
| | | | | | | | | | | | | Added mtime preservation for special files during rebalance. Change-Id: If04921d4d66853fde8b4d8a3ab748790864f8f42 BUG: 1122443 Signed-off-by: Anders Blomdell <anders.blomdell@control.lth.se> Reviewed-on: http://review.gluster.org/8383 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Anand Avati <avati@redhat.com>
* build: fix compile issue in unittest on EPEL-5Niels de Vos2014-07-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When building the current master branch on EPEL-5, the following error is observed: CC dht_layout_unittest-dht_layout_mock.o unittest/dht_layout_mock.c:77:2: error: no newline at end of file make[5]: *** [dht_layout_unittest-dht_layout_mock.o] Error 1 make[5]: *** Waiting for unfinished jobs.... EPEL-5 has this gcc: $ gcc --version gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54) The fix is simple enough, dht_layout_unittest-dht_layout_mock.c should have a line ending. BUG: 1124858 Change-Id: I8ca7a1699fd1c35694073b9577f94ae525854713 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/8392 Reviewed-by: Lalatendu Mohanty <lmohanty@redhat.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Luis Pabon <lpabon@redhat.com>
* dht: fix rename raceNithya Balachandran2014-07-301-2/+5
| | | | | | | | | | | | | Additional check to check if we created the linkto file before deleting it in the rename cleanup function Change-Id: I919cd7cb24f948ba4917eb9cf50d5169bb730a67 BUG: 1117851 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/8338 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Attempt to fix cmockery2 buildEmmanuel Dreyfus2014-07-271-1/+1
| | | | | | | | | | | | | | | | | | | The current code assumes cmockery2 is installed in default paths. Use PKG_MODULES_CHECK to find it using pkg-config if it is not. If not found by pkg-config, try AC_CHECK_LIB. There are also some build flag adjustement so that local overrides do not loose the required -I flags. This includes and enhance http://review.gluster.org/8340/ BUG: 764655 Change-Id: Ide9f77d1e70afe3c1c5c57ae2b93127af6a425f9 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8365 Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* build: Support for unit tests using Cmockery2Luis Pabon2014-07-183-0/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch will allow for developers to create unit tests for their code. Documentation has been added to the patch and is available here: doc/hacker-guide/en-US/markdown/unittest.md Also, unit tests are run when RPM is created. This patch is a replacement for http://review.gluster.org/#/c/7281 which removed unit test infrastucture from the repo due to multiple conflicts. Cmockery2 is now available in Fedora and EPEL, and soon to be available in Debian and Ubuntu. For all other operating systems, please install from the source: https://github.com/lpabon/cmockery2 BUG: 1067059 Change-Id: I1b36cb1f56fd10916f9bf535e8ad080a3358289f Signed-off-by: Luis Pabón <lpabon@redhat.com> Reviewed-on: http://review.gluster.org/7538 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* DHT/mkdir : Fill the stbuf from the subvols on which directory creationSusant Palai2014-07-181-4/+0
| | | | | | | | | | | | | | | | | was successful. Problem: In case a mkdir sees EEXIST on a non-hashed subvol it reports error to the application. Change-Id: I44b2f32fc1069e609d788b6d25b9366b1460395c BUG: 1114557 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/8203 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Anders Blomdell <anders.blomdell@control.lth.se> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Fix races to avoid deletion of linkto fileVenkatesh Somyajulu2014-07-183-30/+334
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Explanation of Race between rebalance processes: https://bugzilla.redhat.com/show_bug.cgi?id=1110694#c4 STATE 1: BRICK-1 only one brick Cached File in the system STATE 2: Add brick-2 BRICK-1 BRICK-2 STATE 3: Lookup of File on brick-2 by this node's rebalance will fail because hashed file is not created yet. So dht_lookup_everywhere is about to get called. STATE 4: As part of lookup link file at brick-2 will be created. STATE 5: getxattr to check that cached file belongs to this node is done STATE 6: dht_lookup_everywhere_cbk detects the link created by rebalance-1. It will unlink it. STATE 7: getxattr at the link file with "pathinfo" key will be called will fail as the link file is deleted by rebalance on node-2 Fix: So in the STATE 6, we should avoid the deletion of link file. Every time dht_lookup_everywhere gets called, lookup will be performed on all the nodes. So to avoid STATE 6, if linkto file is found, it is not deleted until valid case is found in dht_lookup_everywhere_done. Case 1: if linkto file points to cached node, and cached file exists, uwind with success. Case 2: if linkto does not point to current cached node, and cached file exists: a) Unlink stale link file b) Create new link file Case 3: Only linkto file exists: Delete linkto file Case 4: Only cached file Create link file (Handled event without patch) Case 5: Neither cached nor hashed file is present Return with ENOENT (handled even without patch) Change-Id: Ibf53671410d8d613b8e2e7e5d0ec30fc7dcc0298 BUG: 1116150 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8231 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>