summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/dht/src
Commit message (Collapse)AuthorAgeFilesLines
* build: out-of-tree builds generates files in the wrong directoryKaleb S KEITHLEY2016-09-181-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And minor cleanup of a few of the Makefile.am files while we're at it. Rewrite the make rules to do what xdrgen does. Now we can get rid of xdrgen. Note 1. netbsd6's sed doesn't do -i. Why are we still running smoke tests on netbsd6 and not netbsd7? We barely support netbsd7 as it is. Note 2. Why is/was libgfxdr.so (.../rpc/xdr/src/...) linked with libglusterfs? A cut-and-paste mistake? It has no references to symbols in libglusterfs. Note3. "/#ifndef\|#define\|#endif/" (note the '\'s) is a _basic_ regex that matches the same lines as the _extended_ regex "/#(ifndef|define|endif)/". To match the extended regex sed needs to be run with -r on Linux; with -E on *BSD. However NetBSD's and FreeBSD's sed helpfully also provide -r for compatibility. Using a basic regex avoids having to use a kludge in order to run sed with the correct option on OS X. Note 4. Not copying the bit of xdrgen that inserts copyright/license boilerplate. AFAIK it's silly to pretend that machine generated files like these can be copyrighted or need license boilerplate. The XDR source files have their own copyright and license; and their copyrights are bound to be more up to date than old boilerplate inserted by a script. From what I've seen of other Open Source projects -- e.g. gcc and its C parser files generated by yacc and lex -- IIRC they don't bother to add copyright/license boilerplate to their generated files. It appears that it's a long-standing feature of make (SysV, BSD, gnu) for out-of-tree builds to helpfully pretend that the source files it can find in the VPATH "exist" as if they are in the $cwd. rpcgen doesn't work well in this situation and generates files with "bad" #include directives. E.g. if you `rpcgen ../../../../$srcdir/rpc/xdr/src/glusterfs3-xdr.x`, you get an #include directive in the generated .c file like this: ... #include "../../../../$srcdir/rpc/xdr/src/glusterfs3-xdr.h" ... which (obviously) results in compile errors on out-of-tree build because the (generated) header file doesn't exist at that location. Compared to `rpcgen ./glusterfs3-xdr.x` where you get: ... #include "glusterfs3-xdr.h" ... Which is what we need. We have to resort to some Stupid Make Tricks like the addition of various .PHONY targets to work around the VPATH "help". Warning: When doing an in-tree build, -I$(top_builddir)/rpc/xdr/... looks exactly like -I$(top_srcdir)/rpc/xdr/... Don't be fooled though. And don't delete the -I$(top_builddir)/rpc/xdr/... bits Change-Id: Iba6ab96b2d0a17c5a7e9f92233993b318858b62e BUG: 1330604 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/14085 Tested-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* dht/events: Added rebalance eventsN Balachandran2016-09-161-6/+50
| | | | | | | | | | | | | | | | | The rebalance process will now send an event when it is complete. Also fixed a problem where the run-time was not always set causing spurious rebalance failure events to be sent. Change-Id: Ib445171c78c9560940022bca20c887d31a9bb1ca BUG: 1371874 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15501 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* xlators/{dht,tier}: fix unused variable warnings/errorsKaleb S. KEITHLEY2016-09-132-6/+0
| | | | | | | | | | | | | | | | | | | | http://review.gluster.org/14085 fixes a "pragma leak" where the generated rpc/xdr headers have a pair of pragmas that disable these warnings. With the warnings disabled, many unused variables have crept into the code base. And 14085 won't pass its own smoke test until all these warnings are fixed. BUG: 1369124 Change-Id: I6feee863927254ae9f59013112bcc297ad09e89b Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15477 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* dht: "replica.split-brain-status" attribute value is not correctMohit Agrawal2016-09-122-12/+210
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In a distributed-replicate volume attribute "replica.split-brain-status" value does not display split-brain condition though directory is in split-brain. If directory is in split brain on mutiple replica-pairs it does not show full list of replica pairs. Solution: Update the dht_aggregate code to aggregate the xattr value in this specific condition. Fix: 1) function getChoices returns the choices from split-brain status string. 2) function add_opt adding the choices to local buffer to store in dictionary 3) For the key "replica.split-brain-status" function dht_aggregate call dht_aggregate_split_brain_xattr to prepare the list. Test: To verify the patch followed below steps 1) Create a distributed replica volume and create mount point 2) Stop heal daemon 3) Touch file and directories on mount point mkdir test{1..5};touch tmp{1..5} 4) Down brick process on one of the replica set pkill -9 glusterfsd 5) Change permission of dir on mount point chmod 755 test{1..5} 6) Restart brick process on node with force option 7) kill brick process on other node in same replica set 8) Change permission of dir again on mount point chmod 766 test{1..5} 9) Reexecute same step from 4-9 on other replica set also 10) After check heal status on server it will show dir's are in split brain on all replica sets 11) After check the replica.split-brain-status attr on mount point it will show wrong status of split brain. 12) After apply the patch the attribute shows correct value. BUG: 1368312 Change-Id: Icdfd72005a4aa82337c342762775a3d1761bbe4a Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/15201 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: heal root permission post add-brickSusant Palai2016-09-112-5/+43
| | | | | | | | | | | | | | | | | | | | | | | | | Post add-brick event the new brick will have permission of 755 by default. If the root directory permission was other than 755, that does not get healed to the new brick leading to permission errors/inconsistencies. For choosing source of attr heal we can trust the subvols which have layouts with latest ctime(as part of missing directory heal, we heal the proper attr). In case none of the subvols have layout, return ESTALE to retrigger a fresh lookup. Note: This patch heals the permission of the root directories only. Since, permission healing of directory is not straight forward and required intrusive fix, those are not addressed here. Change-Id: If894e3895d070d46b62d2452e52c1eaafcf56c29 BUG: 1368012 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/15195 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht: Proper log message if data migration is skippedankit2016-09-111-8/+8
| | | | | | | | | | | | Change-Id: Id0af15a2aec96bdbe675b4c959b56f0fc8e72504 BUG: 1341948 Signed-off-by: ankit <anraj@redhat.com> Reviewed-on: http://review.gluster.org/15345 Tested-by: ankitraj Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* dht: udpate stbuf from servers those have layoutSusant Palai2016-09-091-3/+29
| | | | | | | | | | | | | | | | | | | | | | | | Problem: For healing of uid/gid we check if local->stbuf.ia_ctime is lesser than stbuf->ia_ctime (received from brick). If yes then uid/gid is updated to local->prebuf(source of healing). But we merge local->stbuf also form the newly added brick. So if we receive response from the newly added brick first and update the local->stbuf, then local->prebuf will remain empty since the newly added brick will have the latest ctime among all servers. And this can result in healing wrong uid/gids to the rest of servers. Hence, we should update local->stbuf from servers with a layout which will ignore merging stbufs from newly added bricks. Change-Id: If4b64f75a0ea669abdbe9f5a3d1d18ff19374c2f BUG: 1365740 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/15126 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: Skip layout overlap maximization on weighted rebalanceN Balachandran2016-09-081-4/+21
| | | | | | | | | | | | | | | | dht_selfheal_layout_maximize_overlap () does not consider chunk sizes while calculating overlaps. Temporarily enabling this operation if only if weighted rebalance is disabled or all bricks are the same size. Change-Id: I5ed16cdff2551b826a1759ca8338921640bfc7b3 BUG: 1366494 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15403 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cluster/tier: add tiering eventsMilind Changire2016-09-084-0/+72
| | | | | | | | | | | | | | | | | | | | | | Add events for: * tier attach and detach * tier pause and resume * tier rising and dropping hi and lo watermarks Update eventskeygen.py with tiering events. Update cli help with: * attach: add optional force argument * detach: make force available as non-optional argument on its own Change-Id: I43990d3a8742151a4a7889bafa19cb572fe661bd BUG: 1368336 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/15232 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/dht: move layout logs to DEBUG levelSusant Palai2016-09-062-5/+8
| | | | | | | | | | | | Change-Id: Iad96256218be643b272762b5638a3f6837aff28d BUG: 1366495 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/15343 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/tier: Adding compaction option for metadata databasesDiogenes Nunez2016-09-044-53/+564
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: As metadata in the database fills up, querying the database take a long time. As a result, tier migration slows down. To counteract this, we added a way to enable the compaction methods of the underlying database. The goal is to reduce the size of the underlying file by eliminating database fragmentation. NOTE: There is currently a bug where sometimes a brick will attempt to activate compaction. This happens even compaction is already turned on. The cause is narrowed down to the compact_mode_switch flipping its value. Changes: libglusterfs/src/gfdb - Added a gfdb function to compact the underlying database, compact_db() This is a no-op if the database has no such option. - Added a compaction function for SQLite3 that does the following 1) Changes the auto_vacuum pragma of the database 2) Compacts the database according to the type of compaction requested - Compaction type can be changed by changing the macro GF_SQL_COMPACT_DEF to one of the 4 compaction types in gfdb_sqlite3.h It is currently set to GF_SQL_COMPACT_INCR, or incremental vacuuming. xlators/cluster/dht/src - Added the following command-line option to enable SQLite3 compaction. gluster volume set <vol-name> tier-compact <off|on> - Added the following command-line option to change the frequency the hot and cold tier are ordered to compact. gluster volume set <vol-name> tier-hot-compact-frequency <int> gluster volume set <vol-name> tier-cold-compact-frequency <int> - tier daemon periodically sends the (new) GFDB_IPC_CTR_SET_COMPACT_PRAGMA IPC to the CTR xlator. The IPC triggers compaction of the database. The inputs are both gf_boolean_t. IPC Input: compact_active: Is compaction currently on for the db. compact_mode_switched: Did we flip the compaction switch recently? IPC Output: 0 if the compaction succeeds. Non-zero otherwise. xlators/features/changetimerecorder/src/ - When the CTR gets the compaction IPC, it launches a thread that will perform the compaction. The IPC ends after the thread is launched. To avoid extra allocations, the parameters are passed using static variables. Change-Id: I5e1433becb9eeff2afe8dcb4a5798977bf5ba0dd Signed-off-by: Diogenes Nunez <dnunez@redhat.com> Reviewed-on: http://review.gluster.org/15031 Reviewed-by: Milind Changire <mchangir@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* dht, md-cache, upcall: Add invalidation of IATT when the layout changesPoornima G2016-08-301-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Issue: dht_layout is built as a part of lookup only. The layout can be modified by rebalance process. Since every IO fop is preceded by a lookup, there are very less issues of stale layout. But with enhancements of aggressive caching of stats in md-cache, the lookup will reduce and expose the stale layout issue often. Solution: Since stale layout is already an issue on dht, there is already a plan to fix this at the dht layer, but this fix is not currently planned for any release. Until this fix comes out, we can have a workaround where, the upcall will send a notification to md-cache when a layout xattr is changed. As a part of layout change notification the existing cache is invalidated and the next lookup will fetch the latest layout. This is not a foolproof solution as the window between the layout change and the next lookup(after invalidation of stat), where there will be stale layout. But until the final fix comes in, this reduces the stale layout window. Change-Id: Iacf871a38b35880c1fc0bc68fe7ce291265e71d4 BUG: 1369638 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15300 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht/tiering: fix unused variable warnings/errorsKaleb S. KEITHLEY2016-08-303-3/+0
| | | | | | | | | | | | | | | | | | | http://review.gluster.org/14085 fixes a/the "leak" - via the generated rpc/xdr headers - of pragmas that mask these warnings. However 14085 won't pass the smoke test until all the warnings are fixed. Change-Id: I367a737570dd7d2f6cc25f4bf4299d31bb6826aa BUG: 1369124 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15242 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* dht: Implement ipc fopPoornima G2016-08-273-0/+87
| | | | | | | | | | | | | | | | | | ipc is used by md-cache to communicate the list of xattrs that it is caching, to the upcall xlator. Hence implement this in dht, such that it winds to all the bricks if the ipc op is GF_IPC_MDC_TARGET_UPCALL. The ips should not fail if any of the bricks is down, as md-cache will replay the ipc late when the brick comes back up. Change-Id: Ica551a550c04cbb1240c0d211fe831c2e5eb6017 BUG: 1211863 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15225 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* quotad: fix potential buffer overflowsRaghavendra G2016-08-252-4/+23
| | | | | | | | | | | | | | | | | | | This converts sprintf to gf_asprintf in following components: * quotad.c * dht * afr * protocol/client * rpc/rpc-lib * rpc/rpc-transport Change-Id: If8a267bab3d91003bdef3a92664077a0136745ee BUG: 1332073 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/14102 Tested-by: Manikandan Selvaganesh <mselvaga@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
* cluster/tier: dont promote if estimated block consumption > hi watermarkMilind Changire2016-07-292-50/+153
| | | | | | | | | | | | | | | | | | Add test to fail promotion if estimated block consumption grows beyond hi watermark. Skip file migrations until next cycle if tier_get_fs_stat() fails in tier_migrate_using_query_file() Change-Id: Ice04572fa739c09109c4433e65965197482a7beb BUG: 1349284 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/14780 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* dht/rebalance: allocate migrator thread pool dynamicallySusant Palai2016-07-281-3/+14
| | | | | | | | | | | | | | | | | | Problems: The maximum number of migratior threads created was static set to "40". And the number of these threads get created in rebalance depends on the number of cores user has. If the number of cores exceeds 40, a crash or memory corruption can be seen. Fix: Make the migratior thread pool dynamic. Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839 BUG: 1359711 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/15000 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier: fix statfs for dht/tiered volumesDan Lambright2016-07-276-22/+265
| | | | | | | | | | | | | | | | | | | | | | | | | Return the correct size of the tiered volume in statfs. It should be the size of the cold tier, not the sum of the hot and cold tier, because the hot tier is a cache and not an extension of the volume's capacity. The number of free blocks, etc is the cold tier's capacity subtracted by the sum of utilization on the hot and cold tiers. Note if both tiers are part of the same file system this must be accounted for as well. The patch also fixes a pre-existing bug in the DHT/tier translators. If statfs was taken on a file, the code only calculated free space on the cached subvolume, not all subvolumes in the replica group. With the fix, this is corrected, except in the case where quota is used with the deem-statfs option set to "on". Change-Id: I2b8bcb4511edf83f12130960aad0a609fcf8f513 BUG: 1339689 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/14536 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* dict: Don't expose get_new_dict/dict_destroyPranith Kumar K2016-07-252-10/+8
| | | | | | | | | | | | | | | get_new_dict/dict_destroy is causing confusion where, dict_new/dict_destroy or get_new_dict/dict_unref are used instead of dict_new/dict_unref. Change-Id: I4cc69f5b6711d720823395e20fd624a0c6c1168c BUG: 1296043 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13183 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* cluster/dht: Wrong type of function's parameter when calling ↵Zhou Zhengping2016-06-231-7/+5
| | | | | | | | | | | | | | | | dht_selfheal_dir_xattr_cbk The second parameter's type is call_frame_t *, and we change it to be type xlator_t *, it is exactly what we need in this function. Change-Id: I6a154edcaa5a11084d837ca925efbfac853d0786 BUG: 1346551 Signed-off-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Reviewed-on: http://review.gluster.org/14737 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht: initialize cbk before attempting inode-linkRaghavendra G2016-06-171-3/+3
| | | | | | | | | | | | | | | | | Otherwise inode-link failures in selfheal codepath will result in a crash. Change-Id: I9061629ae9d1eb1ac945af5f448d0d8b397a5022 BUG: 1345748 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/14707 Reviewed-by: N Balachandran <nbalacha@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Poornima G <pgurusid@redhat.com> Reviewed-by: Susant Palai <spalai@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/dht: Fix unsafe iteration on inode->fd_listXavier Hernandez2016-06-151-16/+70
| | | | | | | | | | | | | | | | | | | | | When DHT traverses the inode->fd_list, it does that in an unsafe way that can generate races with fd_unref() called from other threads. This patch fixes this problem taking the inode->lock and adding a reference to the fd while it's being used outside of the mutex protected region. A minor change in storage/posix has been done to also access the inode->fd_list in a safe way. Change-Id: I10d469ca6a8f76e950a8c9779ae9c8b70f88ef93 BUG: 1344340 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/14682 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht : add metalock/unlockSusant Palai2016-06-031-5/+98
| | | | | | | | | | | | Change-Id: I842a7ea1b286f1b893b200fe647597e7fd0f2105 BUG: 1331720 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14252 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht: selfheal should wind mkdir call to subvols with ESTALE errorSakshi Bansal2016-05-251-1/+2
| | | | | | | | | | | | Change-Id: I7140e50263b5f28b900829592c664fa1d79f3f99 BUG: 1338634 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/14496 Reviewed-by: N Balachandran <nbalacha@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* dht/rebalance: mark hardlink failures as skipped in rebalanceSusant Palai2016-05-251-0/+8
| | | | | | | | | | | | | | | Since rebalance(not remove-brick) process does not migrate hardlinks mark them as skipped rather than failed as it creates confusion for the users. Change-Id: I5d469d10146274f00bb91482d0373c5235a9b8b2 BUG: 1339071 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14493 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* features/shard: Get hard-link-count in {unlink,rename}_cbk before deleting ↵Krutika Dhananjay2016-05-201-8/+13
| | | | | | | | | | | | | shards Change-Id: I0606b74f11f5412c4d9af44a6505635ed9022c15 BUG: 1335858 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/14334 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* dht: rename takes lock on parent directory if destination existsSakshi Bansal2016-05-181-7/+32
| | | | | | | | | | | | | | | | For directory rename if destination exists the source directory is created as a child of the given destination directory. Since the new child directory does not exist take lock on parent of the child directory. Change-Id: I24a34605a2cd65984910643ff5462f35e8fc7e71 BUG: 1336698 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/14371 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/distribute: heal layout in discover codepath tooRaghavendra G2016-05-161-33/+7
| | | | | | | | | | | BUG: 1334164 Change-Id: I4259d88f2b6e4f9d4ad689bc4e438f1db9cfd177 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/14365 Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/distribute: use a linked inode in directory heal codepathRaghavendra G2016-05-162-11/+58
| | | | | | | | | | | | | | | | | | | | This is needed for following reasons: * healing is done in lookup and mkdir codepath where inode is not linked _yet_ as normally linking is done in interface layers (fuse-bridge, gfapi, nfsv3 etc). * healing consists of non-lookup fops like inodelk, setattr, setxattr etc. All non-lookup fops expect a linked inode. Change-Id: I1bd8157abbae58431b7f6f6fffee0abfe5225342 BUG: 1334164 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/14295 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com>
* cluster/tier: downgrade max-cycle-time log message to INFODan Lambright2016-05-161-1/+1
| | | | | | | | | | | | | | | The "max cycle time" log message was incorrectly logged as an error. Downgrade it to INFO. Change-Id: Ia7d074423019fa79443bc6ea694148b7b8da455d BUG: 1335973 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/14336 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: N Balachandran <nbalacha@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* tier/detach: Clear tier-fix-layout-complete xattr after migration threads joinJoseph Fernandes2016-05-141-33/+42
| | | | | | | | | | | | | | | | | | | | | Previously we had wrongly placed the clearing tier-fix-layout-complete xattr before the joining of migration threads. This would lead to situations where failure of clearing the xattr would cause the premature death of migration threads. Now we clear the xattr only after the data movement threads join, ensuring that all migration is done. Change-Id: I829b671efa165ae13dbff7b00707434970b37a09 BUG: 1334839 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/14285 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: Joseph Fernandes CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* tier: avoid pthread_join if pthread_create failsPrasanna Kumar Kalever2016-05-111-2/+5
| | | | | | | | | | | | | | | | | this patch rearrange the code, to add some defence functionality for pthread_create(), i.e. only on a success on pthread_create() call pthread_join(). Change-Id: I0836bc950a210574cfdc755a666c6ac5df6ab430 BUG: 1332219 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/14152 Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* tier/detach : During detach check if background fixlayout is doneJoseph Fernandes2016-05-061-1/+16
| | | | | | | | | | | | | | | During detach check if background fixlayout is done, if not done ignore the case and continue detach. Change-Id: I5d5cfc0e73d0eb217fdeab54c432dc4af8bc598d BUG: 1332136 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/14147 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* dht:remember locked subvol and send unlock to the sameMohammed Rafi KC2016-05-065-21/+218
| | | | | | | | | | | | | | | | | | | | | | | | | During locking we send lock request to cached subvol, and normally we unlock to the cached subvol But with parallel fresh lookup on a directory, there is a race window where the cached subvol can change and the unlock can go into a different subvol from which we took lock. This will result in a stale lock held on one of the subvol. So we will store the details of subvol which we took the lock and will unlock from the same subvol Change-Id: I47df99491671b10624eb37d1d17e40bacf0b15eb BUG: 1311002 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13492 Reviewed-by: N Balachandran <nbalacha@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: Perform NULL check on xdata before dict_get()Krutika Dhananjay2016-05-041-1/+1
| | | | | | | | | | | | | | .. to prevent unnecessary logs from gf_msg_callingfn() Change-Id: I367628fee2f6783ba9ed6f918deabd034df820c9 BUG: 1333043 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/14212 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/dht: Handle rmdir failure correctlyN Balachandran2016-05-022-13/+108
| | | | | | | | | | | | | | | | | DHT did not handle rmdir failures on non-hashed subvols correctly in a 2x2 dist-rep volume, causing the directory do be deleted from the hashed subvol. Also fixed an issue where the dht_selfheal_restore errcodes were overwriting the rmdir error codes. Change-Id: If2c6f8dc8ee72e3e6a7e04a04c2108243faca468 BUG: 1330032 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/14060 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht/rebalance: add lock migration fop to dht_migrate_fileSusant Palai2016-05-012-63/+112
| | | | | | | | | | | Change-Id: Id0e7400c8ae950c90d42a3ddf8b558a14959a1f8 BUG: 1326085 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14074 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: handle EREMOTE in dht lk/flushSusant Palai2016-05-012-7/+98
| | | | | | | | | | | | | | | | With lock-migration, we need to send requests to destination brick post migration. Once, the source brick marks the lock structure to be already migrated, the requests will be redirected to destination brick by dht_lk2/flush2. Change-Id: I50b14011c5ab68c34826fb7ba7f8c8d42a68ad97 BUG: 1326085 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/13493 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* glusterd: volume set changes for lock migrationSusant Palai2016-05-012-3/+27
| | | | | | | | | | | Change-Id: I48c6f9cdda47503615ba65882acd5eedf0a70c89 BUG: 1326085 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14024 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* tier/migrator: Fetch the next query file for the next cycleJoseph Fernandes2016-04-302-0/+25
| | | | | | | | | | | | | | | | | | | | | | | Problem: When we spawn promote and demote thread, query files are build. And only query file with index 0 is picked for migration as the first query file. This may not be suitable for scenarios, where the file in the query are too big to move in the first cycle, as a result file in the other query files always get missed. We need to shuffle so that other query files also get a chance. Fix: Remember the previous first query file and shift it by one index, before the migration starts. Change-Id: I704947bcf4bab6b20b1179a6d9ae4a15a3d51bd9 BUG: 1330353 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/14068 Tested-by: Joseph Fernandes Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* dht/afr/client/posix: Fail mkdir without gfid-reqPranith Kumar K2016-04-291-0/+8
| | | | | | | | | | | | | | | Do not allow directory creations without gfids as after the directories are created, operations on them fail anyway. So it is better to fail mkdir. BUG: 1317361 Change-Id: I8f8e3b38bbded1960b7215bac0432500f7e78038 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13690 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* libglusterfs: Add debug and trace logs for stack traceRaghavendra Talur2016-04-271-1/+2
| | | | | | | | | | | | | | | | | It has become very difficult to identify the xlator which returned negative op_ret. Being able to just change the log level and visualize the stack is helpful in such cases. Change-Id: I6545b4802c1ab4d0d230d5e9e036afb2384882e1 BUG: 1330052 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/13448 CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* tier/dht: check for rebalance completion for EIO errorMohammed Rafi KC2016-04-251-1/+3
| | | | | | | | | | | | | | | | | When an ongoing rebalance completion check task been triggered by dht, there is a possibility of a race between afr setting subvol as non-readable and dht updates the cached subvol. In this window a write can fail with EIO. Change-Id: I42638e6d4104c0dbe893d1bc73e1366188458c5d BUG: 1329503 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/14049 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* dht/rebalance: Handle GF_DEFRAG_STOPSusant Palai2016-04-251-0/+17
| | | | | | | | | | | | | | | | Problem: On a rebal stop, the migrator threads don't intimate the crawler thread to wake up in case it is waiting on signal from migrator thread. Change-Id: I3cc4be41a4db25f48fee059ebb79a97ee99dcd00 BUG: 1327507 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14004 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* dht: Add lease() fopPoornima G2016-04-253-0/+47
| | | | | | | | | | | | | Change-Id: I0bbc2c2ef115c78393f6570815a5b80316e7e4be BUG: 1319992 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/11720 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/distribute: detect stale layouts in entry fopsRaghavendra G2016-04-225-27/+646
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dht_mkdir () { first-hashed-subvol = hashed-subvol for "bname" in in-memory layout of "parent"; inodelk (SETLKW, parent, "LAYOUT_HEAL_DOMAIN", "can be any subvol, but we choose first-hashed-subvol randomly"); { begin: hashed-subvol = hashed-subvol for "bname" in in-memory layout of "parent"; hash-range = extract hashe-range from layout of "parent"; ret = mkdir (parent/bname, hashed-subvol, hash-range); if (ret == "hash-value doesn't fall into layout stored on the brick (this error is returned by posix-mkdir)") { refresh_parent_layout (); goto begin; } } inodelk (UNLCK, parent, "LAYOUT_HEAL_DOMAIN", "first-hashed-subvol"); proceed with other parts of dht_mkdir; } posix_mkdir (parent/bname, client-hash-range) { disk-hash-range = getxattr (parent, "dht-layout-key"); if (disk-hash-range != client-hash-range) { fail-with-error ("hash-value doesn't fall into layout stored on the brick"); return 0; } continue-with-posix-mkdir; } Similar changes need to be done for dentry operations like create, symlink, link, unlink, rmdir, rename. These will be addressed in subsequent patches. This patch addresses only mkdir codepath. This change breaks stripe tests, as on some striped subvols dht layout xattrs are not set for some reason. This results in failure of mkdir. Since striped volumes are always created with dht, some tests associated with stripe also fail. So, I am making following tests changes (since stripe is out of maintainance): * modify ./tests/basic/rpc-coverage.t to not to use striped volumes * mark all (2) tests in tests/bugs/stripe/ as bad tests Change-Id: Idd1ae879f24a48303dc743c1bb4d91f89a629e25 BUG: 1323040 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/13885 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* quota: setting 'read-only' option in xdata to instruct DHT to not healSakshi Bansal2016-04-191-2/+10
| | | | | | | | | | | | | | | | | | | | | | When quota is enabled the quota enforcer tries to get the size of the source directory by sending nameless lookup to quotad. But if the rename is successful even on one subvol or the source layout has anomalies then this nameless lookup in quotad tries to heal the directory which requires a lock on as many subvols as it can. But src is already locked as part of rename. For rename to proceed in brick it needs to complete a cluster-wide lookup. But cluster-wide lookup in quotad is blocked on locks held by rename, hence a deadlock. To avoid this quota sends an option in xdata which instructs DHT not to heal. Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0 BUG: 1252244 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/13988 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht: add "nuke" functionality for efficient server-side deletionJeff Darcy2016-04-071-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This turns a special xattr into an rmdir with flags set. When that hits the posix translator on the server side, that causes the file/directory to be moved into the special "landfill" directory. From there, the posix janitor thread will take care of deleting it entirely on the server side - traversing it recursively if necessary. A couple of secondary issues were fixed to make this effective. * FUSE now ensures that setxattr values are NUL terminated. * The janitor thread now gets woken up immediately when something is placed in 'landfill' instead of only when file descriptors need to be closed. * The default landfill-emptying interval was reduced to 10s. To use the feature, issue a setxattr something like this: setfattr -n glusterfs.dht.nuke -v "" /mnt/glusterfs/vol/some_dir The value doesn't actually matter; the mere receipt of a request with this key is sufficient. Some day it might be useful to allow setting a required value as a sort of password, so that only those who know it can access the underlying special functionality. Change-Id: I8a343c2cdb40a76d5a06c707191fb67babb8514f Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/13878 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht: lock on subvols to prevent rename and lookup selfheal raceSakshi2016-04-061-43/+158
| | | | | | | | | | | | | | | | | | | | | | This patch addresses two races while renaming directories: 1) While renaming src to dst, if a lookup selfheal is triggered it can recreate src on those subvols where rename was successful. This leads to multiple directories (src and dst) having same gfid. To avoid this we must take locks on all subvols with src. 2) While renaming if the dst exists and a lookup selfheal is triggered it will find anomalies in the dst layout and try to heal the stale layout. To avoid this we must take lock on any one subvol with dst. Change-Id: I637f637d3241d9065cd5be59a671c7e7ca3eed53 BUG: 1252244 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/11880 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* dht: lock on subvols to prevent lookup vs rmdir raceSakshi2016-04-055-89/+435
| | | | | | | | | | | | | | | | | | | | | | There is a possibility that while an rmdir is completed on some non-hashed subvol and proceeding to others, a lookup selfheal can recreate the same directory on those subvols for which the rmdir had succeeded. Now the deletion of the parent directory will fail with an ENOTEMPTY. To fix this take blocking inodelk on the subvols before starting rmdir. Selfheal must also take blocking inodelk before creating the entry. Change-Id: I168a195c35ac1230ba7124d3b0ca157755b3df96 BUG: 1245065 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/13528 CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>