path: root/xlators/cluster/afr/src/afr-self-heald.c
Commit message | Author | Age | Files | Lines
* cluster/afr: Heal directory rename without rmdir/mkdir | Pranith Kumar K | 2020-10-01 | 1 | -2/+176

  Problem1: When a directory is renamed while a brick is down, entry-heal always did an rm -rf of that directory at the old location on the sink, then did a mkdir and recreated the directory hierarchy at the new location. This is inefficient.

  Problem2: The renamedir heal order may lead to a scenario where the directory at the new location is created before it is deleted from the old location, leading to 2 directories with the same gfid in posix.

  Fix: As part of heal, if the old location is healed first and is not present in the source brick, always rename it into a hidden directory inside the sink brick, so that when heal is triggered at the new location shd can rename it from this hidden directory to the new location. If the new-location heal is triggered first and it detects that the directory already exists in the brick, then it should skip healing the directory until it appears in the hidden directory.

  Credits: Ravi for rename-data-loss.t script
  Fixes: #1211
  Change-Id: I0cba2006f35cd03d314d18211ce0bd530e254843
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
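The heal ordering described in this commit can be sketched as a small model. This is only an illustration in Python, not the C code in afr-self-heald.c: the sink brick is modeled as a dict from path to gfid, and `SinkBrick`, `HIDDEN_DIR`, `heal_old_location` and `heal_new_location` are hypothetical names. The plain-create branch for a directory that does not exist anywhere on the sink is also an assumption.

```python
HIDDEN_DIR = ".hidden-rename"  # hypothetical name for the hidden directory


class SinkBrick:
    """Toy model of a sink brick: maps a directory path to its gfid."""

    def __init__(self, dirs):
        self.dirs = dict(dirs)

    def path_of(self, gfid):
        """Return the path currently holding this gfid, or None."""
        for path, g in self.dirs.items():
            if g == gfid:
                return path
        return None

    def rename(self, old, new):
        """Move a directory entry; the gfid never exists at two paths."""
        self.dirs[new] = self.dirs.pop(old)


def heal_old_location(sink, source_paths, old_path):
    """Heal of the old location: park the directory instead of rm -rf."""
    if old_path in sink.dirs and old_path not in source_paths:
        gfid = sink.dirs[old_path]
        sink.rename(old_path, f"{HIDDEN_DIR}/{gfid}")


def heal_new_location(sink, new_path, gfid):
    """Heal of the new location: True if healed, False if skipped for now."""
    if sink.dirs.get(new_path) == gfid:
        return True  # already in place, nothing to do
    hidden = f"{HIDDEN_DIR}/{gfid}"
    if hidden in sink.dirs:
        sink.rename(hidden, new_path)  # rename instead of mkdir + re-create
        return True
    if sink.path_of(gfid) is not None:
        # The directory still sits at its old location: skip until the
        # old-location heal has moved it into the hidden directory.
        return False
    sink.dirs[new_path] = gfid  # assumption: ordinary create when absent
    return True
```

In this model, whichever location is healed first, the sink never holds two directories with the same gfid, which is the situation Problem2 describes.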
* xlators: prefer libglusterfs time API | Dmitry Antipov | 2020-09-07 | 1 | -2/+2

  Prefer timespec_now_realtime() and gf_time() over clock_gettime() and time(), use gf_tvdiff() and gf_tsdiff() where appropriate, drop unused time_elapsed() and leftovers in 'struct posix_private'.

  Change-Id: Ie1f0229df5b03d0862193ce2b7fb91d27b0981b6
  Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
  Updates: #1002
* libglusterfs: add library wrapper for time() | Dmitry Antipov | 2020-08-17 | 1 | -1/+1

  Add thin convenient library wrapper gf_time(), adjust related users and comments as well.

  Change-Id: If8969af2f45ee69c30c3406bce5baa8305fb7f80
  Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
  Updates: #1002
* cluster/afr: Removing unsupported options from code base to improve coverage | karthik-us | 2020-04-07 | 1 | -9/+0

  Support for gluster volume heal <volname> info healed/heal-failed was removed by commit bb02cfb56ae08f56df4452c2b948fa962ae1212b in release-3.6. The cli parser displays the usage message in all the supported versions whenever these clis are run, leaving some dead code in the latest branches. Since support for these clis was removed long back, removing the code should not cause any backward compatibility issues. Hence removing the dead code from the code base, which will also lead to better code coverage by the regression runs.

  Updates: #1052
  Change-Id: I0c2b061469caf233c06d9699b0d159ce48e240b9
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
* multiple xlators: reduce key length | Yaniv Kaul | 2020-01-14 | 1 | -3/+3

  In many cases, we were freely allocating long keys with no need. Smaller char arrays are just fine almost anywhere, so just went ahead and looked where we can use smaller ones.

  In some cases, annotated the functions as static and the prefixes passed as const as it was easier to read and understand.

  Where relevant, converted the dict functions to use known key length.

  Change-Id: I882ab33ea20d90b63278336cd1370c09ffdab7f2
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* afr/self-heald - Missing error logs | Barak Sason Rofman | 2019-12-10 | 1 | -23/+69

  As a follow up on https://review.gluster.org/#/c/glusterfs/+/23749/, adding error logging for the entire method. In addition, converted logging to structured logging in the method.

  Fixes: bz#1778457
  Change-Id: I1f412159e6849d6f6ddbde53ec4a85ad709bbdf4
  Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
* cluster/afr - coverity issue fix | Barak Sason Rofman | 2019-11-28 | 1 | -0/+5

  Added a log for a failure in order to avoid "unused variable" coverity issue.

  fixes: CID#1274209
  Change-Id: Ibc6b0ab4bdff482096e42e88fd4c8c7eadfeeadb
  Updates: bz#789278
  Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
* afr: wake up index healer threads | Ravishankar N | 2019-08-30 | 1 | -4/+10

  ...whenever shd is re-enabled after disabling or there is a change in `cluster.heal-timeout`, without needing to restart shd or waiting for the current `cluster.heal-timeout` seconds to expire. See BZ 1743988 for more details.

  Change-Id: Ia5ebd7c8e9f5b54cba3199c141fdd1af2f9b9bfe
  fixes: bz#1744548
  Reported-by: Glen Kiessling <glenk1973@hotmail.com>
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* cluster/ta: Notify the clients only if there are pending heals | karthik-us | 2019-07-12 | 1 | -1/+44

  Problem: In case of thin arbiter, before the index healer starts crawling the indices at every heal-timeout interval, even if there is nothing to be healed it sends an upcall notification to all the clients to release any AFR_TA_DOM_NOTIFY locks that they hold. SHD waits for the upcall to return before proceeding with the heal even though there is nothing to be healed. This also invalidates the cached information about the brick states on the clients, which leads to extra calls on TA from clients for the next reads & writes if needed. This impacts the IO performance.

  Fix:
  - Before sending the upcall to the clients, check for any pending heals on TA without taking any locks.
  - If there is nothing marked bad on TA, then continue with the index crawl to heal any dirty markings present on the files due to any post-op failure.
  - If there is a brick marked as bad on TA, then take the AFR_TA_DOM_NOTIFY lock on TA from SHD, get the state on TA and continue with the current healing process.

  Change-Id: Ieb477bc6cb18bbdfd4e7a0453c5ed79b574ec9d6
  fixes: bz#1724184
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
* ec/shd: Cleanup self heal daemon resources during ec fini | Mohammed Rafi KC | 2019-05-13 | 1 | -0/+5

  We were not properly cleaning self-heal daemon resources during ec fini. With shd multiplexing, it is absolutely necessary to cleanup all the resources during ec fini.

  Change-Id: Iae4f1bce7d8c2e1da51ac568700a51088f3cc7f2
  fixes: bz#1703948
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* Replace memdup() with gf_memdup() | Vijay Bellur | 2019-04-12 | 1 | -1/+1

  memdup() and gf_memdup() have the same implementation. Removed one API as the presence of both can be confusing.

  Change-Id: I562130c668457e13e4288e592792872d2e49887e
  updates: bz#1193929
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Thin-arbiter SHD fixes | karthik-us | 2019-04-12 | 1 | -12/+12

  This patch addresses post-merge review comments for commit 5784a00f997212d34bd52b2303e20c097240d91c

  Change-Id: I7ed954664a2ae8e1091d23ee3ceb9c66e83bfeac
  fixes: bz#1697930
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
* afr/shd: Cleanup self heal daemon resources during afr fini | Mohammed Rafi KC | 2019-02-12 | 1 | -0/+2

  We were not properly cleaning self-heal daemon resources during afr fini. This patch will clean the same.

  Change-Id: I597860be6f781b195449e695d871b8667a418d5a
  updates: bz#1659708
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* AFR xlator: use dict_{setn|getn|deln|get_int32n|set_int32n|set_strn} | Yaniv Kaul | 2018-12-17 | 1 | -46/+62

  In a previous patch (https://review.gluster.org/20769) we've added the key length to be passed to dict_* funcs, to remove the need to strlen() it. This patch moves some xlators to use it.

  - In some cases, moved strlen() of the key length outside of locks, which is usually a good thing. Please verify it's safe to do so.
  - In some cases, created a prefix for the keys, replacing something like "%d-%d" with a "%s" in snprintf(). Not sure it adds value, but improves readability.

  Please review carefully. Compile-tested only!

  Change-Id: I04f2a1eb2ecfc3283d849d150d10d088ae7aa7f1
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* afr: some minor itable related cleanups | Ravishankar N | 2018-12-12 | 1 | -7/+2

  - this->itable always needs to be allocated, hence move it outside afr_selfheal_daemon_init().
  - Invoke afr_selfheal_daemon_init() only for self-heal daemon case.
  - remove redundant itable allocation in afr_discover().
  - destroy itable in fini.

  Updates bz#1193929
  Change-Id: Ib28b50b607386f5a5aa7d2f743c8b506ccb10eae
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* libglusterfs: Move devel headers under glusterfs directory | ShyamsundarR | 2018-12-05 | 1 | -2/+2

  libglusterfs devel package headers are referenced in code using include semantics for a program. While this works, it can be better, especially when dealing with out of tree xlator builds or in general out of tree devel package usage.

  Towards this, the following changes are done,
  - moved all devel headers under a glusterfs directory
  - Included these headers using system header notation <> in all code outside of libglusterfs
  - Included these headers using own program notation "" within libglusterfs

  This change although big, is just moving around the headers and making it correct when including these headers from other sources.

  This helps us correctly include libglusterfs includes without namespace conflicts.

  Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
  Updates: bz#1193929
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
* cluster/afr : Check for UP bricks before starting heal | Ashish Pandey | 2018-10-24 | 1 | -0/+15

  Problem: Currently for a replica volume, even if only one brick is UP, SHD keeps crawling index entries even though it cannot heal anything.

  In a thin-arbiter volume, which is also a replica 2 volume, this causes inode lock contention, which in turn sends upcalls to all the clients to release notify locks, even though nothing can be healed.

  This slows down client performance and defeats the purpose of keeping in-memory information about the bad brick.

  Solution: Before starting heal or even crawling, check if a sufficient number of children are UP and available to check and heal entries.

  Change-Id: I011c9da3b37cae275f791affd56b8f1c1ac9255d
  updates: bz#1640581
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>
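The pre-crawl check from this commit can be illustrated with a short sketch. It is written in Python for brevity and is not the actual C change; `should_crawl` and the threshold `min_up=2` are assumptions (the commit message only says "sufficient number of children"), chosen because healing needs at least one source and one sink.

```python
def should_crawl(child_up, min_up=2):
    """Return True when enough children are UP for a heal to be possible.

    child_up: one boolean per child (brick), True when that child is UP.
    min_up:   hypothetical threshold; a heal needs a source and a sink.
    """
    up_count = sum(1 for up in child_up if up)
    return up_count >= min_up
```

With a check of this kind the healer can return before touching the index directory, so no locks are taken and no upcalls are triggered when only one brick is reachable.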
* cluster/afr: Use 2 domain locking in SHD for thin-arbiter | karthik-us | 2018-09-20 | 1 | -88/+157

  With this change, when SHD starts the index crawl it requests all the clients to release the AFR_TA_DOM_NOTIFY lock, so that clients know the in-memory state is no longer valid and any new operations need to query the thin-arbiter if required.

  When SHD completes healing all the files without any failure, it again takes the AFR_TA_DOM_NOTIFY lock and gets the xattrs on TA to see whether any new failures have happened by that time. If there are new failures marked on TA, SHD starts the crawl immediately to heal those failures as well. If there are no new failures, then SHD takes the AFR_TA_DOM_MODIFY lock and unsets the xattrs on TA, so that both the data bricks are considered good thereafter.

  Change-Id: I037b89a0823648f314580ba0716d877bd5ddb1f1
  fixes: bz#1579788
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
* Land part 2 of clang-format changes | Gluster Ant | 2018-09-12 | 1 | -1098/+1053

  Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
  Signed-off-by: Nigel Babu <nigelb@redhat.com>
* afr: common thin-arbiter functions | Ravishankar N | 2018-08-23 | 1 | -2/+2

  ...that can be used by client and self-heal daemon, namely:

  afr_ta_post_op_lock()
  afr_ta_post_op_unlock()

  Note: These are not yet consumed. They will be used in the write txn changes patch which will introduce 2 domain locking.

  updates: bz#1579788
  Change-Id: I636d50f8fde00736665060e8f9ee4510d5f38795
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* cluster/afr: shd changes for thin arbiter | karthik-us | 2018-04-30 | 1 | -0/+184

  Updates #352
  Change-Id: I1bbb3c652ba33cec6aa37f3700370674077fb17d
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
* afr: coverity fixes | Ravishankar N | 2017-11-24 | 1 | -3/+0

  1. afr_discover_do: COPY_PASTE_ERROR
  2. afr_fav_child_reset_sink_xattrs_cbk: REVERSE_INULL
  3. afr_fop_lock_proceed: UNUSED_VALUE
  4. afr_local_init: CHECKED_RETURN
  5. afr_set_split_brain_choice: REVERSE_INULL
  6. __afr_inode_write_finalize: FORWARD_NULL
  7. afr_refresh_heal_done: REVERSE_INULL
  8. afr_xl_op: UNUSED_VALUE
  9. afr_changelog_populate_xdata: DEADCODE
  10. set_afr_pending_xattrs_option: RESOURCE_LEAK

  Note: RESOURCE_LEAK complaints about afr_fgetxattr_pathinfo_cbk, afr_getxattr_list_node_uuids_cbk and afr_getxattr_pathinfo_cbk seem to be false alarms.

  Change-Id: Ia4ca1478b5e2922084732d14c1e7b1b03ad5ac45
  BUG: 789278
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* cluster/afr: Fail open on split-brain | Pranith Kumar K | 2017-10-26 | 1 | -3/+3

  Problem: Append on a file with split-brain succeeds. Open is intercepted by open-behind, when write comes on the file, open-behind does open+write. Open succeeds because afr doesn't fail it. Then write succeeds because write-behind intercepts it. Flush is also intercepted by write-behind, so the application never gets to know that the write failed.

  Fix: Fail open on split-brain, so that when open-behind does open+write open fails which leads to write failure. Application will know about this failure.

  Change-Id: I4bff1c747c97bb2925d6987f4ced5f1ce75dbc15
  BUG: 1294051
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* libglusterfs: Name threads on creation | Raghavendra Talur | 2017-07-19 | 1 | -1/+1

  Set names to threads on creation for easier debugging.

  Output of top -H -p <PID-OF-GLUSTERFSD>

  Before:
  19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd
  19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd
  19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd
  19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd

  After:
  19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustertimer
  19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
  19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustermemsweep
  19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc0
  19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc1
  19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll0
  19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteridxwrker
  19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteriotwr0
  19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrssign
  19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrswrker
  19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterclogecon
  19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd0
  19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd1
  19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd2
  19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixjan
  19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixfsy
  25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll1
  5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll2
  7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixhc

  Change-Id: Id5f333755c1ba168a2ffaa4fce6e71c375e10703
  BUG: 1254002
  Updates: #271
  Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
  Reviewed-on: https://review.gluster.org/11926
  Reviewed-by: Prashanth Pai <ppai@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: GFID split-brain resolution with existing CLI | karthik-us | 2017-07-18 | 1 | -1/+1

  Problem: Currently there is no way for the admin to resolve gfid split-brain from the CLI based on some policy like choice of the brick, mtime or size.

  Fix: With the existing CLI options based on size, mtime, and choice of brick, we do a lookup on the parent for the specified file. As part of the lookup, if we find a gfid mismatch, we resolve it based on the policy and return. If the file is not in gfid split-brain, then we check for data and metadata split-brain in the getxattr code path, and resolve if any.

  This works only when the absolute path to the file is given to the CLI, not the gfid of the file. Hence the source-brick policy without any file path will also not resolve the gfid split-brain, since it uses the gfid of the files. But it can resolve any other type of split-brain and skip the gfid mismatch resolution with the usual error message.

  Reverting the change https://review.gluster.org/17290. This patch resolves the issue.

  Fixes gluster/glusterfs#135
  Change-Id: Iaeba6fc32f184a34255d03be87cda02773130a09
  BUG: 1459530
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
  Reviewed-on: https://review.gluster.org/17485
  Reviewed-by: Ravishankar N <ravishankar@redhat.com>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: Implement heal info with lock | Ashish Pandey | 2016-10-11 | 1 | -18/+9

  Problem: Currently the heal info command prints all the files/directories if the index for the file/directory is present in the .glusterfs/indices folder. After implementing patch http://review.gluster.org/#/c/13733/, indices of a file which is going through an update fop will also be present in .glusterfs/indices even if the fop is successful on all the bricks. If the heal info command is used at this time, it will also display this file, which is actually healthy and does not require any heal.

  Solution: Take a lock on the file corresponding to the index and inspect xattrs to decide if the file needs heal or not.

  Change-Id: I6361e2813ece369be12d02e74816df4eddb81cfa
  BUG: 1366815
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>
  Reviewed-on: http://review.gluster.org/15543
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
* afr: fix incorrect debug log in selfheal path | Ravishankar N | 2016-10-04 | 1 | -2/+2

  1. While looking at glustershd logs in DEBUG log-level, it was found that all bricks of the replica were printed as local bricks even though they were not. Fixed it.
  2. Print the name of the subvol from which the entry was got during index crawl.

  Change-Id: I08b32e38694c755715e9fe0c0e1dd9212abcfb16
  BUG: 1381421
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
  Reviewed-on: http://review.gluster.org/15610
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* afr, index: Clean up stale directory and file indices in granular entry sh | Krutika Dhananjay | 2016-07-11 | 1 | -10/+45

  Specifically when a directory tree is removed (rm -rf) while a brick is down, both the directory index and the name indices of the files and subdirs under it will remain. Self-heal will need to pick up these and remove them.

  Towards this, afr sh will now also crawl indices/entry-changes and call an rmdir on the dir if the directory index is stale.

  On the brick side, rmdir fop has been implemented for index xl, which would delete the directory index and its contents if present in a synctask.

  Change-Id: I8b527331c2547e6c141db6c57c14055ad1198a7e
  BUG: 1331323
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/14832
  Reviewed-by: Ravishankar N <ravishankar@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Fix a minor typo in afr-self-heald.c | Vijay Bellur | 2016-07-09 | 1 | -6/+6

  s/outout/output/

  Change-Id: I2aec770cdae513cd4932e5fd56e0267584e44cae
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
  Reviewed-on: http://review.gluster.org/13930
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com>
* cluster/afr: Do heals with shd pid | Pranith Kumar K | 2016-05-05 | 1 | -1/+10

  Multi-threaded healing doesn't create the synctask with the shd pid; this leads to healing problems when quota is exceeded.

  BUG: 1332994
  Change-Id: I80f57c1923756f3298730b8820498127024e1209
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/14211
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Ravishankar N <ravishankar@redhat.com>
* cluster/afr: Entry self-heal performance enhancements | Krutika Dhananjay | 2016-04-29 | 1 | -1/+1

  Change-Id: I52da41dff5619492b656c2217f4716a6cdadebe0
  BUG: 1269461
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/12442
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Use parallel dir scan functionality | Pranith Kumar K | 2016-04-04 | 1 | -12/+28

  BUG: 1221737
  Change-Id: I0ed71a72f0e33bd733723e00a01cf28378c5534e
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/13755
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/afr: Don't lookup/forget inodes | Pranith Kumar K | 2016-03-31 | 1 | -6/+1

  Problem: All inodes that are looked up are always forgotten without fail in afr, removing the benefit of them being in the lru. The same code can cause crashes if the top xlator does inode_forget(0) between inode_lookup and inode_forget in afr.

  Fix: Don't use lookup/forget in afr. There is no benefit at the moment in keeping this code. It is impossible to prevent top xlators from doing inode_forget(0).

  Found similar instances in ec and removed them, even though those code paths are not going to be executed anywhere other than the heal-daemon.

  BUG: 1321554
  Change-Id: Ia4cb236178f7f129cc898d53f0bbd26f494a2a8d
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/13834
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anuradha Talur <atalur@redhat.com>
* cluster/afr: Choose local child as source if possible | Pranith Kumar K | 2016-03-11 | 1 | -0/+3

  It is better to choose local brick as source if possible to prevent over the wire read thus saving on bandwidth. Also changed code to not attempt data-heal if 'source' is selected as arbiter.

  Change-Id: I9a328d0198422280b13a30ab99545370a301dfea
  BUG: 1314150
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/13585
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
  Tested-by: Krutika Dhananjay <kdhananj@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cli/ afr: op_ret for index heal launch | Ravishankar N | 2016-02-11 | 1 | -2/+3

  Problem: If index heal is launched when some of the bricks are down, glustershd of that node sends a -1 op_ret to glusterd, which eventually propagates it to the CLI. Also, glusterd sometimes sends an err_str and sometimes not (depending on the failure happening in the brick-op phase or commit-op phase). So the message that gets displayed varies in each case:

  "Launching heal operation to perform index self heal on volume testvol has been unsuccessful"
  (OR)
  "Commit failed on <host>. Please check log file for details."

  Fix:
  1. Modify afr_xl_op() to return -1 even if index healing of at least one brick fails.
  2. Ignore glusterd's error string in gf_cli_heal_volume_cbk and print a more meaningful message.

  The patch also fixes a bug in glusterfs_handle_translator_op() where if we encounter an error in notify of one xlator, we break out of the loop instead of sending the notify to other xlators.

  Change-Id: I957f6c4b4d0a45453ffd5488e425cab5a3e0acca
  BUG: 1302291
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
  Reviewed-on: http://review.gluster.org/13303
  Reviewed-by: Anuradha Talur <atalur@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
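Two behaviours from this commit message lend themselves to a short illustration: reporting failure when at least one brick's index heal fails, and continuing to notify the remaining xlators after one notify fails. The sketch below is Python rather than the project's C, and `index_heal_op_ret` and `notify_all` are hypothetical names that only mirror the described logic.

```python
def index_heal_op_ret(per_brick_ret):
    """Return -1 if index heal failed on at least one brick, else 0.

    per_brick_ret: iterable of per-brick return codes (0 means success).
    """
    return -1 if any(ret != 0 for ret in per_brick_ret) else 0


def notify_all(xlators, notify):
    """Send notify to every xlator, remembering failures without breaking.

    notify: callable taking one xlator and returning 0 on success.
    """
    overall = 0
    for xl in xlators:
        if notify(xl) != 0:
            # Record the error but keep looping, so that the remaining
            # xlators still receive the notification.
            overall = -1
    return overall
```

The design point in both helpers is the same: a single failure changes the reported status, but it does not stop work on the other bricks or xlators.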
* cluster/afr : Readdirp performance enhancement | Anuradha Talur | 2015-11-30 | 1 | -79/+32

  Things done:
  1) during lookup and inode_refresh as part of read_txn, request is sent to detect if heal is required or not.
  2) If heal is required, be conservative in setting the readdirp entry inodes to NULL, otherwise don't be.
  3) Self-heal-daemon now crawls both indices/xattrop and indices/dirty directory while healing.

  Change-Id: Ic4a4da63fb7e0726eab5f341a200859b29cf7eb7
  BUG: 1250803
  Signed-off-by: Anuradha Talur <atalur@redhat.com>
  Reviewed-on: http://review.gluster.org/12507
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Mark self-heal fops as internal | Pranith Kumar K | 2015-11-18 | 1 | -3/+3

  Change-Id: I8ae7af266d3e00460f0cfdc9389a926e5f2fee36
  BUG: 1282761
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/12598
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* afr: Porting messages to new logging framework | arao | 2015-06-27 | 1 | -44/+63

  updated

  Change-Id: I94ac7b2cb0d43a82cf0eeee21407cff9b575c458
  BUG: 1194640
  Signed-off-by: arao <arao@redhat.com>
  Signed-off-by: Mohamed Ashiq <mliyazud@redhat.com>
  Reviewed-on: http://review.gluster.org/9897
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* build: do not #include "config.h" in each file | Niels de Vos | 2015-05-29 | 1 | -5/+0

  Instead of including config.h in each file, have config.h included from the compiler command line (-include option).

  When a .c file tests for a certain #define and config.h was not included, incorrect assumptions were made. With this change, that cannot happen again.

  BUG: 1222319
  Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/10808
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* libglusterfs/syncop: Add xdata to all syncop calls | Raghavendra Talur | 2015-04-08 | 1 | -4/+4

  This patch adds support for xdata in both the request and response path of syncops. Few calls like lookup already had the support; have renamed variables in few places to maintain uniformity.

  xdata passed downwards is known as xdata_in and xdata passed upwards is known as xdata_out.

  There is an old patch by Jeff Darcy at http://review.gluster.org/#/c/8769/3 which does the same for some selected calls. It also brings in xdata support at gfapi level. xdata support at gfapi level would be introduced in subsequent patches.

  Change-Id: I340e94ebaf2a38e160e65bc30732e8fe1c532dcc
  BUG: 1158621
  Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
  Reviewed-on: http://review.gluster.org/9859
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Cluster/afr : Coverity fix. | Manikandan Selvaganesh | 2015-04-08 | 1 | -4/+0

  CID:1194644

  Childup[] value will not be equal to -1 when afr_xl_op() function gets called

  Change-Id: Iaf7a9d41a54f6b2d52d9ba5dadb638f328afe14b
  BUG: 789278
  Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
  Reviewed-on: http://review.gluster.org/9540
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
  Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* Avoid conflict between contrib/uuid and system uuid | Emmanuel Dreyfus | 2015-04-04 | 1 | -6/+6

  glusterfs relies on the Linux uuid implementation, whose API is incompatible with most other systems' uuid. As a result, libglusterfs has to embed contrib/uuid, which is the Linux implementation, on non-Linux systems. This implementation is incompatible with the system's built-in one, but the symbols have the same names.

  Usually this is not a problem, because when we link with -lglusterfs, libc's symbols are trumped. However, there is a problem when a program not linked with -lglusterfs dlopen()s a glusterfs component. In such a case, libc's uuid implementation is already loaded in the calling program, and it will be used instead of libglusterfs's implementation, causing crashes.

  A possible workaround is to pre-load libglusterfs in the calling program (using LD_PRELOAD on NetBSD for instance), but such a mechanism is not portable, nor is it flexible. A much better approach is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any possible conflict. This is what this change attempts.

  BUG: 1206587
  Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a
  Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
  Reviewed-on: http://review.gluster.org/10017
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* afr: remove stale index entries  Ravishankar N  2015-03-17  1  -0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: During pre-op phase, the index xlator 1. Creates the entry inside .glusterfs/indices/xattrop 2. Winds the xattrop fop to posix to mark dirty/pending changelogs. If the brick crashes after 1, the xattrop entry becomes stale and never gets removed by shd during subsequent crawls because there is nothing to heal (changelogs are zero). Though the stale entry does not get displayed in the output of 'heal info' command, it nevertheless stays there forever unless a new write transaction is performed on the file. Fix: During index self-heal, if afr xattrs are found to be clean (indicated by ret value of 2 on a call to afr_shd_selfheal()), send a dummy post-op with all 0s for the xattr values, which makes the index xlator unlink the stale entry. Change-Id: I02cb2bc937f2e3f3f3cb35d67b006664dc7ef919 BUG: 1190069 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/9714 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
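The fix described in the commit above can be sketched roughly as follows. This is only an illustration, not the code from afr-self-heald.c: the names SHD_HEAL_CLEAN, CHANGELOG_COUNTERS, shd_index_entry_is_stale() and shd_fill_dummy_postop() are hypothetical, and the assumption that a changelog holds three 32-bit counters (data, metadata, entry) is not stated in the commit message. Only the "ret value of 2" and the "all 0s" post-op come from the message itself.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name for the "ret value of 2" that the commit message
 * says afr_shd_selfheal() returns when the afr xattrs are clean. */
#define SHD_HEAL_CLEAN 2

/* Assumption: one changelog holds three counters (data, metadata, entry). */
#define CHANGELOG_COUNTERS 3

/* Returns 1 when the heal result says there was nothing to heal, which
 * means the index entry that led us to this file is stale. */
int shd_index_entry_is_stale(int heal_ret)
{
    return heal_ret == SHD_HEAL_CLEAN;
}

/* Fills the xattr values for the dummy post-op with all zeros. Per the
 * commit message, receiving such a post-op is what lets the index
 * xlator unlink the stale entry. */
void shd_fill_dummy_postop(uint32_t counters[CHANGELOG_COUNTERS])
{
    memset(counters, 0, CHANGELOG_COUNTERS * sizeof(counters[0]));
}
```

A crawler built on this sketch would call shd_fill_dummy_postop() and send the resulting values only when shd_index_entry_is_stale() returns 1; every other return value is left to the normal heal path.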
* cluster/ec: Add self-heal-daemon command handlers  Pranith Kumar K  2015-03-09  1  -10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces the changes required in ec xlator to handle index/full heal. Index healer threads: Ec xlator starts an index healer thread per local brick. This thread wakes up every minute to check if there are any files to be healed based on the indices kept in the index directory. Whenever a child_up event arrives, the index healer thread also wakes up, crawls the indices and triggers heal. When self-heal-daemon is disabled on this particular volume, the healer thread keeps waiting until it is enabled again to perform heals. Full healer threads: Ec xlator starts a full healer thread for the local subvolume provided by glusterd to perform a full crawl on the directory hierarchy to perform heals. Once the crawl completes, the thread exits if no more full heals are issued. Changed xl-op prefix GF_AFR_OP to GF_SHD_OP to make it more generic. Change-Id: Idf9b2735d779a6253717be064173dfde6f8f824b BUG: 1177601 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9787 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: Moved common functions as utils in syncop/common-utils  Pranith Kumar K  2015-02-27  1  -78/+10
| | | | | | | | | | | | | | | These will be used by both afr and ec. Moved syncop_dirfd, syncop_ftw, syncop_dir_scan functions also into syncop-utils.c Change-Id: I467253c74a346e1e292d36a8c1a035775c3aa670 BUG: 1177601 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9740 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Re-introduce heal-timeout option  Pranith Kumar K  2015-02-06  1  -1/+1
| | | | | | | | | | | Change-Id: I87484c810006a92ed7489284b6d74e9b0aecae80 BUG: 1177601 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9598 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* syncop: Provide syncop_ftw and syncop_dir_scan utils  Pranith Kumar K  2015-02-06  1  -257/+117
| | | | | | | | | | | | | | | | | ftw provides file tree walk. dir_scan does just a readdir not readdirp. Also changed Afr's self-heal-daemon's crawling functions to use this. These utils will be used by ec in future to do proactive/full healing. Change-Id: I05715ddb789592c1b79a71e98f1e8cc29aac5c26 BUG: 1177601 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9485 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: change signature of syncop_(f)getxattr  Ravishankar N  2015-01-05  1  -4/+4
| | | | | | | | | | | | | | | | | Pass xdata dict to syncop_(f)getxattr calls. This patch [1/3] is required as a part of afr automated split-brain resolution implementation. Change-Id: I3970b3dd6daf64681a031e37f8e9afb14fb3d668 BUG: 1136769 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/9375 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* telldir()/seekdir() portability fixes  Emmanuel Dreyfus  2014-12-17  1  -6/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POSIX says that an offset obtained from telldir() can only be used on the same DIR *. Linux is able to reuse the offset across closedir()/opendir() for a given directory, but this is not portable and such a behavior should be fixed. An incomplete fix for the posix xlator was merged in http://review.gluster.com/8926 This change set completes it. - Perform the same fix in the index xlator. - Use appropriate casts and variable types so that 32 bit signed offsets obtained by telldir() do not get clobbered when copied into 64 bit signed types. - Modify glfs-heal.c and afr-self-heald.c so that they do not use anonymous fd, since this will cause closedir()/opendir() between each syncop_readdir(). On failure we fall back to anonymous fd only for Linux so that we can cope with updated client vs not updated brick. - Avoid sending an EINVAL when the client requests the EOF offset. Here we fix an error in the previous fix for the posix xlator: since we fill each directory entry with the offset of the next entry, we must consider as EOF the offset of the last entry, and not the value of telldir() after we read it. - Add checks in regression tests that we do not hit cases where offsets fed to seekdir() are wrong. Introduce log_newer() shell function to check for messages produced by the current script. This fix gathers changes from http://review.gluster.org/9047 and http://review.gluster.org/8936 making them obsolete. BUG: 1129939 Change-Id: I59fb7f06a872c4f98987105792d648141c258c6a Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/9071 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Raghavendra Bhat <raghavendra@redhat.com>
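The portability rule this commit relies on can be illustrated with a small standalone sketch. It is not taken from the GlusterFS sources: count_plain() and count_with_resume() are hypothetical helpers written for this note. The point it shows is the portable pattern, where the offset returned by telldir() is fed back to seekdir() on the same DIR * that produced it, with no closedir()/opendir() in between.

```c
#include <dirent.h>
#include <stddef.h>

/* Counts directory entries in one uninterrupted pass. */
long count_plain(const char *path)
{
    DIR *dir = opendir(path);
    long n = 0;

    if (dir == NULL)
        return -1;
    while (readdir(dir) != NULL)
        n++;
    closedir(dir);
    return n;
}

/* Counts entries in batches. After each batch the position is saved with
 * telldir(), a few entries are read ahead, and the stream is moved back
 * with seekdir(). The saved offset is only ever used on the DIR * it came
 * from, which is the only use POSIX guarantees. */
long count_with_resume(const char *path, int batch, int lookahead)
{
    DIR *dir;
    long n = 0;
    int done = 0;

    if (batch <= 0 || lookahead < 0)
        return -1;
    dir = opendir(path);
    if (dir == NULL)
        return -1;

    while (!done) {
        for (int i = 0; i < batch; i++) {
            if (readdir(dir) == NULL) {
                done = 1;
                break;
            }
            n++;
        }
        if (done)
            break;

        long pos = telldir(dir);      /* remember where this batch ended */
        for (int i = 0; i < lookahead; i++) {
            if (readdir(dir) == NULL) /* read ahead; results are discarded */
                break;
        }
        seekdir(dir, pos);            /* resume on the same stream */
    }

    closedir(dir);
    return n;
}
```

Reopening the directory between the telldir() and seekdir() calls is the pattern the commit removes: it happens to work on Linux, but POSIX leaves the result unspecified.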
* heal: glfs-heal implementation  Pranith Kumar K  2014-10-15  1  -2/+0
| | | | | | | | | | | Thanks a lot to Niels for helping me to get build stuff right. Change-Id: I634f24d90cd856ceab3cc0c6e9a91003f443403e BUG: 1147462 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/6529 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>