cluster/ec: avoid delays in self-heal

Self-heal creates a thread per brick to sweep the index looking for files that need to be healed. These threads are started before the volume comes online, so they do nothing but wait for the next sweep, which happens once per minute.

When a replace brick command is executed, the new graph is loaded and all index sweeper threads are started. When all bricks have reported, a getxattr request is sent to the root directory of the volume. This causes a heal on it (because the new brick doesn't have good data) and marks its contents as pending to be healed. That healing is done by the index sweeper thread on the next round, one minute later.

This patch solves this problem by waking all index sweeper threads after a successful check on the root directory.

Additionally, the index sweeper thread scans the index directory sequentially, but after healing a directory entry, more index entries can be created that the current directory scan skips. The remaining entries are then processed on the next round, one minute later. The same can happen in the next round, so the heal runs in bursts and takes a long time to finish, especially on volumes with many directory levels.

This patch solves this problem by immediately restarting the index sweep if a directory has been healed.

Change-Id: I58d9ab6ef17b30f704dc322e1d3d53b904e5f30e
BUG: 1547662
Signed-off-by: Xavi Hernandez <jahernan@redhat.com>
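The effect of restarting the sweep can be illustrated with a small standalone model. This is only a sketch and not GlusterFS code: `entry_t`, `sweep_once`, `heal_all` and `init_chain` are hypothetical names, the index is modelled as a fixed array of slots, and healing a directory is assumed to queue exactly one child in an earlier slot, which is the case a sequential scan skips.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical index entry; not a GlusterFS structure. */
typedef struct {
    bool pending;  /* entry is waiting to be healed */
    bool is_dir;   /* healing it may queue a child entry */
    int  child;    /* slot queued when this directory is healed, or -1 */
} entry_t;

/* One sequential pass over the index. Returns true if a directory was
 * healed, meaning new entries may have appeared behind the cursor. */
static bool sweep_once(entry_t *idx, size_t n, int *healed)
{
    bool dir_healed = false;

    for (size_t i = 0; i < n; i++) {
        if (!idx[i].pending)
            continue;
        idx[i].pending = false;
        (*healed)++;
        if (idx[i].is_dir && idx[i].child >= 0) {
            idx[idx[i].child].pending = true;
            idx[i].child = -1;  /* queue each child only once */
            dir_healed = true;
        }
    }
    return dir_healed;
}

static bool any_pending(const entry_t *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (idx[i].pending)
            return true;
    return false;
}

/* Heals until the index is empty. Returns how many times the sweeper had
 * to sleep until the next periodic round (one minute each in the text). */
static int heal_all(entry_t *idx, size_t n, bool restart, int *healed)
{
    int waits = 0;

    for (;;) {
        bool dir_healed = sweep_once(idx, n, healed);

        if (!any_pending(idx, n))
            return waits;
        if (!(restart && dir_healed))
            waits++;  /* old behaviour: wait for the next round */
    }
}

/* Builds a chain of n nested directories: healing slot k queues slot k-1.
 * Only the last slot is pending at the start. */
static void init_chain(entry_t *idx, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        idx[i].pending = (i == n - 1);
        idx[i].is_dir = (i > 0);
        idx[i].child = (i > 0) ? (int)i - 1 : -1;
    }
}
```

In this model, a chain of four nested entries needs three waits without the restart and none with it, which mirrors the burst behaviour described above for volumes with many directory levels.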
author: Xavi Hernandez <jahernan@redhat.com> 2018-02-21 17:47:37 +0100
committer: Amar Tumballi <amarts@redhat.com> 2018-03-14 03:12:27 +0000
commit: 7f81067f4522f973e98aa5abbb4d2028da2a2e6f (patch)
tree: cdcfa501a27a2c5bc472495e27b2d37d35fdc068 /xlators/cluster/ec/src/ec-heal.c
parent: fe52fc33d00eafe7d52ffff1b2dab846374f1d4a (diff)
1 files changed, 9 insertions, 0 deletions
diff --git a/xlators/cluster/ec/src/ec-heal.c b/xlators/cluster/ec/src/ec-heal.c
index 6562adf9e24..d1e40607e33 100644
--- a/xlators/cluster/ec/src/ec-heal.c
+++ b/xlators/cluster/ec/src/ec-heal.c
@@ -25,6 +25,7 @@
 #include "ec-combine.h"
 #include "ec-method.h"
 #include "ec-fops.h"
+#include "ec-heald.h"
 
 #define alloca0(size) ({void *__ptr; __ptr = alloca(size); memset(__ptr, 0, size); __ptr; })
 #define EC_COUNT(array, max) ({int __i; int __res = 0; for (__i = 0; __i < max; __i++) if (array[__i]) __res++; __res; })
@@ -2769,6 +2770,14 @@ ec_replace_heal (ec_t *ec, inode_t *inode)
                 gf_msg_debug (ec->xl->name, 0,
                         "Heal failed for replace brick ret = %d", ret);
 
+        /* Once the root inode has been checked, it might have triggered a
+         * self-heal on it after a replace brick command or for some other
+         * reason. It can also happen that the volume already had damaged
+         * files in the index, even if the heal on the root directory failed.
+         * In both cases we need to wake all index healers to continue
+         * healing remaining entries that are marked as dirty. */
+        ec_shd_index_healer_wake(ec);
+
         loc_wipe (&loc);
         return ret;
 }