author Kotresh HR <khiremat@redhat.com> 2015-01-29 15:53:19 +0530
committer Vijay Bellur <vbellur@redhat.com> 2015-03-15 21:20:03 -0700
commit 7a9a66cc5fb7f06118fab1fc2ae1c43cfbb1178f
tree 11a1b53b1410c7bd9b9cf2424b2e75118bd86d18 /doc
parent 38e342ca4a2167720bea82d3cee7fca08baba666
tools: Finds missing files in gluster volume given backend brickpath
The tool finds the missing files in a geo-replication slave volume. It crawls the backend .glusterfs directory of the brickpath, which is passed as a parameter, and stats each entry on the slave volume mount to check whether the file is present. The mount used is an aux-gfid-mount, so no path conversion is required and the check is fast. The tool needs to be run on every node in the cluster, for each brickpath of the geo-rep master volume, to find the missing files on the slave volume. The tool is generic enough to be used outside a geo-replication context as well.

Most of the crawler code is leveraged from Avati's xfind and is modified to crawl only .glusterfs (https://github.com/avati/xsync).

Thanks Aravinda for scripts to convert gfid to path.

Change-Id: I84deaaaf638f7c571ff1319b67a3440fe27da810
BUG: 1187140
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/9503
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Diffstat (limited to 'doc')
-rw-r--r-- doc/tools/gfind_missing_files.md 67
1 files changed, 67 insertions, 0 deletions
diff --git a/doc/tools/gfind_missing_files.md b/doc/tools/gfind_missing_files.md
new file mode 100644
index 00000000000..47241be5ac6
--- /dev/null
+++ b/doc/tools/gfind_missing_files.md
@@ -0,0 +1,67 @@
+Introduction
+========
+The tool gfind_missing_files.sh can be used to find the missing files in a
+GlusterFS geo-replicated slave volume. The tool uses a multi-threaded crawler
+operating on the backend .glusterfs of a brickpath which is passed as one of
+the parameters to the tool. It does a stat on each entry in the slave volume
+mount to check for the presence of a file. The tool uses the aux-gfid-mount
+thereby avoiding path conversions and potentially saving time.
+
+This tool should be run on every node, once for each brickpath of the
+geo-replicated master volume, to find the missing files on the slave volume.
+
+The script gfind_missing_files.sh is a wrapper script that in turn uses the
+gcrawler binary to do the backend crawling. The script detects the gfids of
+the missing files and runs the gfid-to-path conversion script to list out the
+missing files with their full pathnames.
+
+Usage
+=====
+```sh
+$ bash gfind_missing_files.sh <BRICK_PATH> <SLAVE_HOST> <SLAVE_VOL> <OUTFILE>
+ BRICK_PATH - Full path of the brick
+ SLAVE_HOST - Hostname of a node serving the slave gluster volume
+ SLAVE_VOL - Name of the slave gluster volume
+ OUTFILE - Output file to which the gfids of the missing files are written
+```
+
+The gfid-to-path conversion uses a quicker algorithm for converting gfids to
+paths, so it is possible that in some cases some of the missing gfids are not
+converted to their respective paths.
+
+Example output (126733 missing files)
+====================================
+```sh
+$ ionice -c 2 -n 7 ./gfind_missing_files.sh /bricks/m3 acdc slave-vol ~/test_results/m3-4.txt
+Calling crawler...
+Crawl Complete.
+gfids of skipped files are available in the file /root/test_results/m3-4.txt
+Starting gfid to path conversion
+Path names of skipped files are available in the file /root/test_results/m3-4.txt_pathnames
+WARNING: Unable to convert some GFIDs to Paths, GFIDs logged to /root/test_results/m3-4.txt_gfids
+Use bash gfid_to_path.sh <brick-path> /root/test_results/m3-4.txt_gfids to convert those GFIDs to Path
+Total Missing File Count : 126733
+```
+When the warning about unconverted GFIDs appears, an additional step is needed
+to convert those gfids to paths. Run gfid_to_path.sh as shown below:
+```sh
+$ bash gfid_to_path.sh <BRICK_PATH> <GFID_FILE>
+ BRICK_PATH - Full path of the brick
+ GFID_FILE - The <OUTFILE>_gfids file produced by gfind_missing_files.sh
+```
+Things to keep in mind when running the tool
+============================================
+1. Running this tool results in a crawl of the backend filesystem at each
+ brick, which can be I/O intensive. To avoid any impact on ongoing I/O on
+ gluster volumes, we recommend running this tool in the best-effort I/O
+ scheduling class at a low priority.
+```sh
+$ ionice -c 2 -n 7 -p <pid of gfind_missing_files.sh>
+```
+
+2. We do not recommend interrupting the tool while it is running
+ (e.g. by pressing Ctrl+C). It is better to wait for the tool to finish
+ execution. If it is interrupted, manually unmount the slave volume.
+```sh
+$ umount <MOUNT_POINT>
+```
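
The crawl-and-stat check described in the introduction of gfind_missing_files.md can be illustrated with a minimal, sequential sketch. This is not the gcrawler implementation, which is a multi-threaded binary. The helper name `list_missing_gfids` is hypothetical. The sketch assumes that GFID entries sit at `.glusterfs/<xx>/<yy>/<gfid>` on the brick, and that the slave volume is already mounted with the aux-gfid-mount option so that each entry can be reached as `<mount>/.gfid/<gfid>`.

```sh
#!/bin/bash
# Sequential sketch of the crawl-and-stat idea; NOT the gcrawler code.
# list_missing_gfids is a hypothetical helper name used for illustration.
list_missing_gfids() {
    local brick_path=$1    # full path of the brick on the master node
    local slave_mount=$2   # slave volume mount point (assumed aux-gfid-mount)
    local entry gfid

    # Assumed layout: GFID entries live at .glusterfs/<xx>/<yy>/<gfid>.
    # The depth and name filters skip everything else under .glusterfs.
    find "$brick_path/.glusterfs" -mindepth 3 -maxdepth 3 \
         -name '????????-????-????-????-????????????' -print |
    while IFS= read -r entry; do
        gfid=${entry##*/}
        # Assumed access path on an aux-gfid-mount: <mount>/.gfid/<gfid>.
        # No gfid-to-path conversion is needed for this check.
        if ! stat "$slave_mount/.gfid/$gfid" >/dev/null 2>&1; then
            printf '%s\n' "$gfid"
        fi
    done
}
```

For example, `list_missing_gfids /bricks/m3 /mnt/slave > missing_gfids.txt` would write one gfid per line for every entry that cannot be stat-ed on the slave mount; `/mnt/slave` is an assumed mount point. A single-threaded loop like this is far slower than a multi-threaded crawler on large bricks, so it is useful mainly for understanding what the tool checks.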