path: root/xlators/features/bit-rot
Commit message | Author | Age | Files | Lines
* feature/bitrot: Ignore files with sticky bit set | Kotresh HR | 2016-07-22 | 1 | -0/+8

  Backport of http://review.gluster.org/14903

  The scrubber scrubs entries in the backend, including files with the sticky bit set. These might include linkfiles, which should be skipped. This patch adds a check to ignore linkfiles during scrub.

  Change-Id: Ic21367b37770d391326c55c659491a1e5a82335b
  BUG: 1359017
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  (cherry picked from commit 8c47b19fc057f08c47444ef557503e610c707128)
  Reviewed-on: http://review.gluster.org/14982
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* feature/bitrot: Fix scrub status with sharded volume | Kotresh HR | 2016-07-21 | 1 | -12/+26

  Backport of http://review.gluster.org/14927

  Bitrot scrubs each shard entry separately. Scrub statistics were counting each shard entry, which is incorrect. This patch skips the statistics count for sharded entries.

  Change-Id: I184c315a4bc7f2cccabc506eef083ee926ec26d3
  BUG: 1357973
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  (cherry picked from commit 1929141da34d36f537e9798e3618e0e3bdc61eb6)
  Reviewed-on: http://review.gluster.org/14958
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* features/bitrot: Move throttling code to libglusterfs | Kotresh HR | 2016-07-19 | 7 | -391/+29

  Backport of http://review.gluster.org/14846

  Since throttling is a separate feature by itself, move throttling code to libglusterfs.

  Change-Id: If9b99885ceb46e5b1865a4af18b2a2caecf59972
  BUG: 1357514
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  Reviewed-on: http://review.gluster.org/14944
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* features/bitrot: Option to set scrub interval to a minute | Kotresh HR | 2016-07-15 | 2 | -0/+8

  Backport of http://review.gluster.org/#/c/14836/

  Bitrot scrub-frequency supports "hourly|daily|weekly|biweekly|monthly". This is painful for testing, as the minimum scrub interval is an hour. Hence this patch introduces a scrub interval of a minute to ease testing. It is intentionally not exposed in the bitrot command help as it is only for testing.

  e.g.,
      gluster vol bitrot <volname> scrub-frequency minute

  Change-Id: I155a65298d3fad5ae9e529d9c7d4b0d25fa297c0
  BUG: 1354425
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  (cherry picked from commit 7df1174f7bed2a00631cf17201f5217a053afeb1)
  Reviewed-on: http://review.gluster.org/14887
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* feature/bitrot: Show whether scrub is in progress/idle | Kotresh HR | 2016-07-15 | 3 | -13/+18

  Backport of http://review.gluster.org/14864/

  Bitrot scrub status shows whether the scrub is paused or active. It doesn't show whether the scrubber is actually scrubbing or waiting in the timer wheel for the next schedule. This patch shows this status with "In Progress" and "Idle" respectively.

  Change-Id: I995d8553d1ff166503ae1e7b46282fc3ba961f0b
  BUG: 1355635
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  (cherry picked from commit f4757d256e3e00132ef204c01ed61f78f705ad6b)
  Reviewed-on: http://review.gluster.org/14900
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* features/bitrot: Introduce scrubber monitor thread | Kotresh HR | 2016-05-03 | 10 | -324/+697

  The patch makes the following changes.

  1. Introduce scrubber monitor thread.
  2. Move scrub status related APIs to a separate file and make them part of the libbitrot library.

  Problem:
  Earlier, each child of the scrubber was maintaining the state machine, and hence there was no way to track the start and end time of scrubbing, as each brick has its own start and end time. Also, each brick was maintaining its own timer wheel instance. It was also not possible to get the scrubbed files count per session, as we could not identify the last child to finish scrubbing in order to reset the count to zero.

  Solution:
  Introduce scrubber monitor thread. It does the following.

  1. Maintains the scrubber state machine. Earlier each child had its own state machine. Now, only the monitor maintains it on behalf of all its children.
  2. Maintains the timer wheel instance. Earlier each child had its own timer wheel instance. Now, only the monitor maintains it on behalf of all its children.

  As a result, we can track the scrub statistics easily and correctly.

  Backport of:
  >Change-Id: Ic6e34ffa57984bd7a5ee81f4e263342bc1d9b302
  >BUG: 1329211
  >Signed-off-by: Kotresh HR <khiremat@redhat.com>
  >Reviewed-on: http://review.gluster.org/14044
  >Smoke: Gluster Build System <jenkins@build.gluster.com>
  >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  >CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  >Reviewed-by: Venky Shankar <vshankar@redhat.com>

  Backport of:
  >http://review.gluster.org/#/c/14146
  >BUG: 1332134

  NOTE: The patch #14146 fixes a compilation warning not detected in the master branch and detected only in the 3.7 branch. Since the compilation warning is introduced by patch #14044, the above two backports are made into this single patch.

  Change-Id: I1da7a3ec673a36ae0f59dc33ac5992c74fd7a19b
  BUG: 1332072
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  Reviewed-on: http://review.gluster.org/14140
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* features/bit-rot-stub: get frame->local before unwinding | Raghavendra Bhat | 2016-03-09 | 1 | -3/+5

  In bit-rot-stub, if unlink failed, it was unwinding directly and then trying to clean up local. But local would be NULL, since it was unwinding directly without getting the value of frame->local. The cleanup of the NULL local was causing the brick process to crash.

  Change-Id: I8544ba73b2e8dc0c50b1a53ff8027d85588d087b
  BUG: 1315552
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/13630
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
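The ordering issue described in the commit above (fetch the per-call state from the frame before unwinding, then clean it up) can be illustrated with a small stand-alone sketch. All names here (`demo_frame`, `demo_local`, `demo_unwind`, `demo_unlink_cbk`) are hypothetical stand-ins and not the GlusterFS types or macros; the assumption is simply that unwinding releases the frame, so `frame->local` must be read first on every path, including the failure path.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the per-call state and the call frame. */
struct demo_local { int unused; };
struct demo_frame { struct demo_local *local; };

int demo_cleanup_calls = 0;

/* Releases the per-call state; counts calls so the sketch can be checked. */
void
demo_local_cleanup (struct demo_local *local)
{
        free (local);
        demo_cleanup_calls++;
}

/* Stand-in for unwinding: the frame must not be touched afterwards. */
void
demo_unwind (struct demo_frame *frame)
{
        free (frame);
}

int
demo_unlink_cbk (struct demo_frame *frame, int op_ret)
{
        /* Fetch local first, whether the operation succeeded or failed. */
        struct demo_local *local = frame->local;

        frame->local = NULL;
        demo_unwind (frame);

        if (local != NULL)
                demo_local_cleanup (local);
        return op_ret;
}
```

The point of the sketch is only the order of operations: the pointer is captured before the unwind, so the later cleanup never sees an unset `local`.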
* features/bitrot: do not remove the quarantine handle in forget | Raghavendra Bhat | 2016-03-07 | 1 | -8/+96

  If an object is marked as bad, then an entry corresponding to the bad object is created in the .glusterfs/quarantine directory to help scrub status. The entry name is the gfid of the corrupted object. The quarantine handle is removed in the below 2 cases.

  1) When protocol/server receives a -ve lookup on an entry whose inode is in the inode table (this can happen when the corrupted object is deleted directly from the backend for recovery purposes), it sends a forget on the inode, and bit-rot-stub removes the quarantine handle upon getting the forget (refer to commit f853ed9c61bf65cb39f859470a8ffe8973818868: http://review.gluster.org/12743).

  2) When bit-rot-stub itself realizes that lookup on a corrupted object has failed with ENOENT.

  But with case 1, there is a problem when bit-rot-stub receives a forget because the lru limit of the inode table was exceeded. In such cases, though the corrupted object is not deleted (either from the mount point or from the backend), the handle in the quarantine directory is removed and the object is not shown in the bad objects list in the scrub status command.

  So it is better to follow only case 2 (i.e. bit-rot-stub removing the handle from the quarantine directory on -ve lookups). Also, the handle has to be removed when a corrupted object is unlinked from the mount point itself.

  Change-Id: Ibc3bbaf4bc8a5f8986085e87b729ab912cbf8cf9
  BUG: 1313131
  Original author: Raghavendra Bhat <raghavendra@redhat.com>
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  Reviewed-on: http://review.gluster.org/13472
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
  (cherry picked from commit 2102010edab355ac9882eea41a46edaca8b9d02c)
  Reviewed-on: http://review.gluster.org/13552
  Tested-by: Venky Shankar <vshankar@redhat.com>
* features / bitrot: Prevent spurious pthread_cond_wait() wakeup | Venky Shankar | 2016-01-28 | 1 | -1/+1

  Backport of http://review.gluster.org/13302

  pthread_cond_wait() is prone to spurious wakeups and it is essential to check a boolean predicate for thread continuation. See man(3) pthread_cond_wait() for details.

  The following is done in bitrot scrubber:

      if (list_empty (&fsscrub->scrublist))
              pthread_cond_wait (&fsscrub->cond, &fsscrub->mutex);

  followed by:

      list_first_entry (&fsscrub->scrublist, ...)

  A spurious wakeup from pthread_cond_wait(), without a re-check of list_empty(), causes list_first_entry() to return garbage.

  BUG: 1302199
  Change-Id: I60151eabb8af257a35acd8e7c117876388166a0e
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/13307
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
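The usual remedy for the problem described in the commit above is to re-check the predicate in a loop around pthread_cond_wait(). The sketch below shows that pattern on a small singly linked work list; `work_list`, `work_item` and the function names are hypothetical and only stand in for the scrubber's own list and structures, which are not shown in the commit message.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical work item and list; not the scrubber's real structures. */
struct work_item {
        struct work_item *next;
        int               id;
};

struct work_list {
        pthread_mutex_t   mutex;
        pthread_cond_t    cond;
        struct work_item *head;
};

void
work_list_init (struct work_list *wl)
{
        pthread_mutex_init (&wl->mutex, NULL);
        pthread_cond_init (&wl->cond, NULL);
        wl->head = NULL;
}

void
work_list_push (struct work_list *wl, struct work_item *item)
{
        pthread_mutex_lock (&wl->mutex);
        item->next = wl->head;
        wl->head = item;
        pthread_cond_signal (&wl->cond);
        pthread_mutex_unlock (&wl->mutex);
}

struct work_item *
work_list_pop_wait (struct work_list *wl)
{
        struct work_item *item = NULL;

        pthread_mutex_lock (&wl->mutex);
        /* 'while', not 'if': a spurious wakeup re-checks the predicate
         * and goes back to waiting instead of reading an empty list. */
        while (wl->head == NULL)
                pthread_cond_wait (&wl->cond, &wl->mutex);
        assert (wl->head != NULL);
        item = wl->head;
        wl->head = item->next;
        pthread_mutex_unlock (&wl->mutex);
        return item;
}
```

With an `if` in place of the `while`, a wakeup that arrives while the list is still empty would fall through to the dequeue and dereference a NULL head in this sketch, which mirrors the garbage returned by list_first_entry() in the commit's description.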
* features/bitrot: Fail node-uuid getxattr if file is marked bad | Kotresh HR | 2016-01-27 | 1 | -0/+22

  If xattr is node-uuid and the inode is marked bad, fail getxattr and fgetxattr with EIO. Returning EIO causes AFR to choose the correct node-uuid corresponding to the subvolume where the good copy of the file resides.

  BUG: 1296795
  Change-Id: I3f8dc807794f9a82867807e7c4c73ded6c64fd8a
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  Reviewed-on: http://review.gluster.org/13116
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/13194
  Tested-by: Venky Shankar <vshankar@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* features/bitrot: add check for corrupted object in f{stat} | Venky Shankar | 2016-01-26 | 1 | -32/+31

  Backport of http://review.gluster.org/13120

  The check for corrupted objects is done by the bitrot stub component for data operations, and such fops are denied processing by returning EIO. These checks were not done for operations such as get/set extended attribute, stat and the like. IOW, stub only blocked pure data operations.

  However, it is necessary to have these checks for certain other fops, most importantly stat (and fstat). This is because clients could possibly get stale stat information (such as size, {a,c,m}time), resulting in incorrect operation of applications that rely on these fields. Note that replication would take care of fetching good (and correct) data, but the staleness of stat information could lead to data inconsistencies (e.g., rebalance, tier).

  Change-Id: I5a22780373b182a13f8d2c4ca6b7d9aa0ffbfca3
  BUG: 1297213
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/13276
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* afr: handle bad objects during lookup/inode_refresh | Ravishankar N | 2016-01-21 | 1 | -0/+57

  Backport of http://review.gluster.org/12955, http://review.gluster.org/#/c/13077/ and http://review.gluster.org/#/c/13185/

  If an object (file) is marked bad by bitrot, do not consider the brick on which the object is present as a potential read subvolume for AFR irrespective of the pending xattr values.

  Also do not consider the brick containing the bad object while performing afr_accuse_smallfiles(). Otherwise, if the bad object's size is bigger, we may end up considering that as the source.

  Change-Id: I4abc68e51e5c43c5adfa56e1c00b46db22c88cf7
  BUG: 1293300
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
  Reviewed-on: http://review.gluster.org/13041
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* all: reduce "inline" usage | Kaleb S KEITHLEY | 2016-01-18 | 4 | -54/+44

  There are three kinds of inline functions: plain inline, extern inline, and static inline. All three have been removed from .c files, except those in "contrib" which aren't our problem. Inlines in .h files, which are overwhelmingly "static inline" already, have generally been left alone. Over time we should be able to "lower" these into .c files, but that has to be done in a case-by-case fashion requiring more manual effort. This part was easy to do automatically without (as far as I can tell) any ill effect.

  In the process, several pieces of dead code were flagged by the compiler, and were removed.

  backport of Change-Id: I56a5e614735c9e0a6ee420dab949eac22e25c155, http://review.gluster.org/11769, BUG: 1245331

  Change-Id: Iba1efb0bc578ea4a5e9bf76b7bd93dc1be9eba44
  BUG: 1283302
  Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
  Reviewed-on: http://review.gluster.org/12646
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* features/bit-rot-stub: delete the link for bad object in quarantine directory | Raghavendra Bhat | 2016-01-09 | 4 | -2/+95

  When the bad object is deleted (as of now manually from the backend itself), along with its gfid handle, the entry for the bad object in the quarantine directory is left as it is (it can also be removed manually though). But the next lookup of the object, upon not finding it in the backend, sends forget on the in-memory inode. If the stale link for the gfid still exists in the quarantine directory, bit-rot-stub will unlink the entry in its forget or in the next failed lookup on that object with errno being ENOENT.

  Change-Id: If84292d3e44707dfa11fa29023b3d9f691b8f0f3
  BUG: 1293584
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/12743
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
  (cherry picked from commit f853ed9c61bf65cb39f859470a8ffe8973818868)
  Reviewed-on: http://review.gluster.org/13032
* bitrot: getting correct value of scrub stat's | Gaurav Kumar Garg | 2015-12-17 | 3 | -20/+187

  This patch is a backport of: http://review.gluster.org/#/c/12776/

  When the user executes the bitrot scrub status command, gluster does not give correct values for Number of Scrubbed files, Number of Unsigned files, Last completed scrub time, and Duration of last scrub.

  With this patch, scrub status gives correct values for all the above fields.

  >> Change-Id: Ic966f76d22db5b0c889e6386a1c2219afbda1f49
  >> BUG: 1285989
  >> Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
  >> Signed-off-by: Kotresh HR <khiremat@redhat.com>
  >> Reviewed-on: http://review.gluster.org/12776
  >> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  >> Tested-by: Gluster Build System <jenkins@build.gluster.com>
  >> Reviewed-by: Venky Shankar <vshankar@redhat.com>

  Change-Id: Ic966f76d22db5b0c889e6386a1c2219afbda1f49
  BUG: 1291546
  Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
  (cherry picked from commit 22827d51c232c44a8f5ac003529d907d93baf7b0)
  Change-Id: Icef24cce35c8d54ffdfa5282491338318e78780b
  Reviewed-on: http://review.gluster.org/12966
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* features/bit-rot: Fix NULL dereference | Pranith Kumar K | 2015-11-26 | 1 | -4/+10

  Backport of http://review.gluster.org/12754

  Problem:
  By the time br_stub_worker is accessing this->private in its thread, 'init' may not have set 'this->private = priv'. This leads to a NULL dereference and a brick crash.

  Fix:
  Set this->private before launching these threads.

  BUG: 1285758
  Change-Id: I8a9234c4f96b0e5ea78f5b336369ec41f5a120ef
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/12764
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* glusterd/bitrot : Integration of bad files from bitd with scrub status command | Gaurav Kumar Garg | 2015-11-23 | 2 | -10/+21

  This patch is a backport of: http://review.gluster.org/#/c/12720/

  Currently the scrub status command does not display the list of all the bad files. All the bad files are available in the bitd daemon. With this patch, the scrub status command displays the list of all the bad files.

  >> Change-Id: If09babafaf5d7cf158fa79119abbf5b986027748
  >> BUG: 1207627
  >> Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>

  Change-Id: If09babafaf5d7cf158fa79119abbf5b986027748
  BUG: 1283881
  Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
  Reviewed-on: http://review.gluster.org/12725
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* features/bit-rot: scrubber changes for getting the list of bad objects from stub | Raghavendra Bhat | 2015-11-23 | 3 | -2/+297

  Backport of http://review.gluster.org/12654

  > Change-Id: I62885e4aba4a9b345db3c78c3291d563ff3d3567
  > BUG: 1207627
  > Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  > Reviewed-on: http://review.gluster.org/12654
  > Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  > Tested-by: Gluster Build System <jenkins@build.gluster.com>
  > Reviewed-by: Venky Shankar <vshankar@redhat.com>
  > Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>

  Change-Id: I8e1f04f3f730cbd90bdf3cdc7b2149d0de53ea37
  BUG: 1283881
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/12716
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* features/bit-rot: stub changes for showing bad objects in the status | Raghavendra Bhat | 2015-11-23 | 6 | -91/+955

  Backport of http://review.gluster.org/12503

  > Change-Id: If905132f6f1df4aebd9ab255e1e8c59902f84fe5
  > BUG: 1207627
  > Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  > Reviewed-on: http://review.gluster.org/12503
  > Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  > Tested-by: Gluster Build System <jenkins@build.gluster.com>
  > Reviewed-by: Venky Shankar <vshankar@redhat.com>
  > Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>

  Change-Id: I310b71c215913c590b2747e53eea00c2261e975c
  BUG: 1283881
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/12715
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* glusterd: cli command implementation for bitrot scrub status | Gaurav Kumar Garg | 2015-11-22 | 1 | -0/+18

  This patch is a backport of: http://review.gluster.org/10231

  The CLI command for bitrot scrub status will be:

      gluster volume bitrot <volname> scrub status

  The above command shows the statistics of the bitrot scrubber. Upon execution, it shows some common scrubber tunable values of volume <VOLNAME>, followed by the scrubber statistics of individual nodes.

  Sample output for a single node:

      Volume name : <VOLNAME>
      State of scrub: Active
      Scrub frequency: biweekly
      Bitrot error log location: /var/log/glusterfs/bitd.log
      Scrubber error log location: /var/log/glusterfs/scrub.log
      =========================================================
      Node name:
      Number of Scrubbed files:
      Number of Unsigned files:
      Last completed scrub time:
      Duration of last scrub:
      Error count:
      =========================================================

  This is just infrastructure. The list of bad files, last scrub time, and error count values will be taken care of by the http://review.gluster.org/#/c/12503/ and http://review.gluster.org/#/c/12654/ patches.

  >> Change-Id: I3ed3c7057c9d0c894233f4079a7f185d90c202d1
  >> BUG: 1207627
  >> Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
  >> Reviewed-on: http://review.gluster.org/10231
  >> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
  >> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  >> Tested-by: Gluster Build System <jenkins@build.gluster.com>

  Change-Id: I45ed94e5e0e78a1e007c30eb0b252f74cf3c9187
  BUG: 1283881
  Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
  Reviewed-on: http://review.gluster.org/12704
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* features/bitrot: Fix scrubber frequency set | Kotresh HR | 2015-08-27 | 2 | -5/+22

  When bitrot is configured on multiple volumes in a cluster and scrubber-frequency is changed for one volume, the frequency of every other volume is reset with respect to its own scrubber-frequency. This should not happen. Changing scrubber-frequency should affect only the volume on which it is set. This patch fixes the issue. It also restricts the logs to the configured volume.

  BUG: 1256669
  Change-Id: I6eba385b50b3bdc86bc8f4ef295a004b3b87b68a
  Reviewed-on: http://review.gluster.org/11897
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  Reviewed-on: http://review.gluster.org/12010
* features/bitrot: Fix rescheduling scrub-frequency | Kotresh HR | 2015-08-23 | 1 | -20/+11

  While rescheduling scrub frequency, the boot time of the brick was considered where it is not required. Also, the delta was calculated using unsigned int, resulting in the loss of the fractional part and leading to a wrong scrub frequency. Boot time is completely removed and the delta calculation is simplified.

  BUG: 1253160
  Change-Id: I98dd1fa99304c6d91c0a330dfca7fef57a770397
  Reviewed-on: http://review.gluster.org/11853
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  Reviewed-on: http://review.gluster.org/11904
* bitrot: Scrubber log should mark bad file as a ALERT in the scrubber log | Gaurav Kumar Garg | 2015-08-21 | 1 | -2/+2

  If a bad file is detected by the scrubber, then the scrubber should log that bad file as an ALERT message in the scrubber log.

  Change-Id: I410429e78fd3768655230ac028fa66f7fc24b938
  BUG: 1255605
  Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
  Reviewed-on: http://review.gluster.org/11965
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  (cherry picked from commit 6cb73b4fe798b7bf3aface0aac2a4e6c7c618c0e)
  Reviewed-on: http://review.gluster.org/11974
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/bit-rot-stub: fail the fop if inode context get fails | Raghavendra Bhat | 2015-08-21 | 2 | -23/+96

  Backport of http://review.gluster.org/11449

  In stub, for fops like readv, writev etc, if the object is bad, then the fop is denied. But to check whether the object is bad, the inode context has to be checked. Before this patch, if the inode context was not there, the fop was allowed to continue. This patch fixes it: the fop is unwound with an error if the inode context is not found.

  Change-Id: I0dcbf80889427d4c0404e00bc6c773f6fe8fc8db
  BUG: 1255351
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/11966
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Kotresh HR <khiremat@redhat.com>
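The "fail closed" behaviour described in the commit above can be sketched as a single decision function. The names `demo_inode_ctx` and `demo_check_data_fop` are hypothetical, and the error code used for a missing context (EINVAL) is an assumption made for this illustration; the commit message only says the fop is unwound with an error. EIO for bad objects matches what other commits in this log describe.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-inode state; not the stub's real context structure. */
struct demo_inode_ctx {
        bool bad_object;
};

/* Returns 0 when a data fop may proceed, or a negative errno to deny it.
 * A missing context is treated as a failure rather than as "not bad". */
int
demo_check_data_fop (const struct demo_inode_ctx *ctx)
{
        if (ctx == NULL)
                return -EINVAL;  /* assumed error code for a failed lookup */
        if (ctx->bad_object)
                return -EIO;     /* bad objects are denied with EIO */
        return 0;
}
```

The design choice illustrated here is that an unknown state is never treated as a healthy one: when the context cannot be fetched, the operation is refused instead of being let through.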
* features/bit-rot-stub: handle REOPEN_WAIT on forgotten inodes | Raghavendra Bhat | 2015-08-12 | 1 | -1/+43

  Backport of http://review.gluster.org/11729

  Change-Id: I4d0143e72afdc9bd2cd2c4df7a33a6ecc07328f2
  BUG: 1247551
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/11773
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* features/bitrot: move inode state just at the last moment | Venky Shankar | 2015-07-23 | 1 | -19/+42

  Backport of http://review.gluster.org/11461

  Previously this was done at half the set expiry time, resulting in actual IOs incrementing the object version. Now it is done just at the last moment, with re-notification short-cutting into checksum calculation without waiting in the timer-wheel.

  BUG: 1242718
  Change-Id: If655b77d822ebf7b2a4f65e1b5583dd3609306e7
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/11653
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* features/bit-rot-stub: do not allow setxattr and removexattr on bit-rot xattrs | Raghavendra Bhat | 2015-07-14 | 2 | -7/+106

  Backport of http://review.gluster.org/11389

  * setxattr and {f}removexattr of versioning, signature and bad-file xattrs are returned with error.

  Change-Id: I8fe5f973d6e410bec2758959d20d379189808d5e
  BUG: 1241529
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/11604
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* features/bit-rot-stub: deny access to bad objects | Raghavendra Bhat | 2015-07-14 | 5 | -32/+422

  Backport of http://review.gluster.org/11126

  * Access to bad objects (especially operations such as open, readv, writev) should be denied to prevent applications from getting wrong data.
  * Do not allow anyone apart from scrubber to set bad object xattr.
  * Do not allow bad object xattr to be removed.

  Change-Id: I6903184ab64a9d1ea595330b603935979c33bc26
  BUG: 1241529
  Reviewed-on: http://review.gluster.org/11603
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* features/bitrot: convert pending gf_log() to gf_msg() | Venky Shankar | 2015-07-09 | 4 | -30/+73

  Backport of http://review.gluster.org/11396

  Change-Id: Idfd245327b485459ccbda503510b8ca0127bb66c
  BUG: 1226666
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/11542
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* features/bitrot: handle scrub states via state machine | Venky Shankar | 2015-07-09 | 7 | -52/+331

  Backport of http://review.gluster.org/11149

  The scrubber has a bunch of command line options, which prompted the use of a state machine to track the current state of the scrubber under the various circumstances where those options could be in effect.

  Change-Id: Id614bb2e6af30a90d2391ea31ae0a3edeb4e0d69
  BUG: 1226666
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/11541
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/bitrot: cleanup, v2 | Venky Shankar | 2015-07-09 | 3 | -38/+187

  Backport of http://review.gluster.org/11148

  This patch uses the "cleanup, v1" infrastructure to clean up the scrubber (data structures, threads, timers, etc.) on brick disconnection. The signer is not cleaned up yet: this would probably be done as part of another patch.

  Change-Id: I78a92b8a7f02b2f39078aa9a5a6b101fc499fd70
  BUG: 1226666
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/11540
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/bitrot: cleanup, v1 | Venky Shankar | 2015-07-09 | 3 | -126/+271

  Backport of http://review.gluster.org/11147

  This is a short series of patches (with other cleanups) aimed at cleaning up some of the incorrect assumptions made in reconfigure(), leading to crashes when subvolumes are not fully initialized (as reported here[1] on gluster-devel@).

  Furthermore, there is some amount of code cleanup to handle disconnection and clean up data structures (as part of a subsequent patch).

  [1] http://www.gluster.org/pipermail/gluster-devel/2015-June/045410.html

  Change-Id: I68ac4bccfbac4bf02fcc31615bd7d2d191021132
  BUG: 1226830
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/11539
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* bit-rot : New logging framework for bit-rot log message | Mohamed Ashiq | 2015-07-01 | 8 | -201/+815

  Backport of http://review.gluster.org/10297

  Cherry picked from 2f0d36d16c241365760aaa6d857b7a4d438e1042
  >Change-Id: I83c494f2bb60d29495cd643659774d430325af0a
  >BUG: 1194640
  >Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
  >Reviewed-on: http://review.gluster.org/10297
  >Tested-by: Venky Shankar <vshankar@redhat.com>
  >Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  >Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
  >Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  >Reviewed-by: Venky Shankar <vshankar@redhat.com>

  Change-Id: I83c494f2bb60d29495cd643659774d430325af0a
  BUG: 1217722
  Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
  Reviewed-on: http://review.gluster.org/11379
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* features/bitrot: log scrub frequency & throttle values (Venky Shankar, 2015-06-26, 1 file, -0/+28)
  Backport of http://review.gluster.org/11190

  Change-Id: I56d5236c37a413046b5766320184047a908f2c8d
  BUG: 1231024
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/11397
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/bit-rot: check for both in-memory and on-disk staleness (Raghavendra Bhat, 2015-06-26, 2 files, -19/+138)
  Backport of http://review.gluster.org/10947

  * Let the bit-rot stub check the on-disk ongoing version and signed version xattrs as well as the in-memory flags in the inode, and then decide whether the inode is stale or not. This information is used by the one-shot crawler in BitD to decide whether to trigger signing for the object or skip it.

    NOTE: The above check should be done only for BitD. For the scrubber it's still the old way of comparing the on-disk ongoing version with the signed version.

  * BitD's one-shot crawler should not sign zero-byte objects if they do not contain a signature (which means the object was just created and nothing was written to it).

  Change-Id: I580b45b85f62fc075616ee3da9c15a3c8335d7a8
  BUG: 1232199
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/11249
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* features/bitrot: fix fd leak in truncate (stub) (Venky Shankar, 2015-06-19, 1 file, -3/+8)
  Backport of http://review.gluster.org/#/c/11077

  The need to perform object versioning in the truncate() code path required an fd in order to reuse the existing versioning infrastructure that's used by fd-based operations (such as writev(), ftruncate(), etc.). This tempted the use of an anonymous fd which was never unref()'d after use, resulting in an fd and/or memory leak depending on the code path taken.

  Versioning resulted in a dangling file descriptor left open in the filesystem, affecting the signing process of a given object (no release() would be triggered, hence no signing would be performed). On the other hand, in cases where the object need not be versioned, the anonymous fd is still ref()'d, resulting in a memory leak (NOTE: there's no "dangling" file descriptor in this case).

  Change-Id: I29c3d2af9bbc5cd4b8ddf38954080e3c7a44ba61
  BUG: 1232179
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/11300
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* features/bitrot: tunable object signing waiting time value for bitrot (Gaurav Kumar Garg, 2015-06-17, 2 files, -8/+31)
  Currently bitrot uses a 120-second waiting time before an object is signed after all fops are released. This signing waiting time value should be tunable.

  The command for changing the signing waiting time will be:

  # gluster volume bitrot <VOLNAME> signing-time <waiting time value in second>

  Change-Id: I89f3121564c1bbd0825f60aae6147413a2fbd798
  BUG: 1231832
  Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/11105
  (cherry picked from commit 554fa0c1315d0b4b78ba35a2d332d7ac0fd07d48)
  Reviewed-on: http://review.gluster.org/11235
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/bit-rot-stub: implement mknod fop (Raghavendra Bhat, 2015-05-31, 1 file, -0/+51)
  Backport of http://review.gluster.org/10790

  In the absence of a mknod() fop implementation in the bitrot stub, further operations that trigger versioning resulted in crashes, as they expect the inode context to be valid. Therefore, this patch implements mknod() following semantics similar to fops such as create().

  Furthermore, the bitrot stub test C program is fixed to stop lying and to validate object versions according to the versioning protocol.

  Change-Id: If76f252577445d1851d6c13c7e969e864e2183ef
  BUG: 1226139
  Original-Author: Raghavendra Bhat <raghavendra@redhat.com>
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/10987
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/bitrot: serialize versioning (Venky Shankar, 2015-05-31, 3 files, -32/+174)
  Backport of http://review.gluster.org/10832

  The current signing interface (fsetxattr()) had a couple of issues:

  One, a signing request (by the bitrot daemon) is denied if the version against which an object is to be signed is unequal to the current version of the object (cases where another subsequent modification increments the version). Such requests are rejected with EINVAL sent back to the signer, resulting in a bunch of errors (in logs) reported by the bitrot daemon. Although the object would eventually be signed with the version matching the current version, the "lagging" request should be correctly handled.

  Two, more than one signing request could race against each other, with the object getting signed with a version depending on which request ended up last in the race. Although harmless to some extent, such a case could end up marking the object's signature as stale forever (if the object is *never* touched), thereby resulting in the scrubber skipping the object during verification.

  This patch fixes these issues by ordering signing requests and fixing the version comparison checks at the time of signing.

  Change-Id: I9fa83dfa3be664ba4db61d7f2edc408f4bde77dd
  BUG: 1224650
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-on: http://review.gluster.org/10900
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
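To illustrate the two issues described in the commit above, here is a minimal Python sketch of ordered signing with a version check. It is not the stub's code: the class `SignedObject`, its fields, and the three return values are hypothetical, and the handling of a lagging request (accepted, but left marked stale) is only an assumption about what "correctly handled" could mean.

```python
import threading


class SignedObject:
    """Toy model of an object with an ongoing version and a signed version."""

    def __init__(self):
        self.ongoing_version = 1   # bumped on each modification
        self.signed_version = 0    # version the stored signature refers to
        self.signature = None
        self._lock = threading.Lock()  # orders concurrent signing requests

    def modify(self):
        with self._lock:
            self.ongoing_version += 1

    def sign(self, version, signature):
        """Apply a signing request; return 'signed', 'stale' or 'ignored'."""
        with self._lock:
            if version > self.ongoing_version:
                raise ValueError("cannot sign a version that does not exist yet")
            if version <= self.signed_version:
                # An older (or duplicate) request lost the race:
                # keep the newer signature instead of overwriting it.
                return "ignored"
            self.signed_version = version
            self.signature = signature
            # Assumption: a lagging request is accepted without an error,
            # but the signature stays stale until the current version is signed.
            return "signed" if version == self.ongoing_version else "stale"

    def is_stale(self):
        with self._lock:
            return self.signed_version != self.ongoing_version
```

With this ordering, a late request for an old version can no longer replace a signature for a newer one, which is the race the commit message describes.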
* features/bitrot: refactor brick connection logic (Raghavendra Bhat, 2015-05-30, 2 files, -63/+68)
  Backport of http://review.gluster.org/10763

  Brick connection was bloated (and not implemented efficiently) with calls which did not need to be made under lock. This resulted in lock starvation for critical code paths. Eventually this did not scale as the number of bricks per volume increased (add-brick and the like).

  Also, this patch cleans up some of the weird reconnection logic that added to the starvation of resources, and stops the uncontrolled growth of log files.

  Change-Id: I05e737f2a9742944a4a543327d167de2489236a4
  BUG: 1226146
  Original-Author: Raghavendra Bhat <raghavendra@redhat.com>
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/10986
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* features/bitrot: reimplement scrubbing frequency (Venky Shankar, 2015-05-30, 5 files, -180/+302)
  This patch reimplements the existing scrub-frequency mechanism used to schedule scrubber runs. The existing mechanism uses periodic sleeps (waking up periodically at minimum granularity) and performs a number of tracking checks based on counters and sleep times. This patch does away with all the nifty counters and uses the timer wheel to schedule scrub runs. Scheduling changes are performed by merely calculating the new expiry time and calling mod_timer() [mod_timer_pending() in some cases], making the code more debuggable and easier to follow.

  This also introduces an "hourly" scrubbing tunable as an aid for testing scrubbing during the development/testing cycle.

  One could also implement on-demand scrubbing with ease: by invoking mod_timer() with an expiry of one (1) second, thereby scheduling a scrub run the very next second.

  Change-Id: I6c7c5f0c6c9f886bf574d88c04cde14b76e60a8b
  BUG: 1224647
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/10902
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
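The scheduling idea in the commit above (compute a new expiry, then re-arm one timer) can be sketched roughly as follows. This is a Python stand-in, not the timer-wheel code: `ScrubTimer`, `SCRUB_INTERVALS`, and the helper functions are hypothetical names, and only "hourly" and "biweekly" are frequencies named in this log; the other table entries are assumptions. The method names mirror mod_timer() and mod_timer_pending() only in spirit.

```python
# Hypothetical frequency table (seconds). Only "hourly" and "biweekly"
# appear in this log; the remaining entries are illustrative assumptions.
SCRUB_INTERVALS = {
    "hourly": 60 * 60,
    "daily": 24 * 60 * 60,
    "weekly": 7 * 24 * 60 * 60,
    "biweekly": 14 * 24 * 60 * 60,
    "monthly": 30 * 24 * 60 * 60,
}


class ScrubTimer:
    """Stand-in for a timer-wheel entry: holds one absolute expiry time."""

    def __init__(self):
        self.expires = None  # None means the timer is not pending

    def mod_timer(self, expires):
        """(Re)arm the timer unconditionally."""
        self.expires = expires

    def mod_timer_pending(self, expires):
        """Re-arm only if the timer is already pending; report whether it was."""
        if self.expires is None:
            return False
        self.expires = expires
        return True


def schedule_scrub(timer, now, frequency):
    """Schedule the next scrub run one interval from now."""
    timer.mod_timer(now + SCRUB_INTERVALS[frequency])


def reschedule_scrub(timer, now, frequency):
    """Apply a frequency change without arming a timer that is not pending."""
    return timer.mod_timer_pending(now + SCRUB_INTERVALS[frequency])


def scrub_on_demand(timer, now):
    """On-demand scrub: expire one second from now."""
    timer.mod_timer(now + 1)
```

No counters or sleep bookkeeping are needed; a frequency change is a single expiry recalculation.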
* features/bitrot: stub improvements and fixes (Venky Shankar, 2015-05-30, 5 files, -426/+435)
  This patch refactors the signing trigger mechanism used by the bitrot daemon as a "catch up" mechanism to sign files which _missed_ signing on the last run, either due to bitrot being disabled and enabled again or because bitrot is enabled for a volume with existing data.

  The existing implementation relies on overloading writev() to trigger signing, which just by the looks sounded dangerous and I hated it to the core. This change moves all that business to the setxattr interface, thereby keeping the writev path strictly for client IO.

  Why not use the IPC fop to trigger signing? There's a need to access the object's inode to perform various maintenance operations. The inode is not _directly_ accessible in the IPC fop (although it can be found via inode_grep() for the object's GFID; the inode just needs to be pinned in memory, which is the case if there's an active fd on the inode). This patch relies on the good old technique of overloading fsetxattr() to do the job instead of using the IPC fop.

  There are some pretty nice cleanups along the lines of memory deallocations, unnecessary allocations and redundant ref()ing of structures (such as fds) provided by this patch. All in all - much improved code navigation.

  Change-Id: Id93fe90b1618802d1a95a5072517dac342b96cb8
  BUG: 1225709
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/10953
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* features/bit-rot-stub: versioning of objects in write/truncate fop instead of open (Raghavendra Bhat, 2015-05-10, 6 files, -339/+1101)
  * This patch brings in the changes where object versioning is done in the write and truncate fops instead of tracking them in the open and create fops. This model works for both regular and anonymous fds. It also removes the race associated with open calls, create and lookups.

  This patch follows the method below for object versioning and notifications:

  Before sending writev on the fd, increase the ongoing version first. This makes an anonymous fd write similar to a regular fd write by having the ongoing version increased before doing the write.

  Do the following steps for versioning:

  1) For anonymous fds, set the fd context (so that release is invoked) and add the fd context to the list maintained in the inode context. For regular fds the above would have been done in open itself.
  2) Increase the on-disk ongoing version.
  3) Increase the in-memory ongoing version and mark the inode as non-dirty.
  4) Once versioning is successfully done, send the write operation. If versioning fails, then fail the write fop.
  5) In writev_cbk, mark the inode as modified.

  > Change-Id: I7104391bbe076d8fc49b68745d2ec29a6e92476c
  > BUG: 1207979
  > Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  > Reviewed-on: http://review.gluster.org/10233
  > Tested-by: Gluster Build System <jenkins@build.gluster.com>
  > Reviewed-by: Vijay Bellur <vbellur@redhat.com>

  Change-Id: I4bb86989b5fab02b9ed2950798b1a80e566f1024
  BUG: 1220041
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/10722
  Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
  Tested-by: NetBSD Build System
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
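The five versioning steps in the commit above can be walked through in a small Python model. This is a simplified sketch, not the stub's implementation: `Inode`, `versioned_write`, `VersioningError`, and the `disk_ok` switch used to simulate a failed on-disk bump are all hypothetical, and every call bumps the version because the message does not say when a bump may be skipped.

```python
class VersioningError(Exception):
    """Raised when the on-disk version bump fails; the write is not sent."""


class Inode:
    def __init__(self):
        self.disk_version = 1    # on-disk ongoing version (an xattr in practice)
        self.mem_version = 1     # in-memory ongoing version
        self.dirty = True
        self.modified = False
        self.fd_contexts = []    # fds whose release should be observed
        self.data = bytearray()


def versioned_write(inode, fd, payload, anonymous=False, disk_ok=True):
    """Version the object first, then perform the write."""
    # 1) Anonymous fds get a context here; regular fds got one at open time.
    if anonymous and fd not in inode.fd_contexts:
        inode.fd_contexts.append(fd)
    # 2) Bump the on-disk ongoing version first; failing here fails the write.
    if not disk_ok:
        raise VersioningError("on-disk version bump failed")
    inode.disk_version += 1
    # 3) Mirror the bump in memory and mark the inode non-dirty.
    inode.mem_version = inode.disk_version
    inode.dirty = False
    # 4) Only now is the write itself sent.
    inode.data += payload
    # 5) Write callback: record that the object was modified.
    inode.modified = True
    return len(payload)
```

Because the version is bumped before any data changes, a failed bump leaves both the data and the versions untouched.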
* features/bitrot: scrubber should crawl based on the scrubber frequency value (Gaurav Kumar Garg, 2015-05-10, 3 files, -5/+192)
  Currently the scrubber crawls all the files continuously. It should crawl files based on the scrubber frequency which the user has set. By default the scrubber crawling frequency will be biweekly.

  Change-Id: I5762a92c1e700134cfe4283d1f631904adbfe31d
  BUG: 1220068
  Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
  Reviewed-on: http://review.gluster.org/10739
  Tested-by: NetBSD Build System
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* features/bitrot: Scrubber pause/resume (Venky Shankar, 2015-05-10, 3 files, -9/+58)
  With the logical scan/scrub split, pausing the filesystem scrubber is an override to the thread throttling mechanism, which effectively throttles "down" the number of scrubber threads to zero. This causes the scanner to wait until threads are spawned again (when resumed), thereby continuing where it left off (since the file tree walk stack is effectively preserved when the main scanner thread is waiting for scrubbers to consume scanned entries).

  The only catch is when the scrubber daemon restarts: the file tree walk stack is lost and scrubbing initiates from root. This is probably OK for now (it can be changed later to persist parent directory information before entering the pause state).

  > Change-Id: I5109a749b7fccd0f5367765078f46e6522dd32a1
  > BUG: 1208131
  > Signed-off-by: Venky Shankar <vshankar@redhat.com>
  > Reviewed-on: http://review.gluster.org/10521
  > Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  > Tested-by: Vijay Bellur <vbellur@redhat.com>

  Change-Id: I9b60f2ce24ca3787423a45ec7d502f89215fe45f
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  BUG: 1220041
  Reviewed-on: http://review.gluster.org/10721
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
* features/bitrot: Throttle filesystem scrubber (Venky Shankar, 2015-05-10, 5 files, -51/+712)
  This patch introduces a multithreaded filesystem scrubber based on the throttling option configured for a particular volume. The implementation "logically" separates scanning and scrubbing, with the number of scrubber threads auto-configured depending upon the throttle configuration.

  Scanning (crawling) is left single threaded (per brick) with entries scrubbed in bulk. On reaching this "bulk" watermark, the scanner waits until entries are scrubbed. Bricks for a particular volume have a set of threads assigned for scrubbing, with entries for each brick scrubbed in a round-robin fashion to avoid scrub "stalls" when a brick (out of N bricks) is under active scrubbing.

  This mechanism helps us implement "pause/resume" with ease: all one needs to do is clean up the scrubber threads and let the main scanner thread "wait" until scrubbing is resumed (where the scrubber threads are spawned again), therefore continuing where we left off (unless we restart the daemons, where the crawl initiates from the root directory again, but I guess that's OK).

  [
  NOTE: Throttling is optional for the signer daemon, without which it runs full throttle. However, passing "-DBR_RATE_LIMIT_SIGNER" predefined in CFLAGS enables CPU throttling (during checksum calculation), thereby avoiding high CPU usage.
  ]

  Subsequent patches would introduce CPU throttling during hash calculation for the scrubber.

  > Change-Id: I5701dd6cd4dff27ca3144ac5e3798a2216b39d4f
  > BUG: 1207020
  > Signed-off-by: Venky Shankar <vshankar@redhat.com>
  > Reviewed-on: http://review.gluster.org/10511
  > Tested-by: Gluster Build System <jenkins@build.gluster.com>
  > Reviewed-by: Vijay Bellur <vbellur@redhat.com>

  Change-Id: I5a125b2d0ac7dafd3e278b7fe4c6c9dd07af76dd
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  BUG: 1220041
  Reviewed-on: http://review.gluster.org/10720
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
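The round-robin part of the commit above can be pictured with a short single-threaded Python sketch. It only shows the ordering of entries across bricks; the real scrubber uses threads and a bulk watermark, which are left out here. The function `scrub_round_robin`, the `scrub` callback, and the brick names are hypothetical.

```python
from collections import deque


def scrub_round_robin(brick_queues, scrub):
    """Scrub one entry per brick in turn until every queue is drained.

    brick_queues maps a brick name to a deque of scanned entries.
    """
    # Only bricks that currently have scanned entries take part.
    order = deque(name for name, queue in brick_queues.items() if queue)
    done = []
    while order:
        brick = order.popleft()
        entry = brick_queues[brick].popleft()
        scrub(brick, entry)
        done.append((brick, entry))
        if brick_queues[brick]:
            order.append(brick)  # brick still has work: back of the line
    return done
```

A brick with a long backlog therefore cannot stall the others: each brick with pending entries gets one entry scrubbed per pass.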
* features/bit-rot: Token Bucket based throttling (Venky Shankar, 2015-05-10, 6 files, -9/+432)
  BitRot daemons (signer & scrubber) are disk/CPU hoggers when left running full throttle. Checksum calculations (especially the SHA family of hash routines) can be quite CPU intensive. Moreover, periodic disk scans performed by the scrubber, followed by reading data blocks for hash calculation (which is also done by the signer), generate a lot of heavy IO requests. This causes interference with actual client operations (be it a regular client or filesystem daemons such as self-heal, etc.) and results in degraded system performance.

  This patch introduces throttling based on Token Bucket Filtering[1]. It's a well-known algorithm for checking (and ensuring) that data transmission conforms to defined limits and is generally used in packet-switched networks. Linux control groups (Cgroups) use a variant[2] of this algorithm to provide block device IO throttling (cgroup subsys "blkio": blk-iothrottle).

  So, why not just live with Cgroups? Cgroups is Linux specific. We need to have a throttling mechanism for other supported UNIXes. Moreover, having our own implementation gives much finer control in terms of tuning it for our needs (plus the simplicity of the algorithm itself).

  Ideally, throttling should be a part of the server stack (either as a separate translator or integrated with io-threads) since that's the point of entry for IO requests from *all* clients. That way one could selectively throttle IO requests based on client PIDs (frame->root->pid), e.g., self-heal daemon, bitrot, etc. (*actual* clients can run full throttle). This implementation avoids that deliberately (there needs to be a much smarter queueing mechanism) and throttles CPU usage for hash calculations.

  This patch is just the infrastructure part with no interfaces exposed to set various throttling values. The tunable selected here (basically hardcoded) avoids 100% CPU usage during hash calculation (with some burst cycles). We'd need much more intensive tests to assign values for various throttling options (lazy/normal/aggressive).

  [1] https://en.wikipedia.org/wiki/Token_bucket
  [2] http://en.wikipedia.org/wiki/Token_bucket#Hierarchical_token_bucket

  > Change-Id: Icc49af80eeab6adb60166d0810e69ef37cfe2fd8
  > BUG: 1207020
  > Signed-off-by: Venky Shankar <vshankar@redhat.com>
  > Reviewed-on: http://review.gluster.org/10307
  > Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  > Tested-by: Vijay Bellur <vbellur@redhat.com>

  Change-Id: I034ba1095aa3bfc3212a67a63ffb931431b372f6
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  BUG: 1220041
  Reviewed-on: http://review.gluster.org/10719
  Tested-by: NetBSD Build System
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
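For readers unfamiliar with the algorithm named in the commit above, here is a generic token bucket in Python. It is a textbook sketch and not the code added by this patch: the class `TokenBucket`, its parameters, and the idea of charging one token per unit of hashing work are illustrative assumptions. The clock is injectable so the behaviour can be checked without sleeping.

```python
import time


class TokenBucket:
    """Tokens refill at `rate` per second, capped at `capacity` (the burst size)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum number of stored tokens
        self.tokens = float(capacity)    # start full so an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def _refill(self):
        # Credit tokens for the time elapsed since the last refill.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_consume(self, n):
        """Take n tokens if available; return False (caller should wait) otherwise."""
        self._refill()
        if n <= self.tokens:
            self.tokens -= n
            return True
        return False

    def wait_time(self, n):
        """Seconds until n tokens could be available (0.0 if they already are).

        Requests larger than `capacity` can never be satisfied in one go.
        """
        self._refill()
        return max(0.0, (n - self.tokens) / self.rate)
```

A hashing loop would call try_consume() before processing each chunk and sleep for wait_time() when it returns False, which bounds the sustained rate while still permitting short bursts up to `capacity`.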
* features/bitrot: Follow xattr naming conventions (Venky Shankar, 2015-05-10, 3 files, -3/+4)
  Instead of "trusted.glusterfs.bit-rot.*" use "trusted.bit-rot.*".

  NOTE: With this patch, data on existing volumes would be re-signed (which should be OK for now, since we do not expect many users yet :-)).

  > Change-Id: I926c7bca266a9c8f2cb35d57c4d0359aa5cecfa0
  > BUG: 1170075
  > Signed-off-by: Venky Shankar <vshankar@redhat.com>
  > Reviewed-on: http://review.gluster.org/10181
  > Tested-by: NetBSD Build System
  > Tested-by: Gluster Build System <jenkins@build.gluster.com>
  > Reviewed-by: Vijay Bellur <vbellur@redhat.com>

  Change-Id: I3c18d7dc2db4beaca6e8d8d231b4171a7b18795f
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  BUG: 1220041
  Reviewed-on: http://review.gluster.org/10718
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
* features/bitrot: Use global timer wheel (Venky Shankar, 2015-05-10, 1 file, -23/+8)
  > Change-Id: I761927ea263b4144b851881f25791fda5b794f59
  > BUG: 1170075
  > Signed-off-by: Venky Shankar <vshankar@redhat.com>
  > Reviewed-on: http://review.gluster.org/10381
  > Tested-by: NetBSD Build System
  > Tested-by: Gluster Build System <jenkins@build.gluster.com>
  > Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  > Reviewed-by: Vijay Bellur <vbellur@redhat.com>

  Change-Id: I4aa7c0d8b42b4c8d14a1d810e54c2de4d52b4389
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  BUG: 1220041
  Reviewed-on: http://review.gluster.org/10717
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
* bitrot: Scrubber log should report 'bad' file detection as ALERT in log (Gaurav Kumar Garg, 2015-05-09, 1 file, -2/+2)
  If the scrubber detects a bad object through a mismatch between the scrubber's checksum and the signer's checksum, the log message should be reported as an ALERT instead of a warning.

  > Change-Id: I075d80700cbe6182e525a04419a80ab18419ff91
  > BUG: 1210687
  > Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
  > Reviewed-on: http://review.gluster.org/10226
  > Tested-by: Gluster Build System <jenkins@build.gluster.com>
  > Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>

  Change-Id: I7c733c82aed5a00c74e60dc7baca0aa9acf26fad
  BUG: 1220041
  Reviewed-on: http://review.gluster.org/10715
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>