glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	bitrot: Make number of signer threads configurable	Kotresh HR	2020-02-07	1	-12/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	The number of signing process threads (glfs_brpobj) is set to 4 by default. The recommendation is to set it to number of cores available. This patch makes it configurable as follows gluster vol bitrot <volname> signer-threads <count> fixes: bz#1797869 Change-Id: Ia883b3e5e34e0bc8d095243508d320c9c9c58adc Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	libglusterfs: Move devel headers under glusterfs directory	ShyamsundarR	2018-12-05	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	libglusterfs devel package headers are referenced in code using include semantics for a program, this while it works can be better especially when dealing with out of tree xlator builds or in general out of tree devel package usage. Towards this, the following changes are done, - moved all devel headers under a glusterfs directory - Included these headers using system header notation <> in all code outside of libglusterfs - Included these headers using own program notation "" within libglusterfs This change although big, is just moving around the headers and making it correct when including these headers from other sources. This helps us correctly include libglusterfs includes without namespace conflicts. Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b Updates: bz#1193929 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	Land clang-format changes	Gluster Ant	2018-09-12	1	-157/+156
\| \| \| \|	Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
*	All: run codespell on the code and fix issues.	Yaniv Kaul	2018-07-22	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Please review, it's not always just the comments that were fixed. I've had to revert of course all calls to creat() that were changed to create() ... Only compile-tested! Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	feature/bitrot: Ondemand scrub option for bitrot	Kotresh HR	2016-08-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The bitrot scrubber takes 'hourly/daily/biweekly/monthly' as the values for 'scrub-frequency'. There is no way to schedule the scrubbing when the admin wants it. Ondemand scrubbing brings in the new option 'ondemand' with which the admin can start scrubbing ondemand. It starts the scrubbing immediately. Ondemand scrubbing is successful only if the scrubber is in 'Active (Idle)' (waiting for it's next frequency cycle to start scrubbing). It is not entertained when the scrubber is in 'Paused' or already running. Here is the command line syntax. gluster volume bitrot <vol name> scrub ondemand Change-Id: I84c28904367eed827a7dae8d6a535c14b28e9f4d BUG: 1366195 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/15111 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Venky Shankar <vshankar@redhat.com>
*	features/bitrot: Move throttling code to libglusterfs	Kotresh HR	2016-07-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since throttling is a separate feature by itself, move throttling code to libglusterfs. Change-Id: If9b99885ceb46e5b1865a4af18b2a2caecf59972 BUG: 1352019 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/14846 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	features/bitrot: Option to set scrub interval to a minute	Kotresh HR	2016-07-05	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bitrot scrub-frequency supports "hourly\|daily\|weekly\|biweekly\|monthly". But it is painful for testing as minimum scrub-interval is an hour Hence introducing a scrub interval of minute to ease testing. It is intentionally not exposed in bitrot command help as it is only for testing. e.g., gluster vol bitrot <volname> scrub-frequency minute Change-Id: I155a65298d3fad5ae9e529d9c7d4b0d25fa297c0 BUG: 1351537 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/14836 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	features/bitrot: Fix Compilation Warning!!!	Kotresh HR	2016-05-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Earlier the lock was using glusterfs macros LOCK/UNLOCK/LOCK_INIT/LOCK_DESTROY. The patch http://review.gluster.org/#/c/14140/ used 'pthread_cleanup_push' interface for the same lock which was giving "initialization discards qualifiers from pointer target type". It's strange that the build succeeded in master branch with no warnings but fails for the backport http://review.gluster.org/#/c/14140/ in 3.7 branch treating this warning as error. Change-Id: I75c8a65a2bfb1147fe9a84cfd8f09a97c089ae70 BUG: 1332134 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/14146 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	features/bitrot: Introduce scrubber monitor thread	Kotresh HR	2016-05-01	1	-28/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The patch does following changes. 1. Introduce scrubber monitor thread. 2. Move scrub status related APIs to separate file and make part of libbitrot library. Problem: Earlier, each child of the scrubber was maintaining the state machine and hence there was no way to track the start and end time of scrubbing as each brick has it's own start and end time. Also each brick was maintaining it's own timer wheel instance. It was also not possible to get scrubbed files count per session as we could not get last child which finishes scrubbing to reset it to zero. Solution: Introduce scrubber monitor thread. It does following. 1. Maintains the scrubber state machine. Earlier each child had it's own state machine. Now, only monitor maintains on behalf of all it's children. 2. Maintains the timer wheel instance. Earlier each child had it's own timer wheel instance. Now, only monitor maintains on behalf of all it's children. As a result, we can track the scrub statistics easily and correctly. Change-Id: Ic6e34ffa57984bd7a5ee81f4e263342bc1d9b302 BUG: 1329211 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/14044 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
*	bitrot: getting correct value of scrub stat's	Gaurav Kumar Garg	2015-12-14	1	-2/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When user execute bitrot scrub status command then gluster is not giving correct value of Number of Scrubbed files, Number of Unsigned files, Last completed scrub time, Duration of last scrub. With this patch scrub status will give correct value for all the above fields. Change-Id: Ic966f76d22db5b0c889e6386a1c2219afbda1f49 BUG: 1285989 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/12776 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
*	features/bit-rot: scrubber changes for getting the list of bad objects from stub	Raghavendra Bhat	2015-11-22	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	Change-Id: I62885e4aba4a9b345db3c78c3291d563ff3d3567 BUG: 1207627 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/12654 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
*	features/bitrot: Fix scrubber frequency set	Kotresh HR	2015-08-23	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When bitrot is configured on multiple volumes in a cluster and scrubber-frequency is changed for one volume, it is resetting frequency for all other volumes w.r.t to its scrubber-frequency. This should not happen. Changing scrubber-frequency should affect only that volume on which it is set. This patch fixes the issue. Also restricted the logs to the configure volume. Change-Id: I90d6e864b131e3d8dd4010079a00f924032f2098 BUG: 1252825 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/11897 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
*	features/bitrot: handle scrub states via state machine	Venky Shankar	2015-06-25	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	A bunch of command line options for scrubber tempted the use of state machine to track current state of scrubber under various circumstances where the options could be in effect. Change-Id: Id614bb2e6af30a90d2391ea31ae0a3edeb4e0d69 BUG: 1231619 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/11149 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	features/bitrot: cleanup, v2	Venky Shankar	2015-06-25	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch uses "cleanup, v1" infrastrcuture to cleanup scrubber (data structures, threads, timers, etc..) on brick disconnection. Signer is not cleaned up yet: probably would be done as part of another patch. Change-Id: I78a92b8a7f02b2f39078aa9a5a6b101fc499fd70 BUG: 1231619 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/11148 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
*	features/bitrot: cleanup, v1	Venky Shankar	2015-06-25	1	-2/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a short series of patches (with other cleanups) aimed at cleaning up some of the incorrect assumptions taken in reconfigure() leading to crashes when subvolumes are not fully initialized (as reported here[1] on gluster-devel@). Furthermore, there is some amount of code cleanup to handle disconnection and cleanup up data structure (as part of subsequent patch). [1] http://www.gluster.org/pipermail/gluster-devel/2015-June/045410.html Change-Id: I68ac4bccfbac4bf02fcc31615bd7d2d191021132 BUG: 1231617 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/11147 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
*	features/bitrot: tuanble object signing waiting time value for bitrot	Gaurav Kumar Garg	2015-06-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently bitrot using 120 second waiting time for object to be signed after all fop's released. This signing waiting time value should be tunable. Command for changing the signing waiting time will be #gluster volume bitrot <VOLNAME> signing-time <waiting time value in second> Change-Id: I89f3121564c1bbd0825f60aae6147413a2fbd798 BUG: 1228680 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/11105
*	build: do not #include "config.h" in each file	Niels de Vos	2015-05-29	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of including config.h in each file, and have the additional config.h included from the compiler commandline (-include option). When a .c file tests for a certain #define, and config.h was not included, incorrect assumtions were made. With this change, it can not happen again. BUG: 1222319 Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/10808 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	features/bitrot: refactor brick connection logic	Raghavendra Bhat	2015-05-28	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Brick connection was bloated (and not implemented efficiently) with calls which were not required to be called under lock. This resulted in starvation of lock by critical code paths. This eventally did not scale when the number of bricks per volume increases (add-brick and the likes). Also, this patch cleans up some of the weird reconnection logic that added more to the starvation of resources and cleans up uncontrolled growing of log files. Change-Id: I05e737f2a9742944a4a543327d167de2489236a4 BUG: 1207134 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/10763 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: NetBSD Build System
*	features/bitrot: reimplement scrubbing frequency	Venky Shankar	2015-05-28	1	-6/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch reimplments existing scrub-frequency mechanism used to schedule scrubber runs. Existing mechanism uses periodic sleeps (waking up periodically on minimum granularity) and performing a number of tracking checks based on counters and sleep times. This patch does away with all the nifty counters and uses timer-wheel to schedule scrub runs. Scheduling changes are peformed by merely calculating the new expiry time and calling mod_timer() [mod_timer_pending() in some cases] making the code more debuggable and easier to follow. This also introduces "hourly" scrubbing tunable as an aid for testing scrubbing during development/testing cycle. One could also implement on-demand scrubbing with ease: by invoking mod_timer() with an expiry of one (1) second, thereby scheduling a scrub run the very next second. Change-Id: I6c7c5f0c6c9f886bf574d88c04cde14b76e60a8b BUG: 1224596 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/10893 Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/bitrot: stub improvements and fixes	Venky Shankar	2015-05-28	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch refactors the signing trigger mechanism used by bitrot daemon as a "catch up" meachanism to sign files which _missed_ signing on the last run either due to bitrot being disabled and enabled again or if bitrot is enabled for a volume with existing data. Existing implementation relies on overloading writev() to trigger signing which just by the looks sounded dangerous and I hated it to the core. This change moves all that business to the setxattr interface thereby keeping the writev path strictly for client IO. Why not use IPC fop to trigger signing? There's a need to access the object's inode to perform various maintainance operations. inode is not _directly_ accessible in the IPC fop (although, it can be found via inode_grep() for the object's GFID - the inode just needs to be pinned in memory, which is the case if there's an active fd on the inode). This patch relies on good old technique of overloading fsetxattr() to do the job instead of using IPC fop. There are some pretty nice cleanups along the lines of memory deallocations, unncessary allocations and redundant ref()ing of structures (such as fd's) provided by this patch. All in all - much improved code navigation. Change-Id: Id93fe90b1618802d1a95a5072517dac342b96cb8 BUG: 1224600 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/10942 Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/bitrot: scrubber should crawl based on the scrubber frequency value	Gaurav Kumar Garg	2015-05-10	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently scrubber is crawling all the files continuously. It should crawl files based on the scrubber frequency which user have set. By default scrubber crawling frequency value will be biweekly. Change-Id: I5762a92c1e700134cfe4283d1f631904adbfe31d BUG: 1208131 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/10602 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/bit-rot-stub: versioning of objects in write/truncate fop instead ↵	Raghavendra Bhat	2015-05-08	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	of open * This patch brings in the changes where object versioning is done in write and truncate fops instead of tracking them in open and create fops. This model works for both regular and anonymous fds. It also removes the race associated with open calls, create and lookups. This patch follows the below method for object versioning and notifications: Before sending writev on the fd, increase the ongoing version first. This makes anonymous fd write similar to the regular fd write by having the ongoing version increased before doing the write. Do following steps to do versioning: 1) For anonymous fds set the fd context (so that release is invoked) and add the fd context to the list maintained in the inode context. For regular fds the above think would have been done in open itself. 2) Increase the on-disk ongoing version 3) Increase the in memory ongoing version and mark inode as non-dirty 3) Once versioning is successfully done send write operation. If versioning fails, then fail the write fop. 5) In writev_cbk mark inode as modified. Change-Id: I7104391bbe076d8fc49b68745d2ec29a6e92476c BUG: 1207979 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/10233 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/bitrot: Scrubber pause/resume	Venky Shankar	2015-05-08	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With logical scan/scrub split, pausing filesystem scrubber is an override to the thread throttling mechanism, which effectively throttles "down" number of scrubber threads to zero. This causes scanner to wait until threads are spawned again (when resumed) thereby continuing where it left off (since the file tree walk stack is effectively preserved when the main scanner thread is waiting for scrubbers to consume scanned entries). The only catch is when scrubber daemon restarts: file tree walk stack is lost and scrubbing initiates from root. This is probably OK for now (can be changed later to persist parent directory information before entering pause state). Change-Id: I5109a749b7fccd0f5367765078f46e6522dd32a1 BUG: 1208131 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/10521 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
*	features/bitrot: Throttle filesystem scrubber	Venky Shankar	2015-05-07	1	-1/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch introduces multithreaded filesystem scrubber based on throttling option configured for a particular volume. The implementation "logically" breaks scanning and scrubbing with the number of scrubber threads auto-configured depending upon the throttle configuration. Scanning (crawling) is left single threaded (per brick) with entries scrubbed in bulk. On reaching this "bulk" watermark, scanner waits until entries are scrubbed. Bricks for a particular volume have a set of thread(s) assigned for scrubbing, with entries for each brick scrubbed in a round robin fashion to avoid scrub "stalls" when a brick (out of N bricks) is under active scrubbing. This mechanism helps us implement "pause/resume" with ease: all one need to do is to cleanup scrubber threads and let the main scanner thread "wait" untill scrubbing is resumed (where the scrubber thread(s) are spawned again), therefore continuing where we left off (unless we restart the deamons, where crawl initiates from root directory again, but I guess that's OK). [ NOTE: Throttling is optional for the signer daemon, without which it runs full throttle. However, passing "-DBR_RATE_LIMIT_SIGNER" predefined in CFLAGS enables CPU throttling (during checksum calculation) thereby avoiding high CPU usage. ] Subsequent patches would introduce CPU throttling during hash calculation for scrubber. Change-Id: I5701dd6cd4dff27ca3144ac5e3798a2216b39d4f BUG: 1207020 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/10511 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/bit-rot: Token Bucket based throttling	Venky Shankar	2015-05-07	1	-2/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	BitRot daemons (signer & scrubber) are disk/cpu hoggers when left running full throttle. Checksum calculations (especially SHA family of hash routines) can be quite CPU intensive. Moreover periodic disk scans performed by scrubber followed by reading data blocks for hash calculation (which is also done by signer) generate lot of heavy IO request(s). This causes interference with actual client operations (be it a regular client or filesystems daemons such as self-heal, etc..) and results in degraded system performance. This patch introduces throttling based on Token Bucket Filtering[1]. It's a well known algorithm for checking (and ensuring) that data transmission conform to defined limits and generally used in packet switched networks. Linux control groups (Cgroups) uses a variant[2] of this algorithm to provide block device IO throttling (cgroup subsys "blkio": blk-iothrottle). So, why not just live with Cgroups? Cgroups is linux specific. We need to have a throttling mechanism for other supported UNIXes. Moreover, having our own implementation gives much more finer control in terms of tuning it for our needs (plus the simplicity of the alogorithm itself). Ideally, throttling should be a part of server stack (either as a separate translator or integrated with io-threads) since that's the point of entry for IO request(s) from all client(s). That way one could selectively throttle IO request(s) based on client PIDs (frame->root->pid), e.g., self-heal daemon, bitrot, etc.. (actual clients can run full throttle). This implementation avoids that deliberately (there needs to be a much more smarter queueing mechanism) and throttles CPU usage for hash calculations. This patch is just the infrastructure part with no interfaces exposed to set various throttling values. The tunable selected here (basically hardcoded) avoids 100% CPU usage during hash calculation (with some bursts cycles). We'd need much more intensive test(s) to assign values for various throttling options (lazy/normal/aggressive). [1] https://en.wikipedia.org/wiki/Token_bucket [2] http://en.wikipedia.org/wiki/Token_bucket#Hierarchical_token_bucket Change-Id: Icc49af80eeab6adb60166d0810e69ef37cfe2fd8 BUG: 1207020 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/10307 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
*	features/bit-rot: filesystem scrubber	Venky Shankar	2015-03-24	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Scrubber performs signature verification for objects that were signed by signer. This is done by recalculating the signature (using the hash algorithm the object was signed with) and verifying it aginst the objects persisted signature. Since the object could be undergoing IO opretaion at the time of hash calculation, the signature may not match objects persisted signature. Bitrot stub provides additional information about the stalesness of an objects signature (determinted by it's versioning mechanism). This additional bit of information is used by scrubber to determine the staleness of the signature, and in such cases the object is skipped verification (although signature staleness is performed twice: once before initiation of hash calculation and another after it (an object could be modified after staleness checks). The implmentation is a part of the bitrot xlator (signer) which acts as a signer or scrubber based on a translator option. As of now the scrub process is ever running (but has some form of weak throttling mechanism during filesystem scan). Going forward, there needs to be some form of scrub scheduling and IO throttling (during hash calculation) tunables (via CLI). Change-Id: I665ce90208f6074b98c5a1dd841ce776627cc6f9 BUG: 1170075 Original-Author: Raghavendra Bhat <rabhat@redhat.com> Original-Author: Venky Shankar <vshankar@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9914 Tested-by: Vijay Bellur <vbellur@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/bit-rot: Implementation of bit-rot xlator	Venky Shankar	2015-03-24	1	-0/+126
	This is the "Signer" -- responsible for signing files with their checksums upon last file descriptor close (last release()). The event notification facility provided by the changelog xlator is made use of. Moreover, checksums are as of now SHA256 hash of the object data and is the only available hash at this point of time. Therefore, there is no special "what hash to use" type check, although it's does not take much to add various hashing algorithms to sign objects with. Signatures are stored in extended attributes of the objects along with the the type of hashing used to calculate the signature. This makes thing future proof when other hash types are added. The signature infrastructure is provided by bitrot stub: a little piece of code that sits over the POSIX xlator providing interfaces to "get or set" objects signature and it's staleness. Since objects are signed upon receiving release() notification, pre-existing data which are "never" modified would never be signed. To counter this, an initial crawler thread is spawned The crawler scans the entire brick for objects that are unsigned or "missed" signing due to the server going offline (node reboots, crashes, etc..) and triggers an explicit sign. This would also sign objects when bit-rot is enabled for a volume and/or after upgrade. Change-Id: I1d9a98bee6cad1c39c35c53c8fb0fc4bad2bf67b BUG: 1170075 Original-Author: Raghavendra Bhat <raghavendra@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9711 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>