| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Backport of http://review.gluster.org/14846
Since throttling is a separate feature by itself,
move throttling code to libglusterfs.
Change-Id: If9b99885ceb46e5b1865a4af18b2a2caecf59972
BUG: 1357514
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/14944
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| |
Backport of http://review.gluster.org/14864/
Bitrot scrub status shows whether the scrub is paused
or active. It doesn't show whether the scrubber is
actually scrubbing or waiting in the timer wheel
for the next schedule. This patch shows this status
with "In Progress" and "Idle" respectively.
Change-Id: I995d8553d1ff166503ae1e7b46282fc3ba961f0b
BUG: 1355635
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit f4757d256e3e00132ef204c01ed61f78f705ad6b)
Reviewed-on: http://review.gluster.org/14900
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| |
This patch makes the following changes:
1. Introduce a scrubber monitor thread.
2. Move the scrub status related APIs to a separate file
and make them part of the libbitrot library.
Problem:
Earlier, each child of the scrubber maintained its own
state machine, so there was no way to track the start
and end time of scrubbing: each brick had its own start
and end time. Each brick also maintained its own timer
wheel instance. Nor was it possible to get the scrubbed
files count per session, because the last child to finish
scrubbing could not be identified in order to reset the
count to zero.
Solution:
Introduce a scrubber monitor thread. It does the following:
1. Maintains the scrubber state machine. Earlier each
child had its own state machine. Now only the monitor
maintains one, on behalf of all its children.
2. Maintains the timer wheel instance. Earlier each
child had its own timer wheel instance. Now only the
monitor maintains one, on behalf of all its children.
As a result, we can track the scrub statistics easily
and correctly.
Backport of:
>Change-Id: Ic6e34ffa57984bd7a5ee81f4e263342bc1d9b302
>BUG: 1329211
>Signed-off-by: Kotresh HR <khiremat@redhat.com>
>Reviewed-on: http://review.gluster.org/14044
>Smoke: Gluster Build System <jenkins@build.gluster.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
>Reviewed-by: Venky Shankar <vshankar@redhat.com>
Backport of:
>http://review.gluster.org/#/c/14146
>BUG: 1332134
NOTE: Patch #14146 addresses a compilation warning that is not
detected in the master branch, only in the 3.7 branch. Since the
compilation warning is introduced by patch #14044, the above
two backports are combined into this single patch.
Change-Id: I1da7a3ec673a36ae0f59dc33ac5992c74fd7a19b
BUG: 1332072
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/14140
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
| |
There are three kinds of inline functions: plain inline, extern inline,
and static inline. All three have been removed from .c files, except
those in "contrib" which aren't our problem. Inlines in .h files, which
are overwhelmingly "static inline" already, have generally been left
alone. Over time we should be able to "lower" these into .c files, but
that has to be done in a case-by-case fashion requiring more manual
effort. This part was easy to do automatically without (as far as I can
tell) any ill effect.
In the process, several pieces of dead code were flagged by the
compiler, and were removed.
backport of Change-Id: I56a5e614735c9e0a6ee420dab949eac22e25c155,
http://review.gluster.org/11769, BUG: 1245331
Change-Id: Iba1efb0bc578ea4a5e9bf76b7bd93dc1be9eba44
BUG: 1283302
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/12646
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
| |
This patch is a backport of: http://review.gluster.org/#/c/12776/
When the user executes the bitrot scrub status command, gluster
does not report correct values for Number of Scrubbed files,
Number of Unsigned files, Last completed scrub time, and
Duration of last scrub.
With this patch, scrub status reports correct values for
all the above fields.
>> Change-Id: Ic966f76d22db5b0c889e6386a1c2219afbda1f49
>> BUG: 1285989
>> Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
>> Signed-off-by: Kotresh HR <khiremat@redhat.com>
>> Reviewed-on: http://review.gluster.org/12776
>> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
>> Tested-by: Gluster Build System <jenkins@build.gluster.com>
>> Reviewed-by: Venky Shankar <vshankar@redhat.com>
Change-Id: Ic966f76d22db5b0c889e6386a1c2219afbda1f49
BUG: 1291546
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
(cherry picked from commit 22827d51c232c44a8f5ac003529d907d93baf7b0)
Change-Id: Icef24cce35c8d54ffdfa5282491338318e78780b
Reviewed-on: http://review.gluster.org/12966
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
| |
This patch is a backport of: http://review.gluster.org/#/c/12720/
Currently the scrub status command does not display the list of bad files,
even though all the bad files are available in the bitd daemon.
With this patch, the scrub status command displays the list of all bad
files.
>> Change-Id: If09babafaf5d7cf158fa79119abbf5b986027748
>> BUG: 1207627
>> Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Change-Id: If09babafaf5d7cf158fa79119abbf5b986027748
BUG: 1283881
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/12725
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
| |
This patch is a backport of: http://review.gluster.org/10231
The CLI command for bitrot scrub status is:
gluster volume bitrot <volname> scrub status
This command shows the statistics of the bitrot scrubber.
It first shows some common scrubber tunable values of
volume <VOLNAME>, followed by the scrubber statistics
of individual nodes.
Sample output for a single node:
Volume name : <VOLNAME>
State of scrub: Active
Scrub frequency: biweekly
Bitrot error log location: /var/log/glusterfs/bitd.log
Scrubber error log location: /var/log/glusterfs/scrub.log
=========================================================
Node name:
Number of Scrubbed files:
Number of Unsigned files:
Last completed scrub time:
Duration of last scrub:
Error count:
=========================================================
This is just infrastructure. The list of bad files, last scrub
time, and error count values will be taken care of by the
http://review.gluster.org/#/c/12503/ and
http://review.gluster.org/#/c/12654/ patches.
>> Change-Id: I3ed3c7057c9d0c894233f4079a7f185d90c202d1
>> BUG: 1207627
>> Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
>> Reviewed-on: http://review.gluster.org/10231
>> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
>> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
>> Tested-by: Gluster Build System <jenkins@build.gluster.com>
Change-Id: I45ed94e5e0e78a1e007c30eb0b252f74cf3c9187
BUG: 1283881
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/12704
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
| |
Backport of http://review.gluster.org/11461
Which was done at half the set expiry time resulting in actual
IOs incrementing the object version. Now this is done just at
the last moment, with re-notification now short-circuiting into
checksum calculation without waiting in the timer-wheel.
BUG: 1242718
Change-Id: If655b77d822ebf7b2a4f65e1b5583dd3609306e7
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/11653
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
| |
Backport of http://review.gluster.org/11126
* Access to bad objects (especially operations such as open, readv, writev)
should be denied to prevent applications from getting wrong data.
* Do not allow anyone apart from scrubber to set bad object xattr.
* Do not allow bad object xattr to be removed.
Change-Id: I6903184ab64a9d1ea595330b603935979c33bc26
BUG: 1241529
Reviewed-on: http://review.gluster.org/11603
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
| |
Backport of http://review.gluster.org/11396
Change-Id: Idfd245327b485459ccbda503510b8ca0127bb66c
BUG: 1226666
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/11542
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
| |
Backport of http://review.gluster.org/11149
A bunch of command line options for scrubber tempted the use of
state machine to track current state of scrubber under various
circumstances where the options could be in effect.
Change-Id: Id614bb2e6af30a90d2391ea31ae0a3edeb4e0d69
BUG: 1226666
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/11541
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
| |
Backport of http://review.gluster.org/11148
This patch uses the "cleanup, v1" infrastructure to clean up the scrubber
(data structures, threads, timers, etc.) on brick disconnection.
The signer is not cleaned up yet; that would probably be done as part of
another patch.
Change-Id: I78a92b8a7f02b2f39078aa9a5a6b101fc499fd70
BUG: 1226666
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/11540
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
| |
Backport of http://review.gluster.org/11147
This is a short series of patches (with other cleanups) aimed at
cleaning up some of the incorrect assumptions made in reconfigure()
that lead to crashes when subvolumes are not fully initialized (as
reported here[1] on gluster-devel@). Furthermore, there is some
amount of code cleanup to handle disconnection and to clean up data
structures (as part of a subsequent patch).
[1] http://www.gluster.org/pipermail/gluster-devel/2015-June/045410.html
Change-Id: I68ac4bccfbac4bf02fcc31615bd7d2d191021132
BUG: 1226830
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/11539
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
| |
Backport of http://review.gluster.org/10297
Cherry picked from 2f0d36d16c241365760aaa6d857b7a4d438e1042
>Change-Id: I83c494f2bb60d29495cd643659774d430325af0a
>BUG: 1194640
>Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
>Reviewed-on: http://review.gluster.org/10297
>Tested-by: Venky Shankar <vshankar@redhat.com>
>Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
>Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
>Tested-by: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Venky Shankar <vshankar@redhat.com>
Change-Id: I83c494f2bb60d29495cd643659774d430325af0a
BUG: 1217722
Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
Reviewed-on: http://review.gluster.org/11379
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
| |
Backport of http://review.gluster.org/10947
* Let the bit-rot stub check the on-disk ongoing version and signed version xattrs
as well as the in-memory flags in the inode, and then decide whether the inode is
stale or not. This information is used by the one shot crawler in BitD to decide
whether to trigger the sign for the object or skip it.
NOTE: The above check should be done only for BitD. For the scrubber it's still the
old way of comparing the on-disk ongoing version with the signed version.
* BitD's one shot crawler should not sign zero byte objects if they do not contain
a signature (meaning the object was just created and nothing was written to it).
Change-Id: I580b45b85f62fc075616ee3da9c15a3c8335d7a8
BUG: 1232199
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/11249
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
| |
Currently bitrot uses a fixed 120 second waiting time before an object is
signed after all fops are released. This signing waiting time should be tunable.
The command for changing the signing waiting time is:
#gluster volume bitrot <VOLNAME> signing-time <waiting time value in second>
Change-Id: I89f3121564c1bbd0825f60aae6147413a2fbd798
BUG: 1231832
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/11105
(cherry picked from commit 554fa0c1315d0b4b78ba35a2d332d7ac0fd07d48)
Reviewed-on: http://review.gluster.org/11235
Tested-by: Gluster Build System <jenkins@build.gluster.com>
| |
Backport of http://review.gluster.org/10763
Brick connection was bloated (and not implemented efficiently) with
calls which were not required to be made under lock. This resulted
in lock starvation for critical code paths, and eventually did not
scale when the number of bricks per volume increased (add-brick and
the like).
Also, this patch cleans up some of the weird reconnection logic that
added to the starvation of resources, and curbs the uncontrolled
growth of log files.
Change-Id: I05e737f2a9742944a4a543327d167de2489236a4
BUG: 1226146
Original-Author: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10986
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
| |
This patch reimplements the existing scrub-frequency mechanism used
to schedule scrubber runs. The existing mechanism uses periodic
sleeps (waking up periodically on minimum granularity) and
performs a number of tracking checks based on counters and
sleep times. This patch does away with all the nifty counters
and uses timer-wheel to schedule scrub runs.
Scheduling changes are performed by merely calculating the new
expiry time and calling mod_timer() [mod_timer_pending() in
some cases] making the code more debuggable and easier to
follow. This also introduces an "hourly" scrubbing tunable as an
aid for testing scrubbing during the development/testing cycle.
One could also implement on-demand scrubbing with ease: by
invoking mod_timer() with an expiry of one (1) second, thereby
scheduling a scrub run the very next second.
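As a rough illustration of the scheduling arithmetic described above, the sketch below maps a scrub frequency to an expiry in seconds. The enum, the function names, the frequencies other than "hourly" and "biweekly", and the 30-day month are assumptions made for this example, not the patch's actual code. The returned value is what one would hand to mod_timer(), which is deliberately not called here.

```c
#include <stdint.h>

/* Hypothetical frequency values; only "hourly" and "biweekly" are
 * named in the commit messages, the rest are assumed for illustration. */
enum scrub_freq {
    FREQ_HOURLY,
    FREQ_DAILY,
    FREQ_WEEKLY,
    FREQ_BIWEEKLY,
    FREQ_MONTHLY
};

/* Length of one scrub period in seconds (month assumed to be 30 days). */
static uint32_t scrub_period_sec(enum scrub_freq f)
{
    const uint32_t hour = 60 * 60;
    const uint32_t day = 24 * hour;

    switch (f) {
    case FREQ_HOURLY:   return hour;
    case FREQ_DAILY:    return day;
    case FREQ_WEEKLY:   return 7 * day;
    case FREQ_BIWEEKLY: return 14 * day;
    case FREQ_MONTHLY:  return 30 * day;
    }
    return 14 * day; /* fall back to the biweekly default */
}

/* Expiry to schedule: one second for an on-demand run,
 * otherwise one full period of the configured frequency. */
static uint32_t scrub_expiry_sec(enum scrub_freq f, int on_demand)
{
    return on_demand ? 1 : scrub_period_sec(f);
}
```

Changing the frequency then amounts to recomputing this value and re-arming the timer, which is the simplification the message argues for.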
Change-Id: I6c7c5f0c6c9f886bf574d88c04cde14b76e60a8b
BUG: 1224647
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10902
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
| |
This patch refactors the signing trigger mechanism used by bitrot
daemon as a "catch up" mechanism to sign files which _missed_
signing on the last run either due to bitrot being disabled and
enabled again or if bitrot is enabled for a volume with existing
data.
Existing implementation relies on overloading writev() to trigger
signing which just by the looks sounded dangerous and I hated it
to the core. This change moves all that business to the setxattr
interface thereby keeping the writev path strictly for client
IO.
Why not use IPC fop to trigger signing?
There's a need to access the object's inode to perform various
maintenance operations. The inode is not _directly_ accessible in
the IPC fop (although, it can be found via inode_grep() for the
object's GFID - the inode just needs to be pinned in memory,
which is the case if there's an active fd on the inode). This
patch relies on good old technique of overloading fsetxattr()
to do the job instead of using IPC fop.
There are some pretty nice cleanups along the lines of memory
deallocations, unnecessary allocations and redundant ref()ing
of structures (such as fd's) provided by this patch. All in
all - much improved code navigation.
Change-Id: Id93fe90b1618802d1a95a5072517dac342b96cb8
BUG: 1225709
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10953
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
| |
of open
* This patch brings in the changes where object versioning is done in write and
truncate fops instead of tracking them in open and create fops. This model
works for both regular and anonymous fds. It also removes the race associated
with open calls, create and lookups.
This patch follows the below method for object versioning and notifications:
Before sending writev on the fd, increase the ongoing
version first. This makes anonymous fd write similar to the regular
fd write by having the ongoing version increased before doing the
write.
The steps for versioning are as follows:
1) For anonymous fds set the fd context (so that release is invoked) and add
the fd context to the list maintained in the inode context.
For regular fds the above would already have been done in open itself.
2) Increase the on-disk ongoing version
3) Increase the in memory ongoing version and mark inode as non-dirty
4) Once versioning is successfully done send write operation. If
versioning fails, then fail the write fop.
5) In writev_cbk mark inode as modified.
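The ordering of the steps above can be sketched as a small model. This is a minimal sketch under stated assumptions: struct object, its fields, and versioned_write() are hypothetical names invented for illustration, the on-disk update is simulated by a flag, and the write and its callback are collapsed into one function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory model of an object; not the bit-rot stub's types. */
struct object {
    uint64_t disk_version; /* on-disk ongoing version */
    uint64_t mem_version;  /* in-memory ongoing version */
    bool dirty;            /* inode dirty flag */
    bool modified;         /* set once the write has completed */
    bool fd_tracked;       /* fd context added to the inode's list */
    bool disk_fail;        /* simulate a failing on-disk version update */
};

/* Returns 0 when the write goes through, -1 when versioning failed
 * and the write is therefore failed without being issued. */
static int versioned_write(struct object *o)
{
    /* Step 1: make sure the fd is tracked (needed for anonymous fds). */
    o->fd_tracked = true;

    /* Step 2: bump the on-disk ongoing version first. */
    if (o->disk_fail)
        return -1; /* versioning failed: fail the write */
    o->disk_version++;

    /* Step 3: bump the in-memory version and mark the inode non-dirty. */
    o->mem_version++;
    o->dirty = false;

    /* Step 4: the actual write would be sent here. */

    /* Step 5: in the write callback, mark the inode as modified. */
    o->modified = true;
    return 0;
}
```

The point of the ordering is that a write is never issued against an object whose version has not already been advanced.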
> Change-Id: I7104391bbe076d8fc49b68745d2ec29a6e92476c
> BUG: 1207979
> Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
> Reviewed-on: http://review.gluster.org/10233
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Change-Id: I4bb86989b5fab02b9ed2950798b1a80e566f1024
BUG: 1220041
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/10722
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
| |
Currently the scrubber crawls all the files continuously. It should
crawl files based on the scrubber frequency which the user has set.
By default the scrubber crawling frequency is biweekly.
Change-Id: I5762a92c1e700134cfe4283d1f631904adbfe31d
BUG: 1220068
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10739
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
| |
With logical scan/scrub split, pausing filesystem scrubber is an
override to the thread throttling mechanism, which effectively
throttles "down" number of scrubber threads to zero. This causes
scanner to wait until threads are spawned again (when resumed)
thereby continuing where it left off (since the file tree walk
stack is effectively preserved when the main scanner thread
is waiting for scrubbers to consume scanned entries).
The only catch is when scrubber daemon restarts: file tree walk
stack is lost and scrubbing initiates from root. This is probably
OK for now (can be changed later to persist parent directory
information before entering pause state).
> Change-Id: I5109a749b7fccd0f5367765078f46e6522dd32a1
> BUG: 1208131
> Signed-off-by: Venky Shankar <vshankar@redhat.com>
> Reviewed-on: http://review.gluster.org/10521
> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
> Tested-by: Vijay Bellur <vbellur@redhat.com>
Change-Id: I9b60f2ce24ca3787423a45ec7d502f89215fe45f
Signed-off-by: Venky Shankar <vshankar@redhat.com>
BUG: 1220041
Reviewed-on: http://review.gluster.org/10721
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
| |
This patch introduces multithreaded filesystem scrubber based
on throttling option configured for a particular volume. The
implementation "logically" breaks scanning and scrubbing with
the number of scrubber threads auto-configured depending upon
the throttle configuration. Scanning (crawling) is left single
threaded (per brick) with entries scrubbed in bulk. On reaching
this "bulk" watermark, scanner waits until entries are scrubbed.
Bricks for a particular volume have a set of thread(s) assigned
for scrubbing, with entries for each brick scrubbed in a round
robin fashion to avoid scrub "stalls" when a brick (out of N
bricks) is under active scrubbing.
This mechanism helps us implement "pause/resume" with ease: all
one needs to do is clean up the scrubber threads and let the main
scanner thread "wait" until scrubbing is resumed (where the
scrubber thread(s) are spawned again), therefore continuing
where we left off (unless we restart the daemons, where crawl
initiates from root directory again, but I guess that's OK).
[
NOTE:
Throttling is optional for the signer daemon, without which
it runs full throttle. However, passing "-DBR_RATE_LIMIT_SIGNER"
predefined in CFLAGS enables CPU throttling (during checksum
calculation) thereby avoiding high CPU usage.
]
Subsequent patches would introduce CPU throttling during hash
calculation for scrubber.
> Change-Id: I5701dd6cd4dff27ca3144ac5e3798a2216b39d4f
> BUG: 1207020
> Signed-off-by: Venky Shankar <vshankar@redhat.com>
> Reviewed-on: http://review.gluster.org/10511
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Change-Id: I5a125b2d0ac7dafd3e278b7fe4c6c9dd07af76dd
Signed-off-by: Venky Shankar <vshankar@redhat.com>
BUG: 1220041
Reviewed-on: http://review.gluster.org/10720
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
| |
BitRot daemons (signer & scrubber) are disk/cpu hoggers when left
running full throttle. Checksum calculations (especially SHA family
of hash routines) can be quite CPU intensive. Moreover periodic
disk scans performed by scrubber followed by reading data blocks
for hash calculation (which is also done by signer) generate a lot
of heavy IO request(s). This causes interference with actual client
operations (be it a regular client or filesystems daemons such as
self-heal, etc..) and results in degraded system performance.
This patch introduces throttling based on Token Bucket Filtering[1].
It's a well known algorithm for checking (and ensuring) that data
transmission conforms to defined limits and is generally used in packet
switched networks. Linux control groups (Cgroups) use a variant[2]
of this algorithm to provide block device IO throttling (cgroup
subsys "blkio": blk-iothrottle).
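For readers unfamiliar with the algorithm, a minimal token bucket can be sketched as follows. The struct, field, and function names are hypothetical and chosen for this illustration; they are not the names used by the patch, and locking as well as overflow of rate * elapsed_sec are ignored for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical token bucket; names are illustrative, not GlusterFS API. */
struct tbf {
    uint64_t tokens;     /* tokens currently available */
    uint64_t max_tokens; /* bucket capacity, which bounds the burst size */
    uint64_t rate;       /* tokens added per elapsed second */
};

/* Add the tokens earned over elapsed_sec seconds, capped at capacity. */
static void tbf_refill(struct tbf *b, uint64_t elapsed_sec)
{
    uint64_t added = b->rate * elapsed_sec;
    uint64_t room = b->max_tokens - b->tokens;

    b->tokens += (added > room) ? room : added;
}

/* Try to take `need` tokens. A false return means the caller must
 * wait for a refill before doing the work (e.g. hashing a block). */
static bool tbf_consume(struct tbf *b, uint64_t need)
{
    if (need > b->tokens)
        return false;
    b->tokens -= need;
    return true;
}
```

A consumer would typically charge one token per byte (or per block) hashed, so the refill rate bounds sustained throughput while the capacity allows short bursts.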
So, why not just live with Cgroups?
Cgroups is Linux specific. We need to have a throttling mechanism
for other supported UNIXes. Moreover, having our own implementation
gives much finer control in terms of tuning it for our needs
(plus the simplicity of the algorithm itself).
Ideally, throttling should be a part of server stack (either as a
separate translator or integrated with io-threads) since that's
the point of entry for IO request(s) from *all* client(s). That
way one could selectively throttle IO request(s) based on client
PIDs (frame->root->pid), e.g., self-heal daemon, bitrot, etc..
(*actual* clients can run full throttle). This implementation
avoids that deliberately (there needs to be a much smarter
queueing mechanism) and throttles CPU usage for hash calculations.
This patch is just the infrastructure part with no interfaces
exposed to set various throttling values. The tunable selected
here (basically hardcoded) avoids 100% CPU usage during hash
calculation (with some burst cycles). We'd need much more
intensive test(s) to assign values for various throttling
options (lazy/normal/aggressive).
[1] https://en.wikipedia.org/wiki/Token_bucket
[2] http://en.wikipedia.org/wiki/Token_bucket#Hierarchical_token_bucket
> Change-Id: Icc49af80eeab6adb60166d0810e69ef37cfe2fd8
> BUG: 1207020
> Signed-off-by: Venky Shankar <vshankar@redhat.com>
> Reviewed-on: http://review.gluster.org/10307
> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
> Tested-by: Vijay Bellur <vbellur@redhat.com>
Change-Id: I034ba1095aa3bfc3212a67a63ffb931431b372f6
Signed-off-by: Venky Shankar <vshankar@redhat.com>
BUG: 1220041
Reviewed-on: http://review.gluster.org/10719
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
| |
Instead of "trusted.glusterfs.bit-rot.*" use "trusted.bit-rot.*"
NOTE:
With this patch, data on existing volumes would be re-signed
(which should be OK as of now since we do not expect many
users as of now :-))
> Change-Id: I926c7bca266a9c8f2cb35d57c4d0359aa5cecfa0
> BUG: 1170075
> Signed-off-by: Venky Shankar <vshankar@redhat.com>
> Reviewed-on: http://review.gluster.org/10181
> Tested-by: NetBSD Build System
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Change-Id: I3c18d7dc2db4beaca6e8d8d231b4171a7b18795f
Signed-off-by: Venky Shankar <vshankar@redhat.com>
BUG: 1220041
Reviewed-on: http://review.gluster.org/10718
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
| |
> Change-Id: I761927ea263b4144b851881f25791fda5b794f59
> BUG: 1170075
> Signed-off-by: Venky Shankar <vshankar@redhat.com>
> Reviewed-on: http://review.gluster.org/10381
> Tested-by: NetBSD Build System
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Change-Id: I4aa7c0d8b42b4c8d14a1d810e54c2de4d52b4389
Signed-off-by: Venky Shankar <vshankar@redhat.com>
BUG: 1220041
Reviewed-on: http://review.gluster.org/10717
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
| |
Currently, whatever bitrot/scrubber tunable value the user sets for one
volume is applied to all other volumes as well.
Each volume should act on its own bitrot/scrubber tunable
value.
To handle bitrot/scrubber tunable values independently for
each volume, the bitrot and scrubber translators should run separately
for each volume.
Change-Id: I1d9379508afe6cfd2f78e3ebf29c829c362d84a9
BUG: 1218048
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10352
(cherry picked from commit f81deb95db417eeededf7442a30304a880cc8169)
Reviewed-on: http://review.gluster.org/10516
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
| |
This patch adds support for xdata in both the
request and response path of syncops.
A few calls like lookup already had the support;
variables have been renamed in a few places to maintain
uniformity.
xdata passed downwards is known as xdata_in
and xdata passed upwards is known as xdata_out.
There is an old patch by Jeff Darcy at
http://review.gluster.org/#/c/8769/3 which does the
same for some selected calls. It also brings in
xdata support at gfapi level.
xdata support at gfapi level would be introduced
in subsequent patches.
Change-Id: I340e94ebaf2a38e160e65bc30732e8fe1c532dcc
BUG: 1158621
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-on: http://review.gluster.org/9859
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
This patch fixes a handful of problems with the scrubber, which
are detailed below.
The scrubber used to skip objects for verification due to a missing
fd interface to fetch versioning extended attributes. Similar
to the inode interface, an fd based interface in POSIX is now
introduced.
Moreover, this patch also fixes potential false reporting by
the scrubber due to:
An object gets dirtied and signed while the scrubber is busy
calculating the object checksum. This is fixed by caching the
signed version when an object is first inspected for
staleness, i.e., during the pre-compute stage. This version is
used to verify the checksum in the post-compute stage when the
signatures are compared for possible corruption.
A side effect of _not_ sending the signature length during signing
was that a "truncated" signature could be set for an object.
Now, at the time of signing, the signature length is sent
and is used in place of invoking strlen() to get the signature
length (the signature could contain NUL bytes). The signature length
itself is not persisted in the signature xattr, but is
calculated on-the-fly by subtracting the "structure" header
size from the xattr length.
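To make the truncation problem concrete, here is a small sketch. The header layout (struct sig_header) and the helper name are assumptions for illustration only and do not reflect the real on-disk signature structure. It shows why strlen() is unsuitable for binary digests and how the length can instead be derived from the xattr size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size header stored in front of the raw signature
 * bytes inside the xattr value; the real layout may differ. */
struct sig_header {
    uint8_t  type;
    uint64_t version;
};

/* Signature length derived from the xattr length rather than stored:
 * everything after the header is signature. Returns 0 if the xattr
 * is too short to even hold the header. */
static size_t signature_length(size_t xattr_len)
{
    if (xattr_len < sizeof(struct sig_header))
        return 0;
    return xattr_len - sizeof(struct sig_header);
}
```

For an 8-byte digest whose third byte is zero, strlen() would report 2, whereas signature_length(sizeof(struct sig_header) + 8) reports the full 8 bytes.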
Some of the log entries are made more meaningful (as an aid
for debugging).
Change-Id: I938bee5aea6688d5d99eb2640053613af86d6269
BUG: 1207624
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10118
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
glusterfs relies on the Linux uuid implementation, whose
API is incompatible with most other systems' uuid. As
a result, libglusterfs has to embed contrib/uuid,
which is the Linux implementation, on non-Linux systems.
This implementation is incompatible with the system's
built-in one, but the symbols have the same names.
Usually this is not a problem because when we link
with -lglusterfs, libc's symbols are trumped. However
there is a problem when a program not linked with
-lglusterfs dlopen()s a glusterfs component. In
such a case, libc's uuid implementation is already
loaded in the calling program, and it will be used
instead of libglusterfs's implementation, causing
crashes.
A possible workaround is to pre-load libglusterfs
in the calling program (using LD_PRELOAD on NetBSD for
instance), but such a mechanism is not portable, nor
is it flexible. A much better approach is to rename
libglusterfs's uuid_* functions to gf_uuid_* to avoid
any possible conflict. This is what this change attempts.
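The renaming idea can be illustrated with a small sketch. The
gf_uuid_* prefix comes from this change; the type example_uuid_t
and the function bodies below are assumptions written for the
example, not the libglusterfs implementation. Because the
symbols carry their own prefix, they cannot collide with a
uuid_copy() or uuid_compare() already loaded from libc.

```c
#include <string.h>

/* Illustrative 16-byte UUID type (hypothetical name). */
typedef unsigned char example_uuid_t[16];

/* Prefixed helpers: distinct symbol names, so a system-provided
 * uuid_copy()/uuid_compare()/uuid_is_null() is never picked up
 * in their place at link or dlopen() time. */
void gf_uuid_copy(example_uuid_t dst, const example_uuid_t src)
{
    memcpy(dst, src, sizeof(example_uuid_t));
}

int gf_uuid_compare(const example_uuid_t a, const example_uuid_t b)
{
    return memcmp(a, b, sizeof(example_uuid_t));
}

int gf_uuid_is_null(const example_uuid_t u)
{
    size_t i;

    for (i = 0; i < sizeof(example_uuid_t); i++) {
        if (u[i] != 0)
            return 0;
    }
    return 1;
}
```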
BUG: 1206587
Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/10017
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Coverity fixes:
CID 1124725
CID 1291742
The problem is that gf_tw_cleanup_timers() frees the
priv->timer_wheel handed in to it but cannot set the caller's
pointer to NULL, so subsequent checks for priv->timer_wheel show
it as not NULL and allow for access after free.
The proper change might be to change gf_tw_cleanup_timers() to
take a reference to the pointer and set it to NULL after free,
but since it is under contrib/, I did not want to change that
function. Instead this patch uses the function's return code
which was not used previously. (Maybe this should even be done
in a wrapper macro or function?)
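The wrapper idea floated above could look roughly like this. It
is a sketch under stated assumptions: example_cleanup_timers()
is a stand-in for the contrib/ cleanup function and is assumed
to return 0 on success, and all struct and function names are
hypothetical. The wrapper uses the return code to decide when to
reset the caller's pointer.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the timer wheel and the private
 * structure that holds a pointer to it. */
struct example_wheel { int slots; };
struct example_priv  { struct example_wheel *timer_wheel; };

/* Stand-in for the cleanup function: it frees the wheel but,
 * taking the pointer by value, cannot NULL the caller's copy.
 * Assumed to return 0 on success and -1 on a NULL argument. */
int example_cleanup_timers(struct example_wheel *wheel)
{
    if (wheel == NULL)
        return -1;
    free(wheel);
    return 0;
}

/* Wrapper: on successful cleanup, reset the stored pointer so
 * later "is it NULL?" checks do not lead to use after free. */
int example_priv_cleanup(struct example_priv *priv)
{
    int ret = example_cleanup_timers(priv->timer_wheel);

    if (ret == 0)
        priv->timer_wheel = NULL;
    return ret;
}
```

Calling the wrapper a second time is then harmless: the pointer
is already NULL and the stand-in cleanup function rejects it.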
Change-Id: I31d80d3df2e4dc7503d62c7819429e1a388fdfdd
BUG: 789278
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-on: http://review.gluster.org/10056
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes Coverity CIDs 1291728, 1291723, 1291732.
Change-Id: I62f3d540cac0f555fe2839b8418e59691c3ff4fd
BUG: 789278
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-on: http://review.gluster.org/10055
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scrubber performs signature verification for objects that were
signed by signer. This is done by recalculating the signature
(using the hash algorithm the object was signed with) and
verifying it against the object's persisted signature. Since the
object could be undergoing an IO operation at the time of hash
calculation, the signature may not match the object's persisted
signature. Bitrot stub provides additional information about
the staleness of an object's signature (determined by its
versioning mechanism). This additional bit of information is
used by scrubber to determine the staleness of the signature,
and in such cases the object is skipped for verification. The
signature staleness check is performed twice: once before
initiation of hash calculation and once after it (an object
could be modified after the first staleness check).
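The two-stage staleness check can be sketched as a small
decision function. This is an illustration only: the state
struct, the enum, and the function name are hypothetical, and
the hash is assumed to have been computed elsewhere between the
two snapshots of the object's signature state.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_SIG_LEN 32

enum example_scrub_result {
    EXAMPLE_SCRUB_SKIPPED,  /* signature stale, nothing to verify */
    EXAMPLE_SCRUB_OK,       /* computed hash matches signature */
    EXAMPLE_SCRUB_CORRUPT   /* mismatch on a non-stale signature */
};

/* Snapshot of what the stub reports about an object's signature
 * (hypothetical layout). */
struct example_sig_state {
    bool          stale;
    uint64_t      version;
    unsigned char signature[EXAMPLE_SIG_LEN];
};

/* pre:  state fetched before hash calculation started
 * post: state fetched after hash calculation finished
 * computed: hash of the object data calculated in between */
enum example_scrub_result
example_scrub_verdict(const struct example_sig_state *pre,
                      const struct example_sig_state *post,
                      const unsigned char *computed)
{
    /* First check: do not bother with an already stale signature. */
    if (pre->stale)
        return EXAMPLE_SCRUB_SKIPPED;

    /* Second check: the object was modified or re-signed while
     * it was being hashed, so the comparison would be unreliable. */
    if (post->stale || post->version != pre->version)
        return EXAMPLE_SCRUB_SKIPPED;

    /* Compare against the signature captured before hashing. */
    if (memcmp(computed, pre->signature, EXAMPLE_SIG_LEN) == 0)
        return EXAMPLE_SCRUB_OK;
    return EXAMPLE_SCRUB_CORRUPT;
}
```

An object is reported as corrupt only when both snapshots agree
that the signature is current and the recomputed hash still
differs from it.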
The implementation is a part of the bitrot xlator (signer) which
acts as a signer or scrubber based on a translator option. As
of now the scrub process runs continuously (but has some form of
weak throttling mechanism during filesystem scan). Going forward,
there needs to be some form of scrub scheduling and IO throttling
(during hash calculation) tunables (via CLI).
Change-Id: I665ce90208f6074b98c5a1dd841ce776627cc6f9
BUG: 1170075
Original-Author: Raghavendra Bhat <rabhat@redhat.com>
Original-Author: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9914
Tested-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
This is the "Signer" -- responsible for signing files with their
checksums upon last file descriptor close (last release()).
The event notification facility provided by the changelog xlator
is made use of.
Moreover, checksums are as of now the SHA256 hash of the object
data, which is the only available hash at this point in time.
Therefore, there is no special "what hash to use" type check,
although it does not take much to add various hashing algorithms
to sign objects with. Signatures are stored in extended attributes
of the objects along with the type of hashing used to calculate
the signature. This makes things future-proof when other hash
types are added. The signature infrastructure is provided by
bitrot stub: a little piece of code that sits over the POSIX
xlator providing interfaces to "get or set" an object's signature
and its staleness.
Since objects are signed upon receiving release() notification,
pre-existing data which are "never" modified would never be
signed. To counter this, an initial crawler thread is spawned.
The crawler scans the entire brick for objects that are unsigned
or "missed" signing due to the server going offline (node reboots,
crashes, etc.) and triggers an explicit sign. This would also
sign objects when bit-rot is enabled for a volume and/or after
upgrade.
Change-Id: I1d9a98bee6cad1c39c35c53c8fb0fc4bad2bf67b
BUG: 1170075
Original-Author: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9711
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|