| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The number of signing process threads (glfs_brpobj)
is set to 4 by default. The recommendation is to set
it to number of cores available. This patch makes it
configurable as follows
gluster vol bitrot <volname> signer-threads <count>
fixes: bz#1797869
Change-Id: Ia883b3e5e34e0bc8d095243508d320c9c9c58adc
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libglusterfs devel package headers are referenced in code using
include semantics for a program, this while it works can be better
especially when dealing with out of tree xlator builds or in
general out of tree devel package usage.
Towards this, the following changes are done,
- moved all devel headers under a glusterfs directory
- Included these headers using system header notation <> in all
code outside of libglusterfs
- Included these headers using own program notation "" within
libglusterfs
This change although big, is just moving around the headers and
making it correct when including these headers from other sources.
This helps us correctly include libglusterfs includes without
namespace conflicts.
Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
|
|
|
| |
Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. br_update_scrub_finish_time: BUFFER_SIZE_WARNING
2. br_read_bad_object_dir : DEADCODE
3. bit-rot.c: init : RESOURCE_LEAK
4. br_stub_fsetxattr : STACK_USE
5. br_stub_setxattr : STACK_USE
6. bit-rot-stub.c: init : BUFFER_SIZE_WARNING
Change-Id: Ie620f431bd7548fedae2152aa756ccdcd89ddf89
Signed-off-by: Kotresh HR <khiremat@redhat.com>
BUG: 789278
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since throttling is a separate feature by itself,
move throttling code to libglusterfs.
Change-Id: If9b99885ceb46e5b1865a4af18b2a2caecf59972
BUG: 1352019
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/14846
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a short series of patches (with other cleanups) aimed at
cleaning up some of the incorrect assumptions taken in reconfigure()
leading to crashes when subvolumes are not fully initialized (as
reported here[1] on gluster-devel@). Furthermore, there is some
amount of code cleanup to handle disconnection and cleanup up data
structure (as part of subsequent patch).
[1] http://www.gluster.org/pipermail/gluster-devel/2015-June/045410.html
Change-Id: I68ac4bccfbac4bf02fcc31615bd7d2d191021132
BUG: 1231617
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/11147
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current signing interface (fsetxattr()) had couple of issues:
One, a signing request (by bitrot daemon) is denied if the version
against which an object is to be signed is unequal to the current
version of the object (cases where another subsequent modification
increments the version). Such request(s) are rejected with EINVAL
sent back to the signer resulting in a bunch of errors (in logs)
reported by bitrot daemon. Although, the object would be eventaully
signed with the version matching the current version, the "lagging"
request should be correctly handled.
Two, more than one signing request could race against each other
with the object getting signed with a version depending on which
request ended up last in the race. Although harmless to some extent,
such a case could end up marking the object's signature as stale
for infinity (if the object is *never* touched) thereby resulting
in scrubber skipping the object during verification.
This patch fixes these issues by ordering signing request(s) and
fixing version comparison checks at the time of signing.
Change-Id: I9fa83dfa3be664ba4db61d7f2edc408f4bde77dd
BUG: 1221938
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10832
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch reimplments existing scrub-frequency mechanism used
to schedule scrubber runs. Existing mechanism uses periodic
sleeps (waking up periodically on minimum granularity) and
performing a number of tracking checks based on counters and
sleep times. This patch does away with all the nifty counters
and uses timer-wheel to schedule scrub runs.
Scheduling changes are peformed by merely calculating the new
expiry time and calling mod_timer() [mod_timer_pending() in
some cases] making the code more debuggable and easier to
follow. This also introduces "hourly" scrubbing tunable as an
aid for testing scrubbing during development/testing cycle.
One could also implement on-demand scrubbing with ease: by
invoking mod_timer() with an expiry of one (1) second, thereby
scheduling a scrub run the very next second.
Change-Id: I6c7c5f0c6c9f886bf574d88c04cde14b76e60a8b
BUG: 1224596
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10893
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of open
* This patch brings in the changes where object versioning is done in write and
truncate fops instead of tracking them in open and create fops. This model
works for both regular and anonymous fds. It also removes the race associated
with open calls, create and lookups.
This patch follows the below method for object versioning and notifications:
Before sending writev on the fd, increase the ongoing
version first. This makes anonymous fd write similar to the regular
fd write by having the ongoing version increased before doing the
write.
Do following steps to do versioning:
1) For anonymous fds set the fd context (so that release is invoked) and add
the fd context to the list maintained in the inode context.
For regular fds the above think would have been done in open itself.
2) Increase the on-disk ongoing version
3) Increase the in memory ongoing version and mark inode as non-dirty
3) Once versioning is successfully done send write operation. If
versioning fails, then fail the write fop.
5) In writev_cbk mark inode as modified.
Change-Id: I7104391bbe076d8fc49b68745d2ec29a6e92476c
BUG: 1207979
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/10233
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces multithreaded filesystem scrubber based
on throttling option configured for a particular volume. The
implementation "logically" breaks scanning and scrubbing with
the number of scrubber threads auto-configured depending upon
the throttle configuration. Scanning (crawling) is left single
threaded (per brick) with entries scrubbed in bulk. On reaching
this "bulk" watermark, scanner waits until entries are scrubbed.
Bricks for a particular volume have a set of thread(s) assigned
for scrubbing, with entries for each brick scrubbed in a round
robin fashion to avoid scrub "stalls" when a brick (out of N
bricks) is under active scrubbing.
This mechanism helps us implement "pause/resume" with ease: all
one need to do is to cleanup scrubber threads and let the main
scanner thread "wait" untill scrubbing is resumed (where the
scrubber thread(s) are spawned again), therefore continuing
where we left off (unless we restart the deamons, where crawl
initiates from root directory again, but I guess that's OK).
[
NOTE:
Throttling is optional for the signer daemon, without which
it runs full throttle. However, passing "-DBR_RATE_LIMIT_SIGNER"
predefined in CFLAGS enables CPU throttling (during checksum
calculation) thereby avoiding high CPU usage.
]
Subsequent patches would introduce CPU throttling during hash
calculation for scrubber.
Change-Id: I5701dd6cd4dff27ca3144ac5e3798a2216b39d4f
BUG: 1207020
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10511
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BitRot daemons (signer & scrubber) are disk/cpu hoggers when left
running full throttle. Checksum calculations (especially SHA family
of hash routines) can be quite CPU intensive. Moreover periodic
disk scans performed by scrubber followed by reading data blocks
for hash calculation (which is also done by signer) generate lot
of heavy IO request(s). This causes interference with actual client
operations (be it a regular client or filesystems daemons such as
self-heal, etc..) and results in degraded system performance.
This patch introduces throttling based on Token Bucket Filtering[1].
It's a well known algorithm for checking (and ensuring) that data
transmission conform to defined limits and generally used in packet
switched networks. Linux control groups (Cgroups) uses a variant[2]
of this algorithm to provide block device IO throttling (cgroup
subsys "blkio": blk-iothrottle).
So, why not just live with Cgroups?
Cgroups is linux specific. We need to have a throttling mechanism
for other supported UNIXes. Moreover, having our own implementation
gives much more finer control in terms of tuning it for our needs
(plus the simplicity of the alogorithm itself).
Ideally, throttling should be a part of server stack (either as a
separate translator or integrated with io-threads) since that's
the point of entry for IO request(s) from *all* client(s). That
way one could selectively throttle IO request(s) based on client
PIDs (frame->root->pid), e.g., self-heal daemon, bitrot, etc..
(*actual* clients can run full throttle). This implementation
avoids that deliberately (there needs to be a much more smarter
queueing mechanism) and throttles CPU usage for hash calculations.
This patch is just the infrastructure part with no interfaces
exposed to set various throttling values. The tunable selected
here (basically hardcoded) avoids 100% CPU usage during hash
calculation (with some bursts cycles). We'd need much more
intensive test(s) to assign values for various throttling
options (lazy/normal/aggressive).
[1] https://en.wikipedia.org/wiki/Token_bucket
[2] http://en.wikipedia.org/wiki/Token_bucket#Hierarchical_token_bucket
Change-Id: Icc49af80eeab6adb60166d0810e69ef37cfe2fd8
BUG: 1207020
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10307
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the "Signer" -- responsible for signing files with their
checksums upon last file descriptor close (last release()).
The event notification facility provided by the changelog xlator
is made use of.
Moreover, checksums are as of now SHA256 hash of the object data
and is the only available hash at this point of time. Therefore,
there is no special "what hash to use" type check, although it's
does not take much to add various hashing algorithms to sign
objects with. Signatures are stored in extended attributes of the
objects along with the the type of hashing used to calculate the
signature. This makes thing future proof when other hash types
are added. The signature infrastructure is provided by bitrot
stub: a little piece of code that sits over the POSIX xlator
providing interfaces to "get or set" objects signature and it's
staleness.
Since objects are signed upon receiving release() notification,
pre-existing data which are "never" modified would never be
signed. To counter this, an initial crawler thread is spawned
The crawler scans the entire brick for objects that are unsigned
or "missed" signing due to the server going offline (node reboots,
crashes, etc..) and triggers an explicit sign. This would also
sign objects when bit-rot is enabled for a volume and/or after
upgrade.
Change-Id: I1d9a98bee6cad1c39c35c53c8fb0fc4bad2bf67b
BUG: 1170075
Original-Author: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9711
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
Bitrot stub implements object versioning required for identifying
signature freshness. More details about versioning is explained
as a part of the "bitrot feature documentation" patch.
Change-Id: I2ad70d9eb109ba4a12148ab8d81336afda529ad9
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9709
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|