| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Let bit-rot stub check both on disk ongoing version, signed version xattrs and
the in memory flags in the inode and then decide whether the inode is stale or
not. This information is used by one shot crawler in BitD to decide whether to
trigger the sign for the object or skip it.
NOTE: The above check should be done only for BitD. For scrubber its still the
old way of comparing on disk ongoing version with signed version.
* BitD's one shot crawler should not sign zero byte objects if they do not contain
signature. (Means the object was just created and nothing was written to it).
Change-Id: I6941aefc2981bf79a6aeb476e660f79908e165a8
BUG: 1224611
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/10947
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When disk quota exceeded, quota enforcer logs
alert message, so no need to log error message
as this can fill up the log file
Change-Id: Ia913f47bc0cedb7c0a9c611330ee5124d3bb6c9d
BUG: 1229609
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/11135
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently bitrot using 120 second waiting time for object to be signed
after all fop's released. This signing waiting time value should be tunable.
Command for changing the signing waiting time will be
#gluster volume bitrot <VOLNAME> signing-time <waiting time value in second>
Change-Id: I89f3121564c1bbd0825f60aae6147413a2fbd798
BUG: 1228680
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/11105
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I5e3df8860ea35bce14a802391be9b22ad64f1ad4
BUG: 1075611
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/7574
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
http://review.gluster.org/10342 introduced a cleanup thread for expired
client entries. When enabling the 'features.cache-invalidation' volume
option, the brick process starts to run in a busy-loop. Obviously this
is not intentional, and a process occupying 100% of the cycles on a CPU
or core is not wanted.
Change-Id: I453c612d72001f4d8bbecdd5ac07aaed75b43914
BUG: 1200267
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/11198
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
HTIME_KEY marks the last changelog rolled over. The xattr is
maintained on .glusterfs/changelog/htime/HTIME.TSTAMP file.
On every rollover of the changelog file, the xattr is updated.
It is being updated with XATTR_REPLACE flag as xattr gets
created during changelog enable. But it is once found that
the xattrs on the file is cleared and is not reproduced later
on. This patch protects that case, if it happens by setting
xattr without XATTR_REPLACE flag in failure case.
The reason behind doing this in failure case is not to mask
the actual cause of xattrs getting cleared. This provides
the log message if the original issue still exists but the
consequential effects are fixed.
Also changed the log messages to depict the events happened
during changelog enable.
Change-Id: I699ed09a03667fd823d01d65c9c360fa7bc0e455
BUG: 1230015
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/11150
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is handling the comment on 3.7 backport
http://review.gluster.com/11047
Change-Id: I253a8e0bfebde63f25c839efd55ed3cc30e6cd9b
BUG: 1226276
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11076
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When a file is renamed and the (renamed)file's Hashing
falls into a different brick, DHT creates a special file(linkto file)
in the brick(Hashed subvolume) and carries out setattr operation
on that file.
Currently, Changelog records this(setattr) operation in Hashed
subvolume. glusterfind in turn records this operation
as MODIFY operation.
So, there is a NEW entry in Cached subvolume and MODIFY entry
in Hashed subvolume for the same file.
Solution:
Avoid logging setattr operation carried out, by
marking the operation as internal fop using xdata.
In changelog translator, check whether setattr is set
as internal fop and skip accordingly.
Change-Id: I21b09afb5a638b88a4ccb822442216680b7b74fd
BUG: 1230007
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/11137
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The problem was in marker xlator, where during
rename a NULL value is passed during STACK_WIND.
Marker needs to pass the xdata un-modified to next
translator if marker does not rely on that.
Change-Id: I9e47e504fd241263987645abfed7ca13c0d54a80
BUG: 1228492
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/11089
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I66e7ccc5e62482c3ecf0aab302568e6c9ecdc05d
BUG: 1194640
Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
Reviewed-on: http://review.gluster.org/10938
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Joseph Fernandes
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Server-side internally generated fops like 'quota/marker' will
not have any client associated with the frame. Hence we need a
check for clients to be valid before processing for upcall cache
invalidation. Also fixed an issue with initializing reaper-thread.
Added a testcase to test the fix.
Change-Id: If7419b98aca383f4b80711c10fef2e0b32498c57
BUG: 1227204
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10909
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch remove stores to variables that are no longer live.
Change-Id: Ib6acd8c70cbb7ea875c01b7cfd6620ac1d641d36
BUG: 1223378
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Reviewed-on: http://review.gluster.org/10841
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Missing loc_wipe() for error paths in mq_readdir_cbk() can
cause memory leaks. loc_wipe() is now done for both happy
and unhappy paths.
Change-Id: I882aa5dcca06e25b56a828767fb2b91a1efaf83b
BUG: 1227904
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/11074
Reviewed-by: Sachin Pandit <spandit@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we recieve a statfs call on snap directory, we will redirect
the call into the root, by creating a new root loc. So it is better to
take a ref on the root inode.
(http://review.gluster.org/#/c/10358/5/xlators/features/
snapview-client/src/snapview-client.c)
Change-Id: I5649addac442d391b2550346b115dec58fed5b86
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/10750
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I0b44b70f07be441e044d9dfc5c2b64bd5b4cac18
BUG: 1207735
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11045
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
(f)stat, unlink and rename must skip doing inode_ctx_get()
of shard block size on symbolic links.
Change-Id: I68688532164dd2ab491ff5c59b343174f8c4ce7f
BUG: 1223759
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10995
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Due to get_lowest_block() being a macro, what needs to be passed
to it is the evaluation of the expression (local->offset - 1), without
which its substitution can cause junk values to be assigned to
local->first_block.
This patch also fixes calls to get_highest_block() where if offset and
size are both equal to zero, it could return negative values.
Change-Id: I3ae918a0a3251ffd9ce8d2294bc5f9b681447627
BUG: 1200082
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10804
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If postbuf in quota_writev_cbk is NULL directly
an unwind should be done. Trying to dereference
it will lead to a crash.
Change-Id: Idba6ce3cd1bbf37ede96c7f17d01007d6c07057a
BUG: 1221577
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/10898
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When compiled with gcc5, following warnings were displayed
and volume start failed:
changelog-helpers.h:499:1: warning: inline function 'changelog_dispatch_event'
declared but never defined
changelog_dispatch_event (xlator_t *, changelog_priv_t *, changelog_event_t *);
gf-changelog-journal-handler.c:692:17: warning: 'list_add_tail' is static but
used in inline function 'gf_changelog_queue_journal' which is not static
list_add_tail (&entry->list, &jnl_proc->entries);
Fix is to remove the keyword from function prototype and
definitions.
Change-Id: I188b35b7ca087a94d7a48a052b05a6d845e3b74b
BUG: 1226307
Signed-off-by: Anoop C S <achiraya@redhat.com>
Reviewed-on: http://review.gluster.org/11004
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Suppose if there are two volumes vol1 and vol2,
and quota is enabled and limit is set on vol1.
Now if IO is happening on vol1 and quota is enabled/disabled
on vol2, quotad gets restarted and client will receive
ENOTCONN in the IO path of vol1.
This patch will retry connecting to quotad upto 60sec
in a interval of 5sec (12 retries)
If not able to connect with 12 retries, then return ENOTCONN
Change-Id: Ie7f5d108633ec68ba9cc3a6a61d79680485193e8
BUG: 1211220
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/10230
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ancestry building.
We do quota_build_ancestry in function 'quota_get_limit_dir',
suppose if quota_build_ancestry fails, then we don't have a
frame saved to continue the statfs FOP and client can hang.
Change-Id: I92e25c1510d09444b9d4810afdb6b2a69dcd92c0
BUG: 1178619
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/9380
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When readdir(p) is performed on '/' and ".shard" happens to be
the last of the entries read in a given iteration of dht_readdir(p)
(in other words the entry with the highest offset in the dirent list
sorted in ascending order of d_offs), shard xlator would delete this
entry as part of handling the call so as to avoid exposing its presence
to the application. This would cause xlators above (like fuse,
readdir-ahead etc) to wind the next readdirp as part of the same req
at an offset which is (now) the highest d_off (post deletion of .shard)
from the previously unwound list of entries. This offset would be less
than that of ".shard" and therefore cause /.shard to be read once again.
If by any chance this happens to be the only entry until end-of-directory,
shard xlator would delete this entry and unwind with 0 entries, causing the
xlator(s) above to think there is nothing more to readdir and the fop is
complete. This would prevent DHT from gathering entries from the rest of
its subvolumes, causing some entries to disappear.
Fix:
At the level of shard xlator, if ".shard" happens to be the last entry,
make shard xlator wind another readdirp at offset equal to d_off of
".shard". That way, if ".shard" happens to be the only other entry under '/'
until end-of-directory, DHT would receive an op_ret=0. This would enable it
to wind readdir(p) on the rest of its subvols and gather the complete picture.
Also, fixed a bug in shard_lookup_cbk() wherein file_size should be fetched
unconditionally in cbk since it is set unconditionally in the wind path, failing
which, lookup would be unwound with ia_size and ia_blocks only equal to that of
the base file.
Change-Id: I6c2bc770f1bcdad51c273c777ae0b42c88c53f61
BUG: 1222379
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10809
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the absence of mknod() fop implementation in bitrot stub,
further operations that trigger versioning resulted in crashes
as they expect the inode context to be valid. Therefore, this
patch implements mknod() following similar simantics to fops
such as create().
Furthermore, bitrot stub test C program is fixed to stop lying
and validate obj versions according to the versioning protocol.
Change-Id: If76f252577445d1851d6c13c7e969e864e2183ef
BUG: 1221914
Original-Author: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10790
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During snapshot, changelog barrier is enabled and a
explicit rollover of changelog is initiated. During
rollover of changelog, if any error or changelog is
empty, the notification was not sent to reconfigure
and hence snapshot was failing because of timeout.
This patch addresses it by sending notification
irrespective of failures and sends error if any
back to barrier.
Change-Id: I898af624b44555281a9e43c69066077e0e121c17
BUG: 1225542
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10951
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current signing interface (fsetxattr()) had couple of issues:
One, a signing request (by bitrot daemon) is denied if the version
against which an object is to be signed is unequal to the current
version of the object (cases where another subsequent modification
increments the version). Such request(s) are rejected with EINVAL
sent back to the signer resulting in a bunch of errors (in logs)
reported by bitrot daemon. Although, the object would be eventaully
signed with the version matching the current version, the "lagging"
request should be correctly handled.
Two, more than one signing request could race against each other
with the object getting signed with a version depending on which
request ended up last in the race. Although harmless to some extent,
such a case could end up marking the object's signature as stale
for infinity (if the object is *never* touched) thereby resulting
in scrubber skipping the object during verification.
This patch fixes these issues by ordering signing request(s) and
fixing version comparison checks at the time of signing.
Change-Id: I9fa83dfa3be664ba4db61d7f2edc408f4bde77dd
BUG: 1221938
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10832
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Root inode doesn't participate in ref/unref. Don't do it
in fini as by the time fini is called itable would be destroyed.
BUG: 1226276
Change-Id: I704d0a3c0813cb8f6c3f1f7d613c89aca8f4f9ad
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11002
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of including config.h in each file, and have the additional
config.h included from the compiler commandline (-include option).
When a .c file tests for a certain #define, and config.h was not
included, incorrect assumtions were made. With this change, it can not
happen again.
BUG: 1222319
Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/10808
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Locks can be taken just to inspect the data as well, so allow them.
Xattrops are internal fops so we can allow them as well as longs as
it doesn't change the xattr value, i.e. All-zero xattrop.
Change-Id: Idc06d2043eb472c064db40d811a80058f0bda378
BUG: 1211123
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10727
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: NetBSD Build System
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Brick connection was bloated (and not implemented efficiently) with
calls which were not required to be called under lock. This resulted
in starvation of lock by critical code paths. This eventally did not
scale when the number of bricks per volume increases (add-brick and
the likes).
Also, this patch cleans up some of the weird reconnection logic that
added more to the starvation of resources and cleans up uncontrolled
growing of log files.
Change-Id: I05e737f2a9742944a4a543327d167de2489236a4
BUG: 1207134
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/10763
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: NetBSD Build System
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch reimplments existing scrub-frequency mechanism used
to schedule scrubber runs. Existing mechanism uses periodic
sleeps (waking up periodically on minimum granularity) and
performing a number of tracking checks based on counters and
sleep times. This patch does away with all the nifty counters
and uses timer-wheel to schedule scrub runs.
Scheduling changes are peformed by merely calculating the new
expiry time and calling mod_timer() [mod_timer_pending() in
some cases] making the code more debuggable and easier to
follow. This also introduces "hourly" scrubbing tunable as an
aid for testing scrubbing during development/testing cycle.
One could also implement on-demand scrubbing with ease: by
invoking mod_timer() with an expiry of one (1) second, thereby
scheduling a scrub run the very next second.
Change-Id: I6c7c5f0c6c9f886bf574d88c04cde14b76e60a8b
BUG: 1224596
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10893
Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch refactors the signing trigger mechanism used by bitrot
daemon as a "catch up" meachanism to sign files which _missed_
signing on the last run either due to bitrot being disabled and
enabled again or if bitrot is enabled for a volume with existing
data.
Existing implementation relies on overloading writev() to trigger
signing which just by the looks sounded dangerous and I hated it
to the core. This change moves all that business to the setxattr
interface thereby keeping the writev path strictly for client
IO.
Why not use IPC fop to trigger signing?
There's a need to access the object's inode to perform various
maintainance operations. inode is not _directly_ accessible in
the IPC fop (although, it can be found via inode_grep() for the
object's GFID - the inode just needs to be pinned in memory,
which is the case if there's an active fd on the inode). This
patch relies on good old technique of overloading fsetxattr()
to do the job instead of using IPC fop.
There are some pretty nice cleanups along the lines of memory
deallocations, unncessary allocations and redundant ref()ing
of structures (such as fd's) provided by this patch. All in
all - much improved code navigation.
Change-Id: Id93fe90b1618802d1a95a5072517dac342b96cb8
BUG: 1224600
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10942
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Iaa7022c95a8d9c9c471db025ec644e0bcc4eeb29
BUG: 1221104
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10772
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During ancestry build, loc path was set to invalid
path. path was set to one of its child instead
of itself. Because of this quota accounting was
going wrong
This patch fix the issue
Below mentioned tests removed from bad test list
as part of patch# 10930
./tests/basic/ec/quota.t
./tests/basic/quota-nfs.t
./tests/bugs/quota/bug-1035576.t
Change-Id: Iaa65b2d968c04c9abcd476d0e9f588cb7fd39294
BUG: 1223798
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/10918
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch getxattr of inodelk/entrylk counts can be requested in
readv/writev/create/unlink/opendir.
Change-Id: If7430317ad478a3c753eb33bdf89046cb001a904
BUG: 1165041
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10880
Tested-by: NetBSD Build System
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
snapview-client and snapview-server doesnot have statfs
fop implemented
Change-Id: I2cdd4c5784414b0549a01af9a28dbc723b7cdc67
BUG: 1176837
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/10358
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently scrubber is crawling all the files continuously. It should
crawl files based on the scrubber frequency which user have set.
By default scrubber crawling frequency value will be biweekly.
Change-Id: I5762a92c1e700134cfe4283d1f631904adbfe31d
BUG: 1208131
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10602
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Log typo fix for CTR Xlator and Libgfdb
Change-Id: I8d7208b9756199f8dc0a72771a3c953b4fa00f3f
BUG: 1215896
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10668
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Background:
Glusterfs changelogs are stored in each brick, which records the changes
happened in that brick. Georep will run in all the nodes of master and processes
changelogs "independently". Processing changelogs is in brick level, but
all the fops will be replayed on "slave mount" point.
Problem:
With a DHT volume, in changelog "internal fops" are NOT recorded.
For Rename case, Rename is recorded in "hashed" brick changelog.
(DHT's internal fops like creating linkto file, unlink is NOT recorded).
This lead us to inconsistent rename operations.
For example,
Distribute volume created with Two bricks B1, B2.
//Consider master volume mounted @ /mnt/master and following operations
executed:
cd /mnt/master
touch f1 // f1 falls on B1 Hash
mv f1 f2 // f2 falls on B2 Hash
// Here, Changelogs are recorded as below:
@B1
CREATE f1
@B2
RENAME f1 f2
Here, race exists between Brick B1 and B2, say B2 will get executed first.
Source file f1 itself is "NOT PRESENT", so it will go ahead and create
f2 (Current implementation).
We have this problem When rename falls in another brick and
file is unlinked in Master.
Similar kind of issue exists in following case too(multiple rename):
CREATE f1
RENAME f1 f2
RENAME f2 f1
Solution:
Instead of carrying out "changelogging" at "HASHED volume",
carry out at the "CACHED volume".
This way we have rename operations carried out where actual files are present.
So,Changelog recorded as :
@B1
CREATE f1
RENAME f1 f2
Note:
This patch is dependent on dht changes from this patch.
http://review.gluster.org/10410/
changelog related changes are separated out for review.
In changelog, xdata passed from DHT is considered as :
1. In case of unlink (internal operation as part of rename), xdata value
is set , it is considered as RENAME and recorded accordingly.
2. In case of rename (Hash and Cache different), xdata value is NOT
set, recording rename operation is SKIPPED.
BUG: 1141379
Change-Id: Ifca675e6d4ef8c4e3b7ef4a7ec85de8b3a38dc08
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/10220
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ie7e7c6028c7bffe47e60a2e93827e0e8767a3d66
BUG: 1219894
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10687
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of open
* This patch brings in the changes where object versioning is done in write and
truncate fops instead of tracking them in open and create fops. This model
works for both regular and anonymous fds. It also removes the race associated
with open calls, create and lookups.
This patch follows the below method for object versioning and notifications:
Before sending writev on the fd, increase the ongoing
version first. This makes anonymous fd write similar to the regular
fd write by having the ongoing version increased before doing the
write.
Do following steps to do versioning:
1) For anonymous fds set the fd context (so that release is invoked) and add
the fd context to the list maintained in the inode context.
For regular fds the above think would have been done in open itself.
2) Increase the on-disk ongoing version
3) Increase the in memory ongoing version and mark inode as non-dirty
3) Once versioning is successfully done send write operation. If
versioning fails, then fail the write fop.
5) In writev_cbk mark inode as modified.
Change-Id: I7104391bbe076d8fc49b68745d2ec29a6e92476c
BUG: 1207979
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/10233
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This issue introduced due to manual rebase.
Change-Id: I0589f4a0a1270190340f419b8022d6483bcf853d
Signed-off-by: Aravinda VK <avishwan@redhat.com>
BUG: 1219479
Reviewed-on: http://review.gluster.org/10685
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An empty changelog when rolled over gets unlinked and indexed with
a modified path-name in htime file. The modification is "changelog"
not "CHANGELOG" in basename of the empty changelog file.
Change-Id: I77fd0b48b5c33c245418f5ac7a9756f08ece24d9
BUG: 1208470
Signed-off-by: Ajeet Jha <ajha@redhat.com>
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/9572
Tested-by: NetBSD Build System
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With logical scan/scrub split, pausing filesystem scrubber is an
override to the thread throttling mechanism, which effectively
throttles "down" number of scrubber threads to zero. This causes
scanner to wait until threads are spawned again (when resumed)
thereby continuing where it left off (since the file tree walk
stack is effectively preserved when the main scanner thread
is waiting for scrubbers to consume scanned entries).
The only catch is when scrubber daemon restarts: file tree walk
stack is lost and scrubbing initiates from root. This is probably
OK for now (can be changed later to persist parent directory
information before entering pause state).
Change-Id: I5109a749b7fccd0f5367765078f46e6522dd32a1
BUG: 1208131
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10521
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces multithreaded filesystem scrubber based
on throttling option configured for a particular volume. The
implementation "logically" breaks scanning and scrubbing with
the number of scrubber threads auto-configured depending upon
the throttle configuration. Scanning (crawling) is left single
threaded (per brick) with entries scrubbed in bulk. On reaching
this "bulk" watermark, scanner waits until entries are scrubbed.
Bricks for a particular volume have a set of thread(s) assigned
for scrubbing, with entries for each brick scrubbed in a round
robin fashion to avoid scrub "stalls" when a brick (out of N
bricks) is under active scrubbing.
This mechanism helps us implement "pause/resume" with ease: all
one need to do is to cleanup scrubber threads and let the main
scanner thread "wait" untill scrubbing is resumed (where the
scrubber thread(s) are spawned again), therefore continuing
where we left off (unless we restart the deamons, where crawl
initiates from root directory again, but I guess that's OK).
[
NOTE:
Throttling is optional for the signer daemon, without which
it runs full throttle. However, passing "-DBR_RATE_LIMIT_SIGNER"
predefined in CFLAGS enables CPU throttling (during checksum
calculation) thereby avoiding high CPU usage.
]
Subsequent patches would introduce CPU throttling during hash
calculation for scrubber.
Change-Id: I5701dd6cd4dff27ca3144ac5e3798a2216b39d4f
BUG: 1207020
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10511
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To-Do:
* Make ftruncate work even in the absence of path
* Aggregate and update ia_blocks appropriately when a file is
truncated to a lower size.
Change-Id: Ifd24c2f5e80d2c3bc921261f5481251df8948126
BUG: 1207615
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/10631
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) The glupy.so xlator should embed the runtime search path for
the python libraries. Unfortunately, python-config does not
gives the appprioate flags, therefore we need to also use
pkg-config to obtain them
2) Fix the glupy python module directory layout so that python
can import the module without problem
That two fixes seems to let glupy.t pass on NetBSD again.
BUG: 1129939
Change-Id: I397aa726ab8bf7d91fa0d6d870a30910a5f4a5d9
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/10616
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BitRot daemons (signer & scrubber) are disk/cpu hoggers when left
running full throttle. Checksum calculations (especially SHA family
of hash routines) can be quite CPU intensive. Moreover periodic
disk scans performed by scrubber followed by reading data blocks
for hash calculation (which is also done by signer) generate lot
of heavy IO request(s). This causes interference with actual client
operations (be it a regular client or filesystems daemons such as
self-heal, etc..) and results in degraded system performance.
This patch introduces throttling based on Token Bucket Filtering[1].
It's a well known algorithm for checking (and ensuring) that data
transmission conform to defined limits and generally used in packet
switched networks. Linux control groups (Cgroups) uses a variant[2]
of this algorithm to provide block device IO throttling (cgroup
subsys "blkio": blk-iothrottle).
So, why not just live with Cgroups?
Cgroups is linux specific. We need to have a throttling mechanism
for other supported UNIXes. Moreover, having our own implementation
gives much more finer control in terms of tuning it for our needs
(plus the simplicity of the alogorithm itself).
Ideally, throttling should be a part of server stack (either as a
separate translator or integrated with io-threads) since that's
the point of entry for IO request(s) from *all* client(s). That
way one could selectively throttle IO request(s) based on client
PIDs (frame->root->pid), e.g., self-heal daemon, bitrot, etc..
(*actual* clients can run full throttle). This implementation
avoids that deliberately (there needs to be a much more smarter
queueing mechanism) and throttles CPU usage for hash calculations.
This patch is just the infrastructure part with no interfaces
exposed to set various throttling values. The tunable selected
here (basically hardcoded) avoids 100% CPU usage during hash
calculation (with some bursts cycles). We'd need much more
intensive test(s) to assign values for various throttling
options (lazy/normal/aggressive).
[1] https://en.wikipedia.org/wiki/Token_bucket
[2] http://en.wikipedia.org/wiki/Token_bucket#Hierarchical_token_bucket
Change-Id: Icc49af80eeab6adb60166d0810e69ef37cfe2fd8
BUG: 1207020
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10307
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As optional feature, during unlink, full path will be recorded.
Changelog Version number to be bumped up to 1.2.
With this patch, parser checks the version number before parsing
and handles accordingly.
Change-Id: Ic1ad98259c39e417029a08e26a1d4b467817e65a
BUG: 1214561
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/10166
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
inode quota is a new feature implemented in glusterfs-3.7
if quota is enabled in the older version and is upgraded
to a new version, we can hit setxattr spike during self-heal
of inode quotas. So, when a quota is enabled, turn off
inode-quotas with a xlator option.
With this patch, we still account for inode quotas but only
when a write operation is performed for a particular file.
User will be able to query inode quotas once the Inode-quota
xlator option is enabled.
Change-Id: I52fb28bf7024989ce7bb08ac63a303bf3ec1ec9a
BUG: 1209430
Signed-off-by: vmallika <vmallika@redhat.com>
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/10152
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: The CTR xlator records file meta (heat/hardlinks)
into the data. This works fine for files which are created
after ctr xlator is switched ON. But for files which were
created before CTR xlator is ON, CTR xlator is not able to
record either of the meta i.e heat or hardlinks. Thus making
those files immune to promotions/demotions.
Solution: The solution that is implemented in this patch is
do ctr-db heal of all those pre-existent files, using named lookup.
For this purpose we use the inode-xlator context variable option
in gluster.
The inode-xlator context variable for ctr xlator will have the
following,
a. A Lock for the context variable
b. A hardlink list: This list represents the successful looked
up hardlinks.
These are the scenarios when the hardlink list is updated:
1) Named-Lookup: Whenever a named lookup happens on a file, in the
wind path we copy all required hardlink and inode information to
ctr_db_record structure, which resides in the frame->local variable.
We dont update the database in wind. During the unwind, we read the
information from the ctr_db_record and ,
Check if the inode context variable is created, if not we create it.
Check if the hard link is there in the hardlink list.
If its not there we add it to the list and send a update to the
database using libgfdb.
Please note: The database transaction can fail(and we ignore) as there
already might be a record in the db. This update to the db is to heal
if its not there.
If its there in the list we ignore it.
2) Inode Forget: Whenever an inode forget hits we clear the hardlink list in
the inode context variable and delete the inode context variable.
Please note: An inode forget may happen for two reason,
a. when the inode is delete.
b. the in-memory inode is evicted from the inode table due to cache limits.
3) create: whenever a create happens we create the inode context variable and
add the hardlink. The database updation is done as usual by ctr.
4) link: whenever a hardlink is created for the inode, we create the inode context
variable, if not present, and add the hardlink to the list.
5) unlink: whenever a unlink happens we delete the hardlink from the list.
6) mknod: same as create.
7) rename: whenever a rename happens we update the hardlink in list. if the hardlink
was not present for updation, we add the hardlink to the list.
What is pending:
1) This solution will only work for named lookups.
2) We dont track afr-self-heal/dht-rebalancer traffic for healing.
Change-Id: Ia4bbaf84128ad6ce8c3ddd70bcfa82894c79585f
BUG: 1212037
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10370
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|