| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Specifically when a directory tree is removed (rm -rf)
while a brick is down, both the directory index and the
name indices of the files and subdirs under it will remain.
Self-heal will need to pick up these and remove them.
Towards this, afr sh will now also crawl indices/entry-changes
and call an rmdir on the dir if the directory index is stale.
On the brick side, rmdir fop has been implemented for index xl,
which would delete the directory index and its contents if present
in a synctask.
Change-Id: I8b527331c2547e6c141db6c57c14055ad1198a7e
BUG: 1331323
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/14832
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previous commit to fix the bug, where io-stat-dump was overwriting
the dump file when the client and a brick was on the same host,
failed to consider the existing behaviour where io-stats can
help generate closely correlated set of stats across clients
and bricks, by triggering the dump using the same command.
This was introduced in commit:
0facb11220aea20a6573b656785922219c9650cf
Further, by limiting the first io-stat to unwind the dump request,
there is no way to trigger other io-stat xlators in the stack to
dump their stat information.
This bug hence is being fixed by this commit keeping the
following in mind,
- We need to trigger io-stat-dump for all instances in the
graph when this attr is set
- We need to write the output to different files, so that
they do not overwrite each others data
- We need to prevent this xattr from being set on the path
that is used to trigger the io-stat-dump information
Change-Id: I31ec380f0d85e10313a9d7b977da0e1ec74638a6
BUG: 1322825
Signed-off-by: Shyam <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/14552
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
problem : The lock state needs to be protected when rebalance is reading the
lock state on the source. Otherwise there will be locks left unmigrated.
Hence, to synchronize incoming lock requests with lock-migration, meta lock
is needed. Any new lock request will be queued if there is an active meta-lock
and with sucessful lock migration, will be unwound with EREMOTE, so that
dht module can wind the request to the correct destination.
On a successful lock migration, "pl_inode->migrated" flag is enabled. Hence,
any further request would be unwound with EREMOTE and will be redirected to
new destination.
More details can be found here:
https://github.com/gluster/glusterfs-specs/blob/master/accepted/Lock-Migration.md
design discussion:
https://www.gluster.org/pipermail/gluster-devel/2016-January/048088.html
Change-Id: Ief033d5652b5ca4ba6f499110a521cae283d6aba
BUG: 1331720
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/14251
Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While investigating gfapi memory consumption with valgrind, valgrind
reported several memory access issues.
Also see the timer 'registry' being recreated (shortly) after being
freed during teardown due to the way it's currently written.
Passing ctx as data to gf_timer_proc() is prone to memory access
issues if ctx is freed before gf_timer_proc() terminates. (And in
fact this does happen, at least in valgrind.) gf_timer_proc() doesn't
need ctx for anything, it only needs ctx->timer, so just pass that.
Nothing ever calls gf_timer_registry_init(). Nothing outside of
timer.c that is. Making it and gf_timer_proc() static.
Change-Id: Ia28454dda0cf0de2fec94d76441d98c3927a906a
BUG: 1333925
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/14247
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Poornima G <pgurusid@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Give the administrator a possibility to set oom_score_adj for glusterfs
process. Applies to Linux only.
Change-Id: Iff13c2f4cb28457871c6ebeff6130bce4a8bf543
BUG: 1336818
Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-on: http://review.gluster.org/14399
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
shards
Change-Id: I0606b74f11f5412c4d9af44a6505635ed9022c15
BUG: 1335858
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/14334
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lk_flags from posix_lock_t structure is the primary key used to
differentiate locks as either advisory and mandatory type. During
lock migration this field is not read in getactivelk() call path.
So in order to copy the exact lock state from source to destination
it is necessary to include lk_flags within lock_migration_info_t
structure to maintain accurate state. This change also includes
minor modifications to setactivelk() call to consider lk_flags
during lock migration.
Change-Id: I20a7b6b6a0f3bdac5734cce8a2cd2349eceff195
BUG: 1332501
Signed-off-by: Anoop C S <anoopcs@redhat.com>
Reviewed-on: http://review.gluster.org/14189
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Reviewed-by: Poornima G <pgurusid@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Negative cache feature implementation in md-cache requires xattrs
returned by posix to be intercepted for every call that can possibly
return xattrs. This includes readdirp(). This is crucial to treat
missing keys in cache as a case of negative entry (returns ENODATA)
md-cache puts names of xattrs that it wants to cache in xdata and
passes it down to posix which returns the specified xattrs in the
callback. This is done in lookup() and readdirp(). Hence, a xattr
that is cached can be invalidated during readdirp_cbk too.
This is based on the assumption that readdirp() will always return
all xattrs that md-cache is interested in. However, this is not the
case when readdirp() call is served from readdir-ahead's cache.
readdir-ahead xlator will pre-fetch dentries during opendir_cbk
and readdirp. These internal readdirp() calls made by readdir-ahead
xlator does not set xdata in it's requests. Hence, no xattrs are
fetched and stored in it's internal cache.
This causes metadata loss in gluster-swift. md-cache returns ENODATA
during getxattr() call even though the xattr for that object exists on
the brick. On receiving ENODATA, gluster-swift will create new metadata
and do setxattr(). This results in loss of information stored in
existing xattr.
Fix:
During opendir, md-cache will communicate to readdir-ahead asking it
to store the names of xattrs it's interested in so that readdir-ahead
can fetch those in all subsequent internal readdirp() calls issued by
it. This stored names of xattrs is invalidated/updated on the next
real readdirp() call issued by application. This readdirp() call will
have xdata set correctly by md-cache xlator.
BUG: 1333023
Change-Id: I32d46f93a99d4ec34c741f3c52b0646d141614f9
Reviewed-on: http://review.gluster.org/14214
Tested-by: Prashanth Pai <ppai@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Initial change to fix/enable the mandatory locking support in GlusterFS
as per the following design:
https://review.gluster.org/#/c/12014/
Accordingly 'locks.mandatory-locking' option is available as part of this
change which will accept one among the following values:
* off
* file
* forced
* optimal
See design doc for more details
Change-Id: I14c489b3f8af5ebcbfa155a03f0c175e9558ac46
BUG: 762184
Signed-off-by: Anoop C S <anoopcs@redhat.com>
Reviewed-on: http://review.gluster.org/9768
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Poornima G <pgurusid@redhat.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ifd0ff278dcf43da064021f5c25e5dcd34347fcde
BUG: 1326085
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/13970
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implements new indices type ENTRY_CHANGES where other
xlators can add/delete names.
Change-Id: I01c5568997085e11d22ba36a4376c70b78fb3827
BUG: 1269461
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/12482
Tested-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I52da41dff5619492b656c2217f4716a6cdadebe0
BUG: 1269461
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/12442
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Moving the enumeration of FOPs and some of the other parts that are
defining the network protocol to the rpc/xdr/ section. These structures
need some care when modifications are made, moving them out of the
common glusterfs.h header helps with that.
The protocol definition structures are generated in a new glusterfs-fops
header. This file is present in rpc/xdr/src/ and libglusterfs/src/, it
is a little ugly, but prevents the need to update all Makefile.am files
with the additional -I option for finding the new header file.
The generation of the .c and .h files from the .x descriptions needed
small modifications to accommodate these changes. The build/xdrgen
script was improved slightly for this. The .c and .h files are
incorrectly in the $(top_srcdir), instead of $(top_builddir). This is
an existing issue, and bug 1330604 has been filed to get that addressed.
Change-Id: I98fc8cf7e4b631082c7b203b5a0a77111bec1fb9
BUG: 1328502
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/14032
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GF_EVENT_SOME_CHILD_DOWN value seems to be mismatching between master and 3.7.
Fix the master since 3.7 is a release branch and GF_EVENT_SOME_CHILD_DOWN was
added newly and hence should be in the end in the enum list.
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Change-Id: I1f758550d6300f6750d1574302096d8e7f493549
BUG: 1330974
Reviewed-on: http://review.gluster.org/14092
Tested-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This xlator decompounds the compound fops received,
and executes them serially.
Change-Id: Ieddcec3c2983dd9ca7919ba9d7ecaa5192a5f489
BUG: 1303829
Signed-off-by: Anuradha Talur <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/13577
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dht_mkdir ()
{
first-hashed-subvol = hashed-subvol for "bname" in in-memory
layout of "parent";
inodelk (SETLKW, parent, "LAYOUT_HEAL_DOMAIN", "can be any
subvol, but we choose first-hashed-subvol randomly");
{
begin:
hashed-subvol = hashed-subvol for "bname" in in-memory
layout of "parent";
hash-range = extract hashe-range from layout of "parent";
ret = mkdir (parent/bname, hashed-subvol, hash-range);
if (ret == "hash-value doesn't fall into layout stored on
the brick (this error is returned by posix-mkdir)")
{
refresh_parent_layout ();
goto begin;
}
}
inodelk (UNLCK, parent, "LAYOUT_HEAL_DOMAIN",
"first-hashed-subvol");
proceed with other parts of dht_mkdir;
}
posix_mkdir (parent/bname, client-hash-range)
{
disk-hash-range = getxattr (parent, "dht-layout-key");
if (disk-hash-range != client-hash-range) {
fail-with-error ("hash-value doesn't fall into layout
stored on the brick");
return 0;
}
continue-with-posix-mkdir;
}
Similar changes need to be done for dentry operations like create,
symlink, link, unlink, rmdir, rename. These will be addressed in
subsequent patches. This patch addresses only mkdir codepath.
This change breaks stripe tests, as on some striped subvols dht layout
xattrs are not set for some reason. This results in failure of
mkdir. Since striped volumes are always created with dht, some tests
associated with stripe also fail. So, I am making following tests
changes (since stripe is out of maintainance):
* modify ./tests/basic/rpc-coverage.t to not to use striped volumes
* mark all (2) tests in tests/bugs/stripe/ as bad tests
Change-Id: Idd1ae879f24a48303dc743c1bb4d91f89a629e25
BUG: 1323040
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/13885
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I25e497459441334c13af77b3fec83c42a7a92ac4
BUG: 1319581
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/13793
Smoke: Gluster Build System <jenkins@build.gluster.com>
Tested-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia27d66b1061b0377857827515590eb89b18515c9
BUG: 1319992
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/11596
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When quota is enabled the quota enforcer tries to get the size of the
source directory by sending nameless lookup to quotad. But if the rename
is successful even on one subvol or the source layout has anomalies then
this nameless lookup in quotad tries to heal the directory which requires
a lock on as many subvols as it can. But src is already locked as part of
rename. For rename to proceed in brick it needs to complete a cluster-wide
lookup. But cluster-wide lookup in quotad is blocked on locks held by rename,
hence a deadlock. To avoid this quota sends an option in xdata which instructs
DHT not to heal.
Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0
BUG: 1252244
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/13988
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In replicate volumes, when a brick is added to a replicate
group, heal to the new brick should be triggered.
Also, the new brick should not be considered as source for
healing till it is up to date.
Previously, extended attributes had to be set manually on
the bricks for this to happen. This patch is part 1 patch
to automate this process.
Change-Id: I29958448618372bfde23bf1dac5dd23dba1ad98f
BUG: 1276203
Signed-off-by: Anuradha Talur <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/12451
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Originally all security.* xattrs were forbidden if selinux is disabled,
which was causing Samba's acl_xattr module to not work, as it would
store the NTACL in security.NTACL. To fix this http://review.gluster.org/#/c/12826/
was sent, which forbid only security.selinux. This opened up a getxattr
call on security.capability before every write fop and others.
Capabilities can be used without selinux, hence if selinux is disabled,
security.capability cannot be forbidden. Hence adding a new mount
option called capability.
Only when "--capability" or "--selinux" mount option is used,
security.capability is sent to the brick, else it is forbidden.
Change-Id: I77f60e0fb541deaa416159e45c78dd2ae653105e
BUG: 1309462
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/13540
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Minimal infrastructure changes for the seek() FOP. This will provide
SEEK_HOLE and SEEK_DATA functionalities.
BUG: 1220173
Change-Id: I4b74fce8b0bad2f45291fd2c2b9e243c4f4a1aa9
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/11480
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we are creating data file in a hot subvolume
then we will create a linkfile in cold subvolume.
Linkfile creation happens first. If linkfile creation
was successful and data file creation failed, then
linkfile in cold subvolume will become stale.
This patch will delete the linkfile as well, if data
file creation fails.
Also this code duplicates dht_create to make tier_create
Change-Id: I377a90dad47f288e9576c7323b23cf694a91a7a3
BUG: 1290677
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/12948
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
files deleted during promotion were not deleting as the
files are moving from hashed to non-hashed.
On deleting a file that is undergoing promotion,
the unlink call is not sent to the dst file as the
hashed subvol == cached subvol. This causes
the file to reappear once the migration is complete.
This patch also fixes a problem with stale linkfile
deleting.
Change-Id: I4b02a498218c9d8eeaa4556fa4219e91e7fa71e5
BUG: 1282390
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/12829
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CLI command for bitrot scrub status will be :
gluster volume bitrot <volname> scrub status
Above command will show the statistics of bitrot scrubber.
Upon execution of this command it will show some common
scrubber tunable value of volume <VOLNAME> followed by
statistics of scrubber statistics of individual nodes.
sample ouput for single node:
Volume name : <VOLNAME>
State of scrub: Active
Scrub frequency: biweekly
Bitrot error log location: /var/log/glusterfs/bitd.log
Scrubber error log location: /var/log/glusterfs/scrub.log
=========================================================
Node name:
Number of Scrubbed files:
Number of Unsigned files:
Last completed scrub time:
Duration of last scrub:
Error count:
=========================================================
This is just infrastructure. list of bad file, last scrub
time, error count value will be taken care by
http://review.gluster.org/#/c/12503/ and
http://review.gluster.org/#/c/12654/ patches.
Change-Id: I3ed3c7057c9d0c894233f4079a7f185d90c202d1
BUG: 1207627
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10231
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add xlator options to index xlator with xattrs that
it needs to keep track of.
Change-Id: If818673be5e626f77e65cc3a340f8cdd624179c2
BUG: 1250803
Signed-off-by: Anuradha Talur <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/12467
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a part of CHILD_MODIFIED event DHT forgets the current layout and
performs fresh lookup. However this is not required when a replica pair
goes offline as the xattrs can be read from other replica pairs. Hence
setting different event to handle replica pair going down.
Change-Id: I5ede2a6398e63f34f89f9d3c9bc30598974402e3
BUG: 1281230
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/12573
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With unlink, rename, rmdir, contribution xattrs
are removed. If the file is a last link
then remove_xattr will fail with ENOENT.
So it better to perform remove_xattr
only if there are more links to the file
Change-Id: Ifc1e7fda4d310fd87f6f28a635c9ea78b8f3929d
BUG: 1257694
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/12033
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a quota is disable and the clean-up process terminated
without completely cleaning-up the quota xattrs.
Now when quota is enabled again, this can mess-up the accounting
A version number is suffixed for all quota xattrs and this version
number is specific to marker xaltor, i.e when quota xattrs are
requested by quotad/client marker will remove the version suffix in the
key before sending the response
Change-Id: I1ca2c11460645edba0f6b68db70d476d8d26e1eb
BUG: 1272411
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/12386
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Using sampling feature you can record details about every Nth FOP.
The fields in each sample are: FOP type, hostname, uid, gid, FOP priority,
port and time taken (latency) to fufill the request.
- Implemented using a ring buffer which is not (m/c) allocated in the IO path,
this should make the sampling process pretty cheap.
- DNS resolution done @ dump time not @ sample time for performance w/
cache
- Metrics can be used for both diagnostics, traffic/IO profiling as well
as P95/P99 calculations
- To control this feature there are two new volume options:
diagnostics.fop-sample-interval - The sampling interval, e.g. 1 means
sample every FOP, 100 means sample every 100th FOP
diagnostics.fop-sample-buf-size - The size (in bytes) of the ring
buffer used to store the samples. In the even more samples
are collected in the stats dump interval than can be held in this buffer,
the oldest samples shall be discarded. Samples are stored in the log
directory under /var/log/glusterfs/samples.
- Uses DNS cache written by sshreyas@fb.com (Thank-you!), the DNS cache
TTL is controlled by the diagnostics.stats-dnscache-ttl-sec option
and defaults to 24hrs.
Test Plan:
- Valgrind'd to ensure it's leak free
- Run prove test(s)
- Shadow testing on 100+ brick cluster
Change-Id: I9ee14c2fa18486b7efb38e59f70687249d3f96d8
BUG: 1271310
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/12210
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Heal hardlink in the db for already existing data in the cold
tier during attach tier. i.e during fix layout do lookup to files
in the cold tier.
CTR xlator on the brick/server side does db update/insert of the hardlink on a namelookup.
Currently the namedlookup is done synchronous to the fixlayout that is
triggered by attach tier. This is not performant, adding more time to
fixlayout. The performant approach is record the hardlinks on a compressed
datastore and then do the namelookup asynchronously later, giving the ctr db
eventual consistency
Change-Id: I4ffc337fffe7d447804786851a9183a51b5044a9
BUG: 1252586
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/11828
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implementation of xattrop type:
GF_XATTROP_ADD_ARRAY_WITH_DEFAULT
GF_XATTROP_ADD_ARRAY64_WITH_DEFAULT
These operations are similar to 'GF_XATTROP_ADD_ARRAY',
except that it adds a default value if xattr is missing
or its value is zero on disk.
One use-case of this operation is in inode-quota.
When a new directory is created, its default dir_count
should be set to 1. So when a xattrop performed setting
inode-xattrs, it should account initial dir_count
1 if the xattrs are not present
Here is the usage of this operation
value required in xdata for each key
struct array {
int32_t newvalue_1;
int32_t newvalue_2;
...
int32_t newvalue_n;
int32_t default_1;
int32_t default_2;
...
int32_t default_n;
};
or
struct array {
int32_t value_1;
int32_t value_2;
...
int32_t value_n;
} data[2];
fill data[0] with new value to add
fill data[1] with default value
xattrop GF_XATTROP_ADD_ARRAY_WITH_DEFAULT
for i from 1 to n
{
if (xattr (dest_i) is zero or not set in the disk)
dest_i = newvalue_i + default_i
else
dest_i = dest_i + newvalue_i
}
value in xdata after xattrop is successful
struct array {
int32_t dest_1;
int32_t dest_2;
...
int32_t dest_n;
};
Change-Id: Ic6a08473e99fd98299a839d4d8416081a7534efd
BUG: 1243946
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/11702
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GF_XATTROP_GET_AND_SET stores the existing xattr
value in xdata and sets the new value
xattrop was reusing input xattr dict to set the results
instead of creating new dict.
This can be problem for server side xlators as the inout dict
will have the value changed.
Change-Id: I43369082e1d0090d211381181e9f3b9075b8e771
BUG: 1251454
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/11995
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a --resolve-gids commandline option to the glusterfs binary. This
option gets set when executing "mount -t glusterfs -o resolve-gids ...".
This option is most useful in combination with the "acl" mount option.
POSIX ACL permission checking is done on the FUSE-client side to improve
performance (in addition to the checking on the bricks).
The fuse-bridge reads /proc/$PID/status by default, and this file
contains maximum 32 groups. Any local (client-side) permission checking
that requires more than the first 32 groups will fail.
By enabling the "resolve-gids" option, the fuse-bridge will call
getgrouplist() to retrieve all the groups from the user accessing the
mountpoint. This is comparable to how "nfs.server-aux-gids" works.
Note that when a user belongs to more than ~93 groups, the volume option
server.manage-gids needs to be enabled too. Without this option, the
RPC-layer will need to reduce the number of groups to make them fit in
the RPC-header.
Change-Id: I7ede90d0e41bcf55755cced5747fa0fb1699edb2
BUG: 1246275
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/11732
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Access to bad objects (especially operations such as open, readv, writev)
should be denied to prevent applications from getting wrong data.
* Do not allow anyone apart from scrubber to set bad object xattr.
* Do not allow bad object xattr to be removed.
Change-Id: Ia9185a067233a9f26e3d41d41d11d9a4eb0da827
BUG: 1210689
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/11126
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
and poststat atomically
Change-Id: I9b52ddaed4e306e9a49f39c86450c94bea843a7b
BUG: 1233617
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/11345
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
| |
Change-Id: Ie589c866a53c0cfc167e31d1d14a11bf58c8dabf
BUG: 1207829
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11413
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of including config.h in each file, and have the additional
config.h included from the compiler commandline (-include option).
When a .c file tests for a certain #define, and config.h was not
included, incorrect assumtions were made. With this change, it can not
happen again.
BUG: 1222319
Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/10808
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a file is under migration, any xattrs created on it
are lost post migration of the file. This is because
the xattrs are set only on the cached subvol of the source
and as the source is under migration, it becomes a linkto file
post migration.
Change-Id: Ib8e233b519cf954e7723c6e26b38fa8f9b8c85c0
BUG: 1193636
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/10212
Tested-by: NetBSD Build System
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The key concept here is to determine whether a directory is "clean" by
comparing its last-known-good topology to the current one for the
volume. These are stored as "commit hashes" on the directory and the
volume root respectively. The volume's commit hash changes whenever a
brick is added or removed, and a fix-layout is done. A directory's
commit hash changes only when a full rebalance (not just fix-layout)
is done on it. If all bricks are present and have a directory
commit hash that matches the volume commit hash, then we can assume
that every file is in its "proper" place. Therefore, if we look for
a file in that proper place and don't find it, we can assume it's not
on any other subvolume and *safely* skip the global (broadcast to all)
lookup.
Change-Id: Id6ce4593ba1f7daffa74cfab591cb45960629ae3
BUG: 1219637
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Signed-off-by: Shyam <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/7702
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) Provided setfattr command to set timeout for split-brain
choice.
2) If split-brain inspection/resolution is being done
from the mount for a file, ref the inode when
split-brain-choice is set.
This inode will be unconditionally unref-ed after timeout
seconds set by the user/default otherwise.
3) Updated the doc and testcase to reflect the changes.
Change-Id: I15c9037dee28855f21e680e7e3632e1f48dba4e1
BUG: 1209104
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/10134
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace-brick operation with data migration support have been
deprecated from gluster.
With this fix replace brick command will support only one commad
gluster volume replace-brick <VOLNAME> <SOURCE-BRICK> <NEW-BRICK> {commit force}
Change-Id: Ib81d49e5d8e7eaa4ccb5830cfec2bc081191b43b
BUG: 1094119
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/10101
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of "trusted.glusterfs.bit-rot.*" use "trusted.bit-rot.*"
NOTE:
With this patch, data on existing volumes would be resigned
(which should be OK as of now since we do not expect many
users as of now :-))
Change-Id: I926c7bca266a9c8f2cb35d57c4d0359aa5cecfa0
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10181
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Background:
Glusterfs changelogs are stored in each brick, which records the changes
happened in that brick. Georep will run in all the nodes of master and
processes changelogs "independently".
Processing changelogs is in brick level, but all the fops will be replayed
on "slave mount" point.
Problem:
With a DHT volume, in changelog "internal fops" are NOT recorded.
For Rename case, Rename is recorded in "hashed" brick changelog.
(DHT's internal fops like creating linkto file, unlink is NOT recorded).
This lead us to inconsistent rename operations.
For example,
Distribute volume created with Two bricks B1, B2.
//Consider master volume mounted @ /mnt/master
and following operations executed:
cd /mnt/master
touch f1 // f1 falls on B1 Hash
mv f1 f2 // f2 falls on B2 Hash
// Here, Changelogs are recorded as below:
@B1
CREATE f1
@B2
RENAME f1 f2
Here, race exists between Brick B1 and B2, say B2 will get executed first.
Source file f1 itself is "NOT PRESENT", so it will go ahead and create
f2 (Current implementation).
We have this problem When rename falls in another brick and
file is unlinked in Master.
Similar kind of issue exists in following case too(multiple rename):
CREATE f1
RENAME f1 f2
RENAME f2 f1
Solution:
Instead of carrying out "changelogging" at "HASHED volume",
carry out at the "CACHED volume".
This way we have rename operations carried out where actual files are present.
So,Changelog recorded as :
@B1
CREATE f1
RENAME f1 f2
credit: sarumuga@redhat.com
PS: Some of the races as the one below are _NOT_ fixed by this patch
* f1 and f2 exist. B1 and B2 are their respective cached subvols. For
both files hashed-subvol == cached-subvol
* mv f1 f2 on master.
* B1 has change-log entry of rename f1 f2
* rebalance migrates f2 from B1 and B2
* mv f2 f1 on master.
* B2 has change-log entry of rename f2 f1
Since changelog entries (rename f1 f2) and (rename f2 f1) are processed
independently by gsyncds, which of either f1 and f2 survives on slave
is subject to race. Note that on master its file f1 with name f1 which
survived. On slave it can be either file f1 with name f1 or file f2
with name f2 based on who wins the race of processing changelog.
Change-Id: Iebc222f582613924c3a7cba37fb6d3e2d8332eda
BUG: 1141379
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/10410
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Have added support to send attributes of both entries and
its parent (include oldparent in case of RENAME fop) in the
same notification request to avoid multiple rpc requests.
Also, made changes in gfapi to send parent object and its
attributes changed in a single upcall event.
Change-Id: I92833da3bcec38d65216921c2ce4d10367c32ef1
BUG: 1200262
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10460
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently when quota limit is set, corresponding gfid
is set in quota.conf. This patch supports storing
inode-quota limits in quota.conf and also stores
additional byte for each gfid to differentiate
between usage quota limit and inode quota limit.
Change-Id: I444d7399407594edd280e640681679a784d4c46a
BUG: 1202244
Signed-off-by: vmallika <vmallika@redhat.com>
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/10069
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As suggested during the code-review of Bug1200262, have modified
GF_CBK_UPCALL to be exlusively GF_CBK_CACHE_INVALIDATION.
Thus, for any new upcall event, a new CBK procedure will be added.
Also made changes to store upcall data separately based on the
upcall event type received.
BUG: 1200262
Change-Id: I0f5e53d6f5ece16aecb514a0a426dca40fa1c755
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Reviewed-on: http://review.gluster.org/10049
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current patch address two part of the design proposed.
1. Rebalance multiple files in parallel
2. Crawl only bricks that belong to the current node
Brief design explanation for the above two points.
1. Rebalance multiple files in parallel:
-------------------------------------
The existing rebalance engine is single threaded. Hence, introduced
multiple threads which will be running parallel to the crawler. The
current rebalance migration is converted to a "Producer-Consumer"
frame work.
Where Producer is : Crawler
Consumer is : Migrating Threads
Crawler: Crawler is the main thread. The job of the crawler is now
limited to fix-layout of each directory and add the files which are
eligible for the migration to a global queue in a round robin manner
so that we will use all the disk resources efficiently. Hence, the
crawler will not be "blocked" by migration process.
Producer: Producer will monitor the global queue. If any file is
added to this queue, it will dqueue that entry and migrate the file.
Currently 20 migration threads are spawned at the beginning of the
rebalance process. Hence, multiple file migration happens in parallel.
2. Crawl only bricks that belong to the current node:
--------------------------------------------------
As rebalance process is spawned per node, it migrates only the files
that belongs to it's own node for the sake of load balancing. But it
also reads entries from the whole cluster, which is not necessary as
readdir hits other nodes.
New Design:
As part of the new design the rebalancer decides the subvols
that are local to the rebalancer node by checking the node-uuid of
root directory prior to the crawler starts. Hence, readdir won't hit
the whole cluster as it has already the context of local subvols and
also node-uuid request for each file can be avoided. This makes the
rebalance process "more scalable".
Change-Id: I73ed6ff807adea15086eabbb8d9883e88571ebc1
BUG: 1171954
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/9657
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instantiate a process wide global instance of the timer wheel
data structure. Spawning glusterfs* process with option arg
"--global-timer-wheel" instantiates a global instance of
timer-wheel under global context (->ctx).
Translators can make use of this process wide instance [via a
call to glusterfs_global_timer_wheel()] instead of maintaining
an instance of their own and possibly consuming more memory.
Linux kernel too has a single instance of timer wheel where
subsystems such as IO, networking, etc.. make use of.
Bitrot daemon would be early consumers of this: bitrot translator
instances for multiple volumes would track objects belonging to
their respective bricks in this global expiry tracking data
structure. This is also a first step to move GlusterFS timer
mechanism to use timer-wheel.
Change-Id: Ie882df607e07acaced846ea269ebf1ece306d6ae
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10380
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes a handful of problem with scrubber which
are detailed below.
Scrubber used to skip objects for verification due to missing
fd iterface to fetch versioning extended attributes. Similar
to the inode interface, an fd based interface in POSIX is now
introduced.
Moreover, this patch also fixes potential false reporting by
scrubber due to:
An object gets dirtied and signed when scrubber is busy
calculatingobject checksum. This is fixed by caching the
signed version when an object is first inspected for
stalenes, i.e., during pre-compute stage. This version is
used to verify checksum in the post-compute stage when the
signatures are compared for possible corruption.
Side effect of _not_ sending signature length during signing
resulted in "truncated" signature to be set for an object.
Now, at the time of signing, the signature length is sent
and is used in place of invoking strlen() to get signature
length (which could have possible 00s). The signature length
itself is not persisted in the signature xattr, but is
calculated on-the-fly by substracting the xattr length by
the "structure" header size.
Some of the log entries are made more meaningful (as and aid
for debugging).
Change-Id: I938bee5aea6688d5d99eb2640053613af86d6269
BUG: 1207624
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/10118
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|