| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because of http://review.gluster.org/#/c/6318/ patch,
ESTALE is returned for a lookukp on non-existent GFID.
But ENOENT is more appropriate when lookup happens
through virtual .gfid directory on aux-gfid-mount
point. This is avoids confusion for the consumers
of gfid-access-translator like geo-rep which expects
ENOENT.
Change-Id: I4add2edf5958bb59ce55d02726e6b3e801b101bb
BUG: 1069191
Signed-off-by: Kotresh H R <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/7154
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/7163
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ifbebf1aa496d49a6c4bb30258b83aaf9792828e5
BUG: 1059833
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6924
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/6155
BUG: 1023974
Change-Id: Icb4bca8c4a5fa9b14855478e69b1315f2e9a3c3d
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/6887
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/6532
PROBLEM:
Alert log messages, corresponding to disk usage crossing soft-limit on a
directory, weren't being logged as often as expected.
CAUSE:
The mechanism in place to log alert messages, once every alert-time
seconds, set the previous logged time incorrectly.
FIX:
Update previous logged time only if we logged an alert message, ie. when
the "time was right" to alert.
BUG: 969461
Change-Id: Ie019c180c97d2b049fa664b14ce087b88e3b578f
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/6886
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Quota contributions of a file/directory are tracked by quota
xlator using xattrs on the file. Quota allows these xattrs to be
healed as part of metadata self-heal. This leads to
wrong quota calculations on this brick after self-heal because
quota xattrs don't represent the actual contributions on the
brick anymore.
Fix:
Don't let self-heal of this xattr happen as part of self-heal
by filtering quota xattrs on file in listxattr.
Change-Id: Iea68a116595ba271e58c6fdcc3dd21c7bb55ebb3
BUG: 1035576
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6374
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/6812
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
quotad before marking quota as enabled.
without this patch there is a window of time when quota is marked as
enabled in quota-enforcer, but connection to quotad wouldn't have been
established. Any checklimit done during this period can result in a
failed fop because of unavailability of quotad.
Change-Id: I0d509fabc434dd55ce9ec59157123524197fcc80
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
BUG: 969461
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6572
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/6820
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Even though limit is not set quota used to send 'quota-deem-statfs'
key in the dictionary resulting in incorrect calculations.
Change-Id: I643cb35cca6648e40f1c745135280876958ddaa9
BUG: 1046894
Signed-off-by: Varun Shastry <vshastry@redhat.com>
Reviewed-on: http://review.gluster.org/6607
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6840
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Quota and marker uses 'trusted.glusterfs.quota*' and 'trusted.pgfid*' xattrs to
store its configurations and accounting information and also to build the
parent inode chain in case of absense of path.
Problem:
After disabling and then enabling quota back, the xattrs may contain stale data
leading to impaired accounting and thus improper enforcement.
Solution:
Clean up all the quota related xattrs after quota disable.
Marker xlator implements a virtual xattr to cleanup quota and pgfid xattrs. In
this approach glusterd mounts an auxiliary mount and sends the below command to
all the files by crawling the mountpoint.
#setfattr -n "glusterfs.quota-xattr-cleanup" -v 1 <path/to/file>
Credit:
Krishnan Parthasarathi <kparthas@redhat.com>
Varun Shastry <vshastry@redhat.com>
Change-Id: I9380eca58a285dc27dd572de1767aac8f2cd8049
BUG: 969461
Signed-off-by: Varun Shastry <vshastry@redhat.com>
Reviewed-on: http://review.gluster.org/6369
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/6838
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
codepath.
Ancestry building codepath can be executed simultaneously when a file
has hardlinks. So, following fixes are needed:
* modify local->link_count under locks in ancestry building codepath.
* before invoking check_limit on newly constructed parents,
link_count should not be set, but instead incremented by as many
number of new invocations of check_limit.
* decrementing link_count and the check whether the count has hit zero
to resume the waiting call_stub should be atomic.
Change-Id: I6f52e50547362a5ded6bd9f085570223aea372d1
BUG: 969461
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6491
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/6819
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Accounting was done in enforcer (though marker is the ultimate source
of truth) to offset cached directory size becoming stale. However,
with enforcer being moved to brick we can no longer maintain correct
cluster wide size for a directory. Hence removing accounting code from
enforcer.
Change-Id: I5ea94234da4da85ed5f5ced1354d8de3454b3fcb
BUG: 969461
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6434
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/6818
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I8a0b7f3a1995a72560c210efbad1eaafb0bdf329
BUG: 969461
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6373
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/6816
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I60abf576999996e0d0d65534e1e416f6e10994c8
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
BUG: 969461
Reviewed-on: http://review.gluster.org/6479
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/6814
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-> log additional records.
-> include FOP number for metadata.
-> prevent crash if inode is not found in a fop.
Change-Id: I9edd4b71819ebd68c6a2b4150ae279c471d129da
BUG: 1036536
Signed-off-by: Ajeet Jha <ajha@redhat.com>
Reviewed-on: http://review.gluster.org/6403
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@gmail.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/6808
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gfid translator, lookup was not handling the case when
the lookup is sent on .gfid/<parent>/bname. In this case,
we flip with fake inode of the parent with the real inode
in loc and send it downwards.
Change-Id: I639ff1dce10ffc045da419e333d455e208b6a0f0
BUG: 1057881
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/6795
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/6807
|
|
|
|
|
|
|
|
|
|
|
|
| |
Setting appropriate ia_type and gfid for the inode, obtained during
virtual_lookup_cbk of a directory by doing an "inode_link".
Change-Id: I9582570c82e70ff5f1d4e441f9e9868ef82e9dc6
BUG: 1054199
Signed-off-by: Ajeet Jha <ajha@redhat.com>
Reviewed-on: http://review.gluster.org/6806
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* dict_set was happening with a string instead of 'uuid_t' causing
entry creations to happen with wrong gfid
* revalidate was causing excessive ESTALE logs as lookup was
happening on '.gfid/' path itself causing server to become clueless
Change-Id: I3b76ce7fdec9c2ff785be21f506f859f489f80f0
BUG: 1054199
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.org/6520
Reviewed-by: Anand Avati <avati@redhat.com>
Tested-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6805
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
changing the 'frame->root->uid' on the fly is not a good idea as
posix-acl xlator on brick process would fail the op.
Change-Id: I996b43e4ce6efb04f52949976339dad6eb89bede
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 847839
Reviewed-on: http://review.gluster.org/5833
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6801
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This way disconnect cleanup code can differentiate which locks
are granted vs blocked.
Change-Id: I2a835c6865b6c804231d852953ea84eeccef35a3
BUG: 849630
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6762
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- implement ref/unref of entry locks (and fix bad pointer deref crashes)
- code cleanup and deleted various data types
- fix improper read/write lock conflict detection in entrylk
- fix indefinite hang of blocked locks on disconnect
- register locks in client_t synchronously, fix crashes in disconnect path
Change-Id: Id273690c9111b8052139d1847060d1fb5a711924
BUG: 849630
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6695
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a backport of Iccd27ac076b7a74e40dcbaa1c4762fd3ad59da5f
POSIX does not says wether link(2) on symlink should link on
symlink itself or on target. Linux use symlink, most other
systems use target. Using linkat(2) allows the behavior to be
specified, so that the behavior is portable.
Also fix configure test for NetBSD linkat(2), which ceased to work.
BUG: 764655
Change-Id: Ifcabda5e81b15cd80982bcfc05afda4c9e5370ef
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/6612
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a backport of Iccd27ac076b7a74e40dcbaa1c4762fd3ad59da5f
POSIX does not says wether link(2) on symlink should link on
symlink itself or on target. Linux use symlink, most other
systems use target. Using linkat(2) allows the behavior to be
specified, so that the behavior is portable.
Also fix configure test for NetBSD linkata(2), which ceased to work.
BUG: 764655
Change-Id: I7cf9e62ea19c7eb356935c11b480cf637c83126b
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/6594
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... which may be grouped under the following categories:
1. Fix incorrect cli exit status for 'quota list' cmd
2. Print appropriate error message on quota parse errors in cli
Authored by: Anuradha Talur <atalur@redhat.com>
3. glusterd: Improve quota validation during stage-op
4. Fix peer probe issues resulting from quota conf checksum mismatches
5. Enhancements to CLI output in the event of quota command failures
Authored by: Kaushal Madappa <kmadappa@redhat.com>
7. Move aux mount location from /tmp to /var/run/gluster
Authored by: Krishnan Parthasarathi <kparthas@redhat.com>
8. Fix performance issues in quota limit-usage
Authored by: Krutika Dhananjay <kdhananj@redhat.com>
Note: Some functions that were used in earlier version of quota,
that aren't called anymore have been removed.
Change-Id: I963d4145f3ecdfe30c61bfa8920baccb33d2d4bd
BUG: 969461
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/6386
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is needed for two reasons:
* since dht-linkfiles are internal, they shouldn't be accounted.
* hardlink handling in marker is broken. link/unlink of hardlinks
present in same directory can break marker accounting. Hence, if src
and dst are in same directory in case of rename, dht - if it breaks
rename into link/unlink operations - should instruct marker to not to
do accounting.
Change-Id: I9c9f7384569f75a2792f6450ee7a5279bf751ae7
BUG: 1022995
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6203
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I3239f5e8477664dcc04434e4d455ae447493a7ac
BUG: 1022995
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6153
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
available space.
This patch aims to prevent creation of infinite zero byte sized files
due to amount of storage available before exceeding quota limit
being less than write sizes. Imagine x bytes of storage is available
before we exceed quota limit and quota enforcer is receiving writes of
size y and (y > x). In this scenario, if we run a shell script like:
# for i in $(seq 1 10); do dd if=/dev/zero of=$i bs=y count=1; done
Then, we would end up with 10 zero byte sized files, because we allow
only complete writes and all writes will fail because of lack of space.
However, creates succeed since a create itself will consume zero
bytes. In this pattern of creates and writes, size of volume would
never grow and x bytes of space will always be available and we can
end up with an infinite number of zero byte sized files.
Change-Id: Ice148d6a2207883e41759f7b0be73abaa3198b41
BUG: 1012216
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6035
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Two stages of quota enforcement is done:
Soft and hard quota Upon reaching soft quota limit on the directory
it logs/alerts in the quota daemon log (ie DEFAULT_LOG_DIR/quotad.log)
and no more writes allowed after hard
quota limit. After reaching the soft-limit the daemon alerts the
user/admin repeatively for every 'alert-time', which is
configurable.
* Quota enforcer is moved to server-side.
It takes care of enforcing quota. Since enforcer doesn't have the
cluster view, it relies on another service called
quota-aggregator. Aggregator, on query can return the size of a
directory based on the cluster view.
Enforcer is always loaded in the server graph and is by passed if
the feature is not enabled.
Options specific to enforcer:
server-quota - Specifies whether the feature is on/off. It is used
to by pass the quota if turned off.
deem-statfs - If set to on, it takes quota limits into consideration
while estimating fs size. (df command). The algorithm followed is,
i. Adjust statvfs based on limit configured on root.
ii. If limit is set on the inode passed, use size/limits on that inode to
populate statvfs. Otherwise, use size/limits configured on root.
iii. Upon statvfs, update the ctx->size on the inode.
iv. Don't let DHT aggregate, instead take the maximum of the usages from the
subvols of the DHT, since each of it contains the complete information.
Enforcer also makes use of gfid-to-path conversion functionality to
work correctly when a client like nfs predominently relies on
nameless lookups.
* Quota Aggregator acts as a thin client to provide cluster view
Its a lightweight *gluster client* process with no mount point,
started upon enabling quota or restarting the volume. This is a
single process run on each brick, which can answer queries on all
volumes in the cluster. Its volfile stored in
GLUSTERD_DEFAULT_WORKING_DIR/quotad/quotad.vol.
Credits:
Raghavendra Bhat <rabhat@redhat.com>
Varun Shastry <vshastry@redhat.com>
Shishir Gowda <sgowda@redhat.com>
Kruthika Dhananjay <kdhananj@redhat.com>
Brian Foster <bfoster@redhat.com>
Krishnan Parthasarathi <kparthas@redhat.com>
Change-Id: Id1cb25b414951da34c665a55f77385d482e0f9de
BUG: 969461
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/5952
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* handles renames on dht linkfiles correctly
* nameless lookup friendly changes. uses gfid-to-path conversion
functionality from storage/posix to build ancestry till root.
* log message cleanup.
* build inode contexts in readdirp
* Accounting still not correct with hardlinks.
Credits:
========
Vijay Bellur <vbellur@redhat.com>
Raghavendra Bhat <rabhat@redhat.com>
Change-Id: I415b6fbbc9691f5a38d9fd3c5d083a61e578bb81
BUG: 969461
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/5953
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement backtrace(3) and backtrace_symbols(3) which do not exist in NetBSD
While there, remove duplicate #include <stdio.h>
BUG: 764655
Change-Id: Iccd695765906e085c3f8fcb670506d4fea68fa39
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/6285
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glfs_zerofill() can be potentially called to zero-out entire file and
hence allow for bigger value of length parameter.
Change-Id: I75f1d11af298915049a3f3a7cb3890a2d72fca63
BUG: 1028673
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-on: http://review.gluster.org/6266
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: M. Mohan Kumar <mohan@in.ibm.com>
Tested-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* When a writev call occurs, the client compresses the data before
sending it to server. On the server, compressed data is decompressed.
Similarly, when a readv call occurs, the server compresses the data
before sending it to client. On the client, the compressed data is
decompressed. Thus the amount of data sent over the wire is minimized.
* Compression/Decompression is done using Zlib library.
* During normal operation, this is the format of data sent over wire :
<compressed-data> + trailer(8)
The trailer contains the CRC32 checksum and length of original
uncompressed data. This is used for validation.
HOW TO USE
----------
Turning on compression xlator:
gluster volume set <vol_name> compress on
Configurable options:
gluster volume set <vol_name> compress.compression-level 8
gluster volume set <vol_name> compress.min-size 50
Change-Id: Ib7a66b6f1f70fe002b7c513588cdf75c69370805
BUG: 923540
Original-author : Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Prashanth Pai <nullpai@gmail.com>
Signed-off-by: Prashanth Pai <ppai@redhat.com>
Reviewed-on: http://review.gluster.org/3251
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current coroutine model, mapping synctasks 1-1 with qemu internal
Coroutines, has some unresolved raciness issues. This problem usually
manifests as lifecycle mismatches between top-level (gluster created)
synctasks and the subsequently created internal coroutines from that
context. Qemu's internal queueing (and locking) can cause situations
where the top-level synctask is destroyed before the internal scheduler
has released references to memory, leading to use after free crashes
and asserts.
Simplify the coroutine model to use a single synctask as a coroutine
processor and rely on the existing native ucontext coroutine
implementation. The syncenv thread is donated to qemu and ensures a
single top-level coroutine is processed at a time. Qemu now has
complete control over coroutine scheduling.
BUG: 986775
Change-Id: I38223479a608d80353128e390f243933fc946fd6
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/6110
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add basic backing image support to the block-format mechanism. This
is a functionality checkpoint that enables the raw mechanism
required to support client driven "snapshot" and "clone" requests.
This change enhances the block-format setxattr command to support
an additional and optional backing image reference. For example:
setxattr -n trusted.glusterfs.block-format -v "qcow2:10GB:<bimg>" ./newimage
... where <bimg> refers to the backing image for unallocated blocks
in newimage. <bimg> can be provided in one of two formats:
- a gfid string in the following format (assuming a valid gfid):
<gfid:00000000-0000-0000-0000-000000000000>
- or a filename that must be resident in the same directory as the
new clone file being formatted. E.g.,
setxattr -n trusted.glusterfs.block-format -v "qcow2:10GB:baseimg" ./newimage
This latter format is more restrictive, simply provided for
convenience or until something more refined is available.
This change makes no assumptions about the backing image file and
affords no additional protection. It is up to the user/client to
recognize the relationship between the files and manage them
appropriately (i.e., no writes to the backing image, etc.).
BUG: 986775
Change-Id: I7aff7bdc59b85a6459001a6bfeae4db6bf74f703
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/5967
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in
the specified range. This fop will be useful when a whole file needs to
be initialized with zero (could be useful for zero filled VM disk image
provisioning or during scrubbing of VM disk images).
Client/application can issue this FOP for zeroing out. Gluster server
will zero out required range of bytes ie server offloaded zeroing. In
the absence of this fop, client/application has to repetitively issue
write (zero) fop to the server, which is very inefficient method because
of the overheads involved in RPC calls and acknowledgements.
WRITESAME is a SCSI T10 command that takes a block of data as input and
writes the same data to other blocks and this write is handled
completely within the storage and hence is known as offload . Linux ,now
has support for SCSI WRITESAME command which is exposed to the user in
the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to
implement this fop. Thus zeroing out operations can be completely
offloaded to the storage device , making it highly efficient.
The fop takes two arguments offset and size. It zeroes out 'size' number
of bytes in an opened file starting from 'offset' position.
This patch adds zerofill support to the following areas:
- libglusterfs
- io-stats
- performance/md-cache,open-behind
- quota
- cluster/afr,dht,stripe
- rpc/xdr
- protocol/client,server
- io-threads
- marker
- storage/posix
- libgfapi
Client applications can exloit this fop by using glfs_zerofill introduced in
libgfapi.FUSE support to this fop has not been added as there is no system call
for this fop.
Changes from previous version 3:
* Removed redundant memory failure log messages
Changes from previous version 2:
* Rebased and fixed build error
Changes from previous version 1:
* Rebased for latest master
TODO :
* Add zerofill support to trace xlator
* Expose zerofill capability as part of gluster volume info
Here is a performance comparison of server offloaded zeofill vs zeroing
out using repeated writes.
[root@llmvm02 remote]# time ./offloaded aakash-test log 20
real 3m34.155s
user 0m0.018s
sys 0m0.040s
[root@llmvm02 remote]# time ./manually aakash-test log 20
real 4m23.043s
user 0m2.197s
sys 0m14.457s
[root@llmvm02 remote]# time ./offloaded aakash-test log 25;
real 4m28.363s
user 0m0.021s
sys 0m0.025s
[root@llmvm02 remote]# time ./manually aakash-test log 25
real 5m34.278s
user 0m2.957s
sys 0m18.808s
The argument log is a file which we want to set for logging purpose and
the third argument is size in GB .
As we can see there is a performance improvement of around 20% with this
fop.
Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18
BUG: 1028673
Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com>
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/5327
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
remove server_ctx and locks_ctx from client_ctx directly and store as
into discrete entities in the scratch_ctx
hooking up dump will be in phase 3
BUG: 849630
Change-Id: I94cea328326db236cdfdf306cb381e4d58f58d4c
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/5678
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ic3f9d30d75df0bbbdf8fe28446fabe62d90fa854
BUG: 1024666
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6194
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gettimeofday() returns the current wall clock time and timezone.
Using these functions in order to measure the passage of time
(how long an operation took) therefore seems like a no-brainer.
This time suffer's from some limitations:
a. They have a low resolution: “High-performance” timing by
definition, requires clock resolutions into the microseconds
or better.
b. They can jump forwards and backwards in time: Computer
clocks all tick at slightly different rates, which causes
the time to drift. Most systems have NTP enabled which
periodically adjusts the system clock to keep them in sync
with “actual” time. The adjustment can cause the clock to
suddenly jump forward (artificially inflating your timing
numbers) or jump backwards (causing your timing calculations
to go negative or hugely positive). In such cases timer
thread could go into an infinite loop.
From 'man gettimeofday':
----------
..
..
The time returned by gettimeofday() is affected by discontinuous
jumps in the system time (e.g., if the system administrator manually
changes the system time). If you need a monotonically increasing
clock, see clock_gettime(2).
..
..
----------
Rationale:
For calculating interval timing for Timer thread, all that’s
needed should be clock as a simple counter that increments
at a stable rate.
This is necessary to avoid the jumps which are caused by using
"wall time", this counter must be monotonic that can never
“tick” backwards, ever.
Change-Id: I701d31e71a85a73d21a6c5cd15583e7a5a645eeb
BUG: 1017993
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-on: http://review.gluster.org/6070
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently to know the number of files to be healed, either user
has to go to backend and check the number of entries present in
indices/xattrop directory. But if a volume consists of large
number of bricks, going to each backend and counting the number
of entries is a time-taking task. Otherwise user can give
gluster volume heal vol-name info command but with this
approach if no. of entries are very hugh in the indices/
xattrop directory, it will comsume time.
So as a feature, new command is implemented.
Command 1: gluster volume heal vn statistics heal-count
This command will get the number of entries present in
every brick of a volume. The output displays only entries
count.
Command 2: gluster volume heal vn statistics heal-count
replica 192.168.122.1:/home/user/brickname
Here if we are concerned with just one replica.
So providing any one of the brick of a replica will get
the number of entries to be healed for that replica only.
Example:
Replicate volume with replica count 2.
Backend status:
--------------
[root@dhcp-0-17 xattrop]# ls -lia | wc -l
1918
NOTE: Out of 1918, 2 entries are <xattrop-gfid> dummy
entries so actual no. of entries to be healed are
1916.
[root@dhcp-0-17 xattrop]# pwd
/home/user/2ty/.glusterfs/indices/xattrop
Command output:
--------------
Gathering count of entries to be healed on volume volume3 has been successful
Brick 192.168.122.1:/home/user/22iu
Status: Brick is Not connected
Entries count is not available
Brick 192.168.122.1:/home/user/2ty
Number of entries: 1916
Change-Id: I72452f3de50502dc898076ec74d434d9e77fd290
BUG: 1015990
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/6044
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bdrv_init() is intended to be invoked only once. If invoked again
after initialization (i.e., due to graph changes), the block driver
registration code can reinsert entries into bdrv_drivers,
effectively corrupting the NULL terminated linked list. This
manifests as an infinite loop for callers attempt to traverse the
list (to locate a driver).
BUG: 986775
Change-Id: I2f95bda3c753dece4648cdad009d0e2a2d16f644
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/6021
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
change in encoding method of changelog was critical section for
"fop dispatch thread", "roll-over thread" and "reconfigure dispatch thread".
In this patch the "encoding-method" is changed by the reconfigure dispatch thread
lazily during handle_change, which solves the concurrency among the racing
threads.
BUG: 1002940
Change-Id: I78c3e8887efa46d0fcc60755cdf4243031cfa3eb
Signed-off-by: Ajeet Jha <ajha@redhat.com>
Reviewed-on: http://review.gluster.org/5844
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Harshavardhana <harsha@harshavardhana.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Block all signal except those which are set for explicit handling
in glusterfs_signals_setup(). Since thread spawning code in
libglusterfs and xlators can get called from application threads
when used through libgfapi, it is necessary to do this blocking.
Change-Id: Ia320f80521a83d2edcda50b9ad414583a0175281
BUG: 1011662
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/5995
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support the readdirp fop in qemu-block to ensure that image files
are handled correctly when readdirp is enabled. E.g., without
readdirp support, incorrect stat data for formatted files can be
reported back (and cached) via the client.
BUG: 986775
Change-Id: Ibc4bd0b42548953ebe30832f3d853bb68095f0ac
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/5946
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Building RPMs from a 'make dist' tarball fails when qemu-block is
enabled. Enabling is done automatically when the glib2 development files
are available (enabled by ./configure).
Manual building with:
$ ./autogen.sh && ./configure && make dist && rpmbuild -ta *.tar.gz
Building in mock works fine, glib2-devel is not installed by default so
the qemu-block xlator gets disabled. This change also adds glib2-devel
to the BuildRequires in the glusterfs.spec file, causing the qemu-block
xlator to be built by default, and included in the glusterfs RPM.
Change-Id: Ibb73628772586d9e07bbfde7a8ff2fc973489086
BUG: 986775
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/5896
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I7aab98a70793bee936272f0b795f4c22c3733dd2
BUG: 1001055
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/5705
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is required by Geo-Replication that does auxillary mount
with client-pid as -1 (which has special treatment at specific
places in GlusterFS), to trigger xtime updates on the intermediate
master in a cascading setup.
Marker too had a check to "not" mark updates for geo-replication's
auxillary mounts. With the new geo-replication design, xtimes are
not set by the master on the slave for all entities. Due to this
cascading setups were broken.
This patch introduces "geo-replication.ignore-pid-check" option
as a "override" for the client-pid check for gsyncd's client-pid.
When this options is enabled, marker start "marking" even if the
updates are from the special client.
Geo-Replication on the detection of itself being an intermediate
master, enables this option.
Change-Id: I9f7140edd12fef5480595ee0f93f35b94cdb8345
BUG: 996371
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5591
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
Tested-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch extends the persistent instrumentation work done by
Aravinda (@avishwa), by introducing a handfull of instrumentation
variables for crawl. These variables are "pulled up" by glusterd
in the event of a geo-replication status cli command and looks
something like below:
"Uptime=00:21:10;FilesSyned=2982;FilesPending=0;BytesPending=0;DeletesPending=0;"
"FilesPending", "BytesPending" and "DeletesPending" are short-lived
variables that are non-zero when a changelog is being processes (ie.
when an active sync in ongoing). After a successfull changelog process
"FilesPending" is summed up into "FilesSynced". The three short-lived
variabled are then reset to zero and the data is persisted
Additionally this patch also reverts some of the changes made for
BZ #986929 (those were not needed).
Change-Id: I948f1a0884ca71bc5e5bcfdc017d16c8c54fc30b
BUG: 990420
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/5441
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for internals snapshots using QCOW2 and
general framework for external snapshots (next patch) with
QCOW2 and QED.
For internal snapshots, the file must be "initialized" or
"formatted" into QCOW2 format, and specify a file size.
Snapshots can be created, deleted, and applied ("goto").
e.g:
// Format and Initialize
sh# setfattr -n trusted.glusterfs.block-format -v qcow2:10GB /mnt/imgfile
sh# ls -l /mnt/imgfile
-rw-r--r-- 1 root root 10G Jul 18 21:20 imgfile
// Create a snapshot
sh# setfattr -n trusted.glusterfs.block-snapshot-create -v name1 imgfile
// Apply a snapshot
sh# setfattr -n trusted.gluterfs.block-snapshot-goto -v name1 imgfile
Change-Id: If993e057a9455967ba3fa9dcabb7f74b8b2cf4c3
BUG: 986775
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/5367
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prints, in the statedump, the information about the mount that
performed the inode/entry lk.
For the entrylks that are granted after a blocked state, the
blocked time is not printed. A patch for that will be sent
later.
Change-Id: Ib0c1ed21fa9328b435f96b590dd343f59814a08d
BUG: 915629
Signed-off-by: Anuradha Talur <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/5712
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I23b8cb7223b91a55af1cd4214f61bbe0e87351f6
BUG: 952029
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.org/5683
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
BUG: 952029
Change-Id: I7405d473d369a4a951836eceda4faccbad19ce0e
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.org/5497
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A volume would fail to start if any one of the bricks fails staging or
fails to start, even with the 'force' option. With this patch, when the
'force' option is given for a volume start, glusterd will continue and
start other bricks even if one fails staging or starting.
Also did a small fix in changelog, to prevent it crashing when it fails
to init.
Change-Id: I7efbd9ab13d12d69b0335ae54143fa17586f8f98
BUG: 994375
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5510
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|