| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Whenever there syncop_xxx() is used inside a synctask and gcc
optimizes it when compiled with -O2 there is a problem where
'errno' would not work as expected.
Fix:
Until http://review.gluster.com/6475 is reviewed and merged
we are making sure the function is not going to be optimized.
Change-Id: I504c18c8a7789f0c776a56f0aa60db3618b21601
BUG: 1040356
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6481
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I98da41342127b1690d887a5bc025e4c9dd504894
BUG: 1038452
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/6435
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shishir Gowda <gowda.shishir@gmail.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
@key can legally be NULL. Handle that case without crashing.
Change-Id: Iaae293caa7eeb24afc9cd2580799173e2ce00911
BUG: 1036879
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6395
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When getxattr fails with errno other than ENODATA fail rebalance
on that file. Log the reason for error.
Change-Id: Ia519870b88e6e6dd464d1c0415411aa999f80bc9
BUG: 1032927
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6341
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shishir Gowda <sgowda@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Creating linkfile could have failed, but we dont care about linkfile
for setting layout in the inode ctx (could be EEXIST etc.)
So ignore @inode in cbk and pick it up from local->loc.inode
Change-Id: I2952799d7ae0d3441b84b2ca2981afd75d7576e2
BUG: 1032859
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6319
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When clients refer to a GFID which does not exist, the errno to
be returned in ESTALE (and not ENOENT). Even though ENOENT might
look "proper" most of the time, as the application eventually expects
ENOENT even if a parent directory does not exist, not returning
ESTALE results in resolvers (FUSE and GFAPI) to not retry resolution
in uncached mode. This can result in spurious ENOENTs during
concurrent path modification operations.
Change-Id: I7a06ea6d6a191739f2e9c6e333a1969615e05936
BUG: 1032894
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6318
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is needed for two reasons:
* since dht-linkfiles are internal, they shouldn't be accounted.
* hardlink handling in marker is broken. link/unlink of hardlinks
present in same directory can break marker accounting. Hence, if src
and dst are in same directory in case of rename, dht - if it breaks
rename into link/unlink operations - should instruct marker to not to
do accounting.
Change-Id: I9c9f7384569f75a2792f6450ee7a5279bf751ae7
BUG: 1022995
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6203
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
components other than distribute (like marker to exclude linkfiles
from being accounted) also need awareness of what constitutes a
linkfile. Hence its good to separate out this functionality into
core.
Change-Id: Ib944eeacc991bb1de464c9e73ee409fc7a689ff1
BUG: 1022995
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6152
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Two stages of quota enforcement is done:
Soft and hard quota Upon reaching soft quota limit on the directory
it logs/alerts in the quota daemon log (ie DEFAULT_LOG_DIR/quotad.log)
and no more writes allowed after hard
quota limit. After reaching the soft-limit the daemon alerts the
user/admin repeatively for every 'alert-time', which is
configurable.
* Quota enforcer is moved to server-side.
It takes care of enforcing quota. Since enforcer doesn't have the
cluster view, it relies on another service called
quota-aggregator. Aggregator, on query can return the size of a
directory based on the cluster view.
Enforcer is always loaded in the server graph and is by passed if
the feature is not enabled.
Options specific to enforcer:
server-quota - Specifies whether the feature is on/off. It is used
to by pass the quota if turned off.
deem-statfs - If set to on, it takes quota limits into consideration
while estimating fs size. (df command). The algorithm followed is,
i. Adjust statvfs based on limit configured on root.
ii. If limit is set on the inode passed, use size/limits on that inode to
populate statvfs. Otherwise, use size/limits configured on root.
iii. Upon statvfs, update the ctx->size on the inode.
iv. Don't let DHT aggregate, instead take the maximum of the usages from the
subvols of the DHT, since each of it contains the complete information.
Enforcer also makes use of gfid-to-path conversion functionality to
work correctly when a client like nfs predominently relies on
nameless lookups.
* Quota Aggregator acts as a thin client to provide cluster view
Its a lightweight *gluster client* process with no mount point,
started upon enabling quota or restarting the volume. This is a
single process run on each brick, which can answer queries on all
volumes in the cluster. Its volfile stored in
GLUSTERD_DEFAULT_WORKING_DIR/quotad/quotad.vol.
Credits:
Raghavendra Bhat <rabhat@redhat.com>
Varun Shastry <vshastry@redhat.com>
Shishir Gowda <sgowda@redhat.com>
Kruthika Dhananjay <kdhananj@redhat.com>
Brian Foster <bfoster@redhat.com>
Krishnan Parthasarathi <kparthas@redhat.com>
Change-Id: Id1cb25b414951da34c665a55f77385d482e0f9de
BUG: 969461
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/5952
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
BUG: 764655
Change-Id: I4d18c9a6c00cb4696645fcb437398562f00b9d24
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/6284
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glfs_zerofill() can be potentially called to zero-out entire file and
hence allow for bigger value of length parameter.
Change-Id: I75f1d11af298915049a3f3a7cb3890a2d72fca63
BUG: 1028673
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-on: http://review.gluster.org/6266
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: M. Mohan Kumar <mohan@in.ibm.com>
Tested-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* migrate all the fd's on an inode to newer subvol after rebalance
* use the migration in progress flag in inode, so all the operations
on the inode can make use of it
Change-Id: Ib807a46e927a1062688fc15119c916797c52a350
BUG: 1013456
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.org/5891
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make changes to distributed xlator to work with BD xlator. Unlike files,
a block device can't be removed when its opened. So some part of the
code were moved down to avoid this situation. Also before truncating a
BD file its BD_XATTR should be set otherwise truncate will result in
truncating posix file. So file is created with needed BD_XATTR and
truncate is invoked. Also enables BD xlator in stripe volume type.
Change-Id: If127516e261fac5fc5b137e7fe33e100bc92acc0
BUG: 1028672
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/5235
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in
the specified range. This fop will be useful when a whole file needs to
be initialized with zero (could be useful for zero filled VM disk image
provisioning or during scrubbing of VM disk images).
Client/application can issue this FOP for zeroing out. Gluster server
will zero out required range of bytes ie server offloaded zeroing. In
the absence of this fop, client/application has to repetitively issue
write (zero) fop to the server, which is very inefficient method because
of the overheads involved in RPC calls and acknowledgements.
WRITESAME is a SCSI T10 command that takes a block of data as input and
writes the same data to other blocks and this write is handled
completely within the storage and hence is known as offload . Linux ,now
has support for SCSI WRITESAME command which is exposed to the user in
the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to
implement this fop. Thus zeroing out operations can be completely
offloaded to the storage device , making it highly efficient.
The fop takes two arguments offset and size. It zeroes out 'size' number
of bytes in an opened file starting from 'offset' position.
This patch adds zerofill support to the following areas:
- libglusterfs
- io-stats
- performance/md-cache,open-behind
- quota
- cluster/afr,dht,stripe
- rpc/xdr
- protocol/client,server
- io-threads
- marker
- storage/posix
- libgfapi
Client applications can exloit this fop by using glfs_zerofill introduced in
libgfapi.FUSE support to this fop has not been added as there is no system call
for this fop.
Changes from previous version 3:
* Removed redundant memory failure log messages
Changes from previous version 2:
* Rebased and fixed build error
Changes from previous version 1:
* Rebased for latest master
TODO :
* Add zerofill support to trace xlator
* Expose zerofill capability as part of gluster volume info
Here is a performance comparison of server offloaded zeofill vs zeroing
out using repeated writes.
[root@llmvm02 remote]# time ./offloaded aakash-test log 20
real 3m34.155s
user 0m0.018s
sys 0m0.040s
[root@llmvm02 remote]# time ./manually aakash-test log 20
real 4m23.043s
user 0m2.197s
sys 0m14.457s
[root@llmvm02 remote]# time ./offloaded aakash-test log 25;
real 4m28.363s
user 0m0.021s
sys 0m0.025s
[root@llmvm02 remote]# time ./manually aakash-test log 25
real 5m34.278s
user 0m2.957s
sys 0m18.808s
The argument log is a file which we want to set for logging purpose and
the third argument is size in GB .
As we can see there is a performance improvement of around 20% with this
fop.
Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18
BUG: 1028673
Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com>
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/5327
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two glusterfs clients return inconsistent errnos when the bricks of the volume
were down. Consider two gluster mounts. Mount 1 was done when the bricks were
online. Mount 2 was done after the bricks were killed, (using the 'glusterfs'
command instead of the mount script).
For any request, mount 1 will return ENOTCONN, where as mount 2 will return
ENOENT.
This happens because for the 2nd mount, a fuse would send a lookup on '/' for
any request, as it hadn't been done yet. The client xlator returns ENOTCONN,
but the dht_lookup_dir_cbk changed this to ENOENT unconditionally when
aggregating. So, fuse returned ENOENT, even though the errno should have been
ENOTCONN.
Change-Id: I4b7a6d84ce5153045a807fccc01485afe0377117
BUG: 1019095
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/6072
Reviewed-by: Anand Avati <avati@redhat.com>
Tested-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Incorrect NFS ACL encoding causes "system.posix_acl_default"
setxattr failure on bricks on XFS file system. XFS (potentially
others?) doesn't understand when the 0x10 prefix is added to the
ACL type field for default ACLs (which the Linux NFS client adds)
which causes setfacl()->setxattr() to fail silently. NFS client
adds NFS_ACL_DEFAULT(0x1000) for default ACL.
FIX:
Mask the prefix (added by NFS client) OFF, so the setfacl is not
rejected when it hits the FS.
Original patch by: "Richard Wareing"
Change-Id: I17ad27d84f030cdea8396eb667ee031f0d41b396
BUG: 1009210
Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com>
Reviewed-on: http://review.gluster.org/5980
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Block all signal except those which are set for explicit handling
in glusterfs_signals_setup(). Since thread spawning code in
libglusterfs and xlators can get called from application threads
when used through libgfapi, it is necessary to do this blocking.
Change-Id: Ia320f80521a83d2edcda50b9ad414583a0175281
BUG: 1011662
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/5995
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit a3e593f9f17cb1e68db97bb5a0d8074793a33964 which
was bought into fix dht_layout_anomalies error handling.
We still see applications error'ing out due to graph switches.
To fix the above issue -
We cannot heal in dht_discover, as it is a gfid based lookup, and not
path based. So, returning error here would lead to app's to see failure.
Also, update the layout in inode_ctx even if it has anomalies. Let
subsequent heals fix the issue.
Conflicts:
xlators/cluster/dht/src/dht-common.c
Signed-off-by: shishir gowda <sgowda@redhat.com>
Change-Id: I68c1056c3587e04a02344548546ddd06034489c5
BUG: 960348
Reviewed-on: http://review.gluster.org/5443
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were wrongly detecting holes/overlaps for already accounted
errors. Additionally, sort should also handle zero'ed out layout
Change-Id: Ic3d13e1d735b914f9acc01fe919bc90656baea48
BUG: 1003851
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5762
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier disk space check had an issue which didn't
provide the needed functionality to avoid migration
when the destination had lesser available space,
scenario we need to avoid is stated below :
During rebalance `migrate-data` - Destination subvol experiences
a `reduction` in 'blocks' of free space, at the same time source
subvol gains certain 'blocks' of free space. A valid check is
necessary here to avoid errorneous move to destination where
the space could be scantily available.
This patch provides a proper fix in place by subtracting
necessary file blocks from destination and adding those blocks
to source.
Change-Id: I9c7840716a4256ef614ffc0fbfd9f2b456ac28c8
BUG: 982919
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-on: http://review.gluster.org/5961
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shishir Gowda <sgowda@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Currenly the CLI rebalance status command output does not indicate the
'type' of rebalance, i.e. whether a full rebalance or only a fix-layout
was carried out.
Fix: After the rebalance status of all peers is received by the
originator glusterd, alter it to reflect the type of rebalance
before passing it on to the CLI process.
Change-Id: I1940ffda0d36e25e5b33c84a0ea210394cc9e1d3
BUG: 1004744
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/5826
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 215fea41a96479312a5ab8783c13b30ab9fe00fa
Realized soon after merging, that this patch is actually useless.
Checking for inode utilization on dst relative to src is of no use
because dst is already storing linkfile and inode is already consumed
no matter what. We might as well perform the rebalance and save on an
inode on the src node.
Change-Id: I7b8b8d35d9063a710e1bee1c7c6c7739fcc45ad4
Reviewed-on: http://review.gluster.org/5960
Reviewed-by: Harshavardhana <harsha@harshavardhana.net>
Tested-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently during `MIGRATE_DATA` we never verified about the total
inode usage among new and old bricks. Such checks are available for
disk space usage but it is also needed for inodes during data
migration. Such a check leads to uniform outcome of file distribution
upon rebalance.
Patch provides:
- Check dst_inodes < src_inodes, a friendly `warning` message to
indicate we have to ignore such an attempt
- Rename __dht_check_free_space() --> __dht_check_free_space_and_inodes()
Change-Id: I7bc4dd8b507883f0fb836300e99f0bb083493f5f
BUG: 982919
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-on: http://review.gluster.org/5948
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current self-healing algorithm is ignoring missing directories
for assigning new layout. When lookup() is racing against mkdir()
or when self-healing a half-done mkdir(), the layout assignment split
must happen based on the final number of directories, and not the
currently existing number of directories (because we finish mkdir()
of missing directories before hash layout assignment).
Without this fix, concurrent mkdir() and lookup() will step on
each others feet, create a messed up layout on disk, and end up
with different in-memory layouts.
Once two clients have different in-memory layouts, creation of
subdirectory will not arbitrate on the same hashed subvolume and will
result in GFID mismatch of the sub-directory.
Change-Id: Ia47acad67c265060405984c822b4d37512b9dbb3
BUG: 907072
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/5849
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Peter Portante <pportant@redhat.com>
Tested-by: Peter Portante <pportant@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I23b8cb7223b91a55af1cd4214f61bbe0e87351f6
BUG: 952029
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.org/5683
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I38d2fdc47e4b805deafca6805e54807976ffdb7e
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 952029
Reviewed-on: http://review.gluster.org/5496
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 4c0f4c8a89039b1fa1c9c015fb6f273268164c20.
Conflicts:
xlators/mount/fuse/src/fuse-bridge.c
For build issues added CREATE_MODE_KEY definition in:
libglusterfs/src/glusterfs.h
Change-Id: I8093c2a0b5349b01e1ee6206025edbdbee43055e
BUG: 952029
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.org/5495
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, we sent GF_READDIR_SKIP_DIRS for all subvolumes if
first_subvol != first_up_subvolume.
Also first_up_subvolume can change with-in the life of a call and
cbk. Saving the first_up_subvol in dht_local for checks.
Change-Id: I6e369e63f29c9761993f2a66ed768c424bb44d27
BUG: 996474
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5577
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because of posix default_acl on parent directory, the mode
of linkfile can get masked with the mode in the default acl.
This breaks DHT integrity. So let the mode get explicitly reset
after mknod().
Change-Id: Ia7328e1ee7b4430bda308f9da293dba78405e081
BUG: 990410
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/5440
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently non-regular files were created as root:root, and
their ownership never healed during rebalance process.
Also, in dht_linkfile_attr_heal, we have to heal the linkfiles
as default, as currently linkfiles are created as root:root.
That check existed, as earlier linkfiles were created as
frame->root->uid/gid
Change-Id: I6cd88361b81bdd500e15bc47b623f5db8eec88e9
BUG: 990154
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5434
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently rebalance/remove-brick op's display migration failed count even
for files which failed due to space issues (not enough space for file, or
migration leading to cluster imbalance)
These will now be counted as skipped, and rebalance/remove-brick status
will display the additional counter
Change-Id: I674904d380b5f8300e9ca9e6af557c3d30d6cff4
BUG: 989846
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5399
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nufa fails to init if a local brick is not found as of today.
With this patch, if a local brick is not found, nufa switches
over to dht mode of operations.
Change-Id: I50ac1af37621b1e776c8c00a772b8e3dfb3691df
BUG: 980838
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-on: http://review.gluster.org/5414
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently dht creates/deletes linkfiles for various ops like
rename/linking and when layout changes. dht_linkfile_create
already sends a key GLUSTERFS_INTERNAL_FOP_KEY in dict to
identify this as an internal fop and not user based op.
Enhancing rename related links/unlinks to send this key too.
Marker/changelog or other xlators can now identify these as
internal fops and handle them accordingly
Change-Id: Ib1ca789e6dbce48703c55ad3f4f88f7cd2df3d06
BUG: 987428
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5335
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* files can be accessed directly through their gfid and not just
through their paths. For eg., if the gfid of a file is
f3142503-c75e-45b1-b92a-463cf4c01f99, that file can be accessed
using <gluster-mount>/.gfid/f3142503-c75e-45b1-b92a-463cf4c01f99
.gfid is a virtual directory used to seperate out the namespace
for accessing files through gfid. This way, we do not conflict with
filenames which can be qualified as uuids.
* A new file/directory/symlink can be created with a pre-specified
gfid. A setxattr done on parent directory with fuse_auxgfid_newfile_args_t
initialized with appropriate fields as value to key "glusterfs.gfid.newfile"
results in the entry <parent>/bname whose gfid is set to args.gfid. The
contents of the structure should be in network byte order.
struct auxfuse_symlink_in {
char linkpath[]; /* linkpath is a null terminated string */
} __attribute__ ((__packed__));
struct auxfuse_mknod_in {
unsigned int mode;
unsigned int rdev;
unsigned int umask;
} __attribute__ ((__packed__));
struct auxfuse_mkdir_in {
unsigned int mode;
unsigned int umask;
} __attribute__ ((__packed__));
typedef struct {
unsigned int uid;
unsigned int gid;
char gfid[UUID_CANONICAL_FORM_LEN + 1]; /* a null terminated gfid string
* in canonical form.
*/
unsigned int st_mode;
char bname[]; /* bname is a null terminated string */
union {
struct auxfuse_mkdir_in mkdir;
struct auxfuse_mknod_in mknod;
struct auxfuse_symlink_in symlink;
} __attribute__ ((__packed__)) args;
} __attribute__ ((__packed__)) fuse_auxgfid_newfile_args_t;
An initial consumer of this feature would be geo-replication to
create files on slave mount with same gfids as that on master.
It will also help gsyncd to access files directly through their
gfids. gsyncd in its newer version will be consuming a changelog
(of master) containing operations on gfids and sync corresponding
files to slave.
* Also, bring in support to heal gfids with a specific value.
fuse-bridge sends across a gfid during a lookup, which storage
translators assign to an inode (file/directory etc) if there is
no gfid associated it. This patch brings in support
to specify that gfid value from an application, instead of relying
on random gfid generated by fuse-bridge.
gfids can be healed through setxattr interface. setxattr should be
done on parent directory. The key used is "glusterfs.gfid.heal"
and the value should be the following structure whose contents
should be in network byte order.
typedef struct {
char gfid[UUID_CANONICAL_FORM_LEN + 1]; /* a null terminated gfid
* string in canonical form
*/
char bname[]; /* a null terminated basename */
} __attribute__((__packed__)) fuse_auxgfid_heal_args_t;
This feature can be used for upgrading older geo-rep setups where gfids
of files are different on master and slave to newer setups where they
should be same. One can delete gfids on slave using setxattr -x and
.glusterfs and issue stat on all the files with gfids from master.
Thanks to "Amar Tumballi" <amarts@redhat.com> and "Csaba Henk"
<csaba@redhat.com> for their inputs.
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Change-Id: Ie8ddc0fb3732732315c7ec49eab850c16d905e4e
BUG: 952029
Reviewed-on: http://review.gluster.com/#/c/4702
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.org/4702
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
with the sequence of operations are like below, we have issues
with current code (MP == mountpoint):
T0,MP1# mkdir /abcd (Succeeds on hash_subvol)
T1,MP2# mkdir /abcd (Gets EEXIST as dir exists in hash_subvol)
T2,MP2# mkdir /.gfid/<abcd's gfid>/xyz (lookup happens on abcd's
gfid, calls dht_discover)
T3,MP1# (Completes mkdir(), goes to dir_selfheal to set the layouts).
T4,MP2# (dht_discover_cbk gets success for lookup as the entry
existed, as layout is not yet written, it says normalize
done, found holes).
T5,MP2# (as layout anomaly is not considered an issue in this patch,
dht_layout_set happens on inode, with all xlators pointing
to 0s)
T6,MP1# (completes mkdir call, inode has proper layouts)
T7,MP2# mkdir /.gfid/<abcd's gfid>/xyz fails with ENOENT
(with log saying no subvol found for hash value of xyz.
Porting Amar's fix from down-stream beta branch.
Change-Id: Ibdc37ee614c96158a1330af19cad81a39bee2651
BUG: 982913
Original-author: Amar Tumballi <amarts@redhat.com>
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5302
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If access fails with ENOTCONN, do not wind to same subvol.
We wind to first-up-subvol if access fails with ENOTCONN.
In few cases, if dht has only 1 subvolume, and access fails with
ENOTCONN, we go into a infinite loop of winding to same subvol
The fix is to check if we previously wound to same subvol, and
fail if first-up-subvol is same.
Change-Id: Ib5d3ce7d33e8ea09147905a7df1ed280874fa549
BUG: 983431
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5319
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The API is described in libxlator.h.
Behavior remains the same for this commit; this
is a preparatory step for per-translator customization
of aggregation.
Change-Id: I5d42923af59b2fd78e1ff59c12763875b57c5190
BUG: 847839
Original Author: Csaba Henk <csaba@redhat.com>
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/4903
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this works similar to pathinfo now except that the request is sent
to all subvolumes of dht. Underlying replica selects it's subvolume
in a round-robin fashion till one of them returns successfully.
Change-Id: Ie46c5f7090d04d8c2e487b209916ae6791e94624
BUG: 847839
Original Author: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/5225
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* in both distribute and replicate (ignoring stripe for now),
add logic to calculate the min() of stime values.
* What is a 'stime' ? Why is this required:
- stime means 'slave xtime', mainly used to keep track of slave
node's sync status when distributed geo-replication is used.
Logic of calculating 'min()' for this stime is very important as
in case of crashes/reboots/shutdown, we will have to 'restart'
with crawling from stime time value from the mount point, which
gives the 'min()' of all the bricks, which means, we don't miss
syncing any files in the above cases.
Change-Id: I2be8d434326572be9d4986db665570a6181db1ee
BUG: 847839
Original Author: Amar Tumballi <amarts@redhat.com>
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/4893
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a rename happens during migration, unlink fails. This leads
to stale xattrs, and Sticky bits still being set.
By removing the xattrs and Sticky bits from dst (through fd ops),
the stale link file would be cleaned up eventually (even if unlink
fails on src).
Change-Id: Iec537d021905438327a20e1d811aa06e74034364
BUG: 983399
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5316
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently if linkfile fails with ENOENT, we do not fail. We also
need to treat failures with ENOTCONN as success, as if cached subvol
is up, rm of a file should succeed. A stale linkfile will get removed
later
Change-Id: I71d136847933351ed9e2c939bda4a69bc96a3cfc
BUG: 983416
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5317
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently when selecting a alternative subvolume when hashed
subvol has exceeded min-free-disk/inodes, we do not check if
layouts have errors (including decommissioning). This leads
to data being written to those subvolumes, and in case of
decommissioning, will lead to data loss.
Change-Id: Ie0c6cf4a29d7c53d8a6d8a8c1bd595cf58a0012a
BUG: 982919
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5299
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, nufa wouldn't work on volume topologies such as
distribute-replicate or distribute-stripe.
Change-Id: Ia89ed4412a00601022c1fc94f046056ce4820fe8
BUG: 980838
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5262
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If no decommissioned nodes options are set in the options, then
clear the conf->decommissioned_bricks.
Change-Id: I426d2bcc874aab21b2eba0b16a580b9a26672ea2
Signed-off-by: shishir gowda <sgowda@redhat.com>
BUG: 973073
Reviewed-on: http://review.gluster.org/5199
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for the DISCARD file operation. Discard punches a hole
in a file in the provided range. Block de-allocation is implemented
via fallocate() (as requested via fuse and passed on to the brick
fs) but a separate fop is created within gluster to emphasize the
fact that discard changes file data (the discarded region is
replaced with zeroes) and must invalidate caches where appropriate.
BUG: 963678
Change-Id: I34633a0bfff2187afeab4292a15f3cc9adf261af
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/5090
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement support for the fallocate file operation. fallocate
allocates blocks for a particular inode such that future writes
to the associated region of the file are guaranteed not to fail
with ENOSPC.
This patch adds fallocate support to the following areas:
- libglusterfs
- mount/fuse
- io-stats
- performance/md-cache,open-behind
- quota
- cluster/afr,dht,stripe
- rpc/xdr
- protocol/client,server
- io-threads
- marker
- storage/posix
- libgfapi
BUG: 949242
Change-Id: Ice8e61351f9d6115c5df68768bc844abbf0ce8bd
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/4969
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In some code paths neither loc->gfid nor loc->inode->gfid
is populated which leads to EINVAL for linkfile setattr
in dht_linkfile_attr_heal.
Fix:
Populate loc->gfid before dht_linkfile_attr_heal.
Change-Id: I062770e6f6eaead304eff1dae81f8588a3b97eed
BUG: 971805
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5178
Reviewed-by: Shishir Gowda <sgowda@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ie2b2dc54aa0d500c35752c72d3b562bcc05b1fc2
BUG: 969336
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5123
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If unlink of linkfile returns ENOENT, do not fail unlink.
Proceed with unlinking of cached file.
Change-Id: If7cec92b40c39d68dd9c3606c6c2c3a6bd67d27b
BUG: 966848
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/4971
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Assign local = frame->local before dereferencing
local->linkfile.linkfile_cbk. Additionally, fail if
op_ret is non_zero.
Change-Id: I96a2f34ba29887da9ccaae38a644431cf7c43265
BUG: 966858
Signed-off-by: shishir gowda <sgowda@redhat.com>
Reviewed-on: http://review.gluster.org/5141
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|