| Commit message (Collapse) | Author | Age | Files | Lines |
| |
A crash is seen during a reattempt to clean up shards in background
upon remount. And this happens even on remount (which means a remount
is no workaround for the crash).
In such a situation, the in-memory base inode object will not exist
(new process, non-existent base shard).
So local->resolver_base_inode will be NULL.
In the event of an error (in this case, of space running out), the
process would crash at the time of logging the error in the following line -
    gf_msg(this->name, GF_LOG_ERROR, local->op_errno, SHARD_MSG_FOP_FAILED,
           "failed to delete shards of %s",
           uuid_utoa(local->resolver_base_inode->gfid));
Fixed that by using local->base_gfid as the source of gfid when
local->resolver_base_inode is NULL.
Change-Id: I0b49f2b58becd0d8874b3d4b14ff8d92a89d02d5
Fixes: #1127
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
(cherry picked from commit cc43ac8651de9aa508b01cb259b43c02d89b2afc)
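The fallback described in this commit can be sketched as below. This is a minimal illustration, not the translator's code: `struct inode_sketch`, `struct local_sketch` and `gfid_for_logging()` are hypothetical stand-ins, and only the field names `resolver_base_inode` and `base_gfid` come from the message above.

```c
#include <stddef.h>
#include <string.h>

#define GFID_LEN 16

/* Hypothetical, simplified stand-ins for the shard translator's structures. */
struct inode_sketch {
    unsigned char gfid[GFID_LEN];
};

struct local_sketch {
    struct inode_sketch *resolver_base_inode; /* may be NULL after a remount */
    unsigned char base_gfid[GFID_LEN];        /* copy of the base file's gfid */
};

/* Pick the gfid to log without dereferencing a NULL base inode. */
static const unsigned char *
gfid_for_logging(const struct local_sketch *local)
{
    if (local->resolver_base_inode != NULL)
        return local->resolver_base_inode->gfid;
    return local->base_gfid;
}
```

The error path would then pass the result of `gfid_for_logging()` to the logging call instead of reading `local->resolver_base_inode->gfid` directly.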
| |
Change-Id: I0cebaaf55c09eb1fb77a274268ff564e871b743b
fixes bz#1737141
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
(cherry picked from commit 51237eda7c4b3846d08c5d24d1e3fe9b7ffba1d4)
| |
Ever since we added quorum checks for lookups in afr via commit
bd44d59741bb8c0f5d7a62c5b1094179dd0ce8a4, the split-brain resolution
commands would not work for replica 3 because there would be no
readables for the lookup fop.
The argument was that split-brains do not occur in replica 3 but we do
see (data/metadata) split-brain cases once in a while which indicate that there are
a few bugs/corner cases yet to be discovered and fixed.
Fortunately, commit 8016d51a3bbd410b0b927ed66be50a09574b7982 added
GF_CLIENT_PID_GLFS_HEALD as the pid for all fops made by glfsheal. If we
leverage this and allow lookups in afr when pid is GF_CLIENT_PID_GLFS_HEALD,
split-brain resolution commands will work for replica 3 volumes too.
Likewise, the check is added in shard_lookup as well to permit resolving
split-brains by specifying "/.shard/shard-file.xx" as the file name
(which previously used to fail with EPERM).
Change-Id: I3c543dea79caf7cfbc1633e9089cb1cdd2538ba9
Fixes: bz#1760792
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 47dbd753187f69b3835d2e42fdbe7485874c4b3e)
| |
Backport of:
> BUG: bz#1705884
> Change-Id: I9128a192e9bf8c3c3a959e96b7400879d03d7c53
> Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
The way delta_blocks is computed in shard is incorrect when a file
is truncated to a lower size. The accounting only considers the change
in size of the last of the truncated shards.
FIX:
Get the block-count of each shard just before an unlink at posix in
xdata. Their summation plus the change in size of last shard
(from an actual truncate) is used to compute delta_blocks which is
used in the xattrop for size update.
Change-Id: I9128a192e9bf8c3c3a959e96b7400879d03d7c53
fixes: bz#1716871
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
(cherry picked from commit 400b66d568ad18fefcb59949d1f8368d487b9a80)
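The accounting described in the FIX can be sketched as below. This is a minimal illustration under assumptions: `compute_delta_blocks()` and its parameters are hypothetical, and the sign convention (negative when the file shrinks) is chosen for the example rather than taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Subtract the block count of every shard that was unlinked, then add the
 * change in block count of the last surviving shard (the one that was
 * actually truncated). The result is negative when the file shrinks.
 */
static int64_t
compute_delta_blocks(const uint64_t *unlinked_blocks, size_t n,
                     uint64_t last_shard_blocks_before,
                     uint64_t last_shard_blocks_after)
{
    int64_t delta = 0;

    for (size_t i = 0; i < n; i++)
        delta -= (int64_t)unlinked_blocks[i];

    delta += (int64_t)last_shard_blocks_after -
             (int64_t)last_shard_blocks_before;
    return delta;
}
```

For example, unlinking shards of 8192, 8192 and 4096 blocks while the last remaining shard shrinks from 8192 to 2048 blocks yields a delta of -26624 blocks.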
| |
Backport of:
> BUG: bz#1705884
> Change-Id: I2c1ddab17457f45e27428575ad16fa678fd6c0eb
> Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
... by holding delta_blocks in 64-bit int as opposed to 32-bit int.
Change-Id: I2c1ddab17457f45e27428575ad16fa678fd6c0eb
updates: bz#1716871
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
(cherry picked from commit e18e98659dd2b41eb59cf593fd625f1821a20abf)
| |
PROBLEM:
Many of the earlier changes in the management of shards in the lru and
fsync lists assumed that if a given shard exists in the fsync list, it
must be part of the lru list as well. This was found to be untrue.
Consider this - a file is FALLOCATE'd to a size which would make the
number of participant shards to be greater than the lru list size.
In this case, some of the resolved shards that are to participate in
this fop will be evicted from lru list to give way to the rest of the
shards. And once FALLOCATE completes, these shards are added to fsync
list but without a ref. After the fop completes, these shard inodes
are unref'd and destroyed while their inode ctxs are still part of
fsync list. Now when an FSYNC is called on the base file and the
fsync-list traversed, the client crashes due to illegal memory access.
FIX:
Hold a ref on the shard inode when adding to fsync list as well.
And unref under following conditions:
1. when the shard is evicted from lru list
2. when the base file is fsync'd
3. when the shards are deleted.
Change-Id: Iab460667d091b8388322f59b6cb27ce69299b1b2
fixes: bz#1669077
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
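The idea behind the FIX, that each list holds its own reference on the shard inode, can be sketched as below. This is a simplified model, not the translator's code: `struct shard_inode_sketch` and all four helpers are hypothetical, and locking is omitted.

```c
#include <stdbool.h>

/* Hypothetical, simplified inode: a reference count and list membership. */
struct shard_inode_sketch {
    int ref;
    bool in_lru;
    bool in_fsync;
};

/* Adding to a list takes a ref so the inode outlives the current fop. */
static void
lru_add(struct shard_inode_sketch *inode)
{
    if (!inode->in_lru) {
        inode->in_lru = true;
        inode->ref++;
    }
}

static void
fsync_list_add(struct shard_inode_sketch *inode)
{
    if (!inode->in_fsync) {
        inode->in_fsync = true;
        inode->ref++;
    }
}

/* Leaving a list drops the ref that was taken when joining it. */
static void
lru_evict(struct shard_inode_sketch *inode)
{
    if (inode->in_lru) {
        inode->in_lru = false;
        inode->ref--;
    }
}

static void
fsync_list_remove(struct shard_inode_sketch *inode)
{
    if (inode->in_fsync) {
        inode->in_fsync = false;
        inode->ref--;
    }
}
```

With this model, a shard that has been evicted from the lru list but is still on the fsync list keeps a non-zero reference count, so a later traversal of the fsync list does not touch freed memory.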
| |
PROBLEM:
When multiple sharded files are deleted in quick succession, multiple
issues were observed:
1. misleading logs corresponding to a sharded file where while one log
message said the shards corresponding to the file were deleted
successfully, this was followed by multiple logs suggesting the very
same operation failed. This was because of multiple synctasks
attempting to clean up shards of the same file and only one of them
succeeding (the one that gets ENTRYLK successfully), and the rest of
them logging failure.
2. multiple synctasks to do background deletion would be launched, one
for each deleted file, and all of them could readdir entries from
.remove_me at the same time and potentially contend for ENTRYLK on
.shard for each of the entry names. This is undesirable and wasteful.
FIX:
Background deletion will now follow a state machine. In the event that
there are multiple attempts to launch synctask for background deletion,
one for each file deleted, only the first task is launched. And if while
this task is doing the cleanup, more attempts are made to delete other
files, the state of the synctask is adjusted so that it restarts the
crawl even after reaching end-of-directory to pick up any files it may
have missed in the previous iteration.
This patch also fixes uninitialized lk-owner during syncop_entrylk()
which was leading to multiple background deletion synctasks entering
the critical section at the same time and leading to illegal memory access
of base inode in the second synctask after it was destroyed post shard deletion
by the first synctask.
Change-Id: Ib33773d27fb4be463c7a8a5a6a4b63689705324e
updates: bz#1662368
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
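The state machine described in the FIX can be sketched as below. This is a minimal illustration: the state names and both functions are hypothetical, and the locking that must guard the state in a multi-threaded client is left out.

```c
#include <stdbool.h>

/* Hypothetical state names; the translator's real enum may differ. */
enum bg_deletion_state {
    BG_DELETION_IDLE,
    BG_DELETION_RUNNING,
    BG_DELETION_RERUN_REQUESTED
};

/*
 * Called once per deleted file. Returns true only when the caller should
 * launch the background task; later callers just flag a re-crawl.
 */
static bool
bg_deletion_request(enum bg_deletion_state *state)
{
    switch (*state) {
    case BG_DELETION_IDLE:
        *state = BG_DELETION_RUNNING;
        return true;
    case BG_DELETION_RUNNING:
        *state = BG_DELETION_RERUN_REQUESTED;
        return false;
    case BG_DELETION_RERUN_REQUESTED:
        return false;
    }
    return false;
}

/*
 * Called by the task at end-of-directory. Returns true when the crawl must
 * restart because more files were deleted in the meantime.
 */
static bool
bg_deletion_end_of_crawl(enum bg_deletion_state *state)
{
    if (*state == BG_DELETION_RERUN_REQUESTED) {
        *state = BG_DELETION_RUNNING;
        return true;
    }
    *state = BG_DELETION_IDLE;
    return false;
}
```

Only the first request launches a task; any number of requests made while it runs collapse into a single extra crawl.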
| |
excessive logging
... of the kind
"[2018-12-26 05:22:44.195019] E [MSGID: 133010]
[shard.c:2253:shard_common_lookup_shards_cbk] 0-volume1-shard: Lookup
on shard 785 failed. Base file gfid = cd938e64-bf06-476f-a5d4-d580a0d37416
[No such file or directory]"
shard_common_lookup_shards_cbk() has a specific check to ignore ENOENT errors without
logging them during specific fops. But because background deletion is done in a new
frame (with local->fop being GF_FOP_NULL), the ENOENT check is skipped and the
absence of shards gets logged every time.
To fix this, local->fop is initialized to GF_FOP_UNLINK during background deletion.
Change-Id: I0ca8d3b3bfbcd354b4a555eee520eb0479bcda35
updates: bz#1662368
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
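The effect of the fix on the logging decision can be sketched as below. This is a simplified model: `enum fop_sketch` and `should_log_lookup_failure()` are hypothetical, and the real check may exempt more fops than the single one shown.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GF_FOP_NULL, GF_FOP_UNLINK and another fop. */
enum fop_sketch { FOP_NULL_SKETCH, FOP_UNLINK_SKETCH, FOP_WRITE_SKETCH };

/*
 * A missing shard is expected while deleting, so ENOENT is not logged for
 * an unlink. Every other combination is still reported.
 */
static bool
should_log_lookup_failure(enum fop_sketch fop, int op_errno)
{
    if (op_errno == ENOENT && fop == FOP_UNLINK_SKETCH)
        return false;
    return true;
}
```

A frame whose fop is left at the null value always takes the logging branch, which is why initializing it to the unlink fop silences the message.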
| |
Since glusterd2 doesn't maintain the xlator option details in code, it
reads the xlator options table directly from the `*.so` files. To support
enabling and disabling an xlator, a new option is added to the option
table with the same name as the xlator itself.
This change will not affect the functionality with glusterd1.
Change-Id: I23d9e537f3f422de72ddb353484466d3519de0c1
updates: #302
Signed-off-by: Aravinda VK <avishwan@redhat.com>
| |
Fixes: #164
Change-Id: I93ad6f0232a1dc534df099059f69951e1339086f
Signed-off-by: Amar Tumballi <amarts@redhat.com>
| |
libglusterfs devel package headers are referenced in code using the
include semantics for a program's own headers. While this works, it can
be improved, especially when dealing with out-of-tree xlator builds or,
in general, out-of-tree devel package usage.
Towards this, the following changes are done,
- moved all devel headers under a glusterfs directory
- Included these headers using system header notation <> in all
code outside of libglusterfs
- Included these headers using own program notation "" within
libglusterfs
This change although big, is just moving around the headers and
making it correct when including these headers from other sources.
This helps us correctly include libglusterfs includes without
namespace conflicts.
Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR <srangana@redhat.com>
| |
CID: 1325524
Change-Id: Ic713285bd9e76d8e4dc1815aa471087d279008b5
updates: bz#789278
Signed-off-by: Susant Palai <spalai@redhat.com>
| |
Currently, there are possibilities in few places, where a user-controlled
(like filename, program parameter etc) string can be passed as 'fmt' for
printf(), which can lead to segfault, if the user's string contains '%s',
'%d' in it.
While fixing this, it makes sense to check explicitly for such issues
across the codebase and to make every such call with a proper format string.
Fixes: CVE-2018-14661
Fixes: bz#1644763
Change-Id: Ib547293f2d9eb618594cbff0df3b9c800e88bde4
Signed-off-by: Amar Tumballi <amarts@redhat.com>
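The class of bug described in this commit, and the usual remedy, can be sketched as below. `format_user_text()` is a hypothetical helper written for this illustration, not a function from the codebase.

```c
#include <stdio.h>

/*
 * Unsafe pattern (do not use): snprintf(out, size, user_text);
 * Any '%s' or '%d' inside user_text would be treated as a conversion and
 * read arguments that were never passed.
 *
 * Safe pattern: the format is a fixed literal and the user-controlled
 * string is passed as data.
 */
static int
format_user_text(char *out, size_t size, const char *user_text)
{
    return snprintf(out, size, "%s", user_text);
}
```

A file name such as `file-%s-%d.txt` is then copied through unchanged instead of being interpreted as a format.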
| |
This patch fixes CID:
1394664 : CHECKED_RETURN
1356534 : Macro compares unsigned to 0 (NO_EFFECT)
1356532 : Macro compares unsigned to 0 (NO_EFFECT)
updates: bz#789278
Change-Id: I04d64fd8c007627611710dc56109b76eeb59333a
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
| |
This patch fixes CID: 1396177: NULL dereference.
updates: bz#789278
Change-Id: Ic5d302a5e32d375acf8adc412763ab94e6dabc3d
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
| |
In __shard_update_shards_inode_list(), previously shard translator
was not holding a ref on the base inode whenever a shard was added to
the lru list. But if the base shard is forgotten and destroyed either
by fuse due to memory pressure or due to the file being deleted at some
point by a different client with this client still containing stale
shards in its lru list, the client would crash at the time of locking
lru_base_inode->lock owing to illegal memory access.
So now the base shard is ref'd into the inode ctx of every shard that
is added to lru list until it gets lru'd out.
The patch also handles the case where none of the shards associated
with a file that is about to be deleted are part of the LRU list and
where an unlink at the beginning of the operation destroys the base
inode (because there are no refkeepers) and hence all of the shards
that are about to be deleted will be resolved without the existence
of a base shard in-memory. This, if not handled properly, could lead
to a crash.
Change-Id: Ic15ca41444dd04684a9458bd4a526b1d3e160499
updates: bz#1605056
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
When compiling on other architectures, many warnings appear. Some
of them are actual problems that prevent gluster from working correctly
on those architectures.
Change-Id: Icdc7107a2bc2da662903c51910beddb84bdf03c0
fixes: bz#1632717
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
| |
Post changing the max op-version to 4.2, after release
4.1 branching, the decision was to go with increasing
release numbers. Thus this needs to change to 5.0.
This commit addresses the above change.
Fixes: bz#1628664
Change-Id: Ifcc0c6da90fdd51e4eceea40749511110a432cce
Signed-off-by: ShyamsundarR <srangana@redhat.com>
| |
Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
Signed-off-by: Nigel Babu <nigelb@redhat.com>
| |
Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
| |
xlators/storage/posix/src/posix-inode-fd-ops.c:
xlators/storage/posix/src/posix-helpers.c:
xlators/storage/bd/src/bd.c:
xlators/protocol/client/src/client-lk.c:
xlators/performance/quick-read/src/quick-read.c:
xlators/performance/io-cache/src/page.c
xlators/nfs/server/src/nfs3-helpers.c
xlators/nfs/server/src/nfs-fops.c
xlators/nfs/server/src/mount3udp_svc.c
xlators/nfs/server/src/mount3.c
xlators/mount/fuse/src/fuse-helpers.c
xlators/mount/fuse/src/fuse-bridge.c
xlators/mgmt/glusterd/src/glusterd-utils.c
xlators/mgmt/glusterd/src/glusterd-syncop.h
xlators/mgmt/glusterd/src/glusterd-snapshot.c
xlators/mgmt/glusterd/src/glusterd-rpc-ops.c
xlators/mgmt/glusterd/src/glusterd-replace-brick.c
xlators/mgmt/glusterd/src/glusterd-op-sm.c
xlators/mgmt/glusterd/src/glusterd-mgmt.c
xlators/meta/src/subvolumes-dir.c
xlators/meta/src/graph-dir.c
xlators/features/trash/src/trash.c
xlators/features/shard/src/shard.h
xlators/features/shard/src/shard.c
xlators/features/marker/src/marker-quota.c
xlators/features/locks/src/common.c
xlators/features/leases/src/leases-internal.c
xlators/features/gfid-access/src/gfid-access.c
xlators/features/cloudsync/src/cloudsync-plugins/src/cloudsyncs3/src/libcloudsyncs3.c
xlators/features/bit-rot/src/bitd/bit-rot.c
xlators/features/bit-rot/src/bitd/bit-rot-scrub.c
xlators/encryption/crypt/src/metadata.c
xlators/encryption/crypt/src/crypt.c
xlators/performance/md-cache/src/md-cache.c:
Move to GF_MALLOC() instead of GF_CALLOC() when possible
It doesn't make sense to calloc (allocate and clear) memory
when the code right away fills that memory with data.
It may be optimized by the compiler, or have a microscopic
performance improvement.
In some cases, also changed allocation size to be sizeof some
struct or type instead of a pointer - easier to read.
In some cases, removed redundant strlen() calls by saving the result
into a variable.
1. Only done for the straightforward cases. There's room for improvement.
2. Please review carefully, especially for string allocation, with the
terminating NUL byte.
Only compile-tested!
.. and allocate memory as much as needed.
xlators/nfs/server/src/mount3.c :
Don't blindly allocate PATH_MAX, but strlen() the string and allocate
appropriately.
Also, align error messages.
updates: bz#1193929
Original-Author: Yaniv Kaul <ykaul@redhat.com>
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: Ibda6f33dd180b7f7694f20a12af1e9576fe197f5
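The allocation pattern this commit moves to can be sketched as below. The example uses plain `malloc()` from the C library rather than the project's GF_MALLOC() wrapper, and `copy_string()` is a hypothetical helper written for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Allocate exactly as much as the string needs and fill it right away, so
 * zeroing the buffer first (calloc) would be wasted work.
 */
static char *
copy_string(const char *src)
{
    size_t len = strlen(src);    /* computed once and reused below */
    char *dst = malloc(len + 1); /* +1 for the terminating NUL byte */

    if (dst == NULL)
        return NULL;

    memcpy(dst, src, len + 1);   /* overwrites every allocated byte */
    return dst;
}
```

The caller owns the returned buffer and releases it with `free()`.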
| |
xlators/features/index/src/index.c
xlators/features/shard/src/shard.c
xlators/features/upcall/src/upcall-internal.c
xlators/mgmt/glusterd/src/glusterd-bitrot.c
xlators/mgmt/glusterd/src/glusterd-locks.c
xlators/mgmt/glusterd/src/glusterd-mountbroker.c
xlators/mgmt/glusterd/src/glusterd-op-sm.c
For const strings, just do compile time size calc instead of runtime.
Compile-tested only!
Change-Id: I995b2b89f14454b3855a4cd0ca90b3f01d5e080f
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
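The compile-time size calculation mentioned in this commit can be sketched as below. `LITERAL_LEN` and `shard_dir_len` are hypothetical names chosen for this illustration.

```c
#include <stddef.h>

/*
 * For a string literal (or a char array), sizeof includes the terminating
 * NUL byte, so subtracting one gives the same value strlen() would compute
 * at run time. Do not use this on a char pointer.
 */
#define LITERAL_LEN(lit) (sizeof(lit) - 1)

/* Evaluated by the compiler; no strlen() call at run time. */
static const size_t shard_dir_len = LITERAL_LEN(".shard");
```

The macro is only valid for literals and arrays. Applied to a `char *` it would silently return the pointer size minus one.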
| |
Fix following coverity issues-
CID:
1394660
1394668
1394667
1389008
1389434
https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=84880983&defectInstanceId=25821108&mergedDefectId=1389008
https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=84880983&defectInstanceId=25821101&mergedDefectId=1389434
https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=84880983&defectInstanceId=25821001&mergedDefectId=1394660
https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=84880983&defectInstanceId=25821010&mergedDefectId=1394667
https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=84880983&defectInstanceId=25821017&mergedDefectId=1394668
Change-Id: I08f09649dbe758ba0d367ae5330b48b18784dec3
updates: bz#789278
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
| |
Setting the refresh flag in inode ctx in shard_rename_src_cbk()
is applicable only when the dst file exists and is sharded and
has a hard link > 1 at the time of rename.
But this piece of code is exercised even when dst doesn't exist.
In this case, the mount crashes because local->int_inodelk.loc.inode
is NULL.
Change-Id: Iaf85a5ee3dff8b01a76e11972f10f2bb9dcbd407
Updates: bz#1611692
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
Currently this lru limit is hard-coded to 16384. This patch makes it
configurable to make it easier to hit the lru limit and enable testing
of different cases that arise when the limit is reached.
The option is features.shard-lru-limit. It is by design allowed to
be configured only in init() but not in reconfigure(). This is to avoid
all the complexity associated with eviction of least recently used shards
when the list is shrunk.
Change-Id: Ifdcc2099f634314fafe8444e2d676e192e89e295
updates: bz#1605056
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
A synctask is created that would scan the indices from
.shard/.remove_me, to delete the shards associated with the
gfid corresponding to the index bname and the rate of deletion
is controlled by the option features.shard-deletion-rate whose
default value is 100.
The task is launched on two accounts:
1. when shard receives its first-ever lookup on the volume
2. when a rename or unlink deleted an inode
Change-Id: Ia83117230c9dd7d0d9cae05235644f8475e97bc3
updates: bz#1568521
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
(part 1)
PROBLEM:
Shards are deleted synchronously when a sharded file is unlinked or
when a sharded file participating as the dst in a rename() is going to
be replaced. The problem with this approach is it makes the operation
really slow, sometimes causing the application to time out, especially
with large files.
SOLUTION:
To make this operation atomic, we introduce a ".remove_me" directory.
Now renames and unlinks will simply involve two steps:
1. creating an empty file under .remove_me named after the gfid of the file
participating in unlink/rename
2. carrying out the actual rename/unlink
A synctask is created (more on that in part 2) to scan this directory
after every unlink/rename operation (or upon a volume mount) and clean
up all shards associated with it. All of this happens in the background.
The task takes care to delete the shards associated with the gfid in
.remove_me only if this gfid doesn't exist in backend, ensuring that the
file was successfully renamed/unlinked and its shards can be discarded now
safely.
Change-Id: Ia1d238b721a3e99f951a73abbe199e4245f51a3a
updates: bz#1568521
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
updates: bz#789278
Change-Id: I745a98e957cf3c6ba69247fcf6b58dd05cf59c3c
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
| |
Also move the common parallel unlink callback for GF_FOP_TRUNCATE and
GF_FOP_FTRUNCATE into a separate function.
Change-Id: Ib0f90a5f62abdfa89cda7bef9f3ff99f349ec332
updates: bz#1568521
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
Change-Id: Iea7ad2102220c6d415909f8caef84167ce2d6818
updates: bz#1568521
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
Problem:
shard_post_lookup_fsync_handler() goes over the list of inode-ctx that need to
be fsynced and in cbk it removes each of the inode-ctx from the list. When the
first member of the list is removed, it tries to modify the list head's memory
with the latest next/prev, and when this happens there is no guarantee that the
list head, which lives in the stack memory of shard_post_lookup_fsync_handler(),
is still valid.
Fix:
Do list_del_init() in the loop before winding fsync.
BUG: 1557876
Change-Id: If429d3634219e1a435bd0da0ed985c646c59c2ca
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
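The fix can be sketched with a small circular doubly linked list, as below. The list helpers are written for this illustration and only mimic the behaviour of list_del_init(); `detach_all()` is a hypothetical function.

```c
#include <stddef.h>

struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

static void
list_init_sketch(struct list_node *n)
{
    n->next = n;
    n->prev = n;
}

static void
list_add_tail_sketch(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink the node and make it point at itself, like list_del_init(). */
static void
list_del_init_sketch(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init_sketch(n);
}

/*
 * Detach every node while the (possibly stack-allocated) head is still
 * valid, and only then hand the nodes to asynchronous work. No callback
 * can touch the head afterwards. Returns the number of nodes detached.
 */
static int
detach_all(struct list_node *head, struct list_node **out, int max)
{
    int count = 0;

    while (head->next != head && count < max) {
        struct list_node *node = head->next;
        list_del_init_sketch(node);
        out[count++] = node;
    }
    return count;
}
```

After `detach_all()` returns, each node is self-linked, so a later removal in a callback is a harmless no-op on the node itself.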
| |
Change-Id: Ib74354f57a18569762ad45a51f182822a2537421
BUG: 1468483
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
For as long as a shard's inode is in priv->lru_list, it should have a non-zero
ref-count. This patch achieves it by taking a ref on the inode when it
is added to lru list. When it's time for the inode to be evicted
from the lru list, a corresponding unref is done.
Change-Id: I289ffb41e7be5df7489c989bc1bbf53377433c86
BUG: 1468483
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
Problem: When a volume is stopped while brick multiplexing is enabled,
memory is not cleaned up for all server-side xlators.
Solution: To clean up memory for all server-side xlators, call fini
in glusterfs_handle_terminate after sending the GF_EVENT_CLEANUP
notification to the top xlator.
BUG: 1544090
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Note: All test cases were run in a separate build (https://review.gluster.org/19574)
with the same patch after forcibly enabling brick mux; all test cases
passed.
Change-Id: Ia10dc7f2605aa50f2b90b3fe4eb380ba9299e2fc
| |
... instead of adding this information in fd_ctx in call path and
retrieving it again in the callback.
Change-Id: Ibbddbbe85baadb7e24aacf5ec8a1250d493d7800
BUG: 1468483
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
Change-Id: Icf3a5d0598a081adb7d234a60bd15250a5ce1532
BUG: 1468483
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
Updates #302
Change-Id: Ife21440ffcf5805ce5858360dc94a456ead891e5
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
updates #220
Change-Id: I6e25dbb69b2c7021e00073e8f025d212db7de0be
Signed-off-by: Amar Tumballi <amarts@redhat.com>
| |
This patch creates a new way of defining message id's that is easier
and less error prone because it doesn't require so many manual changes
each time a new component is defined or a new message created.
Change-Id: I71ba8af9ac068f5add7e74f316a2478bc991c67b
Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
| |
This patch fixes coverity issues 242 and 453.
Change-Id: If18f40539dccc7c2fcdcf8ef9b6fa3efbb3e462f
BUG: 789278
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
Change-Id: I55fa87e07136cff10b0d725ee24dd3151016e64e
BUG: 1489823
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: https://review.gluster.org/18243
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Sunil Kumar Acharya <sheggodu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
Problem:
Because create_count/eexist_count are incremented without locks, all the shards may not
be created because call_count will be less than what it needs to be. This can lead
to a crash in shard_common_inode_write_do() because the inode on which we want to do
fd_anonymous() is NULL.
Fix:
Increment the counts in frame->lock
Change-Id: Ibc87dcb1021e9f4ac2929f662da07aa7662ab0d6
BUG: 1488354
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: https://review.gluster.org/18203
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
Change-Id: I42df7679d63fec9b4c03b8dbc66c5625f097fac0
BUG: 1488546
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: https://review.gluster.org/18209
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
Problem:
There is a race when the following two commands are executed on the mount in
parallel from two different terminals on a sharded volume,
which leads to use-after-free.
Terminal-1:
while true; do dd if=/dev/zero of=file1 bs=1M count=4; done
Terminal-2:
while true; do cat file1 > /dev/null; done
In the normal case this is the life-cycle of a shard-inode
1) Shard is added to LRU when it is first looked-up
2) For every operation on the shard it is moved up in LRU
3) When "unlink of the shard"/"LRU limit is hit" happens it is removed from LRU
But we are seeing a race where the inode stays in Shard LRU even after it is
forgotten which leads to Use-after-free and then some memory-corruptions.
These are the steps:
1) Shard is added to LRU when it is first looked-up
2) For every operation on the shard it is moved up in LRU
Reader-handler                                    Truncate-handler
1) Reader handler needs shard-x to be read.       1) Truncate has just deleted shard-x
2) In shard_common_resolve_shards(), it does
   inode_resolve() and that leads to
   a hit in LRU, so it is going to call
   __shard_update_shards_inode_list() to move the
   inode to top of LRU
                                                  2) shard-x gets unlinked from the itable
                                                     and inode_forget(inode, 0) is called
                                                     to make sure the inode can be purged
                                                     upon last unref
3) when __shard_update_shards_inode_list() is
   called it finds that the inode is not in LRU
   so it adds it back to the LRU-list
Both these operations complete and call inode_unref(shard-x) which leads to the inode
getting freed and forgotten, even when it is in Shard LRU list. When more inodes are
added to LRU, use-after-free will happen and it leads to undefined behaviors.
Fix:
I see that the inode can be removed from LRU even by the protocol layers like gfapi/gNFS
when LRU limit is reached. So it is better to add a check in shard_forget() to remove itself
from LRU list if it exists.
BUG: 1466037
Change-Id: Ia79c0c5c9d5febc56c41ddb12b5daf03e5281638
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: https://review.gluster.org/17644
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
| |
When a file is opened with append, all writes are appended at the end of file
irrespective of the offset given in the write syscall. This needs to be
considered in shard size update function and also for choosing which shard to
write to.
At the moment shard piggybacks on queuing from the write-behind
xlator for ordering of the operations. So if write-behind is disabled and
two parallel appending writes arrive, both of which can increase the file
size beyond shard-size, the file will be corrupted.
BUG: 1455301
Change-Id: I9007e6a39098ab0b5d5386367bd07eb5f89cb09e
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: https://review.gluster.org/17387
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
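The effect of O_APPEND on shard selection can be sketched as below. Both functions are hypothetical, and computing the shard index as offset divided by shard size is an assumption made for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* With append, the write lands at end-of-file whatever offset was passed. */
static uint64_t
effective_write_offset(bool append, uint64_t requested_offset,
                       uint64_t file_size)
{
    return append ? file_size : requested_offset;
}

/* Assumed mapping from a byte offset to the shard that holds it. */
static uint64_t
shard_index_for_offset(uint64_t offset, uint64_t shard_size)
{
    return offset / shard_size;
}
```

With a 4 MiB shard size and a 9 MiB file, an appending write issued with offset 0 goes to shard 2, whereas the same write without append goes to shard 0.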
| |
Change-Id: I7e984bb0f50c7d42764c0648e697d94d6c768dc7
BUG: 1448299
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: https://review.gluster.org/17184
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| |
Change-Id: I9008ca9960df4821636501ae84f93a68f370c67f
BUG: 1440051
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: https://review.gluster.org/17014
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| |
shard's writev implementation, as part of identifying
presence of participant shards that aren't in memory,
first sends an MKNOD on these shards, and upon EEXIST error,
looks up the shards before proceeding with the writes.
The VM corruption was caused when the following happened:
1. DHT had n subvolumes initially.
2. Upon add-brick + fix-layout, the layout of .shard changed
although the existing shards under it were yet to be migrated
to their new hashed subvolumes.
3. During this time, there were writes on the VM falling in regions
of the file whose corresponding shards were already existing under
.shard.
4. Sharding xl sent MKNOD on these shards, now creating them in their
new hashed subvolumes although there already exist shard blocks for
this region with valid data.
5. All subsequent writes were wound on these newly created copies.
The net outcome is that both copies of the shard didn't have the correct
data. This caused the affected VMs to be unbootable.
FIX:
For want of better alternatives in DHT, the fix changes shard fops to do
a LOOKUP before the MKNOD and upon EEXIST error, perform another lookup.
Change-Id: I8a2e97d91ba3275fbc7174a008c7234fa5295d36
BUG: 1440051
RCA'd-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
Reported-by: Mahdi Adnan <mahdi.adnan@outlook.com>
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: https://review.gluster.org/17010
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
Since mem_get0 can return NULL, accessing local->op_ret can
crash. Found by coverity. And since ENOMEM is the only
potential error, we can also simplify the code by not
using 'local' for that.
Change-Id: I778747b57f520b1a52347c0fc9f27efd7a7c5ca0
BUG: 789278
Signed-off-by: Michael Scherer <misc@redhat.com>
Reviewed-on: https://review.gluster.org/16739
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Michael Scherer <misc@fedoraproject.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
This fixes a performance issue with shard which was causing
the translator to trigger unusually high number of lookups
for cache invalidation even when there was no modification to
the file.
In shard_common_stat_cbk(), it is local->prebuf that contains the
aggregated size and block count as opposed to buf which only holds the
attributes for the physical copy of base shard. Passing buf for
inode_ctx invalidation would always set refresh to true since the file
size in inode ctx contains the aggregated size and would never be same
as @buf->ia_size. This was leading to every write/read being preceded
by a lookup on the base shard even when the file underwent no
modification.
Change-Id: Ib0349291d2d01f3782d6d0bdd90c6db5e0609210
BUG: 1436739
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: https://review.gluster.org/16961
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
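The comparison at the heart of this bug can be sketched as below. This is a deliberately reduced model: `needs_refresh()` is hypothetical and looks only at the size, whereas the real invalidation logic may consider other attributes too.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * The inode ctx caches the aggregated file size. A refresh is flagged when
 * the size reported by the callback differs from that cached value.
 */
static bool
needs_refresh(uint64_t cached_aggregated_size, uint64_t reported_size)
{
    return cached_aggregated_size != reported_size;
}
```

Passing the physical size of the base shard (for example 64 MiB) against a cached aggregated size of 10 GiB always flags a refresh, while passing the aggregated size from local->prebuf does not.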