| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since setxattr and removexattr fops cbk do not carry poststat,
the stat cache was being invalidated in setxatr/remoxattr cbk.
Hence the further lookup wouldn't be served from cache.
To prevent this invalidation, md-cache is modified to get
the poststat in set/removexattr_cbk in dict.
Co-authored with Xavi Hernandez.
Change-Id: I6b946be2d20b807e2578825743c25ba5927a60b4
fixes: bz#1586018
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Inconsistent access permissions on directories after
bringing back the down sub-volumes, in case of directories dht_setattr
first wind a call on MDS once call is finished on MDS then wind a call
on NON-MDS.At the time of revalidating dht just compare the uid/gid with
stbuf uid/gid and if anyone differs set a flag to heal the same.
Solution: Add a condition to compare permission also in dht_revalidate_cbk
to set a flag to call dht_dir_attr_heal.
BUG: 1584517
Change-Id: I3e039607148005015b5d93364536158380d4c5aa
fixes: bz#1584517
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes compile warnings that appear with newer compilers. The
solution applied is only to remove the warnings, but it doesn't always
solve the problem in the best way. It assumes that the problem will never
happen, as the previous code assumed.
Change-Id: I6e8470d6c2e2dbd3bd7d324b5fd2f92ffdc3d6ec
updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When call_count is decremented by one thread, another thread can
go ahead with the operation leading to undefined behavior for the
thread executing statements after decrementing call count.
Fix:
Do the operations necessary before decrementing call count.
fixes bz#1598663
Change-Id: Icc90cd92ac16e5fbdfe534d9f0a61312943393fe
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In the new eager-lock implementation lk-owner is assigned after the
'local' is added to the eager-lock list, so there exists a possibility
of lock being sent even before lk-owner is assigned.
Fix:
Make sure to assign lk-owner before adding local to eager-lock list
fixes bz#1597805
Change-Id: I26d1b7bcf3e8b22531f1dc0b952cae2d92889ef2
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
| |
fixes bz#1597156
Change-Id: I323eb9190e40b12df216698dcdba74a6d336beeb
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DHT attempts to use up the entire buffer in readdirp before
unwinding in an attempt to reduce the number of calls.
However, this has 2 disadvantages:
1. This can cause a stack overflow when parallel readdir
is enabled. If the buffer only has a little space,rda can send back
only one or two entries. If those entries are stripped out by
dht_readdirp_cbk (linkto files for example) it will once again
wind down to rda in an attempt to fill the buffer before unwinding to FUSE.
This process can continue for several iterations, causing the stack
to grow and eventually overflow, causing the process to crash.
2. If parallel readdir is disabled, dht could send readdirp
calls with small buffers to the bricks, thus increasing the
number of network calls.
We are therefore reverting to the existing behaviour.
Please note, this only mitigates the stack overflow, it does
not prevent it from happening. This is still possible if
a subvol has thousands of linkto files for instance.
Change-Id: I291bc181c5249762d0c4fe27fa4fc2631166adf5
fixes: bz#1593548
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With patch [1], renames are journalled only
on cached subvolume. The dht sends the special
key on the cached subvolume so that the changelog
journals the rename. With single distribute
sub-volume, the key is not being set. This patch
fixes the same.
[1] https://review.gluster.org/10410
fixes: bz#1583018
Change-Id: Ic2e35b40535916fa506a714f257ba325e22d0961
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
The dht lookup code is getting difficult to maintain
due to its size. Refactoring the code will make it
easier to modify it in future.
Change-Id: Ic7cb5bf4f018504dfaa7f0d48cf42ab0aa34abdd
updates: bz#1590385
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
| |
Removed extra variable.
Change-Id: If43c47f6630454aeadab357a36d061ec0b53cdb5
updates: bz#1590385
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
commit 20fa80057eb430fd72b4fa31b9b65598b8ec1265 introduced a regression
wherein if a file is present in only 1 brick of replica *and* doesn't
have a gfid associated with it, it doesn't get healed upon the next
lookup from the client. Fix it.
Change-Id: I7d1111dcb45b1b8b8340a7d02558f05df70aa599
fixes: bz#1591193
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
If inode refresh failed on all children of afr due to ENOENT (say file
migrated by dht), it resets the readables to zero. Any inflight txn which
then later comes on the inode fails with EIO because no readable
children present for the inode.
Fix:
Don't update readables when inode refresh fails on *all* children of
afr. In that way any inflight txns will either proceed with its own inode
refresh if needed and fail it with the right errno or use the old value
of readables and continue with the txn.
Also, add quorum checks to the beginning of afr_transaction(). Otherwise, we
seem to be winding the lock and checking for quorum only in pre-op pahse.
Note: This should ideally fix BZ 1329505 since the stop gap fix for
it is has been reverted at https://review.gluster.org/#/c/20028.
Change-Id: Ia638c092d8d12dc27afb3cdad133394845061319
updates: bz#1584483
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Created init and cleanup functions for certain
functionality in order to improve readability.
Removed unused code.
Change-Id: Ia6a2f4ab64923b6ea8e10487227fb5621eec1488
updates: bz#1586363
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In a distributed volume situation can be arise when custom
extended attributed are not removed from all bricks after
stop/start or added newly brick.
Solution: To resolve the same use MDS subvol for remove xattr also
BUG: 1575587
Change-Id: I7701e0d3833e3064274cb269f26061bff9b71f50
fixes: bz#1575587
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: At the time of fetching xattr to heal xattr by afr
it is not able to fetch xattr because posix_getxattr
has a check to ignore if xattr name is MDS
Solution: To ignore same xattr update a check in dht_getxattr_cbk
instead of having a check in posix_getxattr
BUG: 1584098
Change-Id: I86cd2b2ee08488cb6c12f407694219d57c5361dc
fixes: bz#1584098
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
| |
syncop functions return -op_errno.
Change-Id: Ifdb1bd1d1d11972b4306a2336e6737d6236a2fb1
fixes: bz#1580238
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In case of error(ESTALE/ENOENT) dht_revalidate_cbk
throws "dict is null" error because xattr is not available
Solution: To avoid the logs update condition in dht_revalidate_cbk
and dht_lookup_dir_cbk
BUG: 1583565
Change-Id: Ife6b3eeb6d91bf24403ed3100e237bb5d15b4357
fixes: bz#1583565
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An entry from readdirp might get renamed just before migration leading to
lookup failures. For such lookup failure, remove-brick process does not
see any increment in failure count. Though there is a warning message
after remove-brick commit for the user to check in the decommissioned brick
for any files those are not migrated, it's better to increase the failure count
so that user can check in the decommissioned bricks for files before commit.
Note: This can result in false negative cases for rm -rf interaction with
remove-brick op, where remove-brick shows non-zero failed count, but the
entry was actually deleted by user.
Fixes :bz#1580269
Change-Id: Icd1047ab9edc1d5bfc231a1f417a7801c424917c
fixes: bz#1580269
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Corrected the name of the xattr and fixed
the code to log an error only if op_errno
is not ENODATA or ENOATTR.
Change-Id: I42c5b1d838eec586ac7bed2471eb1d27ff09a9ea
fixes: bz#1580238
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Multiple pre-op xattrop can be simultaneously being processed. On the cbk
it was checked if the fop was waiting for some specific data (like size and
version) and, if so, it was assumed that this answer should contain that
data.
This is not true, since a fop can be waiting for some data, but it may come
from the xattrop of another fop.
This patch differentiates between needing some information and providing it.
This is related to parallel writes. Disabling them fixed the problem, but
also prevented concurrent reads. A change has been made so that disabling
parallel writes still allows parallel reads.
Fixes: bz#1578325
Change-Id: I74772ad6b80b7b37805da93d5ec3ae099e96b041
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In the .t, when the only good brick was brought down, writes on the fd were
still succeeding on the bad bricks. The inflight split-brain check was
marking the write as failure but since the write succeeded on all the
bad bricks, afr_txn_nothing_failed() was set to true and we were
unwinding writev with success to DHT and then catching the failure in
post-op in the background.
Fix:
Don't wind the FOP phase if the write_subvol (which is populated with readable
subvols obtained in pre-op cbk) does not have at least 1 good brick which was up
when the transaction started.
Note: This fix is not related to brick muliplexing. I ran the .t
10 times with this fix and brick-mux enabled without any failures.
Change-Id: I915c9c366aa32cd342b1565827ca2d83cb02ae85
updates: bz#1577672
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the past, it was often[1] forgotten for xlators to be linked against
the symbols they refer to. This often caused glusterd2 to fail while
loading xlator's shared object (.so) file.
This change adds "--no-undefined" as a linker flag which causes the
linker to treat unresolved symbol references as an error and hence fail
linking.
[1]:
https://review.gluster.org/#/c/19912/
https://review.gluster.org/#/c/19664/
https://review.gluster.org/#/c/19056/
https://review.gluster.org/#/c/17659/
https://bugzilla.redhat.com/show_bug.cgi?id=1532238
Bonus:
Added cloudsync and utime xlator's generated source files to .gitignore
Updates: bz#1193929
Change-Id: I9604a4a87b7313a5fa43bda5fdb37dfa7ef8facd
Signed-off-by: Prashanth Pai <ppai@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Removed EIO from the list of errnos that triggered
a migrate check task.
Change-Id: I7f89c7a16056421588f1af2377cebe6affddcb47
fixes: bz#1578823
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In Geo-Rep setup excessive "dict is null" logs in
dht_discover_complete while xattr is NULL
Solution: To avoid the logs update a condition in dht_discover_complete
BUG: 1576767
Change-Id: Ic7aad712d9b6d69b85b76e4fdf2881adb0512237
fixes: bz#1576767
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Provide a virtual xattr with which to query
the hashed subvol for a file.
Change-Id: Ic7abd031f875da4b9084841ea7c25d6c8a851992
fixes: bz#1574421
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Additional log messages to help debug issues
with file listings.
Change-Id: Iccd07498ba01d597c0c40f026f4177dd06d7e901
fixes: bz#1575887
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Before populate MDS internal xattr first dht checks if MDS is
present in xattr or not.If xattr dictionary is NULL dict_get
log the message either dict or key is NULL
Solution: Before call dict_get check xattr, if it is NULL then no
need to call dict_get.
BUG: 1575910
Change-Id: I81604ec5945b85eba14b42f4583d06ec713028f4
fixes: bz#1575910
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Test case:
# while true; do uuid="`uuidgen`"; echo "some data" > "test$uuid"; mv
"test$uuid" "test" -f || break; echo "done:$uuid"; done
This script was run in parallel from multiple mountpoints
Along the course of getting the above usecase working, many issues
were found:
Issue 1:
=======
consider a case of rename (src, dst). We can encounter a situation
where,
* dst is a file present at the time of lookup
* dst is removed by the time rename fop reaches glusterfs
In this scenario, acquring inodelk on dst fails with ESTALE resulting
in failure of rename. However, as per POSIX irrespective of whether
dst is present or not, rename should be successful. Acquiring entrylk
provides synchronization even in races like this.
Algorithm:
1. Take inodelks on src and dst (if dst is present) on respective
cached subvols. These inodelks are done to preserve backward
compatibility with older clients, so that synchronization is
preserved when a volume is mounted by clients of different
versions. Once relevant older versions (3.10, 3.12, 3.13) reach
EOL, this code can be removed.
2. Ignore ENOENT/ESTALE errors of inodelk on dst.
3. protect namespace of src and dst. To protect namespace of a file,
take inodelk on parent on hashed subvol, then take entrylk on the
same subvol on parent with basename of file. inodelk on parent is
done to guard against changes to parent layout so that hashed
subvol won't change during rename.
4. <rest of rename continues>
5. unlock all locks
Issue 2:
========
linkfile creation in lookup codepath can race with a rename. Imagine
the following scenario:
* lookup finds a data-file with gfid - gfid-dst - without a
corresponding linkto file on hashed-subvol. It decides to create
linkto file with gfid - gfid-dst.
- Note that some codepaths of dht-rename deletes linkto file of
dst as first step. So, a lookup racing with an in-progress
rename can easily run into this situation.
* a rename (src-path:gfid-src, dst-path:gfid-dst) renames data-file
and hence gfid of data-file changes to gfid-src with path dst-path.
* lookup proceeds and creates linkto file - dst-path - with gfid -
dst-gfid - on hashed-subvol.
* rename tries to create a linkto file dst-path with src-gfid on
hashed-subvol, but it fails with EEXIST. But EEXIST is ignored
during linkto file creation.
Now we've ended with dst-path having different gfids - dst-gfid on
linkto file and src-gfid on data file. Future lookups on dst-path will
always fail with ESTALE, due to differing gfids.
The fix is to synchronize linkfile creation in lookup path with rename
using the same mechanism of protecting namespace explained in solution
of Issue 1. Once locks are acquired, before proceeding with linkfile
creation, we check whether conditions for linkto file creation are
still valid. If not, we skip linkto file creation.
Issue 3:
========
gfid of dst-path can change by the time locks are acquired. This
means, either another rename overwrote dst-path or dst-path was
deleted and recreated by a different client. When this happens,
cached-subvol for dst can change. If rename proceeds with old-gfid and
old-cached subvol, we'll end up in inconsistent state(s) like dst-path
with different gfids on different subvols, more than one data-file
being present etc.
Fix is to do the lookup with a new inode after protecting namespace of
dst. Post lookup, we've to compare gfids and correct local state
appropriately to be in sync with backend.
Issue 4:
========
During revalidate lookup, if following a linkto file doesn't lead to a
valid data-file, local->cached-subvol was not reset to NULL. This
means we would be operating on a stale state which can lead to
inconsistency. As a fix, reset it to NULL before proceeding with
lookup everywhere.
Issue 5:
========
Stale dentries left out in inode table on brick resulted in failures
of link fop even though the file/dentry didn't exist on backend fs. A
patch is submitted to fix this issue. Please check the dependency tree
of current patch on gerrit for details
In short, we fix the problem by not blindly trusting the
inode-table. Instead we validate whether dentry is present by doing
lookup on backend fs.
Change-Id: I832e5c47d232f90c4edb1fafc512bf19bebde165
updates: bz#1543279
BUG: 1543279
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
selfhealing of directory is invoked on two conditions:
1. no layout on disk or layout has some anomalies (holes/overlaps)
2. mds xattr is not set on the directory
When dht_selfheal_directory is called with a correct layout just to
set mds xattr, we see error msgs complaining about "not able to form
layout on directory", which is misleading as the layout is
correct. So, log this msg only if layout has anomalies.
Change-Id: I4af25246fc3a2450c2426e9902d1a5b372eab125
updates: bz#1543279
BUG: 1543279
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
| |
Change-Id: Ied047dd5ee44e9d5a5d3db214826f7df30332ef9
updates: #350
BUG: 1319992
Signed-off-by: Poornima G <pgurusid@redhat.com>
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
If dht_selfheal_dir_mkdir returns an error, cbk passed to
dht_selfheal_directory is not invoked. So, Current codepath leaves an
unwound frame resulting in a hung fop forever.
Change-Id: I422308b8a34a074301ca46b029ffe676f5e0f66c
fixes: bz#1574305
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: A directory deletion can happen just before gf_defrag_settle_hash
which internally does a setxattr operation on a directory.
Solution: Ignore ENOENT and ESTALE errors
Fixes: bz#1572581
Change-Id: I2f91809f3b5e02976c4c3a5a596406a8b2f8f6f2
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
| |
Updates #352
Change-Id: I1bbb3c652ba33cec6aa37f3700370674077fb17d
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
| |
1. Create thin arbiter index file during mount.
2. Set pending marker in thin arbiter id file in case of failure.
Change-Id: I269eb8d069f0323f1fc616175e5e5eb7b91d5f82
updates: #352
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Currently it is not possible to capture the xattrs values which
are set on the bricks by calling syncop_(f)xattrop, because the
response dict is not being assigned to any of the dictionaries.
Fix:
In the xattrop callback capture the response dict and send it
back to the caller if it is requested.
Change-Id: I9de9bcd97d6008091c9b060bcca3676cb9ae8ef9
fixes: bz#1572076
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
If we have 2 bricks, brick-A and brick-B with brick-A within halo-max-latency
and brick-B more than halo-max-latency. If we set both halo-min, halo-max replicas
as '1'. In this case, brick-A comes online and then ping-latency will be updated for it.
When brick-B comes online, we have 2 up-bricks, so the code tries to find the brick with
worst latency to mark it down. Since Brick-B just came online it always had '0' latency
so brick-B used to be marked offline and Brick-B would eventually be the one to be
online even when brick-A is more suited.
Fix:
Consider latency of just-up child as HALO_MAX_LATENCY so that worst-child until
ping-latency is found as the just-up brick. Also keep ping-latency as -1 until
child-up during initialization.
BUG: 1567881
fixes bz#1567881
Change-Id: I148262fe505468190f0eb99225d0f6d57cdb6f04
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Fixed dht_order_rename_lock to use the same inodelk ordering
as that of the dht selfheal locks (dictionary order of
lock subvolumes).
Change-Id: Ia3f8353b33ea2fd3bc1ba7e8e777dda6c1d33e0d
fixes: bz#1568348
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
In Halo replication, there are pending heals more often than not.
It makes sense to give users the capability to configure it as low
as 5 seconds.
BUG: 1569489
fixes bz#1569489
Change-Id: I451c1975827f66398b903f659c981ef3121d5376
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
problem: With the current code, post graph switch the old fd is received for
fuse_getattr and since it is associated with old inode, it does not
have the inode ctx across xlators in new graph. Hence, dht
errored out saying "no layout" for fstat call. Hence the EINVAL.
Solution: if fd is passed, init and resolve fd to carry on getattr
test case:
- Created a single brick distributed volume
- Started untar
- Added a new-brick
Without this fix, untar used to abort with ERROR.
Change-Id: I5805c463fb9a04ba5c24829b768127097ff8b9f9
fixes: bz#1566207
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
xlator_notify doesn't pass the extra arguments that come in the
input function, so XLATOR_NOTIFY macro should be used instead
to pass the extra arguments to the function.
BUG: 1567881
fixes bz#1567881
Change-Id: Ic15b6c446638cbacf3149693147a754219037c47
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. If pre-op fails on all bricks,set lock->release to true in
afr_handle_lock_acquire_failure so that the GF_ASSERT in afr_unlock() does not
crash.
2. Added a missing 'return' after handling pre-op failure in
afr_transaction_perform_fop(), fixing a use-after-free issue.
Change-Id: If0627a9124cb5d6405037cab3f17f8325eed2d83
fixes: bz#1561129
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The decision as to which node would migrate a file
was based on the gfid of the file. Files were divided
among the nodes for the replica/disperse set. However,
if a brick was down when rebalance started, the nodeuuids
would be saved as NULL and a set of files would not be migrated.
Now, if the nodeuuid is NULL, the first non-null entry in
the set is the node responsible for migrating the file.
Change-Id: I72554c107792c7d534e0f25640654b6f8417d373
fixes: bz#1564198
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
dht_opendir should wind the open to all subvols
whether or not local->subvols is set. This is
because dht_readdirp winds the calls to all subvols.
Change-Id: I67a96b06dad14a08967c3721301e88555aa01017
updates: bz#1564198
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Various synchronization present in dht_rename while handling
directories and files is necessary only if we have more than only one
child.
Change-Id: Ie21ad419125504ca2f391b1ae2e5c1d166fee247
fixes: bz#1563511
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
| |
Change-Id: I0a290396c30c635b13ee73004d20259efb76a954
fixes: bz#1563945
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
We seem to be winding the FOP if pre-op did not succeed on quorum bricks
and then failing the FOP with EROFS since the fop did not meet quorum.
This essentially masks the actual error due to which pre-op failed. (See
BZ).
Fix:
Skip FOP phase if pre-op quorum is not met and go to post-op.
Fixes: 1561129
Change-Id: Ie58a41e8fa1ad79aa06093706e96db8eef61b6d9
fixes: bz#1561129
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lookup-optimize has been shown to improve create
performance. The code has been in the project for several
years and is considered stable.
Enabling this by default in order to test this in the
upstream regression runs.
Change-Id: Iab792979ee34f0af4713931e0b5b399c23f65313
updates: bz#1557435
BUG: 1557435
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
| |
On shd, we shouldn't treat any brick down based
on latency, otherwise self-heal will never happen
fixes: bz#1562717
Change-Id: Ica07fcc4fae91a6bfd9c9a670e2be464704d94b7
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Set the levels for DHT options based on
https://review.gluster.org/#/c/19466/
Change-Id: I51b31a706a0b9517404e83224c89de145fd5d7e1
updates: #430
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With lookup-optimize enabled, gf_defrag_settle_hash in rebalance
sometimes flips the on-disk layout on volume root post the
migration of all files in the directory.
This is sometimes seen when attempting to fix the layout of a
directory multiple times before calling gf_defrag_settle_hash.
dht_fix_layout_of_directory generates a new layout in memory but
updates it in the inode ctx before it is set on disk. The layout
may be different the second time around due to
dht_selfheal_layout_maximize_overlap. If the layout is then not
written to the disk, the inode now contains the wrong layout.
gf_defrag_settle_hash does not check the correctness of the layout
in the inode before updating the commit-hash and writing it to the
disk thus changing the layout of the directory.
Change-Id: Ie1407d92982518f2a0c40ec70ad370b34a87b4d4
updates: bz#1557435
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|