| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: When only mgmt SSL is enabled for a brick process, the use_ssl flag
is false for that process, and the socket APIs clean up ssl_ctx only
when both use_ssl and ssl_ctx are valid.
Solution: To avoid a leak, check only ssl_ctx and clean it up if it is
valid.
Fixes: #1196
Change-Id: I2f4295478f4149dcb7d608ea78ee5104f28812c3
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
If SELinux is in enforcing mode, geo-rep goes into a faulty state.
To avoid this, the relevant SELinux booleans need to be set
to 'on' to allow the rsync operation.
Change-Id: Ia8ce530d6548c2a545f4c99c600f5aac2bbb3363
Fixes: #1182
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch https://review.gluster.org/#/c/glusterfs/+/23733/ (which
optimizes the changelog) introduces a change in the directory structure
above the changelog files.
Thus, before upgrade, the old version should be updated to match
the changes made by the patch quoted above.
This upgrade script:
1) creates a temp htime file with updated paths from the htime file,
then installs the temp htime file as the htime file.
2) places the changelog files under the required directory structure.
Updates: #154
Change-Id: I4b5a6cb9a9266a65972b419b329bc958de8fdf8a
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The general idea of the changes is to prevent resetting event generation
to zero in the inode ctx, since event gen is something that should
follow 'causal order'.
Change #1:
For a read txn, in inode refresh cbk, if event_generation is
found zero, we are failing the read fop. This is not needed
because change in event gen is only a marker for the next inode refresh to
happen and should not be taken into account by the current read txn.
Change #2:
The event gen being zero above can happen if there is a racing lookup,
which resets event gen (in afr_lookup_done) if there are non-zero afr
xattrs. The resetting is done only to trigger an inode refresh and a
possible client side heal on the next lookup. That can be achieved by
setting the need_refresh flag in the inode ctx. So replaced all
occurrences of resetting event gen to zero with a call to
afr_inode_need_refresh_set().
Change #3:
In both the lookup and discover paths, we are doing an inode refresh which is
not required since all 3 essentially do the same thing: update the inode
ctx with the good/bad copies from the brick replies. Inode refresh also
triggers background heals, but I think it is okay to do it when we call
refresh during the read and write txns and not in the lookup path.
The .t tests which relied on inode refresh in the lookup path to trigger heals
are now changed to do a read txn so that the inode refresh and the heal happen.
Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec
Fixes: #1179
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Erik Jacobson <erik.jacobson at hpe.com>
|
|
|
|
|
|
|
|
|
|
|
| |
The rebalance process's handling of files which contain holes caused
rebalance to fail with "No space left on device" errors.
This patch modifies the code flow so that files with holes
are rebalanced correctly.
fixes: #1187
Change-Id: I89bc3d4ea7f074db7213d759c49307f379543932
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch addresses the following CIDs from Coverity Scan:
* 1425196
* 1425197
* 1425198
* 1425199
* 1525200
Change-Id: Iddcfea449d3dd56d4dfcc39f4c3c608518e611e4
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Updates: #1060
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The test is failing at
14:56:41 ok 13, LINENUM:38
14:56:41 not ok 14 Got "test-message0" instead of "test-message1", LINENUM:41
14:56:41 FAILED COMMAND: test-message1 cat /mnt/glusterfs/1/test.txt
This happens because fuse sometimes doesn't send the 'read' fop to glusterfs
and the read is served from cache instead.
Fix:
Mount with direct-io-mode=yes so that the read is always received by
gluster.
Fixes: #1190
Change-Id: I369e2024a85dc492dc24c7579b161fb965f55d19
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
If the rebalance process is failing, recursive failures appear in the log
file, which distracts from the root cause.
To avoid recursive failures, the error handling mechanism has
been modified.
fixes: #1072
Change-Id: Iae19430323630acd97c2c8d35685626d8da747a7
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is removing some of the "tier" code in dht xlator, as it is no longer
being used.
Not all of the unneeded code is removed at once, so that reviewing is easier.
Follow-up patches will remove additional unused code.
This is based on the work done in https://review.gluster.org/#/c/glusterfs/+/23935/
Change-Id: I3cb6a0c5d8f14afcd87cf021ef8f74b91c0f908a
updates: #1097
Signed-off-by: Barak Sason Rofman <bsaonro@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Sometimes the test case fails while creating files
on the mount point after mounting the volume.
Solution: After starting the volume, we need to wait until all
brick instances are completely started, so add an online_brick_count
check right after the volume is started.
Change-Id: I5020e7e417539377277ca00189f9c51d2cf877a6
Fixes: #1162
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
mdc_inode_xatt_set() blindly cleared current cache when dict was not
NULL, even if there was no xattr requested.
This patch fixes this by only calling mdc_inode_xatt_set() when we have
explicitly requested something to cache.
Change-Id: Idc91a4693f1ff39f7059acde26682ccc361b947d
Fixes: #1140
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove GF_VALIDATE_OR_GOTO(this->name, this, out) when this
is passed as an argument and is checked for NULL in the caller
itself.
GF_VALIDATE_OR_GOTO(this->name, this, out) is modified to use
xlator name instead of this->name as we are still verifying
whether this is NULL.
updates: #1000
Change-Id: Ide3180da29d0d4a35b2c5b9a7604fdf2ff4a9ffb
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: glusterfs (GNFS) crashes while handling a
Pollerr event in rpcsvc_drc_client_unref. GNFS crashes
because the ref was 0 at the time of unref, while the ref was taken
when a Pollin event was successfully handled.
Solution: Convert the drc_client ref to an atomic ref to avoid the crash.
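As a rough illustration of why the counter has to be updated atomically, the sketch below uses a lock-protected counter in Python as a stand-in for the C atomic operations. The class `RefCounted` and the helper `hammer` are hypothetical names for this example; they are not the rpcsvc DRC code.

```python
import threading


class RefCounted:
    """Hypothetical refcount whose updates cannot interleave."""

    def __init__(self):
        self._lock = threading.Lock()
        self._refs = 0

    def ref(self):
        # Increment and read back under the same lock.
        with self._lock:
            self._refs += 1
            return self._refs

    def unref(self):
        # Dropping a reference that was never taken is a bug; fail loudly.
        with self._lock:
            if self._refs == 0:
                raise RuntimeError("unref called with refcount 0")
            self._refs -= 1
            return self._refs


def hammer(obj, rounds):
    """Take and drop a reference many times from one thread."""
    for _ in range(rounds):
        obj.ref()
        obj.unref()


def run_threads(thread_count=8, rounds=1000):
    """Run hammer() concurrently and return the final count."""
    obj = RefCounted()
    threads = [threading.Thread(target=hammer, args=(obj, rounds))
               for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # One extra ref/unref pair reveals the final count.
    obj.ref()
    return obj.unref()
```

With every increment and decrement serialized, the count returns to zero only after the last holder drops its reference.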
Change-Id: Ia4c054f2f388032a5cd99597d0cfa18b003ca690
Fixes: #1038
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Do not truncate file offsets and sizes to 32-bit, to
prevent spurious test failures on files larger than 2 GB.
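To illustrate the failure mode, here is a minimal Python sketch of what happens when a 64-bit offset is stored in a signed 32-bit integer. The helper name and the 3 GiB offset are assumptions chosen for illustration and do not come from the patch.

```python
import ctypes


def truncate_to_int32(value):
    """Mimic storing a 64-bit offset in a signed 32-bit variable."""
    # ctypes integer types do no overflow checking, so the value wraps.
    return ctypes.c_int32(value).value


offset = 3 * 1024**3                 # hypothetical 3 GiB file offset
print(offset, truncate_to_int32(offset))
```

Offsets up to 2**31 - 1 survive the round trip; anything larger comes back as a different (here negative) number, which is the kind of spurious mismatch a test would see.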
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Change-Id: I2a77ea5f9f415249b23035eecf07129f19194ac2
Fixes: #1161
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pacemaker devs changed the format of the output of `pcs status`.
We expected to find a line in the format:
Online: ....
but now it's
* Online: ...
And the `grep -E "^Online:"` no longer finds the list of nodes that
are online.
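A pattern that accepts both formats could look like the Python sketch below. This is not the change made in the script itself; the regular expression, the function name `online_nodes`, and the bracketed sample lines in the comments are assumptions for illustration.

```python
import re

# Matches "Online: ..." with or without a leading "* " bullet.
ONLINE_RE = re.compile(r"^\s*(?:\*\s*)?Online:\s*(.*)$")


def online_nodes(pcs_status_output):
    """Return node names from the first 'Online:' line, or [] if none."""
    for line in pcs_status_output.splitlines():
        match = ONLINE_RE.match(line)
        if match:
            # Assumed sample format: "Online: [ node1 node2 ]".
            # Drop surrounding brackets and spaces, then split on whitespace.
            return match.group(1).strip("[] ").split()
    return []
```

Anchoring on optional whitespace and an optional bullet keeps the old format working while accepting the new one.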
Change-Id: If2aa1e7b53c766c625d7b4cc222a83ea2c0bd72d
Fixes: #1169
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the output of the date command is a single-digit
number, it is preceded by a zero and is therefore
treated as an octal number. Removing the leading
zero from the number solves the problem.
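The affected script is a shell script, but the pitfall can be sketched in Python by parsing the field explicitly as octal. Both function names below are hypothetical and only illustrate the idea of stripping the leading zero before conversion.

```python
def parse_as_octal(field):
    """Mimic a zero-prefixed number being interpreted as octal."""
    return int(field, 8)


def parse_date_field(field):
    """Strip leading zeros so the value is always read as decimal."""
    # "00" would strip to an empty string, so fall back to "0".
    return int(field.lstrip("0") or "0")


for value in ("07", "08", "09"):
    try:
        print(value, "octal ->", parse_as_octal(value))
    except ValueError:
        # 8 and 9 are not valid octal digits.
        print(value, "is not a valid octal number")
    print(value, "decimal ->", parse_date_field(value))
```

Values such as 08 and 09 are the ones that fail, because 8 and 9 are not octal digits, which is why the problem shows up only for some dates.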
Fixes: #1156
Change-Id: Iac4fa20607c0bb90d94dd8ff157ef6b60932c560
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Found with 0-symbol-check.t:
./tests/basic/0symbol-check.t ..
1..2
./xlators/mgmt/glusterd/src/.libs/glusterd_la-glusterd-volume-set.o should call sys_stat, not stat
ok 1 [ 40/ 41011] < 40> 'find . -name *.o -exec ./tests/basic/symbol-check.sh {} \;'
not ok 2 [ 11/ 1] < 42> '[ ! -e ./.symbol-check-errors ]' -> ''
Failed 1/2 subtests
Change-Id: I8962f487cd88738a1f7a962049d513712687088c
Fixes: #1160
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
|
|
|
|
|
|
|
|
| |
Parts of the test weren't designed to run in mux mode; this is now fixed.
Change-Id: I428c2fcce6d047e324ca5dcaef677ee1794e3dfe
updates: #1154
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t
Problem: Sometimes volume status fails after restarting glusterd
on one cluster node.
Solution: Wait for the glusterd handshake to finish on the node that was down.
Change-Id: Ib23ca41c943caf2903c61ebf42dc437c1b9d6054
Fixes: #1158
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Handle case of arg not freed
CID: 1422174
Updates: #1060
Change-Id: Ibd03908a3ea8369035c2b7f6e024b3e5be48f436
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
| |
This test case includes all the basic glusterfind scenarios.
fixes: #1044
Change-Id: I6021443729e35769fe855c5cc41bb3fbc6365ef0
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: The key "GF_PREOP_PARENT_KEY" is populated by dht, and
for a non-distribute volume like 1x3 the key is not populated, so
posix_is_layout_stale logs a message when a file is created.
Solution: To avoid the log message, add a condition before deleting the key.
Change-Id: I813ee7960633e7f9f5e9ad2f42f288053d9eb71f
Fixes: #1150
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
tests/bugs/protocol/bug-1433815-auth-allow.t fails
sometimes because of stale mount. This stale mount
comes into picture when parent process dies without
waiting for the child process which mounts fuse fs
to die
Fix:
Wait for mounting child process to die before dying.
Fixes: #1152
Change-Id: I8baee8720e88614fdb762ea822d5877973eef8dc
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geo-replication has a large number of repeated imports as shown below:
```
from syncdutils import set_term_handler, finalize, lf
from syncdutils import log_raise_exception, FreeObject, escape
```
These imports can be combined as shown below:
```
from syncdutils import (set_term_handler, finalize, lf,
log_raise_exception, FreeObject, escape)
```
Fixes: #1105
Change-Id: I59a48dd57a70fc851d93150b85e736ce41e8b793
Signed-off-by: kshithijiyer <kshithij.ki@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When bringing back a downed brick and performing a lookup from the client
side, the permissions on said brick aren't updated on the first lookup,
but only on the second.
This patch modifies the permission update logic so that the first lookup
triggers a permission update on the downed brick.
LIMITATIONS OF THE PATCH:
The choice of source depends on whether the directory has a layout or not.
Even the directories on a newly added brick will have a layout xattr [zeroed], but the same is not true for a root directory.
Hence, if only the newly added bricks are up in the entire cluster [and the others are down], any change in permissions during this time will be overwritten by the older permissions when the cluster is restarted.
fixes: #999
Change-Id: Ieb70246d41e59f9cae9f70bc203627a433dfbd33
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Test should wait for process down notification to be received
by glusterd.
Fixes: #1153
Change-Id: I9162b58a92c1a909ca98097f14c0714f9086bdd1
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are certain conditions which block the current
execution thread (like waiting on a mutex lock, a condition
variable or an I/O response). In such cases, if it is a
synctask thread, we should suspend the task instead
of blocking it (as done in SYNCOP using synctask_yield).
This is to avoid a deadlock like the one described below:
1) synctaskA sets fs->migration_in_progress to 1 and
does I/O (LOOKUP)
2) Other synctask threads wait for fs->migration_in_progress
to be reset to 0 by synctaskA and hence blocked
3) but synctaskA cannot resume as all synctask threads are blocked
on (2).
Note: the same approach is already used by a few other components,
like syncbarrier.
Change-Id: If90f870d663bb242c702a5b86ac52eeda67c6f0d
Fixes: #1146
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support for gluster volume heal <volname> info healed/heal-failed
was removed by commit bb02cfb56ae08f56df4452c2b948fa962ae1212b in
release-3.6. The cli parser displays the usage message in all the
supported versions whenever these clis are run, leaving some
dead code in the latest branches. Since support for these clis
was removed long ago, removing the code should not cause any backward
compatibility issues. Hence remove the dead code from
the code base, which will also lead to better code coverage by the
regression runs.
Updates: #1052
Change-Id: I0c2b061469caf233c06d9699b0d159ce48e240b9
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Before executing a fop, the POSIX xlator builds an internal
path based on GFID. To validate the path it calls the (l)stat
system call, and when .glusterfs is heavily loaded the kernel takes
time to look up the inode, so performance drops.
Solution: This patch improves performance in two ways.
1) Keep an open fd for each first-level directory (gfid[0])
in .glusterfs. This forces the kernel to keep the inodes
of all those files in cache. In case of memory pressure the
kernel won't uncache the first-level inodes. We need to open
256 fds per brick to access the entries faster.
2) Use at-based calls to access relative paths, to reduce
path-based lookup time.
Note: To verify the patch we executed a kernel untar 100 times on 6
different clients after enabling metadata group-cache and some
other options. We saw more than 20 percent improvement in
kernel untar after applying the patch.
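The two ideas above, keeping a directory fd open and resolving entries relative to it, can be sketched in Python with `os.open` and the `dir_fd` parameter of `os.stat`. This is only an illustration on a temporary directory, assuming a Linux-like platform; the function names and the "ab"/"cd"/"entry" layout are hypothetical and are not the POSIX xlator's code.

```python
import os
import tempfile


def open_first_level_dirs(base, names):
    """Open one directory fd per first-level directory and keep it."""
    flags = os.O_RDONLY | os.O_DIRECTORY
    return {name: os.open(os.path.join(base, name), flags) for name in names}


def stat_relative(dir_fds, first_level, relative_path):
    """fstatat-style lookup relative to an already-open directory fd."""
    return os.stat(relative_path, dir_fd=dir_fds[first_level],
                   follow_symlinks=False)


def demo():
    """Build a tiny two-level tree and stat a file through a cached fd."""
    with tempfile.TemporaryDirectory() as base:
        os.makedirs(os.path.join(base, "ab", "cd"))
        with open(os.path.join(base, "ab", "cd", "entry"), "w") as f:
            f.write("hello")
        dir_fds = open_first_level_dirs(base, ["ab"])
        try:
            st = stat_relative(dir_fds, "ab", os.path.join("cd", "entry"))
            return st.st_size
        finally:
            for fd in dir_fds.values():
                os.close(fd)


if __name__ == "__main__":
    print(demo())
```

The lookup starts from the open directory instead of walking the full path from the root each time, which is the effect the at-based calls are meant to achieve.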
Credits: Xavi Hernandez <xhernandez@redhat.com>
Change-Id: I1643e6b01ed669b2bb148d02f4e6a8e08da45343
updates: #891
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Following tests are done -
1 - After finishing reset-brick all the bricks should be up.
2 - Heal should be completed.
3 - Check number of entries present on brick which was reset.
Change-Id: I9314bed180293a99d400d94bb8cc7ece999da29e
Updates: #1144
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Noticed that the following message repeats quite a bit in log files when
an external monitoring tool queries gluster for list of volumes
periodically:
"Received get vol req"
As there's not much value in having this log message at log level INFO,
changing the log level to DEBUG to make glusterd.log a bit quieter.
Change-Id: I4e791fc65b9a4f813d295e7b2b6a05f3c0782e69
Updates: #1000
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
| |
Fixes: #1149
Change-Id: I38483fc7d76d7fe0ac9fb649669a46bdf9c82234
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
| |
Change-Id: I93f11dae6e4939ab79b0481ead2a4f7bb3085b70
Fixes: #1142
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was a bug in write-behind that allowed a previously completed write
to overwrite the overlapping region of data from a future write.
Suppose we want to send three writes (W1, W2 and W3). W1 and W2 are
sequential, and W3 writes at the same offset as W2:
W2.offset = W3.offset = W1.offset + W1.size
Both W1 and W2 are sent in parallel. W3 is only sent after W2 completes.
So W3 should *always* overwrite the overlapping part of W2.
Suppose write-behind processes the requests from 2 concurrent threads:
    Thread 1                            Thread 2

    <received W1>
                                        <received W2>
    wb_enqueue_tempted(W1)
    /* W1 is assigned gen X */
                                        wb_enqueue_tempted(W2)
                                        /* W2 is assigned gen X */
    wb_process_queue()
      __wb_preprocess_winds()
        /* W1 and W2 are sequential and all
         * other requisites are met to merge
         * both requests. */
        __wb_collapse_small_writes(W1, W2)
        __wb_fulfill_request(W2)
      __wb_pick_unwinds() -> W2
      /* In this case, since the request is
       * already fulfilled, wb_inode->gen
       * is not updated. */
    wb_do_unwinds()
      STACK_UNWIND(W2)
    /* The application has received the
     * result of W2, so it can send W3. */
                                        <received W3>
                                        wb_enqueue_tempted(W3)
                                        /* W3 is assigned gen X */
                                        wb_process_queue()
                                          /* Here we have W1 (which contains
                                           * the conflicting W2) and W3 with
                                           * same gen, so they are interpreted
                                           * as concurrent writes that do not
                                           * conflict. */
                                          __wb_pick_winds() -> W3
                                        wb_do_winds()
                                          STACK_WIND(W3)
    wb_process_queue()
      /* Eventually W1 will be
       * ready to be sent */
      __wb_pick_winds() -> W1
      __wb_pick_unwinds() -> W1
        /* Here wb_inode->gen is
         * incremented. */
    wb_do_unwinds()
      STACK_UNWIND(W1)
    wb_do_winds()
      STACK_WIND(W1)
So, as we can see, W3 is sent before W1, which shouldn't happen.
The problem is that wb_inode->gen is only incremented for requests that
have not been fulfilled but, after a merge, the request is marked as
fulfilled even though it has not been sent to the brick. This allows
future requests to be assigned the same generation, so they could
be internally reordered.
Solution:
Increment wb_inode->gen before any unwind, even if it's for a fulfilled
request.
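The effect of the generation rule can be shown with a toy model in Python. This is a deliberately simplified sketch and not the write-behind implementation: `ToyInode`, its flag, and the dictionary-based requests are assumptions made only to contrast the old and new rules.

```python
class ToyInode:
    """Toy model of the generation counter; not the real write-behind code."""

    def __init__(self, bump_on_fulfilled_unwind):
        self.gen = 0
        self.bump_on_fulfilled_unwind = bump_on_fulfilled_unwind

    def enqueue(self, name):
        # A request is tagged with the generation current at enqueue time.
        return {"name": name, "gen": self.gen, "fulfilled": False}

    def unwind(self, request):
        # Old rule: bump only for requests that were not already fulfilled.
        # New rule: bump before any unwind.
        if self.bump_on_fulfilled_unwind or not request["fulfilled"]:
            self.gen += 1


def w3_ordered_after_w1(bump_on_fulfilled_unwind):
    """Return True if W3 gets a later generation than W1."""
    inode = ToyInode(bump_on_fulfilled_unwind)
    w1 = inode.enqueue("W1")
    w2 = inode.enqueue("W2")
    w2["fulfilled"] = True   # W2 was merged into W1 and marked fulfilled
    inode.unwind(w2)         # the application sees W2 complete
    w3 = inode.enqueue("W3")
    # Equal generations mean W1 and W3 look concurrent and may be reordered.
    return w3["gen"] > w1["gen"]


print("old rule:", w3_ordered_after_w1(False))
print("new rule:", w3_ordered_after_w1(True))
```

Under the old rule W3 shares W1's generation, so nothing forces W1 (carrying W2's data) to be sent first; under the new rule W3 lands in a later generation.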
Special thanks to Stefan Ring for writing a reproducer that has been
crucial to identify the issue.
Change-Id: Id4ab0f294a09aca9a863ecaeef8856474662ab45
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Fixes: #884
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
frame is accessed after stack-wind. This can lead to crash
if the cbk frees the frame.
Fix:
Use new frame for the wind instead.
Updates: #832
Change-Id: I64754609f1114b0bbd4d1336fa81a56f2cca6e03
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The "snap_scheduler.py init" command fails with the traceback below:
[root@dhcp43-104 ~]# snap_scheduler.py init
Traceback (most recent call last):
File "/usr/sbin/snap_scheduler.py", line 941, in <module>
sys.exit(main(sys.argv[1:]))
File "/usr/sbin/snap_scheduler.py", line 851, in main
initLogger()
File "/usr/sbin/snap_scheduler.py", line 153, in initLogger
logfile = os.path.join(process.stdout.read()[:-1], SCRIPT_NAME + ".log")
File "/usr/lib64/python3.6/posixpath.py", line 94, in join
genericpath._check_arg_types('join', a, *p)
File "/usr/lib64/python3.6/genericpath.py", line 151, in _check_arg_types
raise TypeError("Can't mix strings and bytes in path components") from None
TypeError: Can't mix strings and bytes in path components
Solution:
Added the 'universal_newlines' flag to Popen to support backward compatibility.
Added a basic test for snapshot scheduler.
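The bytes-versus-text difference behind this traceback can be reproduced with a short Python sketch. The child command and the "/var/log/example" directory are hypothetical stand-ins; only `universal_newlines` and the `os.path.join` failure come from the report above, and POSIX path conventions are assumed.

```python
import os
import subprocess
import sys


def read_log_dir(universal_newlines):
    """Run a child that prints a directory name and return its stdout."""
    process = subprocess.Popen(
        [sys.executable, "-c", "print('/var/log/example')"],
        stdout=subprocess.PIPE,
        universal_newlines=universal_newlines,
    )
    output, _ = process.communicate()
    return output[:-1]  # drop the trailing newline


if __name__ == "__main__":
    # Text mode: stdout is str, so joining with a str file name works.
    print(os.path.join(read_log_dir(True), "snap_scheduler.log"))
    # Binary mode: stdout is bytes, and mixing bytes with str fails.
    try:
        os.path.join(read_log_dir(False), "snap_scheduler.log")
    except TypeError as err:
        print("TypeError:", err)
```

On Python 2 the pipe returned `str` either way, which is why the flag keeps one code path working on both interpreter versions.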
Change-Id: I78e8fabd866fd96638747ecd21d292f5ca074a4e
Fixes: #1134
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, ssl_setup_connection_params logs 4 messages for every
rpc connection, which irritates users reading the logs. The same
information can be printed in a single log message with peerinfo to make
it more useful. ssl_setup_connection_params also tries to load dh_param
even when the user has not configured it, and if a dh_param file is not
available it logs a failure message. To avoid that message, load dh_param
only when the user has configured it.
Change-Id: I9ddb57f86a3fa3e519180cb5d88828e59fe0e487
Fixes: #1141
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
...if pending xattrs are zero for all children.
Problem:
If there are no pending xattrs and a metadata heal needs to be
performed, it is possible that we end up with xattrs inadvertently
deleted from all bricks, as explained in the BZ.
Fix:
After picking one among the sources as the good copy, mark pending xattrs on
all sources to blame the sinks. Now even if this metadata heal fails midway,
a subsequent heal will still choose one of the valid sources that it
picked previously.
Fixes: #1067
Change-Id: If1b050b70b0ad911e162c04db4d89b263e2b8d7b
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Found by the flock test: without the correct ref count on the fd,
the lock will not be released correctly.
Fixes: bz#1779089
Change-Id: I3e466b17c852eb219c8778e43af8ad670a8449cc
Signed-off-by: l17zhou <cynthia.zhou@nokia-sbell.com>
|
|
|
|
|
|
|
|
| |
Removed the dead code reported by Coverity ID 1356503.
Change-Id: Ieaa41e864538fb82dc967b4a214d4db09e267098
Updates: #1060
Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In a brick_mux environment, when volumes are created/stopped in a loop,
the main brick crashes after running for a long time. The brick crashes
because the main brick process did not clean up memory for all objects
at the time of detaching a volume.
The following objects were missed at the time of detaching a volume:
1) xlator object for a brick graph
2) local_pool for posix_lock xlator
3) rpc object cleanup at quota xlator
4) inode leak at brick xlator
Solution: To avoid the crash, resolve all leaks at the time of detaching a brick.
Change-Id: Ibb6e46c5fba22b9441a88cbaf6b3278823235913
updates: #977
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A crash is seen during a reattempt to clean up shards in background
upon remount. And this happens even on remount (which means a remount
is no workaround for the crash).
In such a situation, the in-memory base inode object will not
exist (new process, non-existent base shard).
So local->resolver_base_inode will be NULL.
In the event of an error (in this case, of space running out), the
process would crash at the time of logging the error in the following line -
gf_msg(this->name, GF_LOG_ERROR, local->op_errno, SHARD_MSG_FOP_FAILED,
"failed to delete shards of %s",
uuid_utoa(local->resolver_base_inode->gfid));
Fixed that by using local->base_gfid as the source of gfid when
local->resolver_base_inode is NULL.
Change-Id: I0b49f2b58becd0d8874b3d4b14ff8d92a89d02d5
Fixes: #1127
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Currently even though gf_defrag_fix_layout fails with ENOENT or ESTALE, a
subsequent call is made to gf_defrag_process_dir leading to rebalance failure.
fixes: #1102
Change-Id: Ib0c309fd78e89a000fed3feb4bbe2c5b48e61478
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
| |
This patch fixes CID: 1420405
updates: #1060
Change-Id: I0524e999fa1d36ed5a713eabf65482c04ad43a1a
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: The posix_release(dir) functions add fds to ctx->janitor_fds
and the janitor thread closes them. In a brick_mux environment it is
difficult to handle race conditions in janitor threads because the brick
process spawns a single janitor thread for all bricks.
Solution: Use synctask to execute the posix_release(dir) functions instead of
using a background thread to close fds.
Credits: Pranith Karampuri <pkarampu@redhat.com>
Change-Id: Iffb031f0695a7da83d5a2f6bac8863dad225317e
Fixes: bz#1811631
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch, the log now indicates when a failure is due to EIO:
[2020-03-16 16:24:48.293837] E [syncdutils(worker /bricks/brick1/mbr3):348:log_raise_exception] <top>: Getting "Input/Output error" is most likely due to a. Brick is down or b. Split brain issue.
[2020-03-16 16:24:48.293915] E [syncdutils(worker /bricks/brick1/mbr3):352:log_raise_exception] <top>: This is expected as per design to keep the consistency of the file system. Once the above issue is resolved geo-rep would automatically proceed further.
Change-Id: Ie33f2440bc96089731ce12afa8dab91d9550a7ca
Fixes: #1104
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If worm-file-level is enabled and auto-commit-period is 0, an initial write
to a file (e.g. $ echo test >> file1.txt) would lead to a zero-byte
file because the WORM xlator immediately WORMed the file when it was
created.
To avoid this we move the setting of trusted.worm_file from
worm_create_cbk to worm_release. This means that this xattr will be set
when the file handle is closed and all initial WRITE FOPs have succeeded.
Finally we also perform gf_worm_state_transition in worm_release to
ensure that the file is WORMed immediately after the file handle
is closed.
Change-Id: I5d02e18975b646ca1a27ed41d836e9d0dc333204
Fixes: bz#1808421
Signed-off-by: David Spisla <david.spisla@iternity.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Open-behind was not keeping any reference on fds pending to be
opened. This made it possible for a concurrent close and an entry
fop (unlink, rename, ...) to destroy the fd while it
was still being used.
Change-Id: Ie9e992902cf2cd7be4af1f8b4e57af9bd6afd8e9
Fixes: bz#1810934
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When brick-mux is enabled:
i) brick statedumps seem to be listing the same lock information multiple times.
While that is getting fixed, make changes to the .t tests to check for unique values.
ii) detecting a brick as online via brick_up_status() seems to take
longer when delaygen is enabled. Hence bump up PROCESS_UP_TIMEOUT to
90 for afr-lock-heal-advanced.t
Updates: #1042
Change-Id: Ife76008f7a99dd1f1fe5791a32577366baaab4b3
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
The current implementation assumes that the ping event will come after the
connect event, but that may not be the case when fds need to be re-opened
after the socket connection, which takes more time. So handle the
ping/child-up events in any order.
fixes: bz#1800583
Change-Id: I6bcdc0caa503bdc039ef2b4739fbf4afae121f05
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|