| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Fixes: bz#1542072
Change-Id: Ia5fa1df81bbaec3a84653d136a331c76b457f42c
Signed-off-by: Milan Zink <zeten30@gmail.com>
|
|
|
|
|
|
| |
updates: bz#1699866
Change-Id: I7ccd1fc5fc134eeb6d443c755962a20819320d48
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: During the handshake, glusterd populates volume
data in a dictionary. When more than 1500 volumes are
configured, glusterd takes more than 10 minutes to generate
the data. Because this takes so long, the rpc request times out
and rpc starts bailing out call frames.
Solution: Optimize the code with the following changes:
1) Spawn multiple threads to populate volume data in bulk
in separate dictionaries, and introduce an option
glusterd.brick-dict-thread-count to configure the number of
threads used to populate volume data.
2) Populate tier data only when the volume type is tier.
3) Compare snap data only when snap_count is non-zero.
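The first change can be sketched roughly as follows. This is only an
illustration in Python, not the glusterd code: populate_chunk(),
populate_volumes() and the "volumeN.name" key layout are hypothetical
names chosen for the example, and thread_count merely plays the role of
glusterd.brick-dict-thread-count.

```python
from concurrent.futures import ThreadPoolExecutor


def populate_chunk(volumes):
    """Build a private dictionary for one slice of the volume list."""
    data = {}
    for index, name in volumes:
        # Hypothetical key layout, used only for this illustration.
        data[f"volume{index}.name"] = name
    return data


def populate_volumes(volume_names, thread_count=4):
    """Split the volumes across worker threads, then merge the results."""
    numbered = list(enumerate(volume_names, start=1))
    # Worker i takes every thread_count-th volume, so chunks never overlap.
    chunks = [numbered[i::thread_count] for i in range(thread_count)]
    merged = {"count": len(numbered)}
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # Each worker fills its own dictionary; merging happens afterwards
        # in the calling thread, so no lock is needed on the shared result.
        for partial in pool.map(populate_chunk, chunks):
            merged.update(partial)
    return merged
```

Giving each worker its own dictionary avoids contention on a single
shared one; the merge step is cheap compared to building the entries.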
Fixes: bz#1699339
Change-Id: I38dc71970c049217f9d1a06fc0aaf4c26eab18f5
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When eager-lock lock acquisition fails because of, say, network failures, the
local is not removed from owners_list. This leads to an accumulation of
waiting frames, and the application hangs because the waiting frames
assume that another transaction is in the process of acquiring the lock,
since owners_list is not empty. This patch handles this case as well and
adds asserts to make it easier to find such problems in the future.
fixes bz#1696599
Change-Id: I3101393265e9827755725b1f2d94a93d8709e923
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: commit c34e4161f3cb6539ec83a9020f3d27eb4759a975 sets the log-level
per xlator during reconfigure only for a brick process, not for
the client process.
Solution: 1) Change the per-xlator log-level only if brick_mux is enabled. To
detect brick multiplexing, introduce a flag brick_mux in ctx->cmd_args.
Note: This patch makes two other changes:
1) Ignore the client-log-level option when attaching a brick to an
already running brick process if brick_mux is enabled.
2) Log the pid of the running process to make debugging
easier.
Change-Id: I39e85de778e150d0685cd9a79425ce8b4783f9c9
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
Fixes: bz#1696046
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Creation of tar file on gluster volume throws warning
'file changed as we read it'
Cause:
During readdirp, for a few files whose inode was not
present, time attributes were served from the backend. This caused
the ctime of those files to differ before and after the readdir
done by tar.
Solution:
If the ctime feature is enabled and the inode is not present, don't
serve the time attributes from the backend file; serve them from the xattr.
fixes: bz#1698078
Change-Id: I427ef865f97399475faf5aa6ca495f7e317603ae
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In a brick_mux environment, when volumes are stopped in a
loop, bricks are not detached successfully. Bricks are not
detached because xprtrefcnt has not become 0 for the detached brick.
When the brick detach process is initiated, server_notify
saves xprtrefcnt on the detached brick, and once the counter becomes
0, server_rpc_notify spawns a server_graph_janitor_threads thread
to clean up brick resources. xprtrefcnt does not become 0 because
the socket framework stops working when 0 is assigned as the fd for a socket.
Commit dc25d2c1eeace91669052e3cecc083896e7329b2
changed changelog fini to close htime_fd if htime_fd is not
negative; by default htime_fd is 0, so it closes fd 0 as well.
Solution: Initialize htime_fd to -1 right after changelog_priv is
allocated by GF_CALLOC.
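The failure mode can be modelled in a few lines. This is a Python
sketch, not the changelog xlator code: ChangelogPriv and fini() are
simplified stand-ins, and the injectable close callback exists only so
the example does not close a real descriptor.

```python
import os


class ChangelogPriv:
    """Hypothetical stand-in for changelog_priv; only htime_fd is modelled."""

    def __init__(self, zero_filled):
        # A zero-filled allocation leaves htime_fd at 0, which is a valid
        # descriptor number, unless it is explicitly reset to -1.
        self.htime_fd = 0 if zero_filled else -1


def fini(priv, close=os.close):
    """Close htime_fd if it is not negative; return the fd that was closed."""
    if priv.htime_fd >= 0:
        fd = priv.htime_fd
        close(fd)
        priv.htime_fd = -1
        return fd
    return None
```

With the zero-filled default, fini() closes descriptor 0 even though the
changelog never opened it; with -1 as the initial value it closes nothing.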
Fixes: bz#1699025
Change-Id: I5f7ca62a0eb1c0510c3e9b880d6ab8af8d736a25
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
| |
Change-Id: I3556793c5e9d58cc6a08644b41dc5740fab2610b
updates: bz#1628194
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: bug-1650403.t and bug-858215.t throw an error
when accessing the glustershd pidfile.
Solution: Use the ps command to find the glustershd pid.
Change-Id: I3477345b6220aa039e012e674cba21d741e9abab
fixes: bz#1697486
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
| |
Just to make sure all files are listed, which gives us maximum code coverage.
updates: bz#1693692
Change-Id: I11d36ac2f4d6d4fb91223aacd423ad23242eb454
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The helper function get_fd_count() returns how many open fds a given
gfid has on a brick. It could happen that the brick doesn't have information
about that inode because it has not been previously accessed.
Before this patch, the function returned "" when the inode was not
present. This caused the basic/ec/ec-fix-openfd.t test to fail because it
was expecting '0' as the result.
This patch forces get_fd_count() to return '0' when the gfid is not
present in the state dump.
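The intended behaviour can be sketched as below. The real helper is a
shell function in the test framework; this Python version is only an
illustration, and the "gfid=" / "fd-count=" line layout of the state
dump is an assumption made for the example. It returns the integer 0
where the shell helper prints '0'.

```python
def get_fd_count(statedump_text, gfid):
    """Return the fd-count recorded for gfid, or 0 if the gfid is absent."""
    found = False
    for line in statedump_text.splitlines():
        line = line.strip()
        if line.startswith("gfid="):
            # Track whether the current inode section belongs to our gfid.
            found = line == f"gfid={gfid}"
        elif found and line.startswith("fd-count="):
            return int(line.split("=", 1)[1])
    # The brick has no record of this inode: report zero open fds
    # instead of an empty result.
    return 0
```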
Change-Id: I848b57744e96656bf81fbb7b126a5faf44e535eb
updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
1) The placement of the cloudsync xlator has been changed
to make it the shard xlator's child. If cloudsync has to
work with shard in the graph, it needs to be a child of shard.
Change-Id: Ib55424fdcb7ce8edae9f19b8a6e3d3ba86c1f0c4
fixes: bz#1642168
Signed-off-by: Anuradha Talur <atalur@commvault.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The protocol implements every fop and, in general, makes up a large part of
the codebase. Since our regression runs mostly on one machine,
there was no way of forcing the client to use the old protocol (while the new
one is available). With this patch, a new 'testing' option is provided
which forces the client to use the old protocol if found.
This should help increase the code coverage by at least 10k lines overall.
updates: bz#1693692
Change-Id: Ie45256f7dea250671b689c72b4b6f25037cef948
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Test ec-cpu-extensions.t has been modified so that it uses a bigger
matrix. This makes use of more functions from ec-code-c.c. Changing
read-policy to round-robin further increases the number of functions used,
reaching 100% line and function coverage for this file.
Change-Id: I26e4d33269cbd67f5d76d862f4cf1e69285e85e1
updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
| |
This test alone covers most of the code of the trace xlator.
updates: bz#1693692
Change-Id: I287c72ee89bd1c02d992b020d5644e8dac0b77ab
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Part 1: refactor the dht_lookup_dir_cbk
and dht_selfheal_directory functions.
Added a simple dht selfheal directory test
Change-Id: I1410c26359e3c14b396adbe751937a52bd2fcff9
updates: bz#1590385
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
When the split-brain choice is changed from one brick to another
brick, inode-invalidate is not called, so the readv call is served
from cache, leading to failures in split-brain-resolution.t.
Fixed it by calling inode_invalidate() when this happens.
updates bz#1193929
Change-Id: I2624614eec38c0303f3e1dc55dfae3d4b864218b
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For testing the recovery of bad (or corrupted) files in a dispersed
volume, first enable the self-heal daemon and let heal happen.
In the bitrot feature, if a file becomes corrupted, the recommended
solution is to remove that file directly from the backend and then allow
heal to happen. Hence turn on the self-heal daemon and allow the heal to
happen after removing the corrupted copy from the backend.
Change-Id: I7186110398ec1aee7e5727b9d1aac9a01db4d831
fixes: bz#1695327
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This works as a better solution, as we reuse more functions from the library.
Also do a write/read on a file when acl is enabled, so we can see an
improvement in code coverage.
updates: bz#1693692
Change-Id: If3359260c8ec2cf4fcf148fb4b95fdecc922c252
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
| |
It helps in increased code coverage of playground.
updates: bz#1693692
Change-Id: I81bcf30be1450948a6360d8915f06b973387a560
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Shd daemon is per node, which means it creates a graph
with all volumes in it. While this is great for utilizing
resources, it is not so good in terms of performance and manageability,
because self-heal daemons don't have the capability to automatically
reconfigure their graphs. So each time a configuration
change happens to the volumes (replicate/disperse), we need to restart
shd to bring the changes into the graph.
Because of this, all ongoing heals for all other volumes have to be
stopped in the middle and restarted all over again.
Solution:
This change makes shd a per-volume daemon, so that a graph
is generated for each volume.
When we want to start/reconfigure shd for a volume, we first search
for an existing shd running on the node. If there is none, we
start a new process. If a daemon is already running for shd, we
simply detach the graph for the volume and reattach the updated
graph for the volume. This won't touch any of the ongoing operations
for any other volumes on the shd daemon.
Example of an shd graph when it is per volume

                graph
       -----------------------
      |     debug-iostat      |
       -----------------------
          /       |       \
         /        |        \
   ---------  ---------  ---------
   | AFR-1 |  | AFR-2 |  | AFR-3 |
   ---------  ---------  ---------

A running shd daemon with 3 volumes will be like-->

                    graph
           -----------------------
          |     debug-iostat      |
           -----------------------
              /       |       \
             /        |        \
   ------------ ------------ ------------
   | volume-1 | | volume-2 | | volume-3 |
   ------------ ------------ ------------
Change-Id: Idcb2698be3eeb95beaac47125565c93370afbd99
fixes: bz#1659708
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If an open comes on a file while a brick is down and, after the brick comes up,
a fop comes on the fd, the client xlator would still wind the fop on anon-fd,
leading to wrong behavior of the fops in some cases.
Example:
If an lk fop is issued on the fd just after the brick is up in the scenario
above, the lk fop will be sent on anon-fd instead of being failed on that client
xlator. This lock will never be freed upon close of the fd, as flush on anon-fd
is invalid and is not wound below the server xlator.
As a fix, fail the fop unless the fd has the FALLBACK_TO_ANON_FD flag.
Change-Id: I77692d056660b2858e323bdabdfe0a381807cccc
fixes bz#1390914
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Fixes afr_ta_read_txn() to handle inode refresh failures.
code-path.
- Fixes a double free issue of dict.
Note: This patch addresses post-merge review comments for commit
69532c141be160b3fea03c1579ae4ac13018dcdf
fixes: bz#1686398
Change-Id: Id5299b45b68569d47df6b73755918237a1592cb4
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Geo-rep fails to sync the rename of a symlink if it is
renamed multiple times and the creation and renames
happen in quick succession.
Worker crash at slave:
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", in worker
res = getattr(self.obj, rmeth)(*in_data[2:])
File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", in entry_ops
[ESTALE, EINVAL, EBUSY])
File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", in errno_wrap
return call(*arg)
File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in lsetxattr
cls.raise_oserr()
File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in raise_oserr
raise OSError(errn, os.strerror(errn))
OSError: [Errno 12] Cannot allocate memory
Geo-rep Behaviour:
1. SYMLINK doesn't record target path in changelog.
So while syncing SYMLINK, readlink is done on
master to get target path.
2. Geo-rep will create destination if source is not
present while syncing RENAME. Hence while syncing
RENAME of SYMLINK, target path is collected from
destination.
Cause:
If a symlink is created and renamed multiple times, creation of the
symlink is ignored, as it's no longer present on master at
that path. When the symlink is renamed multiple times at master,
then while syncing the first RENAME of SYMLINK, both source and destination
are absent, hence the target path is not known. In this case,
while creating the destination directly at slave, regular file
attributes were encoded into the blob instead of symlink attributes,
causing a failure in the gfid-access translator while decoding the
blob.
Solution:
While syncing a RENAME of SYMLINK, when the target is not known
and neither source nor destination is present on the master,
don't create the destination. Ignore the rename; it's ok to ignore.
If it's unlinked, it's fine. If it's renamed to something else,
it will be synced then.
Change-Id: Ibdfa495513b7c05b5370ab0b89c69a6802338d87
fixes: bz#1693648
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
| |
ssh-port validation is specified as `validation=int` in the template
`gsyncd.conf`, but this was not handled during geo-rep config set.
Fixes: bz#1692666
Change-Id: I3f19d9b471b0a3327e4d094dfbefcc58ed2c34f6
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
1 - heal-wait-qlength is 128 by default. If shd is disabled
and we need to heal files, client side heal is needed.
Accessing these files triggers the heal.
However, it has been observed that a file can be enqueued
multiple times in the heal wait queue, which in turn fills
the queue and prevents other files from being enqueued.
2 - While a file is being healed and a write fop from the
mount comes on that file, the write is sent to all the bricks, including
the healing one. At the end it updates version and size on all the
bricks. However, it does not unset the dirty flag on all the bricks,
even if this write fop was successful on all the bricks.
After healing completes, this dirty flag remains set and never
gets cleaned up if SHD is disabled.
Solution:
1 - If an entry is already in the queue or going through the heal process,
don't enqueue the next client side request to heal the same file.
2 - Unset dirty on all the bricks at the end if the fop has succeeded on
all the bricks, even if some of the bricks are going through heal.
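The first part of the solution amounts to tracking which entries are
already queued or being healed. A minimal Python sketch of that idea
follows; HealQueue and its method names are hypothetical and do not
correspond to the actual EC code.

```python
from collections import deque


class HealQueue:
    """Bounded heal wait queue that refuses duplicate entries."""

    def __init__(self, wait_qlength=128):
        self.wait_qlength = wait_qlength
        self.waiting = deque()
        self.pending = set()  # gfids that are queued or currently healing

    def enqueue(self, gfid):
        """Queue gfid for heal; return False if duplicate or queue is full."""
        if gfid in self.pending:
            return False
        if len(self.waiting) >= self.wait_qlength:
            return False
        self.waiting.append(gfid)
        self.pending.add(gfid)
        return True

    def start_next(self):
        """Take the next gfid to heal; it stays pending until finish()."""
        return self.waiting.popleft() if self.waiting else None

    def finish(self, gfid):
        """Mark the heal of gfid as done so it may be queued again later."""
        self.pending.discard(gfid)
```

Because an entry stays in the pending set until its heal finishes,
repeated accesses to the same file cannot fill the queue.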
Change-Id: Ia61ffe230c6502ce6cb934425d55e2f40dd1a727
updates: bz#1593224
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Just setting the path to the CRL directory in socket_init() wasn't working.
Solution:
Need to use special API to retrieve and set X509_VERIFY_PARAM and set
the CRL checking flags explicitly.
Also, setting the CRL checking flags is a big pain, since the connection
is declared as failed if any CRL isn't found in the designated file or
directory. A comment has been added to the code appropriately.
Change-Id: I8a8ed2ddaf4b5eb974387d2f7b1a85c1ca39fe79
fixes: bz#1687326
Signed-off-by: Milind Changire <mchangir@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Test case bug-844688.t is failing quite frequently on master.
This test checks for the existence of call_stack and frame creation
time.
But there is a chance that, at a given point in time, the stack count
might be zero. So doing the check in EXPECT_WITHIN makes
more sense.
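For readers unfamiliar with the pattern, a check of this kind retries
until the expected value shows up or a timeout expires. The real
EXPECT_WITHIN is a shell helper in the test framework; the Python
function below is only an analogous sketch, and its name and parameters
are assumptions.

```python
import time


def expect_within(timeout, expected, check, interval=0.05):
    """Poll check() until it returns expected or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if check() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A single-shot check fails if it happens to sample the moment when the
stack count is zero; the retrying form tolerates that transient state.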
Change-Id: Id2ede7f6fdcb5f016f52c5c0557ce6ac510d4e96
updates: bz#1688116
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
While checking a test case using EXPECT_WITHIN, the
argument is actually missing a '$' symbol to denote
the token as a variable in bash
Change-Id: I5b9150acdea000b29e94cfb01d975c77f5ece3e5
fixes: bz#1688116
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In an arbiter volume configuration, SHD will not send any writes to the arbiter
brick even if there is a data pending marker for the arbiter brick. If we have an
arbiter setup on the geo-rep master and there are data pending markers for the files
on the arbiter brick, SHD will not mark any data changelog during healing. While syncing
the data from master to slave, if the arbiter brick is considered ACTIVE, then
there is a chance that the slave will miss some data. If the arbiter brick is being
newly added or replaced, there is a chance of the slave missing all the data during sync.
Fix:
If there is a data pending marker for the arbiter brick, send a truncate to the arbiter
brick during heal, so that it records the truncate as the data transaction in the changelog.
Change-Id: I3242ba6cea6da495c418ef860d9c3359c5459dec
fixes: bz#1686568
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
| |
updates: bz#1193929
Change-Id: I347de62755100cd69e3cf341434767ae23fd1ba4
Signed-off-by: Csaba Henk <csaba@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
The tests assumed that the file is created on a
particular brick. This need not be the case
in all scenarios, so the assumption has been removed.
Change-Id: Id420f43d7f72d983a7c6f16ea8fed273d46c4824
updates: bz#1672480
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements a thread pool that is wait-free for adding jobs to
the queue and uses a very small locked region to get jobs. This makes it
possible to decrease contention drastically. It's based on wfcqueue
structure provided by urcu library.
It automatically enables more threads when load demands it, and stops
them when not needed. There's a maximum number of threads that can be
used. This value can be configured.
Depending on the workload, the maximum number of threads plays an
important role. So it needs to be configured for optimal performance.
Currently the thread pool doesn't self adjust the maximum for the
workload, so this configuration needs to be changed manually.
For this reason, the global thread pool has been made optional, so that
volumes can still use the thread pool provided by io-threads.
To enable it for bricks, the following option needs to be set:
config.global-threading = on
This option has no effect if bricks are already running. A restart is
required to activate it. It's recommended to also enable the following
option when running bricks with the global thread pool:
performance.iot-pass-through = on
To enable it for a FUSE mount point, the option '--global-threading'
must be added to the mount command. To change it, an umount and remount
is needed. It's recommended to disable the following option when using
global threading on a mount point:
performance.client-io-threads = off
To enable it for services managed by glusterd, glusterd needs to be
started with option '--global-threading'. In this case all daemons, like
self-heal, will be using the global thread pool.
Currently it can only be enabled for bricks, FUSE mounts and glusterd
services.
The maximum number of threads for clients and bricks can be configured
using the following options:
config.client-threads
config.brick-threads
These options can be applied online and their effect is immediate most of
the time. If one of them is set to 0, the maximum number of threads
will be calculated as #cores * 2.
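The rule for the zero value can be written out as a small Python
sketch. effective_max_threads() is a hypothetical name used only for
this illustration; configured stands for the value of
config.client-threads or config.brick-threads.

```python
import os


def effective_max_threads(configured, cores=None):
    """Return the thread limit, falling back to #cores * 2 when it is 0."""
    if cores is None:
        # os.cpu_count() may return None on some platforms.
        cores = os.cpu_count() or 1
    return configured if configured > 0 else cores * 2
```

For example, on a 4-core machine a configured value of 0 would give a
maximum of 8 threads, while a configured value of 16 is used as-is.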
Some distributions use a very old userspace-rcu library (version 0.7).
For this reason, some header files from version 0.10 have been copied
into contrib/userspace-rcu and are used if the detected version is 0.7
or older.
An additional change has been made to io-threads to prevent threads
from being started when iot-pass-through is set.
Change-Id: I09d19e246b9e6d53c6247b29dfca6af6ee00a24b
updates: #532
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Without this patch the following error is seen:
....
warning: implicit declaration of function ‘makedev’ [-Wimplicit-function-declaration]
ret = mknod("cspecial", S_IFCHR | S_IRWXU | S_IRWXG, makedev(2, 3));
^~~~~~~
/usr/bin/ld: /tmp/ccIVwT46.o: in function `path_based_fops':
/home/pk/workspace/gerrit-repo/tests/basic/fops-sanity.c:478:
undefined reference to `makedev'
....
updates bz#1676797
Change-Id: I8a17c38fdfd458dd2dc75f4c7e2bf20ce555a042
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
| |
The bricks are loopback devices. Unmounting them is done
before the cleanup and leads to "target is busy" messages.
Change-Id: Ia808c2c9580273e1bf0595ecf53c210847c44577
fixes: bz#1676736
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
| |
Fixes: bz#1672727
Change-Id: I2b9be45f199f6436b858536c6f49be85902217f0
Signed-off-by: Nigel Babu <nigelb@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scenarios tested:
* Upgrade the node when stripe, tiering and regular
types of volumes are present.
- All volumes are started fine (as the change was not on brick volfile)
- For tier, the functionality may not even work, as changetimerecorder
is not present.
- 'gluster volume info' properly shows as 'NOT SUPPORTED' for stripe and
tier type of volume.
* Upgrade in a rolling upgrade scenario, where an old version is
able to connect to higher master.
- on a normal volume, if the volfile-server was new, the newer client
volfiles needed to have utime xlator conditionally.
- with this one change, all other changes seem to work fine.
Change-Id: Ib2d3b69dafa02b2c695a735b13c1aa70aba07cb8
updates: bz#1635688
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Auto invalidation is necessary when the same (meta)data is shared/accessed
across multiple mounts. However, if (meta)data is not shared, all
relevant I/O goes through the cache of a single mount and hence is
always coherent with the (meta)data on bricks. So fuse-auto-invalidation
can be disabled for this case, which gives a huge performance boost for
workloads that write data and then immediately read the data they just
wrote.
From glusterfs --help,
<snip>
--auto-invalidation[=BOOL] controls whether fuse-kernel can
auto-invalidate attribute, dentry and page-cache.
Disable this only if same files/directories are
not accessed across two different mounts
concurrently [default: "on"]
</snip>
Details on how disabling auto-invalidation helped to reduce pgbench
init times can be found at [1]. Time taken for pgbench init of scale
8000 was 8340s. That will be an improvement of 86% (59280s vs 8340s)
with auto-invalidations turned off along with other
optimizations. Just disabling auto-invalidation contributed 56%
improvement by reducing the total time taken by 33260s.
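The percentages above follow from the quoted timings. As a quick check
in Python (the variable names are only for this illustration; the three
numbers are the ones given in the text):

```python
baseline_s = 59280                    # pgbench init time before the changes
optimized_s = 8340                    # with all optimizations applied
saved_by_auto_invalidation_s = 33260  # saved by this option alone

# Relative reduction of the total run time.
total_improvement = (baseline_s - optimized_s) / baseline_s
# Share of the baseline saved by disabling auto-invalidation.
auto_invalidation_share = saved_by_auto_invalidation_s / baseline_s

print(f"{total_improvement:.0%}")
print(f"{auto_invalidation_share:.0%}")
```

This prints 86% and 56%, matching the figures in the message.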
[1] https://www.spinics.net/lists/gluster-devel/msg25907.html
Change-Id: I0ed730dba9064bd9c576ad1800170a21e100e1ce
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
updates: bz#1664934
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a fop to create an entry fails on one of the data bricks,
we mark the pending changelog on the entry on the brick on which
it was successful. This is done as part of the post-op phase to
make sure that the entry gets healed even if it gets renamed to
some other path where its parent was not marked as bad.
As this happens as part of post-op, we should consult thin-arbiter
to check whether the brick on which the fop was successful is the good brick.
This will avoid split-brain and other issues.
Change-Id: I12686675be98f02f70a5186b3ed748c541514d53
updates: bz#1662264
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
...when ctime is zero. ia_type and ia_gfid always need to be non-zero
for things to work correctly.
Problem:
Commit c9bde3021202f1d5c5a2d19ac05a510fc1f788ac zeroed out the iatt
buffer in the cbks of modification fops before unwinding if the ctime in
the buffer was zero. This was causing the fops to fail: noticeable when
AFR's 'consistent-metadata' option was enabled. (AFR zeros out the ctime
when the option is set. See commit
4c4624c9bad2edf27128cb122c64f15d7d63bbc8).
Fixes:
-Do not zero out the ia_type and ia_gfid of the iatt buff under any
circumstance.
-Also, fixed _rda_inode_ctx_update_iatts() to always update these values from
the incoming buf when ctime is zero. Otherwise we end up with zero
ia_type and ia_gfid the first time the function is called *and* the
incoming buf has ctime set to zero.
fixes: bz#1670253
Reported-By:Michael Hanselmann <public@hansmi.ch>
Change-Id: Ib72228892d42c3513c19fc6dfb543f2aa3489eca
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PROBLEM:
Many of the earlier changes in the management of shards in the lru and
fsync lists assumed that if a given shard exists in the fsync list, it must
be part of the lru list as well. This was found to be untrue.
Consider this - a file is FALLOCATE'd to a size which would make the
number of participant shards to be greater than the lru list size.
In this case, some of the resolved shards that are to participate in
this fop will be evicted from lru list to give way to the rest of the
shards. And once FALLOCATE completes, these shards are added to fsync
list but without a ref. After the fop completes, these shard inodes
are unref'd and destroyed while their inode ctxs are still part of
fsync list. Now when an FSYNC is called on the base file and the
fsync-list traversed, the client crashes due to illegal memory access.
FIX:
Hold a ref on the shard inode when adding to fsync list as well.
And unref under following conditions:
1. when the shard is evicted from lru list
2. when the base file is fsync'd
3. when the shards are deleted.
Change-Id: Iab460667d091b8388322f59b6cb27ce69299b1b2
fixes: bz#1669077
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
| |
Fixes: bz#1665358
Change-Id: Idbf88ec3ac683733b32c313377eeb72f2819bf0d
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
so that we can understand more about process memory and thread consumptions
With this, we will also be able to understand more about the process details
with brick-mux.
updates: bz#1193929
Change-Id: I147a3e3814fc37dfb635217d0a0f0184fae0994f
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Automatic split-brain resolution with size as the policy must
not resolve split-brains when the copies are of the same size.
This patch determines whether the sizes of the copies are the same
and returns -1 in that case.
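The rule can be sketched as follows. This is a Python illustration
under the assumption that the size policy picks the largest copy;
pick_source_by_size() is a hypothetical name and not the AFR function.

```python
def pick_source_by_size(sizes):
    """Return the index of the largest copy, or -1 if no unique largest."""
    if not sizes:
        return -1
    largest = max(sizes)
    # If more than one copy shares the largest size, size alone cannot
    # decide the source, so refuse to resolve the split-brain.
    if sizes.count(largest) > 1:
        return -1
    return sizes.index(largest)
```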
updates: bz#1655052
Change-Id: I3d8e8b4d7962b070ed16c3ee02a1e5a926fd5eab
Signed-off-by: Iraj Jamali <ijamali@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a low-level security issue with fencing, since one client
can preempt another client's lock.
This patch does not completely eliminate the issue of a client
misbehaving, but it does add a security layer for default use cases
that do not need fencing.
Change-Id: I55cd15f2ed1ae0f2556e3d27a2ef4bc10fdada1c
updates: #466
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
rm -rf <dir> fails on dirs which contain linkto files
that point to themselves because dht incorrectly thought
that they were cached files after looking them up.
The fix now treats them as invalid linkto files
and deletes them.
Change-Id: I376c72a5309714ee339c74485e02cfb4e29be643
fixes: bz#1667804
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In automatic Splitbrain resolution when favorite child policy
is set as size, split brain resolution must not work for
directories.
Currently, if a directory is in split brain with both copies
having same size, the source is selected arbitrarily
and healed.
fixes: bz#1655050
Change-Id: I5739498639c17c89874cc577362e543adab55f5d
Signed-off-by: Sheetal Pamecha <sheetal.pamecha08@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added functionality to the gluster volume set auth.allow command to
accept CIDR IP addresses. Modified a few functions to isolate the CIDR
feature so that other gluster commands, such as peer
probe, cannot use CIDR-format IPs. The functions are modified in such
a way that they have an option to enable accepting the CIDR
format for other gluster commands if required in the future.
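What CIDR matching means for an allow list can be shown with Python's
standard ipaddress module. This is a sketch of the concept, not of the
gluster implementation: is_allowed() is a hypothetical helper, and it
ignores any other pattern forms that auth.allow may accept.

```python
import ipaddress


def is_allowed(client_ip, allow_list):
    """Return True if client_ip matches any address or CIDR block listed."""
    addr = ipaddress.ip_address(client_ip)
    for entry in allow_list:
        # A plain address parses as a single-host network; strict=False
        # also tolerates entries with host bits set, e.g. 192.168.1.5/24.
        network = ipaddress.ip_network(entry, strict=False)
        if addr in network:
            return True
    return False
```

For instance, an entry of 192.168.1.0/24 admits 192.168.1.7 but not
192.168.2.7.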
updates: bz#1138841
Change-Id: Ie6734002a7078f1820e5df42d404411cce945e8b
Credits: Mohit Agrawal
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
design reference: https://review.gluster.org/#/c/glusterfs-specs/+/21925/
This patch adds the lock preempt support.
Note: The current model stores lock enforcement information as separate
xattr on disk. There is another effort going in parallel to store this
in stat(x) of the file. This patch is self sufficient to add fencing
support. Based on the availability of the stat(x) support either I will
rebase this patch or we can modify the necessary bits post merging this
patch.
Change-Id: If4a42f3e0afaee1f66cdb0360ad4e0c005b5b017
updates: #466
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In some places the gluster code calls get_new_dict
to create a dictionary without taking a reference, so the
dictionary is leaked at the time of dict_unref.
Solution: Call dict_new instead of get_new_dict.
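The leak can be modelled with a toy reference counter. This Python
sketch borrows the C function names for readability but is not the
libglusterfs code; it assumes that a dictionary is destroyed only when
its count drops to exactly zero, and that get_new_dict returns an
object whose count starts at zero.

```python
class Dict:
    """Toy model of a reference-counted dictionary."""

    def __init__(self):
        self.refcount = 0
        self.destroyed = False


def get_new_dict():
    """Allocate without taking a reference (count stays at 0)."""
    return Dict()


def dict_ref(d):
    d.refcount += 1
    return d


def dict_new():
    """Allocate and take the initial reference (count becomes 1)."""
    return dict_ref(get_new_dict())


def dict_unref(d):
    d.refcount -= 1
    # Destruction is triggered only when the count reaches exactly zero.
    if d.refcount == 0:
        d.destroyed = True
```

In this model, unref'ing a dictionary from get_new_dict() takes the
count from 0 to -1 and it is never destroyed, while one from dict_new()
goes from 1 to 0 and is freed.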
updates bz#1650403
Change-Id: I3ccbbf5af07079a4fa09aad2cd0458c8625b2f06
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|