| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Add thin convenient library wrapper gf_time(),
adjust related users and comments as well.
Change-Id: If8969af2f45ee69c30c3406bce5baa8305fb7f80
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Updates: #1002
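A minimal sketch of what such a thin wrapper could look like. The name sketch_gf_time() and its signature are assumptions made for illustration; the actual gf_time() definition is not quoted in this message.

```c
#include <time.h>

/* Hypothetical stand-in for the wrapper described above. It only hides
 * the NULL argument that callers of time(2) otherwise repeat. */
static inline time_t
sketch_gf_time(void)
{
    /* time(NULL) returns the current calendar time in seconds. */
    return time(NULL);
}
```

Callers would then write sketch_gf_time() wherever they previously wrote time(NULL), which gives a single place to change if the time source ever needs to differ.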
| |
Problem: When handling a friend update request, glusterd updates the peer
file, and if DNS has returned multiple hostnames for the same IP, glusterd
saves all of the hostnames in the peer file. Commit 1fa089e7a2b180e0bdcc1e7e09a63934a2a0c0ef
changed the approach to save all key-value pairs in a single shot.
If the buffer does not have enough space to store the hostnames, glusterd
writes a partial hostname to the peer file.
Solution: To avoid the failure, increase the buffer length.
Change-Id: Iee969d165333e9c5ba69431d474c541b8f12d442
Fixes: #1407
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
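The partial-hostname symptom is what a bounded formatted write produces when the buffer is too short. The sketch below is not the fix from this commit (which enlarges the buffer); it only illustrates, under assumed names, how a truncated key-value pair can be detected and dropped instead of being stored half-written. sketch_append_kv() and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Append "key=value\n" to buf at offset *used.
 * Returns 0 on success. Returns -1 if the pair does not fit, in which
 * case the buffer is left exactly as it was before the call. */
static int
sketch_append_kv(char *buf, size_t size, size_t *used, const char *key,
                 const char *value)
{
    int n;

    assert(buf != NULL && used != NULL);
    if (*used >= size)
        return -1;

    /* snprintf() reports the length it wanted to write, so a result
     * >= the remaining space means the output was truncated. */
    n = snprintf(buf + *used, size - *used, "%s=%s\n", key, value);
    if (n < 0 || (size_t)n >= size - *used) {
        buf[*used] = '\0'; /* discard the partial pair */
        return -1;
    }

    *used += (size_t)n;
    return 0;
}
```

Checking the return value this way turns a silent partial write into an explicit error that the caller can handle.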
| |
Problem: When the auth.allow list is set to specific IPs, the add-brick
operation fails.
Cause: The add-brick command creates a temporary mount on the
bricks to set the extended attributes on the brick mount
points. When the auth.allow list is set to the default, i.e. * (all),
there is no issue, but when it is set to specific IPs the
add-brick operation fails because the temporary mount on the bricks
fails: the peers are not part of the auth.allow list.
Solution: When the auth.allow list is already set, add all the
peers to the auth.allow list during the add-brick operation.
The old list will be replaced in the post-commit phase.
As this can happen with the replace-brick operation as well,
added code to handle it.
updates: #1391
Change-Id: I5ede8c35f05ab25ff431b88e074ddbe9c10a90f1
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
| |
CID: 1430146
Change-Id: Icce4ffa0e78575b110e0cfd9d5cfd133141680c1
Updates: #1060
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
| |
Issue:
On executing the command `gluster vol set help`, an error
comes up in the glusterd logs stating `undefined symbol: xlator_api`.
This issue is seen for the rpc-transport/socket.so file.
Fix:
The symbol `xlator_api` is not found in the rpc-transport/socket.so
file because it is not an xlator but a shared object for transport.
In the `xlator.c` file, the function `xlator_volopt_dynload`
looks up the default values of the options available in gluster,
which are stored inside the respective xlator files for different voltypes.
Each of these files contains an `options` object holding
the default values, which for xlators is referenced through the `options` data
member of the `xlator_api` object. But since
`rpc-transport/socket.so` is not an xlator, there is no `xlator_api`
object present to point to that object. So, for
`rpc-transport/socket.so` we access the `options` object
directly from the `xlator_volopt_dynload` function to fetch the default
values for the available options.
Fixes: #827
Change-Id: I3b2b0c1f2a11896be250aaca1a33a65b044991d5
Signed-off-by: nik-redhat <nladha@redhat.com>
| |
In a cluster environment, getspec() detects that the volfile is not found,
but further on the return code is overwritten by another call,
so the error is lost and not handled.
As a result, the server responds with an ambiguous message,
{op_ret = -1, op_errno = 0..}, which causes the client to get stuck.
Fix:
Server side: don't override the failure error.
fixes: #1375
Change-Id: Id394954d4d0746570c1ee7d98969649c305c6b0d
Signed-off-by: Tamar Shacked <tshacked@redhat.com>
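The pattern behind this kind of bug can be sketched as follows. All function names here are hypothetical stand-ins, not the actual getspec() code: the point is only that the first failure jumps to the exit path, so no later call can overwrite the return code or the errno value that goes back to the client.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helpers: the lookup fails, the follow-up step succeeds. */
static int
sketch_lookup_volfile(void)
{
    errno = ENOENT;
    return -1;
}

static int
sketch_build_reply(void)
{
    return 0;
}

/* Returns 0 on success, -1 on failure with *op_errno describing it. */
static int
sketch_getspec(int *op_errno)
{
    int ret;

    assert(op_errno != NULL);

    ret = sketch_lookup_volfile();
    if (ret < 0) {
        *op_errno = errno; /* record the reason before anything else runs */
        goto out;          /* skip calls that would overwrite ret */
    }

    ret = sketch_build_reply();
    *op_errno = (ret < 0) ? errno : 0;
out:
    return ret;
}
```

With this shape the reply always carries a non-zero error code whenever the return value signals failure.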
| |
When dirp is null, we should not call sys_closedir() on it.
fixes: #1379
Change-Id: I33633df983aeea11e9d685e41ed9ec58644b6258
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
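A small sketch of the guard described above, using plain opendir(3)/closedir(3) rather than the sys_* wrappers so that it stands alone. The function name is hypothetical.

```c
#include <dirent.h>
#include <stddef.h>

/* Count the entries of a directory.
 * Returns -1 if the directory cannot be opened. */
static int
sketch_count_entries(const char *path)
{
    int count = -1;
    DIR *dirp = opendir(path);

    if (dirp == NULL)
        goto out;

    count = 0;
    while (readdir(dirp) != NULL)
        count++;
out:
    /* Passing NULL to closedir() is not valid, so the shared exit path
     * only closes the stream when opendir() actually succeeded. */
    if (dirp != NULL)
        closedir(dirp);
    return count;
}
```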
| |
Display which options were not changed from the default.
The user may have changed some global or volume options
from their initial defaults. Display '(DEFAULT)' next to values
that were not explicitly set by the user.
Example output:
Option                          Value
------                          -----
cluster.server-quorum-ratio     50
cluster.enable-shared-storage   disable (DEFAULT)
cluster.op-version              80000
cluster.max-op-version          90000
cluster.brick-multiplex         disable (DEFAULT)
cluster.max-bricks-per-process  250 (DEFAULT)
glusterd.vol_count_per_thread   100 (DEFAULT)
cluster.localtime-logging       disable (DEFAULT)
cluster.daemon-log-level        INFO (DEFAULT)
Since glusterfind uses the value, it now filters the value,
picking only the first word (which is the value itself) and ignoring
the rest, which may now be '(DEFAULT)'.
Fixes: #1357
Change-Id: I7c59055158d099a5de38943f2169fd02c77f5d09
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
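The first-word filtering can be sketched as below. glusterfind itself is not written in C; this is only an illustration of the idea in C, and sketch_first_word() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the first whitespace-delimited token of `line` into `out`.
 * Returns the token length, or -1 if it does not fit into `out`. */
static int
sketch_first_word(const char *line, char *out, size_t outsz)
{
    size_t len;

    assert(line != NULL && out != NULL);

    line += strspn(line, " \t");    /* skip leading blanks */
    len = strcspn(line, " \t\n");   /* length up to the next blank */
    if (len >= outsz)
        return -1;

    memcpy(out, line, len);
    out[len] = '\0';
    return (int)len;
}
```

Applied to a value column such as `disable (DEFAULT)`, this keeps `disable` and ignores the marker that follows it.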
| |
Issue:
'this' is used before it is defined,
which can lead to a NULL dereference.
Fix:
Moved the definition of 'this' before its use
to avoid a NULL dereference.
Change-Id: I6ad382192129dfa3a206426e5610040e7a905be6
Updates: #1096
Signed-off-by: nik-redhat <nladha@redhat.com>
| |
Seen in fedora rawhide/33, and SUSE tumbleweed, in all versions
going back at least as far as glusterfs-6.
[ 351s] glusterd.c:68:13: warning: type of 'snap_mount_dir' does not match original declaration [-Wlto-type-mismatch]
[ 351s] 68 | extern char snap_mount_dir[PATH_MAX];
[ 351s] | ^
[ 351s] glusterd-snapshot.c:65:6: note: array types have different bounds
[ 351s] 65 | char snap_mount_dir[VALID_GLUSTERD_PATHMAX];
[ 351s] | ^
[ 351s] glusterd-snapshot.c:65:6: note: 'snap_mount_dir' was previously declared here
In this case it's only a warning, but certainly merits fixing.
Another case where a decl in a header file instead of open-coding
extern decls in multiple .c files would have been preferable.
Change-Id: Idc91e536a56a1a7717be83ed27698069e71dff67
Updates: #1002
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
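The remark about header declarations can be sketched like this. Everything is collapsed into one file here so that it compiles on its own, and the sketch_ names are hypothetical; in a real tree the first part would sit in a shared header and the definition in exactly one .c file.

```c
#include <limits.h>
#include <stddef.h>

#ifndef PATH_MAX
#define PATH_MAX 4096 /* fallback for platforms that do not define it */
#endif

/* Would live in one shared header: a single macro fixes the bound, so
 * every file that includes the header sees the same array type. */
#define SKETCH_PATHMAX PATH_MAX
extern char sketch_snap_mount_dir[SKETCH_PATHMAX];

/* The single definition, in exactly one .c file that also includes the
 * header, so the compiler checks it against the declaration. */
char sketch_snap_mount_dir[SKETCH_PATHMAX];
```

Because declaration and definition share one bound and are visible in the same translation unit, a mismatch like the one in the warning above becomes a compile error instead of surfacing only under LTO.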
| |
Issue:
In some cases there was either an explicit null
dereference or a dereference after a null
check.
Fix:
Added the proper null-check conditions
and fixed the null dereferences.
CID: 1430106 : Dereference after null check
CID: 1430120 : Explicit null dereferenced
CID: 1430132 : Dereference after null check
CID: 1430134 : Dereference after null check
Change-Id: I7e795cf9f7146a633097c26a766f16b159881fa3
Updates: #1060
Signed-off-by: nik-redhat <nladha@redhat.com>
| |
Replace an over-engineered GF_SKIP_IRRELEVANT_ENTRIES() with
inline function gf_irrelevant_entry(), adjust related users.
Change-Id: I6f66c460f22a82dd9ebeeedc2c55fdbc10f4eec5
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1350
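A sketch of the kind of inline predicate that can replace such a macro. Which entries count as irrelevant is an assumption here (only "." and ".."); the real gf_irrelevant_entry() may test more than that, and sketch_irrelevant_entry() is a hypothetical name.

```c
#include <assert.h>
#include <dirent.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true for directory entries a scan loop should skip.
 * Assumption: only the self and parent entries are skipped. */
static inline bool
sketch_irrelevant_entry(const struct dirent *entry)
{
    assert(entry != NULL);
    return strcmp(entry->d_name, ".") == 0 ||
           strcmp(entry->d_name, "..") == 0;
}
```

A readdir() loop can then use `if (sketch_irrelevant_entry(entry)) continue;`, which keeps the control flow visible at the call site instead of hiding it inside a macro.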
| |
The reason for changing the log level is stated in the GitHub issue.
fixes: #1353
Change-Id: I21202075916c5a7525e5f26e7fb595efe7717b66
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
| |
Problem: While a rebalance is in progress, if a node is
rebooted, rebalance v status shows the stats of this node as
0 once the node is back.
Reason: When the node comes back after the reboot,
glusterd_volume_defrag_restart() starts the rebalance and
creates the rpc. But due to a race, the rebalance process
sends a disconnect event, so the rpc object gets destroyed. As
the rpc object is null, the request for fetching the latest stats is
not sent to the rebalance process, and the stats are shown as default values,
which are 0.
Solution: When the rpc object is null, create the rpc if the
rebalance process is up, so that the request can be sent to the rebalance
process using the rpc.
fixes: #1339
Change-Id: I1c7533fedd17dcaffc0f7a5a918c87356133a81c
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
| |
Removed the macro 'GD_MSG_DICT_SERL_LENGTH_GET_FAIL'
from the glusterd-messages file, as
'GD_MSG_DICT_ALLOC_AND_SERL_LENGTH_GET_FAIL' is
used in its place.
Change-Id: I69d7d95b5cb8f1bdd7e616d7a3e9539e891ba378
Fixes: #874
Signed-off-by: nik-redhat <nladha@redhat.com>
| |
Issue:
Some functions did not log sufficient
information in case of failure.
Fix:
Added log messages to a few functions that
indicate the cause of the failure.
Change-Id: I301cf3a1c8d2c94505c6ae0d83072b0241c36d84
fixes: #874
Signed-off-by: nik-redhat <nladha@redhat.com>
| |
Problem: The add-brick operation fails when the replica or disperse
count is not mentioned in the add-brick command.
Reason: Since commit a113d93 we check the brick order while
doing an add-brick operation for replica and disperse volumes. If
the replica count or disperse count is not mentioned in the command,
the dict get fails, which makes the add-brick operation fail.
fixes: #1306
Change-Id: Ie957540e303bfb5f2d69015661a60d7e72557353
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
| |
Also add a message saying this is to be used
for 'debug' purposes only. This is helpful to narrow an
issue down to acl. There were recently many issues reported
related to permissions, and acl access denied bugs.
The bugs were elsewhere, but to validate them and to
get people back to service (in certain cases like oVirt,
where gluster volumes are used mostly by single user),
this option can be used.
Updates: #876
Change-Id: I7be4401153607e11c9efb831ab794df4176604df
Signed-off-by: Amar Tumballi <amar@kadalu.io>
| |
Currently the remove-brick commands follow the sync-op framework. For code
extensibility (e.g. adding more phases to the transaction) we are
migrating the command to the mgmt v3 framework.
fixes: #1164
Change-Id: I5d363223d6f9dc7a70b61adb9d3a5250e84a71b4
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
| |
Logs and other output carrying timestamps
will now have timezone offsets indicated, e.g.:
[2020-03-12 07:01:05.584482 +0000] I [MSGID: 106143] [glusterd-pmap.c:388:pmap_registry_remove] 0-pmap: removing brick (null) on port 49153
To this end,
- gf_time_fmt() now inserts timezone offset via %z strftime(3) template.
- A new utility function has been added, gf_time_fmt_tv(), that
takes a struct timeval pointer (*tv) instead of a time_t value to
specify the time. If tv->tv_usec is negative,
gf_time_fmt_tv(... tv ...)
is equivalent to
gf_time_fmt(... tv->tv_sec ...)
Otherwise it also inserts tv->tv_usec to the formatted string.
- Building timestamps of usec precision has been converted to
gf_time_fmt_tv, which is necessary because the method of appending
a period and the usec value to the end of the timestamp does not work
if the timestamp has zone offset, but it's also beneficial in terms of
eliminating repetition.
- The buffer passed to gf_time_fmt/gf_time_fmt_tv has been unified to
be of GF_TIMESTR_SIZE size (256). We need slightly larger buffer space
to accommodate the zone offset and it's preferable to use a buffer
which is undisputedly large enough.
This change does *not* do the following:
- Retaining a method of timestamp creation without timezone offset.
As to my understanding we don't need such backward compatibility
as the code just emits timestamps to logs and other diagnostic
texts, and doesn't do any later processing on them that would rely
on their format. An exception to this, ie. a case where timestamp
is built for internal use, is graph.c:fill_uuid(). As far as I can
see, what matters in that case is the uniqueness of the produced
string, not the format.
- Implementing a single-token (space free) timestamp format.
While some timestamp formats used to be single-token, now all of
them will include a space preceding the offset indicator. Again,
I did not see a use case where this could be significant in terms
of representation.
- Moving the codebase to a single unified timestamp format and
dropping the fmt argument of gf_time_fmt/gf_time_fmt_tv.
While the gf_timefmt_FT format is almost ubiquitous, there are
a few cases where different formats are used. I'm not convinced
there is any reason to not use gf_timefmt_FT in those cases too,
but I did not want to make a decision in this regard.
Change-Id: I0af73ab5d490cca7ed8d07a2ce7ac22a6df2920a
Updates: #837
Signed-off-by: Csaba Henk <csaba@redhat.com>
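The behaviour described for gf_time_fmt_tv() can be sketched as follows. This is not the function from the commit: the name, the fixed format string and the choice of UTC via gmtime_r() are assumptions made so the example is small and deterministic.

```c
#define _POSIX_C_SOURCE 200809L /* for gmtime_r() under strict ISO C modes */

#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Format *tv as "YYYY-MM-DD HH:MM:SS[.uuuuuu] +zzzz" in UTC.
 * A negative tv_usec omits the fractional part.
 * Returns dst, or NULL if formatting fails or dst is too small. */
static char *
sketch_time_fmt_tv(char *dst, size_t dstsz, const struct timeval *tv)
{
    struct tm tm;
    char date[32];
    char zone[8];
    int n;

    assert(dst != NULL && tv != NULL);

    if (gmtime_r(&tv->tv_sec, &tm) == NULL)
        return NULL;
    if (strftime(date, sizeof(date), "%Y-%m-%d %H:%M:%S", &tm) == 0)
        return NULL;
    /* %z yields the numeric zone offset; it is formatted separately so
     * the microseconds can be placed before it, not after it. */
    if (strftime(zone, sizeof(zone), "%z", &tm) == 0)
        return NULL;

    if (tv->tv_usec < 0)
        n = snprintf(dst, dstsz, "%s %s", date, zone);
    else
        n = snprintf(dst, dstsz, "%s.%06ld %s", date, (long)tv->tv_usec,
                     zone);

    return (n < 0 || (size_t)n >= dstsz) ? NULL : dst;
}
```

Building the string in one place is what makes the old approach of appending a period and the usec value unnecessary, since that approach would put the fraction after the offset.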
| |
Add destroy calls for 'store_volinfo_lock' and 'lock' of volume info.
Move initialization of 'store_volinfo_lock' from glusterd_op_create_volume()
to common place, which is glusterd_volinfo_new() indeed.
Change-Id: I5fae4469f28eab80c4fa6f5947646528e6aedad7
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1291
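The pairing of lock initialization and destruction can be sketched like this. The structure and function names are hypothetical and much smaller than the real volume info; only the two lock names are taken from the message above.

```c
#include <pthread.h>
#include <stdlib.h>

struct sketch_volinfo {
    pthread_mutex_t lock;
    pthread_mutex_t store_volinfo_lock;
};

/* Allocate the object and initialize both locks in one common place. */
static struct sketch_volinfo *
sketch_volinfo_new(void)
{
    struct sketch_volinfo *v = calloc(1, sizeof(*v));

    if (v == NULL)
        return NULL;
    if (pthread_mutex_init(&v->lock, NULL) != 0)
        goto err_free;
    if (pthread_mutex_init(&v->store_volinfo_lock, NULL) != 0)
        goto err_lock;
    return v;

err_lock:
    pthread_mutex_destroy(&v->lock);
err_free:
    free(v);
    return NULL;
}

/* Destroy both locks before the memory that holds them is released. */
static void
sketch_volinfo_delete(struct sketch_volinfo *v)
{
    if (v == NULL)
        return;
    pthread_mutex_destroy(&v->store_volinfo_lock);
    pthread_mutex_destroy(&v->lock);
    free(v);
}
```

Keeping every init in the constructor means the destructor can destroy both locks unconditionally, whichever code path created the object.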
| |
Brick processes are not properly attached on a cluster node when
some volume options are changed on a peer node while glusterd is down on
that specific node.
Solution: When glusterd restarts, it gets a friend update request
from a peer node if the peer node has changes to a volume. If the brick
process is started before the friend update request is received,
brick_mux does not work properly: all bricks are attached to
the same process even though the volume options are not the same. To avoid the
issue, introduce an atomic flag volpeerupdate and update its value when
glusterd receives a friend update request from a peer for a specific
volume. If the volpeerupdate flag is 1, the volume is started by the
glusterd_import_friend_volume synctask.
Change-Id: I4c026f1e7807ded249153670e6967a2be8d22cb7
Credit: Sanju Rakonde <srakonde@redhat.com>
fixes: #1290
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
| |
For example:
We have 3 nodes and create an ec 3*(2+1) volume with
test-disperse-0/test-disperse-1/test-disperse-2. When we run
'gluster v heal test full' on node-1, the glustershd processes on node-1,
node-2 and node-3 each get op=GF_EVENT_TRANSLATOR_OP
and then do a full heal in a different disperse group.
Let us say we have a 2X(2+1) disperse volume with each brick
on a different machine m0, m1, m2, m3, m4, m5, and candidate_max is m5.
On a full heal, '*index' is 3 and !gf_uuid_compare(MY_UUID, brickinfo->uuid)
will be true on m3, so m3's glustershd will be the heal-xlator.
Id: I5c6762e6cfb375aed32d3fc11fe5eae3ee41aab4
Signed-off-by: yinkui <13965432176@163.com>
Change-Id: Ic7ef3ddfd30b5f4714ba99b4e7b708c927d68764
fixes: bz#1724948
| |
The optimal way to configure disperse and replicate volumes
is to have all bricks on different nodes.
The create operation fails with a message that the layout is not optimal,
and the user must use force to override this behavior. Implement the same
check for the add-brick operation to avoid a situation where all the added
bricks end up on the same host. The operation will error out accordingly,
and this can be overridden by using force, the same as for create.
fixes: #1047
Change-Id: I3ee9c97c1a14b73f4532893bc00187ef9355238b
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
| |
Found with GCC's address sanitizer:
==67190==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 24624 byte(s) in 6 object(s) allocated from:
#0 0x7f62535c0837 in __interceptor_calloc (/usr/lib64/libasan.so.6+0xb0837)
#1 0x7f62532a1690 in __gf_default_calloc glusterfs/mem-pool.h:122
#2 0x7f62532a20ca in __gf_calloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:144
#3 0x7f62532c8128 in gf_store_iter_new /path/to/glusterfs/libglusterfs/src/store.c:511
#4 0x7f623e2f9ed7 in glusterd_store_retrieve_bricks /path/to/glusterfs/xlators/mgmt/glusterd/src/glusterd-store.c:2389
Direct leak of 8208 byte(s) in 2 object(s) allocated from:
#0 0x7f62535c0837 in __interceptor_calloc (/usr/lib64/libasan.so.6+0xb0837)
#1 0x7f62532a1690 in __gf_default_calloc glusterfs/mem-pool.h:122
#2 0x7f62532a20ca in __gf_calloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:144
#3 0x7f62532c8128 in gf_store_iter_new /path/to/glusterfs/libglusterfs/src/store.c:511
#4 0x7f623e2f9cf0 in glusterd_store_retrieve_bricks /path/to/glusterfs/xlators/mgmt/glusterd/src/glusterd-store.c:2363
#5 0x7fff5cb70bcf ([stack]+0x15bcf)
#6 0x7f623e309113 in glusterd_store_retrieve_volumes /path/to/glusterfs/xlators/mgmt/glusterd/src/glusterd-store.c:3505
#7 0xfffeb96e61d (<unknown module>)
#8 0x7f623e4586d7 (/usr/lib64/glusterfs/9dev/xlator/mgmt/glusterd.so+0x2f86d7)
Change-Id: I9b2a543dc095f4fa739cd664fd4d608bf8c87d60
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1263
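The traces point at iterators created by gf_store_iter_new() that are never released. A generic sketch of the corresponding pattern, with hypothetical names and a counter standing in for the leak checker, releases the iterator on the success path and on the failure path alike.

```c
#include <stdlib.h>

static int sketch_live_iters; /* iterators created but not yet destroyed */

struct sketch_iter {
    int unused;
};

static struct sketch_iter *
sketch_iter_new(void)
{
    struct sketch_iter *it = calloc(1, sizeof(*it));

    if (it != NULL)
        sketch_live_iters++;
    return it;
}

static void
sketch_iter_destroy(struct sketch_iter *it)
{
    if (it != NULL) {
        free(it);
        sketch_live_iters--;
    }
}

/* Walk brick_count bricks; fail_at is the index of the brick whose
 * parsing fails, or -1 if none fails. Returns 0 on success, -1 on error. */
static int
sketch_retrieve_bricks(int brick_count, int fail_at)
{
    int ret = 0;

    for (int i = 0; i < brick_count; i++) {
        struct sketch_iter *it = sketch_iter_new();

        if (it == NULL)
            return -1;
        if (i == fail_at)
            ret = -1;            /* simulated parse failure */
        sketch_iter_destroy(it); /* runs whether or not parsing failed */
        if (ret != 0)
            break;
    }
    return ret;
}
```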
| |
- performance.cache-size has flawed semantics, as it's
dispatched on two independent translators, io-cache
and quick-read.
- performance.qr-cache-timeout has a confusing name, as
other options affecting quick-read have an unabbreviated
"quick-read-..." prefix in their names.
We keep these options with unchanged operation, but in the
help output we indicate their deprecation.
The following better alternatives are introduced:
- performance.io-cache-size to tune cache-size option of io-cache
- performance.quick-read-cache-size to tune cache-size option of
quick-read
- performance.quick-read-cache-timeout as a preferred synonym for
performance.qr-cache-timeout
Fixes: #952
Change-Id: Ibd04fb638de8cac450ba992ad8a415154f9f4281
Signed-off-by: Csaba Henk <csaba@redhat.com>
| |
While taking a snapshot clone, if the snapshot is not activated,
the cli was reporting that the bricks are down.
This patch clearly prints that the error is due to the snapshot
state.
Change-Id: Ia840e6e071342e061ad38bf15e2e2ff2b0dacdfa
Fixes: #1255
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
| |
After the changes in commit 3da22f8cb08b05562a4c6bd2694f2f19199cff7f,
there was a place where synccond_broadcast() was missing. It could
cause a hang if another synctask was waiting on the condition variable.
Change-Id: I92bfe4e15c5c3591e4854a64aa9e1566d50dd204
Fixes: #1116
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
| |
Problem:
In an add-brick that increases the replica count,
SHD was restarted after the pending xattrs were set on the new bricks and
the bricks were added. But before SHD is restarted, there is a possibility that
the old SHD does a scan on the root directory, sees that no heal is needed, and
deletes the index for root-dir, leading to no heals until a lookup is executed
on the mount.
Fix:
Stop shd, perform pending-xattr setting/adding new bricks, and
then restart shd.
Fixes: #1240
Change-Id: I94fd7c6c909211b597185dfe097a559db6c0d00f
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| |
The current scaling of the syncop thread pool is not working properly
and can leave some tasks in the run queue longer than necessary
when the maximum number of threads has not been reached.
This patch provides a better scaling condition to react faster to
pending work.
Condition variables and sleep in the context of a synctask have also
been implemented. Their purpose is to replace regular condition
variables and sleeps that block synctask threads and prevent other
tasks from being executed.
The new features have been applied to several places in glusterd.
Change-Id: Ic50b7c73c104f9e41f08101a357d30b95efccfbf
Fixes: #1116
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
| |
Found with 0symbol-check.t:
./tests/basic/0symbol-check.t ..
1..2
./xlators/mgmt/glusterd/src/.libs/glusterd_la-glusterd-volume-set.o should call sys_stat, not stat
ok 1 [ 40/ 41011] < 40> 'find . -name *.o -exec ./tests/basic/symbol-check.sh {} \;'
not ok 2 [ 11/ 1] < 42> '[ ! -e ./.symbol-check-errors ]' -> ''
Failed 1/2 subtests
Change-Id: I8962f487cd88738a1f7a962049d513712687088c
Fixes: #1160
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
| |
Noticed that the following message repeats quite a bit in log files when
an external monitoring tool queries gluster for list of volumes
periodically:
"Received get vol req"
As there's not much value in having this log message at log level INFO,
changing the log level to DEBUG to make glusterd.log a bit quieter.
Change-Id: I4e791fc65b9a4f813d295e7b2b6a05f3c0782e69
Updates: #1000
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
| |
While changing a volume configuration, there is a chance that the
brick layout on the disk might be changed. If a snapshot is present
on such a volume, that will affect its working. So this patch adds
a warning message if a snapshot is present when a volume config change
happens.
Change-Id: I7256863fef734841fce0bc9ad94d5d201b1813d5
Fixes: bz#1812144
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
| |
Description of problem:
server.statedump-path is the path where statedumps are stored;
by default it is /var/run/gluster, and it can be set to any valid
directory path. It was observed that server.statedump-path was
also accepting a file, a non-existent file and non-existent paths.
The statedump command also reported success even
with these invalid paths:
a. A file
b. A non-existent path
Solution:
Added a validation function in glusterd-volume-set.c which
allows volume set to succeed only if the value is a valid directory;
in all other cases, volume set fails.
Fixes: bz#1787122
Change-Id: Ia66e2b3d35f23efc5444c829928779a79d827b42
Signed-off-by: yatipadia <ypadia@redhat.com>
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
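The check described in the solution can be sketched as below. The function name is hypothetical, and the sketch calls plain stat(2) so that it stands alone; code inside glusterd would be expected to go through the sys_stat wrapper instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Returns true only if `path` exists and is a directory. Regular files
 * and non-existent paths are both rejected. */
static bool
sketch_is_valid_dir(const char *path)
{
    struct stat st;

    if (path == NULL || stat(path, &st) != 0)
        return false; /* missing path, or not accessible */

    return S_ISDIR(st.st_mode);
}
```

A validation hook would call this on the proposed option value and fail the volume set when it returns false.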
| |
Issue:
1- In a cluster of 3 Nodes N1, N2, N3. Create 3 volumes vol1,
vol2, vol3 with 3 bricks (one from each node)
2- Set cluster.brick-multiplex on
3- Start all 3 volumes
4- Check if all bricks on a node are running on same port
5- Kill N1
6- Set performance.readdir-ahead for volumes vol1, vol2, vol3
7- Bring N1 up and check volume status
8- All bricks processes not running on N1.
Root Cause -
Since there is a diff in volfile versions on N1 as compared
to N2 and N3, glusterd_import_friend_volume() is called.
glusterd_import_friend_volume() copies the new_volinfo, deletes
the old_volinfo and then calls glusterd_start_bricks().
glusterd_start_bricks() looks for the volfiles and sends an rpc
request to glusterfs_handle_attach(). Now, since the volinfo
has been deleted by glusterd_delete_stale_volume()
from the priv->volumes list before glusterd_start_bricks(), and
glusterd_create_volfiles_and_notify_services() and
glusterd_list_add_order() are called after glusterd_start_bricks(),
the attach RPC req gets an empty volfile path
and that causes the brick to crash.
Fix- Call glusterd_list_add_order() and
glusterd_create_volfiles_and_notify_services() before the
glusterd_start_bricks() call is made in glusterd_import_friend_volume().
Change-Id: Idfe0e8710f7eb77ca3ddfa1cabeb45b2987f41aa
Fixes: bz#1773856
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
| |
Thin-arbiter module makes use of 'pending-xattr' name for the translator
as the filename which gets created in thin-arbiter node. By making this
unique, we can host single thin-arbiter node for multiple clusters.
Updates: #763
Change-Id: Ib3c732e7e04e6dba229e71ae3e64f1f3cb6d794d
Signed-off-by: Amar Tumballi <amar@kadalu.io>
| |
Problem:
Changelog creates threads even if changelog is not enabled.
Background:
The changelog xlator broadly does two things:
1. Journalling - Consumers are geo-rep and glusterfind
2. Event notification for registered events (open, release etc.) -
Consumers are bitrot, geo-rep
The existing option "changelog.changelog" controls journalling, but
there is no option to control event notification, which is enabled by
default. So when bitrot/geo-rep is not enabled on the volume, the threads
and resources (rpc and rbuf) related to event notification consume
resources and cpu cycles unnecessarily.
Solution:
The solution is to have two different options as below.
1. changelog-notification : Event notifications
2. changelog : Journalling
This patch introduces the option "changelog-notification", which is
not exposed to the user. When either bitrot or changelog (journalling)
is enabled, it internally enables 'changelog-notification'. But
once 'changelog-notification' is enabled, it will not be disabled
for the lifetime of the brick process, even after bitrot and changelog
are disabled. As of now, rpc resource cleanup has a lot of races and is
difficult to do cleanly. If disabling were allowed, it would lead to memory
leaks and crashes on enable/disable of bitrot or changelog (journal) in a
loop. Hence, to be safe, event notification is not disabled within
the lifetime of the process once enabled.
Change-Id: Ifd00286e0966049e8eb9f21567fe407cf11bb02a
Updates: #475
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
| |
The number of signing process threads (glfs_brpobj)
is set to 4 by default. The recommendation is to set
it to number of cores available. This patch makes it
configurable as follows
gluster vol bitrot <volname> signer-threads <count>
fixes: bz#1797869
Change-Id: Ia883b3e5e34e0bc8d095243508d320c9c9c58adc
Signed-off-by: Kotresh HR <khiremat@redhat.com>
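To pick a value for the count, the number of online cores can be queried at run time. The sketch below is only a suggestion helper with hypothetical names; it is not part of the patch, and it relies on sysconf(_SC_NPROCESSORS_ONLN), which is widely available but not required by POSIX.

```c
#include <unistd.h>

#define SKETCH_DEFAULT_SIGNER_THREADS 4 /* the default named above */

/* Suggest a signer thread count: the number of online cores, or the
 * default of 4 when the system cannot report a core count. */
static long
sketch_suggested_signer_threads(void)
{
    long cores = sysconf(_SC_NPROCESSORS_ONLN);

    return cores > 0 ? cores : SKETCH_DEFAULT_SIGNER_THREADS;
}
```

The result would be passed as the count in the bitrot signer-threads command shown above.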
| |
Currently, when we issue heal statistics or similar commands
for a disperse volume, they fail with the message "Volume is not of
type replicate." Add the message "this command is supported for
volumes of type replicate" to reflect supportability and give a better
understanding of heal functionality for disperse volumes.
fixes: bz#1785998
Change-Id: I9688a9fdf427cb6f657cfd5b8db2f76a6c56f6e2
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
| |
When gNFS is disabled at build time, we have to ensure
that the .stop(), .start() and other functions of the nfs_svc
are not called, otherwise we'd crash.
In addition, #ifdef more code that is gNFS related.
updates: bz#1793995
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I5081f1670c04ca306aeaab7208829b0f2f149a42
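The guarding can be sketched like this. The macro SKETCH_BUILD_GNFS and all other names are hypothetical stand-ins; the point is that both the service definition and every call into it sit behind the same build-time switch, so a build without gNFS never dereferences an unset function pointer.

```c
#include <stddef.h>

struct sketch_svc {
    int (*start)(void);
    int (*stop)(void);
};

#ifdef SKETCH_BUILD_GNFS
static int
sketch_nfs_start(void)
{
    return 0;
}

static int
sketch_nfs_stop(void)
{
    return 0;
}

static struct sketch_svc sketch_nfs_svc = {sketch_nfs_start, sketch_nfs_stop};
#endif

/* Returns 1 if the service was stopped, 0 if gNFS is not built in. */
static int
sketch_stop_nfs_if_built(void)
{
#ifdef SKETCH_BUILD_GNFS
    if (sketch_nfs_svc.stop != NULL) {
        sketch_nfs_svc.stop();
        return 1;
    }
#endif
    return 0;
}
```

Compiled without the macro, the function reduces to `return 0;` and no gNFS symbol is referenced at all.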
| |
This parameter may have been used in the past, but is no longer
needed. Removing it and the few locations where it was actually referenced.
This also allows removing an extra memdup that was not needed
in the first place in the server_setvolume() and unserialize_rsp_direntp()
functions.
A separate follow-up patch will remove the extra_stdfree parameter
from the dictionary structure.
Change-Id: Ica0ff0a330672373aaa60e808b7e76ec489a0fe3
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
| |
updates: bz#789278
Change-Id: I652d8d4428cf6ce61b712a66d309e78030a5f911
Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
| |
In many cases, we were freely allocating long keys with no need.
Smaller char arrays are just fine almost anywhere, so I went ahead
and looked for places where we can use smaller ones.
In some cases, annotated the functions as static and the prefixes
passed as const as it was easier to read and understand.
Where relevant, converted the dict functions to use known key length.
Change-Id: I882ab33ea20d90b63278336cd1370c09ffdab7f2
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
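The two ideas above, a small key buffer and a known key length, fit together because snprintf() already returns the length of the key it built. The sketch below uses a hypothetical key layout and function name; it does not reproduce any actual glusterd key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build "<prefix>.brick<index>.hostname" into `key`.
 * Returns the key length, or -1 if the key does not fit. */
static int
sketch_build_key(char *key, size_t keysz, const char *prefix, int index)
{
    int keylen;

    assert(key != NULL && prefix != NULL);

    keylen = snprintf(key, keysz, "%s.brick%d.hostname", prefix, index);
    if (keylen < 0 || (size_t)keylen >= keysz)
        return -1; /* truncated: the buffer is too small for this key */

    return keylen;
}
```

The returned length can be handed straight to a lookup function that accepts a key length, which saves a separate strlen() pass over the key.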
| |
The code was not ifdef'ed properly when gNFS was not enabled.
Strangely, I could not reproduce the failure on my system (Fedora 31),
but it was reproduced elsewhere and the fix was verified to correct it.
The failure:
gluster volume create testvol replica 3 127.0.0.2:/tests/brick{1..3} force
gluster v set testvol write-behind off
grep -rne write-behind /var/lib/glusterd/vols/testvol/trusted-testvol.tcp-fuse.vol
The last grep was supposed to come out empty.
The issue was that perfxl_option_handler may not have been called when it should
have been.
Change-Id: Ie9f8ec87dabeef6624527c2266ddf9af01ca7373
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
| |
This volume option was not made available to the `gluster volume set` CLI.
Reported-by: epolakis(https://github.com/kinsu) in
https://github.com/gluster/glusterfs/issues/781
fixes: bz#1787554
Change-Id: I7141bdd4e53ee99e22b354edde8d023bfc0b2cd7
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
| |
Problem: Currently enabling run-with-valgrind will cause the gnfs and quota to
fail to start. The phenomenon is as follows.
------------------------------------------------------------------------------
NFS Server on localhost 2049 0 N 48406
Quota Daemon on localhost N/A N/A N 48428
------------------------------------------------------------------------------
Solution: The cause of the above phenomenon is that the log path of valgrind is
set incorrectly. Gnfs and quota can start with valgrind normally after correcting
the log path.
Updates: #788
Change-Id: Ib91408c08522ff66afff908fbff3fce4b93ea770
Signed-off-by: He Min <hemin@cmss.chinamobile.com>
| |
As a follow up to https://review.gluster.org/#/c/glusterfs/+/23799/
When compiling without gNFS, there were some 'unused' warnings by
the compiler. This patch fixes them.
Change-Id: I621562261f53950e821a450e0e7da304d00ae557
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
| |
Problem: The default option transport.address-family disappears
from the volume info output after a volume reset.
Cause: From 3.8.0 onwards the volume option transport.address-family
has a default value, so any volume that is created will have this
option set, and volume info will show it in its output. But
on volume reset, this option is not handled.
Solution: In glusterd_enable_default_options(), add this
option along with the other default options. This function is called
by glusterd_options_reset() on the volume reset command.
fixes: bz#1786478
Change-Id: I58f7aa24cf01f308c4efe6cae748cc3bc8b99b1d
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
| |
If we are not compiling gNFS (--enable-gnfs is not given in the
./configure script params), there is little point in compiling code
that is related to it.
This patch tries to eliminate it.
My hope (and it's not clear from the code) is that I did not break
the NFS Ganesha support as well.
Other than that, I tried to compile with and without it and it looks sane.
Change-Id: I8d6c98066b9fceab4ec10fc6f5e81ab069e853bd
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
| |
In glusterd_proc_stop(), after killing the pid
we should remove the pidfile.
fixes: bz#1784375
Change-Id: Ib6367aed590932c884b0f6f892fc40542aa19686
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|