Commit log

Rebalance commands currently use the op-sm (operation state machine)
framework. This patch ports them to the mgmt_v3 framework.
Change-Id: I6faf4a6335c2e2f3d54bbde79908a7749e4613e7
fixes: bz#1655827
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
get_mux_limit_per_process () reads the global option dictionary and,
when it doesn't find the key, assumes that the
cluster.max-bricks-per-process option isn't configured. However, the
default value should be picked up in that case.
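A minimal C sketch of the intended behaviour, assuming the libglusterfs
dict API (dict_get_str) and a hypothetical DEFAULT_MUX_LIMIT constant
standing in for the real default:

  #include <stdlib.h>
  #include <glusterfs/dict.h>

  #define DEFAULT_MUX_LIMIT 0   /* hypothetical; stands in for the real default */

  static int
  get_mux_limit_per_process (dict_t *opts)
  {
          char *val = NULL;
          int   limit = DEFAULT_MUX_LIMIT;

          /* A missing key only means the option was never configured;
           * that is not an error and must not bypass the default. */
          if (dict_get_str (opts, "cluster.max-bricks-per-process", &val) == 0)
                  limit = atoi (val);

          return limit;
  }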
Change-Id: I35dd8da084adbf59793d58557e818d8e6c17f9f3
Fixes: bz#1656951
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
libglusterfs devel package headers are referenced in code using
program include semantics (""). While this works, it can be done
better, especially when dealing with out-of-tree xlator builds or,
in general, out-of-tree devel package usage.
Towards this, the following changes are done:
- moved all devel headers under a glusterfs directory
- included these headers using system header notation <> in all
code outside of libglusterfs
- included these headers using program notation "" within
libglusterfs
Although this change is big, it only moves the headers around and
corrects how they are included from other sources. This helps us
include libglusterfs headers correctly, without namespace conflicts.
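An illustrative fragment of the include change, using xlator.h as an
example header (the exact set of moved headers is as in the patch):

  /* before: program include semantics everywhere */
  #include "xlator.h"

  /* after: code outside libglusterfs, including out-of-tree xlators */
  #include <glusterfs/xlator.h>

  /* after: within libglusterfs itself */
  #include "glusterfs/xlator.h"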
Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR <srangana@redhat.com>
glusterd has infrastructure to come up in an interim upgrade mode from
the post-install section of the rpm spec, so that all volfiles get
regenerated with the latest executable. In the container world the same
methodology is not followed, as a container image always points to a
specific gluster rpm, and the rpm doesn't go through an upgrade
process.
This fix does the following:
1. If the glusterd.upgrade file doesn't exist, regenerate the volfiles.
2. If the maximum-operating-version read from glusterd.upgrade doesn't
match GD_OP_VERSION_MAX, glusterd treats it as a version where new
options were introduced and regenerates the volfiles.
Tests done:
1. Bring up glusterd; check that the glusterd.upgrade file has been
created with the GD_OP_VERSION_MAX value.
2. After 1, restart glusterd and check that it hasn't regenerated the
volfiles, as there is no difference between GD_OP_VERSION_MAX and the
op-version read from the file.
3. Bump up GD_OP_VERSION_MAX in the code by 1 and, post compilation,
restart glusterd; the volfiles should be regenerated again.
Note: The old way of regenerating volfiles during an rpm upgrade is
kept as is for now, but it can eventually be sunset.
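A sketch of the decision logic, under a hypothetical helper name
(gf_boolean_t and GD_OP_VERSION_MAX are the usual glusterd/libglusterfs
types and macros; the real implementation lives in glusterd's store
code):

  #include <stdio.h>

  static gf_boolean_t
  glusterd_need_volfile_regen (const char *upgrade_file)
  {
          FILE *fp = fopen (upgrade_file, "r");
          int   stored_max = 0;

          /* rule 1: no glusterd.upgrade file => regenerate */
          if (!fp)
                  return _gf_true;

          if (fscanf (fp, "%d", &stored_max) != 1)
                  stored_max = 0;
          fclose (fp);

          /* rule 2: stored max op-version differs from the binary's
           * GD_OP_VERSION_MAX => new options exist, regenerate */
          return (stored_max != GD_OP_VERSION_MAX) ? _gf_true : _gf_false;
  }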
Change-Id: I75b49a1601c71e99f6a6bc360dd12dd03a96414b
Fixes: bz#1651463
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Problem: glusterd should not try to acquire locks on any resources
once it has received a SIGTERM and cleanup has started. Otherwise we
might hit a segfault, since the thread going through the cleanup path
will be freeing up the resources while some other thread might be
trying to acquire locks on the freed resources.
Solution: perform rcu_read_lock/unlock() under the cleanup_lock mutex.
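The shape of the fix, as a hedged sketch (cleanup_lock is assumed to be
glusterd's global cleanup mutex; glusterd links against userspace-rcu):

  #include <pthread.h>
  #include <urcu.h>

  extern pthread_mutex_t cleanup_lock;

  static void
  read_peers_safely (void)
  {
          /* Holding cleanup_lock means the SIGTERM cleanup path cannot
           * free resources while we are inside the RCU read-side
           * critical section. */
          pthread_mutex_lock (&cleanup_lock);
          rcu_read_lock ();
          /* ... walk peer/volume lists ... */
          rcu_read_unlock ();
          pthread_mutex_unlock (&cleanup_lock);
  }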
fixes: bz#1654270
Change-Id: I87a97cfe4f272f74f246d688660934638911ce54
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
Removed an unnecessary iteration in the brick-disconnect handler when
multiplexing is enabled.
Change-Id: I62dd3337b7e7da085da5d76aaae206e0b0edff9f
fixes: bz#1650115
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Problem:
Currently glusterd uses sync_lock/sync_unlock to update the blockers
counter, which can add delays to the overall transaction phase,
especially when a batch of volume stop operations is processed by
glusterd in brick multiplexing mode.
Solution: Use GF_ATOMIC to update the blockers counter so that
unnecessary context switching is avoided.
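The shape of the change, sketched with libglusterfs's atomic helpers
from glusterfs/atomic.h (the counter is assumed to sit in glusterd's
conf structure):

  gf_atomic_t blockers;

  GF_ATOMIC_INIT (blockers, 0);   /* once, at init time */

  GF_ATOMIC_INC (blockers);       /* a brick op starts; no sync_lock */
  /* ... */
  GF_ATOMIC_DEC (blockers);       /* the brick op finishes */

  if (GF_ATOMIC_GET (blockers) == 0) {
          /* safe to proceed with the blocked work */
  }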
Change-Id: Ie13177dfee2af66687ae7cf5c67405c152853990
Fixes: bz#1631128
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
Problem: With commit cb0339f92, we are using a separate synctask
for restart_bricks. There can be a situation where two threads
access and update the same volinfo structure at the same time.
This can leave volinfo with inconsistent values and cause
assertion failures because of unexpected values.
Solution: While updating the volinfo structure, acquire a
store_volinfo_lock, and release the lock only when the thread
completes its critical section.
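A minimal sketch of the locking pattern, assuming store_volinfo_lock is
a per-volume pthread mutex added to glusterd_volinfo_t:

  pthread_mutex_lock (&volinfo->store_volinfo_lock);
  {
          /* update and persist volinfo atomically with respect to the
           * other synctask touching the same volume */
          ret = glusterd_store_volinfo (volinfo,
                                        GLUSTERD_VOLINFO_VER_AC_INCREMENT);
  }
  pthread_mutex_unlock (&volinfo->store_volinfo_lock);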
Fixes: bz#1627610
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
Change-Id: I545e4e2368e3285d8f7aa28081ff4448abb72f5d
Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
This patch fixes buffer size issue 1138522.
Change-Id: Ia12fc8f34f75704f8ed3efae2022c4fd67a8c76c
updates: bz#789278
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
Problem explanation:
In a 3-node cluster, suppose N1 originates a volume delete operation.
If, before N1's commit phase completes, the glusterd service of N2 or
N3 gets disconnected from N1, N1 ends up importing the volume (which
is in flight for a delete on the other nodes) as a fresh volume,
resulting in an incorrect configuration state.
Fix:
Mark a volume as stage-deleted once a volume delete operation passes
its staging phase, and reset this flag during the unlock phase. If the
same volume gets imported to other peers during this intermediate
phase, it shouldn't be considered to be recreated.
An automated .t is quite tough to implement with the current infra.
Test Case:
1. Keep creating and deleting volumes in a loop on a 3-node cluster.
2. Simulate n/w failure between the peers (ifdown followed by ifup).
3. Check that the output of 'gluster v list | wc -l' is the same
across all 3 nodes during 1 & 2.
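A sketch of where the flag is toggled and consulted (the field is
assumed to be a boolean stage_deleted on glusterd_volinfo_t):

  /* staging of a volume delete succeeded on this node */
  volinfo->stage_deleted = _gf_true;

  /* import path: a volume that is mid-delete elsewhere must not be
   * treated as freshly created */
  if (volinfo->stage_deleted)
          goto skip_import;

  /* unlock phase: delete either committed or aborted; reset the flag */
  volinfo->stage_deleted = _gf_false;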
Change-Id: Ifdd5dc39699120258d7fdd42fe2deb9de25c6246
Fixes: bz#1605077
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
[sh]$ gcc --version
gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
Warnings were of the type below:
xlators/mgmt/glusterd/src/glusterd-store.c:3285:33:
warning: ‘/options’ directive output may be truncated writing 8 bytes into a region of size between 1 and 4096 [-Wformat-truncation=]
snprintf (path, len, "%s/options", conf->workdir);
^~~~~~~~
xlators/mgmt/glusterd/src/glusterd-store.c:1280:39:
warning: ‘/snaps/’ directive output may be truncated writing 7 bytes into a region of size between 1 and 4096 [-Wformat-truncation=]
snprintf (snap_fpath, len, "%s/snaps/%s/%s", priv->workdir,
^~~~~~~
* Also changed some places where there were issues with key size.
* Made sure all the 'char buf[SOMESIZE] = {0,};' initializations are
changed to 'char buf[SOMESIZE] = "";' (in the files I changed).
* Also edited the coding standard to reflect that.
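A hedged sketch of the class of fix involved (names illustrative):
either size the destination so the format provably fits, or check
snprintf's return value for truncation:

  #include <stdio.h>

  static int
  build_options_path (char *path, size_t size, const char *workdir)
  {
          /* "" instead of {0,}, per the updated coding standard */
          int len = snprintf (path, size, "%s/options", workdir);

          /* -Wformat-truncation: treat a truncated path as an error
           * instead of silently using a short path */
          if (len < 0 || (size_t) len >= size)
                  return -1;
          return 0;
  }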
updates: bz#1193929
Change-Id: I04c652624ac63199cea2077e46b3a5def37c3689
Signed-off-by: Amar Tumballi <amarts@redhat.com>
While it is a one-line fix, it prevents a significant amount of
unwanted memory from being allocated for the defrag structure.
Updates: bz#1193929
Change-Id: Idda70d1d3dc0e7be56c35e872aa6edfaf752290d
Signed-off-by: Amar Tumballi <amarts@redhat.com>
During friend handshake, if glusterd receives more than one friend
update, it might very well become possible that two threads end up
working on two different volinfo references, and glusterd might end up
updating the store with an old volinfo reference. While debugging a
glusterd crash from the validating-server-quorum.t test file in the
line-coverage regression, the same was observed.
The solution is to run glusterd_compare_friend_data under a mutex.
Test:
As the crash was more visible in the line-coverage run (given lcov does
some instrumentation and exposes the races), 6 manual lcov runs were
triggered starting from https://build.gluster.org/job/line-coverage/443
to https://build.gluster.org/job/line-coverage/449/ and no crash was
observed from validating-server-quorum.t
Change-Id: I86fce473a76fd24742d51bf17a685d28b90a8941
Fixes: bz#1603063
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
The same variable 'len' was used both in the macros and the functions.
(Introduced as part of commit 6dc5dfef819cad69d6d4b4c1c305efa74236ad84 ?)
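A minimal illustration of the hazard (all names hypothetical): a macro
that declares its own 'len' shadows the function's 'len', so the two
are silently different variables:

  #include <stdio.h>
  #include <string.h>

  /* the macro's 'len' shadows any 'len' at the call site */
  #define STORE_KEY(buf, size, key)                       \
          do {                                            \
                  int len = (int) strlen (key);           \
                  snprintf (buf, size, "%.*s", len, key); \
          } while (0)

  void
  save_key (char *buf, size_t size, const char *key)
  {
          int len = 0;                 /* never touched by the macro */
          STORE_KEY (buf, size, key);  /* uses its own 'len' */
          /* fix: give macro-local variables unique names */
          (void) len;
  }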
Change-Id: If434999d6470067f8a1e501c8e132561e8cd81ef
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
This patch fixes compile warnings that appear with newer compilers. The
solution applied is only to remove the warnings, but it doesn't always
solve the problem in the best way. It assumes that the problem will never
happen, as the previous code assumed.
Change-Id: I6e8470d6c2e2dbd3bd7d324b5fd2f92ffdc3d6ec
updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
This option, applicable to the node-level daemons, can be very helpful
in controlling the log level of these services. Please note that any
daemon which is started prior to setting this option to a specific
value (other than INFO) will need to go through a restart for the
change to take effect.
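If the option introduced here is the cluster-wide
cluster.daemon-log-level (the name is an assumption based on the
description), usage would look like:

  # gluster volume set all cluster.daemon-log-level DEBUG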
Change-Id: I7f6d2620bab2b094c737f5cc816bc093e9c9c4c9
fixes: bz#1597473
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
With brick multiplexing, to attach a brick to an existing brick
process, the prerequisite is for the compatible brick to finish its
initialization and portmap sign-in, so the thread might have to go to
sleep and context-switch the synctask to allow the brick process to
communicate with glusterd. In the normal code path this works fine, as
glusterd_restart_bricks () is launched through a separate synctask.
If there's a mismatch in the volume when glusterd restarts,
glusterd_import_friend_volume is invoked, which then calls
glusterd_start_bricks () from the main thread, and that may land in a
similar situation. Since this is not done through a separate synctask,
the 1st brick will never get its turn to finish all of its
handshaking, and as a consequence all the bricks will fail to get
attached to it.
Solution: Execute import volume and glusterd restart bricks in a
separate synctask. Importing snaps also had to be done through a
synctask, since the parent volume must be available for the snap
import functionality to work.
Change-Id: I290b244d456afcc9b913ab30be4af040d340428c
BUG: 1540607
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Coverity ID: 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417,
418, 419, 423, 424, 425, 426, 427, 428, 429, 436, 437, 438, 439,
440, 441, 442, 443
Issue: Event include_recursion
Removed redundant, recursive includes from the files.
Change-Id: I920776b1fa089a2d4917ca722d0075a9239911a7
BUG: 789278
Signed-off-by: Girjesh Rajoria <grajoria@redhat.com>
glusterd's brick restart logic is not always sequential, as there are
at least three different ways bricks are restarted:
1. through friend-sm and glusterd_spawn_daemons ()
2. through friend-sm and handling volume quorum action
3. through friend handshaking when there is a mismatch on quorum on
friend import.
In a brick multiplexing setup, glusterd ended up trying to spawn the
same brick process a couple of times, as two threads hit
glusterd_brick_start () within a fraction of a millisecond; glusterd
had no way to reject either of them, since the brick start criteria
were met in both cases.
As a solution, it'd be better to control this with two different
flags: one is a boolean called start_triggered, which indicates that a
brick start has been triggered and continues to be true until the
brick dies or is killed; the second is a mutex lock to ensure that for
a particular brick we don't end up entering glusterd_brick_start ()
more than once at the same point in time.
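A sketch of how the two guards compose (both fields assumed to live on
glusterd_brickinfo_t; the mutex name and start helper are illustrative):

  pthread_mutex_lock (&brickinfo->restart_mutex);
  {
          if (!brickinfo->start_triggered) {
                  brickinfo->start_triggered = _gf_true;
                  ret = start_brick_process (brickinfo); /* hypothetical */
                  /* start_triggered is reset only when the brick dies
                   * or is killed */
          }
  }
  pthread_mutex_unlock (&brickinfo->restart_mutex);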
Change-Id: I292f1e58d6971e111725e1baea1fe98b890b43e2
BUG: 1506513
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Updates: #242
BUG: 1428063
Change-Id: Iaaf2edf99b2ecc75f6d30762c752a6d445c1c826
Signed-off-by: Poornima G <pgurusid@redhat.com>
Problem:
In a multi-node environment, if two op-sm transactions
are initiated on one of the receiver nodes at the same time,
there is a possibility that glusterd may end up with a
stale lock.
Solution:
During mgmt_v3_lock, a timer is registered via gf_timer_call_after,
which releases the lock after a certain period of time.
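A hedged sketch of the registration; the timer API is libglusterfs's
gf_timer_call_after with its signature approximated here, and the
callback, timeout constant, and owner argument are illustrative:

  static void
  mgmt_v3_lock_timeout_cbk (void *data)
  {
          /* forcibly release the stale mgmt_v3 lock held by `data` */
  }

  struct timespec delta = { .tv_sec = LOCK_TIMEOUT_SECS, .tv_nsec = 0 };

  /* arm the timer while granting the lock */
  gf_timer_call_after (this->ctx, delta, mgmt_v3_lock_timeout_cbk, owner);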
Change-Id: I16cc2e5186a2e8a5e35eca2468b031811e093843
BUG: 1499004
Signed-off-by: Gaurav Yadav <gyadav@redhat.com>
Summary:
Adds a new server-side daemon called gfproxyd and a new FUSE client
called gfproxy-client.
Updates: #242
BUG: 1428063
Change-Id: I83210098d3a381922bc64fed1110ae1b76e6519f
Tested-by: Shreyas Siravara <sshreyas@fb.com>
Reviewed-by: Kevin Vigor <kvigor@fb.com>
Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
Signed-off-by: Poornima G <pgurusid@redhat.com>
The glusterd.vol file always had a (commented out) option to indicate
the base-port from which to start portmapper allocation. This patch
brings in the max-port configuration, with which one can limit the
range of ports to which gluster is allowed to bind.
Fixes: #305
Change-Id: Id7a864f818227b9530a07e13d605138edacd9aa9
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: https://review.gluster.org/18016
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Gaurav Yadav <gyadav@redhat.com>
Currently Gluster keeps the process pid information of all the daemons
and brick processes in the Gluster configuration file directory
(i.e., /var/lib/glusterd/*).
These pid files should be separate from the configuration files, since
deletion of the configuration file directory might result in serious
problems. Also, /var/run/gluster is the default placeholder directory
for pid files. So, with this fix, Gluster will keep the pid information
of all processes in the /var/run/gluster/* directory.
Change-Id: Idb09e3fccb6a7355fbac1df31082637c8d7ab5b4
BUG: 1258561
Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: https://review.gluster.org/13580
Tested-by: MOHIT AGRAWAL <moagrawa@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Problem: In a multiplexing setup in a container environment, we hit a
race where, before the first brick finishes its handshake with
glusterd, the subsequent attach requests go through; they actually
fail, and glusterd has no mechanism to realize it. This resulted in
all such bricks staying inactive, leaving clients unable to connect.
Solution: Introduce a new flag port_registered in glusterd_brickinfo
to make sure the pmap_signin has finished before the subsequent
attach-brick requests are processed.
Test: To reproduce the issue, follow the steps below:
1) Create 100 volumes on 3 nodes (1x3) in a CNS environment
2) Enable brick multiplexing
3) Reboot one container
4) Run the command below:
for v in `gluster v list`
do
glfsheal $v | grep -i "transport"
done
After applying the patch, the command should not fail.
Note: A big thanks to Atin for suggesting the fix.
BUG: 1478710
Change-Id: I8e1bd6132122b3a5b0dd49606cea564122f2609b
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Reviewed-on: https://review.gluster.org/17984
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Despite the fact that appliances generally use UTC, some
users really want log entries in localtime.
fixes gluster/glusterfs#272
feature page: https://review.gluster.org/17807
Change-Id: I5fbf2c3eedd9eb128fb3f851dd67b2f4081c8bba
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: https://review.gluster.org/16911
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
PROBLEM: Both attach tier and add brick have the same RPC
and set of code. This becomes a hurdle while trying to implement
add brick on a tiered volume.
FIX: This patch separates add brick and attach tier,
giving them separate RPCs.
Change-Id: Iec57e972be968a9ff00b15b507e56a4f6dc398a2
BUG: 1376326
Signed-off-by: hari gowtham <hgowtham@redhat.com>
Reviewed-on: https://review.gluster.org/15503
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: hari gowtham <hari.gowtham005@gmail.com>
Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Currently the 'storage/posix' xlator has an option called
`export-statfs-size no`, which exports zero as the value for a few
fields in `struct statvfs`. In the case of a backend shared between
multiple brick processes on a node, the values of these fields should
be `field_value / number-of-bricks-at-node`. This way, even the issue
of 'min-free-disk' etc. at different layers is handled properly when
the statfs() sys call is made.
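A sketch of the division described above (bricks_on_node would come
from configuration; the helper name is illustrative):

  #include <sys/statvfs.h>

  static int
  shared_brick_statvfs (const char *export_path, int bricks_on_node,
                        struct statvfs *buf)
  {
          if (statvfs (export_path, buf) != 0)
                  return -1;

          if (bricks_on_node > 1) {
                  /* report 1/N of the shared filesystem to each brick so
                   * min-free-disk style checks stay meaningful */
                  buf->f_blocks /= bricks_on_node;
                  buf->f_bfree  /= bricks_on_node;
                  buf->f_bavail /= bricks_on_node;
          }
          return 0;
  }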
Fixes #241
Change-Id: I2e320e1fdcc819ab9173277ef3498201432c275f
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: https://review.gluster.org/17618
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
This commit introduces a new global option that can be set to limit
the number of multiplexed bricks in one process.
Usage:
`# gluster volume set all cluster.max-bricks-per-process <value>`
If this option is not set then multiplexing will happen for now
with no limitations set; i.e. a brick process will have as many
bricks multiplexed to it as possible. In other words the current
multiplexing behaviour won't change if this option isn't set to
any value.
This commit also introduces a brick process instance that contains
information about brick processes, like the number of bricks
handled by the process (which is 1 in non-multiplexing cases), the list
of bricks, and the port number, which also serves as a unique identifier
for each brick process instance. The brick process list is
maintained in 'glusterd_conf_t'.
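A sketch of the brick process instance described above; field names
follow the description, not necessarily the code, and struct list_head
is libglusterfs's list type:

  struct brick_proc {
          int              port;        /* doubles as the unique id */
          int              brick_count; /* 1 when multiplexing is off */
          struct list_head bricks;      /* bricks attached to this process */
          struct list_head list;        /* membership in the conf list */
  };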
Updates: #151
Change-Id: Ib987d14ab0a4f6034dac01b73a4b2839f7b0b695
Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-on: https://review.gluster.org/17469
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Problem:
When we call listen from protocol/server, we use a
hard-coded value of 10 if one is not manually given.
With multiplexing, especially when glusterd restarts, all
clients may try to connect to the server at the same time,
which can overflow the listen queue, and the kernel
will complain about the errors.
Solution:
This patch introduces a volume set command to make the backlog
value configurable. It also changes the default
backlog value from 10 to 128. This change is only applicable
to sockets listening from protocol.
Example:
gluster volume set <volname> transport.listen-backlog 1024
Note: 1) A brick has to be restarted to get this value in effect.
2) This change won't be reflected in glusterd, or other
xlators which call listen. If you need it there, you have to
add this option to the volfile.
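In socket terms, the option controls the second argument to listen();
a minimal sketch:

  #include <sys/socket.h>

  int backlog = 128;   /* transport.listen-backlog; was hard-coded to 10 */

  /* a longer queue absorbs the reconnect burst that follows a glusterd
   * restart under brick multiplexing */
  if (listen (sock, backlog) != 0)
          return -1;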
Change-Id: I0c5a2bbf28b5db612f9979e7560e05dd82b41477
BUG: 1456405
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: https://review.gluster.org/17411
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
This commit tries to handle a race where we might end up trying to
spawn the brick process twice with two different sets of ports,
resulting in the glusterd portmapper having the same brick entry on
two different ports; clients then fail to connect to bricks because
glusterd communicates back incorrect ports.
In glusterd_brick_start (), checking the brickinfo->status flag to
identify whether a brick has been started by glusterd is not
sufficient. While glusterd restarts, glusterd_restart_bricks () is
called through glusterd_spawn_daemons () in a synctask, and
immediately after, glusterd_do_volume_quorum_action () (with
server-side quorum set to on) will again try to start the brick. If
the RPC_CLNT_CONNECT event for the same brick hasn't been processed by
glusterd by that time, brickinfo->status will still be marked
GF_BRICK_STOPPED, resulting in a reattempt to start the brick with a
different port; the portmap goes for a toss, and clients fetch an
incorrect port.
The fix is to introduce another enum value, GF_BRICK_STARTING, set in
brickinfo->status when a brick start is attempted by glusterd and
moved to started through the RPC_CLNT_CONNECT event. For brick
multiplexing, on an attach-brick request the brickinfo->status flag is
marked started directly, so this value has no effect there.
This patch also removes the started_here flag, as it is redundant
with brickinfo->status.
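The resulting brick state machine, sketched as an enum
(GF_BRICK_STOPPING, introduced by an earlier commit further down this
log, is omitted for brevity):

  typedef enum {
          GF_BRICK_STOPPED,   /* not running, or start not yet attempted */
          GF_BRICK_STARTING,  /* start attempted; RPC_CLNT_CONNECT pending */
          GF_BRICK_STARTED,   /* RPC_CLNT_CONNECT processed */
  } brick_status_sketch_t;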
Change-Id: I9dda1a9a531b67734a6e8c7619677867b520dcb2
BUG: 1457981
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: https://review.gluster.org/17447
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Problem: With brick multiplexing on, after restarting glusterd the CLI
does not show the pid of all brick processes in all volumes.
Solution: While brick-mux is on, all local brick processes communicate
through one UNIX socket, but as per the current code
(glusterd_brick_start) glusterd tries to communicate with a separate
UNIX socket for each volume, populated based on brick-name and
vol-name. Because of the multiplexing design, only one UNIX socket is
opened, so it throws a poller error and is not able to fetch the
correct status of the brick process through the cli process.
To resolve the problem, write a new function
glusterd_set_socket_filepath_for_mux, called by glusterd_brick_start,
to validate the existence of the socketpath.
To avoid continuous EPOLLERR errors in the logs, update the
socket_connect code.
Test: To reproduce the issue, follow the steps below:
1) Create two distributed volumes (dist1 and dist2)
2) Set cluster.brick-multiplex to on
3) Kill glusterd
4) Run the command: gluster v status
After applying the patch, it shows the correct pid for all volumes.
BUG: 1444596
Change-Id: I5d10af69dea0d0ca19511f43870f34295a54a4d2
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Reviewed-on: https://review.gluster.org/17101
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
The aux mount is created on the first limit/remove_limit/list command,
and it remains until the volume is stopped or deleted (or quota is
disabled), at which point we do a lazy unmount. If the process is
uncleanly terminated, the mount entry remains and we get (Transport
disconnected) errors on subsequent attempts to run quota
list/limit-usage/remove commands.
There is also a risk of an inadvertent rm -rf on /var/run/gluster
causing data loss for the user. Ideally, /var/run is a temp path for
application use and should not cause any data loss to persistent
storage.
Solution:
1) Unmount the aux mount after each use.
2) Clean up any stale mount before mounting.
One caveat with doing a mount/unmount on each command is that we cannot
use the same mount point for both list and limit commands.
The reason is that the list command needs the mount to be accessible
in the cli after the response from glusterd, so it could be unmounted
by a limit command executed in parallel (had we used the same mount
point). Hence we use separate mount points for the list and limit
commands.
Change-Id: I4f9e39da2ac2b65941399bffb6440db8a6ba59d0
BUG: 1433906
Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com>
Reviewed-on: https://review.gluster.org/16938
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Valgrind cannot show the symbols of a .so after dlclose() has been
called on it. The unhelpful ??? in the output gets resolved properly
with this change:
==25170== 344 bytes in 1 blocks are definitely lost in loss record 233 of 324
==25170== at 0x4C29975: calloc (vg_replace_malloc.c:711)
==25170== by 0x52C7C0B: __gf_calloc (mem-pool.c:117)
==25170== by 0x12B0638A: ???
==25170== by 0x528FCE6: __xlator_init (xlator.c:472)
==25170== by 0x528FE16: xlator_init (xlator.c:498)
==25170== by 0x52DA8D6: glusterfs_graph_init (graph.c:321)
==25170== by 0x52DB587: glusterfs_graph_activate (graph.c:695)
==25170== by 0x5046407: glfs_process_volfp (glfs-mgmt.c:79)
==25170== by 0x5043B9E: glfs_volumes_init (glfs.c:281)
==25170== by 0x5044FEC: glfs_init_common (glfs.c:986)
==25170== by 0x50451A7: glfs_init@@GFAPI_3.4.0 (glfs.c:1031)
By not calling dlclose(), the dynamically loaded .so is still available
upon program exit, and Valgrind is able to resolve the symbols. This
will add an additional leak, so dlclose() is called for normal builds,
but skipped when configuring with "./configure --enable-valgrind" or
passing the "run-with-valgrind" xlator option.
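A sketch of the gating described above (the flag would come from the
--enable-valgrind configure switch or the "run-with-valgrind" xlator
option; names illustrative; dlclose is from <dlfcn.h>):

  if (!run_with_valgrind) {
          /* normal builds: release the library as before */
          dlclose (xl->dlhandle);
  }
  /* valgrind builds: deliberately keep the handle (a known leak) so
   * the .so mappings survive until exit and symbols stay resolvable */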
URL: http://valgrind.org/docs/manual/faq.html#faq.unhelpful
Change-Id: I2044e21b1b8fcce32ad1a817fdd795218f967731
BUG: 1425623
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: https://review.gluster.org/16809
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
We need to do this because modifying the volume/brick tree while
glusterd_restart_bricks is still walking it can lead to segfaults.
Without waiting we could accidentally "slip in" while attach_brick has
released big_lock between retries and make such a modification.
Change-Id: I30ccc4efa8d286aae847250f5d4fb28956a74b03
BUG: 1432542
Signed-off-by: Jeff Darcy <jeff@pl.atyp.us>
Reviewed-on: https://review.gluster.org/16927
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
remove all vestiges of ganesha
The storhaug CLI is used to manage ganesha and Samba. Also any setup
and teardown of the ganesha HA is initiated using storhaug to preserve
the proper layering.
Change-Id: I0eec0016a1b7802a36e7b2d92896b86fdf8607d5
BUG: 1420713
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: https://review.gluster.org/16504
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
This patch adds support for multiple brick translator stacks running
in a single brick server process. This reduces our per-brick memory usage by
approximately 3x, and our appetite for TCP ports even more. It also creates
potential to avoid process/thread thrashing, and to improve QoS by scheduling
more carefully across the bricks, but realizing that potential will require
further work.
Multiplexing is controlled by the "cluster.brick-multiplex" global option. By
default it's off, and bricks are started in separate processes as before. If
multiplexing is enabled, then *compatible* bricks (mostly those with the same
transport options) will be started in the same process.
Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
BUG: 1385758
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: https://review.gluster.org/14763
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
With this, we will be able to trigger statedumps on remote Gluster
clients, mainly targeted at applications using libgfapi.
Design:
The SIGUSR1 signal is the most common way of taking a statedump in
Gluster, but it cannot be used for libgfapi-based processes, as the
process loading the library might have already consumed the SIGUSR1
signal. Hence we go by the command route.
One has to issue a Gluster command to initiate a statedump on the
libgfapi-based client. The command takes hostname and PID as
arguments. All the glusterds in the cluster check if they are
connected to the specified hostname, and send an RPC request to all
the connected clients from that hostname (via the mgmt connection).
URL: http://review.gluster.org/16357
Change-Id: Icbe4d2f026b32a2c7d5535e1bfb2cdaaff042e91
BUG: 1169302
Signed-off-by: Poornima G <pgurusid@redhat.com>
[ndevos: minor fixes and split patch in smaller pieces]
Reviewed-on: https://review.gluster.org/9228
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Niels de Vos <ndevos@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
The estimates will be logged to the rebalance log on running
gluster v rebalance <vol> status
Change-Id: I9d51b139cd4c8dfde1ff2c2050720ae606c13fc6
BUG: 1396004
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/15893
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
tierd is implemented by separating it from the rebalance process.
The commands affected:
1) Attach tier will trigger this process instead of the old one.
2) tier start and tier start force will also trigger this process.
3) volume status [tier] will show the tier daemon as a process instead
of a task, and normal tier status and tier detach status work.
4) tier stop implemented.
5) detach tier implemented separately, along with a new detach tier
status.
6) volume tier volname status will work using the changes.
7) volume set works.
This patch has separated the tier translator from the legacy
DHT rebalance code. It now sends the RPCs from the CLI
to glusterd separately from the DHT rebalance code.
The daemon is now a service, similar to the snapshot daemon,
and can be viewed using the volume status command.
The code for the validation and commit phases is the same
as the earlier tier validation code in DHT rebalance.
The "brickop" phase has been changed so that the status
command can use this framework.
The service management framework is now used;
DHT rebalance does not use this framework.
This service framework takes care of:
*) spawning the daemon, killing it, and other such processes.
*) volume set options, which are written in the volfile.
*) restart and reconfigure functions. Restart is to restart
the daemon at two points:
1) after gluster goes down and comes up.
2) to stop detach tier.
*) reconfigure is used to make immediate volfile changes.
By doing this, we don't restart the daemon;
it has the code to rewrite the volfile for topological
changes too (which comes into play during add and remove brick).
With this patch the log, pid, and volfile are separated
and put into respective directories.
Change-Id: I3681d0d66894714b55aa02ca2a30ac000362a399
BUG: 1313838
Signed-off-by: hari gowtham <hgowtham@redhat.com>
Reviewed-on: http://review.gluster.org/13365
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: hari gowtham <hari.gowtham005@gmail.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
gluster volume get <VOLNAME> cluster.opversion gives us the current
op-version on which the cluster is operating. There is no command
that lets the user know the maximum supported op-version that the
cluster can run on.
This patch adds a new global option cluster.max-op-version, that
can be used to retrieve the maximum supported op-version in a
cluster.
Usage:
# gluster volume get all cluster.max-op-version
Example output:
Option Value
------ -----
cluster.max-op-version 30900
NOTE: The only way to test this feature for now is to set the
GD_OP_VERSION_MAX macro to different values (30800 for 3.8,30900
for 3.9, and so on) and rebuild glusterd. Since the regression test
framework currently doesn't have support to simulate these tests,
there are no accompanying regression tests for this feature. It
should be possible to add tests once glusto comes in and makes it
easier to run a heterogeneous cluster.
Change-Id: I547480ee5e7912664784643e436feb198b6d16d0
BUG: 1365822
Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-on: http://review.gluster.org/16283
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Currently it is not possible to retrieve values of global options
by using the 'gluster volume get' functionality if there are no
volumes present. In order to get the global options one has to use
'gluster volume get' with a specific volume name. This usage makes
the illusion as though the option is set only on one volume, which
is incorrect. When setting the global options, 'gluster volume set'
provides a way to set them using the volume name as 'all'.
Similarly, retrieving the global options should be made possible by
using the volume name 'all' with the 'gluster volume get'
functionality. This patch adds that functionality to 'volume get'
Usage:
# gluster volume get all <OPTION/all>
Change-Id: Ic2fdb9eda69d4806d432dae26d117d9660fe6d4e
BUG: 1378842
Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-on: http://review.gluster.org/15563
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Added further checks to ensure we do not go beyond prevalidate
when trying to restore a snapshot which has an nfs-ganesha conf
file, in a cluster where nfs-ganesha is not enabled.
The error message for the particular scenario is:
"Snapshot(<snapname>) has a nfs-ganesha export conf
file. cluster.enable-shared-storage and nfs-ganesha
should be enabled before restoring this snapshot."
Change-Id: I1b87e9907e0a5e162f26ef1ca89fe76e8da8610f
BUG: 1404118
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/16116
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
The "gluster volume reset" should first unexport the volume and then delete
export configuration file. Also reset option is not applicable for ganesha.enable
if volume value is "all".
This patch also changes the name of create_export_config into manange_export_config
Change-Id: Ie81a49e7d3e39a88bca9fbae5002bfda5cab34af
BUG: 1397795
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
Reviewed-on: http://review.gluster.org/15914
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
The brick path of snapshot clones contained the clone name, thereby
failing to create newer clones with the same name
after the original clone had been deleted.
This fix creates the brick path with the clone's vol id
instead of the clone's name, so future clones with the
same name will not have the namespace clash.
Change-Id: I262712adc576122f051b5d1ce171d020efaefd1a
BUG: 1387160
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/15683
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Check if S32gluster_enable_shared_storage.sh is present
at /var/lib/glusterd/hooks/1/set/post/ at staging
before proceeding with the command. Fail the command
with the appropriate error message in case it is not
present.
Change-Id: I84e3912f1cdffb927f8a40d74d52be43ee69388b
BUG: 1388348
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/15718
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
On a volume stop trigger, glusterd issues a brick-op to terminate the
brick process during the brick-op phase; however, in the commit op glusterd
once again tries to kill the same process if it exists, and then marks the
brickinfo->status flag as GF_BRICK_STOPPED. In the former case, if the brick
is successfully killed, there is a possibility that glusterd will receive
RPC_CLNT_DISCONNECT from the said brick process before the commit op phase
is even executed, and by that time brickinfo->status will still be set to
GF_BRICK_STARTED.
The BRICK_DISCONNECT event should only be sent if a brick has been killed
and not through a volume stop/remove-brick trigger; however, due to this
race, the event is also sent out on a volume stop.
The fix is to introduce an intermediate state GF_BRICK_STOPPING, which can
be used to mark the brick status in the brick-op phase of volume
stop/remove brick, to avoid sending spurious BRICK_DISCONNECT events on a
volume stop trigger.
Change-Id: Ieed4450e1c988715e0f9958be44faa6b14be81e1
BUG: 1387652
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/15699
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kaushal M <kaushal@redhat.com>
The command basically allows replace-brick to run with the source and
destination bricks being the same.
Usage:
gluster v reset-brick <volname> <hostname:brick-path> start
This command kills the brick to be reset. Once this command is run,
admin can do other manual operations that they need to do,
like configuring some options for the brick. Once this is done,
resetting the brick can be continued with the following options.
gluster v reset-brick <vname> <hostname:brick> <hostname:brick> commit {force}
Does the job of resetting the brick. 'force' option should be used
when the brick already contains volinfo id.
Problem: On doing a disk replacement of a brick in a replicate volume,
the following 2 scenarios may occur:
a) there is a chance that reads are served from this replaced-disk brick,
which leads to empty reads; b) potential data loss if the next writes
succeed only on the replaced brick, and heal is done to the other bricks
from this one.
Solution: After disk-replacement, make sure that reset-brick command is
run for that brick so that pending markers are set for the brick and it
is not chosen as source for reads and heal. But, as of now replace-brick
for the same brick-path is not allowed. In order to fix the above
mentioned problem, same brick-path replace-brick is needed.
With this patch reset-brick commit {force} will be allowed even when
source and destination <hostname:brickpath> are identical as long as
1) destination brick is not alive
2) source and destination brick have the same brick uuid and path.
Also, the destination brick after replace-brick will use the same port
as the source brick.
Change-Id: I440b9e892ffb781ea4b8563688c3f85c7a7c89de
BUG: 1266876
Signed-off-by: Anuradha Talur <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/12250
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
http://review.gluster.org/14085 fixes a/the "leak" - via the
generated rpc/xdr headers - of pragmas that mask these warnings.
However 14085 won't pass the smoke test until all the warnings are
fixed.
Change-Id: Id3577872ef720787796f7bfe6772734a3b26fef0
BUG: 1369124
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/15286
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Prashanth Pai <ppai@redhat.com>