| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Handle case of arg not freed
CID: 1422174
Updates: #1060
Change-Id: Ibd03908a3ea8369035c2b7f6e024b3e5be48f436
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In brick_mux environment, while volumes are created/stopped in a loop
after running a long time the main brick is crashed.The brick is crashed
because the main brick process was not cleaned up memory for all objects
at the time of detaching a volume.
Below are the objects that are missed at the time of detaching a volume
1) xlator object for a brick graph
2) local_pool for posix_lock xlator
3) rpc object cleanup at quota xlator
4) inode leak at brick xlator
Solution: To avoid the crash resolve all leak at the time of detaching a brick
Change-Id: Ibb6e46c5fba22b9441a88cbaf6b3278823235913
updates: #977
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
| |
Change-Id: I32f801206ce7fbd05aa693f44c2f140304f2e275
Fixes: bz#1810842
|
|
|
|
|
|
|
|
|
| |
convert gf_msg() to gf_smsg()
Updates: #657
Change-Id: I76a09cfd283bb4ec5c4358536da66547aaf0de31
Signed-off-by: yatip <ypadia@redhat.com>
|
|
|
|
|
|
|
|
|
| |
convert gf_msg() to gf_smsg()
Updates: #657
Change-Id: Ic7b38b646fa0932f7c1562467866137c4567e1f1
Signed-off-by: yatip <ypadia@redhat.com>
|
|
|
|
|
|
|
|
|
| |
convert gf_msg() to gf_smsg()
Updates: #657
Change-Id: I01146bcd06bca44faeca29da48fab1ee3fc51e00
Signed-off-by: yatip <ypadia@redhat.com>
|
|
|
|
|
|
|
|
| |
Convert all gf_msg() to gf_smsg()
Updates: #657
Change-Id: Ic54b03f05e2766c87f50df0b3a66803b5519fad9
Signed-off-by: yatip <ypadia@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: With lookup-optimize set to on by default, a client with
stale-layout can create a new file on a wrong subvol. This will lead to
possible duplicate files if two different clients attempt to create the
same file with two different layouts.
Solution: Send in-memory layout to be cross checked at posix before
commiting a "create". In case of a mismatch, sync the client layout with
that of the server and attempt the create fop one more time.
test: Manual, testcase(attached)
fixes: bz#1786679
Change-Id: Ife0941f105113f1c572f4363cbcee65e0dd9bd6a
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
| |
convert all gf_msg() to gf_smsg()
Updates: #657
Change-Id: Ia9d4fb17579af6586bc13d69ec7990c6cf220aac
Signed-off-by: yatip <ypadia@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The current version of the code depends on the non-zero-port information to
propagate ping event from brick to parent layers. But as and when connection
succeeds, port is set to zero in rpc layer.So ping event is never propagated to
parent layers. Halo doesn't work without this.
Fix:
Remember the status of connection in 'private' structure and use that to decide
to propagate ping event to parent xlator.
fixes: bz#1797934
Change-Id: Ia578ba9fb3813953d2068dbba5c982ab27cc3429
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
| |
convert all gf_msg() to gf_smsg()
Updates: #657
Change-Id: I69b228d7c7a8bc6263f9bd33710e880678d8c017
Signed-off-by: yatip <ypadia@redhat.com>
|
|
|
|
|
|
|
|
|
| |
convert gf_msg() to gf_smsg()
Updates: #657
Change-Id: Iee07228cfc3a9a9cd10e89ae9eb918681b072585
Signed-off-by: yatip <ypadia@redhat.com>
|
|
|
|
|
|
|
|
|
| |
convert gf_msg() to gf_smsg()
Updates: #657
Change-Id: Ib45f121f583c2af09bfddb23391f73a117e63213
Signed-off-by: yatip <ypadia@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
We can use memcpy() instead of strncpy() as both are strings that are
37 bytes (GF_UUID_BUF_SIZE) long.
fixes: CID#1405844
Change-Id: Ic74e8817cd790c13e29f3e6be8f18f2bfff77115
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: At the time of coming up one server node(1x3) after reboot
client is unmounted.The client is unmounted because a client
is getting AUTH_FAILED event and client call fini for the graph.The
client is getting AUTH_FAILED because brick is not attached with a
graph at that moment
Solution: To avoid the unmounting the client graph throw ENOENT error
from server in case if brick is not attached with server at
the time of authenticate clients.
Credits: Xavi Hernandez <xhernandez@redhat.com>
Change-Id: Ie6fbd73cbcf23a35d8db8841b3b6036e87682f5e
Fixes: bz#1793852
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This parameter may have been used in the past, but is no longer
needed. Removing it and the few locations it was actually referenced.
This allows to remove an extra memdup as well, that was not needed
in the 1st place in server_setvolume() and unserialize_rsp_direntp()
functions.
A followup separate patch will remove extra_stdfree parmeter
from the dictionary structure.
Change-Id: Ica0ff0a330672373aaa60e808b7e76ec489a0fe3
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
| |
It has been a while since we removed lock healing logic from protocol
client. So no need to mention that we healed locks after fd reopen.
Change-Id: I24bd3f9e9f2942e306714b2cb83c229ae57c60ae
Fixes: bz#1193929
Signed-off-by: Anoop C S <anoopcs@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
'volume-id' is good to have for a graph for uniquely identifying it.
Add it to graph->volume_id while generating volfile itself.
This can be further used in many other places.
Updates: #763
Change-Id: I80516d62d28a284e8ff4707841570ced97a37e73
Signed-off-by: Amar Tumballi <amar@kadalu.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
squash tens of warnings on padding of structs in afr structures.
The warnings were found by manually added '-Wpadded' to the GCC
command line.
Also made relevant structs and definitions static, where it
was applicable.
Change-Id: Ib71a7e9c6179378f072d796d11172d086c343e53
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In many places we use it, compare to it, etc. It could be a static variable,
as it really doesn't change. I think it's better than initializing to 0
and then doing gfid[15] = 1 or other tricks.
I think there are additional oppportunuties to make more variables static.
This is an attempt at an easy one.
Change-Id: I7f23a30a94056d8f043645371ab841cbd0f90d19
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- remove dead code
- move functions to be static
- move some code that only needs to be executed under if branch
- remove some dead assignments and redundant checks.
No functional change, I hope.
Change-Id: I93d952408197ecd2fa91c3f812a73c54242342fa
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With added check of volume-id during handshake, we can be sure to not
connect with a brick if this gets re-used in another volume. This
prevents any accidental issues which can happen with a stale client
process lurking along.
Also added test case for testing same volume name which would fetch a
different volfile (ie, different bricks, different type), and a
different volume name, but same brick.
For reference:
Currently a client<->server handshake happens in glusterfs through
protocol/client translator (setvolume) to protocol/server using a
dictionary which containes many keys. Rejection happens in server
side if some of the required keys are missing in handshake
dictionary.
Till now, there was no single unique identifier to validate for a
client to tell server if it is actually talking to a corresponding
server. All we look in protocol/client is a key called
'remote-subvolume', which should match with a subvolume name in server
volume file, and for any volume with same brick name (can be present
in same cluster due to recreate), it would be same. This could cause
major issue, when a client was connected to a given brick, in one
volume would be connected to another volume's brick if its
re-created/re-used.
To prevent this behavior, we are now passing along 'volume-id' in
handshake, which would be preserved for the life of client process,
which can prevent this accidental connections.
NOTE: This behavior wouldn't be applicable for user-snapshot enabled
volumes, as snapshotted volume's would have different volume-id.
Fixes: bz#1620580
Change-Id: Ie98286e94ce95ae09c2135fd6ec7d7c2ca1e8095
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
reconnect
Bricks cleanup any granted locks after a client disconnects and
currently these locks are not healed after a reconnect. This means
post reconnect a competing process could be granted a lock even though
the first process which was granted locks has not unlocked. By not
re-opening fds, subsequent operations on such fds will fail forcing
the application to close the current fd and reopen a new one. This way
we prevent any silent corruption.
A new option "client.strict-locks" is introduced to control this
behaviour. This option is set to "off" by default.
Change-Id: Ieed545efea466cb5e8f5a36199aa26380c301b9e
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
updates: bz#1694920
|
|
|
|
|
|
|
|
|
|
|
| |
1404965 - Null pointer dereference
1404316 - Program hangs
1401715 - Program hangs
1401713 - Program hangs
Updates: bz#789278
Change-Id: I6e6575daafcb067bc910445f82a9d564f43b75a2
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were unconditionally cleaning up the grap when we get
child_down followed by parent_down. But this is prone to
race condition when some of the bricks are already disconnected.
In this case, even before the last child down is executed in the
client xlator code,we might have freed the graph. Because the
child_down event is alreadt recevied.
To fix this race, we have introduced a check to see if all client
xlator have cleared thier reconnect chain, and called the child_down
for last time.
Change-Id: I7d02813bc366dac733a836e0cd7b14a6fac52042
fixes: bz#1727329
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To convert the existing `gf_msg` to `gf_smsg`:
- Define `_STR` of respective Message ID as below(In `*-messages.h`)
#define PC_MSG_REMOTE_OP_FAILED_STR "remote operation failed."
- Change `gf_msg` to use `gf_smsg`. Convert values into fields and
add any missing fields. Note: `errno` and `error` fields will be
added automatically to log message in case errnum is specified.
Example:
gf_smsg(
this->name, // Name or log domain
GF_LOG_WARNING, // Log Level
rsp.op_errno, // Error number
PC_MSG_REMOTE_OP_FAILED, // Message ID
"path=%s", local->loc.path, // Key Value 1
"gfid=%s", loc_gfid_utoa(&local->loc), // Key Value 2
NULL // Log End
);
Key value pairs formatting Help:
gf_slog(
this->name, // Name or log domain
GF_LOG_WARNING, // Log Level
rsp.op_errno, // Error number
PC_MSG_REMOTE_OP_FAILED, // Message ID
"op=CREATE", // Static Key and Value
"path=%s", local->loc.path, // Format for Value
"brick-%d-status=%s", brkidx, brkstatus, // Use format for key and val
NULL // Log End
);
Before:
[2019-07-03 08:16:18.226819] W [MSGID: 114031] [client-rpc-fops_v2.c \
:2633:client4_0_lookup_cbk] 0-gv3-client-0: remote operation failed. \
Path: / (00000000-0000-0000-0000-000000000001) [Transport endpoint \
is not connected]
After:
[2019-07-29 07:50:15.773765] W [MSGID: 114031] \
[client-rpc-fops_v2.c:2633:client4_0_lookup_cbk] 0-gv1-client-0: \
remote operation failed. [{path=/f1}, \
{gfid=00000000-0000-0000-0000-000000000000}, \
{errno=107}, {error=Transport endpoint is not connected}]
To add new `gf_smsg`, Add a Message ID in respective `*-messages.h` file
and the follow the steps mentioned above.
Change-Id: I4e7d37f27f106ab398e991d931ba2ac7841a44b1
Updates: #657
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In brick_mux environment sometime brick is crashed while
volume stop/start in a loop.Brick is crashed in janitor task
at the time of accessing priv.If posix priv is cleaned up before
call janitor task then janitor task is crashed.
Solution: To avoid the crash in brick_mux environment introduce a new
flag janitor_task_stop in posix_private and before send CHILD_DOWN event
wait for update the flag by janitor_task_done
Change-Id: Id9fa5d183a463b2b682774ab5cb9868357d139a4
fixes: bz#1730409
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Moved null pointer check up in order to avoid seg-fault
CID: 1404258
Updates: bz#789278
Change-Id: Ib97e05302bfeb8fe38d6ce9870b9740cb576e492
Signed-off-by: Barak Sason <bsasonro@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Goal: 'libglusterfs' files shouldn't have any dependency outside of
the tree, specially the header files, shouldn't have '#include'
from outside the tree.
Fixes:
* Had to introduce libglusterd so, methods and structures required
for only mgmt/glusterd, and cli/ are separated from 'libglusterfs/'
* Remove rpc/xdr/gen from build, which was used mainly so
dependency for libglusterfs could be properly satisfied.
* Move rpcsvc_auth_data to client_t.h, so all dependencies could
be handled.
Updates: bz#1636297
Change-Id: I0e80243a5a3f4615e6fac6e1b947ad08a9363fce
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
- Removal of quite a bit of dead code.
- Use dict_set_str_sizen and friends where applicable.
- Moved some functions to be static and initialize values right away.
Change-Id: Ic25b5da4028198694a0e24796dea375661eb66b9
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I hit one crash issue when using the libgfapi.
In the libgfapi it will call glfs_poller() --> event_dispatch()
in file api/src/glfs.c:721, and the event_dispatch() is defined
by libgluster locally, the problem is the name of event_dispatch()
is the extremly the same with the one from libevent package form
the OS.
For example, if a executable program Foo, which will also use and
link the libevent and the libgfapi at the same time, I can hit the
crash, like:
kernel: glfs_glfspoll[68486]: segfault at 1c0 ip 00007fef006fd2b8 sp
00007feeeaffce30 error 4 in libevent-2.0.so.5.1.9[7fef006ed000+46000]
The link for Foo is:
lib_foo_LADD = -levent $(GFAPI_LIBS)
It will crash.
This is because the glfs_poller() is calling the event_dispatch() from
the libevent, not the libglsuter.
The gfapi link info :
GFAPI_LIBS = -lacl -lgfapi -lglusterfs -lgfrpc -lgfxdr -luuid
If I link Foo like:
lib_foo_LADD = $(GFAPI_LIBS) -levent
It will works well without any problem.
And if Foo call one private lib, such as handler_glfs.so, and the
handler_glfs.so will link the GFAPI_LIBS directly, while the Foo won't
and it will dlopen(handler_glfs.so), then the crash will be hit everytime.
The link info will be:
foo_LADD = -levent
libhandler_glfs_LIBADD = $(GFAPI_LIBS)
I can avoid the crash temporarily by linking the GFAPI_LIBS in Foo too like:
foo_LADD = $(GFAPI_LIBS) -levent
libhandler_glfs_LIBADD = $(GFAPI_LIBS)
But this is ugly since the Foo won't use any APIs from the GFAPI_LIBS.
And in some cases when the --as-needed link option is added(on many dists
it is added as default), then the crash is back again, the above workaround
won't work.
Fixes: #699
Change-Id: I38f0200b941bd1cff4bf3066fca2fc1f9a5263aa
Signed-off-by: Xiubo Li <xiubli@redhat.com>
|
|
|
|
|
|
|
|
| |
This function does length, allocation and serialization for you.
Change-Id: I142a259952a2fe83dd719442afaefe4a43a8e55e
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two reasons:
* ping responses from glusterd may not be relevant for Halo
replication. Instead, it might be interested in only knowing whether
the brick itself is responsive.
* When a brick is killed, propagating GF_EVENT_CHILD_PING of ping
response from glusterd results in GF_EVENT_DISCONNECT spuriously
propagated to parent xlators. These DISCONNECT events are from the
connections client establishes with glusterd as part of its
reconnect logic. Without GF_EVENT_CHILD_PING, the last event
propagated to parent xlators would be the first DISCONNECT event
from brick and hence subsequent DISCONNECTS to glusterd are not
propagated as protocol/client prevents same event being propagated
to parent xlators consecutively. propagating GF_EVENT_CHILD_PING for
ping responses from glusterd would change the last_sent_event to
GF_EVENT_CHILD_PING and hence protocol/client cannot prevent
subsequent DISCONNECT events
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Fixes: bz#1716979
Change-Id: I50276680c52f05ca9e12149a3094923622d6eaef
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* locks/posix.c: key was not freed in one of the cases.
* locks/common.c: lock was being free'd out of context.
* nfs/exports: handle case of missing free.
* protocol/client: handle case of entry not freed.
* storage/posix: handle possible case of double free
CID: 1398628, 1400731, 1400732, 1400756, 1124796, 1325526
updates: bz#789278
Change-Id: Ieeaca890288bc4686355f6565f853dc8911344e8
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At the moment new stack doesn't populate frame->root->unique in all cases. This
makes it difficult to debug hung frames by examining successive state dumps.
Fuse and server xlators populate it whenever they can, but other xlators won't
be able to assign 'unique' when they need to create a new frame/stack because
they don't know what 'unique' fuse/server xlators already used. What we need is
for unique to be correct. If a stack with same unique is present in successive
statedumps, that means the same operation is still in progress. This makes
'finding hung frames' part of debugging hung frames easier.
fixes bz#1714098
Change-Id: I3e9a8f6b4111e260106c48a2ac3a41ef29361b9e
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the following CID's:
* 1124829
* 1274075
* 1274083
* 1274128
* 1274135
* 1274141
* 1274143
* 1274197
* 1274205
* 1274210
* 1274211
* 1288801
* 1398629
Change-Id: Ia7c86cfab3245b20777ffa296e1a59748040f558
Updates: bz#789278
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Compound fops are kept on wire as a backward compatibility with
older AFR modules. The AFR module used beyond 4.x releases are
not using compound fops. Hence removing the compound fop in the
protocol code.
Note that, compound-fops was already an 'option' in AFR, and
completely removed since 4.1.x releases.
So, point to note is, with this change, we have 2 ways to upgrade
when clients of 3.x series are present.
i) set 'use-compound-fops' option to 'false' on any volume which
is of replica type. And then upgrade the servers.
ii) Do a two step upgrade. First from current version (which will
already be EOL if it's using compound) to a 4.1..6.x version,
and then an upgrade to 7.x.
Consider the overall code which we are removing for the option
seems quite high, I believe it is worth it.
updates: bz#1693692
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Change-Id: I0a8876d0367a15e1410ec845f251d5d3097ee593
|
|
|
|
|
|
|
|
|
| |
memdup() and gf_memdup() have the same implementation. Removed one API
as the presence of both can be confusing.
Change-Id: I562130c668457e13e4288e592792872d2e49887e
updates: bz#1193929
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a race condition in rpc_transport later
and client fini.
Sequence of events to happen the race condition
1) When we want to destroy a graph, we send a parent down
event first
2) Once parent down received on a client xlator, we will
initiates a rpc disconnect
3) This will in turn generates a child down event.
4) When we process child down, we first do fini for
Every xlator
5) On successful return of fini, we delete the graph
Here after the step 5, there is a chance that the fini
on client might not be finished. Because an rpc_tranpsort
ref can race with the above sequence.
So we have to wait till all rpc's are successfully freed
before returning the fini from client
Change-Id: I20145662d71fb837e448a4d3210d1fcb2855f2d4
fixes: bz#1659708
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As protocol implements every fop, and in general a large part of
the codebase. Considering our regression is run mostly in 1 machine,
there was no way of forcing the client to use old protocol (while new
one is available). With this patch, a new 'testing' option is provided
which forces client to use old protocol if found.
This should help increase the code coverage by at least 10k lines overall.
updates: bz#1693692
Change-Id: Ie45256f7dea250671b689c72b4b6f25037cef948
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Shd daemon is per node, which means they create a graph
with all volumes on it. While this is a great for utilizing
resources, it is so good in terms of performance and managebility.
Because self-heal daemons doesn't have capability to automatically
reconfigure their graphs. So each time when any configurations
changes happens to the volumes(replicate/disperse), we need to restart
shd to bring the changes into the graph.
Because of this all on going heal for all other volumes has to be
stopped in the middle, and need to restart all over again.
Solution:
This changes makes shd as a per volume daemon, so that the graph
will be generated for each volumes.
When we want to start/reconfigure shd for a volume, we first search
for an existing shd running on the node, if there is none, we will
start a new process. If already a daemon is running for shd, then
we will simply detach a graph for a volume and reatach the updated
graph for the volume. This won't touch any of the on going operations
for any other volumes on the shd daemon.
Example of an shd graph when it is per volume
graph
-----------------------
| debug-iostat |
-----------------------
/ | \
/ | \
--------- --------- ----------
| AFR-1 | | AFR-2 | | AFR-3 |
-------- --------- ----------
A running shd daemon with 3 volumes will be like-->
graph
-----------------------
| debug-iostat |
-----------------------
/ | \
/ | \
------------ ------------ ------------
| volume-1 | | volume-2 | | volume-3 |
------------ ------------ ------------
Change-Id: Idcb2698be3eeb95beaac47125565c93370afbd99
fixes: bz#1659708
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If an open comes on a file when a brick is down and after the brick comes up,
a fop comes on the fd, client xlator would still wind the fop on anon-fd
leading to wrong behavior of the fops in some cases.
Example:
If lk fop is issued on the fd just after the brick is up in the scenario above,
lk fop will be sent on anon-fd instead of failing it on that client xlator.
This lock will never be freed upon close of the fd as flush on anon-fd is
invalid and is not wound below server xlator.
As a fix, failing the fop unless the fd has FALLBACK_TO_ANON_FD flag.
Change-Id: I77692d056660b2858e323bdabdfe0a381807cccc
fixes bz#1390914
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fops allocate 3 kind of payload(buffer) in the client xlator:
- fop payload, this is the buffer allocated by the write and put fop
- rsphdr paylod, this is the buffer required by the reply cbk of
some fops like lookup, readdir.
- rsp_paylod, this is the buffer required by the reply cbk of fops like
readv etc.
Currently, in the lookup and readdir fop the rsphdr is sent as payload,
hence the allocated rsphdr buffer is also sent on the wire, increasing
the bandwidth consumption on the wire.
With this patch, the issue is fixed.
Fixes: bz#1692093
Change-Id: Ie8158921f4db319e60ad5f52d851fa5c9d4a269b
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
1399758 Dereference before null check
It was introduced @ commit 67f48bfcc16a38052e6c9ae7c25e69b03b8ae008
updates: bz#789278
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I1424b008b240691fe2a8924e31c708d0fb4f362d
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Brick is getting crash because graph was not activated
at the time of accessing server_conf
Solution: To avoid the crash check ctx->active before processing
a request
Change-Id: Ib112e0eace19189e45f430abdac5511c026bed47
fixes: bz#1687705
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Changed to use the dict_() funcs which take the key length.
This happens to also reduce work under the lock in one case as well.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I958fcc29e95286fe3c74178cae3f01a8b2db26f2
|
|
|
|
|
|
|
|
|
|
| |
Removed op_errno based SERVER_REQ_SET_ERROR() calls which was
dead-code. xdr_to_dict() calls have this check which is used
in 4.0 version of xdr-to-dict.
fixes bz#1676797
Change-Id: I6f56907c85576f1263a6ec04ed7e37f723b01ac3
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: When rpc-transport-disconnect happens, server_connection_cleanup_flush_cbk()
is supposed to call rpc_transport_unref() after open-files on
that transport are flushed per transport.But open-fd-count is
maintained in bound_xl->fd_count, which can be incremented/decremented
cumulatively in server_connection_cleanup() by all transport
disconnect paths. So instead of rpc_transport_unref() happening
per transport, it ends up doing it only once after all the files
on all the transports for the brick are flushed leading to
rpc-leaks.
Solution: To avoid races maintain fd_cnt at client instead of maintaining
on brick
Credits: Pranith Kumar Karampuri
Change-Id: I6e8ea37a61f82d9aefb227c5b3ab57a7a36850e6
fixes: bz#1668190
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: At the time of deleting block hosting volume
through heketi-cli , it is throwing an error "target is busy".
cli is throwing an error because brick is not detached successfully
and brick is not detached due to race condition to cleanp xprt
associated with detached brick
Solution: To avoid xprt specifc race condition introduce an atomic flag
on rpc_transport
Change-Id: Id4ff1fe8375a63be71fb3343f455190a1b8bb6d4
fixes: bz#1668190
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added functionality to gluster volume set auth.allow command to
accept CIDR IP addresses. Modified few functions to isolate cidr
feature so that it prevents other gluster commands such as peer
probe to use cidr format ip. The functions are modified in such
a way that they have an option to enable accepting of cidr
format for other gluster commands if required in furture.
updates: bz#1138841
Change-Id: Ie6734002a7078f1820e5df42d404411cce945e8b
Credits: Mohit Agrawal
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|