path: root/xlators/mgmt/glusterd/src/glusterd-conn-mgmt.c
* glusterd: additional log information (nik-redhat, 2020-06-29, 1 file changed, -5/+14)
  Issue: Some of the functions did not have sufficient logging of
  information in case of failure.

  Fix: Added log messages to a few functions so that a failure now
  indicates the cause of the event.

  Change-Id: I301cf3a1c8d2c94505c6ae0d83072b0241c36d84
  fixes: #874
  Signed-off-by: nik-redhat <nladha@redhat.com>
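  The additions follow the usual glusterd failure-path pattern: log via
  gf_msg() before bailing out to the error label. A minimal sketch of
  that pattern (the dict key and message id here are illustrative, not
  the exact ones from the patch):

      ret = dict_get_str(dict, "volname", &volname);
      if (ret) {
          /* record why we are failing before jumping to cleanup */
          gf_msg(this->name, GF_LOG_ERROR, errno, GD_MSG_DICT_GET_FAILED,
                 "Failed to get volume name from dict");
          goto out;
      }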
* mgmt/shd: Implement multiplexing in self heal daemon (Mohammed Rafi KC, 2019-04-01, 1 file changed, -0/+42)
  Problem: The shd daemon is per node, which means it creates a graph
  with all volumes on it. While this is great for utilizing resources,
  it is not so good in terms of performance and manageability, because
  self-heal daemons don't have the capability to automatically
  reconfigure their graphs. So each time any configuration change
  happens to a volume (replicate/disperse), we need to restart shd to
  bring the change into the graph. Because of this, all ongoing heals
  for all other volumes have to be stopped in the middle and restarted
  all over again.

  Solution: This change makes shd a per-volume daemon, so that a graph
  is generated for each volume. When we want to start/reconfigure shd
  for a volume, we first search for an existing shd running on the
  node; if there is none, we start a new process. If a daemon is
  already running for shd, then we simply detach the graph for the
  volume and reattach the updated graph for the volume. This won't
  touch any ongoing operations for any other volumes on the shd
  daemon.

  Example of an shd graph when it is per volume:

                            graph
                  -----------------------
                  |     debug-iostat    |
                  -----------------------
                   /          |         \
                  /           |          \
          ---------      ---------     ---------
          | AFR-1 |      | AFR-2 |     | AFR-3 |
          ---------      ---------     ---------

  A running shd daemon with 3 volumes will be like -->

                            graph
                  -----------------------
                  |     debug-iostat    |
                  -----------------------
                   /          |         \
                  /           |          \
        ------------   ------------   ------------
        | volume-1 |   | volume-2 |   | volume-3 |
        ------------   ------------   ------------

  Change-Id: Idcb2698be3eeb95beaac47125565c93370afbd99
  fixes: bz#1659708
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
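  In glusterd terms the decision boils down to "spawn or attach". A
  rough sketch of that logic, with hypothetical helper names (the
  patch's actual helpers live in the glusterd svc framework):

      /* Hypothetical sketch of the start/reconfigure decision above;
       * these helper names are illustrative, not the patch's API. */
      int
      shd_start_or_reconfigure(glusterd_volinfo_t *volinfo)
      {
          if (!shd_process_running())
              /* no shd on this node yet: spawn one for this volume */
              return shd_spawn_process(volinfo);

          /* shd already running: swap only this volume's graph,
           * leaving ongoing heals of other volumes untouched */
          shd_detach_graph(volinfo);
          return shd_attach_graph(volinfo);
      }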
* rpc: Remove duplicate code (Pranith Kumar K, 2019-03-28, 1 file changed, -1/+1)
  rpc_clnt_disable() and rpc_clnt_disconnect() have the same code.
  Removed rpc_clnt_disconnect() and moved callers of
  rpc_clnt_disconnect() to rpc_clnt_disable().

  updates bz#1193929
  Change-Id: I965f57cc1d5af36d266810125558b6f5e5f279d4
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
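  For this file the change is a one-line caller update; sketched below
  in diff form (the surrounding conn-mgmt context is paraphrased):

      /* duplicate wrapper goes away; callers use the survivor */
      -    ret = rpc_clnt_disconnect(conn->rpc);
      +    ret = rpc_clnt_disable(conn->rpc);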
* rpc/transport: Missing a ref on dict while creating transport object (Mohammed Rafi KC, 2019-03-20, 1 file changed, -1/+7)
  While creating the rpc_transport object, we store a dictionary
  without taking a ref on the dict, but we do an unref during the
  cleanup of the transport object. So the rpc layer expects the caller
  to take a ref on the dictionary before passing the dict to the rpc
  layer. This leads to a lot of confusion across the code base and
  leads to ref leaks.

  Semantically, this is not correct. It is the rpc layer's
  responsibility to take a ref when storing the dict, and to free it
  during cleanup.

  Listing down the issues or leaks across the code base caused by this
  confusion (these issues are currently present in upstream master):

  1) changelog_rpc_client_init
  2) quota_enforcer_init
  3) rpcsvc_create_listeners: when there are two transports, like
     tcp,rdma.
  4) quotad_aggregator_init
  5) glusterd: init
  6) nfs3_init_state
  7) server: init
  8) client: init

  This patch does the cleanup according to the semantics.

  Change-Id: I46373af9630373eb375ee6de0e6f2bbe2a677425
  updates: bz#1659708
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
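  The corrected ownership rule looks roughly like this inside the
  transport layer (field name paraphrased from rpc_transport_t; treat
  as a sketch):

      /* rpc-transport takes its own ref when it stores the dict... */
      trans->options = dict_ref(options);

      /* ...and drops that ref in its own cleanup path, so callers no
       * longer need a compensating dict_ref() before handing it over */
      if (trans->options) {
          dict_unref(trans->options);
          trans->options = NULL;
      }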
* libglusterfs: Move devel headers under glusterfs directory (ShyamsundarR, 2018-12-05, 1 file changed, -1/+1)
  libglusterfs devel package headers are referenced in code using
  include semantics for a program; while this works, it can be better,
  especially when dealing with out-of-tree xlator builds or
  out-of-tree devel package usage in general.

  Towards this, the following changes are done:
  - moved all devel headers under a glusterfs directory
  - included these headers using system header notation <> in all code
    outside of libglusterfs
  - included these headers using own program notation "" within
    libglusterfs

  This change, although big, is just moving the headers around and
  making their inclusion correct from other sources. This helps us
  correctly include libglusterfs headers without namespace conflicts.

  Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
  Updates: bz#1193929
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
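  Concretely, for a file like this one the include line changes shape
  as follows (dict.h chosen as an arbitrary example):

      /* before: program-style include, ambiguous for out-of-tree builds */
      #include "dict.h"

      /* after: devel headers live under glusterfs/ and code outside
       * libglusterfs pulls them in as system headers */
      #include <glusterfs/dict.h>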
* glusterd: set integer instead of string (Sanju Rakonde, 2018-10-24, 1 file changed, -3/+2)
  dict_get_str_boolean expects an integer, so we need to set all the
  boolean variables as integers to avoid log messages like the one
  below:

  [2018-09-10 03:55:19.236387] I [dict.c:2838:dict_get_str_boolean]
  (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_reconnect+0xc2) [0x7ff7a83d0452]
  -->/usr/local/lib/glusterfs/4.2dev/rpc-transport/socket.so(+0x65b0) [0x7ff7a06cf5b0]
  -->/usr/local/lib/libglusterfs.so.0(dict_get_str_boolean+0xcf) [0x7ff7a85fc58f] )
  0-dict: key transport.socket.ignore-enoent, integer type asked,
  has string type [Invalid argument]

  This patch addresses all such instances in glusterd.

  Change-Id: I7e1979fcf381363943f4d09b94c3901c403727da
  updates: bz#1193929
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
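  The fix pattern is to store the value with an integer setter; a
  sketch, using the key from the log above:

      /* store the boolean as an integer so dict_get_str_boolean()
       * finds the type it asks for */
      ret = dict_set_int32(dict, "transport.socket.ignore-enoent", 1);
      /* instead of something like:
       * ret = dict_set_str(dict, "transport.socket.ignore-enoent", "on"); */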
* Land part 2 of clang-format changes (Gluster Ant, 2018-09-12, 1 file changed, -86/+83)
  Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
  Signed-off-by: Nigel Babu <nigelb@redhat.com>
* Some (mgmt) xlators: use dict_{setn|getn|deln|get_int32n|set_int32n|set_strn} (Yaniv Kaul, 2018-09-09, 1 file changed, -1/+3)
  In a previous patch (https://review.gluster.org/20769) we've added
  the key length to be passed to dict_* funcs, to remove the need to
  strlen() it. This patch moves some xlators to use it.

  - It also adds dict_get_int32n, which was missing.
  - It also reduces the size of some key variables. They were set to
    1024b or PATH_MAX, where sometimes 64 bytes were really enough.

  Please review carefully:
  1. That I did not reduce the size of some of the key variables too
     much.
  2. That I did not mix up some keys.

  Compile-tested only!

  Change-Id: Ic729baf179f40e8d02bc2350491d4bb9b6934266
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
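  The conversion itself looks like this (a sketch; SLEN() is the
  libglusterfs compile-time strlen macro for string literals):

      /* pass the key length explicitly so dict does not strlen() it */
      ret = dict_get_int32n(dict, "brick-count", SLEN("brick-count"),
                            &brick_count);
      /* replaces: ret = dict_get_int32(dict, "brick-count", &brick_count); */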
* rpc: add owner xlator argument to rpc_clnt_new (Krishnan Parthasarathi, 2015-08-12, 1 file changed, -1/+1)
  The @owner argument tells the RPC layer the xlator that owns the
  connection and to which xlator THIS needs to be set during network
  notifications like CONNECT and DISCONNECT.

  Code paths that originate from the head of a (volume) graph and use
  STACK_WIND ensure that the RPC local endpoint has the right xlator
  saved in the frame of the call (callback pair). This guarantees that
  the callback is executed in the right xlator context.

  The client handshake process, which includes fetching of brick ports
  from glusterd and setting lk-version on the brick for the session,
  doesn't have the correct xlator set in its frames. The problem lies
  with RPC notifications: there is no provision to set THIS to the
  xlator that is registered with the corresponding RPC programs. E.g.,
  the RPC_CLNT_CONNECT event received by protocol/client doesn't have
  THIS set to its xlator. This implies that calls (and callbacks)
  originating from this thread don't have the right xlator set either.

  The fix is to save the xlator registered with the RPC connection
  during rpc_clnt_new. E.g., protocol/client's xlator would be saved
  with the RPC connection that it 'owns'. RPC notifications such as
  CONNECT, DISCONNECT, etc. inherit THIS from the RPC connection's
  xlator.

  Change-Id: I9dea2c35378c511d800ef58f7fa2ea5552f2c409
  BUG: 1235582
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/11436
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
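  At the call sites in this file the change is mechanical: pass the
  owning xlator when creating the client. A sketch (argument order and
  the trailing parameter are paraphrased, so treat as approximate):

      /* the conn-mgmt layer hands its xlator to rpc_clnt_new so that
       * notifications for this connection run with THIS == this */
      conn->rpc = rpc_clnt_new(options, this /* owner */, this->name,
                               16 /* reqpool size, illustrative */);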
* glusterd: fix repeated connection to nfssvc failed msgs (Krishnan Parthasarathi, 2015-05-28, 1 file changed, -1/+0)
  ... and disable the reconnect timer on rpc_clnt_disconnect.

  Root Cause
  ----------
  The gluster-NFS service wouldn't be started if there are no started
  volumes that have the nfs service enabled for them. Before this fix
  we would initiate a connect even when the gluster-NFS service wasn't
  (re)started. Compounding that, glusterd_conn_disconnect doesn't
  disable the reconnect timer. So, it is possible that the reconnect
  timer was in execution when the timer event was attempted to be
  removed.

  Change-Id: Iadcb5cff9eafefa95eaf3a1a9413eeb682d3aaac
  BUG: 1222378
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/10830
  Tested-by: NetBSD Build System
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
  Reviewed-by: Kaushal M <kaushal@redhat.com>
  Tested-by: Kaushal M <kaushal@redhat.com>
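  Cancelling the pending timer under the connection lock is the key
  part of such a fix; a sketch of that shape (field and lock names are
  paraphrased from rpc_clnt's connection struct, so treat as
  approximate):

      /* in the disconnect path: stop any pending reconnect attempt
       * before tearing the connection down */
      pthread_mutex_lock(&conn->lock);
      {
          if (conn->reconnect) {
              gf_timer_call_cancel(clnt->ctx, conn->reconnect);
              conn->reconnect = NULL;
          }
      }
      pthread_mutex_unlock(&conn->lock);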
* mgmt/glusterd: Porting messages to new logging framework (Nandaja Varma, 2015-05-04, 1 file changed, -1/+3)
  Change-Id: I25f3536446798ea1cffd6b5dfbb3d2398766fcf3
  BUG: 1194640
  Signed-off-by: Nandaja Varma <nandaja.varma@gmail.com>
  Reviewed-on: http://review.gluster.org/9808
  Tested-by: NetBSD Build System
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Avra Sengupta <asengupt@redhat.com>
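  The porting pattern behind this commit: plain gf_log() calls become
  gf_msg() calls carrying a catalogued message id. A sketch (the
  GD_MSG_CONNECT_FAIL id shown here is illustrative):

      /* old framework */
      gf_log(this->name, GF_LOG_ERROR, "connection to %s failed", sockpath);

      /* new framework: adds an errno slot and a message id */
      gf_msg(this->name, GF_LOG_ERROR, errno, GD_MSG_CONNECT_FAIL,
             "connection to %s failed", sockpath);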
* glusterd: nfs,shd,quotad,snapd daemons refactoring (Atin Mukherjee, 2015-02-20, 1 file changed, -0/+135)
  This patch ports nfs, shd, quotad & snapd with the approach
  suggested in
  http://www.gluster.org/pipermail/gluster-devel/2014-December/043180.html

  Change-Id: I4ea5b38793f87fc85cc9d2cf873727351dedffd2
  BUG: 1191486
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/9428
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Nekkunti <anekkunt@redhat.com>
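  This is the commit that introduced glusterd-conn-mgmt.c (-0/+135): a
  small connection-lifecycle abstraction that each daemon reuses for
  its unix-socket connection to glusterd. Roughly, the API it added
  (prototypes paraphrased; treat exact signatures as approximate):

      /* connection lifecycle shared by nfs/shd/quotad/snapd */
      int glusterd_conn_init(glusterd_conn_t *conn, char *sockpath,
                             int frame_timeout,
                             glusterd_conn_notify_t notify);
      int glusterd_conn_term(glusterd_conn_t *conn);
      int glusterd_conn_connect(glusterd_conn_t *conn);
      int glusterd_conn_disconnect(glusterd_conn_t *conn);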