summaryrefslogtreecommitdiffstats
path: root/rpc/rpc-lib/src/rpc-clnt.h
diff options
context:
space:
mode:
authorNiels de Vos <ndevos@redhat.com>2017-07-10 09:45:02 +0200
committerNiels de Vos <ndevos@redhat.com>2017-07-11 13:25:31 +0000
commit9d5328e0e07353501392b33e9553300bd34a32ad (patch)
tree2f6ca08831b2495325c59cf31d0555f7ab9d8c32 /rpc/rpc-lib/src/rpc-clnt.h
parent3d5ef4e2cf31f611e3cbcd865c0367bab44a9552 (diff)
rpc/clnt: remove locks while notifying CONNECT/DISCONNECT
Locking during notify was introduced as part of commit aa22f24f5db7659387704998ae01520708869873 [1]. The fix was introduced to fix out-of-order CONNECT/DISCONNECT events from rpc-clnt to parent xlators [2]. However as part of handling DISCONNECT protocol/client does unwind saved frames (with failure) waiting for responses. This saved_frames_unwind can be a costly operation and hence ideally shouldn't be included in the critical section of notifylock, as it unnecessarily delays the reconnection to same brick. Also, its not a good practise to pass control to other xlators holding a lock as it can lead to deadlocks. So, this patch removes locking in rpc-clnt while notifying parent xlators. To fix [2], two changes are present in this patch: * notify DISCONNECT before cleaning up rpc connection (same as commit a6b63e11b7758cf1bfcb6798, patch [3]). * protocol/client uses rpc_clnt_cleanup_and_start, which cleans up rpc connection and does a start while handling a DISCONNECT event from rpc. Note that patch [3] was reverted as rpc_clnt_start called in quick_reconnect path of protocol/client didn't invoke connect on transport as the connection was not cleaned up _yet_ (as cleanup was moved post notification in rpc-clnt). This resulted in clients never attempting connect to bricks. Note that one of the neater ways to fix [2] (without using locks) is to introduce generation numbers to map CONNECT and DISCONNECTS across epochs and ignore DISCONNECT events if they don't belong to current epoch. However, this approach is a bit complex to implement and requires time. So, current patch is a hacky stop-gap fix till we come up with a more cleaner solution. [1] http://review.gluster.org/15916 [2] https://bugzilla.redhat.com/show_bug.cgi?id=1386626 [3] http://review.gluster.org/15681 Cherry picked from commit 773f32caf190af4ee48818279b6e6d3c9f2ecc79: > Change-Id: I62daeee8bb1430004e28558f6eb133efd4ccf418 > Signed-off-by: Raghavendra G <rgowdapp@redhat.com> > BUG: 1427012 > Reviewed-on: https://review.gluster.org/16784 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Milind Changire <mchangir@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Change-Id: I62daeee8bb1430004e28558f6eb133efd4ccf418 Reported-by: Markus Stockhausen <mst@collogia.de> Signed-off-by: Niels de Vos <ndevos@redhat.com> BUG: 1462447 Reviewed-on: https://review.gluster.org/17733 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Milind Changire <mchangir@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Diffstat (limited to 'rpc/rpc-lib/src/rpc-clnt.h')
-rw-r--r--rpc/rpc-lib/src/rpc-clnt.h4
1 files changed, 3 insertions, 1 deletions
diff --git a/rpc/rpc-lib/src/rpc-clnt.h b/rpc/rpc-lib/src/rpc-clnt.h
index 3a5b287cd49..df19a0c403f 100644
--- a/rpc/rpc-lib/src/rpc-clnt.h
+++ b/rpc/rpc-lib/src/rpc-clnt.h
@@ -150,6 +150,7 @@ struct rpc_clnt_connection {
int32_t ping_timeout;
uint64_t pingcnt;
uint64_t msgcnt;
+ uint64_t cleanup_gen;
};
typedef struct rpc_clnt_connection rpc_clnt_connection_t;
@@ -172,7 +173,6 @@ struct rpc_req {
typedef struct rpc_clnt {
pthread_mutex_t lock;
- pthread_mutex_t notifylock;
rpc_clnt_notify_t notifyfn;
rpc_clnt_connection_t conn;
void *mydata;
@@ -199,6 +199,8 @@ struct rpc_clnt *rpc_clnt_new (dict_t *options, xlator_t *owner,
int rpc_clnt_start (struct rpc_clnt *rpc);
+int rpc_clnt_cleanup_and_start (struct rpc_clnt *rpc);
+
int rpc_clnt_register_notify (struct rpc_clnt *rpc, rpc_clnt_notify_t fn,
void *mydata);