diff options
author | Mohit Agrawal <moagrawa@redhat.com> | 2018-03-12 19:43:15 +0530 |
---|---|---|
committer | Raghavendra G <rgowdapp@redhat.com> | 2018-04-19 04:31:51 +0000 |
commit | 0043c63f70776444f69667a4ef9596217ecb42b7 (patch) | |
tree | e6c239e4b27198d40bca329edcce317ded59de09 /xlators/protocol/server/src/server-handshake.c | |
parent | be26b0da2f1a7fe336400de6a1c016716983bd38 (diff) |
gluster: Sometimes Brick process is crashed at the time of stopping brick
Problem: Sometimes brick process is getting crashed at the time
of stop brick while brick mux is enabled.
Solution: Brick process was getting crashed because of rpc connection
was not cleaning properly while brick mux is enabled.In this patch
after sending GF_EVENT_CLEANUP notification to xlator(server)
waits for all rpc client connection destroy for specific xlator.Once rpc
connections are destroyed in server_rpc_notify for all associated client
for that brick then call xlator_mem_cleanup for for brick xlator as well as
all child xlators.To avoid races at the time of cleanup introduce
two new flags at each xlator cleanup_starting, call_cleanup.
BUG: 1544090
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Note: Run all test-cases in separate build (https://review.gluster.org/#/c/19700/)
with same patch after enable brick mux forcefully, all test cases are
passed.
Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf
updates: bz#1544090
Diffstat (limited to 'xlators/protocol/server/src/server-handshake.c')
-rw-r--r-- | xlators/protocol/server/src/server-handshake.c | 25 |
1 files changed, 24 insertions, 1 deletions
diff --git a/xlators/protocol/server/src/server-handshake.c b/xlators/protocol/server/src/server-handshake.c index de90a6b8eda..08f76de9748 100644 --- a/xlators/protocol/server/src/server-handshake.c +++ b/xlators/protocol/server/src/server-handshake.c @@ -474,6 +474,7 @@ server_setvolume (rpcsvc_request_t *req) struct _child_status *tmp = NULL; char *subdir_mount = NULL; char *client_name = NULL; + gf_boolean_t cleanup_starting = _gf_false; params = dict_new (); reply = dict_new (); @@ -575,11 +576,13 @@ server_setvolume (rpcsvc_request_t *req) "initialised yet. Try again later"); goto fail; } + list_for_each_entry (tmp, &conf->child_status->status_list, status_list) { if (strcmp (tmp->name, name) == 0) break; } + if (!tmp->name) { gf_msg (this->name, GF_LOG_ERROR, 0, PS_MSG_CHILD_STATUS_FAILED, @@ -593,6 +596,7 @@ server_setvolume (rpcsvc_request_t *req) "Failed to set 'child_up' for xlator %s " "in the reply dict", tmp->name); } + ret = dict_get_str (params, "process-uuid", &client_uid); if (ret < 0) { ret = dict_set_str (reply, "ERROR", @@ -634,8 +638,27 @@ server_setvolume (rpcsvc_request_t *req) goto fail; } - if (req->trans->xl_private != client) + pthread_mutex_lock (&conf->mutex); + if (xl->cleanup_starting) { + cleanup_starting = _gf_true; + } else if (req->trans->xl_private != client) { req->trans->xl_private = client; + } + pthread_mutex_unlock (&conf->mutex); + + if (cleanup_starting) { + op_ret = -1; + op_errno = EAGAIN; + + ret = dict_set_str (reply, "ERROR", + "cleanup flag is set for xlator. " + " Try again later"); + if (ret < 0) + gf_msg_debug (this->name, 0, "failed to set error: " + "cleanup flag is set for xlator. " + "Try again later"); + goto fail; + } auth_set_username_passwd (params, config_params, client); if (req->trans->ssl_name) { |