From f9c59e29ccd770ae212da76b5e6f6ce3d8d09e61 Mon Sep 17 00:00:00 2001
From: Prasanna Kumar Kalever
Date: Wed, 27 Apr 2016 19:12:19 +0530
Subject: glusterd: add defence mechanism to avoid brick port clashes

Intro:
Currently glusterd maintains a portmap registry which contains the ports
that are free to use between 49152 - 65535. This registry is initialized
once and updated as and when glusterd sees that ports are being used.

Glusterd first looks in the portmap registry for a port marked FREE,
then checks whether that port is currently free using connect(), and
then passes it to the brick process, which has to bind to it.

Problem:
There is a time gap between glusterd checking the port with connect()
and the brick process actually binding to it. In this gap another
process may occupy the port, in which case the brick fails to bind and
exits.

Case 1:
To avoid a gluster client process occupying the port supplied by
glusterd:
we have separated the client port map range from the brick port map
range. For details see http://review.gluster.org/#/c/13998/

Case 2: (Handled by this patch)
To avoid a foreign process occupying the port supplied by glusterd:
this patch implements a mechanism to return the EADDRINUSE error code to
glusterd, upon which glusterd allocates a new port and tries to restart
the brick process with the newly allocated port.

Note: In the case of a glusterd restart, i.e. the runner_run_nowait()
path, there is no way to handle Case 2, because runner_run_nowait() does
not wait for the return/exit code of the executed command (the brick
process). Hence, as of now, in that case we cannot know with what error
the brick failed.

This patch also fixes runner_end() to perform some cleanup w.r.t. return
values.

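The retry described under Case 2 can be sketched roughly as follows. This
is only an illustration under stated assumptions, not the glusterd code:
start_with_port_retry() and fake_brick_start() are hypothetical names, and
the "next port" is obtained by simply incrementing, whereas glusterd takes
it from its portmap registry.

```c
#include <errno.h>

#define PORT_MIN 49152  /* brick port range given in the commit message */
#define PORT_MAX 65535

/* Hypothetical stand-in for launching a brick. It pretends that a foreign
 * process already holds PORT_MIN and reports that as -EADDRINUSE. */
static int
fake_brick_start (int port)
{
        return (port == PORT_MIN) ? -EADDRINUSE : 0;
}

/* Try to start on first_port; on -EADDRINUSE move on to the next port.
 * Returns the port that worked, or a negative error code. Any failure
 * other than -EADDRINUSE is returned immediately without a retry. */
static int
start_with_port_retry (int (*start) (int port), int first_port,
                       int max_tries)
{
        int port = first_port;
        int ret  = -EADDRINUSE;
        int i    = 0;

        for (i = 0; i < max_tries && port <= PORT_MAX; i++, port++) {
                ret = start (port);
                if (ret == 0)
                        return port;
                if (ret != -EADDRINUSE)
                        return ret;
        }

        return ret;
}
```

Bounding the number of attempts (max_tries, an assumption of this sketch)
keeps a persistently busy range from turning into an endless restart loop.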
Backport of:
> Change-Id: Iec52e7f5d87ce938d173f8ef16aa77fd573f2c5e
> BUG: 1322805
> Signed-off-by: Prasanna Kumar Kalever
> Reviewed-on: http://review.gluster.org/14043
> Tested-by: Prasanna Kumar Kalever
> Reviewed-by: Atin Mukherjee
> Smoke: Gluster Build System
> NetBSD-regression: NetBSD Build System
> CentOS-regression: Gluster Build System
> Reviewed-by: Raghavendra G
> Signed-off-by: Prasanna Kumar Kalever

Change-Id: Ief247b4d4538c1ca03e73aa31beb5fa99853afd6
BUG: 1323564
Signed-off-by: Prasanna Kumar Kalever
Reviewed-on: http://review.gluster.org/14208
Tested-by: Prasanna Kumar Kalever
Smoke: Gluster Build System
NetBSD-regression: NetBSD Build System
CentOS-regression: Gluster Build System
Reviewed-by: Raghavendra G
---
 libglusterfs/src/run.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

(limited to 'libglusterfs/src/run.c')

diff --git a/libglusterfs/src/run.c b/libglusterfs/src/run.c
index 7c237b35fa0..17da3ad1201 100644
--- a/libglusterfs/src/run.c
+++ b/libglusterfs/src/run.c
@@ -338,13 +338,13 @@ int
 runner_end_reuse (runner_t *runner)
 {
         int i = 0;
-        int ret = -1;
+        int ret = 1;
         int chstat = 0;
 
         if (runner->chpid > 0) {
                 if (waitpid (runner->chpid, &chstat, 0) == runner->chpid) {
                         if (WIFEXITED(chstat)) {
-                                ret = -WEXITSTATUS(chstat);
+                                ret = WEXITSTATUS(chstat);
                         } else {
                                 ret = chstat;
                         }
@@ -358,7 +358,7 @@ runner_end_reuse (runner_t *runner)
                 }
         }
 
-        return ret;
+        return -ret;
 }
 
 int
@@ -387,8 +387,12 @@ runner_run_generic (runner_t *runner, int (*rfin)(runner_t *runner))
         int ret = 0;
 
         ret = runner_start (runner);
+        if (ret)
+                goto out;
+        ret = rfin (runner);
 
-        return -(rfin (runner) || ret);
+out:
+        return ret;
 }
 
 int
--
cgit
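
For readers of the run.c hunks above, here is a minimal standalone sketch
of the return convention that runner_end_reuse() follows after this
change: 0 when the child exits successfully, otherwise the negated exit
status, so a caller can compare the result against -EADDRINUSE. The
helper names status_to_ret() and spawn_and_wait() are hypothetical and do
not exist in libglusterfs; the child here just exits with a chosen code
instead of running a brick.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: map a waitpid() status to the convention used by
 * the patched runner_end_reuse(): negated exit status for a normal exit,
 * negated raw status otherwise (e.g. the child was killed by a signal). */
static int
status_to_ret (int chstat)
{
        int ret = 1;

        if (WIFEXITED (chstat))
                ret = WEXITSTATUS (chstat);
        else
                ret = chstat;

        return -ret;
}

/* Hypothetical stand-in for starting a brick and waiting for it: the
 * child exits immediately with exit_code. Returns -1 if fork() or
 * waitpid() fails. */
static int
spawn_and_wait (int exit_code)
{
        int   chstat = 0;
        pid_t pid    = fork ();

        if (pid < 0)
                return -1;
        if (pid == 0)
                _exit (exit_code);
        if (waitpid (pid, &chstat, 0) != pid)
                return -1;

        return status_to_ret (chstat);
}
```

Only the low 8 bits of an exit code survive waitpid(), so this convention
assumes the errno value of interest fits in that range, which holds for
EADDRINUSE on common platforms.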