From 9df752213e6f8a1cc9a5e875cf68ca8ef32f61db Mon Sep 17 00:00:00 2001 From: Poornima G Date: Tue, 19 Jul 2016 15:20:09 +0530 Subject: gfapi: Fix IO error caused when there is consecutive graph switches Backport of http://review.gluster.org/#/c/14722/ This is part 2 of the fix, the part 1 can be found at: http://review.gluster.org/#/c/14656/ Problem: ======= Consider a race between, __glfs_active_subvol() and graph_setup(). Lets say @TIME T1: fs->active_subvol = A fs->next_subvol = B __glfs_active_subvol() //under lock fs->mutex { .... new_subvol = fs->next_subvol //which is B .... //Start migration from A to B __glfs_first_lookup(){ .... unlock fs->mutex //@TIME T2 network fop lock fs->mutex .... } .... //migration continue on B fs->active_subvol = fs->next_subvol //which is C (explained below) .... } @Time T2, lets say in another thread, graph_setup() is called with C, note that at T2, fs->mutex is unlocked. graph_stup(C...) { lock fs->mutex .... if (fs->next_subvol) // which is B destroy subvol (fs->next_subvol) .... fs->next_subvol = C .... unlock fs->mutex } Thus at the end of this, fs->old_subvol = A; fs->active_subvol = C; fs->next_subvol = NULL; which is wrong, as B completed migration, but was destroyed by graph_setup, and C never was migrated. Solution: ========= Any new graph can be in one of the 2 states: - Picked for migration, migration in progress (fs->mip_subvol) - Not picked so far for migration (fs->next_subvol) graph_setup() updates fs->next_subvol only, __glfs_active_subvol() moves fs->next_subvol to fs->mip_subvol and fs->next_subvol = NULL atomically, and then once the migration is complete, make that the fs->active_subvol > Reviewed-on: http://review.gluster.org/14722 > Smoke: Gluster Build System > CentOS-regression: Gluster Build System > NetBSD-regression: NetBSD Build System > Reviewed-by: Raghavendra Talur > Reviewed-by: Rajesh Joseph > Reviewed-by: Niels de Vos BUG: 1367294 Change-Id: Ib6ff0565105c5eedb912a43da4017cd413243612 Signed-off-by: Poornima G Signed-off-by: Oleksandr Natalenko Reviewed-on: http://review.gluster.org/15167 Smoke: Gluster Build System NetBSD-regression: NetBSD Build System CentOS-regression: Gluster Build System Reviewed-by: Kaushal M --- api/src/glfs-resolve.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) (limited to 'api/src/glfs-resolve.c') diff --git a/api/src/glfs-resolve.c b/api/src/glfs-resolve.c index 4c44b4f7ac8..b31507a949b 100644 --- a/api/src/glfs-resolve.c +++ b/api/src/glfs-resolve.c @@ -825,6 +825,13 @@ __glfs_migrate_openfds (struct glfs *fs, xlator_t *subvol) } +/* Note that though it appears that this function executes under fs->mutex, + * it is not fully executed under fs->mutex. i.e. there are functions like + * __glfs_first_lookup, __glfs_refresh_inode, __glfs_migrate_openfds which + * unlocks fs->mutex before sending any network fop, and reacquire fs->mutex + * once the fop is complete. Hence the variable read from fs at the start of the + * function need not have the same value by the end of the function. + */ xlator_t * __glfs_active_subvol (struct glfs *fs) { @@ -835,7 +842,8 @@ __glfs_active_subvol (struct glfs *fs) if (!fs->next_subvol) return fs->active_subvol; - new_subvol = fs->next_subvol; + new_subvol = fs->mip_subvol = fs->next_subvol; + fs->next_subvol = NULL; ret = __glfs_first_lookup (fs, new_subvol); if (ret) { @@ -869,8 +877,8 @@ __glfs_active_subvol (struct glfs *fs) should be atomic */ fs->old_subvol = fs->active_subvol; - fs->active_subvol = fs->next_subvol; - fs->next_subvol = NULL; + fs->active_subvol = fs->mip_subvol; + fs->mip_subvol = NULL; if (new_cwd) { __glfs_cwd_set (fs, new_cwd); -- cgit