diff options
author | Ravishankar N <ravishankar@redhat.com> | 2016-05-30 09:56:38 +0530 |
---|---|---|
committer | Niels de Vos <ndevos@redhat.com> | 2016-06-01 04:29:00 -0700 |
commit | fb899d3ef58319e3f583e6abcea52b6b0e47b73b (patch) | |
tree | 8e20ec1f550530d98b0f97180d54bea47f9b3da7 /done | |
parent | 230e3a3fab63421176400af40d41da7f24e576d7 (diff) |
add feature page for Automagic unsplit-brain in afr
Change-Id: I2bc990ee7d1de3a6d3c9d1c235741dd7efadec24
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/14554
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Niels de Vos <ndevos@redhat.com>
Diffstat (limited to 'done')
-rw-r--r-- | done/GlusterFS 3.8/automagic-unsplit-brain.md | 132 |
1 files changed, 132 insertions, 0 deletions
diff --git a/done/GlusterFS 3.8/automagic-unsplit-brain.md b/done/GlusterFS 3.8/automagic-unsplit-brain.md new file mode 100644 index 0000000..bc9d5d6 --- /dev/null +++ b/done/GlusterFS 3.8/automagic-unsplit-brain.md @@ -0,0 +1,132 @@ +# Automagic unsplit-brain by [ctime|mtime|size|majority] + +## Summary +A new volume option 'cluster.favorite-child-policy' is introduced which will automatically resolve split-brains by +choosing a particular brick as the good copy based on the value (policy) set. + +## Owners +Ravishankar N <ravishankar@redhat.com> +The patch is a rework of the one submitted by Richard Wareing from facebook. + +## Current status +Patch merged in master: http://review.gluster.org/#/c/14026/ +Patch merged in 3.8 http://review.gluster.org/#/c/14535/ + +## Related Feature Requests and Bugs +3.8 BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1339639 +Original BZ to which facebook's patch was attached: https://bugzilla.redhat.com/show_bug.cgi?id=1262161 + +## Detailed Description +In a replicate volume, when a file ends up in split-brain, accessing them from the client results in input/output error. +To resolve split-brains, the user/admin needs to use the gluster CLI commands or use the virtual setfattr commands to choose a particular +copy of the file as source and trigger heal. Until such manual intervention happens accessing the file fails with EIO. + +With the 'cluster.favorite-child-policy', users can set a policy for AFR to automatically pick a source when the file ends in split-brain and do the heal. +This means they no longer get EIO when trying to acess files and the split-brains get resolved automatically. The various policies available are: +* none: This is the default value. When set, there is no automatic resolution of split-brains. +* ctime: Selects the file with the highest ctime as the source. +* mtime: Selects the file with the highest mtime as the source. +* size: Selects the file with the biggest file size as the source. +* majority: Selects a file with identical mtime and size in more than half the number of bricks in the replica as the source. + +This is a volume wide option, i.e. the same policy will be applied to all split-brained files of the volume. + +## Benefit to GlusterFS +No manual intervention required to fix split-brains. + +## Scope + +### Nature of proposed change +Code changes for handling the various policies for the option is done in AFR. + +### Implications on manageability +New volume option 'cluster.favorite-child-policy' is introduced. + +### Implications on presentation layer +None. + +### Implications on persistence layer +None. + +### Implications on 'GlusterFS' backend +None. + +### Modification to GlusterFS metadata +None. + +### Implications on 'glusterd' +Just the introduction of the volume option. + +## How To Test +Create files in data/ metadata split-brain, use the volume set command to set various policies and see if split-brain heal happens according to the policy. The [.t file](https://github.com/gluster/glusterfs/blob/2f29065/tests/basic/afr/split-brain-favorite-child-policy.t) in the patch contains test cases. +Here is an example of how the volume option can be used: + +### 1. A replica 2 volume that has '/file' in split-brain: +```[root@dhcp42-116 ~]# gluster v heal testvol info +Brick 127.0.0.2:/brick/brick1 +/file - Is in split-brain + +Status: Connected +Number of entries: 1 + +Brick 127.0.0.2:/brick/brick2 +<gfid:fa6f2ab2-722e-4cf3-9f75-662c70be3f58> - Is in split-brain + +Status: Connected +Number of entries: 1 +``` + +### 2. The file size in the backend is different: +```[root@dhcp42-116 ~]# ll /brick/brick*/file +-rw-r--r-- 2 root root 1048576 May 30 12:59 /brick/brick1/file +-rw-r--r-- 2 root root 1024 May 30 12:58 /brick/brick2/file +``` + +### 3. Set the policy to heal based on bigger size: +``` +[root@dhcp42-116 ~]# gluster volume set testvol cluster.favorite-child-policy size +volume set: success +``` + +### 4. Launch heal: +``` +[root@dhcp42-116 ~]# gluster volume heal testvol +Launching heal operation to perform index self heal on volume testvol has been successful +Use heal info commands to check status + +``` + +### 5. Check heal info output again to verify file has been healed: +``` +[root@dhcp42-116 ~]# gluster v heal testvol info +Brick 127.0.0.2:/brick/brick1 +Status: Connected +Number of entries: 0 + +Brick 127.0.0.2:/brick/brick2 +Status: Connected +Number of entries: 0 +``` + +### 6. Check in the backend that the bigger file has been used as source: +```[root@dhcp42-116 ~]# ll /brick/brick*/file +-rw-r--r-- 2 root root 1048576 May 30 12:59 /brick/brick1/file +-rw-r--r-- 2 root root 1048576 May 30 12:59 /brick/brick2/file +``` + + +## User Experience +New CLI volume option 'cluster.favorite-child-policy' + +## Dependencies +None. + +## Documentation +ToDo. + +## Status +Patch merged. + +## Comments and Discussion + + |