| field | value | date |
|---|---|---|
| author | Ravishankar N <ravishankar@redhat.com> | 2015-05-05 10:07:13 +0530 |
| committer | Humble Devassy Chirammal <humble.devassy@gmail.com> | 2015-05-05 02:27:57 -0700 |
| commit | b7da3d569dfa6d6c27480a74a345cf0e166e3fef | |
| tree | 7a377326a10e87bc8cacf751f552f103dc1c2d39 /doc | |
| parent | 9ddea81a3b6f557c899c90ec84ac8463616e0bc5 | |
doc: AFR arbiter volume usage
Contains information on creation and behaviour of replica 3 arbiter volumes.
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Change-Id: I6af4aa3488649686fdb9b839c733046160e0785b
BUG: 1199985
Reviewed-on: http://review.gluster.org/10541
Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
Tested-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
Diffstat (limited to 'doc')
| mode | file | lines changed |
|---|---|---|
| -rw-r--r-- | doc/admin-guide/en-US/markdown/admin_setting_volumes.md | 7 |
| -rw-r--r-- | doc/features/afr-arbiter-volumes.md | 53 |
2 files changed, 60 insertions, 0 deletions
diff --git a/doc/admin-guide/en-US/markdown/admin_setting_volumes.md b/doc/admin-guide/en-US/markdown/admin_setting_volumes.md
index e58bb63ab23..d66a6894152 100644
--- a/doc/admin-guide/en-US/markdown/admin_setting_volumes.md
+++ b/doc/admin-guide/en-US/markdown/admin_setting_volumes.md
@@ -266,6 +266,13 @@ high-availability and high-reliability are critical.
     >  Use the `force` option at the end of command if you want to create the volume in this case.
+###Arbiter configuration for replica volumes
+Arbiter  volumes are replica 3 volumes where the 3rd brick acts as the arbiter brick. This configuration has mechanisms that prevent occurrence of split-brains.
+It can be created with the following command:
+`# gluster volume create  <VOLNAME>  replica 3 arbiter 1 host1:brick1 host2:brick2 host3:brick3`
+More information about this configuration can be found at `doc/features/afr-arbiter-volumes.md`
+Note that the arbiter configuration for replica 3 can be used to create distributed-replicate volumes as well.
+
 ##Creating Striped Volumes
 Striped volumes stripes data across bricks in the volume. For best
diff --git a/doc/features/afr-arbiter-volumes.md b/doc/features/afr-arbiter-volumes.md
new file mode 100644
index 00000000000..1348e5645b8
--- /dev/null
+++ b/doc/features/afr-arbiter-volumes.md
@@ -0,0 +1,53 @@
+Usage guide: Replicate volumes with arbiter configuration
+==========================================================
+Arbiter volumes are replica 3 volumes where the 3rd brick of the replica is
+automatically configured as an arbiter node. What this means is that the 3rd
+brick will store only the file name and metadata, but does not contain any data.
+This configuration is helpful in avoiding split-brains while providing the same
+level of consistency as a normal replica 3 volume.
+
+The arbiter volume can be created with the following command:
+`gluster volume create <VOLNAME>  replica 3 arbiter 1 host1:brick1 host2:brick2 host3:brick3`
+
+Note that the syntax is similar to creating a normal replica 3 volume with the
+exception of the `arbiter 1` keyword. As seen in the command above, the only
+permissible values for the replica count and arbiter count are 3 and 1
+respectively. Also, the 3rd brick is always chosen as the arbiter brick and it
+is not configurable to have any other brick as the arbiter.
+
+Client/ Mount behaviour:
+========================
+By default, client quorum (`cluster.quorum-type`) is set to `auto` for a replica
+3 volume when it is created;  i.e. at least 2 bricks need to be up to satisfy
+quorum and to allow writes. This setting is not to be changed for arbiter
+volumes also. Additionally, the arbiter volume has additional some checks to
+prevent files from ending up in split-brain:
+
+* Clients take full file locks when writing to a file as opposed to range locks
+  in a normal replica 3 volume.
+
+* If 2 bricks are up and if one of them is the arbiter (i.e. the 3rd brick) *and*
+  it blames the other up brick, then all FOPS will fail with ENOTCONN (Transport
+  endpoint is not connected). IF the arbiter doesn't blame the other brick,
+  FOPS will be allowed to proceed. 'Blaming' here is w.r.t the values of AFR
+  changelog extended attributes.
+
+* If 2 bricks are up and the arbiter is down, then FOPS will be allowed.
+
+* In all cases, if there is only one source before the FOP is initiated and if
+  the FOP fails on that source, the application will receive ENOTCONN.
+
+Note: It is possible to see if a replica 3 volume has arbiter configuration from
+the mount point. If
+`$mount_point/.meta/graphs/active/$V0-replicate-0/options/arbiter-count` exists
+and its value is 1, then it is an arbiter volume. Also the client volume graph
+will have arbiter-count as a xlator option for AFR translators.
+
+Self-heal daemon behaviour:
+===========================
+Since the arbiter brick does not store any data for the files, data-self-heal
+from the arbiter brick will not take place. For example if there are 2 source
+bricks B2 and B3 (B3 being arbiter brick) and B2 is down, then data-self-heal
+will *not* happen from B3 to sink brick B1, and will be pending until B2 comes
+up and heal can happen from it. Note that  metadata and entry self-heals can
+still happen from B3 if it is one of the sources.
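
Reviewer note: the mount-point check described in the new `afr-arbiter-volumes.md` (the `arbiter-count` file under `.meta`) could be scripted roughly as below. This is only a minimal sketch, not part of the patch. The function name `is_arbiter_volume` and its two arguments are hypothetical, and it assumes the option file holds just the count, possibly followed by whitespace.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with GlusterFS: succeeds if the first
# replicate subvolume of a mounted volume reports arbiter-count = 1.
# Usage: is_arbiter_volume <mount_point> <volname>
is_arbiter_volume() {
    mount_point=$1
    volname=$2
    # Path taken from the note in afr-arbiter-volumes.md ($V0 is the volume name).
    opt_file="$mount_point/.meta/graphs/active/${volname}-replicate-0/options/arbiter-count"

    # The option file must exist ...
    [ -f "$opt_file" ] || return 1

    # ... and its value must be 1 (assumption: the file contains only the
    # number, so surrounding whitespace is stripped before comparing).
    count=$(tr -d '[:space:]' < "$opt_file")
    [ "$count" = "1" ]
}
```

For example, with a volume mounted at a hypothetical `/mnt/glusterfs` and named `testvol`, `is_arbiter_volume /mnt/glusterfs testvol && echo "arbiter volume"` would print the message only when the file exists and reads 1. Only `replicate-0` is inspected, so a distributed-replicate volume would need the same check repeated per replicate subvolume.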
