summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorAshish Pandey <aspandey@redhat.com>2018-03-13 14:03:20 +0530
committerjiffin tony Thottan <jthottan@redhat.com>2018-04-06 12:49:09 +0000
commit7061fc7e37847d0674059da0e62f42bd7032707f (patch)
treebc4d513e709f3cc6dc31d719664628a6d625dfc0
parent3487b309feb864d3bb8458357f7d72cbaee8b500 (diff)
cluster/ec: Change default read policy to gfid-hash
Problem: Whenever we read data from file over NFS, NFS reads more data then requested and caches it. Based on the stat information it makes sure that the cached/pre-read data is valid or not. Consider 4 + 2 EC volume and all the bricks are on differnt nodes. In EC, with round-robin read policy, reads are sent on different set of data bricks. This way, it balances the read fops to go on all the bricks and avoid heating UP (overloading) same set of bricks. Due to small difference in clock speed, it is possible that we get minor difference for atime, mtime or ctime for different bricks. That might cause a different stat returned to NFS based on which NFS will discard cached/pre-read data which is actually not changed and could be used. Solution: Change read policy for EC as gfid-hash. That will force all the read to go to same set of bricks. >Change-Id: I825441cc519e94bf3dc3aa0bd4cb7c6ae6392c84 >BUG: 1554743 >Signed-off-by: Ashish Pandey <aspandey@redhat.com> Change-Id: I825441cc519e94bf3dc3aa0bd4cb7c6ae6392c84 BUG: 1558352 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
-rw-r--r--tests/basic/ec/ec-read-policy.t7
-rw-r--r--xlators/cluster/ec/src/ec.c2
2 files changed, 4 insertions, 5 deletions
diff --git a/tests/basic/ec/ec-read-policy.t b/tests/basic/ec/ec-read-policy.t
index e4390aa07cb..fe6fe6576e7 100644
--- a/tests/basic/ec/ec-read-policy.t
+++ b/tests/basic/ec/ec-read-policy.t
@@ -20,10 +20,9 @@ TEST $CLI volume start $V0
TEST glusterfs --direct-io-mode=yes --entry-timeout=0 --attribute-timeout=0 -s $H0 --volfile-id $V0 $M0
EXPECT_WITHIN $CHILD_UP_TIMEOUT "6" ec_child_up_count $V0 0
#TEST volume operations work fine
-EXPECT "round-robin" mount_get_option_value $M0 $V0-disperse-0 read-policy
-TEST $CLI volume set $V0 disperse.read-policy gfid-hash
-EXPECT_WITHIN $CONFIG_UPDATE_TIMEOUT "gfid-hash" mount_get_option_value $M0 $V0-disperse-0 read-policy
-TEST $CLI volume reset $V0 disperse.read-policy
+
+EXPECT "gfid-hash" mount_get_option_value $M0 $V0-disperse-0 read-policy
+TEST $CLI volume set $V0 disperse.read-policy round-robin
EXPECT_WITHIN $CONFIG_UPDATE_TIMEOUT "round-robin" mount_get_option_value $M0 $V0-disperse-0 read-policy
#TEST if the option gives the intended behavior. The way we perform this test
diff --git a/xlators/cluster/ec/src/ec.c b/xlators/cluster/ec/src/ec.c
index 6e9d3e97357..0d616296aff 100644
--- a/xlators/cluster/ec/src/ec.c
+++ b/xlators/cluster/ec/src/ec.c
@@ -1452,7 +1452,7 @@ struct volume_options options[] =
{ .key = {"read-policy" },
.type = GF_OPTION_TYPE_STR,
.value = {"round-robin", "gfid-hash"},
- .default_value = "round-robin",
+ .default_value = "gfid-hash",
.description = "inode-read fops happen only on 'k' number of bricks in"
" n=k+m disperse subvolume. 'round-robin' selects the read"
" subvolume using round-robin algo. 'gfid-hash' selects read"