<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/basic/afr, branch v4.1.2</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>afr: add new value for read-hash-mode volume option</title>
<updated>2018-03-29T07:37:04+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2018-03-22T12:25:15+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=c87bd439ef12adc70dc580e75304121c3cd38e9a'/>
<id>c87bd439ef12adc70dc580e75304121c3cd38e9a</id>
<content type='text'>
Updates: #363

This new value (3) will try to wind read requests to the child of AFR
having the least amount of pending requests in its queue.

Change-Id: If6bda2aac9bf7aec3fc39622f78659313c4b6508
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Updates: #363

This new value (3) will try to wind read requests to the child of AFR
having the least amount of pending requests in its queue.

Change-Id: If6bda2aac9bf7aec3fc39622f78659313c4b6508
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Switch to active-fd-count for open-fd checks</title>
<updated>2018-03-21T05:06:32+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2018-03-19T09:56:40+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=448dec703d603a150dc1f1cc231c8389ab2fb2ea'/>
<id>448dec703d603a150dc1f1cc231c8389ab2fb2ea</id>
<content type='text'>
BUG: 1557932
Change-Id: I3783e41b3812267bc10c0d05d062a31396ce135b
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
BUG: 1557932
Change-Id: I3783e41b3812267bc10c0d05d062a31396ce135b
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Remove compound-fops usage in afr</title>
<updated>2018-03-06T07:55:25+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2018-03-02T04:43:20+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=9be043159a514db68b336c6aea49613f3125c5e0'/>
<id>9be043159a514db68b336c6aea49613f3125c5e0</id>
<content type='text'>
We are not seeing much improvement with this change. So removing the
feature so that it doesn't need to be maintained anymore.

Fixes: #414
Change-Id: Ic7969b151544daf2547bd262a9fa03f575626411
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We are not seeing much improvement with this change. So removing the
feature so that it doesn't need to be maintained anymore.

Fixes: #414
Change-Id: Ic7969b151544daf2547bd262a9fa03f575626411
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: bring option of per test timeout</title>
<updated>2018-02-15T08:20:19+00:00</updated>
<author>
<name>Amar Tumballi</name>
<email>amarts@redhat.com</email>
</author>
<published>2018-01-18T17:54:04+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=9d0d1fdd091d754149242fd4389b964695aacf13'/>
<id>9d0d1fdd091d754149242fd4389b964695aacf13</id>
<content type='text'>
This uses 'timeout' command with 300 seconds default. Right now,
there is just 1 test which takes more than that in a properly
setup machine.

Ideally best case is set the default to something like 30 seconds,
and if a test is supposed to take more than that, owner should add
a timeout line to test knowingly. That way, it makes test writers
think about a time limit too.

Change-Id: I747005ce1f208aeb2ecbf899e8feea487ecd21a0
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This uses 'timeout' command with 300 seconds default. Right now,
there is just 1 test which takes more than that in a properly
setup machine.

Ideally best case is set the default to something like 30 seconds,
and if a test is supposed to take more than that, owner should add
a timeout line to test knowingly. That way, it makes test writers
think about a time limit too.

Change-Id: I747005ce1f208aeb2ecbf899e8feea487ecd21a0
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: don't treat all cases all bricks being blamed as split-brain</title>
<updated>2018-02-01T14:17:50+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2018-01-28T08:20:47+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=0e6e8216823c2d9dafb81aae0f6ee3497c23d140'/>
<id>0e6e8216823c2d9dafb81aae0f6ee3497c23d140</id>
<content type='text'>
Problem:
We currently don't have a roll-back/undoing of post-ops if quorum is not
met. Though the FOP is still unwound with failure, the xattrs remain on
the disk.  Due to these partial post-ops and partial heals (healing only when
2 bricks are up), we can end up in split-brain purely from the afr
xattrs point of view i.e each brick is blamed by atleast one of the
others. These scenarios are hit when there is frequent
connect/disconnect of the client/shd to the bricks while I/O or heal
are in progress.

Fix:
Instead of undoing the post-op, pick a source based on the xattr values.
If 2 bricks blame one, the blamed one must be treated as sink.
If there is no majority, all are sources. Once we pick a source,
self-heal will then do the heal instead of erroring out due to
split-brain.

Change-Id: I3d0224b883eb0945785ade0e9697a1c828aec0ae
BUG: 1539358
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
We currently don't have a roll-back/undoing of post-ops if quorum is not
met. Though the FOP is still unwound with failure, the xattrs remain on
the disk.  Due to these partial post-ops and partial heals (healing only when
2 bricks are up), we can end up in split-brain purely from the afr
xattrs point of view i.e each brick is blamed by atleast one of the
others. These scenarios are hit when there is frequent
connect/disconnect of the client/shd to the bricks while I/O or heal
are in progress.

Fix:
Instead of undoing the post-op, pick a source based on the xattr values.
If 2 bricks blame one, the blamed one must be treated as sink.
If there is no majority, all are sources. Once we pick a source,
self-heal will then do the heal instead of erroring out due to
split-brain.

Change-Id: I3d0224b883eb0945785ade0e9697a1c828aec0ae
BUG: 1539358
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: Renable basic/afr/split-brain-favorite-child-policy.t</title>
<updated>2017-11-29T08:51:28+00:00</updated>
<author>
<name>Nigel Babu</name>
<email>nigelb@redhat.com</email>
</author>
<published>2017-11-29T07:15:15+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=e05d8eb96b9f06c39660caa9a613d6064f46f3f2'/>
<id>e05d8eb96b9f06c39660caa9a613d6064f46f3f2</id>
<content type='text'>
This test was failing due to an infra issue. The infra issue is now
fixed.

BUG: 1517961
Change-Id: I09dfab9c0a3ebe73c738222e6269d9e35c85eddb
Signed-off-by: Nigel Babu &lt;nigelb@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This test was failing due to an infra issue. The infra issue is now
fixed.

BUG: 1517961
Change-Id: I09dfab9c0a3ebe73c738222e6269d9e35c85eddb
Signed-off-by: Nigel Babu &lt;nigelb@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: mark currently failing regression tests as known issues</title>
<updated>2017-11-28T13:04:43+00:00</updated>
<author>
<name>Amar Tumballi</name>
<email>amarts@redhat.com</email>
</author>
<published>2017-11-27T18:26:50+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=858fae39936e5aee5ea4e3816a10ba310d04cf61'/>
<id>858fae39936e5aee5ea4e3816a10ba310d04cf61</id>
<content type='text'>
Change-Id: If6c36dc6c395730dfb17b5b4df6f24629d904926
BUG: 1517961
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: If6c36dc6c395730dfb17b5b4df6f24629d904926
BUG: 1517961
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: add checks for allowing lookups</title>
<updated>2017-11-18T00:38:47+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2017-08-16T12:31:17+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=bd44d59741bb8c0f5d7a62c5b1094179dd0ce8a4'/>
<id>bd44d59741bb8c0f5d7a62c5b1094179dd0ce8a4</id>
<content type='text'>
Problem:
In an arbiter volume, lookup was being served from one of the sink
bricks (source brick was down). shard uses the iatt values from lookup cbk
to calculate the size and block count, which in this case were incorrect
values. shard_local_t-&gt;last_block was thus initialised to -1, resulting
in an infinite while loop in shard_common_resolve_shards().

Fix:
Use client quorum logic to allow or fail the lookups from afr if there
are no readable subvolumes. So in replica-3 or arbiter vols, if there is
no good copy or if quorum is not met, fail lookup with ENOTCONN.

With this fix, we are also removing support for quorum-reads xlator
option. So if quorum is not met, neither read nor write txns are allowed
and we fail the fop with ENOTCONN.

Change-Id: Ic65c00c24f77ece007328b421494eee62a505fa0
BUG: 1467250
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
In an arbiter volume, lookup was being served from one of the sink
bricks (source brick was down). shard uses the iatt values from lookup cbk
to calculate the size and block count, which in this case were incorrect
values. shard_local_t-&gt;last_block was thus initialised to -1, resulting
in an infinite while loop in shard_common_resolve_shards().

Fix:
Use client quorum logic to allow or fail the lookups from afr if there
are no readable subvolumes. So in replica-3 or arbiter vols, if there is
no good copy or if quorum is not met, fail lookup with ENOTCONN.

With this fix, we are also removing support for quorum-reads xlator
option. So if quorum is not met, neither read nor write txns are allowed
and we fail the fop with ENOTCONN.

Change-Id: Ic65c00c24f77ece007328b421494eee62a505fa0
BUG: 1467250
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: Update tier CLI in .t files</title>
<updated>2017-10-30T15:50:44+00:00</updated>
<author>
<name>N Balachandran</name>
<email>nbalacha@redhat.com</email>
</author>
<published>2017-10-23T07:12:32+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=62dbefde4bc5c8d9dcfb10f3be5d349db341bb30'/>
<id>62dbefde4bc5c8d9dcfb10f3be5d349db341bb30</id>
<content type='text'>
Update .t tier tests to use the new tier CLI.

Change-Id: I0e7f1769071108d8266fc86378c4466bcaf96e7d
BUG: 1505253
Signed-off-by: N Balachandran &lt;nbalacha@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Update .t tier tests to use the new tier CLI.

Change-Id: I0e7f1769071108d8266fc86378c4466bcaf96e7d
BUG: 1505253
Signed-off-by: N Balachandran &lt;nbalacha@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Fail open on split-brain</title>
<updated>2017-10-26T18:23:35+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2017-09-04T11:27:25+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=786343abca3474ff01aa1017210112d97cbc4843'/>
<id>786343abca3474ff01aa1017210112d97cbc4843</id>
<content type='text'>
Problem:
Append on a file with split-brain succeeds. Open is intercepted by open-behind,
when write comes on the file, open-behind does open+write. Open succeeds
because afr doesn't fail it. Then write succeeds because write-behind
intercepts it. Flush is also intercepted by write-behind, so the application
never gets to know that the write failed.

Fix:
Fail open on split-brain, so that when open-behind does open+write open fails
which leads to write failure. Application will know about this failure.

Change-Id: I4bff1c747c97bb2925d6987f4ced5f1ce75dbc15
BUG: 1294051
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Append on a file with split-brain succeeds. Open is intercepted by open-behind,
when write comes on the file, open-behind does open+write. Open succeeds
because afr doesn't fail it. Then write succeeds because write-behind
intercepts it. Flush is also intercepted by write-behind, so the application
never gets to know that the write failed.

Fix:
Fail open on split-brain, so that when open-behind does open+write open fails
which leads to write failure. Application will know about this failure.

Change-Id: I4bff1c747c97bb2925d6987f4ced5f1ce75dbc15
BUG: 1294051
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
