<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/bugs, branch exp</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>tests: Fix spurious failure in bug-1402841.t-mt-dir-scan-race.t</title>
<updated>2016-12-19T09:31:52+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-12-16T16:11:58+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=2bb2313656a19ee8efe3cfeda4f2ae90e8e49b62'/>
<id>2bb2313656a19ee8efe3cfeda4f2ae90e8e49b62</id>
<content type='text'>
Check that shd is up before executing 'volume heal' command

Change-Id: Ib510a5de06d732fd3a738e90fa16376698479897
BUG: 1405554
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16169
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra Talur &lt;rtalur@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Check that shd is up before executing 'volume heal' command

Change-Id: Ib510a5de06d732fd3a738e90fa16376698479897
BUG: 1405554
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16169
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra Talur &lt;rtalur@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: Fix spurious test failure in bug-1316437.t</title>
<updated>2016-12-16T14:36:46+00:00</updated>
<author>
<name>Rajesh Joseph</name>
<email>rjoseph@redhat.com</email>
</author>
<published>2016-12-15T15:21:30+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=e9d8525a0d34130ba2a582109937b8e79eecf6ab'/>
<id>e9d8525a0d34130ba2a582109937b8e79eecf6ab</id>
<content type='text'>
After sending SIGTERM to gluster process we immediately
check if process exited. We should wait for some time
before checking process state.
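
The wait described above could be sketched as follows. This is only an
illustration in Python, not the shell code of bug-1316437.t; the helper
name terminate_and_wait and the 5 second timeout are assumptions made
for the example.

```python
import signal
import subprocess
import sys


def terminate_and_wait(proc, timeout=5.0):
    """Send SIGTERM, then give the process up to `timeout` seconds to exit."""
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        return False  # still running after the grace period
    return True


# A long-running child stands in for the gluster process (assumption).
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exited = terminate_and_wait(child)
print("exited:", exited)
```

Checking the process state only after the bounded wait avoids the race
between signal delivery and process exit.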

BUG: 1404573
Change-Id: Iaba0067f6e880a7fe38e11b9fa0fe9bd103b19e2
Signed-off-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16162
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Avra Sengupta &lt;asengupt@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: N Balachandran &lt;nbalacha@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After sending SIGTERM to gluster process we immediately
check if process exited. We should wait for some time
before checking process state.

BUG: 1404573
Change-Id: Iaba0067f6e880a7fe38e11b9fa0fe9bd103b19e2
Signed-off-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16162
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Avra Sengupta &lt;asengupt@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: N Balachandran &lt;nbalacha@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>access_control : address O_TRUNC and O_APPEND flag properly in posix_acl_open</title>
<updated>2016-12-14T13:01:07+00:00</updated>
<author>
<name>Jiffin Tony Thottan</name>
<email>jthottan@redhat.com</email>
</author>
<published>2016-11-17T12:52:39+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=e81fd0b85c8dd3f521e54e32b7da2f99a513f2f2'/>
<id>e81fd0b85c8dd3f521e54e32b7da2f99a513f2f2</id>
<content type='text'>
In posix_acl_open, the value passed to the switch is (flag &amp; O_ACCMODE). The
value of O_ACCMODE is 0003, so the result will always be less than or equal
to 3. But the value of O_TRUNC is 01000 and that of O_APPEND is 02000, so it
is not right to check for them in the switch cases.

Change-Id: Ia17db80a6a5f681c35e08e062d384f33ef7e0354
BUG: 1387241
Signed-off-by: Jiffin Tony Thottan &lt;jthottan@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15688
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Reviewed-by: Kaleb KEITHLEY &lt;kkeithle@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In posix_acl_open, the value passed to the switch is (flag &amp; O_ACCMODE). The
value of O_ACCMODE is 0003, so the result will always be less than or equal
to 3. But the value of O_TRUNC is 01000 and that of O_APPEND is 02000, so it
is not right to check for them in the switch cases.

Change-Id: Ia17db80a6a5f681c35e08e062d384f33ef7e0354
BUG: 1387241
Signed-off-by: Jiffin Tony Thottan &lt;jthottan@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15688
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Reviewed-by: Kaleb KEITHLEY &lt;kkeithle@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Fix per-txn optimistic changelog initialisation</title>
<updated>2016-12-12T16:38:49+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-12-08T17:19:48+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=2b76520ca3e41cbac8f9318dce87e0b8d670c0ee'/>
<id>2b76520ca3e41cbac8f9318dce87e0b8d670c0ee</id>
<content type='text'>
Incorrect initialisation of local-&gt;optimistic_change_log was leading
to skipped pre-op and post-op even when a brick didn't participate in
the txn because it was down.
The result - missing granular name index resulting in some entries
never getting healed.

FIX:
Initialise local-&gt;optimistic_change_log just before pre-op.

Also fixed granular entry heal to create the granular name index in
pre-op as opposed to post-op. This is to prevent loss of granular
information when during an entry txn, the good (src) brick goes
offline before the post-op is done. This would cause self-heal to
do conservative merge (since dirty xattr is the only information
available), which when granular-entry-heal is enabled, expects
granular indices, the lack of which can lead to loss of data in
the worst case.

Change-Id: Ia3ad716d6fb1821555f02180e86e8711a79f958d
BUG: 1402730
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16075
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Incorrect initialisation of local-&gt;optimistic_change_log was leading
to skipped pre-op and post-op even when a brick didn't participate in
the txn because it was down.
The result - missing granular name index resulting in some entries
never getting healed.

FIX:
Initialise local-&gt;optimistic_change_log just before pre-op.

Also fixed granular entry heal to create the granular name index in
pre-op as opposed to post-op. This is to prevent loss of granular
information when during an entry txn, the good (src) brick goes
offline before the post-op is done. This would cause self-heal to
do conservative merge (since dirty xattr is the only information
available), which when granular-entry-heal is enabled, expects
granular indices, the lack of which can lead to loss of data in
the worst case.

Change-Id: Ia3ad716d6fb1821555f02180e86e8711a79f958d
BUG: 1402730
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16075
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>syncop: fix conditional wait bug in parallel dir scan</title>
<updated>2016-12-09T10:24:21+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-12-09T04:20:43+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=2d012c4558046afd6adb3992ff88f937c5f835e4'/>
<id>2d012c4558046afd6adb3992ff88f937c5f835e4</id>
<content type='text'>
Problem:
The issue as seen by the user is detailed in the BZ but what is
happening is if the no. of items in the wait queue == max-qlen,
syncop_mt_dir_scan() does a pthread_cond_wait until the launched
synctask workers dequeue the queue. But if for some reason the worker
fails, the queue is never emptied due to which further invocations of
syncop_mt_dir_scan() are blocked forever.

Fix: Made some changes to _dir_scan_job_fn

- If a worker encounters error while processing an entry, notify the
  readdir loop in syncop_mt_dir_scan() of the error but continue to process
  other entries in the queue, decrementing the qlen as and when we dequeue
  elements, and ending only when the queue is empty.

- If the readdir loop in syncop_mt_dir_scan() gets an error from the
  worker, stop the readdir+queueing of further entries.
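
The two points above can be sketched with a bounded queue and a condition
variable. This is a simplified Python model under stated assumptions, not
the C code of syncop_mt_dir_scan() or _dir_scan_job_fn; the names
ScanState, worker, scan and fail_on_first are hypothetical, and a single
worker thread is used.

```python
import threading
from collections import deque


class ScanState:
    """Shared state between the readdir loop and the worker."""

    def __init__(self, max_qlen):
        self.cond = threading.Condition()
        self.queue = deque()
        self.max_qlen = max_qlen
        self.done = False   # set once the readdir loop stops queueing
        self.error = None   # first error reported by the worker


def worker(state, handle):
    while True:
        with state.cond:
            while not state.queue and not state.done:
                state.cond.wait()
            if not state.queue:
                return  # queue drained and no more entries will arrive
            entry = state.queue.popleft()  # qlen drops on every dequeue
            state.cond.notify_all()
        try:
            handle(entry)
        except Exception as exc:
            # Report the error, but keep draining the queue.
            with state.cond:
                if state.error is None:
                    state.error = exc
                state.cond.notify_all()


def scan(entries, handle, max_qlen=2):
    state = ScanState(max_qlen)
    thread = threading.Thread(target=worker, args=(state, handle))
    thread.start()
    queued = 0
    for entry in entries:
        with state.cond:
            # Block only while the queue is full and no error was reported.
            while len(state.queue) == state.max_qlen and state.error is None:
                state.cond.wait()
            if state.error is not None:
                break  # stop reading and queueing further entries
            state.queue.append(entry)
            queued += 1
            state.cond.notify_all()
    with state.cond:
        state.done = True
        state.cond.notify_all()
    thread.join()
    return queued, state.error


def fail_on_first(entry):
    if entry == 0:
        raise ValueError("cannot process entry 0")


queued, error = scan(range(10), fail_on_first)
print(queued, repr(error))
```

Because the worker keeps dequeuing after a failure, the producer is always
woken up, and the call returns instead of blocking forever.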

Change-Id: I39ce073e01a68c7ff18a0e9227389245a6f75b88
BUG: 1402841
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16073
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
The issue as seen by the user is detailed in the BZ but what is
happening is if the no. of items in the wait queue == max-qlen,
syncop_mt_dir_scan() does a pthread_cond_wait until the launched
synctask workers dequeue the queue. But if for some reason the worker
fails, the queue is never emptied due to which further invocations of
syncop_mt_dir_scan() are blocked forever.

Fix: Made some changes to _dir_scan_job_fn

- If a worker encounters error while processing an entry, notify the
  readdir loop in syncop_mt_dir_scan() of the error but continue to process
  other entries in the queue, decrementing the qlen as and when we dequeue
  elements, and ending only when the queue is empty.

- If the readdir loop in syncop_mt_dir_scan() gets an error from the
  worker, stop the readdir+queueing of further entries.

Change-Id: I39ce073e01a68c7ff18a0e9227389245a6f75b88
BUG: 1402841
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16073
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>uss: snapd should enable SSL if SSL is enabled on volume</title>
<updated>2016-12-01T09:27:07+00:00</updated>
<author>
<name>Rajesh Joseph</name>
<email>rjoseph@redhat.com</email>
</author>
<published>2016-11-29T16:27:37+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=182f0d12040dab5081ca645a3f370f65cd68b528'/>
<id>182f0d12040dab5081ca645a3f370f65cd68b528</id>
<content type='text'>
During snapd graph generation we should check if SSL is
enabled on main volume or not. This is because clients
will communicate with snapd as if it is communicating to
a brick.

Change-Id: I0d7fe86c567b297a8528a48faf06161d4c3cb415
Signed-off-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
BUG: 1400013
Reviewed-on: http://review.gluster.org/15979
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Kaushal M &lt;kaushal@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
During snapd graph generation we should check if SSL is
enabled on main volume or not. This is because clients
will communicate with snapd as if it is communicating to
a brick.

Change-Id: I0d7fe86c567b297a8528a48faf06161d4c3cb415
Signed-off-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
BUG: 1400013
Reviewed-on: http://review.gluster.org/15979
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Kaushal M &lt;kaushal@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libglusterfs: Fix a read hang</title>
<updated>2016-11-29T03:59:08+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2016-11-21T14:27:08+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=8943c19a2ef51b6e4fa66cb57211d469fe558579'/>
<id>8943c19a2ef51b6e4fa66cb57211d469fe558579</id>
<content type='text'>
Issue:
=====
In certain cases, there was no unwind of read
from read-ahead xlator, thus resulting in hang.

RCA:
====
In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead
of STACK_WIND(). One such case is when inode_ctx for that file
is not present (can happen if readdirp was called, and populates
md-cache and serves all the lookups from cache).

Consider the following graph:
...
io-cache (parent)
   |
readdir-ahead
   |
read-ahead
...

Below is the code snippet of ioc_readv calling STACK_WIND_TAIL:
ioc_readv()
{
...
 if (!inode_ctx)
   STACK_WIND_TAIL (frame, FIRST_CHILD (frame-&gt;this),
                    FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv, fd,
                    size, offset, flags, xdata);
   /* Ideally, this stack_wind should wind to readdir-ahead:readv()
      but it winds to read-ahead:readv(). See below for
      explanation.
    */
...
}

STACK_WIND_TAIL (frame, obj, fn, ...)
{
  frame-&gt;this = obj;
  /* for the above mentioned graph, frame-&gt;this will be readdir-ahead
   * frame-&gt;this = FIRST_CHILD (frame-&gt;this) i.e. readdir-ahead, which
   * is as expected
   */
  ...
  THIS = obj;
  /* THIS will be read-ahead instead of readdir-ahead!, as obj expands
   * to "FIRST_CHILD (frame-&gt;this)" and frame-&gt;this was pointing
   * to readdir-ahead in the previous statement.
   */
  ...
  fn (frame, obj, params);
  /* fn will call read-ahead:readv() instead of readdir-ahead:readv()!
   * as fn expands to "FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv" and
   * frame-&gt;this was pointing to readdir-ahead in the first statement
   */
  ...
}

Thus, the readdir-ahead's readv() implementation will be skipped, and
ra_readv() will be called with frame-&gt;this = "readdir-ahead" and
this = "read-ahead". This can lead to corruption / hang / other problems.
But in this particular case, when the 'frame-&gt;this' and 'this' passed
to ra_readv() don't match, ra_readv() ends up calling ra_readv()
again. Thus the logic of read-ahead readv() falls apart and leads to a
hang.

Solution:
=========
Modify STACK_WIND_TAIL() as:
STACK_WIND_TAIL (frame, obj, fn, ...)
{
  next_xl = obj /* resolve obj as the variables passed in obj macro
                   can be overwritten in the further instructions */
  next_xl_fn = fn /* resolve fn and store in a tmp variable, before
                     modifying any variables */
  frame-&gt;this = next_xl;
  ...
  THIS = next_xl;
  ...
  next_xl_fn (frame, next_xl, params);
  ...
}

As a part of http://review.gluster.org/15901/ the caller io-cache
was fixed.
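
The macro-argument re-evaluation described in the RCA can be modeled
outside C. The following Python sketch is an illustration only: a callable
stands in for the macro argument text, and the names Xlator, Frame,
first_child, wind_tail_buggy and wind_tail_fixed are hypothetical, not
GlusterFS APIs.

```python
class Xlator:
    def __init__(self, name, child=None):
        self.name = name
        self.child = child


class Frame:
    def __init__(self, this):
        self.this = this


def first_child(xl):
    return xl.child


def wind_tail_buggy(frame, obj):
    # `obj` is a callable standing in for macro argument text, so it is
    # re-evaluated at every use, as in the original macro.
    frame.this = obj()
    target = obj()  # evaluated again, after frame.this already changed
    return target.name


def wind_tail_fixed(frame, obj):
    next_xl = obj()  # resolve the argument once, before modifying frame
    frame.this = next_xl
    return next_xl.name


def make_frame():
    # Graph from the RCA: io-cache, then readdir-ahead, then read-ahead.
    read_ahead = Xlator("read-ahead")
    readdir_ahead = Xlator("readdir-ahead", read_ahead)
    io_cache = Xlator("io-cache", readdir_ahead)
    return Frame(io_cache)


frame = make_frame()
buggy_target = wind_tail_buggy(frame, lambda: first_child(frame.this))

frame = make_frame()
fixed_target = wind_tail_fixed(frame, lambda: first_child(frame.this))

print(buggy_target, fixed_target)  # prints: read-ahead readdir-ahead
```

Resolving the argument into a temporary first is what keeps the wind
target independent of the later assignment to frame.this.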

BUG: 1388292
Change-Id: Ie662ac8f18fa16909376f1e59387bc5b886bd0f9
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15923
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Issue:
=====
In certain cases, there was no unwind of read
from read-ahead xlator, thus resulting in hang.

RCA:
====
In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead
of STACK_WIND(). One such case is when inode_ctx for that file
is not present (can happen if readdirp was called, and populates
md-cache and serves all the lookups from cache).

Consider the following graph:
...
io-cache (parent)
   |
readdir-ahead
   |
read-ahead
...

Below is the code snippet of ioc_readv calling STACK_WIND_TAIL:
ioc_readv()
{
...
 if (!inode_ctx)
   STACK_WIND_TAIL (frame, FIRST_CHILD (frame-&gt;this),
                    FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv, fd,
                    size, offset, flags, xdata);
   /* Ideally, this stack_wind should wind to readdir-ahead:readv()
      but it winds to read-ahead:readv(). See below for
      explanation.
    */
...
}

STACK_WIND_TAIL (frame, obj, fn, ...)
{
  frame-&gt;this = obj;
  /* for the above mentioned graph, frame-&gt;this will be readdir-ahead
   * frame-&gt;this = FIRST_CHILD (frame-&gt;this) i.e. readdir-ahead, which
   * is as expected
   */
  ...
  THIS = obj;
  /* THIS will be read-ahead instead of readdir-ahead!, as obj expands
   * to "FIRST_CHILD (frame-&gt;this)" and frame-&gt;this was pointing
   * to readdir-ahead in the previous statement.
   */
  ...
  fn (frame, obj, params);
  /* fn will call read-ahead:readv() instead of readdir-ahead:readv()!
   * as fn expands to "FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv" and
   * frame-&gt;this was pointing to readdir-ahead in the first statement
   */
  ...
}

Thus, the readdir-ahead's readv() implementation will be skipped, and
ra_readv() will be called with frame-&gt;this = "readdir-ahead" and
this = "read-ahead". This can lead to corruption / hang / other problems.
But in this particular case, when the 'frame-&gt;this' and 'this' passed
to ra_readv() don't match, ra_readv() ends up calling ra_readv()
again. Thus the logic of read-ahead readv() falls apart and leads to a
hang.

Solution:
=========
Modify STACK_WIND_TAIL() as:
STACK_WIND_TAIL (frame, obj, fn, ...)
{
  next_xl = obj /* resolve obj as the variables passed in obj macro
                   can be overwritten in the further instructions */
  next_xl_fn = fn /* resolve fn and store in a tmp variable, before
                     modifying any variables */
  frame-&gt;this = next_xl;
  ...
  THIS = next_xl;
  ...
  next_xl_fn (frame, next_xl, params);
  ...
}

As a part of http://review.gluster.org/15901/ the caller io-cache
was fixed.

BUG: 1388292
Change-Id: Ie662ac8f18fa16909376f1e59387bc5b886bd0f9
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15923
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: allow I/O when favorite-child-policy is enabled</title>
<updated>2016-11-28T07:51:59+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-11-26T15:54:01+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=a07ddd8fcc8dcdcf7ccfa61211d258f13b9f9229'/>
<id>a07ddd8fcc8dcdcf7ccfa61211d258f13b9f9229</id>
<content type='text'>
Problem:
Currently, I/O on a split-brained file fails even when the
favorite-child-policy is set until the self-heal is complete.

Fix:
If a valid 'source' is found using the set favorite-child-policy, inspect
and reset the afr pending xattrs on the 'sinks' (inside appropriate locks),
refresh the inode and then proceed with the read or write transaction.

The resetting itself happens in the self-heal code and hence can also
happen in the client side background-heal or by the shd's index-heal in
addition to the txn code path explained above. When it happens via
heal, we also add checks in undo-pending to not reset the sink xattrs
again.

Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626
BUG: 1386188
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reported-by: Simon Turcotte-Langevin &lt;simon.turcotte-langevin@ubisoft.com&gt;
Reviewed-on: http://review.gluster.org/15673
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Currently, I/O on a split-brained file fails even when the
favorite-child-policy is set until the self-heal is complete.

Fix:
If a valid 'source' is found using the set favorite-child-policy, inspect
and reset the afr pending xattrs on the 'sinks' (inside appropriate locks),
refresh the inode and then proceed with the read or write transaction.

The resetting itself happens in the self-heal code and hence can also
happen in the client side background-heal or by the shd's index-heal in
addition to the txn code path explained above. When it happens via
heal, we also add checks in undo-pending to not reset the sink xattrs
again.

Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626
BUG: 1386188
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reported-by: Simon Turcotte-Langevin &lt;simon.turcotte-langevin@ubisoft.com&gt;
Reviewed-on: http://review.gluster.org/15673
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io-cache: Fix a read hang</title>
<updated>2016-11-23T13:11:07+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2016-11-21T14:27:08+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=114c50c1a10d649a8b640627f09fd5872828d4ec'/>
<id>114c50c1a10d649a8b640627f09fd5872828d4ec</id>
<content type='text'>
Issue:
=====
In certain cases, there was no unwind of read
from read-ahead xlator, thus resulting in hang.

RCA:
====
In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead
of STACK_WIND(). One such case is when inode_ctx for that file
is not present (can happen if readdirp was called, and populates
md-cache and serves all the lookups from cache).

Consider the following graph:
...
io-cache (parent)
   |
readdir-ahead
   |
read-ahead
...

Below is the code snippet of ioc_readv calling STACK_WIND_TAIL:
ioc_readv()
{
...
 if (!inode_ctx)
   STACK_WIND_TAIL (frame, FIRST_CHILD (frame-&gt;this),
                    FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv, fd,
                    size, offset, flags, xdata);
   /* Ideally, this stack_wind should wind to readdir-ahead:readv()
      but it winds to read-ahead:readv(). See below for
      explanation.
    */
...
}

STACK_WIND_TAIL (frame, obj, fn, ...)
{
  frame-&gt;this = obj;
  /* for the above mentioned graph, frame-&gt;this will be readdir-ahead
   * frame-&gt;this = FIRST_CHILD (frame-&gt;this) i.e. readdir-ahead, which
   * is as expected
   */
  ...
  THIS = obj;
  /* THIS will be read-ahead instead of readdir-ahead!, as obj expands
   * to "FIRST_CHILD (frame-&gt;this)" and frame-&gt;this was pointing
   * to readdir-ahead in the previous statement.
   */
  ...
  fn (frame, obj, params);
  /* fn will call read-ahead:readv() instead of readdir-ahead:readv()!
   * as fn expands to "FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv" and
   * frame-&gt;this was pointing to readdir-ahead in the first statement
   */
  ...
}

Thus, the readdir-ahead's readv() implementation will be skipped, and
ra_readv() will be called with frame-&gt;this = "readdir-ahead" and
this = "read-ahead". This can lead to corruption / hang / other problems.
But in this particular case, when the 'frame-&gt;this' and 'this' passed
to ra_readv() don't match, ra_readv() ends up calling ra_readv()
again. Thus the logic of read-ahead readv() falls apart and leads to a
hang.

Solution:
=========
Ideally, STACK_WIND_TAIL() should be modified as:
STACK_WIND_TAIL (frame, obj, fn, ...)
{
  next_xl = obj /* resolve obj as the variables passed in obj macro
                   can be overwritten in the further instructions */
  next_xl_fn = fn /* resolve fn and store in a tmp variable, before
                     modifying any variables */
  frame-&gt;this = next_xl;
  ...
  THIS = next_xl;
  ...
  next_xl_fn (frame, next_xl, params);
  ...
}
But for this solution, determining the type of the variable 'next_xl_fn'
is not easy. Hence, this patch just modifies all the existing
callers to pass "FIRST_CHILD (this)" as obj, instead of
"FIRST_CHILD (frame-&gt;this)".

Change-Id: I179ffe3d1f154bc5a1935fd2ee44e912eb0fbb61
BUG: 1388292
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15901
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Issue:
=====
In certain cases, there was no unwind of read
from read-ahead xlator, thus resulting in hang.

RCA:
====
In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead
of STACK_WIND(). One such case is when inode_ctx for that file
is not present (can happen if readdirp was called, and populates
md-cache and serves all the lookups from cache).

Consider the following graph:
...
io-cache (parent)
   |
readdir-ahead
   |
read-ahead
...

Below is the code snippet of ioc_readv calling STACK_WIND_TAIL:
ioc_readv()
{
...
 if (!inode_ctx)
   STACK_WIND_TAIL (frame, FIRST_CHILD (frame-&gt;this),
                    FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv, fd,
                    size, offset, flags, xdata);
   /* Ideally, this stack_wind should wind to readdir-ahead:readv()
      but it winds to read-ahead:readv(). See below for
      explanation.
    */
...
}

STACK_WIND_TAIL (frame, obj, fn, ...)
{
  frame-&gt;this = obj;
  /* for the above mentioned graph, frame-&gt;this will be readdir-ahead
   * frame-&gt;this = FIRST_CHILD (frame-&gt;this) i.e. readdir-ahead, which
   * is as expected
   */
  ...
  THIS = obj;
  /* THIS will be read-ahead instead of readdir-ahead!, as obj expands
   * to "FIRST_CHILD (frame-&gt;this)" and frame-&gt;this was pointing
   * to readdir-ahead in the previous statement.
   */
  ...
  fn (frame, obj, params);
  /* fn will call read-ahead:readv() instead of readdir-ahead:readv()!
   * as fn expands to "FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv" and
   * frame-&gt;this was pointing to readdir-ahead in the first statement
   */
  ...
}

Thus, the readdir-ahead's readv() implementation will be skipped, and
ra_readv() will be called with frame-&gt;this = "readdir-ahead" and
this = "read-ahead". This can lead to corruption / hang / other problems.
But in this particular case, when the 'frame-&gt;this' and 'this' passed
to ra_readv() don't match, ra_readv() ends up calling ra_readv()
again. Thus the logic of read-ahead readv() falls apart and leads to a
hang.

Solution:
=========
Ideally, STACK_WIND_TAIL() should be modified as:
STACK_WIND_TAIL (frame, obj, fn, ...)
{
  next_xl = obj /* resolve obj as the variables passed in obj macro
                   can be overwritten in the further instructions */
  next_xl_fn = fn /* resolve fn and store in a tmp variable, before
                     modifying any variables */
  frame-&gt;this = next_xl;
  ...
  THIS = next_xl;
  ...
  next_xl_fn (frame, next_xl, params);
  ...
}
But for this solution, determining the type of the variable 'next_xl_fn'
is not easy. Hence, this patch just modifies all the existing
callers to pass "FIRST_CHILD (this)" as obj, instead of
"FIRST_CHILD (frame-&gt;this)".

Change-Id: I179ffe3d1f154bc5a1935fd2ee44e912eb0fbb61
BUG: 1388292
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15901
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>marker: Fix inode value in loc, in setxattr fop</title>
<updated>2016-11-17T10:53:15+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2016-11-11T06:38:57+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=46e5466850311ee69e6ae9a11c2bba2aabadd5de'/>
<id>46e5466850311ee69e6ae9a11c2bba2aabadd5de</id>
<content type='text'>
On receiving a rename fop, marker_rename() stores the
oldloc and newloc in its 'local' struct. Once the rename
is done, the xtime marker (last updated time) is set on
the file by sending a setxattr fop. When upcall
receives the setxattr fop, the loc-&gt;inode is NULL and
it crashes. The loc-&gt;inode can be NULL only in one valid
case, i.e. in rename case where the inode of new loc
can be NULL. Hence, marker should have filled the inode
of the new_loc before issuing a setxattr.

Change-Id: Id638f678c3daaf4a5c29b970b58929d377ae8977
BUG: 1394131
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15826
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On receiving a rename fop, marker_rename() stores the
oldloc and newloc in its 'local' struct. Once the rename
is done, the xtime marker (last updated time) is set on
the file by sending a setxattr fop. When upcall
receives the setxattr fop, the loc-&gt;inode is NULL and
it crashes. The loc-&gt;inode can be NULL only in one valid
case, i.e. in rename case where the inode of new loc
can be NULL. Hence, marker should have filled the inode
of the new_loc before issuing a setxattr.

Change-Id: Id638f678c3daaf4a5c29b970b58929d377ae8977
BUG: 1394131
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15826
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
