<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/performance, branch v3.11dev</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>readdir-ahead : Perform STACK_UNWIND outside of mutex locks</title>
<updated>2017-01-10T04:52:28+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2017-01-09T04:25:26+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=c89a065af2adc11d5aca3a4500d2e8c1ea02ed28'/>
<id>c89a065af2adc11d5aca3a4500d2e8c1ea02ed28</id>
<content type='text'>
Currently STACK_UNWIND is performnd within ctx-&gt;lock.
If readdir-ahead is loaded as a child of dht, then there
can be scenarios where the function calling STACK_UNWIND
becomes re-entrant. Its a good practice to not call
STACK_WIND/UNWIND within local mutex's

Change-Id: If4e869849d99ce233014a8aad7c4d5eef8dc2e98
BUG: 1401812
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16068
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently STACK_UNWIND is performnd within ctx-&gt;lock.
If readdir-ahead is loaded as a child of dht, then there
can be scenarios where the function calling STACK_UNWIND
becomes re-entrant. Its a good practice to not call
STACK_WIND/UNWIND within local mutex's

Change-Id: If4e869849d99ce233014a8aad7c4d5eef8dc2e98
BUG: 1401812
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16068
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md-cache: Cache updated as a part of invalidate should not update time</title>
<updated>2017-01-09T05:12:48+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2016-12-26T10:56:49+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=3fcd790d9ca01a95026026d64385c52b5476174d'/>
<id>3fcd790d9ca01a95026026d64385c52b5476174d</id>
<content type='text'>
Currently when a invalidate happens we update the cache along with
the cache time. The problem with this is, upcall doesn't update the
last access time of a client when an invalidation is sent, thus resulting
in a timewindow where the md-cache has cached, but the upcall is unaware
and hence upcall will not further invalidate the cache(unless a fop is sent
from the same client, and upcall updates its database to reflect the same)

Change-Id: Ibceb8d2fc360582752846bbf7fd59697d5424754
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16295
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently when a invalidate happens we update the cache along with
the cache time. The problem with this is, upcall doesn't update the
last access time of a client when an invalidation is sent, thus resulting
in a timewindow where the md-cache has cached, but the upcall is unaware
and hence upcall will not further invalidate the cache(unless a fop is sent
from the same client, and upcall updates its database to reflect the same)

Change-Id: Ibceb8d2fc360582752846bbf7fd59697d5424754
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16295
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>performance/write-behind: Add debug messages</title>
<updated>2017-01-09T04:33:43+00:00</updated>
<author>
<name>Raghavendra G</name>
<email>rgowdapp@redhat.com</email>
</author>
<published>2016-12-26T09:46:10+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=521c55c53bd42bfdcc0919019ee81c81305382a2'/>
<id>521c55c53bd42bfdcc0919019ee81c81305382a2</id>
<content type='text'>
Change-Id: I2ea1350fcbe4b6c06dcb8093b28316f734cd3b48
BUG: 1379655
Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16285
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: I2ea1350fcbe4b6c06dcb8093b28316f734cd3b48
BUG: 1379655
Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16285
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ec: Invalidations in disperse volume should not update the stat</title>
<updated>2017-01-06T05:12:18+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2017-01-05T10:06:02+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=95d07a3d2d68805d93d36a447436e27c48777939'/>
<id>95d07a3d2d68805d93d36a447436e27c48777939</id>
<content type='text'>
Issue:
In disperse volume, the file is present across bricks, hence the stat
from one brick doesn't carry the valid size of the file. Therefore
the upcall from one brick updating the md-cache results in wrong size
being updated.

Fix:
If the notification is cache invalidation then, indicate md-cache that
the attributes is invalid.

BUG: 1410375
Change-Id: Id89d2283478e70b62b435a8891fffc86d2be8cb2
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16329
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Issue:
In disperse volume, the file is present across bricks, hence the stat
from one brick doesn't carry the valid size of the file. Therefore
the upcall from one brick updating the md-cache results in wrong size
being updated.

Fix:
If the notification is cache invalidation then, indicate md-cache that
the attributes is invalid.

BUG: 1410375
Change-Id: Id89d2283478e70b62b435a8891fffc86d2be8cb2
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16329
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>readdir-ahead: Enhance EOD detection logic</title>
<updated>2017-01-06T04:27:50+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2016-12-06T11:01:51+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=f60631904defdaec2f1bae84b3cfd6a3e083cf09'/>
<id>f60631904defdaec2f1bae84b3cfd6a3e083cf09</id>
<content type='text'>
Issue:
Currently end of directory is identified on obtaining a
readdirp_cbk with op_ret = 0 (i.e. 0 entries fetched in
readdirp). Thus an extra readdirp is required for every
directory just to identify EOD. Consider a case of listing
large number of small directories. The readdirp fops required
are doubled in that case.

Solution:
On reaching the EOD, posix sets the op_errno to ENOENT,
hence along with looking for 'op_ret == 0' we also
look for 'operrno == ENOENT' ehile checking for EOD condition

Change-Id: I7a5b52e7b98f5dc236c387635fcc651dac0171b3
BUG: 1401812
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16042
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Issue:
Currently end of directory is identified on obtaining a
readdirp_cbk with op_ret = 0 (i.e. 0 entries fetched in
readdirp). Thus an extra readdirp is required for every
directory just to identify EOD. Consider a case of listing
large number of small directories. The readdirp fops required
are doubled in that case.

Solution:
On reaching the EOD, posix sets the op_errno to ENOENT,
hence along with looking for 'op_ret == 0' we also
look for 'operrno == ENOENT' ehile checking for EOD condition

Change-Id: I7a5b52e7b98f5dc236c387635fcc651dac0171b3
BUG: 1401812
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16042
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>performance/readdir-ahead: limit cache size</title>
<updated>2016-12-22T11:43:14+00:00</updated>
<author>
<name>Raghavendra G</name>
<email>rgowdapp@redhat.com</email>
</author>
<published>2016-11-24T09:28:20+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=96fb35624060565e02e946a970b3e777071bde9c'/>
<id>96fb35624060565e02e946a970b3e777071bde9c</id>
<content type='text'>
This patch introduces a new option called "rda-cache-limit", which is
the maximum value the entire readdir-ahead cache can grow into. Since,
readdir-ahead holds a reference to inode through dentries, this patch
also accounts memory stored by various xlators in inode contexts.

Change-Id: I84cc0ca812f35e0f9041f8cc71effae53a9e7f99
BUG: 1356960
Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16137
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Poornima G &lt;pgurusid@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch introduces a new option called "rda-cache-limit", which is
the maximum value the entire readdir-ahead cache can grow into. Since,
readdir-ahead holds a reference to inode through dentries, this patch
also accounts memory stored by various xlators in inode contexts.

Change-Id: I84cc0ca812f35e0f9041f8cc71effae53a9e7f99
BUG: 1356960
Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Reviewed-on: http://review.gluster.org/16137
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Poornima G &lt;pgurusid@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>performance/write-behind: Add more fops for checking dependency with</title>
<updated>2016-12-12T13:05:46+00:00</updated>
<author>
<name>Raghavendra G</name>
<email>rgowdapp@redhat.com</email>
</author>
<published>2016-10-31T05:19:09+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=9c769c6ee1d125b6bab513073767b628b60abeeb'/>
<id>9c769c6ee1d125b6bab513073767b628b60abeeb</id>
<content type='text'>
cached writes

Fops like readdirp, link, fallocate, discard, zerofill return iatt of
files in their responses. This iatt can be cached by md-cache. Hence
it is important that write-behind maintains relative ordering of these
fops with cached writes. Failure to do so, can result in md-cache
storing stale iatts and returning the same to applications.

Change-Id: Icfe12ad807e42fe9e52a9f63e47ce63f511c6946
BUG: 1390050
Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15757
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
cached writes

Fops like readdirp, link, fallocate, discard, zerofill return iatt of
files in their responses. This iatt can be cached by md-cache. Hence
it is important that write-behind maintains relative ordering of these
fops with cached writes. Failure to do so, can result in md-cache
storing stale iatts and returning the same to applications.

Change-Id: Icfe12ad807e42fe9e52a9f63e47ce63f511c6946
BUG: 1390050
Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15757
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dht/md-cache: Filter invalidate if the file is made a linkto file</title>
<updated>2016-12-02T10:46:58+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2016-11-08T05:02:29+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=4536f7bdf16f8286d67598eda9a46c029f0c0bf4'/>
<id>4536f7bdf16f8286d67598eda9a46c029f0c0bf4</id>
<content type='text'>
Upcall as a part of setattr, sends an invalidation and the
invalidation carries the resulting stat value. When a file
is converted to linkto files, even then an invalidation
is set and as a result the mountpoint shows the sticky
bit in the stat of the file.
eg: ---------T. 945 root root 0 Nov  8 10:14 hardlink.999

Fix:
When dht recieves a notification of sticky bit change, it updates
the flag, to indicate md-cache to send the subsequent lookup.

Change-Id: Ic2fd7a5b196db0754f9b97072e644e6bf69da606
BUG: 1392713
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15789
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Susant Palai &lt;spalai@redhat.com&gt;
Reviewed-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Upcall as a part of setattr, sends an invalidation and the
invalidation carries the resulting stat value. When a file
is converted to linkto files, even then an invalidation
is set and as a result the mountpoint shows the sticky
bit in the stat of the file.
eg: ---------T. 945 root root 0 Nov  8 10:14 hardlink.999

Fix:
When dht recieves a notification of sticky bit change, it updates
the flag, to indicate md-cache to send the subsequent lookup.

Change-Id: Ic2fd7a5b196db0754f9b97072e644e6bf69da606
BUG: 1392713
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15789
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Susant Palai &lt;spalai@redhat.com&gt;
Reviewed-by: Rajesh Joseph &lt;rjoseph@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io-cache: Fix a read hang</title>
<updated>2016-11-23T13:11:07+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2016-11-21T14:27:08+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=114c50c1a10d649a8b640627f09fd5872828d4ec'/>
<id>114c50c1a10d649a8b640627f09fd5872828d4ec</id>
<content type='text'>
Issue:
=====
In certain cases, there was no unwind of read
from read-ahead xlator, thus resulting in hang.

RCA:
====
In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead
of STACK_WIND(). One such case is when inode_ctx for that file
is not present (can happen if readdirp was called, and populates
md-cache and serves all the lookups from cache).

Consider the following graph:
...
io-cache (parent)
   |
readdir-ahead
   |
read-ahead
...

Below is the code snippet of ioc_readv calling STACK_WIND_TAIL:
ioc_readv()
{
...
 if (!inode_ctx)
   STACK_WIND_TAIL (frame, FIRST_CHILD (frame-&gt;this),
                    FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv, fd,
                    size, offset, flags, xdata);
   /* Ideally, this stack_wind should wind to readdir-ahead:readv()
      but it winds to read-ahead:readv(). See below for
      explaination.
    */
...
}

STACK_WIND_TAIL (frame, obj, fn, ...)
{
  frame-&gt;this = obj;
  /* for the above mentioned graph, frame-&gt;this will be readdir-ahead
   * frame-&gt;this = FIRST_CHILD (frame-&gt;this) i.e. readdir-ahead, which
   * is as expected
   */
  ...
  THIS = obj;
  /* THIS will be read-ahead instead of readdir-ahead!, as obj expands
   * to "FIRST_CHILD (frame-&gt;this)" and frame-&gt;this was pointing
   * to readdir-ahead in the previous statement.
   */
  ...
  fn (frame, obj, params);
  /* fn will call read-ahead:readv() instead of readdir-ahead:readv()!
   * as fn expands to "FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv" and
   * frame-&gt;this was pointing ro readdir-ahead in the first statement
   */
  ...
}

Thus, the readdir-ahead's readv() implementation will be skipped, and
ra_readv() will be called with frame-&gt;this = "readdir-ahead" and
this = "read-ahead". This can lead to corruption / hang / other problems.
But in this perticular case, when 'frame-&gt;this' and 'this' passed
to ra_readv() doesn't match, it causes ra_readv() to call ra_readv()
again!. Thus the logic of read-ahead readv() falls apart and leads to
hang.

Solution:
=========
Ideally, STACK_WIND_TAIL() should be modified as:
STACK_WIND_TAIL (frame, obj, fn, ...)
{
  next_xl = obj /* resolve obj as the variables passed in obj macro
                   can be overwritten in the further instrucions */
  next_xl_fn = fn /* resolve fn and store in a tmp variable, before
                     modifying any variables */
  frame-&gt;this = next_xl;
  ...
  THIS = next_xl;
  ...
  next_xl_fn (frame, next_xl, params);
  ...
}
But for this solution, knowing the type of variable 'next_xl_fn' is
a challenge and is not easy. Hence just modifying all the existing
callers to pass "FIRST_CHILD (this)" as obj, instead of
"FIRST_CHILD (frame-&gt;this)".

Change-Id: I179ffe3d1f154bc5a1935fd2ee44e912eb0fbb61
BUG: 1388292
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15901
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Issue:
=====
In certain cases, there was no unwind of read
from read-ahead xlator, thus resulting in hang.

RCA:
====
In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead
of STACK_WIND(). One such case is when inode_ctx for that file
is not present (can happen if readdirp was called, and populates
md-cache and serves all the lookups from cache).

Consider the following graph:
...
io-cache (parent)
   |
readdir-ahead
   |
read-ahead
...

Below is the code snippet of ioc_readv calling STACK_WIND_TAIL:
ioc_readv()
{
...
 if (!inode_ctx)
   STACK_WIND_TAIL (frame, FIRST_CHILD (frame-&gt;this),
                    FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv, fd,
                    size, offset, flags, xdata);
   /* Ideally, this stack_wind should wind to readdir-ahead:readv()
      but it winds to read-ahead:readv(). See below for
      explaination.
    */
...
}

STACK_WIND_TAIL (frame, obj, fn, ...)
{
  frame-&gt;this = obj;
  /* for the above mentioned graph, frame-&gt;this will be readdir-ahead
   * frame-&gt;this = FIRST_CHILD (frame-&gt;this) i.e. readdir-ahead, which
   * is as expected
   */
  ...
  THIS = obj;
  /* THIS will be read-ahead instead of readdir-ahead!, as obj expands
   * to "FIRST_CHILD (frame-&gt;this)" and frame-&gt;this was pointing
   * to readdir-ahead in the previous statement.
   */
  ...
  fn (frame, obj, params);
  /* fn will call read-ahead:readv() instead of readdir-ahead:readv()!
   * as fn expands to "FIRST_CHILD (frame-&gt;this)-&gt;fops-&gt;readv" and
   * frame-&gt;this was pointing ro readdir-ahead in the first statement
   */
  ...
}

Thus, the readdir-ahead's readv() implementation will be skipped, and
ra_readv() will be called with frame-&gt;this = "readdir-ahead" and
this = "read-ahead". This can lead to corruption / hang / other problems.
But in this perticular case, when 'frame-&gt;this' and 'this' passed
to ra_readv() doesn't match, it causes ra_readv() to call ra_readv()
again!. Thus the logic of read-ahead readv() falls apart and leads to
hang.

Solution:
=========
Ideally, STACK_WIND_TAIL() should be modified as:
STACK_WIND_TAIL (frame, obj, fn, ...)
{
  next_xl = obj /* resolve obj as the variables passed in obj macro
                   can be overwritten in the further instrucions */
  next_xl_fn = fn /* resolve fn and store in a tmp variable, before
                     modifying any variables */
  frame-&gt;this = next_xl;
  ...
  THIS = next_xl;
  ...
  next_xl_fn (frame, next_xl, params);
  ...
}
But for this solution, knowing the type of variable 'next_xl_fn' is
a challenge and is not easy. Hence just modifying all the existing
callers to pass "FIRST_CHILD (this)" as obj, instead of
"FIRST_CHILD (frame-&gt;this)".

Change-Id: I179ffe3d1f154bc5a1935fd2ee44e912eb0fbb61
BUG: 1388292
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15901
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>performance/io-threads: Exit threads in fini() as well</title>
<updated>2016-11-22T07:01:18+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-11-18T08:00:08+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=25817a8c868b6c1b8149117f13e4216a99e453aa'/>
<id>25817a8c868b6c1b8149117f13e4216a99e453aa</id>
<content type='text'>
Problem:
io-threads starts the thread in 'init()' but doesn't clean them up
on 'fini()'. It relies on PARENT_DOWN to exit threads but there can
be cases where event before PARENT_UP the graph init code can think
of issuing fini(). This code path is hit when glfs_init() is called
on a volume that is in 'stopped' state. It leads to a crash in ganesha
process, because the io-thread tries to access freed memory.

Fix:
Ideal fix would be to wait for all fops in io-thread list to be completed on
PARENT_DOWN, and have fini() do cleanup of threads. Because there is no proper
documentation about how PARENT_DOWN/fini are supposed to be used,
we are getting different kinds of sequences in different higher level protocols.
So for now cleaning up in both PARENT_DOWN and fini(). Fuse doesn't call fini()
gfapi is not calling PARENT_DOWN in some cases, so for now I don't see
another way out.

BUG: 1396793
Change-Id: I9c9154e7d57198dbaff0f30d3ffc25f6d8088aec
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15888
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
io-threads starts the thread in 'init()' but doesn't clean them up
on 'fini()'. It relies on PARENT_DOWN to exit threads but there can
be cases where event before PARENT_UP the graph init code can think
of issuing fini(). This code path is hit when glfs_init() is called
on a volume that is in 'stopped' state. It leads to a crash in ganesha
process, because the io-thread tries to access freed memory.

Fix:
Ideal fix would be to wait for all fops in io-thread list to be completed on
PARENT_DOWN, and have fini() do cleanup of threads. Because there is no proper
documentation about how PARENT_DOWN/fini are supposed to be used,
we are getting different kinds of sequences in different higher level protocols.
So for now cleaning up in both PARENT_DOWN and fini(). Fuse doesn't call fini()
gfapi is not calling PARENT_DOWN in some cases, so for now I don't see
another way out.

BUG: 1396793
Change-Id: I9c9154e7d57198dbaff0f30d3ffc25f6d8088aec
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15888
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
