<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/cluster/afr/src, branch v3.4.0alpha</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>cluster/afr: added logging of changelog for split-brain in glustershd.log file</title>
<updated>2013-02-03T19:48:01+00:00</updated>
<author>
<name>Venkatesh Somyajula</name>
<email>vsomyaju@redhat.com</email>
</author>
<published>2013-01-23T06:37:12+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=454c6c0fde1f0788c4a1a7506c434a9b7d822e85'/>
<id>454c6c0fde1f0788c4a1a7506c434a9b7d822e85</id>
<content type='text'>
Change-Id: Iaf119f839cb2113b8f8efb7bf7636d471b6541bf
BUG: 866440
Signed-off-by: Venkatesh Somyajula &lt;vsomyaju@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4385
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Reviewed-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Iaf119f839cb2113b8f8efb7bf7636d471b6541bf
BUG: 866440
Signed-off-by: Venkatesh Somyajula &lt;vsomyaju@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4385
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Reviewed-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: if a subvolume is down wind the lock request to next</title>
<updated>2013-01-29T20:50:55+00:00</updated>
<author>
<name>Raghavendra Bhat</name>
<email>raghavendra@redhat.com</email>
</author>
<published>2013-01-28T10:44:09+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=326a47939dabff218205ca2959b9701e2e0ce47c'/>
<id>326a47939dabff218205ca2959b9701e2e0ce47c</id>
<content type='text'>
When one of the subvolume is down, then lock request is not attempted
on that subvolume and move on to the next subvolume.

/* skip over children that are down */
                while ((child_index &lt; priv-&gt;child_count)
                       &amp;&amp; !local-&gt;child_up[child_index])
                        child_index++;

In the above case if there are 2 subvolumes and 2nd subvolume is down (subvolume
1 from afr's view), then after attempting lock on 1st child (i.e subvolume 0)
child index is calculated to be 1. But since the 2nd child is down child_index
is incremented to 2 as per the above logic and lock request is STACK_WINDed to
the child with child_index 2. Since there are only 2 children for afr the child
(i.e the xlator_t pointer) for child_index will be NULL. The process crashes
when it dereference the NULL xlator object.

Change-Id: Icd9b5ad28bac1b805e6e80d53c12d296526bedf5
BUG: 765564
Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4438
Reviewed-by: Krishnan Parthasarathi &lt;kparthas@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When one of the subvolume is down, then lock request is not attempted
on that subvolume and move on to the next subvolume.

/* skip over children that are down */
                while ((child_index &lt; priv-&gt;child_count)
                       &amp;&amp; !local-&gt;child_up[child_index])
                        child_index++;

In the above case if there are 2 subvolumes and 2nd subvolume is down (subvolume
1 from afr's view), then after attempting lock on 1st child (i.e subvolume 0)
child index is calculated to be 1. But since the 2nd child is down child_index
is incremented to 2 as per the above logic and lock request is STACK_WINDed to
the child with child_index 2. Since there are only 2 children for afr the child
(i.e the xlator_t pointer) for child_index will be NULL. The process crashes
when it dereference the NULL xlator object.

Change-Id: Icd9b5ad28bac1b805e6e80d53c12d296526bedf5
BUG: 765564
Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4438
Reviewed-by: Krishnan Parthasarathi &lt;kparthas@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: wakeup delayed post op on fsync</title>
<updated>2013-01-29T20:33:05+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2013-01-29T12:28:09+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=360868e9b010f770bd727570e0f0404069c3375a'/>
<id>360868e9b010f770bd727570e0f0404069c3375a</id>
<content type='text'>
Change-Id: I5d84ef72615f9d71b4af210976e2449de6e02326
BUG: 888174
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4446
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: I5d84ef72615f9d71b4af210976e2449de6e02326
BUG: 888174
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4446
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Change order of unwind, resume for writev</title>
<updated>2013-01-29T20:29:51+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2013-01-25T05:28:19+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=36a9ac82f05aeb01b0656bb82631a87db6a11803'/>
<id>36a9ac82f05aeb01b0656bb82631a87db6a11803</id>
<content type='text'>
Generally inode-write fops do transaction.unwind then
transaction.resume, but writev needs to make sure that
delayed post-op frame is placed in fdctx before unwind
happens. This prevents the race of flush doing the
changelog wakeup first in fuse thread and then this
writev placing its delayed post-op frame in fdctx.
This helps flush make sure all the delayed post-ops are
completed.

Change-Id: Ia78ca556f69cab3073c21172bb15f34ff8c3f4be
BUG: 888174
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4428
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Generally inode-write fops do transaction.unwind then
transaction.resume, but writev needs to make sure that
delayed post-op frame is placed in fdctx before unwind
happens. This prevents the race of flush doing the
changelog wakeup first in fuse thread and then this
writev placing its delayed post-op frame in fdctx.
This helps flush make sure all the delayed post-ops are
completed.

Change-Id: Ia78ca556f69cab3073c21172bb15f34ff8c3f4be
BUG: 888174
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4428
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: before checking lock_count of internal lock make sure its not</title>
<updated>2013-01-28T08:20:46+00:00</updated>
<author>
<name>Raghavendra Bhat</name>
<email>raghavendra@redhat.com</email>
</author>
<published>2013-01-28T07:25:17+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=99e63168c498cf57f3f8fabab1d2b86a4ea639ce'/>
<id>99e63168c498cf57f3f8fabab1d2b86a4ea639ce</id>
<content type='text'>
             entrylk

when the expected lock count is equal to the attempted lock count, then before
deciding that lock is failed on all the nodes, make sure the lock type is
checked properly.

Change-Id: I1f362d54320cb6ec5654c5c69915c0f61c91d8c7
BUG: 765564
Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4436
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
             entrylk

when the expected lock count is equal to the attempted lock count, then before
deciding that lock is failed on all the nodes, make sure the lock type is
checked properly.

Change-Id: I1f362d54320cb6ec5654c5c69915c0f61c91d8c7
BUG: 765564
Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4436
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>replicate: fix lock counting in blocking lock path</title>
<updated>2013-01-26T08:20:03+00:00</updated>
<author>
<name>Anand Avati</name>
<email>avati@redhat.com</email>
</author>
<published>2013-01-26T03:22:55+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=1939519979b6c4da110f08434fe63d0b03332d52'/>
<id>1939519979b6c4da110f08434fe63d0b03332d52</id>
<content type='text'>
As of http://review.gluster.org/2828, the blocking lock code
path's condition for checking completion of locking atempt is
broken. The condition -

	if ((child_index == priv-&gt;child_count) || ...)

and

	if ((child_index == priv-&gt;child_count) &amp;&amp; ...)

which is retained to check completion of blocking lock attempts
for DATA/METADATA transaction will _always_ fail because a few
lines above we have -

      child_index = cookie % priv-&gt;child_count;

So child_index will never equal priv-&gt;child_count. This leaves
the correctness at the mercy of the next part of the
conditional -

	.. (int_lock-&gt;lock_count == int_lock-&gt;lk_expected_count) ..

This "works" as long as no server went down during the transaction.

If a server goes down in the middle of the transaction, then this
condition also fails, and the code wraps around and starts a
blocking lock attempt loop all the way again from from the first server.
This results in double locks getting acquired on those servers, and
eventually the second condition gets hit (first condition is _never_
hit) and we come out of locking phase.

During unlock phase we perform only one unlock per server leaving the
other lock "leaked" forever.

Change-Id: I7189cdf3f70901b04647516fe1d1e189f36cc8dd
BUG: 765564
Signed-off-by: Anand Avati &lt;avati@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4433
Reviewed-by: Krishnan Parthasarathi &lt;kparthas@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As of http://review.gluster.org/2828, the blocking lock code
path's condition for checking completion of locking atempt is
broken. The condition -

	if ((child_index == priv-&gt;child_count) || ...)

and

	if ((child_index == priv-&gt;child_count) &amp;&amp; ...)

which is retained to check completion of blocking lock attempts
for DATA/METADATA transaction will _always_ fail because a few
lines above we have -

      child_index = cookie % priv-&gt;child_count;

So child_index will never equal priv-&gt;child_count. This leaves
the correctness at the mercy of the next part of the
conditional -

	.. (int_lock-&gt;lock_count == int_lock-&gt;lk_expected_count) ..

This "works" as long as no server went down during the transaction.

If a server goes down in the middle of the transaction, then this
condition also fails, and the code wraps around and starts a
blocking lock attempt loop all the way again from from the first server.
This results in double locks getting acquired on those servers, and
eventually the second condition gets hit (first condition is _never_
hit) and we come out of locking phase.

During unlock phase we perform only one unlock per server leaving the
other lock "leaked" forever.

Change-Id: I7189cdf3f70901b04647516fe1d1e189f36cc8dd
BUG: 765564
Signed-off-by: Anand Avati &lt;avati@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4433
Reviewed-by: Krishnan Parthasarathi &lt;kparthas@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: Modified book-keeping structures for entrylks</title>
<updated>2013-01-23T17:17:00+00:00</updated>
<author>
<name>Krishnan Parthasarathi</name>
<email>kp@gluster.com</email>
</author>
<published>2012-02-21T10:58:13+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=1ab086d8630687985fb412f4093a53d3e3e9aca1'/>
<id>1ab086d8630687985fb412f4093a53d3e3e9aca1</id>
<content type='text'>
* There are upto 3 entry lockees that may be needed to perform
  entrylk'ing in posix dir-write operations.

* For eg, rmdir ("/a/b") needs to acquire locks on two entities,
  - entrylk ("/a", "b")
  - entrylk ("/a/b", null)

* Changed existing entrylk/rename/selfheal (entrylk) transactions
  to use the new book-keeping structures

* Fixed few issues in afr_trace_entry_lk{in,out} functions. Tracing is now
  aware of the new entry lockee structure.

Implementation notes:
* Changed 'cookie' sent in stack_wind to encode lockee_entity_no
  and subvol_no.

  cookie is a non-negative integer such that 0 &lt;= cookie &lt; replica_count,
  When more than one lock is being acquired across the subvolumes,
  cookie % replica_count gives the subvol_no
  cookie / replica_count gives the lockee_entity_no.

Change-Id: Idbf41803387a7d59a0f7fcb1453d91cea74da153
BUG: 765564
Signed-off-by: Krishnan Parthasarathi &lt;kp@gluster.com&gt;
Reviewed-on: http://review.gluster.org/2828
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* There are upto 3 entry lockees that may be needed to perform
  entrylk'ing in posix dir-write operations.

* For eg, rmdir ("/a/b") needs to acquire locks on two entities,
  - entrylk ("/a", "b")
  - entrylk ("/a/b", null)

* Changed existing entrylk/rename/selfheal (entrylk) transactions
  to use the new book-keeping structures

* Fixed few issues in afr_trace_entry_lk{in,out} functions. Tracing is now
  aware of the new entry lockee structure.

Implementation notes:
* Changed 'cookie' sent in stack_wind to encode lockee_entity_no
  and subvol_no.

  cookie is a non-negative integer such that 0 &lt;= cookie &lt; replica_count,
  When more than one lock is being acquired across the subvolumes,
  cookie % replica_count gives the subvol_no
  cookie / replica_count gives the lockee_entity_no.

Change-Id: Idbf41803387a7d59a0f7fcb1453d91cea74da153
BUG: 765564
Signed-off-by: Krishnan Parthasarathi &lt;kp@gluster.com&gt;
Reviewed-on: http://review.gluster.org/2828
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Remove strict-readdir implementation</title>
<updated>2013-01-23T09:26:09+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2012-12-13T19:35:39+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=91ac9f97417790f439702c0297bca953ece597c8'/>
<id>91ac9f97417790f439702c0297bca953ece597c8</id>
<content type='text'>
Leaving option frame-work un-changed for backward compatibility.

Change-Id: I40bce1ec360801307e67f09e53b0721f64efab37
BUG: 886998
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4309
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Leaving option frame-work un-changed for backward compatibility.

Change-Id: I40bce1ec360801307e67f09e53b0721f64efab37
BUG: 886998
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4309
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>self-heald: Remove stale index even in heal info</title>
<updated>2013-01-23T03:43:50+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2012-10-17T15:54:53+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=e908659dfee988799deee888259ca54a0a9f6e31'/>
<id>e908659dfee988799deee888259ca54a0a9f6e31</id>
<content type='text'>
Change-Id: Ic1c9559aec59c1fb9dfede4aba8895f3b86f32f1
BUG: 861015
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4098
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Ic1c9559aec59c1fb9dfede4aba8895f3b86f32f1
BUG: 861015
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4098
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Link inode only on lookup</title>
<updated>2013-01-22T05:48:29+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2012-10-16T04:54:30+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=8d5bb5292d75794838ee37e87a97a01cfc59b932'/>
<id>8d5bb5292d75794838ee37e87a97a01cfc59b932</id>
<content type='text'>
Problem:
When "gluster volume heal &lt;volname&gt; info is executed, crawl's
process_entry is not going to populate iatt structure so the
iatt's gfid will be empty. So inode_links are failing.

Fix:
inode_link should be done only after lookup i.e. when heal is
performed. So moved the inode_link related code to just after
the lookup which is triggered when self-heal is done.

Tests:
The testcase that gives this issue does not give the inode-link
failures anymore. glustershd heal, info commands are working as
expected.
Wrote basic automation tests for proactive-self-heal-daemon
https://github.com/pranithk/gluster-tests/blob/master/afr/proactive-self-heal.sh

Change-Id: Ic112bf104a4d553a64d3d8559f681a25ae1a5362
BUG: 861015
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4090
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
When "gluster volume heal &lt;volname&gt; info is executed, crawl's
process_entry is not going to populate iatt structure so the
iatt's gfid will be empty. So inode_links are failing.

Fix:
inode_link should be done only after lookup i.e. when heal is
performed. So moved the inode_link related code to just after
the lookup which is triggered when self-heal is done.

Tests:
The testcase that gives this issue does not give the inode-link
failures anymore. glustershd heal, info commands are working as
expected.
Wrote basic automation tests for proactive-self-heal-daemon
https://github.com/pranithk/gluster-tests/blob/master/afr/proactive-self-heal.sh

Change-Id: Ic112bf104a4d553a64d3d8559f681a25ae1a5362
BUG: 861015
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/4090
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
