<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/cluster/afr/src/afr-self-heal-metadata.c, branch v7.5</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>afr: mark pending xattrs as a part of metadata heal</title>
<updated>2020-04-07T11:09:25+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2018-12-24T07:30:19+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=b1071f4257f139ed2a01ffb881e05d1ac00a4d37'/>
<id>b1071f4257f139ed2a01ffb881e05d1ac00a4d37</id>
<content type='text'>
...if pending xattrs are zero for all children.

Problem:
If there are no pending xattrs and a metadata heal needs to be
performed, it can be possible that we end up with xattrs inadvertendly
deleted from all bricks, as explained in the  BZ.

Fix:
After picking one among the sources as the good copy, mark pending xattrs on
all sources to blame the sinks. Now even if this metadata heal fails midway,
a subsequent heal will still choose one of the valid sources that it
picked previously.

Updates: #1067
Change-Id: If1b050b70b0ad911e162c04db4d89b263e2b8d7b
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
(cherry picked from commit 2d5ba449e9200b16184b1e7fc84cabd015f1f779)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
...if pending xattrs are zero for all children.

Problem:
If there are no pending xattrs and a metadata heal needs to be
performed, it can be possible that we end up with xattrs inadvertendly
deleted from all bricks, as explained in the  BZ.

Fix:
After picking one among the sources as the good copy, mark pending xattrs on
all sources to blame the sinks. Now even if this metadata heal fails midway,
a subsequent heal will still choose one of the valid sources that it
picked previously.

Updates: #1067
Change-Id: If1b050b70b0ad911e162c04db4d89b263e2b8d7b
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
(cherry picked from commit 2d5ba449e9200b16184b1e7fc84cabd015f1f779)
</pre>
</div>
</content>
</entry>
<entry>
<title>Cluster/afr: Don't treat all bricks having metadata pending as split-brain</title>
<updated>2019-06-10T14:48:11+00:00</updated>
<author>
<name>karthik-us</name>
<email>ksubrahm@redhat.com</email>
</author>
<published>2019-06-06T05:29:42+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=1b0b869d91d4e5bedc69922128551602dc4bbc13'/>
<id>1b0b869d91d4e5bedc69922128551602dc4bbc13</id>
<content type='text'>
Problem:
We currently don't have a roll-back/undoing of post-ops if quorum is not met.
Though the FOP is still unwound with failure, the xattrs remain on the disk.
Due to these partial post-ops and partial heals (healing only when 2 bricks
are up), we can end up in metadata split-brain purely from the afr xattrs
point of view i.e each brick is blamed by atleast one of the others for
metadata. These scenarios are hit when there is frequent connect/disconnect
of the client/shd to the bricks.

Fix:
Pick a source based on the xattr values. If 2 bricks blame one, the blamed
one must be treated as sink. If there is no majority, all are sources. Once
we pick a source, self-heal will then do the heal instead of erroring out
due to split-brain.
This patch also adds restriction of all the bricks to be up to perform
metadata heal to avoid any metadata loss.

Removed the test case tests/bugs/replicate/bug-1468279-source-not-blaming-sinks.t
as it was doing metadata heal even when only 2 of 3 bricks were up.

Change-Id: I07a9d62f84ceda329dcab1f02a33aeed258dcb09
fixes: bz#1717819
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
We currently don't have a roll-back/undoing of post-ops if quorum is not met.
Though the FOP is still unwound with failure, the xattrs remain on the disk.
Due to these partial post-ops and partial heals (healing only when 2 bricks
are up), we can end up in metadata split-brain purely from the afr xattrs
point of view i.e each brick is blamed by atleast one of the others for
metadata. These scenarios are hit when there is frequent connect/disconnect
of the client/shd to the bricks.

Fix:
Pick a source based on the xattr values. If 2 bricks blame one, the blamed
one must be treated as sink. If there is no majority, all are sources. Once
we pick a source, self-heal will then do the heal instead of erroring out
due to split-brain.
This patch also adds restriction of all the bricks to be up to perform
metadata heal to avoid any metadata loss.

Removed the test case tests/bugs/replicate/bug-1468279-source-not-blaming-sinks.t
as it was doing metadata heal even when only 2 of 3 bricks were up.

Change-Id: I07a9d62f84ceda329dcab1f02a33aeed258dcb09
fixes: bz#1717819
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: add client-pid to all gf_event() calls</title>
<updated>2019-03-27T00:18:30+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2019-03-15T14:01:03+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=8016d51a3bbd410b0b927ed66be50a09574b7982'/>
<id>8016d51a3bbd410b0b927ed66be50a09574b7982</id>
<content type='text'>
client-pid for glustershd is GF_CLIENT_PID_SELF_HEALD
client-pid for glfsheal is GF_CLIENT_PID_GLFS_HEALD

updates: bz#1689250
Change-Id: Ib3a863af160ff48c822a5e6b0c27c575c9887470
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
client-pid for glustershd is GF_CLIENT_PID_SELF_HEALD
client-pid for glfsheal is GF_CLIENT_PID_GLFS_HEALD

updates: bz#1689250
Change-Id: Ib3a863af160ff48c822a5e6b0c27c575c9887470
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libglusterfs: Move devel headers under glusterfs directory</title>
<updated>2018-12-05T21:47:04+00:00</updated>
<author>
<name>ShyamsundarR</name>
<email>srangana@redhat.com</email>
</author>
<published>2018-11-29T19:08:06+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=20ef211cfa5b5fcc437484a879fdc5d4c66bbaf5'/>
<id>20ef211cfa5b5fcc437484a879fdc5d4c66bbaf5</id>
<content type='text'>
libglusterfs devel package headers are referenced in code using
include semantics for a program, this while it works can be better
especially when dealing with out of tree xlator builds or in
general out of tree devel package usage.

Towards this, the following changes are done,
- moved all devel headers under a glusterfs directory
- Included these headers using system header notation &lt;&gt; in all
code outside of libglusterfs
- Included these headers using own program notation "" within
libglusterfs

This change although big, is just moving around the headers and
making it correct when including these headers from other sources.

This helps us correctly include libglusterfs includes without
namespace conflicts.

Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR &lt;srangana@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
libglusterfs devel package headers are referenced in code using
include semantics for a program, this while it works can be better
especially when dealing with out of tree xlator builds or in
general out of tree devel package usage.

Towards this, the following changes are done,
- moved all devel headers under a glusterfs directory
- Included these headers using system header notation &lt;&gt; in all
code outside of libglusterfs
- Included these headers using own program notation "" within
libglusterfs

This change although big, is just moving around the headers and
making it correct when including these headers from other sources.

This helps us correctly include libglusterfs includes without
namespace conflicts.

Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR &lt;srangana@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: fix incorrect reporting of directory split-brain</title>
<updated>2018-09-21T16:35:26+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2018-09-20T04:44:26+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=9d8f5a34c1ff4ae512f80fe9651508ae201c273e'/>
<id>9d8f5a34c1ff4ae512f80fe9651508ae201c273e</id>
<content type='text'>
Problem:
When a directory has dirty xattrs due to failed post-ops or when
replace/reset brick is performed, AFR does a conservative merge as
expected, but heal-info reports it as split-brain because there are no
clear sources.

Fix:
Modify pending flag to contain information about pending heals and
split-brains. For directories, if spit-brain flag is not set,just show
them as needing heal and not being in split-brain.

Fixes: bz#1626994
Change-Id: I09ef821f6887c87d315ae99e6b1de05103cd9383
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
When a directory has dirty xattrs due to failed post-ops or when
replace/reset brick is performed, AFR does a conservative merge as
expected, but heal-info reports it as split-brain because there are no
clear sources.

Fix:
Modify pending flag to contain information about pending heals and
split-brains. For directories, if spit-brain flag is not set,just show
them as needing heal and not being in split-brain.

Fixes: bz#1626994
Change-Id: I09ef821f6887c87d315ae99e6b1de05103cd9383
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Land part 2 of clang-format changes</title>
<updated>2018-09-12T12:22:45+00:00</updated>
<author>
<name>Gluster Ant</name>
<email>bugzilla-bot@gluster.org</email>
</author>
<published>2018-09-12T12:22:45+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=e16868dede6455cab644805af6fe1ac312775e13'/>
<id>e16868dede6455cab644805af6fe1ac312775e13</id>
<content type='text'>
Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
Signed-off-by: Nigel Babu &lt;nigelb@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
Signed-off-by: Nigel Babu &lt;nigelb@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Fail open on split-brain</title>
<updated>2017-10-26T18:23:35+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2017-09-04T11:27:25+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=786343abca3474ff01aa1017210112d97cbc4843'/>
<id>786343abca3474ff01aa1017210112d97cbc4843</id>
<content type='text'>
Problem:
Append on a file with split-brain succeeds. Open is intercepted by open-behind,
when write comes on the file, open-behind does open+write. Open succeeds
because afr doesn't fail it. Then write succeeds because write-behind
intercepts it. Flush is also intercepted by write-behind, so the application
never gets to know that the write failed.

Fix:
Fail open on split-brain, so that when open-behind does open+write open fails
which leads to write failure. Application will know about this failure.

Change-Id: I4bff1c747c97bb2925d6987f4ced5f1ce75dbc15
BUG: 1294051
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Append on a file with split-brain succeeds. Open is intercepted by open-behind,
when write comes on the file, open-behind does open+write. Open succeeds
because afr doesn't fail it. Then write succeeds because write-behind
intercepts it. Flush is also intercepted by write-behind, so the application
never gets to know that the write failed.

Fix:
Fail open on split-brain, so that when open-behind does open+write open fails
which leads to write failure. Application will know about this failure.

Change-Id: I4bff1c747c97bb2925d6987f4ced5f1ce75dbc15
BUG: 1294051
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: mark non sources as sinks in metadata heal</title>
<updated>2017-07-13T17:22:58+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2017-07-06T14:19:47+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=77c1ed5fd299914e91ff034d78ef6e3600b9151c'/>
<id>77c1ed5fd299914e91ff034d78ef6e3600b9151c</id>
<content type='text'>
Problem:
In a 3 way replica, when the source brick does not have pending xattrs
for the sinks, but the 2 sinks blame each other, metadata heal was not
happpening because we were not setting all non-sources as sinks.

Fix: Mark all non-sources as sinks, like it is done in data and entry
heal.

Change-Id: I534978940f5087302e307fcc810a48ffe898ce08
BUG: 1468279
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17717
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
In a 3 way replica, when the source brick does not have pending xattrs
for the sinks, but the 2 sinks blame each other, metadata heal was not
happpening because we were not setting all non-sources as sinks.

Fix: Mark all non-sources as sinks, like it is done in data and entry
heal.

Change-Id: I534978940f5087302e307fcc810a48ffe898ce08
BUG: 1468279
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17717
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: restore atime/mtime for non-regular files</title>
<updated>2017-03-06T10:01:43+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2017-03-03T19:34:10+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=804a65f07ea8e2093f781807651d0d07513b2627'/>
<id>804a65f07ea8e2093f781807651d0d07513b2627</id>
<content type='text'>
AFR restores atime/mtime only as a part of data heal. For non-regular
files (dirs, symlinks, char/block/socket files etc) which do not undergo
data-heal, atime/mtime is not restored.

This patch restores atime/mtime as a part of metadata heal for such
files.

Change-Id: Id8da885fc93fdf65c2f4bae2af3605b146ac1f16
BUG: 1429198
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16844
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
AFR restores atime/mtime only as a part of data heal. For non-regular
files (dirs, symlinks, char/block/socket files etc) which do not undergo
data-heal, atime/mtime is not restored.

This patch restores atime/mtime as a part of metadata heal for such
files.

Change-Id: Id8da885fc93fdf65c2f4bae2af3605b146ac1f16
BUG: 1429198
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16844
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: allow I/O when favorite-child-policy is enabled</title>
<updated>2016-11-28T07:51:59+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2016-11-26T15:54:01+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=a07ddd8fcc8dcdcf7ccfa61211d258f13b9f9229'/>
<id>a07ddd8fcc8dcdcf7ccfa61211d258f13b9f9229</id>
<content type='text'>
Problem:
Currently, I/O on a split-brained file fails even when the
favorite-child-policy is set until the self-heal is complete.

Fix:
If a valid 'source' is found using the set favorite-child-policy, inspect
and reset the afr pending xattrs on the 'sinks' (inside appropriate locks),
refresh the inode and then proceed with the read or write transaction.

The resetting itself happens in the self-heal code and hence can also
happen in the client side background-heal or by the shd's index-heal in
addition to the txn code path explained above. When it happens in via
heal, we also add checks in undo-pending to not reset the sink xattrs
again.

Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626
BUG: 1386188
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reported-by: Simon Turcotte-Langevin &lt;simon.turcotte-langevin@ubisoft.com&gt;
Reviewed-on: http://review.gluster.org/15673
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Currently, I/O on a split-brained file fails even when the
favorite-child-policy is set until the self-heal is complete.

Fix:
If a valid 'source' is found using the set favorite-child-policy, inspect
and reset the afr pending xattrs on the 'sinks' (inside appropriate locks),
refresh the inode and then proceed with the read or write transaction.

The resetting itself happens in the self-heal code and hence can also
happen in the client side background-heal or by the shd's index-heal in
addition to the txn code path explained above. When it happens in via
heal, we also add checks in undo-pending to not reset the sink xattrs
again.

Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626
BUG: 1386188
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reported-by: Simon Turcotte-Langevin &lt;simon.turcotte-langevin@ubisoft.com&gt;
Reviewed-on: http://review.gluster.org/15673
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
