<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/volume.rc, branch v3.11.0beta1</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>cluster/ec: Don't trigger data/metadata heal on Lookups</title>
<updated>2017-02-27T03:06:55+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2017-01-25T10:01:44+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=c1fc1fc9cb5a13e6ddf8c9270deb0c7609333540'/>
<id>c1fc1fc9cb5a13e6ddf8c9270deb0c7609333540</id>
<content type='text'>
Problem-1:
Lookup doesn't take any locks, so a version mismatch observed by Lookup can't
be trusted. If we launch a heal based on this information, it leads to
self-heals that hurt I/O performance in the cases where Lookup is wrong.
Since the self-heal-daemon and operations on the inode from clients which do
take locks can still trigger heals, we can choose not to attempt a heal on
Lookup.
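
A rough way to observe the new behaviour from a test, using the test
framework's conventions ($CLI, $V0, $M0, $HEAL_TIMEOUT, get_pending_heal_count);
the file name and the way the mismatch is created are illustrative only:

  TEST dd if=/dev/urandom of=$M0/file1 bs=1k count=1
  # ... create a version mismatch on one brick, e.g. by taking it down
  # around a write ...
  TEST stat $M0/file1               # a plain Lookup no longer queues a heal
  TEST $CLI volume heal $V0         # the locked/index path still heals it
  EXPECT_WITHIN $HEAL_TIMEOUT "0" get_pending_heal_count $V0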

Problem-2:
Fixed the spurious failure of
tests/bitrot/bug-1373520.t
Here, what was happening was that ec_heal_inspect()
was preventing the 'name' heal from happening.

Problem-3:
tests/basic/ec/ec-background-heals.t
To be honest, I don't know what the problem was. While fixing
the two problems above I made some changes to ec_heal_inspect() and
ec_need_heal(), after which I could no longer recreate the spurious
failure even after a long time.

BUG: 1414287
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Change-Id: Ife2535e1d0b267712973673f6d474e288f3c6834
Reviewed-on: https://review.gluster.org/16468
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</content>
</entry>
<entry>
<title>glusterd: keep snapshot bricks separate from regular ones</title>
<updated>2017-02-10T13:16:48+00:00</updated>
<author>
<name>Jeff Darcy</name>
<email>jdarcy@redhat.com</email>
</author>
<published>2017-02-03T15:51:21+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=f1c6ae24361b1bf39794a34ea35a0202a6b49fa6'/>
<id>f1c6ae24361b1bf39794a34ea35a0202a6b49fa6</id>
<content type='text'>
The problem here is that a volume's transport options can change, but
any snapshots' bricks don't follow along even though they're now
incompatible (with respect to multiplexing).  This was causing the
USS+SSL test to fail.  By keeping the snapshot bricks separate
(though still potentially multiplexed with other snapshot bricks
including those for other volumes) we can ensure that they remain
unaffected by changes to their parent volumes.
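
An illustrative sequence of the kind of scenario involved (a sketch only,
assuming a volume $V0 that already has a snapshot; these are not the exact
steps of the USS+SSL test):

  gluster volume set all cluster.brick-multiplex on
  gluster snapshot create snap1 $V0 no-timestamp
  gluster snapshot activate snap1
  # the parent volume's transport options change while the snapshot
  # bricks are already running
  gluster volume set $V0 server.ssl on
  gluster volume set $V0 client.ssl on
  # snap1's bricks must stay in a snapshot-only brick process instead of
  # being multiplexed with the now-incompatible parent bricks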

Also fixed various issues with how the test waits (or, more precisely, didn't
wait) for various events to complete before continuing.

Change-Id: Iab4a8a44fac5760373fac36956a3bcc27cf969da
BUG: 1385758
Signed-off-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16544
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Avra Sengupta &lt;asengupt@redhat.com&gt;
Tested-by: Avra Sengupta &lt;asengupt@redhat.com&gt;
</content>
</entry>
<entry>
<title>tests: fix online_brick_count for multiplexing</title>
<updated>2017-02-08T03:22:19+00:00</updated>
<author>
<name>Jeff Darcy</name>
<email>jdarcy@redhat.com</email>
</author>
<published>2017-02-02T15:22:00+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=4a7fd196d4a141f2b693d5b49995733f6ad1776f'/>
<id>4a7fd196d4a141f2b693d5b49995733f6ad1776f</id>
<content type='text'>
With multiplexing, the number of brick processes no longer matches the number
of bricks, so counting processes doesn't work.  Counting *pidfiles* does.
Ironically, the fix broke multiplex.t, which used this function, so that test
now uses a different function with the old process-counting behavior.
Also had to fix online_brick_count and kill_node in cluster.rc to be
consistent with the new reality.
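
A minimal sketch of the pidfile-counting idea (not necessarily the exact
implementation in volume.rc; the path pattern is an assumption based on the
usual $GLUSTERD_WORKDIR layout):

  function count_brick_pidfiles {
          find $GLUSTERD_WORKDIR/vols -name '*.pid' | wc -l
  }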

Change-Id: I4e81a6633b93227e10604f53e18a0b802c75cbcc
BUG: 1385758
Signed-off-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16527
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</content>
</entry>
<entry>
<title>core: run many bricks within one glusterfsd process</title>
<updated>2017-01-31T00:13:58+00:00</updated>
<author>
<name>Jeff Darcy</name>
<email>jdarcy@redhat.com</email>
</author>
<published>2016-12-08T21:24:15+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=1a95fc3036db51b82b6a80952f0908bc2019d24a'/>
<id>1a95fc3036db51b82b6a80952f0908bc2019d24a</id>
<content type='text'>
This patch adds support for multiple brick translator stacks running
in a single brick server process.  This reduces our per-brick memory usage by
approximately 3x, and our appetite for TCP ports even more.  It also creates
potential to avoid process/thread thrashing, and to improve QoS by scheduling
more carefully across the bricks, but realizing that potential will require
further work.

Multiplexing is controlled by the "cluster.brick-multiplex" global option.  By
default it's off, and bricks are started in separate processes as before.  If
multiplexing is enabled, then *compatible* bricks (mostly those with the same
transport options) will be started in the same process.
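
For example (illustrative; since this is a global option it is set on "all"
rather than on an individual volume):

  gluster volume set all cluster.brick-multiplex on
  # bricks of compatible volumes now share one glusterfsd process
  gluster volume set all cluster.brick-multiplex off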

Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
BUG: 1385758
Signed-off-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
Reviewed-on: https://review.gluster.org/14763
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</content>
</entry>
<entry>
<title>features/bit-rot-stub: use the correct spelling of quarantine for bad objects</title>
<updated>2017-01-30T20:17:51+00:00</updated>
<author>
<name>Raghavendra Bhat</name>
<email>raghavendra@redhat.com</email>
</author>
<published>2016-12-02T20:24:46+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=2e0dceb6165d0a93581284fa2e0d74abe4ee615b'/>
<id>2e0dceb6165d0a93581284fa2e0d74abe4ee615b</id>
<content type='text'>
The "bad objects" container directory (the directory holding the list of bad
objects) was named "quanrantine" instead of "quarantine".
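
With the fix, the container sits at the correctly spelled path under each
brick (path shown for illustration, assuming the default layout and the test
framework's $B0 brick-directory convention):

  ls $B0/brick0/.glusterfs/quarantine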

Change-Id: I8c20381ac637201d9d1a224f5223e8dfbed53f1e
BUG: 1401571
Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16027
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
</entry>
<entry>
<title>tier : Tier as a service</title>
<updated>2017-01-17T04:49:47+00:00</updated>
<author>
<name>hari gowtham</name>
<email>hgowtham@redhat.com</email>
</author>
<published>2016-07-12T11:10:28+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=3263d1c4f4b7efd1a018c17e1ba4dd9245094f48'/>
<id>3263d1c4f4b7efd1a018c17e1ba4dd9245094f48</id>
<content type='text'>
tierd is implemented by separating it from the rebalance process.

The commands affected:

1) Attach tier will trigger this process instead of the old one.
2) tier start and tier start force will also trigger this process.
3) volume status [tier] will show the tier daemon as a process instead
of a task, and normal tier status and tier detach status work.
4) tier stop is implemented.
5) detach tier is implemented separately, along with a new detach tier
status.
6) volume tier volname status will work using the changes.
7) volume set works.
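
For example, the tier commands handled by the new daemon look like this
(illustrative; $V0, $H0 and $B0 follow the test framework's conventions):

  gluster volume tier $V0 attach replica 2 $H0:$B0/hot1 $H0:$B0/hot2
  gluster volume tier $V0 status
  gluster volume tier $V0 detach start
  gluster volume tier $V0 detach status
  gluster volume status $V0          # tierd now appears as a service/process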

This patch separates the tier translator from the legacy
DHT rebalance code. It now sends the RPCs from the CLI
to glusterd separately from the DHT rebalance code.
The daemon is now a service, similar to the snapshot daemon,
and can be viewed using the volume status command.

The code for the validation and commit phases is the same
as the earlier tier validation code in DHT rebalance.

The "brickop" phase has been changed so that the status
command can use this framework.

The service management framework is now used.
DHT rebalance does not use this framework.

This service framework takes care of:

*) spawning the daemon, killing it, and other such operations.
*) volume set options, which are written into the volfile.
*) restart and reconfigure functions. Restart is used to restart
the daemon at two points:
        1) after gluster goes down and comes up.
        2) to stop detach tier.
*) reconfigure, which is used to make immediate volfile changes
without restarting the daemon.
It also has the code to rewrite the volfile for topological
changes (which come into play during add-brick and remove-brick).

With this patch the log, pid, and volfile are separated
and put into their respective directories.

Change-Id: I3681d0d66894714b55aa02ca2a30ac000362a399
BUG: 1313838
Signed-off-by: hari gowtham &lt;hgowtham@redhat.com&gt;
Reviewed-on: http://review.gluster.org/13365
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: hari gowtham &lt;hari.gowtham005@gmail.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Dan Lambright &lt;dlambrig@redhat.com&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</content>
</entry>
<entry>
<title>cluster/ec: Implement heal info with lock</title>
<updated>2016-10-11T09:29:27+00:00</updated>
<author>
<name>Ashish Pandey</name>
<email>aspandey@redhat.com</email>
</author>
<published>2016-09-20T07:02:28+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=0fed7e7f0aad9973900c89434f736797d9ace2bd'/>
<id>0fed7e7f0aad9973900c89434f736797d9ace2bd</id>
<content type='text'>
Problem: Currently the heal info command prints all
files/directories whose index is present in the
.glusterfs/indices folder.
After patch http://review.gluster.org/#/c/13733/ the
index of a file that is going through an update fop
is also present in .glusterfs/indices even
if the fop succeeds on all the bricks. If the heal
info command is run at this time, it also displays this
file, which is actually healthy and does not require any heal.

Solution: Take a lock on the file corresponding to each index
and inspect its xattrs to decide whether the file needs heal or not.
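
Illustrative only: the per-brick xattrs consulted by the ec xlator when
deciding whether a file really needs heal can be inspected with getfattr
($B0 is the test framework's brick-directory convention):

  getfattr -d -m . -e hex $B0/brick0/file1
  # trusted.ec.version, trusted.ec.size and trusted.ec.dirty must agree
  # across the bricks for the file to be considered healthy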

Change-Id: I6361e2813ece369be12d02e74816df4eddb81cfa
BUG: 1366815
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15543
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
</entry>
<entry>
<title>tests: Fix races in open-behind.t</title>
<updated>2016-09-27T11:44:27+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-09-27T02:21:48+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=cd072b61841c19ec942871e3f06519d2a938814b'/>
<id>cd072b61841c19ec942871e3f06519d2a938814b</id>
<content type='text'>
Problems:
1) flush-behind is on by default, so just because a write completes doesn't
   mean it is on disk; it could still be in write-behind's cache. This leads
   to failures where you write from one mount, expect the data to be there on
   the other mount, and sometimes it isn't.
2) Sometimes the graph switch has not completed by the time we issue the read,
   which leads to opens not being sent to the brick and hence to failures.

Fixes:
1) Disable flush-behind.
2) Add new functions to check that the new graph is in place and connected to
   the bricks before 'cat' is executed.
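
A rough sketch of the two fixes in test-script form ($CLI, $V0, $M0,
EXPECT_WITHIN and $PROCESS_UP_TIMEOUT are test framework conventions; the
wait function named here is illustrative, not necessarily the one the patch
adds):

  TEST $CLI volume set $V0 performance.flush-behind off
  EXPECT_WITHIN $PROCESS_UP_TIMEOUT "Y" new_graph_connected_to_bricks $M0
  TEST cat $M0/testfile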

BUG: 1379511
Change-Id: I0faed684e0dc70cfd2258ce6fdaed655ee915ae6
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15575
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
</entry>
<entry>
<title>feature/bitrot: Fix recovery of corrupted hardlink</title>
<updated>2016-09-08T17:09:33+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2016-09-06T12:58:42+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=b86a7de9b5ea9dcd0a630dbe09fce6d9ad0d8944'/>
<id>b86a7de9b5ea9dcd0a630dbe09fce6d9ad0d8944</id>
<content type='text'>
Problem:
When a file with hardlinks is corrupted in an ec volume,
the recovery steps mentioned below were not working.
Only the name and metadata were healing, but not the data.

Cause:
The bad-file marker in the inode context is not removed.
Hence, when self-heal tries to open the file for data
healing, it fails with EIO.

Background:
Bitrot deletes the inode context during forget.

Briefly, the recovery involves the following steps.
   1. Delete the entry marked with the bad-file xattr
      from the backend. Delete all the hardlinks, including
      the .glusterfs hardlink, as well.
   2. Access each hardlink of the file, including the
      original, from the mount.
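
A rough illustration of those steps (file names are made up, $B0/$M0 follow
the test framework's conventions, and the gfid hardlink path under
.glusterfs is abbreviated):

   # on the brick backend
   rm $B0/brick0/file1 $B0/brick0/link1
   rm $B0/brick0/.glusterfs/aa/bb/aabb...-gfid
   # from the mount, access every hardlink of the file
   stat $M0/file1
   stat $M0/link1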

Step 2 sends a lookup to the brick where the files
were deleted from the backend, and it returns ENOENT. On
ENOENT, the server xlator forgets the inode if there are
no dentries associated with it. But in the case of hardlinks,
forget won't be called because dentries (the other hardlink
files) are still associated with the inode. Hence bitrot-stub
won't delete its context, failing the data self-heal.

Fix:
Bitrot-stub should delete the inode context on getting
ENOENT during lookup.

Change-Id: Ice6adc18625799e7afd842ab33b3517c2be264c1
BUG: 1373520
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15408
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
</content>
</entry>
<entry>
<title>feature/bitrot: Ondemand scrub option for bitrot</title>
<updated>2016-08-25T21:39:38+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2016-08-05T03:33:22+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=0b3e4130b576c11156d6327e4cc3c9310a74c143'/>
<id>0b3e4130b576c11156d6327e4cc3c9310a74c143</id>
<content type='text'>
The bitrot scrubber takes 'hourly/daily/biweekly/monthly'
as the values for 'scrub-frequency'. There is no way
to run the scrubbing exactly when the admin wants it.

Ondemand scrubbing brings in the new option 'ondemand',
with which the admin can start scrubbing on demand.
It starts the scrubbing immediately.

Ondemand scrubbing succeeds only if the scrubber
is in the 'Active (Idle)' state (waiting for its next frequency
cycle to start scrubbing). It is not entertained when
the scrubber is 'Paused' or already running.

Here is the command line syntax:

gluster volume bitrot &lt;vol name&gt; scrub ondemand
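
For example, assuming bitrot is already enabled on volume $V0 ($V0 stands in
for the volume name; the status check is shown for illustration):

  gluster volume bitrot $V0 scrub status     # scrubber should be Active (Idle)
  gluster volume bitrot $V0 scrub ondemand   # starts a scrub run immediately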

Change-Id: I84c28904367eed827a7dae8d6a535c14b28e9f4d
BUG: 1366195
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15111
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Venky Shankar &lt;vshankar@redhat.com&gt;
</content>
</entry>
</feed>
