<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/bitrot/bug-1373520.t, branch v6.2</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>tests/bitrot: Fix tests/bitrot/bug-1373520.t</title>
<updated>2018-08-09T02:51:17+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2018-08-02T11:30:46+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=12ce1848f55686cfae0b03f532279938722c0963'/>
<id>12ce1848f55686cfae0b03f532279938722c0963</id>
<content type='text'>
The test was failing with brick-mux enabled
intermittently. As the test depends on lookup
to recover file via heal, it's advisable to
disable all perf xlators. Hence doing the same.

fixes: bz#1611566
Change-Id: Ib7705e7951d53c435b8e390298164d73c6d71745
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The test was failing with brick-mux enabled
intermittently. As the test depends on lookup
to recover file via heal, it's advisable to
disable all perf xlators. Hence doing the same.

fixes: bz#1611566
Change-Id: Ib7705e7951d53c435b8e390298164d73c6d71745
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>glusterd: Add multiple checks before attach/start a brick</title>
<updated>2018-07-27T01:24:09+00:00</updated>
<author>
<name>Mohit Agrawal</name>
<email>moagrawal@redhat.com</email>
</author>
<published>2018-07-12T07:59:48+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=9400b6f2c8aa219a493961e0ab9770b7f12e80d2'/>
<id>9400b6f2c8aa219a493961e0ab9770b7f12e80d2</id>
<content type='text'>
Problem: In brick mux scenario sometime glusterd is not able
         to start/attach a brick and gluster v status shows
         brick is already running

Solution:
          1) To make sure brick is running check brick_path in
             /proc/&lt;pid&gt;/fd , if a brick is consumed by the brick
             process it means brick stack is come up otherwise not
          2) Before start/attach a brick check if a brick is mounted
             or not
          3) At the time of printing volume status check brick is
             consumed by any brick process

Test:  To test the same followed procedure
       1) Setup brick mux environment on a vm
       2) Put a breaking point in gdb in function posix_health_check_thread_proc
          at the time of notify GF_EVENT_CHILD_DOWN event
       3) unmount anyone brick path forcefully
       4) check gluster v status it will show N/A for the brick
       5) Try to start volume with force option, glusterd throw
          message "No device available for mount brick"
       6) Mount the brick_root path
       7) Try to start volume with force option
       8) down brick is started successfully

Change-Id: I91898dad21d082ebddd12aa0d1f7f0ed012bdf69
fixes: bz#1595320
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem: In brick mux scenario sometime glusterd is not able
         to start/attach a brick and gluster v status shows
         brick is already running

Solution:
          1) To make sure brick is running check brick_path in
             /proc/&lt;pid&gt;/fd , if a brick is consumed by the brick
             process it means brick stack is come up otherwise not
          2) Before start/attach a brick check if a brick is mounted
             or not
          3) At the time of printing volume status check brick is
             consumed by any brick process

Test:  To test the same followed procedure
       1) Setup brick mux environment on a vm
       2) Put a breaking point in gdb in function posix_health_check_thread_proc
          at the time of notify GF_EVENT_CHILD_DOWN event
       3) unmount anyone brick path forcefully
       4) check gluster v status it will show N/A for the brick
       5) Try to start volume with force option, glusterd throw
          message "No device available for mount brick"
       6) Mount the brick_root path
       7) Try to start volume with force option
       8) down brick is started successfully

Change-Id: I91898dad21d082ebddd12aa0d1f7f0ed012bdf69
fixes: bz#1595320
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>storage/posix: Set ret value correctly before exiting</title>
<updated>2017-03-01T17:35:59+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2017-03-01T07:18:10+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=a2d4c928e93c95dfe2ceff450556f8494d67e654'/>
<id>a2d4c928e93c95dfe2ceff450556f8494d67e654</id>
<content type='text'>
Change-Id: I07c3a21c1c0625a517964693351356eead962571
BUG: 1427404
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16792
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: I07c3a21c1c0625a517964693351356eead962571
BUG: 1427404
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16792
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: Mark tests/bitrot/bug-1373520.t bad until fixed</title>
<updated>2017-02-28T10:58:58+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2017-02-28T06:15:44+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=bbb03ab1a2a9f0acc02f1d252a9bf811ba854bab'/>
<id>bbb03ab1a2a9f0acc02f1d252a9bf811ba854bab</id>
<content type='text'>
Change-Id: Ic0b5c93c6365e26a5742184dd9445354c0a57295
BUG: 1427404
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16780
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Ic0b5c93c6365e26a5742184dd9445354c0a57295
BUG: 1427404
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16780
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Don't trigger data/metadata heal on Lookups</title>
<updated>2017-02-27T03:06:55+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2017-01-25T10:01:44+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=c1fc1fc9cb5a13e6ddf8c9270deb0c7609333540'/>
<id>c1fc1fc9cb5a13e6ddf8c9270deb0c7609333540</id>
<content type='text'>
Problem-1
If Lookup which doesn't take any locks observes version mismatch it can't be
trusted. If we launch a heal based on this information it will lead to
self-heals which will affect I/O performance in the cases where Lookup is
wrong. Considering self-heal-daemon and operations on the inode from client
which take locks can still trigger heal we can choose to not attempt a heal on
Lookup.

Problem-2:
Fixed spurious failure of
tests/bitrot/bug-1373520.t
For the issues above, what was happening was that ec_heal_inspect()
is preventing 'name' heal to happen

Problem-3:
tests/basic/ec/ec-background-heals.t
To be honest I don't know what the problem was, while fixing
the 2 problems above, I made some changes to ec_heal_inspect() and
ec_need_heal() after which when I tried to recreate the spurious
failure it just didn't happen even after a long time.

BUG: 1414287
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Change-Id: Ife2535e1d0b267712973673f6d474e288f3c6834
Reviewed-on: https://review.gluster.org/16468
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem-1
If Lookup which doesn't take any locks observes version mismatch it can't be
trusted. If we launch a heal based on this information it will lead to
self-heals which will affect I/O performance in the cases where Lookup is
wrong. Considering self-heal-daemon and operations on the inode from client
which take locks can still trigger heal we can choose to not attempt a heal on
Lookup.

Problem-2:
Fixed spurious failure of
tests/bitrot/bug-1373520.t
For the issues above, what was happening was that ec_heal_inspect()
is preventing 'name' heal to happen

Problem-3:
tests/basic/ec/ec-background-heals.t
To be honest I don't know what the problem was, while fixing
the 2 problems above, I made some changes to ec_heal_inspect() and
ec_need_heal() after which when I tried to recreate the spurious
failure it just didn't happen even after a long time.

BUG: 1414287
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Change-Id: Ife2535e1d0b267712973673f6d474e288f3c6834
Reviewed-on: https://review.gluster.org/16468
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>core: run many bricks within one glusterfsd process</title>
<updated>2017-01-31T00:13:58+00:00</updated>
<author>
<name>Jeff Darcy</name>
<email>jdarcy@redhat.com</email>
</author>
<published>2016-12-08T21:24:15+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=1a95fc3036db51b82b6a80952f0908bc2019d24a'/>
<id>1a95fc3036db51b82b6a80952f0908bc2019d24a</id>
<content type='text'>
This patch adds support for multiple brick translator stacks running
in a single brick server process.  This reduces our per-brick memory usage by
approximately 3x, and our appetite for TCP ports even more.  It also creates
potential to avoid process/thread thrashing, and to improve QoS by scheduling
more carefully across the bricks, but realizing that potential will require
further work.

Multiplexing is controlled by the "cluster.brick-multiplex" global option.  By
default it's off, and bricks are started in separate processes as before.  If
multiplexing is enabled, then *compatible* bricks (mostly those with the same
transport options) will be started in the same process.

Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
BUG: 1385758
Signed-off-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
Reviewed-on: https://review.gluster.org/14763
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds support for multiple brick translator stacks running
in a single brick server process.  This reduces our per-brick memory usage by
approximately 3x, and our appetite for TCP ports even more.  It also creates
potential to avoid process/thread thrashing, and to improve QoS by scheduling
more carefully across the bricks, but realizing that potential will require
further work.

Multiplexing is controlled by the "cluster.brick-multiplex" global option.  By
default it's off, and bricks are started in separate processes as before.  If
multiplexing is enabled, then *compatible* bricks (mostly those with the same
transport options) will be started in the same process.

Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
BUG: 1385758
Signed-off-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
Reviewed-on: https://review.gluster.org/14763
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>features/bit-rot-stub: use the correct spelling of quarantine for bad objects</title>
<updated>2017-01-30T20:17:51+00:00</updated>
<author>
<name>Raghavendra Bhat</name>
<email>raghavendra@redhat.com</email>
</author>
<published>2016-12-02T20:24:46+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=2e0dceb6165d0a93581284fa2e0d74abe4ee615b'/>
<id>2e0dceb6165d0a93581284fa2e0d74abe4ee615b</id>
<content type='text'>
                       container

The directory for containing the list of bad objects was named "quanrantine"
instead of "quarantine"

Change-Id: I8c20381ac637201d9d1a224f5223e8dfbed53f1e
BUG: 1401571
Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16027
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
                       container

The directory for containing the list of bad objects was named "quanrantine"
instead of "quarantine"

Change-Id: I8c20381ac637201d9d1a224f5223e8dfbed53f1e
BUG: 1401571
Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16027
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: Mark tests/bitrot/bug-1373520.t bad</title>
<updated>2017-01-30T13:46:01+00:00</updated>
<author>
<name>Atin Mukherjee</name>
<email>amukherj@redhat.com</email>
</author>
<published>2017-01-30T07:43:54+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=faf43235e0b3308388d0146e65b3982c11661db0'/>
<id>faf43235e0b3308388d0146e65b3982c11661db0</id>
<content type='text'>
Change-Id: Ief8014dd9faa012c7f3c5347f597a155873a8f92
BUG: 1417540
Signed-off-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16479
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: N Balachandran &lt;nbalacha@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Reviewed-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Ief8014dd9faa012c7f3c5347f597a155873a8f92
BUG: 1417540
Signed-off-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16479
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: N Balachandran &lt;nbalacha@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Reviewed-by: Jeff Darcy &lt;jdarcy@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>feature/bitrot: Fix recovery of corrupted hardlink</title>
<updated>2016-09-08T17:09:33+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2016-09-06T12:58:42+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=b86a7de9b5ea9dcd0a630dbe09fce6d9ad0d8944'/>
<id>b86a7de9b5ea9dcd0a630dbe09fce6d9ad0d8944</id>
<content type='text'>
Problem:
When a file with hardlink is corrupted in ec volume,
the recovery steps mentioned was not working.
Only name and metadata was healing but not the data.

Cause:
The bad file marker in the inode context is not removed.
Hence when self heal tries to open the file for data
healing, it fails with EIO.

Background:
The bitrot deletes inode context during forget.

Briefly, the recovery steps involves following steps.
   1. Delete the entry marked with bad file xattr
      from backend. Delete all the hardlinks including
      .glusters hardlink as well.
   2. Access the each hardlink of the file including
      original from the mount.

The step 2 will send lookup to the brick where the files
are deleted from backend and returns with ENOENT. On
ENOENT, server xlator forgets the inode if there are
no dentries associated with it. But in case hardlinks,
the forget won't be called as dentries (other hardlink
files) are associated with the inode. Hence bitrot stube
won't delete it's context failing the data self heal.

Fix:
Bitrot-stub should delete the inode context on getting
ENOENT during lookup.

Change-Id: Ice6adc18625799e7afd842ab33b3517c2be264c1
BUG: 1373520
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15408
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
When a file with hardlink is corrupted in ec volume,
the recovery steps mentioned was not working.
Only name and metadata was healing but not the data.

Cause:
The bad file marker in the inode context is not removed.
Hence when self heal tries to open the file for data
healing, it fails with EIO.

Background:
The bitrot deletes inode context during forget.

Briefly, the recovery steps involves following steps.
   1. Delete the entry marked with bad file xattr
      from backend. Delete all the hardlinks including
      .glusters hardlink as well.
   2. Access the each hardlink of the file including
      original from the mount.

The step 2 will send lookup to the brick where the files
are deleted from backend and returns with ENOENT. On
ENOENT, server xlator forgets the inode if there are
no dentries associated with it. But in case hardlinks,
the forget won't be called as dentries (other hardlink
files) are associated with the inode. Hence bitrot stube
won't delete it's context failing the data self heal.

Fix:
Bitrot-stub should delete the inode context on getting
ENOENT during lookup.

Change-Id: Ice6adc18625799e7afd842ab33b3517c2be264c1
BUG: 1373520
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15408
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
