<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/geo-replication/syncdaemon, branch v8dev</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>geo-rep: Convert gfid conflict resolution logs into debug</title>
<updated>2019-05-14T05:51:48+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-05-14T05:35:45+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=40b7121afbd3969706acb8198cf660a710583e70'/>
<id>40b7121afbd3969706acb8198cf660a710583e70</id>
<content type='text'>
The gfid conflict resolution code path is not supposed
to be hit in the generic case. But a few heavy rename
workloads (BUG: 1694820) make it a common case, so
logging the entries to be fixed at INFO floods the log
for these workloads. Hence convert them to DEBUG.

fixes: bz#1709653
Change-Id: I4d5e102b87be5fe5b54f78f329e588882d72b9d9
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The gfid conflict resolution code path is not supposed
to be hit in the generic case. But a few heavy rename
workloads (BUG: 1694820) make it a common case, so
logging the entries to be fixed at INFO floods the log
for these workloads. Hence convert them to DEBUG.

fixes: bz#1709653
Change-Id: I4d5e102b87be5fe5b54f78f329e588882d72b9d9
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>geo-rep: Fix sync hang with tarssh</title>
<updated>2019-05-13T11:06:41+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-05-08T05:56:06+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=f0d3690e8916cfb5e10a0df2e9721a0fd079bfce'/>
<id>f0d3690e8916cfb5e10a0df2e9721a0fd079bfce</id>
<content type='text'>
Problem:
Geo-rep sync hangs when tarssh is used as sync
engine at heavy workload.

Analysis and Root cause:
The tar process was found to be hung. Further debugging
showed that the stderr buffer of the tar process on master
was full, i.e., 64k. When the buffer was copied to a file
from /proc/pid/fd/2, the hang was resolved.

This can happen when files picked by the tar process
to sync don't exist on master anymore. If this count
reaches around 1k, the stderr buffer fills up.

Fix:
The tar process is executed using Popen with stderr as PIPE.
The final execution is something like below.

tar | ssh &lt;args&gt; root@slave tar --overwrite -xf - -C &lt;path&gt;

It was waiting on the ssh process first using communicate() and then
on tar. Note that communicate() reads stdout and stderr. So when the
stderr of the tar process fills up, no one reads it until the untar via
ssh completes. But the untar cannot complete while tar is blocked
writing to stderr, which leads to a deadlock.
Hence we should wait on both processes in parallel, so that stderr is
read on both processes.
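
The idea in the fix above can be sketched as follows. This is a
minimal illustration, not the actual gsyncd change: run_pipeline and
drain_producer_stderr are hypothetical names, and a reader thread is
only one possible way to read both stderr pipes in parallel.

```python
import subprocess
import threading


def run_pipeline(producer_cmd, consumer_cmd):
    """Run producer | consumer while draining both stderr pipes."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    # Drop the parent's copy so the consumer sees EOF when the
    # producer exits.
    producer.stdout.close()

    producer_err = []

    def drain_producer_stderr():
        # Read until EOF so the producer never blocks on a full pipe.
        producer_err.append(producer.stderr.read())

    drainer = threading.Thread(target=drain_producer_stderr)
    drainer.start()
    # communicate() reads the consumer's stdout and stderr until EOF.
    consumer_out, consumer_err = consumer.communicate()
    drainer.join()
    producer.wait()
    producer.stderr.close()
    return {
        "producer_rc": producer.returncode,
        "consumer_rc": consumer.returncode,
        "producer_err": producer_err[0],
        "consumer_out": consumer_out,
        "consumer_err": consumer_err,
    }
```

With this shape, a producer that writes far more than 64k to stderr
still runs to completion, because its stderr is read while the
consumer is being waited on.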

Change-Id: I609c7cc5c07e210c504771115b4d551a2e891adf
fixes: bz#1707728
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Geo-rep sync hangs when tarssh is used as sync
engine at heavy workload.

Analysis and Root cause:
The tar process was found to be hung. Further debugging
showed that the stderr buffer of the tar process on master
was full, i.e., 64k. When the buffer was copied to a file
from /proc/pid/fd/2, the hang was resolved.

This can happen when files picked by the tar process
to sync don't exist on master anymore. If this count
reaches around 1k, the stderr buffer fills up.

Fix:
The tar process is executed using Popen with stderr as PIPE.
The final execution is something like below.

tar | ssh &lt;args&gt; root@slave tar --overwrite -xf - -C &lt;path&gt;

It was waiting on the ssh process first using communicate() and then
on tar. Note that communicate() reads stdout and stderr. So when the
stderr of the tar process fills up, no one reads it until the untar via
ssh completes. But the untar cannot complete while tar is blocked
writing to stderr, which leads to a deadlock.
Hence we should wait on both processes in parallel, so that stderr is
read on both processes.

Change-Id: I609c7cc5c07e210c504771115b4d551a2e891adf
fixes: bz#1707728
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>geo-rep: Fix sync-method config</title>
<updated>2019-05-09T05:18:05+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-05-08T05:26:31+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=84b7cc57df065e2d8c0ac88b179aab3614ec814a'/>
<id>84b7cc57df065e2d8c0ac88b179aab3614ec814a</id>
<content type='text'>
Problem:
When 'use_tarssh' is set to true, the command exits with a success
message, but the default 'rsync' is still used as the sync engine.
The new config 'sync-method' cannot be set from the cli.

Analysis and Fix:
The 'use_tarssh' config is deprecated with new
config framework and 'sync-method' is the new
config to choose sync-method i.e. tarssh or rsync.
This patch fixes the 'sync-method' config. The allowed
values are tarssh and rsync.
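
A check of the allowed values could look like the sketch below. It
is illustrative only: ALLOWED_SYNC_METHODS and validate_sync_method
are hypothetical names and not the geo-rep config framework's API.

```python
# Allowed values, as stated in the commit message.
ALLOWED_SYNC_METHODS = ("rsync", "tarssh")


def validate_sync_method(value):
    """Return value if it is an allowed sync-method, else raise."""
    if value not in ALLOWED_SYNC_METHODS:
        raise ValueError(
            "sync-method must be one of: "
            + ", ".join(ALLOWED_SYNC_METHODS)
        )
    return value
```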

Change-Id: I0edb0319cad0455b29e49f2f08a64ce324735e84
fixes: bz#1707686
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
When 'use_tarssh' is set to true, the command exits with a success
message, but the default 'rsync' is still used as the sync engine.
The new config 'sync-method' cannot be set from the cli.

Analysis and Fix:
The 'use_tarssh' config is deprecated with new
config framework and 'sync-method' is the new
config to choose sync-method i.e. tarssh or rsync.
This patch fixes the 'sync-method' config. The allowed
values are tarssh and rsync.

Change-Id: I0edb0319cad0455b29e49f2f08a64ce324735e84
fixes: bz#1707686
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>geo-rep: Fix rename with existing destination with same gfid</title>
<updated>2019-04-26T07:15:00+00:00</updated>
<author>
<name>Sunny Kumar</name>
<email>sunkumar@redhat.com</email>
</author>
<published>2019-04-02T07:08:09+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=e7e89c9ec8b56ad5a442ad105c0b05e674a591cd'/>
<id>e7e89c9ec8b56ad5a442ad105c0b05e674a591cd</id>
<content type='text'>
Problem:
   Geo-rep fails to sync the rename properly if the destination exists.
The source is left behind on the slave, resulting in extra files on
the slave. Also, heavy rename workloads like logrotate caused a
lot of ESTALE errors.

Cause:
   Geo-rep fails to sync a rename when the destination exists if the
creation of the source file also falls into the same batch of
changelogs being processed. This is because, after fixing the
problematic gfids by verifying against master, the original entries
were re-processed, including the CREATE, which caused extra files on
the slave and the rename to fail.

Solution:
   Entries need to be removed from the retry list after fixing
problematic gfids on slave so that they are not re-created on slave.
   Also treat ESTALE as EEXIST so that the error is properly handled
by verifying the op on the master volume.
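
The two parts of the solution can be sketched as below. This is an
assumption-laden illustration: normalize_errno, drop_fixed_entries
and the dict-based entry shape are hypothetical and do not mirror
the real syncdaemon data structures.

```python
import errno


def normalize_errno(err):
    """Map ESTALE to EEXIST; leave every other errno unchanged."""
    if err == errno.ESTALE:
        return errno.EEXIST
    return err


def drop_fixed_entries(retry_entries, fixed_gfids):
    """Return the retry entries whose gfid was not already fixed."""
    return [e for e in retry_entries if e["gfid"] not in fixed_gfids]
```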

Change-Id: I50cf289e06b997adddff0552bf2466d9201dd1f9
fixes: bz#1694820
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Signed-off-by: Sunny Kumar &lt;sunkumar@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
   Geo-rep fails to sync the rename properly if the destination exists.
The source is left behind on the slave, resulting in extra files on
the slave. Also, heavy rename workloads like logrotate caused a
lot of ESTALE errors.

Cause:
   Geo-rep fails to sync a rename when the destination exists if the
creation of the source file also falls into the same batch of
changelogs being processed. This is because, after fixing the
problematic gfids by verifying against master, the original entries
were re-processed, including the CREATE, which caused extra files on
the slave and the rename to fail.

Solution:
   Entries need to be removed from the retry list after fixing
problematic gfids on slave so that they are not re-created on slave.
   Also treat ESTALE as EEXIST so that the error is properly handled
by verifying the op on the master volume.

Change-Id: I50cf289e06b997adddff0552bf2466d9201dd1f9
fixes: bz#1694820
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Signed-off-by: Sunny Kumar &lt;sunkumar@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>geo-rep: Fix entries and metadata counters in geo-rep status</title>
<updated>2019-04-24T16:17:39+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-04-23T05:15:25+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=e0a6941af6ed352911698012ada895d1296b549e'/>
<id>e0a6941af6ed352911698012ada895d1296b549e</id>
<content type='text'>
Entries counter was incremented twice and decremented only
once. And entries count was being used in place of metadata
entries. This patch fixes both of them.

fixes: bz#1512093
Change-Id: I5601a5fe8d25c9d65b72eb529171e7117ebbb67f
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Entries counter was incremented twice and decremented only
once. And entries count was being used in place of metadata
entries. This patch fixes both of them.

fixes: bz#1512093
Change-Id: I5601a5fe8d25c9d65b72eb529171e7117ebbb67f
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libgfchangelog : use find_library to locate shared library</title>
<updated>2019-04-15T14:27:59+00:00</updated>
<author>
<name>Sunny Kumar</name>
<email>sunkumar@redhat.com</email>
</author>
<published>2019-04-12T14:25:10+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=f316c8b797283818bd800569771870a4b9bf1310'/>
<id>f316c8b797283818bd800569771870a4b9bf1310</id>
<content type='text'>
Issue:

libgfchangelog.so: cannot open shared object file

Because the shared library name is hardcoded, the runtime loader looks
for a particular version of the shared library.

Solution:

Using find_library to locate shared library at runtime solves this issue.

Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 323, in main
    func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 82, in subcmd_worker
    local.service_loop(remote)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1261, in service_loop
    changelog_agent.init()
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 233, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 215, in __call__
    raise res
OSError: libgfchangelog.so: cannot open shared object file: No such file or directory
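
The find_library approach named in the solution can be sketched as
below. load_shared_library is a hypothetical helper, and the short
name "gfchangelog" is inferred from libgfchangelog.so; the actual
patch may load the library differently.

```python
from ctypes import CDLL
from ctypes.util import find_library


def load_shared_library(name):
    """Locate a shared library by short name and load it."""
    # find_library returns None when nothing matching is found.
    path = find_library(name)
    if path is None:
        raise OSError("shared library not found: " + name)
    return CDLL(path, use_errno=True)
```

A caller would then use load_shared_library("gfchangelog") instead
of passing a hardcoded file name to CDLL.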

Change-Id: I3dd013d701ed1cd99ba7ef20d1898f343e1db8f5
fixes: bz#1699394
Signed-off-by: Sunny Kumar &lt;sunkumar@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Issue:

libgfchangelog.so: cannot open shared object file

Because the shared library name is hardcoded, the runtime loader looks
for a particular version of the shared library.

Solution:

Using find_library to locate shared library at runtime solves this issue.

Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 323, in main
    func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 82, in subcmd_worker
    local.service_loop(remote)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1261, in service_loop
    changelog_agent.init()
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 233, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 215, in __call__
    raise res
OSError: libgfchangelog.so: cannot open shared object file: No such file or directory

Change-Id: I3dd013d701ed1cd99ba7ef20d1898f343e1db8f5
fixes: bz#1699394
Signed-off-by: Sunny Kumar &lt;sunkumar@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>geo-rep: Fix syncing multiple rename of symlink</title>
<updated>2019-03-29T07:23:41+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-03-28T11:17:16+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=877af725b3e35b548d6d7aeec5adb21721d8bf8b'/>
<id>877af725b3e35b548d6d7aeec5adb21721d8bf8b</id>
<content type='text'>
Problem:
Geo-rep fails to sync the rename of a symlink if it is
renamed multiple times and the creation and renames
happen in quick succession.

Worker crash at slave:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py",  in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", in entry_ops
    [ESTALE, EINVAL, EBUSY])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in lsetxattr
    cls.raise_oserr()
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in raise_oserr
    raise OSError(errn, os.strerror(errn))
OSError: [Errno 12] Cannot allocate memory

Geo-rep Behaviour:
1. SYMLINK doesn't record target path in changelog.
   So while syncing SYMLINK, readlink is done on
   master to get target path.

2. Geo-rep will create destination if source is not
   present while syncing RENAME. Hence while syncing
   RENAME of SYMLINK, target path is collected from
   destination.

Cause:
If a symlink is created and renamed multiple times, creation of the
symlink is ignored, as it's no longer present on master at
that path. While the symlink is renamed multiple times at master,
when syncing the first RENAME of SYMLINK, neither source nor
destination is present, hence the target path is not known.  In this
case, while creating the destination directly at slave, regular file
attributes were encoded into the blob instead of symlink attributes,
causing a failure in the gfid-access translator while decoding
the blob.

Solution:
While syncing RENAME of SYMLINK, when the target is not known
and when src and destination are not present on the master,
don't create the destination. Ignore the rename. It's ok to ignore.
If it's unlinked, it's fine.  If it's renamed to something else,
it will be synced then.

Change-Id: Ibdfa495513b7c05b5370ab0b89c69a6802338d87
fixes: bz#1693648
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Geo-rep fails to sync the rename of a symlink if it is
renamed multiple times and the creation and renames
happen in quick succession.

Worker crash at slave:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py",  in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", in entry_ops
    [ESTALE, EINVAL, EBUSY])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", in errno_wrap
    return call(*arg)
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in lsetxattr
    cls.raise_oserr()
  File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in raise_oserr
    raise OSError(errn, os.strerror(errn))
OSError: [Errno 12] Cannot allocate memory

Geo-rep Behaviour:
1. SYMLINK doesn't record target path in changelog.
   So while syncing SYMLINK, readlink is done on
   master to get target path.

2. Geo-rep will create destination if source is not
   present while syncing RENAME. Hence while syncing
   RENAME of SYMLINK, target path is collected from
   destination.

Cause:
If a symlink is created and renamed multiple times, creation of the
symlink is ignored, as it's no longer present on master at
that path. While the symlink is renamed multiple times at master,
when syncing the first RENAME of SYMLINK, neither source nor
destination is present, hence the target path is not known.  In this
case, while creating the destination directly at slave, regular file
attributes were encoded into the blob instead of symlink attributes,
causing a failure in the gfid-access translator while decoding
the blob.

Solution:
While syncing RENAME of SYMLINK, when the target is not known
and when src and destination are not present on the master,
don't create the destination. Ignore the rename. It's ok to ignore.
If it's unlinked, it's fine.  If it's renamed to something else,
it will be synced then.

Change-Id: Ibdfa495513b7c05b5370ab0b89c69a6802338d87
fixes: bz#1693648
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>geo-rep: fix integer config validation</title>
<updated>2019-03-27T14:34:31+00:00</updated>
<author>
<name>Aravinda VK</name>
<email>avishwan@redhat.com</email>
</author>
<published>2019-03-26T07:50:13+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=c574984e19d59e351372eacce0ce11fb36e96dd4'/>
<id>c574984e19d59e351372eacce0ce11fb36e96dd4</id>
<content type='text'>
ssh-port validation is specified as `validation=int` in the template
`gsyncd.conf`, but this was not handled during geo-rep config set.
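
An integer check of this kind could be sketched as below.
validate_int is a hypothetical helper for illustration and is not
the function used by the geo-rep config code.

```python
def validate_int(name, value):
    """Return value converted to int, or raise a descriptive error."""
    try:
        return int(value)
    except ValueError:
        raise ValueError(
            "%s must be an integer, got %r" % (name, value)
        )
```

For example, validate_int("ssh-port", "22") returns the integer 22,
while a non-numeric string is rejected with ValueError.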

Fixes: bz#1692666
Change-Id: I3f19d9b471b0a3327e4d094dfbefcc58ed2c34f6
Signed-off-by: Aravinda VK &lt;avishwan@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ssh-port validation is specified as `validation=int` in the template
`gsyncd.conf`, but this was not handled during geo-rep config set.

Fixes: bz#1692666
Change-Id: I3f19d9b471b0a3327e4d094dfbefcc58ed2c34f6
Signed-off-by: Aravinda VK &lt;avishwan@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>geo-rep: IPv6 support</title>
<updated>2019-03-15T08:53:46+00:00</updated>
<author>
<name>Aravinda VK</name>
<email>avishwan@redhat.com</email>
</author>
<published>2019-03-14T14:36:54+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=240e1d6821fbb779c3dd73f6f0225d755a5b7cc6'/>
<id>240e1d6821fbb779c3dd73f6f0225d755a5b7cc6</id>
<content type='text'>
`address_family=inet6` needs to be added while mounting master and
slave volumes in gverify script.

A new option (`--inet6`) is introduced to the gluster cli, which will
be used internally by geo-rep while calling `gluster volume info
--remote-host=&lt;ipv6&gt;`.

Fixes: bz#1688833
Change-Id: I1e0d42cae07158df043e64a2f991882d8c897837
Signed-off-by: Aravinda VK &lt;avishwan@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
`address_family=inet6` needs to be added while mounting master and
slave volumes in gverify script.

A new option (`--inet6`) is introduced to the gluster cli, which will
be used internally by geo-rep while calling `gluster volume info
--remote-host=&lt;ipv6&gt;`.

Fixes: bz#1688833
Change-Id: I1e0d42cae07158df043e64a2f991882d8c897837
Signed-off-by: Aravinda VK &lt;avishwan@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>eventsapi: Fix Python3 compatibility issues</title>
<updated>2019-02-25T15:30:19+00:00</updated>
<author>
<name>Aravinda VK</name>
<email>avishwan@redhat.com</email>
</author>
<published>2019-02-21T05:55:55+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=cd68f7b88b9a2c9a4e4ff9fca61517384e54130a'/>
<id>cd68f7b88b9a2c9a4e4ff9fca61517384e54130a</id>
<content type='text'>
- Fixed relative import and non-package import related issues.
- Fixed socketserver import issues.
- Renamed installed directory name to `gfevents` from `events` (to
  avoid any issues with other global libs)
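
A common pattern for the socketserver import issue is sketched
below. It shows the general Python 2/3 compatible import and is not
necessarily the exact form used in this patch.

```python
try:
    # Python 3 module name.
    import socketserver
except ImportError:
    # Python 2 module name, bound to the Python 3 name.
    import SocketServer as socketserver
```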

Fixes: bz#1679406
Change-Id: I3dc38bc92b23387a6dfbcc0ab8283178235bf756
Signed-off-by: Aravinda VK &lt;avishwan@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Fixed relative import and non-package import related issues.
- Fixed socketserver import issues.
- Renamed installed directory name to `gfevents` from `events` (to
  avoid any issues with other global libs)

Fixes: bz#1679406
Change-Id: I3dc38bc92b23387a6dfbcc0ab8283178235bf756
Signed-off-by: Aravinda VK &lt;avishwan@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
