| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
If symlink is created on master pointing
to current directory (e.g symlink -> ".") with
non root uid or gid, geo-rep worker crashes
with ENOTSUP.
Cause:
Geo-rep creates the symlink on slave and
fixes the uid and gid using chown cmd.
os.chown dereferences the symlink which is
pointing to ".gfid" which is not supported.
Note that geo-rep operates on aux-gfid-mount
(e.g. "/mnt/.gfid/<gfid-of-symlink-file>").
Solution:
The uid or gid change is acutally on symlink
file. So use os.lchown, i.e, don't deference.
BUG: 1567209
Change-Id: I63575fc589d71f987bef1d350c030987738c78ad
updates: bz#1567209
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lazy umounting the master volume by worker causes
issues with rsync's usage of getcwd. Henc removing
the lazy umount and using private mount namespace
for the same. On the slave, the lazy umount is
retained as we can't use private namespace in non
root geo-rep setup.
Change-Id: I403375c02cb3cc7d257a5f72bbdb5118b4c8779a
BUG: 1546129
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch re-enables the geo-rep test cases.
Along with it does following optimizations.
1. Use EXPECT_WITHIN instead of sleep
2. Clean up geo-rep ssh key after test
3. Changes to gverify.sh and S56glusterd-geo-rep-create-post.sh
to use the given ssh identity file for geo-rep create
4. Make gluster-command-dir configurable and introduce
slave-gluster-command-dir which points the parent directory
of gluster binaries in master and slave respectively.
Change-Id: Ia7696278d9dd3ba04224dcd7c3564088ca970b04
BUG: 1480491
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
| |
BUG: 1529480
Change-Id: If4775ed9886990c0e1bcf4e44c7dfef95cc4f0c3
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Fixed Python pep8 issues
- Removed dead code
- Rewritten configuration management
- Rewritten Arguments/subcommands handling
- Added Args upgrade to accommodate all these changes without changing
glusterd code
- use of md5 removed, which was used to hash the brick path for workdir
Both Master and Slave nodes will have subdir for session in the
format "<mastervol>_<primary_slave_host>_<slavevol>
$GLUSTER_LOGDIR/geo-replication/<mastervol>_<primary_slave_host>_<slavevol>
$GLUSTER_LOGDIR/geo-replication-slaves/<mastervol>_<primary_slave_host>_<slavevol>
Log file paths renamed since session info is available with directory
name itself.
$LOG_DIR_MASTER/
- gsyncd.log - Gsyncd, Worker monitor logs
- mnt-<brick-path>.log - Aux mount logs, mounted by each worker
- changes-<brick-path>.log - Changelog related logs(One per brick)
$LOG_DIR_SLAVE/
- gsyncd.log - Slave Gsyncd logs
- mnt-<master-node>-<master-brick-path>.log - Aux mount logs,
mounted for each connection from master-node:master-brick
- mnt-mbr-<master-node>-<master-brick-path>.log - Same as above,
but mountbroker setup
Fixes: #73
Change-Id: I2ec2a21e4e2a92fd92899d026e8543725276f021
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The data is not getting synced if master witnessed
IO as below.
1. echo "test_data" > f1
2. ln f1 f2
3. mv f2 f3
4. unlink f1
On master, 'f3' exists with data "test_data" but on
slave, only f3 exists with zero byte file without
backend gfid link.
Cause:
On master, since 'f2' no longer exists, the hardlink
is skipped during processing. Later, on trying to sync
rename, since source ('f2') doesn't exist, dst ('f3')
is created with same gfid. But in this use case, it
succeeds but backend gfid would not have linked as 'f1'
exists with the same gfid. So, rsync would fail with
ENOENT as backend gfid is not linked with 'f3' and 'f1'
is unlinked.
Fix:
On processing rename, if src doesn't exist on slave,
don't blindly create dst with same gfid. The gfid
needs to be checked, if it exists, hardlink needs
to be created instead of mknod.
Thanks Aravinda for helping in RCA :)
Change-Id: I5af4f99798ed1bcb297598a4bc796b701d1e0130
Signed-off-by: Kotresh HR <khiremat@redhat.com>
BUG: 1512483
Reporter: dimitri.ars@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In hybrid crawl, renames and unlink can't be
synced but directory renames can be detected.
While syncing the directory on slave, if the
gfid already exists, it should be rename.
Hence if directory gfid already exists, rename
it.
Change-Id: Ibf9f99e76a3e02795a3c2befd8cac48a5c365bb6
BUG: 1499566
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During xsync crawl, worker occasionally crashed
with ENODATA on getting gfid from backend. This
is not persistent and is transient. Worker restart
invovles re-processing of few entries in changenlogs.
So adding ENODATA to retry list to avoid worker
restart.
Change-Id: Ib78d1e925c0a83c78746f28f7c79792a327dfd3e
BUG: 1499391
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
os.listdir gives ENOTSUP on gfid path occasionally
which is not persistant. Adding it to retry list
to avoid worker to crash if it's transient error.
Change-Id: Ic795dd1f02a27c9e5d901e20722ee32451838feb
BUG: 1499180
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
If there is a hardlink to a symlink on master
and if the symlink file is deleted on master,
geo-rep fails to sync the hardlink.
Typical Usecase:
It's easily hit with rsnapshot use case where
it uses hardlinks.
Example Reproducer:
Setup geo-replication between master and slave
volume and in master mount point, do the following.
1. mkdir /tmp/symlinkbug
2. ln -f -s /does/not/exist /tmp/symlinkbug/a_symlink
3. rsync -a /tmp/symlinkbug ./
4. cp -al symlinkbug symlinkbug.0
5. ln -f -s /does/not/exist2 /tmp/symlinkbug/a_symlink
6. rsync -a /tmp/symlinkbug ./
7. cp -al symlinkbug symlinkbug.1
Cause:
If the source was not present while syncing hardlink,
it was always packing the blob as regular file.
Fix:
If the source was not present while syncing hardlink,
pack the blob based on the mode.
Change-Id: Iaa12d6f99de47b18e0650e7c4eb455f23f8390f2
BUG: 1432046
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reported-by: Christian Lohmaier <lohmaier+rhbz@gmail.com>
Reviewed-on: https://review.gluster.org/18011
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In a distribute replicate volume, if the hardlinks
are created when a subvolume is down, it gets
healed from other subvolume when it comes up.
If this subvolume becomes ACTIVE in geo-rep
there are chances that those hardlinks won't
be synced to slave.
Cause:
AFR can't detect hardlinks during self heal.
It just create those files using mknod and
the same is recorded in changelog. Geo-rep
processes these mknod and ignores it as
it finds gfid already on slave.
Solution:
Geo-rep should process the mknod as link
if the gfid already exists on slave.
Change-Id: I2f721b462b38a74c60e1df261662db4b99b32057
BUG: 1475308
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/17880
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Updates: #246
Change-Id: If0ce83fe8dd3068bfb671f398b2e82ac831288d0
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/17577
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changed all log messages to structured log format
Change-Id: Idae25f8b4ad0bbae38f4362cbda7bbf51ce7607b
Updates: #240
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: https://review.gluster.org/17551
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
chmod doesn't support 'no dereference' option.
It always deference the symlink. But 'chown'
does support metadata changes on symlink itself,
which was not taken care while syncing. This
patch fixes the same.
Change-Id: Ic9985f4e39d15b5a9deb379841bcfb2c263d3e6c
BUG: 1455559
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/17389
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Flag: --ignore-missing-args
This Rsync flag reduces sync failures if the source file is
unlinked but present in --files-from list. This reduces
Rsync retries in Geo-rep and improves the performance
Flag: --existing
Rsync in Geo-rep never creates target files. Using RPC Geo-rep creates
entry in Slave and rsync --inplace used to prevent creating temporary file
and rename.(To avoid different GFID in Slave). If the entry is missing in
Slave then Geo-rep Rsync gets Permission denied errors when it tries to
create file with name as GFID inside .gfid dir.(Geo-rep rsync syncs data
using GFIDS with aux-gfid-mount)
To disable these flags,
gluster volume geo-replication <session> config \
rsync-opt-ignore-missing-args false
gluster volume geo-replication <session> config \
rsync-opt-existing false
Thanks Kotresh for finding these awesome tunables.
BUG: 1400924
Change-Id: I6a84fb86a589bf6edc8dfd1086456a84b05a64fc
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: https://review.gluster.org/16010
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On corner cases, mount cleanup might cause
worker crash. Fixing the same.
Change-Id: I38c0af51d10673765cdb37bc5b17bb37efd043b8
BUG: 1433506
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/17015
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Even though it is known to be 'RMDIR', os.unlink
was being tried and os.rmdir is issued upon receiving
EISDIR. It's unnecessary unlink call for 'RMDIR'.
Fixed the same.
Change-Id: I8dbb680ee2c7f0c32b7799b1ed5351b3621cb42a
BUG: 1441106
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/17041
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
EBUSY was added to retry list of errno_wrap
without importing. Fixing the same.
Change-Id: Ide81a9ccc9b948a96265b6890da078b722b45d51
BUG: 1434018
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/17011
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Monitor process expects worker to establish SSH Tunnel to slave node
and mount master volume locally with in 60 secs and acknowledge monitor
process by closing feedback fd. If something goes wrong and worker
does not close feedback fd with in 60 secs, monitor kills the worker.
But there was no clue in log message about the actual issue. This patch
adds log and indicates whether the worker is hung during SSH
or master mount.
Change-Id: Id08a12fa6f3bba1d4fe8036728dbc290e6c14c8c
BUG: 1261689
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/16997
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not crash on EBUSY error. Add EBUSY
retry errno list. Crash only if the error
persists even after max retries.
Change-Id: Ia067ccc6547731f28f2a315d400705e616cbf662
BUG: 1434018
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/16924
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to improve debuggability, it is important
to have access to geo-rep master and slave mounts.
With the default behaviour, geo-rep lazy unmounts
the mounts after changing the current working
directory into the mount point. It also cleans
up the mount points. So only geo-rep worker has
the access and it becomes impossible to take the
client profile info and do any other client statck
analysis. Hence the following new config is being
introduced to allow access to mounts.
gluster vol geo-rep <mastervol> <slavehost>::<slavevol> \
config access_mount true
The default value of 'access_mount' is false.
Change-Id: I53dce4ea86a6ffc979c82f9330e8954327180ca3
BUG: 1433506
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/16912
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geo-rep worker mounts the slave volume on the slave
node. If multiple worker connects to same slave node,
all workers share the same mount log file. This
is very difficult to debug as logs are cluttered from
different mounts. Hence creating separate mount log
file for each connection from worker. Each connection
from worker is identified uniquely using 'mastervol uuid',
'master host', 'master brickpath', 'salve vol'. The log
file name will be combination of the above.
Change-Id: I67871dc8e8ea5864e2ad55e2a82063be0138bf0c
BUG: 1412689
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/16384
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If directory creation is failed, return immediately before
further processing. Allowing it to further process will
fail the entire directory tree syncing to slave. Hence
master will log and raise exception if it's directory
failure. Earlier, master used to log the failure and
proceed.
Change-Id: Iba2a8b5d3d0092e7a9c8a3c2cdf9e6e29c73ddf0
BUG: 1411607
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/16364
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If log-rsync-performance config is not set, gconf.get_realtime
will return None, Added default value as False if config file
doesn't have this option set.
BUG: 1393678
Change-Id: I89016ab480a16179db59913d635d8553beb7e14f
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/16102
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geo-rep restarts workers when any of the configurations changed. We
don't need to restart workers if tunables like log-rsync-performance
is modified.
With this patch, Geo-rep workers will get new "log-rsync-performance"
config automatically without restart.
BUG: 1393678
Change-Id: I40ec253892ea7e70c727fa5d3c540a11e891897b
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15816
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added Master node information to GEOREP_ACTIVE, GEOREP_PASSIVE, GEOREP_FAULTY
and GEOREP_CHECKPOINT_COMPLETED events.
EVENT_GEOREP_ACTIVE(master_node and master_node_id are new fields)
{
"nodeid": NODEID,
"ts": TIMESTAMP,
"event": "GEOREP_ACTIVE",
"message": {
"master_volume": MASTER_VOLUME_NAME,
"master_node": MASTER_NODE,
"master_node_id": MASTER_NODE_ID,
"slave_host": SLAVE_HOST,
"slave_volume": SLAVE_VOLUME,
"brick_path": BRICK_PATH
}
}
EVENT_GEOREP_PASSIVE(master_node and master_node_id are new fields)
{
"nodeid": NODEID,
"ts": TIMESTAMP,
"event": "GEOREP_PASSIVE",
"message": {
"master_volume": MASTER_VOLUME_NAME,
"master_node": MASTER_NODE,
"master_node_id": MASTER_NODE_ID,
"slave_host": SLAVE_HOST,
"slave_volume": SLAVE_VOLUME,
"brick_path": BRICK_PATH
}
}
EVENT_GEOREP_FAULTY(master_node and master_node_id are new fields)
{
"nodeid": NODEID,
"ts": TIMESTAMP,
"event": "GEOREP_FAULTY",
"message": {
"master_volume": MASTER_VOLUME_NAME,
"master_node": MASTER_NODE,
"master_node_id": MASTER_NODE_ID,
"current_slave_host": CURRENT_SLAVE_HOST,
"slave_host": SLAVE_HOST,
"slave_volume": SLAVE_VOLUME,
"brick_path": BRICK_PATH
}
}
EVENT_GEOREP_CHECKPOINT_COMPLETED(master_node and master_node_id are new fields)
{
"nodeid": NODEID,
"ts": TIMESTAMP,
"event": "GEOREP_CHECKPOINT_COMPLETED",
"message": {
"master_volume": MASTER_VOLUME_NAME,
"master_node": MASTER_NODE,
"master_node_id": MASTER_NODE_ID,
"slave_host": SLAVE_HOST,
"slave_volume": SLAVE_VOLUME,
"brick_path": BRICK_PATH,
"checkpoint_time": CHECKPOINT_TIME,
"checkpoint_completion_time": CHECKPOINT_COMPLETION_TIME
}
}
BUG: 1395660
Change-Id: Ic91af52fa248c8e982e93a06be861dfd69689f34
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15858
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not raise traceback if a file/dir not exists during
unlink or rmdir
BUG: 1396062
Change-Id: Idd43ca1fa6ae6056c3cd493f0e2f151880a3968c
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15868
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added following events
EVENT_GEOREP_ACTIVE
{
"nodeid": NODEID,
"ts": TIMESTAMP,
"event": "GEOREP_ACTIVE",
"message": {
"master_volume": MASTER_VOLUME_NAME,
"slave_host": SLAVE_HOST,
"slave_volume": SLAVE_VOLUME,
"brick_path": BRICK_PATH
}
}
EVENT_GEOREP_PASSIVE
{
"nodeid": NODEID,
"ts": TIMESTAMP,
"event": "GEOREP_PASSIVE",
"message": {
"master_volume": MASTER_VOLUME_NAME,
"slave_host": SLAVE_HOST,
"slave_volume": SLAVE_VOLUME,
"brick_path": BRICK_PATH
}
}
EVENT_GEOREP_CHECKPOINT_COMPLETED
{
"nodeid": NODEID,
"ts": TIMESTAMP,
"event": "GEOREP_ACTIVE",
"message": {
"master_volume": MASTER_VOLUME_NAME,
"slave_host": SLAVE_HOST,
"slave_volume": SLAVE_VOLUME,
"brick_path": BRICK_PATH,
"checkpoint_time": CHECKPOINT_TIME,
"checkpoint_completion_time": CHECKPOINT_COMPLETION_TIME
}
}
BUG: 1379330
Change-Id: I90716175868c59dd65c8d202e73e0ede90347b6a
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15630
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Tested-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After every sync iteration with tarssh mode leaves defunct tar
process.
Added wait for tar process to prevent this issue.
BUG: 1374286
Change-Id: I9953239ef601cc1970c814b00074b45eb00f481e
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15426
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libgfchangelog was not respecting the log_level configured
in Geo-replication. With this patch Libgfchangelog log level
can be configured using `config changelog_log_level TRACE`.
Default Changelog log level is INFO
BUG: 1363965
Change-Id: Ida714931129f6a1331b9d0815da77efcb2b898e3
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15078
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During Rename, If Source and Target has same inode then
Geo-rep unlinks source. But if source is a directory then
this will fail with below traceback
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
res = getattr(self.obj, rmeth)(*in_data[2:])
File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 772,
in entry_ops
os.unlink(entry)
OSError: [Errno 21] Is a directory: '.gfid/12711ebf-7fdc-4f4b-9850-2d75581eb
452/New folder'
With this patch, if EISDIR, rmdir is tried. Logs error in Slave log in case
of ENOTEMPTY.
BUG: 1365791
Change-Id: I099af4192adac5125c0a23988ceb6506f91e987f
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15132
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Setfattr may get ESTALE/EINVAL if a file is being unlinked.
To prevent worker crashing, added retry for these error messages.
On second retry it will get ENOENT and that error is handled by
ignoring.
BUG: 1373373
Change-Id: Ic660fa13208366d57c8d3d492bbef611475e45b7
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15404
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If Rsync/Tar subprocess dies, while logging error Geo-rep fails
with EBADF while accessing error file. Also worker dies while
accessing elines before it is set.
BUG: 1372193
Change-Id: I9cfce116e8aafa4a98654f5190d40a455af8ec95
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15379
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes changelogsdb part of post processing since
not got much performance advantage as expected.
Entry stime and other logging improvements retained.
BUG: 1364420
Change-Id: Ib99d23f09d96c14bc28225b47d9134260f5551bf
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15371
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch, Data and Meta GFIDs are post processed. If Changelog has
UNLINK entry then remove from Data and Meta GFIDs list(If stat on GFID is
ENOENT in Master).
While processing Changelogs,
- Collect all the data and meta operations in a temporary database
- Delete all Data and Meta GFIDs which are already unlinked as per Changelogs
(unlink only if stat on GFID is ENOENT)
- Process all Entry operations as usual
- Process data and meta operations in batch(Fetch from Db in batch)
- Data sync is again batched based on number of changelogs(Default 1day
changelogs). Once the sync is complete, Update last Changelog's time as last_synced
time as usual.
Additionally maintain entry_stime on Brick root, ignore Entry ops if changelog
suffix time is less than entry_stime. If data stime is more than entry_stime,
this can happen only when passive worker updates stime by itself by getting
mount point stime. Use entry_stime = data_stime in this case.
New configurations:
max-rsync-retries - Default Value is 10
max-data-changelogs-in-batch - Max number of changelogs to be considered in a
batch for syncing. Default value is 5760(4 changelogs per min * 60 min *
24 hours)
max-history-changelogs-in-batch - Max number of history changelogs to be
processed at once. Default value 86400(4 changelogs per min * 60 min * 24
hours * 15 days)
BUG: 1364420
Change-Id: I7b665895bf4806035c2a8573d361257cbadbea17
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15110
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add error logs if gf_history_changelog fails. If requested
changelog range is not available, log the error and exit
instead of continuing the loop and exiting in readdir
without logging. Also fixed the duplicate MSGID number in
'changelog-lib-messages.h'
Change-Id: Icd71b89ae23b48a71380657ba5649029c32fabfd
BUG: 1362151
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/15064
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While setting stime/xtime, if the file or directory is already
deleted then Geo-rep will crash with ENOENT.
With this patch, Geo-rep will ignores ENOENT since stime/xtime can't
be applied on a deleted file/directory.
Change-Id: I2d90569e51565f81ae53fcb23323e4f47c9e9672
Signed-off-by: Aravinda VK <avishwan@redhat.com>
BUG: 1339471
Reviewed-on: http://review.gluster.org/14529
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During entry_ops RENAME Geo-rep sends stat info along with the
recorded info from Changelog. In Slave side if Source file exists
Geo-rep renames to Target file by calling os.rename. If source file
does not exists, it tries to create Target file directly using available
stat info from Master. If UID and GID are different in Master for that
file then stat info will have different UID/GID during Create. Geo-rep
gets EACCES when it tries to create a new entry using gfid-access with
different UID/GID.
With this patch, Entry creation with different UID/GID is split into two
operations. Create Entry with UID:0 and GID:0 and then set UID/GID.
Change-Id: I4987e3a205d8513c06fa66198cde145a87003a01
BUG: 1313303
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/13542
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
LINK + RENAME changelog when replayed after worker restart causes stale
hard-links to persist since VFS returns success for RENAME if hard-links
point to same inode.
Solution:
Worker detects RENAME being issued on hard-links to the same inode and
unlinks the source file-name. Conditionally rename by verifying that the
source gfid matches with the on-disk gfid on the slave.
Change-Id: I3ff1c30ef79e77503c8b246d46dab8ac3059ccf2
BUG: 1296174
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: http://review.gluster.org/13189
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Errorstore is maintained by Geo-rep to collect errors from
the child processes opened using Popen. If Popen.communicate
is used then it closes stderr. When stderr is not available
errorstore.tailer() will enter into infinite loop without gap.
With this patch, sleep time added when stderr of Child process
is already closed.
Change-Id: Ic36aabd6de35b259467d0bab7952468432867a94
BUG: 1315601
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/13637
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
CREATE + RENAME changelogs replayed by geo-replication cause
stale old-name entries with same gfid on slave nodes.
A gfid is a unique key in the file-system and should not be
assigned to multiple entries.
Solution:
Create entry on slave only if lstat(gfid) at aux-mount fails.
This applies to files as well as directories.
Change-Id: Ice3340f4ae1251c2dcef024a2388c4d33b5d4919
BUG: 1296206
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: http://review.gluster.org/13316
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geo-rep processes Changelogs in Batch, if one file in batch
fails with rsync error that Changelog file is reprocessed multiple times.
After MAX_RETRY, it logs all the GFIDs from that batch as Skipped.
This patch addresses following issues,
1. When Rsync/Tar fails do not parse Changelog again for retry
2. When Rsync/Tar fails do not replay Entry operations, only retry
rsync/tar for those GFIDs
3. Log Error in Rsync/Tar only in the last Retry
4. Do not log Skipped GFIDs since Rsync/Tar errors are logged for
only failed files.
5. Changed Entry failures as Error instead of Warning
BUG: 1287723
Change-Id: Ie134ce2572693056ab9b9008cd8aa5b5d87f7975
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/12856
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If ENTRY creation failed for symlink in Slave and symlink
renamed in Master. If Source not exists to Rename in Slave
Geo-rep interprets as Create of Target file. Geo-rep sends blob
of regular file to create symlink instead of sending blob of
symlink.
With this patch, Geo-rep identifies symlink and sends respective
blob.
BUG: 1289859
Change-Id: If9351974d1945141a1d3abb838b7d0de7591e48e
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/12917
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Tested-by: Milind Changire <mchangir@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Without '--sparse' option tar will not properly archive sparse file
and geo-replication will result in non-sparse file on the remote end.
Here is more on how I arrived at this
http://markelov.org/wiki/index.php/GlusterFS_3.6.1_on_CentOS_6.5:_geo-replication_and_sparse_files_problem
Change-Id: I8d671964a1b48bbb916e4a064571221bf3631494
BUG: 1276839
Signed-off-by: Alex Markelov <alex@markelov.org>
Reviewed-on: http://review.gluster.org/12476
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
symlinks are not getting synced to slave in a Tiering based volume.
Solution:
Now, symlinks are created directly in cold tier bricks( in the backend).
Earlier, cold tier was avoided for namespace operations and only
hot tier was used while processing changelogs.
Now, cold tier is HASH subvolume in a Tiering volume.
So, carry out namespace operation only in cold tier subvolume and
avoid hot tier subvolume to avoid any races.
Earlier, XSYNC was used(and changeloghistory avoided) during initial sync
in order to avoid race while processing historychangelog in Hot tier.
This is no longer required as there is no race from Hot tier.
Also, avoid both live and history changelog ENTRY operations from Hot tier to avoid any race with cold tier.
Change-Id: Ia8fbb7ae037f5b6cb683f36c0df5c3fc2894636e
BUG: 1287519
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/12844
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
geo-replication session creation fails when Hostname
is having CAPS in it.
Issue is with the regex pattern which handles only small lettered
Hostname.
Fix:
Fix the regex pattern to handle CAPS based hostname as well.
Change-Id: I5c99c102e9706acc2b1fab1e6bf158e68beed373
BUG: 1265522
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/12216
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If different port used for SSH instead of 22, Geo-replication
was failing to establish SSH connection.
ssh_port option can be added using config:ssh_command and
config:ssh_command_tar, but user has to remember complete
ssh command used with parameter to add/modify ssh port.
This patch adds new config option for ssh_port,
gluster volume geo-replication <MASTERVOL> <SLAVEHOST::<SLAVEVOL> \
config ssh_port 52022
Change-Id: I7753a09485f0b1f49d2b2a80b962c720817c96f4
Signed-off-by: Aravinda VK <avishwan@redhat.com>
BUG: 1276028
Reviewed-on: http://review.gluster.org/12444
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When Changelog agent process dies, Geo-replication fails to detect
and worker will run without respective Changelog agent. Status shows
Active/Passive without any progress.
With this patch, Worker process gets killed whenever Changelog
agent dies.
Change-Id: I30b4cc77f924f7e8174b8bfe415ac17f0b3851b4
Signed-off-by: Aravinda VK <avishwan@redhat.com>
BUG: 1277076
Reviewed-on: http://review.gluster.org/12485
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a series of patch which aims to fix geo-replication
in a Tiering Volume.
Problem:
Consider, a file is placed in volume initially and then hot tier is
attached. During any operation on the file, due to lookup a linkto
file is created in hot tier.
Now, any namespace operation carried out on the file is recorded in
both cold and hot tier.
There is a room for races when both changelogs are replayed.
Solution:
So, We are going to replay (namespace related)operations
only in the hot tier.
Why?
a. If the file is directly placed in Hot tier , all fops will be
recorded in HOT tier.
b. If the file is already present in Cold tier, and if any fop is
carried out, it creates linkto file in Hot tier.
Now, operations like UNLINK, RENAME are captured in Hot
tier(by means of linkto file).
This way, we can get both tier's operation in HOT tier itself.
Now, once the file is demoted to COLD tier, any namespace operation
carried out on the cold tier can be avoided as we directly RECORD
the same in HOT tier.
How?
1. Check whether the brick is cold tier and skip ENTRY operation.
2. Also, if it is cold tier brick, use Xsync(which is used during initial run).
This will help in getting all cold tier bricks changes using File System crawl
and helps in avoiding races with hot tier brick(which can happen
if historychangelog used in cold tier brick).
Dependent patches:
1. http://review.gluster.org/12239
2. http://review.gluster.org/12326
Change-Id: I7692b1dbb8813a7e253451bca02f8f09a5782dde
BUG: 1266875
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/12355
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When DHT can't resolve a File it raises ESTALE, ignore ESTALE errors
same as ENOENT after retry.
Affected places:
Xattr.lgetxattr
os.listdir
os.link
Xattr.lsetxattr
os.chmod
os.chown
os.utime
os.readlink
BUG: 1232912
Change-Id: I02015f508d901e4a74dd48e1c52423e78eaf1dcd
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/11296
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|