| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
Brick paths in Volinfo use `:` as the delimiter, and Geo-rep splits
them on the `:` char. This goes wrong with IPv6 addresses, which
contain `:` themselves.
This patch handles the IPv6 case and performs the split properly.
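A minimal sketch of one way to split such an entry safely, assuming it has
the form `<host>:<path>` and that the brick path itself contains no `:`.
The helper name `split_brick` is hypothetical and is not taken from the patch.

```python
def split_brick(entry):
    """Split '<host>:<path>' on the last ':' so IPv6 hosts stay intact."""
    # rsplit with maxsplit=1 cuts only at the final colon, so the colons
    # inside an IPv6 address remain part of the host.
    host, path = entry.rsplit(":", 1)
    return host, path


print(split_brick("node1:/bricks/b1"))
print(split_brick("fe80::1:/bricks/b1"))
```

Splitting from the right works here only because of the stated assumption
that the path has no colon; a naive `split(":")` would break the IPv6 host
into several pieces.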
Fixes: #1366
Change-Id: I25e88d693744381c0ccf3c1dbf1541b84be2499d
Signed-off-by: Aravinda Vishwanathapura <aravinda@kadalu.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geo-replication has a large number of repeated imports as shown below:
```
from syncdutils import set_term_handler, finalize, lf
from syncdutils import log_raise_exception, FreeObject, escape
```
These imports can be clubbed together as shown below:
```
from syncdutils import (set_term_handler, finalize, lf,
                        log_raise_exception, FreeObject, escape)
```
Fixes: #1105
Change-Id: I59a48dd57a70fc851d93150b85e736ce41e8b793
Signed-off-by: kshithijiyer <kshithij.ki@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a window of 15 seconds in which deletes are picked up
by the history crawl even when ignore_deletes is set to true.
As a result, files that are not supposed to be deleted end up
deleted from the slave. Though this works as per design, a
note regarding it is needed.
Added a warning message indicating the same.
Also log an info message when the worker restarts after the
ignore-deletes option is set.
fixes: bz#1708603
Change-Id: I103be882fac18b4cef935efa355f5037a396f7c1
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
- libgfchangelog is simplified by removing unnecessary API Class
- Merged Agent logic into Worker instead of running Worker and Agent as
two separate processes and maintaining RPC between Worker and Agent.
- Geo-rep command Pause and Resume will continue without any changes.
But Agent functionality also gets paused with that.
Updates: #755
Change-Id: Ie2c00fa7dddf21f180f0649e0aaf084d29023c98
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
| |
Fixes: bz#1741779
Change-Id: I708b6b7e6c520dee10445528e6f99ba38e141c25
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The gfid conflict resolution code path is not supposed
to be hit in the generic code path. But a few heavy rename
workloads (BUG: 1694820) make it a generic case. Logging
the entries to be fixed as INFO then floods the log
for these workloads. Hence convert them to DEBUG.
fixes: bz#1709653
Change-Id: I4d5e102b87be5fe5b54f78f329e588882d72b9d9
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Geo-rep fails to sync a rename properly if the destination exists.
The source then remains on the slave, leaving more files on the
slave than on the master. Heavy rename workloads such as logrotate
also caused a lot of ESTALE errors.
Cause:
Geo-rep fails to sync a rename whose destination exists when the
creation of the source file falls into the same batch of changelogs
being processed. After the problematic gfids were fixed by verifying
against the master, the original entries were re-processed, and the
CREATE was re-processed with them, causing extra files on the slave
and making the rename fail.
Solution:
Entries need to be removed from the retry list after fixing
problematic gfids on the slave so that they are not re-created there.
Also treat ESTALE as EEXIST so that the error is properly handled
by verifying the op on the master volume.
Change-Id: I50cf289e06b997adddff0552bf2466d9201dd1f9
fixes: bz#1694820
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Entries counter was incremented twice and decremented only
once. And entries count was being used in place of metadata
entries. This patch fixes both of them.
fixes: bz#1512093
Change-Id: I5601a5fe8d25c9d65b72eb529171e7117ebbb67f
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geo-rep's automatic error handling does gfid conflict
resolution. But if there are ENOENT errors because the
parent is not synced to slave, it doesn't handle them.
This patch adds the logic to create missing
parent directories on slave. It can create the missing
directories up to a depth of 10.
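The bounded parent creation can be sketched as follows. This is only an
illustration on a local filesystem with plain paths; the patch itself works
on the slave, and the helper names `missing_parents` and
`create_missing_parents` are hypothetical.

```python
import os
import tempfile

MAX_DEPTH = 10  # depth limit stated in the commit message


def missing_parents(path, max_depth=MAX_DEPTH):
    """Return the missing ancestor directories of path, outermost first."""
    missing = []
    parent = os.path.dirname(path)
    while not os.path.isdir(parent):
        if len(missing) >= max_depth:
            raise OSError("more than %d missing parents" % max_depth)
        missing.append(parent)
        parent = os.path.dirname(parent)
    return list(reversed(missing))


def create_missing_parents(path):
    # Create outermost directories first so every mkdir has a parent.
    for directory in missing_parents(path):
        os.mkdir(directory)


root = tempfile.mkdtemp()
target = os.path.join(root, "a", "b", "c", "file.txt")
create_missing_parents(target)
print(os.path.isdir(os.path.dirname(target)))
```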
fixes: bz#1643402
Change-Id: Ic97ed1fa5899c087e404d559e04f7963ed7bb54c
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
| |
This patch fixes a spelling mistake in a log message.
Change-Id: I84779c64aef6698cbc1a60ae1a82533e8a6a6e3d
updates: bz#1193929
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
| |
Change-Id: Iac241166d7a35dc7cc6cf07850f9f1bce38fe207
Updates: #411
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Automatic gfid conflict resolution needs to be disabled
during failover/failback as it might lead to data loss
in the following scenario.
1. Master went down without syncing directory "dir1" to slave.
2. When slave is failed over to master, if a new file
is written inside "dir1", creating dir1 again if not
present, "dir1" ends up with different gfid on original
slave.
3. When original master is up and failed back, due to
automatic gfid conflict resolution, "dir1" present in
original master is deleted losing all files and only
new file created on original slave is restored.
Hence during failover/failback, automatic gfid conflict
resolution should be disabled. So in these cases, appropriate
decision is taken.
fixes: bz#1622076
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Change-Id: I433616f5d3e13d4b6eb675475bd554ca34928573
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up the redundant Active/Passive logging code.
With the new status infra implemented, every
status switch is already logged by the status infra.
fixes: bz#1619027
Change-Id: I0a6644cb998f3520e62a5189f21e4d66acc0e7c5
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Please review, it's not always just the comments that were fixed.
I've had to revert of course all calls to creat() that were changed
to create() ...
Only compile-tested!
Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. MKDIR/RMDIR is recorded on all bricks. So if
one brick succeeds creating it, other bricks
should ignore it. But this was not happening.
The fix for rename of directories in hybrid crawl
was trying to rename the directory to itself
and, in the process, crashing with ENOENT if the
directory had been removed.
2. If file is created, deleted and a directory is
created with same name, it was failing to sync.
Again the issue is around the fix for rename
of directories in hybrid crawl. Fixed the same.
If the same case was done with hardlink present
for the file, it was failing. This patch fixes
that too.
fixes: bz#1598884
Change-Id: I6f3bca44e194e415a3d4de3b9d03cc8976439284
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
see https://review.gluster.org/#/c/19788/,
https://review.gluster.org/#/c/19871/,
https://review.gluster.org/#/c/19952/, and
https://review.gluster.org/#/c/20104/
https://review.gluster.org/#/c/20162/
This patch changes uses of map() and raise(), and a few cases of print()
that were overlooked in the prior patch that fixed print.
Note: Fedora packaging guidelines require explicit shebangs, so popular
practices like #!/usr/bin/env python and #!/usr/bin/python are not
allowed; they must be #!/usr/bin/python2 or #!/usr/bin/python3
Note: Selected small fixes from 2to3 utility. Specifically apply,
basestring, funcattrs, idioms, numliterals, set_literal, types, urllib,
zip, map, and raise have already been applied. Also version agnostic
imports for urllib, cpickle, socketserver, _thread, queue, etc., suggested
by Aravinda in https://review.gluster.org/#/c/19767/1
Note: these 2to3 fixes report no changes are necessary: asserts, buffer,
exec, execfile, exitfunc, filter, getcwdu, intern, itertools, metaclass,
methodattrs, ne, next, nonzero, operator, paren, raw_input, reduce,
reload, renames, repr, standarderror, sys_exc, throw, tuple_params,
xreadlines.
Change-Id: Id62ea491e4ab5dd390075c5c6d9d889cf6f9da27
updates: #411
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
| |
BUG: 1529480
Change-Id: If4775ed9886990c0e1bcf4e44c7dfef95cc4f0c3
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
| |
Fixes: #376
Change-Id: Ib92920c716c7d27e1eeb4bc4ebaf3efb48e0694d
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Fixed Python pep8 issues
- Removed dead code
- Rewritten configuration management
- Rewritten Arguments/subcommands handling
- Added Args upgrade to accommodate all these changes without changing
glusterd code
- use of md5 removed, which was used to hash the brick path for workdir
Both Master and Slave nodes will have subdir for session in the
format "<mastervol>_<primary_slave_host>_<slavevol>"
$GLUSTER_LOGDIR/geo-replication/<mastervol>_<primary_slave_host>_<slavevol>
$GLUSTER_LOGDIR/geo-replication-slaves/<mastervol>_<primary_slave_host>_<slavevol>
Log file paths renamed since session info is available with directory
name itself.
$LOG_DIR_MASTER/
- gsyncd.log - Gsyncd, Worker monitor logs
- mnt-<brick-path>.log - Aux mount logs, mounted by each worker
- changes-<brick-path>.log - Changelog related logs(One per brick)
$LOG_DIR_SLAVE/
- gsyncd.log - Slave Gsyncd logs
- mnt-<master-node>-<master-brick-path>.log - Aux mount logs,
mounted for each connection from master-node:master-brick
- mnt-mbr-<master-node>-<master-brick-path>.log - Same as above,
but mountbroker setup
Fixes: #73
Change-Id: I2ec2a21e4e2a92fd92899d026e8543725276f021
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Passive brick's stime was not updated to the
status file immediately after updating the brick
root. As a result the last sync time was showing
'0' until it finishes first crawl if passive
worker becomes active after restart. Fix is to
update the status file immediately after updating
the brick root.
Change-Id: I248339497303bad20b7f5a1d42ab44a1fe6bca99
BUG: 1500346
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Worker occasionally crashed with EINTR on readlink.
The error is transient, not persistent. A worker restart
involves re-processing a few entries in changelogs,
so add EINTR to the retry list to avoid the worker restart.
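A minimal sketch of retrying a call on a transient errno, assuming a generic
wrapper. The names `RETRY_ERRNOS`, `retry_on` and `flaky` are hypothetical,
and the retry limit of 5 is an arbitrary choice, not a value from the patch.

```python
import errno

RETRY_ERRNOS = (errno.EINTR,)  # hypothetical retry list


def retry_on(errnos, func, *args, max_tries=5):
    """Call func(*args), retrying when it fails with one of errnos."""
    for attempt in range(max_tries):
        try:
            return func(*args)
        except OSError as e:
            # Give up on unrelated errors or once the tries are used up.
            if e.errno not in errnos or attempt == max_tries - 1:
                raise


# Stand-in for a call such as os.readlink that is interrupted twice.
calls = []


def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise OSError(errno.EINTR, "interrupted")
    return "ok"


print(retry_on(RETRY_ERRNOS, flaky))
```

A real call site would look like `retry_on(RETRY_ERRNOS, os.readlink, path)`.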
Change-Id: Iefe641437b5d5be583f079fc2a7a8443bcd19f9d
BUG: 1499393
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
If there is a hardlink to a symlink on master
and if the symlink file is deleted on master,
geo-rep fails to sync the hardlink.
Typical Usecase:
It's easily hit with rsnapshot use case where
it uses hardlinks.
Example Reproducer:
Setup geo-replication between master and slave
volume and in master mount point, do the following.
1. mkdir /tmp/symlinkbug
2. ln -f -s /does/not/exist /tmp/symlinkbug/a_symlink
3. rsync -a /tmp/symlinkbug ./
4. cp -al symlinkbug symlinkbug.0
5. ln -f -s /does/not/exist2 /tmp/symlinkbug/a_symlink
6. rsync -a /tmp/symlinkbug ./
7. cp -al symlinkbug symlinkbug.1
Cause:
If the source was not present while syncing hardlink,
it was always packing the blob as regular file.
Fix:
If the source was not present while syncing hardlink,
pack the blob based on the mode.
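The mode check behind this fix can be sketched as below. The actual blob
layout is not described in this message, so the sketch only shows how the
kind of entry could be chosen from `st_mode`; the helper name `blob_kind`
is hypothetical.

```python
import stat


def blob_kind(mode):
    """Pick the kind of entry to create from an st_mode value."""
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISDIR(mode):
        return "directory"
    return "regular"


print(blob_kind(stat.S_IFLNK | 0o777))
print(blob_kind(stat.S_IFREG | 0o644))
```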
Change-Id: Iaa12d6f99de47b18e0650e7c4eb455f23f8390f2
BUG: 1432046
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reported-by: Christian Lohmaier <lohmaier+rhbz@gmail.com>
Reviewed-on: https://review.gluster.org/18011
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Convert the logs related to fixing entry failures
caused by gfid mismatch into the structured logging
format
Change-Id: I9bce950c5339b48d3ec8b84bddee38b0473b7634
Updates: #246
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/17896
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Updates: #246
Change-Id: If0ce83fe8dd3068bfb671f398b2e82ac831288d0
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/17577
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libgfchangelog was encoding paths as per RFC 3986, but encoding is only
required for SPACE and NEWLINE chars, since the NEWLINE char is used as
the record separator and SPACE as the field separator in the parsed
changelog output.
Changed the encoding function to encode only SPACE and NEWLINE.
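A minimal sketch of such a restricted encoder, written in Python for
illustration only (libgfchangelog itself is a C library). The function name
`encode_path` and the percent-style escapes `%20` and `%0A` are assumptions.

```python
def encode_path(path):
    """Escape only the two separator characters, leave the rest as-is."""
    # SPACE is the field separator and NEWLINE the record separator,
    # so these are the only characters that can break parsing.
    return path.replace(" ", "%20").replace("\n", "%0A")


print(encode_path("dir one/file\nname"))
print(encode_path("a/b:c?d"))
```

This sketch does not address paths that already contain a literal `%`.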
BUG: 1451724
Change-Id: I1936efad31788a9e636f912c832ed7d7efea4fe2
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: https://review.gluster.org/17787
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a distributed volume on master, it can so happen that
the RMDIR followed by MKDIR is recorded in changelog on
a particular subvolume with same gfid and pargfid/bname
but not on all subvolumes as below.
E 61c67a2e-07f2-45a9-95cf-d8f16a5e9c36 RMDIR \
9cc51be8-91c3-4ef4-8ae3-17596fcfed40%2Ffedora2
E 61c67a2e-07f2-45a9-95cf-d8f16a5e9c36 MKDIR 16877 0 0 \
9cc51be8-91c3-4ef4-8ae3-17596fcfed40%2Ffedora2
While processing this changelog, geo-rep thinks RMDIR is
successful and does recursive rmdir on slave. But in the
master the directory still exists. This could lead to
data discrepancy between master and slave.
Cause:
An RMDIR-MKDIR pair gets recorded this way in the changelog when
the directory removal succeeds on the cached subvolume but
fails on one of the hashed subvols for some reason
(for example, it may be down). In this case, the directory is
re-created on the cached subvol, which gets recorded as MKDIR
again in the changelog.
Solution:
So while processing RMDIR geo-replication should stat on
master with gfid and should not delete it if it's present.
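The check can be sketched as follows, with an ordinary local path standing
in for the gfid-based lookup on the master. The helper names
`exists_on_master` and `should_replay_rmdir` are hypothetical.

```python
import errno
import os
import tempfile


def exists_on_master(path):
    """Return True if the entry still exists, False on ENOENT."""
    try:
        os.lstat(path)
        return True
    except OSError as e:
        if e.errno == errno.ENOENT:
            return False
        raise


def should_replay_rmdir(master_path):
    # Replay the RMDIR on the slave only if the directory is really gone.
    return not exists_on_master(master_path)


d = tempfile.mkdtemp()
print(should_replay_rmdir(d))
os.rmdir(d)
print(should_replay_rmdir(d))
```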
Change-Id: If5da1d6462eb4d9ebe2e88b3a70cc454411a133e
BUG: 1467718
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/17695
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changed all log messages to structured log format
Change-Id: Idae25f8b4ad0bbae38f4362cbda7bbf51ce7607b
Updates: #240
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: https://review.gluster.org/17551
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Geo-rep, the number of sync jobs can be configured using
`config sync-jobs 3`. This patch logs the following information for
each sync job (Rsync/Tarssh).
Example output:
[2017-06-13 09:09:32.532181] I [master(/bricks/b1):1713:syncjob] Syncer: \
Sync Time Taken (Job:2 Files:5484 ReturnCode:0): 4.8774 secs
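A minimal sketch of how such a message could be produced, assuming the job is
a callable that returns its exit code. The helper names `timed` and
`format_sync_log` are hypothetical.

```python
import time


def timed(job):
    """Run job() and return (return_code, seconds_taken)."""
    start = time.time()
    return_code = job()
    return return_code, time.time() - start


def format_sync_log(job_id, num_files, return_code, duration):
    return "Sync Time Taken (Job:%d Files:%d ReturnCode:%d): %.4f secs" % (
        job_id, num_files, return_code, duration)


rc, secs = timed(lambda: 0)  # stand-in for an rsync/tar invocation
print(format_sync_log(2, 5484, rc, secs))
```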
Change-Id: Ifceb96d4b8d14e00fd1290c0aeff60d64b4d7f37
BUG: 1455179
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: https://review.gluster.org/17531
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Logging the type and count of each fop per
batch helps to know the kind of I/O.
Logging the time taken to sync entry ops, metadata
ops and data ops gives a good understanding
of where most of the time is being spent.
This patch does the same.
Change-Id: Ib52a0f9ede905f28a468b68bdf6d23e4b043f3e3
BUG: 1455179
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/17066
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changelog batch size is set to 727040 bytes, which
is the total size of all the changelogs in a single batch.
The value is based on a few tests and corresponds to processing
approximately 5K entries, but it might vary on different machines.
Making it configurable gives more control on the
frequency of stime updates. This patch does the same.
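Size-based batching can be sketched as below, assuming each changelog is
given as a `(name, size_in_bytes)` pair. The helper name `batch_by_size` is
hypothetical; 727040 is the default mentioned above.

```python
DEFAULT_BATCH_SIZE = 727040  # bytes, default from the commit message


def batch_by_size(changelogs, max_bytes=DEFAULT_BATCH_SIZE):
    """Group (name, size) pairs into batches of at most max_bytes."""
    batch, total = [], 0
    for name, size in changelogs:
        # Close the current batch before it would exceed the limit.
        if batch and total + size > max_bytes:
            yield batch
            batch, total = [], 0
        batch.append(name)
        total += size
    if batch:
        yield batch


logs = [("CHANGELOG.1", 400000), ("CHANGELOG.2", 300000),
        ("CHANGELOG.3", 100000)]
print(list(batch_by_size(logs)))
```

A smaller limit yields more batches, so stime is updated more often.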
Change-Id: I9a5ebb3d92c1327dded0e0a712c43a5a9046c1b0
BUG: 1454872
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/17376
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If stime is set to (0, 0) on master brick root, it
is expected to do complete sync ignoring the stime
set on sub directories. But while initializing the
stime variable for comparison, it was initialized
to (-1, 0) instead of (0, 0). Fixed the same.
The stime is set to (0, 0) with the 'reset-sync-time' option
while deleting session.
'gluster vol geo-rep master fedora1::slave delete reset-sync-time'
The scenario happens when geo-rep session is deleted as above and
for some reason the session is re-established with same slave volume
after deleting data on slave volume.
Change-Id: Ie5bc8f008dead637a09495adeef5577e2b33bc90
BUG: 1422760
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: https://review.gluster.org/16629
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If directory creation fails, return immediately without
further processing. Allowing further processing will
fail the syncing of the entire directory tree to slave. Hence
master will log and raise an exception if it's a directory
failure. Earlier, master used to log the failure and
proceed.
Change-Id: Iba2a8b5d3d0092e7a9c8a3c2cdf9e6e29c73ddf0
BUG: 1411607
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/16364
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During Hybrid crawl, Geo-rep maintains stime xattr in subdirectories along
with the Brick root. This is done to skip directories if Geo-rep crashes
before Hybrid crawl completes.
Update Last synced status only when stime xattr updated in brick root.
Status output will mislead if it shows sub directory stime as
last synced time.
BUG: 1396081
Change-Id: I5b73aee7ae4a1c1e2d1001d1f55559b9f9efd6e6
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15869
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Redundant log messages removed.
- Worker and connected slave node details added in "starting worker" log
- Added log for Monitor state change
- Added log for Worker status change(Initializing/Active/Passive/Faulty)
- Added log for Crawl status Change
- Added log for config set and reset
- Added log for checkpoint set, reset and completion
BUG: 1359612
Change-Id: Icc7173ff3c93de4b862bdb1a61760db7eaf14271
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15684
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libgfchangelog was not respecting the log_level configured
in Geo-replication. With this patch Libgfchangelog log level
can be configured using `config changelog_log_level TRACE`.
Default Changelog log level is INFO
BUG: 1363965
Change-Id: Ida714931129f6a1331b9d0815da77efcb2b898e3
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15078
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes the changelogsdb part of post processing since
it did not give the performance advantage that was expected.
Entry stime and other logging improvements retained.
BUG: 1364420
Change-Id: Ib99d23f09d96c14bc28225b47d9134260f5551bf
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15371
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch, Data and Meta GFIDs are post processed. If Changelog has
UNLINK entry then remove from Data and Meta GFIDs list(If stat on GFID is
ENOENT in Master).
While processing Changelogs,
- Collect all the data and meta operations in a temporary database
- Delete all Data and Meta GFIDs which are already unlinked as per Changelogs
(unlink only if stat on GFID is ENOENT)
- Process all Entry operations as usual
- Process data and meta operations in batch(Fetch from Db in batch)
- Data sync is again batched based on number of changelogs(Default 1day
changelogs). Once the sync is complete, Update last Changelog's time as last_synced
time as usual.
Additionally maintain entry_stime on Brick root, ignore Entry ops if changelog
suffix time is less than entry_stime. If data stime is more than entry_stime,
this can happen only when passive worker updates stime by itself by getting
mount point stime. Use entry_stime = data_stime in this case.
New configurations:
max-rsync-retries - Default Value is 10
max-data-changelogs-in-batch - Max number of changelogs to be considered in a
batch for syncing. Default value is 5760(4 changelogs per min * 60 min *
24 hours)
max-history-changelogs-in-batch - Max number of history changelogs to be
processed at once. Default value 86400(4 changelogs per min * 60 min * 24
hours * 15 days)
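The two default values follow directly from the arithmetic quoted above,
which can be checked with a few lines (the variable names are illustrative
only):

```python
CHANGELOGS_PER_MIN = 4

# One day of changelogs: 4 per minute * 60 minutes * 24 hours.
per_day = CHANGELOGS_PER_MIN * 60 * 24

# Fifteen days of changelogs.
per_15_days = per_day * 15

print(per_day, per_15_days)
```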
BUG: 1364420
Change-Id: I7b665895bf4806035c2a8573d361257cbadbea17
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/15110
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set the stime xattr at all the brick roots to (0,0) if the argument
reset-sync-time has been provided on the command-line.
To avoid testing against directory specific stime, the remote
stime is assumed to be minus_infinity, if the root directory
stime is set to (0,0), before the directory scan begins.
This triggers a full volume resync to slave in the case of a
geo-rep session recreation with the same master-slave volume
pair.
Command synopsis:
gluster volume geo-replication <MASTERVOL> <SLAVE>::<SLAVEVOL> delete \
[reset-sync-time]
Update gluster cli man page to include new sub-command reset-sync-time.
Change-Id: Ie4ce03b9425ed9bb81eda8681058c0fc6f990948
BUG: 1311926
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: http://review.gluster.org/14051
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If unlinked GFID is not present in data list to be synced then
Geo-rep worker was crashing with KeyError. Handled KeyError with
this patch.
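A minimal sketch of the defensive removal, assuming the data list is kept as
a dict keyed by GFID. The name `drop_unlinked` is hypothetical.

```python
def drop_unlinked(pending, unlinked_gfids):
    """Remove unlinked GFIDs from pending, ignoring unknown ones."""
    for gfid in unlinked_gfids:
        try:
            del pending[gfid]
        except KeyError:
            # The GFID was never queued for data sync; nothing to do.
            pass
    return pending


pending = {"gfid-1": "data", "gfid-2": "data"}
print(drop_unlinked(pending, ["gfid-2", "gfid-9"]))
```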
BUG: 1345744
Change-Id: I5a1c9ca4473e32606df2e5c7e26c95faf55d44c0
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/14706
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geo-rep processes Changelogs in Batch, if one file in batch
fails with rsync error that Changelog file is reprocessed multiple times.
After MAX_RETRY, it logs all the GFIDs from that batch as Skipped.
This patch addresses following issues,
1. When Rsync/Tar fails do not parse Changelog again for retry
2. When Rsync/Tar fails do not replay Entry operations, only retry
rsync/tar for those GFIDs
3. Log Error in Rsync/Tar only in the last Retry
4. Do not log Skipped GFIDs since Rsync/Tar errors are logged for
only failed files.
5. Changed Entry failures as Error instead of Warning
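Points 1 to 3 above can be sketched as a retry loop that narrows to the
failed GFIDs. This is a sketch under assumptions: `sync_fn` stands in for the
rsync/tar step and returns the GFIDs that failed, and the names
`sync_with_retry` and `fake_sync` as well as the default of 10 retries are
hypothetical.

```python
def sync_with_retry(gfids, sync_fn, max_retry=10, log=print):
    """Retry only the failed GFIDs; log errors on the last retry only."""
    pending = set(gfids)
    for attempt in range(1, max_retry + 1):
        failed = set(sync_fn(pending))
        if not failed:
            return set()
        if attempt == max_retry:
            for gfid in sorted(failed):
                log("sync failed: %s" % gfid)
        # No changelog re-parse and no entry replay: just narrow the set.
        pending = failed
    return pending


batches = []


def fake_sync(batch):
    batches.append(sorted(batch))
    return {"g2"} if len(batches) < 3 else set()


leftover = sync_with_retry(["g1", "g2"], fake_sync, max_retry=5)
print(batches, leftover)
```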
BUG: 1287723
Change-Id: Ie134ce2572693056ab9b9008cd8aa5b5d87f7975
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/12856
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Hardlinks are synced as Sticky bit files to Slave in
a Tiering based volume.
In a Tiering based volume, cold tier is hashed subvolume
and geo-rep captures all namespace operations in cold tier.
While syncing a file and its corresponding hardlink, it is
recorded as MKNOD in cold tier(for both) and
We end up creating two different files in Slave.
Solution:
If MKNOD with Sticky bit set is present, record it as LINK.
This way it will create a HARDLINK if source file exists (on slave),
else it will create a new file.
This way, Slave can create Hardlink file itself (instead
of creating a new file) in case of hardlink.
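The sticky-bit test at the heart of this change can be sketched as below. The
operation names are taken from the message; the helper name `effective_op`
is hypothetical.

```python
import stat


def effective_op(op, mode):
    """Treat MKNOD with the sticky bit set as LINK, else keep op."""
    if op == "MKNOD" and mode & stat.S_ISVTX:
        return "LINK"
    return op


print(effective_op("MKNOD", stat.S_IFREG | stat.S_ISVTX))
print(effective_op("MKNOD", stat.S_IFREG | 0o644))
```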
Change-Id: Ic50dc6e64df9ed01799c30539a33daace0abe6d4
BUG: 1301032
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/13281
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If ENTRY creation of a symlink failed in Slave and the symlink
is then renamed in Master, the source of the Rename does not exist
in Slave, so Geo-rep interprets the Rename as a Create of the Target
file. Geo-rep then sends the blob of a regular file to create the
symlink instead of sending the blob of a symlink.
With this patch, Geo-rep identifies the symlink and sends the
respective blob.
BUG: 1289859
Change-Id: If9351974d1945141a1d3abb838b7d0de7591e48e
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/12917
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Tested-by: Milind Changire <mchangir@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
symlinks are not getting synced to slave in a Tiering based volume.
Solution:
Now, symlinks are created directly in cold tier bricks( in the backend).
Earlier, cold tier was avoided for namespace operations and only
hot tier was used while processing changelogs.
Now, cold tier is HASH subvolume in a Tiering volume.
So, carry out namespace operation only in cold tier subvolume and
avoid hot tier subvolume to avoid any races.
Earlier, XSYNC was used(and changeloghistory avoided) during initial sync
in order to avoid race while processing historychangelog in Hot tier.
This is no longer required as there is no race from Hot tier.
Also, avoid both live and history changelog ENTRY operations from Hot tier to avoid any race with cold tier.
Change-Id: Ia8fbb7ae037f5b6cb683f36c0df5c3fc2894636e
BUG: 1287519
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/12844
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a process closes any open fd of a file on which
it holds an fcntl lock, all of its fcntl locks on that
file are released, even if another fd of the same file
is still open and holds the lock. This sometimes causes
both replica workers to be ACTIVE. This patch
fixes that issue.
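One way to avoid the problem is to open the lock file once and reuse that
single fd for every lock attempt, so no other fd of the file is ever closed.
The sketch below assumes a POSIX system; the class name `LockFile` is
hypothetical and this is not the code from the patch.

```python
import fcntl
import os
import tempfile


class LockFile(object):
    """Keeps one long-lived fd for the lock file."""

    def __init__(self, path):
        # Opened once; never reopened or closed while the worker runs.
        self.fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)

    def try_acquire(self):
        try:
            fcntl.lockf(self.fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except OSError:
            # Another process holds the lock.
            return False


lock_path = os.path.join(tempfile.mkdtemp(), "brick.lock")
lock = LockFile(lock_path)
first_fd = lock.fd
print(lock.try_acquire())
print(lock.try_acquire())
```

Repeated calls reuse the same fd, so the lock is kept and no fds accumulate.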
Change-Id: I1e203ab0e29442275338276deb56d09e5679329c
BUG: 1285488
Original-Author: Aravinda VK <avishwan@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/12752
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Active worker tries to acquire lock in each iteration. On every successful
lock acquisition it was not closing the previously opened lock fd.
To see the leak, get the PID of worker,
ps -ax | grep feedback-fd
watch ls /proc/$pid/fd
BUG: 1225566
Change-Id: Ic476c24c306e7ab372c5560fbb80ef39f4fb31af
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/12332
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GEO-REP INTEROP WITH SHARD FEATURE
Problem:
The sequence of entry creation and chown in master
is recorded as creation of entry with resulted
user:group in xsync changelog. During sync, entry
creation is always split into two ops, MKNOD and
SETATTR. Hence the issue is not being hit otherwise
it would have failed with EPERM if parent is owned
by different user. But with shard translator being
enabled on slave, doing entry creation with MKNOD and
SETATTR is not allowed, SETATTR fails as it accesses
inode structure which is not linked.
Solution:
The sequence of entry creation and chown on the master
should always be recorded as MKNOD and SETATTR separately,
and entry creation should be done with a single op in the
gfid-access xlator. The gfid-access patch will be sent separately.
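As a rough illustration of the recording change, the sketch below turns one "create with final owner" event into two separate records. The function name and the dict-based record layout are hypothetical and do not reflect the real xsync changelog format.

```python
def split_create_and_chown(gfid, entry, mode, uid, gid):
    """Return separate MKNOD and SETATTR records for a create plus chown.

    The dict layout is a made-up stand-in for a changelog record.
    """
    mknod = {"op": "MKNOD", "gfid": gfid, "entry": entry, "mode": mode}
    setattr_rec = {"op": "SETATTR", "gfid": gfid, "uid": uid, "gid": gid}
    # The creation is always recorded first, the ownership change second.
    return [mknod, setattr_rec]
```

Keeping the ownership change in its own record means the creation itself no longer carries the final user:group.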
Change-Id: I93e554bf9342397a7660503f5128e9709f8a0cd8
BUG: 1265148
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/12205
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During XSync crawl, the last_synced time in the status file was not updated.
This patch fixes the issue by updating the status file when the stime xattr
is updated after an Xsync or Changelog crawl.
Change-Id: I4dc3a2d4c3d8378a939da0868caf1aef4f789599
Signed-off-by: Aravinda VK <avishwan@redhat.com>
BUG: 1247536
Reviewed-on: http://review.gluster.org/11771
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GEO-REP INTEROP WITH SHARD FEATURE
If the changelog records an FXATTROP or XATTROP,
add the gfid to the rsync queue.
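A minimal sketch of that rule, assuming each changelog record is a whitespace-separated line of the form `<type> <gfid> <op>`. This layout and the function name are assumptions made for illustration, not the parser used by geo-rep.

```python
# Ops that should trigger a data sync of the gfid, per the rule above.
DATA_TRIGGER_OPS = {"XATTROP", "FXATTROP"}


def queue_for_rsync(records):
    """Return gfids to hand to rsync, in first-seen order, without repeats."""
    queue = []
    for record in records:
        fields = record.split()
        if len(fields) < 3:
            continue  # Not a "<type> <gfid> <op>" record; ignore it.
        _rec_type, gfid, op = fields[0], fields[1], fields[2]
        if op in DATA_TRIGGER_OPS and gfid not in queue:
            queue.append(gfid)
    return queue
```

For example, a record list containing one XATTROP and one FXATTROP on the same gfid yields that gfid once.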
Change-Id: If68d38d7ed00b70a4618cfcc8e75df3fbadbf724
BUG: 1265148
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/12226
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a series of patches which aims to fix geo-replication
in a Tiering Volume.
Problem:
Consider a file that is placed in the volume initially, after which a hot tier is
attached. During any operation on the file, a lookup creates a linkto
file in the hot tier.
Now, any namespace operation carried out on the file is recorded in
both the cold and the hot tier.
There is room for races when both changelogs are replayed.
Solution:
So, we are going to replay (namespace-related) operations
only in the hot tier.
Why?
a. If the file is directly placed in the Hot tier, all fops will be
recorded in the HOT tier.
b. If the file is already present in the Cold tier and any fop is
carried out, it creates a linkto file in the Hot tier.
Now, operations like UNLINK and RENAME are captured in the Hot
tier (by means of the linkto file).
This way, we can get both tiers' operations in the HOT tier itself.
Now, once the file is demoted to the COLD tier, any namespace operation
carried out on the cold tier can be skipped, as the same operation is
already recorded in the HOT tier.
How?
1. Check whether the brick is a cold tier brick and, if so, skip ENTRY operations.
2. Also, if it is a cold tier brick, use Xsync (which is used during the initial run).
This helps in getting all cold tier brick changes using a file system crawl
and in avoiding races with the hot tier brick (which can happen
if the history changelog is used on the cold tier brick).
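The two steps above can be sketched as follows. The function names, the `"E"` record-type marker and the returned crawl labels are hypothetical and only illustrate the decision, not the actual worker code.

```python
def crawl_plan(is_cold_tier_brick):
    """Pick the crawl type and whether ENTRY ops are replayed for a brick."""
    if is_cold_tier_brick:
        # Cold tier: file system crawl only, namespace ops come from hot tier.
        return {"crawl": "xsync", "replay_entry_ops": False}
    return {"crawl": "changelog", "replay_entry_ops": True}


def ops_to_replay(ops, is_cold_tier_brick):
    """Drop ENTRY ("E") records when the brick belongs to the cold tier."""
    if crawl_plan(is_cold_tier_brick)["replay_entry_ops"]:
        return list(ops)
    return [op for op in ops if op[0] != "E"]
```

Here each op is assumed to be a tuple whose first element is the record type, so a cold tier brick keeps only its non-ENTRY records.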
Dependent patches:
1. http://review.gluster.org/12239
2. http://review.gluster.org/12326
Change-Id: I7692b1dbb8813a7e253451bca02f8f09a5782dde
BUG: 1266875
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/12355
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a series of patches which aims to fix geo-replication
in a Tiering Volume.
Problem:
Consider a file that is placed in the volume initially, after which a hot tier is
attached. During any operation on the file, a lookup creates a linkto
file in the hot tier.
Now, any namespace operation carried out on the file is recorded in
both the cold and the hot tier.
There is room for races when both changelogs are replayed.
Solution:
So, we are going to replay (namespace-related) operations
only in the hot tier.
Why?
a. If the file is directly placed in the Hot tier, all fops will be
recorded in the HOT tier.
b. If the file is already present in the Cold tier and any fop is
carried out, it creates a linkto file in the Hot tier.
Now, operations like UNLINK and RENAME are captured in the Hot tier (by means of the linkto file).
This way, we can get both tiers' operations in the HOT tier itself.
But we may miss the initial data sync immediately after the file is
created, as only MKNOD is recorded. So, if a MKNOD is encountered
with the sticky bit set, queue a DATA operation for the corresponding gfid.
(This is addressed in this patch.)
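The sticky-bit check can be sketched with the standard `stat` module. The function name and the way the op and mode are passed in are assumptions for illustration; only `stat.S_ISVTX` is a real constant.

```python
import stat


def needs_data_sync(op, mode):
    """Return True when a MKNOD record carries the sticky bit in its mode."""
    return op == "MKNOD" and bool(mode & stat.S_ISVTX)
```

A caller would queue a DATA operation for the record's gfid whenever this returns True, for example for a mode of `stat.S_IFREG | stat.S_ISVTX | 0o644`.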
So, if tier-gfid linkto is set, we need to record the corresponding
MKNOD. Earlier this was skipped, as it was marked as an INTERNAL fop.
(These changelog-related changes are addressed in the patch:
- http://review.gluster.org/12417)
Change-Id: I2fa84cfa2b0f86506c3d15d484138ab9651e4f83
BUG: 1266875
Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
Reviewed-on: http://review.gluster.org/12326
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|