summaryrefslogtreecommitdiffstats
path: root/tests/geo-rep.rc
Commit message (Collapse)AuthorAgeFilesLines
* tests/geo-rep: Add geo-rep cli testcasesKotresh HR2019-06-061-0/+12
| | | | | | Change-Id: Icf93b90bcac022a355d4718220698987dbc91ecf Signed-off-by: Kotresh HR <khiremat@redhat.com> updates: bz#1693692
* tests/geo-rep: Add geo-rep glusterd test casesKotresh HR2019-06-041-0/+19
| | | | | | | | | | 1. Add geo-rep fanout test case 2. Add glusterd geo-rep negative test cases 3. Add glusterd geo-rep config test cases Change-Id: I856c087eb3216d8f0ffd1f266deac88e9a4effec Signed-off-by: Kotresh HR <khiremat@redhat.com> updates: bz#1693692
* geo-rep: Fix sync hang with tarsshKotresh HR2019-05-131-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Geo-rep sync hangs when tarssh is used as sync engine at heavy workload. Analysis and Root cause: It's found out that the tar process was hung. When debugged further, it's found out that stderr buffer of tar process on master was full i.e., 64k. When the buffer was copied to a file from /proc/pid/fd/2, the hang is resolved. This can happen when files picked by tar process to sync doesn't exist on master anymore. If this count increases around 1k, the stderr buffer is filled up. Fix: The tar process is executed using Popen with stderr as PIPE. The final execution is something like below. tar | ssh <args> root@slave tar --overwrite -xf - -C <path> It was waiting on ssh process first using communicate() and then tar. Note that communicate() reads stdout and stderr. So when stderr of tar process is filled up, there is no one to read until untar via ssh is completed. This can't happen and leads to deadlock. Hence we should be waiting on both process parallely, so that stderr is read on both processes. Change-Id: I609c7cc5c07e210c504771115b4d551a2e891adf fixes: bz#1707728 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* tests/geo-rep: Fix arequal checksum comparisonKotresh HR2019-05-091-1/+2
| | | | | | | | | The arequal checkusm comparison was always returning as successful, eventhough, if it was not. Fixed the same. Change-Id: I5083da25c0954126e452d06311d2d376f8540555 fixes: bz#1707742 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* geo-rep: Fix rename with existing destination with same gfidSunny Kumar2019-04-261-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Geo-rep fails to sync the rename properly if destination exists. It results in source to be remained on slave causing more number of files on slave. Also heavy rename workload like logrotate caused lot of ESTALE errors Cause: Geo-rep fails to sync rename if destination exists if creation of source file also falls into single batch of changelogs being processed. This is because, after fixing problematic gfids verifying from master, while re-processing original entries, CREATE also was re-processed causing more files on slave and rename to be failed. Solution: Entries need to be removed from retrial list after fixing problematic gfids on slave so that it's not re-created again on slave. Also treat ESTALE as EEXIST so that the error is properly handled verifying the op on master volume. Change-Id: I50cf289e06b997adddff0552bf2466d9201dd1f9 fixes: bz#1694820 Signed-off-by: Kotresh HR <khiremat@redhat.com> Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* geo-rep: Fix syncing multiple rename of symlinkKotresh HR2019-03-291-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Geo-rep fails to sync rename of symlink if it's renamed multiple times if creation and rename happened successively Worker crash at slave: Traceback (most recent call last): File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", in worker res = getattr(self.obj, rmeth)(*in_data[2:]) File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", in entry_ops [ESTALE, EINVAL, EBUSY]) File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", in errno_wrap return call(*arg) File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in lsetxattr cls.raise_oserr() File "/usr/libexec/glusterfs/python/syncdaemon/libcxattr.py", in raise_oserr raise OSError(errn, os.strerror(errn)) OSError: [Errno 12] Cannot allocate memory Geo-rep Behaviour: 1. SYMLINK doesn't record target path in changelog. So while syncing SYMLINK, readlink is done on master to get target path. 2. Geo-rep will create destination if source is not present while syncing RENAME. Hence while syncing RENAME of SYMLINK, target path is collected from destination. Cause: If symlink is created and renamed multiple times, creation of symlink is ignored, as it's no longer present on master at that path. While symlink is renamed multiple times at master, when syncing first RENAME of SYMLINK, both source and destination is not present, hence target path is not known. In this case, while creating destination directly at slave, regular file attributes were encoded into blob instead of symlink, causing failure in gfid-access translator while decoding blob. Solution: While syncing of RENAME of SYMLINK, when target is not known and when src and destination is not present on the master, don't create destination. Ignore the rename. It's ok to ignore. If it's unliked, it's fine. If it's renamed to something else, it will be synced then. Change-Id: Ibdfa495513b7c05b5370ab0b89c69a6802338d87 fixes: bz#1693648 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* geo-rep: Make slave volume read-only (by default)Harpreet Kaur2018-12-071-0/+7
| | | | | | | | | | | | | | | Added a command to set "features.read-only" option to a default value "on" for slave volume. Changes are made in: $SRC//extras/hook-scripts/S56glusterd-geo-rep-create-post.sh for root geo-rep and $SRC/geo-replication/src/set_geo_rep_pem_keys.sh for non-root geo-rep. Fixes: bz#1654187 Change-Id: I15beeae3506f3f6b1dcba0a5c50b6344fd468c7c Signed-off-by: Harpreet Kaur <hlalwani@redhat.com>
* tests/geo-rep: Mask failure of geo-rep arbiter testKotresh HR2018-12-051-17/+17
| | | | | | | | | | | | Comment out the particular test which is failing arbitrarily. Also changed the code to differentiate error cases. There could be some race because of which it's failing arbitrarily. This will be debugged and fixed in separate patch. Change-Id: I925df6421737d7a9abd9446a9d85029b4285ad2c updates: bz#1193929 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* geo-rep: Fix syncing of files with non-ascii filenamesKotresh HR2018-12-041-1/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Creation of files/directories with non-ascii names fails to sync to the slave. It crashes with below traceback on slave. ... File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/repce.py", line 118, in worker res = getattr(self.obj, rmeth)(*in_data[2:]) File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py", line 709, in entry_ops [ESTALE, EINVAL, EBUSY]) File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/syncdutils.py", line 546, in errno_wrap return call(*arg) File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/libcxattr.py", line 83, in lsetxattr cls.raise_oserr() File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/libcxattr.py", line 38, in raise_oserr raise OSError(errn, os.strerror(errn)) OSError: [Errno 12] Cannot allocate memory Cause: The length calculation arguments passed to blob creation was done before encoding. Hence was failing in gfid-access layer. Fix: It appears that the calculating lenght properly fixes this issue. But it will cause issues in other places in 'python2' and not in 'python3'. So encoding and decoding each required string to make geo-rep compatible with both 'python2' and 'python3' is a nightmare and is not fool proof. Hence kept 'python2' code as is with out encode/decode and applied encode/decode only to 'python3' Added non-ascii filename tests to regression fixes: bz#1650893 Change-Id: I35cfaf848e07b1a0b5cb93c01b98b472f08271a6 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* geo-rep/hook-script: Fix ssh/scp optionsKotresh HR2018-08-031-6/+16
| | | | | | | | | | | | | | Always use ssh and scp with "-oPasswordAuthentication=no" and "-oStrictHostKeyChecking=no" options. It might hang the post script otherwise leading geo-rep setup failure Also increased geo-rep timeout. Occasionally, it's taking more time to reach Active/Passive status. Especially, the first start after create. fixes: bz#1610405 Change-Id: I9560d64dbe0edf5db73446a9fc97dda19b88d233 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* tests/geo-rep: Add rsnapshot and hardlink rename test caseKotresh HR2018-07-131-3/+89
| | | | | | | | | | | | 1. rsnapshot creates a I/O pattern which involves create, rename, symlink, hardlink in specific order. 2. Hardlink rename on master with source unlinked use case fixes: bz#1597540 Change-Id: Iedade47e5bf9905424a974df6ec33bc6f6695082 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* tests/geo-rep: Add symlink rename test caseKotresh HR2018-07-021-34/+66
| | | | | | | | | | Added a test case of symlink rename and directory creation with the name same as original symlink file. Also fixed few other issues in geo-rep.rc fixes: bz#1595726 Change-Id: I8e6acd3e742f3a0104cd37b87d1c0e0c902679b5 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* tests: Increase timeout for geo-rep testcasesKotresh HR2018-06-181-1/+1
| | | | | | Change-Id: I11715c23fddc1dd218a3ab88a55f1d6355dcfd43 Updates: bz#1537602 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* tests: Enable geo-rep test casesKotresh HR2018-01-051-135/+142
| | | | | | | | | | | | | | | | | This patch re-enables the geo-rep test cases. Along with it does following optimizations. 1. Use EXPECT_WITHIN instead of sleep 2. Clean up geo-rep ssh key after test 3. Changes to gverify.sh and S56glusterd-geo-rep-create-post.sh to use the given ssh identity file for geo-rep create 4. Make gluster-command-dir configurable and introduce slave-gluster-command-dir which points the parent directory of gluster binaries in master and slave respectively. Change-Id: Ia7696278d9dd3ba04224dcd7c3564088ca970b04 BUG: 1480491 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* georep: tests for logrotate, create+rename, hard-link renameMilind Changire2016-08-091-0/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * log rotate eg. with rotate count as 2 and with the following files created x.log, x.log.1 and x.log.2, another iteration of logrotate results in the following operations to be performed x.log.2 is renamed to x.log.3 x.log.1 is renamed to x.log.2 x.log is renamed to x.log.1 x.log.3 is deleted x.log is created [possible gfid allocated that belonged to x.log.3] when a file is created, there's a big possibility that a gfid used earlier can be reassigned and reused. This causes RENAME operations to fail on slave nodes when logs are replayed, typically on a Georep restart. A function called logrotate_simulate has been added to tests/geo-rep.rc to help simulate this situation. Starting from a clean state, logrotate_simulate has to be called 4 times with a rotate count of 2 to help the logs roll over and cause a delete at the end and a gfid reallocation. With the bug-fix in place, this test should not cause the Georep session to go into a Faulty state. * create+rename On log replay after Georep restart, a create+rename causes an entry to be created for the original file, but it cannot be linked to the gfid back-end since it is associated with the renamed file before the Georep restart. A function create_rename simulates the create+rename scenario and the function create_rename_ok tests if the dangling entry is present at the slave mount. With the bug-fix in place, a dangling entry with the original name should not be found at the slave mount after logs are replayed after a Georep restart. * hard-link rename This case is similar to the create+rename case except that this is a case of renaming hard-link to one of its other names. A function hardlink_rename has been added to tests/geo-rep.rc which simulates the creation and rename of hard-link. After a Georep session restart, the test function hardlink_rename_ok should not find the source link name on the slave. With the bug-fix in place, this test should not fail. If changelogs have been enabled on the slave as well, then the logs should show an UNLINK entry for the source link name, since a rename operation of a hard link to one of its names essentially just drops the source name as per the 'mv' command semantics. Change-Id: I85b196c00cf79a11bada25ef2fe5f1dc5c0c858a BUG: 1316389 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/13663 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com>
* geo-rep: Fix syncing chown in xsync crawlKotresh HR2015-11-081-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | GEO-REP INTEROP WITH SHARD FEATURE Problem: The sequence of entry creation and chown in master is recorded as creation of entry with resulted user:group in xsync changelog. During sync, entry creation is always split into two ops, MKNOD and SETATTR. Hence the issue is not being hit otherwise it would have failed with EPERM if parent is owned by different user. But with shard translator being enabled on slave, doing entry creation with MKNOD and SETATTR is not allowed, SETATTR fails as it accesses inode structure which is not linked. Solution: The sequence of entry creation and chown in master should be recorded as MKNOD and SETATTR separately always and do entry creation with single op in gfid-access xlator. The gfid-access patch will be sent separately. Change-Id: I93e554bf9342397a7660503f5128e9709f8a0cd8 BUG: 1265148 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/12205 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com>
* tests: Fix return value in geo-rep testsKotresh HR2015-08-311-45/+14
| | | | | | | | | | | | | Remove the function 'data_tests' and TEST each fop in testcase itself to determine the exact test that fails. Change-Id: Iffc3e2ac3b0fc0c7c64259fcdff8f27b146dcb9b BUG: 1227624 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/11907 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* tests: New simple geo-rep regression test suiteKotresh HR2015-08-111-0/+183
This is a new simple regression test suite for geo-replication. This is written keeping in mind the run time for regression test. The existing regression test suite is rigorous one and could be run nightly. Hence the existing geo-rep tests are being removed as part of this. Also re-enable geo-rep regression with this patch. Thanks Aravinda for initial template and plan. Change-Id: If544ac295eaf67ac66e0b071903cc1096e71d437 BUG: 1227624 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/11058 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com>