summaryrefslogtreecommitdiffstats
path: root/geo-replication
Commit message (Collapse)AuthorAgeFilesLines
* geo-rep: Use gfchangelog context init APIVenky Shankar2015-03-273-0/+10
| | | | | | | | | | | | | With the RPC based changes to {libgf}changelog, changelog_init is required before changelog_register. Change-Id: Id125b2bd2e51aaaffa22ecab463dfb739c50d83c Signed-off-by: Venky Shankar <vshankar@redhat.com> BUG: 1170075 Reviewed-on: http://review.gluster.org/9993 Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com> Tested-by: Saravanakumar Arumugam <sarumuga@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep/gverify: Ignore .trashcan while checking slave is empty.Kotresh HR2015-03-201-2/+2
| | | | | | | | | | | | | | | | With trash translator, '.trashcan' is created in gluster volume during init. So geo-rep create used to fail, with slave not being empty. The gverify script should ignore '.trashcan' while checking whether slave is empty. Change-Id: Ib7ee265c6aa99b63c7ccf103747d47a6f756ad35 BUG: 1203086 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/9921 Reviewed-by: Aravinda VK <avishwan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Ignore .trashcan during xsync crawl.Kotresh HR2015-03-181-0/+1
| | | | | | | | | | | | | | | | With trash feature, .trashcan directory gets created at each export directory. Xsync picks .trashcan to sync and fails with EPERM. Xsync should ignore .trashcan directory. Change-Id: I45bd226c96011ace2c40dd2de878d886c7d34ce5 BUG: 1203293 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/9934 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep / gsyncd: use RTLD_GLOBAL while loading libgfchangelogVenky Shankar2015-03-181-2/+2
| | | | | | | | | | | | | | | | | | With the RPC based changes to {libgf}changelog, loading shared objects dynamically would need symbols to be available from other shared libraries. As an example, creating an RPC listner loads the RPC transport shared object which requires symbols to be available from already loaded shared objects. Using RTLD_GLOBAL makes the symbols available for symbol resolution of subsequently loaded libraries. Change-Id: I3d3ef790eded82911f05836c707509157680645c BUG: 1170075 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9814 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Remove Generated file From gitAravinda VK2015-03-181-177/+0
| | | | | | | | | | | geo-replication/src/peer_mountbroker BUG: 1136312 Change-Id: Ib9b287b4e1183cb44acbf01184a240be7f09be7c Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9923 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* feature/glusterfind: A tool to find incremental changesAravinda VK2015-03-183-73/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Documentation is available in patch: http://review.gluster.org/#/c/9800/ A tool which helps to get list of modified files or list of all files in GlusterFS Volume using Changelog or find command. Usage ===== glusterfind --help Create: ------- glusterfind create --help The tool creates status file $GLUSTERD_WORKDIR/SESSION/VOLUME/status and records current timestamp to initiate the session. This timestamp will be used as start time for next runs. As part of create also generates ssh key and distributes to all peers. and enables build.pgfid and changelog using volume set command. Pre: ---- glusterfind pre --help This command is used to generate the list of files modified after session creation time or after last run. To get list of all files/dirs in Volume, run pre command with `--full` argument. The tool gets all nodes details using gluster volume info and runs node agent for each brick in respective nodes via ssh command. Once these node agents generate the output file, tool copies to local using scp. Merges all the output files to generate the final output file. Post: ----- glusterfind post --help After consuming the list, this sub command is called to update the session time based on pre command status file. List: ----- glusterfind list --help To view all the sessions Delete: ------- glusterfind delete --help Delete session. Known Issues ------------ 1. Deleted files will not get listed, since we can't convert GFID to Path if file/dir is deleted. 2. Only new name will get listed if Renamed. 3. All hardlinks will get listed. Change-Id: I82991feb0aea85cb6ec035fddbf80a2b276e86b0 BUG: 1193893 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9682 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: mountbroker user managementAravinda VK2015-03-173-1/+356
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Non root geo-replication setup is now simplified. This patch provides cli for mountbroker user and options management To set Options, gluster system:: execute mountbroker opt <KEY> <VALUE> # for example, gluster system:: execute mountbroker opt mountbroker-root /var/mountbroker-root gluster system:: execute mountbroker opt geo-replication-log-group geogroup gluster system:: execute mountbroker opt rpc-auth-allow-insecure on To remove option, gluster system:: execute mountbroker optdel <KEY> # for example, gluster system:: execute mountbroker optdel geo-replication-log-group To add/edit user, gluster system:: execute mountbroker user <USERNAME> <VOLUMES> # for example gluster system:: execute mountbroker user geoaccount slavevol1,slavevol2 To remove user, gluster system:: execute mountbroker userdel <USERNAME> # for example gluster system:: execute mountbroker userdel geoaccount For info, gluster system:: execute mountbroker info gluster system:: execute mountbroker -j info For JSON output add -j after mountbroker, for example, gluster system:: execute mountbroker -j user geoaccount slavevol1,slavevol2 PS: Each peer prints its own JSON output, aggregator required from consumer side BUG: 1136312 Change-Id: Ie52210c0bcc91ac2ffd3ba58988222ffca62b47f Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9398 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: darshan n <dnarayan@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Do not use xsync_upper_limit for change detectionAravinda VK2015-03-152-34/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Use register time(xsync_upper_limit) only for stime update, do not use for change detection. Problem 1: If a file created before geo-rep, xtime xattr does not exist. Geo-rep updates xtime of the file to current time if not exists. xtime > upper_limit so geo-rep will not pick those files. Changelog either will have SETXATTR, and fails to sync the file. Problem 2: If a file is created before geo-rep create and updated after geo-rep start. xtime of the file is greater than upper limit(geo-rep start time/changelog register time). Geo-rep(XSync) will not pick this file for syncing. Changelog will have only DATA recorded for that file. Geo-rep tries DATA without any ENTRY ops and fails with rsync error. BUG: 1200733 Change-Id: Ie4e8f284db689d2c755ef8e7ecbb658db1c0785f Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9855 Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com> Tested-by: Saravanakumar Arumugam <sarumuga@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* feature/geo-rep: Active Passive Switching logic flockKotresh HR2015-03-154-4/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CURRENT DESIGN AND ITS LIMITATIONS: ----------------------------------- Geo-replication syncs changes across geography using changelogs captured by changelog translator. Changelog translator sits on server side just above posix translator. Hence, in distributed replicated setup, both replica pairs collect changelogs w.r.t their bricks. Geo-replication syncs the changes using only one brick among the replica pair at a time, calling it as "ACTIVE" and other non syncing brick as "PASSIVE". Let's consider below example of distributed replicated setup where NODE-1 as b1 and its replicated brick b1r is in NODE-2 NODE-1 NODE-2 b1 b1r At the beginning, geo-replication chooses to sync changes from NODE-1:b1 and NODE-2:b1r will be "PASSIVE". The logic depends on virtual getxattr 'trusted.glusterfs.node-uuid' which always returns first up subvolume i.e., NODE-1. When NODE-1 goes down, the above xattr returns NODE-2 and that is made 'ACTIVE'. But when NODE-1 comes back again, the above xattr returns NODE-1 and it is made 'ACTIVE' again. So for a brief interval of time, if NODE-2 had not finished processing the changelog, both NODE-2 and NODE-1 will be ACTIVE causing rename race as mentioned in the bug. SOLUTION: --------- 1. Have a shared replicated storage, a glusterfs management volume specific to geo-replication. 2. Geo-rep creates a file per replica set on management volume. 3. fcntl lock on the above said file is used for synchronization between geo-rep workers belonging to same replica set. 4. If management volume is not configured, geo-replication will back to previous logic of using first up sub volume. Each worker tries to lock the file on shared storage, who ever wins will be ACTIVE. With this, we are able to solve the problem but there is an issue when the shared replicated storage goes down (when all replicas goes down). In that case, the lock state is lost. So AFR needs to rebuild the lock state after brick comes up. NOTE: ----- This patch brings in the, pre-requisite step of setting up management volume for geo-replication during creation. 1. Create mgmt-vol for geo-replicatoin and start it. Management volume should be part of master cluster and recommended to be three way replicated volume having each brick in different nodes for availability. 2. Create geo-rep session. 3. Configure mgmt-vol created with geo-replication session as follows. gluster vol geo-rep <mastervol> slavenode::<slavevol> config meta_volume \ <meta-vol-name> 4. Start geo-rep session. Backward Compatiability: ----------------------- If management volume is not configured, it falls back to previous logic of using node-uuid virtual xattr. But it is not recommended. Change-Id: I7319d2289516f534b69edd00c9d0db5a3725661a BUG: 1196632 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/9759 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Avoid duplicate entries in authorized_keys fileAravinda VK2015-03-051-2/+9
| | | | | | | | | | | | | | | | | | When Georep create(force) command is run multiple times it appends master nodes public keys to Slave nodes without checking the existence of the key. With this patch, create push-pem force adds pub key only if not available in authorized_keys. BUG: 1197433 Change-Id: Iad57f6c45698e258ad1a547fa7a2e376a315f0cd Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9776 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Handle ENOENT during cleanupAravinda VK2015-03-051-1/+6
| | | | | | | | | | | | | | | | shutil.rmtree was failing to remove file if file was not exists. Added error handling function to ignore ENOENT if a file/dir not present. BUG: 1198101 Change-Id: I1796db2642f81d9e2b5e52c6be34b4ad6f1c9786 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9792 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
* geo-rep: Add support for xattrsAravinda VK2015-03-023-3/+13
| | | | | | | | | | | | | | | | | | | | | | This patch adds support for xattrs. When it sees SETXATTR in Changelog, it adds the file to data queue. rsync/tar+ssh will take care of syncing xattrs. User set xattrs will be synced to Slave. New config interface is introduced, sync-xattrs Which can be set using geo-rep config(Default is True) gluster volume geo-replication <VOLUME> <SLAVEHOST>::<SLAVEVOL> \ config sync-xattrs false Change-Id: I70626d854a0d616469dd54d61e5ef155ed8b67d8 BUG: 1196690 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9499 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Unit tests infrastructure for geo-replicationAravinda VK2015-03-027-0/+130
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Added sample unit test file, $GLUSTER_SRC/geo-replication/tests/unit/test_syncdutils.py Install unittest dependencies, (tox https://tox.readthedocs.org/en/latest/, nose https://nose.readthedocs.org/en/latest/) sudo pip install --upgrade tox nose To run pep8 coding standards tests, cd $GLUSTER_SRC/geo-replication tox -e pep8 To run unit tests, cd $GLUSTER_SRC/geo-replication tox -e py27 py27 is for with Python 2.7+, or py26 for systems with Python 2.6+ Change-Id: Ibdefe2c9d73afda9a7fa1c0db5b126f592b7cb40 BUG: 1177722 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9290 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Add support for non standard AuthorizedKeysFile locationAravinda VK2015-02-191-9/+32
| | | | | | | | | | | | | | | | | | | | | | In /etc/ssh/sshd_config, AuthorizedKeysFile can be customized using %u and %h variables, %u will be replaced by user name and %h will be replaced by home dir name. Default location is .ssh/authorized_keys For example, AuthorizedKeysFile .ssh/authorized_keys AuthorizedKeysFile %h/.my_secret_dir/authorized_keys AuthorizedKeysFile /etc/ssh/keys/%u/authorized_keys PS: Support only added for %h and %u in sshd_config BUG: 1181117 Signed-off-by: Aravinda VK <avishwan@redhat.com> Change-Id: Ic6ba20f9d202762dfdb6d0c73ea42e7f7c64e177 Reviewed-on: http://review.gluster.org/9436 Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Archive Changelogs and avoid generating empty XSync changelogsAravinda VK2015-02-192-5/+79
| | | | | | | | | | | | | | | With this patch, - Hybrid Crawl will not generate empty Changelogs - Archives Changelogs when processed(Hybrid(XSync), History, and Changelog Crawl - Passive worker cleans up its processing directory BUG: 1169331 Change-Id: I1383ffaed261cdf50da91b14260b4d43177657d1 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9453 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Fixing the typo errorsarao2015-02-037-14/+14
| | | | | | | | | Change-Id: Iacc67e4ba9ac45e0858f3befe84ffb8fccf7e1c3 BUG: 1075417 Signed-off-by: arao <arao@redhat.com> Reviewed-on: http://review.gluster.org/9502 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
* geo-rep: Handle copying of common_secret.pem.pub to slave correctly.Kotresh HR2015-01-212-12/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current Behaviour: 1. Geo-replication gsec_create creates common_secret.pem.pub file containing public keys of the all the nodes of master cluster in the location /var/lib/glusterd/ 2. Geo-replication create push-pem copies the common_secret.pem.pub to the same location on all the slave nodes with same name. Problem: Wrong public keys might get copied on to slave nodes in multiple geo-replication sessions simultaneosly. E.g. A geo-rep session is established between Node1(vol1:Master) to Node2 (vol2:Slave). And one more geo-rep session where Node2 (vol3) becomes master to Node3 (vol4) as below. Session1: Node1 (vol1) ---> Node2 (vol2) Session2: Node2 (vol3) ---> Node3 (vol4) If steps followed to create both geo-replication session is as follows, wrong public keys are copied on to Node3 from Node2. 1. gsec_create is done on Node1 (vol1) -Session1 2. gsec_create is done on Node2 (vol3) -Session2 3. create push-pem is done Node1 - Session1. -This overwrites common_secret.pem.pub in Node2 created by gsec_create in second step. 4. create push-pem on Node2 (vol3) copies overwrited common_secret.pem.pub keys to Node3. -Session2 Consequence: Session2 fails to start with Permission denied because of wrong public keys Solution: On geo-rep create push-pem, don't copy common_secret.pem.pub file with same name on to all slave nodes. Prefix master and slave volume names to the filename. NOTE: This brings change in manual steps to be followed to setup non-root geo-replication (mountbroker). To copy ssh public keys, extra two arguments needs to be followed. set_geo_rep_pem_keys.sh <mountbroker_user> <master vol name> \ <slave vol name> Path to set_geo_rep_pem_keys.sh: Source Installation: /usr/local/libexec/glusterfs/set_geo_rep_pem_keys.sh Rpm Installatino: /usr/libexec/glusterfs/set_geo_rep_pem_keys.sh Change-Id: If38cd4e6f58d674d5fe2d93da15803c73b660c33 BUG: 1183229 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/9460 Reviewed-by: Aravinda VK <avishwan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Handle Volume status error while getting slave nodesAravinda VK2015-01-121-1/+6
| | | | | | | | | | | | | | | gluster volume status command not returns xml output, when any error like "Transaction in Progress", we need to handle returncode along with xml error. BUG: 1151412 Signed-off-by: Aravinda VK <avishwan@redhat.com> Change-Id: Id5b7712df7cff58744b4c5a0d00870aec1d926a8 Reviewed-on: http://review.gluster.org/9432 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Error handling in tar+ssh modeAravinda VK2014-12-291-12/+12
| | | | | | | | | | | | | | | | | | | | | | | Georep raises exception if tar+ssh fails and worker dies due to the exception. This patch adds resilience to tar+ssh error and geo-rep worker retries when error, and skips those changelogs after maximum retries.(same as rsync mode) Removed warning messages for each rsync/tar+ssh failure per GFID, since skipped list will be populated after Max retry. Retry changelog files log also available, hence warning message for each GFID is redundent. BUG: 1177527 Change-Id: I3019c5c1ada7fc0822e4b14831512d283755b1ea Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9356 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Sync atime, mtime to slave when SETATTRAravinda VK2014-12-252-3/+15
| | | | | | | | | | | | Existing geo-rep only syncs chown and chmod, with this patch geo-rep also syncs atime and mtime. Change-Id: Iea52d86682873bb4a47eeb0d325f5b9ddf2de2cf Signed-off-by: Aravinda VK <avishwan@redhat.com> BUG: 1176934 Reviewed-on: http://review.gluster.org/9331 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: gid is not set in entry opsAravinda VK2014-10-281-2/+2
| | | | | | | | | | | | | | uid is sent in place of gid while CREATE and MKDIR. Change-Id: Icd1072cb9dcbfc1f419a3cdd456f3d02168175fa BUG: 1104954 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/7984 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Failover when a Slave node goes downAravinda VK2014-10-191-5/+64
| | | | | | | | | | | | | | | | | | When a slave node goes down, worker in master node can connect to different slave node and resume the operation. Existing georep was not checking the status of slave node before worker restart. With this patch, geo-rep worker will check the node status using `gluster volume status` when it goes faulty. BUG: 1151412 Change-Id: If3ab7fdcf47f5b3f3ba383c515703c5f1f9dd668 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8921 Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* feature/geo-rep: Fix skipped files logging.Kotresh H R2014-10-161-0/+10
| | | | | | | | | | | | | | | | When changelog is failed to process even after max retries, files are skipped but were not logging. This patch addresses this issue to log the skipped files. Change-Id: Ic617e4151231fe18a171efa57a119e21164e1a6a BUG: 1103577 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/7945 Reviewed-by: Aravinda VK <avishwan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Fix rename of directory syncing.Kotresh HR2014-09-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | The rename of directories are captured in all distributed brick changelogs. gsyncd processess these changelogs on each brick parallellaly. The first changelog to get processed will be successful. All subsequent ones will stat the 'src' and if not present, tries to create freshly on slave. It should be done only for files and not for directories. Hence when this code path was hit, regular file's blob is sent as directory's blob and gfid-access translator was erroring out as 'Invalid blob length' with errno as 'ENOMEM' Change-Id: I50545b02b98846464876795159d2446340155c82 BUG: 1146823 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8865 Reviewed-by: Aravinda VK <avishwan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: fix same file different gfid in master and slaveAravinda VK2014-09-262-17/+38
| | | | | | | | | | | | | | | | | | | | | While processing RENAME in changelog, if the file is unlinked in master, then geo-rep was sending UNLINK to slave instead of RENAME. If rsync job fails if one of the file failed to sync in the job. This patch adds logic to remove GFID from data list if the same changelog has UNLINK entry for it after the DATA. Or it removes those GFIDs during retry of changelogs processing. BUG: 1143853 Change-Id: I982dc976397cd0ab676bb912583f66a28f821926 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8761 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* Regression test portability: mktempEmmanuel Dreyfus2014-08-201-2/+2
| | | | | | | | | | | | | | Linux mktemp accepts to run without a template, NetBSD mandates it. Since the template option has the same syntax, add it everywhere. While there, also do this in scripts outside of regression testing. BUG: 764655 Change-Id: I3ec140afbc9009257c81a56d77afcc21fef74cc4 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8432 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net>
* geo-rep: Fixing issue with xsync upper limitAravinda VK2014-08-192-59/+50
| | | | | | | | | | | | | | | | | | | | While identifying the file/dir to sync, xtime of the file was compared with xsync_upper_limit as `xtime < xsync_upper_limit` After the sync, xtime of parent directory is updated as stime. With the upper limit condition, stime is updated as MIN(xtime_parent, xsync_upper_limit) With this files will get missed if `xtime_of_file == xsync_upper_limit` With this patch xtime_of_file is compared as xtime_of_file <= xsync_upper_limit BUG: 1128093 Change-Id: I5022ce67ba503ed4621531a649a16fc06b2229d9 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8439 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Handle RMDIR recursivelyAravinda VK2014-08-141-0/+12
| | | | | | | | | | | | | | | | If RMDIR is recorded in brick changelog which is due to self heal traffic then it will not have UNLINK entries for child files. Geo-rep hangs with ENOTEMPTY error on slave. Now geo-rep recursively deletes the dir if it gets ENOTEMPTY. BUG: 1129702 Change-Id: Iacfe6a05d4b3a72b68c3be7fd19f10af0b38bcd1 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8477 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* build: make GLUSTERD_WORKDIR rely on localstatedirHarshavardhana2014-08-075-21/+15
| | | | | | | | | | | | | | | | | | | | | | - Break-way from '/var/lib/glusterd' hard-coded previously, instead rely on 'configure' value from 'localstatedir' - Provide 's/lib/db' as default working directory for gluster management daemon for BSD and Darwin based installations - loff_t is really off_t on Darwin - fix-off the warnings generated by clang on FreeBSD/Darwin - Now 'tests/*' use GLUSTERD_WORKDIR a common variable for all platforms. - Define proper environment for running tests, define correct PATH and LD_LIBRARY_PATH when running tests, so that the desired version of glusterfs is used, regardless where it is installed. (Thanks to manu@netbsd.org for this additional work) Change-Id: I2339a0d9275de5939ccad3e52b535598064a35e7 BUG: 1111774 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8246 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep: minimize xsync crawl usage and set upper limit to xsync crawlAravinda VK2014-07-232-38/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | For effective handling of deletes and renames use history crawl as much as possible. History crawl will run in loop till it syncs all data before live changelog time. When it uses xsync crawl(fallback when changelog not available, or very first crawl) it sets upper limit to crawl. After completing History crawl, it checks actual end time returned by history api to compare with register time, if actual end is less than register time then run history crawl one more time. If first turn history processing time is less than the CHANGELOG ROLLOVER TIME then sleep for the difference, After sleep if it is guaranteed that rollover will happen and switches to live changelog consumption without switching to xsync. This sleep is only when history processing completed < CHANGELOG_ROLLOVER_TIME and sleep only after the first turn, So will not affect the performance. BUG: 1112238 Change-Id: I644ef24b07e42e81cec96a025ebd21244a555ec0 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8151 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Avoid duplicate stat in xsync changelog processingAravinda VK2014-07-091-6/+24
| | | | | | | | | | | | | | | | | | | | When A file/dir is identified for metadata sync, it was doing duplicate stat to get the metadata to sync. With this patch it avoids doing one additional stat call. Xsync performance will improve. rsync will copy files metadata, so no need to include for processing. BUG: 1111490 Change-Id: I79dad6375fa4742d9aaca7d9856993c184a744dc Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8124 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: History Change detector method select issueAravinda VK2014-07-011-1/+2
| | | | | | | | | | | | | | | | | Geo-rep does history crawl even if change-detector is set to xsync. This patch fixes by taking priority to user configured change detector. BUG: 1113525 Change-Id: Ic6c34e187c9cb6608c9ef8a010ea07015ba60a80 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8183 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Fix the fd leak in worker/agent spawnAravinda VK2014-06-301-2/+5
| | | | | | | | | | | | | | | | | | worker and agent uses pipe to communicate, if worker dies for some reason agent should get EOF and terminate. Each worker-agent spawning is done in thread, Due to race if multiple workers in same node retain the pipe refs of other workers. Hence agent will not get EOF even if worker dies. BUG: 1114003 Change-Id: I36b9709b9392299483606bd3ef1db764fa3f2bff Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8194 Tested-by: Justin Clift <justin@gluster.org> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* feature/geo-rep: Fix to retain pause state of gsyncd on restartKotresh H R2014-06-202-14/+18
| | | | | | | | | | | | | | On soft reboot, geo-rep monitor is writing 'faulty' into status file. It should not do it if previous state is paused as glusterd depend on the state file on node restart. Change-Id: Idd45abf13350b087371935f1b4f6e1a346433d27 BUG: 1101410 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8097 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* feature/geo-rep: Fix for changelog agent becoming zombie.Kotresh H R2014-06-171-4/+12
| | | | | | | | | | | | | | | | | | Monitor process spawns changelog agent and is not wait on it, hence becoming zombie. When worker is dies/killed, it respawns both worker and corresponding agent leaving the earlier changelog agent in zombie state. This patch addresses this issue by waiting on agent process in montor process. Change-Id: I571b7d6487133848edca67e7446f1caa70ae01c9 BUG: 1103643 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/7956 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep/gverify: Never use ping to check for host reachabilityHarshavardhana2014-06-121-1/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On many linux distributions with iptables enabled, ICMP traffic is usually dropped even when port 22 is open for SSH service So practically `ping` is an unreliable command ~~~ root@rhs1:/var/log/glusterfs # gluster volume geo-replication geo-test \ 17.16.10.1::geo-test-slave create push-pem force 172.16.10.1 not reachable. geo-replication command failed ~~~ ~~~ root@rhs1:/var/log/glusterfs # ping 172.16.10.1 PING rhs2.sjc.redhat.com (172.16.10.1) 56(84) bytes of data. From rhs2.sjc.redhat.com (172.16.10.1) icmp_seq=1 Destination Host Prohibited ... ... ~~~ ~~~ root@rhs2:/var/log/glusterfs # service iptables status | grep 22 4 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:22 root@rhs2:/var/log/glusterfs # service iptables status | grep icmp-host-prohibited 25 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited ~~~ Change-Id: I33206ca071aa5d755c0762f7c486da222ec3c7db BUG: 1105337 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/7997 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Making replica failover check interval configurableAravinda VK2014-06-112-1/+3
| | | | | | | | | | | | | | | | | | Replica failover check interval is hardcoded to 60 sec by default. Now this option is made configurable and defaulted to 1 sec. To change the default value gluster volume geo-replication <MASTERVOL> \ <SLAVEHOST>::<SLAVEVOL> config replica_failover_interval 15 Change-Id: Iada1b80d510452dcfedebd8a21bebd62394b0597 BUG: 1066410 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8003 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: entry_ops errors handlingAravinda VK2014-06-112-8/+4
| | | | | | | | | | | | | | | Xattr.lsetxattr_l call will not raise OSError which errno_wrap can handle, so we need to use Xattr.lsetxattr to get proper OSError and errno_wrap ignores or retries accordingly. Change-Id: Ie0a777152ddbaf9ed80c977e4704974fec997bea BUG: 1105083 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/7972 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* gsyncd / geo-rep: FSH recommended log locationsVenky Shankar2014-06-104-8/+31
| | | | | | | | | | | | | | | | Upgrading "working_dir" on the fly is a bit unclean yet (though it works) as currently config upgrade does not support "old" values to be expanded by using configuration variables. Change-Id: I44ed65c281f2e0ce3b6b467addc5c1c88ac674e7 BUG: 1077516 Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Kotresh H R <khiremat@redhat.com> Signed-off-by: Aravinda VK <avishwan@redhat.com> Signed-off-by: Ajeet Jha <ajha@redhat.com> Reviewed-on: http://review.gluster.org/7375 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* gsyncd / geo-rep: Xsync crawl metadata synchronizationVenky Shankar2014-06-101-0/+2
| | | | | | | | | | | | | Added "metadata" record for directory and file creations during the intial crawl. Change-Id: I811ae26e0144cadf7249cb64541ec354ab83fe66 BUG: 1106604 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/8018 Reviewed-by: Aravinda VK <avishwan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* feature/geo-rep: Fix to retain pause state of gsyncd on restart.Kotresh H R2014-06-053-2/+16
| | | | | | | | | | | | | | | A new gsyncd options '--pause-on-start' is introduced. When node reboots, if the status is paused, gsyncd is started with this option. After gsyncd spawns worker and agent, worker will send SIGSTOP to negative pid of monitor to enter pause mode. Change-Id: I5aad82c9a9fc8c243f384940b77d25e26e520d6d BUG: 1101410 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/7885 Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* glusterd/geo-rep: Not setting the user name as the user group by defaultAvra Sengupta2014-06-052-4/+3
| | | | | | | | | | | | | Also removing multiple mounts in gverify.sh Change-Id: I1567a9f711222c5f571a12f7c4ab49cef32d60c9 BUG: 1101948 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/7911 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* gsyncd : Use --remote-host option during cli invocationVenky Shankar2014-05-251-0/+1
| | | | | | | | | | | | Required for unprivileged geo-replication sessions to execute glusterd commands (which are however restricted). Change-Id: Ib83b81defa061717f4465ffa665450d0f5d3d20d BUG: 1077452 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/7833 Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* gsyncd / geo-rep: Mountbroker cli to use INET socketsVenky Shankar2014-05-251-1/+1
| | | | | | | | | | | | | | | | | | | | unprivileged geo-replication session runs the slave gsyncd process as unprivileged, thereby executing gluster cli commands as an unprivileged user. By default, cli to glusterd uses unix domain sockets, thereby restricting cli command execution by non root users. This patch introduces '--remote-host' cli option to force cli to use INET socket. For this to work, the following needs to be added in glusterd volfile option rpc-auth-allow-insecure on Change-Id: I84b1711281bbcbde156200f80ebdb065afb55488 BUG: 1077452 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/7820 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* gsyncd / geo-rep: fix cli query for volinfo fetchVenky Shankar2014-05-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | With an unprivileged geo-replication session, monitor was using user@slave for --remote-host option for gluster cli, thereby failing to sucessfully connect to the slave glusterd. This patch fixes the issue by selecting the hostname/IP from the speicified slave endpoint url. - For privileged geo-replication sessions, this patch has no effect as the slave endpoint url is just the hostname/IP. Change-Id: I88f66c406a8d9a34db7fc626965f949075e3ceac BUG: 1077452 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/7818 Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd/geo-rep: Creating .ssh dir with right ownershipAvra Sengupta2014-05-252-6/+19
| | | | | | | | | | | | | | Also adding -P option to the usage of df for portability in gverify.sh Change-Id: I0be19d26ea63769a934c6ccbfc04ef80768ebc9a BUG: 1099041 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/7812 Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Aravinda VK <avishwan@redhat.com>
* glusterd/geo-rep: Use getent passwd instead of $HOMEAvra Sengupta2014-05-202-3/+3
| | | | | | | | | | | | | | | | $HOME might not be set in the env variables, as is the case when these scripts are executed using the runner framework. Hence using getent passwd instead of $HOME Change-Id: I99f6bcd788d727be534b3040600d66c8dbb7ee92 BUG: 1099041 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/7803 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* gsyncd/geo-rep: Fix remote vol info fetching for non-rootKotresh H R2014-05-151-1/+1
| | | | | | | | | | Signed-off-by: Kotresh H R <khiremat@redhat.com> Change-Id: If1d2cab3fcfe2391105551e54f0b9729a7c204e4 BUG: 1077452 Reviewed-on: http://review.gluster.org/7767 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* glusterd/geo-rep: Allow gverify.sh and S56glusterd-geo-rep-create-post.shAvra Sengupta2014-05-144-39/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to operate for non-root privileged slave volume Mounting the slave-volume on local node, to perform disk checks in order to allow gverify.sh to operate for non-root privileged slave volume Allowing the hook script S56glusterd-geo-rep-create-post.sh to operate for non-root privileged slave volume Modified peer_add_secret_pub.in to accept username as argument and add the pem keys to the users's_home_dir/.ssh/authorized_keys Wrote set_geo_rep_pem_keys.sh which accepts username as argument and copies the pem keys from the user's home directory to $GLUSTERD_WORKING_DIR/geo-replication/ and then copies the keys to other nodes in the cluster and add them to the respective authorized keys. The script takes as argument the user name and assumes that the user will be present in all the nodes in the cluster. It is not needed for root. To summarize: For a privileged slave user, execute the following on master node as super user: gluster system:: execute gsec_create gluster volume geo-replication <master_vol> [root@]<slave_ip>::<slave_vol> create push_pem For a non-privileged slave user execute the following on master node as super user: gluster system:: execute gsec_create gluster volume geo-replication <master_vol> <slave_user>@<slave_ip>::<slave_vol> create push_pem then on the slave node execute the following as super user: /usr/local/libexec/glusterfs/set_geo_rep_pem_keys.sh <slave_user> BUG: 1077452 Change-Id: I88020968aa5b13a2c2ab86b1d6661b60071f6f5e Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/7744 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* gsyncd / geo-rep: Partial support for Non-root geo-replication.Venky Shankar2014-05-142-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | This patch enables geo-replication to be run as an unprivileged user. As of now, this is just the partial support, but is very close to achieve full functionality. Current limitation * Geo-replication executed Gluster CLI commands on the slave via SSH. On a non-root setup, Gluster CLI would run as an unprivileged user, failing to execute the command. As a workaround (for testing), setuid(2) Gluster CLI executable or use the glusterd option to accept commands by unprivileged CLI process. The nature of cli commands are "system::" commands (for key management) and remote volume info fetching. Remote volume info fetching has been modified to use --remote-host gluster cli option rather than ssh and remote cli execution. Change-Id: Ica89e2ba9b7f48fd6e1c876c477d7822dc693617 BUG: 1077452 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/7658 Tested-by: Gluster Build System <jenkins@build.gluster.com>