summaryrefslogtreecommitdiffstats
path: root/geo-replication/syncdaemon/resource.py
Commit message (Collapse)AuthorAgeFilesLines
* geo-rep: Fix Rsync hang issueAravinda VK2015-05-071-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | When rsync is executed using Python subprocess, by default stdout of subprocess will be None. With the log rsync performance patch stdout is assigned to PIPE. Rsync writes to that PIPE whenever it syncs files. If log_rsync_performance is disabled then nobody will consume stdout and that gets full. Rsync hangs if PIPE is full. log_rsync_performance option is introduced with patch 10070 With this patch stdout=PIPE only if log_rsync_performance is enabled. Also removed -v option from Rsync. Thanks Venky and Kotresh for RCA. BUG: 1218552 Change-Id: I4ebcfb6999358c8e2c147f7964255bd836ed7499 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/10556 Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep: Minimize rm -rf race in Geo-repAravinda VK2015-05-051-21/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While doing RMDIR worker gets ENOTEMPTY because same directory will have files from other bricks which are not deleted since that worker is slow processing. So geo-rep does recursive_delete. Recursive delete was done using shutil.rmtree. once started, it will not check disk_gfid in between. So it ends up deleting the new files created by other workers. Also if other worker creates files after one worker gets list of files to be deleted, then first worker will again get ENOTEMPTY again. To fix these races, retry is added when it gets ENOTEMPTY/ESTALE/ENODATA. And disk_gfid check added for original path for which recursive_delete is called. This disk gfid check executed before every Unlink/Rmdir. If disk gfid is not matching with GFID from Changelog, that means other worker deleted the directory. Even if the subdir/file present, it belongs to different parent. Exit without performing further deletes. Retry on ENOENT during create is ignored, since if CREATE/MKNOD/MKDIR failed with ENOENT will not succeed unless parent directory is created again. Rsync errors handling was handling unlinked_gfids_list only for one Changelog, but when processed in batch it fails to detect unlinked_gfids and retries again. Finally skips the entire Changelogs in that batch. Fixed this issue by moving self.unlinked_gfids reset logic before batch start and after batch end. Most of the Geo-rep races with rm -rf is eliminated with this patch, but in some cases stale directories left in some bricks and in mount point we get ENOTEMPTY.(DHT issue, Error will be logged in Slave log) BUG: 1211037 Change-Id: I8716b88e4c741545f526095bf789f7c1e28008cb Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/10204 Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Status EnhancementsAravinda VK2015-05-051-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Discussion in gluster-devel http://www.gluster.org/pipermail/gluster-devel/2015-April/044301.html MASTER NODE - Master Volume Node MASTER VOL - Master Volume name MASTER BRICK - Master Volume Brick SLAVE USER - Slave User to which Geo-rep session is established SLAVE - <SLAVE_NODE>::<SLAVE_VOL> used in Geo-rep Create command SLAVE NODE - Slave Node to which Master worker is connected STATUS - Worker Status(Created, Initializing, Active, Passive, Faulty, Paused, Stopped) CRAWL STATUS - Crawl type(Hybrid Crawl, History Crawl, Changelog Crawl) LAST_SYNCED - Last Synced Time(Local Time in CLI output and UTC in XML output) ENTRY - Number of entry Operations pending.(Resets on worker restart) DATA - Number of Data operations pending(Resets on worker restart) META - Number of Meta operations pending(Resets on worker restart) FAILURES - Number of Failures CHECKPOINT TIME - Checkpoint set Time(Local Time in CLI output and UTC in XML output) CHECKPOINT COMPLETED - Yes/No or N/A CHECKPOINT COMPLETION TIME - Checkpoint Completed Time(Local Time in CLI output and UTC in XML output) XML output: <?xml version="1.0" encoding="UTF-8" standalone="yes"?> cliOutput> geoRep> volume> name> sessions> session> session_slave> pair> master_node> master_brick> slave_user> slave/> slave_node> status> crawl_status> entry> data> meta> failures> checkpoint_completed> master_node_uuid> last_synced> checkpoint_time> checkpoint_completion_time> BUG: 1212410 Change-Id: I944a6c3c67f1e6d6baf9670b474233bec8f61ea3 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/10121 Tested-by: NetBSD Build System Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: log ENTRY failures from slave on masterMilind Changire2015-04-271-7/+37
| | | | | | | | | | | | | | | ENTRY operations failures on slave left no trace for debugging purposes. This patch captures such failures on slave cluster and forwards them to the master and logs them. Failures of specific interest are the ones which return code EEXIST on the failing operations. Change-Id: Iecab876f16593c746d53f4b7ec2e0783367856bb BUG: 1207115 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/10048 Reviewed-by: Aravinda VK <avishwan@redhat.com> Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep: Do not fail-back to xsync if Changelog is failedAravinda VK2015-04-271-25/+16
| | | | | | | | | | | | | | | | | Unless change_detector is set to xsync, do not fallback to xsync, except during Initial Sync or Partial History. When a brick goes down, Changelog exception is raised due to which geo-rep fallback to xsync. Even after brick comes back geo-rep will not consume Changelog. BUG: 1202649 Change-Id: I1f8ea26ac7735f6ee09b3b143ee3eb66bfc9fc37 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9758 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com>
* geo-rep: Log Rsync performanceAravinda VK2015-04-021-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | Introducing configurable option to log the rsync performance. gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> \ config log-rsync-performance true Default value is False. Example log: [2015-03-31 16:48:34.572022] I [resource(/bricks/b1):857:rsync] SSH: rsync performance: Number of files: 2 (reg: 1, dir: 1), Number of regular files transferred: 1, Total file size: 178 bytes, Total transferred file size: 178 bytes, Literal data: 178 bytes, Matched data: 0 bytes, Total bytes sent: 294, Total bytes received: 32, sent 294 bytes received 32 bytes 652.00 bytes/sec Change-Id: If11467e29e6ac502fa114bd5742a8434b7084f98 Signed-off-by: Aravinda VK <avishwan@redhat.com> BUG: 764827 Reviewed-on: http://review.gluster.org/10070 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Add support for aclsKotresh HR2015-03-301-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for ACLS. When it sees SETXATTR in Changelog, it adds the file to data queue. rsync/tar+ssh will take care of syncing ACLS. User set ACLS will be synced to Slave. This requires "system.posix_acl_access" to go through when client-pid is equal GF_CLIENT_PID_GSYNCD in fuse layer. New config interface is introduced, sync-acls Which can be set using geo-rep config(Default is True) gluster volume geo-replication <VOLUME> <SLAVEHOST>::<SLAVEVOL> \ config sync-acls false Change-Id: I7eb3523fa72b8fed830efc98138891244e830d65 BUG: 1187021 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/10001 Reviewed-by: Aravinda VK <avishwan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Use gfchangelog context init APIVenky Shankar2015-03-271-0/+1
| | | | | | | | | | | | | With the RPC based changes to {libgf}changelog, changelog_init is required before changelog_register. Change-Id: Id125b2bd2e51aaaffa22ecab463dfb739c50d83c Signed-off-by: Venky Shankar <vshankar@redhat.com> BUG: 1170075 Reviewed-on: http://review.gluster.org/9993 Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com> Tested-by: Saravanakumar Arumugam <sarumuga@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep: Ignore .trashcan during xsync crawl.Kotresh HR2015-03-181-0/+1
| | | | | | | | | | | | | | | | With trash feature, .trashcan directory gets created at each export directory. Xsync picks .trashcan to sync and fails with EPERM. Xsync should ignore .trashcan directory. Change-Id: I45bd226c96011ace2c40dd2de878d886c7d34ce5 BUG: 1203293 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/9934 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep: Do not use xsync_upper_limit for change detectionAravinda VK2015-03-151-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Use register time(xsync_upper_limit) only for stime update, do not use for change detection. Problem 1: If a file created before geo-rep, xtime xattr does not exist. Geo-rep updates xtime of the file to current time if not exists. xtime > upper_limit so geo-rep will not pick those files. Changelog either will have SETXATTR, and fails to sync the file. Problem 2: If a file is created before geo-rep create and updated after geo-rep start. xtime of the file is greater than upper limit(geo-rep start time/changelog register time). Geo-rep(XSync) will not pick this file for syncing. Changelog will have only DATA recorded for that file. Geo-rep tries DATA without any ENTRY ops and fails with rsync error. BUG: 1200733 Change-Id: Ie4e8f284db689d2c755ef8e7ecbb658db1c0785f Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9855 Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com> Tested-by: Saravanakumar Arumugam <sarumuga@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Add support for xattrsAravinda VK2015-03-021-3/+7
| | | | | | | | | | | | | | | | | | | | | | This patch adds support for xattrs. When it sees SETXATTR in Changelog, it adds the file to data queue. rsync/tar+ssh will take care of syncing xattrs. User set xattrs will be synced to Slave. New config interface is introduced, sync-xattrs Which can be set using geo-rep config(Default is True) gluster volume geo-replication <VOLUME> <SLAVEHOST>::<SLAVEVOL> \ config sync-xattrs false Change-Id: I70626d854a0d616469dd54d61e5ef155ed8b67d8 BUG: 1196690 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9499 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Fixing the typo errorsarao2015-02-031-2/+2
| | | | | | | | | Change-Id: Iacc67e4ba9ac45e0858f3befe84ffb8fccf7e1c3 BUG: 1075417 Signed-off-by: arao <arao@redhat.com> Reviewed-on: http://review.gluster.org/9502 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
* geo-rep: Sync atime, mtime to slave when SETATTRAravinda VK2014-12-251-0/+4
| | | | | | | | | | | | Existing geo-rep only syncs chown and chmod, with this patch geo-rep also syncs atime and mtime. Change-Id: Iea52d86682873bb4a47eeb0d325f5b9ddf2de2cf Signed-off-by: Aravinda VK <avishwan@redhat.com> BUG: 1176934 Reviewed-on: http://review.gluster.org/9331 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: gid is not set in entry opsAravinda VK2014-10-281-2/+2
| | | | | | | | | | | | | | uid is sent in place of gid while CREATE and MKDIR. Change-Id: Icd1072cb9dcbfc1f419a3cdd456f3d02168175fa BUG: 1104954 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/7984 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Fix rename of directory syncing.Kotresh HR2014-09-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | The rename of directories are captured in all distributed brick changelogs. gsyncd processess these changelogs on each brick parallellaly. The first changelog to get processed will be successful. All subsequent ones will stat the 'src' and if not present, tries to create freshly on slave. It should be done only for files and not for directories. Hence when this code path was hit, regular file's blob is sent as directory's blob and gfid-access translator was erroring out as 'Invalid blob length' with errno as 'ENOMEM' Change-Id: I50545b02b98846464876795159d2446340155c82 BUG: 1146823 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8865 Reviewed-by: Aravinda VK <avishwan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: fix same file different gfid in master and slaveAravinda VK2014-09-261-2/+3
| | | | | | | | | | | | | | | | | | | | | While processing RENAME in changelog, if the file is unlinked in master, then geo-rep was sending UNLINK to slave instead of RENAME. If rsync job fails if one of the file failed to sync in the job. This patch adds logic to remove GFID from data list if the same changelog has UNLINK entry for it after the DATA. Or it removes those GFIDs during retry of changelogs processing. BUG: 1143853 Change-Id: I982dc976397cd0ab676bb912583f66a28f821926 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8761 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Fixing issue with xsync upper limitAravinda VK2014-08-191-20/+16
| | | | | | | | | | | | | | | | | | | | While identifying the file/dir to sync, xtime of the file was compared with xsync_upper_limit as `xtime < xsync_upper_limit` After the sync, xtime of parent directory is updated as stime. With the upper limit condition, stime is updated as MIN(xtime_parent, xsync_upper_limit) With this files will get missed if `xtime_of_file == xsync_upper_limit` With this patch xtime_of_file is compared as xtime_of_file <= xsync_upper_limit BUG: 1128093 Change-Id: I5022ce67ba503ed4621531a649a16fc06b2229d9 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8439 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Handle RMDIR recursivelyAravinda VK2014-08-141-0/+12
| | | | | | | | | | | | | | | | If RMDIR is recorded in brick changelog which is due to self heal traffic then it will not have UNLINK entries for child files. Geo-rep hangs with ENOTEMPTY error on slave. Now geo-rep recursively deletes the dir if it gets ENOTEMPTY. BUG: 1129702 Change-Id: Iacfe6a05d4b3a72b68c3be7fd19f10af0b38bcd1 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8477 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: minimize xsync crawl usage and set upper limit to xsync crawlAravinda VK2014-07-231-19/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | For effective handling of deletes and renames use history crawl as much as possible. History crawl will run in loop till it syncs all data before live changelog time. When it uses xsync crawl(fallback when changelog not available, or very first crawl) it sets upper limit to crawl. After completing History crawl, it checks actual end time returned by history api to compare with register time, if actual end is less than register time then run history crawl one more time. If first turn history processing time is less than the CHANGELOG ROLLOVER TIME then sleep for the difference, After sleep if it is guaranteed that rollover will happen and switches to live changelog consumption without switching to xsync. This sleep is only when history processing completed < CHANGELOG_ROLLOVER_TIME and sleep only after the first turn, So will not affect the performance. BUG: 1112238 Change-Id: I644ef24b07e42e81cec96a025ebd21244a555ec0 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/8151 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* feature/geo-rep: Fix to retain pause state of gsyncd on restartKotresh H R2014-06-201-7/+0
| | | | | | | | | | | | | | On soft reboot, geo-rep monitor is writing 'faulty' into status file. It should not do it if previous state is paused as glusterd depend on the state file on node restart. Change-Id: Idd45abf13350b087371935f1b4f6e1a346433d27 BUG: 1101410 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8097 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: entry_ops errors handlingAravinda VK2014-06-111-3/+4
| | | | | | | | | | | | | | | Xattr.lsetxattr_l call will not raise OSError which errno_wrap can handle, so we need to use Xattr.lsetxattr to get proper OSError and errno_wrap ignores or retries accordingly. Change-Id: Ie0a777152ddbaf9ed80c977e4704974fec997bea BUG: 1105083 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/7972 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* gsyncd / geo-rep: FSH recommended log locationsVenky Shankar2014-06-101-2/+4
| | | | | | | | | | | | | | | | Upgrading "working_dir" on the fly is a bit unclean yet (though it works) as currently config upgrade does not support "old" values to be expanded by using configuration variables. Change-Id: I44ed65c281f2e0ce3b6b467addc5c1c88ac674e7 BUG: 1077516 Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Kotresh H R <khiremat@redhat.com> Signed-off-by: Aravinda VK <avishwan@redhat.com> Signed-off-by: Ajeet Jha <ajha@redhat.com> Reviewed-on: http://review.gluster.org/7375 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* feature/geo-rep: Fix to retain pause state of gsyncd on restart.Kotresh H R2014-06-051-0/+8
| | | | | | | | | | | | | | | A new gsyncd options '--pause-on-start' is introduced. When node reboots, if the status is paused, gsyncd is started with this option. After gsyncd spawns worker and agent, worker will send SIGSTOP to negative pid of monitor to enter pause mode. Change-Id: I5aad82c9a9fc8c243f384940b77d25e26e520d6d BUG: 1101410 Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/7885 Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* gsyncd / geo-rep: Mountbroker cli to use INET socketsVenky Shankar2014-05-251-1/+1
| | | | | | | | | | | | | | | | | | | | unprivileged geo-replication session runs the slave gsyncd process as unprivileged, thereby executing gluster cli commands as an unprivileged user. By default, cli to glusterd uses unix domain sockets, thereby restricting cli command execution by non root users. This patch introduces '--remote-host' cli option to force cli to use INET socket. For this to work, the following needs to be added in glusterd volfile option rpc-auth-allow-insecure on Change-Id: I84b1711281bbcbde156200f80ebdb065afb55488 BUG: 1077452 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/7820 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep: Pause and Resume feature for geo-replicationAravinda VK2014-05-091-52/+17
| | | | | | | | | | | | | | | | Changelog consumption/processing now happens in seperate process group than monitor. When monitor process group gets SIGSTOP all worker process, ssh, rsync will be paused except the changelog processing. When it gets SIGCONT it resumes its operation. Changelog agent runs as RepceServer, geo-rep worker communicates with changelog agent using RepceClient. Change-Id: I35c333e4d8b13d03a7808aed601960eef23cfa04 BUG: 1093602 Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/7322
* geo-rep: Changelog History consumption more fixesAravinda VK2014-05-091-2/+9
| | | | | | | | | | | Number of parallel threads to process changelog history is made configurable via sync_jobs Change-Id: Idcd8e655d9df540cfa48648b9e98af941f95e9d0 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/7660 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Loading libgfchangelog.so only while running geo-repAravinda VK2014-05-091-1/+1
| | | | | | | | | | | | | | | | | In source install, libgfchangelog is installed in /usr/local/lib When glusterd runs /usr/local/libexec/glusterfs/python/gsyncd --version it fails to find library without LD_LIBRARY_PATH. This patch avoids loading library when it is run from glusterd during start. BUG: 1096026 Change-Id: I59912227ac27ff4877d947a7c8f1fe2e8c5be06e Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/7713 Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* geo-rep: Consume Changelog History APIAravinda VK2014-04-301-6/+60
| | | | | | | | | | | | | | | | | | | | | | | | Every time when geo-rep restarts it first does FS crawl using XCrawl and then switches to Changelog Mode. This is because changelog only had live API, that is we can get changes only after registering. Now this(http://review.gluster.org/#/c/6930/) patch introduces History API for changelogs. If history is available then geo-rep will use it instead of FS Crawl. History API returns TS till what time history is available for given start and end time. If TS < endtime then switch to FS Crawl. (History => FS Crawl => Live Changelog) If TS >= endtime, then switch directly to Changelog mode (History => Live Changelog) Change-Id: I4922f62b9f899c40643bd35720e0c81c36b2f255 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/6938 Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep: fix the code bug introduced due to flake8 refactoringAravinda VK2014-04-081-6/+9
| | | | | | | | | | | Sorry for the bug, which got introduced due to code refactoring. http://review.gluster.org/#/c/7311/ Change-Id: Ide519ca114aa8a7d7624d7af99945c857f069069 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/7417 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: code pep8/flake8 fixesAravinda VK2014-04-071-94/+212
| | | | | | | | | | | | | | | | | | | | | | | | | | | pep8 is a style guide for python. http://legacy.python.org/dev/peps/pep-0008/ pep8 can be installed using, `pip install pep8` Usage: `pep8 <python file>`, For example, `pep8 master.py` will display all the coding standard errors. flake8 is used to identify unused imports and other issues in code. pip install flake8 cd $GLUSTER_REPO/geo-replication/ flake8 syncdaemon Updated license headers to each source file. Change-Id: I01c7d0a6091d21bfa48720e9fb5624b77fa3db4a Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/7311 Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Prashanth Pai <ppai@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* gsyncd / geo-rep: ignore DHTs sticky bit file during crawlVenky Shankar2014-02-061-0/+15
| | | | | | | | | | Change-Id: Ide927759c6a3d5301475eac9f6e785aa901d426e BUG: 1036539 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/6792 Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* gsyncd / geo-rep: "patch" up missing stimeVenky Shankar2014-02-061-0/+1
| | | | | | | | | | | | | In cases (mostly upgrade) of unavailability of "stime" key and availability of "xtime" (slave's xtime), introduce "stime" key on the fly by setting it to the value to "xtime". Change-Id: Iaa424662d838154c8abc2cf00830c7f9d6be45ac BUG: 1036539 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/6791 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* gsyncd / geo-rep: cleanup the "tar" processVenky Shankar2014-02-061-0/+4
| | | | | | | | | | | | A missing cleanup for the "tar" process (when tar+ssh is used as the sync engine). Change-Id: Ib9599b43e7ec606c70b7c5598793417142be3c0b BUG: 1036539 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/6794 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* gsyncd / geo-rep: geo-replication fixesAjeet Jha2013-12-121-38/+138
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | -> "threaded" hybrid crawl. -> Enabling metatadata synchronization. -> Handling EINVAL/ESTALE gracefully while syncing metadata. -> Improvments to changelog crawl code. -> Initial crawl changelog generation format. -> No gsyncd restart when checkpoint updated. -> Fix symlink handling in hybrid crawl. -> Slave's xtime key is 'stime'. -> tar+ssh as data synchronization. -> Instead of 'raise', just log in warning level for xtime missing cases. -> Fix for JSON object load failure -> Get new config value after config value reset. -> Skip already processed changelogs. -> Saving status of each individual worker thread. -> GFID fetch on slave for purges. -> Add tar ssh keys and config options. -> Fix nlink count when using backend. -> Include "data" operation for hardlink. -> Use changelog time prefix as slave's time. -> Process changelogs in parallel. Change-Id: I09fcbb2e2e418149a6d8435abd2ac6b2f015bb06 BUG: 1036539 Signed-off-by: Ajeet Jha <ajha@redhat.com> Reviewed-on: http://review.gluster.org/6404 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* geo-replication: treat MKNOD also as regular file createAmar Tumballi2013-09-201-1/+1
| | | | | | | | | | Change-Id: Iec04f642282b554a4d1b5f5c8cdc099fd001b3f4 Original-Author: Venky Shankar <vshankar@redhat.com> Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 847839 Reviewed-on: http://review.gluster.org/5949 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep: retry in case of ENOENT errors in entry creationsAmar Tumballi2013-09-171-2/+2
| | | | | | | | | | Change-Id: I8961633a7371c941a3feee44c949d5c934eca998 Original-Author: Venky Shankar <vshankar@redhat.com> Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 847839 Reviewed-on: http://review.gluster.org/5933 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* gsyncd / geo-rep: use the correct virtual xattr to collect 'gfid'Venky Shankar2013-09-051-1/+1
| | | | | | | | | Change-Id: Ifea32ad1b2695f1668428fae6b263014bf320b70 BUG: 996379 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5589 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* gsyncd / geo-rep: Fix "regular file" overloading link()Venky Shankar2013-09-041-0/+1
| | | | | | | | | | | | | | | ... missing entry2pb() call before going ahead with create. Change-Id: I48de4df3f0ea6c789c9eeb10cd332ab461f3c868 BUG: 1003805 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/5760 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Tested-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gsyncd / geo-rep: fix hardlink creation on slaveVenky Shankar2013-09-041-2/+3
| | | | | | | | | | | | | Change-Id: I20fbd518bf519cbf2362b97aeb8be7c3b105087a BUG: 1003805 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/5757 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Tested-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gsyncd / geo-rep: fix "regular file" overloading renameVenky Shankar2013-09-041-0/+1
| | | | | | | | | | | | | | | | | entry operation on the slave was using source parent gfid and basename when renames are overloaded to use regular file creation. This patch fixes the issue by using the destination parent gfid and basename for these cases. Change-Id: I1a4e8df7f07905224ce44ef5abd6f180234285ab BUG: 1003800 Tested-by: Amar Tumballi <amarts@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5754 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Tested-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gsyncd / geo-rep: Introduce basic crawl instrumentationVenky Shankar2013-09-041-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch extends the persistent instrumentation work done by Aravinda (@avishwa), by introducing a handfull of instrumentation variables for crawl. These variables are "pulled up" by glusterd in the event of a geo-replication status cli command and looks something like below: "Uptime=00:21:10;FilesSyned=2982;FilesPending=0;BytesPending=0;DeletesPending=0;" "FilesPending", "BytesPending" and "DeletesPending" are short-lived variables that are non-zero when a changelog is being processes (ie. when an active sync in ongoing). After a successfull changelog process "FilesPending" is summed up into "FilesSynced". The three short-lived variabled are then reset to zero and the data is persisted Additionally this patch also reverts some of the changes made for BZ #986929 (those were not needed). Change-Id: I948f1a0884ca71bc5e5bcfdc017d16c8c54fc30b BUG: 990420 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5441 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gsyncd / geo-rep: periodically set slave xtime on slaveVenky Shankar2013-09-041-0/+12
| | | | | | | | | | | | | | | | | | | setting the slave xtime on the slave (after each changelog/xsync) crawl helps in two things: * effective recover of master (failover/failback) * cascading setup - instances when the session from intermediate master session is stopped, data is put on the master -> slave sesssion and then the cascading session is started again. Change-Id: Ifae10a6ac09dc0d17707c3b5a3090bcf1efec8b6 BUG: 990900 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5451 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Tested-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Saving geo-rep session details in a more specific pathVenky Shankar2013-09-041-4/+5
| | | | | | | | | | | | | | | Now saving the session details in /var/lib/glusterd/geo-replication/<mastervol>_<slaveip>_<slavevol> repo to distinguish between two master-slave sessions where the slavename is same across two different clusters. Change-Id: I57c93f55cc9bd4fe2bffe579028aaf5e4335b223 BUG: 991501 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/5488 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* geo-replication: Use a md5 based unique control pathHarshavardhana2013-09-041-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A hostname fqdn can be of length 255 according to RFC1123 -------------------------> /usr/include/bits/posix1_lim.h:#define _POSIX_HOST_NAME_MAX 255 <------------------------- On linux this length is 64 -------------------------> /usr/include/bits/local_lim.h:#define HOST_NAME_MAX 64 <------------------------- When a given hostname is > 45 (characters) - SSH fails with --------------------------> "ControlPath too long for Unix domain socket". <-------------------------- Indicating that the total length of ControlPath which is on linux should be 108 -------------------------> /usr/include/linux/un.h:#define UNIX_PATH_MAX 108 <------------------------- This leads to "faulty" geo-replication status. This patch brings in a new file called manifest which carries given a geo-rep session some unique information - with which a unique `md5` is generated in a 32length digest, this ensures that we don't exceed UNIX_PATH_MAX limitations instead we use a conservative approach and still be able to provide a unique socket path. Change-Id: I3a6a27d605d751a86e7c82eace4561d9b0134fe1 BUG: 990330 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/5681 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Csaba Henk <csaba@redhat.com>
* glusterd/cli changes for distributed geo-repAvra Sengupta2013-07-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commands: gluster system:: execute gsec_create gluster volume geo-rep <master> <slave-url> create [push-pem] [force] gluster volume geo-rep <master> <slave-url> start [force] gluster volume geo-rep <master> <slave-url> stop [force] gluster volume geo-rep <master> <slave-url> delete gluster volume geo-rep <master> <slave-url> config gluster volume geo-rep <master> <slave-url> status The geo-replication is distributed. The session will be created, and gsyncd will be spawned on all relevant nodes, instead of only one node. geo-rep: Collecting status detail related data Added persistent store for saving information about TotalFilesSynced, TotalSyncTime, TotalBytesSynced Changes in the status information in socket: Existing(Ex): FilesSynced=2;BytesSynced=2507;Uptime=00:26:01; New(Ex): FilesSynced=2;BytesSynced=2507;Uptime=00:26:01;SyncTime=0.69978; TotalSyncTime=2.890044;TotalFilesSynced=6;TotalBytesSynced=143640; Persistent details stored in /var/lib/glusterd/geo-replication/${mastervol}/${eSlave}-detail.status Change-Id: I1db7fc13ffca2e415c05200b0109b1254067f111 BUG: 847839 Original Author: Avra Sengupta <asengupt@redhat.com> Original Author: Venky Shankar <vshankar@redhat.com> Original Author: Aravinda VK <avishwan@redhat.com> Original Author: Amar Tumballi <amarts@redhat.com> Original Author: Csaba Henk <csaba@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5132 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* gsyncd: distribute the crawling loadAvra Sengupta2013-07-261-19/+188
| | | | | | | | | | | | | | | | | | | * also consume changelog for change detection. * Status fixes * Use new libgfchangelog done API * process (and sync) one changelog at a time Change-Id: I24891615bb762e0741b1819ddfdef8802326cb16 BUG: 847839 Original Author: Csaba Henk <csaba@redhat.com> Original Author: Aravinda VK <avishwan@redhat.com> Original Author: Venky Shankar <vshankar@redhat.com> Original Author: Amar Tumballi <amarts@redhat.com> Original Author: Avra Sengupta <asengupt@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5131 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* move 'xlators/marker/utils/' to 'geo-replication/' directoryAvra Sengupta2013-07-221-0/+972
Change-Id: Ibd0faefecc15b6713eda28bc96794ae58aff45aa BUG: 847839 Original Author: Amar Tumballi <amarts@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5133 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>