path: root/xlators/features/marker/utils/syncdaemon
Commit message | Author | Age | Files | Lines
* move 'xlators/marker/utils/' to 'geo-replication/' directory | Avra Sengupta | 2013-07-22 | 13 | -3444/+0
  Change-Id: Ibd0faefecc15b6713eda28bc96794ae58aff45aa
  BUG: 847839
  Original Author: Amar Tumballi <amarts@redhat.com>
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/5133
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/changelog: changelog translator | Avra Sengupta | 2013-07-22 | 1 | -2/+3
  This is the initial version of the Changelog Translator.

  What is it
  -----------
  The goal is to capture changes performed on a GlusterFS volume. The
  translator needs to be loaded on the server (bricks) and captures changes
  in a plain text file inside a configured directory path (controlled by
  "changelog-dir", should be somewhere in <export>/.glusterfs/changelog by
  default).

  Changes are classified into 3 types:
  - Data     : TYPE-I
  - Metadata : TYPE-II
  - Entry    : TYPE-III

  The changelog file is rolled over after a certain time interval (defaults
  to 60 seconds), after which a new changelog is started. Note that within a
  time interval (time slice), multiple changes to an inode are recorded only
  once (i.e. 100+ writes on an inode that happen within the time slice have
  only a single corresponding entry in the changelog file). That way we do
  not bloat the changelog and also save lots of writes.

  Changelog Format
  -----------------
  TYPE-I and TYPE-II changes carry the gfid of the entity on which the
  operation happened. TYPE-III, being an entry op, requires the parent gfid
  and the basename. The changelog format has been kept to a minimum and it's
  up to the consumers to do the heavy lifting of figuring out deletes,
  renames etc. A single changelog file records all three types of changes,
  with each change starting with an identifier ("D": DATA, "M": METADATA and
  "E": ENTRY). An option is provided for the encoding type (see TUNABLES).

  Consumers
  ----------
  The only consumer as of today would be geo-replication, although backup
  utilities, self-heal and bit-rot detection could be possible consumers in
  the future.

  CLI
  ----
  By default, change-logging is disabled (the translator is present in the
  server graph but does nothing). When enabled (via cli) each brick starts
  to log the changes.

  There is a set of tunables that can be used to change the translator's
  behaviour:

  - enable/disable changelog (disabled by default)
    gluster volume set <volume> changelog {on|off}

  - set the logging directory (<brick>/.glusterfs/changelogs is the default)
    gluster volume set <volume> changelog-dir /path/to/dir

  - select encoding type (binary (default) or ascii)
    gluster volume set <volume> encoding {binary|ascii}

  - change the rollover time for the logs (60 secs by default)
    gluster volume set <volume> rollover-time <secs>

  - when secs > 0, the changelog file is not open()'d with the O_SYNC flag
    and fsync is triggered periodically every <secs> seconds
    gluster volume set <volume> fsync-interval <secs>

  features/changelog: changelog consumer library (libgfchangelog)

  A shared library is provided for the consumers of the changelogs for easy
  access via APIs. An application can link against this library and request
  changelog updates. Conversion of binary logs to human-readable ascii
  format is also taken care of by the library, which keeps a copy of the
  changelog in an application-provided working directory.

  Change-Id: I75575fb7f1c53d2bec3dba1a329ea7bb3c628497
  BUG: 847839
  Original Author: Venky Shankar <vshankar@redhat.com>
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/5127
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* gsync: Display additional information in status command | sarvotham s pai | 2013-04-04 | 2 | -4/+52
  Added code to display extra information when the status command is
  executed. The information now shown is:
  1. Number of files synced
  2. Crawl time
  3. Total sync time
  4. Bytes synced

  Bytes synced is taken from rsync output. The --stats option of rsync gives
  extra information about the sync. In the stats output there is a field
  called "Total transferred file size" which states the amount of bytes
  synced. This information is parsed from stdout using regular expressions.
  Bytes synced can be used to calculate throughput.

  Change-Id: Id9bba9fff45ee7049bb8257c6fd918e5237e05b1
  BUG: 947774
  Signed-off-by: sarvotham s pai <spai@redhat.com>
  Reviewed-on: http://review.gluster.org/4749
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
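The parsing step described in this commit might look roughly like the sketch below. It is not the code from the patch: the function name `bytes_synced` and the exact regular expression are assumptions, and the pattern tolerates the thousands separators that some rsync versions print.

```python
import re

# Matches the rsync --stats line, e.g.
#   "Total transferred file size: 1,234 bytes"
# Digit grouping with commas is optional, depending on the rsync version.
_TRANSFERRED_RE = re.compile(
    r"^Total transferred file size:\s*([\d,]+)\s*bytes", re.MULTILINE)


def bytes_synced(rsync_stdout):
    """Return the transferred byte count from rsync --stats output, or None."""
    match = _TRANSFERRED_RE.search(rsync_stdout)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))
```

Dividing the returned value by the total sync time gives the throughput figure the commit message mentions.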
* geo-rep: retire old style ssh setup | Csaba Henk | 2013-03-14 | 1 | -0/+6
  Users are still using geo-rep with the old, deprecated, insecure,
  unsupported ssh setup. Not their fault -- the implementation of the new
  method had the following characteristics:

  - old method is possible, but with default settings it's not working
  - it can be made operational by fiddling with the "remote-gsyncd" tunable
  - with default settings, an unhelpful, actually misleading error message
    is produced
  - the UI gave no hint about the changes in the ssh setup

  http://review.gluster.org/4392 tried to fix these; what it accomplished
  was unrestricted support for the bad practice (by making the default old
  setup operational).

  From now on:
  - we disable the old method by reserving the "remote-gsyncd" tunable
  - if the old method is attempted, give a hint what to do

  Change-Id: Icade94725d8d8d2d4c89cab992d4226351637b86
  BUG: 895656
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.org/4602
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep / gsyncd: Separate log file directory for Mountbroker sessions | Venky Shankar | 2013-02-04 | 1 | -1/+5
  ... so that a mountbroker session which is initiated between master and
  slave does not use the same log file if it's started after a normal
  geo-rep session between master and slave. Sharing the file results in
  EPERM, as the log file is owned by root and the geo-rep slave process (now
  running as a non-privileged user) does not have access to it.

  Also, having a separate log file directory for mountbroker sessions looks
  clean.

  NOTE: geo-rep's client mount log file location remains unchanged.

  Change-Id: Ic7a732e250aee5393b9c3f6ebf6dfe2c310b7fe4
  BUG: 893960
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.org/4407
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep: do not access BaseException.message in syncdutils | Niels de Vos | 2012-12-18 | 1 | -2/+2
  http://www.python.org/dev/peps/pep-0352/ explains that the .message
  property of BaseException is being removed. Most of the other exception
  handlers access <Exception>.args[] which should be suitable for this case
  too.

  Change-Id: I1810450b78d2b3d7f8bd07f2beb02cbe9e2adecb
  BUG: 888346
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/4328
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
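A small sketch of the portable pattern this commit refers to: read the constructor arguments from `.args` instead of the removed `.message` attribute. The helper name `describe_error` is hypothetical and not taken from syncdutils.

```python
def describe_error(exc):
    """Return a short description of exc without using exc.message."""
    # exc.args holds the positional arguments the exception was built with;
    # it is available on every BaseException, unlike .message.
    if exc.args:
        return str(exc.args[0])
    # Fall back to the class name when the exception carries no arguments.
    return exc.__class__.__name__
```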
* geo-rep / gsyncd: play nicely with peer multiplexing when setting a checkpoint | Csaba Henk | 2012-12-04 | 1 | -5/+16
  The gsyncd invocation that instruments the "geo-rep config" command is
  multiplexed over peers to ensure the uniformity of configuration.

  In general, that works well, but checkpoint setting is a special case,
  because (unlike other instances of config-set) it is logged (as recording
  of checkpoint events is part of the feature). The problem is that the path
  components leading to the log file are created only on the original node,
  where gsyncd was started. Therefore the logging attempt will fail on the
  other nodes.

  Fix: ignore if opening the logfile on behalf of checkpoint setting fails
  with ENOENT.

  Change-Id: I677f3f081bf4b9e3ba4d25d58979d86931e6beb4
  BUG: 881997
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.org/4248
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Tested-by: Christos Triantafyllidis <ctrianta@redhat.com>
  Reviewed-by: Christos Triantafyllidis <ctrianta@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
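The fix described above amounts to tolerating one specific errno. A minimal sketch, assuming a hypothetical helper `open_checkpoint_log` (the real gsyncd code sets up logging differently):

```python
import errno


def open_checkpoint_log(path):
    """Open the log for appending; return None if its directory is missing."""
    try:
        return open(path, "a")
    except OSError as e:
        # On peers where gsyncd was not started, the log directory was never
        # created, so ENOENT is expected and silently ignored.
        if e.errno == errno.ENOENT:
            return None
        # Anything else (EACCES, EISDIR, ...) is still a real error.
        raise
```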
* geo-rep / gsyncd,glusterd: do not hardcode socket path | Csaba Henk | 2012-11-28 | 2 | -2/+5
  ... in gsyncd python code. Instead, use the configuration mechanism to set
  it suitably from glusterd.

  Change-Id: I9fe2088b14d28588d1e64fe892740cc5755b8365
  BUG: 868877
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.org/4143
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* geo-replication: catch select.error on select() | Niels de Vos | 2012-11-28 | 1 | -2/+2
  tailer() in resource.py does not correctly catch exceptions from select().
  select() can raise an instance of the select.error class and the current
  expression only catches ValueError (and the instance will have reference
  called selecterror).

  The geo-rep log contains a call trace like this:

  > E [syncdutils:190:log_raise_exception] <top>: FAIL:
  > Traceback (most recent call last):
  >   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 216, in twrap
  >     tf(*aa)
  >   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 123, in tailer
  >     poe, _ ,_ = select([po.stderr for po in errstore], [], [], 1)
  >   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 276, in select
  >     return eintr_wrap(oselect.select, oselect.error, *a)
  >   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 269, in eintr_wrap
  >     return func(*a)
  > error: (9, 'Bad file descriptor')

  BUG: 880308
  Change-Id: I2babe42918950d0e9ddb3d08fa21aa3548ccf7c5
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/4233
  Reviewed-by: Peter Portante <pportant@redhat.com>
  Reviewed-by: Csaba Henk <csaba@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
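The gist of the fix is to catch both exception types around select(). The sketch below is illustrative only: `poll_readable` is a hypothetical name, not the tailer() code. On current Python 3, `select.error` is an alias of `OSError`, so the same except clause still works.

```python
import select


def poll_readable(fds, timeout=1.0):
    """Return the readable subset of fds, or [] if a descriptor went bad."""
    try:
        readable, _, _ = select.select(fds, [], [], timeout)
    except (ValueError, select.error):
        # ValueError: a closed file object was passed in.
        # select.error: the OS rejected a descriptor, e.g. EBADF.
        return []
    return readable
```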
* geo-rep/gsyncd: work around rsync argument overflow | Csaba Henk | 2012-09-07 | 1 | -3/+8
  Instead of passing the files to be synced as args to rsync, have rsync
  read them on stdin with '-0 --files-from=-'.

  Change-Id: Ic3f71a0269941ce50051af8adfad183a52a79b01
  BUG: 855306
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.org/3917
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
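A sketch of how such an invocation can be assembled. The helper `rsync_files_from_stdin` and its parameters are assumptions for illustration; only the `-0` and `--files-from=-` options come from the commit message. The helper builds the command and the stdin payload without running rsync.

```python
import os


def rsync_files_from_stdin(files, source_dir, target):
    """Build (argv, stdin_payload) so the file list never hits argv limits."""
    # -0 makes rsync expect NUL-terminated names in the --files-from list;
    # "--files-from=-" tells it to read that list from standard input.
    argv = ["rsync", "-0", "--files-from=-", source_dir, target]
    # NUL is the only byte that cannot occur in a file name, so it is a
    # safe terminator even for names with spaces or newlines.
    payload = b"".join(os.fsencode(name) + b"\0" for name in files)
    return argv, payload
```

The pair can then be handed to `subprocess.Popen(argv, stdin=subprocess.PIPE).communicate(payload)`, which keeps the command line short no matter how many files are queued.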
* geo-rep / gsycnd: restore the generic signature for sendmark_regular | Csaba Henk | 2012-07-30 | 1 | -3/+3
  Earlier fixes to 842330 changed the generic (*a, **kw) signature, although
  that was not related to the issue. We restore the generic signature as it
  was used for a reason (proxy methods that do none or only algebraic
  transformations on passed arguments idiomatically have generic signature,
  both to serve as visual cue and agnosticism wrt. the inner API).

  Change-Id: Ib609a3a58be53d78b7f1221a3c162c6aec8fd488
  BUG: 842330
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3754
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* gsyncd / geo-rep: Fix sendmark() invocation for Normal Mixin | Venky Shankar | 2012-07-25 | 1 | -1/+1
  Change-Id: I0ae81ab01418becba83e401ec36c6db5323945e8
  BUG: 842330
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.com/3725
  Tested-by: Vijay Bellur <vbellur@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Fixes for gsyncd / geo-rep and FUSE listxattr | Venky Shankar | 2012-07-24 | 1 | -4/+4
  This patch fixes two problems with recent changes to Geo-Replication.

  First:
  ------
  Recent changes to geo-replication rely on Rsync to transfer extended
  attributes. Essentially Rsync would invoke a listxattr() and then
  getxattr() the set returned by listxattr() and finally transfer it to the
  remote slave. Xattrs like security.selinux would create problems as they
  are not allowed to be set explicitly (unless there's a rule that allows
  this). So, to make Rsync behave sanely we filter out all "*.selinux*"
  xattrs from listxattr() (which is getxattr() with ->name as NULL).

  Second:
  -------
  Python's "if {..} else {..}" shortcut ".. and .. or .." was misused here.
  This is a straightforward fix by interchanging the last two variables
  (classes in this case).

  Also fix a typo in sendmark_regular() definition.

  Change-Id: I097b5f5d88a36c7eef5560a78d4332948a545942
  BUG: 842330
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.com/3714
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
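To illustrate the second problem: in the `cond and X or Y` idiom, X is the result for a true condition, so swapping the last two operands silently inverts the choice. The class and function names below are hypothetical stand-ins, not the ones from the patch.

```python
class NormalMixin(object):
    """Stand-in for the class wanted in the default case."""


class SpecialMixin(object):
    """Stand-in for the class wanted when the condition holds."""


def choose_buggy(special):
    # Operands in the wrong order: a truthy condition yields NormalMixin.
    return special and NormalMixin or SpecialMixin


def choose_fixed(special):
    # Same idiom with the last two operands interchanged.
    return special and SpecialMixin or NormalMixin


def choose_explicit(special):
    # The conditional expression states the intent directly and also works
    # when the first alternative is falsy, which the and/or idiom does not.
    return SpecialMixin if special else NormalMixin
```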
* gsyncd / geo-rep: Fix typo in 'purge' flow | Venky Shankar | 2012-07-20 | 1 | -1/+1
  Change-Id: I6c329b895178545d16b0cb9f01ad116f5342f752
  BUG: 841855
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.com/3706
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep / gsyncd: convert ignore-deletes to a mixin too | Csaba Henk | 2012-07-19 | 1 | -3/+16
  Change-Id: I164a1d1dd5f15569afd6806834119a6844949df0
  BUG: 841062
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3684
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep / gsyncd: add support for sending xtimes through rsync | Csaba Henk | 2012-07-19 | 3 | -14/+35
  Note that in said mode metadata synchronization is best effort: rsync
  syncs metadata last, so if rsync is interrupted between the xattr sync and
  metadata sync stages, the file will still be considered in sync.

  Change-Id: I1c75eab33b0a1000abf3ad36b2d484a89eeda1bd
  BUG: 841062
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3683
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* geo-rep / gsyncd: rsync option cleanups, fixes | Csaba Henk | 2012-07-18 | 2 | -2/+6
  - add two tunables for rsync: "rsync-options" and "rsync-ssh-options"
  - always pass "--no-implied-dirs" to rsync

  Change-Id: I3d67a4cba8cabd681edac80e6b1fb8ea322008bd
  BUG: 841062
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3682
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep / gsyncd: fixes to communication with child processes | Csaba Henk | 2012-07-14 | 1 | -7/+11
  Due to not using the proper Python keyword, it was possible for the
  errhandler thread to run into an empty select.

  Signed-off-by: Csaba Henk <csaba@redhat.com>
  BUG: 764678
  Change-Id: I3c39e718e72545c27d50fd73aa6daf54062331b0
  Reviewed-on: http://review.gluster.com/3560
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep / gsyncd: sanitize error log of external commands | Csaba Henk | 2012-07-14 | 1 | -2/+10
  If a command invoked by gsyncd fails, gsyncd makes a log of what comes out
  on its stderr. So far the log nondeterministically broke lines at random
  places. Now put some effort into reconstructing the original lines and
  having a faithful log.

  BUG: 764678
  Change-Id: I16fcc75d3e0f624c10c71d9b37c937ca677087cc
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3561
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
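Reconstructing lines from arbitrarily sized stderr reads usually means buffering the trailing partial line until its newline arrives. A minimal sketch of that technique, with a hypothetical `LineAssembler` class (the patch itself may do this differently):

```python
class LineAssembler(object):
    """Reassemble complete lines from arbitrarily chunked byte reads."""

    def __init__(self):
        # Bytes of the last, not yet newline-terminated line.
        self._pending = b""

    def feed(self, chunk):
        """Add a chunk; return the list of lines completed by it."""
        parts = (self._pending + chunk).split(b"\n")
        # The final element is either b"" (chunk ended on a newline) or an
        # incomplete line that must wait for the next chunk.
        self._pending = parts.pop()
        return parts

    def flush(self):
        """Return whatever is left once the stream has ended."""
        rest, self._pending = self._pending, b""
        return [rest] if rest else []
```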
* gsyncd / geo-rep : failover/failback | Csaba Henk | 2012-06-13 | 3 | -66/+341
  This commit is based on Venky Shankar <vshankar@>'s original
  implementation. Let us first quote Venky's description, then we summarize
  the changes to his work.

  ------
  First version of failover/failback. The failback mechanism uses two
  exclusive modes:

  * blind-sync
    This mode works with xtime pairs (both master and slave) to identify
    candidates to sync the original master from the slave.

  * wrapup-sync
    This mode is similar to the normal working of gsyncd except that
    orphaned entities in the gluster volume are not assigned xtimes. This
    prevents unnecessary transfer of data for such entities.

  Modes can be enabled via:
    gluster volume geo-replication M S config special_sync_mode blind
    gluster volume geo-replication M S config special_sync_mode wrapup

  To turn off the special modes (i.e. to revert to normal gsyncd behaviour)
  use:
    gluster volume geo-replication colon-d0 192.168.1.2::colon-d config \!special_sync_mode
  ------

  Code has been refactored to meet the following goals:

  - make checkpointing work with special sync modes
  - move sync mode related conditionals out of the crawl loop and make all
    decisions at startup time
  - be intrusive to the crawl loop to the smallest possible degree (we will
    have to change/revisit it for other reasons, and the complexity of that
    should not increase)

  So, xtime parsing/updating/evaluation that's specific to certain special
  modes is represented as mixin classes; basic operation logic is in an
  abstract base class. On startup, the special-sync-mode tunable is
  dynamically dispatched to the corresponding mixin and the actual master
  class is derived from the chosen mixin and the ABC.

  Change-Id: Ic9b8448f31ad4239a8200dc689f7d713662a67de
  BUG: 830497
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3541
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
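The mixin dispatch described above can be sketched as follows. Every class, method and dictionary name here is hypothetical; only the overall shape (mode-specific mixins, a shared base class, a master class derived at startup) follows the commit message.

```python
class MasterBase(object):
    """Mode-independent logic; relies on a mixin for the xtime policy."""

    def crawl(self, path):
        # The crawl itself contains no per-mode conditionals.
        return "%s: %s" % (path, self.xtime_policy())


class NormalMixin(object):
    def xtime_policy(self):
        return "normal"


class BlindMixin(object):
    def xtime_policy(self):
        return "blind"


class WrapupMixin(object):
    def xtime_policy(self):
        return "wrapup"


# Tunable value -> mixin; None stands for "no special sync mode set".
MIXINS = {None: NormalMixin, "blind": BlindMixin, "wrapup": WrapupMixin}


def make_master_class(special_sync_mode=None):
    """Derive the concrete master class once, at startup."""
    mixin = MIXINS[special_sync_mode]
    # Mixin first, so its methods take precedence in the MRO.
    return type("Master", (mixin, MasterBase), {})
```

Because the decision is made once when the class is built, later changes to the crawl loop do not need to know which mode is active.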
* geo-rep: checkpointing | Csaba Henk | 2012-06-13 | 3 | -19/+172
  - gluster vol geo-rep M S conf checkpoint <LABEL|now>

    sets a checkpoint with LABEL (the keyword "now" is special, it's
    rendered to the label "as of <timestamp of current time>") that's used
    to refer to the checkpoint subsequently. (Technically, gsyncd makes a
    note of the xtime of master's root as of setting the checkpoint, called
    the "checkpoint target".)

  - gluster vol geo-rep M S conf \!checkpoint

    deletes the checkpoint.

  - gluster vol geo-rep M S stat

    if status is OK, and there is a checkpoint configured, the checkpoint
    info is appended to status (either "not yet reached", or "completed at
    <timestamp of completion>"). (Technically, the worker runs a thread that
    monitors / serializes / verifies checkpoint status, and answers
    checkpoint status requests through a UNIX socket; monitoring boils down
    to querying the xtime of slave's root and comparing it with the target.)

  - gluster vol geo-rep M S conf log-file | xargs grep checkpoint

    displays the checkpoint history. Set, delete and completion events are
    logged properly.

  Change-Id: I4398e0819f1504e6e496b4209e91a0e156e1a0f8
  BUG: 826512
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3491
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* geo-rep / gsyncd: further cleanup refinements | Csaba Henk | 2012-05-24 | 4 | -5/+23
  - Regarding the issue of leftover ssh control dirs:
    If the master side worker is stuck in the connection establishment
    phase, have the monitor kill it softly (i.e. first by SIGTERM, to let it
    clean up).

    This is trickier than it sounds at first, because if the worker is stuck
    waiting for a RePCe answer (in threading.Condition().wait()), then
    SIGTERM is ignored (more precisely, Python holds it back for the wait
    and resends it to itself when the wait is over). So instead of
    signalling the worker only, we send TERM to the whole process group --
    that brings down the ssh connection, which wakes up the waiting worker,
    which then can clean up.

    The only problem is that the monitor is also in the process group and it
    should not commit suicide. That is taken care of by setting up a
    one-time SIGTERM handler in the monitor.

  - Regarding slave gsyncd stuck in chdir:
    Slave gsyncd is usually well behaved: if master does not send
    keepalives, it takes care to exit. However, if a hang occurs in the
    early phase, when the slave is to change to the gluster mountpoint, no
    timeout is set up for that (and unlike on the master side, neither is
    there an external actor like the monitor to do that).

    So, to manage this scenario, we do the chdir in a (supposedly) short
    lived thread, and in the main thread we wait for the termination of this
    thread. If that does not happen within the time limit, the main thread
    calls for cleanup and exit.

    (This logic explicitly takes the appropriate action in the cases when
    chdir succeeds or when it hangs; but what about the remaining case, when
    chdir fails? Well, in that case the chdir thread's exception handler
    will put the process on the cleanup and exit route.)

  Change-Id: I6ad6faa9c7b1c37084d171d1e1a756abaff9eba8
  BUG: 786291
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3376
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
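The chdir-in-a-thread approach from the second item can be sketched like this. The function name, its return values and the injectable `chdir` parameter are assumptions made for illustration; the actual gsyncd code proceeds to cleanup and exit rather than returning a status string.

```python
import os
import threading


def chdir_with_timeout(path, timeout, chdir=os.chdir):
    """Run chdir(path) in a helper thread; report ok, failed or timeout."""
    result = {}

    def worker():
        try:
            chdir(path)
            result["status"] = "ok"
        except OSError:
            # chdir failed outright (e.g. ENOENT), which is not a hang.
            result["status"] = "failed"

    t = threading.Thread(target=worker)
    # A daemon thread cannot keep the process alive if chdir never returns.
    t.daemon = True
    t.start()
    # Wait for the thread, but only up to the time limit.
    t.join(timeout)
    return result.get("status", "timeout")
```

A caller would treat anything other than "ok" as the signal to clean up and exit.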
* geo-rep / gsyncd: add "--super" to rsync invocation | Csaba Henk | 2012-05-24 | 2 | -2/+1
  This forces rsync to perform supposedly privileged operations on
  unprivileged slaves (like chown(2)).

  For consistent behavior (with gsyncd's "chown" RPC call that's being used
  for symlinks and directories), we also pass "--numeric-ids" to rsync.

  Also took the chance to retire gsyncd's "--rsync-extra" option which was
  there for debugging purposes (related to a resolved issue).

  Change-Id: I4ee4d0d3a8c4e0f6746d34d7722c8a567a67491c
  BUG: 822121
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3426
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep / gsyncd: fix cleanup of temporary mounts | Csaba Henk | 2012-05-21 | 1 | -17/+76
  [This is a "forward port" of fafd5c17, http://review.gluster.com/2908]

  The "finally" clause that was meant to clean up after the temp mount has
  not covered the case of getting signalled (e.g. by the monitor, upon the
  worker timing out).

  So here we "outsource" the cleanup to an ephemeral child process. The
  child calls setsid(2) so it won't be bothered by internal process
  management. We use a pipe between the worker and the cleanup child; when
  the child sees the worker end getting closed, it performs the cleanup. The
  worker end can get closed either because the worker closes it (normal
  case), or because the worker has terminated (faulty case) -- thus as a
  bonus, we get nice uniform handling with no need to differentiate between
  normal and faulty cases.

  The faulty case that was seen IRL -- i.e., users of maintenance mounts
  hang in chdir(2) to the mount point -- can be simulated for testing
  purposes by applying the following patch:

  diff --git a/xlators/mount/fuse/src/fuse-bridge.c b/xlators/mount/fuse/src/fuse-bridge.c
  index acd3c68..1ce5dc1 100644
  --- a/xlators/mount/fuse/src/fuse-bridge.c
  +++ b/xlators/mount/fuse/src/fuse-bridge.c
  @@ -2918,7 +2918,7 @@ fuse_init (xlator_t *this, fuse_in_header_t *finh, void *msg)
                   if (fini->minor < 9)
                           *priv->msg0_len_p = sizeof(*finh) + FUSE_COMPAT_WRITE_IN_SIZE;
   #endif
  -        ret = send_fuse_obj (this, finh, &fino);
  +        ret = priv->client_pid_set ? 0 : send_fuse_obj (this, finh, &fino);
           if (ret == 0)
                   gf_log ("glusterfs-fuse", GF_LOG_INFO,
                           "FUSE inited with protocol versions:"

  Change-Id: I14bad56a60a7fa82d0104fa4b9a20f4e42a7186f
  BUG: 786291
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3259
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
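The pipe-and-child arrangement described above can be sketched as below, assuming a POSIX system. `spawn_cleanup_child` and its return value are hypothetical; the real code performs the unmount and directory removal in its own way.

```python
import os


def spawn_cleanup_child(cleanup):
    """Fork a child that runs cleanup() once the parent's pipe end closes.

    Returns (child_pid, write_fd). The caller keeps write_fd open while it
    works and closes it, or simply dies, when cleanup is due.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            # New session: signals aimed at the worker's process group no
            # longer reach the cleanup child.
            os.setsid()
            # Drop the inherited copy of the write end, otherwise EOF
            # would never be seen on the read end.
            os.close(write_fd)
            # read() returns b"" only when every writer is gone, whether
            # the worker closed the pipe or was killed.
            while os.read(read_fd, 1024):
                pass
            cleanup()
        finally:
            # Leave without running the parent's exit handlers.
            os._exit(0)
    os.close(read_fd)
    return pid, write_fd
```

Normal and abnormal termination of the worker look identical to the child, which is the uniformity the commit message points out.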
* geo-rep / gsyncd: fixes regarding the command invocation framework | Csaba Henk | 2012-05-19 | 2 | -9/+25
  Some of the bugs to fix were found by the following stress-test: make
  "glusterfs --client-pid=-1" exit immediately on slave side.

  Also fix eintr_wrap which should not "adopt" exceptions generated by the
  wrapped call, by re-raising them as GsyncdError.

  Change-Id: Ia0d39e0635975ebbbf98d86e1e26f3122e1ed6ff
  BUG: 764678
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3258
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
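A sketch of an EINTR-retrying wrapper that passes every other exception through unchanged, which is the behaviour this commit asks for. The body is an assumption rather than the syncdutils implementation; only the call shape `eintr_wrap(func, exc, *args)` is visible elsewhere in this log.

```python
import errno


def eintr_wrap(func, exc_class, *args):
    """Call func(*args), retrying for as long as it fails with EINTR."""
    while True:
        try:
            return func(*args)
        except exc_class as e:
            # Retry only when the call was interrupted by a signal.
            if e.args and e.args[0] == errno.EINTR:
                continue
            # Any other error keeps its original type and traceback instead
            # of being converted into a different exception class.
            raise
```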
* geo-rep / gsyncd: recognize ECONNABORTED as termination of aux glusterfs | Csaba Henk | 2012-05-19 | 1 | -3/+3
  Don't dump stack, rather log the "glusterfs session went down" message.

  If the aux glusterfs is already dead when we try to do some file
  operation, we get a failure with ENOTCONN, which is already handled as
  above. However, it's also possible that glusterfs dies while we are in a
  syscall into it -- in that case we get ECONNABORTED, and so far we then
  ended up with an ugly stack trace. From now on we take ECONNABORTED into
  consideration as well.

  Nb. wrt. testing: it's not easy to synthetically force the aux glusterfs
  to end this way; for that we have to provoke gsyncd into intensive
  synchronization. I succeeded in that with the following ruby oneliner:

  ruby -rcgi -e '
  Dir.chdir($*[0])
  a=[]
  Thread.new { loop { while a.size >= 100; File.delete a.shift; end; sleep 1 }}
  loop { a<<CGI.escape(STDIN.read 10); open(a[-1], "w") {}}' MTPT < /dev/urandom

  where the geo-rep master is mounted at MTPT. With this going on, deliver a
  SIGKILL to the geo-rep session's aux glusterfs. (It gives ECONNABORTED
  non-deterministically, actually in the minority of cases.)

  Change-Id: I24fd8d0295cdba91d8b994057a1255ca8e2d1a67
  BUG: 764510
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3078
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-by: Vijay Bellur <vijay@gluster.com>
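The classification this commit extends can be sketched as a simple errno check. The names `SESSION_DOWN_ERRNOS` and `classify_failure` are hypothetical; the message string is the one quoted in the commit.

```python
import errno

# ENOTCONN: the aux glusterfs was already gone when the call started.
# ECONNABORTED: it died while the call was in progress.
SESSION_DOWN_ERRNOS = (errno.ENOTCONN, errno.ECONNABORTED)


def classify_failure(exc):
    """Return a log message for a lost session, or None for other errors."""
    if isinstance(exc, OSError) and exc.errno in SESSION_DOWN_ERRNOS:
        return "glusterfs session went down"
    # Unrecognized failures are left to the generic handler.
    return None
```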
* geo-rep / gsyncd: typo fix | Csaba Henk | 2012-05-19 | 1 | -1/+1
  fix topy.

  Change-Id: I84df3e850dd24d9e86713dfa401c603a84a81ca6
  BUG: 763302
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3375
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vijay@gluster.com>
* geo-rep / gsyncd: log sync failures as warnings | Csaba Henk | 2012-04-13 | 1 | -1/+1
  Syncing of certain files can fail naturally if changes happen on master
  (e.g. a file gets deleted). Therefore logging an error is misleading.

  Change-Id: I7b54904e5ec7c85e4e0fa1e330123d2c44c78ac5
  BUG: 764510
  Reported-by: Vijaykumar Koppad <vkoppad@redhat.com>
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3113
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijaykumar <vkoppad@redhat.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* Fix compiler warnings and typos from Debian build. | Jeff Darcy | 2012-04-10 | 1 | -1/+1
  Mostly to do with "-Werror=format-security" being buggy, but while we're
  here we might as well fix some typos and such. Credit goes to Patrick
  Matthäi <pmatthaei@debian.org> for pointing these out.

  Change-Id: Ia32d1111d7c10b1f213df85d86b17a1326248ffd
  BUG: 811387
  Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-on: http://review.gluster.com/3117
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep / gsyncd: shuffle directory entries in crawl | Csaba Henk | 2012-04-05 | 1 | -0/+2
  In order to randomize the walk of the file tree.

  Change-Id: I9fc3b83d5804914a50faae8df7dbcfed2ba6f4b4
  BUG: 809675
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/3079
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* cleanup and fix xattr namespace flip | Csaba Henk | 2012-03-07 | 1 | -1/+1
  - function of actual flipping made static
  - clean out references to particular namespaces from flipping logic
  - namespaces involved in flipping defined at single location
  - fix fnmatch(3) invocation with reversed pattern and string arguments
  - instead of "user", use "system" to flip from, because the latter is free
    from supervision of the VFS layer (cf. attr(5))

  Change-Id: I3cc5836fadcad5b237fd5c67d0dcaea63aee9164
  BUG: 798716
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/2890
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
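The reversed-argument bug fixed here was in a C call to fnmatch(3), whose order is (pattern, string). The same class of mistake is easy to make in Python, where the order is the opposite: `fnmatch.fnmatchcase(name, pattern)`. The sketch below only illustrates that pitfall; the helper name and the example xattr names are assumptions.

```python
import fnmatch


def xattr_matches(xattr_name, pattern):
    """True if xattr_name matches the glob pattern (case-sensitively)."""
    # Python takes the string first and the pattern second, the reverse of
    # C's fnmatch(pattern, string, flags).
    return fnmatch.fnmatchcase(xattr_name, pattern)
```

With the arguments swapped, the name is interpreted as a pattern and a literal "*" in the intended pattern has to match itself, so the call typically returns False without raising any error.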
* geo-rep / syncdaemon: determine suitable xattr namespace based on privilege | Csaba Henk | 2012-03-05 | 2 | -6/+7
  Change-Id: I91fe16d7e5e4c21f138eab4ee0b9334aec40e41b
  BUG: 765433
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/2838
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* geo-rep / syncdaemon: make the timeout for establishing the connection to slave configurable | Csaba Henk | 2012-03-01 | 2 | -1/+2
  It can be set through the connection-timeout tunable but we keep it
  hidden, intended as a workaround for some special scenarios, not for
  general use.

  Change-Id: I31f9fa3873afa7babc2106ee34484123a01bdc57
  BUG: 789078
  Signed-off-by: Csaba Henk <csaba@redhat.com>
  Reviewed-on: http://review.gluster.com/2839
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vijay@gluster.com>
* geo-rep: gsyncd: fix up fallback xtime for orphans on master side | Csaba Henk | 2012-02-07 | 1 | -7/+8
  Change-Id: I2fa543b4bd317e06ea621ae968300ffb7223a68a
  BUG: 771787
  Signed-off-by: Csaba Henk <csaba@gluster.com>
  Reviewed-on: http://review.gluster.com/2580
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Kaushik BV <kaushikbv@gluster.com>
  Reviewed-by: Vijay Bellur <vijay@gluster.com>
* geo-rep: gsyncd: Python3 compat fixes | Venky Shankar | 2012-01-26 | 1 | -3/+4
  Change-Id: I2eef82faab3eed1189e3786a5dca296773e1caa0
  BUG: 784498
  Signed-off-by: Venky Shankar <vshankar@redhat.com>
  Reviewed-on: http://review.gluster.com/2690
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-by: Csaba Henk <csaba@redhat.com>
* log to stderr if "-" is given as log-file | Csaba Henk | 2011-11-20 | 1 | -1/+1
  This works around broken /dev/stderr on some systems.

  Change-Id: I017b03082ff630c4a713ae74990e88b3fa20d0e1
  BUG: 3686
  Reviewed-on: http://review.gluster.com/560
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cli: add geo-replication log-rotate command | Venky Shankar | 2011-10-20 | 6 | -24/+66
  Rotating geo-replication master/monitor log files from cli.

  On invocation, the log file for a given master-slave session is backed up
  with the current timestamp suffixed to the file name and a signal is sent
  to gsyncd to start logging to a new log file.

  Sample commands:

  * Rotate log file for this <master>:<slave> session:
    gluster volume geo-replication <master> <slave> log-rotate

  * Rotate log files for all sessions for master volume <master>:
    gluster volume geo-replication <master> log-rotate

  * Rotate log files for all sessions:
    gluster volume geo-replication log-rotate

  Change-Id: I75f641b4e082a04d5373c18583ca4a1d9651d27a
  BUG: 3519
  Reviewed-on: http://review.gluster.com/529
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Csaba Henk <csaba@gluster.com>
* geo-rep: disallow some special characters in url syntax (Csaba Henk, 2011-09-22, 1 file, -1/+1)
|
| - space is disallowed to make the rsync target unambiguous for the gsyncd wrapper
| - *, ?, [ are disallowed so that we can tell globs apart from urls
|
| Nothing too bad would happen without these restrictions, but this
| way gluster errs out early instead of producing some mystical error
| further down the line.
|
| Change-Id: Idd4e68f7d91598a7a8e30ccbc6d395da570cdf2e
| BUG: 3610
| Reviewed-on: http://review.gluster.com/490
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vijay@gluster.com>
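An early check of this kind could be sketched as below. The function name `check_url` and the choice of `ValueError` are hypothetical; only the set of rejected characters (space, `*`, `?`, `[`) comes from the commit message.

```python
import re

# Characters the commit message lists as disallowed in a url.
_FORBIDDEN = re.compile(r"[ *?\[]")


def check_url(url):
    """Return url unchanged, or raise ValueError on a disallowed character."""
    match = _FORBIDDEN.search(url)
    if match:
        raise ValueError("disallowed character %r in url %r"
                         % (match.group(), url))
    return url
```

Rejecting the url up front gives the user one clear message instead of a failure later in rsync or glob handling.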
* geo-rep: add support to glob patterns with "geo-rep config" (Csaba Henk, 2011-09-22, 1 file, -3/+7)
|
| Change-Id: I0d54cea72e4363eab85ade774cc918081d8036e9
| BUG: 3610
| Reviewed-on: http://review.gluster.com/489
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vijay@gluster.com>
* geo-rep: implement IP address based access control (Csaba Henk, 2011-09-22, 3 files, -23/+64)
|
| - gsyncd gets an allow-network tunable, which is expected to hold
|   a comma-separated list of IP network addresses
| - for IP address matching, bring in the ipaddr module from Google
|   (http://code.google.com/p/ipaddr-py/, rev. trunk@225)
|
| This will let users control master's access to slave's volumes
| until we implement unprivileged geo-rep (delayed due to some
| technical issues).
|
| It's also needed for the completeness of our hardening efforts,
| as plain file slaves won't be able to work with an unprivileged
| gsyncd.
|
| Change-Id: I58431cba6592f8672e93ea89a5eef478905b00b9
| BUG: 2825
| Reviewed-on: http://review.gluster.com/488
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vijay@gluster.com>
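The matching described above can be illustrated with the sketch below. It uses Python 3's standard `ipaddress` module rather than the bundled ipaddr-py named in the commit, and the helper names `parse_allow_network` and `is_allowed` are hypothetical; the actual gsyncd code may differ.

```python
import ipaddress


def parse_allow_network(value):
    """Parse a comma-separated list of IP networks (the allow-network value)."""
    return [ipaddress.ip_network(item.strip(), strict=False)
            for item in value.split(",") if item.strip()]


def is_allowed(peer, networks):
    """Return True if the peer address lies within any of the networks."""
    addr = ipaddress.ip_address(peer)
    return any(addr in net for net in networks)
```

For instance, with `allow-network` set to `10.0.0.0/8, 192.168.1.0/24`, a master connecting from `10.1.2.3` would pass the check while one from `192.168.2.1` would not.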
* geo-rep: gsyncd: make sure path operations do not act outside the volume (Csaba Henk, 2011-09-22, 1 file, -0/+28)
|
| Change-Id: I2da62b34aa833b9a28728fa1db23951f28b7e538
| BUG: 2825
| Reviewed-on: http://review.gluster.com/462
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vijay@gluster.com>
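The commit carries no description of how the check is done. One common way to confine path operations to a directory tree is sketched below, purely as an assumption; the helper name `inside_volume` is hypothetical.

```python
import os


def inside_volume(root, path):
    """Return True if path, resolved relative to root, stays under root.

    Symlinks and ".." components are resolved before comparing, so a
    path cannot escape by way of either.
    """
    root = os.path.realpath(root)
    target = os.path.realpath(os.path.join(root, path))
    return os.path.commonpath([root, target]) == root
```

A caller would refuse to operate on any path for which this returns False, for example `../outside` or an absolute path elsewhere on the file system.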
* gsyncd: control rsync target (Csaba Henk, 2011-09-22, 1 file, -1/+1)
|
| - require/perform rsync invocation with unprotected args
|   (so that target is revealed to gateway program)
| - make use of some procfs wizardry to find gsyncd sibling
|   and match rsync target against its working directory
|
| Change-Id: Iae1e39b0e61f22563c0f2a2e0605567e0d1902df
| BUG: 2825
| Reviewed-on: http://review.gluster.com/461
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vijay@gluster.com>
* gsyncd: implement restricted mode and utility dispatch (Csaba Henk, 2011-09-22, 1 file, -0/+15)
|
| With this change, the suggested way of setting up a geo-sync
| slave is to use an ssh key with gsyncd as a forced command
| (see sshd(8)), or to set gsyncd as shell.
|
| This prevents the master from executing arbitrary commands on
| the slave (a major security hole).
|
| Detailed list of the changes:
|
| - All gsyncd invocations that are not done by glusterd are
|   considered unsafe, and then we operate in so-called
|   "restricted mode" (see below)
| - if we are invoked on purpose (ie. it's not the case that sshd
|   forced us to run as frontend of a remote-invoked command), we
|   execute gsyncd.py
| - if invoked by sshd as frontend command, we check the remote
|   command line and call the required utility if it's among the
|   allowed ones (rsyncd and gsyncd)
| - with rsync, we check that the invocation is in server mode and
|   apply some other sanity measures
| - with gsyncd, in restricted mode we enforce the usage of the
|   glusterd provided config file, and in python, we enforce
|   operation in server mode and some other sanity checks
|
| Impact on using geo-rep the old way: a remote file slave now also
| requires a running glusterd (to pick up config from).
|
| Missing: we have not implemented a check of the rsync target path.
|
| The issue of master being able to modify arbitrary locations is
| planned to be mitigated by using geo-rep with an unprivileged user.
|
| Change-Id: I9b5825bfe282a9ca777429aadd554d78708f1638
| BUG: 2825
| Reviewed-on: http://review.gluster.com/460
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vijay@gluster.com>
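The dispatch logic for the forced-command case can be sketched as follows. This is an illustration only: the function name `dispatch`, the use of `PermissionError`, and the Python rendering are assumptions, and the utility is matched here under the name `rsync`. The one external fact relied on is that sshd passes the client's requested command line to a forced command in the `SSH_ORIGINAL_COMMAND` environment variable.

```python
import os
import shlex

# Utilities a remote master may request (names assumed for illustration).
ALLOWED_UTILITIES = {"rsync", "gsyncd"}


def dispatch(original_command):
    """Validate a remote command line; return (utility, argv) if acceptable.

    original_command stands for the value of SSH_ORIGINAL_COMMAND.
    """
    argv = shlex.split(original_command)
    if not argv:
        raise PermissionError("empty remote command")
    utility = os.path.basename(argv[0])
    if utility not in ALLOWED_UTILITIES:
        raise PermissionError("utility %r is not allowed" % utility)
    if utility == "rsync" and "--server" not in argv[1:]:
        raise PermissionError("rsync must be invoked in server mode")
    return utility, argv
```

A wrapper installed as forced command would call something like `dispatch(os.environ["SSH_ORIGINAL_COMMAND"])` and only then execute the returned argument vector.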
* geo-rep: gsyncd: add --ignore-deletes option (Venky Shankar, 2011-09-20, 3 files, -3/+29)
|
| When this option is set, a file deleted on master will not
| trigger a delete operation on the slave. Hence, the slave
| will remain a superset of the master and can be used to
| recover the master in case of a crash and/or accidental deletes.
|
| This option is not enabled by default.
|
| Change-Id: I9244d9dfa4f38f19436036f36bec0d9c3a1f7993
| BUG: 3552
| Reviewed-on: http://review.gluster.com/426
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Csaba Henk <csaba@gluster.com>
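The effect of the option can be illustrated with a toy planner that compares two directory listings. The function `plan_sync` and its set-based comparison are hypothetical simplifications; gsyncd's actual crawl is driven by xtime attributes and is not reproduced here.

```python
def plan_sync(master_entries, slave_entries, ignore_deletes=False):
    """Return (to_copy, to_delete) for one directory.

    With ignore_deletes set, entries missing on the master are kept on
    the slave, so the slave stays a superset of the master.
    """
    master, slave = set(master_entries), set(slave_entries)
    to_copy = sorted(master - slave)
    to_delete = [] if ignore_deletes else sorted(slave - master)
    return to_copy, to_delete
```

With the option unset, an entry present only on the slave is scheduled for deletion; with it set, the delete list is always empty.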
* geo-rep: partial support for unprivileged gsyncd via mountbroker (Csaba Henk, 2011-09-12, 3 files, -32/+125)
|
| gsyncd:
| - mounting code is split to a direct and a mountbroker based backend
| - option gluster-command is gone
| - new options: gluster-params, gluster-cli-options, mountbroker
| - mountbroker mount backend is used if either a mountbroker label
|   is given through the mountbroker option, or if gsyncd is
|   unprivileged; in the latter case the username is used as label
| - have gluster cli invocations log to stderr so that we don't hit
|   a permission issue with the logfiles
|
| glusterd:
| - do gsyncd pre-config with new options
| - add option geo-replication-log-group, so that, if it is
|   specified, geo-rep logfile directories are given to that group
|   (and thus members of the given group can do logging there)
|
| This is just WIP as geo-rep relies on trusted extended attributes
| and those are not accessible for unprivileged users. Even if we
| solved this issue, glusterd security settings are too coarse, so
| that if we made it possible for an unprivileged gsyncd to operate,
| we would open up too far.
|
| Change-Id: Icd520b58cbadccea3fad7c0f437b99de1e22db14
| BUG: 2825
| Reviewed-on: http://review.gluster.com/399
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vijay@gluster.com>
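The backend selection rule stated in the gsyncd list above can be written down as a small decision function. The name `choose_mount_backend`, the return shape, and the reading of "unprivileged" as a non-zero effective uid are assumptions made for this sketch.

```python
def choose_mount_backend(mountbroker_label, euid, username):
    """Return (backend, label) following the rule in the commit message.

    An explicit mountbroker label wins; otherwise an unprivileged
    process (assumed here to mean euid != 0) uses its username as label.
    """
    if mountbroker_label:
        return ("mountbroker", mountbroker_label)
    if euid != 0:
        return ("mountbroker", username)
    return ("direct", None)
```

In a real process the last two arguments would typically come from `os.geteuid()` and the password database entry for that uid.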
* gsyncd: python3 compat fixes (Csaba Henk, 2011-09-12, 4 files, -5/+36)
|
| Also add __codecheck script which can verify if source is OK
| at the syntactical level with a given Python interpreter.
|
| Change-Id: Ieff34bcd3efd1cdc0e8f9a510c05488f35897bbe
| BUG: 1570
| Reviewed-on: http://review.gluster.com/320
| Reviewed-by: Kaushik BV <kaushikbv@gluster.com>
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
* gsyncd: do the homework, document _everything_ (Csaba Henk, 2011-09-08, 9 files, -17/+483)
|
| Change-Id: I559e6a0709b8064cfd54c693e289c741f9c4c4ab
| BUG: 1570
| Reviewed-on: http://review.gluster.com/319
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Kaushik BV <kaushikbv@gluster.com>
* gsyncd: refine command invocation (Csaba Henk, 2011-08-25, 4 files, -32/+124)
|
| Use subprocess module instead of os.spawn* / ad-hoc fork/exec.
|
| With this, we do now:
| - close unneeded files in children
| - watch children's stderr:
|   - have a thread which collects children's stderr into a ring
|     buffer (so that the stderr pipe doesn't get stuffed)
|   - on command failure show stderr
| - distinguish between rsync exit values, tolerate only partial errors
| - if connection is broken to slave, show ssh/slave gsyncd's stderr
|
| Change-Id: Ia92f57b5bdfa47f8c44375c50cf279006a0bf69b
| BUG: 2946
| Reviewed-on: http://review.gluster.com/85
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Tested-by: Kaushik BV <kaushikbv@gluster.com>
| Reviewed-by: Kaushik BV <kaushikbv@gluster.com>
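The stderr handling listed above can be sketched with the standard `subprocess`, `threading` and `collections` modules. The helper names `run_collecting_stderr` and `rsync_ok` are hypothetical, and the set of tolerated rsync statuses is an assumption: 23 and 24 are the "partial transfer" exit values documented in the rsync manual, but the commit does not say exactly which ones gsyncd accepts.

```python
import collections
import subprocess
import threading


def run_collecting_stderr(argv, keep_lines=100):
    """Run argv; return (exit status, last keep_lines lines of its stderr).

    A reader thread drains the pipe into a bounded deque (a ring buffer),
    so the child never blocks on a full stderr pipe.
    """
    ring = collections.deque(maxlen=keep_lines)
    proc = subprocess.Popen(argv, stderr=subprocess.PIPE, close_fds=True)

    def drain():
        for line in proc.stderr:
            ring.append(line.decode(errors="replace").rstrip("\r\n"))

    reader = threading.Thread(target=drain)
    reader.start()
    status = proc.wait()
    reader.join()
    proc.stderr.close()
    return status, list(ring)


def rsync_ok(status):
    """Treat success and rsync's partial-transfer statuses as tolerable
    (assumed set: 0, 23, 24)."""
    return status in (0, 23, 24)
```

On failure, a caller could log the returned tail of stderr, which keeps the most recent output without holding an unbounded amount of it in memory.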
* gsyncd: do some basic sanitization on logs (Csaba Henk, 2011-07-29, 5 files, -24/+60)
|
| - exceptions raised by us will be logged as single-line error
|   messages (full stack trace is shown only at DEBUG loglevel)
| - common/well understood exceptions are mapped to
|   "user-parsable" error logs
|
| Change-Id: I75f1fb848483372364b2093878d9cfed576c9739
| BUG: 2778
| Reviewed-on: http://review.gluster.com/125
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Anand Avati <avati@gluster.com>
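The first point above can be illustrated with the standard `logging` module. The exception class name `GsyncdError` and the helper `log_failure` are used here as assumed, illustrative names; the sketch shows the idea, not the committed code.

```python
import logging


class GsyncdError(Exception):
    """Marker class (assumed name) for errors raised deliberately by the tool."""


def log_failure(logger, exc):
    """Log deliberate errors as one line; include the trace otherwise.

    The full stack trace is attached when the logger is at DEBUG level,
    or when the exception is not one of our own.
    """
    if isinstance(exc, GsyncdError) and not logger.isEnabledFor(logging.DEBUG):
        logger.error("%s", exc)
    else:
        logger.error("FAIL: %s", exc,
                     exc_info=(type(exc), exc, exc.__traceback__))
```

At the default level a user sees one readable line per anticipated failure, while a developer who raises the log level to DEBUG still gets the stack trace.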
* geo-rep: minor fixes (Csaba Henk, 2011-07-29, 1 file, -1/+1)
|
| Change-Id: I5c5211858bdb2bd28324818362d95edd97f94207
| BUG: 2778
| Reviewed-on: http://review.gluster.com/81
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Kaushik BV <kaushikbv@gluster.com>