summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/afr/src/afr.h
Commit message (Collapse)AuthorAgeFilesLines
* replicate: remove checks which prevented self-heal when open fds were presentAnand V. Avati2010-10-181-1/+0
| | | | | | | | | | | this check is not needed anymore since the introduction of changelog piggybacking as the optimization technique instead of first-write-to-flush technique Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1235 (Bug for all pump/migrate commits) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1235
* replicate: replace first-write-to-flush optimizationAnand V. Avati2010-10-181-17/+10
| | | | | | | | | | | use a changelog piggybacking optimization instead of first-write-to-flush optimization and do other cleanups (removal of post-post-op hook etc.) Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1235 (Bug for all pump/migrate commits) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1235
* cluster/afr: Return correct flock structures correctly in lk fops.Pavan Sondur2010-07-021-1/+2
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 1042 (Use correct flock structures in lk fops) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1042
* cluster/afr: Set lk-owner to pid when fuse does not supply it.Pavan Sondur2010-06-291-0/+3
| | | | | | | | | | | | | | | | Use the frame->root address as lk-owner when FUSE does not supply lk_owner. Raghu, I unit tested this patch with dbench and self heal tests. Did not observe lk-owner=0 in any server logs. Can you verify this patch with the other tests you had run today? Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 1032 (Set lock-owner with pid when fuse does not supply value) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1032
* Unset split-brain flags in afr_self_heal_completion_cbk if self heal ↵Simone Gotti2010-06-011-1/+1
| | | | | | | | | | | | | | | completes successfully. Signed-off-by: Simone Gotti <simone.gotti@gmail.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 884 (I/O errors after fixed split brain and successfully completed self heal) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=884 Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 884 (I/O errors after fixed split brain and successfully completed self heal) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=884
* cluster/afr: Cleanup fd ctx in releasedir cbkVijay Bellur2010-04-071-0/+3
| | | | | | | | Signed-off-by: Vijay Bellur <vijay@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 805 (memory leak in afr) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=805
* cluster/afr: Failover readdir calls.Vikas Gorur2010-03-041-0/+7
| | | | | | | | | | | | | | | | This patch makes the replicate readdir call fail over to the next subvolume if the first call fails. It takes care to ensure that entries are not duplicated. The failover behavior of readdir only comes into effect if the option 'strict-readdir' is on. Signed-off-by: Vikas Gorur <vikas@dev.gluster.com> Signed-off-by: root <root@client02.(none)> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 453 (afr_readdir does not fail over) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=453
* afr: fix memory leaksAnand Avati2009-12-041-0/+3
| | | | | | | | Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* storage/posix: Added janitor thread.Vikas Gorur2009-12-021-1/+0
| | | | | | | | | | | | | | | | | | The janitor thread deletes all files and directories in the "/" GF_REPLICATE_TRASH_DIR directory. This directory is used by replicate self-heal to dump files and directories it deletes. This is needed because letting replicate walk the directory tree and delete a directory and all its children is too racy. Instead, replicate self-heal only does an atomic rename(), and the janitor thread takes care of actually deleting them. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 227 (replicate selfheal does not remove directory with contents in it) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=227
* cluster/afr: Set the self-heal "source" as read subvolume even when not ↵Vikas Gorur2009-12-011-0/+1
| | | | | | | | | | | | | | doing self-heal. This patch sets the read-subvolume equal to the self-heal "source" even if we're not doing self-heal (because some one else is already doing it). Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 320 (Improve self-heal performance) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
* cluster/afr: Preserve generation number along with inode in lookup and ↵Vikas Gorur2009-11-301-0/+6
| | | | | | | | | | | | | creation fops. This fixes fuse_create_cbk conflict warnings and random errors while running dbench (typically open handle failure with ENOENT). Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 315 (generation number support) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=315
* cluster/afr: Do self-heal on unopened fds.Vikas Gorur2009-11-251-0/+4
| | | | | | | | | | | | | | This patch completes the previous patch for self-heal of open fds in replicate. If an fd was never opened on a subvolume, we remember that and do the open after we've done self-heal on that fd. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* cluster/afr: Refactored the self-heal interface.Vikas Gorur2009-11-241-17/+31
| | | | | | | | | | Cleaned up the self-heal interface to callers. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* cluster/afr: Do self-heal on reopened fds.Vikas Gorur2009-11-241-1/+29
| | | | | | | | | | | | | | | | | | | | | | | This patch brings in partial support for self-heal of open fds. The precondition is that the fd should have been opened successfully during the initial open() (or create()), and we assume that protocol/client has successfully reopened the fd when the subvolume comes back up. It works by doing an "up/down flush" (a dummy flush transaction to do post-op wherever necessary) and then triggering data self-heal on the file in the post-post-op hook of the dummy flush transaction. This ensures that any writes that come in during self-heal will wait until self-heal completes. The up/down flush is also done when a subvolume goes down, so that post-op is done on all subvolumes where pre-op was done. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* cluster/afr: Provide a post-post_op hook in the transaction.Vikas Gorur2009-11-241-0/+3
| | | | | | | | Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* cluster/afr: Refactored the data self-heal algorithm.Vikas Gorur2009-11-241-0/+6
| | | | | | | | | | | | | | | | | | Refactored the operation of the data self-heal algorithm as: * open all fd's (if fd not supplied by caller) * lock 0-0 (if lock not supplied by caller) * fxattrop, fstat (instead of lookup) ... self heal ... * unlock (if lock not supplied by caller) * close (if fd not supplied by caller). Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* cluster/afr: Hold blocking locks for data self-heal.Vikas Gorur2009-11-241-0/+1
| | | | | | | | | | | Data self-heal now holds blocking locks, and instead of locking on all subvolumes, it only locks on {data-lock-server-count} subvolumes. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 170 (Auto-heal fails on files that are open()-ed/mmap()-ed) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=170
* cluster/afr: Fix inode context bitmasks.Vikas Gorur2009-11-241-2/+2
| | | | | | | | | | | Set opendir_done and split_brain flags correctly in the inode context. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 249 (Self heal of a file that does not exist on the first subvolume) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=249
* cluster/afr: Fix handling of revalidate lookups.Vikas Gorur2009-11-241-0/+1
| | | | | | | | | | | | | | | | This patch does two things related to revalidate: 1) If a revalidate fails on any subvolume, the entire lookup call is failed. 2) Self-heal is not triggered on a revalidate if revalidate has failed on any subvolume. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 389 (auto-heal fails randomly and causes "Stale NFS file handle" errors) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=389
* cluster/afr: Ensure directory contents are in sync during opendir.Vikas Gorur2009-11-131-0/+10
| | | | | | | | | | | | | | | | | | | | | | The problem: If some files on the first subvolume disappeared without leaving a trace in the entry changelog (this can happen, for example, when an fsck has deleted files or when a hard drive is replaced), those files would never be self-healed even though they would be present on the second subvolume. This is because readdir is sent only to the first subvolume, and since the files don't appear in the directory listing, no lookup would ever be sent on them. This patch fixes this problem by doing a readdir on all the subvolumes during the first opendir on a directory inode. If a discrepancy in the contents is detected, entry self-heal in a special "force merge" mode is triggered on that directory. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 249 (Self heal of a file that does not exist on the first subvolume) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=249
* cluster/afr: Don't try to self-heal if there are locks heldVikas Gorur2009-10-301-0/+3
| | | | | | | | | | | If the inodelk_count or entrylk_count is positive on a file/directory, don't try to do self-heal on it. Signed-off-by: Vikas Gorur <vikas@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 326 ([2.0.8rc9] Spurious self-heal) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=326
* cluster/afr: Move deleted files to /.trash in entry self-heal.Vikas Gorur2009-10-291-0/+1
| | | | | | | | | | | | | | | | | If entry self-heal determines that a file/directory should be deleted from a subvolume, move that entry to a directory called "/.trash" on that subvolume. This is for two reasons: 1) It limits the damage that can be done by a "wrong" entry self-heal. 2) It solves the problem of a to-be-deleted directory not being empty. Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 227 (replicate selfheal does not remove directory with contents in it) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=227
* cluster/afr: Remove struct's for [f]chown, [f]chmod, utimens from afr_local_t.Vikas Gorur2009-10-271-37/+0
| | | | | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 146 (Add setattr FOP) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=146
* cluster/afr: NFS-friendly logic changesVikas Gorur2009-10-271-6/+42
| | | | | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 145 (NFSv3 related additions to 2.1 task list) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=145
* cluster/afr: Check the target of symlink's in entry self-heal.Vikas Gorur2009-10-261-0/+3
| | | | | | | | | | During entry self-heal, make sure not only that a symlink exists on all subvolumes, but also that their targets match. Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 193 (symlink contents not self-healed by replicate) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=193
* cluster/afr: Do self-heal in the background.Vikas Gorur2009-10-261-0/+12
| | | | | | | | | | | | | | This patch introduces a new option "background-self-heal-count", with a default value of 16. This means that upto {background-self-heal-count} number of files/directories will be healed in the background at any given time. If such number of self-heals are already in progress, further self-heals take place in the foreground. Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 320 (Improve self-heal performance) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
* cluster/afr: Pipeline the "full" data self-heal read-write loop.Vikas Gorur2009-10-231-2/+4
| | | | | | | | | | | | | Start upto "data-self-heal-window-size" instances of the read-write loop of the "full" data self-heal algorithm simultaneously. Add a new option "data-self-heal-window-size" with range [1-1024], and a default value of 16. Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 320 (Improve self-heal performance) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=320
* cluster/afr: Set mtime of parent directory in self-heal properly.Vikas Gorur2009-10-131-0/+1
| | | | | | | | | | | While creating/deleting an entry as part of entry self-heal, set the parent directory's mtime to match that on the source subvolume. Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 137 (Parent directory mtime not reset after a create in self-heal) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=137
* prevent spurious unlocks from afr selfhealAnand Avati2009-10-131-0/+1
| | | | | | | | | | afr selfheal now remembers all the nodes on which locks were successfully held and sends unlocks only to those nodes Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 112 (parallel deletion of files mounted by different clients on the same back-end hangs and/or does not completely delete) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=112
* Changed occurrences of Z Research to Gluster.Vijay Bellur2009-10-071-1/+1
| | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* cluster/afr: Change STACK_UNWIND to STACK_UNWIND_STRICT.Vikas Gorur2009-10-071-2/+2
| | | | | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 269 (Add a specialized STACK_UNWIND macro for each FOP) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=269
* cluster/afr: Initialize local->first_up_child in AFR_LOCAL_INIT.Vikas Gorur2009-10-051-3/+29
| | | | | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 285 ("first up child" can change during a transaction) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=285
* Global: Introduce setattr and fsetattr fopsShehjar Tikoo2009-10-011-0/+16
| | | | | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 146 (Add setattr FOP) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=146
* cluster/afr: dir-write: Fix inode number handling.Vikas Gorur2009-09-281-0/+1
| | | | | | | | | | | | | | | | | | | create, mkdir, symlink, mknod: Prefer to return itransform'd inode number from the first_up_child. If not, fall back on any other child that returned succcess. link, rename: Return the same inode number that was passed as part of loc_t. Also adds a new member to afr_local_t, local->first_up_child which is initialized at the start of the transaction. This fixes the race where a subvolume might go down during the transaction and thus have the first_up_child change. Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 285 ("first up child" can change during a transaction) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=285
* cluster/afr: Add new option "data-self-heal-algorithm"Vikas Gorur2009-09-221-1/+3
| | | | | | | | | | option: data-self-heal-algorithm type: string default: "full" This option allows the user to specify the algorithm to be used for data self-heal. Currently supported values are "full" and "diff". Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* cluster/afr: Make the self-heal algorithm pluggable.Vikas Gorur2009-09-221-0/+6
| | | | | | | | Abstract the read/write loop part of data self-heal. This patch has support for the "full" (i.e., read and write entire file) algorithm. Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* cluster/afr: Return same inode number in stat buf for readv_cbkVikas Gorur2009-07-271-0/+1
| | | | | | | | | | Remember the inode number that had been returned in lookup_cbk and set the stat buf->ino to the same. Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 166 (libglusterfsclient: Cached stat buf inode is different from ino in inode_t) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=166
* Return stat from read subvolume in dir-write ops.Vikas Gorur2009-07-161-0/+6
| | | | | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 138 (create family calls do not return stat buf from read child) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=138
* Return stat info from read-child in all the inode-write opsVikas Gorur2009-07-161-0/+10
| | | | | | | | | | | Also modifies the inode-write ops to wait for the call to read-child to return (whether success or failure) before unwinding. Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 125 (stat information not returned from the same subvolume always) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=125
* Return inode number always from the first up subvolume in AFR.Vikas Gorur2009-07-151-0/+1
| | | | | | | | | | Also fixes a bug in the "KLUDGE" part. It was setting lookup_buf when it should have been setting local->cont.lookup.buf Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 116 (Replicate: Need inode number from first subvolume on fresh lookup) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=116
* Use original pid when calling the FOP in afr transaction.Vikas Gorur2009-04-161-1/+3
| | | | | | | | Save the original pid while locking and restore it after the FOP is done. This ensures posix-locks can release locks (fcntl) properly. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Changed xattr format of afr changelog to support adding and removing of ↵Vikas Gorur2009-04-161-7/+46
| | | | | | subvolumes while keeping existing data. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* update cluster/afr with new readv writev prototypesAnand V. Avati2009-04-121-1/+1
| | | | Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Fix in changelog logic.Vikas Gorur2009-04-071-0/+9
| | | | | | If a writev fails, remember it by marking it in the fd context. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Consider a subvolume dead if an fop fails on itVikas Gorur2009-04-071-0/+2
| | | | | | | | | Transaction fops earlier called afr_transaction_child_died only if an fop failed due to ENOTCONN or EBADFD. Now they consider a child dead regardless of the reason for failure. This handles cases such as ENOSPC. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Made afr inode context a 64-bit packed value instead of a structure.Vikas Gorur2009-04-031-16/+6
| | | | Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Load balance read operations among subvolumes in afrVikas Gorur2009-04-021-2/+6
| | | | Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Defined afr_inode_ctx_t structure.Vikas Gorur2009-04-021-0/+17
| | | | | | | | Notification of a split-brain situation, which was earlier signalled by the mere presence of inode context is now signalled by the 'split_brain' member in the structure. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* updated copyright header to extend copyright upto 2009Basavanagowda Kanur2009-02-261-1/+1
| | | | | | updated copyright header to include 2009. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Added all filesVikas Gorur2009-02-181-0/+523