summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/afr/src
Commit message (Collapse)AuthorAgeFilesLines
* afr: launch index heal on local subvols up on a child-up eventRavishankar N2015-09-091-17/+11
| | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/11912/ Problem: When a replica's child goes down and comes up, the index heal is triggered only on the child that just came up. This does not serve the intended purpose as the list of files that need to be healed to this child is actually captured on the other child of the replica. Fix: Launch index-heal on all local children of the replica xlator which just received a child up. Note that afr_selfheal_childup() eventually calls afr_shd_index_healer() which will not run the heal on non-local children. Change-Id: I524fda17c28864758b35679cfb232f81f8374571 BUG: 1256245 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/11994 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/afr: Pick gfid from poststat during fresh lookup for read child ↵Krutika Dhananjay2015-07-176-41/+67
| | | | | | | | | | | | | | calculation Backport of : http://review.gluster.org/11373 Change-Id: I03f11af082a0decf4ea084480b67e9e156964c76 BUG: 1235601 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/11408 Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/afr: Do not attempt entry self-heal if the last lookup on entry ↵Krutika Dhananjay2015-06-181-1/+8
| | | | | | | | | | | | | | | | | | | | | failed on src Backport of: http://review.gluster.org/11119 Test bug-948686.t was causing shd to dump core due to gfid being NULL. This was due to the volume being stopped while index heal's in progress, causing afr_selfheal_unlocked_lookup_on() to fail sometimes on the src brick with ENOTCONN. And when afr_selfheal_newentry_mark() copies the gfid off the src iatt, it essentially copies null gfid. This was causing the assertion as part of xattrop in protocol/client to fail. Change-Id: I81723567af824ce4a9fa37e309eeeab8404ac71e BUG: 1233036 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/11309 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* afr: allow readdir to proceed for directories in split-brainv3.6.4beta2Ravishankar N2015-06-151-18/+22
| | | | | | | | | | | | | | | | | | | | | | | Problem: afr_read_txn() bails out if read_subvol==-1. This meant that for directories that were in entry split-brain, FOPS like readdir, access, stat etc were not allowed. Fix: Except for getxattr, all other FOPS are wound on the first up child of afr. Change-Id: Iacec8fbb1e75c4d2094baa304f62331c81a6f670 BUG: 1230242 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/10776 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> (cherry picked from commit 49b428433a03fcf709fdc8c08603b4cf02198e0a) Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/11162 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* afr: honour selfheal enable/disable volume set optionsRavishankar N2015-06-152-4/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/11012 Note: http://review.gluster.org/9459 is not backported to 3.6 but the change it makes to afr_get_heal_info() (i.e. handling ret values) is needed for heal info to work correctly and tests/basic/afr/client-side-heal.t to pass. -------------------------- afr-v1 had the following volume set options that are used to enable/ disable self-heals from happening in AFR xlator when loaded in the client graph: cluster.metadata-self-heal cluster.data-self-heal cluster.entry-self-heal In afr-v2, these 3 heals can happen from the client if there is an inode refresh. This patch allows such heals to proceed only if the corresponding volume set options are set to true. -------------------------- Change-Id: Iebf863758d902fd2f95be320c6791d4e15f634e7 BUG: 1230259 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/11170 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/afr: Treat op_ret >= 0 as success in afr_final_errno()Krutika Dhananjay2015-06-031-1/+1
| | | | | | | | | | | | Backport of: http://review.gluster.org/10946 Change-Id: I6ca36fee5dc46f2a2afe04f3359c4102f45c3ae0 BUG: 1225745 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10961 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* Tests: fix spurious failure in read-subvol-entry.tEmmanuel Dreyfus2015-05-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | read-subvol-entry.t tests that if a brick has pending operations, it is not used for readdir operations. On NetBSD this test exhibits spurious failures, with the wrong brick being used to perform readdir. It happens because when afr_replies_interpret() looks at xattr for pending attributes, it uses alternative bahvior whether it is working on a directory or another object. The decision is based on inode->ia_type, which may be IA_INVAL at that time if we come there from: afr_replies_interpret.() afr_xattrs_are_equal() afr_lookup_metadata_heal_chec() afr_lookup_entry_heal() afr_lookup_cbk() Using replies[i].poststat.ia_type, which is correctly set, works around the problem. Resubmitted as is after rebase. This is a backport of Id9ccdd8604f79a69db5f1902697f8913acac50ad BUG: 1138897 Change-Id: I73f5e04dec86e648a28363f417559b0cbf80324d Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10178 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/afr: serialize inode locksPranith Kumar K2015-03-252-60/+186
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/9372 Problem: Afr winds inodelk calls without any order, so blocking inodelks from two different mounts can lead to dead lock when mount1 gets the lock on brick-1 and blocked on brick-2 where as mount2 gets lock on brick-2 and blocked on brick-1 Fix: Serialize the inodelks whether they are blocking inodelks or non-blocking inodelks. Non-blocking locks also need to be serialized. Otherwise there is a chance that both the mounts which issued same non-blocking inodelk may endup not acquiring the lock on any-brick. Ex: Mount1 and Mount2 request for full length lock on file f1. Mount1 afr may acquire the partial lock on brick-1 and may not acquire the lock on brick-2 because Mount2 already got the lock on brick-2, vice versa. Since both the mounts only got partial locks, afr treats them as failure in gaining the locks and unwinds with EAGAIN errno. BUG: 1189023 Change-Id: If5dd502d9d25d12425749a8efcf08a1423b29255 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9576 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/afr: Make read child match check in afr optionalKrutika Dhananjay2015-03-253-0/+21
| | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/9917 Having this particular check which was introduced by commit c57c455347a72ebf0085add49ff59aae26c7a70d causes a drop in performance in readdirp. So the behavior is made configurable with this patch. Change-Id: I4a19813cfc786504340264a5a5533a0c43a1d4a4 BUG: 1202673 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9929 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* afr: remove stale index entriesRavishankar N2015-03-252-3/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/9714 Problem: During pre-op phase, the index xlator 1. Creates the entry inside .glusterfs/indices/xattrop 2. Winds the xattrop fop to posix to mark dirty/pending changelogs. If the brick crashes after 1, the xattrop entry becomes stale and never gets removed by shd during subsequent crawls because there is nothing to heal (changelogs are zero). Though the stale entry does not get displayed in the output of 'heal info' command, it nevertheless stays there forever unless a new write tansaction is performed on the file. Fix: During index self-heal if afr xattrs are found to be clean (indicated by ret value of 2 on a call to afr_shd_selfheal(), send a dummy post-op with all 0s for the xattr values, which makes the index xlator to unlink the stale entry. Change-Id: Iffb171e40490abd8d44df09ccc058b5da67baafe BUG: 1203081 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/9920 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/afr: Convert quota size from n/w to host order before useKrutika Dhananjay2015-03-161-0/+4
| | | | | | | | | | | | | | Backport of: http://review.gluster.org/9853 Change-Id: I83f1ab16a2dc54841e7beff3033333fba009b3a4 BUG: 1201622 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9884 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/afr: Handle getxattr of quota-size keyPranith Kumar K2015-03-143-61/+103
| | | | | | | | | | | | | | | Backport of http://review.gluster.org/9820 Afr needs to query QUOTA_SIZE_KEY from all the subvolumes and return the value which is maximum of the readable bricks. BUG: 1201624 Change-Id: I41725a7323020c1480c38560dc5ae2c2e82d6d47 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9873 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/afr: Do not increment healed_count if no healing was performedKrutika Dhananjay2015-03-146-26/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/9713 PROBLEM: When file modifications are happening while index heal is launched, index healer could pick up entries which appeared in indices/xattrop transiently during the course of the operations on the mount point, and do not really need any heal. This will cause index healer to keep doing index-heal in a loop as long as it finds this entry, by believing that it did successfully heal some gfids even when it didn't. FIX: afr_selfheal() now returns a 1 to indicate that it did not (need to) heal a given gfid. afr_shd_selfheal() will not increment healed_count whenever afr_selfheal() returns a 1. Change-Id: I9158c814419b635fac3dfe2fe40c94d1548ea4e8 BUG: 1194306 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9852 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* afr: stop encoding subvolume id in readdir d_offAnand Avati2015-03-033-131/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/9332 The purpose of encoding d_off in AFR is to indicate the selected subvolume for the first readdir, and continue all further readdirs of the session on the same subvolume. This is required because, unlike files, dir d_offs are specific to the backend and cannot be re-used on another subvolume. The d_off transformation encodes the subvolume id and prevents such invalid use of d_offs on other servers. However, this approach could be quite wasteful of precious d_off bit-space. Unlike DHT, where server id can change from entry to entry and thus encoding the server id in the transformed d_off is necessary, we could take a slightly relaxed approach in AFR. The approach is to save the subvolume where the last readdir request was sent in the fd_ctx. This consumes constant space (i.e no per-entry cache), and serves the purpose of avoiding d_off "misuse" (i.e using d_off from one server on another). The compromise here is NFS resuming readdir from a non-0 cookie after an extended delay (either anonymous FD has been reclaimed, or server has restarted). In such cases a subvolume is picked freshly. To make this fresh picking more deterministic (i.e, to pick the same subvolume whenever possible, even after reboots), the function afr_hash_child (used by afr_read_subvol_select_by_policy) is modified to skip all dynamic inputs (i.e PID) for the case of directories. BUG: 1191537 Change-Id: I7e3bd8dfe346a9a8e428d7ddeada6cfb66e64e54 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/9638 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/afr: Crawl should continue on self-heal failuresPranith Kumar K2015-02-221-3/+2
| | | | | | | | | | | | | | | | | | | | Problem: If self-heal fails on the last entry of readdir response list, index heal stops. Fix: Continue crawl even with partial failures. PS: Bug seems to exist only on 3.6.2, on master the code is fine after http://review.gluster.com/9485 Change-Id: Ie39b0d424297e3c95a05cbe72438dfd9a4d5696d BUG: 1192522 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9653 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* afr : Fixes to 59ba78ae1461651e290ce72013786d828545d4c1Anuradha2015-02-111-0/+4
| | | | | | | | | | | | | | | | | | | A bug was found while reviewing http://review.gluster.org/9377/ (inode was not being unreffed after use). It is fixed on master as a part of the following change - http://review.gluster.org/#/c/9377/7/xlators/cluster/afr/src/afr-common.c . Fixing the same issue in 3.6 with this patch. Change-Id: Ibdc4be49d88613ccb1f80349eb4d368710c0c24b BUG: 1173528 Signed-off-by: Anuradha <atalur@redhat.com> Reviewed-on: http://review.gluster.org/9450 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/afr: When parent and entry read subvols are different, set ↵Krutika Dhananjay2015-02-112-23/+94
| | | | | | | | | | | | | | | | | entry->inode to NULL Backport of: http://review.gluster.org/#/c/9477 That way a lookup would be forced on the entry, and its attributes will always be selected from its read subvol. Change-Id: Iaba25e2cd5f83e983fc8b1a1f48da3850808e6b8 BUG: 1186119 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9562 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* afr: Don't write to sparse regions of sink.Ravishankar N2015-02-031-2/+39
| | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/9480 Problem: When data-self-heal-algorithm is set to 'full', shd just reads from source and writes to sink. If source file happened to be sparse (VM workloads), we end up actually writing 0s to the corresponding regions of the sink causing it to lose its sparseness. Fix: If the source file is sparse, and the data read from source and sink are both zeros for that range, skip writing that range to the sink. Change-Id: Id23d953fe2c8c64cde5ce3530b52ef91a7583891 BUG: 1187547 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/9515 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* afr : glfs-heal implementationAnuradha2015-01-0610-38/+400
| | | | | | | | | | | | | Backport of http://review.gluster.org/6529 and http://review.gluster.org/9119 Change-Id: Ie420efcb399b5119c61f448b421979c228b27b15 BUG: 1173528 Signed-off-by: Anuradha <atalur@redhat.com> Reviewed-on: http://review.gluster.org/9335 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/afr: Make entry-self-heal in afr-v2 compatible with afr-v1Pranith Kumar K2015-01-041-1/+28
| | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/9227 Problem: entry self-heal in 3.6 and above, takes full lock on the directory only for the duration of figuring out the xattrs of the directories where as 3.5 takes locks through out the entry-self-heal. If the cluster is heterogeneous then there is a chance that 3.6 self-heal is triggered and then 3.5 self-heal will also triggered and both the self-heal daemons of 3.5 and 3.6 do self-heal. Fix: In 3.6.x and above get an entry lock on a very long name before entry self-heal begins so that 3.5 entry self-heal will not get locks until 3.6.x entry self-heal completes. BUG: 1177418 Change-Id: Iecf49d794c6b480e38563e39599a40067b3a21cb Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9355 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* Avoid spurious directory metedata split brainEmmanuel Dreyfus2014-12-231-1/+115
| | | | | | | | | | | | | | | | | | | | | When directory content is modified, [mc]time is updated. On Linux, the filesystem does it, while at least on NetBSD, the kernel file-system independant code does it. This means that when entries are added while bricks are down, the kernel sends a SETATTR [mc]time which will cause metadata split brain for the directory. In this case, clear the split brain by finding the source with the most recent modification date. Backport of: Ic0177e0df753a4748624d0b906834ed54593adb9 BUG: 1138897 Change-Id: Ic2e697be4f0074e2bbd3f23d6ad40a2d2d126a2c Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/9319 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* telldir()/seekdir() portability fixesEmmanuel Dreyfus2014-12-201-6/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POSIX says that an offset obtained from telldir() can only be used on the same DIR *. Linux is abls to reuse the offset accross closedir()/opendir() for a given directory, but this is not portable and such a behavior should be fixed. An incomplete fix for the posix xlator was merged in http://review.gluster.org/8933 This change set completes it. - Perform the same fix index xlator. - Use appropriate casts and variable types so that 32 bit signed offsets obtained by telldir() do not get clobbered when copied into 64 bit signed types. - modify afr-self-heald.c so that it does not use anonymous fd, since this will cause closedir()/opendir() between each syncop_readdir(). On failure we fallback to anonymous fs only for Linux so that we can cope with updated client vs not updated brick. - Avoid sending an EINVAL when the client request for the EOF offset. Here we fix an error in previous fix for posix xlator: since we fill each directory entry with the offset of the next entry, we must consider as EOF the offset of the last entry, and not the value of telldir() after we read it. This is a backport of I59fb7f06a872c4f98987105792d648141c258c6a BUG: 1138897 Change-Id: I1e9f3e4a7d780b98adf6d9f197ee2198d43ef94d Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/9084 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/afr: Eliminate locking in sh domain in metadata self-healKrutika Dhananjay2014-12-091-35/+2
| | | | | | | | | | | | | Backport of: http://review.gluster.org/9240 Change-Id: I2ab2ad9a02d88c299cfb32e0cf6baa44d1c2ee12 BUG: 1171077 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9246 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Raghavendra Bhat <raghavendra@redhat.com>
* features/marker: Filter internal xattrs in lookupPranith Kumar K2014-11-163-7/+21
| | | | | | | | | | | | | | Backport of http://review.gluster.org/9061 Afr should ignore quota-size-key as part of self-heal but should heal quota-limit key. BUG: 1163569 Change-Id: I93d203002eac4fe20b70730c27c852d783c16d7f Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9110 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Preserve errno in case of failures on all subvolsPranith Kumar K2014-11-161-5/+16
| | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/8984 Problem: When quorum is enabled and the fop fails on all the subvolumes, op_errno is set to EROFS which overrides the actual errno returned from bricks. Fix: Don't override the errno when fop fails on all subvols. Change-Id: I61e57bbf1a69407230ec172a983de18d1c624fd2 BUG: 1162122 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9087 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Perform post-op in entry selfheal inside locksKrutika Dhananjay2014-10-311-3/+31
| | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/9020 Take entrylks in xlator domain before doing post-op (undo-pending) in entry self-heal. This is to prevent a parallel name self-heal on an entry under @fd->inode from reading pending xattrs while it is being modified by SHD after entry sh below, given that name self-heal takes locks ONLY in xlator domain and is free to read pending changelog in the absence of the following locking. Change-Id: I0bc92978efc0741d6e3f2439540d008e31472313 BUG: 1136821 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9030 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* logs: Do selective logging for errnosPranith Kumar K2014-10-221-2/+3
| | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/8918 http://review.gluster.org/8955 Problem: Just after replace-brick the mount logs are filled with ENOENT/ESTALE warning logs because the file is yet to be self-healed now that the brick is new. Fix: Do conditional logging for the logs. ENOENT/ESTALE will be logged at lower log level. Only when debug logs are enabled, these logs will be written to the logfile. Change-Id: If203d09e2479e8c2415ebc14fb79d4fbb81dfc95 BUG: 1155027 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8957 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Add afr-v1 xattr compatibilityPranith Kumar K2014-10-226-83/+330
| | | | | | | | | | | | | | | Backport of http://review.gluster.org/8536 All the special cases v1 handles and also self-accusing pending changelog from v1 pre-op also is handled in this patch. BUG: 1155017 Change-Id: I86cf6b80492be5c1f240c74f91a0e1b0dd9b58b2 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8956 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Fix invalid seekdir() usagev3.6.0beta3Emmanuel Dreyfus2014-09-301-3/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to POSIX, seekdir() should only be given offset obtained from telldir() on the same DIR * http://pubs.opengroup.org/onlinepubs/9699919799/functions/seekdir.html Code from afr-self-heald.c and index.c is operating outside of the specification, by doing using seekdir() with offset from a previously open/close/re-open directory. This seems to work on Linux (although with no guarantee it will always in the future). On NetBSD the seekdir() with a in invalid offset is a nilpotent operation, and causes an infinite loop, since index_fill_readdir() always restart from the beginning of the directory. The situation is fixed by using a non anonymous fd in afr-self-heald.c: we explicitely open the directory so that it remains open on the brick side during the timeframe where we want to reuse offsets in seekdir(). This requires adding an opendir fop in index xlator. If the brick was not updated, the opendir will fail and we fallback to the standard violating approach for backward compatibility on Linux. On other systems we fail since it never worked. While there, add tests to check seekdir() success in index and posix xlators, so that incorrect usage from calling code produce an explicit error instead of an infinite loop. We can only do it on non Linux systems, for the sake of backward compatibility when the brick was updated but not the client. Backport of I88ca90acfcfee280988124bd6addc1a1893ca7ab BUG: 1138897 Change-Id: I5446a9a17d5451ec5aab8fbd10d381da9a0a23ad Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8860 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr : Fix incorrect looping of index healerAnuradha2014-09-291-4/+9
| | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/8868 Sending appropriate return value from afr_selfheal() fixes the issue. Credits : Krutika Dhananjay and Pranith Kumar. Change-Id: I1dc8105078e99dbc12295ef557a407bf3d8cfec3 BUG: 1147486 Signed-off-by: Anuradha <atalur@redhat.com> Reviewed-on: http://review.gluster.org/8884 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Launch self-heal only when all the brick status is knownPranith Kumar K2014-09-291-2/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: File goes into split-brain because of wrong erasing of xattrs. RCA: The issue happens because index self-heal is triggered even before all the bricks are up. So what ends up happening while erasing the xattrs is, xattrs are erased only on the sink brick for the brick that it thinks is up leading to split-brain Example: lets say the xattrs before heal started are: brick 2: trusted.afr.vol1-client-2=0x000000020000000000000000 trusted.afr.vol1-client-3=0x000000020000000000000000 brick 3: trusted.afr.vol1-client-2=0x000010040000000000000000 trusted.afr.vol1-client-3=0x000000000000000000000000 if only brick-2 came up at the time of triggering the self-heal only 'trusted.afr.vol1-client-2' is erased leading to the following xattrs: brick 2: trusted.afr.vol1-client-2=0x000000000000000000000000 trusted.afr.vol1-client-3=0x000000020000000000000000 brick 3: trusted.afr.vol1-client-2=0x000010040000000000000000 trusted.afr.vol1-client-3=0x000000000000000000000000 So the file goes into split-brain. BUG: 1142612 Change-Id: I0c8b66e154f03b636db052c97745399a7cca265b Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8756 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Fix inode leakKrutika Dhananjay2014-09-291-0/+2
| | | | | | | | | | | | Backport of: http://review.gluster.org/8875 Change-Id: Ib000be1238d38f8d63ff25b3873bb813bf72beec BUG: 1145914 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/8876 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: More dict_t leak fixesKrutika Dhananjay2014-09-262-33/+68
| | | | | | | | | | | | Backport of: http://review.gluster.org/8852 Change-Id: I2c927092dc5a834fabdd3495a7f7a3527604a6d2 BUG: 1136831 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/8869 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Fix locking issues in entry self-healKrutika Dhananjay2014-09-261-92/+123
| | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/8837/ Original reporter of the bug & designer of the solution: Pranith Kumar K <pkarampu@redhat.com> Change-Id: I6e36326e1d9398ede82166b358ab438367dd3011 BUG: 1136829 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/8845 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Fix spurious metadata self-healsv3.6.0beta2Pranith Kumar K2014-09-247-29/+86
| | | | | | | | | | | | | | | Backport of http://review.gluster.org/8709 - Added logging for metadata and data self-heals which helped in debugging this issue. - Added checks to skip self-heals when no sinks are available to heal BUG: 1145987 Change-Id: Ide03af4f531a1280ec8ad95b627285df4d7bc42d Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8832 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Fixed mem leaks in self-heal code path.Anuradha2014-09-242-1/+17
| | | | | | | | | | | | | | | | | backport of: http://review.gluster.org/8821 AFR_STACK_RESET previously didn't cleanup afr_local_t, leading to memory leaks. With this patch, cleanup is done. All credit goes to Pranith Kumar Karampuri. Change-Id: I26506dfd9273b917eff5127c3e0cf9421e60f228 BUG: 1145914 Reviewed-on: http://review.gluster.org/8831 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Do not reset pending xattrs on gfid or type mismatch in entry-shKrutika Dhananjay2014-09-231-18/+79
| | | | | | | | | | | | Backport of: http://review.gluster.org/8816 Change-Id: I8463a579f542a2336b02edba8f5fbfea0edbbffe BUG: 1136829 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/8823 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Don't start heal when lookup succeeds on < 2 childrenPranith Kumar K2014-09-236-8/+29
| | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/8698 Problem: When self-heal code doesn't see at least 2 successes on looking up children, then self-heal can't be done. What is happening now is if all the lookups fail then the pending changelog is all zeros in xattrs so all the children are becoming sources and leading to crashes when the code paths further assume that some data structures are populated properly Fix: Don't proceed with self-heals when < 2 children succeed lookups. BUG: 1145726 Change-Id: I65465843f0e554c8ccdd8fa930ab42ac123ec023 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8824 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Set all the xattrs needed by index xlatorAnuradha2014-09-212-41/+30
| | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/8652 Index xlator removes the index file from indices xattrop directory in case the value for keys sent are zero. If all the required keys are not set by afr then index file might be removed in an invalid way. With this change all the keys required by index xlator are set by afr such that invalid removal of files does not occur. Change-Id: I1b77904920c8566057415c52242179aec6a015e2 BUG: 1144744 Signed-off-by: Anuradha <atalur@redhat.com> Reviewed-on: http://review.gluster.org/8788 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: perform list-xattr during lookupRavishankar N2014-09-194-11/+222
| | | | | | | | | | | | | Detect and heal mismatching user extended attributes during lookup. Backport of: http://review.gluster.org/8558 Change-Id: Id03c9746f083ffd3014711d0b3a2e5a71a45eed4 BUG: 1144274 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/8773 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr : Mark pending changelog xattrs for new creationsAnuradha2014-09-187-90/+148
| | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/8555 Based on type of file, set appropriate pending changelogs for new entries. Change-Id: Icf9af866fe9a9e511210e8ad097e968e2307d8ee BUG: 1141787 Reviewed-on: http://review.gluster.org/8555 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Signed-off-by: Anuradha <atalur@redhat.com> Reviewed-on: http://review.gluster.org/8748 Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Propagate EIO on inode's type mismatchKrutika Dhananjay2014-09-167-121/+332
| | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/8574, and http://review.gluster.org/8586 Original author of the test script: Pranith Kumar K <pkarampu@redhat.com> Change-Id: I0c32bdd8e666f8175c0a8fbf940934e6ce469931 BUG: 1136830 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/8706 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Fix dict_t leaksKrutika Dhananjay2014-09-168-35/+64
| | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/8557/ dict_t objects that are ref'd in alloca'd "replies" in afr_replies_copy() are not unref'd before "replies" go out of scope. Change-Id: I9bb45bc673ec13292ac96dda060aceb48739ebe8 BUG: 1136831 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/8704 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Handle EAGAIN properly in inodelkPranith Kumar K2014-09-152-14/+130
| | | | | | | | | | | | | | | | | Problem: When one of the brick is taken down and brough back up in a replica pair, locks on that brick will be allowed. Afr returns inodelk success even when one of the bricks already has the lock taken. Fix: If any brick returns EAGAIN return failure to parent xlator. BUG: 1142020 Change-Id: Iee3f5990be75e10f8accec9bc3856e3f76d1593c Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8744 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Fix all locked_on bricks are sinks check in self-healsPranith Kumar K2014-09-125-82/+66
| | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/8456 Problem: Counts may give wrong results when the number of bricks is > 2. If the locks are acquired on one source and sink, but the source accuses even the down sink then there will be 2 sinks and lock is acquired on 2 bricks so even when there is a clear source and sink **_finalize_source functions think the file/directory is in split-brain. Fix: Check that all the bricks which are locked are sinks. BUG: 1136829 Change-Id: I56a8f9ff261bdeec8c441237c485036141b6f00d Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8593 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Perform metadata sync inside metadata locksPranith Kumar K2014-09-121-19/+15
| | | | | | | | | | | | | Backport of http://review.gluster.org/8514 BUG: 1136825 Change-Id: I480a0dc0dfff8bcff084c9b3f048c5b355683f73 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8590 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Perform gfid heal inside locks.Pranith Kumar K2014-09-126-27/+69
| | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/8512 Problem: Allowing lookup with 'gfid-req' will lead to assigning gfid at posix layer. When two mounts perform lookup in parallel that can lead to both bricks getting different gfids leading to gfid-mismatch/EIO for the lookup. Fix: Perform gfid heal inside lock. BUG: 1136823 Change-Id: I1059fcd38338348b7e96a0bd4c35b0f49bf77aec Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8589 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/changelog: Capture "correct" internal FOPsVenky Shankar2014-09-082-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes changelog capturing internal FOPs in a cascaded setup, where the intermediate master would record internal FOPs (generated by DHT on link()/rename()). This is due to I/O happening on the intermediate slave on geo-replication's auxillary mount with client-pid -1. Currently, the internal FOP capturing logic depends on client pid being non-negative and the presence of a special key in dictionary. Due to this, internal FOPs on an inter-mediate master would be recorded in the changelog. Checking client-pid being non-negative was introduced to capture AFR self-heal traffic in changelog, thereby breaking cascading setups. By coincidence, AFR self-heal daemon uses -1 as frame->root->pid thereby making is hard to differentiate b/w geo-rep's auxillary mount and self-heal daemon. BUG: 1138952 Change-Id: Ia08a2cfa3b02bb785f343794f5b2695d44398c4c Original-Author: Venky Shankar <vshankar@redhat.com> Signed-off-by: Kotresh H R <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8347 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8638
* cluster/afr: Set pending changelog based on filetype for new entriesPranith Kumar K2014-09-071-4/+12
| | | | | | | | | | | Backport of http://review.gluster.org/8506 BUG: 1136822 Change-Id: Ia864040306405acf9ebddabf63e87dc2016372dd Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8588 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Fix mem-leakPranith Kumar K2014-08-191-2/+5
| | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/8457 Problem: local->xattr_req is already reffed with xattr_req that comes in lookup fop. But when afr_lookup_xattr_req_prepare is called local->xattr_req is over-written with dict_new() which leads to ref leak on the dict which came in lookup fop Fix: Create local->xattr_req only when it is NULL BUG: 1128801 Change-Id: I4a1065add7700317e0cd3d0dda0a91e12d77e340 Reviewed-on: http://review.gluster.org/8460 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>