path: root/xlators
Commit message | Author | Age | Files | Lines
...
* cluster/dht: Fix subvol check, to correctly determine cached file rename | Shyam | 2014-11-17 | 1 | -1/+1
  The check that treats a rename as a critical failure ignored the case where the cached file is being renamed to the new name because the new name falls on the same subvol as the cached file. This is in addition to the case where the target of the rename does not exist.

  The current change is simpler. The rename logic renames the cached file when the target exists and falls on the same subvol as the source name, OR when the target does not exist and the hash of the target falls on the same subvol as the source cached file. These conditions mean we are renaming the source; any other condition means we are renaming the source linkto file, which we do not want to treat as a critical failure (and we also instruct marker that it is an internal FOP and not to account for it).

  Change-Id: I4414e61a0d2b28a429fa747e545ef953e48cfb5b
  BUG: 1161156
  Signed-off-by: Shyam <srangana@redhat.com>
  Reviewed-on: http://review.gluster.org/9063
  Reviewed-by: N Balachandran <nbalacha@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: susant palai <spalai@redhat.com>
  Reviewed-by: venkatesh somyajulu <vsomyaju@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* client: writev,fsync to use correct rsp structure | Rudra Siva | 2014-11-16 | 1 | -2/+2
  Presently writev_cbk and fsync_cbk pass truncate_rsp for decoding. This does not create any problems today because the structures are structurally the same, but should they diverge in the future this could show up as a bug.

  Change-Id: Id7da7b6a20f468ca943ceb7926de64b7692f7ec8
  BUG: 1164559
  Signed-off-by: Rudra Siva <rudrasiva11@gmail.com>
  Reviewed-on: http://review.gluster.org/9134
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* client: pass xflags for unlink | Jeff Darcy | 2014-11-15 | 1 | -0/+1
  Nobody seems to use these currently, but I tried to for some debugging, and that led to a few head-scratches before I figured out that it wasn't being passed across the server/client boundary. Might as well fix it before somebody tries to use it for real and has to go through the same exercise.

  Change-Id: Ieddfac106103db02fdf488c86f3f979d29a6ab83
  BUG: 1158614
  Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-on: http://review.gluster.org/8287
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* ec: Avoid self-heal on directories on (f)stat calls | Xavier Hernandez | 2014-11-15 | 1 | -1/+2
  To avoid inconsistent directory listings, a full self-heal cannot happen on a directory until all its contents have been healed. This is controlled by a manual command using getfattr recursively and in post-order.

  While navigating the directories, an (f)stat fop can sometimes be sent. This fop caused a full self-heal of the directory. This patch makes (f)stat initiate only a partial self-heal.

  Change-Id: I0a92bda8f4f9e43c1acbceab2d7926944a8a4d9a
  BUG: 1163760
  Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-on: http://review.gluster.org/9117
  Reviewed-by: Dan Lambright <dlambrig@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* gNFS : make it possible to mount a subdir that actually is a symlink | jiffin | 2014-11-14 | 1 | -1/+143
  We are using the function to export all sub-directories in a gluster volume via NFS. For real directories it works fine, but if we have a symbolic link which points to a directory, it is not possible to mount that directory via NFS using the name of the link. Kernel NFS resolves a symlink handle to a directory handle; similarly, gluster NFS should resolve the symbolic link handle into a directory handle.

  Change-Id: I8bd07534ba9474f0b863f2335b2fd222ab625dba
  BUG: 1157223
  Signed-off-by: jiffin tony thottan <jthottan@redhat.com>
  Reviewed-on: http://review.gluster.org/9052
  Reviewed-by: soumya k <skoduri@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* uss/snapd: Handle readlink fops on snap view server | Avra Sengupta | 2014-11-14 | 1 | -0/+1
  Handle readlink fops in case of symlinks on snap view server

  BUG: 1162462
  Change-Id: Ia08e9e9c1c61e06132732aa580c5a9fd5e7c449b
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/9102
  Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: Validate the options of uss | vmallika | 2014-11-14 | 2 | -7/+15
  Change-Id: Id13dc4cd3f5246446a9dfeabc9caa52f91477524
  BUG: 1111554
  Signed-off-by: Varun Shastry <vshastry@redhat.com>
  Signed-off-by: vmallika <vmallika@redhat.com>
  Reviewed-on: http://review.gluster.org/8133
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* posix: Changed order of chown and chmod | Venkatesh Somyajulu | 2014-11-14 | 1 | -7/+7
  Problem: The rebalance process runs as root. If a normal user creates a file and it requires migration, the migrated file is created by root, so its owner must be changed to the source user and its permissions changed to the previous mode. If the suid bit is set on the source, it should also be set at the destination.

  Two operations were performed in the given order:
  1. chmod
  2. chown

  But chown resets the suid bit. The order of these two operations is changed so that chown is performed first and chmod second, which preserves the suid bit.

  Change-Id: Ib63b5cf528f8336b69bf090ad43bb02eec1d1602
  BUG: 1086228
  Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
  Reviewed-on: http://review.gluster.org/7435
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
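The ordering described in the chown/chmod commit above can be sketched in a few lines of C. This is only an illustration, not the posix xlator code: `apply_owner_then_mode` is a hypothetical helper name, and the sketch assumes the standard POSIX `chown()` and `chmod()` calls.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not the actual GlusterFS function): set ownership
 * first and the mode second, because chown() may clear the set-user-ID
 * bit that a preceding chmod() had just set. */
int apply_owner_then_mode(const char *path, uid_t uid, gid_t gid, mode_t mode)
{
    if (chown(path, uid, gid) != 0)
        return -1;              /* errno is set by chown() */
    if (chmod(path, mode) != 0)
        return -1;              /* errno is set by chmod() */
    return 0;
}
```

Calling it with a mode such as `S_ISUID | 0755` leaves the suid bit in place, since nothing touches the ownership after the mode has been applied.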
* Portability fix: mount.glusterfs | Emmanuel Dreyfus | 2014-11-13 | 1 | -2/+3
  Remove bash-specific syntax from mount.glusterfs

  BUG: 1129939
  Change-Id: Iec3a52686f7cee1825ac5a06c11fb8ac4d3e5d65
  Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
  Reviewed-on: http://review.gluster.org/9044
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/snapshot: Don't append nouuid mount option for snapshot brick | vmallika | 2014-11-13 | 2 | -1/+37
  if original brick already has this option

  Change-Id: I2841d2ac371a3e9505f6061f35d1d447946c0bae
  BUG: 1133456
  Signed-off-by: vmallika <vmallika@redhat.com>
  Reviewed-on: http://review.gluster.org/8526
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* gNFS: Allow reading ACLs even without read permissions on the file. | Meghana Madhusudhan | 2014-11-13 | 2 | -9/+79
  When root-squash is enabled or when no permissions are given to a file, NFS threw permission errors. According to the kernel-nfs behaviour, no permissions are required to read ACLs.

  When no ACLs are set, the system call sys_lgetxattr fails and returns an ENODATA error. This translates to an ESERVERFAULT error in NFS. Fuse makes an exception for this error and returns success. Similar changes are made here to achieve the expected behaviour.

  Change-Id: I46b8f5911114eb087a3f8ca4e921b6b41e83f3b3
  BUG: 1161092
  Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com>
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/9085
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* uss/gluster: creating file/directories inside .snaps should fail with | vmallika | 2014-11-13 | 4 | -27/+37
  EROFS

  When an attempt is made to create files or directories inside .snaps, it fails with the wrong error message, "Stale file handle". It should fail with "Read-only file system".

  Change-Id: I3a812a0afc4762cbb71ab180b9394c866e576a66
  BUG: 1159840
  Signed-off-by: vmallika <vmallika@redhat.com>
  Reviewed-on: http://review.gluster.org/9039
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/snapshot: Snapshot should be deactivated when it is created | vmallika | 2014-11-12 | 2 | -134/+203
  By default snapshot should be deactivated and this should be a configurable option.

  This behaviour can be configured by the command below:
    gluster snapshot config activate-on-create <enable|disable>

  Change-Id: I1911595c32beed43bb2fca4bf99f0d264b422513
  BUG: 1157991
  Signed-off-by: vmallika <vmallika@redhat.com>
  Reviewed-on: http://review.gluster.org/8985
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Reviewed-by: Kaushal M <kaushal@redhat.com>
* glusterd/snapshot: Check if LVM device path exists before delete. | Avra Sengupta | 2014-11-12 | 1 | -46/+59
  Check if the LV is present before deleting the LV. In case the LV is absent (already deleted?), the snap delete operation need not fail.

  Also check if the LV is mounted before trying umount. If it isn't mounted, only remove the LV.

  Change-Id: I0f5b2674797299d8748c6fac5b091f0caba65ca4
  BUG: 1104714
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/8954
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* uss/gluster: Move all uss related logs into subfolder | vmallika | 2014-11-12 | 2 | -6/+14
  For USS we have one snapd log per volume and one snap log per snapshot of the volume.

  For example, if there are 4 volumes having 256 snaps each and USS is enabled, then the total number of logs under /var/log/glusterfs for USS would be 1028.

  Total logs = (4(snapd per volume) + 4(volumes)*256(snaps)) = 1028

  Hence, it makes sense to move them into a sub-folder structure like
  /var/log/glusterfs/snaps/<vol-name>/<snapd + snaps logs>

  Change-Id: I29262e6458c3906916923cd67d1145d6ae10bec3
  BUG: 1160534
  Signed-off-by: vmallika <vmallika@redhat.com>
  Reviewed-on: http://review.gluster.org/9050
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
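The log-count arithmetic and the per-volume directory layout from the USS logging commit above can be checked with a small C sketch. Both function names are hypothetical, and the path prefix is taken from the commit message rather than from the glusterd source.

```c
#include <stdio.h>
#include <stddef.h>

/* One snapd log per volume plus one log per snapshot of each volume. */
unsigned long uss_log_count(unsigned long volumes, unsigned long snaps_per_volume)
{
    return volumes + volumes * snaps_per_volume;
}

/* Build the per-volume log directory named in the commit message.
 * Returns 0 on success and -1 if the result did not fit in buf. */
int uss_log_dir(char *buf, size_t len, const char *volname)
{
    int n = snprintf(buf, len, "/var/log/glusterfs/snaps/%s", volname);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With 4 volumes and 256 snapshots each, `uss_log_count(4, 256)` gives 4 + 4 * 256 = 1028, matching the figure quoted in the message.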
* USS : Display only the activated snapshots | Sachin Pandit | 2014-11-12 | 3 | -41/+68
  Instead of displaying all the snapshots in the uss world, it is better if we display only the activated snapshots.

  Change-Id: I70d3ec212b62ec15956ae3e826bc4201d8dedd17
  BUG: 1155042
  Signed-off-by: Sachin Pandit <spandit@redhat.com>
  Reviewed-on: http://review.gluster.org/8958
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* ec: Correctly handle quota xattrs | Xavier Hernandez | 2014-11-12 | 1 | -0/+53
  Change-Id: I35e11d83c318210d44b918e847cf13db35b01510
  BUG: 1158008
  Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-on: http://review.gluster.org/8990
  Reviewed-by: Dan Lambright <dlambrig@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Cluster/DHT : Rebalance skipped file count fix | Nithya Balachandran | 2014-11-11 | 1 | -4/+5
  The return value in dht_migrate_file is used to indicate the status of the file migration. This value was being masked by the lock operation, causing the skipped and failure file counts to be incorrectly calculated.

  Change-Id: Ice3d2f5d57766e18aa52659f22a76867d188dc65
  BUG: 1161518
  Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
  Reviewed-on: http://review.gluster.org/9070
  Reviewed-by: susant palai <spalai@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/marker: Filter internal xattrs in lookup | Pranith Kumar K | 2014-11-11 | 4 | -39/+98
  Afr should ignore quota-size-key as part of self-heal but should heal quota-limit key.

  Change-Id: Ic0b06bd20a563a00d6bfdc2dc5a76c661e533ecb
  BUG: 1161106
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/9061
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/snapshot: mount snapshot volume with read-only option | vmallika | 2014-11-10 | 1 | -0/+5
  Snapshot volumes are read-only. If you mount the volume on a client it doesn't allow writes, but its attributes are rw, which contradicts the functionality. The mount script should set read-only attributes for snapshot volumes.

  Change-Id: I056253abd8dfe7b2b43a064fbdbd9c16b8eca679
  BUG: 1132946
  Signed-off-by: vmallika <vmallika@redhat.com>
  Reviewed-on: http://review.gluster.org/8518
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* inode: Handle '/' in basename in inode_link/unlink | Pranith Kumar K | 2014-11-07 | 1 | -1/+1
  Problem: inode_link is sometimes called with a trailing '/'. Lookup and dentry operations like link/unlink/mkdir/rmdir/rename come without a trailing '/', so the stale dentry with '/' remains in the dentry list of the inode.

  Fix: Add assert checks and return NULL for '/' in bname. Fix ancestry building code to call without '/' at the end.

  Change-Id: I9c71292a3ac27754538a4e75e53290e182968fad
  BUG: 1158751
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/9004
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
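A minimal C sketch of the kind of basename check the inode_link fix above describes. The helper names are hypothetical and this is not the libglusterfs code; treating a NULL name as invalid is an assumption made only for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: a basename used for a dentry must not contain '/'.
 * Treating NULL as invalid is an assumption of this sketch. */
bool bname_is_valid(const char *bname)
{
    return bname != NULL && strchr(bname, '/') == NULL;
}

/* Hypothetical helper for a caller that may hold a name with trailing
 * slashes: returns the length of the name without them. */
size_t bname_len_without_trailing_slash(const char *name)
{
    size_t len = strlen(name);
    while (len > 0 && name[len - 1] == '/')
        len--;
    return len;
}
```

A caller in the style of the ancestry-building code would trim the name first and only then link it, so a dentry such as `dir/` is never stored next to `dir`.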
* feature/gfid-access: Always send setattr down in overloaded setxattr. | Kotresh HR | 2014-11-07 | 1 | -3/+0
  Problem: File ownership is not being preserved for root in a geo-rep mountbroker setup.

  Analysis and Cause: Entry creation for geo-rep is overloaded in ga_setxattr. It happens in two phases: entry creation, followed by setattr to preserve ownership as on the master. If the uid and gid of the file being synced are root, setattr was not being sent down. Since file creation happens as a non-root user in a mountbroker geo-rep setup, file ownership is not preserved for root unless setattr is done explicitly.

  Solution: Always pass setattr down in overloaded ga_setxattr.

  Change-Id: I062215c1b2379d515f28ec7f271077ad37182c7e
  BUG: 1104954
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  Reviewed-on: http://review.gluster.org/9051
  Reviewed-by: Aravinda VK <avishwan@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
  Tested-by: Venky Shankar <vshankar@redhat.com>
* api: versioned symbols in libgfapi.so for compatibility | Kaleb S. KEITHLEY | 2014-11-07 | 1 | -0/+10
  Use versioned symbols to keep libgfapi at libgfapi.so.0.0.0

  Some nits uncovered:

  + there are a couple functions declared that do not have an associated definition, e.g. glfs_truncate(), glfs_caller_specific_init()

  + there are seven private/internal functions used by heal/src/glfsheal and the gfapi master xlator (glfs-master.c): glfs_loc_touchup(), glfs_active_subvol(), and glfs_subvol_done(), glfs_init_done(), glfs_resolve_at(), glfs_free_from_ctx(), and glfs_new_from_ctx(); which are not declared in glfs.h;

  + for this initial pass at versioned symbols, we use the earliest version of all public symbols, i.e. those for which there are declarations in glfs.h or glfs-handles.h. Further investigation as we do backports to 3.6, 3.4, and 3.4 will be required to determine if older implementations need to be preserved (forward ported) and their associated alias(es) and symbol version(s) defined.

  FWIW, we should consider linking all of our libraries with a map, it'll result in a cleaner ABI. Perhaps something for an intern to do or a Google Summer of Code project.

  Change-Id: I499456807a5cd26acb39843216ece4276f8e9b84
  BUG: 1160709
  Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
  Reviewed-on: http://review.gluster.org/9036
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Tested-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* barrier: Correct gfid in statedump of barriered fops | ggarg | 2014-11-06 | 2 | -14/+145
  In the brick statedump file, the gfid of a barriered fop showed as 0 when the statedump was taken, because the statedump code was not referring to the correct gfid.

  With this change the statedump code uses the correct gfid, so the gfid is no longer 0 in the statedump file when barrier is enabled and the user takes a statedump of the volume.

  Change-Id: Ia296cba7e132402df53c602daa160c1c2cd21245
  BUG: 1099369
  Reviewed-on: http://review.gluster.org/7893
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Kaushal M <kaushal@redhat.com>
* glusterd : release cluster wide locks in op-sm during failures | Atin Mukherjee | 2014-11-06 | 4 | -69/+183
  The glusterd op-sm infrastructure has some loopholes in handling error cases in the locking/unlocking phases, which end up leaving stale locks that prevent further transactions from going through.

  This patch still doesn't handle all possible unlocking error cases, as the framework has neither a retry mechanism nor a lock timeout. For example, if unlocking fails on one of the peers, the cluster wide lock is not released and no further transaction can be made until the originator node or the node where unlocking failed is restarted.

  The following test cases were executed (with the help of gdb) after applying this patch:

  * RPC times out in lock cbk
  * Decoding of RPC response in lock cbk fails
  * RPC response is received from unknown peer in lock cbk
  * Setting peerinfo in dictionary fails while sending lock request for first peer in the list
  * Setting peerinfo in dictionary fails while sending lock request for other peers
  * Lock RPC could not be sent for peers

  For all the above test cases, the success criterion is not to have any stale locks.

  Change-Id: Ia1550341c31005c7850ee1b2697161c9ca04b01a
  BUG: 1154635
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-on: http://review.gluster.org/9012
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Kaushal M <kaushal@redhat.com>
* glupy: portability fixes | Emmanuel Dreyfus | 2014-11-05 | 3 | -13/+34
  Fixes portability problems so that NetBSD passes tests/features/glupy.t

  - Use python-config to detect the python build environment on all systems, not just Linux and Darwin.
  - Get the site-package directory from python and make sure we install glupy.py there. Previously we installed within the glusterfs prefix, which caused a problem if it was different than python's prefix.
  - Set PYTHONPATH for tests so that the detected site-packages is used in python's search path. This should be useless, but let us have it just in case.
  - Pass the glupy.so path from glusterfsd to glupy.py through an environment variable and use it in CDLL instead of "", as the latter seems not portable (at least it fails on NetBSD).
  - Use gil_init_key pthread_getspecific to avoid deadlocks (that code was #ifdef out, perhaps because it was not needed on Linux, but it seems to be required for NetBSD).
  - Recover the error message from Python and send it to the logs to help debugging problems.

  BUG: 1129939
  Change-Id: Icc71e77d6940f0759cc14c5c5cf7ca6fa431e0d2
  Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
  Reviewed-on: http://review.gluster.org/8978
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* uss/gluster: Fix typo error in the description for USS under "gluster volume | vmallika | 2014-11-05 | 1 | -1/+1
  set help"

  gluster volume set help for uss shows "User Servicable Snapshots" whereas it should be "User Serviceable Snapshots"

  Change-Id: I3cc8b3ea2cb6d209e1a12678eb7d0e68f4160d99
  BUG: 1160236
  Signed-off-by: vmallika <vmallika@redhat.com>
  Reviewed-on: http://review.gluster.org/9041
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* storage/posix: Treat ENODATA/ENOATTR as success in bulk removexattr | Pranith Kumar K | 2014-11-05 | 1 | -0/+14
  Bulk remove xattr is an internal fop in gluster. Some of the xattrs may have special behavior. Ex: removexattr("posix.system_acl_access") removes more than one xattr on the file, and those xattrs could also be present in the bulk-removal request. Removexattr of these already-deleted xattrs will fail with either ENODATA or ENOATTR. Since all this fop cares about is removal of the xattrs in the bulk-remove request, it can be treated as success if they are already deleted.

  Change-Id: Id8f2a39b68ab763ec8b04cb71b47977647f22da4
  BUG: 1160509
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/9049
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
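The error handling described in the bulk removexattr commit above amounts to a small predicate. The sketch below is illustrative only: `bulk_removexattr_ok` is a hypothetical name, and mapping `ENOATTR` to `ENODATA` where the platform's `<errno.h>` does not define it is an assumption of this sketch.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumption for this sketch: where <errno.h> has no separate ENOATTR
 * (as on Linux), fall back to ENODATA. */
#ifndef ENOATTR
#define ENOATTR ENODATA
#endif

/* Hypothetical predicate: in a bulk removal, an xattr that is already
 * gone counts as removed, so ENODATA/ENOATTR is not a failure. */
bool bulk_removexattr_ok(int ret, int err)
{
    if (ret == 0)
        return true;
    return err == ENODATA || err == ENOATTR;
}
```

Any other errno, such as `EPERM`, still fails the request.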
* cluster/afr: Preserve errno in case of failures on all subvols | Pranith Kumar K | 2014-11-05 | 1 | -5/+16
  Problem: When quorum is enabled and the fop fails on all the subvolumes, op_errno is set to EROFS which overrides the actual errno returned from bricks.

  Fix: Don't override the errno when fop fails on all subvols.

  Change-Id: I61e57bbf1a69407230ec172a983de18d1c624fd2
  BUG: 1157976
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/8984
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-by: Harshavardhana <harsha@harshavardhana.net>
  Tested-by: Harshavardhana <harsha@harshavardhana.net>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* xlator/io-stat: Check and copy loc->path | Shyam | 2014-11-05 | 1 | -3/+9
  In cases where loc->path is NULL, the current code in create/open/mkdir would copy it blindly and coredump as a result. This is a preventive fix for the coredump.

  The reason for loc->path to be NULL in certain cases is yet to be determined. One such case is when resolve_loc_touchup fails to get inode_path due to loops in the inode table.

  Change-Id: Ic2ddf2cc9f2acaf9b939afc11afd193b4402ee7c
  BUG: 1159221
  Signed-off-by: Shyam <srangana@redhat.com>
  Reviewed-on: http://review.gluster.org/9029
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/quota: Use per-volume log file for crawler | Krutika Dhananjay | 2014-11-03 | 1 | -6/+8
  Change-Id: I195b3309bae7e684b7dbf771e4f3b4778d0dac4c
  BUG: 1146377
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/8843
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* Cluster/DHT : Fixed crash due to null deref | Nithya Balachandran | 2014-11-03 | 1 | -2/+3
  A lookup on a linkto file whose trusted.glusterfs.dht.linkto xattr points to a subvol that is not part of the volume can cause the brick process to segfault due to a null dereference. Modified to check for a non-null value before attempting to access the variable.

  Change-Id: Ie8f9df058f842cfc0c2b52a8f147e557677386fa
  BUG: 1159571
  Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
  Reviewed-on: http://review.gluster.org/9034
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: venkatesh somyajulu <vsomyaju@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/geo-rep: Fix glusterd crash in non-originator slave node. | Kotresh HR | 2014-11-02 | 1 | -0/+1
  Problem: glusterd crashes in non-originator slave node during geo-rep create push-pem.

  Cause: In glusterd_op_copy_file, the value of the key "common_pem_contents" is freed explicitly even after dict_set succeeds, although at that point it is taken care of by dict_free.

  Solution: Free only in failure cases before dict_set.

  Change-Id: I65b5f32ee2b946107ad279b1fe3d728ec699bc7e
  BUG: 1159119
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  Reviewed-on: http://review.gluster.org/9018
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Poornima G <pgurusid@redhat.com>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* rebalance: ``check_free_space`` should ignore quota_statfs | Harshavardhana | 2014-10-31 | 1 | -10/+33
  quota_statfs() returns aggregated details of the space usage of bricks. This confuses distribute during ``rebalance``, where ``statfs()`` values are used to schedule file migration.

  We can make sure the values of ``statfs`` are from individual bricks by selectively instructing ``quota_statfs()`` to return non-aggregated values.

  Change-Id: I1397faeee66a1b9c26709cfda693286d227a4170
  BUG: 1158262
  Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
  Reviewed-on: http://review.gluster.org/8996
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: N Balachandran <nbalacha@redhat.com>
  Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Perform post-op in entry selfheal inside locks | Krutika Dhananjay | 2014-10-31 | 1 | -3/+31
  Take entrylks in xlator domain before doing post-op (undo-pending) in entry self-heal. This is to prevent a parallel name self-heal on an entry under @fd->inode from reading pending xattrs while it is being modified by SHD after entry sh below, given that name self-heal takes locks ONLY in xlator domain and is free to read pending changelog in the absence of the following locking.

  Change-Id: Ie083ceab10155c460447f04bdce7688480f1ac4f
  BUG: 1128721
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/9020
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: add option support for own-thread | Jeff Darcy | 2014-10-30 | 1 | -0/+12
  Like enabling SSL, enabling own-thread has to be done separately for clients and servers.

  * client.own-thread for clients (including internal like self-heal)
  * server.own-thread for servers (including e.g. glusterd)

  It's very unlikely that you would ever want to set one without the other, but they're separate anyway just in case. Check for "private polling thread" in the relevant logs to make sure the option took effect, because otherwise you might not notice any difference besides increased performance. ;)

  Change-Id: Ifaee8de52f0b959bcdf7f6b56faeee549ee56604
  BUG: 1158648
  Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-on: http://review.gluster.org/8931
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* features/snapview-server: verify the fs instance in revalidated lookups as well | Raghavendra Bhat | 2014-10-30 | 1 | -7/+53
  Change-Id: Id5f9d5a23eb5932a0a53520b08ffba258952e000
  BUG: 1151004
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/8999
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Store rebalance state on all peers | Kaushal M | 2014-10-29 | 1 | -1/+11
  The rebalance state was being saved only on the peers participating in the rebalance on a rebalance start. This change makes sure all nodes save the rebalance state.

  Change-Id: I436e5c34bcfb88f7da7378cec807328ce32397bc
  BUG: 1157979
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/8998
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* Avoid spurious EINVAL in posix_readdir() | Emmanuel Dreyfus | 2014-10-29 | 2 | -3/+19
  On non-Linux systems, we check that seekdir() succeeds and we return EINVAL if it does not. We need this to avoid infinite loops if some other component in GlusterFS makes an invalid seekdir() usage. This was introduced in this change: http://review.gluster.org/#/c/8760/

  But seekdir() also fails when using the offset returned for the last entry, and this is expected behavior. As a result, the seekdir() test produces a spurious EINVAL when reaching the end of the directory. That error is not propagated to the calling process, but it may harm internal GlusterFS processing. At the least, it produces a spurious error message in the brick's log.

  We fix the problem by remembering the last entry offset in fd private data. When a new posix_readdir() invocation requests that offset, we avoid returning EINVAL.

  BUG: 1129939
  Change-Id: I4e67a2ea46538aae63eea663dd4aa33b16ad24c7
  Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
  Reviewed-on: http://review.gluster.org/8926
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* feature/changelog: Fix changelog missing SETATTR entries. | Kotresh HR | 2014-10-29 | 2 | -0/+25
  Problem: Valid SETATTR entries are missing in the changelog when more than one metadata operation happens on the same inode within the changelog roll-over time.

  Cause: Metadata entries with fop num GF_FOP_NULL are logged in the changelog, which is of no use. Slice version checking is done for metadata entries to avoid logging subsequent entries of the same inode that fall into the same changelog, so if the entry with GF_FOP_NULL is logged first, subsequent valid ones will be missed.

  Solution: Have a boundary condition to log only those fops whose fop number falls between GF_FOP_NULL and GF_FOP_MAXVALUE.

  Change-Id: Iff585ea573ac5e521a361541c6646225943f0b2d
  BUG: 1104954
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  Reviewed-on: http://review.gluster.org/8964
  Reviewed-by: Aravinda VK <avishwan@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
  Tested-by: Venky Shankar <vshankar@redhat.com>
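The boundary condition from the changelog commit above can be illustrated with a stand-in enum. The `DEMO_FOP_*` values are hypothetical placeholders for `GF_FOP_NULL` and `GF_FOP_MAXVALUE`, not the real GlusterFS definitions, and reading "between" as exclusive on both ends is an assumption of this sketch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the GlusterFS fop enum: only the two
 * sentinels matter here, the members in between are placeholders. */
enum demo_fop {
    DEMO_FOP_NULL = 0,
    DEMO_FOP_STAT,
    DEMO_FOP_SETATTR,
    DEMO_FOP_MAXVALUE
};

/* Log only fops strictly between the two sentinels (assumed exclusive). */
bool fop_is_loggable(int fop)
{
    return fop > DEMO_FOP_NULL && fop < DEMO_FOP_MAXVALUE;
}
```

Under this check an entry carrying the null fop is never written, so it cannot take the slot of a later valid SETATTR entry for the same inode.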
* glusterd: op state machine shouldn't use global peer listAtin Mukherjee2014-10-284-11/+42
| | | | | | | | | | | | | | | | | | | | Problem: The op state machine relied on the global peer list while sending lock/stage/unlock commit rpc requests to the peers in the cluster. Relying on the global peer list structure is dangerous, as this structure gets modified if any peer modification command is attempted in the cluster while a transaction is going through the state machine. A typical instance of this problem: when rebalance is in progress and a peer probe is executed, the rebalance op-sm and the peer probe may race and leave the peerinfo structure in an inconsistent state. Solution: Use a local copy of the peer list (xaction_peers) in the glusterd op-sm. Change-Id: I1ff7118dc6a9a72633e2e87b7ab7bae1796595e0 BUG: 1152890 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/8932 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* Posix: Brick failure detection fix for ext4 filesystemLalatendu Mohanty2014-10-281-6/+64
| | | | | | | | | | | | | | | | Issue: stat() on XFS has a check for the filesystem status but ext4 does not. Fix: Replace the stat() call with an open, write and read of a new file under the "brick/.glusterfs" directory. This change works for xfs, ext4 and other filesystems. Change-Id: Id03c4bc07df4ee22916a293442bd74819b051839 BUG: 1130242 Signed-off-by: Lalatendu Mohanty <lmohanty@redhat.com> Reviewed-on: http://review.gluster.org/8213 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
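The open, write and read probe described above might look roughly like this. It is a sketch only: `demo_health_check`, the probe file name and the payload are assumptions made for illustration, not the code from the patch, and error reporting is reduced to a 0 or -1 return value.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Probe a directory by writing a small file and reading it back.
 * Returns 0 if the round trip succeeds, -1 otherwise. */
int
demo_health_check(const char *dir)
{
    char       path[4096];
    const char msg[] = "health-check";   /* hypothetical payload */
    char       back[sizeof(msg)] = { 0 };
    int        ret = -1;
    int        fd;
    int        len;

    /* Hypothetical probe file name under the given directory. */
    len = snprintf(path, sizeof(path), "%s/demo_health_check", dir);
    if (len < 0 || (size_t)len >= sizeof(path))
        return -1;

    fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Unlike stat(), these calls have to go through the filesystem's
     * data path, so a failed filesystem is more likely to show up. */
    if (write(fd, msg, sizeof(msg)) != (ssize_t)sizeof(msg))
        goto out;
    if (lseek(fd, 0, SEEK_SET) != 0)
        goto out;
    if (read(fd, back, sizeof(back)) != (ssize_t)sizeof(back))
        goto out;
    if (memcmp(msg, back, sizeof(msg)) != 0)
        goto out;

    ret = 0;
out:
    close(fd);
    unlink(path);
    return ret;
}
```

A periodic checker would call this on the brick's internal directory and mark the brick as failed when it returns -1.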
* changelog: replace MAKE_HTIME_FILE_PATH with snprintf()Niels de Vos2014-10-281-7/+1
| | | | | | | | | | | | | | | | The MAKE_HTIME_FILE_PATH macro, which is used only once, uses strcpy and strcat into a fixed buffer without checking the input lengths. Recommend replacing it with snprintf(). Change-Id: Ia0245096774dc84be1b937e1d5750f3634fff034 BUG: 1099645 Reported-by: Keith Schincke <kschinck@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/8977 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
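As an illustration of why snprintf() is the safer choice here, the sketch below builds a path into a fixed buffer and detects truncation. `demo_build_path` and its arguments are hypothetical and do not reproduce the changelog code; the pattern of checking the return value against the buffer size is the part that matters.

```c
#include <stddef.h>
#include <stdio.h>

/* Build "<dir>/<name>" into buf. Returns 0 on success and -1 if the
 * result did not fit (or snprintf reported an error). Unlike a
 * strcpy()/strcat() sequence, this never writes past buf[size - 1]. */
int
demo_build_path(char *buf, size_t size, const char *dir, const char *name)
{
    int len = snprintf(buf, size, "%s/%s", dir, name);

    /* snprintf returns the length it wanted to write, so a value
     * greater than or equal to size means the output was truncated. */
    if (len < 0 || (size_t)len >= size)
        return -1;

    return 0;
}
```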
* ec: Correctly handle xtime extended attributeXavier Hernandez2014-10-281-2/+39
| | | | | | | | | | Change-Id: I2bd34f063d6bf1835d5ae57a8e9aa03f3ec3deb3 BUG: 1156404 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8972 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/snapview-server: check if the reference to the snapshot world isRaghavendra Bhat2014-10-282-8/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | correct before doing any fop The following operations might lead to problems: * Create a file on the glusterfs mount point * Create a snapshot (say "snap1") * Access the contents of the snapshot * Delete the file from the mount point * Delete the snapshot "snap1" * Create a new snapshot "snap1" Now accessing the new snapshot "snap1" fails, because the inode and dentry created for snap1 are not deleted when the snapshot is deleted (deletion of a snapshot is a gluster cli operation, not a fop). So the next time a snapshot with the same name is created, the previous inode and dentry are reused. But the inode context contains stale information about the glfs_t instance and the handle in the gfapi world. Accessing them directly without a proper check leads to ENOTCONN errors. Thus the glfs_t instance should be checked before it is accessed. If it is wrong, the right instance should be obtained by doing a lookup. Change-Id: Idca0c8015ff632447cea206a4807d8ef968424fa BUG: 1151004 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/8917 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: really get the inode size for a brickNiels de Vos2014-10-271-12/+17
| | | | | | | | | | | | | | | The device to get the inode size from does not get passed to the tool (tune2fs, xfs_info or the like) that is called. This is probably just an oversight. While correcting this, clean up some bits of the function too. Change-Id: Ida45852cba061631fb304bc7dd5286df1a808010 BUG: 1130462 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/8492 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* ec: Fix rebalance issuesXavier Hernandez2014-10-275-113/+218
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some issues in the ec xlator prevented rebalance from completing successfully and generated warnings and errors in the log. The most critical error was a race condition that caused false corruption detection when two specific operations were executed sequentially and shared the same lock. The following sequence explains the problem: 1. A setxattr is issued. 2. setxattr: ec locks the inode before updating the xattr. 3. setxattr: The xattr is updated. 4. setxattr: The upper xlator is notified that the operation completed. 5. setxattr: A background task is initiated to update the version of the file. 6. A stat is issued on the same file. 7. stat: Since the lock is already acquired, it is reused. 8. stat: A lookup is issued to determine version and size information of the file. At this point, operations 5 and 8 can interfere. This can cause the lookup to see different information on each brick, so it concludes that some bricks are corrupted, incorrectly excludes them from the operation and initiates a self-heal. In some cases this false detection combined with self-heal could lead to invalid updates of the trusted.ec.size xattr, leaving the file smaller than it should be. This only happens if the first operation does not perform a lookup, because chained operations reuse the information returned by the previous one, which avoids this kind of problem. To solve this, the background update is now executed atomically with the subsequent unlock. This avoids some reuses of the lock while updating. However, this reduces performance because the window in which new requests can reuse the lock is now much smaller. This has been alleviated by using the same technique implemented in AFR (i.e. waiting some time before releasing the lock). Some minor fixes are also included in this patch: * A bug in the management of 'trusted.glusterfs.pathinfo' that wrote beyond the allocated space. 
* An uninitialized variable. * trusted.ec.config was not created for regular files created with mknod. * An invalid state was used in the access fop. Change-Id: Idfaf69578ed04dbac97a62710326729715b9b395 BUG: 1152902 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8947 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep/glusterd: Enable changelog and marker during geo-rep create.Kotresh HR2014-10-271-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: Geo-rep fails to sync a few files when I/O happens during geo-rep start. ANALYSIS: To use the available changelogs to handle deletes/renames, an 'xsync upper limit' was introduced, which limits the xsync crawl to the changelog register time. But there is a small time interval between the changelog register time and the time the changelog is actually enabled. I/O that happens in this interval is not synced through xsync, as it is beyond the changelog register time, and not through changelog either, as changelog is not yet enabled. SOLUTION: Enable changelog and marker during geo-rep create instead of geo-rep start, so that entries are captured in the changelog and the above interval is eliminated. Change-Id: Ic5f0457a4b67a335cbbb37d34db5f8cb8bc901c4 BUG: 1139196 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/8650 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* ec: Fix self-heal issuesXavier Hernandez2014-10-2113-302/+391
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Doing an 'ls' of a directory that has been modified while one of the bricks was down sometimes returns the old directory contents. Cause: Directories are not marked when they are modified, as files are. The ec xlator balances requests amongst available and healthy bricks. Since there is no way to detect that a directory is out of date in one of the bricks, that brick is used from time to time to return the directory contents. Solution: Basically, the solution consists of using versioning information for directories as well; however, some additional changes have been necessary. Changes: * Use directory versioning: This requires locking the full directory instead of a single entry for all requests that add or remove entries from it. This is needed to allow atomic version updates. This affects the following fops: create, mkdir, mknod, link, symlink, rename, unlink, rmdir. Another side effect is that opendir needs to do a lookup first to get versioning information and discard out-of-date bricks for subsequent readdir(p) calls. * Restrict directory self-heal: Until now, when a discrepancy was found in lookup, a self-heal was automatically started. This caused the versioning information of a bad directory to be healed instantly, making the original problem reappear. To solve this, when a missing directory is detected in one or more bricks on lookup or opendir fops, only a partial self-heal is performed on it. A partial self-heal basically creates the directory but does not restore any additional information. This prevents an 'ls' from repairing the directory and causing the problem to happen again. With this change, the output of 'ls' is always consistent. 
However, since the directory has been created in the brick, any other operation on it (creating new files, for example) can succeed on all bricks and does not add additional work to the self-heal process. To force a self-heal of a directory, any other operation must be done on it, for example a getxattr. With these changes, the correct healing procedure that avoids inconsistent directory browsing consists of a post-order traversal of the directories being healed. This way, the directory contents will be healed before healing the directory itself. * Additional changes to fix self-heal errors - Don't use fop->fd to decide between fd/loc. open, opendir and create have an fd, but the correct data is in loc. - Fix incorrect management of bad bricks per inode/fd. - Fix incorrect selection of fop's target bricks when there are bad bricks involved. - Improved ec_loc_parent() to always return a parent loc as complete as possible. Change-Id: Iaf3df174d7857da57d4a87b4a8740a7048b366ad BUG: 1149726 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8916 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
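The post-order healing order mentioned in the message above can be illustrated with a small in-memory tree. This is only a sketch of the traversal order: `demo_dir` and `demo_heal_order` are hypothetical, and a real heal would trigger an operation on each directory instead of recording its name.

```c
#include <stddef.h>

/* Hypothetical in-memory directory node. */
struct demo_dir {
    const char       *name;
    struct demo_dir **children;
    size_t            nchildren;
};

/* Append directory names to out[] in the order they would be healed:
 * all subdirectories first, then the directory itself. The caller
 * must provide room for every node in the tree. Returns the new
 * number of entries in out[]. */
size_t
demo_heal_order(const struct demo_dir *dir, const char **out, size_t count)
{
    size_t i;

    for (i = 0; i < dir->nchildren; i++)
        count = demo_heal_order(dir->children[i], out, count);

    /* The directory is visited only after its contents. */
    out[count] = dir->name;
    return count + 1;
}
```

For a tree `root` containing `a`, which in turn contains `a/b`, the recorded order is `a/b`, `a`, `root`.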
* performance/io-threads: Fix static analysis errorPranith Kumar K2014-10-201-4/+2
| | | | | | | | | | | Static analysis complains that stub->fop can be larger than FOP_MAX. This patch does not print any log message when the fop value is outside the defined range. It returns EINVAL instead. Change-Id: I293381e2c1ad0ab45154b0192a637612becaf744 BUG: 1153935 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8939 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
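The kind of guard a static analyzer asks for in this situation can be sketched as below. The table, `DEMO_FOP_MAX` and `demo_fop_name` are hypothetical and much smaller than the real fop list; the sketch only shows the range check placed before the table lookup, with -EINVAL returned instead of indexing out of bounds.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, shortened fop table for this sketch. */
enum { DEMO_FOP_MAX = 3 };

static const char *const demo_fop_names[DEMO_FOP_MAX] = {
    "NULL", "STAT", "READLINK"
};

/* Look up a fop name. Returns 0 and sets *name on success, or
 * -EINVAL when fop is outside the table, leaving *name untouched. */
int
demo_fop_name(int fop, const char **name)
{
    if (fop < 0 || fop >= DEMO_FOP_MAX)
        return -EINVAL;

    *name = demo_fop_names[fop];
    return 0;
}
```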