path: root/xlators
Commit message | Author | Age | Files | Lines
* gNFS: avoid double fd unref in opendir | Santosh Kumar Pradhan | 2013-09-24 | 1 | -3/+1
  Noticed that fd_unref() was called on the fd regardless of the return value in nfs3svc_opendir_readdir_cbk(). This patch therefore removes the extra unref in the negative case in nfs_inode_opendir_cbk, which fixes the spurious fd_unref().

  Back port of: http://review.gluster.org/4943 (Rajesh Amaravathi)

  Change-Id: Ibddf487c7890407d01befedd65eefb10cb9c989f
  BUG: 1011761
  Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com>
  Reviewed-on: http://review.gluster.org/5996
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
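  The double unref described above can be modelled in a few lines. This is a Python sketch, not the GlusterFS C code: the class `RefCounted`, the functions `opendir_cbk` and `readdir_cbk`, and the `buggy` flag are hypothetical stand-ins for the fd object and the two callbacks named in the commit message, and the call order between them is an assumption.

```python
class RefCounted:
    """Toy reference-counted handle standing in for an fd object."""

    def __init__(self):
        self.refs = 1  # the creator holds the first reference

    def unref(self):
        if self.refs <= 0:
            raise RuntimeError("unref on an already released object")
        self.refs -= 1


def opendir_cbk(fd, op_ret, buggy=False):
    """Inner callback (hypothetical): the buggy variant also unrefs on failure."""
    if op_ret < 0 and buggy:
        fd.unref()  # the extra unref in the negative case
    return op_ret


def readdir_cbk(fd, op_ret, buggy=False):
    """Outer callback (hypothetical): drops the reference whatever op_ret is."""
    opendir_cbk(fd, op_ret, buggy)
    fd.unref()
```

  With `buggy=False` the failure path releases the single reference exactly once; with `buggy=True` the second `unref()` hits an already released object.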
* cluster/dht: Treat migration failures due to space constraints as skipped [v3.4.1qa3] | shishir gowda | 2013-09-19 | 4 | -5/+59
  Currently, rebalance/remove-brick operations count a file as a failed migration even when it failed because of space constraints (not enough space for the file, or a migration that would leave the cluster imbalanced). Such files will now be counted as skipped, and rebalance/remove-brick status will display the additional counter.

  BUG: 989846
  Change-Id: I4efa7ce69dd43680ff47181afed0c561954c5080
  Signed-off-by: shishir gowda <sgowda@redhat.com>
  Reviewed-on: http://review.gluster.org/5977
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
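  A minimal Python sketch of the counting rule follows. It is an illustration only: the function `record_migration`, the counter names, and the use of `ENOSPC` as the signal for "space constraint" are assumptions, not the DHT implementation.

```python
import errno
from collections import Counter


def record_migration(stats, err):
    """Update rebalance counters for one file; err is 0 or an errno value."""
    if err == 0:
        stats["migrated"] += 1
    elif err == errno.ENOSPC:
        # space-related failures are reported as skipped, not failed
        stats["skipped"] += 1
    else:
        stats["failed"] += 1
    return stats
```

  Keeping "skipped" separate from "failed" lets a status command show both numbers instead of folding space problems into the failure count.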
* cli,glusterd: Task parameters in xml output | Kaushal M | 2013-09-19 | 3 | -7/+168
  Backport of 91cd0eae2cc1d96cbafa6457835f146503355238 from master

  This patch introduces task parameters for the asynchronous tasks shown in volume status. The parameters are only given for xml output. The parameters currently shown are:

  - source and destination bricks for replace-brick tasks
      ......
      <tasks>
        <task>
          <type>Replace brick</type>
          <id>3d1a1005-9d2e-4ae0-bd62-577bc1d333a3</id>
          <status>1</status>
          <params>
            <srcBrick>archm:/export/test4</srcBrick>
            <dstBrick>archm:/export/test-replace1</dstBrick>
          </params>
        </task>
      </tasks>
      ......
  - list of bricks being removed for remove-brick tasks
      ......
      <tasks>
        <task>
          <type>Remove brick</type>
          <id>901c20ca-0da2-41de-8669-5f0caca6b846</id>
          <status>1</status>
          <params>
            <brick>archm:/export/test2</brick>
            <brick>archm:/export/test3</brick>
          </params>
        </task>
      </tasks>
      ......

  The changes for non-xml output will be done in a subsequent patch.

  BUG: 916577
  Change-Id: Iade8a4974aefc5ffb080553496ae5a3169055090
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/5973
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
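  A consumer of this XML could pull the task parameters out as sketched below. The sample is the remove-brick fragment from the commit message; the helper `task_params` is a hypothetical name, and the sketch assumes the `<tasks>` fragment has already been extracted from the full volume status document.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<tasks>
  <task>
    <type>Remove brick</type>
    <id>901c20ca-0da2-41de-8669-5f0caca6b846</id>
    <status>1</status>
    <params>
      <brick>archm:/export/test2</brick>
      <brick>archm:/export/test3</brick>
    </params>
  </task>
</tasks>"""


def task_params(xml_text):
    """Return a list of (task type, [(param tag, param text), ...]) tuples."""
    root = ET.fromstring(xml_text)
    result = []
    for task in root.iter("task"):
        params_el = task.find("params")
        # tasks without a <params> element simply get an empty list
        params = [] if params_el is None else [(p.tag, p.text) for p in params_el]
        result.append((task.findtext("type"), params))
    return result
```

  Returning `(tag, text)` pairs keeps the same code usable for both layouts shown above, since replace-brick uses `srcBrick`/`dstBrick` while remove-brick repeats `brick`.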
* glusterd: Don't reset rebalance status on add-brick | Kaushal M | 2013-09-19 | 1 | -9/+0
  Backport of 67c28b19355c47e96d1420405cc38753a3e5f9be from master

  The rebalance status was being reset to 'Not started' when add-brick was performed. This would lead to odd cases where a 'rebalance status' on a volume would show the status as 'not started' but would also include the rebalance statistics. This also affected the display of asynchronous task status in the 'volume status' command.

  Not resetting the status prevents the above issues. Since we use the running/not-running state of the rebalance process as the check when performing other operations, we can safely leave the collected rebalance stats in place on an add-brick.

  BUG: 1006247
  Change-Id: Idade88d9e5a6f27659490b3e6d85495d426ef0a3
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/5971
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* mount/fuse: Implement forget in cbks for fuse. | Vijay Bellur | 2013-09-19 | 1 | -0/+8
  With the introduction of inode_ctx_set in fuse as part of 2991503d014, forget cbk gets called for the fuse xlator. Though nothing needs to be done in forget_cbk, excessive log messages of the following kind are observed:

    [2013-09-16 06:09:50.758063] W [defaults.c:1331:default_forget] (-->/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(+0xa1f2) [0x7f51432781f2] (-->/usr/local/lib/libglusterfs.so.0(inode_unref+0x3c) [0x7f5144e5816c] (-->/usr/local/lib/libglusterfs.so.0(+0x2d061) [0x7f5144e58061]))) 0-fuse: xlator does not implement forget_cbk

  This patch prevents such log messages from being seen.

  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
  BUG: 979910
  Change-Id: Ie5874138f46822b10ff4213bd1134d78330ec460
  Reviewed-on: http://review.gluster.org/5932
  Reviewed-by: Anand Avati <avati@redhat.com>
  Tested-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5975
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: Calculate volume op-versions only on set/reset [v3.4.1qa2] | Kaushal M | 2013-09-13 | 3 | -6/+69
  Backport of http://review.gluster.org/5568

  The volume op-versions are calculated during a volume set/reset, when reading a volume from disk, and when importing a volume during probe or volume sync. The calculation of the volume op-version depends on the cluster's op-version, as some features are enabled automatically depending on the cluster's op-version. We also don't store the volume op-versions persistently and don't export them during sync. Because of this, cases can occur that lead to inconsistencies in volumes across different peers. One such case is below.

  Consider a cluster made up of 3 peers P1, P2 and P3, operating at op-version N. The cluster has two volumes V1 and V2, which have volume op-versions N (since a volume op-version cannot be greater than the cluster op-version). We have,
      Cluster op-version = N
      V1 op-version = N
      V2 op-version = N

  A set operation on V1 causes the cluster's op-version to be bumped up to N+1. Assume that there exist some features that are automatically enabled at op-version N+1. The op-version of V2 remains at N as no operation has been performed on it. So,
      Cluster op-version = N+1
      V1 op-version = N+1
      V2 op-version = N

  Now, we probe a new peer P4. On the new peer we will have the following op-versions,
      Cluster op-version = N+1
      V1 op-version = N+1
      V2 op-version = N+1

  This happens because we don't send volume op-versions during the sync after probe. P4 will freshly calculate the op-version of V2 (assuming features have been auto-enabled due to the cluster op-version being N+1) as N+1.

  Another case is when glusterd on a peer restarts. Assume P3 was restarted: glusterd will recalculate the volume op-versions during the restore state. Again, the op-version of V2 will be calculated as N+1, assuming auto-enabled features. This will lead to inconsistency between the volume representation in memory and on disk, as glusterd will assume the volume contains auto-enabled features, but the volfiles don't contain them as they were not regenerated.

  These kinds of issues can be solved by calculating the volume op-version only when features are enabled and disabled (i.e. during volume set/reset), persisting the volume op-versions, and exporting/importing them.

  BUG: 1005043
  Change-Id: Id8bb05ba2a77e510739b3b1833f98b4d6d1fa4d7
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/5832
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
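  The P4 scenario can be reproduced with a small Python model. Everything here is an assumption made for illustration: the function names, the dict layout of an exported volume, and the placeholder feature `feature-x`; glusterd's real calculation is not shown in the commit message.

```python
def calc_volume_op_version(options, cluster_op_version, auto_features):
    """Recompute a volume op-version from scratch (the problematic path).

    options:       {option name: op-version that option requires}
    auto_features: {feature name: cluster op-version at which it turns on}
    """
    version = max(options.values(), default=1)
    for needed in auto_features.values():
        if cluster_op_version >= needed:
            version = max(version, needed)
    return version


def import_volume(exported, cluster_op_version, auto_features):
    """Peer-side import: prefer a persisted op-version when one was exported."""
    if "op_version" in exported:
        return exported["op_version"]
    return calc_volume_op_version(
        exported["options"], cluster_op_version, auto_features
    )
```

  With N = 2, a volume whose options only need op-version 2 is recalculated as 3 on a freshly probed peer once the cluster is at 3, but stays at 2 when the exporting peer sends the stored value along.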
* glusterd: Use volume op-versions during volgen | Kaushal M | 2013-09-13 | 4 | -19/+14
  Backport of '3af61d6 glusterd: Use volume op-versions during volgen' from master

  Instead of using the cluster op-version, the volume op-version is used to enable open-behind during volgen. For doing this, the volume op-versions are updated before regenerating the volfiles.

  BUG: 990830
  Change-Id: I07e4a34004816c803fcbb3ee1ddd4b1e4c3a8006
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/5831
  Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* mgmt/glusterd: Update sub_count on remove brick | Vijay Bellur | 2013-09-13 | 1 | -0/+1
  Change-Id: I7c17de39da03c6b2764790581e097936da406695
  BUG: 1002556
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
  Reviewed-on: http://review.gluster.org/5902
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: Allow bumping down a peer's op-version during probe [v3.4.1qa1] | Kaushal M | 2013-09-10 | 1 | -15/+9
  Backport of http://review.gluster.org/5715

  Earlier, a peer running a higher op-version couldn't be probed into a cluster running at a lower op-version. This created issues when trying to expand an upgraded cluster.

  This patch changes this behaviour. The cluster no longer rejects a peer being probed if its op-version is higher than the cluster op-version. The peer will reduce its op-version if it doesn't have any volumes. If the peer contains volumes and needs to reduce its op-version, it fails the handshake and the probe fails.

  BUG: 1005038
  Change-Id: Iabe790a9f826a4ac63d379eeeba01efcfef01f4d
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/5834
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
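  The handshake rule reads naturally as a small decision function. The Python sketch below is a model under assumptions: `handshake_op_version` is a hypothetical name, and the branch for a peer at or below the cluster op-version is not covered by the commit message, so it is left unchanged here.

```python
def handshake_op_version(peer_op_version, cluster_op_version, peer_has_volumes):
    """Return the op-version the probed peer should run at, or raise on failure."""
    if peer_op_version <= cluster_op_version:
        # not described by the commit message; left unchanged as an assumption
        return peer_op_version
    if peer_has_volumes:
        # a peer with volumes cannot lower its op-version, so the probe fails
        raise ValueError("peer has volumes and cannot lower its op-version")
    return cluster_op_version
```

  An empty peer at op-version 3 probed into a cluster at 2 comes back as 2; the same peer with volumes makes the function raise, which stands in for the failed handshake.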
* cluster/dht: Ignore subvols with error in min-free-disk/inodes | Amar Tumballi | 2013-09-10 | 5 | -17/+86
  Currently, when selecting an alternative subvolume because the hashed subvol has exceeded min-free-disk/inodes, we do not check whether layouts have errors (including decommissioning). This leads to data being written to those subvolumes and, in the case of decommissioning, to data loss.

  BUG: 982919
  > Original-Author: shishir gowda <sgowda@redhat.com>
  > Reviewed-on: http://review.gluster.org/5299
  Change-Id: If301a86cf3ca5fad6529bd2e61382f9901663ba0
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-on: http://review.gluster.org/5888
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
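  A rough Python model of the corrected selection is sketched below. The dict keys (`name`, `free_pct`, `layout_err`) and the "most free space wins" policy are assumptions for illustration; the commit message only states that subvolumes with layout errors must be excluded.

```python
def pick_alternative(subvols, min_free_pct):
    """Pick a healthy subvolume with enough free space, or None.

    Each subvol is a dict with hypothetical keys: name, free_pct, and
    layout_err (non-zero for layout errors, including decommissioning).
    """
    candidates = [
        s for s in subvols
        if s["layout_err"] == 0 and s["free_pct"] >= min_free_pct
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s["free_pct"])["name"]
```

  A brick that is being decommissioned often has the most free space, which is exactly why it has to be filtered out before ranking.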
* performance/write-behind: invoke request queue processing if we find fd marked bad while trying to fulfill lies. | Raghavendra G | 2013-09-10 | 1 | -19/+30
  * flush was queued behind some unfulfilled write.
  * A previously wound write returned an error and hence fd was marked bad with the corresponding error.
  * wb_fulfill_head (invocation probably rooted in wb_flush), before winding, checks for failures of previous writes and, since there was a failure, calls wb_head_done without even winding one request in head.
  * wb_head_done unrefs all the requests in list "head".
  * since flush was the last operation on fd (and most likely the last operation on the inode itself), no one invokes wb_process_queue and flush is stuck in the request queue for eternity.

  Change-Id: I3b5b114a1c401d477dd7ff64fb6119b43fda2d18
  BUG: 988642
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  Reviewed-on: http://review.gluster.org/5883
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: release big locks while doing mount | Anand Avati | 2013-09-10 | 1 | -0/+4
  Otherwise things can deadlock between getspec and glusterd_do_mount().

  Change-Id: Ie70b43916e495c1c8f93e4ed0836c2fb7b0e1f1d
  BUG: 997576
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5881
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mount/fuse: save the basefd flags in the new fd | Raghavendra Bhat | 2013-09-10 | 1 | -0/+1
  Upon graph switch, the basefd's flags were not saved in the new fd created for the new graph, on which all further requests for the open file arrive. Thus posix was treating the fd as a read-only fd and was denying writes on it.

  Change-Id: I781b62b376a85d1a938c091559270c3f242f1a2a
  BUG: 998352
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/5880
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* io-cache: fix unsafe typecasting of pointer to uint64 | Anand Avati | 2013-09-10 | 1 | -1/+3
  Typecasting a pointer to uint64_t *, followed by the setting of 64 bits in inode_ctx_get(), results in memory corruption on 32-bit systems.

  Change-Id: I32fa3bf3b853ed2690a9b9a471099a59b9d7186a
  BUG: 997902
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5879
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: unlock before aborting transaction | Anand Avati | 2013-09-10 | 1 | -0/+2
  Otherwise this results in a missing frame, causing a hang.

  Change-Id: Ib5f3dc6a3999449faa2853cee2944af2fb065a20
  BUG: 1002399
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5878
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: assign layout onto missing directories too | Anand Avati | 2013-09-10 | 1 | -4/+28
  The current self-healing algorithm ignores missing directories when assigning a new layout. When lookup() is racing against mkdir(), or when self-healing a half-done mkdir(), the layout assignment split must happen based on the final number of directories, and not the currently existing number of directories (because we finish mkdir() of missing directories before hash layout assignment).

  Without this fix, concurrent mkdir() and lookup() will step on each other's feet, create a messed-up layout on disk, and end up with different in-memory layouts. Once two clients have different in-memory layouts, creation of a subdirectory will not arbitrate on the same hashed subvolume and will result in a GFID mismatch of the sub-directory.

  Change-Id: Ia47acad67c265060405984c822b4d37512b9dbb3
  BUG: 907072
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5871
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
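  The effect of sizing the split by the final directory count can be seen in a toy Python model. The 32-bit hash space, the even split, the sorted ordering, and the function names `assign_layout` and `heal_layout` are assumptions for illustration, not the DHT code.

```python
HASH_SPACE = 2 ** 32  # assumed size of the hash range


def assign_layout(subvols):
    """Split the hash space evenly; returns {subvol: (start, stop)}, inclusive."""
    count = len(subvols)
    if count == 0:
        return {}
    chunk = HASH_SPACE // count
    layout = {}
    for i, name in enumerate(subvols):
        start = i * chunk
        # the last range absorbs the remainder so the space is fully covered
        stop = HASH_SPACE - 1 if i == count - 1 else start + chunk - 1
        layout[name] = (start, stop)
    return layout


def heal_layout(existing, missing):
    """Size the split by the final directory count: existing plus missing."""
    return assign_layout(sorted(existing + missing))
```

  Two clients that both call `heal_layout` with the same final set arrive at the same ranges, whereas splitting over only the directories each one currently sees could give them different layouts.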
* core: increase the auxiliary group limit to 65536 | Anand Avati | 2013-09-09 | 3 | -7/+25
  Make the allocation of groups dynamic and increase the limit to 65536.

  Change-Id: I702364ff460e3a982e44ccbcb3e337cac9c2df51
  BUG: 953694
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5172
  Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* performance/io-threads: fix potential use after free crash | Brian Foster | 2013-09-09 | 1 | -1/+1
  do_iot_schedule() enqueues the stub and kicks the worker thread. The stub is eventually destroyed after it has been resumed and is thus unsafe to access after being enqueued. Though likely difficult to reproduce in a real deployment, a crash is reproducible by running a smallfile benchmark on a replica 2 volume on a single VM.

  Move the debug log message ahead of the do_iot_schedule() call to avoid the crash.

  BUG: 989579
  Change-Id: Ifc6502c02ae455c959a90ff1ca62a690e31ceafb
  Signed-off-by: Brian Foster <bfoster@redhat.com>
  Reviewed-on: http://review.gluster.org/5418
  Reviewed-by: Santosh Pradhan <spradhan@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5815
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* system/posix-acl: check for the sticky bit of the parent directory | Raghavendra Bhat | 2013-09-09 | 1 | -0/+5
  * While creating links, check whether the sticky bit is set on the parent directory and whether the sticky bit permits the user to create the link.

  Change-Id: Ic0d09d9ed579c4eb47462c71602a3a60cc7d3bc1
  BUG: 958691
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/4934
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5813
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* posix-acl: disable permission checks for fd based ops | shishir gowda | 2013-09-09 | 1 | -4/+4
  Signed-off-by: shishir gowda <sgowda@redhat.com>
  Change-Id: I9d49537c2c7b51d5598b80627d61f060aaec8549
  BUG: 921437
  Reviewed-on: http://review.gluster.org/4671
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-on: http://review.gluster.org/5812
  Reviewed-by: Anand Avati <avati@redhat.com>
* performance/open-behind: Fix fd-leaks in unlink, rename | Pranith Kumar K | 2013-09-09 | 1 | -0/+4
  Change-Id: Ia8d4bed7ccd316a83c397b53b9c1b1806024f83e
  BUG: 991622
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/5493
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5810
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* xlator: NULL terminate volume_options struct | Santosh Kumar Pradhan | 2013-09-09 | 2 | -5/+7
  Problem: the volume_options structs for the open-behind and quick-read xlators were not NULL terminated.

  Fix: Make them NULL terminated.

  Change-Id: I2615a1f15c6e5674030a219a99ddf91596bf346b
  BUG: 965995
  Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com>
  Reviewed-on: http://review.gluster.org/5064
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  Reviewed-on: http://review.gluster.org/5809
  Reviewed-by: Anand Avati <avati@redhat.com>
* open-behind: propagate errors from ob_wake_cbk | Anand Avati | 2013-09-09 | 1 | -9/+25
  If opening the fd in the background fails, remember the error and fail all further calls on the fd. Use the newly introduced call_unwind_error() function from the call-stub cleanup to fail the future calls.

  Change-Id: I3b09b7969c98d915abd56590a2777ce833b81813
  BUG: 846240
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/4521
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/5808
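  The "remember the error, fail later calls" idea can be sketched as follows. This Python model is illustrative only: the class `LazyFd` and its methods are hypothetical, and the real code unwinds queued call stubs with call_unwind_error() rather than returning tuples.

```python
import errno


class LazyFd:
    """Toy fd whose real open happens in the background."""

    def __init__(self):
        self.op_errno = 0  # remembered error from the background open

    def wake_cbk(self, op_ret, op_errno):
        """Completion of the background open; records a failure if any."""
        if op_ret < 0:
            self.op_errno = op_errno

    def call(self, fop):
        """Run fop unless the open failed; then fail with the saved errno."""
        if self.op_errno:
            return (-1, self.op_errno)
        return (fop(), 0)
```

  For instance, after `wake_cbk(-1, errno.EACCES)` every later `call()` on that fd yields `(-1, errno.EACCES)` without running the operation.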
* performance/open-behind: use anonymous fd for doing fstat and readv | Raghavendra Bhat | 2013-09-09 | 2 | -2/+7
  Change-Id: I61a3c221e0a15736ab6315e2538c03dac27480a5
  BUG: 846240
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/4483
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5807
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* mount/fuse: perform lookup() on inodes linked through readdirplus | Anand Avati | 2013-09-09 | 3 | -8/+66
  Some xlators still require the lookup() fop to be sent in order to work properly. This patch remembers inodes which have been linked through readdirplus and makes the resolver send lookups on them.

  Also, introduce and use a context count for the inode table.

  Change-Id: Ibe8a04a659539d90dfc794521b51bf2bda017a0b
  BUG: 979910
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5267
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-on: http://review.gluster.org/5806
* md-cache: fix xattr caching code in getxattr | Anand Avati | 2013-09-09 | 1 | -2/+2
  Bad condition check, fix it!

  Change-Id: I6e047de70f77d7b98b2ca771a467f14a76fd62fe
  BUG: 994392
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5513
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-on: http://review.gluster.org/5805
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* nfs: prevent NFS server crash when upgrading from 3.2.x server | Anand Avati | 2013-09-09 | 1 | -0/+5
  After an upgrade the NFS3 filehandle size changed (became smaller), but during a live upgrade the client would send the old handle (expecting ESTALE and then doing a fresh lookup). When reading the old handle, however, we were reading it into a structure which was limited to the size of the new handle, while we should have been reading into a buffer which is as big as the NFS3 spec permits the handle size to be.

  The actor functions declare the structure on the stack, so the overflow results in stack corruption.

  Change-Id: Ie930875ac9db46b43d1cb8ad1e6d89cdaeded7ca
  BUG: 1002385
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5730
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-on: http://review.gluster.org/5804
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* md-cache: invalidate attributes on xattr update | Anand Avati | 2013-09-08 | 1 | -0/+164
  An xattr update will result in at least a ctime change, so invalidate attributes in the xattr callback.

  Change-Id: Ie6e8f2fd9a11c56c27e78bd58c2ff1e1d6edce6e
  BUG: 953694
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5641
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  Reviewed-on: http://review.gluster.org/5640
* call-stub: internal refactor | Anand Avati | 2013-09-05 | 1 | -40/+40
  - re-structure members of call_stub_t with new simpler layout
  - easier to inspect call_stub_t contents in gdb now
  - fix a bunch of double unrefs and double frees in cbk stub
  - change all STACK_UNWIND to STACK_UNWIND_STRICT and thereby fixed a lot of bad params
  - implement new API call_unwind_error() which can even be called on fop_XXX_stub(), and not necessarily fop_XXX_cbk_stub()

  Change-Id: Idf979f14d46256af0afb9658915cc79de157b2d7
  BUG: 846240
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/4520
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-on: http://review.gluster.org/5820
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* glusterd: Give up biglock before brick's rpc unref | Krishnan Parthasarathi | 2013-08-14 | 1 | -1/+5
  This prevents a possible deadlock when rpc_connection_cleanup is called in the same thread as rpc_clnt_unref.

  Change-Id: Ia4dcc0a8a6e6158d4ddec68b780fccbc4cd64adb
  BUG: 962619
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/5326
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* mount/fuse: Provide option to use/not use kernel-readdirp [v3.4.0] | Pranith Kumar K | 2013-07-12 | 3 | -2/+19
  By default, use of fuse kernel readdirp in the fuse xlator is off. When the mount option use-readdirp=yes is provided, the xlator starts using fuse-kernel's readdirp.

  BUG: 983477
  Change-Id: Ibdaf1407d6f2a782a4a1916fad374f36fca6c5e7
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/5323
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Correct op-version of some options | Kaushal M | 2013-07-11 | 1 | -9/+9
  Some of the options have been backported to the release-3.3 branch and hence should have their op-version reduced. Some other options had their op-version incorrectly set as 1.

  Change-Id: If40325b7b2da7aa36f90261024117cd18cf51ef0
  BUG: 981278
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/5320
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* store: move glusterd_store functions from mgmt/glusterd to libglusterfs | Krishnan Parthasarathi | 2013-07-03 | 6 | -867/+179
  Backport of http://review.gluster.org/4676 and http://review.gluster.org/5243

  Making the glusterd_store_* functions re-usable will help with future changes that need to read/write lists of items.

  BUG: 904065
  Change-Id: I99fb8eced76d12d5a254567eccff9790b43d8da3
  Original-author: Niels de Vos <ndevos@redhat.com>
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/5279
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* build: declare lvm_lv_from_name() if it is missing from lvm2app.h [v3.4.0beta4] | Niels de Vos | 2013-06-27 | 1 | -0/+6
  The bd-xlator cannot be built successfully on certain Debian distributions due to a missing declaration of lvm_lv_from_name(). This function is available for linking, but it does not exist in the header file.

  This change adds detection of lvm_lv_from_name() both in the library for linking and as a declaration in the header file. If the first is missing, the bd-xlator cannot be built; if only the second is missing, we declare lvm_lv_from_name() ourselves. This makes it possible to build the bd-xlator on the affected Debian distributions too.

  Change-Id: If1845f6b6d676793677ebbcc6daf9ff12f7c3fd6
  BUG: 976946
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/5260
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: Disable transport before cleaning up rpc object | Krishnan Parthasarathi | 2013-06-18 | 3 | -19/+99
  Backport of http://review.gluster.org/5000

  Problem: The rpc_transport object, which is part of rpc_clnt, is destroyed prematurely. This is because the rpc_transport object is ref'd by both the socket layer and the rpc layer. Until operations were converted to synctasks, these refs were unref'd sequentially in the epoll thread. With more threads at play, the sequential unref guarantee no longer holds.

  Fix: Shutting down the transport before proceeding with the cleanup of the rpc_clnt object serializes the unrefs on the rpc_transport object and thus eliminates the race.

  Also, we don't store the address of brickinfo in the brick's rpc notify function, to avoid the possibility of referring to a freed brickinfo. Instead we use a string-based id to 'reach' the corresponding brickinfo.

  Change-Id: If2739e2eeaee1e8b071ab2b6754b7ea0f81cfceb
  BUG: 962619
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/5214
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Add a cmd for getting uuid of local node | Krishnan Parthasarathi | 2013-06-18 | 1 | -0/+99
  Backport of http://review.gluster.org/5175 (upstream)

  Usage: gluster system:: uuid get

  This is needed since we generate the uuid of a node lazily, i.e. we generate a uuid for the node only on the first volume or peer operation, when the node needs an external identity. With this command, we can force[1] the uuid generation without performing a volume or peer operation.

  [1]: Querying for the uuid (uuid get) forces the uuid to come into existence.

  Change-Id: I62c8b6754117756aa4d773dd48af4ddeb1a1d878
  BUG: 971661
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/5204
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
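  Lazy identity generation of this kind is easy to picture with a small Python model. The class `Node` and its method `get_uuid` are hypothetical; the sketch only shows that the first query is what brings the uuid into existence and that later queries return the same value.

```python
import uuid


class Node:
    """Toy node whose uuid is created on first use."""

    def __init__(self):
        self._uuid = None  # no identity until something asks for one

    def get_uuid(self):
        """Return the node uuid, generating it on the first call."""
        if self._uuid is None:
            self._uuid = uuid.uuid4()
        return self._uuid
```

  A volume or peer operation and an explicit query would both go through the same accessor, so whichever happens first fixes the identity.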
* glusterd: Ignore directories matching *.tmp in store | Krishnan Parthasarathi | 2013-06-14 | 1 | -0/+1
  Backport of http://review.gluster.org/5177

  Here, "store" is glusterd's persistent store under /var/lib/glusterd/.

  Change-Id: I1c01a09a8ce4a73ea612f05e7f14d4ab39ad1628
  BUG: 971796
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/5212
  Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Perform delayed changelog wakeups for anon fd [v3.4.0beta3] | Pranith Kumar K | 2013-06-10 | 1 | -0/+3
  Problem: The NFS xlator never does an open on a file before performing writes, and afr does not perform a changelog wakeup for this fd. As a result, operations which perform metadata operations as soon as the data operations are completed perceive a delay of 'post-op-delay-secs'.

  Fix: Perform a changelog wakeup on the anon-fd if an fd with the same pid is not present in the inode-list.

  Note: This approach is a short-term fix. A proper fix needs a new domain for taking metadata locks so that data/metadata locks don't compete with each other.

  BUG: 966018
  Change-Id: Ia9188a253e7943801b665e1b9205e2f551952d87
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/5067
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* Fix crash in dht_migration_complete_check_task because of NULL fd | Emmanuel Dreyfus | 2013-06-08 | 2 | -1/+3
  This is a backport of Ia5a5d40bcea7bfb320ef7096af1e035b8847d4ff

  BUG: 960055
  Change-Id: Ibf3547a775d7ca2f3a097c880cdf38ffafb322da
  Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
  Reviewed-on: http://review.gluster.org/5139
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* dht,posix: support for case discovery | Anand Avati | 2013-06-08 | 2 | -0/+139
  This is support for discovering a filename in a given directory which is a case-insensitive match of a given name. It is implemented as a virtual extended attribute on the directory, where the required filename is specified in the key.

  E.g.:

    sh# getfattr -e "text" -n user.glusterfs.get_real_filename:FiLe-B /mnt/samba/patchy
    getfattr: Removing leading '/' from absolute path names
    # file: mnt/samba/patchy
    user.glusterfs.get_real_filename:FiLe-B="file-b"

  In reality, there can be multiple "answers", as the backend filesystem is case sensitive and there can be multiple files for which strcasecmp() succeeds. In this case we pick the first matched file from the first responding server.

  If a matching file does not exist, we return ENOENT (and NOT ENODATA). This way the caller can differentiate between an "unsupported" glusterfs API and the file not existing.

  This API is used by Samba VFS to perform efficient discovery of the real filename without doing a full scan at the Samba level.

  Change-Id: I53054c4067cba69e585fd0bbce004495bc6e39e8
  BUG: 953694
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5163
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
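  The matching rule can be approximated in Python as below. This is a sketch, not the posix xlator code: `get_real_filename` here is a plain function over a list of directory entries, and `str.lower()` only approximates strcasecmp() for ASCII names.

```python
import errno


def get_real_filename(entries, wanted):
    """Return the first entry that matches wanted case-insensitively.

    Raises FileNotFoundError (ENOENT) when nothing matches, mirroring the
    ENOENT-not-ENODATA choice described in the commit message.
    """
    target = wanted.lower()
    for name in entries:
        if name.lower() == target:
            return name  # first match wins, even if others would also match
    raise FileNotFoundError(errno.ENOENT, "no case-insensitive match", wanted)
```

  With entries `["file-a", "file-b", "FILE-B"]`, a request for `FiLe-B` resolves to `file-b`, the first of the two candidates.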
* gfapi: link inodes in relevant entry FOPs | Anand Avati | 2013-06-08 | 1 | -3/+3
  Do not let inode linking happen only in lookup(). While that works, it is inefficient.

  Change-Id: I51bbfb6255ec4324ab17ff00566375f49d120c06
  BUG: 953694
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5162
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* md-cache: support negative xattr entries | Anand Avati | 2013-06-08 | 1 | -10/+31
  Add support for negative xattr caching. For this, we need to fetch xattrs at every opportunity (including readdirplus) so that a missing key in the cached dict can be treated as a negative entry. This is crucial for detecting missing ACL xattrs in Samba workloads.

  Change-Id: I918a2ef4ab804724256f7546b15e808332ed518d
  BUG: 953694
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5160
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
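  Negative caching hinges on telling "never fetched" apart from "fetched and absent". The Python sketch below models that distinction; the class `XattrCache`, the `MISS` sentinel, and the method names are hypothetical and not taken from md-cache.

```python
MISS = object()  # sentinel: the cache cannot answer, ask the server


class XattrCache:
    """Toy per-inode xattr cache with negative entries."""

    def __init__(self):
        self.xattrs = None  # None until a complete fetch has happened

    def fill(self, xattrs):
        """Store the complete xattr dict from lookup or readdirplus."""
        self.xattrs = dict(xattrs)

    def get(self, key):
        """Return the value, None for a known-missing key, or MISS."""
        if self.xattrs is None:
            return MISS
        # a key absent from a complete dict is a negative entry
        return self.xattrs.get(key)
```

  Once `fill()` has run, a query for an ACL key that the file does not carry is answered locally with `None` instead of costing a round trip.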
* quick-read: prune cache on write/[f]truncate  Anand Avati  2013-06-08  1  -0/+43
| Cache needs to be pruned on write and [f]truncate. The lack of this
| is causing Samba ping-pong test to return weird 'data increment'
| values during startup.
|
| Change-Id: I9cd6a839bcd02de738d78638211b78f382f58e0a
| BUG: 953694
| Signed-off-by: Anand Avati <avati@redhat.com>
| Reviewed-on: http://review.gluster.org/5158
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vbellur@redhat.com>
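The pruning rule can be sketched as follows. This is an illustrative model only, assuming a simple whole-file content cache keyed by an inode identifier; `ContentCache` and its method names are hypothetical and do not mirror the quick-read translator's internals.

```python
class ContentCache:
    """Whole-file content cache that is pruned on modifying operations."""

    def __init__(self):
        self._data = {}

    def fill(self, inode, content):
        self._data[inode] = bytes(content)

    def read(self, inode):
        # Returns None on a miss, so the caller reads from the backend.
        return self._data.get(inode)

    def prune(self, inode):
        self._data.pop(inode, None)

    def on_write(self, inode):
        # Cached bytes no longer match the file once it is written.
        self.prune(inode)

    def on_truncate(self, inode):
        # Covers both truncate and ftruncate in this sketch.
        self.prune(inode)
```

Without the two hooks, a read after a write would still be served from the stale cached copy, which is the kind of inconsistency the ping-pong test exposed.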
* posix-acl: fetch ACLs in readdirplus  Anand Avati  2013-06-08  1  -0/+6
| Not fetching ACLs in readdirplus can potentially result in spurious
| wrong ACL decisions (which magically go away on a lookup() which
| populates the ACLs)
|
| Change-Id: Ided38b4d868fab482b477ce51b4878289ef9eed0
| BUG: 953694
| Signed-off-by: Anand Avati <avati@redhat.com>
| Reviewed-on: http://review.gluster.org/5156
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Avoid order mismatch in blocking entrylks  Pranith Kumar K  2013-06-06  1  -6/+9
| Problem:
| When taking blocking entrylks, afr orders the entrylks based on
| uuid_compare of gfids of parent dirs; if they are equal then it orders
| them based on the basenames. While this approach works fine, the
| implementation assumes loc->gfid to be populated at the time of the
| comparison, but loc may have the gfid in loc->inode->gfid instead of
| loc->gfid, which was leading to order mismatches and deadlocks.
|
| Fix:
| Implemented loc_gfid which gives gfid by checking both loc->gfid and
| loc->inode->gfid. Used this for ordering the blocking entrylks.
|
| Change-Id: I2743fcaff3d670fbeb6b8e0a496f106a3585dde1
| BUG: 965987
| Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| Reviewed-on: http://review.gluster.org/5063
| Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vbellur@redhat.com>
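The ordering rule and the fallback can be modelled in a few lines. This is a sketch rather than the afr code: `Loc`, `Inode` and `lock_order_key` are hypothetical stand-ins for the C structures, and comparing the raw 16 bytes of the gfid is an assumption standing in for uuid_compare.

```python
import uuid
from dataclasses import dataclass
from typing import Optional

NULL_GFID = uuid.UUID(int=0)


@dataclass
class Inode:
    gfid: uuid.UUID = NULL_GFID


@dataclass
class Loc:
    gfid: uuid.UUID = NULL_GFID
    inode: Optional[Inode] = None


def loc_gfid(loc):
    """Return the gfid from loc.gfid, falling back to loc.inode.gfid."""
    if loc.gfid != NULL_GFID:
        return loc.gfid
    if loc.inode is not None:
        return loc.inode.gfid
    return NULL_GFID


def lock_order_key(parent_loc, basename):
    # Order by parent gfid first, then by basename, so that every client
    # acquires the same pair of entry locks in the same order.
    return (loc_gfid(parent_loc).bytes, basename)
```

Sorting lock requests with `key=lambda req: lock_order_key(*req)` yields the same order whether the gfid happens to live in the loc or in its inode, which is the property the fix restores.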
* glusterd-volgen: Improve volume op-versions calculation  Kaushal M  2013-06-05  4  -454/+582
| Backport of patch on master branch, under review at
| http://review.gluster.org/4952
|
| Volume op-versions calculations now take into account if an option,
|  a. enables/disables an xlator, or
|  b. is a boolean option.
|
| This prevents op-versions from being updated when a feature is
| disabled.
|
| BUG: 954256
| Change-Id: Ic68032b9e55a3f0191f8fc3ecd6b5ced385ad943
| Signed-off-by: Kaushal M <kaushal@redhat.com>
| Reviewed-on: http://review.gluster.org/5094
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd-volgen: Enable open-behind based on op-version  Kaushal M  2013-06-05  3  -11/+50
| Backport of patch on master branch, under review at
| http://review.gluster.org/4866
|
| This patch enables open-behind by default only when the op-version
| allows it. Also the volume op-version calculations take account of
| this enablement.
|
| BUG: 954256
| Change-Id: Ie739bc23ba90ec2f009feecef28187912a37487c
| Signed-off-by: Kaushal M <kaushal@redhat.com>
| Reviewed-on: http://review.gluster.org/5095
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Set op-version on startup based on install status  Kaushal M  2013-06-05  1  -5/+23
| This is a backport of change fa227c0
|   glusterd: Set op-version on startup based on install status
| from master.
|
| If the current installation of glusterfs doesn't have a stored
| op-version and is,
|  a. a fresh install, then set op-version to maximum
|  b. an upgrade from a release which didn't have op-version support,
|     set it to minimum.
|
| The install status is detected using the peer-uuid. If both peer-uuid
| and op-version are not present in the store, the installation is
| fresh. If peer-uuid is present, but op-version is absent in the store,
| the installation has been upgraded from a version which didn't support
| op-versions.
|
| By setting the initial op-version as above, we can ensure that
|  a. features are not enabled accidentally during upgrades
|  b. a fresh install starts with all possible features enabled.
|
| BUG: 954256
| Change-Id: I5cdd0c63fd16ecfa2fede99684da6fd6167823a8
| Signed-off-by: Kaushal M <kaushal@redhat.com>
| Reviewed-on: http://review.gluster.org/5001
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vbellur@redhat.com>
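The startup decision described above reduces to a small function. This is a sketch of the rule only: `OP_VERSION_MIN`, `OP_VERSION_MAX` and their values are placeholders chosen for illustration, and `None` stands in for "absent from the store".

```python
# Placeholder bounds; the real minimum and maximum are defined by glusterd.
OP_VERSION_MIN = 1
OP_VERSION_MAX = 2


def initial_op_version(stored_op_version, peer_uuid,
                       minimum=OP_VERSION_MIN, maximum=OP_VERSION_MAX):
    """Pick the op-version to start with, given what the store contains."""
    if stored_op_version is not None:
        # An op-version is already stored: keep it.
        return stored_op_version
    if peer_uuid is None:
        # Neither peer-uuid nor op-version present: fresh install,
        # start with every feature available.
        return maximum
    # peer-uuid present but no op-version: upgraded from a release without
    # op-version support, so do not enable new features implicitly.
    return minimum
```

For example, `initial_op_version(None, None)` selects the maximum, while `initial_op_version(None, "some-uuid")` selects the minimum.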
* glusterd: Introduce volume op-versions  Kaushal M  2013-06-05  10  -389/+566
| This is a backport of change 9153855
|   glusterd: Introduce volume op-versions
| from master.
|
| Each volume is now associated with two op-versions,
|  * op_version - the op-version of the highest op-versioned feature
|    enabled
|  * client_op_version - the op-version of the highest op-versioned
|    feature enabled which affects the clients only.
|
| These two op-versions are generated dynamically and kept updated
| during runtime. Glusterd now uses the respective volume's
| client_op_version during getspec requests.
|
| To achieve the above, a new boolean field, client_option, is
| introduced in the vme table; it tells whether the option is a client
| side option.
|
| BUG: 907311
| Change-Id: I59af02644a714e1c54fc89f1ead5aa551bba7ee7
| Signed-off-by: Kaushal M <kaushal@redhat.com>
| Reviewed-on: http://review.gluster.org/4957
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vbellur@redhat.com>
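How the two per-volume values relate can be shown with a toy option table. This is a sketch, not the vme table: `OptionEntry`, `volume_op_versions`, the `base` default and the option keys in the usage note are hypothetical; only the `op_version` and `client_option` concepts come from the commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OptionEntry:
    key: str
    op_version: int       # op-version at which the option became available
    client_option: bool   # True if the option affects clients


def volume_op_versions(enabled_options, base=1):
    """Return (op_version, client_op_version) for the enabled options."""
    op_version = base
    client_op_version = base
    for opt in enabled_options:
        # Every enabled option can raise the volume's op_version.
        op_version = max(op_version, opt.op_version)
        if opt.client_option:
            # Only client-side options raise client_op_version.
            client_op_version = max(client_op_version, opt.op_version)
    return op_version, client_op_version
```

With a server-only option at op-version 3 and a client-side option at op-version 2 enabled, the result is `(3, 2)`: clients need only support op-version 2 even though the volume as a whole requires 3.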
* glusterd: Backport of vme table changes from master  Kaushal M  2013-06-05  12  -260/+1168
| This patch backports the following changes from the master branch
|
| 99fe09f glusterd: Moved the volume entry table to a separate file.
| e306d08 glusterd: Changing the volume entry table's representation.
| eac54f6 glusterd: Added option description, and validation function fields.
| bcb4235 glusterd: Added validation function for performance cache max and min size.
| 8897d08 glusterd: Added validation function for quota-timeout.
| 4579609 glusterd: Added validation function for stripe-block-size.
| 6788bad glusterd: Fix some options in vme table
| 549231d glusterd: Added the validation function for subvols-per-directory
| 9636e63 glusterd: Added description for nfs.transport-type option in volume set help.
|
| Change-Id: I4a64ad94f17df4b45a3a32262a83e2c35fb5f7da
| BUG: 907311
| Signed-off-by: Kaushal M <kaushal@redhat.com>
| Reviewed-on: http://review.gluster.org/4956
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Vijay Bellur <vbellur@redhat.com>