path: root/xlators
Commit message | Author | Age | Files | Lines
* nfs: prevent NFS server crash when upgrading from 3.2.x server | Anand Avati | 2013-08-29 | 1 | -0/+5
    After an upgrade the NFS3 filehandle size changed (became smaller), but during a live upgrade the client would still send the old handle (expecting ESTALE, then doing a fresh lookup). When reading the old handle we were reading it into a structure limited to the size of the new handle, when we should have been reading into a buffer as big as the NFS3 spec permits a handle to be. The actor functions declare the structure on the stack, so the overflow results in stack corruption.
    Change-Id: Ie930875ac9db46b43d1cb8ad1e6d89cdaeded7ca
    BUG: 1002385
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.org/5730
    Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
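The idea behind this fix can be modelled in a few lines of Python. This is only a sketch, not the GlusterFS C code: `NFS3_FHSIZE` (64 bytes) is the handle limit from the NFSv3 specification, while `CURRENT_FH_SIZE` and the function name are hypothetical values chosen for illustration.

```python
import struct

NFS3_FHSIZE = 64        # largest file handle NFSv3 allows (RFC 1813)
CURRENT_FH_SIZE = 34    # hypothetical size of the server's current handle layout


def classify_handle(xdr_bytes: bytes) -> str:
    """Decode a length-prefixed opaque handle without trusting its length.

    Returns "ok", "stale" (valid wire format, old layout) or "malformed".
    """
    if len(xdr_bytes) < 4:
        return "malformed"
    (length,) = struct.unpack_from(">I", xdr_bytes, 0)
    # Bound the read by the protocol limit, not by the size of our own struct.
    if length > NFS3_FHSIZE or len(xdr_bytes) < 4 + length:
        return "malformed"
    if length != CURRENT_FH_SIZE:
        return "stale"   # the caller would answer ESTALE and the client re-looks-up
    return "ok"
```

An old, larger handle is then read safely and rejected as stale instead of overflowing a buffer sized for the new layout.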
* cluster/afr: unlock before aborting transaction | Anand Avati | 2013-08-29 | 1 | -0/+2
    Else this results in a missing frame causing a hang
    Change-Id: Ib5f3dc6a3999449faa2853cee2944af2fb065a20
    BUG: 1002399
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.org/5731
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/stripe: enable coalesce mode by default | Brian Foster | 2013-08-28 | 1 | -4/+4
    It has been available for a while now and is probably the sane default due to the more efficient layout and performance benefit.
    BUG: 1001207
    Change-Id: I6275f9741866c0afd6e685f8dc5867a86485fd20
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Reviewed-on: http://review.gluster.org/5624
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Add special handling for failure postops | Pranith Kumar K | 2013-08-28 | 4 | -26/+71
    The idea is to not leave the file in a FOOL-FOOL scenario when the data transaction failed with EDQUOT on all the bricks, to avoid adding unnecessary self-heal load to the system. For directory transactions, don't leave a pending changelog when the failures are seen on all the subvolumes.
    Change-Id: I38a5561d1d581a78347a76a4a509514e4a0c3fb7
    BUG: 969461
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/5709
    Reviewed-by: Anand Avati <avati@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* stripe: remove unused param, handle mem alloc failure | Kaleb S. KEITHLEY | 2013-08-28 | 1 | -2/+2
    Change-Id: I9c27b1edab111031ca8eea9cc49480ea01e39089
    BUG: 1002207
    Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
    Reviewed-on: http://review.gluster.org/5716
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* synctask: minor enhancements | Anand Avati | 2013-08-28 | 1 | -4/+1
    - Enhance syncenv_new() to accept scaling parameters of syncproc. Previously the scaling parameters were hardcoded and decided at compile time.
    - New API synctask_create() which returns the created synctask. This is similar to synctask_new(), which only returned the status of whether a synctask could be created or not. A NULL cbk in synctask_create() means the task is "joinable". Until synctask_join() is called on such a synctask, the task is not reaped and its resources are not destroyed. The task is in a zombie state after synctask_fn returns and before synctask_join() is called.
    Change-Id: I368ec9037de9510d2ba951f0aad86aaf18d9a6b6
    BUG: 986775
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.org/5365
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Brian Foster <bfoster@redhat.com>
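The "joinable" behaviour described for synctask_create() resembles joining a thread. The Python toy below only models the lifecycle (running, zombie, reaped); the class and attribute names are illustrative assumptions and do not mirror the C API.

```python
import threading


class SyncTask:
    """Toy stand-in for a synctask; all names here are illustrative only."""

    def __init__(self, fn, cbk=None):
        self.fn = fn
        self.cbk = cbk
        self.result = None
        self.state = "running"
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        self.result = self.fn()
        if self.cbk is None:
            self.state = "zombie"    # finished, resources kept until join()
        else:
            self.cbk(self.result)
            self.state = "reaped"    # callback tasks are cleaned up right away

    def join(self):
        if self.cbk is not None:
            raise RuntimeError("only tasks created without a callback are joinable")
        self._thread.join()
        self.state = "reaped"
        return self.result
```

A task created without a callback keeps its result until `join()` collects it, which is the zombie window the commit message mentions.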
* cluster/afr: Don't delay post op in cases of failures | Pranith Kumar K | 2013-08-28 | 3 | -11/+36
    Change-Id: Ib0c3af6babc61dc3ed45252582876e2f243d6446
    BUG: 958118
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/5635
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* nfs: persistent caching of connected NFS-clients | Niels de Vos | 2013-08-28 | 6 | -57/+456
    Introduce /var/lib/glusterfs/nfs/rmtab to contain a list of NFS-clients which have a volume mounted. The volume option 'nfs.mount-rmtab' can be set to an alternative filename. When the file is located on shared storage, multiple gNFS servers can use the same file to present a single NFS-server.
    This cache is read when a system administrator calls 'showmount -a' and updated when an NFS-client calls MNT or UMNT from the MOUNT protocol.
    Usage:
    - create a volume for storing the shared rmtab file
    - mount the volume on all storage servers, at the same location
    - make sure that the volume is mounted at boot (add to /etc/fstab)
    - place the rmtab file on the volume:
      # gluster volume set <VOLUME> nfs.mount-rmtab <MOUNTPOINT>/<FILENAME>
    - any subsequent mount requests will add an entry to this file
    - 'showmount -a' requests will return the NFS-clients using the cluster
    Note: The NFS-server currently does not support reconfigure(). When a configuration option is set or changed, the NFS-server glusterfs process gets restarted. This causes the active NFS-clients to be forgotten (the entries are saved in the old rmtab, but we no longer have a reference to that file, so we can't re-add them). Therefore the NFS-clients need to re-mount before they are listed in the rmtab again.
    Change-Id: I58f47135d60ad112849d647bea4e1129683dd2b3
    BUG: 904065
    Signed-off-by: Niels de Vos <ndevos@redhat.com>
    Reviewed-on: http://review.gluster.org/4430
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Harshavardhana <harsha@harshavardhana.net>
    Tested-by: Harshavardhana <harsha@harshavardhana.net>
    Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
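The MNT/UMNT/'showmount -a' flow above could be sketched as follows. This is a hedged Python model: the tab-separated on-disk layout and all function names are assumptions made for illustration, not the format or API the patch actually uses.

```python
import os
import tempfile


def _load(path):
    """Read (client, export) pairs; a missing file is an empty table."""
    try:
        with open(path) as f:
            return [tuple(line.rstrip("\n").split("\t", 1)) for line in f if line.strip()]
    except FileNotFoundError:
        return []


def _store(path, entries):
    """Rewrite the table via a temp file so readers never see a partial file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        for client, export in entries:
            f.write(f"{client}\t{export}\n")
    os.replace(tmp, path)


def on_mnt(path, client, export):
    entries = _load(path)
    if (client, export) not in entries:
        entries.append((client, export))
    _store(path, entries)


def on_umnt(path, client, export):
    _store(path, [e for e in _load(path) if e != (client, export)])


def showmount_all(path):
    return _load(path)


def demo():
    """Mount twice from one client, once from another, then unmount the first."""
    with tempfile.TemporaryDirectory() as d:
        rmtab = os.path.join(d, "rmtab")
        on_mnt(rmtab, "192.0.2.10", "/vol0")
        on_mnt(rmtab, "192.0.2.11", "/vol0")
        on_mnt(rmtab, "192.0.2.10", "/vol0")   # duplicate, ignored
        before = showmount_all(rmtab)
        on_umnt(rmtab, "192.0.2.10", "/vol0")
        return before, showmount_all(rmtab)
```

Because every call re-reads the file, several servers pointing at the same path on shared storage would see each other's entries; real code would also need locking between them, which this sketch omits.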
* mount/fuse: perform lookup() on inodes linked through readdirplus | Anand Avati | 2013-08-23 | 3 | -8/+66
    Some xlators still require the lookup() fop to be sent in order to work properly. This patch remembers inodes which have been linked through readdirplus and makes the resolver send lookups on them.
    Change-Id: Ibe8a04a659539d90dfc794521b51bf2bda017a0b
    BUG: 979910
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.org/5267
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* core: remove GLUSTERFS_CREATE_MODE_KEY usage | Amar Tumballi | 2013-08-23 | 3 | -35/+5
    Change-Id: I23b8cb7223b91a55af1cd4214f61bbe0e87351f6
    BUG: 952029
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-on: http://review.gluster.org/5683
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* mount/fuse: do not forget the root inode | Raghavendra Bhat | 2013-08-22 | 1 | -1/+4
    When forgetting inodes in a batch, the nodeid should be checked, and a forget must not be sent if it refers to the root inode.
    Change-Id: I99bd91ba70d8be4df88ddac005e38c449f4ed7d9
    BUG: 990744
    Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
    Reviewed-on: http://review.gluster.org/5471
    Reviewed-by: Brian Foster <bfoster@redhat.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
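A minimal Python sketch of the guard described here. `FUSE_ROOT_ID` is 1 in the FUSE protocol; the function name and the callback shape are hypothetical and stand in for the real handler.

```python
FUSE_ROOT_ID = 1   # nodeid the FUSE protocol reserves for the root inode


def batch_forget(entries, forget):
    """Forget every (nodeid, nlookup) pair except the root; return how many were skipped."""
    skipped = 0
    for nodeid, nlookup in entries:
        if nodeid == FUSE_ROOT_ID:
            skipped += 1        # the root inode must stay alive for the whole mount
            continue
        forget(nodeid, nlookup)
    return skipped
```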
* io-cache: fix unsafe typecasting of pointer to uint64 | Anand Avati | 2013-08-22 | 1 | -1/+3
    Typecasting a pointer to uint64_t *, followed by inode_ctx_get() storing 64 bits through it, results in memory corruption on 32-bit systems.
    Change-Id: I32fa3bf3b853ed2690a9b9a471099a59b9d7186a
    BUG: 997902
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.org/5682
    Tested-by: Morten Johansen <morten@bzzt.no>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Santosh Pradhan <spradhan@redhat.com>
    Reviewed-by: Brian Foster <bfoster@redhat.com>
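The corruption can be reproduced in miniature with a byte buffer standing in for a 32-bit stack frame. This Python sketch is an illustration under assumed layout (a 4-byte pointer slot followed by a 4-byte neighbour); the helper names are hypothetical.

```python
import struct


def make_frame() -> bytearray:
    """4-byte pointer slot followed by a 4-byte neighbour holding 0xDEADBEEF."""
    return bytearray(struct.pack("<II", 0, 0xDEADBEEF))


def unsafe_get(mem: bytearray, ptr_off: int, ctx_value: int) -> None:
    """The bug: store a 64-bit value directly at the address of a 32-bit pointer."""
    struct.pack_into("<Q", mem, ptr_off, ctx_value)   # 8 bytes into a 4-byte slot


def safe_get(mem: bytearray, ptr_off: int, ctx_value: int) -> None:
    """The fix pattern: fetch into a 64-bit temporary, then narrow to pointer width."""
    tmp = struct.pack("<Q", ctx_value)
    mem[ptr_off:ptr_off + 4] = tmp[:4]
```

With `unsafe_get` the neighbouring word is overwritten by the upper half of the 64-bit value; with `safe_get` it is left intact.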
* gfid-access: virtual access to filesystem through gfid path | Amar Tumballi | 2013-08-21 | 6 | -1/+1313
    BUG: 952029
    Change-Id: I7405d473d369a4a951836eceda4faccbad19ce0e
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-on: http://review.gluster.org/5497
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* core: changes to support gfid-access | Amar Tumballi | 2013-08-21 | 3 | -10/+26
    Change-Id: I38d2fdc47e4b805deafca6805e54807976ffdb7e
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    BUG: 952029
    Reviewed-on: http://review.gluster.org/5496
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* Revert "fuse: auxiliary gfid mount support" | Amar Tumballi | 2013-08-21 | 8 | -1268/+112
    This reverts commit 4c0f4c8a89039b1fa1c9c015fb6f273268164c20.
    Conflicts:
      xlators/mount/fuse/src/fuse-bridge.c
    For build issues added CREATE_MODE_KEY definition in:
      libglusterfs/src/glusterfs.h
    Change-Id: I8093c2a0b5349b01e1ee6206025edbdbee43055e
    BUG: 952029
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-on: http://review.gluster.org/5495
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: add check in remove-brick start variant | Ravishankar N | 2013-08-21 | 1 | -11/+11
    The 'start' variant of the remove-brick command only applies at the dht level, where we can remove all the bricks of a sub-volume (and remove multiple such sub-volumes) but not selected bricks of it. This patch disallows removing individual replica bricks of multiple sub-volumes (i.e. reducing the replica count of the volume) using remove-brick 'start'. The preferred method for such an operation is to use commit force.
    This patch also reverts the check to prevent removal of bricks from a replicate volume (commit 0d415f7)
    BUG: 961669
    Change-Id: I447ad27f73a0963b5e09fb317bf7267a7a5a6147
    Signed-off-by: Ravishankar N <ravishankar@redhat.com>
    Reviewed-on: http://review.gluster.org/5566
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* md-cache: invalidate attributes on xattr update | Anand Avati | 2013-08-19 | 1 | -0/+164
    xattr update will result in at least ctime change. So invalidate attributes in xattr callback.
    Change-Id: Ie6e8f2fd9a11c56c27e78bd58c2ff1e1d6edce6e
    BUG: 953694
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.org/5641
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mount/fuse: save the basefd flags in the new fd | Raghavendra Bhat | 2013-08-19 | 1 | -0/+1
    Upon graph switch, the basefd's flags were not saved in the new fd created for the new graph upon which all the further requests for the open file would come. Thus posix was treating the fd as a read-only fd and was denying the write on the fds.
    Change-Id: I781b62b376a85d1a938c091559270c3f242f1a2a
    BUG: 998352
    Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
    Reviewed-on: http://review.gluster.org/5601
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: release big locks while doing mount | Anand Avati | 2013-08-18 | 1 | -0/+4
    Else things can deadlock in getspec v/s glusterd_do_mount()
    Change-Id: Ie70b43916e495c1c8f93e4ed0836c2fb7b0e1f1d
    BUG: 997576
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.org/5636
    Tested-by: Joe Julian <joe@julianfamily.org>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
* glusterd: Try to start all bricks on 'start force' | Kaushal M | 2013-08-18 | 2 | -6/+13
    A volume would fail to start if any one of the bricks fails staging or fails to start, even with the 'force' option. With this patch, when the 'force' option is given for a volume start, glusterd will continue and start other bricks even if one fails staging or starting.
    Also did a small fix in changelog, to prevent it crashing when it fails to init.
    Change-Id: I7efbd9ab13d12d69b0335ae54143fa17586f8f98
    BUG: 994375
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.org/5510
    Reviewed-by: Venky Shankar <vshankar@redhat.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cli: Add server uuid into volume brick info xml | Timothy Asir | 2013-08-18 | 1 | -0/+10
    Add server uuid as an attribute to the existing brick details in the volume info cli xml output.
    Currently, when a node has more than one ip, the oVirt-engine fails to map the corresponding server using the ip alone. If we get the host uuid along with brick details in volume info command it will be easy for ovirt-engine to find out the server and thereby we can avoid confusion in finding the server.
    Change-Id: I3c9c9acea80e10e0b2977477759d9af045e48959
    BUG: 955588
    Signed-off-by: Timothy Asir <tjeyasin@redhat.com>
    Reviewed-on: http://review.gluster.org/4875
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Move certain logs into 'DEBUG' level | Harshavardhana | 2013-08-18 | 1 | -3/+3
    Confusing "Error" messages in logs can cause user panic and false positives - avoid them as necessary in future.
    Change-Id: I906c64eea879b19a8db099c89d1d7f874e5530db
    BUG: 995784
    Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
    Reviewed-on: http://review.gluster.org/5555
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Del GF_READDIR_SKIP_DIRS key from dict for first_up | shishir gowda | 2013-08-18 | 2 | -4/+12
    Currently, we send GF_READDIR_SKIP_DIRS to all subvolumes if first_subvol != first_up_subvolume. Also, first_up_subvolume can change within the life of a call and its cbk. Save the first_up_subvol in dht_local for the checks.
    Change-Id: I6e369e63f29c9761993f2a66ed768c424bb44d27
    BUG: 996474
    Signed-off-by: shishir gowda <sgowda@redhat.com>
    Reviewed-on: http://review.gluster.org/5577
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Add largest file is source policy | Anand Avati | 2013-08-14 | 2 | -29/+93
    For Write Once Read Many times type of work-load choosing largest file to be the source will always resolve fool-fool scenarios correctly. In other cases we fsync() the files and will have a reliable 'wise man'.
    Change-Id: Ic4dbea8d06db6d578fbcb866fb65ee2d066ac7ba
    BUG: 958118
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.org/5519
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
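The selection rule itself is simple. The Python sketch below assumes per-brick sizes are already known and that ties go to the lowest brick index; both the function name and the tie-break are assumptions, since the commit message does not say how ties are handled.

```python
def pick_source(sizes):
    """Return the index of the brick holding the largest copy, or None.

    sizes: per-brick file size in bytes, None where the brick is unreachable.
    Ties are resolved in favour of the lowest index (an assumption).
    """
    candidates = [(size, -index) for index, size in enumerate(sizes) if size is not None]
    if not candidates:
        return None
    _, neg_index = max(candidates)
    return -neg_index
```

For a write-once workload a shorter copy can only be an incomplete one, which is why the largest copy is a safe source there.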
* performance/write-behind: invoke request queue processing if we find fd marked bad while trying to fulfill lies | Raghavendra G | 2013-08-14 | 1 | -19/+30
    * flush was queued behind some unfulfilled write.
    * A previously wound write returned an error and hence fd was marked bad with corresponding error.
    * wb_fulfill_head (invocation probably rooted in wb_flush), before winding, checks for failures of previous writes and, since there was a failure, calls wb_head_done without even winding one request in head.
    * wb_head_done unrefs all the requests in list "head".
    * since flush was the last operation on fd (and most likely the last operation on the inode itself), no one invokes wb_process_queue and flush is stuck in the request queue for eternity.
    Change-Id: I3b5b114a1c401d477dd7ff64fb6119b43fda2d18
    BUG: 988642
    Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
    Reviewed-on: http://review.gluster.org/5398
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* afr: treat appending writes as stable writes. | Anand Avati | 2013-08-13 | 4 | -3/+68
    Durability of appending writes is implicit in the file size. Therefore performing an explicit fsync() is unnecessary in such cases, as self-heal can check the size of the file when the pending changelog is ambiguous.
    Change-Id: I05446180a91d20e0dbee5de5a7085b87d57f178a
    BUG: 927146
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.org/5501
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* posix: Default value for `batch-fsync-delay-usec` should be '0' | Harshavardhana | 2013-08-13 | 1 | -3/+3
    Also fixes the failing testcase `./tests/bugs/bug-888174.t`, which has been failing sporadically for many patches.
    Change-Id: Ic7d2c95da5d3126623cec403207afadd449bf950
    BUG: 927146
    Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
    Reviewed-on: http://review.gluster.org/5620
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: remove-brick: Allow simultaneous removal of multiple subvolumes. | Ravishankar N | 2013-08-13 | 2 | -36/+89
    Currently, remove-brick supports removal of only one distributed stripe/replica pair at a time. Fix it to support removal of multiple pairs. This is consistent with add-brick behaviour which supports adding multiple stripe/replica pairs simultaneously. Removal is successful irrespective of the order of the bricks given at the CLI, as long as the bricks are from the same subvolume(s).
    Change-Id: I7c11c1235ce07b124155978b9d48d0ea65396103
    BUG: 974007
    Signed-off-by: Ravishankar N <ravishankar@redhat.com>
    Reviewed-on: http://review.gluster.org/5210
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
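The order-independent check described above can be sketched in Python. The sketch assumes that subvolumes are formed from consecutive groups of bricks in the volume's brick list; the function name and the return convention are hypothetical.

```python
def subvolumes_to_remove(bricks, group_size, to_remove):
    """Return the indices of the subvolumes to drop, or None if the request is invalid.

    bricks:     the volume's brick list, grouped consecutively into subvolumes
    group_size: bricks per subvolume (replica or stripe count)
    to_remove:  bricks named on the command line, in any order
    """
    requested = set(to_remove)
    if not requested or requested - set(bricks):
        return None                      # nothing requested, or an unknown brick
    chosen = []
    for start in range(0, len(bricks), group_size):
        group = set(bricks[start:start + group_size])
        hit = requested & group
        if not hit:
            continue
        if hit != group:
            return None                  # only part of a subvolume was named
        chosen.append(start // group_size)
    return chosen
```

For example, with bricks `a1 a2 b1 b2 c1 c2` and a group size of 2, naming `c2 a1 a2 c1` selects subvolumes 0 and 2, while naming only `a1 b1` is rejected.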
* log: set ident to openlog | Bala.FA | 2013-08-13 | 1 | -1/+2
    At the syslog side, a log message is identified by its properties such as programname, pid, etc. Brick/mount processes need to be identified uniquely as they are different processes of glusterfsd/glusterfs. At the rsyslog side, separating logs by programname/app-name with pid works, but in the long run it is hard to tell which process serves which brick/mount.
    This patch fixes that by setting an identity string at openlog(), which sets programname/app-name similar to the old-style log file, prefixed by gluster, glusterd, glusterfs or glusterfsd
    Change-Id: Ia05068943fa67ae1663aaded1444cf84ea648db8
    BUG: 928648
    Signed-off-by: Bala.FA <barumuga@redhat.com>
    Reviewed-on: http://review.gluster.org/5541
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: skip directory inspection when entry self-heal is off | Anand Avati | 2013-08-11 | 1 | -1/+1
    When user has explicitly configured to disable entry self-heal in the client, it is wrong to do the healing in opendir. So skip it. This is especially useful to reduce opendir() times after graph switches.
    Change-Id: Ic6eb9ff2334a5b8417f2f35410a366a536bad5df
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.org/5528
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Unwind frame on error in readdir[p] | Pranith Kumar K | 2013-08-08 | 1 | -2/+2
    Change-Id: I5701bf115e0aa1adb4fb52f5418534910a2268d4
    BUG: 994959
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/5531
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* afr: check for non-zero call_count before doing a stack wind | Ravishankar N | 2013-08-07 | 1 | -0/+5
    When one of the bricks of a 1x2 replicate volume is down, writes to the volume cause a race between afr_flush_wrapper() and afr_flush_cbk(). The latter frees the call_frame's local variables in the unwind, while the former accesses them in the for loop and sends a stack wind a second time. This causes the FUSE mount process (glusterfs) to receive a SIGSEGV when the corresponding unwind is hit.
    This patch adds the call_count check which was removed when afr_flush_wrapper() was introduced in commit 29619b4e
    Change-Id: I87d12ef39ea61cc4c8244c7f895b7492b90a7042
    BUG: 988182
    Signed-off-by: Ravishankar N <ravishankar@redhat.com>
    Reviewed-on: http://review.gluster.org/5393
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
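The guard pattern can be modelled in Python. This is a sketch under assumptions: the expected number of winds is counted before the loop, and the loop stops once that many have been sent, so shared state is never consulted after the final wind. The names are illustrative and the real change in afr_flush_wrapper() may differ in detail.

```python
def wind_to_up_children(child_up, wind):
    """Send one request per child that was up when the loop started.

    child_up: mutable list of booleans, possibly changed by callbacks
    wind:     function called with the child index; may run the callback inline
    Returns the number of requests sent.
    """
    call_count = sum(1 for up in child_up if up)   # snapshot taken before any wind
    sent = 0
    if call_count == 0:
        return sent
    for index, up in enumerate(child_up):
        if not up:
            continue
        wind(index)
        sent += 1
        call_count -= 1
        if call_count == 0:
            break      # the last callback may already have torn the state down
    return sent
```

In the test scenario, a callback that flips the second child to "up" during the first wind does not trigger a second, unexpected wind.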
* md-cache: fix xattr caching code in getxattr | Anand Avati | 2013-08-07 | 1 | -2/+2
    Bad condition check, fix it!
    Change-Id: I6e047de70f77d7b98b2ca771a467f14a76fd62fe
    BUG: 994392
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.org/5513
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/locks: Convert old style metadata locks to new-style | Pranith Kumar K | 2013-08-07 | 1 | -0/+48
    Problem:
    In 3.3, inode locks of both metadata and data are competing in same domain called data domain (old style). This coupled with eager-lock, delayed post-ops introduce delays for metadata operations like chmod, chown etc. To avoid this problem, inode locks for metadata ops are moved to different domain called metadata domain in 3.4 (new style). But when both 3.3 clients and 3.4 clients are present, 3.4 clients for metadata operations still need to take locks in "old style" so that proper synchronization happens across 3.3 and 3.4 clients. Only when all clients are >= 3.4 locks will be taken in "new style" for metadata locks. Because of this behavior as long as at least one 3.3 client is present, delays will be perceived for doing metadata operations on all 3.4 clients while data operations are in progress (Ex: Untar will untar one file per sec).
    Fix:
    Make locks xlators translate old-style metadata locks to new-style metadata locks. Since upgrade process suggests upgrading servers first and then clients, this approach gives good results.
    Tests:
    1) Tested that old style metadata locks are converted to new style by locks xlator using gdb
    2) Tested that disconnects purge locks in meta-data domain as well using gdb and statedumps.
    3) Tested that untar performance is not hampered by meta-data and data operations.
    4) Had two mounts one with orthogonal-meta-data on and other with orthogonal-meta-data off ran chmod 777 <file> on one mount and chmod 555 <file> on the other mount in while loops when I took statedumps I saw that both the transports are taking lock on same domain with same range.
      18:49:30 :) ⚡ sudo grep -B1 "ACTIVE" /usr/local/var/run/gluster/home-gfs-r2_0.324.dump.*
      home-gfs-r2_0.324.dump.1375794971-lock-dump.domain.domain=r2-replicate-0:metadata
      home-gfs-r2_0.324.dump.1375794971:inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 7525, owner=78f9e652497f0000, transport=0x15ac9e0, , granted at Tue Aug 6 18:46:11 2013
      home-gfs-r2_0.324.dump.1375795051-lock-dump.domain.domain=r2-replicate-0:metadata
      home-gfs-r2_0.324.dump.1375795051:inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 8879, owner=0019cc3cad7f0000, transport=0x158f580, , granted at Tue Aug 6 18:47:31 2013
    Change-Id: I268df4efd93a377a0c73fbc59b739ef12a7a8bb6
    BUG: 993981
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/5503
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
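A hedged Python sketch of the translation step. The ":metadata" domain suffix and the lock range (start 9223372036854775806, len 0) are taken from the statedump above; treating that range as the way to recognise an old-style metadata lock is an assumption made for this illustration, as are the names.

```python
LLONG_MAX = 2**63 - 1
METADATA_START = LLONG_MAX - 1     # 9223372036854775806, as seen in the statedump
METADATA_SUFFIX = ":metadata"


def translate_domain(domain: str, start: int, length: int) -> str:
    """Map an old-style metadata lock in the data domain to the metadata domain.

    Data locks and locks already in the metadata domain pass through unchanged.
    """
    is_metadata_range = start == METADATA_START and length == 0
    if is_metadata_range and not domain.endswith(METADATA_SUFFIX):
        return domain + METADATA_SUFFIX
    return domain
```

With such a mapping on the server, locks from old and new clients meet in the same domain, which matches test 4 above.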
* Correcting a log message in glusterd-geo-rep.c | M S Vishwanath Bhat | 2013-08-05 | 1 | -1/+1
    Change-Id: I4352f513fc5616daa20e9a4ad51a63fb13a27dff
    BUG: 847839
    Signed-off-by: M S Vishwanath Bhat <vbhat@redhat.com>
    Reviewed-on: http://review.gluster.org/5472
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Add switch and nufa options to 'gluster cli' | Harshavardhana | 2013-08-03 | 2 | -17/+53
    Change-Id: Ic3c43291e0e1ead0d89c0436e8d70aa5dee2f543
    BUG: 924488
    Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
    Reviewed-on: http://review.gluster.org/5391
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* fuse: fix memory leak in fuse_getxattr() | Ravishankar N | 2013-08-03 | 1 | -13/+16
    The fuse_getxattr() function was not freeing fuse_state_t resulting in a memory leak. As a result, when continuous writes (run dd command in a loop) were done from a FUSE mount point, the OOM killer killed the client process (glusterfs).
    Change-Id: I6ded1a4c25d26ceab0cb3b89ac81066cb51343ec
    BUG: 988182
    Signed-off-by: Ravishankar N <ravishankar@redhat.com>
    Reviewed-on: http://review.gluster.org/5392
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* cli,glusterd: Fix when tasks are shown in 'volume status' | Kaushal M | 2013-08-03 | 1 | -0/+4
    Asynchronous tasks are shown in 'volume status' only for a normal volume status request for either all volumes or a single volume.
    Change-Id: I9d47101511776a179d213598782ca0bbdf32b8c2
    BUG: 888752
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.org/5308
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* fuse-bridge: update to protocol minor version 22 | Brian Foster | 2013-08-03 | 1 | -0/+4
    7.17
      - Distinguishes between POSIX and BSD locking support via a separate BSD locking support init flag. Older protocol versions (since BSD support was added) export both types of locking requests if FUSE_POSIX_LOCKS is specified. Gluster sets this flag, so set FUSE_FLOCK_LOCKS as well on kernels that support version 17 or newer.
    7.18
      - Adds ioctl() support for directories (and the associated FUSE_IOCTL_DIR flag). Gluster does not support the ioctl request, so no changes are required. Update the header.
      - Adds support for the delete notification to allow a filesystem to inform the kernel of a deleted inode. No gluster changes required.
    7.19
      - Adds support for the fallocate request. Gluster already supports fallocate and includes the request opcode definition and data structure. Update the header version number.
    7.20
      - Adds the FUSE_AUTO_INVAL_DATA init flag to enable attribute updates on reads and automatic cache invalidation on mtime changes. Behavior does not change unless the init flag is specified, no gluster changes required. Update header.
    7.21
      - Adds readdirplus support and updates the poll request to include events. Gluster already supports readdirplus and includes the relevant data structures. Poll is not supported, so no changes are required. Update the header with some missing READDIRPLUS_AUTO bits.
    7.22
      - Adds real asynchronous direct I/O support. Gluster already supports/enables the associated bit (FUSE_ASYNC_DIO), no further changes are required. Update the header.
    BUG: 990744
    Change-Id: Idf6fd75bbd48189587e548f7624626f9a75309e8
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Reviewed-on: http://review.gluster.org/5489
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
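The 7.17 item amounts to a version-gated init flag. A minimal Python sketch follows; the two bit values match the Linux FUSE protocol header, while the function name is hypothetical and the rest of the init negotiation is left out.

```python
FUSE_POSIX_LOCKS = 1 << 1    # POSIX (fcntl-style) lock requests
FUSE_FLOCK_LOCKS = 1 << 10   # BSD (flock-style) lock requests, protocol 7.17+


def lock_init_flags(kernel_minor: int) -> int:
    """Flags to request so that both lock types keep reaching the filesystem."""
    flags = FUSE_POSIX_LOCKS
    if kernel_minor >= 17:
        # From 7.17 on, BSD locks are exported only when asked for explicitly.
        flags |= FUSE_FLOCK_LOCKS
    return flags
```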
* performance/open-behind: Fix fd-leaks in unlink, rename | Pranith Kumar K | 2013-08-03 | 1 | -0/+4
    Change-Id: Ia8d4bed7ccd316a83c397b53b9c1b1806024f83e
    BUG: 991622
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/5493
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Use volume op-versions during volgen | Kaushal M | 2013-08-02 | 4 | -19/+14
    Instead of using the cluster op-version, volume op-version is used to enable open-behind during volgen. For doing this, the volume op-versions are updated before regenerating the volfiles.
    Change-Id: I675bb549bf7c7c0279030dca698fb530781addc6
    BUG: 990830
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.org/5385
    Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: Disable eager-lock if open-fd-count > 1 | Pranith Kumar K | 2013-08-02 | 4 | -5/+98
    Let's say mount1 holds an eager-lock (full-lock) and, after the eager-lock is taken, mount2 opens the same file. mount2 won't be able to perform any data operations until mount1 releases the eager-lock. To avoid this scenario, do not enable eager-lock for a transaction if open-fd-count is > 1. Delaying of changelog piggybacking is avoided in this situation.
    Change-Id: I51b45d6a7c216a78860aff0265a0b8dabc6423a5
    BUG: 910217
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/5432
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: venkatesh somyajulu <vsomyaju@redhat.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Enable Open-fd-count query in writev | Pranith Kumar K | 2013-08-02 | 1 | -1/+39
    Change-Id: I86bdf865730416150c10617dcbad5c037579acde
    BUG: 910217
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/5433
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* performance/io-threads: fix potential use after free crash | Brian Foster | 2013-08-01 | 1 | -1/+1
    do_iot_schedule() enqueues the stub and kicks the worker thread. The stub is eventually destroyed after it has been resumed and thus unsafe to access after being enqueued. Though likely difficult to reproduce in a real deployment, a crash is reproducible by running a smallfile benchmark on a replica 2 volume on a single vm.
    Reorder the debug log message prior to the do_iot_schedule() call to avoid the crash.
    BUG: 989579
    Change-Id: Ifc6502c02ae455c959a90ff1ca62a690e31ceafb
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Reviewed-on: http://review.gluster.org/5418
    Reviewed-by: Santosh Pradhan <spradhan@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* fuse-bridge: update to fuse protocol minor version 16 (Linux) | Brian Foster | 2013-08-01 | 1 | -9/+30
    7.14
      - Splice write support to fuse device node. No gluster changes required besides header update.
    7.15
      - Store/retrieve notification support. No gluster changes required besides header update.
    7.16
      - BATCH_FORGET request support. Implement a handler for BATCH_FORGET requests and update the header.
      - Updated ioctl() ABI. No gluster changes required besides header update.
    BUG: 990744
    Change-Id: If3061a720ba566ee6731ad8b77cdc665d8fbf781
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Reviewed-on: http://review.gluster.org/5449
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* nfs: Fix for NFS crash during blocking NLM call. | Rajesh Joseph | 2013-08-01 | 1 | -1/+1
  Bug 990887: During a blocking NLM call the NFS server crashes.

  Cause: When the nlm4_establish_callback function is called from
  nlm4svc_send_granted, the cs->req->trans pointer is NULL, so using this
  pointer results in a crash. cs->trans, on the other hand, points to a valid
  transport object. NLM should use cs->trans instead of cs->req->trans.

  Fix: Replaced cs->req->trans with cs->trans.

  Change-Id: I425e48e0aafc9a6c130912edf2e801d8c4c9472d
  BUG: 990887
  Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
  Reviewed-on: http://review.gluster.org/5452
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* dht: make linkfile creation mode explicitly get set | Anand Avati | 2013-07-31 | 1 | -0/+9
  Because of a POSIX default ACL on the parent directory, the mode of a
  linkfile can get masked by the mode in the default ACL. This breaks DHT
  integrity, so let the mode get explicitly reset after mknod().

  Change-Id: Ia7328e1ee7b4430bda308f9da293dba78405e081
  BUG: 990410
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/5440
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
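The create-then-reset pattern described here can be sketched with plain POSIX calls. The helper name `create_with_exact_mode()` is hypothetical and this is not the DHT code path. The sketch relies on the fact that the mode passed to mknod() may be reduced by the process umask or by a default ACL on the parent directory, while a following chmod() sets the permission bits as given.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* for mknod() and S_IFREG under strict -std modes */
#endif

#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a regular file and make sure it ends up with exactly `mode`,
 * whatever masking was applied at creation time.
 * Returns 0 on success, -1 on failure (errno set by the failing call). */
static int create_with_exact_mode(const char *path, mode_t mode)
{
    assert(path != NULL);

    /* The resulting mode may be narrower than requested at this point. */
    if (mknod(path, S_IFREG | mode, 0) != 0)
        return -1;

    /* Reset the mode explicitly so later checks see the intended bits. */
    if (chmod(path, mode) != 0) {
        unlink(path); /* do not leave a file with the wrong mode behind */
        return -1;
    }
    return 0;
}
```

A restrictive umask reproduces the masking effect without needing an ACL-enabled filesystem: with `umask(077)` in place, a file created through this helper with mode 0644 still ends up as 0644.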
* cluster/dht: Re-initialize skipped file count in glusterd | shishir gowda | 2013-07-31 | 1 | -0/+1
  Change-Id: I42d08b3a6a7a3839f5e9953e1f83959222c080f8
  Signed-off-by: shishir gowda <sgowda@redhat.com>
  BUG: 989846
  Reviewed-on: http://review.gluster.org/5446
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd : Checking session created or not in case of geo-rep stop | Avra Sengupta | 2013-07-31 | 1 | -2/+6
  Perform a statefile check on geo-rep stop so that a proper error message is
  given when the session has not been created. With geo-rep stop force,
  however, the command is allowed to succeed even if the session has not been
  created, because the stop command is a failsafe for stopping running geo-rep
  sessions on any node.

  Change-Id: I2b6a0253de977633606c422cbbc9e37cede9a268
  BUG: 989541
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/5417
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd : initiating gsyncd restart during add-brick | Avra Sengupta | 2013-07-31 | 3 | -25/+150
  During add-brick, when a new brick is added on a node that was already part
  of the existing volume and gsyncd was already running on that node, all
  gsyncd processes running on that node, for that particular master and any
  slave sessions, will be restarted.

  If a new brick is added on a new node, then after adding the brick the user
  has to perform the following steps:
  1. gluster system:: execute gsec_create
  2. gluster volume geo-replication <master-vol> <slave-vol> create push-pem force
  3. gluster volume geo-replication <master-vol> <slave-vol> start force

  Change-Id: I4b9633e176c80e4a7cf33f42ebfa47ab8fc283f1
  BUG: 989532
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/5416
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>