summaryrefslogtreecommitdiffstats
path: root/xlators/protocol
Commit message (Collapse)AuthorAgeFilesLines
* protocol/client: Added lk_ctx info in fdctx dumpKrishnan Parthasarathi2012-03-074-27/+83
| | | | | | | | | | | | | | | | | | - Added a brief explanation as to why we can't use gf_log when in statedump. - Removed gf_log messages from client_priv_dump since it can cause a 'deadlock' - See statedump.c for explanation - Added try-lock based accessors for fd_lk_list for dump purposes. Change-Id: I1d755a4ef2c568acf22fb8c4ab0a33a4f5fd07b4 BUG: 789858 Signed-off-by: Krishnan Parthasarathi <kp@gluster.com> Reviewed-on: http://review.gluster.com/2882 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol/client: Free readdirp xdr leakPranith Kumar K2012-03-071-0/+1
| | | | | | | | | Change-Id: I5e46deedd93e852a483693de42e4bec0082bc08b BUG: 796186 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/2886 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* fops/removexattr: prevent users from removing glusterfs xattrsRajesh Amaravathi2012-03-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | * Each xlator prevents the user from removing xlator-specific xattrs like trusted.gfid by handling it in respective removexattr functions. * For xlators which did not define remove and fremovexattr, the functions have been implemented with appropriate checks. xlator | fops-added _______________|__________________________ | 1. stripe | removexattr and fremovexattr 2. quota | removexattr and fremovexattr Change-Id: I98e22109717978134378bc75b2eca83fefb2abba BUG: 783525 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.com/2836 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* Fix memory leaks found in readdirVijay Bellur2012-03-011-0/+6
| | | | | | | | | | | Change-Id: I0e221e4de9ee12586b09cd8bf7f394e9d4b88a11 BUG: 765785 Reviewed-on: http://review.gluster.com/2853 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendrabhat@gluster.com> Reviewed-by: Krishnan Parthasarathi <kp@gluster.com> Reviewed-by: Pranith Kumar Karampuri <pranithk@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol/client: Calling GF_FREE on memory allocated via GF_CALLOC.Mohammed Junaid2012-03-011-1/+1
| | | | | | | | | | | | This is a temporary fix. A clean fix would be to allocate memory using mem_get0 and free via mem_put. Change-Id: I6351ab22c2f05ba8fa4aaad67f375027df873807 BUG: 796656 Signed-off-by: Mohammed Junaid <junaid@redhat.com> Reviewed-on: http://review.gluster.com/2852 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* protocol/server: Make conn object ref-countedPranith Kumar K2012-03-015-198/+90
| | | | | | | | | Change-Id: I992a7f8a75edfe7d75afaa1abe0ad45e8f351c8b BUG: 796581 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/2806 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol/client: send unique key to server during handshakeAmar Tumballi2012-02-291-2/+7
| | | | | | | | | | | | utilize the graph->id for making the key unique. Change-Id: I0c1b355aa901af88e65fd12cb9e0535318856867 BUG: 783982 Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.com/2831 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Mohammed Junaid <junaid@redhat.com>
* transport/socket: configuring tcp window-sizeRajesh Amaravathi2012-02-294-0/+14
| | | | | | | | | | | | | | | | | | | | | | | Till now, send and recieve buffer window sizes for sockets were set to a default glusterfs-specific value. Linux's default window sizes have been found to be better w.r.t performance, and hence, no more setting it to any default value. However, if one wishes, there's the new configuration option: network.tcp-window-size <sane_size> which takes a size value (int or human readable) and will set the window size of sockets for both clients and servers. Nfs clients will also be updated with the same. Change-Id: I841479bbaea791b01086c42f58401ed297ff16ea BUG: 795635 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.com/2821 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* glusterfsd: unref the dict and free the memory to avoid memleakRaghavendra Bhat2012-02-271-0/+2
| | | | | | | | | | Change-Id: Ib7a1f8cbab039fefb73dc35560a035d5688b0e32 BUG: 796186 Signed-off-by: Raghavendra Bhat <raghavendrabhat@gluster.com> Reviewed-on: http://review.gluster.com/2808 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol/client: Pass correct dict in client_readdirpshishir gowda2012-02-222-1/+3
| | | | | | | | | | | Also, alloc entry->dict before calling unserialize to it. Change-Id: I8a9db93afd6e95e75307467cd654805780d7b467 BUG: 796534 Signed-off-by: shishir gowda <shishirng@gluster.com> Reviewed-on: http://review.gluster.com/2803 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol/client: Print correct error messagePranith Kumar K2012-02-221-1/+1
| | | | | | | | | Change-Id: Ic68626c4a205cd78b60831aa7bd838b6d8824fa1 BUG: 796195 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/2800 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* mempool: adjustments in pool sizesAmar Tumballi2012-02-223-6/+6
| | | | | | | | | | | | | | | | | * while creating 'rpc_clnt', the caller knows what would be the ideal load on it, so an extra argument to set some pool sizes * while creating 'rpcsvc', the caller knows what would be the ideal load of it, so an extra argument to set request pool size * cli memory footprint is reduced Change-Id: Ie245216525b450e3373ef55b654b4cd30741347f Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 765336 Reviewed-on: http://review.gluster.com/2784 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: utilize mempool for frame->local allocationsAmar Tumballi2012-02-217-30/+31
| | | | | | | | | | | | | | | in each translator, which uses 'frame->local', we are using GF_CALLOC/GF_FREE, which would be costly considering the number of allocation happening in a lifetime of 'fop'. It would be good to utilize the mem pool framework for xlator's local structures, so there is no allocation overhead. Change-Id: Ida6e65039a24d9c219b380aa1c3559f36046dc94 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 765336 Reviewed-on: http://review.gluster.com/2772 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* rpc/clnt: handle PARENT_DOWN event appropriatelyAmar Tumballi2012-02-213-27/+83
| | | | | | | | | | Change-Id: I4644e944bad4d240d16de47786b9fa277333dba4 BUG: 767862 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.com/2735 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol/client,server: fcntl lock self healing.Mohammed Junaid2012-02-2012-47/+867
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently(with out this patch), on a disconnect the server cleans up the transport which inturn closes the fd's and releases the locks acquired on those fd's by that client. On a reconnect, client just reopens the fd's but doesn't reacquire the locks. The application that had previously acquired the locks still is under the assumption that it is the owner of those locks which might have been granted to other clients(if they request) by the server leading to data corruption. This patch allows the client to reacquire the fcntl locks (held on the fd's) during client-server handshake. * The server identifies the client via process-uuid-xl (which is a combination of uuid and client-protocol name, it is assumed to be unique) and lk-version number. * The client maintains a list of process-uuid-xl, lk-version pair for each accepted connection. On a connect, the server traverses the list for a matching pair, if a matching pair is not found the the server returns lk-version with value 0, else it returns the lk-version it has in store. * On a disconnect, the server and client enter grace period, and on the completion of the grace period, the client bumps up its lk-version number (which means, it will reacquire the locks the next time) and the server will distroy the connection. If reconnection happens within the grace period, the server will find the matching (process-uuid-xl, lk-version) pair in its list which guarantees that the fd's and there corresponding locks are still valid for this client. Configurable options: To set grace-timeout, the following options are option server.grace-timeout value option client.grace-timeout value To enable or disable the lk-heal, option lk-heal [on|off] gluster volume set command can be used to configurable options Change-Id: Id677ef1087b300d649f278b8b2aa0d94eae85ed2 BUG: 795386 Signed-off-by: Mohammed Junaid <junaid@redhat.com> Reviewed-on: http://review.gluster.com/2766 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol/client: delete locks only for non-anonymous fdsRajesh Amaravathi2012-02-202-4/+3
| | | | | | | | | | | | | delete_granted_lock_owners () is not called for anonymous fds since they are not involved in locking Change-Id: Icdc7818f98f5371232ba276ed442704ef69e6b0e BUG: 787365 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.com/2754 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* iobuf: use 'iobuf_get2()' to get variable sized buffersAmar Tumballi2012-02-201-13/+17
| | | | | | | | | | | added 'TODO' in places where it is missing. Change-Id: Ia802c94e3bb76930f7c88c990f078525be5459f5 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 765264 Reviewed-on: http://review.gluster.com/388 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* glusterd: auth allow enhancementsRajesh Amaravathi2012-02-202-8/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * PROBLEM: When address-based authentication is enabled on a volume, the gNfs server, self-heal daemon (shd), and other operations such as quota, rebalance, replace-brick and geo-replication either stop working or the services are not started if all the peers' ipv{4,6} addresses or hostnames are not added in the "set auth.allow" operation, breaking the functionality of several operations. E.g: volume vol in a cluster of two peers: /mnt/brick1 in 192.168.1.4 /mnt/brick2 in 192.168.1.5 option auth.allow 192.168.1.6 (allow connection requests only from 192.168.1.6) This will disrupt the nfs servers on 192.168.1.{4,5}. brick server processes reject connection requests from both nfs servers (on 4,5), because the peer addresses are not in the auth.allow list. Same holds true for local mounts (on peer machines), self-heal daemon, and other operations which perform a glusterfs mount on one of the peers. * SOLUTION: Login-based authentication (username/password pairs, henceforth referred to as "keys") for gluster services and operations. These *per-volume* keys can be used to by-pass the addr-based authentication, provided none of the peers' addresses are put in the auth.reject list, to enable gluster services like gNfs, self-heal daemon and internal operations on volumes when auth.allow option is exercised. * IMPLEMENTATION: 1. Glusterd generates keys for each volume and stores it in memory as well as in respective volfiles. A new TRUSTED-FUSE volfile is generated which is fuse volfile + keys in protocol/client, and is named trusted-<volname>-fuse.vol. This is used by all local mounts. ANY local mount (on any peer) is granted the trusted-fuse volfile instead of fuse volfile via getspec. non-local mounts are NOT granted the trusted fuse volfile. 2. The keys generated for the volume is written to each server volfile telling servers to allow users with these keys. 3. NFS, self-heal daemon and replace-brick volfiles are updated with the volume's authentication keys. 4. The keys are NOT written to fuse volfiles for obvious reasons. 5. The ownership of volfiles and logfiles is restricted to root users. 6. Merging two identical definitions of peer_info_t in auth/addr and rpc-lib, throwing away the one in auth/addr. 7. Code cleanup in numerous places as appropriate. * IMPORTANT NOTES: 1. One SHOULD NOT put any of the peer addresses in the auth.reject list if one wants any of the glusterd services and features such as gNfs, self-heal, rebalance, geo-rep and quota. 2. If one wants to use username/password based authentication to volumes, one shall append to the server, nfs and shd volfiles, the keys one wants to use for authentication, *while_retaining those_generated_by_glusterd*. See doc/authentication.txt file for details. Change-Id: Ie0331d625ad000d63090e2d622fe1728fbfcc453 BUG: 789942 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.com/2733 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol: remove the 'path<>' from rename() and link()Amar Tumballi2012-02-162-12/+0
| | | | | | | | | | | | missed it in the previous round of cleanup, path is completely useless in resolve function. Change-Id: I1aef0f5276afb77dfacfcc0c337ac80b4fcacc55 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 790298 Reviewed-on: http://review.gluster.com/2756 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* nfs: avoid logging invalid fdctx in case of anonymous fdsRajesh Amaravathi2012-02-141-3/+4
| | | | | | | | | | | | | | if get_fd_ctx fails (as in case of anonymous fds), overwhelming amount of entries are seen in the nfs log, causing dd and other heavy i/o operations to become unresponsive. this patch logs an invalid fdctx only if it is not an anonymous fd. Change-Id: I4e917d150d6a053af77d47a94a2f1c2633acadb5 BUG: 787365 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.com/2747 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* protocol xdr: remove 'path<>'Amar Tumballi2012-02-142-23/+2
| | | | | | | | | | | | | | | client was sending 'path' on wire, which gets ignored on server side, and also doesn't get freed up, which causes memory leak. also with not having path on wire, the xdr size on wire most of the time can remain constant, which helps in allocating RDMA buffers. Change-Id: Ie0d36a670be60b02fd1e925c6f977b1a71def5cd BUG: 790298 Signed-off-by: Amar Tumballi <amar@gluster.com> Reviewed-on: http://review.gluster.com/2744 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: add an extra flag to readv()/writev() APIAmar Tumballi2012-02-143-4/+13
| | | | | | | | | | | | needed to implement a proper handling of open flag alterations using fcntl() on fd. Change-Id: Ic280d5db6f1dc0418d5c439abb8db1d3ac21ced0 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 782265 Reviewed-on: http://review.gluster.com/2723 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* protocol: code cleanupAmar Tumballi2012-02-142-707/+200
| | | | | | | | | | | make dict serialize and unserialization code a macro Change-Id: I459c77c6c1f54118c6c94390162670f4159b9690 BUG: 764890 Signed-off-by: Amar Tumballi <amar@gluster.com> Reviewed-on: http://review.gluster.com/2742 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* protocol/client: assign the right value to 'conf' before de-refing itAmar Tumballi2012-02-071-2/+2
| | | | | | | | | | | | variable assignment was done after it was actually getting de-referenced. moved the assignment few lines up. Change-Id: Id65e3e2d3dfe071e1c5b14c32488647070398ae4 BUG: 787117 Signed-off-by: Amar Tumballi <amar@gluster.com> Reviewed-on: http://review.gluster.com/2712 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cli, protocol/server : improve validation for the option auth.(allow/reject)Kaushal M2012-02-051-1/+42
| | | | | | | | | | | | | | | | | | cli now checks validity of address list given for 'volume set auth.*' Server xlator checks addresses supplied to auth.(allow/reject) option including wildcards for correctness in case volfile is manually edited. Original patch done by shylesh@gluster.com Original patch is at http://patches.gluster.com/patch/7566/ Change-Id: Icf52d6eeef64d6632b15aa90a379fadacdf74fef BUG: 764197 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.com/306 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol/client : prevent client from reconnecting when serverKaushal M2012-02-031-3/+21
| | | | | | | | | | | | | | | | | | authentication fails This prevents the client from trying to reconnect on server authentication failure. Reconnecting on authentcation failure causes hung mounts on unauthorised clients. This patch fixes this problem. Also, mount.glusterfs script unmounts mount-point on mount failure to prevent hung mounts. Change-Id: I5615074d27948077bad491a38cecae1b7f5159fb BUG: 765240 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.com/398 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* cli: Extend "volume status" with statedump infoKaushal M2012-01-271-3/+136
| | | | | | | | | | | | | | | | | This patch enhances and extends the "volume status" command with information obtained from the statedump of the bricks of volumes. Adds new status types : clients, inode, fd, mem, callpool The new syntax of "volume status" is, #gluster volume status [all|{<volname> [<brickname>] [misc-details|clients|inode|fd|mem|callpool]}] Change-Id: I8d019718465bbc3de727653a839de7238f45da5c BUG: 765495 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.com/2637 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kp@gluster.com>
* protocol/client: Pass the right arguments to CLIENT_GET_REMOTE_FDVijay Bellur2012-01-261-1/+1
| | | | | | | | | Change-Id: I04f984f20964650a38009bba7711d2757151ade5 BUG: 762935 Signed-off-by: Vijay Bellur <vijay@gluster.com> Reviewed-on: http://review.gluster.com/2694 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* protocol/client: if the remote_fd is -1, then unwind instead of sending the ↵Raghavendra Bhat2012-01-252-24/+31
| | | | | | | | | | | | | | | | | | call to server For calls with remote_fd set to -1, client xlator is sending the call to the server which results in server not resolving it and thus fd being NULL. Locks xlator when tries to get the inode context using the fd it segfaults. To avoid it unwind the call in the client xlator if the remote_fd is -1. Change-Id: Ic34a49fdf1012dd371f4b194703c0be74f29bda2 BUG: 784187 Signed-off-by: Raghavendra Bhat <raghavendrabhat@gluster.com> Reviewed-on: http://review.gluster.com/2684 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Krishnan Parthasarathi <kp@gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* core: add 'fremovexattr()' fopAmar Tumballi2012-01-253-0/+211
| | | | | | | | | | | so operations can be done on fd for extended attribute removal Change-Id: Ie026f1b53793aeb4ae33e96ea5408c7a97f34bf6 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 766571 Reviewed-on: http://review.gluster.com/778 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* core: get xattrs also as part of readdirpAmar Tumballi2012-01-257-26/+126
| | | | | | | | | | | | | readdirp_req() call sends a dict_t * as an argument, which contains all the xattr keys for which the entries got in readdirp_rsp() are having xattr value filled dictionary. Change-Id: I8b7e1290740ea3e884e67d19156ce849227167c0 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 765785 Reviewed-on: http://review.gluster.com/771 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* core: change lk-owner as a 1k bufferAmar Tumballi2012-01-248-63/+65
| | | | | | | | | | | | | so, NLM can send the lk-owner field directly to the locks translators, while doing the same effort, also enabled sending maximum of 500 aux gid over protocol. Change-Id: I87c2514392748416f7ffe21d5154faad2e413969 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 767229 Reviewed-on: http://review.gluster.com/779 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* rpc: extend actors with flag signing if privilege is requiredCsaba Henk2012-01-212-47/+47
| | | | | | | | | | | | Currently we allow the following RPC messages for unprivileged users: GLUSTER_CLI_GETWD, GLUSTER_CLI_MOUNT, GLUSTER_CLI_UMOUNT Change-Id: I05414f3ca7cbe47de45c5e5cfba1537efc774e6c BUG: 781256 Signed-off-by: Csaba Henk <csaba@gluster.com> Reviewed-on: http://review.gluster.com/2641 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* core: GFID filehandle based backend and anonymous FDsAnand Avati2012-01-206-371/+272
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. What -------- This change introduces an infrastructure change in the filesystem which lets filesystem operation address objects (inodes) just by its GFID. Thus far GFID has been a unique identifier of a user-visible inode. But in terms of addressability the only mechanism thus far has been the backend filesystem path, which could be derived from the GFID only if it was cached in the inode table along with the entire set of dentry ancestry leading up to the root. This change essentially decouples addressability from the namespace. It is no more necessary to be aware of the parent directory to address a file or directory. 2. Why ------- The biggest use case for such a feature is NFS for generating persistent filehandles. So far the technique for generating filehandles in NFS has been to encode path components so that the appropriate inode_t can be repopulated into the inode table by means of a recursive lookup of each component top-down. Another use case is the ability to perform more intelligent self-healing and rebalancing of inodes with hardlinks and also to detect renames. A derived feature from GFID filehandles is anonymous FDs. An anonymous FD is an internal USABLE "fd_t" which does not map to a user opened file descriptor or to an internal ->open()'d fd. The ability to address a file by the GFID eliminates the need to have a persistent ->open()'d fd for the purpose of avoiding the namespace. This improves NFS read/write performance significantly eliminating open/close calls and also fixes some of today's limitations (like keeping an FD open longer than necessary resulting in disk space leakage) 3. How ------- At each storage/posix translator level, every file is hardlinked inside a hidden .glusterfs directory (under the top level export) with the name as the ascii-encoded standard UUID format string. For reasons of performance and scalability there is a two-tier classification of those hardlinks under directories with the initial parts of the UUID string as the directory names. For directories (which cannot be hardlinked), the approach is to use a symlink which dereferences the parent GFID path along with basename of the directory. The parent GFID dereference will in turn be a dereference of the grandparent with the parent's basename, and so on recursively up to the root export. 4. Development --------------- 4a. To leverage the ability to address an inode by its GFID, the technique is to perform a "nameless lookup". This means, to populate a loc_t structure as: loc_t { pargfid: NULL parent: NULL name: NULL path: NULL gfid: GFID to be looked up [out parameter] inode: inode_new () result [in parameter] } and performing such lookup will return in its callback an inode_t populated with the right contexts and a struct iatt which can be used to perform an inode_link () on the inode (without a parent and basename). The inode will now be hashed and linked in the inode table and findable via inode_find(). A fundamental change moving forward is that the primary fields in a loc_t structure are now going to be (pargfid, name) and (gfid) depending on the kind of FOP. So far path had been the primary field for operations. The remaining fields only serve as hints/helpers. 4b. If read/write is to be performed on an inode_t, the approach so far has been to: fd_create(), STACK_WIND(open, fd), fd_bind (in callback) and then perform STACK_WIND(read, fd) etc. With anonymous fds now you can do fd_anonymous (inode), STACK_WIND (read, fd). This results in great boost in performance in the inbuilt NFS server. 5. Misc ------- The inode_ctx_put[2] has been renamed to inode_ctx_set[2] to be consistent with the rest of the codebase. Change-Id: Ie4629edf6bd32a595f4d7f01e90c0a01f16fb12f BUG: 781318 Reviewed-on: http://review.gluster.com/669 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* protocol/server: Do connection cleanup if reply failsPranith Kumar K2011-12-223-18/+14
| | | | | | | | | | | | | | | | | | | | We observed that after the first connection cleanup happens on DISCONNECT the lock calls in transit are granted or added in blocked locks queue. These locks were never cleaned up after that because no unlock would come up on that connection. This would leave references on that transport so it would never be destroyed. Now, the connection cleanup happens whenever the reply submission fails. Also cleaned up the old code which is not used any more. Change-Id: Ie4fe6f388ed18d9c907cf8ae06b0b7fd0601a660 BUG: 765430 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/809 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Fix local==NULL crash in wb_sync_cbk during disconnect.Jeff Darcy2011-12-191-3/+10
| | | | | | | | | | Change-Id: I26dc48a85756e189b1ef5cfef1658f9c2aed2157 BUG: 767359 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.com/784 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol/client: Be strict about gfids in fop reqPranith Kumar K2011-12-151-0/+73
| | | | | | | | | | Change-Id: I7508ab3a93329bb6a679801fddfcd0e5b0c7c134 BUG: 765198 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/770 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* glusterfs: An effort to fix all the spell mistakes and typoHarshavardhana2011-11-162-2/+2
| | | | | | | | | | | | | | | in the entire glusterfs codebase. This patch fixes many of spell mistakes and typo in the entire glusterfs codebase and all supported modules. Change-Id: I83238a41aa08118df3cf4d1d605505dd3cda35a1 BUG: 3809 Signed-off-by: Harshavardhana <fharshav@redhat.com> Reviewed-on: http://review.gluster.com/731 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: remove 'ino' variable from 'inode_t' structureAmar Tumballi2011-11-166-215/+188
| | | | | | | | Change-Id: I0f078d1753db65d2f2e0380d1b0450c114cf40dd BUG: 3518 Reviewed-on: http://review.gluster.com/522 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Move some of the messages to log level `TRACE'.Sachidananda Urs2011-11-153-6/+6
| | | | | | | | Change-Id: I46133b5e2218b9d810251b3dadadd8f157ab07d7 BUG: 3761 Reviewed-on: http://review.gluster.com/643 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* libglusterfs/iobuf: increase the iobref's iobuf array sizeAmar Tumballi2011-10-101-2/+1
| | | | | | | | | | | | earlier it was hardcoded to 8, now increased the size to 16. also return the exact error code in client_submit_vec_request(), so there will be no missing frames in case of errors. Change-Id: I82a6ee681a543b673a7cf1a0b9c5ade2a7175ebe BUG: 3679 Reviewed-on: http://review.gluster.com/555 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* statedump: do not print the inode number in the statedumpRaghavendra Bhat2011-10-012-36/+12
| | | | | | | | | | | | | Since gfid is used to uniquely identify a inode, in the statedump printing inode number is not necessary. Its suffecient if the gfid of the inode is printed. And do not print the the inodelks, entrylks and posixlks if the lock count is 0. Change-Id: Idac115fbce3a5684a0f02f8f5f20b194df8fb27f BUG: 3476 Reviewed-on: http://review.gluster.com/530 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* protocol/server: check for the fd being NULL and unwindRaghavendra Bhat2011-09-291-0/+8
| | | | | | | | Change-Id: I400e515431cf739fe0b2f90840359496a2b529d2 BUG: 3158 Reviewed-on: http://review.gluster.com/528 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shishir Gowda <shishirng@gluster.com>
* cli : new volume statedump commandKaushal M2011-09-271-2/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes: 1. Add a new 'volume statedump' command, that performs statedumps of all the bricks in the volume and saves them in a specified location. 2. Add new server option 'server.statedump-path'. 3. Remove multiple function definitions in glusterd.h Statedump Information: The 'volume statedump' command performs statedumps on all the bricks in a given volume. The syntax of the command is, gluster volume statedump <VOLNAME> [type]...... Types include, * all * mem * iobuf * callpool * priv * fd * inode Defaults to 'all' when no type is specified. The statedump files are created by default in /tmp directory of the server on which the bricks are present. This path can be changed by setting the 'server.statedump-path' option. The statedump files will be named as, <brick-name>.<pid of brick process>.dump Change-Id: I01c0e1a8aad490da818e086d89f292bd2ed06fd4 BUG: 1964 Reviewed-on: http://review.gluster.com/321 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* glusterfs protocol: handshake to log the version of the peerAmar Tumballi2011-09-213-5/+23
| | | | | | | | | | | | | | | | | * As RPC program's name is just used for logging, we now have 'PACKAGE_VERSION' part of the string, which gets logged in client side. * From client, we send the PACKAGE_VERSION in handshake dictionary, which gets logged on serverside handshake. The change doesn't break any compatibility between client or server as it would only enhance the logging part of handshake. Change-Id: Ie7f498af2f5d3f97be37c8d982061cb6021883ce BUG: 3589 Reviewed-on: http://review.gluster.com/467 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol/client: minor log enhancementsRajesh Amaravathi2011-09-191-21/+14
| | | | | | | | | | minor changes to the log enhancements of bug 3473. Change-Id: Id38d29db5a744e0ab7342d10ead6d16866228062 BUG: 3473 Reviewed-on: http://review.gluster.com/452 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* protocol/client: log enhancementsRajesh Amaravathi2011-09-181-90/+109
| | | | | | | | | | | * print paths wherever it is possible to log, to help debugging. * bring uniformity in log level. Change-Id: I2fa85b629de5dd0f0057ed96cba08ecb0ff1a798 BUG: 3473 Reviewed-on: http://review.gluster.com/328 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* distribute rebalance: handle the open file migrationAmar Tumballi2011-09-121-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Complexity involved: To migrate a file with open fd, we have to notify the other client process which has the open fd, and make sure the write()s happening on that fd is properly synced to the migrated file. Once the migration is complete, the client process which has open-fd should get notified and it should start performing all the operations on the new subvolume, instead of earlier cached volume. How to solve the notification part: We can overload the 'postbuf' attribute in the _cbk() function to understand if a file is 'under-migration' or 'migration-complete' state. (This will be something similar to deciding whether a file is DHT-linkfile by its 'mode'). Overall change includes below mentioned major changes: 1. dht_linkfile is decided by only 2 factors (mode(01000), xattr(trusted.glusterfs.dht.linkto)), instead of earlier 3 factors (size==0) 2. in linkfile self-heal part (in 'dht_lookup_everywhere_cbk()'), don't delete a linkfile if there is a open-fd on it. It means, there may be a migration in progress. 3. if a file's revalidate fails with ENOENT, it may be due to file migration, and hence need a lookup_everywhere() 4. There will be 2 phases of file-migration. -> Phase 1: Migration in progress * The source data file will have SGID and STICKY bit set in its mode. * The source data file will have a 'linkto' xattr pointing the destination. * Destination file will have mode set to '01000', and 'linkto' xattr set to itself. -> Phase 2: File migration Complete * The source data file will have mode '01000', and will be 'truncated' to size 0. * The destination file will have inherited mode from the source. (without sgid and sticky bit) and its 'linkto' attribute will be removed. 4. Changes in distribute to work smoothly with a file which is in migration / got migrated. The 'fops' are divided into 3 categories, inode-read, inode-write and others. inode-read fops need to handle only 'phase 2' notification, where as, the inode-write fops need to handle both 'phase 1' and phase2. The inode-write operations will be done on source file, and if any of 'file-migration' procedures are detected in _cbk(), then the operations should be performed on the destination too. when a phase-2 is detected, then the inode-ctx itself should be changed to represent a new layout. With these changes, the open file migration will work smoothly with multiple clients. Change-Id: I512408463814e650f34c62ed009bf2101d016fd6 BUG: 3071 Reviewed-on: http://review.gluster.com/209 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol/client: avoid code duplication in fd based operationsRaghavendra Bhat2011-09-112-340/+42
| | | | | | | | | Change-Id: I012f78bac8ba82333628c59ef51d5e5f43d05ac7 BUG: 3158 Reviewed-on: http://review.gluster.com/329 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Eliminate many "var set but not used" warnings with newer gcc.Jeff Darcy2011-09-073-10/+0
| | | | | | | | | | | | | | | | This fixes ~200 such warnings, but leaves three categories untouched. (1) Rpcgen code. (2) Macros which set variables in the outer (calling function) scope. (3) Variables which are set via function calls which may have side effects. Change-Id: I6554555f78ed26134251504b038da7e94adacbcd BUG: 2550 Reviewed-on: http://review.gluster.com/371 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>