path: root/xlators/nfs/server/src/nfs3.c
Commit message | Author | Age | Files | Lines
...
* nfs: opendir/closedir for every readdir | Anand Avati | 2010-11-15 | 1 | -13/+4
  Revert "nfs3: Unref & unbind dir fd with inode lock on EOF"

  This reverts commit 4e6fb304ce41acbaf7c9ba67c06bf443e65082e8.

  The above commit (which unbinds fds at EOF) does not fix the original bug (1619) because a readdir from a second app could have already started before the readdir_cbk of the first app's readdir reaches NFS code. Hence the race still exists.

  Performing extra unrefs when EOF is received is not a reliable way of detecting that a client has performed a closedir (and to close the fd ourselves). Neither is interpreting a 0 cookie as a new opendir. Clients can always use telldir/seekdir and hit EOFs twice.

  Due to the way the NFS3 protocol is designed, it is just not possible for the server to reliably detect opendirs/closedirs performed by the client and map the corresponding readdirs to the same dir fd on the server side.

  The only reliable way of fixing this is to perform opendir/closedir at the cost of performance. Any optimization towards keeping dir fds open while attempting to map them to the application's opendir/closedir will result in either fd leaks or extra fd unrefs.

  Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 2061 (NFS server crashes in readdir_fstat_cbk due to extra fd unref)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2061
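  The stateless approach described in this message can be sketched outside of GlusterFS with plain POSIX calls. This is only an illustration under stated assumptions, not the code in nfs3.c: `readdir_chunk`, `NAME_MAX_LEN` and the cookie scheme (a count of entries already returned) are hypothetical names and choices made for the example.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>

#define NAME_MAX_LEN 256 /* hypothetical buffer size for one entry name */

/*
 * Serve one readdir-style request without keeping any directory handle
 * between calls: open, skip to the cookie, read, close.
 * The cookie is assumed to be the number of entries already returned.
 * Returns the number of names copied, or -1 if the directory cannot be opened.
 */
int readdir_chunk(const char *path, long cookie,
                  char names[][NAME_MAX_LEN], int max, int *eof)
{
    DIR *dir = NULL;
    struct dirent *ent = NULL;
    long pos = 0;
    int count = 0;

    assert(path != NULL && names != NULL && eof != NULL && max >= 0);

    dir = opendir(path);            /* opendir for every request */
    if (dir == NULL)
        return -1;

    *eof = 0;
    while (count < max) {
        ent = readdir(dir);
        if (ent == NULL) {          /* end of directory (errors not separated here) */
            *eof = 1;
            break;
        }
        if (pos++ < cookie)         /* skip entries returned by earlier requests */
            continue;
        snprintf(names[count], NAME_MAX_LEN, "%s", ent->d_name);
        count++;
    }

    closedir(dir);                  /* closedir before replying: nothing to leak */
    return count;
}
```

  Because no handle survives the call, there is no reference count to get wrong when two clients read the same directory concurrently. The price is an extra open and close (and a skip over earlier entries) on every request, which is the performance cost the message mentions.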
* nfs: Remove conn_destroy/init callbacks | Shehjar Tikoo | 2010-11-03 | 1 | -2/+0
  NFS is transport-independent, so there is no point emulating knowledge of transport in software.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 1743 (XenServer is not compatible with GlusterNFS)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1743
* nfs3: More robust root gfid checks | Shehjar Tikoo | 2010-11-03 | 1 | -0/+2
  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 2051 (find fails with loop detected error)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2051
* nfs3: Correctly funge solaris root lookup FH for DVM | Shehjar Tikoo | 2010-10-21 | 1 | -5/+5
  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 1997 (Solaris mount fails with "RPC program not registered")
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1997
* nfs: avoid assignment of structure pointer into serialized buffer | Anand Avati | 2010-10-12 | 1 | -2/+4
  With the introduction of the variable-sized file handle feature in NFS, the on-wire length of a file handle can be smaller than the file handle structure in the code. Direct pointer assignment into the offset buffer, followed by a dereference, can result in reads beyond the end of the buffer and crashes.

  Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 999 (Crash in nfs3_fh_resolve_and_resume)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=999
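  The hazard described in this message is general to any decoder that casts a wire buffer to a struct pointer. A minimal sketch of the safer pattern follows, assuming a made-up 32-byte handle layout; `struct demo_fh` and `demo_fh_decode` are hypothetical and are not the GlusterFS file handle type or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory handle; the real structure in GlusterFS differs. */
struct demo_fh {
    uint8_t ident[4];
    uint8_t gfid[16];
    uint8_t extra[12];
};

/*
 * Copy a possibly shorter on-wire handle into a full-sized local struct.
 * Only wire_len bytes are read from buf, so a short handle never causes a
 * read past the end of the serialized buffer; the remainder is zero-filled.
 * Returns 0 on success, -1 if the wire handle is larger than the struct.
 */
int demo_fh_decode(const uint8_t *buf, size_t wire_len, struct demo_fh *out)
{
    assert(buf != NULL && out != NULL);

    if (wire_len > sizeof(*out))
        return -1;

    memset(out, 0, sizeof(*out));
    memcpy(out, buf, wire_len);   /* instead of: out = (struct demo_fh *)buf */
    return 0;
}
```

  The unsafe variant would hand callers a `struct demo_fh *` pointing straight into the request buffer, and any access to a trailing field of a short handle would then read whatever follows the handle in memory.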
* Revert "nfs3: Revalidate inode on receiving ESTALE on lookup" | Anand Avati | 2010-10-12 | 1 | -44/+1
  This reverts commit f5afcc47f9f00472d6c2b3f48127e02332cd457a.

  Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 1756 (NFS must revalidate inode on first ESTALE on lookup)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1756
* Revert "Revert "nfs3: Revalidate inode on receiving ESTALE on lookup"" | Vijay Bellur | 2010-10-11 | 1 | -1/+44
  This reverts commit 6dd3b7fa3bc7acf9281cc17f08010675e2297089.
* Revert "nfs3: Revalidate inode on receiving ESTALE on lookup" | Anand Avati | 2010-10-11 | 1 | -44/+1
  This reverts commit f5afcc47f9f00472d6c2b3f48127e02332cd457a.

  Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 1756 (NFS must revalidate inode on first ESTALE on lookup)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1756
* nfs: Revert downed-subvolume changes | Shehjar Tikoo | 2010-10-07 | 1 | -191/+51
  For the record these are the patches committed as:

  1. "nfs, nfs3: Base volume access on CHILD-UP-DOWN event"
     http://git.gluster.com/?p=glusterfs.git;a=commit;h=f47b0c55de9941823fbefe4b3a7e37179d6d4329

  2. "nfs: Fix multiple subvolume CHILD-UP support"
     http://git.gluster.com/?p=glusterfs.git;a=commit;h=336e2df7b74be7ad4c9ed403ca10b9f7f7ef9a58

  3. "nfs,nfs3: Disable subvolume on ENOTCONN"
     http://git.gluster.com/?p=glusterfs.git;a=commit;h=8c6e27cdaf895e3031c3256efb9472a6c0bf61f3

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 1724 (kernel untar fails during add-brick)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1724
* nfs,nfs3: Disable subvolume on ENOTCONN | Shehjar Tikoo | 2010-10-04 | 1 | -31/+140
  ..so that nfs does not return an error to the client, instead the subvolume gets disabled till it comes back up again. The client is expected to keep retransmitting requests in the mean time.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 1724 (kernel untar fails during add-brick)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1724
* Change GNU GPL to GNU AGPL | Pranith K | 2010-10-04 | 1 | -3/+3
  Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 1388 ()
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1388
* nfs3: Revalidate inode on receiving ESTALE on lookup | Shehjar Tikoo | 2010-10-01 | 1 | -1/+44
  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 1756 (NFS must revalidate inode on first ESTALE on lookup)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1756
* nfs, nfs3: Base volume access on CHILD-UP-DOWN event | Shehjar Tikoo | 2010-09-22 | 1 | -20/+51
  Overall, the aim of this patch is to change the result of an nfs op depending on whether the underlying volume is up or down as notified by CHILD_UP and CHILD_DOWN events.

  This patch contains three intertwined changes:

  o Only when the lookup on the root of a volume is successful does gnfs now export the subvolume. Till now the result of the lookup was not used to determine whether we should export that volume. Not accounting for root lookup failure resulted in ESTALEs on first access because some children of distribute were down at the time of the root lookup.

  o Only when lookups on all the subvolumes have succeeded are these exports enabled through NFS.

  o When a child of say distribute goes down, on CHILD_DOWN event nfs will ignore all incoming requests from the client because ignoring these will prevent ESTALEs for those requests and in the hope that ignoring the requests will make the client retransmit. There are risks in this approach absent the DRC but we're willing to live with that for now.

  When a child goes down, the mount exports list will continue to show it but mount requests will be denied.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 1643 (Initial requests after mount ESTALE if DHT subvolumes connect after nfs startup)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1643
* nfs3: Unref & unbind dir fd with inode lock on EOF | Shehjar Tikoo | 2010-09-17 | 1 | -6/+12
  ..so that when EOF is reached on this fd, any further requests on the same inode do not get handled through this fd but result in a new fd being opened.

  Unbinding results in the fd getting deleted from the inode's fd list.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 1619 (glusterfs nfs server crashed on dht+replica(2x2))
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1619
* nfs3: Copy deviceid from correct gfid start octet | Shehjar Tikoo | 2010-09-15 | 1 | -1/+1
  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 1600 (showmount works but unable to mount)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1600
* nfs,nfs3,mnt3: Transition fh resolution to gfid | Shehjar Tikoo | 2010-09-14 | 1 | -118/+349
  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 971 (dynamic volume management)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* nfs3: Do not unref dst inode on rename cbk | Shehjar Tikoo | 2010-09-02 | 1 | -6/+0
  This gets done when the call state gets wiped. Doing it here results in an extra unref, causing a segfault.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 1464 (fd leak after rename)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1464
* nfs3: Free vectored write args using FREE not GF_FREE | Shehjar Tikoo | 2010-09-02 | 1 | -0/+1
  ..because the file handle in write3args is allocated inside libc using malloc not memory accounting code in glusterfs.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 1499 (GNFS from mainline Glusterfs-3.1-qa13 crashes while initiating SFS2008)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1499
* nfs3: Close dst cached fd & unref inode on rename | Shehjar Tikoo | 2010-08-31 | 1 | -0/+19
  If the src file is over-writing an existing file and if the destination file is open, then close the cached fd on the destination file and unref the inode for it.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 1464 (fd leak after rename)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1464
* nfs3: Dont ref dir fd_t used in hard fh resolution | Shehjar Tikoo | 2010-08-31 | 1 | -4/+2
  ..because the extra ref was under the mistaken assumption that directory fd_t will be cached even during hard fh resolution and that is not the case.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
  BUG: 1397 (Cached dir fd_ts are a leakin')
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1397
* xlators/nfs: nfs3.c - remove dead assignments. | Sachidananda | 2010-08-22 | 1 | -6/+0
  Removed dead assignments and unused variables reported by clang. One of the reports uncovers a minor bug in gnfs.

  > Dead store  Dead assignment  xlators/nfs/server/src/nfs3.c  2860  1

  A separate bug is logged for the above report and assigned to Shehjar.

  Signed-off-by: Sachidananda Urs <sac@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 1114 ()
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1114
* nfs, nfs/rpc: Rename functions to prevent gfrpcsvc conflict | Shehjar Tikoo | 2010-08-10 | 1 | -214/+225
  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 1274 (nfs fails to start)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1274
* nfs3: Error returns must check for <0, not == -1 | Shehjar Tikoo | 2010-07-28 | 1 | -2/+2
  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 854 (nfs server didn't start)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=854
* nfs3: Fix race updating op queue on uncached fd open | Shehjar Tikoo | 2010-07-06 | 1 | -1/+0
  The order of locking while performing async fd opens was resulting in a deadlock when a particular pattern of operations was generated by compilebench. This patch improves handling of those situations while locking the fd-cache, inode and inode queue.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 1047 (Compilebench hangs nfs server)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1047
* nfs: Support directory level exports | Shehjar Tikoo | 2010-07-04 | 1 | -1/+1
  nfsx has followed the traditional approach of exporting whole volumes as NFS exports. The Platform requires, and some users have approached us about, exports of only specific directories instead of full Gluster volumes. This commit introduces this support through two nfsx options:

  Option 1:
  =========
  option nfs3.<volume-name>.export-dir <subdir1-in-vol>,<subdir2-in-vol>,..<subdirN-in-vol>

  export-dir will allow the export of a particular dir as a single export by itself. For example:

      volume posix
        type storage/posix
        option directory /export/
      end-volume

      volume posix-ac
        type features/access-control
        subvolumes posix
      end-volume

      volume nfs
        type nfs/server
        subvolumes posix-ac
        option rpc-auth.addr.allow *
        option nfs3.posix-ac.export-dir /homes/shehjart
      end-volume

  A comma-separated list of sub-directories will set up those dirs as separate exports.

  At the nfs client, the mount command will be:

      $ mount <nfsserver>:/posix-ac/homes/shehjart /mnt

  Option 2:
  =========
  option nfs3.<volume-name>.export-volumes <on|off>

  There can be situations where users only want the directory level exports and require that volume exports be completely disabled. The above option allows us to do this. By default, volume exports are enabled.

  In the earlier example, replacing <volume-name> with posix-ac and setting the option to off will disable mounting of the posix-ac volume as a whole.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 989 (Support directory exports in nfsx)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=989
* nfs3: Special-case the lookup for parent dir of root | Shehjar Tikoo | 2010-06-01 | 1 | -7/+31
  When a lookup request comes in for (rootfh, ".."), we need to handle it in a way that returns the attributes and handle of the root dir. Not doing so crashes nfsx because the inode table is not able to find an inode for the root's parent. This inode was being referenced in nfs3_lookup_parentdir_resume when filling a loc for the lookup fop.

  For the record, such a lookup request is sent by vmkernel.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 942 (NFS crashes as a vmware ESX data store)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=942
* nfs3: Funge . and .. ino/gen in readdir of root | Shehjar Tikoo | 2010-06-01 | 1 | -29/+15
  In the readdir reply for the root of the export, replace the ino and gen number for the . and .. entries with 1 and 0 respectively. Clients which inspect this field will error out due to the change in inode number of the root directory when seen for ".".

  ".." also needs to be replaced because we do not have a concept of the parent directory of root. The return of 1 and 0 is the same as the behaviour of the command:

      stat /..

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 942 (NFS crashes as a vmware ESX data store)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=942
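  The rewrite described in this message amounts to a pass over the reply entries for the export root. The sketch below assumes a simplified entry type; `struct demo_dirent` and `funge_root_dots` are hypothetical names for illustration and do not correspond to the GlusterFS dirent structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified readdir reply entry. */
struct demo_dirent {
    const char *name;
    uint64_t    ino;
    uint32_t    gen;
};

/*
 * For a readdir reply on the export root, force "." and ".." to report
 * ino 1 and gen 0 so that clients always see a stable root identity.
 * All other entries are left untouched.
 */
void funge_root_dots(struct demo_dirent *ents, size_t n)
{
    size_t i;

    assert(ents != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (strcmp(ents[i].name, ".") == 0 || strcmp(ents[i].name, "..") == 0) {
            ents[i].ino = 1;
            ents[i].gen = 0;
        }
    }
}
```

  A caller would apply this only when the directory being listed is the root of the export; for any other directory, "." and ".." keep their real values.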
* nfs: Introduce trusted-write and trusted-sync options | Shehjar Tikoo | 2010-05-21 | 1 | -6/+158
  Introduces two new options:

  1. nfs3.*.trusted-write: Forces UNSTABLE writes to return STABLE to NFS clients to prevent the clients from sending a COMMIT. STABLE writes are still handled in a sync manner and so are COMMITs if they're sent at all.

  2. nfs3.*.trusted-sync: Forces all WRITEs and COMMITs to return STABLE return flags to NFS clients to avoid the overhead of STABLE writes, and COMMITs that follow UNSTABLE writes. This includes the trusted-write functionality. In addition to the trusted-write, it also writes STABLE writes in an UNSTABLE manner.

  Both violate the NFS protocol but allow better write perf in most configurations. Use with caution.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 924 (Slow NFS synchronous writes)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=924
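  One way to read the two options is as a small decision table over the stability level the client requested. The sketch below is an interpretation of the message, not the nfs3.c code: `struct write_policy` and `decide_write_policy` are hypothetical, and mapping the message's "STABLE" to FILE_SYNC is an assumption. The enum values follow the NFSv3 `stable_how` definition.

```c
#include <assert.h>

/* NFSv3 stable_how values. */
enum stable_how { UNSTABLE = 0, DATA_SYNC = 1, FILE_SYNC = 2 };

/* Hypothetical result: how to perform the write, and what to tell the client. */
struct write_policy {
    int             sync_write; /* 1: write synchronously, 0: write unstably */
    enum stable_how reply;      /* "committed" level reported in the reply */
};

struct write_policy decide_write_policy(enum stable_how requested,
                                        int trusted_write, int trusted_sync)
{
    struct write_policy p;

    assert(requested == UNSTABLE || requested == DATA_SYNC ||
           requested == FILE_SYNC);

    /* Protocol-conformant default: honour the request and report it as-is. */
    p.sync_write = (requested != UNSTABLE);
    p.reply = requested;

    if (trusted_sync) {
        /* Everything is written unstably but reported as stable. */
        p.sync_write = 0;
        p.reply = FILE_SYNC;    /* assumption: "STABLE" means FILE_SYNC */
    } else if (trusted_write && requested == UNSTABLE) {
        /* Unstable writes are reported as stable so no COMMIT follows. */
        p.reply = FILE_SYNC;
    }
    return p;
}
```

  In both trusted modes the reply can claim more durability than the write actually has, which is why the message describes them as protocol violations to be used with caution.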
* nfs3: Final unref only on successful remove | Shehjar Tikoo | 2010-05-13 | 1 | -1/+5
  The final unref on the inode during a file removal should take place only if the file removal was successful.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 857 (Crash in afr_sh_entry_expunge_entry_cbk)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=857
* nfs3: Round-up read reply bytes of multi-vector reply | Shehjar Tikoo | 2010-05-10 | 1 | -0/+1
  A previous commit brought in support for returning read replies when subvolumes return reads in multiple iovecs. This did not completely fix the problem since the bytes in all the iovecs together could be unaligned with the 4-byte boundary required by XDR for opaque data. This resulted in read requests being either retransmitted or rejected with an error message in syslog on the NFS client.

  Signed-off-by: Shehjar Tikoo <shehjart@dev.gluster.com>
  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 902 (iozone hangs during random read throughput test)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=902
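  XDR pads variable-length opaque data with zero bytes up to the next multiple of four. The arithmetic behind the fix can be sketched as follows; the helper names (`xdr_roundup`, `xdr_padding`, `xdr_opaque_wire_len`) are hypothetical and are not taken from the GlusterFS source.

```c
#include <assert.h>
#include <stddef.h>

#define XDR_UNIT 4u /* XDR encodes everything in 4-byte units */

/* Round a byte count up to the next multiple of four. */
size_t xdr_roundup(size_t len)
{
    return (len + XDR_UNIT - 1) & ~(size_t)(XDR_UNIT - 1);
}

/* Number of zero pad bytes that must follow len bytes of opaque data. */
size_t xdr_padding(size_t len)
{
    return xdr_roundup(len) - len;
}

/*
 * On-wire size of one opaque payload that arrives split across several
 * buffers: the pieces are summed first and the total is rounded up once,
 * because the padding belongs to the payload as a whole, not to each piece.
 */
size_t xdr_opaque_wire_len(const size_t *piece_lens, size_t count)
{
    size_t total = 0;
    size_t i;

    assert(piece_lens != NULL || count == 0);

    for (i = 0; i < count; i++)
        total += piece_lens[i];
    return xdr_roundup(total);
}
```

  For instance, two pieces of 3 and 2 bytes carry 5 bytes of data and occupy 8 bytes on the wire, so 3 pad bytes follow the last piece.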
* nfs3: Submit multiple vectors received in read callback | Shehjar Tikoo | 2010-05-08 | 1 | -12/+12
  There is a possibility of io-cache or read-ahead returning a read buffer that straddles two separate pages in ioc or ra, through two struct iovecs. Current nfs3 read reply does not return as many vectors as received from a subvolume leading to a short read for the NFS client.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 902 (iozone hangs during random read throughput test)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=902
* Memory accounting changes | Vijay Bellur | 2010-04-23 | 1 | -9/+12
  Memory accounting Changes. Thanks to Vinayak Hegde and Csaba Henk for their contributions.

  Signed-off-by: Vijay Bellur <vijay@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 329 (Replacing memory allocation functions with mem-type functions)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=329
* nfs3: Use nfs3state in call_state to avoid getting from rpc request | Shehjar Tikoo | 2010-04-13 | 1 | -1/+2
  This change avoids having the nfs translator depend on the sanity of the rpcsvc_request_t type after NFS reply has been sent. This was a problem because the request structure is guaranteed to be invalid after the reply for the request has been submitted by the RPC program.

  NFS3 handler was ignoring this behaviour and accessing the private in request after reply submission resulting in access to corrupted data.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 757 ([NFS-Alpha] Crash in nfs3_call_state_wipe)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=757
* NFS: type fixes: some portability cleanup | Csaba Henk | 2010-04-08 | 1 | -9/+13
  Signed-off-by: Csaba Henk <csaba@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 399 (NFS translator with Mount v3 and NFS v3 support)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=399
* nfs: Redesign fop argument passing to support single volfile use | Shehjar Tikoo | 2010-04-02 | 1 | -57/+63
  The current design of the interaction between the generic NFS layer and the protocol handlers like mount3 and nfs3 is such that it does not allow using a single volume file which contains the nfs/server and the protocol/server. This is because the common nfs-fops layer assumes that ctx->top is always the nfs/server. This is wrong.

  The fops layer needs access to top because top, or rather the generic NFS xlator's private state, has a mem-pool. The fops layer needs this mem-pool to get memory for storing per-fop state. Since the fops layer can no longer assume that ctx->top is the nfs/server, all layers need to start passing the nfs/server xlator_t right down to the fops layer.

  I am also taking this chance to remove the synchronous equivalents of the fops and also remove the dirent caching directory operations.

  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 770 (NFS Xlator - Crash when both GlusterFS server/NFS Server are in the same file)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=770
* nfs: Add NFSv3 protocol support | Shehjar Tikoo | 2010-03-31 | 1 | -0/+4836
  Signed-off-by: Shehjar Tikoo <shehjart@gluster.com>
  Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
  BUG: 399 (NFS translator with Mount v3 and NFS v3 support)
  URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=399