glusterfs.git/xlators/storage/posix/src/posix-handle.h, branch v3.5beta1

posix: placeholders for GFID to path conversion

2013-11-26T18:22:40+00:00

what?
=====
    The following is an attempt to generate the paths of a file when
    only its gfid is known.

    To find the path of a directory, the symlink handle to the
    directory maintained in  the ".glusterfs" backend directory is
    read. The symlink handle is generated using the gfid of the
    directory. It (handle) contains the directory's name and parent
    gfid, which are used to recursively construct the absolute path as
    seen by the user from the mount point.

    A similar approach cannot be used for a regular file or a symbolic
    link since its hardlink handle, generated using its gfid, doesn't
    contain its parent gfid and basename. So xattrs are set to store
    the parent gfids and the number of hardlinks to a file or a
    symlink having the same parent gfid.  When an user/application
    requests for the paths of a regular file or a symlink with
    multiple hardlinks, using the parent gfids stored in the xattrs,
    the paths of the parent directories are generated as mentioned
    earlier. The base names of the hardlinks (with the same parent
    gfid) are determined by matching the actual backend inode numbers
    of each entry in the parent directory with that of the hardlink
    handle.

    Xattr is set on a regular file, link, and symbolic link as
    follows, Xattr name : trusted.pgfid. Xattr value :
    

    If a regular file, hard link, symbolic link is created then an
    xattr in the above format is set in the backend.

how to use?
===========
    This functionality can be used through getxattr interface. Two
    keys - glusterfs.ancestry.dentry and glusterfs.ancestry.path - enable
    usage of this functionality. A successful getxattr will have the
    result stored under same keys. Values will be,

    glusterfs.ancestry.dentry:
    --------------------------
    A linked list of gf-dirent structures for all possible paths from
    root to this gfid. If there are multiple paths, the linked-list
    will be a series of paths one after another. Each path will be a
    series of dentries representing all components of the path. This
    key is primarily for internal usage within glusterfs.

    glusterfs.ancestry.path:
    ------------------------
    A string containing all possible paths from root to this gfid.
    Multiple hardlinks of a file or a symlink are displayed as a colon
    seperated list (this could interfere with path components
    containing ':').

    e.g. If there is a file "file1" in root directory with two hardlinks,
         "/dir2/link2tofile1" and "/dir1/link1tofile1", then

         [root@alpha gfsmntpt]# getfattr -n glusterfs.ancestry.path -e text
          file1
          glusterfs.ancestry.path="/file1:/dir2/link2tofile1:/dir1/link1tofile1"

    Thanks Amar, Avati and Venky for the inputs.

Original Author: Ramana Raja 
BUG: 990028
Signed-off-by: Raghavendra G 
Change-Id: I0eaa9101e333e0c1f66ccefd9e95944dd4a27497
Reviewed-on: http://review.gluster.org/5951
Tested-by: Gluster Build System 
Reviewed-by: Anand Avati

All: License message change

2012-09-13T20:19:37+00:00

License message changed for server-side, dual license GPLV2 and LGPLv3+.

Change-Id: Ia9e53061b9d2df3b3ef3bc9778dceff77db46a09
BUG: 852318
Signed-off-by: Varun Shastry 
Reviewed-on: http://review.gluster.org/3940
Tested-by: Gluster Build System 
Reviewed-by: Kaleb KEITHLEY 
Reviewed-by: Anand Avati

All: License message change

2012-08-28T10:45:06+00:00

The license message is changed to
  Copyright (c) 2008-2012 Red Hat, Inc. 
  This file is part of GlusterFS.

  This file is licensed to you under your choice of the GNU Lesser
  General Public License, version 3 or any later version (LGPLv3 or
  later), or the GNU General Public License, version 2 (GPLv2), in all
  cases as published by the Free Software Foundation.

Change-Id: I07d2b63ed5fbbbd1884f1e74f2dd56013d15b0f4
BUG: 852318
Signed-off-by: Varun Shastry 
Reviewed-on: http://review.gluster.org/3858
Tested-by: Gluster Build System 
Reviewed-by: Vijay Bellur

storage/posix: Move landfill inside .glusterfs

2012-05-31T23:52:44+00:00

Change-Id: Ia2944f891dd62e72f3c79678c3a1fed389854a90
BUG: 811970
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.com/3158
Tested-by: Gluster Build System 
Reviewed-by: Anand Avati

storage/posix: prefer absolute path handles over GFID handles

2012-04-21T08:26:24+00:00

Change-Id: I9afefa2f8a39c5f2c77271ea64aff95249c88821
BUG: 791187
Signed-off-by: Anand Avati 
Reviewed-on: http://review.gluster.com/3204
Tested-by: Gluster Build System 
Reviewed-by: Amar Tumballi 
Reviewed-by: Vijay Bellur

posix: handle some internal behavior in posix_mknod()

2012-02-17T06:48:57+00:00

assume a case of link() systemcall, which is handled in distribute by
creating a 'linkfile' in hashed subvolume, if the 'oldloc' is present
in different subvolume. we have same 'gfid' for the linkfile as that
of file for consistency. Now, a file with multiple hardlinks, we may
end up with 'hardlinked' linkfiles. dht create linkfile using 'mknod()'
fop, and as now posix_mknod() is not equipped to handle this situation.

this patch fixes the situation by looking at the 'internal' key set in
the dictionary to differentiate the call which originates from inside
with regular system calls.

Change-Id: Ibff7c31f8e0c8bdae035c705c93a295f080ff985
BUG: 763844
Signed-off-by: Amar Tumballi 
Reviewed-on: http://review.gluster.com/2755
Tested-by: Gluster Build System 
Reviewed-by: Anand Avati

core: GFID filehandle based backend and anonymous FDs

2012-01-20T13:03:42+00:00

1. What
--------
This change introduces an infrastructure change in the filesystem
which lets filesystem operation address objects (inodes) just by its
GFID. Thus far GFID has been a unique identifier of a user-visible
inode. But in terms of addressability the only mechanism thus far has
been the backend filesystem path, which could be derived from the
GFID only if it was cached in the inode table along with the entire set
of dentry ancestry leading up to the root.

This change essentially decouples addressability from the namespace. It
is no more necessary to be aware of the parent directory to address a
file or directory.

2. Why
-------
The biggest use case for such a feature is NFS for generating
persistent filehandles. So far the technique for generating filehandles
in NFS has been to encode path components so that the appropriate
inode_t can be repopulated into the inode table by means of a recursive
lookup of each component top-down.

Another use case is the ability to perform more intelligent self-healing
and rebalancing of inodes with hardlinks and also to detect renames.

A derived feature from GFID filehandles is anonymous FDs. An anonymous FD
is an internal USABLE "fd_t" which does not map to a user opened file
descriptor or to an internal ->open()'d fd. The ability to address a file
by the GFID eliminates the need to have a persistent ->open()'d fd for the
purpose of avoiding the namespace. This improves NFS read/write performance
significantly eliminating open/close calls and also fixes some of today's
limitations (like keeping an FD open longer than necessary resulting
in disk space leakage)

3. How
-------

At each storage/posix translator level, every file is hardlinked inside
a hidden .glusterfs directory (under the top level export) with the name
as the ascii-encoded standard UUID format string. For reasons of performance
and scalability there is a two-tier classification of those hardlinks
under directories with the initial parts of the UUID string as the directory
names.

For directories (which cannot be hardlinked), the approach is to use a symlink
which dereferences the parent GFID path along with basename of the directory.
The parent GFID dereference will in turn be a dereference of the grandparent
with the parent's basename, and so on recursively up to the root export.

4. Development
---------------

4a. To leverage the ability to address an inode by its GFID, the technique is
to perform a "nameless lookup". This means, to populate a loc_t structure as:

loc_t {
   pargfid: NULL
   parent: NULL
   name: NULL
   path: NULL
   gfid: GFID to be looked up [out parameter]
   inode: inode_new () result [in parameter]
}

and performing such lookup will return in its callback an inode_t
populated with the right contexts and a struct iatt which can be
used to perform an inode_link () on the inode (without a parent and
basename). The inode will now be hashed and linked in the inode table
and findable via inode_find().

A fundamental change moving forward is that the primary fields in a
loc_t structure are now going to be (pargfid, name) and (gfid) depending
on the kind of FOP. So far path had been the primary field for operations.
The remaining fields only serve as hints/helpers.

4b. If read/write is to be performed on an inode_t, the approach so far
has been to: fd_create(), STACK_WIND(open, fd), fd_bind (in callback) and
then perform STACK_WIND(read, fd) etc. With anonymous fds now you can do
fd_anonymous (inode), STACK_WIND (read, fd). This results in great boost
in performance in the inbuilt NFS server.

5. Misc
-------
The inode_ctx_put[2] has been renamed to inode_ctx_set[2] to be consistent
with the rest of the codebase.

Change-Id: Ie4629edf6bd32a595f4d7f01e90c0a01f16fb12f
BUG: 781318
Reviewed-on: http://review.gluster.com/669
Tested-by: Gluster Build System 
Reviewed-by: Anand Avati