summaryrefslogtreecommitdiffstats
path: root/libglusterfs/src/inode.h
Commit message (Collapse)AuthorAgeFilesLines
* core: GFID filehandle based backend and anonymous FDsAnand Avati2012-01-201-16/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. What -------- This change introduces an infrastructure change in the filesystem which lets filesystem operation address objects (inodes) just by its GFID. Thus far GFID has been a unique identifier of a user-visible inode. But in terms of addressability the only mechanism thus far has been the backend filesystem path, which could be derived from the GFID only if it was cached in the inode table along with the entire set of dentry ancestry leading up to the root. This change essentially decouples addressability from the namespace. It is no more necessary to be aware of the parent directory to address a file or directory. 2. Why ------- The biggest use case for such a feature is NFS for generating persistent filehandles. So far the technique for generating filehandles in NFS has been to encode path components so that the appropriate inode_t can be repopulated into the inode table by means of a recursive lookup of each component top-down. Another use case is the ability to perform more intelligent self-healing and rebalancing of inodes with hardlinks and also to detect renames. A derived feature from GFID filehandles is anonymous FDs. An anonymous FD is an internal USABLE "fd_t" which does not map to a user opened file descriptor or to an internal ->open()'d fd. The ability to address a file by the GFID eliminates the need to have a persistent ->open()'d fd for the purpose of avoiding the namespace. This improves NFS read/write performance significantly eliminating open/close calls and also fixes some of today's limitations (like keeping an FD open longer than necessary resulting in disk space leakage) 3. How ------- At each storage/posix translator level, every file is hardlinked inside a hidden .glusterfs directory (under the top level export) with the name as the ascii-encoded standard UUID format string. For reasons of performance and scalability there is a two-tier classification of those hardlinks under directories with the initial parts of the UUID string as the directory names. For directories (which cannot be hardlinked), the approach is to use a symlink which dereferences the parent GFID path along with basename of the directory. The parent GFID dereference will in turn be a dereference of the grandparent with the parent's basename, and so on recursively up to the root export. 4. Development --------------- 4a. To leverage the ability to address an inode by its GFID, the technique is to perform a "nameless lookup". This means, to populate a loc_t structure as: loc_t { pargfid: NULL parent: NULL name: NULL path: NULL gfid: GFID to be looked up [out parameter] inode: inode_new () result [in parameter] } and performing such lookup will return in its callback an inode_t populated with the right contexts and a struct iatt which can be used to perform an inode_link () on the inode (without a parent and basename). The inode will now be hashed and linked in the inode table and findable via inode_find(). A fundamental change moving forward is that the primary fields in a loc_t structure are now going to be (pargfid, name) and (gfid) depending on the kind of FOP. So far path had been the primary field for operations. The remaining fields only serve as hints/helpers. 4b. If read/write is to be performed on an inode_t, the approach so far has been to: fd_create(), STACK_WIND(open, fd), fd_bind (in callback) and then perform STACK_WIND(read, fd) etc. With anonymous fds now you can do fd_anonymous (inode), STACK_WIND (read, fd). This results in great boost in performance in the inbuilt NFS server. 5. Misc ------- The inode_ctx_put[2] has been renamed to inode_ctx_set[2] to be consistent with the rest of the codebase. Change-Id: Ie4629edf6bd32a595f4d7f01e90c0a01f16fb12f BUG: 781318 Reviewed-on: http://review.gluster.com/669 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* core: remove 'ino' variable from 'inode_t' structureAmar Tumballi2011-11-161-2/+1
| | | | | | | | Change-Id: I0f078d1753db65d2f2e0380d1b0450c114cf40dd BUG: 3518 Reviewed-on: http://review.gluster.com/522 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: made changes to return value of __is_root_gfid()Amar Tumballi2011-10-021-1/+2
| | | | | | | | | | | now returns 'true(1)' is gfid is root, 'false(0)' if not. earlier it was the inverse, which was bit confusing Change-Id: Id103f444ace048cbb0fccdc72c6646da06631584 BUG: 3518 Reviewed-on: http://review.gluster.com/549 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Change Copyright current yearPranith Kumar K2011-08-101-1/+1
| | | | | | | | Change-Id: I2d10f2be44f518f496427f257988f1858e888084 BUG: 3348 Reviewed-on: http://review.gluster.com/200 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* LICENSE: s/GNU Affero General Public/GNU General Public/Pranith Kumar K2011-08-061-3/+3
| | | | | | | | Change-Id: I3914467611e573cccee0d22df93920cf1b2eb79f BUG: 3348 Reviewed-on: http://review.gluster.com/182 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* inode table: avoid memleak by freeing the allocated structures incase of failureRaghavendra Bhat2011-07-011-3/+0
| | | | | | | | Signed-off-by: Raghavendra Bhat <raghavendrabhat@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 3103 (memleak in inode table creation) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3103
* libglusterfs/inode.c: Add version of inode_path which can be called holding ↵Raghavendra G2011-06-141-0/+3
| | | | | | | | | | inode->lock. Signed-off-by: Raghavendra G <raghavendra@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 1059 (enhancements for getting statistics from performance translators) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1059
* Solaris xattr support for symlink and special files.v3.1.3qa3shishir gowda2011-03-041-0/+2
| | | | | | | | | | | | | | | | | | Since glusterfs uses xattr for storing gfid, and xattr support for symlinks and special files does not exist in solaris. The work around is provided by creating hidden files under export directory on solaris hosts only. the hidden files ares maintained in .glusterfs_xattr_inode directory, and all xattr ops on symlink and special files are redirected to respective inodes. All dir entries with name starting as .glusterfs (GF_HIDDEN_PATH) will not be shown in readdir ops. Signed-off-by: shishir gowda <shishirng@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 2213 (Symlink fails with ENODATA) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2213
* core: Use lru_limit as count for inode and dentry mempoolShehjar Tikoo2010-10-261-0/+1
| | | | | | | | Signed-off-by: Shehjar Tikoo <shehjart@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 1948 (For each subvolume started, glusterfs process takes up around 30-35MB more memory) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1948
* Copyright changesVijay Bellur2010-10-111-1/+1
| | | | | | | | Signed-off-by: Vijay Bellur <vijay@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* Change GNU GPL to GNU AGPLPranith K2010-10-041-3/+3
| | | | | | | | Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1388 () URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1388
* gfid: changes in inode managementAnand Avati2010-09-041-16/+7
| | | | | | | | | | | | | - incorporate usage of uuid (gfid) as the key for finding inodes - deprecate inode number/generation number based inode_get - undo code specific to generation numbers (attic list etc.) Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* gfid: introduce uuid based handles for inodesAnand Avati2010-09-041-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gfid represents a gluter file id. This is a universally unique id assigned to a logical inode, independent of the inode numbers assigned by the various backend filesystems to that file/directory. The gfid of a file/directory will be the same on servers depending on the cluster translator in picture. The same gfid can be used as a handle across layers of various translators and across servers and clients. This was not the case previously as the cluster translators would pick the backend inode number from one of the servers and convert that into a logical inode number by performing some mathematical transforms. This new technique of addressing inodes also makes dynamic volume management have a more robust implementation as the file handles remain the same on all versions of the graphs, and allows for seamless NFS daemon restarts as well. This change makes way for server originating communication which was not possible earlier as the servers did not have any reliable way of addressing client side inodes at all. gfid solves this problem by preserving the same uuid as the handle on all the servers and across all clients Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* mem pool for fd_tshishir gowda2010-08-061-0/+1
| | | | | | | | | Ran posix compliance test and sanity test Signed-off-by: shishir gowda <shishirng@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 329 (Replacing memory allocation functions with mem-type functions) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=329
* Implement mem pool for inode dentryshishir gowda2010-08-061-0/+2
| | | | | | | | | Ran posix compliance test and sanity test Signed-off-by: shishir gowda <shishirng@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 329 (Replacing memory allocation functions with mem-type functions) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=329
* iatt: changes across the codebaseAnand V. Avati2010-03-161-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - libglusterfs -- call-stub -- inode -- protocol - libglusterfsclient - cluster/replicate - cluster/{dht,nufa,switch} - cluster/unify - cluster/HA - cluster/map - cluster/stripe - debug/error-gen - debug/trace - debug/io-stats - encryption/rot-13 - features/filter - features/locks - features/path-converter - features/quota - features/trash - mount/fuse - performance/io-threads - performance/io-cache - performance/quick-read - performance/read-ahead - performance/stat-prefetch - performance/symlink-cache - performance/write-behind - protocol/client - protocol/server - storage-posix Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 361 (GlusterFS 3.0 should work on Mac OS/X) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=361
* inode_ctx_{get,put,del}2 API supportAmar Tumballi2009-10-181-5/+25
| | | | | | | | | | support for storing multiple values for a key in inode context - used for storing inode and generation number pairs on the server in protocol/client inode ctx Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 315 (generation number support) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=315
* inode: API changes for generation number supportAnand V. Avati2009-10-181-37/+34
| | | | | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 315 (generation number support) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=315
* Changed occurrences of Z Research to Gluster.Vijay Bellur2009-10-071-1/+1
| | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* cleanup 'ctx' from inode and fdAmar Tumballi2009-07-161-1/+0
| | | | | | | | | Removing unused 'dict_t *ctx' from both inode and fd structures. Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 128 (cleanup unwanted ctx dictionary in 'inode' and 'fd' structures.) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=128
* libglusterfsclient code changesRaghavendra G2009-04-031-0/+5
| | | | | | | | | | | | - add dentry support to libglusterfsclient. - changes related to using array, to store context in inode instead of dictionary. - code changes related to cleanup of libglusterfsclient interface. - added glusterfs_mkdir and glusterfs_rmdir - other changes in libglusterfsclient to make it work with code changes in other parts of glusterfs. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* code changes in the usage of inode_ctx_get and inode_ctx_put after their ↵Raghavendra G2009-03-051-0/+6
| | | | | | implementation is changed to hold inode->lock. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* updated copyright header to extend copyright upto 2009Basavanagowda Kanur2009-02-261-1/+1
| | | | | | updated copyright header to include 2009. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Added all filesVikas Gorur2009-02-181-0/+160