diff options
Diffstat (limited to 'doc/developer-guide/data-structures/inode.md')
-rw-r--r-- | doc/developer-guide/data-structures/inode.md | 226 |
1 files changed, 0 insertions, 226 deletions
diff --git a/doc/developer-guide/data-structures/inode.md b/doc/developer-guide/data-structures/inode.md deleted file mode 100644 index a340ab9ca8e..00000000000 --- a/doc/developer-guide/data-structures/inode.md +++ /dev/null @@ -1,226 +0,0 @@ -#Inode and dentry management in GlusterFS: - -##Background -Filesystems internally refer to files and directories via inodes. Inodes -are unique identifiers of the entities stored in a filesystem. Whenever an -application has to operate on a file/directory (read/modify), the filesystem -maps that file/directory to the right inode and start referring to that inode -whenever an operation has to be performed on the file/directory. - -In GlusterFS a new inode gets created whenever a new file/directory is created -OR when a successful lookup is done on a file/directory for the first time. -Inodes in GlusterFS are maintained by the inode table which gets initiated when -the filesystem daemon is started (both for the brick process as well as the -mount process). Below are some important data structures for inode management. - -## Data-structure (inode-table) -``` -struct _inode_table { - pthread_mutex_t lock; - size_t hashsize; /* bucket size of inode hash and dentry hash */ - char *name; /* name of the inode table, just for gf_log() */ - inode_t *root; /* root directory inode, with inode - number and gfid 1 */ - xlator_t *xl; /* xlator to be called to do purge and - the xlator which maintains the inode table*/ - uint32_t lru_limit; /* maximum LRU cache size */ - struct list_head *inode_hash; /* buckets for inode hash table */ - struct list_head *name_hash; /* buckets for dentry hash table */ - struct list_head active; /* list of inodes currently active (in an fop) */ - uint32_t active_size; /* count of inodes in active list */ - struct list_head lru; /* list of inodes recently used. - lru.next most recent */ - uint32_t lru_size; /* count of inodes in lru list */ - struct list_head purge; /* list of inodes to be purged soon */ - uint32_t purge_size; /* count of inodes in purge list */ - - struct mem_pool *inode_pool; /* memory pool for inodes */ - struct mem_pool *dentry_pool; /* memory pool for dentrys */ - struct mem_pool *fd_mem_pool; /* memory pool for fd_t */ - int ctxcount; /* number of slots in inode->ctx */ -}; -``` - -#Life-cycle -``` - -inode_table_new (size_t lru_limit, xlator_t *xl) - -This is a function which allocates a new inode table. Usually the top xlators in -the graph such as protocol/server (for bricks), fuse and nfs (for fuse and nfs -mounts) and libgfapi do inode managements. Hence they are the ones which will -allocate a new inode table by calling the above function. - -Each xlator graph in glusterfs maintains an inode table. So in fuse clients, -whenever there is a graph change due to add brick/remove brick or -addition/removal of some other xlators, a new graph is created which creates a -new inode table. - -Thus an allocated inode table is destroyed only when the filesystem daemon is -killed or unmounted. - -``` - -#what it contains. -``` - -Inode table in glusterfs mainly contains a hash table for maintaining inodes. -In general a file/directory is considered to be existing if there is a -corresponding inode present in the inode table. If a inode for a file/directory -cannot be found in the inode table, glusterfs tries to resolve it by sending a -lookup on the entry for which the inode is needed. If lookup is successful, then -a new inode correponding to the entry is added to the hash table present in the -inode table. Thus an inode present in the hash-table means, its an existing -file/directory within the filesystem. The inode table also contains the hash -size of the hash table (as of now it is hard coded to 14057. The hash value of -a inode is calculated using its gfid). - -Apart from the hash table, inode table also maintains 3 important list of inodes -1) Active list: -Active list contains all the active inodes (i.e inodes which are currently part -of some fop). -2) Lru list: -Least recently used inodes list. A limit can be set for the size of the lru -list. For bricks it is 16384 and for clients it is infinity. -3) Purge list: -List of all the inodes which have to be purged (i.e inodes which have to be -deleted from the inode table due to unlink/rmdir/forget). - -And at last it also contains the mem-pool for allocating inodes, dentries so -that frequent malloc/calloc and free of the data structures can be avoided. -``` - -#Data structure (inode) -``` -struct _inode { - inode_table_t *table; /* the table this inode belongs to */ - uuid_t gfid; /* unique identifier of the inode */ - gf_lock_t lock; - uint64_t nlookup; - uint32_t fd_count; /* Open fd count */ - uint32_t ref; /* reference count on this inode */ - ia_type_t ia_type; /* what kind of file */ - struct list_head fd_list; /* list of open files on this inode */ - struct list_head dentry_list; /* list of directory entries for this inode */ - struct list_head hash; /* hash table pointers */ - struct list_head list; /* active/lru/purge */ - - struct _inode_ctx *_ctx; /* place holder for keeping the - information about the inode by different xlators */ -}; - -As said above, inodes are internal way of identifying the files/directories. A -inode uniquely represents a file/directory. A new inode is created whenever a -create/mkdir/symlink/mknod operations are performed. Apart from that a new inode -is created upon the successful fresh lookup of a file/directory. Say the -filesystem contained some file "a" within root and the filesystem was -unmounted. Now when glusterfs is mounted and some operation is perfomed on "/a", -glusterfs tries to get the inode for the entry "a" with parent inode as -root. But, since glusterfs just came up, it will not be able to find the inode -for "a" and will send a lookup on "/a". If the lookup operation succeeds (i.e. -the root of glusterfs contains an entry called "a"), then a new inode for "/a" -is created and added to the inode table. - -Depending upon the situation, an inode can be in one of the 3 lists maintained -by the inode table. If some fop is happening on the inode, then the inode will -be present in the active inodes list maintained by the inode table. Active -inodes are those inodes whose refcount is greater than zero. Whenever some -operation comes on a file/directory, and the resolver tries to find the inode -for it, it increments the refcount of the inode before returning the inode. The -refcount of an inode can be incremented by calling the below function - -inode_ref (inode_t *inode) - -Any xlator which wants to operate on a inode as part of some fop (or wants the -inode in the callback), should hold a ref on the inode. -Once the fop is completed before sending the reply of the fop to the above -layers , the inode has to be unrefed. When the refcount of an inode becomes -zero, it is removed from the active inodes list and put into LRU list maintained -by the inode table. Thus in short if some fop is happening on a file/directory, -the corresponding inode will be in the active list or it will be in the LRU -list. -``` - -#Life Cycle - -A new inode is created whenever a new file/directory/symlink is created OR a -successful lookup of an existing entry is done. The xlators which does inode -management (as of now protocol/server, fuse, nfs, gfapi) will perform inode_link -operation upon successful lookup or successful creation of a new entry. - -inode_link (inode_t *inode, inode_t *parent, const char *name, - struct iatt *buf); - -inode_link actually adds the inode to the inode table (to be precise it adds -the inode to the hash table maintained by the inode table. The hash value is -calculated based on the gfid). Copies the gfid to the inode (the gfid is -present in the iatt structure). Creates a dentry with the new name. - -A inode is removed from the inode table and eventually destroyed when unlink -or rmdir operation is performed on a file/directory, or the the lru limit of -the inode table has been exceeded. - -#Data structure (dentry) -``` - -struct _dentry { - struct list_head inode_list; /* list of dentries of inode */ - struct list_head hash; /* hash table pointers */ - inode_t *inode; /* inode of this directory entry */ - char *name; /* name of the directory entry */ - inode_t *parent; /* directory of the entry */ -}; - -A dentry is the presence of an entry for a file/directory within its parent -directory. A dentry usually points to the inode to which it belongs to. In -glusterfs a dentry contains the following fields. -1) a hook using which it can add itself to the list of -the dentries maintained by the inode to which it points to. -2) A hash table pointer. -3) Pointer to the inode to which it belongs to. -4) Name of the dentry -5) Pointer to the inode of the parent directory in which the dentry is present - -A new dentry is created when a new file/directory/symlink is created or a hard -link to an existing file is created. - -__dentry_create (inode_t *inode, inode_t *parent, const char *name); - -A dentry holds a refcount on the parent -directory so that the parent inode is never removed from the active inode's list -and put to the lru list (If the lru limit of the lru list is exceeded, there is -a chance of parent inode being destroyed. To avoid it, the dentries hold a -reference to the parent inode). A dentry is removed whenevern a unlink/rmdir -is perfomed on a file/directory. Or when the lru limit has been exceeded, the -oldest inodes are purged out of the inode table, during which all the dentries -of the inode are removed. - -Whenever a unlink/rmdir comes on a file/directory, the corresponding inode -should be removed from the inode table. So upon unlink/rmdir, the inode will -be moved to the purge list maintained by the inode table and from there it is -destroyed. To be more specific, if a inode has to be destroyed, its refcount -and nlookup count both should become 0. For refcount to become 0, the inode -should not be part of any fop (there should not be any open fds). Or if the -inode belongs to a directory, then there should not be any fop happening on the -directory and it should not contain any dentries within it. For nlookup count to -become zero, a forget has to be sent on the inode with nlookup count set to 0 as -an argument. For fuse clients, forget is sent by the kernel itself whenever a -unlink/rmdir is performed. But for brick processes, upon unlink/rmdir, the -protocol/server itself has to do inode_forget. Whenever the inode has to be -deleted due to file removal or lru limit being exceeded the inode is retired -(i.e. all the dentries of the inode are deleted and the inode is moved to the -purge list maintained by the inode table), the nlookup count is set to 0 via -inode_forget api. The inode table, then prunes all the inodes from the purge -list by destroying the inode contexts maintained by each xlator. - -unlinking of the dentry is done via inode_unlink; - -void -inode_unlink (inode_t *inode, inode_t *parent, const char *name); - -If the inode has multiple hard links, then the unlink operation performed by -the application results just in the removal of the dentry with the name provided -by the application. For the inode to be removed, all the dentries of the inode -should be unlinked. -``` - |