| author | Anand Avati <avati@gluster.com> | 2012-01-13 13:27:15 +0530 |
|---|---|---|
| committer | Anand Avati <avati@gluster.com> | 2012-01-20 05:03:42 -0800 |
| commit | 7e1f8e3bac201f88e2d9ef62fc69a044716dfced | |
| tree | 77540dbf1def2c864f8ae55f2293dba4a1d47488 | |
| parent | 33c568ce1a28c1739f095611b40b7acf40e4e6df | |
core: GFID filehandle based backend and anonymous FDs
1. What
--------
This change introduces an infrastructure change in the filesystem
which lets filesystem operations address objects (inodes) by GFID
alone. Thus far the GFID has been a unique identifier of a user-visible
inode, but the only mechanism for addressing an inode has been the
backend filesystem path, which could be derived from the GFID only if
the inode was cached in the inode table along with the entire dentry
ancestry leading up to the root.

This change essentially decouples addressability from the namespace. It
is no longer necessary to know the parent directory in order to address
a file or directory.
2. Why
-------
The biggest use case for such a feature is generating persistent
filehandles in NFS. So far the technique for generating filehandles
in NFS has been to encode path components so that the appropriate
inode_t can be repopulated into the inode table by means of a recursive
lookup of each component top-down.

Another use case is the ability to perform more intelligent self-healing
and rebalancing of inodes with hardlinks, and also to detect renames.

A derived feature from GFID filehandles is anonymous FDs. An anonymous FD
is an internal USABLE "fd_t" which maps neither to a user-opened file
descriptor nor to an internal ->open()'d fd. The ability to address a file
by its GFID eliminates the need to keep a persistent ->open()'d fd for the
purpose of avoiding the namespace. This improves NFS read/write performance
significantly by eliminating open/close calls, and also fixes some of today's
limitations (like keeping an FD open longer than necessary, resulting
in disk space leakage).
3. How
-------
At each storage/posix translator level, every file is hardlinked inside
a hidden .glusterfs directory (under the top-level export), with the
ASCII-encoded standard UUID format string of its GFID as the name. For
reasons of performance and scalability, those hardlinks are classified
into two tiers of directories named after the initial parts of the UUID
string.

For directories (which cannot be hardlinked), the approach is to use a symlink
which dereferences the parent's GFID path along with the basename of the
directory. The parent GFID dereference will in turn be a dereference of the
grandparent with the parent's basename, and so on recursively up to the root
export.
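
The layout described in section 3 can be sketched as two small path
builders. This is only an illustration: the helper names are made up, and
the exact split of the two tiers (first two and next two characters of the
UUID string) is an assumption, since the text above only says "initial
parts".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Length of a canonical UUID string (8-4-4-4-12 plus four dashes). */
#define UUID_STR_LEN 36

/* Hypothetical helper: path of the hardlink kept for a regular file.
 * ASSUMPTION: tier 1 is the first two characters of the UUID string and
 * tier 2 the next two. Returns 0 on success, -1 if buf is too small. */
static int
handle_path (const char *export_dir, const char *uuid_str,
             char *buf, size_t len)
{
        int n = 0;

        assert (export_dir && uuid_str && buf);
        assert (strlen (uuid_str) == UUID_STR_LEN);

        n = snprintf (buf, len, "%s/.glusterfs/%.2s/%.2s/%s",
                      export_dir, uuid_str, uuid_str + 2, uuid_str);

        return (n < 0 || (size_t) n >= len) ? -1 : 0;
}

/* Hypothetical helper: symlink target stored for a directory. The target
 * is relative to the tier-2 directory holding the symlink, so it climbs
 * two levels and descends into the parent's own handle. */
static int
dir_handle_target (const char *parent_uuid_str, const char *basename,
                   char *buf, size_t len)
{
        int n = 0;

        assert (parent_uuid_str && basename && buf);
        assert (strlen (parent_uuid_str) == UUID_STR_LEN);

        n = snprintf (buf, len, "../../%.2s/%.2s/%s/%s",
                      parent_uuid_str, parent_uuid_str + 2,
                      parent_uuid_str, basename);

        return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

Resolving a directory handle then walks one symlink per ancestor, which is
the recursion up to the root export described above.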
4. Development
---------------
4a. To leverage the ability to address an inode by its GFID, the technique is
to perform a "nameless lookup". This means populating a loc_t structure as:

   loc_t {
      pargfid: NULL
      parent: NULL
      name: NULL
      path: NULL
      gfid: GFID to be looked up [out parameter]
      inode: inode_new () result [in parameter]
   }

Performing such a lookup returns, in its callback, an inode_t
populated with the right contexts and a struct iatt which can be
used to perform an inode_link () on the inode (without a parent and
basename). The inode is then hashed and linked in the inode table
and can be found via inode_find().

A fundamental change moving forward is that the primary fields in a
loc_t structure are now going to be (pargfid, name) or (gfid), depending
on the kind of FOP. So far, path has been the primary field for operations.
The remaining fields serve only as hints/helpers.
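
As a rough sketch of 4a, the snippet below fills a loc for a nameless
lookup and tests for that shape. The struct is a simplified stand-in, not
the real loc_t, and both helper names are hypothetical. The test mirrors
the condition that dht_lookup () uses in this change to divert to
dht_discover (), minus its extra check for the root GFID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins; NOT the real GlusterFS definitions. */
typedef unsigned char gfid_sketch_t[16];

struct loc_sketch {
        const char    *path;     /* hint only */
        const char    *name;     /* primary together with pargfid */
        void          *inode;    /* would be an inode_t * */
        void          *parent;   /* would be an inode_t * */
        gfid_sketch_t  gfid;     /* primary for nameless operations */
        gfid_sketch_t  pargfid;
};

static bool
gfid_is_null (const gfid_sketch_t gfid)
{
        static const gfid_sketch_t zero;   /* zero-initialized */

        return memcmp (gfid, zero, sizeof (zero)) == 0;
}

/* Hypothetical helper: everything except gfid and inode stays NULL/zero. */
static void
loc_fill_nameless (struct loc_sketch *loc, const gfid_sketch_t gfid,
                   void *new_inode)
{
        assert (loc != NULL);

        memset (loc, 0, sizeof (*loc));
        memcpy (loc->gfid, gfid, sizeof (loc->gfid));
        loc->inode = new_inode;
}

/* A loc is "nameless" when it carries a GFID but no parent GFID. */
static bool
loc_is_nameless (const struct loc_sketch *loc)
{
        return gfid_is_null (loc->pargfid) && !gfid_is_null (loc->gfid);
}
```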

4b. If read/write is to be performed on an inode_t, the approach so far
has been to: fd_create(), STACK_WIND(open, fd), fd_bind (in callback) and
then perform STACK_WIND(read, fd) etc. With anonymous fds you can now do
fd_anonymous (inode), STACK_WIND (read, fd). This results in a great boost
in performance in the built-in NFS server.
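
The idea behind fd_anonymous () can be sketched with simplified stand-in
types. This is not the code from fd.c: the names carry a _sketch suffix to
mark them as hypothetical, locking is left out (the real function runs
under inode->lock), and the reference counting is reduced to one reference
per call. What it keeps is the convention visible in the diff that an
anonymous fd is one whose pid is -1, and that an inode's existing
anonymous fd is reused before a new one is created and bound.

```c
#include <assert.h>
#include <stdlib.h>

struct inode_sketch;

/* Simplified stand-ins; NOT the real fd_t / inode_t. */
struct fd_sketch {
        int                  pid;       /* -1 marks an anonymous fd */
        int                  refcount;
        struct inode_sketch *inode;
        struct fd_sketch    *next;      /* next fd bound to the same inode */
};

struct inode_sketch {
        struct fd_sketch *fd_list;      /* fds bound to this inode */
};

/* Return the inode's anonymous fd, creating and binding one on first use.
 * Every successful call takes one reference on the returned fd. */
static struct fd_sketch *
fd_anonymous_sketch (struct inode_sketch *inode)
{
        struct fd_sketch *fd = NULL;

        assert (inode != NULL);

        /* Look for an anonymous fd that is already bound. */
        for (fd = inode->fd_list; fd; fd = fd->next)
                if (fd->pid == -1)
                        break;

        if (!fd) {
                fd = calloc (1, sizeof (*fd));
                if (!fd)
                        return NULL;

                fd->pid   = -1;
                fd->inode = inode;
                fd->next  = inode->fd_list;   /* bind to the inode */
                inode->fd_list = fd;
        }

        fd->refcount++;

        return fd;
}

static int
fd_is_anonymous_sketch (const struct fd_sketch *fd)
{
        return fd && fd->pid == -1;
}
```

No open () is wound down the stack at any point, which is where the saving
over the fd_create () / open / fd_bind () sequence comes from.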
5. Misc
-------
The inode_ctx_put[2] has been renamed to inode_ctx_set[2] to be consistent
with the rest of the codebase.
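
Besides the rename, the inode.c hunk in this change makes
__inode_ctx_set2 () take its two values by pointer, with a NULL pointer
leaving the stored value untouched. The sketch below mirrors that slot
selection on a simplified table. The types, the fixed CTX_SLOTS size and
the function name are stand-ins chosen for illustration, not the real
definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define CTX_SLOTS 4   /* hypothetical; the real table is sized per graph */

struct ctx_slot {
        const void *key;      /* stands in for the owning xlator_t * */
        uint64_t    value1;
        uint64_t    value2;
};

/* Store up to two values for key. The slot already owned by key is
 * reused; otherwise the first free slot is taken. A NULL value pointer
 * means "leave that value as it is". Returns 0 on success, -1 on bad
 * arguments or a full table. */
static int
ctx_set2 (struct ctx_slot *slots, const void *key,
          const uint64_t *value1_p, const uint64_t *value2_p)
{
        int set_idx = -1;
        int i = 0;

        if (!slots || !key)
                return -1;

        for (i = 0; i < CTX_SLOTS; i++) {
                if (!slots[i].key && set_idx == -1)
                        set_idx = i;    /* remember, but keep scanning */

                if (slots[i].key == key) {
                        set_idx = i;    /* an existing entry wins */
                        break;
                }
        }

        if (set_idx == -1)
                return -1;

        slots[set_idx].key = key;
        if (value1_p)
                slots[set_idx].value1 = *value1_p;
        if (value2_p)
                slots[set_idx].value2 = *value2_p;

        return 0;
}
```

Passing values by pointer lets a caller update one of the two values
without first reading back the other.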
Change-Id: Ie4629edf6bd32a595f4d7f01e90c0a01f16fb12f
BUG: 781318
Reviewed-on: http://review.gluster.com/669
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@gluster.com>
35 files changed, 2403 insertions, 1558 deletions
diff --git a/libglusterfs/src/fd.c b/libglusterfs/src/fd.c index 47b42aef4d2..50a564ee6df 100644 --- a/libglusterfs/src/fd.c +++ b/libglusterfs/src/fd.c @@ -35,7 +35,7 @@ gf_fd_fdtable_expand (fdtable_t *fdtable, uint32_t nr); fd_t * -_fd_ref (fd_t *fd); +__fd_ref (fd_t *fd); static int gf_fd_chain_fd_entries (fdentry_t *entries, uint32_t startidx, @@ -269,6 +269,10 @@ gf_fd_put (fdtable_t *fdtable, int32_t fd) fd_t *fdptr = NULL; fdentry_t *fde = NULL; + if (fd == -2) + /* anonymous fd */ + return; + if (fdtable == NULL || fd < 0) { gf_log_callingfn ("fd", GF_LOG_ERROR, "invalid argument"); return; @@ -336,7 +340,7 @@ gf_fd_fdptr_get (fdtable_t *fdtable, int64_t fd) fd_t * -_fd_ref (fd_t *fd) +__fd_ref (fd_t *fd) { ++fd->refcount; @@ -355,7 +359,7 @@ fd_ref (fd_t *fd) } LOCK (&fd->inode->lock); - refed_fd = _fd_ref (fd); + refed_fd = __fd_ref (fd); UNLOCK (&fd->inode->lock); return refed_fd; @@ -363,7 +367,7 @@ fd_ref (fd_t *fd) fd_t * -_fd_unref (fd_t *fd) +__fd_unref (fd_t *fd) { GF_ASSERT (fd->refcount); @@ -443,7 +447,7 @@ fd_unref (fd_t *fd) LOCK (&fd->inode->lock); { - _fd_unref (fd); + __fd_unref (fd); refcount = fd->refcount; } UNLOCK (&fd->inode->lock); @@ -457,28 +461,34 @@ fd_unref (fd_t *fd) fd_t * -fd_bind (fd_t *fd) +__fd_bind (fd_t *fd) { - inode_t *inode = NULL; + list_add (&fd->inode_list, &fd->inode->fd_list); + + return fd; +} + +fd_t * +fd_bind (fd_t *fd) +{ if (!fd || !fd->inode) { gf_log_callingfn ("fd", GF_LOG_ERROR, "!fd || !fd->inode"); return NULL; } - inode = fd->inode; - LOCK (&inode->lock); + LOCK (&fd->inode->lock); { - list_add (&fd->inode_list, &inode->fd_list); + fd = __fd_bind (fd); } - UNLOCK (&inode->lock); + UNLOCK (&fd->inode->lock); return fd; } -fd_t * -fd_create (inode_t *inode, pid_t pid) +static fd_t * +__fd_create (inode_t *inode, pid_t pid) { fd_t *fd = NULL; @@ -506,22 +516,52 @@ fd_create (inode_t *inode, pid_t pid) INIT_LIST_HEAD (&fd->inode_list); LOCK_INIT (&fd->lock); +out: + return fd; +} + + +fd_t * +fd_create 
(inode_t *inode, pid_t pid) +{ + fd_t *fd = NULL; + + fd = __fd_create (inode, pid); + if (!fd) + goto out; + + fd = fd_ref (fd); - LOCK (&inode->lock); - { - fd = _fd_ref (fd); - } - UNLOCK (&inode->lock); out: return fd; } +static fd_t * +__fd_lookup (inode_t *inode, pid_t pid) +{ + fd_t *iter_fd = NULL; + fd_t *fd = NULL; + + if (list_empty (&inode->fd_list)) + return NULL; + + + list_for_each_entry (iter_fd, &inode->fd_list, inode_list) { + if (!pid || iter_fd->pid == pid) { + fd = __fd_ref (iter_fd); + break; + } + } + + return fd; +} + + fd_t * fd_lookup (inode_t *inode, pid_t pid) { fd_t *fd = NULL; - fd_t *iter_fd = NULL; if (!inode) { gf_log_callingfn ("fd", GF_LOG_WARNING, "!inode"); @@ -530,21 +570,45 @@ fd_lookup (inode_t *inode, pid_t pid) LOCK (&inode->lock); { - if (list_empty (&inode->fd_list)) { - fd = NULL; - } else { - list_for_each_entry (iter_fd, &inode->fd_list, inode_list) { - if (pid) { - if (iter_fd->pid == pid) { - fd = _fd_ref (iter_fd); - break; - } - } else { - fd = _fd_ref (iter_fd); - break; - } - } - } + fd = __fd_lookup (inode, pid); + } + UNLOCK (&inode->lock); + + return fd; +} + + + +fd_t * +__fd_anonymous (inode_t *inode) +{ + fd_t *fd = NULL; + + fd = __fd_lookup (inode, -1); + + if (!fd) { + fd = __fd_create (inode, -1); + + if (!fd) + return NULL; + + __fd_bind (fd); + } + + __fd_ref (fd); + + return fd; +} + + +fd_t * +fd_anonymous (inode_t *inode) +{ + fd_t *fd = NULL; + + LOCK (&inode->lock); + { + fd = __fd_anonymous (inode); } UNLOCK (&inode->lock); @@ -552,6 +616,13 @@ fd_lookup (inode_t *inode, pid_t pid) } +gf_boolean_t +fd_is_anonymous (fd_t *fd) +{ + return (fd && fd->pid == -1); +} + + uint8_t fd_list_empty (inode_t *inode) { diff --git a/libglusterfs/src/fd.h b/libglusterfs/src/fd.h index 3c2be972ad9..d4cd9bd0662 100644 --- a/libglusterfs/src/fd.h +++ b/libglusterfs/src/fd.h @@ -132,6 +132,14 @@ fd_t * fd_lookup (struct _inode *inode, pid_t pid); +fd_t * +fd_anonymous (inode_t *inode); + + +gf_boolean_t 
+fd_is_anonymous (fd_t *fd); + + uint8_t fd_list_empty (struct _inode *inode); @@ -164,7 +172,7 @@ int __fd_ctx_del (fd_t *fd, xlator_t *xlator, uint64_t *value); fd_t * -_fd_ref (fd_t *fd); +__fd_ref (fd_t *fd); void fd_ctx_dump (fd_t *fd, char *prefix); diff --git a/libglusterfs/src/inode.c b/libglusterfs/src/inode.c index 3513691c492..c23f0f0e545 100644 --- a/libglusterfs/src/inode.c +++ b/libglusterfs/src/inode.c @@ -660,6 +660,39 @@ inode_grep (inode_table_t *table, inode_t *parent, const char *name) return inode; } +int +inode_grep_for_gfid (inode_table_t *table, inode_t *parent, const char *name, + uuid_t gfid, ia_type_t *type) +{ + inode_t *inode = NULL; + dentry_t *dentry = NULL; + int ret = -1; + + if (!table || !parent || !name) { + gf_log_callingfn (THIS->name, GF_LOG_WARNING, + "table || parent || name not found"); + return ret; + } + + pthread_mutex_lock (&table->lock); + { + dentry = __dentry_grep (table, parent, name); + + if (dentry) + inode = dentry->inode; + + if (inode) { + uuid_copy (gfid, inode->gfid); + *type = inode->ia_type; + ret = 0; + } + } + pthread_mutex_unlock (&table->lock); + + return ret; +} + + /* return 1 if gfid is of root, 0 if not */ gf_boolean_t __is_root_gfid (uuid_t gfid) @@ -998,6 +1031,7 @@ int __inode_path (inode_t *inode, const char *name, char **bufp) { inode_table_t *table = NULL; + inode_t *itrav = NULL; dentry_t *trav = NULL; size_t i = 0, size = 0; int64_t ret = 0; @@ -1011,8 +1045,10 @@ __inode_path (inode_t *inode, const char *name, char **bufp) table = inode->table; - for (trav = __dentry_search_arbit (inode); trav; - trav = __dentry_search_arbit (trav->parent)) { + itrav = inode; + for (trav = __dentry_search_arbit (itrav); trav; + trav = __dentry_search_arbit (itrav)) { + itrav = trav->parent; i ++; /* "/" */ i += strlen (trav->name); if (i > PATH_MAX) { @@ -1024,13 +1060,9 @@ __inode_path (inode_t *inode, const char *name, char **bufp) } } - if (!__is_root_gfid (inode->gfid) && - (i == 0)) { - gf_log 
(table->name, GF_LOG_WARNING, - "no dentry for non-root inode : %s", - uuid_utoa (inode->gfid)); - ret = -ENOENT; - goto out; + if (!__is_root_gfid (itrav->gfid)) { + /* "<gfid:00000000-0000-0000-0000-000000000000>"/path */ + i += GFID_STR_PFX_LEN; } if (name) { @@ -1052,13 +1084,22 @@ __inode_path (inode_t *inode, const char *name, char **bufp) i -= (len + 1); } - for (trav = __dentry_search_arbit (inode); trav; - trav = __dentry_search_arbit (trav->parent)) { + itrav = inode; + for (trav = __dentry_search_arbit (itrav); trav; + trav = __dentry_search_arbit (itrav)) { + itrav = trav->parent; len = strlen (trav->name); strncpy (buf + (i - len), trav->name, len); buf[i-len-1] = '/'; i -= (len + 1); } + + if (!__is_root_gfid (itrav->gfid)) { + snprintf (&buf[i-GFID_STR_PFX_LEN], GFID_STR_PFX_LEN, + "<gfid:%s>", uuid_utoa (itrav->gfid)); + buf[i-1] = '>'; + } + *bufp = buf; } else { ret = -ENOMEM; @@ -1323,45 +1364,47 @@ out: int -__inode_ctx_put2 (inode_t *inode, xlator_t *xlator, uint64_t value1, - uint64_t value2) +__inode_ctx_set2 (inode_t *inode, xlator_t *xlator, uint64_t *value1_p, + uint64_t *value2_p) { int ret = 0; int index = 0; - int put_idx = -1; + int set_idx = -1; if (!inode || !xlator) return -1; for (index = 0; index < xlator->graph->xl_count; index++) { if (!inode->_ctx[index].xl_key) { - if (put_idx == -1) - put_idx = index; + if (set_idx == -1) + set_idx = index; /* dont break, to check if key already exists further on */ } if (inode->_ctx[index].xl_key == xlator) { - put_idx = index; + set_idx = index; break; } } - if (put_idx == -1) { + if (set_idx == -1) { ret = -1; goto out;; } - inode->_ctx[put_idx].xl_key = xlator; - inode->_ctx[put_idx].value1 = value1; - inode->_ctx[put_idx].value2 = value2; + inode->_ctx[set_idx].xl_key = xlator; + if (value1_p) + inode->_ctx[set_idx].value1 = *value1_p; + if (value2_p) + inode->_ctx[set_idx].value2 = *value2_p; out: return ret; } int -inode_ctx_put2 (inode_t *inode, xlator_t *xlator, uint64_t value1, - 
uint64_t value2) +inode_ctx_set2 (inode_t *inode, xlator_t *xlator, uint64_t *value1_p, + uint64_t *value2_p) { int ret = 0; @@ -1370,7 +1413,7 @@ inode_ctx_put2 (inode_t *inode, xlator_t *xlator, uint64_t value1, LOCK (&inode->lock); { - ret = __inode_ctx_put2 (inode, xlator, value1, value2); + ret = __inode_ctx_set2 (inode, xlator, value1_p, value2_p); } UNLOCK (&inode->lock); @@ -1466,41 +1509,6 @@ unlock: } -int -__inode_ctx_put (inode_t *inode, xlator_t *key, uint64_t value) -{ - return __inode_ctx_put2 (inode, key, value, 0); -} - - -int -inode_ctx_put (inode_t *inode, xlator_t *key, uint64_t value) -{ - return inode_ctx_put2 (inode, key, value, 0); -} - - -int -__inode_ctx_get (inode_t *inode, xlator_t *key, uint64_t *value) -{ - return __inode_ctx_get2 (inode, key, value, 0); -} - - -int -inode_ctx_get (inode_t *inode, xlator_t *key, uint64_t *value) -{ - return inode_ctx_get2 (inode, key, value, 0); -} - - -int -inode_ctx_del (inode_t *inode, xlator_t *key, uint64_t *value) -{ - return inode_ctx_del2 (inode, key, value, 0); -} - - void inode_dump (inode_t *inode, char *prefix) { @@ -1557,7 +1565,7 @@ inode_dump (inode_t *inode, char *prefix) INIT_LIST_HEAD (&fd_wrapper->next); list_add_tail (&fd_wrapper->next, &fd_list); - fd_wrapper->fd = _fd_ref (fd); + fd_wrapper->fd = __fd_ref (fd); } } unlock: diff --git a/libglusterfs/src/inode.h b/libglusterfs/src/inode.h index df415286b20..7dda0401dcb 100644 --- a/libglusterfs/src/inode.h +++ b/libglusterfs/src/inode.h @@ -106,6 +106,10 @@ struct _inode { }; +#define UUID0_STR "00000000-0000-0000-0000-000000000000" +#define GFID_STR_PFX "<gfid:" UUID0_STR ">" +#define GFID_STR_PFX_LEN (sizeof (GFID_STR_PFX) - 1) + inode_table_t * inode_table_new (size_t lru_limit, xlator_t *xl); @@ -142,6 +146,10 @@ inode_rename (inode_table_t *table, inode_t *olddir, const char *oldname, inode_t * inode_grep (inode_table_t *table, inode_t *parent, const char *name); +int +inode_grep_for_gfid (inode_table_t *table, inode_t *parent, 
const char *name, + uuid_t gfid, ia_type_t *type); + inode_t * inode_find (inode_table_t *table, uuid_t gfid); @@ -155,32 +163,44 @@ inode_t * inode_from_path (inode_table_t *table, const char *path); int -__inode_ctx_put (inode_t *inode, xlator_t *xlator, uint64_t value); - -int -inode_ctx_put (inode_t *inode, xlator_t *xlator, uint64_t value); - -int -__inode_ctx_get (inode_t *inode, xlator_t *xlator, uint64_t *value); - -int -inode_ctx_get (inode_t *inode, xlator_t *xlator, uint64_t *value); - -int -inode_ctx_del (inode_t *inode, xlator_t *xlator, uint64_t *value); - +inode_ctx_set2 (inode_t *inode, xlator_t *xlator, uint64_t *value1, + uint64_t *value2); int -inode_ctx_put2 (inode_t *inode, xlator_t *xlator, uint64_t value1, - uint64_t value2); +__inode_ctx_set2 (inode_t *inode, xlator_t *xlator, uint64_t *value1, + uint64_t *value2); int inode_ctx_get2 (inode_t *inode, xlator_t *xlator, uint64_t *value1, uint64_t *value2); +int +__inode_ctx_get2 (inode_t *inode, xlator_t *xlator, uint64_t *value1, + uint64_t *value2); int inode_ctx_del2 (inode_t *inode, xlator_t *xlator, uint64_t *value1, uint64_t *value2); +#define __inode_ctx_set(i,x,v_p) __inode_ctx_set2(i,x,v_p,0) +#define inode_ctx_set(i,x,v_p) inode_ctx_set2(i,x,v_p,0) + +static inline int +__inode_ctx_put(inode_t *inode, xlator_t *this, uint64_t v) +{ + return __inode_ctx_set2 (inode, this, &v, 0); +} + +static inline int +inode_ctx_put(inode_t *inode, xlator_t *this, uint64_t v) +{ + return inode_ctx_set2(inode, this, &v, 0); +} + +#define __inode_ctx_get(i,x,v) __inode_ctx_get2(i,x,v,0) +#define inode_ctx_get(i,x,v) inode_ctx_get2(i,x,v,0) + +#define inode_ctx_del(i,x,v) inode_ctx_del2(i,x,v,0) + + gf_boolean_t __is_root_gfid (uuid_t gfid); diff --git a/libglusterfs/src/xlator.c b/libglusterfs/src/xlator.c index 023cbc94030..160ac2d6322 100644 --- a/libglusterfs/src/xlator.c +++ b/libglusterfs/src/xlator.c @@ -544,8 +544,8 @@ loc_wipe (loc_t *loc) inode_unref (loc->parent); loc->parent = NULL; } - 
uuid_clear (loc->gfid); - uuid_clear (loc->pargfid); + + memset (loc, 0, sizeof (*loc)); } diff --git a/xlators/cluster/afr/src/afr-common.c b/xlators/cluster/afr/src/afr-common.c index c247d56b705..83b91cd3ed1 100644 --- a/xlators/cluster/afr/src/afr-common.c +++ b/xlators/cluster/afr/src/afr-common.c @@ -2071,7 +2071,7 @@ out: /* {{{ open */ int -afr_fd_ctx_set (xlator_t *this, fd_t *fd) +__afr_fd_ctx_set (xlator_t *this, fd_t *fd) { afr_private_t * priv = NULL; int ret = -1; @@ -2083,82 +2083,92 @@ afr_fd_ctx_set (xlator_t *this, fd_t *fd) priv = this->private; - LOCK (&fd->lock); - { - ret = __fd_ctx_get (fd, this, &ctx); + ret = __fd_ctx_get (fd, this, &ctx); - if (ret == 0) - goto unlock; + if (ret == 0) + goto out; - fd_ctx = GF_CALLOC (1, sizeof (afr_fd_ctx_t), - gf_afr_mt_afr_fd_ctx_t); - if (!fd_ctx) { - ret = -ENOMEM; - goto unlock; - } + fd_ctx = GF_CALLOC (1, sizeof (afr_fd_ctx_t), + gf_afr_mt_afr_fd_ctx_t); + if (!fd_ctx) { + ret = -ENOMEM; + goto out; + } - fd_ctx->pre_op_done = GF_CALLOC (sizeof (*fd_ctx->pre_op_done), - priv->child_count, - gf_afr_mt_char); - if (!fd_ctx->pre_op_done) { - ret = -ENOMEM; - goto unlock; - } + fd_ctx->pre_op_done = GF_CALLOC (sizeof (*fd_ctx->pre_op_done), + priv->child_count, + gf_afr_mt_char); + if (!fd_ctx->pre_op_done) { + ret = -ENOMEM; + goto out; + } - fd_ctx->pre_op_piggyback = GF_CALLOC (sizeof (*fd_ctx->pre_op_piggyback), - priv->child_count, - gf_afr_mt_char); - if (!fd_ctx->pre_op_piggyback) { - ret = -ENOMEM; - goto unlock; - } + fd_ctx->pre_op_piggyback = GF_CALLOC (sizeof (*fd_ctx->pre_op_piggyback), + priv->child_count, + gf_afr_mt_char); + if (!fd_ctx->pre_op_piggyback) { + ret = -ENOMEM; + goto out; + } - fd_ctx->opened_on = GF_CALLOC (sizeof (*fd_ctx->opened_on), - priv->child_count, - gf_afr_mt_int32_t); - if (!fd_ctx->opened_on) { - ret = -ENOMEM; - goto unlock; - } + fd_ctx->opened_on = GF_CALLOC (sizeof (*fd_ctx->opened_on), + priv->child_count, + gf_afr_mt_int32_t); + if (!fd_ctx->opened_on) { 
+ ret = -ENOMEM; + goto out; + } - fd_ctx->lock_piggyback = GF_CALLOC (sizeof (*fd_ctx->lock_piggyback), - priv->child_count, - gf_afr_mt_char); - if (!fd_ctx->lock_piggyback) { - ret = -ENOMEM; - goto unlock; - } + fd_ctx->lock_piggyback = GF_CALLOC (sizeof (*fd_ctx->lock_piggyback), + priv->child_count, + gf_afr_mt_char); + if (!fd_ctx->lock_piggyback) { + ret = -ENOMEM; + goto out; + } - fd_ctx->lock_acquired = GF_CALLOC (sizeof (*fd_ctx->lock_acquired), - priv->child_count, - gf_afr_mt_char); - if (!fd_ctx->lock_acquired) { - ret = -ENOMEM; - goto unlock; - } + fd_ctx->lock_acquired = GF_CALLOC (sizeof (*fd_ctx->lock_acquired), + priv->child_count, + gf_afr_mt_char); + if (!fd_ctx->lock_acquired) { + ret = -ENOMEM; + goto out; + } - fd_ctx->up_count = priv->up_count; - fd_ctx->down_count = priv->down_count; + fd_ctx->up_count = priv->up_count; + fd_ctx->down_count = priv->down_count; - fd_ctx->locked_on = GF_CALLOC (sizeof (*fd_ctx->locked_on), - priv->child_count, - gf_afr_mt_char); - if (!fd_ctx->locked_on) { - ret = -ENOMEM; - goto unlock; - } + fd_ctx->locked_on = GF_CALLOC (sizeof (*fd_ctx->locked_on), + priv->child_count, + gf_afr_mt_char); + if (!fd_ctx->locked_on) { + ret = -ENOMEM; + goto out; + } - INIT_LIST_HEAD (&fd_ctx->paused_calls); - INIT_LIST_HEAD (&fd_ctx->entries); + INIT_LIST_HEAD (&fd_ctx->paused_calls); + INIT_LIST_HEAD (&fd_ctx->entries); - ret = __fd_ctx_set (fd, this, (uint64_t)(long) fd_ctx); - if (ret) - gf_log (this->name, GF_LOG_DEBUG, - "failed to set fd ctx (%p)", fd); + ret = __fd_ctx_set (fd, this, (uint64_t)(long) fd_ctx); + if (ret) + gf_log (this->name, GF_LOG_DEBUG, + "failed to set fd ctx (%p)", fd); +out: + return ret; +} + + +int +afr_fd_ctx_set (xlator_t *this, fd_t *fd) +{ + int ret = -1; + + LOCK (&fd->lock); + { + ret = __afr_fd_ctx_set (this, fd); } -unlock: UNLOCK (&fd->lock); -out: + return ret; } diff --git a/xlators/cluster/afr/src/afr-dir-write.c b/xlators/cluster/afr/src/afr-dir-write.c index 
4d2fcd226da..91aa2a9e7af 100644 --- a/xlators/cluster/afr/src/afr-dir-write.c +++ b/xlators/cluster/afr/src/afr-dir-write.c @@ -974,11 +974,10 @@ afr_link (call_frame_t *frame, xlator_t *this, local->transaction.done = afr_link_done; local->transaction.unwind = afr_link_unwind; - afr_build_parent_loc (&local->transaction.parent_loc, oldloc); + afr_build_parent_loc (&local->transaction.parent_loc, newloc); local->transaction.main_frame = frame; - local->transaction.basename = AFR_BASENAME (oldloc->path); - local->transaction.new_basename = AFR_BASENAME (newloc->path); + local->transaction.basename = AFR_BASENAME (newloc->path); afr_transaction (transaction_frame, this, AFR_ENTRY_TRANSACTION); diff --git a/xlators/cluster/afr/src/afr-inode-write.c b/xlators/cluster/afr/src/afr-inode-write.c index 3deca8df1ef..bb8b5f0fe5a 100644 --- a/xlators/cluster/afr/src/afr-inode-write.c +++ b/xlators/cluster/afr/src/afr-inode-write.c @@ -356,6 +356,12 @@ afr_open_fd_fix (call_frame_t *frame, xlator_t *this, gf_boolean_t pause_fop) priv = this->private; GF_ASSERT (local->fd); + + if (fd_is_anonymous (local->fd)) { + fop_continue = _gf_true; + goto out; + } + if (pause_fop) GF_ASSERT (local->fop_call_continue); diff --git a/xlators/cluster/afr/src/afr-self-heal-common.c b/xlators/cluster/afr/src/afr-self-heal-common.c index ad84f8541dd..5acbf90aab3 100644 --- a/xlators/cluster/afr/src/afr-self-heal-common.c +++ b/xlators/cluster/afr/src/afr-self-heal-common.c @@ -2142,6 +2142,12 @@ afr_self_heal (call_frame_t *frame, xlator_t *this, inode_t *inode) UNLOCK (&priv->lock); } + if (!local->loc.name) { + /* nameless lookup */ + sh->do_missing_entry_self_heal = _gf_false; + sh->do_gfid_self_heal = _gf_false; + } + FRAME_SU_DO (sh_frame, afr_local_t); if (sh->do_missing_entry_self_heal) { afr_self_heal_conflicting_entries (sh_frame, this); diff --git a/xlators/cluster/afr/src/afr-transaction.c b/xlators/cluster/afr/src/afr-transaction.c index 6ae493f1cf5..36d74aed8c3 100644 --- 
a/xlators/cluster/afr/src/afr-transaction.c +++ b/xlators/cluster/afr/src/afr-transaction.c @@ -34,24 +34,53 @@ afr_fd_ctx_t * -afr_fd_ctx_get (fd_t *fd, xlator_t *this) +__afr_fd_ctx_get (fd_t *fd, xlator_t *this) { uint64_t ctx = 0; - afr_fd_ctx_t *fd_ctx = NULL; int ret = 0; + afr_fd_ctx_t *fd_ctx = NULL; + int i = 0; + afr_private_t *priv = NULL; - ret = fd_ctx_get (fd, this, &ctx); + priv = this->private; - if (ret < 0) - goto out; + ret = __fd_ctx_get (fd, this, &ctx); - fd_ctx = (afr_fd_ctx_t *)(long) ctx; + if (ret < 0 && fd_is_anonymous (fd)) { + ret = __afr_fd_ctx_set (this, fd); + if (ret < 0) + goto out; + + ret = __fd_ctx_get (fd, this, &ctx); + if (ret < 0) + goto out; + + fd_ctx = (afr_fd_ctx_t *)(long) ctx; + for (i = 0; i < priv->child_count; i++) + fd_ctx->opened_on[i] = AFR_FD_OPENED; + } + fd_ctx = (afr_fd_ctx_t *)(long) ctx; out: return fd_ctx; } +afr_fd_ctx_t * +afr_fd_ctx_get (fd_t *fd, xlator_t *this) +{ + afr_fd_ctx_t *fd_ctx = NULL; + + LOCK(&fd->lock); + { + fd_ctx = __afr_fd_ctx_get (fd, this); + } + UNLOCK(&fd->lock); + + return fd_ctx; +} + + static void afr_pid_save (call_frame_t *frame) { diff --git a/xlators/cluster/afr/src/afr.h b/xlators/cluster/afr/src/afr.h index 544c9142471..0ff3000857c 100644 --- a/xlators/cluster/afr/src/afr.h +++ b/xlators/cluster/afr/src/afr.h @@ -791,6 +791,9 @@ afr_lk_transfer_datalock (call_frame_t *dst, call_frame_t *src, int pump_start (call_frame_t *frame, xlator_t *this); int +__afr_fd_ctx_set (xlator_t *this, fd_t *fd); + +int afr_fd_ctx_set (xlator_t *this, fd_t *fd); int32_t diff --git a/xlators/cluster/dht/src/dht-common.c b/xlators/cluster/dht/src/dht-common.c index 785ecbc615c..d371fc442da 100644 --- a/xlators/cluster/dht/src/dht-common.c +++ b/xlators/cluster/dht/src/dht-common.c @@ -147,6 +147,246 @@ out: int +dht_discover_complete (xlator_t *this, call_frame_t *discover_frame) +{ + dht_local_t *local = NULL; + call_frame_t *main_frame = NULL; + int op_errno = 0; + int ret = -1; + 
dht_layout_t *layout = NULL; + + local = discover_frame->local; + layout = local->layout; + + LOCK(&discover_frame->lock); + { + main_frame = local->main_frame; + local->main_frame = NULL; + } + UNLOCK(&discover_frame->lock); + + if (!main_frame) + return 0; + + if (local->file_count && local->dir_count) { + gf_log (this->name, GF_LOG_ERROR, + "path %s exists as a file on one subvolume " + "and directory on another. " + "Please fix it manually", + local->loc.path); + op_errno = EIO; + goto out; + } + + if (local->cached_subvol) { + ret = dht_layout_preset (this, local->cached_subvol, + local->inode); + if (ret < 0) { + gf_log (this->name, GF_LOG_WARNING, + "failed to set layout for subvolume %s", + local->cached_subvol ? local->cached_subvol->name : "<nil>"); + op_errno = EINVAL; + goto out; + } + } else { + ret = dht_layout_normalize (this, &local->loc, layout); + + if (ret != 0) { + gf_log (this->name, GF_LOG_DEBUG, + "normalizing failed on %s", + local->loc.path); + op_errno = EINVAL; + goto out; + } + + dht_layout_set (this, local->inode, layout); + } + + DHT_STACK_UNWIND (lookup, main_frame, local->op_ret, local->op_errno, + local->inode, &local->stbuf, local->xattr, + &local->postparent); + return 0; +out: + DHT_STACK_UNWIND (lookup, main_frame, -1, op_errno, NULL, NULL, NULL, + NULL); + + return ret; +} + + +int +dht_discover_cbk (call_frame_t *frame, void *cookie, xlator_t *this, + int op_ret, int op_errno, + inode_t *inode, struct iatt *stbuf, dict_t *xattr, + struct iatt *postparent) +{ + dht_local_t *local = NULL; + int this_call_cnt = 0; + call_frame_t *prev = NULL; + dht_layout_t *layout = NULL; + int ret = -1; + int is_dir = 0; + int is_linkfile = 0; + int attempt_unwind = 0; + + GF_VALIDATE_OR_GOTO ("dht", frame, out); + GF_VALIDATE_OR_GOTO ("dht", this, out); + GF_VALIDATE_OR_GOTO ("dht", frame->local, out); + GF_VALIDATE_OR_GOTO ("dht", this->private, out); + GF_VALIDATE_OR_GOTO ("dht", cookie, out); + + local = frame->local; + prev = cookie; + + 
layout = local->layout; + + /* Check if the gfid is different for file from other node */ + if (!op_ret && uuid_compare (local->gfid, stbuf->ia_gfid)) { + gf_log (this->name, GF_LOG_WARNING, + "%s: gfid different on %s", + local->loc.path, prev->this->name); + } + + + LOCK (&frame->lock); + { + /* TODO: assert equal mode on stbuf->st_mode and + local->stbuf->st_mode + + else mkdir/chmod/chown and fix + */ + ret = dht_layout_merge (this, layout, prev->this, + op_ret, op_errno, xattr); + if (ret) + gf_log (this->name, GF_LOG_WARNING, + "%s: failed to merge layouts", local->loc.path); + + if (op_ret == -1) { + local->op_errno = op_errno; + gf_log (this->name, GF_LOG_DEBUG, + "lookup of %s on %s returned error (%s)", + local->loc.path, prev->this->name, + strerror (op_errno)); + + goto unlock; + } + + is_linkfile = check_is_linkfile (inode, stbuf, xattr); + is_dir = check_is_dir (inode, stbuf, xattr); + + if (is_dir) { + local->dir_count ++; + } else { + local->file_count ++; + + if (!is_linkfile) { + /* real file */ + local->cached_subvol = prev->this; + attempt_unwind = 1; + } else { + goto unlock; + } + } + + local->op_ret = 0; + + if (local->xattr == NULL) { + local->xattr = dict_ref (xattr); + } else { + dht_aggregate_xattr (local->xattr, xattr); + } + + if (local->inode == NULL) + local->inode = inode_ref (inode); + + dht_iatt_merge (this, &local->stbuf, stbuf, prev->this); + dht_iatt_merge (this, &local->postparent, postparent, + prev->this); + } +unlock: + UNLOCK (&frame->lock); +out: + this_call_cnt = dht_frame_return (frame); + + if (is_last_call (this_call_cnt) || attempt_unwind) { + dht_discover_complete (this, frame); + } + + if (is_last_call (this_call_cnt)) + DHT_STACK_DESTROY (frame); + + return 0; +} + + +int +dht_discover (call_frame_t *frame, xlator_t *this, loc_t *loc) +{ + int ret; + dht_local_t *local = NULL; + dht_conf_t *conf = NULL; + int call_cnt = 0; + int op_errno = EINVAL; + int i = 0; + call_frame_t *discover_frame = NULL; + + + conf = 
this->private;
+        local = frame->local;
+
+        ret = dict_set_uint32 (local->xattr_req,
+                               "trusted.glusterfs.dht", 4 * 4);
+        if (ret)
+                gf_log (this->name, GF_LOG_WARNING,
+                        "%s: failed to set 'trusted.glusterfs.dht' key",
+                        loc->path);
+
+        ret = dict_set_uint32 (local->xattr_req,
+                               "trusted.glusterfs.dht.linkto", 256);
+        if (ret)
+                gf_log (this->name, GF_LOG_WARNING,
+                        "%s: failed to set 'trusted.glusterfs.dht.linkto' key",
+                        loc->path);
+
+        call_cnt = conf->subvolume_cnt;
+        local->call_cnt = call_cnt;
+
+        local->layout = dht_layout_new (this, conf->subvolume_cnt);
+
+        if (!local->layout) {
+                op_errno = ENOMEM;
+                goto err;
+        }
+
+        uuid_copy (local->gfid, loc->gfid);
+
+        discover_frame = copy_frame (frame);
+        if (!discover_frame) {
+                op_errno = ENOMEM;
+                goto err;
+        }
+
+        discover_frame->local = local;
+        frame->local = NULL;
+        local->main_frame = frame;
+
+        for (i = 0; i < call_cnt; i++) {
+                STACK_WIND (discover_frame, dht_discover_cbk,
+                            conf->subvolumes[i],
+                            conf->subvolumes[i]->fops->lookup,
+                            &local->loc, local->xattr_req);
+        }
+
+        return 0;
+
+err:
+        DHT_STACK_UNWIND (lookup, frame, -1, op_errno, NULL, NULL, NULL, NULL);
+
+        return 0;
+}
+
+
+int
 dht_lookup_dir_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                     int op_ret, int op_errno,
                     inode_t *inode, struct iatt *stbuf, dict_t *xattr,
@@ -1086,6 +1326,12 @@ dht_lookup (call_frame_t *frame, xlator_t *this,
                 local->xattr_req = dict_new ();
         }

+        if (uuid_is_null (loc->pargfid) && !uuid_is_null (loc->gfid) &&
+            !__is_root_gfid (loc->inode->gfid)) {
+                local->cached_subvol = NULL;
+                dht_discover (frame, this, loc);
+                return 0;
+        }
         if (!hashed_subvol)
                 hashed_subvol = dht_subvol_get_hashed (this, loc);

diff --git a/xlators/cluster/dht/src/dht-helper.c b/xlators/cluster/dht/src/dht-helper.c
index 01d11ee68f1..65c03b62ffd 100644
--- a/xlators/cluster/dht/src/dht-helper.c
+++ b/xlators/cluster/dht/src/dht-helper.c
@@ -89,7 +89,7 @@ dht_filter_loc_subvol_key (xlator_t *this, loc_t *loc, loc_t *new_loc,
         int ret = 0; /* not found */

         /* Why do other tasks if first required 'char' itself is not there */
-        if (loc->name && !strchr (loc->name, '@'))
+        if (!loc->name || !strchr (loc->name, '@'))
                 goto out;

         trav = this->children;
diff --git a/xlators/cluster/dht/src/dht-selfheal.c b/xlators/cluster/dht/src/dht-selfheal.c
index 3342c35a99c..b87e0ab481b 100644
--- a/xlators/cluster/dht/src/dht-selfheal.c
+++ b/xlators/cluster/dht/src/dht-selfheal.c
@@ -136,8 +136,10 @@ dht_selfheal_dir_xattr_persubvol (call_frame_t *frame, loc_t *loc,
         int ret = 0;
         xlator_t *this = NULL;
         int32_t *disk_layout = NULL;
+        dht_local_t *local = NULL;


+        local = frame->local;
         subvol = layout->list[i].xlator;
         this = frame->this;

@@ -171,6 +173,9 @@ dht_selfheal_dir_xattr_persubvol (call_frame_t *frame, loc_t *loc,

         dict_ref (xattr);

+        if (!uuid_is_null (local->gfid))
+                uuid_copy (loc->gfid, local->gfid);
+
         STACK_WIND (frame, dht_selfheal_dir_xattr_cbk,
                     subvol, subvol->fops->setxattr,
                     loc, xattr, 0);
@@ -306,6 +311,9 @@ dht_selfheal_dir_setattr (call_frame_t *frame, loc_t *loc, struct iatt *stbuf,
                 return 0;
         }

+        if (!uuid_is_null (local->gfid))
+                uuid_copy (loc->gfid, local->gfid);
+
         local->call_cnt = missing_attr;
         for (i = 0; i < layout->cnt; i++) {
                 if (layout->list[i].err == -1) {
diff --git a/xlators/features/marker/src/marker.c b/xlators/features/marker/src/marker.c
index a40755149f5..93b1518cb7e 100644
--- a/xlators/features/marker/src/marker.c
+++ b/xlators/features/marker/src/marker.c
@@ -65,6 +65,7 @@ marker_loc_fill (loc_t *loc, inode_t *inode, inode_t *parent, char *path)

         if (inode) {
                 loc->inode = inode_ref (inode);
+                uuid_copy (loc->gfid, loc->inode->gfid);
         }

         if (parent)
@@ -94,34 +95,25 @@ int
 marker_inode_loc_fill (inode_t *inode, loc_t *loc)
 {
         char *resolvedpath = NULL;
-        inode_t *parent = NULL;
         int ret = -1;
+        inode_t *parent = NULL;

         if ((!inode) || (!loc))
                 return ret;

-        if ((inode) && __is_root_gfid (inode->gfid)) {
-                loc->parent = NULL;
-                goto ignore_parent;
-        }
+        parent = inode_parent (inode, NULL, NULL);

-        parent = inode_parent (inode, 0, NULL);
-        if (!parent) {
-                goto err;
-        }
-
-ignore_parent:
         ret = inode_path (inode, NULL, &resolvedpath);
         if (ret < 0)
                 goto err;

-        ret = marker_loc_fill (loc, inode, parent, resolvedpath);
+        ret = marker_loc_fill (loc, inode, NULL, resolvedpath);
         if (ret < 0)
                 goto err;

 err:
-        if (parent)
-                inode_unref (parent);
+        if (parent)
+                inode_unref (parent);

         if (resolvedpath)
                 GF_FREE (resolvedpath);
diff --git a/xlators/mount/fuse/src/fuse-bridge.c b/xlators/mount/fuse/src/fuse-bridge.c
index 8c1cd8f7568..b8f53a1bc3c 100644
--- a/xlators/mount/fuse/src/fuse-bridge.c
+++ b/xlators/mount/fuse/src/fuse-bridge.c
@@ -3378,6 +3378,7 @@ fuse_first_lookup (xlator_t *this)
         loc.path = "/";
         loc.name = "";
         loc.inode = fuse_ino_to_inode (1, this);
+        uuid_copy (loc.gfid, loc.inode->gfid);
         loc.parent = NULL;

         dict = dict_new ();
diff --git a/xlators/mount/fuse/src/fuse-bridge.h b/xlators/mount/fuse/src/fuse-bridge.h
index 39b54f6fe32..ae764a7bccc 100644
--- a/xlators/mount/fuse/src/fuse-bridge.h
+++ b/xlators/mount/fuse/src/fuse-bridge.h
@@ -148,7 +148,9 @@ typedef struct fuse_private fuse_private_t;
                                 state->finh->unique, \
                                 state->finh->opcode); \
                         free_fuse_state (state); \
-                        return; \
+                        /* ideally, need to 'return', but let the */ \
+                        /* calling function take care of it */ \
+                        break; \
                 } \
  \
                 frame->root->state = state; \
@@ -165,6 +167,7 @@ typedef struct fuse_private fuse_private_t;
                 } else { \
                         STACK_WIND (frame, ret, xl, xl->fops->fop, args); \
                 } \
+ \
         } while (0)


@@ -242,7 +245,7 @@ typedef struct {
         char *resolved;
         int op_ret;
         int op_errno;
-        loc_t deep_loc;
+        loc_t resolve_loc;
         struct fuse_resolve_comp *components;
         int comp_count;
 } fuse_resolve_t;
diff --git a/xlators/mount/fuse/src/fuse-helpers.c b/xlators/mount/fuse/src/fuse-helpers.c
index 941907cea8b..9bf85f979c3 100644
--- a/xlators/mount/fuse/src/fuse-helpers.c
+++ b/xlators/mount/fuse/src/fuse-helpers.c
@@ -68,7 +68,7 @@ fuse_resolve_wipe (fuse_resolve_t *resolve)
         if (resolve->resolved)
                 GF_FREE ((void *)resolve->resolved);

-        loc_wipe (&resolve->deep_loc);
+        loc_wipe (&resolve->resolve_loc);

         comp = resolve->components;

@@ -321,6 +321,8 @@ fuse_loc_fill (loc_t *loc, fuse_state_t *state, ino_t ino,
                 if (!parent) {
                         parent = fuse_ino_to_inode (par, state->this);
                         loc->parent = parent;
+                        if (parent)
+                                uuid_copy (loc->pargfid, parent->gfid);
                 }

                 inode = loc->inode;
@@ -342,16 +344,17 @@ fuse_loc_fill (loc_t *loc, fuse_state_t *state, ino_t ino,
                 if (!inode) {
                         inode = fuse_ino_to_inode (ino, state->this);
                         loc->inode = inode;
+                        if (inode)
+                                uuid_copy (loc->gfid, inode->gfid);
                 }

                 parent = loc->parent;
                 if (!parent) {
-                        parent = fuse_ino_to_inode (par, state->this);
-                        if (!parent) {
-                                parent = inode_parent (inode, null_gfid, NULL);
-                        }
-
+                        parent = inode_parent (inode, null_gfid, NULL);
                         loc->parent = parent;
+                        if (parent)
+                                uuid_copy (loc->pargfid, parent->gfid);
+
                 }

                 ret = inode_path (inode, NULL, &path);
diff --git a/xlators/mount/fuse/src/fuse-resolve.c b/xlators/mount/fuse/src/fuse-resolve.c
index 33606f87919..755e2f429f1 100644
--- a/xlators/mount/fuse/src/fuse-resolve.c
+++ b/xlators/mount/fuse/src/fuse-resolve.c
@@ -26,375 +26,203 @@
 static int
 fuse_resolve_all (fuse_state_t *state);

-static int
-fuse_resolve_path_simple (fuse_state_t *state);
-
-static int
-component_count (const char *path)
-{
-        int count = 0;
-        const char *trav = NULL;
-
-        for (trav = path; *trav; trav++) {
-                if (*trav == '/')
-                        count++;
-        }
-
-        return count + 2;
-}
-
-static int
-prepare_components (fuse_state_t *state)
-{
-        fuse_resolve_t *resolve = NULL;
-        char *resolved = NULL;
-        struct fuse_resolve_comp *components = NULL;
-        char *trav = NULL;
-        int count = 0;
-        int i = 0;
-
-        resolve = state->resolve_now;
-
-        resolved = gf_strdup (resolve->path);
-        resolve->resolved = resolved;
-
-        count = component_count (resolve->path);
-        components = GF_CALLOC (sizeof (*components), count, 0); //TODO
-        if (!components)
-                goto out;
-        resolve->components = components;
-
-        components[0].basename = "";
-        components[0].ino = 1;
-        components[0].gen = 0;
-        components[0].inode = inode_ref (state->itable->root);
-
-        i = 1;
-        for (trav = resolved; *trav; trav++) {
-                if (*trav == '/') {
-                        components[i].basename = trav + 1;
-                        *trav = 0;
-                        i++;
-                }
-        }
-out:
-        return 0;
-}
+int fuse_resolve_continue (fuse_state_t *state);
+int fuse_resolve_entry_simple (fuse_state_t *state);
+int fuse_resolve_inode_simple (fuse_state_t *state);

 static int
 fuse_resolve_loc_touchup (fuse_state_t *state)
 {
         fuse_resolve_t *resolve = NULL;
-        loc_t *loc = NULL;
-        char *path = NULL;
-        int ret = 0;
+        loc_t *loc = NULL;
+        char *path = NULL;
+        int ret = 0;

         resolve = state->resolve_now;
         loc = state->loc_now;

         if (!loc->path) {
-                if (loc->parent) {
+                if (loc->parent && resolve->bname) {
                         ret = inode_path (loc->parent, resolve->bname, &path);
                 } else if (loc->inode) {
                         ret = inode_path (loc->inode, NULL, &path);
                 }
                 if (ret)
-                        gf_log ("", GF_LOG_TRACE,
+                        gf_log (THIS->name, GF_LOG_TRACE,
                                 "return value inode_path %d", ret);
-
-                if (!path)
-                        path = gf_strdup (resolve->path);
-
                 loc->path = path;
         }

-        loc->name = strrchr (loc->path, '/');
-        if (loc->name)
-                loc->name++;
-
-        if (!loc->parent && loc->inode) {
-                loc->parent = inode_parent (loc->inode, 0, NULL);
-        }
-
         return 0;
 }

-static int
-fuse_resolve_newfd_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
-                        int32_t op_ret, int32_t op_errno, fd_t *fd)
+
+int
+fuse_resolve_gfid_entry_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
+                             int op_ret, int op_errno, inode_t *inode,
+                             struct iatt *buf, dict_t *xattr,
+                             struct iatt *postparent)
 {
         fuse_state_t *state = NULL;
         fuse_resolve_t *resolve = NULL;
-        fd_t *old_fd = NULL;
-        fd_t *tmp_fd = NULL;
-        fuse_fd_ctx_t *tmp_fd_ctx = 0;
-        uint64_t val = 0;
-        int ret = 0;
+        inode_t *link_inode = NULL;
+        loc_t *resolve_loc = NULL;

         state = frame->root->state;
         resolve = state->resolve_now;
+        resolve_loc = &resolve->resolve_loc;

         STACK_DESTROY (frame->root);

         if (op_ret == -1) {
-                resolve->op_ret = -1;
-                resolve->op_errno = op_errno;
+                gf_log (this->name, ((op_errno == ENOENT) ? GF_LOG_DEBUG :
+                                     GF_LOG_WARNING),
+                        "%s/%s: failed to resolve (%s)",
+                        uuid_utoa (resolve_loc->pargfid), resolve_loc->name,
+                        strerror (op_errno));
                 goto out;
         }

-        old_fd = resolve->fd;
-
-        state->fd = fd_ref (fd);
-
-        fd_bind (fd);
+        link_inode = inode_link (inode, resolve_loc->parent,
+                                 resolve_loc->name, buf);

-        resolve->fd = NULL;
+        if (!link_inode)
+                goto out;

-        LOCK (&old_fd->lock);
-        {
-                ret = __fd_ctx_get (old_fd, state->this, &val);
-                if (!ret) {
-                        tmp_fd_ctx = (fuse_fd_ctx_t *)(unsigned long)val;
-                        tmp_fd = tmp_fd_ctx->fd;
-                        if (tmp_fd) {
-                                fd_unref (tmp_fd);
-                                tmp_fd_ctx->fd = NULL;
-                        }
-                } else {
-                        tmp_fd_ctx = __fuse_fd_ctx_check_n_create (old_fd,
-                                                                   state->this);
-                }
+        inode_lookup (link_inode);

-                if (tmp_fd_ctx) {
-                        tmp_fd_ctx->fd = fd;
-                } else {
-                        gf_log ("resolve", GF_LOG_WARNING,
-                                "failed to set the fd ctx with resolved fd");
-                }
-        }
-        UNLOCK (&old_fd->lock);
+        inode_unref (link_inode);
 out:
-        fuse_resolve_all (state);
-        return 0;
-}
-
-static void
-fuse_resolve_new_fd (fuse_state_t *state)
-{
-        fuse_resolve_t *resolve = NULL;
-        fd_t *new_fd = NULL;
-        fd_t *fd = NULL;
-
-        resolve = state->resolve_now;
-        fd = resolve->fd;
-
-        new_fd = fd_create (state->loc.inode, state->finh->pid);
-        new_fd->flags = (fd->flags & ~O_TRUNC);
-
-        gf_log ("resolve", GF_LOG_DEBUG,
-                "%"PRIu64": OPEN %s", state->finh->unique,
-                state->loc.path);
-
-        FUSE_FOP (state, fuse_resolve_newfd_cbk, GF_FOP_OPEN,
-                  open, &state->loc, new_fd->flags, new_fd, 0);
-}
-
-static int
-fuse_resolve_deep_continue (fuse_state_t *state)
-{
-        fuse_resolve_t *resolve = NULL;
-        int ret = 0;
-
-        resolve = state->resolve_now;
-
-        resolve->op_ret = 0;
-        resolve->op_errno = 0;
-
-        if (resolve->path)
-                ret = fuse_resolve_path_simple (state);
-        if (ret)
-                gf_log ("resolve", GF_LOG_TRACE,
-                        "return value of resolve_*_simple %d", ret);
-
-        fuse_resolve_loc_touchup (state);
-
-        /* This function is called by either fd resolve or inode resolve */
-        if (!resolve->fd)
-                fuse_resolve_all (state);
-        else
-                fuse_resolve_new_fd (state);
+        loc_wipe (resolve_loc);
+        fuse_resolve_continue (state);

         return 0;
 }

-static int
-fuse_resolve_deep_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
-                       int op_ret, int op_errno, inode_t *inode, struct iatt *buf,
-                       dict_t *xattr, struct iatt *postparent)
+int
+fuse_resolve_gfid_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
+                       int op_ret, int op_errno, inode_t *inode, struct iatt *buf,
+                       dict_t *xattr, struct iatt *postparent)
 {
-        fuse_state_t *state = NULL;
-        fuse_resolve_t *resolve = NULL;
-        struct fuse_resolve_comp *components = NULL;
-        inode_t *link_inode = NULL;
-        int i = 0;
+        fuse_state_t *state = NULL;
+        fuse_resolve_t *resolve = NULL;
+        inode_t *link_inode = NULL;
+        loc_t *resolve_loc = NULL;

         state = frame->root->state;
         resolve = state->resolve_now;
-        components = resolve->components;
-
-        i = (long) cookie;
+        resolve_loc = &resolve->resolve_loc;

         STACK_DESTROY (frame->root);

         if (op_ret == -1) {
-                goto get_out_of_here;
+                gf_log (this->name, ((op_errno == ENOENT) ? GF_LOG_DEBUG :
+                                     GF_LOG_WARNING),
+                        "%s: failed to resolve (%s)",
+                        uuid_utoa (resolve_loc->gfid), strerror (op_errno));
+                loc_wipe (&resolve->resolve_loc);
+                goto out;
         }

-        if (i != 0) {
-                /* no linking for root inode */
-                link_inode = inode_link (inode, resolve->deep_loc.parent,
-                                         resolve->deep_loc.name, buf);
-                components[i].inode = link_inode;
-                link_inode = NULL;
-        }
+        loc_wipe (resolve_loc);

-        loc_wipe (&resolve->deep_loc);
-        i++; /* next component */
+        link_inode = inode_link (inode, NULL, NULL, buf);

-        if (!components[i].basename) {
-                /* all components of the path are resolved */
-                goto get_out_of_here;
+        if (!link_inode)
+                goto out;
+
+        inode_lookup (link_inode);
+
+        if (uuid_is_null (resolve->pargfid)) {
+                inode_unref (link_inode);
+                goto out;
         }

-        /* join the current component with the path resolved until now */
-        *(components[i].basename - 1) = '/';
+        resolve_loc->parent = link_inode;
+        uuid_copy (resolve_loc->pargfid, resolve_loc->parent->gfid);

-        resolve->deep_loc.path = gf_strdup (resolve->resolved);
-        resolve->deep_loc.parent = inode_ref (components[i-1].inode);
-        resolve->deep_loc.inode = inode_new (state->itable);
-        resolve->deep_loc.name = components[i].basename;
+        resolve_loc->name = resolve->bname;

-        FUSE_FOP_COOKIE (state, state->itable->xl, fuse_resolve_deep_cbk,
-                         (void *)(long)i,
-                         GF_FOP_LOOKUP, lookup, &resolve->deep_loc, NULL);
-        return 0;
+        resolve_loc->inode = inode_new (state->itable);
+        inode_path (resolve_loc->parent, resolve_loc->name,
+                    (char **) &resolve_loc->path);
+
+        FUSE_FOP (state, fuse_resolve_gfid_entry_cbk, GF_FOP_LOOKUP,
+                  lookup, &resolve->resolve_loc, NULL);

-get_out_of_here:
-        fuse_resolve_deep_continue (state);
+        return 0;
+out:
+        fuse_resolve_continue (state);
         return 0;
 }

-static int
-fuse_resolve_path_deep (fuse_state_t *state)
+int
+fuse_resolve_gfid (fuse_state_t *state)
 {
-        fuse_resolve_t *resolve = NULL;
-        struct fuse_resolve_comp *components = NULL;
-        inode_t *inode = NULL;
-        long i = 0;
+        fuse_resolve_t *resolve = NULL;
+        loc_t *resolve_loc = NULL;
+        int ret = 0;

         resolve = state->resolve_now;
+        resolve_loc = &resolve->resolve_loc;

-        prepare_components (state);
-
-        components = resolve->components;
-
-        /* start from the root */
-        for (i = 1; components[i].basename; i++) {
-                *(components[i].basename - 1) = '/';
-                inode = inode_grep (state->itable, components[i-1].inode,
-                                    components[i].basename);
-                if (!inode)
-                        break;
-                components[i].inode = inode;
+        if (!uuid_is_null (resolve->pargfid)) {
+                uuid_copy (resolve_loc->gfid, resolve->pargfid);
+                resolve_loc->inode = inode_new (state->itable);
+                ret = inode_path (resolve_loc->inode, NULL,
+                                  (char **)&resolve_loc->path);
+        } else if (!uuid_is_null (resolve->gfid)) {
+                uuid_copy (resolve_loc->gfid, resolve->gfid);
+                resolve_loc->inode = inode_new (state->itable);
+                ret = inode_path (resolve_loc->inode, NULL,
+                                  (char **)&resolve_loc->path);
+        }
+        if (ret <= 0) {
+                gf_log (THIS->name, GF_LOG_WARNING,
+                        "failed to get the path from inode %s",
+                        uuid_utoa (resolve->gfid));
         }

-        if (!components[i].basename)
-                goto resolved;
-
-        resolve->deep_loc.path = gf_strdup (resolve->resolved);
-        resolve->deep_loc.parent = inode_ref (components[i-1].inode);
-        resolve->deep_loc.inode = inode_new (state->itable);
-        resolve->deep_loc.name = components[i].basename;
-
-        FUSE_FOP_COOKIE (state, state->itable->xl, fuse_resolve_deep_cbk,
-                         (void *)(long)i,
-                         GF_FOP_LOOKUP, lookup, &resolve->deep_loc, NULL);
+        FUSE_FOP (state, fuse_resolve_gfid_cbk, GF_FOP_LOOKUP,
+                  lookup, &resolve->resolve_loc, NULL);

         return 0;
-resolved:
-        fuse_resolve_deep_continue (state);
-        return 0;
 }

-static int
-fuse_resolve_path_simple (fuse_state_t *state)
+int
+fuse_resolve_continue (fuse_state_t *state)
 {
-        fuse_resolve_t *resolve = NULL;
-        struct fuse_resolve_comp *components = NULL;
-        int ret = -1;
-        int par_idx = 0;
-        int ino_idx = 0;
-        int i = 0;
+        fuse_resolve_t *resolve = NULL;
+        int ret = 0;

         resolve = state->resolve_now;
-        components = resolve->components;
-
-        if (!components) {
-                resolve->op_ret = -1;
-                resolve->op_errno = ENOENT;
-                goto out;
-        }
-
-        for (i = 0; components[i].basename; i++) {
-                par_idx = ino_idx;
-                ino_idx = i;
-        }
-
-        if (!components[par_idx].inode) {
-                resolve->op_ret = -1;
-                resolve->op_errno = ENOENT;
-                goto out;
-        }
-
-        if (!components[ino_idx].inode &&
-            (resolve->type == RESOLVE_MUST || resolve->type == RESOLVE_EXACT)) {
-                resolve->op_ret = -1;
-                resolve->op_errno = ENOENT;
-                goto out;
-        }
-
-        if (components[ino_idx].inode && resolve->type == RESOLVE_NOT) {
-                resolve->op_ret = -1;
-                resolve->op_errno = EEXIST;
-                goto out;
-        }

-        if (components[ino_idx].inode) {
-                if (state->loc_now->inode) {
-                        inode_unref (state->loc_now->inode);
-                }
-
-                state->loc_now->inode = inode_ref (components[ino_idx].inode);
-        }
+        resolve->op_ret = 0;
+        resolve->op_errno = 0;

-        if (state->loc_now->parent) {
-                inode_unref (state->loc_now->parent);
-        }
+        /* TODO: should we handle 'fd' here ? */
+        if (!uuid_is_null (resolve->pargfid))
+                ret = fuse_resolve_entry_simple (state);
+        else if (!uuid_is_null (resolve->gfid))
+                ret = fuse_resolve_inode_simple (state);
+        if (ret)
+                gf_log (THIS->name, GF_LOG_DEBUG,
+                        "return value of resolve_*_simple %d", ret);

-        state->loc_now->parent = inode_ref (components[par_idx].inode);
+        fuse_resolve_loc_touchup (state);

-        ret = 0;
+        fuse_resolve_all (state);

-out:
-        return ret;
+        return 0;
 }

+
 /*
   Check if the requirements are fulfilled by entries in the inode cache itself
   Return value:
@@ -445,6 +273,7 @@ fuse_resolve_entry_simple (fuse_state_t *state)
         }

         state->loc_now->inode = inode_ref (inode);
+        uuid_copy (state->loc_now->gfid, resolve->gfid);

 out:
         if (parent)
@@ -468,7 +297,7 @@ fuse_resolve_entry (fuse_state_t *state)
         ret = fuse_resolve_entry_simple (state);
         if (ret > 0) {
                 loc_wipe (loc);
-                fuse_resolve_path_deep (state);
+                fuse_resolve_gfid (state);
                 return 0;
         }

@@ -505,6 +334,7 @@ fuse_resolve_inode_simple (fuse_state_t *state)
         }

         state->loc_now->inode = inode_ref (inode);
+        uuid_copy (state->loc_now->gfid, resolve->gfid);

 out:
         if (inode)
@@ -526,7 +356,7 @@ fuse_resolve_inode (fuse_state_t *state)

         if (ret > 0) {
                 loc_wipe (loc);
-                fuse_resolve_path_deep (state);
+                fuse_resolve_gfid (state);
                 return 0;
         }

@@ -574,8 +404,6 @@ fuse_resolve_fd (fuse_state_t *state)

         state->loc_now = &state->loc;

-        fuse_resolve_path_deep (state);
-
 out:
         return 0;
 }
@@ -600,10 +428,6 @@ fuse_resolve (fuse_state_t *state)

                 fuse_resolve_inode (state);

-        } else if (resolve->path) {
-
-                fuse_resolve_path_deep (state);
-
         } else {

                 resolve->op_ret = 0;
diff --git a/xlators/performance/quick-read/src/quick-read.c b/xlators/performance/quick-read/src/quick-read.c
index 7db1e686f7d..6c9a0f0e5b5 100644
--- a/xlators/performance/quick-read/src/quick-read.c
+++ b/xlators/performance/quick-read/src/quick-read.c
@@ -82,47 +82,25 @@ static int32_t
 qr_loc_fill (loc_t *loc, inode_t *inode, char *path)
 {
         int32_t ret = -1;
-        char *parent = NULL;
-        char *path_copy = NULL;

         GF_VALIDATE_OR_GOTO_WITH_ERROR ("quick-read", loc, out, errno, EINVAL);
         GF_VALIDATE_OR_GOTO_WITH_ERROR ("quick-read", inode, out, errno, EINVAL);
         GF_VALIDATE_OR_GOTO_WITH_ERROR ("quick-read", path, out, errno, EINVAL);
-        GF_VALIDATE_OR_GOTO_WITH_ERROR ("quick-read", inode->table, out, errno,
-                                        EINVAL);

         loc->inode = inode_ref (inode);
-        loc->path = gf_strdup (path);
-
-        path_copy = gf_strdup (path);
-        if (path_copy == NULL) {
-                ret = -1;
-                goto out;
-        }
+        uuid_copy (loc->gfid, inode->gfid);

-        parent = dirname (path_copy);
-
-        loc->parent = inode_from_path (inode->table, parent);
-        if (loc->parent == NULL) {
-                ret = -1;
-                errno = EINVAL;
-                gf_log ("quick-read", GF_LOG_WARNING,
-                        "cannot search parent inode for path (%s)", path);
+        loc->path = gf_strdup (path);
+        if (!loc->path)
                 goto out;
-        }

-        loc->name = strrchr (loc->path, '/');
         ret = 0;
 out:
         if (ret == -1) {
                 qr_loc_wipe (loc);
         }

-        if (path_copy) {
-                GF_FREE (path_copy);
-        }
-
         return ret;
 }

diff --git a/xlators/performance/read-ahead/src/read-ahead.c b/xlators/performance/read-ahead/src/read-ahead.c
index 6d7641e6909..37f34f2eb91 100644
--- a/xlators/performance/read-ahead/src/read-ahead.c
+++ b/xlators/performance/read-ahead/src/read-ahead.c
@@ -489,12 +489,8 @@ ra_readv (call_frame_t *frame, xlator_t *this, fd_t *fd, size_t size,
         fd_ctx_get (fd, this, &tmp_file);
         file = (ra_file_t *)(long)tmp_file;

-        if (file == NULL) {
-                op_errno = EBADF;
-                gf_log (this->name, GF_LOG_WARNING,
-                        "readv received on fd (%p) with no"
-                        " file set in its context", fd);
-                goto unwind;
+        if (!file || file->disabled) {
+                goto disabled;
         }

         if (file->offset != offset) {
@@ -520,14 +516,6 @@ ra_readv (call_frame_t *frame, xlator_t *this, fd_t *fd, size_t size,
                 flush_region (frame, file, 0, file->pages.prev->offset + 1);
         }

-        if (file->disabled) {
-                STACK_WIND (frame, ra_readv_disabled_cbk,
-                            FIRST_CHILD (frame->this),
-                            FIRST_CHILD (frame->this)->fops->readv,
-                            file->fd, size, offset);
-                return 0;
-        }
-
         local = (void *) GF_CALLOC (1, sizeof (*local), gf_ra_mt_ra_local_t);
         if (!local) {
                 op_errno = ENOMEM;
@@ -562,6 +550,13 @@ unwind:
         STACK_UNWIND_STRICT (readv, frame, -1, op_errno, NULL, 0, NULL, NULL);

         return 0;
+
+disabled:
+        STACK_WIND (frame, ra_readv_disabled_cbk,
+                    FIRST_CHILD (frame->this),
+                    FIRST_CHILD (frame->this)->fops->readv,
+                    fd, size, offset);
+        return 0;
 }


@@ -600,16 +595,10 @@ ra_flush (call_frame_t *frame, xlator_t *this, fd_t *fd)
         fd_ctx_get (fd, this, &tmp_file);
         file = (ra_file_t *)(long)tmp_file;

-        if (file == NULL) {
-                op_errno = EBADF;
-                gf_log (this->name, GF_LOG_WARNING,
-                        "flush received on fd (%p) with no"
-                        " file set in its context", fd);
-                goto unwind;
+        if (file) {
+                flush_region (frame, file, 0, file->pages.prev->offset+1);
         }

-        flush_region (frame, file, 0, file->pages.prev->offset+1);
-
         STACK_WIND (frame, ra_flush_cbk, FIRST_CHILD (this),
                     FIRST_CHILD (this)->fops->flush, fd);
         return 0;
@@ -634,16 +623,10 @@ ra_fsync (call_frame_t *frame, xlator_t *this, fd_t *fd, int32_t datasync)
         fd_ctx_get (fd, this, &tmp_file);
         file = (ra_file_t *)(long)tmp_file;

-        if (file == NULL) {
-                op_errno = EBADF;
-                gf_log (this->name, GF_LOG_WARNING,
-                        "fsync received on fd (%p) with no"
-                        " file set in its context", fd);
-                goto unwind;
+        if (file) {
+                flush_region (frame, file, 0, file->pages.prev->offset+1);
         }

-        flush_region (frame, file, 0, file->pages.prev->offset+1);
-
         STACK_WIND (frame, ra_fsync_cbk, FIRST_CHILD (this),
                     FIRST_CHILD (this)->fops->fsync, fd, datasync);
         return 0;
@@ -659,28 +642,16 @@ ra_writev_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                int32_t op_ret, int32_t op_errno, struct iatt *prebuf,
                struct iatt *postbuf)
 {
-        fd_t *fd = NULL;
         ra_file_t *file = NULL;
-        uint64_t tmp_file = 0;

         GF_ASSERT (frame);

-        fd = frame->local;
-
-        fd_ctx_get (fd, this, &tmp_file);
-        file = (ra_file_t *)(long)tmp_file;
+        file = frame->local;

-        if (file == NULL) {
-                gf_log (this->name, GF_LOG_WARNING,
-                        "no read-ahead context set in fd (%p)", fd);
-                op_errno = EBADF;
-                op_ret = -1;
-                goto out;
+        if (file) {
+                flush_region (frame, file, 0, file->pages.prev->offset+1);
         }

-        flush_region (frame, file, 0, file->pages.prev->offset+1);
-
-out:
         frame->local = NULL;
         STACK_UNWIND_STRICT (writev, frame, op_ret, op_errno, prebuf, postbuf);
         return 0;
@@ -701,20 +672,13 @@ ra_writev (call_frame_t *frame, xlator_t *this, fd_t *fd, struct iovec *vector,
         fd_ctx_get (fd, this, &tmp_file);
         file = (ra_file_t *)(long)tmp_file;

-        if (file == NULL) {
-                op_errno = EBADF;
-                gf_log (this->name, GF_LOG_WARNING, "writev received on fd with"
-                        "no file set in its context");
-                goto unwind;
+        if (file) {
+                flush_region (frame, file, 0, file->pages.prev->offset+1);
+                frame->local = file;
+                /* reset the read-ahead counters too */
+                file->expected = file->page_count = 0;
         }

-        flush_region (frame, file, 0, file->pages.prev->offset+1);
-
-        /* reset the read-ahead counters too */
-        file->expected = file->page_count = 0;
-
-        frame->local = fd;
-
         STACK_WIND (frame, ra_writev_cbk,
                     FIRST_CHILD(this),
                     FIRST_CHILD(this)->fops->writev,
diff --git a/xlators/performance/stat-prefetch/src/stat-prefetch.c b/xlators/performance/stat-prefetch/src/stat-prefetch.c
index 563da9e7267..73cc3a955d8 100644
--- a/xlators/performance/stat-prefetch/src/stat-prefetch.c
+++ b/xlators/performance/stat-prefetch/src/stat-prefetch.c
@@ -19,6 +19,7 @@

 #include "stat-prefetch.h"
 #include "statedump.h"
+#include "fd.h"

 #define GF_SP_CACHE_BUCKETS 1
 #define GF_SP_CACHE_ENTRIES_EXPECTED (128 * 1024) //1048576
@@ -667,9 +668,6 @@ out:
 }


-fd_t *
-_fd_ref (fd_t *fd);
-
 void
 sp_remove_caches_from_all_fds_opened (xlator_t *this, inode_t *inode,
                                       char *name)
@@ -705,7 +703,7 @@ sp_remove_caches_from_all_fds_opened (xlator_t *this, inode_t *inode,

                         INIT_LIST_HEAD (&wrapper->list);

-                        wrapper->fd = _fd_ref (fd);
+                        wrapper->fd = __fd_ref (fd);
                         list_add_tail (&wrapper->list, &head);
                 }
         }
diff --git a/xlators/performance/write-behind/src/write-behind.c b/xlators/performance/write-behind/src/write-behind.c
index 1555a79a75b..52e03872026 100644
--- a/xlators/performance/write-behind/src/write-behind.c
+++ b/xlators/performance/write-behind/src/write-behind.c
@@ -832,17 +832,20 @@ wb_fstat (call_frame_t *frame, xlator_t *this, fd_t *fd)
         GF_VALIDATE_OR_GOTO (frame->this->name, this, unwind);
         GF_VALIDATE_OR_GOTO (frame->this->name, fd, unwind);

+
         if ((!IA_ISDIR (fd->inode->ia_type))
             && fd_ctx_get (fd, this, &tmp_file)) {
-                gf_log (this->name, GF_LOG_WARNING,
-                        "write behind file pointer is"
-                        " not stored in context of fd(%p), returning EBADFD",
-                        fd);
-                op_errno = EBADFD;
-                goto unwind;
+                file = wb_file_create (this, fd, 0);
+        } else {
+                file = (wb_file_t *)(long)tmp_file;
+                if ((!IA_ISDIR (fd->inode->ia_type)) && (file == NULL)) {
+                        gf_log (this->name, GF_LOG_WARNING,
+                                "wb_file not found for fd %p", fd);
+                        op_errno = EBADFD;
+                        goto unwind;
+                }
         }

-        file = (wb_file_t *)(long)tmp_file;

         local = GF_CALLOC (1, sizeof (*local), gf_wb_mt_wb_local_t);
         if (local == NULL) {
@@ -1115,18 +1118,20 @@ wb_ftruncate (call_frame_t *frame, xlator_t *this, fd_t *fd, off_t offset)
         GF_VALIDATE_OR_GOTO (frame->this->name, this, unwind);
         GF_VALIDATE_OR_GOTO (frame->this->name, fd, unwind);

+
         if ((!IA_ISDIR (fd->inode->ia_type))
             && fd_ctx_get (fd, this, &tmp_file)) {
-                gf_log (this->name, GF_LOG_WARNING,
-                        "write behind file pointer is"
-                        " not stored in context of fd(%p), returning EBADFD",
-                        fd);
-                op_errno = EBADFD;
-                goto unwind;
+                file = wb_file_create (this, fd, 0);
+        } else {
+                file = (wb_file_t *)(long)tmp_file;
+                if ((!IA_ISDIR (fd->inode->ia_type)) && (file == NULL)) {
+                        gf_log (this->name, GF_LOG_WARNING,
+                                "wb_file not found for fd %p", fd);
+                        op_errno = EBADFD;
+                        goto unwind;
+                }
         }

-        file = (wb_file_t *)(long)tmp_file;
-
         local = GF_CALLOC (1, sizeof (*local), gf_wb_mt_wb_local_t);
         if (local == NULL) {
                 op_errno = ENOMEM;
@@ -2091,21 +2096,15 @@ wb_writev (call_frame_t *frame, xlator_t *this, fd_t *fd, struct iovec *vector,

         if ((!IA_ISDIR (fd->inode->ia_type))
             && fd_ctx_get (fd, this, &tmp_file)) {
-                gf_log (this->name, GF_LOG_WARNING,
-                        "write behind file pointer is"
-                        " not stored in context of fd(%p), returning EBADFD",
-                        fd);
-
-                op_errno = EBADFD;
-                goto unwind;
-        }
-
-        file = (wb_file_t *)(long)tmp_file;
-        if ((!IA_ISDIR (fd->inode->ia_type)) && (file == NULL)) {
-                gf_log (this->name, GF_LOG_WARNING,
-                        "wb_file not found for fd %p", fd);
-                op_errno = EBADFD;
-                goto unwind;
+                file = wb_file_create (this, fd, 0);
+        } else {
+                file = (wb_file_t *)(long)tmp_file;
+                if ((!IA_ISDIR (fd->inode->ia_type)) && (file == NULL)) {
+                        gf_log (this->name, GF_LOG_WARNING,
+                                "wb_file not found for fd %p", fd);
+                        op_errno = EBADFD;
+                        goto unwind;
+                }
         }

         if (file != NULL) {
@@ -2265,16 +2264,17 @@ wb_readv (call_frame_t *frame, xlator_t *this, fd_t *fd, size_t size,

         if ((!IA_ISDIR (fd->inode->ia_type))
             && fd_ctx_get (fd, this, &tmp_file)) {
-                gf_log (this->name, GF_LOG_WARNING,
-                        "write behind file pointer is"
-                        " not stored in context of fd(%p), returning EBADFD",
-                        fd);
-                op_errno = EBADFD;
-                goto unwind;
+                file = wb_file_create (this, fd, 0);
+        } else {
+                file = (wb_file_t *)(long)tmp_file;
+                if ((!IA_ISDIR (fd->inode->ia_type)) && (file == NULL)) {
+                        gf_log (this->name, GF_LOG_WARNING,
+                                "wb_file not found for fd %p", fd);
+                        op_errno = EBADFD;
+                        goto unwind;
+                }
         }

-        file = (wb_file_t *)(long)tmp_file;
-
         local = GF_CALLOC (1, sizeof (*local), gf_wb_mt_wb_local_t);
         if (local == NULL) {
                 op_errno = ENOMEM;
@@ -2449,19 +2449,20 @@ wb_flush (call_frame_t *frame, xlator_t *this, fd_t *fd)

         conf = this->private;

+
         if ((!IA_ISDIR (fd->inode->ia_type))
             && fd_ctx_get (fd, this, &tmp_file)) {
-                gf_log (this->name, GF_LOG_WARNING,
-                        "write behind file pointer is"
-                        " not stored in context of fd(%p), returning EBADFD",
-                        fd);
-
-                op_errno = EBADFD;
-                goto unwind;
+                file = wb_file_create (this, fd, 0);
+        } else {
+                file = (wb_file_t *)(long)tmp_file;
+                if ((!IA_ISDIR (fd->inode->ia_type)) && (file == NULL)) {
+                        gf_log (this->name, GF_LOG_WARNING,
+                                "wb_file not found for fd %p", fd);
+                        op_errno = EBADFD;
+                        goto unwind;
+                }
         }

- |
