glusterfs.git/xlators/storage, branch v3.6.3beta1

storage/posix: Don't try to set gfid in case of INTERNAL-mknod

2015-02-11T09:49:16+00:00

        Backport of http://review.gluster.org/9446

BUG: 1184527
Change-Id: I3e31b0672704cae09c5097f84229b8264a6c0fbe
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/9472
Tested-by: Gluster Build System 
Reviewed-by: Raghavendra Bhat

storage/posix: Set gfid after all xattrs, uid/gid are set

2015-02-11T09:49:02+00:00

        Backport of http://review.gluster.com/9434

Problem:
When a new entry is created gfid is set even before uid/gid, xattrs
are set on the entry. This can lead to dht/afr healing that file/dir
with the uid/gid it sees just after the gfid is set, i.e. root/root.
Sometimes setattr/setxattr are failing on that file/dir.

Fix:
Set gfid of the file/directory only after uid/gid, xattrs are set up
properly. Readdirp and lookup either wait for the gfid to be assigned
to the entry or do not update the in-memory inode ctx in the posix-acl
xlator, which was producing a lot of EACCES/EPERM errors to the
application or dht/afr self-heals.

BUG: 1184527
Change-Id: Ia6dfd492e03db2275665e7f63260611b310e38e6
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/9471
Tested-by: Gluster Build System 
Reviewed-by: Raghavendra Bhat

telldir()/seekdir() portability fixes

2014-12-20T09:46:26+00:00

POSIX says that an offset obtained from telldir() can only be used
on the same DIR *. Linux is able to reuse the offset across
closedir()/opendir() for a given directory, but this is not portable
and code relying on this behavior should be fixed.

An incomplete fix for the posix xlator was merged in
http://review.gluster.org/8933
This change set completes it.

- Perform the same fix in the index xlator.
- Use appropriate casts and variable types so that 32 bit signed
  offsets obtained by telldir() do not get clobbered when copied into
  64 bit signed types.
- Modify afr-self-heald.c so that it does not use an anonymous fd,
  since this will cause closedir()/opendir() between each
  syncop_readdir(). On failure we fall back to an anonymous fd
  only on Linux so that we can cope with an updated client vs a not
  updated brick.
- Avoid sending an EINVAL when the client requests the EOF offset.
  Here we fix an error in the previous fix for the posix xlator: since
  we fill each directory entry with the offset of the next entry, we
  must consider as EOF the offset of the last entry, and not the
  value of telldir() after we read it.

This is a backport of I59fb7f06a872c4f98987105792d648141c258c6a

BUG: 1138897
Change-Id: I1e9f3e4a7d780b98adf6d9f197ee2198d43ef94d
Signed-off-by: Emmanuel Dreyfus 
Reviewed-on: http://review.gluster.org/9084
Tested-by: Gluster Build System 
Reviewed-by: Raghavendra Bhat

posix: Fix buffer overrun in _handle_list_xattr()

2014-12-20T09:39:39+00:00

In _handle_list_xattr() we test remaining_size > 0 to check that
we do not overrun the buffer, but since that variable was unsigned
(size_t), it could never become negative: a subtraction past zero
wrapped around to a huge value, and the condition let us go beyond
the end of the buffer.

This could happen if the attribute list grew between the first
sys_llistxattr() call that gets the size and the second sys_llistxattr()
call that gets the data. We fix the problem by making remaining_size
signed (ssize_t). This also matches the sys_llistxattr() return type.

While there, we use the size returned by the second sys_llistxattr()
call to parse the buffer, as it may also be smaller than the size
obtained from the first call if the attribute list shrank.
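The signed-counter idea above can be sketched as follows. This is only
an illustration, not the actual _handle_list_xattr() code: the function
name count_listed_xattrs() is hypothetical, and the sketch assumes the
buffer is NUL-terminated somewhere, even if that is past the size the
caller believes it has.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: count the NUL-terminated names in a
 * listxattr-style buffer that the caller believes is `size` bytes long.
 * Assumption: the buffer is NUL-terminated, possibly beyond `size`. */
static int count_listed_xattrs(const char *list, ssize_t size)
{
    assert(list != NULL);

    ssize_t remaining = size; /* signed, so it can drop below zero */
    ssize_t offset = 0;
    int count = 0;

    while (remaining > 0) {
        size_t len = strlen(list + offset) + 1; /* name plus its NUL */

        count++;
        offset += (ssize_t)len;
        /* With size_t this subtraction could wrap to a huge value and
         * keep the loop running; with ssize_t it ends the loop. */
        remaining -= (ssize_t)len;
    }
    return count;
}
```

When the believed size ends in the middle of a name, remaining goes
negative after that name and the loop stops instead of walking on
through memory.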

This fixes a spurious crash in tests/basic/afr/resolve.t

backport of: Ifc5884dd0f39a50bf88aa51fefca8e2fa22ea913

BUG: 1138897
Change-Id: I37d4816b9cb246e34c92994cb969dc2be80be20d
Signed-off-by: Emmanuel Dreyfus 
Reviewed-on: http://review.gluster.org/9215
Tested-by: Gluster Build System 
Reviewed-by: Raghavendra Bhat

Posix: Brick failure detection fix for ext4 filesystem

2014-12-15T13:10:37+00:00

Issue: stat() on XFS has a check for the filesystem status but
ext4 does not.

Fix: Replace the stat() call with open, write and read on a new file under the
"brick/.glusterfs" directory. This change will work for xfs, ext4 and other
filesystems.
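A minimal sketch of such an open/write/read probe is shown below. The
function name brick_health_check(), the file name "health_check" and the
timestamp payload are assumptions made for this illustration, not taken
from the patch itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical probe: returns 0 if a write/read round trip works in
 * `dir`, -1 otherwise. The file name is an assumption of this sketch. */
static int brick_health_check(const char *dir)
{
    assert(dir != NULL);

    char path[4096];
    char out[32];
    char in[32] = {0};

    int n = snprintf(path, sizeof(path), "%s/health_check", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    /* Payload: current time, so each probe writes fresh data. */
    int len = snprintf(out, sizeof(out), "%ld", (long)time(NULL));
    if (len < 0 || (size_t)len >= sizeof(out))
        return -1;

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Unlike stat(), these calls have to touch the filesystem data. */
    int ok = write(fd, out, (size_t)len) == (ssize_t)len
          && lseek(fd, 0, SEEK_SET) == 0
          && read(fd, in, sizeof(in) - 1) == (ssize_t)len
          && memcmp(out, in, (size_t)len) == 0;

    close(fd);
    return ok ? 0 : -1;
}
```

A caller would run this periodically against the brick directory and
treat a -1 result as a failed brick.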

Change-Id: Id03c4bc07df4ee22916a293442bd74819b051839
BUG: 1158037
"Signed-off-by: Lalatendu Mohanty "
"Reviewed-on: http://review.gluster.org/8213"
(cherry picked from commit a7ef6eea4d43afdba9d0453c095e71e6bf22cdb7)
Signed-off-by: Lalatendu Mohanty 
Reviewed-on: http://review.gluster.org/8988
Reviewed-by: Niels de Vos 
Tested-by: Gluster Build System 
Reviewed-by: Raghavendra Bhat 
Tested-by: Raghavendra Bhat

inode: Handle '/' in basename in inode_link/unlink

2014-11-17T07:36:42+00:00

        Backport of http://review.gluster.org/9004

Problem:
inode_link is sometimes called with a trailing '/'. Lookup, dentry
operations like link/unlink/mkdir/rmdir/rename etc come without trailing
'/' so the stale dentry with '/' remains in the dentry list of the inode.

Fix:
Add assert checks and return NULL for '/' in bname.
Fix ancestry building code to call without '/' at the end.
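The two parts of the fix can be sketched as below. Both helper names are
hypothetical and only illustrate the idea: reject a basename containing
'/', and strip trailing slashes before calling into the inode code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: a basename must not contain '/'. */
static int bname_is_valid(const char *bname)
{
    assert(bname != NULL);
    return strchr(bname, '/') == NULL;
}

/* Hypothetical helper: copy `name` into `out` without trailing
 * slashes. Returns 0 on success, -1 if `out` is too small. */
static int strip_trailing_slashes(const char *name, char *out, size_t outlen)
{
    assert(name != NULL && out != NULL);

    size_t len = strlen(name);

    while (len > 0 && name[len - 1] == '/')
        len--;
    if (len + 1 > outlen)
        return -1;
    memcpy(out, name, len);
    out[len] = '\0';
    return 0;
}
```

With these, "dir/" is rejected by bname_is_valid(), while the stripped
form "dir" is accepted.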

BUG: 1163570
Change-Id: I96655a0eb4678f80082705ab167327e72f54fa45
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/9111
Tested-by: Gluster Build System 
Reviewed-by: Vijay Bellur

storage/posix: Treat ENODATA/ENOATTR as success in bulk removexattr

2014-11-17T07:08:35+00:00

        Backport of http://review.gluster.org/9049

Bulk remove xattr is an internal fop in gluster. Some of the xattrs may have
special behavior. Ex: removexattr("posix.system_acl_access") removes more than
one xattr on the file, and those could be present in the bulk-removal request.
Removexattr of these already deleted xattrs will fail with ENODATA/ENOATTR.
Since all this fop cares about is removal of the xattrs in the bulk-remove
request, finding them already deleted can be treated as success.
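The errno handling described above could look roughly like this. The
function names are hypothetical; the #ifdef guards are there because not
every platform defines both ENODATA and ENOATTR.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: does `err` mean "the xattr was not there"? */
static int is_missing_xattr_errno(int err)
{
#ifdef ENODATA
    if (err == ENODATA)
        return 1;
#endif
#ifdef ENOATTR
    if (err == ENOATTR)
        return 1;
#endif
    return 0;
}

/* Hypothetical helper: map the result of one removexattr() call made
 * on behalf of a bulk removal. `ret` is the syscall return value and
 * `err` the errno seen on failure. Returns 0 or a negative errno. */
static int bulk_removexattr_result(int ret, int err)
{
    assert(ret == 0 || err > 0);

    if (ret == 0)
        return 0;
    if (is_missing_xattr_errno(err))
        return 0; /* already gone: the goal of the fop is met */
    return -err;
}
```

Any other errno, such as EPERM, is still reported as a failure.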

BUG: 1163571
Change-Id: I009f4736f8b6362d7115f57a7d7aece74e56e4f6
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/9109
Tested-by: Gluster Build System 
Reviewed-by: Vijay Bellur

Avoid spurious EINVAL in posix_readdir()

2014-10-29T09:26:56+00:00

On non Linux systems, we check that seekdir() succeeds and we return
EINVAL if it does not. We need this to avoid infinite loops if some
other component in GlusterFS makes an invalid seekdir() usage. This
was introduced in this change: http://review.gluster.org/#/c/8760/

But seekdir() also fails when using the offset returned for the
last entry, and this is expected behavior. As a result, the seekdir()
test produces a spurious EINVAL when reaching the end of the directory.
That error is not propagated to the calling process, but it may harm
internal GlusterFS processing. At the least, it produces a spurious error
message in the brick's log.

We fix the problem by remembering the last entry offset in fd private
data. When a new posix_readdir() invocation requests that offset,
we avoid returning EINVAL.

Backport of I4e67a2ea46538aae63eea663dd4aa33b16ad24c7

BUG: 1138897
Change-Id: I4e98294d157f67ae1a1f0ece1562c77d1219da40
Signed-off-by: Emmanuel Dreyfus 
Reviewed-on: http://review.gluster.org/8933
Tested-by: Gluster Build System 
Reviewed-by: Vijay Bellur

POSIX filesystem compliance: PATH_MAX

2014-10-03T14:58:10+00:00

POSIX mandates the filesystem to support paths of lengths up to
_XOPEN_PATH_MAX (1024).  This is the PATH_MAX limit here:
http://pubs.opengroup.org/onlinepubs/009604499/basedefs/limits.h.html

When using a path of 1023 bytes, the posix xlator attempts to create
an absolute path by prefixing the 1023 bytes path by the brick
base path. The result is an absolute path of more than _XOPEN_PATH_MAX
bytes which may be rejected by the backend filesystem.

Linux's ext3fs PATH_MAX seems to default to 4096, which means it
will work (except if the brick base path is longer than 2072 bytes, but
that is unlikely to happen). NetBSD's FFS PATH_MAX defaults to 1024,
which means the bug can happen regardless of brick base path length.

If this condition is detected for a brick, the proposed fix is to
chdir() the brick glusterfsd daemon to its brick base directory.
Then when encountering a path that will exceed _XOPEN_PATH_MAX once
prefixed by the brick base path, a relative path is used instead
of an absolute one. We do not always use relative path because some
operations require an absolute path on the brick base path itself
(e.g.: statvfs).
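The path selection can be sketched as follows. The function name
brick_real_path() and the SKETCH_PATH_MAX constant are hypothetical, and
the sketch assumes the process has already done chdir() to the brick
base directory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_PATH_MAX 1024 /* _XOPEN_PATH_MAX, per the text above */

/* Hypothetical helper: build the path handed to the backend for `rel`,
 * a path inside the brick that starts with '/'. Assumes the current
 * directory is `base`. Returns 0 on success, -1 if `out` is too small. */
static int brick_real_path(const char *base, const char *rel,
                           char *out, size_t outlen)
{
    assert(base != NULL && rel != NULL && out != NULL);

    size_t total = strlen(base) + strlen(rel);
    int n;

    if (total >= SKETCH_PATH_MAX) {
        /* Absolute form plus its NUL would exceed the limit: use a
         * path relative to the brick base instead. */
        const char *r = (rel[0] == '/') ? rel + 1 : rel;

        if (*r == '\0')
            r = ".";
        n = snprintf(out, outlen, "%s", r);
    } else {
        n = snprintf(out, outlen, "%s%s", base, rel);
    }
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Short paths keep the absolute form, so operations that need the brick
base path itself are unaffected.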

At least on NetBSD, this chdir() uncovers a race condition which
causes file lookup to fail with ENODATA for a few seconds. The
volume quickly reaches a sane state, but regression tests are fast
enough to choke on it. The reason is obscure (as often with race
conditions), but sleeping one second after the chdir() seems to
change scheduling enough that the problem disappears.

Note that since the chdir() is done only if the brick backend filesystem
does not support long enough paths, it will not occur with Linux
ext3fs (except if the brick base path is over 2072 bytes long).

This is a backport of I7db3567948bc8fa8d99ca5f5ba6647fe425186a9

BUG: 1138897
Change-Id: Ib8eb3efaac8a7ba505d830623921338689229e9a
Signed-off-by: Emmanuel Dreyfus 
Reviewed-on: http://review.gluster.org/8864
Tested-by: Gluster Build System 
Reviewed-by: Harshavardhana 
Tested-by: Harshavardhana 
Reviewed-by: Vijay Bellur

Fix invalid seekdir() usage

2014-09-30T16:50:26+00:00

According to POSIX, seekdir() should only be given an offset obtained from
telldir() on the same DIR *
http://pubs.opengroup.org/onlinepubs/9699919799/functions/seekdir.html

Code from afr-self-heald.c and index.c is operating outside of the
specification, by using seekdir() with an offset from a previously
opened/closed/re-opened directory. This seems to work on Linux (although
with no guarantee it always will in the future). On NetBSD, seekdir()
with an invalid offset is a no-op, which causes an infinite
loop, since index_fill_readdir() always restarts from the beginning of the
directory.

The situation is fixed by using a non-anonymous fd in afr-self-heald.c:
we explicitly open the directory so that it remains open on the brick
side during the timeframe where we want to reuse offsets in seekdir().
This requires adding an opendir fop in index xlator.

If the brick was not updated, the opendir will fail and we fall back
to the standard-violating approach for backward compatibility on Linux.
On other systems we fail, since it never worked.

While there, add tests to check seekdir() success in the index and posix
xlators, so that incorrect usage from calling code produces an explicit
error instead of an infinite loop. We can only do it on non-Linux systems,
for the sake of backward compatibility when the brick was updated but
not the client.

Backport of I88ca90acfcfee280988124bd6addc1a1893ca7ab

BUG: 1138897
Change-Id: I5446a9a17d5451ec5aab8fbd10d381da9a0a23ad
Signed-off-by: Emmanuel Dreyfus 
Reviewed-on: http://review.gluster.org/8860
Tested-by: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri 
Reviewed-by: Vijay Bellur