| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
- Thanks to Ioannis Aslanidis <iaslanidis@flumotion.com> for reporting.
- Break up server_connection_cleanup into smaller procedures.
- Perform the following steps as a single atomic operation:
1. conn->active_transports--
2. collect the pointers to the lock table and all fds if there are no active transports left
This will avoid any race conditions.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
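The decrement-and-collect step described above can be sketched as follows. This is only an illustration under stated assumptions: `struct conn`, `struct cleanup_work` and `conn_put_transport` are hypothetical names, and the real server code may hold different state under a different lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the server connection object. */
struct conn {
    pthread_mutex_t lock;
    int active_transports;
    void *ltable;   /* lock table owned by the connection */
    void *fdtable;  /* fd table owned by the connection */
};

/* State handed to the caller so it can clean up outside the lock. */
struct cleanup_work {
    void *ltable;
    void *fdtable;
};

/*
 * Drop one transport reference. If it was the last one, detach the
 * tables from the connection inside the same critical section, so no
 * other thread can pick them up between the decrement and the collect.
 * Returns 1 if the caller now owns the tables and must clean them up.
 */
int conn_put_transport(struct conn *c, struct cleanup_work *out)
{
    int last = 0;

    assert(c != NULL && out != NULL);
    out->ltable = NULL;
    out->fdtable = NULL;

    pthread_mutex_lock(&c->lock);
    c->active_transports--;
    if (c->active_transports == 0) {
        out->ltable = c->ltable;
        out->fdtable = c->fdtable;
        c->ltable = NULL;
        c->fdtable = NULL;
        last = 1;
    }
    pthread_mutex_unlock(&c->lock);

    return last;
}
```

The point of the design is that the expensive cleanup runs without the lock held, while ownership of the tables is decided exactly once.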
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
| |
We can avoid memory allocation, de-allocation and
data copies by just using the entries passed to us from
a lower layer and by de-linking the entries from the original
list.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
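A minimal sketch of de-linking entries instead of copying them, assuming a circular doubly linked list in the style of kernel lists. The helper names (`list_init`, `list_take_all`, and so on) are made up for this illustration and are not the project's list API.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list head/node. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

int list_is_empty(const struct list_head *h)
{
    return h->next == h;
}

void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/*
 * Move every node from 'src' to the tail of 'dst' by relinking
 * pointers only. Nothing is allocated, copied or freed, and 'src'
 * is left as a valid empty list.
 */
void list_take_all(struct list_head *dst, struct list_head *src)
{
    if (list_is_empty(src))
        return;

    src->next->prev = dst->prev;
    dst->prev->next = src->next;
    src->prev->next = dst;
    dst->prev = src->prev;

    list_init(src);
}
```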
|
|
|
|
|
|
|
|
|
|
|
|
| |
These checks are needed in case a higher layer intends to
delink the dirent list and passes a NULL pointer to
fop_readdir_cbk_stub for the entries parameter.
Consequently, gf_dirent_free must guard against an empty list
because the stub that is passed to it might have an empty
dirent list.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
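The kind of guard described above might look like the sketch below. It uses a hypothetical singly linked `struct entry` and `struct readdir_stub`, not the real gf_dirent_t or call-stub layout, and only shows that a NULL or already de-linked list must be a harmless no-op.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical entry type, reduced to the link. */
struct entry {
    struct entry *next;
};

/* Hypothetical stub that may or may not still own a list of entries. */
struct readdir_stub {
    struct entry *entries;
};

/* Prepend a freshly allocated entry; returns the new head. */
struct entry *entry_push(struct entry *head)
{
    struct entry *e = malloc(sizeof(*e));

    if (e == NULL)
        return head;
    e->next = head;
    return e;
}

/*
 * Free whatever the stub still owns. A NULL stub or an empty list
 * (for example because a higher layer de-linked the entries) frees
 * nothing. Returns the number of entries released.
 */
int stub_free_entries(struct readdir_stub *stub)
{
    int freed = 0;

    if (stub == NULL || stub->entries == NULL)
        return 0;

    while (stub->entries != NULL) {
        struct entry *next = stub->entries->next;

        free(stub->entries);
        stub->entries = next;
        freed++;
    }
    return freed;
}
```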
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier it was thought that only the absence of a running 'opensm' causes
handshake errors in ib-verbs.
We recently learned that a wrong 'ib-verbs.port' option can cause the
same behavior; it took more than 5-6 e-mail iterations with the user and
a lot of brain cycles in the support team to understand the problem.
Made the log message more descriptive, so the user can find the
cause, or can send us an e-mail without wasting time.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. A page is put on the inode waitq if its 'freshness' has to be verified with an fstat()
2. While the fstat is in transit, other calls (like lookup) can update ioc_inode->tv, resetting the freshness (page still on inode waitq)
3. Another read request on the same page, after the updated freshness, wakes up the page frames, neglecting the fact that the page is also waiting on the inode (waiting for the fstat completion)
4. Once the page's frames are woken, the page becomes eligible for purging and can get destroyed for various reasons, leaving a destroyed page pointer in the inode's waitq
5. fstat returns and hits the destroyed page pointer, causing a crash
The fix is to disable cache hits altogether when any page of the same inode is under validation. What would otherwise be a cache hit is now subjected to the ongoing validation by getting queued to the inode waitq.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
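The rule introduced by the fix can be pictured as a small decision function. This is a sketch under assumptions: `struct inode_state`, `enum read_action` and `classify_read` are hypothetical names, and the real io-cache code tracks waiters and pages rather than a single flag.

```c
#include <assert.h>

enum read_action {
    READ_MISS,          /* page not cached: fetch it from the backend */
    READ_HIT,           /* serve straight from the cache */
    READ_WAIT_ON_INODE  /* queue on the inode waitq until fstat returns */
};

/* Hypothetical, reduced view of the per-inode cache state. */
struct inode_state {
    int validating;  /* non-zero while an fstat() is in transit */
};

/*
 * Decide what to do with a read. While any validation is in flight for
 * the inode, a cached page is never served as a hit, even if the
 * freshness timestamp was refreshed in the meantime.
 */
enum read_action classify_read(struct inode_state *inode,
                               int page_cached, int cache_fresh)
{
    if (!page_cached)
        return READ_MISS;
    if (inode->validating)
        return READ_WAIT_ON_INODE;
    if (!cache_fresh) {
        inode->validating = 1;  /* caller is expected to send the fstat */
        return READ_WAIT_ON_INODE;
    }
    return READ_HIT;
}
```

In this model the flag would be cleared when the fstat reply arrives and the waiters are resumed; that part is left out of the sketch.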
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks to Krishna <krishna (at) gluster.com> for pointing this out.
When a unify self-heal of a large directory (a directory with a lot of
entries) is done, getdents_cbk used to fail because of the new buffer
size limit (128KB). Noticed that earlier it used to stretch up to 4MB,
hence the value 1024 worked fine. By reducing it to 512, we fit well
within the 128KB limit, and hence unify self-heal goes through.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This improves the potential for pre-fetching a larger
number of dirents. Consider that, with 255 chars as the max
name length for each dirent, in the worst case scenario, where
we actually have files with such large names, we're not getting
more than 4 entries with the current block size of 1024.
Generally also, increasing the size to 4k provides us
with a higher chance that directories with low to medium
number of dirents will be pre-fetched in a single readdir fop.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
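The worst-case arithmetic above can be checked with a small helper. The `per_entry_overhead` parameter is an assumption added for illustration (fixed bytes per entry besides the name); with an overhead of 0 the result matches the "not more than 4 entries" figure for a 1024-byte block.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of entries that fit in one block if every name has the
 * maximum length. per_entry_overhead is a hypothetical count of fixed
 * bytes per entry (inode number, offset, and so on).
 */
size_t worst_case_entries(size_t block_size, size_t name_max,
                          size_t per_entry_overhead)
{
    size_t entry_size = name_max + 1 + per_entry_overhead; /* +1 for NUL */

    return block_size / entry_size;
}
```

For example, `worst_case_entries(1024, 255, 0)` gives 4 and `worst_case_entries(4096, 255, 0)` gives 16; any real per-entry overhead lowers both numbers.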
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fop interface is such that we're able to extract more than
1 dirent in a readdir fop. This commit now enables libglusterfsclient
to read multiple entries on a glusterfs_readdir call. Once these
have been pre-fetched, they're cached till either glusterfs_closedir,
glusterfs_rewinddir or glusterfs_seekdir is called.
The current implementation is beneficial for sequential directory
reading and probably indifferent to applications that do a lot of seekdir
and rewinddir after opening the directory. This is because
both these calls result in dirent cache invalidation.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
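A rough model of the caching behaviour described above, assuming a fixed in-memory array as the "server" and a batch size of 4. All names (`struct dir_handle`, `dir_read`, `dir_seek`, `dir_rewind`) are hypothetical and are not the libglusterfsclient API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BATCH 4  /* entries fetched per simulated readdir fop */

/* Simulated directory contents held by the "server". */
static const char *const backend[] = { "a", "b", "c", "d", "e", "f" };
#define BACKEND_COUNT (sizeof(backend) / sizeof(backend[0]))

struct dir_handle {
    size_t offset;             /* position of the next entry to return */
    const char *cache[BATCH];  /* pre-fetched entries */
    size_t cached;             /* number of valid cache slots */
    size_t next;               /* next cache slot to hand out */
    unsigned fetches;          /* how many times the backend was hit */
};

static void cache_invalidate(struct dir_handle *d)
{
    d->cached = 0;
    d->next = 0;
}

/* Refill the cache with up to BATCH entries starting at d->offset. */
static void cache_fill(struct dir_handle *d)
{
    size_t i;

    cache_invalidate(d);
    for (i = 0; i < BATCH && d->offset + i < BACKEND_COUNT; i++)
        d->cache[i] = backend[d->offset + i];
    d->cached = i;
    d->fetches++;
}

void dir_open(struct dir_handle *d)
{
    memset(d, 0, sizeof(*d));
}

/* Return the next name, or NULL at the end of the directory. */
const char *dir_read(struct dir_handle *d)
{
    if (d->next == d->cached)
        cache_fill(d);
    if (d->cached == 0)
        return NULL;
    d->offset++;
    return d->cache[d->next++];
}

/* Seeking changes the position, so the pre-fetched entries are dropped. */
void dir_seek(struct dir_handle *d, size_t offset)
{
    d->offset = offset;
    cache_invalidate(d);
}

void dir_rewind(struct dir_handle *d)
{
    dir_seek(d, 0);
}
```

Sequential reads hit the backend once per batch, while every seek or rewind forces a new fetch, which is why seek-heavy workloads see little benefit.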
|
|
|
|
|
|
|
|
|
| |
In order to expose the timeout values for stat and inode
caching, this commit introduces a new fstab option "attr_timeout"
that defines the number of seconds for which a looked up inode
or a stat()'ed structure is valid in the cache.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
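The validity check implied by such a timeout could be sketched like this. `struct cached_attr` and `attr_is_fresh` are hypothetical names; how the real code stores the timestamp and parses "attr_timeout" is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical bookkeeping kept next to a cached inode or struct stat. */
struct cached_attr {
    time_t fetched_at;  /* when the value was stored in the cache */
    int valid;          /* 0 until something has been cached */
};

/*
 * A cached value is usable only while fewer than timeout_secs seconds
 * have passed since it was stored.
 */
int attr_is_fresh(const struct cached_attr *a, time_t now,
                  time_t timeout_secs)
{
    if (a == NULL || !a->valid)
        return 0;
    return (now - a->fetched_at) < timeout_secs;
}
```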
|
|
|
|
|
|
|
|
|
| |
There is a mechanism for caching the inode numbers got from a lookup
and a struct stat got from a stat or fstat but I wasn't sure if it worked.
This commit simplifies cache updates and checks and the accompanying
tests have made sure that the cache does work.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Handle two cases when deciding log/fstab file:
1. It turns out that strdup and strlen do not actually
check for NULL before operating on the string,
so they seg-fault on seeing a NULL char pointer.
2. getenv can return an empty string if the
env var was exported as:
$ export GLUSTEFS_BOOSTER_LOG=
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
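Both cases can be handled in one place before the value is copied. In this sketch `nonempty_or` and `env_path_dup` are hypothetical helpers and the fallback path is a placeholder supplied by the caller; only `getenv`, `strlen`, `malloc` and `memcpy` are standard calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Treat both NULL and "" as "not set" and use the fallback instead. */
const char *nonempty_or(const char *value, const char *fallback)
{
    return (value != NULL && value[0] != '\0') ? value : fallback;
}

/*
 * Return a heap copy of the environment variable's value, or of the
 * fallback when the variable is unset or exported as an empty string.
 * Returns NULL if there is nothing to copy or allocation fails.
 */
char *env_path_dup(const char *name, const char *fallback)
{
    const char *value = nonempty_or(getenv(name), fallback);
    size_t len;
    char *copy;

    if (value == NULL)
        return NULL;  /* never hand NULL to strlen */

    len = strlen(value) + 1;
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, value, len);
    return copy;
}
```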
|
|
|
|
|
|
|
|
|
|
|
|
| |
When multiple threads try to create a glusterfs context using the
glusterfs_init function, those threads end up using the global
variables in the vol file parser in a non-synchronized manner,
resulting in a seg-fault.
There is now a big lock around searches and additions from the mount
table in do_open. This lock granularity could be reduced.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
| |
This can happen when 'option export-statfs-size off' is given in
posix volume, and caused a divide-by-zero error.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
| |
A mismatch in the size of the buffer used for reading and for the subsequent write caused an infinite loop, so big files (1Mb, 10Mb etc) could not be downloaded through the lighttpd web service using mod_glusterfs. This is because a big file is broken up into chunks, each of which has a read and a subsequent write.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
| |
This is experimental. We're hoping this improves performance on
high speed links like 10GigE.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
| |
but since the introduction of iobuf it no longer is. Now the iobuf has to be unref'ed instead.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This is another attempt at fixing build problems on Solaris.
I am told that booster build is disabled on Solaris and I know
that it is disabled on Mac OS X also. Getting it to work
on both these systems is now on my TODO list, mainly
because on both these systems, we can have a glusterfs client
running without requiring FUSE.
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
|
|
|
|
|
|
|
|
| |
When mandatory locks are enabled and a read/write
would block due to a lock and if the fd is opened
with O_NONBLOCK, return EAGAIN (previously EWOULDBLOCK).
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
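The decision described above reduces to a check on the open flags. `check_mandatory_lock` is a hypothetical helper for illustration, not the locks translator's function. POSIX allows EAGAIN and EWOULDBLOCK to share a value, and on Linux they do, so the distinction matters mainly on platforms where the two differ.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/*
 * Returns 0 if the read/write may proceed, -EAGAIN if it must fail
 * immediately because the fd is non-blocking, and 1 if the caller
 * should block until the conflicting lock is released.
 */
int check_mandatory_lock(int open_flags, int region_locked)
{
    if (!region_locked)
        return 0;
    if (open_flags & O_NONBLOCK)
        return -EAGAIN;
    return 1;
}
```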
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit changes mem-pool behaviour to return a directly usable
address by performing the required adjustment on the address
being returned.
This is different from the previous behaviour, where we tried to fit
the list_head*2 into the requested size as well. This is not efficient
enough in terms of space but hopefully works better than not having any
mem-pool at all. Besides, I am not comfortable with mem-pool meta-data
and the caller-usable memory area being the same because of the potential
for corruption of mem-pool's data structures.
PS:
Please do read the comments in the code for more info during review.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
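The address adjustment can be illustrated with a header placed in front of each object. This is a sketch only: `union pool_hdr` and the `pool_obj_*` helpers are hypothetical, the header layout is assumed, and the example calls calloc per object instead of carving objects out of a preallocated pool, purely to keep the pointer arithmetic visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/*
 * Bookkeeping stored in front of every object. The union with
 * max_align_t keeps the caller-visible address suitably aligned.
 */
union pool_hdr {
    struct {
        struct list_head list;
        int in_use;
    } meta;
    max_align_t align;
};

/* Allocate header + payload and return the address just past the header. */
void *pool_obj_alloc(size_t payload_size)
{
    union pool_hdr *h = calloc(1, sizeof(*h) + payload_size);

    if (h == NULL)
        return NULL;
    h->meta.in_use = 1;
    return h + 1;
}

/* Step back from the caller's pointer to the pool's own header. */
union pool_hdr *pool_obj_hdr(void *ptr)
{
    return (union pool_hdr *)ptr - 1;
}

void pool_obj_free(void *ptr)
{
    if (ptr != NULL)
        free(pool_obj_hdr(ptr));
}
```

Because the caller never sees the header, writes within the payload cannot touch the list pointers, at the cost of `sizeof(union pool_hdr)` extra bytes per object.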
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
|
|
|
|
|
| |
It is possible that some of the real_* functors for stat
family of syscalls are NULL. I've seen this on libc. In that
case, this commit attempts to use any available function that
performs an equivalent operation.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
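One way to express "use any available equivalent" is to walk a list of candidate function pointers and take the first that resolved. The names here (`stat_like_fn`, `first_available`, `fake_xstat`) are hypothetical, and which real_* pointers are considered equivalent is a decision this sketch does not make.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified signature standing in for the stat family of calls. */
typedef int (*stat_like_fn)(const char *path, void *buf);

/* Return the first candidate that is not NULL, or NULL if none is. */
stat_like_fn first_available(stat_like_fn const *candidates, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (candidates[i] != NULL)
            return candidates[i];
    }
    return NULL;
}

/* Stand-in for an equivalent function that did resolve. */
int fake_xstat(const char *path, void *buf)
{
    (void)path;
    (void)buf;
    return 0;
}
```

The caller still has to handle the case where nothing resolved, for example by failing the call with an error.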
|
|
|
|
|
|
|
|
| |
In case the init procedure for VMP fails, we want to
continue using booster through the old approach, which means
leaving the fd-table intact.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
|
|
|
|
| |
During a rename, if the new file exists, the old name needs to
over-write the new name. We're returning EEXIST, which is wrong
behaviour.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
| |
In __do_path_resolve, we need to use the new_loc.path as the input
for resolution rather than the resolved variable, simply because we're
not interested in resolving the names that have been resolved, as
pointed out by the variable name 'resolved'. Instead, we need to resolve
new_loc, which stores the next component in the path to
be looked up.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Not cleaning up the ino member of a loc_t results in SIGABRT
in __inode_link because in some cases, the loc->ino is
different from loc->inode->ino. This happens especially in code
blocks which re-use a loc_t structure for pointing at different
inodes/files. For example, if a loc_t has been assigned an inode and
an ino, followed by a libgf_client_loc_wipe, then re-use of this
loc in, say, libgf_client_lookup results in the SIGABRT because
libgf_client_lookup calls inode_link with the same loc_t. However,
this loc_t has just been assigned a new inode pointer but the ino
member still contains a previous inode's inode number. This difference
in inode numbers results in an assertion failure, hence the SIGABRT.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
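The stale-ino problem can be reproduced in miniature. `struct loc`, `loc_wipe` and `loc_is_consistent` below are simplified, hypothetical stand-ins for loc_t, libgf_client_loc_wipe and the assertion in __inode_link; the sketch only shows why the wipe has to reset the number along with the pointers.

```c
#include <assert.h>
#include <stddef.h>

struct inode {
    unsigned long ino;
};

/* Reduced stand-in for loc_t. */
struct loc {
    const char *path;
    struct inode *inode;
    unsigned long ino;
};

/* Reset every member, including ino, so the loc can be safely re-used. */
void loc_wipe(struct loc *l)
{
    l->path = NULL;
    l->inode = NULL;
    l->ino = 0;
}

/*
 * Models the kind of check that aborts on a mismatch: a non-zero ino
 * must agree with the inode the loc points at.
 */
int loc_is_consistent(const struct loc *l)
{
    return l->inode == NULL || l->ino == 0 || l->ino == l->inode->ino;
}
```

Without the `l->ino = 0` line, re-pointing the loc at a different inode leaves the old number behind and the consistency check fails.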
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
|
|
|
|
| |
Here I am only refining the entry parsing code in order
to clarify the exit conditions from the loop. There were
a few workloads where this loop went infinite.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
| |
This commit basically reverts the previous readdir conformance
patch I sent a few days back. That commit had a thoroughly broken
way of maintaining per-directory dirent.
It was broken for two reasons:
1. Creating a wrapper structure around the directory's fd_t
only for storing a struct dirent is not clean enough. This commit
takes a better approach by storing the dirent in fd_t context.
This dirent is valid only if the fd_t refers to a directory.
2. That commit was made and tested under the assumption (..stupidity
is a better word..) that only opendir call is used for opening a
directory. That is not correct. Directories are also opened using the
open syscall. The point is, glusterfs_open returns an fd_t and so did
glusterfs_opendir. The previous patch actually changed opendir to
return a new wrapper structure. That is fine, if we go by the POSIX
definition of open and opendir because, they're both supposed to
return different types, an int and a DIR*. However, in
libglusterfsclient, all other code assumes that directory handles
corresponding to DIR* and file descriptors corresponding to int types
are the same type, resulting in use of the same locking and fd context
addition/extraction code. So a directory opened using opendir returned
a wrapper structure which went down into the libglusterfsclient stack
where some function took a lock on the handle assuming it was an
fd_t. Since it is not, dereferencing the supposed fd->inode->lock
results in a seg fault.
Obviously, this didn't show up till unfs3 used open() to open a
directory and not opendir.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
|
|
|
|
| |
Previous fstab option parsing logic was badly broken
and did not handle all cases. This fixes the situation
so we now work without any problems.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|
|
|
|
| |
build on solaris and other platforms
|
|
|
|
| |
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
|