| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With 'sync' strategy, a fop's cbk waits for
the interrupt handler to finish by making a
call to fuse_interrupt_finish_fop() with
sync = true.
The wait is implemented by monitoring an
interrupt_state struct member via a condition
variable. However, due to broken code logic,
the pthread_cond_wait() call is never reached.
This change introduces a new member to the
fuse_interrupt_state_t enum (the type of
aforementioned struct member),
FUSE_INTERRUPT_WAITING_HANDLER, which is then
used for indicating the state of waiting for
the interrupt handler.
Change-Id: I72ab06c37f45ff8f212a6a632bac1f647af05cbd
Updates: #1374
Signed-off-by: Csaba Henk <csaba@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Per a "TODO" comment in the code, a code block can be removed as it's no
longer required
Change-Id: I60e064ece985ff2ea2a686bbd2f0e6cc850899e9
updates: #1000
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change is a followup to
I510158843e4b1d482bdc496c2e97b1860dc1ba93.
In referred change we pushed log messages about 'weird'
write errors to fuse device out of sight, by reporting
them at Debug loglevel instead of Error (where
'weird' means errno is not POSIX compliant but having
meaningful semantics for FUSE protocol).
This solved the issue of spurious error reporting.
And so far so good: these messages don't indicate
an error condition by themselves. However, when they
come in high repetitions, that indicates a suboptimal
condition which should be reported.[1]
Therefore now we shall emit a Warning if a certain
errno occurs a certain number of times[2] as the
outcome of a write to the fuse device.
___
[1] typically ENOENTs and ENOTDIRs accumulate
when glusterfs' inode invalidation lags behind
the kernel's internal inode garbage collection
(in this case above errnos mean that the inode
which we requested to be invalidated is not found
in kernel). This can be mitigated with the
invalidate-limit command line / mount option,
cf. bz#1732717.
[2] 256, as of the current implementation.
Change-Id: I8cc7fe104da43a88875f93b0db49d5677cc16045
Updates: #1000
Signed-off-by: Csaba Henk <csaba@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
FUSE uses failures of communicating with /dev/fuse with various
errnos to indicate in-kernel conditions to userspace. Some of these
shouldn't be handled as an application error. Also the standard
POSIX errno description should not be shown as they are misleading
in this context.
Solution:
When writing to the fuse device, the caller of the respective
convenience routine can mask those errnos which don't qualify to
be an error for the application in that context, so then those
shall be reported at DEBUG level.
The possible non-standard errnos are reported with their
POSIX name instead of their description to avoid confusion.
(Eg. for ENOENT we don't log "no such file or directory",
we log indeed literal "ENOENT".)
Change-Id: I510158843e4b1d482bdc496c2e97b1860dc1ba93
updates: bz#1193929
Signed-off-by: Csaba Henk <csaba@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
The current lru-limit value still uses memory for
upto 128K inodes.
Reduce the default value of lru-limit to 64K.
Change-Id: Ica2dd4f8f5fde45cb5180d8f02c3d86114ac52b3
Fixes: bz#1753880
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the glusterfs fuse client process is unable to
process the invalidate requests quickly enough, the
number of such requests quickly grows large enough
to use a significant amount of memory.
We are now introducing another option to set an upper
limit on these to prevent runaway memory usage.
Change-Id: Iddfff1ee2de1466223e6717f7abd4b28ed947788
Fixes: bz#1732717
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
| |
Fixes: bz#1644322
Change-Id: I53e8fa362cd8c7d04fb1c4abb606a9abb642c592
Signed-off-by: Csaba Henk <csaba@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Auto invalidation is necessary when same (meta)data is shared/access
across multiple mounts. However, if (meta)data is not shared, all
relevant I/O goes through the cache of single mount and hence is
coherent with (meta)data on bricks always. So, fuse-auto-invalidation
can be disabled for this case which gives a huge performance boost for
workloads that write data and then immediately read the data they just
wrote.
From glusterfs --help,
<snip>
--auto-invalidation[=BOOL] controls whether fuse-kernel can
auto-invalidate attribute, dentry and page-cache.
Disable this only if same files/directories are
not accessed across two different mounts
concurrently [default: "on"]
</snip>
Details on how disabling auto-invalidation helped to reduce pgbench
init times can be found at [1]. Time taken for pgbench init of scale
8000 was 8340s. That will be an improvement of 86% (59280s vs 8340s)
with auto-invalidations turned off along with other
optimizations. Just disabling auto-invalidation contributed 56%
improvement by reducing the total time taken by 33260s.
[1] https://www.spinics.net/lists/gluster-devel/msg25907.html
Change-Id: I0ed730dba9064bd9c576ad1800170a21e100e1ce
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
updates: bz#1664934
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The inode LRU mechanism is moot in fuse xlator (ie. there is no
limit for the LRU list), as fuse inodes are referenced from
kernel context, and thus they can only be dropped on request of
the kernel. This might results in a high number of passive
inodes which are useless for the glusterfs client, causing a
significant memory overhead.
This change tries to remedy this by extending the LRU semantics
and allowing to set a finite limit on the fuse inode LRU.
A brief history of problem:
When gluster's inode table was designed, fuse didn't have any
'invalidate' method, which means, userspace application could
never ask kernel to send a 'forget()' fop, instead had to wait
for kernel to send it based on kernel's parameters. Inode table
remembers the number of times kernel has cached the inode based
on the 'nlookup' parameter. And 'nlookup' field is not used by
no other entry points (like server-protocol, gfapi etc).
Hence the inode_table of fuse module always has to have lru-limit
as '0', which means no limit. GlusterFS always had to keep all
inodes in memory as kernel would have had a reference to it.
Again, the reason for this is, kernel's glusterfs inode reference
was pointer of 'inode_t' structure in glusterfs. As it is a
pointer, we could never free it (to prevent segfault, or memory
corruption).
Solution:
In the inode table, handle the prune case of inodes with 'nlookup'
differently, and call a 'invalidator' method, which in this case is
fuse_invalidate(), and it sends the request to kernel for getting
the forget request.
When the kernel sends the forget, it means, it has dropped all
the reference to the inode, and it will send the forget with the
'nlookup' parameter too. We just need to make sure to reduce the
'nlookup' value we have when we get forget. That automatically
cause the relevant prune to happen.
Credits: Csaba Henk, Xavier Hernandez, Raghavendra Gowdappa, Nithya B
fixes: bz#1560969
Change-Id: Ifee0737b23b12b1426c224ec5b8f591f487d83a2
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* libglusterfs changes to add new fop
* Fuse changes:
- Changes in fuse bridge xlator to receive and send responses
* posix changes to perform the op on the backend filesystem
* protocol and rpc changes for sending and receiving the fop
* gfapi changes for performing the fop
* tools: glfs-copy-file-range tool for testing copy_file_range fop
- Although, copy_file_range support has been added to the upstream
fuse kernel module, no release has been made yet of a kernel
which contains the support. It is expected to come in the
upcoming release of linux-4.20
So, as of now, executing copy_file_range fop on a fused based
filesystem results in fuse kernel module sending read on the
source fd and write on the destination fd.
Therefore a small gfapi based tool has been written to be able
test the copy_file_range fop. This tool is similar (in functionality)
to the example program given in copy_file_range man page.
So, running regular copy_file_range on a fuse mount point and
running gfapi based glfs-copy-file-range tool gives some idea about
how fast, the copy_file_range (or reflink) can be.
On the local machine this was the result obtained.
mount -t glusterfs workstation:new /mnt/glusterfs
[root@workstation ~]# cd /mnt/glusterfs/
[root@workstation glusterfs]# ls
file
[root@workstation glusterfs]# cd
[root@workstation ~]# time /tmp/a.out /mnt/glusterfs/file /mnt/glusterfs/new
real 0m6.495s
user 0m0.000s
sys 0m1.439s
[root@workstation ~]# time glfs-copy-file-range $(hostname) new /tmp/glfs.log /file /rrr
OPEN_SRC: opening /file is success
OPEN_DST: opening /rrr is success
FSTAT_SRC: fstat on /rrr is success
copy_file_range successful
real 0m0.309s
user 0m0.039s
sys 0m0.017s
This tool needs following arguments
1) hostname
2) volume name
3) log file path
4) source file path (relative to the gluster volume root)
5) destination file path (relative to the gluster volume root)
"glfs-copy-file-range <hostname> <volume> <log file path> <source> <destination>"
- Added a testcase as well to run glfs-copy-file-range tool
* io-stats changes to capture the fop for profiling
* NOTE:
- Added conditional check to see whether the copy_file_range syscall
is available or not. If not, then return ENOSYS.
- Added conditional check for kernel minor version in fuse_kernel.h
and fuse-bridge while referring to copy_file_range. And the kernel
minor version is kept as it is. i.e. 24. Increment it in future
when there is a kernel release which contains the support for
copy_file_range fop in fuse kernel module.
* The document which contains a writeup on this enhancement can be found at
https://docs.google.com/document/d/1BSILbXr_knynNwxSyyu503JoTz5QFM_4suNIh2WwrSc/edit
Change-Id: I280069c814dd21ce6ec3be00a884fc24ab692367
updates: #536
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libglusterfs devel package headers are referenced in code using
include semantics for a program, this while it works can be better
especially when dealing with out of tree xlator builds or in
general out of tree devel package usage.
Towards this, the following changes are done,
- moved all devel headers under a glusterfs directory
- Included these headers using system header notation <> in all
code outside of libglusterfs
- Included these headers using own program notation "" within
libglusterfs
This change although big, is just moving around the headers and
making it correct when including these headers from other sources.
This helps us correctly include libglusterfs includes without
namespace conflicts.
Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We add dummy interrupt handling for the FLUSH
fuse message. It can be enabled by the
"--fuse-flush-handle-interrupt" hidden command line
option, or "-ofuse-flush-handle-interrupt=yes"
mount option.
It serves no other than diagnostic & demonstational
purposes -- to exercise the interrupt handling framework
a bit and to give an usage example.
Documentation is also provided that showcases interrupt
handling via FLUSH.
Change-Id: I522f1e798501d06b74ac3592a5f73c1ab0590c60
updates: #465
Signed-off-by: Csaba Henk <csaba@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- add sub-framework to send timed responses to kernel
- add interrupt handler queue
- implement INTERRUPT
fuse_interrupt looks up handlers for interrupted messages
in the queue. If found, it invokes the handler function.
Else responds with EAGAIN with a delay.
See spec at
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/fuse.txt?h=v4.17#n148
and explanation in comments.
Change-Id: I1a79d3679b31f36e14b4ac8f60b7f2c1ea2badfb
updates: #465
Signed-off-by: Csaba Henk <csaba@redhat.com>
|
|
|
|
| |
Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Added kernel-writeback-cache command line and xlator
option for requesting utilisation of the writeback
cache of the kernel in FUSE_INIT (see [1]).
- Added attr-times-granularity command line and xlator
option via which granularity of the {a,m,c}time in
stat (attr) data that we support can be indicated to
kernel. This is a means to avoid divergence of the
attr times between kernel and userspace that could
occur with writeback-cache, while still maintaining
maximum time precision the FUSE server is capable of
(see [2]).
- Handling FATTR_CTIME flag in FUSE_SETATTR that
indicates presence of ctime in setattr payload.
Currently we cannot associate arbitrary ctimes to
files on backend, so we just touch them to update
their ctimes to current time. Having ctimes in setattr
payload is also a side effect of writeback cache
(see [3] and [4]).
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4d99ff8,
"fuse: Turn writeback cache on"
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e27c9d3,
"fuse: fuse: add time_gran to INIT_OUT"
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1e18bda,
"fuse: add .write_inode"
[4]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab9e13f,
"fuse: allow ctime flushing to userspace"
Updates: #435
Change-Id: Id174c8e0c815c4456c35f8c53e41a6a507d91855
Signed-off-by: Csaba Henk <csaba@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Usage: Use 'reader-thread-count=<NUM>' as command line option to
set the thread count at the time of mounting the volume.
Next task is to make these threads auto-scale based on the load,
instead of having the user remount the volume everytime to change
the thread count.
Updates #412
Change-Id: I94aa1505e5ae6a133683d473e0e4e0edd139b76b
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
| |
Updates: #242
BUG: 1428063
Change-Id: Iaaf2edf99b2ecc75f6d30762c752a6d445c1c826
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... and disable it by default.
This is because having it disabled seems to improve performance.
This could be due to the lock contention by the different epoll threads
on the circular buff lock in the fop cbks just before writing their response
to /dev/fuse.
Just to provide some data - wrt ovirt-gluster hyperconverged
environment, I saw an increase in IOPs by 12K with event-history
disabled for randrom read workload.
Usage:
mount -t glusterfs -o event-history=on $HOSTNAME:$VOLNAME $MOUNTPOINT
OR
glusterfs --event-history=on --volfile-server=$HOSTNAME --volfile-id=$VOLNAME $MOUNTPOINT
Change-Id: Ia533788d309c78688a315dc8cd04d30fad9e9485
BUG: 1467614
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libfuse has an auto_unmount option which,
if enabled, ensures that the file system
is unmounted at FUSE server termination
by running a separate monitor process
that performs the unmount when that
occurs. (This feature would probably
better be called "robust auto-unmount",
as FUSE servers usually do try to unmount
their file systems upon termination,
it's just this mechanism is not crash
resilient.)
This change implements that option and
behavior for glusterfs.
Note that "auto unmount" (robust or not) is
a leaky abstraction, as the kernel cannot
guarantee that at the path where the FUSE
fs is mounted is actually the toplevel mount
at the time of the umount(2) call, for
multiple reasons, among others, see:
fuse-devel: "fuse: feasible to distinguish between umount and abort?"
http://fuse.996288.n3.nabble.com/fuse-feasible-to-distinguish-between-umount-and-abort-tt14358.html
https://github.com/libfuse/libfuse/issues/122
Updates #153
Change-Id: Ia4432580c9fd2c156d9c73c3a44f4bfd42437599
Signed-off-by: Csaba Henk <csaba@redhat.com>
Reviewed-on: https://review.gluster.org/17230
Tested-by: Amar Tumballi <amarts@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
get_new_dict/dict_destroy is causing confusion where, dict_new/dict_destroy or
get_new_dict/dict_unref are used instead of dict_new/dict_unref.
Change-Id: I4cc69f5b6711d720823395e20fd624a0c6c1168c
BUG: 1296043
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/13183
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport @ http://review.gluster.org/#/c/13626/3
Fix a typo error, consolidate the selinux and capability
check in getxattr and setxattr.
Change-Id: I4303de3d4dd00853169b07577311e03cbb912ed7
BUG: 1316327
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/13653
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Vijay Bellur <vbellur@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Originally all security.* xattrs were forbidden if selinux is disabled,
which was causing Samba's acl_xattr module to not work, as it would
store the NTACL in security.NTACL. To fix this http://review.gluster.org/#/c/12826/
was sent, which forbid only security.selinux. This opened up a getxattr
call on security.capability before every write fop and others.
Capabilities can be used without selinux, hence if selinux is disabled,
security.capability cannot be forbidden. Hence adding a new mount
option called capability.
Only when "--capability" or "--selinux" mount option is used,
security.capability is sent to the brick, else it is forbidden.
Change-Id: I77f60e0fb541deaa416159e45c78dd2ae653105e
BUG: 1309462
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/13540
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Linux FUSE kernel module has gained support for passing SEEK_HOLE
and SEEK_DATA on through lseek(). This can greatly improve performance
when working with sparse files.
Linux FUSE introduced support for lseek() with version 4.5. The commit
in mainline Linux is 0b5da8db145bfd44266ac964a2636a0cf8d7c286.
URL: http://thread.gmane.org/gmane.comp.file-systems.fuse.devel/14752
Change-Id: I12496d788e59461a3023ddd30e0ea3179007f77e
BUG: 1220173
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/11474
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit d0edb6d555d687f76837515207b9408be0bdd55e.
The same functionality will be provided in a different patch
Change-Id: I3139478b218fa32e803bb088df585fbbdf94af34
BUG: 1272949
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/12375
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
doing inode/entry invalidations.
Writing to pipe can block if pipe is full. This can lead to deadlocks
in some situations. Consider following situation:
1. Kernel sends a write on an inode. Client is waiting for a response
to write from brick.
2. A lookup happens on behalf of different application/thread on the
same inode. In response, mdc tries to invalidate the inode.
3. fuse_invalidate_inode is called. It writes a invalidation request
to pipe. Another thread which reads from this pipe writes the
request to /dev/fuse. The invalidate code in fuse-kernel-module,
tries to acquire lock on all pages for the inode and is blocked as
a write is in progress on same inode (step 1)
4. Now, poller thread is blocked in invalidate notification and cannot
receive any messages from same socket (on which lookup response
came). But client is expecting a response for write from same
socket (again step1) and we've a deadlock.
The deadlock can be solved in two ways:
1. Use a queue (and a conditional variable for notifications) to pass
invalidation requests from poller to invalidate thread. This is a
variant of using non-blocking pipe, but doesn't have any limit on the
amount of data (worst case we run out of memory and error out).
2. Allow events from sockets, immediately after we read one
rpc-msg. Currently we disallow events till that rpc-msg is read from
socket, processed and handled by higher layers. That way we won't run
into these kind of issues. Also, it'll increase parallelism in way of
reading from sockets.
This patch implements solution 1 above.
Change-Id: I8e8199fd7f4da9eab46a719d9292f35c039967e1
BUG: 1273387
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/12402
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a graph switch has happended as part of a attach-tier,
then there is a chance to hash fops to newly added brick
before fix-layout. This causes on going i/o to fail.
This patch will resolve a path, for graph switch by sending
recursive lookup to the parent directories. Those lookups
will help to heal the directory.
Change-Id: Ia2bb4b43a21e5cc6875ba1205628744c3f0ce4e5
BUG: 1263549
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/12184
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a --resolve-gids commandline option to the glusterfs binary. This
option gets set when executing "mount -t glusterfs -o resolve-gids ...".
This option is most useful in combination with the "acl" mount option.
POSIX ACL permission checking is done on the FUSE-client side to improve
performance (in addition to the checking on the bricks).
The fuse-bridge reads /proc/$PID/status by default, and this file
contains maximum 32 groups. Any local (client-side) permission checking
that requires more than the first 32 groups will fail.
By enabling the "resolve-gids" option, the fuse-bridge will call
getgrouplist() to retrieve all the groups from the user accessing the
mountpoint. This is comparable to how "nfs.server-aux-gids" works.
Note that when a user belongs to more than ~93 groups, the volume option
server.manage-gids needs to be enabled too. Without this option, the
RPC-layer will need to reduce the number of groups to make them fit in
the RPC-header.
Change-Id: I7ede90d0e41bcf55755cced5747fa0fb1699edb2
BUG: 1246275
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/11732
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of including config.h in each file, and have the additional
config.h included from the compiler commandline (-include option).
When a .c file tests for a certain #define, and config.h was not
included, incorrect assumtions were made. With this change, it can not
happen again.
BUG: 1222319
Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/10808
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The structures returned by readdirp contain the inode 2x. Only one of
them was squashed into 32-bits when enable-ino32 is enabled.
Change-Id: I33a6d28fb118bb23971f918ffeb983d7f033106e
BUG: 1223889
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Cyril Peponnet <cyril@peponnet.fr> [on release-3.5]
Reviewed-on: http://review.gluster.org/10881
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fuse notify function gets called by the epoll or the poll thread
and till the point there is a single epoll thread, 2 notify
instances would not race with each other.
With the upcoming multi thread epoll changes, it is possible that
2 epoll threads invoke the notify function. As a result races
in this function are fixed with this commit.
The races seen are detailed in the bug, and the fix here is to
enforce a (slightly) longer critical section when updating the
fuse private structure and reserving state updates post error
handling.
Change-Id: I6974bc043cb59eb6dc39c5777123364dcefca358
BUG: 1180231
Signed-off-by: Shyam <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/9421
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Even when the fd resolution failed, the fop is continuing on the
new graph which may not have valid inode. This lead to NULL layout
subvols in dht which lead to crash in fsync after graph migration.
Fix:
- Remove resolution error handling in FUSE_FOP as it was only added
to handle fd migration failures.
- check in fuse_resolve_done for fd resolution failures and fail the
fop right away.
- loc resolution failures are already handled in the corresponding
fops.
- Return errno from state->resolve.op_errno in resume functions.
- Send error to fuse on frame allocation failures.
- Removed unused variable state->resolved
- Removed unused macro FUSE_FOP_COOKIE
Change-Id: I479d6e1ff2ca626ad8c8fcb6f293022149474992
BUG: 1126048
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8402
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Provides a working Gluster Management Daemon, CLI
- Provides a working GlusterFS server, GlusterNFS server
- Provides a working GlusterFS client
- execinfo port from FreeBSD is moved into ./contrib/libexecinfo
for ease of portability on NetBSD. (FreeBSD 10 and OSX provide
execinfo natively)
- More portability cleanups for Darwin, FreeBSD and NetBSD
- Provides a new rc script for FreeBSD
Change-Id: I8dff336f97479ca5a7f9b8c6b730051c0f8ac46f
BUG: 1111774
Original-Author: Mike Ma <mikemandarine@gmail.com>
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-on: http://review.gluster.org/8141
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* As of now clients mounting within the storage pool using that machine's
ip/hostname are trusted clients (i.e clients local to the glusterd).
* Be careful when the request itself comes in as nfsnobody (ex: posix tests).
So move the squashing part to protocol/server when it creates a new frame
for the request, instead of auth part of rpc layer.
* For nfs servers do root-squashing without checking if it is trusted client,
as all the nfs servers would be running within the storage pool, hence will
be trusted clients for the bricks.
* Provide one more option for mounting which actually says root-squash
should/should not happen. This value is given priority only for the trusted
clients. For non trusted clients, the volume option takes the priority. But
for trusted clients if root-squash should not happen, then they have to be
mounted with root-squash=no option. (This is done because by default
blocking root-squashing for the trusted clients will cause problems for smb
and UFO clients for which the requests have to be squashed if the option is
enabled).
* For geo-replication and defrag clients do not do root-squashing.
* Introduce a new option in open-behind for doing read after successful open.
Change-Id: I8a8359840313dffc34824f3ea80a9c48375067f0
BUG: 954057
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/4863
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I85fc6dd393449d365bb908b38c2827b58cb08171
BUG: 1030208
Signed-off-by: Vijaykumar M <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/6262
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If kernel supports FUSE_AUTO_INVAL_DATA then it is safe(r) to turn on
--fopen-keep-cache mode by default. Users report significant improvement
in perf by enabling the mode.
Change-Id: Icf9df4b7b43950d7e25302d9c2a1a7d14571a9a9
BUG: 990744
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/5770
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 4c0f4c8a89039b1fa1c9c015fb6f273268164c20.
Conflicts:
xlators/mount/fuse/src/fuse-bridge.c
For build issues added CREATE_MODE_KEY definition in:
libglusterfs/src/glusterfs.h
Change-Id: I8093c2a0b5349b01e1ee6206025edbdbee43055e
BUG: 952029
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.org/5495
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* files can be accessed directly through their gfid and not just
through their paths. For eg., if the gfid of a file is
f3142503-c75e-45b1-b92a-463cf4c01f99, that file can be accessed
using <gluster-mount>/.gfid/f3142503-c75e-45b1-b92a-463cf4c01f99
.gfid is a virtual directory used to seperate out the namespace
for accessing files through gfid. This way, we do not conflict with
filenames which can be qualified as uuids.
* A new file/directory/symlink can be created with a pre-specified
gfid. A setxattr done on parent directory with fuse_auxgfid_newfile_args_t
initialized with appropriate fields as value to key "glusterfs.gfid.newfile"
results in the entry <parent>/bname whose gfid is set to args.gfid. The
contents of the structure should be in network byte order.
struct auxfuse_symlink_in {
char linkpath[]; /* linkpath is a null terminated string */
} __attribute__ ((__packed__));
struct auxfuse_mknod_in {
unsigned int mode;
unsigned int rdev;
unsigned int umask;
} __attribute__ ((__packed__));
struct auxfuse_mkdir_in {
unsigned int mode;
unsigned int umask;
} __attribute__ ((__packed__));
typedef struct {
unsigned int uid;
unsigned int gid;
char gfid[UUID_CANONICAL_FORM_LEN + 1]; /* a null terminated gfid string
* in canonical form.
*/
unsigned int st_mode;
char bname[]; /* bname is a null terminated string */
union {
struct auxfuse_mkdir_in mkdir;
struct auxfuse_mknod_in mknod;
struct auxfuse_symlink_in symlink;
} __attribute__ ((__packed__)) args;
} __attribute__ ((__packed__)) fuse_auxgfid_newfile_args_t;
An initial consumer of this feature would be geo-replication to
create files on slave mount with same gfids as that on master.
It will also help gsyncd to access files directly through their
gfids. gsyncd in its newer version will be consuming a changelog
(of master) containing operations on gfids and sync corresponding
files to slave.
* Also, bring in support to heal gfids with a specific value.
fuse-bridge sends across a gfid during a lookup, which storage
translators assign to an inode (file/directory etc) if there is
no gfid associated it. This patch brings in support
to specify that gfid value from an application, instead of relying
on random gfid generated by fuse-bridge.
gfids can be healed through setxattr interface. setxattr should be
done on parent directory. The key used is "glusterfs.gfid.heal"
and the value should be the following structure whose contents
should be in network byte order.
typedef struct {
char gfid[UUID_CANONICAL_FORM_LEN + 1]; /* a null terminated gfid
* string in canonical form
*/
char bname[]; /* a null terminated basename */
} __attribute__((__packed__)) fuse_auxgfid_heal_args_t;
This feature can be used for upgrading older geo-rep setups where gfids
of files are different on master and slave to newer setups where they
should be same. One can delete gfids on slave using setxattr -x and
.glusterfs and issue stat on all the files with gfids from master.
Thanks to "Amar Tumballi" <amarts@redhat.com> and "Csaba Henk"
<csaba@redhat.com> for their inputs.
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Change-Id: Ie8ddc0fb3732732315c7ec49eab850c16d905e4e
BUG: 952029
Reviewed-on: http://review.gluster.com/#/c/4702
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.org/4702
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By default fuse kernel readdirp usage in fuse xlator is off.
When mount option use-readdirp=yes is provided it starts using
fuse-kernel's readdirp.
Change-Id: Id37edc53b1adc1638186d956c2f74c1e4e48aa59
BUG: 983477
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5322
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes use of READDIRPLUS call when support is available
in the kernel.
Change-Id: Iac78881179567856b55af1f46594a2b2859309f0
BUG: 908128
Signed-off-by: Anand V. Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/3905
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
use event-history framework for saving and dumping (on necessity)
important xlator specific information.
Tests:
Included the regression testcase.
Change-Id: I6c0532e9ffe0b624286cdc4d2637b1bd2c0579e0
BUG: 858215
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: root <root@thinkpad.(none)>
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/3925
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Migration of fd to new graph involves creation of a new fd
to be used only for calls sent in that graph.
Earlier approach of using same fd across all graphs, with the
associated inode always guaranteed to be the one valid in
currently active graph, had issues because of the broken
immutability of the association of fd with an inode
(for the life of fd).
With this patch, there will be a basefd, which the kernel will be
aware of. This basefd, contains a mapping of an fd which is valid
in currently active graph.
Signed-off-by: Raghavendra G <raghavendra@gluster.com>
Change-Id: I2b459f05bc2690a66498be107fad6444e3158138
BUG: 802414
Reviewed-on: http://review.gluster.org/3566
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The glusterfs mount option 'enable-ino32' does not change the behaviour
of readdir(). fuse_readdir_cbk() uses entry->d_ino directly, and this
was missed in commit c13823bd16b26bc471d3efb15f63b76fbfdf0309.
By adding the function gf_fuse_fill_dirent(), the fuse_dirent structure
is filled in a similar way as the fuse_attr structure. This helper uses
the same function to squash the 64-bit inode in a 32-bit attribute.
Change-Id: Ia20e7144613124a58691e7935cb793b6256aef79
BUG: 850352
URL: http://lists.nongnu.org/archive/html/gluster-devel/2012-09/msg00051.html
Tested-by: Steve Bakke <sbakke@netzyn.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/3955
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
License message changed for server-side, dual license GPLV2 and LGPLv3+.
Change-Id: Ia9e53061b9d2df3b3ef3bc9778dceff77db46a09
BUG: 852318
Signed-off-by: Varun Shastry <vshastry@redhat.com>
Reviewed-on: http://review.gluster.org/3940
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Linux, certain "filesystem-specific" options (passed in string form in last
argument to mount(2)), such as "rootcontext" or "context" are in fact common to
all filesystems, including fuse. We should pass them down to the actual
mount(2) call untouched.
This is achieved by adding "fuse-mountopts" option to mount/fuse translator and
adjusting the mount helper to propagate it with unrecognized options as they
are encountered.
BUG: 852754
Change-Id: I309203090c02025334561be235864d8d04e4159b
Signed-off-by: Lubomir Rintel <lubo.rintel@gooddata.com>
Reviewed-on: http://review.gluster.org/3871
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By default the GlusterFS-native client uses 64-bit inodes. Some 32-bit
applications can not handle these correctly. Introduce a client-side
mount option "enable-ino32" which causes the FUSE-client to squash the
64-bit inodes into a 32-bit value.
Change-Id: I3296d16528bfb50457b9675f6b8701234ed82ff0
BUG: 850352
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/3885
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The license message is changed to
Copyright (c) 2008-2012 Red Hat, Inc. <http://www.redhat.com>
This file is part of GlusterFS.
This file is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3 or
later), or the GNU General Public License, version 2 (GPLv2), in all
cases as published by the Free Software Foundation.
Change-Id: I07d2b63ed5fbbbd1884f1e74f2dd56013d15b0f4
BUG: 852318
Signed-off-by: Varun Shastry <vshastry@redhat.com>
Reviewed-on: http://review.gluster.org/3858
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* also make 'congestion_threshold' an option
* make 'congestion_threshold' as 75% of background queue length if
not explicitely specified
* in glusterfsd.c, moved all the fuse option dictionary setting
code to separate function
Change-Id: Ie1680eefaed9377720770a09222282321bd4132e
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 845214
Reviewed-on: http://review.gluster.org/3830
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
resolver.
One error we hit was absence of gfid on backend. While the lookup
code-path generates a new uuid and sets it on file, resolver code
doesn't do that. Since, functionally (atleast after resolving parent
inode, we would be resolving the path in new-graph) both resolver
and lookup does same work, it would be no harm in ignoring errors
during resolving the entry. This would help us to continue with
the _extra_ work (like healing gfid as of now) in fuse_lookup_resume.
Change-Id: If46d5e07c32e67b5744287a6ef55d0b0fe347689
BUG: 821138
Signed-off-by: Raghavendra G <raghavendra@gluster.com>
Reviewed-on: http://review.gluster.com/3344
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fuse kernel module supports caching negative entries, enabled
by specifying a timeout while returning ENOENT to lookup. This
patch enables the functionality to be enabled with the command
line.
Also fixed a typo bug in mount.glusterfs.in.
Change-Id: I47eab2834cca9a05887266358afbf504bbb4c489
BUG: 841417
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.com/3696
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Context
-------
gsyncd/geo-rep plans to rely on Rsync to sync extended attributes.
When this is in place, all xattrs *visible* on the mount point would
be candidate for syncing. This set could include gluster internal
xattrs too (as xome xlators do not filter out in their cbks). Syncing
these xattrs to the slave could result in unexpected functioning of
the slave mount.
Soln.
-----
For gsyncd auxillary mounts (identified by client_pid -1), we only
allow xtime related xattrs to go through and silently ignore (w/o
propagating error back to the client) the rest of them. This provides
a future proof solution as we need not worry about what xattrs show
up on the mounts. Also, 'user' namespace xattrs are always passed
through even if it's from a gsyncd aux mount.
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Change-Id: I6fac5e03d2b25fa4cdece4b2897fb202617b3c23
BUG: 841062
Reviewed-on: http://review.gluster.com/3687
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|