|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Problem: When trying to convert a plain distribute volume to replica-3
or arbiter type it is failing with ENOTCONN error as the lookup on
the root will fail as there is no quorum.
Fix: Allow lookup on root if it is coming from the ADD_REPLICA_MOUNT
which is used while adding bricks to a volume. It will try to set the
pending xattrs for the newly added bricks to allow the heal to happen
in the right direction and avoid data loss scenarios.
Note: This fix will solve the problem of type conversion only in the
case where the volume was mounted at least once. The conversion of
non mounted volumes will still fail since the dht selfheal tries to
set the directory layout will fail as they do that with the PID
GF_CLIENT_PID_NO_ROOT_SQUASH set in the frame->root.
Change-Id: Ic511939981dad118cc946754341318b164954b3b
fixes: bz#1655854
Signed-off-by: karthik-us <ksubrahm@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The current implementation of iobuf_pool has two problems:
- prealloc of 12.5MB memory, this limits the scale factor of the gluster
  processes due to RAM requirements
- lock contention, as the current implementation has one global
  iobuf_pool lock. Credits for debugging and addressing the same goes to
  Krutika Dhananjay <kdhananj@redhat.com>. Issue: #410
Hence changing the iobuf implementation to use per thread mem pool.
This may theoritically appear to cause perf dip as there is no preallocation.
But per thread mem pool will not have significant perf impact as the last
allocated memory is kept alive for subsequent allocs, for some time.
The worst case would be if iobufs requested are of random sizes each time.
The best case is, if we get iobuf request of the same size. From the perf
tests, this patch did not seem to cause any perf decrease.
Note that, with this patch, the rdma performance is going to degrade
drastically. In one of the previous patchsets we had fixes to not
degrade rdma perf, but rdma is not supported and also not tested [1].
Hence the decision was to not have code in rdma that is not tested
and not supported.
[1] https://lists.gluster.org/pipermail/gluster-users.old/2018-July/034400.html
Updates: #325
Change-Id: Ic2ef3bd498f9250dea25f25ba0c01fde19584b27
Signed-off-by: Poornima G <pgurusid@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Currently mem-pool implementation provides api to get from the
mem pool based on the struct type. This is to retain api
compatibility with the old implementation of mem pool. Internally
in the mem pool structure there is a mapping from struct to size
based pools.
In this patch, we are adding new APIs to fetch memory from mem pool,
given a size.
Change-Id: Ib220ee45ebd134a7be8f6482db5a592dbb7b9211
Updates: #325
Signed-off-by: Poornima G <pgurusid@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The inode LRU mechanism is moot in fuse xlator (ie. there is no
limit for the LRU list), as fuse inodes are referenced from
kernel context, and thus they can only be dropped on request of
the kernel. This might results in a high number of passive
inodes which are useless for the glusterfs client, causing a
significant memory overhead.
This change tries to remedy this by extending the LRU semantics
and allowing to set a finite limit on the fuse inode LRU.
A brief history of problem:
When gluster's inode table was designed, fuse didn't have any
'invalidate' method, which means, userspace application could
never ask kernel to send a 'forget()' fop, instead had to wait
for kernel to send it based on kernel's parameters. Inode table
remembers the number of times kernel has cached the inode based
on the 'nlookup' parameter. And 'nlookup' field is not used by
no other entry points (like server-protocol, gfapi etc).
Hence the inode_table of fuse module always has to have lru-limit
as '0', which means no limit. GlusterFS always had to keep all
inodes in memory as kernel would have had a reference to it.
Again, the reason for this is, kernel's glusterfs inode reference
was pointer of 'inode_t' structure in glusterfs. As it is a
pointer, we could never free it (to prevent segfault, or memory
corruption).
Solution:
In the inode table, handle the prune case of inodes with 'nlookup'
differently, and call a 'invalidator' method, which in this case is
fuse_invalidate(), and it sends the request to kernel for getting
the forget request.
When the kernel sends the forget, it means, it has dropped all
the reference to the inode, and it will send the forget with the
'nlookup' parameter too. We just need to make sure to reduce the
'nlookup' value we have when we get forget. That automatically
cause the relevant prune to happen.
Credits: Csaba Henk, Xavier Hernandez, Raghavendra Gowdappa, Nithya B
fixes: bz#1560969
Change-Id: Ifee0737b23b12b1426c224ec5b8f591f487d83a2
Signed-off-by: Amar Tumballi <amarts@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Just looked at posix.c and related code and performed
some changes and cleanups. The only important one is #3 below,
but surely the others (#2 and #4) need careful review.
Changes to other files are as they were related to code paths
in posix.c.
I'll send a separate patch for other posix related files.
Main changes:
1. Proper initializtion for parameters, where it made sense.
2. Logged outside the lock in several places.
3. Moved from CALLOC to MALLOC where it made sense.
4. Aligned structures.
5. moved dictionary functions to use _sizen where possible.
(dict_get() -> dict_get_sizen() for example)
Compile-tested only!
Change-Id: Ia84699fb495e06d095339c91c1ba770d1393bb6c
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | * Remove the options to load old symbol.
* keep only 'xlator_api' symbol from being exported using xlator.sym
* add xlator_api to all the xlators where its missing
NOTE: This covers all the xlators which has at least a test case
to validate its loading. If there is a translator, which doesn't
have any test, then we should probably remove that from codebase.
fixes: #164
Change-Id: Ibcdc8c9844cda6b4463d907a15813745d14c1ebb
Signed-off-by: Amar Tumballi <amarts@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Problem: In changelog xlator after destroying listener it call's
         unlink to delete changelog socket file but socket file
         reference is not cleaned up from process memory
Solution: 1) To cleanup reference completely from process memory
             serialize transport cleanup for changelog and then
             unlink socket file
          2) Brick xlator will notify GF_EVENT_PARENT_DOWN to next
             xlator only after cleanup all xprts
Test: To test the same run below steps
      1) Setup some volume and enable brick mux
      2) kill anyone brick with gf_attach
      3) check changelog socket for specific to killed brick
         in lsof, it should cleanup completely
fixes: bz#1600145
Change-Id: Iba06cbf77d8a87b34a60fce50f6d8c0d427fa491
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Stop spamming "invalid port" logs in case sysadmin has reserved a large
number of ports.
Change-Id: I244ef7693560cc404b36cadc6b05d92ec0e908d3
fixes: bz#1656517
Signed-off-by: Milind Changire <mchangir@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | * libglusterfs changes to add new fop
    * Fuse changes:
      - Changes in fuse bridge xlator to receive and send responses
    * posix changes to perform the op on the backend filesystem
    * protocol and rpc changes for sending and receiving the fop
    * gfapi changes for performing the fop
    * tools: glfs-copy-file-range tool for testing copy_file_range fop
      - Although, copy_file_range support has been added to the upstream
	    fuse kernel module, no release has been made yet of a kernel
        which contains the support. It is expected to come in the
        upcoming release of linux-4.20
        So, as of now, executing copy_file_range fop on a fused based
        filesystem results in fuse kernel module sending read on the
	    source fd and write on the destination fd.
	    Therefore a small gfapi based tool has been written to be able
        test the copy_file_range fop. This tool is similar (in functionality)
	    to the example program given in copy_file_range man page.
	    So, running regular copy_file_range on a fuse mount point and
	    running gfapi based glfs-copy-file-range tool gives some idea about
	    how fast, the copy_file_range (or reflink) can be.
	    On the local machine this was the result obtained.
	    mount -t glusterfs workstation:new /mnt/glusterfs
	    [root@workstation ~]# cd /mnt/glusterfs/
	    [root@workstation glusterfs]# ls
	    file
	    [root@workstation glusterfs]# cd
	    [root@workstation ~]# time /tmp/a.out /mnt/glusterfs/file /mnt/glusterfs/new
	    real  0m6.495s
	    user  0m0.000s
	    sys   0m1.439s
	    [root@workstation ~]# time glfs-copy-file-range $(hostname) new /tmp/glfs.log /file /rrr
	    OPEN_SRC: opening /file is success
	    OPEN_DST: opening /rrr is success
	    FSTAT_SRC: fstat on /rrr is success
	    copy_file_range successful
        real  0m0.309s
        user  0m0.039s
        sys   0m0.017s
        This tool needs following arguments
         1) hostname
         2) volume name
         3) log file path
         4) source file path (relative to the gluster volume root)
         5) destination file path (relative to the gluster volume root)
        "glfs-copy-file-range <hostname> <volume> <log file path> <source> <destination>"
      - Added a testcase as well to run glfs-copy-file-range tool
    * io-stats changes to capture the fop for profiling
    * NOTE:
      - Added conditional check to see whether the copy_file_range syscall
        is available or not. If not, then return ENOSYS.
      - Added conditional check for kernel minor version in fuse_kernel.h
        and fuse-bridge while referring to copy_file_range. And the kernel
        minor version is kept as it is. i.e. 24. Increment it in future
        when there is a kernel release which contains the support for
        copy_file_range fop in fuse kernel module.
    * The document which contains a writeup on this enhancement can be found at
      https://docs.google.com/document/d/1BSILbXr_knynNwxSyyu503JoTz5QFM_4suNIh2WwrSc/edit
Change-Id: I280069c814dd21ce6ec3be00a884fc24ab692367
updates: #536
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | libglusterfs devel package headers are referenced in code using
include semantics for a program, this while it works can be better
especially when dealing with out of tree xlator builds or in
general out of tree devel package usage.
Towards this, the following changes are done,
- moved all devel headers under a glusterfs directory
- Included these headers using system header notation <> in all
code outside of libglusterfs
- Included these headers using own program notation "" within
libglusterfs
This change although big, is just moving around the headers and
making it correct when including these headers from other sources.
This helps us correctly include libglusterfs includes without
namespace conflicts.
Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR <srangana@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | No functional changes (I hope).
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: Ifbec21c18a6dbe27c5271db156bff4d30ca85dbf | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Problem:
A single event-thread causes performance issues in the system.
Solution:
Bump up event-threads to 2 to make the system more performant.
This helps in making the system more responsive and helps avoid the
ping-timer-expiry problem as well. However, setting the event-threads
to 2 is not the only thing required to avoid ping-timer-expiry issues.
Change-Id: Idb0fd49e078db3bd5085dd083b0cdc77b59ddb00
fixes: bz#1653277
Signed-off-by: Milind Changire <mchangir@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Problem: 1) server_init does not cleanup allocate resources
            while it is failed before return error
         2) dict leak at the time of graph destroying
Solution: 1) free resources in case of server_init is failed
          2) Take dict_ref of graph xlator before destroying
             the graph to avoid leak
Change-Id: I9e31e156b9ed6bebe622745a8be0e470774e3d15
fixes: bz#1654917
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | A single global per program queue is contended by all request handler
threads and event threads. This can lead to high contention. So,
reduce the contention by providing each request handler thread its own
private queue.
Thanks to "Manoj Pillai"<mpillai@redhat.com> for the idea of pairing a
single queue with a fixed request-handler-thread and event-thread,
which brought down the performance regression due to overhead of
queuing significantly.
Thanks to "Xavi Hernandez"<xhernandez@redhat.com> for discussion on
how to communicate the event-thread death to request-handler-thread.
Thanks to "Karan Sandha"<ksandha@redhat.com> for voluntarily running
the perf benchmarks to qualify that performance regression introduced
by ping-timer-fixes is fixed with this patch and patiently running
many iterations of regression tests while RCAing the issue.
Thanks to "Milind Changire"<mchangir@redhat.com> for patiently running
the many iterations of perf benchmarking tests while RCAing the
regression caused by ping-timer-expiry fixes.
Change-Id: I578c3fc67713f4234bd3abbec5d3fbba19059ea5
Fixes: bz#1644629
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | liblglusterfs provides wrapper functions MALLOC/__gf_default_malloc,
CALLOC/__gf_default_calloc, and REALLOC/__gf_default_realloc for those
few places outside of mempool.c that need to call malloc/calloc/realloc
directly.
Notable exceptions are "contrib" code, e.g. rbtree and timer-wheel,
and perhaps parsers generated by yacc+lex. But even parsers can be
fixed to at least call the wrappers mentioned above, if not our own
allocators.
Change-Id: Ib8069815eba9b6c04c3adaf59727ec8d8795c4d1
updates: bz#1193929
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> | 
| | 
| 
| 
| 
| 
| | Change-Id: I666eeb63ebd000711b3f793b948d4e0c04b1a242
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
Updates: bz#1644629 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | A comment has been added to pool_destructor() function to explain why
locks are not needed there. Also, the initialization of 'poison' field
has been moved inside a locked region for further safety and clarity.
Change-Id: Idbf23bda7f9228d60c644a1bea4b6c2cfc582090
updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | All glusterd store operation and cleanup thread should work under a
critical section to avoid any partial store write.
Change-Id: I4f12e738f597a1f925c87ea2f42565dcf9ecdb9d
Fixes: bz#1652430
Signed-off-by: Atin Mukherjee <amukherj@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Some were assigned NULL, for no good reason, some were assigned proper
initial value. Made them all consistent, as much as possible, to be
assigned reasonable initial values.
No expected functional changes (and I also assume the compiler
already did most of this work behind the scenes anyway, so no
performance implications either).
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I2bc0d4f2221124b5f9ef6150c86b7259074e7013 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | There was a race in the per-thread memory pool management that could lead
to memory corruption. The race appeared when the following sequence of
events happened:
1. Thread T1 allocated a memory object O1 from its own private pool P1
2. T1 terminates and P1 is marked to be destroyed
3. The mem-sweeper thread is woken up and scans all private pools
4. It detects that P1 needs to be destroyed and starts releasing the
   objects from hot and cold lists.
5. Thread T2 releases O1
6. O1 is added to the hot list of P1
The problem happens because steps 4 and 6 are protected by diferent locks,
so they can run concurrently. This means that both T1 and T2 are modifying
the same list at the same time, potentially causing corruption.
This patch fixes the problem using the following approach:
1. When an object is released, it's only returned to the hot list of the
   corresponding memory pool if it's not marked to be destroyed. Otherwise
   the memory is released to the system.
2. Object release and mem-sweeper thread synchronize access to the deletion
   mark of the memory pool to prevent simultaneous access to the list.
Some other minor adjustments are made to reduce the lengths of the locked
regions.
Fixes: bz#1651165
Change-Id: I63be3893f92096e57f54a6150e0461340084ddde
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | fscanf with %s reads a word, there is no restriction on the length
of that word, and the caller is required to pass a sufficiently
large buffer for storing thw word. If the input word exceeds the
buffer size, it will cause buffer overflow.
To fix this, use fscanf with width parameter. Width specifies
the maximum number of characters to be read in the current reading
operation.
Change-Id: If250abf5eb637b9fc2a79047e3599f83254cd4e5
updates: bz#1193929
Signed-off-by: Poornima G <pgurusid@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | A new constant named GF_NETWORK_TIMEOUT has been defined and all
references to the hard-coded timeout of 42 seconds have been
replaced with this constant.
Change-Id: Id30f5ce4f1230f9288d9e300538624bcf1a6da27
fixes: bz#1652852
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | based on the amusing discussion @ https://rusty.ozlabs.org/?p=560
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I1cac54067eb44801b216d5620fc5ee2c89befdd0 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Since gcc-8.2.x (fedora-28 or so) gcc has been emitting warnings
about buggy use of strncpy.
e.g.
  warning: ‘strncpy’ output truncated before terminating nul
  copying as many bytes from a string as its length
and
  warning: ‘strncpy’ specified bound depends on the length of the
  source argument
Since we're copying string fragments and explicitly null terminating
use memcpy to silence the warning
Change-Id: I413d84b5f4157f15c99e9af3e154ce594d5bcdc1
updates: bz#1193929
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Problem: pmap is showing stale brick entries after down the brick
         because of glusterd_brick_rpc_notify call gf_is_service_running
         before call pmap_registry_remove to ensure about brick instance.
Solutiom: 1) Change the condition in gf_is_pid_running to ensure about
             process existence, use open instead of access to achieve
             the same
          2) Call search_brick_path_from_proc in __glusterd_brick_rpc_notify
             along with gf_is_service_running
Change-Id: Ia663ac61c01fdee6c12f47c0300cdf93f19b6a19
fixes: bz#1646892
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Per newer GCC releases and clang-scan, some trivial
dead initialization (values that were set but were never
read) were removed.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: Ia9959b2ff87d2e9cb46864e68ffe7dccb984db34 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Earlier commit had the annotation incorrect, and also did not
wrap the sanitization in a separate function. (see commit 39a1db1)
The issues are corrected in this patch, and also a coverity
stand alone run has been tested to ensure the annotations are
respected by coverity.
Change-Id: I4a93b6981e2ff4bba9a29e590b17da248931c8ae
Updates: bz#789278
Signed-off-by: ShyamsundarR <srangana@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | commit ed83a4ee7b73e6b04694d1ac11ed25d2983ac943 changed locking
order and forgot to unlock in a negative path (when index was -1).
Coverity caught it (thanks!) as  CID 1396581:  Program hangs  (LOCK)
Note: I'm unlocking before logging the failure. I think it's the right
order - logging can take a while (especially if your disk is slow).
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I82ac241edf1d511bf6807cf9c46c538ab9f4acc4 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | We add dummy interrupt handling for the FLUSH
fuse message. It can be enabled by the
"--fuse-flush-handle-interrupt" hidden command line
option, or "-ofuse-flush-handle-interrupt=yes"
mount option.
It serves no other than diagnostic & demonstational
purposes -- to exercise the interrupt handling framework
a bit and to give an usage example.
Documentation is also provided that showcases interrupt
handling via FLUSH.
Change-Id: I522f1e798501d06b74ac3592a5f73c1ab0590c60
updates: #465
Signed-off-by: Csaba Henk <csaba@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | - add sub-framework to send timed responses to kernel
- add interrupt handler queue
- implement INTERRUPT
fuse_interrupt looks up handlers for interrupted messages
in the queue. If found, it invokes the handler function.
Else responds with EAGAIN with a delay.
See spec at
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/fuse.txt?h=v4.17#n148
and explanation in comments.
Change-Id: I1a79d3679b31f36e14b4ac8f60b7f2c1ea2badfb
updates: #465
Signed-off-by: Csaba Henk <csaba@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Don't 'calculate' again where it can be passed.
If possible, do not perform under lock.
Also remove some NULL checks, assuming they were done by the callers.
Left a remark for each such change.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: Ia5c2851506da3388cb2d4445334c58881e2c4416 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | When creating a new pool, take the pool lock once and create
all arenas, instead of taking and releasing for each arena.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I7daa39de960e47e66a32ecab724cf3a61ccdc01b | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Small code refactoring to remove some if statements
in several functions. No functional changes expected.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: If9f8d5d53c9688fb994b6d690aea66f65fa01c55 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | As glusterfs logging uses different directory than /var/log
(ie, /var/log/glusterfs), there is a chance it may not be
present when starting glusterfs. Create parent dir if it
doesn't exist.
Updates: bz#1193929
Change-Id: I8d6f7e5a608ba53258b14617f5d103d1e98b95c1
Signed-off-by: Amar Tumballi <amarts@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Currently, there are possibilities in few places, where a user-controlled
(like filename, program parameter etc) string can be passed as 'fmt' for
printf(), which can lead to segfault, if the user's string contains '%s',
'%d' in it.
While fixing it, makes sense to make the explicit check for such issues
across the codebase, by making the format call properly.
Fixes: CVE-2018-14661
Fixes: bz#1644763
Change-Id: Ib547293f2d9eb618594cbff0df3b9c800e88bde4
Signed-off-by: Amar Tumballi <amarts@redhat.com> | 
| | 
| 
| 
| 
| 
| | Change-Id: Ib8bdf210a896423abcd7413dd4896d424ac0f561
fixes: bz#1626610
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | As an example, and also as an enhancement, added 'log-level'
as a default option to every translator (glusterfs already
support infrastructure to handle xl->loglevel).
Corresponding infrastructure to add per xlator log-level
is not present in glusterd volume-set. Plan is to get it
sorted out in later patches or in GD2.
* Why this is needed? - Mainly because we need to only add
different log-level to some xlator to debug few things in a
production system, while not changing overall log-level. This
helps in better debug-ability.
Updates: bz#1193929
Change-Id: Ia4098ce39197cd423345b3d31fe8315481681ab8
Signed-off-by: Amar Tumballi <amarts@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Based on the proposal to remove few features as they are not
actively maintained [1], removing tier translator from the
build. Also make sure there are no regression tests involving
tiering feature are present.
[1] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html
Change-Id: I2c177f711f9b54b7b24e1a13525ff3132bd9a9c5
updates: bz#1642807
Signed-off-by: Amar Tumballi <amarts@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | total_allocs of certain type of variables can be 4billion in a
single day depending on load. So, 32 bits for that is not enough.
Also, size_t is good variable size for one allocation, but the
sum of allocations, should be 64bits to make sure we don't
overflow the variable.
Updates: bz#1639599
Change-Id: If3b19687f94425e913a0201ae5d73661eda51f06
Signed-off-by: Amar Tumballi <amarts@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| | fixes: bz#1644164
Change-Id: I0ac5aff565b3a30d5ff25ec5a3f20e0bda424a5d
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This will allow proper printing of exact 'fop' type to be logged in
string, not number, during backtraces.
Considering this was not done on brick processes, we have no easy
way to glance and understand which fops were pending.
What gets changed:
After a crash, most of the core-dumps logged were of the form:
```
pending frames:
frame : type(0) op(18)
frame : type(0) op(18)
frame : type(0) op(28)
```
would change to
```
pending frames:
frame : type(1) op(SETXATTR)
frame : type(1) op(SETXATTR)
frame : type(1) op(READDIR)
```
updates: bz#1639599
Change-Id: I0e3d2a8dee9cfde7ed0112a948f5213f546efb80
Signed-off-by: Amar Tumballi <amarts@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | ctx->active can be null, and is checked elsewhere in the
same function. In another case, where 'ctx->active' gets
dereferenced, it needs to be validated before the loop
is hit.
Updates: bz#1622665
Change-Id: I4ec917e96c0756586fc7a74c76848bb9589a0293
Signed-off-by: Amar Tumballi <amarts@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Problem:
data center setups with large number of bricks with replication
causes a flood of connections from bricks and self-heal daemons
to glusterd causing connections to be dropped due to insufficient
listener socket backlog queue length
Solution:
raise default value of transport.listen-backlog to 1024
Change-Id: I879e4161a88f1e30875046dff232499a8e2e6c51
fixes: bz#1642850
Signed-off-by: Milind Changire <mchangir@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Coverity reports tainted pointer access in _gf_free if the pointer passed in
was used by any IO related function by the caller. The taint within gf_free
is a false positive, as the tainted region is from the passed in pointer
till its allocated lenght, and not for contents before the pointer (i.e
the GF_MEM_HEADER_SIZE bytes before the passed in pointer), as that is
exclusively handled by the gf_alloc family of functions.
CID: 1228602, 1292646, 1292647, 1292648, 1292649, 1383192, 1383195, 1389691
Should additionally fix,
CID: 1292650, 1292651, 1357874, 1382373, 1382404, 1382407
Change-Id: I48c5a4028e7b0224c432bbc30f8c29408c2a466b
Updates: bz#789278
Signed-off-by: ShyamsundarR <srangana@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| | CID 1396081:  Control flow issues  (UNREACHABLE)
Change-Id: Ifad303853224cb9abc91c1083bb1529f4c13b1d3
updates: bz#789278
Signed-off-by: Milind Changire <mchangir@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The #include "uuid.h" left over from using .../contrib/uuid is debatably
incorrect now that we use the "system header" file /usr/include/uuid/uuid.h
from libuuid-devel.
Unfortunately this is complicated by things like FreeBSD having its own
/usr/include/uuid.h, and the e2fsprogs-libuuid uuid.h in installed - as
most third-party packages in FreeBSD are - in /usr/local as
/usr/local/include/uuid/uuid.h
With a system header file it should at least be #include <uuid.h>, and
even better as #include <uuid/uuid.h>, much like the way <sys/types.h>
and <net/if.h> are included. Using #include <uuid/uuid.h> guarantees
not getting the /usr/include/uuid.h on FreeBSD, but clang/cc knows to
find "system" header files like this in /usr/local/include; with or
without the -I/... from uuid.pc. Also using #include "uuid.h" leaves
the compiler free to find a uuid.h from any -I option it might be passed.
(Fortunately we don't have any at this time.)
As we now require libuuid-devel or e2fsprogs-libuuid and configure will
exit with an error if the uuid.pc file doesn't exist, the HAVE_LIBUUID
(including the #elif FreeBSD) tests in compat-uuid.h are redundant. We
are guaranteed to have it, so testing for it is a bit silly IMO. It may
also break building third party configure scripts if they omit defining
it. (Just how hard do we want to make things for third party developers?)
Change-Id: I7317f63c806281a5d27de7d3b2208d86965545e1
updates: bz#1193929
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | One needs to be very careful about giving same key for the key and
SLEN(key) arguments in dict_xxxn() functions. Writing macros that
would take care of passing the SLEN(key) would help reduce this
burden on the developer and reviewer.
updates: bz#1193929
Change-Id: I312c479b919826570b47ae2c219c53e2f9b2ddef
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Problem and Analysis:
The length of canonical format of uuid is 36 but
'GF_UUID_BUF_SIZE 50' was being used everywhere.
glusterd/geo-rep code was earlier using strncpy,
but recently changes to memcpy with the drive
to avoid strncpys. This leads to memory corruption
and glusterd is crashing without a core with geo-rep
creation with ASAN build.
Fix:
'GF_UUID_BUF_SIZE 37' (+ 1 for NULL byte)
And change geo-rep to use UUID_CANONICAL_FORM_LEN
instead
Updates: bz#1633930
Change-Id: Ibd347d542b92e64a96ce06780cda643557233bc7
Signed-off-by: Kotresh HR <khiremat@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| | CID: 1396060
updates: bz#789278
Change-Id: Ia25aa9a4ca7505e747aa92bb3ae81415da7d19d1
Signed-off-by: Sunny Kumar <sunkumar@redhat.com> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Problem: At the time of processing GF_EVENT_PARENT_DOWN
         at brick xlator, it forwards the event to next xlator
         only while xlator ensures no stub is in progress.
         At io-thread xlator it decreases stub_cnt before the process
         a stub and notify EVENT to next xlator
Solution: Introduce a new counter to save stub_cnt and decrease
          the counter after process the stub completely at io-thread
          xlator.
          To avoid brick crash at the time of call xlator_mem_cleanup
          move only brick xlator if detach brick name has found in
          the graph
Note: Thanks to pranith for sharing a simple reproducer to
      reproduce the same
fixes bz#1637934
Change-Id: I1a694a001f7a5417e8771e3adf92c518969b6baa
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> |