| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were unconditionally cleaning up the grap when we get
child_down followed by parent_down. But this is prone to
race condition when some of the bricks are already disconnected.
In this case, even before the last child down is executed in the
client xlator code,we might have freed the graph. Because the
child_down event is alreadt recevied.
To fix this race, we have introduced a check to see if all client
xlator have cleared thier reconnect chain, and called the child_down
for last time.
Change-Id: I7d02813bc366dac733a836e0cd7b14a6fac52042
fixes: bz#1727329
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were using glusterfs_graph_fini to free the xl rec from
glusterfs_process_volfp as well as glusterfs_graph_cleanup.
Instead we can use glusterfs_graph_deactivate, which is does
fini as well as other common rec free.
Change-Id: Ie4a5f2771e5254aa5ed9f00c3672a6d2cc8e4bc1
Updates: bz#1716695
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
| |
This patch will add two extra logs for invalid argument
Change-Id: I3950b4f4b9d88b1f1e788ef93d8f09d4bd8d4d8b
updates: bz#1703948
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During a graph cleanup, we first sent a PARENT_DOWN and wait for
a child down to ultimately free the xlator and the graph.
In the ec xlator, we cleanup the threads when we get a PARENT_DOWN event.
But a racing event like CHILD_UP or event xl_op may trigger healing threads
after threads cleanup.
So there is a chance that the threads might access a freed private variabe
Change-Id: I252d10181bb67b95900c903d479de707a8489532
fixes: bz#1703948
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a translator stops, memory accounting for that translator is not
destroyed (because there could remain memory allocated that references
it), but mutexes that coordinate updates of memory accounting were
destroyed. This caused incorrect memory accounting and even crashes in
debug mode.
This patch also fixes some other things:
* Reduce the number of atomic operations needed to manage memory
accounting.
* Correctly account memory when realloc() is used.
* Merge two critical sections into one.
* Cleaned the code a bit.
Change-Id: Id5eaee7338729b9bc52c931815ca3ff1e5a7dcc8
Updates: bz#1659334
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Shd daemon is per node, which means they create a graph
with all volumes on it. While this is a great for utilizing
resources, it is so good in terms of performance and managebility.
Because self-heal daemons doesn't have capability to automatically
reconfigure their graphs. So each time when any configurations
changes happens to the volumes(replicate/disperse), we need to restart
shd to bring the changes into the graph.
Because of this all on going heal for all other volumes has to be
stopped in the middle, and need to restart all over again.
Solution:
This changes makes shd as a per volume daemon, so that the graph
will be generated for each volumes.
When we want to start/reconfigure shd for a volume, we first search
for an existing shd running on the node, if there is none, we will
start a new process. If already a daemon is running for shd, then
we will simply detach a graph for a volume and reatach the updated
graph for the volume. This won't touch any of the on going operations
for any other volumes on the shd daemon.
Example of an shd graph when it is per volume
graph
-----------------------
| debug-iostat |
-----------------------
/ | \
/ | \
--------- --------- ----------
| AFR-1 | | AFR-2 | | AFR-3 |
-------- --------- ----------
A running shd daemon with 3 volumes will be like-->
graph
-----------------------
| debug-iostat |
-----------------------
/ | \
/ | \
------------ ------------ ------------
| volume-1 | | volume-2 | | volume-3 |
------------ ------------ ------------
Change-Id: Idcb2698be3eeb95beaac47125565c93370afbd99
fixes: bz#1659708
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
The setxattr function receives a pointer to raw data, which may not be
null-terminated. When this data needs to be interpreted as a string, an
explicit null termination needs to be added before using the value.
Change-Id: Id110f9b215b22786da5782adec9449ce38d0d563
updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: When rpc-transport-disconnect happens, server_connection_cleanup_flush_cbk()
is supposed to call rpc_transport_unref() after open-files on
that transport are flushed per transport.But open-fd-count is
maintained in bound_xl->fd_count, which can be incremented/decremented
cumulatively in server_connection_cleanup() by all transport
disconnect paths. So instead of rpc_transport_unref() happening
per transport, it ends up doing it only once after all the files
on all the transports for the brick are flushed leading to
rpc-leaks.
Solution: To avoid races maintain fd_cnt at client instead of maintaining
on brick
Credits: Pranith Kumar Karampuri
Change-Id: I6e8ea37a61f82d9aefb227c5b3ab57a7a36850e6
fixes: bz#1668190
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In gluster code some of the places it call's get_new_dict
to create a dictionary without taking reference so at the time
of dict_unref it has become a leak
Solution: To resolve the same call dict_new instead of get_new_dict
updates bz#1650403
Change-Id: I3ccbbf5af07079a4fa09aad2cd0458c8625b2f06
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Remove the options to load old symbol.
* keep only 'xlator_api' symbol from being exported using xlator.sym
* add xlator_api to all the xlators where its missing
NOTE: This covers all the xlators which has at least a test case
to validate its loading. If there is a translator, which doesn't
have any test, then we should probably remove that from codebase.
fixes: #164
Change-Id: Ibcdc8c9844cda6b4463d907a15813745d14c1ebb
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In changelog xlator after destroying listener it call's
unlink to delete changelog socket file but socket file
reference is not cleaned up from process memory
Solution: 1) To cleanup reference completely from process memory
serialize transport cleanup for changelog and then
unlink socket file
2) Brick xlator will notify GF_EVENT_PARENT_DOWN to next
xlator only after cleanup all xprts
Test: To test the same run below steps
1) Setup some volume and enable brick mux
2) kill anyone brick with gf_attach
3) check changelog socket for specific to killed brick
in lsof, it should cleanup completely
fixes: bz#1600145
Change-Id: Iba06cbf77d8a87b34a60fce50f6d8c0d427fa491
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* libglusterfs changes to add new fop
* Fuse changes:
- Changes in fuse bridge xlator to receive and send responses
* posix changes to perform the op on the backend filesystem
* protocol and rpc changes for sending and receiving the fop
* gfapi changes for performing the fop
* tools: glfs-copy-file-range tool for testing copy_file_range fop
- Although, copy_file_range support has been added to the upstream
fuse kernel module, no release has been made yet of a kernel
which contains the support. It is expected to come in the
upcoming release of linux-4.20
So, as of now, executing copy_file_range fop on a fused based
filesystem results in fuse kernel module sending read on the
source fd and write on the destination fd.
Therefore a small gfapi based tool has been written to be able
test the copy_file_range fop. This tool is similar (in functionality)
to the example program given in copy_file_range man page.
So, running regular copy_file_range on a fuse mount point and
running gfapi based glfs-copy-file-range tool gives some idea about
how fast, the copy_file_range (or reflink) can be.
On the local machine this was the result obtained.
mount -t glusterfs workstation:new /mnt/glusterfs
[root@workstation ~]# cd /mnt/glusterfs/
[root@workstation glusterfs]# ls
file
[root@workstation glusterfs]# cd
[root@workstation ~]# time /tmp/a.out /mnt/glusterfs/file /mnt/glusterfs/new
real 0m6.495s
user 0m0.000s
sys 0m1.439s
[root@workstation ~]# time glfs-copy-file-range $(hostname) new /tmp/glfs.log /file /rrr
OPEN_SRC: opening /file is success
OPEN_DST: opening /rrr is success
FSTAT_SRC: fstat on /rrr is success
copy_file_range successful
real 0m0.309s
user 0m0.039s
sys 0m0.017s
This tool needs following arguments
1) hostname
2) volume name
3) log file path
4) source file path (relative to the gluster volume root)
5) destination file path (relative to the gluster volume root)
"glfs-copy-file-range <hostname> <volume> <log file path> <source> <destination>"
- Added a testcase as well to run glfs-copy-file-range tool
* io-stats changes to capture the fop for profiling
* NOTE:
- Added conditional check to see whether the copy_file_range syscall
is available or not. If not, then return ENOSYS.
- Added conditional check for kernel minor version in fuse_kernel.h
and fuse-bridge while referring to copy_file_range. And the kernel
minor version is kept as it is. i.e. 24. Increment it in future
when there is a kernel release which contains the support for
copy_file_range fop in fuse kernel module.
* The document which contains a writeup on this enhancement can be found at
https://docs.google.com/document/d/1BSILbXr_knynNwxSyyu503JoTz5QFM_4suNIh2WwrSc/edit
Change-Id: I280069c814dd21ce6ec3be00a884fc24ab692367
updates: #536
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libglusterfs devel package headers are referenced in code using
include semantics for a program, this while it works can be better
especially when dealing with out of tree xlator builds or in
general out of tree devel package usage.
Towards this, the following changes are done,
- moved all devel headers under a glusterfs directory
- Included these headers using system header notation <> in all
code outside of libglusterfs
- Included these headers using own program notation "" within
libglusterfs
This change although big, is just moving around the headers and
making it correct when including these headers from other sources.
This helps us correctly include libglusterfs includes without
namespace conflicts.
Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: 1) server_init does not cleanup allocate resources
while it is failed before return error
2) dict leak at the time of graph destroying
Solution: 1) free resources in case of server_init is failed
2) Take dict_ref of graph xlator before destroying
the graph to avoid leak
Change-Id: I9e31e156b9ed6bebe622745a8be0e470774e3d15
fixes: bz#1654917
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As an example, and also as an enhancement, added 'log-level'
as a default option to every translator (glusterfs already
support infrastructure to handle xl->loglevel).
Corresponding infrastructure to add per xlator log-level
is not present in glusterd volume-set. Plan is to get it
sorted out in later patches or in GD2.
* Why this is needed? - Mainly because we need to only add
different log-level to some xlator to debug few things in a
production system, while not changing overall log-level. This
helps in better debug-ability.
Updates: bz#1193929
Change-Id: Ia4098ce39197cd423345b3d31fe8315481681ab8
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: At the time of processing GF_EVENT_PARENT_DOWN
at brick xlator, it forwards the event to next xlator
only while xlator ensures no stub is in progress.
At io-thread xlator it decreases stub_cnt before the process
a stub and notify EVENT to next xlator
Solution: Introduce a new counter to save stub_cnt and decrease
the counter after process the stub completely at io-thread
xlator.
To avoid brick crash at the time of call xlator_mem_cleanup
move only brick xlator if detach brick name has found in
the graph
Note: Thanks to pranith for sharing a simple reproducer to
reproduce the same
fixes bz#1637934
Change-Id: I1a694a001f7a5417e8771e3adf92c518969b6baa
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Current resource cleanup sequence is not
perfect while brick mux is enabled
Solution: 1) Destroying xprt after cleanup all fd associated
with a client
2) Before call fini for brick xlators ensure no stub
should be running on a brick
Change-Id: I86195785e428f57d3ef0da3e4061021fafacd435
fixes: bz#1631357
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
| |
Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
Signed-off-by: Nigel Babu <nigelb@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`doc/xlator-classification.md` talks about the reasoning and expectations
Reviewers are expected to check the 'category' of new
option / translator added in the codebase, and make sure the flag
is always properly set. It helps to keep the 'expectation' proper
on the codebase.
updates: #430
Change-Id: I2bfc9934a5f6eed77fcc3e20364046242decc82c
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a squash of multiple commits:
contrib/fuse-lib/misc.c: remove unneeded memset()
All flock variables are properly set, no need to memset it.
Only compile-tested!
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I8e0512c5a88daadb0e587f545fdb9b32ca8858a2
libglusterfs/src/{client_t|fd|inode|stack}.c: remove some memset()
I don't think there's a need for any of them.
Only compile-tested!
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I2be9ccc3a5cb5da51a92af73488cdabd1c527f59
libglusterfs/src/xlator.c: remove unneeded memset()
All xl->mem_acct members are properly set,
no need to memset it.
Only compile-tested!
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I7f264cd47e7a06255a3f3943c583de77ae8e3147
xlators/cluster/afr/src/afr-self-heal-common.c: remove unneeded memset()
Since we are going over the whole array anyway, initialize it
properly, to either 1 or 0.
Only compile-tested!
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: Ied4210388976b6a7a2e91cc3de334534d6fef201
xlators/cluster/dht/src/dht-common.c: remove unneeded memset()
Since we are going over the whole array anyway it is initialized
properly.
Only compile-tested!
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: Idc436d2bd0563b6582908d7cbebf9dbc66a42c9a
xlators/cluster/ec/src/ec-helpers.c: remove unneeded memset()
Since we are going over the whole array anyway it is initialized
properly.
Only compile-tested!
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I81bf971f7fcecb4599e807d37f426f55711978fa
xlators/mgmt/glusterd/src/glusterd-volgen.c: remove some memset()
I don't think there's a need for any of them.
Only compile-tested!
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I476ea59ba53546b5153c269692cd5383da81ce2d
xlators/mgmt/glusterd/src/glusterd-geo-rep.c: read() in 4K blocks
The current 1K seems small. 4K is usually better (in Linux).
Also remove a memset() that I don't think is needed between reads.
Only compile-tested!
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I5fb7950c92d282948376db14919ad12e589eac2b
xlators/storage/posix/src/posix-{gfid-path|inode-fd-ops}.c: remove memset()
before sys_*xattr() functions.
I don't see a reason to memset the array sent to the functions
sys_llistxattr(), sys_lgetxattr(), sys_lgetxattr(), sys_flistxattr(),
sys_fgetxattr().
(Note: it's unclear to me why we are calling sys_*txattr() functions with
XATTR_VAL_BUF_SIZE-1 size instead of XATTR_VAL_BUF_SIZE ).
Only compile-tested!
Change-Id: Ief2103b56ba6c71e40ed343a93684eef6b771346
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
{glusterfsd|glusterfsd-mgmt|quota-common-utils|xlator|tier|stripe}.c
tools/setgfid2path/src/main.c
xlators/cluster/afr/src/afr-inode-read.c
{glusterfs-acl|glusterfs}.h
For const strings, just do compile time size calc instead of runtime.
Compile-tested only!
Change-Id: I303684b1ff29b05c10126fb1057f507e404ced07
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Sometimes client connection is failed after throwing
error "cleanup flag is set for xlator Try again later".
The situation comes only after get a detach request but
the brick stack is not completely detached and at the same time
the client initiates a connection with brick
Solution: To resolve the same check cleanup_starting flag in get
xlator_by_name_or_type, this function call by server_setvolume
to attach a client with brick.
Change-Id: I3720e42642fe495dd05211e2ed2cb976db9e231b
fixes: bz#1614124
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: glusterd2 build is failed due to undefined symbol
(xlator_mem_cleanup , glusterfsd_ctx) in server.so
Solution: To resolve the same done below two changes
1) Move xlator_mem_cleanup code from glusterfsd-mgmt.c
to xlator.c to be part of libglusterfs.so
2) replace glusterfsd_ctx to this->ctx because symbol
glusterfsd_ctx is not part of server.so
BUG: 1544090
Change-Id: Ie5e6fba9ed458931d08eb0948d450aa962424ae5
fixes: bz#1544090
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Sometimes brick process is getting crashed at the time
of stop brick while brick mux is enabled.
Solution: Brick process was getting crashed because of rpc connection
was not cleaning properly while brick mux is enabled.In this patch
after sending GF_EVENT_CLEANUP notification to xlator(server)
waits for all rpc client connection destroy for specific xlator.Once rpc
connections are destroyed in server_rpc_notify for all associated client
for that brick then call xlator_mem_cleanup for for brick xlator as well as
all child xlators.To avoid races at the time of cleanup introduce
two new flags at each xlator cleanup_starting, call_cleanup.
BUG: 1544090
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Note: Run all test-cases in separate build (https://review.gluster.org/#/c/19700/)
with same patch after enable brick mux forcefully, all test cases are
passed.
Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf
updates: bz#1544090
|
|
|
|
|
|
|
| |
updates: #304
Change-Id: If6a13d2e56b195390a386d720103a882e077f66c
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
otherwise, the very first metrics will have all the min as 0.
also no need to print pending-fops if it is 0.
Updates #168
Change-Id: I233de6c92b1a73977bb468ba211ac6ec3c05298f
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RTLD_LOCAL is the default value for symbol visibility flag of dlopen()
in Linux and NetBSD. Using it avoids conflicts during symbol resolution.
This also allows us to detect xlators that have not been explicitly
linked with libraries that they use. This used to go unnoticed
when RTLD_GLOBAL was being used.
BUG: 1193929
Change-Id: I50db6ea14ffdee96596060c4d6bf71cd3c432f7b
Signed-off-by: Prashanth Pai <ppai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With configure --enable-debug, add all object allocations
to a list in the corresponding mem_acct_rec. This
allows us to see all objects of a particular type
and allows for additional debugging in case of memory
leaks.
This is not compiled in by default and must be explicitly
enabled. It is intended to be used by developers.
Change-Id: I7cf2dbeadecf994423d7e7591e85f18d2575cce8
BUG: 1522662
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
| |
specify ctx in gf_log_set_loglevel, instead of getting it from a thread
specific variable.
Change-Id: I498f826e8e32231235a6b0005026a27c327727fd
BUG: 1521213
Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
|
|
|
|
|
|
|
|
|
|
| |
* Introduce xlator methods to allow dumping of metrics
* Separate options to get the metrics dumped in a path
Updates #168
Change-Id: I7df80df33b71d6f449f03c2332665b4a45f6ddf2
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
icreate creates inode, while namelink links the basename to it's
parent gfid.
For now mkdir is the primary user of these fops. Better distribution is
acheived by creating the inode on ,(say) mds1 and linking the basename to it's
parent gfid on mds2. The inode serves readdirp, stat etc.
More details about the fops are present at:
https://review.gluster.org/#/c/13395/3/design/DHT2/DHT2_Icreate_Namelink_Notes.md
This backport of three patches from experimental branch.
1- https://review.gluster.org/#/c/18085/
2- https://review.gluster.org/#/c/18086/
3- https://review.gluster.org/#/c/18094/
Updates gluster/glusterfs#243
Change-Id: I1bd3d5a441a3cfab1acfeb52f15c6c867d362592
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: It had been a longtime request to implement put fop
in gluster. put fop in gluster may not have the exact sementics
of HTTP PUT, but can be easily extended to do so. The subsequent
patches, will contain more semantics on the put fop and its
guarentees.
Why compound fop framework is not used for put?
Compound fop framework currently doesn't allow compounding of
entry fop and inode fops, i.e. fops on multiple inodes cannot be
combined in compound fop.
Updates #353
Change-Id: Idb7891b3e056d46d570bb7e31bad1b6a28656ada
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
each translator from now on can have just 1 symbol exported called
'xlator_api', which has all the required fields in it.
Updates: #164
Change-Id: I48d54f5ec59fee842b1d55877e3ac5e9ec9b6bdd
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Make sure to handle these counters in STACK_WIND/UNWIND macro, and
keep the counters as part of xlator_t structure itself, to provide
infra to monitoring.
Updates #137
Change-Id: Ib54d45e2321c2b095dac5810c37e6cdffe1f71b7
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
| |
Fixes #303
Change-Id: Icdaa804711c43c65b9684f2649437aae1b5c1ed5
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
| |
Updates #303
Change-Id: Id0b9050c93ea87532dc80b4fda650c5663d285bd
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We are storing the entire volfile and using this to check
volfile change. With brick multiplexing there will be lot
of graphs per process which will increase the memory foot
print of the process. So instead of storing the entire
graph we could use sha256 and we can compare the hash to
see whether volfile change happened or not.
Also with Brick multiplexing, the direct comparison of vol
file is not correct. There are two problems.
Problem 1:
We are currently storing one single graph (the last
updated volfile) whereas, what we need is the entire
graph with all atttached bricks.
If we fix this issue, we have second problem
Problem 2:
With multiplexing we have a graph that contains multiple
bricks. But what we are checking as part of the reconfigure
is, comparing the entire graph with one single graph,
which will always fail.
Solution:
We create list in glusterfs_ctx_t that stores sha256 hash
of individual brick graphs. When a graph changes happens
we compare the stored hash and the current hash. If the
hash matches, then no need for reconfigure. Otherwise we
first do the reconfigure and then update the hash.
For now, gfapi has not changed this way. Meaning when gfapi
volfile fetch or reconfigure happens, we still store the
entire graph and compare, each memory.
This is fine, because libgfapi will not load brick graphs.
But changing the libgfapi will make the code similar in
both glusterfsd-mgmt and api. Also it helps to reduce some
memory.
Change-Id: I9df917a771a52b95622ab8f63af34ec390163a77
BUG: 1467986
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: https://review.gluster.org/17709
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue:
In nameless lookup/other fops, parent inode will be NULL, when we try
to add the cache to the NULL inode, it causes a crash.
Hence handle the scenario of nameless fops, and do not cache/serve
the nameless fops.
Change-Id: I3b90f882ac89e6aaf3419db89e6f890797f37700
BUG: 1451588
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: https://review.gluster.org/17316
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Inside rename, a lookup is done on the source name to make sure that
the file is there. But we used to do a gfid based lookup and hence,
even if the source name was renamed to a new name from some other client,
lookup will be successful as server3_3_lookup will fetch the new path
based on the gfid.
So even if the source file does not exist any more rename will carry on,
and as server3_3_link(destination is hashed to a different brick other
than source cached scenario) also does gfid based resolve, it wont
detect that the source name does not exist and hardlink creation will be
successful (since gfid based resolve will get the new dentry).
To solve this problem, do a name based lookup inside rename. So that
rename will fail right away if the source does not exist.
Change-Id: Ieba8bdd6675088dbf18de90ed4622df043d163bd
BUG: 1412135
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: https://review.gluster.org/16375
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Valgrind can not show the symbols if a .so after calling dlclose(). The
unhelpful ??? in the output gets resolved properly with this change:
==25170== 344 bytes in 1 blocks are definitely lost in loss record 233 of 324
==25170== at 0x4C29975: calloc (vg_replace_malloc.c:711)
==25170== by 0x52C7C0B: __gf_calloc (mem-pool.c:117)
==25170== by 0x12B0638A: ???
==25170== by 0x528FCE6: __xlator_init (xlator.c:472)
==25170== by 0x528FE16: xlator_init (xlator.c:498)
==25170== by 0x52DA8D6: glusterfs_graph_init (graph.c:321)
==25170== by 0x52DB587: glusterfs_graph_activate (graph.c:695)
==25170== by 0x5046407: glfs_process_volfp (glfs-mgmt.c:79)
==25170== by 0x5043B9E: glfs_volumes_init (glfs.c:281)
==25170== by 0x5044FEC: glfs_init_common (glfs.c:986)
==25170== by 0x50451A7: glfs_init@@GFAPI_3.4.0 (glfs.c:1031)
By not calling dlclose(), the dynamically loaded .so is still available
upon program exit, and Valgrind is able to resolve the symbols. This
will add an additional leak, so dlclose() is called for normal builds,
but skipped when configuring with "./configure --enable-valgrind" or
passing the "run-with-valgrind" xlator option.
URL: http://valgrind.org/docs/manual/faq.html#faq.unhelpful
Change-Id: I2044e21b1b8fcce32ad1a817fdd795218f967731
BUG: 1425623
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: https://review.gluster.org/16809
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current macros ATOMIC_INCREMENT() and ATOMIC_DECREMENT() expect a
lock as first argument. There are at least two issues with this
approach:
1. this lock is unused on architectures that have atomic operations
2. some structures use a single lock for multiple variables
By defining a gf_atomic_t type, the unused lock can be removed, saving a
few bytes on modern architectures.
Because the gf_atomic_t type locates the lock for the variable (in case
of older architectures), each variable is protected the same on all
architectures. This makes the behaviour across all architectures more
equal (per variable locking, by a gf_lock_t or compiler optimization).
BUG: 1437037
Change-Id: Ic164892b06ea676e6a9566f8a98b7faf0efe76d6
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: https://review.gluster.org/16963
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's a lot of logic (and some long comments) around how to free
these structures safely, but then we didn't do it. Now we do.
Change-Id: I9731ae75c60e99cc43d33d0813a86912db97fd96
BUG: 1420571
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: https://review.gluster.org/16570
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Poornima G <pgurusid@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for multiple brick translator stacks running
in a single brick server process. This reduces our per-brick memory usage by
approximately 3x, and our appetite for TCP ports even more. It also creates
potential to avoid process/thread thrashing, and to improve QoS by scheduling
more carefully across the bricks, but realizing that potential will require
further work.
Multiplexing is controlled by the "cluster.brick-multiplex" global option. By
default it's off, and bricks are started in separate processes as before. If
multiplexing is enabled, then *compatible* bricks (mostly those with the same
transport options) will be started in the same process.
Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
BUG: 1385758
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: https://review.gluster.org/14763
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These functions do not generally "expect" to be called more than once
in parallel, and many are likely to misbehave in that case (one case
in DHT already). Such parallel calls have not generally happened
because there are only a few places where we call these functions, and
those have been implicitly serialized until recently. However, recent
changes in the epoll layer change that, as does brick multiplexing.
Therefore, the serialization is now explicit at the init/reconfigure
level.
It would be sufficient to serialize calls to a particular translator's
init and reconfigure functions, but that would require per-translator
locks and a bit more complexity in maintaining/using them. Since
there's no clear reason why we would need or want to support a higher
level of parallelism, the simpler approach of a global lock should
suffice.
Change-Id: I26296c2826e91dc00b7f0c2061bcc2964ef90c4c
BUG: 1399134
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/16030
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On a sharded volume when a brick is replaced while IO is going on, named
lookup on individual shards as part of read/write was failing with
ENOENT on the replaced brick, and as a result AFR initiated name heal in
lookup callback. But since pargfid was empty (which is what this patch
attempts to fix), the resolution of the shards by protocol/server used
to fail and the following pattern of logs was seen:
Brick-logs:
[2016-11-08 07:41:49.387127] W [MSGID: 115009]
[server-resolve.c:566:server_resolve] 0-rep-server: no resolution type
for (null) (LOOKUP)
[2016-11-08 07:41:49.387157] E [MSGID: 115050]
[server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null)
(00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82)
==> (Invalid argument) [Invalid argument]
Client-logs:
[2016-11-08 07:41:27.497687] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.497755] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.498500] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.499680] E [MSGID: 133010]
Also, this patch makes AFR by itself choose a non-NULL pargfid even if
its ancestors fail to initialize all pargfid placeholders.
Change-Id: I5f85b303ede135baaf92e87ec8e09941f5ded6c1
BUG: 1392445
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15788
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ic2ba77a1fdd27801a6e579e04e6c0dd93cd7127b
BUG: 1326085
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/14011
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ifd0ff278dcf43da064021f5c25e5dcd34347fcde
BUG: 1326085
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/13970
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia27d66b1061b0377857827515590eb89b18515c9
BUG: 1319992
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/11596
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Minimal infrastructure changes for the seek() FOP. This will provide
SEEK_HOLE and SEEK_DATA functionalities.
BUG: 1220173
Change-Id: I4b74fce8b0bad2f45291fd2c2b9e243c4f4a1aa9
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/11480
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the path for socket.so file while loading the so dynamically.
Also for config.memory-accounting & config.transport voltype is changed to
glusterd to fix the warning message coming from xlator_volopt_dynload
Change-Id: I0f7964814586f2018d4922b23c683f4e1eb3098e
BUG: 1283485
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/12656
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|