glusterfs.git

	Commit message	Author	Age	Files	Lines
*	core, cli, quota: cleanup malloc debugging and stats	Dmitry Antipov	2020-05-04	1	-6/+0
	1. Since the mcheck()/mprobe() features are no longer used, mcheck.h does not need to be included. 2. Since mallinfo() is used to obtain malloc statistics, it should be detected instead of malloc_stats(). Change-Id: I54c7d2ee568e06ab29938efc01d1a2153c5bd5db Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Fixes: #1172
*	glusterfsd: structure logging	yatipadia	2020-02-06	3	-124/+132
	Convert gf_msg() to gf_smsg(). Change-Id: I1cd6a5ac6f4361195d5d925efb2cc194045d0bba Updates: #657 Signed-off-by: yatip <ypadia@redhat.com>
*	multiple xlators: reduce key length	Yaniv Kaul	2020-01-14	1	-5/+6
	In many cases, we were freely allocating long keys with no need. Smaller char arrays are just fine almost anywhere, so I went ahead and looked for places where we can use smaller ones. In some cases, annotated the functions as static and the prefixes passed as const, as it was easier to read and understand. Where relevant, converted the dict functions to use known key length. Change-Id: I882ab33ea20d90b63278336cd1370c09ffdab7f2 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
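	The "smaller char arrays" and "known key length" ideas in the message above can be sketched as follows. This is only an illustration: make_key() and key_matches() are hypothetical helpers, not functions from the GlusterFS dict API, and the 32-byte buffer size is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<prefix>.<index>" into a small caller-supplied
 * buffer and return the key length, so callers do not need a later strlen().
 * Returns -1 if the key does not fit. */
static int make_key(char *buf, size_t bufsz, const char *prefix, int index)
{
    assert(buf != NULL && bufsz > 0);
    int len = snprintf(buf, bufsz, "%s.%d", prefix, index);
    return (len < 0 || (size_t)len >= bufsz) ? -1 : len;
}

/* Hypothetical stand-in for a lookup that accepts an explicit key length. */
static int key_matches(const char *stored, const char *key, size_t keylen)
{
    return strlen(stored) == keylen && memcmp(stored, key, keylen) == 0;
}
```

	A caller would declare something like `char key[32];`, call make_key() once, and pass the returned length along with the key. The truncation check matters more as buffers get smaller.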
*	mgmt/brick-mux: Avoid sending two response when attach is failed.	Mohammed Rafi KC	2019-12-31	1	-1/+9
	We were sending two responses back to glusterd when an attach failed: one from the handler function glusterfs_handle_attach and another from rpcsvc_check_and_reply_error. This caused problems such as ref leaks and transport disconnects. Change-Id: I3bb5b59959530760b568d52becb519499b3dcd2b updates: bz#1785143 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	glusterd: refactoring long method	Barak Sason Rofman	2019-12-19	3	-325/+94
	- Refactored set_fuse_mount_options(...) in order to shorten it.
	- Removed dead code and moved some methods to their appropriate locations.
	- Converted logging in set_fuse_mount_options(...) to structured logs.
	fixes: bz#1768896 Change-Id: If865833d4c60d517da202871978691ef21235fe4 Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
*	[RFC]#ifdef gNFS related code if we are not compiling gNFS	Yaniv Kaul	2019-12-18	1	-3/+5
	If we are not compiling gNFS (--enable-gnfs is not given in the ./configure script params), there is little point in compiling code that is related to it. This patch tries to eliminate it. My hope (and it's not clear from the code) is that I did not break the NFS Ganesha support as well. Other than that, tried to compile with and without and it looks sane. Change-Id: I8d6c98066b9fceab4ec10fc6f5e81ab069e853bd updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	glusterfsd.c: remove sys_lstat() call	Yaniv Kaul	2019-11-27	1	-9/+0
	get_volfp() in glfs.c doesn't use it, so get_volfp() in glusterfsd.c can just open the file without the stat call as well, IMHO. Change-Id: I3cb5bf12a09b5be42aa2ee4f432f8d351eee5b9e updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	glusterfsd-mgmt.c: move INFO log outside a LOCK	Yaniv Kaul	2019-11-19	1	-28/+27
	In glusterfs_handle_attach() we can: 1. Move an INFO level log to be executed before the LOCK. 2. Skip the LOCK altogether, if there's no active graph. I hope it's safe - I've seen that in other functions you could look at ctx->active outside of a lock. Change-Id: I3e1ec5b1430d5fddee46883d468ff4f5bd6ca54b updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	[WIP]gluster-volgen.c: remove more of JBR and FDL xlators	Yaniv Kaul	2019-11-13	1	-5/+0
	The JBR and FDL experimental xlators were apparently removed. Removed additional leftovers scattered in the code. Change-Id: I78b6fa5fd9044dc48cdcb1fb094b8c267c2d1323 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	gluster: check ctx->active	Xie Changlong	2019-11-12	1	-0/+10
	Avoid processing "TRANSLATOR INFO" and "BARRIER" if the graph is not ready; also see commit ee630e25. Updates: bz#1769712 Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com> Change-Id: Ibd446a35962206d3689667cda7e6712d72e4ec2f
*	glusterd: Client Handling of Elastic Clusters	Mohit Agrawal	2019-11-12	1	-2/+4
	Store the list of gluster servers under the key GLUSTERD_BRICK_SERVERS at the time of the GETSPEC RPC call, and read the value on the client side to update the volfile server list, so that the client can connect to the next volfile server if the current volfile server is down. Updates #741 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Change-Id: I23f36ddb92982bb02ffd83937a8bd8a2c97e8104
*	glusterfsd-mgmt: unify read and write tests	Yaniv Kaul	2019-11-07	2	-148/+63
	1. Both read and write tests required writing first: either just writing (write test) or writing and then reading (read test). So the code is now unified. 2. There's no reason to read zeros from /dev/zero. Just use a CALLOC'ed buffer. I don't think we should read and write zeros, but I did not change the code yet (I think compression and/or dedup will offset results). It appears neither read-perf nor write-perf were tested, so added basic tests for them. Change-Id: I24b1f249fa0335ed652a8982e99c0687d940230e updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	rpc: align structs	Yaniv Kaul	2019-10-17	1	-44/+42
	Squash tens of warnings about struct padding in afr structures. The warnings were found by manually adding '-Wpadded' to the GCC command line. Also made relevant structs and definitions static, where applicable. Change-Id: Ib71a7e9c6179378f072d796d11172d086c343e53 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
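	The kind of change that silences '-Wpadded' warnings usually amounts to reordering members so that small fields share one aligned slot. The sketch below uses two hypothetical structs that are not taken from the GlusterFS sources; exact sizes depend on the target ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: each 1-byte member is typically followed by padding
 * so that the 64-bit member and the struct end stay aligned. */
struct padded_example {
    uint8_t  flag;
    uint64_t offset;
    uint8_t  type;
};

/* Same members, largest first: the two small members share one slot. */
struct reordered_example {
    uint64_t offset;
    uint8_t  flag;
    uint8_t  type;
};

/* Bytes saved per instance by the reordering on the current ABI. */
static size_t padding_saved(void)
{
    assert(sizeof(struct reordered_example) <= sizeof(struct padded_example));
    return sizeof(struct padded_example) - sizeof(struct reordered_example);
}
```

	On a typical 64-bit Linux ABI the first struct is expected to be larger than the second, but the code deliberately avoids hard-coding sizes.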
*	glusterfsd: fix unused value coverity issues	yatipadia	2019-10-15	1	-11/+0
	This patch addresses CID-1398624 and CID-1398631. Removed the unused variable brick_name. Change-Id: I4f40bd76cb4c94b28589c2333e29d4623da339d0 Updates: bz#789278 Signed-off-by: yati <ypadia@redhat.com>
*	glusterfs/fuse: Reduce the default lru-limit value	N Balachandran	2019-09-24	1	-1/+1
	The current lru-limit value still uses memory for up to 128K inodes. Reduce the default value of lru-limit to 64K. Change-Id: Ica2dd4f8f5fde45cb5180d8f02c3d86114ac52b3 Fixes: bz#1753880 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	glusterd, rpc, glusterfsd: fix coverity defects and put required annotations	Atin Mukherjee	2019-09-10	1	-0/+1
	1404965 - Null pointer dereference
	1404316 - Program hangs
	1401715 - Program hangs
	1401713 - Program hangs
	Updates: bz#789278 Change-Id: I6e6575daafcb067bc910445f82a9d564f43b75a2 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	mount.glusterfs: make fcache-keep-open option take a value	Philip Spencer	2019-08-16	1	-1/+1
	Fixes: bz#1158130 Change-Id: Ifdeaed7c9fbe85f7ce421f7c89cbe7265e45f77c Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	fuse: Set limit on invalidate queue size	N Balachandran	2019-08-14	2	-0/+22
	If the glusterfs fuse client process is unable to process the invalidate requests quickly enough, the number of such requests quickly grows large enough to use a significant amount of memory. We are now introducing another option to set an upper limit on these to prevent runaway memory usage. Change-Id: Iddfff1ee2de1466223e6717f7abd4b28ed947788 Fixes: bz#1732717 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	fuse: rate limit reading from fuse device upon receiving EPERM	Csaba Henk	2019-08-08	2	-1/+30
	Fixes: bz#1644322 Change-Id: I53e8fa362cd8c7d04fb1c4abb606a9abb642c592 Signed-off-by: Csaba Henk <csaba@redhat.com>
*	graph/shd: attach volfile even if ctx->active is NULL	Mohammed Rafi KC	2019-08-05	1	-11/+4
	When we receive a graph attach request and ctx->active is not set, we used to fail, assuming that initialization had not yet completed for the process start. Since the management connection is already established, the process can receive an attach request even when ctx->active is NULL. Change-Id: Ied4d1ac63e6d4ced4a9405a78e1ce39f81dfd437 fixes: bz#1727256 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	event: rename event_XXX with gf_ prefixed	Xiubo Li	2019-07-29	1	-3/+3
	I hit a crash when using libgfapi. libgfapi calls glfs_poller() --> event_dispatch() in file api/src/glfs.c:721, and event_dispatch() is defined locally by libgluster. The problem is that the name event_dispatch() is exactly the same as the one from the libevent package from the OS.
	For example, if an executable program Foo uses and links libevent and libgfapi at the same time, I can hit the crash, like:
	kernel: glfs_glfspoll[68486]: segfault at 1c0 ip 00007fef006fd2b8 sp 00007feeeaffce30 error 4 in libevent-2.0.so.5.1.9[7fef006ed000+46000]
	The link for Foo is:
	lib_foo_LADD = -levent $(GFAPI_LIBS)
	It will crash. This is because glfs_poller() is calling the event_dispatch() from libevent, not the one from libgluster.
	The gfapi link info:
	GFAPI_LIBS = -lacl -lgfapi -lglusterfs -lgfrpc -lgfxdr -luuid
	If I link Foo like:
	lib_foo_LADD = $(GFAPI_LIBS) -levent
	it works well without any problem.
	And if Foo calls a private lib, such as handler_glfs.so, and handler_glfs.so links the GFAPI_LIBS directly while Foo does not and instead does dlopen(handler_glfs.so), then the crash is hit every time. The link info will be:
	foo_LADD = -levent
	libhandler_glfs_LIBADD = $(GFAPI_LIBS)
	I can avoid the crash temporarily by linking the GFAPI_LIBS in Foo too, like:
	foo_LADD = $(GFAPI_LIBS) -levent
	libhandler_glfs_LIBADD = $(GFAPI_LIBS)
	But this is ugly since Foo won't use any APIs from the GFAPI_LIBS. And in some cases when the --as-needed link option is added (on many distributions it is added by default), the crash is back again and the above workaround won't work.
	Fixes: #699 Change-Id: I38f0200b941bd1cff4bf3066fca2fc1f9a5263aa Signed-off-by: Xiubo Li <xiubli@redhat.com>
*	cli: defer create_frame() (and dict creation) to later stages.	Yaniv Kaul	2019-07-16	1	-4/+6
	Where possible, defer create_frame(), for example until after command line verification. Change-Id: Id6606e90e7ea6190f30b225c4733b229c519bb2f updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	Fix spelling errors	Aravinda VK	2019-07-14	1	-1/+1
	Fixes: bz#1728554 Change-Id: I88357aed7c14988a12616035c3738c32c09a8f9a Signed-off-by: Patrick Matthäi <pmatthaei@debian.org> Signed-off-by: Aravinda VK <avishwan@redhat.com>
*	glusterd/svc: update pid of mux volumes from the shd process	Mohammed Rafi KC	2019-07-09	2	-9/+59
	For a normal volume, we update the pid from the process itself, either while daemonizing or at the end of init if it is in no-daemon mode. Along with updating the pid we also lock the file, to make sure that the process is running fine. With brick mux, we were updating the pidfile from glusterd after an attach/detach request. There are two problems with this approach. 1) We are not holding a pidlock for any file other than that of the parent process. 2) There is a chance of race conditions with attach/detach. For example, shd start and a volume stop could race. Let's say we are starting an shd and it is attached to a volume. While we are trying to link the pid file to the running process, the file could have been deleted by the thread that is doing a volume stop. Change-Id: I29a00352102877ce09ea3f376ca52affceb5cf1a Updates: bz#1722541 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	core: fedora 30 compiler warnings	SheetalPamecha	2019-06-18	1	-3/+1
	warning: ‘%s’ directive argument is null [-Wformat-overflow=] Change-Id: I69b8d47f0002c58b00d1cc947fac6f1c64e0b295 updates: bz#1193929 Signed-off-by: SheetalPamecha <spamecha@redhat.com>
*	multiple files: another attempt to remove includes	Yaniv Kaul	2019-06-14	1	-2/+0
	There are many include statements that are not needed. A previous, more ambitious attempt failed because of the *BSD platform (see https://review.gluster.org/#/c/glusterfs/+/21929/ ). Now trying a more conservative reduction. It does not solve all circular deps that we have, but it does reduce some of them. There is just too much to handle reasonably (dht-common.h includes dht-lock.h which includes dht-common.h ...), but it does reduce the overall number of lines of include we need to look at in the future to understand and fix the mess later on. Change-Id: I550cd001bdefb8be0fe67632f783c0ef6bee3f9f updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	tests: add tests for different signal handling	Amar Tumballi	2019-05-30	1	-4/+2
	Also some cleanup:
	* old-protocol.t was actually added to make sure we have line-coverage
	* first-test.t should have been removed as per the comment. It doesn't do anything.
	* add statvfs to rpc-coverage so we can cover statvfs in a few xlators.
	updates: bz#1693692 Change-Id: Ie8651ce007de484c4abced16b4de765aa5e517be Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	Fix some "Null pointer dereference" coverity issues	Xavi Hernandez	2019-05-26	1	-11/+13
	This patch fixes the following CIDs:
	* 1124829
	* 1274075
	* 1274083
	* 1274128
	* 1274135
	* 1274141
	* 1274143
	* 1274197
	* 1274205
	* 1274210
	* 1274211
	* 1288801
	* 1398629
	Change-Id: Ia7c86cfab3245b20777ffa296e1a59748040f558 Updates: bz#789278 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	core: avoid dynamic TLS allocation when possible	Xavi Hernandez	2019-04-24	1	-3/+1
	Some interdependencies between logging and memory management functions make it impossible to use the logging framework before initializing the memory subsystem, because they both depend on Thread Local Storage allocated through pthread_key_create() during initialization. This causes a crash when we try to log something very early in the initialization phase. To prevent this, several dynamically allocated TLS structures have been replaced by static TLS reserved at compile time using the '__thread' keyword. This also reduces the number of error sources, making initialization simpler. Updates: bz#1193929 Change-Id: I8ea2e072411e30790d50084b6b7e909c7bb01d50 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
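	A minimal sketch of static TLS as described above, assuming a GCC or Clang toolchain (the '__thread' keyword is a compiler extension) and POSIX threads. The counter call_depth and the helper names are hypothetical and do not come from the GlusterFS sources. The point is that the variable needs no pthread_key_create() call, so it is usable from the very first instruction of any thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Static TLS: every thread gets its own zero-initialized copy, reserved
 * without any runtime key creation. Hypothetical variable name. */
static __thread int call_depth;

static int enter_call(void)
{
    return ++call_depth;
}

static void *thread_body(void *unused)
{
    (void)unused;
    /* Runs in a fresh thread, so it sees its own copy of call_depth. */
    return (void *)(intptr_t)enter_call();
}

/* Start a thread, let it call enter_call() once, and report what it saw.
 * Returns -1 if the thread could not be created or joined. */
static int first_depth_in_new_thread(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, thread_body, NULL) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    assert(result != NULL); /* the first depth in a new thread is never 0 */
    return (int)(intptr_t)result;
}
```

	Calls made on the main thread do not affect the value seen by the new thread, which is the per-thread isolation that pthread_getspecific() would otherwise provide after explicit initialization.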
*	core: Log level changes do not effect on running client process	Mohit Agrawal	2019-04-15	3	-18/+29
	Problem: commit c34e4161f3cb6539ec83a9020f3d27eb4759a975 set log-level per xlator during reconfigure only for a brick process, not for the client process. Solution: 1) Change per xlator log-level only if brick_mux is enabled. To be sure about brick multiplexing, introduce a flag brick_mux in ctx->cmd_args. Note: There are two other changes done with this patch: 1) Ignore the client-log-level option when attaching a brick to an already running brick if brick_mux is enabled 2) Add a log to print the pid of the running process to make debugging easier Change-Id: I39e85de778e150d0685cd9a79425ce8b4783f9c9 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> Fixes: bz#1696046
*	mgmt/shd: Implement multiplexing in self heal daemon	Mohammed Rafi KC	2019-04-01	3	-23/+236
	Problem: The shd daemon is per node, which means it creates a graph with all volumes in it. While this is great for utilizing resources, it is not so good in terms of performance and manageability, because self-heal daemons don't have the capability to automatically reconfigure their graphs. So each time any configuration change happens to the volumes (replicate/disperse), we need to restart shd to bring the changes into the graph. Because of this, all ongoing heals for all other volumes have to be stopped in the middle and restarted all over again.
	Solution: This change makes shd a per-volume daemon, so that a graph is generated for each volume. When we want to start/reconfigure shd for a volume, we first search for an existing shd running on the node; if there is none, we start a new process. If a daemon is already running for shd, then we simply detach the graph for the volume and reattach the updated graph for the volume. This won't touch any of the ongoing operations for any other volumes on the shd daemon.
	Example of an shd graph when it is a per-volume graph:
	           -----------------------
	           |     debug-iostat    |
	           -----------------------
	            /        |        \
	           /         |         \
	   ---------     ---------     ---------
	   | AFR-1 |     | AFR-2 |     | AFR-3 |
	   ---------     ---------     ---------
	A running shd daemon with 3 volumes will be like:
	                     graph
	           -----------------------
	           |     debug-iostat    |
	           -----------------------
	            /        |        \
	           /         |         \
	   ------------ ------------ ------------
	   | volume-1 | | volume-2 | | volume-3 |
	   ------------ ------------ ------------
	Change-Id: Idcb2698be3eeb95beaac47125565c93370afbd99 fixes: bz#1659708 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	rpc/transport: Missing a ref on dict while creating transport object	Mohammed Rafi KC	2019-03-20	1	-3/+15
	While creating the rpc_transport object, we store a dictionary without taking a ref on the dict, but an unref is done during cleanup of the transport object. So the rpc layer expects the caller to take a ref on the dictionary before passing the dict to the rpc layer. This leads to a lot of confusion across the code base and leads to ref leaks. Semantically, this is not correct. It is the rpc layer's responsibility to take a ref when storing it, and to free it during cleanup. I'm listing all the issues or leaks across the code base caused by this confusion. These issues are currently present in the upstream master. 1) changelog_rpc_client_init 2) quota_enforcer_init 3) rpcsvc_create_listeners : when there are two transports, like tcp,rdma. 4) quotad_aggregator_init 5) glusterd: init 6) nfs3_init_state 7) server: init 8) client:init This patch does the cleanup according to the semantics. Change-Id: I46373af9630373eb375ee6de0e6f2bbe2a677425 updates: bz#1659708 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
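	The ownership rule argued for above (whoever stores the pointer takes the reference, and releases it in its own cleanup) can be sketched with a toy refcounted object. The types and functions below are hypothetical stand-ins and are not the GlusterFS dict_t or rpc_transport_t implementations.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted options object. */
struct options {
    int refcount;
};

/* Hypothetical object that keeps a pointer to the options. */
struct transport {
    struct options *opts;
};

static struct options *options_new(void)
{
    struct options *o = calloc(1, sizeof(*o));
    if (o)
        o->refcount = 1; /* the creator holds the first reference */
    return o;
}

static struct options *options_ref(struct options *o)
{
    assert(o != NULL && o->refcount > 0);
    o->refcount++;
    return o;
}

/* Drops one reference, frees at zero, and returns the remaining count. */
static int options_unref(struct options *o)
{
    assert(o != NULL && o->refcount > 0);
    int left = --o->refcount;
    if (left == 0)
        free(o);
    return left;
}

static struct transport *transport_create(struct options *o)
{
    struct transport *t = calloc(1, sizeof(*t));
    if (t)
        t->opts = options_ref(o); /* the storing layer takes its own ref */
    return t;
}

/* Releases exactly the reference taken in transport_create(). */
static int transport_destroy(struct transport *t)
{
    int left = options_unref(t->opts);
    free(t);
    return left;
}
```

	With this split, a caller can create the options, hand them to transport_create(), and drop its own reference whenever it likes; neither side has to know what the other does with the count.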
*	glusterfsd: Multiple shd processes are spawned on brick_mux environment	Mohit Agrawal	2019-03-12	1	-6/+16
	Problem: Multiple shd processes are spawned while starting volumes in a loop in a brick_mux environment. glusterd spawns a process based on a pidfile, and the shd daemon takes some time to update the pid in the pidfile, due to which glusterd is not able to get the shd pid. Solution: Commit cd249f4cb783f8d79e79468c455732669e835a4f changed the code to update the pidfile in the parent for any gluster daemon after getting the status of the forked child in the parent. To resolve this, correct the condition: update the pidfile in the parent only for glusterd; for the rest of the daemons the pidfile is updated in the child. Change-Id: Ifd14797fa949562594a285ec82d58384ad717e81 fixes: bz#1684404 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	glusterfsd: Do not process PROFILE_NFS_INFO if graph is not ready	hujianfei	2019-02-19	1	-0/+5
	Otherwise, gnfs will crash in the following situation. Also see commit 2f9e555f.
	Reproducible Steps:
	1. kill gnfs process
	2. service glusterd restart;gluster volume profile [vol] info nfs
	dump trace info:
	/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xc2)[0x7fcf5cb6a872]
	/lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7fcf5cb743a4]
	/lib64/libc.so.6(+0x35670)[0x7fcf5b1d5670]
	/usr/sbin/glusterfs(glusterfs_handle_nfs_profile+0x114)[0x7fcf5d066474]
	/lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x7fcf5cba1502]
	/lib64/libc.so.6(+0x47110)[0x7fcf5b1e7110]
	Fixes: bz#1677559 Change-Id: Id68edb3e4646c39544e0b4c90b5e0a9083b37b0d Signed-off-by: hujianfei <hujianfei@cmss.chinamobile.com>
*	glusterd: adding a comment for code readability	Sanju Rakonde	2019-02-19	1	-0/+10
	Adding a comment in the source code, so that anyone reading the code will understand the changes done by d4fa29 better. fixes: bz#1654270 Change-Id: I75aff4243420c434c47d69a4b310f77bf161bb29 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	core: implement a global thread pool	Xavi Hernandez	2019-02-18	2	-1/+37
	This patch implements a thread pool that is wait-free for adding jobs to the queue and uses a very small locked region to get jobs. This makes it possible to decrease contention drastically. It's based on the wfcqueue structure provided by the urcu library.
	It automatically enables more threads when load demands it, and stops them when not needed. There's a maximum number of threads that can be used. This value can be configured.
	Depending on the workload, the maximum number of threads plays an important role. So it needs to be configured for optimal performance. Currently the thread pool doesn't self adjust the maximum for the workload, so this configuration needs to be changed manually.
	For this reason, the global thread pool has been made optional, so that volumes can still use the thread pool provided by io-threads.
	To enable it for bricks, the following option needs to be set:
	config.global-threading = on
	This option has no effect if bricks are already running. A restart is required to activate it. It's recommended to also enable the following option when running bricks with the global thread pool:
	performance.iot-pass-through = on
	To enable it for a FUSE mount point, the option '--global-threading' must be added to the mount command. To change it, an umount and remount is needed. It's recommended to disable the following option when using global threading on a mount point:
	performance.client-io-threads = off
	To enable it for services managed by glusterd, glusterd needs to be started with option '--global-threading'. In this case all daemons, like self-heal, will be using the global thread pool.
	Currently it can only be enabled for bricks, FUSE mounts and glusterd services.
	The maximum number of threads for clients and bricks can be configured using the following options:
	config.client-threads
	config.brick-threads
	These options can be applied online and their effect is immediate most of the time. If one of them is set to 0, the maximum number of threads will be calculated as #cores * 2.
	Some distributions use a very old userspace-rcu library (version 0.7). For this reason, some header files from version 0.10 have been copied into contrib/userspace-rcu and are used if the detected version is 0.7 or older.
	An additional change has been made to io-threads to prevent threads from being started when iot-pass-through is set.
	Change-Id: I09d19e246b9e6d53c6247b29dfca6af6ee00a24b updates: #532 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
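	The rule stated above for config.client-threads and config.brick-threads (a value of 0 means "#cores * 2") can be written as a one-line helper. effective_max_threads() is a hypothetical name used only for illustration; how the patch itself obtains the core count is not shown in this log.

```c
#include <assert.h>

/* Hypothetical helper: a configured value of 0 means "derive the maximum
 * from the core count"; any other value is used as given. */
static unsigned int effective_max_threads(unsigned int configured,
                                          unsigned int cores)
{
    assert(cores > 0);
    if (configured != 0)
        return configured;
    return cores * 2;
}
```

	On Linux, one common way to obtain the core count for such a helper is sysconf(_SC_NPROCESSORS_ONLN); that choice is an assumption here, not something the commit message specifies.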
*	fuse: reflect the actual default for lru-limit option	Amar Tumballi	2019-02-11	1	-1/+1
	In both the `--help` text and the man page. updates: bz#1193929 Change-Id: I9aa9367c6863ac8e2403255280697c9e6be26cf0 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	mount/fuse: expose auto-invalidation as a mount option	Raghavendra Gowdappa	2019-02-02	2	-0/+25
	Auto invalidation is necessary when the same (meta)data is shared/accessed across multiple mounts. However, if (meta)data is not shared, all relevant I/O goes through the cache of a single mount and hence is always coherent with the (meta)data on bricks. So, fuse-auto-invalidation can be disabled for this case, which gives a huge performance boost for workloads that write data and then immediately read the data they just wrote. From glusterfs --help, <snip> --auto-invalidation[=BOOL] controls whether fuse-kernel can auto-invalidate attribute, dentry and page-cache. Disable this only if same files/directories are not accessed across two different mounts concurrently [default: "on"] </snip> Details on how disabling auto-invalidation helped to reduce pgbench init times can be found at [1]. Time taken for pgbench init of scale 8000 was 8340s. That will be an improvement of 86% (59280s vs 8340s) with auto-invalidations turned off along with other optimizations. Just disabling auto-invalidation contributed 56% improvement by reducing the total time taken by 33260s. [1] https://www.spinics.net/lists/gluster-devel/msg25907.html Change-Id: I0ed730dba9064bd9c576ad1800170a21e100e1ce Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com> updates: bz#1664934
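	The two percentages quoted above can be re-derived from the quoted timings. The sketch below assumes that both figures are measured against the same 59280s baseline, which the message implies but does not state outright; the helper names are hypothetical.

```c
#include <assert.h>

/* Percentage of the baseline time that was saved. */
static double percent_saved(double baseline_s, double saved_s)
{
    assert(baseline_s > 0.0);
    return saved_s / baseline_s * 100.0;
}

/* Round a non-negative percentage to the nearest whole number. */
static int round_percent(double pct)
{
    return (int)(pct + 0.5);
}
```

	With these helpers, (59280 - 8340) / 59280 comes to roughly 85.9%, which rounds to the quoted 86%, and 33260 / 59280 comes to roughly 56.1%, which rounds to the quoted 56%.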
*	Multiple files: reduce work while under lock.	Yaniv Kaul	2019-01-29	1	-7/+10
	Mostly, unlock before logging. In some cases, moved different code that was not needed to be under lock (for example, taking time, or malloc'ing) to be executed before taking the lock. Note: logging might be slightly less accurate in order, since it may not be done now under the lock, so order of logs is racy. I think it's a reasonable compromise. Compile-tested only! updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: I2438710016afc9f4f62a176ef1a0d3ed793b4f89
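	The "unlock before logging" pattern described above can be sketched with a plain pthread mutex. The counter, the lock and increment_and_log() are hypothetical and are not taken from the patch; the idea is to copy what the log line needs while holding the lock and do the formatting and I/O afterwards.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long counter;

/* Increments the shared counter and logs the new value to `log`
 * (pass NULL to skip logging). Returns the value this call produced. */
static unsigned long increment_and_log(FILE *log)
{
    unsigned long snapshot;
    int rc;

    rc = pthread_mutex_lock(&counter_lock);
    assert(rc == 0);
    snapshot = ++counter; /* only the shared-state update is locked */
    rc = pthread_mutex_unlock(&counter_lock);
    assert(rc == 0);
    (void)rc;

    /* Formatting and I/O run outside the critical section, so messages
     * from different threads may appear out of order. */
    if (log)
        fprintf(log, "counter is now %lu\n", snapshot);
    return snapshot;
}
```

	This is the trade-off the message calls out: the lock is held for less time, at the cost of log lines that are no longer strictly ordered with respect to the updates they describe.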
*	rpc: Fix double free	Poornima G	2019-01-22	1	-2/+0
	The value rsp.xdata.xdata_val was being freed twice. It was assigned to dict->extra_stdfree, dict_destroy would free it and also there was an explicit free. Getting rid of explicit free in this patch. Change-Id: Ia9c73454bec3970b33f154fa754398bf3b045645 fixes: bz#1668268 Signed-off-by: Poornima G <pgurusid@redhat.com>
*	rpc: use address-family option from vol file	Milind Changire	2019-01-22	1	-1/+4
	This patch helps enable IPv6 connections in the cluster. The default address-family is IPv4 without using this option explicitly. When address-family is set to "inet6" in the /etc/glusterfs/glusterd.vol file, the mount command-line also needs to have -o xlator-option="transport.address-family=inet6" added to it. This option also gets added to the brick command-line. Snapshot and gfapi use-cases should also use this option to pass in the inet6 address-family. Change-Id: I97db91021af27bacb6d7578e33ea4817f66d7270 fixes: bz#1635863 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	core: Resolve memory leak for brick	Mohit Agrawal	2019-01-16	1	-1/+7
	Problem: Some functions are not freeing memory allocated by xdr_to_generic, which results in a leak. Solution: Call free to avoid the leak. Change-Id: I3524fe2831d1511d378a032f21467edae3850314 fixes: bz#1656682 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	core: glusterd/add-brick-and-validate-replicated-volume-options.t is crash	Mohit Agrawal	2019-01-14	1	-6/+2
	Problem: Sometimes the brick crashes while handling a pmap signin request. Solution: glusterfs_mgmt_pmap_signin is using the same frame to send the pmap signin request, so to avoid the crash, send the signin request on a separate frame. Change-Id: I443f854171ec4372e8d5f84bdc576c468e92c493 fixes: bz#1665656 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	core: Resolve dict_leak at the time of destroying graph	Mohit Agrawal	2019-01-14	1	-2/+1
	Problem: In some places the gluster code calls get_new_dict to create a dictionary without taking a reference, so at the time of dict_unref it becomes a leak. Solution: To resolve this, call dict_new instead of get_new_dict. updates bz#1650403 Change-Id: I3ccbbf5af07079a4fa09aad2cd0458c8625b2f06 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	glusterd: kill the process without releasing the cleanup mutex lock	Sanju Rakonde	2019-01-02	1	-4/+3
	Problem: glusterd acquires a cleanup mutex lock before it starts the cleanup process, so that any other thread which tries to acquire a lock on any resource will be blocked on the cleanup mutex lock. We don't want any thread to try to acquire any resource once the cleanup is started, because other threads might try to acquire locks on resources which are already freed by the thread which is going through the cleanup phase. Previously we were releasing the cleanup mutex lock before the process exit. Because we release the cleanup mutex lock, before the process can exit some other thread which is blocked on the cleanup mutex lock acquires it and tries to acquire some resources which are already freed as a part of cleanup. This leads glusterd to crash. Solution: We should exit the process without releasing the cleanup mutex lock. Change-Id: Ibae1c62260f141019017f7a547519a5d38dc2bb6 fixes: bz#1654270 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	posix: use synctask for janitor	Poornima G	2018-12-19	1	-5/+3
	With brick mux, the number of threads increases as the number of bricks increases. As an initiative to reduce the number of threads in the brick mux scenario, replace the janitor thread with the synctask infra. Now close() and closedir() are handled by a separate janitor thread which is linked with glusterfs_ctx. Updates #475 Change-Id: I0c4aaf728125ab7264442fde59f3d08542785f73 Signed-off-by: Poornima G <pgurusid@redhat.com>
*	fuse: add --lru-limit option	Amar Tumballi	2018-12-14	2	-0/+25
	The inode LRU mechanism is moot in the fuse xlator (i.e. there is no limit for the LRU list), as fuse inodes are referenced from kernel context, and thus they can only be dropped on request of the kernel. This might result in a high number of passive inodes which are useless for the glusterfs client, causing a significant memory overhead.
	This change tries to remedy this by extending the LRU semantics and allowing a finite limit to be set on the fuse inode LRU.
	A brief history of the problem:
	When gluster's inode table was designed, fuse didn't have any 'invalidate' method, which means a userspace application could never ask the kernel to send a 'forget()' fop; instead it had to wait for the kernel to send it based on the kernel's parameters. The inode table remembers the number of times the kernel has cached the inode based on the 'nlookup' parameter. And the 'nlookup' field is not used by any other entry points (like server-protocol, gfapi etc).
	Hence the inode_table of the fuse module always has to have lru-limit as '0', which means no limit. GlusterFS always had to keep all inodes in memory as the kernel would have had a reference to them. Again, the reason for this is that the kernel's glusterfs inode reference was a pointer to the 'inode_t' structure in glusterfs. As it is a pointer, we could never free it (to prevent segfault, or memory corruption).
	Solution:
	In the inode table, handle the prune case of inodes with 'nlookup' differently, and call an 'invalidator' method, which in this case is fuse_invalidate(), and it sends the request to the kernel for getting the forget request.
	When the kernel sends the forget, it means it has dropped all references to the inode, and it will send the forget with the 'nlookup' parameter too. We just need to make sure to reduce the 'nlookup' value we have when we get the forget. That automatically causes the relevant prune to happen.
	Credits: Csaba Henk, Xavier Hernandez, Raghavendra Gowdappa, Nithya B
	fixes: bz#1560969 Change-Id: Ifee0737b23b12b1426c224ec5b8f591f487d83a2 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	glusterfsd: Fix coverity issue	Iraj Jamali	2018-12-14	1	-5/+0
	Problem reported: value assigned to a variable is never used. Fixes CID: 1274230 updates: bz#789278 Change-Id: I7afcb411876dea81c6820c5b31ae0a2896f9ca15 Signed-off-by: Iraj Jamali <ijamali@redhat.com>
*	libglusterfs: Move devel headers under glusterfs directory	ShyamsundarR	2018-12-05	5	-31/+31
	libglusterfs devel package headers are referenced in code using include semantics for a program. While this works, it can be better, especially when dealing with out-of-tree xlator builds or, in general, out-of-tree devel package usage. Towards this, the following changes are done: - moved all devel headers under a glusterfs directory - Included these headers using system header notation <> in all code outside of libglusterfs - Included these headers using own program notation "" within libglusterfs. This change, although big, is just moving the headers around and making it correct when including these headers from other sources. This helps us correctly include libglusterfs includes without namespace conflicts. Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b Updates: bz#1193929 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	server: Resolve memory leak path in server_init	Mohit Agrawal	2018-12-03	1	-0/+4
	Problem: 1) server_init does not clean up allocated resources when it fails, before returning an error 2) dict leak at the time of destroying the graph. Solution: 1) Free resources if server_init fails 2) Take a dict_ref of the graph xlator before destroying the graph to avoid the leak. Change-Id: I9e31e156b9ed6bebe622745a8be0e470774e3d15 fixes: bz#1654917 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>