glusterfs.git -

	Commit message	Author	Age	Files	Lines
*	multiple files: another attempt to remove includes	Yaniv Kaul	2019-06-14	15	-35/+1
	There are many include statements that are not needed. A previous, more ambitious attempt failed because of the *BSD platform (see https://review.gluster.org/#/c/glusterfs/+/21929/ ). Now trying a more conservative reduction. It does not solve all the circular dependencies that we have, but it does reduce some of them. There is just too much to handle reasonably (dht-common.h includes dht-lock.h which includes dht-common.h ...), but it does reduce the overall number of include lines we need to look at in the future to understand and fix the mess later on. Change-Id: I550cd001bdefb8be0fe67632f783c0ef6bee3f9f updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	Fix some "Null pointer dereference" coverity issues	Xavi Hernandez	2019-05-26	1	-0/+4
	This patch fixes the following CIDs: 1124829, 1274075, 1274083, 1274128, 1274135, 1274141, 1274143, 1274197, 1274205, 1274210, 1274211, 1288801, 1398629. Change-Id: Ia7c86cfab3245b20777ffa296e1a59748040f558 Updates: bz#789278 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	Revert "rpc: implement reconnect back-off strategy"	Amar Tumballi	2019-05-21	2	-18/+16
	This reverts commit 59841f7e1ff0511b04884015441a181a56d07bea. This revert is done as a 'possible' fix for frequent regression failures, which are also random in nature (i.e., different tests fail in different runs). Why exactly this patch? Because it seemed the most probable candidate among the patches merged in the last 15 days, after which regressions have been failing more often. Updates: bz#1711827 Change-Id: I35333162fcd4064f9609525ca93c666053c6d959
*	rpc: implement reconnect back-off strategy	Xavier Hernandez	2019-05-11	2	-16/+18
	When a connection failure happens, gluster tries to reconnect every 3 seconds. In some cases the failure is spurious, so a delay of 3 seconds could be unnecessarily long. This patch implements a back-off strategy that tries a reconnect after as little as one tenth of a second. If this fails, the time is doubled until it's around 3 seconds. After that, the reconnect is attempted every 3 seconds as before. Change-Id: Icb3fbe20d618f50cbbb599dce542b4e871c22149 Updates: bz#1193929 Signed-off-by: Xavier Hernandez <xhernandez@redhat.com>
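The back-off schedule described in this message can be sketched in C as follows. This is only an illustration of the idea, not the code from the patch: the function name `next_reconnect_delay_ms`, the millisecond constants, and the choice to clamp exactly at 3 seconds are assumptions made for the example.

```c
#include <stdint.h>

/* Assumed values, taken from the prose of the commit message:
 * first retry after 0.1 s, regular retry interval of 3 s. */
#define RECONNECT_MIN_MS 100u
#define RECONNECT_MAX_MS 3000u

/* Return the delay before the next reconnect attempt, given the delay used
 * for the previous attempt (0 means this is the first attempt). */
static uint32_t
next_reconnect_delay_ms(uint32_t previous_ms)
{
    if (previous_ms == 0)
        return RECONNECT_MIN_MS;
    /* Doubling would reach or exceed the cap: stay at the regular interval. */
    if (previous_ms >= RECONNECT_MAX_MS / 2)
        return RECONNECT_MAX_MS;
    return previous_ms * 2;
}
```

With these assumed constants the delays grow as 100, 200, 400, 800, 1600 and then stay at 3000 milliseconds.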
*	rpclib: slow floating point math and libm	Kaleb S. KEITHLEY	2019-04-03	2	-9/+3
	In release-6, rpc/rpc-lib (libgfrpc) added the function get_rightmost_set_bit(), which calls log2(3), a call that takes a floating point parameter. It's used thusly: right_most_unset_bit = get_rightmost_set_bit(...); (So is it really the right-most unset bit, or the right-most set bit?) It's unclear to me whether this is in the data path or not. If it is, it's rather scary to think about integer-to-float conversions and slow calls to libm functions in the data path. gcc and clang have __builtin_ctz(), which returns the same result as get_rightmost_set_bit() and does it substantially faster. Approx 20M iterations of get_rightmost_set_bit() took ~33sec of wall clock time on my devel machine, while 20M iterations of __builtin_ctz() took < 9sec; get_rightmost_set_bit() is 3x slower than __builtin_ctz(). And as a side benefit, we can again eliminate the need to link libgfrpc with libm. Change-Id: If9e7e80874577c52223f8125b385fc930de20699 updates: bz#1193929 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
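To illustrate the point about `__builtin_ctz()`, here is a small C sketch that finds the index of the lowest set bit without any floating point math. The wrapper names and the `-1` return value for a zero input are choices made for this example only; `__builtin_ctz()` itself is undefined for 0 on gcc and clang, which is why the zero case is handled separately.

```c
#include <stdint.h>

/* Index (0-based) of the lowest set bit, using the gcc/clang builtin.
 * __builtin_ctz() is undefined for 0, so that case is handled explicitly;
 * returning -1 is a convention chosen for this sketch. */
static int
rightmost_set_bit(uint32_t value)
{
    if (value == 0)
        return -1;
    return __builtin_ctz(value);
}

/* Integer-only fallback for compilers without the builtin:
 * shift right until the lowest bit is set. */
static int
rightmost_set_bit_portable(uint32_t value)
{
    int index = 0;

    if (value == 0)
        return -1;
    while ((value & 1u) == 0) {
        value >>= 1;
        index++;
    }
    return index;
}
```

Neither variant needs libm, which matches the side benefit mentioned in the message.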
*	mgmt/shd: Implement multiplexing in self heal daemon	Mohammed Rafi KC	2019-04-01	1	-0/+2
	Problem: The shd daemon is per node, which means it creates a graph with all volumes in it. While this is great for utilizing resources, it is not so good in terms of performance and manageability, because self-heal daemons don't have the capability to automatically reconfigure their graphs. So each time any configuration change happens to the volumes (replicate/disperse), we need to restart shd to bring the changes into the graph. Because of this, all ongoing heals for all other volumes have to be stopped in the middle and restarted all over again.
	Solution: This change makes shd a per-volume daemon, so that a graph is generated for each volume. When we want to start/reconfigure shd for a volume, we first search for an existing shd running on the node; if there is none, we start a new process. If a daemon is already running for shd, then we simply detach the graph for the volume and reattach the updated graph for the volume. This won't touch any of the ongoing operations for any other volumes on the shd daemon.
	Example of an shd graph when it is a per-volume graph:
	            -----------------------
	            |     debug-iostat    |
	            -----------------------
	               /      |      \
	              /       |       \
	      ---------   ---------   ---------
	      | AFR-1 |   | AFR-2 |   | AFR-3 |
	      ---------   ---------   ---------
	A running shd daemon with 3 volumes will be like -->
	                    graph
	            -----------------------
	            |     debug-iostat    |
	            -----------------------
	               /      |      \
	              /       |       \
	   ------------  ------------  ------------
	   | volume-1 |  | volume-2 |  | volume-3 |
	   ------------  ------------  ------------
	Change-Id: Idcb2698be3eeb95beaac47125565c93370afbd99 fixes: bz#1659708 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	rpc: Remove duplicate code	Pranith Kumar K	2019-03-28	3	-77/+1
	rpc_clnt_disable() and rpc_clnt_disconnect() have the same code. Removed rpc_clnt_disconnect() and changed its callers to use rpc_clnt_disable(). updates bz#1193929 Change-Id: I965f57cc1d5af36d266810125558b6f5e5f279d4 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	build: link libgfrpc with MATH_LIB (libm, -lm)	Kaleb S. KEITHLEY	2019-03-26	1	-1/+1
	tl;dr: libgfrpc.so calls log2(3) from libm; it should be explicitly linked with -lm. The autoconf/automake/libtool stack is more or less forgiving on different distributions. On forgiving systems libtool will semi-magically link with implicit dependencies. But on Ubuntu, which seems to be tending toward being less forgiving, the link of libgfrpc will fail with an unresolved reference to log2(3). Change-Id: I9fae09ddb81e49004fbea4d7d83b95fb64a484b0 updates: bz#1193929 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	rpc/transport: Missing a ref on dict while creating transport object	Mohammed Rafi KC	2019-03-20	5	-43/+16
	While creating the rpc_transport object, we store a dictionary without taking a ref on the dict, but we do an unref during the cleanup of the transport object. So the rpc layer expects the caller to take a ref on the dictionary before passing the dict to the rpc layer. This leads to a lot of confusion across the code base and leads to ref leaks. Semantically, this is not correct. It is the rpc layer's responsibility to take a ref when storing it, and to free it during the cleanup. I'm listing the issues or leaks across the code base caused by this confusion. These issues are currently present in upstream master:
	1) changelog_rpc_client_init
	2) quota_enforcer_init
	3) rpcsvc_create_listeners: when there are two transports, like tcp,rdma
	4) quotad_aggregator_init
	5) glusterd: init
	6) nfs3_init_state
	7) server: init
	8) client: init
	This patch does the cleanup according to the semantics.
	Change-Id: I46373af9630373eb375ee6de0e6f2bbe2a677425 updates: bz#1659708 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
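The ownership rule argued for in this message (whoever stores a reference-counted object takes its own ref and drops it on cleanup) can be sketched as below. The types `options_t` and `transport_t` and all function names are hypothetical stand-ins for the dictionary and transport objects, not the GlusterFS API, and the counter is deliberately not atomic to keep the sketch short.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted dictionary. */
typedef struct options {
    int refcount;
} options_t;

/* Hypothetical stand-in for the transport object that stores it. */
typedef struct transport {
    options_t *options;
} transport_t;

static options_t *
options_new(void)
{
    options_t *opts = calloc(1, sizeof(*opts));
    if (opts != NULL)
        opts->refcount = 1; /* the creator holds the first ref */
    return opts;
}

static options_t *
options_ref(options_t *opts)
{
    opts->refcount++;
    return opts;
}

static void
options_unref(options_t *opts)
{
    if (--opts->refcount == 0)
        free(opts);
}

/* The storing layer takes its own ref, so the caller does not have to. */
static transport_t *
transport_new(options_t *opts)
{
    transport_t *trans = malloc(sizeof(*trans));
    if (trans == NULL)
        return NULL;
    trans->options = options_ref(opts);
    return trans;
}

/* Cleanup drops exactly the ref that transport_new() took. */
static void
transport_destroy(transport_t *trans)
{
    options_unref(trans->options);
    free(trans);
}
```

With this pairing, the caller keeps and releases only its own ref, and the ref and unref calls inside the storing layer always balance.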
*	core: implement a global thread pool	Xavi Hernandez	2019-02-18	3	-10/+20
	This patch implements a thread pool that is wait-free for adding jobs to the queue and uses a very small locked region to get jobs. This makes it possible to decrease contention drastically. It's based on the wfcqueue structure provided by the urcu library. It automatically enables more threads when load demands it, and stops them when not needed. There's a maximum number of threads that can be used. This value can be configured. Depending on the workload, the maximum number of threads plays an important role, so it needs to be configured for optimal performance. Currently the thread pool doesn't self-adjust the maximum for the workload, so this configuration needs to be changed manually. For this reason, the global thread pool has been made optional, so that volumes can still use the thread pool provided by io-threads.
	To enable it for bricks, the following option needs to be set:
	    config.global-threading = on
	This option has no effect if bricks are already running. A restart is required to activate it. It's recommended to also enable the following option when running bricks with the global thread pool:
	    performance.iot-pass-through = on
	To enable it for a FUSE mount point, the option '--global-threading' must be added to the mount command. To change it, an umount and remount is needed. It's recommended to disable the following option when using global threading on a mount point:
	    performance.client-io-threads = off
	To enable it for services managed by glusterd, glusterd needs to be started with option '--global-threading'. In this case all daemons, like self-heal, will be using the global thread pool. Currently it can only be enabled for bricks, FUSE mounts and glusterd services.
	The maximum number of threads for clients and bricks can be configured using the following options:
	    config.client-threads
	    config.brick-threads
	These options can be applied online and their effect is immediate most of the time. If one of them is set to 0, the maximum number of threads will be calculated as #cores * 2.
	Some distributions use a very old userspace-rcu library (version 0.7). For this reason, some header files from version 0.10 have been copied into contrib/userspace-rcu and are used if the detected version is 0.7 or older.
	An additional change has been made to io-threads to prevent threads from being started when iot-pass-through is set.
	Change-Id: I09d19e246b9e6d53c6247b29dfca6af6ee00a24b updates: #532 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
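The rule that a configured value of 0 means "#cores * 2" can be sketched in C as follows. This is an assumption-laden illustration, not the patch's code: the function name `effective_max_threads` is hypothetical, and the core count is taken from `sysconf(_SC_NPROCESSORS_ONLN)`, which is widely available on Linux and the BSDs but is not necessarily what the thread pool itself uses.

```c
#include <unistd.h>

/* Resolve the maximum thread count from a configured value.
 * A value greater than 0 is used as-is; 0 (or less) falls back to
 * twice the number of online cores, as described in the message. */
static long
effective_max_threads(long configured)
{
    long cores;

    if (configured > 0)
        return configured;

    cores = sysconf(_SC_NPROCESSORS_ONLN);
    if (cores < 1)
        cores = 1; /* sysconf() failed or is unsupported: assume one core */
    return cores * 2;
}
```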
*	clnt/rpc: ref leak during disconnect.	Mohammed Rafi KC	2019-02-12	1	-1/+10
	During disconnect cleanup, we are not cancelling the reconnect timer, which causes a ref leak each time a disconnect happens. Change-Id: I9d05d1f368d080e04836bf6a0bb018bf8f7b5b8a updates: bz#1659708 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	Multiple files: reduce work while under lock.	Yaniv Kaul	2019-01-29	1	-17/+13
	Mostly, unlock before logging. In some cases, moved code that did not need to be under the lock (for example, taking the time, or malloc'ing) to be executed before taking the lock. Note: logging might be slightly less accurate in order, since it may no longer be done under the lock, so the order of logs is racy. I think it's a reasonable compromise. Compile-tested only! updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: I2438710016afc9f4f62a176ef1a0d3ed793b4f89
*	core: heketi-cli is throwing error "target is busy"	Mohit Agrawal	2019-01-24	2	-0/+2
	Problem: When deleting a block hosting volume through heketi-cli, it throws the error "target is busy". The cli throws the error because the brick is not detached successfully, and the brick is not detached because of a race condition in cleaning up the xprt associated with the detached brick. Solution: To avoid the xprt-specific race condition, introduce an atomic flag on rpc_transport. Change-Id: Id4ff1fe8375a63be71fb3343f455190a1b8bb6d4 fixes: bz#1668190 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
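One common way to use an atomic flag against this kind of cleanup race is to let only the first caller perform the cleanup. The sketch below shows that general technique with C11 `atomic_flag`; the struct layout and function names are hypothetical, and the actual patch may use the flag differently.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical transport with a "cleanup already started" flag. */
typedef struct transport {
    atomic_flag cleanup_started;
    int cleanup_runs; /* only here so the effect can be observed */
} transport_t;

static void
transport_init(transport_t *trans)
{
    atomic_flag_clear(&trans->cleanup_started);
    trans->cleanup_runs = 0;
}

/* Returns true for the single caller that wins the race and performs the
 * cleanup; every later (or concurrent) caller gets false and does nothing. */
static bool
transport_cleanup_once(transport_t *trans)
{
    if (atomic_flag_test_and_set(&trans->cleanup_started))
        return false; /* somebody else already started the cleanup */

    trans->cleanup_runs++; /* placeholder for the real cleanup work */
    return true;
}
```

Because `atomic_flag_test_and_set()` is a single atomic read-modify-write, two threads cannot both observe the flag as clear.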
*	rpc: use address-family option from vol file	Milind Changire	2019-01-22	2	-3/+3
	This patch helps enable IPv6 connections in the cluster. The default address-family is IPv4 without using this option explicitly. When address-family is set to "inet6" in the /etc/glusterfs/glusterd.vol file, the mount command-line also needs to have -o xlator-option="transport.address-family=inet6" added to it. This option also gets added to the brick command-line. Snapshot and gfapi use-cases should also use this option to pass in the inet6 address-family. Change-Id: I97db91021af27bacb6d7578e33ea4817f66d7270 fixes: bz#1635863 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	rpc-clnt: reduce transport connect log for EINPROGRESS	Kinglong Mee	2019-01-07	1	-1/+2
	quotad and ganesha.nfsd print many log messages such as: [rpc-clnt.c:1739:rpc_clnt_submit ] 0-<VOLUME_NAME>-quota: error returned while attempting to connect to host: (null), port 0 Change-Id: Ic0c815400619e4a87a772a51b19822920228c1ef Updates: bz#1596787 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
*	rpcsvc: Don't expect dictionary values to be available	Pranith Kumar K	2019-01-07	1	-2/+2
	When reconfigure happens, string values from one dictionary are directly set in another dictionary. This can lead to invalid memory accesses when the first dictionary is freed. So use dict_set_dynstr_with_alloc instead of dict_set_str. updates bz#1650403 Change-Id: Id53236467521cfdeb07e7178d87ba6cf88d17003 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	rpc/rpc-lib: fix coverity issue	Sheetal Pamecha	2018-12-28	1	-4/+0
	Defect: Code can never be reached because the condition queue_index > 1024 can never be true. CID: 1398471 Logically dead code updates: bz#789278 Change-Id: I367cda7e734f6d774900a58d8664cffcab69126f Signed-off-by: Sheetal Pamecha <sheetal.pamecha08@gmail.com>
*	rpc : fix coverity in rpc/rpc-lib/src/rpcsvc.c	Sunny Kumar	2018-12-28	1	-2/+2
	This patch fixes a newly introduced coverity issue. CID: 1398472: Dereference before null check. updates: bz#789278 Change-Id: Ie9b13084097de8f24b138acd7608c3e15b3bba9c Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
*	rpc: Use adaptive mutex in rpcsvc_program_register	Mohit Agrawal	2018-12-20	1	-2/+8
	Adaptive mutexes are used to protect critical/shared data items that are held for short periods. They provide a balance between spin locks and traditional mutexes. We observed some improvement after using an adaptive mutex in rpcsvc_program_register. Change-Id: I7905744b32516ac4e4ca3c83c2e8e5e306093add fixes: bz#1660701
*	clang: Fix various missing checks for empty list	ShyamsundarR	2018-12-14	1	-3/+1
	When using list_for_each_entry(_safe) functions, care needs to be taken that the lists passed in are not empty, as these functions are not empty-list safe. The clang scan reported various points where this pattern could be caught, and this patch fixes them. Additionally, the following changes are present in this patch:
	- Added an explicit op_ret setting in the error case in the macro MAKE_INODE_HANDLE to address another clang issue reported
	- Minor refactoring of some functions in quota code, to address possible allocation failures in certain functions (which in turn cause possible empty lists to be passed around)
	Change-Id: I1e761a8d218708f714effb56fa643df2a3ea2cc7 Updates: bz#1622665 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	rpc: Resolve memory leak in mgmt_pmap_signout_cbk	Mohit Agrawal	2018-12-12	1	-0/+6
	Problem: When submitting a signout request to mgmt, rpc_clnt_mgmt_pmap_signout creates a frame, but the frame is not destroyed in the cbk. Solution: Clean up the frame in mgmt_pmap_signout_cbk to avoid the leak. Change-Id: I9961cacb2e02c8023c4c99e22e299b8729c2b09f fixes: bz#1658045 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	copy_file_range support in GlusterFS	Raghavendra Bhat	2018-12-12	1	-0/+1
	* libglusterfs changes to add new fop
	* Fuse changes:
	  - Changes in fuse bridge xlator to receive and send responses
	* posix changes to perform the op on the backend filesystem
	* protocol and rpc changes for sending and receiving the fop
	* gfapi changes for performing the fop
	* tools: glfs-copy-file-range tool for testing copy_file_range fop
	  - Although copy_file_range support has been added to the upstream fuse kernel module, no release has been made yet of a kernel which contains the support. It is expected to come in the upcoming release of linux-4.20. So, as of now, executing the copy_file_range fop on a fuse-based filesystem results in the fuse kernel module sending a read on the source fd and a write on the destination fd. Therefore a small gfapi-based tool has been written to be able to test the copy_file_range fop. This tool is similar (in functionality) to the example program given in the copy_file_range man page. So, running regular copy_file_range on a fuse mount point and running the gfapi-based glfs-copy-file-range tool gives some idea about how fast copy_file_range (or reflink) can be. On the local machine this was the result obtained:
	      mount -t glusterfs workstation:new /mnt/glusterfs
	      [root@workstation ~]# cd /mnt/glusterfs/
	      [root@workstation glusterfs]# ls
	      file
	      [root@workstation glusterfs]# cd
	      [root@workstation ~]# time /tmp/a.out /mnt/glusterfs/file /mnt/glusterfs/new
	      real 0m6.495s
	      user 0m0.000s
	      sys 0m1.439s
	      [root@workstation ~]# time glfs-copy-file-range $(hostname) new /tmp/glfs.log /file /rrr
	      OPEN_SRC: opening /file is success
	      OPEN_DST: opening /rrr is success
	      FSTAT_SRC: fstat on /rrr is success
	      copy_file_range successful
	      real 0m0.309s
	      user 0m0.039s
	      sys 0m0.017s
	    This tool needs the following arguments:
	      1) hostname
	      2) volume name
	      3) log file path
	      4) source file path (relative to the gluster volume root)
	      5) destination file path (relative to the gluster volume root)
	    "glfs-copy-file-range <hostname> <volume> <log file path> <source> <destination>"
	  - Added a testcase as well to run the glfs-copy-file-range tool
	* io-stats changes to capture the fop for profiling
	* NOTE:
	  - Added a conditional check to see whether the copy_file_range syscall is available or not. If not, then return ENOSYS.
	  - Added a conditional check for the kernel minor version in fuse_kernel.h and fuse-bridge while referring to copy_file_range. The kernel minor version is kept as it is, i.e. 24. Increment it in the future when there is a kernel release which contains the support for the copy_file_range fop in the fuse kernel module.
	* The document which contains a writeup on this enhancement can be found at https://docs.google.com/document/d/1BSILbXr_knynNwxSyyu503JoTz5QFM_4suNIh2WwrSc/edit
	Change-Id: I280069c814dd21ce6ec3be00a884fc24ab692367 updates: #536 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
*	libglusterfs: Move devel headers under glusterfs directory	ShyamsundarR	2018-12-05	19	-61/+61
	libglusterfs devel package headers are referenced in code using the include semantics of a program's own headers. While this works, it can be better, especially when dealing with out-of-tree xlator builds or, in general, out-of-tree devel package usage. Towards this, the following changes are done:
	- moved all devel headers under a glusterfs directory
	- Included these headers using system header notation <> in all code outside of libglusterfs
	- Included these headers using own program notation "" within libglusterfs
	This change, although big, is just moving the headers around and making it correct when including these headers from other sources. This helps us correctly include libglusterfs includes without namespace conflicts.
	Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b Updates: bz#1193929 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	protocol/server: support server.all-squash	Xie Changlong	2018-12-05	3	-0/+60
	We still use gnfs on our side, so do a little work to support server.all-squash. Just like server.root-squash, it's also a volume-wide option. Also see bz#1285126
	    $ gluster volume set <VOLNAME> server.all-squash on
	Note: If you enable server.root-squash and server.all-squash at the same time, only server.all-squash works. Please refer to the following table:
	    +---------------+-----------------+---------------------------+
	    |               |all_squash       | no_all_squash             |
	    +-------------------------------------------------------------+
	    |               |                 |anonuid/anongid for root   |
	    |root_squash    |anonuid/anongid  |useruid/usergid for no-root|
	    +-------------------------------------------------------------+
	    |no_root_squash |anonuid/anongid  |useruid/usergid            |
	    +-------------------------------------------------------------+
	Updates bz#1285126 Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com> Signed-off-by: Xue Chuanyu <xuechuanyu@cmss.chinamobile.com> Change-Id: Iea043318fe6e9a75fa92b396737985062a26b47e
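The squash table in this message can be expressed as a small decision function. The sketch below is an illustration of the table only; the function name `effective_uid` and its parameters are hypothetical and do not come from the patch, and the same logic would apply to gids.

```c
#include <stdbool.h>

/* Map an incoming uid to the uid used on the server, following the table:
 * - all_squash:  every user is mapped to the anonymous uid
 * - root_squash: only root (uid 0) is mapped to the anonymous uid
 * - neither:     the uid is passed through unchanged
 * When both options are on, all_squash takes precedence. */
static unsigned int
effective_uid(unsigned int uid, unsigned int anonuid, bool root_squash,
              bool all_squash)
{
    if (all_squash)
        return anonuid;
    if (root_squash && uid == 0)
        return anonuid;
    return uid;
}
```

For example, with an anonymous uid of 65534 and only root_squash enabled, uid 0 maps to 65534 while uid 1000 stays 1000.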
*	rpc: check if fini is there before calling it	Raghavendra Bhat	2018-12-04	1	-1/+3
	The rpc_transport_t structure is allocated and filled in the rpc_transport_load function. If filling the fields of the rpc structure fails, then in the failure handling the structure is freed by rpc_transport_cleanup. There, it unconditionally calls fini. But if the failure handling was invoked because of any failure between the allocation of rpc_transport_t and the filling of transport->fini (including the failure to fill fini ()), then rpc_transport_cleanup can lead to a segfault. Change-Id: I8be9b84cd6b19933c559c9736198a6e440373f68 fixes: bz#1654917 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
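The failure mode described here (calling a function pointer that was never filled in) and the guard against it can be sketched as below. The struct and function names are simplified stand-ins invented for this example; they are not the real rpc_transport_t definition.

```c
#include <stddef.h>

/* Simplified stand-in for a transport whose fini hook is filled in late. */
typedef struct transport transport_t;
struct transport {
    void (*fini)(transport_t *self); /* NULL until loading has succeeded */
    int finalized;
};

/* Example fini hook used to observe that the call happened. */
static void
example_fini(transport_t *self)
{
    self->finalized = 1;
}

/* Cleanup that tolerates a half-initialized transport.
 * Returns 1 if fini ran, 0 if there was nothing to call. */
static int
transport_cleanup(transport_t *trans)
{
    if (trans == NULL || trans->fini == NULL)
        return 0; /* loading failed before fini was set: skip the call */

    trans->fini(trans);
    return 1;
}
```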
*	server: Resolve memory leak path in server_init	Mohit Agrawal	2018-12-03	5	-9/+66
	Problem: 1) server_init does not clean up allocated resources when it fails, before returning the error 2) dict leak at the time of destroying the graph. Solution: 1) free resources in case server_init fails 2) Take dict_ref of the graph xlator before destroying the graph to avoid the leak. Change-Id: I9e31e156b9ed6bebe622745a8be0e470774e3d15 fixes: bz#1654917 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	rpc *.h fles: align structs	Yaniv Kaul	2018-12-03	6	-58/+56
	Make an effort to slightly better align the structures. Change-Id: I6f80a451f2ffbf15adfb986cedc24c2799787b49 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	rpcsvc: provide each request handler thread its own queue	Raghavendra Gowdappa	2018-11-29	7	-134/+327
	A single global per-program queue is contended by all request handler threads and event threads. This can lead to high contention. So, reduce the contention by providing each request handler thread its own private queue. Thanks to "Manoj Pillai" <mpillai@redhat.com> for the idea of pairing a single queue with a fixed request-handler-thread and event-thread, which significantly brought down the performance regression due to the overhead of queuing. Thanks to "Xavi Hernandez" <xhernandez@redhat.com> for discussion on how to communicate the event-thread death to the request-handler-thread. Thanks to "Karan Sandha" <ksandha@redhat.com> for voluntarily running the perf benchmarks to qualify that the performance regression introduced by ping-timer-fixes is fixed with this patch, and for patiently running many iterations of regression tests while RCAing the issue. Thanks to "Milind Changire" <mchangir@redhat.com> for patiently running the many iterations of perf benchmarking tests while RCAing the regression caused by ping-timer-expiry fixes. Change-Id: I578c3fc67713f4234bd3abbec5d3fbba19059ea5 Fixes: bz#1644629 Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
*	core: fix strncpy warnings	Kaleb S. KEITHLE	2018-11-15	1	-2/+2
	Since gcc-8.2.x (fedora-28 or so) gcc has been emitting warnings about buggy use of strncpy. Most uses that gcc warns about in our sources are exactly backwards; the 'limit' or len is the strlen/size of the _source param_, giving exactly zero protection against overruns. (Which was, after all, one of the points of using strncpy in the first place.) IOW, many warnings are about uses that look approximately like this:
	    ...
	    char dest[8];
	    char src[] = "this is a string longer than eight chars";
	    ...
	    strncpy (dest, src, sizeof(src)); /* boom */
	    ...
	The len/limit should be sizeof(dest). Note: the above example has a definite over-run. In our source the overrun is typically only theoretical (but possibly exploitable.) Also strncpy doesn't null-terminate on truncation; snprintf does; prefer snprintf over strncpy. Mildly surprising that coverity doesn't warn/isn't warning about this.
	Change-Id: I022d5c6346a751e181ad44d9a099531c1172626e updates: bz#1193929 Signed-off-by: Kaleb S. KEITHLE <kkeithle@redhat.com>
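A corrected version of the pattern criticized in this message might look like the sketch below: the limit is the size of the destination, and snprintf() guarantees NUL termination. The helper name `copy_bounded` and its return convention are assumptions made for this example, not code from the patch.

```c
#include <stdio.h>

/* Copy src into dest, bounded by the size of the DESTINATION buffer.
 * dest is always NUL-terminated when dest_size > 0.
 * Returns 0 if the whole string fit, -1 if it was truncated. */
static int
copy_bounded(char *dest, size_t dest_size, const char *src)
{
    int needed = snprintf(dest, dest_size, "%s", src);

    /* snprintf() returns the length it wanted to write (excluding the NUL);
     * a value >= dest_size means the output was cut short. */
    if (needed < 0 || (size_t)needed >= dest_size)
        return -1;
    return 0;
}
```

Called as `copy_bounded(dest, sizeof(dest), src)` with the 8-byte `dest` from the message, the copy is truncated to seven characters plus the terminator instead of overrunning the buffer.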
*	rpc/rpc-lib/src/rpc-clnt.c: unlock sooner, if we fail to connect.	Yaniv Kaul	2018-11-15	1	-15/+11
	Previously, we did not go to unlock the mutex if we failed to connect. This patch fixes it. Compile-tested only! updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: I0fcca066a2601dba6bc3e9eb8b3c9fc757ffe4db
*	rpc-clnt*: several code changes to reduce conn lock times	Yaniv Kaul	2018-11-12	3	-55/+25
	Assorted code refactoring to reduce lock contention. Also, took the opportunity to reorder structs more properly. Removed dead code. Hopefully, no functional changes. Compile-tested only! updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: I5de6124ad071fd5e2c31832364d602b5f6d6fe28
*	all: fix the format string exceptions	Amar Tumballi	2018-11-05	1	-2/+2
	Currently, there are a few places where a user-controlled string (like a filename, program parameter etc.) can be passed as 'fmt' for printf(), which can lead to a segfault if the user's string contains '%s' or '%d'. While fixing it, it makes sense to add an explicit check for such issues across the codebase, by making the format call properly. Fixes: CVE-2018-14661 Fixes: bz#1644763 Change-Id: Ib547293f2d9eb618594cbff0df3b9c800e88bde4 Signed-off-by: Amar Tumballi <amarts@redhat.com>
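The class of bug described in this message, and the usual fix, can be shown with a short C sketch. The function name `format_user_text` is hypothetical; the point is only that untrusted text is passed as an argument to a constant "%s" format rather than as the format string itself.

```c
#include <stdio.h>

/* Write user-controlled text into `out` without interpreting it as a format.
 * Returns the number of characters snprintf() wanted to write. */
static int
format_user_text(char *out, size_t out_size, const char *user_text)
{
    /* Unsafe variant (do not use): snprintf(out, out_size, user_text);
     * any '%s' or '%d' inside user_text would read nonexistent arguments. */
    return snprintf(out, out_size, "%s", user_text);
}
```

With the constant format, a name such as `file%s%d.txt` is copied literally instead of being interpreted.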
*	rpc/rpc-lib: Uninitialized argument value of a function	Harpreet Lalwani	2018-10-23	1	-0/+3
	trav->saved_at.tv_sec is not initialized. Calling "list_empty" function before initializing "trav". Updates: bz#1622665 Change-Id: Ib5c2703a07a9c56ccd115001aca500f7a23c4a2e Signed-off-by: Harpreet Lalwani <hlalwani@redhat.com>
*	rpc: failed requests immediately if rpc connection is down	Kinglong Mee	2018-09-27	1	-1/+4
	In case glfs_fini is ongoing, some cache xlators, like readdir-ahead, continue to submit requests. The current rpc submit code ignores the connection status and queues these internally generated requests. These requests then get cleaned up after the inode table has been destroyed, causing a crash. Change-Id: Ife6b17d8592a054f7a7f310c79d07af005087017 updates: bz#1626313 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
*	Land part 2 of clang-format changes	Gluster Ant	2018-09-12	13	-6407/+6270
	Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
*	Land clang-format changes	Gluster Ant	2018-09-12	11	-1153/+1129
	Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
*	glusterd: Fix Buffer size issues	Sanju Rakonde	2018-09-04	1	-4/+4
	This patch fixes buffer size issue 1138522. Change-Id: Ia12fc8f34f75704f8ed3efae2022c4fd67a8c76c updates: bz#789278 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	multiple files: calloc -> malloc	Yaniv Kaul	2018-09-04	1	-1/+1
	xlators/cluster/stripe/src/stripe-helpers.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	xlators/cluster/dht/src/tier.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	xlators/cluster/dht/src/dht-layout.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	xlators/cluster/dht/src/dht-helper.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	xlators/cluster/dht/src/dht-common.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	xlators/cluster/afr/src/afr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	xlators/cluster/afr/src/afr-inode-read.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	tests/bugs/replicate/bug-1250170-fsync.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	tests/basic/gfapi/gfapi-async-calls-test.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	tests/basic/ec/ec-fast-fgetxattr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	rpc/xdr/src/glusterfs3.h: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	rpc/rpc-transport/socket/src/socket.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	rpc/rpc-lib/src/rpc-clnt.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	extras/geo-rep/gsync-sync-gfid.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-xml-output.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-rpc-ops.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-cmd-volume.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-cmd-system.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-cmd-snapshot.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-cmd-peer.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	cli/src/cli-cmd-global.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
	It doesn't make sense to calloc (allocate and clear) memory when the code right away fills that memory with data. It may be optimized by the compiler, or have a microscopic performance improvement. In some cases, also changed the allocation size to be sizeof some struct or type instead of a pointer, which is easier to read. In some cases, removed redundant strlen() calls by saving the result into a variable.
	1. Only done for the straightforward cases. There's room for improvement.
	2. Please review carefully, especially for string allocation, with the terminating NUL character.
	Only compile-tested!
	updates: bz#1193929 Original-Author: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Amar Tumballi <amarts@redhat.com> Change-Id: I16274dca4078a1d06ae09a0daf027d734b631ac2
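The reasoning in this message (no need to zero memory that is overwritten immediately, and no need to call strlen() twice) can be sketched with the plain C allocator. The example uses `malloc()` rather than the GF_MALLOC() wrapper so that it is self-contained, and the function name `duplicate_string` is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Duplicate a string with a single strlen() call and no zero-fill.
 * Returns NULL if the allocation fails; the caller frees the result. */
static char *
duplicate_string(const char *src)
{
    size_t len = strlen(src);     /* computed once and reused below */
    char *copy = malloc(len + 1); /* +1 for the terminating NUL */

    if (copy == NULL)
        return NULL;

    /* Every allocated byte is written here, including the terminator,
     * so clearing the buffer first (calloc) would be wasted work. */
    memcpy(copy, src, len + 1);
    return copy;
}
```

The part that needs careful review is exactly the one the message points at: the allocation must include the extra byte for the terminator, and that byte must be written explicitly.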
*	Various files: strncpy()->sprintf(), reduce strlen()'s	Yaniv Kaul	2018-08-31	1	-4/+6
	strncpy may not be very efficient for short strings copied into a large buffer: if the length of src is less than n, strncpy() writes additional null bytes to dest to ensure that a total of n bytes are written. Instead, use snprintf(). Check for truncated output where applicable. Also:
	- save the result of strlen() and re-use it when possible.
	- move from strlen to SLEN (sizeof()) for const strings.
	Compile-tested only!
	Change-Id: I54e80d4f4a80e98d3775e376efe05c51af0b29eb updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
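Both points in this message can be demonstrated with a few lines of C. The `SLEN` definition below is an assumption about how such a macro is typically written (sizeof of a string literal minus the terminator), and the helper `nul_bytes_written_by_strncpy` is a hypothetical function written only to make the padding behaviour of strncpy() visible.

```c
#include <stddef.h>
#include <string.h>

/* Compile-time length of a string literal (assumed definition). */
#define SLEN(str) (sizeof(str) - 1)

/* Copy src into a 64-byte scratch buffer with strncpy() and count how many
 * NUL bytes ended up in the buffer (the terminator plus all the padding). */
static size_t
nul_bytes_written_by_strncpy(const char *src)
{
    char buffer[64];
    size_t i;
    size_t count = 0;

    memset(buffer, 'x', sizeof(buffer));  /* mark every byte as untouched */
    strncpy(buffer, src, sizeof(buffer)); /* pads the remainder with NULs */

    for (i = 0; i < sizeof(buffer); i++) {
        if (buffer[i] == '\0')
            count++;
    }
    return count;
}
```

For a two-character source, strncpy() writes 62 NUL bytes into the 64-byte buffer, which is the extra work the message refers to.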
*	rpc: log fuse unique ID along with gluster XID	Milind Changire	2018-08-30	1	-8/+12
\| \| \| \| \| \| \| \| \| \|	For better traceability between fuse requests and gluster requests, a mapping between the two IDs needs to be established in the logs. BUG: 1623408 Change-Id: I0ef82fe69c1ad7d0ce9e3ac4f35cd82aa6e9bca9 fixes: bz#1623408 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	rpc/rpc-lib/src/rpc-clnt-ping.c:move to GF_MALLOC() instead of GF_CALLOC() when	Yaniv Kaul	2018-08-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It doesn't make sense to calloc (allocate and clear) memory when the code right away fills that memory with data. It may be optimized by the compiler, or have a microscopic performance improvement. Please review carefully, especially for string allocation, with the terminating NUL byte. Only compile-tested! Change-Id: Ifb30412ddf1bfa509f52e0454454929b266e5658 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
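A minimal sketch of the reasoning above, assuming plain malloc() as a stand-in for GF_MALLOC() (the real macro is not reproduced here, and `dup_string` is a hypothetical helper). It shows why zero-filling is redundant when every byte is overwritten immediately, and where the terminator that the message asks reviewers to check comes from.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate a string into freshly allocated memory.
 * malloc() stands in for GF_MALLOC() in this sketch. */
static char *
dup_string(const char *src)
{
    size_t len = strlen(src);      /* computed once and reused below */
    char *copy = malloc(len + 1);  /* +1 for the terminating NUL byte */

    if (copy == NULL)
        return NULL;

    /* All len + 1 bytes, including the terminator, are overwritten
     * here, so clearing the buffer first with calloc() adds nothing. */
    memcpy(copy, src, len + 1);
    return copy;
}
```

The risky case is code that relied on calloc() to supply the terminator implicitly; after switching to malloc(), the copy must include or explicitly write that final byte.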
*	mgmt/glusterd: Code cleanup in glusterd-volgen.c	Vijay Bellur	2018-08-19	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch does the following: 1. Addresses CID: 1124815,124816,1124833,1291724,1325535,1325536,1357858 - by adding some null checks - by handling return values from functions - by using an appropriate buffer length in strncpy 2. Cleans up some commented code Change-Id: I5a7079f34e3e460d5a6267734c3bc84bf4ad72f5 updates: bz#789278 Signed-off-by: Vijay Bellur <vbellur@redhat.com>
*	rpc: fix return value in rpc destroy	Zhang Huan	2018-07-28	1	-0/+2
\| \| \| \| \| \|	Change-Id: I73a113e2d40f508fd53b273a990a2371692c87bf fixes: bz#1607689 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
*	rpc: add missing free of rpc->dnscache	Zhang Huan	2018-07-28	1	-0/+8
\| \| \| \| \| \|	Change-Id: I3fa97b99bf23459cf548205d75d2cc7936b2310e fixes: bz#1607689 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
*	build: rename event.h to gf-event.h	Niels de Vos	2018-07-27	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Newer FreeBSD versions (noticed with 10.3-RELEASE) provide an event.h file that on occasion gets included instead of the libglusterfs file. When this happens, 'struct event_pool' will not be defined and building will fail with errors like: autoscale-threads.c:18:55: error: incomplete definition of type 'struct event_pool' int thread_count = pool->eventthreadcount; ~~~~^ autoscale-threads.c:17:16: note: forward declaration of 'struct event_pool' struct event_pool *pool = ctx->event_pool; ^ This problem is caused by 'pkg-config --cflags uuid', which adds /usr/local/include to the GF_CPPFLAGS. The use of libuuid is preferred so that the contrib/uuid/ directory can be removed. By renaming event.h to gf-event.h there is no longer a conflict between the different event.h files, and compiling on FreeBSD works without issues. Change-Id: Ie69f6b8a4f8f8e9630d39a86693eb74674f0f763 Updates: bz#1607319 Signed-off-by: Niels de Vos <ndevos@redhat.com>
*	rpc: rpc_clnt_connection_cleanup is crashed due to double free	Mohit Agrawal	2018-07-25	1	-3/+17
\| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: The gfapi client crashes in rpc_clnt_connection_cleanup while destroying saved_frames. Solution: The crash happens because the saved_frame ptr is already freed in rpc_clnt_destroy. To avoid this, update the code in rpc_clnt_destroy. Change-Id: Id8cce102b49f26cfd86ef88257032ed98f43192b fixes: bz#1607783 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
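One common way to rule out this kind of double free is to detach the pointer under a lock so that only one code path ever frees it. The sketch below illustrates that general technique and is not the actual patch; `struct conn`, `struct frames`, `conn_take_frames` and `conn_release_frames` are all hypothetical names.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the connection and its saved frames. */
struct frames {
    int count;
};

struct conn {
    pthread_mutex_t lock;
    struct frames *saved_frames;
};

/* Detach the pointer under the lock; a second caller gets NULL. */
static struct frames *
conn_take_frames(struct conn *c)
{
    struct frames *f;

    pthread_mutex_lock(&c->lock);
    f = c->saved_frames;
    c->saved_frames = NULL;
    pthread_mutex_unlock(&c->lock);
    return f;
}

/* Free the frames at most once. Returns 1 if this call freed them,
 * 0 if another path had already taken ownership. */
static int
conn_release_frames(struct conn *c)
{
    struct frames *f = conn_take_frames(c);

    if (f == NULL)
        return 0;
    free(f);
    return 1;
}
```

If both a cleanup path and a destroy path call `conn_release_frames`, whichever runs second sees NULL and does nothing.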
*	All: run codespell on the code and fix issues.	Yaniv Kaul	2018-07-22	2	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Please review, it's not always just the comments that were fixed. I've had to revert of course all calls to creat() that were changed to create() ... Only compile-tested! Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	changelog: fix br-state-check.t crash for brick_mux	Mohit Agrawal	2018-07-11	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Problem: br-state-check.t crashes. Solution: Check the condition in rpcsvc_request_create before allocating memory from rxpool. BUG: 1597776 Change-Id: I4fde1ade6073f603c32453f1840395db9a9155b7 fixes: bz#1597776 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	Fix compile warnings	Xavi Hernandez	2018-07-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	This patch fixes compile warnings that appear with newer compilers. The solution applied is only to remove the warnings, but it doesn't always solve the problem in the best way. It assumes that the problem will never happen, as the previous code assumed. Change-Id: I6e8470d6c2e2dbd3bd7d324b5fd2f92ffdc3d6ec updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	rpc/clnt: Don't let consumers manage "connected" state	Raghavendra G	2018-06-04	3	-53/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The state management of "connected" in rpc is ad-hoc as far as responsibility goes. Note that there is nothing wrong with the functionality itself. The rpc layer manages this state in the disconnect codepath and exposes an api that lets consumers manage it too. Note that the rpc layer never sets "connected" to true by itself, which forces consumers to use this api to get a working rpc connection. The situation is best captured by a comment in the code from Jeff Darcy in glusterfsd/src/gf-attach.c: -/* - * In a sane world, the generic RPC layer would be capable of tracking - * connection status by itself, with no help from us. It might invoke our - * callback if we had registered one, but only to provide information. Sadly, - * we don't live in that world. Instead, the callback must exist and must - * call rpc_clnt_{set,unset}_connected, because that's the only way those - * fields get set (with RPC both above and below us on the stack). If we don't - * do that, then rpc_clnt_submit doesn't think we're connected even when we - * are. It calls the socket code to reconnect, but the socket code tracks this - * stuff in a sane way so it knows we're connected and returns EINPROGRESS. - * Then we're stuck, connected but unable to use the connection. To make it - * work, we define and register this trivial callback. - */ Also, consumers of rpc learn about the state of the connection only through the notifications sent by rpc-clnt. So consumers don't have any extra information with which to manage the state, and letting them manage it is counterintuitive. This patch cleans that up and moves the responsibility for managing this state into the rpc layer itself. Change-Id: I31e641a60795fc480ca753917f4b2579f1e05094 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1585585
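The design this message argues for can be sketched as below: the layer updates its own "connected" flag when a transport event arrives, and the consumer callback is optional and purely informational. This is an illustration under assumed names (`struct client`, `client_handle_event`, `client_submit`, `enum conn_event`), none of which come from the glusterfs sources.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transport events seen by the layer. */
enum conn_event { EV_CONNECT, EV_DISCONNECT };

/* Consumer callback: receives information only, owns no state. */
typedef void (*notify_fn)(void *data, enum conn_event ev);

struct client {
    bool connected;   /* owned and updated by the layer itself */
    notify_fn notify; /* optional; may be NULL */
    void *data;
};

/* The layer records the new state first, then informs the consumer. */
static void
client_handle_event(struct client *cl, enum conn_event ev)
{
    cl->connected = (ev == EV_CONNECT);
    if (cl->notify != NULL)
        cl->notify(cl->data, ev);
}

/* Submission depends only on state the layer tracked on its own.
 * Returns 0 if the request could be sent, -1 if not connected. */
static int
client_submit(const struct client *cl)
{
    return cl->connected ? 0 : -1;
}
```

In this arrangement a consumer that registers no callback at all still gets a usable connection, which is the situation the quoted comment says was impossible before.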