glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	across: coverity fixes	Amar Tumballi	2019-06-03	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* locks/posix.c: key was not freed in one of the cases. * locks/common.c: lock was being free'd out of context. * nfs/exports: handle case of missing free. * protocol/client: handle case of entry not freed. * storage/posix: handle possible case of double free CID: 1398628, 1400731, 1400732, 1400756, 1124796, 1325526 updates: bz#789278 Change-Id: Ieeaca890288bc4686355f6565f853dc8911344e8 Signed-off-by: Amar Tumballi <amarts@redhat.com> Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
*	Fix some "Null pointer dereference" coverity issues	Xavi Hernandez	2019-05-26	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes the following CID's: * 1124829 * 1274075 * 1274083 * 1274128 * 1274135 * 1274141 * 1274143 * 1274197 * 1274205 * 1274210 * 1274211 * 1288801 * 1398629 Change-Id: Ia7c86cfab3245b20777ffa296e1a59748040f558 Updates: bz#789278 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	protocol: remove compound fop	Amar Tumballi	2019-04-29	4	-2851/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Compound fops are kept on wire as a backward compatibility with older AFR modules. The AFR module used beyond 4.x releases are not using compound fops. Hence removing the compound fop in the protocol code. Note that, compound-fops was already an 'option' in AFR, and completely removed since 4.1.x releases. So, point to note is, with this change, we have 2 ways to upgrade when clients of 3.x series are present. i) set 'use-compound-fops' option to 'false' on any volume which is of replica type. And then upgrade the servers. ii) Do a two step upgrade. First from current version (which will already be EOL if it's using compound) to a 4.1..6.x version, and then an upgrade to 7.x. Consider the overall code which we are removing for the option seems quite high, I believe it is worth it. updates: bz#1693692 Signed-off-by: Amar Tumballi <amarts@redhat.com> Change-Id: I0a8876d0367a15e1410ec845f251d5d3097ee593
*	Replace memdup() with gf_memdup()	Vijay Bellur	2019-04-12	1	-1/+1
\| \| \| \| \| \| \| \| \|	memdup() and gf_memdup() have the same implementation. Removed one API as the presence of both can be confusing. Change-Id: I562130c668457e13e4288e592792872d2e49887e updates: bz#1193929 Signed-off-by: Vijay Bellur <vbellur@redhat.com>
*	client/fini: return fini after rpc cleanup	Mohammed Rafi KC	2019-04-11	2	-6/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a race condition in rpc_transport later and client fini. Sequence of events to happen the race condition 1) When we want to destroy a graph, we send a parent down event first 2) Once parent down received on a client xlator, we will initiates a rpc disconnect 3) This will in turn generates a child down event. 4) When we process child down, we first do fini for Every xlator 5) On successful return of fini, we delete the graph Here after the step 5, there is a chance that the fini on client might not be finished. Because an rpc_tranpsort ref can race with the above sequence. So we have to wait till all rpc's are successfully freed before returning the fini from client Change-Id: I20145662d71fb837e448a4d3210d1fcb2855f2d4 fixes: bz#1659708 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	protocol: add an option to force using old-protocol	Amar Tumballi	2019-04-10	3	-3/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	As protocol implements every fop, and in general a large part of the codebase. Considering our regression is run mostly in 1 machine, there was no way of forcing the client to use old protocol (while new one is available). With this patch, a new 'testing' option is provided which forces client to use old protocol if found. This should help increase the code coverage by at least 10k lines overall. updates: bz#1693692 Change-Id: Ie45256f7dea250671b689c72b4b6f25037cef948 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	mgmt/shd: Implement multiplexing in self heal daemon	Mohammed Rafi KC	2019-04-01	1	-1/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Shd daemon is per node, which means they create a graph with all volumes on it. While this is a great for utilizing resources, it is so good in terms of performance and managebility. Because self-heal daemons doesn't have capability to automatically reconfigure their graphs. So each time when any configurations changes happens to the volumes(replicate/disperse), we need to restart shd to bring the changes into the graph. Because of this all on going heal for all other volumes has to be stopped in the middle, and need to restart all over again. Solution: This changes makes shd as a per volume daemon, so that the graph will be generated for each volumes. When we want to start/reconfigure shd for a volume, we first search for an existing shd running on the node, if there is none, we will start a new process. If already a daemon is running for shd, then we will simply detach a graph for a volume and reatach the updated graph for the volume. This won't touch any of the on going operations for any other volumes on the shd daemon. Example of an shd graph when it is per volume graph ----------------------- \| debug-iostat \| ----------------------- / \| \ / \| \ --------- --------- ---------- \| AFR-1 \| \| AFR-2 \| \| AFR-3 \| -------- --------- ---------- A running shd daemon with 3 volumes will be like--> graph ----------------------- \| debug-iostat \| ----------------------- / \| \ / \| \ ------------ ------------ ------------ \| volume-1 \| \| volume-2 \| \| volume-3 \| ------------ ------------ ------------ Change-Id: Idcb2698be3eeb95beaac47125565c93370afbd99 fixes: bz#1659708 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	protocol/client: Do not fallback to anon-fd if fd is not open	Pranith Kumar K	2019-03-31	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an open comes on a file when a brick is down and after the brick comes up, a fop comes on the fd, client xlator would still wind the fop on anon-fd leading to wrong behavior of the fops in some cases. Example: If lk fop is issued on the fd just after the brick is up in the scenario above, lk fop will be sent on anon-fd instead of failing it on that client xlator. This lock will never be freed upon close of the fd as flush on anon-fd is invalid and is not wound below server xlator. As a fix, failing the fop unless the fd has FALLBACK_TO_ANON_FD flag. Change-Id: I77692d056660b2858e323bdabdfe0a381807cccc fixes bz#1390914 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	client-rpc: Fix the payload being sent on the wire	Poornima G	2019-03-29	6	-244/+308
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The fops allocate 3 kind of payload(buffer) in the client xlator: - fop payload, this is the buffer allocated by the write and put fop - rsphdr paylod, this is the buffer required by the reply cbk of some fops like lookup, readdir. - rsp_paylod, this is the buffer required by the reply cbk of fops like readv etc. Currently, in the lookup and readdir fop the rsphdr is sent as payload, hence the allocated rsphdr buffer is also sent on the wire, increasing the bandwidth consumption on the wire. With this patch, the issue is fixed. Fixes: bz#1692093 Change-Id: Ie8158921f4db319e60ad5f52d851fa5c9d4a269b Signed-off-by: Poornima G <pgurusid@redhat.com>
*	clang: Fix various missing checks for empty list	ShyamsundarR	2018-12-14	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When using list_for_each_entry(_safe) functions, care needs to be taken that the list passed in are not empty, as these functions are not empty list safe. clag scan reported various points where this this pattern could be caught, and this patch fixes the same. Additionally the following changes are present in this patch, - Added an explicit op_ret setting in error case in the macro MAKE_INODE_HANDLE to address another clang issue reported - Minor refactoring of some functions in quota code, to address possible allocation failures in certain functions (which in turn cause possible empty lists to be passed around) Change-Id: I1e761a8d218708f714effb56fa643df2a3ea2cc7 Updates: bz#1622665 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	copy_file_range support in GlusterFS	Raghavendra Bhat	2018-12-12	6	-0/+265
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* libglusterfs changes to add new fop * Fuse changes: - Changes in fuse bridge xlator to receive and send responses * posix changes to perform the op on the backend filesystem * protocol and rpc changes for sending and receiving the fop * gfapi changes for performing the fop * tools: glfs-copy-file-range tool for testing copy_file_range fop - Although, copy_file_range support has been added to the upstream fuse kernel module, no release has been made yet of a kernel which contains the support. It is expected to come in the upcoming release of linux-4.20 So, as of now, executing copy_file_range fop on a fused based filesystem results in fuse kernel module sending read on the source fd and write on the destination fd. Therefore a small gfapi based tool has been written to be able test the copy_file_range fop. This tool is similar (in functionality) to the example program given in copy_file_range man page. So, running regular copy_file_range on a fuse mount point and running gfapi based glfs-copy-file-range tool gives some idea about how fast, the copy_file_range (or reflink) can be. On the local machine this was the result obtained. mount -t glusterfs workstation:new /mnt/glusterfs [root@workstation ~]# cd /mnt/glusterfs/ [root@workstation glusterfs]# ls file [root@workstation glusterfs]# cd [root@workstation ~]# time /tmp/a.out /mnt/glusterfs/file /mnt/glusterfs/new real 0m6.495s user 0m0.000s sys 0m1.439s [root@workstation ~]# time glfs-copy-file-range $(hostname) new /tmp/glfs.log /file /rrr OPEN_SRC: opening /file is success OPEN_DST: opening /rrr is success FSTAT_SRC: fstat on /rrr is success copy_file_range successful real 0m0.309s user 0m0.039s sys 0m0.017s This tool needs following arguments 1) hostname 2) volume name 3) log file path 4) source file path (relative to the gluster volume root) 5) destination file path (relative to the gluster volume root) "glfs-copy-file-range <hostname> <volume> <log file path> <source> <destination>" - Added a testcase as well to run glfs-copy-file-range tool * io-stats changes to capture the fop for profiling * NOTE: - Added conditional check to see whether the copy_file_range syscall is available or not. If not, then return ENOSYS. - Added conditional check for kernel minor version in fuse_kernel.h and fuse-bridge while referring to copy_file_range. And the kernel minor version is kept as it is. i.e. 24. Increment it in future when there is a kernel release which contains the support for copy_file_range fop in fuse kernel module. * The document which contains a writeup on this enhancement can be found at https://docs.google.com/document/d/1BSILbXr_knynNwxSyyu503JoTz5QFM_4suNIh2WwrSc/edit Change-Id: I280069c814dd21ce6ec3be00a884fc24ab692367 updates: #536 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
*	all: add xlator_api to many translators	Amar Tumballi	2018-12-06	1	-0/+15
\| \| \| \| \| \|	Fixes: #164 Change-Id: I93ad6f0232a1dc534df099059f69951e1339086f Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	libglusterfs: Move devel headers under glusterfs directory	ShyamsundarR	2018-12-05	12	-36/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	libglusterfs devel package headers are referenced in code using include semantics for a program, this while it works can be better especially when dealing with out of tree xlator builds or in general out of tree devel package usage. Towards this, the following changes are done, - moved all devel headers under a glusterfs directory - Included these headers using system header notation <> in all code outside of libglusterfs - Included these headers using own program notation "" within libglusterfs This change although big, is just moving around the headers and making it correct when including these headers from other sources. This helps us correctly include libglusterfs includes without namespace conflicts. Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b Updates: bz#1193929 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	Multiple xlator .h files: remove unused private gf_* memory types.	Yaniv Kaul	2018-11-30	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	It seems there were quite a few unused enums (that in turn cause unndeeded memory allocation) in some xlators. I've removed them, hopefully not causing any damage. Compile-tested only! updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: I8252bd763dc1506e2d922496d896cd2fc0886ea7
*	core: create a constant for default network timeout	Xavi Hernandez	2018-11-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	A new constant named GF_NETWORK_TIMEOUT has been defined and all references to the hard-coded timeout of 42 seconds have been replaced with this constant. Change-Id: Id30f5ce4f1230f9288d9e300538624bcf1a6da27 fixes: bz#1652852 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	protocol/client: unchecked return value	Shwetha Acharya	2018-11-20	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In client_process_response_v2, value returned from function client_post_common_dict is not checked for errors before being used. Solution: Added a check condition to resolve the issue. CID: 1390020 Change-Id: I4d297f33c8dd332ae5f6f21667a4871133b2b570 updates: bz#789278 Signed-off-by: Shwetha Acharya <sacharya@redhat.com>
*	all: fix the format string exceptions	Amar Tumballi	2018-11-05	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, there are possibilities in few places, where a user-controlled (like filename, program parameter etc) string can be passed as 'fmt' for printf(), which can lead to segfault, if the user's string contains '%s', '%d' in it. While fixing it, makes sense to make the explicit check for such issues across the codebase, by making the format call properly. Fixes: CVE-2018-14661 Fixes: bz#1644763 Change-Id: Ib547293f2d9eb618594cbff0df3b9c800e88bde4 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	protocol: remove the option 'verify-volfile-checksum'	Amar Tumballi	2018-11-05	1	-84/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	'getspec' operation is not used between 'client' and 'server' ever since we have off-loaded volfile management to glusterd, ie, at least 7 years. No reason to keep the dead code! The removed option had no meaning, as glusterd didn't provide a way to set (or unset) this option. So, no regression should be observed from any of the existing glusterfs deployment, supported or unsupported. Updates: CVE-2018-14653 Updates: bz#1644756 Change-Id: I4a2e0f673c5bcd4644976a61dbd2d37003a428eb Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	Land part 2 of clang-format changes	Gluster Ant	2018-09-12	8	-19701/+19598
\| \| \| \| \|	Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
*	Land clang-format changes	Gluster Ant	2018-09-12	4	-789/+733
\| \| \| \|	Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
*	Multiple files: calloc -> malloc	Yaniv Kaul	2018-09-04	1	-21/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	xlators/storage/posix/src/posix-inode-fd-ops.c: xlators/storage/posix/src/posix-helpers.c: xlators/storage/bd/src/bd.c: xlators/protocol/client/src/client-lk.c: xlators/performance/quick-read/src/quick-read.c: xlators/performance/io-cache/src/page.c xlators/nfs/server/src/nfs3-helpers.c xlators/nfs/server/src/nfs-fops.c xlators/nfs/server/src/mount3udp_svc.c xlators/nfs/server/src/mount3.c xlators/mount/fuse/src/fuse-helpers.c xlators/mount/fuse/src/fuse-bridge.c xlators/mgmt/glusterd/src/glusterd-utils.c xlators/mgmt/glusterd/src/glusterd-syncop.h xlators/mgmt/glusterd/src/glusterd-snapshot.c xlators/mgmt/glusterd/src/glusterd-rpc-ops.c xlators/mgmt/glusterd/src/glusterd-replace-brick.c xlators/mgmt/glusterd/src/glusterd-op-sm.c xlators/mgmt/glusterd/src/glusterd-mgmt.c xlators/meta/src/subvolumes-dir.c xlators/meta/src/graph-dir.c xlators/features/trash/src/trash.c xlators/features/shard/src/shard.h xlators/features/shard/src/shard.c xlators/features/marker/src/marker-quota.c xlators/features/locks/src/common.c xlators/features/leases/src/leases-internal.c xlators/features/gfid-access/src/gfid-access.c xlators/features/cloudsync/src/cloudsync-plugins/src/cloudsyncs3/src/libcloudsyncs3.c xlators/features/bit-rot/src/bitd/bit-rot.c xlators/features/bit-rot/src/bitd/bit-rot-scrub.c bxlators/encryption/crypt/src/metadata.c xlators/encryption/crypt/src/crypt.c xlators/performance/md-cache/src/md-cache.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible It doesn't make sense to calloc (allocate and clear) memory when the code right away fills that memory with data. It may be optimized by the compiler, or have a microscopic performance improvement. In some cases, also changed allocation size to be sizeof some struct or type instead of a pointer - easier to read. In some cases, removed redundant strlen() calls by saving the result into a variable. 1. Only done for the straightforward cases. There's room for improvement. 2. Please review carefully, especially for string allocation, with the terminating NULL string. Only compile-tested! .. and allocate memory as much as needed. xlators/nfs/server/src/mount3.c : Don't blindly allocate PATH_MAX, but strlen() the string and allocate appropriately. Also, align error messges. updates: bz#1193929 Original-Author: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: Ibda6f33dd180b7f7694f20a12af1e9576fe197f5
*	protocol: coverity fixes	Bhumika Goyal	2018-08-22	2	-8/+4
\| \| \| \| \| \| \| \|	Fixes CID: 1389388 1389320 1274113 1388881 1388623 1124801 1124795 Change-Id: Ia72abc0560c959b0298f42e25abdfc5523755569 updates: bz#789278 Signed-off-by: Bhumika Goyal <bgoyal@redhat.com>
*	xlators: protocol: Fix deferencing pointer after free coverity issues	Bhumika Goyal	2018-08-21	3	-32/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The pointer of type struct iobuf * is getting dereferenced after getting freed by iobuf_unref function. Therefore, move this function after all the dereferences of this pointer type. Also, it is useful coding standard to have iobuf_unref just after iobref_add. So, move iobref_add too. Occurences found using Coccinelle script: @@ identifier rsphdr_iobuf; expression E; identifier func; @@ iobuf_unref(rsphdr_iobuf); ... E = func(rsphdr_iobuf); Fixes CID: 1390517, 1390278, 1388666, 1356588, 1356587 at [1]. and also some more occurences which were found using the above script but not caught by Coverity. [1]. https://scan6.coverity.com/reports.htm#v42388/p10714/fileInstanceId=84384920&defectInstanceId=25600709&mergedDefectId=1388666 Change-Id: I579e9d12698f14e9e24bc926c6efef16bac5c06c updates: bz#789278 Signed-off-by: Bhumika Goyal <bgoyal@redhat.com>
*	build: rename event.h to gf-event.h	Niels de Vos	2018-07-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Newer FreeBSD versions (noticed with 10.3-RELEASE) provide a event.h file that on occasion gets included instead of the libglusterfs file. When this happens, 'struct event_pool' will not be defined and building will fail with errors like: autoscale-threads.c:18:55: error: incomplete definition of type 'struct event_pool' int thread_count = pool->eventthreadcount; ~~~~^ autoscale-threads.c:17:16: note: forward declaration of 'struct event_pool' struct event_pool *pool = ctx->event_pool; ^ This problem is caused by 'pkg-config --cflags uuid' that adds /usr/local/include to the GF_CPPFLAGS. The use of libuuid is preferred so that the contrib/uuid/ directory can be removed. By renaming event.h to gf-event.h there is no conflict between the different event.h files anymore and compiling on FreeBSD works without issues. Change-Id: Ie69f6b8a4f8f8e9630d39a86693eb74674f0f763 Updates: bz#1607319 Signed-off-by: Niels de Vos <ndevos@redhat.com>
*	All: run codespell on the code and fix issues.	Yaniv Kaul	2018-07-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Please review, it's not always just the comments that were fixed. I've had to revert of course all calls to creat() that were changed to create() ... Only compile-tested! Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	protocol/client: handle the fdctx_destroy properly with different versions	Amar Tumballi	2018-07-05	2	-74/+133
\| \| \| \| \| \| \| \|	while adding the new version of RPC, this part was not handled properly Updates: bz#1193929 Change-Id: If4cc4c2db075221b9ed731bacb7cc035f7007c5b Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	client: remove the "connecting" state - it's not used	Michael Adam	2018-06-21	3	-6/+0
\| \| \| \| \| \| \| \| \|	The "connecting" state is not used anywhere really. It's only being set and printed. So remove it. Change-Id: I11fc8b0bdcda5a812d065543aa447d39957d3b38 fixes: bz#1583583 Signed-off-by: Michael Adam <obnox@samba.org>
*	protocol/client: Remove code duplication	Krutika Dhananjay	2018-06-15	3	-119/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	client_submit_vec_request() which is used by WRITEV, and PUT and client_submit_request() used by the rest of the fops have almost similar code. However, there have been some more checks - such as whether setvolume was successful or not, and one more that is send-gid-specific - that have been missed out in the vectored version of the function. This patch fixes this code duplication. Change-Id: I363a28eeead6219cb1009dc831538153e8bd7d40 fixes: bz#1591580 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	rpc/clnt: Don't let consumers manage "connected" state	Raghavendra G	2018-06-04	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The state management of "connected" in rpc is ad-hoc as far as the responsibility goes. Note that there is nothing wrong with functionality itself. rpc layer manages this state in disconnect codepath and has exposed an api to manage this one from consumers. Note that rpc layer never sets "connected" to true by itself, which forces the consumers to use this api to get a working rpc connection. The situation is best captured from a comment in code from Jeff Darcy in glusterfsd/src/gf-attach.c: -/* - * In a sane world, the generic RPC layer would be capable of tracking - * connection status by itself, with no help from us. It might invoke our - * callback if we had registered one, but only to provide information. Sadly, - * we don't live in that world. Instead, the callback must exist and must - * call rpc_clnt_{set,unset}_connected, because that's the only way those - * fields get set (with RPC both above and below us on the stack). If we don't - * do that, then rpc_clnt_submit doesn't think we're connected even when we - * are. It calls the socket code to reconnect, but the socket code tracks this - * stuff in a sane way so it knows we're connected and returns EINPROGRESS. - * Then we're stuck, connected but unable to use the connection. To make it - * work, we define and register this trivial callback. - */ Also, consumers of rpc know about state of connection only through the notifications sent by rpc-clnt. So, consumers don't have any extra information to manage the state and hence letting them manage the state is counter intuitive. This patch cleans that up and instead moves the responsibility of state management of rpc layer into itself. Change-Id: I31e641a60795fc480ca753917f4b2579f1e05094 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1585585
*	protocol/client: Don't send fops till SETVOLUME is complete	Raghavendra G	2018-05-31	2	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	An earlier commit set conf->connected just after rpc layer sends RPC_CLNT_CONNECT event. However, success of socket level connection connection doesn't indicate brick stack is ready to receive fops, as an handshake has to be done b/w client and server after RPC_CLNT_CONNECT event. Any fop sent to brick in the window between, * protocol/client receiving RPC_CLNT_CONNECT event * protocol/client receiving a successful setvolume response can end up accessing an uninitialized brick stack. So, set conf->connected only after a successful SETVOLUME. Change-Id: I139a03d2da6b0d95a0d68391fcf54b00e749decf fixes: bz#1583937 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
*	client/protocol: fix the log level for removexattr_cbk	Amar Tumballi	2018-05-17	2	-2/+12
\| \| \| \| \| \| \| \| \| \|	noticed that server protocol actually logs all the errors for removexattr as INFO, instead of WARNING like client, and hence, doesn't create a confusion in user. updates: bz#1576418 Change-Id: Ia6681e9ee433fda3c77a4509906c78333396e339 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	protocol: Fix 4.0 client, parsing older iatt in dict	ShyamsundarR	2018-03-10	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In a mixed mode cluster involving 4.0 and older 3.x bricks, if clients are newer, then the iatt encoded in the dictionary can be of the older iatt format, which a newer client will map incorrectly to the newer structure. This causes failures in FOPs that depend on this iatt for some functionality (seen in mkdir operations failing as EIO, when DHT hits its internal setxattr call). The fix provided is to convert the iatt in the dict, based on which RPC version is used to communicate with the server. IOW, this is the reverse of change in commit "b966c7790e" Tested using a mixed mode cluster (i.e bricks in 3.12 and 4.0 versions) and a mixed set of clients, 3.12 and 4.0 clients. There is no regression test provided, as this needs a mixed mode cluster to test and validate. Change-Id: I454e54651ca836b9f7c28f45f51d5956106aefa9 BUG: 1554053 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	protocol/client: fix memory corruption	Xavi Hernandez	2018-03-09	6	-92/+78
\| \| \| \| \| \| \| \| \| \| \| \| \|	There was an issue when some accesses to saved_fds list were protected by the wrong mutex (lock instead of fd_lock). Additionally, the retrieval of fdctx from fd's context and any checks done on it have also been protected by fd_lock to avoid fdctx to become outdated just after retrieving it. Change-Id: If2910508bcb7d1ff23debb30291391f00903a6fe BUG: 1553129 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	protcol/client: Insert dummy clnt-lk-version to avoid upgrade failure	Anoop C S	2018-02-14	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With https://review.gluster.org/#/c/12363/ being merged, we no longer send client's lk-version to server side and the corresponding check on server is also removed. But when clients are upgraded prior to servers, the check for lk-version at server side fails and is reported back to clients resulting in disconnection. Since we don't have lock-recovery (lk-version and grace-timeout) logic anymore in code base our best bet would be to add client's default lk-version i.e, 1, into the dictionary just to make server side check pass and continue with remaining SETVOLUME operations. Change-Id: I441b67bd271d1e9ba9a7c08703e651c7a6bd945b BUG: 1544699 Signed-off-by: Anoop C S <anoopcs@redhat.com>
*	protocol: utilize the version 4 xdr	Amar Tumballi	2018-02-01	3	-73/+360
\| \| \| \| \| \| \|	updates #384 Change-Id: Id80bf470988dbecc69779de9eb64088559cb1f6a Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	protocol: Implement put fop	Poornima G	2018-01-31	4	-0/+201
\| \| \| \| \| \|	Updates #353 Change-Id: I755b9208690be76935d763688fa414521eba3a40 Signed-off-by: Poornima G <pgurusid@redhat.com>
*	quiesce, gfproxy: Implement failover across multiple gfproxy nodes	Poornima G	2018-01-30	2	-48/+34
\| \| \| \| \| \|	Updates: #242 Change-Id: I767e574a26e922760a7130bd209c178d74e8cf69 Signed-off-by: Poornima G <pgurusid@redhat.com>
*	protocol: Remove lock recovery logic from client and server	Anoop C S	2018-01-29	5	-589/+14
\| \| \| \| \| \|	Change-Id: I27f5e1e34fe3eac96c7dd88e90753fb5d3d14550 BUG: 1272030 Signed-off-by: Anoop C S <anoopcs@redhat.com>
*	protocol: make on-wire-change of protocol using new XDR definition.	Amar Tumballi	2018-01-19	7	-295/+9029
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With this patchset, some major things are changed in XDR, mainly: * Naming: Instead of gfs3/gfs4 settle for gfx_ for xdr structures * add iattx as a separate structure, and add conversion methods * the _rsp structure is now changed, and is also reduced in number (ie, no need for different strucutes if it is similar to other response). use proper XDR methods for sending dict on wire. Also, with the change of xdr structure, there are changes needed outside of xlator protocol layer to handle these properly. Mainly because the abstraction was broken to support 0-copy RDMA with payload for write and read FOP. This made transport layer know about the xdr payload, hence with the change of xdr payload structure, transport layer needed to know about the change. Updates #384 Change-Id: I1448fbe9deab0a1b06cb8351f2f37488cefe461f Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	rpc/*: auth-header changes	Amar Tumballi	2018-01-17	1	-10/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce another authentication header which can now send more data. This is useful because this data can be common for all the fops, and we don't need to change all the signatures. As part of this, made rpc-clnt.c little more modular to support multiple authentication structures. stack.h changes are placeholder for the ctime etc, can be moved later based on need. updates #384 Change-Id: I6111c13cfd2ec92e2b4e9295896bf62a8a33b2c7 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	locks: added inodelk/entrylk contention upcall notifications	Xavier Hernandez	2018-01-16	2	-1/+100
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The locks xlator now is able to send a contention notification to the current owner of the lock. This is only a notification that can be used to improve performance of some client side operations that might benefit from extended duration of lock ownership. Nothing is done if the lock owner decides to ignore the message and to not release the lock. For forced release of acquired resources, leases must be used. Change-Id: I7f1ad32a0b4b445505b09908a050080ad848f8e0 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
*	protocol/client: reduce lock contention	Zhang Huan	2017-12-26	6	-58/+53
\| \| \| \| \| \| \| \| \| \| \| \|	Current use of a per-client mutex to protect fdctx introduces lock contentions when there are dozens of file operations active. Use finer grain spinlock to reduce contention, and put retrieving fdctx out of lock. Change-Id: Iea3e2eb481e76a5d73a582ba81529180c5b88248 BUG: 1519598 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
*	all: Simplify component message id's definition	Xavier Hernandez	2017-12-14	1	-634/+78
\| \| \| \| \| \| \| \| \|	This patch creates a new way of defining message id's that is easier and less error prone because it doesn't require so many manual changes each time a new component is defined or a new message created. Change-Id: I71ba8af9ac068f5add7e74f316a2478bc991c67b Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
*	glusterfs: Use gcc builtin ATOMIC operator to increase/decreate refcount.	Mohit Agrawal	2017-12-12	2	-13/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In glusterfs code base we call mutex_lock/unlock to take reference/dereference for a object.Sometime it could be reason for lock contention also. Solution: There is no need to use mutex to increase/decrease ref counter, instead of using mutex use gcc builtin ATOMIC operation. Test: I have not observed yet how much performance gain after apply this patch specific to glusterfs but i have tested same with below small program(mutex and atomic both) and get good difference. static int numOuterLoops; static void * threadFunc(void arg) { int j; for (j = 0; j < numOuterLoops; j++) { __atomic_add_fetch (&glob, 1,__ATOMIC_ACQ_REL); } return NULL; } int main(int argc, char argv[]) { int opt, s, j; int numThreads; pthread_t *thread; int verbose; int64_t n = 0; if (argc < 2 ) { printf(" Please provide 2 args Num of threads && Outer Loop\n"); exit (-1); } numThreads = atoi(argv[1]); numOuterLoops = atoi (argv[2]); if (1) { printf("\tthreads: %d; outer loops: %d;\n", numThreads, numOuterLoops); } thread = calloc(numThreads, sizeof(pthread_t)); if (thread == NULL) { printf ("calloc error so exit\n"); exit (-1); } __atomic_store (&glob, &n, __ATOMIC_RELEASE); for (j = 0; j < numThreads; j++) { s = pthread_create(&thread[j], NULL, threadFunc, NULL); if (s != 0) { printf ("pthread_create failed so exit\n"); exit (-1); } } for (j = 0; j < numThreads; j++) { s = pthread_join(thread[j], NULL); if (s != 0) { printf ("pthread_join failed so exit\n"); exit (-1); } } printf("glob value is %ld\n",__atomic_load_n (&glob,__ATOMIC_RELAXED)); exit(0); } time ./thr_count 800 800000 threads: 800; outer loops: 800000; glob value is 640000000 real 1m10.288s user 0m57.269s sys 3m31.565s time ./thr_count_atomic 800 800000 threads: 800; outer loops: 800000; glob value is 640000000 real 0m20.313s user 1m20.558s sys 0m0.028 Change-Id: Ie5030a52ea264875e002e108dd4b207b15ab7cc7 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	rio/everywhere: add icreate/namelink fop	Susant Palai	2017-12-05	2	-0/+254
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	icreate creates inode, while namelink links the basename to it's parent gfid. For now mkdir is the primary user of these fops. Better distribution is acheived by creating the inode on ,(say) mds1 and linking the basename to it's parent gfid on mds2. The inode serves readdirp, stat etc. More details about the fops are present at: https://review.gluster.org/#/c/13395/3/design/DHT2/DHT2_Icreate_Namelink_Notes.md This backport of three patches from experimental branch. 1- https://review.gluster.org/#/c/18085/ 2- https://review.gluster.org/#/c/18086/ 3- https://review.gluster.org/#/c/18094/ Updates gluster/glusterfs#243 Change-Id: I1bd3d5a441a3cfab1acfeb52f15c6c867d362592 Signed-off-by: Susant Palai <spalai@redhat.com>
*	protocol/client: Update xlator options table	Kaushal M	2017-11-28	1	-13/+32
\| \| \| \| \| \| \|	Updates #302 Change-Id: Ia78e5d8f7b9ee6410965296808ad316c3cfb1d61 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	rpc : Change the way client uuid is built	Poornima G	2017-11-20	1	-3/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Today the main users of client uuid are protocol layers, locks, leases. Protocol layers requires each client uuid to be unique, even across connects and disconnects. Locks and leases on the server side also use the same client uid which changes across file migrations. Which makes the graph switch and file migration tedious for locks and leases. file migration across bricks becomes difficult as client uuid for the same client, is different on the other brick. The exact set of issues exists for leases as well. Solution would be to introduce a constant in the client-uid string which the locks and leases can use to identify the owner client across bricks. Client uuid currently: %s(ctx uuid)-%s(protocol client name)-%d(graph id)%s(setvolume count/reconnect count) Proposed Client uuid: "CTX_ID:%s-GRAPH_ID:%d-PID:%d-HOST:%s-PC_NAME:%s-RECON_NO:%s" - CTX_ID: This is will be constant per client. - GRAPH_ID, PID, HOST, PC_NAME(protocol client name), RECON_NO(setvolume count) remains the same. Change-Id: Ia81d57a9693207cd325d7b26aee4593fcbd6482c BUG: 1369028 Signed-off-by: Susant Palai <spalai@redhat.com>
*	Coverity Issue: PW.INCLUDE_RECURSION in several files	Girjesh Rajoria	2017-11-09	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Coverity ID: 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 423, 424, 425, 426, 427, 428, 429, 436, 437, 438, 439, 440, 441, 442, 443 Issue: Event include_recursion Removed redundant, recursive includes from the files. Change-Id: I920776b1fa089a2d4917ca722d0075a9239911a7 BUG: 789278 Signed-off-by: Girjesh Rajoria <grajoria@redhat.com>
*	rpc: bring a new protocol version	Amar Tumballi	2017-11-07	2	-0/+175
\| \| \| \| \| \| \| \| \| \|	* xdr: add gfid to on wire format for fsetattr/rchecksum * as it is change in on wire XDR format, needed backward compatible RPC programs. Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 827334 Change-Id: Id0a2da3632516dc1a5560dde2b151b2e5f0be8e5
*	xlators/client-rpc-fops: Fix Coverity Issues	Mohammed Azhar Padariyakam	2017-11-06	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Coverity Id: 803, 804 from [1] Problem: USE_AFTER_FREE code at client3_3_lookup: 3371 USE_AFTER_FREE at client3_3_readdir: 5562 Solution: Postponed iobuf_unref(rsp_iobuf) to a location after all dereferences of rsp_iobuf [1]:https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2017-10-30-9aa574a5/html/ Change-Id: I34f90d72b1248e5526ff1d2b3273cdab619cf4f8 BUG: 789278 Signed-off-by: Mohammed Azhar Padariyakam <mpadariy@redhat.com>