glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	glusterd: dump SSL error stack on disconnect	Leonid Ishimnikov	2020-08-13	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When a non-SSL connection is attempted on an SSL-enabled management port, unrelated peers are subsequently disconnected from the node with a misleading error message. Cause: A non-SSL client causes OpenSSL to push a wrong version error into its thread-local error stack, but this error is never cleared, and it lingers in the stack until the thread is used by another SSL session, and a certain condition requires the error stack to be examined, at which time the old error is discovered and the connection is terminated. Solution: Log and clear the error stack upon terminating the connection. Change-Id: I82f3a723285df24dafc88850ae4fca65b69f6ae4 Fixes: #1418 Signed-off-by: Leonid Ishimnikov <lishim@fastmail.com>
*	socket: Use AES128 cipher in SSL if AES is supported by CPU	Mohit Agrawal	2020-05-26	1	-0/+32
\| \| \| \| \| \| \| \| \| \|	SSL performance is improved after configuring AES128 cipher so use AES128 cipher as a default cipher on the CPU those enabled AES bits otherwise ssl use AES256 cipher Change-Id: I91c50fe987cbb22ed76f8012094730c592c63506 Fixes: #1050 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	socket: Resolve ssl_ctx leak for a brick while only mgmt SSL is enabled	Mohit Agrawal	2020-04-28	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: While only mgmt SSL is enabled for a brick process use_ssl flag is false for a brick process and socket api's cleanup ssl_ctx only while use_ssl and ssl_ctx both are valid Solution: To avoid a leak check only ssl_ctx, if it is valid cleanup ssl_ctx Fixes: #1196 Change-Id: I2f4295478f4149dcb7d608ea78ee5104f28812c3 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	rpc: Make ssl log more useful	Mohit Agrawal	2020-04-02	1	-17/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, ssl_setup_connection_params throws 4 messages for every rpc connection that irritates a user while reading the logs. The same info we can print in a single log with peerinfo to make it more useful.ssl_setup_connection_params try to load dh_param even user has not configured it and if a dh_param file is not available it throws a failure message.To avoid the message load dh_param only while the user has configured it. Change-Id: I9ddb57f86a3fa3e519180cb5d88828e59fe0e487 Fixes: #1141 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	socket.c/name.c: minor changes	Yaniv Kaul	2020-01-13	1	-194/+94
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Move functions to static - Remove redundant checks - Use dict_get_...sizen() where applicable - Remove unused variables. - Moved some code to be executed only if relevant. ~3% object size reduction. Change-Id: Id9b8414e0a17442f1dac10ba77014d565756c935 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	transport/socket: destroy notify mutex and condition variable	Dmitry Antipov	2019-12-31	1	-0/+5
\| \| \| \| \| \|	Change-Id: Id74f829dc5c6a30d19e3c3ef42bcb938afc0d8e4 Updates: bz#1430623 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
*	socket: fix typos and drop unused members/options	Dmitry Antipov	2019-12-27	1	-8/+7
\| \| \| \| \| \| \| \| \| \|	Consistently fix 'configued' -> 'configured' typo, remove useless members from 'socket_private_t' and unused 'transport.socket.lowlat' option. Adjust tests as well. Change-Id: I285be196457763aec16b184acd26b90623074dec Updates: bz#1193929 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
*	socket: fix error handling	Xavi Hernandez	2019-12-12	1	-84/+91
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When __socket_proto_state_machine() detected a problem in the size of the request or it couldn't allocate an iobuf of the requested size, it returned -ENOMEM (-12). However the caller was expecting only -1 in case of error. For this reason the error passes undetected initially, adding back the socket to the epoll object. On further processing, however, the error is finally detected and the connection terminated. Meanwhile, another thread could receive a poll_in event from the same connection, which could cause races with the connection destruction. When this happened, the process crashed. To fix this, all error detection conditions have been hardened to be more strict on what is valid and what not. Also, we don't return -ENOMEM anymore. We always return -1 in case of error. An additional change has been done to prevent destruction of the transport object while it may still be needed. Change-Id: I6e59cd81cbf670f7adfdde942625d4e6c3fbc82d Fixes: bz#1782495 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	socket.c: minor changes	Yaniv Kaul	2019-11-19	1	-90/+69
\| \| \| \| \| \| \| \| \| \| \| \|	1. Remove dead code and declarations 2. Move some dict functions to use more efficient ones. 3. Use more constants, where possible. 4. Align messages - easier to grep the code for them. 5. Aligned structures and adding padding where needed. Change-Id: Ifc2639afe65a935fab5238d3e4a121b662836d3d updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	rpc: Cleanup SSL specific data at the time of freeing rpc object	l17zhou	2019-11-08	1	-2/+20
\| \| \| \| \| \| \| \| \| \| \| \|	Problem: At the time of cleanup rpc object ssl specific data is not freeing so it has become a leak. Solution: To avoid the leak cleanup ssl specific data at the time of cleanup rpc object Credits: l17zhou <cynthia.zhou@nokia-sbell.com.cn> Fixes: bz#1768407 Change-Id: I37f598673ae2d7a33c75f39eb8843ccc6dffaaf0
*	glusterd, rpc, glusterfsd: fix coverity defects and put required annotations	Atin Mukherjee	2019-09-10	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	1404965 - Null pointer dereference 1404316 - Program hangs 1401715 - Program hangs 1401713 - Program hangs Updates: bz#789278 Change-Id: I6e6575daafcb067bc910445f82a9d564f43b75a2 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	rpc: glusterd start is failed and throwing an error Address already in use	Mohit Agrawal	2019-08-18	1	-7/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Some of the .t are failed due to bind is throwing an error EADDRINUSE Solution: After killing all gluster processes .t is trying to start glusterd but somehow if kernel has not cleaned up resources(socket) then glusterd startup is failed due to bind system call failure.To avoid the issue retries to call bind 10 times to execute system call succesfully Change-Id: Ia5fd6b788f7b211c1508c1b7304fc08a32266629 Fixes: bz#1743020 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	event: rename event_XXX with gf_ prefixed	Xiubo Li	2019-07-29	1	-25/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I hit one crash issue when using the libgfapi. In the libgfapi it will call glfs_poller() --> event_dispatch() in file api/src/glfs.c:721, and the event_dispatch() is defined by libgluster locally, the problem is the name of event_dispatch() is the extremly the same with the one from libevent package form the OS. For example, if a executable program Foo, which will also use and link the libevent and the libgfapi at the same time, I can hit the crash, like: kernel: glfs_glfspoll[68486]: segfault at 1c0 ip 00007fef006fd2b8 sp 00007feeeaffce30 error 4 in libevent-2.0.so.5.1.9[7fef006ed000+46000] The link for Foo is: lib_foo_LADD = -levent $(GFAPI_LIBS) It will crash. This is because the glfs_poller() is calling the event_dispatch() from the libevent, not the libglsuter. The gfapi link info : GFAPI_LIBS = -lacl -lgfapi -lglusterfs -lgfrpc -lgfxdr -luuid If I link Foo like: lib_foo_LADD = $(GFAPI_LIBS) -levent It will works well without any problem. And if Foo call one private lib, such as handler_glfs.so, and the handler_glfs.so will link the GFAPI_LIBS directly, while the Foo won't and it will dlopen(handler_glfs.so), then the crash will be hit everytime. The link info will be: foo_LADD = -levent libhandler_glfs_LIBADD = $(GFAPI_LIBS) I can avoid the crash temporarily by linking the GFAPI_LIBS in Foo too like: foo_LADD = $(GFAPI_LIBS) -levent libhandler_glfs_LIBADD = $(GFAPI_LIBS) But this is ugly since the Foo won't use any APIs from the GFAPI_LIBS. And in some cases when the --as-needed link option is added(on many dists it is added as default), then the crash is back again, the above workaround won't work. Fixes: #699 Change-Id: I38f0200b941bd1cff4bf3066fca2fc1f9a5263aa Signed-off-by: Xiubo Li <xiubli@redhat.com>
*	multiple files: another attempt to remove includes	Yaniv Kaul	2019-06-14	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are many include statements that are not needed. A previous more ambitious attempt failed because of *BSD plafrom (see https://review.gluster.org/#/c/glusterfs/+/21929/ ) Now trying a more conservative reduction. It does not solve all circular deps that we have, but it does reduce some of them. There is just too much to handle reasonably (dht-common.h includes dht-lock.h which includes dht-common.h ...), but it does reduce the overall number of lines of include we need to look at in the future to understand and fix the mess later one. Change-Id: I550cd001bdefb8be0fe67632f783c0ef6bee3f9f updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	across: clang-scan: fix NULL dereferencing warnings	Amar Tumballi	2019-06-04	1	-5/+3
\| \| \| \| \| \| \| \| \|	All these checks are done after analyzing clang-scan report produced by the CI job @ https://build.gluster.org/job/clang-scan updates: bz#1622665 Change-Id: I590305af4ceb779be952974b2a36066ffc4865ca Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	transport/socket: log shutdown msg occasionally	Raghavendra G	2019-04-03	1	-2/+2
\| \| \| \| \| \|	Change-Id: If3fc0884e7e2f45de2d278b98693b7a473220a5f Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Fixes: bz#1691616
*	socket/ssl: fix crl handling	Milind Changire	2019-03-19	1	-19/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Just setting the path to the CRL directory in socket_init() wasn't working. Solution: Need to use special API to retrieve and set X509_VERIFY_PARAM and set the CRL checking flags explicitly. Also, setting the CRL checking flags is a big pain, since the connection is declared as failed if any CRL isn't found in the designated file or directory. A comment has been added to the code appropriately. Change-Id: I8a8ed2ddaf4b5eb974387d2f7b1a85c1ca39fe79 fixes: bz#1687326 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	core: implement a global thread pool	Xavi Hernandez	2019-02-18	1	-12/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements a thread pool that is wait-free for adding jobs to the queue and uses a very small locked region to get jobs. This makes it possible to decrease contention drastically. It's based on wfcqueue structure provided by urcu library. It automatically enables more threads when load demands it, and stops them when not needed. There's a maximum number of threads that can be used. This value can be configured. Depending on the workload, the maximum number of threads plays an important role. So it needs to be configured for optimal performance. Currently the thread pool doesn't self adjust the maximum for the workload, so this configuration needs to be changed manually. For this reason, the global thread pool has been made optional, so that volumes can still use the thread pool provided by io-threads. To enable it for bricks, the following option needs to be set: config.global-threading = on This option has no effect if bricks are already running. A restart is required to activate it. It's recommended to also enable the following option when running bricks with the global thread pool: performance.iot-pass-through = on To enable it for a FUSE mount point, the option '--global-threading' must be added to the mount command. To change it, an umount and remount is needed. It's recommended to disable the following option when using global threading on a mount point: performance.client-io-threads = off To enable it for services managed by glusterd, glusterd needs to be started with option '--global-threading'. In this case all daemons, like self-heal, will be using the global thread pool. Currently it can only be enabled for bricks, FUSE mounts and glusterd services. The maximum number of threads for clients and bricks can be configured using the following options: config.client-threads config.brick-threads These options can be applied online and its effect is immediate most of the times. If one of them is set to 0, the maximum number of threads will be calcutated as #cores * 2. Some distributions use a very old userspace-rcu library (version 0.7) for this reason, some header files from version 0.10 have been copied into contrib/userspace-rcu and are used if the detected version is 0.7 or older. An additional change has been made to io-threads to prevent that threads are started when iot-pass-through is set. Change-Id: I09d19e246b9e6d53c6247b29dfca6af6ee00a24b updates: #532 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	socket: socket event handlers now return void	Milind Changire	2019-02-18	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Returning any value from socket event handlers to the event sub-system doesn't make sense since event sub-system cannot handle socket sub-system errors. Solution: Change return type of all socket event handlers to 'void' Change-Id: I70dc2c57f12b7ea2fae41120f71aa0d7fe0b2b6f Fixes: bz#1651246 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	socket: don't pass return value from protocol handler to event handler	Zhang Huan	2019-01-22	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	Event handler handles socket level error only, while protocol handler handles in protocol level error. If protocol handler decides to disconnect on error in any case, it should call disconnect instead of return an error back to event handler. Change-Id: I9375be98cc52cb969085333f3c7229a91207d1bd updates: bz#1666143 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
*	socket: fix issue when socket read return with EAGAIN	Zhang Huan	2019-01-22	1	-7/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the case socket read returns EAGAIN, positive value about remaining vector to send is returned. This return value will be passed all the way back to event handler, making it complains. [2018-12-29 08:02:25.603199] T [socket.c:1640:__socket_read_simple_payload] 0-test-client-0-extra.0: partial read on non-blocking socket. [2018-12-29 08:02:25.603201] T [rpc-clnt.c:654:rpc_clnt_reply_init] 0-test-client-2-extra.1: received rpc message (RPC XID: 0xfa6 Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 12) from rpc-transport (test-client-2-extra.1) [2018-12-29 08:02:25.603207] T [socket.c:3129:socket_event_handler] 0-test-client-0-extra.0: (sock:32) socket_event_poll_in returned 1 Formerly, in socket_proto_state_machine, return value of socket_readv is used to check if message is all read-in. In this commit, it is checked whether size of bytes indicated in header are all read in. In this way, only 0 and -1 will be returned from socket_proto_state_machine(), indicating whether there is error in the underlying socket. Change-Id: I8be0d178b049f0720d738a03aec41c4b375d2972 updates: bz#1666143 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
*	socket: fix issue when socket write return with EAGAIN	Zhang Huan	2019-01-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	In the case socket write return with EAGAIN, the remaining vector count is return all way back to event handler, making followup pollin event to skip handling and dispatch loop complains about failure. Even thought temporary write failure is not an error. [2018-12-29 07:31:41.772310] E [MSGID: 101191] [event-epoll.c:674:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch handler Change-Id: Idf03d120b5f7619eda19720a583cbcc3e7da2504 updates: bz#1666143 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
*	socket: fix counting of socket total_bytes_read and total_bytes_write	Zhang Huan	2019-01-17	1	-4/+4
\| \| \| \| \| \|	Change-Id: If35d0dbae963facf00ab6bcf07c6e4d1706ed982 updates: bz#1666143 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
*	socket: Remove redundant in_lock in incoming message handling	Krutika Dhananjay	2018-12-26	1	-36/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	A given epoll thread can handle only one incoming (POLLIN) request. And until the socket is rearmed for listening, it is guaranteed that there won't be any new incoming requests. As a result, the priv->in_lock which guards the socket proto state machine seems redundant. This patch removes priv->in_lock. Change-Id: I26b6ddd852aba8c10385833b85ffd2e53e46cb8c updates: bz#1467614 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	[geo-rep]: Worker still ACTIVE after killing bricks	Mohit Agrawal	2018-12-13	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In changelog xlator after destroying listener it call's unlink to delete changelog socket file but socket file reference is not cleaned up from process memory Solution: 1) To cleanup reference completely from process memory serialize transport cleanup for changelog and then unlink socket file 2) Brick xlator will notify GF_EVENT_PARENT_DOWN to next xlator only after cleanup all xprts Test: To test the same run below steps 1) Setup some volume and enable brick mux 2) kill anyone brick with gf_attach 3) check changelog socket for specific to killed brick in lsof, it should cleanup completely fixes: bz#1600145 Change-Id: Iba06cbf77d8a87b34a60fce50f6d8c0d427fa491 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	libglusterfs: Move devel headers under glusterfs directory	ShyamsundarR	2018-12-05	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	libglusterfs devel package headers are referenced in code using include semantics for a program, this while it works can be better especially when dealing with out of tree xlator builds or in general out of tree devel package usage. Towards this, the following changes are done, - moved all devel headers under a glusterfs directory - Included these headers using system header notation <> in all code outside of libglusterfs - Included these headers using own program notation "" within libglusterfs This change although big, is just moving around the headers and making it correct when including these headers from other sources. This helps us correctly include libglusterfs includes without namespace conflicts. Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b Updates: bz#1193929 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	rpcsvc: provide each request handler thread its own queue	Raghavendra Gowdappa	2018-11-29	1	-7/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A single global per program queue is contended by all request handler threads and event threads. This can lead to high contention. So, reduce the contention by providing each request handler thread its own private queue. Thanks to "Manoj Pillai"<mpillai@redhat.com> for the idea of pairing a single queue with a fixed request-handler-thread and event-thread, which brought down the performance regression due to overhead of queuing significantly. Thanks to "Xavi Hernandez"<xhernandez@redhat.com> for discussion on how to communicate the event-thread death to request-handler-thread. Thanks to "Karan Sandha"<ksandha@redhat.com> for voluntarily running the perf benchmarks to qualify that performance regression introduced by ping-timer-fixes is fixed with this patch and patiently running many iterations of regression tests while RCAing the issue. Thanks to "Milind Changire"<mchangir@redhat.com> for patiently running the many iterations of perf benchmarking tests while RCAing the regression caused by ping-timer-expiry fixes. Change-Id: I578c3fc67713f4234bd3abbec5d3fbba19059ea5 Fixes: bz#1644629 Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
*	libglusterfs: rename macros roof and floor to not conflict with math.h	Raghavendra Gowdappa	2018-11-28	1	-4/+4
\| \| \| \| \| \|	Change-Id: I666eeb63ebd000711b3f793b948d4e0c04b1a242 Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com> Updates: bz#1644629
*	core: create a constant for default network timeout	Xavi Hernandez	2018-11-23	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	A new constant named GF_NETWORK_TIMEOUT has been defined and all references to the hard-coded timeout of 42 seconds have been replaced with this constant. Change-Id: Id30f5ce4f1230f9288d9e300538624bcf1a6da27 fixes: bz#1652852 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	rpc: stop log flooding about ENODATA	Milind Changire	2018-11-23	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Logs are being flooded with ENODATA errors. This log was introduced via https://review.gluster.org/c/glusterfs/+/21481 Solution: Add a flag to remember that ENODATA error was logged for a socket/transport Change-Id: I54c10b87e46c2592339cc8b966333b8d08331750 fixes: bz#1650389 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	rpc-transport/socket: NULL pointer dereferencing clang fix	Shwetha Acharya	2018-10-30	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Problem: ctx and res can be NULL. Solution: introduced a VALIDATE_OR_GOTO statement, hence removed the null check for ctx; added a check for res. Updates: bz#1622665 Change-Id: Ifee4c73e260530ab44c0a34c5ff5568f38f92c94 Signed-off-by: Shwetha Acharya <sacharya@redhat.com>
*	transport: log socket closures more verbose	Milind Changire	2018-10-26	1	-11/+43
\| \| \| \| \| \| \| \| \| \| \| \|	Problem: Intentional and unintentional socket closures cannot be identified Solution: Log intentional socket closures with at least INFO log level Change-Id: Ic02c882b16ab2193e57f8c3e6c3a82c4fe0f6875 fixes: bz#1642800 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	socket: use accept4/paccept for nonblocking socket	Krishnan Parthasarathi	2018-10-12	1	-16/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reduces the no. of syscalls on Linux systems from 2, accept(2) and fcntl(2) for setting O_NONBLOCK, to a single accept4(2). On NetBSD, we have paccept(2) that does the same, if we leave signal masking aside. Added sys_accept which accepts an extra flags argument than accept(2). This would opportunistically use accept4/paccept as available. It would fallback to accept(2) and fcntl(2) otherwise. While at this, the patch sets FD_CLOEXEC flag on the accepted socket fd. BUG: 1236272 Change-Id: I41e43fd3e36d6dabb07e578a1cea7f45b7b4e37f fixes: bz#1236272 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
*	socket: set FD_CLOEXEC on all sockets	Krishnan Parthasarathi	2018-10-11	1	-3/+3
\| \| \| \| \| \| \| \| \|	For more information, see http://udrepper.livejournal.com/20407.html BUG: 1236272 Change-Id: I25a645c10bdbe733a81d53cb714eb036251f8129 fixes: bz#1236272 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
*	socket: clear return value if error is going to be handled in event thread	Kinglong Mee	2018-10-10	1	-0/+3
\| \| \| \| \| \|	Change-Id: Ibce94f282b0aafaa1ca60ab927a469b70595e81f updates: bz#1626313 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
*	rpc: coverity fixes	Milind Changire	2018-10-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	CID: [1] 1394646 Unchecked return value from library CID: [2] 1394633 Unused value CID: 1382443 Sleeping while holding a lock [This is intentional] [1] https://scan6.coverity.com/reports.htm#v40014/p10714/fileInstanceId=86159112&defectInstanceId=26360786&mergedDefectId=1394646 [2] https://scan6.coverity.com/reports.htm#v40014/p10714/fileInstanceId=86159365&defectInstanceId=26360919&mergedDefectId=1394633 Change-Id: I03086f7a9672c9f50a2bc44cdbce0006c887357b updates: bz#789278 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	rpc: make binding to port 0 as the default if no option is provided	Amar Tumballi	2018-10-02	1	-3/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Right now, if no option is provided, the default port is assumed, which is 24007. Ideally, for 'glusterfsd' processes, it is better to not assume there are any ports given, so it can start listening on any port which is available. This helps us to cleanup the dependencies on glusterd from glusterfsd at the moment. No changes would be done to glusterd code, but making the right defaults helps to make glusterfsd more independent process later. NOTE: This patch is a reduced version of below set of patches: * https://review.gluster.org/14613/ & * https://review.gluster.org/14670/ & * https://review.gluster.org/14671/ Credits: Prasanna Kumar Kalever <pkalever@redhat.com> updates: bz#1343926 Change-Id: Ib874e10505e7366dc56ba754458252b67052e653 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	socket: set 42 as default tpc-user-timeout	Xavi Hernandez	2018-09-14	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 'tcp-user-timeout' option is define in the 'socket' module, but it's configured in 'protocol/server' and 'protocol/client', which are the parents of the 'socket' module. However, current options management logic only takes into consideration default values specified in the 'socket' module itself, ignoring values defined in the owner xlator. This patch simply sets the default value of tcp-user-timeout in the 'socket' module so that server and client use the expected value. Change-Id: Ib8ad7c4ac6aac725b01a78f8c3d10cf4063d2ee6 fixes: bz#1628605 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	Land part 2 of clang-format changes	Gluster Ant	2018-09-12	1	-3830/+3614
\| \| \| \| \|	Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
*	Modify log message 'DH ciphers are disabled' from ERROR to INFO	Omar Kohl	2018-09-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Per the latest comment in bz#1398237 this message is confusing for users because it suggests an error where none exists. Fixes: bz#1626319 Change-Id: I2f05999da157b11e225bf3d95edb597e964f9923 Signed-off-by: Omar Kohl <omarkohl@gmail.com>
*	multiple files: calloc -> malloc	Yaniv Kaul	2018-09-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	xlators/cluster/stripe/src/stripe-helpers.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/tier.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/dht-layout.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/dht-helper.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/dht-common.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/afr/src/afr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/afr/src/afr-inode-read.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible tests/bugs/replicate/bug-1250170-fsync.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible tests/basic/gfapi/gfapi-async-calls-test.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible tests/basic/ec/ec-fast-fgetxattr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible rpc/xdr/src/glusterfs3.h: Move to GF_MALLOC() instead of GF_CALLOC() when possible rpc/rpc-transport/socket/src/socket.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible rpc/rpc-lib/src/rpc-clnt.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible extras/geo-rep/gsync-sync-gfid.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-xml-output.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-rpc-ops.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-volume.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-system.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-snapshot.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-peer.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-global.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible It doesn't make sense to calloc (allocate and clear) memory when the code right away fills that memory with data. It may be optimized by the compiler, or have a microscopic performance improvement. In some cases, also changed allocation size to be sizeof some struct or type instead of a pointer - easier to read. In some cases, removed redundant strlen() calls by saving the result into a variable. 1. Only done for the straightforward cases. There's room for improvement. 2. Please review carefully, especially for string allocation, with the terminating NULL string. Only compile-tested! updates: bz#1193929 Original-Author: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Amar Tumballi <amarts@redhat.com> Change-Id: I16274dca4078a1d06ae09a0daf027d734b631ac2
*	socket: Remove code duplication	Krutika Dhananjay	2018-08-02	1	-69/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	While I was reading code, I saw that socket_submit_request() and socket_submit_reply() are identical except for @a_byte and the source of @msg object being different. This patch moves all of the common code to a new function with the differing vars passed as parameters by the callers. Change-Id: I7a62ae72f10c34dc8de01b250d89a77ec5ab490d fixes: bz#1608991 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	rpc: merge ssl infra with epoll infra	Milind Changire	2018-07-27	1	-744/+786
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch attempts to use the epoll infra for handling SSL connections as well instead of the socket_poller() thread func. This essentially makes priv->own_thread flag redundant. SSL_connect()/SSL_accept() is now non-blocking which has done away with the localised poll() in ssl_do(). So, ssl_do() has been updated appropriately. own_thread and coincidently socket_poller() thread for SSL processing is now deprecated. Change-Id: I1ce54c06ddb643c16baa143598e7e4fbf16bae0a fixes: bz#1561332 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	All: run codespell on the code and fix issues.	Yaniv Kaul	2018-07-22	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \|	Please review, it's not always just the comments that were fixed. I've had to revert of course all calls to creat() that were changed to create() ... Only compile-tested! Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	Fix compile warnings	Xavi Hernandez	2018-07-10	1	-8/+17
\| \| \| \| \| \| \| \| \| \| \|	This patch fixes compile warnings that appear with newer compilers. The solution applied is only to remove the warnings, but it doesn't always solve the problem in the best way. It assumes that the problem will never happen, as the previous code assumed. Change-Id: I6e8470d6c2e2dbd3bd7d324b5fd2f92ffdc3d6ec updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	rpc: set listen-backlog to high value	Milind Changire	2018-04-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: On node reboot, when glusterd starts volumes rapidly, there's a flood of connections from the bricks to glusterd and from the self-heal daemons to the bricks. This causes SYN Flooding and dropped connections when the listen-backlog is not enough to hold the pending connections to compensate for the rate at which connections are accepted by the RPC layer. Solution: Increase the listen-backlog value to 1024. This is a partial solution. Part of the solution is to rearm the listener socket early for quicker accept() of connections. See commit 6964640a977cb10c0c95a94e03c229918fa6eca8 (change 19833) Change-Id: I62283d1f4990dd43839f9a6932cf8a36effd632c fixes: bz#1564600 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	rpc: rearm listener socket early	Milind Changire	2018-04-07	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: On node reboot, when glusterd starts volumes, a setup with a large number of bricks might cause SYN Flooding and connections to be dropped if the connections are not accepted quickly enough. Solution: accept() the connection and rearm the listener socket early to receive more connection requests as soon as possible. Change-Id: Ibed421e50284c3f7a8fcdb4de7ac86cf53d4b74e fixes: bz#1564600 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	socket: Improve error logging when loading SSL files fails	Niklas Hambüchen	2018-03-21	1	-2/+6
\| \| \| \| \| \| \| \| \| \|	* Say which file had the problem * Dump openssl error stack Fixes gluster/glusterfs#431. Change-Id: I66e9a0ae7758e9d7d8a5f19cc8ff898f01f2b491 Signed-off-by: Niklas Hambüchen <mail@nh2.me>
*	glusterd: TLS verification fails while using intermediate CA	Mohit Agrawal	2018-03-19	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: TLS verification fails while using intermediate CA if mgmt SSL is enabled. Solution: There are two main issue of TLS verification failing 1) not calling ssl_api to set cert_depth 2) The current code does not allow to set certificate depth while MGMT SSL is enabled. After apply this patch to set certificate depth user need to set parameter option transport.socket.ssl-cert-depth <depth> in /var/lib/glusterd/secure_acccess instead to set in /etc/glusterfs/glusterd.vol. At the time of set secure_mgmt in ctx we will check the value of cert-depth and save the value of cert-depth in ctx.If user does not provide any value in cert-depth in that case it will consider default value is 1 BUG: 1555154 Change-Id: I89e9a9e1026e37efb5c20f9ec62b1989ef644f35 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	socket: options update for GD2	Mohit Agrawal	2018-02-19	1	-4/+36
\| \| \| \| \| \| \|	All socket options update for GD2 Change-Id: I227c16965e92018a5ab5aacd9c2617fb2735268c Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>