path: root/glusterfsd/src/glusterfsd-mgmt.c
Commit log, newest first (subject, author, date, files changed, -deleted/+added lines):
* glusterd/svc: update pid of mux volumes from the shd process
  (Mohammed Rafi KC, 2019-07-09, 1 file, -9/+57)

  For a normal volume, we update the pid from the process itself while
  daemonizing, or at the end of init if it is in no-daemon mode. Along
  with updating the pid we also lock the file, to make sure that the
  process is running fine.

  With brick mux, we were updating the pidfile from glusterd after an
  attach/detach request. There are two problems with this approach:
  1) We are not holding a pidlock for any file other than the parent
     process.
  2) There is a chance of race conditions with attach/detach. For
     example, an shd start and a volume stop could race: say we are
     starting an shd and it is attached to a volume; while we are trying
     to link the pid file to the running process, the file could already
     have been deleted by the thread doing the volume stop.

  Change-Id: I29a00352102877ce09ea3f376ca52affceb5cf1a
  Updates: bz#1722541
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
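  A minimal sketch of the update-and-lock pattern described above, using
  generic POSIX calls rather than the actual gluster helpers: the running
  process itself writes its pid and keeps an advisory lock, so a stale or
  foreign pidfile can be detected:

      #include <fcntl.h>
      #include <stdio.h>
      #include <string.h>
      #include <unistd.h>

      /* Write our own pid into 'path' and hold an advisory lock on it.
       * Returns the (still open, still locked) fd, or -1 on failure. */
      static int pidfile_update_and_lock(const char *path)
      {
          char buf[32];
          int fd = open(path, O_RDWR | O_CREAT, 0644);
          if (fd < 0)
              return -1;
          if (lockf(fd, F_TLOCK, 0) < 0) { /* another live process owns it */
              close(fd);
              return -1;
          }
          snprintf(buf, sizeof(buf), "%d\n", (int)getpid());
          if (ftruncate(fd, 0) < 0 || write(fd, buf, strlen(buf)) < 0) {
              close(fd);                   /* closing drops the lock */
              return -1;
          }
          return fd; /* keep it open for the lifetime of the process */
      }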
* core: fedora 30 compiler warnings
  (SheetalPamecha, 2019-06-18, 1 file, -3/+1)

  warning: '%s' directive argument is null [-Wformat-overflow=]

  Change-Id: I69b8d47f0002c58b00d1cc947fac6f1c64e0b295
  updates: bz#1193929
  Signed-off-by: SheetalPamecha <spamecha@redhat.com>
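  The warning fires when gcc 9 (Fedora 30) can prove a %s argument may be
  NULL. A sketch of the usual guard, with a hypothetical call site rather
  than the one this patch touched:

      #include <stdio.h>

      static void log_volname(const char *name)
      {
          /* substitute a literal when the string may be NULL */
          printf("volume: %s\n", name ? name : "(null)");
      }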
* mgmt/shd: Implement multiplexing in self heal daemon
  (Mohammed Rafi KC, 2019-04-01, 1 file, -4/+234)

  Problem: The shd daemon is per node, which means it creates a graph
  with all volumes on it. While this is great for utilizing resources,
  it is not so good in terms of performance and manageability, because
  self-heal daemons don't have the capability to automatically
  reconfigure their graphs. So each time any configuration change
  happens to a volume (replicate/disperse), we need to restart shd to
  bring the change into the graph. Because of this, all ongoing heals
  for all other volumes have to be stopped in the middle and restarted
  all over again.

  Solution: This change makes shd a per-volume daemon, so that a graph
  is generated for each volume. When we want to start/reconfigure shd
  for a volume, we first search for an existing shd running on the node;
  if there is none, we start a new process. If a daemon is already
  running for shd, then we simply detach the volume's graph and reattach
  the updated graph for the volume. This won't touch any ongoing
  operations for any other volumes on the shd daemon.

  Example of an shd graph when it is a per-volume graph:

             -----------------------
             |    debug-iostat     |
             -----------------------
              /        |        \
             /         |         \
       ---------   ---------   ---------
       | AFR-1 |   | AFR-2 |   | AFR-3 |
       ---------   ---------   ---------

  A running shd daemon with 3 volumes will have a graph like:

             -----------------------
             |    debug-iostat     |
             -----------------------
              /        |        \
             /         |         \
   ------------   ------------   ------------
   | volume-1 |   | volume-2 |   | volume-3 |
   ------------   ------------   ------------

  Change-Id: Idcb2698be3eeb95beaac47125565c93370afbd99
  fixes: bz#1659708
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* rpc/transport: Missing a ref on dict while creating transport object
  (Mohammed Rafi KC, 2019-03-20, 1 file, -3/+15)

  While creating an rpc_transport object, we store a dictionary without
  taking a ref on the dict, yet we do an unref during the cleanup of the
  transport object. So the rpc layer expects the caller to take a ref on
  the dictionary before passing the dict to the rpc layer. This leads to
  a lot of confusion across the code base and leads to ref leaks.

  Semantically, this is not correct. It is the rpc layer's
  responsibility to take a ref when storing the dict, and to free it
  during cleanup. The issues and leaks across the code base caused by
  this confusion, currently present in upstream master:

  1) changelog_rpc_client_init
  2) quota_enforcer_init
  3) rpcsvc_create_listeners: when there are two transports, like
     tcp,rdma
  4) quotad_aggregator_init
  5) glusterd: init
  6) nfs3_init_state
  7) server: init
  8) client: init

  This patch does the cleanup according to the semantics.

  Change-Id: I46373af9630373eb375ee6de0e6f2bbe2a677425
  updates: bz#1659708
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
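  The ownership rule the patch enforces, sketched with a generic
  refcounted object (hypothetical names; the real code uses
  dict_ref()/dict_unref() on dict_t): whoever stores a pointer takes its
  own reference and drops it in its own cleanup:

      #include <stdlib.h>

      typedef struct dict {
          int refcount;
          /* ... payload ... */
      } dict_t;

      static dict_t *dict_ref(dict_t *d)   { d->refcount++; return d; }
      static void    dict_unref(dict_t *d) { if (--d->refcount == 0) free(d); }

      typedef struct transport {
          dict_t *options;
      } transport_t;

      /* Correct semantics: the transport takes its own ref when storing... */
      static void transport_init(transport_t *t, dict_t *options)
      {
          t->options = dict_ref(options);
      }

      /* ...and releases exactly that ref during its own cleanup, so the
       * caller's reference is never stolen. */
      static void transport_fini(transport_t *t)
      {
          dict_unref(t->options);
          t->options = NULL;
      }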
* glusterfsd: Do not process PROFILE_NFS_INFO if graph is not ready
  (hujianfei, 2019-02-19, 1 file, -0/+5)

  Otherwise, gnfs will crash in the following situation. Also see commit
  2f9e555f.

  Reproducible steps:
  1. kill the gnfs process
  2. service glusterd restart; gluster volume profile [vol] info nfs

  Dump trace info:
  /lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xc2)[0x7fcf5cb6a872]
  /lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7fcf5cb743a4]
  /lib64/libc.so.6(+0x35670)[0x7fcf5b1d5670]
  /usr/sbin/glusterfs(glusterfs_handle_nfs_profile+0x114)[0x7fcf5d066474]
  /lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x7fcf5cba1502]
  /lib64/libc.so.6(+0x47110)[0x7fcf5b1e7110]

  Fixes: bz#1677559
  Change-Id: Id68edb3e4646c39544e0b4c90b5e0a9083b37b0d
  Signed-off-by: hujianfei <hujianfei@cmss.chinamobile.com>
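  The shape of this fix (and of the two similar "graph not ready" commits
  further down) is a guard at the top of the handler; a sketch with
  hypothetical stand-in types:

      struct xlator;
      typedef struct { struct xlator *first; } graph_t;
      typedef struct { graph_t *active; } ctx_t; /* stand-in for glusterfs_ctx_t */

      static int handle_profile_info(ctx_t *ctx)
      {
          if (!ctx || !ctx->active) /* graph not activated yet */
              return -1;            /* fail the request instead of crashing */
          /* ... only now is it safe to dereference ctx->active->first ... */
          return 0;
      }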
* Multiple files: reduce work while under lock.
  (Yaniv Kaul, 2019-01-29, 1 file, -7/+10)

  Mostly, unlock before logging. In some cases, code that did not need
  to be under the lock (for example, taking the time, or malloc'ing) was
  moved to execute before taking the lock.

  Note: logging might be slightly less accurate in order, since it may
  no longer be done under the lock, so the order of logs is racy. I
  think it's a reasonable compromise.

  Compile-tested only!

  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
  Change-Id: I2438710016afc9f4f62a176ef1a0d3ed793b4f89
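  A before/after sketch of the pattern, in generic pthreads with a
  hypothetical add_item(), not the actual call sites:

      #include <pthread.h>
      #include <stdio.h>
      #include <stdlib.h>

      static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
      static void *shared_item;

      static int add_item(size_t size)
      {
          /* malloc before taking the lock... */
          void *item = malloc(size);
          if (!item)
              return -1;

          pthread_mutex_lock(&lock);
          shared_item = item;  /* only the shared update stays locked */
          pthread_mutex_unlock(&lock);

          /* ...and log after releasing it (log ordering becomes racy,
           * which the commit calls an acceptable compromise). */
          fprintf(stderr, "added item of %zu bytes\n", size);
          return 0;
      }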
* rpc: Fix double free
  (Poornima G, 2019-01-22, 1 file, -2/+0)

  The value rsp.xdata.xdata_val was being freed twice. It was assigned
  to dict->extra_stdfree, so dict_destroy would free it, and there was
  also an explicit free. This patch gets rid of the explicit free.

  Change-Id: Ia9c73454bec3970b33f154fa754398bf3b045645
  fixes: bz#1668268
  Signed-off-by: Poornima G <pgurusid@redhat.com>
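  The bug shape, sketched generically (hypothetical container mirroring
  the dict->extra_stdfree convention): once ownership of a buffer is
  handed to a container that frees it, the explicit free must go:

      #include <stdlib.h>

      typedef struct {
          void *extra_stdfree; /* freed by container_destroy() */
      } container_t;

      static void container_destroy(container_t *c)
      {
          free(c->extra_stdfree);
      }

      static void reply_done(container_t *c, char *xdata_val)
      {
          c->extra_stdfree = xdata_val; /* ownership transferred...   */
          container_destroy(c);         /* ...container frees it once */
          /* free(xdata_val);  <- the removed explicit free: double free */
      }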
* rpc: use address-family option from vol file
  (Milind Changire, 2019-01-22, 1 file, -1/+4)

  This patch helps enable IPv6 connections in the cluster. The default
  address-family is IPv4 when this option is not used explicitly.

  When address-family is set to "inet6" in the
  /etc/glusterfs/glusterd.vol file, the mount command-line also needs
  -o xlator-option="transport.address-family=inet6" added to it. This
  option also gets added to the brick command-line. Snapshot and gfapi
  use-cases should also use this option to pass in the inet6
  address-family.

  Change-Id: I97db91021af27bacb6d7578e33ea4817f66d7270
  fixes: bz#1635863
  Signed-off-by: Milind Changire <mchangir@redhat.com>
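  For illustration, how an address-family option string typically maps
  onto name resolution in C; a generic getaddrinfo() sketch, not the
  gluster transport code itself:

      #include <netdb.h>
      #include <string.h>
      #include <sys/socket.h>

      /* Resolve 'host' restricted to the configured address family:
       * "inet" -> AF_INET, "inet6" -> AF_INET6, else unspecified. */
      static int resolve_with_family(const char *host, const char *family,
                                     struct addrinfo **res)
      {
          struct addrinfo hints;
          memset(&hints, 0, sizeof(hints));
          hints.ai_family = AF_UNSPEC;
          if (family && strcmp(family, "inet") == 0)
              hints.ai_family = AF_INET;
          else if (family && strcmp(family, "inet6") == 0)
              hints.ai_family = AF_INET6;
          return getaddrinfo(host, NULL, &hints, res);
      }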
* core: Resolve memory leak for brick
  (Mohit Agrawal, 2019-01-16, 1 file, -1/+7)

  Problem: Some functions are not freeing memory allocated by
  xdr_to_generic, so it became a leak.

  Solution: Call free to avoid the leak.

  Change-Id: I3524fe2831d1511d378a032f21467edae3850314
  fixes: bz#1656682
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* core: glusterd/add-brick-and-validate-replicated-volume-options.t crashes
  (Mohit Agrawal, 2019-01-14, 1 file, -6/+2)

  Problem: Sometimes a brick crashes at the time of handling a pmap
  signin request.

  Solution: glusterfs_mgmt_pmap_signin was using the same frame to send
  every pmap signin request; to avoid the crash, send each signin
  request on a separate frame.

  Change-Id: I443f854171ec4372e8d5f84bdc576c468e92c493
  fixes: bz#1665656
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* glusterfsd: Fix coverity issue
  (Iraj Jamali, 2018-12-14, 1 file, -5/+0)

  Problem reported: value assigned to a variable is never used.

  Fixes CID: 1274230

  updates: bz#789278
  Change-Id: I7afcb411876dea81c6820c5b31ae0a2896f9ca15
  Signed-off-by: Iraj Jamali <ijamali@redhat.com>
* libglusterfs: Move devel headers under glusterfs directory
  (ShyamsundarR, 2018-12-05, 1 file, -9/+9)

  libglusterfs devel package headers are referenced in code using
  include semantics for a program. While this works, it can be better,
  especially when dealing with out-of-tree xlator builds or out-of-tree
  devel package usage in general.

  Towards this, the following changes are done:
  - moved all devel headers under a glusterfs directory
  - included these headers using system header notation <> in all code
    outside of libglusterfs
  - included these headers using own-program notation "" within
    libglusterfs

  This change, although big, just moves the headers around and corrects
  how they are included from other sources. This helps us correctly
  include libglusterfs includes without namespace conflicts.

  Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
  Updates: bz#1193929
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
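  The include-notation change, illustrated (the specific header name is
  just an example; it only resolves with the devel headers installed):

      /* Outside libglusterfs (xlators, daemons, out-of-tree builds):
       * system header notation, resolved against the installed
       * include directory. */
      #include <glusterfs/xlator.h>

      /* Inside libglusterfs itself: own-program notation. */
      #include "xlator.h"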
* glusterfsd: Make io-stats xlator search position independent
  (Pranith Kumar K, 2018-11-15, 1 file, -8/+10)

  Problem: The glusterfsd notify trigger for the profile info command
  expects the decompounder xlator to have the name of the brick and its
  immediate child to be the io-stats xlator. In GD2 the decompounder
  xlator doesn't exist, so this was preventing the io-stats xlator from
  receiving the profile info collection notification.

  Fix: Search for the io-stats xlator below the server xlator until the
  first instance is found, and send the notification to it.

  fixes bz#1649709
  Change-Id: I92a1d9019bbd5546050ab43d50d571c444e027ed
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
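  A sketch of a position-independent search, over a deliberately
  simplified single-child chain (the real xlator tree and its field
  names live in libglusterfs):

      #include <string.h>

      /* Simplified xlator node, assumed for illustration. */
      typedef struct xlator {
          const char *type;           /* e.g. "debug/io-stats" */
          struct xlator *first_child; /* simplified: single-child chain */
      } xlator_t;

      /* Walk down from the server xlator and return the first io-stats
       * instance, wherever it sits in the chain. */
      static xlator_t *find_iostats_below(xlator_t *server)
      {
          xlator_t *xl;
          for (xl = server; xl != NULL; xl = xl->first_child)
              if (strcmp(xl->type, "debug/io-stats") == 0)
                  return xl;
          return NULL;
      }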
* glusterfsd: Make each multiplexed brick sign in
  (Prashanth Pai, 2018-11-12, 1 file, -4/+22)

  NOTE: This change will be consumed by the brick mux implementation of
  glusterd2 only. No corresponding change has been made in glusterd1.

  When a multiplexed brick process is shutting down, it sends sign-out
  requests to glusterd for all bricks that it contains. However, a
  sign-in request is only sent for a single brick. Consequently,
  glusterd has to use some tricky means to repopulate the pmap registry
  with information about multiplexed bricks during a glusterd restart.

  This change makes each multiplexed brick send a sign-in request to
  glusterd2, which ensures that glusterd2 can easily repopulate the pmap
  registry with port information. As a bonus, the sign-in request will
  now also contain the PID of the brick sending the request, so that
  glusterd2 can rely on this instead of having to read/manage brick
  pidfiles.

  Change-Id: I409501515bd9a28ee7a960faca080e97cabe5858
  updates: bz#1193929
  Signed-off-by: Prashanth Pai <ppai@redhat.com>
* glusterfsd: Do not process GLUSTERD_NODE_STATUS if graph is not ready
  (Hu Jianfei, 2018-11-07, 1 file, -0/+5)

  Otherwise, gnfs will crash if we try to get nfs client status in the
  following situation. Also see commit 2f9e555f.

  Reproducible steps:
  1. systemctl restart glusterd; gluster volume status rep
  2. systemctl restart glusterd; gluster volume status rep nfs clients

  Step 1 works fine, but step 2 will crash the localhost gnfs with some
  probability.

  /lib64/libglusterfs.so.0(+0x270f0)[0x7effb6c7b0f0]
  /lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7effb6c854a4]
  /lib64/libc.so.6(+0x35270)[0x7effb52e7270]
  /usr/sbin/glusterfs(glusterfs_handle_node_status+0x155)[0x7effb7196905]
  /lib64/libglusterfs.so.0(+0x63f40)[0x7effb6cb7f40]
  /lib64/libc.so.6(+0x46d40)[0x7effb52f8d40]

  Updates: bz#1646869
  Change-Id: Ia4cb009f821d32b2d18ba48d3467cc81a4b07747
  Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
  Signed-off-by: Hu Jianfei <hujianfei@cmss.chinamobile.com>
* glusterfsd: Fix coverity issues around unused values
  (ShyamsundarR, 2018-10-16, 1 file, -0/+2)

  This patch fixes CID: 1274064, 1274232.

  The fix is to not add the throughput and time values into the return
  dict when there is an error computing them. The pattern applied is the
  same as when an error occurs while adding either the throughput or the
  time value to the dict.

  Change-Id: I33e21e75efbc691f18b818934ef3bf70dd075097
  Updates: bz#789278
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
* all: fix warnings on non 64-bit architectures
  (Xavi Hernandez, 2018-10-10, 1 file, -1/+1)

  When compiling on other architectures many warnings appear. Some of
  them are actual problems that prevent gluster from working correctly
  on those architectures.

  Change-Id: Icdc7107a2bc2da662903c51910beddb84bdf03c0
  fixes: bz#1632717
  Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* glusterfsd/mgmt: Check for NULL after creating frame
  (Ashish Pandey, 2018-09-21, 1 file, -0/+4)

  CID: 1382390
  https://scan6.coverity.com/reports.htm#v42607/p10714fileInstanceId=85472585&defectInstanceId=26074725&mergedDefectId=1382390

  Change-Id: Iade073a5e72f29ad5e8f372955869bc287eb9793
  updates: bz#789278
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* Land part 2 of clang-format changes
  (Gluster Ant, 2018-09-12, 1 file, -2416/+2429)

  Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
  Signed-off-by: Nigel Babu <nigelb@redhat.com>
* mgmt/glusterd: Fix coverity issue
  (Ashish Pandey, 2018-09-11, 1 file, -3/+6)

  CID: 727146, 727066
  https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=85393035&defectInstanceId=26034751&mergedDefectId=727146
  https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=85392913&defectInstanceId=26034571&mergedDefectId=727066

  updates: bz#789278
  Change-Id: Ieaef33829ec88e68690dabce4ea21d2e61dad9f6
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* multiple files: move from strlen() to sizeof()
  (Yaniv Kaul, 2018-08-29, 1 file, -2/+2)

  {glusterfsd|glusterfsd-mgmt|quota-common-utils|xlator|tier|stripe}.c
  tools/setgfid2path/src/main.c
  xlators/cluster/afr/src/afr-inode-read.c
  {glusterfs-acl|glusterfs}.h

  For const strings, do the size calculation at compile time instead of
  at runtime.

  Compile-tested only!

  Change-Id: I303684b1ff29b05c10126fb1057f507e404ced07
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
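  The transformation, sketched; the macro name CONST_STRLEN here is
  hypothetical (a helper along these lines works only on string
  literals, where sizeof is a compile-time constant):

      #include <string.h>

      /* sizeof on a string literal includes the trailing NUL, so
       * subtracting one gives the length at compile time;
       * strlen() scans the bytes at runtime. */
      #define CONST_STRLEN(s) (sizeof(s) - 1)

      static size_t before(void) { return strlen("trusted.glusterfs"); }
      static size_t after(void)  { return CONST_STRLEN("trusted.glusterfs"); }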
* coverity: Multiple coverity fixes for issues with HIGH severity
  (ShyamsundarR, 2018-08-22, 1 file, -10/+8)

  glfs-fops.c
    1391414 Uninitialized pointer read
    The list head needed initialization.

  glusterfsd-mgmt.c graph.c
    1382431 Buffer not null terminated
    1382417 Dereference before null check
    1382347 Buffer not null terminated
    Cleaned up usage of the volfile_checksum member of the gf_volfile_t
    struct across the code base.

  glusterd-tier.c
    1382426 Resource leak
    1370955 Dereference before null check
    The function fixed needs more work, but with tier almost being
    deprecated, addressed some parts of the reported coverity issues as
    appropriate.

  Tested using the following test cases:
    ./tests/basic/tier/new-tier-cmds.t
    ./tests/basic/tier/tier.t
    ./tests/basic/tier/bug-1214222-directories_missing_after_attach_tier.t
    ./tests/basic/tier/tier_lookup_heal.t
    ./tests/basic/tier/tier-heald.t
    ./tests/basic/tier/tier-snapshot.t
    ./tests/features/glfs-lease.t

  Change-Id: I396f1c34bb112bb22d2745ed279e1a4850cac4af
  Updates: bz#789278
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
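  The usual remedy for the "buffer not null terminated" class, sketched
  generically (hypothetical function; not the volfile_checksum code
  itself):

      #include <stdio.h>

      static void copy_checksum(char *dst, size_t dstlen, const char *src)
      {
          /* strncpy() may leave dst unterminated; snprintf() (or an
           * explicit dst[dstlen - 1] = '\0' after strncpy) always
           * terminates. */
          snprintf(dst, dstlen, "%s", src);
      }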
* coverity: last of the secure temp fixes
  (ShyamsundarR, 2018-08-13, 1 file, -3/+1)

  The Coverity ignore directive does not work if the comment is split
  across lines (or has an empty line at the end). This can be seen in
  this report:
  https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-08-06-b982e09f/html/1/384glusterfsd-mgmt.c.html#error

  In other places the same pattern has kept Coverity from flagging the
  same call, except here.

  Updates: bz#789278
  Change-Id: Ic35ff0fc91d0a42904630728ef7c18215aa277f3
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
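  The single-line form the directive needs, sketched (assuming the
  secure_temp checker tag used elsewhere in the tree):

      #include <stdlib.h>

      static int make_temp(char *template) /* e.g. a mutable "XXXXXX" path */
      {
          /* The annotation must be one comment on a single line directly
           * above the flagged call; split across lines, Coverity
           * ignores it. */
          /* coverity[secure_temp] mkstemp uses 0600 as the mode and is safe */
          return mkstemp(template);
      }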
* coverity: Fix remaining SECURE_TEMP issues reported
  (ShyamsundarR, 2018-08-03, 1 file, -1/+31)

  Two pending SECURE_TEMP issues still existed in the coverity reports;
  these are fixed by this patch.

  In both instances (where the functions actually seem to be duplicates
  of each other) the need was for a FILE * and not an fd. Applied the
  same pattern as in other parts of the code where mkstemp is used and a
  FILE * is later created from the resulting fd.

  Coverity report:
  https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-07-30-4d3c62e7/html/

  Issues numbered: 382, 383 (named SECURE_TEMP)

  Further, added tmpfile to the blacklist in symbol-check.sh, so that
  future code changes do not reintroduce it. Also corrected shellcheck
  errors in symbol-check.sh while updating it.

  Updates: bz#789278
  Change-Id: I1d572a16ca5b5df2f597aeaa5f454fad34c8296e
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
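  The mkstemp-then-FILE* pattern the patch applies, in standard POSIX
  (error handling abbreviated; not the actual gluster functions):

      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>

      /* 'template' must be a writable buffer ending in "XXXXXX",
       * e.g. a copy of "/tmp/gf.XXXXXX". */
      static FILE *secure_tmpfile(char *template)
      {
          int fd = mkstemp(template); /* creates the file with mode 0600 */
          if (fd < 0)
              return NULL;
          FILE *fp = fdopen(fd, "w+"); /* wrap the fd in a FILE* */
          if (!fp)
              close(fd);
          return fp;
      }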
* build: rename event.h to gf-event.h
  (Niels de Vos, 2018-07-27, 1 file, -1/+1)

  Newer FreeBSD versions (noticed with 10.3-RELEASE) provide an event.h
  file that on occasion gets included instead of the libglusterfs one.
  When this happens, 'struct event_pool' will not be defined and the
  build fails with errors like:

    autoscale-threads.c:18:55: error: incomplete definition of type
        'struct event_pool'
            int thread_count = pool->eventthreadcount;
                               ~~~~^
    autoscale-threads.c:17:16: note: forward declaration of
        'struct event_pool'
            struct event_pool *pool = ctx->event_pool;
                               ^

  This problem is caused by 'pkg-config --cflags uuid' adding
  /usr/local/include to the GF_CPPFLAGS. The use of libuuid is preferred
  so that the contrib/uuid/ directory can be removed.

  By renaming event.h to gf-event.h there is no conflict between the
  different event.h files anymore, and compiling on FreeBSD works
  without issues.

  Change-Id: Ie69f6b8a4f8f8e9630d39a86693eb74674f0f763
  Updates: bz#1607319
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
* glusterd: Add multiple checks before attach/start a brick
  (Mohit Agrawal, 2018-07-27, 1 file, -0/+3)

  Problem: In a brick mux scenario, sometimes glusterd is not able to
  start/attach a brick while gluster v status shows the brick as already
  running.

  Solution:
  1) To make sure the brick is running, check for the brick_path in
     /proc/<pid>/fd; if the brick is held open by the brick process, its
     stack has come up, otherwise not (see the sketch after this entry).
  2) Before starting/attaching a brick, check whether the brick is
     mounted.
  3) At the time of printing volume status, check that the brick is
     consumed by a brick process.

  Test procedure:
  1) Set up a brick mux environment on a vm.
  2) Put a breakpoint in gdb in posix_health_check_thread_proc at the
     point of notifying the GF_EVENT_CHILD_DOWN event.
  3) Unmount any one brick path forcefully.
  4) Check gluster v status; it will show N/A for the brick.
  5) Try to start the volume with the force option; glusterd throws the
     message "No device available for mount brick".
  6) Mount the brick_root path.
  7) Try to start the volume with the force option.
  8) The down brick is started successfully.

  Change-Id: I91898dad21d082ebddd12aa0d1f7f0ed012bdf69
  fixes: bz#1595320
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
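  A sketch of the /proc/<pid>/fd check from point 1, using plain Linux
  procfs calls; the actual glusterd helper names differ:

      #include <dirent.h>
      #include <limits.h>
      #include <stdio.h>
      #include <string.h>
      #include <sys/types.h>
      #include <unistd.h>

      /* Return 1 if any open fd of 'pid' resolves to a path under
       * 'brick_path', i.e. the brick stack has really come up. */
      static int pid_has_brick_open(pid_t pid, const char *brick_path)
      {
          char fddir[64], lnk[PATH_MAX + 64], target[PATH_MAX];
          struct dirent *de;
          DIR *d;
          ssize_t n;
          int found = 0;

          snprintf(fddir, sizeof(fddir), "/proc/%d/fd", (int)pid);
          d = opendir(fddir);
          if (!d)
              return 0;
          while (!found && (de = readdir(d)) != NULL) {
              snprintf(lnk, sizeof(lnk), "%s/%s", fddir, de->d_name);
              n = readlink(lnk, target, sizeof(target) - 1);
              if (n < 0)
                  continue; /* ".", "..", or a vanished fd */
              target[n] = '\0';
              if (strncmp(target, brick_path, strlen(brick_path)) == 0)
                  found = 1;
          }
          closedir(d);
          return found;
      }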
* core: dereference check on the variables in glusterfs_handle_brick_status
  (Hari Gowtham, 2018-07-17, 1 file, -1/+16)

  Problem: In a race condition, active->first, which is supposed to be
  filled, is NULL, and trying to dereference it crashes.

  Backtrace:
  Core was generated by `/usr/sbin/glusterfsd -s bxts470192.eu.rabonet.com --volfile-id prod_xvavol.bxts'.
  Program terminated with signal 11, Segmentation fault.
  1029            any = active->first;
  (gdb) bt

  Change-Id: Ia6291865319a9456b8b01a5251be2679c4985b7c
  fixes: bz#1600451
  Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
* glusterfsd: Do not process GLUSTERD_BRICK_XLATOR_OP if graph is not ready
  (Ravishankar N, 2018-07-02, 1 file, -0/+7)

  Problem: If glustershd gets restarted by glusterd due to a node
  reboot, a volume start force, or anything that changes the shd graph
  (add/remove brick), and an index heal is launched via CLI, there is a
  chance that shd receives this IPC before the graph is fully active.
  When it then accesses glusterfsd_ctx->active, it crashes.

  Fix: Since glusterd does not really wait for the daemons it spawned to
  be fully initialized, and can send the request as soon as rpc
  initialization has succeeded, we just handle it at the shd side. If
  glusterfs_graph_activate() is not yet done in shd but glusterd sends
  GD_OP_HEAL_VOLUME to shd, we fail the request.

  Change-Id: If6cc07bc5455c4ba03458a36c28b63664496b17d
  fixes: bz#1596513
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* rpc/clnt: Don't let consumers manage "connected" state
  (Raghavendra G, 2018-06-04, 1 file, -2/+0)

  The state management of "connected" in rpc is ad hoc as far as the
  responsibility goes. Note that there is nothing wrong with the
  functionality itself. The rpc layer manages this state in the
  disconnect codepath and has exposed an api for consumers to manage it.
  Note that the rpc layer never sets "connected" to true by itself,
  which forces the consumers to use this api to get a working rpc
  connection. The situation is best captured by a comment in the code
  from Jeff Darcy in glusterfsd/src/gf-attach.c:

  -/*
  - * In a sane world, the generic RPC layer would be capable of tracking
  - * connection status by itself, with no help from us. It might invoke our
  - * callback if we had registered one, but only to provide information. Sadly,
  - * we don't live in that world. Instead, the callback *must* exist and *must*
  - * call rpc_clnt_{set,unset}_connected, because that's the only way those
  - * fields get set (with RPC both above and below us on the stack). If we don't
  - * do that, then rpc_clnt_submit doesn't think we're connected even when we
  - * are. It calls the socket code to reconnect, but the socket code tracks this
  - * stuff in a sane way so it knows we're connected and returns EINPROGRESS.
  - * Then we're stuck, connected but unable to use the connection. To make it
  - * work, we define and register this trivial callback.
  - */

  Also, consumers of rpc know about the state of the connection only
  through the notifications sent by rpc-clnt. So consumers don't have
  any extra information to manage the state, and hence letting them
  manage it is counterintuitive. This patch cleans that up and instead
  moves the responsibility for state management into the rpc layer
  itself.

  Change-Id: I31e641a60795fc480ca753917f4b2579f1e05094
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  Fixes: bz#1585585
* protocol/server: don't assume there would be a volfile id
  (Amar Tumballi, 2018-05-08, 1 file, -1/+12)

  Earlier, glusterfs always assumed someone would start it with the
  right arguments, and that brick processes would be spawned by a
  management layer. It just assumed its role based on the volfile. Other
  than the volfile, no other arguments should be technically mandatory
  for glusterfs to work. With this patch, that assumption holds true.

  Updates: github issue #352

  A note on why this particular issue matters for basic sanity: as per
  the design of thin-arbiter/tie-breaker, it can be started
  independently on any machine, without the need of glusterd. So,
  similar to 'glusterd', we should be able to spawn a process with any
  translator without options/volume id etc.

  fixes: bz#1569399
  Change-Id: I5c0650fe0bfde35ad94ccba60e63f6cdcd1ae5ff
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* glusterd: handling brick termination in brick-mux
  (Sanju Rakonde, 2018-05-07, 1 file, -12/+12)

  Problem: There's a race between the glusterfs_handle_terminate()
  response sent to glusterd from the last brick of the process and the
  socket disconnect event that follows after the brick process gets
  killed.

  Solution: When it is the last brick of the brick process, instead of
  sending GLUSTERD_BRICK_TERMINATE to the brick process, glusterd kills
  the process directly (the same as in the non brick-multiplexing case).

  The test case is added for
  https://bugzilla.redhat.com/show_bug.cgi?id=1549996

  Change-Id: If94958cd7649ea48d09d6af7803a0f9437a85503
  fixes: bz#1545048
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterfsd: initiate pmap_signout for all detach brick requests
  (Atin Mukherjee, 2018-05-04, 1 file, -0/+1)

  In glusterfs_handle_terminate, all bricks getting detached need to
  initiate a pmap_signout.

  Change-Id: Iacbd6fcd49215fe6a5210df7dfed1260fde9179a
  Fixes: bz#1570011
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* server: fix unresolved symbols by moving them to libglusterfs
  (Mohit Agrawal, 2018-04-20, 1 file, -103/+0)

  Problem: The glusterd2 build failed due to undefined symbols
  (xlator_mem_cleanup, glusterfsd_ctx) in server.so.

  Solution: Two changes resolve this:
  1) Move the xlator_mem_cleanup code from glusterfsd-mgmt.c to xlator.c
     so it becomes part of libglusterfs.so.
  2) Replace glusterfsd_ctx with this->ctx, because the symbol
     glusterfsd_ctx is not part of server.so.

  BUG: 1544090
  Change-Id: Ie5e6fba9ed458931d08eb0948d450aa962424ae5
  fixes: bz#1544090
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* gluster: Sometimes the brick process crashes at the time of stopping a brick
  (Mohit Agrawal, 2018-04-19, 1 file, -21/+75)

  Problem: Sometimes the brick process crashes at the time of stopping a
  brick while brick mux is enabled.

  Solution: The brick process was crashing because the rpc connection
  was not cleaned up properly while brick mux is enabled. With this
  patch, after sending the GF_EVENT_CLEANUP notification to the xlator
  (server), we wait for all rpc client connections to be destroyed for
  that specific xlator. Once the rpc connections of all clients
  associated with that brick are destroyed in server_rpc_notify, we call
  xlator_mem_cleanup for the brick xlator as well as all child xlators.
  To avoid races at the time of cleanup, two new flags are introduced on
  each xlator: cleanup_starting and call_cleanup.

  BUG: 1544090
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>

  Note: All test cases were run in a separate build
  (https://review.gluster.org/#/c/19700/) with the same patch after
  enabling brick mux forcefully; all test cases passed.

  Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf
  updates: bz#1544090
* glusterd: volume inode/fd status broken with brick mux
  (hari gowtham, 2018-04-19, 1 file, -19/+15)

  Problem: The values for inode/fd were populated from the ctx received
  from the server xlator. Without brick mux, each brick process served a
  single brick of the volume, so searching for the server xlator and
  populating from it worked. With brick mux, a number of bricks can be
  confined to a single process. These bricks can be from different
  volumes too (if we use the max-bricks-per-process option). If they are
  from different volumes, using the server xlator to populate causes
  problems.

  Fix: Use the brick to validate and populate the inode/fd status.

  Signed-off-by: hari gowtham <hgowtham@redhat.com>
  Change-Id: I2543fa5397ea095f8338b518460037bba3dfdbfd
  fixes: bz#1566067
* cluster/afr: Make sure latency-arg is passed to afr
  (Pranith Kumar K, 2018-04-18, 1 file, -1/+1)

  xlator_notify doesn't pass along the extra arguments that come in to
  the input function, so the XLATOR_NOTIFY macro should be used instead
  to pass the extra arguments to the function.

  BUG: 1567881
  fixes bz#1567881
  Change-Id: Ic15b6c446638cbacf3149693147a754219037c47
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
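  Why a macro fixes the dropped arguments, sketched with deliberately
  simplified declarations (the real definitions live in libglusterfs'
  xlator.h):

      /* Simplified xlator with a variadic notify hook. */
      struct xlator {
          int (*notify)(struct xlator *this, int event, void *data, ...);
      };

      /* A fixed-signature wrapper like xlator_notify(this, event, data)
       * cannot forward trailing varargs, so AFR's latency argument is
       * lost. A variadic macro forwards everything as-is
       * (##__VA_ARGS__ is a GNU extension that swallows the comma when
       * no extra args are given): */
      #define XLATOR_NOTIFY(xl, event, data, ...) \
          ((xl)->notify((xl), (event), (data), ##__VA_ARGS__))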
* Revert "glusterd: handling brick termination in brick-mux"Sanju Rakonde2018-03-291-22/+11
| | | | | | | | | | | | | This reverts commit a60fc2ddc03134fb23c5ed5c0bcb195e1649416b. This commit was causing multiple tests to time out when brick multiplexing is enabled. With further debugging, it's found that even though the volume stop transaction is converted into mgmt_v3 to allow the remote nodes to follow the synctask framework to process the command, there are other callers of glusterd_brick_stop () which are not synctask based. Change-Id: I7aee687abc6bfeaa70c7447031f55ed4ccd64693 updates: bz#1545048
* glusterd: handling brick termination in brick-mux
  (Sanju Rakonde, 2018-03-28, 1 file, -11/+22)

  Problem: There's a race between the last glusterfs_handle_terminate()
  response sent to glusterd and the kill that happens immediately if the
  terminated brick is the last brick.

  Solution: When it is the last brick for the brick process, instead of
  glusterfsd killing itself, glusterd will kill the process in case of
  brick multiplexing. The gf_attach utility is changed accordingly.

  Change-Id: I386c19ca592536daa71294a13d9fc89a26d7e8c0
  fixes: bz#1545048
  BUG: 1545048
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: TLS verification fails while using intermediate CA
  (Mohit Agrawal, 2018-03-19, 1 file, -0/+2)

  Problem: TLS verification fails while using an intermediate CA if
  management SSL is enabled.

  Solution: There are two main reasons for the TLS verification failing:
  1) Not calling the ssl api to set the certificate depth.
  2) The current code does not allow setting the certificate depth while
     management SSL is enabled.

  After applying this patch, to set the certificate depth the user needs
  to set the option transport.socket.ssl-cert-depth <depth> in
  /var/lib/glusterd/secure-access instead of in
  /etc/glusterfs/glusterd.vol. At the time of setting secure_mgmt in
  ctx, we check the value of cert-depth and save it in ctx. If the user
  does not provide any cert-depth value, the default value of 1 is used.

  BUG: 1555154
  Change-Id: I89e9a9e1026e37efb5c20f9ec62b1989ef644f35
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
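  Point 1 refers to the standard OpenSSL call for this; a minimal sketch
  of applying a configured depth to a context (hypothetical wrapper, not
  the gluster socket transport code):

      #include <openssl/ssl.h>

      /* Apply the configured certificate chain verification depth
       * (e.g. from transport.socket.ssl-cert-depth) to an SSL context.
       * Depth 1 only allows a directly-issuing CA; intermediate CAs
       * need a larger depth. */
      static void apply_cert_depth(SSL_CTX *ctx, int depth)
      {
          SSL_CTX_set_verify_depth(ctx, depth);
      }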
* build: address linkage issues
  (Kaleb S. KEITHLEY, 2018-03-05, 1 file, -87/+7)

  We have the following undefined symbol errors from protocol/server.so:
    glusterfs_mgmt_pmap_signout
    glusterfs_autoscale_threads

  See https://review.gluster.org/19225 (bz#1532238) and
  https://review.gluster.org/19657 (bz#1550895)
  (why are there two different bzs for the same bug?)

  IMO this is a cleaner solution, i.e. moving the above two functions to
  libgfrpc (.../rpc/rpc-lib/...).

  I would also, for (foolish) consistency sake, like to see
  glusterfs_mgmt_pmap_signin() moved from glusterfsd to libgfrpc as
  well.

  This works on f28/rawhide, with its new, more restrictive run-time
  link semantics. The smoke and regression tests on earlier fedora and
  centos will confirm that it works on those platforms too.

  Change-Id: I9cfbd1cc15e7ebd9fc31b56ac791287fa2c584de
  BUG: 1550895
  Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* glusterfsd: Memleak in glusterfsd process while brick mux is on
  (Mohit Agrawal, 2018-02-27, 1 file, -0/+68)

  Problem: At the time of stopping the volume while brick multiplexing
  is enabled, memory is not cleaned up from all server-side xlators.

  Solution: To clean up memory for all server-side xlators, call fini in
  glusterfs_handle_terminate after sending the GF_EVENT_CLEANUP
  notification to the top xlator.

  BUG: 1544090
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>

  Note: All test cases were run in a separate build
  (https://review.gluster.org/19574) with the same patch after enabling
  brick mux forcefully; all test cases passed.

  Change-Id: Ia10dc7f2605aa50f2b90b3fe4eb380ba9299e2fc
* rpcsvc: scale rpcsvc_request_handler threads
  (Milind Changire, 2018-02-26, 1 file, -3/+13)

  Scale rpcsvc_request_handler threads to match the scaling of event
  handler threads.

  Please refer to
  https://bugzilla.redhat.com/show_bug.cgi?id=1467614#c51 for a
  discussion about why we need multi-threaded rpcsvc request handlers.

  Change-Id: Ib6838fb8b928e15602a3d36fd66b7ba08999430b
  Signed-off-by: Milind Changire <mchangir@redhat.com>
* Revert "glusterfsd: Memleak in glusterfsd process while brick mux is on"Mohit Agrawal2018-02-191-65/+0
| | | | | | | | | | | There are still remain some code paths where cleanup is required while brick mux is on.I will upload a new patch after resolve all code paths. This reverts commit b313d97faa766443a7f8128b6e19f3d2f1b267dd. BUG: 1544090 Change-Id: I26ef1d29061092bd9a409c8933d5488e968ed90e Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* Fetch backup volfile servers from glusterd2
  (Prashanth Pai, 2018-02-16, 1 file, -0/+48)

  Clients will request a list of volfile servers from glusterd2 by
  setting an (optional) flag in the GETSPEC RPC call. glusterd2 will
  check for the presence of this flag and accordingly return a list of
  glusterd2 servers in the GETSPEC RPC reply. Currently, the list of
  servers returned only contains servers which have bricks belonging to
  the volume.

  See:
  https://github.com/gluster/glusterd2/issues/382
  https://github.com/gluster/glusterfs/issues/351

  Updates #351
  Change-Id: I0eee3d0bf25a87627e562380ef73063926a16b81
  Signed-off-by: Prashanth Pai <ppai@redhat.com>
* glusterfsd: Memleak in glusterfsd process while brick mux is on
  (Mohit Agrawal, 2018-02-15, 1 file, -0/+65)

  Problem: At the time of stopping the volume while brick multiplexing
  is enabled, memory is not cleaned up from all server-side xlators.

  Solution: To clean up memory for all server-side xlators, call fini in
  glusterfs_handle_terminate after sending the GF_EVENT_CLEANUP
  notification to the top xlator.

  BUG: 1544090
  Change-Id: Ifa1525e25b697371276158705026b421b4f81140
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* Revert "rpc: merge ssl infra with epoll infra"Milind Changire2018-01-071-3/+0
| | | | | | | This reverts commit 56e5fdae74845dfec0ff7ad0c8fee77695d36ad5. Change-Id: Ia62cee5440bbe8e23f5da9cff692d792091d544a Signed-off-by: Milind Changire <mchangir@redhat.com>
* all: Simplify component message id's definition
  (Xavier Hernandez, 2017-12-14, 1 file, -3/+6)

  This patch creates a new way of defining message id's that is easier
  and less error prone, because it doesn't require so many manual
  changes each time a new component is defined or a new message created.

  Change-Id: I71ba8af9ac068f5add7e74f316a2478bc991c67b
  Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
* rpc: merge ssl infra with epoll infra
  (Milind Changire, 2017-12-12, 1 file, -0/+3)

  This patch attempts to use the epoll infra for handling SSL
  connections as well, instead of the socket_poller() thread function.
  This essentially makes the priv->own_thread flag redundant.

  SSL_connect()/SSL_accept() is now non-blocking, which does away with
  the localised poll() in ssl_do(). So ssl_do() has been updated
  appropriately. own_thread, and coincidentally the socket_poller()
  thread for SSL processing, is now deprecated.

  Added a timeout to test whether the self-heal daemon is up and
  running, as per Ravi's suggestion.

  Change-Id: If2b5d7b4fd19e321cb289e08d49a718d2161aafe
  Signed-off-by: Milind Changire <mchangir@redhat.com>
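  Making SSL_connect()/SSL_accept() non-blocking means handling the
  WANT_READ/WANT_WRITE retry codes from the event loop; a standard
  OpenSSL sketch (hypothetical helper, not the gluster socket code
  itself):

      #include <openssl/ssl.h>

      /* Returns 1 when the handshake completed, 0 when it must be
       * retried once the fd is readable/writable again, -1 on real
       * errors. */
      static int ssl_handshake_step(SSL *ssl)
      {
          int r = SSL_connect(ssl); /* or SSL_accept() on the server side */
          if (r == 1)
              return 1;
          switch (SSL_get_error(ssl, r)) {
          case SSL_ERROR_WANT_READ:
          case SSL_ERROR_WANT_WRITE:
              return 0;             /* re-arm in epoll and call again */
          default:
              return -1;
          }
      }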
* metrics: provide options to dump metrics from xlators
  (Amar Tumballi, 2017-12-06, 1 file, -0/+74)

  * Introduce xlator methods to allow dumping of metrics
  * Separate options to get the metrics dumped in a path

  Updates #168
  Change-Id: I7df80df33b71d6f449f03c2332665b4a45f6ddf2
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* Coverity Issue: PW.INCLUDE_RECURSION in several files
  (Girjesh Rajoria, 2017-11-09, 1 file, -1/+0)

  Coverity ID: 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417,
  418, 419, 423, 424, 425, 426, 427, 428, 429, 436, 437, 438, 439, 440,
  441, 442, 443

  Issue: Event include_recursion

  Removed redundant, recursive includes from the files.

  Change-Id: I920776b1fa089a2d4917ca722d0075a9239911a7
  BUG: 789278
  Signed-off-by: Girjesh Rajoria <grajoria@redhat.com>