path: root/xlators/mgmt/glusterd/src/glusterd-handler.c
* glusterd: get-state command should not fail if any brick is gone bad [tag: v7dev] (Sanju Rakonde, 2019-02-05; 1 file, -4/+5)

  Problem: the get-state command errors out if any of the underlying
  bricks of any volume in the cluster goes bad. It is expected that
  get-state should not error out, but should still generate an output
  successfully.

  Solution: in glusterd_get_state(), a statfs call is made on the brick
  path of every brick of every volume to calculate the total and free
  memory available. If the statfs call fails on any brick, we should
  not error out, but should report the total and free memory of that
  brick as 0. This patch also handles a statfs failure scenario in
  glusterd_store_retrieve_bricks().

  fixes: bz#1672205
  Change-Id: Ia9e8a1d8843b65949d72fd6809bd21d39b31ad83
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
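  A minimal sketch of the fallback described above, using a
  hypothetical helper (not the actual glusterd code) and POSIX
  statvfs(3):

      #include <stdint.h>
      #include <sys/statvfs.h>

      /* Never fail get-state on a bad brick: report 0/0 instead. */
      static void
      brick_capacity(const char *brick_path, uint64_t *total, uint64_t *avail)
      {
          struct statvfs buf = {0};

          if (statvfs(brick_path, &buf) != 0) {
              *total = *avail = 0; /* brick gone bad, keep going */
              return;
          }
          *total = (uint64_t)buf.f_blocks * buf.f_frsize;
          *avail = (uint64_t)buf.f_bfree * buf.f_frsize;
      }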
* rpc: use address-family option from vol file (Milind Changire, 2019-01-22; 1 file, -3/+8)

  This patch helps enable IPv6 connections in the cluster. Without this
  option set explicitly, the default address-family is IPv4. When
  address-family is set to "inet6" in the /etc/glusterfs/glusterd.vol
  file, the mount command line also needs
  -o xlator-option="transport.address-family=inet6" added to it. The
  option also gets added to the brick command line. Snapshot and gfapi
  use cases should likewise pass in the inet6 address-family.

  Change-Id: I97db91021af27bacb6d7578e33ea4817f66d7270
  fixes: bz#1635863
  Signed-off-by: Milind Changire <mchangir@redhat.com>
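  Putting the two pieces of the commit message together, a sketch of
  the configuration (volume name and mount point are placeholders):

      # /etc/glusterfs/glusterd.vol
      volume management
          type mgmt/glusterd
          option transport.address-family inet6
          ...
      end-volume

      # client side, matching xlator-option on the mount
      mount -t glusterfs \
          -o xlator-option="transport.address-family=inet6" \
          server1:/myvol /mnt/myvol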
* glusterd: Avoid dict_leak in __glusterd_handle_cli_uuid_get function (Mohit Agrawal, 2019-01-22; 1 file, -0/+2)

  Change-Id: Iefe08b136044495f6fa2b092c9e8c833efee1400
  fixes: bz#1667905
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
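  The fix pattern, sketched with the libglusterfs dict API (error
  handling trimmed; the unref on the exit path is what the patch adds):

      int ret = -1;
      dict_t *dict = dict_new();

      if (!dict)
          goto out;
      /* ... set the uuid in dict and send the CLI response ... */
      ret = 0;
  out:
      if (dict)
          dict_unref(dict); /* the missing unref: without it, every
                             * cli uuid-get leaked one dict */
      return ret;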
* glusterd: fix crash (Sanju Rakonde, 2019-01-13; 1 file, -3/+2)

  Problem: running "gluster get-state glusterd odir /get-state"
  resulted in a glusterd crash.

  Cause: in the above command the output directory was specified
  without a trailing "/". When the trailing "/" is missing, glusterd
  appends it to the path with strcat(), so the appended "/" character
  has no memory allocated for it. When the path is later freed,
  glusterd crashes.

  Solution: instead of concatenating "/" onto the output directory, add
  it to the output filename.

  Change-Id: I5dc00a71e46fbef4d07fe99ae23b36fb60dec1c2
  fixes: bz#1665038
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
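  A sketch of the fix as described (the filename literal is
  illustrative, not the actual get-state filename):

      char filename[PATH_MAX];

      /* Crash-prone: strcat(odir, "/") grows a buffer that was sized
       * exactly for the user-supplied path. */

      /* Fixed: leave odir untouched and put the separator into the
       * new filename string instead. */
      snprintf(filename, sizeof(filename), "%s/%s", odir, "glusterd_state");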
* glusterd: migrating rebalance commands to mgmt_v3 framework (Sanju Rakonde, 2018-12-18; 1 file, -1/+2)

  Current rebalance commands use the op_state machine framework.
  Porting them to use the mgmt_v3 framework.

  Change-Id: I6faf4a6335c2e2f3d54bbde79908a7749e4613e7
  fixes: bz#1655827
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: Resolve memory leak in some glusterd functions (Mohit Agrawal, 2018-12-10; 1 file, -0/+6)

  Problem: some functions allocate memory for a req structure but fail
  to free it after submitting the request.

  Solution: clean up the allocated memory after the request is
  submitted.

  Change-Id: I8f995787ed8986b882f008ccd588670b5d4139f5
  updates: bz#1633930
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* libglusterfs: Move devel headers under glusterfs directory (ShyamsundarR, 2018-12-05; 1 file, -13/+13)

  libglusterfs devel package headers were referenced in code using the
  include semantics of a program ("header.h"). While this works, it can
  be improved, especially when dealing with out-of-tree xlator builds
  or out-of-tree devel package usage in general.

  Towards this, the following changes are done:
  - moved all devel headers under a glusterfs directory
  - included these headers using system header notation <> in all code
    outside of libglusterfs
  - included these headers using own-program notation "" within
    libglusterfs

  This change, although big, just moves the headers around and corrects
  how they are included from other sources. It lets us include
  libglusterfs headers correctly, without namespace conflicts.

  Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
  Updates: bz#1193929
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
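  The resulting include convention, illustrated with two libglusterfs
  headers:

      /* Outside libglusterfs (xlators, daemons, gfapi consumers): */
      #include <glusterfs/dict.h>
      #include <glusterfs/xlator.h>

      /* Within libglusterfs itself: */
      #include "dict.h"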
* glusterd: migrating profile commands to mgmt_v3 framework (Sanju Rakonde, 2018-12-04; 1 file, -4/+16)

  Current profile commands use the op_state machine framework. Porting
  them to use the mgmt_v3 framework.

  The following tests were performed on the patch:

  case 1:
  1. On a 3 node cluster, created and started 3 volumes
  2. Mounted all three volumes and wrote some data
  3. Started the profile operation for all the volumes
  4. Ran "gluster v status" from N1, "gluster v profile <volname1> info"
     from N2, and "gluster v profile <volname2> info" from N3
     simultaneously in a loop for around 10000 iterations
  5. No cores were generated.

  case 2:
  1. Repeated steps 1, 2 and 3 from case 1.
  2. Ran "gluster v status" from N1, "gluster v profile <volname1> info"
     from N2 (terminal 1), and "gluster v profile <volname2> info" from
     N2 (terminal 2) simultaneously in a loop.
  3. No cores were generated.

  fixes: bz#1654181
  Change-Id: I83044cf5aee3970ef94066c89fcc41783ed468a6
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: perform rcu_read_lock/unlock() under cleanup_lock mutex (Sanju Rakonde, 2018-12-03; 1 file, -37/+37)

  Problem: glusterd should not try to acquire locks on any resources
  once it has received a SIGTERM and cleanup has started. Otherwise we
  might hit a segfault: the thread going through the cleanup path will
  be freeing the resources while some other thread may be trying to
  acquire locks on those freed resources.

  Solution: perform rcu_read_lock/unlock() under the cleanup_lock
  mutex.

  fixes: bz#1654270
  Change-Id: I87a97cfe4f272f74f246d688660934638911ce54
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
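  A sketch of the locking pattern (macro shape illustrative, not the
  exact tree code; rcu_read_lock() is from liburcu):

      /* Take the RCU read lock only while holding the cleanup mutex,
       * so cleanup_and_exit() cannot tear resources down under us. */
      #define RCU_READ_LOCK(conf)                                \
          do {                                                   \
              pthread_mutex_lock(&(conf)->cleanup_lock);         \
              rcu_read_lock();                                   \
              pthread_mutex_unlock(&(conf)->cleanup_lock);       \
          } while (0)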
* glusterd/mux: Optimize brick disconnect handler code (Mohammed Rafi KC, 2018-11-18; 1 file, -63/+14)

  Removed unnecessary iteration from the brick disconnect handler when
  multiplexing is enabled.

  Change-Id: I62dd3337b7e7da085da5d76aaae206e0b0edff9f
  fixes: bz#1650115
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* glusterd: fix Resource leak coverity issue (Mohit Agrawal, 2018-11-16; 1 file, -2/+10)

  Problem: commit bcf1e8b07491b48c5372924dbbbad5b8391c6d81 missed
  freeing the path returned by search_brick_path_from_proc().

  This patch fixes CID 1396668: Resource leak.

  Change-Id: I4888c071c1058023c7e138a8bcb94ec97305fadf
  fixes: bz#1646892
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* core: Portmap entries showing stale brick entries when bricks are down (Mohit Agrawal, 2018-11-12; 1 file, -2/+5)

  Problem: pmap shows stale brick entries after a brick goes down,
  because glusterd_brick_rpc_notify calls gf_is_service_running before
  calling pmap_registry_remove to check on the brick instance.

  Solution:
  1) Tighten the condition in gf_is_pid_running to verify process
     existence; use open instead of access to achieve this.
  2) Call search_brick_path_from_proc in __glusterd_brick_rpc_notify
     along with gf_is_service_running.

  Change-Id: Ia663ac61c01fdee6c12f47c0300cdf93f19b6a19
  fixes: bz#1646892
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* libglusterfs/dict: Add sizeof()-1 variants of dict functions (Pranith Kumar K, 2018-10-15; 1 file, -4/+2)

  One needs to be very careful to pass the same key for the key and
  SLEN(key) arguments of the dict_xxxn() functions. Macros that take
  care of passing SLEN(key) reduce this burden on the developer and the
  reviewer.

  updates: bz#1193929
  Change-Id: I312c479b919826570b47ae2c219c53e2f9b2ddef
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
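  The hazard and the remedy, sketched (SLEN is the tree's compile-time
  strlen for string literals; the wrapper name below is illustrative):

      #define SLEN(str) (sizeof(str) - 1) /* literals only */

      char *value = NULL;
      int ret = 0;

      /* Error-prone: key and length are passed separately and can
       * drift apart under copy/paste; note the mismatch below. */
      ret = dict_get_strn(dict, "volname", SLEN("brickname"), &value);

      /* Safer: a wrapper guarantees the length matches the key. */
      #define dict_get_str_sizen(d, key, val) \
          dict_get_strn(d, key, SLEN(key), val)
      ret = dict_get_str_sizen(dict, "volname", &value);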
* glusterd: ignore RPC events when glusterd is shutting down (Atin Mukherjee, 2018-10-04; 1 file, -1/+11)

  When glusterd receives a SIGTERM while RPC connect/disconnect/destroy
  events are still arriving, a thread may crash in rcu_read_lock()
  because the cleanup thread has already freed the resources. This is
  most observable when glusterd comes up with upgrade mode on during
  the upgrade process. The solution is to ignore these events if
  glusterd is already in the middle of cleanup_and_exit().

  Fixes: bz#1635593
  Change-Id: I12831d31c2f689d4deb038b83b9421bd5cce26d9
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
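  The guard, sketched at the top of the notify handler (assuming the
  ctx carries a cleanup flag, as glusterfs_ctx_t does with
  cleanup_started):

      glusterfs_ctx_t *ctx = this->ctx;

      if (ctx && ctx->cleanup_started) {
          /* cleanup_and_exit() is already tearing things down; taking
           * rcu_read_lock() now would race with it, so drop the event. */
          return 0;
      }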
* glusterd: fix resource leak coverity issues (Sanju Rakonde, 2018-09-22; 1 file, -0/+1)

  This patch addresses CID 1288098, 1370948 and 1382454: key_fixed is
  allocated memory that was never freed.

  updates: bz#789278
  Change-Id: Iea805c668ba89759313f9e21b328757e570be97b
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* Land part 2 of clang-format changes (Gluster Ant, 2018-09-12; 1 file, -5534/+5612)

  Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
  Signed-off-by: Nigel Babu <nigelb@redhat.com>
* glusterd: Fix unused value coverity fix (Sanju Rakonde, 2018-09-09; 1 file, -1/+1)

  Commit 09198e203e introduced a new coverity issue with ID 1395635:
  the keylen variable is assigned a value, but the stored value is
  overwritten before it is ever used. This patch addresses the issue.

  updates: bz#789278
  Change-Id: Ice290dcb9d703cd2131b0f0803436660e670e10a
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* Some (mgmt) xlators: use dict_{setn|getn|deln|get_int32n|set_int32n|set_strn} (Yaniv Kaul, 2018-09-09; 1 file, -212/+298)

  In a previous patch (https://review.gluster.org/20769) we added the
  key length argument to the dict_* functions, removing the need to
  strlen() it. This patch moves some xlators over to them.

  - It also adds dict_get_int32n, which was missing.
  - It also reduces the size of some key variables. They were set to
    1024 bytes or PATH_MAX where 64 bytes were sometimes really enough.

  Please review carefully:
  1. That I did not reduce the size of the key variables too much.
  2. That I did not mix up any keys.

  Compile-tested only!

  Change-Id: Ic729baf179f40e8d02bc2350491d4bb9b6934266
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* mgmt/glusterd: Fix memory leaks in glusterd-handler.c (Vijay Bellur, 2018-08-23; 1 file, -1/+2)

  Addresses the following CIDs:

  1370941: Unconditional memory leak in
           glusterd_print_snapinfo_by_vol()
  1370943: Memory leak when opendir fails for the output directory in
           glusterd_get_state()

  Change-Id: I9536841629e1ffe1fed79a8e57d266a0e953e5af
  updates: bz#789278
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* xlators/mgmt/glusterd/src/glusterd-handler.c: reduce size or re-scope message variable (Yaniv Kaul, 2018-08-23; 1 file, -23/+19)

  The error and/or message variable was either:
  - reduced in size (from 2048 bytes to 64 bytes, for example), or
  - changed in scope (defined in a smaller scope).

  Compile-tested only!

  Change-Id: Iebb92a56d9d0ca53c80d75866bcb7848e08cf6b2
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* glusterd: fix some coverity issues (Bhumika Goyal, 2018-08-20; 1 file, -1/+1)

  Fixes CIDs: 1241481 1241482 1274079 1274118 1274121 1274131 1274198
  1274214 1274220 1274224 1394663 1394641 382454 1382453 1382449
  1288095

  Link: https://scan6.coverity.com/reports.htm#v42388/p10714/fileInstanceId=84772667&defectInstanceId=25770661&mergedDefectId=744716
  Change-Id: Idaf434186231c8b0fff4b27c57fa23636a89c8a7
  updates: bz#789278
  Signed-off-by: Bhumika Goyal <bgoyal@redhat.com>
* All: remove memset() before sprintf() (Yaniv Kaul, 2018-08-14; 1 file, -64/+41)

  It's not needed. There's a good chance the compiler is smart enough
  to remove it anyway, but it can't hurt - I hope.

  Compile-tested only!

  Change-Id: Id7c054e146ba630227affa591007803f3046416b
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
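  The removed pattern, for illustration (sprintf() NUL-terminates its
  output, so pre-zeroing the buffer is redundant; the format string is
  a made-up stand-in):

      char msg[64];

      /* Before: */
      memset(msg, 0, sizeof(msg));
      sprintf(msg, "volume %s", volname);

      /* After: */
      sprintf(msg, "volume %s", volname);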
* All: run codespell on the code and fix issues. (Yaniv Kaul, 2018-07-22; 1 file, -3/+3)

  Please review; it's not always just the comments that were fixed.
  I've had to revert, of course, all calls to creat() that had been
  changed to create() ...

  Only compile-tested!

  Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* glusterd: memory leak in get-state (Sanju Rakonde, 2018-07-18; 1 file, -34/+19)

  Problem: the gluster get-state command leaks memory when a
  geo-replication session is configured.

  Cause: in glusterd_print_gsync_status(), we take references to the
  keys of gsync_dict and store them in status_vals[i], where each
  status_vals[i] is allocated with memory of the size of
  gf_gsync_status_t.

  Solution: there is no need for an array of pointers (status_vals); a
  single pointer holding the reference to a key of gsync_dict is
  sufficient.

  The following steps were used for testing:
  1. Configured a geo-rep session
  2. Ran the gluster get-state command 1000 times.

  Without this patch, glusterd's memory was increasing significantly
  (around 22000KB per 1000 runs); with this patch it reduced to around
  1500KB per 1000 runs.

  fixes: bz#1601423
  Change-Id: I361f5525d71f821bb345419ccfdc20ca288ca292
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* Fix compile warnings (Xavi Hernandez, 2018-07-10; 1 file, -2/+7)

  This patch fixes compile warnings that appear with newer compilers.
  The solution applied only removes the warnings; it doesn't always
  solve the problem in the best way. It assumes that the problem will
  never happen, as the previous code assumed.

  Change-Id: I6e8470d6c2e2dbd3bd7d324b5fd2f92ffdc3d6ec
  updates: bz#1193929
  Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* glusterd: Fix glusterd crash (Sanju Rakonde, 2018-07-06; 1 file, -9/+0)

  Problem: the gluster get-state command crashes the glusterd process
  when a geo-replication session is configured.

  Cause: the crash is a double free.
  glusterd_print_gsync_status_by_vol calls dict_unref(), which frees
  all the keys and values in the dictionary. Before calling
  dict_unref(), it calls glusterd_print_gsync_status(), which frees
  values in the dictionary; when dict_unref() is then called, it tries
  to free values that have already been freed.

  Solution: remove the code that frees the memory in the
  glusterd_print_gsync_status function.

  Fixes: bz#1598345
  Change-Id: Id3d8aae109f377b462bbbdb96a8e3c5f6b0be752
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: Introduce daemon-log-level cluster wide option (Atin Mukherjee, 2018-07-03; 1 file, -0/+1)

  This option, applicable to the node-level daemons, can be very
  helpful in controlling the log level of these services. Please note
  that any daemon started prior to setting a specific value of this
  option (if not INFO) will need a restart for the change to take
  effect.

  Change-Id: I7f6d2620bab2b094c737f5cc816bc093e9c9c4c9
  fixes: bz#1597473
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* rpc/clnt: Don't let consumers manage "connected" state (Raghavendra G, 2018-06-04; 1 file, -4/+0)

  The state management of "connected" in rpc is ad hoc as far as
  responsibility goes. Note that there is nothing wrong with the
  functionality itself. The rpc layer manages this state in the
  disconnect codepath and has exposed an api for consumers to manage
  it. Note that the rpc layer never sets "connected" to true by itself,
  which forces consumers to use this api to get a working rpc
  connection. The situation is best captured by a comment in the code
  from Jeff Darcy, in glusterfsd/src/gf-attach.c:

      /*
       * In a sane world, the generic RPC layer would be capable of
       * tracking connection status by itself, with no help from us. It
       * might invoke our callback if we had registered one, but only
       * to provide information. Sadly, we don't live in that world.
       * Instead, the callback *must* exist and *must* call
       * rpc_clnt_{set,unset}_connected, because that's the only way
       * those fields get set (with RPC both above and below us on the
       * stack). If we don't do that, then rpc_clnt_submit doesn't
       * think we're connected even when we are. It calls the socket
       * code to reconnect, but the socket code tracks this stuff in a
       * sane way so it knows we're connected and returns EINPROGRESS.
       * Then we're stuck, connected but unable to use the connection.
       * To make it work, we define and register this trivial callback.
       */

  Also, consumers of rpc know about the state of the connection only
  through the notifications sent by rpc-clnt, so they have no extra
  information with which to manage the state; letting them manage it is
  counter-intuitive. This patch cleans that up and instead moves the
  responsibility for state management into the rpc layer itself.

  Change-Id: I31e641a60795fc480ca753917f4b2579f1e05094
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  Fixes: bz#1585585
* changed 'sometime' messages to 'some time' (Levi Baber, 2018-06-01; 1 file, -2/+2)

  Change-Id: I0936229fc84c011db7791218bb566c971fdea174
  fixes: bz#1584864
  Signed-off-by: Levi Baber <baber@iastate.edu>
* glusterd: address test failures with brick mux enabled (Atin Mukherjee, 2018-05-31; 1 file, -0/+8)

  This patch addresses the following:
  1. On volume stop, for the last brick, pmap_registry_remove() is
     invoked by glusterd.
  2. If a brick process is sigkilled, remove all the associated brick
     instances from the portmap.
  3. Bump up PROCESS_UP_TIMEOUT to 45.
  4. gf_attach to kill a brick takes more time in mux (an issue that
     needs its own fix), but in the interim give br-state-check.t more
     time to complete (there are 2 kill_bricks, each taking 120
     seconds, and the test usually passes in 30-odd seconds, hence
     bumping this up to 350 seconds).
  5. The test bug-1559004-EMLINK-handling.t takes ~950 seconds at times
     on master without mux; in mux cases, when it fails, it is almost
     at the last iteration, hence bumping the timeout for this test
     case to reduce regression error rates.

  Updates: bz#1577672
  Change-Id: I1922675e112baca4c125c4c094eaa42a11e34e67
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: glusterd is releasing the locks before timeout (Sanju Rakonde, 2018-05-28; 1 file, -0/+10)

  Problem: we introduced a lock timer in mgmt v3 which releases the
  lock 3 minutes after command execution starts. Some commands related
  to heal/profile take longer to execute, and for those commands the
  timeout is set to 10 minutes. Since the lock timer is set to 3
  minutes, glusterd releases the lock after 3 minutes, i.e. before the
  command has completed its execution.

  Solution: pass a timeout parameter from the cli to glusterd whenever
  the default timeout value changes (the default can be changed on the
  command line, or is raised to 10 minutes for profile/heal-related
  commands). glusterd sets the lock timer according to the timeout
  value passed.

  Change-Id: I7b7a9a4f95ed44aca39ef9d9907f546bca99c69d
  fixes: bz#1577731
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: Fix for memory leak in get-state detail (Sanju Rakonde, 2018-05-01; 1 file, -1/+8)

  Fixes: bz#1573066
  Change-Id: I76fe3bdde7351736b32eb3d6c4cc5f8f276257ed
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: volume inode/fd status broken with brick mux (hari gowtham, 2018-04-19; 1 file, -0/+4)

  Problem: the values for inode/fd were populated from the ctx received
  from the server xlator. Without brick mux, every brick process served
  a single brick of a single volume, so searching for the server xlator
  and populating from it worked. With brick mux, a number of bricks can
  be confined to a single process, and these bricks can be from
  different volumes too (when the max-bricks-per-process option is
  used). If they are from different volumes, using the server xlator to
  populate causes problems.

  Fix: use the brick to validate and populate the inode/fd status.

  Signed-off-by: hari gowtham <hgowtham@redhat.com>
  Change-Id: I2543fa5397ea095f8338b518460037bba3dfdbfd
  fixes: bz#1566067
* glusterd: mark port_registered to true for all running bricks with brick mux (Atin Mukherjee, 2018-04-05; 1 file, -0/+2)

  glusterd maintains a boolean flag, 'port_registered', used to
  determine whether a brick has completed its portmap sign-in process.
  This flag is (re)set on pmap_signin and pmap_signout events. With
  brick multiplexing, this flag identifies whether the very first brick
  with which the process was spawned has completed its sign-in.
  However, on glusterd restart, when a brick is already identified as
  running, glusterd does a pmap_registry_bind to update its portmap
  table, but this flag is not updated. That is fine in the
  non-multiplexed case, but causes a problem if the very first brick
  the process came up with is replaced: the subsequent brick attach
  then fails.

  One way to reproduce this: create and start a volume, remove the
  first brick, then add-brick a new one. The add-brick operation takes
  a very long time, after which the volume status shows every brick but
  the new one as down.

  Solution: set brickinfo->port_registered to true for all running
  bricks when brick multiplexing is enabled.

  Change-Id: Ib0662d99d0fa66b1538947fd96b43f1cbc04e4ff
  Fixes: bz#1560957
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: fix txn_opinfo memory leak (Atin Mukherjee, 2018-04-04; 1 file, -0/+1)

  For transactions with no volname involved (e.g. gluster v status),
  the originator node initiates with the staging phase, which means
  that in the op-sm no unlock event is ever triggered; this resulted in
  a txn_opinfo dictionary leak.

  Credits: cynthia.zhou@nokia-sbell.com
  Change-Id: I92fffbc2e8e1b010f489060f461be78aa2b86615
  Fixes: bz#1550339
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: TLS verification fails while using intermediate CA (Mohit Agrawal, 2018-03-19; 1 file, -0/+3)

  Problem: TLS verification fails when an intermediate CA is used and
  management SSL is enabled.

  Solution: there are two main causes of the TLS verification failure:
  1) not calling the ssl api to set cert_depth
  2) the current code does not allow setting the certificate depth
     while management SSL is enabled.

  After applying this patch, to set the certificate depth the user
  needs to set option transport.socket.ssl-cert-depth <depth> in
  /var/lib/glusterd/secure-access instead of in
  /etc/glusterfs/glusterd.vol. When secure_mgmt is set in the ctx, the
  value of cert-depth is checked and saved in the ctx; if the user does
  not provide a cert-depth value, the default of 1 is used.

  BUG: 1555154
  Change-Id: I89e9a9e1026e37efb5c20f9ec62b1989ef644f35
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
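  The ssl api in question, sketched with OpenSSL (the wiring from the
  secure-access option to this call is omitted):

      #include <openssl/ssl.h>

      /* Honor the configured chain depth so certificates signed by an
       * intermediate CA verify; per the commit, cert_depth defaults
       * to 1 when unset. */
      SSL_CTX_set_verify_depth(ssl_ctx, cert_depth);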
* glusterd: volume get fixes for client-io-threads & quorum-type (Ravishankar N, 2018-03-07; 1 file, -3/+3)

  1. If a replica volume created on glusterfs-3.8 was upgraded to
     glusterfs-3.12, `gluster vol get volname client-io-threads`
     displayed 'on' even though it wasn't enabled and the xlator wasn't
     loaded in the client graph. This was due to the removal of certain
     checks in glusterd_get_default_val_for_volopt as a part of commit
     47604fad4c2a3951077e41e0c007ceb979bb2c24. Fix it.
  2. Also, as a part of the op-version bump-up, client-io-threads was
     being loaded on the clients during volfile regeneration. Prevent
     it.
  3. AFR assumes quorum-type to be auto in newly created replica 3 (odd
     replica count in general) volumes, but `gluster vol get
     quorum-type` displayed 'none'. Fix it.

  Change-Id: I19e586361ed1065c70fb378533d3b4dac1095df9
  BUG: 1545056
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* glusterd: add profile_enabled flag in get-state (Atin Mukherjee, 2018-01-25; 1 file, -0/+2)

  Change-Id: I09f348ed7ae6cd481f8c4d8b4f65f2f2f6aad84e
  BUG: 1537364
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: get-state memory leak fix (Atin Mukherjee, 2018-01-08; 1 file, -3/+12)

  Change-Id: Ic4fcf2087f295d3dade944efb8fd08f7e2d7d516
  BUG: 1531149
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* Fix management client deadlock (Richard Wareing, 2017-12-07; 1 file, -1/+6)

  Ping notify is a NOOP for management daemons.

  Reviewers: sshreyas
  Reviewed By: sshreyas

  FB-commit-id: ec30b68
  Change-Id: I8e121aaaa3ad268e5df057e03aa4b37a403c9ea0
  BUG: 1522968
  Signed-off-by: Kevin Vigor <kvigor@fb.com>
  Reviewed-on: https://review.gluster.org/16858
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* glusterd: Fix coverity issues in glusterd-handler.c (Samikshan Bairagya, 2017-11-15; 1 file, -9/+26)

  Fixes get-state CLI related coverity issues 477, 511, 515, 523, 526
  and 527 from the report at [1].

  [1] https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2017-10-30-9aa574a5/html/

  Change-Id: Ieb6f64c9035b4d9338d9515de003d607b7a4e9bc
  BUG: 789278
  Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
* glusterd: few coverity fixes in glusterd-handler.c (Sanju Rakonde, 2017-11-12; 1 file, -6/+15)

  This patch fixes coverity issues 695, 555 and 263.

  Change-Id: I3577cbc793b6652b24cc719037db2bdd5e27f196
  BUG: 789278
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* Coverity Issue: PW.INCLUDE_RECURSION in several files (Girjesh Rajoria, 2017-11-09; 1 file, -2/+0)

  Coverity IDs: 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417,
  418, 419, 423, 424, 425, 426, 427, 428, 429, 436, 437, 438, 439, 440,
  441, 442, 443

  Issue: Event include_recursion. Removed redundant, recursive includes
  from the files.

  Change-Id: I920776b1fa089a2d4917ca722d0075a9239911a7
  BUG: 789278
  Signed-off-by: Girjesh Rajoria <grajoria@redhat.com>
* glusterd: Changed default op-version for some options (ShyamsundarR, 2017-11-06; 1 file, -2/+2)

  As 3.13 is branched at a point that includes the features changed by
  this commit, their minimum supported op-versions should also change
  to 3.13.

  Change-Id: I7ef8eccc13a16f93939c1edbff9508d1e167c5e4
  BUG: 1509412
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
* glusterd: fix brick restart parallelism (Atin Mukherjee, 2017-11-01; 1 file, -8/+16)

  glusterd's brick restart logic is not always sequential, as there are
  at least three different ways bricks are restarted:
  1. through friend-sm and glusterd_spawn_daemons()
  2. through friend-sm and handling of the volume quorum action
  3. through friend handshaking when there is a mismatch on quorum on
     friend import.

  In a brick multiplexing setup, glusterd could end up trying to spawn
  the same brick process twice: two threads hit glusterd_brick_start()
  within a fraction of a millisecond, and glusterd had no way to reject
  either, as the brick start criteria were met in both cases. As a
  solution, this is controlled by two different guards, as sketched
  below: one is a boolean, start_triggered, which indicates that a
  brick start has been triggered and remains true until the brick dies
  or is killed; the second is a mutex lock to ensure that for a
  particular brick we never get into glusterd_brick_start() more than
  once at the same time.

  Change-Id: I292f1e58d6971e111725e1baea1fe98b890b43e2
  BUG: 1506513
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
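  A sketch of the two guards (start_triggered is named in the commit
  message; the mutex field name and call shape are illustrative):

      /* Serialize entry and skip if a start is already underway. */
      pthread_mutex_lock(&brickinfo->restart_mutex); /* name illustrative */
      {
          if (!brickinfo->start_triggered) {
              brickinfo->start_triggered = _gf_true; /* cleared when the
                                                      * brick dies */
              ret = glusterd_brick_start(volinfo, brickinfo, wait);
          }
      }
      pthread_mutex_unlock(&brickinfo->restart_mutex);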
* glusterd: clean up portmap on brick disconnect (Atin Mukherjee, 2017-10-31; 1 file, -0/+25)

  glusterd's portmap entry for a brick is cleaned up when a
  PMAP_SIGNOUT event is initiated by the brick process at shutdown. But
  if the brick process crashes or is killed with SIGKILL, this event is
  never initiated and glusterd ends up with a stale port. Since
  glusterd's portmap traversal goes both ways, forward for allocation
  and backward for registry search, glusterd might end up running with
  a stale port for a brick, which eventually causes clients to fail to
  connect to the bricks.

  The solution is to clean up the port entry on the brick disconnect
  event if the process is down. Although this makes the handling of the
  PMAP_SIGNOUT event redundant in most cases, it is kept as a safeguard
  against glusterd getting into stale-port issues.

  Change-Id: I04c5be6d11e772ee4de16caf56dbb37d5c944303
  BUG: 1503246
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Marking all the brick status as stopped when a process goes down in brick multiplexing (Sanju Rakonde, 2017-10-12; 1 file, -1/+58)

  In a brick multiplexing environment, if a brick process goes down
  (e.g. killed with SIGKILL), only the status of the brick for which
  the process originally came up changes to stopped; all other brick
  statuses remain started. This happens because the process was killed
  abruptly with SIGKILL, so the signal handler was never invoked and no
  further cleanup was triggered.

  When we then try to start a volume using force, it errors out with
  "Request timed out": since all the brickinfo->status fields are still
  in the started state, we wait for one of the brick processes to come
  up, which is never going to happen, since the brick process was
  killed.

  To resolve this, in the disconnect event we check all processes to
  find which one the disconnected brick belongs to. Once we have the
  process, we call glusterd_mark_bricks_stopped_by_proc(), passing a
  brick_proc_t object as the argument. From the glusterd_brick_proc_t
  we can get all the bricks attached to that process, but these are
  duplicate copies; to get the original brickinfo we read the volinfo
  from each brick, since the volinfo holds the original brickinfo
  copies. We then change brickinfo->status to stopped for all the
  bricks.

  Change-Id: Ifb9054b3ee081ef56b39b2903ae686984fe827e7
  BUG: 1499509
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: fix client io-threads option for replicate volumes (Ravishankar N, 2017-10-09; 1 file, -3/+4)

  Problem: commit ff075a3d6f9b142911d25c27fd209838782bfff0 disabled
  loading client-io-threads for replicate volumes (it had been set to
  on by default in commit e068c1997314046658dd502e9118dab32decf879) due
  to performance issues, but in doing so inadvertently failed to load
  the xlator even when the user explicitly enabled the option with the
  volume set command, despite the volume set returning success.

  Fix: modify the check in perfxl_option_handler() and add checks in
  the volume create/add-brick/remove-brick code paths, tying it all to
  GD_OP_VERSION_3_12_2.

  Change-Id: Ib612973a999a7da818cc926f5c2601b1f0794fcf
  BUG: 1498570
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* Coverity Issue Fix: IDENTICAL_BRANCHES (Girjesh Rajoria, 2017-09-26; 1 file, -2/+0)

  Issue: Event identical_branches: the same code is executed whether
  the condition "ret" is true or false, because the code in the if-then
  branch and after the if statement is identical.

  Function: glusterd_print_gsync_status_by_vol
  Fix: removed the if and goto statement.

  Change-Id: I966d793c9f3b65487acfb07083a4039caf593105
  BUG: 789278
  Signed-off-by: Girjesh Rajoria <grajoria@redhat.com>
* Command to identify client process (hari gowtham, 2017-09-06; 1 file, -0/+10)

  command: gluster volume status <volname/all> client-list

  output:

      Client connections for volume v1
      Name         count
      -----        ------
      fuse           2
      tierd          1
      total clients for volume v1 : 3
      -----------------------------------------------------------------
      Client connections for volume v2
      Name         count
      -----        ------
      tierd          1
      fuse.gsync     1
      total clients for volume v2 : 2
      -----------------------------------------------------------------

  Updates: #178
  Change-Id: I0ff2579d6adf57cc0d3bd0161a2ec6ac6c4747c0
  Signed-off-by: hari gowtham <hgowtham@redhat.com>
  Reviewed-on: https://review.gluster.org/18095
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Tested-by: hari gowtham <hari.gowtham005@gmail.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>