path: root/xlators/mgmt/glusterd/src/glusterd.c
Commit log — each entry lists the subject (author, date, files changed, lines -/+):
* [RFC] #ifdef gNFS related code if we are not compiling gNFS (Yaniv Kaul, 2019-12-18, 1 file, -1/+8)

    If we are not compiling gNFS (--enable-gnfs is not given in the
    ./configure script params), there is little point in compiling code
    that is related to it. This patch tries to eliminate it. My hope
    (and it's not clear from the code) is that I did not break the NFS
    Ganesha support as well. Other than that, I tried to compile with
    and without the option, and it looks sane.

    Change-Id: I8d6c98066b9fceab4ec10fc6f5e81ab069e853bd
    updates: bz#1193929
    Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* glusterd: Fixed incorrect size argument (N Balachandran, 2019-08-27, 1 file, -2/+3)

    An incorrect size argument to snprintf caused the glusterd process
    to crash on startup. This has been fixed.

    Change-Id: Iddafb5468866d0182cd8239210c92c893e643285
    Fixes: bz#1745965
    Signed-off-by: N Balachandran <nbalacha@redhat.com>
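    The crash pattern here is the classic one of sizing snprintf by the
    wrong buffer. A minimal sketch of the bug and the fix (illustrative
    only, not the actual glusterd code):

        #include <stdio.h>
        #include <string.h>

        int
        main (void)
        {
            char dest[32];
            const char *src = "/var/run/gluster/some-long-path";

            /* Wrong: a size larger than dest (e.g. derived from the
             * source) lets snprintf write past the end of dest. */
            /* snprintf (dest, strlen (src) + 1, "%s", src); */

            /* Right: always bound by the destination buffer. */
            int ret = snprintf (dest, sizeof (dest), "%s", src);
            if (ret < 0 || (size_t)ret >= sizeof (dest))
                fprintf (stderr, "path truncated\n");
            return 0;
        }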
* glusterd: create separate logdirs for cluster.rc instances (N Balachandran, 2019-08-14, 1 file, -66/+109)

    Create a separate logdir for each host instance created by
    cluster.rc. This makes it easier to determine the files belonging
    to a particular instance.

    Change-Id: Ic8321f83f98995412b7d5f095b3d3f0391767a8b
    Fixes: bz#1733042
    Signed-off-by: N Balachandran <nbalacha@redhat.com>
* event: rename event_XXX with a gf_ prefix (Xiubo Li, 2019-07-29, 1 file, -1/+1)

    I hit one crash issue when using libgfapi. libgfapi calls
    glfs_poller() --> event_dispatch() in api/src/glfs.c:721, and
    event_dispatch() is defined locally by libglusterfs. The problem is
    that the name event_dispatch() is exactly the same as the one from
    the OS's libevent package. For example, for an executable program
    Foo which uses and links both libevent and libgfapi at the same
    time, I can hit a crash like:

        kernel: glfs_glfspoll[68486]: segfault at 1c0 ip 00007fef006fd2b8
        sp 00007feeeaffce30 error 4 in libevent-2.0.so.5.1.9[7fef006ed000+46000]

    The link line for Foo is:

        lib_foo_LADD = -levent $(GFAPI_LIBS)

    It crashes because glfs_poller() is calling the event_dispatch()
    from libevent, not the one from libglusterfs. The gfapi link info
    is:

        GFAPI_LIBS = -lacl -lgfapi -lglusterfs -lgfrpc -lgfxdr -luuid

    If I link Foo like:

        lib_foo_LADD = $(GFAPI_LIBS) -levent

    it works well without any problem. And if Foo calls a private
    library such as handler_glfs.so, which links GFAPI_LIBS directly
    while Foo does not and instead dlopen()s handler_glfs.so, then the
    crash is hit every time. The link info in that case is:

        foo_LADD = -levent
        libhandler_glfs_LIBADD = $(GFAPI_LIBS)

    I can avoid the crash temporarily by linking GFAPI_LIBS in Foo too:

        foo_LADD = $(GFAPI_LIBS) -levent
        libhandler_glfs_LIBADD = $(GFAPI_LIBS)

    But this is ugly, since Foo doesn't use any APIs from GFAPI_LIBS.
    And in some cases, when the --as-needed link option is added (on
    many distributions it is the default), the crash is back again and
    the above workaround won't work.

    Fixes: #699
    Change-Id: I38f0200b941bd1cff4bf3066fca2fc1f9a5263aa
    Signed-off-by: Xiubo Li <xiubli@redhat.com>
* core: use more restrictive mode while creating the directories (Sanju Rakonde, 2019-07-23, 1 file, -18/+18)

    fixes: bz#1724024
    Change-Id: I539fb7248b2cfc037ec29f1413ea648f9ec21ef2
    Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* Remove hadoop related code from the codebase (Vijay Bellur, 2019-07-09, 1 file, -25/+3)

    As Hadoop is no longer supported, dropping code for handling Hadoop
    access.

    Fixes: bz#1728417
    Signed-off-by: Vijay Bellur <vbellur@redhat.com>
    Change-Id: I8fcf4faacb364f1c9a8abb0c48faec337087f845
* core: fedora 30 compiler warnings (SheetalPamecha, 2019-06-18, 1 file, -1/+1)

    warning: '%s' directive argument is null [-Wformat-overflow=]

    Change-Id: I69b8d47f0002c58b00d1cc947fac6f1c64e0b295
    updates: bz#1193929
    Signed-off-by: SheetalPamecha <spamecha@redhat.com>
* shd/glusterd: Serialize shd manager to prevent race condition (Mohammed Rafi KC, 2019-05-10, 1 file, -0/+1)

    At the time of a glusterd restart, while doing a handshake there is
    a possibility that multiple shd managers might get executed.
    Because of this, there is a chance that multiple shd daemons get
    spawned during a glusterd restart.

    Change-Id: Ie20798441e07d7d7a93b7d38dfb924cea178a920
    fixes: bz#1707081
    Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* glusterd: define dumpops in the xlator_api of glusterd (Sanju Rakonde, 2019-04-27, 1 file, -0/+1)

    Problem: statedump is not capturing information related to glusterd.

    Solution: statedump is not capturing glusterd info because
    trav->dumpops is null in gf_proc_dump_single_xlator_info(), where
    trav is the glusterd xlator object. trav->dumpops is null because
    we missed defining dumpops in the xlator_api of glusterd. Defining
    dumpops in the xlator_api of glusterd fixes the issue.

    fixes: bz#1703629
    Change-Id: If85429ecb1ef580aced8d5b88d09fc15258bfc4c
    Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* mgmt/shd: Implement multiplexing in self heal daemon (Mohammed Rafi KC, 2019-04-01, 1 file, -9/+3)

    Problem: The shd daemon is per node, which means it creates a graph
    with all volumes on it. While this is great for utilizing
    resources, it is not so good in terms of performance and
    manageability, because self-heal daemons don't have the capability
    to automatically reconfigure their graphs. So each time any
    configuration change happens to the volumes (replicate/disperse),
    we need to restart shd to bring the changes into the graph. Because
    of this, all ongoing heals for all other volumes have to be stopped
    in the middle and restarted all over again.

    Solution: This change makes shd a per-volume daemon, so that a
    graph will be generated for each volume. When we want to
    start/reconfigure shd for a volume, we first search for an existing
    shd running on the node; if there is none, we start a new process.
    If a daemon is already running for shd, then we simply detach the
    graph for the volume and reattach the updated graph for the volume.
    This won't touch any ongoing operations for any other volumes on
    the shd daemon.

    Example of an shd graph when it is a per-volume graph:

             -----------------------
             |    debug-iostat     |
             -----------------------
              /         |         \
             /          |          \
        ---------   ---------   ---------
        | AFR-1 |   | AFR-2 |   | AFR-3 |
        ---------   ---------   ---------

    A running shd daemon with 3 volumes will have a graph like:

             -----------------------
             |    debug-iostat     |
             -----------------------
              /         |          \
             /          |           \
     ------------  ------------  ------------
     | volume-1 |  | volume-2 |  | volume-3 |
     ------------  ------------  ------------

    Change-Id: Idcb2698be3eeb95beaac47125565c93370afbd99
    fixes: bz#1659708
    Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* rpc/transport: Missing a ref on dict while creating transport object (Mohammed Rafi KC, 2019-03-20, 1 file, -1/+5)

    While creating the rpc_transport object, we store a dictionary
    without taking a ref on the dict, but we do an unref during the
    cleanup of the transport object. So the rpc layer expects the
    caller to take a ref on the dictionary before passing the dict to
    the rpc layer. This leads to a lot of confusion across the code
    base and leads to ref leaks.

    Semantically, this is not correct. It is the rpc layer's
    responsibility to take a ref when storing the dict, and to free it
    during cleanup. Listing the issues or leaks across the code base
    caused by this confusion, currently present in the upstream master:

    1) changelog_rpc_client_init
    2) quota_enforcer_init
    3) rpcsvc_create_listeners: when there are two transports, like
       tcp,rdma
    4) quotad_aggregator_init
    5) glusterd: init
    6) nfs3_init_state
    7) server: init
    8) client: init

    This patch does the cleanup according to the semantics.

    Change-Id: I46373af9630373eb375ee6de0e6f2bbe2a677425
    updates: bz#1659708
    Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* glusterd: coverity fixes (Sanju Rakonde, 2018-12-27, 1 file, -0/+2)

    This patch addresses coverity issues with CID 1398470 and 1398475.

    1398470 - Missing unlock - false positive; added an annotation to
    make coverity happy
    1398475 - Unused value

    Change-Id: I1bb3df0b716690fad8fc52c393c8b2b6c41f7860
    updates: bz#789278
    Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* xlator: make 'xlator_api' mandatory (Amar Tumballi, 2018-12-13, 1 file, -0/+12)

    * Remove the options to load the old symbols.
    * Export only the 'xlator_api' symbol, using xlator.sym.
    * Add xlator_api to all the xlators where it is missing.

    NOTE: This covers all the xlators which have at least a test case
    to validate their loading. If there is a translator which doesn't
    have any test, then we should probably remove it from the codebase.

    fixes: #164
    Change-Id: Ibcdc8c9844cda6b4463d907a15813745d14c1ebb
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
* libglusterfs: Move devel headers under glusterfs directory (ShyamsundarR, 2018-12-05, 1 file, -13/+13)

    libglusterfs devel package headers are referenced in code using the
    include semantics for a program. While this works, it can be
    better, especially when dealing with out-of-tree xlator builds or
    out-of-tree devel package usage in general.

    Towards this, the following changes are done:
    - moved all devel headers under a glusterfs directory
    - included these headers using system header notation <> in all
      code outside of libglusterfs
    - included these headers using own program notation "" within
      libglusterfs

    This change, although big, just moves the headers around and makes
    the includes correct from other sources. This helps us correctly
    include libglusterfs headers without namespace conflicts.

    Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
    Updates: bz#1193929
    Signed-off-by: ShyamsundarR <srangana@redhat.com>
* glusterd: glusterd to regenerate volfiles when GD_OP_VERSION_MAX changes (Atin Mukherjee, 2018-12-05, 1 file, -5/+22)

    While glusterd has an infra to allow the post-install section of
    the spec to bring it up in an interim upgrade mode so that all the
    volfiles are regenerated with the latest executable, in the
    container world the same methodology is not followed: a container
    image always points to a specific gluster rpm, and the rpm doesn't
    go through an upgrade process.

    This fix does the following:
    1. If the glusterd.upgrade file doesn't exist, regenerate the
       volfiles.
    2. If the maximum-operating-version read from glusterd.upgrade
       doesn't match GD_OP_VERSION_MAX, glusterd detects it to be a
       version where new options are introduced and regenerates the
       volfiles.

    Tests done:
    1. Bring up glusterd, check if the glusterd.upgrade file has been
       created with the GD_OP_VERSION_MAX value.
    2. After 1, restart glusterd and check that glusterd hasn't
       regenerated the volfiles, as there is no change between
       GD_OP_VERSION_MAX and the op-version read from the file.
    3. Bump up GD_OP_VERSION_MAX in the code by 1 and, post
       compilation, restart glusterd; the volfiles should be
       regenerated again.

    Note: The old way of regenerating volfiles during an rpm upgrade is
    kept as-is for now, but it can eventually be sunset.

    Change-Id: I75b49a1601c71e99f6a6bc360dd12dd03a96414b
    Fixes: bz#1651463
    Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
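    A hedged sketch of the decision logic in points 1 and 2 above; the
    file format and helper name here are illustrative, not glusterd's
    actual code:

        #include <stdio.h>

        #define GD_OP_VERSION_MAX 31200 /* illustrative value */

        static int
        need_volfile_regenerate (const char *upgrade_file)
        {
            FILE *fp = fopen (upgrade_file, "r");
            int stored = 0;

            if (!fp)
                return 1; /* 1. no glusterd.upgrade file: regenerate */

            if (fscanf (fp, "%d", &stored) != 1)
                stored = 0;
            fclose (fp);

            /* 2. stored maximum op-version differs from the binary's
             * GD_OP_VERSION_MAX: new options exist, regenerate. */
            return stored != GD_OP_VERSION_MAX;
        }

        int
        main (void)
        {
            printf ("regenerate: %d\n",
                    need_volfile_regenerate ("/var/lib/glusterd/glusterd.upgrade"));
            return 0;
        }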
* core: Resolve memory leak at the time of graph init (Mohit Agrawal, 2018-11-20, 1 file, -4/+0)

    Problem: Memory leaks when graph init fails during the volfile
    exchange between brick and glusterd.

    Solution: Fix the error code path in glusterfs_graph_init.

    Change-Id: If62bee61283fccb7fd60abc6ea217cfac12358fa
    fixes: bz#1651431
    Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* core: fix strncpy, coverity annotation (Kaleb S. KEITHLEY, 2018-11-19, 1 file, -1/+6)

    For added fun, coverity is not smart enough to detect that the
    strncpy() is safe, and for extra laughs, using coverity annotations
    doesn't do anything either; but we're adding them anyway, along
    with marking the BUFFER_SIZE_WARNINGS as false positives on
    scan.coverity.com.

    Change-Id: If7fa157eca565842109f32fee0399ac183b19ec7
    updates: bz#1193929
    Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* cli: cluster.server-quorum-type help text is missing (Shwetha Acharya, 2018-11-16, 1 file, -4/+8)

    Added a default value "none" and an additional description.

    Change-Id: I3a5c06f8ec1e502fc399860e4b5cb835102cd71d
    Updates: bz#1608512
    Signed-off-by: Shwetha Acharya <sacharya@redhat.com>
* core: fix strncpy warnings (Kaleb S. KEITHLE, 2018-11-15, 1 file, -19/+20)

    Since gcc-8.2.x (fedora-28 or so) gcc has been emitting warnings
    about buggy use of strncpy. Most uses that gcc warns about in our
    sources are exactly backwards; the 'limit' or len is the
    strlen/size of the _source param_, giving exactly zero protection
    against overruns. (Which was, after all, one of the points of using
    strncpy in the first place.) IOW, many warnings are about uses that
    look approximately like this:

        ...
        char dest[8];
        char src[] = "this is a string longer than eight chars";
        ...
        strncpy (dest, src, sizeof (src)); /* boom */
        ...

    The len/limit should be sizeof(dest).

    Note: the above example has a definite over-run. In our source the
    overrun is typically only theoretical (but possibly exploitable.)
    Also strncpy doesn't null-terminate on truncation; snprintf does;
    prefer snprintf over strncpy.

    Mildly surprising that coverity doesn't warn/isn't warning about
    this.

    Change-Id: I022d5c6346a751e181ad44d9a099531c1172626e
    updates: bz#1193929
    Signed-off-by: Kaleb S. KEITHLE <kkeithle@redhat.com>
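    A sketch of the preferred replacement named above — bound by the
    destination, with snprintf NUL-terminating on truncation
    (illustrative, not the patched glusterd code):

        #include <stdio.h>

        int
        main (void)
        {
            char dest[8];
            const char *src = "this is a string longer than eight chars";

            /* snprintf never writes more than sizeof(dest) bytes,
             * always NUL-terminates, and its return value signals
             * truncation. */
            int n = snprintf (dest, sizeof (dest), "%s", src);
            if (n < 0 || (size_t)n >= sizeof (dest))
                fprintf (stderr, "truncated to \"%s\"\n", dest);
            return 0;
        }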
* glusterd: Use GF_ATOMIC to update 'blockers' counter at glusterd_conf (Mohit Agrawal, 2018-09-20, 1 file, -1/+1)

    Problem: Currently the glusterd code uses sync_lock/sync_unlock to
    update the blockers counter, which can add delays to the overall
    transaction phase, especially when a batch of volume stop
    operations is processed by glusterd in brick multiplexing mode.

    Solution: Use GF_ATOMIC to update the blockers counter so that
    unnecessary context switching can be avoided.

    Change-Id: Ie13177dfee2af66687ae7cf5c67405c152853990
    Fixes: bz#1631128
    Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
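    A C11-atomics analogue of the change described above (glusterd
    itself uses the GF_ATOMIC_* wrappers from libglusterfs; this
    self-contained sketch just shows the lock-free pattern):

        #include <stdatomic.h>
        #include <stdio.h>

        static atomic_int blockers = 0;

        /* The counter update no longer needs a lock around it. */
        static void blocker_inc (void) { atomic_fetch_add (&blockers, 1); }
        static void blocker_dec (void) { atomic_fetch_sub (&blockers, 1); }

        int
        main (void)
        {
            blocker_inc ();
            blocker_inc ();
            blocker_dec ();
            printf ("blockers = %d\n", atomic_load (&blockers));
            return 0;
        }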
* Land part 2 of clang-format changes (Gluster Ant, 2018-09-12, 1 file, -1873/+1841)

    Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
    Signed-off-by: Nigel Babu <nigelb@redhat.com>
* multiple xlators (mgmt): strncpy()->snprintf(), reduce strlen()'s (Yaniv Kaul, 2018-09-07, 1 file, -10/+14)

    Affected files:
    xlators/mgmt/glusterd/src/glusterd-geo-rep.c
    xlators/mgmt/glusterd/src/glusterd-handshake.c
    xlators/mgmt/glusterd/src/glusterd-sm.c
    xlators/mgmt/glusterd/src/glusterd-store.c
    xlators/mgmt/glusterd/src/glusterd-utils.c
    xlators/mgmt/glusterd/src/glusterd-volgen.c
    xlators/mgmt/glusterd/src/glusterd-volume-ops.c
    xlators/mgmt/glusterd/src/glusterd.c

    strncpy may not be very efficient for short strings copied into a
    large buffer: if the length of src is less than n, strncpy() writes
    additional null bytes to dest to ensure that a total of n bytes are
    written. Instead, use snprintf(), and try to ensure output is not
    truncated.

    Also:
    - save the result of strlen() and re-use it when possible.
    - move from strlen to SLEN (sizeof()) for const strings.

    Compile-tested only!

    Change-Id: Ib5d001857236f43e41c4a51b5f48e1a33110aaeb
    updates: bz#1193929
    Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
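    A sketch of the micro-optimizations described above; SLEN is
    assumed here to be the usual sizeof-minus-one macro for string
    literals, as in libglusterfs:

        #include <stdio.h>
        #include <string.h>

        #define SLEN(str) (sizeof (str) - 1) /* compile-time length of a literal */

        int
        main (void)
        {
            char buf[64];
            const char *volname = "testvol";

            /* Save strlen() once instead of recomputing it. */
            size_t vlen = strlen (volname);

            /* snprintf instead of strncpy: no useless zero-fill of the
             * remaining bytes of a large destination buffer. */
            snprintf (buf, sizeof (buf), "%s.vol", volname);

            printf ("%s (name length %zu, literal length %zu)\n",
                    buf, vlen, SLEN ("testvol"));
            return 0;
        }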
* glusterd: compare friend data within mutex (Atin Mukherjee, 2018-08-13, 1 file, -0/+1)

    During the friend handshake, if glusterd receives more than one
    friend update, it might very well become possible that two threads
    end up working on two different volinfo references, and glusterd
    might end up updating the store with an old volinfo reference.
    While debugging a glusterd crash from the
    validating-server-quorum.t test file in the line-coverage
    regression, the same was observed.

    The solution is to run glusterd_compare_friend_data under a mutex.

    Test: As the crash was more visible in the line-coverage run (given
    lcov does some instrumentation and exposes the races), 6 manual
    lcov runs were triggered, starting from
    https://build.gluster.org/job/line-coverage/443 to
    https://build.gluster.org/job/line-coverage/449/, and no crash was
    observed from validating-server-quorum.t.

    Change-Id: I86fce473a76fd24742d51bf17a685d28b90a8941
    Fixes: bz#1603063
    Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
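    A minimal pthread sketch of the serialization described above; the
    names are illustrative, not glusterd's actual structures:

        #include <pthread.h>
        #include <stdio.h>

        static pthread_mutex_t import_mutex = PTHREAD_MUTEX_INITIALIZER;
        static int volinfo_version = 0; /* stands in for the stored volinfo */

        static void *
        compare_friend_data (void *arg)
        {
            (void)arg;
            pthread_mutex_lock (&import_mutex);
            /* The read-compare-update sequence is now atomic with
             * respect to other handshake threads, so a stale volinfo
             * can no longer overwrite a newer one. */
            volinfo_version++;
            pthread_mutex_unlock (&import_mutex);
            return NULL;
        }

        int
        main (void)
        {
            pthread_t t1, t2;
            pthread_create (&t1, NULL, compare_friend_data, NULL);
            pthread_create (&t2, NULL, compare_friend_data, NULL);
            pthread_join (t1, NULL);
            pthread_join (t2, NULL);
            printf ("volinfo version: %d\n", volinfo_version);
            return 0;
        }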
* All: run codespell on the code and fix issues. (Yaniv Kaul, 2018-07-22, 1 file, -2/+2)

    Please review; it's not always just the comments that were fixed.
    I've had to revert, of course, all calls to creat() that were
    changed to create() ...

    Only compile-tested!

    Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5
    updates: bz#1193929
    Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* Fix compile warnings (Xavi Hernandez, 2018-07-10, 1 file, -14/+60)

    This patch fixes compile warnings that appear with newer compilers.
    The solution applied only removes the warnings; it doesn't always
    solve the problem in the best way. It assumes that the problem will
    never happen, as the previous code assumed.

    Change-Id: I6e8470d6c2e2dbd3bd7d324b5fd2f92ffdc3d6ec
    updates: bz#1193929
    Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* glusterfs: access trusted peer group via remote-host command (Mohit Agrawal, 2018-06-20, 1 file, -5/+0)

    Problem: In an SSL environment, a user is able to access a volume
    via the remote-host command without adding the node to the trusted
    pool.

    Solution: Change the list of rpc programs in glusterd.c at
    initialization time while SSL is enabled.

    BUG: 1593232
    Change-Id: I987e433b639e68ad17b77b6452df1e22dbe0f199
    fixes: bz#1593232
    Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* tests: Enable geo-rep test cases (Kotresh HR, 2018-01-05, 1 file, -2/+2)

    This patch re-enables the geo-rep test cases. Along with that, it
    does the following optimizations:
    1. Use EXPECT_WITHIN instead of sleep
    2. Clean up the geo-rep ssh key after the test
    3. Changes to gverify.sh and S56glusterd-geo-rep-create-post.sh to
       use the given ssh identity file for geo-rep create
    4. Make gluster-command-dir configurable and introduce
       slave-gluster-command-dir, which point to the parent directory
       of the gluster binaries on the master and slave respectively.

    Change-Id: Ia7696278d9dd3ba04224dcd7c3564088ca970b04
    BUG: 1480491
    Signed-off-by: Kotresh HR <khiremat@redhat.com>
* rpc: optimize fop program lookup (Milind Changire, 2017-11-06, 1 file, -1/+1)

    Ensure that the fop program is the first in the program list so
    that the minimum amount of time is spent searching the program for
    the most frequently needed use case.

    Change-Id: I45c3dcdbf39ec90ba39d914432d13a2ace00a5ee
    BUG: 1509647
    Signed-off-by: Milind Changire <mchangir@redhat.com>
* glusterd: Fix few coverity errors (Prashanth Pai, 2017-11-06, 1 file, -3/+3)

    Fixes issues 810, 248, 491, 499, 85, 786, 811, 43, and 44 from the
    report at [1].

    [1]: https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2017-10-30-9aa574a5/html/

    BUG: 789278
    Change-Id: I27ebae2ffb2256b8eef0757d768cc46e5a942e9f
    Signed-off-by: Prashanth Pai <ppai@redhat.com>
* glusterd: clean up portmap on brick disconnect (Atin Mukherjee, 2017-10-31, 1 file, -1/+2)

    GlusterD's portmap entry for a brick is cleaned up when a
    PMAP_SIGNOUT event is initiated by the brick process at shutdown.
    But if the brick process crashes or is killed with SIGKILL, this
    event is not initiated and glusterd ends up with a stale port.
    Since GlusterD's portmap traversal happens both ways, forward for
    allocation and backward for registry search, there is a possibility
    that glusterd might end up running with a stale port for a brick,
    which eventually will cause clients to fail to connect to the
    bricks.

    The solution is to clean up the port entry when the process is
    down, as part of the brick disconnect event. Although this makes
    the handling of the PMAP_SIGNOUT event redundant in most cases, it
    is kept as a safeguard to avoid glusterd getting into stale port
    issues.

    Change-Id: I04c5be6d11e772ee4de16caf56dbb37d5c944303
    BUG: 1503246
    Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: introduce timer in mgmt_v3_lock (Gaurav Yadav, 2017-10-17, 1 file, -5/+23)

    Problem: In a multi-node environment, if two of the op-sm
    transactions are initiated on one of the receiver nodes at the same
    time, there is a possibility that glusterd may end up with a stale
    lock.

    Solution: During mgmt_v3_lock, a registration is made with
    gf_timer_call_after, which releases the lock after a certain period
    of time.

    Change-Id: I16cc2e5186a2e8a5e35eca2468b031811e093843
    BUG: 1499004
    Signed-off-by: Gaurav Yadav <gyadav@redhat.com>
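    glusterd registers the release callback on its own timer wheel via
    gf_timer_call_after; as a self-contained analogue, here is a hedged
    sketch of the same idea using a plain watchdog thread:

        #include <pthread.h>
        #include <stdio.h>
        #include <unistd.h>

        static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
        static int mgmt_v3_locked = 0; /* stands in for the mgmt_v3 lock */

        static void *
        lock_timeout_cb (void *arg)
        {
            sleep (*(unsigned int *)arg); /* stand-in for the timer wheel */
            pthread_mutex_lock (&state_lock);
            if (mgmt_v3_locked) {
                mgmt_v3_locked = 0; /* forcibly release the stale lock */
                fprintf (stderr, "mgmt_v3 lock released by timer\n");
            }
            pthread_mutex_unlock (&state_lock);
            return NULL;
        }

        int
        main (void)
        {
            static unsigned int timeout = 1;
            pthread_t timer;

            pthread_mutex_lock (&state_lock);
            mgmt_v3_locked = 1; /* acquire, then "forget" to unlock */
            pthread_mutex_unlock (&state_lock);

            pthread_create (&timer, NULL, lock_timeout_cb, &timeout);
            pthread_join (timer, NULL);
            return 0;
        }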
* glusterd: introduce max-port range (Atin Mukherjee, 2017-08-17, 1 file, -2/+15)

    The glusterd.vol file has always had an option (commented out) to
    indicate the base-port from which the portmapper allocation starts.
    This patch brings in the max-port configuration, with which one can
    limit the range of ports that gluster is allowed to bind.

    Fixes: #305
    Change-Id: Id7a864f818227b9530a07e13d605138edacd9aa9
    Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
    Reviewed-on: https://review.gluster.org/18016
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Prashanth Pai <ppai@redhat.com>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Gaurav Yadav <gyadav@redhat.com>
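    A hedged sketch of how the glusterd.vol stanza might look with both
    bounds set (the option names follow the commit; the port values
    here are illustrative, not the shipped defaults):

        volume management
            type mgmt/glusterd
            option working-directory /var/lib/glusterd
            option base-port 49152
            option max-port  60999
        end-volume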
* glusterd: Gluster should keep PID file in correct location (Gaurav Kumar Garg, 2017-08-11, 1 file, -0/+77)

    Currently Gluster keeps the process pid information of all the
    daemons and brick processes in the Gluster configuration file
    directory (i.e., /var/lib/glusterd/*). These pid files should be
    separate from configuration files: deletion of the configuration
    file directory might result in serious problems. Also,
    /var/run/gluster is the default placeholder directory for pid
    files.

    So, with this fix Gluster will keep the pid information of all
    processes in the /var/run/gluster/* directory.

    Change-Id: Idb09e3fccb6a7355fbac1df31082637c8d7ab5b4
    BUG: 1258561
    Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
    Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com>
    Reviewed-on: https://review.gluster.org/13580
    Tested-by: MOHIT AGRAWAL <moagrawa@redhat.com>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* logging: localtime logging, cmdline, volume set option (Kaleb S. KEITHLEY, 2017-08-03, 1 file, -0/+20)

    Despite the fact that appliances generally use UTC, some users
    really want log entries in localtime.

    fixes gluster/glusterfs#272
    feature page: https://review.gluster.org/17807

    Change-Id: I5fbf2c3eedd9eb128fb3f851dd67b2f4081c8bba
    Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
    Reviewed-on: https://review.gluster.org/16911
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
    Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* glusterd: Introduce option to limit no. of muxed bricks per process (Samikshan Bairagya, 2017-07-10, 1 file, -0/+1)

    This commit introduces a new global option that can be set to limit
    the number of multiplexed bricks in one process. Usage:

        # gluster volume set all cluster.max-bricks-per-process <value>

    If this option is not set then multiplexing will happen for now
    with no limitations set; i.e. a brick process will have as many
    bricks multiplexed to it as possible. In other words, the current
    multiplexing behaviour won't change if this option isn't set to any
    value.

    This commit also introduces a brick process instance that contains
    information about brick processes, like the number of bricks
    handled by the process (which is 1 in non-multiplexing cases), the
    list of bricks, and the port number, which also serves as a unique
    identifier for each brick process instance. The brick process list
    is maintained in 'glusterd_conf_t'.

    Updates: #151
    Change-Id: Ib987d14ab0a4f6034dac01b73a4b2839f7b0b695
    Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
    Reviewed-on: https://review.gluster.org/17469
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* protocol/server: make listen backlog value configurable (Mohammed Rafi KC, 2017-06-08, 1 file, -5/+3)

    Problem: When we call listen from protocol/server, we use a
    hard-coded value of 10 if it is not given manually. With
    multiplexing, especially when glusterd restarts, all clients may
    try to connect to the server at once, which will result in
    overflowing the queue, and the kernel will complain about the
    errors.

    Solution: This patch introduces a volume set command to make the
    backlog value configurable. It also changes the default backlog
    value from 10 to 128. This change is only applicable for sockets
    listening from protocol.

    Example:
        gluster volume set <volname> transport.listen-backlog 1024

    Note:
    1. The brick has to be restarted to get this value in effect.
    2. This change won't be reflected in glusterd or other xlators
       which call listen. If you need it there, you have to add this
       option to the volfile.

    Change-Id: I0c5a2bbf28b5db612f9979e7560e05dd82b41477
    BUG: 1456405
    Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
    Reviewed-on: https://review.gluster.org/17411
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
    Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* glusterfs: Not able to mount running volume after enabling brick mux and stopping any volume (Mohit Agrawal, 2017-05-31, 1 file, -1/+1)

    Problem: After brick mux is enabled, if any volume is stopped and
    one then tries to mount a running volume, the mount command hangs.

    Solution: After brick mux is enabled, the server shares one data
    structure, server_conf, for all associated subvolumes. After any
    subvolume goes down in an ungraceful manner (e.g. the brick
    directory is removed), the posix xlator sends a GF_EVENT_CHILD_DOWN
    event to the parent xlators, and server notify updates child_up to
    false in server_conf. When a client tries to communicate with the
    server through a mount, it checks conf->child_up; since it is
    FALSE, the server throws the message "translators are not yet
    ready". This patch updates the server_conf structure to save the
    child_up status per xlator. Another important correction in this
    patch is the cleanup of threads on the server-side xlators after
    the volume is stopped.

    BUG: 1453977
    Change-Id: Ic54da3f01881b7c9429ce92cc569236eb1d43e0d
    Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
    Reviewed-on: https://review.gluster.org/17356
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* glusterd: socketfile & pidfile related fixes for brick multiplexing feature (Mohit Agrawal, 2017-05-09, 1 file, -3/+0)

    Problem: With brick multiplexing on, after restarting glusterd the
    CLI is not showing the pid of all brick processes in all volumes.

    Solution: With brick mux on, all local brick processes communicate
    through one UNIX socket, but as per the current code
    (glusterd_brick_start) glusterd tries to communicate with a
    separate UNIX socket for each volume, populated based on brick-name
    and vol-name. Because of the multiplexing design only one UNIX
    socket is opened, so glusterd throws a poller error and is not able
    to fetch the correct status of the brick process through the cli
    process. To resolve the problem, a new function
    glusterd_set_socket_filepath_for_mux is written, called by
    glusterd_brick_start to validate the existence of the socketpath.
    To avoid continuous EPOLLERR errors in the logs, the socket_connect
    code is updated.

    Test: To reproduce the issue, follow the steps below:
    1) Create two distributed volumes (dist1 and dist2)
    2) Set cluster.brick-multiplex on
    3) kill glusterd
    4) run: gluster v status

    After applying the patch, the correct pid is shown for all volumes.

    BUG: 1444596
    Change-Id: I5d10af69dea0d0ca19511f43870f34295a54a4d2
    Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
    Reviewed-on: https://review.gluster.org/17101
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Prashanth Pai <ppai@redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Fix removing pmap entry on rpc disconnect (Prashanth Pai, 2017-04-28, 1 file, -1/+1)

    Problem: The following line of code was intended to remove the pmap
    entry for the connection during disconnects:

        pmap_registry_remove (this, 0, NULL, GF_PMAP_PORT_NONE, xprt);

    However, no pmap entry will have its type set to GF_PMAP_PORT_NONE
    at any point in time. So a call to pmap_registry_search_by_xprt()
    in pmap_registry_remove() will always fail to find a match.

    Fix: Optionally ignore the pmap entry's type in
    pmap_registry_search_by_xprt().

    BUG: 1193929
    Change-Id: I705f101739ab1647ff52a92820d478354407264a
    Signed-off-by: Prashanth Pai <ppai@redhat.com>
    Reviewed-on: https://review.gluster.org/17129
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* build: conditionally build legacy gNFS server and associated sub-packaging (Kaleb S. KEITHLEY, 2017-04-28, 1 file, -29/+6)

    Plus some additional logic in glusterd to ensure gnfs (glusterfs)
    daemons are never started if the server/nfs xlator is not
    installed. As a service, nfs is still initialized. The
    glusterfs-gnfs RPM may be installed or uninstalled independent of
    anything else, including on a system where gluster is actively
    running, so the existence of the xlator is always tested before
    trying to start gnfs.

    Change-Id: I56743ad1cb36a84917226d7d26cb9d015d441e66
    BUG: 1326219
    Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
    Reviewed-on: https://review.gluster.org/16958
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* xlator: do not call dlclose() when debugging (Niels de Vos, 2017-04-07, 1 file, -2/+5)

    Valgrind cannot show the symbols of a .so after dlclose() has been
    called on it. The unhelpful ??? in the output gets resolved
    properly with this change:

        ==25170== 344 bytes in 1 blocks are definitely lost in loss record 233 of 324
        ==25170==    at 0x4C29975: calloc (vg_replace_malloc.c:711)
        ==25170==    by 0x52C7C0B: __gf_calloc (mem-pool.c:117)
        ==25170==    by 0x12B0638A: ???
        ==25170==    by 0x528FCE6: __xlator_init (xlator.c:472)
        ==25170==    by 0x528FE16: xlator_init (xlator.c:498)
        ==25170==    by 0x52DA8D6: glusterfs_graph_init (graph.c:321)
        ==25170==    by 0x52DB587: glusterfs_graph_activate (graph.c:695)
        ==25170==    by 0x5046407: glfs_process_volfp (glfs-mgmt.c:79)
        ==25170==    by 0x5043B9E: glfs_volumes_init (glfs.c:281)
        ==25170==    by 0x5044FEC: glfs_init_common (glfs.c:986)
        ==25170==    by 0x50451A7: glfs_init@@GFAPI_3.4.0 (glfs.c:1031)

    By not calling dlclose(), the dynamically loaded .so is still
    available upon program exit, and Valgrind is able to resolve the
    symbols. This adds an additional leak, so dlclose() is called for
    normal builds, but skipped when configuring with
    "./configure --enable-valgrind" or passing the "run-with-valgrind"
    xlator option.

    URL: http://valgrind.org/docs/manual/faq.html#faq.unhelpful
    Change-Id: I2044e21b1b8fcce32ad1a817fdd795218f967731
    BUG: 1425623
    Signed-off-by: Niels de Vos <ndevos@redhat.com>
    Reviewed-on: https://review.gluster.org/16809
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
    Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
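    A minimal sketch of the behaviour described above: keep the handle
    open when running under Valgrind so symbols stay resolvable. The
    flag and function names are illustrative, not the actual xlator
    code:

        #include <dlfcn.h>
        #include <stdio.h>

        static int run_with_valgrind = 0; /* set via configure/xlator option */

        static void
        xlator_unload (void *dl_handle)
        {
            if (!dl_handle)
                return;
            if (run_with_valgrind)
                /* Leak the handle on purpose: Valgrind can then
                 * resolve this .so's symbols in its leak report. */
                return;
            dlclose (dl_handle);
        }

        int
        main (void)
        {
            void *h = dlopen ("libm.so.6", RTLD_NOW);
            if (!h) {
                fprintf (stderr, "dlopen: %s\n", dlerror ());
                return 1;
            }
            xlator_unload (h);
            return 0;
        }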
* glusterd: hold off volume deletes while still restarting bricks (Jeff Darcy, 2017-03-30, 1 file, -0/+1)

    We need to do this because modifying the volume/brick tree while
    glusterd_restart_bricks is still walking it can lead to segfaults.
    Without waiting we could accidentally "slip in" while attach_brick
    has released big_lock between retries and make such a modification.

    Change-Id: I30ccc4efa8d286aae847250f5d4fb28956a74b03
    BUG: 1432542
    Signed-off-by: Jeff Darcy <jeff@pl.atyp.us>
    Reviewed-on: https://review.gluster.org/16927
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* posix: Fix creation of files with S_ISVTX on FreeBSD (Xavier Hernandez, 2017-02-18, 1 file, -1/+1)

    On FreeBSD the S_ISVTX flag is completely ignored when creating a
    regular file. Since gluster needs to create files with this flag
    set, especially for DHT link files, it's necessary to force the
    flag. This fix does so by calling fchmod() after creating a file
    that must have this flag set.

    Change-Id: I51eecfe4642974df6106b9084a0b144835a4997a
    BUG: 1411228
    Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
    Reviewed-on: https://review.gluster.org/16417
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
    Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
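    A sketch of the workaround described above (illustrative path and
    mode, not the actual posix xlator code): since FreeBSD's open(2)
    drops S_ISVTX from the create mode, set it explicitly with fchmod()
    right after creating the file:

        #include <fcntl.h>
        #include <stdio.h>
        #include <sys/stat.h>
        #include <unistd.h>

        int
        main (void)
        {
            /* DHT link files need the sticky bit (S_ISVTX) set. */
            mode_t mode = S_IRUSR | S_IWUSR | S_ISVTX;
            int fd = open ("/tmp/dht-linkfile",
                           O_CREAT | O_EXCL | O_WRONLY, mode);

            if (fd < 0) {
                perror ("open");
                return 1;
            }
            /* Force the flag that open() may have ignored. */
            if (fchmod (fd, mode) < 0)
                perror ("fchmod");
            close (fd);
            return 0;
        }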
* rpcsvc: Add rpchdr and proghdr to iobref before submitting to transport (Poornima G, 2017-02-15, 1 file, -2/+4)

    Issue: When fio is run on multiple clients (each client writes to
    its own files), and meanwhile the clients do a readdirp, the client
    which did the readdirp will now receive the upcalls. In this
    scenario the client disconnects with an rpc decode failed error.

    RCA: Upcall calls rpcsvc_request_submit to submit the request to
    the socket. rpcsvc_request_submit currently:

        rpcsvc_request_submit () {
            iobuf = iobuf_new
            iov = iobuf->ptr
            fill iobuf to contain xdrised upcall content - proghdr
            rpcsvc_callback_submit (..iov..)
            ...
            if (iobuf)
                iobuf_unref (iobuf)
        }

        rpcsvc_callback_submit (... iov...) {
            ...
            iobuf = iobuf_new
            iov1 = iobuf->ptr
            fill iobuf to contain xdrised rpc header - rpchdr
            msg.rpchdr = iov1
            msg.proghdr = iov
            ...
            rpc_transport_submit_request (msg)
            ...
            if (iobuf)
                iobuf_unref (iobuf)
        }

    rpcsvc_callback_submit assumes that once
    rpc_transport_submit_request() returns, the msg is written on to
    the socket and thus the buffers (rpchdr, proghdr) can be freed,
    which is not the case. Especially under high workload,
    rpc_transport_submit_request() may not be able to write to the
    socket immediately, and hence adds it to its own queue and returns
    as successful. Thus we have a use after free for rpchdr and
    proghdr. Hence the client gets garbage rpchdr and proghdr and fails
    to decode the rpc, resulting in a disconnect.

    To prevent this, we need to add the rpchdr and proghdr to an iobref
    and send it in msg:

        iobref_add (iobref, iobufs)
        msg.iobref = iobref;

    The socket layer takes a ref on msg.iobref if it cannot write to
    the socket and is adding the message to the queue. Thus we do not
    have a use after free.

    Thank you for discussing, debugging and fixing along:
    Prashanth Pai <ppai@redhat.com>
    Raghavendra G <rgowdapp@redhat.com>
    Rajesh Joseph <rjoseph@redhat.com>
    Kotresh HR <khiremat@redhat.com>
    Mohammed Rafi KC <rkavunga@redhat.com>
    Soumya Koduri <skoduri@redhat.com>

    Change-Id: Ifa6bf6f4879141f42b46830a37c1574b21b37275
    BUG: 1421937
    Signed-off-by: Poornima G <pgurusid@redhat.com>
    Reviewed-on: https://review.gluster.org/16613
    Reviewed-by: Prashanth Pai <ppai@redhat.com>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: soumya k <skoduri@redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd: add a cli command to trigger a statedump on a client (Poornima G, 2017-01-23, 1 file, -0/+67)

    With this, we will be able to trigger statedumps on remote Gluster
    clients, mainly targeted at applications using libgfapi.

    Design: A SIGUSR signal is the most common way of taking a
    statedump in Gluster. But it cannot be used for libgfapi-based
    processes, as the process loading the library might have already
    consumed the SIGUSR signal. Hence going the command way: one has to
    issue a Gluster command to initiate a statedump on a libgfapi-based
    client. The command takes hostname and PID as arguments. All the
    glusterds in the cluster check if they are connected to the
    specified hostname, and send an RPC request to all the connected
    clients from that hostname (via the mgmt connection).

    URL: http://review.gluster.org/16357
    Change-Id: Icbe4d2f026b32a2c7d5535e1bfb2cdaaff042e91
    BUG: 1169302
    Signed-off-by: Poornima G <pgurusid@redhat.com>
    [ndevos: minor fixes and split patch in smaller pieces]
    Reviewed-on: https://review.gluster.org/9228
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
    Tested-by: Niels de Vos <ndevos@redhat.com>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
* geo-rep: Separate slave mount logs for each connection (Kotresh HR, 2017-01-18, 1 file, -3/+3)

    The geo-rep worker mounts the slave volume on the slave node. If
    multiple workers connect to the same slave node, all workers share
    the same mount log file. This is very difficult to debug, as logs
    are cluttered from different mounts. Hence this creates a separate
    mount log file for each connection from a worker. Each connection
    from a worker is identified uniquely using 'mastervol uuid',
    'master host', 'master brickpath', and 'slave vol'. The log file
    name is the combination of the above.

    Change-Id: I67871dc8e8ea5864e2ad55e2a82063be0138bf0c
    BUG: 1412689
    Signed-off-by: Kotresh HR <khiremat@redhat.com>
    Reviewed-on: http://review.gluster.org/16384
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Aravinda VK <avishwan@redhat.com>
* glusterd: Get maximum supported op-version in a cluster (Samikshan Bairagya, 2017-01-08, 1 file, -0/+1)

    gluster volume get <VOLNAME> cluster.opversion gives us the current
    op-version on which the cluster is operating. There is no command
    that lets the user know the maximum supported op-version that the
    cluster can run on.

    This patch adds a new global option, cluster.max-op-version, that
    can be used to retrieve the maximum supported op-version in a
    cluster.

    Usage:
        # gluster volume get all cluster.max-op-version

    Example output:
        Option                    Value
        ------                    -----
        cluster.max-op-version    30900

    NOTE: The only way to test this feature for now is to set the
    GD_OP_VERSION_MAX macro to different values (30800 for 3.8, 30900
    for 3.9, and so on) and rebuild glusterd. Since the regression test
    framework currently doesn't have support to simulate these tests,
    there are no accompanying regression tests for this feature. It
    should be possible to add tests once glusto comes in and makes it
    easier to run a heterogeneous cluster.

    Change-Id: I547480ee5e7912664784643e436feb198b6d16d0
    BUG: 1365822
    Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
    Reviewed-on: http://review.gluster.org/16283
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Introduce reset brick (Anuradha Talur, 2016-08-29, 1 file, -0/+1)

    The command basically allows replace brick with src and dst bricks
    as the same.

    Usage:
        gluster v reset-brick <volname> <hostname:brick-path> start

    This command kills the brick to be reset. Once this command is run,
    admins can do other manual operations that they need to do, like
    configuring some options for the brick. Once this is done,
    resetting the brick can be continued with the following options:

        gluster v reset-brick <vname> <hostname:brick> <hostname:brick> commit {force}

    This does the job of resetting the brick. The 'force' option should
    be used when the brick already contains the volinfo id.

    Problem: On doing a disk-replacement of a brick in a replicate
    volume, the following 2 scenarios may occur:

    a) there is a chance that reads are served from this replaced-disk
       brick, which leads to empty reads.
    b) potential data loss if the next writes succeed only on the
       replaced brick, and heal is done to other bricks from this one.

    Solution: After disk-replacement, make sure that the reset-brick
    command is run for that brick so that pending markers are set for
    the brick and it is not chosen as a source for reads and heal. But,
    as of now, replace-brick for the same brick-path is not allowed. In
    order to fix the above mentioned problem, same-brick-path
    replace-brick is needed. With this patch, reset-brick commit
    {force} will be allowed even when source and destination
    <hostname:brickpath> are identical, as long as:
    1) the destination brick is not alive
    2) source and destination bricks have the same brick uuid and path

    Also, the destination brick after replace-brick will use the same
    port as the source brick.

    Change-Id: I440b9e892ffb781ea4b8563688c3f85c7a7c89de
    BUG: 1266876
    Signed-off-by: Anuradha Talur <atalur@redhat.com>
    Reviewed-on: http://review.gluster.org/12250
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Ashish Pandey <aspandey@redhat.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* glusterd: fix unused variable warnings/errors (Kaleb S. KEITHLEY, 2016-08-29, 1 file, -6/+0)

    http://review.gluster.org/14085 fixes a/the "leak" - via the
    generated rpc/xdr headers - of pragmas that mask these warnings.
    However, 14085 won't pass the smoke test until all the warnings are
    fixed.

    Change-Id: Id6f7555e35fd2bc37e8b9f81ac37d5384624501c
    BUG: 1369124
    Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
    Reviewed-on: http://review.gluster.org/15285
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Prashanth Pai <ppai@redhat.com>
* glusterd: Fix gsyncd upgrade issue (Kotresh HR, 2016-07-13, 1 file, -2/+59)

    Problem: gluster upgrade is not generating new volfiles.

    Cause: During upgrade, "glusterd --xlator-option *.upgrade=on -N"
    is run to generate new volfiles. It is run post 'glusterfs' rpm
    installation. The above command fails during upgrade if
    geo-replication is installed, because on glusterd start the
    'gsyncd' binary is called to configure geo-replication related
    stuff. Since the 'glusterfs' rpm is installed prior to the
    'geo-rep' rpm, the 'gsyncd' binary used during the glusterd upgrade
    command is of the old version, and hence it fails before generating
    new volfiles.

    Solution: Don't call the geo-replication configure during
    upgrade/downgrade. Geo-replication configuration happens during the
    start of glusterd after upgrade.

    Change-Id: Id58ea44ead9f69982f86fb68dc5b9ee3f6cd11a1
    BUG: 1355628
    Signed-off-by: Kotresh HR <khiremat@redhat.com>
    Reviewed-on: http://review.gluster.org/14898
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>