path: root/xlators/mgmt/glusterd/src/glusterd-utils.c
Commit message | Author | Age | Files | Lines
* glusterd: allow shared-storage to use bricks under glusterd working directory (Sanju Rakonde, 2018-11-08, 1 file, -4/+5)
  With commit 44e4db, we do not allow users to create a volume using glusterd's working directory, or any sub-directory under it, as a brick. This has broken shared-storage, since the volume "gluster-shared-storage" is created using bricks under glusterd's working directory. With this patch, we allow the "gluster-shared-storage" volume to use bricks under glusterd's working directory. fixes: bz#1647029 Change-Id: Ifcbcf4576eea12cf46f199dea287b29bd3ec3bfd Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: coverity fixes (Atin Mukherjee, 2018-11-03, 1 file, -6/+8)
  Addresses CIDs: 1124769, 1124852, 1124864, 1134024, 1229876, 1382382. Also addresses a spurious failure in tests/bugs/glusterd/df-results-post-replace-brick-operations.t by ensuring that, after the replace-brick operation and before 'df' is triggered from the mount, the client has a connection to the newly replaced bricks. Change-Id: Ie5d7e02f89400a661491d7fc2a120d6f6a83a1cc Updates: bz#789278 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* stripe: remove the translator from build and glusterd (Amar Tumballi, 2018-10-31, 1 file, -4/+1)
  Based on the proposal to remove a few features that are not actively maintained [1], remove the stripe translator from the build. Also make sure there are no regression tests involving the stripe translator. [1] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html Note that this patch aims at removing the translator from the build; a follow-up patch is needed to remove the code from the repository. Updates: bz#1364707 Change-Id: I235b305338f138e29e9f30cba65bc0dadbebbbd5 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* mgmt/glusterd: use proper path to the volfile (Raghavendra Bhat, 2018-10-04, 1 file, -4/+4)
  Till now, glusterd was generating the volfile path for a snapshot volume's bricks like this:
      /snaps/<snap name>/<brick volfile>
  But in reality, the path to the brick volfile for a snapshot volume is:
      /snaps/<snap name>/<snap volume name>/<brick volfile>
  The above workaround was used to distinguish between a mount command used to mount the snapshot volume and a brick of the snapshot volume, so that based on what is actually happening, glusterd can return the proper volfile (client volfile for the former, brick volfile for the latter). But this was causing problems for snapshot restore when brick multiplexing is enabled, because with brick multiplexing the brick tries to find the volfile and sends the GETSPEC rpc call to glusterd using the second style of path, i.e. /snaps/<snap name>/<snap volume name>/<brick volfile>. So, when the snapshot brick (which is multiplexed) sent a GETSPEC rpc request to glusterd for obtaining the brick volume file, glusterd was returning the client volume file of the snapshot volume instead of the brick volume file. Change-Id: I28b2dfa5d9b379fe943db92c2fdfea879a6a594e fixes: bz#1635050 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* mgmt/glusterd: NULL pointer dereferencing clang fix (Iraj Jamali, 2018-10-02, 1 file, -1/+1)
  Changed this->name to "glusterd". Updates: bz#1622665 Change-Id: Ic8ce428cefd6a5cecf5547769d8b13f530065c56 Signed-off-by: Iraj Jamali <ijamali@redhat.com>
* glusterd: fix coverity issues (Sanju Rakonde, 2018-10-01, 1 file, -1/+2)
  This patch fixes CID 1274175 (buffer size warning) and CID 1175018 (resource leak). Change-Id: Id18960c249447b8dae35de3ad92bc570e62ddb09 updates: bz#789278 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: fix resource leak coverity issues (Sanju Rakonde, 2018-09-22, 1 file, -1/+1)
  This patch addresses CID 1288098, 1370948 and 1382454: key_fixed is allocated memory but never freed. updates: bz#789278 Change-Id: Iea805c668ba89759313f9e21b328757e570be97b Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: Use GF_ATOMIC to update 'blockers' counter at glusterd_conf (Mohit Agrawal, 2018-09-20, 1 file, -9/+5)
  Problem: The glusterd code currently uses sync_lock/sync_unlock to update the blockers counter, which can add delays to the overall transaction phase, especially when a batch of volume stop operations is processed by glusterd in brick multiplexing mode. Solution: Use GF_ATOMIC to update the blockers counter so that unnecessary context switching is avoided. Change-Id: Ie13177dfee2af66687ae7cf5c67405c152853990 Fixes: bz#1631128 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
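  As an illustration of the before/after pattern described here, a minimal sketch using standard C11 atomics; the lock and counter names are made up, not glusterd's actual symbols.

    #include <pthread.h>
    #include <stdatomic.h>

    /* Before: every update to the 'blockers' counter took a lock (analogous to
     * the sync_lock()/sync_unlock() mentioned above), so contended updates
     * forced context switches. */
    static pthread_mutex_t conf_lock = PTHREAD_MUTEX_INITIALIZER;
    static int blockers_locked;

    static void blocker_inc_locked(void) {
        pthread_mutex_lock(&conf_lock);
        blockers_locked++;
        pthread_mutex_unlock(&conf_lock);
    }

    /* After: a lock-free atomic increment, which is the kind of operation the
     * GF_ATOMIC macros provide. */
    static atomic_int blockers;

    static void blocker_inc_atomic(void) {
        atomic_fetch_add(&blockers, 1);   /* no lock, no context switch */
    }

    int main(void) {
        blocker_inc_locked();
        blocker_inc_atomic();
        return 0;
    }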
* geo-rep: Fix issues related to config set (Kotresh HR, 2018-09-18, 1 file, -2/+2)
  1. The '--ignore-missing-args' option for rsync was not being used even though the rsync version is greater than 3.1.0. Fixed the same.
  2. The '--existing' option for rsync was also not being used. Fixed the same.
  3. geo-rep config fails to set rsync-options because the value contains '--'. Interestingly, python argparse treats a value with '--' (e.g. --ignore-missing-args) as an option, but when passed as --value=--ignore-missing-args it succeeds. Fixed the same.
  Change-Id: Iaeb838acaff1c2920fee9c7f920c99edce13a0a1 Signed-off-by: Kotresh HR <khiremat@redhat.com> fixes: bz#1629561
* Land part 2 of clang-format changes (Gluster Ant, 2018-09-12, 1 file, -11982/+11581)
  Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
* Some (mgmt) xlators: use dict_{setn|getn|deln|get_int32n|set_int32n|set_strn} (Yaniv Kaul, 2018-09-09, 1 file, -535/+728)
  In a previous patch (https://review.gluster.org/20769) we added the ability to pass the key length to dict_* funcs, to remove the need to strlen() it. This patch moves some xlators to use it. It also adds dict_get_int32n, which was missing, and reduces the size of some key variables: they were set to 1024 bytes or PATH_MAX where sometimes 64 bytes are really enough. Please review carefully: 1. that the size of the key variables was not reduced too much; 2. that no keys were mixed up. Compile-tested only! Change-Id: Ic729baf179f40e8d02bc2350491d4bb9b6934266 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
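  As a rough, hypothetical illustration of passing a precomputed key length instead of letting the callee strlen() it (the lookup helper below is invented for the example, not the dict_*n API):

    #include <stdio.h>
    #include <string.h>

    struct entry { const char *key; int value; };

    /* Hypothetical lookup that takes the key length explicitly, in the spirit
     * of dict_getn(); a real dict would hash the key, this compares one entry. */
    static int kv_get_n(const struct entry *e, const char *key, size_t keylen)
    {
        /* The caller already knows keylen, so no strlen() is needed here. */
        if (strlen(e->key) == keylen && memcmp(e->key, key, keylen) == 0)
            return e->value;
        return -1;
    }

    int main(void)
    {
        struct entry e = {"brick-count", 3};
        /* For a string literal the length is a compile-time constant. */
        printf("%d\n", kv_get_n(&e, "brick-count", sizeof("brick-count") - 1));
        return 0;
    }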
* glusterd: avoid using glusterd's working directory as a brick (Sanju Rakonde, 2018-09-08, 1 file, -0/+9)
  Add checks to prevent glusterd's working directory from being used as a brick during volume creation. fixes: bz#853601 Change-Id: I4b16a05f752e92216aa628f542a4fdbf59b3c669 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* multiple xlators (mgmt): strncpy()->sprintf(), reduce strlen()'s (Yaniv Kaul, 2018-09-07, 1 file, -124/+178)
  Files touched (all under xlators/mgmt/glusterd/src/): glusterd-geo-rep.c, glusterd-handshake.c, glusterd-sm.c, glusterd-store.c, glusterd-utils.c, glusterd-volgen.c, glusterd-volume-ops.c, glusterd.c.
  strncpy may not be very efficient for short strings copied into a large buffer: if the length of src is less than n, strncpy() writes additional null bytes to dest to ensure that a total of n bytes are written. Instead, use snprintf() and try to ensure the output is not truncated. Also: save the result of strlen() and re-use it when possible, and move from strlen() to SLEN (sizeof()) for const strings. Compile-tested only! Change-Id: Ib5d001857236f43e41c4a51b5f48e1a33110aaeb updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
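  A small, self-contained sketch of the two patterns this commit applies, with made-up buffer and string names:

    #include <stdio.h>
    #include <string.h>

    int main(void) {
        char buf[4096];
        const char *name = "brick1";

        /* strncpy() pads the whole destination with NUL bytes when the source
         * is shorter than the buffer: here roughly 4090 extra bytes get written. */
        strncpy(buf, name, sizeof(buf));

        /* snprintf() writes only the string plus one terminating NUL, and its
         * return value reveals truncation. */
        int n = snprintf(buf, sizeof(buf), "%s", name);
        if (n < 0 || (size_t)n >= sizeof(buf))
            fprintf(stderr, "output truncated\n");

        /* Save strlen() results instead of recomputing them; for literals,
         * sizeof("...") - 1 (the SLEN() idea) needs no runtime call at all. */
        size_t len = strlen(name);
        printf("copied %zu bytes into a %zu-byte buffer\n", len, sizeof(buf));
        return 0;
    }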
* {dht-rebalance|glusterd-geo-rep|glusterd-utils|nfs|bd}.c: no dict_del before dict_set (Yaniv Kaul, 2018-09-04, 1 file, -2/+0)
  There is no need to remove an item before re-setting it. Compile-tested only! Change-Id: I2869aec9ebf474859127b8b38d284246e6097e84 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* Multiple files: calloc -> malloc (Yaniv Kaul, 2018-09-04, 1 file, -2/+2)
  Move to GF_MALLOC() instead of GF_CALLOC() when possible. Files touched:
  xlators/storage/posix/src/posix-inode-fd-ops.c, xlators/storage/posix/src/posix-helpers.c, xlators/storage/bd/src/bd.c, xlators/protocol/client/src/client-lk.c, xlators/performance/quick-read/src/quick-read.c, xlators/performance/io-cache/src/page.c, xlators/performance/md-cache/src/md-cache.c, xlators/nfs/server/src/nfs3-helpers.c, xlators/nfs/server/src/nfs-fops.c, xlators/nfs/server/src/mount3udp_svc.c, xlators/nfs/server/src/mount3.c, xlators/mount/fuse/src/fuse-helpers.c, xlators/mount/fuse/src/fuse-bridge.c, xlators/mgmt/glusterd/src/glusterd-utils.c, xlators/mgmt/glusterd/src/glusterd-syncop.h, xlators/mgmt/glusterd/src/glusterd-snapshot.c, xlators/mgmt/glusterd/src/glusterd-rpc-ops.c, xlators/mgmt/glusterd/src/glusterd-replace-brick.c, xlators/mgmt/glusterd/src/glusterd-op-sm.c, xlators/mgmt/glusterd/src/glusterd-mgmt.c, xlators/meta/src/subvolumes-dir.c, xlators/meta/src/graph-dir.c, xlators/features/trash/src/trash.c, xlators/features/shard/src/shard.h, xlators/features/shard/src/shard.c, xlators/features/marker/src/marker-quota.c, xlators/features/locks/src/common.c, xlators/features/leases/src/leases-internal.c, xlators/features/gfid-access/src/gfid-access.c, xlators/features/cloudsync/src/cloudsync-plugins/src/cloudsyncs3/src/libcloudsyncs3.c, xlators/features/bit-rot/src/bitd/bit-rot.c, xlators/features/bit-rot/src/bitd/bit-rot-scrub.c, xlators/encryption/crypt/src/metadata.c, xlators/encryption/crypt/src/crypt.c.
  It doesn't make sense to calloc (allocate and clear) memory when the code right away fills that memory with data. It may be optimized by the compiler, or have a microscopic performance improvement. In some cases, the allocation size was also changed to be sizeof of a struct or type instead of a pointer, which is easier to read, and redundant strlen() calls were removed by saving the result into a variable, allocating only as much memory as needed. 1. Only done for the straightforward cases; there's room for improvement. 2. Please review carefully, especially string allocations with the terminating NULL. Only compile-tested! xlators/nfs/server/src/mount3.c: don't blindly allocate PATH_MAX, but strlen() the string and allocate appropriately. Also, align error messages. updates: bz#1193929 Original-Author: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: Ibda6f33dd180b7f7694f20a12af1e9576fe197f5
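  A small sketch of the underlying principle; the struct and helper are invented for the example (glusterfs's GF_MALLOC/GF_CALLOC wrappers add memory accounting on top of plain malloc/calloc):

    #include <stdio.h>
    #include <stdlib.h>

    struct brick_info { char path[256]; int port; };

    /* When every field is filled in immediately, calloc()'s zeroing is wasted
     * work; malloc() plus explicit initialization is enough. */
    static struct brick_info *brick_new(const char *path, int port) {
        struct brick_info *b = malloc(sizeof(*b));  /* was: calloc(1, sizeof(*b)) */
        if (!b)
            return NULL;
        snprintf(b->path, sizeof(b->path), "%s", path);
        b->port = port;
        return b;
    }

    int main(void) {
        struct brick_info *b = brick_new("/bricks/brick1", 49152);
        if (b)
            printf("%s:%d\n", b->path, b->port);
        free(b);
        return 0;
    }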
* glusterd: fail volume stop operation if brick detach fails (Atin Mukherjee, 2018-09-04, 1 file, -8/+22)
  While sending a detach request for a brick in brick multiplexing mode, if the brick isn't connected for any reason, glusterd fails to detach the brick, but because of the missing error-code handling it still marked the volume as stopped. Fix is to handle the return code of send_attach_req in glusterd_volume_stop_glusterfs(). Change-Id: I886202969c96eec3620f74cd7027652d6287f4be Fixes: bz#1624440 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Fix coverity issues (Sanju Rakonde, 2018-08-29, 1 file, -2/+1)
  This patch fixes CID 1395250 (uninitialized variable) and CID 1395252 (out-of-bounds access). updates: bz#789278 Change-Id: Icf646364b14d48fa2bd82ea78ca5cdb5c684355f Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: glusterd_brick_start shouldn't cleanup pidfile if only_connect is true (Mohit Agrawal, 2018-08-27, 1 file, -4/+4)
  Problem: Sometimes glusterd cleans up the pidfile even though the brick is started, and the CLI shows the volume status as "N/A". Solution: Update the condition in glusterd_brick_start to avoid the pidfile cleanup when the only_connect flag is true. Fixes: bz#1622422 Change-Id: I8decb34597126b848e3a44d957e138833dd97350 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* mgmt/glusterd: Coverity fixes in glusterd-utils.c (Vijay Bellur, 2018-08-27, 1 file, -3/+19)
  Addresses the following CIDs:
  1388821: unchecked return value from sys_lremovexattr() in glusterd_check_and_set_brick_xattr();
  1370957: unused return value in glusterd_volume_tier_use_rsp_dict();
  1370950: memory leak in glusterd_get_global_options_for_all_vols();
  1370946: redundant gf_strdup() leading to a memory leak in glusterd_get_global_options_for_all_vols().
  Change-Id: I2ab58207bc43b40f004ee18463430a141126bf94 Updates: bz#789278 Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: glusterd_brick_start shouldn't try to bring up brick if only_connect is true (Atin Mukherjee, 2018-08-23, 1 file, -0/+4)
  With the latest refactoring of the glusterd_brick_start() function, we can run into a situation where is_gf_service_running() returns a valid pid that is running but doesn't belong to a gluster process. Even when the only_connect flag is passed as gf_true, we'd end up trying to start a brick, which causes a deadlock in brick multiplexing as both glusterd_restart_bricks() and glusterd_do_volume_quorum_action() keep context switching with each other for the same brick. A (truncated) gdb backtrace showed two threads stuck at the same glusterd_brick_start() frames (glusterd-utils.c:5834/5902/6251), one of them reached through glusterd_do_volume_quorum_action() (glusterd-server-quorum.c:402/443) from the friend-add RPC callback path (__glusterd_friend_add_cbk, glusterd-rpc-ops.c). Fix: Ensure we don't restart bricks if only_connect is true and just ensure that the brick is attempted to be connected. Test: Simulated a code change so that gf_is_service_running() always returns true to hit the scenario. Change-Id: Iec184e6c9e8aabef931d310f931f4d7a580f0f48 Fixes: bz#1620544 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* xlators/mgmt/glusterd/src/glusterd-utils.c: re-scope message variable (Yaniv Kaul, 2018-08-23, 1 file, -1/+1)
  The error message variable changed in scope - it is now defined in a smaller scope. Compile-tested only! Change-Id: I16dda11c30099b0e448b8e44a300f153727ce8da updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* snapshot: fix snapshot status failure due to symlink problem (Sunny Kumar, 2018-08-23, 1 file, -1/+9)
  Problems: 1. Snapshot status for individual snapshots was failing after an OS upgrade from RHEL6 to RHEL7. 2. Post upgrade, snapshot creation of a cloned volume was failing. Root Cause: When the OS is upgraded from RHEL6 to RHEL7 there is a difference in the /var/run symlink between the two versions. When /var/run is symlinked to /run, the mount command resolves the path and mounts it, but calls to functions that depend on the absolute path (like the strcmp in glusterd_get_mnt_entry_info) fail. Solution: Resolve the input path to an absolute path before calling these functions. Test: Tested on the same setup where the issue was reported; after this patch the snapshot issues are completely resolved. Change-Id: I5ba57998cea614c6072709f52f42a57562018844 fixes: bz#1619843 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
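  The fix described above can be sketched with realpath(), which resolves symlinks such as /var/run -> /run before the string comparison; function and variable names here are illustrative:

    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Resolve the input path to an absolute, symlink-free path before
     * comparing it against a mount-table entry. Returns 1 on match, 0 on
     * mismatch, -1 if the path cannot be resolved. */
    static int paths_refer_to_same_mount(const char *input, const char *mnt_entry)
    {
        char resolved[PATH_MAX];

        if (!realpath(input, resolved))
            return -1;
        return strcmp(resolved, mnt_entry) == 0;
    }

    int main(void)
    {
        /* On RHEL7, "/var/run" resolves to "/run". */
        printf("%d\n", paths_refer_to_same_mount("/var/run", "/run"));
        return 0;
    }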
* glusterd: fix gcc warnings (Ravishankar N, 2018-08-21, 1 file, -146/+146)
  ...and also changed char array initialization from {0,} to "". gcc version 8.1.1 20180712 (Red Hat 8.1.1-5) (GCC) on Fedora 28. Sample format-truncation warnings:
  glusterd-utils.c:7234: '.hostname' directive output may be truncated writing 9 bytes into a region of size between 1 and 1024, for snprintf (key, sizeof (key), "%s.hostname", base_key);
  glusterd-snapshot.c:3090: '/' directive output may be truncated writing 1 byte into a region of size between 0 and 4095, for snprintf (snap_path, sizeof (snap_path) - 1, "%s/%s", snap_mount_dir, snap_vol->snapshot->snapname);
  glusterd-statedump.c:28: '%s' directive output may be truncated writing up to 4095 bytes into a region of size 144, for snprintf (subkey, sizeof (subkey), "%s%d", key, index);
  updates: bz#1193929 Change-Id: Ic721f27b28d1221c124b570e81c55528f5b7f3cd Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* glusterd: fix some coverity issues (Bhumika Goyal, 2018-08-20, 1 file, -0/+6)
  Fixes CIDs: 1241481, 1241482, 1274079, 1274118, 1274121, 1274131, 1274198, 1274214, 1274220, 1274224, 1394663, 1394641, 382454, 1382453, 1382449, 1288095. Link: https://scan6.coverity.com/reports.htm#v42388/p10714/fileInstanceId=84772667&defectInstanceId=25770661&mergedDefectId=744716 Change-Id: Idaf434186231c8b0fff4b27c57fa23636a89c8a7 updates: bz#789278 Signed-off-by: Bhumika Goyal <bgoyal@redhat.com>
* glusterd: coverity defects fix introduced by commit 1f3bfe7 (Atin Mukherjee, 2018-08-17, 1 file, -1/+1)
  Commit 1f3bfe7 stripped down the total size of certain path-related variables in glusterd_brickinfo_t, which introduced a couple of new coverity defects. Fix the following: CID 1394969 (destination buffer too small) and CID 1394968 (out-of-bounds access). Change-Id: Ibc30eac4680cc6c83bd89d248f1435cb6a3d1b75 updates: bz#789278 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: ignore importing volume which is undergoing a delete operation (Atin Mukherjee, 2018-08-16, 1 file, -3/+27)
  Problem: In a 3-node cluster, if N1 originates a delete operation and, while N1's commit phase completes, the glusterd service of N2 or N3 gets disconnected from N1 (before completing its own commit phase), the volume whose delete is still in flight on the other nodes ends up being imported back as a fresh volume, resulting in an incorrect configuration state. Fix: Mark a volume as stage-deleted once a volume delete operation passes its staging phase, and reset this flag during the unlock phase. If the same volume gets imported to other peers during this intermediate phase, it shouldn't be considered as recreated. An automated .t is quite tough to implement with the current infra. Test Case: 1. Keep creating and deleting volumes in a loop on a 3-node cluster. 2. Simulate n/w failure between the peers (ifdown followed by ifup). 3. Check that the output of 'gluster v list | wc -l' is the same across all 3 nodes during 1 & 2. Change-Id: Ifdd5dc39699120258d7fdd42fe2deb9de25c6246 Fixes: bz#1605077 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* All: remove memset() before sprintf() (Yaniv Kaul, 2018-08-14, 1 file, -226/+19)
  It's not needed. There's a good chance the compiler is smart enough to remove it anyway, but it can't hurt - I hope. Compile-tested only! Change-Id: Id7c054e146ba630227affa591007803f3046416b updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* glusterd: Compare volume_id before start/attach a brick (Mohit Agrawal, 2018-08-10, 1 file, -20/+27)
  Problem: After a node reboot, a brick does not come up because the fsid comparison fails before the brick is started. Solution: Compare volume_id instead of fsid, because the fsid changes after a node reboot while the volume_id persists as an xattr on the brick root path from the time the volume was created. Change-Id: Ic289aab1b4ebfd83bbcae8438fee26ae61a0fff4 fixes: bz#1612418 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
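  A rough sketch of such a check, assuming the volume id is stored as a 16-byte UUID in a 'trusted.glusterfs.volume-id' xattr on the brick root; the xattr name and layout here are assumptions for illustration:

    #include <string.h>
    #include <sys/types.h>
    #include <sys/xattr.h>

    /* Read the volume-id xattr stored on the brick root at volume-creation
     * time and compare it with the expected UUID. Returns 1 on match, 0 on
     * mismatch, -1 if the xattr cannot be read. */
    int brick_volume_id_matches(const char *brick_root,
                                const unsigned char expected[16])
    {
        unsigned char on_disk[16];
        ssize_t len = getxattr(brick_root, "trusted.glusterfs.volume-id",
                               on_disk, sizeof(on_disk));
        if (len != (ssize_t)sizeof(on_disk))
            return -1;
        return memcmp(on_disk, expected, sizeof(on_disk)) == 0;
    }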
* glusterd: stricter checks on whether a brick is running in multiplex mode (Atin Mukherjee, 2018-08-09, 1 file, -32/+39)
  The gf_attach() utility, which the kill_brick() function in tests/volume.rc uses, can help in detaching a brick instance from the brick process, but it has the following caveats: 1. It doesn't ensure the respective brick is marked as stopped, which glusterd does from glusterd_brick_stop. 2. Sometimes, if kill_brick() is executed just after a brick stack comes up, mgmt_rpc_notify() can take some time before marking priv->connected to 1, and if kill_brick() is executed before that, the brick will fail to initiate the pmap_signout which in turn cleans up the pidfile. To avoid such possibilities, a stricter check on whether a brick is running has been brought in for brick multiplexing: it not only checks for the pid's existence but also checks whether the respective process has the brick instance associated with it before checking the brick's status. Change-Id: I98b92df949076663b9686add7aab4ec2f24ad5ab Fixes: bz#1595320 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Bricks of a normal volume should not attach to gluster_shared_storage bricks (Sanju Rakonde, 2018-08-03, 1 file, -0/+14)
  Problem: In a brick multiplexing environment, bricks of a normal volume created by the user get attached to the bricks of the "gluster_shared_storage" volume, which is created by enabling the enable-shared-storage option. Mounting gluster_shared_storage has strict authentication checks, so when bricks of a normal volume are attached to bricks of gluster_shared_storage, mounting the normal volume created by the user fails due to those strict authentication checks. Solution: We should not attach bricks of a normal volume to the brick process of the gluster_shared_storage volume, and vice versa. fixes: bz#1610726 Change-Id: If1b5a2a02675789a2915ba480fb48c145449163d Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: Add multiple checks before attach/start a brick (Mohit Agrawal, 2018-07-27, 1 file, -44/+216)
  Problem: In a brick-mux scenario, sometimes glusterd is not able to start/attach a brick while gluster v status shows the brick as already running.
  Solution:
  1) To make sure the brick is running, check for the brick_path in /proc/<pid>/fd; if the brick is consumed by the brick process, it means the brick stack has come up, otherwise not.
  2) Before start/attach of a brick, check whether the brick is mounted.
  3) At the time of printing volume status, check that the brick is consumed by some brick process.
  Test: 1) Set up a brick-mux environment on a VM. 2) Put a breakpoint in gdb in the function posix_health_check_thread_proc at the time of notifying the GF_EVENT_CHILD_DOWN event. 3) Unmount any one brick path forcefully. 4) Check gluster v status; it will show N/A for the brick. 5) Try to start the volume with the force option; glusterd throws the message "No device available for mount brick". 6) Mount the brick_root path. 7) Try to start the volume with the force option. 8) The down brick is started successfully.
  Change-Id: I91898dad21d082ebddd12aa0d1f7f0ed012bdf69 fixes: bz#1595320 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
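  Check (1) above can be sketched as a small /proc scan; this is an illustrative stand-in, not glusterd's actual helper:

    #include <dirent.h>
    #include <limits.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Walk /proc/<pid>/fd and see whether the brick process holds any fd under
     * the given brick path, indicating the brick stack has actually come up
     * inside the multiplexed process. Returns 1 if found, 0 if not, -1 on error. */
    int brick_path_in_use(pid_t pid, const char *brick_path)
    {
        char fd_dir[64], link_path[PATH_MAX], target[PATH_MAX];
        struct dirent *entry;
        int found = 0;

        snprintf(fd_dir, sizeof(fd_dir), "/proc/%d/fd", (int)pid);
        DIR *dir = opendir(fd_dir);
        if (!dir)
            return -1;

        while ((entry = readdir(dir)) != NULL) {
            snprintf(link_path, sizeof(link_path), "%s/%s", fd_dir, entry->d_name);
            ssize_t n = readlink(link_path, target, sizeof(target) - 1);
            if (n < 0)
                continue;              /* "." and ".." are not symlinks */
            target[n] = '\0';
            if (strncmp(target, brick_path, strlen(brick_path)) == 0) {
                found = 1;
                break;
            }
        }
        closedir(dir);
        return found;
    }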
* glusterd: Coverity issues with type FORWARD_NULL (Sanju Rakonde, 2018-07-24, 1 file, -6/+4)
  This patch fixes coverity issues 102, 103, 112 and 119 from [1]. [1] https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-07-23-5fa004f3/html/ Updates: bz#789278 Change-Id: I99762eb0bcbd974a5250434777db63520f2ce2e6 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: Deadcode Coverity issue (Oshank Kumar, 2018-07-24, 1 file, -2/+0)
  This patch fixes coverity issue 74 from [1]. We update the ret value at line 5011 and immediately check whether ret is non-zero at line 5013. Only if ret is 0 do we continue to execute and reach line 5036, so by the time we reach 5036 the ret value is always 0. This block of code is therefore redundant and is removed. [1] https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-07-23-5fa004f3/html/ Updates: bz#789278 Change-Id: Ia6e8ba2936e350f0d29a9151ab786622f5e750db Signed-off-by: Oshank Kumar <okumar@redhat.com>
* All: run codespell on the code and fix issues. (Yaniv Kaul, 2018-07-22, 1 file, -7/+7)
  Please review, it's not always just the comments that were fixed. I've had to revert of course all calls to creat() that were changed to create() ... Only compile-tested! Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* glusterd: To find a compatible brick ignore diagnostics.brick-log-level option (Mohit Agrawal, 2018-07-13, 1 file, -0/+4)
  Problem: glusterd starts a volume as a separate process instead of attaching it to the already running process if the volume option has a different brick-log-level. There is no functional impact on a brick if the option has a different brick-log-level, so glusterd should attach the brick to the already running process. Solution: Ignore the brick-log-level option in unsafe_option. BUG: 1599628 Change-Id: I72638ff2026fcd9332bc38e1144b1ef4a708820b fixes: bz#1599628 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* glusterd: _is_prefix should handle 0-length paths (Kaushal M, 2018-07-11, 1 file, -0/+9)
  If one of the paths given to _is_prefix is 0-length, then it is not a prefix of the other. Hence, _is_prefix should return false. Change-Id: I54aa577a64a58940ec91872d0d74dc19cff9106d fixes: bz#1599783 Signed-off-by: Kaushal M <kaushal@redhat.com>
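  A minimal sketch of the rule described here (not glusterd's actual _is_prefix implementation):

    #include <stdbool.h>
    #include <string.h>

    /* Returns true only when candidate_prefix is a non-empty path prefix of
     * path, ending at a path-component boundary. */
    bool path_is_prefix(const char *path, const char *candidate_prefix)
    {
        size_t plen = strlen(path);
        size_t clen = strlen(candidate_prefix);

        if (plen == 0 || clen == 0)
            return false;                 /* the fix: a 0-length path is never a prefix */
        if (clen > plen)
            return false;
        if (strncmp(path, candidate_prefix, clen) != 0)
            return false;
        return path[clen] == '\0' || path[clen] == '/' ||
               candidate_prefix[clen - 1] == '/';
    }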
* glusterd: log improvements on brick creation validation (Atin Mukherjee, 2018-07-11, 1 file, -2/+15)
  Added a few log entries in glusterd_is_brickpath_available(). Change-Id: I8b758578f9db90d2974f7c79126c50ad3a001d71 Updates: bz#1193929 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* Fix compile warnings (Xavi Hernandez, 2018-07-10, 1 file, -94/+179)
  This patch fixes compile warnings that appear with newer compilers. The solution applied is only to remove the warnings, but it doesn't always solve the problem in the best way. It assumes that the problem will never happen, as the previous code assumed. Change-Id: I6e8470d6c2e2dbd3bd7d324b5fd2f92ffdc3d6ec updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* glusterd: show brick online after port registration even in brick-mux (Pranith Kumar K, 2018-07-05, 1 file, -8/+28)
  Problem: With brick-mux, glusterd marks bricks as online even before brick attach is complete on them. This can lead to a race where scripts that check whether the bricks are online assume a brick is online before it is completely up. Fix: Wait for the callback from the brick before marking the port as registered, so that volume status shows the correct status of the brick. fixes bz#1597568 Change-Id: Icd3dc62506af0cf75195e96746695db823312051 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* glusterd: start the services after all the bricks are up (Atin Mukherjee, 2018-07-03, 1 file, -9/+5)
  glusterd_svcs_manager() should be called only after starting all the volumes in one go. Change-Id: I838cc50c29f3930a483aa9671958cdc186904030 Fixes: bz#1597247 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Coverity Fixes (Sanju Rakonde, 2018-06-11, 1 file, -3/+4)
  Fixes: #789278 Change-Id: I633704fab49992cac6ee9e05bc368f7da360d09e Signed-off-by: Sanju Rakonde <srakonde@redhat.com> Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
* glusterd: address test failures with brick mux enabled (Atin Mukherjee, 2018-05-31, 1 file, -0/+11)
  This patch addresses the following:
  1. On volume stop, for the last brick, pmap_registry_remove() is invoked by glusterd.
  2. If a brick process is sigkilled, remove all the associated brick instances from the portmap.
  3. Bump up PROCESS_UP_TIMEOUT to 45.
  4. gf_attach to kill a brick takes more time in mux (which is an issue that needs a fix), but in the interim, give br-state-check.t more time to complete (there are 2 kill_bricks, each taking 120 seconds, and the test usually passes in 30 odd seconds, hence bumping this up to 350 seconds).
  5. The test bug-1559004-EMLINK-handling.t is taking ~950 seconds at times on master without mux; in mux cases, when it fails, it is almost at the last iteration, hence bumping the timeout for this test case to reduce regression error rates.
  Updates: bz#1577672 Change-Id: I1922675e112baca4c125c4c094eaa42a11e34e67 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: handling brick termination in brick-muxSanju Rakonde2018-05-071-4/+7
| | | | | | | | | | | | | | | | | Problem: There's a race between the glusterfs_handle_terminate() response sent to glusterd from last brick of the process and the socket disconnect event that encounters after the brick process got killed. Solution: When it is a last brick for the brick process, instead of sending GLUSTERD_BRICK_TERMINATE to brick process, glusterd will kill the process (same as we do it in case of non brick multiplecing). The test case is added for https://bugzilla.redhat.com/show_bug.cgi?id=1549996 Change-Id: If94958cd7649ea48d09d6af7803a0f9437a85503 fixes: bz#1545048 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* Don't use hardcoded /sbin, /usr/bin etc. paths. Fixes #1450546 (Niklas Hambüchen, 2018-05-03, 1 file, -13/+1)
  Instead, rely on programs to be in PATH, as gluster already does in many places across its code base. Change-Id: Id21152fe42f5b67205d8f1571b0656c4d5f74246 BUG: 1450546 Signed-off-by: Niklas Hambuechen <mail@nh2.me>
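  The pattern being adopted can be sketched with the exec*p family, which searches PATH instead of requiring a hardcoded absolute location; the program invoked below is just an example:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        /* Before: execl("/sbin/some-tool", ...) with a hardcoded location.
         * After: the 'p' variants look the program name up in PATH. */
        char *const argv[] = {"uname", "-r", NULL};
        execvp("uname", argv);
        perror("execvp");          /* reached only if the program was not found */
        return 1;
    }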
* glusterd/geo-rep: Fix UNUSED_VALUE coverity issue (Varsha Rao, 2018-05-03, 1 file, -4/+6)
  The return value of glusterd_get_local_brickpaths is unused, as it is reinitialized outside the if block, so add a goto statement. Also change the if condition to check the failure case: when the return value is -1 and path_list is NULL. Change-Id: I6b47d7751263f704bd69a6452a7e71bfcf226d49 updates: bz#789278 Signed-off-by: Varsha Rao <varao@redhat.com>
* glusterd: show brick online after port registration (Atin Mukherjee, 2018-04-05, 1 file, -2/+3)
  The gluster-block project needs a dependency check to see if all the bricks are online before bringing up the relevant gluster-block services. While the patch https://review.gluster.org/#/c/19785/ attempts to write that script, a brick should only be marked as online when the pmap_signin is completed. This is perfectly fine for non brick multiplexing, but with brick multiplexing this patch still doesn't eliminate the race completely, as the attach_req call is asynchronous and glusterd immediately marks the port as registered. Change-Id: I81db54b88f7315e1b24e0234beebe00de6429f9d Fixes: bz#1563273 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: mark port_registered to true for all running bricks with brick mux (Atin Mukherjee, 2018-04-05, 1 file, -0/+1)
  glusterd maintains a boolean flag 'port_registered' which is used to determine if a brick has completed its portmap sign-in process. This flag is (re)set in pmap_signin and pmap_signout events. In case of brick multiplexing this flag is the identifier to determine if the very first brick with which the process is spawned up has completed its sign-in process. However, in case of glusterd restart, when a brick is already identified as running, glusterd does a pmap_registry_bind to ensure its portmap table is updated, but this flag isn't, which is fine in the non brick multiplex case but causes an issue if the very first brick which came as part of the process is replaced: the subsequent brick attach will then fail. One way to validate this is to create and start a volume, remove the first brick and then add-brick a new one. The add-brick operation will take a very long time and post that the volume status will show all brick statuses apart from the new brick as down. Solution is to set brickinfo->port_registered to true for all the running bricks when brick multiplexing is enabled. Change-Id: Ib0662d99d0fa66b1538947fd96b43f1cbc04e4ff Fixes: bz#1560957 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* Revert "glusterd: handling brick termination in brick-mux"Sanju Rakonde2018-03-291-50/+6
| | | | | | | | | | | | | This reverts commit a60fc2ddc03134fb23c5ed5c0bcb195e1649416b. This commit was causing multiple tests to time out when brick multiplexing is enabled. With further debugging, it's found that even though the volume stop transaction is converted into mgmt_v3 to allow the remote nodes to follow the synctask framework to process the command, there are other callers of glusterd_brick_stop () which are not synctask based. Change-Id: I7aee687abc6bfeaa70c7447031f55ed4ccd64693 updates: bz#1545048
* glusterd: handling brick termination in brick-muxSanju Rakonde2018-03-281-6/+50
| | | | | | | | | | | | | | | Problem: There's a race between the last glusterfs_handle_terminate() response sent to glusterd and the kill that happens immediately if the terminated brick is the last brick. Solution: When it is a last brick for the brick process, instead of glusterfsd killing itself, glusterd will kill the process in case of brick multiplexing. And also changing gf_attach utility accordingly. Change-Id: I386c19ca592536daa71294a13d9fc89a26d7e8c0 fixes: bz#1545048 BUG: 1545048 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: volume get fixes for client-io-threads & quorum-type (Ravishankar N, 2018-03-07, 1 file, -1/+26)
  1. If a replica volume created on glusterfs-3.8 was upgraded to glusterfs-3.12, `gluster vol get volname client-io-threads` displayed 'on' even though it wasn't enabled and the xlator wasn't loaded on the client graph. This was due to removing certain checks in glusterd_get_default_val_for_volopt as a part of commit 47604fad4c2a3951077e41e0c007ceb979bb2c24. Fix it.
  2. Also, as a part of op-version bump-up, client-io-threads was being loaded on the clients during volfile regeneration. Prevent it.
  3. AFR assumes quorum-type to be auto in newly created replica 3 (odd replica in general) volumes, but `gluster vol get quorum-type` displays 'none'. Fix it.
  Change-Id: I19e586361ed1065c70fb378533d3b4dac1095df9 BUG: 1545056 Signed-off-by: Ravishankar N <ravishankar@redhat.com>