path: root/xlators/mgmt
Commit message (Author, Age, Files, Lines)
* snapshot:Fail snapshot creation if an empty description provided (Mohammed Rafi KC, 2018-08-19, 1 file, -0/+10)
  A snapshot description should be a valid string. Creating a snapshot with a
  null value will cause reading from the info file to fail with a null exception.
  Change-Id: I9f84154b8e3e7ffefa5438807b3bb9b4e0d964ca
  updates: bz#1618004
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* glusterd: coverity defects fix introduced by commit 1f3bfe7 (Atin Mukherjee, 2018-08-17, 2 files, -2/+2)
  Commit 1f3bfe7 stripped down the total size of certain path-related variables
  in glusterd_brickinfo_t, which introduced a couple of new coverity defects.
  Fix the following:
  CID 1394969: Destination buffer too small
  CID 1394968: Out-of-bounds access
  Change-Id: Ibc30eac4680cc6c83bd89d248f1435cb6a3d1b75
  updates: bz#789278
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: ignore importing volume which is undergoing a delete operation (Atin Mukherjee, 2018-08-16, 5 files, -6/+42)
  Problem: In a 3-node cluster, if N1 originates a delete operation and, while
  N1's commit phase completes, the glusterd service of N2 or N3 gets
  disconnected from N1 (before completing its own commit phase), N1 ends up
  importing the volume that is in-flight for a delete on the other nodes as a
  fresh volume, resulting in an incorrect configuration state.
  Fix: Mark a volume as stage-deleted once a volume delete operation passes its
  staging phase, and reset this flag during the unlock phase. If the same
  volume gets imported from other peers during this intermediate phase, it
  should not be considered recreated.
  An automated .t is quite tough to implement with the current infra.
  Test case:
  1. Keep creating and deleting volumes in a loop on a 3-node cluster.
  2. Simulate n/w failure between the peers (ifdown followed by ifup).
  3. Check that the output of 'gluster v list | wc -l' is the same across all
     3 nodes during 1 & 2.
  Change-Id: Ifdd5dc39699120258d7fdd42fe2deb9de25c6246
  Fixes: bz#1605077
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* build: use standard PKG_CHECK_MODULES for libxml2 availability (Niels de Vos, 2018-08-16, 1 file, -2/+2)
  In case the development parts of libxml2 are not installed, it was required
  to re-run ./autogen.sh to clean up the cached values for the check. This is
  not nice towards users. By using the standard PKG_CHECK_MODULES for
  libxml-2.0, the result of the check is not cached and will be probed again
  when running ./configure.
  Change-Id: I3c4586e5555a521be5d4fb61bdb873ae0317311a
  Fixes: bz#1599219
  Reported-by: Sachidananda Urs <surs@redhat.com>
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
* mountbroker : fix coverity issue in glusterd-mountbroker.c (Sunny Kumar, 2018-08-15, 1 file, -1/+4)
  Fixes CID: 1124789
  updates: bz#789278
  Change-Id: I61c70f05e6377d7ddc8961556274714dd356a117
  Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* glusterd: fix gcc7 warnings (Amar Tumballi, 2018-08-14, 2 files, -22/+32)
  [sh]$ gcc --version
  gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
  Warnings were of the type below:
  xlators/mgmt/glusterd/src/glusterd-store.c:3285:33: warning: ‘/options’ directive output may be truncated writing 8 bytes into a region of size between 1 and 4096 [-Wformat-truncation=]
    snprintf (path, len, "%s/options", conf->workdir);
  xlators/mgmt/glusterd/src/glusterd-store.c:1280:39: warning: ‘/snaps/’ directive output may be truncated writing 7 bytes into a region of size between 1 and 4096 [-Wformat-truncation=]
    snprintf (snap_fpath, len, "%s/snaps/%s/%s", priv->workdir,
  * Also changed some places where there were issues with key size.
  * Made sure all the 'char buf[SOMESIZE] = {0,};' declarations are changed to
    'char buf[SOMESIZE] = "";' in the files I changed.
  * Also edited the coding standard to reflect that.
  updates: bz#1193929
  Change-Id: I04c652624ac63199cea2077e46b3a5def37c3689
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
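As an illustration of the warning class above, a minimal sketch of the kind of change that keeps -Wformat-truncation= quiet (illustrative only, not the glusterd source; the function name and buffer size are made up):

    #include <stdio.h>

    /* Making the possible truncation explicit by checking the snprintf()
     * return value is one way to address what -Wformat-truncation= warns
     * about when the destination may be smaller than the formatted output. */
    static int build_options_path(char *path, size_t len, const char *workdir)
    {
        int ret = snprintf(path, len, "%s/options", workdir);

        if (ret < 0 || (size_t)ret >= len)
            return -1;          /* output would not fit in 'path' */
        return 0;
    }

    int main(void)
    {
        char path[64];

        if (build_options_path(path, sizeof(path), "/var/lib/glusterd") == 0)
            printf("%s\n", path);
        return 0;
    }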
* mgmt/glusterd: Fix possible use after free in glusterd_op_ac_commit_op() (Vijay Bellur, 2018-08-14, 1 file, -1/+3)
  Fixes CID 1391418
  Change-Id: I60ce6cd3b2528369f4dc1be81c0c15a1a806982a
  updates: bz#789278
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: Fix buffer length to prevent a memory overrun (Vijay Bellur, 2018-08-14, 1 file, -2/+2)
  Fixes CID 1394647, 1394658
  Change-Id: I30cf6e793919a08e0a3fe10622351b8316d7767c
  updates: bz#789278
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: remove the unused databuf in rebalance structure (Amar Tumballi, 2018-08-14, 1 file, -1/+0)
  While it is a one-line fix, it avoids a significant amount of unwanted memory
  being allocated for the defrag structure.
  Updates: bz#1193929
  Change-Id: Idda70d1d3dc0e7be56c35e872aa6edfaf752290d
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* mgmt/glusterd: Fix a memory leak in volgen (Vijay Bellur, 2018-08-14, 1 file, -0/+1)
  Fixes CID 1325557
  Change-Id: I5e33ae19ddf4c44a49a2b3b3dea0c739bc96d3a7
  updates: bz#789278
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* All: remove memset() before sprintf() (Yaniv Kaul, 2018-08-14, 12 files, -351/+61)
  It's not needed. There's a good chance the compiler is smart enough to remove
  it anyway, but it can't hurt - I hope.
  Compile-tested only!
  Change-Id: Id7c054e146ba630227affa591007803f3046416b
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
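For readers unfamiliar with the pattern being removed, a small self-contained illustration (not taken from the patch) of why the memset() is redundant before sprintf():

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char before[64];
        char after[64];
        int idx = 3;

        /* redundant: sprintf() writes its own NUL terminator */
        memset(before, 0, sizeof(before));
        sprintf(before, "brick %d", idx);

        /* equivalent result without the extra memset() */
        sprintf(after, "brick %d", idx);

        printf("%s | %s\n", before, after);
        return 0;
    }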
* glusterd: compare friend data within mutex (Atin Mukherjee, 2018-08-13, 3 files, -41/+48)
  During a friend handshake, if glusterd receives more than one friend update,
  it might very well happen that two threads end up working on two different
  volinfo references and glusterd ends up updating the store with an old
  volinfo reference. The same was observed while debugging a glusterd crash
  from the validating-server-quorum.t test file in the line-coverage
  regression.
  Solution is to run glusterd_compare_friend_data under a mutex.
  Test: As the crash was more visible in the line-coverage run (given lcov does
  some instrumentation and exposes the races), 6 manual lcov runs were
  triggered, starting from https://build.gluster.org/job/line-coverage/443 to
  https://build.gluster.org/job/line-coverage/449/, and no crash was observed
  from validating-server-quorum.t.
  Change-Id: I86fce473a76fd24742d51bf17a685d28b90a8941
  Fixes: bz#1603063
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
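A hedged sketch of the serialization idea described above; the names and the integer "store version" stand in for the real dict/volinfo handling and are not the glusterd symbols:

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t import_lock = PTHREAD_MUTEX_INITIALIZER;
    static int store_version;      /* stands in for the persisted volinfo */

    /* stand-in for glusterd_compare_friend_data(): compare the peer's copy
     * with ours and import it only if it is newer, all under one lock so a
     * second handshake thread cannot interleave and write back stale data */
    static void compare_and_import(int peer_version)
    {
        pthread_mutex_lock(&import_lock);
        if (peer_version > store_version)
            store_version = peer_version;
        pthread_mutex_unlock(&import_lock);
    }

    int main(void)
    {
        compare_and_import(2);
        compare_and_import(1);      /* stale update, ignored */
        printf("store version: %d\n", store_version);
        return 0;
    }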
* glusterd: Compare volume_id before start/attach a brick (Mohit Agrawal, 2018-08-10, 1 file, -20/+27)
  Problem: After a node reboot, a brick is not coming up because the fsid
  comparison fails before starting the brick.
  Solution: Instead of comparing the fsid, compare the volume_id, because the
  fsid changes after a node reboot while the volume_id persists as an xattr on
  the brick root path from the time the volume is created.
  Change-Id: Ic289aab1b4ebfd83bbcae8438fee26ae61a0fff4
  fixes: bz#1612418
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
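A hedged sketch of the volume_id comparison described above; the xattr name is the commonly documented one and, like the helper name, is an assumption here rather than the actual glusterd code:

    #include <string.h>
    #include <sys/types.h>
    #include <sys/xattr.h>

    /* the brick root carries the volume uuid as an xattr that survives
     * reboots, unlike the filesystem fsid */
    int brick_matches_volume(const char *brick_root,
                             const unsigned char expected[16])
    {
        unsigned char stored[16];
        ssize_t n = getxattr(brick_root, "trusted.glusterfs.volume-id",
                             stored, sizeof(stored));

        if (n != (ssize_t)sizeof(stored))
            return 0;           /* xattr missing or of unexpected size */
        return memcmp(stored, expected, sizeof(stored)) == 0;
    }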
* glusterd: more stricter checks of if brick is running in multiplex modeAtin Mukherjee (Atin Mukherjee, 2018-08-09, 1 file, -32/+39)
  While the gf_attach () utility (which the kill_brick () function in
  tests/volume.rc uses) can help in detaching a brick instance from the brick
  process, it has the following caveats:
  1. It doesn't ensure the respective brick is marked as stopped, which
     glusterd does from glusterd_brick_stop.
  2. Sometimes, if kill_brick () is executed just after a brick stack comes up,
     mgmt_rpc_notify () can take some time before marking priv->connected to 1,
     and if kill_brick () is executed before that, the brick will fail to
     initiate the pmap_signout which in turn cleans up the pidfile.
  To avoid such possibilities, a stricter check on whether a brick is running
  has been brought in for brick multiplexing: it not only checks for the pid's
  existence but also checks that the respective process has the brick instance
  associated with it before checking the brick's status.
  Change-Id: I98b92df949076663b9686add7aab4ec2f24ad5ab
  Fixes: bz#1595320
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Bricks of a normal volumes should not attach to gluster_shared_storage bricks (Sanju Rakonde, 2018-08-03, 1 file, -0/+14)
  Problem: In a brick multiplexing environment, bricks of a normal volume
  created by the user are getting attached to the bricks of the volume
  "gluster_shared_storage", which is created by enabling the
  enable-shared-storage option. Mounting gluster_shared_storage has strict
  authentication checks. When we attach bricks of a normal volume to bricks of
  gluster_shared_storage, mounting the normal volume created by the user will
  fail due to those strict authentication checks.
  Solution: We should not attach bricks of a normal volume to the brick process
  of the gluster_shared_storage volume, and vice versa.
  fixes: bz#1610726
  Change-Id: If1b5a2a02675789a2915ba480fb48c145449163d
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* stack: Reduce stack usage for local variables to store tmpfile names (ShyamsundarR, 2018-07-27, 3 files, -20/+61)
  This patch moves stack-based PATH_MAX allocations for tmpfile names to
  heap-allocated names instead, reducing the impact on stack space used and
  accruing the benefits thereof.
  Change-Id: I646d9cb091018de6768b3523902788fa2ba14d96
  Updates: bz#1193929
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
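To make the trade-off concrete, a small illustration (not from the patch) of a PATH_MAX name on the stack versus the same capacity on the heap:

    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* stack: the whole PATH_MAX array lives in this frame */
        char on_stack[PATH_MAX];
        snprintf(on_stack, sizeof(on_stack), "/tmp/tmpfile.XXXXXX");

        /* heap: same capacity, released as soon as we are done with it */
        char *on_heap = malloc(PATH_MAX);
        if (!on_heap)
            return 1;
        snprintf(on_heap, PATH_MAX, "/tmp/tmpfile.XXXXXX");

        printf("%s %s\n", on_stack, on_heap);
        free(on_heap);
        return 0;
    }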
* coverity: Ignore most of SECURE_TEMP issues (ShyamsundarR, 2018-07-27, 3 files, -0/+5)
  mkstemp, as per the Linux man page, uses 0600 as the permission bits when
  creating the file. This is hence safe, and a Coverity warning that should be
  ignored. Further, we are mostly a multi-threaded program in all our daemons
  and cannot set and unset umask at will in a multi-threaded program to address
  the coverity issue.
  This change attempts to nudge coverity to ignore this warning, using the
  pattern:
    /* coverity[EVENT_TAG_NAME] ... */
    <line of code that has the issue>
  This commit is an experiment: if the next coverity report after the merge
  ignores these errors, the above pattern (as found using an internet search)
  works and can be applied to certain other warnings as well.
  Change-Id: I73a184ce1a54dd9e66542952b1190a74438c826a
  Updates: bz#789278
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
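A hedged sketch of how such an annotation might sit next to a flagged mkstemp() call; the exact tag text inside the coverity[] marker is an assumption based on the pattern quoted above, not a verified Coverity directive:

    #include <stdlib.h>
    #include <unistd.h>

    int make_tmpfile(void)
    {
        char name[] = "/tmp/glusterd-XXXXXX";

        /* coverity[SECURE_TEMP] mkstemp already creates the file with 0600 */
        int fd = mkstemp(name);
        if (fd < 0)
            return -1;
        unlink(name);           /* keep the fd, drop the name */
        return fd;
    }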
* glusterd: Add multiple checks before attach/start a brick (Mohit Agrawal, 2018-07-27, 4 files, -49/+226)
  Problem: In a brick-mux scenario, sometimes glusterd is not able to
  start/attach a brick, yet 'gluster v status' shows the brick as already
  running.
  Solution:
  1) To make sure a brick is running, check for the brick_path in
     /proc/<pid>/fd; if the brick is consumed by the brick process, it means
     the brick stack has come up, otherwise not.
  2) Before starting/attaching a brick, check whether the brick is mounted.
  3) At the time of printing volume status, check that the brick is consumed
     by some brick process.
  Test: The following procedure was followed to test the same:
  1) Set up a brick-mux environment on a VM.
  2) Put a breakpoint in gdb in the function posix_health_check_thread_proc at
     the time of notifying the GF_EVENT_CHILD_DOWN event.
  3) Forcefully unmount any one brick path.
  4) Check 'gluster v status'; it will show N/A for the brick.
  5) Try to start the volume with the force option; glusterd throws the
     message "No device available for mount brick".
  6) Mount the brick root path.
  7) Try to start the volume with the force option.
  8) The down brick is started successfully.
  Change-Id: I91898dad21d082ebddd12aa0d1f7f0ed012bdf69
  fixes: bz#1595320
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
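A hedged sketch of check (1) above, walking /proc/<pid>/fd to see whether the process has anything open under the brick root; helper and variable names are illustrative, not the glusterd functions:

    #include <dirent.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    int pid_has_brick_open(pid_t pid, const char *brick_path)
    {
        char fd_dir[64], link_path[512], target[4096];
        struct dirent *entry;
        int found = 0;

        snprintf(fd_dir, sizeof(fd_dir), "/proc/%d/fd", (int)pid);
        DIR *dir = opendir(fd_dir);
        if (!dir)
            return 0;

        while ((entry = readdir(dir)) != NULL) {
            snprintf(link_path, sizeof(link_path), "%s/%s", fd_dir,
                     entry->d_name);
            ssize_t len = readlink(link_path, target, sizeof(target) - 1);
            if (len <= 0)
                continue;
            target[len] = '\0';
            if (strncmp(target, brick_path, strlen(brick_path)) == 0) {
                found = 1;      /* an open fd resolves under the brick root */
                break;
            }
        }
        closedir(dir);
        return found;
    }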
* glusterd: Coverity issues with type FORWARD_NULL (Sanju Rakonde, 2018-07-24, 4 files, -11/+11)
  This patch fixes coverity issues 102, 103, 112 and 119 from [1].
  [1] https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-07-23-5fa004f3/html/
  Updates: bz#789278
  Change-Id: I99762eb0bcbd974a5250434777db63520f2ce2e6
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: Deadcode Coverity issue (Oshank Kumar, 2018-07-24, 1 file, -2/+0)
  This patch fixes coverity issue 74 from [1].
  We update the ret value at line 5011 and immediately check whether ret has a
  non-zero value at line 5013. Only if ret is 0 do we continue to execute and
  reach line 5036. By the time we reach line 5036, the ret value is always 0,
  so this block of code is redundant here and is removed.
  [1] https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-07-23-5fa004f3/html/
  Updates: bz#789278
  Change-Id: Ia6e8ba2936e350f0d29a9151ab786622f5e750db
  Signed-off-by: Oshank Kumar <okumar@redhat.com>
* features/shard: Make lru limit of inode list configurable (Krutika Dhananjay, 2018-07-23, 1 file, -0/+6)
  Currently this lru limit is hard-coded to 16384. This patch makes it
  configurable, to make it easier to hit the lru limit and enable testing of
  the different cases that arise when the limit is reached.
  The option is features.shard-lru-limit. It is by design allowed to be
  configured only in init() but not in reconfigure(). This is to avoid all the
  complexity associated with eviction of least recently used shards when the
  list is shrunk.
  Change-Id: Ifdcc2099f634314fafe8444e2d676e192e89e295
  updates: bz#1605056
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* geo-rep : fix possible crash (Sunny Kumar, 2018-07-23, 1 file, -2/+5)
  Problem: In 'glusterd_verify_slave', while tokenizing the error message we
  call 'strtok_r' and store the return value in 'tmp', which can be NULL. We
  pass this 'tmp' as the first argument to 'strcmp', which will lead to a
  segmentation fault.
  Solution: Before calling 'strcmp' we should NULL-check 'tmp'.
  Change-Id: Ifd3864b904afe6cd09d9e5a4b55c6d0578e22b9d
  fixes: bz#1602121
  Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
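A hedged sketch of the NULL check described above; the buffer contents and the comparison string are made up for illustration:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char errmsg[] = "";          /* an empty message yields no tokens */
        char *save = NULL;
        char *tmp = strtok_r(errmsg, " ", &save);

        /* without the 'tmp &&' guard, strcmp() would dereference NULL here */
        if (tmp && strcmp(tmp, "ERROR") == 0)
            printf("slave reported an error\n");
        else
            printf("no token to compare\n");
        return 0;
    }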
* All: run codespell on the code and fix issues. (Yaniv Kaul, 2018-07-22, 15 files, -39/+39)
  Please review; it's not always just the comments that were fixed. I've had
  to revert, of course, all calls to creat() that were changed to create() ...
  Only compile-tested!
  Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* glusterd-quota.c: fix coverity warning (BAD_COMPARE) (Yaniv Kaul, 2018-07-20, 1 file, -1/+1)
  See https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-07-13-1718f9c6/html/1/6glusterd-quota.c.html#error
  Only compile-tested!
  Change-Id: Ief42f9fcdb02ad001bd39c4a6e27e7fa86fd2496
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* glusterd: memory leak in get-state (Sanju Rakonde, 2018-07-18, 1 file, -34/+19)
  Problem: The 'gluster get-state' command is leaking memory when a
  geo-replication session is configured.
  Cause: In glusterd_print_gsync_status(), we try to get references to the keys
  of gsync_dict. The references to the keys of gsync_dict are stored in
  status_vals[i], and each status_vals[i] is allocated a block of memory of the
  size of gf_gsync_status_t.
  Solution: There is no need to use an array of pointers (status_vals); a
  single pointer to hold the reference to a key of gsync_dict is sufficient.
  The following steps were followed for testing:
  1. Configure a geo-rep session.
  2. Run the 'gluster get-state' command 1000 times.
  Without this patch, glusterd's memory was increasing significantly (around
  22000 KB per 1000 runs); with this patch it is reduced (around 1500 KB per
  1000 runs).
  fixes: bz#1601423
  Change-Id: I361f5525d71f821bb345419ccfdc20ca288ca292
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd-store: fix coverity warning (Yaniv Kaul, 2018-07-17, 2 files, -70/+63)
  The same variable 'len' was used both in the macros and the functions.
  (Introduced as part of commit 6dc5dfef819cad69d6d4b4c1c305efa74236ad84?)
  Change-Id: If434999d6470067f8a1e501c8e132561e8cd81ef
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* glusterd: To find a compatible brick ignore diagnostics.brick-log-level option (Mohit Agrawal, 2018-07-13, 1 file, -0/+4)
  Problem: glusterd starts a volume as a separate process instead of attaching
  it to the already running process if the volume option has a different
  brick-log-level. There is no functional impact on a brick if the option has
  a different brick-log-level, so glusterd should attach the brick to the
  already running process.
  Solution: Ignore the brick-log-level option in unsafe_option.
  BUG: 1599628
  Change-Id: I72638ff2026fcd9332bc38e1144b1ef4a708820b
  fixes: bz#1599628
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* Quota: Fix crawling of files (Sanoj Unnikrishnan, 2018-07-13, 1 file, -1/+3)
  Problem: Running "find ." does not crawl files. It goes over the directories
  and lists all dentries with the getdents system call; hence the files are not
  looked up.
  Solution: Explicitly trigger stat on files with 'find . -exec stat {} \;'.
  Since the crawl can take slightly longer, the timeout in the test case is
  updated.
  Change-Id: If3c1fba2ed8e300c9cc08c1b5c1ba93cb8e4d6b6
  fixes: bz#1533000
  Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com>
* snapshot : remove stale entry (Sunny Kumar, 2018-07-12, 1 file, -0/+38)
  During snap delete, after removing the brick path we should remove the snap
  path too, i.e. /var/run/gluster/snaps/<snap-name>. During snap deactivate we
  should also remove the snap path.
  Change-Id: Ib80b5d8844d6479d31beafa732e5671b0322248b
  fixes: bz#1597662
  Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* glusterd: _is_prefix should handle 0-length paths (Kaushal M, 2018-07-11, 1 file, -0/+9)
  If one of the paths given to _is_prefix is 0-length, then it is not a prefix
  of the other. Hence, _is_prefix should return false.
  Change-Id: I54aa577a64a58940ec91872d0d74dc19cff9106d
  fixes: bz#1599783
  Signed-off-by: Kaushal M <kaushal@redhat.com>
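A hedged sketch of the behaviour being fixed; this is a simplified stand-in, not the glusterd _is_prefix implementation:

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    static bool is_prefix(const char *s, const char *path)
    {
        size_t ls = strlen(s);
        size_t lp = strlen(path);

        if (ls == 0 || lp == 0)     /* a 0-length path is never a prefix */
            return false;
        if (ls > lp)
            return false;
        return strncmp(s, path, ls) == 0;
    }

    int main(void)
    {
        printf("%d %d\n", is_prefix("", "/bricks/b1"),
               is_prefix("/bricks", "/bricks/b1"));
        return 0;
    }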
* glusterd: log improvements on brick creation validation (Atin Mukherjee, 2018-07-11, 1 file, -2/+15)
  Added a few log entries in glusterd_is_brickpath_available ().
  Change-Id: I8b758578f9db90d2974f7c79126c50ad3a001d71
  Updates: bz#1193929
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* Fix compile warnings (Xavi Hernandez, 2018-07-10, 26 files, -509/+1187)
  This patch fixes compile warnings that appear with newer compilers. The
  solution applied only removes the warnings, but it doesn't always solve the
  underlying problem in the best way. It assumes that the problem will never
  happen, as the previous code assumed.
  Change-Id: I6e8470d6c2e2dbd3bd7d324b5fd2f92ffdc3d6ec
  updates: bz#1193929
  Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* glusterd: Fix glusterd crash (Sanju Rakonde, 2018-07-06, 1 file, -9/+0)
  Problem: The 'gluster get-state' command is crashing the glusterd process
  when a geo-replication session is configured.
  Cause: The crash is happening due to a double free of memory. In
  glusterd_print_gsync_status_by_vol we call dict_unref(), which frees all the
  keys and values in the dictionary. Before calling dict_unref(),
  glusterd_print_gsync_status_by_vol calls glusterd_print_gsync_status().
  glusterd_print_gsync_status frees up values in the dictionary, and when
  dict_unref() is called afterwards, it tries to free values that have already
  been freed.
  Solution: Remove the code which frees the memory in the
  glusterd_print_gsync_status function.
  Fixes: bz#1598345
  Change-Id: Id3d8aae109f377b462bbbdb96a8e3c5f6b0be752
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
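A hedged, generic illustration of the double-free pattern described above, using plain malloc/free instead of the dict API; the point is that exactly one owner releases the value:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct entry { char *value; };

    static void print_status(struct entry *e)
    {
        printf("status: %s\n", e->value);
        /* the buggy version also freed e->value here; leaving the release
         * to entry_destroy() below avoids the double free */
    }

    static void entry_destroy(struct entry *e)
    {
        free(e->value);             /* the single place the value is freed */
        e->value = NULL;
    }

    int main(void)
    {
        struct entry e = { .value = strdup("Active") };
        print_status(&e);
        entry_destroy(&e);
        return 0;
    }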
* glusterd: show brick online after port registration even in brick-mux (Pranith Kumar K, 2018-07-05, 3 files, -10/+31)
  Problem: With brick-mux, glusterd marks bricks as online even before brick
  attach is complete on them. This can lead to a race where scripts that check
  whether the bricks are online assume that a brick is online before it is
  completely online.
  Fix: Wait for the callback from the brick before marking the port as
  registered, so that volume status shows the correct status of the brick.
  fixes bz#1597568
  Change-Id: Icd3dc62506af0cf75195e96746695db823312051
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* glusterd: Introduce daemon-log-level cluster wide option (Atin Mukherjee, 2018-07-03, 6 files, -1/+69)
  This option, applicable to the node-level daemons, can be very helpful in
  controlling the log level of these services. Please note that any daemon
  started prior to setting a specific value of this option (if not INFO) will
  need to go through a restart for this change to take effect.
  Change-Id: I7f6d2620bab2b094c737f5cc816bc093e9c9c4c9
  fixes: bz#1597473
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: start the services after all the bricks are up (Atin Mukherjee, 2018-07-03, 1 file, -9/+5)
  glusterd_svcs_manager () should be called after starting all the volumes in
  one go.
  Change-Id: I838cc50c29f3930a483aa9671958cdc186904030
  Fixes: bz#1597247
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterfs: access trusted peer group via remote-host command (Mohit Agrawal, 2018-06-20, 1 file, -5/+0)
  Problem: In an SSL environment, a user is able to access a volume via the
  remote-host command without adding the node to a trusted pool.
  Solution: Change the list of rpc programs in glusterd.c at initialization
  time while SSL is enabled.
  BUG: 1593232
  Change-Id: I987e433b639e68ad17b77b6452df1e22dbe0f199
  fixes: bz#1593232
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* features/shard: Perform shards deletion in the background (Krutika Dhananjay, 2018-06-20, 1 file, -0/+5)
  A synctask is created that scans the indices from .shard/.remove_me to delete
  the shards associated with the gfid corresponding to the index bname; the
  rate of deletion is controlled by the option features.shard-deletion-rate,
  whose default value is 100.
  The task is launched on two accounts:
  1. when shard receives its first-ever lookup on the volume
  2. when a rename or unlink deletes an inode
  Change-Id: Ia83117230c9dd7d0d9cae05235644f8475e97bc3
  updates: bz#1568521
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* performance/quick-read: provide an invalidation based on ctime (Raghavendra G, 2018-06-18, 1 file, -0/+6)
  Quick-read by default uses mtime to identify changes to file data. However,
  there are applications like rsync which explicitly set mtime, making it
  unreliable for the purpose of identifying changes in file content. Since
  ctime also changes when the content of a file changes, and it cannot be set
  explicitly, it becomes suitable for identifying staleness of cached data.
  This option makes quick-read prefer ctime over mtime to validate its cache.
  However, using ctime can result in false positives, as ctime changes with
  mere attribute changes (like permission) without changes to the file data.
  So, use this option only when mtime is not reliable.
  Credits to Kotresh Hiremath Ravishankar <khiremat@redhat.com> for the
  suggestion to use ctime instead of mtime.
  Change-Id: Ib3ae39a3252b2876c8ffe81f471d02a87190e9b9
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  Updates: bz#1591621
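A hedged sketch of the validation idea (not the quick-read code): a cached copy is treated as stale once the chosen timestamp on disk differs from the one recorded at caching time; the struct and function names are made up:

    #include <stdbool.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <time.h>

    struct cached_file {
        struct timespec cached_at;  /* ctime (or mtime) recorded when cached */
    };

    static bool cache_is_stale(const struct cached_file *c, const char *path,
                               bool prefer_ctime)
    {
        struct stat st;

        if (stat(path, &st) != 0)
            return true;            /* cannot validate, so refetch */

        struct timespec now = prefer_ctime ? st.st_ctim : st.st_mtim;
        return now.tv_sec != c->cached_at.tv_sec ||
               now.tv_nsec != c->cached_at.tv_nsec;
    }

    int main(void)
    {
        struct cached_file c = { .cached_at = { 0, 0 } };
        printf("stale: %d\n", cache_is_stale(&c, "/etc/hostname", true));
        return 0;
    }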
* glusterd: removing the unnecessary glusterd message (Sanju Rakonde, 2018-06-14, 1 file, -2/+1)
  Fixes: bz#1589253
  Change-Id: I5510250a3d094e19e471b3ee47bf13ea9ee8aff5
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: Fix for shd not coming up (Sanju Rakonde, 2018-06-13, 2 files, -1/+6)
  Problem: After creating and starting n (where n is large) distribute-replicated
  volumes using a script, if we create and start the (n+1)th
  distribute-replicate volume manually, the self-heal daemon is down.
  Solution: In glusterd_proc_stop, if the process is still running after being
  given SIGTERM, we send SIGKILL. As SIGKILL does not perform any cleanup, we
  need to remove the pidfile ourselves.
  Fixes: bz#1589253
  Change-Id: I7c114334eec74c8d0f21b3e45cf7db6b8ef28af1
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
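A hedged sketch of the stop sequence described above; the function name and the exact wait are illustrative, not the glusterd_proc_stop code:

    #include <signal.h>
    #include <sys/types.h>
    #include <unistd.h>

    int proc_stop(pid_t pid, const char *pidfile)
    {
        if (kill(pid, SIGTERM) == 0)
            sleep(1);               /* give the daemon a moment to exit */

        if (kill(pid, 0) == 0) {    /* still alive? escalate */
            if (kill(pid, SIGKILL) != 0)
                return -1;
            unlink(pidfile);        /* SIGKILL skips the daemon's own cleanup */
        }
        return 0;
    }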
* glusterd: Coverity Fixes (Sanju Rakonde, 2018-06-11, 2 files, -4/+7)
  Fixes: #789278
  Change-Id: I633704fab49992cac6ee9e05bc368f7da360d09e
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
  Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
* glusterd: gluster v status is showing wrong status for glustershd (Sanju Rakonde, 2018-06-06, 1 file, -3/+10)
  When we restart the bricks, connect and disconnect events happen for
  glustershd. glusterd uses two threads to handle disconnect and connect events
  from glustershd. When we restart the bricks we get both disconnect and
  connect events, so both threads compete for the big lock.
  We want the disconnect event to finish before the connect event. But if the
  connect thread gets the big lock first, it sets svc->online to true, and the
  disconnect thread then sets svc->online to false. So glustershd is shown as
  disconnected from glusterd and the wrong status is displayed.
  After killing shd, glusterd sleeps for 1 second. To avoid the problem, if
  glusterd releases the lock before the sleep and acquires it after the sleep,
  the disconnect thread gets a chance to handle glusterd_svc_common_rpc_notify
  before the other thread completes the connect event.
  Change-Id: Ie82e823fdfc936feb7c0ae10599297b050ee9986
  fixes: bz#1585391
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
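A hedged sketch of the ordering trick described above; the lock name and the function are illustrative, not the glusterd symbols:

    #include <pthread.h>
    #include <unistd.h>

    static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

    void restart_shd_locked(void)
    {
        /* caller holds big_lock here; shd has just been killed */
        pthread_mutex_unlock(&big_lock);
        sleep(1);                   /* the disconnect notification can run now */
        pthread_mutex_lock(&big_lock);
        /* continue the restart while holding the lock again */
    }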
* mgmt/glusterd: Cleanup dead code (Vijay Bellur, 2018-06-06, 1 file, -9/+0)
  updates: bz#789278
  Change-Id: Id67ab681317eb0a69874400a40e3b249dfc7a7db
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* rpc/clnt: Don't let consumers manage "connected" state (Raghavendra G, 2018-06-04, 1 file, -4/+0)
  The state management of "connected" in rpc is ad hoc as far as the
  responsibility goes. Note that there is nothing wrong with the functionality
  itself. The rpc layer manages this state in the disconnect codepath and has
  exposed an api to manage it from consumers. Note that the rpc layer never
  sets "connected" to true by itself, which forces the consumers to use this
  api to get a working rpc connection. The situation is best captured by a
  comment in code from Jeff Darcy in glusterfsd/src/gf-attach.c:
  /*
   * In a sane world, the generic RPC layer would be capable of tracking
   * connection status by itself, with no help from us. It might invoke our
   * callback if we had registered one, but only to provide information. Sadly,
   * we don't live in that world. Instead, the callback *must* exist and *must*
   * call rpc_clnt_{set,unset}_connected, because that's the only way those
   * fields get set (with RPC both above and below us on the stack). If we don't
   * do that, then rpc_clnt_submit doesn't think we're connected even when we
   * are. It calls the socket code to reconnect, but the socket code tracks this
   * stuff in a sane way so it knows we're connected and returns EINPROGRESS.
   * Then we're stuck, connected but unable to use the connection. To make it
   * work, we define and register this trivial callback.
   */
  Also, consumers of rpc know about the state of the connection only through
  the notifications sent by rpc-clnt. So consumers don't have any extra
  information to manage the state, and hence letting them manage it is
  counter-intuitive. This patch cleans that up and instead moves the
  responsibility of state management into the rpc layer itself.
  Change-Id: I31e641a60795fc480ca753917f4b2579f1e05094
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  Fixes: bz#1585585
* changed 'sometime' messsages to 'some time' (Levi Baber, 2018-06-01, 4 files, -10/+10)
  Change-Id: I0936229fc84c011db7791218bb566c971fdea174
  fixes: bz#1584864
  Signed-off-by: Levi Baber <baber@iastate.edu>
* glusterd: address test failures with brick mux enabled (Atin Mukherjee, 2018-05-31, 2 files, -0/+19)
  This patch addresses the following:
  1. On volume stop, for the last brick, pmap_registry_remove () is invoked by
     glusterd.
  2. If a brick process is sigkilled, remove all the associated brick instances
     from the portmap.
  3. Bump up PROCESS_UP_TIMEOUT to 45.
  4. gf_attach to kill a brick takes more time in mux (which is an issue that
     needs a fix), but in the interim, give br-state-check.t more time to
     complete (there are 2 kill_bricks, each taking 120 seconds, and the test
     usually passes in 30-odd seconds, hence bumping this up to 350 seconds).
  5. The test bug-1559004-EMLINK-handling.t is taking ~950 seconds at times on
     master without mux; in mux cases, when it fails, it is almost at the last
     iteration, hence bumping the timeout for this test case to reduce
     regression error rates.
  Updates: bz#1577672
  Change-Id: I1922675e112baca4c125c4c094eaa42a11e34e67
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* cloudsync: Adding s3 plugin for cloudsync (Susant Palai, 2018-05-30, 1 file, -1/+21)
  This is a plugin which provides an interface to retrieve files from amazon-s3
  which are archived into s3. Users need to give the above information for
  cloudsync to retrieve the file from s3.
  TODO:
  1. A separate commit into developer-guide will detail the usage of this
     plugin in more detail.
  2. Need to create the target file in the aws-bucket with "gfid" names, which
     helps avoid name collisions.
  Change-Id: I2e4a586f4e3f86164de9178e37673a07f317e7d9
  Updates: #387
  Signed-off-by: Susant Palai <spalai@redhat.com>
* glusterd: glusterd is releasing the locks before timeout (Sanju Rakonde, 2018-05-28, 6 files, -0/+58)
  Problem: We introduced a lock timer in mgmt v3, which releases the lock 3
  minutes after command execution. Some commands related to heal/profile take
  more time to execute, and for these commands the timeout is set to 10
  minutes. As the lock timer is set to 3 minutes, glusterd is releasing the
  lock after 3 minutes, which means locks are released before the command has
  completed its execution.
  Solution: Pass a timeout parameter from cli to glusterd when there is a
  change in the default timeout value (i.e., the default timeout value can be
  changed through the command line, or, for the commands related to
  profile/heal, we change the default timeout value to 10 minutes). glusterd
  will set the lock timer timeout according to the timeout value passed.
  Change-Id: I7b7a9a4f95ed44aca39ef9d9907f546bca99c69d
  fixes: bz#1577731
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: memory leak in geo-rep status (Sanju Rakonde, 2018-05-28, 1 file, -2/+6)
  Fixes: bz#1580352
  Change-Id: I9648e73090f5a2edbac663a6fb49acdb702cdc49
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>