path: root/xlators/mgmt/glusterd/src/glusterd-brick-ops.c
Commit message | Author | Age | Files | Lines
* glusterd: additional log information (nik-redhat, 2020-06-29, 1 file, -6/+56)
  Issue: Some of the functions didn't have sufficient logging of information in case of failure.
  Fix: Added log messages to a few functions' failure paths, indicating the cause of the failure.
  Change-Id: I301cf3a1c8d2c94505c6ae0d83072b0241c36d84
  fixes: #874
  Signed-off-by: nik-redhat <nladha@redhat.com>
* glusterd: add-brick command failure (Sanju Rakonde, 2020-06-21, 1 file, -12/+27)
  Problem: The add-brick operation fails when the replica or disperse count is not mentioned in the add-brick command.
  Reason: With commit a113d93 we check the brick order while doing the add-brick operation for replica and disperse volumes. If the replica count or disperse count is not mentioned in the command, the dict get fails, resulting in add-brick operation failure.
  fixes: #1306
  Change-Id: Ie957540e303bfb5f2d69015661a60d7e72557353
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
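  A minimal standalone sketch of the corrected behaviour, with a toy lookup standing in for glusterd's dict API (the names below are illustrative, not the real functions): an absent count key now means "count unchanged" and the brick-order check is skipped, instead of failing the whole operation.

      #include <stdio.h>
      #include <string.h>

      /* Toy stand-in for a dict lookup: 0 on success, -1 if the key is absent. */
      static int toy_dict_get_int32(const char *key, int *val)
      {
          if (strcmp(key, "replica-count") == 0) { /* only this key was passed */
              *val = 3;
              return 0;
          }
          return -1;
      }

      int main(void)
      {
          int replica = 0, disperse = 0;

          /* Before the fix a failed lookup aborted add-brick outright. */
          if (toy_dict_get_int32("replica-count", &replica) == 0)
              printf("replica count %d given: run brick-order check\n", replica);

          if (toy_dict_get_int32("disperse-count", &disperse) != 0)
              printf("disperse count absent: not an error, skip the check\n");

          return 0;
      }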
* glusterd: migrating remove-brick commands to mgmt v3 framework (Sanju Rakonde, 2020-06-17, 1 file, -2/+92)
  Currently remove-brick commands follow the sync-op framework. For code extensibility (like adding more phases to the transaction), we are migrating the command to the mgmt v3 framework.
  fixes: #1164
  Change-Id: I5d363223d6f9dc7a70b61adb9d3a5250e84a71b4
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: check for same node while adding bricks in disperse volume (Sheetal Pamecha, 2020-06-06, 1 file, -1/+19)
  The optimal way to configure disperse and replicate volumes is to have all bricks on different nodes. The create operation fails when this is not the case, saying the layout is not optimal, and the user must use force to override this behavior. This patch implements the same check during the add-brick operation, to avoid the situation where all the added bricks end up on the same host. The operation will error out accordingly, and can be overridden with force, the same as create.
  fixes: #1047
  Change-Id: I3ee9c97c1a14b73f4532893bc00187ef9355238b
  Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
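  A small self-contained model of such a brick-order check (bricks as "host:/path" strings; the helper and layout are illustrative, not glusterd's structures): every consecutive group of set_size bricks must come from distinct hosts, and a violation is reported so the CLI can demand force.

      #include <stdio.h>
      #include <string.h>

      /* Returns -1 if any two bricks of one replica/disperse set share a host. */
      static int check_brick_order(const char *bricks[], int n, int set_size)
      {
          char host_a[64], host_b[64];
          for (int i = 0; i < n; i += set_size) {
              for (int j = i; j < i + set_size; j++) {
                  for (int k = j + 1; k < i + set_size; k++) {
                      sscanf(bricks[j], "%63[^:]", host_a); /* host part only */
                      sscanf(bricks[k], "%63[^:]", host_b);
                      if (strcmp(host_a, host_b) == 0)
                          return -1;
                  }
              }
          }
          return 0;
      }

      int main(void)
      {
          const char *bricks[] = { "n1:/b1", "n1:/b2", "n2:/b3" };
          if (check_brick_order(bricks, 3, 3) != 0)
              puts("not optimal: use 'force' to override, as with volume create");
          return 0;
      }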
* mgmt/glusterd: Stop old shd before increasing replica count (Pranith Kumar K, 2020-05-16, 1 file, -0/+18)
  Problem: In an add-brick that increases the replica count, SHD was restarted after pending xattrs were set on the new bricks and the bricks were added. But before restarting SHD there is a window in which the old SHD could do a scan on the root directory, see that no heal is needed, and delete the index for the root dir, leading to no heals until a lookup is executed on the mount.
  Fix: Stop shd, perform the pending-xattr setting/adding of new bricks, and then restart shd.
  Fixes: #1240
  Change-Id: I94fd7c6c909211b597185dfe097a559db6c0d00f
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
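  The fixed ordering, sketched as a runnable skeleton with placeholder functions (the real logic lives in glusterd's add-brick commit path):

      #include <stdio.h>

      static void stop_shd(void)           { puts("1. stop self-heal daemon"); }
      static void set_pending_xattrs(void) { puts("2. mark pending xattrs on new bricks"); }
      static void add_bricks(void)         { puts("3. add the new bricks"); }
      static void start_shd(void)          { puts("4. restart self-heal daemon"); }

      int main(void)
      {
          /* The old order let a still-running shd scan root, find nothing to
           * heal and drop the root-dir index before the new bricks were
           * marked; stopping shd first closes that window. */
          stop_shd();
          set_pending_xattrs();
          add_bricks();
          start_shd();
          return 0;
      }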
* glusterd/snapshot: Add a warning message when a volume config changes (Mohammed Rafi KC, 2020-03-12, 1 file, -0/+27)
  While changing a volume configuration, there is a chance that the brick layout on the disk might change. If a snapshot is present on such a volume, that will affect its working. So this patch adds a warning message if a snapshot is present while a volume config change happens.
  Change-Id: I7256863fef734841fce0bc9ad94d5d201b1813d5
  Fixes: bz#1812144
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* glusterd: Add warning and abort in case of failures in migration during remove-brick commit (Vishal Pandey, 2019-08-25, 1 file, -0/+11)
  Problem: Currently remove-brick commit goes through even though there were files that failed to migrate or were skipped, and no warning is raised to the user.
  Solution: Add a check in the remove-brick staging phase to verify whether the status of the rebalance process is complete but there were failures or some skipped files during migration. In this case the user is given a warning and the remove-brick commit fails; the user needs to use the force option to remove the bricks.
  Fixes: bz#1514683
  Signed-off-by: Vishal Pandey <vpandey@redhat.com>
  Change-Id: I014d0f0afb4b2fac35ab0de52227f98dbae079d5
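  A simplified model of the staging check (field and function names are illustrative, not glusterd's structures):

      #include <stdio.h>

      struct rebal_status {
          int complete;
          unsigned long failures;
          unsigned long skipped;
      };

      static int stage_remove_brick_commit(const struct rebal_status *st, int force)
      {
          if (st->complete && (st->failures || st->skipped) && !force) {
              fprintf(stderr,
                      "warning: %lu failed / %lu skipped files; "
                      "use 'force' to commit anyway\n",
                      st->failures, st->skipped);
              return -1; /* reject the plain commit */
          }
          return 0;
      }

      int main(void)
      {
          struct rebal_status st = { 1, 3, 7 };
          if (stage_remove_brick_commit(&st, 0) != 0)
              puts("commit rejected");
          return stage_remove_brick_commit(&st, 1); /* force goes through */
      }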
* multiple files: reduce minor work under RCU_READ_LOCK (Yaniv Kaul, 2019-08-05, 1 file, -2/+2)
  1. Try to unlock faster - in error paths.
  2. Remove memory allocations - do them before the lock.
  Change-Id: I1e9ddd80b99de45ad0f557d62a5f28951dfd54c8
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
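  The pattern, sketched with liburcu (link with -lurcu; the failing lookup is a placeholder): allocate before entering the read-side critical section, and drop the lock first thing on the error path.

      #include <stdio.h>
      #include <stdlib.h>
      #include <urcu.h>

      static void *find_peer(void) { return NULL; /* pretend lookup fails */ }

      int main(void)
      {
          rcu_register_thread();

          /* Allocate *before* the read-side critical section... */
          char *reply = malloc(256);
          if (!reply) {
              rcu_unregister_thread();
              return 1;
          }

          rcu_read_lock();
          void *peer = find_peer();
          if (!peer) {
              rcu_read_unlock(); /* ...and unlock immediately on error paths */
              fprintf(stderr, "peer not found\n");
              free(reply);
              rcu_unregister_thread();
              return 1;
          }
          /* ... use peer under the lock ... */
          rcu_read_unlock();

          free(reply);
          rcu_unregister_thread();
          return 0;
      }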
* Fix spelling errors (Aravinda VK, 2019-07-14, 1 file, -1/+1)
  Fixes: bz#1728554
  Change-Id: I88357aed7c14988a12616035c3738c32c09a8f9a
  Signed-off-by: Patrick Matthäi <pmatthaei@debian.org>
  Signed-off-by: Aravinda VK <avishwan@redhat.com>
* glusterd-volgen.c: remove BD xlator from the graph (Yaniv Kaul, 2019-06-18, 1 file, -2/+0)
  The BD xlator was removed some time ago; remove it from the graph. Also remove the caps settings (only the BD xlator was using them) and the document describing the translator.
  Change-Id: Id0adcb2952f4832a5dc6301e726874522e07935d
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* glusterd/tier: remove tier related code from glusterd (Hari Gowtham, 2019-05-27, 1 file, -700/+36)
  The handler functions now point to dummy functions. The switch-case handling for tier has also been moved to the default case, to avoid issues if it is reintroduced. The tier changes in DHT still remain as such.
  updates: bz#1693692
  Change-Id: I80d80c9a3eb862b4440a36b31ae82b2e9d92e4dc
  Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
* glusterd: Fix coverity defects & put coverity annotations (Atin Mukherjee, 2019-05-02, 1 file, -3/+4)
  Along with fixing a few defects, put the required annotations for the defects which are marked ignore/false positive/intentional as per the Coverity defect sheet. This should avoid the per-component graph showing many defects as open on the Coverity glusterfs web page.
  Updates: bz#789278
  Change-Id: I19461dc3603a3bd8f88866a1ab3db43d783af8e4
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
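  A sketch of how such an annotation reads in code; the bracketed tag must match the event name from the Coverity report, and "leaked_storage" below is the usual event for a RESOURCE_LEAK defect (shown here on a deliberate leak-at-exit):

      #include <stdio.h>
      #include <stdlib.h>

      int main(void)
      {
          char *buf = malloc(32);
          if (!buf)
              return 1;
          snprintf(buf, 32, "glusterd");
          puts(buf);

          /* The annotation goes on the line right above the flagged statement.
           * Marking a defect intentional/false positive this way keeps it
           * from reappearing as "open" in per-component reports. */
          /* coverity[leaked_storage] */
          return 0; /* buf deliberately not freed: the process is exiting */
      }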
* glusterd: remove redundant glusterd_check_volume_exists () calls (Atin Mukherjee, 2019-04-08, 1 file, -30/+14)
  The following pattern was found in multiple places, where both glusterd_check_volume_exists and glusterd_volinfo_find do the same job; we need only one of them, not both. In a scaled environment having many volumes, iterating over the volume list to find a volume twice is a bottleneck!

      exists = glusterd_check_volume_exists(volname);
      ret = glusterd_volinfo_find(volname, &volinfo);
      if ((ret) || (!exists)) {

  Credits: ykaul@redhat.com for finding this out
  Updates: bz#1193929
  Change-Id: Ie116fe5c93e261a2bddd267c28ccb20a2884a36f
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
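  A toy illustration of the replacement, with a stand-in that mimics glusterd_volinfo_find's shape (the real function walks the volume list): one lookup now answers both "does it exist" and "give me the volinfo".

      #include <stdio.h>
      #include <string.h>

      typedef struct { const char *volname; } glusterd_volinfo_t;

      /* Toy lookup with the same shape as glusterd_volinfo_find(). */
      static int glusterd_volinfo_find(const char *name, glusterd_volinfo_t **out)
      {
          static glusterd_volinfo_t vol = { "gv0" };
          if (name && strcmp(name, "gv0") == 0) { *out = &vol; return 0; }
          return -1;
      }

      int main(void)
      {
          glusterd_volinfo_t *volinfo = NULL;

          /* Before: a glusterd_check_volume_exists() walk *and* a
           * glusterd_volinfo_find() walk. After: a single lookup. */
          if (glusterd_volinfo_find("gv0", &volinfo) != 0) {
              fprintf(stderr, "volume does not exist\n");
              return 1;
          }
          printf("found volume %s\n", volinfo->volname);
          return 0;
      }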
* mgmt/shd: Implement multiplexing in self heal daemon (Mohammed Rafi KC, 2019-04-01, 1 file, -1/+1)
  Problem: The shd daemon is per node, which means it creates a graph with all volumes in it. While this is great for utilizing resources, it is not so good in terms of performance and manageability, because self-heal daemons don't have the capability to automatically reconfigure their graphs. So each time any configuration change happens to the (replicate/disperse) volumes, we need to restart shd to bring the changes into the graph. Because of this, all ongoing heals for all other volumes have to be stopped in the middle and restarted all over again.
  Solution: This change makes shd a per-volume daemon, so that a graph is generated for each volume. When we want to start/reconfigure shd for a volume, we first search for an existing shd running on the node; if there is none, we start a new process. If a daemon is already running for shd, we simply detach the graph for the volume and reattach the updated graph for the volume. This won't touch any ongoing operations for any other volumes on the shd daemon.

  Example of an shd graph when it is per volume:

            -----------------------
            |    debug-iostat     |
            -----------------------
              /        |        \
             /         |         \
       ---------   ---------   ----------
       | AFR-1 |   | AFR-2 |   | AFR-3  |
       ---------   ---------   ----------

  A running shd daemon with 3 volumes will be like:

            -----------------------
            |    debug-iostat     |
            -----------------------
              /        |        \
             /         |         \
     ------------  ------------  ------------
     | volume-1 |  | volume-2 |  | volume-3 |
     ------------  ------------  ------------

  Change-Id: Idcb2698be3eeb95beaac47125565c93370afbd99
  fixes: bz#1659708
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* Multiple files: remove HAVE_BD_XLATOR related code. (Yaniv Kaul, 2019-03-25, 1 file, -32/+0)
  The BD translator was removed some time ago (in commit a907e468e724c32b9833ce59806fc215c7122d63). This completes the work.
  Compile-tested only!
  updates: bz#1635688
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
  Change-Id: I999df52e479a72d3cc9523f22f9056de17eb559c
* libglusterfs: Move devel headers under glusterfs directory (ShyamsundarR, 2018-12-05, 1 file, -3/+3)
  libglusterfs devel package headers are referenced in code using program include semantics. While this works, it can be done better, especially when dealing with out-of-tree xlator builds or out-of-tree devel package usage in general. Towards this, the following changes are done:
  - moved all devel headers under a glusterfs directory
  - included these headers using system header notation <> in all code outside of libglusterfs
  - included these headers using program notation "" within libglusterfs
  This change, although big, just moves the headers around and makes their inclusion correct when including them from other sources. This helps us correctly include libglusterfs headers without namespace conflicts.
  Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
  Updates: bz#1193929
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
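  A sketch of the two notations after the move, assuming the glusterfs devel package is installed under the default include path (compile with -lglusterfs; dict_new/dict_unref are real libglusterfs calls used here only to prove the header resolves):

      /* Everywhere outside libglusterfs (xlators, out-of-tree code): system
       * notation, picked up from the installed glusterfs/ include directory. */
      #include <glusterfs/dict.h>     /* was: #include "dict.h" */

      /* Within libglusterfs itself the same header stays on program notation:
       *     #include "dict.h"
       * so the library still builds from its own source tree. */

      int main(void)
      {
          dict_t *d = dict_new();
          if (d)
              dict_unref(d);
          return 0;
      }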
* glusterd: perform rcu_read_lock/unlock() under cleanup_lock mutex (Sanju Rakonde, 2018-12-03, 1 file, -4/+4)
  Problem: glusterd should not try to acquire locks on any resources when it has already received a SIGTERM and cleanup has started. Otherwise we might hit a segfault, since the thread going through the cleanup path will be freeing up the resources while some other thread might be trying to acquire locks on the freed resources.
  Solution: perform rcu_read_lock/unlock() under the cleanup_lock mutex.
  fixes: bz#1654270
  Change-Id: I87a97cfe4f272f74f246d688660934638911ce54
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
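  A simplified standalone model of the idea (glusterd wraps this in its own macros, and the names below are illustrative): every RCU read-side section is entered and exited under a process-wide cleanup mutex, so the SIGTERM cleanup path can take the same mutex and know no new readers will touch resources it frees. Requires liburcu (-lurcu -lpthread).

      #include <pthread.h>
      #include <urcu.h>

      static pthread_mutex_t cleanup_lock = PTHREAD_MUTEX_INITIALIZER;

      static void reader_begin(void)
      {
          pthread_mutex_lock(&cleanup_lock);   /* blocks while cleanup runs */
          rcu_read_lock();
          pthread_mutex_unlock(&cleanup_lock);
      }

      static void reader_end(void)
      {
          pthread_mutex_lock(&cleanup_lock);
          rcu_read_unlock();
          pthread_mutex_unlock(&cleanup_lock);
      }

      int main(void)
      {
          rcu_register_thread();
          reader_begin();
          /* ... walk the peer list ... */
          reader_end();
          rcu_unregister_thread();
          return 0;
      }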
* glusterd: allow shared-storage to use bricks under glusterd working directory (Sanju Rakonde, 2018-11-08, 1 file, -1/+2)
  With commit 44e4db, users are not allowed to create a volume using glusterd's working directory, or any subdirectory under it, as a brick. This has broken shared-storage, since the volume "gluster-shared-storage" is created using bricks under glusterd's working directory. With this patch, we let the "gluster-shared-storage" volume use bricks under glusterd's working directory.
  fixes: bz#1647029
  Change-Id: Ifcbcf4576eea12cf46f199dea287b29bd3ec3bfd
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* stripe: remove the translator from build and glusterd (Amar Tumballi, 2018-10-31, 1 file, -189/+1)
  Based on the proposal to remove a few features that are not actively maintained [1], remove the stripe translator from the build, and make sure there are no regression tests involving the stripe translator.
  [1] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html
  Note that this patch aims at removing the translator from the build; a follow-up patch is needed to remove the code from the repository.
  Updates: bz#1364707
  Change-Id: I235b305338f138e29e9f30cba65bc0dadbebbbd5
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* glusterd: set integer instead of string (Sanju Rakonde, 2018-10-24, 1 file, -1/+2)
  dict_get_str_boolean expects an integer, so we need to set all the boolean variables as integers to avoid log messages like the one below:

      [2018-09-10 03:55:19.236387] I [dict.c:2838:dict_get_str_boolean] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_reconnect+0xc2) [0x7ff7a83d0452] -->/usr/local/lib/glusterfs/4.2dev/rpc-transport/socket.so(+0x65b0) [0x7ff7a06cf5b0] -->/usr/local/lib/libglusterfs.so.0(dict_get_str_boolean+0xcf) [0x7ff7a85fc58f] ) 0-dict: key transport.socket.ignore-enoent, integer type asked, has string type [Invalid argument]

  This patch addresses all such instances in glusterd.
  Change-Id: I7e1979fcf381363943f4d09b94c3901c403727da
  updates: bz#1193929
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
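  A standalone sketch of the type mismatch with a toy dict that only models the type check (the dict_set_str/dict_set_int32 calls named in the comments are the real setters being contrasted):

      #include <stdio.h>

      enum toy_type { TOY_INT, TOY_STR };
      struct toy_val { enum toy_type type; int i; const char *s; };

      /* Mimics dict_get_str_boolean's complaint when the value is a string. */
      static int toy_get_str_boolean(const struct toy_val *v, int dflt)
      {
          if (v->type != TOY_INT) {
              fprintf(stderr, "integer type asked, has string type\n");
              return dflt;
          }
          return !!v->i;
      }

      int main(void)
      {
          struct toy_val wrong = { TOY_STR, 0, "true" }; /* via dict_set_str()   */
          struct toy_val right = { TOY_INT, 1, NULL };   /* via dict_set_int32() */

          toy_get_str_boolean(&wrong, 0);                 /* logs the warning */
          printf("%d\n", toy_get_str_boolean(&right, 0)); /* clean lookup */
          return 0;
      }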
* mgmt/glusterd: NULL pointer dereferencing clang fix (Iraj Jamali, 2018-10-03, 1 file, -15/+16)
  Added checks to avoid NULL pointer dereferencing.
  Updates: bz#1622665
  Change-Id: I745c1f3ba4df0e486ce99301843f9f13d01c00e0
  Signed-off-by: Iraj Jamali <ijamali@redhat.com>
* glusterd: fix coverity issues (Sanju Rakonde, 2018-10-01, 1 file, -1/+1)
  This patch fixes CID 1274175, 1175018.
  1274175: Buffer size warning
  1175018: Resource leak
  Change-Id: Id18960c249447b8dae35de3ad92bc570e62ddb09
  updates: bz#789278
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* Land part 2 of clang-format changes (Gluster Ant, 2018-09-12, 1 file, -2993/+2938)
  Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
  Signed-off-by: Nigel Babu <nigelb@redhat.com>
* Some (mgmt) xlators: use dict_{setn|getn|deln|get_int32n|set_int32n|set_strn} (Yaniv Kaul, 2018-09-09, 1 file, -90/+128)
  In a previous patch (https://review.gluster.org/20769) we've added the key length to be passed to dict_* funcs, to remove the need to strlen() it. This patch moves some xlators to use it.
  - It also adds dict_get_int32n, which was missing.
  - It also reduces the size of some key variables. They were set to 1024 bytes or PATH_MAX, where sometimes 64 bytes were really enough.
  Please review carefully:
  1. That I did not reduce the size of the key variables too much.
  2. That I did not mix up some keys.
  Compile-tested only!
  Change-Id: Ic729baf179f40e8d02bc2350491d4bb9b6934266
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
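  A sketch of the key-length idea; SLEN here is the compile-time length macro for string literals as used in the glusterfs tree, while toy_setn is a hypothetical stand-in with a dict_setn-like shape (the key length arrives as a parameter instead of being re-measured with strlen inside):

      #include <stdio.h>

      #define SLEN(str) (sizeof(str) - 1) /* literal length at compile time */

      static int toy_setn(const char *key, size_t keylen, int value)
      {
          printf("set %.*s=%d (len %zu, no strlen needed)\n",
                 (int)keylen, key, value, keylen);
          return 0;
      }

      int main(void)
      {
          /* Real call shape: dict_setn(dict, "replica-count",
           *                            SLEN("replica-count"), val); */
          return toy_setn("replica-count", SLEN("replica-count"), 3);
      }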
* glusterd: Fix coverity issues (Sanju Rakonde, 2018-08-29, 1 file, -1/+1)
  This patch fixes CIDs 1395250, 1395252.
  1395250 - Uninitialized variable
  1395252 - Out of bounds access
  updates: bz#789278
  Change-Id: Icf646364b14d48fa2bd82ea78ca5cdb5c684355f
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* xlators/mgmt/glusterd/src/glusterd-brick-ops.c: reduce size or re-scope message variable (Yaniv Kaul, 2018-08-23, 1 file, -4/+3)
  The error and/or message variable was either:
  - Reduced in size - from 2048 bytes to 64 bytes, for example, or
  - Changed in scope - defined in a smaller scope.
  Compile-tested only!
  Change-Id: Ib74e5f8f4c2978f670d4708e9382e97edf5df0a7
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* glusterd: fix some coverity issues (Bhumika Goyal, 2018-08-20, 1 file, -1/+1)
  Fixes CID: 1241481 1241482 1274079 1274118 1274121 1274131 1274198 1274214 1274220 1274224 1394663 1394641 382454 1382453 1382449 1288095
  Link: https://scan6.coverity.com/reports.htm#v42388/p10714/fileInstanceId=84772667&defectInstanceId=25770661&mergedDefectId=744716
  Change-Id: Idaf434186231c8b0fff4b27c57fa23636a89c8a7
  updates: bz#789278
  Signed-off-by: Bhumika Goyal <bgoyal@redhat.com>
* glusterd: fix gcc7 warnings (Amar Tumballi, 2018-08-14, 1 file, -16/+16)
  [sh]$ gcc --version
  gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)

  Warnings were of the type below:

      xlators/mgmt/glusterd/src/glusterd-store.c:3285:33: warning: ‘/options’ directive output may be truncated writing 8 bytes into a region of size between 1 and 4096 [-Wformat-truncation=]
          snprintf (path, len, "%s/options", conf->workdir);

      xlators/mgmt/glusterd/src/glusterd-store.c:1280:39: warning: ‘/snaps/’ directive output may be truncated writing 7 bytes into a region of size between 1 and 4096 [-Wformat-truncation=]
          snprintf (snap_fpath, len, "%s/snaps/%s/%s", priv->workdir,

  * Also changed some places where there were issues with key size.
  * Made sure all the 'char buf[SOMESIZE] = {0,};' occurrences are changed to 'char buf[SOMESIZE] = "";' in the files I changed.
  * Also edited the coding standard to reflect that.
  updates: bz#1193929
  Change-Id: I04c652624ac63199cea2077e46b3a5def37c3689
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
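  A minimal example of addressing -Wformat-truncation the conservative way (a general pattern, not the exact fix in the patch): at its default level gcc 7 only warns when the snprintf return value goes unused, so checking the result handles both the warning and real truncation.

      #include <stdio.h>
      #include <limits.h>

      int main(void)
      {
          const char *workdir = "/var/lib/glusterd";
          char path[PATH_MAX] = ""; /* "" initializer, per the coding standard */

          int n = snprintf(path, sizeof(path), "%s/options", workdir);
          if (n < 0 || (size_t)n >= sizeof(path)) {
              fprintf(stderr, "path truncated\n");
              return 1;
          }
          puts(path);
          return 0;
      }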
* All: run codespell on the code and fix issues. (Yaniv Kaul, 2018-07-22, 1 file, -1/+1)
  Please review, it's not always just the comments that were fixed. I've had to revert of course all calls to creat() that were changed to create() ...
  Only compile-tested!
  Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* Fix compile warnings (Xavi Hernandez, 2018-07-10, 1 file, -7/+16)
  This patch fixes compile warnings that appear with newer compilers. The solution applied is only to remove the warnings, but it doesn't always solve the problem in the best way. It assumes that the problem will never happen, as the previous code assumed.
  Change-Id: I6e8470d6c2e2dbd3bd7d324b5fd2f92ffdc3d6ec
  updates: bz#1193929
  Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* glusterd: connect to an existing brick process when quorum status is NOT_APPLICABLE_QUORUM (Atin Mukherjee, 2018-01-05, 1 file, -1/+1)
  First of all, this patch reverts commit 635c1c3, as the same is causing a regression with bricks not coming up on time when a node is rebooted. This patch tries to fix the problem in a different way by just trying to connect to an existing running brick when the quorum status is not applicable.
  Change-Id: I0efb5901832824b1c15dcac529bffac85173e097
  BUG: 1509845
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd : fix client io-threads option for replicate volumes (Ravishankar N, 2017-10-09, 1 file, -0/+34)
  Problem: Commit ff075a3d6f9b142911d25c27fd209838782bfff0 disabled loading client-io-threads for replicate volumes (it was set to on by default in commit e068c1997314046658dd502e9118dab32decf879) due to performance issues, but in doing so inadvertently failed to load the xlator even if the user explicitly enabled the option using the volume set command. This was despite returning success for the volume set.
  Fix: Modify the check in perfxl_option_handler() and add checks in the volume create/add-brick/remove-brick code paths, tying it all to GD_OP_VERSION_3_12_2.
  Change-Id: Ib612973a999a7da818cc926f5c2601b1f0794fcf
  BUG: 1498570
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* tier: separation of attach-tier from add-brick (hari gowtham, 2017-08-01, 1 file, -0/+271)
  PROBLEM: Both attach tier and add brick have the same RPC and set of code. This becomes a hurdle while trying to implement add brick on a tiered volume.
  FIX: This patch separates add brick and attach tier, giving them separate RPCs.
  Change-Id: Iec57e972be968a9ff00b15b507e56a4f6dc398a2
  BUG: 1376326
  Signed-off-by: hari gowtham <hgowtham@redhat.com>
  Reviewed-on: https://review.gluster.org/15503
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Tested-by: hari gowtham <hari.gowtham005@gmail.com>
  Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* posix: option to handle the shared bricks for statvfs() (Amar Tumballi, 2017-07-24, 1 file, -0/+17)
  Currently the 'storage/posix' xlator has an option `export-statfs-size no`, which exports zero as the value of a few fields in `struct statvfs`. In the case of a backend brick shared between multiple brick processes, the values of these fields should be `field_value / number-of-bricks-at-node`. This way, even issues like 'min-free-disk' at different layers are handled properly when the statfs() sys call is made.
  Fixes #241
  Change-Id: I2e320e1fdcc819ab9173277ef3498201432c275f
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-on: https://review.gluster.org/17618
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
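  A minimal model of the adjustment (which capacity fields to divide is illustrative here): take the statvfs counters and divide them by the number of brick processes sharing the backend filesystem.

      #include <stdio.h>
      #include <sys/statvfs.h>

      static int shared_statvfs(const char *path, int bricks_on_fs,
                                struct statvfs *buf)
      {
          if (statvfs(path, buf) != 0)
              return -1;
          if (bricks_on_fs > 1) {
              /* each brick reports its share of the shared filesystem */
              buf->f_blocks /= bricks_on_fs;
              buf->f_bfree  /= bricks_on_fs;
              buf->f_bavail /= bricks_on_fs;
          }
          return 0;
      }

      int main(void)
      {
          struct statvfs buf;
          if (shared_statvfs("/", 2, &buf) == 0)
              printf("adjusted blocks: %lu\n", (unsigned long)buf.f_blocks);
          return 0;
      }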
* Revert "glusterd: disallow rebalance & remove-brick on a sharded volume"Krutika Dhananjay2017-06-131-10/+0
| | | | | | | | | | | | | | | | | This reverts commit 8375b3d70d5c6268c6770b42a18b2e1bc09e411e. Now that some of the users have confirmed rebalance works fine without causing corruption of VMs, time to revert the CLI restriction. Change-Id: I45493fcbb1f25fd0fff27b2b3526c42642ccb464 BUG: 1460585 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: https://review.gluster.org/17506 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterd: Make reset-brick work correctly if brick-mux is on (Samikshan Bairagya, 2017-05-10, 1 file, -1/+2)
  Reset brick currently kills off the corresponding brick process. However, with brick multiplexing enabled, stopping the brick process would render all bricks attached to it unavailable. To handle this correctly, we need to make sure that the brick process is terminated only if brick multiplexing is disabled. Otherwise, we should send the GLUSTERD_BRICK_TERMINATE rpc to the respective brick process to detach the brick that is to be reset.
  Change-Id: I69002d66ffe6ec36ef48af09b66c522c6d35ac58
  BUG: 1446172
  Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
  Reviewed-on: https://review.gluster.org/17128
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: socketfile & pidfile related fixes for brick multiplexing featureMohit Agrawal2017-05-091-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: While brick-muliplexing is on after restarting glusterd, CLI is not showing pid of all brick processes in all volumes. Solution: While brick-mux is on all local brick process communicated through one UNIX socket but as per current code (glusterd_brick_start) it is trying to communicate with separate UNIX socket for each volume which is populated based on brick-name and vol-name.Because of multiplexing design only one UNIX socket is opened so it is throwing poller error and not able to fetch correct status of brick process through cli process. To resolve the problem write a new function glusterd_set_socket_filepath_for_mux that will call by glusterd_brick_start to validate about the existence of socketpath. To avoid the continuous EPOLLERR erros in logs update socket_connect code. Test: To reproduce the issue followed below steps 1) Create two distributed volumes(dist1 and dist2) 2) Set cluster.brick-multiplex is on 3) kill glusterd 4) run command gluster v status After apply the patch it shows correct pid for all volumes BUG: 1444596 Change-Id: I5d10af69dea0d0ca19511f43870f34295a54a4d2 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: https://review.gluster.org/17101 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: disallow rebalance & remove-brick on a sharded volume (Atin Mukherjee, 2017-05-04, 1 file, -0/+10)
  Change-Id: Idfbdbc61ca18054fdbf7556f74e195a63cd8a554
  BUG: 1447630
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-on: https://review.gluster.org/17160
  Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: N Balachandran <nbalacha@redhat.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterd: disallow increasing replica count for arbiter volumes (Ravishankar N, 2017-03-06, 1 file, -0/+10)
  Problem: An add-brick command to increase the replica count in an arbiter volume succeeds, causing undesirable effects like the 4th brick being loaded with the arbiter xlator, the 3rd one losing the arbiter xlator (when the brick process is restarted), the arbitration logic in afr going for a toss, etc.
  Fix: An arbiter configuration should always be a replica 3 volume (of which the 3rd brick is the arbiter). Hence disallow increasing the replica count for arbiter volume configurations.
  Change-Id: I9fe4edac880d0f711e6d44324ad5562974e53e51
  BUG: 1429200
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
  Reviewed-on: https://review.gluster.org/16845
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* libglusterfs, gfdb, glusterfs: Add missing breaks (Nigel Babu, 2017-02-26, 1 file, -0/+1)
  A few switches did not have breaks, causing fall-throughs. Most of them have been fixed; fall-through comments were added for those that are intentional.
  Change-Id: I84c85726b542f38504b50fefab5eba5dbcd27a07
  BUG: 1424894
  Signed-off-by: Nigel Babu <nigelb@redhat.com>
  Reviewed-on: https://review.gluster.org/16677
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
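  An illustration of the two cases the patch distinguishes: a missing break that was a bug, and an intentional fall-through that gets a marker comment (which compilers such as gcc also recognize for -Wimplicit-fallthrough).

      #include <stdio.h>

      static void handle(int op)
      {
          switch (op) {
          case 0:
              puts("start");
              break; /* the fix: previously fell into the next case by accident */
          case 1:
              puts("prepare");
              /* fall through -- intentional: commit always follows prepare */
          case 2:
              puts("commit");
              break;
          default:
              break;
          }
      }

      int main(void)
      {
          handle(0); /* prints only "start" after the fix */
          handle(1); /* prints "prepare" then "commit", by design */
          return 0;
      }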
* glusterd : Fix for error message while removing brick (Gaurav Yadav, 2017-02-17, 1 file, -1/+2)
  When a remove-brick command is issued for an offline brick, glusterd errors out the operation with the message "volume remove-brick start: failed: Found stopped brick <hostname>:". With this fix, the error message while removing a brick is changed to "volume remove-brick start: failed: Found stopped brick <brick path>. Use force option to remove the brick".
  Change-Id: Id40a02fc38cdb526c4629de262967fe2383febe4
  BUG: 1422624
  Signed-off-by: Gaurav Yadav <gyadav@redhat.com>
  Reviewed-on: https://review.gluster.org/16630
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* Tier: remove warning related to the enum (hari gowtham, 2017-02-07, 1 file, -6/+19)
  PROBLEM: In the tier-as-a-service patch the enums for tier (from gf1_op_command and gf_defrag_command) are put into a single enum, gf_defrag_command, which causes a warning that makes the build fail.
  FIX: Send both enums and eliminate the warning.
  Change-Id: I899ff622dfb07134e6459aa65f65ea7252765293
  BUG: 1418973
  Signed-off-by: hari gowtham <hgowtham@redhat.com>
  Reviewed-on: https://review.gluster.org/16539
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Tested-by: hari gowtham <hari.gowtham005@gmail.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: double-check brick liveness for remove-brick validation (Jeff Darcy, 2017-02-02, 1 file, -4/+20)
  Same problem as https://review.gluster.org/#/c/16509/ in a different place. Tests detach bricks without glusterd's knowledge, so glusterd's internal brick state is out of date and we have to re-check (via the brick's pidfile) as well.
  BUG: 1385758
  Change-Id: I169538c1c62d72a685a49d57ef65fb6c3db6eab2
  Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-on: https://review.gluster.org/16529
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
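  A sketch of such a pidfile re-check (the path and helper name are illustrative): read the recorded pid and probe it with kill(pid, 0) instead of trusting in-memory state.

      #include <stdio.h>
      #include <signal.h>
      #include <errno.h>
      #include <sys/types.h>

      static int brick_alive(const char *pidfile)
      {
          FILE *fp = fopen(pidfile, "r");
          long pid = 0;

          if (!fp)
              return 0;                      /* no pidfile: treat as not running */
          if (fscanf(fp, "%ld", &pid) != 1)
              pid = 0;
          fclose(fp);

          if (pid <= 0)
              return 0;
          /* signal 0: existence check only, nothing is delivered */
          return kill((pid_t)pid, 0) == 0 || errno == EPERM;
      }

      int main(void)
      {
          printf("brick alive: %d\n",
                 brick_alive("/var/run/gluster/vols/gv0/brick1.pid"));
          return 0;
      }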
* core: run many bricks within one glusterfsd process (Jeff Darcy, 2017-01-30, 1 file, -3/+9)
  This patch adds support for multiple brick translator stacks running in a single brick server process. This reduces our per-brick memory usage by approximately 3x, and our appetite for TCP ports even more. It also creates potential to avoid process/thread thrashing, and to improve QoS by scheduling more carefully across the bricks, but realizing that potential will require further work.
  Multiplexing is controlled by the "cluster.brick-multiplex" global option. By default it's off, and bricks are started in separate processes as before. If multiplexing is enabled, then *compatible* bricks (mostly those with the same transport options) will be started in the same process.
  Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
  BUG: 1385758
  Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-on: https://review.gluster.org/14763
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: bypass add-brick validation with force (Atin Mukherjee, 2017-01-18, 1 file, -3/+3)
  Commit c916a2f added a validation to restrict the add-brick operation if a replica configuration is changed and any of the bricks belonging to the volume is down. However, we should bypass this validation with a force option if users really want add-brick to go through, at the risk of the corner-case data-loss issues. The original problem of add-brick failing when the layout is not set will still remain with a force option, as the issue has to be taken care of in the DHT layer.
  Change-Id: I0ed3df91ea712f77674eb8afc6fdfa577f25a7bb
  BUG: 1406411
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-on: http://review.gluster.org/16358
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Ravishankar N <ravishankar@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* tier : Tier as a service (hari gowtham, 2017-01-16, 1 file, -6/+9)
  tierd is implemented by separating it from the rebalance process. The commands affected:
  1) Attach tier will trigger this process instead of the old one.
  2) tier start and tier start force will also trigger this process.
  3) volume status [tier] will show the tier daemon as a process instead of a task, and normal tier status and tier detach status work.
  4) tier stop implemented.
  5) detach tier implemented separately, along with a new detach tier status.
  6) volume tier volname status will work using the changes.
  7) volume set works.
  This patch has separated the tier translator from the legacy DHT rebalance code. It now sends the RPCs from the CLI to glusterd separately from the DHT rebalance code. The daemon is now a service, similar to the snapshot daemon, and can be viewed using the volume status command. The code for the validation and commit phases is the same as the earlier tier validation code in DHT rebalance. The "brickop" phase has been changed so that the status command can use this framework.
  The service management framework is now used; DHT rebalance does not use this framework. This service framework takes care of:
  *) spawning the daemon, killing it, and other such processes.
  *) volume set options, which are written to the volfile.
  *) restart and reconfigure functions. Restart is to restart the daemon at two points: 1) after gluster goes down and comes up, and 2) to stop detach tier.
  *) reconfigure is used to make immediate volfile changes, so we don't restart the daemon. It has the code to rewrite the volfile for topological changes too (which comes into play during add and remove brick).
  With this patch the log, pid, and volfile are separated and put into respective directories.
  Change-Id: I3681d0d66894714b55aa02ca2a30ac000362a399
  BUG: 1313838
  Signed-off-by: hari gowtham <hgowtham@redhat.com>
  Reviewed-on: http://review.gluster.org/13365
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Tested-by: hari gowtham <hari.gowtham005@gmail.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Dan Lambright <dlambrig@redhat.com>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Fail add-brick on replica count change, if brick is down (karthik-us, 2017-01-06, 1 file, -0/+19)
  Problem:
  1. Have a replica 2 volume with bricks b1 and b2.
  2. Before setting the layout, b1 goes down.
  3. Set the layout and write some data, which gets populated on b2.
  4. b2 goes down, then b1 comes up.
  5. Add another brick b3; heal takes place from b1 to b3, which basically has no data.
  6. Write some data. Both b1 and b3 will mark b2 for pending writes.
  7. b1 goes down, and b2 comes up.
  8. b2 gets healed from b1. During heal it removes the data which is already in b2, considering it stale data. This leads to data loss.
  Solution:
  1. In the glusterd stage-op, while adding bricks, check whether the replica count is being increased.
  2. If yes, check whether any of the bricks are down at that time.
  3. If yes, fail the add-brick to avoid such data loss.
  4. Else continue the normal operation.
  This check will work even when we convert a plain distribute volume to replicate.
  Test:
  1. Create a replica 2 volume.
  2. Kill one brick of the volume.
  3. Try adding a brick to the volume.
  4. It should fail with an "all bricks are not up" error.
  5. Create a distribute volume and kill one of its bricks.
  6. Try to convert it to a replicate volume by adding bricks.
  7. This should also fail.
  Change-Id: I9c8d2ab104263e4206814c94c19212ab914ed07c
  BUG: 1406411
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
  Reviewed-on: http://review.gluster.org/16330
  Tested-by: Ravishankar N <ravishankar@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: N Balachandran <nbalacha@redhat.com>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
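  A simplified model of the staging check with toy structures (not glusterd's types): when an add-brick raises the replica count, every existing brick must be online, otherwise the new brick could be healed from a brick holding stale data.

      #include <stdio.h>

      struct brick { const char *path; int online; };

      static int stage_add_brick(const struct brick *bricks, int n,
                                 int old_replica, int new_replica)
      {
          if (new_replica <= old_replica)
              return 0; /* replica count unchanged: no extra constraint */

          for (int i = 0; i < n; i++) {
              if (!bricks[i].online) {
                  fprintf(stderr,
                          "add-brick failed: brick %s is down; all bricks "
                          "must be up to change the replica count\n",
                          bricks[i].path);
                  return -1;
              }
          }
          return 0;
      }

      int main(void)
      {
          struct brick b[] = { { "n1:/b1", 1 }, { "n2:/b2", 0 } };
          return stage_add_brick(b, 2, 2, 3) ? 1 : 0; /* rejected: b2 is down */
      }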
* glusterd: fix few events generation (Atin Mukherjee, 2016-11-23, 1 file, -1/+2)
  This patch does the following:
  1. Generate a PEER_REJECT event if the peer add request is from an unknown peer during peer handshaking.
  2. EVENT_COMPARE_FRIEND_VOLUME_FAILED should be generated based on status code, not ret.
  3. Add an EVENT_BRICKPATH_RESOLVE_FAILED event in case glusterd fails to resolve bricks; this is mainly in the restore path.
  4. Remove the EVENT_BRICKS_START_FAILED event, as we already have EVENT_BRICK_START_FAILED.
  Change-Id: I90e5bc4a331166d0bb3554eb2ec9df2526837a1d
  BUG: 1397424
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-on: http://review.gluster.org/15903
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
* cli/rebalance: remove brick status is incorrect (N Balachandran, 2016-11-17, 1 file, -0/+3)
  If a remove-brick operation is preceded by a fix-layout, running remove-brick status on a node which does not contain any of the bricks that were removed displays the fix-layout status. The defrag_cmd variable was not updated in glusterd for the nodes not hosting removed bricks, causing the status parsing to go wrong. This is now updated. Also made minor modifications to the spacing in the fix-layout status output.
  Change-Id: Ib735ce26be7434cd71b76e4c33d9b0648d0530db
  BUG: 1389697
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
  Reviewed-on: http://review.gluster.org/15749
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* Tier: failing detach commit on detach failure and in-progress (hari gowtham, 2016-09-12, 1 file, -0/+11)
  PROBLEM: If detach status has failed or remains in progress, we allow detach commit to happen; only detach force should be allowed.
  FIX: Check the detach status for failure or in-progress and disallow commit with an apt error message.
  Change-Id: Ib97d540fec67717bb55c18d133187c665cf69ef1
  BUG: 1374584
  Signed-off-by: hari gowtham <hgowtham@redhat.com>
  Reviewed-on: http://review.gluster.org/15438
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Tested-by: hari gowtham <hari.gowtham005@gmail.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Dan Lambright <dlambrig@redhat.com>
  Tested-by: Dan Lambright <dlambrig@redhat.com>