summaryrefslogtreecommitdiffstats
path: root/xlators/mgmt/glusterd/src/glusterd-brick-ops.c
Commit message (Collapse)AuthorAgeFilesLines
* glusterd : fix client io-threads option for replicate volumesRavishankar N2017-10-091-0/+34
| | | | | | | | | | | | | | | | | | | Problem: Commit ff075a3d6f9b142911d25c27fd209838782bfff0 disabled loading client-io-threads for replicate volumes (it was set to on by default in commit e068c1997314046658dd502e9118dab32decf879) due to performance issues but in doing so, inadvertently failed to load the xlator even if the user explicitly enabled the option using the volume set command. This was despite returning returning sucess for the volume set. Fix: Modify the check in perfxl_option_handler() and add checks in volume create/add-brick/remove-brick code paths, tying it all to GD_OP_VERSION_3_12_2. Change-Id: Ib612973a999a7da818cc926f5c2601b1f0794fcf BUG: 1498570 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* tier: separation of attach-tier from add-brickhari gowtham2017-08-011-0/+271
| | | | | | | | | | | | | | | | | | PROBLEM: Both attach tier and add brick have the same RPC and set of code. This becomes a hurdle while tring to implement add brick on a tiered volume. FIX: This patch separates the add brick and attach tier giving them separate RPCs. Change-Id: Iec57e972be968a9ff00b15b507e56a4f6dc398a2 BUG: 1376326 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/15503 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* posix: option to handle the shared bricks for statvfs()Amar Tumballi2017-07-241-0/+17
| | | | | | | | | | | | | | | | | | | | Currently 'storage/posix' xlator has an option called option `export-statfs-size no`, which exports zero as values for few fields in `struct statvfs`. In a case of backend brick shared between multiple brick processes, the values of these variables should be `field_value / number-of-bricks-at-node`. This way, even the issue of 'min-free-disk' etc at different layers would also be handled properly when the statfs() sys call is made. Fixes #241 Change-Id: I2e320e1fdcc819ab9173277ef3498201432c275f Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: https://review.gluster.org/17618 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* Revert "glusterd: disallow rebalance & remove-brick on a sharded volume"Krutika Dhananjay2017-06-131-10/+0
| | | | | | | | | | | | | | | | | This reverts commit 8375b3d70d5c6268c6770b42a18b2e1bc09e411e. Now that some of the users have confirmed rebalance works fine without causing corruption of VMs, time to revert the CLI restriction. Change-Id: I45493fcbb1f25fd0fff27b2b3526c42642ccb464 BUG: 1460585 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: https://review.gluster.org/17506 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterd: Make reset-brick work correctly if brick-mux is onSamikshan Bairagya2017-05-101-1/+2
| | | | | | | | | | | | | | | | | | | Reset brick currently kills of the corresponding brick process. However, with brick multiplexing enabled, stopping the brick process would render all bricks attached to it unavailable. To handle this correctly, we need to make sure that the brick process is terminated only if brick-multiplexing is disabled. Otherwise, we should send the GLUSTERD_BRICK_TERMINATE rpc to the respective brick process to detach the brick that is to be reset. Change-Id: I69002d66ffe6ec36ef48af09b66c522c6d35ac58 BUG: 1446172 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: https://review.gluster.org/17128 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: socketfile & pidfile related fixes for brick multiplexing featureMohit Agrawal2017-05-091-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: While brick-muliplexing is on after restarting glusterd, CLI is not showing pid of all brick processes in all volumes. Solution: While brick-mux is on all local brick process communicated through one UNIX socket but as per current code (glusterd_brick_start) it is trying to communicate with separate UNIX socket for each volume which is populated based on brick-name and vol-name.Because of multiplexing design only one UNIX socket is opened so it is throwing poller error and not able to fetch correct status of brick process through cli process. To resolve the problem write a new function glusterd_set_socket_filepath_for_mux that will call by glusterd_brick_start to validate about the existence of socketpath. To avoid the continuous EPOLLERR erros in logs update socket_connect code. Test: To reproduce the issue followed below steps 1) Create two distributed volumes(dist1 and dist2) 2) Set cluster.brick-multiplex is on 3) kill glusterd 4) run command gluster v status After apply the patch it shows correct pid for all volumes BUG: 1444596 Change-Id: I5d10af69dea0d0ca19511f43870f34295a54a4d2 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: https://review.gluster.org/17101 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: disallow rebalance & remove-brick on a sharded volumeAtin Mukherjee2017-05-041-0/+10
| | | | | | | | | | | | | Change-Id: Idfbdbc61ca18054fdbf7556f74e195a63cd8a554 BUG: 1447630 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/17160 Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterd: disallow increasing replica count for arbiter volumesRavishankar N2017-03-061-0/+10
| | | | | | | | | | | | | | | | | | | | | Problem: add-brick command to increase replica count in an arbiter volume succeeds, causing undesirable effects like the 4th brick being loaded with the arbiter xlator, the 3rd one losing the arbiter xlator (when the brick process is restarted), arbitration logic in afr going for a toss etc. Fix: Arbiter configuration should always be a replica 3 volume (of which 3rd brick is arbiter). Hence disallow increasing replica count for arbiter volume configurations. Change-Id: I9fe4edac880d0f711e6d44324ad5562974e53e51 BUG: 1429200 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/16845 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* libglusterfs, gfdb, glusterfs: Add missing breaksNigel Babu2017-02-261-0/+1
| | | | | | | | | | | | | | | | A few switches did not have breaks causing fall throughs. Most of them have been fixed with fall through comments for those that are intentional. Change-Id: I84c85726b542f38504b50fefab5eba5dbcd27a07 BUG: 1424894 Signed-off-by: Nigel Babu <nigelb@redhat.com> Reviewed-on: https://review.gluster.org/16677 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* glusterd : Fix for error message while removing brickGaurav Yadav2017-02-171-1/+2
| | | | | | | | | | | | | | | | | | | When remove-brick command is issued to a offline brick, glusterd error out the operation with message -: "volume remove-brick start: failed: Found stopped brick <hostname>:". With this fix while removing brick, error message is modified to "volume remove-brick start: failed: Found stopped brick <brick path>. Use force option to remove the brick" Change-Id: Id40a02fc38cdb526c4629de262967fe2383febe4 BUG: 1422624 Signed-off-by: Gaurav Yadav <gyadav@redhat.com> Reviewed-on: https://review.gluster.org/16630 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* Tier: remove warning related to the enumhari gowtham2017-02-071-6/+19
| | | | | | | | | | | | | | | | | | | PROBLEM: In the tier as a service patch the enums for tier (from gf1_op_command and gf_defrag_command) are put into a single enum gf_defrag_command which causes a warning that will make the build fail. FIX: send both the enum and eliminate the warning. Change-Id: I899ff622dfb07134e6459aa65f65ea7252765293 BUG: 1418973 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/16539 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: double-check brick liveness for remove-brick validationJeff Darcy2017-02-021-4/+20
| | | | | | | | | | | | | | | | Same problem as https://review.gluster.org/#/c/16509/ in a different place. Tests detach bricks without glusterd's knowledge, so glusterd's internal brick state is out of date and we have to re-check (via the brick's pidfile) as well. BUG: 1385758 Change-Id: I169538c1c62d72a685a49d57ef65fb6c3db6eab2 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16529 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* core: run many bricks within one glusterfsd processJeff Darcy2017-01-301-3/+9
| | | | | | | | | | | | | | | | | | | | | | | This patch adds support for multiple brick translator stacks running in a single brick server process. This reduces our per-brick memory usage by approximately 3x, and our appetite for TCP ports even more. It also creates potential to avoid process/thread thrashing, and to improve QoS by scheduling more carefully across the bricks, but realizing that potential will require further work. Multiplexing is controlled by the "cluster.brick-multiplex" global option. By default it's off, and bricks are started in separate processes as before. If multiplexing is enabled, then *compatible* bricks (mostly those with the same transport options) will be started in the same process. Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb BUG: 1385758 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/14763 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: bypass add-brick validation with forceAtin Mukherjee2017-01-181-3/+3
| | | | | | | | | | | | | | | | | | | | | Commit c916a2f added a validation to restrict add-brick operation if a replica configuration is changed and any of the bricks belonging to the volume is down. However we should bypass this validation with a force option if users really want to have add-brick to go through at the sake of the corner cases of data loss issue. The original problem of add-brick getting failed when layout is not set will still be a problem with a force option as the issue has to be taken care in the DHT layer. Change-Id: I0ed3df91ea712f77674eb8afc6fdfa577f25a7bb BUG: 1406411 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/16358 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* tier : Tier as a servicehari gowtham2017-01-161-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tierd is implemented by separating from rebalance process. The commands affected: 1) Attach tier will trigger this process instead of old one 2) tier start and tier start force will also trigger this process. 3) volume status [tier] will show tier daemon as a process instead of task and normal tier status and tier detach status works. 4) tier stop implemented. 5) detach tier implemented separately along with new detach tier status 6) volume tier volname status will work using the changes. 7) volume set works This patch has separated the tier translator from the legacy DHT rebalance code. It now sends the RPCs from the CLI to glusterd separate to the DHT rebalance code. The daemon is now a service, similar to the snapshot daemon, and can be viewed using the volume status command. The code for the validation and commit phase are the same as the earlier tier validation code in DHT rebalance. The “brickop” phase has been changed so that the status command can use this framework. The service management framework is now used. DHT rebalance does not use this framework. This service framework takes care of : *) spawning the daemon, killing it and other such processes. *) volume set options , which are written on the volfile. *) restart and reconfigure functions. Restart is to restart the daemon at two points 1)after gluster goes down and comes up. 2) to stop detach tier. *) reconfigure is used to make immediate volfile changes. By doing this, we don’t restart the daemon. it has the code to rewrite the volfile for topological changes too (which comes into place during add and remove brick). With this patch the log, pid, and volfile are separated and put into respective directories. Change-Id: I3681d0d66894714b55aa02ca2a30ac000362a399 BUG: 1313838 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/13365 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Fail add-brick on replica count change, if brick is downkarthik-us2017-01-061-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: 1. Have a replica 2 volume with bricks b1 and b2 2. Before setting the layout, b1 goes down 3. Set the layout write some data, which gets populated on b2 4. b2 goes down then b1 comes up 5. Add another brick b3, and heal will take place from b1 to b3, which basically have no data 6. Write some data. Both b1 and b3 will mark b2 for pending writes 7. b1 goes down, and b2 comes up 8. b2 gets heald from b1. During heal it removes the data which is already in b2, considering that as stale data. This leads to data loss. Solution: 1. In glusterd stage-op, while adding bricks, check whether the replica count is being increased 2. If yes, then check whether any of the bricks are down at that time 3. If yes, then fail the add-brick to avoid such data loss 4. Else continue the normal operation. This check will work enen when we convert plain distribute volume to replicate Test: 1. Create a replica 2 volume 2. Kill one brick from the volume 3. Try adding a brick to the volume 4. It should fail with all bricks are not up error 5. Cretae a distribute volume and kill one of the brick 6. Try to convert it to replicate volume, by adding bricks. 7. This should also fail. Change-Id: I9c8d2ab104263e4206814c94c19212ab914ed07c BUG: 1406411 Signed-off-by: karthik-us <ksubrahm@redhat.com> Reviewed-on: http://review.gluster.org/16330 Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: fix few events generationAtin Mukherjee2016-11-231-1/+2
| | | | | | | | | | | | | | | | | | | | | This patch does the following: 1. Generate PEER_REJECT event if the peer add request is from an unknown peer during peer handshaking. 2. EVENT_COMPARE_FRIEND_VOLUME_FAILED should be generated based on status code, not ret. 3. Add EVENT_BRICKPATH_RESOLVE_FAILED event in case glusterd fails to resolve bricks, this is mainly at restore path. 4. Remove EVENT_BRICKS_START_FAILED event as we already have EVENT_BRICK_START_FAILED Change-Id: I90e5bc4a331166d0bb3554eb2ec9df2526837a1d BUG: 1397424 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15903 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
* cli/rebalance: remove brick status is incorrectN Balachandran2016-11-171-0/+3
| | | | | | | | | | | | | | | | | | | | | | If a remove brick operation is preceded by a fix-layout, running remove-brick status on a node which does not contain any of the bricks that were removed displays fix-layout status. The defrag_cmd variable was not updated in glusterd for the nodes not hosting removed bricks causing the status parsing to go wrong. This is now updated. Also made minor modifications to the spacing in the fix-layout status output. Change-Id: Ib735ce26be7434cd71b76e4c33d9b0648d0530db BUG: 1389697 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15749 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* Tier: failing detach commit on detach failure and in-progresshari gowtham2016-09-121-0/+11
| | | | | | | | | | | | | | | | | | | | PROBLEM: if detach status has failed or if it remains in progress we allow detach commit to happen. only detach force should be allowed. FIX: check the detach status for failure or inprogress and disallow with the apt error message. Change-Id: Ib97d540fec67717bb55c18d133187c665cf69ef1 BUG: 1374584 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/15438 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* glusterd : Introduce reset brickAnuradha Talur2016-08-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The command basically allows replace brick with src and dst bricks as same. Usage: gluster v reset-brick <volname> <hostname:brick-path> start This command kills the brick to be reset. Once this command is run, admin can do other manual operations that they need to do, like configuring some options for the brick. Once this is done, resetting the brick can be continued with the following options. gluster v reset-brick <vname> <hostname:brick> <hostname:brick> commit {force} Does the job of resetting the brick. 'force' option should be used when the brick already contains volinfo id. Problem: On doing a disk-replacement of a brick in a replicate volume the following 2 scenarios may occur : a) there is a chance that reads are served from this replaced-disk brick, which leads to empty reads. b) potential data loss if next writes succeed only on replaced brick, and heal is done to other bricks from this one. Solution: After disk-replacement, make sure that reset-brick command is run for that brick so that pending markers are set for the brick and it is not chosen as source for reads and heal. But, as of now replace-brick for the same brick-path is not allowed. In order to fix the above mentioned problem, same brick-path replace-brick is needed. With this patch reset-brick commit {force} will be allowed even when source and destination <hostname:brickpath> are identical as long as 1) destination brick is not alive 2) source and destination brick have the same brick uuid and path. Also, the destination brick after replace-brick will use the same port as the source brick. Change-Id: I440b9e892ffb781ea4b8563688c3f85c7a7c89de BUG: 1266876 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/12250 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ashish Pandey <aspandey@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* glusterd: fix unused variable warnings/errorsKaleb S. KEITHLEY2016-08-261-6/+0
| | | | | | | | | | | | | | | | | | http://review.gluster.org/14085 fixes a/the "leak" - via the generated rpc/xdr headers - of pragmas that mask these warnings. However 14085 won't pass the smoke test until all the warnings are fixed. Change-Id: I212abb47d9f35922b3f8253137ebab6841a53eed BUG: 1369124 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15261 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com>
* cluster/tier: fix detach tier error messageDan Lambright2016-06-131-14/+15
| | | | | | | | | | | | | | | Do not refer to obsolete syntax when throwing an error on detach tier. Throw a warning if the user tries to commit a detach tier before starting (or completing) the decommission process. Change-Id: I9df28c1b42b8ab1790e1db5c0cd518432146c1d1 BUG: 1337227 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/14438 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* cli/glusterd: add/remove brick fixes for arbiter volumesRavishankar N2016-05-191-26/+145
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1.Provide a command to convert replica 2 volumes to arbiter volumes. Existing self-heal logic will automatically heal the file hierarchy into the arbiter brick, the progress of which can be monitored using the heal info command. Syntax: gluster volume add-brick <VOLNAME> replica 3 arbiter 1 <HOST:arbiter-brick-path> 2. Add checks when removing bricks from arbiter volumes: - When converting from arbiter to replica 2 volume, allow only arbiter brick to be removed. - When converting from arbiter to plain distribute volume, allow only if arbiter is one of the bricks that is removed. 3. Some clean-up: - Use GD_MSG_DICT_GET_SUCCESS instead of GD_MSG_DICT_GET_FAILED to log messages that are not failures. - Remove unused variable `brick_list` - Move 'brickinfo->group' related functions to glusted-utils. Change-Id: Ic87b8c7e4d7d3ab03f93e7b9f372b314d80947ce BUG: 1318289 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/14126 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: copy real_path from older brickinfo during brick importAtin Mukherjee2016-05-181-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | In glusterd_import_new_brick () new_brickinfo->real_path will not be populated for the first time and hence if the underlying file system is bad for the same brick, import will fail resulting in inconsistent configuration data. Fix is to populate real_path from old brickinfo object. Also there were many cases where we were unnecessarily calling realpath() and that may cause in failure. For eg - if a remove brick is executed with a brick whoose underlying file system has crashed, remove-brick fails since realpath() call fails. We'd need to call realpath() here as the value is of no use.Hence passing construct_realpath as _gf_false in glusterd_volume_brickinfo_get_by_brick () is a must in such cases. Change-Id: I7ec93871dc9e616f5d565ad5e540b2f1cacaf9dc BUG: 1335531 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/14306 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* glusterd: remove-brick commit should not succeed when migration failedSakshi Bansal2016-05-021-0/+6
| | | | | | | | | | | | | | | | While remove a brick if the data migration was not successful, remove-brick commit should not succeed as this can lead to data loss. Change-Id: I1eac0ef775cc6910ece0e46ebb04051444d54393 BUG: 1278325 Signed-off-by: Sakshi Bansal <sabansal@localhost.localdomain> Reviewed-on: http://review.gluster.org/12513 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: populate brickinfo->real_path conditionallyAtin Mukherjee2016-04-111-6/+12
| | | | | | | | | | | | | | | | | | | | glusterd_brickinfo_new_from_brick () is called from multiple places and one of them is glusterd_brick_rpc_notify where its very well possible that an underlying brick's file system has crashed and a disconnect event has been received. In this case glusterd tries to build the brickinfo from the brickid in the RPC request, however the same fails as glusterd_brickinfo_new_from_brick () fails from realpath. Fix is to skip populating real_path if its a disconnect event. Change-Id: I9d9149c64a9cf2247abb731f219c1b1eef037960 BUG: 1325841 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/13965 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* glusterd / afr : Enable auto heal when replica count increasesAnuradha Talur2016-03-211-16/+70
| | | | | | | | | | | | | | | | | | | | | In replicate volumes, when a brick is added to a replicate group, heal to the new brick should be triggered. Also, the new brick should not be considered as source for healing till it is up to date. Previously, extended attributes had to be set manually on the bricks for this to happen. This patch is part 1 patch to automate this process. Change-Id: I29958448618372bfde23bf1dac5dd23dba1ad98f BUG: 1276203 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/12451 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com>
* Tier: making detach start fail when brick on hot tier is downhari2016-03-031-1/+2
| | | | | | | | | | | | | | | | | | | Currently detach tier start happens even when a hot brick is down this might lead to data loss. This patch prevents the detach tier start from being executed successfully if a brick in hot tier is down Change-Id: I3b6047a44bd01b8a6887d41f799f64de6bf075ef BUG: 1309999 Signed-off-by: hari <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/13474 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: hari gowtham <hari.gowtham005@gmail.com> Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* glusterd/tier: Reset to reconfigured values after detach-tierMohammed Rafi KC2015-11-261-0/+33
| | | | | | | | | | | | | | After detach-tier commit , we have to reset the option reconfigured for tier volume Change-Id: Iae0210259720d6ac14ccc0cc339dc9f54a0c4571 BUG: 1285046 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12736 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier make cache mode default for tiered volumesDan Lambright2015-11-171-0/+3
| | | | | | | | | | | | | The default mode for tiered volumes must be cache. The current test mode was for engineering and should ordinarily not be used by customers. Change-Id: I20583f54a9269ce75daade645be18ab8575b0b9b BUG: 1282076 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12581 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com>
* cluster/tier: Disallow detach commit when detach in progressDan Lambright2015-11-161-0/+25
| | | | | | | | | | | | | 1. Check if detach is running, disallow detach commit if so. 2. Cleanup shutdown of tier daemon on detach: do not rerun fix-layout, do not send incorrect status back to glusterd. Change-Id: I97202f748773c1176396a4ffd32a4c7fa9b9c1bc BUG: 1279637 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12560 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: brick failed to startMohammed Rafi KC2015-11-161-1/+3
| | | | | | | | | | | | | | | | | | brick volfiles are generated in post validate, if it is running version higher than GLUSTER_3_7_5, else will be running in syncop. If the code fall back to syncop, and volume is stopped then we were returning the operation with out generating volfiles. Change-Id: I3b16ee29de19c5d34e45d77d6b7e4b665c2a4653 BUG: 1282322 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12552 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/tier enable CTR on attach tierDan Lambright2015-10-301-0/+3
| | | | | | | | | | | | | CTR is currently disabled by default, and must be manually enabled for tiering to start. This is an overhead on the administrator and easy to overlook. Enable it automatically when a tier is attached. Change-Id: I0c29de8762faec1bfe6d1376a57eeef3357ad15a BUG: 1274847 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12420 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com>
* tiering/glusterd: keep afr/ec xlators name constantMohammed Rafi KC2015-10-081-3/+11
| | | | | | | | | | | | | | | | | | | afr uses the translator name for locking purpose, so it is mandatory to keep afr/ec xlators name constant across graph change currently when a tier is attached, afr names are appended either with hot or cold. ie that breaks the above mentioned constraint. Change-Id: I3699dcdaa8190bab3ba81cbc01e8fa126d37ba0d BUG: 1261276 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12134 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* glusterd/add-brick: change add-brick implementation to v3 frameworkMohammed Rafi KC2015-10-071-11/+67
| | | | | | | | | | | | | | | | | | | | | add-brick commit first happens on local node and followed by peers. As part of the commit of local-host glusterd will send the updated volfiles to the clients connected to the local-host even before the commit of peers happen. If any of the newly added brick was hosted by any peer, that brick won't be started when client (connected to local-host) try to send fops. By changing to v3 framework we can send post validate ops after commit operation that helps to send volfile fetch request only after completing commits on all nodes. Change-Id: Ib7312e01143326128c010c11fc2ed206f37409ad BUG: 1263549 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12237 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterfsd : newly added brick receives fops only after it is startedSakshi2015-09-221-1/+4
| | | | | | | | | | | | | | | | | | | When new bricks are added in the middle of an on-going fop like 'rm', the volfile changes without waiting for the newly added bricks to get port. Fops are sent to all bricks and may fail on some with ENOTCONN as these bricks may not have a port yet. This patch ensures that the volfile change happens only after all the bricks have a port. Change-Id: I7ed2413475f80d0cc8849fed33036ade8d75a191 BUG: 1233151 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/11342 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd : check if all bricks are started before performing remove-brickSakshi2015-09-221-1/+10
| | | | | | | | | | Change-Id: Ie9e24e037b7a39b239a7badb983504963d664324 BUG: 1225716 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/10954 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* Tier/cli: Change detach-tier commit force to detach-tier forceMohammed Rafi KC2015-09-221-1/+1
| | | | | | | | | | | | | | | | | | Current detach-tier cli command support commit force. Deprecating the same to force. So the new syntax would be: volume detach-tier <VOLNAME> <start|stop|status|commit|force> Change-Id: Ie86dfd72341078c0a1be94767f523730911312ef BUG: 1261862 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12151 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* Tier/glusterd: Do not allow attach-tier if remove-brick is not committedMohammed Rafi KC2015-09-181-0/+24
| | | | | | | | | | | | | | | | When attaching a tier, if there is a pending remove-brick task, then should not allow attach-tier. Since we are not supporting add/remove brick on a tiered volume, we won't able to commit pending remove-brick after attaching the tier Change-Id: Ib434e2e6bc75f0908762f087ad1ca711e6b62818 BUG: 1261819 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12148 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/glusterd: volume status failed after detach startMohammed Rafi KC2015-09-181-3/+4
| | | | | | | | | | | | | | | After triggering detach start on a tiered volume fails. This because of brick count was wrongly setting in rebal dictionary. Change-Id: I6a472bf2653a07522416699420161f2fb1746aef BUG: 1261757 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12146 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* Tiering:Changing error message as detach-tier instead of "remove-brick"hari gowtham2015-09-161-4/+13
| | | | | | | | | | | Change-Id: Id93424a08f601a8d7540d96a47ed2b0497d4a631 BUG: 1263177 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/12177 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/glusterd : Disable subvol match check during detach tierMohammed Rafi KC2015-09-141-5/+13
| | | | | | | | | | | | | | | For tiering, user does not have authorization to choose for bricks to detach, so we don't need to whether subvols match for the bricks or not. Change-Id: I7e777ccc1aa261f652f9b158718fcd55185c7794 BUG: 1261741 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12145 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Do not allow "detach-tier commit" unnecessarilyGaurav Kumar Garg2015-09-071-9/+21
| | | | | | | | | | | | | | | | | Currently when user execute gluster v detach-tier commit command without starting detach-tier or without giving force option then gluster will success this operation. Detach-tier commit should not allow without giving "force" optioin. Change-Id: Id161c288f6f3e0f6b298878a5c35a49fcbd9c6e3 BUG: 1260185 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/12107 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* glusterd: Don't allow remove brick start/commit if glusterd is down of the ↵Atin Mukherjee2015-08-261-30/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | host of the brick remove brick stage blindly starts the remove brick operation even if the glusterd instance of the node hosting the brick is down. Operationally its incorrect and this could result into a inconsistent rebalance status across all the nodes as the originator of this command will always have the rebalance status to 'DEFRAG_NOT_STARTED', however when the glusterd instance on the other nodes comes up, will trigger rebalance and make the status to completed once the rebalance is finished. This patch fixes two things: 1. Add a validation in remove brick to check whether all the peers hosting the bricks to be removed are up. 2. Don't copy volinfo->rebal.dict from stale volinfo during restore as this might end up in a incosistent node_state.info file resulting into volume status command failure. Change-Id: Ia4a76865c05037d49eec5e3bbfaf68c1567f1f81 BUG: 1245045 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/11726 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
* glusterd: Stop/restart/notify to daemons(svcs) during reset/set on a volumeanand2015-08-061-1/+1
| | | | | | | | | | | | | | | | | | | | problem : Reset/set commands were not working properly. reset command returns success but it not sending notification to svcs if corresponding graph modified. Fix: Whenever reset/set command issued, generate the temp graph and compare with original graph and do the fallowing actions 1.) If both graph are identical nothing to do with svcs. 2.) If any changes in graph topology restart/stop service by calling svc manager. 3) If changes in options send notify signal by calling glusterd_fetchspec_notify. Change-Id: I852c4602eafed1ae6e6a02424814fe3a83e3d4c7 BUG: 1209329 Signed-off-by: anand <anekkunt@redhat.com> Reviewed-on: http://review.gluster.org/10850 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Porting left out log messages to new frameworkNandaja Varma2015-06-261-18/+36
| | | | | | | | | | | Change-Id: I70d40ae3b5f49a21e1b93f82885cd58fa2723647 BUG: 1235538 Signed-off-by: Nandaja Varma <nandaja.varma@gmail.com> Reviewed-on: http://review.gluster.org/11388 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Anand Nekkunti <anekkunt@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* ops/glusterd: Porting messages to new logging frameworkNandaja Varma2015-06-161-121/+228
| | | | | | | | | | Change-Id: Iafeb07aabc1781d98f51c6c2627bf3bbdf493153 BUG: 1194640 Signed-off-by: Nandaja Varma <nandaja.varma@gmail.com> Reviewed-on: http://review.gluster.org/9905 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: subvol_count value for replicate volume should be calculate correctlyGaurav Kumar Garg2015-06-151-2/+2
| | | | | | | | | | | | | | | | | | glusterd was crashing while trying to remove bricks from replica set after shrinking nx3 replica to nx2 replica to nx1 replica. This is because volinfo->subvol_count is calculating value from old replica count value. Change-Id: I1084a71e29c9cfa1cd85bdb4e82b943b1dc44372 BUG: 1230121 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/11165 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
* cluster/tier: account for reordered layoutsDan Lambright2015-06-111-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For a tiered volume the cold subvolume is always at a fixed position in the graph. DHT's layout array, on the other hand, may have the cold subvolume in either the first or second index, therefore code cannot make any assumptions. The fix searches the layout for the correct position dynamically rather than statically. The bug manifested itself in NFS, in which a newly attached subvolume had not received an existing directory. This case is a "stale entry" and marked as such in the layout for that directory. The code did not see this, because it looked at the wrong index in the layout array. The fix also adds the check for decomissioned bricks, and fixes a problem in detach tier related to starting the rebalance process: we never received the right defrag command and it did not get directed to the tier translator. Change-Id: I77cdf9fbb0a777640c98003188565a79be9d0b56 BUG: 1214289 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Joseph Fernandes <josferna@redhat.com> Reviewed-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/11092
* glusterd/tier: glusterd crashed with detach-tier commit forceMohammed Rafi KC2015-06-111-0/+7
| | | | | | | | | | | | | | | glusterd crashed when doing "detach-tier commit force" on a non-tiered volume. Change-Id: I884771893bb80bec46ae8642c2cfd7e54ab116a6 BUG: 1228112 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/11081 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Joseph Fernandes Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>