path: root/xlators/mgmt/glusterd/src/glusterd-store.h
* glusterd: Reducing file operations when writing options into volfile. (Srijan Sivakumar, 2020-09-17; 1 file changed, -1/+14)

Issue: The options to be written into the volfile are in the form of key-value pairs, and the current approach is to invoke the write syscall for each key-value pair. This implies an increased number of system calls.

Code changes:
1. Addition of a structure, glusterd_volinfo_data_store_t, in glusterd-store.h, containing a character buffer, a pointer to gf_store_handle_t, the current length of data in the buffer, as well as a flag for checking the key while storing in the buffer. This is used for passing the required file descriptor as well as having a character buffer for storing multiple options before they are written into a file.
2. Modification of the function _storeopts in glusterd-store.c, which now invokes gf_store_save_items when the buffer is to be emptied into the volfile before further writes into it. It also replaces the functions _storeslaves, _gd_store_rebalance_dict and _store_global_opts.
3. Modification of the function glusterd_store_volinfo_write in glusterd-store.c, wherein a pointer of type glusterd_volinfo_data_store_t is initialized for further operations. Also, the buffer is emptied into the volfile before it is freed.
4. Modification of the function glusterd_store_node_state_write in glusterd-store.c, wherein a pointer of type glusterd_volinfo_data_store_t is initialized for further operations. Also, the buffer is emptied into the volfile before it is freed.
5. Addition of an enum to glusterd-mem-types.h.
6. Modification of the function glusterd_store_options in glusterd-store.c, wherein a pointer of type glusterd_volinfo_data_store_t is initialized for further operations. Also, the buffer is emptied into the volfile before it is freed.

Reasoning behind the approach:
1. Instead of dynamically allocating a buffer or growing it with the number of options, the current approach takes a buffer of fixed size (VOLINFO_BUFFER_SIZE). Before any write into the buffer, the size is checked, and if the new contents exceed the available space, the contents of the buffer are written to the file before the new contents are copied in. Dynamic allocation can lead to increased memory usage, as the number of options that may be added over time is not known in advance and could end up occupying more space than intended.
2. The function dict_foreach is a generic function used across different modules. It made sense not to change its implementation, as that might affect other functionality. Hence a structure was added which can simply be passed as one of the parameters to this function (as it takes a void *).
3. A reduced number of system calls implies an increase in execution speed. These modified functions come into play whenever the volume is started or modified.
4. The functions _storeslaves, _gd_store_rebalance_dict and _store_global_opts were doing the same set of operations as _storeopts, except for the key check. This is now handled with a flag in the glusterd_volinfo_data_store_t structure, which reduces duplicated code.

Signed-off-by: Srijan Sivakumar <ssivakum@redhat.com>
Change-Id: I22e6e91c78ed51e3a171482054d77bf793b9ab16
Fixes: #718
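The buffered-write idea above can be illustrated with a small, self-contained sketch; the structure and helper names below are simplified stand-ins (not the actual glusterd_volinfo_data_store_t or gf_store_* code), and OPT_BUF_SIZE merely stands in for VOLINFO_BUFFER_SIZE:

    #include <stdio.h>
    #include <string.h>

    #define OPT_BUF_SIZE 4096          /* stand-in for VOLINFO_BUFFER_SIZE */

    struct opt_store {
        FILE  *fp;                     /* volfile handle */
        char   buf[OPT_BUF_SIZE];      /* fixed-size staging buffer */
        size_t len;                    /* bytes currently staged */
    };

    /* Flush all staged key-value pairs to the file in a single write. */
    static int opt_store_flush(struct opt_store *s)
    {
        if (s->len && fwrite(s->buf, 1, s->len, s->fp) != s->len)
            return -1;
        s->len = 0;
        return 0;
    }

    /* Stage one "key=value\n" pair; flush first if it would not fit. */
    static int opt_store_add(struct opt_store *s, const char *key, const char *val)
    {
        char line[512];
        int n = snprintf(line, sizeof(line), "%s=%s\n", key, val);

        if (n < 0 || (size_t)n >= sizeof(line))
            return -1;
        if (s->len + (size_t)n > sizeof(s->buf) && opt_store_flush(s) < 0)
            return -1;

        memcpy(s->buf + s->len, line, (size_t)n);
        s->len += (size_t)n;
        return 0;
    }

    int main(void)
    {
        struct opt_store s = { .fp = stdout };

        opt_store_add(&s, "performance.cache-size", "32MB");
        opt_store_add(&s, "cluster.self-heal-daemon", "on");
        return opt_store_flush(&s);    /* empty the buffer before closing */
    }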
* Revert "glusterd: (storhaug) remove ganesha (843e1b0)"Jiffin Tony Thottan2019-08-241-0/+2
| | | | | | | | | please note as an additional change, macro GLUSTERD_GET_SNAP_DIR moved from glusterd-store.c to glusterd-snapshot-utils.h Change-Id: I811efefc148453fe32e4f0d322e80455447cec71 updates: #663 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
* glusterd/thin-arbiter: Thin-arbiter integration with GD1 (Vishal Pandey, 2019-06-28; 1 file changed, -0/+2)

gluster volume create <VOLNAME> replica 2 thin-arbiter 1 <host1>:<brick1> <host2>:<brick2> <thin-arbiter-host>:<path-to-store-replica-id-file> [force]

The changes have been made in a way that the last brick in the bricks list will be treated as the thin-arbiter.

GD1 will be manipulated to consider the replica count to be 2 and continue creating the volume like any other replica 2 volume, but since thin-arbiter volumes need ta-brick client xlator entries for each subvolume in the fuse volfile, volfile generation is modified to inject these entries separately in the volfile for every subvolume.

A few more additions -
1. Save the volinfo with the new fields ta_bricks list and thin_arbiter_count.
2. Introduce a new option client.ta-brick-port to add remote-port to the ta-brick xlator entry in fuse volfiles. The option can be set using the following CLI syntax -
gluster volume set <VOLNAME> client.ta-brick-port <PORTNO.>
3. Volume Info will contain a Thin-Arbiter-path entry to distinguish it from other replicate volumes.

Change-Id: Ib434e2313b29716f32476c6c211d282c4ef39406
Updates #687
Signed-off-by: Vishal Pandey <vpandey@redhat.com>
* glusterd-volgen.c: remove BD xlator from the graph (Yaniv Kaul, 2019-06-18; 1 file changed, -2/+1)

The BD xlator was removed some time ago. Remove it from the graph. We can also remove the caps settings (only the BD xlator was using them) and the document describing the translator.

Change-Id: Id0adcb2952f4832a5dc6301e726874522e07935d
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* glusterd/tier: remove tier related code from glusterd (Hari Gowtham, 2019-05-27; 1 file changed, -11/+0)

The handler functions are pointed to dummy functions. The switch-case handling for tier has also been moved to point to the default case, to avoid issues if it is reintroduced. The tier changes in DHT still remain as such.

updates: bz#1693692
Change-Id: I80d80c9a3eb862b4440a36b31ae82b2e9d92e4dc
Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
* glusterd/store: store all key-values in one shot (Yaniv Kaul, 2019-05-08; 1 file changed, -3/+2)

Instead of saving each key-value pair separately, which is slow (especially as we fflush() after each!), store them all as one string and write them all together.

Implements https://github.com/gluster/glusterfs/issues/629

Change-Id: Ie77a272446b0b6785584b710a4fdd9c613dd9578
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat,.com>
* libglusterfs: Move devel headers under glusterfs directory (ShyamsundarR, 2018-12-05; 1 file changed, -8/+8)

libglusterfs devel package headers are referenced in code using program-style include semantics. While this works, it can be better, especially when dealing with out-of-tree xlator builds or out-of-tree devel package usage in general.

Towards this, the following changes are done:
- moved all devel headers under a glusterfs directory
- included these headers using system header notation <> in all code outside of libglusterfs
- included these headers using own-program notation "" within libglusterfs

This change, although big, just moves the headers around and makes their inclusion correct when including these headers from other sources. This helps us correctly include libglusterfs headers without namespace conflicts.

Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR <srangana@redhat.com>
* glusterd: glusterd to regenerate volfiles when GD_OP_VERSION_MAX changes (Atin Mukherjee, 2018-12-05; 1 file changed, -0/+6)

While glusterd has infrastructure to allow a post-install step of the spec to bring it up in an interim upgrade mode so that all the volfiles are regenerated with the latest executable, in the container world the same methodology is not followed, as a container image always points to a specific gluster rpm and the gluster rpm doesn't go through an upgrade process.

This fix does the following:
1. If the glusterd.upgrade file doesn't exist, regenerate the volfiles.
2. If the maximum-operating-version read from glusterd.upgrade doesn't match GD_OP_VERSION_MAX, glusterd detects it as a version where new options are introduced and regenerates the volfiles.

Tests done:
1. Bring up glusterd, check that the glusterd.upgrade file has been created with the GD_OP_VERSION_MAX value.
2. After 1, restart glusterd and check that glusterd hasn't regenerated the volfiles, as there is no change in GD_OP_VERSION_MAX vs the op-version read from the file.
3. Bump up GD_OP_VERSION_MAX in the code by 1 and, post compilation, restart glusterd; the volfiles should again be regenerated.

Note: The old way of having volfiles regenerated during an rpm upgrade is kept as it is for now, but eventually this can be sunset.

Change-Id: I75b49a1601c71e99f6a6bc360dd12dd03a96414b
Fixes: bz#1651463
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
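A minimal sketch of the upgrade-detection logic described in the fix; the on-disk format of glusterd.upgrade and the helper name here are assumptions for illustration, not the real glusterd implementation:

    #include <stdio.h>

    #define GD_OP_VERSION_MAX 31305        /* placeholder value for illustration */

    /* Return 1 if volfiles must be regenerated, 0 otherwise. */
    static int need_regenerate_volfiles(const char *upgrade_file)
    {
        FILE *fp = fopen(upgrade_file, "r");
        int stored = 0;

        if (!fp)
            return 1;                      /* no glusterd.upgrade: first run after install */

        if (fscanf(fp, "maximum-operating-version=%d", &stored) != 1)
            stored = 0;                    /* unreadable contents: be safe, regenerate */
        fclose(fp);

        return stored != GD_OP_VERSION_MAX;  /* binary carries different max op-version */
    }

    int main(void)
    {
        if (need_regenerate_volfiles("/var/lib/glusterd/glusterd.upgrade"))
            printf("regenerating volfiles and rewriting glusterd.upgrade\n");
        else
            printf("op-version unchanged, volfiles left as-is\n");
        return 0;
    }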
* Land clang-format changes (Gluster Ant, 2018-09-12; 1 file changed, -121/+119)

Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
* Coverity Issue: PW.INCLUDE_RECURSION in several files (Girjesh Rajoria, 2017-11-09; 1 file changed, -1/+0)

Coverity ID: 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 423, 424, 425, 426, 427, 428, 429, 436, 437, 438, 439, 440, 441, 442, 443
Issue: Event include_recursion

Removed redundant, recursive includes from the files.

Change-Id: I920776b1fa089a2d4917ca722d0075a9239911a7
BUG: 789278
Signed-off-by: Girjesh Rajoria <grajoria@redhat.com>
* snapshot: Issue with other processes accessing the mounted brick (Sunny Kumar, 2017-10-23; 1 file changed, -0/+4)

Added code for unmounting the activated snapshot brick during the snapshot deactivation process, which makes sense as the mount point for deactivated bricks should not exist.

Removed code for mounting a newly created snapshot, as newly created snapshots should not be mounted until they are activated.

Added code for mount point creation and snapshot mount during snapshot activation.

Added validation during glusterd init for mounting only those snapshots whose status is either STARTED or RESTORED.

During snapshot restore, the mount point for a stopped snap should exist, as it is required to set extended attributes.

During handshake, after getting updates from a friend, the mount point should exist for an activated snapshot and should not exist for a deactivated snapshot.

While getting snap status we should show relevant information for deactivated snapshots; after this patch the 'gluster snap status' command will show output like -

Snap Name : snap1
Snap UUID : snap-uuid
Brick Path : server1:/run/gluster/snaps/snap-vol-name/brick
Volume Group : N/A (Deactivated Snapshot)
Brick Running : No
Brick PID : N/A
Data Percentage : N/A
LV Size : N/A

Fixes: #276
Change-Id: I65783488e35fac43632615ce1b8ff7b8e84834dc
BUG: 1482023
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* glusterd: glusterd fails to start when peer's network interface is down (Gaurav Yadav, 2017-07-28; 1 file changed, -0/+1)

Problem:
glusterd fails to start on nodes where glusterd tries to come up even before the network is up.

Fix:
On startup glusterd tries to resolve the brick path, which is based on hostname/ip, but in the above scenario, when the network interface is not up, glusterd is not able to resolve the brick path using the IP address or hostname. With this fix glusterd will use the UUID to resolve the brick path.

Change-Id: Icfa7b2652417135530479d0aa4e2a82b0476f710
BUG: 1472267
Signed-off-by: Gaurav Yadav <gyadav@redhat.com>
Reviewed-on: https://review.gluster.org/17813
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Prashanth Pai <ppai@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
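A rough sketch of the idea of deciding brick ownership by UUID instead of by hostname/IP resolution; the struct and field names are illustrative stand-ins, not the actual glusterd_brickinfo_t handling (link with -luuid):

    #include <stdio.h>
    #include <uuid/uuid.h>

    struct brick {
        char   host[64];
        uuid_t peer_uuid;        /* UUID of the peer that owns the brick, from the store */
    };

    static int brick_is_local(const struct brick *b, const uuid_t my_uuid)
    {
        /* No gethostbyname()/getaddrinfo() involved, so this check works
         * even before the network interface is up.                        */
        return uuid_compare(b->peer_uuid, my_uuid) == 0;
    }

    int main(void)
    {
        struct brick b = { .host = "server1" };
        uuid_t me;

        uuid_generate(me);
        uuid_copy(b.peer_uuid, me);          /* pretend the store says it's ours */
        printf("local: %d\n", brick_is_local(&b, me));
        return 0;
    }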
* posix: option to handle the shared bricks for statvfs() (Amar Tumballi, 2017-07-24; 1 file changed, -0/+1)

Currently the 'storage/posix' xlator has an option `export-statfs-size no`, which exports zero as the value for a few fields in `struct statvfs`. In the case of a backend brick shared between multiple brick processes, the values of these fields should be `field_value / number-of-bricks-at-node`. This way, even the issue of 'min-free-disk' etc. at different layers would also be handled properly when the statfs() syscall is made.

Fixes #241
Change-Id: I2e320e1fdcc819ab9173277ef3498201432c275f
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: https://review.gluster.org/17618
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
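A small sketch of the per-brick statvfs scaling described above; this is purely illustrative, the real logic sits in the storage/posix xlator's statfs path:

    #include <stdio.h>
    #include <sys/statvfs.h>

    static void share_statvfs(struct statvfs *buf, unsigned long bricks_on_backend)
    {
        if (bricks_on_backend < 2)
            return;                          /* nothing to divide */

        /* Scale the block counts so each brick reports only its share. */
        buf->f_blocks /= bricks_on_backend;
        buf->f_bfree  /= bricks_on_backend;
        buf->f_bavail /= bricks_on_backend;
    }

    int main(void)
    {
        struct statvfs buf;

        if (statvfs("/", &buf) != 0)
            return 1;

        share_statvfs(&buf, 3);              /* e.g. three bricks on this filesystem */
        printf("per-brick blocks: %llu\n", (unsigned long long)buf.f_blocks);
        return 0;
    }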
* glusterd: (storhaug) remove ganesha (Kaleb S. KEITHLEY, 2017-03-21; 1 file changed, -1/+0)

Remove all vestiges of ganesha.

The storhaug CLI is used to manage ganesha and Samba. Also, any setup and teardown of the ganesha HA is initiated using storhaug to preserve the proper layering.

Change-Id: I0eec0016a1b7802a36e7b2d92896b86fdf8607d5
BUG: 1420713
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: https://review.gluster.org/16504
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* tier: Tier as a service (hari gowtham, 2017-01-16; 1 file changed, -0/+9)

tierd is implemented by separating it from the rebalance process.

The commands affected:
1) Attach tier will trigger this process instead of the old one.
2) tier start and tier start force will also trigger this process.
3) volume status [tier] will show the tier daemon as a process instead of a task, and normal tier status and tier detach status work.
4) tier stop implemented.
5) detach tier implemented separately along with the new detach tier status.
6) volume tier volname status will work using the changes.
7) volume set works.

This patch has separated the tier translator from the legacy DHT rebalance code. It now sends the RPCs from the CLI to glusterd separately from the DHT rebalance code. The daemon is now a service, similar to the snapshot daemon, and can be viewed using the volume status command. The code for the validation and commit phase is the same as the earlier tier validation code in DHT rebalance. The "brickop" phase has been changed so that the status command can use this framework.

The service management framework is now used. DHT rebalance does not use this framework.

This service framework takes care of:
*) spawning the daemon, killing it and other such processes.
*) volume set options, which are written to the volfile.
*) restart and reconfigure functions. Restart is to restart the daemon at two points: 1) after gluster goes down and comes up, and 2) to stop detach tier.
*) reconfigure is used to make immediate volfile changes, so that the daemon does not have to be restarted. It also has the code to rewrite the volfile for topological changes (which come into play during add and remove brick).

With this patch the log, pid, and volfile are separated and put into their respective directories.

Change-Id: I3681d0d66894714b55aa02ca2a30ac000362a399
BUG: 1313838
Signed-off-by: hari gowtham <hgowtham@redhat.com>
Reviewed-on: http://review.gluster.org/13365
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: hari gowtham <hari.gowtham005@gmail.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: glusterd must store all rebalance related information (Sakshi Bansal, 2016-07-04; 1 file changed, -0/+7)

Change-Id: I8404b864a405411e3af2fbee46ca20330e656045
BUG: 1351021
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/14827
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: persist brickinfo->real_pathAtin Mukherjee2016-04-291-0/+1
| | | | | | | | | | | | | | | | | Since real_path was not persisted and gets constructed at every glusterd restart, glusterd will fail to come up if one of the brick's underlying file system is crashed. Solution is to construct real_path only once and get it persisted. Change-Id: I97abc30372c1ffbbb2d43b716d7af09172147b47 BUG: 1330481 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/14075 CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
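A minimal sketch of resolving and caching real_path once so it can be persisted instead of being reconstructed at every restart; the struct here is a stand-in for the relevant part of glusterd_brickinfo_t:

    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct brickinfo {
        char path[PATH_MAX];       /* path as given by the user */
        char real_path[PATH_MAX];  /* canonical path, persisted to the store */
    };

    static int brick_resolve_real_path(struct brickinfo *b)
    {
        char resolved[PATH_MAX];

        if (!realpath(b->path, resolved))
            return -1;             /* backend missing or crashed: caller decides */

        snprintf(b->real_path, sizeof(b->real_path), "%s", resolved);
        return 0;
    }

    int main(void)
    {
        struct brickinfo b;

        snprintf(b.path, sizeof(b.path), "/tmp");
        if (brick_resolve_real_path(&b) == 0)
            printf("real_path=%s\n", b.real_path);   /* value that would be stored */
        return 0;
    }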
* mgmt/glusterd: Store arbiter-count and restore it (Pranith Kumar K, 2015-11-04; 1 file changed, -0/+1)

Problem:
1) Glusterd doesn't remember arbiter information of a replica volume in the store. When glusterd goes down and comes back up, arbiter volumes will become plain replica volumes.
2) Glusterd doesn't import/export arbiter information to/from the other peers.
3) Volume info doesn't show any arbiter count in the output.

Fix:
1) Persist arbiter information in glusterd-store.
2) Import/export arbiter information of the volume.
3) Change volume info output to show arbiter count.

Change-Id: I2db81e73d2694b01f7d07b08a17b41ad5a55c361
BUG: 1276675
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/12475
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* quota: add version to quota xattrs (vmallika, 2015-11-02; 1 file changed, -0/+1)

When quota is disabled and the clean-up process terminates without completely cleaning up the quota xattrs, the accounting can get messed up when quota is enabled again.

A version number is now suffixed to all quota xattrs. This version number is specific to the marker xlator, i.e. when quota xattrs are requested by quotad/client, marker will remove the version suffix from the key before sending the response.

Change-Id: I1ca2c11460645edba0f6b68db70d476d8d26e1eb
BUG: 1272411
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/12386
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
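A small sketch of the version-suffix idea for quota xattr keys: store keys with a version appended and strip it before answering clients. The key name and suffix format below are assumptions for illustration only, not the actual marker xlator behaviour:

    #include <stdio.h>
    #include <string.h>

    /* Build the on-disk key for the current quota version. */
    static void quota_key_with_version(char *out, size_t outlen,
                                       const char *key, unsigned version)
    {
        snprintf(out, outlen, "%s.%u", key, version);
    }

    /* Strip a trailing ".<digit...>" so clients see the plain key. */
    static void quota_key_strip_version(char *key)
    {
        char *dot = strrchr(key, '.');

        if (dot && dot[1] >= '0' && dot[1] <= '9')
            *dot = '\0';
    }

    int main(void)
    {
        char key[256];

        quota_key_with_version(key, sizeof(key), "trusted.glusterfs.quota.size", 2);
        printf("stored as : %s\n", key);
        quota_key_strip_version(key);
        printf("sent back : %s\n", key);
        return 0;
    }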
* build: do not #include "config.h" in each fileNiels de Vos2015-05-291-5/+0
| | | | | | | | | | | | | | | | | | Instead of including config.h in each file, and have the additional config.h included from the compiler commandline (-include option). When a .c file tests for a certain #define, and config.h was not included, incorrect assumtions were made. With this change, it can not happen again. BUG: 1222319 Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/10808 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* quota: quota.conf backward compatibility fix (vmallika, 2015-05-28; 1 file changed, -0/+6)

In release-3.7 the format of quota.conf has changed. This causes backward compatibility issues during upgrade:
1) There can be an issue during peer sync between a 3.6 node and a 3.7 node.
2) If the user sets/removes a limit, the format of the file will differ between the 3.6 node and the 3.7 node.

This patch fixes the issue:
1) Restrict the user from executing the commands quota enable, limit-usage and remove.
2) Write quota.conf in the older format if the op-version is less than 3.6.

Change-Id: Ib76f5a0a85394642159607a105cacda743e7d26b
BUG: 1223739
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/10889
Tested-by: NetBSD Build System
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
* cli/tiering: volume info should display details about tier (Mohammed Rafi KC, 2015-05-10; 1 file changed, -0/+1)

>> gluster volume info patchy

Volume Name: patchy
Type: Tier
Volume ID: 8bf1a1ca-6417-484f-821f-18973a7502a8
Status: Created
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Replicate
Number of Bricks: 1 x 2 = 2
Brick1: hostname:/home/brick30
Brick2: hostname:/home/brick31
Cold Bricks:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 2) = 6
Brick3: hostname:/home/brick20
Brick4: hostname:/home/brick21
Brick5: hostname:/home/brick23
Brick6: hostname:/home/brick24
Brick7: hostname:/home/brick25
Brick8: hostname:/home/brick26

Change-Id: I7b9025af81263ebecd641b4b6897b20db8b67195
BUG: 1212400
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/10339
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
* NFS-Ganesha: Handling CLI commands when NFS-Ganesha keys are set (Meghana Madhusudhan, 2015-04-30; 1 file changed, -0/+1)

When ganesha.enable is set to on and features.ganesha is enabled, there are a few behaviour changes that should be seen in other volume operations.

1. ganesha.enable can be set to 'on' only when features.ganesha is set to 'enable'.
2. When a gluster vol is started, and if the ganesha.enable key was set to 'on', it should automatically export the volume via NFS-Ganesha.
3. When ganesha.enable is set to 'on', and a volume is stopped, that volume should be unexported via NFS-Ganesha.
4. gluster vol reset <volname>
If ganesha.enable was set to on, then unexport the volume via NFS-Ganesha.
5. gluster vol reset all
If features.ganesha is set to enable, as part of reset all, set it to disable. This translates to tearing down the cluster.

All the above problems are fixed by checking the global key and value; depending on the value, specific functions are called. Also, functions related to global commands are moved to cli-cmd-global.c.

The commit phase of features.ganesha enable/disable runs the ganesha-ha.sh setup/teardown respectively. Before the script begins, it is important that the NFS-Ganesha service starts on all the HA nodes. Having the service start commands in the commit phase could lead to problems, so the pre-requisite service start commands are moved to the 'stage' phase.

Change-Id: I5a256f94f8e1310ddcd5369f329b7168b2a24c47
BUG: 1200265
Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com>
Reviewed-on: http://review.gluster.org/10283
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* build: make contrib/uuid dependency optional (Niels de Vos, 2015-04-10; 1 file changed, -1/+1)

On Linux systems we should use the libuuid from the distribution and not bundle and statically link the contrib/uuid/ bits.

libglusterfs/src/compat-uuid.h has been introduced and should become an abstraction layer for different UUID APIs. Non-Linux operating systems should implement their compatibility layer there. Once all operating systems have an implementation in compat-uuid.h, we can remove contrib/uuid/ from the repository completely.

Change-Id: I345e5357644be2521685e00358bb8c83c4ea0577
BUG: 1206587
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/10129
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
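A rough sketch of what an abstraction layer like compat-uuid.h can look like on Linux, wrapping the distribution's libuuid behind small inline helpers; the wrapper names are illustrative, not the actual header contents (link with -luuid):

    #include <stdio.h>
    #include <uuid/uuid.h>         /* distribution libuuid instead of contrib/uuid */

    typedef uuid_t gf_uuid_t;      /* other platforms would typedef their own type */

    static inline void gf_uuid_generate(gf_uuid_t u)                  { uuid_generate(u); }
    static inline void gf_uuid_unparse(const gf_uuid_t u, char *out)  { uuid_unparse(u, out); }

    int main(void)
    {
        gf_uuid_t u;
        char str[37];              /* 36 chars + NUL, per uuid_unparse(3) */

        gf_uuid_generate(u);
        gf_uuid_unparse(u, str);
        printf("%s\n", str);
        return 0;
    }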
* glusterd: CLI commands to create and manage tiered volumes. (Dan Lambright, 2015-03-19; 1 file changed, -0/+8)

A tiered volume is a normal volume with some number of new bricks representing "hot" storage. The "hot" bricks can be attached or detached dynamically to a normal volume. When this happens, a new graph is constructed. The root of the new graph is an instance of the tier translator. One subvolume of the tier translator leads to the old volume, and another leads to the new hot bricks.

attach-tier <VOLNAME> [<replica> <COUNT>] <NEW-BRICK> ... [force]
volume detach-tier <VOLNAME> [replica <COUNT>] <BRICK> ... <start|stop|status|commit|force>
gluster volume rebalance <volume> tier start
gluster volume rebalance <volume> tier stop
gluster volume rebalance <volume> tier status

The "tier start" CLI command starts a server side daemon. The daemon initiates file level migration based on caching policies. The daemon's status can be monitored and stopped. Note development on the "tier status" command is incomplete. It will be added in a subsequent patch.

When the "hot" storage is detached, the tier translator is removed from the graph and the tiered volume reverts to its original state as described in the volume's info file.

For more background and design see the feature page [1].

[1] http://www.gluster.org/community/documentation/index.php/Features/data-classification

Change-Id: Ic8042ce37327b850b9e199236e5be3dae95d2472
BUG: 1194753
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/9753
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: nfs,shd,quotad,snapd daemons refactoring (Atin Mukherjee, 2015-02-20; 1 file changed, -0/+4)

This patch ports nfs, shd, quotad & snapd with the approach suggested in http://www.gluster.org/pipermail/gluster-devel/2014-December/043180.html

Change-Id: I4ea5b38793f87fc85cc9d2cf873727351dedffd2
BUG: 1191486
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/9428
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Nekkunti <anekkunt@redhat.com>
* features/changelog: Cleanup .processing and .current directory (Aravinda VK, 2015-01-18; 1 file changed, -14/+0)

On changelog_register, clean up .processing, .history/.processing, .current and .history/.current from the working directory.

Moved glusterd_recursive_rmdir and glusterd_for_each_entry to a common place (libglusterfs) and renamed them recursive_rmdir and GF_FOR_EACH_ENTRY_IN_DIR respectively.

BUG: 1162057
Change-Id: I1f98468a344cead039026762a805437b2f9e507b
Signed-off-by: Aravinda VK <avishwan@redhat.com>
Reviewed-on: http://review.gluster.org/9082
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
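A sketch of a recursive directory-removal helper of the kind described above; the real recursive_rmdir lives in libglusterfs and uses gluster's own dirent iteration macro, so this plain POSIX version is illustrative only:

    #include <dirent.h>
    #include <limits.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static int rmdir_recursive(const char *path)
    {
        DIR *dir = opendir(path);
        struct dirent *ent;
        char child[PATH_MAX];
        struct stat st;

        if (!dir)
            return -1;

        while ((ent = readdir(dir)) != NULL) {
            if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
                continue;
            snprintf(child, sizeof(child), "%s/%s", path, ent->d_name);
            if (lstat(child, &st) == 0 && S_ISDIR(st.st_mode))
                rmdir_recursive(child);     /* recurse into subdirectories */
            else
                unlink(child);              /* plain files, symlinks, etc. */
        }
        closedir(dir);
        return rmdir(path);                 /* directory is empty now */
    }

    int main(void)
    {
        mkdir("/tmp/rmdir-demo", 0700);
        mkdir("/tmp/rmdir-demo/sub", 0700);
        return rmdir_recursive("/tmp/rmdir-demo");
    }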
* glusterd/snapshot: Snapshot should be deactivated when it is created (vmallika, 2014-11-12; 1 file changed, -0/+1)

By default a snapshot should be deactivated, and this should be a configurable option. This behaviour can be configured by the command below:

gluster snapshot config activate-on-create <enable|disable>

Change-Id: I1911595c32beed43bb2fca4bf99f0d264b422513
BUG: 1157991
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/8985
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
* glusterd/snapshot: Inherit the mount options of an original brick when creating snapshots (Vijaikumar M, 2014-08-03; 1 file changed, -0/+1)

When creating a snapshot, an LVM volume is created at the backend and is mounted under /var/run/gluster/snaps/... However, this mount does not inherit the mount options of the original brick acting as the parent for the snap. If the snap is restored, this could lead to performance degradation, functional limitations, or in extreme scenarios even potential data loss.

Change-Id: I67d70fd83430d83dacc5380c6c928e27fb9c9e1b
BUG: 1125180
Signed-off-by: Vijaikumar M <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/8394
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* cli/glusterd: Added support for dispersed volumes (Xavier Hernandez, 2014-07-11; 1 file changed, -0/+2)

Two new options have been added to the 'create' command of the cli interface:

disperse [<count>] redundancy <count>

Both are optional. A dispersed volume is created by specifying, at least, one of them.

If 'disperse' is missing, or it's present but '<count>' is not, the number of bricks enumerated in the command line is taken as the disperse count.

If 'redundancy' is missing, the lowest optimal value is assumed. A configuration is considered optimal (for most workloads) when the disperse count minus the redundancy count is a power of 2 (see the sketch after this entry). If the resulting redundancy is 1, the volume is created normally, but if it's greater than 1, a warning is shown to the user and he/she must answer yes/no to continue volume creation. If there isn't any optimal value for the given number of bricks, a warning is also shown and, if the user accepts, a redundancy of 1 is used. If 'redundancy' is specified and the resulting volume is not optimal, another warning is shown to the user.

A distributed-disperse volume can be created using a number of bricks that is a multiple of the disperse count.

Change-Id: Iab93efbe78e905cdb91f54f3741599f7ea6645e4
BUG: 1118629
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/7782
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
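A small sketch of the "optimal configuration" rule described above, i.e. treating a layout as optimal when the disperse count minus the redundancy count is a power of 2; the helper names are illustrative:

    #include <stdio.h>

    static int is_power_of_two(unsigned int n)
    {
        return n != 0 && (n & (n - 1)) == 0;
    }

    static int disperse_config_is_optimal(unsigned int disperse, unsigned int redundancy)
    {
        return redundancy < disperse && is_power_of_two(disperse - redundancy);
    }

    int main(void)
    {
        /* 6 bricks with redundancy 2 -> 4 data bricks, a power of two: optimal. */
        printf("6/2 optimal: %d\n", disperse_config_is_optimal(6, 2));
        /* 5 bricks with redundancy 1 -> 4 data bricks: also optimal.            */
        printf("5/1 optimal: %d\n", disperse_config_is_optimal(5, 1));
        /* 7 bricks with redundancy 1 -> 6 data bricks: warns the user.          */
        printf("7/1 optimal: %d\n", disperse_config_is_optimal(7, 1));
        return 0;
    }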
* glusterd/snapshot: Change file-system uuid to file-system label (Rajesh Joseph, 2014-07-03; 1 file changed, -0/+1)

Problem: In XFS, changing the file-system UUID with xfs_admin causes too much delay on a large file-system. The time taken by the xfs_admin tool to change the UUID is directly proportional to the size of the file system.

Cause: In XFS the file-system UUID is stored in the file-system superblock. Therefore, for changing the UUID, all the superblocks need to be changed.

Fix: Instead of using the file-system UUID, use the file-system label.

Change-Id: Ifb4c668fb29cfc1c89d9b221abc8d09dc09589ec
BUG: 1115107
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-on: http://review.gluster.org/8215
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: display snapd status as part of volume status (Raghavendra Bhat, 2014-06-30; 1 file changed, -0/+1)

* Made changes to save the port used by snapd in the info file for the volume, i.e. <glusterd-working-directory>/vols/<volname>/info

This is how the gluster volume status of a volume would look like for which the uss feature is enabled.

[root@tatooine ~]# gluster volume status vol
Status of volume: vol
Gluster process                             Port    Online  Pid
------------------------------------------------------------------------------
Brick tatooine:/export1/vol                 49155   Y       5041
Snapshot Daemon on localhost                49156   Y       5080
NFS Server on localhost                     2049    Y       5087

Task Status of Volume vol
------------------------------------------------------------------------------
There are no active volume tasks

Change-Id: I8f3e5d7d764a728497c2a5279a07486317bd7c6d
BUG: 1111041
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/8114
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
* glusterd/snapshot: Provide enable/disable option for snapshot auto-delete feature. (Sachin Pandit, 2014-06-13; 1 file changed, -0/+1)

This patch provides an interface to enable or disable the auto-delete feature.

Syntax: gluster snapshot config auto-delete <enable/disable>

DETAILS:
1) When the auto-delete feature is disabled, if the soft-limit is reached then the user is given a warning about exceeding the soft-limit along with a successful snapshot creation message (the oldest snapshot is not deleted). And upon reaching the hard-limit, further snapshot creation is not allowed.

Example:
Case - 1: Upon reaching soft-limit
Snapshot create: snap successfully created.
Warning: soft-limit of volume (vol) is reached. Snapshot creation is not possible once hard-limit is reached.

Case - 2: Upon reaching hard-limit
Snapshot create: snap creation failed.
Error: hard-limit of volume (vol) is reached, hence it is not possible to take further snapshots. Please delete a few snapshots of the volume (vol) before taking another snapshot.

2) When the auto-delete feature is enabled, then as soon as the soft-limit is reached the oldest snapshot is deleted for every successful snapshot creation (same as the existing method). With this it is made sure that the number of snapshots created is not more than snap-max-hard-limit.

Change-Id: Ie3ca64bbd2c763371f541cd2e378314e73b695b4
BUG: 1105415
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/8017
Tested-by: Justin Clift <justin@gluster.org>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Glusterd/Rebalance: Update rebalance status properly in node_state.info (Susant Palai, 2014-05-26; 1 file changed, -0/+1)

credit: kaushal@redhat.com
        spalai@redhat.com

Change-Id: I08d0771e2168a4a6ebd473e8a937b8b2eda1341a
BUG: 1075087
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/7214
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
* glusterd: Fetch brick mount_dirs during brick create. (Avra Sengupta, 2014-05-06; 1 file changed, -0/+1)

Fetch the mount directory path for a brick during volume create, add-brick, and replace-brick. When a snap-create is missed, use this mount directory information to create the brick path for the missed snap brick.

Change-Id: Iad3eec96a32cf340f26bdf3f28e2f529e4b77e31
BUG: 1061685
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/7550
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* glusterd/snapshot: Restore cleanup (Rajesh Joseph, 2014-05-02; 1 file changed, -0/+2)

If restore fails for some reason then we should revert the restore operation. To do so we take a backup of the vols folder before doing a restore, and if the restore fails then we revert the changes done.

Change-Id: I97f72aec3a34fc122bf137beb336e94db3a04dff
BUG: 1061685
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-on: http://review.gluster.org/7548
Reviewed-by: Santosh Pradhan <spradhan@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/snapshot: Perform missed snap deletes and restores. (Avra Sengupta, 2014-04-28; 1 file changed, -1/+1)

Replacing is_volume_restored (gf_boolean_t) with restored_from_snap (uuid_t) in glusterd_volinfo_. Also moved gd_restore_snap_volume from glusterd-volgen.c to glusterd-snapshot.c.

Change-Id: Ic615a1658cfaffa98d4590506ac82f20bf709ad6
BUG: 1089906
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/7455
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/snapshot: Adding snap_vol_id and snap_uuid to missed_snap_list (Avra Sengupta, 2014-04-27; 1 file changed, -2/+1)

Persisting missing snapshot info on disk as well as in memory in the following format:

-------------NODE-UUID--------------:--------------SNAP-UUID-------------=---------SNAP-VOL-ID------------:BRICKNUM:-------BRICKPATH--------:OPERATION:STATUS
927cb5fe-63da-48f5-82f6-e6a09ddc81c4:8258b18f-d408-483d-8239-204039dc6397=a17b4fe42c5a45f7a916438643edaa13: 3 :/brick/brick-dirs/brick3: 1 : 1
927cb5fe-63da-48f5-82f6-e6a09ddc81c4:8258b18f-d408-483d-8239-204039dc6397=a17b4fe42c5a45f7a916438643edaa13: 3 :/brick/brick-dirs/brick3: 3 : 1
927cb5fe-63da-48f5-82f6-e6a09ddc81c4:8258b18f-d408-483d-8239-204039dc6397=83a3cc05453b46b2a7eda4c9a9208638: 3 :/brick/brick-dirs/brick3: 1 : 1

This data will be stored on disk at /var/lib/glusterd/snaps/missed_snaps_list

In memory we maintain the data as a list of glusterd_missed_snap_info in conf; the key for this list is the first two fields, i.e. NODE-UUID:SNAP-UUID. For every NODE-UUID:SNAP-UUID, there can be multiple operations missed on multiple bricks. So we maintain a list of glusterd_snap_op_t for every node of glusterd_missed_snap_info.

This list is maintained or updated during snapshot create, delete, and restore operations, which are the only operations that, if missed, are recorded in this list.

During snapshot create, if a node is down, or a brick is down, we don't receive their mount point infos. The snap_status of such bricks is marked as -1, and their brick details are added to this list.

During snapshot delete, we check from the originator node if any other nodes holding bricks of the said snap are down. Those are also added to the list. Also, if the node is up but the snapshot was pending for a snap brick and its snap_status is -1, we add that to the list too. When a subsequent delete entry is processed for an already existing create entry, we just mark the create entry's status as done (2), and don't add the delete entry to the list.

During snapshot restore, we check from the originator node if any other nodes holding bricks of the said snap are down. Those are also added to the list. Also, if the node is up but the snapshot was pending for a snap brick and its snap_status is -1, we add that to the list too. Like delete, when a subsequent restore entry is processed for an already existing create entry, we just mark the create entry's status as done (2), and don't add the restore entry to the list.

Change-Id: I54f63e28d3c40555d0f84528f38227103171f594
BUG: 1061685
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/7454
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* gluster: GlusterFS Volume Snapshot Feature (Avra Sengupta, 2014-04-11; 1 file changed, -32/+64)

This is the initial patch for the Snapshot feature. The current patch includes the following features:
* Snapshot create
* Snapshot delete
* Snapshot restore
* Snapshot list
* Snapshot info
* Snapshot status
* Snapshot config

Change-Id: I2f46920c0d61c515f6a60e0f8b46fff886d9f6a9
BUG: 1061685
Signed-off-by: shishir gowda <sgowda@redhat.com>
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Signed-off-by: Vijaikumar M <vmallika@redhat.com>
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/7128
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: persistent client xlator/afr changelog names (Ravishankar N, 2014-03-24; 1 file changed, -0/+1)

- Add a unique brick-id field to glusterd_brickinfo_t
- Persist the id to the brickinfo file
- Use the brick-id as the client xlator name during vol create, add-brick and replace-brick operations.
- For older volumes, generate the id in-memory during glusterd restore but defer writing it to the brickinfo file until the next volume set operation.
- Send and receive the brick-ids during peer probe.

Feature page: www.gluster.org/community/documentation/index.php/Features/persistent-AFR-changelog-xattributes
Related patch: http://review.gluster.org/#/c/7122

Change-Id: Ib7f1570004e33f4144476410eec2b84df4e41448
BUG: 1066778
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/7155
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cli/glusterd: Changes to quota command - Quota feature re-work. (Raghavendra G, 2013-11-26; 1 file changed, -0/+10)

Following are the cli commands that are new/re-worked:
======================================================
volume quota <VOLNAME> {enable|disable|list [<path> ...]|remove <path>|default-soft-limit <percent>} |
volume quota <VOLNAME> {limit-usage <path> <size> [<percent>]} |
volume quota <VOLNAME> {alert-time|soft-timeout|hard-timeout} {<time>}
volume status [all | <VOLNAME> [nfs|shd|<BRICK>|quotad]] [detail|clients|mem|inode|fd|callpool]
volume statedump <VOLNAME> [nfs|quotad] [all|mem|iobuf|callpool|priv|fd|inode|history]

glusterd changes:
=================
* Quota limits are now set as extended attributes by glusterd from the aux mount created by the cli.
* The gfids of the directories on which quota limits are set for a given volume are stored in the /var/lib/glusterd/vols/<volname>/quota.conf file in binary format, whose cksum and version are stored in /var/lib/glusterd/vols/<volname>/quota.cksum.

Original-author: Krutika Dhananjay <kdhananj@redhat.com>
Original-author: Krishnan Parthasarathi <kparthas@redhat.com>
BUG: 969461
Change-Id: If32bba36c67f9c2a30417af9c6389045b2b7c13b
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/6003
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
* bd: posix/multi-brick support to BD xlator (M. Mohan Kumar, 2013-11-13; 1 file changed, -0/+2)

The current BD xlator (block backend) has a few limitations, such as:
* Creation of directories not supported
* Supports only a single brick
* Does not use extended attributes (and client gfid) like the posix xlator
* Creation of special files (symbolic links, device nodes etc.) not supported

The basic limitation of not allowing directory creation is blocking oVirt/VDSM from consuming the BD xlator as part of a Gluster domain, since VDSM creates multi-level directories when GlusterFS is used as the storage backend for storing VM images.

To overcome these limitations a new BD xlator with the following improvements is suggested:
* New hybrid BD xlator that handles both regular files and block device files
* The volume will have both POSIX and BD bricks. Regular files are created on POSIX bricks, block devices are created on the BD brick (VG)
* The BD xlator leverages the existing POSIX xlator for most POSIX calls and hence sits above the POSIX xlator
* A block device file is differentiated from a regular file by an extended attribute
* The xattr 'user.glusterfs.bd' (BD_XATTR) plays a role in mapping a posix file to a Logical Volume (LV).
* When a client sends a request to set BD_XATTR on a posix file, a new LV is created and mapped to the posix file. So every block device will have a representative file in the POSIX brick with 'user.glusterfs.bd' (BD_XATTR) set.
* Hereafter all operations on this file result in LV-related operations. For example, opening a file that has BD_XATTR set results in opening the LV block device, and reading results in reading the corresponding LV block device.

When the BD xlator gets a request to set BD_XATTR via a setxattr call, it creates an LV and places information about this LV in the xattr of the posix file. The xattr "user.glusterfs.bd" is used to identify that the posix file is mapped to a BD.

Usage:

Server side:
[root@host1 ~]# gluster volume create bdvol host1:/storage/vg1_info?vg1 host2:/storage/vg2_info?vg2
It creates a distributed gluster volume 'bdvol' with Volume Group vg1 using posix brick /storage/vg1_info in host1 and Volume Group vg2 using /storage/vg2_info in host2.
[root@host1 ~]# gluster volume start bdvol

Client side:
[root@node ~]# mount -t glusterfs host1:/bdvol /media
[root@node ~]# touch /media/posix
It creates a regular posix file 'posix' in either the host1:/vg1 or host2:/vg2 brick
[root@node ~]# mkdir /media/image
[root@node ~]# touch /media/image/lv1
It also creates a regular posix file 'lv1' in either the host1:/vg1 or host2:/vg2 brick
[root@node ~]# setfattr -n "user.glusterfs.bd" -v "lv" /media/image/lv1
[root@node ~]#
The above setxattr results in creating a new LV in the corresponding brick's VG, and it sets 'user.glusterfs.bd' with value 'lv:<default-extent-size>'
[root@node ~]# truncate -s5G /media/image/lv1
It results in resizing LV 'lv1' to 5G

The new BD xlator code is placed in the xlators/storage/bd directory.

Also, the volume-uuid is added to the VG so that the same VG can't be used for other bricks/volumes. After deleting a gluster volume, one has to manually remove the associated tag using vgchange <vg-name> --deltag <trusted.glusterfs.volume-id:<volume-id>>

Changes from previous version V5:
* Removed support for delayed deleting of LVs

Changes from previous version V4:
* Consolidated the patches
* Removed usage of BD_XATTR_SIZE and consolidated it in BD_XATTR.

Changes from previous version V3:
* Added support in FUSE to support full/linked clone
* Added support to merge snapshots and provide information about origin
* bd_map xlator removed
* iatt structure used in inode_ctx. iatt is cached and updated during fsync/flush
* aio support
* Type and capabilities of volume are exported through getxattr

Changes from version 2:
* Used inode_context for caching BD size and to check if loc/fd is BD or not.
* Added GlusterFS server offloaded copy and snapshot through setfattr FOP. As part of this libgfapi is modified.
* BD xlator supports stripe
* During unlinking, if an LV file is already opened, it is added to a delete list and bd_del_thread tries to delete from this list when the last reference to that file is closed.

Changes from previous version:
* gfid is used as the name of the LV
* ? is used to specify the VG name for creating a BD volume in volume create, add-brick: gluster volume create volname host:/path?vg
* open-behind issue is fixed
* A replicate brick can be added dynamically and LVs from the source brick are replicated to the destination brick
* A distribute brick can be added dynamically and the rebalance operation distributes existing LVs/files to the new brick
* Thin provisioning support added.
* bd_map xlator support retained
* setfattr -n user.glusterfs.bd -v "lv" creates a regular LV and setfattr -n user.glusterfs.bd -v "thin" creates a thin LV
* Capability and backend information added to gluster volume info (and --xml) so that management tools can exploit the BD xlator.
* Tracing support for the bd xlator added

TODO:
* Add support to display snapshots for a given LV
* Display posix filename for list-origin instead of gfid

Change-Id: I00d32dfbab3b7c806e0841515c86c3aa519332f2
BUG: 1028672
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/4809
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
* bd_map: Remove bd_map xlator (M. Mohan Kumar, 2013-11-13; 1 file changed, -1/+1)

Remove the bd_map xlator and CLI related changes.

Change-Id: If7086205df1907127c1a1fa4ba603f1c48421d09
BUG: 1028672
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/5747
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Calculate volume op-versions only on set/reset (Kaushal M, 2013-09-29; 1 file changed, -0/+2)

The volume op-versions are calculated during a volume set/reset, when reading a volume from disk, and when importing a volume during probe or volume sync. The calculation of the volume op-version depends on the cluster's op-version, as some features are enabled automatically depending on the cluster op-version. We also don't store the volume op-versions persistently and don't export the volume op-versions during sync. Due to this, cases can occur which lead to inconsistencies in volumes on different peers. One such case is below.

Consider a cluster made up of 3 peers P1, P2 and P3, operating at op-version N. The cluster has two volumes V1 and V2, which have volume op-version N (since a volume op-version cannot be greater than the cluster op-version). We have,
Cluster op-version = N
V1 op-version = N
V2 op-version = N

A set operation on V1 causes the cluster's op-version to be bumped up to N+1. Assume that there exist some features that are automatically enabled at op-version N+1. The op-version of V2 remains at N as no operation has been performed on it. So,
Cluster op-version = N+1
V1 op-version = N+1
V2 op-version = N

Now, we probe a new peer P4. On the new peer we will have the following op-versions,
Cluster op-version = N+1
V1 op-version = N+1
V2 op-version = N+1

This happens because we don't send volume op-versions during the sync after probe. P4 will freshly calculate the op-version of V2 (assuming features have been auto-enabled due to the cluster op-version being N+1) as N+1.

Another case is when glusterd on a peer restarts. Assume P3 was restarted; glusterd will recalculate the volume op-versions during the restore state. Again, the op-version of V2 will be calculated as N+1, assuming auto-enabled features. This will lead to inconsistency in the volume representation in memory and on disk, as glusterd will assume the volume contains auto-enabled features, but the volfiles don't contain them, as they were not regenerated.

These kinds of issues can be solved by calculating the volume op-version only when features are enabled and disabled (i.e. during volume set/reset), persisting the volume op-versions, and exporting/importing them.

Change-Id: I52de0668c92628622e85f4588fb28829a7231132
BUG: 1005043
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5568
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
* store: move glusterd_store functions from mgmt/glusterd to libglusterfs (Niels de Vos, 2013-06-20; 1 file changed, -23/+0)

Making the glusterd_store_* functions re-usable will help with future changes that need to read/write lists of items.

BUG: 904065
Change-Id: I99fb8eced76d12d5a254567eccff9790b43d8da3
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/4676
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Ignore directories matching *.tmp in store (Krishnan Parthasarathi, 2013-06-13; 1 file changed, -0/+1)

'store' being glusterd's persistent store under /var/lib/glusterd/

Change-Id: I1c01a09a8ce4a73ea612f05e7f14d4ab39ad1628
BUG: 971796
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5177
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: Mark vol as deleted by renaming voldir before cleaning up the store (Krutika Dhananjay, 2013-03-11; 1 file changed, -2/+2)

PROBLEM:
During 'volume delete', when glusterd fails to erase all information about a volume from the backend store (for instance because rmdir() failed on non-empty directories), not only does volume delete fail on that node, but also subsequent attempts to restart glusterd fail because the volume store is left in an inconsistent state.

FIX:
Rename the volume directory path to a new location <working-dir>/trash/<volume-id>.deleted, and then go on to clean up its contents. The volume is considered deleted once rename() succeeds, irrespective of whether the cleanup succeeds or not.

Change-Id: Iaf18e1684f0b101808bd5e1cd53a5d55790541a8
BUG: 889630
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/4639
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
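A minimal sketch of the rename-first, clean-up-later approach from the fix; the paths and helper names are illustrative, and the real code renames into <working-dir>/trash/<volume-id>.deleted using glusterd's own store helpers rather than nftw():

    #define _XOPEN_SOURCE 500
    #include <ftw.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* nftw() callback: remove files first, directories after their contents. */
    static int rm_entry(const char *p, const struct stat *sb, int type, struct FTW *ftwbuf)
    {
        (void)sb; (void)ftwbuf;
        return (type == FTW_DP) ? rmdir(p) : unlink(p);
    }

    static int delete_volume_store(const char *voldir, const char *trash_path)
    {
        /* The rename is the commit point: once it succeeds the volume is
         * treated as deleted even if the cleanup below fails.              */
        if (rename(voldir, trash_path) != 0)
            return -1;

        /* Best-effort cleanup; a failure here no longer leaves the live
         * store inconsistent, since the voldir itself is already gone.     */
        if (nftw(trash_path, rm_entry, 16, FTW_DEPTH | FTW_PHYS) != 0)
            fprintf(stderr, "cleanup of %s incomplete, retry later\n", trash_path);
        return 0;
    }

    int main(void)
    {
        mkdir("/tmp/vols-demo", 0700);
        return delete_volume_store("/tmp/vols-demo", "/tmp/vols-demo.deleted");
    }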
* glusterd: Made dst brick's port info available to all peers (Krishnan Parthasarathi, 2013-01-04; 1 file changed, -0/+1)

Change-Id: I1f65743a31d95013fdf22cded91c314e9934a3a9
BUG: 816915
Signed-off-by: Krishnan Parthasarathi <kp@gluster.com>
Reviewed-on: http://review.gluster.org/3275
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd, cli: Task id's for async tasks (Kaushal M, 2012-12-19; 1 file changed, -0/+1)

This patch introduces task-ids for async tasks like rebalance, remove-brick and replace-brick. An id is generated for each task when it is started and displayed to the user in cli output. The status of running tasks is also included in the output of "volume status" along with its id, so that a user can easily track the progress of an async task.

Also,
* added tests for this feature into the regression test suite.
* added a python script for creating files, 'create-files.py', courtesy Vijaykumar Koppad (vkoppad@redhat.com), into the test suite.

This patch reverts the revert commit 698deb33d731df6de84da8ae8ee4045e1543a168.

BUG: 857330
Change-Id: Id43d7cb629a38f47f733fbc18cb4c5f2f0327c7a
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/4294
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
* Revert "glusterd, cli: Task id's for async tasks"Anand Avati2012-12-041-1/+0
| | | | | | | | | | | This reverts commit ed15521d4e5af2b52b78fd33711e7562f5273bc6 Strangely, the test scripts are "silently" passing for failures too. Reverting patch for now. Change-Id: I802ec1634c7863dc373cc7dc4a47bd4baa72764e Reviewed-on: http://review.gluster.org/4267 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>