summaryrefslogtreecommitdiffstats
path: root/xlators
Commit message (Collapse)AuthorAgeFilesLines
* Revert "glusterd, cli: Task id's for async tasks"Anand Avati2012-12-0413-879/+284
| | | | | | | | | | | This reverts commit ed15521d4e5af2b52b78fd33711e7562f5273bc6 Strangely, the test scripts are "silently" passing for failures too. Reverting patch for now. Change-Id: I802ec1634c7863dc373cc7dc4a47bd4baa72764e Reviewed-on: http://review.gluster.org/4267 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: mark new entry changelog for create/mknod failuresPranith Kumar K2012-12-045-67/+228
| | | | | | | | | | | | | | | | | | | | | | Problem: When create/mknod fails on some of the nodes, appropriate pending data/metadata changelogs are not assigned. This was not considered to be an issue because entry self-heal would do the assigning of appropriate changelog after creating new entries. But using the combination of rebalance and remove brick we can construct a case where a file with same name and gfid can be created in a dir with different data and link-to xattr without any changelog. Fix: When a create/mknod failure is observed mark the appropriate changelog on the new file created. Change-Id: I4c32cbf5594a13fb14deaf97ff30b2fff11cbfd6 BUG: 858212 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4207 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* afr: use data trylock mode in read/write self-heal trigger pathsBrian Foster2012-12-041-1/+8
| | | | | | | | | | | | | | | | | | | | | | Self-heal data lock contention between clients and glustershd instances can lead to long wait and user response times if the client ends up pending its lock on glustershd self-heal of a large file. We have reports of guest vm instances going completely unresponsive during self-heal of virtual disk images. Optimize the read/write self-heal trigger codepath (i.e., afr_open_fd_fix()) to trylock for self-heal and skip the self-heal otherwise to minimize the likelihood of a running/active guest of competing with glustershd on arrival of a brick. Note that lock contention is still possible from the client (e.g., via lookup). BUG: 874045 Change-Id: I406443c061ff6acd2a851179626b78352caa5c03 Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4258 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* afr: support self-heal data trylock mechanismBrian Foster2012-12-044-8/+15
| | | | | | | | | | | | | | | Introduce a block flag to support an optional blocking or non-blocking mode in the self-heal data locking mechanism. All callers are modified to use blocking mode, which is the current default behavior (no change in behavior is introduced by this commit). BUG: 874045 Change-Id: Ib7ff9984578fa11de4e3b6981508100cdddd37cd Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4257 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd, cli: Task id's for async tasksKaushal M2012-12-0413-284/+879
| | | | | | | | | | | | | | | | | | | | This patch introduces task-id's for async tasks like rebalance, remove-brick and replace-brick. An id is generated for each task when it is started and displayed to the user in cli output. The status of running tasks is also included in the output of "volume status" along with its id, so that a user can easily track the progress of an async task. Also, * added tests for this feature into the regression test suite. * added a python script for creating files, 'create-files.py', courtesy Vijaykumar Koppad (vkoppad@redhat.com) into the test suite. Change-Id: Ib0c0d12e0d6c8f72ace48d303d7ff3102157e876 BUG: 857330 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/3942 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* storage/posix: if create returns EXIST, donot set gfid/xattrsshishir gowda2012-12-041-0/+4
| | | | | | | | | Change-Id: I9f2b75b10bde428d36d6516aa09c18e590d17ed9 BUG: 864801 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4265 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* afr: make flush non-transactionalBrian Foster2012-12-043-139/+38
| | | | | | | | | | | | | | | | | | Flush is historically a transaction to ensure all previous writes were complete. This is no longer required as write-behind has learned to make flush a barrier operation (re: conversation w/ Avati). Flush taking a full file lock causes VMs running on afr volumes to stall when a migration occurs and self-heal is in progress. Make afr_flush() a non-transactional operation. BUG: 874045 Change-Id: If2db83823e280c86b1b29b41361eed7081601632 Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4261 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: add "volume label" commandJeff Darcy2012-12-046-0/+206
| | | | | | | | | | | | | | | | | This command is necessary when the local disk/filesystem containing a brick is unexpectedly lost and then recreated. Since 961bc80c, trying to start the brick will fail because the trusted.glusterfs.volume-id xattr is missing, and if we can't start it then we can't replace-brick or self-heal so we're stuck in a permanently degraded state. This command provides a way to label the empty brick with the proper volume ID so that further repair actions become possible. Change-Id: I1c1e5273a018b7a6b8d0852daf111ddc3fddfdc2 BUG: 860297 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4259 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* dht: support auto-NUFA optionJeff Darcy2012-12-042-10/+79
| | | | | | | | | | | | | | | Many people have asked for behavior like the old NUFA, which builds and seems to run but was previously impossible to enable/configure in a standard way. This change allows NUFA to be enabled instead of DHT from the command line, with automatic selection of the local subvolume on each host. Change-Id: I0065938db3922361fd450a6c1919a4cbbf6f202e BUG: 882278 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4234 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* fix memory leaksRaghavendra Bhat2012-12-043-1/+5
| | | | | | | | | | | | | * write-behind: free the inode context in wb_forget * distribute: in readdirp callback put the allocated context to the inode * distribute: check if the layout is NULL before accessing it in layout_unref Change-Id: I7698f81b85b99d06bf6b01fc1a6e51e1593b5e27 BUG: 790709 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/4250 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* geo-rep / gsyncd: play nicely with peer multiplexing when setting a checkpointCsaba Henk2012-12-041-5/+16
| | | | | | | | | | | | | | | | | | | | | | | | The gsyncd invocation that instruments the "geo-rep config" command is multiplexed over peers to ensure the uniformity of configuration. In general, that works well, but checkpoint setting is a special case, because (unlike other instances of config-set) it is logged (as recording of checkpoint events is part of the feature). Problem is that the path components leading to the log file are created only on the original node, where gsyncd was started. Therefore the logging attempt will fail on the other nodes. Fix: ignore if opening the logfile on behalf of checkpoint setting fails with ENOENT. Change-Id: I677f3f081bf4b9e3ba4d25d58979d86931e6beb4 BUG: 881997 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.org/4248 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Christos Triantafyllidis <ctrianta@redhat.com> Reviewed-by: Christos Triantafyllidis <ctrianta@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* nfs: do opendir for "naked" readdirp to force self-heal checksJeff Darcy2012-12-031-0/+34
| | | | | | | | | | | | | Instead of an opendir, the first thing the Linux NFS client usually sends us is a readdirp at offset zero, effectively bypassing our self-heal checks. Detect this condition and issue our own opendir to compensate. Change-Id: I69463370abd6235d705bf80b8c77fae4a61096ae BUG: 830665 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4067 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Provide option to disable readdir failoverPranith Kumar K2012-12-035-25/+42
| | | | | | | | | | | | | | | | | In a replica pair unlike files, directories may not have their content in same order, so readdir for same (offset, size) may not give same entries on both the sobvolumes of replica pair. Switching over from one subvolume to another may not be a good idea sometimes. It may lead to duplicate entries or fewer entries or both. This patch provides a way to disable readdir-failover so that applications like rebalance can retry if they want to. Change-Id: I2b23eb224a2e84016a561362932613ac824c11a0 BUG: 859387 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4159 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* encryption/rot-13: Cleanup trailing whitespacesVijay Bellur2012-11-301-8/+8
| | | | | | | | | Change-Id: I9f5c81ca4320b6e73087023102dff6e3911b5095 BUG: 764890 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/4251 Reviewed-by: Anand Avati <avati@redhat.com> Tested-by: Anand Avati <avati@redhat.com>
* cluster/afr: Added descriptions to afr optionsPranith Kumar K2012-11-291-11/+86
| | | | | | | | | Change-Id: I4aef1c79743ee08b62e04d7b709f3e8c6b9dc56a BUG: 881517 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4244 Reviewed-by: Anand Avati <avati@redhat.com> Tested-by: Anand Avati <avati@redhat.com>
* cluster/dht: fail fix-layout if any of the subvol is downshishir gowda2012-11-295-35/+47
| | | | | | | | | | | | | | | | If any subvolume is down, and a layout is re-written and hash values change, entry names in the downed subvol can be reused in the other subvol which got the same hash range. when the downed subvol is brought back up, duplicate entried might appear Also separated handling of ENOSPC and ENOTCONN error. Change-Id: I5ed93990425a4cee70df2dab7c7c119fdc87ad56 BUG: 860663 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4000 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* mgmt/glusterd: Consider nodesvc to be running after onlinePranith Kumar K2012-11-297-21/+23
| | | | | | | | | | | | | | | | | | | | | Definition of online in the message below is that the RPC_CLNT_CONNECT event arrives for the nfs/self-heal-daemon process. For automated tests, sometimes the script needs to wait until self-heal-daemon comes online, so that the relevant commands can be executed. Gluster volume status before this change printed whether the self-heal-daemon is running or not based on the lock availability on the pidfile. But there is a small window where the lock on pid file is present but the process is still not online. So the commands that were depending on this kept failing in the test script. Change-Id: I0e44e18b08d7b653d34fa170c1f187d91c888cd9 BUG: 858212 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4236 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/dht: Heal dir uid/gidshishir gowda2012-11-294-1/+105
| | | | | | | | | | | | Identify mismatching uid/gid in lookup, and trigger a syncop heal. uid/gid of subvol with latest ctime is trusted (local->prebuf). Change-Id: Ib5c4bc438e7f4b1f33080e73593f40f400e997f0 BUG: 862967 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/3964 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* debug/error-gen : Added dumpops to print private values of error-gen in the ↵Avra Sengupta2012-11-291-0/+38
| | | | | | | | | | | statedump. Change-Id: I4aa299bd8ecdaa82cdfdc2d97a89fcddcbb25930 BUG: 767095 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/4245 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* BD Backend: CLI to create a full/linked clone of a imageM. Mohan Kumar2012-11-292-10/+115
| | | | | | | | | | | | | | | | A new CLI command added to support cloning/snapshotting of a LV device Syntax is: $ gluster bd clone <volname>:<vg>/<lv> <newlv> $ gluster bd snapshot <volname>:<vg>/<lv> <snap_lv> <size> BUG: 805138 Change-Id: Idc2ac14525a3998329c742bf85a06326cac8cd54 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/3719 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* BD Backend: CLI commands to create/delete imageM. Mohan Kumar2012-11-296-79/+545
| | | | | | | | | | | | | | | | | Cli commands added to create/delete a LV device. The following command creates lv in a given vg. $ gluster bd create <volname>:<vgname>/<lvname> <size> The following command deletes lv in a given vg. $ gluster bd delete <volname>:<vgname>/<lvname> BUG: 805138 Change-Id: Ie4e100eca14e2ee32cf2bb4dd064b17230d673bf Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/3718 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* BD Backend: Volume creation supportM. Mohan Kumar2012-11-299-14/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | A new parameter type is added to volume create command. To use BD xlator one has to specify following argument in addition to normal volume create device vg brick:<VG-NAME> for example, $ gluster volume create lv_volume device vg host:/vg1 Changes from previous version * New type 'backend' added to volinfo structure to differentiate between posix and bd xlator * Most of the volume related commands are updated to handle BD xlator, like add-brick, heal-brick etc refuse to work when volume is BD xlator type * Only one VG (ie brick) can be specified for BD xlator during volume creation * volume info shows VG info if its of type BD xlator BUG: 805138 Change-Id: I0ff90aca04840c71f364fabb0ab43ce33f9278ce Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/3717 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* BD Backend: Rename LVM. Mohan Kumar2012-11-291-0/+94
| | | | | | | | | BUG: 805138 Change-Id: I18c64435e66ede148c58d412a0639f45554209c8 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/3558 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* BD Backend: Extend size of file (LV)M. Mohan Kumar2012-11-291-6/+104
| | | | | | | | | | | | | | Use the truncate interface to increase the size of file (LV). FIXME: lvm2 library does not provide any interface to extend size of LV. So lvextend binary is forked to achieve the same BUG: 805138 Change-Id: If4c0bd112364437b89e091b7f53764b8e6e01a28 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/3557 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* BD Backend: Full clone and linked cloneM. Mohan Kumar2012-11-291-16/+348
| | | | | | | | | | | | FIXME: There is no lvm2 api to create a LV snapshot. This patch forks lvcreate binary to achieve the same. BUG: 805138 Change-Id: Icdbead16f797162fe6a31a672b619ce6a0391235 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/3556 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* BD Backend: Unlink a file (LV)M. Mohan Kumar2012-11-293-0/+117
| | | | | | | | | BUG: 805138 Change-Id: I53d8a4bc09cbd9766ba937887cadd7ac475017ba Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/3555 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* BD Backend: Create a new file (LV)M. Mohan Kumar2012-11-292-9/+255
| | | | | | | | | | | | | | Add support to create a new file (LV) under a directory (VG). By default created LV is of one logical extent size. Also setattr/fsetattr interfaces added as part of this patch. BUG: 805138 Change-Id: I51752b707b3766ab277d623ce574537346f376c9 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/3554 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* BD Backend: Write supportM. Mohan Kumar2012-11-291-0/+243
| | | | | | | | | BUG: 805138 Change-Id: I4a672fc58ee61dead99e0adcd46d7771f3fdd730 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/3553 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* BD Backend: Open,read and related calls support for LVM. Mohan Kumar2012-11-294-0/+296
| | | | | | | | | BUG: 805138 Change-Id: I811c179d4244342537dbedb8a24fd2ec628942ed Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/3552 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* xlators: Add Block Device(BD) backend translatorM. Mohan Kumar2012-11-297-1/+1524
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new server storage xlator 'bd mapper'. Intention of this xlator is to add block device backend support to gluster. It exports block devices as regular files to the gluster client. The immediate goal of this translator is to use logical volumes to store VM images and expose them as files to QEMU/KVM. Given Volume group is represented as directory and its logical volumes as files. By exporting LUNs/LVs as regular files, it becomes possible to: * Associate each VM to a LV/LUN * Use file system commands like cp to take copy of VM images * Create linked clones of VM by doing LV snapshot at server side * Implement thin provisioning by developing a qcow2 translator As of now this patchset maps only logical volumes. BD Mapper volume file specifies which Volume group to export to the client. BD xlator exports the volume group as a directory and all logical volumes under that as regular files. BD xlator uses lvm2-devel APIs for getting the list of Volume Groups and Logical Volumes in the system. The eventual goal of this work is to support thin provisioning, snapshot, copy etc of VM images seamlessly in glusterfs storage environment BUG: 805138 Change-Id: I13b69d39d7fd199c101c8e9e4f2cf10772bdc3dd Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/3551 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* afr: send unique dict_t instances to replicas in self-heal fxattropBrian Foster2012-11-291-28/+42
| | | | | | | | | | | | | | | | | | | | | afr_sh_data_fxattrop() currently allocates and sends a single xattr dict_t instance to each replica. The callback codepath references the returned object in the self-heal in-memory state for the particular replica. If storage/posix is in the same address-space (i.e., running a single glusterfs client with a fuse->afr->posix graph), the same object is modified and returned for each child, causing corrupted in-memory state and afr xattrs. Allocate and send independent xattr dict_t's for each replica. This allows self-heal to work correctly in a single address-space graph. BUG: 868478 Change-Id: I42832e85b5d1abb6098c28944c717e129300109e Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4149 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* afr: handle short writes in afr_writev_wind and self-heal to avoid corruptionBrian Foster2012-11-295-16/+75
| | | | | | | | | | | | | | | | | | | | | The current failure to handle short writes on writev fops leaves us open to file corruption. A short write on a user request is ignored and leaves replicas in an inconsistent state. A short write during a self-heal is ignored and incorrectly marks the files as consistent if the heal completes. Modify user writev handling to return the best case return value from each of the replicas. Short writes that occur relative to this value are marked as failed and will require a heal. Modify self-heal to set an error on a short write and abort the heal. BUG: 853690 Change-Id: I18b30f58702326249230eeebb361b29e40b535f5 Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4150 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: send ACCESS call on dir to first_up_subvol if cached is downshishir gowda2012-11-291-0/+11
| | | | | | | | | Change-Id: I4f518a969bbe3a11075e7c9ae10bd21bf059d5f3 BUG: 867253 Signed-off-by: shishir gowda <sgowda@redhat.com> Reviewed-on: http://review.gluster.org/4240 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep / gsyncd,glusterd: do not hardcode socket pathCsaba Henk2012-11-286-5/+16
| | | | | | | | | | | | ... in gsyncd python code. Indeed, use the configuration mechanism to set it suitably from glusterd. Change-Id: I9fe2088b14d28588d1e64fe892740cc5755b8365 BUG: 868877 Signed-off-by: Csaba Henk <csaba@redhat.com> Reviewed-on: http://review.gluster.org/4143 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd, cli: implement gluster system uuid reset commandRaghavendra Bhat2012-11-283-11/+131
| | | | | | | | | | | | | | | | | | | | | | A commonly faced problem among glusterfs users is: after a fresh installation of glusterfs in a virtual machine, the VM image is cloned to make multiple instances of the server. This breaks glusterd because right after glusterfs installation on the first boot glusterd would have created the node UUID and this gets inherited into the clone. The result is wierd behavior at the time of peer probe where glusterd does not (yet) deal with UUID collisions in a user friendly way. To handle it gluster peer reset command is implemented which upon execution changes the uuid of local glusterd. Change-Id: If207dd2ad93ab94ef1a3253f409c21c442975f87 BUG: 811493 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/3637 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Protected conf->xprt_list racy access.Krishnan Parthasarathi2012-11-282-3/+14
| | | | | | | | | | | | | | | | - epoll on RPCSVC_EVENT_ACCEPT would add corresponding xprt onto the xprt_list. Concurrently, synctask thread (volume op) would call into glusterd_fetchspec_notify which iterates on the xprt_list. Added a mutex to protect such a racy access of the list. Change-Id: Idc51b4bdb1c814dfab7790e1c899d6977f7640f2 BUG: 878873 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4241 Reviewed-by: Raghavendra G <raghavendra@gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-replication: catch select.error on select()Niels de Vos2012-11-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | tailer() in resource.py does not correctly catch exceptions from select(). select() can raise an instance of the select.error class and the current expression only catches ValueError (and the instance will have reference called selecterror). The geo-rep log contains a call trace like this: > E [syncdutils:190:log_raise_exception] <top>: FAIL: > Traceback (most recent call last): > File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 216, in twrap > tf(*aa) > File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 123, in tailer > poe, _ ,_ = select([po.stderr for po in errstore], [], [], 1) > File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 276, in select > return eintr_wrap(oselect.select, oselect.error, *a) > File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 269, in eintr_wrap > return func(*a) > error: (9, 'Bad file descriptor') BUG: 880308 Change-Id: I2babe42918950d0e9ddb3d08fa21aa3548ccf7c5 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/4233 Reviewed-by: Peter Portante <pportant@redhat.com> Reviewed-by: Csaba Henk <csaba@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* protocol/client: add an option to filter O_DIRECT flag in openAmar Tumballi2012-11-283-2/+27
| | | | | | | | | | | | | | | | with the option, the idea is all client-side caching will be disabled, where as on server side process, the fd will be treated as a regular fd, thus helping the performance better. "gluster volume set <VOLNAME> remote-dio enable" would set this option in client protocol volumes. Change-Id: Id2255a167137f8fee20849513e3011274dc829b4 Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 845213 Reviewed-on: http://review.gluster.org/4206 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: volume-sync shouldn't validate volume-idKrishnan Parthasarathi2012-11-272-35/+9
| | | | | | | | | | | | | | - volume sync would overwrite volume information on local node from the hostname supplied. This warning is provided to the user. - Also fixed a double free in volume-sync handler. Change-Id: Icc68d9d563fb50ca58d5880921f063692e1e6882 BUG: 865700 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4188 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/locks: implement fgetxattr and fsetxattrRaghavendra G2012-11-272-0/+342
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | implement xattrs for GF_XATTR_LOCKINFO_KEY, which will be used for posix-locks migration from old to new graph after a switch. fgetxattr (fd, GF_XATTR_LOCKINFO_KEY) will return a dict. This dict has a serialized dict stored for key GF_XATTR_LOCKINFO_KEY. This serialized dict in turn has fdnum value of locks acquired on this fd with modified pathinfo (containing hostname and base directory components) as key. fsetxattr (newfd, GF_XATTR_LOCKINFO_KEY, dict) has following semantics. dict can be the result of a previous fgetxattr with GF_XATTR_LOCKINFO_KEY. In that case, a dict_get on dict constructed using serialized buffer is done on modified pathinfo as key. If a value is got, that value is treated as fdnum and for every lock l on newfd->inode we do, if (l->fdnum == fdnum) { l->fdnum = fd_fdnum (newfd); l->transport = <connection identifier of connection on which fsetxattr came>; } Signed-off-by: Raghavendra G <raghavendra@gluster.com> Change-Id: I73a8f43aa0b6077bc19f8de52205ba748f2d8bbe BUG: 808400 Reviewed-on: http://review.gluster.org/4120 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* cluster/stripe: handle GF_XATTR_LOCKINFO_KEY in f(get)(set)xattrRaghavendra2012-11-273-19/+302
| | | | | | | | | Change-Id: I4463006a7f54c05e757d877c56e1330fd91aec45 BUG: 808400 Signed-off-by: Raghavendra <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/4125 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/distribute: send getxattr on LOCKINFO to only cached subvolumes.Raghavendra2012-11-271-1/+3
| | | | | | | | | | | | lk is sent to only cached subvolume. Hence there is no point in sending LOCKINFO to other children (even in case of directories). Change-Id: Ia20fc358dfa84cee9a52d1f613564ff6f25aa0c9 BUG: 808400 Signed-off-by: Raghavendra <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/4123 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mount/fuse: migrate posix locks after a graph-switchRaghavendra2012-11-271-8/+90
| | | | | | | | | | | | | Each posix-lock is associated with an fd and a transport. After a graph switch, this lock-state has to be associated with new fd and transport corresponding to new client graph. Change-Id: Ia0855e15600c85ef902bf612738f7d96557145be BUG: 808400 Signed-off-by: Raghavendra <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/4122 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: handle GF_XATTR_LOCKINFO_KEY appropriately.Raghavendra G2012-11-272-26/+521
| | | | | | | | | | | | | | values from all children need to be aggregated into a dictionary and serialized buffer of this aggregated dictionary has to be the value of GF_XATTR_LOCKINFO_KEY in the dict sent as a result of fgetxattr. Change-Id: Ie877f7c637c07feaee4c44d7ef86aa967a17b7e7 BUG: 808400 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/4121 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* replicate: don't stop checking xattrs because one was absentJeff Darcy2012-11-263-114/+41
| | | | | | | | | | | | | | | | | | | | | The functional issue is described by the subject line. This patch also addresses several efficiency/structure issues, such as... * Calling dict_set_ptr once for each txn type, instead of once overall. * Calling afr_index_for_transaction_type once per iteration instead of once per call (or better yet zero since the conversion is unnecessary). * Implementation of inner functions in a different file than their one caller, creating a spurious header-file dependency. Change-Id: I29e0df906a820533b66b9ced73e015dfe77267d2 BUG: 865825 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4070 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Cluster/afr: Fix output for gluster volume heal vn info healedVenkatesh Somyajula2012-11-267-17/+53
| | | | | | | | | | | | | | | | | | | | | | Problem: Whenever gluster volume heal vol full command is executed, the entries stored in the circual buffer for sh->healed are added in the dictionary in the _crawl_post_sh_action function irrespective of whether actual self heal (due to non-zero values in chage log) takes place or not. Fix: Value of key (actual-sh-done) will be set to 1 whenever self heal takes place due to non-zero change log values and if for some FOP self heal daemon finds that no self heal required after examining the pending matrix, the value will be 0. Change-Id: I11fd0b9ee76759af17c5bca6bfafbaf66bcaacbc BUG: 863068 Signed-off-by: Venkatesh Somyajula <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/4181 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* nfs: avoid blocking lock calls in statedump codeRajesh Amaravathi2012-11-241-2/+11
| | | | | | | | | | | | This change replaces LOCK () with TRY_LOCK () in nlm statedump code. Change-Id: I28c558b68854cf08c3a8190a00d6e3d507317628 BUG: 843819 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.org/4193 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* mgmt/glusterd: Implementation of server-side quorumPranith Kumar K2012-11-2311-161/+1206
| | | | | | | | | | | | Feature-page: http://www.gluster.org/community/documentation/index.php/Features/Server-quorum Change-Id: I747b222519e71022462343d2c1bcd3626e1f9c86 BUG: 839595 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.org/3811 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: Implement float percentagePranith Kumar K2012-11-234-20/+21
| | | | | | | | | Change-Id: Ia7ea63471f0bbd74686873f5f6f183475880f1a0 BUG: 839595 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.org/4162 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: fix use of undefined realpath(3) resultJeff Darcy2012-11-231-2/+7
| | | | | | | | | Change-Id: Ic50ae192c99cece25cd63f2277fb440fca5f0b04 BUG: 877522 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4201 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>