path: root/tests
Commit message | Author | Age | Files | Lines
* io-stats: Add stats for upcall notifications [v3.10dev]
  Author: Poornima G | Age: 2016-08-31 | Files: 1 | Lines: -0/+38

  With this patch, additional entries appear in the profile info:

  UPCALL       : Total number of upcall events sent from the brick (in the
                 brick profile), and number of upcall notifications
                 received by the client (in the client profile)

  Cache invalidation events:
  -------------------------
  CI_IATT      : Number of upcalls that were cache invalidations and had one
                 of the IATT_UPDATE_FLAGS set. This indicates that one of the
                 iatt values was changed.
  CI_XATTR     : Number of upcalls that were cache invalidations and had one
                 of UP_XATTR or UP_XATTR_RM set. This indicates that an xattr
                 was updated or deleted.
  CI_RENAME    : Number of upcalls that were cache invalidations resulting
                 from the rename of a file or directory
  CI_UNLINK    : Number of upcalls that were cache invalidations resulting
                 from the unlink of a file.
  CI_FORGET    : Number of upcalls that were cache invalidations resulting
                 from the forget of an inode on the server side.

  Lease events:
  ------------
  LEASE_RECALL : Number of lease recalls sent by the brick (in the brick
                 profile), and number of lease recalls received by the
                 client (in the client profile)

  Note that the sum of CI_IATT, CI_XATTR, CI_RENAME, CI_UNLINK, CI_FORGET
  and LEASE_RECALL may not be equal to UPCALL. This is because each cache
  invalidation can carry multiple flags. Eg:
  - Every CI_XATTR will have CI_IATT
  - Every CI_UNLINK will also increment CI_IATT, as link count is an iatt
    attribute.
  Also, UP_PARENT_DENTRY_FLAGS is currently not accounted for, as CI_RENAME
  and CI_UNLINK will always have the flag UP_PARENT_DENTRY_FLAGS

  Change-Id: Ieb8cd21dde2c4c7618f12d025a5e5156f9cc0fe9
  BUG: 1371543
  Signed-off-by: Poornima G <pgurusid@redhat.com>
  Reviewed-on: http://review.gluster.org/15193
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
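
The note about the counters not adding up to UPCALL can be illustrated with a small model. This is only a sketch in Python, not the io-stats code: the `Up` flag class, the `COUNTER_FOR` table and the `account` helper are hypothetical names invented here, and only the counter names come from the commit message.

```python
from collections import Counter
from enum import Flag, auto


class Up(Flag):
    """Hypothetical stand-ins for the upcall flag groups named in the commit."""
    IATT = auto()
    XATTR = auto()
    RENAME = auto()
    UNLINK = auto()
    FORGET = auto()


# Which profile counter each flag group feeds (counter names from the commit).
COUNTER_FOR = {
    Up.IATT: "CI_IATT",
    Up.XATTR: "CI_XATTR",
    Up.RENAME: "CI_RENAME",
    Up.UNLINK: "CI_UNLINK",
    Up.FORGET: "CI_FORGET",
}


def account(stats: Counter, flags: Up) -> None:
    """Count one upcall event, plus one hit per flag group it carries."""
    stats["UPCALL"] += 1
    for flag, counter in COUNTER_FOR.items():
        if flag in flags:
            stats[counter] += 1


stats = Counter()
account(stats, Up.XATTR | Up.IATT)   # an xattr update also carries iatt flags
account(stats, Up.UNLINK | Up.IATT)  # an unlink changes the link count (iatt)
ci_total = sum(n for name, n in stats.items() if name.startswith("CI_"))
```

With these two events, `stats["UPCALL"]` is 2 while the CI_* counters sum to 4, which is the mismatch the note describes.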
* tests: Fix spurious failures because of wrong shd up function
  Author: Pranith Kumar K | Age: 2016-08-31 | Files: 3 | Lines: -4/+4

  Fixed the way the shd up check is done, to prevent the "self-heal daemon
  not running" error when the heal full command is executed.

  Change-Id: I93c4a0da12316373d62cd4ea74432cd9bf2b090c
  BUG: 1370053
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/15341
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Anuradha Talur <atalur@redhat.com>
* upcall: Mark the clients as accessed on readdirp entries
  Author: Poornima G | Age: 2016-08-31 | Files: 1 | Lines: -0/+43

  Currently when a client performs a readdirp it is not stored in upcall
  as one of the clients that have accessed the files. Hence, when any
  other client modifies the file, the client that had performed readdirp
  will not get any notifications.

  Fix this by adding the clients to the upcall database when they perform
  readdirp.

  Change-Id: I7767f1e26bf1bd1f67702a6d01f8aa64526ccc46
  BUG: 1369430
  Signed-off-by: Poornima G <pgurusid@redhat.com>
  Reviewed-on: http://review.gluster.org/15313
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: soumya k <skoduri@redhat.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* tests: disable lock_revocation.t on NetBSD
  Author: Raghavendra Talur | Age: 2016-08-31 | Files: 1 | Lines: -0/+1

  This has been consistently causing hangs on NetBSD machines. I have not
  been able to debug the issue and we have a merge deadline for 3.9. It
  would be better to disable this for now.

  Change-Id: I8c63940aa26f78dd9994bb63293a5757835ec52b
  BUG: 1369401
  Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
  Reviewed-on: http://review.gluster.org/15374
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* md-cache: Process all the cache invalidation flags
  Author: Poornima G | Age: 2016-08-30 | Files: 2 | Lines: -0/+46

  Currently, md-cache only processes IATT_UPDATE_FLAGS, UP_XATTR and
  UP_XATTR_RM. We also need to process UP_RENAME_FLAGS, UP_FORGET,
  UP_PARENT_DENTRY_FLAGS and UP_NLINK_FLAGS. Otherwise the files unlinked
  or renamed will not be reflected on other mounts.

  Change-Id: Icb8b03da51482c3fc2e2a7292d16d56e11a341d9
  BUG: 1211863
  Signed-off-by: Poornima G <pgurusid@redhat.com>
  Reviewed-on: http://review.gluster.org/15324
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* gfapi: Mark tests/basic/gfapi/1093594.t bad until it is fixed
  Author: Poornima G | Age: 2016-08-30 | Files: 1 | Lines: -0/+2

  Change-Id: If88efe3db782a6156614af4c650d53b159ade57f
  BUG: 1371541
  Signed-off-by: Poornima G <pgurusid@redhat.com>
  Reviewed-on: http://review.gluster.org/15354
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterd : Introduce reset brick
  Author: Anuradha Talur | Age: 2016-08-29 | Files: 1 | Lines: -0/+54

  The command basically allows replace-brick with the same brick as both
  source and destination.

  Usage:
  gluster v reset-brick <volname> <hostname:brick-path> start
  This command kills the brick to be reset. Once this command is run, the
  admin can do other manual operations that they need to do, like
  configuring some options for the brick. Once this is done, resetting
  the brick can be continued with the following options.

  gluster v reset-brick <vname> <hostname:brick> <hostname:brick> commit {force}
  Does the job of resetting the brick. The 'force' option should be used
  when the brick already contains the volinfo id.

  Problem: On doing a disk-replacement of a brick in a replicate volume,
  the following 2 scenarios may occur:
  a) there is a chance that reads are served from this replaced-disk
     brick, which leads to empty reads.
  b) potential data loss if the next writes succeed only on the replaced
     brick, and heal is done to other bricks from this one.

  Solution: After disk-replacement, make sure that the reset-brick command
  is run for that brick so that pending markers are set for the brick and
  it is not chosen as source for reads and heal. But, as of now,
  replace-brick for the same brick-path is not allowed. In order to fix
  the above mentioned problem, same brick-path replace-brick is needed.
  With this patch, reset-brick commit {force} will be allowed even when
  source and destination <hostname:brickpath> are identical, as long as
  1) the destination brick is not alive
  2) source and destination brick have the same brick uuid and path.
  Also, the destination brick after replace-brick will use the same port
  as the source brick.

  Change-Id: I440b9e892ffb781ea4b8563688c3f85c7a7c89de
  BUG: 1266876
  Signed-off-by: Anuradha Talur <atalur@redhat.com>
  Reviewed-on: http://review.gluster.org/12250
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Ashish Pandey <aspandey@redhat.com>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
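
The two conditions for a same-brick reset-brick commit can be written down as a predicate. The following Python sketch only models the rule as stated in the commit message; the `Brick` record and `can_commit_same_brick` are hypothetical names, not glusterd structures or functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brick:
    """Hypothetical record holding only the fields the rule refers to."""
    host: str
    path: str
    uuid: str
    alive: bool


def can_commit_same_brick(src: Brick, dst: Brick) -> bool:
    """Model of the check for identical source and destination bricks."""
    identical = (src.host, src.path) == (dst.host, dst.path)
    if not identical:
        # This sketch covers only the same-brick case described above.
        return False
    # 1) destination must not be alive, 2) brick uuid must match.
    return (not dst.alive) and src.uuid == dst.uuid


stopped = Brick("host1", "/bricks/b1", "uuid-1", alive=False)
running = Brick("host1", "/bricks/b1", "uuid-1", alive=True)
```

Here `can_commit_same_brick(stopped, stopped)` is accepted, while a destination that is still running is rejected, which matches the requirement to run `reset-brick ... start` first.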
* gfapi: SSL connection for mgmt connection is not working
  Author: Rajesh Joseph | Age: 2016-08-28 | Files: 4 | Lines: -1/+223

  Problem: libgfapi does not enable SSL on the mgmt connection.

  Fix: Enable SSL when it is enabled on the mgmt connection, i.e. when the
  /var/lib/glusterd/secure-access file is present.

  Change-Id: I1ce4935b04e6140aeab819e42076defd580b0727
  BUG: 1362602
  Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
  Reviewed-on: http://review.gluster.org/15073
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-by: Kaushal M <kaushal@redhat.com>
* tests: change EXPECT_WITHIN timeouts
  Author: Anuradha Talur | Age: 2016-08-28 | Files: 2 | Lines: -9/+9

  Use defined HEAL and PROCESS_UP timeouts rather than hard code them in
  self-heald.t.

  Change-Id: I21586811904c8417b7208bb643f14dff20dc4832
  BUG: 1370074
  Signed-off-by: Anuradha Talur <atalur@redhat.com>
  Reviewed-on: http://review.gluster.org/15316
  Reviewed-by: Ravishankar N <ravishankar@redhat.com>
  Tested-by: Ravishankar N <ravishankar@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* md-cache: Do not use features.cache-invalidation for both md-cache and upcall
  Author: Poornima G | Age: 2016-08-27 | Files: 1 | Lines: -3/+4

  Currently, the volume set option features.cache-invalidation enables
  the upcall feature on the server side and md-cache cache-invalidation
  on the client side. There are multiple problems that can arise from
  this:
  1. The user may want to enable upcall for an nfs-ganesha setup without
     enabling md-cache cache-invalidation, as the nfs-clients have
     already cached the metadata and upcall is used to invalidate the
     nfs-client cache. In this case, users should have a way of disabling
     md-cache invalidation without disabling upcall.
  2. Upcall requires an op-version of GD_OP_VERSION_3_7_0, whereas
     md-cache invalidation requires an op-version of GD_OP_VERSION_3_9_0.
     Consider a setup where the servers are at op-version
     GD_OP_VERSION_3_7_0 and the clients are at op-version
     GD_OP_VERSION_3_9_0. If there is one single volume set option, the
     user can enable this feature in this setup. But it can lead to a
     stale xattr cache, as the xattr invalidation was introduced in
     upcall only in release 3.8. Hence, we should not be able to enable
     md-cache invalidation if all the servers and clients are not at
     op-version >= GD_OP_VERSION_3_9_0.

  To solve the above mentioned issues, we have separate volume options
  for enabling md-cache invalidation and upcall. But this can lead to
  issues when the user enables md-cache invalidation and forgets to
  enable upcall. Probably in the next release, these can be enabled by
  default.

  Change-Id: Ie70eff97fe12fcb623eec8f4f5861ac065bf483e
  BUG: 1211863
  Signed-off-by: Poornima G <pgurusid@redhat.com>
  Reviewed-on: http://review.gluster.org/15314
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: soumya k <skoduri@redhat.com>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
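
The op-version argument in point 2 amounts to a minimum over all participants. A rough Python sketch follows; the numeric values assigned to the two GD_OP_VERSION constants, the feature keys in `REQUIRED`, and the `can_enable` helper are all assumptions made for illustration and are not taken from this commit.

```python
# Assumed numeric encodings of the op-version constants named in the commit.
GD_OP_VERSION_3_7_0 = 30700
GD_OP_VERSION_3_9_0 = 30900

# Hypothetical feature keys; the commit does not name the new volume option.
REQUIRED = {
    "upcall": GD_OP_VERSION_3_7_0,
    "md-cache-invalidation": GD_OP_VERSION_3_9_0,
}


def can_enable(feature: str, op_versions: list[int]) -> bool:
    """A feature is allowed only if every server and client is new enough."""
    return min(op_versions) >= REQUIRED[feature]


# Servers at 3.7.0, one client at 3.9.0: the mixed setup from point 2.
mixed = [GD_OP_VERSION_3_7_0, GD_OP_VERSION_3_7_0, GD_OP_VERSION_3_9_0]
```

With a single shared option, the mixed setup could only be all-or-nothing; with separate options, upcall passes the check on `mixed` while md-cache invalidation does not.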
* glusterd/cli: cli to get local state representation from glusterd
  Author: Samikshan Bairagya | Age: 2016-08-26 | Files: 1 | Lines: -0/+141

  Currently there is no existing CLI that can be used to get the local
  state representation of the cluster as maintained in glusterd in a
  readable as well as parseable format.

  The CLI added has the following usage:

  # gluster get-state [daemon] [odir <path/to/output/dir>] [file <filename>]

  This would dump data points that reflect the local state representation
  of the cluster as maintained in glusterd (no other daemons are
  supported as of now) to a file inside the specified output directory.
  The default output directory and filename are /var/run/gluster and
  glusterd_state_<timestamp> respectively. The option for specifying the
  daemon name leaves room to add support for other daemons in the future.

  Following are the data points captured as of now to represent the state
  from the local glusterd pov:

  * Peer:
    - Primary hostname
    - uuid
    - state
    - connection status
    - List of hostnames

  * Volumes:
    - name, id, transport type, status
    - counts: bricks, snap, subvol, stripe, arbiter, disperse, redundancy
    - snapd status
    - quorum status
    - tiering related information
    - rebalance status
    - replace bricks status
    - snapshots

  * Bricks:
    - Path, hostname (for all bricks these info will be shown)
    - port, rdma port, status, mount options, filesystem type and signed
      in status for bricks running locally.

  * Services:
    - name, online status for initialised services

  * Others:
    - Base port, last allocated port
    - op-version
    - MYUUID

  Change-Id: I4a45cc5407ab92d8afdbbd2098ece851f7e3d618
  BUG: 1353156
  Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
  Reviewed-on: http://review.gluster.org/14873
  Reviewed-by: Avra Sengupta <asengupt@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* feature/bitrot: Ondemand scrub option for bitrot
  Author: Kotresh HR | Age: 2016-08-25 | Files: 2 | Lines: -2/+24

  The bitrot scrubber takes 'hourly/daily/biweekly/monthly' as the values
  for 'scrub-frequency'. There is no way to schedule the scrubbing when
  the admin wants it.

  Ondemand scrubbing brings in the new option 'ondemand', with which the
  admin can start scrubbing on demand. It starts the scrubbing
  immediately.

  Ondemand scrubbing is successful only if the scrubber is in
  'Active (Idle)' (waiting for its next frequency cycle to start
  scrubbing). It is not entertained when the scrubber is in 'Paused' or
  already running.

  Here is the command line syntax.

  gluster volume bitrot <vol name> scrub ondemand

  Change-Id: I84c28904367eed827a7dae8d6a535c14b28e9f4d
  BUG: 1366195
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  Reviewed-on: http://review.gluster.org/15111
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* gfapi: do not cache upcalls if the application is not interested
  Author: Niels de Vos | Age: 2016-08-25 | Files: 2 | Lines: -4/+14

  When the volume option 'features.cache-invalidation' is enabled, upcall
  events are sent from the brick process to the client. Even if the
  client is not interested in upcall events itself, md-cache or other
  xlators may benefit from them.

  By adding a new 'cache_upcalls' boolean in the 'struct glfs', we can
  enable the caching of upcalls when the application called
  glfs_h_poll_upcall(). NFS-Ganesha sets up a thread for handling upcalls
  in the initialization phase, and calls glfs_h_poll_upcall() before any
  NFS-client accesses the NFS-export.

  In the future there will be a more flexible registration API for
  enabling certain kinds of upcall events. Until that is available, this
  should work just fine.

  Verification of this change is not trivial within our current
  regression test framework. The bug report contains a description on how
  to reliably reproduce the problem with the glusterfs-coreutils.

  Change-Id: I818595c92db50e6e48f7bfe287ee05103a4a30a2
  BUG: 1368842
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/15191
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Poornima G <pgurusid@redhat.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: soumya k <skoduri@redhat.com>
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* snapshot/cli: Fix snapshot status xml output
  Author: Avra Sengupta | Age: 2016-08-23 | Files: 1 | Lines: -0/+76

  snap status --xml errors out if a brick is down and doesn't have pid.
  It is handled in the cli of the snap status where "N/A" is displayed in
  such a scenario. Handled the same in xml

  snap status <snapname> --xml fails as the writer is not initialised for
  the same. Using GF_SNAP_STATUS_TYPE_ITER instead of
  GF_SNAP_STATUS_TYPE_SNAP for all snap's status to differentiate between
  the two scenarios.

  Added testcase volume-snapshot-xml.t to check all snapshot commands xml
  outputs

  Change-Id: I99563e8f3e84f1aaeabd865326bb825c44f5c745
  BUG: 1325831
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/14018
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* snapshot: Display number of snapshots in volume info
  Author: Avra Sengupta | Age: 2016-08-23 | Files: 1 | Lines: -0/+20

  Display number of snapshots in a volume in volume info output. This
  number gets modified with create, delete, and restore operations.

  Change-Id: Ic9b7c2b6950980f8ce75ca362998c097ea7c863d
  BUG: 1360693
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/15029
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* tests: fix volfile_server_switch.t spurious failure
  Author: Atin Mukherjee | Age: 2016-08-22 | Files: 1 | Lines: -3/+1

  1. Unnecessary self probe is removed
  2. After every probe a peer_count check is added to give the test time
     to finish the handshake.

  Change-Id: Iab52548f8b781e7968250cd98fdbeaf02472970d
  BUG: 1368953
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-on: http://review.gluster.org/15231
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* tests: Remove explicit mention of include location
  Author: Poornima G | Age: 2016-08-22 | Files: 1 | Lines: -8/+1

  This hopefully fixes the spurious failures with error:
  glusterfs/api/glfs.h No such file or directory

  build.gluster.org/job/rackspace-regression-2GB-triggered/22897/consoleFull

  BUG: 1365489
  Change-Id: Ic3660de810c0daee7284373bbfaed172aba86d69
  Signed-off-by: Poornima G <pgurusid@redhat.com>
  Reviewed-on: http://review.gluster.org/15194
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order
  Author: Krutika Dhananjay | Age: 2016-08-22 | Files: 1 | Lines: -0/+112

  When the bricks are brought offline and then online in cyclic order
  while writes are in progress on a file, thanks to inode refresh in
  write txns, AFR will mostly fail the write attempt when the only good
  copy is offline. However, there is still a remote possibility that the
  file will run into split-brain if the brick that has the lone good copy
  goes offline *after* the inode refresh but *before* the write txn
  completes (I call it in-flight split-brain in the patch for ease of
  reference), requiring intervention from admin to resolve the
  split-brain before the IO can resume normally on the file. To get
  around this, the patch does the following things:
  i) retains the dirty xattrs on the file
  ii) avoids marking the last of the good copies as bad (or accused) in
      case it is the one to go down during the course of a write.
  iii) fails that particular write with the appropriate errno.

  This way, we still have one good copy left despite the split-brain
  situation which when it is back online, will be chosen as source to do
  the heal.

  Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
  BUG: 1363721
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/15080
  Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Ravishankar N <ravishankar@redhat.com>
  Reviewed-by: Oleksandr Natalenko <oleksandr@natalenko.name>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
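
Steps ii) and iii) can be modelled as a decision made when a write completes. This is a toy Python sketch under stated assumptions, not AFR's transaction code: `finish_write` is a hypothetical helper, bricks are plain strings, and EIO stands in for "the appropriate errno", which the commit message does not name.

```python
import errno


def finish_write(good: set[str], failed: set[str]) -> tuple[set[str], set[str], int]:
    """Decide which bricks to accuse after a write.

    good   -- bricks holding a good copy before the write
    failed -- bricks on which the write did not succeed
    Returns (good copies afterwards, bricks accused, errno or 0).
    """
    survivors = good - failed
    if survivors:
        # At least one good copy took the write: accuse the ones that failed.
        return survivors, good & failed, 0
    # Every good copy went down mid-write: accuse nobody, so the last good
    # copy can still serve as heal source, and fail this write instead.
    return set(good), set(), errno.EIO  # EIO is an assumed placeholder


normal = finish_write({"b0", "b1"}, {"b1"})
in_flight = finish_write({"b0"}, {"b0"})
```

In the `in_flight` case the lone good copy `b0` is left unaccused and the write returns an error, so `b0` remains eligible as the heal source once it is back online.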
* tests/cli: Generate SSL certificates
  Author: Ashish Pandey | Age: 2016-08-21 | Files: 1 | Lines: -0/+17

  Generate SSL certificates before enabling management encryption to
  avoid test failure.

  Change-Id: Iab23b36703f4653f1d5bb9d14695e4d3fa63ad61
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>
  BUG: 1368349
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>
  Reviewed-on: http://review.gluster.org/15202
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Fix volume restart issue upon glusterd restart
  Author: Samikshan Bairagya | Age: 2016-08-17 | Files: 2 | Lines: -1/+41

  http://review.gluster.org/#/c/14758/ introduces a check in
  glusterd_restart_bricks that makes sure that if server quorum is
  enabled and if the glusterd instance has been restarted, the bricks do
  not get started. This prevents bricks which have been brought down
  purposely, say for maintenance, from getting started upon a glusterd
  restart. However this change introduced a regression for a situation
  that involves multiple volumes. The bricks from the first volume get
  started, but then for the subsequent volumes the bricks do not get
  started. This patch fixes that by setting the value of
  conf->restart_done to _gf_true only after bricks are started correctly
  for all volumes.

  Change-Id: I2c685b43207df2a583ca890ec54dcccf109d22c3
  BUG: 1367478
  Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
  Reviewed-on: http://review.gluster.org/15183
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
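
The regression and its fix come down to where the flag is set relative to the loop over volumes. The Python model below is a simplification under assumptions: the two functions are hypothetical, the per-volume quorum logic is reduced to a single boolean, and only the `restart_done` name is taken from the commit message.

```python
from types import SimpleNamespace


def restart_bricks_buggy(conf, volumes, quorum_enabled=True):
    """Model of the regression: the flag flips after the first volume."""
    started = []
    for bricks in volumes.values():
        if quorum_enabled and conf.restart_done:
            continue  # later volumes are skipped
        started.extend(bricks)
        conf.restart_done = True  # set too early
    return started


def restart_bricks_fixed(conf, volumes, quorum_enabled=True):
    """Model of the fix: the flag flips only after all volumes are handled."""
    started = []
    for bricks in volumes.values():
        if quorum_enabled and conf.restart_done:
            continue
        started.extend(bricks)
    conf.restart_done = True
    return started


volumes = {"vol1": ["vol1-b1"], "vol2": ["vol2-b1"]}
buggy = restart_bricks_buggy(SimpleNamespace(restart_done=False), volumes)
fixed = restart_bricks_fixed(SimpleNamespace(restart_done=False), volumes)
```

In this model `buggy` contains only the first volume's brick, while `fixed` contains the bricks of both volumes.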
* logging: Fix per xl log level
  Author: Poornima G | Age: 2016-08-15 | Files: 1 | Lines: -0/+43

  Currently per xlator loglevel setting doesn't work, due to the flaw in
  loglevel checking. Fix the same.

  Per xlator logging can be set using the below command:
  Eg: setfattr -n trusted.glusterfs.patchy-md-cache.set-log-level -v TRACE /mnt/glusterfs/0

  Change-Id: I8ff1d15bd5693b6f682d99bee22a4bbb5eee646c
  BUG: 1362520
  Signed-off-by: Poornima G <pgurusid@redhat.com>
  Reviewed-on: http://review.gluster.org/15071
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* gfapi: add missing glfs_truncate
  Author: Jeff Darcy | Age: 2016-08-11 | Files: 2 | Lines: -0/+119

  Change-Id: I80b016090a4d9d86278a0a5144dd58c0cbfe9bb2
  BUG: 1365489
  Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
  Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name>
  Reviewed-on: http://review.gluster.org/13927
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* glusterd: Convert volume to replica after adding brick self heal is not triggered
  Author: Mohit Agrawal | Age: 2016-08-11 | Files: 1 | Lines: -0/+54

  Problem: Adding a brick to a distribute volume to convert it to replica
  does not trigger self heal.

  Solution: Modify the condition in brick_graph_add_index to set the
  trusted.afr.dirty attribute in xlator.

  Test: To verify the patch, followed the steps below
  1) Create a single node volume
     gluster volume create <DIS> <IP:/dist1/brick1>
  2) Start volume and create mount point
     mount -t glusterfs <IP>:/DIS /mnt
  3) Touch some file and write some data on file
  4) Add another brick along with replica 2
     gluster volume add-brick DIS replica 2 <IP>:/dist2/brick2
  5) Before applying the patch, file size is 0 bytes in mount point.

  BUG: 1365455
  Change-Id: Ief0ccbf98ea21b53d0e27edef177db6cabb3397f
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
  Reviewed-on: http://review.gluster.org/15118
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Ravishankar N <ravishankar@redhat.com>
  Reviewed-by: Anuradha Talur <atalur@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* georep: tests for logrotate, create+rename, hard-link rename
  Author: Milind Changire | Age: 2016-08-09 | Files: 2 | Lines: -0/+96

  * log rotate
    eg. with rotate count as 2 and with the following files created
    x.log, x.log.1 and x.log.2, another iteration of logrotate results in
    the following operations to be performed
      x.log.2 is renamed to x.log.3
      x.log.1 is renamed to x.log.2
      x.log is renamed to x.log.1
      x.log.3 is deleted
      x.log is created [possible gfid allocated that belonged to x.log.3]
    when a file is created, there's a big possibility that a gfid used
    earlier can be reassigned and reused. This causes RENAME operations
    to fail on slave nodes when logs are replayed, typically on a Georep
    restart.

    A function called logrotate_simulate has been added to
    tests/geo-rep.rc to help simulate this situation. Starting from a
    clean state, logrotate_simulate has to be called 4 times with a
    rotate count of 2 to help the logs roll over and cause a delete at
    the end and a gfid reallocation.

    With the bug-fix in place, this test should not cause the Georep
    session to go into a Faulty state.

  * create+rename
    On log replay after Georep restart, a create+rename causes an entry
    to be created for the original file, but it cannot be linked to the
    gfid back-end since it is associated with the renamed file before the
    Georep restart.

    A function create_rename simulates the create+rename scenario and the
    function create_rename_ok tests if the dangling entry is present at
    the slave mount.

    With the bug-fix in place, a dangling entry with the original name
    should not be found at the slave mount after logs are replayed after
    a Georep restart.

  * hard-link rename
    This case is similar to the create+rename case except that this is a
    case of renaming hard-link to one of its other names.

    A function hardlink_rename has been added to tests/geo-rep.rc which
    simulates the creation and rename of hard-link. After a Georep
    session restart, the test function hardlink_rename_ok should not find
    the source link name on the slave.

    With the bug-fix in place, this test should not fail. If changelogs
    have been enabled on the slave as well, then the logs should show an
    UNLINK entry for the source link name, since a rename operation of a
    hard link to one of its names essentially just drops the source name
    as per the 'mv' command semantics.

  Change-Id: I85b196c00cf79a11bada25ef2fe5f1dc5c0c858a
  BUG: 1316389
  Signed-off-by: Milind Changire <mchangir@redhat.com>
  Reviewed-on: http://review.gluster.org/13663
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Aravinda VK <avishwan@redhat.com>
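
The claim that four calls are needed with a rotate count of 2 can be checked with a small model of the rename sequence. The Python below is not the shell function from tests/geo-rep.rc; `simulate_logrotate_pass` is a hypothetical stand-in that tracks file names in a set and does not model gfids.

```python
def simulate_logrotate_pass(files: set[str], base: str = "x.log", rotate: int = 2) -> list[str]:
    """Apply one logrotate iteration to `files`; return the names deleted."""
    deleted = []
    # Shift numbered logs, oldest first: x.log.2 -> x.log.3, x.log.1 -> x.log.2.
    for i in range(rotate, 0, -1):
        old = f"{base}.{i}"
        if old in files:
            files.remove(old)
            files.add(f"{base}.{i + 1}")
    # The live log becomes x.log.1.
    if base in files:
        files.remove(base)
        files.add(f"{base}.1")
    # Anything beyond the rotate count is deleted.
    overflow = f"{base}.{rotate + 1}"
    if overflow in files:
        files.remove(overflow)
        deleted.append(overflow)
    # A fresh live log is created (in GlusterFS this is where a gfid may be reused).
    files.add(base)
    return deleted


files: set[str] = set()  # clean state
history = [simulate_logrotate_pass(files) for _ in range(4)]
```

Starting from an empty set, the first three passes delete nothing and only the fourth deletes `x.log.3`, which is why the test has to call the simulation four times before a delete and a possible gfid reallocation can occur.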
* glusterd : skip non directories inside /var/lib/glusterd/vols
  Author: Jiffin Tony Thottan | Age: 2016-08-08 | Files: 1 | Lines: -0/+31

  Right now glusterd won't come up if the vols directory contains an
  invalid entry. With this change, a message is logged and that entry is
  skipped instead.

  Change-Id: I665b5c35291b059cf054622da0eec4db44ec5f68
  BUG: 1318591
  Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
  Reviewed-on: http://review.gluster.org/13764
  Reviewed-by: Prashanth Pai <ppai@redhat.com>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
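
The log-and-skip behaviour can be sketched as a directory scan that tolerates stray entries. This Python example only illustrates the idea: `volume_dirs` is a hypothetical helper, the demo uses a temporary directory rather than /var/lib/glusterd/vols, and the log message text is invented.

```python
import logging
import os
import tempfile

log = logging.getLogger("vols-scan")


def volume_dirs(vols_path: str) -> list[str]:
    """Return sub-directory names, logging and skipping anything else."""
    names = []
    with os.scandir(vols_path) as entries:
        for entry in entries:
            if not entry.is_dir():
                log.warning("skipping non-directory entry %s", entry.path)
                continue
            names.append(entry.name)
    return sorted(names)


# Demo on a throwaway directory holding one volume dir and one stray file.
with tempfile.TemporaryDirectory() as vols:
    os.mkdir(os.path.join(vols, "patchy"))
    open(os.path.join(vols, "stray-file"), "w").close()
    found = volume_dirs(vols)
```

The stray file no longer aborts the scan; only `patchy` is returned.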
* posix: Do not move and recreate .glusterfs/unlink directory
  Author: Ashish Pandey | Age: 2016-08-08 | Files: 1 | Lines: -0/+36

  Problem: At the time of start of a volume, it is checked whether
  .glusterfs/unlink exists or not. If it does, it is moved to landfill
  and the unlink directory is recreated. If a volume is mounted and we
  write data on it till we face ENOSPC, restart of that volume fails as
  it will not be able to create the unlink dir. mkdir will fail with
  ENOSPC. This will not allow the volume to restart.

  Solution: If the .glusterfs/unlink directory exists, don't move it to
  landfill. Delete all the entries inside it.

  Change-Id: Icde3fb36012f2f01aeb119a2da042f761203c11f
  BUG: 1360679
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>
  Reviewed-on: http://review.gluster.org/15030
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
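
The point of the solution is that emptying an existing directory needs no new directory allocation, whereas move-and-recreate needs a mkdir that can fail with ENOSPC. A minimal Python sketch of the "empty in place" approach follows; `prepare_unlink_dir` is a hypothetical helper, not the posix xlator code, and the demo runs against a temporary directory.

```python
import os
import shutil
import tempfile


def prepare_unlink_dir(brick_root: str) -> str:
    """Ensure <brick>/.glusterfs/unlink exists and is empty, reusing it if present."""
    unlink_dir = os.path.join(brick_root, ".glusterfs", "unlink")
    if not os.path.isdir(unlink_dir):
        os.makedirs(unlink_dir)  # only needed the first time
        return unlink_dir
    with os.scandir(unlink_dir) as it:
        entries = list(it)
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            shutil.rmtree(entry.path)
        else:
            os.unlink(entry.path)
    return unlink_dir


# Demo: a brick whose unlink directory already holds a stale entry.
with tempfile.TemporaryDirectory() as brick:
    stale_dir = os.path.join(brick, ".glusterfs", "unlink")
    os.makedirs(stale_dir)
    open(os.path.join(stale_dir, "stale-gfid"), "w").close()
    inode_before = os.stat(stale_dir).st_ino
    prepared = prepare_unlink_dir(brick)
    remaining = os.listdir(prepared)
    inode_after = os.stat(prepared).st_ino
```

After the call the directory is empty and is still the same directory (same inode), so no mkdir was needed on the restart path.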
* tests: fix spurious failure in tests/bugs/glusterd/bug-1089668.t
  Author: Atin Mukherjee | Age: 2016-08-04 | Files: 1 | Lines: -2/+1

  Instead of rebalance stop, it's always better to wait for rebalance to
  complete, as the former doesn't have any purpose.

  Change-Id: Ia1bc2a34d937a0a96543bebd257dcda619f12474
  BUG: 1363948
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-on: http://review.gluster.org/15085
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* libglusterfs: fix glusterd statedump crash
  Author: Atin Mukherjee | Age: 2016-08-04 | Files: 1 | Lines: -1/+6

  commit 3c04a91 removed setting typeStr to NULL if num_allocs is set to
  0; this has caused this regression. The code has been put back as it
  was earlier and, to avoid statedump printing all the NULL values, the
  check is modified to skip the records if num_allocs is 0 instead of
  total_allocs.

  Change-Id: Ib8bcc2fba908e88cf52b641c3f6bcba74f5e667c
  BUG: 1359190
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-on: http://review.gluster.org/14987
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: N Balachandran <nbalacha@redhat.com>
  Reviewed-by: Prashanth Pai <ppai@redhat.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* glusterd: clean up old port and allocate new one on every restart
  Author: Atin Mukherjee | Age: 2016-08-03 | Files: 3 | Lines: -49/+5

  GlusterD as of now was blindly assuming that the brick port which was
  already allocated would be available to be reused and that assumption
  is absolutely wrong.

  Solution : On first attempt, we thought GlusterD should check if the
  already allocated brick ports are free, if not allocate new port and
  pass it to the daemon. But with that approach there is a possibility
  that if PMAP_SIGNOUT is missed out, the stale port will be given back
  to the clients where connection will keep on failing. Now given the
  port allocation always start from base_port, if everytime a new port
  has to be allocated for the daemons, the port range will still be under
  control. So this fix tries to clean up old port using
  pmap_registry_remove () if any and then goes for pmap_registry_alloc ()

  Change-Id: If54a055d01ab0cbc06589dc1191d8fc52eb2c84f
  BUG: 1221623
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-on: http://review.gluster.org/15005
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Avra Sengupta <asengupt@redhat.com>
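
The "remove, then allocate from base_port" sequence can be sketched with a toy registry. This Python model is an illustration under assumptions, not glusterd's pmap code: `PortRegistry` and `port_for_restart` are hypothetical, the base port value 49152 is assumed, and the allocation policy (lowest free port at or above the base) is a simplification.

```python
class PortRegistry:
    """Toy port map: port number -> owning brick."""

    def __init__(self, base_port: int) -> None:
        self.base_port = base_port
        self.owner: dict[int, str] = {}

    def remove(self, brick: str) -> None:
        """Drop any port still registered for this brick (stale or not)."""
        for port in [p for p, b in self.owner.items() if b == brick]:
            del self.owner[port]

    def alloc(self, brick: str) -> int:
        """Hand out the lowest unregistered port at or above base_port."""
        port = self.base_port
        while port in self.owner:
            port += 1
        self.owner[port] = brick
        return port


def port_for_restart(registry: PortRegistry, brick: str) -> int:
    """On every brick (re)start: clean up the old entry, then allocate afresh."""
    registry.remove(brick)
    return registry.alloc(brick)


registry = PortRegistry(base_port=49152)  # assumed base port
first = port_for_restart(registry, "brick-1")
second = port_for_restart(registry, "brick-2")
again = port_for_restart(registry, "brick-1")  # restart of brick-1
```

Because the old entry is removed before allocating, a brick never keeps a stale registration, and because allocation restarts from the base port, the range in use stays bounded by the number of bricks.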
* tests: Fix get_pending_heal_count check in ec | Ravishankar N | 2016-07-29 | 11 | -4/+4
| | | | | | | | | | | | | | | | Continuation of http://review.gluster.org/#/c/14985. Also renamed tests/bugs/disperse to tests/bugs/ec for a better correlation to tests/basic/ec and xlators/cluster/ec Change-Id: I662b3477c12af8a0b94597769e8f00f354b1168c BUG: 1332054 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/15006 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* io-threads: remove least-rate-limit option and code | Jeff Darcy | 2016-07-28 | 1 | -53/+0
| | | | | | | | | | | | | | This will be unnecessary, and mostly in the way, as real fairness guarantees are implemented. Change-Id: Ic61ec1c9e9add58385f1a4eafcfe2cc554ceefc8 BUG: 1360402 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/14989 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: Gluster Build System <jenkins@build.gluster.org>
* snapshot/snapd: Don't display pid when snapd is offline | Avra Sengupta | 2016-07-27 | 1 | -0/+6
| | | | | | | | | | | | | | | | We were previously reading the pidfile, and displaying the pid even if snapd daemon is not running. Now to fix it, we re-assign pid value to -1, if snapd is offline. Change-Id: I4baff8d489fe9380061c52aea006db90fa421cd7 BUG: 1358244 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/14981 Tested-by: Vijay Bellur <vbellur@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* tests: Fix the spurious failure in libgfapi-fini-hang.t | Poornima G | 2016-07-27 | 2 | -8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | RCA: After running libgfapi-fini-hang, there is an EXPECT_WITHIN which waits for PROCESS_UP_TIMEOUT (20s) for the process libgfapi-fini-hang to die. Currently EXPECT_WITHIN returns success even if the process libgfapi-fini-hang is alive. This is because "pgrep libgfapi-fini-hang" in check_process() returns 1 (no process alive) even if the process is alive. The man page of pgrep says "The process name used for matching is limited to the 15 characters". Hence the executable is renamed from libgfapi-fini-hang to gfapi-hang, so that it falls within the limit. As explained, the failure is not because there was a hang (logs show that glfs_set_volfile_server was still executing), but because EXPECT_WITHIN was not really waiting. Hence there was a race between the execution of the process libgfapi-fini-hang and the kill. Change-Id: I257715865e0d3e5a14f83d1e235c01899e1cae68 BUG: 1358594 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/14997 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.org>
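The 15-character limit from the RCA above can be simulated in a few lines of Python. This is a rough model, assuming that the stored process name is simply the executable name cut to 15 characters and that the pattern is searched as a regular expression in that stored name. `comm_name` and `pgrep_like` are hypothetical helpers and do not call the real pgrep.

```python
import re

COMM_MAX = 15  # length limit quoted from the pgrep man page


def comm_name(executable):
    # Model of the truncated process name that the matcher sees.
    return executable[:COMM_MAX]


def pgrep_like(pattern, running):
    # Return the executables whose truncated name matches the pattern.
    return [name for name in running if re.search(pattern, comm_name(name))]


# 18-character name: the stored name is "libgfapi-fini-h", so no match.
long_hits = pgrep_like("libgfapi-fini-hang", ["libgfapi-fini-hang"])

# 10-character name: fits within the limit, so the match succeeds.
short_hits = pgrep_like("gfapi-hang", ["gfapi-hang"])

print(long_hits, short_hits)
```

Under these assumptions, the check for the long name reports "no process alive" although the process is running, which is the false success described in the commit message.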
* tests: Fix tests/bitrot/bug-1244613.t | Kotresh HR | 2016-07-27 | 1 | -0/+3
| | | | | | | | | | | | | | Wait for gluster nfs to initialize before attempting the nfs mount. Change-Id: I4bd9579ad5368935cf62632a5d612f89fce5979f BUG: 1360682 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/15028 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* afr: some coverity fixes | Ravishankar N | 2016-07-26 | 4 | -7/+6
| | | | | | | | | | | | | | | Thanks to Krutika for a cleaner way to track inode refs in afr_set_split_brain_choice(). Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df BUG: 1355604 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/14895 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* tests: Moving ./tests/bugs/snapshot/bug-1316437.t to bad test | Avra Sengupta | 2016-07-25 | 1 | -0/+2
| | | | | | | | | | | | | | | | Moving ./tests/bugs/snapshot/bug-1316437.t to bad test, while mulling over the pros and cons of the fix. Will update the bug, as we go. Sending this patch to unblock master. Change-Id: Ia863312913686b4fa0ee0b63da13aedc0439a835 BUG: 1359717 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15001 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* tests: Remove hard coding in get_aux | Pranith Kumar K | 2016-07-22 | 1 | -4/+12
| | | | | | | | | | | Change-Id: Ie007d8006a2f2be0187f0c73d46ec6dda2a68a6b Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14988 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Jeff Darcy <jdarcy@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* tests: Fix spurious failures with split-brain-favorite-child-policy.t | Pranith Kumar K | 2016-07-22 | 1 | -0/+17
| | | | | | | | | | | | | | | | | | | | | | | Problem: It is not guaranteed that the self-heal daemon applies the new option as soon as volume set is executed, because all the command guarantees is that the process is notified of the change in volfile. Shd still needs to fetch the volfile and reconfigure. If the next volume heal command comes before the reconfigure happens, the heal won't happen. Fix: Restart shd to make sure it has the option loaded with the new value. BUG: 1358976 Change-Id: I3ed30ebbec17bd06caa632e79e9412564f431b19 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14978 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Jeff Darcy <jdarcy@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* tests: Fix pending-heal-count checks | Pranith Kumar K | 2016-07-22 | 1 | -4/+2
| | | | | | | | | | | | | | | | | | EXPECT_WITHIN takes a regular expression to match the count, so even when there are, say, 10 entries to heal, it would think that the heal is complete. Fixed checking the pending heal count with the correct regex. Thanks to Xavi for finding this problem. Change-Id: Ic593d22468b2b586bfca864962ffa0eda96b1d1f BUG: 1332054 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14985 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
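The regex pitfall described above is easy to reproduce. The Python sketch below assumes the expected value is applied as an unanchored regular-expression search against the command output; `matches` is a hypothetical helper, not the test framework's own code, and the anchored pattern `^0$` is shown as one plausible form of a "correct regex", not necessarily the one used in the patch.

```python
import re


def matches(expected, actual):
    # Unanchored search: the pattern may match anywhere in the output.
    return re.search(expected, actual) is not None


loose_on_ten = matches("0", "10")       # "0" is found inside "10"
strict_on_ten = matches("^0$", "10")    # anchored: the whole output must be "0"
strict_on_zero = matches("^0$", "0")

print(loose_on_ten, strict_on_ten, strict_on_zero)
```

With the unanchored pattern, a pending heal count of 10 (or 20, 100, and so on) is accepted as "0", so the test proceeds as if healing had finished.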
* tests: Fix timing issue in ec.t | Pranith Kumar K | 2016-07-22 | 1 | -2/+2
| | | | | | | | | | | | | | | | | | | | | | Problem: Because of a timing issue, the mount is sometimes unmounted even before the version is updated, which leads to heals not being triggered. Fix: One way to fix this would be to increase 'sleep 2' to 'sleep 10', but that would slow things down. I changed the way ec learns it needs xattr healing so that it triggers heals even when the xattrs are not marked correctly. Change-Id: I1c82041166443ae7079dd99b89ea2ed170233ba3 BUG: 1359001 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14980 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* tests: Fix spurious failure of br-stub.t | Kotresh HR | 2016-07-21 | 1 | -0/+3
| | | | | | | | | | | | | | | | | | | | | | The nfs mount fails occasionally in ./tests/bitrot/br-stub.t. The reason being nfs mount is attempted before the gluster nfs has come up. It is a race and hence happens occasionally. The patch fixes it by waiting for nfs server to come up before mount. Thanks skoduri@redhat.com for root causing it. Change-Id: I3adbf2363514635785c02b1478733095ad0b74cf BUG: 1358114 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/14960 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Poornima G <pgurusid@redhat.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* tests: Enable all gfapi test cases | Poornima G | 2016-07-20 | 32 | -103/+134
| | | | | | | | | | | Change-Id: I32bfec4af91348d96dc3e81a9d5c9cad599f821b Bug: 1358594 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/14748 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* tests: Fix spurious failure of tests/bugs/glusterd/bug-1111041.t | Avra Sengupta | 2016-07-20 | 1 | -6/+2
| | | | | | | | | | | | | | | | | | On a faster machine the ps check was returning two pids, including the glusterfsd process's pid, right after that process forked. Hence removing that ps check, as for the scope of this test, verifying the snapd pid from the status command itself is enough. Change-Id: I8bd8fc4ea406d96e3a47f952cfe44560b615dbe6 BUG: 1358195 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/14963 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* md-cache: Add cache invalidation support to invalidate the meta data cache | Poornima G | 2016-07-20 | 1 | -7/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: md-cache currently updates its stat in cbks of selected fops. The default cache time is 1 second; if this is increased to reap the benefits of caching, we may end up with a stale cache for a long time, as there is no logic yet to notify md-cache of backend changes by another client. Solution: Use the existing upcall mechanism to invalidate the cache. For this feature to work, the "features.cache-invalidation" volume option should be enabled. This patch by itself doesn't improve performance; its benefit is that it provides coherency for the stat cache, so the cache timeout can be much longer, which in turn can improve performance. Change-Id: I2dbb0afa7b5e4a5a248f910188e0918e02f18692 BUG: 1211863 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/12951 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
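The trade-off between a long cache timeout and coherency can be sketched with a toy stat cache in Python. Everything here is an assumption made for illustration: `StatCache`, its `invalidate` method, the 600-second timeout, and the injected clock are hypothetical and do not mirror md-cache's real structures or the upcall API.

```python
class StatCache:
    """Toy time-based stat cache with explicit invalidation."""

    def __init__(self, timeout, clock):
        self.timeout = timeout  # seconds an entry stays valid
        self.clock = clock      # injected clock, for deterministic runs
        self.entries = {}       # path -> (stat, stored_at)

    def put(self, path, stat):
        self.entries[path] = (stat, self.clock())

    def get(self, path):
        entry = self.entries.get(path)
        if entry is None:
            return None
        stat, stored_at = entry
        if self.clock() - stored_at > self.timeout:
            # Entry is older than the timeout: drop it and report a miss.
            del self.entries[path]
            return None
        return stat

    def invalidate(self, path):
        # Called when a notification says another client changed the file.
        self.entries.pop(path, None)


now = [0.0]
cache = StatCache(timeout=600, clock=lambda: now[0])
cache.put("/file", {"size": 10})

now[0] = 5.0
before = cache.get("/file")  # still cached: 5 s is well within 600 s

cache.invalidate("/file")    # simulated invalidation notification
after = cache.get("/file")   # miss: the next access must refetch

print(before, after)
```

Without the `invalidate` call, the stale entry would be served for up to the full timeout, which is why a long timeout is only safe once invalidation is in place.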
* xlator/trash : append '/' at the end in trash_notify_lookup_cbk | Jiffin Tony Thottan | 2016-07-19 | 1 | -0/+29
| | | | | | | | | | | | | | | | | | In the notify function in trash xlator, a lookup is performed to obtain the path of the old trash directory. The result usually contains the path without '/' at the end. The trash xlator expects '/' at the end for values such as 'old trash dir' and 'new trash dir'. Otherwise certain checks in the code will fail. Change-Id: I89e02e4b249314fb6536297f959865feee182c83 BUG: 1357397 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/14938 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Anoop C S <anoopcs@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
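As a rough illustration of why the trailing '/' matters, the Python sketch below normalises a directory path before comparing it. `with_trailing_slash` is a hypothetical helper and "/.trashcan" is only an example path; the actual checks inside the trash xlator are not shown in this log, so the comparison here is an assumption about the kind of check that can fail.

```python
def with_trailing_slash(path):
    # Append '/' only when it is missing, so the call is idempotent.
    return path if path.endswith("/") else path + "/"


looked_up = "/.trashcan"   # assumed lookup result, without the final '/'
expected = "/.trashcan/"   # assumed stored form, with the final '/'

raw_equal = looked_up == expected
fixed_equal = with_trailing_slash(looked_up) == expected

print(raw_equal, fixed_equal)
```

Normalising once, at the point where the path is obtained, keeps every later comparison consistent.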
* Revert "tests: remove tests for clear-locks" | Pranith Kumar K | 2016-07-18 | 3 | -0/+114
| | | | | | | | | | | | | | | | | | This reverts commit 0086a55bb7de1ef5dc7a24583f5fc2b560e835fd. As part of Richard's patch for lock-revocation feature this bug is completely fixed (I think at least ;-) ). So bringing these back so that we will find out if there are any more things we need to address in this code path. BUG: 1350867 Change-Id: If1440fc83b376576ae1a77b1156188a6bf53fe3a Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14817 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* features/locks: Add lock revocation functionality to posix locks translator | Richard Wareing | 2016-07-18 | 1 | -0/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Motivation: Prevents cluster instability by mis-behaving clients causing bricks to OOM due to inode/entry lock pile-ups. - Adds option to strip clients of entry/inode locks after N seconds - Adds option to clear ALL locks should the revocation threshold get hit - Adds option to clear all or granted locks should the max-blocked threshold get hit (can be used in combination w/ revocation-clear-all). - Options are: features.locks-revocation-secs <integer; 0 to disable> features.locks-revocation-clear-all [on/off] features.locks-revocation-max-blocked <integer> - Adds monkey-locking option to ignore 1% of unlock requests (dev only) features.locks-monkey-unlocking [on/off] - Adds logging to indicate revocation event & reason Test Plan: First you will need TWO fuse mounts for this repro. Call them /mnt/patchy1 & /mnt/patchy2. 1. Enable monkey unlocking on the volume: gluster vol set patchy features.locks-monkey-unlocking on 2. From the "patchy1" mount, use dd or some other utility to begin writing to a file; eventually the dd will hang due to the dropped unlock requests. This now simulates the broken client. Run: for i in {1..1000};do dd if=/dev/zero of=/mnt/patchy1/testfile bs=1k count=10;done ...this will eventually hang as the unlock request has been lost. 3. Go to another window and set up the mount "patchy2" at /mnt/patchy2, and observe that 'echo "hello" >> /mnt/patchy2/testfile' will hang due to the inability of the client to take out the required lock. 4. Next, restart the test, this time enabling lock revocation; use a timeout of 2-5 seconds for testing: 'gluster vol set patchy features.locks-revocation-secs <2-5>' 5. Wait 2-5 seconds before executing step 3 above this time. Observe that this time the access to the file will succeed, and the writes on patchy1 will unblock until they hit another failed unlock request due to "monkey-unlocking". 
BUG: 1350867 Change-Id: I814b9f635fec53834a26db634d1300d9a61057d8 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14816 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* afr, index: Clean up stale directory and file indices in granular entry sh | Krutika Dhananjay | 2016-07-11 | 1 | -6/+4
| | | | | | | | | | | | | | | | | | | | | | | | Specifically when a directory tree is removed (rm -rf) while a brick is down, both the directory index and the name indices of the files and subdirs under it will remain. Self-heal will need to pick up these and remove them. Towards this, afr sh will now also crawl indices/entry-changes and call an rmdir on the dir if the directory index is stale. On the brick side, rmdir fop has been implemented for index xl, which would delete the directory index and its contents if present in a synctask. Change-Id: I8b527331c2547e6c141db6c57c14055ad1198a7e BUG: 1331323 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/14832 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* tests: fix rebalance timing issue | Sakshi Bansal | 2016-07-11 | 1 | -0/+2
| | | | | | | | | | | | | | | | With a start and stop rebalance, the stop command may fail because the rebalance process may not have come up by that time. Using the rebalance status command to ensure that the rebalance process is up before stopping rebalance. Change-Id: I3d5123cd5dfabde2720428455b257d11b980ce21 BUG: 1354372 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/14885 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* tests: miscellaneous improvements | Jeff Darcy | 2016-07-11 | 2 | -10/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a combination of three previous low-impact changes, combined to reduce patch-pushing burden. ((( GF_INTERACTIVE ))) To use this, just define GF_INTERACTIVE (value doesn't matter as long as the length is non-zero) before running your test. It replaces the TEST alias with one that will prompt you before executing that line. You can answer: 'y' to execute the line 'q' to exit the test immediately anything else to skip this line and continue This is particularly useful to inspect state in another window while a test is paused, or to do manual experimentation in the (often complex) configuration created during a test. ((( CLEANUP.SH ))) tests: add cleanup.sh Often, a developer might want to run a test up to some point, then bail out and poke around manually. That leaves state that needs to be cleaned up before the next test can run properly. This patch adds a trivial script to invoke that cleanup machinery. Along the way, code in include.rc to find env.rc was changed to be more robust across arbitrarily deep (or shallow) directory hierarchies. ((( REPLACE EXISTING TAR FILES INSTEAD OF APPENDING ))) We currently use "tar rf" to collect log files from each test. This *appends* the new data to whatever's there already, which has two bad effects when a test is run repeatedly. * Ever-increasing size of the tar file. * Ever-increasing time to extract logs from the tar file, with each copy completely overwriting any previous. This doesn't seem to be a problem in our regression tests, because the entire directory is nuked during package removal and reinstallation. However, when running a test repeatedly during a debug session, the effects can be quite severe. This is particularly evident with JBR, because the "logs" that get archived include large journal files. 
Certain other translators, such as changelog and CTR, might be prone to similar effects. There's no point to having multiple copies of the logs in each tar file. As far as I know, nobody ever takes advantage of that. Therefore, use "tar cf" to overwrite any existing archive instead of appending. This change also handles excluding other .tar files in a portable way. Change-Id: Iebf77d496a71976c321bbacb49776338a9da586f Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/14874 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
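The difference between appending to an archive ("tar rf") and replacing it ("tar cf") can be reproduced with Python's standard `tarfile` module, whose "a" and "w" modes behave analogously. This is an illustrative sketch only: the file names are made up, and it does not reproduce the test framework's actual log collection code.

```python
import os
import tarfile
import tempfile


def archive(tar_path, log_path, mode):
    # mode "a" appends to an existing archive; mode "w" starts a new one.
    with tarfile.open(tar_path, mode) as tar:
        tar.add(log_path, arcname="test.log")


def member_count(tar_path):
    with tarfile.open(tar_path, "r") as tar:
        return len(tar.getmembers())


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "test.log")
    with open(log_path, "w") as log_file:
        log_file.write("log line\n")

    appended = os.path.join(tmp, "appended.tar")
    replaced = os.path.join(tmp, "replaced.tar")

    # Simulate running the same test three times.
    for _ in range(3):
        archive(appended, log_path, "a")
        archive(replaced, log_path, "w")

    appended_count = member_count(appended)
    replaced_count = member_count(replaced)

print(appended_count, replaced_count)
```

After three runs the appended archive holds three copies of the same log while the replaced archive holds one, which matches the growth in size and extraction time described above.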