Commit message | Author | Age | Files | Lines
* cli/quota: Sort the list output alphabetically by path | vmallika | 2016-04-27 | 7 | -5/+160

    This is a backport of http://review.gluster.org/14000

    > Change-Id: I0b124e119d167817be2ae3eb52ac6c80fc7db5d1
    > BUG: 1320716
    > Signed-off-by: vmallika <vmallika@redhat.com>
    > Reviewed-on: http://review.gluster.org/14000
    > Smoke: Gluster Build System <jenkins@build.gluster.com>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Kaushal M <kaushal@redhat.com>

    Change-Id: I87e12d58c8e267b2af67e287998e7313efc70af4
    BUG: 1330018
    Signed-off-by: vmallika <vmallika@redhat.com>
    Reviewed-on: http://review.gluster.org/14061
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
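The commit message does not show how the sorting is done, so the following is only a minimal sketch of sorting list entries by path with `qsort` and `strcmp`. The `struct quota_entry` record and the function names are hypothetical and are not taken from the gluster CLI sources.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one row of "quota list" output. */
struct quota_entry {
    const char *path;
    unsigned long long hard_limit;
};

/* qsort comparator: plain byte-wise ordering of the path strings. */
static int cmp_by_path(const void *a, const void *b)
{
    const struct quota_entry *ea = a;
    const struct quota_entry *eb = b;

    return strcmp(ea->path, eb->path);
}

/* Sort the rows in place so they can be printed alphabetically by path. */
void sort_quota_entries(struct quota_entry *entries, size_t count)
{
    qsort(entries, count, sizeof(entries[0]), cmp_by_path);
}
```

With this ordering a parent directory such as `/a` sorts before its children such as `/a/c`, because a string sorts before any longer string it is a prefix of.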
* afr: replica pair going offline does not require CHILD_MODIFIED event | Sakshi Bansal | 2016-04-27 | 3 | -1/+8

    As part of the CHILD_MODIFIED event, DHT forgets the current layout and
    performs a fresh lookup. However, this is not required when a replica
    pair goes offline, as the xattrs can be read from other replica pairs.
    Hence, set a different event to handle a replica pair going down.

    > Backport of http://review.gluster.org/#/c/12573/
    > Change-Id: I5ede2a6398e63f34f89f9d3c9bc30598974402e3
    > BUG: 1281230
    > Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
    > Reviewed-on: http://review.gluster.org/12573
    > Reviewed-by: Ravishankar N <ravishankar@redhat.com>
    > Reviewed-by: Susant Palai <spalai@redhat.com>
    > Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    > Tested-by: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Jeff Darcy <jdarcy@redhat.com>

    Change-Id: Ida30240d1ad8b8730af7ab50b129dfb05264fdf9
    BUG: 1283972
    Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
    Reviewed-on: http://review.gluster.org/12767
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* glusterd: fix validation of lower op-version check in volume set | Atin Mukherjee | 2016-04-26 | 2 | -2/+24

    Commit 2d87a98 introduced a validation that rejects attempts to lower
    cluster.op-version. Commit 2eb8758 then changed the value being compared
    from the cluster's op-version to the volume's op-version, which broke
    that check.

    Change-Id: I70df32b75c3a3fe47dc840c4a655059e5b124bca
    BUG: 1330545
    Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
    Reviewed-on: http://review.gluster.org/14069
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.org/14077
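As a rough illustration of the check this commit restores, the sketch below compares the requested value against the cluster's op-version rather than a volume's. The function name and the return convention are assumptions made for this example and are not glusterd's actual code.

```c
#include <errno.h>

/*
 * Hypothetical validator: a request to set cluster.op-version is rejected
 * when it would lower the value. The comparison must use the cluster-wide
 * op-version; comparing against a volume's op-version is the mistake the
 * commit message describes.
 */
int validate_cluster_op_version(int cluster_op_version, int requested)
{
    if (requested < cluster_op_version)
        return -EINVAL;   /* lowering the cluster op-version is not allowed */

    return 0;             /* equal or higher values pass this check */
}
```

A caller would pass the cluster's current value as the first argument, so a request below it fails even when some volume's own op-version is lower still.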
* quota: setting 'read-only' option in xdata to instruct DHT to not heal | Sakshi Bansal | 2016-04-26 | 3 | -2/+19

    When quota is enabled, the quota enforcer tries to get the size of the
    source directory by sending a nameless lookup to quotad. But if the
    rename has succeeded on even one subvol, or the source layout has
    anomalies, this nameless lookup in quotad tries to heal the directory,
    which requires a lock on as many subvols as it can. But src is already
    locked as part of the rename. For the rename to proceed in the brick, it
    needs to complete a cluster-wide lookup. But the cluster-wide lookup in
    quotad is blocked on locks held by the rename, hence a deadlock.

    To avoid this, quota sends an option in xdata which instructs DHT not
    to heal.

    Backport of http://review.gluster.org/#/c/13988/

    > Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0
    > BUG: 1252244
    > Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
    > Reviewed-on: http://review.gluster.org/13988
    > Smoke: Gluster Build System <jenkins@build.gluster.com>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    > Tested-by: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>

    Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0
    BUG: 1328473
    Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
    Reviewed-on: http://review.gluster.org/14031
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    Tested-by: Raghavendra G <rgowdapp@redhat.com>
* tier/libgfdb: Ordering query results from libgfdb | Joseph Fernandes | 2016-04-26 | 1 | -5/+23

    When querying, we order the query result to get the hottest or the
    coldest files in the queried list, so that these files are migrated
    first. For now we give priority to the write heat (time and counters),
    because a composite ordering of write and read requires complex queries
    and has an impact on performance.

    Backport of http://review.gluster.org/13607

    > Change-Id: I2e0415dcfad4218b42c68fc5c2ed8d1f075ce9ea
    > Signed-off-by: Joseph Fernandes <josferna@redhat.com>
    > Reviewed-on: http://review.gluster.org/13607
    > Smoke: Gluster Build System <jenkins@build.gluster.com>
    > Tested-by: Joseph Fernandes
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Dan Lambright <dlambrig@redhat.com>

    Signed-off-by: Joseph Fernandes <josferna@redhat.com>
    Change-Id: If5fad07f8d0f50016b10e256803abd5266cd708f
    BUG: 1323017
    Reviewed-on: http://review.gluster.org/13881
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Joseph Fernandes
    Tested-by: Joseph Fernandes
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* dht/rebalance: Handle GF_DEFRAG_STOP | Susant Palai | 2016-04-26 | 1 | -0/+17

    Backport of http://review.gluster.org/14004

    Problem: On a rebalance stop, the migrator threads don't notify the
    crawler thread to wake up in case it is waiting on a signal from a
    migrator thread.

    BUG: 1330529
    Change-Id: I9019a715c7b4673b8bb5a75d7d33a18add85ce33
    Reviewed-by: N Balachandran <nbalacha@redhat.com>
    Signed-off-by: Susant Palai <spalai@redhat.com>
    Reviewed-on: http://review.gluster.org/14076
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Smoke: Gluster Build System <jenkins@build.gluster.com>
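The general pattern behind this kind of fix can be sketched with a condition variable: the waiter re-checks a "stopped" flag in its wait loop, and whoever sets that flag also broadcasts on the condition. Everything below (`struct defrag_state`, the function names, the queue-depth rule) is a hypothetical model and not the DHT rebalance code.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state between one crawler and several migrators. */
struct defrag_state {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             queued;   /* files waiting to be migrated */
    bool            stopped;  /* set when a stop is requested */
};

void defrag_state_init(struct defrag_state *d)
{
    pthread_mutex_init(&d->lock, NULL);
    pthread_cond_init(&d->cond, NULL);
    d->queued = 0;
    d->stopped = false;
}

/*
 * Crawler side: block while the queue is full, but also leave the wait
 * when a stop was requested. Returns false if the crawler should exit.
 */
bool crawler_wait_for_room(struct defrag_state *d, int max_queued)
{
    bool keep_going;

    pthread_mutex_lock(&d->lock);
    while (!d->stopped && d->queued >= max_queued)
        pthread_cond_wait(&d->cond, &d->lock);
    keep_going = !d->stopped;
    pthread_mutex_unlock(&d->lock);

    return keep_going;
}

/* Migrator side: on stop, set the flag AND wake any waiting crawler. */
void migrator_on_stop(struct defrag_state *d)
{
    pthread_mutex_lock(&d->lock);
    d->stopped = true;
    pthread_cond_broadcast(&d->cond);
    pthread_mutex_unlock(&d->lock);
}
```

Without the broadcast in `migrator_on_stop`, a crawler already blocked in `pthread_cond_wait` would never observe the flag, which is the hang the commit message describes.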
* clone/snapshot: Save restored_from_snap for clones | Avra Sengupta | 2016-04-26 | 1 | -0/+3

    Bricks of cloned volumes are LVM bricks mounted in /run/gluster, which
    is cleared when the node reboots. Hence, these brick paths need to be
    recreated on glusterd restart and the appropriate LVM volumes mounted.

    Backport of:

    > Change-Id: I6da086288c0dbdcedf3a20fd53f25e3728bea473
    > BUG: 1328010
    > Signed-off-by: Avra Sengupta <asengupt@redhat.com>
    > Reviewed-on: http://review.gluster.org/14021
    > Smoke: Gluster Build System <jenkins@build.gluster.com>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>

    Change-Id: I02d10e6bca99f6d78b50cb91c677e33a8dcefa17
    BUG: 1329989
    Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
    Reviewed-on: http://review.gluster.org/14059
    Reviewed-by: mohammed rafi kc <rkavunga@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Avra Sengupta <asengupt@redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* snapshot/quota: Copy quota.cksum during snapshot operations | Avra Sengupta | 2016-04-26 | 4 | -21/+40

    Backport of http://review.gluster.org/#/c/13760/

    A volume that has a quota.conf file should always have a quota.cksum
    file too. Based on this assumption, modify glusterd_copy_quota_files()
    to always copy quota.cksum if quota.conf is present. This change takes
    effect when a snapshot is created, restored or cloned.

    Change-Id: Ia49dc26eacef32eeb8f7d7d9553c80e304b08779
    BUG: 1329492
    Signed-off-by: Avra Sengupta <asengupt@redhat.com>
    Reviewed-on: http://review.gluster.org/13760
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
    Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
    (cherry picked from commit 8f3ad1e3ede77fa5f8c8d606e18a7e83865a822c)
    Reviewed-on: http://review.gluster.org/14047
* packaging: additional dirs and files in /var/lib/glusterd/ | Kaleb S KEITHLEY | 2016-04-26 | 2 | -75/+78

    Add the missing /var/lib/glusterd files and dirs found by downstream
    testing.

    Use a loop to create hook dirs instead of open-coding.

    Merge the %ghost and non-ghost dirs in -server %files section for easier
    maintenance.

    Eliminate a benign warning for enabling non-existent
    glusterfsd.{init,service} which is only relevant to Fedora koji builds.

    Don't reject glusterfs.spec.in changes because of long lines.

    Backport of

    > Change-Id: I5802175d729e0168eea879a2a61626b0b73d77c8
    > BUG: 1326410
    > http://review.gluster.org/13981

    Change-Id: Ica0f54e63ec01056263a27bbcd4a10e469f67e42
    BUG: 1326413
    Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
    Reviewed-on: http://review.gluster.org/13982
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* geo-rep: Fix hostname mismatch between volinfo and geo-rep status | Aravinda VK | 2016-04-26 | 1 | -2/+3

    When a volume is created using IP addresses, Gluster volume info shows
    the IP address, but Geo-rep shows the hostname if available. This makes
    it difficult to map the output of volume info to the output of Geo-rep
    status.

    The Schedule Geo-rep script (c#13279) merges the output of volume info
    and Geo-rep status to get information on offline brick nodes. This
    script was failing because the host info shown in volume info differs
    from the one in Geo-rep status; the script was showing all nodes as
    offline.

    With this patch, Geo-rep gets host info from volinfo->bricks instead of
    from the hostname. Geo-rep status now shows the same hostname/IP that
    was used in volume create.

    BUG: 1328706
    Change-Id: Ib8e56da29129aa19225504a891f9b870f269ab75
    Signed-off-by: Aravinda VK <avishwan@redhat.com>
    Reviewed-on: http://review.gluster.org/14005
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
    Reviewed-by: Kotresh HR <khiremat@redhat.com>
    (cherry picked from commit bc89311aff62c78102ab6920077b6782ee99689a)
    Reviewed-on: http://review.gluster.org/14036
* dht: add "nuke" functionality for efficient server-side deletion | Jeff Darcy | 2016-04-26 | 4 | -14/+106

    This is a backport of the following two patches (of which the second is
    a trivial adjustment to a timeout for a test added by the first).

        http://review.gluster.org/13878
        http://review.gluster.org/13935

    This turns a special xattr into an rmdir with flags set. When that hits
    the posix translator on the server side, that causes the file/directory
    to be moved into the special "landfill" directory. From there, the posix
    janitor thread will take care of deleting it entirely on the server
    side, traversing it recursively if necessary. A couple of secondary
    issues were fixed to make this effective.

     * FUSE now ensures that setxattr values are NUL terminated.

     * The janitor thread now gets woken up immediately when something is
       placed in 'landfill' instead of only when file descriptors need to
       be closed.

     * The default landfill-emptying interval was reduced to 10s.

    To use the feature, issue a setxattr something like this:

        setfattr -n glusterfs.dht.nuke -v "" /mnt/glusterfs/vol/some_dir

    The value doesn't actually matter; the mere receipt of a request with
    this key is sufficient. Some day it might be useful to allow setting a
    required value as a sort of password, so that only those who know it can
    access the underlying special functionality.

    Change-Id: I4132a30d1faa53a6682399ad1d9041e2c4519951
    BUG: 1330241
    Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-on: http://review.gluster.org/14065
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: N Balachandran <nbalacha@redhat.com>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
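For programs that would rather not shell out to `setfattr`, the same request can presumably be issued with the Linux `setxattr(2)` call. This is a sketch under that assumption: the wrapper name `nuke_path` is made up for the example, and only the key `glusterfs.dht.nuke` and the empty value come from the commit message.

```c
#include <sys/xattr.h>

/*
 * Hypothetical wrapper: ask the server side to delete `path` (a directory
 * on a glusterfs FUSE mount) by setting the special key with an empty
 * value. Per the commit message, the value itself does not matter.
 * Returns 0 on success, -1 with errno set on failure.
 */
int nuke_path(const char *path)
{
    return setxattr(path, "glusterfs.dht.nuke", "", 0, 0);
}
```

On a path that does not exist, or on a file system that does not understand the key, the call simply fails with `errno` set, so callers should check the return value.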
* glusterd: SSL certificate depth volume option is incorrect | Kaleb S KEITHLEY | 2016-04-26 | 1 | -6/+6

    Between 3.7.1 and 3.7.2 a typo was introduced changing the string
    ssl-cert-depth to ssl-cetificate-depth. [sic]

    rpc/rpc-transport/socket/src/socket.c still expects the string
    ssl-cert-depth.

    Also replace a couple errant tabs with spaces.

    See:

    > Change-Id: I0621258470bd831c97008b56123a9dc7029d73f1
    > BUG: 1330248
    > http://review.gluster.org/#/c/14066/

    Change-Id: I4d227fc40d9377dd1a1ef39bf31be026ba43acb1
    BUG: 1330249
    Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
    Reviewed-on: http://review.gluster.org/14067
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* quota : fix null dereference issues in quota | Manikandan Selvaganesh | 2016-04-25 | 1 | -1/+2

    Backport of http://review.gluster.org/#/c/14022/

    > Change-Id: I3805b206077718da26adbeb8b29a53642e00886f
    > BUG: 1328696
    > Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
    > Reviewed-on: http://review.gluster.org/14022
    > Smoke: Gluster Build System <jenkins@build.gluster.com>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Jeff Darcy <jdarcy@redhat.com>

    Change-Id: Ia5af10b79a71c6be5db293aee070049364b70237
    BUG: 1329115
    Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
    Reviewed-on: http://review.gluster.org/14041
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
* cluster/afr: Fix inode-leak in data self-heal | Pranith Kumar K | 2016-04-25 | 2 | -6/+6

    Thanks to Olia-Kremmyda for finding the bug on github review,
    https://github.com/gluster/glusterfs/commit/b8106d1127f034ffa88b5dd322c23a10e023b9b6

    > Change-Id: Ib8640ed0c331a635971d5d12052f0959c24f76a2
    > BUG: 1329773
    > Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    > Reviewed-on: http://review.gluster.org/14052
    > Smoke: Gluster Build System <jenkins@build.gluster.com>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Ravishankar N <ravishankar@redhat.com>
    > Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>

    BUG: 1329779
    Change-Id: I3d77f0b445fdedf2c582ea88f8d89e1da525638f
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/14053
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* cluster/distribute: detect stale layouts in entry fops | Raghavendra G | 2016-04-25 | 12 | -42/+768

    dht_mkdir ()
    {
        first-hashed-subvol = hashed-subvol for "bname" in in-memory
                              layout of "parent";

        inodelk (SETLKW, parent, "LAYOUT_HEAL_DOMAIN",
                 "can be any subvol, but we choose first-hashed-subvol randomly");
        {
    begin:
            hashed-subvol = hashed-subvol for "bname" in in-memory
                            layout of "parent";
            hash-range = extract hash-range from layout of "parent";

            ret = mkdir (parent/bname, hashed-subvol, hash-range);
            if (ret == "hash-value doesn't fall into layout stored on the
                        brick (this error is returned by posix-mkdir)")
            {
                refresh_parent_layout ();
                goto begin;
            }
        }
        inodelk (UNLCK, parent, "LAYOUT_HEAL_DOMAIN", "first-hashed-subvol");

        proceed with other parts of dht_mkdir;
    }

    posix_mkdir (parent/bname, client-hash-range)
    {
        disk-hash-range = getxattr (parent, "dht-layout-key");
        if (disk-hash-range != client-hash-range) {
            fail-with-error ("hash-value doesn't fall into layout stored
                              on the brick");
            return 0;
        }

        continue-with-posix-mkdir;
    }

    Similar changes need to be done for dentry operations like create,
    symlink, link, unlink, rmdir, rename. These will be addressed in
    subsequent patches. This patch addresses only the mkdir codepath.

    This change breaks stripe tests, as on some striped subvols dht layout
    xattrs are not set for some reason. This results in failure of mkdir.
    Since striped volumes are always created with dht, some tests associated
    with stripe also fail. So, I am making the following test changes (since
    stripe is out of maintenance):

    * modify ./tests/basic/rpc-coverage.t to not use striped volumes
    * mark all (2) tests in tests/bugs/stripe/ as bad tests

    Change-Id: Idd1ae879f24a48303dc743c1bb4d91f89a629e25
    BUG: 1329062
    Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
    Reviewed-on: http://review.gluster.org/14040
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
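The retry loop in the pseudocode above can be modelled in a few lines of plain C. This is a toy simulation, not DHT or posix code: the structs, the error constant, and the `max_retries` bound are all assumptions added for illustration, and the inodelk on LAYOUT_HEAL_DOMAIN is left out entirely.

```c
#include <stdbool.h>

/* Hypothetical hash range, standing in for a directory's layout on one brick. */
struct hash_range { unsigned start; unsigned stop; };

/* Hypothetical brick-side view of the parent directory. */
struct brick_dir {
    struct hash_range on_disk;  /* what the layout xattr on the brick says */
    int entries;                /* number of directories created so far */
};

#define ERR_STALE_LAYOUT (-2)   /* made-up error code for this sketch */

/* Brick side: refuse the mkdir if the client's cached range is stale. */
int brick_mkdir(struct brick_dir *parent, struct hash_range client_range)
{
    bool same = parent->on_disk.start == client_range.start &&
                parent->on_disk.stop == client_range.stop;

    if (!same)
        return ERR_STALE_LAYOUT;

    parent->entries++;
    return 0;
}

/*
 * Client side: try the mkdir with the cached layout; on a stale-layout
 * error, refresh the cached layout from the brick and try again. The
 * retry bound is an addition of this sketch (the pseudocode uses goto).
 */
int client_mkdir(struct brick_dir *parent, struct hash_range *cached,
                 int max_retries)
{
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        int ret = brick_mkdir(parent, *cached);

        if (ret != ERR_STALE_LAYOUT)
            return ret;

        *cached = parent->on_disk;  /* stands in for refresh_parent_layout() */
    }

    return ERR_STALE_LAYOUT;
}
```

In the toy model the refresh always succeeds on the next attempt; in a real cluster the layout could change again between the refresh and the retry, which is what the lock in the pseudocode is there to prevent.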
* Tier: tier command fails message when any node is down | hari | 2016-04-22 | 2 | -24/+63

    Backport of: http://review.gluster.org/#/c/13918/

    PROBLEM: The dict doesn't get set on a node if it is down, so while
    printing the output on the CLI we get an ENOENT, which ends in a
    "tier command failed" message.

    FIX: This patch skips the node that wasn't available and carries on
    with the next node, for both tier status and tier detach status.

    > Change-Id: I718a034b18b109748ec67f3ace56540c50650d23
    > BUG: 1324439
    > Signed-off-by: hari <hgowtham@redhat.com>
    > Reviewed-on: http://review.gluster.org/13918
    > Smoke: Gluster Build System <jenkins@build.gluster.com>
    > Tested-by: hari gowtham <hari.gowtham005@gmail.com>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Kaushal M <kaushal@redhat.com>

    Change-Id: Ia23df47596adb24816de4a2a1c8db875f145838e
    BUG: 1328410
    Signed-off-by: hari <hgowtham@redhat.com>
    Reviewed-on: http://review.gluster.org/14030
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    Tested-by: hari gowtham <hari.gowtham005@gmail.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
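The shape of this fix, skipping nodes with no data instead of failing the whole command, can be sketched as below. The array-of-strings input and the function name are hypothetical stand-ins for the per-node dict lookups in the real CLI code.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical printer: node_status[i] is the status line for node i, or
 * NULL when that node was down and reported nothing. Missing entries are
 * skipped rather than treated as a fatal error. Returns the number of
 * nodes actually printed.
 */
int print_tier_status(const char *const node_status[], size_t node_count,
                      FILE *out)
{
    int printed = 0;

    for (size_t i = 0; i < node_count; i++) {
        if (node_status[i] == NULL)
            continue;               /* node was down: carry on with the next */

        fprintf(out, "%s\n", node_status[i]);
        printed++;
    }

    return printed;
}
```

The return value lets a caller still detect the degenerate case where no node at all reported a status.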
* mount/fuse: report ESTALE as ENOENT | Raghavendra G | 2016-04-20 | 1 | -0/+3

    When the inode/gfid is missing, the brick reports back an ESTALE error.
    However, most applications don't treat ESTALE as the error for a missing
    file-system object, and change their behaviour. For example, rm -rf
    ignores ENOENT errors during unlink of files/directories. But with an
    ESTALE error, it doesn't send rmdir on a directory if unlink had failed
    with ESTALE for any of the files or directories within it.

    BUG: 1257894
    Change-Id: I5e56bc0c53f52179940b4691acf6b3db853965df
    Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
    Reviewed-on: http://review.gluster.org/14027
    Tested-by: N Balachandran <nbalacha@redhat.com>
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: N Balachandran <nbalacha@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
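The translation itself is a one-line mapping. The helper below is an illustrative sketch with a made-up name and is not the actual change made in the fuse bridge.

```c
#include <errno.h>

/*
 * Hypothetical helper: map the errno received from the bricks to the one
 * reported to the application. ESTALE becomes ENOENT; everything else is
 * passed through unchanged.
 */
int fuse_reply_errno(int op_errno)
{
    return (op_errno == ESTALE) ? ENOENT : op_errno;
}
```

With this mapping, a tool such as `rm -rf` sees the ENOENT it already knows how to ignore and goes on to remove the parent directory.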
* cluster/afr: Fix spurious entries in heal info | Pranith Kumar K | 2016-04-20 | 6 | -17/+90

    Problem:
    Locking schemes in afr-v1 locked the directory/file completely during
    self-heal. Newer locking schemes don't require full directory or file
    locking. But afr-v2 still has compatibility code to work well with
    older clients: in entry self-heal it takes a lock on a special
    256-character name which can't be created on the fs. Similarly, for
    data self-heal there used to be a lock on (LLONG_MAX-2, 1).

    The old locking scheme requires heal info to take sh-domain locks
    before examining heal state. If it doesn't take sh-domain locks, heal
    info can hang until self-heal completes because of the compatibility
    locks. But the problem with heal info taking sh-domain locks is that if
    two heal-info processes, or shd and heal info, try to inspect heal
    state in parallel using trylocks on sh-domain, both of them may assume
    a heal is in progress. This was leading to spurious entries being shown
    in heal info.

    Fix:
    As long as the afr-v1 way of locking exists, we can't fix this problem
    with simple solutions. If we know that the cluster is running newer
    versions of the locking schemes, we can give accurate information in
    heal info. So introduce a new option called 'locking-scheme' which, if
    it is 'granular', gives correct information in heal info. In addition,
    the extra network hops for taking compatibility locks and sh-domain
    locks in heal info are no longer necessary, which improves performance.

    > BUG: 1322850
    > Change-Id: Ia563c5f096b5922009ff0ec1c42d969d55d827a3
    > Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    > Reviewed-on: http://review.gluster.org/13873
    > Smoke: Gluster Build System <jenkins@build.gluster.com>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Ashish Pandey <aspandey@redhat.com>
    > Reviewed-by: Anuradha Talur <atalur@redhat.com>
    > Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
    > (cherry picked from commit b6a0780d86e7c6afe7ae0d9a87e6fe5c62b4d792)

    Change-Id: If7eee18843b48bbeff4c1355c102aa572b2c155a
    BUG: 1294675
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/14039
    Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* op-version: Bump up op-version to 3.7.12 | Pranith Kumar K | 2016-04-18 | 1 | -1/+1

    BUG: 1325857
    Change-Id: I49286ba60281d543f2acacf45c4f824627ef4167
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/14017
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* Build fix: remove undefined -I${rpclibdir} | Emmanuel Dreyfus | 2016-04-18 | 1 | -1/+1

    The variable is not defined anywhere, remove it.

    Backport of Iaefb349cceb4108ac22c44cd32e5ea3d3c8bc0e5

    Change-Id: I2ef75c8da8c7421328958f91dfaaf287347296e4
    Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
    BUG: 1212676
    Reviewed-on: http://review.gluster.org/13868
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* NFS: new option nfs.rdirplus added | Sakshi Bansal | 2016-04-18 | 6 | -1/+50

    When this option is 'disabled', NFS falls back to standard readdir
    instead of readdirp.

    Backport of http://review.gluster.org/#/c/13782/

    > Change-Id: Icaaf4da6533bee56160d4a81e42bb60f7d341945
    > BUG: 1302948
    > Signed-off-by: Sakshi Bansal <sabansal@redhat.com>

    Change-Id: Icaaf4da6533bee56160d4a81e42bb60f7d341945
    BUG: 1312721
    Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
    Reviewed-on: http://review.gluster.org/13916
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
* glusterd: populate brickinfo->real_path conditionally | Atin Mukherjee | 2016-04-18 | 11 | -44/+76

    Backport of http://review.gluster.org/13965

    glusterd_brickinfo_new_from_brick () is called from multiple places.
    One of them is glusterd_brick_rpc_notify, where it is quite possible
    that an underlying brick's file system has crashed and a disconnect
    event has been received. In this case glusterd tries to build the
    brickinfo from the brickid in the RPC request, but this fails because
    glusterd_brickinfo_new_from_brick () fails in realpath.

    The fix is to skip populating real_path if it is a disconnect event.

    Change-Id: I9d9149c64a9cf2247abb731f219c1b1eef037960
    BUG: 1326174
    Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
    Reviewed-on: http://review.gluster.org/13965
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-on: http://review.gluster.org/13973
* cluster/afr: Use parallel dir scan functionality | Pranith Kumar K | 2016-04-17 | 6 | -13/+72

    > BUG: 1221737
    > Change-Id: I0ed71a72f0e33bd733723e00a01cf28378c5534e
    > Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    > Reviewed-on: http://review.gluster.org/13755
    > Reviewed-on: http://review.gluster.org/13992
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    > Smoke: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Jeff Darcy <jdarcy@redhat.com>

    BUG: 1325857
    Change-Id: I7c6b2ea065edd7f5dafffeb42fd6c601b4ab8d14
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/14010
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Don't lookup/forget inodes | Pranith Kumar K | 2016-04-17 | 5 | -43/+6

    Problem:
    All inodes that are looked up in afr are always forgotten again, which
    removes the benefit of them being in the lru. The same code can cause
    crashes if the top xlator does inode_forget(0) between the
    inode_lookup and inode_forget in afr.

    Fix:
    Don't use lookup/forget in afr. There is no benefit at the moment in
    keeping this code, and it is impossible to prevent top xlators from
    doing inode_forget(0). Found similar instances in ec and removed them,
    even though those code paths are not executed anywhere other than the
    heal daemon.

    > BUG: 1321554
    > Change-Id: Ia4cb236178f7f129cc898d53f0bbd26f494a2a8d
    > Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    > Reviewed-on: http://review.gluster.org/13834
    > Smoke: Gluster Build System <jenkins@build.gluster.com>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Anuradha Talur <atalur@redhat.com>

    BUG: 1327864
    Change-Id: I3507ed88cd75e069ed302525bfa259cf407871fb
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/14009
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cluster/afr: Fix partial heals in 3-way replication | Pranith Kumar K | 2016-04-16 | 5 | -15/+138

    Problem:
    When there are two sources and one sink, and two self-heal daemons try
    to acquire locks at the same time, there is a chance that one of them
    gets a lock on only one source and the sink, leading to a partial heal.
    This needs one more heal from the remaining source to the sink to
    complete the self-heal, which is not optimal.

    Fix:
    Upgrade non-blocking locks to blocking locks on all the subvolumes if
    the number of locks acquired is a majority and there were EAGAINs.

    > BUG: 1318751
    > Change-Id: Iae10b8d3402756c4164b98cc49876056ff7a61e5
    > Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    > Reviewed-on: http://review.gluster.org/13766
    > Smoke: Gluster Build System <jenkins@build.gluster.com>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Ravishankar N <ravishankar@redhat.com>
    > (cherry picked from commit 8deedef565df49def75083678f8d1558c7b1f7d3)

    Change-Id: Ia164360dc1474a717f63633f5deb2c39cc15017c
    BUG: 1327863
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/14008
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
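The decision rule stated in the fix can be written as a small predicate. The function and parameter names below are invented for the sketch; only the rule itself (a majority of locks acquired, plus at least one EAGAIN) comes from the commit message.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate: after a round of non-blocking lock attempts on
 * all children, decide whether to retry with blocking locks.
 *
 *   locked       number of children on which the lock was acquired
 *   eagain_count number of children that answered EAGAIN (lock contended)
 *   child_count  total number of children in the replica set
 */
bool should_upgrade_to_blocking(int locked, int eagain_count, int child_count)
{
    bool has_majority = locked > child_count / 2;

    return has_majority && eagain_count > 0;
}
```

In a 3-way replica, a daemon holding 2 of 3 locks with one EAGAIN would wait for the third lock instead of healing from a single source, while a daemon holding only 1 of 3 would not upgrade.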
* features/shard: Make o-direct writes work with sharding | Krutika Dhananjay | 2016-04-16 | 1 | -0/+6

    Backport of: http://review.gluster.org/#/c/13846/

    With files opened with o-direct, the expectation is that the IO
    performed on the fds is byte-aligned with respect to the sector size of
    the underlying device. With files getting sharded, a single write from
    the application could be broken into more than one write falling on
    different shards, which _might_ cause the original byte alignment
    property to be lost. To get around this, the shard translator will send
    fsync on o-direct writes to emulate o-direct-like behavior in the
    backend.

    Change-Id: I992e10162afcca17a19d9cba3bcb187a31c618ae
    BUG: 1325843
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
    Reviewed-on: http://review.gluster.org/13966
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
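Outside of the translator framework, the "write, then fsync when the application asked for o-direct" idea looks roughly like the sketch below. It uses ordinary POSIX calls on a local file descriptor; the function name and the boolean flag are assumptions, and the real shard translator works on fops rather than on fds like this.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one piece of an application write to the
 * shard it falls on. If the application opened the file with o-direct,
 * follow the (buffered) write with an fsync so the data reaches stable
 * storage before the call returns.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t write_shard_piece(int shard_fd, const void *buf, size_t len,
                          off_t offset, bool app_used_o_direct)
{
    ssize_t written = pwrite(shard_fd, buf, len, offset);

    if (written < 0)
        return -1;

    if (app_used_o_direct && fsync(shard_fd) != 0)
        return -1;

    return written;
}
```

The trade-off is an extra flush per o-direct write in exchange for not having to keep each shard-sized piece sector-aligned.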
* syncop: Add parallel dir scan functionality (Pranith Kumar K, 2016-04-16, 3 files, -0/+246)
| Most of the ideas behind this functionality were contributed by Richard Wareing in his patch: https://bugzilla.redhat.com/show_bug.cgi?id=1221737#c1
| VERY BIG thanks to him :-).
|
| After starting to port and test the patch above, I found a few things we could improve based on the results we got in testing.
|
| 1) We are reading all the indices before we launch self-heals. In some customer cases I worked on there were almost 5 million files/directories that needed heal. With such a big number, the self-heal daemon will be OOM killed if we go this route. So I modified this to launch heals based on a queue length limit.
|
| 2) We found that for directory hierarchies, the multi-threaded self-heal patch was not giving better results than single-threaded self-heal because of ordering problems. We improved the index xlator to give the gfid type, to make sure that all directories in the indices are healed before the files that follow in that iteration of readdir output (http://review.gluster.org/13553). In our testing this led to zero self-heal errors, as we were only doing self-heals in parallel for files and not directories. I think we can further improve self-heal speed for directories by doing name heals in parallel, based on techniques similar to those Richard's patch showed. I think the best thing there would be to introduce synccond_t infra (pthread_cond_t kind of infra for syncops), which I am planning to implement for future releases.
|
| 3) Based on 1), 2) and the fact that afr already retries the indices in a loop, I removed the additional retries in the threads.
|
| 4) After the refactor, the changes required to bring in multi-threaded self-heal for ec would be just ~10 lines, most of them about options initialization.
|
| Our tests found that we are able to easily saturate the network :-).
|
| High level description of the final feature:
| Traditionally the self-heal daemon reads the indices (gfids) that need to be healed from the brick and initiates heal one gfid at a time. The goal of this feature is to parallelize self-heals in a way that does not regress in any case but increases parallelism wherever we can. As part of this, the following knobs are introduced to improve parallelization:
| 1) We can launch 'max-jobs' number of heals in parallel.
| 2) We can keep reading indices as long as the wait-q for heals doesn't go over 'max-qlen' passed as arguments to multi-threaded dir_scan.
|
| As a first cut, we always heal directories serially, one at a time, but for files we launch heals in parallel. In future we can do name-heals of a dir in parallel, but this is not implemented as of now. The reason for this is mentioned in '2)' above.
|
| AFR/EC can introduce options like max-shd-threads/wait-qlength which can be set by users to increase the rate of heals when they want. Please note that the options will take effect only for the next crawl.
|
| >BUG: 1221737
| >Change-Id: I8fc0afc334def87797f6d41e309cefc722a317d2
| >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| >Reviewed-on: http://review.gluster.org/13569
| >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| >CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| >Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| >Smoke: Gluster Build System <jenkins@build.gluster.com>
|
| BUG: 1325857
| Change-Id: I23235bbb923208eee6a8be711bbfb14350edb11b
| Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| Reviewed-on: http://review.gluster.org/13967
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
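The two knobs described in the syncop commit can be illustrated with a bounded work queue. The following is only a minimal Python sketch of the idea, not the syncop implementation (which is C): `parallel_dir_scan`, its `(name, is_dir)` input format and the `heal` callback are hypothetical names chosen for illustration. Directories are healed inline by the reader, files are handed to at most `max_jobs` workers, and the reader blocks once `max_qlen` entries are waiting.

```python
import queue
import threading


def parallel_dir_scan(entries, heal, max_jobs=4, max_qlen=16):
    """Heal directories serially and files in parallel (illustrative only).

    entries: iterable of (name, is_dir) pairs (hypothetical input format).
    heal: callable invoked once per entry name.
    Returns a list of (name, exception) pairs for heals that failed.
    """
    wait_q = queue.Queue(maxsize=max_qlen)  # put() blocks once max_qlen is reached
    errors = []
    errors_lock = threading.Lock()

    def safe_heal(name):
        # Record failures instead of letting them kill a thread.
        try:
            heal(name)
        except Exception as exc:
            with errors_lock:
                errors.append((name, exc))

    def worker():
        while True:
            name = wait_q.get()
            if name is None:  # sentinel: no more work for this worker
                return
            safe_heal(name)

    workers = [threading.Thread(target=worker) for _ in range(max_jobs)]
    for thread in workers:
        thread.start()

    for name, is_dir in entries:
        if is_dir:
            safe_heal(name)   # directories: one at a time, in the reader
        else:
            wait_q.put(name)  # files: queued; blocks while the queue is full

    for _ in workers:
        wait_q.put(None)      # one sentinel per worker
    for thread in workers:
        thread.join()
    return errors
```

Because the reader stops pulling entries whenever the queue is full, memory use stays bounded by `max_qlen` regardless of how many indices are pending, which is the point of item 1) in the commit message.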
* cluster/afr: Fix witness counting code in src/sink detection (Pranith Kumar K, 2016-04-16, 2 files, -9/+64)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In afr-v1 pre-op, xattrop increments self xattr first then it increments the value on rest. In post-op, xattr value is decreased first on rest and at last it gets decremented on self. So for a possible operation to be witnessed i.e. a fop is seen by the brick it is important to have at least 1 pending op because without completing pre-op fop won't come. The other possibility is when fop completes but at the time of post-op after decrementing pending counts on others just before decrementing its own pending count, the brick dies. Fix: Fix witness detection code in afr_self_heal_find_direction() >BUG: 1322253 >Change-Id: Ia7e76482c0a46e775e269bb96ec1b9490a3ac18f >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/13811 >Smoke: Gluster Build System <jenkins@build.gluster.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >(cherry picked from commit e88962f8c49ea1d65fa26703e5c11be3f21af2ba) Change-Id: I5d9a6d323b35409127c26f3ce61c5e1d91395b18 BUG: 1326212 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13975 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* doc: Add release notes for 3.7.11 [tag: v3.7.11] (Kaushal M, 2016-04-16, 1 file, -0/+35)
| | | | | Change-Id: I15b86ebcbb6bc93bd7dfe1a32929c0ea6aa1c92a Signed-off-by: Kaushal M <kaushal@redhat.com>
* Revert "glusterd: Bug fixes for IPv6 support" (Kaushal M, 2016-04-16, 13 files, -162/+25)
| This reverts commit b33f3c95ec9c8112e6677e09cea05c4c462040d0.
|
| The reverted commit exposes some issues with management encryption that prevent GlusterFS from operating properly. It will be added again once the problems with management encryption are fixed.
* posix_acl: create inode ctx for posix_acl_get (vmallika, 2016-04-15, 1 file, -10/+26)
| | | | | | | | | | | | | | | | | This is a backport of http://review.gluster.org/13961 > Change-Id: Ibe5b00cd4b5d896133adc61f65094d783c492ed4 > BUG: 1325822 > Signed-off-by: vmallika <vmallika@redhat.com> Change-Id: I6be941044ea430913bb950c202f65a1c642d7ae4 BUG: 1325826 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/13962 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* glusterd: coverity fix for insecure temporary file (Sakshi, 2016-04-15, 2 files, -0/+6)
| | | | | | | | | | | | | | | | | | | | | | | | | | Set umask before creating temporary file Backport of http://review.gluster.org/9558 > Change-Id: Ia39af63b05ce68f3f3af6585b70d4129a5530269 > BUG: 789278 > Signed-off-by: Sakshi <sabansal@redhat.com> > Reviewed-on: http://review.gluster.org/9558 > Smoke: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: Ia39af63b05ce68f3f3af6585b70d4129a5530269 BUG: 1215026 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/13984 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
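The coverity fix above sets the umask before the temporary file is created. A minimal Python sketch of that pattern follows; it is not the glusterd code, and the helper name `create_private_temp_file`, the file name and the mask value `0o077` are assumptions made for illustration.

```python
import os
import stat
import tempfile


def create_private_temp_file(directory):
    """Create a file that only its owner can read or write (POSIX)."""
    old_mask = os.umask(0o077)  # assumed mask: strip all group/other bits
    try:
        path = os.path.join(directory, "example.tmp")  # hypothetical name
        # O_EXCL makes creation fail if the path already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o666)
        os.close(fd)
    finally:
        os.umask(old_mask)  # always restore the caller's mask
    return path
```

Even though the file is requested with mode `0o666`, the mask in effect at creation time limits it to the owner, so there is no window in which other users can open it.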
* tests: fix per-test core detection (Jeff Darcy, 2016-04-15, 1 file, -2/+2)
| This is a backport of the following patch: http://review.gluster.org/13921
|
| Commit 9933c5ab in glusterfs-patch-acceptance-tests broke the code here to count cores after each test, with two bad effects:
|
| * Tests continue to run after the job is already guaranteed to fail, tying up resources and delaying jobs for other patches.
|
| * Cores aren't detected until the end of the job, long after it might have been possible to figure out what was going on at the time the process died.
|
| The current check here works for the current code in the other repo, but could break if the two repos are changed without coordination again.
|
| Change-Id: I0840849f5df8b6a62893bad1b80b6ace208ffe90
| Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
| Reviewed-on: http://review.gluster.org/14001
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* extras: Add namespace for options in group-virt.example (Vijay Bellur, 2016-04-13, 3 files, -10/+25)
| Commit 23ccabbeb7 introduced a new key "disperse.eager-lock", which conflicts with the key "cluster.eager-lock" when the option is used without the qualifying namespace. group-virt.example, which gets installed as /var/lib/glusterd/groups/virt, contains options without namespace qualifiers. This patch adds the appropriate namespace to all options in group-virt.example.
|
| Change-Id: I2c09dd10d44138410d889ddeb805f01c641c6780
| BUG: 1325630
| Signed-off-by: Vijay Bellur <vbellur@redhat.com>
| Reviewed-on: http://review.gluster.org/13929
| Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| Reviewed-on: http://review.gluster.org/13958
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| Tested-by: Kaushal M <kaushal@redhat.com>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| Reviewed-by: Kaushal M <kaushal@redhat.com>
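Why an unqualified key becomes a problem once two namespaces define it can be shown with a toy resolver. This is a hedged sketch, not glusterd's actual option lookup: `resolve_option` and the `KNOWN_OPTIONS` list are hypothetical, and only the two eager-lock keys come from the commit message.

```python
# Illustrative option table; only the two eager-lock keys are from the commit.
KNOWN_OPTIONS = [
    "cluster.eager-lock",
    "disperse.eager-lock",
    "performance.quick-read",
]


def resolve_option(key, known=KNOWN_OPTIONS):
    """Map a possibly unqualified option key to its fully qualified name."""
    if "." in key:
        if key in known:
            return key
        raise KeyError("unknown option: " + key)
    # Unqualified key: collect every namespace that defines it.
    matches = [full for full in known if full.split(".", 1)[1] == key]
    if not matches:
        raise KeyError("unknown option: " + key)
    if len(matches) > 1:
        raise ValueError("ambiguous option %r, candidates: %s" % (key, matches))
    return matches[0]
```

In this toy model `resolve_option("eager-lock")` cannot pick a winner, while `resolve_option("cluster.eager-lock")` is unambiguous, which is why qualifying every key in the group file sidesteps the conflict.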
* cluster/afr: Don't delete gfid-req from lookup request (Pranith Kumar K, 2016-04-12, 4 files, -6/+30)
| Problem:
| Afr does dict_ref of the xattr_req that comes to it and deletes the "gfid-req" key. Dht uses the same dict to send lookups to other subvolumes. So for directories with more than one dht subvolume, the second through the last subvolumes won't get a lookup request with "gfid-req", and gfid reset never happens on directories in a distributed replicate volume for those subvolumes.
|
| Fix:
| Make a copy of the lookup xattr request.
|
| Also fixed replies_wipe possibly resetting gfid to NULL gfid.
|
| >BUG: 1312816
| >Change-Id: Ic16260e5a4664837d069c1dc05b9e96ca05bda88
| >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| >Reviewed-on: http://review.gluster.org/13545
| >Smoke: Gluster Build System <jenkins@build.gluster.com>
| >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| >CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| >Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
| >(cherry picked from commit 9b022c3a3f2f774904b5b458ae065425b46cc15d)
|
| Change-Id: Ia68193b559ec1dfd841cc5a22ef1fa801b866200
| BUG: 1313693
| Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| Reviewed-on: http://review.gluster.org/13574
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| Smoke: Gluster Build System <jenkins@build.gluster.com>
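The aliasing problem in the gfid-req commit can be reproduced with a plain Python dict standing in for the shared xattr request. This is only a sketch of the failure mode and of the copy-based fix; all function names are hypothetical stand-ins for the AFR and DHT code paths.

```python
def afr_lookup_buggy(xattr_req):
    # Takes a reference to the caller's dict and mutates it in place.
    xattr_req.pop("gfid-req", None)
    return xattr_req


def afr_lookup_fixed(xattr_req):
    # Works on a private copy, so the caller's dict is left untouched.
    local_req = dict(xattr_req)
    local_req.pop("gfid-req", None)
    return local_req


def dht_lookup(subvol_lookup, n_subvols, gfid):
    """Send the same request dict to every subvolume in turn.

    Returns, per subvolume, whether "gfid-req" was still present
    when the request was handed over.
    """
    req = {"gfid-req": gfid}
    seen = []
    for _ in range(n_subvols):
        seen.append("gfid-req" in req)
        subvol_lookup(req)
    return seen
```

With the in-place variant only the first subvolume sees the key; with the copying variant every subvolume does.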
* arbiter: write performance improvement (Ravishankar N, 2016-04-11, 4 files, -34/+82)
| Backport of: http://review.gluster.org/#/c/13906
|
| Problem:
| The throughput for a 'dd' workload was much lower for an arbiter configuration than for a normal replica-3 volume. There were 2 issues:
|
| i) arbiter_writev was using the request dict as the response dict while unwinding, leading to incorrect GLUSTERFS_WRITE_IS_APPEND and GLUSTERFS_OPEN_FD_COUNT values (=4), leading to immediate post-ops because is_afr_delayed_changelog_post_op_needed() failed due to the afr_are_multiple_fds_opened() check.
|
| ii) The arbiter code in afr was setting local->transaction.{start and len} = 0 to take full file locks. What this meant was that even for simultaneous but non-overlapping writevs, afr_transaction_eager_lock_init() was not happening because afr_locals_overlap() always stays true. Consequently is_afr_delayed_changelog_post_op_needed() failed due to local->delayed_post_op not being set.
|
| Fix:
| i) Send appropriate response dict values in arbiter_writev.
| ii) Modify flock params instead of local->transaction.{start and len} to take full file locks in the transaction.
|
| Also changed _fill_writev_xdata() in posix to fill rsp_xdata for whatever key is requested.
|
| Change-Id: I1c5fc5e98aba49ade540bb441a022e65b753432a
| BUG: 1324809
| Signed-off-by: Ravishankar N <ravishankar@redhat.com>
| Reported-by: Robert Rauch <robert.rauch@gns-systems.de>
| Reported-by: Russel Purinton <russell.purinton@gmail.com>
| Reviewed-on: http://review.gluster.org/13925
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
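Issue ii) in the arbiter commit comes down to how a range-overlap check behaves when every transaction claims the whole file. The sketch below is an assumption-laden illustration, not the afr_locals_overlap() source: it assumes the usual lock-range convention that a length of 0 means "to end of file", and `ranges_overlap` is a hypothetical name.

```python
def ranges_overlap(start_a, len_a, start_b, len_b):
    """Return True if two byte ranges intersect.

    Assumed convention: a length of 0 means the range extends to end of file.
    """
    end_a = float("inf") if len_a == 0 else start_a + len_a
    end_b = float("inf") if len_b == 0 else start_b + len_b
    return start_a < end_b and start_b < end_a
```

Two adjacent 4 KiB writes do not overlap, but as soon as one side is recorded as start=0, len=0 the check is true against any other write, which matches the "always stays true" behaviour described in the commit message.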
* dht: lock on subvols to prevent rename and lookup selfheal race (Sakshi, 2016-04-10, 1 file, -43/+158)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch addresses two races while renaming directories: 1) While renaming src to dst, if a lookup selfheal is triggered it can recreate src on those subvols where rename was successful. This leads to multiple directories (src and dst) having same gfid. To avoid this we must take locks on all subvols with src. 2) While renaming if the dst exists and a lookup selfheal is triggered it will find anomalies in the dst layout and try to heal the stale layout. To avoid this we must take lock on any one subvol with dst. Backport of http://review.gluster.org/#/c/11880/ > Change-Id: I637f637d3241d9065cd5be59a671c7e7ca3eed53 > BUG: 1252244 > Signed-off-by: Sakshi <sabansal@redhat.com> Change-Id: I637f637d3241d9065cd5be59a671c7e7ca3eed53 BUG: 1324381 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/13917 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* socket: Don't cleanup encrypted transport in socket_connect() (Kaushal M, 2016-04-09, 1 file, -12/+7)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ..instead cleanup only in socket_poller() Backport of be99ddd from master With commit d117466 socket_poller() wasn't launched from socket_connect (for encrypted connections), if connect() failed. This was done to prevent the socket private data from being double unreffed, from the cleanups in both socket_poller() and socket_connect(). This allowed future reconnects to happen successfully. If a socket reconnects is sort of decided by the rpc notify function registered. The above change worked with glusterd, as the glusterd rpc notify function (glusterd_peer_rpc_notify()) continuously allowed reconnects on failure. mgmt_rpc_notify(), the rpc notify function in glusterfsd, behaves differently. For a DISCONNECT event, if more volfile servers are available or if more addresses are available in the dns cache, it allows reconnects. If not it terminates the program. For a CONNECT event, it attempts to do a volfile fetch rpc request. If sending this rpc fails, it immediately terminates the program. One side effect of commit d117466, was that the encrypted socket was registered with epoll, unintentionally, on a connect failure. A weird thing happens because of this. The epoll notifier notifies mgmt_rpc_notify() of a CONNECT event, instead of a DISCONNECT as expected. This causes mgmt_rpc_notify() to attempt an unsuccessful volfile fetch rpc request, and terminate. (I still don't know why the epoll raises the CONNECT event) Commit 46bd29e fixed some issues with IPv6 in GlusterFS. This caused address resolution in GlusterFS to also request of IPv6 addresses (AF_UNSPEC) instead of just IPv4. On most systems, this causes the IPv6 addresses to be returned first. GlusterD listens on 0.0.0.0:24007 by default. While this attaches to all interfaces, it only listens on IPv4 addresses. GlusterFS daemons and bricks are given 'localhost' as the volfile server. 
This resolves to '::1' as the first address. When using management encryption, the above reasons cause the daemon processes to fail to fetch volfiles and terminate. Solution -------- The solution to this is simple. Instead of cleaning up the encrypted socket in socket_connect(), launch socket_poller() and let it cleanup the socket instead. This prevents the unintentional registration with epoll, and socket_poller() sends the correct events to the rpc notify functions, which allows proper reconnects to happen. Change-Id: Idb0c0a828743cccca51cfdd1aa6458cfa0a9d100 BUG: 1325491 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/13931 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
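The address-resolution behaviour that the socket commit describes (AF_UNSPEC lookups may return IPv6 addresses, while a listener bound to 0.0.0.0 only accepts IPv4) can be observed with the standard resolver. This is a small Python sketch for experimentation only; `resolve` is a hypothetical helper, and the ordering of results for 'localhost' depends on the system configuration.

```python
import socket


def resolve(host, port, family=socket.AF_UNSPEC):
    """Return (address family, address) pairs in resolver order."""
    infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    return [(info[0], info[4][0]) for info in infos]


# Example usage (results vary by system, so no output is shown):
#   resolve("localhost", 24007)                  # may list '::1' first
#   resolve("localhost", 24007, socket.AF_INET)  # IPv4 addresses only
```

Restricting the family hint to `AF_INET` is one way to see what a client would get if it only asked for IPv4 addresses; the commit itself fixes the socket cleanup path rather than the resolution order.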
* cluster/ec: Do not ref dictionary in lookup (Pranith Kumar K, 2016-04-09, 1 file, -7/+4)
| Problem:
| 1) dict_for_each loops over the elements without any locks, so the members of the dictionary can be ref/unrefed while dict_for_each is executed by another thread, leading to crashes. Consider distributed ec + distributed replicate as the cold and hot tiers. Tier sends a lookup which fails on ec (by this time the dict already contains ec xattrs). After this, the lookup_everywhere code path is hit in tier, which triggers a hashed lookup on each distribute, but that fails, which leads to the cold and hot dht's lookup_everywhere running in two parallel epoll threads. There, when ec tries to set trusted.ec.version/dirty/size as keys in the dictionary, the older values against the same keys get erased. If, while this erasing is going on, the thread doing the lookup on afr's subvolume accesses these keys, either in dict_copy_with_ref or in the client xlator trying to serialize, that can lead to either a crash or a hang, depending on whether the spin/mutex lock is called on invalid memory.
|
| 2) EC deletes GF_CONTENT_KEY from the dictionary; this may lead to extra reads in case of lookup-everywhere for tiered volumes.
|
| Fix:
| Do dict_copy_with_ref() for the lookup dictionary. This avoids the 1st problem rather than actually fixing it. The 2nd problem will be fixed.
|
| >Change-Id: I5427aa14c48cb7572977d4de9a28c5ffff2b4b95
| >BUG: 1315560
| >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| >Reviewed-on: http://review.gluster.org/13680
| >Smoke: Gluster Build System <jenkins@build.gluster.com>
| >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| >CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
| >(cherry picked from commit 64cba025b13aad7fb3020a04930cfa22fbfcb859)
|
| Change-Id: I2828a0d9e730bc4b0ea6cee037365131767ae43e
| BUG: 1322520
| Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| Reviewed-on: http://review.gluster.org/13859
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
| Smoke: Gluster Build System <jenkins@build.gluster.com>
* posix_acl: skip acl_permits for special clients (vmallika, 2016-04-06, 1 file, -11/+28)
| | | | | | | | | | | | | | | | | | | | | This is a backport of http://review.gluster.org/13894 > Change-Id: I3f478b7e4ecab517200f50eb09f65a634c029437 > BUG: 1320818 > Signed-off-by: vmallika <vmallika@redhat.com> > Reviewed-on: http://review.gluster.org/13894 > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Change-Id: I5bb727857e94feb4c86574cf2985c71d8b9ad1d7 BUG: 1320817 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/13909 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* marker: do mq_reduce_parent_size_txn in FG for unlink & rmdir (vmallika, 2016-04-06, 3 files, -32/+90)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of http://review.gluster.org/#/c/13874/ * If a "rm -rf" is performed by a client, we initiate a marker background operation mq_reduce_parent_size_txn for rmdir and unlink. mq_reduce_parent_size_txn can fail when updating size on the ancestor directories, if these directories are removed during the txn as the child-parent association removed in the dentry list. So execute mq_reduce_parent_size_txn in foreground and then do the UNWIND for rmdir and unlink FOP > Change-Id: Iefcdced4c6ae0dbd43f92814d0ddcd1e33825864 > BUG: 1322489 > Signed-off-by: vmallika <vmallika@redhat.com> Change-Id: I79e4b53e4bacd39d23dad5278a7d02a338e59195 BUG: 1324040 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/13910 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* marker: optimize mq_update_dirty_inode_task (vmallika, 2016-04-06, 3 files, -50/+58)
| This is a backport of http://review.gluster.org/#/c/13892/
|
| In function mq_update_dirty_inode_task we do readdirp on a dirty directory, and for each entry we again do a lookup to fetch the contribution xattr. We can fetch this contribution as part of readdirp.
|
| > Change-Id: I766593c0dba793f1ab3b43625acce1c7d9af8d7f
| > BUG: 1320818
| > Signed-off-by: vmallika <vmallika@redhat.com>
|
| Change-Id: Id826a09a72529f7435372ea7f04068dd10da5fcb
| BUG: 1324040
| Signed-off-by: vmallika <vmallika@redhat.com>
| Reviewed-on: http://review.gluster.org/13908
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
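The saving described in the marker optimization is a reduction in round trips: one readdirp that carries the xattr instead of one readdirp plus one lookup per entry. The Python sketch below only counts calls against a fake brick; `FakeBrick` and both helper functions are hypothetical and do not mirror the marker code.

```python
class FakeBrick:
    """Toy stand-in for a brick that counts the calls made to it."""

    def __init__(self, contributions):
        self.contributions = contributions  # entry name -> contribution size
        self.calls = 0

    def readdirp(self, want_xattr=False):
        self.calls += 1
        return [(name, size if want_xattr else None)
                for name, size in self.contributions.items()]

    def lookup(self, name):
        self.calls += 1
        return self.contributions[name]


def total_with_lookups(brick):
    # Old pattern: list the directory, then look up every entry again.
    return sum(brick.lookup(name) for name, _ in brick.readdirp())


def total_with_readdirp(brick):
    # New pattern: ask readdirp to return the contribution with each entry.
    return sum(size for _, size in brick.readdirp(want_xattr=True))
```

For a directory with N entries the first variant issues N + 1 calls and the second issues one, while both compute the same total.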
* dht: lock on subvols to prevent lookup vs rmdir race (Sakshi, 2016-04-06, 5 files, -88/+435)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a possibility that while an rmdir is completed on some non-hashed subvol and proceeding to others, a lookup selfheal can recreate the same directory on those subvols for which the rmdir had succeeded. Now the deletion of the parent directory will fail with an ENOTEMPTY. To fix this take blocking inodelk on the subvols before starting rmdir. Selfheal must also take blocking inodelk before creating the entry. Backport of http://review.gluster.org/13528 > Change-Id: I168a195c35ac1230ba7124d3b0ca157755b3df96 > BUG: 1245065 > Signed-off-by: Sakshi <sabansal@redhat.com> > Reviewed-on: http://review.gluster.org/13528 > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Tested-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I168a195c35ac1230ba7124d3b0ca157755b3df96 BUG: 1257894 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/13915 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* quota: check inode limits only when new file/dir is created (vmallika, 2016-04-06, 4 files, -65/+57)
| This is a backport of http://review.gluster.org/#/c/13911/
|
| When an inode limit is full, writes to any existing file fail with a "disk quota exceeded" error, even if the usage limit is not set or is not full.
|
| > BUG: 1323486
| > Change-Id: I9679fe26a2839ade0b1541fa7f0a2b71ac6dcc31
| > Signed-off-by: vmallika <vmallika@redhat.com>
|
| Change-Id: I55ec86ebecbb8490e557c61090f6fb8c6c449ec7
| BUG: 1324058
| Signed-off-by: vmallika <vmallika@redhat.com>
| Reviewed-on: http://review.gluster.org/13912
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
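The intended behaviour after the quota fix can be stated as a small decision function: the inode limit is consulted only for operations that create a new file or directory, and the usage limit only when it is set. This is a sketch under assumptions, not the quota enforcer's code; `quota_allows`, the `CREATE_OPS` set and the use of `None` for "limit not set" are all hypothetical.

```python
CREATE_OPS = {"create", "mkdir"}  # hypothetical names for creating operations


def quota_allows(op, inodes_used, inode_limit, bytes_used, usage_limit,
                 write_bytes=0):
    """Return True if the operation should be allowed.

    A limit of None means that limit is not set.
    """
    # Inode limit applies only when a new file or directory is created.
    if op in CREATE_OPS and inode_limit is not None:
        if inodes_used >= inode_limit:
            return False
    # Usage limit applies to any operation that adds bytes.
    if usage_limit is not None and bytes_used + write_bytes > usage_limit:
        return False
    return True
```

With this split, a write to an existing file succeeds even when the inode limit is exhausted, as long as the usage limit (if any) still has room.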
* marker: build_ancestry in marker (vmallika, 2016-04-06, 3 files, -19/+162)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of http://review.gluster.org/#/c/13857/ * quota-enforcer doesn't execute build_ancestry in the below code path 1) Special client (PID < 0) 2) unlink 3) rename within the same directory 4) link within the same directory In these cases, marker accounting can fail as parent not found. We need to build_ancestry in marker if it doesn't find parent during update txn > Change-Id: Idb7a2906500647baa6d183ba859b15e34769029c > BUG: 1320818 > Signed-off-by: vmallika <vmallika@redhat.com> Change-Id: Ib56a556bdeebcc498d59599baf4655be05d765e5 BUG: 1324040 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/13907 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht/rebalance: rebalance failure handling (Susant Palai, 2016-04-05, 1 file, -21/+141)
| Currently rebalance aborts on basically any failure, such as fix-layout of a directory, readdirp, opendir etc. Unless it is a remove-brick process, we can ignore these failures.
|
| Major impact:
| Any failure in gf_defrag_process_dir means there are files left unmigrated in the directory.
|
| Fix-layout (setxattr) failure will impact its child subtree, i.e. the child subtree will not be rebalanced.
|
| Settle-hash (commit-hash) failure will trigger lookup_everywhere for immediate children until the next commit-hash.
|
| Note: Remove-brick operation is still sensitive to any kind of failure.
|
| Change-Id: I2f67a490e4e7d06423bb5bc010a1373a74a6af1d
| BUG: 1318196
| Signed-off-by: Susant Palai <spalai@redhat.com>
| Reviewed-on: http://review.gluster.org/12013
| Tested-by: NetBSD Build System <jenkins@build.gluster.org>
| Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
| Signed-off-by: Susant Palai <spalai@redhat.com>
| Reviewed-on: http://review.gluster.org/13749
| Tested-by: N Balachandran <nbalacha@redhat.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Smoke: Gluster Build System <jenkins@build.gluster.com>
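The policy in the rebalance commit, tolerate per-directory failures during an ordinary rebalance but stay strict for remove-brick, can be sketched as a loop with a conditional re-raise. This is an illustration only; `rebalance`, `process_dir` and the use of `OSError` to model a failed fix-layout/readdirp/opendir are assumptions, not the gf_defrag code.

```python
def rebalance(dirs, process_dir, is_remove_brick):
    """Process every directory; return the ones that were skipped on failure.

    process_dir: callable that raises OSError when a step fails.
    """
    skipped = []
    for directory in dirs:
        try:
            process_dir(directory)
        except OSError:
            if is_remove_brick:
                raise  # remove-brick stays sensitive to any failure
            skipped.append(directory)  # ordinary rebalance: note and move on
    return skipped
```

Returning the skipped directories keeps the "major impact" visible to the caller: anything in that list still holds unmigrated files or an unfixed layout.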
* update 3.7.10 release notes (Atin Mukherjee, 2016-04-05, 1 file, -0/+11)
| | | | | | | | | | | | | Add 1322772 & 1323287 in known issues section Change-Id: I1269e91ca0062162ac92f65f4f746beeb100db54 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/13886 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: build realpath post recreate of brick mount for snapshot (Atin Mukherjee, 2016-04-05, 2 files, -16/+76)
| | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/13869 Commit a60c39d introduced a new field called real_path in brickinfo to hold the realpath() conversion. However at restore path for all snapshots and snapshot restored volumes the brickpath gets recreated post restoration of bricks which means the realpath () call will fail here for all the snapshots and cloned volumes. Fix is to store the realpath for snapshots and clones post recreating the brick mounts. For normal volume it would be done during retrieving the brick details from the store. Change-Id: Ia34853acddb28bcb7f0f70ca85fabcf73276ef13 BUG: 1324014 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/13869 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-on: http://review.gluster.org/13905 Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: fill real_path variable in brickinfo during volume import (Mohammed Rafi KC, 2016-04-05, 2 files, -0/+54)
| The "real_path" variable in brickinfo stores the absolute path, which we use to check the availability of newly added bricks. But we were not populating the variable when importing a volume from peers. As a result the real_path variable was reset to zero, which caused validation to fail for every new brick creation.
|
| Backport of:
| >Change-Id: I62be7bf452f0dcdf6aec3a4ec33c2e1fba2951ca
| >BUG: 1323287
| >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
| >Reviewed-on: http://review.gluster.org/13890
| >Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
| >Smoke: Gluster Build System <jenkins@build.gluster.com>
| >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| >CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
| (cherry picked from commit 648357ffad482a1bda8915d42df9d5b055dae44f)
|
| Change-Id: I6937f83bb50277a396944edc3cf0b0ed82facc3a
| BUG: 1324156
| Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
| Reviewed-on: http://review.gluster.org/13914
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* quota/cli: display quota usage on path when limit not set (vmallika, 2016-04-05, 1 file, -17/+0)
| | | | | | | | | | | | | | | | | | | This is a backport of http://review.gluster.org/#/c/13893/ When a quota limit is not set, 'quota list <path>' should still display the usage when a path parameter is specified. > Change-Id: Ida12d9c5e348fbd98db4d68d9324c623cbdd3dea > BUG: 1323360 > Signed-off-by: vmallika <vmallika@redhat.com> Change-Id: I51cf46faffb324d2632dfde4264cd95d2da22479 BUG: 1323490 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/13895 CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com>