summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* runner: extract and return actual exit status of childPrasanna Kumar Kalever2016-05-011-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intro: pid_t waitpid(pid_t pid, int *status, int options); The waitpid() system call suspends execution of the calling process until a child specified by pid argument has changed state. Here the ret (pid) value is not equal to the exit status of the child process. Check manpages for more info on this. Problem: In the current runner framework we always return the pid i.e ret value of the waitpid, as said above it is not the exit value of the child process Solution: Extract the actual exit code/status in case if the child terminated normally, that is, by calling exit(3) or _exit(2), or by returning from main() Backport of: > Change-Id: Iffae99a43e540af66917b3745f21ea3c2a5a3c2d > BUG: 1329129 > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> > Reviewed-on: http://review.gluster.org/14042 > Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: I137b2fc120ec2b1137bf8a4e6b180f1787bf5908 BUG: 1331759 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/14116 Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* glusterd: try to connect on GF_PMAP_PORT_FOREIGN aswellPrasanna Kumar Kalever2016-05-012-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fix couple of things mentioned below: 1. previously we use to try to connect on only GF_PMAP_PORT_FREE in the pmap_registry_alloc(), it could happen that some foreign process would have freed the port by this time ?, hence it is worth giving a try on GF_PMAP_PORT_FOREIGN ports as well instead of wasting them all. 2. fix pmap_registry_remove() to mark the port asGF_PMAP_PORT_FREE 3. added useful comments on gf_pmap_port_type enum members Backport of: > Change-Id: Id2aa7ad55e76ae3fdece21bed15792525ae33fe1 > BUG: 1322805 > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> > Reviewed-on: http://review.gluster.org/14080 > Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Change-Id: Ib7852aa1c55611e81c78341aace4d374d516f439 BUG: 1323564 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/14117 Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* scripts: correct the usage of -perm in backend-cleanup.shNiels de Vos2016-05-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | extras/backend-cleanup.sh uses deprecated find -perm +xxx syntax: find [...] -perm +01000 [...] This GNU extension syntax is deprecated and does not work in GNU findutils 4.5.11 and later. Please change to find -perm /xxx instead. The new syntax was introduced in 4.2.25 (October 2005) and should therefore be available on any relevant system. Bachport of this change on the master branch: > BUG: 1294223 > Reviewed-on: http://review.gluster.org/13080 > Change-Id: Ice742957dd24f0ab4f70a8569dff6f2536e9ac1e > Reported-by: Andreas Metzler <ametzler@bebt.de> > Signed-off-by: Niels de Vos <ndevos@redhat.com> BUG: 1294077 Change-Id: Ice742957dd24f0ab4f70a8569dff6f2536e9ac1e Reported-by: Andreas Metzler <ametzler@bebt.de> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/13081 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* glusterd: fix max pmap alloc to GF_PORT_MAXPrasanna Kumar Kalever2016-05-011-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | This patch also mops the port max i.e 65535 hard coding Backport of: > Change-Id: Ia93656d75ceb4c6c4849f9e0ad1b260d5952d4c6 > BUG: 1331253 > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> > Reviewed-on: http://review.gluster.org/14096 > Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Change-Id: Idf07dd5d08957aed9287e7df2f3fc6f0f32039c1 BUG: 1331772 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/14127 Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/afr: Do not fsync when durability is offPranith Kumar K2016-04-292-0/+47
| | | | | | | | | | | | | | | | | | | | | | | >BUG: 1329501 >Change-Id: Id402c20f2fa19b22bc402295e03e7a0ea96b0c40 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/14048 >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Jeff Darcy <jdarcy@redhat.com> >(cherry picked from commit 302e218f68ef5edab6b369411d6f06cafea08ce1) Change-Id: Ifbf693f8de6765fca90a9ef3c11c1912c2e9885f BUG: 1331342 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14104 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* geo-rep: Fix checkpoint issue in schedulerAravinda VK2016-04-291-15/+11
| | | | | | | | | | | | | | | | | If checkpoint is not met, Scheduler script should touch the Mount point so that SETATTR will get recorded in every brick Changelog. Script was not touching the mount point in each iteration. BUG: 1330450 Change-Id: I2718a764fb3e550742c9dcd316724683561ddf18 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/14029 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> (cherry picked from commit 8590c1cf3c27468177c425c920cab01f52b251e5) Reviewed-on: http://review.gluster.org/14071
* packaging: %postun libs ldconfig: relative path `1' used to build cacheKaleb S KEITHLEY2016-04-291-3/+7
| | | | | | | | | | | | | | | | | | | | | %postun libs isn't 'closed' by the following %postun server on RHEL6 due to the %ifdef...%endif But -server has /usr/lib*/libgfdb.so.x, so we should be running /sbin/ldconfig! Which conveniently fixes the closing issue. See: > Change-Id: Icc365eefc5453c40e02b59288a4e8023b82baa7b > BUG: 1330583 > http://review.gluster.org/14081 Change-Id: I7c2daf1408aaee6340e6983cfaba207c5d13f5a1 BUG: 1328836 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/14082 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* inode: Always fetch first entry from the inode lists during inode_table_destroySoumya Koduri2016-04-291-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | In inode_table_destroy, we iterate through lru and active lists to move the entries to purge list so that they can be destroyed during inode_table_prune. But if used "list_for_each_entry" or "list_for_each_entry_safe" to iterate, we could end up accessing the entries which may have got moved to different(purge) lists in the process and can result in either infinite loop or crash. The safe approach seems to fetch the first entry of the list in each iteration till it gets empty. This is backport of below mainline fix - http://review.gluster.org/13987 Change-Id: I24a18881833bd9419c2d8e5e8807bc71ec396479 BUG: 1330892 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/13987 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/14089 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* glusterd: check if glusterd is started on all nodes and allSakshi2016-04-282-7/+48
| | | | | | | | | | | | | | | | | | | | | | | | bricks are started before performing rebalance Backport of http://review.gluster.org/#/c/10906/ > Change-Id: I458ea9cd86cf35bdb7d758be55f951ae9f3e66f0 > BUG: 1224857 > Signed-off-by: Sakshi <sabansal@redhat.com> > Reviewed-on: http://review.gluster.org/10906 > Smoke: Gluster Build System <jenkins@build.gluster.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> BUG: 1312722 Change-Id: Ib8e59b33e064be8301f682a4b08cb5cf10c22fc9 Signed-off-by: Sakshi <sabansal@redhat.com> Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/13537 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* tier/dht: check for rebalance completion for EIO errorMohammed Rafi KC2016-04-281-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an ongoing rebalance completion check task been triggered by dht, there is a possibility of a race between afr setting subvol as non-readable and dht updates the cached subvol. In this window a write can fail with EIO. Back port of> >Change-Id: I42638e6d4104c0dbe893d1bc73e1366188458c5d >BUG: 1329503 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: http://review.gluster.org/14049 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: N Balachandran <nbalacha@redhat.com> >Reviewed-by: Jeff Darcy <jdarcy@redhat.com> (cherry picked from commit a9ccd0c8ea6989c72073028b296f73a6fcb6b896) Change-Id: Ifc741f05a9cfe5c1d9c03085312ebbf21b8b798b BUG: 1330428 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/14070 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* common-ha: continuous grace_mon log messages in /var/log/messagesKaleb S KEITHLEY2016-04-284-33/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | messages are seen on RHEL6.x and RHEL7.1 and earlier versions of pacemaker. (And RHEL7.2 with RHEL7.1 pacemaker packages.) It's not possible to query attrd attributes in the older version, only set/update/clear them. The messages come from invalid attempts to query the attributes. However it is possible to query crm attributes. The fix here is to create a "shadow" crm attribute for the attrd attribute. Changes are made to both, queries are made on the crm attribute. (Resource Agents "follow" the attrd attribute using constraint locations, so we must keep the attrd attribute.) Backport of > Change-Id: I84ac1a80673e528d98b67b7d5062e21dcf744d4a > BUG: 1324509 > http://review.gluster.org/#/c/13919/ Change-Id: I7301c48849496be026ef598c588e78c68f273a8a BUG: 1324510 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13920 Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com>
* afr: propagate child up event after timeoutRavishankar N2016-04-274-99/+200
| | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/11113 Problem: During mount, afr waits for response from all its children before notifying the parent xlator. In a 1x2 replica volume , if one of the nodes is down, the mount will hang for more than a minute until child down is received from the client xlator for that node. Fix: When parent up is received by afr, start a 10 second timer. In the timer call back, if we receive a successful child up from atleast one brick, propagate the event to the parent xlator. Change-Id: I31e57c8802c1a03a4a5d581ee4ab82f3a9c8799d BUG: 1330855 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14088 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier: fixes for migration over ec as cold tierDan Lambright2016-04-271-1/+1
| | | | | | | | | | | | | | | | Backport of http://review.gluster.org/11433 The above mentioned patch was backported but it was not complete. This patch backports whatever was left out. For more info, refer to previous backport at http://review.gluster.org/#/c/11652. Change-Id: I6ccda8ca8e43485b9b354341bbfcb302496f632c BUG: 1330765 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/14084 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cli/quota: Sort the list output alphabetically by pathvmallika2016-04-277-5/+160
| | | | | | | | | | | | | | | | | | | | | This is a backport of http://review.gluster.org/14000 > Change-Id: I0b124e119d167817be2ae3eb52ac6c80fc7db5d1 > BUG: 1320716 > Signed-off-by: vmallika <vmallika@redhat.com> > Reviewed-on: http://review.gluster.org/14000 > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Kaushal M <kaushal@redhat.com> Change-Id: I87e12d58c8e267b2af67e287998e7313efc70af4 BUG: 1330018 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/14061 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* afr: replica pair going offline does not require CHILD_MODIFIED eventSakshi Bansal2016-04-273-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As a part of CHILD_MODIFIED event DHT forgets the current layout and performs fresh lookup. However this is not required when a replica pair goes offline as the xattrs can be read from other replica pairs. Hence setting different event to handle replica pair going down. > Backport of http://review.gluster.org/#/c/12573/ > Change-Id: I5ede2a6398e63f34f89f9d3c9bc30598974402e3 > BUG: 1281230 > Signed-off-by: Sakshi Bansal <sabansal@redhat.com> > Reviewed-on: http://review.gluster.org/12573 > Reviewed-by: Ravishankar N <ravishankar@redhat.com> > Reviewed-by: Susant Palai <spalai@redhat.com> > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: Ida30240d1ad8b8730af7ab50b129dfb05264fdf9 BUG: 1283972 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/12767 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* glusterd: fix validation of lower op-version check in volume setAtin Mukherjee2016-04-262-2/+24
| | | | | | | | | | | | | | | | | Commit 2d87a98 introduced a validation to fail lowering down the cluster.op-version. Commit 2eb8758 actually changed the variable value from cluster's op-version to volume's op-version which resulted the logic go for a toss. Change-Id: I70df32b75c3a3fe47dc840c4a655059e5b124bca BUG: 1330545 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/14069 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/14077
* quota: setting 'read-only' option in xdata to instruct DHT to not healSakshi Bansal2016-04-263-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When quota is enabled the quota enforcer tries to get the size of the source directory by sending nameless lookup to quotad. But if the rename is successful even on one subvol or the source layout has anomalies then this nameless lookup in quotad tries to heal the directory which requires a lock on as many subvols as it can. But src is already locked as part of rename. For rename to proceed in brick it needs to complete a cluster-wide lookup. But cluster-wide lookup in quotad is blocked on locks held by rename, hence a deadlock. To avoid this quota sends an option in xdata which instructs DHT not to heal. Backport of http://review.gluster.org/#/c/13988/ > Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0 > BUG: 1252244 > Signed-off-by: Sakshi Bansal <sabansal@redhat.com> > Reviewed-on: http://review.gluster.org/13988 > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0 BUG: 1328473 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/14031 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* tier/libgfdb: Ordering query results from libgfdbJoseph Fernandes2016-04-261-5/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When querying we will order the query result to get the hotest or the coldest files in the queried list so that these files are migrated first. Now here we are giving priority to the write heat(time and counters), as it requires complex queries to have a composite ordering of write and read + it has it impact on performance. Backport of http://review.gluster.org/13607 > Change-Id: I2e0415dcfad4218b42c68fc5c2ed8d1f075ce9ea > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/13607 > Smoke: Gluster Build System <jenkins@build.gluster.com> > Tested-by: Joseph Fernandes > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: If5fad07f8d0f50016b10e256803abd5266cd708f BUG: 1323017 Reviewed-on: http://review.gluster.org/13881 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes Tested-by: Joseph Fernandes NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* dht/rebalance: Handle GF_DEFRAG_STOPSusant Palai2016-04-261-0/+17
| | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/14004 Problem: On a rebal stop, the migrator threads don't intimate the crawler thread to wake up in case it is waiting on signal from migrator thread. BUG: 1330529 Change-Id: I9019a715c7b4673b8bb5a75d7d33a18add85ce33 Reviewed-by: N Balachandran <nbalacha@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14076 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com>
* clone/snapshot: Save restored_from_snap for clonesAvra Sengupta2016-04-261-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Bricks of cloned volumes are lvm bricks mounted in /run/gluster, which on reboot of the node gets cleared. Hence, these brick paths need to be recreated on glusterd restart and the appropriate lvms are mounted. Backport of: > Change-Id: I6da086288c0dbdcedf3a20fd53f25e3728bea473 > BUG: 1328010 > Signed-off-by: Avra Sengupta <asengupt@redhat.com> > Reviewed-on: http://review.gluster.org/14021 > Smoke: Gluster Build System <jenkins@build.gluster.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Change-Id: I02d10e6bca99f6d78b50cb91c677e33a8dcefa17 BUG: 1329989 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/14059 Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* snapshot/quota: Copy quota.cksum during snapshot operationsAvra Sengupta2016-04-264-21/+40
| | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/13760/ A volume having a quota.conf file, should always have a quota.cksum file too. Based on this above assumption modifying glusterd_copy_quota_files() to always copy quota.cksum, if quota.conf is present. This change will be reflected when a snapshot is created, restored and cloned. Change-Id: Ia49dc26eacef32eeb8f7d7d9553c80e304b08779 BUG: 1329492 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/13760 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> (cherry picked from commit 8f3ad1e3ede77fa5f8c8d606e18a7e83865a822c) Reviewed-on: http://review.gluster.org/14047
* packaging: additional dirs and files in /var/lib/glusterd/Kaleb S KEITHLEY2016-04-262-75/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | Add the missing /var/lib/glusterd files and dirs found by downstream testing. Use a loop to create hook dirs instead of open-coding. Merge the %ghost and non-ghost dirs in -server %files section for easier maintenance. Eliminate a benign warning for enabling non-existent glusterfsd.{init,service} which is only relevant to Fedora koji builds Don't reject glusterfs.spec.in changes because of long lines Backport of > Change-Id: I5802175d729e0168eea879a2a61626b0b73d77c8 > BUG: 1326410 > http://review.gluster.org/13981 Change-Id: Ica0f54e63ec01056263a27bbcd4a10e469f67e42 BUG: 1326413 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13982 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* geo-rep: Fix hostname mismatch between volinfo and geo-rep statusAravinda VK2016-04-261-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | When Volume was created using IP, Gluster Volume info shows IP address But Geo-rep shows hostname if available, So difficult to map the outputs of Volume Info and Georep status output. Schedule Geo-rep script(c#13279) will merge the output of Volume info and Geo-rep status to get offline brick nodes information. This script was failing since host info shown in Volinfo is different from Georep status. Script was showing all nodes as offline. With this patch Geo-rep gets host info from volinfo->bricks instead of getting from hostname. Geo-rep status will now show same hostname/IP which was used in Volume Create. BUG: 1328706 Change-Id: Ib8e56da29129aa19225504a891f9b870f269ab75 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/14005 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> (cherry picked from commit bc89311aff62c78102ab6920077b6782ee99689a) Reviewed-on: http://review.gluster.org/14036
* dht: add "nuke" functionality for efficient server-side deletionJeff Darcy2016-04-264-14/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of the following two patches (of which the second is a trivial adjustment to a timeout for a test added by the first). http://review.gluster.org/13878 http://review.gluster.org/13935 This turns a special xattr into an rmdir with flags set. When that hits the posix translator on the server side, that causes the file/directory to be moved into the special "landfill" directory. From there, the posix janitor thread will take care of deleting it entirely on the server side - traversing it recursively if necessary. A couple of secondary issues were fixed to make this effective. * FUSE now ensures that setxattr values are NUL terminated. * The janitor thread now gets woken up immediately when something is placed in 'landfill' instead of only when file descriptors need to be closed. * The default landfill-emptying interval was reduced to 10s. To use the feature, issue a setxattr something like this: setfattr -n glusterfs.dht.nuke -v "" /mnt/glusterfs/vol/some_dir The value doesn't actually matter; the mere receipt of a request with this key is sufficient. Some day it might be useful to allow setting a required value as a sort of password, so that only those who know it can access the underlying special functionality. Change-Id: I4132a30d1faa53a6682399ad1d9041e2c4519951 BUG: 1330241 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/14065 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd: SSL certificate depth volume option is incorrectKaleb S KEITHLEY2016-04-261-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | Between 3.7.1 and 3.7.2 a typo was introduced changing the string ssl-cert-depth to ssl-cetificate-depth. [sic] rpc/rpc-transport/socket/src/socket.c still expects the string ssl-cert-depth. Also replace a couple errant tabs with spaces. See: > Change-Id: I0621258470bd831c97008b56123a9dc7029d73f1 > BUG: 1330248 > http://review.gluster.org/#/c/14066/ Change-Id: I4d227fc40d9377dd1a1ef39bf31be026ba43acb1 BUG: 1330249 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/14067 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* quota : fix null dereference issues in quotaManikandan Selvaganesh2016-04-251-1/+2
| | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/14022/ > Change-Id: I3805b206077718da26adbeb8b29a53642e00886f > BUG: 1328696 > Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> > Reviewed-on: http://review.gluster.org/14022 > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: Ia5af10b79a71c6be5db293aee070049364b70237 BUG: 1329115 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/14041 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com>
* cluster/afr: Fix inode-leak in data self-healPranith Kumar K2016-04-252-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | Thanks to Olia-Kremmyda for finding the bug on github review, https://github.com/gluster/glusterfs/commit/b8106d1127f034ffa88b5dd322c23a10e023b9b6 >Change-Id: Ib8640ed0c331a635971d5d12052f0959c24f76a2 >BUG: 1329773 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/14052 >Smoke: Gluster Build System <jenkins@build.gluster.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> BUG: 1329779 Change-Id: I3d77f0b445fdedf2c582ea88f8d89e1da525638f Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14053 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* cluster/distribute: detect stale layouts in entry fopsRaghavendra G2016-04-2512-42/+768
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dht_mkdir () { first-hashed-subvol = hashed-subvol for "bname" in in-memory layout of "parent"; inodelk (SETLKW, parent, "LAYOUT_HEAL_DOMAIN", "can be any subvol, but we choose first-hashed-subvol randomly"); { begin: hashed-subvol = hashed-subvol for "bname" in in-memory layout of "parent"; hash-range = extract hashe-range from layout of "parent"; ret = mkdir (parent/bname, hashed-subvol, hash-range); if (ret == "hash-value doesn't fall into layout stored on the brick (this error is returned by posix-mkdir)") { refresh_parent_layout (); goto begin; } } inodelk (UNLCK, parent, "LAYOUT_HEAL_DOMAIN", "first-hashed-subvol"); proceed with other parts of dht_mkdir; } posix_mkdir (parent/bname, client-hash-range) { disk-hash-range = getxattr (parent, "dht-layout-key"); if (disk-hash-range != client-hash-range) { fail-with-error ("hash-value doesn't fall into layout stored on the brick"); return 0; } continue-with-posix-mkdir; } Similar changes need to be done for dentry operations like create, symlink, link, unlink, rmdir, rename. These will be addressed in subsequent patches. This patch addresses only mkdir codepath. This change breaks stripe tests, as on some striped subvols dht layout xattrs are not set for some reason. This results in failure of mkdir. Since striped volumes are always created with dht, some tests associated with stripe also fail. So, I am making following tests changes (since stripe is out of maintainance): * modify ./tests/basic/rpc-coverage.t to not to use striped volumes * mark all (2) tests in tests/bugs/stripe/ as bad tests Change-Id: Idd1ae879f24a48303dc743c1bb4d91f89a629e25 BUG: 1329062 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/14040 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* Tier: tier command fails message when any node is downhari2016-04-222-24/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | back-port of : http://review.gluster.org/#/c/13918/ PROBLEM: the dict doesn't get set on the node if its down. so while printing the output on cli we get a ENOENT which ends in a tier command failed. FIX: this patch skips the node that wasn't available and carrys on with the next node for both tier status and tier detach status. >Change-Id: I718a034b18b109748ec67f3ace56540c50650d23 >BUG: 1324439 >Signed-off-by: hari <hgowtham@redhat.com> >Reviewed-on: http://review.gluster.org/13918 >Smoke: Gluster Build System <jenkins@build.gluster.com> >Tested-by: hari gowtham <hari.gowtham005@gmail.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Kaushal M <kaushal@redhat.com> Change-Id: Ia23df47596adb24816de4a2a1c8db875f145838e BUG: 1328410 Signed-off-by: hari <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/14030 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* mount/fuse: report ESTALE as ENOENTRaghavendra G2016-04-201-0/+3
| | | | | | | | | | | | | | | | | | | | | When the inode/gfid is missing, brick report back as an ESTALE error. However, most of the applications don't accept ESTALE as an error for a file-system object missing, changing their behaviour. For eg., rm -rf ignores ENOENT errors during unlink of files/directories. But with ESTALE error it doesn't send rmdir on a directory if unlink had failed with ESTALE for any of the files or directories within it. BUG: 1257894 Change-Id: I5e56bc0c53f52179940b4691acf6b3db853965df Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/14027 Tested-by: N Balachandran <nbalacha@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Fix spurious entries in heal infoPranith Kumar K2016-04-206-17/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Locking schemes in afr-v1 were locking the directory/file completely during self-heal. Newer schemes of locking don't require Full directory, file locking. But afr-v2 still has compatibility code to work-well with older clients, where in entry-self-heal it takes a lock on a special 256 character name which can't be created on the fs. Similarly for data self-heal there used to be a lock on (LLONG_MAX-2, 1). Old locking scheme requires heal info to take sh-domain locks before examining heal-state. If it doesn't take sh-domain locks, then there is a possibility of heal-info hanging till self-heal completes because of compatibility locks. But the problem with heal-info taking sh-domain locks is that if two heal-info or shd, heal-info try to inspect heal state in parallel using trylocks on sh-domain, there is a possibility that both of them assuming a heal is in progress. This was leading to spurious entries being shown in heal-info. Fix: As long as there is afr-v1 way of locking, we can't fix this problem with simple solutions. If we know that the cluster is running newer versions of locking schemes, in those cases we can give accurate information in heal-info. So introduce a new option called 'locking-scheme' which if it is 'granular' will give correct information in heal-info. Not only that, Extra network hops for taking compatibility locks, sh-domain locks in heal info will not be necessary anymore. Thus it improves performance. >BUG: 1322850 >Change-Id: Ia563c5f096b5922009ff0ec1c42d969d55d827a3 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/13873 >Smoke: Gluster Build System <jenkins@build.gluster.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Ashish Pandey <aspandey@redhat.com> >Reviewed-by: Anuradha Talur <atalur@redhat.com> >Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> >(cherry picked from commit b6a0780d86e7c6afe7ae0d9a87e6fe5c62b4d792) Change-Id: If7eee18843b48bbeff4c1355c102aa572b2c155a BUG: 1294675 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14039 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* op-version: Bump up op-version to 3.7.12Pranith Kumar K2016-04-181-1/+1
| | | | | | | | | | | BUG: 1325857 Change-Id: I49286ba60281d543f2acacf45c4f824627ef4167 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14017 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* Build fix: remove undefined -I${rpclibdir}Emmanuel Dreyfus2016-04-181-1/+1
| | | | | | | | | | | | | | | The variable is not defined anywhere, remove it. Backport of Iaefb349cceb4108ac22c44cd32e5ea3d3c8bc0e5 Change-Id: I2ef75c8da8c7421328958f91dfaaf287347296e4 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> BUG: 1212676 Reviewed-on: http://review.gluster.org/13868 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* NFS: new option nfs.rdirplus addedSakshi Bansal2016-04-186-1/+50
| | | | | | | | | | | | | | | | | | | | When this option is 'disabled', NFS falls back to standard readdir instead of readdirp Backport of http://review.gluster.org/#/c/13782/ > Change-Id: Icaaf4da6533bee56160d4a81e42bb60f7d341945 > BUG: 1302948 > Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Change-Id: Icaaf4da6533bee56160d4a81e42bb60f7d341945 BUG: 1312721 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/13916 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* glusterd: populate brickinfo->real_path conditionallyAtin Mukherjee2016-04-1811-44/+76
| | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/13965 glusterd_brickinfo_new_from_brick () is called from multiple places and one of them is glusterd_brick_rpc_notify where its very well possible that an underlying brick's file system has crashed and a disconnect event has been received. In this case glusterd tries to build the brickinfo from the brickid in the RPC request, however the same fails as glusterd_brickinfo_new_from_brick () fails from realpath. Fix is to skip populating real_path if its a disconnect event. Change-Id: I9d9149c64a9cf2247abb731f219c1b1eef037960 BUG: 1326174 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/13965 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/13973
* cluster/afr: Use parallel dir scan functionalityPranith Kumar K2016-04-176-13/+72
| | | | | | | | | | | | | | | | | | | | >BUG: 1221737 >Change-Id: I0ed71a72f0e33bd733723e00a01cf28378c5534e >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/13755 >Reviewed-on: http://review.gluster.org/13992 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Smoke: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Jeff Darcy <jdarcy@redhat.com> BUG: 1325857 Change-Id: I7c6b2ea065edd7f5dafffeb42fd6c601b4ab8d14 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14010 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Don't lookup/forget inodesPranith Kumar K2016-04-175-43/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: All inodes that are looked-up are always forgotten without fail in afr removing the benefits of them being in lru. This same code can cause crashes if between inode_lookup, inode_forget in afr if the top xlator does inode_forget(0). Fix: Don't use lookup/forget in afr. No benefits are there at the moment for keeping this code. It is impossible to prevent top xlators to do inode_forget(0). Found similar instances in ec and removed them even though those code paths are not going to be executed in any place other than heal-daemon. >BUG: 1321554 >Change-Id: Ia4cb236178f7f129cc898d53f0bbd26f494a2a8d >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/13834 >Smoke: Gluster Build System <jenkins@build.gluster.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Anuradha Talur <atalur@redhat.com> BUG: 1327864 Change-Id: I3507ed88cd75e069ed302525bfa259cf407871fb Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14009 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cluster/afr: Fix partial heals in 3-way replicationPranith Kumar K2016-04-165-15/+138
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: When there are 2 sources and one sink and if two self-heal daemons try to acquire locks at the same time, there is a chance that it gets a lock on one source and sink leading partial to heal. This will need one more heal from the remaining source to sink for the complete self-heal. This is not optimal. Fix: Upgrade non-blocking locks to blocking lock on all the subvolumes, if the number of locks acquired is majority and there were eagains. >BUG: 1318751 >Change-Id: Iae10b8d3402756c4164b98cc49876056ff7a61e5 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/13766 >Smoke: Gluster Build System <jenkins@build.gluster.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >(cherry picked from commit 8deedef565df49def75083678f8d1558c7b1f7d3) Change-Id: Ia164360dc1474a717f63633f5deb2c39cc15017c BUG: 1327863 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14008 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* features/shard: Make o-direct writes work with shardingKrutika Dhananjay2016-04-161-0/+6
| | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/13846/ With files opened with o-direct, the expectation is that the IO performed on the fds is byte aligned wrt the sector size of the underlying device. With files getting sharded, a single write from the application could be broken into more than one write falling on different shards which _might_ cause the original byte alignment property to be lost. To get around this, shard translator will send fsync on odirect writes to emulate o-direct-like behavior in the backend. Change-Id: I992e10162afcca17a19d9cba3bcb187a31c618ae BUG: 1325843 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/13966 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* syncop: Add parallel dir scan functionalityPranith Kumar K2016-04-163-0/+246
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most of this functionality's ideas are contributed by Richard Wareing, in his patch: https://bugzilla.redhat.com/show_bug.cgi?id=1221737#c1 VERY BIG thanks to him :-). After starting porting/testing the patch above, I found a few things we can improve in this patch based on the results we got in testing. 1) We are reading all the indices before we launch self-heals. In some customer cases I worked on there were almost 5million files/directories that needed heal. With such a big number self-heal daemon will be OOM killed if we go this route. So I modified this to launch heals based on a queue length limit. 2) We found that for directory hierarchies, multi-threaded self-heal patch was not giving better results compared to single-threaded self-heal because of the order problems. We improved index xlator to give gfid type to make sure that all directories in the indices are healed before the files that follow in that iteration of readdir output(http://review.gluster.org/13553). In our testing this lead to zero errors of self-heals as we were only doing self-heals in parallel for files and not directories. I think we can further improve self-heal speed for directories by doing name heals in parallel based on similar techniques Richard's patch showed. I think the best thing there would be to introduce synccond_t infra (pthread_cond_t kind of infra for syncops) which I am planning to implement for future releases. 3) Based on 1), 2) and the fact that afr already does retries of the indices in a loop I removed retries again in the threads. 4) After the refactor, the changes required to bring in multi-threaded self-heal for ec would just be ~10 lines, most of it will be about options initialization. Our tests found that we are able to easily saturate network :-). High level description of the final feature: Traditionally self-heal daemon reads the indices (gfids) that need to be healed from the brick and initiates heal one gfid at a time. Goal of this feature is to add parallelization to the way we do self-heals in a way we do not regress in any case but increase parallelization wherever we can. As part of this following knobs are introduced to improve parallelization: 1) We can launch 'max-jobs' number of heals in parallel. 2) We can keep reading indices as long as the wait-q for heals doesn't go over 'max-qlen' passed as arguments to multi-threaded dir_scan. As a first cut, we always do healing of directories in serial order one at a time but for files we launch heals in parallel. In future we can do name-heals of dir in parallel, but this is not implemented as of now. Reason for this is mentioned already in '2)' above. AFR/EC can introduce options like max-shd-threads/wait-qlength which can be set by users to increase the rate of heals when they want. Please note that the options will take effect only for the next crawl. >BUG: 1221737 >Change-Id: I8fc0afc334def87797f6d41e309cefc722a317d2 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/13569 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Jeff Darcy <jdarcy@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.com> BUG: 1325857 Change-Id: I23235bbb923208eee6a8be711bbfb14350edb11b Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13967 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Fix witness counting code in src/sink detectionPranith Kumar K2016-04-162-9/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In afr-v1 pre-op, xattrop increments self xattr first then it increments the value on rest. In post-op, xattr value is decreased first on rest and at last it gets decremented on self. So for a possible operation to be witnessed i.e. a fop is seen by the brick it is important to have at least 1 pending op because without completing pre-op fop won't come. The other possibility is when fop completes but at the time of post-op after decrementing pending counts on others just before decrementing its own pending count, the brick dies. Fix: Fix witness detection code in afr_self_heal_find_direction() >BUG: 1322253 >Change-Id: Ia7e76482c0a46e775e269bb96ec1b9490a3ac18f >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/13811 >Smoke: Gluster Build System <jenkins@build.gluster.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >(cherry picked from commit e88962f8c49ea1d65fa26703e5c11be3f21af2ba) Change-Id: I5d9a6d323b35409127c26f3ce61c5e1d91395b18 BUG: 1326212 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13975 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* doc: Add release notes for 3.7.11v3.7.11Kaushal M2016-04-161-0/+35
| | | | | Change-Id: I15b86ebcbb6bc93bd7dfe1a32929c0ea6aa1c92a Signed-off-by: Kaushal M <kaushal@redhat.com>
* Revert "glusterd: Bug fixes for IPv6 support"Kaushal M2016-04-1613-162/+25
| | | | | | | | This reverts commit b33f3c95ec9c8112e6677e09cea05c4c462040d0. This commit exposes some issues with management encryption that prevents GlusterFS from operating properly. This will be added again once problems with management encryption are fixed.
* posix_acl: create inode ctx for posix_acl_getvmallika2016-04-151-10/+26
| | | | | | | | | | | | | | | | | This is a backport of http://review.gluster.org/13961 > Change-Id: Ibe5b00cd4b5d896133adc61f65094d783c492ed4 > BUG: 1325822 > Signed-off-by: vmallika <vmallika@redhat.com> Change-Id: I6be941044ea430913bb950c202f65a1c642d7ae4 BUG: 1325826 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/13962 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* glusterd: coverity fix for insecure temporary fileSakshi2016-04-152-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | Set umask before creating temporary file Backport of http://review.gluster.org/9558 > Change-Id: Ia39af63b05ce68f3f3af6585b70d4129a5530269 > BUG: 789278 > Signed-off-by: Sakshi <sabansal@redhat.com> > Reviewed-on: http://review.gluster.org/9558 > Smoke: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: Ia39af63b05ce68f3f3af6585b70d4129a5530269 BUG: 1215026 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/13984 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* tests: fix per-test core detectionJeff Darcy2016-04-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of the following patch: http://review.gluster.org/13921 Commit 9933c5ab in glusterfs-patch-acceptance-tests broke the code here to count cores after each test, with two bad effects: * Tests continue to run after the job is already guaranteed to fail, tying up resources and delaying jobs for other patches. * Cores aren't detected until the end of the job, long after it might have been possible to figure out what was going on at the time the process died. The current check here works for the current code in the other repo, but could break if the two repos are changed without coordination again. Change-Id: I0840849f5df8b6a62893bad1b80b6ace208ffe90 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/14001 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* extras: Add namespace for options in group-virt.exampleVijay Bellur2016-04-133-10/+25
| | | | | | | | | | | | | | | | | | | | | Commit 23ccabbeb7 introduced a new key "disperse.eager-lock" which causes a conflict with key "cluster.eager-lock" when option is used without the qualifying namespace. group-virt.example which gets installed as /var/lib/glusterd/ groups/virt contains options without namespace qualifiers. This patch adds the appropriate namespace to all options in group-virt.example. Change-Id: I2c09dd10d44138410d889ddeb805f01c641c6780 BUG: 1325630 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/13929 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13958 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: Kaushal M <kaushal@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kaushal M <kaushal@redhat.com>
* cluster/afr: Don't delete gfid-req from lookup requestPranith Kumar K2016-04-124-6/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Afr does dict_ref of the xattr_req that comes to it and deletes "gfid-req" key. Dht uses same dict to send lookup to other subvolumes. So in case of directories and more than 1 dht subvolumes, second subvolume till the last subvolume won't get a lookup request with "gfid-req". So gfid reset never happens on the directories in distributed replicate subvolume for 2nd till last subvolumes. Fix: Make a copy of lookup xattr request. Also fixed replies_wipe possibly resetting gfid to NULL gfid >BUG: 1312816 >Change-Id: Ic16260e5a4664837d069c1dc05b9e96ca05bda88 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/13545 >Smoke: Gluster Build System <jenkins@build.gluster.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> >(cherry picked from commit 9b022c3a3f2f774904b5b458ae065425b46cc15d) Change-Id: Ia68193b559ec1dfd841cc5a22ef1fa801b866200 BUG: 1313693 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13574 CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com>
* arbiter: write performance improvementRavishankar N2016-04-114-34/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/13906 Problem: The throughput for a 'dd' workload was much less for arbiter configuration when compared to normal replica-3 volume. There were 2 issues: i)arbiter_writev was using the request dict as response dict while unwinding, leading to incorect GLUSTERFS_WRITE_IS_APPEND and GLUSTERFS_OPEN_FD_COUNT values (=4), leading to immediate post-ops because is_afr_delayed_changelog_post_op_needed() failed due to afr_are_multiple_fds_opened() check. ii) The arbiter code in afr was setting local->transaction.{start and len} =0 to take full file locks. What this meant was even for simultaenous but non-overlapping writevs, afr_transaction_eager_lock_init() was not happening because afr_locals_overlap() always stays true. Consequently is_afr_delayed_changelog_post_op_needed() failed due to local->delayed_post_op not being set. Fix: i) Send appropriate response dict values in arbiter_writev. ii) Modify flock params instead of local->transaction.{start and len} to take full file locks in the transaction. Also changed _fill_writev_xdata() in posix to fill rsp_xdata for whatever key is requested for. Change-Id: I1c5fc5e98aba49ade540bb441a022e65b753432a BUG: 1324809 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Robert Rauch <robert.rauch@gns-systems.de> Reported-by: Russel Purinton <russell.purinton@gmail.com> Reviewed-on: http://review.gluster.org/13925 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* dht: lock on subvols to prevent rename and lookup selfheal raceSakshi2016-04-101-43/+158
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch addresses two races while renaming directories: 1) While renaming src to dst, if a lookup selfheal is triggered it can recreate src on those subvols where rename was successful. This leads to multiple directories (src and dst) having same gfid. To avoid this we must take locks on all subvols with src. 2) While renaming if the dst exists and a lookup selfheal is triggered it will find anomalies in the dst layout and try to heal the stale layout. To avoid this we must take lock on any one subvol with dst. Backport of http://review.gluster.org/#/c/11880/ > Change-Id: I637f637d3241d9065cd5be59a671c7e7ca3eed53 > BUG: 1252244 > Signed-off-by: Sakshi <sabansal@redhat.com> Change-Id: I637f637d3241d9065cd5be59a671c7e7ca3eed53 BUG: 1324381 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/13917 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>