Commit message | Author | Age | Files | Lines
...
* Adding release notes for self-heal window size option for EC | Sunil Kumar Acharya | 2017-06-13 | 1 | -0/+7
  Fixes gluster/glusterfs#233
  Change-Id: Iba2e6fc0c2021c3e174caf237bd78a04a1647765
  Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
  Reviewed-on: https://review.gluster.org/17500
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Ashish Pandey <aspandey@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Kaushal M <kaushal@redhat.com>
* cluster/ec: Update xattr and heal size properly | Ashish Pandey | 2017-06-07 | 4 | -9/+107
  Problem-1: The same file is healed repeatedly while I/O is in progress, even after data heal completes.

  RCA: At the end of a write, when ec_update_size_version gets called, we send it only to the good bricks and not to the healing brick. Because of this, the xattr on the healing brick always remains out of sync, and when the background heal checks source and sink, it finds that this brick needs to be healed and starts healing from scratch. That involves an ftruncate and writing all of the data again.

  Solution: Send the xattrop to all the good bricks as well as the healing bricks.

  Problem-2: The above fix exposes a data corruption during heal. If a write to a file is in progress when heal finishes, the file gets corrupted.

  RCA: The real problem happens in ec_rebuild_data(). Here we receive the 'size' argument, which contains the real file size at the time self-heal started, and it is assigned to heal->total_size. After that, a sequence of calls to ec_sync_heal_block() is made. Each call ends up calling ec_manager_heal_block(), which does the actual work of healing a block. First, a lock on the inode is taken in state EC_STATE_INIT using ec_heal_inodelk(). When the lock is acquired, ec_heal_lock_cbk() is called. This function calls ec_set_inode_size() to store the real size of the inode (it uses heal->total_size). The next step is to read the block to be healed. This is done using a regular ec_readv(). One of the things this call does is trim the returned size if the file is smaller than the requested size. In our case, when we read the last block of a file whose size was congruent to 512 mod 1024 at the time self-heal started, ec_readv() returns only the first 512 bytes, not the whole 1024 bytes.

  This isn't a problem in itself, since the following ec_writev() sent from the heal code only attempts to write the amount of data read, so it shouldn't modify the remaining 512 bytes. However, ec_writev() also checks the file size. If we are writing the last block of the file (determined by the size stored on the inode, which we have set to heal->total_size), any data beyond the (imposed) end of file is cleared with 0's. This causes the 512 bytes after heal->total_size to be cleared. Since the file was written to after heal started, these bytes contained data, so the block written to the damaged brick is incorrect.

  Solution: Align heal->total_size to a multiple of the stripe size.

  Thanks to "Xavier Hernandez" <xhernandez@datalab.es> for finding the root cause and fixing the issue.

  >Change-Id: I6c9f37b3ff9dd7f5dc1858ad6f9845c05b4e204e
  >BUG: 1428673
  >Signed-off-by: Ashish Pandey <aspandey@redhat.com>
  >Reviewed-on: https://review.gluster.org/16985
  >Smoke: Gluster Build System <jenkins@build.gluster.org>
  >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
  >Signed-off-by: Ashish Pandey <aspandey@redhat.com>
  Change-Id: I6c9f37b3ff9dd7f5dc1858ad6f9845c05b4e204e
  BUG: 1459392
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>
  Reviewed-on: https://review.gluster.org/17482
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
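The alignment described in the solution for Problem-2 can be illustrated with a small sketch. This is not the EC translator's code (which is C); the helper name `align_up` and the 1024-byte stripe size are assumptions taken from the example figures in the commit message.

```python
def align_up(size: int, stripe_size: int) -> int:
    """Round size up to the next multiple of stripe_size (hypothetical helper)."""
    if stripe_size <= 0:
        raise ValueError("stripe_size must be positive")
    return ((size + stripe_size - 1) // stripe_size) * stripe_size


# File size when self-heal started: congruent to 512 mod 1024, as in the message.
size_at_heal_start = 3 * 1024 + 512

# With the aligned value, the last healed block covers a full stripe, so a
# write path that zeroes data past the recorded end of file has nothing to clear.
heal_total_size = align_up(size_at_heal_start, 1024)
```

A size that is already a multiple of the stripe size is left unchanged by this rounding.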
* Tier: removing port allocated for tier | hari gowtham | 2017-06-07 | 3 | -35/+0
  Problem: Tier has a port which it doesn't use.

  Fix: Remove the port getting allocated for tier.

  >Change-Id: If0fe393fc335d9f622a063787e0a3c6db9b7a50c
  >BUG: 1452006
  >Signed-off-by: hari gowtham <hgowtham@redhat.com>
  >Reviewed-on: https://review.gluster.org/17328
  >Tested-by: hari gowtham <hari.gowtham005@gmail.com>
  >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  >Smoke: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  Change-Id: If0fe393fc335d9f622a063787e0a3c6db9b7a50c
  BUG: 1457289
  Signed-off-by: hari gowtham <hgowtham@redhat.com>
  Reviewed-on: https://review.gluster.org/17428
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Tested-by: hari gowtham <hari.gowtham005@gmail.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* extras/hookscripts: non-portable shell syntax | Kaleb S. KEITHLEY | 2017-06-06 | 1 | -2/+2
  Use of "function" is not portable to other shells.

  master BUG: 1457812
  master: https://review.gluster.org/17443
  Reported-by: Patrick Matthäi <pmatthaei@debian.org>
  Change-Id: I13a0482b387cc3b7a7a57df424e673850603da37
  BUG: 1459095
  Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
  Reviewed-on: https://review.gluster.org/17476
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht: Make optimal usage of buffer provided with readdir(p) | Sakshi | 2017-06-06 | 4 | -54/+62
  dht_readdirp must unwind with the list of entries only after the entire buffer requested by the kernel is filled, to avoid the extra syscalls that occur when a partially filled buffer is returned. Also, wind the readdir call to the next subvol on reaching EOD for the directory on the current subvol, to avoid an extra network call.

  >Change-Id: If2e1a2722f813d95457c7542bff25fef56c7a041
  >BUG: 1356453
  >Signed-off-by: Sakshi <sabansal@redhat.com>
  >Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  >Reviewed-on: https://review.gluster.org/12271
  >Smoke: Gluster Build System <jenkins@build.gluster.org>
  >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Amar Tumballi <amarts@redhat.com>
  >Reviewed-by: Susant Palai <spalai@redhat.com>
  (cherry picked from commit b9406e210717621bc672a63c1cbd1b0183834056)
  Change-Id: If2e1a2722f813d95457c7542bff25fef56c7a041
  BUG: 1457339
  Signed-off-by: Sakshi <sabansal@redhat.com>
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  Reviewed-on: https://review.gluster.org/17429
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
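The buffer-filling behaviour described in the readdir(p) commit above can be modelled roughly as follows. This is a simplified sketch, not DHT code: subvolumes are plain lists of entry names, the buffer capacity is counted in entries rather than bytes, and the function name `fill_readdir_buffer` is hypothetical.

```python
from typing import List, Tuple


def fill_readdir_buffer(
    subvols: List[List[str]], subvol: int, offset: int, capacity: int
) -> Tuple[List[str], int, int]:
    """Collect up to `capacity` entries, moving to the next subvolume on end of directory.

    Returns the entries plus the (subvol, offset) position to resume from.
    """
    entries: List[str] = []
    while len(entries) < capacity and subvol < len(subvols):
        remaining = capacity - len(entries)
        chunk = subvols[subvol][offset:offset + remaining]
        entries.extend(chunk)
        offset += len(chunk)
        if offset >= len(subvols[subvol]):
            # End of directory on this subvolume: continue with the next one
            # instead of returning a partially filled buffer.
            subvol += 1
            offset = 0
    return entries, subvol, offset


subvols = [["a", "b"], ["c"], ["d", "e", "f"]]
first_call = fill_readdir_buffer(subvols, 0, 0, 4)
second_call = fill_readdir_buffer(subvols, first_call[1], first_call[2], 4)
```

In this model, one call with room for four entries spans three subvolumes, where returning at each end of directory would have needed three calls.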
* brick mux: Detach brick on posix health check failure | Atin Mukherjee | 2017-06-06 | 5 | -7/+28
  With brick mux enabled, we need to detach a particular brick if the underlying backend has gone bad. This patch addresses that.

  >Reviewed-on: https://review.gluster.org/17287
  >Smoke: Gluster Build System <jenkins@build.gluster.org>
  >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
  >(cherry picked from commit cb6837d03658c1005475d4040fa95504b3fd84d0)
  Change-Id: Icfd469c7407cd2d21d02e4906375ec770afeacc3
  BUG: 1458570
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-on: https://review.gluster.org/17459
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* geo-rep: Fix meta data sync on symlink | Kotresh HR | 2017-06-06 | 1 | -11/+37
  chmod doesn't support a 'no dereference' option; it always dereferences the symlink. But 'chown' does support metadata changes on the symlink itself, which was not handled while syncing. This patch fixes that.

  > Change-Id: Ic9985f4e39d15b5a9deb379841bcfb2c263d3e6c
  > BUG: 1455559
  > Signed-off-by: Kotresh HR <khiremat@redhat.com>
  > Reviewed-on: https://review.gluster.org/17389
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > Reviewed-by: Aravinda VK <avishwan@redhat.com>
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Amar Tumballi <amarts@redhat.com>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  (cherry picked from commit 324e81d6fea324d512431a2604086326b8848e9b)
  Change-Id: Ic9985f4e39d15b5a9deb379841bcfb2c263d3e6c
  BUG: 1458664
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  Reviewed-on: https://review.gluster.org/17463
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
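The chmod/chown asymmetry described in the geo-rep commit above can be sketched with the Python standard library on a POSIX system. This is an illustration under assumptions, not the geo-rep implementation; the function name `sync_metadata` is hypothetical.

```python
import os
import stat


def sync_metadata(path: str, uid: int, gid: int, mode: int) -> None:
    """Apply ownership and mode to `path` without modifying a symlink's target."""
    # lchown changes the owner of the link itself, never the file it points to.
    os.lchown(path, uid, gid)
    # chmod would follow the link and change the target, so skip it for symlinks.
    if not stat.S_ISLNK(os.lstat(path).st_mode):
        os.chmod(path, mode)
```

Calling `sync_metadata` on a symlink updates only the link's ownership and leaves the target's permission bits unchanged.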
* afr: add errno to afr_inode_refresh_done() | Ravishankar N | 2017-06-06 | 1 | -7/+16
  Backport of https://review.gluster.org/17413 and https://review.gluster.org/17436

  Problem: When parallel `rm -rf`s were being done from cifs clients, opendir might fail on some replicas with ENOENT. DHT ignores partial opendir failures in dht_fd_cbk() and winds readdirs on those replicas. Afr inode refresh (as a part of readdirp read_txn) sees in its fd context that the state of the fds is *not* AFR_FD_OPENED and bails out to afr_inode_refresh_done() without doing a refresh. When this happens, the errno is set to EIO due to the lack of readable subvols, and split-brain messages are written to the logs.

  Fix: Introduce an errno argument to afr_inode_refresh_do() to bail out with the right error value when inode refresh is not performed.

  Change-Id: I075707fbb73fd93a923b77b923a96aac79e847f9
  BUG: 1457616
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
  Reviewed-on: https://review.gluster.org/17434
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* perf/ioc: Fix race causing crash when accessing freed page | N Balachandran | 2017-06-06 | 1 | -38/+40
  ioc_inode_wakeup does not lock the ioc_inode for the duration of the operation, leaving a window where ioc_prune could find a NULL waitq and hence free the page which ioc_inode_wakeup later tries to access.

  Thanks to Mohit for the analysis.
  credit: moagrawa@redhat.com

  > BUG: 1456385
  > Signed-off-by: N Balachandran <nbalacha@redhat.com>
  > Reviewed-on: https://review.gluster.org/17410
  > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  > Tested-by: Raghavendra G <rgowdapp@redhat.com>
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
  Change-Id: I54b064857e2694826d0c03b23f8014e3984a3330
  BUG: 1457058
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
  Reviewed-on: https://review.gluster.org/17424
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* nl-cache: Remove null check validation for frame->local in lookup cbk | Ravishankar N | 2017-06-06 | 1 | -1/+2
  For nameless lookups, nl-cache does not init frame local, so the cbk emits messages like the one below, flooding the logs, especially whenever a gfid lookup on '/' is done (i.e. loc.path="/" and loc.gfid=1).

  [2017-05-30 04:35:31.628443] E [nl-cache.c:201:nlc_lookup_cbk] (-->/usr/lib64/glusterfs/3.8.4/xlator/performance/io-cache.so(+0x3d81) [0x7f0883005d81] -->/usr/lib64/glusterfs/3.8.4/xlator/performance/quick-read.so(+0x3127) [0x7f0882dfb127] -->/usr/lib64/glusterfs/3.8.4/xlator/performance/nl-cache.so(+0x4cd3) [0x7f08829e0cd3] ) 0-distrep-nl-cache: invalid argument: local [Invalid argument]

  Fixed it.

  > Reviewed-on: https://review.gluster.org/17417
  > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Poornima G <pgurusid@redhat.com>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  (cherry picked from commit ec86167d09bcbb763e31b73fb3d688efaa5444d7)
  Change-Id: I21cb44a9d2a324617e43f46fed83c9a0942d3a0b
  BUG: 1457901
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
  Reviewed-on: https://review.gluster.org/17446
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Poornima G <pgurusid@redhat.com>
* posix: use the correct op_errno | Ravishankar N | 2017-06-06 | 2 | -8/+14
  Problem: If readdir/fstat was performed on a directory that had been removed, posix_fd_ctx_get() fails with ENOENT, but we incorrectly use the ret value (-1 in this case) as op_errno, logging "Operation not permitted" messages in the brick logs. In the case of fstat, the -1 op_errno was also propagated to the client via stack unwind, causing the message to appear in the protocol/client logs as well.

  Fix: Use the right op_errno in readdir, fstat and writev. Also, if posix_fd_ctx_get() failed with ENOENT, convert it into EBADF, because ENOENT is not a valid error for an fd operation.

  > Reviewed-on: https://review.gluster.org/17414
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > Reviewed-by: Amar Tumballi <amarts@redhat.com>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  (cherry picked from commit de92c363c95d16966dbcc9d8763fd4448dd84d13)
  Change-Id: Ie43c0789d5040ec73b7cf885d015a183b8c64d70
  BUG: 1457616
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
  Reviewed-on: https://review.gluster.org/17435
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
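The errno handling described in the posix commit above can be sketched as follows. The function `fd_op_errno` and its parameters are hypothetical and only model the idea. One plausible reading of the log symptom, offered as an assumption, is that the magnitude of the -1 return value was reported as the error number; error number 1 is EPERM, "Operation not permitted".

```python
import errno


def fd_op_errno(ret: int, saved_errno: int) -> int:
    """Choose the errno to report for an fd operation (hypothetical helper).

    `ret` is the helper's return value (-1 on failure) and is never itself
    reported as an error number; `saved_errno` is the errno the helper set.
    """
    if ret >= 0:
        return 0
    # ENOENT is not a meaningful error for an fd operation; report EBADF.
    if saved_errno == errno.ENOENT:
        return errno.EBADF
    return saved_errno
```

With this mapping, a failed context lookup on a removed directory is reported as EBADF rather than as a stray permission error.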
* event/epoll: Add back socket for polling of events immediately after reading the entire rpc message from the wire | Raghavendra G | 2017-06-05 | 8 | -121/+264
  Currently the socket is added back for future events only after the higher layers (rpc, xlators etc.) have processed the message. If message processing involves a significant delay (as in writev replies processed by Erasure Coding), performance takes a hit. Hence this patch modifies transport/socket to add the socket back for polling of events immediately after reading the entire rpc message, but before notifying the higher layers.

  credits: Thanks to "Kotresh Hiremath Ravishankar" <khiremat@redhat.com> for assistance in fixing a regression in bitrot caused by this patch.

  >Reviewed-on: https://review.gluster.org/15036
  >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  >Smoke: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Change-Id: I04b6b9d0b51a1cfb86ecac3c3d87a5f388cf5800
  BUG: 1456259
  Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
  Reviewed-on: https://review.gluster.org/17391
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* doc: Update release notes to reflect right version for bugs | Shyam | 2017-05-30 | 1 | -1/+1
  Fixes gluster/glusterfs#195
  Change-Id: I3b9b8ca3f81b90e4b6e3b97824f3b6058943af7a
  Signed-off-by: Shyam <srangana@redhat.com>
  Reviewed-on: https://review.gluster.org/17421
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* doc: add details for the SELinux feature to the release-notes (tag: v3.11.0) | Niels de Vos | 2017-05-30 | 1 | -3/+19
  Change-Id: I288196ed195f4d0a36eadd363085602ac4b1f670
  Updates: #55
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: https://review.gluster.org/17416
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
* doc: Update release notes for 3.11.0 release | Shyam | 2017-05-30 | 1 | -118/+346
  Edits, md formatting corrections and added missing release notes for 3 features.

  Fixes gluster/glusterfs#191
  Fixes gluster/glusterfs#188
  Fixes gluster/glusterfs#176
  Change-Id: I8502391bfcef8a65fa7fd802aacedfe3c595d04b
  Signed-off-by: Shyam <srangana@redhat.com>
  Reviewed-on: https://review.gluster.org/17415
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
* features/shard: Handle offset in appending writes | Pranith Kumar K | 2017-05-29 | 4 | -43/+278
  When a file is opened with append, all writes are appended at the end of the file irrespective of the offset given in the write syscall. This needs to be considered in the shard size update function and also when choosing which shard to write to.

  At the moment shard piggybacks on queuing from the write-behind xlator for ordering of the operations. So if write-behind is disabled and two parallel appending writes arrive, both of which can increase the file size beyond shard-size, the file will be corrupted.

  >BUG: 1455301
  >Change-Id: I9007e6a39098ab0b5d5386367bd07eb5f89cb09e
  >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  >Reviewed-on: https://review.gluster.org/17387
  >Smoke: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
  >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  BUG: 1456225
  Change-Id: I9007e6a39098ab0b5d5386367bd07eb5f89cb09e
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: https://review.gluster.org/17404
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
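The effect of append mode on shard selection, as described in the shard commit above, can be shown with a short sketch. It is a model under assumptions rather than the shard translator's logic: the function name `resolve_write`, its return shape, and the 64-byte shard size in the example are hypothetical.

```python
from typing import Tuple


def resolve_write(
    offset: int, length: int, file_size: int, shard_size: int, append: bool
) -> Tuple[int, int, int, int]:
    """Return (effective_offset, first_shard, last_shard, new_file_size) for a write.

    Assumes length > 0. For an appending write the caller-supplied offset is
    ignored and the write lands at the current end of file.
    """
    effective = file_size if append else offset
    first_shard = effective // shard_size
    last_shard = (effective + length - 1) // shard_size
    new_size = max(file_size, effective + length)
    return effective, first_shard, last_shard, new_size


# A 10-byte write at offset 0 to a 100-byte file, with 64-byte shards.
appended = resolve_write(0, 10, 100, 64, append=True)
overwritten = resolve_write(0, 10, 100, 64, append=False)
```

The same syscall arguments touch shard 1 and grow the file when the fd is in append mode, but touch shard 0 and leave the size unchanged otherwise, which is why both the shard choice and the size update need the effective offset.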
* extras: Provide group set for gluster-block workloads | Pranith Kumar K | 2017-05-29 | 3 | -1/+24
  For gluster-block workloads, I/O is always done with O_DIRECT, so it does not benefit from any of the perf xlators; disable all of them to save memory:
    performance.quick-read=off
    performance.read-ahead=off
    performance.io-cache=off
    performance.stat-prefetch=off
    performance.write-behind=off
    performance.open-behind=off
    performance.readdir-ahead=off

  We want the I/O on the file to be done with O_DIRECT:
    network.remote-dio=enable

  Options that are proven to give good performance with VM workloads, which are very similar to gluster-block:
    cluster.eager-lock=enable
    cluster.quorum-type=auto
    cluster.data-self-heal-algorithm=full
    cluster.locking-scheme=granular
    cluster.shd-max-threads=8
    cluster.shd-wait-qlength=10000
    features.shard=on

  It is better to turn off things we are not using:
    user.cifs=off

  It is better to turn allow-insecure on so that ports above 1024 used by tcmu-runner are allowed:
    server.allow-insecure=on

  >Change-Id: I9a21c824fa42242f02b57569feedd03d9b6f9439
  >BUG: 1450010
  >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  >Reviewed-on: https://review.gluster.org/17254
  >Smoke: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Niels de Vos <ndevos@redhat.com>
  >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  >Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
  BUG: 1456224
  Change-Id: I9a21c824fa42242f02b57569feedd03d9b6f9439
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: https://review.gluster.org/17403
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* features/bitrot: Fix glusterfsd crash | Kotresh HR | 2017-05-29 | 1 | -8/+15
  With object versioning being optional, the bitrot stub context is not always set. When it is not found, it is initialized, but the newly initialized context was not being assigned for use in the local function. This was leading to a brick crash. Fixed that.

  > Change-Id: I0dab6435cdfe16a8c7f6a31ffec1a370822597a8
  > BUG: 1454317
  > Signed-off-by: Kotresh HR <khiremat@redhat.com>
  > Reviewed-on: https://review.gluster.org/17357
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  (cherry picked from commit 6908e962f6293d38f0ee65c088247a66f2832e4a)
  Change-Id: I0dab6435cdfe16a8c7f6a31ffec1a370822597a8
  BUG: 1456331
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
  Reviewed-on: https://review.gluster.org/17406
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* glusterfsd: process attach and detach request inside lock | Atin Mukherjee | 2017-05-29 | 3 | -77/+109
  With brick multiplexing, there is a high possibility that attach and detach requests are processed in parallel; to avoid a concurrent update to the same graph list, a mutex lock is required.

  Please note that this backport defines the volfile_lock mutex, which was added as part of a different patch, https://review.gluster.org/15036, in mainline but is not available in the release-3.11 branch.

  Credits: Rafi (rkavunga@redhat.com) for the RCA of this issue

  >Reviewed-on: https://review.gluster.org/17374
  >Smoke: Gluster Build System <jenkins@build.gluster.org>
  >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
  >(cherry picked from commit 3ca5ae2f3bff2371042b607b8e8a218bf316b48c)
  Change-Id: Ic8e6d1708655c8a143c5a3690968dfa572a32a9c
  BUG: 1455907
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-on: https://review.gluster.org/17402
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
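The locking pattern described in the attach/detach commit above can be sketched with a plain mutex. This is an illustration only: the class `GraphList` and its methods are hypothetical stand-ins, and `threading.Lock` plays the role the commit message assigns to volfile_lock.

```python
import threading
from typing import List


class GraphList:
    """A list of attached bricks whose updates are serialized by one mutex."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # stands in for volfile_lock
        self._bricks: List[str] = []

    def attach(self, name: str) -> None:
        # The check and the append happen under the same lock acquisition.
        with self._lock:
            if name not in self._bricks:
                self._bricks.append(name)

    def detach(self, name: str) -> None:
        with self._lock:
            if name in self._bricks:
                self._bricks.remove(name)

    def snapshot(self) -> List[str]:
        with self._lock:
            return list(self._bricks)
```

Because every reader and writer takes the same lock, an attach and a detach arriving in parallel cannot interleave their check-then-modify steps on the list.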
* doc: Updated release notes for several features/changes in 3.11.0 | Shyam | 2017-05-26 | 1 | -3/+45
  Updates #61, Updates #156
  Fixes #166, Fixes #167
  Change-Id: I031bf944493b959d44c97fb0ddf7c1b80e53bdda
  Signed-off-by: Shyam <srangana@redhat.com>
  Reviewed-on: https://review.gluster.org/17390
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
* doc: Update about negative lookup cache feature in 3.11.0 release notes | Poornima G | 2017-05-26 | 1 | -0/+13
  Updates: #82
  Change-Id: Ib3dcaf6c7e0d6b080ae42cbf07f3a06a321c2b09
  Signed-off-by: Poornima G <pgurusid@redhat.com>
  Reviewed-on: https://review.gluster.org/17398
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Tested-by: Shyamsundar Ranganathan <srangana@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* release notes: storhaug | Kaleb S. KEITHLEY | 2017-05-26 | 1 | -0/+16
  Update release notes of 3.11 for storhaug

  Updates: #59
  Change-Id: I9cc3183d7087e3a2d9794e0d1d9ea1ddc41e5564
  Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
  Reviewed-on: https://review.gluster.org/17401
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Tested-by: Shyamsundar Ranganathan <srangana@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* nl-cache: Remove the max limit for nl-cache-limit and nl-cache-timeout | Poornima G | 2017-05-26 | 1 | -2/+0
  A max limit is better left unset when it is arbitrary. Otherwise, if the max has to be changed in the future, it can break backward compatibility.

  >Reviewed-on: https://review.gluster.org/17261
  >Smoke: Gluster Build System <jenkins@build.gluster.org>
  >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
  >(cherry picked from commit 64f41b962b643b966e376a10a16671c569bf6299)
  Change-Id: I4337a3789a2d0d5cc8e2bf687a22536c97608461
  BUG: 1453152
  Signed-off-by: Poornima G <pgurusid@redhat.com>
  Reviewed-on: https://review.gluster.org/17400
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* Update release notes of 3.11 for Rebalance perf improvements | Susant Palai | 2017-05-26 | 1 | -0/+15
  Updates glusterfs#155
  Change-Id: Ife221f2bc6ae565064783cd447c078960ba97dba
  Signed-off-by: Susant Palai <spalai@redhat.com>
  Reviewed-on: https://review.gluster.org/17397
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* Update release notes of 3.11 for halo replication | Pranith Kumar K | 2017-05-26 | 1 | -0/+28
  Updates: #199
  Change-Id: I834543d7e71d9c3d84b595f7446a3e2c74f07d97
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: https://review.gluster.org/17396
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* Modify release notes for 3.11 | Samikshan Bairagya | 2017-05-26 | 1 | -0/+19
  Notes have been added for changes related to accommodating client details and brick capacity information in the get-state CLI output.

  Updates: #158
  Change-Id: Ic4a82e204e37e001b497900787cc2d02c2574fd0
  Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
  Reviewed-on: https://review.gluster.org/17394
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* Adding release notes for FALLOCATE support on EC | Sunil Kumar Acharya | 2017-05-26 | 1 | -0/+10
  Fixes gluster/glusterfs#219
  Change-Id: I39b6af1ddcd38fc6979f0843e7cdc880cea9c5c9
  Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
  Reviewed-on: https://review.gluster.org/17399
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Ashish Pandey <aspandey@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
* dht:Spacing issue in fix-layout o/p | AnkitRaj | 2017-05-25 | 1 | -3/+3
  There is a spacing issue in the status output of rebalance fix-layout operations: if the local host name is long, the columns are misaligned.

  This is a backport of the change identified below.

  > Bug 1437748
  > Reviewed-on: https://review.gluster.org/17203
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > Tested-by: ankitraj <anraj@redhat.com>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  >Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
  Change-Id: I2fcc4fd382723fb7e93cb4d4dad03dae682cc1a8
  BUG: 1452000
  Signed-off-by: AnkitRaj <anraj@redhat.com>
  Reviewed-on: https://review.gluster.org/17327
  Tested-by: ankitraj
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  Tested-by: Shyamsundar Ranganathan <srangana@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterd : volume profile command on one of the node crashes glusterd | Gaurav Yadav | 2017-05-25 | 1 | -5/+6
  When the volume profile command is issued on one of the nodes, glusterd crashes.

  It is a race condition that may be hit when the profile command and the status command are executed from node A and node B respectively. While doing so, the event GD_OP_STATE_BRICK_OP_SENT/GD_OP_STATE_BRICK_COMMITTED is triggered. Because the handling of the event is not thread safe, the context gets modified and glusterd crashes.

  With the fix, we now validate the context before using it.

  > Reviewed-on: https://review.gluster.org/17350
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
  (cherry picked from commit 8dc63c8824fc1a00c873c16e8a16a14fca7c8cca)
  Change-Id: Ic07c3cdc5644677b0e40ff0fac6fcca834158913
  BUG: 1454612
  Signed-off-by: Gaurav Yadav <gyadav@redhat.com>
  Reviewed-on: https://review.gluster.org/17362
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* nl-cache: In case of nameless operations do not cache | Poornima G | 2017-05-25 | 4 | -4/+48
| | | | | | | | | | | | | | | | | | | | | | | | | Issue: In nameless lookup/other fops, the parent inode will be NULL; when we try to add the cache to the NULL inode, it causes a crash. Hence handle the scenario of nameless fops, and do not cache/serve the nameless fops. >Reviewed-on: https://review.gluster.org/17316 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >(cherry picked from commit 284cd8851bfe60984d2f11b5c52fe3204ff43b06) Change-Id: I3b90f882ac89e6aaf3419db89e6f890797f37700 BUG: 1454569 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/17361 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* Update release notes of 3.11 for readdirplus enhancements | Soumya Koduri | 2017-05-25 | 1 | -1/+15
| | | | | | | | | | | Updates: #174 Change-Id: I006cafc622c9ee2d776556c287e3ed1743ffa84b Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: https://review.gluster.org/17372 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* glusterd: ignore incorrect uuid validation if uuid_str is empty | Atin Mukherjee | 2017-05-25 | 1 | -11/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If uuid_str is not filled in the dictionary (when the glusterd bits are old), we shouldn't do additional validation against the peer uuid; otherwise the handshake request will fail. Refer : http://lists.gluster.org/pipermail/gluster-users/2017-May/031187.html Credits : pawan@platform.sh >Reviewed-on: https://review.gluster.org/17358 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Amar Tumballi <amarts@redhat.com> >Reviewed-by: Prashanth Pai <ppai@redhat.com> >(cherry picked from commit b1fbc695a63801a3a2c62738fd6660388123724a) Change-Id: I2c30bf0490c31d1418b31d555e7758696e79409f BUG: 1455177 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/17385 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
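The rule described in the message above can be sketched in a few lines. This is an illustrative Python model, not glusterd's C code: the function name `should_reject_handshake`, the dict-shaped request, and the behaviour for a malformed uuid string are all assumptions made for the example.

```python
import uuid


def should_reject_handshake(req: dict, known_peer_uuid: uuid.UUID) -> bool:
    """Return True if the handshake should fail on uuid validation.

    Hypothetical model: `req` stands in for the request dictionary and
    "uuid_str" for the key an older glusterd may leave unset.
    """
    uuid_str = req.get("uuid_str", "")
    if not uuid_str:
        # Older peers do not send uuid_str: skip the extra validation
        # instead of failing the handshake.
        return False
    try:
        claimed = uuid.UUID(uuid_str)
    except ValueError:
        # Assumption: a malformed value is treated as a mismatch.
        return True
    return claimed != known_peer_uuid
```

The point of the sketch is only the early return: an empty or missing value means "nothing to validate", not "validation failed".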
* glusterd: Eliminate race in brick compatibility checking stage | Samikshan Bairagya | 2017-05-25 | 1 | -2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In https://review.gluster.org/17307/, while looking for compatible bricks for multiplexing, it is checked if the brick pidfile exists before checking if the corresponding brick process is running. However checking if the brick process is running just after checking if the pidfile exists isn't enough since there might be race conditions where the pidfile has been created but hasn't been updated with a pid value yet. This commit solves that by making sure that we wait iteratively till the pid value is updated as well. > Reviewed-on: https://review.gluster.org/17375 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit a8624b8b13a1f4222e4d3e33fa5836d7b45369bc) Change-Id: Ib7a158f95566486f7c1f84b6357c9b89e4c797ae BUG: 1453086 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: https://review.gluster.org/17383 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
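The "wait iteratively till the pid value is updated" step can be pictured with the following Python sketch. It is not the glusterd implementation (which is C); the helper names `wait_for_pid` and `is_process_alive`, the retry count, and the delay are assumptions chosen for illustration, and the liveness probe assumes a POSIX system.

```python
import os
import time
from typing import Optional


def wait_for_pid(pidfile: str, retries: int = 5, delay: float = 0.05) -> Optional[int]:
    """Poll `pidfile` until it holds a positive pid, or give up.

    A pidfile that exists but is still empty is treated the same as a
    missing one: keep waiting rather than concluding the brick is dead.
    """
    for _ in range(retries):
        try:
            with open(pidfile) as f:
                text = f.read().strip()
        except FileNotFoundError:
            text = ""
        try:
            pid = int(text)
        except ValueError:
            pid = 0  # empty or partially written file
        if pid > 0:
            return pid
        time.sleep(delay)
    return None


def is_process_alive(pid: int) -> bool:
    """POSIX liveness probe: signal 0 checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True
```

A caller would only run the liveness check once `wait_for_pid` has returned a value, which is the ordering the commit message argues for.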
* cluster/ec: Implement FALLOCATE FOP for EC | Sunil Kumar Acharya | 2017-05-25 | 7 | -8/+354
| | | | | | | | | | | | | | | | | | | | | | | | | The FALLOCATE file operation is not implemented in the existing EC code. This change set implements it for EC. >BUG: 1448293 >Change-Id: Id9ed914db984c327c16878a5b2304a0ea461b623 >Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com> >Reviewed-on: https://review.gluster.org/15200 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> BUG: 1454686 Change-Id: Id9ed914db984c327c16878a5b2304a0ea461b623 Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com> Reviewed-on: https://review.gluster.org/17369 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ashish Pandey <aspandey@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/dht: Initialize local hashed_subvol | Kotresh HR | 2017-05-25 | 1 | -0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The directory self-heal code path doesn't always have local->hashed_subvol populated. Populate it, since the self-heal would otherwise fail. > Change-Id: I03b64709fd7a68e28f9e7438243e817c53c6ef5d > BUG: 1455104 > Signed-off-by: Kotresh HR <khiremat@redhat.com> > Reviewed-on: https://review.gluster.org/17381 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 90df37558d488f9a794f62ed74ec6d72879ed895) Change-Id: I03b64709fd7a68e28f9e7438243e817c53c6ef5d BUG: 1455423 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: https://review.gluster.org/17388 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht: Fix ret check | N Balachandran | 2017-05-23 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | Fixed an incorrect return code check in the rebalance code. > BUG: 1448640 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17197 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 67598f538efb24a9e5ac561b294a05e707e15761) Change-Id: I60804ff121cec7a2f0419e2ee70dd22ea7533c0c BUG: 1454853 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17373 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* Tier: Watermark check for hi and low value being equal | hari gowtham | 2017-05-23 | 2 | -2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Both the low and hi watermarks can be set to the same value, as the check missed the case of them being equal. Fix: Extend the check to reject the hi and low values being equal, along with the low value being higher than the hi value. >Change-Id: Ia235163aeefdcb2a059e2e58a5cfd8fb7f1a4c64 >BUG: 1447960 >Signed-off-by: hari gowtham <hgowtham@redhat.com> >Reviewed-on: https://review.gluster.org/17175 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Tested-by: hari gowtham <hari.gowtham005@gmail.com> >Reviewed-by: Atin Mukherjee <amukherj@redhat.com> >Reviewed-by: Milind Changire <mchangir@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Signed-off-by: hari gowtham <hgowtham@redhat.com> Change-Id: Ia235163aeefdcb2a059e2e58a5cfd8fb7f1a4c64 BUG: 1454597 Reviewed-on: https://review.gluster.org/17364 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
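The difference between the old and new watermark checks comes down to one comparison operator. The Python sketch below contrasts the two; the function names are hypothetical and the real check lives in glusterd's C option validation, which is not reproduced here.

```python
def is_valid_watermark_pair_old(low: int, hi: int) -> bool:
    """Check as described under "Problem": only low > hi was rejected."""
    return not (low > hi)


def is_valid_watermark_pair(low: int, hi: int) -> bool:
    """Check as described under "Fix": low == hi is rejected as well."""
    return low < hi
```

With equal values such as low = 75 and hi = 75, the old predicate accepts the pair while the fixed one rejects it.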
* tests/lock_revocation: mark as bad | Raghavendra G | 2017-05-23 | 1 | -0/+1
| | | | | | | | | | | | | | | | | The test is failing in master. Looks like a hang causing Aborted test runs. > Reviewed-on: https://review.gluster.org/17234 (cherry picked from commit d5865881de5653a0e810093a9867ab3962d00f67) Change-Id: I7a589ad2c54bd55d62f4e66fdf8037c19fc123ea BUG: 1454533 Reviewed-on: https://review.gluster.org/17360 Tested-by: Shyamsundar Ranganathan <srangana@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/ec: return all node uuids from all subvolumes (tag: v3.11.0rc1) | Xavier Hernandez | 2017-05-22 | 2 | -105/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | EC was returning the UUID of the brick with the smaller value. This had the side effect of not evenly balancing the load between bricks on rebalance operations. This patch modifies the common functions that combine multiple subvolume values into a single result to take into account the subvolume order and, optionally, other subvolumes that could be damaged. This makes it easier to add future features where brick order is important. It also makes it possible to easily identify the originating brick of each answer, in case some brick has a special meaning in the future. >Change-Id: Iee0a4da710b41224a6dc8e13fa8dcddb36c73a2f >BUG: 1366817 >Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> >Reviewed-on: https://review.gluster.org/17297 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Ashish Pandey <aspandey@redhat.com> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >(cherry picked from commit bcc34ce05c1be76dae42838d55c15d3af5f80e48) Change-Id: I055713c3c25b7ba99248be880414fb0e8f36a67e BUG: 1451573 Signed-off-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-on: https://review.gluster.org/17318 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* rda, glusterd: Change the max of rda-cache-limit to INFINITY | Poornima G | 2017-05-22 | 4 | -2/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Issue: The max value of rda-cache-limit is 1GB before this patch. When parallel-readdir is enabled, there will be many instances of readdir-ahead, hence the rda-cache-limit depends on the number of instances. Eg: On a volume with distribute count 4, rda-cache-limit when parallel-readdir is enabled, will be 4GB instead of 1GB. Consider the following sequence of operations: - Enable parallel-readdir - Set rda-cache-limit to, let's say, 3GB - Disable parallel-readdir; this results in one instance of readdir-ahead and the rda-cache-limit max will be back to 1GB, but the current value is 3GB and hence the mount will stop working as 3GB > max 1GB. Solution: To fix this, we can limit the cache to 1GB even when parallel-readdir is enabled. But there is no necessity to limit the cache to 1GB; it can be increased if the system has enough resources. Hence getting rid of the rda-cache-limit max value is more apt. If we just change the rda-cache-limit max to INFINITY, we will render older (<3.11) clients broken when the rda-cache-limit is set to > 1GB (as the older clients still expect a value <= 1GB). To safely change the max value of rda-cache-limit to INFINITY, add a check in glusterd to verify all the clients are >= 3.11 if the value exceeds 1GB. 
>Reviewed-on: https://review.gluster.org/17338 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Atin Mukherjee <amukherj@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >(cherry picked from commit e43b40296956d132c70ffa3aa07b0078733b39d4) Change-Id: Id0cdda3b053287b659c7bf511b13db2e45b92032 BUG: 1453152 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/17354 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
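The compatibility check described in this commit can be sketched as a small predicate. This is a Python model under stated assumptions, not the glusterd code: client versions are represented as `(major, minor)` tuples for readability, and the names `can_set_rda_cache_limit`, `ONE_GB` and `MIN_UNLIMITED_VERSION` are invented for the example.

```python
from typing import Iterable, Tuple

ONE_GB = 1024 ** 3
# Assumption: clients from 3.11 onwards accept values above the old 1GB cap.
MIN_UNLIMITED_VERSION = (3, 11)


def can_set_rda_cache_limit(value_bytes: int,
                            client_versions: Iterable[Tuple[int, int]]) -> bool:
    """Allow values above 1GB only if every connected client is new enough."""
    if value_bytes <= ONE_GB:
        # Within the old maximum: safe for every client.
        return True
    return all(v >= MIN_UNLIMITED_VERSION for v in client_versions)
```

Under this model, setting 3GB is accepted when all clients report 3.11 or later and refused as soon as one older client is connected, which is the failure mode the commit message is guarding against.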
* cluster/dht: Fix crash in dht_selfheal_dir_setattr | N Balachandran | 2017-05-22 | 1 | -2/+6
| | | | | | | | | | | | | | | | | | | | | | | Use a local variable to store the call cnt used in the for loop for the STACK_WIND so as not to access local which may be freed by STACK_UNWIND after all fops return. > BUG: 1452102 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: https://review.gluster.org/17343 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 17784aaa311494e4538c616f02bf95477ae781bc) Change-Id: I24f49b6dbd29a2b706e388e2f6d5196c0f80afc5 BUG: 1453050 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17348 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* glusterd: Don't spawn new glusterfsds on node reboot with brick-mux | Samikshan Bairagya | 2017-05-22 | 4 | -0/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With brick multiplexing enabled, upon a node reboot new bricks were not being attached to the first spawned brick process even though there weren't any compatibility issues. The reason for this is that upon glusterd restart after a node reboot, since brick services aren't running, glusterd starts the bricks in a "no-wait" mode. So after a brick process is spawned for the first brick, there isn't enough time for the corresponding pid file to get populated with a value before the compatibility check is made for the next brick. This commit solves this by iteratively waiting for the pidfile to be populated in the brick compatibility comparison stage before checking if the brick process is alive. > Reviewed-on: https://review.gluster.org/17307 > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 13e7b3b354a252ad4065f7b2f0f805c40a3c5d18) Change-Id: Ibd1f8e54c63e4bb04162143c9d70f09918a44aa4 BUG: 1453086 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: https://review.gluster.org/17351 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/afr: Return the list of node_uuids for the subvolume | karthik-us | 2017-05-19 | 3 | -50/+145
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: AFR was returning the node uuid of the first node for every file if the replica set was healthy, which was resulting in only one node migrating all the files. Fix: With this patch AFR returns the list of node_uuids to the upper layer, so that they can decide on which node to migrate which files, resulting in improved performance. Ordering of node uuids will be maintained based on the ordering of the bricks. If a brick is down, then the node uuid for that will be set to all zeros. >Reviewed-on: https://review.gluster.org/17084 > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 0a50167c0a8f950f5a1c76442b6c9abea466200d) Change-Id: I73ee0f9898ae473584fdf487a2980d7a6db22f31 BUG: 1451573 Signed-off-by: karthik-us <ksubrahm@redhat.com> Reviewed-on: https://review.gluster.org/17336 Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
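The shape of the list AFR hands to the upper layer, as described above, can be modelled as follows. This Python sketch is illustrative only: `node_uuid_list` mirrors the stated rules (brick order preserved, all zeros for a down brick), while `pick_migrator` is a purely hypothetical example of how an upper layer might spread files across nodes; the source does not describe the actual selection policy.

```python
import zlib
from typing import List, Optional, Tuple

ZERO_UUID = "00000000-0000-0000-0000-000000000000"


def node_uuid_list(bricks: List[Tuple[str, bool]]) -> List[str]:
    """Build the per-subvolume list in brick order.

    `bricks` is a list of (node_uuid, is_up) pairs; a brick that is down
    contributes the all-zeros uuid so positions still match brick order.
    """
    return [node_uuid if is_up else ZERO_UUID for node_uuid, is_up in bricks]


def pick_migrator(uuids: List[str], file_key: str) -> Optional[str]:
    """Hypothetical upper-layer policy: hash the file onto a live node."""
    candidates = [u for u in uuids if u != ZERO_UUID]
    if not candidates:
        return None
    index = zlib.crc32(file_key.encode("utf-8")) % len(candidates)
    return candidates[index]
```

Compared with always returning the first node's uuid, any policy that chooses among the live entries lets more than one node take part in migration.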
* performance/read-ahead: prevent stale data being returned to application. | Raghavendra G | 2017-05-18 | 1 | -0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Assume that fd is shared by two application threads/processes. T0 read is triggered from app-thread t1 and read call passes through write-behind. T1 app-thread t2 issues a write. The page on which read from t1 is waiting is marked stale. T2 write-behind caches write and indicates to application as write complete. T3 app-thread t2 issues read to same region. Since there is already a page for that region (created as part of read at T0), this read request waits on that page to be filled (though it is stale, which is a bug). T4 read (triggered at T0) completes from brick (with write still pending). Now both read requests from t1 and t2 are served this data (though data is stale from app-thread t2's perspective - which is a bug). T5 write is flushed to brick by write-behind. The fix is not to serve data from a stale page, but instead to initiate a fresh read to the back-end. >Change-Id: Id6af733464fa41bb4e81fd29c7451c73d06453fb >BUG: 1414242 >Signed-off-by: Raghavendra G <rgowdapp@redhat.com> >Reviewed-on: https://review.gluster.org/7447 >Smoke: Gluster Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Csaba Henk <csaba@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com> >Reviewed-by: Amar Tumballi <amarts@redhat.com> (cherry picked from commit 2ff39c5cbea6fbda0d7a442f55e6dc2a72efb171) Change-Id: Id6af733464fa41bb4e81fd29c7451c73d06453fb BUG: 1449311 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: https://review.gluster.org/17221 Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
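The stale-page rule from the T0 to T5 timeline can be shown with a deliberately simplified, synchronous toy model in Python. It is not the read-ahead translator's code: the class and method names are invented, pages are keyed by offset only, and in-flight waiters and write-behind are not modelled. It demonstrates only the decision "a stale page is never served; read from the back-end instead".

```python
from typing import Callable, Dict, Optional


class Page:
    def __init__(self) -> None:
        self.data: Optional[bytes] = None
        self.ready = False   # True once a back-end read has filled the page
        self.stale = False   # True once a write has touched the region


class ReadAheadCache:
    def __init__(self, backend_read: Callable[[int], bytes]) -> None:
        self.backend_read = backend_read
        self.pages: Dict[int, Page] = {}

    def start_read(self, offset: int) -> None:
        """T0: a read creates a page that is waiting to be filled."""
        self.pages.setdefault(offset, Page())

    def on_write(self, offset: int) -> None:
        """T1: a write marks any existing page for the region stale."""
        page = self.pages.get(offset)
        if page is not None:
            page.stale = True

    def fill(self, offset: int, data: bytes) -> None:
        """T4: the earlier read completes; the stale flag is left untouched."""
        page = self.pages.setdefault(offset, Page())
        page.data = data
        page.ready = True

    def read(self, offset: int) -> bytes:
        """Serve from cache only if the page is filled and not stale."""
        page = self.pages.get(offset)
        if page is not None and page.ready and not page.stale:
            assert page.data is not None
            return page.data
        data = self.backend_read(offset)  # fresh read to the back-end
        fresh = Page()
        fresh.data, fresh.ready = data, True
        self.pages[offset] = fresh
        return data
```

Replaying the timeline against this model: the page created at T0 is marked stale at T1, filled with old data at T4, and a later read bypasses it and fetches current data from the back-end.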
* afr: gfid-mismatch-resolution-with-fav-child-policy.t to bad tests | Ravishankar N | 2017-05-17 | 1 | -0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | gfid-mismatch-resolution-with-fav-child-policy.t does a `TEST ls $M0/f3` (line #170) to trigger healing of a file in gfid split-brain in a rep-3 volume. But the code to trigger name heal of a gfid split-brain file is not yet there. The test is passing due to a lookup/stat on $M0 which triggers a background entry self heal (which has the code to heal gfid split-brain files) which may or may not complete the heal before line 170. If it doesn't, lookup on f3 fails with EIO. Add the .t to bad tests until Karthik's patch for CLI based gfid split-brain resolution fixes name heal also. > BUG: 1450730 > Signed-off-by: Ravishankar N <ravishankar@redhat.com> > Reviewed-on: https://review.gluster.org/17290 (cherry picked from commit ba0fc77947c9e873350b58a0e3e93ab51cc56b37) BUG: 1451887 Change-Id: Iba6e9d81db386bc406aff1ecb6a18851f09bf7c0 Reviewed-on: https://review.gluster.org/17319 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Shyamsundar Ranganathan <srangana@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* Fixes quota aux mount failure | Sanoj Unnikrishnan | 2017-05-17 | 21 | -83/+151
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The aux mount is created on the first limit/remove_limit/list command and it remains until the volume is stopped/deleted or quota is disabled, where we do a lazy unmount. If the process is uncleanly terminated, then the mount entry remains and we get a (Transport disconnected) error on subsequent attempts to run quota list/limit-usage/remove commands. Second issue: there is also a risk of an inadvertent rm -rf on /var/run/gluster causing data loss for the user. Ideally, /var/run is a temp path for application use and should not cause any data loss to persistent storage. Solution: 1) unmount the aux mount after each use. 2) clean stale mount before mounting, if any. One caveat with doing mount/unmount on each command is that we cannot use the same mount point for both list and limit commands. The reason for this is that the list command needs the mount to be accessible in the cli after the response from glusterd, so it could be unmounted by a limit command if executed in parallel (had we used the same mount point). Hence we use separate mount points for list and limit commands. 
>Reviewed-on: https://review.gluster.org/16938 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> >Reviewed-by: Atin Mukherjee <amukherj@redhat.com> >(cherry picked from commit 2ae4b4058691b324535d802f4e6d24cce89a10e5) Change-Id: I4f9e39da2ac2b65941399bffb6440db8a6ba59d0 BUG: 1449775 Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com> Reviewed-on: https://review.gluster.org/17240 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* nfs/nlm: remove lock request from the list after cancel | Niels de Vos | 2017-05-17 | 1 | -5/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | Once an NLM client cancels a lock request, it should be removed from the list. The list can also be cleaned of unneeded entries once the client does not have any outstanding lock/share requests/granted. Cherry picked from commit 71cb7f3eb4fb706aab7f83906592942a2ff2e924: > Change-Id: I2f2b666b627dcb52cddc6d5b95856e420b2b2e26 > BUG: 1381970 > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: https://review.gluster.org/17188 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Change-Id: I2f2b666b627dcb52cddc6d5b95856e420b2b2e26 BUG: 1450377 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/17268 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* nfs/nlm: free the nlm_client upon RPC_DISCONNECT | Niels de Vos | 2017-05-17 | 1 | -12/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | When an NLM client disconnects, it should be removed from the list and free'd. > Cherry picked from commit 6897ba5c51b29c05b270c447adb1a34cb8e61911: > Change-Id: Ib427c896bfcdc547a3aee42a652578ffd076e2ad > BUG: 1381970 > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: https://review.gluster.org/17189 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Change-Id: Ib427c896bfcdc547a3aee42a652578ffd076e2ad BUG: 1450377 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/17267 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* nfs/nlm: log the caller_name if nlm_client_t can be found | Niels de Vos | 2017-05-17 | 1 | -2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | In order to help tracking possible misbehaving clients down, log the 'caller_name' (hostname of the NFS client) that does not have a matching nlm_client_t structure. Cherry picked from commit 9bfb74a39954a7e63bfd762c816efc7e64b9df65: > Change-Id: Ib514a78d1809719a3d0274acc31ee632727d746d > BUG: 1381970 > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: https://review.gluster.org/17186 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: soumya k <skoduri@redhat.com> > Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Change-Id: Ib514a78d1809719a3d0274acc31ee632727d746d BUG: 1450377 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/17266 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* nfs/nlm: ignore notify when there is no matching rpc request | Niels de Vos | 2017-05-17 | 1 | -1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In certain (unclear) occasions it seems to happen that there are notifications sent to the Gluster/NFS NLM service, but no call-state can be found. Instead of segfaulting, log an error but keep on running. Cherry picked from commit e997d752ba08f80b1b00d2c0035874befafe5200: > Change-Id: I0f186e56e46a86ca40314d230c1cc7719c61f0b5 > BUG: 1381970 > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: https://review.gluster.org/17185 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: soumya k <skoduri@redhat.com> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> > Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Change-Id: I0f186e56e46a86ca40314d230c1cc7719c61f0b5 BUG: 1450377 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/17265 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>