glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	posix: make sure atime and mtime are set when calling lutimes()	Niels de Vos	2017-01-08	1	-0/+31
	When overwriting an existing file with O_TRUNC, the 'atime' was set to 0, meaning the Epoch (01-Jan-1970 UTC). However, the 'mtime' gets updated correctly. In case 'atime' or 'mtime' is not passed in the 'struct iatt', the time values passed to the system call are taken from the current values returned by lstat(). Cherry picked from commit 9bed81ada6f91f998e9abd915b18e3f06557cdcb: > Change-Id: I7021b7161dcd6c9a3e515d98f6d4847533c434b3 > BUG: 1401777 > Reported-by: Eivind Sarto <eivindsarto@gmail.com> > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: http://review.gluster.org/16034 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Change-Id: I7021b7161dcd6c9a3e515d98f6d4847533c434b3 BUG: 1411010 Reported-by: Eivind Sarto <eivindsarto@gmail.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/16355 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
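The fallback described in this commit message can be sketched in a few lines of C. This is only an illustration, not the posix xlator code: the helper name `set_times_keep_missing` and its `have_atime`/`have_mtime` parameters are hypothetical stand-ins for the validity checks done on 'struct iatt'.

```c
#define _DEFAULT_SOURCE         /* lutimes() is a BSD/glibc extension */
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/time.h>

/* Hypothetical helper: set atime and/or mtime on 'path' without following
 * symlinks. Any timestamp the caller did not supply is read back with
 * lstat(), so lutimes() never receives 0 (the Epoch) by accident. */
static int
set_times_keep_missing (const char *path, bool have_atime, time_t atime,
                        bool have_mtime, time_t mtime)
{
        struct stat    st;
        struct timeval tv[2];

        if (lstat (path, &st) != 0)
                return -1;

        tv[0].tv_sec  = have_atime ? atime : st.st_atime;
        tv[0].tv_usec = 0;
        tv[1].tv_sec  = have_mtime ? mtime : st.st_mtime;
        tv[1].tv_usec = 0;

        return lutimes (path, tv);
}
```

Calling `set_times_keep_missing (path, false, 0, true, new_mtime)` updates only the mtime and leaves the atime at whatever lstat() reported.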
*	uss: snapd should enable SSL if SSL is enabled on volume	Rajesh Joseph	2017-01-02	1	-0/+98
	During snapd graph generation we should check whether SSL is enabled on the main volume. This is because clients communicate with snapd as if they were communicating with a brick. > Reviewed-on: http://review.gluster.org/15979 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Kaushal M <kaushal@redhat.com> (cherry picked from commit 182f0d12040dab5081ca645a3f370f65cd68b528) Change-Id: I0d7fe86c567b297a8528a48faf06161d4c3cb415 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> BUG: 1400460 Reviewed-on: http://review.gluster.org/15987 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
*	cluster/afr: Fix missing name indices due to EEXIST error	Krutika Dhananjay	2016-12-28	1	-0/+87
	Backport of: http://review.gluster.org/16286 PROBLEM: Consider a volume with granular-entry-heal and sharding enabled. When a replica is down and a shard is created as part of a write, the name index is correctly created under indices/entry-changes/<dot-shard-gfid>. Now when a read on the same region triggers another MKNOD, the fop fails on the online bricks with EEXIST. By virtue of this being a symmetric error, the failed_subvols[] array is reset to all zeroes. Because of this, before post-op, the GF_XATTROP_ENTRY_OUT_KEY will be set, causing the name index, which was created in the previous MKNOD operation, to be wrongly deleted in THIS MKNOD operation. FIX: The ideal fix would have been for a transaction to delete the name index ONLY if it knows it is the one that created the index in the first place. This would involve gathering information as to whether THIS xattrop created the index from individual bricks, aggregating their responses and, based on the various possible combinations of responses, deciding whether to delete the index or not. This is rather complex. A simpler fix is for post-op to examine local->op_ret in the event of no failed_subvols to figure out whether to delete the name index or not. This can occasionally lead to creation of stale name indices, but they won't affect the IO path or interfere with pending changelogs in any way, and self-heal in its crawl of the "entry-changes" directory will take care of deleting such indices. Change-Id: I8c5c08b7a208e840b5970fe5699dabdaf751a150 BUG: 1408785 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16294 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
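The "simpler fix" in this commit message amounts to a small decision in post-op. A rough sketch of that decision, under the assumption that it depends only on failed_subvols[] and op_ret; the function name `should_delete_name_index` is hypothetical and this is not the afr code itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical post-op check: may this transaction delete the name index? */
static bool
should_delete_name_index (const unsigned char *failed_subvols,
                          size_t child_count, int op_ret)
{
        for (size_t i = 0; i < child_count; i++) {
                if (failed_subvols[i])
                        return false;   /* a brick missed the fop: keep the
                                         * index so the entry can be healed */
        }

        /* No failed subvols. A negative op_ret here means a symmetric
         * failure (for example EEXIST on every online brick), so the index
         * left by an earlier transaction must not be deleted. */
        return op_ret >= 0;
}
```

With this shape, the second MKNOD from the problem description (no failed subvols, op_ret of -1) leaves the index alone, at the cost of an occasional stale index that self-heal later removes.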
*	tests: Fix spurious failure in tests/bugs/replicate/bug-1402730.t	Krutika Dhananjay	2016-12-23	1	-2/+2
	Backport of: http://review.gluster.org/16193 Replace the EXPECT '00000001' with EXPECT_NOT '00000000'. This is because occasionally a name-heal is performing new-entry marking on 'c' causing the pending entry changelog on it to become '00000002'. Change-Id: I89c2129f6969d3ad32d665b25e9fc55d7f9b80a1 BUG: 1406739 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16223 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	tests: Fix spurious test failure in bug-1316437.t	Rajesh Joseph	2016-12-19	1	-2/+1
	After sending SIGTERM to the gluster process we immediately check whether the process has exited. We should wait for some time before checking the process state. > Reviewed-on: http://review.gluster.org/16162 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Avra Sengupta <asengupt@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit e9d8525a0d34130ba2a582109937b8e79eecf6ab) BUG: 1405451 Change-Id: Iaba0067f6e880a7fe38e11b9fa0fe9bd103b19e2 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/16165 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Avra Sengupta <asengupt@redhat.com>
*	tests: Fix one of the md-cache test cases	Poornima G	2016-12-19	1	-1/+5
	Verify if the unlink, rename and other ops are reflected both on the current mount and other mounts. >Reviewed-on: http://review.gluster.org/15419 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Vijay Bellur <vbellur@redhat.com> >(cherry picked from commit 0fd7d0e1c78fdbedfcdb085445c4b0be3c1a97a9) Change-Id: I5a296cdd557194dcf487e65ee4a14bbeaf4be690 BUG: 1399450 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15960 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
*	tests: Fix spurious failure in bug-1402841.t-mt-dir-scan-race.t	Krutika Dhananjay	2016-12-19	1	-0/+10
	Backport of: http://review.gluster.org/16169 Check that shd is up before executing 'volume heal' command Change-Id: If302c9f4e7a3636e0cd52859f229d2c0018aa180 BUG: 1405889 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16188 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
*	libglusterfs: Fix a read hang	Poornima G	2016-12-13	2	-0/+159
	Backport of http://review.gluster.org/15923

Issue:
=====
In certain cases, there was no unwind of read from the read-ahead xlator, thus resulting in a hang.

RCA:
====
In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead of STACK_WIND(). One such case is when inode_ctx for that file is not present (can happen if readdirp was called, and populates md-cache and serves all the lookups from cache).

Consider the following graph:
...
io-cache (parent)
   |
readdir-ahead
   |
read-ahead
...

Below is the code snippet of ioc_readv calling STACK_WIND_TAIL:

ioc_readv()
{
...
    if (!inode_ctx)
        STACK_WIND_TAIL (frame, FIRST_CHILD (frame->this),
                         FIRST_CHILD (frame->this)->fops->readv, fd,
                         size, offset, flags, xdata);
        /* Ideally, this stack_wind should wind to readdir-ahead:readv()
         * but it winds to read-ahead:readv(). See below for the
         * explanation.
         */
...
}

STACK_WIND_TAIL (frame, obj, fn, ...)
{
    frame->this = obj;
    /* for the above mentioned graph, frame->this will be readdir-ahead
     * frame->this = FIRST_CHILD (frame->this) i.e. readdir-ahead, which
     * is as expected
     */
    ...
    THIS = obj;
    /* THIS will be read-ahead instead of readdir-ahead!, as obj expands
     * to "FIRST_CHILD (frame->this)" and frame->this was pointing
     * to readdir-ahead in the previous statement.
     */
    ...
    fn (frame, obj, params);
    /* fn will call read-ahead:readv() instead of readdir-ahead:readv()!
     * as fn expands to "FIRST_CHILD (frame->this)->fops->readv" and
     * frame->this was pointing to readdir-ahead in the first statement
     */
    ...
}

Thus, the readdir-ahead's readv() implementation will be skipped, and ra_readv() will be called with frame->this = "readdir-ahead" and this = "read-ahead". This can lead to corruption / hang / other problems. But in this particular case, when 'frame->this' and 'this' passed to ra_readv() don't match, it causes ra_readv() to call ra_readv() again! Thus the logic of read-ahead readv() falls apart and leads to a hang.

Solution:
=========
Modify STACK_WIND_TAIL() as:

STACK_WIND_TAIL (frame, obj, fn, ...)
{
    next_xl = obj
    /* resolve obj, as the variables passed in the obj macro argument can
     * be overwritten in the further instructions */
    next_xl_fn = fn
    /* resolve fn and store it in a tmp variable, before modifying any
     * variables */
    frame->this = next_xl;
    ...
    THIS = next_xl;
    ...
    next_xl_fn (frame, next_xl, params);
    ...
}

>Reviewed-on: http://review.gluster.org/15923 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (Cherry picked from commit 8943c19a2ef51b6e4fa66cb57211d469fe558579) BUG: 1399015 Change-Id: Ie662ac8f18fa16909376f1e59387bc5b886bd0f9 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15933 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
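The root cause in this commit is ordinary C macro behaviour: an argument is re-expanded at every use, so an expression that reads frame->this yields a different value once frame->this has been assigned. The toy program below reproduces just that effect. The `xlator` and `frame` structs and the `WIND_TAIL_BUGGY` / `WIND_TAIL_FIXED` macros are simplified stand-ins invented for this illustration; they are not the libglusterfs definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct xlator {
        const char    *name;
        struct xlator *child;
};

struct frame {
        struct xlator *this;
};

#define FIRST_CHILD(xl) ((xl)->child)

/* Buggy shape: 'obj' is expanded twice. The second expansion runs after
 * fr->this has been overwritten, so it resolves one level too deep. */
#define WIND_TAIL_BUGGY(fr, obj, target)        \
        do {                                    \
                (fr)->this = (obj);             \
                (target) = (obj);               \
        } while (0)

/* Fixed shape: resolve 'obj' once into a temporary, then use only that. */
#define WIND_TAIL_FIXED(fr, obj, target)        \
        do {                                    \
                struct xlator *next_xl = (obj); \
                (fr)->this = next_xl;           \
                (target) = next_xl;             \
        } while (0)

/* Winds from io-cache and returns the name of the xlator that gets called. */
static const char *
wind_from_io_cache (int use_fixed)
{
        static struct xlator ra  = { "read-ahead", NULL };
        static struct xlator rda = { "readdir-ahead", &ra };
        static struct xlator ioc = { "io-cache", &rda };
        struct frame         frame = { &ioc };
        struct xlator       *target = NULL;

        if (use_fixed)
                WIND_TAIL_FIXED (&frame, FIRST_CHILD (frame.this), target);
        else
                WIND_TAIL_BUGGY (&frame, FIRST_CHILD (frame.this), target);

        return target->name;
}
```

`wind_from_io_cache (0)` skips readdir-ahead and lands on read-ahead, while `wind_from_io_cache (1)` reaches readdir-ahead as intended; comparing the returned names with strcmp() shows the difference.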
*	afr: allow I/O when favorite-child-policy is enabled	Ravishankar N	2016-12-12	1	-0/+82
	Problem: Currently, I/O on a split-brained file fails even when the favorite-child-policy is set until the self-heal is complete. Fix: If a valid 'source' is found using the set favorite-child-policy, inspect and reset the afr pending xattrs on the 'sinks' (inside appropriate locks), refresh the inode and then proceed with the read or write transaction. The resetting itself happens in the self-heal code and hence can also happen in the client side background-heal or by the shd's index-heal in addition to the txn code path explained above. When it happens via heal, we also add checks in undo-pending to not reset the sink xattrs again. > Reviewed-on: http://review.gluster.org/15673 > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626 BUG: 1403121 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Simon Turcotte-Langevin <simon.turcotte-langevin@ubisoft.com> Reviewed-on: http://review.gluster.org/16088 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	cluster/afr: Fix per-txn optimistic changelog initialisation	Krutika Dhananjay	2016-12-12	1	-0/+42
	Backport of: http://review.gluster.org/16075 Incorrect initialisation of local->optimistic_change_log was leading to skipped pre-op and post-op even when a brick didn't participate in the txn because it was down. The result: a missing granular name index, resulting in some entries never getting healed. FIX: Initialise local->optimistic_change_log just before pre-op. Also fixed granular entry heal to create the granular name index in pre-op as opposed to post-op. This is to prevent loss of granular information when, during an entry txn, the good (src) brick goes offline before the post-op is done. This would cause self-heal to do conservative merge (since dirty xattr is the only information available), which when granular-entry-heal is enabled, expects granular indices, the lack of which can lead to loss of data in the worst case. Change-Id: I213d98ca9b3c4604b095478bf427fa69c04a7d64 BUG: 1403743 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16106 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	syncop: fix conditional wait bug in parallel dir scan	Ravishankar N	2016-12-11	1	-0/+31
	Problem: The issue as seen by the user is detailed in the BZ, but what is happening is this: if the no. of items in the wait queue == max-qlen, syncop_mt_dir_scan() does a pthread_cond_wait until the launched synctask workers dequeue the queue. But if for some reason the worker fails, the queue is never emptied, due to which further invocations of syncop_mt_dir_scan() are blocked forever. Fix: Made some changes to _dir_scan_job_fn - If a worker encounters an error while processing an entry, notify the readdir loop in syncop_mt_dir_scan() of the error but continue to process other entries in the queue, decrementing the qlen as and when we dequeue elements, and ending only when the queue is empty. - If the readdir loop in syncop_mt_dir_scan() gets an error from the worker, stop the readdir+queueing of further entries. > Reviewed-on: http://review.gluster.org/16073 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 2d012c4558046afd6adb3992ff88f937c5f835e4) Change-Id: I39ce073e01a68c7ff18a0e9227389245a6f75b88 BUG: 1403187 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16095 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
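The two rules in this fix (the worker keeps draining after an error; the producer stops queueing once it sees the error) can be modelled with a plain pthread producer and worker. The sketch below is a heavily simplified model, not the syncop code: `struct scan`, `worker`, `scan_dir`, `MAX_QLEN` and the `fail_at` knob are all hypothetical, the queue is reduced to a counter, and entries are "processed" while holding the lock to keep the example short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_QLEN 2              /* hypothetical max-qlen */

struct scan {
        pthread_mutex_t lock;
        pthread_cond_t  cond;
        int             qlen;       /* entries queued, not yet processed */
        int             processed;  /* entries dequeued by the worker */
        bool            done;       /* producer has stopped queueing */
        int             err;        /* first error reported by the worker */
        int             fail_at;    /* entry index that fails, -1 for none */
};

static void *
worker (void *arg)
{
        struct scan *s = arg;

        pthread_mutex_lock (&s->lock);
        for (;;) {
                while (s->qlen == 0 && !s->done)
                        pthread_cond_wait (&s->cond, &s->lock);
                if (s->qlen == 0)
                        break;          /* queue empty and producer done */
                s->qlen--;              /* always dequeue, even after an error */
                if (s->processed++ == s->fail_at && s->err == 0)
                        s->err = -1;    /* report the error, keep draining */
                pthread_cond_broadcast (&s->cond);
        }
        pthread_mutex_unlock (&s->lock);
        return NULL;
}

/* Queues up to 'nentries' entries; returns how many the worker processed. */
static int
scan_dir (int nentries, int fail_at, int *queued_out, int *err_out)
{
        struct scan s = { .fail_at = fail_at };
        pthread_t   tid;
        int         queued = 0;

        pthread_mutex_init (&s.lock, NULL);
        pthread_cond_init (&s.cond, NULL);
        pthread_create (&tid, NULL, worker, &s);

        pthread_mutex_lock (&s.lock);
        while (queued < nentries && s.err == 0) {
                while (s.qlen == MAX_QLEN && s.err == 0)
                        pthread_cond_wait (&s.cond, &s.lock);
                if (s.err != 0)
                        break;          /* worker failed: stop queueing */
                s.qlen++;
                queued++;
                pthread_cond_broadcast (&s.cond);
        }
        s.done = true;
        pthread_cond_broadcast (&s.cond);
        pthread_mutex_unlock (&s.lock);

        pthread_join (tid, NULL);
        pthread_cond_destroy (&s.cond);
        pthread_mutex_destroy (&s.lock);

        *queued_out = queued;
        *err_out = s.err;
        return s.processed;
}
```

Because the worker decrements qlen for every entry regardless of errors, the producer's wait on a full queue always ends, and every queued entry is accounted for when `scan_dir` returns.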
*	md-cache, afr: Reduce the window of stale read	Poornima G	2016-12-01	1	-0/+44
	Problem:
Consider a replica setup, where one mount writes data to a file and the other mount reads the file. In afr, read operations are not transaction based: a brick (the read subvolume) is chosen as a part of lookup or other operations, and reads are always wound only to the read subvolume, even if there was a write from a different client that failed on this brick. This stale read continues until there is a lookup or any write operation from the mount point. Currently, this is not a major issue, as a lookup is issued before every read and it will switch the read subvolume to a correct one. But with the plan of increasing the md-cache timeout to 600s, the stale read problem will be more pronounced, i.e. a stale read can continue for 600s (or more if cascaded with readdirp), as there will be no lookups.

Solution:
Afr doesn't have any built-in solution for stale reads (without affecting the performance). The solution that came up was to use upcall. When a file on any brick is marked bad for the first time, upcall sends a notification to all the clients that had recently accessed the file. The solution has 2 parts:
- Identifying when a file is marked bad, on any of the bricks, for the first time
- Client side actions on receiving the notifications

Identifying when a file is marked bad on any of the bricks for the first time:
-----------------------------------------------------------------------------
The idea is to track xattrop in upcall. xattrop currently comes with 2 afr xattrs - afr dirty bit and afr pending xattrs. Dirty xattr is set to 1 before every write, and is unset if the write succeeds. In certain scenarios, the dirty xattr can be 0 and the file could still be a bad copy. Hence do not track the dirty xattr. Pending xattr is set on the good copy, indicating the other bricks that have a bad copy. It is still not as simple as notifying when any of the pending xattrs change. That could lead to a flood of notifications, in case the other brick is completely down or consistently failing. Hence it is important to notify only once, the first time a good copy is marked bad.

Client side actions on receiving pending xattr change notification:
--------------------------------------------------------------------
md-cache will invalidate the cache of that file, so that a further lookup is passed down to afr and hence updates the read subvolume. Invalidating only in md-cache is not enough; consider the following order of operations:
- pending xattr invalidation
- invalidate md-cache
- readdirp on the bad read subvolume
- fill md-cache
- lookup (served from md-cache)
- read - wound to the old read subvol.
Hence, along with invalidating md-cache, it is very important to reset the read subvolume for that file, in afr.

Design Credit: Anuradha Talur, Ravishankar N

1. xattrop doesn't carry info saying post op/pre op.
2. Pre xattrop will have 0 value for all pending xattrs; the cbk of pre xattrop carries the on-disk xattr value. Non zero indicates healing is required.
3. Post xattrop will have a non zero value for any of the pending xattrs, if the fop failed on any of the bricks.
>Reviewed-on: http://review.gluster.org/15398 >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Signed-off-by: Poornima G <pgurusid@redhat.com> Change-Id: I469cbc111714c433984fe1c922be2ef113c25804 BUG: 1399450 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15958 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
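One possible reading of the "notify only once" rule from this commit message, written as a small predicate. It assumes upcall can see, per brick, both the pending value carried by the xattrop request and the on-disk value from before the request; the function name `first_time_marked_bad` and the array layout are hypothetical and do not mirror the upcall xlator's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* on_disk[i]: pending counter for brick i before this xattrop.
 * request[i]: pending value carried by this xattrop for brick i.
 * A pre-op xattrop carries all zeroes and therefore never notifies. */
static bool
first_time_marked_bad (const uint32_t *on_disk, const uint32_t *request,
                       size_t nbricks)
{
        for (size_t i = 0; i < nbricks; i++) {
                /* Notify only on a zero to non-zero transition; a brick
                 * that is already marked bad does not trigger again. */
                if (request[i] != 0 && on_disk[i] == 0)
                        return true;
        }
        return false;
}
```

A brick that stays down keeps producing non-zero post-op values, but its on-disk counter is already non-zero after the first failure, so the predicate stays false and clients are not flooded.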
*	marker: Fix inode value in loc, in setxattr fop	Poornima G	2016-11-21	1	-0/+29
	Backport of http://review.gluster.org/15826 On receiving a rename fop, marker_rename() stores the oldloc and newloc in its 'local' struct. Once the rename is done, the xtime marker (last updated time) is set on the file by sending a setxattr fop. When upcall receives the setxattr fop, the loc->inode is NULL and it crashes. The loc->inode can be NULL only in one valid case, i.e. in the rename case where the inode of the new loc can be NULL. Hence, marker should have filled the inode of the new_loc before issuing a setxattr. > Reviewed-on: http://review.gluster.org/15826 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Kotresh HR <khiremat@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> (cherry picked from commit 46e5466850311ee69e6ae9a11c2bba2aabadd5de) Change-Id: Id638f678c3daaf4a5c29b970b58929d377ae8977 BUG: 1396414 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15877 Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	cli/rebalance: remove brick status is incorrect	N Balachandran	2016-11-17	1	-0/+42
	If a remove brick operation is preceded by a fix-layout, running remove-brick status on a node which does not contain any of the bricks that were removed displays fix-layout status. The defrag_cmd variable was not updated in glusterd for the nodes not hosting removed bricks causing the status parsing to go wrong. This is now updated. Also made minor modifications to the spacing in the fix-layout status output. > Change-Id: Ib735ce26be7434cd71b76e4c33d9b0648d0530db > BUG: 1389697 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/15749 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 35b085ba345cafb2b0ee978a4c4475ab0dcba5a6) Change-Id: I3da89c61da07bc5e037527aafc84d184dcd1f764 BUG: 1396109 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15870 Tested-by: Atin Mukherjee <amukherj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	socket: pollerr event shouldn't trigger socket_connnect_finish	Atin Mukherjee	2016-09-21	2	-7/+5
	If connect fails with any other error than EINPROGRESS we cannot get the error status using getsockopt (... SO_ERROR ... ). Hence we need to remember the state of connect and take appropriate action in the event_handler for the same. As an added note, an event can come where poll_err is HUP and we have poll_in as well (i.e. some status was written to the socket), so for such cases we need to finish the connect, process the data and then the poll_err as is the case in the current code. Special thanks to Kaushal M & Raghavendra G for figuring out the issue. >Signed-off-by: Shyam <srangana@redhat.com> >Reviewed-on: http://review.gluster.org/15440 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: Ic45ad59ff8ab1d0a9d2cab2c924ad940b9d38528 BUG: 1377386 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15533 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	tests: fix bug-963541.t spurious failure	Atin Mukherjee	2016-09-18	1	-1/+2
	Wait for remove-brick to complete before attempting a commit. >Reviewed-on: http://review.gluster.org/15457 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Vijay Bellur <vbellur@redhat.com> Change-Id: I66ea6c48b6a69fe33d79f9d9080b6f2c1462578e BUG: 1375042 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15458 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com>
*	tests: Fix one of the upcall tests	Poornima G	2016-09-03	1	-2/+3
	Backport of http://review.gluster.org/#/c/15385/ Currently the md-cache invalidation feature is enabled by setting "performance.cache-invalidation", but this test case was submitted when "features.cache-invalidation" was the option enabling md-cache invalidation. Hence, fix the same. Change-Id: If044f6208179748a120fbe1d63b676367e707f73 BUG: 1372586 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15386 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	io-stats: Add stats for upcall notifications (v3.10dev)	Poornima G	2016-08-31	1	-0/+38
	With this patch, there will be additional entries seen in the profile info:

UPCALL : Total number of upcall events that were sent from the brick (in brick profile), and number of upcall notifications received by client (in client profile)

Cache invalidation events:
-------------------------
CI_IATT : Number of upcalls that were cache invalidation and had one of the IATT_UPDATE_FLAGS set. This indicates that one of the iatt values was changed.
CI_XATTR : Number of upcalls that were cache invalidation, and had one of the UP_XATTR or UP_XATTR_RM set. This indicates that an xattr was updated or deleted.
CI_RENAME : Number of upcalls that were cache invalidation, resulting from the renaming of a file or directory
CI_UNLINK : Number of upcalls that were cache invalidation, resulting from the unlink of a file.
CI_FORGET : Number of upcalls that were cache invalidation, resulting from the forget of an inode on the server side.

Lease events:
------------
LEASE_RECALL : Number of lease recalls sent by the brick (in brick profile), and number of lease recalls received by client (in client profile)

Note that the sum of CI_IATT, CI_XATTR, CI_RENAME, CI_UNLINK, CI_FORGET, LEASE_RECALL may not be equal to UPCALL. This is because each cache invalidation can carry multiple flags. Eg:
- Every CI_XATTR will have CI_IATT
- Every CI_UNLINK will also increment CI_IATT as link count is an iatt attribute.
Also UP_PARENT_DENTRY_FLAGS is currently not accounted for, as CI_RENAME and CI_UNLINK will always have the flag UP_PARENT_DENTRY_FLAGS

Change-Id: Ieb8cd21dde2c4c7618f12d025a5e5156f9cc0fe9 BUG: 1371543 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15193 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
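Why the per-type counters need not add up to UPCALL is easiest to see with a toy accounting function. The flag constants below (`DEMO_CI_*`), the `upcall_stats` struct and both functions are made up for this illustration; the real flag names and values live in the upcall headers and are not reproduced here.

```c
#include <assert.h>

/* Hypothetical flag bits, one per counter. */
enum {
        DEMO_CI_IATT   = 1 << 0,
        DEMO_CI_XATTR  = 1 << 1,
        DEMO_CI_RENAME = 1 << 2,
        DEMO_CI_UNLINK = 1 << 3,
        DEMO_CI_FORGET = 1 << 4
};

struct upcall_stats {
        unsigned long upcall;
        unsigned long ci_iatt, ci_xattr, ci_rename, ci_unlink, ci_forget;
};

/* One upcall bumps UPCALL once, plus one counter per flag it carries. */
static void
account_cache_invalidation (struct upcall_stats *s, unsigned int flags)
{
        s->upcall++;
        if (flags & DEMO_CI_IATT)   s->ci_iatt++;
        if (flags & DEMO_CI_XATTR)  s->ci_xattr++;
        if (flags & DEMO_CI_RENAME) s->ci_rename++;
        if (flags & DEMO_CI_UNLINK) s->ci_unlink++;
        if (flags & DEMO_CI_FORGET) s->ci_forget++;
}

static unsigned long
sum_ci_counters (const struct upcall_stats *s)
{
        return s->ci_iatt + s->ci_xattr + s->ci_rename +
               s->ci_unlink + s->ci_forget;
}
```

After one xattr invalidation (carrying the iatt and xattr flags) and one unlink invalidation (carrying the iatt and unlink flags), UPCALL is 2 while the per-type counters sum to 4.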
*	tests: Fix spurious failures because of wrong shd up function	Pranith Kumar K	2016-08-31	2	-3/+3
	Fixed the way shd up check is done to prevent self-heal daemon not running error when heal full command is executed. Change-Id: I93c4a0da12316373d62cd4ea74432cd9bf2b090c BUG: 1370053 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15341 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Anuradha Talur <atalur@redhat.com>
*	upcall: Mark the clients as accessed on readdirp entries	Poornima G	2016-08-31	1	-0/+43
	Currently when a client performs a readdirp, it is not recorded in upcall as one of the clients that have accessed the files. Hence, when any other client modifies the file, the client that had performed readdirp will not get any notifications. Fix this by adding the clients to the upcall database when they perform readdirp. Change-Id: I7767f1e26bf1bd1f67702a6d01f8aa64526ccc46 BUG: 1369430 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15313 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	md-cache: Process all the cache invalidation flags	Poornima G	2016-08-30	1	-0/+45
	Currently, md-cache only processes IATT_UPDATE_FLAGS, UP_XATTR and UP_XATTR_RM. We also need to process UP_RENAME_FLAGS, UP_FORGET, UP_PARENT_DENTRY_FLAGS and UP_NLINK_FLAGS. Otherwise the files unlinked or renamed will not be reflected on other mounts. Change-Id: Icb8b03da51482c3fc2e2a7292d16d56e11a341d9 BUG: 1211863 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15324 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	gfapi: Mark tests/basic/gfapi/1093594.t bad until it is fixed	Poornima G	2016-08-30	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: If88efe3db782a6156614af4c650d53b159ade57f BUG: 1371541 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15354 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	glusterd : Introduce reset brick	Anuradha Talur	2016-08-29	1	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The command basically allows replace brick with src and dst bricks as same. Usage: gluster v reset-brick <volname> <hostname:brick-path> start This command kills the brick to be reset. Once this command is run, admin can do other manual operations that they need to do, like configuring some options for the brick. Once this is done, resetting the brick can be continued with the following options. gluster v reset-brick <vname> <hostname:brick> <hostname:brick> commit {force} Does the job of resetting the brick. 'force' option should be used when the brick already contains volinfo id. Problem: On doing a disk-replacement of a brick in a replicate volume the following 2 scenarios may occur : a) there is a chance that reads are served from this replaced-disk brick, which leads to empty reads. b) potential data loss if next writes succeed only on replaced brick, and heal is done to other bricks from this one. Solution: After disk-replacement, make sure that reset-brick command is run for that brick so that pending markers are set for the brick and it is not chosen as source for reads and heal. But, as of now replace-brick for the same brick-path is not allowed. In order to fix the above mentioned problem, same brick-path replace-brick is needed. With this patch reset-brick commit {force} will be allowed even when source and destination <hostname:brickpath> are identical as long as 1) destination brick is not alive 2) source and destination brick have the same brick uuid and path. Also, the destination brick after replace-brick will use the same port as the source brick. 
Change-Id: I440b9e892ffb781ea4b8563688c3f85c7a7c89de BUG: 1266876 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/12250 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ashish Pandey <aspandey@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
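The two acceptance conditions for a same-path `reset-brick ... commit` described above can be sketched as follows. This is an illustrative Python model, not glusterd code: the `Brick` type, its fields, and `can_reset_brick` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brick:
    """Hypothetical stand-in for a brick's identity and liveness."""
    host_uuid: str
    path: str
    alive: bool = False


def can_reset_brick(src: Brick, dst: Brick) -> bool:
    """Return True when a same-path reset is acceptable per the commit text."""
    # 1) the destination brick must not be alive
    if dst.alive:
        return False
    # 2) source and destination must have the same brick uuid and path
    return src.host_uuid == dst.host_uuid and src.path == dst.path
```

Keeping both checks in one predicate mirrors the "as long as 1) ... 2) ..." wording of the message; how glusterd actually structures the validation is not shown in this log.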
*	tests: change EXPECT_WITHIN timeouts	Anuradha Talur	2016-08-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use defined HEAL and PROCESS_UP timeouts rather than hard code them in self-heald.t. Change-Id: I21586811904c8417b7208bb643f14dff20dc4832 BUG: 1370074 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/15316 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	md-cache: Do not use features.cache-invalidation for both md-cache and upcall	Poornima G	2016-08-27	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, the volume set option features.cache-invalidation enables the upcall feature on the server side and md-cache cache-invalidation on the client side. Multiple problems can arise from this: 1. A user may want to enable upcall for an nfs-ganesha setup but not md-cache cache-invalidation, because the nfs-clients have already cached the metadata and upcall is used to invalidate the nfs-client cache. In this case, users should have a way of disabling md-cache invalidation without disabling upcall. 2. Upcall requires an op-version of GD_OP_VERSION_3_7_0, whereas md-cache invalidation requires an op-version of GD_OP_VERSION_3_9_0. Consider a setup where the servers are at op-version GD_OP_VERSION_3_7_0 and the clients are at op-version GD_OP_VERSION_3_9_0. If there is one single volume set option, the user can enable this feature in this setup. But it can lead to a stale xattr cache, as xattr invalidation was introduced in upcall only in release 3.8. Hence, enabling md-cache invalidation should not be possible unless all the servers and clients are at op-version >= GD_OP_VERSION_3_9_0. To solve the above-mentioned issues, we have separate volume options for enabling md-cache invalidation and upcall. But this can lead to issues when a user enables md-cache invalidation and forgets to enable upcall. Probably in the next release, these can be enabled by default. Change-Id: Ie70eff97fe12fcb623eec8f4f5861ac065bf483e BUG: 1211863 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15314 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	glusterd/cli: cli to get local state representation from glusterd	Samikshan Bairagya	2016-08-26	1	-0/+141
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently there is no existing CLI that can be used to get the local state representation of the cluster as maintained in glusterd in a readable as well as parseable format. The CLI added has the following usage: # gluster get-state [daemon] [odir <path/to/output/dir>] [file <filename>] This would dump data points that reflect the local state representation of the cluster as maintained in glusterd (no other daemons are supported as of now) to a file inside the specified output directory. The default output directory and filename is /var/run/gluster and glusterd_state_<timestamp> respectively. The option for specifying the daemon name leaves room to add support for other daemons in the future. Following are the data points captured as of now to represent the state from the local glusterd pov: * Peer: - Primary hostname - uuid - state - connection status - List of hostnames * Volumes: - name, id, transport type, status - counts: bricks, snap, subvol, stripe, arbiter, disperse, redundancy - snapd status - quorum status - tiering related information - rebalance status - replace bricks status - snapshots * Bricks: - Path, hostname (for all bricks these info will be shown) - port, rdma port, status, mount options, filesystem type and signed in status for bricks running locally. 
* Services: - name, online status for initialised services * Others: - Base port, last allocated port - op-version - MYUUID Change-Id: I4a45cc5407ab92d8afdbbd2098ece851f7e3d618 BUG: 1353156 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: http://review.gluster.org/14873 Reviewed-by: Avra Sengupta <asengupt@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	cluster/afr: Prevent split-brain when bricks are brought off and on in ↵	Krutika Dhananjay	2016-08-22	1	-0/+112
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cyclic order When the bricks are brought offline and then online in cyclic order while writes are in progress on a file, thanks to inode refresh in write txns, AFR will mostly fail the write attempt when the only good copy is offline. However, there is still a remote possibility that the file will run into split-brain if the brick that has the lone good copy goes offline after the inode refresh but before the write txn completes (I call it in-flight split-brain in the patch for ease of reference), requiring intervention from admin to resolve the split-brain before the IO can resume normally on the file. To get around this, the patch does the following things: i) retains the dirty xattrs on the file ii) avoids marking the last of the good copies as bad (or accused) in case it is the one to go down during the course of a write. iii) fails that particular write with the appropriate errno. This way, we still have one good copy left despite the split-brain situation which when it is back online, will be chosen as source to do the heal. Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a BUG: 1363721 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15080 Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Oleksandr Natalenko <oleksandr@natalenko.name> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
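Points ii) and iii) above can be illustrated with a heavily simplified model. The Python below is a sketch under stated assumptions, not AFR's transaction logic: bricks are plain strings, `good_before` and `failed` are hypothetical inputs, and the dirty-xattr handling from point i) is not modelled.

```python
def bricks_to_accuse(good_before, failed):
    """Decide which bricks to mark bad after a write attempt.

    good_before: set of bricks that held a good copy before the write.
    failed: set of bricks on which the write failed.
    Returns (accused, write_ok).
    """
    survivors = good_before - failed
    if survivors:
        # At least one good copy took the write, so the failed good
        # copies can safely be accused.
        return good_before & failed, True
    # Every good copy failed: accuse nobody, so the last good copy stays
    # eligible as heal source once it is back, and fail the write.
    return set(), False
```

For example, `bricks_to_accuse({"b1"}, {"b1"})` accuses no brick and reports the write as failed, which corresponds to the lone good copy going down mid-write.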
*	tests/cli: Generate SSL certificates	Ashish Pandey	2016-08-21	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Generate SSL certificates before enabling management encryption to avoid test failure. Change-Id: Iab23b36703f4653f1d5bb9d14695e4d3fa63ad61 Signed-off-by: Ashish Pandey <aspandey@redhat.com> BUG: 1368349 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15202 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd: Fix volume restart issue upon glusterd restart	Samikshan Bairagya	2016-08-17	2	-1/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	http://review.gluster.org/#/c/14758/ introduces a check in glusterd_restart_bricks that makes sure that if server quorum is enabled and if the glusterd instance has been restarted, the bricks do not get started. This prevents bricks which have been brought down purposely, say for maintenance, from getting started upon a glusterd restart. However, this change introduced a regression for a situation that involves multiple volumes. The bricks from the first volume get started, but then for the subsequent volumes the bricks do not get started. This patch fixes that by setting the value of conf->restart_done to _gf_true only after bricks are started correctly for all volumes. Change-Id: I2c685b43207df2a583ca890ec54dcccf109d22c3 BUG: 1367478 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: http://review.gluster.org/15183 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
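The ordering issue behind this fix (a "done" flag set before all volumes were processed) can be sketched as below. This is a Python illustration of the pattern only; `restart_bricks`, `volumes`, and `start_brick` are hypothetical stand-ins and do not reflect glusterd's C data structures.

```python
def restart_bricks(volumes, start_brick):
    """Start bricks for every volume, then report that the restart is done.

    volumes: dict mapping a volume name to its list of bricks.
    start_brick: caller-supplied callable taking (volume, brick).
    """
    started = []
    for name, bricks in volumes.items():
        for brick in bricks:
            start_brick(name, brick)
            started.append((name, brick))
    # Flag the restart as done only after the loop over all volumes.
    # Setting it inside the loop would make later volumes look
    # "already restarted", so their bricks would be skipped.
    restart_done = True
    return started, restart_done
```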
*	logging: Fix per xl log level	Poornima G	2016-08-15	1	-0/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently per xlator loglevel setting doesn't work, due to the flaw in loglevel checking. Fix the same. Per xlator logging can be set using the below command: Eg: setfattr -n trusted.glusterfs.patchy-md-cache.set-log-level -v TRACE /mnt/glusterfs/0 Change-Id: I8ff1d15bd5693b6f682d99bee22a4bbb5eee646c BUG: 1362520 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15071 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	glusterd: Convert volume to replica after adding brick self heal is not ↵	Mohit Agrawal	2016-08-11	1	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	triggered Problem: Adding a brick to a distribute volume to convert it to replica does not trigger self heal. Solution: Modify the condition in brick_graph_add_index to set trusted.afr.dirty attribute in xlator. Test: To verify the patch, follow the steps below: 1) Create a single node volume gluster volume create <DIS> <IP:/dist1/brick1> 2) Start volume and create mount point mount -t glusterfs <IP>:/DIS /mnt 3) Touch some file and write some data on file 4) Add another brick along with replica 2 gluster volume add-brick DIS replica 2 <IP>:/dist2/brick2 5) Before applying the patch, the file size is 0 bytes in the mount point. BUG: 1365455 Change-Id: Ief0ccbf98ea21b53d0e27edef177db6cabb3397f Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/15118 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd : skip non directories inside /var/lib/glusterd/vols	Jiffin Tony Thottan	2016-08-08	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Right now glusterd won't come up if the vols directory contains an invalid entry. With this change, a message is logged and the entry is skipped instead. Change-Id: I665b5c35291b059cf054622da0eec4db44ec5f68 BUG: 1318591 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/13764 Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
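A "log and skip" directory scan of the kind this message describes might look like the following. It is a minimal Python sketch, not the glusterd implementation; `list_volume_dirs` and the logger name are assumptions made for the example.

```python
import logging
import os

log = logging.getLogger("vols-scan")  # hypothetical logger name


def list_volume_dirs(vols_path):
    """Return the names of sub-directories, logging and skipping anything else."""
    names = []
    with os.scandir(vols_path) as entries:
        for entry in entries:
            if not entry.is_dir():
                # A stray file no longer aborts the scan; it is only reported.
                log.warning("skipping non-directory entry: %s", entry.path)
                continue
            names.append(entry.name)
    return sorted(names)
```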
*	posix: Do not move and recreate .glusterfs/unlink directory	Ashish Pandey	2016-08-08	1	-0/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When a volume starts, it is checked whether .glusterfs/unlink exists. If it does, it is moved to landfill and the unlink directory is recreated. If a volume is mounted and we write data to it until we hit ENOSPC, restart of that volume fails because the unlink dir cannot be created: mkdir fails with ENOSPC. Solution: If the .glusterfs/unlink directory exists, don't move it to landfill. Delete all the entries inside it. Change-Id: Icde3fb36012f2f01aeb119a2da042f761203c11f BUG: 1360679 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15030 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
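The "empty it in place instead of move and recreate" approach can be sketched in Python as follows. This is an illustration of the idea only, assuming a hypothetical helper `clear_unlink_dir`; the posix xlator itself is written in C and may handle more cases.

```python
import os
import shutil


def clear_unlink_dir(unlink_dir):
    """Empty unlink_dir in place; create it only if it does not exist yet."""
    if not os.path.isdir(unlink_dir):
        os.makedirs(unlink_dir)
        return
    # The directory already exists, so no mkdir is needed. Removing its
    # entries only frees space, which matters on a full filesystem.
    for name in os.listdir(unlink_dir):
        path = os.path.join(unlink_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.unlink(path)
```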
*	tests: fix spurious failure in tests/bugs/glusterd/bug-1089668.t	Atin Mukherjee	2016-08-04	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of rebalance stop, it's always better to wait for rebalance to complete, as the former doesn't have any purpose. Change-Id: Ia1bc2a34d937a0a96543bebd257dcda619f12474 BUG: 1363948 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15085 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	libglusterfs: fix glusterd statedump crash	Atin Mukherjee	2016-08-04	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 3c04a91 removed setting typeStr to NULL if num_allocs is set to 0, which caused this regression. The code has been put back as it was earlier, and to avoid statedump printing all the NULL values, the check is modified to skip the records if num_allocs is 0 instead of total_allocs. Change-Id: Ib8bcc2fba908e88cf52b641c3f6bcba74f5e667c BUG: 1359190 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/14987 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	glusterd: clean up old port and allocate new one on every restart	Atin Mukherjee	2016-08-03	2	-49/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GlusterD as of now was blindly assuming that the brick port which was already allocated would be available to be reused, and that assumption is absolutely wrong. Solution: On first attempt, we thought GlusterD should check if the already allocated brick ports are free and, if not, allocate a new port and pass it to the daemon. But with that approach there is a possibility that if PMAP_SIGNOUT is missed, the stale port will be given back to the clients, where connections will keep failing. Now, given that port allocation always starts from base_port, even if a new port has to be allocated for the daemons every time, the port range will still be under control. So this fix tries to clean up the old port using pmap_registry_remove (), if any, and then goes for pmap_registry_alloc (). Change-Id: If54a055d01ab0cbc06589dc1191d8fc52eb2c84f BUG: 1221623 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15005 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Avra Sengupta <asengupt@redhat.com>
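The "remove the old port, then allocate from base_port" sequence can be modelled with a toy registry. The Python below is a sketch, not the pmap code: `PortRegistry` and its methods are hypothetical, and the default `base_port` value is an assumption chosen for illustration.

```python
class PortRegistry:
    """Toy port map that hands out the lowest free port at or above base_port."""

    def __init__(self, base_port=49152):  # default value is an assumption
        self.base_port = base_port
        self.in_use = {}  # port -> brick name

    def remove(self, brick):
        """Drop every port currently registered for brick."""
        for port in [p for p, b in self.in_use.items() if b == brick]:
            del self.in_use[port]

    def alloc(self, brick):
        """Register and return the lowest free port, scanning from base_port."""
        port = self.base_port
        while port in self.in_use:
            port += 1
        self.in_use[port] = brick
        return port

    def restart_brick(self, brick):
        # Clean up any stale entry first, then allocate afresh.
        self.remove(brick)
        return self.alloc(brick)
```

Because allocation always scans upward from `base_port`, repeated restarts reuse freed low ports, which is why the range stays bounded in this model.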
*	tests: Fix get_pending_heal_count check in ec	Ravishankar N	2016-07-29	10	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Continuation of http://review.gluster.org/#/c/14985. Also renamed tests/bugs/disperse to tests/bugs/ec for a better correlation to tests/basic/ec and xlators/cluster/ec Change-Id: I662b3477c12af8a0b94597769e8f00f354b1168c BUG: 1332054 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/15006 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
*	io-threads: remove least-rate-limit option and code	Jeff Darcy	2016-07-28	1	-53/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This will be unnecessary, and mostly in the way, as real fairness guarantees are implemented. Change-Id: Ic61ec1c9e9add58385f1a4eafcfe2cc554ceefc8 BUG: 1360402 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/14989 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: Gluster Build System <jenkins@build.gluster.org>
*	snapshot/snapd: Don't display pid when snapd is offline	Avra Sengupta	2016-07-27	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were previously reading the pidfile, and displaying the pid even if snapd daemon is not running. Now to fix it, we re-assign pid value to -1, if snapd is offline. Change-Id: I4baff8d489fe9380061c52aea006db90fa421cd7 BUG: 1358244 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/14981 Tested-by: Vijay Bellur <vbellur@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	afr: some coverity fixes	Ravishankar N	2016-07-26	2	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Thanks to Krutika for a cleaner way to track inode refs in afr_set_split_brain_choice(). Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df BUG: 1355604 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/14895 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	tests: Moving ./tests/bugs/snapshot/bug-1316437.t to bad test	Avra Sengupta	2016-07-25	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Moving ./tests/bugs/snapshot/bug-1316437.t to bad test, while mulling over the pros and cons of the fix. Will update the bug, as we go. Sending this patch to unblock master. Change-Id: Ia863312913686b4fa0ee0b63da13aedc0439a835 BUG: 1359717 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15001 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	tests: Fix pending-heal-count checks	Pranith Kumar K	2016-07-22	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	EXPECT_WITHIN takes regular expression to match the count, so even when there are say 10 entries to heal, it would think that the heal is complete. Fixed checking pending heal count with correct regex. Thanks to Xavi for finding this problem. Change-Id: Ic593d22468b2b586bfca864962ffa0eda96b1d1f BUG: 1332054 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14985 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
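The false positive described above comes from unanchored regular-expression matching: the pattern `0` is found inside the string `10`. The Python sketch below illustrates the difference between an unanchored search and a whole-string match; the helper names are hypothetical, and the exact regex used in the test suite is not shown in this log.

```python
import re


def loose_match(pattern, value):
    """Unanchored search: succeeds if pattern occurs anywhere in value."""
    return re.search(pattern, value) is not None


def exact_match(pattern, value):
    """Anchored match: the whole value must match pattern."""
    return re.fullmatch(pattern, value) is not None
```

With these helpers, `loose_match("0", "10")` is true (the bug), while `exact_match("0", "10")` is false and `exact_match("0", "0")` is true.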
*	tests: Enable all gfapi test cases	Poornima G	2016-07-20	15	-41/+45
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: I32bfec4af91348d96dc3e81a9d5c9cad599f821b Bug: 1358594 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/14748 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
*	tests: Fix spurious failure of tests/bugs/glusterd/bug-1111041.t	Avra Sengupta	2016-07-20	1	-6/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On a faster machine the ps check was returning two pids, including the glusterfsd process's pid, right after the process forked. Hence removing that ps, as for the scope of this test, verifying the snapd pid from the status command itself is enough. Change-Id: I8bd8fc4ea406d96e3a47f952cfe44560b615dbe6 BUG: 1358195 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/14963 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	md-cache: Add cache invalidation support to invalidate the meta data cache	Poornima G	2016-07-20	1	-7/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: md-cache currently updates its stat in cbks of selected fops. The default cache time is 1 second; if this is increased to reap the benefits of caching, we may end up with a stale cache for a long time, as there is no logic yet to notify md-cache of backend changes by another client. Solution: Use the existing upcall mechanism to invalidate the cache. For this feature to work, the "features.cache-invalidation" volume option should be enabled. This patch as is doesn't improve performance; the benefit of the patch is that it provides coherency for the stat cache, hence the cache timeout can be much longer, which in turn can improve performance. Change-Id: I2dbb0afa7b5e4a5a248f910188e0918e02f18692 BUG: 1211863 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/12951 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	xlator/trash : append '/' at the end in trash_notify_lookup_cbk	Jiffin Tony Thottan	2016-07-19	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the notify function in trash xlator, a lookup is performed to obtain the path of the old trash directory. The result usually contains the path without '/' at the end. The trash xlator expects '/' at the end for values such as 'old trash dir' and 'new trash dir'. Otherwise certain checks in the code will fail. Change-Id: I89e02e4b249314fb6536297f959865feee182c83 BUG: 1357397 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/14938 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Anoop C S <anoopcs@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
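Normalising a looked-up path so that it always ends in '/' is a small, idempotent operation. The Python helper below is a sketch of that idea only; `with_trailing_slash` is a hypothetical name and the sample path in the usage note is illustrative.

```python
def with_trailing_slash(path):
    """Return path with exactly one '/' appended if it does not already end in one."""
    return path if path.endswith("/") else path + "/"
```

Applying it twice gives the same result as applying it once, so a caller such as a lookup callback can normalise unconditionally, e.g. `with_trailing_slash("/trashdir")` yields `"/trashdir/"`.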
*	Revert "tests: remove tests for clear-locks"	Pranith Kumar K	2016-07-18	3	-0/+114
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 0086a55bb7de1ef5dc7a24583f5fc2b560e835fd. As part of Richard's patch for lock-revocation feature this bug is completely fixed (I think at least ;-) ). So bringing these back so that we will find out if there are anymore things we need to address in this code path. BUG: 1350867 Change-Id: If1440fc83b376576ae1a77b1156188a6bf53fe3a Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14817 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	tests: fix rebalance timing issue	Sakshi Bansal	2016-07-11	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With a start and stop rebalance, the stop command may fail, as by that time the rebalance process may not have come up. Using the rebalance status command to ensure that the rebalance process is up before stopping rebalance. Change-Id: I3d5123cd5dfabde2720428455b257d11b980ce21 BUG: 1354372 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/14885 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	glusterd: Don't start bricks if server quorum is not met	Samikshan Bairagya	2016-07-05	1	-0/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Upon glusterd restart, if it is observed that the server quorum isn't met anymore due to changes to the "server-quorum-ratio" global option, the bricks should be stopped if they are running. Also if glusterd has been restarted, and if server quorum is not applicable for a volume, do not restart the bricks corresponding to the volume to make sure that bricks that have been brought down purposely, say for maintenance, are not brought up. This commit moves this check that was previously inside "glusterd_spawn_daemons" to "glusterd_restart_bricks" instead. Change-Id: I0a44a2e7cad0739ed7d56d2d67ab58058716de6b BUG: 1345727 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: http://review.gluster.org/14758 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	glusterd: spawn daemons from init() on a single or two node setup	Atin Mukherjee	2016-07-05	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow glusterd to spawn the daemons at the time of initialization when peer count is less than 2. This is required if a user wants to set up a two node cluster without server side quorum and wants the bricks to come up on a node where the other node is down; however, the behaviour will be overridden when server side quorum is enabled. Change-Id: I21118e996655822467eaf329f638eb9a8bf8b7d5 BUG: 1352277 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/14848 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>