glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	refcount: return pointer to the structure instead of a counter	Niels de Vos	2016-12-11	3	-23/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are no real users of the counter. It was thought of a handy tool to track and debug refcounting, but it is not used at all. Some parts of the code would benefit from a pointer getting returned instead. BUG: 1399780 Change-Id: I97e52c48420fed61be942ea27ff4849b803eed12 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/15971 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	syncop: fix conditional wait bug in parallel dir scan	Ravishankar N	2016-12-09	2	-6/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: The issue as seen by the user is detailed in the BZ but what is happening is if the no. of items in the wait queue == max-qlen, syncop_mt_dir_scan() does a pthread_cond_wait until the launched synctask workers dequeue the queue. But if for some reason the worker fails, the queue is never emptied due to which further invocations of syncop_mt_dir_scan() are blocked forever. Fix: Made some changes to _dir_scan_job_fn - If a worker encounters error while processing an entry, notify the readdir loop in syncop_mt_dir_scan() of the error but continue to process other entries in the queue, decrementing the qlen as and when we dequeue elements, and ending only when the queue is empty. - If the readdir loop in syncop_mt_dir_scan() gets an error form the worker, stop the readdir+queueing of further entries. Change-Id: I39ce073e01a68c7ff18a0e9227389245a6f75b88 BUG: 1402841 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16073 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	cluster/dht: Fix memory corruption while accessing regex stored in	Raghavendra G	2016-12-08	3	-45/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	private If reconfigure is executed parallely (or concurrently with dht_init), there are races that can corrupt memory. One such race is modification of regexes stored in conf (conf->rsync_regex_valid and conf->extra_regex_valid) through dht_init_regex. With change [1], reconfigure codepath can get executed parallely (with itself or with dht_init) and this fix is needed. Also, a reconfigure can race with any thread doing dht_layout_search, resulting in dht_layout_search accessing regex freed up by reconfigure (like in bz 1399134). [1] http://review.gluster.org/15046 Change-Id: I039422a65374cf0ccbe0073441f0e8c442ebf830 BUG: 1399134 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/15945 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	ganesha/scripts : find export id for already exported volume in ↵	Jiffin Tony Thottan	2016-12-08	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	S31ganesha-start.sh Change-Id: Iada90ed215966d3f526fa20aa5359b67f25a6944 BUG: 1401822 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/16037 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	eventsapi: Log all published events and provide option to disable logging	Aravinda VK	2016-12-08	3	-7/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Log every published event in /var/log/glusterfs/events.log, Disable logging using, gluster-eventsapi config-set disable-events-log true Also changed "log_level" config name to "log-level" Change-Id: Ib354be0c4ca999d1ccd01b810d6cd96ebc72bcd4 BUG: 1386200 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/15674 Reviewed-by: Prashanth Pai <ppai@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	glusterd/geo-rep: Fix glusterd crash	Kotresh HR	2016-12-08	3	-3/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: glusterd crashes when geo-rep mountbroker setup is created if the slave user length is more than 8 characters. Cause: _POSIX_LOGIN_NAME_MAX is used which is 9 including NULL byte. Analysis: While the man page says it sufficient for portability, but acutally it's not. Linux allows the creation of username upto 32 characters by default where the max length is 256. And NetBSD's max is 17. Linux: #getconf LOGIN_NAME_MAX 256 NetBSD: #getconf LOGIN_NAME_MAX 17 Fix: Use LOGIN_NAME_MAX instead of _POSIX_LOGIN_NAME_MAX Change-Id: I26b7230433ecbbed6e6914ed39221a478c0266a8 BUG: 1368138 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/16053 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterfsd : fix null pointer dereference in glusterfs_handle_barrier	Atin Mukherjee	2016-12-07	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: Iab86a3c4970e54c22d3170e68708e0ea432a8ea4 BUG: 1401921 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/16043 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	tools/glusterfind: avoid deleting keys directory	Milind Changire	2016-12-07	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: gluster volume delete mistakenly deletes the .keys directory under /var/lib/glusterd/glusterfind. Solution: Check for ".keys" directory and avoid deleting it. Change-Id: Ia595c8bf3f423c1ad5d6faa183a29598c07a11f9 BUG: 1402369 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/16052 Reviewed-by: Aravinda VK <avishwan@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	geo-rep: Do not restart workers when log-rsync-performance config change	Aravinda VK	2016-12-07	2	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Geo-rep restarts workers when any of the configurations changed. We don't need to restart workers if tunables like log-rsync-performance is modified. With this patch, Geo-rep workers will get new "log-rsync-performance" config automatically without restart. BUG: 1393678 Change-Id: I40ec253892ea7e70c727fa5d3c540a11e891897b Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/15816 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kotresh HR <khiremat@redhat.com>
*	cluster/afr: Remove backward compatibility for locks with v1	Pranith Kumar K	2016-12-07	2	-69/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we have cascading locks with same lk-owner there is a possibility for a deadlock to happen. One example is as follows: self-heal takes a lock in data-domain for big name with 256 chars of "aaaa...a" and starts heal in a 3-way replication when brick-0 is offline and healing from brick-1 to brick-2 is in progress. So this lock is active on brick-1 and brick-2. Now brick-0 comes online and an operation wants to take full lock and the lock is granted at brick-0 and it is waiting for lock on brick-1. As part of entry healing it takes full locks on all the available bricks and then proceeds with healing the entry. Now this lock will start waiting on brick-0 because some other operation already has a granted lock on it. This leads to a deadlock. Operation is waiting for unlock on "aaaa..." by heal where as heal is waiting for the operation to unlock on brick-0. Initially I thought this is happening because healing is trying to take a lock on all the available bricks instead of just the bricks that are participating in heal. But later realized that same kind of deadlock can happen if a brick goes down after the heal starts but comes back before it completes. So the essential problem is the cascading locks with same lk-owner which were added for backward compatibility with afr-v1 which can be safely removed now that versions with afr-v1 are already EOL. This patch removes the compatibility with v1 which requires cascading locks with same lk-owner. In the next version we can make locking-scheme option a dummy and switch completely to v2. BUG: 1401404 Change-Id: Ic9afab8260f5ff4dff5329eb0429811bcb879079 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/16024 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	cluster/afr: Serialize conflicting locks on all subvols	Pranith Kumar K	2016-12-06	2	-33/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: 1) When a blocking lock is issued and the parallel lock phase fails on all subvolumes with EAGAIN, it is not switching to serialized locking phase. 2) When quorum is enabled and locks fail partially it is better to give errno returned by brick rather than the default quorum errno. Fix: Handled this error case and changed op_errno to reflect the actual errno in case of quorum error. BUG: 1369077 Change-Id: Ifac2e4a13686e9fde601873012700966d56a7f31 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15984 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
*	build: Use %{_datadir} rpm macro in spec file	Anoop C S	2016-12-06	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: I18383b2a5f0ff5233d47715c2e91514b6c3f2b7c BUG: 1198849 Signed-off-by: Anoop C S <anoopcs@redhat.com> Reviewed-on: http://review.gluster.org/15985 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	glusterd/ganesha : handle volume reset properly for ganesha options	Jiffin Tony Thottan	2016-12-06	6	-87/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "gluster volume reset" should first unexport the volume and then delete export configuration file. Also reset option is not applicable for ganesha.enable if volume value is "all". This patch also changes the name of create_export_config into manange_export_config Change-Id: Ie81a49e7d3e39a88bca9fbae5002bfda5cab34af BUG: 1397795 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/15914 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	glusterd : coverity fix for string overflow	Muthu-vigneshwaran	2016-12-06	2	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CID : 1357872, 1357873, 1351695 BUG: 789278 Change-Id: I2ee01a6054326f35de621ee7a1f2afd09c5738fe Signed-off-by: Muthu-vigneshwaran <mvignesh@redhat.com> Reviewed-on: http://review.gluster.org/15989 Tested-by: Muthu Vigneshwaran Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	afr: fix bug in passing child index in afr_inode_write_fill	Ravishankar N	2016-12-06	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: I7b70de317a5f15a3bf483ffe40b971143deddc11 BUG: 1401218 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16029 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	rpc: fix for race between rpc and protocol/client	Rajesh Joseph	2016-12-05	2	-40/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is possible that the notification thread which notifies protocol/client layer about the disconnection is put to sleep and meanwhile, a fuse thread or a timer thread initiates and completes reconnection to the brick. The notification thread is then woken up and protocol/client layer updates its flags to indicate that network is disconnected. No reconnection is initiated because reconnection is rpc-lib layer's responsibility and its flags indicate that connection is connected. Fix: Serialize connect and disconnect notify Credit: Raghavendra Talur <rtalur@redhat.com> Change-Id: I8ff5d1a3283b47f5c26848a42016a40bc34ffc1d BUG: 1386626 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/15916 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	extras: Include shard and full-data-heal in virt group	Krutika Dhananjay	2016-12-04	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Change-Id: Iea66cb017bd1ab62da9cd65895fa65fc6896108b BUG: 1375431 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15995 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	afr, client: More mem-leak fixes in COMPOUND fop cbk	Krutika Dhananjay	2016-12-04	7	-293/+190
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bugs found and fixed: 1. Use correct subvolume index in pre-op-writev compound cbk 2. Prevent use-after-free of local->compound_args members in compound fops cbk in protocol/client 3. Fix xdata and xattr leaks in client_process_response 4. Fix possible leak of xdata in client_pre_writev() in test mode. 5. Free req->compound_req_array.compound_req_array_val as well after freeing its members 6. Free tmp_rsp->flock.lk_owner.lk_owner_val in LK fop. Change-Id: I15b646d7d4e0e5cd4ea3d2d6452c815cf2eaf68f BUG: 1401218 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16020 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	cluster/ec: Check xdata to avoid memory leak	Ashish Pandey	2016-12-02	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: ec_writev_start calls ec_make_internal_fop_xdata to set "yes" in xdata before ec_readv (an internal fop) is called for head and tail. Second call to this function is overwriting the previous allocated dict_t to "xdata", which results in memory leak. Solution: In ec_make_internal_fop_xdata, check if xdata is NULL or not to avoid overwriting xdata. Change-Id: I49b83923e11aff9b92d002e86424c0c2e1f5f74f BUG: 1400818 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/16007 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	eventsapi: Push Messages to Webhooks in parallel	Aravinda VK	2016-12-02	2	-5/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With this patch, glustereventsd will maintain one thread per webhook. If a webhook is slow, then all the events to that worker will be delayed but it will not affect the other webhooks. Note: Webhook in transit may get missed if glustereventsd reloads due to new Webhook addition or due configuration changes. BUG: 1357754 Change-Id: I2d11e01c7ac434355bc356ff75396252f51b339b Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/15966 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	dht/md-cache: Filter invalidate if the file is made a linkto file	Poornima G	2016-12-02	4	-2/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Upcall as a part of setattr, sends an invalidation and the invalidation carries the resulting stat value. When a file is converted to linkto files, even then an invalidation is set and as a result the mountpoint shows the sticky bit in the stat of the file. eg: ---------T. 945 root root 0 Nov 8 10:14 hardlink.999 Fix: When dht recieves a notification of sticky bit change, it updates the flag, to indicate md-cache to send the subsequent lookup. Change-Id: Ic2fd7a5b196db0754f9b97072e644e6bf69da606 BUG: 1392713 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15789 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
*	geo-rep/eventsapi: Add Master node information in Geo-rep Events	Aravinda VK	2016-12-02	4	-23/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added Master node information to GEOREP_ACTIVE, GEOREP_PASSIVE, GEOREP_FAULTY and GEOREP_CHECKPOINT_COMPLETED events. EVENT_GEOREP_ACTIVE(master_node and master_node_id are new fields) { "nodeid": NODEID, "ts": TIMESTAMP, "event": "GEOREP_ACTIVE", "message": { "master_volume": MASTER_VOLUME_NAME, "master_node": MASTER_NODE, "master_node_id": MASTER_NODE_ID, "slave_host": SLAVE_HOST, "slave_volume": SLAVE_VOLUME, "brick_path": BRICK_PATH } } EVENT_GEOREP_PASSIVE(master_node and master_node_id are new fields) { "nodeid": NODEID, "ts": TIMESTAMP, "event": "GEOREP_PASSIVE", "message": { "master_volume": MASTER_VOLUME_NAME, "master_node": MASTER_NODE, "master_node_id": MASTER_NODE_ID, "slave_host": SLAVE_HOST, "slave_volume": SLAVE_VOLUME, "brick_path": BRICK_PATH } } EVENT_GEOREP_FAULTY(master_node and master_node_id are new fields) { "nodeid": NODEID, "ts": TIMESTAMP, "event": "GEOREP_FAULTY", "message": { "master_volume": MASTER_VOLUME_NAME, "master_node": MASTER_NODE, "master_node_id": MASTER_NODE_ID, "current_slave_host": CURRENT_SLAVE_HOST, "slave_host": SLAVE_HOST, "slave_volume": SLAVE_VOLUME, "brick_path": BRICK_PATH } } EVENT_GEOREP_CHECKPOINT_COMPLETED(master_node and master_node_id are new fields) { "nodeid": NODEID, "ts": TIMESTAMP, "event": "GEOREP_CHECKPOINT_COMPLETED", "message": { "master_volume": MASTER_VOLUME_NAME, "master_node": MASTER_NODE, "master_node_id": MASTER_NODE_ID, "slave_host": SLAVE_HOST, "slave_volume": SLAVE_VOLUME, "brick_path": BRICK_PATH, "checkpoint_time": CHECKPOINT_TIME, "checkpoint_completion_time": CHECKPOINT_COMPLETION_TIME } } BUG: 1395660 Change-Id: Ic91af52fa248c8e982e93a06be861dfd69689f34 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/15858 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kotresh HR <khiremat@redhat.com>
*	selfheal: fix memory leak on client side healing queue	Mateusz Slupny	2016-12-02	3	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	Change-Id: I2beaba829710565a3246f7449a5cd21755cf5f7d BUG: 1399592 Signed-off-by: Mateusz Slupny <mateusz.slupny@appeartv.com> Reviewed-on: http://review.gluster.org/15968 Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	eventsapi: JSON output and different error codes	Aravinda VK	2016-12-02	4	-87/+259
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	JSON outputs are added to all commands, use `--json` to get JSON output. Following error codes are added to differenciate between errors. Any other Unknown errors will have return code 1 ERROR_SAME_CONFIG = 2 ERROR_ALL_NODES_STATUS_NOT_OK = 3 ERROR_PARTIAL_SUCCESS = 4 ERROR_WEBHOOK_ALREADY_EXISTS = 5 ERROR_WEBHOOK_NOT_EXISTS = 6 ERROR_INVALID_CONFIG = 7 ERROR_WEBHOOK_SYNC_FAILED = 8 ERROR_CONFIG_SYNC_FAILED = 9 Also hidden `node-` commands in the help message. BUG: 1357753 Change-Id: I962b5435c8a448b4573059da0eae42f3f93cc97e Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/15867 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	glusterfsd: glusterfs_ctx_defaults_init should not re-initialize ctx->locks	Rajesh Joseph	2016-12-01	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	glusterfs_ctx_new already initialize ctx->locks therefore the second initialization in glusterfs_ctx_defaults_init does not make sense. Change-Id: I6027cbd311da8e80585e0f0dcd6916e3bc8dd284 BUG: 1397419 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/15905 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Poornima G <pgurusid@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	cluster/tier: fix op-version for tier-query-limit	Milind Changire	2016-12-01	2	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Correct the op-version for tier-query-limit option from 3.9.0 to 3.9.1 Change-Id: I3a52a94c2708a97c18377e945d559a51d8025c41 BUG: 1366648 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/15990 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	ganesha/scripts : avoid incrementing Export Id value for already exported ↵	Jiffin Tony Thottan	2016-12-01	1	-32/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	volumes Currently a volume will unexport when it stops and reexport it during volume start using hook script. And also it increments the value for export id for each reexport. Since a hook script is called from every node parallely which may led inconsistency for export id value. Change-Id: Ib9f19a3172b2ade29a3b4edc908b3267c68c0b20 BUG: 1399186 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/15948 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	dht/rename : Incase of failure remove linkto file properly	Jiffin Tony Thottan	2016-12-01	2	-2/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Generally linkto file is created using root user. Consider following case, a user is trying to rename a file which he is not permitted. So the rename fails with EACESS and when rename tries to cleanup the linkto file, it fails. The above issue happens when rename/00.t test executed on nfs-ganesha clients : Steps executed in script * create a file "abc" using root * rename the file "abc" to "xyz" using a non root user, it fails with EACESS * delete "abc" * create directory "abc" using root * again try ot rename "abc" to "xyz" using non root user, test hungs here which slowly leds to OOM kill of ganesha process RCA put forwarded by Du for OOM kill of ganesha Note that when we hit this bug, we've a scenario of a dentry being present as: * a linkto file on one subvol * a directory on rest of subvols When a lookup happens on the dentry in such a scenario, the control flow goes into an infinite loop of: dht_lookup_everywhere dht_lookup_everywhere_cbk dht_lookup_unlink_cbk dht_lookup_everywhere_done dht_lookup_directory (as local->dir_count > 0) dht_lookup_dir_cbk (sets to local->need_selfheal = 1 as the entry is a linkto file on one of the subvol) dht_lookup_everywhere (as need_selfheal = 1). This infinite loop can cause increased consumption of memory due to: 1) dht_lookup_directory assigns a new layout to local->layout unconditionally 2) Most of the functions in this loop do a stack_wind of various fops. This results in growing of call stack (note that call-stack is destroyed only after lookup response is received by fuse - which never happens in this case) Thanks Du for root causing the oom kill and Sushant for suggesting the fix Change-Id: I1e16bc14aa685542afbd21188426ecb61fd2689d BUG: 1397052 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/15894 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	common-ha: IPaddr RA is not stopped when pacemaker quorum is list	Kaleb S. KEITHLEY	2016-12-01	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ken Gaillot writes: The other is pacemaker's no-quorum-policy cluster property. The default (which has not changed) is "stop" (stop all resources). Other values are "ignore" (act as if quorum was not lost), "freeze" (continue running existing resources but don't recover resources from unseen nodes) or "suicide" (shut down). But on my four node cluster % pcs property show no-quorum-policy Cluster Properties: % i.e. shows nothing. But: % pcs property list --all Cluster Properties: ... no-quorum-policy: stop ... % Seems to think it knows about it. and then % pcs property set no-quorum-policy=stop % pcs property show no-quorum-policy Cluster Properties: no-quorum-policy: stop % Which looks rather inconsistent. So we will try explicitly setting it to "stop" when there are three or more nodes. Change-Id: I47fc7ee84fcd6ad52ccb776913511978a8d517b4 BUG: 1400237 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15981 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	build: add systemd dependency to the glusterfs sub-package	Milind Changire	2016-12-01	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: /bin/systemctl is not available at install time of primary glusterfs package. Solution: Add %{?systemd_requires} to the glusterfs sub-package install time requirements. Replace all "Requires: systemd" and "Requires: systemd-units" with %{?systemd_requires}. %systemd_requires is defined in /usr/lib/rpm/macros.d/macros.systemd systemd-units is provided by systemd. Add BuildRequires: systemd for the definition of %systemd_requires as well. Change-Id: I980ece7d538ea177ca6b0e70c1c169e6f04c46d4 BUG: 1399031 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/15936 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	glusterd, cli: Fix volume options output format in get-state cli	Samikshan Bairagya	2016-12-01	1	-4/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the get-state cli outputs the volume options in the following format: Volume1.rebalance.skipped: 0 Volume1.rebalance.lookedup: 0 Volume1.rebalance.files: 0 Volume1.rebalance.data: 0Bytes [Volume1.options] features.barrier: on transport.address-family: inet performance.readdir-ahead: on nfs.disable: on Volume2.name: tv2 Volume2.id: 35854708-bb72-45a5-bdbd-77c51e5ebfb9 Volume2.type: Distribute This above format is a valid ini file format syntactically, but is not very easily parseable. This patch changes the format to look like the following and should be more easily parseable: Volume1.rebalance.skipped: 0 Volume1.rebalance.lookedup: 0 Volume1.rebalance.files: 0 Volume1.rebalance.data: 0Bytes Volume1.options.features.barrier: on Volume1.options.transport.address-family: inet Volume1.options.performance.readdir-ahead: on Volume1.options.nfs.disable: on Volume2.name: tv2 Volume2.id: 35854708-bb72-45a5-bdbd-77c51e5ebfb9 Volume2.type: Distribute Change-Id: I9768b45de288d9817ec669d3a801874eb1914750 BUG: 1399995 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: http://review.gluster.org/15975 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shubhendu Tripathi <shtripat@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	common-ha: add cluster HA status to --status output for gdeploy	Kaleb S. KEITHLEY	2016-12-01	1	-28/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gdeploy desires a one-liner "health" assessment. If all the VIP and port block/unblock RAs are located on their prefered nodes and 'Started', then the cluster is deemed to be good (healthy). N.B. status originally only checked the "online" nodes obtained from `pcs status` but we really want to consider all the configured nodes, whether they are online or not. Also one `pcs status` is enough. Change-Id: Id0e0380b6982e23763edeb0488843b5363e370b8 BUG: 1395648 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15882 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Arthy Loganathan <aloganat@redhat.com> Reviewed-by: soumya k <skoduri@redhat.com>
*	common-HA: Increase timeout for portblock RA of action=unblock	Soumya Koduri	2016-12-01	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Portblock RA of action type unblock stores the information about the client/server IPs connection in tickle_dir folder created in the shared storage. In case of node shutdown/reboot there could be cases wherein shared_storage may become unavailable for sometime. Hence increase the timeout to avoid that resource agent going into FAILED state. Change-Id: I4f98f819895cb164c3a82ba8084c7c11610f35ff BUG: 1399154 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/15947 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
*	uss: snapd should enable SSL if SSL is enabled on volume	Rajesh Joseph	2016-12-01	2	-0/+113
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During snapd graph generation we should check if SSL is enabled on main volume or not. This is because clients will communicate with snapd as if it is communicating to a brick. Change-Id: I0d7fe86c567b297a8528a48faf06161d4c3cb415 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> BUG: 1400013 Reviewed-on: http://review.gluster.org/15979 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaushal M <kaushal@redhat.com>
*	glusterd: Remove duplicate values for glusterd message macros	Samikshan Bairagya	2016-11-30	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Both GD_MSG_BRICK_CLEANUP_SUCCESS and GD_MSG_DAEMON_STATE_REQ_RCVD macros are assigned the same value (GLUSTERD_COMP_BASE + 584). Also the number of messages should be 588 instead of 587 Change-Id: I015d32435c05ded1b14cd8ba11911af826bc956b BUG: 1400026 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: http://review.gluster.org/15980 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	protocol/server: Fix mem-leaks in compound fops	Krutika Dhananjay	2016-11-30	1	-187/+116
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Remove spurious 'return' statement. * Free up 'compound_rsp_array_val' as well in the end. * Remove multiple refs on this_args->xdata. Change-Id: I212c6dbe4d81b0381c1323d05fdfcc853886b25b BUG: 1399578 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15965 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	gfapi: glfs_subvol_done should NOT wait for graph migration.	Rajesh Joseph	2016-11-29	3	-14/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In graph_setup function glfs_subvol_done is called which is executed in an epoll thread. glfs_lock waits on other thread to finish graph migration. This can lead to dead lock if we consume all the epoll threads. In general any call-back function executed in epoll thread should not call any blocking call which waits on a network reply either directly or indirectly, e.g. syncop functions should not be called in these threads. As a fix we should not wait for migration in the call-back path. Change-Id: If96d0689fe1b4d74631e383048cdc30b01690dc2 BUG: 1397754 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/15913 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	all: remove dead translators	Jeff Darcy	2016-11-29	35	-13769/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following have been completely removed from the source tree, makefiles, configure script, and RPM specfile. cluster/afr/pump cluster/ha cluster/map features/filter features/mac-compat features/path-convertor features/protect Change-Id: I2f966999ac3c180296ff90c1799548fba504f88f Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/15906 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/dht: A hard link is lost during rebalance + lookup	Mohit Agrawal	2016-11-29	1	-40/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: A hard link is lost during rebalance + lookup.Rebalance skip files if file has hardlink.In dht_migrate_file __is_file_migratable () function checks if a file has hardlink, if yes file is not migrated but if link is created after call this function then link will lost. Solution: Call __check_file_has_hardlink to check hardlink existence after (S+T) bits in migration process ,if file has hardlink then skip the file for migrate rebalance process. BUG: 1396048 Change-Id: Ia53c07ef42f1128c2eedf959a757e8df517b9d12 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/15866 Reviewed-by: Susant Palai <spalai@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	afr: fix auto-quorum	Jeff Darcy	2016-11-28	3	-49/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(1) afr_have_quorum is dead code. It was copied to afr_has_quorum, and everything else uses that, but the original was never deleted (until now). (2) Auto-quorum should be default for any N>2. Leaving quorum disabled is BAD, but apparently deemed acceptable for N=2 because there's no real quorum in that case. For any larger number (including arbiter configurations) there is such a thing as real quorum and we should use it by default. Note that for N=3 the answers we get from "N % 2" (the old check) and "N > 2" (the new one) are the same. (3) The special case for even N in afr_has_quorum has been simplified and explained more thoroughly in a comment. Change-Id: I48b33c15093512fecf516b26dcf09afecb7ae33b Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/15873 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	geo-rep: Fix Last synced status column issue during Hybrid Crawl	Aravinda VK	2016-11-28	1	-7/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During Hybrid crawl, Geo-rep maintains stime xattr in subdirectories along with the Brick root. This is done to skip directories if Geo-rep crashes before Hybrid crawl completes. Update Last synced status only when stime xattr updated in brick root. Status output will mislead if it shows sub directory stime as last synced time. BUG: 1396081 Change-Id: I5b73aee7ae4a1c1e2d1001d1f55559b9f9efd6e6 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/15869 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kotresh HR <khiremat@redhat.com>
*	afr: Fix the EIO that can occur in afr_inode_refresh as a result	Poornima G	2016-11-28	4	-18/+84
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	of cache invalidation(upcall). Issue: ------ When a cache invalidation is recieved as a result of changing pending xattr, the read_subvol is reset. Consider the below chain of execution: CHILD_DOWN ... afr_readv ... afr_inode_refresh ... afr_inode_read_subvol_reset <- as a result of pending xattr set by some other client GF_EVENT_UPCALL will be sent afr_refresh_done -> this results in an EIO, as the read subvol was reset by the end of the afr_inode_refresh Solution: --------- When GF_EVENT_UPCALL is recieved, instead of resetting read_subvol, set a variable need_refresh in inode_ctx, the next time some one starts a txn, along with event gen, need_rrefresh also needs to be checked. Change-Id: Ifda21a7a8039b8874215e1afa4bdf20f7d991b58 BUG: 1396952 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15892 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	libglusterfs: Fix a read hang	Poornima G	2016-11-28	2	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue: ===== In certain cases, there was no unwind of read from read-ahead xlator, thus resulting in hang. RCA: ==== In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead of STACK_WIND(). One such case is when inode_ctx for that file is not present (can happen if readdirp was called, and populates md-cache and serves all the lookups from cache). Consider the following graph: ... io-cache (parent) \| readdir-ahead \| read-ahead ... Below is the code snippet of ioc_readv calling STACK_WIND_TAIL: ioc_readv() { ... if (!inode_ctx) STACK_WIND_TAIL (frame, FIRST_CHILD (frame->this), FIRST_CHILD (frame->this)->fops->readv, fd, size, offset, flags, xdata); /* Ideally, this stack_wind should wind to readdir-ahead:readv() but it winds to read-ahead:readv(). See below for explaination. / ... } STACK_WIND_TAIL (frame, obj, fn, ...) { frame->this = obj; / for the above mentioned graph, frame->this will be readdir-ahead * frame->this = FIRST_CHILD (frame->this) i.e. readdir-ahead, which * is as expected / ... THIS = obj; / THIS will be read-ahead instead of readdir-ahead!, as obj expands * to "FIRST_CHILD (frame->this)" and frame->this was pointing * to readdir-ahead in the previous statement. / ... fn (frame, obj, params); / fn will call read-ahead:readv() instead of readdir-ahead:readv()! * as fn expands to "FIRST_CHILD (frame->this)->fops->readv" and * frame->this was pointing ro readdir-ahead in the first statement / ... } Thus, the readdir-ahead's readv() implementation will be skipped, and ra_readv() will be called with frame->this = "readdir-ahead" and this = "read-ahead". This can lead to corruption / hang / other problems. But in this perticular case, when 'frame->this' and 'this' passed to ra_readv() doesn't match, it causes ra_readv() to call ra_readv() again!. Thus the logic of read-ahead readv() falls apart and leads to hang. Solution: ========= Modify STACK_WIND_TAIL() as: STACK_WIND_TAIL (frame, obj, fn, ...) { next_xl = obj / resolve obj as the variables passed in obj macro can be overwritten in the further instrucions / next_xl_fn = fn / resolve fn and store in a tmp variable, before modifying any variables */ frame->this = next_xl; ... THIS = next_xl; ... next_xl_fn (frame, next_xl, params); ... } As a part of http://review.gluster.org/15901/ the caller io-cache was fixed. BUG: 1388292 Change-Id: Ie662ac8f18fa16909376f1e59387bc5b886bd0f9 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15923 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	cluster/ec: Healing should not start if only "data" bricks are UP	Ashish Pandey	2016-11-28	1	-11/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In a disperse volume with "K+R" configuration, where "K" is the number of data bricks and "R" is the number of redundancy bricks (Total number of bricks, N = K+R), if only K bricks are UP, we should NOT start heal process. This is because the bricks, which are supposed to be healed, are not UP. This will unnecessary eat up the resources. Solution: Check for the number of xl_up_count and only if it is greater than ec->fragments (number of data bricks), start heal process. Change-Id: I8579f39cfb47b65ff0f76e623b048bd67b15473b BUG: 1399072 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15937 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	cluster/afr: CLI for granular entry heal enablement/disablement	Krutika Dhananjay	2016-11-28	13	-63/+407
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When there are already existing non-granular indices created that are yet to be healed, if granular-entry-heal option is toggled from 'off' to 'on', AFR self-heal whenever it kicks in, will try to look for granular indices in 'entry-changes'. Because of the absence of name indices, granular entry healing logic will fail to heal these directories, and worse yet unset pending extended attributes with the assumption that are no entries that need heal. To get around this, a new CLI is introduced which will invoke glfsheal program to figure whether at the time an attempt is made to enable granular entry heal, there are pending heals on the volume OR there are one or more bricks that are down. If either of them is true, the command will be failed with the appropriate error. New CLI: gluster volume heal <VOL> granular-entry-heal {enable,disable} Change-Id: I1f4fe8162813b9068e198965d94169fee4adc099 BUG: 1370410 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15747 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	protocol/server: capture offset in seek	Ravishankar N	2016-11-28	3	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: http://review.gluster.org/11482 implemented seek FOP but http://review.gluster.org/#/c/14137/ 'undid' the change where we pack the offset returned by seek in server xlator before sending it to the client. As a result, seek always returns zero to the client for SEEK_HOLE/ SEEK_DATA. Fix: I think 14137 removed it unintentionally, hence adding it back again. Signed-off-by: Ravishankar N <ravishankar@redhat.com> Change-Id: I67a1f7b53214b043c5291f5704be4a50b698f91c BUG: 1398076 Reviewed-on: http://review.gluster.org/15920 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	gfapi: Fix memory leak in glfs-mgmt	Rajesh Joseph	2016-11-28	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	dictionary was not freed after serialization Change-Id: I495f2f823b0d53a0d858876bde41fde5f0705113 BUG: 1397177 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/15895 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-by: Poornima G <pgurusid@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	afr: allow I/O when favorite-child-policy is enabled	Ravishankar N	2016-11-27	9	-61/+469
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Currently, I/O on a split-brained file fails even when the favorite-child-policy is set until the self-heal is complete. Fix: If a valid 'source' is found using the set favorite-child-policy, inspect and reset the afr pending xattrs on the 'sinks' (inside appropriate locks), refresh the inode and then proceed with the read or write transaction. The resetting itself happens in the self-heal code and hence can also happen in the client side background-heal or by the shd's index-heal in addition to the txn code path explained above. When it happens in via heal, we also add checks in undo-pending to not reset the sink xattrs again. Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626 BUG: 1386188 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Simon Turcotte-Langevin <simon.turcotte-langevin@ubisoft.com> Reviewed-on: http://review.gluster.org/15673 Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	protocol/server: Print pargfid in logs for rename error	Pranith Kumar K	2016-11-27	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	BUG: 1394548 Change-Id: I42ee627c8cdf54158f083f9019a096ace449e3cc Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15872 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
*	cluster/afr: Fix bugs in [f]inodelk/[f]entrylk	Pranith Kumar K	2016-11-26	6	-336/+468
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problems: 1) Inodelk is not taking quorum into account 2) finodelk, [f]entrylk are not implemented correctly 3) By default afr doesn't go for non-blocking parallel locks. Fix: Implemented a common framework which can be used by [f]inodelk/[f]entrylk. Used quorum for the same. Change-Id: I239f13875a065298630d266941df10cfa3addc85 BUG: 1369077 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15802 Tested-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>