glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	cluster/ec: Implement self-heal-window_size option	Sunil Kumar Acharya	2017-04-25	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix implements the heal window size option for EC. This option control the maximum size of read/write operation carried out in self-heal process. BUG: 1441491 Change-Id: I6c0ef65c9ca18b0828f91b319d4f52ac5b77d0d8 Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com> Reviewed-on: https://review.gluster.org/17098 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	glusterd/geo-rep: Fix snapshot create in geo-rep setup	Kotresh HR	2017-04-24	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	glusterd persists geo-rep sessions in glusterd info file which is represented by dictionary 'volinfo->gsync_slaves' in memory. Glusterd also maintains in memory active geo-rep sessions in dictionary 'volinfo->gsync_active_slaves' whose key is "<slave_url>::<slavhost>". When glusterd is restarted while the geo-rep sessions are active, it builds the 'volinfo->gsync_active_slaves' from persisted glusterd info file. Since slave volume uuid is added to "voinfo->gsync_slaves" with the commit "http://review.gluster.org/13111", it builds it with key "<slave_url>::<slavehost>:<slavevol_uuid>" which is wrong. So during snapshot pre-validation which checks whether geo-rep is active or not, it always says it is ACTIVE, as geo-rep stop would not deleted this key. Fixed the same in this patch. Change-Id: I185178910b4b8a62e66aba406d88d12fabc5c122 BUG: 1443977 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: https://review.gluster.org/17093 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd: set conn->reconnect to null on timer cancellation	Atin Mukherjee	2017-04-20	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: Ic48e6652f431daeb0db027660f6c9de16d893f08 BUG: 1443896 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/17088 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
*	Implement negative lookup cache	Poornima G	2017-04-20	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before creating any file negative lookups(1 in Fuse, 4 in SMB etc.) are sent to verify if the file already exists. By serving these lookups from the cache when possible, increases the create performance by multiple folds in SMB access and some percentage in Fuse/NFS access. Feature page: https://review.gluster.org/#/c/16436 Updates #82 Change-Id: Ib1c0e7ac7a386f943d84f6398c27f9a03665b2a4 BUG: 1442569 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/16952 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	features/bit-rot-stub: bring in optional versioning	Raghavendra Bhat	2017-04-18	1	-2/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* As of now bit-rot-stub does versioning always. This leads lots of getxattr calls being made in lookups. So make object versioning optional. Change-Id: I83713e45ae59fb28004bb3cfa008f2d69edebbfa BUG: 1359599 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: https://review.gluster.org/14442 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	glusterd: Fix snapshot failure in non-root geo-rep setup	Kotresh HR	2017-04-18	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Geo-replication session directory name has the form '<mastervol>_<slavehost>_<slavevol>'. But in non-root geo-replication setup, while preparing geo-replication session directory name, glusterd is including 'user@' resulting in "<mastervol>_<user@slavehost>_<slavevol>". Hence snapshot is failing to copy geo-rep specific session files. Fixing the same. Change-Id: Id214d3186e40997d2827a0bb60d3676ca2552df7 BUG: 1442760 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: https://review.gluster.org/17067 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com>
*	dht: Add readdir-ahead in rebalance graph if parallel-readdir is on	Poornima G	2017-04-18	2	-5/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue: The value of linkto xattr is generally the name of the dht's next subvol, this requires that the next subvol of dht is not changed for the life time of the volume. But with parallel readdir enabled, the readdir-ahead loaded below dht, is optional. The linkto xattr for first subvol, when: - parallel readdir is enabled : "<volname>-readdir-head-0" - plain distribute volume : "<volname>-client-0" - distribute replicate volume : "<volname>-afr-0" The value of linkto xattr is "<volname>-readdir-head-0" when parallel readdir is enabled, and is "<volname>-client-0" if its disabled. But the dht_lookup takes care of healing if it cannot identify which linkto subvol, the xattr points to. In dht_lookup_cbk, if linkto xattr is found to be "<volname>-client-0" and parallel readdir is enabled, then it cannot understand the value "<volname>-client-0" as it expects "<volname>-readdir-head-0". In that case, dht_lookup_everywhere is issued and then the linkto file is unlinked and recreated with the right linkto xattr. The issue is when parallel readdir is enabled, mount point accesses the file that is currently being migrated. Since rebalance process doesn't have parallel-readdir feature, it expects "<volname>-client-0" where as mount expects "<volname>-readdir-head-0". Thus at some point either the mount or rebalance will fail. Solution: Enable parallel-readdir for rebalance as well and then do not allow enabling/disabling parallel-readdir if rebalance is in progress. Change-Id: I241ab966bdd850e667f7768840540546f5289483 BUG: 1436090 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/17056 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	glusterd: Add brick capacity details to get-state CLI output	Samikshan Bairagya	2017-04-14	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: I53fe180e71d41d56b129254b93bb74014a2cdb43 BUG: 1431192 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: https://review.gluster.org/17029 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd: fix glusterd_wait_for_blockers to go in infinite loop	Atin Mukherjee	2017-04-13	1	-6/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In send_attach_req () conf->blockers is bumped up before rpc_clnt_submit however the same is bumped down twice, one from the callback and one from the negative ret handling which can very well be a possible case if the rpc submit fails. Change-Id: Icb820694034cbfcb3d427911e192ac4a0f4540f6 BUG: 1441910 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/17055 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
*	glusterd: Propagate EADDRINUSE correctly to parent process	Prashanth Pai	2017-04-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	exit()/_exit(): Only the least significant 8 bits i.e (err & 255) shall be available to the waiting parent process on calling _exit() or exit() with an integer exit status. If this number is negative, the parent process doesn't readily get what it's really looking forward to handle. For example: EADDRINUSE is 98 and if exit status code is set to -98, the waiting parent process shall get 158 (= -98 & 255) as exit status. BUG: 1193929 Change-Id: Idc6b0f40c2332e087e584b4b40cbf0d29168c9cd Signed-off-by: Prashanth Pai <ppai@redhat.com> Reviewed-on: https://review.gluster.org/16200 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd: Add client details to get-state output	Samikshan Bairagya	2017-04-12	3	-1/+207
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit optionally adds client details corresponding to the locally running bricks to the get-state output. Since getting the client details involves sending RPC requests to the respective local bricks, this is a relatively more costly operation. These client details would be added to the get-state output only if the get-state command is invoked with the 'detail' option. This commit therefore also changes the get-state CLI usage. The modified usage is as follows: # gluster get-state [<daemon>] [[odir </path/to/output/dir/>] \ [file <filename>]] [detail] Change-Id: I42cd4ef160f9e96d55a08a10d32c8ba44e4cd3d8 BUG: 1431183 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: https://review.gluster.org/17003 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	rpc: add options to manage socket keepalive lifespan	Milind Changire	2017-04-12	1	-1/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Default values for handling socket timeouts for brick responses are insufficient for aggressive applications such as databases. Solution: Add 1:1 gluster options for keepalive, keepalive-idle, keepalive-interval and keepalive-timeout as per the socket level options available as per tcp(7) man page. Default values for options are NOT agressive and continue to be values which result in default timeout when only the keep alive option is turned on. These options are Linux specific and will not be applicable to the *BSDs. Change-Id: I2a08ecd949ca8ceb3e090d336ad634341e2dbf14 BUG: 1426059 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/16731 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	glusterd: Add validation for options rda-cache-limit rda-request-size	Poornima G	2017-04-11	2	-7/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently when prarallel readdir is enabled, setting any junk value to rda-cache-limit and rda-request-size succeeds. This is because of bug in the special handling of these options. Fixing the same in this patch Change-Id: I902cd9ac9134c158ab6f8aea4b001254a03547bd BUG: 1439640 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/17008 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	xlator: do not call dlclose() when debugging	Niels de Vos	2017-04-07	7	-11/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Valgrind can not show the symbols if a .so after calling dlclose(). The unhelpful ??? in the output gets resolved properly with this change: ==25170== 344 bytes in 1 blocks are definitely lost in loss record 233 of 324 ==25170== at 0x4C29975: calloc (vg_replace_malloc.c:711) ==25170== by 0x52C7C0B: __gf_calloc (mem-pool.c:117) ==25170== by 0x12B0638A: ??? ==25170== by 0x528FCE6: __xlator_init (xlator.c:472) ==25170== by 0x528FE16: xlator_init (xlator.c:498) ==25170== by 0x52DA8D6: glusterfs_graph_init (graph.c:321) ==25170== by 0x52DB587: glusterfs_graph_activate (graph.c:695) ==25170== by 0x5046407: glfs_process_volfp (glfs-mgmt.c:79) ==25170== by 0x5043B9E: glfs_volumes_init (glfs.c:281) ==25170== by 0x5044FEC: glfs_init_common (glfs.c:986) ==25170== by 0x50451A7: glfs_init@@GFAPI_3.4.0 (glfs.c:1031) By not calling dlclose(), the dynamically loaded .so is still available upon program exit, and Valgrind is able to resolve the symbols. This will add an additional leak, so dlclose() is called for normal builds, but skipped when configuring with "./configure --enable-valgrind" or passing the "run-with-valgrind" xlator option. URL: http://valgrind.org/docs/manual/faq.html#faq.unhelpful Change-Id: I2044e21b1b8fcce32ad1a817fdd795218f967731 BUG: 1425623 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/16809 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	Fixes Stale auxiliary mount when crawler fails to spawn	Sanoj Unnikrishnan	2017-04-05	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The auxiliary mount created for crawling remains if the crawler was not successfully spawned due to transport disconnect or other such issues. The patch ensures the mount is cleared in those code paths as well. Change-Id: I659fcc1d1956f8e05a37b75ebe3f3a00c24693e8 BUG: 1429330 Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com> Reviewed-on: https://review.gluster.org/16853 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com>
*	glusterd : Disallow peer detach if snapshot bricks exist on it	Gaurav Yadav	2017-03-31	3	-2/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem : - Deploy gluster on 2 nodes, one brick each, one volume replicated - Create a snapshot - Lose one server - Add a replacement peer and new brick with a new IP address - replace-brick the missing brick onto the new server (wait for replication to finish) - peer detach the old server - after doing above steps, glusterd fails to restart. Solution: With the fix detach peer will populate an error : "N2 is part of existing snapshots. Remove those snapshots before proceeding". While doing so we force user to stay with that peer or to delete all snapshots. Change-Id: I3699afb9b2a5f915768b77f885e783bd9b51818c BUG: 1322145 Signed-off-by: Gaurav Yadav <gyadav@redhat.com> Reviewed-on: https://review.gluster.org/16907 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	glusterd: reset pid to -1 if brick is not online	Atin Mukherjee	2017-03-31	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While populating brick details in gluster volume status response payload if a brick is not online then pid should be reset back to -1 so that volume status output doesn't show up the pid which was not cleaned up especially with brick multiplexing where multiple bricks belong to same process. Change-Id: Iba346da9a8cb5b5f5dd38031d4c5ef2097808387 BUG: 1437494 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/16971 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Gaurav Yadav <gyadav@redhat.com> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
*	glusterd: hold off volume deletes while still restarting bricks	Jeff Darcy	2017-03-30	5	-14/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to do this because modifying the volume/brick tree while glusterd_restart_bricks is still walking it can lead to segfaults. Without waiting we could accidentally "slip in" while attach_brick has released big_lock between retries and make such a modification. Change-Id: I30ccc4efa8d286aae847250f5d4fb28956a74b03 BUG: 1432542 Signed-off-by: Jeff Darcy <jeff@pl.atyp.us> Reviewed-on: https://review.gluster.org/16927 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd: support filesystems with dynamic inode sizes	Niels de Vos	2017-03-27	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	btrfs and zfs are two filesystems that do not have fixed sizes for inodes. Instead of logging an error, skip checking and mark the size as "N/A" like other properties that can not be reported. The error message that was reported by users on the mailinglist shows up like: [glusterd-utils.c:5458:glusterd_add_inode_size_to_dict] 0-management: could not find (null) to getinode size for /dev/vdb (btrfs): (null) package missing? Change-Id: Ib10b7a3669f2f4221075715d9fd44ce1ffc35324 Reported-by: Arman Khalatyan <arm2arm@gmail.com> URL: http://lists.gluster.org/pipermail/gluster-users/2017-March/030189.html BUG: 1433425 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/16867 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Prashanth Pai <ppai@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	glusterd: (storhaug) remove ganesha	Kaleb S. KEITHLEY	2017-03-21	14	-1390/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	remove all vestiges of ganesha The storhaug CLI is used to manage ganesha and Samba. Also any setup and teardown of the ganesha HA is initiated using storhaug to preserve the proper layering. Change-Id: I0eec0016a1b7802a36e7b2d92896b86fdf8607d5 BUG: 1420713 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: https://review.gluster.org/16504 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	TIER: watermark check during low watermark reset	hari gowtham	2017-03-18	2	-1/+95
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PROBLEM: during a low watermark reset, checking of whether the low watermark is lower than hi watermark is not done. FIX: This patch checks if the hi watermark value is higher the default low watermark. Else throws an failure of the reset command Change-Id: I8b49090c6bccce6d45c2e8076ab766047a2a6162 BUG: 1328342 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/14028 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	build/packaging: Debian and Ubuntu don't have /usr/libexec	Kaleb S. KEITHLEY	2017-03-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GLUSTERFS_LIBEXECDIR is effectively hard-coded to /usr/libexec/glusterfs in configure(.ac) Debian-based distributions don't have a /usr/libexec/ directory This issues is partially mitigated by the use of $libexecdir in some of the Makefile.am files, but even so the incorrectly defined GLUSTERFS_LIBEXECDIR results in various things such as gsyncd, glusterfind, eventsd, etc., trying to invoke other scripts and programs from a location that doesn't exist. And once we correctly define GLUSTERFS_LIBEXECDIR, then we might as well use it appropriatedly. Change-Id: If5219cadc51ae316f7ba2e2831d739235c77902d BUG: 1430841 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: https://review.gluster.org/16880 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Milind Changire <mchangir@redhat.com> Reviewed-by: Joe Julian <me@joejulian.name> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	glusterd: don't queue attach reqs before connecting	Jeff Darcy	2017-03-08	1	-11/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was causing USS tests to fail. The underlying problem here is that if we try to queue the attach request too soon after starting a brick process then the socket code will get an error trying to write to the still-unconnected socket. Its response is to shut down the socket, which causes the queued attach requests to be force-unwound. There's nothing to retry them, so they effectively never happen and those bricks (second and succeeding for a snapshot) never become available. We do have a retry loop for attach requests, but currently break out as soon as a request is queued - not actually sent. The fix is to modify that loop so it will wait some more if the rpc connection isn't even complete yet. Now we break out only when we have a completed connection and a queued request. Change-Id: Ib6be13646f1fa9072b4a944ab5f13e1b29084841 BUG: 1430148 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16868 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com>
*	glusterd : Fix for replicate and disperse volume option	Gaurav Yadav	2017-03-06	2	-2/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While setting volume option(disperse-shd-max-threads) for replicate volume and volume option(cluster-shd-max-threads) for disperse volume, glusterd is not validating volume options and setting all the values irrespective of proper validation for disperse-shd-max-threads and cluster-shd-max-threads Change-Id: Ic88815ad49e901e74ffc042170f5caabf7c17a89 BUG: 1417588 Signed-off-by: Gaurav Yadav <gyadav@redhat.com> Reviewed-on: https://review.gluster.org/16489 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	glusterd: disallow increasing replica count for arbiter volumes	Ravishankar N	2017-03-06	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: add-brick command to increase replica count in an arbiter volume succeeds, causing undesirable effects like the 4th brick being loaded with the arbiter xlator, the 3rd one losing the arbiter xlator (when the brick process is restarted), arbitration logic in afr going for a toss etc. Fix: Arbiter configuration should always be a replica 3 volume (of which 3rd brick is arbiter). Hence disallow increasing replica count for arbiter volume configurations. Change-Id: I9fe4edac880d0f711e6d44324ad5562974e53e51 BUG: 1429200 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/16845 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	cluster/ec: Introduce optimistic changelog in EC	Pranith Kumar K	2017-03-04	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Fix to https://bugzilla.redhat.com/show_bug.cgi?id=1316873 has made changes to set dirty flag before every update fop, data or metadata, and unset it after successful operation. That makes some of the fops very slow such as entry operations or metadata operations. Solution: File data operations are the only operation which take some time and setting dirty flag before a fop and unsetting it after serves the purpose as probability of failure of a fop is high when the time duration is more. For all the other operations, set dirty flag at the end of the fop, if any brick is down and need heal. Providing following option to choose between high performance or better heal marking for metadata and entry fops. Set/Unset dirty flag for every update fop at the start of the fop. If ON, this option impacts performance of entry operations or metadata operations as it will set dirty flag at the start and unset it at the end of ALL update fop. If OFF and all the bricks are good, dirty flag will be set at the start only for file fops For metadata and entry fops dirty flag will not be set at the start, if all the bricks are good. This does not impact performance for metadata operations and entry operation but has a very small window to miss marking entry as dirty in case it is required to be healed. Thanks to Xavi and Ashish for the design Picked the .t file from Ashish' patch https://review.gluster.org/16298 BUG: 1408809 Change-Id: I3ce860063f0e2901e50754dcfc3e4ed22daf819f Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: https://review.gluster.org/16821 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Xavier Hernandez <xhernandez@datalab.es> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	Fix bad free	Michael Scherer	2017-03-03	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since rsp.path is assigned a constant string, free would operate on a incorrect pointer, with likely bad results. Found by coverity scan Change-Id: I4befdd78573daa3c0c3013100f7ae69a2dcae36a BUG: 789278 Signed-off-by: Michael Scherer <misc@redhat.com> Reviewed-on: https://review.gluster.org/16716 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	glusterd: use sys_lstat instead of lstat	Jeff Darcy	2017-03-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Showed up in 0symbol-check.t while testing something else. Might as well fix it now. Change-Id: Ic6b8214de6f486187afc4987c5ffbbca02c8997f Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16820 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	Free brick_hint at the end of the function	Nigel Babu	2017-03-01	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Not freeing brick_hint causes a memory leak. This error was reported by Coverity. Change-Id: Ic923f892ea5207848cdd3fa6332a1e52e6c996b8 BUG: 789278 Signed-off-by: Nigel Babu <nigelb@redhat.com> Reviewed-on: https://review.gluster.org/16782 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd: Fix the incorrect check	Nigel Babu	2017-02-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	err_str is an array and is therefore never NULL. This condition would always be false. Change-Id: I31eb3338986a3af584e0feca8ec3e16f738378ec BUG: 789278 Signed-off-by: Nigel Babu <nigelb@redhat.com> Reviewed-on: https://review.gluster.org/16766 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	core: Clean up pmap registry up correctly on volume/brick stop	Samikshan Bairagya	2017-02-27	1	-8/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit changes the following: 1. In glusterfs_handle_terminate, send out individual pmap signout requests to glusterd for every brick. 2. Add another parameter to glusterfs_mgmt_pmap_signout function to pass the brickname that needs to be removed from the pmap registry. 3. Make sure pmap_registry_search doesn't break out from the loop iterating over the list of bricks per port if the first brick entry corresponding to a port is whitespaced out. 4. Make sure the pmap registry entries are removed for other daemons like snapd. Change-Id: I69949874435b02699e5708dab811777ccb297174 BUG: 1421590 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: https://review.gluster.org/16689 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Gaurav Yadav <gyadav@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	glusterd: add portmap details in glusterd statedump	Atin Mukherjee	2017-02-27	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	glusterd statedump file doesn't have information on the ports and its associated brick details. This is quite problematic if any setup ends up with stale ports and the only way to find the issue out is to gdb into the process which is always not available. This patch attempts to fill in this gap. Change-Id: I26b4fe753d752366ddf865ca3eeae3b4d577d555 BUG: 1426948 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/16764 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com>
*	mgmt: Reset conf_fd to default value to avoid double close	Michael Scherer	2017-02-26	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Coverity warn of a path where we use sys_close 2 times on the same file descriptor, which is likely harmless but could cause various hard to debug problems if threads are used (since the file descriptor table is shared among all threads, we could close a newly opened fd by another thread). Change-Id: I0524b31dccc0da94c7b87583e2a88ef06e003518 BUG: 789278 Signed-off-by: Michael Scherer <misc@redhat.com> Reviewed-on: https://review.gluster.org/16758 Tested-by: Michael Scherer <misc@fedoraproject.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	Remove deadcode	Michael Scherer	2017-02-26	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Found by coverity, ret is already test in the previous 'if'. Change-Id: Iefb7da07c1144470c2322f44b28f98a5904343b4 BUG: 789278 Signed-off-by: Michael Scherer <misc@redhat.com> Reviewed-on: https://review.gluster.org/16718 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Michael Scherer <misc@fedoraproject.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	Remove deadcode, found by coverty	Michael Scherer	2017-02-26	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since ret value is unchanged since last goto out, this code is unreachable. Change-Id: Iff8618739900b44bad6c4e663a4201d9e14cb457 BUG: 789278 Signed-off-by: Michael Scherer <misc@redhat.com> Reviewed-on: https://review.gluster.org/16713 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Michael Scherer <misc@fedoraproject.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	libglusterfs, gfdb, glusterfs: Add missing breaks	Nigel Babu	2017-02-26	4	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A few switches did not have breaks causing fall throughs. Most of them have been fixed with fall through comments for those that are intentional. Change-Id: I84c85726b542f38504b50fefab5eba5dbcd27a07 BUG: 1424894 Signed-off-by: Nigel Babu <nigelb@redhat.com> Reviewed-on: https://review.gluster.org/16677 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	glusterd : log improvements in glusterd_op_stage_rebalance ()	Atin Mukherjee	2017-02-24	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Include volume names in the respective staging failure error logs in rebalance staging Change-Id: Iaaab12a552930dd5274fbecec78f5735f883ab6b BUG: 1426509 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/16746 Reviewed-by: Gaurav Yadav <gyadav@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	libglusterfs, dht, locks, glusterd: Coverity fixes	Nigel Babu	2017-02-23	1	-22/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix up use after free bugs and dead code Change-Id: I8f79ed6b5108926c1fac31c147b5ecba79d10785 BUG: 1424905 Signed-off-by: Nigel Babu <nigelb@redhat.com> Reviewed-on: https://review.gluster.org/16666 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	glusterd : cluster.brick-multiplex validation is missing while setting it	Gaurav Yadav	2017-02-22	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently glusterd allow setting all values while setting cluster.brick-multiplex option. Validation of allowed options is missing. With this patch glusterd will validate the values given while setting cluster.brick-multiplex. Change-Id: I938fb16b8f5faa9d31326373cd18632b8aa7ebab BUG: 1425288 Signed-off-by: Gaurav Yadav <gyadav@redhat.com> Reviewed-on: https://review.gluster.org/16704 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd: unref brickinfo object on volume stop	Atin Mukherjee	2017-02-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If brick multiplexing is enabled, on a volume stop glusterd was not unrefing the brickinfo rpc object which lead to a flood of stale rpc logs. Change-Id: I18fedcd6921042ef2e945605466194b7b53fe2f7 BUG: 1421724 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/16699 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
*	Fix erronous comparaison of flags	Michael Scherer	2017-02-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using a binary 'or' mean that we always send the UUID, even when not required. Found by coverty scan Change-Id: Ifc4bff6b2f64febd5d2f038538218c2183518fd5 BUG: 1424815 Signed-off-by: Michael Scherer <misc@redhat.com> Reviewed-on: https://review.gluster.org/16675 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> Tested-by: Shyamsundar Ranganathan <srangana@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	Remove the useless goto	Michael Scherer	2017-02-20	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ret cannot be 0, since the error code for gf_store_save_value is -1. And the label of the goto is just after the goto, so that's deadcode. Change-Id: I227bca41f4d0755891b8e6e0f4cb2ce004615a35 BUG: 1424809 Signed-off-by: Michael Scherer <misc@redhat.com> Reviewed-on: https://review.gluster.org/16674 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Nigel Babu <nigelb@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	Remove deadcode, found by covertyscan	Michael Scherer	2017-02-19	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since ctx_dict is either assigned to the value of aggr, or we goto to out, there is no need for a 2nd goto. Change-Id: I6c4295c61e6ff412ed7b85421dcae13df8088d7c BUG: 1424796 Signed-off-by: Michael Scherer <misc@redhat.com> Reviewed-on: https://review.gluster.org/16672 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	posix: Fix creation of files with S_ISVTX on FreeBSD	Xavier Hernandez	2017-02-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On FreeBSD the S_ISVTX flag is completely ignored when creating a regular file. Since gluster needs to create files with this flag set, specialy for DHT link files, it's necessary to force the flag. This fix does this by calling fchmod() after creating a file that must have this flag set. Change-Id: I51eecfe4642974df6106b9084a0b144835a4997a BUG: 1411228 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: https://review.gluster.org/16417 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	glusterd, readdir-ahead: Fix backward incompatibility	Poornima G	2017-02-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue: Any opion is spcified in two places: In the options[] of xlator itself and glusterd-volume-set.c. The default value of this option can be specified in both the places. If its specified only in xlator then the volfile generated will not have the option and default value, it will be assigned during graph initialization. With patch [1] the option rda-request-size was changed from INT to SIZET type, and default was changed from 131072 to 128KB, but was specified only in the readdir-ahead.c. Thus with this patch alone the volfile entry for readdir-ahead looks like: volume patchy-readdir-ahead type performance/readdir-ahead subvolumes patchy-read-ahead end-volume With patch [2], the default of option rda-request-size was specified in glusterd-volume-set.c as well(as it was necessary fr parallel readdir). With this patch the readdir entry in the volfile will look like: volume patchy-readdir-ahead type performance/readdir-ahead option rda-cache-limit 10MB option rda-request-size 128KB option parallel-readdir off subvolumes patchy-read-ahead end-volume Now consider the server has both these patches and client doesn't. Server will generate a volfile with entry: The old clients which thought the option rda-request-size is of type INT will now recieve the value 128KB which it willn't understand, and hence fail the mount. The issue is seen only with the combination of [1] and [2]. Solution: Instead of specifying 128KB as default in glusterd we specify 131072 so that the old clients will interpret as INT and new ones as 128KB Credits: Raghavendra G Change-Id: I0c269a5890957fd8a38e9a05bdec088645a6688a BUG: 1423410 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/16657 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd: take conn->lock around operations on conn->reconnect	Jeff Darcy	2017-02-18	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Failure to do this could lead to a race in which a timer would be removed twice concurrently, corrupting the timer list (because gf_timer_call_cancel has no internal protection against this) and possibly causing a crash. Change-Id: Ic1a8b612d436daec88fd6cee935db0ae81a47d5c BUG: 1421721 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16662 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd : Fix for error message while removing brick	Gaurav Yadav	2017-02-17	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When remove-brick command is issued to a offline brick, glusterd error out the operation with message -: "volume remove-brick start: failed: Found stopped brick <hostname>:". With this fix while removing brick, error message is modified to "volume remove-brick start: failed: Found stopped brick <brick path>. Use force option to remove the brick" Change-Id: Id40a02fc38cdb526c4629de262967fe2383febe4 BUG: 1422624 Signed-off-by: Gaurav Yadav <gyadav@redhat.com> Reviewed-on: https://review.gluster.org/16630 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	rpcsvc: Add rpchdr and proghdr to iobref before submitting to transport	Poornima G	2017-02-15	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue: When fio is run on multiple clients (each client writes to its own files), and meanwhile the clients does a readdirp, thus the client which did a readdirp will now recieve the upcalls. In this scenario the client disconnects with rpc decode failed error. RCA: Upcall calls rpcsvc_request_submit to submit the request to socket: rpcsvc_request_submit currently: rpcsvc_request_submit () { iobuf = iobuf_new iov = iobuf->ptr fill iobuf to contain xdrised upcall content - proghdr rpcsvc_callback_submit (..iov..) ... if (iobuf) iobuf_unref (iobuf) } rpcsvc_callback_submit (... iov...) { ... iobuf = iobuf_new iov1 = iobuf->ptr fill iobuf to contain xdrised rpc header - rpchdr msg.rpchdr = iov1 msg.proghdr = iov ... rpc_transport_submit_request (msg) ... if (iobuf) iobuf_unref (iobuf) } rpcsvc_callback_submit assumes that once rpc_transport_submit_request() returns the msg is written on to socket and thus the buffers(rpchdr, proghdr) can be freed, which is not the case. In especially high workload, rpc_transport_submit_request() may not be able to write to socket immediately and hence adds it to its own queue and returns as successful. Thus, we have use after free, for rpchdr and proghdr. Hence the clients gets garbage rpchdr and proghdr and thus fails to decode the rpc, resulting in disconnect. To prevent this, we need to add the rpchdr and proghdr to a iobref and send it in msg: iobref_add (iobref, iobufs) msg.iobref = iobref; The socket layer takes a ref on msg.iobref, if it cannot write to socket and is adding to the queue. Thus we do not have use after free. Thank You for discussing, debugging and fixing along: Prashanth Pai <ppai@redhat.com> Raghavendra G <rgowdapp@redhat.com> Rajesh Joseph <rjoseph@redhat.com> Kotresh HR <khiremat@redhat.com> Mohammed Rafi KC <rkavunga@redhat.com> Soumya Koduri <skoduri@redhat.com> Change-Id: Ifa6bf6f4879141f42b46830a37c1574b21b37275 BUG: 1421937 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/16613 Reviewed-by: Prashanth Pai <ppai@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	glusterd : Fix for error mesage while detaching peers	Gaurav Yadav	2017-02-13	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When peer is detached from a cluster, an error log is being generated in glusterd.log -"Failed to reconfigure all daemon services". This log is seen in the originator node where the detach is issued. This happens in two cases. Case 1: Detach peer with no volume been created in cluster. Case 2: Detach peer after deleting all the volumes which were created but never started. In any one of the above two cases, in glusterd_check_files_identical() GlusterD fails to retrieve nfs-server.vol file from /var/lib/glusterd/nfs which gets created only when a volume is in place and and is started. With this fix both the above cases have been handled by added validation to skip reconfigure if there is no volume in started state. Change-Id: I039c0840e3d61ab54575e1e00c6a6a00874d84c0 BUG: 1421607 Signed-off-by: Gaurav Yadav <gyadav@redhat.com> Reviewed-on: https://review.gluster.org/16607 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd: keep snapshot bricks separate from regular ones	Jeff Darcy	2017-02-10	1	-82/+124
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The problem here is that a volume's transport options can change, but any snapshots' bricks don't follow along even though they're now incompatible (with respect to multiplexing). This was causing the USS+SSL test to fail. By keeping the snapshot bricks separate (though still potentially multiplexed with other snapshot bricks including those for other volumes) we can ensure that they remain unaffected by changes to their parent volumes. Also fixed various issues with how the test waits (or more precisely didn't) for various events to complete before it continues. Change-Id: Iab4a8a44fac5760373fac36956a3bcc27cf969da BUG: 1385758 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16544 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Tested-by: Avra Sengupta <asengupt@redhat.com>