glusterfs.git -

	Commit message	Author	Age	Files	Lines
*	features/index: Choose different base file on EMLINK error	Pranith Kumar K	2018-04-12	1	-18/+9
	Change-Id: I4648816af908539efdc2528608aa2ebf7f0d0e2f
	fixes: bz#1565655
	Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
	(cherry picked from commit bb12f2109a01856e8184e13cf984210d20155b13)
*	cluster/dht: Skipped files are not treated as errors	N Balachandran	2018-04-06	1	-5/+9
	For skipped files, use a return value of 1 to prevent error messages from being logged.
	> Change-Id: I18de31ac1a64d4460e88dea7826c3ba03c895861
	> BUG: 1553598
	> Signed-off-by: N Balachandran <nbalacha@redhat.com>
	Change-Id: I18de31ac1a64d4460e88dea7826c3ba03c895861
	BUG: 1555161
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/afr: Prevent ping-event handling on shd	Pranith Kumar K	2018-04-06	1	-0/+2
	On shd, we shouldn't treat any brick as down based on latency; otherwise self-heal will never happen.
	fixes: 1562723
	Change-Id: Ica07fcc4fae91a6bfd9c9a670e2be464704d94b7
	BUG: 1562723
	Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	cluster/ec: send list-node-uuids request to all subvolumes	Xavi Hernandez	2018-04-06	1	-1/+1
	The xattr trusted.glusterfs.list-node-uuids was only sent to a single subvolume. This was returning null uuids from the other subvolumes as if they were down. This fix forces that xattr to be requested from all subvolumes.
	Backport of:
	> BUG: 1561406
	Change-Id: If62eb39a6857258923ba625e153d4ad79018ea2f
	BUG: 1561731
	Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	cluster/ec: Change default read policy to gfid-hash	Ashish Pandey	2018-04-06	1	-1/+1
	Problem: Whenever we read data from a file over NFS, NFS reads more data than requested and caches it. Based on the stat information, it decides whether the cached/pre-read data is still valid. Consider a 4 + 2 EC volume with all the bricks on different nodes. In EC, with the round-robin read policy, reads are sent to different sets of data bricks. This balances the read fops across all the bricks and avoids overloading the same set of bricks. Due to small differences in clock speed, it is possible to get minor differences in atime, mtime or ctime between bricks. That can cause a different stat to be returned to NFS, based on which NFS discards cached/pre-read data that has not actually changed and could have been used.
	Solution: Change the read policy for EC to gfid-hash. That forces all reads of a file to go to the same set of bricks.
	>Change-Id: I825441cc519e94bf3dc3aa0bd4cb7c6ae6392c84
	>BUG: 1554743
	>Signed-off-by: Ashish Pandey <aspandey@redhat.com>
	Change-Id: I825441cc519e94bf3dc3aa0bd4cb7c6ae6392c84
	BUG: 1558352
	Signed-off-by: Ashish Pandey <aspandey@redhat.com>
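The difference between the two read policies described above can be sketched as follows. This is only an illustrative model: the function names are hypothetical, and the "hash" (the gfid taken as an integer, modulo the brick count) is an assumption, not the EC translator's actual code.

```python
import itertools
import uuid


def round_robin_picker(num_bricks, data_bricks):
    """Yield a different window of data bricks for every read."""
    for start in itertools.cycle(range(num_bricks)):
        yield [(start + i) % num_bricks for i in range(data_bricks)]


def gfid_hash_bricks(gfid, num_bricks, data_bricks):
    """Pick the same window of bricks for a given gfid on every read."""
    start = gfid.int % num_bricks  # hypothetical hash, for illustration only
    return [(start + i) % num_bricks for i in range(data_bricks)]


# A 4 + 2 volume: 6 bricks, 4 of which serve any given read.
gfid = uuid.UUID("00000000-0000-0000-0000-000000000007")
rr = round_robin_picker(6, 4)
first_rr, second_rr = next(rr), next(rr)
first_hash = gfid_hash_bricks(gfid, 6, 4)
second_hash = gfid_hash_bricks(gfid, 6, 4)
```

With round-robin, two consecutive reads of the same file land on different brick sets, so their stat times may differ slightly; with the gfid-based choice, both reads see the same bricks.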
*	cluster/ec: avoid delays in self-heal	Xavi Hernandez	2018-04-06	4	-48/+93
	Self-heal creates a thread per brick to sweep the index looking for files that need to be healed. These threads are started before the volume comes online, so nothing is done but waiting for the next sweep. This happens once per minute.
	When a replace brick command is executed, the new graph is loaded and all index sweeper threads are started. When all bricks have reported, a getxattr request is sent to the root directory of the volume. This causes a heal on it (because the new brick doesn't have good data), and marks its contents as pending to be healed. This is done by the index sweeper thread on the next round, one minute later.
	This patch solves this problem by waking all index sweeper threads after a successful check on the root directory.
	Additionally, the index sweep thread scans the index directory sequentially, but it might happen that after healing a directory entry more index entries are created but skipped by the current directory scan. This causes the remaining entries to be processed on the next round, one minute later. The same can happen in the next round, so the heal runs in bursts and takes a long time to finish, especially on volumes with many directory levels.
	This patch solves this problem by immediately restarting the index sweep if a directory has been healed.
	Backport of:
	> BUG: 1547662
	Change-Id: I58d9ab6ef17b30f704dc322e1d3d53b904e5f30e
	BUG: 1555201
	Signed-off-by: Xavi Hernandez <jahernan@redhat.com>
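The effect of restarting the sweep after a directory heal can be modelled with a small simulation. It is a sketch under stated assumptions: the index is a plain set of names, `children` is a hypothetical map from a directory to the entries that become pending once it is healed, and each outer iteration stands for one timer-driven round (one minute apart in the text above).

```python
def sweep(index, children, restart_on_dir_heal=True):
    """Return how many timed rounds are needed to drain the index."""
    rounds = 0
    while index:
        rounds += 1  # models waiting for the next periodic sweep
        while True:
            healed_dir = False
            # The scan works on a snapshot: entries added during the
            # scan are not seen by the current pass.
            for entry in sorted(index):
                index.discard(entry)
                new_entries = children.get(entry, [])
                if new_entries:
                    index.update(new_entries)
                    healed_dir = True
            if not (restart_on_dir_heal and healed_dir):
                break
    return rounds


tree = {"a": ["a/b"], "a/b": ["a/b/c"]}
rounds_without_restart = sweep({"a"}, tree, restart_on_dir_heal=False)
rounds_with_restart = sweep({"a"}, tree, restart_on_dir_heal=True)
```

In this model a three-level tree needs three rounds without the restart and a single round with it, which matches the "bursts" behaviour the message describes.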
*	glusterfsd: Memleak in glusterfsd process while brick mux is on	Mohit Agrawal	2018-04-06	21	-120/+204
	Problem: When a volume is stopped while brick multiplexing is enabled, memory is not cleaned up from all server-side xlators.
	Solution: To clean up memory for all server-side xlators, call fini in glusterfs_handle_terminate after sending the GF_EVENT_CLEANUP notification to the top xlator.
	> BUG: 1544090
	> Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
	> (cherry picked from commit 7c3cc485054e4ede1efb358552135b432fb7047a)
	>Note: Ran all test cases in a separate build (https://review.gluster.org/19574)
	> with the same patch after forcefully enabling brick mux; all test cases
	> passed.
	BUG: 1549473
	Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
	Change-Id: Ia10dc7f2605aa50f2b90b3fe4eb380ba9299e2fc
*	glusterd: import volumes in separate synctask	Atin Mukherjee	2018-04-06	6	-69/+343
	With brick multiplexing, attaching a brick to an existing brick process requires the compatible brick to finish its initialization and portmap sign-in. Hence the thread might have to sleep and context switch the synctask to allow the brick process to communicate with glusterd. In the normal code path this works fine, as glusterd_restart_bricks () is launched through a separate synctask.
	If there is a volume mismatch when glusterd restarts, glusterd_import_friend_volume is invoked and then tries to call glusterd_start_bricks () from the main thread, which may eventually land in a similar situation. Since this is not done through a separate synctask, the first brick never gets its turn to finish its handshaking and, as a consequence, all the other bricks fail to attach to it.
	Solution: Execute import volume and glusterd restart bricks in a separate synctask. Importing snaps also had to be done through a synctask, as the parent volume needs to be available for the snap import functionality to work.
	>mainline patch : https://review.gluster.org/#/c/19357/ https://review.gluster.org/#/c/19536/
	Change-Id: I290b244d456afcc9b913ab30be4af040d340428c
	BUG: 1543708
	Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	cluster/dht: ENOSPC will not fail rebalance	N Balachandran	2018-04-02	1	-9/+3
	ENOSPC returned by a file migration is no longer considered a rebalance failure.
	Change-Id: I21cf3a8acdc827bc478e138d6cb5db649d53a28c
	BUG: 1555161
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	glusterd: optimize glusterd import volumes code path	Atin Mukherjee	2018-03-08	1	-5/+7
	When a version mismatch was detected for one of the volumes, glusterd ended up updating all the volumes, which is overkill.
	>mainline patch : https://review.gluster.org/#/c/19358/
	Change-Id: I6df792db391ce3a1697cfa9260f7dbc3f59aa62d
	BUG: 1543709
	Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
	(cherry picked from commit bb34b07fd2ec5e6c3eed4fe0cdf33479dbf5127b)
*	cluster/afr: Fail open on split-brain	Pranith Kumar K	2018-03-08	11	-95/+208
	Problem: Append on a file in split-brain succeeds. Open is intercepted by open-behind; when a write comes on the file, open-behind does open+write. Open succeeds because afr doesn't fail it. Then the write succeeds because write-behind intercepts it. Flush is also intercepted by write-behind, so the application never gets to know that the write failed.
	Fix: Fail open on split-brain, so that when open-behind does open+write, open fails, which leads to write failure. The application will know about this failure.
	Change-Id: I4bff1c747c97bb2925d6987f4ced5f1ce75dbc15
	BUG: 1544635
	Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
	(cherry picked from commit 786343abca3474ff01aa1017210112d97cbc4843)
*	glusterd/store: handle the case of fsid being set to 0	Amar Tumballi	2018-03-06	1	-0/+19
	Generally this would happen when a system gets upgraded from a version which doesn't have fsid details to a version with fsid values. Without this change, after upgrade, people would see reduced 'df' output, causing a lot of confusion.
	Debugging Credits: Nithya B <nbalacha@redhat.com>
	Change-Id: Id718127ddfb69553b32770b25021290bd0e7c49a
	BUG: 1517260
	Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	cluster/dht: Handle single dht child in dht_lookup	N Balachandran	2018-03-05	1	-0/+10
	This patch limits itself to only handling the case where no file (data or linkto) exists on the subvol.
	Additional cases to be handled:
	1. A linkto file was found on the only child subvol. This currently calls dht_lookup_everywhere which eventually deletes it. It can be deleted directly as it will not be pointing to a valid subvol.
	2. Directory lookups - locking might be unnecessary in some cases.
	> Change-Id: I940ba34531f2aaee1d36fd9ca45ecfd46be662a4
	> BUG: 1546620
	> Signed-off-by: N Balachandran <nbalacha@redhat.com>
	Change-Id: I940ba34531f2aaee1d36fd9ca45ecfd46be662a4
	BUG: 1548270
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/dht: Ignore ENODATA from getxattr for posix acls	N Balachandran	2018-03-05	1	-7/+8
	dht_migrate_file no longer prints an error if getxattr for posix acls fails with ENODATA/ENOATTR.
	> Change-Id: Id9ecf6852cb5294c1c154b28d609889ea3420e1c
	> BUG: 1546954
	> Signed-off-by: N Balachandran <nbalacha@redhat.com>
	Change-Id: Id9ecf6852cb5294c1c154b28d609889ea3420e1c
	BUG: 1548078
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/dht: Fixed a typo	N Balachandran	2018-03-05	1	-2/+2
	Replaced "then" with "than"
	> Change-Id: I73090e8c1a639befd7c5458e8d63bd173248bc7d
	> BUG: 1547128
	> Signed-off-by: N Balachandran <nbalacha@redhat.com>
	Change-Id: I73090e8c1a639befd7c5458e8d63bd173248bc7d
	BUG: 1547841
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	protocol/server: Backport patch to reduce duplicate code in server-rpc-fops.c	Amar Tumballi	2018-02-27	1	-991/+208
	> Signed-off-by: Amar Tumballi <amarts@redhat.com>
	> (cherry picked from commit a81c0c2b9abdcb8ad73d0a226b53120d84082a09)
	BUG: 1549505
	Change-Id: Ifad0a88245fa6fdbf4c43d813b47c314d2c50435
	Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	glusterd: fix tier-enabled flag op-version check	Atin Mukherjee	2018-02-13	1	-2/+2
	The tier-enabled flag in the volinfo structure was introduced in 3.10; however, writing this value to the glusterd store was done with a wrong op-version check, which results in volume checksum failure during upgrades.
	>Change-Id: I4330d0c4594eee19cba42e2cdf49a63f106627d4
	>BUG: 1544600
	>Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
	Change-Id: I4330d0c4594eee19cba42e2cdf49a63f106627d4
	BUG: 1544637
	Signed-off-by: hari gowtham <hgowtham@redhat.com>
*	cluster/dht: Skip '..' for the volume root dir	N Balachandran	2018-02-12	1	-0/+5
	dht_populate_inode_for_dentry tries to update the layout for the '..' entry when listing the root of the volume. This entry does not correspond to an entry in the volume and therefore does not have a gfid or a layout on disk, causing layout processing to fail.
	> Change-Id: I2b7470e1c5e20d87b5545160697f24d041045140
	> BUG: 1537457
	> Signed-off-by: N Balachandran <nbalacha@redhat.com>
	Change-Id: I2b7470e1c5e20d87b5545160697f24d041045140
	BUG: 1539516
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/dht: Cleanup on fallocate failure	N Balachandran	2018-02-09	1	-1/+17
	It looks like fallocate leaves a non-empty file behind in case of some failures. We now truncate the file to 0 bytes on failure in __dht_rebalance_create_dst_file.
	> Change-Id: Ia4ad7b94bb3624a301fcc87d9e36c4dc751edb59
	> BUG: 1541916
	> Signed-off-by: N Balachandran <nbalacha@redhat.com>
	Change-Id: Ia4ad7b94bb3624a301fcc87d9e36c4dc751edb59
	BUG: 1542601
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/dht: Unlink linkto files as root	N Balachandran	2018-02-08	1	-3/+7
	Non-privileged users cannot delete linkto files. However, the failure to unlink a stale linkto causes DHT to fail the lookup with EIO and hence prevents access to the file.
	> Change-Id: Id295362d41e52263790694602f36f1219f0646a2
	> BUG: 1542318
	> Signed-off-by: N Balachandran <nbalacha@redhat.com>
	Change-Id: Id295362d41e52263790694602f36f1219f0646a2
	BUG: 1543016
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/dht: Fixed leak in dht_populate_inode_for_dentry	N Balachandran	2018-02-07	2	-6/+10
	Fixed an issue in dht_populate_inode_for_dentry where a layout is set in the inode without checking if it is already set. This overwrites the value each time without freeing the already existing layout.
	Also includes the changes in https://review.gluster.org/19471
	> Change-Id: I651bf539a0b82b4ddc4c355890c16a8e91f5f1fd
	> BUG: 1541264
	> Signed-off-by: N Balachandran <nbalacha@redhat.com>
	Change-Id: I651bf539a0b82b4ddc4c355890c16a8e91f5f1fd
	BUG: 1541267
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/afr: remove unnecessary child_up initialization	Xavier Hernandez	2018-02-06	1	-7/+0
	The child_up array was initialized with all elements being -1 to allow afr_notify() to differentiate down bricks from bricks that haven't reported yet. With the current implementation this is no longer needed, and it was causing unexpected results when other parts of the code assumed that child_up[i] != 0 meant the brick was up.
	Backport of:
	> BUG: 1541038
	Change-Id: I2a9d712ee64c512f24bd5cd3a48dcb37e3139472
	BUG: 1541930
	Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
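The pitfall described above, a third "not reported yet" state leaking into a truthiness test, can be shown in a few lines. The helper names below are hypothetical and only model the check `child_up[i] != 0` quoted in the message.

```python
NOT_REPORTED = -1  # the old initial value of every child_up element


def count_up_nonzero(child_up):
    """Counts a brick as up whenever its state is non-zero."""
    return sum(1 for state in child_up if state != 0)


def count_up_exact(child_up):
    """Counts a brick as up only when its state is exactly 1."""
    return sum(1 for state in child_up if state == 1)


old_init = [NOT_REPORTED] * 3   # before the patch: nothing has reported yet
new_init = [0] * 3              # after the patch: plain zero initialization

wrongly_up = count_up_nonzero(old_init)
correct_after_patch = count_up_nonzero(new_init)
```

With the old initialization, a non-zero test reports three bricks as up before any of them has reported; with zero initialization the same test gives zero.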
*	cluster/dht: Add migration checks to dht_(f)xattrop	N Balachandran	2018-02-06	8	-45/+362
	The dht_(f)xattrop implementation did not implement migration phase1/phase2 checks which could cause issues with rebalance on sharded volumes. This does not solve the issue where fops may reach the target out of order.
	> Change-Id: I2416fc35115e60659e35b4b717fd51f20746586c
	> BUG: 1471031
	> Signed-off-by: N Balachandran <nbalacha@redhat.com>
	Change-Id: I2416fc35115e60659e35b4b717fd51f20746586c
	BUG: 1540224
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/ec: OpenFD heal implementation for EC	Sunil Kumar Acharya	2018-02-02	7	-32/+184
	Existing EC code doesn't try to heal the OpenFD to avoid unnecessary healing of the data later. This fix implements healing of open FDs before carrying out file operations on them, by attempting to open the FDs on the required up nodes.
	Backport of:
	>BUG: 1431955
	BUG: 1536334
	Change-Id: Ib696f59c41ffd8d5678a484b23a00bb02764ed15
	Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
*	glusterd: process pmap sign in only when port is marked as free	Atin Mukherjee	2018-02-02	1	-0/+15
	Because of a race in the volume start code path, caused by friend handshaking with volumes that have quorum enabled, glusterd may start a brick, get a disconnect, and then immediately try to start the same brick instance based on another friend update request. If the sign-in event for the very first brick is then sent even though that process did not come up, we end up with two duplicate portmap entries for the same brick. Since brick start marks the previous port as free, it is better to treat a sign-in request as a no-op if the corresponding port type is marked as free.
	>mainline patch : https://review.gluster.org/#/c/19263/
	Change-Id: I995c348c7b6988956d24b06bf3f09ab64280fc32
	BUG: 1537346
	Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
	(cherry picked from commit 9d708a3739c8201d23f996c413d6b08f8b13dd90)
*	selinux-xlator : validate dict before calling dict_rename_key()	Jiffin Tony Thottan	2018-02-02	1	-4/+4
	Upstream reference
	>Change-Id: I71da3b64e5e8c82e8842e119b2b05da3e2ace550
	>BUG: 1535772
	>Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
	>(cherry picked from commit bee06ccd7b80e3f5804f0c7c7c56936fed6d2b4e)
	Change-Id: I71da3b64e5e8c82e8842e119b2b05da3e2ace550
	BUG: 1533269
*	posix: delete stale gfid handles in nameless lookup	Ravishankar N	2018-01-16	1	-1/+16
	..in order for self-heal of symlinks to work properly (see BZ for details).
	Backport of https://review.gluster.org/#/c/19070/
	Signed-off-by: Ravishankar N <ravishankar@redhat.com>
	Change-Id: I9a011d00b07a690446f7fd3589e96f840e8b7501
	BUG: 1534847
*	glusterd: connect to an existing brick process when quorum status is NOT_APPLICABLE_QUORUM	Atin Mukherjee	2018-01-12	9	-15/+41
	First of all, this patch reverts commit 635c1c3, as it causes a regression where bricks do not come up in time when a node is rebooted. This patch tries to fix the problem in a different way, by just trying to connect to an existing running brick when quorum status is not applicable.
	>mainline patch : https://review.gluster.org/#/c/19134/
	Change-Id: I0efb5901832824b1c15dcac529bffac85173e097
	BUG: 1511301
	Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	cluster/dht: Use percentages for space check	N Balachandran	2018-01-12	1	-5/+20
	With heterogeneous bricks now being supported in DHT, we could run into issues where files are not migrated even though there is sufficient space in newly added bricks which just happen to be considerably smaller than older bricks. Using percentages instead of absolute available space for space checks can mitigate that to some extent.
	Marking bug-1247563.t as that used to depend on the easier code to prevent a file from migrating. This will be removed once we find a way to force a file migration failure.
	> Change-Id: I3452520511f304dbf5af86f0632f654a92fcb647
	> BUG: 1529440
	> Signed-off-by: N Balachandran <nbalacha@redhat.com>
	Change-Id: I3452520511f304dbf5af86f0632f654a92fcb647
	BUG: 1530455
	Signed-off-by: N Balachandran <nbalacha@redhat.com>
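Why percentages help with bricks of different sizes can be seen in a minimal sketch. The two predicates below are hypothetical simplifications; the actual DHT space check is not shown in the message and may apply further conditions.

```python
def more_room_absolute(src_free, dst_free):
    """Migrate only if the destination has more free bytes than the source."""
    return dst_free > src_free


def more_room_percent(src_free, src_total, dst_free, dst_total):
    """Migrate if the destination has a larger share of its space free."""
    return dst_free / dst_total > src_free / src_total


# Sizes in GiB: an old 10 TiB brick that is 80% full and a new,
# much smaller 1 TiB brick that is about 10% full.
src_total, src_free = 10240, 2048
dst_total, dst_free = 1024, 921

absolute_says_migrate = more_room_absolute(src_free, dst_free)
percent_says_migrate = more_room_percent(src_free, src_total, dst_free, dst_total)
```

The absolute comparison refuses the migration because 921 GiB is less than 2048 GiB, while the percentage comparison allows it because the small brick is far emptier relative to its size.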
*	quota: fixes issue in quota.conf when setting large number of limits	Sanoj Unnikrishnan	2018-01-10	1	-12/+33
	Problem: It was not possible to configure more than 7712 quota limits. This was because a stack buffer of size 131072 was used to read from the quota.conf file. In the new format of the quota.conf file each gfid entry takes 17 bytes (16-byte gfid + 1-byte type). So the buf_size was not a multiple of the gfid entry size, and the code treated this as corruption.
	Solution: Make the buffer size a multiple of the gfid entry size.
	Change-Id: Id036225505a47a4f6fa515a572ee7b0c958f30ed
	BUG: 1489043
	Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com>
	(cherry picked from commit 2899a4f125735636fe7cd8db73c0b8a13289df9b)
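The arithmetic behind the problem is easy to check. The sketch below uses the two sizes from the message; the helper name and the choice to round down are assumptions, since the message only says the buffer size must be a multiple of the entry size.

```python
GFID_ENTRY_SIZE = 17      # 16-byte gfid + 1-byte type, from the message
OLD_BUF_SIZE = 131072     # stack buffer size, from the message


def aligned_buf_size(max_size, entry_size):
    """Largest size not exceeding max_size that holds whole entries only."""
    return (max_size // entry_size) * entry_size


leftover = OLD_BUF_SIZE % GFID_ENTRY_SIZE
new_buf_size = aligned_buf_size(OLD_BUF_SIZE, GFID_ENTRY_SIZE)
```

Since 131072 leaves a remainder of 2 when divided by 17, a full buffer always ended in a partial entry; rounding down to 131070 removes the partial tail.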
*	Revert "mount/fuse: report ESTALE as ENOENT"	Raghavendra G	2018-01-10	1	-3/+0
	This reverts commit 26d16b90ec7f8acbe07e56e8fe1baf9c9fa1519e.
	Consider rename (index.new, store.idx) and open (store.idx) being executed in parallel. When we break down the operations, the following sequence is possible:
	* lookup (store.idx), as part of open (store.idx), returns gfid1 as the result.
	* rename (index.new, store.idx) changes the gfid of store.idx to gfid2. Note that gfid2 was the nodeid of index.new. Since the rename is successful, gfid2 is associated with store.idx.
	* open (store.idx) resumes and issues an open fop to glusterfs with gfid1. Open in glusterfs fails as gfid1 doesn't exist, and the error returned by glusterfs to kernel-fuse is ENOENT.
	* The kernel passes the same error back to the application as the result of open.
	This error could have been prevented if the kernel retried open with gfid2. Interestingly, the kernel does retry open when it receives an ESTALE error. Even though the failure to find the gfid resulted in an ESTALE error, commit 26d16b90ec7f8acb converted that error to ENOENT while sending the error reply to the kernel. This prevented the kernel from retrying open, resulting in the error.
	>Change-Id: I2e752ca60dd8af1b989dd1d29c7b002ee58440b4
	>BUG: 1500269
	>Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
	(cherry picked from commit 019a55e708375d2b1e576fcc948a691bcdc5c749)
	Change-Id: I2e752ca60dd8af1b989dd1d29c7b002ee58440b4
	BUG: 1529088
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
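The sequence above can be replayed in a toy model to show why the error code matters. Everything here is hypothetical scaffolding: `open_with_retry` only mimics the retry-on-ESTALE behaviour the message attributes to the kernel, and `make_volume` fakes a lookup that returns gfid1 first and gfid2 after the rename.

```python
import errno
import os


def open_with_retry(lookup, do_open, name):
    """Open by gfid; on ESTALE, look the name up again and retry once."""
    gfid = lookup(name)
    try:
        return do_open(gfid)
    except OSError as err:
        if err.errno != errno.ESTALE:
            raise  # ENOENT and others go straight back to the caller
    return do_open(lookup(name))


def make_volume(stale_errno):
    """Fake volume where gfid1 vanished and the name now maps to gfid2."""
    answers = iter(["gfid1", "gfid2"])

    def lookup(name):
        return next(answers)

    def do_open(gfid):
        if gfid != "gfid2":
            raise OSError(stale_errno, os.strerror(stale_errno))
        return "fd-on-" + gfid

    return lookup, do_open


result_estale = open_with_retry(*make_volume(errno.ESTALE), "store.idx")
try:
    open_with_retry(*make_volume(errno.ENOENT), "store.idx")
    result_enoent = None
except OSError as err:
    result_enoent = err.errno
```

When the missing gfid is reported as ESTALE the open succeeds on the second attempt; when it is reported as ENOENT the caller sees the failure directly.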
*	glusterd: Nullify pmap entry for bricks belonging to same port	Atin Mukherjee	2018-01-10	1	-1/+1
	Commit 30e0b86 tried to address all the stale port issues glusterd had when a brick is abruptly killed. In the brick multiplexing case, because of a bug, the portmap entry was not getting removed. This patch addresses that.
	>mainline patch : https://review.gluster.org/#/c/19119/
	Change-Id: Ib020b967a9b92f1abae9cab9492f0cacec59aaa1
	BUG: 1530448
	Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	mount/fuse: never fail open(dir) with ENOENT	Raghavendra G	2018-01-03	1	-0/+7
	open(dir), being an operation on an inode, should never fail with ENOENT. If the gfid is not present, the appropriate error is ESTALE. This will enable the kernel to retry open after a revalidate lookup.
	>Change-Id: I8d07d2ebb5a0da6c3ea478317442cb42f1797a4b
	>BUG: 1500269
	>Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
	(cherry picked from commit fb4b914ce84bc83a5f418719c5ba7c25689a9251)
	Change-Id: I8d07d2ebb5a0da6c3ea478317442cb42f1797a4b
	BUG: 1529088
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
*	performance/write-behind: fix bug while handling short writes	Raghavendra G	2018-01-02	1	-2/+3
	The variable "fulfilled" in wb_fulfill_short_write is not reset to 0 while handling every member of the list. This has some interesting consequences:
	* If we break from the loop while processing the last member of the list head->winds, req is reset to head as the list is a circular one. However, head is already fulfilled and can potentially be freed. So, we end up adding a freed request to the wb_inode->todo list. This is the RCA for the crash tracked by the bug associated with this patch (note that we saw "holder", which is freed, in the todo list).
	* If we break from the loop while processing any member of the list head->winds other than the last, req is set to the next member in the list, skipping the current request, even though it is not entirely synced. This can lead to data corruption.
	The fix is very simple: change the code to make sure "fulfilled" reflects whether the current request is fulfilled or not, and doesn't carry history of previous requests in the list.
	>Change-Id: Ia3d6988175a51c9e08efdb521a7b7938b01f93c8
	>BUG: 1528558
	>Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
	(cherry picked from commit 0bc22bef7f3c24663aadfb3548b348aa121e3047)
	Change-Id: Ia3d6988175a51c9e08efdb521a7b7938b01f93c8
	BUG: 1529095
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
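The second consequence (skipping a partially written request) can be reproduced with a stripped-down model of the loop. This is a sketch, not the write-behind code: requests are reduced to their sizes, the function name is hypothetical, and the returned index stands for the request from which the remaining work would resume.

```python
def resume_index(request_sizes, written, reset_flag=True):
    """Return the index of the request to resume from after a short write."""
    fulfilled = False
    for i, size in enumerate(request_sizes):
        if reset_flag:
            fulfilled = False  # the fix: no history from earlier requests
        accounted = min(size, written)
        written -= accounted
        if accounted == size:
            fulfilled = True
        if written == 0:
            # Move past the current request only if it was fully written.
            return i + 1 if fulfilled else i
    return len(request_sizes)


sizes = [4, 4, 4]
with_reset = resume_index(sizes, written=6, reset_flag=True)
with_stale_flag = resume_index(sizes, written=6, reset_flag=False)
```

With 6 of 12 bytes written, the second request is only half done. Resetting the flag resumes from that request; the stale flag left over from the first request makes the loop skip it.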
*	mount/fuse: use fstat in getattr implementation if any opened fd is available	Raghavendra G	2018-01-02	1	-1/+1
	The restriction of using fds opened by the same Pid means fds cannot be shared across threads of a multithreaded application. Note that fops from the kernel have a different Pid for different threads. Imagine the following sequence of operations:
	* Turn off performance.open-behind.
	* Thread t1 opens an fd - fd1 - on file "file". Let's assume the nodeid of "file" is "nodeid-file".
	* Thread t2 does RENAME ("newfile", "file"). Let's assume the nodeid of "newfile" is "nodeid-newfile".
	* t2 proceeds to do fstat (fd1).
	The above set of operations can sometimes result in ESTALE/ENOENT errors. RENAME overwrites "file" with "newfile", changing its nodeid from "nodeid-file" to "nodeid-newfile", and post RENAME, "nodeid-file" is removed from the backend. If fstat carries nodeid-file as its argument, which can happen if lookup has not refreshed the nodeid of "file", then since t2 doesn't have an fd opened, fuse_getattr_resume uses STAT, which will fail as "nodeid-file" no longer exists.
	Since the above set of operations and the sharing of fds across multiple threads are valid, this is a bug.
	The fix is to use any fd opened on the inode. In this specific example fuse_getattr_resume will find fd1 and wind the call down as fstat (fd1), which won't fail.
	Cross-checked with "Miklos Szeredi" <mszeredi.at.redhat.dot.com> for any security issues with this solution and he approves the solution.
	Thanks to "Miklos Szeredi" <mszeredi.at.redhat.dot.com> for all the pointers and discussions.
	>Change-Id: I88dd29b3607cd2594eee9d72a1637b5346c8d49c
	>BUG: 1510401
	>Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
	(cherry picked from commit 8b57378e5596f287a7b9d106dd6fb56a624b42ee)
	Change-Id: I88dd29b3607cd2594eee9d72a1637b5346c8d49c
	BUG: 1529085
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
*	feature/bitrot: remove internal xattrs from lookup cbk	Ravishankar N	2017-12-19	2	-7/+21
	Problem: afr requests all xattrs in lookup via the list-xattr key. If bitrot is enabled and later disabled, or if the bitrot xattrs were present due to an older version of bitrot which used to create the xattrs without enabling the feature, the xattrs (trusted.bit-rot.version in particular) were not getting filtered and ended up reaching the client stack. AFR, on noticing different values of the xattr across bricks of the replica, started triggering spurious metadata heals.
	Fix: Filter all internal xattrs in the bitrot xlator before unwinding lookup and (f)getxattr.
	Thanks to Kotresh for the help in RCA'ing.
	Change-Id: I5bc70e4b901359c3daefc67b8e4fa6ddb47f046c
	BUG: 1527276
	Signed-off-by: Ravishankar N <ravishankar@redhat.com>
	(cherry picked from commit d341f20230b9921391aff22337eaf9be82f44d88)
*	glusterd: Free up svc->conn on volume delete	Atin Mukherjee	2017-12-12	1	-0/+4
	Daemons like snapd, tierd and gfproxyd are maintained on a per-volume basis, and on a volume delete we should destroy the rpc connection established for them.
	>mainline patch : https://review.gluster.org/#/c/18957/
	Change-Id: Id1440e39da07b990fdb9b207df18da04b1ca8014
	BUG: 1523048
	Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
	(cherry picked from commit 36ce4c614a3391043a3417aa061d0aa16e60b2d3)
*	Disable gfid2path by default on NetBSD	Emmanuel Dreyfus	2017-12-08	1	-0/+11
	NetBSD storage of extended attributes for UFS1 scales badly as the list of extended attribute names grows. gfid2path can add as many extended attribute names as we have files, hence we keep it disabled for the sake of performance.
	> Change-Id: Id77b5f5ceb4d5eba1b3362b4b9fc693450ffbc2b
	> Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
	> BUG: 1129939
	Change-Id: Id77b5f5ceb4d5eba1b3362b4b9fc693450ffbc2b
	Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
	BUG: 1513258
*	cluster/dht: don't overfill the buffer in readdir(p)	Raghavendra G	2017-12-08	1	-3/+18
	Superfluous dentries that cannot fit in the buffer size provided by the kernel are thrown away by fuse-bridge. This means:
	* The next readdir(p) seen by readdir-ahead would have an offset of a dentry returned in a previous readdir(p) response. When readdir-ahead detects a non-monotonic offset it turns itself off, which can result in poor readdir performance.
	* readdirp can be cpu-intensive on the brick, and there is no point in reading all those dentries just to have them thrown away by fuse-bridge.
	So, the best strategy would be to fill the buffer optimally - neither overfill nor underfill.
	> Change-Id: Idb3d85dd4c08fdc4526b2df801d49e69e439ba84
	> BUG: 1492625
	> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
	(cherry picked from commit e785faead91f74dce7c832848f2e8f3f43bd0be5)
	Change-Id: Idb3d85dd4c08fdc4526b2df801d49e69e439ba84
	BUG: 1478411
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
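"Neither overfill nor underfill" amounts to stopping at the first dentry that no longer fits. The sketch below assumes a hypothetical per-entry cost of a fixed 24-byte header plus the name length; the real dentry size calculation is not given in the message.

```python
HEADER_SIZE = 24  # assumed fixed per-dentry overhead, for illustration


def dentry_size(name):
    """Hypothetical on-wire size of one dentry."""
    return HEADER_SIZE + len(name)


def fill_buffer(names, buf_size):
    """Take dentries in order until the next one would not fit."""
    taken, used = [], 0
    for name in names:
        need = dentry_size(name)
        if used + need > buf_size:
            break  # this dentry is served by the next readdir(p) call
        taken.append(name)
        used += need
    return taken, used


taken, used = fill_buffer(["a", "bb", "ccc"], buf_size=52)
```

Because the response ends exactly at the last dentry that fits, the next request can start at the offset following it, and offsets stay monotonic for layers such as readdir-ahead.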
*	cluster/dht: populate inode in dentry for single subvolume dht	Raghavendra G	2017-12-06	2	-1/+69
	... in readdirp response if dentry points to a directory inode. This is a special case where the entire layout is stored in one single subvolume and hence no need for lookup to construct the layout
	>Change-Id: I44fd951e2393ec9dac2af120469be47081a32185
	>BUG: 1492625
	>Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
	(cherry picked from commit 59d1cc720f52357f7a6f20bb630febc6a622c99c)
	Change-Id: I44fd951e2393ec9dac2af120469be47081a32185
	BUG: 1478411
	Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
*	glusterd: display gluster volume status, when quorum type is server	Sanju Rakonde	2017-11-30	1	-0/+6
	Problem: When server-quorum-type is server, after restarting glusterd on the node which is up, gluster volume status gives incorrect information.
	Fix: Check whether server is blank before adding other keys into the dictionary.
	Change-Id: I926ebdffab330ccef844f23f6d6556e137914047
	BUG: 1511782
	Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
	(cherry picked from commit 046c7e3199fca715592762e271e6061ac99b0c4b)
*	cluster/afr: Honor default timeout of 5min for analyzing split-brain files	karthik-us	2017-11-30	1	-1/+5
	Problem: After setting the split-brain-choice option to analyze a file in split-brain, using the command "setfattr -n replica.split-brain-choice -v "choiceX" <path-to-file>", the file should be accessible from the mount for the default timeout of 5 minutes. But the timeout was not honored, and the file remained accessible even after the timeout.
	Fix: Call inode_invalidate() in afr_set_split_brain_choice_cbk() so that it triggers cache invalidation after resetting the timer and the split-brain choice. Subsequent calls to access the file will then fail with EIO.
	Change-Id: I698cb833676b22ff3e4c6daf8b883a0958f51a64
	BUG: 1514380
	Signed-off-by: karthik-us <ksubrahm@redhat.com>
	(cherry picked from commit 933ec57ccda2c1ba5ce6f207313c3b6802e67ca3)
*	features/locks: Fix memory leaks	Xavier Hernandez	2017-11-30	5	-5/+11
	Backport of:
	> BUG: 1515161
	Change-Id: Ic1d2e17a7d14389b6734d1b88bd28c0a2907bbd6
	BUG: 1517689
	Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
*	cluster/dht: make rebalance use truncate in case	Susant Palai	2017-11-23	3	-71/+99
	.. the brick file system does not support fallocate.
	> Change-Id: Id76cda2d8bb3b223b779e5e7a34f17c8bfa6283c
	> BUG: 1488103
	> Signed-off-by: Susant Palai <spalai@redhat.com>
	Change-Id: Id76cda2d8bb3b223b779e5e7a34f17c8bfa6283c
	BUG: 1516691
	Signed-off-by: Susant Palai <spalai@redhat.com>
*	cluster/dht: Don't set ACLs on linkto file	N Balachandran	2017-11-20	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The trusted.SGI_ACL_FILE appears to set posix ACLs on the linkto file that is a target of file migration. This can mess up file permissions and cause linkto identification to fail. Now we remove all ACL xattrs from the results of the listxattr call on the source before setting them on the target. > BUG: 1514329 > Signed-off-by: N Balachandran <nbalacha@redhat.com> Change-Id: I56802dbaed783a16e3fb90f59f4ce849f8a4a9b4 BUG: 1515042 Signed-off-by: N Balachandran <nbalacha@redhat.com>
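The filtering step described above (dropping ACL xattrs from the listxattr results of the source before they are applied to the target) can be sketched as follows. This is a minimal illustration, not the dht code: the helper names strip_acl_names() and is_acl_name() are hypothetical, the list of ACL xattr names is an assumption, and the buffer is assumed to be a well-formed sequence of NUL-terminated names as returned by listxattr(2).

```c
#include <stddef.h>
#include <string.h>

/* Assumed set of ACL-related xattr names; the real patch may use a
 * different list. */
static const char *const acl_names[] = {
    "system.posix_acl_access",
    "system.posix_acl_default",
    "trusted.SGI_ACL_FILE",
    "trusted.SGI_ACL_DEFAULT",
};

/* Hypothetical helper: returns 1 if name is one of the ACL xattrs. */
static int is_acl_name(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(acl_names) / sizeof(acl_names[0]); i++) {
        if (strcmp(name, acl_names[i]) == 0)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: buf holds len bytes of NUL-terminated names laid
 * out back to back. ACL names are removed in place and the new length
 * of the buffer is returned. */
size_t strip_acl_names(char *buf, size_t len)
{
    size_t in = 0;
    size_t out = 0;

    while (in < len) {
        size_t n = strlen(buf + in) + 1; /* name plus its terminator */

        if (!is_acl_name(buf + in)) {
            memmove(buf + out, buf + in, n);
            out += n;
        }
        in += n;
    }
    return out;
}
```

Compacting the name list in place keeps the sketch allocation-free; the caller would then iterate over the shortened buffer when copying xattrs to the target.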
*	glusterd: restart the brick if quorum status is NOT_APPLICABLE_QUORUM	Atin Mukherjee	2017-11-10	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a volume does not have server quorum enabled and all the glusterd instances on the other peers in a trusted storage pool are down, restarting glusterd does not trigger the brick start, so the brick does not come up. > mainline patch : https://review.gluster.org/#/c/18669/ Change-Id: If1458e03b50a113f1653db553bb2350d11577539 BUG: 1511301 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 635c1c3691a102aa658cf1219fa41ca30dd134ba)
*	md-cache: avoid checking the xattr value buffer with string functions.	Günther Deschner	2017-11-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	xattrs may very well contain binary, non-text data with leading 0 values. Using strcmp to check for empty values is not appropriate: in the best case, it might prevent a binary xattr value starting with 0 from being cached (and hence also from being reported back with xattr). In the worst case, we might read beyond the end of a data blob that does not contain any zero byte. We fix this by checking the length of the data blob and checking the first byte against 0 if the length is one. > Signed-off-by: Guenther Deschner <gd@samba.org> > Pair-Programmed-With: Michael Adam <obnox@samba.org> > Change-Id: If723c465a630b8a37b6be58782a2724df7ac6b11 > BUG: 1476324 > Reviewed-on: https://review.gluster.org/17910 > Reviewed-by: Michael Adam <obnox@samba.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Poornima G <pgurusid@redhat.com> > Tested-by: Poornima G <pgurusid@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > (cherry picked from commit ab4ffdac9dec1867f2d9b33242179cf2b347319d) Change-Id: If723c465a630b8a37b6be58782a2724df7ac6b11 BUG: 1499892 Signed-off-by: Günther Deschner <gd@samba.org>
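The length-based check described in the message above can be sketched roughly like this. It is an illustration of the idea only, not the md-cache source: the function name xattr_value_is_empty() and its signature are hypothetical, and the real code operates on GlusterFS data structures rather than a raw pointer and length.

```c
#include <stddef.h>

/* Hypothetical helper: treats a value as "empty" only when it is exactly
 * one byte long and that byte is 0. Unlike strcmp(value, ""), this never
 * reads past len bytes, and it does not misclassify binary values that
 * merely start with a 0 byte. */
int xattr_value_is_empty(const unsigned char *value, size_t len)
{
    return value != NULL && len == 1 && value[0] == 0;
}
```

With this shape, a four-byte binary value whose first byte is 0 is still considered non-empty and remains eligible for caching, while a one-byte "" placeholder is skipped.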
*	glusterd : introduce timer in mgmt_v3_lock	Gaurav Yadav	2017-11-06	4	-17/+241
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In a multinode environment, if two op-sm transactions are initiated on one of the receiver nodes at the same time, glusterd may end up with a stale lock. Solution: During mgmt_v3_lock, a registration is made with gf_timer_call_after, which releases the lock after a certain period of time. >mainline patch : https://review.gluster.org/#/c/18437/ Change-Id: I16cc2e5186a2e8a5e35eca2468b031811e093843 BUG: 1503239 Signed-off-by: Gaurav Yadav <gyadav@redhat.com>
*	protocol/server: fix the comparison logic in case of subdir mount	Amar Tumballi	2017-11-06	1	-30/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Without the fix, the stat entry on a file would return inode==1 for many files in case of subdir mount. This happened because of confusion over the return value of 'gf_uuid_compare()': it behaves like strcmp rather than returning a gf_boolean value, which resulted in the bug. Change-Id: I31b8cbd95eaa3af5ff916a969458e8e4020c86bb BUG: 1505527 Signed-off-by: Amar Tumballi <amarts@redhat.com> (cherry picked from commit 2ade36cd98ea0f5bd2a8f619a19c20438318afaf)
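The strcmp-style pitfall mentioned above can be illustrated with a small stand-in. This sketch does not use gf_uuid_compare() itself: uuid_cmp16(), uuid_equal() and uuid_equal_buggy() are hypothetical names, and memcmp over 16 bytes is only assumed to mimic the "0 means equal" convention.

```c
#include <string.h>

/* Stand-in comparator with strcmp-like semantics: returns 0 when the two
 * 16-byte ids are equal, non-zero otherwise. */
int uuid_cmp16(const unsigned char a[16], const unsigned char b[16])
{
    return memcmp(a, b, 16);
}

/* Buggy pattern: treats the comparator result as a boolean "is equal",
 * so it reports equality exactly when the ids differ. */
int uuid_equal_buggy(const unsigned char a[16], const unsigned char b[16])
{
    return uuid_cmp16(a, b) ? 1 : 0;
}

/* Correct pattern: equality means the comparator returned 0. */
int uuid_equal(const unsigned char a[16], const unsigned char b[16])
{
    return uuid_cmp16(a, b) == 0;
}
```

Wrapping the comparison in an explicitly named equality helper is one way to keep call sites from mixing up the two conventions.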
*	protocol/client: handle the subdir handshake properly for add-brick	Amar Tumballi	2017-11-06	1	-1/+9
\| \| \| \| \| \| \| \| \| \|	The handshake for a subdir mount should be handled differently the first time than on subsequent graph changes. Change-Id: I2a7ba836433bb0a0f4a861809e2bb0d7fbc4da54 BUG: 1505323 Signed-off-by: Amar Tumballi <amarts@redhat.com> (cherry picked from commit 9aa574a51b84717c1f3949ed2e28a49e49840a93)