path: root/xlators/cluster/afr
* cluster/afr: Take a copy of xattr-req
  Pranith Kumar K | 2019-10-30 | 1 file | -2/+7

  AFR adds its own xattrs to the request, so it should take a copy of
  the dictionary to prevent the parent xlator from re-using the
  modified xattr-req on another subvolume.

  fixes: bz#1765155
  Change-Id: I268e2dbd1b12323135d369e90a22a8bdde2cf7c2
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>

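  A minimal sketch of the copy-before-modify pattern described above,
  using libglusterfs dict helpers; the wrapper function and the xattr
  key are illustrative, not the actual AFR code:

    #include <glusterfs/dict.h>

    /* Sketch: copy the caller's xattr_req before adding AFR-specific
     * keys, so the parent xlator can safely re-use its own dict when
     * winding to another subvolume. */
    static dict_t *
    xattr_req_copy_for_afr(dict_t *xattr_req)
    {
        /* dict_copy_with_ref() creates a new dict when passed NULL
         * as the destination and copies every key/value into it. */
        dict_t *copy = dict_copy_with_ref(xattr_req, NULL);
        if (!copy)
            return NULL;

        /* AFR's own key goes into the private copy only (the key
         * name here is illustrative). */
        if (dict_set_int32(copy, "trusted.afr.dirty", 0) < 0) {
            dict_unref(copy);
            return NULL;
        }
        return copy;
    }
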
* cluster/afr: Add afr_seek to fops table
  Pranith Kumar K | 2019-10-14 | 2 files | -0/+4

  fixes: bz#1760189
  Change-Id: Iffbf8d6f4c50b8e2de8364658697bdbe96549f5d
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>

* afr: align structs
  Yaniv Kaul | 2019-10-14 | 2 files | -138/+138

  Squash more than 50 warnings about padding of structs in AFR
  structures. The warnings were found by manually adding '-Wpadded' to
  the GCC command line.

  Change-Id: I961fbdeb33715cedf3dd10db8e4f8ef40cd3e867
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

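  To show the kind of reordering involved, a standalone, hedged example
  (the member names are invented, not AFR's real structs) of how member
  order affects padding on a typical LP64 platform:

    #include <stdint.h>
    #include <stdio.h>

    /* With -Wpadded, gcc warns about the 7 bytes inserted after
     * 'dirty' and the 4 bytes of tail padding after 'count'. */
    struct padded {
        char dirty;      /* 1 byte + 7 bytes padding */
        uint64_t txn_id; /* 8 bytes */
        int32_t count;   /* 4 bytes + 4 bytes tail padding */
    };                   /* sizeof == 24 */

    /* Ordering members from largest to smallest removes the holes. */
    struct aligned {
        uint64_t txn_id; /* 8 bytes */
        int32_t count;   /* 4 bytes */
        char dirty;      /* 1 byte + 3 bytes tail padding */
    };                   /* sizeof == 16 */

    int main(void)
    {
        printf("padded=%zu aligned=%zu\n", sizeof(struct padded),
               sizeof(struct aligned));
        return 0;
    }
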
* cluster/afr: Heal entries when there is a source & no healed_sinks
  karthik-us | 2019-10-09 | 1 file | -0/+15

  Problem: In a situation where B1 blames B2, B2 blames B1, and B3 does
  not blame anything for entry heal, heal will not complete even though
  we have a clear source and sinks. This happens because
  afr_selfheal_find_direction() considers only the bricks blamed by
  non-accused bricks as sinks. Later, when
  __afr_selfheal_entry_finalize_source() tries to mark all the
  non-sources as sinks, it fails to do so because there are no
  healed_sinks marked, no witness present, and there is a source.

  Fix: If there is a source and no healed_sinks, reset all the locked
  sources to 0 and healed_sinks to 1 to do a conservative merge.

  Change-Id: If40d8bc95d52a52b2730f55bdcf135109b421548
  Fixes: bz#1749322
  Signed-off-by: karthik-us <ksubrahm@redhat.com>

* afr: support split-brain CLI for replica 3
  Ravishankar N | 2019-10-09 | 1 file | -1/+2

  Ever since we added quorum checks for lookups in AFR via commit
  bd44d59741bb8c0f5d7a62c5b1094179dd0ce8a4, the split-brain resolution
  commands would not work for replica 3 because there would be no
  readables for the lookup fop. The argument was that split-brains do
  not occur in replica 3, but we do see (data/metadata) split-brain
  cases once in a while, which indicates that there are a few
  bugs/corner cases yet to be discovered and fixed.

  Fortunately, commit 8016d51a3bbd410b0b927ed66be50a09574b7982 added
  GF_CLIENT_PID_GLFS_HEALD as the pid for all fops made by glfsheal. If
  we leverage this and allow lookups in AFR when the pid is
  GF_CLIENT_PID_GLFS_HEALD, split-brain resolution commands will work
  for replica 3 volumes too.

  Likewise, the check is added in shard_lookup as well, to permit
  resolving split-brains by specifying "/.shard/shard-file.xx" as the
  file name (which previously used to fail with EPERM).

  Change-Id: I3c543dea79caf7cfbc1633e9089cb1cdd2538ba9
  Fixes: bz#1756938
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>

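  A hedged sketch of the kind of check this enables; the helper name
  and the quorum condition are illustrative, and only
  GF_CLIENT_PID_GLFS_HEALD is taken from the entry above:

    #include <glusterfs/glusterfs.h>
    #include <glusterfs/stack.h>

    /* Sketch: let lookups from glfsheal through even when there are
     * no readable subvolumes, so split-brain resolution can run. */
    static int
    lookup_allowed(call_frame_t *frame, int readable_count)
    {
        if (frame->root->pid == GF_CLIENT_PID_GLFS_HEALD)
            return 1; /* heal CLI: permit the lookup */

        return readable_count > 0;
    }
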
* afr: replace afr_frame_return() when possible with direct call
  Yaniv Kaul | 2019-10-07 | 5 files | -15/+9

  If you are already under lock, just decrement the call count directly
  instead of releasing the lock, re-taking it, and decrementing.

  Implements https://github.com/gluster/glusterfs/issues/728

  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
  Change-Id: I3fa20b4651fbdb826655c5a03baeed46e99b5487

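  A standalone, hedged illustration of the pattern (not AFR's actual
  helper): when the caller already holds the lock, decrement in place
  rather than calling a helper that unlocks and re-locks:

    #include <pthread.h>

    typedef struct {
        pthread_mutex_t lock;
        int call_count;
    } demo_local_t;

    /* Helper that takes the lock itself -- fine for callers that
     * are not already under it. */
    static int
    demo_frame_return(demo_local_t *local)
    {
        int remaining;
        pthread_mutex_lock(&local->lock);
        remaining = --local->call_count;
        pthread_mutex_unlock(&local->lock);
        return remaining;
    }

    /* Caller already under local->lock: decrement directly instead
     * of unlocking, calling demo_frame_return() and re-locking. */
    static int
    demo_return_under_lock(demo_local_t *local)
    {
        return --local->call_count; /* lock is already held here */
    }
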
* afr-common.c, afr-self-heal.h: calloc/alloca0 -> malloc/alloca
  Yaniv Kaul | 2019-09-20 | 2 files | -5/+4

  In 3 cases, there was a memory allocation and zeroing, followed
  directly by populating the buffer with content. Replaced with a
  memory allocation that does not zero the memory.

  Change-Id: I4fbb5c924fb3a144e415d2368126b784dde760ea
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

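  A standalone, hedged illustration of the transformation using plain
  libc calls (the actual patch uses GlusterFS's GF_CALLOC/GF_MALLOC and
  alloca0/alloca wrappers):

    #include <stdlib.h>
    #include <string.h>

    /* Before: zeroed allocation immediately overwritten in full. */
    char *dup_before(const char *src, size_t len)
    {
        char *buf = calloc(1, len); /* zeroing is wasted work... */
        if (buf)
            memcpy(buf, src, len);  /* ...every byte is overwritten */
        return buf;
    }

    /* After: plain allocation, since the content fills the buffer. */
    char *dup_after(const char *src, size_t len)
    {
        char *buf = malloc(len);
        if (buf)
            memcpy(buf, src, len);
        return buf;
    }
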
* ctime/rebalance: Heal ctime xattr on directory during rebalance
  Kotresh HR | 2019-09-16 | 1 file | -1/+2

  After add-brick and rebalance, the ctime xattr is not present on
  rebalanced directories on the new brick. This patch fixes the same.
  Note that ctime still does not support consistent time across the
  distribute sub-volume.

  This patch also fixes the in-memory inconsistency of time attributes
  when metadata is self-healed.

  Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
  fixes: bz#1734026
  Signed-off-by: Kotresh HR <khiremat@redhat.com>

* afr/lookup: Pass xattr_req while doing a selfheal in lookup
  Mohammed Rafi KC | 2019-09-05 | 3 files | -5/+16

  We were not passing xattr_req when doing a name self-heal as well as
  a metadata heal. Because of this, some xdata was missing, which
  caused I/O errors.

  Change-Id: Ibfb1205a7eb0195632dc3820116ffbbb8043545f
  Fixes: bz#1728770
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>

* afr: wake up index healer threads
  Ravishankar N | 2019-08-30 | 5 files | -11/+25

  ...whenever shd is re-enabled after disabling, or there is a change
  in `cluster.heal-timeout`, without needing to restart shd or waiting
  for the current `cluster.heal-timeout` seconds to expire. See BZ
  1743988 for more details.

  Change-Id: Ia5ebd7c8e9f5b54cba3199c141fdd1af2f9b9bfe
  fixes: bz#1744548
  Reported-by: Glen Kiessling <glenk1973@hotmail.com>
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>

* cluster/afr - Unused variables
  Barak Sason | 2019-08-24 | 2 files | -6/+9

  - Minor change to the if-else structure to avoid code duplication.
  - Added logging in case a method call fails.

  CID: 1394654
  Updates: bz#789278
  Change-Id: Ibef4450dc89ddd3bf951303d5b87f503924fd250
  Signed-off-by: Barak Sason <bsasonro@redhat.com>

* afr: restore timestamp of parent dir during entry-heal
  Ravishankar N | 2019-08-14 | 1 file | -0/+2

  Fixes: bz#1734370
  Change-Id: I29e338bac62104233a6f80212df8d0fb016affda
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>

* Multiple files: get trivial stuff done before lock
  Yaniv Kaul | 2019-08-01 | 1 file | -6/+4

  Initializing a dictionary, for example, seems perfectly fine to do
  before taking a lock.

  Change-Id: Ib29516c4efa8f0e2b526d512beab488fcd16d2e7
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

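  A hedged, simplified sketch of the idea (the shared pointer and the
  use of a pthread mutex are illustrative; GlusterFS code uses its own
  LOCK()/UNLOCK() primitives):

    #include <pthread.h>
    #include <glusterfs/dict.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static dict_t *shared_dict; /* hypothetical shared state */

    /* Allocation can be slow and can fail, so do it outside the
     * critical section; only publishing the pointer needs the lock. */
    int publish_dict(void)
    {
        dict_t *d = dict_new(); /* done before taking the lock */
        if (!d)
            return -1;

        pthread_mutex_lock(&lock);
        shared_dict = d;
        pthread_mutex_unlock(&lock);
        return 0;
    }
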
* (multiple files) use dict_allocate_and_serialize() where applicable.
  Yaniv Kaul | 2019-07-22 | 1 file | -28/+6

  This function does the length calculation, allocation, and
  serialization for you.

  Change-Id: I142a259952a2fe83dd719442afaefe4a43a8e55e
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

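  A hedged sketch of the before/after shape (the wrapper is
  illustrative, and the exact type of the length argument may differ
  between GlusterFS versions):

    #include <glusterfs/dict.h>

    /* Sketch: one call instead of the manual three-step dance of
     * dict_serialized_length() + allocation + dict_serialize(),
     * each with its own error path. */
    int serialize_xdata(dict_t *xdata, char **buf, unsigned int *len)
    {
        if (dict_allocate_and_serialize(xdata, buf, len) < 0)
            return -1;
        /* the caller owns *buf and must free it */
        return 0;
    }
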
* cluster/ta: Notify the clients only if there are pending heals
  karthik-us | 2019-07-12 | 4 files | -22/+69

  Problem: In case of thin-arbiter, before the index healer starts
  crawling the indices at every heal-timeout interval, it sends an
  upcall notification to all the clients to release any
  AFR_TA_DOM_NOTIFY locks that they hold, even if there is nothing to
  be healed. SHD waits for the upcall to return before proceeding with
  the heal, even though there is nothing to be healed. This also
  invalidates the cached information about the brick states on the
  clients, which leads to extra calls on TA from the clients for the
  next reads & writes, if needed. This impacts the I/O performance.

  Fix:
  - Before sending the upcall to the clients, check for any pending
    heals on TA without taking any locks.
  - If there is nothing marked bad on TA, continue with the index crawl
    to heal any dirty markings present on the files due to any post-op
    failure.
  - If there is a brick marked as bad on TA, take the AFR_TA_DOM_NOTIFY
    lock on TA from SHD, get the state on TA, and continue with the
    current healing process.

  Change-Id: Ieb477bc6cb18bbdfd4e7a0453c5ed79b574ec9d6
  fixes: bz#1724184
  Signed-off-by: karthik-us <ksubrahm@redhat.com>

* cluster/afr: Fix incorrect reporting of gfid & type mismatch
  karthik-us | 2019-07-12 | 2 files | -2/+23

  Problems:
  1. When checking for type and gfid mismatch, if the type or gfid is
     unknown because of a missing gfid handle and gfid xattr, it is
     reported as a type or gfid mismatch and the heal does not
     complete.
  2. If the source selected during entry heal has a null gfid, the same
     is sent to afr_lookup_and_heal_gfid(). In this function, when we
     try to assign the gfid on the bricks where it does not exist, we
     consider the same (null) gfid and try to assign that on those
     bricks. This fails in posix_gfid_set(), since the gfid sent is
     null.

  Fix: If the gfid sent to afr_lookup_and_heal_gfid() is null, choose a
  valid gfid before proceeding to assign the gfid on the bricks where
  it is missing. In afr_selfheal_detect_gfid_and_type_mismatch(), do
  not report a type/gfid mismatch if the type/gfid is unknown or not
  set.

  Change-Id: Ia06552e4dc4a9f89cb7f5302833604bd21bbf7da
  fixes: bz#1722507
  Signed-off-by: karthik-us <ksubrahm@redhat.com>

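  A hedged sketch of the "unknown is not a mismatch" rule (the helper
  name is illustrative, not the real AFR function):

    #include <glusterfs/compat-uuid.h>

    /* Sketch: skip the mismatch verdict when either side is simply
     * unknown, so heal can proceed instead of getting stuck. */
    static int
    gfids_conflict(uuid_t a, uuid_t b)
    {
        if (gf_uuid_is_null(a) || gf_uuid_is_null(b))
            return 0; /* unknown, not a mismatch */
        return gf_uuid_compare(a, b) != 0;
    }
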
* afr/read: Implement latency based read child selection
  Mohammed Rafi KC | 2019-06-20 | 3 files | -27/+98

  Network latency is an important factor in selecting a read subvolume,
  so this patch adds two new policies:
  1) Measure the latency of a child during a GF_DUMP rpc call, then use
     this latency to pick the read subvol with the least latency.
  2) A hybrid mode that calculates an effective latency by multiplying
     the outstanding pending read requests by the latency, and chooses
     the subvolume with the least value.

  Change-Id: Ia49c8a08ab61f7dcdad8b8950aa4d338e7accf97
  fixes: #520
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>

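  A standalone, hedged sketch of the two policies as described (the
  struct and function names are invented, and the '+ 1' that keeps an
  idle child's hybrid score non-zero is an assumption, not from the
  patch):

    #include <stdint.h>

    struct child_stats {
        uint64_t latency_usec;  /* measured via the GF_DUMP ping */
        uint64_t pending_reads; /* outstanding read requests */
    };

    /* Policy 1: lowest measured latency wins.
     * Policy 2 (hybrid): weigh latency by the read queue depth. */
    int pick_read_child(const struct child_stats *c, int n, int hybrid)
    {
        int best = -1;
        uint64_t best_score = UINT64_MAX;

        for (int i = 0; i < n; i++) {
            uint64_t score =
                hybrid ? (c[i].pending_reads + 1) * c[i].latency_usec
                       : c[i].latency_usec;
            if (score < best_score) {
                best_score = score;
                best = i;
            }
        }
        return best;
    }
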
* afr/fini: Free local_pool data during an afr fini
  Mohammed Rafi KC | 2019-06-17 | 1 file | -0/+6

  We should free the mem_pool local_pool during an afr_fini; otherwise
  this leads to a memory leak for shd.

  Change-Id: I805a34a88077bf7b886c28b403798bf9eeeb1c0b
  Updates: bz#1716695
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>

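  A hedged sketch of the fini-time cleanup, assuming the pool lives in
  the xlator's local_pool member (the real afr_fini() releases other
  private state as well):

    #include <glusterfs/xlator.h>
    #include <glusterfs/mem-pool.h>

    /* Sketch: destroy the frame->local pool when the xlator is torn
     * down, so a multiplexed shd does not leak it. */
    int sketch_fini(xlator_t *this)
    {
        if (this->local_pool) {
            mem_pool_destroy(this->local_pool);
            this->local_pool = NULL;
        }
        return 0;
    }
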
* multiple files: another attempt to remove includes
  Yaniv Kaul | 2019-06-14 | 6 files | -32/+0

  There are many include statements that are not needed. A previous,
  more ambitious attempt failed because of the *BSD platform (see
  https://review.gluster.org/#/c/glusterfs/+/21929/ ). Now trying a
  more conservative reduction.

  It does not solve all the circular deps that we have, but it does
  reduce some of them. There is just too much to handle reasonably
  (dht-common.h includes dht-lock.h which includes dht-common.h ...),
  but it does reduce the overall number of lines of includes we need to
  look at in the future to understand and fix the mess later on.

  Change-Id: I550cd001bdefb8be0fe67632f783c0ef6bee3f9f
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>

* cluster/afr: Don't treat all bricks having metadata pending as split-brain
  karthik-us | 2019-06-10 | 2 files | -3/+3

  Problem: We currently don't have a roll-back/undoing of post-ops if
  quorum is not met. Though the FOP is still unwound with failure, the
  xattrs remain on the disk. Due to these partial post-ops and partial
  heals (healing only when 2 bricks are up), we can end up in metadata
  split-brain purely from the AFR xattrs point of view, i.e. each brick
  is blamed by at least one of the others for metadata. These scenarios
  are hit when there is frequent connect/disconnect of the client/shd
  to the bricks.

  Fix: Pick a source based on the xattr values. If 2 bricks blame one,
  the blamed one must be treated as a sink. If there is no majority,
  all are sources. Once we pick a source, self-heal will then do the
  heal instead of erroring out due to split-brain. This patch also
  requires all the bricks to be up to perform metadata heal, to avoid
  any metadata loss.

  Removed the test case
  tests/bugs/replicate/bug-1468279-source-not-blaming-sinks.t as it was
  doing metadata heal even when only 2 of 3 bricks were up.

  Change-Id: I07a9d62f84ceda329dcab1f02a33aeed258dcb09
  fixes: bz#1717819
  Signed-off-by: karthik-us <ksubrahm@redhat.com>

* afr/frame: Destroy frame after afr_selfheal_entry_granular
  Mohammed Rafi KC | 2019-05-21 | 1 file | -3/+8

  In the function afr_selfheal_entry_granular, after completing the
  heal we were not destroying the frame. This can lead to a crash when
  a statedump operation tries to access the frame's xlator object: if
  that xlator object has been freed as part of a graph destroy, the
  access is to invalid memory.

  Change-Id: I0a5e78e704ef257c3ac0087eab2c310e78fbe36d
  fixes: bz#1708926
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>

* ec/shd: Cleanup self heal daemon resources during ec fini
  Mohammed Rafi KC | 2019-05-13 | 1 file | -0/+5

  We were not properly cleaning up self-heal daemon resources during ec
  fini. With shd multiplexing, it is absolutely necessary to clean up
  all the resources during ec fini.

  Change-Id: Iae4f1bce7d8c2e1da51ac568700a51088f3cc7f2
  fixes: bz#1703948
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>

* afr: thin-arbiter lock release fixes
  Ravishankar N | 2019-05-10 | 3 files | -47/+93

  - Pass fop state instead of afr local to
    afr_ta_dom_lock_check_and_release().
  - Avoid afr_lock_release_synctask() being called simultaneously from
    the notify code path and the transaction (post-op) code path due to
    races.
  - Check if the post-op on TA is valid based on event_gen checks.
  - Invalidate in-memory information when we get a TA child down.

  Note: This patch addresses some pending review comments of commit
  053b1309dc8fbc05fcde5223e734da9f694cf5cc
  (https://review.gluster.org/#/c/glusterfs/+/20095/).

  fixes: bz#1698449
  Change-Id: I2ccd7e1b53362f9f3fed8680aecb23b5011eb18c
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>

* afr: log before attempting data self-heal.
  Ravishankar N | 2019-05-08 | 1 file | -0/+3

  I was working on a blog about troubleshooting AFR issues and wanted
  to copy the messages logged by self-heal for my blog. I then realized
  that AFR-v2 does not log *before* attempting data heal, while it does
  log for metadata and entry heals:

  I [MSGID: 108026] [afr-self-heal-entry.c:883:afr_selfheal_entry_do]
    0-testvol-replicate-0: performing entry selfheal on
    d120c0cf-6e87-454b-965b-0d83a4c752bb
  I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal]
    0-testvol-replicate-0: Completed entry selfheal on
    d120c0cf-6e87-454b-965b-0d83a4c752bb. sources=[0] 2 sinks=1
  I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal]
    0-testvol-replicate-0: Completed data selfheal on
    a9b5f183-21eb-4fb3-a342-287d3a7dddc5. sources=[0] 2 sinks=1
  I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do]
    0-testvol-replicate-0: performing metadata selfheal on
    a9b5f183-21eb-4fb3-a342-287d3a7dddc5
  I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal]
    0-testvol-replicate-0: Completed metadata selfheal on
    a9b5f183-21eb-4fb3-a342-287d3a7dddc5. sources=[0] 2 sinks=1

  Adding it in this patch. Now there is a 'performing' and a
  corresponding 'Completed' message for every type of heal.

  fixes: bz#1707746
  Change-Id: I0b954cf1e17b48280aefa76640b5119b92133d61
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>

* afr: fix Coverity CID 1398627
  Rinku Kothiya | 2019-05-07 | 1 file | -2/+9

  Fixed the Coverity error "Unchecked return value (CHECKED_RETURN)" by
  checking the return value and logging an error message if
  afr_set_pending_dict fails.

  updates: bz#789278
  Change-Id: Iab7da6b4f3cd0622b95b8e1c412b007a330467e5
  Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>

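  A hedged sketch of the CHECKED_RETURN fix (the stand-in declaration
  of afr_set_pending_dict is simplified; the real function takes more
  arguments):

    #include <glusterfs/xlator.h>
    #include <glusterfs/dict.h>
    #include <glusterfs/logging.h>

    /* Hypothetical stand-in for the real helper. */
    extern int afr_set_pending_dict(dict_t *xattr);

    /* Sketch: check the return value instead of ignoring it, and
     * log on failure. */
    static int
    set_pending_checked(xlator_t *this, dict_t *xattr)
    {
        int ret = afr_set_pending_dict(xattr);
        if (ret < 0)
            gf_msg(this->name, GF_LOG_ERROR, 0, 0,
                   "failed to set pending entry in xattr dict");
        return ret;
    }
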
* cluster/afr: Set lk-owner before inodelk/entrylk/lk
  Pranith Kumar K | 2019-04-22 | 2 files | -19/+23

  Updates: bz#1624701
  Change-Id: I7152c28ad85925abccdcc4cd6de8cb2a2b847a51
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>

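  A hedged sketch of deriving a non-zero lk-owner from the frame before
  winding a lock fop (the wrapper is illustrative; the rationale is
  spelled out in the "Send inodelk/entrylk with non-zero lk-owner"
  entry of 2019-04-02 below):

    #include <glusterfs/stack.h>
    #include <glusterfs/lkowner.h>

    /* Sketch: use the frame root's address as the lock owner so that
     * different transactions in one client stay isolated. */
    static void
    set_transaction_lk_owner(call_frame_t *frame)
    {
        set_lk_owner_from_ptr(&frame->root->lk_owner, frame->root);
    }
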
* cluster/afr: Remove local from owners_list on failure of lock-acquisition
  Pranith Kumar K | 2019-04-15 | 4 files | -18/+14

  When eager-lock acquisition fails because of, say, network failures,
  the local is not removed from owners_list. This leads to an
  accumulation of waiting frames, and the application hangs because the
  waiting frames assume that another transaction is in the process of
  acquiring the lock, since owners_list is not empty.

  Handled this case as well in this patch. Added asserts to make it
  easier to find these problems in the future.

  fixes: bz#1696599
  Change-Id: I3101393265e9827755725b1f2d94a93d8709e923
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>

* Replace memdup() with gf_memdup()
  Vijay Bellur | 2019-04-12 | 1 file | -1/+1

  memdup() and gf_memdup() have the same implementation. Removed one
  API, as the presence of both can be confusing.

  Change-Id: I562130c668457e13e4288e592792872d2e49887e
  updates: bz#1193929
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>

* cluster/afr: Thin-arbiter SHD fixes
  karthik-us | 2019-04-12 | 2 files | -13/+13

  This patch addresses post-merge review comments for commit
  5784a00f997212d34bd52b2303e20c097240d91c.

  Change-Id: I7ed954664a2ae8e1091d23ee3ceb9c66e83bfeac
  fixes: bz#1697930
  Signed-off-by: karthik-us <ksubrahm@redhat.com>

* cluster/afr: Invalidate inode on change of split-brain-choice
  Pranith Kumar K | 2019-04-05 | 1 file | -4/+1

  When the split-brain choice is changed from one brick to another,
  inode-invalidate is not called, so the readv call is served from the
  cache, leading to failures in split-brain-resolution.t. Fixed it by
  calling inode_invalidate() when this happens.

  updates: bz#1193929
  Change-Id: I2624614eec38c0303f3e1dc55dfae3d4b864218b
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>

* cluster/afr: Send inodelk/entrylk with non-zero lk-owner
  Pranith Kumar K | 2019-04-02 | 2 files | -12/+48

  Found a missing assignment of lk-owner for an inodelk/entrylk before
  winding the fops. The locks xlator at the moment allows this
  operation, which means multiple threads in the same client can get
  locks on the inode, because the lk-owner is the same and the
  transport is the same. So isolation with locks can't be achieved. To
  fix it, we need a locks-xlator change that disallows
  null-lk-owner-based inodelk/entrylk/lk. To achieve that, we need to
  first fix all the places that make this mistake.

  updates: bz#1624701
  Change-Id: Ic3431da3f451a1414f1f4fdcfc4cf41e555f69dd
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>

* afr: thin-arbiter read txn fixes
  Ravishankar N | 2019-03-29 | 3 files | -22/+37

  - Fixes afr_ta_read_txn() to handle inode refresh failures in that
    code path.
  - Fixes a double-free issue of a dict.

  Note: This patch addresses post-merge review comments for commit
  69532c141be160b3fea03c1579ae4ac13018dcdf.

  fixes: bz#1686398
  Change-Id: Id5299b45b68569d47df6b73755918237a1592cb4
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>

* afr: add client-pid to all gf_event() calls
  Ravishankar N | 2019-03-27 | 6 files | -15/+28

  The client-pid for glustershd is GF_CLIENT_PID_SELF_HEALD; the
  client-pid for glfsheal is GF_CLIENT_PID_GLFS_HEALD.

  updates: bz#1689250
  Change-Id: Ib3a863af160ff48c822a5e6b0c27c575c9887470
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>

* cluster/afr: Remove un-used variables related to pump
  Pranith Kumar K | 2019-03-26 | 1 file | -3/+0

  updates: bz#1193929
  Change-Id: I01b60d644f517c00a1bcc127bf9a8ed90b6eb7a0
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>

* cluster/afr: TA: Return actual error code in case of failure
  Ashish Pandey | 2019-03-14 | 1 file | -6/+6

  In afr_ta_post_op_do, we were sending EIO for every failure; however,
  the original error code should be sent.

  Change-Id: I9fdc15dac00d758baf8e6f14db244f526481a63a
  updates: bz#1686711
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>

* cluster/afr: Send truncate on arbiter brick from SHD
  karthik-us | 2019-03-11 | 1 file | -15/+13

  Problem: In an arbiter volume configuration, SHD will not send any
  writes onto the arbiter brick even if there is a data-pending marker
  for the arbiter brick. If we have an arbiter setup on the geo-rep
  master and there are data-pending markers for the files on the
  arbiter brick, SHD will not mark any data changelog during healing.
  While syncing the data from master to slave, if the arbiter brick is
  considered ACTIVE, there is a chance that the slave will miss some
  data. If the arbiter brick is being newly added or replaced, there is
  a chance of the slave missing all the data during sync.

  Fix: If there is a data-pending marker for the arbiter brick, send a
  truncate on the arbiter brick during heal, so that it records the
  truncate as the data transaction in the changelog.

  Change-Id: I3242ba6cea6da495c418ef860d9c3359c5459dec
  fixes: bz#1686568
  Signed-off-by: karthik-us <ksubrahm@redhat.com>

* cluster/afr: Add quorum checks to open & opendir fops
  karthik-us | 2019-03-08 | 4 files | -2/+48

  Problem: Currently, even if open & opendir fail on a quorum number of
  bricks, the fop results in success as long as it succeeds on at least
  one brick. This leads to inconsistency with the behaviour of the
  operations following the open, which do have quorum checks.

  Fix: Add quorum checks to the open & opendir fops to avoid the
  inconsistency.

  Change-Id: If8fcb82072a6dc45ea6d4a6754b79763215eba2a
  fixes: bz#1634664
  Signed-off-by: karthik-us <ksubrahm@redhat.com>

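  A standalone, hedged sketch of the quorum rule being applied (the
  struct, the simple-majority formula, and the errno choice are
  illustrative):

    #include <errno.h>

    struct open_tally {
        int success_count; /* children where open/opendir succeeded */
        int child_count;   /* total children in the replica */
    };

    /* Sketch: fail the fop unless enough children succeeded, even if
     * at least one of them did. */
    int open_quorum_errno(const struct open_tally *t)
    {
        int quorum = t->child_count / 2 + 1;
        if (t->success_count >= quorum)
            return 0;
        return ENOTCONN; /* unwind open/opendir with an error */
    }
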
* afr: mark changelog_fsync as internal
  Soumya Koduri | 2019-03-05 | 1 file | -1/+3

  As afr_changelog_fsync is used for internal operations, set
  GLUSTERFS_INTERNAL_FOP_KEY so that the lease xlator can avoid
  treating it as a conflicting fop and recalling the lease.

  Change-Id: I52cdc161002e840199d24439231a8bfa4f98b1b6
  updates: bz#1648768
  Signed-off-by: Soumya Koduri <skoduri@redhat.com>

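  A hedged sketch of tagging a fop's xdata with the internal-fop key
  (the helper and the value stored are illustrative; only the key name
  comes from the entry above):

    #include <glusterfs/glusterfs.h>
    #include <glusterfs/dict.h>

    /* Sketch: mark an internally generated fsync so xlators such as
     * leases can recognize and skip it. */
    static dict_t *
    make_internal_xdata(void)
    {
        dict_t *xdata = dict_new();
        if (!xdata)
            return NULL;
        if (dict_set_str(xdata, GLUSTERFS_INTERNAL_FOP_KEY, "yes") < 0) {
            dict_unref(xdata);
            return NULL;
        }
        return xdata;
    }
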
* afr/shd: Cleanup self heal daemon resources during afr fini
  Mohammed Rafi KC | 2019-02-12 | 2 files | -0/+59

  We were not properly cleaning up self-heal daemon resources during
  afr fini. This patch cleans up the same.

  Change-Id: I597860be6f781b195449e695d871b8667a418d5a
  updates: bz#1659708
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>

* cluster/thin-arbiter: Consider thin-arbiter before marking new entry changelog
  Ashish Pandey | 2019-02-01 | 4 files | -19/+87

  If a fop to create an entry fails on one of the data bricks, we mark
  the pending changelog on the entry on the brick for which it was
  successful. This is done as part of the post-op phase to make sure
  that the entry gets healed even if it gets renamed to some other path
  where its parent was not marked as bad.

  As this happens as part of post-op, we should consult the
  thin-arbiter to check whether the brick that was successful is the
  good brick or not. This avoids split-brain and other issues.

  Change-Id: I12686675be98f02f70a5186b3ed748c541514d53
  updates: bz#1662264
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>

* Multiple files: reduce work while under lock.
  Yaniv Kaul | 2019-01-29 | 2 files | -12/+13

  Mostly, unlock before logging. In some cases, code that did not need
  to be under lock (for example, taking the time, or malloc'ing) was
  moved to execute before taking the lock.

  Note: logging might be slightly less accurate in ordering, since it
  may no longer happen under the lock, so the order of logs is racy. I
  think it's a reasonable compromise.

  Compile-tested only!

  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
  Change-Id: I2438710016afc9f4f62a176ef1a0d3ed793b4f89

* afr: do not resolve split-brains when copies are of same size
  Iraj Jamali | 2019-01-22 | 1 file | -18/+25

  Automatic split-brain resolution with size as the policy must not
  resolve split-brains when the copies are of the same size. This patch
  determines whether the sizes of the copies are the same and returns
  -1 in that case.

  updates: bz#1655052
  Change-Id: I3d8e8b4d7962b070ed16c3ee02a1e5a926fd5eab
  Signed-off-by: Iraj Jamali <ijamali@redhat.com>

* afr: Split-brain with size as policy must not resolve for directories
  Sheetal Pamecha | 2019-01-21 | 1 file | -2/+12

  In automatic split-brain resolution, when the favorite child policy
  is set to size, split-brain resolution must not act on directories.
  Currently, if a directory is in split-brain with both copies having
  the same size, a source is selected arbitrarily and healed.

  fixes: bz#1655050
  Change-Id: I5739498639c17c89874cc577362e543adab55f5d
  Signed-off-by: Sheetal Pamecha <sheetal.pamecha08@gmail.com>

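  A hedged sketch covering the size-policy guards in this and the
  previous entry (the helper name is invented): no verdict for
  directories, and no verdict when the copies are the same size:

    #include <glusterfs/iatt.h>

    /* Sketch: return the index of the bigger copy, or -1 when the
     * size policy cannot decide. */
    static int
    pick_bigger_copy(const struct iatt *a, const struct iatt *b)
    {
        if (IA_ISDIR(a->ia_type)) /* size is meaningless here */
            return -1;
        if (a->ia_size == b->ia_size) /* tie: no source */
            return -1;
        return (a->ia_size > b->ia_size) ? 0 : 1;
    }
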
* cluster/afr: fix zerofill transaction.start
  Xiubo Li | 2019-01-14 | 1 file | -1/+1

  This looks like a mistake made while coding.

  Fixes: bz#1665332
  Change-Id: Ia8f8dadf4a71579240ff9950b141ca528bd342b3
  Signed-off-by: Xiubo Li <xiubli@redhat.com>

* afr: fix memory leak
  Sunny Kumar | 2019-01-11 | 1 file | -23/+15

  This patch fixes a memory leak reported by ASan. The fix was first
  merged as https://review.gluster.org/#/c/glusterfs/+/21805, but that
  change was later reverted because of
  https://review.gluster.org/#/c/glusterfs/+/21178/.

  updates: bz#1633930
  Change-Id: I1febe121e0be33a637397a0b54d6b78391692b0d
  Signed-off-by: Sunny Kumar <sunkumar@redhat.com>

* cluster/afr: Disable client side heals in AFR by default.
  Sunil Kumar Acharya | 2019-01-10 | 1 file | -3/+3

  With this changeset, the default value for the AFR client-side heal
  volume option is set to "off".

  fixes: bz#1663102
  Change-Id: Ie4016932339c4896487e3e7cb5caca68739b7ba2
  Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>

* cluster/afr: Refactor internal locking code to allow multiple inodelks
  Pranith Kumar K | 2018-12-28 | 6 files | -798/+366

  For implementing the copy_file_range fop, AFR needs to perform two
  inodelks in the same transaction. This patch brings in the necessary
  structure to make that easier. Entry locks in AFR were already taking
  multiple entry-locks on different inodes with the respective
  basenames; this patch extends that logic so inodelks use the same
  lockee_t structure. This led to the removal of quite a lot of
  duplicate code in afr-lk-common.c, as both kinds of locks do the same
  things except for the 'winding' part.

  updates: #536
  Change-Id: Ibfce7e3f260bb27b18645152ec680c33866fe0ae
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>

* cluster/afr: Allow lookup on root if it is from ADD_REPLICA_MOUNT
  karthik-us | 2018-12-18 | 7 files | -30/+78

  Problem: When trying to convert a plain distribute volume to a
  replica-3 or arbiter type, the conversion fails with ENOTCONN, as the
  lookup on the root fails because there is no quorum.

  Fix: Allow lookup on root if it is coming from the ADD_REPLICA_MOUNT,
  which is used while adding bricks to a volume. It will try to set the
  pending xattrs for the newly added bricks to allow the heal to happen
  in the right direction and avoid data-loss scenarios.

  Note: This fix solves the problem of type conversion only in the case
  where the volume was mounted at least once. Conversion of
  never-mounted volumes will still fail, since the DHT selfheal, which
  tries to set the directory layout, does that with the pid
  GF_CLIENT_PID_NO_ROOT_SQUASH set in frame->root.

  Change-Id: Ic511939981dad118cc946754341318b164954b3b
  fixes: bz#1655854
  Signed-off-by: karthik-us <ksubrahm@redhat.com>

* xlators/cluster/afr/src/afr-self-heal-common.c: remove a variable array.
  Yaniv Kaul | 2018-12-18 | 1 file | -10/+6

  Added '-Wvla' and saw this: gcc doesn't like variable-length arrays.
  There are plenty of others in the EC code, but this one seems OK to
  remove; there is no use for the array members (I hope - that was from
  reading the code).

  Compile-tested only!

  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
  Change-Id: I350f4520e52b86c8bbcd60eea1b27ef99cd119aa

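  A standalone, hedged sketch of the transformation ('-Wvla' flags the
  first form; the names are invented):

    #include <stdlib.h>

    /* Before: a VLA whose stack footprint depends on child_count. */
    void heal_before(int child_count)
    {
        int healed_sinks[child_count];
        (void)healed_sinks;
    }

    /* After: a heap allocation of the same size, checked and freed. */
    void heal_after(int child_count)
    {
        int *healed_sinks = calloc(child_count, sizeof(int));
        if (!healed_sinks)
            return;
        /* ... use the array ... */
        free(healed_sinks);
    }
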
* Don't depend on string options to be valid always
  Pranith Kumar K | 2018-12-17 | 5 files | -41/+56

  updates: bz#1650403
  Change-Id: Ib5a11e691599ce4bd93c1ed5aca6060592893961
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>