summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/afr/src/afr-read-txn.c
Commit message (Collapse)AuthorAgeFilesLines
* afr: do not mention split-brain in log message in read_txnRavishankar N2017-04-041-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I am seeing a lot of messages in qe/customer logs where read_txn complains that file is possibly in split-brain because of no readable subvol being found, does inode refresh and then there is no split-brain message post the inode refresh. This means that a lookup was not issued on the indoe to populate 'readable' or it can mean one brick is source for data and the other for metadata, making readable to be zero (because readable=intersection of (data,metadata readable) since commit 7a1c1e290470149696. Since we anyway log actual split-brains post inode-refresh, move this message to DEBUG log level. > Signed-off-by: Ravishankar N <ravishankar@redhat.com> > Reviewed-on: https://review.gluster.org/16879 > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit 71e023fcaab0058f32fedc7b6b702040fdd85f46) Change-Id: Idb88b8ea362515279dc9b246f06b6b646c6d8013 BUG: 1434302 Reviewed-on: https://review.gluster.org/16933 Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr: Do not log of split-brain when there isn't oneKrutika Dhananjay2017-01-161-18/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/16362 * Even on errors like ENOENT, AFR logs split-brain after read-txn refresh, introduced by commit a07ddd8f. This can be a cause of much panic and confusion and needs to be fixed. * Also fixed this issue in write-txns. * Fixed afr read txns to log about split-brain only after knowing that there is no split-brain choice configured. * Removed code duplication * Fixed incorrect passing of error code in afr_write_txn_refresh_done() (the function was passing -0 as errno to gf_msg(). Change-Id: I21ac7f6e31840fe3da2f9eecccc495056ab46ece BUG: 1412915 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16393 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: Ignore event_generation checks post inode refresh for write txnsRavishankar N2017-01-081-0/+1
| | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/16205/ Before http://review.gluster.org/#/c/16091/, after inode refresh, we failed read txns in case of EIO or event_generation being zero. For write transactions, the check was only for EIO. 16091 re-factored the code to fail both read and write when event_generation=0. This seems to have caused a regression as explained in the BZ. This patch restores that behaviour in afr_txn_refresh_done(). Change-Id: Id763ed2d420b6d045d4505893a18959d998c91a3 BUG: 1378547 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16322 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* afr: allow I/O when favorite-child-policy is enabledRavishankar N2017-01-081-10/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Currently, I/O on a split-brained file fails even when the favorite-child-policy is set until the self-heal is complete. Fix: If a valid 'source' is found using the set favorite-child-policy, inspect and reset the afr pending xattrs on the 'sinks' (inside appropriate locks), refresh the inode and then proceed with the read or write transaction. The resetting itself happens in the self-heal code and hence can also happen in the client side background-heal or by the shd's index-heal in addition to the txn code path explained above. When it happens in via heal, we also add checks in undo-pending to not reset the sink xattrs again. > Reviewed-on: http://review.gluster.org/15673 > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626 BUG: 1378547 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Simon Turcotte-Langevin <simon.turcotte-langevin@ubisoft.com> Reviewed-on: http://review.gluster.org/16091 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* afr:Don't wind reads for files in metadata split-brainRavishankar N2016-06-271-10/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/13389/ Problem: For a read on a file in metadata split-brain: 1.lookup_done resets event_generation to zero. 2. readv is issued, goes to inode refresh due to mismatching event_gen. 3. After refresh is successful, we update event_generation, data and metdata readable. 3. We then call afr_read_txn_refresh_done() which in turn calls afr_inode_get_readable() but doesn't check for EIO. So afr_readv_wind is called with local->readable (which is populated with data_readable), thus winding the read to a brick. 4. Also, further parallel reads that come directly go to the wind path because there is no inode_refresh needed. Fix: 1.For any afr_read_txn(), readable must be an intersection of data and metadata readable. 2.Check for EIO in afr_read_txn_refresh_done(). Change-Id: I22dd221fdfaf96d7aced2f474e28ed1337d69f0e BUG: 1349879 Signed-off-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit 7a1c1e2904701496968ed14b6d7479fb706c3188) Reviewed-on: http://review.gluster.org/14790 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr: Refresh inode for inode-write fops in needPranith Kumar K2016-05-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: If a named fresh-lookup is done on an loc and the fop fails on one of the bricks or not sent on one of the bricks, but by the time response comes to afr, if the brick is up, 'can_interpret' will be set to false in afr_lookup_done(), this will lead to inode-ctx for that inode to be not set, this can lead to EIO in case of a transaction as it depends on 'readable' array to be available by that point. Fix: Refresh inode for inode-write fops for the ctx to be set if it is not already done at the time of named fresh-lookup or if the file is in split-brain where we need to perform one more refresh before failing the fop to check if the file is still in split-brain or not. >BUG: 1336612 >Change-Id: I5c50b62c8de06129b8516039f7c252e5008c47a5 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/14368 >Smoke: Gluster Build System <jenkins@build.gluster.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> BUG: 1337822 Change-Id: I0f904ebaa78b99cbb11546e08c9fc1562e9a3eef Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14449 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/afr : Examine data/metadata readable for read-subvolAnuradha Talur2015-08-251-3/+13
| | | | | | | | | | | | | | | | | | During lookup and discover, currently read_subvol is based only on data_readable. read_subvol should be decided based on both data_readable and metadata_readable. Credits to Ravishankar N for the logic of afr_first_up_child from http://review.gluster.org/10905/ . Change-Id: I98580b23c278172ee2902be08eeaafb6722e830c BUG: 1240244 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/11551 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr: Fix incorrect logging in read transactionsKrutika Dhananjay2015-07-261-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | afr_read_txn_refresh_done() at its entry point can fail for reasons like ENOENT/ESTALE but seldom due to EIO, which is something _AFR_ would internally generate and not receive in response from a child translator. AFR is reporting "split-brain" for _any_ kind of failure in read txn, of the following kind: [2015-07-07 18:04:34.787612] E [MSGID: 108008] [afr-read-txn.c:76:afr_read_txn_refresh_done] 0-vol3-replicate-3: Failing STAT on gfid 18a973c4-73d3-48b8-942c-33a6f1a8e6b4: split-brain observed. [Input/output error] This patch fixes such misleading errors. To-Do: Avoid logging EIO if/when split-brain choice is set. Will do that as part of a separate commit. Change-Id: Ib513c75168f7026118ad5b3f0b35e9dd498cfe1e BUG: 1246052 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/11756 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* afr: Porting messages to new logging frameworkarao2015-06-271-5/+7
| | | | | | | | | | | | | updated Change-Id: I94ac7b2cb0d43a82cf0eeee21407cff9b575c458 BUG: 1194640 Signed-off-by: arao <arao@redhat.com> Signed-off-by: Mohamed Ashiq <mliyazud@redhat.com> Reviewed-on: http://review.gluster.org/9897 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: Block fops when file is in split-brainRavishankar N2015-06-261-14/+8
| | | | | | | | | | | | | For directories, block metadata FOPS. For non-directories, block data and metadata FOPS. Do not block entry FOPS. Change-Id: Id7f656f4a513b9d33c457dd7f2d58028dbef8e61 BUG: 1235007 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/11371 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* cluster/afr: Pick gfid from poststat during fresh lookup for read child ↵Krutika Dhananjay2015-06-241-2/+2
| | | | | | | | | | | calculation Change-Id: I12c1e4f67f4ec4affbe13d7daf871044a8a2a12e BUG: 1235216 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/11373 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* afr: allow readdir to proceed for directories in split-brainRavishankar N2015-05-281-18/+22
| | | | | | | | | | | | | | | | | | | Problem: afr_read_txn() bails out if read_subvol==-1. This meant that for directories that were in entry split-brain, FOPS like readdir, access, stat etc were not allowed. Fix: Except for getxattr, all other FOPS are wound on the first up child of afr. Change-Id: Iacec8fbb1e75c4d2094baa304f62331c81a6f670 BUG: 1221481 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/10776 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Tested-by: NetBSD Build System
* cluster/afr : enable inspection & resolution of files in split-brainAnuradha2015-03-191-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Part 2/2 patch to enable users analyze and resolve split-brain. This patch enables : 1) Users to inspect the files in data and metadata split-brain. 2) Resolve the split-brain. Both using a series of setfattr commands. Consider a volume "test" with 2 bricks. 1) To inspect a file f1: setfattr -n replica.split-brain-choice -v test-client-0 f1 After the execution of this command, if no read_subvol is found, reads will be served from test-client-0 (corresponding to brick-0). 2) To resolve split-brain : setfattr -n replica.split-brain-heal-finalize -v test-client-0 f1 Execution of this command will lead to the resolution of data and metadata split-brain with subvol mentioned in the command (test-client-0 here) as the source and the rest as sink. Change-Id: Ia20f3ee5abd3119e3d54fcc599f1e55ac65fd179 BUG: 1191396 Signed-off-by: Anuradha <atalur@redhat.com> Reviewed-on: http://review.gluster.org/9743 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Implementation of quorum-readsPranith Kumar K2015-03-051-0/+10
| | | | | | | | | | | Provide a way of disabling reads when quorum is not met. Change-Id: Ic4f57c2b87a0b8514600759de3a7a47e217fe3b5 BUG: 1187885 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9543 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Fix spurious metadata self-healsPranith Kumar K2014-09-241-0/+3
| | | | | | | | | | | | | - Added logging for metadata and data self-heals which helped in debugging this issue. - Added checks to skip self-heals when no sinks are available to heal Change-Id: I0d50dceb84cd9ad4fe00e0b749ddf7d4ff42348a BUG: 1128721 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8709 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: move messages to new logging frameworkRavishankar N2014-05-171-3/+4
| | | | | | | | | | | | | | Change important (from a diagnostics point of view) log messages to use the gf_msg() framework. Change-Id: I0a58184bbb78989db149e67f07c140a21c781bc2 BUG: 1075611 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/7784 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr: refactorAnand Avati2014-03-221-0/+239
- Remove client side self-healing completely (opendir, openfd, lookup) - Re-work readdir-failover to work reliably in case of NFS - Remove unused/dead lock recovery code - Consistently use xdata in both calls and callbacks in all FOPs - Per-inode event generation, used to force inode ctx refresh - Implement dirty flag support (in place of pending counts) - Eliminate inode ctx structure, use read subvol bits + event_generation - Implement inode ctx refreshing based on event generation - Provide backward compatibility in transactions - remove unused variables and functions - make code more consistent in style and pattern - regularize and clean up inode-write transaction code - regularize and clean up dir-write transaction code - regularize and clean up common FOPs - reorganize transaction framework code - skip setting xattrs in pending dict if nothing is pending - re-write self-healing code using syncops - re-write simpler self-heal-daemon Change-Id: I1e4080c9796c8a2815c2dab4be3073f389d614a8 BUG: 1021686 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6010 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>