summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/ec/src/ec-data.c
Commit message (Collapse)AuthorAgeFilesLines
* cluster/ec: fix fd reopenPranith Kumar K2019-05-081-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently EC tries to reopen fd's that have been opened while a brick was down. This is done as part of regular write operations, just after having acquired the locks, and it's sent as a sub-fop of the main write fop. There were two problems: 1. The reopen was attempted on all UP bricks, even if a previous lock didn't succeed. This is incorrect because most probably the open will fail. 2. If reopen is sent and fails, the error is propagated to the main operation, causing it to fail when it shouldn't. To fix this, we only attempt reopens on bricks where the current fop owns a lock, and we prevent any error to be propagated to the main fop. To implement this behaviour an argument used to indicate the minimum number of required answers has overloaded to also include some flags. To make the change consistent, it has been necessary to rename the argument, which means that a lot of files have been changed. However there are no functional changes. This change has also uncovered a problem in discard code, which didn't correctely process requests of small sizes because no real discard fop was being processed, only a write of 0's on some region. In this case some fields of the fop remained uninitialized or with incorrect values. To fix this, a new function has been created to simulate success on a fop and it's used in the discard case. Thanks to Pranith for providing a test script that has also detected an issue in this patch. This patch includes a small modification of this script to force data to be written into bricks before stopping them. Backport of: > Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec > BUG: bz#1699866 > Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec Fixes: bz#1699917 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* Land part 2 of clang-format changesGluster Ant2018-09-121-102/+84
| | | | | Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
* cluster/ec: Improve logging for some critical error messagesAshish Pandey2018-09-071-0/+1
| | | | | | Change-Id: I037e52a3467467b81a1ba5416317870864060d4d updates: bz#1615703 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* cluster/ec: Prevent self-heal to work after PARENT_DOWNXavier Hernandez2017-11-281-18/+3
| | | | | | | | | | | | | | | | | | | When the volume is being stopped, PARENT_DOWN event is received. This instructs EC to wait until all pending operations are completed before declaring itself down. However heal operations are ignored and allowed to continue even after having said it was down. This may cause unexpected results and crashes. To solve this, heal operations are considered exactly equal as any other operation and EC won't propagate PARENT_DOWN until all operations, including healing, are complete. To avoid big delays if this happens in the middle of a big heal, a check has been added to quit current heal if shutdown is detected. Change-Id: I26645e236ebd115eb22c7ad4972461111a2d2034 BUG: 1515266 Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
* cluster/ec: FORWARD_NULL coverity fixAkarsha Rai2017-09-291-0/+1
| | | | | | | | | | | | | Problem: cbk could be NULL. Solution: Returning NULL when memory is not allocated for cbk. BUG: 789278 Change-Id: Iea9128e0f3b95100deca560f690f9baaae226abf Signed-off-by: Akarsha Rai <akrai@redhat.com>
* cluster/ec: Mark internal fops appropriatelyXavier Hernandez2015-11-191-0/+3
| | | | | | | | | | | | | 1) Mark read fops in read-modify-write by EC as internal. 2) Handle uid/gid set/reset correctly BUG: 1282761 Change-Id: I5c1ce0cd6213367eaead5fed33aa2397c4e46df7 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/12599 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Fix bad management of lock ownersXavier Hernandez2015-11-051-1/+2
| | | | | | | | | | | | | | | | | | | | Since the addition of parallel reads patch for ec, a lock can have more than one owner at the same time. The list of owners was stored inside the 'owner_list' field of each fop. The problem was with fops that required more than one lock (like rename). In this case the same field was used to add the fop to more than one list, casing an overwrite of the previous list. This has been solved moving the 'owner_list' field from ec_fop_data_t to ec_lock_link_t structure. Change-Id: I6042129f09082497b80782b5704a52c35c78f44d BUG: 1276031 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/12445 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* all: reduce "inline" usageJeff Darcy2015-09-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | There are three kinds of inline functions: plain inline, extern inline, and static inline. All three have been removed from .c files, except those in "contrib" which aren't our problem. Inlines in .h files, which are overwhelmingly "static inline" already, have generally been left alone. Over time we should be able to "lower" these into .c files, but that has to be done in a case-by-case fashion requiring more manual effort. This part was easy to do automatically without (as far as I can tell) any ill effect. In the process, several pieces of dead code were flagged by the compiler, and were removed. Change-Id: I56a5e614735c9e0a6ee420dab949eac22e25c155 BUG: 1245331 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/11769 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* cluster/ec: Allow read fops to be processed in parallelXavier Hernandez2015-08-291-0/+1
| | | | | | | | | | | | | | Currently ec only sends a single read request at a time for a given inode. Since reads do not interfere between them, this patch allows multiple concurrent read requests to be sent in parallel. Change-Id: If853430482a71767823f39ea70ff89797019d46b BUG: 1245689 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/11742 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/ec: Minimize usage of EIO errorXavier Hernandez2015-07-281-3/+3
| | | | | | | | | | Change-Id: I82e245615419c2006a2d1b5e94ff0908d2f5e891 BUG: 1245276 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/11741 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* cluster/ec: wind readlink on good subvol(s)Pranith Kumar K2015-07-141-0/+1
| | | | | | | | | | BUG: 1232172 Change-Id: I3a56e487840d86147dd85bf5fbe79b165eae289f Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/11589 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/ec: Fix use after free bugPranith Kumar K2015-07-071-0/+1
| | | | | | | | | | | | | | | In ec_lock() there is a chance that ec_resume is called on fop even before ec_sleep. This can result in refs == 0 for fop leading to use after free in this function when it calls ec_sleep so do ec_sleep at start and ec_resume at end of this function. Change-Id: I879b2667bf71eaa56be1b53b5bdc91b7bb56c650 BUG: 1240284 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/11558 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* cluster/ec: Add throttling in background healingPranith Kumar K2015-07-011-0/+2
| | | | | | | | | | | | | | - 8 parallel heals can happen. - 128 heals will wait for their turn - Heals will be rejected if 128 heals are already waiting. Change-Id: I2e99bf064db7bce71838ed9901a59ffd565ac390 BUG: 1237381 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/11471 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* ec: Porting messages to new logging frameworkNandaja Varma2015-06-261-8/+14
| | | | | | | | | Change-Id: Ia05ae750a245a37d48978e5f37b52f4fb0507a8c BUG: 1194640 Signed-off-by: Nandaja Varma <nandaja.varma@gmail.com> Reviewed-on: http://review.gluster.org/10465 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* cluster/ec: wind fops on good subvols for access/readdir[p]Pranith Kumar K2015-06-261-0/+2
| | | | | | | | | Change-Id: I1e629a6adc803c4b7164a5a7a81ee5cb1d0e139c BUG: 1232172 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/11246 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* cluster/ec: Forced unlock when lock contention is detectedXavier Hernandez2015-05-271-7/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | EC uses an eager lock mechanism to optimize multiple read/write requests on the same entry or inode. This increases performance but can have adverse results when other clients try to access the same entry/inode. To solve this, this patch adds a functionality to detect when this happens and force an earlier release to not block other clients. The method consists on requesting GF_GLUSTERFS_INODELK_COUNT and GF_GLUSTERFS_ENTRYLK_COUNT for all fops that take a lock. When this count is greater than one, the lock is marked to be released. All fops already waiting for this lock will be executed normally before releasing the lock, but new requests that also require it will be blocked and restarted after the lock has been released and reacquired again. Another problem was that some operations did correctly lock the parent of an entry when needed, but got the size and version xattrs from the entry instead of the parent. This patch solves this problem by binding all queries of size and version to each lock and replacing all entrylk calls by inodelk ones to remove concurrent updates on directory metadata. This also allows rename to correctly update source and destination directories. Change-Id: I2df0b22bc6f407d49f3cbf0733b0720015bacfbd BUG: 1165041 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/10852 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Fix use after free crashPranith Kumar K2015-05-211-6/+44
| | | | | | | | | | | | | | | | | ec_heal creates ec_fop_data but doesn't run ec_manager. ec_fop_data_allocate adds this fop to ec->pending_fops, because ec_manager is not run on this heal fop it is never removed from ec->pending_fops. When it is accessed after free it leads to crash. It is better to not to add HEAL fops to ec->pending_fops because we don't want graph switch to hang the mount because of a BIG file/directory heal. BUG: 1188145 Change-Id: I8abdc92f06e0563192300ca4abca3909efcca9c3 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10868 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* cluster/ec: Correctly cleanup delayed locksXavier Hernandez2015-05-201-6/+7
| | | | | | | | | | | | | | | | | When a delayed lock is pending, a graph switch doesn't correctly terminate it. This means that the update of version and size xattrs is lost, causing EIO errors. This patch handles GF_EVENT_PARENT_DOWN event to correctly finish pending udpdates before completing the graph switch. Change-Id: I394f3b8d41df8d83cdd36636aeb62330f30a66d5 BUG: 1188145 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/10787 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* ec: Change licenseXavier Hernandez2014-12-031-16/+6
| | | | | | | | | | Change-Id: Iae90ade2421898417b53dec0417a610cf306c44b BUG: 1168167 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/9201 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* ec: Optimize read/write performanceXavier Hernandez2014-09-151-1/+0
| | | | | | | | | | | | | | | | | | This patch significantly improves performance of read/write operations on a dispersed volume by reusing previous inodelk/ entrylk operations on the same inode/entry. This reduces the latency of each individual operation considerably. Inode version and size are also updated when needed instead of on each request. This gives an additional boost. Change-Id: I4b98d5508c86b53032e16e295f72a3f83fd8fcac BUG: 1122586 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/8369 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* cluster/ec: Added erasure code translatorXavier Hernandez2014-07-111-0/+261
Change-Id: I293917501d5c2ca4cdc6303df30cf0b568cea361 BUG: 1118629 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/7749 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>