path: root/xlators/cluster/ec/src/ec-inode-write.c
Commit message (Author, Age, Files, Lines)
* ec: Fix posix compliance failures (Xavier Hernandez, 2015-01-28, 1 file, -34/+28)

  This patch solves some problems that caused dispersed volumes to fail the
  POSIX smoke tests:

  * Problems in open/create with O_WRONLY

    Opening files with -w- permissions using O_WRONLY returned an EACCES
    error because O_WRONLY was internally replaced with O_RDWR.

  * Problems with entrylk on renames

    When source and destination were the same, ec tried to acquire the same
    entrylk twice, causing a deadlock.

  * Overwrite of a variable when reordering locks

    On a rename, if the second lock needed to be placed at the beginning of
    the list, the 'lock' variable was overwritten. Its timer was cancelled
    later, so the wrong timer was cancelled.

  * Handle O_TRUNC in open

    When O_TRUNC was received in an open call, it was blindly propagated to
    child subvolumes. This caused a discrepancy between the real file size
    and the size stored in the trusted.ec.size xattr. This has been solved by
    removing O_TRUNC from open and later calling ftruncate.

  Change-Id: I20c3d6e1c11be314be86879be54b728e01013798
  BUG: 1161886
  Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-on: http://review.gluster.org/9420
  Reviewed-by: Dan Lambright <dlambrig@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
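
The O_TRUNC handling described in this commit can be illustrated with plain POSIX calls. This is only a sketch of the idea, not the translator code: `strip_trunc` and `open_then_truncate` are hypothetical names, and in ec the truncate goes through the translator's own ftruncate path (which is what keeps trusted.ec.size consistent) rather than through the libc call used here.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: remove O_TRUNC from the caller's flags and report
 * whether an explicit truncate must be issued after the open succeeds.
 * The access mode (O_RDONLY/O_WRONLY/O_RDWR) is left untouched. */
static int strip_trunc(int flags, int *need_truncate)
{
    *need_truncate = (flags & O_TRUNC) != 0;
    return flags & ~O_TRUNC;
}

/* Hypothetical wrapper: open without O_TRUNC, then truncate as a separate,
 * explicit step. Returns the fd, or -1 on failure. */
int open_then_truncate(const char *path, int flags, mode_t mode)
{
    int need_truncate = 0;
    int fd = open(path, strip_trunc(flags, &need_truncate), mode);

    if (fd < 0)
        return -1;

    if (need_truncate && ftruncate(fd, 0) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Splitting the operation in two means the size change is performed by a call that knows how to update size metadata, instead of happening silently inside open.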
* ec: Remove O_APPEND from flags on create and open. (Xavier Hernandez, 2015-01-09, 1 file, -51/+53)

  Allowing the O_APPEND flag to pass through to the brick files corrupts
  fragment contents because writes are not stored in the desired place.

  The write fop has been modified so that it uses the current file size as
  its write offset. This guarantees that all writes, even those coming from
  different file descriptors and clients, will write to the end of the file.

  Change-Id: I9f721f12217a98231fe52e344166d1c94172c272
  BUG: 1161621
  Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-on: http://review.gluster.org/9079
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Dan Lambright <dlambrig@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
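
A minimal sketch of the offset rule this commit describes, under stated assumptions: `brick_open_flags` and `effective_write_offset` are hypothetical names, and the sketch assumes the caller already holds whatever lock makes `current_size` trustworthy (the commit message does not spell that out).

```c
#include <fcntl.h>
#include <stdint.h>

/* Hypothetical helper: flags forwarded to the bricks never carry O_APPEND,
 * so each brick writes exactly where it is told to. */
static int brick_open_flags(int client_flags)
{
    return client_flags & ~O_APPEND;
}

/* Hypothetical helper: if the client opened the file with O_APPEND, the
 * write starts at the current file size; otherwise the requested offset
 * is used unchanged. */
static uint64_t effective_write_offset(int client_flags,
                                       uint64_t requested_offset,
                                       uint64_t current_size)
{
    if (client_flags & O_APPEND)
        return current_size;
    return requested_offset;
}
```

For example, an append write against a 4096-byte file is placed at offset 4096 regardless of the offset the client passed.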
* cluster/ec: Handle internal xattr get/set (Pranith Kumar K, 2015-01-08, 1 file, -52/+45)

  Problem:
  Internal xattrs of EC like trusted.ec.size/config/version can be modified
  by users and that can lead to misbehavior in EC.

  Fix:
  Don't let the user modify the xattrs. Hide these xattrs in getfattr
  outputs.

  Change-Id: I39cec96ae12826b506b496fda7da74201015fd75
  BUG: 1178688
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/9385
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org>
  Tested-by: Emmanuel Dreyfus <manu@netbsd.org>
  Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
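
One way such a filter could look is a simple name check applied to every user-supplied xattr key. This is a hedged sketch: `is_internal_xattr` is a hypothetical name, and matching on the whole "trusted.ec." prefix is an assumption, since the commit message only names the size, config and version keys.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed prefix shared by the internal keys named in the commit message
 * (trusted.ec.size, trusted.ec.config, trusted.ec.version). */
static const char EC_XATTR_PREFIX[] = "trusted.ec.";

/* Hypothetical predicate: true when the key belongs to the translator and
 * should be neither modified by users nor shown in listings. */
static bool is_internal_xattr(const char *name)
{
    return strncmp(name, EC_XATTR_PREFIX, sizeof(EC_XATTR_PREFIX) - 1) == 0;
}
```

A set request would be rejected when the predicate is true, and a list request would skip matching keys before returning them to the caller.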
* ec: Fix return errors when not enough bricks (Xavier Hernandez, 2014-12-05, 1 file, -0/+5)

  Changes introduced by this patch:

  * Fix an incorrect error propagation when the state of the life cycle of a
    fop returns an error.

  * Fix incorrect unlocking of failed locks.

  * Return ENOTCONN if there aren't enough bricks online.

  * In readdir(p) check that the fd has been successfully opened by a
    previous opendir.

  Change-Id: Ib44f25a1297849ebcbab839332f3b6359f275ebe
  BUG: 1162805
  Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-on: http://review.gluster.org/9098
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
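
The ENOTCONN rule can be sketched as a check on a bitmask of online bricks. The names `count_online` and `check_enough_bricks`, the bitmask representation, and the `min_bricks` threshold are all assumptions made for illustration; the commit message only states that ENOTCONN is returned when too few bricks are online.

```c
#include <errno.h>
#include <stdint.h>

/* Count the set bits in the mask; each bit stands for one online brick. */
static int count_online(uint64_t online_mask)
{
    int count = 0;

    while (online_mask != 0) {
        online_mask &= online_mask - 1; /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Hypothetical check: 0 when at least min_bricks are online,
 * ENOTCONN otherwise. */
static int check_enough_bricks(uint64_t online_mask, int min_bricks)
{
    return count_online(online_mask) >= min_bricks ? 0 : ENOTCONN;
}
```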
* ec: Change license (Xavier Hernandez, 2014-12-03, 1 file, -16/+6)

  Change-Id: Iae90ade2421898417b53dec0417a610cf306c44b
  BUG: 1168167
  Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-on: http://review.gluster.org/9201
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* ec: Fix self-heal issues (Xavier Hernandez, 2014-10-21, 1 file, -14/+24)

  Problem: Doing an 'ls' of a directory that has been modified while one of
  the bricks was down sometimes returns the old directory contents.

  Cause: Directories are not marked when they are modified, as files are.
  The ec xlator balances requests amongst available and healthy bricks.
  Since there is no way to detect that a directory is out of date in one of
  the bricks, that brick is used from time to time to return the directory
  contents.

  Solution: The solution basically consists of using versioning information
  for directories as well; however, some additional changes have been
  necessary.

  Changes:

  * Use directory versioning:

    This required locking the full directory instead of a single entry for
    all requests that add or remove entries from it. This is needed to allow
    an atomic version update. This affects the following fops:

      create, mkdir, mknod, link, symlink, rename, unlink, rmdir

    Another side effect is that opendir needs to do a previous lookup to get
    versioning information and discard out-of-date bricks for subsequent
    readdir(p) calls.

  * Restrict directory self-heal:

    Until now, when a discrepancy was found in lookup, a self-heal was
    automatically started. This caused the versioning information of a bad
    directory to be healed instantly, making the original problem reappear.

    To solve this, when a missing directory is detected in one or more
    bricks on lookup or opendir fops, only a partial self-heal is performed
    on it. A partial self-heal basically creates the directory but does not
    restore any additional information.

    This prevents an 'ls' from repairing the directory and causing the
    problem to happen again. With this change, the output of 'ls' is always
    consistent. However, since the directory has been created in the brick,
    any other operation on it (creating new files, for example) can succeed
    on all bricks without adding work to the self-heal process.

    To force a self-heal of a directory, any other operation, for example a
    getxattr, must be done on it.

    With these changes, the correct healing procedure that avoids
    inconsistent directory browsing consists of a post-order traversal of
    the directories being healed. This way, the directory contents will be
    healed before healing the directory itself.

  * Additional changes to fix self-heal errors

    - Don't use fop->fd to decide between fd/loc.

      open, opendir and create have an fd, but the correct data is in loc.

    - Fix incorrect management of bad bricks per inode/fd.

    - Fix incorrect selection of fop's target bricks when there are bad
      bricks involved.

    - Improved ec_loc_parent() to always return a parent loc as complete as
      possible.

  Change-Id: Iaf3df174d7857da57d4a87b4a8740a7048b366ad
  BUG: 1149726
  Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-on: http://review.gluster.org/8916
  Reviewed-by: Dan Lambright <dlambrig@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
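
The post-order healing procedure mentioned in this commit can be sketched on an in-memory directory tree. Everything here is illustrative: `dir_node`, `heal_post_order`, `heal_log` and `record_heal` are hypothetical names, and the "heal" step is only a callback that records the visiting order rather than an actual self-heal trigger.

```c
#include <stddef.h>

/* Hypothetical in-memory directory tree used only for illustration. */
struct dir_node {
    const char *name;
    struct dir_node **children;
    size_t child_count;
};

typedef void (*heal_fn)(const struct dir_node *dir, void *ctx);

/* Post-order traversal: every subdirectory is healed before the directory
 * that contains it, so a directory is only marked healthy once its
 * contents already are. */
static void heal_post_order(const struct dir_node *dir, heal_fn heal,
                            void *ctx)
{
    for (size_t i = 0; i < dir->child_count; i++)
        heal_post_order(dir->children[i], heal, ctx);
    heal(dir, ctx);
}

/* Example callback: records the order in which directories are healed. */
struct heal_log {
    const char *order[16];
    size_t count;
};

static void record_heal(const struct dir_node *dir, void *ctx)
{
    struct heal_log *hlog = ctx;

    if (hlog->count < sizeof(hlog->order) / sizeof(hlog->order[0]))
        hlog->order[hlog->count++] = dir->name;
}
```

For a tree `/` containing `a`, which in turn contains `a/b`, the callback sees `a/b` first, then `a`, then `/`.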
* ec: Fix invalid inode lock in ftruncate (Xavier Hernandez, 2014-09-19, 1 file, -9/+9)

  The fops 'truncate' and 'ftruncate' share some code and inodelk() was
  always made against the inode inside the loc_t structure instead of that
  of fd_t. Since ftruncate has the loc initialized to NULL, this fop was
  executed without any lock, allowing some concurrent modifications in the
  file size.

  Also changed the way in which 'fop' and 'ffop' are differentiated in
  shared code. Now it uses the 'id' field instead of checking if 'fd' is
  NULL.

  Change-Id: Ibd18accf2652193b395a841b9029729e5f4867c6
  BUG: 1140396
  Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-on: http://review.gluster.org/8695
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
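
The selection rule in this commit can be sketched with simplified stand-in types. The structures and the `lock_target` function below are hypothetical and far smaller than the real loc_t, fd_t and fop structures; the point is only that the operation id, not the presence of an fd, decides which inode gets locked.

```c
#include <stddef.h>

/* Simplified stand-ins for the real structures (assumptions). */
enum fop_id { FOP_TRUNCATE, FOP_FTRUNCATE };

struct inode { int unused; };
struct loc   { struct inode *inode; };
struct fd    { struct inode *inode; };

struct fop {
    enum fop_id id;
    struct loc loc;   /* filled in for path-based fops */
    struct fd *fd;    /* filled in for fd-based fops */
};

/* Pick the inode to lock from the operation id. An fd-based fop takes the
 * inode from its fd; a path-based fop takes it from its loc, even if an fd
 * happens to be attached. */
static struct inode *lock_target(const struct fop *fop)
{
    if (fop->id == FOP_FTRUNCATE)
        return fop->fd != NULL ? fop->fd->inode : NULL;
    return fop->loc.inode;
}
```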
* ec: Optimize read/write performance (Xavier Hernandez, 2014-09-15, 1 file, -52/+65)

  This patch significantly improves performance of read/write operations on
  a dispersed volume by reusing previous inodelk/entrylk operations on the
  same inode/entry. This reduces the latency of each individual operation
  considerably.

  Inode version and size are also updated when needed instead of on each
  request. This gives an additional boost.

  Change-Id: I4b98d5508c86b53032e16e295f72a3f83fd8fcac
  BUG: 1122586
  Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-on: http://review.gluster.org/8369
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* cluster/ec: Fix incorrect management of NFS requests (Xavier Hernandez, 2014-08-02, 1 file, -38/+17)

  Some operations, especially those coming from NFS, do not use a regular fd
  and use an anonymous fd instead (i.e. a previous open call has not been
  sent). Any context information created during open or create will not be
  present on these fds, so we simply return NULL for the contexts of those
  fds.

  Also, it seems that NFS can send write requests with a very big buffer
  (larger than the default value of 128 KB). Some changes have been made to
  correctly handle these large buffers.

  Change-Id: I281476bd0d2cbaad231822248d6a616fcf5d4003
  BUG: 1122417
  Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-on: http://review.gluster.org/8367
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* ec: Fixed Coverity scan issues (Xavier Hernandez, 2014-07-21, 1 file, -0/+4)

  CID list:

    1226163 Logically dead code
    1226166 Missing break in switch
    1226167 Missing break in switch
    1226168 Missing break in switch
    1226169 Missing break in switch
    1226170 Missing break in switch
    1226171 Missing break in switch
    1226172 Missing break in switch
    1226173 Missing break in switch
    1226174 Missing break in switch
    1226175 Missing break in switch
    1226176 Missing break in switch
    1226177 Missing break in switch
    1226178 Data race condition
    1226179 Data race condition
    1226180 Data race condition
    1226181 Thread deadlock
    1226182 Uninitialized pointer read
    1226183 Uninitialized pointer read
    1226184 Read from pointer after free

  Change-Id: I4d33aa42289371927175c43bb29e018df64fb943
  BUG: 789278
  Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-on: http://review.gluster.org/8317
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/ec: Added erasure code translator (Xavier Hernandez, 2014-07-11, 1 file, -0/+2235)

  Change-Id: I293917501d5c2ca4cdc6303df30cf0b568cea361
  BUG: 1118629
  Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-on: http://review.gluster.org/7749
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>