path: root/xlators/features/shard
Commit message | Author | Age | Files | Lines
* features/shard: Skip shards resolution if lookup on /.shard returns ENOENT | Krutika Dhananjay | 2015-06-22 | 1 | -22/+118
  Backport of: http://review.gluster.org/#/c/11065

  This change is done in [f]truncate, rename, unlink and readv.

  This patch also makes lookup in shard delete GF_CONTENT_KEY as a workaround for the problems with read caching of sparse files by quick-read. A proper solution would involve shard_lookup_cbk() performing a readv, aggregating and ordering the responses, and setting the result in the xdata before unwinding the response to upper translators. That will be done in a separate patch.

  Change-Id: I31e5cec8815db0269e664c17ce3e221c55c8863f
  BUG: 1227572
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/11332
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Handle symlinks appropriately in fops | Krutika Dhananjay | 2015-06-06 | 1 | -5/+15
  Backport of: http://review.gluster.org/10995

  (f)stat, unlink and rename must skip doing inode_ctx_get() of shard block size on symbolic links.

  Change-Id: Iaf2502512a5838db137e5e1f0c14b12f5058865f
  BUG: 1227572
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/11066
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* features/shard: Fix incorrect parameter to get_lowest_block() | Krutika Dhananjay | 2015-06-03 | 1 | -2/+3
  Backport of: http://review.gluster.org/10804

  Because get_lowest_block() is a macro, it must be passed the already-evaluated value of the expression (local->offset - 1). Otherwise, textual substitution of the unevaluated expression can cause junk values to be assigned to local->first_block.

  This patch also fixes calls to get_highest_block(), which could return negative values when offset and size are both equal to zero.

  Change-Id: I8f1bc54b536587d6af3a5c193434d06dccbf76dc
  BUG: 1227572
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/11051
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
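The macro pitfall described in this commit can be reproduced in isolation. The sketch below is only an illustration: the macro and function names are hypothetical and are not the shard translator's actual definitions, and it assumes a block index is simply the offset divided by the block size. The "fixed" variant evaluates the expression into a variable before handing it to the macro, which is the approach the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro with unparenthesized parameters: the argument text
 * is pasted in as-is, so operator precedence applies after substitution. */
#define LOWEST_BLOCK_UNSAFE(off, block_size) (off / block_size)

/* Expands to (offset - 1 / block_size): the division binds tighter than
 * the subtraction, so for block_size > 1 this returns offset itself. */
int64_t lowest_block_of_prev_byte_unsafe(int64_t offset, int64_t block_size)
{
    return LOWEST_BLOCK_UNSAFE(offset - 1, block_size);
}

/* Evaluate the expression first, then pass the plain value to the macro. */
int64_t lowest_block_of_prev_byte_fixed(int64_t offset, int64_t block_size)
{
    int64_t prev = offset - 1;
    return LOWEST_BLOCK_UNSAFE(prev, block_size);
}

/* Index of the last block touched by [offset, offset + size).
 * The empty range at offset 0 is guarded so the subtraction never
 * goes below zero. */
int64_t highest_block(int64_t offset, int64_t size, int64_t block_size)
{
    assert(block_size > 0);
    if (offset + size == 0)
        return 0;
    return (offset + size - 1) / block_size;
}
```

With a 4096-byte block size and an offset of 8192, the unsafe variant returns 8192 (not a block index at all), while the fixed variant returns 1.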
* features/shard: Fix issue with readdir(p) fop | Krutika Dhananjay | 2015-06-02 | 2 | -43/+116
  Backport of: http://review.gluster.org/10809

  Problem:
  When readdir(p) is performed on '/' and ".shard" happens to be the last of the entries read in a given iteration of dht_readdir(p) (in other words, the entry with the highest offset in the dirent list sorted in ascending order of d_off), shard xlator would delete this entry as part of handling the call so as to avoid exposing its presence to the application. This would cause the xlators above (like fuse, readdir-ahead etc.) to wind the next readdirp as part of the same request at an offset which is now the highest d_off (after deletion of .shard) from the previously unwound list of entries. This offset would be less than that of ".shard" and would therefore cause /.shard to be read once again. If this happens to be the only entry until end-of-directory, shard xlator would delete this entry and unwind with 0 entries, causing the xlator(s) above to think there is nothing more to readdir and the fop is complete. This would prevent DHT from gathering entries from the rest of its subvolumes, causing some entries to disappear.

  Fix:
  At the level of shard xlator, if ".shard" happens to be the last entry, make shard xlator wind another readdirp at an offset equal to the d_off of ".shard". That way, if ".shard" happens to be the only other entry under '/' until end-of-directory, DHT would receive an op_ret=0. This would enable it to wind readdir(p) on the rest of its subvols and gather the complete picture.

  Also fixed a bug in shard_lookup_cbk(): file_size should be fetched unconditionally in the cbk since it is set unconditionally in the wind path. Without this, lookup would be unwound with ia_size and ia_blocks equal only to those of the base file.

  Change-Id: I0ff0b48b6c9c12edbef947b6840a77a54c131650
  BUG: 1226880
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/11031
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
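The decision at the heart of this fix can be sketched outside GlusterFS. The code below is a simplified model under stated assumptions: `struct entry` and `hide_shard_dir()` are hypothetical stand-ins, not the translator's gf_dirent_t handling, and a batch is modelled as a plain array sorted by d_off. It filters ".shard" out of a batch and reports when that entry was the last one, which is the case where the caller would need to read again from the d_off of ".shard" rather than unwind a shortened (possibly empty) list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified directory entry (not GlusterFS's gf_dirent_t). */
struct entry {
    const char *name;
    uint64_t    d_off;
};

/*
 * Remove ".shard" from entries[0..*count) in place, preserving order.
 * Returns true when ".shard" was the last entry of the batch; in that
 * case *resume_off is set to its d_off so the caller can issue another
 * read starting there. *resume_off is left untouched otherwise.
 */
bool hide_shard_dir(struct entry *entries, size_t *count, uint64_t *resume_off)
{
    bool   was_last = false;
    size_t kept = 0;

    assert(count != NULL && resume_off != NULL);

    for (size_t i = 0; i < *count; i++) {
        if (strcmp(entries[i].name, ".shard") == 0) {
            if (i == *count - 1) {
                was_last = true;
                *resume_off = entries[i].d_off;
            }
            continue; /* never expose ".shard" to the caller */
        }
        entries[kept++] = entries[i];
    }
    *count = kept;
    return was_last;
}
```

When the function returns true, resuming from the returned offset skips past ".shard" itself, so the next batch cannot contain it again.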
* features/shard: Skip block count and size update for directories | Krutika Dhananjay | 2015-05-28 | 1 | -0/+2
  Backport of: http://review.gluster.org/10772

  Change-Id: I3594641ef0bf6a17e1ceab3c9ad87ef18b981d2e
  BUG: 1225922
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/10972
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/shard: Implement [f]truncate fops | Krutika Dhananjay | 2015-05-08 | 2 | -307/+802
  Backport of: http://review.gluster.org/10631

  To-Do:
  * Make ftruncate work even in the absence of path
  * Aggregate and update ia_blocks appropriately when a file is truncated to a lower size.

  Change-Id: Icd424430066233ba61a030e72fdddf692d2b3f22
  BUG: 1214247
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/10638
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Implement readv() fop | Krutika Dhananjay | 2015-05-06 | 2 | -269/+635
  Backport of: http://review.gluster.org/#/c/10528/

  Change-Id: I3ff03d146a8d49cc11e7bf22ffbf830b4dd1e9f1
  BUG: 1214247
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/10569
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Take hole size into account while computing ia_size | Krutika Dhananjay | 2015-05-03 | 2 | -2/+10
  Backport of: http://review.gluster.org/10446

  Change-Id: Ic05e07801605c0d610545368a513b56d8df21bf4
  BUG: 1214247
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/10493
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Add "is-directory" checks in stat/fstat | Krutika Dhananjay | 2015-05-03 | 1 | -0/+12
  Backport of: http://review.gluster.org/10427

  During mount, NFS calls stat directly on the root of the volume without first sending a lookup on it. This was causing inode_ctx_get_block_size() to fail on /. A check is now added in [f]stat to ensure that shard xlator takes no action when the operation is on a directory.

  Change-Id: I8645b7fe58b2d44b5f527d50c1c7102de44acc00
  BUG: 1214247
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/10509
  Tested-by: NetBSD Build System
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Implement rename() fop | Krutika Dhananjay | 2015-04-30 | 2 | -44/+261
  Backport of: http://review.gluster.org/10373

  Change-Id: I15867667d50b2b4aad0ee3738a29f7a410d61ef4
  BUG: 1214247
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/10455
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Implement unlink fop | Krutika Dhananjay | 2015-04-26 | 2 | -233/+495
  Backport of: http://review.gluster.org/10249

  Change-Id: I01761721224c4efbbc5e4992e70ecf68b3868d63
  BUG: 1214247
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/10377
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Consume size and block count in metadata read ops | Krutika Dhananjay | 2015-04-26 | 2 | -98/+271
  Backport of: http://review.gluster.org/10098

  Metadata read fops such as lookup and stat will now fetch the xattr that holds the size and block count information, extract the size and block count fields, and set them in the respective stbuf before unwinding the resultant iatt to the parent xlator.

  Change-Id: If7d2c4af886f8d70dc69d7cb09f1f66be391f198
  BUG: 1214248
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/10331
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: NetBSD Build System
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Introduce file size xattr | Krutika Dhananjay | 2015-04-08 | 3 | -43/+268
  With each inode write FOP, the size and block count of the file will be updated within the xattr. There are two 64-bit fields that are intentionally left blank for now, reserved for future use when a consistency guarantee is introduced in sharding.

  Change-Id: I40a2e700150c1f199a6bf87909f063c84ab7bb43
  BUG: 1207603
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/10097
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
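A size xattr of this kind can be pictured as a small fixed-width record. The sketch below is illustrative only and rests on assumptions not stated in the commit message: that each field is a 64-bit big-endian integer, that the record has four fields, and that they are ordered size, reserved, block count, reserved. All names here (`SIZE_XATTR_LEN`, `size_xattr_encode()`, `size_xattr_decode()`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: [size][reserved][block count][reserved], 8 bytes each. */
#define SIZE_XATTR_FIELDS 4
#define SIZE_XATTR_LEN    (SIZE_XATTR_FIELDS * 8)

/* Store a 64-bit value big-endian, independent of host byte order. */
void put_be64(uint8_t *dst, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        dst[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Load a 64-bit big-endian value. */
uint64_t get_be64(const uint8_t *src)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | src[i];
    return v;
}

/* Pack size and block count; the reserved fields are left zeroed. */
void size_xattr_encode(uint8_t buf[SIZE_XATTR_LEN], uint64_t size,
                       uint64_t blocks)
{
    assert(buf != NULL);
    memset(buf, 0, SIZE_XATTR_LEN);
    put_be64(buf + 0, size);
    put_be64(buf + 16, blocks);
}

/* Unpack size and block count, ignoring the reserved fields. */
void size_xattr_decode(const uint8_t buf[SIZE_XATTR_LEN], uint64_t *size,
                       uint64_t *blocks)
{
    assert(buf != NULL && size != NULL && blocks != NULL);
    *size = get_be64(buf + 0);
    *blocks = get_be64(buf + 16);
}
```

Keeping the byte order explicit means a value written on one host decodes identically on another, which matters for an attribute stored on disk and read by different clients.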
* features/shard: Refactor code | Krutika Dhananjay | 2015-04-06 | 2 | -49/+40
  * Renamed shard_writev_create_write_shards() to shard_common_resolve_shards() to appropriately reflect its functionality and for reuse in other fops too.
  * Move code common to MKNOD and CREATE into a macro.
  * Cut down on if nesting in shard_lookup_cbk()

  Change-Id: I488255499673accd426390c6d42f2b39bab3d637
  BUG: 1205661
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/10096
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Avoid conflict between contrib/uuid and system uuid | Emmanuel Dreyfus | 2015-04-04 | 1 | -5/+5
  glusterfs relies on the Linux uuid implementation, whose API is incompatible with the uuid of most other systems. As a result, libglusterfs has to embed contrib/uuid, which is the Linux implementation, on non-Linux systems. This implementation is incompatible with the system's built-in one, but the symbols have the same names.

  Usually this is not a problem because when we link with -lglusterfs, libc's symbols are trumped. However, there is a problem when a program not linked with -lglusterfs dlopen()s a glusterfs component. In such a case, libc's uuid implementation is already loaded in the calling program, and it will be used instead of libglusterfs's implementation, causing crashes.

  A possible workaround is to pre-load libglusterfs in the calling program (using LD_PRELOAD on NetBSD for instance), but such a mechanism is neither portable nor flexible. A much better approach is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any possible conflict. This is what this change attempts.

  BUG: 1206587
  Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a
  Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
  Reviewed-on: http://review.gluster.org/10017
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* features/shard: Create and use xattr_req dict as and when needed | Krutika Dhananjay | 2015-04-04 | 2 | -5/+38
  Reusing local->xattr_req across the several calls and callbacks of a single xlator fop would cause keys set by a previous call/cbk (sometimes even by the xlators below) to remain in the dict, which in some cases can lead to errors. For instance, the presence of "trusted.glusterfs.dht.*" keys (remnants of the previous call/cbk) can cause the GF_IF_INTERNAL_XATTR_GOTO() check in DHT to fail when the same dict is used to wind [f]setxattr.

  Change-Id: I8612d020f83f3dc55e4a34d10ccbdaf11d7b4fdd
  BUG: 1205661
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/10095
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Fail writes if /.shard already exists as a file | Krutika Dhananjay | 2015-04-03 | 1 | -0/+8
  Change-Id: Id7250ca4637c37a005cf2def43d5b843c1ea6562
  BUG: 1205661
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/10094
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Bug fixes | Krutika Dhananjay | 2015-03-27 | 2 | -30/+50
  * Return number of bytes written in writev cbk on success
  * Eliminate separate inode table for sharding xlator.
  * Fix appearance of "shard" as an option within the volfile for subvolume of type features/shard.
  * Fix values of min and max allowed shard block size
  * Return @new as opposed to NULL in shard_create_gfid_dict() on success

  Change-Id: I6319d377a196d1c5ceed1f65d337ff8eabcb21f8
  BUG: 1205661
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/10003
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Introducing sharding translator | Krutika Dhananjay | 2015-03-19 | 5 | -0/+1950
  Based on the high-level design by Anand V. Avati which can be found @ https://gist.github.com/avati/af04f1030dcf52e16535#sharding-xlator-stripe-20

  Still to-do:
  * complete implementation of inode write fops - [f]truncate, zerofill, fallocate, discard
  * introduce transaction mechanism in inode write fops
  * complete readv
  * Handle open with O_TRUNC
  * Handle unlinking of all shards during unlink/rename
  * Compute total ia_size and ia_blocks in lookup, readdirp, etc
  * wind fsync/flush on all shards

  Note: Most of the items above are related. Once we come up with a clean way to determine the last shard/shard count for a file/file size and the mgmt of sparse regions of the file, implementing them becomes trivial.

  Change-Id: Id871379b53a4a916e4baa2e06f197dd8c0043b0f
  BUG: 1200082
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/9841
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
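Many of the to-do items above come down to mapping a byte range of a file onto fixed-size blocks. The following is a minimal sketch of that mapping, assuming only that a file is divided into consecutive blocks of a fixed size; `struct segment` and `split_range()` are hypothetical names and are not taken from the translator's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the part of a request that falls in one block. */
struct segment {
    uint64_t block;        /* block (shard) index */
    uint64_t block_offset; /* offset within that block */
    uint64_t length;       /* bytes of the request that land in that block */
};

/*
 * Split the byte range [offset, offset + size) into per-block segments.
 * At most max_segs entries are written to segs; the return value is the
 * number of segments the range needs, which may exceed max_segs.
 */
size_t split_range(uint64_t offset, uint64_t size, uint64_t block_size,
                   struct segment *segs, size_t max_segs)
{
    size_t n = 0;

    assert(block_size > 0);

    while (size > 0) {
        uint64_t in_block = offset % block_size;
        uint64_t room = block_size - in_block;   /* bytes left in this block */
        uint64_t len = size < room ? size : room;

        if (n < max_segs) {
            segs[n].block = offset / block_size;
            segs[n].block_offset = in_block;
            segs[n].length = len;
        }
        n++;
        offset += len;
        size -= len;
    }
    return n;
}
```

For example, with a 4096-byte block size, a 5000-byte request at offset 4000 splits into three segments: 96 bytes at the end of block 0, all 4096 bytes of block 1, and the first 808 bytes of block 2.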