summaryrefslogtreecommitdiffstats
path: root/xlators/features/shard/src/shard.c
Commit message (Collapse)AuthorAgeFilesLines
* features/shard: Fix issue with readdir(p) fopKrutika Dhananjay2015-05-311-43/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: When readdir(p) is performed on '/' and ".shard" happens to be the last of the entries read in a given iteration of dht_readdir(p) (in other words the entry with the highest offset in the dirent list sorted in ascending order of d_offs), shard xlator would delete this entry as part of handling the call so as to avoid exposing its presence to the application. This would cause xlators above (like fuse, readdir-ahead etc) to wind the next readdirp as part of the same req at an offset which is (now) the highest d_off (post deletion of .shard) from the previously unwound list of entries. This offset would be less than that of ".shard" and therefore cause /.shard to be read once again. If by any chance this happens to be the only entry until end-of-directory, shard xlator would delete this entry and unwind with 0 entries, causing the xlator(s) above to think there is nothing more to readdir and the fop is complete. This would prevent DHT from gathering entries from the rest of its subvolumes, causing some entries to disappear. Fix: At the level of shard xlator, if ".shard" happens to be the last entry, make shard xlator wind another readdirp at offset equal to d_off of ".shard". That way, if ".shard" happens to be the only other entry under '/' until end-of-directory, DHT would receive an op_ret=0. This would enable it to wind readdir(p) on the rest of its subvols and gather the complete picture. Also, fixed a bug in shard_lookup_cbk() wherein file_size should be fetched unconditionally in cbk since it is set unconditionally in the wind path, failing which, lookup would be unwound with ia_size and ia_blocks only equal to that of the base file. Change-Id: I6c2bc770f1bcdad51c273c777ae0b42c88c53f61 BUG: 1222379 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10809 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Skip block count and size update for directoriesKrutika Dhananjay2015-05-271-0/+2
| | | | | | | | | | Change-Id: Iaa7022c95a8d9c9c471db025ec644e0bcc4eeb29 BUG: 1221104 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10772 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Implement [f]truncate fopsKrutika Dhananjay2015-05-071-307/+799
| | | | | | | | | | | | | | To-Do: * Make ftruncate work even in the absence of path * Aggregate and update ia_blocks appropriately when a file is truncated to a lower size. Change-Id: Ifd24c2f5e80d2c3bc921261f5481251df8948126 BUG: 1207615 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10631 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Implement readv() fopKrutika Dhananjay2015-05-051-269/+628
| | | | | | | | | Change-Id: I4cc060710482de8633141170dd35f669f01f639b BUG: 1207615 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10528 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Add "is-directory" checks in stat/fstatKrutika Dhananjay2015-05-031-0/+12
| | | | | | | | | | | | | | | | During mount, NFS directly calls stat on the root of the volume without sending a lookup on it. This was causing inode_ctx_get_block_size() to fail on /. A check is now added in [f]stat which would ensure no action is taken by shard xlator when the operation is on a directory. Change-Id: I81849eeddfdad9f271155442408d95b4a25d7647 BUG: 1207615 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10427 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Take hole size into account while computing ia_sizeKrutika Dhananjay2015-05-021-2/+9
| | | | | | | | | | Change-Id: I1a90ad6669c1cb79aaae6b4bd9621c75d9985c8a BUG: 1207615 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10446 Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Implement rename() fopKrutika Dhananjay2015-04-291-44/+254
| | | | | | | | | | | Change-Id: Ia39fa0f29eac84c18d13a94f704b1703565b504e BUG: 1207615 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10373 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Implement unlink fopKrutika Dhananjay2015-04-251-233/+489
| | | | | | | | | | | Change-Id: I5cdd805821a4f3657f490223b97f42c724ee588f BUG: 1207615 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10249 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Consume size and block count in metadata read opsKrutika Dhananjay2015-04-211-96/+263
| | | | | | | | | | | | | | | | Metadata read fops like lookup, stat etc will now fetch the xattr that holds the size and block count information, extract the size and block count fields and set them in respective stbuf before unwinding the resultant iatt to the parent xlator. Change-Id: I881be8955092fa6b75f8b0e4f3deb01344cb638e BUG: 1207603 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10098 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Introduce file size xattrKrutika Dhananjay2015-04-081-43/+237
| | | | | | | | | | | | | | | With each inode write FOP, the size and block count of the file will be updated within the xattr. There are two 64 byte fields that are intentionally left blank for now for future use when consistency guarantee is introduced later in sharding. Change-Id: I40a2e700150c1f199a6bf87909f063c84ab7bb43 BUG: 1207603 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10097 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/shard: Refactor codeKrutika Dhananjay2015-04-061-48/+20
| | | | | | | | | | | | | | | | | * Renamed shard_writev_create_write_shards() to shard_common_resolve_shards() to appropriately reflect its functionality and for reuse in other fops too. * Move code common to MKNOD and CREATE into a macro. * Cut down on if nesting in shard_lookup_cbk() Change-Id: I488255499673accd426390c6d42f2b39bab3d637 BUG: 1205661 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10096 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Avoid conflict between contrib/uuid and system uuidEmmanuel Dreyfus2015-04-041-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | glusterfs relies on Linux uuid implementation, which API is incompatible with most other systems's uuid. As a result, libglusterfs has to embed contrib/uuid, which is the Linux implementation, on non Linux systems. This implementation is incompatible with systtem's built in, but the symbols have the same names. Usually this is not a problem because when we link with -lglusterfs, libc's symbols are trumped. However there is a problem when a program not linked with -lglusterfs will dlopen() glusterfs component. In such a case, libc's uuid implementation is already loaded in the calling program, and it will be used instead of libglusterfs's implementation, causing crashes. A possible workaround is to use pre-load libglusterfs in the calling program (using LD_PRELOAD on NetBSD for instance), but such a mechanism is not portable, nor is it flexible. A much better approach is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any possible conflict. This is what this change attempts. BUG: 1206587 Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10017 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* features/shard: Create and use xattr_req dict as and when neededKrutika Dhananjay2015-04-041-5/+37
| | | | | | | | | | | | | | | | | Reusing local->xattr_req for the several calls and callbacks per xlator fop would cause keys set from previous call/cbk (sometimes even by the xlators below) to remain which in some cases can lead to errors. For instance, the presence of "trusted.glusterfs.dht.*" keys (which are remnants of the previous call/cbk), can cause the GF_IF_INTERNAL_XATTR_GOTO() check in DHT to fail when the same dict is used to wind [f]setxattr. Change-Id: I8612d020f83f3dc55e4a34d10ccbdaf11d7b4fdd BUG: 1205661 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10095 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Fail writes if /.shard already exists as a fileKrutika Dhananjay2015-04-031-0/+8
| | | | | | | | | | Change-Id: Id7250ca4637c37a005cf2def43d5b843c1ea6562 BUG: 1205661 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10094 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Bug fixesKrutika Dhananjay2015-03-271-27/+47
| | | | | | | | | | | | | | | | | * Return number of bytes written in writev cbk on success * Eliminate separate inode table for sharding xlator. * Fix appearance of "shard" as an option within the volfile for subvolume of type features/shard. * Fix values of min and max allowed shard block size * Return @new as opposed to NULL in shard_create_gfid_dict() on success Change-Id: I6319d377a196d1c5ceed1f65d337ff8eabcb21f8 BUG: 1205661 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10003 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Introducing sharding translatorKrutika Dhananjay2015-03-191-0/+1791
Based on the high-level design by Anand V. Avati which can be found @ https://gist.github.com/avati/af04f1030dcf52e16535#sharding-xlator-stripe-20 Still to-do: * complete implementation of inode write fops - [f]truncate, zerofill, fallocate, discard * introduce transaction mechanism in inode write fops * complete readv * Handle open with O_TRUNC * Handle unlinking of all shards during unlink/rename * Compute total ia_size and ia_blocks in lookup, readdirp, etc * wind fsync/flush on all shards Note: Most of the items above are related. Once we come up with a clean way to determine the last shard/shard count for a file/file size and the mgmt of sparse regions of the file, implementing them becomes trivial. Change-Id: Id871379b53a4a916e4baa2e06f197dd8c0043b0f BUG: 1200082 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9841 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>