summaryrefslogtreecommitdiffstats
path: root/xlators/features/shard/src/shard.h
Commit message (Collapse)AuthorAgeFilesLines
* features/changelog: Capture FXATTROP and XATTROP in changelogKotresh HR2015-11-241-2/+0
| | | | | | | | | | | | | | | | | GEO-REP INTEROP WITH SHARD FEATURE shard xlator updates size of the file using FXATTROP or XATTROP. Hence record the same in changelog. BUG: 1284453 Change-Id: I2f14b6075f863c0bed3ee2a159085b9b536a756d Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/12225 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/12732 Reviewed-by: Aravinda VK <avishwan@redhat.com>
* features/shard: Force cache-refresh when lookup/readdirp/stat detect that ↵Krutika Dhananjay2015-10-291-0/+2
| | | | | | | | | | | | | | xattr value has changed Backport of: http://review.gluster.org/12400 Change-Id: Ifa51979bc530e31d36781759ca62bcac1de7af24 BUG: 1274600 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12457 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Regulate memory consumption by individual shards' inode_t ↵Krutika Dhananjay2015-10-081-1/+10
| | | | | | | | | | | | | | | | | | | | objects Backport of: http://review.gluster.org/#/c/12254/ Shard translator will now maintain an lru list of inodes associated with individual shards of constant size, and will make sure that at no point the number of these inodes will exceed the configured limit. This is to keep the memory consumption by the thousands of shards of every large file from exploding. Change-Id: I7290f7cf1d76d5545ef04dc061c170f9f27329d2 BUG: 1269730 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12313 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Port log messages to new frameworkKrutika Dhananjay2015-09-291-4/+8
| | | | | | | | | | | | Backport of: http://review.gluster.org/12217 Change-Id: I129ff0d6d1cab15078fa474132c290950f5e1137 BUG: 1266822 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12236 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* features/shard: Performance improvements in IO pathKrutika Dhananjay2015-09-271-3/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/12126/ This is patch 1/2 of the performance improvement work for sharding in the IO path. What this patch does: Since the primary use-case where sharding is targeted - VM store - is a single-writer workload, instead of performing lookup on the base file everytime to gather the size and block count from the backend in reads, writes and truncate, now the size and block count is also cached and kept up-to-date after every inode write in the inode ctx. TO-DO: Make changes in rename, link, unlink, [f]setattr and [f]stat to keep the relevant iatt members up-to-date in the inode ctx. Change-Id: Id4f5c33044411b87b55968083a70a0a11a335ab2 BUG: 1261716 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12213 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Filter internal shard xattrs in {get,remove,set}xattrKrutika Dhananjay2015-09-131-0/+1
| | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/12121/ and http://review.gluster.org/#/c/12136/ Change-Id: Ifadebe1cda80fa4a525605e10513e6e64ee72709 BUG: 1261008 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12127 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* features/shard: Fix permission issuesKrutika Dhananjay2015-08-311-0/+18
| | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/11992 This patch does the following: * reverts commit b467af0e99b39ef708420d3f7f6696b0ca618512 * changes ownership on shards under /.shard to be root:root * makes readv, writev, [f]truncate, rename, and unlink fops to perform operations on files under /.shard with frame->root->{uid,gid} as 0. This would ensure that a [f]setattr on a sharded file does not need to be called on all the shards associated with it. Change-Id: I50d8533bd2b769a4dfe8cd1b49bdcfc117a7e660 BUG: 1253151 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12052 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* features/shard: Fix size update for writes at hole regionKrutika Dhananjay2015-08-291-0/+1
| | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/12020 Change-Id: I4b6e9101ccb881d3d285704902484e1e89ccaceb BUG: 1257204 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12026 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/shard: Ensure shards are owned by the same owner/group as the ↵Krutika Dhananjay2015-08-191-0/+3
| | | | | | | | | | | | | | original file Backport of: http://review.gluster.org/#/c/11874/ Change-Id: I8c905917430c32eb2e6573968722920d3b27fb55 BUG: 1253151 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/11927 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Use xattrop (as opposed to setxattr) for updates to size xattrKrutika Dhananjay2015-07-211-1/+3
| | | | | | | | | | | | Backport of: http://review.gluster.org/11467 Change-Id: I9effecbb1296d11cf1629b5e5cc38192f84cfcb3 BUG: 1243655 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/11689 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Fix incorrect parameter to get_lowest_block()Krutika Dhananjay2015-06-031-2/+3
| | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/10804 Due to get_lowest_block() being a macro, what needs to be passed to it is the evaluation of the expression (local->offset - 1), without which its substitution can cause junk values to be assigned to local->first_block. This patch also fixes calls to get_highest_block() where if offset and size are both equal to zero, it could return negative values. Change-Id: I8f1bc54b536587d6af3a5c193434d06dccbf76dc BUG: 1227572 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/11051 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Fix issue with readdir(p) fopKrutika Dhananjay2015-06-021-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/10809 Problem: When readdir(p) is performed on '/' and ".shard" happens to be the last of the entries read in a given iteration of dht_readdir(p) (in other words the entry with the highest offset in the dirent list sorted in ascending order of d_offs), shard xlator would delete this entry as part of handling the call so as to avoid exposing its presence to the application. This would cause xlators above (like fuse, readdir-ahead etc) to wind the next readdirp as part of the same req at an offset which is (now) the highest d_off (post deletion of .shard) from the previously unwound list of entries. This offset would be less than that of ".shard" and therefore cause /.shard to be read once again. If by any chance this happens to be the only entry until end-of-directory, shard xlator would delete this entry and unwind with 0 entries, causing the xlator(s) above to think there is nothing more to readdir and the fop is complete. This would prevent DHT from gathering entries from the rest of its subvolumes, causing some entries to disappear. Fix: At the level of shard xlator, if ".shard" happens to be the last entry, make shard xlator wind another readdirp at offset equal to d_off of ".shard". That way, if ".shard" happens to be the only other entry under '/' until end-of-directory, DHT would receive an op_ret=0. This would enable it to wind readdir(p) on the rest of its subvols and gather the complete picture. Also, fixed a bug in shard_lookup_cbk() wherein file_size should be fetched unconditionally in cbk since it is set unconditionally in the wind path, failing which, lookup would be unwound with ia_size and ia_blocks only equal to that of the base file. Change-Id: I0ff0b48b6c9c12edbef947b6840a77a54c131650 BUG: 1226880 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/11031 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* features/shard: Implement [f]truncate fopsKrutika Dhananjay2015-05-081-0/+3
| | | | | | | | | | | | | | | Backport of: http://review.gluster.org/10631 To-Do: * Make ftruncate work even in the absence of path * Aggregate and update ia_blocks appropriately when a file is truncated to a lower size. Change-Id: Icd424430066233ba61a030e72fdddf692d2b3f22 BUG: 1214247 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10638 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Implement readv() fopKrutika Dhananjay2015-05-061-0/+7
| | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/10528/ Change-Id: I3ff03d146a8d49cc11e7bf22ffbf830b4dd1e9f1 BUG: 1214247 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10569 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/shard: Take hole size into account while computing ia_sizeKrutika Dhananjay2015-05-031-0/+1
| | | | | | | | | | | Backport of: http://review.gluster.org/10446 Change-Id: Ic05e07801605c0d610545368a513b56d8df21bf4 BUG: 1214247 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10493 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Implement rename() fopKrutika Dhananjay2015-04-301-0/+7
| | | | | | | | | | | | | Backport of: http://review.gluster.org/10373 Change-Id: I15867667d50b2b4aad0ee3738a29f7a410d61ef4 BUG: 1214247 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10455 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Implement unlink fopKrutika Dhananjay2015-04-261-0/+6
| | | | | | | | | | | | Backport of: http://review.gluster.org/10249 Change-Id: I01761721224c4efbbc5e4992e70ecf68b3868d63 BUG: 1214247 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10377 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Consume size and block count in metadata read opsKrutika Dhananjay2015-04-261-2/+8
| | | | | | | | | | | | | | | | | | Backport of : http://review.gluster.org/10098 Metadata read fops like lookup, stat etc will now fetch the xattr that holds the size and block count information, extract the size and block count fields and set them in respective stbuf before unwinding the resultant iatt to the parent xlator. Change-Id: If7d2c4af886f8d70dc69d7cb09f1f66be391f198 BUG: 1214248 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10331 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Introduce file size xattrKrutika Dhananjay2015-04-081-0/+30
| | | | | | | | | | | | | | | With each inode write FOP, the size and block count of the file will be updated within the xattr. There are two 64 byte fields that are intentionally left blank for now for future use when consistency guarantee is introduced later in sharding. Change-Id: I40a2e700150c1f199a6bf87909f063c84ab7bb43 BUG: 1207603 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10097 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/shard: Refactor codeKrutika Dhananjay2015-04-061-1/+20
| | | | | | | | | | | | | | | | | * Renamed shard_writev_create_write_shards() to shard_common_resolve_shards() to appropriately reflect its functionality and for reuse in other fops too. * Move code common to MKNOD and CREATE into a macro. * Cut down on if nesting in shard_lookup_cbk() Change-Id: I488255499673accd426390c6d42f2b39bab3d637 BUG: 1205661 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10096 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Create and use xattr_req dict as and when neededKrutika Dhananjay2015-04-041-0/+1
| | | | | | | | | | | | | | | | | Reusing local->xattr_req for the several calls and callbacks per xlator fop would cause keys set from previous call/cbk (sometimes even by the xlators below) to remain which in some cases can lead to errors. For instance, the presence of "trusted.glusterfs.dht.*" keys (which are remnants of the previous call/cbk), can cause the GF_IF_INTERNAL_XATTR_GOTO() check in DHT to fail when the same dict is used to wind [f]setxattr. Change-Id: I8612d020f83f3dc55e4a34d10ccbdaf11d7b4fdd BUG: 1205661 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10095 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Bug fixesKrutika Dhananjay2015-03-271-3/+3
| | | | | | | | | | | | | | | | | * Return number of bytes written in writev cbk on success * Eliminate separate inode table for sharding xlator. * Fix appearance of "shard" as an option within the volfile for subvolume of type features/shard. * Fix values of min and max allowed shard block size * Return @new as opposed to NULL in shard_create_gfid_dict() on success Change-Id: I6319d377a196d1c5ceed1f65d337ff8eabcb21f8 BUG: 1205661 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10003 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Introducing sharding translatorKrutika Dhananjay2015-03-191-0/+118
Based on the high-level design by Anand V. Avati which can be found @ https://gist.github.com/avati/af04f1030dcf52e16535#sharding-xlator-stripe-20 Still to-do: * complete implementation of inode write fops - [f]truncate, zerofill, fallocate, discard * introduce transaction mechanism in inode write fops * complete readv * Handle open with O_TRUNC * Handle unlinking of all shards during unlink/rename * Compute total ia_size and ia_blocks in lookup, readdirp, etc * wind fsync/flush on all shards Note: Most of the items above are related. Once we come up with a clean way to determine the last shard/shard count for a file/file size and the mgmt of sparse regions of the file, implementing them becomes trivial. Change-Id: Id871379b53a4a916e4baa2e06f197dd8c0043b0f BUG: 1200082 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9841 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>