summaryrefslogtreecommitdiffstats
path: root/xlators/features/shard
Commit message (Collapse)AuthorAgeFilesLines
* features/shard: optimization over shard lookup in case of preallocVinayakswami Hariharmath2020-08-201-7/+39
| | | | | | | | | | | | | | | | | | | | | | | | | Assume that we are preallocating a VM of size 1TB with a shard block size of 64MB then there will be ~16k shards. This creation happens in 2 steps shard_fallocate() path i.e 1. lookup for the shards if any already present and 2. mknod over those shards do not exist. But in case of fresh creation, we dont have to lookup for all shards which are not present as the the file size will be 0. Through this, we can save lookup on all shards which are not present. This optimization is quite useful in the case of preallocating big vm. Also if the file is already present and the call is to extend it to bigger size then we need not to lookup for non- existent shards. Just lookup preexisting shards, populate the inodes and issue mknod on extended size. Fixes: #1425 Change-Id: I60036fe8302c696e0ca80ff11ab0ef5bcdbd7880 Signed-off-by: Vinayakswami Hariharmath <vharihar@redhat.com>
* features/shard: Convert shard block indices to uint64Krutika Dhananjay2020-07-082-7/+10
| | | | | | | | | | | | | | This patch fixes a crash in FOPs that operate on really large sharded files where number of participant shards could sometimes exceed signed int32 max. The patch also adds GF_ASSERTs to ensure that number of participating shards is always greater than 0 for files that do have more than one shard. Change-Id: I354de58796f350eb1aa42fcdf8092ca2e69ccbb6 Fixes: #1348 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Use fd lookup post file openVinayakswami Hariharmath2020-06-111-43/+76
| | | | | | | | | | | | | | | Issue: When a process has the open fd and the same file is unlinked in middle of the operations, then file based lookup fails with ENOENT or stale file Solution: When the file already open and fd is available, use fstat to get the file attributes Change-Id: I0e83aee9f11b616dcfe13769ebfcda6742e4e0f4 Fixes: #1281 Signed-off-by: Vinayakswami Hariharmath <vharihar@redhat.com>
* features/shard: Aggregate file size, block-count before unwinding removexattrKrutika Dhananjay2020-05-262-70/+196
| | | | | | | | | | | | | Posix translator returns pre and postbufs in the dict in {F}REMOVEXATTR fops. These iatts are further cached at layers like md-cache. Shard translator, in its current state, simply returns these values without updating the aggregated file size and block-count. This patch fixes this problem. Change-Id: I4b2dd41ede472c5829af80a67401ec5a6376d872 Fixes: #1243 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Aggregate size, block-count in iatt before unwinding setxattrKrutika Dhananjay2020-05-211-17/+191
| | | | | | | | | | | | | Posix translator returns pre and postbufs in the dict in {F}SETXATTR fops. These iatts are further cached at layers like md-cache. Shard translator, in its current state, simply returns these values without updating the aggregated file size and block-count. This patch fixes this problem. Change-Id: I4da0eceb4235b91546df79270bcc0af8cd64e9ea Fixes: #1243 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Fix crash during shards cleanup in error casesKrutika Dhananjay2020-03-261-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | A crash is seen during a reattempt to clean up shards in background upon remount. And this happens even on remount (which means a remount is no workaround for the crash). In such a situation, the in-memory base inode object will not be existent (new process, non-existent base shard). So local->resolver_base_inode will be NULL. In the event of an error (in this case, of space running out), the process would crash at the time of logging the error in the following line - gf_msg(this->name, GF_LOG_ERROR, local->op_errno, SHARD_MSG_FOP_FAILED, "failed to delete shards of %s", uuid_utoa(local->resolver_base_inode->gfid)); Fixed that by using local->base_gfid as the source of gfid when local->resolver_base_inode is NULL. Change-Id: I0b49f2b58becd0d8874b3d4b14ff8d92a89d02d5 Fixes: #1127 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* multiple: fix bad type castXavi Hernandez2020-01-101-1/+2
| | | | | | | | | | | | When using inode_ctx_get() or inode_ctx_set(), a 'uint64_t *' is expected. In many cases, the value to retrieve or store is a pointer, which will be of smaller size in some architectures (for example 32-bits). In this case, directly passing the address of the pointer casted to an 'uint64_t *' is wrong and can cause memory corruption. Change-Id: Iae616da9dda528df6743fa2f65ae5cff5ad23258 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Fixes: bz#1785611
* tests/shard: fix tests/bugs/shard/unlinks-and-renames.t failureSheetal Pamecha2019-11-011-0/+1
| | | | | | | | | | | | | | | | | | | | on rhel8 machine cleanup of shards is not happening properly for a sharded file with hard-links. It needs to refresh the hard link count to make it successful The problem occurs when a sharded file with hard-links gets removed. When the last link file is removed, all shards need to be cleaned up. But in the current code structure shard xlator, instead of sending a lookup to get the link count uses stale cache values of inodectx. Therby removing the base shard but not the shards present in /.shard directory. This fix will make sure that it marks in the first unlink's callback that the inode ctx needs a refresh so that in the next operation, it will be refreshed by looking up the file on-disk. fixes: bz#1764110 Change-Id: I81625c7451dabf006c0864d859b1600f3521b648 Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
* afr: support split-brain CLI for replica 3Ravishankar N2019-10-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | Ever since we added quorum checks for lookups in afr via commit bd44d59741bb8c0f5d7a62c5b1094179dd0ce8a4, the split-brain resolution commands would not work for replica 3 because there would be no readables for the lookup fop. The argument was that split-brains do not occur in replica 3 but we do see (data/metadata) split-brain cases once in a while which indicate that there are a few bugs/corner cases yet to be discovered and fixed. Fortunately, commit 8016d51a3bbd410b0b927ed66be50a09574b7982 added GF_CLIENT_PID_GLFS_HEALD as the pid for all fops made by glfsheal. If we leverage this and allow lookups in afr when pid is GF_CLIENT_PID_GLFS_HEALD, split-brain resolution commands will work for replica 3 volumes too. Likewise, the check is added in shard_lookup as well to permit resolving split-brains by specifying "/.shard/shard-file.xx" as the file name (which previously used to fail with EPERM). Change-Id: I3c543dea79caf7cfbc1633e9089cb1cdd2538ba9 Fixes: bz#1756938 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* features/shard: Send correct size when reads are sent beyond file sizeKrutika Dhananjay2019-08-121-0/+2
| | | | | | Change-Id: I0cebaaf55c09eb1fb77a274268ff564e871b743b fixes bz#1738419 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* graph/shd: Use glusterfs_graph_deactivate to free the xl recMohammed Rafi KC2019-06-271-0/+3
| | | | | | | | | | | | We were using glusterfs_graph_fini to free the xl rec from glusterfs_process_volfp as well as glusterfs_graph_cleanup. Instead we can use glusterfs_graph_deactivate, which is does fini as well as other common rec free. Change-Id: Ie4a5f2771e5254aa5ed9f00c3672a6d2cc8e4bc1 Updates: bz#1716695 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* libglusterfs: cleanup iovec functionsXavi Hernandez2019-06-111-7/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch cleans some iovec code and creates two additional helper functions to simplify management of iovec structures. iov_range_copy(struct iovec *dst, uint32_t dst_count, uint32_t dst_offset, struct iovec *src, uint32_t src_count, uint32_t src_offset, uint32_t size); This function copies up to 'size' bytes from 'src' at offset 'src_offset' to 'dst' at 'dst_offset'. It returns the number of bytes copied. iov_skip(struct iovec *iovec, uint32_t count, uint32_t size); This function removes the initial 'size' bytes from 'iovec' and returns the updated number of iovec vectors remaining. The signature of iov_subset() has also been modified to make it safer and easier to use. The new signature is: iov_subset(struct iovec *src, int src_count, uint32_t start, uint32_t size, struct iovec **dst, int32_t dst_count); This function creates a new iovec array containing the subset of the 'src' vector starting at 'start' with size 'size'. The resulting array is allocated if '*dst' is NULL, or copied to '*dst' if it fits (based on 'dst_count'). It returns the number of iovec vectors used. A new set of functions to iterate through an iovec array have been created. They can be used to simplify the implementation of other iovec-based helper functions. Change-Id: Ia5fe57e388e23392a8d6cdab17670e337cadd587 Updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* features/shard: Fix extra unref when inode object is lru'd out and added backKrutika Dhananjay2019-06-091-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Long tale of double unref! But do read... In cases where a shard base inode is evicted from lru list while still being part of fsync list but added back soon before its unlink, there could be an extra inode_unref() leading to premature inode destruction leading to crash. One such specific case is the following - Consider features.shard-deletion-rate = features.shard-lru-limit = 2. This is an oversimplified example but explains the problem clearly. First, a file is FALLOCATE'd to a size so that number of shards under /.shard = 3 > lru-limit. Shards 1, 2 and 3 need to be resolved. 1 and 2 are resolved first. Resultant lru list: 1 -----> 2 refs on base inode - (1) + (1) = 2 3 needs to be resolved. So 1 is lru'd out. Resultant lru list - 2 -----> 3 refs on base inode - (1) + (1) = 2 Note that 1 is inode_unlink()d but not destroyed because there are non-zero refs on it since it is still participating in this ongoing FALLOCATE operation. FALLOCATE is sent on all participant shards. In the cbk, all of them are added to fync_list. Resulting fsync list - 1 -----> 2 -----> 3 (order doesn't matter) refs on base inode - (1) + (1) + (1) = 3 Total refs = 3 + 2 = 5 Now an attempt is made to unlink this file. Background deletion is triggered. The first $shard-deletion-rate shards need to be unlinked in the first batch. So shards 1 and 2 need to be resolved. inode_resolve fails on 1 but succeeds on 2 and so it's moved to tail of list. lru list now - 3 -----> 2 No change in refs. shard 1 is looked up. In lookup_cbk, it's linked and added back to lru list at the cost of evicting shard 3. lru list now - 2 -----> 1 refs on base inode: (1) + (1) = 2 fsync list now - 1 -----> 2 (again order doesn't matter) refs on base inode - (1) + (1) = 2 Total refs = 2 + 2 = 4 After eviction, it is found 3 needs fsync. So fsync is wound, yet to be ack'd. So it is still inode_link()d. Now deletion of shards 1 and 2 completes. lru list is empty. Base inode unref'd and destroyed. In the next batched deletion, 3 needs to be deleted. It is inode_resolve()able. It is added back to lru list but base inode passed to __shard_update_shards_inode_list() is NULL since the inode is destroyed. But its ctx->inode still contains base inode ptr from first addition to lru list for no additional ref on it. lru list now - 3 refs on base inode - (0) Total refs on base inode = 0 Unlink is sent on 3. It completes. Now since the ctx contains ptr to base_inode and the shard is part of lru list, base shard is unref'd leading to a crash. FIX: When shard is readded back to lru list, copy the base inode pointer as is into its inode ctx, even if it is NULL. This is needed to prevent double unrefs at the time of deleting it. Change-Id: I99a44039da2e10a1aad183e84f644d63ca552462 Updates: bz#1696136 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* across: clang-scan: fix NULL dereferencing warningsAmar Tumballi2019-06-041-10/+18
| | | | | | | | | All these checks are done after analyzing clang-scan report produced by the CI job @ https://build.gluster.org/job/clang-scan updates: bz#1622665 Change-Id: I590305af4ceb779be952974b2a36066ffc4865ca Signed-off-by: Amar Tumballi <amarts@redhat.com>
* features/shard: Fix block-count accounting upon truncate to lower sizeKrutika Dhananjay2019-06-042-13/+49
| | | | | | | | | | | | | | | | | The way delta_blocks is computed in shard is incorrect, when a file is truncated to a lower size. The accounting only considers change in size of the last of the truncated shards. FIX: Get the block-count of each shard just before an unlink at posix in xdata. Their summation plus the change in size of last shard (from an actual truncate) is used to compute delta_blocks which is used in the xattrop for size update. Change-Id: I9128a192e9bf8c3c3a959e96b7400879d03d7c53 fixes: bz#1705884 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Fix crash during background shard deletion in a specific caseKrutika Dhananjay2019-05-161-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider the following case - 1. A file gets FALLOCATE'd such that > "shard-lru-limit" number of shards are created. 2. And then it is deleted after that. The unique thing about FALLOCATE is that unlike WRITE, all of the participant shards are resolved and created and fallocated in a single batch. This means, in this case, after the first "shard-lru-limit" number of shards are resolved and added to lru list, as part of resolution of the remaining shards, some of the existing shards in lru list will need to be evicted. So these evicted shards will be inode_unlink()d as part of eviction. Now once the fop gets to the actual FALLOCATE stage, the lru'd-out shards get added to fsync list. 2 things to note at this point: i. the lru'd out shards are only part of fsync list, so each holds 1 ref on base shard ii. and the more recently used shards are part of both fsync and lru list. So each of these shards holds 2 refs on base inode - one for being part of fsync list, and the other for being part of lru list. FALLOCATE completes successfully and then this very file is deleted, and background shard deletion launched. Here's where the ref counts get mismatched. First as part of inode_resolve()s during the deletion, the lru'd-out inodes return NULL, because they are inode_unlink()'d by now. So these inodes need to be freshly looked up. But as part of linking them in lookup_cbk (precisely in shard_link_block_inode()), inode_link() returns the lru'd-out inode object. And its inode ctx is still valid and ctx->base_inode valid from the last time it was added to list. But shard_common_lookup_shards_cbk() passes NULL in the place of base_pointer to __shard_update_shards_inode_list(). This means, as part of adding the lru'd out inode back to lru list, base inode is not ref'd since its NULL. Whereas post unlinking this shard, during shard_unlink_block_inode(), ctx->base_inode is accessible and is unref'd because the shard was found to be part of LRU list, although the matching ref didn't occur. This at some point leads to base_inode refcount becoming 0 and it getting destroyed and released back while some of its associated shards are continuing to be unlinked in parallel and the client crashes whenever it is accessed next. Fix is to pass base shard correctly, if available, in shard_link_block_inode(). Also, the patch fixes the ret value check in tests/bugs/shard/shard-fallocate.c Change-Id: Ibd0bc4c6952367608e10701473cbad3947d7559f Updates: bz#1696136 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Fix integer overflow in block count accountingKrutika Dhananjay2019-05-061-1/+1
| | | | | | | | ... by holding delta_blocks in 64-bit int as opposed to 32-bit int. Change-Id: I2c1ddab17457f45e27428575ad16fa678fd6c0eb updates: bz#1705884 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* shard: fix crash caused by using null inodeKinglong Mee2019-03-141-4/+3
| | | | | | Change-Id: I156bf962223304e586b83a36be59a0ca74589b43 Updates: bz#1688287 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* features/shard: Ref shard inode while adding to fsync listKrutika Dhananjay2019-01-241-8/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: Lot of the earlier changes in the management of shards in lru, fsync lists assumed that if a given shard exists in fsync list, it must be part of lru list as well. This was found to be not true. Consider this - a file is FALLOCATE'd to a size which would make the number of participant shards to be greater than the lru list size. In this case, some of the resolved shards that are to participate in this fop will be evicted from lru list to give way to the rest of the shards. And once FALLOCATE completes, these shards are added to fsync list but without a ref. After the fop completes, these shard inodes are unref'd and destroyed while their inode ctxs are still part of fsync list. Now when an FSYNC is called on the base file and the fsync-list traversed, the client crashes due to illegal memory access. FIX: Hold a ref on the shard inode when adding to fsync list as well. And unref under following conditions: 1. when the shard is evicted from lru list 2. when the base file is fsync'd 3. when the shards are deleted. Change-Id: Iab460667d091b8388322f59b6cb27ce69299b1b2 fixes: bz#1669077 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Fix launch of multiple synctasks for background deletionKrutika Dhananjay2019-01-112-71/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: When multiple sharded files are deleted in quick succession, multiple issues were observed: 1. misleading logs corresponding to a sharded file where while one log message said the shards corresponding to the file were deleted successfully, this was followed by multiple logs suggesting the very same operation failed. This was because of multiple synctasks attempting to clean up shards of the same file and only one of them succeeding (the one that gets ENTRYLK successfully), and the rest of them logging failure. 2. multiple synctasks to do background deletion would be launched, one for each deleted file but all of them could readdir entries from .remove_me at the same time could potentially contend for ENTRYLK on .shard for each of the entry names. This is undesirable and wasteful. FIX: Background deletion will now follow a state machine. In the event that there are multiple attempts to launch synctask for background deletion, one for each file deleted, only the first task is launched. And if while this task is doing the cleanup, more attempts are made to delete other files, the state of the synctask is adjusted so that it restarts the crawl even after reaching end-of-directory to pick up any files it may have missed in the previous iteration. This patch also fixes uninitialized lk-owner during syncop_entrylk() which was leading to multiple background deletion synctasks entering the critical section at the same time and leading to illegal memory access of base inode in the second syntcask after it was destroyed post shard deletion by the first synctask. Change-Id: Ib33773d27fb4be463c7a8a5a6a4b63689705324e updates: bz#1662368 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Assign fop id during background deletion to prevent ↵Krutika Dhananjay2019-01-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | excessive logging ... of the kind "[2018-12-26 05:22:44.195019] E [MSGID: 133010] [shard.c:2253:shard_common_lookup_shards_cbk] 0-volume1-shard: Lookup on shard 785 failed. Base file gfid = cd938e64-bf06-476f-a5d4-d580a0d37416 [No such file or directory]" shard_common_lookup_shards_cbk() has a specific check to ignore ENOENT error without logging them during specific fops. But because background deletion is done in a new frame (with local->fop being GF_FOP_NULL), the ENOENT check is skipped and the absence of shards gets logged everytime. To fix this, local->fop is initialized to GF_FOP_UNLINK during background deletion. Change-Id: I0ca8d3b3bfbcd354b4a555eee520eb0479bcda35 updates: bz#1662368 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* New xlator option to control enable/disable of xlators in Gd2Aravinda VK2018-12-071-0/+8
| | | | | | | | | | | | | Since glusterd2 don't maintain the xlator option details in code, it directly reads the xlators options table from `*.so` files. To support enable and disable of xlator new option added to the option table with the name same as xlator name itself. This change will not affect the functionality with glusterd1. Change-Id: I23d9e537f3f422de72ddb353484466d3519de0c1 updates: #302 Signed-off-by: Aravinda VK <avishwan@redhat.com>
* all: add xlator_api to many translatorsAmar Tumballi2018-12-061-0/+14
| | | | | | Fixes: #164 Change-Id: I93ad6f0232a1dc534df099059f69951e1339086f Signed-off-by: Amar Tumballi <amarts@redhat.com>
* libglusterfs: Move devel headers under glusterfs directoryShyamsundarR2018-12-054-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | libglusterfs devel package headers are referenced in code using include semantics for a program, this while it works can be better especially when dealing with out of tree xlator builds or in general out of tree devel package usage. Towards this, the following changes are done, - moved all devel headers under a glusterfs directory - Included these headers using system header notation <> in all code outside of libglusterfs - Included these headers using own program notation "" within libglusterfs This change although big, is just moving around the headers and making it correct when including these headers from other sources. This helps us correctly include libglusterfs includes without namespace conflicts. Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b Updates: bz#1193929 Signed-off-by: ShyamsundarR <srangana@redhat.com>
* feature/shard: Fix coverity issue - Use after freeSusant Palai2018-11-141-1/+1
| | | | | | | | CID: 1325524 Change-Id: Ic713285bd9e76d8e4dc1815aa471087d279008b5 updates: bz#789278 Signed-off-by: Susant Palai <spalai@redhat.com>
* all: fix the format string exceptionsAmar Tumballi2018-11-051-2/+2
| | | | | | | | | | | | | | | | Currently, there are possibilities in few places, where a user-controlled (like filename, program parameter etc) string can be passed as 'fmt' for printf(), which can lead to segfault, if the user's string contains '%s', '%d' in it. While fixing it, makes sense to make the explicit check for such issues across the codebase, by making the format call properly. Fixes: CVE-2018-14661 Fixes: bz#1644763 Change-Id: Ib547293f2d9eb618594cbff0df3b9c800e88bde4 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* shard : fix coverity issue in shard.cSunny Kumar2018-10-251-4/+12
| | | | | | | | | | | | This patch fixes CID: 1394664 : CHECKED_RETURN 1356534 : Macro compares unsigned to 0 (NO_EFFECT) 1356532 : Macro compares unsigned to 0 (NO_EFFECT) updates: bz#789278 Change-Id: I04d64fd8c007627611710dc56109b76eeb59333a Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* shard : fix newly introduced coveritySunny Kumar2018-10-181-1/+2
| | | | | | | | | This patch fixes CID: 1396177: NULL dereference. updates: bz#789278 Change-Id: Ic5d302a5e32d375acf8adc412763ab94e6dabc3d Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* features/shard: Hold a ref on base inode when adding a shard to lru listKrutika Dhananjay2018-10-161-14/+34
| | | | | | | | | | | | | | | | | | | | | | | | | In __shard_update_shards_inode_list(), previously shard translator was not holding a ref on the base inode whenever a shard was added to the lru list. But if the base shard is forgotten and destroyed either by fuse due to memory pressure or due to the file being deleted at some point by a different client with this client still containing stale shards in its lru list, the client would crash at the time of locking lru_base_inode->lock owing to illegal memory access. So now the base shard is ref'd into the inode ctx of every shard that is added to lru list until it gets lru'd out. The patch also handles the case where none of the shards associated with a file that is about to be deleted are part of the LRU list and where an unlink at the beginning of the operation destroys the base inode (because there are no refkeepers) and hence all of the shards that are about to be deleted will be resolved without the existence of a base shard in-memory. This, if not handled properly, could lead to a crash. Change-Id: Ic15ca41444dd04684a9458bd4a526b1d3e160499 updates: bz#1605056 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* all: fix warnings on non 64-bits architecturesXavi Hernandez2018-10-101-8/+8
| | | | | | | | | | When compiling in other architectures there appear many warnings. Some of them are actual problems that prevent gluster to work correctly on those architectures. Change-Id: Icdc7107a2bc2da662903c51910beddb84bdf03c0 fixes: bz#1632717 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* glusterd: Update op-version from 4.2 to 5.0ShyamsundarR2018-09-131-2/+2
| | | | | | | | | | | | Post changing the max op-version to 4.2, after release 4.1 branching, the decision was to go with increasing release numbers. Thus this needs to change to 5.0. This commit addresses the above change. Fixes: bz#1628664 Change-Id: Ifcc0c6da90fdd51e4eceea40749511110a432cce Signed-off-by: ShyamsundarR <srangana@redhat.com>
* Land part 2 of clang-format changesGluster Ant2018-09-121-5518/+5444
| | | | | Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
* Land clang-format changesGluster Ant2018-09-123-315/+307
| | | | Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
* Multiple files: calloc -> mallocYaniv Kaul2018-09-042-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xlators/storage/posix/src/posix-inode-fd-ops.c: xlators/storage/posix/src/posix-helpers.c: xlators/storage/bd/src/bd.c: xlators/protocol/client/src/client-lk.c: xlators/performance/quick-read/src/quick-read.c: xlators/performance/io-cache/src/page.c xlators/nfs/server/src/nfs3-helpers.c xlators/nfs/server/src/nfs-fops.c xlators/nfs/server/src/mount3udp_svc.c xlators/nfs/server/src/mount3.c xlators/mount/fuse/src/fuse-helpers.c xlators/mount/fuse/src/fuse-bridge.c xlators/mgmt/glusterd/src/glusterd-utils.c xlators/mgmt/glusterd/src/glusterd-syncop.h xlators/mgmt/glusterd/src/glusterd-snapshot.c xlators/mgmt/glusterd/src/glusterd-rpc-ops.c xlators/mgmt/glusterd/src/glusterd-replace-brick.c xlators/mgmt/glusterd/src/glusterd-op-sm.c xlators/mgmt/glusterd/src/glusterd-mgmt.c xlators/meta/src/subvolumes-dir.c xlators/meta/src/graph-dir.c xlators/features/trash/src/trash.c xlators/features/shard/src/shard.h xlators/features/shard/src/shard.c xlators/features/marker/src/marker-quota.c xlators/features/locks/src/common.c xlators/features/leases/src/leases-internal.c xlators/features/gfid-access/src/gfid-access.c xlators/features/cloudsync/src/cloudsync-plugins/src/cloudsyncs3/src/libcloudsyncs3.c xlators/features/bit-rot/src/bitd/bit-rot.c xlators/features/bit-rot/src/bitd/bit-rot-scrub.c bxlators/encryption/crypt/src/metadata.c xlators/encryption/crypt/src/crypt.c xlators/performance/md-cache/src/md-cache.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible It doesn't make sense to calloc (allocate and clear) memory when the code right away fills that memory with data. It may be optimized by the compiler, or have a microscopic performance improvement. In some cases, also changed allocation size to be sizeof some struct or type instead of a pointer - easier to read. In some cases, removed redundant strlen() calls by saving the result into a variable. 1. Only done for the straightforward cases. There's room for improvement. 2. Please review carefully, especially for string allocation, with the terminating NULL string. Only compile-tested! .. and allocate memory as much as needed. xlators/nfs/server/src/mount3.c : Don't blindly allocate PATH_MAX, but strlen() the string and allocate appropriately. Also, align error messges. updates: bz#1193929 Original-Author: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: Ibda6f33dd180b7f7694f20a12af1e9576fe197f5
* xlators: move from strlen() to sizeof()Yaniv Kaul2018-08-311-2/+2
| | | | | | | | | | | | | | | | | | xlators/features/index/src/index.c xlators/features/shard/src/shard.c xlators/features/upcall/src/upcall-internal.c xlators/mgmt/glusterd/src/glusterd-bitrot.c xlators/mgmt/glusterd/src/glusterd-locks.c xlators/mgmt/glusterd/src/glusterd-mountbroker.c xlators/mgmt/glusterd/src/glusterd-op-sm.c For const strings, just do compile time size calc instead of runtime. Compile-tested only! Change-Id: I995b2b89f14454b3855a4cd0ca90b3f01d5e080f updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* feature/shard: Fix Coverity issueAshish Pandey2018-08-271-7/+5
| | | | | | | | | | | | | | | | | | | | | Fix following coverity issues- CID: 1394660 1394668 1394667 1389008 1389434 https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=84880983&defectInstanceId=25821108&mergedDefectId=1389008 https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=84880983&defectInstanceId=25821101&mergedDefectId=1389434 https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=84880983&defectInstanceId=25821001&mergedDefectId=1394660 https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=84880983&defectInstanceId=25821010&mergedDefectId=1394667 https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=84880983&defectInstanceId=25821017&mergedDefectId=1394668 Change-Id: I08f09649dbe758ba0d367ae5330b48b18784dec3 updates: bz#789278 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* features/shard: Fix crash and test case in RENAME fopKrutika Dhananjay2018-08-141-2/+5
| | | | | | | | | | | | | | Setting the refresh flag in inode ctx in shard_rename_src_cbk() is applicable only when the dst file exists and is sharded and has a hard link > 1 at the time of rename. But this piece of code is exercised even when dst doesn't exist. In this case, the mount crashes because local->int_inodelk.loc.inode is NULL. Change-Id: Iaf85a5ee3dff8b01a76e11972f10f2bb9dcbd407 Updates: bz#1611692 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Make lru limit of inode list configurableKrutika Dhananjay2018-07-232-4/+21
| | | | | | | | | | | | | | | Currently this lru limit is hard-coded to 16384. This patch makes it configurable to make it easier to hit the lru limit and enable testing of different cases that arise when the limit is reached. The option is features.shard-lru-limit. It is by design allowed to be configured only in init() but not in reconfigure(). This is to avoid all the complexity associated with eviction of least recently used shards when the list is shrunk. Change-Id: Ifdcc2099f634314fafe8444e2d676e192e89e295 updates: bz#1605056 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Perform shards deletion in the backgroundKrutika Dhananjay2018-06-203-159/+685
| | | | | | | | | | | | | | | A synctask is created that would scan the indices from .shard/.remove_me, to delete the shards associated with the gfid corresponding to the index bname and the rate of deletion is controlled by the option features.shard-deletion-rate whose default value is 100. The task is launched on two accounts: 1. when shard receives its first-ever lookup on the volume 2. when a rename or unlink deleted an inode Change-Id: Ia83117230c9dd7d0d9cae05235644f8475e97bc3 updates: bz#1568521 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Introducing ".shard/.remove_me" for atomic shard deletion ↵Krutika Dhananjay2018-06-134-421/+1040
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | (part 1) PROBLEM: Shards are deleted synchronously when a sharded file is unlinked or when a sharded file participating as the dst in a rename() is going to be replaced. The problem with this approach is it makes the operation really slow, sometimes causing the application to time out, especially with large files. SOLUTION: To make this operation atomic, we introduce a ".remove_me" directory. Now renames and unlinks will simply involve two steps: 1. creating an empty file under .remove_me named after the gfid of the file participating in unlink/rename 2. carrying out the actual rename/unlink A synctask is created (more on that in part 2) to scan this directory after every unlink/rename operation (or upon a volume mount) and clean up all shards associated with it. All of this happens in the background. The task takes care to delete the shards associated with the gfid in .remove_me only if this gfid doesn't exist in backend, ensuring that the file was successfully renamed/unlinked and its shards can be discarded now safely. Change-Id: Ia1d238b721a3e99f951a73abbe199e4245f51a3a updates: bz#1568521 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Fix missing unlock in shard_fsync_shards_cbk()Vijay Bellur2018-06-011-0/+1
| | | | | | | updates: bz#789278 Change-Id: I745a98e957cf3c6ba69247fcf6b58dd05cf59c3c Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Add option to barrier parallel lookup and unlink of shardsKrutika Dhananjay2018-04-232-28/+89
| | | | | | | | | Also move the common parallel unlink callback for GF_FOP_TRUNCATE and GF_FOP_FTRUNCATE into a separate function. Change-Id: Ib0f90a5f62abdfa89cda7bef9f3ff99f349ec332 updates: bz#1568521 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Make operations on internal directories genericKrutika Dhananjay2018-04-182-92/+206
| | | | | | Change-Id: Iea7ad2102220c6d415909f8caef84167ce2d6818 updates: bz#1568521 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Do list_del_init() while list memory is validPranith Kumar K2018-03-201-1/+1
| | | | | | | | | | | | | | | | | Problem: shard_post_lookup_fsync_handler() goes over the list of inode-ctx that need to be fsynced and in cbk it removes each of the inode-ctx from the list. When the first member of list is removed it tries to modifies list head's memory with the latest next/prev and when this happens, there is no guarantee that the list-head which is from stack memory of shard_post_lookup_fsync_handler() is valid. Fix: Do list_del_init() in the loop before winding fsync. BUG: 1557876 Change-Id: If429d3634219e1a435bd0da0ed985c646c59c2ca Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* features/shard: Upon FSYNC from upper layers, wind fsync on all changed shardsKrutika Dhananjay2018-03-053-38/+504
| | | | | | Change-Id: Ib74354f57a18569762ad45a51f182822a2537421 BUG: 1468483 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Fix shard inode refcount when it's part of priv->lru_list.Krutika Dhananjay2018-03-021-9/+17
| | | | | | | | | | | For as long as a shard's inode is in priv->lru_list, it should have a non-zero ref-count. This patch achieves it by taking a ref on the inode when it is added to lru list. When it's time for the inode to be evicted from the lru list, a corresponding unref is done. Change-Id: I289ffb41e7be5df7489c989bc1bbf53377433c86 BUG: 1468483 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterfsd: Memleak in glusterfsd process while brick mux is onMohit Agrawal2018-02-271-0/+3
| | | | | | | | | | | | | | | | | | Problem: At the time of stopping the volume while brick multiplex is enabled memory is not cleanup from all server side xlators. Solution: To cleanup memory for all server side xlators call fini in glusterfs_handle_terminate after send GF_EVENT_CLEANUP notification to top xlator. BUG: 1544090 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Note: Run all test-cases in separate build (https://review.gluster.org/19574) with same patch after enable brick mux forcefully, all test cases are passed. Change-Id: Ia10dc7f2605aa50f2b90b3fe4eb380ba9299e2fc
* features/shard: Leverage block_num info in inode-ctx in read callbackKrutika Dhananjay2018-02-271-18/+3
| | | | | | | | | ... instead of adding this information in fd_ctx in call path and retrieving it again in the callback. Change-Id: Ibbddbbe85baadb7e24aacf5ec8a1250d493d7800 BUG: 1468483 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Pass the correct block-num to store in inode ctxKrutika Dhananjay2018-02-271-1/+1
| | | | | | Change-Id: Icf3a5d0598a081adb7d234a60bd15250a5ce1532 BUG: 1468483 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: shard options for GD2.Krutika Dhananjay2018-01-261-0/+3
| | | | | | | Updates #302 Change-Id: Ife21440ffcf5805ce5858360dc94a456ead891e5 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>