path: root/xlators/features
* multiple files: another attempt to remove includes (Yaniv Kaul, 2019-06-14; 13 files, -28/+4)

  There are many include statements that are not needed. A previous, more
  ambitious attempt failed because of the *BSD platforms (see
  https://review.gluster.org/#/c/glusterfs/+/21929/ ), so this is a more
  conservative reduction. It does not solve all of the circular dependencies
  we have; there is just too much to handle reasonably (dht-common.h includes
  dht-lock.h, which includes dht-common.h, ...). But it does reduce some of
  them, and it lowers the overall number of include lines we need to look at
  in the future to understand and fix the mess later on.

  Change-Id: I550cd001bdefb8be0fe67632f783c0ef6bee3f9f
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* upcall: Avoid sending notifications for invalid inodes (Soumya Koduri, 2019-06-14; 1 file, -1/+18)

  For nameless LOOKUPs, the server creates a new inode which remains invalid
  until the fop is successfully processed, after which it is linked into the
  inode table. But if there is an already linked inode for that entry, the
  server discards the newly created inode, which results in an upcall
  notification. This may result in the client being bombarded with
  unnecessary upcalls, affecting performance if the data set is huge.

  This issue can be avoided by looking up and storing the upcall context in
  the original linked inode (if it exists), thus saving up on those extra
  callbacks.

  Change-Id: I044a1737819bb40d1a049d2f53c0566e746d2a17
  fixes: bz#1718338
  Signed-off-by: Soumya Koduri <skoduri@redhat.com>
* libglusterfs: cleanup iovec functions (Xavi Hernandez, 2019-06-11; 2 files, -15/+5)

  This patch cleans some iovec code and creates two additional helper
  functions to simplify management of iovec structures.

  iov_range_copy(struct iovec *dst, uint32_t dst_count, uint32_t dst_offset,
                 struct iovec *src, uint32_t src_count, uint32_t src_offset,
                 uint32_t size);

  This function copies up to 'size' bytes from 'src' at offset 'src_offset'
  to 'dst' at 'dst_offset'. It returns the number of bytes copied.

  iov_skip(struct iovec *iovec, uint32_t count, uint32_t size);

  This function removes the initial 'size' bytes from 'iovec' and returns
  the updated number of iovec vectors remaining.

  The signature of iov_subset() has also been modified to make it safer and
  easier to use. The new signature is:

  iov_subset(struct iovec *src, int src_count, uint32_t start, uint32_t size,
             struct iovec **dst, int32_t dst_count);

  This function creates a new iovec array containing the subset of the 'src'
  vector starting at 'start' with size 'size'. The resulting array is
  allocated if '*dst' is NULL, or copied to '*dst' if it fits (based on
  'dst_count'). It returns the number of iovec vectors used.

  A new set of functions to iterate through an iovec array has also been
  created. They can be used to simplify the implementation of other
  iovec-based helper functions.

  Change-Id: Ia5fe57e388e23392a8d6cdab17670e337cadd587
  Updates: bz#1193929
  Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
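To make the iov_skip() semantics described above concrete, here is a small, self-contained sketch of that behaviour. It is an illustration only, not the libglusterfs implementation; iov_skip_sketch() and the main() driver are invented for the example.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>

/* Drop the first 'size' bytes from an iovec array and return the number
 * of vectors still in use (simplified sketch of the described behaviour). */
static uint32_t
iov_skip_sketch(struct iovec *iov, uint32_t count, uint32_t size)
{
    uint32_t i = 0;

    while (i < count && size > 0) {
        if (iov[i].iov_len <= size) {
            size -= iov[i].iov_len;            /* whole vector consumed */
            i++;
        } else {
            iov[i].iov_base = (char *)iov[i].iov_base + size;
            iov[i].iov_len -= size;            /* partial vector consumed */
            size = 0;
        }
    }

    /* Shift the remaining vectors to the front of the array. */
    if (i > 0)
        memmove(iov, iov + i, (count - i) * sizeof(*iov));

    return count - i;
}

int main(void)
{
    char a[4] = "abcd", b[4] = "efgh";
    struct iovec iov[2] = { { a, 4 }, { b, 4 } };

    uint32_t left = iov_skip_sketch(iov, 2, 6);  /* skip "abcdef" */
    printf("%u vector(s) left, first byte now '%c'\n",
           (unsigned)left, *(char *)iov[0].iov_base);
    return 0;
}
```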
* features/shard: Fix extra unref when inode object is lru'd out and added back (Krutika Dhananjay, 2019-06-09; 1 file, -4/+2)

  Long tale of a double unref! But do read...

  In cases where a shard base inode is evicted from the lru list while still
  being part of the fsync list, but is added back soon before its unlink,
  there could be an extra inode_unref() leading to premature inode
  destruction and a crash.

  One such specific case is the following. Consider
  features.shard-deletion-rate = features.shard-lru-limit = 2. This is an
  oversimplified example, but it explains the problem clearly.

  First, a file is FALLOCATE'd to a size so that the number of shards under
  /.shard = 3 > lru-limit.

  Shards 1, 2 and 3 need to be resolved. 1 and 2 are resolved first.
  Resultant lru list:
      1 -----> 2
  refs on base inode - (1) + (1) = 2

  3 needs to be resolved, so 1 is lru'd out. Resultant lru list:
      2 -----> 3
  refs on base inode - (1) + (1) = 2

  Note that 1 is inode_unlink()d but not destroyed, because there are
  non-zero refs on it since it is still participating in this ongoing
  FALLOCATE operation.

  FALLOCATE is sent on all participant shards. In the cbk, all of them are
  added to the fsync list. Resulting fsync list:
      1 -----> 2 -----> 3 (order doesn't matter)
  refs on base inode - (1) + (1) + (1) = 3
  Total refs = 3 + 2 = 5

  Now an attempt is made to unlink this file. Background deletion is
  triggered. The first $shard-deletion-rate shards need to be unlinked in
  the first batch, so shards 1 and 2 need to be resolved. inode_resolve
  fails on 1 but succeeds on 2, so 2 is moved to the tail of the list.
  lru list now:
      3 -----> 2
  No change in refs.

  Shard 1 is looked up. In lookup_cbk, it is linked and added back to the
  lru list at the cost of evicting shard 3.
  lru list now:
      2 -----> 1
  refs on base inode - (1) + (1) = 2
  fsync list now:
      1 -----> 2 (again, order doesn't matter)
  refs on base inode - (1) + (1) = 2
  Total refs = 2 + 2 = 4

  After eviction, it is found that 3 needs fsync. So fsync is wound, yet to
  be ack'd, and 3 is still inode_link()d.

  Now deletion of shards 1 and 2 completes. The lru list is empty. The base
  inode is unref'd and destroyed.

  In the next batched deletion, 3 needs to be deleted. It is
  inode_resolve()able. It is added back to the lru list, but the base inode
  passed to __shard_update_shards_inode_list() is NULL since the inode is
  destroyed. However, its ctx->inode still contains the base inode ptr from
  the first addition to the lru list, with no additional ref on it.
  lru list now:
      3
  refs on base inode - (0)
  Total refs on base inode = 0

  Unlink is sent on 3. It completes. Now, since the ctx contains a ptr to
  the base_inode and the shard is part of the lru list, the base inode is
  unref'd, leading to a crash.

  FIX:
  When a shard is re-added to the lru list, copy the base inode pointer as
  is into its inode ctx, even if it is NULL. This is needed to prevent
  double unrefs at the time of deleting it.

  Change-Id: I99a44039da2e10a1aad183e84f644d63ca552462
  Updates: bz#1696136
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* uss: Ensure that snapshot is deleted before creating a new snapshot (Raghavendra Bhat, 2019-06-08; 3 files, -3/+17)

  * Also some logging enhancements in snapview-server

  Change-Id: I6a7646771cedf4bd1c62806eea69d720bbaf0c83
  fixes: bz#1715921
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* across: clang-scan: fix NULL dereferencing warnings (Amar Tumballi, 2019-06-04; 6 files, -18/+33)

  All these checks are done after analyzing the clang-scan report produced
  by the CI job at https://build.gluster.org/job/clang-scan

  updates: bz#1622665
  Change-Id: I590305af4ceb779be952974b2a36066ffc4865ca
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* features/shard: Fix block-count accounting upon truncate to lower size (Krutika Dhananjay, 2019-06-04; 2 files, -13/+49)

  The way delta_blocks is computed in shard is incorrect when a file is
  truncated to a lower size. The accounting only considers the change in
  size of the last of the truncated shards.

  FIX:
  Get the block-count of each shard just before an unlink at posix in xdata.
  Their summation, plus the change in size of the last shard (from an actual
  truncate), is used to compute delta_blocks, which is used in the xattrop
  for the size update.

  Change-Id: I9128a192e9bf8c3c3a959e96b7400879d03d7c53
  fixes: bz#1705884
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* lcov: improve line coverage (Amar Tumballi, 2019-06-03; 2 files, -78/+36)

  upcall: remove an extra variable assignment and use just one
  initialization.
  open-behind: reduce the overall number of lines in functions not
  frequently called.
  selinux: reduce some lines in init failure cases.

  updates: bz#1693692
  Change-Id: I7c1de94f2ec76a5bfe1f48a9632879b18e5fbb95
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* across: coverity fixes (Amar Tumballi, 2019-06-03; 2 files, -2/+1)

  * locks/posix.c: key was not freed in one of the cases.
  * locks/common.c: lock was being free'd out of context.
  * nfs/exports: handle case of missing free.
  * protocol/client: handle case of entry not freed.
  * storage/posix: handle possible case of double free.

  CID: 1398628, 1400731, 1400732, 1400756, 1124796, 1325526

  updates: bz#789278
  Change-Id: Ieeaca890288bc4686355f6565f853dc8911344e8
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
  Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
* stack: Make sure to have unique call-stacks in all cases (Pranith Kumar K, 2019-05-30; 1 file, -3/+0)

  At the moment a new stack doesn't populate frame->root->unique in all
  cases. This makes it difficult to debug hung frames by examining
  successive state dumps. Fuse and server xlators populate it whenever they
  can, but other xlators won't be able to assign 'unique' when they need to
  create a new frame/stack, because they don't know what 'unique' the
  fuse/server xlators already used. What we need is for 'unique' to be
  correct. If a stack with the same 'unique' is present in successive
  statedumps, that means the same operation is still in progress. This makes
  the 'finding hung frames' part of debugging hung frames easier.

  fixes: bz#1714098
  Change-Id: I3e9a8f6b4111e260106c48a2ac3a41ef29361b9e
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
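One straightforward way to hand out such unique identifiers is a global atomic counter consulted whenever a stack is created. The sketch below is my own illustration under that assumption; call_stack_sketch and stack_init() are invented names, not the actual frame/STACK code.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Single source of truth for 'unique' ids, shared by every creator. */
static atomic_uint_fast64_t next_unique = ATOMIC_VAR_INIT(1);

struct call_stack_sketch {
    uint64_t unique;   /* stays the same across statedumps while the op lives */
};

static void stack_init(struct call_stack_sketch *stack)
{
    stack->unique = atomic_fetch_add(&next_unique, 1);
}

int main(void)
{
    struct call_stack_sketch a, b;

    stack_init(&a);
    stack_init(&b);
    printf("unique ids: %llu %llu\n",
           (unsigned long long)a.unique, (unsigned long long)b.unique);
    return 0;
}
```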
* marker: remove some unused functions (Amar Tumballi, 2019-05-30; 7 files, -148/+8)

  After basic analysis, it was found that these methods were not being used
  at all.

  updates: bz#1693692
  Change-Id: If9cfa1ab189e6e7b56230c4e1d8e11f9694a9a65
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* Fix some "Null pointer dereference" coverity issuesXavi Hernandez2019-05-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | This patch fixes the following CID's: * 1124829 * 1274075 * 1274083 * 1274128 * 1274135 * 1274141 * 1274143 * 1274197 * 1274205 * 1274210 * 1274211 * 1288801 * 1398629 Change-Id: Ia7c86cfab3245b20777ffa296e1a59748040f558 Updates: bz#789278 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* lock: check null value of dict to avoid log flooding (Susant Palai, 2019-05-23; 1 file, -1/+1)

  updates: bz#1712322
  Change-Id: I120a1d23506f9ebcf88c7ea2f2eff4978a61cf4a
  Signed-off-by: Susant Palai <spalai@redhat.com>
* features/shard: Fix crash during background shard deletion in a specific case (Krutika Dhananjay, 2019-05-16; 1 file, -3/+9)

  Consider the following case:
  1. A file gets FALLOCATE'd such that > "shard-lru-limit" number of shards
     are created.
  2. And then it is deleted after that.

  The unique thing about FALLOCATE is that, unlike WRITE, all of the
  participant shards are resolved, created and fallocated in a single batch.
  This means that in this case, after the first "shard-lru-limit" number of
  shards are resolved and added to the lru list, as part of resolution of
  the remaining shards, some of the existing shards in the lru list will
  need to be evicted. These evicted shards are inode_unlink()d as part of
  eviction. Once the fop gets to the actual FALLOCATE stage, the lru'd-out
  shards get added to the fsync list.

  Two things to note at this point:
  i.  the lru'd-out shards are only part of the fsync list, so each holds
      1 ref on the base shard;
  ii. the more recently used shards are part of both the fsync and lru
      lists, so each of these shards holds 2 refs on the base inode - one
      for being part of the fsync list, and the other for being part of the
      lru list.

  FALLOCATE completes successfully, then this very file is deleted and
  background shard deletion is launched. Here's where the ref counts get
  mismatched. First, as part of the inode_resolve()s during the deletion,
  the lru'd-out inodes return NULL, because they are inode_unlink()'d by
  now. So these inodes need to be freshly looked up. But as part of linking
  them in lookup_cbk (precisely in shard_link_block_inode()), inode_link()
  returns the lru'd-out inode object. Its inode ctx is still valid, and
  ctx->base_inode is still valid from the last time it was added to the
  list.

  But shard_common_lookup_shards_cbk() passes NULL in the place of
  base_pointer to __shard_update_shards_inode_list(). This means that, as
  part of adding the lru'd-out inode back to the lru list, the base inode is
  not ref'd since it is NULL. Whereas after unlinking this shard, during
  shard_unlink_block_inode(), ctx->base_inode is accessible and is unref'd
  because the shard was found to be part of the LRU list, although the
  matching ref never occurred. This at some point leads to the base_inode
  refcount becoming 0; it gets destroyed and released while some of its
  associated shards are still being unlinked in parallel, and the client
  crashes whenever it is accessed next.

  The fix is to pass the base shard correctly, if available, in
  shard_link_block_inode(). The patch also fixes the ret value check in
  tests/bugs/shard/shard-fallocate.c.

  Change-Id: Ibd0bc4c6952367608e10701473cbad3947d7559f
  Updates: bz#1696136
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* features/shard: Fix integer overflow in block count accounting (Krutika Dhananjay, 2019-05-06; 1 file, -1/+1)

  ... by holding delta_blocks in a 64-bit int as opposed to a 32-bit int.

  Change-Id: I2c1ddab17457f45e27428575ad16fa678fd6c0eb
  updates: bz#1705884
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
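To see why 32 bits are not enough, consider a truncate that frees a few TiB worth of shards. The stand-alone example below is my own and assumes the delta is counted in 512-byte blocks (the usual ia_blocks unit); the real shard code may differ in detail.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t freed_bytes = 3ULL * 1024 * 1024 * 1024 * 1024; /* 3 TiB freed */
    uint64_t blocks = freed_bytes / 512;                     /* ~6.4 billion */

    int32_t delta32 = (int32_t)blocks; /* truncated: wrong, implementation-defined */
    int64_t delta64 = (int64_t)blocks; /* holds the real block count */

    printf("blocks=%" PRIu64 " delta32=%" PRId32 " delta64=%" PRId64 "\n",
           blocks, delta32, delta64);
    return 0;
}
```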
* cloudsync/plugin: coverity fixes (Susant Palai, 2019-04-30; 1 file, -6/+6)

  CID 1401087: Null pointer dereferences (REVERSE_INULL)
  CID 1401088: Null pointer dereferences (FORWARD_NULL)

  Change-Id: I71bf67af80e1b22bcd2eb997b01a1a5ef0b4d80b
  Updates: bz#789278
  Signed-off-by: Susant Palai <spalai@redhat.com>
* cloudsync: Fix bug in cloudsync-fops-c.py (Anuradha Talur, 2019-04-26; 1 file, -3/+21)

  In some of the fops generated by generator.py, the xdata request was not
  being wound to the child xlator correctly. This was happening because,
  even though the logic in cloudsync-fops-c.py was correct, generator.py was
  generating resultant code that omitted this logic. Made changes in
  cloudsync-fops-c.py so that the correct code is produced.

  Change-Id: I6f25bdb36ede06fd03be32c04087a75639d79150
  updates: bz#1642168
  Signed-off-by: Anuradha Talur <atalur@commvault.com>
* cloudsync/cvlt: Cloudsync plugin for commvault store (Anuradha Talur, 2019-04-26; 10 files, -2/+1204)

  Change-Id: Icbe53e78e9c4f6699c7a26a806ef4b14b39f5019
  updates: bz#1642168
  Signed-off-by: Anuradha Talur <atalur@commvault.com>
* features/locks: error-out {inode,entry}lk fops with all-zero lk-owner (Pranith Kumar K, 2019-04-26; 5 files, -15/+53)

  Problem:
  Sometimes we find that developers forget to assign an lk-owner for an
  inodelk/entrylk/lk before writing code to wind these fops. The locks
  xlator at the moment allows this operation. This leads to multiple threads
  in the same client being able to get locks on the inode, because the
  lk-owner is the same and the transport is the same. So isolation with
  locks can't be achieved.

  Fix:
  Disallow locks with lk-owner zero.

  fixes: bz#1624701
  Change-Id: I1aadcfbaaa4d49308f7c819505857e201809b3bc
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
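A minimal sketch of the check, using an illustrative owner struct rather than the real gf_lkowner_t definition (lk_owner_sketch, lk_owner_is_null() and inodelk_sketch() are invented for the example): reject any {inode,entry}lk whose owner bytes are all zero.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

struct lk_owner_sketch {
    int len;
    char data[24];
};

/* True if the owner was never assigned: zero length or all-zero bytes. */
static int
lk_owner_is_null(const struct lk_owner_sketch *owner)
{
    static const char zero[sizeof(owner->data)] = { 0 };

    return owner->len == 0 ||
           memcmp(owner->data, zero, (size_t)owner->len) == 0;
}

static int
inodelk_sketch(const struct lk_owner_sketch *owner)
{
    if (lk_owner_is_null(owner)) {
        fprintf(stderr, "rejecting inodelk with all-zero lk-owner\n");
        return -EINVAL;   /* error out instead of granting the lock */
    }
    /* ... proceed with normal lock processing ... */
    return 0;
}

int main(void)
{
    struct lk_owner_sketch bad = { .len = 8 };                  /* all zero */
    struct lk_owner_sketch good = { .len = 8, .data = "frame1" };

    printf("bad: %d, good: %d\n", inodelk_sketch(&bad), inodelk_sketch(&good));
    return 0;
}
```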
* cloudsync: Make readdirp return stat info of all the dirents (Anuradha Talur, 2019-04-25; 2 files, -1/+36)

  This change got missed while the initial changes were sent. It should have
  been a part of https://review.gluster.org/#/c/glusterfs/+/21757/

  Gist of the change:
  The function that fills in stat info for dirents is invoked in readdirp in
  posix when cloudsync populates the xdata request with GF_CS_OBJECT_STATUS.

  Change-Id: Ide0c4e80afb74cd2120f74ba934ed40123152d69
  updates: bz#1642168
  Signed-off-by: Anuradha Talur <atalur@commvault.com>
* features/bit-rot: Unconditionally sign the files during oneshot crawl (Raghavendra Bhat, 2019-04-25; 1 file, -1/+14)

  Currently the bit-rot feature has an issue with disabling and re-enabling
  it on the same volume. Consider enabling bit-rot detection, which goes on
  to crawl and sign all the files present in the volume. Then some files are
  modified, and the bit-rot daemon goes on to sign the modified files with
  the correct signature. Now, disable the bit-rot feature. While signing and
  scrubbing are not happening, the previous checksums of the files continue
  to exist as extended attributes. Now, if some files with checksum xattrs
  get modified, they are not signed with a new signature as the feature is
  off.

  At this point, if the feature is enabled again, the bit-rot daemon will go
  and sign those files which do not have any bit-rot-specific xattrs (i.e.
  those files which were created after bit-rot was disabled), whereas the
  files with bit-rot xattrs won't get signed with a proper new checksum. At
  this point, if the scrubber runs, it finds the on-disk checksum and the
  actual checksum of the file to be different (because the file got
  modified) and marks the file as corrupted.

  FIX:
  The fix is to unconditionally sign the files when the bit-rot daemon comes
  up (instead of skipping the files with bit-rot xattrs).

  Change-Id: Iadfb47dd39f7e2e77f22d549a4a07a385284f4f5
  fixes: bz#1700078
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* core: avoid dynamic TLS allocation when possible (Xavi Hernandez, 2019-04-24; 2 files, -49/+5)

  Some interdependencies between logging and memory management functions
  make it impossible to use the logging framework before initializing the
  memory subsystem, because they both depend on Thread Local Storage
  allocated through pthread_key_create() during initialization. This causes
  a crash when we try to log something very early in the initialization
  phase.

  To prevent this, several dynamically allocated TLS structures have been
  replaced by static TLS reserved at compile time using the '__thread'
  keyword. This also reduces the number of error sources, making
  initialization simpler.

  Updates: bz#1193929
  Change-Id: I8ea2e072411e30790d50084b6b7e909c7bb01d50
  Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
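The following stand-alone sketch (my own illustration, not GlusterFS code) contrasts the two approaches: the __thread buffer is usable before any initialization has run, while the pthread_key_create()-based slot is only safe after its init function executes.

```c
#include <pthread.h>
#include <stdio.h>

/* Dynamic TLS: needs explicit initialization before first use. */
static pthread_key_t log_key;

static void dynamic_tls_init(void)
{
    pthread_key_create(&log_key, NULL);   /* must run before getspecific() */
}

/* Static TLS: reserved at compile/load time, no init-ordering problem. */
static __thread char log_buf[256];

int main(void)
{
    /* Safe even before any init function has run. */
    snprintf(log_buf, sizeof(log_buf), "early message");
    printf("static TLS: %s\n", log_buf);

    /* The dynamic variant is only valid after dynamic_tls_init(). */
    dynamic_tls_init();
    pthread_setspecific(log_key, log_buf);
    printf("dynamic TLS: %s\n", (char *)pthread_getspecific(log_key));
    return 0;
}
```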
* features/sdfs: Assign unique lk-owner for entrylk fop (Pranith Kumar K, 2019-04-22; 1 file, -0/+6)

  sdfs is supposed to serialize entry fops by doing entrylk, but all the
  locks are being taken with an all-zero lk-owner. In essence, sdfs doesn't
  achieve its goal of mutual exclusion when conflicting operations are
  executed by the same client, because two locks on the same entry with the
  same all-zero owner will both be granted. Fixed this up by assigning an
  lk-owner before taking the entrylk.

  updates: bz#1624701
  Change-Id: Ifabfc998c9f1724915d38e90ed8287e05797d769
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* features/locks: fix coverity issues (Xavi Hernandez, 2019-04-19; 2 files, -1/+6)

  This patch fixes the following NULL dereferences identified by Coverity:

  * CID 1398619
  * CID 1398621
  * CID 1398623
  * CID 1398625
  * CID 1398626

  Change-Id: Id6af0d7cba0bb3346005376bc27180e8476255a4
  Updates: bz#789278
  Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* Revert "features/locks: error-out {inode,entry}lk fops with all-zero lk-owner"Atin Mukherjee2019-04-175-53/+15
| | | | | | | | | This reverts commit 3883887427a7f2dc458a9773e05f7c8ce8e62301 as it has broken sdfs-sanity.t. Updates: bz#1624701 Change-Id: Icb2b0d6bfcce4d556f1cd0f11695c03ffc138736 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* features/bit-rot-stub: clean the mutex after cancelling the signer thread (Raghavendra Bhat, 2019-04-17; 2 files, -7/+59)

  When the bit-rot feature is disabled, the signer thread from the
  bit-rot-stub xlator (the thread which performs the setxattr of the
  signature on to the disk) is cancelled. But if the cancelled signer thread
  had already held the mutex (&priv->lock) which it uses to monitor the
  queue of files to be signed, then the mutex is never released. This
  creates problems in the future when the feature is enabled again: both the
  new instance of the signer thread and the regular thread which enqueues
  the files to be signed will be blocked on this mutex.

  So, as part of cancelling the signer thread, unlock the mutex associated
  with it as well, using pthread_cleanup_push and pthread_cleanup_pop.

  Change-Id: Ib761910caed90b268e69794ddeb108165487af40
  updates: bz#1700078
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
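The pattern reads roughly as below. This is a self-contained sketch rather than the bit-rot-stub code: a cleanup handler that unlocks the queue mutex is pushed right after locking, so cancelling the thread while it waits on the queue still releases the lock.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t queue_cond = PTHREAD_COND_INITIALIZER;

static void unlock_cleanup(void *arg)
{
    pthread_mutex_unlock((pthread_mutex_t *)arg);
}

static void *signer_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&queue_lock);
    pthread_cleanup_push(unlock_cleanup, &queue_lock);

    for (;;) {
        /* pthread_cond_wait() is a cancellation point; if the thread is
         * cancelled here, unlock_cleanup() releases queue_lock. */
        pthread_cond_wait(&queue_cond, &queue_lock);
    }

    pthread_cleanup_pop(1);   /* never reached, balances the push */
    return NULL;
}

int main(void)
{
    pthread_t tid;

    pthread_create(&tid, NULL, signer_thread, NULL);
    sleep(1);

    pthread_cancel(tid);
    pthread_join(tid, NULL);

    /* The mutex is free again: a new "signer" could lock it. */
    pthread_mutex_lock(&queue_lock);
    pthread_mutex_unlock(&queue_lock);
    printf("mutex released by cleanup handler after cancellation\n");
    return 0;
}
```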
* features/locks: error-out {inode,entry}lk fops with all-zero lk-owner (Pranith Kumar K, 2019-04-16; 5 files, -15/+53)

  Problem:
  Sometimes we find that developers forget to assign an lk-owner for an
  inodelk/entrylk/lk before writing code to wind these fops. The locks
  xlator at the moment allows this operation. This leads to multiple threads
  in the same client being able to get locks on the inode, because the
  lk-owner is the same and the transport is the same. So isolation with
  locks can't be achieved.

  Fix:
  Disallow locks with lk-owner zero.

  fixes: bz#1624701
  Change-Id: I1c816280cffd150ebb392e3dcd4d21007cdd767f
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* libgfchangelog: use find_library to locate shared library (Sunny Kumar, 2019-04-15; 1 file, -1/+3)

  Issue: libgfchangelog.so: cannot open shared object file

  Due to the hardcoded shared-library name, the runtime loader looks for a
  particular version of the shared library.

  Solution: Using find_library to locate the shared library at runtime
  solves this issue.

  Traceback (most recent call last):
    File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 323, in main
      func(args)
    File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 82, in subcmd_worker
      local.service_loop(remote)
    File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1261, in service_loop
      changelog_agent.init()
    File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 233, in __call__
      return self.ins(self.meth, *a)
    File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 215, in __call__
      raise res
  OSError: libgfchangelog.so: cannot open shared object file: No such file or directory

  Change-Id: I3dd013d701ed1cd99ba7ef20d1898f343e1db8f5
  fixes: bz#1699394
  Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* marker-quota: remove dead code (Amar Tumballi, 2019-04-15; 1 file, -37/+4)

  Also make minor signature changes (int -> void) where the return value was
  not checked anywhere.

  updates: bz#1693692
  Change-Id: Iff117712eb65e0b6b8b441a779202a117fcdf1fb
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* core: Brick is not able to detach successfully in brick_mux environment (Mohit Agrawal, 2019-04-14; 1 file, -0/+1)

  Problem: In a brick_mux environment, while volumes are stopped in a loop,
  bricks are not detached successfully. Bricks are not detached because
  xprtrefcnt has not become 0 for the detached brick. At the time of
  initiating the brick detach process, server_notify saves xprtrefcnt for
  the detached brick, and once the counter has become 0, server_rpc_notify
  spawns server_graph_janitor_threads to clean up the brick resources.
  xprtrefcnt has not become 0 because the socket framework is not working,
  due to 0 being assigned as a socket fd. In commit
  dc25d2c1eeace91669052e3cecc083896e7329b2 there was a change in changelog
  fini to close htime_fd if htime_fd is not negative; by default htime_fd is
  0, so it closes 0 as well.

  Solution: Initialize htime_fd to -1 right after allocating changelog_priv
  with GF_CALLOC.

  Fixes: bz#1699025
  Change-Id: I5f7ca62a0eb1c0510c3e9b880d6ab8af8d736a25
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
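The essence of the fix, as a stand-alone sketch (illustrative struct and function names, not the actual changelog_priv code): a descriptor that may never be opened must start out as -1, otherwise the teardown path's "close it if it looks valid" check ends up closing fd 0.

```c
#include <stdio.h>
#include <unistd.h>

struct changelog_priv_sketch {
    int htime_fd;
};

static void priv_init(struct changelog_priv_sketch *priv)
{
    priv->htime_fd = -1;   /* not 0: fd 0 is a perfectly valid descriptor */
}

static void priv_fini(struct changelog_priv_sketch *priv)
{
    if (priv->htime_fd >= 0) {   /* only close what we actually opened */
        close(priv->htime_fd);
        priv->htime_fd = -1;
    }
}

int main(void)
{
    struct changelog_priv_sketch priv;

    priv_init(&priv);
    priv_fini(&priv);   /* no stray close(0) */
    printf("htime_fd handled safely: %d\n", priv.htime_fd);
    return 0;
}
```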
* features/cloudsync: Added some new functions (Anuradha Talur, 2019-04-10; 6 files, -93/+591)

  This patch contains the following changes:
  1) Store-ID info will now be stored in the inode ctx.
  2) Added a new readv type where the read is made directly from the remote
     store. This choice is made by a volume set operation.
  3) cs_forget() was missing. Added it.

  Change-Id: Ie3232b3d7ffb5313a03f011b0553b19793eedfa2
  fixes: bz#1642168
  Signed-off-by: Anuradha Talur <atalur@commvault.com>
* changelog: remove unused code. (Yaniv Kaul, 2019-04-03; 4 files, -32/+0)

  Seems to be unused.

  Change-Id: I75eed9641dd030a1fbb1b942a9d818f10a7e1437
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* rpc/transport: Missing a ref on dict while creating transport object (Mohammed Rafi KC, 2019-03-20; 2 files, -3/+14)

  While creating an rpc_transport object, we store a dictionary without
  taking a ref on the dict, but we do an unref during the cleanup of the
  transport object. So the rpc layer expects the caller to take a ref on the
  dictionary before passing the dict to the rpc layer. This leads to a lot
  of confusion across the code base and leads to ref leaks.

  Semantically, this is not correct. It is the rpc layer's responsibility to
  take a ref when storing the dict, and to free it during cleanup.

  Listing the issues or leaks across the code base caused by this confusion
  (these issues are currently present in the upstream master):

  1) changelog_rpc_client_init
  2) quota_enforcer_init
  3) rpcsvc_create_listeners: when there are two transports, like tcp,rdma.
  4) quotad_aggregator_init
  5) glusterd: init
  6) nfs3_init_state
  7) server: init
  8) client: init

  This patch does the cleanup according to the semantics.

  Change-Id: I46373af9630373eb375ee6de0e6f2bbe2a677425
  updates: bz#1659708
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* shard: fix crash caused by using null inode (Kinglong Mee, 2019-03-14; 1 file, -4/+3)

  Change-Id: I156bf962223304e586b83a36be59a0ca74589b43
  Updates: bz#1688287
  Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* rpm: add thin-arbiter package (Amar Tumballi, 2019-03-11; 1 file, -2/+0)

  Discussion on thin-arbiter volume:
  https://github.com/gluster/glusterfs/issues/352#issuecomment-350981148

  The main idea of having this rpm package is to deploy thin-arbiter without
  glusterd and other commands on a node; all we need on that tie-breaker
  node is to run a single glusterfs command. Also note that no other
  glusterfs installation needs thin-arbiter.so.

  Make sure the RPM contains a sample volfile, which can work by default,
  and a script to configure that volfile, along with the translator image.

  Change-Id: Ibace758373d8a991b6a19b2ecc60c93b2f8fc489
  updates: bz#1674389
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
  Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* WORM-Xlator: Maybe integer overflow when computing new atime (David Spisla, 2019-03-07; 2 files, -8/+8)

  The structs worm_reten_state_t and read_only_priv_t from read-only.h use
  uint64_t values to store the retention and autocommit periods. This seems
  to be dangerous, since in worm-helper.c the function worm_set_state
  computes, at line 97:

  stbuf->ia_atime = time(NULL) + retention_state->ret_period;

  stbuf->ia_atime uses int64_t because of the definition of struct iattr. So
  if a very, very high retention period is stored, there may be an integer
  overflow. What can be the solution? Using int64_t instead of uint64_t may
  reduce the probability of the occurrence.

  Change-Id: Id1e86c6b20edd53f171c4cfcb528804ba7881f65
  fixes: bz#1685944
  Signed-off-by: David Spisla <david.spisla@iternity.com>
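A stand-alone illustration of the overflow (my own example, not the WORM xlator code): adding a huge unsigned 64-bit retention period to a signed 64-bit timestamp wraps the result on typical two's-complement systems.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
    int64_t now = (int64_t)time(NULL);
    uint64_t ret_period = (uint64_t)INT64_MAX;   /* absurdly large retention */

    /* The addition happens in uint64_t and is converted back into the
     * signed atime field, wrapping to a negative value in practice. */
    int64_t new_atime = (int64_t)((uint64_t)now + ret_period);

    printf("now=%" PRId64 " new_atime=%" PRId64 "\n", now, new_atime);
    return 0;
}
```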
* leases: Do not process internal fops (Soumya Koduri, 2019-03-05; 2 files, -0/+26)

  Fops marked internal are used to maintain data integrity and ideally do
  not interfere with application client leases. Hence it seems safe for the
  lease xlator to ignore them.

  Change-Id: I887b6f2da7ec0081442cc4b572a7a9e110f79eb2
  updates: bz#1648768
  Signed-off-by: Soumya Koduri <skoduri@redhat.com>
* quotad: fix passing GF_DATA_TYPE_STR_OLD dict data to v4 protocol (Kinglong Mee, 2019-03-04; 4 files, -16/+52)

  quotad prints many logs like:

  [glusterfs3.h:752:dict_to_xdr] 0-dict: key 'trusted.glusterfs.quota.size' is not sent on wire [Invalid argument]
  [glusterfs3.h:752:dict_to_xdr] 0-dict: key 'volume-uuid' is not sent on wire [Invalid argument]

  For quota there is a daemon named quotad, which has an rpcsvc_program
  quotad_aggregator_prog that only supports v3 right now. Quotad has two
  actors (LOOKUP, GETLIMIT) that contain a dict in the request; quotad just
  decodes the dict with dict_unserialize. That dict data's type is
  GF_DATA_TYPE_STR_OLD, which is not supported by the glusterfs v4 protocol.

  Change-Id: Ib649d7a2e3c68c32dc26bc0f88923a0ba967ebd7
  Updates: bz#1596787
  Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* leases-internal.c: minor reduction of work under lock. (Yaniv Kaul, 2019-02-25; 2 files, -42/+43)

  Minor changes to reduce work done under a lock: changed a few CALLOC()
  calls to MALLOC(), and moved some time(NULL) calls outside the lock.

  Compile-tested only!

  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
  Change-Id: I4683d0d6e0b653a6adefff87b43ae717fd46843a
* upcall: some modifications to reduce work under lock (Yaniv Kaul, 2019-02-19; 3 files, -138/+66)

  1. Reduced the number of times we call time(). This may affect the
     accuracy of access time and so on - please review carefully. I think
     the resolution is OK'ish.
  2. Removed dead code.
  3. Changed from CALLOC() to MALLOC() where it made sense.
  4. Moved some bits of work outside of a lock.

  Compile-tested only!

  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
  Change-Id: I9fb8ca5d79b0e9126c1eb07e1a1ab5dbd8bf3f79
* core: make gf_thread_create() easier to use (Xavi Hernandez, 2019-02-01; 2 files, -11/+3)

  This patch creates a specific function to set the thread name using a
  string format and a variable argument list, like printf(). This function
  is used to set the thread name from gf_thread_create(), which now accepts
  a variable argument list to create the full name. It's not necessary
  anymore to use a local array to build the name of the thread. This is done
  automatically.

  Change-Id: Idd8d01fd462c227359b96e98699f8c6d962dc17c
  Updates: bz#1193929
  Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
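A sketch of the idea (not the actual gf_thread_create() code; thread_create_named() is an invented wrapper): build the name with vsnprintf() from a printf-style format and apply it with pthread_setname_np(), so callers no longer need a local name buffer.

```c
#define _GNU_SOURCE   /* for pthread_setname_np() on glibc */
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>

static int
thread_create_named(pthread_t *tid, void *(*fn)(void *), void *arg,
                    const char *namefmt, ...)
{
    char name[16];   /* Linux limits thread names to 15 chars + NUL */
    va_list ap;
    int ret;

    va_start(ap, namefmt);
    vsnprintf(name, sizeof(name), namefmt, ap);
    va_end(ap);

    ret = pthread_create(tid, NULL, fn, arg);
    if (ret == 0)
        pthread_setname_np(*tid, name);
    return ret;
}

static void *worker(void *arg)
{
    (void)arg;
    return NULL;
}

int main(void)
{
    pthread_t tid;

    /* Name is built from the format, e.g. "shardw3". */
    thread_create_named(&tid, worker, NULL, "%sw%d", "shard", 3);
    pthread_join(tid, NULL);
    return 0;
}
```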
* feature/bitrot: Avoid thread creation if xlator is not enabled (Mohit Agrawal, 2019-01-31; 1 file, -8/+64)

  Problem: Threads are created for bitrot-stub for a volume even if the
  feature is not enabled.

  Solution: Before thread creation, check the flag to see whether the
  feature is enabled.

  Updates: #475
  Change-Id: I2c6cc35bba142d4b418cc986ada588e558512c8e
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
* features/sdfs: disable by default (Amar Tumballi, 2019-01-29; 1 file, -1/+1)

  With the feature enabled, some of the performance testing results,
  especially those which create millions of small files, showed
  approximately a 4x regression compared to the version before enabling
  this.

  On master without this patch: 765 creates/sec
  On master with this patch:    3380 creates/sec

  There also seems to be a regression caused by this in the 'ls -l'
  workload.

  On master without this patch: 3030 files/sec
  On master with this patch:    16610 files/sec

  This is a feature added to handle multiple clients operating in parallel
  (especially those which race for file creates with the same name) on a
  single namespace/directory. Considering that is < 3% of Gluster's use
  cases right now, it makes sense to disable the feature by default, so we
  don't penalize the default users who don't care about this use case. Also
  note that the client-side translators, especially distribute, replicate
  and disperse, already handle the issue up to 99.5% of the cases without
  SDFS, so it makes sense to keep the feature disabled by default.

  Credits: Shyamsunder <srangana@redhat.com> for running the tests and
  getting the numbers.

  Change-Id: Iec49ce1d82e621e9db25eb633fcb1d932e74f4fc
  Updates: bz#1670031
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* Multiple files: reduce work while under lock. (Yaniv Kaul, 2019-01-29; 4 files, -18/+22)

  Mostly, unlock before logging. In some cases, moved code that did not need
  to be under the lock (for example, taking the time, or malloc'ing) so that
  it is executed before taking the lock.

  Note: logging might be slightly less accurate in order, since it may not
  be done under the lock now, so the order of the logs is racy. I think it's
  a reasonable compromise.

  Compile-tested only!

  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
  Change-Id: I2438710016afc9f4f62a176ef1a0d3ed793b4f89
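An illustrative before/after of the pattern (my own example, not the patched GlusterFS code): take the timestamp and do the allocation before locking, and log only after unlocking, so the critical section only touches shared state.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

static pthread_mutex_t note_lock = PTHREAD_MUTEX_INITIALIZER;
static time_t last_update;
static char *last_note;

static void update_note(const char *text)
{
    /* Work that does not need the lock is done up front. */
    time_t now = time(NULL);
    char *copy = strdup(text);

    pthread_mutex_lock(&note_lock);
    /* Only the shared-state update happens inside the critical section. */
    free(last_note);
    last_note = copy;
    last_update = now;
    pthread_mutex_unlock(&note_lock);

    /* Logging after unlock: message order may race, but the critical
     * section no longer waits on I/O. */
    printf("note updated at %lld\n", (long long)now);
}

int main(void)
{
    update_note("hello");
    update_note("world");
    free(last_note);
    return 0;
}
```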
* features/shard: Ref shard inode while adding to fsync list (Krutika Dhananjay, 2019-01-24; 1 file, -8/+22)

  PROBLEM:
  A lot of the earlier changes in the management of shards in the lru and
  fsync lists assumed that if a given shard exists in the fsync list, it
  must be part of the lru list as well. This was found to be untrue.

  Consider this: a file is FALLOCATE'd to a size which would make the number
  of participant shards greater than the lru list size. In this case, some
  of the resolved shards that are to participate in this fop will be evicted
  from the lru list to give way to the rest of the shards. And once
  FALLOCATE completes, these shards are added to the fsync list, but without
  a ref. After the fop completes, these shard inodes are unref'd and
  destroyed while their inode ctxs are still part of the fsync list. Now
  when an FSYNC is called on the base file and the fsync list is traversed,
  the client crashes due to illegal memory access.

  FIX:
  Hold a ref on the shard inode when adding it to the fsync list as well,
  and unref under the following conditions:
  1. when the shard is evicted from the lru list
  2. when the base file is fsync'd
  3. when the shards are deleted.

  Change-Id: Iab460667d091b8388322f59b6cb27ce69299b1b2
  fixes: bz#1669077
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* afr/self-heal: Fix wrong type checking (Ravishankar N, 2019-01-24; 1 file, -4/+8)

  The gf_dirent struct has a d_type variable, which should be checked
  against DT_DIR instead of IA_IFDIR; alternatively, IA_IFDIR has to be
  compared with entry->d_stat.ia_type.

  Change-Id: Idf1059ce2a590734bc5b6adaad73604d9a708804
  updates: bz#1653359
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* rpc: use address-family option from vol file (Milind Changire, 2019-01-22; 1 file, -1/+4)

  This patch helps enable IPv6 connections in the cluster. The default
  address-family is IPv4 when this option is not used explicitly.

  When address-family is set to "inet6" in the /etc/glusterfs/glusterd.vol
  file, the mount command-line also needs to have
  -o xlator-option="transport.address-family=inet6" added to it. This option
  also gets added to the brick command-line. Snapshot and gfapi use-cases
  should also use this option to pass in the inet6 address-family.

  Change-Id: I97db91021af27bacb6d7578e33ea4817f66d7270
  fixes: bz#1635863
  Signed-off-by: Milind Changire <mchangir@redhat.com>
* locks/fencing: Add a security knob for fencing (Susant Palai, 2019-01-22; 2 files, -9/+31)

  There is a low-level security issue with fencing, since one client can
  preempt another client's lock. This patch does not completely eliminate
  the issue of a client misbehaving, but it certainly adds a security layer
  for the default use cases that do not need fencing.

  Change-Id: I55cd15f2ed1ae0f2556e3d27a2ef4bc10fdada1c
  updates: #466
  Signed-off-by: Susant Palai <spalai@redhat.com>
* quotad: fix wrong memory free (Kinglong Mee, 2019-01-21; 3 files, -19/+7)

  1. cli_req.dict.dict_val must be freed whether the operation errors or
     succeeds. Fix it as lookup does, by using "alloca" memory before
     decoding.
  2. args.xdata.xdata_val is allocated by "alloca", so freeing it is
     unneeded.
  3. qd_nameless_lookup only needs the gfid; a gfs3_lookup_req argument is
     unneeded.

  Change-Id: I746dddf7f3d1465b1885af2644afe0bcf0a5665b
  fixes: bz#1656682
  Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* core: Feature added to accept CidrIp in auth.allow (Rinku Kothiya, 2019-01-18; 1 file, -1/+1)

  Added functionality to the 'gluster volume set auth.allow' command to
  accept CIDR IP addresses. Modified a few functions to isolate the CIDR
  feature so that other gluster commands, such as peer probe, are prevented
  from using CIDR-format IPs. The functions are modified in such a way that
  they have an option to enable accepting the CIDR format for other gluster
  commands if required in the future.

  updates: bz#1138841
  Change-Id: Ie6734002a7078f1820e5df42d404411cce945e8b
  Credits: Mohit Agrawal
  Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>