path: root/xlators
Commit message | Author | Age | Files | Lines
* protocol/client: Do not fallback to anon-fd if fd is not open | Pranith Kumar K | 2019-03-31 | 1 | -1/+7

    If an open comes on a file while a brick is down and, after the brick comes up, a fop comes on the fd, the client xlator would still wind the fop on anon-fd, leading to wrong behavior of the fops in some cases.

    Example: if an lk fop is issued on the fd just after the brick is up in the scenario above, the lk fop is sent on anon-fd instead of being failed on that client xlator. This lock will never be freed upon close of the fd, because flush on anon-fd is invalid and is not wound below the server xlator.

    As a fix, fail the fop unless the fd has the FALLBACK_TO_ANON_FD flag.

    Change-Id: I77692d056660b2858e323bdabdfe0a381807cccc
    fixes bz#1390914
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
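    A minimal sketch of the check described in this commit message, not the actual client xlator code. The type `sketch_fd_t`, the flag value `SKETCH_FALLBACK_TO_ANON_FD`, the sentinel `SKETCH_ANON_FD`, and the choice of `EBADF` as the error are all assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag bit standing in for FALLBACK_TO_ANON_FD. */
#define SKETCH_FALLBACK_TO_ANON_FD 0x1u
/* Hypothetical sentinel meaning "wind the fop on the anonymous fd". */
#define SKETCH_ANON_FD (-2)

typedef struct {
    bool opened_on_brick; /* true once the open actually reached the brick */
    int remote_fd;        /* valid only when opened_on_brick is true */
    unsigned flags;
} sketch_fd_t;

/* Picks the remote fd for a fop. Returns 0 and sets *out on success,
 * or a negative errno when the fd is not open and fallback is not allowed. */
int sketch_resolve_remote_fd(const sketch_fd_t *fd, int *out)
{
    if (fd->opened_on_brick) {
        *out = fd->remote_fd;
        return 0;
    }
    if (fd->flags & SKETCH_FALLBACK_TO_ANON_FD) {
        *out = SKETCH_ANON_FD;
        return 0;
    }
    /* Fail the fop instead of silently using anon-fd. */
    return -EBADF;
}
```

    With this shape, an lk fop on an fd that was never opened on the brick fails on the client side rather than taking a lock that a later flush cannot release.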
* afr: thin-arbiter read txn fixes | Ravishankar N | 2019-03-29 | 3 | -22/+37

    - Fixes afr_ta_read_txn() to handle the inode refresh failure code-path.
    - Fixes a double free issue of dict.

    Note: This patch addresses post-merge review comments for commit 69532c141be160b3fea03c1579ae4ac13018dcdf

    fixes: bz#1686398
    Change-Id: Id5299b45b68569d47df6b73755918237a1592cb4
    Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* mount.glusterfs: change the error message | Amar Tumballi | 2019-03-29 | 1 | -2/+7

    In scenarios where a mount fails before the log file is created, it doesn't make sense to tell the user to 'check log file'. See below:

    ```
    ERROR: failed to create logfile "/var/log/glusterfs/mnt.log" (No space left on device)
    ERROR: failed to open logfile /var/log/glusterfs/mnt.log
    Mount failed. Please check the log file for more details.
    ```

    Fixes: bz#1688068
    Change-Id: I1d837caa4f9bc9f1a37780783e95007e01ae4e3f
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
* client-rpc: Fix the payload being sent on the wire | Poornima G | 2019-03-29 | 6 | -244/+308

    The fops allocate three kinds of payload (buffer) in the client xlator:
    - fop payload: the buffer allocated by the write and put fops
    - rsphdr payload: the buffer required by the reply cbk of some fops like lookup and readdir
    - rsp_payload: the buffer required by the reply cbk of fops like readv

    Currently, in the lookup and readdir fops the rsphdr is sent as payload, so the allocated rsphdr buffer is also sent on the wire, increasing bandwidth consumption. This patch fixes the issue.

    Fixes: bz#1692093
    Change-Id: Ie8158921f4db319e60ad5f52d851fa5c9d4a269b
    Signed-off-by: Poornima G <pgurusid@redhat.com>
* rpc: Remove duplicate code | Pranith Kumar K | 2019-03-28 | 1 | -1/+1

    rpc_clnt_disable() and rpc_clnt_disconnect() have the same code. Removed rpc_clnt_disconnect() and changed its callers to call rpc_clnt_disable().

    updates bz#1193929
    Change-Id: I965f57cc1d5af36d266810125558b6f5e5f279d4
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* glusterd: fix potential locking issue on peer probe | Zhang Huan | 2019-03-27 | 2 | -3/+5

    There are two cases in which bricks are restarted: one is when glusterd starts or quorum is met, the other is when new peers join and quorum changes. In the latter case, sync_lock is not taken, which may cause lock corruption.

    Change-Id: I0844e7a631350f5ee00bdacb613602bffffcdf9f
    fixes: bz#1692612
    Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
* cluster/ec: Don't enqueue an entry if it is already healing | Ashish Pandey | 2019-03-27 | 5 | -30/+127

    Problem:
    1 - heal-wait-qlength is 128 by default. If shd is disabled and files need healing, client side heal is needed, and accessing these files triggers the heal. However, it has been observed that a file can be enqueued multiple times in the heal wait queue, which fills the queue and prevents other files from being enqueued.
    2 - While a file is going through healing and a write fop from the mount comes on that file, the write is sent to all the bricks, including the healing one. At the end it updates version and size on all the bricks. However, it does not unset the dirty flag on all the bricks, even if the write fop was successful on all of them. After healing completes, this dirty flag remains set and never gets cleaned up if SHD is disabled.

    Solution:
    1 - If an entry is already in the queue or going through the heal process, don't enqueue the next client side request to heal the same file.
    2 - Unset dirty on all the bricks at the end if the fop has succeeded on all the bricks, even if some of the bricks are going through heal.

    Change-Id: Ia61ffe230c6502ce6cb934425d55e2f40dd1a727
    updates: bz#1593224
    Signed-off-by: Ashish Pandey <aspandey@redhat.com>
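    The first part of the solution can be sketched as a per-entry state check before enqueueing. This is an illustration only: `sketch_heal_state_t`, `sketch_inode_t`, `sketch_heal_queue_t`, and `sketch_enqueue_heal()` are hypothetical names, and the real EC code also needs locking, which is omitted here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file heal state. */
typedef enum { SKETCH_IDLE, SKETCH_QUEUED, SKETCH_HEALING } sketch_heal_state_t;

typedef struct {
    sketch_heal_state_t state;
} sketch_inode_t;

typedef struct {
    size_t length;     /* entries currently waiting */
    size_t max_length; /* plays the role of heal-wait-qlength (128 by default) */
} sketch_heal_queue_t;

/* Enqueues a heal request unless the file is already queued or healing.
 * Returns true only when a new queue slot was actually consumed. */
bool sketch_enqueue_heal(sketch_heal_queue_t *q, sketch_inode_t *inode)
{
    if (inode->state != SKETCH_IDLE)
        return false; /* duplicate request: do not consume another slot */
    if (q->length >= q->max_length)
        return false; /* queue is full */
    inode->state = SKETCH_QUEUED;
    q->length++;
    return true;
}
```

    Repeated accesses to the same file then occupy at most one slot, so the queue cannot fill up with duplicates of a single entry.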
* afr: add client-pid to all gf_event() calls | Ravishankar N | 2019-03-27 | 7 | -15/+38

    client-pid for glustershd is GF_CLIENT_PID_SELF_HEALD
    client-pid for glfsheal is GF_CLIENT_PID_GLFS_HEALD

    updates: bz#1689250
    Change-Id: Ib3a863af160ff48c822a5e6b0c27c575c9887470
    Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* cluster/afr: Remove un-used variables related to pump | Pranith Kumar K | 2019-03-26 | 1 | -3/+0

    updates bz#1193929
    Change-Id: I01b60d644f517c00a1bcc127bf9a8ed90b6eb7a0
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* glusterd: fix txn-id mem leak | Atin Mukherjee | 2019-03-25 | 2 | -6/+36

    This commit ensures the following:
    1. Don't send the commit op request to the remote nodes when `gluster v status all` is executed. For the status all transaction, the local commit gets the names of the volumes and the remote commit ops are technically a no-op, so there is no need for additional rpc requests.
    2. In the op state machine flow, if the transaction is in staged state and op_info.skip_locking is true, there is no need to set the txn id in the priv->glusterd_txn_opinfo dictionary, which never gets freed.

    Fixes: bz#1691164
    Change-Id: Ib6a9300ea29633f501abac2ba53fb72ff648c822
    Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* Multiple files: remove HAVE_BD_XLATOR related code. | Yaniv Kaul | 2019-03-25 | 6 | -383/+0

    The BD translator was removed some time ago (in commit a907e468e724c32b9833ce59806fc215c7122d63). This completes the work.

    Compile-tested only!

    updates: bz#1635688
    Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
    Change-Id: I999df52e479a72d3cc9523f22f9056de17eb559c
* fuse : fix high sev coverity issue | Sunny Kumar | 2019-03-21 | 1 | -1/+7

    This patch fixes Coverity issues in fuse-bridge.c.

    CID : 1398630 : Resource leak
    CID : 1399757 : Uninitialized pointer read

    updates: bz#789278
    Change-Id: I69f8591400ee56a5d215eeac443a8e3d7777db27
    Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* server.c: fix Coverity CID 1399758 | Yaniv Kaul | 2019-03-21 | 1 | -1/+2

    1399758 Dereference before null check

    It was introduced in commit 67f48bfcc16a38052e6c9ae7c25e69b03b8ae008

    updates: bz#789278
    Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
    Change-Id: I1424b008b240691fe2a8924e31c708d0fb4f362d
* rpc/transport: Missing a ref on dict while creating transport object | Mohammed Rafi KC | 2019-03-20 | 10 | -14/+69

    While creating an rpc_transport object, we store a dictionary without taking a ref on it, but an unref is done during cleanup of the transport object. So the rpc layer expects the caller to take a ref on the dictionary before passing it to the rpc layer. This leads to a lot of confusion across the code base and to ref leaks.

    Semantically, this is not correct. It is the rpc layer's responsibility to take a ref when storing the dict and to release it during cleanup.

    Listed below are the issues or leaks across the code base caused by this confusion. These issues are currently present in upstream master.

    1) changelog_rpc_client_init
    2) quota_enforcer_init
    3) rpcsvc_create_listeners: when there are two transports, like tcp,rdma
    4) quotad_aggregator_init
    5) glusterd: init
    6) nfs3_init_state
    7) server: init
    8) client: init

    This patch does the cleanup according to the semantics.

    Change-Id: I46373af9630373eb375ee6de0e6f2bbe2a677425
    updates: bz#1659708
    Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
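    The ownership rule argued for in this message (whoever stores the pointer takes the ref, and the same layer drops it on cleanup) can be sketched as follows. The types and functions here (`sketch_dict_t`, `sketch_transport_t`, and their helpers) are hypothetical stand-ins, not the glusterfs `dict_t` or rpc_transport API, and the plain integer counter is not thread-safe.

```c
#include <stdlib.h>

/* Hypothetical ref-counted dictionary. */
typedef struct {
    int refcount;
} sketch_dict_t;

typedef struct {
    sketch_dict_t *options;
} sketch_transport_t;

sketch_dict_t *sketch_dict_new(void)
{
    sketch_dict_t *d = calloc(1, sizeof(*d));
    if (d)
        d->refcount = 1; /* the creator holds the first ref */
    return d;
}

sketch_dict_t *sketch_dict_ref(sketch_dict_t *d)
{
    d->refcount++;
    return d;
}

void sketch_dict_unref(sketch_dict_t *d)
{
    if (--d->refcount == 0)
        free(d);
}

/* The transport takes its own ref when it stores the dict ... */
sketch_transport_t *sketch_transport_new(sketch_dict_t *options)
{
    sketch_transport_t *t = calloc(1, sizeof(*t));
    if (t)
        t->options = sketch_dict_ref(options);
    return t;
}

/* ... and drops exactly that ref on cleanup. */
void sketch_transport_destroy(sketch_transport_t *t)
{
    sketch_dict_unref(t->options);
    free(t);
}
```

    Under this convention a caller never needs an extra ref just to satisfy the transport, and the caller's own ref is still valid after the transport is destroyed.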
* glusterd-locks: misc. changes. | Yaniv Kaul | 2019-03-19 | 2 | -64/+51

    Move to use dict_*n() functions, where it made sense.

    Compile-tested only!

    updates: bz#1193929
    Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
    Change-Id: Ie9c4b2021d2229ea9a815cc75e9eb8c3945c109e
* mount/fuse: Fix spelling mistake | Pranith Kumar K | 2019-03-15 | 1 | -1/+2

    updates bz#1193929
    Change-Id: I55ffa8f086ad9570f2526d91c196d7de9ffe6add
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* geo-rep: IPv6 support | Aravinda VK | 2019-03-15 | 1 | -2/+28

    `address_family=inet6` needs to be added while mounting master and slave volumes in the gverify script.

    A new option is introduced to the gluster cli (`--inet6`), which is used internally by geo-rep while calling `gluster volume info --remote-host=<ipv6>`.

    Fixes: bz#1688833
    Change-Id: I1e0d42cae07158df043e64a2f991882d8c897837
    Signed-off-by: Aravinda VK <avishwan@redhat.com>
* shard: fix crash caused by using null inode | Kinglong Mee | 2019-03-14 | 1 | -4/+3

    Change-Id: I156bf962223304e586b83a36be59a0ca74589b43
    Updates: bz#1688287
    Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* cluster/afr : TA: Return actual error code in case of failure | Ashish Pandey | 2019-03-14 | 1 | -6/+6

    In afr_ta_post_op_do, we were sending EIO for every failure. However, the original error code should be sent.

    Change-Id: I9fdc15dac00d758baf8e6f14db244f526481a63a
    updates: bz#1686711
    Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* storage/posix: Remove nr_files usage | Pranith Kumar K | 2019-03-14 | 4 | -7/+0

    nr_files is supposed to represent the number of files opened in posix. The present logic doesn't seem to handle anon-fds, because of which the counts would always be wrong.

    I don't remember anyone using this value in debugging any problem, probably because we always have 'ls -l /proc/<pid>/fd', which not only prints the fds that are active but also prints their paths. It also handles directories and anon-fds which actually opened the file. So this patch removes the code instead of fixing the buggy nr_files logic.

    fixes bz#1688106
    Change-Id: Ibf8713fdfdc1ef094e08e6818152637206a54040
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* glusterfsd: Brick is getting crash at the time of startup | Mohit Agrawal | 2019-03-13 | 1 | -5/+5

    Problem: The brick crashes because the graph was not yet activated at the time server_conf was accessed.

    Solution: To avoid the crash, check ctx->active before processing a request.

    Change-Id: Ib112e0eace19189e45f430abdac5511c026bed47
    fixes: bz#1687705
    Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* dht: NULL check before setting error flag | Mohammed Rafi KC | 2019-03-12 | 1 | -1/+2

    The function dht_common_mark_mdsxattr blindly sets a value through an integer pointer without validating it. In fact, there are two callers of this function that pass NULL for that pointer, which leads to a crash.

    Change-Id: Id94ffe216f6a21f007b3291bff0b1e1c1989075c
    fixes: bz#1687811
    Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* io-threads.c: Potentially skip a lock. | Yaniv Kaul | 2019-03-12 | 1 | -12/+13

    Before taking the lock, verify stub_cnt != 0. Otherwise, skip this code.

    Unrelated: switch a CALLOC to MALLOC, as we initialize all members right away. This allocation is also done under lock, so it should help a bit as well.

    Compile-tested only!

    updates: bz#1193929
    Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
    Change-Id: Ie2fe6adff41ae4969abff95eff945b54e1a01d32
* cluster/afr: Send truncate on arbiter brick from SHD | karthik-us | 2019-03-11 | 1 | -15/+13

    Problem: In an arbiter volume configuration, SHD will not send any writes onto the arbiter brick even if there is a data pending marker for the arbiter brick. If we have an arbiter setup on the geo-rep master and there are data pending markers for the files on the arbiter brick, SHD will not mark any data changelog during healing. While syncing the data from master to slave, if the arbiter brick is considered ACTIVE, there is a chance that the slave will miss some data. If the arbiter brick is being newly added or replaced, there is a chance of the slave missing all the data during sync.

    Fix: If there is a data pending marker for the arbiter brick, send truncate on the arbiter brick during heal, so that it will record truncate as the data transaction in the changelog.

    Change-Id: I3242ba6cea6da495c418ef860d9c3359c5459dec
    fixes: bz#1686568
    Signed-off-by: karthik-us <ksubrahm@redhat.com>
* rpm: add thin-arbiter package | Amar Tumballi | 2019-03-11 | 1 | -2/+0

    Discussion on thin arbiter volume: https://github.com/gluster/glusterfs/issues/352#issuecomment-350981148

    The main idea of having this rpm package is to deploy thin-arbiter without glusterd and other commands on a node; all we need on that tie-breaker node is to run a single glusterfs command.

    Also note that no other glusterfs installation needs thin-arbiter.so.

    Make sure the RPM contains a sample vol file, which can work by default, and a script to configure that volfile, along with the translator image.

    Change-Id: Ibace758373d8a991b6a19b2ecc60c93b2f8fc489
    updates: bz#1674389
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* cluster/afr: Add quorum checks to open & opendir fops | karthik-us | 2019-03-08 | 4 | -2/+48

    Problem: Currently, even if open & opendir fail on a quorum number of bricks but succeed on at least one brick, the result is success. This leads to inconsistent behaviour with the operations that follow the open, which do have quorum checks.

    Fix: Add quorum checks to open & opendir fops to avoid inconsistency.

    Change-Id: If8fcb82072a6dc45ea6d4a6754b79763215eba2a
    fixes: bz#1634664
    Signed-off-by: karthik-us <ksubrahm@redhat.com>
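    A rough sketch of the difference between "succeeded on at least one brick" and "succeeded on a quorum of bricks". The function names and the idea of passing the quorum count as a plain number are assumptions for illustration; how AFR actually derives quorum from its volume options is not shown here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Old behaviour as described above: any single success is enough. */
bool sketch_open_any_success(const bool *ok, size_t child_count)
{
    for (size_t i = 0; i < child_count; i++)
        if (ok[i])
            return true;
    return false;
}

/* With a quorum check: the open succeeds only if at least
 * quorum_count bricks replied with success. */
bool sketch_open_meets_quorum(const bool *ok, size_t child_count,
                              size_t quorum_count)
{
    size_t success = 0;
    for (size_t i = 0; i < child_count; i++)
        if (ok[i])
            success++;
    return success >= quorum_count;
}
```

    For a three-brick replica with a quorum count of 2 (an assumed value), one successful open out of three passes the first check but fails the second, which matches the inconsistency the commit removes.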
* performance/readdir-ahead: fix deadlock | Raghavendra Gowdappa | 2019-03-07 | 1 | -1/+2

    This deadlock happens while processing the dentry corresponding to the current directory (.) in rda_fill_readdirp. In this case the following lock order is used:

    LOCK(directory_fd_ctx->lock);
    rda_inode_ctx_get_iatt -> LOCK(directory_inode->lock);

    However, in rda_mark_inode_dirty the following lock order is used:

    LOCK(directory_inode->lock);
    LOCK(directory_fd_ctx->lock);

    These two codepaths, when executed concurrently, resulted in a deadlock. The current patch fixes this by no longer locking the directory inode while holding the fd-ctx lock in rda_fill_readdirp. This is fine, as the directory inode's stat won't change due to writes to files within the directory.

    Change-Id: Ic93a67a0dac8229bb0d79582e526a512e6f2569c
    fixes: bz#1674412
    Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
* WORM-Xlator: Maybe integer overflow when computing new atime | David Spisla | 2019-03-07 | 2 | -8/+8

    The structs worm_reten_state_t and read_only_priv_t from read-only.h use uint64_t values to store the retention and autocommit periods. This seems dangerous, since in worm-helper.c the function worm_set_state computes in line 97:

    stbuf->ia_atime = time(NULL) + retention_state->ret_period;

    stbuf->ia_atime uses int64_t because of the definition of struct iattr. So if a very high retention period is stored, an integer overflow may occur. What can be the solution? Using int64_t instead of uint64_t may reduce the probability of the occurrence.

    Change-Id: Id1e86c6b20edd53f171c4cfcb528804ba7881f65
    fixes: bz#1685944
    Signed-off-by: David Spisla <david.spisla@iternity.com>
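    The patch itself changes the field types to int64_t. As a separate illustration of the underlying hazard, the sketch below shows an explicit range check when adding an unsigned period to a signed timestamp. `sketch_add_retention()` is a hypothetical helper, not part of the WORM xlator, and it assumes the current time is non-negative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Computes now + ret_period into *atime_out.
 * Returns false instead of overflowing when the sum does not fit in int64_t. */
bool sketch_add_retention(int64_t now, uint64_t ret_period, int64_t *atime_out)
{
    if (now < 0)
        return false; /* assumption of this sketch: timestamps are non-negative */
    if (ret_period > (uint64_t)(INT64_MAX - now))
        return false; /* the sum would exceed INT64_MAX */
    *atime_out = now + (int64_t)ret_period;
    return true;
}
```

    The comparison is done before the addition, so no signed overflow (undefined behaviour in C) can occur on the rejected path.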
* core: make compute_cksum function op_version compatible | Sanju Rakonde | 2019-03-07 | 1 | -4/+8

    Problem: commit 5a152a changed the mechanism of computing the checksum. In a heterogeneous cluster, peers run into the rejected state because upgraded and non-upgraded nodes use different cksum computation mechanisms.

    Solution: add a check for op-version so that all the nodes in the cluster follow the same mechanism for computing the cksum.

    Change-Id: I1508f000e8c9895588b6011b8b6cc0eda7102193
    fixes: bz#1685120
    Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* leases: Do not process internal fops | Soumya Koduri | 2019-03-05 | 2 | -0/+26

    Fops marked internal are used to maintain data integrity and ideally do not interfere with application client leases. Hence it seems safe for the lease xlator to ignore them.

    Change-Id: I887b6f2da7ec0081442cc4b572a7a9e110f79eb2
    updates: bz#1648768
    Signed-off-by: Soumya Koduri <skoduri@redhat.com>
* glusterd: glusterd memory leak while running "gluster v profile" in a loop | Mohit Agrawal | 2019-03-05 | 2 | -3/+6

    Problem: glusterd leaks memory while running "gluster v profile" in a loop.

    Solution: Fix the leaking code path.

    Change-Id: Id608703ff6d0ad34ed8f921a5d25544e24cfadcd
    fixes: bz#1685414
    Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* io-threads: Prioritize fops with NO_ROOT_SQUASH pid | Susant Palai | 2019-03-05 | 1 | -1/+3

    A 30% regression was observed in the mkdir path with commit b139bc58eb504adf5ef81658896c9283ae21f390. On analysis, it was found that the io-threads xlator deprioritizes fops with any negative pid.

    Some context on the no-root-squash pid requirement: the DHT xlator does some of its internal fops with root privileges. This is needed so that operations like layout healing are not abandoned because a non-root user is operating. If the root-squash option is enabled, the layout set operation loses its root privilege, as the server xlator converts the uid and pid to random numbers. Hence, the above mentioned commit converted the pid to GF_CLIENT_PID_NO_ROOT_SQUASH to continue fops as root.

    Combining the above, I am proposing not to deprioritize fops with the no-root-squash pid.

    Change-Id: I54d056c01b25729304a77f9242fbaff39c5672ba
    fixes: bz#1676430
    Signed-off-by: Susant Palai <spalai@redhat.com>
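    The proposed rule can be sketched as a small priority function. The constant `SKETCH_PID_NO_ROOT_SQUASH` and its value, the two-level `sketch_pri_t` enum, and the function name are assumptions for illustration; the real GF_CLIENT_PID_* values and io-threads priority levels are defined in the glusterfs sources.

```c
/* Hypothetical stand-in for GF_CLIENT_PID_NO_ROOT_SQUASH; the value is assumed. */
#define SKETCH_PID_NO_ROOT_SQUASH (-4)

typedef enum { SKETCH_PRI_NORMAL, SKETCH_PRI_LEAST } sketch_pri_t;

/* Internal clients use negative pids and are normally deprioritized.
 * Fops carrying the no-root-squash pid come from regular user activity
 * (for example mkdir through DHT), so they keep normal priority. */
sketch_pri_t sketch_fop_priority(int pid)
{
    if (pid < 0 && pid != SKETCH_PID_NO_ROOT_SQUASH)
        return SKETCH_PRI_LEAST;
    return SKETCH_PRI_NORMAL;
}
```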
* afr: mark changelog_fsync as internal | Soumya Koduri | 2019-03-05 | 1 | -1/+3

    As afr_changelog_fsync is used for internal operations, use GLUSTERFS_INTERNAL_FOP_KEY so that the lease xlator does not treat it as a conflicting fop and recall the lease.

    Change-Id: I52cdc161002e840199d24439231a8bfa4f98b1b6
    updates: bz#1648768
    Signed-off-by: Soumya Koduri <skoduri@redhat.com>
* quotad: fix passing GF_DATA_TYPE_STR_OLD dict data to v4 protocol | Kinglong Mee | 2019-03-04 | 4 | -16/+52

    quotad prints many log lines such as:

    [glusterfs3.h:752:dict_to_xdr] 0-dict: key 'trusted.glusterfs.quota.size' is not sent on wire [Invalid argument]
    [glusterfs3.h:752:dict_to_xdr] 0-dict: key 'volume-uuid' is not sent on wire [Invalid argument]

    For quota, there is a daemon named quotad which has an rpcsvc_program, quotad_aggregator_prog, that only supports v3 right now. Quotad has two actors (LOOKUP, GETLIMIT) that carry a dict in the request. quotad just decodes the dict with dict_unserialize, so the dict data's type is GF_DATA_TYPE_STR_OLD, which is not supported by glusterfs v4.

    Change-Id: Ib649d7a2e3c68c32dc26bc0f88923a0ba967ebd7
    Updates: bz#1596787
    Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* mgmt/glusterd: Fix a memory leak when peer detach fails | Vijay Bellur | 2019-02-27 | 1 | -0/+13

    Dictionary object is not being unref'd when an error happens in __glusterd_handle_cli_deprobe(). This patch addresses that problem.

    Change-Id: I11e1f92d06dc9edd1260845256f435ea31ef1a87
    fixes: bz#1683816
    Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: remove experimental xlator options from glusterd-volume-set.c | Sanju Rakonde | 2019-02-26 | 1 | -20/+0

    Experimental xlators have been removed from the codebase, but the options related to them were left behind. This patch removes those options.

    fixes: bz#1683352
    Change-Id: I3fa7e14c6cd8ebde5cebc8d2b0cb2409bf37c1ae
    Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* leases-internal.c: minor reduction of work under lock. | Yaniv Kaul | 2019-02-25 | 2 | -42/+43

    Minor changes to reduce work done under a lock. Changed a few CALLOC() calls to MALLOC(), and moved some time(NULL) calls outside the lock.

    Compile-tested only!

    updates: bz#1193929
    Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
    Change-Id: I4683d0d6e0b653a6adefff87b43ae717fd46843a
* fuse : fix memory leak | Sunny Kumar | 2019-02-25 | 1 | -0/+4

    This patch fixes a memory leak reported by ASan.

    Tracebacks:

    ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 712 byte(s) in 1 object(s) allocated from:
    #0 0x7f35139dc848 in __interceptor_malloc (/lib64/libasan.so.5+0xef848)
    #1 0x7f35136efb29 in __gf_malloc ../libglusterfs/src/mem-pool.c:136
    #2 0x7f3510591ce9 in fuse_thread_proc ../xlators/mount/fuse/src/fuse-bridge.c:5929
    #3 0x7f351336d58d in start_thread (/lib64/libpthread.so.0+0x858d)

    SUMMARY: AddressSanitizer: 712 byte(s) leaked in 1 allocation(s).

    updates: bz#1633930
    Change-Id: Ie5b4da6b338d8e5fc770c5b2da1238e3462468ac
    Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* glusterd: fix get-state leak | Atin Mukherjee | 2019-02-22 | 1 | -0/+2

    Updates: bz#1193929
    Change-Id: I95897fd4d3102b4fa2b8b2864116b1bf24491cf9
    Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* md-cache: Adapt integer data types to avoid integer overflow | David Spisla | 2019-02-20 | 1 | -3/+3

    The "struct iatt" in iatt.h uses int64_t types for storing the atime, mtime and ctime. Therefore the struct 'struct md_cache' in md-cache.c should also use these types to avoid an integer overflow.

    This can happen, for example, if someone uses a very high default-retention-period in the WORM-Xlator.

    Change-Id: I605268d300ab622b9c8ab30e459dc00d9340aad1
    fixes: bz#1678726
    Signed-off-by: David Spisla <david.spisla@iternity.com>
* upcall: some modifications to reduce work under lock | Yaniv Kaul | 2019-02-19 | 3 | -138/+66

    1. Reduced the number of times we call time(). This may affect accuracy of access time and so on - please review carefully. I think the resolution is OK'ish.
    2. Removed dead code.
    3. Changed from CALLOC() to MALLOC() where it made sense.
    4. Moved some bits of work outside of a lock.

    Compile-tested only!

    updates: bz#1193929
    Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
    Change-Id: I9fb8ca5d79b0e9126c1eb07e1a1ab5dbd8bf3f79
* performance/write-behind: handle call-stub leaks | Raghavendra Gowdappa | 2019-02-19 | 1 | -0/+8

    Change-Id: I7be9a5f48dcad1b136c479c58b1dca1e0488166d
    Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
    Fixes: bz#1674406
* performance/write-behind: fix use-after-free in readdirp | Raghavendra Gowdappa | 2019-02-19 | 1 | -18/+22

    Two issues were found:
    1. In wb_readdirp_cbk, the inode should be unrefed after wb_inode is unlocked. Otherwise, the inode, and hence the context wb_inode, can be freed by the time we try to unlock wb_inode.
    2. wb_readdirp_mark_end iterates over a list of wb_inodes of children of a directory. But the inodes could have been freed, and hence the list might be corrupted. To fix this, take a reference on the inode before adding it to the invalidate_list of the parent.

    Change-Id: I911b0e0b2060f7f41ded0b05db11af6f9b7c09c5
    Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
    Updates: bz#1674406
* core: implement a global thread pool | Xavi Hernandez | 2019-02-18 | 9 | -27/+176

    This patch implements a thread pool that is wait-free for adding jobs to the queue and uses a very small locked region to get jobs. This makes it possible to decrease contention drastically. It's based on the wfcqueue structure provided by the urcu library.

    It automatically enables more threads when load demands it, and stops them when not needed. There's a maximum number of threads that can be used. This value can be configured.

    Depending on the workload, the maximum number of threads plays an important role, so it needs to be configured for optimal performance. Currently the thread pool doesn't self-adjust the maximum for the workload, so this configuration needs to be changed manually.

    For this reason, the global thread pool has been made optional, so that volumes can still use the thread pool provided by io-threads.

    To enable it for bricks, the following option needs to be set:

    config.global-threading = on

    This option has no effect if bricks are already running. A restart is required to activate it. It's recommended to also enable the following option when running bricks with the global thread pool:

    performance.iot-pass-through = on

    To enable it for a FUSE mount point, the option '--global-threading' must be added to the mount command. To change it, an umount and remount is needed. It's recommended to disable the following option when using global threading on a mount point:

    performance.client-io-threads = off

    To enable it for services managed by glusterd, glusterd needs to be started with option '--global-threading'. In this case all daemons, like self-heal, will be using the global thread pool.

    Currently it can only be enabled for bricks, FUSE mounts and glusterd services.

    The maximum number of threads for clients and bricks can be configured using the following options:

    config.client-threads
    config.brick-threads

    These options can be applied online and their effect is immediate most of the time. If one of them is set to 0, the maximum number of threads will be calculated as #cores * 2.

    Some distributions use a very old userspace-rcu library (version 0.7). For this reason, some header files from version 0.10 have been copied into contrib/userspace-rcu and are used if the detected version is 0.7 or older.

    An additional change has been made to io-threads to prevent threads from being started when iot-pass-through is set.

    Change-Id: I09d19e246b9e6d53c6247b29dfca6af6ee00a24b
    updates: #532
    Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
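    The rule for the thread limit stated above (a configured value of 0 means "#cores * 2") can be sketched as below. The function names are hypothetical, and the use of `sysconf(_SC_NPROCESSORS_ONLN)` to count cores is an assumption of this sketch; it is a widely available extension rather than something the commit message specifies.

```c
#include <unistd.h>

/* Returns the number of online cores, falling back to 1 if it cannot be read. */
long sketch_online_cores(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}

/* configured plays the role of config.client-threads or config.brick-threads.
 * A value of 0 selects the automatic limit of cores * 2. */
long sketch_max_threads(long configured, long cores)
{
    if (configured > 0)
        return configured;
    return cores * 2;
}
```

    For example, `sketch_max_threads(0, sketch_online_cores())` yields the automatic limit for the current machine, while any positive configured value is used as given.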
* auth-cache.c: minor reduction of work under lock. | Yaniv Kaul | 2019-02-18 | 1 | -6/+3

    Minor change to reduce work done under a lock. Also, remove unused variable (unrelated to the above).

    Compile-tested only!

    updates: bz#1193929
    Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
    Change-Id: I1dfb55823c3db7c638d8a34288423bd1faa37c32
* server.c: use dict_() funcs with key length. | Yaniv Kaul | 2019-02-18 | 1 | -15/+19

    Changed to use the dict_() funcs which take the key length. This also happens to reduce work under the lock in one case.

    Compile-tested only!

    updates: bz#1193929
    Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
    Change-Id: I958fcc29e95286fe3c74178cae3f01a8b2db26f2
* md-cache.c: minor reduction of work under lock. | Yaniv Kaul | 2019-02-18 | 1 | -4/+3

    Take the time before taking the lock, not under lock.

    Compile-tested only!

    updates: bz#1193929
    Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
    Change-Id: I6cd05d8556a9bcc015e1be53f6ba46854e52a380
* protocol/server: Use SERVER_REQ_SET_ERROR correctly for dicts | Pranith Kumar K | 2019-02-15 | 1 | -275/+236

    Removed op_errno based SERVER_REQ_SET_ERROR() calls, which were dead code. xdr_to_dict() calls have this check, which is used in the 4.0 version of xdr-to-dict.

    fixes bz#1676797
    Change-Id: I6f56907c85576f1263a6ec04ed7e37f723b01ac3
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* dht-shared.c: minor reduction of work under lock. | Yaniv Kaul | 2019-02-14 | 1 | -6/+7

    Minor changes to reduce work done under a lock.

    Compile-tested only!

    updates: bz#1193929
    Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
    Change-Id: Ia58adfb5125129e5d1f3bbf2202f38520fdbc29f
* storage/posix: print the actual file path | Raghavendra Bhat | 2019-02-14 | 1 | -54/+75

    posix converts incoming operations on files to operations on the corresponding gfid handles. While this in itself is not a problem, logging those gfid handles in place of the actual file paths can create confusion during debugging.

    The best way would be to print both the actual file (received as an argument) for path based operations and the gfid handle associated with it.

    Change-Id: I408c36ca6456f2e3981b93151c19ef7f60085ad6
    fixes: bz#1675076
    Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>