path: root/xlators/cluster
Commit message (Author, Age, Files, Lines)
* cluster/afr: Delay post-op for fsync (Pranith Kumar K, 2020-06-08, 4 files, -6/+31)
  Problem: AFR doesn't delay post-op for the fsync fop. For fsync-heavy workloads this leads to an unnecessary fxattrop/finodelk for every fsync, which hurts performance.
  Fix: Delay post-op for fsync. Add a special flag in xdata to indicate that AFR shouldn't delay post-op in cases where either the process will terminate or a graph switch will happen. Otherwise it leads to unnecessary heals when the graph switch or process termination happens before the delayed post-op completes.
  Fixes: #1253
  Change-Id: I531940d13269a111c49e0510d49514dc169f4577
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* cluster/afr: Prioritize ENOSPC over other errors (karthik-us, 2020-06-05, 2 files, -47/+5)
  Problem: In a replicate/arbiter volume, if file creations or writes fail on a quorum number of bricks, and on one brick the failure is due to ENOSPC while on another brick it fails for a different reason, the fop may fail with an error other than ENOSPC in some cases.
  Fix: Prioritize ENOSPC over other lesser-priority errors, and do not set op_errno in posix_gfid_set if op_ret is 0, to avoid receiving any error_no which can be misinterpreted by __afr_dir_write_finalize().
  Also remove the function afr_has_arbiter_fop_cbk_quorum(), which might consider a successful reply from a single brick as quorum success in some cases, whereas the fop always needs to succeed on a quorum number of bricks in an arbiter configuration.
  Change-Id: I106e267f8b9451f681022f1cccb410d9bc824c08
  Fixes: #1254
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
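The prioritization idea can be sketched in a few lines of C. This is only an illustration under stated assumptions: `errno_rank()` and `pick_final_errno()` are hypothetical names, and the only rule taken from the commit message is that ENOSPC wins over other errors. The ordering among the remaining errors is deliberately left flat here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical ranking: ENOSPC outranks every other error, and any error
 * outranks success (0). The real AFR priority table is not shown in the
 * commit message, so the order among the remaining errors is left flat. */
static int
errno_rank(int op_errno)
{
    if (op_errno == ENOSPC)
        return 2;
    return op_errno != 0 ? 1 : 0;
}

/* Pick the errno to report from the per-brick failure codes. */
int
pick_final_errno(const int *brick_errnos, size_t count)
{
    int final_errno = 0;

    for (size_t i = 0; i < count; i++) {
        if (errno_rank(brick_errnos[i]) > errno_rank(final_errno))
            final_errno = brick_errnos[i];
    }
    return final_errno;
}
```

With per-brick codes such as EIO, ENOSPC and ENOTCONN, this helper reports ENOSPC regardless of the order in which the replies arrived.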
* afr: fix memory leak in afr_priv_destroy() (Dmitry Antipov, 2020-06-01, 1 file, -0/+2)
  Found with GCC ASan:
  Direct leak of 202 byte(s) in 2 object(s) allocated from:
    #0 0x7fc6c6ef0667 in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb0667)
    #1 0x7fc6c6bd145b in __gf_malloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:175
    #2 0x7fc6c6bd17a3 in gf_vasprintf /path/to/glusterfs/libglusterfs/src/mem-pool.c:223
    #3 0x7fc6c6bd1993 in gf_asprintf /path/to/glusterfs/libglusterfs/src/mem-pool.c:243
    #4 0x7fc6b0dc92f6 in init /path/to/glusterfs/xlators/cluster/afr/src/afr.c:590
    ...
  Change-Id: I29feb1d30a045fb70472758e6ed4e195888090b2
  Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
  Fixes: #1278
* dht - sparse files rebalance enhancements (Barak Sason Rofman, 2020-06-01, 1 file, -111/+102)
  Currently, data migration in rebalance reads sparse files sequentially, disregarding which segments are holes and which are data. This can lead to extremely long migration times for large sparse files.
  The data migration mechanism needs to be enhanced so that only data segments are read and migrated. This can be achieved by using lseek to seek for holes and data in the file.
  This enhancement is a consequence of https://bugzilla.redhat.com/show_bug.cgi?id=1823703
  fixes: #1222
  Change-Id: If5f448a0c532926464e1f34f504c5c94749b08c3
  Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
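The lseek-based approach mentioned in this commit can be illustrated with a small standalone sketch. It is not the rebalance code: `copy_data_segments()` is a hypothetical helper working on plain file descriptors, and it assumes a Linux system where `SEEK_DATA` and `SEEK_HOLE` are available (on filesystems without hole support the whole file is typically reported as one data segment).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes SEEK_DATA and SEEK_HOLE on glibc */
#endif
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy only the data segments of src_fd into dst_fd at the same offsets,
 * skipping holes. Returns 0 on success, -1 on error (errno is set). */
int
copy_data_segments(int src_fd, int dst_fd)
{
    char buf[65536];
    off_t end = lseek(src_fd, 0, SEEK_END);
    if (end < 0)
        return -1;

    off_t pos = 0;
    while (pos < end) {
        /* Find the next data segment at or after pos. */
        off_t data = lseek(src_fd, pos, SEEK_DATA);
        if (data < 0) {
            if (errno == ENXIO)
                break; /* only a hole remains before end of file */
            return -1;
        }
        /* Find where that data segment ends. */
        off_t hole = lseek(src_fd, data, SEEK_HOLE);
        if (hole < 0)
            return -1;

        for (off_t off = data; off < hole;) {
            size_t want = sizeof(buf);
            if ((off_t)want > hole - off)
                want = (size_t)(hole - off);
            ssize_t n = pread(src_fd, buf, want, off);
            if (n < 0)
                return -1;
            if (n == 0)
                break;
            if (pwrite(dst_fd, buf, (size_t)n, off) != n)
                return -1;
            off += n;
        }
        pos = hole;
    }
    /* Keep the logical size, including any trailing hole. */
    return ftruncate(dst_fd, end);
}
```

Writing each segment at its original offset and finishing with `ftruncate()` leaves the holes unwritten in the destination, so the copy stays sparse where the destination filesystem supports it.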
* dht: add null check in gf_defrag_free_dir_dfmeta (Susant Palai, 2020-05-26, 1 file, -1/+2)
  fixes: #1258
  Change-Id: I9d1fb512072bcc540d21d47da5b15ae1b79cf2b8
  Signed-off-by: Susant Palai <spalai@redhat.com>
* afr/changelog: fix NULL dereferences and error handling (Ashish Pandey, 2020-05-26, 1 file, -1/+2)
  This patch addresses the following CIDs from Coverity Scan: 1419116, 1420206
  Change-Id: Id92fd6a78c8a00726a61aa4697b5c126ced8ed4d
  Updates: #1202
* dht: Do opendir selectively in gf_defrag_process_dir (Susant Palai, 2020-05-12, 2 files, -22/+55)
  Currently opendir is done from the cluster view. Hence, even if only one opendir is successful, the opendir operation as a whole is considered successful.
  But since gf_defrag_get_entry fetches entries selectively from local_subvols, we need to opendir individually on those local subvols and keep track of the fds separately. Otherwise it is possible that opendir failed on one of the subvols and we wind a readdirp call on the fd to the corresponding subvol, which will ultimately result in an EINVAL error.
  fixes: #1218
  Change-Id: I50dd88b9597852a15579f4ee325918979417f570
  Signed-off-by: Susant Palai <spalai@redhat.com>
* cluster/ec: Return correct error code and log message (Ashish Pandey, 2020-05-08, 1 file, -2/+9)
  If readdir is sent with an FD on which opendir failed, the FD is useless and we return an error. For now, we return EINVAL without logging any message in the log file.
  Return the correct error code and also log a message to make this easier to debug.
  fixes: #1220
  Change-Id: Iaf035254b9c5aa52fa43ace72d328be622b06169
* cluster/dht: Don't access local after STACK_DESTROY (Pranith Kumar K, 2020-05-03, 1 file, -1/+3)
  There is a possibility that 'frame' has already been destroyed in dht_selfheal_dir_setattr(), which can lead to local->mds_heal_fresh_lookup showing a junk non-zero number. That will lead to a double STACK_DESTROY. The value of the variable is now remembered before the call to fix the access.
  Fixes: #1214
  Change-Id: I37d1657798bfb549bb3887e260484d58fff42c91
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* cluster/ec: Fix varargs warning reported with clang-10 (Dmitry Antipov, 2020-05-01, 1 file, -10/+9)
  Found with clang-10 -Wvarargs:
    xlators/cluster/ec/src/ec-combine.c:360:20: warning: passing an object that undergoes default argument promotion to 'va_start' has undefined behavior [-Wvarargs]
        va_start(args, global);
                       ^
    xlators/cluster/ec/src/ec-combine.c:348:34: note: parameter of type 'bool' is declared here
        gf_boolean_t global, ...)
  According to the C11 Standard, 7.16.1.4p4: If the parameter parmN is declared with the register storage class, with a function or array type, or with a type that is not compatible with the type that results after application of the default argument promotions, the behavior is undefined.
  Fixes: #1207
  Change-Id: I527527845b2d574000d736c278be87cf19504761
  Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
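The rule quoted from C11 7.16.1.4p4 is easy to reproduce outside GlusterFS. The sketch below is a generic illustration, not the ec-combine.c code: `sum_until_zero()` is a hypothetical function whose last named parameter is an `int`, which is unchanged by the default argument promotions and is therefore safe to pass to `va_start()`.

```c
#include <stdarg.h>

/* Problematic shape (shown only as a comment): a bool is promoted to int,
 * so naming it in va_start() is undefined behaviour under C11 7.16.1.4p4:
 *
 *     int sum_until_zero(bool negate, ...);
 *
 * Safe shape: the last named parameter is already an int. */
int
sum_until_zero(int negate, ...)
{
    va_list args;
    int total = 0;
    int value;

    va_start(args, negate);
    /* The argument list is terminated by a 0 sentinel. */
    while ((value = va_arg(args, int)) != 0)
        total += value;
    va_end(args);

    return negate ? -total : total;
}
```

A call such as `sum_until_zero(0, 1, 2, 3, 0)` adds the values up to the sentinel; passing a nonzero first argument negates the result.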
* dht xlator: integer handling issue (nik-redhat, 2020-04-29, 4 files, -11/+19)
  Issue: The ret value is passed to the function instead of the proper errno value.
  Fix: Pass the generated errno to the log function.
  CID: 1415824 : Improper use of negative value
  CID: 1420205 : Improper use of negative value
  Change-Id: Iaa7407ebd03eda46a2c027695e6bf0f598b371b2
  Updates: #1060
  Signed-off-by: nik-redhat <nladha@redhat.com>
* dht: Handle setxattr and rm race for directory in rebalance (Susant Palai, 2020-04-28, 3 files, -0/+33)
  Problem: Selfheal as part of directory does not return an error if the layout setxattr fails. This is because the actual lookup fop must have been successful to proceed to layout heal. Hence, we could not tell if fix-layout failed in rebalance.
  Solution: We can check in the layout structure whether all the xlators have returned an error.
  fixes: #1200
  Change-Id: I3e5f2a36c0d934c21476a73a9a5473d8e490cde7
  Signed-off-by: Susant Palai <spalai@redhat.com>
* afr: event gen changes (Ravishankar N, 2020-04-24, 3 files, -78/+25)
  The general idea of the changes is to prevent resetting event generation to zero in the inode ctx, since event gen is something that should follow 'causal order'.
  Change #1: For a read txn, in inode refresh cbk, if event_generation is found to be zero, we are failing the read fop. This is not needed because a change in event gen is only a marker for the next inode refresh to happen and should not be taken into account by the current read txn.
  Change #2: The event gen being zero above can happen if there is a racing lookup, which resets event gen (in afr_lookup_done) if there are non-zero afr xattrs. The resetting is done only to trigger an inode refresh and a possible client-side heal on the next lookup. That can be achieved by setting the need_refresh flag in the inode ctx. So all occurrences of resetting event gen to zero were replaced with a call to afr_inode_need_refresh_set().
  Change #3: In both the lookup and discover paths, we are doing an inode refresh, which is not required since all 3 essentially do the same thing: update the inode ctx with the good/bad copies from the brick replies. Inode refresh also triggers background heals, but I think it is okay to do it when we call refresh during the read and write txns and not in the lookup path.
  The .t tests which relied on inode refresh in the lookup path to trigger heals are now changed to do a read txn so that the inode refresh and the heal happen.
  Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec
  Fixes: #1179
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
  Reported-by: Erik Jacobson <erik.jacobson at hpe.com>
* dht - fixing rebalance failures for files with holes (Barak Sason Rofman, 2020-04-24, 1 file, -11/+10)
  Rebalance handling of files that contain holes caused rebalance to fail with "No space left on device" errors. This patch modifies the code flow so that files with holes are rebalanced correctly.
  fixes: #1187
  Change-Id: I89bc3d4ea7f074db7213d759c49307f379543932
  Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
* dht/rebalance - fixing recursive failure issue (Barak Sason Rofman, 2020-04-21, 1 file, -1/+2)
  If the rebalance process is failing, recursive failures appear in the log file, which distracts from the root cause. To avoid recursive failures, the error handling mechanism has been modified.
  fixes: #1072
  Change-Id: Iae19430323630acd97c2c8d35685626d8da747a7
  Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
* dht - Remove "tier" code (part 1) [tag: v9dev] (Barak Sason Rofman, 2020-04-17, 3 files, -476/+19)
  This patch removes some of the "tier" code in the dht xlator, as it is no longer being used. Not all of the unneeded code is removed at once, so that reviewing is easier. Follow-up patches will remove additional unused code.
  This is based on the work done in https://review.gluster.org/#/c/glusterfs/+/23935/
  Change-Id: I3cb6a0c5d8f14afcd87cf021ef8f74b91c0f908a
  updates: #1097
  Signed-off-by: Barak Sason Rofman <bsaonro@redhat.com>
* dht - fixing a permission update issue (Barak Sason Rofman, 2020-04-08, 2 files, -8/+33)
  When bringing back a downed brick and performing a lookup from the client side, the permissions on said brick aren't updated on the first lookup, but only on the second.
  This patch modifies the permission update logic so that the first lookup triggers a permission update on the downed brick.
  LIMITATIONS OF THE PATCH: The choice of source depends on whether the directory has a layout or not. Even the directories on the newly added brick will have a layout xattr [zeroed], but the same is not true for a root directory. Hence, if only the newly added bricks are up in the entire cluster [and the others are down], any change in permission during this time will be overwritten by the older permissions when the cluster is restarted.
  fixes: #999
  Change-Id: Ieb70246d41e59f9cae9f70bc203627a433dfbd33
  Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
* cluster/afr: Removing unsupported options from code base to improve coverage (karthik-us, 2020-04-07, 1 file, -9/+0)
  Support for gluster volume heal <volname> info healed/heal-failed was removed by commit bb02cfb56ae08f56df4452c2b948fa962ae1212b in release-3.6. The CLI parser displays the usage message in all the supported versions whenever these commands are run, leaving some dead code in the latest branches.
  Since support for these commands was removed long ago, this should not cause any backward compatibility issues. Hence remove the dead code from the code base, which also leads to better code coverage by the regression runs.
  Updates: #1052
  Change-Id: I0c2b061469caf233c06d9699b0d159ce48e240b9
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
* afr: mark pending xattrs as a part of metadata heal (Ravishankar N, 2020-04-02, 1 file, -1/+61)
  ...if pending xattrs are zero for all children.
  Problem: If there are no pending xattrs and a metadata heal needs to be performed, it is possible that we end up with xattrs inadvertently deleted from all bricks, as explained in the BZ.
  Fix: After picking one among the sources as the good copy, mark pending xattrs on all sources to blame the sinks. Now even if this metadata heal fails midway, a subsequent heal will still choose one of the valid sources that it picked previously.
  Fixes: #1067
  Change-Id: If1b050b70b0ad911e162c04db4d89b263e2b8d7b
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* dht: gf_defrag_process_dir is called even if gf_defrag_fix_layout has failed (Susant Palai, 2020-03-24, 1 file, -0/+1)
  Currently, even though gf_defrag_fix_layout fails with ENOENT or ESTALE, a subsequent call is made to gf_defrag_process_dir, leading to rebalance failure.
  fixes: #1102
  Change-Id: Ib0c309fd78e89a000fed3feb4bbe2c5b48e61478
  Signed-off-by: Susant Palai <spalai@redhat.com>
* cluster/afr: Fixes for halo (Pranith Kumar K, 2020-03-13, 3 files, -5/+19)
  The current implementation assumes that the ping event will come after the connect event, but that may not hold when fds need to be re-opened after the socket connection, which takes more time. So handle any order of the ping/child-up events.
  fixes: bz#1800583
  Change-Id: I6bcdc0caa503bdc039ef2b4739fbf4afae121f05
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* dht - selfheal code cleaning (Barak Sason Rofman, 2020-03-12, 1 file, -135/+20)
  1 - Converted methods to static
  2 - Removed unused code
  Change-Id: I49db3e28116da1c3c9ff0a33dcce7281bc3856f7
  updates: bz#1193929
  Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
* dht/rebalance - fixing failure occurace due to rebalance stop (Barak Sason Rofman, 2020-03-04, 1 file, -0/+8)
  Problem description: When stopping rebalance, the following error messages appear in the rebalance log file:
    [2020-01-28 14:31:42.452070] W [dht-rebalance.c:3447:gf_defrag_process_dir] 0-distrep-dht: Found error from gf_defrag_get_entry
    [2020-01-28 14:31:42.452764] E [MSGID: 109111] [dht-rebalance.c:3971:gf_defrag_fix_layout] 0-distrep-dht: gf_defrag_process_dir failed for directory: /0/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19/20/21/22/23/24/25/26/27/28/29/30/31
    [2020-01-28 14:31:42.453498] E [MSGID: 109016] [dht-rebalance.c:3906:gf_defrag_fix_layout] 0-distrep-dht: Fix layout failed for /0/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19/20/21/22/23/24/25/26/27/28/29/30
  To avoid seeing these error messages, the error handling mechanism has been modified. In addition, several log messages have been added to improve debugging efficiency.
  fixes: bz#1800956
  Change-Id: Ifc82dae79ab3da9fe22ee25088a2a6b855afcfcf
  Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
* xlator/dht-helper: structure logging (yatipadia, 2020-03-03, 2 files, -97/+75)
  convert gf_msg() to gf_smsg()
  Updates: #657
  Change-Id: Iab35ac89b7d7fb6fb0074fc61b11bf679c517c9d
  Signed-off-by: yatipadia <ypadia@redhat.com>
  Signed-off-by: yatip <ypadia@redhat.com>
* cluster/afr: fix race when bricks come up (Xavi Hernandez, 2020-03-02, 3 files, -6/+9)
  There was a problem when self-heal was sending lookups at the same time that one of the bricks was coming up. In this case there was a chance that the number of 'up' bricks changed in the middle of sending the requests to subvolumes, which caused a discrepancy between the expected number of replies and the actual number of sent requests.
  Because of this discrepancy, AFR continued executing requests before all requests were complete. Eventually, the frame of the pending request was destroyed when the operation terminated, causing a use-after-free issue when the answer was finally received.
  In theory the same thing could happen in the reverse way, i.e. AFR tries to wait for more replies than sent requests, causing a hang.
  Change-Id: I7ed6108554ca379d532efb1a29b2de8085410b70
  Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
  Fixes: bz#1808875
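One common way to avoid this kind of mismatch is to take a single snapshot of the 'up' state and derive both the expected reply count and the list of targets from that snapshot. The commit message does not say how the actual patch does it, so the sketch below is only an assumption-laden illustration: `struct fanout`, `fanout_prepare()` and `fanout_send()` are hypothetical and are not AFR structures or functions.

```c
#include <stddef.h>

#define MAX_CHILDREN 16

/* Hypothetical per-request state; these are not the real AFR structures. */
struct fanout {
    unsigned char targets[MAX_CHILDREN]; /* children this request is sent to */
    size_t child_count;
    int expected_replies;
};

/* Copy the live child_up array once and derive the expected reply count
 * from that copy. Returns the count, or -1 if there are too many children. */
int
fanout_prepare(struct fanout *f, const unsigned char *child_up,
               size_t child_count)
{
    if (child_count > MAX_CHILDREN)
        return -1;

    f->child_count = child_count;
    f->expected_replies = 0;
    for (size_t i = 0; i < child_count; i++) {
        f->targets[i] = child_up[i] ? 1 : 0;
        f->expected_replies += f->targets[i];
    }
    return f->expected_replies;
}

/* Walk the snapshot, never the live array, when sending. Returns the
 * number of requests that would be wound to children. */
int
fanout_send(const struct fanout *f)
{
    int sent = 0;

    for (size_t i = 0; i < f->child_count; i++) {
        if (f->targets[i])
            sent++; /* a real xlator would wind the request here */
    }
    return sent;
}
```

Because both numbers come from the same copy, a brick that comes up between `fanout_prepare()` and `fanout_send()` cannot make the sent count differ from the expected reply count.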
* xlator/dht-lock: structure logging (yatipadia, 2020-02-26, 2 files, -105/+113)
  convert gf_msg() to gf_smsg()
  Change-Id: If540ca921b1cd8ca75b92b3d72eb9eb61bdaaa10
  Updates: #657
  Signed-off-by: yatip <ypadia@redhat.com>
* afr: prevent spurious entry heals leading to gfid split-brain (Ravishankar N, 2020-02-18, 5 files, -15/+7)
  Problem: In a hyperconverged setup with granular-entry-heal enabled, if a file is recreated while one of the bricks is down, and an index heal is triggered (with the brick still down), entry self-heal was doing a spurious heal with just the 2 good bricks. It was doing a post-op leading to removal of the filename from .glusterfs/indices/entry-changes as well as erroneous setting of afr xattrs on the parent. When the brick came up, the xattrs were cleared, resulting in the renamed file not getting healed and leading to gfid split-brain and EIO on the mount.
  Fix: Proceed with entry heal only when shd can connect to all bricks of the replica, just like in data and metadata heal.
  fixes: bz#1801624
  Change-Id: I916ae26ad1fabf259bc6362da52d433b7223b17e
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* cluster/thin-arbiter: Wait for TA connection before ta-file lookup (Ashish Pandey, 2020-02-17, 1 file, -17/+21)
  Problem: When we mount a ta volume, as soon as the 2 data bricks are connected we consider the mount done and then send a lookup/create on the ta file on the ta node. However, the connection with the ta node might not have been completed yet. Due to this delay, the ta replica id file will not be created and we will see an ENOTCONN error in the log file if we do a lookup.
  Solution: As we know that this ta node could have a higher latency, we should wait a reasonable time for the connection to happen before sending lookup/create on the replica id file.
  fixes: bz#1720463
  Change-Id: I36f90865afe617e4e84cee57fec832a16f5dd6cc
* dht - Reducing methods scope (Barak Sason Rofman, 2020-02-13, 6 files, -104/+60)
  1. Reduced methods scope in the following: inode read&write, layout, linkfile, shard
  2. Removed dead code @ dht-linkfile.c:174-228 & dht-shard.c:44
  Change-Id: I2d08a10c7b074fccdb0c020845cad60c6ea32db5
  updates: bz#1193929
  Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
* cluster/afr: Check for lock on source & sink before doing data heal (karthik-us, 2020-02-13, 1 file, -3/+19)
  Problem: In function afr_selfheal_data_block(), we only check for the lock count to be equal to or greater than the number of sinks. There can be a case where we have 2 source bricks and one sink, and the locking is successful only on the source brick(s). In this case we continue with the healing on the sink without having a lock, which is not correct.
  Fix: Check for a lock on at least one source and one sink before starting the data heal.
  Change-Id: Iebcb57dcaa4b31831fedfee63d6ca16e9d6c8df8
  fixes: bz#1688115
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
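The difference between counting locks and checking which bricks are locked can be shown with a small sketch. `can_start_data_heal()` is a hypothetical helper, not the code in afr_selfheal_data_block(); it assumes per-brick flag arrays for lock state, sources and sinks, and reads the fix as requiring a lock on at least one source and at least one sink.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true only if a lock is held on at least one source brick and on
 * at least one sink brick. All three arrays have child_count entries and
 * hold nonzero for "yes". */
bool
can_start_data_heal(const unsigned char *locked_on,
                    const unsigned char *sources,
                    const unsigned char *sinks, size_t child_count)
{
    bool source_locked = false;
    bool sink_locked = false;

    for (size_t i = 0; i < child_count; i++) {
        if (!locked_on[i])
            continue;
        if (sources[i])
            source_locked = true;
        if (sinks[i])
            sink_locked = true;
    }
    return source_locked && sink_locked;
}
```

In the scenario from the commit message (two sources locked, one sink not locked), a pure count check sees 2 locks against 1 sink and proceeds, while this check refuses to start the heal.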
* tests: Fix spurious self-heald.t failure (Pranith Kumar K, 2020-02-11, 1 file, -23/+15)
  Problem: The heal-info code assumes that all indices in the xattrop directory definitely need heal. There is one corner case: the very first xattrop on the file adds the gfid to the 'xattrop' index in the fop path, and in the _cbk path it is removed because the fop is a zero-xattr xattrop in the success case. These gfids could be read by heal-info and shown as needing heal.
  Fix: Check the pending flag to see whether the file definitely needs heal or not, instead of relying on which index is being crawled at the moment.
  fixes: bz#1801623
  Change-Id: I79f00dc7366fedbbb25ec4bec838dba3b34c7ad5
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* dht: Fix stale-layout and create issue (Susant Palai, 2020-02-09, 2 files, -15/+140)
  Problem: With lookup-optimize set to on by default, a client with a stale layout can create a new file on the wrong subvol. This can lead to duplicate files if two different clients attempt to create the same file with two different layouts.
  Solution: Send the in-memory layout to be cross-checked at posix before committing a "create". In case of a mismatch, sync the client layout with that of the server and attempt the create fop one more time.
  test: Manual, testcase (attached)
  fixes: bz#1786679
  Change-Id: Ife0941f105113f1c572f4363cbcee65e0dd9bd6a
  Signed-off-by: Susant Palai <spalai@redhat.com>
* dht-hashfn.c: ensure we do not try to calculate hash on NULL path (Yaniv Kaul, 2020-02-05, 1 file, -0/+3)
  For some reason, dht_selfheal_layout_alloc_start() sends a NULL loc->path string to dht_hash_compute(). Until we understand why it happens, we should strive not to crash on a strlen of a NULL pointer.
  Change-Id: I8c2a22602cfccba9af85f432a1841556f6978450
  updates: bz#1793378
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
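A defensive guard of this kind is short. The sketch below is a generic illustration and not dht_hash_compute(): `hash_name()` is a hypothetical function, and the hash it uses is 32-bit FNV-1a, chosen only to keep the example self-contained. It is not the hash DHT uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hash a name with 32-bit FNV-1a (a stand-in, not the DHT hash).
 * Returns 0 and stores the hash on success, or -1 if either pointer is
 * NULL, so that strlen() is never called on a NULL pointer. */
int
hash_name(const char *name, uint32_t *hash_out)
{
    if (name == NULL || hash_out == NULL)
        return -1;

    size_t len = strlen(name);
    uint32_t hash = 2166136261u; /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        hash ^= (unsigned char)name[i];
        hash *= 16777619u; /* FNV prime */
    }
    *hash_out = hash;
    return 0;
}
```

Returning an error code lets the caller decide how to handle the unexpected NULL instead of crashing inside the hash routine.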
* xlator/dht-selfheal: structure logging (yatipadia, 2020-02-04, 3 files, -258/+222)
  Convert gf_msg() to gf_smsg()
  Change-Id: Ic72f2513e641cfcbe074933cb2697ee9fc05a766
  Updates: #657
  Signed-off-by: yatip <ypadia@redhat.com>
* xlator/dht-linkfile: structure logging (yatip, 2020-02-04, 2 files, -26/+269)
  convert all gf_msg() to gf_smsg()
  Updates: #657
  Change-Id: I9104ba8a8102f04d031a208abb06b6cf8ea8fd13
  Signed-off-by: yatip <ypadia@redhat.com>
* Improve logging in EC, client and lock translator (Ashish Pandey, 2020-02-03, 2 files, -3/+4)
  Change-Id: I98af8672a25ff9fd9dba91a2e1384719f9155255
  Fixes: bz#1779760
* geo-rep: Fix for "Transport End Point not connected" issue (Harpreet Kaur, 2020-01-31, 2 files, -0/+63)
  Problem: The geo-rep gsyncd process mounts the master and slave volumes on master nodes and slave nodes respectively and starts the sync. But it doesn't wait for the mount to be in a ready state to accept I/O. The gluster mount is considered ready when all the distribute sub-volumes are up. If not all the distribute subvolumes are up, a lookup on a file that lives on a subvol that is down can fail with an ENOTCONN error.
  Solution: Added a virtual xattr "dht.subvol.status" which returns "1" if all subvols are up and "0" if not all subvols are up. Geo-rep then uses this virtual xattr after a fresh mount to check whether all subvols are up or not, and then starts the I/O.
  fixes: bz#1664335
  Change-Id: If3ad01d728b1372da7c08ccbe75a45bdc1ab2a91
  Signed-off-by: Harpreet Kaur <hlalwani@redhat.com>
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
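The value reported through the virtual xattr boils down to an all-up check. The following sketch only illustrates that rule: `subvol_status_value()` is a hypothetical helper operating on a plain flag array, not the DHT getxattr handler, and returning "0" for an empty subvolume list is an assumption made here.

```c
#include <stddef.h>

/* Return "1" if every subvolume is up, otherwise "0".
 * subvol_up holds one nonzero/zero flag per subvolume. An empty list is
 * reported as "0" (an assumption; the commit message does not cover it). */
const char *
subvol_status_value(const unsigned char *subvol_up, size_t count)
{
    if (count == 0)
        return "0";

    for (size_t i = 0; i < count; i++) {
        if (!subvol_up[i])
            return "0";
    }
    return "1";
}
```

A caller in the position of geo-rep would poll such a value after a fresh mount and start I/O only once it reads "1".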
* afr: restore timestamp of files during metadata heal (Sheetal Pamecha, 2020-01-24, 1 file, -6/+2)
  For files: During metadata heal, we restore timestamps only for non-regular (char, block etc.) files. Extending this to regular files, as the timestamp can also be updated via the touch command.
  fixes: bz#1787274
  Change-Id: I26fe4fb6dff679422ba4698a7f828bf62ca7ca18
  Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
* dht-hashfn.c: remove a strlen() (Yaniv Kaul, 2020-01-14, 1 file, -16/+19)
  We already have the length of the name, or when we munge it, we can return its length instead of calling strlen() again.
  Also, slightly reduce the amount of code under the lock.
  Change-Id: I0141b0725ed1a4134d8d9f81ed1187b551b038b5
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* multiple xlators: reduce key length (Yaniv Kaul, 2020-01-14, 1 file, -3/+3)
  In many cases, we were freely allocating long keys with no need. Smaller char arrays are just fine almost anywhere, so just went ahead and looked where we can use smaller ones.
  In some cases, annotated the functions as static and the prefixes passed as const, as it was easier to read and understand.
  Where relevant, converted the dict functions to use the known key length.
  Change-Id: I882ab33ea20d90b63278336cd1370c09ffdab7f2
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* dht-rename.c: fix Coverity issues 1397018/7 - strcat into uninitialized value (Yaniv Kaul, 2020-01-10, 1 file, -0/+4)
  Initialize both src and dst if they were not initialized already.
  fixes: CID#1397018
  Change-Id: Ic91954423953e8bf24eaa11fc2798c554f304d28
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* afr: expose cluster.optimistic-change-log to CLI. (Ravishankar N, 2020-01-07, 1 file, -0/+2)
  This volume option was not made available to the `gluster volume set` CLI.
  Reported-by: epolakis(https://github.com/kinsu) in https://github.com/gluster/glusterfs/issues/781
  fixes: bz#1787554
  Change-Id: I7141bdd4e53ee99e22b354edde8d023bfc0b2cd7
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* afr: simplify afr_has_quorum() (Yaniv Kaul, 2020-01-02, 3 files, -29/+10)
  1. Perform AFR_COUNT() once, in afr_has_quorum(), and pass the result to afr_lookup_has_quorum().
  2. Simplify afr_lookup_has_quorum(): pass fewer parameters to it (via the change in item 1 above).
  3. Make afr_is_add_replica_mount_lookup_on_root() a static function.
  4. Remove dead code: afr_decide_heal_info(), which was not used.
  Change-Id: If9168cd01e22788a0e60b91e315787d2aa60e97b
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* Avoid buffer overwrite due to uuid_utoa() misuse (Dmitry Antipov, 2019-12-27, 1 file, -2/+3)
  Code like:
    f(..., uuid_utoa(x), uuid_utoa(y));
  is not valid (causes undefined behaviour) because uuid_utoa() uses a single static thread-local buffer which will be overwritten by the subsequent call. All such cases should be converted to use uuid_utoa_r() with an explicitly specified buffer.
  Change-Id: I5e72bab806d96a9dd1707c28ed69ca033b9c8d6c
  Updates: bz#1193929
  Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
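The pitfall can be reproduced with a small stand-in. The helpers below are hypothetical and only mimic the pattern described in the commit message: `id_to_str()` returns one shared thread-local buffer, as uuid_utoa() is described as doing, and `id_to_str_r()` takes a caller-supplied buffer in the style of uuid_utoa_r(). They format an unsigned integer rather than a UUID to keep the example short.

```c
#include <stddef.h>
#include <stdio.h>

#define ID_STR_LEN 9 /* 8 hex digits plus the terminating NUL */

/* Non-reentrant variant: every call writes into, and returns, the same
 * thread-local buffer, so a second call clobbers the first result. */
const char *
id_to_str(unsigned int id)
{
    static _Thread_local char buf[ID_STR_LEN];

    snprintf(buf, sizeof(buf), "%08x", id);
    return buf;
}

/* Reentrant variant: the caller owns the buffer. */
const char *
id_to_str_r(unsigned int id, char *buf, size_t len)
{
    snprintf(buf, len, "%08x", id);
    return buf;
}

/* Format two ids in one call, giving each conversion its own buffer. */
int
format_pair(char *out, size_t out_len, unsigned int a, unsigned int b)
{
    char abuf[ID_STR_LEN];
    char bbuf[ID_STR_LEN];

    return snprintf(out, out_len, "%s -> %s",
                    id_to_str_r(a, abuf, sizeof(abuf)),
                    id_to_str_r(b, bbuf, sizeof(bbuf)));
}
```

Calling `id_to_str()` twice inside one argument list would hand the callee two pointers to the same buffer, so both arguments would show whichever value was converted last. `format_pair()` avoids that by using separate buffers.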
* afr_inode/xlator: structure logging (yatipadia, 2019-12-20, 3 files, -57/+159)
  convert gf_msg() into gf_smsg()
  Change-Id: I8f5b7bbb9caa78902b06f67257502b67adab7405
  Updates: #657
  Signed-off-by: yatipadia <ypadia@redhat.com>
* DHT - Reduce methods scope (dht-common.c) (Barak Sason Rofman, 2019-12-17, 3 files, -804/+708)
  Methods that should have been static were defined as global, and the other way around. This patch fixes the issue in order to enforce encapsulation.
  updates: bz#1776757
  Change-Id: I3eb5781849c5e597c1dd347e03f356c00db62a39
  Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
* dht-common.h/dht-helper.c: exctract the LOCK() from DHT_UPDATE_TIME (Yaniv Kaul, 2019-12-17, 2 files, -20/+20)
  Currently, the only place that uses this macro is dht_inode_ctx_time_update(), where it is called 3 times in a row, which is essentially 3 cycles of LOCK/UNLOCK on the same lock.
  Instead, extract the LOCK()/UNLOCK() part of the macro and wrap those calls with it.
  Change-Id: I6312b985e3d97517857b55f342440accc4063db6
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* afr: make heal info lockless (Ravishankar N, 2019-12-12, 3 files, -203/+211)
  Changes in locks xlator: Added support for per-domain inodelk count requests. The caller needs to set the GLUSTERFS_MULTIPLE_DOM_LK_CNT_REQUESTS key in the dict and then set each key with the name 'GLUSTERFS_INODELK_DOM_PREFIX:<domain name>'. In the response dict, the xlator will send the per-domain count as values for each of these keys.
  Changes in AFR: Replaced afr_selfheal_locked_inspect() with afr_lockless_inspect(). Logic has been added to make the latter behave the same as the former, thus not breaking the current heal info output behaviour.
  fixes: bz#1774011
  Change-Id: Ie9e83c162aa77f44a39c2ba7115de558120ada4d
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* afr/self-heald - Missing error logs (Barak Sason Rofman, 2019-12-10, 2 files, -24/+77)
  As a follow-up on https://review.gluster.org/#/c/glusterfs/+/23749/, adding error logging for the entire method. In addition, converted logging to structured logging in the method.
  Fixes: bz#1778457
  Change-Id: I1f412159e6849d6f6ddbde53ec4a85ad709bbdf4
  Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
* cluster/dht: Add comments to code (N Balachandran, 2019-12-10, 2 files, -2/+14)
  Change-Id: Ieb7531af19ae89fb8a8387e81663c7f157b10c02
  Updates: bz#1765421
  Signed-off-by: N Balachandran <nbalacha@redhat.com>