summaryrefslogtreecommitdiffstats
path: root/xlators/mgmt/glusterd
Commit message (Collapse)AuthorAgeFilesLines
* cluster/afr: Heal directory rename without rmdir/mkdirPranith Kumar K2020-10-012-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem1: When a directory is renamed while a brick is down entry-heal always did an rm -rf on that directory on the sink on old location and did mkdir and created the directory hierarchy again in the new location. This is inefficient. Problem2: Renamedir heal order may lead to a scenario where directory in the new location could be created before deleting it from old location leading to 2 directories with same gfid in posix. Fix: As part of heal, if oldlocation is healed first and is not present in source-brick always rename it into a hidden directory inside the sink-brick so that when heal is triggered in new-location shd can rename it from this hidden directory to the new-location. If new-location heal is triggered first and it detects that the directory already exists in the brick, then it should skip healing the directory until it appears in the hidden directory. Credits: Ravi for rename-data-loss.t script Fixes: #1211 Change-Id: I0cba2006f35cd03d314d18211ce0bd530e254843 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* glusterd: start the brick on a different portSanju Rakonde2020-09-291-0/+3
| | | | | | | | | | | | | | | | | | | | | Problem: brick fails to start when the port provided by glusterd is in use by any other process Solution: glusterd should check errno set by runner_run() and if it is set to EADDRINUSE, it should allocate a new port to the brick and try to start it again. Previously ret value is checked instead of errno, so the retry part never executed. Now, we initialize errno to 0 before calling runner framework. and afterwards store the errno into ret to avoid modification of errno in subsequent function calls. fixes: #1101 Change-Id: I1aa048a77c5f8b035dece36976d60602d9753b1a Signed-off-by: Sanju Rakonde <srakonde@redhat.com> Signed-off-by: nik-redhat <nladha@redhat.com>
* glusterd: Replacing str with ptr in strchr to reduce comparisonsrijan-sivakumar2020-09-281-1/+1
| | | | | | | | | | | | | | | Issue: On seeing the function glusterd_replace_slash_with_hyphen we see that the strchr inside the while loop takes str as a parameter, that'd mean repeated comparison from index 0 of str even though the characters have already been compared. Why not use ptr instead which points to the latest replacement in the string. Code change: replacing str with ptr inside the strchr function. Fixes: #1516 Change-Id: Id049ec2ad9800a01730f2a0263d9e0528557ae81 Signed-off-by: srijan-sivakumar <ssivakum@redhat.com>
* glusterd: resource leaksnik-redhat2020-09-251-2/+1
| | | | | | | | | | | | | | Issue: iobref was not freed before exiting the function. Fix: Modified the code to free iobref before exiting. CID: 1430107 Updates: #1060 Change-Id: I89351b3aa645792eb8dda6292d1e559057b02d8b Signed-off-by: nik-redhat <nladha@redhat.com>
* glusterd: Fix Add-brick with increasing replica count failureSheetal Pamecha2020-09-241-0/+4
| | | | | | | | | | | | | Problem: add-brick operation fails with multiple bricks on same server error when replica count is increased. This was happening because of extra runs in a loop to compare hostnames and if bricks supplied were less than "replica" count, the bricks will get compared to itself resulting in above error. Fixes: #1508 Change-Id: I8668e964340b7bf59728bb838525d2db062197ed Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
* glusterd: Removing strlen and using existing len field of data_tsrijan-sivakumar2020-09-241-3/+5
| | | | | | | | | | | | | | Issue: The strlen being used to find the length of the dictionary is an extra step as there already exists len field in data_t which contains the same value. Code Change : 1. Replacing the strlen with len. 2. Removing a typecast which wasn't required. Fixes: #1497 Change-Id: I2780c3876b17b8825038d222fb489a87e090411f Signed-off-by: srijan-sivakumar <ssivakum@redhat.com>
* glusterd: Fixing coverity issues.srijan-sivakumar2020-09-241-4/+2
| | | | | | | | | | | Fixing Incorrect expression (IDENTICAL_BRANCHES) reported by the coverity scan. CID: 1432721 Change-Id: If6ab3a129dffb6c5fcc618e775f99bc1125003ab Updates: #1060 Signed-off-by: srijan-sivakumar <ssivakum@redhat.com>
* glusterd: add post-commit phase to the transactionSanju Rakonde2020-09-217-1/+569
| | | | | | | | | | | | | | | | This is part 2 of the fix. part 1 is at https://review.gluster.org/#/c/glusterfs/+/24325/ This patch adds post commit phase to the mgmt v3 transaction framework. In post commit phase we replace the old auth.allow list in case of add-brick and replace-brick. fixes: #1391 Change-Id: I41c871d59e6252d27163b042ad710e929d7d0399 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: mount directory getting truncated on mounting shared_storagenik-redhat2020-09-201-1/+2
| | | | | | | | | | | | | | | | | | | | | | | Issue: In case of a user created volume the mount point is the brick path 'ex: /data/brick' but in case of shared_storage the mount point is '/'.So, here we increment the array by one so as to get the exact path of brick without '/', which works fine for other volumes as the pointer of the brick_dir variable is at '/', but for shared_storage it is at 'v'(where v is starting letter of 'var' directory). So, on incrementing the path we get in case of shared_storage starts from 'ar/lib/glusterd/...' Fix: Only, increment the pointer if the current position is '/', else the path will be wrong. Fixes: #1480 Change-Id: Id31bb13f58134ae2099884fbc5984c4e055fb357 Signed-off-by: nik-redhat <nladha@redhat.com>
* glusterd:Reducing file operations when writing options into volfile.Srijan Sivakumar2020-09-174-91/+140
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Issue: The options to be written into the volfile are in form of key-value pairs and the current approach taken to write them into a file is to invoke the write syscall for each key-value pair. This implies an increased number of system calls. Code Changes: 1. Addition of a structure, glusterd_volinfo_data_store_t in glusterd-store.h, containing a character buffer, a pointer to gf_store_handle_t, the current length of data in the buffer as well as a flag for checking key while storing in the buffer. This is used for passing the required file descriptor as well having a character buffer for storing multiple options before being written into a file. 2. Modification of function, _storeopts in glusterd-store.c, which now invokes the gf_store_save_items when buffer is to be emptied into the volfile before further write into it. Also, it has replaced the function _storeslaves, _gd_store_rebalance_dict and _store_global_opts. 3. Modification of function, glusterd_store_volinfo_write in glusterd-store.c, wherein a pointer of type glusterd_volinfo_data_store_t is initialized for further operation. Also, the buffer is emptied into the volfile before it is freed. 4. Modification of function, glusterd_store_node_state_write in glusterd-store.c, wherein the a pointer of type glusterd_volinfo_data_store_t is initialized for further operations. Also, the buffer is emptied into the volfile before it is freed. 5. Addition of enum into glusterd-mem-types.h 6. Modification of function, glusterd_store_options in glusterd-store.c, wherein a pointer of type glusterd_volinfo_data_store_t is initialized for further opertaions. Also, the buffer is emptied into the volfile before it is freed. Reasoning behind the approach: 1.Instead of a dynamic allocation of buffer or increasing the buffer size with increased number of options, it, the current approach takes a buffer of fixed size (VOLINFO_BUFFER_SIZE). Before any write into the buffer, the size is checked and if it exceeds the available space, the contents of the buffer are written to the file before copying new contents. Dynamic allocation can lead to increased memory usage as one doesn't know the number of options that could be added in time and may go on to occupy more space than mandated. 2.The function dict_foreach is a generic function used across different modules. It made sense not to change its implementation as it might affect other Functionalities. Hence a structure was added which could just be passed as one of the parameter to this function (as it takes a void*). 3. Reduced number of system calls implies an increase in execution speed. Also, these modified functions come into play whenever the volume is started or modified. 4. The functions _storeslaves, _gd_store_rebalace_dict and _store_global_opts were doing the same set of operations as that of _storeopts except the checking for the key. This has been handled with the help of a flag in the glusterd_volinfo_data_store_t structure. This reduces the duplicate code present. Signed-off-by: Srijan Sivakumar <ssivakum@redhat.com> Change-Id: I22e6e91c78ed51e3a171482054d77bf793b9ab16 Fixes: #718
* glusterd: readdir-ahead off by defaultnik-redhat2020-09-141-1/+1
| | | | | | | | | | | | Changing the default value of readdir-ahead to off, but it can be enabled/disabled later on if with gluster vol set <volname> performance.readdir-ahead enabel/disable command. Fixes: #1472 Change-Id: Idb3e16e8be98d7a811fc8e5d09906919ef50fbab Signed-off-by: nik-redhat <nladha@redhat.com>
* glusterd: fixing coverity issuesnik-redhat2020-09-111-3/+9
| | | | | | | | | | | | | | | In the last patch merge for the performance.readdir -ahead dependencies, there was few issues with return check and NULL derefencing. So, fixed that as per the coverity scanner. CID: 1432493 CID: 1432492 Updates: #1060 Change-Id: I6dee6d35ef41ab8d6322f1b2e3734c4796ee2804 Signed-off-by: nik-redhat <nladha@redhat.com>
* glusterd: automatically turn on dependencies for parallel-readdirnik-redhat2020-09-081-2/+5
| | | | | | | | | | | | | | | | | | | Issue: On setting the performance.parallel-readdir to "on" the dependencies of it should automatically be turned on and readdir-ahead should be the parent of each dht subvolume. Fix: On enabling the parallel-readdir, the dependencies are turned on by enabling readdir-ahead simultaneously, and readdir-ahead will be seen as the parent of each dht subvolume. Fixes: #1416 Change-Id: Ic83ae470152b88edddc274d5e6c4d74169d23c15 Signed-off-by: nik-redhat <nladha@redhat.com>
* xlators: prefer libglusterfs time APIDmitry Antipov2020-09-072-2/+4
| | | | | | | | | | Prefer timespec_now_realtime() and gf_time() over clock_gettime() and time(), use gf_tvdiff() and gf_tsdiff() where appropriate, drop unused time_elapsed() and leftovers in 'struct posix_private'. Change-Id: Ie1f0229df5b03d0862193ce2b7fb91d27b0981b6 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Updates: #1002
* build: extend --enable-valgrind to support Memcheck and DRDDmitry Antipov2020-09-056-27/+75
| | | | | | | | | Extend '-enable-valgrind' to '--enable=valgrind[=memcheck,drd]' to enable Memcheck or DRD Valgrind tool, respectively. Change-Id: I80d13d72ba9756e0cbcdbeb6766b5c98e3e8c002 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Updates: #1002
* glusterd: use after free (coverity issue)nik-redhat2020-09-041-2/+3
| | | | | | | | | | | | | | | | | | Issue: dict_unref is called on the same dict again, in the out label of the code, which causes the use after free issue. Fix: Set the dict to NULL after unref, to avoid use after free issue. CID: 1430127 Updates: #1060 Change-Id: Ide9a5cbc5f496705c671e72b0260da6d4c06f16d Signed-off-by: nik-redhat <nladha@redhat.com>
* glusterd: cksum mismatch on upgrading to latest glusternik-redhat2020-09-041-0/+9
| | | | | | | | | | | | | | | | | | | | | | Issue: In gluster versions less than 7, the checksums were calculated whether or not the quota is enabled or not, and that cksum value was also getting stored in the quota.cksum file. But, from gluster 7 version onwards cksum was calculated only if the quota is enabled. Due to this, the cksums in quota.cksum files differ after upgrading. Fix: Added a check to see if the OP_VERSION is less than 7 then, follow the previous method otherwise, move as per the latest changes for cksum calculation. This changes for the cksum calculation was done in this commit : https://github.com/gluster/glusterfs/commit/3b5eb592f5 Fixes: #1332 Change-Id: I7a95e5e5f4d4be4983fb7816225bf9187856c003 Signed-off-by: nik-redhat <nladha@redhat.com>
* snapshot/ganesha: Modify ganesha export file while creating cloneMohammed Rafi KC2020-08-213-24/+103
| | | | | | | | | | | | | A snapshot clone is nothing but a volume, So if the ganesha is enabled for the parent volume, the clone should also have the ganesha enabled. This patch add clonename to the export file. Change-Id: I847f23e62036aee02fb9e6adbc868aec6455d86e Fixes: #1043 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Signed-off-by: srijan-sivakumar <ssivakum@redhat.com>
* Remove need for /proc on FreeBSDDaniel Morante2020-08-202-3/+64
| | | | | Change-Id: Ieebd9a54307813954011ac8833824831dce6da10 Fixes: #1376
* Missing link to mntent_compat for glusterdDaniel Morante2020-08-191-2/+4
| | | | | Change-Id: I5d6d38759de4492de3256995e79d01b9ed7befef Fixes: #1376
* glusterd: performance improvementnik-redhat2020-08-184-69/+83
| | | | | | | | | | | | | | | | | | | | | | Issue: In the glusertd_op_stage_create_volume(), fetching of values from the dict is done, whereas same values are fetched by glusterd_check_brick_order() which is called from that function. This leads to unnecssary performance overhead. Fix: Instead of fetching the values again, passing the values to the glusterd_check_brick_order() if it's fethced before, else a NULL is passed and then only fetching is done here. Also, few changes are made to the code to reduce the cost of operations such as 'fast fail' for false conditions and a bit of code clean up. Fixes: #1397 Change-Id: Ic7b523adbca8eb63ef9eb29c206e3b19e05c0815 Signed-off-by: nik-redhat <nladha@redhat.com>
* glusterd: memory deallocated twicenik-redhat2020-08-181-9/+9
| | | | | | | | | | | | | | | | | | | | | Issue: If the the pointer tmptier is destroyed in the function code it still it checks for the same in the out label. And tries to destroy the same pointer again. Fix: So, instead of passing the ptr by value, if we pass it by reference then, on making the ptr in the function the value will persist, in the calling function and next time when the gf_store_iter_destory() is called it won't try to free the ptr again. CID: 1430122 Updates: #1060 Change-Id: I019cea8e301c7cc87be792c03b58722fc96f04ef Signed-off-by: nik-redhat <nladha@redhat.com>
* glusterd: shared storage mount fails in ipv6 environmentnik-redhat2020-08-181-0/+16
| | | | | | | | | | | | | | | | | Issue: In case of ipv6 environment, the mounting of glusterd_shared_storage volume fails as it doesn't recognises the ipv6 enviornment. Fix: In case of ipv6 environment, the address-family is passed to the hooks script on creating shared-storage, then depending upon the address-family --xlator-option=transport.address-family=inet6 option is added to the mount command, and the mounting succeeds. Fixes: #1406 Change-Id: Ib1888c34d85e6c01618b0ba214cbe1f57576908d Signed-off-by: nik-redhat <nladha@redhat.com>
* libglusterfs: add library wrapper for time()Dmitry Antipov2020-08-172-2/+2
| | | | | | | | | Add thin convenient library wrapper gf_time(), adjust related users and comments as well. Change-Id: If8969af2f45ee69c30c3406bce5baa8305fb7f80 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Updates: #1002
* glusterd: Increase buffer length to save multiple hostnames in peer fileMohit Agrawal2020-08-041-3/+3
| | | | | | | | | | | | | | | Problem: At the time of handling friend update request glusterd updates peer file and if DNS has returned multiple hostnames for the same IP, glusterd saves all hostnames in peer file.In commit 1fa089e7a2b180e0bdcc1e7e09a63934a2a0c0ef We changed the approach to save all key value pairs in single shot. In case of a buffer is not having space to store the hostnames glusterd writes partial hostname in peer file. Solution: To avoid the failure increase the buffer length Change-Id: Iee969d165333e9c5ba69431d474c541b8f12d442 Fixes: #1407 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* glusterd/auth.allow : allow add-brick from peersSanju Rakonde2020-07-304-0/+111
| | | | | | | | | | | | | | | | | | | | | | | | Problem: When auth.allow list is set to some ip's, add-brick operation is failing. Cause: add-brick commands creates a temparary mount on the bricks to set the extended attributes on the brick mount points. When auth.allow list is set to default i.e, * (all) we will not see any issue, but when it is set to certain ip's add-brick operation fails as temparory mount on the bricks fails because the peers are not part of auth.allow list. Solution: When auth.allow list is already set, add all the peers to the auth.allow list during add-brick operation. the old list will be replaced in post commit phase. As this can happen with replace-brick operation as well, added code to handle it. updates: #1391 Change-Id: I5ede8c35f05ab25ff431b88e074ddbe9c10a90f1 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: fix resource leakSheetal Pamecha2020-07-291-0/+1
| | | | | | | CID: 1430146 Change-Id: Icce4ffa0e78575b110e0cfd9d5cfd133141680c1 Updates: #1060 Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
* libglusterfs/xlator: undefined symbol xlator_apinik-redhat2020-07-271-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | Issue: On executiing the command gluster vol set help, an error comes up in glusterd logs stating `undefined symbol: xlator_api`. This issue is seen for the rpc-transport/socket.so file. Fix: The symbol `xlator_api` is not found in rpc-transport/socket.so file as it is not a xlator but a shared object for transport. In the `xlator.c` file, there is a function `xlator_volopt_dynload`, which looks for the default values of the options available in gluster, which is stored inside the respective xlator files for different voltypes. In each of these files the `options` object is present which contains the default values, which is therefore referenced from the `options` data member of `xlator_api` object in case of xlators.But, since `rpc-transport/socket.so` is not an xlator we don't have the `xlator_api` object present to point to that object. So, in case of `rpc-transport/socket.so` type we are accesing the `options` object directly from the `xlator_volopt_dynload` function to fetch the default values for the available options. Fixes: #827 Change-Id: I3b2b0c1f2a11896be250aaca1a33a65b044991d5 Signed-off-by: nik-redhat <nladha@redhat.com>
* glusterd: getspec() returns wrong response when volfile not foundTamar Shacked2020-07-231-2/+3
| | | | | | | | | | | | | | | In a cluster env: getspec() detects that volfile not found. but further on, this return code is set by another call so the error is lost and not handled. As a result the server responds with ambiguous message: {op_ret = -1, op_errno = 0..} - which cause the client to stuck. Fix: server side: don't override the failure error. fixes: #1375 Change-Id: Id394954d4d0746570c1ee7d98969649c305c6b0d Signed-off-by: Tamar Shacked <tshacked@redhat.com>
* glusterd: avoid crashSanju Rakonde2020-07-201-1/+2
| | | | | | | | | When dirp is null, we should not call sys_closedir() on it. fixes: #1379 Change-Id: I33633df983aeea11e9d685e41ed9ec58644b6258 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* [RFC]glusterd-utils.c: display which options have changedYaniv Kaul2020-07-101-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Display which options were not changed from the default. The user may have opted to change some global or volume options from the default they were initially. Display '(DEFAULT)' if the values used are those that were not explicitly set by the user. Example output: Option Value ------ ----- cluster.server-quorum-ratio 50 cluster.enable-shared-storage disable (DEFAULT) cluster.op-version 80000 cluster.max-op-version 90000 cluster.brick-multiplex disable (DEFAULT) cluster.max-bricks-per-process 250 (DEFAULT) glusterd.vol_count_per_thread 100 (DEFAULT) cluster.localtime-logging disable (DEFAULT) cluster.daemon-log-level INFO (DEFAULT) Since glusterfind uses the value, it is now filtering the value and only picking the 1st word (which is the value itself) and ignores the rest, which may now be '(DEFAULT)'. Fixes: #1357 Change-Id: I7c59055158d099a5de38943f2169fd02c77f5d09 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* glusterd: dereference of null pointernik-redhat2020-07-091-3/+3
| | | | | | | | | | | | | | Issue: 'this' is used before it is defined, therefore it can lead to a NULL dereference. Fix: Moved the definition of 'this', before it's use to avoid NULL dereference. Change-Id: I6ad382192129dfa3a206426e5610040e7a905be6 Updates: #1096 Signed-off-by: nik-redhat <nladha@redhat.com>
* xlator/mgmt/glusterd: lto-type-mismatchKaleb S. KEITHLEY2020-07-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | Seen in fedora rawhide/33, and SUSE tumbleweed, in all versions going back at least as far as glusterfs-6. [ 351s] glusterd.c:68:13: warning: type of 'snap_mount_dir' does not match original declaration [-Wlto-type-mismatch] [ 351s] 68 | extern char snap_mount_dir[PATH_MAX]; [ 351s] | ^ [ 351s] glusterd-snapshot.c:65:6: note: array types have different bounds [ 351s] 65 | char snap_mount_dir[VALID_GLUSTERD_PATHMAX]; [ 351s] | ^ [ 351s] glusterd-snapshot.c:65:6: note: 'snap_mount_dir' was previously declared here In this case it's only a warning, but certainly merits fixing. Another case where a decl in a header file instead of open-coding extern decls in multiple .c files would have been preferable. Change-Id: Idc91e536a56a1a7717be83ed27698069e71dff67 Updates: #1002 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* glusterd: null dereferencenik-redhat2020-07-083-5/+5
| | | | | | | | | | | | | | | | | | | | Issue: There has been either an explicit null dereference or a dereference after null check in some cases. Fix: Added the proper condition for null check and fixed null derefencing. CID: 1430106 : Dereference after null check CID: 1430120 : Explicit null dereferenced CID: 1430132 : Dereference after null check CID: 1430134 : Dereference after null check Change-Id: I7e795cf9f7146a633097c26a766f16b159881fa3 Updates: #1060 Signed-off-by: nik-redhat <nladha@redhat.com>
* libglusterfs, glusterd: tweak directory scanningDmitry Antipov2020-07-074-34/+28
| | | | | | | | | Replace an over-engineered GF_SKIP_IRRELEVANT_ENTRIES() with inline function gf_irrelevant_entry(), adjust related users. Change-Id: I6f66c460f22a82dd9ebeeedc2c55fdbc10f4eec5 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Fixes: #1350
* glusterd: change the log-level to WarningSanju Rakonde2020-07-061-1/+1
| | | | | | | | | Reason for changing the log-level stated at the github isse fixes: #1353 Change-Id: I21202075916c5a7525e5f26e7fb595efe7717b66 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: rebalance status displays stats as 0 after rebootSanju Rakonde2020-07-021-9/+20
| | | | | | | | | | | | | | | | | | | | | | | problem: while the rebalance is in progress, if a node is rebooted rebalance v status shows the stats of this node as 0 once the node is back. Reason: when the node is rebooted, once it is back glusterd_volume_defrag_restart() starts the rebalance and creates the rpc. but due to some race, rebalance process is sending disconnect event, so rpc object is getting destroyed. As the rpc object is null, request for fetching the latest stats is not sent to rebalance process. and stats are shows as default values which is 0. Solution: When the rpc object null, we should create the rpc if the rebalance process is up. so that request can be sent to rebalance process using the rpc. fixes: #1339 Change-Id: I1c7533fedd17dcaffc0f7a5a918c87356133a81c Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* glusterd: removing unused macronik-redhat2020-07-021-5/+2
| | | | | | | | | | | Removed the macro 'GD_MSG_DICT_SERL_LENGTH_GET_FAIL' from the glusterd-messages file as 'GD_MSG_DICT_ALLOC_AND_SERL_LENGTH_GET_FAIL' is used in it's place Change-Id: I69d7d95b5cb8f1bdd7e616d7a3e9539e891ba378 Fixes: #874 Signed-off-by: nik-redhat <nladha@redhat.com>
* glusterd: additional log informationnik-redhat2020-06-2931-569/+2077
| | | | | | | | | | | | | | | Issue: Some of the functions didn't had sufficient logging of information in case of failure. Fix: Added log information in few functions in case of failure indicating the cause of such event. Change-Id: I301cf3a1c8d2c94505c6ae0d83072b0241c36d84 fixes: #874 Signed-off-by: nik-redhat <nladha@redhat.com>
* glusterd: add-brick command failureSanju Rakonde2020-06-214-46/+67
| | | | | | | | | | | | | | | Problem: add-brick operation is failing when replica or disperse count is not mentioned in the add-brick command. Reason: with commit a113d93 we are checking brick order while doing add-brick operation for replica and disperse volumes. If replica count or disperse count is not mentioned in the command, the dict get is failing and resulting add-brick operation failure. fixes: #1306 Change-Id: Ie957540e303bfb5f2d69015661a60d7e72557353 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* volgen: add an option to disable aclAmar Tumballi2020-06-172-2/+26
| | | | | | | | | | | | | | | Also add a message saying this is to be used only for 'debug' purpose only. This is helpful to corner the issue to acl. There were recently many issues reported related to permissions, and acl access denied bugs. The bugs were elsewhere, but to validate them and to get people back to service (in certain cases like oVirt, where gluster volumes are used mostly by single user), this option can be used. Updates: #876 Change-Id: I7be4401153607e11c9efb831ab794df4176604df Signed-off-by: Amar Tumballi <amar@kadalu.io>
* glusterd: migrating remove-brick commands to mgmt v3 frameworkSanju Rakonde2020-06-173-9/+136
| | | | | | | | | | | Currently remove-brick commands follow sync-op framework. For code extensibility (like, adding more phases in the trnasaction) we are migrating the command to mgmt v3 framework. fixes: #1164 Change-Id: I5d363223d6f9dc7a70b61adb9d3a5250e84a71b4 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* Indicate timezone offsets in timestampsCsaba Henk2020-06-153-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Logs and other output carrying timestamps will have now timezone offsets indicated, eg.: [2020-03-12 07:01:05.584482 +0000] I [MSGID: 106143] [glusterd-pmap.c:388:pmap_registry_remove] 0-pmap: removing brick (null) on port 49153 To this end, - gf_time_fmt() now inserts timezone offset via %z strftime(3) template. - A new utility function has been added, gf_time_fmt_tv(), that takes a struct timeval pointer (*tv) instead of a time_t value to specify the time. If tv->tv_usec is negative, gf_time_fmt_tv(... tv ...) is equivalent to gf_time_fmt(... tv->tv_sec ...) Otherwise it also inserts tv->tv_usec to the formatted string. - Building timestamps of usec precision has been converted to gf_time_fmt_tv, which is necessary because the method of appending a period and the usec value to the end of the timestamp does not work if the timestamp has zone offset, but it's also beneficial in terms of eliminating repetition. - The buffer passed to gf_time_fmt/gf_time_fmt_tv has been unified to be of GF_TIMESTR_SIZE size (256). We need slightly larger buffer space to accommodate the zone offset and it's preferable to use a buffer which is undisputedly large enough. This change does *not* do the following: - Retaining a method of timestamp creation without timezone offset. As to my understanding we don't need such backward compatibility as the code just emits timestamps to logs and other diagnostic texts, and doesn't do any later processing on them that would rely on their format. An exception to this, ie. a case where timestamp is built for internal use, is graph.c:fill_uuid(). As far as I can see, what matters in that case is the uniqueness of the produced string, not the format. - Implementing a single-token (space free) timestamp format. While some timestamp formats used to be single-token, now all of them will include a space preceding the offset indicator. Again, I did not see a use case where this could be significant in terms of representation. - Moving the codebase to a single unified timestamp format and dropping the fmt argument of gf_time_fmt/gf_time_fmt_tv. While the gf_timefmt_FT format is almost ubiquitous, there are a few cases where different formats are used. I'm not convinced there is any reason to not use gf_timefmt_FT in those cases too, but I did not want to make a decision in this regard. Change-Id: I0af73ab5d490cca7ed8d07a2ce7ac22a6df2920a Updates: #837 Signed-off-by: Csaba Henk <csaba@redhat.com>
* glusterd: destroy all volume info locks and mutexesDmitry Antipov2020-06-102-2/+5
| | | | | | | | | | Add destroy calls for 'store_volinfo_lock' and 'lock' of volume info. Move initialization of 'store_volinfo_lock' from glusterd_op_create_volume() to common place, which is glusterd_volinfo_new() indeed. Change-Id: I5fae4469f28eab80c4fa6f5947646528e6aedad7 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Fixes: #1291
* test: Test case brick-mux-validation-in-cluster.t is failing on RHEL-8Mohit Agrawal2020-06-092-2/+8
| | | | | | | | | | | | | | | | | | | | | Brick process are not properly attached on any cluster node while some volume options are changed on peer node and glusterd is down on that specific node. Solution: At the time of restart glusterd it got a friend update request from a peer node if peer node having some changes on volume.If the brick process is started before received a friend update request in that case brick_mux behavior is not workingproperly. All bricks are attached to the same process even volumes options are not the same. To avoid the issue introduce an atomic flag volpeerupdate and update the value while glusterd has received a friend update request from peer for a specific volume.If volpeerupdate flag is 1 volume is started by glusterd_import_friend_volume synctask Change-Id: I4c026f1e7807ded249153670e6967a2be8d22cb7 Credit: Sanju Rakaonde <srakonde@redhat.com> fixes: #1290 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* glusterd: To do full heal in different online node when do ec/afr full healyinkui2020-06-091-3/+36
| | | | | | | | | | | | | | | | | | | For example: We have 3 nodes and create ec 3*(2+1) volume for test-disperse-0/test-disperse-1/test-disperse-2 when we do 'gluster v heal test full' in node-1 that can in node-1/ node-2/node-3 glustershd's get op=GF_EVENT_TRANSLATOR_OP and then do full heal in different disperse group. Let us say we have 2X(2+1) disperse with each brick from different machine m0, m1, m2, m3, m4, m5. and candidate_max is m5. and do full heal so '*index' is 3 and !gf_uuid_compare(MY_UUID, brickinfo->uuid) will be true in m3,and then m3's glustershd will be the heal-xlator. Id: I5c6762e6cfb375aed32d3fc11fe5eae3ee41aab4 Signed-off-by: yinkui <13965432176@163.com> Change-Id: Ic7ef3ddfd30b5f4714ba99b4e7b708c927d68764 fixes: bz#1724948
* glusterd: check for same node while adding bricks in disperse volumeSheetal Pamecha2020-06-064-238/+248
| | | | | | | | | | | | | | | The optimal way for configuring disperse and replicate volumes is to have all bricks in different nodes. During create operation it fails saying it is not optimal, user must use force to over-ride this behavior. Implementing same during add-brick operation to avoid situation where all the added bricks end up from same host. Operation will error out accordingly. and this can be over-ridden by using force same as create. fixes: #1047 Change-Id: I3ee9c97c1a14b73f4532893bc00187ef9355238b Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
* glusterd: fix memory leak in glusterd_store_retrieve_bricks()Dmitry Antipov2020-06-051-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Found with GCC's address sanitizer: ==67190==ERROR: LeakSanitizer: detected memory leaks Direct leak of 24624 byte(s) in 6 object(s) allocated from: #0 0x7f62535c0837 in __interceptor_calloc (/usr/lib64/libasan.so.6+0xb0837) #1 0x7f62532a1690 in __gf_default_calloc glusterfs/mem-pool.h:122 #2 0x7f62532a20ca in __gf_calloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:144 #3 0x7f62532c8128 in gf_store_iter_new /path/to/glusterfs/libglusterfs/src/store.c:511 #4 0x7f623e2f9ed7 in glusterd_store_retrieve_bricks /path/to/glusterfs/xlators/mgmt/glusterd/src/glusterd-store.c:2389 Direct leak of 8208 byte(s) in 2 object(s) allocated from: #0 0x7f62535c0837 in __interceptor_calloc (/usr/lib64/libasan.so.6+0xb0837) #1 0x7f62532a1690 in __gf_default_calloc glusterfs/mem-pool.h:122 #2 0x7f62532a20ca in __gf_calloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:144 #3 0x7f62532c8128 in gf_store_iter_new /path/to/glusterfs/libglusterfs/src/store.c:511 #4 0x7f623e2f9cf0 in glusterd_store_retrieve_bricks /path/to/glusterfs/xlators/mgmt/glusterd/src/glusterd-store.c:2363 #5 0x7fff5cb70bcf ([stack]+0x15bcf) #6 0x7f623e309113 in glusterd_store_retrieve_volumes /path/to/glusterfs/xlators/mgmt/glusterd/src/glusterd-store.c:3505 #7 0xfffeb96e61d (<unknown module>) #8 0x7f623e4586d7 (/usr/lib64/glusterfs/9dev/xlator/mgmt/glusterd.so+0x2f86d7) Change-Id: I9b2a543dc095f4fa739cd664fd4d608bf8c87d60 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Fixes: #1263
* io-cache,quick-read: deprecate volume options with flawed semantics or namingCsaba Henk2020-06-021-3/+27
| | | | | | | | | | | | | | | | | | | | | | | | - performance.cache-size has a flawed semantics, as it's dispatched on two independent translators, io-cache and quick-read. - performance.qr-cache-timeout has a confusing name, as other options affecting quick-read have an unabbreviated "quick-read-..." prefix in their names. We keep these options with unchanged operation, but in the help output we indicate their deprecation. The following better alternatives are introduced: - performance.io-cache-size to tune cache-size option of io-cache - performance.quick-read-cache-size to tune cache-size option of quick-read - performance.quick-read-cache-timeout as a preferred synonym for performance.qr-cache-timeout Fixes: #952 Change-Id: Ibd04fb638de8cac450ba992ad8a415154f9f4281 Signed-off-by: Csaba Henk <csaba@redhat.com>
* glusterd/snapshot: Improve log message during snapshot cloneMohammed Rafi KC2020-05-221-0/+10
| | | | | | | | | | | While taking a snapshot clone, if the snapshot is not activated, th cli was returning that the bricks are down. This patch clearly print tha the error is due to the snapshot state. Change-Id: Ia840e6e071342e061ad38bf15e2e2ff2b0dacdfa Fixes: #1255 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>