path: root/xlators
Commit message | Author | Age | Files | Lines
* posix: use synctask for janitorPoornima G2018-12-198-82/+196
  With brick mux, the number of threads increases as the number of bricks increases. As an initiative to reduce the number of threads in brick mux scenarios, the janitor thread is replaced with the synctask infra. close() and closedir() are now handled by a separate janitor thread which is linked with glusterfs_ctx.
  Updates #475
  Change-Id: I0c4aaf728125ab7264442fde59f3d08542785f73
  Signed-off-by: Poornima G <pgurusid@redhat.com>
* cluster/afr: Allow lookup on root if it is from ADD_REPLICA_MOUNTkarthik-us2018-12-188-31/+79
  Problem: Converting a plain distribute volume to replica-3 or arbiter type fails with an ENOTCONN error, because the lookup on the root fails when there is no quorum.
  Fix: Allow lookup on root if it comes from the ADD_REPLICA_MOUNT, which is used while adding bricks to a volume. It will try to set the pending xattrs for the newly added bricks so that the heal happens in the right direction and data loss scenarios are avoided.
  Note: This fix solves the type conversion problem only in the case where the volume was mounted at least once. The conversion of non-mounted volumes will still fail, because the dht selfheal's attempt to set the directory layout fails: it is done with the PID GF_CLIENT_PID_NO_ROOT_SQUASH set in frame->root.
  Change-Id: Ic511939981dad118cc946754341318b164954b3b
  fixes: bz#1655854
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
* iobuf: Get rid of pre allocated iobuf_pool and use per thread mem poolPoornima G2018-12-181-2/+1
  The current implementation of iobuf_pool has two problems:
  - prealloc of 12.5MB memory, which limits the scale factor of the gluster processes due to RAM requirements
  - lock contention, as the current implementation has one global iobuf_pool lock. Credits for debugging and addressing the same go to Krutika Dhananjay <kdhananj@redhat.com>. Issue: #410
  Hence changing the iobuf implementation to use per thread mem pool. This may theoretically appear to cause a perf dip as there is no preallocation. But per thread mem pool will not have significant perf impact, as the last allocated memory is kept alive for subsequent allocs for some time. The worst case would be if the iobufs requested are of random sizes each time. The best case is if we get iobuf requests of the same size. From the perf tests, this patch did not seem to cause any perf decrease.
  Note that, with this patch, the rdma performance is going to degrade drastically. In one of the previous patchsets we had fixes to not degrade rdma perf, but rdma is not supported and also not tested [1]. Hence the decision was to not have code in rdma that is not tested and not supported.
  [1] https://lists.gluster.org/pipermail/gluster-users.old/2018-July/034400.html
  Updates: #325
  Change-Id: Ic2ef3bd498f9250dea25f25ba0c01fde19584b27
  Signed-off-by: Poornima G <pgurusid@redhat.com>
* performance/io-cache: update pages with write dataRaghavendra Gowdappa2018-12-182-4/+90
  Currently io-cache invalidates pages falling in the range of a write. Instead, it can update those pages with the written data so that reads can make use of the cache.
  credits: Xavi Hernandez <xhernandez@redhat.com>
  Change-Id: I932bd3da97ddfd464187f3009b1013eb334f00a7
  Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
  updates: bz#1659869
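The idea of updating cached pages on write, rather than invalidating them, can be sketched as follows. This is a minimal stand-alone illustration, not the io-cache xlator's code: `struct page`, `struct cache`, `cache_update_on_write` and the tiny page size are all hypothetical names and values chosen for the demo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_BYTES 8 /* tiny page size, chosen only for the demo */
#define NUM_PAGES 4

/* Hypothetical cached page; not the io-cache xlator's real structure. */
struct page {
    int valid; /* 1 if the page currently holds cached data */
    unsigned char data[PAGE_SIZE_BYTES];
};

struct cache {
    struct page pages[NUM_PAGES];
};

/* Copy the bytes of a write into every valid cached page it overlaps.
 * Pages that are not cached are left alone. Returns the number of
 * pages that were updated. */
static int
cache_update_on_write(struct cache *c, size_t offset,
                      const unsigned char *buf, size_t len)
{
    int updated = 0;
    size_t end = offset + len;

    assert(c != NULL && buf != NULL);
    if (len == 0)
        return 0;

    for (size_t i = 0; i < NUM_PAGES; i++) {
        size_t pstart = i * PAGE_SIZE_BYTES;
        size_t pend = pstart + PAGE_SIZE_BYTES;

        if (!c->pages[i].valid || end <= pstart || offset >= pend)
            continue; /* not cached, or no overlap with the write */

        /* Clamp the write range to this page. */
        size_t from = offset > pstart ? offset : pstart;
        size_t to = end < pend ? end : pend;

        memcpy(c->pages[i].data + (from - pstart), buf + (from - offset),
               to - from);
        updated++;
    }
    return updated;
}
```

A write that straddles two cached pages updates both in place, so a later read of either page can still be served from the cache.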
* performance/ob: make open-behind as a child of quick-readRaghavendra Gowdappa2018-12-181-9/+7
  With read-after-open set to yes by default, if open-behind sees any reads, it'll do an open on the backend (and hence a flush/release later). This means that with the current order of quick-read and open-behind, open-behind sees all reads and hence also does the open, bringing down performance for small file reads. Since reads of small files are absorbed by quick-read, if quick-read is made a parent of open-behind, ob doesn't witness any reads. For read-only workloads, this means ob doesn't do any opens (even with read-after-open yes and use-anonymous-fd no).
  Change-Id: I138a42b006d104cff43ee6f07829e39c36f6f234
  Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
  Fixes: bz#1659327
* xlators/cluster/afr/src/afr-self-heal-common.c: remove a variable array.Yaniv Kaul2018-12-181-10/+6
  Added '-Wvla' and saw this - gcc doesn't like variable arrays. There are plenty of others in the EC code, but this seems OK to remove: there is no use for the array members (I hope - that was from reading the code).
  Compile-tested only!
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
  Change-Id: I350f4520e52b86c8bbcd60eea1b27ef99cd119aa
* glusterd: migrating rebalance commands to mgmt_v3 frameworkSanju Rakonde2018-12-188-21/+630
  Current rebalance commands use the op_state machine framework. Porting it to use the mgmt_v3 framework.
  Change-Id: I6faf4a6335c2e2f3d54bbde79908a7749e4613e7
  fixes: bz#1655827
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* Don't depend on string options to be valid alwaysPranith Kumar K2018-12-1711-67/+93
  updates bz#1650403
  Change-Id: Ib5a11e691599ce4bd93c1ed5aca6060592893961
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* features/snapview-client: access priv->path inside lockRaghavendra Bhat2018-12-173-75/+360
  To handle the race condition of a fop or a function accessing priv->path and a reconfigure changing priv->path (because entry point directory changed), the private structure's path is guarded by the lock.
  updates bz#1650403
  Change-Id: I61c539da06d68d38eafcf2155699c7702f31323e
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* AFR xlator: use dict_{setn|getn|deln|get_int32n|set_int32n|set_strn}Yaniv Kaul2018-12-1712-203/+282
  In a previous patch (https://review.gluster.org/20769) we've added the key length to be passed to dict_* funcs, to remove the need to strlen() it. This patch moves some xlators to use it.
  - In some cases, moved strlen() of the key length outside of locks, which is usually a good thing. Please verify it's safe to do so.
  - In some cases, created a prefix for the keys, replacing something like "%d-%d" with a "%s" in snprintf(). Not sure it adds value, but improves readability.
  Please review carefully. Compile-tested only!
  Change-Id: I04f2a1eb2ecfc3283d849d150d10d088ae7aa7f1
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
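Why passing the key length helps can be shown with a small stand-alone sketch. It does not use the real dict_* API: `KEYLEN`, `struct kv` and `kv_getn` are hypothetical stand-ins, and the keys are made up. The point is only that the length of a string literal is known at compile time, so no strlen() is needed at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of a string literal, computed at compile time
 * (hypothetical helper for this sketch). */
#define KEYLEN(lit) (sizeof(lit) - 1)

/* Toy key/value entry; a stand-in for a dictionary member. */
struct kv {
    const char *key;
    size_t keylen;
    int value;
};

/* Look up a key whose length the caller already knows, so the lookup
 * never has to call strlen() on it. */
static const struct kv *
kv_getn(const struct kv *tab, size_t n, const char *key, size_t keylen)
{
    assert(key != NULL);
    for (size_t i = 0; i < n; i++) {
        if (tab[i].keylen == keylen &&
            memcmp(tab[i].key, key, keylen) == 0)
            return &tab[i];
    }
    return NULL;
}
```

A call such as `kv_getn(tab, n, "example-key", KEYLEN("example-key"))` costs no run-time length computation, which matters most when the call sits inside a lock.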
* selinux/glusterd : add "features.selinux" to glusterd-volume-set.cJiffin Tony Thottan2018-12-171-0/+9
  Fixes: bz#1659868
  Change-Id: I38675ba4d47c8ba7f94cfb4734692683ddb3dcfd
  Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
* cluster/afr: Fix mem leak reported by ASANKotresh HR2018-12-171-4/+39
  Traceback:
  Direct leak of 765 byte(s) in 9 object(s) allocated from:
    #0 0x7ffb9cad2c48 in malloc (/lib64/libasan.so.5+0xeec48)
    #1 0x7ffb9c5f8949 in __gf_malloc ./libglusterfs/src/mem-pool.c:136
    #2 0x7ffb9c5f91bb in gf_vasprintf ./libglusterfs/src/mem-pool.c:236
    #3 0x7ffb9c5f938a in gf_asprintf ./libglusterfs/src/mem-pool.c:256
    #4 0x7ffb826714ab in afr_get_heal_info ./xlators/cluster/afr/src/afr-common.c:6204
    #5 0x7ffb825765e5 in afr_handle_heal_xattrs ./xlators/cluster/afr/src/afr-inode-read.c:1481
    #6 0x7ffb825765e5 in afr_getxattr ./xlators/cluster/afr/src/afr-inode-read.c:1571
    #7 0x7ffb9c635af7 in syncop_getxattr ./libglusterfs/src/syncop.c:1680
    #8 0x406c78 in glfsh_process_entries ./heal/src/glfs-heal.c:810
    #9 0x408555 in glfsh_crawl_directory ./heal/src/glfs-heal.c:898
    #10 0x408cc0 in glfsh_print_pending_heals_type ./heal/src/glfs-heal.c:970
    #11 0x408fc5 in glfsh_print_pending_heals ./heal/src/glfs-heal.c:1012
    #12 0x409546 in glfsh_gather_heal_info ./heal/src/glfs-heal.c:1154
    #13 0x403e96 in main ./heal/src/glfs-heal.c:1745
    #14 0x7ffb99bc411a in __libc_start_main ../csu/libc-start.c:308
  The dictionary is referenced by caller to print the status. So set it as dynstr, the last unref of dictionary will free it.
  updates: bz#1633930
  Change-Id: Ib5a7cb891e6f7d90560859aaf6239e52ff5477d0
  Signed-off-by: Kotresh HR <khiremat@redhat.com>
* dht: Fix clang warnings in dht-common.cShyamsundarR2018-12-161-20/+37
  Change-Id: I0894d62edd68e13d123aaa5ca1827b98283f0d3e
  Updates: bz#1622665
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
* fuse: SETLKW interruptCsaba Henk2018-12-141-0/+130
  Use the (f)getxattr based clearlocks interface to interrupt a pending lock request.
  updates: #465
  Change-Id: I4e91a4d8791fc688fed400a02de4c53487e61be2
  Signed-off-by: Csaba Henk <csaba@redhat.com>
* fuse: add --lru-limit optionAmar Tumballi2018-12-143-51/+86
  The inode LRU mechanism is moot in fuse xlator (i.e. there is no limit for the LRU list), as fuse inodes are referenced from kernel context, and thus they can only be dropped on request of the kernel. This might result in a high number of passive inodes which are useless for the glusterfs client, causing a significant memory overhead. This change tries to remedy this by extending the LRU semantics and allowing a finite limit to be set on the fuse inode LRU.
  A brief history of the problem: When gluster's inode table was designed, fuse didn't have any 'invalidate' method, which means a userspace application could never ask the kernel to send a 'forget()' fop; instead it had to wait for the kernel to send it based on the kernel's parameters. The inode table remembers the number of times the kernel has cached the inode based on the 'nlookup' parameter. The 'nlookup' field is not used by any other entry point (like server-protocol, gfapi etc). Hence the inode_table of the fuse module always has to have lru-limit as '0', which means no limit. GlusterFS always had to keep all inodes in memory, as the kernel would have had a reference to them. Again, the reason for this is that the kernel's glusterfs inode reference was a pointer to the 'inode_t' structure in glusterfs. As it is a pointer, we could never free it (to prevent segfault, or memory corruption).
  Solution: In the inode table, handle the prune case of inodes with 'nlookup' differently, and call an 'invalidator' method, which in this case is fuse_invalidate(), and it sends the request to the kernel for getting the forget request. When the kernel sends the forget, it means it has dropped all references to the inode, and it will send the forget with the 'nlookup' parameter too. We just need to make sure to reduce the 'nlookup' value we have when we get the forget. That automatically causes the relevant prune to happen.
  Credits: Csaba Henk, Xavier Hernandez, Raghavendra Gowdappa, Nithya B
  fixes: bz#1560969
  Change-Id: Ifee0737b23b12b1426c224ec5b8f591f487d83a2
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
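The prune/invalidate/forget interplay described above can be sketched with a toy inode table. Everything here is hypothetical and heavily simplified: `struct toy_inode`, `struct toy_table`, `toy_prune`, `toy_forget` and `count_invalidate` are illustrative names (the last one standing in for fuse_invalidate()), and the array index order stands in for LRU order.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INODES 8

/* Toy inode; not glusterfs' inode_t. */
struct toy_inode {
    int in_use;
    int invalidate_sent;   /* invalidation already requested */
    unsigned long nlookup; /* lookups the kernel still remembers */
};

struct toy_table {
    struct toy_inode inodes[MAX_INODES]; /* index order == LRU order */
    size_t lru_count;                    /* inodes currently in use */
    size_t lru_limit;                    /* 0 means "no limit" */
    void (*invalidator)(struct toy_inode *);
};

static int invalidate_calls;

/* Stand-in for asking the kernel to drop its references. */
static void
count_invalidate(struct toy_inode *in)
{
    (void)in;
    invalidate_calls++;
}

/* Walk the oldest inodes while the table is over its limit. Inodes the
 * kernel no longer references (nlookup == 0) are freed right away; for
 * the others only an invalidation is requested. Returns inodes freed. */
static size_t
toy_prune(struct toy_table *t)
{
    size_t freed = 0;

    assert(t != NULL);
    if (t->lru_limit == 0 || t->lru_count <= t->lru_limit)
        return 0;

    size_t excess = t->lru_count - t->lru_limit;
    for (size_t i = 0; i < MAX_INODES && excess > 0; i++) {
        struct toy_inode *in = &t->inodes[i];

        if (!in->in_use)
            continue;
        if (in->nlookup == 0) {
            in->in_use = 0;
            t->lru_count--;
            freed++;
        } else if (!in->invalidate_sent) {
            in->invalidate_sent = 1;
            if (t->invalidator)
                t->invalidator(in);
        }
        excess--;
    }
    return freed;
}

/* The kernel's forget carries the nlookup count it is giving up. Once
 * the count reaches zero, the next prune can free the inode. */
static size_t
toy_forget(struct toy_table *t, struct toy_inode *in, unsigned long nlookup)
{
    assert(t != NULL && in != NULL);
    in->nlookup = nlookup >= in->nlookup ? 0 : in->nlookup - nlookup;
    return toy_prune(t);
}
```

In this sketch an inode with a non-zero nlookup is never freed by the prune itself; it only becomes eligible after a forget brings the count down to zero.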
* performance/rda: Fixed dict_t memory leakN Balachandran2018-12-141-8/+0
  Removed all references to dict_t xdata_from_req which is allocated but not used anywhere. It is also not cleaned up and hence causes a memory leak.
  Change-Id: I2edb857696191e872ad12a12efc36999626bacc7
  fixes: bz#1659432
  Signed-off-by: N Balachandran <nbalacha@redhat.com>
* locks: handle "clear locks" xattr in fgetxattr tooCsaba Henk2018-12-143-50/+82
  The lock clearing procedure was kicked in only in getxattr context. We need it to work the same way if it's triggered via fgetxattr (as is the case with interrupt handling).
  Also cleaned up the instrumentation a bit (more logs, proper management of allocated data).
  updates: #465
  Change-Id: Icfca26ee181da3b8e15ca3fcf61cd5702e2730c8
  Signed-off-by: Csaba Henk <csaba@redhat.com>
* Multiple posix related files: several modificationsYaniv Kaul2018-12-146-232/+169
  Just looked at posix.c and related code and performed some changes and cleanups. The only important one is #3 below, but surely the others (#2 and #4) need careful review. Changes to other files are as they were related to code paths in posix.c. I'll send a separate patch for other posix related files.
  Main changes:
  1. Proper initialization for parameters, where it made sense.
  2. Logged outside the lock in several places.
  3. Moved from CALLOC to MALLOC where it made sense.
  4. Aligned structures.
  5. Moved dictionary functions to use _sizen where possible (dict_get() -> dict_get_sizen() for example).
  Compile-tested only!
  Change-Id: Ia84699fb495e06d095339c91c1ba770d1393bb6c
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* cluster/ec: NULL pointer deferencing clang fixSheetal Pamecha2018-12-141-1/+0
  Removing VALIDATE_OR_GOTO check on "this"
  Change-Id: I154deaca5302b41c1cafd87077de880dd03ec613
  Updates: bz#1622665
  Signed-off-by: Sheetal Pamecha <sheetal.pamecha08@gmail.com>
* clang: Fix various missing checks for empty listShyamsundarR2018-12-149-118/+159
  When using list_for_each_entry(_safe) functions, care needs to be taken that the lists passed in are not empty, as these functions are not empty-list safe. The clang scan reported various points where this pattern could be caught, and this patch fixes the same.
  Additionally the following changes are present in this patch:
  - Added an explicit op_ret setting in the error case in the macro MAKE_INODE_HANDLE to address another clang issue reported
  - Minor refactoring of some functions in quota code, to address possible allocation failures in certain functions (which in turn cause possible empty lists to be passed around)
  Change-Id: I1e761a8d218708f714effb56fa643df2a3ea2cc7
  Updates: bz#1622665
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
* all: remove code which is not being considered in buildAmar Tumballi2018-12-13113-37439/+0
  These xlators are now removed from build as per discussion/announcement done at https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html
  * move rot-13 to playground, as it is used only as demo purpose, and is documented in many places.
  * Removed code of below xlators:
    - cluster/stripe
    - cluster/tier
    - features/changetimerecorder
    - features/glupy
    - performance/symlink-cache
    - encryption/crypt
    - storage/bd
    - experimental/posix2
    - experimental/dht2
    - experimental/fdl
    - experimental/jbr
  updates: bz#1635688
  Change-Id: I1d2d63c32535e149bc8dcb2daa76236c707996e8
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* xlator: make 'xlator_api' mandatoryAmar Tumballi2018-12-1325-90/+252
  * Remove the options to load old symbol.
  * keep only 'xlator_api' symbol from being exported using xlator.sym
  * add xlator_api to all the xlators where it's missing
  NOTE: This covers all the xlators which have at least a test case to validate their loading. If there is a translator which doesn't have any test, then we should probably remove that from the codebase.
  fixes: #164
  Change-Id: Ibcdc8c9844cda6b4463d907a15813745d14c1ebb
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* symlink-cache: remove from the buildAmar Tumballi2018-12-131-1/+1
  symlink-cache was written as an experiment to reduce the load on 'build' systems, which keep doing symlink resolution to get the proper header files. But for the last 6+ years there was no way to add it to the volfile using gluster cli, and hence it was not supported anymore. As it is not maintained, and as announced on [1], we are planning to remove it from the build system.
  [1] - https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html
  updates: bz#1635688
  Change-Id: Iaa25069bceed04cf65f79a4b4a02c05cee848eb5
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* [geo-rep]: Worker still ACTIVE after killing bricksMohit Agrawal2018-12-137-40/+211
  Problem: In the changelog xlator, after destroying the listener it calls unlink to delete the changelog socket file, but the socket file reference is not cleaned up from process memory.
  Solution:
  1) To clean up the reference completely from process memory, serialize transport cleanup for changelog and then unlink the socket file
  2) Brick xlator will notify GF_EVENT_PARENT_DOWN to the next xlator only after cleaning up all xprts
  Test: To test the same, run the steps below
  1) Set up some volume and enable brick mux
  2) Kill any one brick with gf_attach
  3) Check the changelog socket specific to the killed brick in lsof; it should be cleaned up completely
  fixes: bz#1600145
  Change-Id: Iba06cbf77d8a87b34a60fce50f6d8c0d427fa491
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* afr: some minor itable related cleanupsRavishankar N2018-12-124-12/+17
  - this->itable always needs to be allocated, hence move it outside afr_selfheal_daemon_init().
  - Invoke afr_selfheal_daemon_init() only for self-heal daemon case.
  - remove redundant itable allocation in afr_discover().
  - destroy itable in fini.
  Updates bz#1193929
  Change-Id: Ib28b50b607386f5a5aa7d2f743c8b506ccb10eae
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* copy_file_range support in GlusterFSRaghavendra Bhat2018-12-1226-10/+1051
  * libglusterfs changes to add new fop
  * Fuse changes:
    - Changes in fuse bridge xlator to receive and send responses
  * posix changes to perform the op on the backend filesystem
  * protocol and rpc changes for sending and receiving the fop
  * gfapi changes for performing the fop
  * tools: glfs-copy-file-range tool for testing copy_file_range fop
    - Although copy_file_range support has been added to the upstream fuse kernel module, no release has been made yet of a kernel which contains the support. It is expected to come in the upcoming release of linux-4.20. So, as of now, executing the copy_file_range fop on a fuse based filesystem results in the fuse kernel module sending read on the source fd and write on the destination fd. Therefore a small gfapi based tool has been written to be able to test the copy_file_range fop. This tool is similar (in functionality) to the example program given in the copy_file_range man page. So, running regular copy_file_range on a fuse mount point and running the gfapi based glfs-copy-file-range tool gives some idea about how fast the copy_file_range (or reflink) can be. On the local machine this was the result obtained.
      mount -t glusterfs workstation:new /mnt/glusterfs
      [root@workstation ~]# cd /mnt/glusterfs/
      [root@workstation glusterfs]# ls
      file
      [root@workstation glusterfs]# cd
      [root@workstation ~]# time /tmp/a.out /mnt/glusterfs/file /mnt/glusterfs/new
      real 0m6.495s
      user 0m0.000s
      sys 0m1.439s
      [root@workstation ~]# time glfs-copy-file-range $(hostname) new /tmp/glfs.log /file /rrr
      OPEN_SRC: opening /file is success
      OPEN_DST: opening /rrr is success
      FSTAT_SRC: fstat on /rrr is success
      copy_file_range successful
      real 0m0.309s
      user 0m0.039s
      sys 0m0.017s
      This tool needs the following arguments:
      1) hostname
      2) volume name
      3) log file path
      4) source file path (relative to the gluster volume root)
      5) destination file path (relative to the gluster volume root)
      "glfs-copy-file-range <hostname> <volume> <log file path> <source> <destination>"
    - Added a testcase as well to run glfs-copy-file-range tool
  * io-stats changes to capture the fop for profiling
  * NOTE:
    - Added conditional check to see whether the copy_file_range syscall is available or not. If not, then return ENOSYS.
    - Added conditional check for kernel minor version in fuse_kernel.h and fuse-bridge while referring to copy_file_range. And the kernel minor version is kept as it is, i.e. 24. Increment it in future when there is a kernel release which contains the support for copy_file_range fop in fuse kernel module.
  * The document which contains a writeup on this enhancement can be found at https://docs.google.com/document/d/1BSILbXr_knynNwxSyyu503JoTz5QFM_4suNIh2WwrSc/edit
  Change-Id: I280069c814dd21ce6ec3be00a884fc24ab692367
  updates: #536
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
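The two behaviours mentioned above (a compile-time availability check that yields ENOSYS, and the read-plus-write path that is taken when no in-kernel copy is available) can be sketched in plain C against the Linux syscall. This is not the patch's code: `try_copy_file_range`, `copy_read_write` and `copy_bytes` are hypothetical helpers, and falling back on any syscall failure is a simplification made for this sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Try copy_file_range at the current file offsets. If the build headers
 * do not define the syscall number, report ENOSYS instead. */
static ssize_t
try_copy_file_range(int fd_in, int fd_out, size_t len)
{
#ifdef SYS_copy_file_range
    return (ssize_t)syscall(SYS_copy_file_range, fd_in, NULL, fd_out, NULL,
                            len, 0U);
#else
    (void)fd_in;
    (void)fd_out;
    (void)len;
    errno = ENOSYS;
    return -1;
#endif
}

/* Fallback: one read() into a user-space buffer, then write() it all. */
static ssize_t
copy_read_write(int fd_in, int fd_out, size_t len)
{
    char buf[4096];
    size_t want = len < sizeof(buf) ? len : sizeof(buf);
    ssize_t n = read(fd_in, buf, want);

    if (n <= 0)
        return n; /* 0 on end of input, -1 on error */

    ssize_t off = 0;
    while (off < n) {
        ssize_t w = write(fd_out, buf + off, (size_t)(n - off));
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        off += w;
    }
    return n;
}

/* Copy up to len bytes from fd_in to fd_out. The syscall is tried first;
 * after its first failure the rest is copied with read()/write().
 * Returns the number of bytes copied, or -1 on error. */
static ssize_t
copy_bytes(int fd_in, int fd_out, size_t len)
{
    size_t done = 0;
    int use_syscall = 1;

    assert(fd_in >= 0 && fd_out >= 0);
    while (done < len) {
        ssize_t n;

        if (use_syscall) {
            n = try_copy_file_range(fd_in, fd_out, len - done);
            if (n < 0) {
                use_syscall = 0; /* e.g. ENOSYS, EXDEV or EINVAL */
                continue;
            }
        } else {
            n = copy_read_write(fd_in, fd_out, len - done);
            if (n < 0)
                return -1;
        }
        if (n == 0)
            break; /* end of input */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

The read()/write() path moves every byte through user space, whereas a successful copy_file_range call lets the kernel or the filesystem do the copy itself.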
* cluster/afr: Do not update read_subvol in inode_ctx after rename/link fopkarthik-us2018-12-121-1/+3
  Since rename/link fops on a file will not change any data in it, it should not update the read_subvol values in the inode_ctx, which interprets the data & metadata readable subvols for that file. The old read_subvol values should be retained even after the rename/link operations.
  Change-Id: I068044a426823a566f5bea8aa063cd689199d6dd
  fixes: bz#1657783
  Signed-off-by: karthik-us <ksubrahm@redhat.com>
* afr: Resource leak coverity fixesBhumika Goyal2018-12-111-2/+13
  Problem reported by Coverity: Leak of memory or pointers to system resources.
  Deallocate the memory pointed to by xattr_serz as the memory reference is not stored anywhere.
  Fixes CID: 1124760, 124787, 1382418
  Change-Id: Ib9c2ef28c52e2d43de2552cfd959a98b26272bc1
  updates: bz#789278
  Signed-off-by: Bhumika Goyal <bgoyal@redhat.com>
* write-behind/bit-rot: fix identifierrishubhjain2018-12-112-2/+2
  Rename the identifiers, bit-rot-server to bit-rot in bit-rot.c & write-ahead to write-behind in write-behind.c to ensure GD2 understands the options
  Change-Id: Id271ae97de2e54f4e30174482c4e1fb6afc728d3
  Fixes: #164
  Signed-off-by: rishubhjain <rishubhjain47@gmail.com>
* nfs: memory leak issue reported by asanHarpreet Kaur2018-12-111-0/+3
  This patch fixes Direct leaks in exports.c. Leaks are happening in exp_file_parse.
  SUMMARY: AddressSanitizer: 5120 byte(s) leaked in 20 allocation(s).
  SUMMARY: AddressSanitizer: 512 byte(s) leaked in 4 allocation(s).
  Updates: bz#1633930
  Change-Id: Ib4474f8f6c65d737ed54ed35b4234410d1fd673e
  Signed-off-by: Harpreet Kaur <hlalwani@redhat.com>
* encryption: remove crypt xlator from buildAmar Tumballi2018-12-111-2/+2
  Based on the proposal to remove a few features as they are not actively maintained [1], removing crypt translator from the build.
  [1] - https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html
  Crypt xlator helped in on-disk / at-rest encryption of data. But as there are currently no maintainers for this, we are planning to remove it from the master codebase. We are planning to host these experimental/tech-preview xlators in another repository, so people who want to contribute can still use the bits.
  updates: bz#1635688
  Change-Id: I7f2453907a595c34f635a88c49aab0845369c6e7
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* posix: posix_health_check_thread_proc crash due to priv is NULLMohit Agrawal2018-12-112-9/+14
  Problem: posix_fini sends a cancellation request to the health_check thread and cleans up priv without ensuring the health_check thread is not still running
  Solution: Make the health_check && disk_space threads joinable and call gf_thread_cleanup_xint to wait until the thread has finished
  Change-Id: I4d37b08138766881dab0922a47ed68a2c3411f13
  fixes: bz#1636570
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* glusterd: Resolve memory leak in some glusterd functionsMohit Agrawal2018-12-101-0/+6
  Problem: Functions allocate memory for the req structure, but after submitting the request they fail to clean up the memory
  Solution: After submitting the request, clean up the allocated memory
  Change-Id: I8f995787ed8986b882f008ccd588670b5d4139f5
  updates: bz#1633930
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* glusterd: fix get_mux_limit_per_process to read default valueAtin Mukherjee2018-12-074-10/+4
  get_mux_limit_per_process () reads the global option dictionary and, in case it doesn't find the key, assumes that the cluster.max-bricks-per-process option isn't configured. However, the default value should be picked up in such a case.
  Change-Id: I35dd8da084adbf59793d58557e818d8e6c17f9f3
  Fixes: bz#1656951
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* performance/readdir-ahead: update stats from prefetched dentriesRaghavendra Gowdappa2018-12-072-6/+93
  Stats from prefetched dentries should be invalidated only if the files pointed to by those dentries were written to in the window of prefetching. Otherwise it's safe to use these stats.
  Change-Id: I9ea5aeea4c75dfa03387fca32c626cb4e693290d
  Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
  Fixes: bz#1656348
* New xlator option to control enable/disable of xlators in Gd2Aravinda VK2018-12-0710-0/+80
  Since glusterd2 doesn't maintain the xlator option details in code, it directly reads the xlator options table from `*.so` files. To support enabling and disabling an xlator, a new option is added to the option table with the same name as the xlator itself. This change will not affect the functionality with glusterd1.
  Change-Id: I23d9e537f3f422de72ddb353484466d3519de0c1
  updates: #302
  Signed-off-by: Aravinda VK <avishwan@redhat.com>
* all: add xlator_api to many translatorsAmar Tumballi2018-12-0633-28/+472
  Fixes: #164
  Change-Id: I93ad6f0232a1dc534df099059f69951e1339086f
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* libglusterfs: Move devel headers under glusterfs directoryShyamsundarR2018-12-05455-1547/+1547
  libglusterfs devel package headers are referenced in code using include semantics for a program. While this works, it can be better, especially when dealing with out of tree xlator builds or, in general, out of tree devel package usage.
  Towards this, the following changes are done:
  - moved all devel headers under a glusterfs directory
  - Included these headers using system header notation <> in all code outside of libglusterfs
  - Included these headers using own program notation "" within libglusterfs
  This change, although big, is just moving around the headers and making it correct when including these headers from other sources. This helps us correctly include libglusterfs includes without namespace conflicts.
  Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
  Updates: bz#1193929
  Signed-off-by: ShyamsundarR <srangana@redhat.com>
* protocol/server: support server.all-squashXie Changlong2018-12-053-19/+39
  We still use gnfs on our side, so do a little work to support server.all-squash. Just like server.root-squash, it's also a volume wide option. Also see bz#1285126
  $ gluster volume set <VOLNAME> server.all-squash on
  Note: If you enable server.root-squash and server.all-squash at the same time, only server.all-squash works. Please refer to the following table
  +----------------+-----------------+-----------------------------+
  |                | all_squash      | no_all_squash               |
  +----------------+-----------------+-----------------------------+
  | root_squash    | anonuid/anongid | anonuid/anongid for root    |
  |                |                 | useruid/usergid for no-root |
  +----------------+-----------------+-----------------------------+
  | no_root_squash | anonuid/anongid | useruid/usergid             |
  +----------------+-----------------+-----------------------------+
  Updates bz#1285126
  Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
  Signed-off-by: Xue Chuanyu <xuechuanyu@cmss.chinamobile.com>
  Change-Id: Iea043318fe6e9a75fa92b396737985062a26b47e
* glusterd: glusterd to regenerate volfiles when GD_OP_VERSION_MAX changesAtin Mukherjee2018-12-054-12/+160
  While glusterd has an infra to allow post install of spec to bring it up in the interim upgrade mode to allow all the volfiles to be regenerated with the latest executable, in the container world the same methodology is not followed, as the container image always points to the specific gluster rpm and the gluster rpm doesn't go through an upgrade process.
  This fix does the following:
  1. If the glusterd.upgrade file doesn't exist, regenerate the volfiles
  2. If the maximum-operating-version read from glusterd.upgrade doesn't match GD_OP_VERSION_MAX, glusterd detects it to be a version where new options are introduced and regenerates the volfiles.
  Tests done:
  1. Bring up glusterd, check if the glusterd.upgrade file has been created with the GD_OP_VERSION_MAX value.
  2. Post 1, restart glusterd and check glusterd hasn't regenerated the volfiles, as there is no change in the GD_OP_VERSION_MAX vs the op_version read from the file.
  3. Bump up the GD_OP_VERSION_MAX in the code by 1 and post compilation restart glusterd, where the volfiles should be regenerated again.
  Note: The old way of having volfiles regenerated during an rpm upgrade is kept as it is for now, but eventually this can be sunset.
  Change-Id: I75b49a1601c71e99f6a6bc360dd12dd03a96414b
  Fixes: bz#1651463
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
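The two regeneration rules boil down to a single predicate. A minimal sketch, assuming the stored version has already been read from the file; `need_volfile_regen` and its parameters are hypothetical names, not glusterd functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Decide whether volfiles must be regenerated at startup:
 * rule 1: no glusterd.upgrade file yet -> regenerate;
 * rule 2: stored op-version differs from the build's maximum
 *         op-version -> regenerate. */
static bool
need_volfile_regen(bool upgrade_file_exists, int stored_op_version,
                   int current_op_version_max)
{
    assert(current_op_version_max > 0);
    if (!upgrade_file_exists)
        return true;
    return stored_op_version != current_op_version_max;
}
```

This matches the three tests listed above: the first start regenerates, a plain restart does not, and a restart after bumping the maximum op-version regenerates again.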
* debug/io-stats: Fix json outputChris Holcombe2018-12-051-33/+29
  Summary: The json being output by the io-stats debug xlator quotes the numbers. This is not necessary and makes parsing in strongly typed languages more difficult.
  Change-Id: I3ac13700e2c52dbdc29d0bcdd39896d7871f36fe
  fixes: bz#1654521
  Signed-off-by: Chris Holcombe <xfactor973@gmail.com>
* xlators/mgmt/glusterd/src/glusterd-volgen.c: use dict_ new functionsYaniv Kaul2018-12-051-237/+230
  In a previous patch (https://review.gluster.org/20769) we've added the key length to be passed to dict_* funcs, to remove the need to strlen() it. This patch makes use of these functions over this whole file. Please review carefully, as there are many many changes there.
  Compile-tested only!
  updates: bz#1193929
  Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
  Change-Id: I2e1ee340300ec330936c31becda6bfe1b6533281
* glusterd: set cluster.max-bricks-per-process to 250Atin Mukherjee2018-12-051-1/+1
  Commit 6821cec changed this default from 0 to 250 in the option table, however the same wasn't done in the global option table.
  Change-Id: I6075f2ebc51e839510d6492fb62e706deb2d845b
  Fixes: bz#1652118
  Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: migrating profile commands to mgmt_v3 frameworkSanju Rakonde2018-12-045-23/+244
  Current profile commands use the op_state machine framework. Porting it to use the mgmt_v3 framework.
  The following tests were performed on the patch:
  case 1:
  1. On a 3 node cluster, created and started 3 volumes
  2. Mounted all the three volumes and wrote some data
  3. Started profile operation for all the volumes
  4. Ran "gluster v status" from N1, "gluster v profile <volname1> info" from N2, "gluster v profile <volname2> info" from N3 simultaneously in a loop for around 10000 times
  5. Didn't find any cores generated.
  case 2:
  1. Repeat the steps 1, 2 and 3 from case 1.
  2. Ran "gluster v status" from N1, "gluster v profile <volname1> info" from N2 (terminal 1), "gluster v profile <volname2> info" from N2 (terminal 2) simultaneously in a loop.
  3. No cores were generated.
  fixes: bz#1654181
  Change-Id: I83044cf5aee3970ef94066c89fcc41783ed468a6
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* rpc: bump up server.event-threadsMilind Changire2018-12-042-2/+2
  Problem: A single event-thread causes performance issues in the system.
  Solution: Bump up event-threads to 2 to make the system more performant. This helps in making the system more responsive and helps avoid the ping-timer-expiry problem as well. However, setting the event-threads to 2 is not the only thing required to avoid ping-timer-expiry issues.
  Change-Id: Idb0fd49e078db3bd5085dd083b0cdc77b59ddb00
  fixes: bz#1653277
  Signed-off-by: Milind Changire <mchangir@redhat.com>
* io-cache: xdata needs to be passed for readv operationsSoumya Koduri2018-12-042-2/+16
  io-cache xlator has been skipping xdata references when the data needs to be read into page cache. This patch fixes the same.
  Note: similar changes may be needed for other fops as well which are handled by io-cache.
  Change-Id: I28d73d4ba471d13eb55d0fd0b5197d222df77a2a
  updates: bz#1648768
  Signed-off-by: Soumya Koduri <skoduri@redhat.com>
* glusterd: perform rcu_read_lock/unlock() under cleanup_lock mutexSanju Rakonde2018-12-0316-199/+213
  Problem: glusterd should not try to acquire locks on any resources when it has already received a SIGTERM and cleanup has started. Otherwise we might hit a segfault, since the thread which is going through the cleanup path will be freeing up the resources and some other thread might be trying to acquire locks on freed resources.
  Solution: perform rcu_read_lock/unlock() under cleanup_lock mutex.
  fixes: bz#1654270
  Change-Id: I87a97cfe4f272f74f246d688660934638911ce54
  Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* server: Resolve memory leak path in server_initMohit Agrawal2018-12-032-40/+47
  Problem:
  1) server_init does not clean up allocated resources when it fails before returning an error
  2) dict leak at the time of destroying the graph
  Solution:
  1) free resources in case server_init fails
  2) Take dict_ref of graph xlator before destroying the graph to avoid leak
  Change-Id: I9e31e156b9ed6bebe622745a8be0e470774e3d15
  fixes: bz#1654917
  Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* features/bitrot: compare the signature with proper lengthRaghavendra Bhat2018-12-032-8/+14
  * The scrubber was comparing the checksum of the file that it calculated (by reading the file) with the on disk signature (stored via xattr) wrongly. It was using strlen to calculate the signature length, while the actual length of the signature is given by the brick. Just use the actual length that the brick provides instead of trying to calculate the signature length via strlen API.
  * In posix, gfid2path was using the same string that contains the list of all the xattrs of file to save the value of the gfid2path xattr as well. This causes confusion when gfid2path xattr is queried by scrubber for getting the actual path of a corrupted file. Use separate string to fetch the value of the xattr instead of the string that contains the list of xattrs.
  Change-Id: I2d664ab524d2b312233476cb35863dde3122e9a9
  fixes: bz#1654805
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
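Why strlen is the wrong tool for a binary signature can be shown with a short sketch. A checksum may contain zero bytes, and strlen stops at the first one. Both functions below are hypothetical illustrations, not the scrubber's code; `signature_equal_strlen` only mimics the kind of mistake described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Compare two signatures using explicitly supplied lengths. */
static bool
signature_equal(const unsigned char *computed, size_t computed_len,
                const unsigned char *stored, size_t stored_len)
{
    assert(computed != NULL && stored != NULL);
    return computed_len == stored_len &&
           memcmp(computed, stored, stored_len) == 0;
}

/* Flawed variant kept for contrast: strlen() treats the signature as
 * a C string and stops at the first zero byte, so later bytes are
 * never compared. */
static bool
signature_equal_strlen(const unsigned char *computed,
                       const unsigned char *stored)
{
    size_t len = strlen((const char *)stored);
    return memcmp(computed, stored, len) == 0;
}
```

Two signatures that share a prefix up to an embedded zero byte compare as equal with the strlen variant even though they differ afterwards, which would let a corrupted file pass the check.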
* afr: assign gfid during name heal when no 'source' is present.Ravishankar N2018-12-034-52/+52
  Problem: If parent dir is in split-brain or has dirty xattrs set, and the file has gfid missing on one of the bricks, then name heal won't assign the gfid.
  Fix: Use the brick we select the gfid from as the 'source'.
  Note: Problem was found while trying to debug a split-brain issue on Cynthia Zhou's setup.
  updates: bz#1637249
  Change-Id: Id088d4f0fb017aa35122de426654194e581ed742
  Reported-by: Cynthia Zhou <cynthia.zhou@nokia-sbell.com>
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>