summaryrefslogtreecommitdiffstats
path: root/libglusterfs/src/syncop.c
Commit message (Collapse)AuthorAgeFilesLines
* libglusterfs: add library wrapper for time()Dmitry Antipov2020-08-171-1/+1
| | | | | | | | | Add thin convenient library wrapper gf_time(), adjust related users and comments as well. Change-Id: If8969af2f45ee69c30c3406bce5baa8305fb7f80 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Updates: #1002
* libglusterfs: annotate synctasks with ThreadSanitizer APIDmitry Antipov2020-08-111-0/+35
| | | | | | | | | | If --enable-tsan is specified and API headers are detected, annotate synctask context initialization and context switch with ThreadSanitizer API. Change-Id: I7ac4085d7ed055448f1fc80df7d59905d129f186 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Fixes: #1400
* syncop: improve scaling and implement more toolsXavi Hernandez2020-05-131-66/+240
| | | | | | | | | | | | | | | | | | | | The current scaling of the syncop thread pool is not working properly and can leave some tasks in the run queue more time than necessary when the maximum number of threads is not reached. This patch provides a better scaling condition to react faster to pending work. Condition variables and sleep in the context of a synctask have also been implemented. Their purpose is to replace regular condition variables and sleeps that block synctask threads and prevent other tasks to be executed. The new features have been applied to several places in glusterd. Change-Id: Ic50b7c73c104f9e41f08101a357d30b95efccfbf Fixes: #1116 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* posix - fix seek functionalityBarak Sason Rofman2020-05-071-5/+6
| | | | | | | | | A wrong pointer check causes the offset returned by seek to be always wrong fixes: #1228 Change-Id: Iac4c6a163175617ac4f14544fc6b7c6fb4041cd6 Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
* core: fix memory pool management racesXavi Hernandez2020-02-181-0/+7
| | | | | | | | | | | | | | | | | | | | | | Objects allocated from a per-thread memory pool keep a reference to it to be able to return the object to the pool when not used anymore. The object holding this reference can have a long life cycle that could survive a glfs_fini() call. This means that it's unsafe to destroy memory pools from glfs_fini(). Another side effect of destroying memory pools from glfs_fini() is that the TLS variable that points to one of those pools cannot be reset for all alive threads. This means that any attempt to allocate memory from those threads will access already free'd memory, which is very dangerous. To fix these issues, mem_pools_fini() doesn't destroy pool lists anymore. Only at process termination the pools are destroyed. Change-Id: Ib189a5510ab6bdac78983c6c65a022e9634b0965 Fixes: bz#1801684 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* core: avoid dynamic TLS allocation when possibleXavi Hernandez2019-04-241-114/+19
| | | | | | | | | | | | | | | | | | | Some interdependencies between logging and memory management functions make it impossible to use the logging framework before initializing memory subsystem because they both depend on Thread Local Storage allocated through pthread_key_create() during initialization. This causes a crash when we try to log something very early in the initialization phase. To prevent this, several dynamically allocated TLS structures have been replaced by static TLS reserved at compile time using '__thread' keyword. This also reduces the number of error sources, making initialization simpler. Updates: bz#1193929 Change-Id: I8ea2e072411e30790d50084b6b7e909c7bb01d50 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* core: make gf_thread_create() easier to useXavi Hernandez2019-02-011-13/+3
| | | | | | | | | | | | | | This patch creates a specific function to set the thread name using a string format and a variable argument list, like printf(). This function is used to set the thread name from gf_thread_create(), which now accepts a variable argument list to create the full name. It's not necessary anymore to use a local array to build the name of the thread. This is done automatically. Change-Id: Idd8d01fd462c227359b96e98699f8c6d962dc17c Updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* syncop: remove unnecessary call to gf_backtrace_save()Xavi Hernandez2019-01-311-1/+0
| | | | | | | | | | | | A call to gf_backtrace_save() was done on each context switch of a synctask. The backtrace is generated writing to the filesystem, so it can have an important impact on latency. The generated backtrace was not used anywhere, so it's been removed. Change-Id: I399a93b932c5b6e981c696c72c3e1ef44710ba52 Updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* copy_file_range support in GlusterFSRaghavendra Bhat2018-12-121-1/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * libglusterfs changes to add new fop * Fuse changes: - Changes in fuse bridge xlator to receive and send responses * posix changes to perform the op on the backend filesystem * protocol and rpc changes for sending and receiving the fop * gfapi changes for performing the fop * tools: glfs-copy-file-range tool for testing copy_file_range fop - Although, copy_file_range support has been added to the upstream fuse kernel module, no release has been made yet of a kernel which contains the support. It is expected to come in the upcoming release of linux-4.20 So, as of now, executing copy_file_range fop on a fused based filesystem results in fuse kernel module sending read on the source fd and write on the destination fd. Therefore a small gfapi based tool has been written to be able test the copy_file_range fop. This tool is similar (in functionality) to the example program given in copy_file_range man page. So, running regular copy_file_range on a fuse mount point and running gfapi based glfs-copy-file-range tool gives some idea about how fast, the copy_file_range (or reflink) can be. On the local machine this was the result obtained. mount -t glusterfs workstation:new /mnt/glusterfs [root@workstation ~]# cd /mnt/glusterfs/ [root@workstation glusterfs]# ls file [root@workstation glusterfs]# cd [root@workstation ~]# time /tmp/a.out /mnt/glusterfs/file /mnt/glusterfs/new real 0m6.495s user 0m0.000s sys 0m1.439s [root@workstation ~]# time glfs-copy-file-range $(hostname) new /tmp/glfs.log /file /rrr OPEN_SRC: opening /file is success OPEN_DST: opening /rrr is success FSTAT_SRC: fstat on /rrr is success copy_file_range successful real 0m0.309s user 0m0.039s sys 0m0.017s This tool needs following arguments 1) hostname 2) volume name 3) log file path 4) source file path (relative to the gluster volume root) 5) destination file path (relative to the gluster volume root) "glfs-copy-file-range <hostname> <volume> <log file path> <source> <destination>" - Added a testcase as well to run glfs-copy-file-range tool * io-stats changes to capture the fop for profiling * NOTE: - Added conditional check to see whether the copy_file_range syscall is available or not. If not, then return ENOSYS. - Added conditional check for kernel minor version in fuse_kernel.h and fuse-bridge while referring to copy_file_range. And the kernel minor version is kept as it is. i.e. 24. Increment it in future when there is a kernel release which contains the support for copy_file_range fop in fuse kernel module. * The document which contains a writeup on this enhancement can be found at https://docs.google.com/document/d/1BSILbXr_knynNwxSyyu503JoTz5QFM_4suNIh2WwrSc/edit Change-Id: I280069c814dd21ce6ec3be00a884fc24ab692367 updates: #536 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* libglusterfs: Move devel headers under glusterfs directoryShyamsundarR2018-12-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | libglusterfs devel package headers are referenced in code using include semantics for a program, this while it works can be better especially when dealing with out of tree xlator builds or in general out of tree devel package usage. Towards this, the following changes are done, - moved all devel headers under a glusterfs directory - Included these headers using system header notation <> in all code outside of libglusterfs - Included these headers using own program notation "" within libglusterfs This change although big, is just moving around the headers and making it correct when including these headers from other sources. This helps us correctly include libglusterfs includes without namespace conflicts. Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b Updates: bz#1193929 Signed-off-by: ShyamsundarR <srangana@redhat.com>
* libglusterfs multiple files: remove dead initilizationYaniv Kaul2018-11-111-39/+1
| | | | | | | | | | | | | Per newer GCC releases and clang-scan, some trivial dead initialization (values that were set but were never read) were removed. Compile-tested only! updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: Ia9959b2ff87d2e9cb46864e68ffe7dccb984db34
* syncop: check syncenv status before pthread_cond_timedwait() to avoid 600s ↵Kinglong Mee2018-10-091-4/+3
| | | | | | | | | | | timeout If a syncenv_task starts after syncenv_destroy, the syncenv_task enters a 600s timeout cond timedwait, and syncenv_destroy must waits it timeout. Change-Id: I972a2b231e50cbebd3c71707800e58033e40c29d updates: bz#1626313 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* Land part 2 of clang-format changesGluster Ant2018-09-121-2340/+2339
| | | | | Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
* coverity: libglusterfs issuesAmar Tumballi2018-08-191-0/+2
| | | | | | | | | CID: 1391415, 1274122, 1274201, 1382408, 1382437, 1389436 1288798, 1288106, 1288110 updates: bz#789278 Change-Id: I48c7a50f22f5f4580310040c66463d9f7dd26204 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* core (named threads): flood of -Wformat-truncation warnings with gcc-7.1Kaleb S. KEITHLEY2018-07-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Starting in Fedora 26 which has gcc-7.1.x, -Wformat-trunction is enabled with -Wformat, resulting in a flood of new warnings. This many warnings is a concern because it makes it hard(er) to see other warnings that should be addressed. An example is at https://kojipkgs.fedoraproject.org//packages/glusterfs/3.12.0/1.fc28/data/logs/x86_64/build.log For more info see https://review.gluster.org/#/c/18267/ I can't find much (or good) documentation on the heuristics the compiler uses for this warning. In the case of printing integer types it appears it looks at the available space in the destination and the range of values for the variable and/or its type. To address the specific question about why 0x3ff versus 0xfff to mask the value, either would suffice to hint to the compiler that the printed value will fit in three characters. But the loop is from 0...1023 (or 0...0x3ff if you prefer) so I chose that as a more "accurate" mask to use as it exactly matches the range of values of the loop. Fixes: bz#1492847 Change-Id: I6e309ba42159841131d8241bfc0566ef09e00aa9
* libglusterfs: Capture the dict response in syncop_xattrop_cbkkarthik-us2018-04-271-2/+16
| | | | | | | | | | | | | | | Problem: Currently it is not possible to capture the xattrs values which are set on the bricks by calling syncop_(f)xattrop, because the response dict is not being assigned to any of the dictionaries. Fix: In the xattrop callback capture the response dict and send it back to the caller if it is requested. Change-Id: I9de9bcd97d6008091c9b060bcca3676cb9ae8ef9 fixes: bz#1572076 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* libglusterfs/syncop: Handle barrier_{init/destroy} in error casesPranith Kumar K2018-04-231-4/+26
| | | | | | | BUG: 1568521 updates: bz#1568521 Change-Id: I53e60cfcaa7f8edfa5eca47307fa99f10ee64505 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* libglusterfs/syncop: Add syncop_entrylkRaghavendra G2018-02-131-0/+37
| | | | | | Change-Id: Idd86b9f0fa144c2316ab6276e2def28b696ae18a BUG: 1543279 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
* gfapi: return pre/post attributes from glfs_ftruncateKinglong Mee2018-02-121-2/+13
| | | | | | Updates: #389 Change-Id: I8faea0828921fb17f05f7321c3cb01747373f21e Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* gfapi: return pre/post attributes from glfs_fsync/fdatasyncKinglong Mee2018-02-121-2/+12
| | | | | | Updates: #389 Change-Id: I4153df72d5eeecefa7579170899db4c340128bea Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* gfapi: return pre/post attributes from glfs_pread/pwriteKinglong Mee2018-02-121-3/+18
| | | | | | | | | | | | | | | As nfs-ganesha, a wcc data contains pre/post attributes is return in read/write rpc reply. nfs-ganesha get those attributes by two getattr between the real read/write right now. But, gluster has return pre/post attributes from glusterfsd, those attributes are skipped in syncop/gfapi, if gfapi return them, the upper user (nfs-ganesha) can use them directly without any duplicate getattr. Updates: #389 Change-Id: I7b643ae4241cfe2aeb17063de00192d81674024a Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* rio/everywhere: add icreate/namelink fopSusant Palai2017-12-051-0/+76
| | | | | | | | | | | | | | | | | | | | | icreate creates inode, while namelink links the basename to it's parent gfid. For now mkdir is the primary user of these fops. Better distribution is acheived by creating the inode on ,(say) mds1 and linking the basename to it's parent gfid on mds2. The inode serves readdirp, stat etc. More details about the fops are present at: https://review.gluster.org/#/c/13395/3/design/DHT2/DHT2_Icreate_Namelink_Notes.md This backport of three patches from experimental branch. 1- https://review.gluster.org/#/c/18085/ 2- https://review.gluster.org/#/c/18086/ 3- https://review.gluster.org/#/c/18094/ Updates gluster/glusterfs#243 Change-Id: I1bd3d5a441a3cfab1acfeb52f15c6c867d362592 Signed-off-by: Susant Palai <spalai@redhat.com>
* libglusterfs: Add put fopPoornima G2017-12-051-0/+49
| | | | | | | | | | | | | | | | | Problem: It had been a longtime request to implement put fop in gluster. put fop in gluster may not have the exact sementics of HTTP PUT, but can be easily extended to do so. The subsequent patches, will contain more semantics on the put fop and its guarentees. Why compound fop framework is not used for put? Compound fop framework currently doesn't allow compounding of entry fop and inode fops, i.e. fops on multiple inodes cannot be combined in compound fop. Updates #353 Change-Id: Idb7891b3e056d46d570bb7e31bad1b6a28656ada Signed-off-by: Poornima G <pgurusid@redhat.com>
* libglusterfs: Name threads on creationRaghavendra Talur2017-07-191-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set names to threads on creation for easier debugging. Output of top -H -p <PID-OF-GLUSTERFSD> Before: 19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd 19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd 19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd 19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd After: 19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustertimer 19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustermemsweep 19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc0 19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc1 19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll0 19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteridxwrker 19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteriotwr0 19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrssign 19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrswrker 19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterclogecon 19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd0 19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd1 19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd2 19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixjan 19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixfsy 25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll1 5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll2 7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixhc Change-Id: Id5f333755c1ba168a2ffaa4fce6e71c375e10703 BUG: 1254002 Updates: #271 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: https://review.gluster.org/11926 Reviewed-by: Prashanth Pai <ppai@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht: rebalance perf enhancementSusant Palai2017-04-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Throttle settings "normal" and "aggressive" for rebalance did not have performance difference. normal mode spawns $(no. of cores - 4)/2 threads and aggressive spawns $(no. of cores - 4) threads. Though aggressive mode has twice the number of threads compared to that of normal mode, there was no performance gain when switched to aggressive mode from normal mode. RCA: During the course of debugging the above problem, we tried assigning migration job to migration threads spawned by rebalance, rather than synctasks(as there is more overhead associated to manage the task queue and threads). This gave us a significant improvement over rebalance under synctasks. This patch does not really gurantee that there will be a clear performance difference between normal and aggressive mode, but this patch certainly maximized the disk utilization for 1GBfiles run. Results: Test enviroment: Gluster Config: Number of Bricks: 2 (one brick per disk(RAID-6 12 disk)) Bricks: Brick1: server1:/brick/test1/1 Brick2: server2:/brick/test1/1 Options Reconfigured: performance.readdir-ahead: on server.event-threads: 4 client.event-threads: 4 1000 files with 1GB each were created/renamed such that all files will have server1 as cached and server2 as hashed, so that all files will be migrated. Test machines had 24 cores each. Results with/without synctask based migration: ----------------------------------------------- mode normal(10threads) aggressive(20threads) timetaken 0:55:30 (h:m:s) 0:56:3 (h:m:s) withsynctask timetaken with migrator 0:38:3 (h:m:s) 0:23:41 (h:m:s) threads From above table it can be seen that, there is a clear 2x perf gain between rebalance with synctask vs rebalance with migrator threads. Additionally this patch modifies the code so that caller will have the exact error number returned by dht_migrate_file(earlier the errno meaning was overloaded). This will help avoiding scenarios where migration failure due to ENOENT, can result in rebalance abort/failure. Change-Id: I8904e2fb147419d4a51c1267be11a08ffd52168e BUG: 1420166 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/16427 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* syncop: don't wake task in synctask_wake unless really neededRavishankar N2017-03-281-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In EC and AFR, we launch synctasks during self-heal. (i) These tasks usually stackwind a FOP to all its children and call synctask_yield() which does a swapcontext to synctask_switchto() and puts the task in syncenv's waitq by calling __wait(task). This happends as long as the FOP ckbs from all children haven't been received. (ii) For each FOP cbk, we call synctask_wake() which again does a swapcontext to synctask_switchto() which now puts the task in syncenv's runq by calling __run(task). When the task runs and the conext switches back to the FOP path, it puts the task in waitq because we haven't heard from all children as explained in (i). Thus we are unnecessarily using the swapcontext syscalls to just toggle the task back and forth between the waitq and runq. Fix: Store the stackwind count in new variable 'syncbarrier->waitfor' before winding the fop. In each cbk when we call synctask_wake(), perform an actual wake only if the cbk count == stackwind count. Change-Id: Id62d3b6ffed5a8c50f8b79267fb34e9470ba5ed5 BUG: 1434274 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: https://review.gluster.org/16931 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* syncop: Fix args for makecontextPranith Kumar K2017-03-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Passing 64bit arguments to makecontext is not portable and manpage says the following: <snip> On architectures where int and pointer types are the same size (e.g., x86-32, where both types are 32 bits), you may be able to get away with passing pointers as argu‐ ments to makecontext() following argc. However, doing this is not guaranteed to be portable, is undefined according to the standards, and won't work on architectures where pointers are larger than ints. Nevertheless, starting with version 2.8, glibc makes some changes to makecontext(), to permit this on some 64-bit architectures (e.g., x86-64). </snip> Since we do not depend on the arguments, it is better to change makecontext to not take any arguments. BUG: 1434274 Change-Id: Ic46c9e9faaeb2f78e4efde353ef861466515b1ec Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: https://review.gluster.org/16951 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* core: fix synclocks' handling of "woken" flagJeff Darcy2017-03-231-2/+3
| | | | | | | | | | | | | | | | The "woken" flag wasn't being reset when it should have been, leading (eventually) to a SEGV when someone tried to folow a synclock's waitq to a task structure that had been freed while still on the queue. See the bug report for (far) more detail. Change-Id: I5cd9ae1bcb831555274108b292181ec2a29b6d95 BUG: 1434062 Signed-off-by: Jeff Darcy <jeff@pl.atyp.us> Reviewed-on: https://review.gluster.org/16926 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* syncop: fix argc count in call to makecontext()Ravishankar N2017-03-211-1/+1
| | | | | | | | | | | | | | We are only passing one argument (a pointer to struct synctask) to the function, so argc must be 1 and not 2. Change-Id: I4eaadd58a76f32327d8bb3efa9c5c435700d7391 BUG: 1434274 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/16930 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* libglusterfs, dht, locks, glusterd: Coverity fixesNigel Babu2017-02-231-1/+3
| | | | | | | | | | | | | | Fix up use after free bugs and dead code Change-Id: I8f79ed6b5108926c1fac31c147b5ecba79d10785 BUG: 1424905 Signed-off-by: Nigel Babu <nigelb@redhat.com> Reviewed-on: https://review.gluster.org/16666 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* core: Honour mandatory lock flags during lock migrationAnoop C S2016-05-201-0/+2
| | | | | | | | | | | | | | | | | | | | | | lk_flags from posix_lock_t structure is the primary key used to differentiate locks as either advisory and mandatory type. During lock migration this field is not read in getactivelk() call path. So in order to copy the exact lock state from source to destination it is necessary to include lk_flags within lock_migration_info_t structure to maintain accurate state. This change also includes minor modifications to setactivelk() call to consider lk_flags during lock migration. Change-Id: I20a7b6b6a0f3bdac5734cce8a2cd2349eceff195 BUG: 1332501 Signed-off-by: Anoop C S <anoopcs@redhat.com> Reviewed-on: http://review.gluster.org/14189 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Poornima G <pgurusid@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* core: add setactivelk () fopSusant Palai2016-05-011-0/+44
| | | | | | | | | | | Change-Id: Ic2ba77a1fdd27801a6e579e04e6c0dd93cd7127b BUG: 1326085 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14011 Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* core: add getactivelk () fopSusant Palai2016-05-011-0/+79
| | | | | | | | | | | Change-Id: Ifd0ff278dcf43da064021f5c25e5dcd34347fcde BUG: 1326085 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/13970 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* Use pthread_equal to compare threadsMichael Scherer2016-04-251-2/+2
| | | | | | | | | | The man pages about pthreads are quite clear on the fact that pthread_t is supposed to be opaque, and so can't be compared using the equality operator. Change-Id: Id69e166ed73a98668d19a71cd6d9ab9a0429ec38 BUG: 1330225 Signed-off-by: Michael Scherer <mscherer@redhat.com>
* core: add lease fopPoornima G2016-04-211-0/+49
| | | | | | | | | | | | | Change-Id: Ia27d66b1061b0377857827515590eb89b18515c9 BUG: 1319992 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/11596 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* syncop: add seek() FOPNiels de Vos2016-01-311-1/+36
| | | | | | | | | | | | | | | Add the new seek() FOP to the syncop framework. gfapi will use this in the future. Change-Id: I0c15153beb27de73d5844b6f692175750fc28f60 BUG: 1220173 Singed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/11481 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* syncop: Include iatt to 'syncop_link' argsSoumya Koduri2015-07-101-2/+8
| | | | | | | | | | | | | | Include iatt to 'syncop_link' args to fetch proper attributes of the newly linked inode. Signed-off-by: Soumya Koduri <skoduri@redhat.com> Change-Id: If6b92961bd7a89add3791ed3a9b494087348b492 BUG: 1241788 Reviewed-on: http://review.gluster.org/11611 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* quota/marker: use smaller stacksize in synctask for marker updationvmallika2015-07-091-7/+22
| | | | | | | | | | | | | | | | | Default stacksize that synctask uses is 2M. For marker we set it to 16k Also move market xlator close to io-threads to have smaller stack Change-Id: I8730132a6365cc9e242a3564a1e615d94ef2c651 BUG: 1207735 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/11499 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* libglusterfs removing strerror in logging Mohamed Ashiq Liyazudeen2015-07-091-4/+2
| | | | | | | | | | Change-Id: I8a0f40834da1151ddaef6139af3782bc076df57e BUG: 1194640 Signed-off-by: Mohamed Ashiq Liyazudeen <mliyazud@redhat.com> Reviewed-on: http://review.gluster.org/11464 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* libglusterfs: Use GF_CALLOC/GF_FREE instead of CALLOC/FREEPranith Kumar K2015-07-021-8/+8
| | | | | | | | | | | | | - Also removed numbers for the types as the string form of type is printed in statedump now, so the numbers are not needed anymore. Change-Id: I6e8c15a1dc8cb6187842f96f1d46ec0f26a602b4 BUG: 1237381 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/11495 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mem-pool,stack,store,syncop,timer/libglusterfs : Porting to a new logging ↵Mohamed Ashiq2015-06-261-40/+51
| | | | | | | | | | | framework Change-Id: Idd3dcaf7eeea5207b3a5210676ce3df64153197f BUG: 1194640 Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com> Reviewed-on: http://review.gluster.org/10827 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* libglusterfs: Copy d_len and dict as well into dst direntKrutika Dhananjay2015-06-021-2/+16
| | | | | | | | | | | | | | | Also, added memory allocation failure checks in light of the comments received @ http://review.gluster.org/#/c/10809/2/libglusterfs/src/gf-dirent.c, and http://review.gluster.org/#/c/10809/1/xlators/features/shard/src/shard.c Change-Id: Ie4092218545c8f4f8a0e6cc1fec6ba37bbbf2620 BUG: 1226551 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/11026 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* build: do not #include "config.h" in each fileNiels de Vos2015-05-291-5/+0
| | | | | | | | | | | | | | | | | | Instead of including config.h in each file, and have the additional config.h included from the compiler commandline (-include option). When a .c file tests for a certain #define, and config.h was not included, incorrect assumtions were made. With this change, it can not happen again. BUG: 1222319 Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/10808 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/bitrot: Throttle filesystem scrubberVenky Shankar2015-05-071-17/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces multithreaded filesystem scrubber based on throttling option configured for a particular volume. The implementation "logically" breaks scanning and scrubbing with the number of scrubber threads auto-configured depending upon the throttle configuration. Scanning (crawling) is left single threaded (per brick) with entries scrubbed in bulk. On reaching this "bulk" watermark, scanner waits until entries are scrubbed. Bricks for a particular volume have a set of thread(s) assigned for scrubbing, with entries for each brick scrubbed in a round robin fashion to avoid scrub "stalls" when a brick (out of N bricks) is under active scrubbing. This mechanism helps us implement "pause/resume" with ease: all one need to do is to cleanup scrubber threads and let the main scanner thread "wait" untill scrubbing is resumed (where the scrubber thread(s) are spawned again), therefore continuing where we left off (unless we restart the deamons, where crawl initiates from root directory again, but I guess that's OK). [ NOTE: Throttling is optional for the signer daemon, without which it runs full throttle. However, passing "-DBR_RATE_LIMIT_SIGNER" predefined in CFLAGS enables CPU throttling (during checksum calculation) thereby avoiding high CPU usage. ] Subsequent patches would introduce CPU throttling during hash calculation for scrubber. Change-Id: I5701dd6cd4dff27ca3144ac5e3798a2216b39d4f BUG: 1207020 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/10511 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: Implementation of sync lock as recursive lock to avoid crash.anand2015-04-281-48/+126
| | | | | | | | | | | | | | | | | | | | | | | Problem : In glusterd,we are using big lock which is implemented based on sync task frame work for thread synchronization and rcu lock for data consistency. sync task frame work swap the threads if there is no worker poll threads available,due to this rcu lock and rcu unlock was happening in different threads (urcu-bp will not allow this),resulting into glusterd crash. fix : To avoid releasing the sync lock(big lock) in between rcu critical section,implemented sync lock as recursive lock. More details: link : http://www.spinics.net/lists/gluster-devel/msg14632.html Change-Id: I2b56c1caf3f0470f219b1adcaf62cce29cdc6b88 BUG: 1211640 Signed-off-by: anand <anekkunt@redhat.com> Reviewed-on: http://review.gluster.org/10285 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs/syncop: Pass xdata even in error caseRaghavendra Talur2015-04-271-2/+2
| | | | | | | | | | | | | xdata should be passed even in error cases. lookup() call was missed in previous patch set. Change-Id: I1ad2c452d05a3b4433b640762aaea5d3a91f2ba5 BUG: 1209869 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/10193 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* syncop: Implement syncop_fxattropPranith Kumar K2015-04-261-0/+20
| | | | | | | | | | | Change-Id: Ifc7937ceb451f6e11e40a9513017226fd0f115b0 BUG: 1215265 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10382 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs/syncop: Add xdata to all syncop callsRaghavendra Talur2015-04-081-109/+394
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for xdata in both the request and response path of syncops. Few calls like lookup already had the support; have renamed variables in few places to maintain uniformity. xdata passed downwards is known as xdata_in and xdata passed upwards is known as xdata_out. There is an old patch by Jeff Darcy at http://review.gluster.org/#/c/8769/3 which does the same for some selected calls. It also brings in xdata support at gfapi level. xdata support at gfapi level would be introduced in subsequent patches. Change-Id: I340e94ebaf2a38e160e65bc30732e8fe1c532dcc BUG: 1158621 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/9859 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Quota/marker : Support for inode quotavmallika2015-03-171-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the only way to retrieve the number of files/objects in a directory or volume is to do a crawl of the entire directory/volume. This is expensive and is not scalable. The new mechanism proposes to store count of objects/files as part of an extended attribute of a directory. Each directory's extended attribute value will indicate the number of files/objects present in a tree with the directory being considered as the root of the tree. Currently file usage is accounted in marker by doing multiple FOPs like setting and getting xattrs. Doing this with STACK WIND and UNWIND can be harder to debug as involves multiple callbacks. In this code we are replacing current mechanism with syncop approach as syncop code is much simpler to follow and help us implement inode quota in an organized way. Change-Id: Ibf366fbe07037284e89a241ddaff7750fc8771b4 BUG: 1188636 Signed-off-by: vmallika <vmallika@redhat.com> Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/9567 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* every/where: add GF_FOP_IPC for inter-translator communicationJeff Darcy2015-03-171-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several features - e.g. encryption, erasure codes, or NSR - involve multiple cooperating translators which sometimes need a "private" means of communication amongst themselves. Historically we've used virtual or synthetic xattrs, but that's not very elegant and clutters up the getxattr/setxattr path which must also handle real xattr requests. This new fop should address that. The only argument is an int32_t "op" which should be recognized by the target translator. It is recommended that translators using these feature follow some convention regarding the ops that they define, to avoid conflicts. Using a hash of the target translator's type string as a base for a series of ops would probably be a good start. Any other information can be passed in both directions using xdata. The default behavior for this fop, as with any other, is to pass through to FIRST_CHILD. That makes use of this fop "transparent" to other translators that were written before it existed, but it also means that it only really works with pass-through translators. If a routing translator (such as DHT) or a fan-out translator (such as AFR) is involved, the IPC might not reach its intended destination unless those translators are modified to forward IPC fops along all paths. If an IPC gets all the way to storage/posix it is considered an error, much like an uncaught exception. We don't actually *do* anything in that case, but we do log it send back an EOPNOTSUPP error. This makes the "unrecognized opcode" condition distinguishable from the "no IPC support" condition (which would yield an RPC error instead) so clients can probe for the presence of a handler for their own favorite opcode and either use that or use old-school xattrs depending on the result. BUG: 1158628 Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: I84af1b17babe5b30ec03ecf027ae37d09b873968 Reviewed-on: http://review.gluster.org/8812 Reviewed-by: Vijay Bellur <vbellur@redhat.com>