summaryrefslogtreecommitdiffstats
path: root/libglusterfs
Commit message (Collapse)AuthorAgeFilesLines
* server: fix unresolved symbols by moving them to libglusterfsMohit Agrawal2018-04-203-0/+105
| | | | | | | | | | | | | | | | Problem: glusterd2 build is failed due to undefined symbol (xlator_mem_cleanup , glusterfsd_ctx) in server.so Solution: To resolve the same done below two changes 1) Move xlator_mem_cleanup code from glusterfsd-mgmt.c to xlator.c to be part of libglusterfs.so 2) replace glusterfsd_ctx to this->ctx because symbol glusterfsd_ctx is not part of server.so BUG: 1544090 Change-Id: Ie5e6fba9ed458931d08eb0948d450aa962424ae5 fixes: bz#1544090 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* Make glusterfsd binary print statedump & xlator dirPrashanth Pai2018-04-191-0/+3
| | | | | | | | | | | | | | | | | | | | | | | The glusterd2 needs following options, some of which are provided by gluster CLI today: --print-xlatordir --print-statedumpdir --print-logdir However, the CLI package need not be present on the machine running glusterd2. This change adds the above CLI options to glusterfsd binary which glusterd2 depends on. Reverts 9a1ae47c8d60836ae0628a04a153f28c1085c0e8 Related changes: https://review.gluster.org/#/c/19882/ https://github.com/gluster/glusterd2/pull/663 Updates: bz#1193929 Change-Id: I18c123b0d3350d2bd4f2400783e3b94e402a4e29 Signed-off-by: Prashanth Pai <ppai@redhat.com>
* gluster: Sometimes Brick process is crashed at the time of stopping brickMohit Agrawal2018-04-198-27/+70
| | | | | | | | | | | | | | | | | | | | | | | | Problem: Sometimes brick process is getting crashed at the time of stop brick while brick mux is enabled. Solution: Brick process was getting crashed because of rpc connection was not cleaning properly while brick mux is enabled.In this patch after sending GF_EVENT_CLEANUP notification to xlator(server) waits for all rpc client connection destroy for specific xlator.Once rpc connections are destroyed in server_rpc_notify for all associated client for that brick then call xlator_mem_cleanup for for brick xlator as well as all child xlators.To avoid races at the time of cleanup introduce two new flags at each xlator cleanup_starting, call_cleanup. BUG: 1544090 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Note: Run all test-cases in separate build (https://review.gluster.org/#/c/19700/) with same patch after enable brick mux forcefully, all test cases are passed. Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf updates: bz#1544090
* glusterd: volume inode/fd status broken with brick muxhari gowtham2018-04-192-24/+33
| | | | | | | | | | | | | | | | | | | | | | | Problem: The values for inode/fd was populated from the ctx received from the server xlator. Without brickmux, every brick from a volume belonged to a single brick from the volume. So searching the server and populating it worked. With brickmux, a number of bricks can be confined to a single process. These bricks can be from different volumes too (if we use the max-bricks-per-process option). If they are from different volumes, using the server xlator to populate causes problem. Fix: Use the brick to validate and populate the inode/fd status. Signed-off-by: hari gowtham <hgowtham@redhat.com> Change-Id: I2543fa5397ea095f8338b518460037bba3dfdbfd fixes: bz#1566067
* cluster/afr: Make sure latency-arg is passed to afrPranith Kumar K2018-04-182-2/+3
| | | | | | | | | | | xlator_notify doesn't pass the extra arguments that come in the input function, so XLATOR_NOTIFY macro should be used instead to pass the extra arguments to the function. BUG: 1567881 fixes bz#1567881 Change-Id: Ic15b6c446638cbacf3149693147a754219037c47 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* libglusterfs: fix comparison of a NULL dict with a non-NULL dictXavi Hernandez2018-04-181-8/+8
| | | | | | | | | | Function are_dicts_equal() had a bug when the first argument was NULL and the second one wasn't NULL. In this case it incorrectly returned that the dicts were different when they could be equal. Fixes: bz#1566732 Change-Id: I0fc245c2e7d1395865a76405dbd05e5d34db3273 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* core/build/various: python3 compat, prepare for python2 -> python3Kaleb S. KEITHLEY2018-04-122-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note 1) we're not supposed to be using #!/usr/bin/env python, see https://fedoraproject.org/wiki/Packaging:Guidelines?rd=Packaging/Guidelines#Shebang_lines Note 2) we're also not supposed to be using "!/usr/bin/python, see https://fedoraproject.org/wiki/Changes/Avoid_usr_bin_python_in_RPM_Build#Quick_Opt-Out The previous patch (https://review.gluster.org/19767) tried to do too much in one patch, so it was abandoned. This patch does two things: 1) minor cleanup of configure(.ac) to explicitly use python2 2) change all the shebang lines to #!/usr/bin/python2 and add them where they were missing based on warnings emitted during rpmbuild. In a follow-up patch python2 will eventually be changed to python3. Before that python2-isms (e.g. print, string.join(), etc.) need to be converted to python3. Some of those can be rewritten in version agnostic python. E.g. print statements become print() with "from __future_ import print_function". The python 2to3 utility will be used for some of those. Also Aravinda has given guidance in the comments to the first patch for changes. updates: #411 Change-Id: I471730962b2526022115a1fc33629fb078b74338 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* experimental/cloudsync: Download xlator for archival featureSusant Palai2018-04-102-2/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | spec-files: https://review.gluster.org/#/c/18854/ Overview: * Cloudsync maintains three file states in it's inode-ctx i.e 1 - LOCAL, 2 - REMOTE, 3 - DOWNLOADING. * A data modifying fop is allowed only if the state is LOCAL. If the state is REMOTE or DOWNLOADING, client will download or wait for the download to finish initiated by other client. * Multiple download and upload from different clients are synchronized by inodelk. * In POSIX a state check is done (part of different commit)before allowing the fop to continue. If the state is remote/downloading the fop is unwound with EREMOTE. The client will then download the file and continue with the fop again. * Basic Algo for fop (let's say write fop): - If LOCAL -> resume fop - If REMOTE -> - INODELK - STAT (this gets state and heal the state if needed) - DOWNLOAD - resume fop Note: * Developers will need to write plugins for download, based on the remote store they choose. In phase-1, support will be added for one remote store per volume. In future, more options for multiple remote stores will be explored. TODOs: - Implement stat/lookup/readdirp to return size info from xattr - Make plugins configurable - Implement unlink fop - Add metrics collection - Add sharding support Design Contributions: Aravinda V K <avishwan@redhat.com> Amar Tumballi <amarts@redhat.com> Ram Ankireddypalle <areddy@commvault.com> Susant Palai <spalai@redhat.com> updates: #387 Change-Id: Iddf711ee7ab4e946ae3e472ff62791a7b85e6d4b Signed-off-by: Susant Palai <spalai@redhat.com>
* mount/fuse: Add support for multi-threaded fuse readersKrutika Dhananjay2018-04-021-0/+1
| | | | | | | | | | | | | | Usage: Use 'reader-thread-count=<NUM>' as command line option to set the thread count at the time of mounting the volume. Next task is to make these threads auto-scale based on the load, instead of having the user remount the volume everytime to change the thread count. Updates #412 Change-Id: I94aa1505e5ae6a133683d473e0e4e0edd139b76b Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* Quota: heal directory on newly added bricks when quota limit is reachedSanoj Unnikrishnan2018-03-284-2/+229
| | | | | | | | | | | | | | | | | Problem: if a lookup is done on a newly added brick for a path on which limit has been reached, the lookup fails to heal the directory tree due to quota. Solution: Tag the lookup as an internal fop and ignore it in quota. Since marking internal fop does not usually give enough contextual information. Introducing new flags to pass the contextual info. Adding dict_check_flag and dict_set_flag to aid flag operations. A flag is a single bit in a bit array (currently limited to 256 bits). Change-Id: Ifb6a68bcaffedd425dd0f01f7db24edd5394c095 fixes: bz#1505355 BUG: 1505355 Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com>
* storage/posix: Add active-fd-count option in glusterPranith Kumar K2018-03-214-0/+6
| | | | | | | | | | | | | | | | | | | | Problem: when dd happens on sharded replicate volume all the writes on shards happen through anon-fd. When the writes don't come quick enough, old anon-fd closes and new fd gets created to serve the new writes. open-fd-count is decremented only after the fd is closed as part of fd_destroy(). So even when one fd is on the way to be closed a new fd will be created and during this short period it appears as though there are multiple fds opened on the file. AFR thinks another application opened the same file and switches off eager-lock leading to extra latency. Fix: Have a different option called active-fd whose life cycle starts at fd_bind() and ends just before fd_destroy() BUG: 1557932 Change-Id: I2e221f6030feeedf29fbb3bd6554673b8a5b9c94 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* glusterd: TLS verification fails while using intermediate CAMohit Agrawal2018-03-193-2/+49
| | | | | | | | | | | | | | | | | | | | | Problem: TLS verification fails while using intermediate CA if mgmt SSL is enabled. Solution: There are two main issue of TLS verification failing 1) not calling ssl_api to set cert_depth 2) The current code does not allow to set certificate depth while MGMT SSL is enabled. After apply this patch to set certificate depth user need to set parameter option transport.socket.ssl-cert-depth <depth> in /var/lib/glusterd/secure_acccess instead to set in /etc/glusterfs/glusterd.vol. At the time of set secure_mgmt in ctx we will check the value of cert-depth and save the value of cert-depth in ctx.If user does not provide any value in cert-depth in that case it will consider default value is 1 BUG: 1555154 Change-Id: I89e9a9e1026e37efb5c20f9ec62b1989ef644f35 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* cleanup: xlator_t structure's 'client_latency' variable is not usedSven Fischer2018-03-191-8/+7
| | | | | | | | | | | | | | | | | - Removed unused struct member and its one time usage. - cleaned up wrong white space member 'client_latency' was not used otherwise since it was added by commit 07cc8679cdf3b29680f4f105d0222da168d8bfc1 Author: Kevin Vigor <kvigor@fb.com> Date: Tue Mar 21 08:23:25 2017 -0700 Halo Replication feature for AFR translator Change-Id: Ibb0ea828d4090bbe8897f6af326b317884162a00 BUG: 1495153 Signed-off-by: Sven Fischer <sven@fischer-abc.de>
* protocol: Fix 4.0 client, parsing older iatt in dictShyamsundarR2018-03-104-0/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | In a mixed mode cluster involving 4.0 and older 3.x bricks, if clients are newer, then the iatt encoded in the dictionary can be of the older iatt format, which a newer client will map incorrectly to the newer structure. This causes failures in FOPs that depend on this iatt for some functionality (seen in mkdir operations failing as EIO, when DHT hits its internal setxattr call). The fix provided is to convert the iatt in the dict, based on which RPC version is used to communicate with the server. IOW, this is the reverse of change in commit "b966c7790e" Tested using a mixed mode cluster (i.e bricks in 3.12 and 4.0 versions) and a mixed set of clients, 3.12 and 4.0 clients. There is no regression test provided, as this needs a mixed mode cluster to test and validate. Change-Id: I454e54651ca836b9f7c28f45f51d5956106aefa9 BUG: 1554053 Signed-off-by: ShyamsundarR <srangana@redhat.com>
* protocol: Added iatt conversion to older formatShyamsundarR2018-03-101-0/+47
| | | | | | | | | | | | | | Added iatt conversion to an older format, when dealing with older RPC versions. This enables iatt structure conformance when dealing with older clients. This helps fix rolling upgrade from 3.x versions to 4.0 version of gluster by sending the right iatt in the dictionary when DHT requests the same. Change-Id: Ieaf925f81f8c7798a8fba1e90a59fa9dec82856c BUG: 1544699 Signed-off-by: ShyamsundarR <srangana@redhat.com>
* core: provide infra to make any xlator pass-throughAmar Tumballi2018-03-093-12/+41
| | | | | | | updates: #304 Change-Id: If6a13d2e56b195390a386d720103a882e077f66c Signed-off-by: Amar Tumballi <amarts@redhat.com>
* libglusterfs: Fix coverity issue FORWARD_NULLPoornima G2018-03-021-7/+4
| | | | | Change-Id: I1402046edb232ca9d23346db82a0cfd041c91e70 Signed-off-by: Poornima G <pgurusid@redhat.com>
* libglusterfs: Fix volume_options_t structKaushal M2018-03-022-1/+18
| | | | | | | | | | | | | | | The volume_options_t struct was modified and a new member was introduced in the middle of the struct. This caused GD2 to crash when it tried to read the volume options. The new member has been moved to the end of the struct to correct this. And a note has been added to notify developers on how to modify this struct, and the xlator_api_t struct. Updates: gluster/glusterfs#302 Change-Id: I2e9899ec10516be29c7e9d574da53be8ec17a99e Signed-off-by: Kaushal M <kaushal@redhat.com>
* libglusterfs: move compat RPC/XDR #defines to eliminate warningsKaleb S. KEITHLEY2018-02-272-20/+13
| | | | | | | | | | | | | | | | | | | | | | | | Building with libtirpc (versus legacy glibc rpc) results in many warnings about xdr macros that are redefined in libtirpc headers because of the way compat.h and glusterfs.h are usually #included. And these xdr macros in libglusterfs/src/compat.h - which were copied from legacy glibc's rpc headers - are different than the same-name macros in libtirpc. I haven't checked to see that any of the macros are expanded (incorrectly) between the definition in compat.h and the redefinition in tirpc/rpc/xdr.h; the risk seems pretty minimal. Regardless it seems better, from a truth-and-beauty perspective to not have the old, incorrect definitions in the first place. Not to mention that any file that #includes compat.h and not glusterfs.h does not need these xdr macro definitions at all. They're really only needed when using really old glibc rpc, which would only be evident if including glusterfs.h and/or glusterfs-fops.h. (Which by the way, nothing currently #includes glusterfs-fops.h by itself. And maybe nothing ever should?) Change-Id: Ic11e4407d6ab7c498a8745a99379cbf4788a24e8 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* options: framework for options levelsN Balachandran2018-02-271-0/+10
| | | | | | | | | Framework in order to classify options. Updates gluster/glusterfs#302 Change-Id: I3dd6ae27bd0eb8e0065ffca75838c801e4f3ac91 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* performance/io-threads: nuke everything from a client when it disconnectsVarsha Rao2018-02-271-4/+5
| | | | | | | | | | | | | > io-threads: nuke everything from a client when it disconnects > Commit ID: 4d8268d760 > https://review.gluster.org/#/c/18254/ > By Jeff Darcy <jdarcy@fb.com> This patch is required to forward port io-threads namespace patch. Updates: #401 Change-Id: I13d3a74862eea3d01e8dbc8736987c3dae6e8b2a Signed-off-by: Varsha Rao <varao@redhat.com>
* rpcsvc: scale rpcsvc_request_handler threadsMilind Changire2018-02-261-0/+7
| | | | | | | | | | | | Scale rpcsvc_request_handler threads to match the scaling of event handler threads. Please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1467614#c51 for a discussion about why we need multi-threaded rpcsvc request handlers. Change-Id: Ib6838fb8b928e15602a3d36fd66b7ba08999430b Signed-off-by: Milind Changire <mchangir@redhat.com>
* xlators/features/namespace: Add namespace xlator and link into brick graphVarsha Rao2018-02-213-1/+11
| | | | | | | | | | | | | | | | | | | | | The following release-3.8-fb branch patch is upstreamed: > features/namespace: Add namespace xlator and link into brick graph > Commit ID: dbd30776f26e > https://review.gluster.org/#/c/18041/ > By Michael Goulet <mgoulet@fb.com> Changes in this patch: Removes extra config.h and namespace.h file in namespace.c Adds default_getspec_cbk to libglusterfs.sym Rename dict_for_each to dict_foreach_inline Remove fd.h header file stack.h Add test case for truncate, open and symlink This patch is required to forward port io-threads namespace patch. Updates: #401 Change-Id: Ib88c95b89eecee9b8957df8a4c8712c899c761d1 Signed-off-by: Varsha Rao <varao@redhat.com>
* posix/afr: handle backward compatibility for rchecksum fopRavishankar N2018-02-193-0/+10
| | | | | | | | | Added a volume option 'fips-mode-rchecksum' tied to op version 4. If not set, rchecksum fop will use MD5 instead of SHA256. updates: #230 Change-Id: Id8ea1303777e6450852c0bc25503cda341a6aec2 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* metrics: set latency min value during xlator initAmar Tumballi2018-02-162-1/+9
| | | | | | | | | | | otherwise, the very first metrics will have all the min as 0. also no need to print pending-fops if it is 0. Updates #168 Change-Id: I233de6c92b1a73977bb468ba211ac6ec3c05298f Signed-off-by: Amar Tumballi <amarts@redhat.com>
* Fetch backup volfile servers from glusterd2Prashanth Pai2018-02-164-0/+138
| | | | | | | | | | | | | | | | Clients will request for a list of volfile servers from glusterd2 by setting a (optional) flag in GETSPEC RPC call. glusterd2 will check for the presence of this flag and accordingly return a list of glusterd2 servers in GETSPEC RPC reply. Currently, this list of servers returned only contains servers which have bricks belonging to the volume. See: https://github.com/gluster/glusterd2/issues/382 https://github.com/gluster/glusterfs/issues/351 Updates #351 Change-Id: I0eee3d0bf25a87627e562380ef73063926a16b81 Signed-off-by: Prashanth Pai <ppai@redhat.com>
* libglusterfs/syncop: Add syncop_entrylkRaghavendra G2018-02-133-0/+43
| | | | | | Change-Id: Idd86b9f0fa144c2316ab6276e2def28b696ae18a BUG: 1543279 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
* gfapi: return pre/post attributes from glfs_ftruncateKinglong Mee2018-02-123-2/+16
| | | | | | Updates: #389 Change-Id: I8faea0828921fb17f05f7321c3cb01747373f21e Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* gfapi: return pre/post attributes from glfs_fsync/fdatasyncKinglong Mee2018-02-123-4/+17
| | | | | | Updates: #389 Change-Id: I4153df72d5eeecefa7579170899db4c340128bea Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* gfapi: return pre/post attributes from glfs_pread/pwriteKinglong Mee2018-02-123-5/+24
| | | | | | | | | | | | | | | As nfs-ganesha, a wcc data contains pre/post attributes is return in read/write rpc reply. nfs-ganesha get those attributes by two getattr between the real read/write right now. But, gluster has return pre/post attributes from glusterfsd, those attributes are skipped in syncop/gfapi, if gfapi return them, the upper user (nfs-ganesha) can use them directly without any duplicate getattr. Updates: #389 Change-Id: I7b643ae4241cfe2aeb17063de00192d81674024a Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* io-threads: Implement put fopPoornima G2018-02-121-0/+4
| | | | | | Updates #353 Change-Id: I8a30b53a52618c6a6c740d2c67b19e5322ce4ddb Signed-off-by: Poornima G <pgurusid@redhat.com>
* performance/io-threads: expose io-thread queue depthsVarsha Rao2018-02-082-1/+40
| | | | | | | | | | | | | | | | | | | | The following release-3.8-fb branch patch is upstreamed: > io-stats: Expose io-thread queue depths > Commit ID: 69509ee7d2 > https://review.gluster.org/#/c/18143/ > By Shreyas Siravara <sshreyas@fb.com> Changes in this patch: - Replace iot_pri_t with gf_fop_pri_t - Replace IOT_PRI_{HI, LO, NORMAL, MAX, LEAST} with GF_FOP_PRI_{HI, LO, NORMAL, MAX, LEAST} - Use dict_unref() instead of dict_destroy() This patch is required to forward port io-threads namespace patch. Updates: #401 Change-Id: I1b47a63185a441a30fbc423ca1015df7b36c2518 Signed-off-by: Varsha Rao <varao@redhat.com>
* cluster/dht: avoid overwriting client writes during migrationSusant Palai2018-02-021-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | For more details on this issue see https://github.com/gluster/glusterfs/issues/308 Solution: This is a restrictive solution where a file will not be migrated if a client writes to it during the migration. This does not check if the writes from the rebalance and the client actually do overlap. If dht_writev_cbk finds that the file is being migrated (PHASE1) it will set an xattr on the destination file indicating the file was updated by a non-rebalance client. Rebalance checks if any other client has written to the dst file and aborts the file migration if it finds the xattr. updates gluster/glusterfs#308 Change-Id: I73aec28bc9dbb8da57c7425ec88c6b6af0fbc9dd Signed-off-by: Susant Palai <spalai@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: N Balachandran <nbalacha@redhat.com>
* statedump: sanity check of mem_acct and rec for xlatorKinglong Mee2018-01-311-2/+2
| | | | | | | | | | | | | | | | | | | | With memory accounting is disabled, glusterfs crash when doing statedump at, 0 0x00007fe24cff543a in gf_proc_dump_xlator_mem_info_only_in_use (xl=0x7fe23e44dc00) at statedump.c:269 1 0x00007fe24cff6310 in gf_proc_dump_oldgraph_xlator_info (top=0x7fe23e44dc00) at statedump.c:530 2 0x00007fe24cff7114 in gf_proc_dump_info (signum=10, ctx=0x7fe24ac0e000) at statedump.c:845 3 0x00007fe24d4d4bab in glusterfs_sigwaiter (arg=0x7ffc6c080750) at glusterfsd.c:2109 4 0x00007fe24bbd5dc5 in start_thread () from /lib64/libpthread.so.0 5 0x00007fe24b51a73d in clone () from /lib64/libc.so.6 (gdb) p xl->mem_acct $1 = (struct mem_acct *) 0x0 (gdb) p xl->mem_acct->rec $2 = 0x10 Change-Id: I10858170431311833ae01224d51c66caaad5e9a3 BUG: 1539603 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* glusterd: Update op-version for masterShyamsundarR2018-01-301-1/+3
| | | | | | | | | Updated the op-version on master to the next release op-version, for any future options appearing on master. Change-Id: I2ef6f8874c638ade1d97477bdd8ffa1bd1a9f952 BUG: 1540338 Signed-off-by: ShyamsundarR <srangana@redhat.com>
* rpc: Showing some unusual timer error logs during brick stopMohit Agrawal2018-01-301-15/+3
| | | | | | | | | Solution: Update msg condition in gf_timer_call_after function to avoid the message BUG: 1538427 Change-Id: I849e8e052a8259cf977fd5e7ff3aeba52f9b5f27 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* quiesce, gfproxy: Implement failover across multiple gfproxy nodesPoornima G2018-01-302-1/+5
| | | | | | Updates: #242 Change-Id: I767e574a26e922760a7130bd209c178d74e8cf69 Signed-off-by: Poornima G <pgurusid@redhat.com>
* gfapi : New APIs have been added to use lease feature in glusterSoumya Koduri2018-01-261-0/+1
| | | | | | | | | | | Following APIs glfs_h_lease(), glfs_lease() added, so that gfapi applications can set and get lease which enables more efficient client side caching. Updates: #350 Change-Id: Iede85be9af1d4df969b890d0937ed0afa4ca6596 Signed-off-by: Poornima G <pgurusid@redhat.com> Signed-off-by: Soumya Koduri <skoduri@redhat.com> Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
* build: use libtirpc by default, even if ipv6 is not the defaultKaleb S. KEITHLEY2018-01-261-1/+2
| | | | | | | | | | | | | | Another error snuck in with Change-Id I86f847dfd, or more accurately I think, with Change-Id: Ic47065e9c2... All libs, not just libgfrpc, need to be linked with libtirpc, especially on systems that still have xdr functions in (g)libc where you will get a mixture of calls to libtirpc functions and glibc functions, with catastrophic results. BUG: 1536186 Change-Id: I97dc39c7844f44c36fe210aa813480c219e1e415 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* dentry fop serializer: added new server side xlator for dentry fop serializationSakshi Bansal2018-01-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problems addressed by this xlator : [1]. To prevent race between parallel mkdir,mkdir and lookup etc. Fops like mkdir/create, lookup, rename, unlink, link that happen on a particular dentry must be serialized to ensure atomicity. Another possible case can be a fresh lookup to find existance of a path whose gfid is not set yet. Further, storage/posix employs a ctime based heuristic 'is_fresh_file' (interval time is less than 1 second of current time) to check fresh-ness of file. With serialization of these two fops (lookup & mkdir), we eliminate the race altogether. [2]. Staleness of dentries This causes exponential increase in traversal time for any inode in the subtree of the directory pointed by stale dentry. Cause : Stale dentry is created because of following two operations: a. dentry creation due to inode_link, done during operations like lookup, mkdir, create, mknod, symlink, create and b. dentry unlinking due to various operations like rmdir, rename, unlink. The reason is __inode_link uses __is_dentry_cyclic, which explores all possible path to avoid cyclic link formation during inode linkage. __is_dentry_cyclic explores stale-dentry(ies) and its all ancestors which is increases traversing time exponentially. Implementation : To acheive this all fops on dentry must take entry locks before they proceed, once they have acquired locks, they perform the fop and then release the lock. Some documentation from email conversation: [1] http://www.gluster.org/pipermail/gluster-devel/2015-December/047314.html [2] http://www.gluster.org/pipermail/gluster-devel/2015-August/046428.html With this patch, the feature is optional, enable it by running: `gluster volume set $volname features.sdfs enable` Also the feature is tested for a month without issues in the experiemental branch for all the regression. Change-Id: I6e80ba3cabfa6facd5dda63bd482b9bf18b6b79b Fixes: #397 BUG: 1304962 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Signed-off-by: Amar Tumballi <amarts@redhat.com> Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* libglusterfs: Reset errno before callv4.1devNigel Babu2018-01-231-1/+4
| | | | | | | | This was causing Gluster to return a failure when testing on Centos7. BUG: 1536913 Change-Id: Idb90baef05058123a7f69e94a51dd79abd371815 Signed-off-by: Nigel Babu <nigelb@redhat.com>
* libgfapi: Add new api for supporting mandatory-locksAnoop C S2018-01-221-2/+3
| | | | | | | | | | | | | | | | The current API for byte-range locks [glfs_posix_lock()] doesn't allow applications to specify whether it is advisory or mandatory type locks. This particular change is to introduce an extended byte-range lock API with an additional argument for including the byte-range lock mode to be one among advisory(default) or mandatory. Patch also includes a gfapi test case which make use of this new api to acquire mandatory locks. Ref: https://github.com/gluster/glusterfs-specs/blob/master/done/GlusterFS%203.8/Mandatory%20Locks.md Change-Id: Ia09042c755d891895d96da857321abc4ce03e20c Updates #393 Signed-off-by: Anoop C S <anoopcs@redhat.com>
* md-cache: Implement dynamic configuration of xattr list for cachingPoornima G2018-01-223-0/+21
| | | | | | | | | | | | | | | Currently, the list of xattrs that md-cache can cache is hard coded in the md-cache.c file, this necessiates code change and rebuild everytime a new xattr needs to be added to md-cache xattr cache list. With this patch, the user will be able to configure a comma seperated list of xattrs to be cached by md-cache Updates #297 Change-Id: Ie35ed607d17182d53f6bb6e6c6563ac52bc3132e Signed-off-by: Poornima G <pgurusid@redhat.com>
* protocol: make on-wire-change of protocol using new XDR definition.Amar Tumballi2018-01-191-209/+0
| | | | | | | | | | | | | | | | | | | | | | With this patchset, some major things are changed in XDR, mainly: * Naming: Instead of gfs3/gfs4 settle for gfx_ for xdr structures * add iattx as a separate structure, and add conversion methods * the *_rsp structure is now changed, and is also reduced in number (ie, no need for different strucutes if it is similar to other response). * use proper XDR methods for sending dict on wire. Also, with the change of xdr structure, there are changes needed outside of xlator protocol layer to handle these properly. Mainly because the abstraction was broken to support 0-copy RDMA with payload for write and read FOP. This made transport layer know about the xdr payload, hence with the change of xdr payload structure, transport layer needed to know about the change. Updates #384 Change-Id: I1448fbe9deab0a1b06cb8351f2f37488cefe461f Signed-off-by: Amar Tumballi <amarts@redhat.com>
* gfapi : added glfs_setfsleaseid() for setting lease idSoumya Koduri2018-01-191-0/+1
| | | | | | | | | | | | A new function glfs_setfsleaseid() added in gfapi. Currently lock owner is saved in the thread context. Similarly the leaseid attribute can be saved using glfs_setfsleaseid(). Updates: #350 Change-Id: I55966cca01d0f2649c32b87bd255568c3ffd1262 Signed-off-by: Poornima G <pgurusid@redhat.com> Signed-off-by: Soumya Koduri <skoduri@redhat.com> Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
* cluster/afr: Adding option to take full file lockkarthik-us2018-01-191-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In replica 3 volumes there is a possibilities of ending up in split brain scenario, when multiple clients writing data on the same file at non overlapping regions in parallel. Scenario: - Initially all the copies are good and all the clients gets the value of data readables as all good. - Client C0 performs write W1 which fails on brick B0 and succeeds on other two bricks. - C1 performs write W2 which fails on B1 and succeeds on other two bricks. - C2 performs write W3 which fails on B2 and succeeds on other two bricks. - All the 3 writes above happen in parallel and fall on different ranges so afr takes granular locks and all the writes are performed in parallel. Since each client had data-readables as good, it does not see file going into split-brain in the in_flight_split_brain check, hence performs the post-op marking the pending xattrs. Now all the bricks are being blamed by each other, ending up in split-brain. Fix: Have an option to take either full lock or range lock on files while doing data transactions, to prevent the possibility of ending up in split brains. With this change, by default the files will take full lock while doing IO. If you want to make use of the old range lock change the value of "cluster.full-lock" to "no". Change-Id: I7893fa33005328ed63daa2f7c35eeed7c5218962 BUG: 1535438 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* rpc/*: auth-header changesAmar Tumballi2018-01-171-0/+5
| | | | | | | | | | | | | | | | | Introduce another authentication header which can now send more data. This is useful because this data can be common for all the fops, and we don't need to change all the signatures. As part of this, made rpc-clnt.c little more modular to support multiple authentication structures. stack.h changes are placeholder for the ctime etc, can be moved later based on need. updates #384 Change-Id: I6111c13cfd2ec92e2b4e9295896bf62a8a33b2c7 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* dict: add another type to handle backward compatibilityAmar Tumballi2018-01-174-6/+35
| | | | | | | | | | | | | | | | This new type helps to avoid excessive logs. It should be set only in case of * volume graph building (graph.y) * dict unserialize (happens once a dictionary is received on wire in old protocol) All other dict set and get should have proper check and warning logs if there is a mismatch. updates #220 Change-Id: I1cccb304a877aa80c07aaac95f10f5005e35b9c5 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* locks: added inodelk/entrylk contention upcall notificationsXavier Hernandez2018-01-161-0/+17
| | | | | | | | | | | | | | The locks xlator now is able to send a contention notification to the current owner of the lock. This is only a notification that can be used to improve performance of some client side operations that might benefit from extended duration of lock ownership. Nothing is done if the lock owner decides to ignore the message and to not release the lock. For forced release of acquired resources, leases must be used. Change-Id: I7f1ad32a0b4b445505b09908a050080ad848f8e0 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
* dict: fix VALIDATE_DATA_AND_LOG callAtin Mukherjee2018-01-071-2/+2
| | | | | | | | | | Couple of instances doesn't pass enough number of parameters to the function resulting compilation to fail. Updates #203 Change-Id: Id8caa6fe7fc611645ad7ff11d81a2462e4ec6bab Signed-off-by: Atin Mukherjee <amukherj@redhat.com>