summaryrefslogtreecommitdiffstats
path: root/xlators
Commit message (Collapse)AuthorAgeFilesLines
* write-behind: Allow trickling-writes to be configurableCsaba Henk2017-12-201-0/+12
| | | | | | | | | | | | | | This is the undisputed/trivial part of Shreyas' patch he attached to https://bugzilla.redhat.com/1364740 (of which the current bug is a clone). We need more evaluation for the page_size and window_size bits before taking them on. Change-Id: Iaa0b9a69d35e522b77a52a09acef47460e8ae3e9 BUG: 1428060 Co-authored-by: Shreyas Siravara <sshreyas@fb.com> Signed-off-by: Csaba Henk <csaba@redhat.com>
* feature/bitrot: remove internal xattrs from lookup cbkRavishankar N2017-12-192-7/+21
| | | | | | | | | | | | | | | | | | | | | | Problem: afr requests all xattrs in lookup via the list-xattr key. If bitrot is enabled and later disabled, or if the bitrot xattrs were present due to an older version of bitrot which used to create the xattrs without enabling the feature, the xattrs (trusted.bit-rot.version in particular) was not getting filtered and ended up reaching the client stack. AFR, on noticing different values of the xattr across bricks of the replica, started triggering spurious metadata heals. Fix: Filter all internal xattrs in bitrot xlator before unwinding lookup, (f)getxattr. Thanks to Kotresh for the help in RCA'ing. Change-Id: I5bc70e4b901359c3daefc67b8e4fa6ddb47f046c BUG: 1527275 Signed-off-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit d341f20230b9921391aff22337eaf9be82f44d88)
* cluster/dht: don't overfill the buffer in readdir(p)Raghavendra G2017-12-111-3/+18
| | | | | | | | | | | | | | | | | | | | | | | | | Superflous dentries that cannot be fit in the buffer size provided by kernel are thrown away by fuse-bridge. This means, * the next readdir(p) seen by readdir-ahead would have an offset of a dentry returned in a previous readdir(p) response. When readdir-ahead detects non-monotonic offset it turns itself off which can result in poor readdir performance. * readdirp can be cpu-intensive on brick and there is no point to read all those dentries just to be thrown away by fuse-bridge. So, the best strategy would be to fill the buffer optimally - neither overfill nor underfill. >Change-Id: Idb3d85dd4c08fdc4526b2df801d49e69e439ba84 >BUG: 1492625 >Signed-off-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit e785faead91f74dce7c832848f2e8f3f43bd0be5) Change-Id: Idb3d85dd4c08fdc4526b2df801d49e69e439ba84 BUG: 1522710 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: make rebalance use truncate incaseSusant Palai2017-12-113-71/+99
| | | | | | | | | | | | | .. the brick file system does not support fallocate. > Change-Id: Id76cda2d8bb3b223b779e5e7a34f17c8bfa6283c > BUG: 1488103 > Signed-off-by: Susant Palai <spalai@redhat.com> Change-Id: Id76cda2d8bb3b223b779e5e7a34f17c8bfa6283c BUG: 1520232 Signed-off-by: Susant Palai <spalai@redhat.com>
* glusterd: Free up svc->conn on volume deleteAtin Mukherjee2017-12-071-0/+5
| | | | | | | | | | | | | Daemons like snapd, tierd and gfproxyd are maintained on per volume basis and on a volume delete we should destroy the rpc connection established for them. >mainline patch : https://review.gluster.org/#/c/18957/ Change-Id: Id1440e39da07b990fdb9b207df18da04b1ca8014 BUG: 1523046 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 36ce4c614a3391043a3417aa061d0aa16e60b2d3)
* posix: Change GD_OP_VERSION to 3_13_0 from 3_12_0 for storage.reserveMohit Agrawal2017-12-011-1/+1
| | | | | | | | | | | Problem: Change GD_OP_VERSION to 3_13_0 from 3_12_0 for option storage.reserve Solution: Actually feature was merged in 3.13.0 branch so GD_OP_VERSION needs to change from 3_12_0 to 3_13_0 BUG: 1518512 Change-Id: I6f753978fd607919efcc60f73c37feaadc4f32ef Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* Disable gfid2path by default on NetBSDEmmanuel Dreyfus2017-11-301-0/+11
| | | | | | | | | | | | | | | NetBSD storage of extended attributes for UFS1 badly scales when the list of extended attributes names rises. gfid2path can add as many extended attributes names as we have files, hence we keep it disabled for performance sake. > Change-Id: Id77b5f5ceb4d5eba1b3362b4b9fc693450ffbc2b > Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> > BUG: 1129939 Change-Id: I17c12251d80dbb41b7d4864d5739d1ad3d6877a0 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> BUG: 1513259
* cluster/ec: EC DISCARD doesn't punch hole properlySunil Kumar Acharya2017-11-291-2/+4
| | | | | | | | | | | | | | | | | | Problem: DISCARD operation on EC volume was punching hole of lesser size than the specified size in some cases. Solution: EC was not handling punch hole for tail part in some cases. Updated the code to handle it appropriately. >BUG: 1516206 >Change-Id: If3e69e417c3e5034afee04e78f5f78855e65f932 >Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com> BUG: 1518257 Change-Id: If3e69e417c3e5034afee04e78f5f78855e65f932 Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
* features/locks: Fix memory leaksXavier Hernandez2017-11-285-5/+11
| | | | | | | | | Backport of: > BUG: 1515161 Change-Id: Ic1d2e17a7d14389b6734d1b88bd28c0a2907bbd6 BUG: 1517692 Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
* cluster/afr: Fix for arbiter becoming sourcekarthik-us2017-11-274-6/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: When eager-lock is on, and two writes happen in parallel on a FD we were observing the following behaviour: - First write fails on one data brick - Since the post-op is not yet happened, the inode refresh will get both the data bricks as readable and set it in the inode context - In flight split brain check see both the data bricks as readable and allows the second write - Second write fails on the other data brick - Now the post-op happens and marks both the data bricks as bad and arbiter will become source for healing Fix: Adding one more variable called write_suvol in inode context and it will have the in memory representation of the writable subvols. Inode refresh will not update this value and its lifetime is pre-op through unlock in the afr transaction. Initially the pre-op will set this value same as read_subvol in inode context and then in the in flight split brain check we will use this value instead of read_subvol. After all the checks we will update the value of this and set the read_subvol same as this to avoid having incorrect value in that. Change-Id: I2ef6904524ab91af861d59690974bbc529ab1af3 BUG: 1516313 Signed-off-by: karthik-us <ksubrahm@redhat.com> (cherry picked from commit 19f9bcff4aada589d4321356c2670ed283f02c03)
* features/worm: new config option to manage deletion of Worm files.Vishal Pandey2017-11-224-1/+20
| | | | | | | | | | | | | | | | | | | | Add a new configuration option worm-files-deletable to file-level Worm in order to control behaviour of Worm files upon deletion. Steps to Test: 1. Add all the configuration options to a volume to activate file-level-worm 2. Option features.worm-files-deletable is set to 1 by default. 3. Create a new file and wait for the retention time to expire. 4. After retention time expires, do an truncate, rename, unlink, link or write to send the file in Worm state. 5. After that do `rm -f filename`. 6. The file is successfully removed. 7. Repeat from step 2 by setting features.worm-files-deletable 0. This time deletion should not be successful. Change-Id: Ibc89861ee296e065330b93a9f9606be5da40af31 BUG: 1508898 Signed-off-by: Vishal Pandey <vishpandey2014@gmail.com>
* afr: add checks for allowing lookupsRavishankar N2017-11-214-93/+162
| | | | | | | | | | | | | | | | | | | | | | | Problem: In an arbiter volume, lookup was being served from one of the sink bricks (source brick was down). shard uses the iatt values from lookup cbk to calculate the size and block count, which in this case were incorrect values. shard_local_t->last_block was thus initialised to -1, resulting in an infinite while loop in shard_common_resolve_shards(). Fix: Use client quorum logic to allow or fail the lookups from afr if there are no readable subvolumes. So in replica-3 or arbiter vols, if there is no good copy or if quorum is not met, fail lookup with ENOTCONN. With this fix, we are also removing support for quorum-reads xlator option. So if quorum is not met, neither read nor write txns are allowed and we fail the fop with ENOTCONN. Change-Id: Ic65c00c24f77ece007328b421494eee62a505fa0 BUG: 1515572 Signed-off-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit bd44d59741bb8c0f5d7a62c5b1094179dd0ce8a4)
* cluster/dht: Don't set ACLs on linkto fileN Balachandran2017-11-211-0/+11
| | | | | | | | | | | | | | | | The trusted.SGI_ACL_FILE appears to set posix ACLs on the linkto file that is a target of file migration. This can mess up file permissions and cause linkto identification to fail. Now we remove all ACL xattrs from the results of the listxattr call on the source before setting them on the target. > BUG: 1514329 > Signed-off-by: N Balachandran <nbalacha@redhat.com> Change-Id: I56802dbaed783a16e3fb90f59f4ce849f8a4a9b4 BUG: 1515045 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* cluster/ec: Fix op-version for disperse.other-eager-lockXavier Hernandez2017-11-162-2/+2
| | | | | | | | | | | | | The op-version used for the new option was wrong. It has been set to 3.13.0. >Change-Id: I88fbd7834e4a8018c8906303e734c251e90be8cf >BUG: 1502610 >Signed-off-by: Xavier Hernandez <jahernan@redhat.com> Change-Id: I88fbd7834e4a8018c8906303e734c251e90be8cf BUG: 1512460 Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
* cluster/ec: create eager-lock option for non-regular filesXavier Hernandez2017-11-164-13/+52
| | | | | | | | | | | | | A new option is added to allow independent configuration of eager locking for regular files and non-regular files. >Change-Id: I8f80e46d36d8551011132b15c0fac549b7fb1c60 >BUG: 1502610 >Signed-off-by: Xavier Hernandez <jahernan@redhat.com> Change-Id: I8f80e46d36d8551011132b15c0fac549b7fb1c60 BUG: 1512460 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* glusterd: display gluster volume status, when quorum type is serverSanju Rakonde2017-11-141-0/+6
| | | | | | | | | | | | | | Problem: when server-quorum-type is server, after restarting glusterd in the node which is up, gluster volume status is giving incorrect information. Fix: check whether server is blank, before adding other keys into the dictionary. Change-Id: I926ebdffab330ccef844f23f6d6556e137914047 BUG: 1511768 Signed-off-by: Sanju Rakonde <srakonde@redhat.com> (cherry picked from commit 046c7e3199fca715592762e271e6061ac99b0c4b)
* glusterd: restart the brick if qorum status is NOT_APPLICABLE_QUORUMAtin Mukherjee2017-11-141-1/+2
| | | | | | | | | | | | | | If a volume is not having server quorum enabled and in a trusted storage pool all the glusterd instances from other peers are down, on restarting glusterd the brick start trigger doesn't happen resulting into the brick not coming up. > mainline patch : https://review.gluster.org/#/c/18669/ Change-Id: If1458e03b50a113f1653db553bb2350d11577539 BUG: 1511293 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 635c1c3691a102aa658cf1219fa41ca30dd134ba)
* glusterd: Changed default op-version for some optionsShyamsundarR2017-11-062-6/+6
| | | | | | | | | | As 3.13 is branched at a point that includes the features that are changed with this commit, their minimum supported op-versions should also change to 3.13. Change-Id: I7ef8eccc13a16f93939c1edbff9508d1e167c5e4 BUG: 1510019 Signed-off-by: ShyamsundarR <srangana@redhat.com>
* core: remove experimental xlators and associated testsKaleb S. KEITHLEY2017-11-0650-5809/+2
| | | | | | | | | | | | experimental xlators removed from 3.13 > Cherry picked from 4231c40973c60999f5ef759db450d25e129ef6ba: > Reviewed-on: https://review.gluster.org/17953 > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Signed-off-by: ShyamsundarR <srangana@redhat.com> Change-Id: I34419ce22ca09b7626b8f9382c377a614fd9fed8 BUG: 1510022
* xlator/tier: flood of -Wformat-truncation warnings with gcc-7.1v4.0dev1Kaleb S. KEITHLEY2017-11-011-8/+9
| | | | | | | | | | | | | | | | Starting in Fedora 26 which has gcc-7.1.x, -Wformat-trunction is enabled with -Wformat, resulting in a flood of new warnings. This many warnings is a concern because it makes it hard(er) to see other warnings that should be addressed. An example is at https://kojipkgs.fedoraproject.org//packages/glusterfs/3.12.0/1.fc28/data/logs/x86_64/build.log For more info see https://review.gluster.org/#/c/18267/ Change-Id: Id7ef8e0dedd28ada55f72c03d91facbe1c9888bd BUG: 1492849 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* nfs: Fix crash bug when mnt3_resolve_subdir_cbk() failsShreyas Siravara2017-11-011-0/+3
| | | | | | | | | | | | | | | Summary: When mnt3_resolve_subdir_cbk() fails (this can happen when racing operations), authorized_path is sometimes NULL. We try to run strlen() on this path and as a result we crash. This diff fixes that by checking if path is NULL before dereferencing it. Test Plan: Run with patch and observe that there is no crash. Prove test for auth code. BUG: 1365683 Change-Id: I2e2bdfdc61ae906788611e151d2c753b79b312df Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
* cluster/ec: FORWARD_NULL coverity fixSunil Kumar Acharya2017-11-011-1/+3
| | | | | | | | | | Problem: frame could be NULL. Solution: Added check to verify frame. BUG: 789278 Change-Id: I55a64c936ae71ec8587a3f9dfa0fdee5d0ea5213 Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
* cluster/ec: FORWARD_NULL coverity fixSunil Kumar Acharya2017-11-012-0/+5
| | | | | | | | | | Problem: cbk could be NULL. Solution: Assigned appropriate value to cbk. BUG: 789278 Change-Id: I2e4bba9a54f965c6a7bccf0b0cb6c5f75399f6e6 Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
* glusterd: fix brick restart parallelismAtin Mukherjee2017-11-016-32/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | glusterd's brick restart logic is not always sequential as there is atleast three different ways how the bricks are restarted. 1. through friend-sm and glusterd_spawn_daemons () 2. through friend-sm and handling volume quorum action 3. through friend handshaking when there is a mimatch on quorum on friend import. In a brick multiplexing setup, glusterd ended up trying to spawn the same brick process couple of times as almost in fraction of milliseconds two threads hit glusterd_brick_start () because of which glusterd didn't have any choice of rejecting any one of them as for both the case brick start criteria met. As a solution, it'd be better to control this madness by two different flags, one is a boolean called start_triggered which indicates a brick start has been triggered and it continues to be true till a brick dies or killed, the second is a mutex lock to ensure for a particular brick we don't end up getting into glusterd_brick_start () more than once at same point of time. Change-Id: I292f1e58d6971e111725e1baea1fe98b890b43e2 BUG: 1506513 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/ec: MISSING_BREAK coverity fixSunil Kumar Acharya2017-10-311-0/+3
| | | | | | | | | | Problem: switch case syntax issue. Solution: syntax fixed. BUG: 789278 Change-Id: I76da72c3ab6ffc5db671686a71d6a596beaf496e Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
* snapshot: fix coverity issue 'DEADCODE'Sunny Kumar2017-10-311-11/+1
| | | | | | | | | | | | | | Problem : Unreachable code at glusterd-snapshot.c:6718 Unreachable code at glusterd-snapshot.c:7352 FIx : Remove unreachable code At glusterd-snapshot.c:6718 in if condition the value of "snap" must be "NULL" to call glusterd_snap_remove() which is not possible here. Change-Id: Id865bde7c1474a9b9ed11c0ed614676b4e2443c6 BUG: 789278 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* glusterd: delete source brick only once in reset-brick commit forceAtin Mukherjee2017-10-311-1/+1
| | | | | | | | | | | While stopping the brick which is to be reset and replaced delete_brick flag was passed as true which resulted glusterd to free up to source brick before the actual operation. This results commit force to fail failing to find the source brickinfo. Change-Id: I1aa7508eff7cc9c9b5d6f5163f3bb92736d6df44 BUG: 1507466 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* Xlators/features: Coverity Issue "NEGATIVE_RETURNS" in ↵Subha sree Mohankumar2017-10-311-5/+20
| | | | | | | | | | | | | | | | worm_truncate,worm_ftruncate,worm_link,worn_unlink and worm_rename. Issue :Event negative_returns: "op_errno" is passed to a parameter that cannot be negative. Fix : In functions worm_link,worm_unlink,worm_truncate,worm_ftruncate and worm_rename, added an condition to check the whether value of "op_errno" is negative or not.If the value is negative, then it is set to EROFS. Change-Id: Ie65601b6cc9799f587cc574408281dab50a58afb BUG: 789278 Signed-off-by: Subha sree Mohankumar <smohanku@redhat.com>
* glusterd: clean up portmap on brick disconnectAtin Mukherjee2017-10-314-11/+46
| | | | | | | | | | | | | | | | | | | | GlusterD's portmap entry for a brick is cleaned up when a PMAP_SIGNOUT event is initiated by the brick process at the shutdown. But if the brick process crashes or gets killed through SIGKILL then this event is not initiated and glusterd ends up with a stale port. Since GlusterD's portmap traversal happens both ways, forward for allocation and backward for registry search, there is a possibility that glusterd might end up running with a stale port for a brick which eventually will end up with clients to fail to connect to the bricks. Solution is to clean up the port entry in case the process is down as part of the brick disconnect event. Although with this the handling PMAP_SIGNOUT event becomes redundant in most of the cases, but this is the safeguard method to avoid glusterd getting into the stale port issues. Change-Id: I04c5be6d11e772ee4de16caf56dbb37d5c944303 BUG: 1503246 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: persist brickinfo's port change into glusterd's storeGaurav Yadav2017-10-315-10/+62
| | | | | | | | | | | | | | | | | | Problem: Consider a case where node reboot is performed and prior to reboot brick was listening to 49153. Post reboot glusterd assigned 49152 to brick and started the brick process but the new port was never persisted. Now when glusterd restarts glusterd always read the port from its persisted store i.e 49153 however pmap signin happens with the correct port i.e 49152. Fix: Make sure when glusterd_brick_start is called, glusterd_store_volinfo is eventually invoked. Change-Id: Ic0efbd48c51d39729ed951a42922d0e59f7115a1 BUG: 1506589 Signed-off-by: Gaurav Yadav <gyadav@redhat.com>
* write once read many: file appendable in worm stateVishal Pandey2017-10-301-10/+1
| | | | | | | | | | | | | | | | | | | | Issue: A new file is appendable even when file level worm is 1. Fix: Do a state transition in writev function in worm.c file. Steps To Test: 1- Activate file level worm. 2- Create a new file. 3- Leave file dormant for auto commit period. 4- Try and append some content to the file. 5- check the file if new content has been appended or not. 6- check if file has been transitioned to Worm Retention state. Change-Id: I52d50ad888cb0c39ad54be9352ccb07d48b8d71a BUG: 1505807 Signed-off-by: Vishal Pandey <vishpandey2014@gmail.com>
* cluster/afr: Honor default timeout of 5min for analyzing split-brain fileskarthik-us2017-10-301-1/+5
| | | | | | | | | | | | | | | | | | | Problem: After setting split-brain-choice option to analyze the file to resolve the split brain using the command "setfattr -n replica.split-brain-choice -v "choiceX" <path-to-file>" should allow to access the file from mount for default timeout of 5mins. But the timeout was not honored and was able to access the file even after the timeout. Fix: Call the inode_invalidate() in afr_set_split_brain_choice_cbk() so that it will triger the cache invalidate after resetting the timer and the split brain choice. So the next calls to access the file will fail with EIO. Change-Id: I698cb833676b22ff3e4c6daf8b883a0958f51a64 BUG: 1503519 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* posix: Ignore disk space reserve check for internal FOPSMohit Agrawal2017-10-305-36/+36
| | | | | | | | | | | | | Problem: Currently disk space reserve check is applicable for internal FOP also it needs to be ignore for internal FOP. Solution: Update the DISK_SPACE_CHECK_AND_GOTO macro at posix component. Macro will call only while key "GLUSTERFS_INTERNAL_FOP_KEY" exists in xdata. BUG: 1506083 Change-Id: I2b0840bbf4fa14bc247855b024ca136773d68d16 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* md-cache: Add additional samba and macOS specific EAs to mdcacheGünther Deschner2017-10-301-6/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Samba ships with a server implementation of the Apple Create Context extension (AAPL) as negotiated by all modern Apple clients. With the support of the AAPL extension, Apple clients will integrate better with Samba servers. The AAPL implementation itself is contained in the Samba vfs_fruit(8) module which has to be activated in Samba. This vfs_fruit module also provides support for macOS alternate data streams which will be represented in EAs. Two standard data streams ("AFP_AfpInfo" and "AFP_Resource") will be stored in the following EAs: * user.org.netatalk.Metadata * user.org.netatalk.ResourceFork For all other data streams, vfs_fruit relies on another Samba vfs module, vfs_streams_xattr(8), to handle these. Although configurable, by default the vfs_streams_xattr module will build EA keynames with a "user.DosStream." prefix. Please note that we have to deal with only one known prefix key, as macOS will happily compose EA keynames like: * user.DosStream.com.apple.diskimages.fsck:$DATA * user.DosStream.com.apple.diskimages.recentcksum:$DATA * user.DosStream.com.apple.metadata:kMDItemWhereFroms:$DATA * user.DosStream.com.apple.quarantine:$DATA * etc. Caching of vfs_fruit specific EAs is crucial for SMB performance and is controlled with the same configuration option "performance.cache-samba-metadata". Guenther Change-Id: Ia7aa341234dc13e1c0057f3d658b7ef711b5d31e BUG: 1499933 Signed-off-by: Guenther Deschner <gd@samba.org>
* protocol/client: handle the subdir handshake properly for add-brickAmar Tumballi2017-10-291-1/+9
| | | | | | | | | There should be different way we handle handshake in case of subdir mount for the first time, and in case of subsequent graph changes. Change-Id: I2a7ba836433bb0a0f4a861809e2bb0d7fbc4da54 BUG: 1505323 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* cluster/afr: Fail open on split-brainPranith Kumar K2017-10-2611-95/+208
| | | | | | | | | | | | | | | | | Problem: Append on a file with split-brain succeeds. Open is intercepted by open-behind, when write comes on the file, open-behind does open+write. Open succeeds because afr doesn't fail it. Then write succeeds because write-behind intercepts it. Flush is also intercepted by write-behind, so the application never gets to know that the write failed. Fix: Fail open on split-brain, so that when open-behind does open+write open fails which leads to write failure. Application will know about this failure. Change-Id: I4bff1c747c97bb2925d6987f4ced5f1ce75dbc15 BUG: 1294051 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* Make sure attach_brick return a proper error codeMichael Scherer2017-10-261-1/+1
| | | | | | | | | | | | | | Coverity warn about "Using uninitialized value "ret".", which is indeed the case. If we can't attach after 15 tries, attach_brick return uninitiliazed return code, which is likely hard to trigger, but bad. I use -1 to follow the UNIX convetion, but there is no specific error code tested by code. Change-Id: I43ad60d25ab169f9cea0db600805ced7f77c37ba BUG: 789278 Signed-off-by: Michael Scherer <misc@redhat.com>
* cluster/ec: Implement DISCARD FOP for ECSunil Kumar Acharya2017-10-255-48/+332
| | | | | | | | | | | Updates #254 This code change implements DISCARD FOP support for EC. BUG: 1461018 Change-Id: I09a9cb2aa9d91ec27add4f422dc9074af5b8b2db Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
* protocol/server: fix the comparision logic in case of subdir mountAmar Tumballi2017-10-241-30/+30
| | | | | | | | | | | | | without the fix, the stat entry on a file would return inode==1 for many files, in case of subdir mount This happened with the confusion of return value of 'gf_uuid_compare()', it is more like strcmp, instead of a gf_boolean return value, and hence resulted in the bug. Change-Id: I31b8cbd95eaa3af5ff916a969458e8e4020c86bb BUG: 1505527 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* cluster/ec: Allow parallel writes in EC if possiblePranith Kumar K2017-10-249-142/+288
| | | | | | | | | | | | | | | | | | Problem: Ec at the moment sends one modification fop after another, so if some of the disks become slow, for a while then the wait time for the writes that are waiting in the queue becomes really bad. Fix: Allow parallel writes when possible. For this we need to make 3 changes. 1) Each fop now has range parameters they will be updating. 2) Xattrop is changed to handle parallel xattrop requests where some would be modifying just dirty xattr. 3) Fops that refer to size now take locks and update the locks. Fixes #251 Change-Id: Ibc3c15372f91bbd6fb617f0d99399b3149fa64b2 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* gluster: IPv6 single stack supportKevin Vigor2017-10-243-0/+36
| | | | | | | | | | | | | | | | | | | | | | Summary: - This diff changes all locations in the code to prefer inet6 family instead of inet. This will allow change GlusterFS to operate via IPv6 instead of IPv4 for all internal operations while still being able to serve (FUSE or NFS) clients via IPv4. - The changes apply to NFS as well. - This diff ports D1892990, D1897341 & D1896522 to the 3.8 branch. Test Plan: Prove tests! Reviewers: dph, rwareing Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: I34fdaaeb33c194782255625e00616faf75d60c33 BUG: 1406898 Reviewed-on-3.8-fb: http://review.gluster.org/16059 Reviewed-by: Shreyas Siravara <sshreyas@fb.com> Tested-by: Shreyas Siravara <sshreyas@fb.com>
* libgfchangelog: Fix possible null pointer dereferenceKotresh HR2017-10-231-6/+6
| | | | | | | | | | If pthread_attr_init fails, gf_msg uses this->name where 'this' is not initialized yet. This patch fixes the same. Change-Id: Ie004cbe1015a0d62fc3b5512e8954c5606eeeb5f Signed-off-by: Kotresh HR <khiremat@redhat.com> BUG: 1505325
* snapshot: Issue with other processes accessing the mounted brickSunny Kumar2017-10-235-44/+218
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added code for unmount of activated snapshot brick during snapshot deactivation process which make sense as mount point for deactivated bricks should not exist. Removed code for mounting newly created snapshot, as newly created snapshots should not mount until it is activated. Added code for mount point creation and snapshot mount during snapshot activation. Added validation during glusterd init for mounting only those snapshot whose status is either STARTED or RESTORED. During snapshot restore, mount point for stopped snap should exist as it is required to set extended attribute. During handshake, after getting updates from friend mount point for activated snapshot should exist and should not for deactivated snapshot. While getting snap status we should show relevent information for deactivated snapshots, after this pathch 'gluster snap status' command will show output like- Snap Name : snap1 Snap UUID : snap-uuid Brick Path : server1:/run/gluster/snaps/snap-vol-name/brick Volume Group : N/A (Deactivated Snapshot) Brick Running : No Brick PID : N/A Data Percentage : N/A LV Size : N/A Fixes: #276 Change-Id: I65783488e35fac43632615ce1b8ff7b8e84834dc BUG: 1482023 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* glusterd: documenting server.allow-insecureSanju Rakonde2017-10-181-1/+1
| | | | | | | | | | | | problem: "server.allow-insecure" is invisible in gluster volume set help. Fix: "server.allow-insecure" is defined as NO_DOC type, chainging it to DOC type solve the problem. Change-Id: I327f1e4c1684ff846deb8b7df07d4d8a09073274 BUG: 1503424 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* gfproxyd: Let glusterd manage gfproxy daemonPoornima G2017-10-1819-75/+869
| | | | | | | Updates: #242 BUG: 1428063 Change-Id: Iaaf2edf99b2ecc75f6d30762c752a6d445c1c826 Signed-off-by: Poornima G <pgurusid@redhat.com>
* glusterd : introduce timer in mgmt_v3_lockGaurav Yadav2017-10-174-19/+244
| | | | | | | | | | | | | | | | Problem: In a multinode environment, if two of the op-sm transactions are initiated on one of the receiver nodes at the same time, there might be a possibility that glusterd may end up in stale lock. Solution: During mgmt_v3_lock a registration is made to gf_timer_call_after which release the lock after certain period of time Change-Id: I16cc2e5186a2e8a5e35eca2468b031811e093843 BUG: 1499004 Signed-off-by: Gaurav Yadav <gyadav@redhat.com>
* worm/read-only: Add sentinel NULL key to optionsPrashanth Pai2017-10-172-0/+4
| | | | | | BUG: 1193929 Change-Id: Ibfee382362826556e2e760f9b946c83445d6a08e Signed-off-by: Prashanth Pai <ppai@redhat.com>
* mount/fuse: never fail open(dir) with ENOENTRaghavendra G2017-10-171-0/+7
| | | | | | | | | | open(dir) being an operation on inode should never fail with ENOENT. If gfid is not present, the appropriate error is ESTALE. This will enable kernel to retry open after a revalidate lookup. Change-Id: I8d07d2ebb5a0da6c3ea478317442cb42f1797a4b BUG: 1500269 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
* Revert "mount/fuse: report ESTALE as ENOENT"Raghavendra G2017-10-171-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 26d16b90ec7f8acbe07e56e8fe1baf9c9fa1519e. Consider rename (index.new, store.idx) and open (store.idx) being executed in parallel. When we break down operations following sequence is possible. * lookup (store.idx) - as part of open(store.idx) returns gfid1 as the result. * rename (index.new, store.idx) changes gfid of store.idx to gfid2. Note that gfid2 was the nodeid of index.new. Since rename is successful, gfid2 is associated with store.idx. * open (store.idx) resumes and issues open fop to glusterfs with gfid1. open in glusterfs fails as gfid1 doesn't exist and the error returned by glusterfs to kernel-fuse is ENOENT. * kernel passes back the same error to application as a result to open. This error could've been prevented if kernel retries open with gfid2. Interestingly kernel do retry open when it receives ESTALE error. Even though failure to find gfid resulted in ESTALE error, commit 26d16b90ec7f8acb converted that error to ENOENT while sending an error reply to kernel. This prevented kernel from retrying open resulting in error. Change-Id: I2e752ca60dd8af1b989dd1d29c7b002ee58440b4 BUG: 1500269 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: fix crash when deleting directoriesZhang Huan2017-10-161-2/+4
| | | | | | | | | | | | | | | | | | | | | In DHT, after locks on all subvolumes are acquired, it would perform the following steps sequentially, 1. send remove dir on all other subvolumes except the hashed one in a loop; 2. wait for all pending rmdir to be done 3. remove dir on the hashed subvolume The problem is that in step 1 there is a check to skip hashed subvolume in the loop. If the last subvolume to check is actually the hashed one, and step 3 is quickly done before the last and hashed subvolume is checked, by accessing shared context data be destroyed in step 3, would cause a crash. Fix by saving shared data in a local variable to access later in the loop. Change-Id: I8db7cf7cb262d74efcb58eb00f02ea37df4be4e2 BUG: 1490642 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>