glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	dht - fixing xattr inconsistency	Barak Sason Rofman	2020-07-22	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The scenario of setting an xattr to a dir, killing one of the bricks, removing the xattr, bringing back the brick results in xattr inconsistency - The downed brick will still have the xattr, but the rest won't. This patch add a mechanism that will remove the extra xattrs during lookup. This patch is a modification to a previous patch based on comments that were made after merge: https://review.gluster.org/#/c/glusterfs/+/24613/ fixes: #1324 Change-Id: Ifec0b7aea6cd40daa8b0319b881191cf83e031d1 Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
*	dht xlator: integer handling issue	nik-redhat	2020-04-29	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue: The ret value is passed to the function instead of the proper errno value Fix: Passing the errno generated to the log function CID: 1415824 : Improper use of negative value CID: 1420205 : Improper use of negative value Change-Id: Iaa7407ebd03eda46a2c027695e6bf0f598b371b2 Updates: #1060 Signed-off-by: nik-redhat <nladha@redhat.com>
*	xlator/dht-helper: structure logging	yatipadia	2020-03-03	1	-96/+70
\| \| \| \| \| \| \| \| \| \|	convert gf_msg() to gf_smsg() Updates: #657 Change-Id: Iab35ac89b7d7fb6fb0074fc61b11bf679c517c9d Signed-off-by: yatipadia <ypadia@redhat.com> Signed-off-by: yatip <ypadia@redhat.com>
*	dht-common.h/dht-helper.c: exctract the LOCK() from DHT_UPDATE_TIME	Yaniv Kaul	2019-12-17	1	-6/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, the code (and only place) that is using this macro is in dht_inode_ctx_time_update() where it is called 3 times in a row, which is essentially 3 cycles of LOCK/UNLOCK on the same lock. Instead, extract the LOCK()/UNLOCK() part of the macro and wrap those calls with it. Change-Id: I6312b985e3d97517857b55f342440accc4063db6 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	dht: Rebalance causing IO Error - File descriptor in bad state	Mohit Agrawal	2019-10-15	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem : When a file is migrated, dht attempts to re-open all open fds on the new cached subvol. Earlier, if dht had not opened the fd, the client xlator would be unable to find the remote fd and would fall back to using an anon fd for the fop. That behavior changed with https://review.gluster.org/#/c/glusterfs/+/15804, causing fops to fail with EBADFD if the fd was not available on the cached subvol. The client xlator returns EBADFD if the remote fd is not found but dht only checks for EBADF before re-opening fds on the new cached subvol. Solution: Handle EBADFD at dht code path to avoid the issue Change-Id: I43c51995cdd48d05b12e4b2889c8dbe2bb2a72d8 Fixes: bz#1758579
*	Multiple files: make root gfid a static variable	Yaniv Kaul	2019-10-14	1	-6/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	In many places we use it, compare to it, etc. It could be a static variable, as it really doesn't change. I think it's better than initializing to 0 and then doing gfid[15] = 1 or other tricks. I think there are additional oppportunuties to make more variables static. This is an attempt at an easy one. Change-Id: I7f23a30a94056d8f043645371ab841cbd0f90d19 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	cluster/dht: Correct fd processing loop	N Balachandran	2019-10-02	1	-22/+62
\| \| \| \| \| \| \| \| \| \| \| \|	The fd processing loops in the dht_migration_complete_check_task and the dht_rebalance_inprogress_task functions were unsafe and could cause an open to be sent on an already freed fd. This has been fixed. Change-Id: I0a3c7d2fba314089e03dfd704f9dceb134749540 Fixes: bz#1757399 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	multiple files: another attempt to remove includes	Yaniv Kaul	2019-06-14	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are many include statements that are not needed. A previous more ambitious attempt failed because of *BSD plafrom (see https://review.gluster.org/#/c/glusterfs/+/21929/ ) Now trying a more conservative reduction. It does not solve all circular deps that we have, but it does reduce some of them. There is just too much to handle reasonably (dht-common.h includes dht-lock.h which includes dht-common.h ...), but it does reduce the overall number of lines of include we need to look at in the future to understand and fix the mess later one. Change-Id: I550cd001bdefb8be0fe67632f783c0ef6bee3f9f updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	dht: fix double extra unref of inode at heal path	Kinglong Mee	2019-02-13	1	-1/+1
\| \| \| \| \| \| \| \| \|	The loc_wipe is done in the _out_ section, inode_unref(loc.parent) here casues a double extra unref of loc.parent. Change-Id: I2dc809328d3d34bf7b02c7df9a4f97788af511e6 updates: bz#1651439 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
*	Multiple files: reduce work while under lock.	Yaniv Kaul	2019-01-29	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Mostly, unlock before logging. In some cases, moved different code that was not needed to be under lock (for example, taking time, or malloc'ing) to be executed before taking the lock. Note: logging might be slightly less accurate in order, since it may not be done now under the lock, so order of logs is racy. I think it's a reasonable compromise. Compile-tested only! updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: I2438710016afc9f4f62a176ef1a0d3ed793b4f89
*	libglusterfs: Move devel headers under glusterfs directory	ShyamsundarR	2018-12-05	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	libglusterfs devel package headers are referenced in code using include semantics for a program, this while it works can be better especially when dealing with out of tree xlator builds or in general out of tree devel package usage. Towards this, the following changes are done, - moved all devel headers under a glusterfs directory - Included these headers using system header notation <> in all code outside of libglusterfs - Included these headers using own program notation "" within libglusterfs This change although big, is just moving around the headers and making it correct when including these headers from other sources. This helps us correctly include libglusterfs includes without namespace conflicts. Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b Updates: bz#1193929 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	cluster/dht: NULL pointer dereferencing clang fix	Harpreet Lalwani	2018-10-31	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Dereferencing NUll pointers this,local and stbuf. 1.Replaced this->name with "dht". 2.Removed GF_VALIDATE_OR_GOTO. 3.Removed the check for "stbuf" and "this". Updates: bz#1622665 Change-Id: Id2fb2270d5ec37b76fa2aae1f1c8dca72dcc728a Signed-off-by: Harpreet Lalwani <hlalwani@redhat.com>
*	all: fix warnings on non 64-bits architectures	Xavi Hernandez	2018-10-10	1	-16/+16
\| \| \| \| \| \| \| \| \| \|	When compiling in other architectures there appear many warnings. Some of them are actual problems that prevent gluster to work correctly on those architectures. Change-Id: Icdc7107a2bc2da662903c51910beddb84bdf03c0 fixes: bz#1632717 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	dht: utilize the framework to pass-through xlator tasks	Amar Tumballi	2018-09-19	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Also fixes the issue caused due to not converting back the fn function to after getting its address. We wanted the value of the field, not the address of the pt_fop field. With this patch, DHT will always be started in pass-through mode if the number of subvols is just 1. Fixes some tests to make sure DHT is in full config (ie, subvols > 1). - increased timeout of brick-mux test as it was bordering on 300 seconds. - Also change the volume type to supported 'replica 3' from 'replica 2'. - also no DHT tests should assume presence of DHT when there is just 1 brick in volume Credits: Nithya B <nbalacha@redhat.com> fixes: #405 Change-Id: I8e55239ce58d6ac6ae1901e2e384be1ecbd33d6e Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	Land part 2 of clang-format changes	Gluster Ant	2018-09-12	1	-1734/+1648
\| \| \| \| \|	Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
*	dht: Use snprintf instead of strncpy	N Balachandran	2018-09-12	1	-9/+20
\| \| \| \| \| \| \| \| \| \| \|	The recent changes to use malloc instead of calloc left the new_name and new_path non-null terminated. We now use snprintf instead of strncpy to fix this. Change-Id: I1a31701ca9447efde38921be0ba2c73cde2e7976 fixes: bz#1626346 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	multiple files: calloc -> malloc	Yaniv Kaul	2018-09-04	1	-9/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	xlators/cluster/stripe/src/stripe-helpers.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/tier.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/dht-layout.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/dht-helper.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/dht-common.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/afr/src/afr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/afr/src/afr-inode-read.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible tests/bugs/replicate/bug-1250170-fsync.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible tests/basic/gfapi/gfapi-async-calls-test.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible tests/basic/ec/ec-fast-fgetxattr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible rpc/xdr/src/glusterfs3.h: Move to GF_MALLOC() instead of GF_CALLOC() when possible rpc/rpc-transport/socket/src/socket.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible rpc/rpc-lib/src/rpc-clnt.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible extras/geo-rep/gsync-sync-gfid.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-xml-output.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-rpc-ops.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-volume.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-system.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-snapshot.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-peer.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-global.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible It doesn't make sense to calloc (allocate and clear) memory when the code right away fills that memory with data. It may be optimized by the compiler, or have a microscopic performance improvement. In some cases, also changed allocation size to be sizeof some struct or type instead of a pointer - easier to read. In some cases, removed redundant strlen() calls by saving the result into a variable. 1. Only done for the straightforward cases. There's room for improvement. 2. Please review carefully, especially for string allocation, with the terminating NULL string. Only compile-tested! updates: bz#1193929 Original-Author: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Amar Tumballi <amarts@redhat.com> Change-Id: I16274dca4078a1d06ae09a0daf027d734b631ac2
*	cluster/dht: fix inode ref management in dht_heal_path	Susant Palai	2018-08-14	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In dht_heal_path, the inodes are created & looked up from top to down. If the path is "a/b/c", then lookup will be done on a, then b and so on. Here is a rough snippet of the function "dht_heal_path". <snippet> if (bname) { ref_count - loc.inode = create/grep inode 1 - syncop_lookup (loc.inode) - linked_inode = inode_link (loc.inode) 2 /clean up current loc/ - loc_wipe(&loc) 1 /set up parent and bname for next child / - loc.parent = inode - bname = next_child_name } out: - inode_ref (linked_inode) 2 - loc_wipe (&loc) 1 </snippet> The problem with the above code is if _bname_ is empty ie the chain lookup is done, then for the next iteration we populate loc.parent anyway. Now that bname is empty, the loc_wipe is done in the _out_ section as well. Since, the loc.parent was set to the previous inode, we lose a ref unwantedly. Now a dht_local_wipe as part of the DHT_STACK_UNWIND takes away the last ref leading to inode_destroy. This problenm is observed currently with nfs-ganesha with the nameless lookup. Post the inode_purge, gfapi does not get the new inode to link and hence, it links the inode it sent in the lookup fop, which does not have any dht related context (layout) leading to "invalid argument error" in lookup path done parallely with tar operation. test done in the following way: - create two nfs client connected with two different nfs servers. - run untar on one client and run lookup continuously on the other. - Prior to this patch, invalid arguement was seen which is fixed with the current patch. Change-Id: Ifb90c178a2f3c16604068c7da8fa562b877f5c61 fixes: bz#1610256 Signed-off-by: Susant Palai <spalai@redhat.com>
*	All: remove memset() before sprintf()	Yaniv Kaul	2018-08-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	It's not needed. There's a good chance the compiler is smart enough to remove it anyway, but it can't hurt - I hope. Compile-tested only! Change-Id: Id7c054e146ba630227affa591007803f3046416b updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	All: run codespell on the code and fix issues.	Yaniv Kaul	2018-07-22	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Please review, it's not always just the comments that were fixed. I've had to revert of course all calls to creat() that were changed to create() ... Only compile-tested! Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	dht: remove useless argument from dht_iatt_merge	Kinglong Mee	2018-07-18	1	-2/+1
\| \| \| \| \| \| \| \| \| \|	The last using of the subvol argument has been removed at 4e1ec35ef4f7 ("core: fill 'ia_ino' from 'ia_gfid' in 'storage/posix' ......") 7 years ago (2011-06-16). Change-Id: I9788d79e2e40cc153cf2960e28c7c1c1033dc8f7 fixes: bz#1601683 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
*	cluster/dht: Do not try to use up the readdirp buffer	N Balachandran	2018-06-29	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DHT attempts to use up the entire buffer in readdirp before unwinding in an attempt to reduce the number of calls. However, this has 2 disadvantages: 1. This can cause a stack overflow when parallel readdir is enabled. If the buffer only has a little space,rda can send back only one or two entries. If those entries are stripped out by dht_readdirp_cbk (linkto files for example) it will once again wind down to rda in an attempt to fill the buffer before unwinding to FUSE. This process can continue for several iterations, causing the stack to grow and eventually overflow, causing the process to crash. 2. If parallel readdir is disabled, dht could send readdirp calls with small buffers to the bricks, thus increasing the number of network calls. We are therefore reverting to the existing behaviour. Please note, this only mitigates the stack overflow, it does not prevent it from happening. This is still possible if a subvol has thousands of linkto files for instance. Change-Id: I291bc181c5249762d0c4fe27fa4fc2631166adf5 fixes: bz#1593548 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/dht: fixes to parallel renames to same destination codepath	Raghavendra G	2018-05-07	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Test case: # while true; do uuid="`uuidgen`"; echo "some data" > "test$uuid"; mv "test$uuid" "test" -f \|\| break; echo "done:$uuid"; done This script was run in parallel from multiple mountpoints Along the course of getting the above usecase working, many issues were found: Issue 1: ======= consider a case of rename (src, dst). We can encounter a situation where, * dst is a file present at the time of lookup * dst is removed by the time rename fop reaches glusterfs In this scenario, acquring inodelk on dst fails with ESTALE resulting in failure of rename. However, as per POSIX irrespective of whether dst is present or not, rename should be successful. Acquiring entrylk provides synchronization even in races like this. Algorithm: 1. Take inodelks on src and dst (if dst is present) on respective cached subvols. These inodelks are done to preserve backward compatibility with older clients, so that synchronization is preserved when a volume is mounted by clients of different versions. Once relevant older versions (3.10, 3.12, 3.13) reach EOL, this code can be removed. 2. Ignore ENOENT/ESTALE errors of inodelk on dst. 3. protect namespace of src and dst. To protect namespace of a file, take inodelk on parent on hashed subvol, then take entrylk on the same subvol on parent with basename of file. inodelk on parent is done to guard against changes to parent layout so that hashed subvol won't change during rename. 4. <rest of rename continues> 5. unlock all locks Issue 2: ======== linkfile creation in lookup codepath can race with a rename. Imagine the following scenario: * lookup finds a data-file with gfid - gfid-dst - without a corresponding linkto file on hashed-subvol. It decides to create linkto file with gfid - gfid-dst. - Note that some codepaths of dht-rename deletes linkto file of dst as first step. So, a lookup racing with an in-progress rename can easily run into this situation. * a rename (src-path:gfid-src, dst-path:gfid-dst) renames data-file and hence gfid of data-file changes to gfid-src with path dst-path. * lookup proceeds and creates linkto file - dst-path - with gfid - dst-gfid - on hashed-subvol. * rename tries to create a linkto file dst-path with src-gfid on hashed-subvol, but it fails with EEXIST. But EEXIST is ignored during linkto file creation. Now we've ended with dst-path having different gfids - dst-gfid on linkto file and src-gfid on data file. Future lookups on dst-path will always fail with ESTALE, due to differing gfids. The fix is to synchronize linkfile creation in lookup path with rename using the same mechanism of protecting namespace explained in solution of Issue 1. Once locks are acquired, before proceeding with linkfile creation, we check whether conditions for linkto file creation are still valid. If not, we skip linkto file creation. Issue 3: ======== gfid of dst-path can change by the time locks are acquired. This means, either another rename overwrote dst-path or dst-path was deleted and recreated by a different client. When this happens, cached-subvol for dst can change. If rename proceeds with old-gfid and old-cached subvol, we'll end up in inconsistent state(s) like dst-path with different gfids on different subvols, more than one data-file being present etc. Fix is to do the lookup with a new inode after protecting namespace of dst. Post lookup, we've to compare gfids and correct local state appropriately to be in sync with backend. Issue 4: ======== During revalidate lookup, if following a linkto file doesn't lead to a valid data-file, local->cached-subvol was not reset to NULL. This means we would be operating on a stale state which can lead to inconsistency. As a fix, reset it to NULL before proceeding with lookup everywhere. Issue 5: ======== Stale dentries left out in inode table on brick resulted in failures of link fop even though the file/dentry didn't exist on backend fs. A patch is submitted to fix this issue. Please check the dependency tree of current patch on gerrit for details In short, we fix the problem by not blindly trusting the inode-table. Instead we validate whether dentry is present by doing lookup on backend fs. Change-Id: I832e5c47d232f90c4edb1fafc512bf19bebde165 updates: bz#1543279 BUG: 1543279 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
*	cluster/dht: Add migration checks to dht_(f)xattrop	N Balachandran	2017-12-26	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \|	The dht_(f)xattrop implementation did not implement migration phase1/phase2 checks which could cause issues with rebalance on sharded volumes. This does not solve the issue where fops may reach the target out of order. Change-Id: I2416fc35115e60659e35b4b717fd51f20746586c BUG: 1471031 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/dht: Check for NULL before using variable	Ashish Pandey	2017-11-06	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Coverity ID: 245 Check statvfs received as cbk before using it Coverity ID: 228 Check NULL loc before freeing it. Change-Id: I1b153ed5e7b81bcf7033bf710808e95908dcfef4 BUG: 789278 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
*	cluster/dht: Don't store the entire uuid for subvols	N Balachandran	2017-10-10	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Comparing the uuid string of the local node against that stored in the local_subvol information is inefficient, especially as it is done for every file to be migrated. The code has now been changed to set the value of info to 1 if the nodeuuid is that of the node making the comparison so this becomes an integer comparison. Change-Id: I7491d59caad3b71dbf5facc94dcde0cd53962775 BUG: 1451434 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/dht : User xattrs are not healed after brick stop/start	Mohit Agrawal	2017-10-04	1	-0/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In a distributed volume custom extended attribute value for a directory does not display correct value after stop/start or added newly brick. If any extended(acl) attribute value is set for a directory after stop/added the brick the attribute(user\|acl\|quota) value is not updated on brick after start the brick. Solution: First store hashed subvol or subvol(has internal xattr) on inode ctx and consider it as a MDS subvol.At the time of update custom xattr (user,quota,acl, selinux) on directory first check the mds from inode ctx, if mds is not present on inode ctx then throw EINVAL error to application otherwise set xattr on MDS subvol with internal xattr value of -1 and then try to update the attribute on other non MDS volumes also.If mds subvol is down in that case throw an error "Transport endpoint is not connected". In dht_dir_lookup_cbk\| dht_revalidate_cbk\|dht_discover_complete call dht_call_dir_xattr_heal to heal custom extended attribute. In case of gnfs server if hashed subvol has not found based on loc then wind a call on all subvol to update xattr. Fix: 1) Save MDS subvol on inode ctx 2) Check if mds subvol is present on inode ctx 3) If mds subvol is down then call unwind with error ENOTCONN and if it is up then set new xattr "GF_DHT_XATTR_MDS" to -1 and wind a call on other subvol. 4) If setxattr fop is successful on non-mds subvol then increment the value of internal xattr to +1 5) At the time of directory_lookup check the value of new xattr GF_DHT_XATTR_MDS 6) If value is not 0 in dht_lookup_dir_cbk(other cbk) functions then call heal function to heal user xattr 7) syncop_setxattr on hashed_subvol to reset the value of xattr to 0 if heal is successful on all subvol. Test : To reproduce the issue followed below steps 1) Create a distributed volume and create mount point 2) Create some directory from mount point mkdir tmp{1..5} 3) Kill any one brick from the volume 4) Set extended attribute from mount point on directory setfattr -n user.foo -v "abc" ./tmp{1..5} It will throw error " Transport End point is not connected " for those hashed subvol is down 5) Start volume with force option to start brick process 6) Execute getfattr command on mount point for directory 7) Check extended attribute on brick getfattr -n user.foo <volume-location>/tmp{1..5} It shows correct value for directories for those xattr fop were executed successfully. Note: The patch will resolve xattr healing problem only for fuse mount not for nfs mount. BUG: 1371806 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Change-Id: I4eb137eace24a8cb796712b742f1d177a65343d5
*	cluster/dht: EBADF handling for fremovexattr and fsetxattr	N Balachandran	2017-08-09	1	-3/+22
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add EBADF handling for dht_fremovexattr and dht_fsetxattr. Change-Id: Ide0d5812dae79655d2565157e5baabcd753b4309 BUG: 1476665 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17999 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	cluster/dht: Check for open fd only on EBADF	N Balachandran	2017-08-08	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DHT fd based fops used to check if the fd was open on the cached subvol before winding the call. However, this introduced a performance regression of about 30% for reads. This check was introduced to handle cases where files were migrated while IOs were happening. As this is not the common case, dht will now check if the fd is open on the cached subvol only if the call fails with EBADF. This will prevent a performance hit where a rebalance is not running. Change-Id: I2035a858d63c3fcd22bb634055bbb0ad01686808 BUG: 1476665 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17976 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	cluster/dht: Fix fd check race	N Balachandran	2017-07-11	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a another race between the cached subvol being updated in the inode_ctx and the fd being opened on the target. 1. fop1 -> fd1 -> subvol0 2. file migrated from subvol0 to subvol1 and cached_subvol changed to subvol1 in inode_ctx 3. fop2 -> fd1 -> subvol1 [takes new cached subvol] 4. fop2 -> checks fd ctx (fd not open on subvol1) -> opens fd1 on subvol1 5. fop1 -> checks fd ctx (fd not open on subvol0) -> tries to open fd1 on subvol0 -> fails with "No such file on directory". Fix: If dht_fd_open_on_dst fails with ENOENT or ESTALE, wind to old subvol and let the phase1/phase2 checks handle it. Change-Id: I34f8011574a8b72e3bcfe03b0cc4f024b352f225 BUG: 1465075 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17731 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
*	cluster/dht: Check if fd is opened on dst subvol	N Balachandran	2017-06-28	1	-0/+280
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an fd is opened on a file, the file is migrated and the cached subvol is updated in the inode_ctx before an fd based fop is sent, the fop is sent to the dst subvol on which the fd is not opened. This causes the FOP to fail with EBADF. Now, every fd based fop will check to see that the fd has been opened on the dst subvol before winding it down. Change-Id: Id92ef5eb7a5b5226688e2d2868b15e383f5f240e BUG: 1465075 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17630 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Susant Palai <spalai@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	cluster/dht: Make optimal usage of buffer provided with readdir(p)	Sakshi	2017-05-31	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dht_readdirp must unwind with list of entries only after the entire buffer requested by kernel is filled to avoid extra syscalls occuring when returning partially filled buffer. Also wind readdir call to next subvol on reaching EOD for directory on that subvol to avoid extra network call. Change-Id: If2e1a2722f813d95457c7542bff25fef56c7a041 BUG: 1356453 Signed-off-by: Sakshi <sabansal@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: https://review.gluster.org/12271 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Susant Palai <spalai@redhat.com>
*	cluster/dht: Rebalance on all nodes should migrate files	N Balachandran	2017-05-16	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Rebalance compares the node-uuid of a file against its own to and migrates a file only if they match. However, the current behaviour in both AFR and EC is to return the node-uuid of the first brick in a replica set for all files. This means a single node ends up migrating all the files if the first brick of every replica set is on the same node. Fix: AFR and EC will return all node-uuids for the replica set. The rebalance process will divide the files to be migrated among all the nodes by hashing the gfid of the file and using that value to select a node to perform the migration. This patch makes the required DHT and tiering changes. Some tests in rebal-all-nodes-migrate.t will need to be uncommented once the AFR and EC changes are merged. Change-Id: I5ce41600f5ba0e244ddfd986e2ba8fa23329ff0c BUG: 1366817 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17239 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	feature/dht: Directory synchronization	Kotresh HR	2017-04-26	1	-648/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Design doc: https://review.gluster.org/16876 Directory creation is now synchronized with blocking inodelk of the parent on the hashed subvolume followed by the entrylk on the hashed subvolume between dht_mkdir, dht_rmdir, dht_rename_dir and lookup selfheal mkdir. To maintain internal consistency of directories across all subvols of dht, we need locks. Specifically we are interested in: 1. Consistency of layout of a directory. Only one writer should modify the layout at a time. A writer (layout setting during directory heal as part of lookup) shouldn't modify the layout while there are readers (all other fops like create, mkdir etc., which consume layout) and readers shouldn't read the layout while a writer is in progress. Readers can read the layout simultaneously. Writer takes a WRITE inodelk on the directory (whose layout is being modified) across ALL subvols. Reader takes a READ inodelk on the directory (whose layout is being read) on ANY subvol. 2. Consistency of directory namespace across subvols. The path and associated gfid should be same on all subvols. A gfid should not be associated with more than one path on any subvol. All fops that can change directory names (mkdir, rmdir, renamedir, directory creation phase in lookup-heal) takes an entrylk on hashed subvol of the directory. NOTE1: In point 2 above, since dht takes entrylk on hashed subvol of a directory, the transaction itself is a consumer of layout on parent directory. So, the transaction is a reader of parent layout and does an inodelk on parent directory just like any other layout reader. So a mkdir (dir/subdir) would: > Acquire a READ inodelk on "dir" on any subvol. > Acquire an entrylk (dir, "subdir") on hashed subvol of "subdir". > creates directory on hashed subvol and possibly on non-hashed subvols. > UNLOCK (entrylk) > UNLOCK (inodelk) NOTE2: mkdir fop while setting the layout of the directory being created is considered as a reader, but NOT a writer. The reason is for a fop which can consume the layout of a directory to come either of the following conditions has to be true: > mkdir syscall from application has to complete. In this case no need of synchronization. > A lookup issued on the directory racing with mkdir has to complete. Since layout setting by a lookup is considered as a writer, only one of either mkdir or lookup will set the layout. Code re-organization: All the lock related routines are moved to "dht-lock.c" file. New wrapper function is introduced to take blocking inodelk followed by entrylk 'dht_protect_namespace' Updates #191 Change-Id: I01569094dfbe1852de6f586475be79c1ba965a31 Signed-off-by: Kotresh HR <khiremat@redhat.com> BUG: 1443373 Reviewed-on: https://review.gluster.org/15472 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	refcount: typecast function for calling on free	Niels de Vos	2017-01-31	1	-7/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All of the functions called to free the refcounted structure are doing a typecast from (void*) to their own type taht is being free'd. This really is not needed and the refcount interface is made a little simpler without the requirement of typecasting. With this small improvement in the API, all callers are updated too. Change-Id: I32473b6d1799f62861d4b2d78ea30c09e6c80ab1 BUG: 1416889 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/16471 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	dht/cluster: add logs to fix-layout code path	Susant Palai	2017-01-20	1	-7/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently there is no helpful log in fix-layout code path. Adding the logs to be helpful for debugging fix-layout failures. BUG: 1414782 Change-Id: I61c29ceedcaa2e235fa7be99866709d6ca6de3ae Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/16040 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	feature/dht: undo partially successful dir rename	Csaba Henk	2017-01-11	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As with dht, dirs are present on all subvolumes, renaming them is a compound operation and thus a partial success + partial failure scenario is possible, resulting in an inconsistent state. For purposes of reproduction, such a scenario can easily be produced by stopping the volume, edit the volfile of a certain subvolume to get at an "option read-only on" setting, and then restart the volume. Thus those operations that are to make change on the affected subvolume will fail with EROFS. To handle such scenarios, we introduce an in-memory cache where we record the return values obtained from the subvolumes. At the final stage of the dir rename operation we check if it's a partial success/fail situation. If yes, then we perform a reverse rename op on those subvolumes where the operation succeeded. Change-Id: I3d05f74f53932cb984a918d252a7309c1009a51d BUG: 1412069 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/15739 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com>
*	cluster/dht: Fix dict_leak in migration check tasks	N Balachandran	2017-01-02	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixed a memleak where dict was not being unrefed in the dht_migration_complete_check_task and dht_rebalance_inprogress_task functions. Change-Id: I3d42e9a2e5c8596c985bf6431a68fd3905227383 BUG: 1409186 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/16308 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: MOHIT AGRAWAL <moagrawa@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	cluster/dht: remove unnecessary struct member	N Balachandran	2016-09-22	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Removed a structure member that was not used from dht_local_t. Skipped some unnecessary checks in dht_filter_loc_subvol_key. Change-Id: I81740b6528e063fb9cf5817e05865ff4d77aa748 BUG: 1378305 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15542 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	quotad: fix potential buffer overflows	Raghavendra G	2016-08-25	1	-3/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This converts sprintf to gf_asprintf in following components: * quotad.c * dht * afr * protocol/client * rpc/rpc-lib * rpc/rpc-transport Change-Id: If8a267bab3d91003bdef3a92664077a0136745ee BUG: 1332073 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/14102 Tested-by: Manikandan Selvaganesh <mselvaga@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
*	cluster/dht: Fix unsafe iteration on inode->fd_list	Xavier Hernandez	2016-06-15	1	-16/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When DHT traverses the inode->fd_list, it does that in an unsafe way that can generate races with fd_unref() called from other threads. This patch fixes this problem taking the inode->lock and adding a reference to the fd while it's being used outside of the mutex protected region. A minor change in storage/posix has been done to also access the inode->fd_list in a safe way. Change-Id: I10d469ca6a8f76e950a8c9779ae9c8b70f88ef93 BUG: 1344340 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/14682 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	dht:remember locked subvol and send unlock to the same	Mohammed Rafi KC	2016-05-06	1	-0/+155
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During locking we send lock request to cached subvol, and normally we unlock to the cached subvol But with parallel fresh lookup on a directory, there is a race window where the cached subvol can change and the unlock can go into a different subvol from which we took lock. This will result in a stale lock held on one of the subvol. So we will store the details of subvol which we took the lock and will unlock from the same subvol Change-Id: I47df99491671b10624eb37d1d17e40bacf0b15eb BUG: 1311002 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13492 Reviewed-by: N Balachandran <nbalacha@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
*	cluster/distribute: detect stale layouts in entry fops	Raghavendra G	2016-04-22	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dht_mkdir () { first-hashed-subvol = hashed-subvol for "bname" in in-memory layout of "parent"; inodelk (SETLKW, parent, "LAYOUT_HEAL_DOMAIN", "can be any subvol, but we choose first-hashed-subvol randomly"); { begin: hashed-subvol = hashed-subvol for "bname" in in-memory layout of "parent"; hash-range = extract hashe-range from layout of "parent"; ret = mkdir (parent/bname, hashed-subvol, hash-range); if (ret == "hash-value doesn't fall into layout stored on the brick (this error is returned by posix-mkdir)") { refresh_parent_layout (); goto begin; } } inodelk (UNLCK, parent, "LAYOUT_HEAL_DOMAIN", "first-hashed-subvol"); proceed with other parts of dht_mkdir; } posix_mkdir (parent/bname, client-hash-range) { disk-hash-range = getxattr (parent, "dht-layout-key"); if (disk-hash-range != client-hash-range) { fail-with-error ("hash-value doesn't fall into layout stored on the brick"); return 0; } continue-with-posix-mkdir; } Similar changes need to be done for dentry operations like create, symlink, link, unlink, rmdir, rename. These will be addressed in subsequent patches. This patch addresses only mkdir codepath. This change breaks stripe tests, as on some striped subvols dht layout xattrs are not set for some reason. This results in failure of mkdir. Since striped volumes are always created with dht, some tests associated with stripe also fail. So, I am making following tests changes (since stripe is out of maintainance): * modify ./tests/basic/rpc-coverage.t to not to use striped volumes * mark all (2) tests in tests/bugs/stripe/ as bad tests Change-Id: Idd1ae879f24a48303dc743c1bb4d91f89a629e25 BUG: 1323040 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/13885 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com>
*	dht: lock on subvols to prevent lookup vs rmdir race	Sakshi	2016-04-05	1	-5/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a possibility that while an rmdir is completed on some non-hashed subvol and proceeding to others, a lookup selfheal can recreate the same directory on those subvols for which the rmdir had succeeded. Now the deletion of the parent directory will fail with an ENOTEMPTY. To fix this take blocking inodelk on the subvols before starting rmdir. Selfheal must also take blocking inodelk before creating the entry. Change-Id: I168a195c35ac1230ba7124d3b0ca157755b3df96 BUG: 1245065 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/13528 CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
*	dht: report constant directory size	Jeff Darcy	2016-03-20	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Directory size is meaningless. Every filesystem has its own unpredictable way of increasing or decreasing it, based on internal data structures and even transient conditions. Some filesystems (e.g. ext4) never decrease it at all. Others (e.g. btrfs) don't even report it. Very few programs look at it, and those that do are broken. Unfortunately, one such program is GNU tar, which will complain when it sees different values because at different times we got the value from different DHT subvolumes. To avoid such problems, just report a constant value. Change-Id: Id64ce917c75b5f7ff50cb55b6e997f3b3556e7e3 BUG: 1302948 Original-author: Shyam <srangana@redhat.com> Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13770 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/dht : Ftruncate on migrating file fails with EINVAL	N Balachandran	2015-12-22	1	-25/+202
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	What: If dht_open is called on a migrating file after the inode_ctx is set, subsequent FOPs on that fd do not open the fd on the dst subvol. This is seen when the open-ftruncate-close sequence is repeatedly called on a migrating file. A second call to the sequence described above causes dht_truncate_cbk to call dht_truncate2 as the dht_inode_ctx was already set by the first call. As dht_rebalance_in_progress_check is not called, the fd is not opened on the dst subvol. On a distributed-replicate volume, this causes AFR to open the fd using afr_fix_open, but with the wrong flags, causing posix_ftruncate to fail with EINVAL. The fix: We require fd specific information to make a decision while handling migrating files. Set the fd_ctx to indicate the fd has been opened on the dst subvol and check if it has been set while processing Phase1/Phase2 checks in the FOP callback functions. Change-Id: I43cdcd8017b4a11e18afdd210469de7cd9a5ef14 BUG: 1284823 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12985 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	cluster/tier: fix loading tier.so into glusterd	N Balachandran	2015-12-03	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	glusterd occasionally loads shared libraries of translators. This failed for tiering due to a reference to dht_methods which is defined as a global variable which is not necessary. The global variable has been removed and this is now a member of dht_conf and is now initialised in the *_init calls. Change-Id: Ifa0a21e3962b5cd8d9b927ef1d087d3b25312953 BUG: 1287842 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12863 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	dht: heal directory path if the directory is not present	Mohammed Rafi KC	2015-11-08	1	-0/+171
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After a successful nameless lookup if the directory is not present on any of the subvol, then we will get the path of the directory and will recursively send a named lookp on each parent directory. This will help particularly for the scenarios like add brick and attach-tier. Change-Id: I64c2118a5ab03bbaa59b0dfc62babdf4472a92a3 BUG: 1272949 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12376 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	dht/rebalance: fix layout and dict leaks	Susant Palai	2015-10-06	1	-0/+5
\| \| \| \| \| \| \| \| \|	Change-Id: Ib3911dfa1f950ff9decbe249ad798e97226dd06d BUG: 1266877 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12295 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
*	cluster/tier: Handle FOPs on files being migrated	N Balachandran	2015-09-22	1	-23/+82
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Determine which DHT level is responsible for handling fops on a file undergoing migration based on the name of the the linkto xattr set on the file being migrated and process accordingly. Change-Id: I82772e39314d4fe7f2ba0dcf22de0c6a374ee139 BUG: 1254428 Signed-off-by: N Balachandran <nbalacha@redhat.com> Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12090 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>