glusterfs.git -

	Commit message	Author	Age	Files	Lines
*	doc: Added release 8.2 notes (tag: v8.2)	Rinku Kothiya	2020-09-16	1	-0/+27
	Updates: #1485
	Change-Id: Ia42666051df1624444ea203bf8b7c876cf78b592
	Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
*	Events: Fixing coverity issues.	Srijan Sivakumar	2020-09-15	1	-3/+5
	Fixing resource leak reported by coverity scan.
	CID: 1431237
	Change-Id: I2bed106b3dc4296c50d80542ee678d32c6928c25
	Updates: #1060
	Signed-off-by: Srijan Sivakumar <ssivakum@redhat.com>
	(cherry picked from commit ebc0253269d8a538239dd0b99d42f56ea320b0f0)
*	Events: Socket creation after getaddrinfo and IPv4 and IPv6 packet capture	srijan-sivakumar	2020-09-14	3	-15/+49
	Issue: Currently, the socket is created before the getaddrinfo function is invoked. This can cause a mismatch between the protocol and address families of the created socket and the result of the getaddrinfo API. Also, the glustereventsd UDP server by default captures only IPv4 packets, so IPv6 packets are not captured at all.
	Code Changes:
	1. Modified the socket creation so that its parameters depend on the result of the getaddrinfo function.
	2. Created a subclass for adding address family in glustereventsd.py for both AF_INET and AF_INET6.
	3. Modified addresses in the eventsapiconf.py.in
	Reasoning behind the approach:
	1. If we are using the getaddrinfo function, socket creation should happen only after we check that we received valid addresses back. Hence socket creation should come after the call to getaddrinfo.
	2. The listening server which pushes the events to the webhook has to listen for both IPv4 and IPv6 messages, as we cannot be sure which address family is picked in _gf_event.
	Fixes: #1377
	Change-Id: I568dcd1a977c8832f0fef981e1f81cac7043c760
	Signed-off-by: srijan-sivakumar <ssivakum@redhat.com>
	(cherry picked from commit 7c309928591deb8d0188793677958226ac03897a)
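	The ordering argued for in this message (resolve first, then build the socket from what the resolver returned) can be sketched in Python. This is only an illustration of the idea, not the C code in _gf_event nor the glustereventsd change itself; the helper name `open_udp_socket` and the port number used in the demo are assumptions made for the example.

```python
import socket


def open_udp_socket(host, port):
    """Resolve host first, then create a UDP socket that matches the result.

    Hypothetical helper for illustration only.
    """
    last_error = None
    # getaddrinfo decides the address family (AF_INET or AF_INET6).
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_DGRAM
    ):
        try:
            # Socket parameters come from the resolver result, so the
            # socket family and the destination address cannot disagree.
            sock = socket.socket(family, socktype, proto)
        except OSError as exc:
            last_error = exc
            continue
        return sock, sockaddr
    raise OSError(f"no usable address for {host!r}") from last_error


if __name__ == "__main__":
    # 24009 is just an example port for the demo.
    sock, addr = open_udp_socket("127.0.0.1", 24009)
    print(sock.family, addr)
    sock.close()
```

	Creating the socket before resolving would force a guess at the family; with this ordering an IPv6-only name simply yields an AF_INET6 socket.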
*	glusterd: readdir-ahead off by default	nik-redhat	2020-09-14	1	-1/+1
	Change the default value of readdir-ahead to off. It can still be enabled or disabled later with the `gluster vol set <volname> performance.readdir-ahead enable/disable` command.
	Updates: #1472
	Change-Id: Idb3e16e8be98d7a811fc8e5d09906919ef50fbab
	Signed-off-by: nik-redhat <nladha@redhat.com>
	(cherry picked from commit 84a4cf76219b6187fc625740d1a1ebbe40e9f22c)
*	glusterd: cksum mismatch on upgrading to latest gluster	nik-redhat	2020-09-14	1	-0/+9
	Issue: In gluster versions before 7, the checksums were calculated whether or not quota was enabled, and that cksum value was also stored in the quota.cksum file. From gluster 7 onwards, the cksum is calculated only if quota is enabled. Because of this, the cksums in quota.cksum files differ after upgrading.
	Fix: Added a check: if the OP_VERSION is less than 7, follow the previous method; otherwise, follow the latest changes for cksum calculation.
	The change to the cksum calculation was made in this commit: https://github.com/gluster/glusterfs/commit/3b5eb592f5
	Updates: #1332
	Change-Id: I7a95e5e5f4d4be4983fb7816225bf9187856c003
	Signed-off-by: nik-redhat <nladha@redhat.com>
	(cherry picked from commit 865cca1190e233381f975ff36118f46e29477dcf)
	Signed-off-by: nik-redhat <nladha@redhat.com>
*	open-behind: implement create fop	Xavi Hernandez	2020-09-14	1	-0/+52
	Open behind didn't implement the create fop. As a result, created files were not counted in the number of open fds. This could cause future opens to be delayed when they shouldn't be.
	This patch implements the create fop. It also fixes a problem when destroying the stack: when frame->local was not NULL, STACK_DESTROY() tried to mem_put() it, which is not correct.
	Fixes: #1440
	Change-Id: Ic982bad07d4af30b915d7eb1fbcef7a847a45869
	Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	tests: provide an option to mark tests as 'flaky'	Amar Tumballi	2020-09-14	15	-40/+56
	* also add some time gap in other tests to see if we get things properly
	* create a directory 'tests/000/', which can host any tests, which are flaky.
	* move all the tests mentioned in the issue to above directory.
	* as the above dir gets tested first, all flaky tests would be reported quickly.
	* change `run-tests.sh` to continue tests even if flaky tests fail.
	Reference: gluster/project-infrastructure#72
	Updates: #1000
	Change-Id: Ifdafa38d083ebd80f7ae3cbbc9aa3b68b6d21d0e
	Signed-off-by: Amar Tumballi <amar@kadalu.io>
	(cherry picked from 097db13c11390174c5b9f11aa0fd87eca1735871)
*	libglusterfs: fix dict leak	Ravishankar N	2020-09-07	4	-10/+30
	Problem: gf_rev_dns_lookup_cached() allocated struct dnscache->dict if it was null, but the freeing was left to the caller.
	Fix: Moved dict allocation and freeing into the corresponding init and fini routines so that it's easier for the caller to avoid such leaks.
	Updates: #1000
	Change-Id: I90d6a6f85ca2dd4fe0ab461177aaa9ac9c1fbcf9
	Signed-off-by: Ravishankar N <ravishankar@redhat.com>
	(cherry picked from commit 079f7a7d8a2bd85070c1da4dde2452ca82a1cdbb)
*	doc: Updated release 8.1 notes (tag: v8.1)	Rinku Kothiya	2020-08-25	1	-2/+2
	Updates: #1318
	Change-Id: I87787a1aaf59302ad045ed6d2562920e17654678
	Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
*	doc: Added release 8.1 notes	Rinku Kothiya	2020-08-24	1	-0/+32
	Updates: #1318
	Change-Id: I14d589bd9af85bdd4ae02902e41d4c5f7d930358
	Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
*	afr: add null check for thin-arbiter gfid.	Ravishankar N	2020-08-21	4	-88/+13
	Problem: Lookup/creation of the thin-arbiter ID file happens in the background during mounting. On new volumes, if the ID file creation is in progress and a FOP fails on a data brick, a post-op (xattrop) is attempted on TA. Since the TA file's gfid is null at this point, the ASSERT checks in protocol/client cause a crash.
	Fix: Given that we decided to do lookup/creation of thin-arbiter in the background, fail the other AFR FOPs on TA if the ID file's gfid is null instead of winding them down to protocol/client. Also remove afr_changelog_thin_arbiter_post_op(), which seems to be dead code.
	Updates: #763
	Change-Id: I70dc666faf55cc5c8f7cf8e7d36085e4fa399c4d
	Signed-off-by: Ravishankar N <ravishankar@redhat.com>
	(cherry picked from commit f9b5074394e3d2f3b6728aab97230ba620879426)
*	open-behind: fix call_frame leak	Xavi Hernandez	2020-08-21	1	-4/+10
	When an open was delayed, a copy of the frame was created because the current frame was used to unwind the "fake" open. When the open was actually sent, the frame was correctly destroyed. However, if the file was closed before needing to send the open, the frame was not destroyed.
	This patch correctly destroys the frame in all cases.
	Change-Id: I8c00fc7f15545c240e8151305d9e4cf06d653926
	Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
	Fixes: #1440
*	posix: Implement a janitor thread to close fd	Mohit Agrawal	2020-08-21	7	-26/+160
	Problem: In commit fb20713b380e1df8d7f9e9df96563be2f9144fd6 we used a synctask to close fds, but we found that the patch reduces performance.
	Solution: Use a janitor thread to close fds, save the pfd ctx into the ctx janitor list, and also save the posix_xlator into the pfd object to avoid the race condition during cleanup in a brick_mux environment.
	Change-Id: Ifb3d18a854b267333a3a9e39845bfefb83fbc092
	Fixes: #1396
	Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
	(cherry picked from commit 41b9616435cbdf671805856e487e373060c9455b)
*	glusterd: dump SSL error stack on disconnect	Leonid Ishimnikov	2020-08-20	1	-0/+7
	Problem: When a non-SSL connection is attempted on an SSL-enabled management port, unrelated peers are subsequently disconnected from the node with a misleading error message.
	Cause: A non-SSL client causes OpenSSL to push a wrong version error into its thread-local error stack, but this error is never cleared, and it lingers in the stack until the thread is used by another SSL session, and a certain condition requires the error stack to be examined, at which time the old error is discovered and the connection is terminated.
	Solution: Log and clear the error stack upon terminating the connection.
	Change-Id: I82f3a723285df24dafc88850ae4fca65b69f6ae4
	Fixes: #1418
	Signed-off-by: Leonid Ishimnikov <lishim@fastmail.com>
	(cherry picked from commit bb5801d1480314e09b4203d2525bd01aada5c683)
*	features/shard: optimization over shard lookup in case of prealloc	Vinayakswami Hariharmath	2020-08-20	2	-7/+84
	Assume that we are preallocating a VM of size 1TB with a shard block size of 64MB; then there will be ~16k shards.
	This creation happens in 2 steps in the shard_fallocate() path, i.e.
	1. lookup for the shards, if any are already present, and
	2. mknod for those shards that do not exist.
	But in case of fresh creation, we don't have to lookup all the shards which are not present, as the file size will be 0. Through this, we can save the lookup on all shards which are not present. This optimization is quite useful when preallocating a big VM.
	Also, if the file is already present and the call is to extend it to a bigger size, then we need not lookup non-existent shards. Just lookup the preexisting shards, populate the inodes and issue mknod for the extended size.
	Fixes: #1425
	Change-Id: I60036fe8302c696e0ca80ff11ab0ef5bcdbd7880
	Signed-off-by: Vinayakswami Hariharmath <vharihar@redhat.com>
	(cherry picked from commit 2ede911d07c6dc07a0f729526ab590ace77341ae)
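	The "~16k shards" figure and the lookups saved can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope model in Python, assuming binary units (1TB = 2^40 bytes, 64MB = 2^26 bytes) and counting the base file as one block; `shard_count` and `lookups_needed` are hypothetical names, not functions of the shard translator.

```python
TIB = 1024 ** 4
MIB = 1024 ** 2


def shard_count(file_size, shard_block_size):
    """Blocks needed to hold file_size bytes (ceiling division)."""
    return -(-file_size // shard_block_size)


def lookups_needed(current_size, shard_block_size):
    """Only blocks that already exist need a lookup before mknod."""
    return shard_count(current_size, shard_block_size)


if __name__ == "__main__":
    total = shard_count(1 * TIB, 64 * MIB)
    print(total)                         # 16384
    print(lookups_needed(0, 64 * MIB))   # 0 for a fresh file
```

	For a fresh 1TB preallocation this model skips all 16384 lookups; when extending an existing file, only the blocks covered by the current size are looked up.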
*	extras: Modify group 'virt' to include network-related options	Krutika Dhananjay	2020-08-20	1	-0/+5
	This is needed to work around an issue where VMs running on online hosts are killed when a different host is rebooted in ovirt-gluster hyperconverged environments. The actual RCA is quite lengthy and is documented in the github issue. Please refer to it for more details.
	Change-Id: Ic25b5f50144ad42458e5c847e1e7e191032396c1
	Fixes: #1217
	Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
	(cherry picked from commit 5391f16fc4aa00f75af2a4c2707768370ace5f6c)
*	cluster/ec: Remove stale entries from indices/xattrop folder	Ashish Pandey	2020-08-20	2	-2/+78
	Problem: If a gfid is present in the indices/xattrop folder while the file/dir is actually healthy and all the xattrs are healthy, it causes a lot of lookups by shd on an entry which does not need to be healed. This whole process eats up a lot of CPU without doing meaningful work.
	Solution: Set the trusted.ec.dirty xattr of the entry so that the actual heal process happens and, at the end of it, during unset of dirty, the gfid entry is removed from indices/xattrop.
	Change-Id: Ib1b9377d8dda384bba49523e9ff6ba9f0699cc1b
	Fixes: #1385
	Signed-off-by: Ashish Pandey <aspandey@redhat.com>
	(cherry picked from commit ba1b0a471dec968633f89c7f790b099fb4ad700d)
*	glusterd: Increase buffer length to save multiple hostnames in peer file	Mohit Agrawal	2020-08-19	1	-3/+3
	Problem: At the time of handling a friend update request, glusterd updates the peer file, and if DNS has returned multiple hostnames for the same IP, glusterd saves all hostnames in the peer file. In commit 1fa089e7a2b180e0bdcc1e7e09a63934a2a0c0ef we changed the approach to save all key-value pairs in a single shot. If the buffer does not have space to store the hostnames, glusterd writes a partial hostname to the peer file.
	Solution: To avoid the failure, increase the buffer length.
	Change-Id: Iee969d165333e9c5ba69431d474c541b8f12d442
	Fixes: #1407
	Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
	(cherry picked from commit 6e8e73a06d71382f8f6e3cd83fe72692d19e66ba)
*	geo-rep: Fix corner case in rename on mkdir during hybrid crawl	Sunny Kumar	2020-08-19	1	-0/+2
	Problem: The issue is hit during hybrid mode while handling rename on the slave. In this special case the rename is recorded as mkdir, and geo-rep processes it by resolving the path from the backend. While resolving the backend path during this special handling, one corner case is not considered.
	<snip>
	Traceback (most recent call last):
	  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in worker
	    res = getattr(self.obj, rmeth)(in_data[2:])
	  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 588, in entry_ops
	    src_entry = get_slv_dir_path(slv_host, slv_volume, gfid)
	  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 710, in get_slv_dir_path
	    dir_entry = os.path.join(pfx, pargfid, basename)
	  File "/usr/lib64/python2.7/posixpath.py", line 75, in join
	    if b.startswith('/'):
	AttributeError: 'int' object has no attribute 'startswith'
	In python3:
	Traceback (most recent call last):
	  File "<stdin>", line 1, in <module>
	  File "/usr/lib64/python3.8/posixpath.py", line 90, in join
	    genericpath._check_arg_types('join', a, p)
	  File "/usr/lib64/python3.8/genericpath.py", line 152, in _check_arg_types
	    raise TypeError(f'{funcname}() argument must be str, bytes, or '
	TypeError: join() argument must be str, bytes, or os.PathLike object, not 'int'
	</snip>
	Backport of:
	>Patch link: https://review.gluster.org/#/c/glusterfs/+/24468/
	>Change-Id: I8b926899c60ad8c4ffc886d57028ba70fd21e332
	>Fixes: #1250
	>Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
	Change-Id: I8b926899c60ad8c4ffc886d57028ba70fd21e332
	Fixes: #1250
	Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
	(cherry picked from commit 27f5c8ba844e9da54fc1304df4ffe015a3bbb9bd)
	Change-Id: I171eb9ad4e30f49cfe86cb258918682d3c0f5af9
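	The Python 3 failure in the traceback is easy to reproduce in isolation: os.path.join() rejects a non-string component. The sketch below shows the error and one generic way to guard against it. It is not the actual two-line patch, whose content is not shown in this log; `join_dir_entry` and the sample path components are made up for illustration.

```python
import os.path


def join_dir_entry(pfx, pargfid, basename):
    """Hypothetical guard: coerce components to str before joining."""
    return os.path.join(str(pfx), str(pargfid), str(basename))


def join_fails_with_int():
    """Return True if a plain join with an int component raises TypeError."""
    try:
        os.path.join(".gfid", "parent-gfid", 5)
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print(join_fails_with_int())
    print(join_dir_entry(".gfid", "parent-gfid", 5))
```

	Coercing blindly can hide a real logic error (an int where a name was expected), so a real fix may prefer to detect the corner case explicitly rather than convert.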
*	cluster/ec: Improve detection of new heals	Xavi Hernandez	2020-08-19	6	-28/+84
	When EC successfully healed a directory, it assumed that other entries inside that directory might have been created, which could require additional heal cycles. For this reason, when the heal happened as part of one index heal iteration, it triggered a new iteration.
	The problem happened when the directory was healthy, so no new entries were added, but its index entry was not removed for some reason. In this case self-heal started an endless loop, healing the same directory continuously and causing high CPU utilization.
	This patch improves detection of new files added to the heal index so that a new index heal iteration is only triggered if there is new work to do.
	Change-Id: I2355742b85fbfa6de758bccc5d2e1a283c82b53f
	Fixes: #1354
	Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	features/shard: Convert shard block indices to uint64	Krutika Dhananjay	2020-08-19	2	-7/+10
	This patch fixes a crash in FOPs that operate on really large sharded files where the number of participant shards could sometimes exceed signed int32 max.
	The patch also adds GF_ASSERTs to ensure that the number of participating shards is always greater than 0 for files that do have more than one shard.
	Change-Id: I354de58796f350eb1aa42fcdf8092ca2e69ccbb6
	Fixes: #1348
	Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
	(cherry picked from commit cdf01cc47eb2efb427b5855732d9607eec2abc8a)
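	Why a count above signed int32 max is dangerous can be shown with fixed-width integers. The following Python sketch uses ctypes to mimic what happens when such a value is stored in a 32-bit signed variable versus a 64-bit unsigned one; it illustrates the wraparound only and is not the shard translator's code. The helper names are assumptions.

```python
import ctypes

INT32_MAX = 2 ** 31 - 1


def as_int32(value):
    """Value as it would appear when truncated into a signed 32-bit int."""
    return ctypes.c_int32(value).value


def as_uint64(value):
    """Value as stored in an unsigned 64-bit int."""
    return ctypes.c_uint64(value).value


if __name__ == "__main__":
    shard_index = INT32_MAX + 1
    print(as_int32(shard_index))   # -2147483648
    print(as_uint64(shard_index))  # 2147483648
```

	A negative index or count of this kind is what an assertion such as "number of participating shards is greater than 0" would catch.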
*	features/shard: Use fd lookup post file open	Vinayakswami Hariharmath	2020-08-19	2	-43/+110
	Issue: When a process has an open fd and the same file is unlinked in the middle of the operations, a path-based lookup fails with ENOENT or stale file.
	Solution: When the file is already open and the fd is available, use fstat to get the file attributes.
	Change-Id: I0e83aee9f11b616dcfe13769ebfcda6742e4e0f4
	Fixes: #1281
	Signed-off-by: Vinayakswami Hariharmath <vharihar@redhat.com>
	(cherry picked from commit 71dd19f710b81136f318b3a95ae430971198ee70)
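	The same behaviour can be observed on a local POSIX filesystem, which may help when reasoning about the fix. The Python sketch below opens a temporary file, unlinks it, and compares a path-based stat with an fd-based fstat. It assumes POSIX unlink semantics and says nothing about how the shard translator issues the call; `stat_after_unlink` is a hypothetical name.

```python
import os
import tempfile


def stat_after_unlink():
    """Return (path_lookup_ok, size_via_fd) for an unlinked, still-open file."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello")
        os.unlink(path)  # the name is gone, the open fd stays valid
        try:
            os.stat(path)
            path_lookup_ok = True
        except FileNotFoundError:  # ENOENT
            path_lookup_ok = False
        size_via_fd = os.fstat(fd).st_size
    finally:
        os.close(fd)
    return path_lookup_ok, size_via_fd


if __name__ == "__main__":
    print(stat_after_unlink())
```

	The path lookup fails while the descriptor still yields the attributes, which is the reason for preferring the fd when one is available.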
*	Issue with gf_fill_iatt_for_dirent	Soumya Koduri	2020-08-19	1	-1/+1
	In "gf_fill_iatt_for_dirent()", while calculating inode_path for loc, the inode should be the parent's. Instead it is loc.inode, which results in an error, and eventually lookup/readdirp fails.
	This patch fixes the same.
	Change-Id: Ied086234a4634e8cb13520521ac547c87b3c76b5
	Fixes: #1351
	Signed-off-by: Soumya Koduri <skoduri@redhat.com>
	(cherry picked from commit ab8308333aaf033e07dbbdf2f69f9313a7e311f3)
*	cluster/afr: Delay post-op for fsync	Pranith Kumar K	2020-07-28	9	-10/+240
	Problem: AFR doesn't delay post-op for the fsync fop. For fsync-heavy workloads this leads to an unnecessary fxattrop/finodelk for every fsync, leading to bad performance.
	Fix: Have delayed post-op for fsync. Add a special flag in xdata to indicate that afr shouldn't delay post-op in cases where either the process will terminate or a graph-switch would happen. Otherwise it leads to unnecessary heals when the graph-switch/process-termination happens before the delayed post-op completes.
	Fixes: #1253
	Change-Id: I531940d13269a111c49e0510d49514dc169f4577
	Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	api: libgfapi symbol versions break LTO in Fedora rawhide/f33	Kaleb S. KEITHLEY	2020-07-20	6	-355/+210
	The way symbol versions are implemented is incompatible with gcc-10 and LTO.
	Fedora provenpackager Jeff Law (law [at] redhat.com) writes in the Fedora dist-git glusterfs.spec:
	  This package uses top level ASM constructs which are incompatible with LTO. Top level ASMs are often used to implement symbol versioning. gcc-10 introduces a new mechanism for symbol versioning which works with LTO. Converting packages to use that mechanism instead of toplevel ASMs is recommended.
	In particular, note that the version of gluster in Fedora rawhide/f33 is glusterfs-8.0RC0. Once this fix is merged it will be necessary to backport it to the release-8 branch.
	At the time that gfapi symbol versions were first implemented we copied the GNU libc (glibc) symbol version implementation following Uli Drepper's symbol versioning HOWTO. Now gcc-10 has a symver attribute that can be used instead. (Maybe it has been there all along?)
	Both the original implementation and this implementation yield the same symbol versions. This can be seen by running `nm -D --with-symbol-versions libgfapi.so` on the libgfapi.so built before and after applying this fix.
	Change-Id: I05fda580afacfff1bfc07be810dd1afc08a92fb8
	Fixes: #1352
	Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	doc: Updated release 8.0 notes (tag: v8.0)	Rinku Kothiya	2020-07-05	1	-1/+7
	Updates: #1180
	Change-Id: I6e5c85f2714896704949d9a99b06eefdde15d633
	Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
*	features/shard: Aggregate file size, block-count before unwinding removexattr	Krutika Dhananjay	2020-06-29	3	-70/+208
	Posix translator returns pre and postbufs in the dict in {F}REMOVEXATTR fops. These iatts are further cached at layers like md-cache. Shard translator, in its current state, simply returns these values without updating the aggregated file size and block-count.
	This patch fixes this problem.
	Change-Id: I4b2dd41ede472c5829af80a67401ec5a6376d872
	Fixes: #1243
	Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
	(cherry picked from commit 32519525108a2ac6bcc64ad931dc8048d33d64de)
*	packaging: refactor to align with common practices	Kaleb S. KEITHLEY	2020-06-29	1	-0/+11
	Apparently some additional Obsoletes: are required
	Change-Id: I919ae5a0fcc6f720e3eab4784af36977b9eef044
	Fixes: #1126
	Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	packaging: refactor to align with common practices	Kaleb S. KEITHLEY	2020-06-29	1	-82/+224
	The claim that Fedora package guidelines do not require this scheme is a non-argument. Not only do they not require it, they don't prohibit it either. (And you can't prove a negative. It's a specious argument.)
	Change-Id: I7748c7531d52dedd71b3a7f5df049742258a6aba
	Fixes: #1126
	Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	open-behind: rewrite of internal logic	Xavi Hernandez	2020-06-29	12	-823/+1393
	There was a critical flaw in the previous implementation of open-behind.
	When an open is done in the background, it's necessary to take a reference on the fd_t object because once we "fake" the open answer, the fd could be destroyed. However, as long as there's a reference, the release function won't be called. So, if the application closes the file descriptor without having actually opened it, there will always remain at least 1 reference, causing a leak.
	To avoid this problem, the previous implementation didn't take a reference on the fd_t, so there were races where the fd could be destroyed while it was still in use.
	To fix this, I've implemented a new xlator cbk that gets called from fuse when the application closes a file descriptor.
	The whole logic of handling background opens has been simplified and it's more efficient now. A stub is created only if the fop needs to be delayed until an open completes. Otherwise no memory allocations are needed.
	Correctly handling the close request while the open is still pending has added a bit of complexity, but overall normal operation is simpler.
	Change-Id: I6376a5491368e0e1c283cc452849032636261592
	Fixes: #1225
	Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	features/shard: Aggregate size, block-count in iatt before unwinding setxattr	Krutika Dhananjay	2020-06-29	2	-17/+222
	Posix translator returns pre and postbufs in the dict in {F}SETXATTR fops. These iatts are further cached at layers like md-cache. Shard translator, in its current state, simply returns these values without updating the aggregated file size and block-count.
	This patch fixes this problem.
	Change-Id: I4da0eceb4235b91546df79270bcc0af8cd64e9ea
	Fixes: #1243
	Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
	(cherry picked from commit 29ec66c6ab77e2d6893c6e213a3d1fb148702c99)
*	afr: more quorum checks in lookup and new entry marking	Ravishankar N	2020-06-29	4	-13/+25
	Problem: See github issue for details.
	Fix:
	- In lookup, if the entry exists in 2 out of 3 bricks, don't fail the lookup with ENOENT just because there is an entrylk on the parent. Consider quorum before deciding.
	- If the entry FOP does not succeed on quorum no. of bricks, do not perform new entry mark.
	Fixes: #1303
	Change-Id: I56df8c89ad53b29fa450c7930a7b7ccec9f4a6c5
	Signed-off-by: Ravishankar N <ravishankar@redhat.com>
	(cherry picked from commit c4a6748f25d2c1ab3ebcf89952278ebf94c8d371)
*	locks: prevent deletion of locked entries	Xavi Hernandez	2020-06-29	7	-113/+674
	To keep consistency inside transactions started by locking an entry or an inode, this change delays the removal of entries that are currently locked by one or more clients. Once all locks are released, the removal is processed.
	The detection of stale inodes in the locking code of EC has also been improved.
	Fixes: #990
	Change-Id: Ic8ba23d9480f80c7f74e7a310bf8a15922320fd5
	Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	cluster/afr: Prioritize ENOSPC over other errors	karthik-us	2020-06-16	4	-48/+86
	Problem: In a replicate/arbiter volume, if file creations or writes fail on quorum number of bricks, and on one brick the failure is due to ENOSPC while on another brick it fails for a different reason, the fop may fail with errors other than ENOSPC in some cases.
	Fix: Prioritize ENOSPC over other lesser priority errors, and do not set op_errno in posix_gfid_set if op_ret is 0, to avoid receiving any error_no which can be misinterpreted by __afr_dir_write_finalize().
	Also remove the function afr_has_arbiter_fop_cbk_quorum(), which might consider a successful reply from a single brick as quorum success in some cases, whereas we always need the fop to be successful on quorum number of bricks in arbiter configuration.
	Change-Id: I106e267f8b9451f681022f1cccb410d9bc824c08
	Fixes: #1254
	Signed-off-by: karthik-us <ksubrahm@redhat.com>
	(cherry picked from commit fa63b45ca5edf172b1b89b28b5db3c5129cc57b6)
*	md-cache: fix several NULL dereferences (tag: v8.0rc0)	Xavi Hernandez	2020-05-31	1	-66/+129
	This patch includes the following CID from Coverity Scan:
	* 1425196
	* 1425197
	* 1425198
	* 1425199
	* 1525200
	Change-Id: Iddcfea449d3dd56d4dfcc39f4c3c608518e611e4
	Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
	Updates: #1060
	(cherry picked from commit b53ba17dbfd2d18c10e2c308b8899d36726ab440)
*	doc: Added release 8.0 notes	Rinku Kothiya	2020-05-30	1	-0/+362
	Updates: #1180
	Change-Id: If3ff097e299261f2b647d5faa85d268b0a86908c
	Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
*	syncop: improve scaling and implement more tools	Xavi Hernandez	2020-05-30	13	-109/+332
	The current scaling of the syncop thread pool is not working properly and can leave some tasks in the run queue longer than necessary when the maximum number of threads is not reached.
	This patch provides a better scaling condition to react faster to pending work.
	Condition variables and sleep in the context of a synctask have also been implemented. Their purpose is to replace regular condition variables and sleeps that block synctask threads and prevent other tasks from being executed.
	The new features have been applied to several places in glusterd.
	Change-Id: Ic50b7c73c104f9e41f08101a357d30b95efccfbf
	Fixes: #1116
	Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	common-ha: ganesha-ha.sh bad test for {rhel,centos} for pcs options	Kaleb S. KEITHLEY	2020-05-29	1	-1/+1
	bash [[ ... =~ ... ]] built-in returns _0_ when the regex matches, not 1, thus the sense of the test is backwards and never correctly detects rhel or centos.
	master commit: eaf126f4b06a842977c1932ce699c4d76421a4b2
	Change-Id: Ic9e60aae4ea38aff8f13979080995e60621a68fe
	Fixes: #1269
	Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	tests: skip tests on absence of reflink in xfs	Pranith Kumar K	2020-05-29	3	-10/+12
	Fixes: #1223
	Change-Id: I36cb72d920ffd77405051546615c5262c392daef
	Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
	(cherry picked from commit b85f01abab658d1d704cd6caf84dd64eddafbff7)
*	fuse: occasional logging for fuse device 'weird' write errors	Csaba Henk	2020-05-28	2	-1/+53
	This change is a followup to I510158843e4b1d482bdc496c2e97b1860dc1ba93.
	In the referred change we pushed log messages about 'weird' write errors to the fuse device out of sight, by reporting them at Debug loglevel instead of Error (where 'weird' means errno is not POSIX compliant but has meaningful semantics for the FUSE protocol).
	This solved the issue of spurious error reporting. And so far so good: these messages don't indicate an error condition by themselves. However, when they come in high repetitions, that indicates a suboptimal condition which should be reported.[1]
	Therefore now we shall emit a Warning if a certain errno occurs a certain number of times[2] as the outcome of a write to the fuse device.
	___
	[1] typically ENOENTs and ENOTDIRs accumulate when glusterfs' inode invalidation lags behind the kernel's internal inode garbage collection (in this case the above errnos mean that the inode which we requested to be invalidated is not found in the kernel). This can be mitigated with the invalidate-limit command line / mount option, cf. bz#1732717.
	[2] 256, as of the current implementation.
	Change-Id: I8cc7fe104da43a88875f93b0db49d5677cc16045
	Updates: #1000
	Signed-off-by: Csaba Henk <csaba@redhat.com>
	(cherry picked from commit c1baf3c68b87584aea5389af958326f6ed01d7ec)
*	cluster/ec: Return correct error code and log message	Ashish Pandey	2020-05-28	1	-2/+9
	If readdir is sent with an FD on which opendir failed, this FD is useless and we return it with an error. For now, we return EINVAL without logging any message in the log file.
	Return a correct error code and also log the message to make debugging easier.
	fixes: #1220
	Change-Id: Iaf035254b9c5aa52fa43ace72d328be622b06169
	(cherry picked from commit af70cb5eedd80207cd184e69f2a4fb252b72d070)
*	afr: event gen changes	Ravishankar N	2020-04-24	4	-82/+29
	The general idea of the changes is to prevent resetting event generation to zero in the inode ctx, since event gen is something that should follow 'causal order'.
	Change #1: For a read txn, in inode refresh cbk, if event_generation is found zero, we are failing the read fop. This is not needed because a change in event gen is only a marker for the next inode refresh to happen and should not be taken into account by the current read txn.
	Change #2: The event gen being zero above can happen if there is a racing lookup, which resets event gen (in afr_lookup_done) if there are non-zero afr xattrs. The resetting is done only to trigger an inode refresh and a possible client side heal on the next lookup. That can be achieved by setting the need_refresh flag in the inode ctx. So replaced all occurrences of resetting event gen to zero with a call to afr_inode_need_refresh_set().
	Change #3: In both lookup and discover path, we are doing an inode refresh which is not required since all 3 essentially do the same thing: update the inode ctx with the good/bad copies from the brick replies. Inode refresh also triggers background heals, but I think it is okay to do it when we call refresh during the read and write txns and not in the lookup path.
	The .ts which relied on inode refresh in lookup path to trigger heals are now changed to do read txn so that inode refresh and the heal happens.
	Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec
	Fixes: #1179
	Signed-off-by: Ravishankar N <ravishankar@redhat.com>
	Reported-by: Erik Jacobson <erik.jacobson at hpe.com>
	(cherry picked from commit f0fcd909ad4535b60c9208d4804ebe6afe421a09)
*	Update rfc.sh for release-8 (tag: v8.0alpha)	Rinku Kothiya	2020-04-17	1	-1/+1
	updates: #1180
	Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
*	dht - Remove "tier" code (part 1) (tag: v9dev)	Barak Sason Rofman	2020-04-17	3	-476/+19
	This patch removes some of the "tier" code in the dht xlator, as it is no longer being used. Not all of the unneeded code is removed at once, so that reviewing is easier. Follow-up patches will remove additional unused code.
	This is based on the work done in https://review.gluster.org/#/c/glusterfs/+/23935/
	Change-Id: I3cb6a0c5d8f14afcd87cf021ef8f74b91c0f908a
	updates: #1097
	Signed-off-by: Barak Sason Rofman <bsaonro@redhat.com>
*	tests: Fix for spurious failure for some test cases	Mohit Agrawal	2020-04-16	5	-1/+6
	Problem: Sometimes a test case fails while creating files on the mount point after mounting the volume.
	Solution: After starting the volume, we need to wait until all brick instances are completely started, so add an online_brick_count check just after starting the volume.
	Change-Id: I5020e7e417539377277ca00189f9c51d2cf877a6
	Fixes: #1162
	Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	md-cache: avoid clearing cache when not necessary	Xavi Hernandez	2020-04-16	1	-72/+93
	mdc_inode_xatt_set() blindly cleared current cache when dict was not NULL, even if there was no xattr requested.
	This patch fixes this by only calling mdc_inode_xatt_set() when we have explicitly requested something to cache.
	Change-Id: Idc91a4693f1ff39f7059acde26682ccc361b947d
	Fixes: #1140
	Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	posix: fix GF_VALIDATE_OR_GOTO(this->name, this, out)	Sanju Rakonde	2020-04-16	1	-3/+1
	Remove GF_VALIDATE_OR_GOTO(this->name, this, out) when this is passed as an argument and is checked for NULL in the caller itself.
	GF_VALIDATE_OR_GOTO(this->name, this, out) is modified to use xlator name instead of this->name as we are still verifying whether this is NULL.
	updates: #1000
	Change-Id: Ide3180da29d0d4a35b2c5b9a7604fdf2ff4a9ffb
	Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	test: tests/bugs/rpc/bug-847624.t is crashed	Mohit Agrawal	2020-04-15	2	-7/+8
	Problem: glusterfs (GNFS) crashes while handling a Pollerr event in rpcsvc_drc_client_unref. GNFS crashes because ref was 0 at the time of unref, while the ref was taken when a Pollin event was successfully handled.
	Solution: Convert drc_client ref to an atomic ref to avoid the crash.
	Change-Id: Ia4c054f2f388032a5cd99597d0cfa18b003ca690
	Fixes: #1038
	Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	tests: do not truncate file offsets and sizes to 32-bit	Dmitry Antipov	2020-04-15	4	-9/+13
	Do not truncate file offsets and sizes to 32-bit to prevent tests from spurious failures on >2Gb files.
	Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
	Change-Id: I2a77ea5f9f415249b23035eecf07129f19194ac2
	Fixes: #1161
*	common-ha: cluster status shows "FAILOVER" when actually HEALTHY	Kaleb S. KEITHLEY	2020-04-14	1	-1/+1
	pacemaker devs changed the format of the output of `pcs status`
	Expected to find a line in the format:
	Online: ....
	but now it's
	* Online: ...
	And the `grep -E "^Online:"` no longer finds the list of nodes that are online.
	Change-Id: If2aa1e7b53c766c625d7b4cc222a83ea2c0bd72d
	Fixes: #1169
	Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
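	The effect of the format change on an anchored pattern can be seen with a few lines of Python. The sample status lines and the tolerant pattern below are assumptions made for illustration (the exact `pcs status` output and the one-line fix applied to ganesha-ha.sh are not shown in this log); only the `^Online:` pattern comes from the message above.

```python
import re

# Pattern equivalent to the original grep -E "^Online:".
OLD_PATTERN = re.compile(r"^Online:")
# Hypothetical tolerant pattern: optional indentation and bullet.
TOLERANT_PATTERN = re.compile(r"^\s*(\*\s*)?Online:")

# Made-up sample lines in the old and new styles.
OLD_STYLE = "Online: [ node1 node2 ]"
NEW_STYLE = "  * Online: [ node1 node2 ]"


def finds_online(pattern, line):
    """True if the pattern matches the given status line."""
    return pattern.search(line) is not None


if __name__ == "__main__":
    for line in (OLD_STYLE, NEW_STYLE):
        print(finds_online(OLD_PATTERN, line), finds_online(TOLERANT_PATTERN, line))
```

	With the bullet in front, the anchored pattern no longer matches, so the list of online nodes comes back empty and the cluster is misreported.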