glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	cluster/afr: fix race when bricks come up	Xavi Hernandez	2020-04-20	3	-6/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The was a problem when self-heal was sending lookups at the same time that one of the bricks was coming up. In this case there was a chance that the number of 'up' bricks changes in the middle of sending the requests to subvolumes which caused a discrepancy in the expected number of replies and the actual number of sent requests. This discrepancy caused that AFR continued executing requests before all requests were complete. Eventually, the frame of the pending request was destroyed when the operation terminated, causing a use- after-free issue when the answer was finally received. In theory the same thing could happen in the reverse way, i.e. AFR tries to wait for more replies than sent requests, causing a hang. Backport of: > Change-Id: I7ed6108554ca379d532efb1a29b2de8085410b70 > Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> > Fixes: bz#1808875 Change-Id: I7ed6108554ca379d532efb1a29b2de8085410b70 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Fixes: bz#1809439
*	open-behind: fix missing fd reference	Xavi Hernandez	2020-04-20	1	-11/+16
\| \| \| \| \| \| \| \| \| \| \|	Open behind was not keeping any reference on fd's pending to be opened. This makes it possible that a concurrent close and en entry fop (unlink, rename, ...) caused destruction of the fd while it was still being used. Change-Id: Ie9e992902cf2cd7be4af1f8b4e57af9bd6afd8e9 Fixes: #1028 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	gfapi: Suspend synctasks instead of blocking them	Soumya Koduri	2020-04-16	3	-2/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are certain conditions which blocks the current execution thread (like waiting on mutex lock or condition variable or I/O response). In such cases, if it is a synctask thread, we should suspend the task instead of blocking it (like done in SYNCOP using synctask_yield) This is to avoid deadlock like the one mentioned below - 1) synctaskA sets fs->migration_in_progress to 1 and does I/O (LOOKUP) 2) Other synctask threads wait for fs->migration_in_progress to be reset to 0 by synctaskA and hence blocked 3) but synctaskA cannot resume as all synctask threads are blocked on (2). Note: this same approach is already used by few other components like syncbarrier etc. Change-Id: If90f870d663bb242c702a5b86ac52eeda67c6f0d Fixes: #1146 Signed-off-by: Soumya Koduri <skoduri@redhat.com> (cherry picked from commit 55914f968d907ed747774da15285b42653afda61)
*	glusterd: Brick process fails to come up with brickmux on	Vishal Pandey	2020-03-04	2	-14/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue: 1- In a cluster of 3 Nodes N1, N2, N3. Create 3 volumes vol1, vol2, vol3 with 3 bricks (one from each node) 2- Set cluster.brick-multiplex on 3- Start all 3 volumes 4- Check if all bricks on a node are running on same port 5- Kill N1 6- Set performance.readdir-ahead for volumes vol1, vol2, vol3 7- Bring N1 up and check volume status 8- All bricks processes not running on N1. Root Cause - Since, There is a diff in volfile versions in N1 as compared to N2 and N3 therefore glusterd_import_friend_volume() is called. glusterd_import_friend_volume() copies the new_volinfo and deletes old_volinfo and then calls glusterd_start_bricks(). glusterd_start_bricks() looks for the volfiles and sends an rpc request to glusterfs_handle_attach(). Now, since the volinfo has been deleted by glusterd_delete_stale_volume() from priv->volumes list before glusterd_start_bricks() and glusterd_create_volfiles_and_notify_services() and glusterd_list_add_order is called after glusterd_start_bricks(), therefore the attach RPC req gets an empty volfile path and that causes the brick to crash. Fix- Call glusterd_list_add_order() and glusterd_create_volfiles_and_notify_services before glusterd_start_bricks() cal is made in glusterd_import_friend_volume > Change-Id: Idfe0e8710f7eb77ca3ddfa1cabeb45b2987f41aa > Bug: bz#1773856 > Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Change-Id: Idfe0e8710f7eb77ca3ddfa1cabeb45b2987f41aa Fixes: bz#1808966 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	Adding release notes for release-6.8v6.8	Hari Gowtham	2020-03-02	1	-0/+39
\| \| \| \| \| \|	Change-Id: I7cb7d0f863c4ef32ab8d4e0db43e2f8135a54e4f fixes: bz#1806846 Signed-off-by: Hari Gowtham <hgowtham@redhat.com>
*	events: fix IPv6 memory corruption	Xavi Hernandez	2020-02-28	1	-41/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When an event was generated and the target host was resolved to an IPv6 address, there was a memory overflow when that address was copied to a fixed IPv4 structure (IPv6 addresses are longer than IPv4 ones). This fix correctly handles IPv4 and IPv6 addresses returned by getaddrinfo() Backport of: > Change-Id: I5864a0c6e6f1b405bd85988529570140cf23b250 > Fixes: bz#1790870 > Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Change-Id: I5864a0c6e6f1b405bd85988529570140cf23b250 Fixes: bz#1792857 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	eventsapi: Set IPv4/IPv6 family based on input IP	Aravinda VK	2020-02-28	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	server.sin_family was set to AF_INET while creating socket connection, this was failing if the input address is IPv6(`::1`). With this patch, sin_family is set by reading the ai_family of `getaddrinfo` result. Backport of: > Fixes: bz#1752330 > Change-Id: I499f957b432842fa989c698f6e5b25b7016084eb > Signed-off-by: Aravinda VK <avishwan@redhat.com> Fixes: bz#1807786 Change-Id: I499f957b432842fa989c698f6e5b25b7016084eb Signed-off-by: Aravinda VK <avishwan@redhat.com>
*	core: replace inet_addr with inet_pton	Rinku Kothiya	2020-02-28	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes warning raised by RPMDiff on the use of inet_addr, which may impact Ipv6 support Backport of: > fixes: bz#1721385 > Change-Id: Id2d9afa1747efa64bc79d90dd2566bff54deedeb > Signed-off-by: Rinku Kothiya <rkothiya@redhat.com> Fixes: bz#1807793 Change-Id: Id2d9afa1747efa64bc79d90dd2566bff54deedeb Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
*	core: fix memory pool management races	Xavi Hernandez	2020-02-28	5	-105/+137
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Objects allocated from a per-thread memory pool keep a reference to it to be able to return the object to the pool when not used anymore. The object holding this reference can have a long life cycle that could survive a glfs_fini() call. This means that it's unsafe to destroy memory pools from glfs_fini(). Another side effect of destroying memory pools from glfs_fini() is that the TLS variable that points to one of those pools cannot be reset for all alive threads. This means that any attempt to allocate memory from those threads will access already free'd memory, which is very dangerous. To fix these issues, mem_pools_fini() doesn't destroy pool lists anymore. They should be destroyed when the library is unloaded or the process is terminated, but this cannot be done right now because gluster doesn't stop other threads before calling exit(), which could cause some races. This patch is the backport of 2 master patches: > Change-Id: Ib189a5510ab6bdac78983c6c65a022e9634b0965 > Fixes: bz#1801684 > Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> > > Change-Id: Id7cfb4407fcf208e28f03a7c3cdc3ef9c1f3bf9b > Fixes: bz#1801684 > Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Change-Id: Id7cfb4407fcf208e28f03a7c3cdc3ef9c1f3bf9b Fixes: bz#1805671 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	cluster/ec: Change handling of heal failure to avoid crash	Ashish Pandey	2020-02-28	2	-13/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: ec_getxattr_heal_cbk was called with NULL as second argument in case heal was failing. This function was dereferencing "cookie" argument which caused crash. Solution: Cookie is changed to carry the value that was supposed to be stored in fop->data, so even in the case when fop is NULL in error case, there won't be any NULL dereference. Thanks to Xavi for the suggestion about the fix. Change-Id: I0798000d5cadb17c3c2fbfa1baf77033ffc2bb8c fixes: bz#1806836
*	afr: prevent spurious entry heals leading to gfid split-brain	Ravishankar N	2020-02-28	7	-29/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In a hyperconverged setup with granular-entry-heal enabled, if a file is recreated while one of the bricks is down, and an index heal is triggered (with the brick still down), entry-self heal was doing a spurious heal with just the 2 good bricks. It was doing a post-op leading to removal of the filename from .glusterfs/indices/entry-changes as well as erroneous setting of afr xattrs on the parent. When the brick came up, the xattrs were cleared, resulting in the renamed file not getting healed and leading to gfid split-brain and EIO on the mount. Fix: Proceed with entry heal only when shd can connect to all bricks of the replica, just like in data and metadata heal. fixes: bz#1804594 Change-Id: I916ae26ad1fabf259bc6362da52d433b7223b17e Signed-off-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit 06453d77d056fbaa393a137ca277a20e38d2f67e)
*	lock: check null value of dict to avoid log flooding	Mohit Agrawal	2020-02-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	> updates: bz#1712322 > Change-Id: I120a1d23506f9ebcf88c7ea2f2eff4978a61cf4a > Signed-off-by: Susant Palai <spalai@redhat.com> > (cherry picked from commit 2bb1807879493cb77ec9b5088485d88f13b84828) updates: bz#1797985 Change-Id: I120a1d23506f9ebcf88c7ea2f2eff4978a61cf4a Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	cluster/thin-arbiter: Wait for TA connection before ta-file lookup	Ashish Pandey	2020-02-26	1	-19/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When we mount a ta volume, as soon as 2 data bricks are connected we consider that the mount is done and then send a lookup/create on ta file on ta node. However, this connection with ta node might not have been completed. Due to this delay, ta replica id file will not be created and we will see ENOTCONN error in log file if we do lookup. Solution: As we know that this ta node could have a higher latency, we should wait for reasonable time for connection to happen before sending lookup/create on replica id file. fixes: bz#1804546 Change-Id: I36f90865afe617e4e84cee57fec832a16f5dd6cc (cherry picked from commit a7fa54ddea3fe429f143b37e4de06a93b49d776a)
*	tools/glusterfind: Remove an extra argument	Shwetha K Acharya	2020-02-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Backport of: > Upstream Patch: https://review.gluster.org/#/c/glusterfs/+/24011/ >fixes: bz#1790748 >Change-Id: I1cb12c975142794139456d0f8e99fbdbb03c53a1 >Signed-off-by: Shwetha K Acharya <sacharya@redhat.com> >(cherry picked from commit d73872e764214f8071c8915536a75bdac1e5e685) fixes: bz#1790850 Change-Id: I1cb12c975142794139456d0f8e99fbdbb03c53a1 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
*	gf-event: Handle unix volfile-servers	Pranith Kumar K	2020-02-26	1	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: glfsheal program uses unix-socket-based volfile server. volfile server will be the path to socket in this case. gf_event expects this to be hostname in all cases. So getaddrinfo will fail on the unix-socket path, events won't be sent in this case. Fix: In case of unix sockets, default to localhost fixes: bz#1793096 Change-Id: I60d27608792c29d83fb82beb5fde5ef4754bece8 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	cluster/ec: skip updating ctx->loc again when ec_fix_open/opendir	Kinglong Mee	2020-02-26	2	-10/+14
\| \| \| \| \| \| \| \| \| \| \| \| \|	The ec_manager_open/opendir memsets ctx->loc which causes memory/inode leak, and ec_fheal uses ctx->loc out of fd->lock that loc_copy may copy bad data when memset it. This patch skips updating ctx->loc when it is initilizaed. With it, ctx->loc is filled once, and never updated. Change-Id: I3bf5ffce4caf4c1c667f7acaa14b451d37a3550a fixes: bz#1806838 Signed-off-by: Kinglong Mee <mijinlong@horiscale.com>
*	Cluster/afr: Don't treat all bricks having metadata pending as split-brain	karthik-us	2020-02-25	4	-67/+133
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: We currently don't have a roll-back/undoing of post-ops if quorum is not met. Though the FOP is still unwound with failure, the xattrs remain on the disk. Due to these partial post-ops and partial heals (healing only when 2 bricks are up), we can end up in metadata split-brain purely from the afr xattrs point of view i.e each brick is blamed by atleast one of the others for metadata. These scenarios are hit when there is frequent connect/disconnect of the client/shd to the bricks. Fix: Pick a source based on the xattr values. If 2 bricks blame one, the blamed one must be treated as sink. If there is no majority, all are sources. Once we pick a source, self-heal will then do the heal instead of erroring out due to split-brain. This patch also adds restriction of all the bricks to be up to perform metadata heal to avoid any metadata loss. Removed the test case tests/bugs/replicate/bug-1468279-source-not-blaming-sinks.t as it was doing metadata heal even when only 2 of 3 bricks were up. Change-Id: I07a9d62f84ceda329dcab1f02a33aeed258dcb09 fixes: bz#1805097 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	server: Mount fails after reboot 1/3 gluster nodes	Mohit Agrawal	2020-02-11	3	-16/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: At the time of coming up one server node(1x3) after reboot client is unmounted.The client is unmounted because a client is getting AUTH_FAILED event and client call fini for the graph.The client is getting AUTH_FAILED because brick is not attached with a graph at that moment Solution: To avoid the unmounting the client graph throw ENOENT error from server in case if brick is not attached with server at the time of authenticate clients. > Credits: Xavi Hernandez <xhernandez@redhat.com> > Change-Id: Ie6fbd73cbcf23a35d8db8841b3b6036e87682f5e > Fixes: bz#1793852 > Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> > (cherry picked from commit > f6421dff22a6ddaf14134f6894deae219948c89d) Change-Id: Ie6fbd73cbcf23a35d8db8841b3b6036e87682f5e Fixes: bz#1794020 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	To fix readdir-ahead memory leak	HuangShujun	2020-02-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Glusterfs client process has memory leak if create serveral files under one folder, and delete the folder. According to statedump, the ref counts of readdir-ahead is bigger than zero in the inode table. Readdir-ahead get parent inode by inode_parent in rda_mark_inode_dirty when each rda_writev_cbk,the inode ref count of parent folder will be increased in inode_parent, but readdir-ahead do not unref it later. The correction is unref the parent inode at the end of rda_mark_inode_dirty Backport of: > Change-Id: Iee68ab1089cbc2fbc4185b93720fb1f66ee89524 > Fixes: bz#1779055 > Signed-off-by: HuangShujun <549702281@qq.com> Change-Id: Iee68ab1089cbc2fbc4185b93720fb1f66ee89524 (cherry picked from commit 99044a5cedcff9a9eec40a07ecb32bd66271cd02) Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Fixes: bz#1789337
*	glusterfind: python3 compatibility	Sunny Kumar	2020-02-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: While we delete gluster volume the hook script 'S57glusterfind-delete-post.py' is failed to execute and error message can be observed in glusterd log. Traceback: File "/var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post", line 69, in <module> main() File "/var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post", line 39, in main glusterfind_dir = os.path.join(get_glusterd_workdir(), "glusterfind") File "/usr/lib64/python3.7/posixpath.py", line 94, in join genericpath._check_arg_types('join', a, *p) File "/usr/lib64/python3.7/genericpath.py", line 155, in _check_arg_types raise TypeError("Can't mix strings and bytes in path components") from None TypeError: Can't mix strings and bytes in path components Solution: Added the 'universal_newlines' flag to Popen to support backward compatibility. Backport of: > Change-Id: Ie5655b11b55535c5ad2338108d0448e6fdaacf4f > Fixes: bz#1789478 > Signed-off-by: Sunny Kumar <sunkumar@redhat.com> > (cherry picked from commit 33c3cbe71b67f523538b04334f1ef962953281ed) Change-Id: Ie5655b11b55535c5ad2338108d0448e6fdaacf4f Fixes: bz#1790449 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
*	tools/glusterfind: handle offline bricks	Milind Changire	2020-02-11	2	-25/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: glusterfind is unable to copy remote output file to local node when a remove-brick is in progress on the remote node. After copying remote files, in the --full output listing path, a "sort -u" command is run on the collected files. However, "sort" exits with an error code if it finds any file missing. Solution: Maintain a map of (pid, output file) when the node commands are started and remove the mapping for the pid for which the command returns an error. Use the list of files present in the map for the "sort" command. Backport of: > Change-Id: Ie6e019037379f4cb163f24b1c65eb382efc2fb3b > fixes: bz#1410439 > Signed-off-by: Milind Changire <mchangir@redhat.com> > Signed-off-by: Shwetha K Acharya <sacharya@redhat.com> > (cherry picked from commit 42c1605f42b89520d4d05806d7074e9e93b63640) Change-Id: Ie6e019037379f4cb163f24b1c65eb382efc2fb3b Fixes: bz#1790445 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
*	extras: enable log rotation for USS logs	Sunny Kumar	2020-02-11	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added logrotate support for user serviceable snapshot's logs. Backport of: >Change-Id: Ic920eaa8ab5e44daf5937a027c6913d7bb26d517 >Fixes: bz#1786722 >Signed-off-by: Sunny Kumar <sunkumar@redhat.com> Change-Id: Ic920eaa8ab5e44daf5937a027c6913d7bb26d517 Fixes: bz#1786754 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
*	cluster/dht: Correct fd processing loop	N Balachandran	2019-12-30	1	-22/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The fd processing loops in the dht_migration_complete_check_task and the dht_rebalance_inprogress_task functions were unsafe and could cause an open to be sent on an already freed fd. This has been fixed. > Change-Id: I0a3c7d2fba314089e03dfd704f9dceb134749540 > Fixes: bz#1757399 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > (cherry picked from commit 9b15867070b0cc241ab165886292ecffc3bc0aed) Change-Id: I0a3c7d2fba314089e03dfd704f9dceb134749540 Fixes: bz#1786983 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
*	doc: Added release 6.7 notesv6.7	hari gowtham	2019-12-26	2	-1/+33
\| \| \| \| \| \| \|	Fixes: bz#1780540 Change-Id: I02ef8c24bc19c321ab2db1d13224ea9e89325e3a Signed-off-by: hari gowtham <hgowtham@redhat.com>
*	rpc: Synchronize slot allocation code	Mohit Agrawal	2019-12-26	1	-34/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Current slot allocation/deallocation code path is not synchronized.There are scenario when due to race condition in slot allocation/deallocation code path brick is crashed. Solution: Synchronize slot allocation/deallocation code path to avoid the issue > Change-Id: I4fb659a75234218ffa0e5e0bf9308f669f75fc25 > Fixes: bz#1763036 > Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> > (cherry picked from commit faf5ac13c4ee00a05e9451bf8da3be2a9043bbf2) Change-Id: I4fb659a75234218ffa0e5e0bf9308f669f75fc25 Fixes: bz#1778182 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	geo-rep: Fix py2/py3 compatibility in repce	Kotresh HR	2019-12-24	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Geo-rep fails to start on python2 only machine like centos6. It fails with "ImportError no module named _io". This patch fixes the same. Backport of: > Patch: https://review.gluster.org//23702/ > BUG: 1771577 > Change-Id: I8228458a853a230546f9faf29a0e9e0f23b3efec > Signed-off-by: Kotresh HR <khiremat@redhat.com> (cherry picked from commit 9595ecca3de49fdf37d30b151f5c3e071e0a80d0) fixes: bz#1771842 Change-Id: I8228458a853a230546f9faf29a0e9e0f23b3efec Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	test: fix non-root test case for geo-rep	Sunny Kumar	2019-12-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: On a freshly installed system non-root geo-rep test case gets blocked. Solution: On a freshly installed system, the remote key need to be accepted automatically by ssh-copy-id. Credits: M. Scherer <mscherer@redhat.com> Backport of: > Change-Id: I5077f99a6681660f7e3e84c25ef216f521b7c29c > Fixes: bz#1779742 > Signed-off-by: Sunny Kumar <sunkumar@redhat.com> Change-Id: I5077f99a6681660f7e3e84c25ef216f521b7c29c Fixes: bz#1784796 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
*	socket: fix error handling	Xavi Hernandez	2019-12-13	1	-84/+91
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When __socket_proto_state_machine() detected a problem in the size of the request or it couldn't allocate an iobuf of the requested size, it returned -ENOMEM (-12). However the caller was expecting only -1 in case of error. For this reason the error passes undetected initially, adding back the socket to the epoll object. On further processing, however, the error is finally detected and the connection terminated. Meanwhile, another thread could receive a poll_in event from the same connection, which could cause races with the connection destruction. When this happened, the process crashed. To fix this, all error detection conditions have been hardened to be more strict on what is valid and what not. Also, we don't return -ENOMEM anymore. We always return -1 in case of error. An additional change has been done to prevent destruction of the transport object while it may still be needed. Backport of: > Change-Id: I6e59cd81cbf670f7adfdde942625d4e6c3fbc82d > Fixes: bz#1782495 > Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Change-Id: I6e59cd81cbf670f7adfdde942625d4e6c3fbc82d Fixes: bz#1749625 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	extras: Cgroup(CPU/Mem) restriction are not working on gluster process	Mohit Agrawal	2019-11-19	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: After Configure the Cgroup(CPU/MEM) limit to a gluster processes resource(CPU/MEM) limits are not applicable to the gluster processes.Cgroup limits are not applicable because all threads are not moved into a newly created cgroup to apply restriction. Solution: To move a gluster thread to newly created cgroup change the condition in script > Change-Id: I8ad81c69200e4ec43a74f6052481551cf835354c > Fixes: bz#1764208 > Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> > (cherry picked from commit f5811979935ce607391825ac6913a95f588818e3) > (Reviewed on upstream link https://review.gluster.org/#/c/glusterfs/+/23599/) Change-Id: I8ad81c69200e4ec43a74f6052481551cf835354c Fixes: bz#1766425 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	libgfchangelog : use find_library to locate shared library	Sunny Kumar	2019-11-08	3	-6/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue: libgfchangelog.so: cannot open shared object file Due to hardcoded shared library name runtime loader looks for particular version of a shared library. Solution: Using find_library to locate shared library at runtime solves this issue. Traceback (most recent call last): File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 323, in main func(args) File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 82, in subcmd_worker local.service_loop(remote) File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1261, in service_loop changelog_agent.init() File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 233, in __call__ return self.ins(self.meth, *a) File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 215, in __call__ raise res OSError: libgfchangelog.so: cannot open shared object file: No such file or directory Backport of: > Patch: https://review.gluster.org/22557 > Change-Id: I3dd013d701ed1cd99ba7ef20d1898f343e1db8f5 > BUG: 1699394 > Signed-off-by: Sunny Kumar <sunkumar@redhat.com> (cherry picked from commit f316c8b797283818bd800569771870a4b9bf1310) Change-Id: I3dd013d701ed1cd99ba7ef20d1898f343e1db8f5 fixes: bz#1770100 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
*	Detach iot_worker to release its resources	Liguang Li	2019-11-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	When iot_worker terminates, its resources have not been reaped, which will consumes lots of memory. Detach iot_worker to automically release its resources back to the system. Change-Id: I71fabb2940e76ad54dc56b4c41aeeead2644b8bb fixes: bz#1768726 Signed-off-by: Liguang Li <liguang.lee6@gmail.com> Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/ec: Update lock->good_mask on parent fop failure	Pranith Kumar K	2019-10-30	2	-0/+6
\| \| \| \| \| \| \| \| \| \|	When discard/truncate performs write fop, it should do so after updating lock->good_mask to make sure readv happens on the correct mask fixes: bz#1739449 Change-Id: Idfef0bbcca8860d53707094722e6ba3f81c583b7 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	cluster/ec: Fix reopen flags to avoid misbehavior	Pranith Kumar K	2019-10-30	2	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: when a file needs to be re-opened O_APPEND and O_EXCL flags are not filtered in EC. - O_APPEND should be filtered because EC doesn't send O_APPEND below EC for open to make sure writes happen on the individual fragments instead of at the end of the file. - O_EXCL should be filtered because shd could have created the file so even when file exists open should succeed - O_CREAT should be filtered because open happens with gfid as parameter. So open fop will create just the gfid which will lead to problems. Fix: Filter out these two flags in reopen. Change-Id: Ia280470fcb5188a09caa07bf665a2a94bce23bc4 Fixes: bz#1739450 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	cluster/ec: Always read from good-mask	Pranith Kumar K	2019-10-30	2	-5/+25
\| \| \| \| \| \| \| \| \| \|	There are cases where fop->mask may have fop->healing added and readv shouldn't be wound on fop->healing. To avoid this always wind readv to lock->good_mask updates: bz#1739449 Change-Id: I2226ef0229daf5ff315d51e868b980ee48060b87 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	cluster/ec: inherit healing from lock when it has info	Kinglong Mee	2019-10-30	1	-2/+3
\| \| \| \| \| \| \| \| \|	If lock has info, fop should inherit healing mask from it. Otherwise, fop cannot inherit right healing when changed_flags is zero. Change-Id: Ife80c9169d2c555024347a20300b0583f7e8a87f updates: bz#1739449 Signed-off-by: Kinglong Mee <mijinlong@horiscale.com>
*	cluster/ec: Prevent double pre-op xattrops	Pranith Kumar K	2019-10-30	2	-6/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Race: Thread-1 Thread-2 1) Does ec_get_size_version() to perform pre-op fxattrop as part of write-1 2) Calls ec_set_dirty_flag() in ec_get_size_version() for write-2. This sets dirty[] to 1 3) Completes executing ec_prepare_update_cbk leading to ctx->dirty[] = '1' 4) Takes LOCK(inode->lock) to check if there are any flags and sets dirty-flag because lock->waiting_flag is 0 now. This leads to fxattrop to increment on-disk dirty[] to '2' At the end of the writes the file will be marked for heal even when it doesn't need heal. Fix: Perform ec_set_dirty_flag() and other checks inside LOCK() to prevent dirty[] to be marked as '1' in step 2) above Updates bz#1739446 Change-Id: Icac2ab39c0b1e7e154387800fbededc561612865 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	doc: Added release 6.6 notesv6.6	hari gowtham	2019-10-25	1	-0/+56
\| \| \| \| \| \| \|	Fixes: bz#1762237 Change-Id: If339505ca0436947b814d9b0bc0b78f92d5d3317 Signed-off-by: hari gowtham <hgowtham@redhat.com>
*	test: fix suspicous non-root geo-rep test failures	Sunny Kumar	2019-10-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Export of env variable is required for ssh-copy-id command. Backport of: > fixes: bz#1765426 > Change-Id: Icaf7a848cb8f4ae9f887d885a8c5bb71f26633b4 > Signed-off-by: Sunny Kumar <sunkumar@redhat.com> fixes: bz#1765433 Change-Id: Icaf7a848cb8f4ae9f887d885a8c5bb71f26633b4 Signed-off-by: Sunny Kumar <sunkumar@redhat.com> (cherry picked from commit bfebfa9f2ec9dfc5dbf4a68c3518f98364ebc461)
*	geo-rep: Fix Permission denied traceback on non root setup	Kotresh HR	2019-10-24	3	-16/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: While syncing rename of directory in hybrid crawl, geo-rep crashes as below. Traceback (most recent call last): File "/usr/local/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in worker res = getattr(self.obj, rmeth)(in_data[2:]) File "/usr/local/libexec/glusterfs/python/syncdaemon/resource.py", line 588, in entry_ops src_entry = get_slv_dir_path(slv_host, slv_volume, gfid) File "/usr/local/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 687, in get_slv_dir_path [ENOENT], [ESTALE]) File "/usr/local/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 546, in errno_wrap return call(arg) PermissionError: [Errno 13] Permission denied: '/bricks/brick1/b1/.glusterfs/8e/c0/8ec0fcd4-d50f-4a6e-b473-a7943ab66640' Cause: Conversion of gfid to path for a directory uses readlink on backend .glusterfs gfid path. But this fails for non root user with permission denied. Fix: Use gfid2path interface to get the path from gfid Backport of: > Patch: https://review.gluster.org/23570 > Change-Id: I9d40c713a1b32cea95144cbc0f384ada82972222 > BUG: 1763439 > Signed-off-by: Kotresh HR <khiremat@redhat.com> Change-Id: I9d40c713a1b32cea95144cbc0f384ada82972222 fixes: bz#1764183 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	cluster/ec: fix EIO error for concurrent writes on sparse files	Xavi Hernandez	2019-10-24	2	-10/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	EC doesn't allow concurrent writes on overlapping areas, they are serialized. However non-overlapping writes are serviced in parallel. When a write is not aligned, EC first needs to read the entire chunk from disk, apply the modified fragment and write it again. The problem appears on sparse files because a write to an offset implicitly creates data on offsets below it (so, in some way, they are overlapping). For example, if a file is empty and we read 10 bytes from offset 10, read() will return 0 bytes. Now, if we write one byte at offset 1M and retry the same read, the system call will return 10 bytes (all containing 0's). So if we have two writes, the first one at offset 10 and the second one at offset 1M, EC will send both in parallel because they do not overlap. However, the first one will try to read missing data from the first chunk (i.e. offsets 0 to 9) to recombine the entire chunk and do the final write. This read will happen in parallel with the write to 1M. What could happen is that half of the bricks process the write before the read, and the half do the read before the write. Some bricks will return 10 bytes of data while the otherw will return 0 bytes (because the file on the brick has not been expanded yet). When EC tries to recombine the answers from the bricks, it can't, because it needs more than half consistent answers to recover the data. So this read fails with EIO error. This error is propagated to the parent write, which is aborted and EIO is returned to the application. The issue happened because EC assumed that a write to a given offset implies that offsets below it exist. This fix prevents the read of the chunk from bricks if the current size of the file is smaller than the read chunk offset. This size is correctly tracked, so this fixes the issue. Also modifying ec-stripe.t file for Test #13 within it. In this patch, if a file size is less than the offset we are writing, we fill zeros in head and tail and do not consider it strip cache miss. That actually make sense as we know what data that part holds and there is no need of reading it from bricks. Backport of: > Patch:https://review.gluster.org/#/c/glusterfs/+/23066/ > Change-Id: Ic342e8c35c555b8534109e9314c9a0710b6225d6 > BUG: 1730715 > Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> (cherry picked from commit b01a43586c5abc23a874e5528a063c508f952cbd) Change-Id: Ic342e8c35c555b8534109e9314c9a0710b6225d6 Fixes: bz#1739451 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	features/shard: Send correct size when reads are sent beyond file size	Krutika Dhananjay	2019-10-24	2	-0/+31
\| \| \| \| \| \| \|	Change-Id: I0cebaaf55c09eb1fb77a274268ff564e871b743b fixes bz#1737141 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> (cherry picked from commit 51237eda7c4b3846d08c5d24d1e3fe9b7ffba1d4)
*	geo-rep: Fix config upgrade on non-participating node	Kotresh HR	2019-10-22	3	-1/+181
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After upgrade, if the config files are of old format, it gets migrated to new format. Monitor process migrates it. Since monitor doesn't run on nodes where bricks are not hosted, it doesn't get migrated there. So this patch fixes the config upgrade on nodes which doesn't host bricks. This happens during config either on get/set/reset. Backport of: > Patch: https://review.gluster.org/23555 > Change-Id: Ibade2f2310b0f3affea21a3baa1ae0eb71162cba > Signed-off-by: Kotresh HR <khiremat@redhat.com> > BUG: 1762220 Change-Id: Ibade2f2310b0f3affea21a3baa1ae0eb71162cba fixes: bz#1763028 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	tests : test case for non-root geo-rep setup	Sunny Kumar	2019-10-22	1	-0/+251
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added test case for non-root geo-rep setup. Backport of: > Patch: https://review.gluster.org/22902 > Change-Id: Ib6ebee79949a9f61bdc5c7b5e11b51b262750e98 > BUG: 1717827 > Signed-off-by: Sunny Kumar <sunkumar@redhat.com> > Signed-off-by: Kotresh HR <khiremat@redhat.com> Change-Id: Ib6ebee79949a9f61bdc5c7b5e11b51b262750e98 fixes: bz#1764178 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	geo-rep: Fix the name of changelog archive file	Kotresh HR	2019-10-22	2	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Background: The processed changelogs are archived each month in a single tar file. The default format is "archive_YYYYMM.tar" which is specified as "%%Y%%m" in configuration file. Problem: The created changelog archive file didn't have corresponding year and month. It created as "archive_%Y%m.tar" on python2 only systems. Cause and Fix: Geo-rep expects "%Y%m" after the ConfigParser reads it from config file. Since it was "%%Y%%m" in config file, geo-rep used to get correct value "%Y%m" in python3 and "%%Y%%m" in python2 which is incorrect. The fix can be to use "%Y%m" in config file but that fails in python3. So the fix is to use "RawConfigParser" in geo-rep and use "%Y%m". This works both in python2 and python3. Backport of: > Patch: https://review.gluster.org/23248 > Change-Id: Ie5b7d2bc04d0d53cd1769e064c2d67aaf95d557c > BUG: 1741890 > Signed-off-by: Kotresh HR <khiremat@redhat.com> Change-Id: Ie5b7d2bc04d0d53cd1769e064c2d67aaf95d557c fixes: bz#1764176 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	geo-rep: Fix Config Get Race	Aravinda VK	2019-10-22	1	-6/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When two threads(sync jobs) in Geo-rep worker calls `gconf.get` and `gconf.getr`(realtime) at the sametime, `getr` resets the conf object and other one gets None. Thread Lock is introduced to fix the issue. ``` File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 368, in twrap tf(*aargs) File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1987, in syncjob po = self.sync_engine(pb, self.log_err) File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1444, in rsync rconf.ssh_ctl_args + \ AttributeError: 'NoneType' object has no attribute 'split' ``` Backport of: > Patch: https://review.gluster.org/23158 > Change-Id: I9c245e5c36338265354e158f5baa32b119eb2da5 > BUG: 1737484 > Signed-off-by: Aravinda VK <avishwan@redhat.com> Change-Id: I9c245e5c36338265354e158f5baa32b119eb2da5 fixes: bz#1764174 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	geo-rep: Test case for upgrading config file	Shwetha K Acharya	2019-10-22	2	-6/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added test case for the patch https://review.gluster.org/#/c/glusterfs/+/22894/4 Also updated if else structure in gsyncdconfig.py to avoid repeated occurance of values in new configfile. Backport of: > Patch: https://review.gluster.org/22982 > BUG: 1707731 > Change-Id: If97e1d37ac52dbd17d47be6cb659fc5a3ccab6d7 > Signed-off-by: Shwetha K Acharya <sacharya@redhat.com> fixes: bz#1764171 Change-Id: If97e1d37ac52dbd17d47be6cb659fc5a3ccab6d7 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	geo-rep : fix gluster command path for non-root session	Sunny Kumar	2019-10-22	2	-5/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: gluster command not found. Cause: In Volinfo class we issue command 'gluster vol info' to get information about volume like getting brick_root to perform various operation. When geo-rep session is configured for non-root user Volinfo class fails to issue gluster command due to unavailability of gluster binary path for non-root user. Solution: Use config value 'slave-gluster-command-dir'/'gluster-command-dir' to get path for gluster command based on caller. Backport of: > Patch: https://review.gluster.org/22920 > BUG: 1722740 > Change-Id: I4ec46373da01f5d00ecd160c4e8c6239da8b3859 > Signed-off-by: Sunny Kumar <sunkumar@redhat.com> fixes: bz#1764172 Change-Id: I4ec46373da01f5d00ecd160c4e8c6239da8b3859 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	geo-rep: Upgrading config file to new version	Shwetha K Acharya	2019-10-22	2	-0/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- configuration handling is enhanced with patch https://review.gluster.org/#/c/glusterfs/+/18257/ - hence, the old configurations are not applied when Geo-rep session is created in the old version and upgraded. This patch solves the issue. It, - checks if the config file is old. - parses required values from old config file and stores in new config file, which ensures that configerations are applied on upgrade. - stores old config file as backup. - handles changes in options introduced in https://review.gluster.org/#/c/glusterfs/+/18257/ Backport of: > Patch: https://review.gluster.org/22894 > BUG: bz#1707731 > Change-Id: Iad8da6c1e1ae8ecf7c84dfdf8ea3ac6966d8a2a0 > Signed-off-by: Shwetha K Acharya <sacharya@redhat.com> updates: bz#1764171 Change-Id: Iad8da6c1e1ae8ecf7c84dfdf8ea3ac6966d8a2a0 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	dht: Rebalance causing IO Error - File descriptor in bad state	Mohit Agrawal	2019-10-17	5	-17/+116
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem : When a file is migrated, dht attempts to re-open all open fds on the new cached subvol. Earlier, if dht had not opened the fd, the client xlator would be unable to find the remote fd and would fall back to using an anon fd for the fop. That behavior changed with https://review.gluster.org/#/c/glusterfs/+/15804, causing fops to fail with EBADFD if the fd was not available on the cached subvol. The client xlator returns EBADFD if the remote fd is not found but dht only checks for EBADF before re-opening fds on the new cached subvol. Solution: Handle EBADFD at dht code path to avoid the issue >Change-Id: I43c51995cdd48d05b12e4b2889c8dbe2bb2a72d8 >Fixes: bz#1758579 >(cherry picked from commit 9314a9fbf487614c736cf6c4c1b93078d37bb9df) >(Reviewed on upstream https://review.gluster.org/#/c/glusterfs/+/23518/) Change-Id: I43c51995cdd48d05b12e4b2889c8dbe2bb2a72d8 Fixes: bz#1761907
*	cluster/afr: Heal entries when there is a source & no healed_sinks	karthik-us	2019-10-17	2	-0/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In a situation where B1 blames B2, B2 blames B1 and B3 doesn't blame anything for entry heal, heal will not complete even though we have clear source and sinks. This will happen because while doing afr_selfheal_find_direction() only the bricks which are blamed by non-accused bricks are considered as sinks. Later in __afr_selfheal_entry_finalize_source() when it tries to mark all the non-sources as sinks it fails to do so because there won't be any healed_sinks marked, no witness present and there will be a source. Fix: If there is a source and no healed_sinks, then reset all the locked sources to 0 and healed sinks to 1 to do conservative merge. Change-Id: If40d8bc95d52a52b2730f55bdcf135109b421548 Fixes: bz#1760706 Signed-off-by: karthik-us <ksubrahm@redhat.com>