glusterfs.git -

	Commit message	Author	Age	Files	Lines
*	performance/md-cache: Fix a crash when statfs caching is enabled	Vijay Bellur	2019-01-11	1	-0/+24
	mem_put() in STACK_UNWIND_STRICT causes a crash if frame->local is not null as md-cache obtains local from CALLOC. Changed two occurrences of STACK_UNWIND_STRICT to MDC_STACK_UNWIND as the latter macro does not rely on STACK_UNWIND_STRICT for cleaning up frame->local. fixes: bz#1632503 Change-Id: I1b3edcb9372a164ef73119e99a49e747765d7166 Signed-off-by: Vijay Bellur <vbellur@redhat.com>
*	tests: increase the timeout for distribute bug 1117851.t	Amar Tumballi	2019-01-11	1	-0/+2
	The test runtime is borderline at 200 seconds; it often randomly takes a little longer and fails the whole regression run. Better to keep the timeout high, so that regression tests don't fail 'randomly'. updates: bz#1193929 Change-Id: Ib0d3a9f7a75ee44446ec6da5e0510cccf83eecaa Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	cluster/afr: Disable client side heals in AFR by default.	Sunil Kumar Acharya	2019-01-10	10	-1/+30
	With this changeset, the default value for the AFR client side heal volume option is set to "off". fixes: bz#1663102 Change-Id: Ie4016932339c4896487e3e7cb5caca68739b7ba2 Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
*	gfapi: update returned/callback pre/post attributes to glfs_stat	ShyamsundarR	2019-01-07	1	-3/+6
	Change-Id: Ie0fe971e694101aa011d66aa496d0644669c2c5a Updates: #389 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com> Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	gfapi: new api glfs_statx as linux's statx	ShyamsundarR	2019-01-07	2	-0/+214
	Change-Id: I44dd6ceef0954ae7fc13f920e84d81bbd3f6a774 Updates: #389 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com> Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	cluster/ta: Check number/type of locks held on ta file	Ashish Pandey	2018-12-27	1	-0/+68
	Change-Id: Iec47856ce2819e7d7d38a60279602e53ba45858d updates: bz#1624332 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
*	tests: Brick is getting OOM in ./tests/bugs/core/bug-1432542-mpx-restart-crash.t	Mohit Agrawal	2018-12-21	1	-2/+3
	The test case "tests/bugs/core/bug-1432542-mpx-restart-crash.t" creates 20 2x3 volumes after enabling brick_mux. While the last volume is being created, the brick runs out of memory (OOM) because brick memory consumption has increased as a result of these patches: https://review.gluster.org/#/c/glusterfs/+/19997/, https://review.gluster.org/#/c/glusterfs/+/20362/ To avoid the OOM, reduce NUM_VOLS to 15 so that brick memory consumption is lower. Change-Id: Ib98b47a3db6b990ff22c7e57396d51e7fef5c7e8 fixes: bz#1661214 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	tests: Fix zero-flag.t script	Krutika Dhananjay	2018-12-19	1	-1/+1
	The default value of shard-block-size was changed from 4MB to 64MB some time back. The script "fallocate"s a 6MB file and expects it to have 1 shard under .shard. This worked when the shard-block-size was 4MB. With the default value now at 64MB, file "file1" won't have any shards under .shard and the stat on the 1st shard's path fails with ENOENT. Changed the script to explicitly set shard-block-size to 4MB. Change-Id: I7f1785922287d16d74c95fa57cbbe12e6e66e4f7 fixes: bz#1656264 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
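The arithmetic behind this fix can be sketched as follows. This is an illustrative calculation, not code from the test script: the helper name `shards_under_dot_shard` is hypothetical, and it assumes that the first block of a file stays in the base file while only the later blocks are stored under .shard, which matches the counts quoted in the commit message.

```python
MB = 1024 * 1024


def shards_under_dot_shard(file_size, shard_block_size):
    """Return how many shard files are expected under .shard.

    Assumption for illustration: block 0 lives in the base file,
    so only blocks 1..n-1 show up under .shard.
    """
    if file_size <= 0:
        return 0
    total_blocks = -(-file_size // shard_block_size)  # ceiling division
    return total_blocks - 1


# A 6MB file with the old 4MB default spans two blocks, one under .shard.
print(shards_under_dot_shard(6 * MB, 4 * MB))   # 1
# With the 64MB default the whole file fits in the base file.
print(shards_under_dot_shard(6 * MB, 64 * MB))  # 0
```

With a 64MB block size the count is zero, which is why the stat on the first shard's path returned ENOENT until the script pinned shard-block-size to 4MB.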
*	cluster/afr: Allow lookup on root if it is from ADD_REPLICA_MOUNT	karthik-us	2018-12-18	1	-0/+95
	Problem: Converting a plain distribute volume to replica-3 or arbiter type fails with an ENOTCONN error, because the lookup on the root fails when there is no quorum. Fix: Allow lookup on root if it comes from the ADD_REPLICA_MOUNT, which is used while adding bricks to a volume. It will try to set the pending xattrs for the newly added bricks to allow the heal to happen in the right direction and avoid data loss scenarios. Note: This fix solves the type conversion problem only in the case where the volume was mounted at least once. The conversion of non mounted volumes will still fail, since the dht selfheal that tries to set the directory layout fails because it runs with the PID GF_CLIENT_PID_NO_ROOT_SQUASH set in frame->root. Change-Id: Ic511939981dad118cc946754341318b164954b3b fixes: bz#1655854 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	iobuf: Get rid of pre allocated iobuf_pool and use per thread mem pool	Poornima G	2018-12-18	1	-0/+2
	The current implementation of iobuf_pool has two problems:
	- prealloc of 12.5MB memory, which limits the scale factor of the gluster processes due to RAM requirements
	- lock contention, as the current implementation has one global iobuf_pool lock. Credits for debugging and addressing the same go to Krutika Dhananjay <kdhananj@redhat.com>. Issue: #410
	Hence changing the iobuf implementation to use per thread mem pool. This may theoretically appear to cause a perf dip as there is no preallocation. But per thread mem pool will not have significant perf impact as the last allocated memory is kept alive for subsequent allocs, for some time. The worst case would be if the iobufs requested are of random sizes each time. The best case is if we get iobuf requests of the same size. From the perf tests, this patch did not seem to cause any perf decrease.
	Note that, with this patch, the rdma performance is going to degrade drastically. In one of the previous patchsets we had fixes to not degrade rdma perf, but rdma is not supported and also not tested [1]. Hence the decision was to not have code in rdma that is not tested and not supported.
	[1] https://lists.gluster.org/pipermail/gluster-users.old/2018-July/034400.html
	Updates: #325 Change-Id: Ic2ef3bd498f9250dea25f25ba0c01fde19584b27 Signed-off-by: Poornima G <pgurusid@redhat.com>
*	fuse: SETLKW interrupt	Csaba Henk	2018-12-14	1	-0/+33
	Use the (f)getxattr based clearlocks interface to interrupt a pending lock request. updates: #465 Change-Id: I4e91a4d8791fc688fed400a02de4c53487e61be2 Signed-off-by: Csaba Henk <csaba@redhat.com>
*	fuse: add --lru-limit option	Amar Tumballi	2018-12-14	1	-0/+42
	The inode LRU mechanism is moot in fuse xlator (i.e. there is no limit for the LRU list), as fuse inodes are referenced from kernel context, and thus they can only be dropped on request of the kernel. This might result in a high number of passive inodes which are useless for the glusterfs client, causing a significant memory overhead. This change tries to remedy this by extending the LRU semantics and allowing a finite limit to be set on the fuse inode LRU.
	A brief history of the problem: When gluster's inode table was designed, fuse didn't have any 'invalidate' method, which means a userspace application could never ask the kernel to send a 'forget()' fop; instead it had to wait for the kernel to send it based on the kernel's parameters. The inode table remembers the number of times the kernel has cached the inode based on the 'nlookup' parameter. And the 'nlookup' field is not used by any other entry points (like server-protocol, gfapi etc). Hence the inode_table of the fuse module always had to have lru-limit as '0', which means no limit. GlusterFS always had to keep all inodes in memory as the kernel would have had a reference to them. Again, the reason for this is that the kernel's glusterfs inode reference was a pointer to the 'inode_t' structure in glusterfs. As it is a pointer, we could never free it (to prevent segfault, or memory corruption).
	Solution: In the inode table, handle the prune case of inodes with 'nlookup' differently, and call an 'invalidator' method, which in this case is fuse_invalidate(), and it sends the request to the kernel for getting the forget request. When the kernel sends the forget, it means it has dropped all references to the inode, and it will send the forget with the 'nlookup' parameter too. We just need to make sure to reduce the 'nlookup' value we have when we get the forget. That automatically causes the relevant prune to happen.
	Credits: Csaba Henk, Xavier Hernandez, Raghavendra Gowdappa, Nithya B fixes: bz#1560969 Change-Id: Ifee0737b23b12b1426c224ec5b8f591f487d83a2 Signed-off-by: Amar Tumballi <amarts@redhat.com>
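The prune, invalidate and forget cycle described in this commit message can be modelled roughly as below. This is a toy sketch under stated assumptions, not the glusterfs inode table: the class `InodeTable`, its methods and the `invalidator` callback are hypothetical names, and the kernel is simulated by calling `forget()` by hand.

```python
from collections import OrderedDict


class InodeTable:
    """Toy LRU of kernel-referenced inodes (hypothetical, for illustration)."""

    def __init__(self, lru_limit, invalidator):
        self.lru_limit = lru_limit        # 0 means "no limit"
        self.invalidator = invalidator    # asks the kernel to forget an inode
        self.lru = OrderedDict()          # inode id -> nlookup, oldest first
        self.invalidate_sent = set()      # inodes already reported to the kernel

    def lookup(self, ino):
        # Each kernel lookup bumps nlookup and makes the inode most recent.
        self.lru[ino] = self.lru.pop(ino, 0) + 1
        self.prune()

    def prune(self):
        if self.lru_limit == 0:
            return
        excess = len(self.lru) - self.lru_limit
        for ino in list(self.lru):
            if excess <= 0:
                break
            # The kernel still holds a reference (nlookup > 0), so the inode
            # cannot be freed here; request an invalidation instead.
            if ino not in self.invalidate_sent:
                self.invalidate_sent.add(ino)
                self.invalidator(ino)
            excess -= 1

    def forget(self, ino, nlookup):
        # The kernel dropped its references: reduce nlookup and free at zero.
        if ino not in self.lru:
            return
        self.lru[ino] = max(0, self.lru[ino] - nlookup)
        if self.lru[ino] == 0:
            del self.lru[ino]
            self.invalidate_sent.discard(ino)


requests = []
table = InodeTable(lru_limit=2, invalidator=requests.append)
for ino in (1, 2, 3):
    table.lookup(ino)
print(requests)           # [1]
table.forget(1, 1)        # simulated kernel response to the invalidation
print(sorted(table.lru))  # [2, 3]
```

The point of the design is that the table never frees an inode the kernel may still dereference; it only asks, and the actual release happens when the forget arrives.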
*	[geo-rep]: Worker still ACTIVE after killing bricks	Mohit Agrawal	2018-12-13	2	-0/+110
	Problem: In the changelog xlator, after destroying the listener it calls unlink to delete the changelog socket file, but the socket file reference is not cleaned up from process memory.
	Solution:
	1) To clean up the reference completely from process memory, serialize transport cleanup for changelog and then unlink the socket file
	2) Brick xlator will notify GF_EVENT_PARENT_DOWN to the next xlator only after cleaning up all xprts
	Test: To test the same, run the steps below
	1) Set up some volume and enable brick mux
	2) Kill any one brick with gf_attach
	3) Check lsof for the changelog socket specific to the killed brick; it should be cleaned up completely
	fixes: bz#1600145 Change-Id: Iba06cbf77d8a87b34a60fce50f6d8c0d427fa491 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	copy_file_range support in GlusterFS	Raghavendra Bhat	2018-12-12	2	-0/+257
	* libglusterfs changes to add new fop
	* Fuse changes:
	  - Changes in fuse bridge xlator to receive and send responses
	* posix changes to perform the op on the backend filesystem
	* protocol and rpc changes for sending and receiving the fop
	* gfapi changes for performing the fop
	* tools: glfs-copy-file-range tool for testing copy_file_range fop
	  - Although copy_file_range support has been added to the upstream fuse kernel module, no kernel release containing the support has been made yet. It is expected to come in the upcoming release of linux-4.20. So, as of now, executing copy_file_range fop on a fuse based filesystem results in the fuse kernel module sending read on the source fd and write on the destination fd. Therefore a small gfapi based tool has been written to be able to test the copy_file_range fop. This tool is similar (in functionality) to the example program given in the copy_file_range man page. So, running regular copy_file_range on a fuse mount point and running the gfapi based glfs-copy-file-range tool gives some idea about how fast the copy_file_range (or reflink) can be. On the local machine this was the result obtained.

	    mount -t glusterfs workstation:new /mnt/glusterfs
	    [root@workstation ~]# cd /mnt/glusterfs/
	    [root@workstation glusterfs]# ls
	    file
	    [root@workstation glusterfs]# cd
	    [root@workstation ~]# time /tmp/a.out /mnt/glusterfs/file /mnt/glusterfs/new
	    real 0m6.495s
	    user 0m0.000s
	    sys 0m1.439s
	    [root@workstation ~]# time glfs-copy-file-range $(hostname) new /tmp/glfs.log /file /rrr
	    OPEN_SRC: opening /file is success
	    OPEN_DST: opening /rrr is success
	    FSTAT_SRC: fstat on /rrr is success
	    copy_file_range successful
	    real 0m0.309s
	    user 0m0.039s
	    sys 0m0.017s

	    This tool needs the following arguments:
	    1) hostname
	    2) volume name
	    3) log file path
	    4) source file path (relative to the gluster volume root)
	    5) destination file path (relative to the gluster volume root)
	    "glfs-copy-file-range <hostname> <volume> <log file path> <source> <destination>"
	  - Added a testcase as well to run glfs-copy-file-range tool
	* io-stats changes to capture the fop for profiling
	* NOTE:
	  - Added conditional check to see whether the copy_file_range syscall is available or not. If not, then return ENOSYS.
	  - Added conditional check for kernel minor version in fuse_kernel.h and fuse-bridge while referring to copy_file_range. And the kernel minor version is kept as it is, i.e. 24. Increment it in future when there is a kernel release which contains the support for copy_file_range fop in fuse kernel module.
	* The document which contains a writeup on this enhancement can be found at https://docs.google.com/document/d/1BSILbXr_knynNwxSyyu503JoTz5QFM_4suNIh2WwrSc/edit
	Change-Id: I280069c814dd21ce6ec3be00a884fc24ab692367 updates: #536 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
*	cluster/afr: Do not update read_subvol in inode_ctx after rename/link fop	karthik-us	2018-12-12	1	-0/+40
	Since rename/link fops on a file will not change any data in it, it should not update the read_subvol values in the inode_ctx, which interprets the data & metadata readable subvols for that file. The old read_subvol values should be retained even after the rename/link operations. Change-Id: I068044a426823a566f5bea8aa063cd689199d6dd fixes: bz#1657783 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	geo-rep: Make slave volume read-only (by default)	Harpreet Kaur	2018-12-07	5	-0/+19
	Added a command to set the "features.read-only" option to a default value of "on" for the slave volume. Changes are made in: $SRC/extras/hook-scripts/S56glusterd-geo-rep-create-post.sh for root geo-rep and $SRC/geo-replication/src/set_geo_rep_pem_keys.sh for non-root geo-rep. Fixes: bz#1654187 Change-Id: I15beeae3506f3f6b1dcba0a5c50b6344fd468c7c Signed-off-by: Harpreet Kaur <hlalwani@redhat.com>
*	libglusterfs: Move devel headers under glusterfs directory	ShyamsundarR	2018-12-05	2	-4/+4
	libglusterfs devel package headers are referenced in code using include semantics for a program. While this works, it can be better, especially when dealing with out of tree xlator builds or, in general, out of tree devel package usage. Towards this, the following changes are done:
	- moved all devel headers under a glusterfs directory
	- Included these headers using system header notation <> in all code outside of libglusterfs
	- Included these headers using own program notation "" within libglusterfs
	This change, although big, is just moving around the headers and making it correct when including these headers from other sources. This helps us correctly include libglusterfs includes without namespace conflicts. Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b Updates: bz#1193929 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	protocol/server: support server.all-squash	Xie Changlong	2018-12-05	1	-0/+72
	We still use gnfs on our side, so do a little work to support server.all-squash. Just like server.root-squash, it's also a volume wide option. Also see bz#1285126

	$ gluster volume set <VOLNAME> server.all-squash on

	Note: If you enable server.root-squash and server.all-squash at the same time, only server.all-squash works. Please refer to the following table

	+---------------+-----------------+---------------------------+
	|               |all_squash       | no_all_squash             |
	+-------------------------------------------------------------+
	|               |                 |anonuid/anongid for root   |
	|root_squash    |anonuid/anongid  |useruid/usergid for no-root|
	+-------------------------------------------------------------+
	|no_root_squash |anonuid/anongid  |useruid/usergid            |
	+-------------------------------------------------------------+

	Updates bz#1285126 Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com> Signed-off-by: Xue Chuanyu <xuechuanyu@cmss.chinamobile.com> Change-Id: Iea043318fe6e9a75fa92b396737985062a26b47e
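The squash table above can be read as a small decision function. The sketch below is only a restatement of that table, not the protocol/server code: `effective_ids` is a hypothetical name, "root" is assumed to mean uid 0, and the default anonymous ids of 65534 are an assumption chosen for the example.

```python
def effective_ids(uid, gid, root_squash, all_squash,
                  anonuid=65534, anongid=65534):
    """Return the (uid, gid) a request is served with, per the table.

    Assumptions for illustration: root means uid == 0, and the
    anonymous ids default to 65534.
    """
    if all_squash:
        # all_squash wins over root_squash: every caller becomes anonymous.
        return anonuid, anongid
    if root_squash and uid == 0:
        # Only root is mapped to the anonymous ids.
        return anonuid, anongid
    return uid, gid


print(effective_ids(0, 0, root_squash=True, all_squash=False))       # (65534, 65534)
print(effective_ids(1000, 1000, root_squash=True, all_squash=False)) # (1000, 1000)
print(effective_ids(1000, 1000, root_squash=True, all_squash=True))  # (65534, 65534)
print(effective_ids(0, 0, root_squash=False, all_squash=False))      # (0, 0)
```

Checking all_squash first mirrors the note in the commit message that only server.all-squash takes effect when both options are enabled.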
*	tests/geo-rep: Mask failure of geo-rep arbiter test	Kotresh HR	2018-12-05	3	-21/+21
	Comment out the particular test which is failing arbitrarily. Also changed the code to differentiate error cases. There could be some race because of which it's failing arbitrarily. This will be debugged and fixed in a separate patch. Change-Id: I925df6421737d7a9abd9446a9d85029b4285ad2c updates: bz#1193929 Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	tests: Mark tests/bugs/shard/zero-flag.t bad	Atin Mukherjee	2018-12-05	1	-0/+1
	Change-Id: I2f4ca470c6666584e0feb129ab712f06772a86c2 Updates: bz#1656264 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	geo-rep: Fix syncing of files with non-ascii filenames	Kotresh HR	2018-12-04	1	-1/+20
	Problem: Creation of files/directories with non-ascii names fails to sync to the slave. It crashes with the traceback below on the slave.

	...
	File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/repce.py", line 118, in worker
	  res = getattr(self.obj, rmeth)(in_data[2:])
	File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py", line 709, in entry_ops
	  [ESTALE, EINVAL, EBUSY])
	File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/syncdutils.py", line 546, in errno_wrap
	  return call(arg)
	File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/libcxattr.py", line 83, in lsetxattr
	  cls.raise_oserr()
	File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/libcxattr.py", line 38, in raise_oserr
	  raise OSError(errn, os.strerror(errn))
	OSError: [Errno 12] Cannot allocate memory

	Cause: The length arguments passed to blob creation were calculated before encoding. Hence it was failing in the gfid-access layer.
	Fix: It appears that calculating the length properly fixes this issue. But it will cause issues in other places in 'python2' and not in 'python3'. So encoding and decoding each required string to make geo-rep compatible with both 'python2' and 'python3' is a nightmare and is not foolproof. Hence kept the 'python2' code as is without encode/decode and applied encode/decode only to 'python3'. Added non-ascii filename tests to regression.
	fixes: bz#1650893 Change-Id: I35cfaf848e07b1a0b5cb93c01b98b472f08271a6 Signed-off-by: Kotresh HR <khiremat@redhat.com>
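The cause named above, a length computed before encoding, is easy to reproduce in isolation. The following is a generic Python 3 illustration and not the geo-rep blob format: the length-prefixed layout and the function names `pack_name_buggy`, `pack_name_fixed` and `unpack_name` are assumptions made for the example.

```python
import struct


def pack_name_buggy(name):
    # Length is counted in characters, before encoding: too short for non-ascii.
    return struct.pack("!I", len(name)) + name.encode("utf-8")


def pack_name_fixed(name):
    # Encode first, then measure the byte length that is actually written.
    encoded = name.encode("utf-8")
    return struct.pack("!I", len(encoded)) + encoded


def unpack_name(blob):
    # Read the 4-byte length prefix, then exactly that many payload bytes.
    (length,) = struct.unpack("!I", blob[:4])
    return blob[4:4 + length].decode("utf-8")


name = "\u0444\u0430\u0439\u043b.txt"  # four Cyrillic letters plus ".txt"
print(len(name), len(name.encode("utf-8")))        # 8 12
print(unpack_name(pack_name_fixed(name)) == name)  # True
print(unpack_name(pack_name_buggy(name)) == name)  # False
```

For pure ascii names both variants agree, which is why the mismatch only surfaced with non-ascii filenames.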
*	afr: assign gfid during name heal when no 'source' is present.	Ravishankar N	2018-12-03	1	-0/+149
	Problem: If parent dir is in split-brain or has dirty xattrs set, and the file has gfid missing on one of the bricks, then name heal won't assign the gfid. Fix: Use the brick we select the gfid from as the 'source'. Note: Problem was found while trying to debug a split-brain issue on Cynthia Zhou's setup. updates: bz#1637249 Change-Id: Id088d4f0fb017aa35122de426654194e581ed742 Reported-by: Cynthia Zhou <cynthia.zhou@nokia-sbell.com> Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	tests/geo-rep: Add Arbiter volume test case	Harpreet Kaur	2018-11-28	2	-0/+439
	Added geo-rep regression tests with Arbiter volume. Fixes: bz#1653565 Change-Id: Id99523c1f1d3d301fbe871aa0641d9ae4ed7b8d7 Signed-off-by: Harpreet Kaur <hlalwani@redhat.com>
*	cluster/afr: Add test for thin-arbiter feature	Ashish Pandey	2018-11-26	1	-0/+51
	Test: Check success/failure of write fop while different bricks/ta process are down. Change-Id: I3c376935df93ebf1f794c964bd19bc1280d91c59 updates: bz#1624332 Signed-off-by: Ashish Pandey <aspandey@redhat.com>
*	gfapi: Offload callback notifications to synctask	Soumya Koduri	2018-11-26	2	-0/+374
	Upcall notifications are received from the server via epoll and the same thread is used to forward these notifications to the application. This may lead to deadlock and hang in the following scenario. Suppose that, as part of handling these callbacks, the application has to do some operations which involve sending I/Os to the gfapi stack, which in turn have to wait for epoll threads to receive the response. This may lead to deadlock if all the epoll threads are waiting to complete these callback notifications. To address it, instead of using the epoll thread itself, make use of synctask to send those notifications to the application. Change-Id: If614e0d09246e4279b9d1f40d883a32a39c8fd90 updates: bz#1648768 Signed-off-by: Soumya Koduri <skoduri@redhat.com>
*	cluster/dht: sync brick root perms on add brick	N Balachandran	2018-11-19	1	-6/+5
	If a single brick is added to the volume and the newly added brick is the first to respond to a dht_revalidate call, its stbuf will not be merged into local->stbuf as the brick does not yet have a layout. The is_permission_different check therefore fails to detect that an attr heal is required as it only considers the stbuf values from existing bricks. To fix this, merge all stbuf values into local->stbuf and use local->prebuf to store the correct directory attributes. Change-Id: Ic9e8b04a1ab9ed1248b6b056e3450bbafe32e1bc fixes: bz#1648298 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	lease: Treat unlk request as noop if lease not found	Soumya Koduri	2018-11-17	1	-2/+2
	When the glusterfs server recalls the lease, it expects the client to flush data and unlock the lease. If the client does not, the server sets a timer (starting from the time it sends the RECALL request) and, post timeout, revokes the lease. Here we could have a race wherein the client did send the UNLK lease request but, because of network delay, it reached the server after the lease was revoked. To handle such situations, treat such requests as noop and return success. Change-Id: I166402d10273f4f115ff04030ecbc14676a01663 updates: bz#1648768 Signed-off-by: Soumya Koduri <skoduri@redhat.com>
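The race and the chosen behaviour can be pictured with a very small model. This is a hedged sketch, not the leases translator: `LeaseTable` and its methods are hypothetical names, and a return value of 0 stands in for a successful reply.

```python
class LeaseTable:
    """Toy server-side lease bookkeeping (hypothetical, for illustration)."""

    def __init__(self):
        self.leases = {}  # lease id -> client name

    def grant(self, lease_id, client):
        self.leases[lease_id] = client

    def revoke(self, lease_id):
        # Server-side revocation after the recall timer expires.
        self.leases.pop(lease_id, None)

    def unlock(self, lease_id):
        # An UNLK for a lease that is no longer present is treated as a
        # noop and still reported as success (0), since the lease may have
        # been revoked while the request was delayed on the network.
        self.leases.pop(lease_id, None)
        return 0


table = LeaseTable()
table.grant("lease-1", "client-a")
table.revoke("lease-1")         # recall timed out on the server
print(table.unlock("lease-1"))  # 0: the late UNLK does not fail
```

Returning success here is harmless because the end state the client asked for, the lease being released, already holds.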
*	ctime: Enable ctime feature by default	Kotresh HR	2018-11-11	3	-8/+0
	This patch does the following.
	1. Enable ctime feature by default.
	2. Earlier, to enable the ctime feature, two options needed to be enabled
	   a. gluster vol set <volname> utime on
	   b. gluster vol set <volname> ctime on
	   This is inconvenient from the usability point of view. Hence changed it to the following single option
	   a. gluster vol set <volname> ctime on
	fixes: bz#1624724 Change-Id: I04af0e5de1ea6126c58a06ba8a26e22f9f06344e Signed-off-by: Kotresh HR <khiremat@redhat.com>
*	bd: remove from the build	Amar Tumballi	2018-11-08	1	-142/+0
	Based on the proposal to remove few features as they are not actively maintained [1], removed BD (block device) translator from the build. [1] - https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html Updates: bz#1635688 Change-Id: Ia96db406c58a7aef355dde6bc33523bb2492b1a9 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	glupy: remove from the build	Amar Tumballi	2018-11-08	1	-31/+0
	Based on the proposal to remove few features as they are not actively maintained [1], removing 'glupy' translator from the build. [1] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html This patch aims at clearing the translator from build and tests. A followup is needed to remove the code from repository. Updates: bz#1642810 Change-Id: I41d0c1956330c3bbca62c540ccf9ab01bbf3a092 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	tests/interrupt.t: remove 'stripe' volume type	Amar Tumballi	2018-11-06	1	-1/+1
	Merged the patch which introduced this testcase after the 'remove stripe' patch got merged, and hence the confusion. Updates: bz#1193929 Change-Id: Ia08552debb111292caf14e51ea6a27334fe5c788 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	fuse: diagnostic FLUSH interrupt	Csaba Henk	2018-11-06	2	-0/+94
	We add dummy interrupt handling for the FLUSH fuse message. It can be enabled by the "--fuse-flush-handle-interrupt" hidden command line option, or the "-ofuse-flush-handle-interrupt=yes" mount option. It serves no purpose other than diagnostics & demonstration -- to exercise the interrupt handling framework a bit and to give a usage example. Documentation is also provided that showcases interrupt handling via FLUSH. Change-Id: I522f1e798501d06b74ac3592a5f73c1ab0590c60 updates: #465 Signed-off-by: Csaba Henk <csaba@redhat.com>
*	glusterd: coverity fixes	Atin Mukherjee	2018-11-03	1	-0/+3
	Addresses CIDs: 1124769, 1124852, 1124864, 1134024, 1229876, 1382382. Also addressed a spurious failure in tests/bugs/glusterd/df-results-post-replace-brick-operations.t to ensure that, after the replace brick operation and before triggering 'df' from the mount, the client has a connection to the newly replaced bricks. Change-Id: Ie5d7e02f89400a661491d7fc2a120d6f6a83a1cc Updates: bz#789278 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	tiering: remove the translator from build and glusterd	Amar Tumballi	2018-11-02	25	-2078/+1
	Based on the proposal to remove few features as they are not actively maintained [1], removing tier translator from the build. Also make sure there are no regression tests involving the tiering feature. [1] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html Change-Id: I2c177f711f9b54b7b24e1a13525ff3132bd9a9c5 updates: bz#1642807 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	glusterd: set fsid while performing replace brick	Sanju Rakonde	2018-11-02	1	-0/+58
	While performing the replace-brick operation, we should set fsid value to the new brick. fixes: bz#1637196 Change-Id: I9e9a4962fc0c2f5dff43e4ac11767814a0c0beaf Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	tests: brick-mux-fd-cleanup.t should be under core directory	Atin Mukherjee	2018-10-31	1	-0/+0
	Fixes: bz#1637934 Change-Id: I5f95beab62bd2bdde3bbee94c308b0ad03e94379 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	stripe: remove the translator from build and glusterd	Amar Tumballi	2018-10-31	25	-331/+30
	Based on the proposal to remove few features as they are not actively maintained [1], removing stripe translator from the build. Also make sure there are no regression tests involving stripe translator. [1] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html Note that this patch aims at removing the translator from build, and a followup patch is needed to remove the code from repository. Updates: bz#1364707 Change-Id: I235b305338f138e29e9f30cba65bc0dadbebbbd5 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	afr: thin-arbiter 2 domain locking and in-memory state	Ravishankar N	2018-10-25	1	-5/+5
	2 domain locking + xattrop for write-txn failures:
	--------------------------------------------------
	- A post-op wound on TA takes AFR_TA_DOM_NOTIFY range lock and AFR_TA_DOM_MODIFY full lock, does xattrop on TA and releases AFR_TA_DOM_MODIFY lock and stores in-memory which brick is bad.
	- All further write txn failures are handled based on this in-memory value without querying the TA.
	- When shd heals the files, it does so by requesting full lock on AFR_TA_DOM_NOTIFY domain. Client uses this as a cue (via upcall), releases AFR_TA_DOM_NOTIFY range lock and invalidates its in-memory notion of which brick is bad. The next write txn failure is wound on TA to again update the in-memory state.
	- Any incomplete write txns before the AFR_TA_DOM_NOTIFY upcall release request is got is completed before the lock is released.
	- Any write txns got after the release request are maintained in a ta_waitq.
	- After the release is complete, the ta_waitq elements are spliced to a separate queue which is then processed one by one.
	- For fops that come in parallel when the in-memory bad brick is still unknown, only one is wound to TA on wire. The other ones are maintained in a ta_onwireq which is then processed after we get the response from TA.
	Change-Id: I32c7b61a61776663601ab0040e2f0767eca1fd64 updates: bz#1579788 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Signed-off-by: Ashish Pandey <aspandey@redhat.com>
*	glusterd: ensure volinfo->caps is set to correct value.	Sanju Rakonde	2018-10-25	1	-0/+9
	With the commit febf5ed4848, during the volume create op, we are setting volinfo->caps to 0 only if any of the bricks belong to the same node and brickinfo->vg[0] is null. Previously, we used to set volinfo->caps to 0 when either a brick doesn't belong to the same node or brickinfo->vg[0] is null. With this patch, we again set volinfo->caps to 0 when either a brick doesn't belong to the same node or brickinfo->vg[0] is null (as we did before commit febf5ed4848). fixes: bz#1635820 Change-Id: I00a97415786b775fb088ac45566ad52b402f1a49 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	tests: correction in tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t	Sanju Rakonde	2018-10-25	1	-9/+18
	Patch https://review.gluster.org/#/c/glusterfs/+/19135/ optimised the glusterd test cases by clubbing similar test cases into a single test case. The test case https://review.gluster.org/#/c/glusterfs/+/19135/15/tests/bugs/glusterd/bug-1293414-import-brickinfo-uuid.t was deleted and added as a part of tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t. In the original test case, we create a volume with two bricks, each on a separate node (N1 & N2). From another node in the cluster (N3), we try to detach a node which is hosting bricks. It fails. In the new test, we created a volume with a single brick on N1, and from another node in the cluster we tried to detach N1. We expect peer detach to fail, but peer detach succeeded as the node is hosting all the bricks of the volume. Now changing the new test case to cover the original test case scenario. Please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1642597#c1 to understand why the new test case is not failing in centos-regression. fixes: bz#1642597 Change-Id: Ifda12b5677143095f263fbb97a6808573f513234 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	cluster/ec : Prevent volume create without redundant brick	Sunil Kumar Acharya	2018-10-24	1	-0/+1
	Problem: EC volumes can be created without any redundant brick. Solution: Updated the conditional check to avoid volume create without redundant brick. fixes: bz#1642448 Change-Id: I0cb334b1b9378d67fcb8abf793dbe312c3179c0b Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
*	tests: check for shd up status in bug-1637802-arbiter-stale-data-heal-lock.t	Ravishankar N	2018-10-22	1	-0/+1
	Problem: https://review.gluster.org/#/c/glusterfs/+/21427/ seems to be failing this .t spuriously. On checking one of the failure logs, I see:

	22:05:44 Launching heal operation to perform index self heal on volume patchy has been unsuccessful:
	22:05:44 Self-heal daemon is not running. Check self-heal daemon log file.
	22:05:44 not ok 20 , LINENUM:38

	In glusterd log:

	[2018-10-18 22:05:44.298832] E [MSGID: 106301] [glusterd-syncop.c:1352:gd_stage_op_phase] 0-management: Staging of operation 'Volume Heal' failed on localhost : Self-heal daemon is not running. Check self-heal daemon log file

	But the tests which precede this one check via a statedump whether the shd is connected to the bricks, and they have succeeded and even started healing. From glustershd.log:

	[2018-10-18 22:05:40.975268] I [MSGID: 108026] [afr-self-heal-common.c:1732:afr_log_selfheal] 0-patchy-replicate-0: Completed data selfheal on 3b83d2dd-4cf2-4ea3-a33e-4275be40f440. sources=[0] 1 sinks=2

	So the only reason I can see for launching heal via CLI failing is a race where shd has been spawned but glusterd has not yet updated in-memory that it is up, and hence fails the CLI.
	Fix: Check for shd up status before launching heal via CLI
	Change-Id: Ic88abf14ad3d51c89cb438db601fae4df179e8f4 fixes: bz#1641344 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	api: fill out attribute information if not valid	Raghavendra Gowdappa	2018-10-17	2	-0/+137
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Translators like readdir-ahead selectively retain entry information of iatt (gfid and type) when the rest of the iatt is invalidated (for example, a write invalidating ia_size, (m)(c)times etc). Fuse-bridge uses this information and sends only entry information in the readdirplus response. However, no such option exists in gfapi. This patch modifies gfapi to populate the stat by forcing an extra lookup. Thanks to Shyamsundar Ranganathan <srangana@redhat.com> and Prashanth Pai <ppai@redhat.com> for tests. Change-Id: Ieb5f8fc76359c327627b7d8420aaf20810e53000 Fixes: bz#1630804 Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com> Signed-off-by: Soumya Koduri <skoduri@redhat.com>
*	gfapi: Bug fixes in leases processing code-path	Soumya Koduri	2018-10-16	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes the following issues in the gfapi lease code path: * 'glfs_setfsleaseid' should allow NULL input so that the leaseid can be reset * Applications should be allowed to (un)register for upcall notifications of type GLFS_EVENT_LEASE_RECALL * APIs added to read the contents of the GLFS_EVENT_LEASE_RECALL argument, which is of type "struct glfs_upcall_lease" Change-Id: I3320ddf235cc82fad561e13b9457ebd64db6c76b updates: #350 Signed-off-by: Soumya Koduri <skoduri@redhat.com>
*	features/shard: Hold a ref on base inode when adding a shard to lru list	Krutika Dhananjay	2018-10-16	5	-7/+106
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In __shard_update_shards_inode_list(), previously shard translator was not holding a ref on the base inode whenever a shard was added to the lru list. But if the base shard is forgotten and destroyed either by fuse due to memory pressure or due to the file being deleted at some point by a different client with this client still containing stale shards in its lru list, the client would crash at the time of locking lru_base_inode->lock owing to illegal memory access. So now the base shard is ref'd into the inode ctx of every shard that is added to lru list until it gets lru'd out. The patch also handles the case where none of the shards associated with a file that is about to be deleted are part of the LRU list and where an unlink at the beginning of the operation destroys the base inode (because there are no refkeepers) and hence all of the shards that are about to be deleted will be resolved without the existence of a base shard in-memory. This, if not handled properly, could lead to a crash. Change-Id: Ic15ca41444dd04684a9458bd4a526b1d3e160499 updates: bz#1605056 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
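The idea of pinning the base inode for as long as one of its shards sits in the LRU list can be modelled with a toy reference count. This is a simplified Python sketch, not the shard translator's C code: the `Inode` and `ShardLru` classes and their methods are hypothetical, and real inode lifetime management is considerably more involved.

```python
from collections import OrderedDict


class Inode:
    """Toy ref-counted object; 'destroyed' marks freed memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.refs = 0
        self.destroyed = False

    def ref(self) -> "Inode":
        if self.destroyed:
            raise RuntimeError(f"use after destroy: {self.name}")
        self.refs += 1
        return self

    def unref(self) -> None:
        self.refs -= 1
        if self.refs == 0:
            self.destroyed = True


class ShardLru:
    """LRU of shards; each entry holds its own ref on the base inode."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.entries = OrderedDict()  # shard name -> (shard, base)

    def add(self, shard: Inode, base: Inode) -> None:
        if shard.name in self.entries:
            self.entries.move_to_end(shard.name)
            return
        held = base.ref()  # take the new ref before any eviction
        if len(self.entries) >= self.limit:
            _, (_evicted, old_base) = self.entries.popitem(last=False)
            old_base.unref()  # ref dropped only when the shard is lru'd out
        self.entries[shard.name] = (shard, held)
```

In this model, if the caller drops its own reference on the base inode while a shard is still in the list, the base survives because the list entry still holds a reference to it.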
*	core: glusterfsd keeping fd open in index xlator	Mohit Agrawal	2018-10-12	1	-0/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: While processing GF_EVENT_PARENT_DOWN, a brick xlator forwards the event to the next xlator only once it has ensured that no stub is in progress. The io-thread xlator, however, decreases stub_cnt before it processes a stub and notifies the next xlator of the event. Solution: Introduce a new counter to save stub_cnt and decrease the counter only after the stub has been processed completely in the io-thread xlator. To avoid a brick crash when calling xlator_mem_cleanup, move only the brick xlator if the detached brick name is found in the graph. Note: Thanks to Pranith for sharing a simple reproducer. fixes bz#1637934 Change-Id: I1a694a001f7a5417e8771e3adf92c518969b6baa Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	afr: prevent winding inodelks twice for arbiter volumes	Ravishankar N	2018-10-10	1	-0/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In an arbiter volume, if there is a pending data heal of a file only on the arbiter brick, self-heal takes inodelks twice due to a code-bug but unlocks only once, leaving behind a stale lock on the brick. This causes the next write to the file to hang. Fix: Fix the code-bug to take the lock only once. This bug was introduced in master by commit eb472d82a083883335bc494b87ea175ac43471ff Thanks to Pranith Kumar K <pkarampu@redhat.com> for finding the RCA. fixes: bz#1637802 Change-Id: I15ad969e10a6a3c4bd255e2948b6be6dcddc61e1 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
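Why a lock taken twice but released once leaves a stale lock behind can be shown with a toy lock table. This is a minimal Python sketch, not AFR's code: `BrickLocks` and `heal` are hypothetical, and the model assumes that repeated lock requests from the same owner are granted and counted separately.

```python
class BrickLocks:
    """Toy lock table counting grants per (file, owner) pair."""

    def __init__(self) -> None:
        self.held = {}

    def lock(self, file_id: str, owner: str) -> None:
        key = (file_id, owner)
        self.held[key] = self.held.get(key, 0) + 1

    def unlock(self, file_id: str, owner: str) -> None:
        key = (file_id, owner)
        count = self.held.get(key, 0)
        if count == 0:
            raise RuntimeError("unlock without a matching lock")
        if count == 1:
            del self.held[key]
        else:
            self.held[key] = count - 1

    def is_locked(self, file_id: str) -> bool:
        return any(f == file_id for f, _owner in self.held)


def heal(locks: BrickLocks, file_id: str, owner: str, lock_twice: bool) -> None:
    locks.lock(file_id, owner)
    if lock_twice:
        locks.lock(file_id, owner)  # models the duplicate lock request
    # ... the heal itself would run here ...
    locks.unlock(file_id, owner)  # released only once in both variants
```

With `lock_twice=True` the file remains locked after `heal` returns, so a later writer that needs the lock would block; with `lock_twice=False` the table is empty again.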
*	cli: memory leak issues reported by asan	Amar Tumballi	2018-10-09	1	-1/+1
\| \| \| \| \| \| \| \| \|	With this fix, a run on 'rpc-coverage.t' passes properly. This should help to get started with other fixes soon! Change-Id: I257ae4e28b9974998a451d3b490cc18c02650ba2 updates: bz#1633930 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	tests: add get-state command to test	Sanju Rakonde	2018-10-07	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Run the "gluster get-state" command while a geo-replication session is running. Patch https://review.gluster.org/#/c/glusterfs/+/20461/ fixes a glusterd crash that occurs when the get-state command is run with a geo-rep session configured. Adding the test now. Fixes: bz#1598345 Change-Id: I56283fba2c782f83669923ddfa4af3400255fed6 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	Reduce execution time of bug-1559004-EMLINK-handling.t	Xavi Hernandez	2018-10-04	1	-12/+51
\| \| \| \| \| \| \| \| \| \| \|	This patch reduces the execution time of bug-1559004-EMLINK-handling.t from ~14 minutes to ~90 seconds. To do so, it creates some fake hard links directly on the brick instead of creating them through the volume. Change-Id: I9715ff1a4eba47574c733d4f28e68f42f56a7d3f updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
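Creating many hard links directly on a local directory, rather than through a mounted volume, is cheap because each link is a single local system call. The sketch below only illustrates that general technique in Python; the commit message does not say how the test lays out its links, so the directory, file names, and link count here are all assumptions.

```python
import os
import tempfile


def make_fake_links(directory: str, target: str, count: int) -> int:
    """Create count hard links to target inside directory.

    Returns the resulting link count of the target file.
    """
    for i in range(count):
        os.link(target, os.path.join(directory, f"fake-link-{i}"))
    return os.stat(target).st_nlink


# A temporary directory stands in for a brick's backend directory.
with tempfile.TemporaryDirectory() as brick_dir:
    target_path = os.path.join(brick_dir, "file")
    with open(target_path, "w") as handle:
        handle.write("data")
    nlink = make_fake_links(brick_dir, target_path, 100)
```

Here `nlink` counts the original name plus the 100 links just created.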