glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	glusterd/store: store all key-values in one shot	Yaniv Kaul	2019-05-08	3	-0/+51
\| \| \| \| \| \| \| \| \| \| \| \|	Instead of saving each key-value separately, which is slow ( especially as we fflush() after each!), store them all as one string and write all together. Implements https://github.com/gluster/glusterfs/issues/629 Change-Id: Ie77a272446b0b6785584b710a4fdd9c613dd9578 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat,.com>
*	libglusterfs: Fix compilation when --disable-mempool is used	Pranith Kumar K	2019-05-07	1	-0/+5
\| \| \| \| \| \|	updates bz#1193929 Change-Id: I245c065b209bcce5db939b6a0a934ba6fd393b47 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	mem-pool.{c\|h}: minor changes	Yaniv Kaul	2019-05-06	1	-25/+12
\| \| \| \| \| \| \| \| \| \| \|	1. Removed some code that was not needed. It did not really do anything. 2. CALLOC -> MALLOC in one place. Compile-tested only! updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: I4419161e1bb636158e32b5d33044b06f1eef2449
*	tests: validate volfile grammar - strings in volfile	Amar Tumballi	2019-05-06	4	-85/+17
\| \| \| \| \| \| \| \|	* libglusterfs/graph-print: remove unused code updates: bz#1693692 Change-Id: Iae81bb6a3af5911c3da07ab8f1d8f58f27e06905 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	options.c,h: minor changes to GF_OPTION_RECONF	Yaniv Kaul	2019-04-30	2	-49/+26
\| \| \| \| \| \| \| \| \| \| \|	Minor code changes (less variables and if statements) and use dict_get_strn(), since all options are fixed strings. Similar changes could be done to GF_OPTION_INIT() as well. Change-Id: I4a523f67183f4c4852a3d4de5e3cac52df68d3cf updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	libglusterfs: remove compound-fop helper functions	Amar Tumballi	2019-04-29	3	-177/+3
\| \| \| \| \| \|	updates: bz#1693692 Change-Id: If69702990af273be1f38855ba56b3b89fabff167 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	cloudsync/cvlt: Cloudsync plugin for commvault store	Anuradha Talur	2019-04-26	1	-0/+1
\| \| \| \| \| \|	Change-Id: Icbe53e78e9c4f6699c7a26a806ef4b14b39f5019 updates: bz#1642168 Signed-off-by: Anuradha Talur <atalur@commvault.com>
*	logging.c/h: aggressively remove sprintfs()	Yaniv Kaul	2019-04-25	2	-328/+201
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Try to reduce the number of sprintf() and string copies until we finally log a log line. Specifically, do not sprintf separately the timestr string and do not sprintf/strcpy the appmsgstr separately - just stick it with the header. Hoping I did not leak anything or changed the log line formatting. Also, allocate 4K (GF_LOG_BACKTRACE_SIZE) of memory dynamically for trace output - only if trace was actually requested (previously, it was unconditionally) In addition, some minor code formatting (unrelated to the above). updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: Id2ccc85f9213a2b1c6eaa4a2f58ce043eac1824f
*	core: avoid dynamic TLS allocation when possible	Xavi Hernandez	2019-04-24	6	-431/+105
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some interdependencies between logging and memory management functions make it impossible to use the logging framework before initializing memory subsystem because they both depend on Thread Local Storage allocated through pthread_key_create() during initialization. This causes a crash when we try to log something very early in the initialization phase. To prevent this, several dynamically allocated TLS structures have been replaced by static TLS reserved at compile time using '__thread' keyword. This also reduces the number of error sources, making initialization simpler. Updates: bz#1193929 Change-Id: I8ea2e072411e30790d50084b6b7e909c7bb01d50 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	core: fix hang issue in __gf_free	Susant Palai	2019-04-22	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently GF_ASSERT is done under mem_accounting lock at some places. On a GF_ASSERT failure, gf_msg_callingfn is called which calls gf_malloc internally and it takes the same mem_accounting lock leading to deadlock. This is a temporary fix to avoid any hang issue in master. https://review.gluster.org/#/c/glusterfs/+/22589/ is being worked on in the mean while so that GF_ASSERT can be used under mem_accounting lock. Change-Id: I6d67f23979e7edd2695bdc6aab2997dae4a4060a updates: bz#1700865 Signed-off-by: Susant Palai <spalai@redhat.com>
*	core: handle memory accounting correctly	Xavi Hernandez	2019-04-22	4	-114/+105
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a translator stops, memory accounting for that translator is not destroyed (because there could remain memory allocated that references it), but mutexes that coordinate updates of memory accounting were destroyed. This caused incorrect memory accounting and even crashes in debug mode. This patch also fixes some other things: * Reduce the number of atomic operations needed to manage memory accounting. * Correctly account memory when realloc() is used. * Merge two critical sections into one. * Cleaned the code a bit. Change-Id: Id5eaee7338729b9bc52c931815ca3ff1e5a7dcc8 Updates: bz#1659334 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	shd/mux: Fix coverity issues introduced by shd mux patch	Mohammed Rafi KC	2019-04-15	1	-8/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	CID 1400475: Null pointer dereferences (FORWARD_NULL) CID 1400474: Null pointer dereferences (FORWARD_NULL) CID 1400471: Code maintainability issues (UNUSED_VALUE) CID 1400470: Null pointer dereferences (FORWARD_NULL) CID 1400469: Memory - illegal accesses (USE_AFTER_FREE) CID 1400467: Code maintainability issues (UNUSED_VALUE) Change-Id: I0ca1c733be335c6e5844f44850f8066626ac40d4 updates: bz#789278 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	core: Log level changes do not effect on running client process	Mohit Agrawal	2019-04-15	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: commit c34e4161f3cb6539ec83a9020f3d27eb4759a975 set log-level per xlator during reconfigure only for a brick process not for the client process. Solution: 1) Change per xlator log-level only if brick_mux is enabled.To make sure about brick multiplex introudce a flag brick_mux at ctx->cmd_args. Note: There are two other changes done with this patch 1) Ignore client-log-level option to attach a brick with already running brick if brick_mux is enabled 2) Add a log to print pid of the running process to make easier debugging Change-Id: I39e85de778e150d0685cd9a79425ce8b4783f9c9 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com> Fixes: bz#1696046
*	graph.c: remove extra gettimeofday() - reuse the graph dob.	Yaniv Kaul	2019-04-15	2	-19/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	It was written just before fill_void() call. Note that there was a possible overflow if the hostname was too long (unrelated to this patch), but it is now also fixed, as we use a smaller buffer for the hostname. This, in turn, forces us to check if gethostname() failed and add explicitly the terminating null to it. Change-Id: I45fbc0a8e105f1247f3cbf61befac06fabbaea06 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	Replace memdup() with gf_memdup()	Vijay Bellur	2019-04-12	4	-17/+5
\| \| \| \| \| \| \| \| \|	memdup() and gf_memdup() have the same implementation. Removed one API as the presence of both can be confusing. Change-Id: I562130c668457e13e4288e592792872d2e49887e updates: bz#1193929 Signed-off-by: Vijay Bellur <vbellur@redhat.com>
*	core: only log seek errors if SEEK_HOLE/SEEK_DATA is available	Niels de Vos	2019-04-11	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On RHEL-6 there is no support for SEEK_HOLE/SEEK_DATA and this causes the POSIX xlator to return errno=EINVAL. Because of this, the rpc-server xlator will log all 'failed' seek attempts. When applications call seek() often, the brick logs can grow very quickly and fill up the disks. Messages that get logged are like [server-rpc-fops.c:2091:server_seek_cbk] 0-vol01-server: 4947: SEEK-2 (53920aee-062c-4598-aa50-2b4d7821b204), client: worker.example.com-7808-2019/02/08-18:04:57:903430-vol01-client-0-0-0, error-xlator: vol01-posix [Invalid argument] The problem can be reproduced by running a Gluster Server on RHEL-6, with a client running on RHEL-7. The client should execute an application that calls lseek() with SEEK_HOLE/SEEK_DATA. Change-Id: I7b6c16f8e0ba1a183e845cfdb8d5a3f8caeab138 Fixes: bz#1697316 Signed-off-by: Niels de Vos <ndevos@redhat.com>
*	libglusterfs: define macros needed for cloudsync	Anuradha Talur	2019-04-04	1	-0/+4
\| \| \| \| \| \|	Change-Id: Iec5ce7f17fbf899f881a58cd20c4c967e3b71668 fixes: bz#1642168 Signed-off-by: Anuradha Talur <atalur@commvault.com>
*	logging: Fix GF_LOG_OCCASSIONALLY API	Atin Mukherjee	2019-04-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	GF_LOG_OCCASSIONALLY doesn't log on the first instance rather at every 42nd iterations which isn't effective as in some cases we might not have the code flow hitting the same log for as many as 42 times and we'd end up suppressing the log. Fixes: bz#1694925 Change-Id: Iee293281d25a652b64df111d59b13de4efce06fa Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	mgmt/shd: Implement multiplexing in self heal daemon	Mohammed Rafi KC	2019-04-01	9	-3/+508
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Shd daemon is per node, which means they create a graph with all volumes on it. While this is a great for utilizing resources, it is so good in terms of performance and managebility. Because self-heal daemons doesn't have capability to automatically reconfigure their graphs. So each time when any configurations changes happens to the volumes(replicate/disperse), we need to restart shd to bring the changes into the graph. Because of this all on going heal for all other volumes has to be stopped in the middle, and need to restart all over again. Solution: This changes makes shd as a per volume daemon, so that the graph will be generated for each volumes. When we want to start/reconfigure shd for a volume, we first search for an existing shd running on the node, if there is none, we will start a new process. If already a daemon is running for shd, then we will simply detach a graph for a volume and reatach the updated graph for the volume. This won't touch any of the on going operations for any other volumes on the shd daemon. Example of an shd graph when it is per volume graph ----------------------- \| debug-iostat \| ----------------------- / \| \ / \| \ --------- --------- ---------- \| AFR-1 \| \| AFR-2 \| \| AFR-3 \| -------- --------- ---------- A running shd daemon with 3 volumes will be like--> graph ----------------------- \| debug-iostat \| ----------------------- / \| \ / \| \ ------------ ------------ ------------ \| volume-1 \| \| volume-2 \| \| volume-3 \| ------------ ------------ ------------ Change-Id: Idcb2698be3eeb95beaac47125565c93370afbd99 fixes: bz#1659708 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	mem-pool: remove dead code.	Yaniv Kaul	2019-03-26	2	-81/+0
\| \| \| \| \| \|	Change-Id: I3bbda719027b45e1289db2e6a718627141bcbdc8 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
*	inode: fix unused vars	Atin Mukherjee	2019-03-22	1	-1/+2
\| \| \| \| \| \| \| \| \|	Commit 6d6a3b2 introduced some unused vars. This patch defines them within #ifdef DEBUG Fixes: bz#1580315 Change-Id: I8a332b00c3ffb66689b4b6480c490b9436c17d63 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	inode: don't dump the whole table to CLI	Amar Tumballi	2019-03-20	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \|	dumping the whole inode table detail to screen doesn't solve any purpose. We should be getting only toplevel details on CLI, and then if one wants to debug further, then they need to get to 'statedump' to get full details. Fixes: bz#1580315 Change-Id: Iaf3de94602f9c76832c3c918ffe2ad13c0b0e448 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	cluster-syncop: avoid duplicate unlock of inodelk/entrylk	Kinglong Mee	2019-03-18	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \|	When using ec, there are many messages at brick log as, [inodelk.c:514:__inode_unlock_lock] 0-test-locks: Matching lock not found for unlock 0-9223372036854775807, lo=68e040a84b7f0000 on 0x7f208c006f78 [MSGID: 115053] [server-rpc-fops_v2.c:280:server4_inodelk_cbk] 0-test-server: 2557439: INODELK <gfid:df4e41be-723f-4289-b7af-b4272b3e880c> (df4e41be-723f-4289-b7af-b4272b3e880c), client: CTX_ID:67d4a7f3-605a-4965-89a5-31309d62d1fa-GRAPH_ID:0-PID:1659-HOST:openfs-node2-PC_NAME:test-client-1-RECON_NO:-28, error-xlator: test-locks [Invalid argument] Change-Id: Ib164d29ebb071f620a4ca9679c4345ef7c88512a Updates: bz#1689920 Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
*	glusterfsd: Multiple shd processes are spawned on brick_mux environment	Mohit Agrawal	2019-03-12	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Multiple shd processes are spawned while starting volumes in the loop on brick_mux environment.glusterd spawn a process based on a pidfile and shd daemon is taking some time to update pid in pidfile due to that glusterd is not able to get shd pid Solution: Commit cd249f4cb783f8d79e79468c455732669e835a4f changed the code to update pidfile in parent for any gluster daemon after getting the status of forking child in parent.To resolve the same correct the condition update pidfile in parent only for glusterd and for rest of the daemon pidfile is updated in child Change-Id: Ifd14797fa949562594a285ec82d58384ad717e81 fixes: bz#1684404 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	glusterd: change the op-version	Sanju Rakonde	2019-03-11	2	-1/+3
\| \| \| \| \| \| \| \| \| \|	as commit 073444 is backported to release-5.4 branch, op-version for this change should 5.4 instead of 6. fixes: bz#1685120 Change-Id: Id504b9a1446125cea7c6a32117ccc44f28e73aa7 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	WORM-Xlator: Maybe integer overflow when computing new atime	David Spisla	2019-03-07	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The structs worm_reten_state_t and read_only_priv_t from read-only.h are using uint64_t values to store periods of retention and autocommmit. This seems to be dangerous since in worm-helper.c the function worm_set_state computes in line 97: stbuf->ia_atime = time(NULL) + retention_state->ret_period; stbuf->ia_atime is using int64_t because of the settings of struct iattr. So if there is a very very high retention period stored, there is maybe an integer overflow. What can be the solution? Using int64_t instead if uint64_t may reduce the probability of the occurance. Change-Id: Id1e86c6b20edd53f171c4cfcb528804ba7881f65 fixes: bz#1685944 Signed-off-by: David Spisla <david.spisla@iternity.com>
*	core: make compute_cksum function op_version compatible	Sanju Rakonde	2019-03-07	2	-7/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: commit 5a152a changed the mechansim of computing the checksum. In heterogeneous cluster, peers are running into rejected state because we have different cksum computation mechansims in upgraded and non-upgraded nodes. Solution: add a check for op-version so that all the nodes in the cluster follow the same mechanism for computing the cksum. Change-Id: I1508f000e8c9895588b6011b8b6cc0eda7102193 fixes: bz#1685120 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	core: fix volume heal to avoid "invalid argument"	Rinku Kothiya	2019-03-05	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	This patch avoids printing of "invalid argument" unless loglevel is set to GF_LOG_DEBUG. fixes : bz#1654021 Change-Id: I0e3d43bc627526f696b12921081342ca9b4a5f84 fixes: bz#1654021 Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
*	core: implement a global thread pool	Xavi Hernandez	2019-02-18	7	-4/+947
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements a thread pool that is wait-free for adding jobs to the queue and uses a very small locked region to get jobs. This makes it possible to decrease contention drastically. It's based on wfcqueue structure provided by urcu library. It automatically enables more threads when load demands it, and stops them when not needed. There's a maximum number of threads that can be used. This value can be configured. Depending on the workload, the maximum number of threads plays an important role. So it needs to be configured for optimal performance. Currently the thread pool doesn't self adjust the maximum for the workload, so this configuration needs to be changed manually. For this reason, the global thread pool has been made optional, so that volumes can still use the thread pool provided by io-threads. To enable it for bricks, the following option needs to be set: config.global-threading = on This option has no effect if bricks are already running. A restart is required to activate it. It's recommended to also enable the following option when running bricks with the global thread pool: performance.iot-pass-through = on To enable it for a FUSE mount point, the option '--global-threading' must be added to the mount command. To change it, an umount and remount is needed. It's recommended to disable the following option when using global threading on a mount point: performance.client-io-threads = off To enable it for services managed by glusterd, glusterd needs to be started with option '--global-threading'. In this case all daemons, like self-heal, will be using the global thread pool. Currently it can only be enabled for bricks, FUSE mounts and glusterd services. The maximum number of threads for clients and bricks can be configured using the following options: config.client-threads config.brick-threads These options can be applied online and its effect is immediate most of the times. If one of them is set to 0, the maximum number of threads will be calcutated as #cores * 2. Some distributions use a very old userspace-rcu library (version 0.7) for this reason, some header files from version 0.10 have been copied into contrib/userspace-rcu and are used if the detected version is 0.7 or older. An additional change has been made to io-threads to prevent that threads are started when iot-pass-through is set. Change-Id: I09d19e246b9e6d53c6247b29dfca6af6ee00a24b updates: #532 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	socket: socket event handlers now return void	Milind Changire	2019-02-18	4	-18/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Returning any value from socket event handlers to the event sub-system doesn't make sense since event sub-system cannot handle socket sub-system errors. Solution: Change return type of all socket event handlers to 'void' Change-Id: I70dc2c57f12b7ea2fae41120f71aa0d7fe0b2b6f Fixes: bz#1651246 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	inode: make critical section smaller	Amar Tumballi	2019-02-13	3	-217/+111
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	do all the 'static' tasks outside of locked region. * hash_dentry() and hash_gfid() are now called outside locked region. * remove extra __dentry_hash exported in libglusterfs.sym * avoid checks in locked functions, if the check is done in calling function. * implement dentry_destroy(), which handles freeing of dentry separately, from that of dentry_unset (which takes care of separating dentry from inode, and table) Updates: bz#1670031 Change-Id: I584213e0748464bb427fbdef3c4ab6615d7d5eb0 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	afr/shd: Cleanup self heal daemon resources during afr fini	Mohammed Rafi KC	2019-02-12	1	-0/+8
\| \| \| \| \| \| \| \| \|	We were not properly cleaning self-heal daemon resources during afr fini. This patch will clean the same. Change-Id: I597860be6f781b195449e695d871b8667a418d5a updates: bz#1659708 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	clnt/rpc: ref leak during disconnect.	Mohammed Rafi KC	2019-02-12	1	-5/+11
\| \| \| \| \| \| \| \| \| \|	During disconnect cleanup, we are not cancelling reconnect timer, which causes a ref leak each time when a disconnect happen. Change-Id: I9d05d1f368d080e04836bf6a0bb018bf8f7b5b8a updates: bz#1659708 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	inode: create inode outside locked region	Amar Tumballi	2019-02-11	1	-11/+13
\| \| \| \| \| \| \| \| \|	Only linking of inode to the table, and inserting it in a list needs to be in locked region. Updates: bz#1670031 Change-Id: I6ea7e956b80cf2765c2233d761909c4bf9c7253c Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	fuse: correctly handle setxattr values	Xavi Hernandez	2019-02-07	2	-4/+26
\| \| \| \| \| \| \| \| \| \|	The setxattr function receives a pointer to raw data, which may not be null-terminated. When this data needs to be interpreted as a string, an explicit null termination needs to be added before using the value. Change-Id: Id110f9b215b22786da5782adec9449ce38d0d563 updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	glusterd: Update op-version for release 7	ShyamsundarR	2019-02-05	1	-1/+3
\| \| \| \| \| \|	Change-Id: I0f3978d7e603e6e767dc7aa2a23ef35b1f2b43f7 Updates: bz#1193929 Signed-off-by: ShyamsundarR <srangana@redhat.com>
*	glusterd: manage upgrade to current master	Amar Tumballi	2019-02-04	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Scenarios tested: * Upgrade the node when there are stripe / tiering and regular type of volumes are present. - All volumes are started fine (as the change was not on brick volfile) - For tier, the functionality may not even work, as changetimerecorder is not present. - 'gluster volume info' properly shows as 'NOT SUPPORTED' for stripe and tier type of volume. * Upgrade in a rolling upgrade scenario, where an old version is able to connect to higher master. - on a normal volume, if the volfile-server was new, the newer client volfiles needed to have utime xlator conditionally. - with this one change, all other changes seem to work fine. Change-Id: Ib2d3b69dafa02b2c695a735b13c1aa70aba07cb8 updates: bz#1635688 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	mount/fuse: expose auto-invalidation as a mount option	Raghavendra Gowdappa	2019-02-02	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Auto invalidation is necessary when same (meta)data is shared/access across multiple mounts. However, if (meta)data is not shared, all relevant I/O goes through the cache of single mount and hence is coherent with (meta)data on bricks always. So, fuse-auto-invalidation can be disabled for this case which gives a huge performance boost for workloads that write data and then immediately read the data they just wrote. From glusterfs --help, <snip> --auto-invalidation[=BOOL] controls whether fuse-kernel can auto-invalidate attribute, dentry and page-cache. Disable this only if same files/directories are not accessed across two different mounts concurrently [default: "on"] </snip> Details on how disabling auto-invalidation helped to reduce pgbench init times can be found at [1]. Time taken for pgbench init of scale 8000 was 8340s. That will be an improvement of 86% (59280s vs 8340s) with auto-invalidations turned off along with other optimizations. Just disabling auto-invalidation contributed 56% improvement by reducing the total time taken by 33260s. [1] https://www.spinics.net/lists/gluster-devel/msg25907.html Change-Id: I0ed730dba9064bd9c576ad1800170a21e100e1ce Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com> updates: bz#1664934
*	core: make gf_thread_create() easier to use	Xavi Hernandez	2019-02-01	5	-59/+105
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch creates a specific function to set the thread name using a string format and a variable argument list, like printf(). This function is used to set the thread name from gf_thread_create(), which now accepts a variable argument list to create the full name. It's not necessary anymore to use a local array to build the name of the thread. This is done automatically. Change-Id: Idd8d01fd462c227359b96e98699f8c6d962dc17c Updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	core: move "dict is NULL" logs to DEBUG log level	Milind Changire	2019-02-01	1	-2/+2
\| \| \| \| \| \| \| \| \|	Too many logs get printed if dict_ref() and dict_unref() are passed NULL pointer. fixes: bz#1671213 Change-Id: I18afd849d64318f68baa7b549ee310dac0e1e786 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	syncop: remove unnecessary call to gf_backtrace_save()	Xavi Hernandez	2019-01-31	2	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \|	A call to gf_backtrace_save() was done on each context switch of a synctask. The backtrace is generated writing to the filesystem, so it can have an important impact on latency. The generated backtrace was not used anywhere, so it's been removed. Change-Id: I399a93b932c5b6e981c696c72c3e1ef44710ba52 Updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	core: heketi-cli is throwing error "target is busy"	Mohit Agrawal	2019-01-31	4	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When rpc-transport-disconnect happens, server_connection_cleanup_flush_cbk() is supposed to call rpc_transport_unref() after open-files on that transport are flushed per transport.But open-fd-count is maintained in bound_xl->fd_count, which can be incremented/decremented cumulatively in server_connection_cleanup() by all transport disconnect paths. So instead of rpc_transport_unref() happening per transport, it ends up doing it only once after all the files on all the transports for the brick are flushed leading to rpc-leaks. Solution: To avoid races maintain fd_cnt at client instead of maintaining on brick Credits: Pranith Kumar Karampuri Change-Id: I6e8ea37a61f82d9aefb227c5b3ab57a7a36850e6 fixes: bz#1668190 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	Multiple files: reduce work while under lock.	Yaniv Kaul	2019-01-29	4	-11/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Mostly, unlock before logging. In some cases, moved different code that was not needed to be under lock (for example, taking time, or malloc'ing) to be executed before taking the lock. Note: logging might be slightly less accurate in order, since it may not be done now under the lock, so order of logs is racy. I think it's a reasonable compromise. Compile-tested only! updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: I2438710016afc9f4f62a176ef1a0d3ed793b4f89
*	socket: fix issue on concurrent handle of a socket	Zhang Huan	2019-01-28	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Found an issue on concurrent invoke of event handler to the same socket fd, causing memory corruption. This issue arises after applying commit "socket: Remove redundant in_lock in incoming message handling" that removes priv->in_lock to serialize socket read. The following call sequence describes how concurrent socket event handle happens. thread 1 thread 2 thread 3 epoll_wait() return (slot->in_handler is 0) call select_on_epoll() and epoll_ctl() on fd epoll_wait() return slot->in_handler++ (slot->in_handler is 1) slot->in_handler++ (slot->in_handler is 2) call handler() call handler() Fix this issue by skip invoke of handler if there is already a handler inprogress. Change-Id: I437126ac772debcadb00993a948919c931cd607b updates: bz#1467614 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
*	afr/self-heal:Fix wrong type checking	Ravishankar N	2019-01-24	4	-1/+28
\| \| \| \| \| \| \| \| \| \|	gf_dirent struct has d_type variable which should check with DT_DIR istead of IA_IFDIR or IA_IFDIR has to compare with entry->d_stat.ia_type Change-Id: Idf1059ce2a590734bc5b6adaad73604d9a708804 updates: bz#1653359 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
*	core: move logs which are only developer relevant to DEBUG level	Amar Tumballi	2019-01-23	3	-5/+5
\| \| \| \| \| \| \| \| \| \| \|	We had only changed the log level to DEBUG in release branch earlier. But considering 90%+ of our deployments happen in same env, we can look at these specific logs on need basis. With this change, the master branch will be easier to debug with lesser logs. Change-Id: I4157a7ec7d5ec9c2948b2bbc1e4cb8317f28d6b8 Updates: bz#1666833 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	rpc: use address-family option from vol file	Milind Changire	2019-01-22	3	-2/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch helps enable IPv6 connections in the cluster. The default address-family is IPv4 without using this option explicitly. When address-family is set to "inet6" in the /etc/glusterfs/glusterd.vol file, the mount command-line also needs to have -o xlator-option="transport.address-family=inet6" added to it. This option also gets added to the brick command-line. Snapshot and gfapi use-cases should also use this option to pass in the inet6 address-family. Change-Id: I97db91021af27bacb6d7578e33ea4817f66d7270 fixes: bz#1635863 Signed-off-by: Milind Changire <mchangir@redhat.com>
*	common-utils: make vector a const parameter	Xiubo Li	2019-01-22	1	-1/+1
\| \| \| \| \| \| \| \|	To avoid the warning and preparing for adding writesame support. Updates: #617 Change-Id: I0710b1e4c240368a9bf52968bddc6e250ae2028d Signed-off-by: Xiubo Li <xiubli@redhat.com>
*	core: Feature added to accept CidrIp in auth.allow	Rinku Kothiya	2019-01-18	3	-6/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added functionality to gluster volume set auth.allow command to accept CIDR IP addresses. Modified few functions to isolate cidr feature so that it prevents other gluster commands such as peer probe to use cidr format ip. The functions are modified in such a way that they have an option to enable accepting of cidr format for other gluster commands if required in furture. updates: bz#1138841 Change-Id: Ie6734002a7078f1820e5df42d404411cce945e8b Credits: Mohit Agrawal Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
*	lock: Add fencing support	Susant Palai	2019-01-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	design reference: https://review.gluster.org/#/c/glusterfs-specs/+/21925/ This patch adds the lock preempt support. Note: The current model stores lock enforcement information as separate xattr on disk. There is another effort going in parallel to store this in stat(x) of the file. This patch is self sufficient to add fencing support. Based on the availability of the stat(x) support either I will rebase this patch or we can modify the necessary bits post merging this patch. Change-Id: If4a42f3e0afaee1f66cdb0360ad4e0c005b5b017 updates: #466 Signed-off-by: Susant Palai <spalai@redhat.com>