summaryrefslogtreecommitdiffstats
path: root/rpc
Commit message (Collapse)AuthorAgeFilesLines
* gluster: IPv6 single stack supportKevin Vigor2017-10-245-0/+116
| | | | | | | | | | | | | | | | | | | | | | Summary: - This diff changes all locations in the code to prefer inet6 family instead of inet. This will allow change GlusterFS to operate via IPv6 instead of IPv4 for all internal operations while still being able to serve (FUSE or NFS) clients via IPv4. - The changes apply to NFS as well. - This diff ports D1892990, D1897341 & D1896522 to the 3.8 branch. Test Plan: Prove tests! Reviewers: dph, rwareing Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: I34fdaaeb33c194782255625e00616faf75d60c33 BUG: 1406898 Reviewed-on-3.8-fb: http://review.gluster.org/16059 Reviewed-by: Shreyas Siravara <sshreyas@fb.com> Tested-by: Shreyas Siravara <sshreyas@fb.com>
* build: make it possible to build cleanly 2x in a rowNiels de Vos2017-10-131-1/+1
| | | | | | | | | | | | | | | | | 'make clean' does not cleanup everything, and some of the files get cleaned too eagerly. Several files are being packaged in a 'make dist' tarball, that get rebuild each time anyway. Specifically, this change prevents - libglusterfs/src/generator.pyc from laying around - keeping rpc/xdr/gen/*.x symlinks - modifying tests/basic/{fuse,gfapi}/Makefile each run - including tests/env.rc and events/src/eventtypes.py in the tarball Change-Id: I774dd1abf3a9d3b6a89b938cf6ee7d7792c59a82 BUG: 1501317 Reported-by: Patrick Matthäi <pmatthaei@debian.org> Signed-off-by: Niels de Vos <ndevos@redhat.com>
* rpc: free conn->name in rpc_clnt_destroy()Niels de Vos2017-10-111-0/+1
| | | | | | Change-Id: Idf99908aa48718a7faf7f0bbc647a679ec548282 BUG: 1443145 Signed-off-by: Niels de Vos <ndevos@redhat.com>
* rpc: free registered callback programsNiels de Vos2017-10-111-0/+7
| | | | | | Change-Id: I8c6f6b642f025d1faf74015b8f7aaecd7ebfd4d5 BUG: 1443145 Signed-off-by: Niels de Vos <ndevos@redhat.com>
* rpc/rpc-transport/socket/: Coverity Issue MISSING_BREAKGirjesh Rajoria2017-09-291-1/+1
| | | | | | | | | | | | | Issue: Event unterminated_case: The case for value "AF_INET_SDP" is not terminated by a 'break' statement. Function: socket_server_get_local_sockaddr Added Fall through in comment after case AF_INET_SDP. Change-Id: I5340cde2257be33ff13c53cee6ccdb33df7d2c88 BUG: 789278 Signed-off-by: Girjesh Rajoria <grajoria@redhat.com>
* heal: New feature heal info summary to list the status of brick and count of ↵Mohamed Ashiq Liyazudeen2017-09-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | entries to be healed Command output: Brick 192.168.2.8:/brick/1 Status: Connected Total Number of entries: 363 Number of entries in heal pending: 362 Number of entries in split-brain: 0 Number of entries possibly healing: 1 <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <cliOutput> <healInfo> <bricks> <brick hostUuid="9105dd4b-eca8-4fdb-85b2-b81cdf77eda3"> <name>192.168.2.8:/brick/1</name> <status>Connected</status> <totalNumberOfEntries>363</numberOfEntries> <numberOfEntriesInHealPending>362</numberOfEntriesInHealPending> <numberOfEntriesInSplitBrain>0</numberOfEntriesInSplitBrain> <numberOfEntriesPossiblyHealing>1</numberOfEntriesPossiblyHealing> </brick> </bricks> </healInfo> <opRet>0</opRet> <opErrno>0</opErrno> <opErrstr/> </cliOutput> Change-Id: I40cb6f77a14131c9e41b292f4901b41a228863d7 BUG: 1261463 Signed-off-by: Mohamed Ashiq Liyazudeen <mliyazud@redhat.com> Reviewed-on: https://review.gluster.org/12154 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Karthik U S <ksubrahm@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* rpc-transport/socket: tab -> spaces cleanupKaleb S. KEITHLEY2017-09-152-466/+466
| | | | | | | | | | | tired of review comments about use of tabs when the whole file uses tabs Change-Id: I4f822a53f47886da04282f9c3fb84d81a7b3f8d0 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: https://review.gluster.org/18286 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* Fix use-after-free in gf_rdma_do_readsMichael Scherer2017-09-151-0/+1
| | | | | | | | | | | | | | | | If iobref_new can't allocate a ref, we free the iobuf, and then go to the cleanup part of the function, that will run iobuf_unref a 2nd time, which trigger a warning with coverity. Change-Id: Ie9cf7a5d5f98244a390e44e1403c614199eb650c BUG: 789278 Signed-off-by: Michael Scherer <misc@redhat.com> Reviewed-on: https://review.gluster.org/18245 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* rpc: TLSv1_2_method() is deprecated in OpenSSL-1.1Kaleb S. KEITHLEY2017-09-131-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fedora 26 has OpenSSL-1.1. Compile-time warnings indicate that TLSv1_2_method() is now deprecated. As per the SSL man page: TLS_method(), TLS_server_method(), TLS_client_method() These are the general-purpose version-flexible SSL/TLS methods. The actual protocol version used will be negotiated to the highest version mutually supported by the client and the server. The supported protocols are SSLv3, TLSv1, TLSv1.1 and TLSv1.2. Applications should use these methods, and avoid the version- specific methods described below. ... TLSv1_2_method(), ... ... Note that OpenSSL-1.1 is the version of OpenSSL; Fedora 25 and RHEL 7.3 and other distributions (still) have OpenSSL-1.0. TLS versions are orthogonal to the OpenSSL version. TLS_method() is the new — in OpenSSL-1.1 — version flexible function intended to replace the TLSv1_2_method() function in OpenSSL-1.0 and the older (?), insecure TLSv23_method(). (OpenSSL-1.0 does not have TLS_method()) Change-Id: I190363ccffe7c25606ea2cf30a6b9ff1ec186057 BUG: 1491025 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: https://review.gluster.org/18268 Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* socket: Use granular mutex locks during pollin and pollout event processingKrutika Dhananjay2017-09-082-35/+58
| | | | | | | | | | | | | | | | | | | ... instead of one global lock. This is because pollin and pollout processing code operate on mutually exclusive members of socket_private_t. Keeping this in mind, this patch introduces the more granular priv->in_lock and priv->out_lock locks. For pollerr, which modifies both priv->incoming and priv->ioq, both locks need to be taken. Change-Id: Id7aeb608dc7755551b6b404470d5d80709c81960 BUG: 1467614 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: https://review.gluster.org/17687 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* Do not declare the variable timeout_ms if TCP_USER_TIMEOUT is not definedMichael Scherer2017-09-071-0/+2
| | | | | | | | | | | | | Trying to fix the build on FreeBSD, -Wunused-variable trigger this warning. Change-Id: I318121829ac6a609fe9c606aead257827c7748b1 BUG: 1488829 Signed-off-by: Michael Scherer <misc@redhat.com> Reviewed-on: https://review.gluster.org/18215 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Michael Scherer <misc@fedoraproject.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* Command to identify client processhari gowtham2017-09-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | command: gluster volume status <volname/all> client-list output: Client connections for volume v1 Name count ----- ------ fuse 2 tierd 1 total clients for volume v1 : 3 ----------------------------------------------------------------- Client connections for volume v2 Name count ----- ------ tierd 1 fuse.gsync 1 total clients for volume v2 : 2 ----------------------------------------------------------------- Updates: #178 Change-Id: I0ff2579d6adf57cc0d3bd0161a2ec6ac6c4747c0 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/18095 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* rpc: destroy transport after client_tMilind Changire2017-08-311-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: 1. Ref counting increment on the client_t object is done in rpcsvc_request_init() which is incorrect. 2. Ref not taken when delegating to grace_time_handler() Solution: 1. Only fop requests which require processing down the graph via stack 'frames' now ref count the request in get_frame_from_request() 2. Take ref on client_t object in server_rpc_notify() but avoid dropping in RPCSVC_EVENT_TRANSPORT_DESRTROY. Drop the ref unconditionally when exiting out of grace_time_handler(). Also, avoid dropping ref on client_t in RPCSVC_EVENT_TRANSPORT_DESTROY when ref mangement as been delegated to grace_time_handler() Change-Id: Ic16246bebc7ea4490545b26564658f4b081675e4 BUG: 1481600 Reported-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/17982 Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* Revert "rpc: disable client on disconnection from rebalance"Milind Changire2017-08-241-4/+0
| | | | | | | | | | | | | | This reverts commit 5b14c11d3cae38bc66006b02217ede485ae30dea. BUG: 1484225 Change-Id: I3269d3fc64de3f3cc6f670ea564a87d7725e10fd Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/18113 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* rpc: disable client on disconnection from rebalanceMilind Changire2017-08-231-0/+4
| | | | | | | | | | | | | | | | | | | Problem: glusterd rpc code path attempts to reconnect to rebalance process via the reconnect timer even after the rebalance process disconnection Solution: Set the clnt->disabled flag to 1 to avoid reconnection and cause the clnt object to be unref'd Change-Id: I4e38eaef45d2fdea86d25e9dff9f1af0cd29cf66 BUG: 1484225 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/18093 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* tier: separation of attach-tier from add-brickhari gowtham2017-08-011-0/+1
| | | | | | | | | | | | | | | | | | PROBLEM: Both attach tier and add brick have the same RPC and set of code. This becomes a hurdle while tring to implement add brick on a tiered volume. FIX: This patch separates the add brick and attach tier giving them separate RPCs. Change-Id: Iec57e972be968a9ff00b15b507e56a4f6dc398a2 BUG: 1376326 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/15503 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* rpc: prevent logging null client [Invalid argument]Prashanth Pai2017-07-251-1/+3
| | | | | | | | | | | | | | | | | | | | | The following log entry is observed during volume operations such as volume create: [2017-07-20 05:13:43.213797] E [client_t.c:321:gf_client_ref] (-->/usr/local/lib/libgfrpc.so.0(rpcsvc_request_create+0x1a4) [0x7f987f66cd20] -->/usr/local/lib/libgfrpc.so.0(rpcsvc_request_init+0xd0) [0x7f987f66ca23] -->/usr/local/lib/libglusterfs.so.0(gf_client_ref+0x56) [0x7f987f91cbd5] ) 0-client_t: null client [Invalid argument] Change-Id: I49ba753e8d1a828bb275b0ccb1a181706774f387 BUG: 1193929 Signed-off-by: Prashanth Pai <ppai@redhat.com> Reviewed-on: https://review.gluster.org/17848 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Amar Tumballi <amarts@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterd: Add option to get all volume options through get-state CLISamikshan Bairagya2017-07-251-1/+2
| | | | | | | | | | | | | | | | | | | | This commit makes the get-state CLI capable to returning the values for all volume options for all volumes. This is similar to what you get when you issue a `gluster volume get <volname> all` command. This is the new usage for the get-state CLI: # gluster get-state [<daemon>] [[odir </path/to/output/dir/>] \ [file <filename>]] [detail|volumeoptions] Fixes: #277 Change-Id: Ice52d936a5a389c6fa0ba5ab32416a65cdfde46d Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: https://review.gluster.org/17858 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Gaurav Yadav <gyadav@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* rpc: include current second in timed out frame cleanupMilind Changire2017-07-211-1/+1
| | | | | | | | | | | | | | | | | | | Problem: frames which time out at current second are missed out Solution: change test to include frames timing out at current second i.e. timeout <= current.tv_sec instead of timeout < current.tv_sec Change-Id: I459d47856ade2b657a0289e49f7f63da29186d6e BUG: 1468433 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/17722 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* libglusterfs: Name threads on creationRaghavendra Talur2017-07-193-9/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set names to threads on creation for easier debugging. Output of top -H -p <PID-OF-GLUSTERFSD> Before: 19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd 19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd 19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd 19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd After: 19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustertimer 19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustermemsweep 19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc0 19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc1 19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll0 19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteridxwrker 19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteriotwr0 19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrssign 19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrswrker 19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterclogecon 19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd0 19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd1 19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd2 19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixjan 19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixfsy 25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll1 5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll2 7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixhc Change-Id: Id5f333755c1ba168a2ffaa4fce6e71c375e10703 BUG: 1254002 Updates: #271 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: https://review.gluster.org/11926 Reviewed-by: Prashanth Pai <ppai@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* program/GF-DUMP: Shield ping processing from traffic to GlusterfsRaghavendra G2017-07-182-5/+103
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Program Since poller thread bears the brunt of execution till the request is handed over to io-threads, poller thread experiencies lock contention(s) in the control flow till io-threads, which slows it down. This delay invariably affects reading ping requests from network and responding to them, resulting in increased ping latencies, which sometimes results in a ping-timer-expiry on client leading to disconnect of transport. So, this patch aims to free up poller thread from executing code of Glusterfs Program. We do this by making * Glusterfs Program registering itself asking rpcsvc to execute its actors in its own threads. * GF-DUMP Program registering itself asking rpcsvc to _NOT_ execute its actors in its own threads. Otherwise program's ownthreads become bottleneck in processing ping traffic. This means that poller thread reads a ping packet, invokes its actor and hands the response msg to transport queue. Change-Id: I526268c10bdd5ef93f322a4f95385137550a6a49 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1421938 Reviewed-on: https://review.gluster.org/17105 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* socket: call init_openssl_mt() in init() and add cleanupPrashanth Pai2017-07-171-25/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | init_openssl_mt() wasn't explicitly invoked and was run implicitly before dlopen() returned as it was tagged as __attribute__ ((constructor)). This function used to call GF_CALLOC() which wasn't available or initialized when socket.so is dlopen()ed by an external program. The program used to crash with SIGSEGV as follows: 0x00007ffff5efe1ad in __gf_calloc (nmemb=41, size=40, type=158, typestr=0x7ffff63eb3d6 "gf_sock_mt_lock_array") at mem-pool.c:109 0x00007ffff63e6acf in init_openssl_mt () at socket.c:4016 0x00007ffff7de90da in call_init.part () from /lib64/ld-linux-x86-64.so.2 0x00007ffff7de91eb in _dl_init () from /lib64/ld-linux-x86-64.so.2 0x00007ffff7dedde1 in dl_open_worker () from /lib64/ld-linux-x86-64.so.2 0x00007ffff7de8f84 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2 0x00007ffff7ded339 in _dl_open () from /lib64/ld-linux-x86-64.so.2 This change moves call to init_openssl_mt() from being a constructor function to the init() function and introduces fini_openssl_mt() which cleans up resources (called in destructor). BUG: 1193929 Change-Id: Iab690897ec34e24c33f6b43f8d8d9f8fd75ac607 Signed-off-by: Prashanth Pai <ppai@redhat.com> Reviewed-on: https://review.gluster.org/17753 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* Link against missed libraries to resolve symbolsPrashanth Pai2017-07-032-2/+5
| | | | | | | | | | | | | | | | | | | | | | When external programs perform a dlopen("..so", RTLD_LAZY|RTLD_LOCAL) on some shared objects like xlators, it can fail with dlerror set to error string "undefined symbol <some-type>". This was observed for the following shared objects: fuse.so, quota.so, quotad.so, server.so, libgfrpc.so and socket.so P.S: This was found while running a go program which fetches the list of xlator options (volume_option_t) from xlator's shared object. BUG: 1193929 Change-Id: I7b958409cf11fb67c2be32a3f85a96fb1260236b Signed-off-by: Prashanth Pai <ppai@redhat.com> Reviewed-on: https://review.gluster.org/17659 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* multiple: fix struct/typedef inconsistenciesJeff Darcy2017-06-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | The most common pattern, both in our code and elsewhere, is this: struct _xyz { ... }; typedef struct _xyz xyz_t; These exceptions - especially call_frame/call_stack - have been slowing down code navigation for years. By converging on a single pattern, navigating from xyz_t in code to the actual definition of struct _xyz (i.e. without having to visit the typedef first) might even be automatable. Change-Id: I0e5dd1f51f98e000173c62ef4ddc5b21d9ec44ed Signed-off-by: Jeff Darcy <jdarcy@fb.com> Reviewed-on: https://review.gluster.org/17650 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* protocol/server: make listen backlog value as configurableMohammed Rafi KC2017-06-083-11/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | problem: When we call listen from protocol/server, we are giving a hard coded valie of 10 if it is not manually given. With multiplexing, especially when glusterd restarts all clients may try to connect to the server at a time. Which will result in overflowing the queue, and kernel will complain about the errors. Solution: This patch will introduce a volume set command to make backlog value as a configurable. This patch also changes the default values for backlog from 10 to 128. This changes is only applicable for sockets listening from protocol. Example: gluster volume set <volname> transport.listen-backlog 1024 Note: 1 Brick has to be restarted to get this value in effect 2 This changes won't be reflected in glusterd, or other xlators which calls listen. If you need, you have to add this option to the volfile. Change-Id: I0c5a2bbf28b5db612f9979e7560e05dd82b41477 BUG: 1456405 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: https://review.gluster.org/17411 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* glusterfs: Not able to mount running volume after enable brick mux and ↵Mohit Agrawal2017-05-311-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | stopped any volume Problem: After enabled brick mux if any volume has down and then try ot run mount with running volume , mount command is hung. Solution: After enable brick mux server has shared one data structure server_conf for all associated subvolumes.After down any subvolume in some ungraceful manner (remove brick directory) posix xlator sends GF_EVENT_CHILD_DOWN event to parent xlatros and server notify updates the child_up to false in server_conf.When client is trying to communicate with server through mount it checks conf->child_up and it is FALSE so it throws message "translator are not yet ready". From this patch updated structure server_conf to save child_up status for xlator wise. Another improtant correction from this patch is cleanup threads from server side xlators after stop the volume. BUG: 1453977 Change-Id: Ic54da3f01881b7c9429ce92cc569236eb1d43e0d Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: https://review.gluster.org/17356 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* socket/reconfigure: reconfigure should be done on new dictMohammed Rafi KC2017-05-291-8/+8
| | | | | | | | | | | | | | In socket reconfigure, reconfigurations are doing with old dict values. It should be with new reconfigured dict values Change-Id: Iac5ad4382fe630806af14c99bb7950a288756a87 BUG: 1456405 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: https://review.gluster.org/17412 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* socket: Avoid flooding of error message in case of SSLMohit Agrawal2017-05-151-1/+10
| | | | | | | | | | | | | | | | | Problem: socket poller is throwing Input/Output messages during volume operation Solution: Update the code in socket_connect function before call socket_spwan. BUG: 1450559 Change-Id: I5f275fe7a4b730b16d7b0a0407e76288b07ceaef Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: https://review.gluster.org/17280 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* event/epoll: Add back socket for polling of events immediately afterRaghavendra G2017-05-122-19/+83
| | | | | | | | | | | | | | | | | | | | | | | | | reading the entire rpc message from the wire Currently socket is added back for future events after higher layers (rpc, xlators etc) have processed the message. If message processing involves signficant delay (as in writev replies processed by Erasure Coding), performance takes hit. Hence this patch modifies transport/socket to add back the socket for polling of events immediately after reading the entire rpc message, but before notification to higher layers. credits: Thanks to "Kotresh Hiremath Ravishankar" <khiremat@redhat.com> for assitance in fixing a regression in bitrot caused by this patch. Change-Id: I04b6b9d0b51a1cfb86ecac3c3d87a5f388cf5800 BUG: 1448364 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: https://review.gluster.org/15036 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* rpc: fix a routine to destory RDMA qp(queue-pair)Ji-Hyeon Gim2017-05-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Problem: If an error has occured with rdma_create_id() in gf_rdma_connect(), process will jump to the 'unlock' label and then call gf_rdma_teardown() which call __gf_rdma_teardown(). Presently, __gf_rdma_teardown() checks InifiniBand QP with peer->cm_id->qp! Unfortunately, cm_id is not allocated and will be crushed in this situation :) Solution: If 'this->private->peer->cm_id' member is null, do not check 'this->private->peer->cm_id->qp'. Change-Id: Ie321b8cf175ef4f1bdd9733d73840f03ddff8c3b BUG: 1449495 Signed-off-by: Ji-Hyeon Gim <potatogim@potatogim.net> Reviewed-on: https://review.gluster.org/17249 Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Prashanth Pai <ppai@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Ji-Hyeon Gim CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* rpc: use GF_ATOMIC_INC to generate rpc_clnt's callidZhou Zhengping2017-05-082-17/+3
| | | | | | | | | | | | Change-Id: I57ad970411db1ccd3d2c56c504c7da9cc221051f BUG: 1448692 Signed-off-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Reviewed-on: https://review.gluster.org/17198 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* rpc: Remove accidental IPV6 changesKaushal M2017-05-055-117/+0
| | | | | | | | | | | | | | They snuck in with the HALO patch (07cc8679c) Change-Id: I8ced6cbb0b49554fc9d348c453d4d5da00f981f6 BUG: 1447953 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: https://review.gluster.org/17174 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* build: 'make cscope' requires generated .c and .h XDR filesNiels de Vos2017-05-051-0/+4
| | | | | | | | | | | | | | | | | When running 'make cscope' on a clean configured tree (before building anything), it fails because of missing files unders rpc/xdr/src/. These files are symlinked from generated files in the rpc/xdr/gen/ directory. In order to create the symlinks for 'make cscope', a target needs to specify how these symlinks are created. Change-Id: I473c90e10d915ee438425cf0f806c0531b9f582a BUG: 1447966 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/17176 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* rpc: fix transport add/remove race on port probingMilind Changire2017-05-031-165/+196
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Spurious __gf_free() assertion failures seen all over the place with header->magic being overwritten when running port probing tests with 'nmap' Solution: Fix sequence of: 1. add accept()ed socket connection fd to epoll set 2. add newly created rpc_transport_t object in RPCSVC service list Correct sequence is #2 followed by #1. Reason: Adding new fd returned by accept() to epoll set causes an epoll_wait() to return immediately with a POLLIN event. This races ahead to a readv() which returms with errno:104 (Connection reset by peer) during port probing using 'nmap'. The error is then handled by POLLERR code to remove the new transport object from RPCSVC service list and later unref and destroy the rpc transport object. socket_server_event_handler() then catches up with registering the unref'd/destroyed rpc transport object. This is later manifest as assertion failures in __gf_free() with the header->magic field botched due to invalid address references. All this does not result in a Segmentation Fault since the address space continues to be mapped into the process and pages still being referenced elsewhere. As a further note: This race happens only in accept() codepath. Only in this codepath, the notify will be referring to two transports: 1, listener transport and 2. newly accepted transport All other notify refer to only one transport i.e., the transport/socket on which the event is received. Since epoll is ONE_SHOT another event won't arrive on the same socket till the current event is processed. However, in the accept() codepath, the current event - ACCEPT - and the new event - POLLIN/POLLER - arrive on two different sockets: 1. ACCEPT on listener socket and 2. POLLIN/POLLERR on newly registered socket. Also, note that these two events are handled different thread contexts. Cleanup: Critical section in socket_server_event_handler() has been removed. Instead, an additional ref on new_trans has been used to avoid ref/unref race when notifying RPCSVC. Change-Id: I4417924bc9e6277d24bd1a1c5bcb7445bcb226a3 BUG: 1438966 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/17139 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* Halo Replication feature for AFR translatorKevin Vigor2017-05-028-14/+187
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Halo Geo-replication is a feature which allows Gluster or NFS clients to write locally to their region (as defined by a latency "halo" or threshold if you like), and have their writes asynchronously propagate from their origin to the rest of the cluster. Clients can also write synchronously to the cluster simply by specifying a halo-latency which is very large (e.g. 10seconds) which will include all bricks. In other words, it allows clients to decide at mount time if they desire synchronous or asynchronous IO into a cluster and the cluster can support both of these modes to any number of clients simultaneously. There are a few new volume options due to this feature: halo-shd-latency: The threshold below which self-heal daemons will consider children (bricks) connected. halo-nfsd-latency: The threshold below which NFS daemons will consider children (bricks) connected. halo-latency: The threshold below which all other clients will consider children (bricks) connected. halo-min-replicas: The minimum number of replicas which are to be enforced regardless of latency specified in the above 3 options. If the number of children falls below this threshold the next best (chosen by latency) shall be swapped in. New FUSE mount options: halo-latency & halo-min-replicas: As descripted above. This feature combined with multi-threaded SHD support (D1271745) results in some pretty cool geo-replication possibilities. Operational Notes: - Global consistency is gaurenteed for synchronous clients, this is provided by the existing entry-locking mechanism. - Asynchronous clients on the other hand and merely consistent to their region. Writes & deletes will be protected via entry-locks as usual preventing concurrent writes into files which are undergoing replication. Read operations on the other hand should never block. - Writes are allowed from _any_ region and propagated from the origin to all other regions. The take away from this is care should be taken to ensure multiple writers do not write the same files resulting in a gfid split-brain which will require resolution via split-brain policies (majority, mtime & size). Recommended method for preventing this is using the nfs-auth feature to define which region for each share has RW permissions, tiers not in the origin region should have RO perms. TODO: - Synchronous clients (including the SHD) should choose clients from their own region as preferred sources for reads. Most of the plumbing is in place for this via the child_latency array. - Better GFID split brain handling & better dent type split brain handling (i.e. create a trash can and move the offending files into it). - Tagging in addition to latency as a means of defining which children you wish to synchronously write to Test Plan: - The usual suspects, clang, gcc w/ address sanitizer & valgrind - Prove tests Reviewers: jackl, dph, cjh, meyering Reviewed By: meyering Subscribers: ethanr Differential Revision: https://phabricator.fb.com/D1272053 Tasks: 4117827 Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1 BUG: 1428061 Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: http://review.gluster.org/16099 Reviewed-on: https://review.gluster.org/16177 Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* glusterd: Fix removing pmap entry on rpc disconnectPrashanth Pai2017-04-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Problem: The following line of code intended to remove pmap entry for the connection during disconnects: pmap_registry_remove (this, 0, NULL, GF_PMAP_PORT_NONE, xprt); However, no pmap entry will have it's type set to GF_PMAP_PORT_NONE at any point in time. So a call to pmap_registry_search_by_xprt() in pmap_registry_remove() will always fail to find a match. Fix: Optionally ignore pmap entry's type in pmap_registry_search_by_xprt(). BUG: 1193929 Change-Id: I705f101739ab1647ff52a92820d478354407264a Signed-off-by: Prashanth Pai <ppai@redhat.com> Reviewed-on: https://review.gluster.org/17129 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* glusterd: Add client details to get-state outputSamikshan Bairagya2017-04-121-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | This commit optionally adds client details corresponding to the locally running bricks to the get-state output. Since getting the client details involves sending RPC requests to the respective local bricks, this is a relatively more costly operation. These client details would be added to the get-state output only if the get-state command is invoked with the 'detail' option. This commit therefore also changes the get-state CLI usage. The modified usage is as follows: # gluster get-state [<daemon>] [[odir </path/to/output/dir/>] \ [file <filename>]] [detail] Change-Id: I42cd4ef160f9e96d55a08a10d32c8ba44e4cd3d8 BUG: 1431183 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: https://review.gluster.org/17003 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* rpc: add options to manage socket keepalive lifespanMilind Changire2017-04-122-42/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Default values for handling socket timeouts for brick responses are insufficient for aggressive applications such as databases. Solution: Add 1:1 gluster options for keepalive, keepalive-idle, keepalive-interval and keepalive-timeout as per the socket level options available as per tcp(7) man page. Default values for options are NOT agressive and continue to be values which result in default timeout when only the keep alive option is turned on. These options are Linux specific and will not be applicable to the *BSDs. Change-Id: I2a08ecd949ca8ceb3e090d336ad634341e2dbf14 BUG: 1426059 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/16731 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* build: place generated XDR .h and .c files under $(top_builddir)Niels de Vos2017-04-041-5/+6
| | | | | | | | | | | | | Change-Id: I0487337223a54a52e73088cb6dd812ce6d47178d BUG: 1429696 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/16994 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Tested-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* glusterd : Disallow peer detach if snapshot bricks exist on itGaurav Yadav2017-03-311-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem : - Deploy gluster on 2 nodes, one brick each, one volume replicated - Create a snapshot - Lose one server - Add a replacement peer and new brick with a new IP address - replace-brick the missing brick onto the new server (wait for replication to finish) - peer detach the old server - after doing above steps, glusterd fails to restart. Solution: With the fix detach peer will populate an error : "N2 is part of existing snapshots. Remove those snapshots before proceeding". While doing so we force user to stay with that peer or to delete all snapshots. Change-Id: I3699afb9b2a5f915768b77f885e783bd9b51818c BUG: 1322145 Signed-off-by: Gaurav Yadav <gyadav@redhat.com> Reviewed-on: https://review.gluster.org/16907 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* build: errors generating xdr stubs+headers with `make -j`Kaleb S. KEITHLEY2017-03-291-5/+5
| | | | | | | | | | | | | | | Using a makebomb, on f23 at least, blows up when generating the xdr headers and stubs. (Works reliably on f25 though, go figure.) This change appears to mitigate the race on f23. BUG: 1429696 Change-Id: Icca2de035255a7759563bf06d853950dddc5724d Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: https://review.gluster.org/16954 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* build: errors generating xdr stubs+headers with `make -j`Kaleb S. KEITHLEY2017-03-271-3/+3
| | | | | | | | | | | | | | | | Using a makebomb, on f23 at least, blows up when generating the xdr headers and stubs. (Works reliably on f25 though, go figure.) This change appears to mitigate the race on f23. Change-Id: I006066f0e7c3f8b65189f97c70089f3422e3e08b BUG: 1429696 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: https://review.gluster.org/16941 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Anoop C S <anoopcs@redhat.com>
* build: libgfxdr.so calls GF_FREE(), needs to link with -lglusterfsKaleb S. KEITHLEY2017-03-212-47/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous change to remove the xdrgen script exposed (or created) a recursive build dependency: libglusterfs needs the generated headers, and libgfxdr should be linked with libglusterfs for GF_FREE/__gf_free. (Much grumbling about libglusterfs being the kitchen sink of gluster elided. This would not be necessary if there were two or more libs, a gluster "runtime" library with common gluster code shared by the xlators and daemons, and a utility library with things like the rbtree, memory allocation, and whatnot.) So. Link at build time or link at runtime? For truth-and-beauty, link with libglusterfs.so at build time. Without truth-and-beauty, don't link with libglusterfs and rely on the other things that link with libglusterfs to provide resolution of __gf_free(). Truth-and-beauty it is. But how to generate the headers first, then build libglusterfs, then come back and build libgfxdr? Autotools is a maze of twisty passages, all different. Things that work with gnu make on linux don't work with the BSD make. Finally I hit on this solution. Add a shadow directory where make only generates the headers, then build libglusterfs using the generated headers, and finally build libgfxdr and link with libglusterfs. See original BZ 1330604 change http://review.gluster.org/14085 Change-Id: Iede8a30e3103176cb8f0b054885f30fcb352492b BUG: 1429696 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: https://review.gluster.org/16873 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* rpc: bump up conn->cleanup_gen in rpc_clnt_reconnect_cleanupAtin Mukherjee2017-03-201-1/+3
| | | | | | | | | | | | | | | | | | | | | | | Commit 086436a introduced generation number (cleanup_gen) to ensure that rpc layer doesn't end up cleaning up the connection object if application layer has already destroyed it. Bumping up cleanup_gen was done only in rpc_clnt_connection_cleanup (). However the same is needed in rpc_clnt_reconnect_cleanup () too as with out it if the object gets destroyed through the reconnect event in the application layer, rpc layer will still end up in trying to delete the object resulting into double free and crash. Peer probing an invalid host/IP was the basic test to catch this issue. Change-Id: Id5332f3239cb324cead34eb51cf73d426733bd46 BUG: 1433578 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/16914 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Milind Changire <mchangir@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* transport: allow OS to assign us a portKevin Vigor2017-03-122-0/+10
| | | | | | | | | | | | | | | | Replace complex and slow port selection code with bind(0) which already respects privileged ports. Change-Id: I408a8528e58e1aafcd32eba6a8f1a759e0bf274e BUG: 1405628 Reviewed-on-release-3.8-fb: http://review.gluster.org/16150 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16178 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpc: avoid logging success on failureMilind Changire2017-03-072-0/+7
| | | | | | | | | | | | | | | | Avoid logging Success in the event of failure especially when errno has no meaningful value w.r.t. the failure. In this case the errno is set to zero when there's indeed a failure at the RPC level. Change-Id: If2cc81aa1e590023ed22892dacbef7cac213e591 BUG: 1426032 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/16730 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* Return ret as soon as `this` is freedNigel Babu2017-03-011-0/+1
| | | | | | | | | | | | | | | Omitting this return causes a confusing error message in the next block, which is also a case of use after free. This bug was found by Coverity scan. BUG: 789278 Change-Id: Ifd4932de437e8ae875ff191033ea43cff81b701d Signed-off-by: Nigel Babu <nigelb@redhat.com> Reviewed-on: https://review.gluster.org/16790 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* rpc: log more about socket disconnectsMilind Changire2017-03-011-3/+15
| | | | | | | | | | | | | | | | | Log more about the different paths leading to socket disconnect for ease of debugging. Log via gf_log_callingfn() in __socket_disconnect() at loglevel TRACE if socket connection is being torn down. Change-Id: I1e551c2d685784b5ec747f481179f64d524c0461 BUG: 1426125 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/16732 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* rpc/clnt: remove locks while notifying CONNECT/DISCONNECTRaghavendra G2017-03-012-31/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Locking during notify was introduced as part of commit aa22f24f5db7659387704998ae01520708869873 [1]. The fix was introduced to fix out-of-order CONNECT/DISCONNECT events from rpc-clnt to parent xlators [2]. However as part of handling DISCONNECT protocol/client does unwind saved frames (with failure) waiting for responses. This saved_frames_unwind can be a costly operation and hence ideally shouldn't be included in the critical section of notifylock, as it unnecessarily delays the reconnection to same brick. Also, its not a good practise to pass control to other xlators holding a lock as it can lead to deadlocks. So, this patch removes locking in rpc-clnt while notifying parent xlators. To fix [2], two changes are present in this patch: * notify DISCONNECT before cleaning up rpc connection (same as commit a6b63e11b7758cf1bfcb6798, patch [3]). * protocol/client uses rpc_clnt_cleanup_and_start, which cleans up rpc connection and does a start while handling a DISCONNECT event from rpc. Note that patch [3] was reverted as rpc_clnt_start called in quick_reconnect path of protocol/client didn't invoke connect on transport as the connection was not cleaned up _yet_ (as cleanup was moved post notification in rpc-clnt). This resulted in clients never attempting connect to bricks. Note that one of the neater ways to fix [2] (without using locks) is to introduce generation numbers to map CONNECT and DISCONNECTS across epochs and ignore DISCONNECT events if they don't belong to current epoch. However, this approach is a bit complex to implement and requires time. So, current patch is a hacky stop-gap fix till we come up with a more cleaner solution. [1] http://review.gluster.org/15916 [2] https://bugzilla.redhat.com/show_bug.cgi?id=1386626 [3] http://review.gluster.org/15681 Change-Id: I62daeee8bb1430004e28558f6eb133efd4ccf418 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1427012 Reviewed-on: https://review.gluster.org/16784 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Milind Changire <mchangir@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* Use int instead of int8_t for the 3 variablesMichael Scherer2017-02-281-1/+1
| | | | | | | | | | | | | | | | | | | | | Since strcmp return a int, and since the spec of strcmp do not tell the return value, it could return 256 and this would overflow. Found by Coverity scan. (thanks to Stéphane Marcheusin who explained the details to me) Change-Id: I5195e05b44f8b537226e6cee178d95a1ab904e96 BUG: 789278 Signed-off-by: Michael Scherer <misc@redhat.com> Reviewed-on: https://review.gluster.org/16738 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Michael Scherer <misc@fedoraproject.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>