summaryrefslogtreecommitdiffstats
path: root/rpc
Commit message (Collapse)AuthorAgeFilesLines
* socket/reconfigure: reconfigure should be done on new dictMohammed Rafi KC2017-07-031-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | In socket reconfigure, reconfigurations are doing with old dict values. It should be with new reconfigured dict values Backport of> >Change-Id: Iac5ad4382fe630806af14c99bb7950a288756a87 >BUG: 1456405 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: https://review.gluster.org/17412 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: Iac5ad4382fe630806af14c99bb7950a288756a87 BUG: 1463517 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: https://review.gluster.org/17588 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* event/epoll: Add back socket for polling of events immediately after reading ↵Raghavendra G2017-06-052-19/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the entire rpc message from the wire Currently socket is added back for future events after higher layers (rpc, xlators etc) have processed the message. If message processing involves signficant delay (as in writev replies processed by Erasure Coding), performance takes hit. Hence this patch modifies transport/socket to add back the socket for polling of events immediately after reading the entire rpc message, but before notification to higher layers. credits: Thanks to "Kotresh Hiremath Ravishankar" <khiremat@redhat.com> for assitance in fixing a regression in bitrot caused by this patch. >Reviewed-on: https://review.gluster.org/15036 >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Amar Tumballi <amarts@redhat.com> Change-Id: I04b6b9d0b51a1cfb86ecac3c3d87a5f388cf5800 BUG: 1456259 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: https://review.gluster.org/17391 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* rpc: fix a routine to destory RDMA qp(queue-pair)Ji-Hyeon Gim2017-05-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is backport of https://review.gluster.org/#/c/17249/ Problem: If an error has occured with rdma_create_id() in gf_rdma_connect(), process will jump to the 'unlock' label and then call gf_rdma_teardown() which call __gf_rdma_teardown(). Presently, __gf_rdma_teardown() checks InifiniBand QP with peer->cm_id->qp! Unfortunately, cm_id is not allocated and will be crushed in this situation :) Solution: If 'this->private->peer->cm_id' member is null, do not check 'this->private->peer->cm_id->qp'. > Change-Id: Ie321b8cf175ef4f1bdd9733d73840f03ddff8c3b > BUG: 1449495 > Signed-off-by: Ji-Hyeon Gim <potatogim@potatogim.net> > Reviewed-on: https://review.gluster.org/17249 > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Prashanth Pai <ppai@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Ji-Hyeon Gim > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> (cherry picked from commit ccfa06767f1282d9a3783e37555515a63cc62e69) Change-Id: Ie321b8cf175ef4f1bdd9733d73840f03ddff8c3b BUG: 1450565 Signed-off-by: Ji-Hyeon Gim <potatogim@gluesys.com> Reviewed-on: https://review.gluster.org/17282 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Ji-Hyeon Gim NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* rpc: fix transport add/remove race on port probingMilind Changire2017-05-121-165/+196
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Spurious __gf_free() assertion failures seen all over the place with header->magic being overwritten when running port probing tests with 'nmap' Solution: Fix sequence of: 1. add accept()ed socket connection fd to epoll set 2. add newly created rpc_transport_t object in RPCSVC service list Correct sequence is #2 followed by #1. Reason: Adding new fd returned by accept() to epoll set causes an epoll_wait() to return immediately with a POLLIN event. This races ahead to a readv() which returms with errno:104 (Connection reset by peer) during port probing using 'nmap'. The error is then handled by POLLERR code to remove the new transport object from RPCSVC service list and later unref and destroy the rpc transport object. socket_server_event_handler() then catches up with registering the unref'd/destroyed rpc transport object. This is later manifest as assertion failures in __gf_free() with the header->magic field botched due to invalid address references. All this does not result in a Segmentation Fault since the address space continues to be mapped into the process and pages still being referenced elsewhere. As a further note: This race happens only in accept() codepath. Only in this codepath, the notify will be referring to two transports: 1, listener transport and 2. newly accepted transport All other notify refer to only one transport i.e., the transport/socket on which the event is received. Since epoll is ONE_SHOT another event won't arrive on the same socket till the current event is processed. However, in the accept() codepath, the current event - ACCEPT - and the new event - POLLIN/POLLER - arrive on two different sockets: 1. ACCEPT on listener socket and 2. POLLIN/POLLERR on newly registered socket. Also, note that these two events are handled different thread contexts. Cleanup: Critical section in socket_server_event_handler() has been removed. Instead, an additional ref on new_trans has been used to avoid ref/unref race when notifying RPCSVC. mainline: > BUG: 1438966 > Signed-off-by: Milind Changire <mchangir@redhat.com> > Reviewed-on: https://review.gluster.org/17139 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Amar Tumballi <amarts@redhat.com> > Reviewed-by: Oleksandr Natalenko <oleksandr@natalenko.name> > Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> (cherry picked from commit 4f7ef3020edcc75cdeb22d8da8a1484f9db77ac9) Change-Id: I4417924bc9e6277d24bd1a1c5bcb7445bcb226a3 BUG: 1449191 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/17218 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* Halo Replication feature for AFR translatorKevin Vigor2017-05-087-14/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of https://review.gluster.org/16177 https://review.gluster.org/17174 Merged both these patches to make sure IPV6 changes don't make it to 3.11 at all. Summary: Halo Geo-replication is a feature which allows Gluster or NFS clients to write locally to their region (as defined by a latency "halo" or threshold if you like), and have their writes asynchronously propagate from their origin to the rest of the cluster. Clients can also write synchronously to the cluster simply by specifying a halo-latency which is very large (e.g. 10seconds) which will include all bricks. In other words, it allows clients to decide at mount time if they desire synchronous or asynchronous IO into a cluster and the cluster can support both of these modes to any number of clients simultaneously. There are a few new volume options due to this feature: halo-shd-latency: The threshold below which self-heal daemons will consider children (bricks) connected. halo-nfsd-latency: The threshold below which NFS daemons will consider children (bricks) connected. halo-latency: The threshold below which all other clients will consider children (bricks) connected. halo-min-replicas: The minimum number of replicas which are to be enforced regardless of latency specified in the above 3 options. If the number of children falls below this threshold the next best (chosen by latency) shall be swapped in. New FUSE mount options: halo-latency & halo-min-replicas: As descripted above. This feature combined with multi-threaded SHD support (D1271745) results in some pretty cool geo-replication possibilities. Operational Notes: - Global consistency is gaurenteed for synchronous clients, this is provided by the existing entry-locking mechanism. - Asynchronous clients on the other hand and merely consistent to their region. Writes & deletes will be protected via entry-locks as usual preventing concurrent writes into files which are undergoing replication. Read operations on the other hand should never block. - Writes are allowed from _any_ region and propagated from the origin to all other regions. The take away from this is care should be taken to ensure multiple writers do not write the same files resulting in a gfid split-brain which will require resolution via split-brain policies (majority, mtime & size). Recommended method for preventing this is using the nfs-auth feature to define which region for each share has RW permissions, tiers not in the origin region should have RO perms. TODO: - Synchronous clients (including the SHD) should choose clients from their own region as preferred sources for reads. Most of the plumbing is in place for this via the child_latency array. - Better GFID split brain handling & better dent type split brain handling (i.e. create a trash can and move the offending files into it). - Tagging in addition to latency as a means of defining which children you wish to synchronously write to Test Plan: - The usual suspects, clang, gcc w/ address sanitizer & valgrind - Prove tests Reviewers: jackl, dph, cjh, meyering Reviewed By: meyering Subscribers: ethanr Differential Revision: https://phabricator.fb.com/D1272053 Tasks: 4117827 >Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1 >BUG: 1428061 >Signed-off-by: Kevin Vigor <kvigor@fb.com> >Reviewed-on: http://review.gluster.org/16099 >Reviewed-on: https://review.gluster.org/16177 >Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> BUG: 1448416 Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: https://review.gluster.org/17192 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaushal M <kaushal@redhat.com>
* glusterd: Fix removing pmap entry on rpc disconnectPrashanth Pai2017-04-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Problem: The following line of code intended to remove pmap entry for the connection during disconnects: pmap_registry_remove (this, 0, NULL, GF_PMAP_PORT_NONE, xprt); However, no pmap entry will have it's type set to GF_PMAP_PORT_NONE at any point in time. So a call to pmap_registry_search_by_xprt() in pmap_registry_remove() will always fail to find a match. Fix: Optionally ignore pmap entry's type in pmap_registry_search_by_xprt(). BUG: 1193929 Change-Id: I705f101739ab1647ff52a92820d478354407264a Signed-off-by: Prashanth Pai <ppai@redhat.com> Reviewed-on: https://review.gluster.org/17129 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* glusterd: Add client details to get-state outputSamikshan Bairagya2017-04-121-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | This commit optionally adds client details corresponding to the locally running bricks to the get-state output. Since getting the client details involves sending RPC requests to the respective local bricks, this is a relatively more costly operation. These client details would be added to the get-state output only if the get-state command is invoked with the 'detail' option. This commit therefore also changes the get-state CLI usage. The modified usage is as follows: # gluster get-state [<daemon>] [[odir </path/to/output/dir/>] \ [file <filename>]] [detail] Change-Id: I42cd4ef160f9e96d55a08a10d32c8ba44e4cd3d8 BUG: 1431183 Signed-off-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-on: https://review.gluster.org/17003 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* rpc: add options to manage socket keepalive lifespanMilind Changire2017-04-122-42/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Default values for handling socket timeouts for brick responses are insufficient for aggressive applications such as databases. Solution: Add 1:1 gluster options for keepalive, keepalive-idle, keepalive-interval and keepalive-timeout as per the socket level options available as per tcp(7) man page. Default values for options are NOT agressive and continue to be values which result in default timeout when only the keep alive option is turned on. These options are Linux specific and will not be applicable to the *BSDs. Change-Id: I2a08ecd949ca8ceb3e090d336ad634341e2dbf14 BUG: 1426059 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/16731 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* build: place generated XDR .h and .c files under $(top_builddir)Niels de Vos2017-04-041-5/+6
| | | | | | | | | | | | | Change-Id: I0487337223a54a52e73088cb6dd812ce6d47178d BUG: 1429696 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/16994 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Tested-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* glusterd : Disallow peer detach if snapshot bricks exist on itGaurav Yadav2017-03-311-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem : - Deploy gluster on 2 nodes, one brick each, one volume replicated - Create a snapshot - Lose one server - Add a replacement peer and new brick with a new IP address - replace-brick the missing brick onto the new server (wait for replication to finish) - peer detach the old server - after doing above steps, glusterd fails to restart. Solution: With the fix detach peer will populate an error : "N2 is part of existing snapshots. Remove those snapshots before proceeding". While doing so we force user to stay with that peer or to delete all snapshots. Change-Id: I3699afb9b2a5f915768b77f885e783bd9b51818c BUG: 1322145 Signed-off-by: Gaurav Yadav <gyadav@redhat.com> Reviewed-on: https://review.gluster.org/16907 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* build: errors generating xdr stubs+headers with `make -j`Kaleb S. KEITHLEY2017-03-291-5/+5
| | | | | | | | | | | | | | | Using a makebomb, on f23 at least, blows up when generating the xdr headers and stubs. (Works reliably on f25 though, go figure.) This change appears to mitigate the race on f23. BUG: 1429696 Change-Id: Icca2de035255a7759563bf06d853950dddc5724d Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: https://review.gluster.org/16954 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* build: errors generating xdr stubs+headers with `make -j`Kaleb S. KEITHLEY2017-03-271-3/+3
| | | | | | | | | | | | | | | | Using a makebomb, on f23 at least, blows up when generating the xdr headers and stubs. (Works reliably on f25 though, go figure.) This change appears to mitigate the race on f23. Change-Id: I006066f0e7c3f8b65189f97c70089f3422e3e08b BUG: 1429696 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: https://review.gluster.org/16941 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Anoop C S <anoopcs@redhat.com>
* build: libgfxdr.so calls GF_FREE(), needs to link with -lglusterfsKaleb S. KEITHLEY2017-03-212-47/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous change to remove the xdrgen script exposed (or created) a recursive build dependency: libglusterfs needs the generated headers, and libgfxdr should be linked with libglusterfs for GF_FREE/__gf_free. (Much grumbling about libglusterfs being the kitchen sink of gluster elided. This would not be necessary if there were two or more libs, a gluster "runtime" library with common gluster code shared by the xlators and daemons, and a utility library with things like the rbtree, memory allocation, and whatnot.) So. Link at build time or link at runtime? For truth-and-beauty, link with libglusterfs.so at build time. Without truth-and-beauty, don't link with libglusterfs and rely on the other things that link with libglusterfs to provide resolution of __gf_free(). Truth-and-beauty it is. But how to generate the headers first, then build libglusterfs, then come back and build libgfxdr? Autotools is a maze of twisty passages, all different. Things that work with gnu make on linux don't work with the BSD make. Finally I hit on this solution. Add a shadow directory where make only generates the headers, then build libglusterfs using the generated headers, and finally build libgfxdr and link with libglusterfs. See original BZ 1330604 change http://review.gluster.org/14085 Change-Id: Iede8a30e3103176cb8f0b054885f30fcb352492b BUG: 1429696 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: https://review.gluster.org/16873 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* rpc: bump up conn->cleanup_gen in rpc_clnt_reconnect_cleanupAtin Mukherjee2017-03-201-1/+3
| | | | | | | | | | | | | | | | | | | | | | | Commit 086436a introduced generation number (cleanup_gen) to ensure that rpc layer doesn't end up cleaning up the connection object if application layer has already destroyed it. Bumping up cleanup_gen was done only in rpc_clnt_connection_cleanup (). However the same is needed in rpc_clnt_reconnect_cleanup () too as with out it if the object gets destroyed through the reconnect event in the application layer, rpc layer will still end up in trying to delete the object resulting into double free and crash. Peer probing an invalid host/IP was the basic test to catch this issue. Change-Id: Id5332f3239cb324cead34eb51cf73d426733bd46 BUG: 1433578 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/16914 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Milind Changire <mchangir@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* transport: allow OS to assign us a portKevin Vigor2017-03-122-0/+10
| | | | | | | | | | | | | | | | Replace complex and slow port selection code with bind(0) which already respects privileged ports. Change-Id: I408a8528e58e1aafcd32eba6a8f1a759e0bf274e BUG: 1405628 Reviewed-on-release-3.8-fb: http://review.gluster.org/16150 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16178 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpc: avoid logging success on failureMilind Changire2017-03-072-0/+7
| | | | | | | | | | | | | | | | Avoid logging Success in the event of failure especially when errno has no meaningful value w.r.t. the failure. In this case the errno is set to zero when there's indeed a failure at the RPC level. Change-Id: If2cc81aa1e590023ed22892dacbef7cac213e591 BUG: 1426032 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/16730 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* Return ret as soon as `this` is freedNigel Babu2017-03-011-0/+1
| | | | | | | | | | | | | | | Omitting this return causes a confusing error message in the next block, which is also a case of use after free. This bug was found by Coverity scan. BUG: 789278 Change-Id: Ifd4932de437e8ae875ff191033ea43cff81b701d Signed-off-by: Nigel Babu <nigelb@redhat.com> Reviewed-on: https://review.gluster.org/16790 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* rpc: log more about socket disconnectsMilind Changire2017-03-011-3/+15
| | | | | | | | | | | | | | | | | Log more about the different paths leading to socket disconnect for ease of debugging. Log via gf_log_callingfn() in __socket_disconnect() at loglevel TRACE if socket connection is being torn down. Change-Id: I1e551c2d685784b5ec747f481179f64d524c0461 BUG: 1426125 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/16732 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* rpc/clnt: remove locks while notifying CONNECT/DISCONNECTRaghavendra G2017-03-012-31/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Locking during notify was introduced as part of commit aa22f24f5db7659387704998ae01520708869873 [1]. The fix was introduced to fix out-of-order CONNECT/DISCONNECT events from rpc-clnt to parent xlators [2]. However as part of handling DISCONNECT protocol/client does unwind saved frames (with failure) waiting for responses. This saved_frames_unwind can be a costly operation and hence ideally shouldn't be included in the critical section of notifylock, as it unnecessarily delays the reconnection to same brick. Also, its not a good practise to pass control to other xlators holding a lock as it can lead to deadlocks. So, this patch removes locking in rpc-clnt while notifying parent xlators. To fix [2], two changes are present in this patch: * notify DISCONNECT before cleaning up rpc connection (same as commit a6b63e11b7758cf1bfcb6798, patch [3]). * protocol/client uses rpc_clnt_cleanup_and_start, which cleans up rpc connection and does a start while handling a DISCONNECT event from rpc. Note that patch [3] was reverted as rpc_clnt_start called in quick_reconnect path of protocol/client didn't invoke connect on transport as the connection was not cleaned up _yet_ (as cleanup was moved post notification in rpc-clnt). This resulted in clients never attempting connect to bricks. Note that one of the neater ways to fix [2] (without using locks) is to introduce generation numbers to map CONNECT and DISCONNECTS across epochs and ignore DISCONNECT events if they don't belong to current epoch. However, this approach is a bit complex to implement and requires time. So, current patch is a hacky stop-gap fix till we come up with a more cleaner solution. [1] http://review.gluster.org/15916 [2] https://bugzilla.redhat.com/show_bug.cgi?id=1386626 [3] http://review.gluster.org/15681 Change-Id: I62daeee8bb1430004e28558f6eb133efd4ccf418 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1427012 Reviewed-on: https://review.gluster.org/16784 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Milind Changire <mchangir@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* Use int instead of int8_t for the 3 variablesMichael Scherer2017-02-281-1/+1
| | | | | | | | | | | | | | | | | | | | | Since strcmp return a int, and since the spec of strcmp do not tell the return value, it could return 256 and this would overflow. Found by Coverity scan. (thanks to Stéphane Marcheusin who explained the details to me) Change-Id: I5195e05b44f8b537226e6cee178d95a1ab904e96 BUG: 789278 Signed-off-by: Michael Scherer <misc@redhat.com> Reviewed-on: https://review.gluster.org/16738 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Michael Scherer <misc@fedoraproject.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* socket: Avoid flooding of SSL messages in case of failure/successMohit Agrawal2017-02-271-7/+7
| | | | | | | | | | | | | | | | | | Problem: Avoid flodding of SSL messages in case of failure/success Solution: 1) Ideally ssl_setup_connection should be call after getting success on connect so update the condition before call socket_spawn in socket_connect. 2) Change the message type to debug in case of success. BUG: 1427018 Change-Id: Icb6101e49304d5fe539609b4afacfb1b50b62f84 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: https://review.gluster.org/16767 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* Free iobuf after using it, not beforeMichael Scherer2017-02-261-2/+2
| | | | | | | | | | | | | | | Coverity warn of use after free here. I assume that under pressure, this might crash the whole process. Change-Id: I15fb5cfc9b509705e96e4156b739988d816bbef5 BUG: 789278 Signed-off-by: Michael Scherer <misc@redhat.com> Reviewed-on: https://review.gluster.org/16719 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Michael Scherer <misc@fedoraproject.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* rpc: fix obvious typo in cleanup code in rpc_clnt_notifyMateusz Slupny2017-02-191-1/+1
| | | | | | | | | | | | | | | Change-Id: I003e38b238704d3345d46688355bcf3702455ba1 BUG: 1399593 Signed-off-by: Mateusz Slupny <mateusz.slupny@appeartv.com> [ndevos: rebased after I8ff5d1a32 moved the code around] Reviewed-on: https://review.gluster.org/15969 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpcsvc: Add rpchdr and proghdr to iobref before submitting to transportPoornima G2017-02-152-3/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Issue: When fio is run on multiple clients (each client writes to its own files), and meanwhile the clients does a readdirp, thus the client which did a readdirp will now recieve the upcalls. In this scenario the client disconnects with rpc decode failed error. RCA: Upcall calls rpcsvc_request_submit to submit the request to socket: rpcsvc_request_submit currently: rpcsvc_request_submit () { iobuf = iobuf_new iov = iobuf->ptr fill iobuf to contain xdrised upcall content - proghdr rpcsvc_callback_submit (..iov..) ... if (iobuf) iobuf_unref (iobuf) } rpcsvc_callback_submit (... iov...) { ... iobuf = iobuf_new iov1 = iobuf->ptr fill iobuf to contain xdrised rpc header - rpchdr msg.rpchdr = iov1 msg.proghdr = iov ... rpc_transport_submit_request (msg) ... if (iobuf) iobuf_unref (iobuf) } rpcsvc_callback_submit assumes that once rpc_transport_submit_request() returns the msg is written on to socket and thus the buffers(rpchdr, proghdr) can be freed, which is not the case. In especially high workload, rpc_transport_submit_request() may not be able to write to socket immediately and hence adds it to its own queue and returns as successful. Thus, we have use after free, for rpchdr and proghdr. Hence the clients gets garbage rpchdr and proghdr and thus fails to decode the rpc, resulting in disconnect. To prevent this, we need to add the rpchdr and proghdr to a iobref and send it in msg: iobref_add (iobref, iobufs) msg.iobref = iobref; The socket layer takes a ref on msg.iobref, if it cannot write to socket and is adding to the queue. Thus we do not have use after free. Thank You for discussing, debugging and fixing along: Prashanth Pai <ppai@redhat.com> Raghavendra G <rgowdapp@redhat.com> Rajesh Joseph <rjoseph@redhat.com> Kotresh HR <khiremat@redhat.com> Mohammed Rafi KC <rkavunga@redhat.com> Soumya Koduri <skoduri@redhat.com> Change-Id: Ifa6bf6f4879141f42b46830a37c1574b21b37275 BUG: 1421937 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/16613 Reviewed-by: Prashanth Pai <ppai@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* Tier: remove warning related to the enumhari gowtham2017-02-071-1/+2
| | | | | | | | | | | | | | | | | | | PROBLEM: In the tier as a service patch the enums for tier (from gf1_op_command and gf_defrag_command) are put into a single enum gf_defrag_command which causes a warning that will make the build fail. FIX: send both the enum and eliminate the warning. Change-Id: I899ff622dfb07134e6459aa65f65ea7252765293 BUG: 1418973 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/16539 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* socket: GF_REF_PUT should be called outside lockRajesh Joseph2017-02-061-2/+4
| | | | | | | | | | | | | | | GF_REF_PUT was called inside lock which can call socket_poller_mayday which inturn tries to take the same lock. This can lead to deadlock scenario. BUG: 1410701 Change-Id: Ib3b161bcfeac810bd3593dc04c10ef984f996b17 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: https://review.gluster.org/16343 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* socket: retry connect immediately if it failsJeff Darcy2017-02-021-2/+36
| | | | | | | | | | | | | | | | | | | | | | Previously we relied on a complex dance of setting flags, shutting down the socket, tearing stuff down, getting an event, tearing more stuff down, and waiting for a higher-level retry. What we really need, in the case where we're just trying to connect prematurely e.g. to a brick that hasn't fully come up yet, is a simple retry of the connect(2) call. This was discovered by observing failures in ec-new-entry.t with multiplexing enabled, but probably fixes other random failures as well. Change-Id: Ibedb8942060bccc96b02272a333c3002c9b77d4c BUG: 1385758 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16510 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* libglusterfs+transport+io-threads: fix 256KB stack abuseJeff Darcy2017-02-012-12/+12
| | | | | | | | | | | | | | | | | Some functions were allocating 64K booleans, which are (crazily) mapped to 4-byte ints, for a total of 256KB per call. Changed to use bitfields instead, so usage is now only 8KB per call. This was the impediment to changing the io-threads stack size, so that has been adjusted too. Change-Id: I8781c4f2c8f2b830f4535e366995fac8dd0a8653 BUG: 1418095 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/15745 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* rpc/socket.c : Bonnie++ hangs during rewrites in ganesha + SSLMohit Agrawal2017-02-011-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Bonnie++ rewrite operation hangs in ganesha + SSL environment Solution: Bonnie++ hangs during execution of rewrite operation in ganesha + SSL environment.It was hanged due to blocking on poll call in ssl_do because no POLLOUT event was getting on socket. Socket is not getting POLLOUT event because all other threads are waiting to get lock and lock is not released ssl_do because it is not getting any event on poll.To correct it update the condition in ssl_do as same in getting error SSL_ERROR_WANT_READ. Test: To test the patch followed below procedure 1) Setup 2X2 Ganesha + SSL environment. 2) Run bonnie from 3 nfs client parallely 3) After run "Rewwrite operation" by bonnie it is hanged. 4) After apply the patch it is not hanged. BUG: 1418213 Change-Id: I5985cbbc4cfdac5d287268d791e31c274abc3c8d Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: https://review.gluster.org/16501 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* refcount: typecast function for calling on freeNiels de Vos2017-01-311-4/+2
| | | | | | | | | | | | | | | | | | | All of the functions called to free the refcounted structure are doing a typecast from (void*) to their own type taht is being free'd. This really is not needed and the refcount interface is made a little simpler without the requirement of typecasting. With this small improvement in the API, all callers are updated too. Change-Id: I32473b6d1799f62861d4b2d78ea30c09e6c80ab1 BUG: 1416889 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/16471 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* core: run many bricks within one glusterfsd processJeff Darcy2017-01-303-3/+1
| | | | | | | | | | | | | | | | | | | | | | | This patch adds support for multiple brick translator stacks running in a single brick server process. This reduces our per-brick memory usage by approximately 3x, and our appetite for TCP ports even more. It also creates potential to avoid process/thread thrashing, and to improve QoS by scheduling more carefully across the bricks, but realizing that potential will require further work. Multiplexing is controlled by the "cluster.brick-multiplex" global option. By default it's off, and bricks are started in separate processes as before. If multiplexing is enabled, then *compatible* bricks (mostly those with the same transport options) will be started in the same process. Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb BUG: 1385758 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/14763 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: add a cli command to trigger a statedump on a clientPoornima G2017-01-233-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this, we will be able to trigger statedumps on remote Gluster clients, mainly targetted for applications using libgfapi. Design: SIGUSR signal is the most comman way of taking a statedump in Gluster. But it cannot be used for libgfapi based processes, as the process loading the library might have already consumed SIGUSR signal. Hence going by the command way. One has to issue a Gluster command to initiate a statedump on the libgfapi based client. The command takes hostname and PID as an argument. All the glusterds in the cluster, check if they are connected to the specified hostname, and send an RPC request to all the connected clients from that hostname (via the mgmt connection). URL: http://review.gluster.org/16357 Change-Id: Icbe4d2f026b32a2c7d5535e1bfb2cdaaff042e91 BUG: 1169302 Signed-off-by: Poornima G <pgurusid@redhat.com> [ndevos: minor fixes and split patch in smaller pieces] Reviewed-on: https://review.gluster.org/9228 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
* tier : Tier as a servicehari gowtham2017-01-162-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tierd is implemented by separating from rebalance process. The commands affected: 1) Attach tier will trigger this process instead of old one 2) tier start and tier start force will also trigger this process. 3) volume status [tier] will show tier daemon as a process instead of task and normal tier status and tier detach status works. 4) tier stop implemented. 5) detach tier implemented separately along with new detach tier status 6) volume tier volname status will work using the changes. 7) volume set works This patch has separated the tier translator from the legacy DHT rebalance code. It now sends the RPCs from the CLI to glusterd separate to the DHT rebalance code. The daemon is now a service, similar to the snapshot daemon, and can be viewed using the volume status command. The code for the validation and commit phase are the same as the earlier tier validation code in DHT rebalance. The “brickop” phase has been changed so that the status command can use this framework. The service management framework is now used. DHT rebalance does not use this framework. This service framework takes care of : *) spawning the daemon, killing it and other such processes. *) volume set options , which are written on the volfile. *) restart and reconfigure functions. Restart is to restart the daemon at two points 1)after gluster goes down and comes up. 2) to stop detach tier. *) reconfigure is used to make immediate volfile changes. By doing this, we don’t restart the daemon. it has the code to rewrite the volfile for topological changes too (which comes into place during add and remove brick). With this patch the log, pid, and volfile are separated and put into respective directories. Change-Id: I3681d0d66894714b55aa02ca2a30ac000362a399 BUG: 1313838 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/13365 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterfs : Removing logically dead codeMuthu-vigneshwaran2017-01-071-6/+0
| | | | | | | | | | | | | | | CID = 1356502 BUG: 789278 Change-Id: I11a814addc6607902c12aca8f4efec5741cbd7d3 Signed-off-by: Muthu-vigneshwaran <mvignesh@redhat.com> Reviewed-on: http://review.gluster.org/16028 Tested-by: Muthu Vigneshwaran <muthuvigneshwaran77@gmail.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* socket: socket disconnect should wait for poller thread exitRajesh Joseph2016-12-218-34/+106
| | | | | | | | | | | | | | | | | | | | | When SSL is enabled or if "transport.socket.own-thread" option is set then socket_poller is run as different thread. Currently during disconnect or PARENT_DOWN scenario we don't wait for this thread to terminate. PARENT_DOWN will disconnect the socket layer and cleanup resources used by socket_poller. Therefore before disconnect we should wait for poller thread to exit. Change-Id: I71f984b47d260ffd979102f180a99a0bed29f0d6 BUG: 1404181 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/16141 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaushal M <kaushal@redhat.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* socket: Keepalives should happen on IPv6 as well as IPv4Shreyas Siravara2016-12-161-1/+1
| | | | | | | | | | | | | | | | | Summary: - Check for AF_INET *and* AF_INET6. - This is a cherry-pick of D3057373 to 3.8 Signed-off-by: Shreyas Siravara <sshreyas@fb.com> Change-Id: I53eb79284eddfee6e13821c6570809f575b96769 BUG: 1405478 Reviewed-on: http://review.gluster.org/16167 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Jeff Darcy <jdarcy@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpc: fix for race between rpc and protocol/clientRajesh Joseph2016-12-052-40/+59
| | | | | | | | | | | | | | | | | | | | | | | | It is possible that the notification thread which notifies protocol/client layer about the disconnection is put to sleep and meanwhile, a fuse thread or a timer thread initiates and completes reconnection to the brick. The notification thread is then woken up and protocol/client layer updates its flags to indicate that network is disconnected. No reconnection is initiated because reconnection is rpc-lib layer's responsibility and its flags indicate that connection is connected. Fix: Serialize connect and disconnect notify Credit: Raghavendra Talur <rtalur@redhat.com> Change-Id: I8ff5d1a3283b47f5c26848a42016a40bc34ffc1d BUG: 1386626 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/15916 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* dht/md-cache: Filter invalidate if the file is made a linkto filePoornima G2016-12-021-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | Upcall as a part of setattr, sends an invalidation and the invalidation carries the resulting stat value. When a file is converted to linkto files, even then an invalidation is set and as a result the mountpoint shows the sticky bit in the stat of the file. eg: ---------T. 945 root root 0 Nov 8 10:14 hardlink.999 Fix: When dht recieves a notification of sticky bit change, it updates the flag, to indicate md-cache to send the subsequent lookup. Change-Id: Ic2fd7a5b196db0754f9b97072e644e6bf69da606 BUG: 1392713 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15789 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* cluster/afr: CLI for granular entry heal enablement/disablementKrutika Dhananjay2016-11-281-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When there are already existing non-granular indices created that are yet to be healed, if granular-entry-heal option is toggled from 'off' to 'on', AFR self-heal whenever it kicks in, will try to look for granular indices in 'entry-changes'. Because of the absence of name indices, granular entry healing logic will fail to heal these directories, and worse yet unset pending extended attributes with the assumption that are no entries that need heal. To get around this, a new CLI is introduced which will invoke glfsheal program to figure whether at the time an attempt is made to enable granular entry heal, there are pending heals on the volume OR there are one or more bricks that are down. If either of them is true, the command will be failed with the appropriate error. New CLI: gluster volume heal <VOL> granular-entry-heal {enable,disable} Change-Id: I1f4fe8162813b9068e198965d94169fee4adc099 BUG: 1370410 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15747 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* afr,dht,ec: Replace GF_EVENT_CHILD_MODIFIED with event SOME_DESCENDENT_DOWN/UPPoornima G2016-11-211-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently these are few events related to child_up/down: GF_EVENT_CHILD_UP : Issued when any of the protocol client connects. GF_EVENT_CHILD_MODIFIED : Issued by afr/dht/ec GF_EVENT_CHILD_DOWN : Issued when any of the protocol client disconnects. These events get modified at the dht/afr/ec layers. Here is a brief on the same. DHT: - All the subvolumes reported once, and atleast one child came up, then GF_EVENT_CHILD_UP is issued - connect GF_EVENT_CHILD_UP is issued - disconnect GF_EVENT_CHILD_MODIFIED is issued - All the subvolumes disconnected, GF_EVENT_CHILD_DOWN is issued AFR: - First subvolume came up, then GF_EVENT_CHILD_UP is issued - Subsequent subvolumes coming up, results in GF_EVENT_CHILD_MODIFIED - Any of the subvolumes go down, then GF_EVENT_SOME_CHILD_DOWN is issued - Last up subvolume goes down, then GF_EVENT_CHILD_DOWN is issued Until the patch [1] introduced GF_EVENT_SOME_CHILD_UP, GF_EVENT_CHILD_MODIFIED was issued by afr/dht when any of the subvolumes go up or down. Now with md-cache changes, there is a necessity to differentiate between child up and down. Hence, introducing GF_EVENT_SOME_DESCENDENT_DOWN/UP and getting rid of GF_EVENT_CHILD_MODIFIED. [1] http://review.gluster.org/12573 Change-Id: I704140b6598f7ec705493251d2dbc4191c965a58 BUG: 1396038 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15764 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* Revert "rpc: Fix the race between notification and reconnection"Pranith Kumar Karampuri2016-11-161-4/+3
| | | | | | | | | | | | | | | | This reverts commit a6b63e11b7758cf1bfcb67985e25ec02845f0995. Nithya and Rajesh found that the mount fails sometimes after this patch was merged so reverting it. BUG: 1386626 Change-Id: I959a5b6c7da61368cf4c67c98193c6e8fdd1755d Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15838 Reviewed-by: N Balachandran <nbalacha@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* glusterd/quota: upgrade quota.conf file during an upgradeManikandan Selvaganesh2016-11-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem ======= When quota is enabled on 3.6, it will have quota conf version in quota.conf as v1.1. This node gets upgraded to 3.7 but it will still have quota conf version as v1.1 until a quota enable/disable/set limit is initiated. When this is not initiated and when this node tries to peer probe a node which is a fresh install of 3.7 (which will have quota conf version as v1.2), then this will result in "Peer rejected" state. This patch fixes the issue. Solution ======== When an upgrade happens from 3.6 to 3.7, quota.conf file needs to be modified as well. With 3.6, in quota.conf the version will be v1.1 and it needs to be changed to v1.2 from 3.7. This is because in 3.7, inode quota feature is introduced. So when an op-version bumpup happens quota.conf needs to be upgraded with quota conf version v1.2 and all the 16 byte uuid needs to be changed to 17 bytes uuid as well. Previously, when the cluster version is upgraded to 3.7, the quota.conf got upgraded as well. But, the upgradation was done only when quota enable/disable/set limit is done. With this patch, the upgradation is done during a cluster op version bump up as well. Change-Id: Idb5ba29d3e1ea0e45c85d87c952c75da9e0f99f0 BUG: 1371539 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/15352 Tested-by: Atin Mukherjee <amukherj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* rpc: Fix the race between notification and reconnectionPranith Kumar K2016-10-241-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: There was a hang because unlock on an entry failed with ENOTCONN. Client thinks the connection is down where as server thinks the connection is up. This is the race we are seeing: 1) Connection from client to the brick disconnects. 2) Saved frames unwind is called which unwinds all frames that were wound before disconnect. 3) connection from client to the brick happens and setvolume. 4) Disconnect notification for the connection in 1) comes now and calls client_rpc_notify() which marks the connection to be offline even when the connection is up. This is happening because I/O can retrigger connection before disconnect notification is sent to the higher layers in rpc. Fix: Notify the higher layers that a disconnect happened and then go ahead with reconnect logic. For the logs which point to the information above check: https://bugzilla.redhat.com/show_bug.cgi?id=1386626#c1 Thanks to Raghavendra G for suggesting the correct fix. BUG: 1386626 Change-Id: I3c84ba1f17010bd69049fa88ec5f0ae431f8cda9 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15681 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* compound fops: Fix file corruption issueKrutika Dhananjay2016-10-244-15/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Address of a local variable @args is copied into state->req in server3_3_compound (). But even after the function has gone out of scope, in server_compound_resume () this pointer is accessed and dereferenced. This patch fixes that. 2. Compound fops, by virtue of NOT having a vector sizer (like the one writev has), ends up having both the header and the data (in case one of its member fops is WRITEV) in the same hdr_iobuf. This buffer was not being preserved through the lifetime of the compound fop, causing it to be overwritten by a parallel write fop, even when the writev associated with the currently executing compound fop is yet to hit the desk, thereby corrupting the file's data. This is fixed by associating the hdr_iobuf with the iobref so its memory remains valid through the lifetime of the fop. 3. Also fixed a use-after-free bug in protocol/client in compound fops cbk, missed by Linux but caught by NetBSD. Finally, big thanks to Pranith Kumar K and Raghavendra Gowdappa for their help in debugging this file corruption issue. Change-Id: I6d5c04f400ecb687c9403a17a12683a96c2bf122 BUG: 1378778 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15654 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* rpc/socket.c : Modify socket_poller code in case of ENODATA error code.Mohit Agrawal2016-10-231-1/+1
| | | | | | | | | | | | | | | | | | Problem: Continuous warning message(ENODATA) are coming in socket_rwv while SSL is enabled. Solution: To avoid the warning message update one condition in socket_poller loop code before break from loop in case of error returned by poll functions. BUG: 1386450 Change-Id: I19b3a92d4c3ba380738379f5679c1c354f0ab9b1 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/15677 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* rpc/socket: Close pipe on disconnectionKaushal M2016-10-231-1/+8
| | | | | | | | | | | | | | | | Encrypted connections create a pipe, which isn't closed when the connection disconnects. This leaks fds, and gluster eventually ends up in a situation with fd starvation which leads to operation failures. Change-Id: I144e1f767cec8c6fc1aa46b00cd234129d2a4adc BUG: 1336371 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/14356 Tested-by: MOHIT AGRAWAL <moagrawa@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* rpc/socket.c : Modify gf_log message in socket_poller code in case of errorMohit Agrawal2016-10-121-3/+6
| | | | | | | | | | | | | | | | | | | Problem: In case of SSL after stopping the volume if client(mount point) is still trying to write the data on socket then it will throw an EIO error on that socket and given this log message is captured at every attempt this would flood the log file. Solution: To reduce the frequency of stored log message use GF_LOG_OCCASIONALLY instead of gf_log. BUG: 1381115 Change-Id: I66151d153c2cbfb017b3ebc4c52162278c0f537c Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/15605 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* socket: log the client identifier in ssl connectMohit Agrawal2016-10-051-2/+2
| | | | | | | | | | | | | | | | | | Problem: client identifier is not logged in message in ssl_setup_connection Solutuion: In ssl_setup_connection xl_private is not available in rpc_transport so changed to this peerinfo.identifier. BUG: 1380275 Change-Id: I05006a3d63e46de8c388298c22faa9a3329eb6f3 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/15596 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* build: rpc/xdr .c and .h files are regenerated unnecessarilyKaleb S. KEITHLEY2016-09-201-4/+8
| | | | | | | | | | | | | | | | | | | .c and .h files are blindly (re)generated. If you run `make` followed by a second `make` or `make install` the subsequent `make` will unnecessarily recompile everything because of the redundant regeneration of the .c and .h files are now newer than the prior builds .o files Change-Id: I7e477bcdcc20869e144ada7e6d91e7221b8ee71f BUG: 1377341 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15530 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Anoop C S <anoopcs@redhat.com>
* socket: pollerr event shouldn't trigger socket_connnect_finishAtin Mukherjee2016-09-192-1/+44
| | | | | | | | | | | | | | | | | | | | | | | | If connect fails with any other error than EINPROGRESS we cannot get the error status using getsockopt (... SO_ERROR ... ). Hence we need to remember the state of connect and take appropriate action in the event_handler for the same. As an added note, a event can come where poll_err is HUP and we have poll_in as well (i.e some status was written to the socket), so for such cases we need to finish the connect, process the data and then the poll_err as is the case in the current code. Special thanks to Kaushal M & Raghavendra G for figuring out the issue. Change-Id: Ic45ad59ff8ab1d0a9d2cab2c924ad940b9d38528 BUG: 1372356 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/15440 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>