path: root/xlators/protocol
Commit message (Author, Age, Files, Lines)
* Merge remote-tracking branch 'origin/release-3.8' into release-3.8-fb (Jeff Darcy, 2017-08-31, 2 files, -8/+14)
|\
    Change-Id: Ie35cd1c8c7808949ddf79b3189f1f8bf0ff70ed8
| * rpc/clnt: remove locks while notifying CONNECT/DISCONNECT (Niels de Vos, 2017-07-11, 1 file, -1/+1)
    Locking during notify was introduced as part of commit aa22f24f5db7659387704998ae01520708869873 [1]. The fix was introduced to fix out-of-order CONNECT/DISCONNECT events from rpc-clnt to parent xlators [2]. However, as part of handling DISCONNECT, protocol/client unwinds (with failure) the saved frames that are waiting for responses. This saved_frames_unwind can be a costly operation and hence ideally shouldn't be included in the critical section of notifylock, as it unnecessarily delays the reconnection to the same brick. Also, it's not good practice to pass control to other xlators while holding a lock, as it can lead to deadlocks. So, this patch removes locking in rpc-clnt while notifying parent xlators.

    To fix [2], two changes are present in this patch:

    * notify DISCONNECT before cleaning up rpc connection (same as commit a6b63e11b7758cf1bfcb6798, patch [3]).
    * protocol/client uses rpc_clnt_cleanup_and_start, which cleans up rpc connection and does a start while handling a DISCONNECT event from rpc.

    Note that patch [3] was reverted as rpc_clnt_start called in quick_reconnect path of protocol/client didn't invoke connect on transport as the connection was not cleaned up _yet_ (as cleanup was moved post notification in rpc-clnt). This resulted in clients never attempting to connect to bricks.

    Note that one of the neater ways to fix [2] (without using locks) is to introduce generation numbers to map CONNECT and DISCONNECT events across epochs and ignore DISCONNECT events if they don't belong to the current epoch. However, this approach is a bit complex to implement and requires time. So, the current patch is a hacky stop-gap fix till we come up with a cleaner solution.

    [1] http://review.gluster.org/15916
    [2] https://bugzilla.redhat.com/show_bug.cgi?id=1386626
    [3] http://review.gluster.org/15681

    Cherry picked from commit 773f32caf190af4ee48818279b6e6d3c9f2ecc79:
    > Change-Id: I62daeee8bb1430004e28558f6eb133efd4ccf418
    > Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
    > BUG: 1427012
    > Reviewed-on: https://review.gluster.org/16784
    > Smoke: Gluster Build System <jenkins@build.gluster.org>
    > Reviewed-by: Milind Changire <mchangir@redhat.com>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>

    Change-Id: I62daeee8bb1430004e28558f6eb133efd4ccf418
    Reported-by: Markus Stockhausen <mst@collogia.de>
    Signed-off-by: Niels de Vos <ndevos@redhat.com>
    BUG: 1462447
    Reviewed-on: https://review.gluster.org/17733
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Milind Changire <mchangir@redhat.com>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
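The generation-number idea that the rpc/clnt commit message mentions (but does not implement) could look roughly like the sketch below. Everything here is hypothetical: `conn_state`, `conn_event` and `conn_handle_event` are illustration names, not GlusterFS types or functions, and generations are assumed to start at 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event types; not the rpc-clnt event enum. */
enum conn_event { EV_CONNECT, EV_DISCONNECT };

/* Hypothetical per-connection state kept by the parent xlator. */
struct conn_state {
    uint64_t generation; /* epoch of the most recent CONNECT, 0 = none yet */
    bool     connected;
};

/* Returns true if the event was applied, false if it was ignored as stale. */
static bool
conn_handle_event(struct conn_state *st, enum conn_event ev,
                  uint64_t ev_generation)
{
    switch (ev) {
    case EV_CONNECT:
        if (ev_generation <= st->generation)
            return false; /* older or duplicate CONNECT */
        st->generation = ev_generation;
        st->connected = true;
        return true;
    case EV_DISCONNECT:
        if (ev_generation != st->generation)
            return false; /* DISCONNECT from a previous epoch */
        st->connected = false;
        return true;
    }
    return false;
}
```

With this shape, a DISCONNECT that belongs to epoch 1 but arrives after the CONNECT of epoch 2 is dropped, so no lock is needed to keep the two events ordered.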
| * rpcsvc: Add rpchdr and proghdr to iobref before submitting to transport (Poornima G, 2017-04-07, 1 file, -7/+13)
    Backport of https://review.gluster.org/16613

    Issue:
    When fio is run on multiple clients (each client writes to its own files) and the clients meanwhile do a readdirp, the client which did a readdirp will start to receive upcalls. In this scenario the client disconnects with an "rpc decode failed" error.

    RCA:
    Upcall calls rpcsvc_request_submit to submit the request to the socket. rpcsvc_request_submit currently:

        rpcsvc_request_submit () {
            iobuf = iobuf_new
            iov = iobuf->ptr
            fill iobuf to contain xdrised upcall content - proghdr
            rpcsvc_callback_submit (..iov..)
            ...
            if (iobuf)
                iobuf_unref (iobuf)
        }

        rpcsvc_callback_submit (... iov...) {
            ...
            iobuf = iobuf_new
            iov1 = iobuf->ptr
            fill iobuf to contain xdrised rpc header - rpchdr
            msg.rpchdr = iov1
            msg.proghdr = iov
            ...
            rpc_transport_submit_request (msg)
            ...
            if (iobuf)
                iobuf_unref (iobuf)
        }

    rpcsvc_callback_submit assumes that once rpc_transport_submit_request() returns, the msg has been written to the socket and thus the buffers (rpchdr, proghdr) can be freed, which is not the case. Under especially high workload, rpc_transport_submit_request() may not be able to write to the socket immediately, and hence adds the msg to its own queue and returns success. Thus, we have a use after free for rpchdr and proghdr. Hence the client gets garbage rpchdr and proghdr and fails to decode the rpc, resulting in a disconnect.

    To prevent this, we need to add the rpchdr and proghdr to an iobref and send it in msg:

        iobref_add (iobref, iobufs)
        msg.iobref = iobref;

    The socket layer takes a ref on msg.iobref if it cannot write to the socket and is adding the msg to the queue. Thus we do not have a use after free.

    Thank You for discussing, debugging and fixing along:
    Prashanth Pai <ppai@redhat.com>
    Raghavendra G <rgowdapp@redhat.com>
    Rajesh Joseph <rjoseph@redhat.com>
    Kotresh HR <khiremat@redhat.com>
    Mohammed Rafi KC <rkavunga@redhat.com>
    Soumya Koduri <skoduri@redhat.com>

    > Reviewed-on: https://review.gluster.org/16613
    > Reviewed-by: Prashanth Pai <ppai@redhat.com>
    > Smoke: Gluster Build System <jenkins@build.gluster.org>
    > Reviewed-by: soumya k <skoduri@redhat.com>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>

    Change-Id: Ifa6bf6f4879141f42b46830a37c1574b21b37275
    BUG: 1422788
    Signed-off-by: Poornima G <pgurusid@redhat.com>
    Reviewed-on: https://review.gluster.org/16638
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Prashanth Pai <ppai@redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
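The ownership rule behind the rpcsvc fix can be illustrated with a toy reference-counted buffer. This is only a sketch of the pattern: `struct buf`, `struct transport` and the functions below are made-up stand-ins, not the iobuf/iobref or rpc_transport APIs. The transport takes its own reference when it queues data, so the caller dropping its reference right after "submit" does not free memory that is still waiting to be written.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy reference-counted buffer standing in for an iobuf (hypothetical). */
struct buf {
    int  refs;
    char data[32];
};

static struct buf *
buf_new(const char *s)
{
    struct buf *b = calloc(1, sizeof(*b));
    if (b == NULL)
        return NULL;
    b->refs = 1;
    strncpy(b->data, s, sizeof(b->data) - 1); /* calloc keeps it terminated */
    return b;
}

static struct buf *
buf_ref(struct buf *b)
{
    b->refs++;
    return b;
}

/* Returns 1 if this call freed the buffer, 0 if other holders remain. */
static int
buf_unref(struct buf *b)
{
    if (--b->refs == 0) {
        free(b);
        return 1;
    }
    return 0;
}

/* Toy transport with a one-slot pending queue (hypothetical). */
struct transport {
    struct buf *pending;
};

/* Pretends the socket is busy: queues the buffer and reports success. */
static int
transport_submit(struct transport *t, struct buf *b)
{
    t->pending = buf_ref(b); /* the queue holds its own reference */
    return 0;
}

/* Later, when the socket is writable: "write" the data and drop the ref. */
static void
transport_flush(struct transport *t, char *out, size_t outlen)
{
    strncpy(out, t->pending->data, outlen - 1);
    out[outlen - 1] = '\0';
    buf_unref(t->pending);
    t->pending = NULL;
}
```

If `transport_submit` stored the raw pointer without `buf_ref`, the caller's `buf_unref` would free the buffer while it was still queued, which is the failure mode the commit describes.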
* | server: fix core dumps on upstream test machines (Jeff Darcy, 2017-07-18, 1 file, -1/+5)
    Change-Id: I48f5340507a5fcebe874f498eba737585c1c32a7
    Signed-off-by: Jeff Darcy <jdarcy@fb.com>
    Reviewed-on: https://review.gluster.org/17818
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Tested-by: Jeff Darcy <jeff@pl.atyp.us>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* | cluster/afr: Handle gfid-less directories in heal flow (Richard Wareing, 2017-07-12, 2 files, -3/+6)
    Summary:
    - Updates the heal flow to handle the case where a directory does not have a gfid assigned. In this case we remove _only_ empty directories, so that the parent can regain consistency and files within can be correctly healed.
    - Also adds a test for the case where a file does not have a gfid; this is already handled by the metadata heal flow, but tests were lacking for this code path.

    Test Plan:
    - prove -v tests/basic/shd_autofix_nogfid.t
    - prove -v tests/basic/gfid_unsplit_shd.t

    Reviewers: dph, moox, sshreyas
    Reviewed By: sshreyas
    Differential Revision: https://phabricator.fb.com/D2502067
    Tasks: 8549168
    Change-Id: I8dd3e6a6d62807cb38aafe597eced3d4b402351b
    Signed-off-by: Jeff Darcy <jdarcy@fb.com>
    Reviewed-on: https://review.gluster.org/17750
    Tested-by: Jeff Darcy <jeff@pl.atyp.us>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* | Merge remote-tracking branch 'origin/release-3.8' into merge-3.8 (Kevin Vigor, 2017-03-05, 3 files, -34/+33)
|\|
| * protocol/client: Fix double free of client fdctx destroy (Ravishankar N, 2017-02-20, 3 files, -34/+33)
    This patch fixes the race between the fd re-open code and the fd release code, both of which free the fd context due to a race in certain variable checks, as explained below:

    1. The client process (shd in the case of this BZ) sends an opendir to its children (client xlators), which send the fop to the bricks to get a valid fd.
    2. The client xlator loses its connection to the brick. fdctx->remotefd is -1.
    3. The client re-establishes the connection. After the handshake, it reopens the dir and sets fdctx->remotefd to a valid fd in client3_3_reopendir_cbk().
    4. Meanwhile, shd sends an fd unref after it is done with the opendir. This triggers a releasedir (since fd->refcount becomes 0).
    5. client3_3_releasedir() sees that fdctx->remotefd is a valid number (i.e. not -1), sets fdctx->released=1 and calls client_fdctx_destroy().
    6. As a continuation of step 3, client_reopen_done() is called by client3_3_reopendir_cbk(), which sees that fdctx->released==1 and again calls client_fdctx_destroy().

    Depending on when step 5 does GF_FREE(fdctx), we may crash at any place in step 6 in client3_3_reopendir_cbk() when it tries to access fdctx->{whatever}.

    > Reviewed-on: https://review.gluster.org/16521
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > Smoke: Gluster Build System <jenkins@build.gluster.org>
    > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    (cherry picked from commit 25fc74f9d1f2b1e7bab76485a99f27abadd10b7b)

    Change-Id: Ia50873d11763e084e41d2a1f4d53715438e5e947
    BUG: 1422352
    Signed-off-by: Ravishankar N <ravishankar@redhat.com>
    Reviewed-on: https://review.gluster.org/16621
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
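One generic way to close a race like the one in steps 5 and 6 is to make both paths decide "who destroys" under the same lock, so that exactly one of them gets a yes. The sketch below shows that pattern only; it is an assumption about the general technique, not the actual change made in this commit, and `struct fdctx_model`, `reopen_in_progress`, `fdctx_release` and `fdctx_reopen_done` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of an fd context shared by release and reopen paths. */
struct fdctx_model {
    pthread_mutex_t lock;
    bool            reopen_in_progress;
    bool            released;
};

/* Release path: returns true if the caller must destroy the context. */
static bool
fdctx_release(struct fdctx_model *c)
{
    bool destroy;

    pthread_mutex_lock(&c->lock);
    c->released = true;
    /* If a reopen is still running, leave destruction to its completion. */
    destroy = !c->reopen_in_progress;
    pthread_mutex_unlock(&c->lock);
    return destroy;
}

/* Reopen completion: returns true if the caller must destroy the context. */
static bool
fdctx_reopen_done(struct fdctx_model *c)
{
    bool destroy;

    pthread_mutex_lock(&c->lock);
    c->reopen_in_progress = false;
    /* Destroy only if release already happened while we were reopening. */
    destroy = c->released;
    pthread_mutex_unlock(&c->lock);
    return destroy;
}
```

Whichever order the two calls arrive in, only the second one to take the lock is told to destroy, so the context is freed once.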
* | protocol/server: Fix crash bug in unlink flow (Richard Wareing, 2017-01-05, 1 file, -0/+4)
    Summary: Fixes a crash bug during unlink in server-rpc-fops.c

    Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
    Change-Id: I049a9863ffd4003742276e0aa9e8d1224488182d
    Reviewed-on: http://review.gluster.org/16335
    Reviewed-by: Kevin Vigor <kvigor@fb.com>
    Tested-by: Shreyas Siravara <sshreyas@fb.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
* | Merge remote-tracking branch 'origin/release-3.8' into merge-3.8-again (Kevin Vigor, 2017-01-05, 5 files, -4/+11)
|\|
    Change-Id: I844adf2aef161a44d446f8cd9b7ebcb224ee618a
    Signed-off-by: Kevin Vigor <kvigor@fb.com>
| * protocol/client: Fix potential mem-leaks (Krutika Dhananjay, 2016-12-20, 2 files, -4/+1)
    Backport of: http://review.gluster.org/16156

    Commit 93eaeb9c93be3232f24e840044d560f9f0e66f71 introduces leaks in INODELK callback where a dict is unserialized twice, leading to dict leaks.

    Change-Id: I2ad6f4243d78ba30841731d331f5a4a0006827da
    BUG: 1405886
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
    Reviewed-on: http://review.gluster.org/16187
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
| * protocol/server: capture offset in seek (Ravishankar N, 2016-11-29, 3 files, -0/+10)
    Problem:
    http://review.gluster.org/11482 implemented the seek FOP, but http://review.gluster.org/#/c/14137/ 'undid' the change where we pack the offset returned by seek in the server xlator before sending it to the client. As a result, seek always returns zero to the client for SEEK_HOLE/SEEK_DATA.

    Fix:
    I think 14137 removed it unintentionally, hence adding it back again.

    > Reviewed-on: http://review.gluster.org/15920
    > Smoke: Gluster Build System <jenkins@build.gluster.org>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    (cherry picked from commit cc37e5929d1e3ea4eaf4c4576a82066bf131ad05)

    Signed-off-by: Ravishankar N <ravishankar@redhat.com>
    Change-Id: I67a1f7b53214b043c5291f5704be4a50b698f91c
    BUG: 1399130
    Reviewed-on: http://review.gluster.org/15943
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* | Add halo-min-samples option, better swap logic, edge case fixes (Richard Wareing, 2016-12-28, 1 file, -0/+2)
    Summary:
    - Changes halo-decision to be based on the lowest halo value observed
    - Adds halo-min-samples option to wait until N latency samples have been gathered prior to activating halos.
    - Fixed 3 edge cases where halos weren't being correctly configured, or not configured as quickly as is possible. Namely:
      1. Don't mark a child down if there's no better alternative (and you'd no longer satisfy min/max replicas); fixes unnecessary flapping.
      2. If a child goes down and this causes us to fall below max_replicas, swap in a warm child immediately if it is within our halo latency (don't wait around for the next "ping"); swapping in a new child immediately helps with resiliency.
      3. If the child latency is within the halo, and it's currently marked up, mark it down if it's the highest latency child and the number of children is > max_replicas; this will allow us to support the SHD use-case where we can "beam" a single copy to a geo and have it replicate within the geo after that.
    - More commenting

    Test Plan:
    - Run halo prove tests
    - Pointed compiled code at gfsglobal.prn2, tested out an NFS daemon and FUSE mounts to ensure they worked as expected on a large scale cluster.

    Reviewers: dph, jackl, cjh, mmckeen
    Reviewed By: mmckeen
    FB-commit-id: 7e2e8ae6b8ec62a5e0b31c9fd6100c81795b3424
    Change-Id: Iba2b2f1bc848b4546cb96117ff1895f83953a4f8
    Signed-off-by: Kevin Vigor <kvigor@fb.com>
    Reviewed-on: http://review.gluster.org/16304
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
* | Make Halo calculate & use average latencies, not realtime (Richard Wareing, 2016-12-27, 1 file, -0/+1)
    Summary:
    - Realtime latencies in practice have far too much jitter under real loading conditions. Instead, let's use a running average which will get very "heavy" over time, such that temporary spikes in brick latency will not affect halo decisions.

    Test Plan:
    - Run prove tests

    Reviewed By: mmckeen
    Change-Id: I5ebf9bc93c67d9a226287796dd7ca5eeb7b1cfa5
    Signed-off-by: Kevin Vigor <kvigor@fb.com>
    Reviewed-on: http://review.gluster.org/16301
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
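A running average that gets "heavier" over time, as the Halo latency commit describes, can be sketched as a cumulative mean: each new sample moves the average by only 1/n of its distance from it. The commit message does not give the exact formula, so this is an assumption, and `struct latency_avg` and `latency_avg_add` are hypothetical names.

```c
#include <assert.h>

/* Hypothetical per-child latency tracker. */
struct latency_avg {
    double        avg;     /* running mean in milliseconds */
    unsigned long samples; /* number of samples folded in so far */
};

/* Fold one sample into the cumulative mean. */
static void
latency_avg_add(struct latency_avg *a, double sample_ms)
{
    a->samples++;
    /* Incremental mean: the more samples seen, the smaller each step. */
    a->avg += (sample_ms - a->avg) / (double)a->samples;
}
```

After many samples a single spike barely moves `avg`, which is the jitter resistance the commit is after. The trade-off is that a genuine, lasting latency change is also absorbed slowly.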
* | client: Increase default ping-timeout to 180 seconds (Shreyas Siravara, 2016-12-20, 1 file, -1/+1)
    Summary:
    - We've seen lots of issues when the ping timeout is too low at 60 seconds.
    - This diff defaults the value to 180 seconds.
    - This is a cherry-pick of D3753765 to 3.8.

    Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
    Change-Id: I70b96b027ac024df63af4ca1aa768f973295b7e4
    Reviewed-on: http://review.gluster.org/16219
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Tested-by: Shreyas Siravara <sshreyas@fb.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Kevin Vigor <kvigor@fb.com>
* | protocol/client: Fix race in brick reconnection (Kevin Vigor, 2016-12-16, 1 file, -5/+23)
    Summary:
    - A race condition exists when reconnecting to a brick after connection has been lost; it is possible for the client translator to believe the connection is down while the socket layer believes the connection is up. This situation is permanent and eventually leads to loss of quorum and EROFS errors.
    - This is a cherry-pick of D3490020 to 3.8

    Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
    Change-Id: Ida7afbafd3dceadf9ca7ea8b350aa88db382dd88
    Reviewed-on: http://review.gluster.org/16174
    Reviewed-by: Kevin Vigor <kvigor@fb.com>
    Tested-by: Shreyas Siravara <sshreyas@fb.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* | Halo Replication feature for AFR translator (Richard Wareing, 2016-12-15, 2 files, -46/+54)
|/
    Summary:
    Halo Geo-replication is a feature which allows Gluster or NFS clients to write locally to their region (as defined by a latency "halo" or threshold if you like), and have their writes asynchronously propagate from their origin to the rest of the cluster. Clients can also write synchronously to the cluster simply by specifying a halo-latency which is very large (e.g. 10 seconds), which will include all bricks. In other words, it allows clients to decide at mount time if they desire synchronous or asynchronous IO into a cluster, and the cluster can support both of these modes to any number of clients simultaneously.

    There are a few new volume options due to this feature:
      halo-shd-latency: The threshold below which self-heal daemons will consider children (bricks) connected.
      halo-nfsd-latency: The threshold below which NFS daemons will consider children (bricks) connected.
      halo-latency: The threshold below which all other clients will consider children (bricks) connected.
      halo-min-replicas: The minimum number of replicas which are to be enforced regardless of latency specified in the above 3 options. If the number of children falls below this threshold the next best (chosen by latency) shall be swapped in.

    New FUSE mount options:
      halo-latency & halo-min-replicas: As described above.

    This feature combined with multi-threaded SHD support (D1271745) results in some pretty cool geo-replication possibilities.

    Operational Notes:
    - Global consistency is guaranteed for synchronous clients; this is provided by the existing entry-locking mechanism.
    - Asynchronous clients, on the other hand, are merely consistent within their region. Writes & deletes will be protected via entry-locks as usual, preventing concurrent writes into files which are undergoing replication. Read operations, on the other hand, should never block.
    - Writes are allowed from _any_ region and propagated from the origin to all other regions. The takeaway from this is that care should be taken to ensure multiple writers do not write the same files, resulting in a gfid split-brain which will require resolution via split-brain policies (majority, mtime & size). The recommended method for preventing this is using the nfs-auth feature to define which region for each share has RW permissions; tiers not in the origin region should have RO perms.

    TODO:
    - Synchronous clients (including the SHD) should choose clients from their own region as preferred sources for reads. Most of the plumbing is in place for this via the child_latency array.
    - Better GFID split brain handling & better dent type split brain handling (i.e. create a trash can and move the offending files into it).
    - Tagging in addition to latency as a means of defining which children you wish to synchronously write to

    Test Plan:
    - The usual suspects, clang, gcc w/ address sanitizer & valgrind
    - Prove tests

    Reviewers: jackl, dph, cjh, meyering
    Reviewed By: meyering
    Subscribers: ethanr
    Differential Revision: https://phabricator.fb.com/D1272053
    Tasks: 4117827
    Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1
    Signed-off-by: Kevin Vigor <kvigor@fb.com>
    Reviewed-on: http://review.gluster.org/16099
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
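The interaction between halo-latency and halo-min-replicas described in the Halo commit can be sketched as a small selection routine: children within the threshold are treated as connected, and if that leaves fewer than the minimum, the next-best children by latency are swapped in. This is an illustration under assumptions, not the AFR implementation: `halo_select` is a hypothetical function, and treating a latency equal to the threshold as "inside the halo" is a guess.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Marks up[i] = 1 for every child considered connected and returns how
 * many children were marked. Hypothetical helper, not GlusterFS code.
 */
static size_t
halo_select(const long *latency_ms, size_t n, long halo_ms,
            size_t min_replicas, int *up)
{
    size_t count = 0;

    /* First pass: everything inside the halo is up. */
    for (size_t i = 0; i < n; i++) {
        up[i] = latency_ms[i] <= halo_ms;
        count += (size_t)up[i];
    }

    /* Below the minimum: swap in the lowest-latency child still down. */
    while (count < min_replicas && count < n) {
        size_t best = n;
        for (size_t i = 0; i < n; i++) {
            if (!up[i] && (best == n || latency_ms[i] < latency_ms[best]))
                best = i;
        }
        up[best] = 1;
        count++;
    }
    return count;
}
```

With latencies of 5, 80, 40 and 200 ms, a 10 ms halo and a minimum of two replicas, the first and third children end up connected. A very large halo marks all four as connected, which is the synchronous mode the commit message describes.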
* protocol/server: readlink rsp xdr will fail while readlink got an error (Ryan Ding, 2016-09-01, 3 files, -3/+4)
    Set gfs3_readlink_rsp.path to an empty string when an error happens, to keep xdr_gfs3_readlink_rsp happy. Otherwise the original errno is lost and an rpc-internal errno is returned instead.

    Change-Id: I36655b66df8b9f164e5bd21eb17244722c2f5a52
    BUG: 1370172
    Signed-off-by: Ryan Ding <ryan.ding@open-fs.com>
    Reviewed-on: http://review.gluster.org/15312
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
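The readlink fix follows a common serialization rule: a string field must never be NULL when the response is encoded, even on the error path. A minimal sketch, assuming a toy encoder that rejects NULL strings; `struct readlink_rsp`, `rsp_encode` and `rsp_fill` are hypothetical and are not the generated XDR code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a readlink response. */
struct readlink_rsp {
    int         op_ret;
    int         op_errno;
    const char *path;
};

/* Toy encoder that, like a strict string serializer, refuses NULL. */
static int
rsp_encode(const struct readlink_rsp *rsp, char *out, size_t outlen)
{
    if (rsp->path == NULL)
        return -1; /* caller would only see a generic encoding failure */
    return snprintf(out, outlen, "%d:%d:%s", rsp->op_ret, rsp->op_errno,
                    rsp->path);
}

/* Fill the response; on error, use "" so that encoding still succeeds. */
static void
rsp_fill(struct readlink_rsp *rsp, int op_ret, int op_errno,
         const char *link)
{
    rsp->op_ret = op_ret;
    rsp->op_errno = op_errno;
    rsp->path = (op_ret < 0 || link == NULL) ? "" : link;
}
```

Because the error response now encodes cleanly, the original `op_errno` reaches the client instead of being replaced by an encoding error.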
* protocol/client: Unserialize xdata even if lookup fails (Anuradha Talur, 2016-08-22, 2 files, -6/+13)
    Problem:
    AFR relies on xdata returned by lookup to determine if there are any files that need healing. This info is further used to optimize readdirp. In case of lookups with negative return value, client xlator was sending NULL xdata. Due to absence of xdata, AFR conservatively assumes that there are files that need healing, which is incorrect.

    Solution:
    Even in case of unsuccessful lookups, send the xdata received by protocol client so that higher xlators can get the info that they rely on.

    >Change-Id: Id3a1023eb536180888eb2c0b39050000b76f7226
    >BUG: 1366284
    >Signed-off-by: Anuradha Talur <atalur@redhat.com>
    >Reviewed-on: http://review.gluster.org/15120
    >Smoke: Gluster Build System <jenkins@build.gluster.org>
    >Reviewed-by: Poornima G <pgurusid@redhat.com>
    >Tested-by: Poornima G <pgurusid@redhat.com>
    >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    >Reviewed-by: Ashish Pandey <aspandey@redhat.com>
    >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    >Signed-off-by: Anuradha Talur <atalur@redhat.com>

    Change-Id: Id3a1023eb536180888eb2c0b39050000b76f7226
    BUG: 1369187
    Signed-off-by: Anuradha Talur <atalur@redhat.com>
    Reviewed-on: http://review.gluster.org/15237
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* protocol/server: Fix client/server compatibility (Avra Sengupta, 2016-07-05, 1 file, -1/+6)
    The 3.8 client expects a child_up key from the server indicating the status of the server translators. This key is not sent by servers running older versions, thereby breaking compatibility. With this patch we treat the absence of the said key as an indication that the server trying to connect to this client is running an older version, and in such a case we set conf->child_up to _gf_true explicitly. This should suffice in emulating the older behavior.

    Due to the nature of this bug, which requires two versions to be reproducible, there are no testcases added for the same.

    > Reviewed-on: http://review.gluster.org/14811
    > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > Smoke: Gluster Build System <jenkins@build.gluster.org>
    > Tested-by: Gluster Build System <jenkins@build.gluster.org>
    (cherry picked from commit 10fa1bcce3b73f630dbc3241722c1af9dee4c414)

    Change-Id: I29e0a5c63b55380dc9db8e42852d7e95b64a2b2e
    BUG: 1350326
    Signed-off-by: Avra Sengupta <asengupt@redhat.com>
    Reviewed-on: http://review.gluster.org/14810
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
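The compatibility rule in the child_up commit (a missing key means an older peer, so assume the child is up) can be sketched with a toy key/value reply. `struct kv`, `kv_find` and `child_up_from_reply` are hypothetical helpers and do not reflect the GlusterFS dict API; only the key name `child_up` comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy key/value pair standing in for a handshake reply entry. */
struct kv {
    const char *key;
    int         value;
};

static const struct kv *
kv_find(const struct kv *pairs, size_t n, const char *key)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(pairs[i].key, key) == 0)
            return &pairs[i];
    }
    return NULL;
}

/* Absent key: peer predates the key, so fall back to the old behavior. */
static bool
child_up_from_reply(const struct kv *reply, size_t n)
{
    const struct kv *entry = kv_find(reply, n, "child_up");

    if (entry == NULL)
        return true;
    return entry->value != 0;
}
```

Defaulting to "up" keeps old servers usable. The cost is that a newer server that fails to send the key for any other reason is also treated as up.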
* libglusterfs: Negate all but O_DIRECT flag if present on anon fds (Krutika Dhananjay, 2016-06-27, 1 file, -1/+4)
    Backport of: http://review.gluster.org/14665

    This is to prevent any unforeseen problems that might arise due to writevs and readvs being wound with @flag parameter containing O_TRUNC or O_APPEND especially wrt translators like sharding and ec where O_TRUNC write or O_APPEND write on individual shards/fragments is not the same as O_TRUNC write or O_APPEND write as expected by the application.

    >Change-Id: Ib0110731d6099bc888b7ada552a2abbea8e76051
    >BUG: 1342903
    >Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
    >Reviewed-on: http://review.gluster.org/14735
    >Smoke: Gluster Build System <jenkins@build.gluster.org>
    >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>

    Change-Id: I7ffa4fa366f727f7e345ab0bf4c8eb009710074b
    BUG: 1347553
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/14755
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
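Keeping only O_DIRECT from a caller-supplied flag word, as the anon-fd commit describes, comes down to a single mask. The helper name `anon_fd_flags` is hypothetical, and the fallback `O_DIRECT` value is there only so the sketch compiles where the macro is not exposed; it matches x86 Linux and is not portable.

```c
#define _GNU_SOURCE /* O_DIRECT is a GNU/Linux extension in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>

#ifndef O_DIRECT
#define O_DIRECT 040000 /* illustration-only fallback (x86 Linux value) */
#endif

/* Drop every flag except O_DIRECT before using flags for an anon fd. */
static int
anon_fd_flags(int flags)
{
    return flags & O_DIRECT;
}
```

For example, `anon_fd_flags(O_DIRECT | O_TRUNC | O_APPEND)` yields just `O_DIRECT`, so a write on an individual shard or fragment can no longer truncate or append by accident.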
* core, shard: Make shards inherit main file's O_DIRECT flag if present (Krutika Dhananjay, 2016-06-27, 1 file, -1/+1)
    Backport of: http://review.gluster.org/14191

    If the application opens a file with O_DIRECT, the shards' anon fds would also need to inherit the flag. Towards this, shard xl would be passing the odirect flag in the @flags parameter to the WRITEV fop. This will be used in anon fd resolution and subsequent opening by posix xl.

    >Change-Id: I3a0593fa46cc25e390a5762a0354b469c2a1532d
    >BUG: 1342903
    >Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
    >Reviewed-on: http://review.gluster.org/14663
    >Smoke: Gluster Build System <jenkins@build.gluster.org>
    >CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    >Reviewed-by: Jeff Darcy <jdarcy@redhat.com>

    Change-Id: Ibfc164aa7f9eecd6993255f1c03557f2ec35ac8c
    BUG: 1347553
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
    Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/14754
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* protocol: Add framework to send transaction id with recall (Poornima G, 2016-06-10, 2 files, -2/+18)
    Backport of: http://review.gluster.org/#/c/14647/

    Issue:
    The upcall (cache invalidation/recall) event is sent from the bricks to clients. In an AFR/EC setup, it can happen that all the bricks send the upcall for the same event, and if AFR/EC doesn't filter out these duplicate notifications, the logic above the cluster xlators can fail.

    Solution:
    Use a transaction id to filter out duplicate notifications. This patch adds the framework for filtering duplicate notifications. AFR/EC can build upon this patch for deduping the notifications.

    Change-Id: I66b08e63b8799bc5932f2b2545376138a5701168
    BUG: 1337638
    Signed-off-by: Poornima G <pgurusid@redhat.com>
    Reviewed-on: http://review.gluster.org/14648
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/ec: Use correct log levels (Ashish Pandey, 2016-05-30, 2 files, -3/+3)
    Problem:
    Misleading messages are being logged in the mount logs and brick logs: "Mismatching xdata" and "Heal failed".

    Solution:
    Reduce the level of these logs from INFO, WARNING and NOTICE to DEBUG wherever applicable, OR use fop_log_level to get the proper log level.

    Backport of commit 02b2750ecc35f88c3262015b401dda962381f9da:
    > Change-Id: Ia824c71e75ab683d3cb8949e1966ea09c9ccce72
    > BUG: 1231224
    > Signed-off-by: Ashish Pandey <aspandey@redhat.com>
    > Reviewed-on: http://review.gluster.org/13266
    > Smoke: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>

    Change-Id: Ia824c71e75ab683d3cb8949e1966ea09c9ccce72
    BUG: 1254934
    Signed-off-by: Ashish Pandey <aspandey@redhat.com>
    Reviewed-on: http://review.gluster.org/14520
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
* core: Honour mandatory lock flags during lock migration (Anoop C S, 2016-05-27, 2 files, -0/+8)
    lk_flags from the posix_lock_t structure is the primary key used to differentiate locks as either advisory or mandatory. During lock migration this field is not read in the getactivelk() call path. So, in order to copy the exact lock state from source to destination, it is necessary to include lk_flags within the lock_migration_info_t structure to maintain accurate state. This change also includes minor modifications to the setactivelk() call to consider lk_flags during lock migration.

    > Reviewed-on: http://review.gluster.org/14189
    > Smoke: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Susant Palai <spalai@redhat.com>
    > Reviewed-by: Poornima G <pgurusid@redhat.com>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    (cherry picked from commit deaf8439fc42435988aae6a7b9ab681cc0d36b09)

    Change-Id: I20a7b6b6a0f3bdac5734cce8a2cd2349eceff195
    BUG: 1337805
    Signed-off-by: Anoop C S <anoopcs@redhat.com>
    Reviewed-on: http://review.gluster.org/14457
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* protocol/client: Filter o-direct in readv/writev | Pranith Kumar K | 2016-05-23 | 1 | -8/+15
| >Change-Id: I519c666b3a7c0db46d47e08a6a7e2dbecc05edf2
| >BUG: 1322214
| >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| >Reviewed-on: http://review.gluster.org/14215
| >Smoke: Gluster Build System <jenkins@build.gluster.com>
| >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| >CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| >Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
|
| BUG: 1335284
| Change-Id: I119a5f1eebf657b01d8d924ff1f59a49eb472667
| Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| Reviewed-on: http://review.gluster.org/14299
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Niels de Vos <ndevos@redhat.com>
* client: Fix the message ids | Poornima G | 2016-05-15 | 1 | -2/+9
| The message id of PC_MSG_GFID_NULL was changed as part of a rebase of
| http://review.gluster.org/#/c/11597. This patch fixes that.
|
| Backport of http://review.gluster.org/#/c/14276/
|
| Cherry picked from commit f3699f32fd8d468f757697fdacf4949b8d5312d5
| > Change-Id: I773e02fb5695b6b55700046f4a4298ec475f8991
| > Signed-off-by: Poornima G <pgurusid@redhat.com>
| > Reviewed-on: http://review.gluster.org/14276
| > Smoke: Gluster Build System <jenkins@build.gluster.com>
| > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
| Change-Id: I773e02fb5695b6b55700046f4a4298ec475f8991
| BUG: 1334994
| Signed-off-by: Poornima G <pgurusid@redhat.com>
| Reviewed-on: http://review.gluster.org/14280
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Niels de Vos <ndevos@redhat.com>
| Smoke: Gluster Build System <jenkins@build.gluster.com>
* glusterd: add defence mechanism to avoid brick port clashes | Prasanna Kumar Kalever | 2016-05-10 | 1 | -1/+2
| Intro:
| Currently glusterd maintains the portmap registry, which contains the ports
| that are free to use between 49152 and 65535. This registry is initialized
| once and updated as and when glusterd sees that ports are in use.
|
| Glusterd first checks the portmap registry for a port marked FREE, then
| checks whether that port is currently free using connect(), and then passes
| it to the brick process, which has to bind to it.
|
| Problem:
| There is a time gap between glusterd checking the port with connect() and
| the brick process actually binding to it. In this gap any other process may
| occupy the port, in which case the brick fails to bind and exits.
|
| Case 1:
| To prevent the gluster client process from occupying the port supplied by
| glusterd, the client port map range has been separated from the brick port
| map range. More at http://review.gluster.org/#/c/13998/
|
| Case 2: (Handled by this patch)
| To handle a foreign process occupying the port supplied by glusterd, this
| patch implements a mechanism to return the EADDRINUSE error code to
| glusterd, upon which a new port is allocated and glusterd tries to restart
| the brick process with the newly allocated port.
|
| Note: In case of glusterd restarts, i.e. runner_run_nowait(), there is no
| way to handle Case 2, because runner_run_nowait() does not wait for the
| return/exit code of the executed command (brick process). Hence, as of now,
| in such a case we cannot know with what error the brick has failed to
| connect.
|
| This patch also fixes runner_end() to perform some cleanup w.r.t. return
| values.
|
| Backport of:
| > Change-Id: Iec52e7f5d87ce938d173f8ef16aa77fd573f2c5e
| > BUG: 1322805
| > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
| > Reviewed-on: http://review.gluster.org/14043
| > Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com>
| > Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
| > Smoke: Gluster Build System <jenkins@build.gluster.com>
| > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
| Change-Id: Id7d8351a0082b44310177e714edc0571ad0f7195
| BUG: 1333711
| Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
| Reviewed-on: http://review.gluster.org/14235
| Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com>
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
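
The Case 2 mechanism described above (allocate a port, start the brick, and allocate a fresh port if the brick reports EADDRINUSE) might look roughly like the sketch below. Only the 49152 - 65535 range and the EADDRINUSE error code come from the commit message. Every `sketch_*` name, the callback signature, and the attempt limit are hypothetical and do not reflect glusterd's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PORT_MIN 49152
#define SKETCH_PORT_MAX 65535

/* Hypothetical registry: one "in use" flag per port in the brick range. */
struct sketch_portmap {
        bool used[SKETCH_PORT_MAX - SKETCH_PORT_MIN + 1];
};

/* Hypothetical brick launcher: returns 0 on success or an errno value. */
typedef int (*sketch_start_fn) (int port, void *ctx);

/* Hand out the lowest port still marked free, or -1 if none is left. */
static int
sketch_alloc_port (struct sketch_portmap *pm)
{
        for (int p = SKETCH_PORT_MIN; p <= SKETCH_PORT_MAX; p++) {
                if (!pm->used[p - SKETCH_PORT_MIN]) {
                        pm->used[p - SKETCH_PORT_MIN] = true;
                        return p;
                }
        }
        return -1;
}

/* Start a brick, retrying with a new port whenever the bind clashes.
 * Returns the port the brick ended up on, or -1 on failure. */
int
sketch_start_brick (struct sketch_portmap *pm, sketch_start_fn start,
                    void *ctx, int max_attempts)
{
        assert (pm != NULL && start != NULL);
        for (int attempt = 0; attempt < max_attempts; attempt++) {
                int port = sketch_alloc_port (pm);
                if (port < 0)
                        return -1;      /* registry exhausted */
                int ret = start (port, ctx);
                if (ret == 0)
                        return port;
                if (ret != EADDRINUSE) {
                        /* unrelated failure: release the port and give up */
                        pm->used[port - SKETCH_PORT_MIN] = false;
                        return -1;
                }
                /* A foreign process holds this port, so it stays marked
                 * as used and the loop allocates the next one. */
        }
        return -1;
}

/* Stand-in launcher for experiments: ctx points to an int holding the
 * first port that is really free; every lower port "clashes". */
int
sketch_fake_start (int port, void *ctx)
{
        int first_free = *(int *) ctx;
        return (port < first_free) ? EADDRINUSE : 0;
}
```

Calling `sketch_start_brick()` with `sketch_fake_start` and a `first_free` of 49154 walks past the two occupied ports and settles on the third. As the commit message notes, this kind of retry only works when the caller actually waits for the brick's exit code.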
* protocol/server: address double free's | Prasanna Kumar Kalever | 2016-05-03 | 1 | -4/+0
| Backport of:
| > Change-Id: Ic8a8fe85cf91c5c7aa93dce872cedbc67464e4ea
| > BUG: 1227667
| > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
| > Reviewed-on: http://review.gluster.org/14150
| > Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| > Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com>
| > Smoke: Gluster Build System <jenkins@build.gluster.com>
| > Reviewed-by: Niels de Vos <ndevos@redhat.com>
| > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| > CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
|
| Change-Id: I895457636c99cc77d643985b64180d659ac356ba
| BUG: 1332414
| Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
| Reviewed-on: http://review.gluster.org/14179
| Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com>
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Susant Palai <spalai@redhat.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| Reviewed-by: Niels de Vos <ndevos@redhat.com>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* protocol/server: Implementation of compound fop v3.9dev | Anuradha Talur | 2016-05-01 | 10 | -246/+3339
| Change-Id: I981258afa527337dd2ad33eecba7fc8084238e6d
| BUG: 1303829
| Signed-off-by: Anuradha Talur <atalur@redhat.com>
| Reviewed-on: http://review.gluster.org/14137
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Niels de Vos <ndevos@redhat.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* protocol: add setactivelk () fop | Susant Palai | 2016-05-01 | 8 | -2/+400
| Change-Id: I60fe2d59c454095febce4c0fbef87a2dad9636e4
| BUG: 1326085
| Signed-off-by: Susant Palai <spalai@redhat.com>
| Reviewed-on: http://review.gluster.org/14013
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Niels de Vos <ndevos@redhat.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* protocol: add getactivelk () fop | Susant Palai | 2016-05-01 | 8 | -1/+414
| Change-Id: Ie38198db990f133fe163ba160cdf647e34f83f4f
| BUG: 1326085
| Signed-off-by: Susant Palai <spalai@redhat.com>
| Reviewed-on: http://review.gluster.org/13994
| Reviewed-by: Niels de Vos <ndevos@redhat.com>
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* protocol/client : Implementation of compound fop | Anuradha Talur | 2016-04-30 | 6 | -2/+1848
| Change-Id: Iade71daf3bc70e60833d693ac55151c9cf691381
| BUG: 1303829
| Signed-off-by: Anuradha Talur <atalur@redhat.com>
| Reviewed-on: http://review.gluster.org/14114
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* protocol/client : Refactoring functions | Anuradha Talur | 2016-04-30 | 6 | -1146/+3040
| These changes are made to accommodate compound fops. The new functions that
| are added pack the arguments required to perform the fops. These will be
| used both by normal fops and compound ones.
|
| Change-Id: I44d9cef8ff1d33aa2f5661689c8e9386d87b2007
| BUG: 1303829
| Signed-off-by: Anuradha Talur <atalur@redhat.com>
| Reviewed-on: http://review.gluster.org/13963
| Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* Protocol: Add lease fop | Poornima G | 2016-04-29 | 8 | -14/+329
| Change-Id: I64c361d3e4ae86d57dc18bb887758d044c861237
| BUG: 1319992
| Signed-off-by: Poornima G <pgurusid@redhat.com>
| Reviewed-on: http://review.gluster.org/11597
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
| Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* dht/afr/client/posix: Fail mkdir without gfid-req | Pranith Kumar K | 2016-04-29 | 3 | -3/+23
| Do not allow directory creation without gfids: operations on directories
| created this way fail anyway, so it is better to fail the mkdir itself.
|
| BUG: 1317361
| Change-Id: I8f8e3b38bbded1960b7215bac0432500f7e78038
| Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| Reviewed-on: http://review.gluster.org/13690
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* protocol/client: Use loc->pargfid if loc->parent(->gfid) is not filled | Krutika Dhananjay | 2016-04-28 | 1 | -8/+6
| Change-Id: Id73bf635ca94dcf7518b33e529ffca07daeeb1f4
| BUG: 1269461
| Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
| Reviewed-on: http://review.gluster.org/14078
| Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* server: send lookup on root inode when itable is created | vmallika | 2016-03-30 | 8 | -31/+139
| * xlators like quota, marker and posix_acl can cause problems if the
|   inode-ctx is not created. Sometimes these xlators may not get a lookup on
|   the root inode, in the following cases:
|   1) the client may not send a lookup on the root inode (like the NSR leader)
|   2) if the xlators on one of the bricks are not up and the client sends a
|      lookup during this time, the brick can miss the lookup
|   It is always better to make sure that there is one lookup on root, so
|   send a first lookup when the inode table is created.
|
| * When sending the lookup on root, a new inode is created; we need to use
|   itable->root instead.
|
| Change-Id: Iff2eeaa1a89795328833a7761789ef588f11218f
| BUG: 1320818
| Signed-off-by: vmallika <vmallika@redhat.com>
| Reviewed-on: http://review.gluster.org/13837
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* protocol/server: Do not log ENOENT/ESTALE in fd based fops | Pranith Kumar K | 2016-03-15 | 1 | -20/+30
| When fd-based fops arrive on anonymous fds, they can fail with
| ENOENT/ESTALE and that failure gets logged. Log it at DEBUG level instead.
|
| Change-Id: I8ae53c29d6a66f6a65081c281a9a5c205f53766b
| BUG: 1315168
| Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| Reviewed-on: http://review.gluster.org/13621
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| Reviewed-by: Ashish Pandey <aspandey@redhat.com>
| Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
| Smoke: Gluster Build System <jenkins@build.gluster.com>
* protocol client/server: Fix client-server handshake | Avra Sengupta | 2016-03-10 | 8 | -12/+155
| Problem:
| Currently, on a successful connection between protocol server and client,
| the protocol client initiates a CHILD_UP event in the client stack. At this
| point only the connection between server and client is established, and
| there is no guarantee that the server side stack is ready to serve requests.
| It works fine today because most server side translators do not depend on
| any other factors before being able to serve requests, and hence they are
| up by the time the client stack translators receive the CHILD_UP (initiated
| by the client handshake).
|
| The gap is exposed when certain server side translators, NSR-Server for
| example, have a couple of protocol clients as their children (connecting
| them to other bricks) and cannot serve requests until a quorum of their
| children is up. Such translators should defer sending CHILD_UP until they
| have enough children up, and the same needs to be propagated to the client
| stack translators.
|
| Fix:
| Maintain a child_up variable in both the protocol client and protocol
| server translators. The protocol server should update this value based on
| the CHILD_UP and CHILD_DOWN events it receives from the translators below
| it. On receiving such an event it should forward that event to the client.
| The protocol client, on receiving such an event, should forward it up the
| client stack, thereby letting the client translators correctly know that
| the server is up and ready to serve.
|
| Clients connecting later (long after a server has initialized and
| processed its CHILD_UP events) will receive a child_up status as part of
| the handshake and, based on the status of the server's child_up, can either
| propagate a CHILD_UP event or defer it.
|
| Change-Id: I0807141e62118d8de9d9cde57a53a607be44a0e0
| BUG: 1312845
| Signed-off-by: Avra Sengupta <asengupt@redhat.com>
| Reviewed-on: http://review.gluster.org/13549
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
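
A rough model of the handshake fix described above: the server tracks child_up, a client learns that status during the handshake, and CHILD_UP is propagated only once the server side is actually ready. All types and function names here are hypothetical simplifications, not the protocol/client or protocol/server code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Events as seen by the translators above the protocol client. */
enum sketch_event {
        SKETCH_EVENT_NONE,      /* nothing to propagate (deferred) */
        SKETCH_CHILD_UP,
        SKETCH_CHILD_DOWN
};

/* Hypothetical state kept by protocol/server. */
struct sketch_server {
        bool child_up;
};

/* Hypothetical state kept by protocol/client. */
struct sketch_client {
        bool connected;
        bool child_up;
};

/* Server side: record CHILD_UP / CHILD_DOWN coming from the xlators below. */
void
sketch_server_notify (struct sketch_server *srv, enum sketch_event ev)
{
        assert (srv != NULL);
        if (ev == SKETCH_CHILD_UP)
                srv->child_up = true;
        else if (ev == SKETCH_CHILD_DOWN)
                srv->child_up = false;
}

/* Client side: the handshake carries the server's child_up status. The
 * return value is the event to send up the client stack right away. */
enum sketch_event
sketch_client_handshake (struct sketch_client *clnt,
                         const struct sketch_server *srv)
{
        assert (clnt != NULL && srv != NULL);
        clnt->connected = true;
        clnt->child_up  = srv->child_up;
        return clnt->child_up ? SKETCH_CHILD_UP : SKETCH_EVENT_NONE;
}

/* Client side: a later event forwarded by the server is passed upwards. */
enum sketch_event
sketch_client_on_server_event (struct sketch_client *clnt,
                               enum sketch_event ev)
{
        assert (clnt != NULL);
        if (!clnt->connected || ev == SKETCH_EVENT_NONE)
                return SKETCH_EVENT_NONE;
        clnt->child_up = (ev == SKETCH_CHILD_UP);
        return ev;
}
```

A client that connects before the server stack is ready gets `SKETCH_EVENT_NONE` from the handshake and only sees `SKETCH_CHILD_UP` once the server forwards it. A client that connects late gets it immediately.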
* protocol/client: Don't change op_ret when xdata_rsp is present | Pranith Kumar K | 2016-02-29 | 1 | -1/+1
| Change-Id: Ia69cc420ad7b5766d513ea2715bbca50d8d57132
| BUG: 1312226
| Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| Reviewed-on: http://review.gluster.org/13530
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* all: fixes for clang compile warnings | Kaleb S KEITHLEY | 2016-02-15 | 1 | -2/+2
| cli/src/cli-cmd-parser.c (chenk)
| cli/src/cli-xml-output.c (spandit)
| cli/src/cli.c (chenk)
| libglusterfs/src/common-utils.c (vmallika)
| libglusterfs/src/gfdb/gfdb_sqlite3.c (jfernand +1)
| rpc/rpc-transport/socket/src/socket.c (?)
| xlators/cluster/afr/src/afr-transaction.c (?)
| xlators/cluster/dht/src/dht-common.h (srangana +2)
| xlators/cluster/dht/src/dht-selfheal.c (srangana +2)
| xlators/debug/io-stats/src/io-stats.c (R. Wareing)
| xlators/features/barrier/src/barrier.c (vshastry)
| xlators/features/bit-rot/src/bitd/bit-rot-scrub.h (vshankar +1)
| xlators/features/shard/src/shard.c (kdhananj +1)
| xlators/mgmt/glusterd/src/glusterd-ganesha.c (skoduri)
| xlators/mgmt/glusterd/src/glusterd-handler.c (atinmu)
| xlators/mgmt/glusterd/src/glusterd-op-sm.h (atinmu)
| xlators/mgmt/glusterd/src/glusterd-snapshot.c (spandit)
| xlators/mgmt/glusterd/src/glusterd-syncop.c (atinmu)
| xlators/mgmt/glusterd/src/glusterd-volgen.c (atinmu)
| xlators/protocol/client/src/client-messages.h (mselvaga +1)
| xlators/storage/bd/src/bd-helper.c (M. Mohan Kumar)
| xlators/storage/bd/src/bd.c (M. Mohan Kumar)
| xlators/storage/posix/src/posix.c (nbalacha +1)
|
| Change-Id: I85934fbcaf485932136ef3acd206f6ebecde61dd
| BUG: 1293133
| Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
| Reviewed-on: http://review.gluster.org/13031
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* NSR: Volgen Support | Jeff Darcy | 2016-02-08 | 1 | -2/+6
| Allows the user to convert an afr-volume to an nsr-volume by using the
| cluster.nsr option in the volume set command:
|
|   gluster volume set <volname> cluster.nsr <on/off>
|
| Change-Id: Ia1c5aa89d27535f7275d474cf312dc5efb8e222f
| BUG: 1158654
| Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
| Reviewed-on: http://review.gluster.org/12943
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| Reviewed-by: Avra Sengupta <asengupt@redhat.com>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* protocol: implement seek() FOP | Niels de Vos | 2016-02-04 | 6 | -3/+266
| Network protocol extensions for the seek() FOP. The format is based on the
| SEEK procedure in NFSv4.2.
|
| Change-Id: I060768a8a4b9b1c80f4a24c0f17d630f7f028690
| BUG: 1220173
| Signed-off-by: Niels de Vos <ndevos@redhat.com>
| Reviewed-on: http://review.gluster.org/11482
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
| Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* protocol/server: Race between server_reconfigure and server_setvolume | Mohammed Rafi KC | 2016-01-22 | 1 | -0/+12
| During server_reconfigure we authenticate each connected client against the
| current options. To do this authentication we store the previous values in
| a dictionary during the connection establishment phase (server_setvolume).
| If the authentication fails during reconfigure, we disconnect the
| transport.
|
| This introduces a race between server_setvolume and reconfigure: if
| reconfigure is called before setvolume has been done, the transport will be
| disconnected.
|
| Change-Id: Icce2c28a171481327a06efd3901f8a5ee67b05ab
| BUG: 1300564
| Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
| Reviewed-on: http://review.gluster.org/13271
| Smoke: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
| CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
| NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
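
The commit message above describes the race but not the fix itself. One plausible way to avoid it, shown purely as an assumption, is to have reconfigure skip transports whose setvolume has not completed yet, since those will be authenticated by setvolume anyway. The `setvolume_done` flag and all `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-transport state on the server. */
struct sketch_client {
        bool setvolume_done;    /* assumed flag: handshake has completed   */
        bool allowed;           /* would pass auth against current options */
        bool connected;
};

/* Re-authenticate connected clients after an option change.
 * Returns the number of transports that were disconnected. */
int
sketch_reconfigure (struct sketch_client *clients, size_t count)
{
        int dropped = 0;

        assert (clients != NULL || count == 0);
        for (size_t i = 0; i < count; i++) {
                struct sketch_client *c = &clients[i];

                if (!c->connected)
                        continue;
                /* No stored credentials yet: setvolume has not run, so
                 * there is nothing valid to check. Leave the transport
                 * alone and let setvolume authenticate it. */
                if (!c->setvolume_done)
                        continue;
                if (!c->allowed) {
                        c->connected = false;
                        dropped++;
                }
        }
        return dropped;
}
```

With this guard, a client that is still in the middle of its handshake stays connected while an established client that no longer passes authentication is dropped.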
* upcall: free the xdr* allocations | Soumya Koduri | 2016-01-14 | 1 | -0/+6
| Free the xdr string allocations after decoding the upcall
| cache_invalidation request.
|
| Change-Id: I0ffc64f587d6c8566cba76cf08148f937a735926
| BUG: 1295107
| Signed-off-by: Soumya Koduri <skoduri@redhat.com>
| Reviewed-on: http://review.gluster.org/13232
| Reviewed-by: Niels de Vos <ndevos@redhat.com>
| Tested-by: NetBSD Build System <jenkins@build.gluster.org>
| Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* build: export minimum symbols from xlators for correct resolution | Kaleb S KEITHLEY | 2015-12-22 | 2 | -2/+2
| Revisiting http://review.gluster.org/#/c/11814/, which unintentionally
| introduced warnings from libtool about the xlator .so names.
|
| According to [1], the -module option must appear in the Makefile.am
| file(s); if -module is defined in a macro, e.g. in configure(.ac), then
| libtool will not recognize that this is a module and will emit a warning.
|
| [1] http://www.gnu.org/software/automake/manual/automake.html#Libtool-Modules
|
| Change-Id: Ifa5f9327d18d139597791c305aa10cc4410fb078
| BUG: 1248669
| Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
| Reviewed-on: http://review.gluster.org/13003
| Tested-by: NetBSD Build System <jenkins@build.gluster.org>
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: soumya k <skoduri@redhat.com>
| Reviewed-by: Niels de Vos <ndevos@redhat.com>
* geo-rep: Don't log geo-rep safe errors in mount logs | Kotresh HR | 2015-11-18 | 1 | -3/+6
| ENOENT is a safe error for geo-replication in the case of rm -rf. RMDIR is
| recorded in the changelog of each brick; geo-rep processes all changelogs,
| of which one will succeed and the rest will get ENOENT, which can be
| ignored. Similarly, ENOENT can be ignored for all unlink operations during
| the changelog replay that happens when a worker goes down and comes back.
|
| Change-Id: I6756f8f4c3fce7a159751a2bfce891ff16ad31a4
| BUG: 1250009
| Signed-off-by: Kotresh HR <khiremat@redhat.com>
| Reviewed-on: http://review.gluster.org/11833
| Tested-by: NetBSD Build System <jenkins@build.gluster.org>
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Aravinda VK <avishwan@redhat.com>
| Reviewed-by: Milind Changire <mchangir@redhat.com>
| Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
| Reviewed-by: Venky Shankar <vshankar@redhat.com>
* cluster/ec: Mark self-heal fops as internal | Pranith Kumar K | 2015-11-18 | 1 | -1/+1
| Change-Id: I8ae7af266d3e00460f0cfdc9389a926e5f2fee36
| BUG: 1282761
| Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
| Reviewed-on: http://review.gluster.org/12598
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Tested-by: NetBSD Build System <jenkins@build.gluster.org>
| Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* protocol/client: prevent use-after-free of frame->root | Niels de Vos | 2015-11-17 | 1 | -14/+2
| A regression failure generated a coredump on the glusterfs-client side:
|
|   (gdb) f 0
|   #0  0x00007fba6cd76432 in client_submit_request (this=0x7fba68006fc0,
|       req=0x7fba6579aa70, frame=0x7fba5c0058cc,
|       prog=0x7fba6cfb53c0 <clnt3_3_fop_prog>, procnum=41,
|       cbkfn=0x7fba6cd9206d <client3_3_release_cbk>, iobref=0x0,
|       rsphdr=0x0, rsphdr_count=0, rsp_payload=0x0, rsp_payload_count=0,
|       rsp_iobref=0x0, xdrproc=0x7fba79801075 <xdr_gfs3_release_req>)
|       at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/protocol/client/src/client.c:324
|   324             frame->root->ngrps = ngroups;
|   (gdb) l
|   319                     gf_msg_debug (this->name, 0, "rpc_clnt_submit failed");
|   320             }
|   321
|   322             if (!conf->send_gids) {
|   323                     /* restore previous values */
|   324                     frame->root->ngrps = ngroups;
|   325                     if (ngroups <= SMALL_GROUP_COUNT)
|   326                             frame->root->groups_small[0] = gid;
|   327             }
|   328
|   (gdb) p *frame->root
|   Cannot access memory at address 0x64185df000000000
|
| After looking at this in more detail, the flow is like this:
|
|   client_submit_request()
|   |
|   '- rpc_clnt_submit()                 // on line 314
|      |
|      '- cbkfn()                        // = client3_3_release_cbk
|         |
|         :- STACK_DESTROY (frame->root);
|      .----'
|   .----'
|   |
|   :- frame->root->ngrps = ngroups;     // on line 324
|   '
|
| So, there is a use-after-free, and it is not needed to restore the previous
| groups in frame->root.
|
| Change-Id: I9e7d712183692ed92cfc2f75cd3c2781a9db20e2
| BUG: 128128
| Signed-off-by: Niels de Vos <ndevos@redhat.com>
| Reviewed-on: http://review.gluster.org/12575
| Reviewed-by: Dan Lambright <dlambrig@redhat.com>
| Tested-by: NetBSD Build System <jenkins@build.gluster.org>
| Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
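
The pattern behind this crash can be reduced to a small, safe model: once a frame has been handed to a submit function whose callback may destroy it, the caller must not touch the frame again. The sketch below is not the protocol/client code. The `sketch_*` types, the synchronous callback, and the `ngrps = 1` trimming step are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal stand-ins for the call stack and frame (assumed layout). */
struct sketch_stack { int ngrps; };
struct sketch_frame { struct sketch_stack *root; };

static int sketch_destroyed;    /* counts frames freed by the callback */

/* Allocate a frame with its own root; returns NULL on allocation failure. */
struct sketch_frame *
sketch_frame_new (int ngrps)
{
        struct sketch_frame *frame = malloc (sizeof (*frame));

        if (frame == NULL)
                return NULL;
        frame->root = malloc (sizeof (*frame->root));
        if (frame->root == NULL) {
                free (frame);
                return NULL;
        }
        frame->root->ngrps = ngrps;
        return frame;
}

/* Callback in the style of a release cbk: it destroys the whole stack. */
static void
sketch_release_cbk (struct sketch_frame *frame)
{
        free (frame->root);
        free (frame);
        sketch_destroyed++;
}

/* Stand-in for the RPC submit: the callback may run before it returns. */
static int
sketch_rpc_submit (struct sketch_frame *frame,
                   void (*cbk) (struct sketch_frame *))
{
        cbk (frame);
        return 0;
}

int
sketch_submit_request (struct sketch_frame *frame, int send_gids)
{
        assert (frame != NULL && frame->root != NULL);
        if (!send_gids)
                frame->root->ngrps = 1; /* trim the group list before sending */

        int ret = sketch_rpc_submit (frame, sketch_release_cbk);

        /* Deliberately no "restore previous values" step here: the frame
         * and frame->root may already have been freed by the callback. */
        return ret;
}
```

Dropping the restore step is safe in this model because the trimmed value is never read again once the request has been submitted, which mirrors the reasoning given in the commit message.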
* build: install and package header files more conventionally | Kaleb S. KEITHLEY | 2015-11-16 | 1 | -1/+3
| The current way we install and package header files for the -devel package
| is a hack. This patch uses more conventional autoconf, libtool, and
| rpmbuild idioms to package -devel headers and libraries.
|
| Change-Id: I63ffb3460f5c12b6b355493bd00824ac9e5354c5
| BUG: 1271907
| Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
| Reviewed-on: http://review.gluster.org/12360
| Tested-by: NetBSD Build System <jenkins@build.gluster.org>
| Tested-by: Gluster Build System <jenkins@build.gluster.com>
| Reviewed-by: Niels de Vos <ndevos@redhat.com>