| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Implement minimal proper synchronization between gf_attach
and underlying RPC layer using convenient POSIX primitives.
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1260
Change-Id: Ib5130b586a8b65ed5cf5f9156c111b161570224b
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
squash tens of warnings on padding of structs in afr structures.
The warnings were found by manually added '-Wpadded' to the GCC
command line.
Also made relevant structs and definitions static, where it
was applicable.
Change-Id: Ib71a7e9c6179378f072d796d11172d086c343e53
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were unconditionally cleaning up the grap when we get
child_down followed by parent_down. But this is prone to
race condition when some of the bricks are already disconnected.
In this case, even before the last child down is executed in the
client xlator code,we might have freed the graph. Because the
child_down event is alreadt recevied.
To fix this race, we have introduced a check to see if all client
xlator have cleared thier reconnect chain, and called the child_down
for last time.
Change-Id: I7d02813bc366dac733a836e0cd7b14a6fac52042
fixes: bz#1727329
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 59841f7e1ff0511b04884015441a181a56d07bea.
This revert is done as a 'possible' fix for frequent regression
failures, which are random in nature too (ie, different tests fails
in different runs).
Why exactly this patch? Because this patch seemed like most probable
candidate which got merged in last 15days, and after which regressions
are failing more often.
Updates: bz#1711827
Change-Id: I35333162fcd4064f9609525ca93c666053c6d959
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a connection failure happens, gluster tries to reconnect every 3
seconds. In some cases the failure is spurious, so a delay of 3 seconds
could be unnecessarily long.
This patch implements a back-off strategy that tries a reconnect as soon
as 1 tenth of a second. If this fails, the time is doubled until it's
around 3 seconds. After that, the reconnect is attempted every 3 seconds
as before.
Change-Id: Icb3fbe20d618f50cbbb599dce542b4e871c22149
Updates: bz#1193929
Signed-off-by: Xavier Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
rpc_clnt_disable() and rpc_clnt_disconnect() have same code.
Removed rpc_clnt_disconnect() and moved calls to rpc_clnt_disconnect()
to rpc_clnt_disable()
updates bz#1193929
Change-Id: I965f57cc1d5af36d266810125558b6f5e5f279d4
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libglusterfs devel package headers are referenced in code using
include semantics for a program, this while it works can be better
especially when dealing with out of tree xlator builds or in
general out of tree devel package usage.
Towards this, the following changes are done,
- moved all devel headers under a glusterfs directory
- Included these headers using system header notation <> in all
code outside of libglusterfs
- Included these headers using own program notation "" within
libglusterfs
This change although big, is just moving around the headers and
making it correct when including these headers from other sources.
This helps us correctly include libglusterfs includes without
namespace conflicts.
Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
|
|
|
|
|
|
|
| |
Make an effort to slightly better align the structures.
Change-Id: I6f80a451f2ffbf15adfb986cedc24c2799787b49
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Assorted code refactoring to reduce lock contention.
Also, took the opportunity to reorder structs more properly.
Removed dead code.
Hopefully, no functional changes.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I5de6124ad071fd5e2c31832364d602b5f6d6fe28
|
|
|
|
| |
Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The state management of "connected" in rpc is ad-hoc as far as the
responsibility goes. Note that there is nothing wrong with
functionality itself. rpc layer manages this state in disconnect
codepath and has exposed an api to manage this one from
consumers. Note that rpc layer never sets "connected" to true by
itself, which forces the consumers to use this api to get a working
rpc connection. The situation is best captured from a comment in code
from Jeff Darcy in glusterfsd/src/gf-attach.c:
-/*
- * In a sane world, the generic RPC layer would be capable of tracking
- * connection status by itself, with no help from us. It might invoke our
- * callback if we had registered one, but only to provide information. Sadly,
- * we don't live in that world. Instead, the callback *must* exist and *must*
- * call rpc_clnt_{set,unset}_connected, because that's the only way those
- * fields get set (with RPC both above and below us on the stack). If we don't
- * do that, then rpc_clnt_submit doesn't think we're connected even when we
- * are. It calls the socket code to reconnect, but the socket code tracks this
- * stuff in a sane way so it knows we're connected and returns EINPROGRESS.
- * Then we're stuck, connected but unable to use the connection. To make it
- * work, we define and register this trivial callback.
- */
Also, consumers of rpc know about state of connection only through the
notifications sent by rpc-clnt. So, consumers don't have any extra
information to manage the state and hence letting them manage the
state is counter intuitive. This patch cleans that up and instead
moves the responsibility of state management of rpc layer into
itself.
Change-Id: I31e641a60795fc480ca753917f4b2579f1e05094
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Fixes: bz#1585585
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have the following undefined symbol error from protocol/server.so:
glusterfs_mgmt_pmap_signout
glusterfs_autoscale_threads
See https://review.gluster.org/19225 (bz#1532238)
and https://review.gluster.org/19657 (bz#1550895)
(why are there two different bzs for the same bug?)
IMO this is a cleaner solution. I.e. moving the above two functions
to libgfrpc (.../rpc/rpc-lib/...)
I would also, for (foolish) consistency sake, like to see
glusterfs_mgmt_pmap_signin() moved from glusterfsd to libgfrpc as
well.
This works on f28/rawhide, with its new, more restrictive run-time
link semantics. The smoke and regression tests on earlier fedora and
centos will confirm that it works on those platforms too.
Change-Id: I9cfbd1cc15e7ebd9fc31b56ac791287fa2c584de
BUG: 1550895
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce another authentication header which can now send more data.
This is useful because this data can be common for all the fops, and
we don't need to change all the signatures.
As part of this, made rpc-clnt.c little more modular to support multiple
authentication structures.
stack.h changes are placeholder for the ctime etc, can be moved later
based on need.
updates #384
Change-Id: I6111c13cfd2ec92e2b4e9295896bf62a8a33b2c7
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In glusterfs code base we call mutex_lock/unlock to take
reference/dereference for a object.Sometime it could be
reason for lock contention also.
Solution: There is no need to use mutex to increase/decrease ref
counter, instead of using mutex use gcc builtin ATOMIC
operation.
Test: I have not observed yet how much performance gain after apply
this patch specific to glusterfs but i have tested same
with below small program(mutex and atomic both) and
get good difference.
static int numOuterLoops;
static void *
threadFunc(void *arg)
{
int j;
for (j = 0; j < numOuterLoops; j++) {
__atomic_add_fetch (&glob, 1,__ATOMIC_ACQ_REL);
}
return NULL;
}
int
main(int argc, char *argv[])
{
int opt, s, j;
int numThreads;
pthread_t *thread;
int verbose;
int64_t n = 0;
if (argc < 2 ) {
printf(" Please provide 2 args Num of threads && Outer Loop\n");
exit (-1);
}
numThreads = atoi(argv[1]);
numOuterLoops = atoi (argv[2]);
if (1) {
printf("\tthreads: %d; outer loops: %d;\n",
numThreads, numOuterLoops);
}
thread = calloc(numThreads, sizeof(pthread_t));
if (thread == NULL) {
printf ("calloc error so exit\n");
exit (-1);
}
__atomic_store (&glob, &n, __ATOMIC_RELEASE);
for (j = 0; j < numThreads; j++) {
s = pthread_create(&thread[j], NULL, threadFunc, NULL);
if (s != 0) {
printf ("pthread_create failed so exit\n");
exit (-1);
}
}
for (j = 0; j < numThreads; j++) {
s = pthread_join(thread[j], NULL);
if (s != 0) {
printf ("pthread_join failed so exit\n");
exit (-1);
}
}
printf("glob value is %ld\n",__atomic_load_n (&glob,__ATOMIC_RELAXED));
exit(0);
}
time ./thr_count 800 800000
threads: 800; outer loops: 800000;
glob value is 640000000
real 1m10.288s
user 0m57.269s
sys 3m31.565s
time ./thr_count_atomic 800 800000
threads: 800; outer loops: 800000;
glob value is 640000000
real 0m20.313s
user 1m20.558s
sys 0m0.028
Change-Id: Ie5030a52ea264875e002e108dd4b207b15ab7cc7
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rpc_clnt_submit() acquires conn->lock before call to
rpc_transport_submit_request() and subsequent queuing of frame into
saved_frames list. However, as part of handling RPC_TRANSPORT_MSG_RECEIVED
and RPC_TRANSPORT_MSG_SENT notifications in rpc_clnt_notify(), conn->lock
is again used to atomically update conn->last_received and conn->last_sent
event timestamps.
So when conn->lock is acquired as part of submitting a request,
a parallel POLLIN notification gets blocked at rpc layer until the request
submission completes and the lock is released.
To get around this, this patch call clock_gettime (instead to call gettimeofday)
to update event timestamps in conn->last_received and conn->last_sent and to
call clock_gettime don't need to call mutex_lock because it (clock_gettime)
is thread safe call.
Note: Run fio on vm after apply the patch, iops is improved after apply
the patch.
Change-Id: I347b5031d61c426b276bc5e07136a7172645d763
BUG: 1467614
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I57ad970411db1ccd3d2c56c504c7da9cc221051f
BUG: 1448692
Signed-off-by: Zhou Zhengping <johnzzpcrystal@gmail.com>
Reviewed-on: https://review.gluster.org/17198
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Halo Geo-replication is a feature which allows Gluster or NFS clients to write
locally to their region (as defined by a latency "halo" or threshold if you
like), and have their writes asynchronously propagate from their origin to the
rest of the cluster. Clients can also write synchronously to the cluster
simply by specifying a halo-latency which is very large (e.g. 10seconds) which
will include all bricks.
In other words, it allows clients to decide at mount time if they desire
synchronous or asynchronous IO into a cluster and the cluster can support both
of these modes to any number of clients simultaneously.
There are a few new volume options due to this feature:
halo-shd-latency: The threshold below which self-heal daemons will
consider children (bricks) connected.
halo-nfsd-latency: The threshold below which NFS daemons will consider
children (bricks) connected.
halo-latency: The threshold below which all other clients will
consider children (bricks) connected.
halo-min-replicas: The minimum number of replicas which are to
be enforced regardless of latency specified in the above 3 options.
If the number of children falls below this threshold the next
best (chosen by latency) shall be swapped in.
New FUSE mount options:
halo-latency & halo-min-replicas: As descripted above.
This feature combined with multi-threaded SHD support (D1271745) results in
some pretty cool geo-replication possibilities.
Operational Notes:
- Global consistency is gaurenteed for synchronous clients, this is provided by
the existing entry-locking mechanism.
- Asynchronous clients on the other hand and merely consistent to their region.
Writes & deletes will be protected via entry-locks as usual preventing
concurrent writes into files which are undergoing replication. Read operations
on the other hand should never block.
- Writes are allowed from _any_ region and propagated from the origin to all
other regions. The take away from this is care should be taken to ensure
multiple writers do not write the same files resulting in a gfid split-brain
which will require resolution via split-brain policies (majority, mtime &
size). Recommended method for preventing this is using the nfs-auth feature to
define which region for each share has RW permissions, tiers not in the origin
region should have RO perms.
TODO:
- Synchronous clients (including the SHD) should choose clients from their own
region as preferred sources for reads. Most of the plumbing is in place for
this via the child_latency array.
- Better GFID split brain handling & better dent type split brain handling
(i.e. create a trash can and move the offending files into it).
- Tagging in addition to latency as a means of defining which children you wish
to synchronously write to
Test Plan:
- The usual suspects, clang, gcc w/ address sanitizer & valgrind
- Prove tests
Reviewers: jackl, dph, cjh, meyering
Reviewed By: meyering
Subscribers: ethanr
Differential Revision: https://phabricator.fb.com/D1272053
Tasks: 4117827
Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1
BUG: 1428061
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16099
Reviewed-on: https://review.gluster.org/16177
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Locking during notify was introduced as part of commit
aa22f24f5db7659387704998ae01520708869873 [1]. The fix was introduced
to fix out-of-order CONNECT/DISCONNECT events from rpc-clnt to parent
xlators [2]. However as part of handling DISCONNECT protocol/client
does unwind saved frames (with failure) waiting for responses. This
saved_frames_unwind can be a costly operation and hence ideally
shouldn't be included in the critical section of notifylock, as it
unnecessarily delays the reconnection to same brick. Also, its not a
good practise to pass control to other xlators holding a lock as it
can lead to deadlocks. So, this patch removes locking in rpc-clnt
while notifying parent xlators.
To fix [2], two changes are present in this patch:
* notify DISCONNECT before cleaning up rpc connection (same as commit
a6b63e11b7758cf1bfcb6798, patch [3]).
* protocol/client uses rpc_clnt_cleanup_and_start, which cleans up rpc
connection and does a start while handling a DISCONNECT event from
rpc. Note that patch [3] was reverted as rpc_clnt_start called in
quick_reconnect path of protocol/client didn't invoke connect on
transport as the connection was not cleaned up _yet_ (as cleanup was
moved post notification in rpc-clnt). This resulted in clients never
attempting connect to bricks.
Note that one of the neater ways to fix [2] (without using locks) is
to introduce generation numbers to map CONNECT and DISCONNECTS across
epochs and ignore DISCONNECT events if they don't belong to current
epoch. However, this approach is a bit complex to implement and
requires time. So, current patch is a hacky stop-gap fix till we come
up with a more cleaner solution.
[1] http://review.gluster.org/15916
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1386626
[3] http://review.gluster.org/15681
Change-Id: I62daeee8bb1430004e28558f6eb133efd4ccf418
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
BUG: 1427012
Reviewed-on: https://review.gluster.org/16784
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Milind Changire <mchangir@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for multiple brick translator stacks running
in a single brick server process. This reduces our per-brick memory usage by
approximately 3x, and our appetite for TCP ports even more. It also creates
potential to avoid process/thread thrashing, and to improve QoS by scheduling
more carefully across the bricks, but realizing that potential will require
further work.
Multiplexing is controlled by the "cluster.brick-multiplex" global option. By
default it's off, and bricks are started in separate processes as before. If
multiplexing is enabled, then *compatible* bricks (mostly those with the same
transport options) will be started in the same process.
Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
BUG: 1385758
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: https://review.gluster.org/14763
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible that the notification thread which notifies
protocol/client layer about the disconnection is put to sleep
and meanwhile, a fuse thread or a timer thread initiates and
completes reconnection to the brick. The notification thread
is then woken up and protocol/client layer updates its flags
to indicate that network is disconnected. No reconnection is
initiated because reconnection is rpc-lib layer's responsibility
and its flags indicate that connection is connected.
Fix: Serialize connect and disconnect notify
Credit: Raghavendra Talur <rtalur@redhat.com>
Change-Id: I8ff5d1a3283b47f5c26848a42016a40bc34ffc1d
BUG: 1386626
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-on: http://review.gluster.org/15916
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PROBLEM:
1. Freeing up rpc_clnt object might lead to crashes. Well,
it was not a necessity to free rpc-clnt object till now
because all the existing use cases needs to reconnect
back on disconnects. Hence timer code was not taking
ref on rpc-clnt object.
Glusterd had some use-cases that led to crash due to
ping-timer and they fixed only those code paths that
involve ping-timer.
Now, since changelog has an use-case where rpc-clnt
need to be freed up, we need to fix timer code to take
refs
2. In changelog, because of issue 1, only mydata was being
freed which is incorrect. And there are races where
rpc-clnt object would access the freed mydata which
would lead to crashes.
Since changelog xlator resides on brick side and is long
living process, if multiple libgfchangelog consumers
register to changelog and disconnect/reconnect mulitple
times, it would result in leak of 'rpc-clnt' object
for every connect/disconnect.
SOLUTION:
1. Handle ref/unref of 'rpc_clnt' structure in timer
functions properly.
2. In changelog, unref 'rpc_clnt' in RPC_CLNT_DISCONNECT
after disabling timers and free mydata on RPC_CLNT_DESTROY.
RPC SETUP IN CHANGELOG:
1. changelog xlator initiates rpc server say 'changelog_rpc_server'
2. libgfchangelog initiates one rpc server say 'libgfchangelog_rpc_server'
3. libgfchangelog initiates rpc client and connects to 'changelog_rpc_server'
4. In return changelog_rpc_server initiates a rpc client and connects back
to 'libgfchangelog_rpc_server'
REF/UNREF HANDLING IN TIMER FUNCTIONS:
Let's say rpc clnt refcount = 1
1. Take the ref before reigstering callback to timer queue
>>>> rpc_clnt_ref (say ref count becomes = 2)
2. Register a callback to timer say 'callback1'
3. If register fails:
>>>> rpc_clnt_unref (ref count = 1)
4. On timer expiration, 'callback1' gets called. So unref rpc clnt at the end
in 'callback1'. This is corresponding to ref taken in step 1
>>>> rpc_clnt_unref (ref count = 1)
5. The cycle from step-1 to step-4 continues....until timer cancel event happens
6. timer cancel of say 'callback1'
If timer cancel fails:
Do nothing, Step-4 would have unrefd
If timer cancel succeeds:
>>>> rpc_clnt_unref (ref count = 1)
Change-Id: I91389bc511b8b1a17824941970ee8d2c29a74a09
BUG: 1316178
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/13658
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a peer rpc disconnect event has been already processed, skip the furthers as
processing them are overheads and sometimes may lead to a crash like due to a
double free
Change-Id: Iec589ce85daf28fd5b267cb6fc82a4238e0e8adc
BUG: 1318546
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/13790
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The @owner argument tells RPC layer the xlator that owns
the connection and to which xlator THIS needs be set during
network notifications like CONNECT and DISCONNECT.
Code paths that originate from the head of a (volume) graph and use
STACK_WIND ensure that the RPC local endpoint has the right xlator saved
in the frame of the call (callback pair). This guarantees that the
callback is executed in the right xlator context.
The client handshake process which includes fetching of brick ports from
glusterd, setting lk-version on the brick for the session, don't have
the correct xlator set in their frames. The problem lies with RPC
notifications. It doesn't have the provision to set THIS with the xlator
that is registered with the corresponding RPC programs. e.g,
RPC_CLNT_CONNECT event received by protocol/client doesn't have THIS set
to its xlator. This implies, call(-callbacks) originating from this
thread don't have the right xlator set too.
The fix would be to save the xlator registered with the RPC connection
during rpc_clnt_new. e.g, protocol/client's xlator would be saved with
the RPC connection that it 'owns'. RPC notifications such as CONNECT,
DISCONNECT, etc inherit THIS from the RPC connection's xlator.
Change-Id: I9dea2c35378c511d800ef58f7fa2ea5552f2c409
BUG: 1235582
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/11436
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Idd94adb0457aaffce7330f56f98cebafa2c4dae8
BUG: 1249499
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/11818
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch ports nfs, shd, quotad & snapd with the approach suggested in
http://www.gluster.org/pipermail/gluster-devel/2014-December/043180.html
Change-Id: I4ea5b38793f87fc85cc9d2cf873727351dedffd2
BUG: 1191486
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/9428
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Nekkunti <anekkunt@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This can be seen as below,
># cat $META/graphs/active/vol-client-0/private |grep ping_msgs_sent
ping_msgs_sent = 2
># cat $META/graphs/active/vol-client-0/private |grep "^msgs_sent"
msgs_sent = 13
where $META is /<fuse-mountpt>/.meta
Change-Id: I2107ec2b045bac701377760635e18758adb943a3
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/8285
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch refactors the existing client ping timer implementation, and makes
use of the common code for implementing both client ping timer and the
glusterd ping timer.
A new gluster rpc program for ping is introduced. The ping timer is only
started for peers that have this new program. The deafult glusterd ping
timeout is 30 seconds. It is configurable by setting the option
'ping-timeout' in glusterd.vol .
Also, this patch introduces changes in the glusterd-handshake path. The client
programs for a peer are now set in the callback of dump_versions, for both
the older handshake and the newer op-version handshake. This is the only place
in the handshake process where we know what programs a peer supports.
Change-Id: I035815ac13449ca47080ecc3253c0a9afbe9016a
BUG: 1038261
Signed-off-by: Vijaikumar M <vmallika@redhat.com>
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5202
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rpc_clnt object is destroyed after the corresponding transport object is
destroyed. But rpc_clnt_reconnect, a timer driven function, refers to
the transport object beyond its 'life'. Instead, using the embedded
connection object prevents use after free problem wrt transport object.
Also, access transport object under conn->lock.
Change-Id: Iae28e8a657d02689963c510114ad7cb7e6764e62
BUG: 962619
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/6751
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rpc:
- On a RPC_TRANSPORT_CLEANUP event, rpc_clnt_notify calls the registered
notifyfn with a RPC_CLNT_DESTROY event. The notifyfn should properly
cleanup the saved mydata on this event.
- Break the reconnect chain when an rpc client is disabled. This will
prevent new disconnect events which can lead to crashes.
glusterd:
- Added support for RPC_CLNT_DESTROY in glusterd_brick_rpc_notify
- Use a common glusterd_rpc_clnt_unref() function throught glusterd in
place of rpc_clnt_unref(). This function correctly gives up the
big-lock before performing the unref.
Change-Id: I93230441c5089039643fc9f5632477ef1b695348
BUG: 962619
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/5512
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rpc_transport object should be alive as long as the rpc_clnt object is
alive. To ensure this, on rpc_clnt's last unref, we cleanup the
corresponding rpc_transport object and complete the rpc_clnt cleanup
later, in a bottom-up fashion.
Introduced rpc_clnt_is_disabled, to allow higher layers to differentiate
between the 'final'[1] disconnect triggered from upper layers, and a
normal disconnect. This differentiation helps in cleaning up resources,
at higher layers, in a race-free manner.
[1] - 'final' here means that the rpc and the associated connection, is
not going to be used anymore. eg - glusterd_brick_disconnect on
volume-stop.
Change-Id: I2ecf891a36e3b02cd9eacca964e659525d1bbc6e
BUG: 962619
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5107
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change removes the asymmetry in the 'layer' (read rpc, transport
etc) in which transport options were being filled for inet and unix
sockets.
Change-Id: Iaa080691fd5e4c3baedffa97e9c3f16642c1fc12
BUG: 955919
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4850
Reviewed-by: Raghavendra G <raghavendra@gluster.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit efa154bb0a4cac34d5a9610ec25d38eebe495f22.
-- Following is Avati's analysis (edited) from gerrit --
The claim of the patch (being reverted) is that it in some cases cbkfn
is missed. This is wrong analysis. cbk_fn is _always_ called. The patch
treats ret > 0 as a "missed cbk". ret > 0 only means socket submission
was not complete, and is queued to submit asynchronously when POLLOUT is
raised. This is sufficient to guarantee that cbkfn is going to be
called (either the socket errors or submission succeeds and reply
eventually arrives).
This commit also removes spurious barrier_wake(s).
call backs are guaranteed to be called even if the transport is
disconnected. This means, a 'wake' would be called if rpc_clnt_submit is
called. Also, we count both successful and failed operations in a
particular batch of operations for the synctask_barrier_wait. So,
calling synctask_barrier_wake on failure of rpc_clnt_submit (say, due to
network failure) would result in a spurious wake.
Change-Id: I7d508c2a54b74a65b82f097742206bc777afc53a
BUG: 948686
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4922
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glusterd syncops perform a barrier_wake whenever rpc_clnt_submit returned -1.
This is based on the wrong assumption that the cbkfn wasn't called.
This would result in one more wakeup than there ought to be.
Change-Id: I591e67c267f0e26d1145bf8fb5feeb2c13a751a1
BUG: 948686
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/4802
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reduce frame-timeout for glusterd connections from 30mins to 10 mins. 30mins is
too long when compared to cli timeout of 2mins. Changing to 10mins reduces the
disparity between cli and glusterd.
Also, fix glusterfs_submit_reply() so that a reply is sent even if serialize
failed.
Change-Id: Id5f68f2ff28ea7453d9a62429fe12aa0c0a66952
BUG: 843003
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.com/3803
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Need to differentiate the callback functions based on which
rpc-clnt the callback is received. without it, all callback
actor handling will be like global.
BUG: 839345
Change-Id: Ide024f5585eab3c5fe6c3b33250772fb6e8ad655
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.com/3656
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note that the license was not changed in any of the following:
.../argp-standalone/...
.../booster/...
.../cli/...
.../contrib/...
.../extras/...
.../glusterfsd/...
.../glusterfs-hadoop/...
.../mod_clusterfs/...
.../scheduler/...
.../swift/...
The license was not changed in any of the non-building xlators. The
license was not changed in any of the xlators that seemed — to me — to
be clearly server-side only, e.g. protocol/server
Note too that copyright was changed along with the license; I did
not change the copyright in files where the license did not change.
If you find any errors or ommissions please don't hesitate to let me know.
The complete list of files with the license change is:
libglusterfs/src/byte-order.h
libglusterfs/src/call-stub.c
libglusterfs/src/call-stub.h
libglusterfs/src/checksum.c
libglusterfs/src/checksum.h
libglusterfs/src/circ-buff.c
libglusterfs/src/circ-buff.h
libglusterfs/src/common-utils.c
libglusterfs/src/common-utils.h
libglusterfs/src/compat-errno.c
libglusterfs/src/compat-errno.h
libglusterfs/src/compat.c
libglusterfs/src/compat.h
libglusterfs/src/daemon.c
libglusterfs/src/daemon.h
libglusterfs/src/defaults.c
libglusterfs/src/defaults.h
libglusterfs/src/dict.c
libglusterfs/src/dict.h
libglusterfs/src/event-history.c
libglusterfs/src/event-history.h
libglusterfs/src/event.c
libglusterfs/src/event.h
libglusterfs/src/fd-lk.c
libglusterfs/src/fd-lk.h
libglusterfs/src/fd.c
libglusterfs/src/fd.h
libglusterfs/src/gf-dirent.c
libglusterfs/src/gf-dirent.h
libglusterfs/src/globals.c
libglusterfs/src/globals.h
libglusterfs/src/glusterfs.h
libglusterfs/src/graph-print.c
libglusterfs/src/graph-utils.h
libglusterfs/src/graph.c
libglusterfs/src/hashfn.c
libglusterfs/src/hashfn.h
libglusterfs/src/iatt.h
libglusterfs/src/inode.c
libglusterfs/src/inode.h
libglusterfs/src/iobuf.c
libglusterfs/src/iobuf.h
libglusterfs/src/latency.c
libglusterfs/src/latency.h
libglusterfs/src/list.h
libglusterfs/src/lkowner.h
libglusterfs/src/locking.h
libglusterfs/src/logging.c
libglusterfs/src/logging.h
libglusterfs/src/mem-pool.c
libglusterfs/src/mem-pool.h
libglusterfs/src/mem-types.h
libglusterfs/src/options.c
libglusterfs/src/options.h
libglusterfs/src/rbthash.c
libglusterfs/src/rbthash.h
libglusterfs/src/run.c
libglusterfs/src/run.h
libglusterfs/src/scheduler.c
libglusterfs/src/scheduler.h
libglusterfs/src/stack.c
libglusterfs/src/stack.h
libglusterfs/src/statedump.c
libglusterfs/src/statedump.h
libglusterfs/src/syncop.c
libglusterfs/src/syncop.h
libglusterfs/src/syscall.c
libglusterfs/src/syscall.h
libglusterfs/src/timer.c
libglusterfs/src/timer.h
libglusterfs/src/trie.c
libglusterfs/src/trie.h
libglusterfs/src/xlator.c
libglusterfs/src/xlator.h
libglusterfsclient/src/libglusterfsclient-dentry.c
libglusterfsclient/src/libglusterfsclient-internals.h
libglusterfsclient/src/libglusterfsclient.c
libglusterfsclient/src/libglusterfsclient.h
rpc/rpc-lib/src/auth-glusterfs.c
rpc/rpc-lib/src/auth-null.c
rpc/rpc-lib/src/auth-unix.c
rpc/rpc-lib/src/protocol-common.h
rpc/rpc-lib/src/rpc-clnt.c
rpc/rpc-lib/src/rpc-clnt.h
rpc/rpc-lib/src/rpc-transport.c
rpc/rpc-lib/src/rpc-transport.h
rpc/rpc-lib/src/rpcsvc-auth.c
rpc/rpc-lib/src/rpcsvc-common.h
rpc/rpc-lib/src/rpcsvc.c
rpc/rpc-lib/src/rpcsvc.h
rpc/rpc-lib/src/xdr-common.h
rpc/rpc-lib/src/xdr-rpc.c
rpc/rpc-lib/src/xdr-rpc.h
rpc/rpc-lib/src/xdr-rpcclnt.c
rpc/rpc-lib/src/xdr-rpcclnt.h
rpc/rpc-transport/rdma/src/name.c
rpc/rpc-transport/rdma/src/name.h
rpc/rpc-transport/rdma/src/rdma.c
rpc/rpc-transport/rdma/src/rdma.h
rpc/rpc-transport/socket/src/name.c
rpc/rpc-transport/socket/src/name.h
rpc/rpc-transport/socket/src/socket.c
rpc/rpc-transport/socket/src/socket.h
xlators/cluster/afr/src/afr-common.c
xlators/cluster/afr/src/afr-dir-read.c
xlators/cluster/afr/src/afr-dir-read.h
xlators/cluster/afr/src/afr-dir-write.c
xlators/cluster/afr/src/afr-dir-write.h
xlators/cluster/afr/src/afr-inode-read.c
xlators/cluster/afr/src/afr-inode-read.h
xlators/cluster/afr/src/afr-inode-write.c
xlators/cluster/afr/src/afr-inode-write.h
xlators/cluster/afr/src/afr-lk-common.c
xlators/cluster/afr/src/afr-mem-types.h
xlators/cluster/afr/src/afr-open.c
xlators/cluster/afr/src/afr-self-heal-algorithm.c
xlators/cluster/afr/src/afr-self-heal-algorithm.h
xlators/cluster/afr/src/afr-self-heal-common.c
xlators/cluster/afr/src/afr-self-heal-common.h
xlators/cluster/afr/src/afr-self-heal-data.c
xlators/cluster/afr/src/afr-self-heal-entry.c
xlators/cluster/afr/src/afr-self-heal-metadata.c
xlators/cluster/afr/src/afr-self-heal.h
xlators/cluster/afr/src/afr-self-heald.c
xlators/cluster/afr/src/afr-self-heald.h
xlators/cluster/afr/src/afr-transaction.c
xlators/cluster/afr/src/afr-transaction.h
xlators/cluster/afr/src/afr.c
xlators/cluster/afr/src/afr.h
xlators/cluster/afr/src/pump.c
xlators/cluster/afr/src/pump.h
xlators/cluster/dht/src/dht-common.c
xlators/cluster/dht/src/dht-common.h
xlators/cluster/dht/src/dht-diskusage.c
xlators/cluster/dht/src/dht-hashfn.c
xlators/cluster/dht/src/dht-helper.c
xlators/cluster/dht/src/dht-inode-read.c
xlators/cluster/dht/src/dht-inode-write.c
xlators/cluster/dht/src/dht-layout.c
xlators/cluster/dht/src/dht-linkfile.c
xlators/cluster/dht/src/dht-mem-types.h
xlators/cluster/dht/src/dht-rebalance.c
xlators/cluster/dht/src/dht-rename.c
xlators/cluster/dht/src/dht-selfheal.c
xlators/cluster/dht/src/dht.c
xlators/cluster/dht/src/nufa.c
xlators/cluster/dht/src/switch.c
xlators/cluster/stripe/src/stripe-helpers.c
xlators/cluster/stripe/src/stripe-mem-types.h
xlators/cluster/stripe/src/stripe.c
xlators/cluster/stripe/src/stripe.h
xlators/features/index/src/index-mem-types.h ¹
xlators/features/index/src/index.c ¹
xlators/features/index/src/index.h ¹
xlators/performance/io-cache/src/io-cache.c
xlators/performance/io-cache/src/io-cache.h
xlators/performance/io-cache/src/ioc-inode.c
xlators/performance/io-cache/src/ioc-mem-types.h
xlators/performance/io-cache/src/page.c
xlators/performance/io-threads/src/io-threads.c
xlators/performance/io-threads/src/io-threads.h
xlators/performance/io-threads/src/iot-mem-types.h
xlators/performance/md-cache/src/md-cache-mem-types.h
xlators/performance/md-cache/src/md-cache.c
xlators/performance/quick-read/src/quick-read-mem-types.h
xlators/performance/quick-read/src/quick-read.c
xlators/performance/quick-read/src/quick-read.h
xlators/performance/read-ahead/src/page.c
xlators/performance/read-ahead/src/read-ahead-mem-types.h
xlators/performance/read-ahead/src/read-ahead.c
xlators/performance/read-ahead/src/read-ahead.h
xlators/performance/symlink-cache/src/symlink-cache.c
xlators/performance/write-behind/src/write-behind-mem-types.h
xlators/performance/write-behind/src/write-behind.c
xlators/protocol/auth/addr/src/addr.c ¹
xlators/protocol/auth/login/src/login.c ¹
xlators/protocol/client/src/client-callback.c
xlators/protocol/client/src/client-handshake.c
xlators/protocol/client/src/client-helpers.c
xlators/protocol/client/src/client-lk.c
xlators/protocol/client/src/client-mem-types.h
xlators/protocol/client/src/client.c
xlators/protocol/client/src/client.h
xlators/protocol/client/src/client3_1-fops.c
¹ Copyright only, license reverted to original
Change-Id: If560e826c61b6b26f8b9af7bed6e4bcbaeba31a8
BUG: 820551
Signed-off-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.com/3304
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vijay@gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
noticed that there are possibilities where one would like to do a
connection_cleanup() before destroying a RPC connection itself, also
current code is such that, rpc_clnt_connection_cleanup() does
rpc_clnt_ref() and unref(), creating a race window/double unref
possibilities in the code.
by separating out the functions, this race window/double fault can be
prevented.
Change-Id: I7ebd3392efa891232857b6db9108b0b19e40fc12
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 802403
Reviewed-on: http://review.gluster.com/2979
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vijay@gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* while creating 'rpc_clnt', the caller knows what would be the ideal
load on it, so an extra argument to set some pool sizes
* while creating 'rpcsvc', the caller knows what would be the ideal
load of it, so an extra argument to set request pool size
* cli memory footprint is reduced
Change-Id: Ie245216525b450e3373ef55b654b4cd30741347f
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 765336
Reviewed-on: http://review.gluster.com/2784
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vijay@gluster.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I4644e944bad4d240d16de47786b9fa277333dba4
BUG: 767862
Signed-off-by: Raghavendra G <raghavendra@gluster.com>
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: http://review.gluster.com/2735
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vijay@gluster.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ic31b8bb10a28408da2a623f4ecc0c60af01c64af
BUG: 795421
Signed-off-by: Krishna Srinivas <ksriniva@redhat.com>
Reviewed-on: http://review.gluster.com/2711
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
so, NLM can send the lk-owner field directly to the locks translators,
while doing the same effort, also enabled sending maximum of 500 aux gid
over protocol.
Change-Id: I87c2514392748416f7ffe21d5154faad2e413969
Signed-off-by: Amar Tumballi <amar@gluster.com>
BUG: 767229
Reviewed-on: http://review.gluster.com/779
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of today, all fops bail out after 30 mins by default. This is not desirable
with lock fops since it could be 'rightfully blocked' for a really long time.
But the client would assume that the lock fop has 'expired' after 30 mins and
clean up its references to the locks. Later when the locks xlator decides to
grant it, we are left with an orphan (stale) lock .
This fix exempts lock fops from timeouts. Only on network disruptions both
client and server decide to 'clean up' the locks held.
Change-Id: If1d74ba21113650b976728e9b764c551a0a49e59
BUG: 3617
Reviewed-on: http://review.gluster.com/478
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amar@gluster.com>
|
|
|
|
|
|
|
|
| |
Change-Id: I2d10f2be44f518f496427f257988f1858e888084
BUG: 3348
Reviewed-on: http://review.gluster.com/200
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@gluster.com>
|
|
|
|
|
|
|
|
| |
Change-Id: I3914467611e573cccee0d22df93920cf1b2eb79f
BUG: 3348
Reviewed-on: http://review.gluster.com/182
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@gluster.com>
|
|
|
|
|
|
|
|
| |
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
BUG: 1965 (need a cmd to get io-stat details)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1965
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rpc_clnt_new() creates a new passive rpc clnt object.
rpc_clnt_start() triggers the connection/reconnection from the object.
rpc_clnt_init() - the old way - still works.
Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 2078 (Volume Migration is not working)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2078
|
|
|
|
|
|
|
|
| |
Signed-off-by: shishir gowda <shishirng@gluster.com>
Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
BUG: 1990 (Gluster mainline build on solaris fails with errors)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1990
|
|
|
|
|
|
|
|
|
|
|
|
| |
needed because, a RPC disconnect doesn't mean that a RPC transport/listener
is dead. With this, the race in server protocol cleaning up the lock table /
fd table when some frames are in transit will be handled properly.
Signed-off-by: Amar Tumballi <amar@gluster.com>
Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
BUG: 1843 ()
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1843
|
|
|
|
|
|
|
|
| |
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
BUG: 1594 (make probe oneway)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1594
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- buffer containing authdata pointed by rpc-request was allocated on stack of
procedure rpc_clnt_fill_request, but was being used as source for xdr-encoding
in rpc_clnt_record_build_record. Hence by the time auth-data is being copied
during encoding of request, it might've been freed and hence contain garbage.
Signed-off-by: Raghavendra G <raghavendra@gluster.com>
Signed-off-by: Vijay Bellur <vijay@dev.gluster.com>
BUG: 875 (Implement a new protocol to provide proper backward/forward compatibility)
URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=875
|