| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With https://review.gluster.org/#/c/12363/ being merged, we no longer
send client's lk-version to server side and the corresponding check on
server is also removed. But when clients are upgraded prior to servers,
the check for lk-version at server side fails and is reported back to
clients resulting in disconnection.
Since we don't have lock-recovery (lk-version and grace-timeout) logic
anymore in code base our best bet would be to add client's default
lk-version i.e, 1, into the dictionary just to make server side check
pass and continue with remaining SETVOLUME operations.
Change-Id: I441b67bd271d1e9ba9a7c08703e651c7a6bd945b
BUG: 1544699
Signed-off-by: Anoop C S <anoopcs@redhat.com>
|
|
|
|
|
|
|
| |
updates #384
Change-Id: Id80bf470988dbecc69779de9eb64088559cb1f6a
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
| |
Change-Id: I27f5e1e34fe3eac96c7dd88e90753fb5d3d14550
BUG: 1272030
Signed-off-by: Anoop C S <anoopcs@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce another authentication header which can now send more data.
This is useful because this data can be common for all the fops, and
we don't need to change all the signatures.
As part of this, made rpc-clnt.c little more modular to support multiple
authentication structures.
stack.h changes are placeholder for the ctime etc, can be moved later
based on need.
updates #384
Change-Id: I6111c13cfd2ec92e2b4e9295896bf62a8a33b2c7
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current use of a per-client mutex to protect fdctx introduces lock
contentions when there are dozens of file operations active.
Use finer grain spinlock to reduce contention, and put retrieving
fdctx out of lock.
Change-Id: Iea3e2eb481e76a5d73a582ba81529180c5b88248
BUG: 1519598
Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In glusterfs code base we call mutex_lock/unlock to take
reference/dereference for a object.Sometime it could be
reason for lock contention also.
Solution: There is no need to use mutex to increase/decrease ref
counter, instead of using mutex use gcc builtin ATOMIC
operation.
Test: I have not observed yet how much performance gain after apply
this patch specific to glusterfs but i have tested same
with below small program(mutex and atomic both) and
get good difference.
static int numOuterLoops;
static void *
threadFunc(void *arg)
{
int j;
for (j = 0; j < numOuterLoops; j++) {
__atomic_add_fetch (&glob, 1,__ATOMIC_ACQ_REL);
}
return NULL;
}
int
main(int argc, char *argv[])
{
int opt, s, j;
int numThreads;
pthread_t *thread;
int verbose;
int64_t n = 0;
if (argc < 2 ) {
printf(" Please provide 2 args Num of threads && Outer Loop\n");
exit (-1);
}
numThreads = atoi(argv[1]);
numOuterLoops = atoi (argv[2]);
if (1) {
printf("\tthreads: %d; outer loops: %d;\n",
numThreads, numOuterLoops);
}
thread = calloc(numThreads, sizeof(pthread_t));
if (thread == NULL) {
printf ("calloc error so exit\n");
exit (-1);
}
__atomic_store (&glob, &n, __ATOMIC_RELEASE);
for (j = 0; j < numThreads; j++) {
s = pthread_create(&thread[j], NULL, threadFunc, NULL);
if (s != 0) {
printf ("pthread_create failed so exit\n");
exit (-1);
}
}
for (j = 0; j < numThreads; j++) {
s = pthread_join(thread[j], NULL);
if (s != 0) {
printf ("pthread_join failed so exit\n");
exit (-1);
}
}
printf("glob value is %ld\n",__atomic_load_n (&glob,__ATOMIC_RELAXED));
exit(0);
}
time ./thr_count 800 800000
threads: 800; outer loops: 800000;
glob value is 640000000
real 1m10.288s
user 0m57.269s
sys 3m31.565s
time ./thr_count_atomic 800 800000
threads: 800; outer loops: 800000;
glob value is 640000000
real 0m20.313s
user 1m20.558s
sys 0m0.028
Change-Id: Ie5030a52ea264875e002e108dd4b207b15ab7cc7
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Today the main users of client uuid are protocol layers, locks, leases.
Protocol layers requires each client uuid to be unique, even across
connects and disconnects. Locks and leases on the server side also use
the same client uid which changes across file migrations. Which makes the graph
switch and file migration tedious for locks and leases.
file migration across bricks becomes difficult as client uuid for the same
client, is different on the other brick.
The exact set of issues exists for leases as well.
Solution would be to introduce a constant in the client-uid string which
the locks and leases can use to identify the owner client across bricks.
Client uuid currently:
%s(ctx uuid)-%s(protocol client name)-%d(graph id)%s(setvolume count/reconnect count)
Proposed Client uuid:
"CTX_ID:%s-GRAPH_ID:%d-PID:%d-HOST:%s-PC_NAME:%s-RECON_NO:%s"
- CTX_ID: This is will be constant per client.
- GRAPH_ID, PID, HOST, PC_NAME(protocol client name), RECON_NO(setvolume count)
remains the same.
Change-Id: Ia81d57a9693207cd325d7b26aee4593fcbd6482c
BUG: 1369028
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
* xdr: add gfid to on wire format for fsetattr/rchecksum
* as it is change in on wire XDR format, needed backward
compatible RPC programs.
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 827334
Change-Id: Id0a2da3632516dc1a5560dde2b151b2e5f0be8e5
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reduces the space used from four bytes to one, and allows
new code to use familiar C99 types/values interoperably with our
old cruft. It does *not* change current declarations or code;
that will be left for a separate - much larger - patch.
Updates: #80
Change-Id: I5baedd17d3fb05b38f0d8b8bb9dd62824475842e
Signed-off-by: Jeff Darcy <jdarcy@fb.com>
|
|
|
|
|
|
|
|
|
| |
There should be different way we handle handshake in case of subdir
mount for the first time, and in case of subsequent graph changes.
Change-Id: I2a7ba836433bb0a0f4a861809e2bb0d7fbc4da54
BUG: 1505323
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: currently we can't identify which process is running and
how many instances of it are available.
Fix: name the process when its spawned and send it to the server
and save it in the client_t
The processes that abide by this change from this patch are:
1) fuse mount,
2) rebalance,
3) selfheal,
4) tier,
5) quota,
6) snapshot,
7) brick.
8) gfapi (by default. gfapi.<processname> if processname is found)
Note: fuse gets a process name as native-fuse-client by default.
If the user gives a name for the fuse and spawns it, it will be of
this type --process-name native-fuse-client.<name_specified>.
This can be made use by the process like aux mount done by quota,
geo-rep, etc by adding another option in the aux mount " -o
process-name=gsync_mount"
Updates: #178
Signed-off-by: hari gowtham <hgowtham@redhat.com>
Change-Id: Ie4d02257216839338043737691753bab9a974d5e
Reviewed-on: https://review.gluster.org/17957
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: hari gowtham <hari.gowtham005@gmail.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changes:
1. Take subdir mount option in client (mount.gluster / glusterfsd)
2. Pass the subdir mount to server-handshake (from client-handshake)
3. Handle subdir-mount dir's lookup in server-first-lookup and handle
all fops resolution accordingly with proper gfid of subdir
4. Change the auth/addr module to handle the multiple subdir entries
in option, and valid parsing.
How to use the feature:
`# mount -t glusterfs $hostname:/$volname/$subdir /$mount_point`
Or
`# mount -t glusterfs $hostname:/$volname -osubdir_mount=$subdir /$mount_point`
Option can be set like:
`# gluster volume set <volname> auth.allow "/subdir1(192.168.1.*),/(192.168.10.*),/subdir2(192.168.8.*)"`
Updates #175
Change-Id: I7ea57f76ddbe6c3862cfe02e13f89e8a39719e11
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Reviewed-on: https://review.gluster.org/17141
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Halo Geo-replication is a feature which allows Gluster or NFS clients to write
locally to their region (as defined by a latency "halo" or threshold if you
like), and have their writes asynchronously propagate from their origin to the
rest of the cluster. Clients can also write synchronously to the cluster
simply by specifying a halo-latency which is very large (e.g. 10seconds) which
will include all bricks.
In other words, it allows clients to decide at mount time if they desire
synchronous or asynchronous IO into a cluster and the cluster can support both
of these modes to any number of clients simultaneously.
There are a few new volume options due to this feature:
halo-shd-latency: The threshold below which self-heal daemons will
consider children (bricks) connected.
halo-nfsd-latency: The threshold below which NFS daemons will consider
children (bricks) connected.
halo-latency: The threshold below which all other clients will
consider children (bricks) connected.
halo-min-replicas: The minimum number of replicas which are to
be enforced regardless of latency specified in the above 3 options.
If the number of children falls below this threshold the next
best (chosen by latency) shall be swapped in.
New FUSE mount options:
halo-latency & halo-min-replicas: As descripted above.
This feature combined with multi-threaded SHD support (D1271745) results in
some pretty cool geo-replication possibilities.
Operational Notes:
- Global consistency is gaurenteed for synchronous clients, this is provided by
the existing entry-locking mechanism.
- Asynchronous clients on the other hand and merely consistent to their region.
Writes & deletes will be protected via entry-locks as usual preventing
concurrent writes into files which are undergoing replication. Read operations
on the other hand should never block.
- Writes are allowed from _any_ region and propagated from the origin to all
other regions. The take away from this is care should be taken to ensure
multiple writers do not write the same files resulting in a gfid split-brain
which will require resolution via split-brain policies (majority, mtime &
size). Recommended method for preventing this is using the nfs-auth feature to
define which region for each share has RW permissions, tiers not in the origin
region should have RO perms.
TODO:
- Synchronous clients (including the SHD) should choose clients from their own
region as preferred sources for reads. Most of the plumbing is in place for
this via the child_latency array.
- Better GFID split brain handling & better dent type split brain handling
(i.e. create a trash can and move the offending files into it).
- Tagging in addition to latency as a means of defining which children you wish
to synchronously write to
Test Plan:
- The usual suspects, clang, gcc w/ address sanitizer & valgrind
- Prove tests
Reviewers: jackl, dph, cjh, meyering
Reviewed By: meyering
Subscribers: ethanr
Differential Revision: https://phabricator.fb.com/D1272053
Tasks: 4117827
Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1
BUG: 1428061
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16099
Reviewed-on: https://review.gluster.org/16177
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the race between fd re-open code and fd release code,
both of which free the fd context due to a race in certain variable
checks as explained below:
1. client process (shd in the case of this BZ) sends an opendir to its
children (client xlators) which send the fop to the bricks to get a valid fd.
2. Client xlator loses connection to the brick. fdctx->remotefd is -1
3. Client re-establishes connection. After handshake, it reopens the dir
and sets fdctx->remotefd to a valid fd in client3_3_reopendir_cbk().
4. Meanwhile, shd sends a fd unref after it is done with the opendir.
This triggers a releasedir (since fd->refcount becomes 0).
5. client3_3_releasedir() sees that fdctx-->remotefd is a valid number
(i.e not -1), sets fdctx->released=1 and calls client_fdctx_destroy()
6. As a continuation of step3, client_reopen_done() is called by
client3_3_reopendir_cbk(), which sees that fdctx->released==1 and
again calls client_fdctx_destroy().
Depending on when step-5 does GF_FREE(fdctx), we may crash at any place in
step-6 in client3_3_reopendir_cbk() when it tries to access
fdctx->{whatever}.
Change-Id: Ia50873d11763e084e41d2a1f4d53715438e5e947
BUG: 1418629
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: https://review.gluster.org/16521
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for multiple brick translator stacks running
in a single brick server process. This reduces our per-brick memory usage by
approximately 3x, and our appetite for TCP ports even more. It also creates
potential to avoid process/thread thrashing, and to improve QoS by scheduling
more carefully across the bricks, but realizing that potential will require
further work.
Multiplexing is controlled by the "cluster.brick-multiplex" global option. By
default it's off, and bricks are started in separate processes as before. If
multiplexing is enabled, then *compatible* bricks (mostly those with the same
transport options) will be started in the same process.
Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
BUG: 1385758
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: https://review.gluster.org/14763
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the `gluster volume status <VOLNAME|all> clients` command
gives us the following information on clients:
1. Brick name
2. Client count for each brick
3. hostname:port for each client
4. Bytes read and written for each client
There is no information regarding op-version for each client. This
patch adds that to the output.
Change-Id: Ib2ece93ab00c234162bb92b7c67a7d86f3350a8d
BUG: 1409078
Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-on: http://review.gluster.org/16303
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When SSL is enabled or if "transport.socket.own-thread" option is set
then socket_poller is run as different thread. Currently during
disconnect or PARENT_DOWN scenario we don't wait for this thread
to terminate. PARENT_DOWN will disconnect the socket layer and
cleanup resources used by socket_poller.
Therefore before disconnect we should wait for poller thread to exit.
Change-Id: I71f984b47d260ffd979102f180a99a0bed29f0d6
BUG: 1404181
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-on: http://review.gluster.org/16141
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 93eaeb9c93be3232f24e840044d560f9f0e66f71 introduces
leaks in INODELK callback where a dict is unserialized twice,
leading to dict leaks.
Change-Id: I219ccb2279f237ebc2e4fc366af4775a461929b8
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16156
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
http://review.gluster.org/14085 fixes a/the "leak" - via the
generated rpc/xdr headers - of pragmas that mask these warnings.
However 14085 won't pass the smoke test until all the warnings are
fixed.
BUG: 1369124
Change-Id: I54055b3b1038374b4e21432da48fdaeca2938289
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/15339
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The 3.8 client expects a child_up key from the server
indicating the status of the server translators. This
key is not being sent by the servers running older
versions, thereby breaking compatibility.
With this patch we are treating the absence of the said
key as an indication that the server trying to connect
to this client is running an older version and hence
in such a case we are setting conf->child_up as
_gf_true explicitly. This should suffice in emulating
the older behavior.
Due to the nature of this bug, requiring two version to
be reproducible, there are no testcases added for the same.
Change-Id: I29e0a5c63b55380dc9db8e42852d7e95b64a2b2e
BUG: 1350327
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/14811
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Currently on a successful connection between protocol
server and client, the protocol client initiates a
CHILD_UP event in the client stack. At this point in
time, only the connection between server and client is
established, and there is no guarantee that the server
side stack is ready to serve requests.
It works fine now, as most server side translators are
not dependent on any other factors, before being able
to serve requests today and hence they are up by the time
the client stack translators receive the CHILD_UP (initiated
by client handshake).
The gap here is exposed when certain server side translators
like NSR-Server for example, have a couple of protocol clients
as their child(connecting them to other bricks), and they
can't really serve requests till a quorum of their children are
up. Hence these translators should defer sending CHILD_UP
till they have enough children up, and the same needs to be
propagated to the client stack translators.
Fix:
Maintain a child_up variable in both the protocol client
and protocol server translators. The protocol server should
update this value based on the CHILD_UP and CHILD_DOWN
events it receives from the translators below it. On receiving
such an event it should forward that event to the client.
The protocol client on receiving such an event should forward
it up the client stack, thereby letting the client translators
correctly know that the server is up and ready to serve.
The clients connecting later(long after a server has initialized
and processed it's CHILD_UP events), will receive a child_up status
as part of the handshake, and based on the status of the server's
child_up, can either propagate a CHILD_UP event or defer it.
Change-Id: I0807141e62118d8de9d9cde57a53a607be44a0e0
BUG: 1312845
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/13549
Smoke: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normally GF_EVENT_CHILD_UP is dispatched after client
handshake. But we have some dead code in client_rpc_notify
which is assumed to do the same on receiving RPC_CLNT_CONNECT.
This dispatch is based on a condition whether "disable-handshake"
is enabled or not. Since we require client-handshake everytime
we have a connect this check for "disable-handshake" is invalid
and no longer required. Moreover this option is never handled
in any of the translators.
Change-Id: Ic862d6ac08cd3b18cf231f50140cd00e84e52ca0
BUG: 1227667
Signed-off-by: Anoop C S <anoopcs@redhat.com>
Reviewed-on: http://review.gluster.org/12170
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On account of a lock reacquire failure [in clnt_release_reopen_fd()]
the return value, on submitting the client request for release of
reopened fd, is not honoured correctly.
Change-Id: Iff11523b2cc6f284e806855f32a13d8c4432f1c6
BUG: 1227667
Signed-off-by: Anoop C S <achiraya@redhat.com>
Reviewed-on: http://review.gluster.org/11088
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
clnt_mark_fd_bad() is no longer used to mark the fd bad. Instead
we make use of client_mark_fd_bad() to do the same.
Change-Id: I09af892d8c0c5d1cf853ff020e8596c53d9539c0
BUG: 1227667
Signed-off-by: Anoop C S <achiraya@redhat.com>
Reviewed-on: http://review.gluster.org/11063
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since the 3rd and 5th argument of gf_msg framework
prints the error string in case of strerror(),
5th argument is removed.
Change-Id: Ib1794ea2d4cb5c46a39311f0afcfd7e494540506
BUG: 1194640
Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-on: http://review.gluster.org/11280
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I9bf2ca08fef969e566a64475d0f7a16d37e66eeb
BUG: 1194640
Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-on: http://review.gluster.org/10042
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of including config.h in each file, and have the additional
config.h included from the compiler commandline (-include option).
When a .c file tests for a certain #define, and config.h was not
included, incorrect assumtions were made. With this change, it can not
happen again.
BUG: 1222319
Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/10808
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glusterfs relies on Linux uuid implementation, which
API is incompatible with most other systems's uuid. As
a result, libglusterfs has to embed contrib/uuid,
which is the Linux implementation, on non Linux systems.
This implementation is incompatible with systtem's
built in, but the symbols have the same names.
Usually this is not a problem because when we link
with -lglusterfs, libc's symbols are trumped. However
there is a problem when a program not linked with
-lglusterfs will dlopen() glusterfs component. In
such a case, libc's uuid implementation is already
loaded in the calling program, and it will be used
instead of libglusterfs's implementation, causing
crashes.
A possible workaround is to use pre-load libglusterfs
in the calling program (using LD_PRELOAD on NetBSD for
instance), but such a mechanism is not portable, nor
is it flexible. A much better approach is to rename
libglusterfs's uuid_* functions to gf_uuid_* to avoid
any possible conflict. This is what this change attempts.
BUG: 1206587
Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/10017
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I948f85cb369206ee8ce8b8cd5e48cae9adb971c9
BUG: 1075417
Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-on: http://review.gluster.org/9529
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CID: 1124448
CID: 1124449
Removal of the dead code in the 'out' label.
Change-Id: Ibdd05cbb6e2204f6aefdf442698225883c2d7734
BUG: 789278
Signed-off-by: arao <arao@redhat.com>
Reviewed-on: http://review.gluster.org/9676
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
position in the graph rather than relative (local) to a particular
translator.
Encoding the volume in this way allows a single translator to manage
which brick is currently being scanned for directory entries. Using a
single translator minimizes allocated bits in the d_off. It also allows
multiple DHT translators in the same graph to have a common frame of
reference (the graph position) for which brick is being read. Multiple
DHT translators are needed for the Tiering feature.
The fix builds off a previous change (9332) which removed subvolume
encoding from AFR. The fix makes an equivalent change to the EC
translator.
More background can be found in fix 9332 and gluster-dev discussions [1].
DHT and AFR/EC are responsibile (as before) for choosing which brick to
enumerate directory entries in over the readdir lifecycle.
The client translator receiving the readdir fop encodes the dht_t. It
is referred to as the "leaf node" in the graph and corresponds to the
brick being scanned.
When DHT decodes the d_off, it translates the leaf node to a local
subvolume, which represents the next node in the graph leading to
the brick.
Tracking of leaf nodes is done in common utility functions. Leaf nodes
counts and positional information are updated on a graph switch.
[1] www.gluster.org/pipermail/gluster-devel/2015-January/043592.html
Change-Id: Iaf0ea86d7046b1ceadbad69d88707b243077ebc8
BUG: 1190734
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/9688
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... from all bricks in the volume
This patch is important in the context of MT epoll. With MT epoll,
notification events from client xlators could reach cluster xlators like
afr, dht, ec, stripe etc. in different orders.
For e.g, In a distributed replicate volume of 2 bricks, namely Brick1
and Brick2, the following network events are observed by a mount
process.
- connection to Brick1 is broken.
- connection to Brick1 has been restored.
- connection to Brick2 is broken.
- connection to Brick2 has been restored.
Without establishing a total ordering of events, we can't guarantee that
cluster xlators like afr, dht perceive them in the same order. While we
would expect afr (say) to perceive it as only one of Brick1 and Brick2
going down at any given time, it is possible for the notification of
Brick2 going offline to race with the notification of Brick1 coming back
online.
Change-Id: I78f5a52bfb05593335d0e9ad53ebfff98995593d
BUG: 1104462
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/9591
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For rdma type only volume client connection establishment
with server takes more than three seconds. Because for
tcp,rdma type volume, will have 2 ports one for tcp and
one for rdma, tcp port is stored with brickname and rdma
port is stored as "brickname.rdma" during pamap_sighin.
During the handshake when trying to get the brick port
for rdma clients, since we are not aware of server
transport type, we will append '.rdma' with brick name.
So for tcp,rdma volume there will be an entry with
'.rdma', but it will fail for rdma type only volume.
So we will try again, this time without appending '.rdma'
using a flag variable need_different_port, and it will succeed,
but the reconnection happens only after 3 seconds.
In this patch for rdma only type volume
we will append '.rdma' during the pmap_signin. So during the
handshake we will get the correct port for first try itself.
Since we don't need to retry , we can remove the
need_different_port flag variable.
Change-Id: Ie8e3a7f532d4104829dbe995e99b35e95571466c
BUG: 1153569
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/8934
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we try to mount a tcp,rdma volume as rdma
transport using FUSE protocol, then mount will
hang if the brick is down. When we kill a process,
signal will be received in glusterfsd process and
it will call pmap_signout with port listening on tcp only.
In case of the tcp,rdma there will be two ports,
and port which is listening for rdma will not
called for sign out.
So the mount process will try to connect to a port
which is not open and it will keep trying to connect.
This patch will call pmap_signout for rdma port also,
So when mount tries to get the brick port,it will fail.
Change-Id: I23676f65f96eb90b69b76478f7a21412a6aba70f
BUG: 1143886
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/8762
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch refactors the existing client ping timer implementation, and makes
use of the common code for implementing both client ping timer and the
glusterd ping timer.
A new gluster rpc program for ping is introduced. The ping timer is only
started for peers that have this new program. The deafult glusterd ping
timeout is 30 seconds. It is configurable by setting the option
'ping-timeout' in glusterd.vol .
Also, this patch introduces changes in the glusterd-handshake path. The client
programs for a peer are now set in the callback of dump_versions, for both
the older handshake and the newer op-version handshake. This is the only place
in the handshake process where we know what programs a peer supports.
Change-Id: I035815ac13449ca47080ecc3253c0a9afbe9016a
BUG: 1038261
Signed-off-by: Vijaikumar M <vmallika@redhat.com>
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/5202
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rpc_clnt object is destroyed after the corresponding transport object is
destroyed. But rpc_clnt_reconnect, a timer driven function, refers to
the transport object beyond its 'life'. Instead, using the embedded
connection object prevents use after free problem wrt transport object.
Also, access transport object under conn->lock.
Change-Id: Iae28e8a657d02689963c510114ad7cb7e6764e62
BUG: 962619
Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-on: http://review.gluster.org/6751
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
It was observed that in some cases client disconnects
and re-connects before server xlator could detect that a
disconnect happened. So it still uses previous fdtable and ltable.
But it can so happen that in between disconnect and re-connect
an 'unlock' fop may fail because the fds are marked 'bad' in client
xlator upon disconnect. Due to this stale locks remain on the brick
which lead to hangs/self-heals not happening etc.
For the exact bug RCA please look at
https://bugzilla.redhat.com/show_bug.cgi?id=1049932#c0
Fix:
When lk-heal is not enabled make sure connection-id is different for
every setvolume. This will make sure that a previous connection's
resources are not re-used in server xlator.
Change-Id: Id844aaa76dfcf2740db72533bca53c23b2fe5549
BUG: 1049932
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6669
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
if client/server state versions match, we still need to notify
parent xlators of reconnection (CHILD_UP) because they were
notified of CHILD_DOWN at the time of disconnection.
Change-Id: I36c4bde6d8c3db9cb0c48eeb10663b56897c932e
BUG: 1037267
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6396
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gettimeofday() returns the current wall clock time and timezone.
Using these functions in order to measure the passage of time
(how long an operation took) therefore seems like a no-brainer.
This time suffer's from some limitations:
a. They have a low resolution: “High-performance” timing by
definition, requires clock resolutions into the microseconds
or better.
b. They can jump forwards and backwards in time: Computer
clocks all tick at slightly different rates, which causes
the time to drift. Most systems have NTP enabled which
periodically adjusts the system clock to keep them in sync
with “actual” time. The adjustment can cause the clock to
suddenly jump forward (artificially inflating your timing
numbers) or jump backwards (causing your timing calculations
to go negative or hugely positive). In such cases timer
thread could go into an infinite loop.
From 'man gettimeofday':
----------
..
..
The time returned by gettimeofday() is affected by discontinuous
jumps in the system time (e.g., if the system administrator manually
changes the system time). If you need a monotonically increasing
clock, see clock_gettime(2).
..
..
----------
Rationale:
For calculating interval timing for Timer thread, all that’s
needed should be clock as a simple counter that increments
at a stable rate.
This is necessary to avoid the jumps which are caused by using
"wall time", this counter must be monotonic that can never
“tick” backwards, ever.
Change-Id: I701d31e71a85a73d21a6c5cd15583e7a5a645eeb
BUG: 1017993
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-on: http://review.gluster.org/6070
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Currently when gluster volume start force is executed, client process
will talk to glusterd to get the port of the brick. But if brick's path
is not available it cannot return brick's port. So client process will
keep connecting and disconnecting from glusterd for port-query which
is ultimately responsible for execssive logging of disconnect messages.
Fix: Message will be logged just once at INFO level after the first
disconnect from glusterd. Afterwards "disconnect" messages will be
logged in DEBUG mode.
Change-Id: I2b787f3820b5da45e090c562e5698fcfe24a02cd
BUG: 959969
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/4953
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When client_submit_request fails it calls cbk. The cleanups should
happen only in cbk. The code committed as part of
http://review.gluster.org/4357 violates this. Also found that
clnt_release_reopen_fd violates this as well.
This patch fixes these issue.
Change-Id: Ic02ba278724b03c65c00b686c39fd7846122618a
BUG: 821056
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4464
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the brick is taken down and the hard disk is replaced
and the brick is brought back up, the re-opens of the open-fds
will fail because the file is not present on the brick.
Re-opens are not attempted even if the files are re-created by
self-heal until the brick is brought down after the files are
re-created and brought back up. This is a problem with a VM-store
in a replica-setup. Until the fd is re-opened the writes will
never happen on the brick where the hard-disk is replaced.
To handle this situation gracefully, client xlator is enhanced
to perform finodelk, fxattrop, writev, readv using anonymous fds
if the file is yet to be re-opened. If the fop succeeds then client
xlator attempts re-open.
Change-Id: I1cc6d1bbf8227cd996868ab2ed0a57fb05e00017
BUG: 821056
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4358
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I01caa1b51570359e6e3ffe1ffb7279cbdb0b0c64
BUG: 821056
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/4357
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
that way, it would help admins to look at the corresponding brick directly.
All credit goes to Amar.
Change-Id: I959df59111864cc0574945d827f8fe5f2d919491
BUG: 839021
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/4341
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rpc_clnt_reconnect and client_query_portmap_cbk
problem:
-------
Theoretically there is a possibility that we could complete
querying the remote brick's port number before rpc_transport_connect
can return. If rpc_clnt_reconnect happens to be the caller of
rpc_transport_connect and we've already got the remote brick's port
number by the time rpc_transport_connect returns, without synchronization,
rpc_clnt_connect resets config.remote_port to zero even before we have
attempted a connection with remote brick.
fix:
---
By making only poll thread do setting and resetting of
config.remote_port, we avoid the race-condition.
Change-Id: I51879ba1cac651a80ff5c9c070ec7fe1ceea9e05
BUG: 765051
Signed-off-by: Raghavendra G <raghavendra@gluster.com>
Reviewed-on: http://review.gluster.org/4044
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the disconnect after a portmap query is treated like an
ordinary disconnect and the reconnection attempt (in this case, to
the brick) is attempted only after 3 secs. This results in a delay
which is unnecessary.
Mark the disconnection happening because of a successful portmap
query as needing a 'quick reconnect' to avoid the delay for this
special case.
Change-Id: I43c8292ff0c30858d883ff3569a3761acbf2f5eb
BUG: 860220
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/3994
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PROBLEM:
The function dict_serialized_length could, owing to an error,
return a negative integer (-EINVAL) that gets assigned to an
unsigned int member 'dict_len' of gf_setvolume_req structure.
FIX:
Hold the value returned by dict_serialized_length in local
variable ret (which is a signed int). Test if ret is negative,
in which case the control would anyway branch to the label fail
where the function returns. Otherwise dict_len is assigned with
ret, in turn giving a more meaningful value to the attribute
length.
TEST:
Attached gdb to glusterfs mount process, set breakpoint at
client_setvolume, forced dict_serialized_length to return
-EINVAL (indirectly by forcing _dict_serialized_length to return
-EINVAL after setting count to -1 within its body) and checked
the value of ret (which is now sure to contain a negative value)
whose value will be appropriately tested to decide the next
course of action within client_setvolume: whether to simply
exit due to an error or execute the subsequent statements.
Change-Id: Ib22ad8f30d8ae04acaf2ff5bfee9c348a2c47148
BUG: 789278
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.com/3755
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
See comments in http://bugzilla.redhat.com/839925 for
the code to perform this change.
Signed-off-by: Jim Meyering <meyering@redhat.com>
BUG: 839925
Change-Id: I10e4ecff16c3749fe17c2831c516737e08a3205a
Reviewed-on: http://review.gluster.com/3661
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
's/3_1/3_3/g' in case of glusterfs protocol
's/3_1_/_/g' in case of CLI and mgmt protocol
Change-Id: I6e6510d02c05f68f290c52ed284c04576326e12c
Signed-off-by: Amar Tumballi <amarts@redhat.com>
BUG: 764890
Reviewed-on: http://review.gluster.com/3632
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RCA
The bug is observed in 3.2.x because posix xlator changes
the uid/gid of file as per frame->root-uid/gid if O_CREAT flag
is set in open fop. Posix does not do this in 3.3.x so that
bug does not appear anymore but this issue exposed the actual
bug in client xlator re-open. Re-open of a file on re-connection
should not perform re-open with the same flags at the time of
open/create/opendir. Imagine a case where a file is opened with
O_TRUNC|O_RDWR and some data is written to it, now if the brick
goes down and comes back the file will be truncated.
When I tested this case, the file is not truncated because locks
xlator resets O_TRUNC unconditionally.
Client xlator re-open bug and locks xlator bug cancel each other.
Fix
Reset O_CREAT|O_TRUNC|O_EXCL flags in re-open.
Locks xlator should not reset O_TRUNC.
Additional changes
Removed wbflags as it is not assigned at all.
Testcases
Automated go program is at:
://bugzilla.redhat.com/show_bug.cgi?id=807976#c2
Change-Id: I0080344fdda2e62e7c976c35a5bf5f1fa8838891
BUG: 807976
Signed-off-by: Pranith Kumar K <pranithk@gluster.com>
Reviewed-on: http://review.gluster.com/3582
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|