path: root/xlators/protocol/client/src/client-handshake.c
Commit message | Author | Age | Files | Lines
* protcol/client: Insert dummy clnt-lk-version to avoid upgrade failure | Anoop C S | 2018-02-14 | 1 | -0/+9
  With https://review.gluster.org/#/c/12363/ merged, the client no longer
  sends its lk-version to the server, and the corresponding check on the
  server has been removed. But when clients are upgraded before servers, the
  lk-version check on the server side fails and is reported back to the
  clients, resulting in disconnection.

  Since the lock-recovery logic (lk-version and grace-timeout) no longer
  exists in the code base, our best bet is to add the client's default
  lk-version, i.e. 1, to the dictionary just to make the server-side check
  pass and continue with the remaining SETVOLUME operations.

  Change-Id: I441b67bd271d1e9ba9a7c08703e651c7a6bd945b
  BUG: 1544699
  Signed-off-by: Anoop C S <anoopcs@redhat.com>
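
The compatibility trick described above can be sketched in a few lines. This is only an illustration under stated assumptions: `struct opts`, `opts_set`, `opts_has`, `old_server_accepts` and `client_build_setvolume` are hypothetical stand-ins, not the GlusterFS dictionary API. Only the key name `clnt-lk-version` and the default value 1 come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the SETVOLUME options dictionary. */
#define MAX_OPTS 8

struct opt {
    const char *key;
    unsigned int value;
};

struct opts {
    struct opt items[MAX_OPTS];
    size_t count;
};

/* Append a key/value pair; returns 0 on success, -1 if the table is full. */
static int opts_set(struct opts *o, const char *key, unsigned int value)
{
    if (o->count >= MAX_OPTS)
        return -1;
    o->items[o->count].key = key;
    o->items[o->count].value = value;
    o->count++;
    return 0;
}

/* Returns 1 if the key is present, 0 otherwise. */
static int opts_has(const struct opts *o, const char *key)
{
    for (size_t i = 0; i < o->count; i++) {
        if (strcmp(o->items[i].key, key) == 0)
            return 1;
    }
    return 0;
}

/* Models a not-yet-upgraded server that still insists on the key. */
static int old_server_accepts(const struct opts *o)
{
    return opts_has(o, "clnt-lk-version");
}

/* Upgraded client: always send the default lk-version of 1. */
static int client_build_setvolume(struct opts *o)
{
    return opts_set(o, "clnt-lk-version", 1);
}
```

An upgraded server would simply ignore the extra key, so sending it costs nothing once both sides are on the new version.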
* protocol: utilize the version 4 xdr | Amar Tumballi | 2018-02-01 | 1 | -3/+261
  updates #384

  Change-Id: Id80bf470988dbecc69779de9eb64088559cb1f6a
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* protocol: Remove lock recovery logic from client and server | Anoop C S | 2018-01-29 | 1 | -354/+11
  Change-Id: I27f5e1e34fe3eac96c7dd88e90753fb5d3d14550
  BUG: 1272030
  Signed-off-by: Anoop C S <anoopcs@redhat.com>
* rpc/*: auth-header changes | Amar Tumballi | 2018-01-17 | 1 | -10/+15
  Introduce another authentication header which can now send more data.
  This is useful because this data can be common to all the fops, and we
  don't need to change all the signatures.

  As part of this, made rpc-clnt.c a little more modular to support
  multiple authentication structures.

  The stack.h changes are placeholders for ctime etc., and can be moved
  later based on need.

  updates #384

  Change-Id: I6111c13cfd2ec92e2b4e9295896bf62a8a33b2c7
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* protocol/client: reduce lock contention | Zhang Huan | 2017-12-26 | 1 | -15/+16
  The current use of a per-client mutex to protect fdctx introduces lock
  contention when there are dozens of file operations active. Use a
  finer-grained spinlock to reduce contention, and move the retrieval of
  fdctx out of the lock.

  Change-Id: Iea3e2eb481e76a5d73a582ba81529180c5b88248
  BUG: 1519598
  Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
* glusterfs: Use gcc builtin ATOMIC operator to increase/decreate refcount. | Mohit Agrawal | 2017-12-12 | 1 | -12/+4
  Problem: In the glusterfs code base we call mutex_lock/unlock to take or
           drop a reference on an object. This can sometimes be a cause
           of lock contention.

  Solution: There is no need to use a mutex to increase/decrease a ref
            counter; use the gcc builtin ATOMIC operations instead.

  Test: I have not yet measured the performance gain of this patch on
        glusterfs itself, but I have tested the same idea with the small
        program below (mutex and atomic variants) and see a good
        difference.

      #include <inttypes.h>
      #include <pthread.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <stdlib.h>

      /* Declaration absent from the original snippet; int64_t is assumed
       * from the __atomic_store() call in main(). */
      static int64_t glob;
      static int numOuterLoops;

      static void *
      threadFunc(void *arg)
      {
              int j;

              for (j = 0; j < numOuterLoops; j++) {
                      __atomic_add_fetch (&glob, 1, __ATOMIC_ACQ_REL);
              }
              return NULL;
      }

      int
      main(int argc, char *argv[])
      {
              int s, j;
              int numThreads;
              pthread_t *thread;
              int64_t n = 0;

              /* Both argv[1] and argv[2] are read below. */
              if (argc < 3) {
                      printf(" Please provide 2 args Num of threads && Outer Loop\n");
                      exit (-1);
              }

              numThreads = atoi(argv[1]);
              numOuterLoops = atoi (argv[2]);
              printf("\tthreads: %d; outer loops: %d;\n", numThreads,
                     numOuterLoops);

              thread = calloc(numThreads, sizeof(pthread_t));
              if (thread == NULL) {
                      printf ("calloc error so exit\n");
                      exit (-1);
              }

              __atomic_store (&glob, &n, __ATOMIC_RELEASE);

              for (j = 0; j < numThreads; j++) {
                      s = pthread_create(&thread[j], NULL, threadFunc, NULL);
                      if (s != 0) {
                              printf ("pthread_create failed so exit\n");
                              exit (-1);
                      }
              }

              for (j = 0; j < numThreads; j++) {
                      s = pthread_join(thread[j], NULL);
                      if (s != 0) {
                              printf ("pthread_join failed so exit\n");
                              exit (-1);
                      }
              }

              printf("glob value is %" PRId64 "\n",
                     __atomic_load_n (&glob, __ATOMIC_RELAXED));
              exit(0);
      }

      time ./thr_count 800 800000
              threads: 800; outer loops: 800000;
      glob value is 640000000
      real    1m10.288s
      user    0m57.269s
      sys     3m31.565s

      time ./thr_count_atomic 800 800000
              threads: 800; outer loops: 800000;
      glob value is 640000000
      real    0m20.313s
      user    1m20.558s
      sys     0m0.028

  Change-Id: Ie5030a52ea264875e002e108dd4b207b15ab7cc7
  Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
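
Applied to a reference counter, the same builtins might look like the sketch below. `struct object`, `object_ref` and `object_unref` are hypothetical names chosen for illustration; the patch itself changes existing GlusterFS ref/unref helpers that are not shown on this page.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical refcounted object; only the counter matters here. */
struct object {
    int64_t refcount;
};

/* Take a reference without a mutex; returns the new count. */
static int64_t object_ref(struct object *obj)
{
    return __atomic_add_fetch(&obj->refcount, 1, __ATOMIC_ACQ_REL);
}

/* Drop a reference; a return value of 0 means the caller held the
 * last reference and is responsible for freeing the object. */
static int64_t object_unref(struct object *obj)
{
    return __atomic_sub_fetch(&obj->refcount, 1, __ATOMIC_ACQ_REL);
}
```

Because `__atomic_sub_fetch` returns the post-decrement value, exactly one caller observes 0, which is what makes the mutex unnecessary for the counter itself.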
* rpc : Change the way client uuid is built | Poornima G | 2017-11-20 | 1 | -3/+12
  Problem:
  Today the main users of the client uuid are the protocol layers, locks
  and leases. The protocol layers require each client uuid to be unique,
  even across connects and disconnects. Locks and leases on the server
  side use the same client uid, which changes across file migrations.
  This makes graph switch and file migration tedious for locks and
  leases. File migration across bricks becomes difficult because the
  client uuid for the same client is different on the other brick.

  The exact same set of issues exists for leases as well.

  The solution is to introduce a constant in the client-uid string which
  locks and leases can use to identify the owner client across bricks.

  Client uuid currently:
  %s(ctx uuid)-%s(protocol client name)-%d(graph id)%s(setvolume count/reconnect count)

  Proposed client uuid:
  "CTX_ID:%s-GRAPH_ID:%d-PID:%d-HOST:%s-PC_NAME:%s-RECON_NO:%s"
  - CTX_ID: This will be constant per client.
  - GRAPH_ID, PID, HOST, PC_NAME (protocol client name), RECON_NO
    (setvolume count) remain the same.

  Change-Id: Ia81d57a9693207cd325d7b26aee4593fcbd6482c
  BUG: 1369028
  Signed-off-by: Susant Palai <spalai@redhat.com>
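
A minimal sketch of building the proposed string, and of how a lock or lease owner could be matched across bricks by its constant CTX_ID part. The format string is taken from the commit message; the function names `build_client_uid` and `same_owner` are hypothetical and do not come from the GlusterFS sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a client uid using the layout proposed in the commit message.
 * Returns the snprintf() result (length that would have been written). */
static int build_client_uid(char *buf, size_t len, const char *ctx_id,
                            int graph_id, int pid, const char *host,
                            const char *pc_name, const char *recon_no)
{
    return snprintf(buf, len,
                    "CTX_ID:%s-GRAPH_ID:%d-PID:%d-HOST:%s-PC_NAME:%s-RECON_NO:%s",
                    ctx_id, graph_id, pid, host, pc_name, recon_no);
}

/* Two uids belong to the same client if everything before "-GRAPH_ID:"
 * (the CTX_ID field) is identical. Returns 1 for a match, 0 otherwise. */
static int same_owner(const char *uid_a, const char *uid_b)
{
    const char *end_a = strstr(uid_a, "-GRAPH_ID:");
    const char *end_b = strstr(uid_b, "-GRAPH_ID:");

    if (end_a == NULL || end_b == NULL)
        return 0;

    size_t len_a = (size_t)(end_a - uid_a);
    size_t len_b = (size_t)(end_b - uid_b);

    return len_a == len_b && strncmp(uid_a, uid_b, len_a) == 0;
}
```

Searching for the literal `-GRAPH_ID:` marker rather than the first hyphen matters because a ctx uuid normally contains hyphens itself.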
* rpc: bring a new protocol version | Amar Tumballi | 2017-11-07 | 1 | -0/+14
  * xdr: add gfid to the on-wire format for fsetattr/rchecksum
  * as this is a change in the on-wire XDR format, backward-compatible
    RPC programs are needed.

  Signed-off-by: Amar Tumballi <amarts@redhat.com>
  BUG: 827334
  Change-Id: Id0a2da3632516dc1a5560dde2b151b2e5f0be8e5
* core: make gf_boolean_t a C99 bool instead of an enum | Jeff Darcy | 2017-11-03 | 1 | -1/+4
  This reduces the space used from four bytes to one, and allows new code
  to use familiar C99 types/values interoperably with our old cruft. It
  does *not* change current declarations or code; that will be left for a
  separate - much larger - patch.

  Updates: #80
  Change-Id: I5baedd17d3fb05b38f0d8b8bb9dd62824475842e
  Signed-off-by: Jeff Darcy <jdarcy@fb.com>
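
The size difference can be seen with a small sketch. The type names `old_boolean_t` and `new_boolean_t` are hypothetical, and the exact sizes (four bytes for the enum, one for `bool`) are what common compilers such as GCC produce by default rather than something the C standard guarantees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Enum-style boolean, as used before the change (hypothetical name). */
typedef enum { old_false = 0, old_true = 1 } old_boolean_t;

/* C99-style boolean, as used after the change (hypothetical name). */
typedef bool new_boolean_t;

/* Bytes saved per stored flag on this compiler. */
static size_t bytes_saved_per_flag(void)
{
    return sizeof(old_boolean_t) - sizeof(new_boolean_t);
}

/* A C99 bool normalises any non-zero value to 1 on conversion,
 * which an enum-typed variable does not do. */
static new_boolean_t to_new_boolean(int value)
{
    return (new_boolean_t)value;
}
```

The normalising conversion is the interoperability point: `to_new_boolean(2)` compares equal to `true`, whereas an enum holding 2 would not compare equal to its own "true" constant.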
* protocol/client: handle the subdir handshake properly for add-brick | Amar Tumballi | 2017-10-29 | 1 | -1/+9
  The handshake for a subdir mount should be handled differently the
  first time than on subsequent graph changes.

  Change-Id: I2a7ba836433bb0a0f4a861809e2bb0d7fbc4da54
  BUG: 1505323
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
* Infra to indentify process | hari gowtham | 2017-08-16 | 1 | -0/+6
  Problem: currently we can't identify which process is running and how
  many instances of it are available.

  Fix: name the process when it is spawned, send the name to the server
  and save it in the client_t.

  The processes that abide by this change from this patch are:
  1) fuse mount,
  2) rebalance,
  3) selfheal,
  4) tier,
  5) quota,
  6) snapshot,
  7) brick,
  8) gfapi (by default; gfapi.<processname> if processname is found)

  Note: fuse gets the process name native-fuse-client by default. If the
  user gives a name for the fuse mount and spawns it, it will be of this
  form: --process-name native-fuse-client.<name_specified>.

  This can be used by processes like the aux mounts done by quota,
  geo-rep, etc. by adding another option to the aux mount:
  " -o process-name=gsync_mount"

  Updates: #178
  Signed-off-by: hari gowtham <hgowtham@redhat.com>
  Change-Id: Ie4d02257216839338043737691753bab9a974d5e
  Reviewed-on: https://review.gluster.org/17957
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Tested-by: hari gowtham <hari.gowtham005@gmail.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-by: Aravinda VK <avishwan@redhat.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* glusterfsd: allow subdir mount | Amar Tumballi | 2017-08-04 | 1 | -1/+20
  Changes:

  1. Take subdir mount option in client (mount.gluster / glusterfsd)
  2. Pass the subdir mount to server-handshake (from client-handshake)
  3. Handle subdir-mount dir's lookup in server-first-lookup and handle
     all fops resolution accordingly with proper gfid of subdir
  4. Change the auth/addr module to handle the multiple subdir entries
     in option, and valid parsing.

  How to use the feature:

  `# mount -t glusterfs $hostname:/$volname/$subdir /$mount_point`

  Or

  `# mount -t glusterfs $hostname:/$volname -osubdir_mount=$subdir /$mount_point`

  Option can be set like:

  `# gluster volume set <volname> auth.allow "/subdir1(192.168.1.*),/(192.168.10.*),/subdir2(192.168.8.*)"`

  Updates #175

  Change-Id: I7ea57f76ddbe6c3862cfe02e13f89e8a39719e11
  Signed-off-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-on: https://review.gluster.org/17141
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
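
For the first mount form, the client has to separate the volume name from the subdirectory before it can pass the latter along in the handshake. The sketch below shows one way such a split could work; `split_volume_spec` is a hypothetical helper written for this illustration, not the parsing code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split "/volname/sub/dir" (leading slash optional) into the volume name
 * and the subdirectory path. A spec without a subdirectory yields "/".
 * Returns 0 on success, -1 on empty volume name or too-small buffers. */
static int split_volume_spec(const char *spec, char *vol, size_t vol_len,
                             char *subdir, size_t subdir_len)
{
    if (*spec == '/')
        spec++;

    const char *slash = strchr(spec, '/');
    size_t name_len = slash ? (size_t)(slash - spec) : strlen(spec);

    if (name_len == 0 || name_len >= vol_len)
        return -1;
    memcpy(vol, spec, name_len);
    vol[name_len] = '\0';

    /* Everything from the first slash onwards is the subdir. */
    const char *rest = slash ? slash : "/";
    if (strlen(rest) >= subdir_len)
        return -1;
    strcpy(subdir, rest);
    return 0;
}
```

With the spec `/gv0/dir1/dir2` this yields volume `gv0` and subdir `/dir1/dir2`, which is the shape the `auth.allow` entries above are keyed on.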
* Halo Replication feature for AFR translator | Kevin Vigor | 2017-05-02 | 1 | -1/+1
  Summary:
  Halo Geo-replication is a feature which allows Gluster or NFS clients
  to write locally to their region (as defined by a latency "halo" or
  threshold if you like), and have their writes asynchronously propagate
  from their origin to the rest of the cluster. Clients can also write
  synchronously to the cluster simply by specifying a halo-latency which
  is very large (e.g. 10seconds) which will include all bricks.

  In other words, it allows clients to decide at mount time if they
  desire synchronous or asynchronous IO into a cluster and the cluster
  can support both of these modes to any number of clients
  simultaneously.

  There are a few new volume options due to this feature:
    halo-shd-latency: The threshold below which self-heal daemons will
    consider children (bricks) connected.

    halo-nfsd-latency: The threshold below which NFS daemons will
    consider children (bricks) connected.

    halo-latency: The threshold below which all other clients will
    consider children (bricks) connected.

    halo-min-replicas: The minimum number of replicas which are to be
    enforced regardless of latency specified in the above 3 options.
    If the number of children falls below this threshold the next
    best (chosen by latency) shall be swapped in.

  New FUSE mount options:
    halo-latency & halo-min-replicas: As described above.

  This feature combined with multi-threaded SHD support (D1271745)
  results in some pretty cool geo-replication possibilities.

  Operational Notes:
  - Global consistency is guaranteed for synchronous clients; this is
    provided by the existing entry-locking mechanism.
  - Asynchronous clients on the other hand are merely consistent to
    their region. Writes & deletes will be protected via entry-locks as
    usual, preventing concurrent writes into files which are undergoing
    replication. Read operations on the other hand should never block.
  - Writes are allowed from _any_ region and propagated from the origin
    to all other regions. The takeaway from this is that care should be
    taken to ensure multiple writers do not write the same files,
    resulting in a gfid split-brain which will require resolution via
    split-brain policies (majority, mtime & size). The recommended
    method for preventing this is using the nfs-auth feature to define
    which region for each share has RW permissions; tiers not in the
    origin region should have RO perms.

  TODO:
  - Synchronous clients (including the SHD) should choose clients from
    their own region as preferred sources for reads. Most of the
    plumbing is in place for this via the child_latency array.
  - Better GFID split brain handling & better dent type split brain
    handling (i.e. create a trash can and move the offending files into
    it).
  - Tagging in addition to latency as a means of defining which children
    you wish to synchronously write to

  Test Plan:
  - The usual suspects, clang, gcc w/ address sanitizer & valgrind
  - Prove tests

  Reviewers: jackl, dph, cjh, meyering

  Reviewed By: meyering

  Subscribers: ethanr

  Differential Revision: https://phabricator.fb.com/D1272053

  Tasks: 4117827

  Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1
  BUG: 1428061
  Signed-off-by: Kevin Vigor <kvigor@fb.com>
  Reviewed-on: http://review.gluster.org/16099
  Reviewed-on: https://review.gluster.org/16177
  Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* protocol/client: Fix double free of client fdctx destroy | Ravishankar N | 2017-02-13 | 1 | -22/+15
  This patch fixes the race between fd re-open code and fd release code,
  both of which free the fd context due to a race in certain variable
  checks as explained below:

  1. client process (shd in the case of this BZ) sends an opendir to its
     children (client xlators) which send the fop to the bricks to get a
     valid fd.

  2. Client xlator loses connection to the brick. fdctx->remotefd is -1

  3. Client re-establishes connection. After handshake, it reopens the
     dir and sets fdctx->remotefd to a valid fd in
     client3_3_reopendir_cbk().

  4. Meanwhile, shd sends a fd unref after it is done with the opendir.
     This triggers a releasedir (since fd->refcount becomes 0).

  5. client3_3_releasedir() sees that fdctx-->remotefd is a valid number
     (i.e not -1), sets fdctx->released=1 and calls
     client_fdctx_destroy()

  6. As a continuation of step3, client_reopen_done() is called by
     client3_3_reopendir_cbk(), which sees that fdctx->released==1 and
     again calls client_fdctx_destroy().

  Depending on when step-5 does GF_FREE(fdctx), we may crash at any
  place in step-6 in client3_3_reopendir_cbk() when it tries to access
  fdctx->{whatever}.

  Change-Id: Ia50873d11763e084e41d2a1f4d53715438e5e947
  BUG: 1418629
  Signed-off-by: Ravishankar N <ravishankar@redhat.com>
  Reviewed-on: https://review.gluster.org/16521
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
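
The race in steps 5 and 6 comes down to two paths each concluding that they own the final destroy. One general way to make that decision exactly once is sketched below with an atomic flag word. This is not the approach taken in the patch (which is not visible on this page); `struct fdctx_state`, the flag names and both functions are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state bits for an fd context with a reopen in flight. */
enum {
    FD_RELEASED    = 1u << 0, /* release path has run */
    FD_REOPEN_DONE = 1u << 1  /* reopen callback has finished */
};

struct fdctx_state {
    unsigned int flags;
};

/* Atomically set our own bit and report whether the other side had
 * already set its bit. Only the second arrival sees true. */
static bool fdctx_mark(struct fdctx_state *s, unsigned int mine,
                       unsigned int other)
{
    unsigned int old = __atomic_fetch_or(&s->flags, mine, __ATOMIC_ACQ_REL);
    return (old & other) != 0;
}

/* Called from the release path; true means "caller must destroy". */
static bool fdctx_release(struct fdctx_state *s)
{
    return fdctx_mark(s, FD_RELEASED, FD_REOPEN_DONE);
}

/* Called when the reopen callback completes; true means "caller must
 * destroy", because a release arrived while the reopen was pending. */
static bool fdctx_reopen_done(struct fdctx_state *s)
{
    return fdctx_mark(s, FD_REOPEN_DONE, FD_RELEASED);
}
```

Whichever path runs second gets `true`, so the context is destroyed once regardless of interleaving, and the first path never touches it again after marking.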
* core: run many bricks within one glusterfsd process | Jeff Darcy | 2017-01-30 | 1 | -0/+5
  This patch adds support for multiple brick translator stacks running
  in a single brick server process. This reduces our per-brick memory
  usage by approximately 3x, and our appetite for TCP ports even more.
  It also creates potential to avoid process/thread thrashing, and to
  improve QoS by scheduling more carefully across the bricks, but
  realizing that potential will require further work.

  Multiplexing is controlled by the "cluster.brick-multiplex" global
  option. By default it's off, and bricks are started in separate
  processes as before. If multiplexing is enabled, then *compatible*
  bricks (mostly those with the same transport options) will be started
  in the same process.

  Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
  BUG: 1385758
  Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-on: https://review.gluster.org/14763
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Add info on op-version for clients in vol status output | Samikshan Bairagya | 2017-01-12 | 1 | -0/+6
  Currently the `gluster volume status <VOLNAME|all> clients` command
  gives us the following information on clients:
  1. Brick name
  2. Client count for each brick
  3. hostname:port for each client
  4. Bytes read and written for each client

  There is no information regarding op-version for each client. This
  patch adds that to the output.

  Change-Id: Ib2ece93ab00c234162bb92b7c67a7d86f3350a8d
  BUG: 1409078
  Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
  Reviewed-on: http://review.gluster.org/16303
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* socket: socket disconnect should wait for poller thread exit | Rajesh Joseph | 2016-12-21 | 1 | -2/+2
  When SSL is enabled or the "transport.socket.own-thread" option is
  set, socket_poller runs as a separate thread. Currently, during
  disconnect or a PARENT_DOWN scenario, we don't wait for this thread to
  terminate. PARENT_DOWN will disconnect the socket layer and clean up
  resources used by socket_poller.

  Therefore, before disconnect we should wait for the poller thread to
  exit.

  Change-Id: I71f984b47d260ffd979102f180a99a0bed29f0d6
  BUG: 1404181
  Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
  Reviewed-on: http://review.gluster.org/16141
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Kaushal M <kaushal@redhat.com>
  Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* protocol/client: Fix potential mem-leaks | Krutika Dhananjay | 2016-12-16 | 1 | -0/+1
  Commit 93eaeb9c93be3232f24e840044d560f9f0e66f71 introduces leaks in
  INODELK callback where a dict is unserialized twice, leading to dict
  leaks.

  Change-Id: I219ccb2279f237ebc2e4fc366af4775a461929b8
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/16156
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* protocol/client (no 2): fix unused variable warnings/errors | Kaleb S. KEITHLEY | 2016-09-05 | 1 | -2/+0
  http://review.gluster.org/14085 fixes a/the "leak" - via the generated
  rpc/xdr headers - of pragmas that mask these warnings.

  However 14085 won't pass the smoke test until all the warnings are
  fixed.

  BUG: 1369124
  Change-Id: I54055b3b1038374b4e21432da48fdaeca2938289
  Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
  Reviewed-on: http://review.gluster.org/15339
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Anuradha Talur <atalur@redhat.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* protocol/server: Fix client/server compatibility | Avra Sengupta | 2016-06-28 | 1 | -1/+6
  The 3.8 client expects a child_up key from the server indicating the
  status of the server translators. This key is not sent by servers
  running older versions, thereby breaking compatibility.

  With this patch we treat the absence of the said key as an indication
  that the server this client is connecting to is running an older
  version, and in that case we set conf->child_up to _gf_true
  explicitly. This should suffice to emulate the older behavior.

  Due to the nature of this bug, which requires two versions to
  reproduce, no test cases are added for it.

  Change-Id: I29e0a5c63b55380dc9db8e42852d7e95b64a2b2e
  BUG: 1350327
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/14811
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.org>
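
The fallback described above is a "missing key means legacy peer" rule. A minimal sketch follows, assuming a hypothetical `struct setvolume_reply` that records whether the `child_up` key was present; the real code reads the key from a GlusterFS dictionary, which is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical view of the handshake reply: whether the server sent a
 * child_up key at all, and if so, its value. */
struct setvolume_reply {
    bool has_child_up;
    bool child_up;
};

/* Decide the client's child_up state after the handshake.
 * An absent key is taken to mean an older server, which is treated
 * as up so that the pre-3.8 behaviour is preserved. */
static bool client_child_up_from_reply(const struct setvolume_reply *reply)
{
    if (!reply->has_child_up)
        return true;
    return reply->child_up;
}
```

Only a server that both knows about the key and reports `false` causes the client to defer its CHILD_UP notification.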
* protocol client/server: Fix client-server handshake | Avra Sengupta | 2016-03-10 | 1 | -5/+25
  Problem:
  Currently, on a successful connection between protocol server and
  client, the protocol client initiates a CHILD_UP event in the client
  stack. At this point only the connection between server and client is
  established, and there is no guarantee that the server-side stack is
  ready to serve requests. It works fine now, because most server-side
  translators do not depend on any other factors before being able to
  serve requests, and hence they are up by the time the client stack
  translators receive the CHILD_UP (initiated by the client handshake).

  The gap is exposed when certain server-side translators, NSR-Server
  for example, have a couple of protocol clients as their children
  (connecting them to other bricks), and can't really serve requests
  until a quorum of their children are up. Hence these translators
  should defer sending CHILD_UP until they have enough children up, and
  the same needs to be propagated to the client stack translators.

  Fix:
  Maintain a child_up variable in both the protocol client and protocol
  server translators. The protocol server should update this value based
  on the CHILD_UP and CHILD_DOWN events it receives from the translators
  below it. On receiving such an event it should forward that event to
  the client. The protocol client, on receiving such an event, should
  forward it up the client stack, thereby letting the client translators
  correctly know that the server is up and ready to serve.

  Clients connecting later (long after a server has initialized and
  processed its CHILD_UP events) will receive a child_up status as part
  of the handshake and, based on the status of the server's child_up,
  can either propagate a CHILD_UP event or defer it.

  Change-Id: I0807141e62118d8de9d9cde57a53a607be44a0e0
  BUG: 1312845
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/13549
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* protocol/client: Remove dead code from client_rpc_notify | Anoop C S | 2015-09-28 | 1 | -1/+1
  Normally GF_EVENT_CHILD_UP is dispatched after the client handshake.
  But we have some dead code in client_rpc_notify which is assumed to do
  the same on receiving RPC_CLNT_CONNECT. This dispatch depends on
  whether "disable-handshake" is enabled or not. Since we require a
  client handshake every time we have a connect, this check for
  "disable-handshake" is invalid and no longer required. Moreover, this
  option is never handled in any of the translators.

  Change-Id: Ic862d6ac08cd3b18cf231f50140cd00e84e52ca0
  BUG: 1227667
  Signed-off-by: Anoop C S <anoopcs@redhat.com>
  Reviewed-on: http://review.gluster.org/12170
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* protocol/client: Properly handle return value in clnt_release_reopen_fd | Anoop C S | 2015-06-25 | 1 | -1/+0
  On account of a lock reacquire failure [in clnt_release_reopen_fd()]
  the return value, on submitting the client request for release of
  reopened fd, is not honoured correctly.

  Change-Id: Iff11523b2cc6f284e806855f32a13d8c4432f1c6
  BUG: 1227667
  Signed-off-by: Anoop C S <achiraya@redhat.com>
  Reviewed-on: http://review.gluster.org/11088
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  Tested-by: Raghavendra G <rgowdapp@redhat.com>
* protocol/client: Remove unused function clnt_mark_fd_bad() | Anoop C S | 2015-06-22 | 1 | -10/+0
  clnt_mark_fd_bad() is no longer used to mark the fd bad. Instead we
  make use of client_mark_fd_bad() to do the same.

  Change-Id: I09af892d8c0c5d1cf853ff020e8596c53d9539c0
  BUG: 1227667
  Signed-off-by: Anoop C S <achiraya@redhat.com>
  Reviewed-on: http://review.gluster.org/11063
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
* protocol/client : removing duplicate printing in gf_msg | Manikandan Selvaganesh | 2015-06-20 | 1 | -12/+11
  Since both the 3rd and the 5th argument of the gf_msg framework print
  the error string in the strerror() case, the 5th argument is removed.

  Change-Id: Ib1794ea2d4cb5c46a39311f0afcfd7e494540506
  BUG: 1194640
  Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
  Reviewed-on: http://review.gluster.org/11280
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* protocol/client : porting log messages to new framework | Manikandan Selvaganesh | 2015-06-15 | 1 | -141/+189
  Change-Id: I9bf2ca08fef969e566a64475d0f7a16d37e66eeb
  BUG: 1194640
  Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
  Reviewed-on: http://review.gluster.org/10042
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
  Tested-by: Raghavendra G <rgowdapp@redhat.com>
* build: do not #include "config.h" in each file | Niels de Vos | 2015-05-29 | 1 | -5/+0
  Instead of including config.h in each file, have config.h included
  from the compiler command line (-include option).

  When a .c file tested for a certain #define and config.h was not
  included, incorrect assumptions were made. With this change, that
  cannot happen again.

  BUG: 1222319
  Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/10808
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* Avoid conflict between contrib/uuid and system uuid | Emmanuel Dreyfus | 2015-04-04 | 1 | -2/+2
  glusterfs relies on the Linux uuid implementation, whose API is
  incompatible with most other systems' uuid. As a result, libglusterfs
  has to embed contrib/uuid, which is the Linux implementation, on
  non-Linux systems. This implementation is incompatible with the
  system's built-in one, but the symbols have the same names.

  Usually this is not a problem because when we link with -lglusterfs,
  libc's symbols are trumped. However there is a problem when a program
  not linked with -lglusterfs will dlopen() a glusterfs component. In
  such a case, libc's uuid implementation is already loaded in the
  calling program, and it will be used instead of libglusterfs's
  implementation, causing crashes.

  A possible workaround is to pre-load libglusterfs in the calling
  program (using LD_PRELOAD on NetBSD for instance), but such a
  mechanism is not portable, nor is it flexible. A much better approach
  is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any
  possible conflict. This is what this change attempts.

  BUG: 1206587
  Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a
  Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
  Reviewed-on: http://review.gluster.org/10017
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* Xlators : Fixed typos | Manikandan Selvaganesh | 2015-04-02 | 1 | -1/+1
  Change-Id: I948f85cb369206ee8ce8b8cd5e48cae9adb971c9
  BUG: 1075417
  Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
  Reviewed-on: http://review.gluster.org/9529
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
* protocol-client: Removal of Dead Code | arao | 2015-03-30 | 1 | -9/+0
  CID: 1124448
  CID: 1124449

  Removal of the dead code in the 'out' label.

  Change-Id: Ibdd05cbb6e2204f6aefdf442698225883c2d7734
  BUG: 789278
  Signed-off-by: arao <arao@redhat.com>
  Reviewed-on: http://review.gluster.org/9676
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Change the subvolume encoding in d_off to be a "global" | Dan Lambright | 2015-03-18 | 1 | -0/+2
  position in the graph rather than relative (local) to a particular
  translator.

  Encoding the volume in this way allows a single translator to manage
  which brick is currently being scanned for directory entries. Using a
  single translator minimizes allocated bits in the d_off. It also
  allows multiple DHT translators in the same graph to have a common
  frame of reference (the graph position) for which brick is being read.
  Multiple DHT translators are needed for the Tiering feature.

  The fix builds off a previous change (9332) which removed subvolume
  encoding from AFR. The fix makes an equivalent change to the EC
  translator. More background can be found in fix 9332 and gluster-dev
  discussions [1].

  DHT and AFR/EC are responsible (as before) for choosing which brick to
  enumerate directory entries in over the readdir lifecycle. The client
  translator receiving the readdir fop encodes the dht_t. It is referred
  to as the "leaf node" in the graph and corresponds to the brick being
  scanned. When DHT decodes the d_off, it translates the leaf node to a
  local subvolume, which represents the next node in the graph leading
  to the brick.

  Tracking of leaf nodes is done in common utility functions. Leaf node
  counts and positional information are updated on a graph switch.

  [1] www.gluster.org/pipermail/gluster-devel/2015-January/043592.html

  Change-Id: Iaf0ea86d7046b1ceadbad69d88707b243077ebc8
  BUG: 1190734
  Signed-off-by: Dan Lambright <dlambrig@redhat.com>
  Reviewed-on: http://review.gluster.org/9688
  Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  Tested-by: Vijay Bellur <vbellur@redhat.com>
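
The idea of carrying a global leaf-node index inside the directory offset can be sketched as simple bit packing. Everything about the layout below is an assumption made for illustration: the commit message does not say how many bits are reserved or where they sit, and `LEAF_BITS`, `doff_encode` and `doff_decode` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the low 8 bits of d_off carry the leaf (brick) index. */
#define LEAF_BITS 8u
#define LEAF_MASK ((UINT64_C(1) << LEAF_BITS) - 1)

/* Pack a brick-local offset and a global leaf index into one d_off.
 * Returns 0 on success, -1 if either value does not fit. */
static int doff_encode(uint64_t brick_off, unsigned int leaf, uint64_t *d_off)
{
    if (leaf > LEAF_MASK || brick_off > (UINT64_MAX >> LEAF_BITS))
        return -1;
    *d_off = (brick_off << LEAF_BITS) | leaf;
    return 0;
}

/* Recover the brick-local offset and the leaf index from a d_off. */
static void doff_decode(uint64_t d_off, uint64_t *brick_off,
                        unsigned int *leaf)
{
    *leaf = (unsigned int)(d_off & LEAF_MASK);
    *brick_off = d_off >> LEAF_BITS;
}
```

Because the index refers to a position in the whole graph, any translator on the path can decode it the same way and map it to whichever of its own subvolumes leads to that brick.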
* protocol/client: sequence CHILD_UP, CHILD_DOWN etc notifications | Krishnan Parthasarathi | 2015-02-07 | 1 | -8/+7
  ... from all bricks in the volume

  This patch is important in the context of MT epoll. With MT epoll,
  notification events from client xlators could reach cluster xlators
  like afr, dht, ec, stripe etc. in different orders. For e.g, in a
  distributed replicate volume of 2 bricks, namely Brick1 and Brick2,
  the following network events are observed by a mount process.

  - connection to Brick1 is broken.
  - connection to Brick1 has been restored.
  - connection to Brick2 is broken.
  - connection to Brick2 has been restored.

  Without establishing a total ordering of events, we can't guarantee
  that cluster xlators like afr, dht perceive them in the same order.
  While we would expect afr (say) to perceive it as only one of Brick1
  and Brick2 going down at any given time, it is possible for the
  notification of Brick2 going offline to race with the notification of
  Brick1 coming back online.

  Change-Id: I78f5a52bfb05593335d0e9ad53ebfff98995593d
  BUG: 1104462
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/9591
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rdma: client connection establishment takes more timeMohammed Rafi KC2014-11-181-16/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For an rdma-only volume, client connection establishment with the server takes more than three seconds. A tcp,rdma volume has 2 ports, one for tcp and one for rdma; the tcp port is stored with brickname and the rdma port is stored as "brickname.rdma" during pmap_signin. During the handshake, when trying to get the brick port for rdma clients, we are not aware of the server transport type, so we append '.rdma' to the brick name. For a tcp,rdma volume there will be an entry with '.rdma', but the lookup will fail for an rdma-only volume. So we try again, this time without appending '.rdma', using a flag variable need_different_port, and it succeeds, but the reconnection happens only after 3 seconds. In this patch, for an rdma-only volume we append '.rdma' during pmap_signin, so during the handshake we get the correct port on the first try. Since we don't need to retry, we can remove the need_different_port flag variable. Change-Id: Ie8e3a7f532d4104829dbe995e99b35e95571466c BUG: 1153569 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/8934 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
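The naming convention described in the message above can be sketched as follows. This is only an illustration under stated assumptions: `pmap_key` is a hypothetical helper, not a GlusterFS function, and the brick path in the usage note is made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of GlusterFS): build the name under which a
 * brick port is registered or looked up. rdma ports get a ".rdma" suffix. */
static int pmap_key(char *buf, size_t len, const char *brick, int is_rdma)
{
    assert(buf != NULL && brick != NULL);

    int n = snprintf(buf, len, "%s%s", brick, is_rdma ? ".rdma" : "");

    /* Report an error if formatting failed or the key was truncated. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

If sign-in and handshake both derive the key through the same rule, for example `pmap_key(key, sizeof key, "/bricks/b1", 1)`, the first lookup can match without a retry.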
* rdma:rdma fuse mount hangs for tcp,rdma volumes if brick is down.Mohammed Rafi KC2014-11-171-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | When we try to mount a tcp,rdma volume with rdma transport using the FUSE protocol, the mount will hang if the brick is down. When we kill a process, the signal is received in the glusterfsd process, which calls pmap_signout only for the port listening on tcp. In the case of tcp,rdma there are two ports, and pmap_signout is not called for the port listening for rdma. So the mount process will try to connect to a port which is not open, and it will keep trying to connect. This patch calls pmap_signout for the rdma port as well, so when the mount tries to get the brick port, it will fail. Change-Id: I23676f65f96eb90b69b76478f7a21412a6aba70f BUG: 1143886 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/8762 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd: Ping timer implementationKrishnan Parthasarathi2014-04-291-262/+0
| | | | | | | | | | | | | | | | | | | | | | | | This patch refactors the existing client ping timer implementation, and makes use of the common code for implementing both client ping timer and the glusterd ping timer. A new gluster rpc program for ping is introduced. The ping timer is only started for peers that have this new program. The default glusterd ping timeout is 30 seconds. It is configurable by setting the option 'ping-timeout' in glusterd.vol. Also, this patch introduces changes in the glusterd-handshake path. The client programs for a peer are now set in the callback of dump_versions, for both the older handshake and the newer op-version handshake. This is the only place in the handshake process where we know what programs a peer supports. Change-Id: I035815ac13449ca47080ecc3253c0a9afbe9016a BUG: 1038261 Signed-off-by: Vijaikumar M <vmallika@redhat.com> Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/5202 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* rpc: transport may be destroyed while rpc isn'tKrishnan Parthasarathi2014-03-051-1/+1
| | | | | | | | | | | | | | | | rpc_clnt object is destroyed after the corresponding transport object is destroyed. But rpc_clnt_reconnect, a timer driven function, refers to the transport object beyond its 'life'. Instead, using the embedded connection object prevents use after free problem wrt transport object. Also, access transport object under conn->lock. Change-Id: Iae28e8a657d02689963c510114ad7cb7e6764e62 BUG: 962619 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/6751 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* protocol/client: conn-id should be unique when lk-heal is offPranith Kumar K2014-02-171-8/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: It was observed that in some cases client disconnects and re-connects before server xlator could detect that a disconnect happened. So it still uses previous fdtable and ltable. But it can so happen that in between disconnect and re-connect an 'unlock' fop may fail because the fds are marked 'bad' in client xlator upon disconnect. Due to this stale locks remain on the brick which lead to hangs/self-heals not happening etc. For the exact bug RCA please look at https://bugzilla.redhat.com/show_bug.cgi?id=1049932#c0 Fix: When lk-heal is not enabled make sure connection-id is different for every setvolume. This will make sure that a previous connection's resources are not re-used in server xlator. Change-Id: Id844aaa76dfcf2740db72533bca53c23b2fe5549 BUG: 1049932 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/6669 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* protocol/client: handle network disconnect/reconnect properlyAnand Avati2013-12-031-0/+1
| | | | | | | | | | | | | if client/server state versions match, we still need to notify parent xlators of reconnection (CHILD_UP) because they were notified of CHILD_DOWN at the time of disconnection. Change-Id: I36c4bde6d8c3db9cb0c48eeb10663b56897c932e BUG: 1037267 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6396 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
* libglusterfs: Add monotonic clocking counter for timer threadHarshavardhana2013-10-151-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gettimeofday() returns the current wall clock time and timezone. Using these functions in order to measure the passage of time (how long an operation took) therefore seems like a no-brainer. This time suffers from some limitations: a. They have a low resolution: “High-performance” timing by definition, requires clock resolutions into the microseconds or better. b. They can jump forwards and backwards in time: Computer clocks all tick at slightly different rates, which causes the time to drift. Most systems have NTP enabled which periodically adjusts the system clock to keep them in sync with “actual” time. The adjustment can cause the clock to suddenly jump forward (artificially inflating your timing numbers) or jump backwards (causing your timing calculations to go negative or hugely positive). In such cases timer thread could go into an infinite loop. From 'man gettimeofday': ---------- .. .. The time returned by gettimeofday() is affected by discontinuous jumps in the system time (e.g., if the system administrator manually changes the system time). If you need a monotonically increasing clock, see clock_gettime(2). .. .. ---------- Rationale: For calculating interval timing for Timer thread, all that’s needed should be clock as a simple counter that increments at a stable rate. This is necessary to avoid the jumps which are caused by using "wall time", this counter must be monotonic that can never “tick” backwards, ever. Change-Id: I701d31e71a85a73d21a6c5cd15583e7a5a645eeb BUG: 1017993 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/6070 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
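The rationale above amounts to measuring intervals on a monotonic counter instead of wall time. A minimal sketch, assuming a POSIX system with `clock_gettime(2)`; the helper names `monotonic_ns` and `elapsed_ns` are hypothetical and do not come from the patch:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: current reading of the monotonic clock in
 * nanoseconds, or -1 if the clock cannot be read. */
static int64_t monotonic_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}

/* Hypothetical helper: interval between two monotonic readings.
 * The clock never steps backwards, so the result is never negative. */
static int64_t elapsed_ns(int64_t start, int64_t end)
{
    assert(end >= start);
    return end - start;
}
```

Because the counter is unaffected by NTP steps or manual clock changes, a timer loop comparing such readings cannot see a negative or hugely inflated interval.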
* protocol/client: Prevent excessive logging of client's "disconnect" messages.Venkatesh Somyajulu2013-05-281-0/+1
| | | | | | | | | | | | | | | | | | | | | Problem: Currently when gluster volume start force is executed, client process will talk to glusterd to get the port of the brick. But if brick's path is not available it cannot return brick's port. So client process will keep connecting and disconnecting from glusterd for port-query which is ultimately responsible for excessive logging of disconnect messages. Fix: Message will be logged just once at INFO level after the first disconnect from glusterd. Afterwards "disconnect" messages will be logged in DEBUG mode. Change-Id: I2b787f3820b5da45e090c562e5698fcfe24a02cd BUG: 959969 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/4953 Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* protocol/client: Avoid double free of framePranith Kumar K2013-02-041-2/+1
| | | | | | | | | | | | | | | When client_submit_request fails it calls cbk. The cleanups should happen only in cbk. The code committed as part of http://review.gluster.org/4357 violates this. Also found that clnt_release_reopen_fd violates this as well. This patch fixes these issues. Change-Id: Ic02ba278724b03c65c00b686c39fd7846122618a BUG: 821056 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4464 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* protocol/client: Periodically attempt reopensPranith Kumar K2013-02-031-18/+81
| | | | | | | | | | | | | | | | | | | | | | If the brick is taken down and the hard disk is replaced and the brick is brought back up, the re-opens of the open-fds will fail because the file is not present on the brick. Re-opens are not attempted even if the files are re-created by self-heal until the brick is brought down after the files are re-created and brought back up. This is a problem with a VM-store in a replica-setup. Until the fd is re-opened the writes will never happen on the brick where the hard-disk is replaced. To handle this situation gracefully, client xlator is enhanced to perform finodelk, fxattrop, writev, readv using anonymous fds if the file is yet to be re-opened. If the fop succeeds then client xlator attempts re-open. Change-Id: I1cc6d1bbf8227cd996868ab2ed0a57fb05e00017 BUG: 821056 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4358 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* protocol/client: Add fdctx back to saved-list after reopenPranith Kumar K2013-02-031-90/+73
| | | | | | | | | Change-Id: I01caa1b51570359e6e3ffe1ffb7279cbdb0b0c64 BUG: 821056 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4357 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* protocol/client: print a message regarding brick status in case of failureKrutika Dhananjay2012-12-201-1/+3
| | | | | | | | | | | | | | that way, it would help admins to look at the corresponding brick directly. All credit goes to Amar. Change-Id: I959df59111864cc0574945d827f8fe5f2d919491 BUG: 839021 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/4341 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* client-handshake: synchronize config.remote_port setting b/wRaghavendra G2012-10-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | rpc_clnt_reconnect and client_query_portmap_cbk problem: ------- Theoretically there is a possibility that we could complete querying the remote brick's port number before rpc_transport_connect can return. If rpc_clnt_reconnect happens to be the caller of rpc_transport_connect and we've already got the remote brick's port number by the time rpc_transport_connect returns, without synchronization, rpc_clnt_connect resets config.remote_port to zero even before we have attempted a connection with remote brick. fix: --- By making only poll thread do setting and resetting of config.remote_port, we avoid the race-condition. Change-Id: I51879ba1cac651a80ff5c9c070ec7fe1ceea9e05 BUG: 765051 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/4044 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* protocol/client: quick-reconnect after portmap queryAnand Avati2012-09-251-1/+1
| | | | | | | | | | | | | | | | | | Currently the disconnect after a portmap query is treated like an ordinary disconnect and the reconnection attempt (in this case, to the brick) is attempted only after 3 secs. This results in a delay which is unnecessary. Mark the disconnection happening because of a successful portmap query as needing a 'quick reconnect' to avoid the delay for this special case. Change-Id: I43c8292ff0c30858d883ff3569a3761acbf2f5eb BUG: 860220 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/3994 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* protocol/client: Fix negative return in client_setvolumeKrutika Dhananjay2012-08-011-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: The function dict_serialized_length could, owing to an error, return a negative integer (-EINVAL) that gets assigned to an unsigned int member 'dict_len' of gf_setvolume_req structure. FIX: Hold the value returned by dict_serialized_length in local variable ret (which is a signed int). Test if ret is negative, in which case the control would anyway branch to the label fail where the function returns. Otherwise dict_len is assigned with ret, in turn giving a more meaningful value to the attribute length. TEST: Attached gdb to glusterfs mount process, set breakpoint at client_setvolume, forced dict_serialized_length to return -EINVAL (indirectly by forcing _dict_serialized_length to return -EINVAL after setting count to -1 within its body) and checked the value of ret (which is now sure to contain a negative value) whose value will be appropriately tested to decide the next course of action within client_setvolume: whether to simply exit due to an error or execute the subsequent statements. Change-Id: Ib22ad8f30d8ae04acaf2ff5bfee9c348a2c47148 BUG: 789278 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.com/3755 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
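The pattern described in the FIX paragraph above (hold the result in a signed local, test it, and only then assign to the unsigned field) can be sketched as below. The struct and both functions are hypothetical stand-ins, not the real gf_setvolume_req or dict_serialized_length definitions, and the 16-bytes-per-pair size is an arbitrary assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a request with an unsigned length member. */
struct setvolume_req {
    unsigned int dict_len;
};

/* Hypothetical stand-in for a length function that may return -EINVAL. */
static int serialized_length(int pair_count)
{
    if (pair_count < 0)
        return -EINVAL;
    return pair_count * 16; /* arbitrary size per pair, for illustration */
}

/* Store the length only after checking the signed return value, so a
 * negative error code never wraps into a huge unsigned length. */
static int fill_dict_len(struct setvolume_req *req, int pair_count)
{
    assert(req != NULL);

    int ret = serialized_length(pair_count);
    if (ret < 0)
        return ret; /* the caller branches to its failure path */

    req->dict_len = (unsigned int)ret;
    return 0;
}
```

Assigning -EINVAL directly to the unsigned member would instead produce a very large positive length, which is the failure the commit describes.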
* remove useless if-before-free (and free-like) functionsJim Meyering2012-07-131-6/+3
| | | | | | | | | | | | See comments in http://bugzilla.redhat.com/839925 for the code to perform this change. Signed-off-by: Jim Meyering <meyering@redhat.com> BUG: 839925 Change-Id: I10e4ecff16c3749fe17c2831c516737e08a3205a Reviewed-on: http://review.gluster.com/3661 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* rpc: variable name changesAmar Tumballi2012-07-121-8/+8
| | | | | | | | | | | | | 's/3_1/3_3/g' in case of glusterfs protocol 's/3_1_/_/g' in case of CLI and mgmt protocol Change-Id: I6e6510d02c05f68f290c52ed284c04576326e12c Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 764890 Reviewed-on: http://review.gluster.com/3632 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* protocol/client: Re-open should not have O_CREAT|O_TRUNC|O_EXCLPranith Kumar K2012-06-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RCA The bug is observed in 3.2.x because posix xlator changes the uid/gid of file as per frame->root-uid/gid if O_CREAT flag is set in open fop. Posix does not do this in 3.3.x so that bug does not appear anymore but this issue exposed the actual bug in client xlator re-open. Re-open of a file on re-connection should not perform re-open with the same flags at the time of open/create/opendir. Imagine a case where a file is opened with O_TRUNC|O_RDWR and some data is written to it, now if the brick goes down and comes back the file will be truncated. When I tested this case, the file is not truncated because locks xlator resets O_TRUNC unconditionally. Client xlator re-open bug and locks xlator bug cancel each other. Fix Reset O_CREAT|O_TRUNC|O_EXCL flags in re-open. Locks xlator should not reset O_TRUNC. Additional changes Removed wbflags as it is not assigned at all. Testcases Automated go program is at: ://bugzilla.redhat.com/show_bug.cgi?id=807976#c2 Change-Id: I0080344fdda2e62e7c976c35a5bf5f1fa8838891 BUG: 807976 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3582 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
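The Fix paragraph above boils down to masking the creation and truncation flags before replaying an open. A minimal sketch, assuming POSIX `<fcntl.h>` flag values; `reopen_flags` is a hypothetical helper name, not the function used in the client xlator:

```c
#include <assert.h>
#include <fcntl.h>

/* Hypothetical helper: derive the flags for a re-open from the flags of
 * the original open/create. O_CREAT, O_TRUNC and O_EXCL are dropped so a
 * reconnect can neither create, truncate, nor fail on an existing file. */
static int reopen_flags(int open_flags)
{
    int flags = open_flags & ~(O_CREAT | O_TRUNC | O_EXCL);

    /* Invariant: none of the stripped flags survive. */
    assert((flags & (O_CREAT | O_TRUNC | O_EXCL)) == 0);
    return flags;
}
```

With this mask, a file originally opened with O_TRUNC|O_RDWR is re-opened with O_RDWR only, so data written before the brick went down is not truncated on reconnect.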