path: root/rpc/rpc-lib
Commit message | Author | Age | Files | Lines
* rpc: use export map to minimize exported symbols in libgf{rpc,xdr}.so | Kaleb S. KEITHLEY | 2018-01-12 | 2 | -1/+69

    Without an export map (at link time), libgfrpc and libgfxdr export over
    150 and 450 symbols, respectively. Many are not used by anything else.
    (It is unclear what the unused symbols are; some may be simple
    sloppiness, e.g. not declaring functions static that should be. Others
    may be intra-library calls that can't be static but aren't part of the
    API, per se.)

    By linking with an export map, the number of exported symbols is reduced
    to ~60 and ~250, respectively.

    This parallels the similar change made to libglusterfs recently and the
    older changes to the xlators to minimize the symbols that are visible
    (exported) from the .so.

    And I don't know, do we want to go all the way to symbol versions? For
    these libs? And for libglusterfs?

    fixes gluster/glusterfs#392

    Change-Id: I9cdc3eee10e5f1408d7e7f2f29fad597c97e4003
    Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
* rpc: fix use-after-free of clnt after rpc transport cleanup | Kinglong Mee | 2017-12-27 | 1 | -1/+4

    If the transport object is freed in rpc_transport_unref, an
    RPC_TRANSPORT_CLEANUP notification is pushed to rpc_clnt_notify, where
    the rpc_clnt (which contains conn) is freed. Any use of conn after
    rpc_transport_unref is therefore a use-after-free.

    Change-Id: I5cac8a8e7ced7c1079930080a12abf02d46667d5
    Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
* Use RTLD_LOCAL for symbol resolution | Prashanth Pai | 2017-12-27 | 1 | -1/+1

    RTLD_LOCAL is the default value for the symbol visibility flag of
    dlopen() in Linux and NetBSD. Using it avoids conflicts during symbol
    resolution. This also allows us to detect xlators that have not been
    explicitly linked with libraries that they use. This used to go
    unnoticed when RTLD_GLOBAL was being used.

    BUG: 1193929
    Change-Id: I50db6ea14ffdee96596060c4d6bf71cd3c432f7b
    Signed-off-by: Prashanth Pai <ppai@redhat.com>
* all: Simplify component message id's definition | Xavier Hernandez | 2017-12-14 | 1 | -60/+23

    This patch creates a new way of defining message ids that is easier and
    less error prone because it doesn't require so many manual changes each
    time a new component is defined or a new message is created.

    Change-Id: I71ba8af9ac068f5add7e74f316a2478bc991c67b
    Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
* glusterfs: Use gcc builtin ATOMIC operator to increase/decrease refcount. | Mohit Agrawal | 2017-12-12 | 4 | -22/+9

    Problem: In the glusterfs code base we call mutex_lock/unlock to take or
             drop a reference on an object. This can sometimes cause lock
             contention.

    Solution: There is no need to use a mutex to increase/decrease a ref
              counter; use the gcc builtin ATOMIC operations instead.

    Test: I have not yet measured the performance gain of this patch in
          glusterfs itself, but I tested the small program below (in a mutex
          and an atomic variant) and saw a good difference.

        #include <stdint.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <pthread.h>

        /* Shared counter; its declaration was not part of the original listing. */
        static int64_t glob;
        static int numOuterLoops;

        /* Each thread increments the shared counter numOuterLoops times. */
        static void *
        threadFunc(void *arg)
        {
            int j;

            for (j = 0; j < numOuterLoops; j++) {
                __atomic_add_fetch (&glob, 1, __ATOMIC_ACQ_REL);
            }
            return NULL;
        }

        int
        main(int argc, char *argv[])
        {
            int s, j;
            int numThreads;
            pthread_t *thread;
            int64_t n = 0;

            /* Both arguments are required, so argc must be at least 3. */
            if (argc < 3) {
                printf(" Please provide 2 args Num of threads && Outer Loop\n");
                exit (-1);
            }

            numThreads = atoi(argv[1]);
            numOuterLoops = atoi (argv[2]);
            printf("\tthreads: %d; outer loops: %d;\n", numThreads, numOuterLoops);

            thread = calloc(numThreads, sizeof(pthread_t));
            if (thread == NULL) {
                printf ("calloc error so exit\n");
                exit (-1);
            }

            __atomic_store (&glob, &n, __ATOMIC_RELEASE);
            for (j = 0; j < numThreads; j++) {
                s = pthread_create(&thread[j], NULL, threadFunc, NULL);
                if (s != 0) {
                    printf ("pthread_create failed so exit\n");
                    exit (-1);
                }
            }

            for (j = 0; j < numThreads; j++) {
                s = pthread_join(thread[j], NULL);
                if (s != 0) {
                    printf ("pthread_join failed so exit\n");
                    exit (-1);
                }
            }

            printf("glob value is %lld\n",
                   (long long) __atomic_load_n (&glob, __ATOMIC_RELAXED));
            exit(0);
        }

        time ./thr_count 800 800000
            threads: 800; outer loops: 800000;
        glob value is 640000000
        real 1m10.288s
        user 0m57.269s
        sys 3m31.565s

        time ./thr_count_atomic 800 800000
            threads: 800; outer loops: 800000;
        glob value is 640000000
        real 0m20.313s
        user 1m20.558s
        sys 0m0.028

    Change-Id: Ie5030a52ea264875e002e108dd4b207b15ab7cc7
    Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
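The refcounting idea in the commit above can be sketched with the same GCC builtins the test program uses. This is a minimal illustration only: `demo_obj_t`, `demo_ref` and `demo_unref` are hypothetical names, not GlusterFS types or functions, and the ACQ_REL memory order is simply carried over from the test program.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical refcounted object; not a GlusterFS type. */
typedef struct {
    int64_t refcount;
} demo_obj_t;

/* Take a reference without a mutex; returns the new count. */
int64_t demo_ref(demo_obj_t *obj)
{
    return __atomic_add_fetch(&obj->refcount, 1, __ATOMIC_ACQ_REL);
}

/* Drop a reference without a mutex; returns the new count.
 * A result of 0 means the caller held the last reference. */
int64_t demo_unref(demo_obj_t *obj)
{
    int64_t left = __atomic_sub_fetch(&obj->refcount, 1, __ATOMIC_ACQ_REL);

    /* A negative count would mean an unbalanced unref. */
    assert(left >= 0);
    return left;
}
```

Because the decrement returns the new value atomically, exactly one caller observes 0 and can take responsibility for freeing the object.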
* rpc: Fix format warnings when using IPV6_DEFAULT | Shreyas Siravara | 2017-12-07 | 2 | -2/+2

    Change-Id: I22e622212f30defe6f2af1a67d7b48a88d37a097
    BUG: 1520974
    Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
* metrics: provide options to dump metrics from xlators | Amar Tumballi | 2017-12-06 | 1 | -0/+1

    * Introduce xlator methods to allow dumping of metrics
    * Separate options to get the metrics dumped in a path

    Updates #168

    Change-Id: I7df80df33b71d6f449f03c2332665b4a45f6ddf2
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
* xdr: Fix build errors due to missing xdr symbol when building against TIRPC | Shreyas Siravara | 2017-12-06 | 1 | -0/+3

    Change-Id: Ic52045f5dd19e551612242450b8982f42ff327e9
    Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
* rio/everywhere: add icreate/namelink fop | Susant Palai | 2017-12-05 | 1 | -0/+2

    icreate creates the inode, while namelink links the basename to its
    parent gfid. For now mkdir is the primary user of these fops.

    Better distribution is achieved by creating the inode on, say, mds1 and
    linking the basename to its parent gfid on mds2. The inode serves
    readdirp, stat etc.

    More details about the fops are present at:
    https://review.gluster.org/#/c/13395/3/design/DHT2/DHT2_Icreate_Namelink_Notes.md

    This is a backport of three patches from the experimental branch.
    1- https://review.gluster.org/#/c/18085/
    2- https://review.gluster.org/#/c/18086/
    3- https://review.gluster.org/#/c/18094/

    Updates gluster/glusterfs#243

    Change-Id: I1bd3d5a441a3cfab1acfeb52f15c6c867d362592
    Signed-off-by: Susant Palai <spalai@redhat.com>
* rpc: Eliminate conn->lock contention by using more granular locks | Mohit Agrawal | 2017-11-28 | 3 | -14/+6

    rpc_clnt_submit() acquires conn->lock before the call to
    rpc_transport_submit_request() and the subsequent queuing of the frame
    into the saved_frames list. However, as part of handling
    RPC_TRANSPORT_MSG_RECEIVED and RPC_TRANSPORT_MSG_SENT notifications in
    rpc_clnt_notify(), conn->lock is again used to atomically update the
    conn->last_received and conn->last_sent event timestamps.

    So when conn->lock is acquired as part of submitting a request, a
    parallel POLLIN notification gets blocked at the rpc layer until the
    request submission completes and the lock is released.

    To get around this, this patch calls clock_gettime() (instead of
    gettimeofday()) to update the event timestamps in conn->last_received
    and conn->last_sent. No mutex_lock is needed around that call because
    clock_gettime() is thread safe.

    Note: fio was run on a VM after applying the patch; IOPS improved.

    Change-Id: I347b5031d61c426b276bc5e07136a7172645d763
    BUG: 1467614
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
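As a rough sketch of the timestamp update described above: the struct and function names below are hypothetical stand-ins, only the two field names come from the commit message, and CLOCK_REALTIME is an assumption since the message does not say which clock the patch uses.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <time.h>

/* Hypothetical stand-in for the connection object; only the two
 * timestamp fields named in the commit message are modelled. */
struct demo_conn {
    struct timespec last_sent;
    struct timespec last_received;
};

/* Record the receive time without taking the connection lock.
 * Returns the clock_gettime() result (0 on success). */
int demo_mark_received(struct demo_conn *conn)
{
    return clock_gettime(CLOCK_REALTIME, &conn->last_received);
}

/* Record the send time, again without the connection lock. */
int demo_mark_sent(struct demo_conn *conn)
{
    return clock_gettime(CLOCK_REALTIME, &conn->last_sent);
}
```

The point of the design is that a POLLIN notification can record its timestamp while another thread still holds the lock for request submission.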
* rpc-lib: coverity fixes | Milind Changire | 2017-11-22 | 5 | -38/+70

    Scan URL:
    https://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2017-11-10-0f524f07/html/

    ID: 9 (BAD_SHIFT)
    ID: 58 (CHECKED_RETURN)
    ID: 98 (DEAD_CODE)
    ID: 249, 250, 251, 252 (MIXED_ENUMS)
    ID: 289, 297 (NULL_RETURNS)
    ID: 609, 613, 622, 644, 653, 655 (UNUSED_VALUE)
    ID: 432 (RESOURCE_LEAK)

    Change-Id: I2349877214dd38b789e08b74be05539f09b751b9
    BUG: 789278
    Signed-off-by: Milind Changire <mchangir@redhat.com>
* rpc : Change the way client uuid is built | Poornima G | 2017-11-20 | 1 | -0/+1

    Problem:
    Today the main users of the client uuid are the protocol layers, locks
    and leases. The protocol layers require each client uuid to be unique,
    even across connects and disconnects. Locks and leases on the server
    side also use the same client uid, which changes across graph switches
    and file migrations. This makes graph switch and file migration tedious
    for locks and leases. File migration across bricks becomes difficult
    because the client uuid for the same client is different on the other
    brick.

    The exact same set of issues exists for leases as well.

    The solution is to introduce a constant in the client-uid string which
    locks and leases can use to identify the owner client across bricks.

    Client uuid currently:
    %s(ctx uuid)-%s(protocol client name)-%d(graph id)%s(setvolume count/reconnect count)

    Proposed client uuid:
    "CTX_ID:%s-GRAPH_ID:%d-PID:%d-HOST:%s-PC_NAME:%s-RECON_NO:%s"
    - CTX_ID: This will be constant per client.
    - GRAPH_ID, PID, HOST, PC_NAME (protocol client name), RECON_NO
      (setvolume count) remain the same.

    Change-Id: Ia81d57a9693207cd325d7b26aee4593fcbd6482c
    BUG: 1369028
    Signed-off-by: Susant Palai <spalai@redhat.com>
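To illustrate how the proposed format lets locks and leases recognise the same owner across bricks, here is a small sketch. Only the format string is taken from the commit message; `demo_build_client_uid` and `demo_same_owner` are hypothetical helpers, and the sketch assumes the context id never contains the literal marker `-GRAPH_ID:`.

```c
#include <stdio.h>
#include <string.h>

/* Format string quoted from the commit message. */
#define DEMO_CLIENT_UID_FMT \
    "CTX_ID:%s-GRAPH_ID:%d-PID:%d-HOST:%s-PC_NAME:%s-RECON_NO:%s"

/* Build a client uid into buf; returns the snprintf() result. */
int demo_build_client_uid(char *buf, size_t len, const char *ctx_id,
                          int graph_id, int pid, const char *host,
                          const char *pc_name, const char *recon_no)
{
    return snprintf(buf, len, DEMO_CLIENT_UID_FMT, ctx_id, graph_id, pid,
                    host, pc_name, recon_no);
}

/* Return 1 when both uids carry the same CTX_ID part, i.e. they belong
 * to the same client regardless of graph, brick or reconnect count. */
int demo_same_owner(const char *a, const char *b)
{
    const char *end_a = strstr(a, "-GRAPH_ID:");
    const char *end_b = strstr(b, "-GRAPH_ID:");

    if (end_a == NULL || end_b == NULL)
        return 0;
    if (end_a - a != end_b - b)
        return 0;
    return strncmp(a, b, (size_t)(end_a - a)) == 0;
}
```

Two uids built for the same context but different graph ids or protocol client names compare as the same owner, which is the property the commit relies on for file migration.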
* rpc: bring a new protocol version | Amar Tumballi | 2017-11-07 | 1 | -0/+2

    * xdr: add gfid to the on-wire format for fsetattr/rchecksum
    * as this is a change in the on-wire XDR format, backward-compatible RPC
      programs were needed.

    Signed-off-by: Amar Tumballi <amarts@redhat.com>
    BUG: 827334
    Change-Id: Id0a2da3632516dc1a5560dde2b151b2e5f0be8e5
* rpc: optimize fop program lookup | Milind Changire | 2017-11-06 | 2 | -4/+9

    Ensure that the fop program is the first in the program list so that
    the minimum amount of time is spent searching for the program in the
    most frequently needed use case.

    Change-Id: I45c3dcdbf39ec90ba39d914432d13a2ace00a5ee
    BUG: 1509647
    Signed-off-by: Milind Changire <mchangir@redhat.com>
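The effect of keeping the hot program at the head of a linearly searched list can be sketched as follows. All names here are hypothetical; this is not the rpcsvc program list, only a minimal model of it.

```c
#include <stddef.h>

/* Hypothetical program entry in a singly linked list. */
struct demo_prog {
    int prognum;
    struct demo_prog *next;
};

/* Ordinary programs are appended at the tail. */
void demo_register_tail(struct demo_prog **head, struct demo_prog *p)
{
    while (*head != NULL)
        head = &(*head)->next;
    p->next = NULL;
    *head = p;
}

/* The most frequently looked-up program is inserted at the head. */
void demo_register_head(struct demo_prog **head, struct demo_prog *p)
{
    p->next = *head;
    *head = p;
}

/* Linear search; *steps receives the number of entries visited. */
struct demo_prog *demo_lookup(struct demo_prog *head, int prognum, int *steps)
{
    *steps = 0;
    for (; head != NULL; head = head->next) {
        (*steps)++;
        if (head->prognum == prognum)
            return head;
    }
    return NULL;
}
```

With the hot entry at the head, its lookup visits one entry no matter how many other programs are registered.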
* stack: change gettimeofday() to clock_gettime() | Amar Tumballi | 2017-11-06 | 1 | -1/+2

    Achieving the above also required the following changes:
    * more sanity in how 'frame->op' is assigned.
    * infra to have 'stats' as a separate section in the 'xlator_t' structure

    Updates #137

    Change-Id: I36679bf9577f3ed00a695b4e7d92870dcb3db8e1
    Signed-off-by: Amar Tumballi <amarts@redhat.com>
* rpc: make actor search parallel | Milind Changire | 2017-11-06 | 2 | -28/+28

    Problem:
    On a service request, the actor is searched using an exclusive mutex
    lock, which is not really necessary since most of the time the actor
    list is going to be searched and not modified.

    Solution:
    Use a read-write lock instead of a mutex lock. Only modify operations
    on a service need to be done under a write-lock, which grants exclusive
    access to the code.

    Change-Id: Ia227351b3f794bd8eee70c7a76d833cc716ab113
    BUG: 1509644
    Signed-off-by: Milind Changire <mchangir@redhat.com>
* gluster: IPv6 single stack support | Kevin Vigor | 2017-10-24 | 3 | -0/+97

    Summary:
    - This diff changes all locations in the code to prefer the inet6 family
      instead of inet. This allows GlusterFS to operate via IPv6 instead of
      IPv4 for all internal operations while still being able to serve (FUSE
      or NFS) clients via IPv4.
    - The changes apply to NFS as well.
    - This diff ports D1892990, D1897341 & D1896522 to the 3.8 branch.

    Test Plan: Prove tests!

    Reviewers: dph, rwareing

    Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
    Change-Id: I34fdaaeb33c194782255625e00616faf75d60c33
    BUG: 1406898
    Reviewed-on-3.8-fb: http://review.gluster.org/16059
    Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
    Tested-by: Shreyas Siravara <sshreyas@fb.com>
* rpc: free conn->name in rpc_clnt_destroy() | Niels de Vos | 2017-10-11 | 1 | -0/+1

    Change-Id: Idf99908aa48718a7faf7f0bbc647a679ec548282
    BUG: 1443145
    Signed-off-by: Niels de Vos <ndevos@redhat.com>
* rpc: free registered callback programs | Niels de Vos | 2017-10-11 | 1 | -0/+7

    Change-Id: I8c6f6b642f025d1faf74015b8f7aaecd7ebfd4d5
    BUG: 1443145
    Signed-off-by: Niels de Vos <ndevos@redhat.com>
* heal: New feature heal info summary to list the status of brick and count of entries to be healed | Mohamed Ashiq Liyazudeen | 2017-09-15 | 1 | -0/+1

    Command output:

    Brick 192.168.2.8:/brick/1
    Status: Connected
    Total Number of entries: 363
    Number of entries in heal pending: 362
    Number of entries in split-brain: 0
    Number of entries possibly healing: 1

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <cliOutput>
      <healInfo>
        <bricks>
          <brick hostUuid="9105dd4b-eca8-4fdb-85b2-b81cdf77eda3">
            <name>192.168.2.8:/brick/1</name>
            <status>Connected</status>
            <totalNumberOfEntries>363</totalNumberOfEntries>
            <numberOfEntriesInHealPending>362</numberOfEntriesInHealPending>
            <numberOfEntriesInSplitBrain>0</numberOfEntriesInSplitBrain>
            <numberOfEntriesPossiblyHealing>1</numberOfEntriesPossiblyHealing>
          </brick>
        </bricks>
      </healInfo>
      <opRet>0</opRet>
      <opErrno>0</opErrno>
      <opErrstr/>
    </cliOutput>

    Change-Id: I40cb6f77a14131c9e41b292f4901b41a228863d7
    BUG: 1261463
    Signed-off-by: Mohamed Ashiq Liyazudeen <mliyazud@redhat.com>
    Reviewed-on: https://review.gluster.org/12154
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Tested-by: Karthik U S <ksubrahm@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Ravishankar N <ravishankar@redhat.com>
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* rpc: destroy transport after client_t | Milind Changire | 2017-08-31 | 1 | -3/+0

    Problem:
    1. The ref count on the client_t object is incremented in
       rpcsvc_request_init(), which is incorrect.
    2. No ref is taken when delegating to grace_time_handler().

    Solution:
    1. Only fop requests which require processing down the graph via stack
       'frames' now ref count the request in get_frame_from_request().
    2. Take a ref on the client_t object in server_rpc_notify() but avoid
       dropping it in RPCSVC_EVENT_TRANSPORT_DESTROY. Drop the ref
       unconditionally when exiting grace_time_handler().
       Also, avoid dropping the ref on client_t in
       RPCSVC_EVENT_TRANSPORT_DESTROY when ref management has been delegated
       to grace_time_handler().

    Change-Id: Ic16246bebc7ea4490545b26564658f4b081675e4
    BUG: 1481600
    Reported-by: Raghavendra G <rgowdapp@redhat.com>
    Signed-off-by: Milind Changire <mchangir@redhat.com>
    Reviewed-on: https://review.gluster.org/17982
    Tested-by: Raghavendra G <rgowdapp@redhat.com>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
* Revert "rpc: disable client on disconnection from rebalance" | Milind Changire | 2017-08-24 | 1 | -4/+0

    This reverts commit 5b14c11d3cae38bc66006b02217ede485ae30dea.

    BUG: 1484225
    Change-Id: I3269d3fc64de3f3cc6f670ea564a87d7725e10fd
    Signed-off-by: Milind Changire <mchangir@redhat.com>
    Reviewed-on: https://review.gluster.org/18113
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
    Tested-by: Atin Mukherjee <amukherj@redhat.com>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* rpc: disable client on disconnection from rebalance | Milind Changire | 2017-08-23 | 1 | -0/+4

    Problem:
    The glusterd rpc code path attempts to reconnect to the rebalance
    process via the reconnect timer even after the rebalance process has
    disconnected.

    Solution:
    Set the clnt->disabled flag to 1 to avoid reconnection and cause the
    clnt object to be unref'd.

    Change-Id: I4e38eaef45d2fdea86d25e9dff9f1af0cd29cf66
    BUG: 1484225
    Signed-off-by: Milind Changire <mchangir@redhat.com>
    Reviewed-on: https://review.gluster.org/18093
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Tested-by: Raghavendra G <rgowdapp@redhat.com>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* tier: separation of attach-tier from add-brick | hari gowtham | 2017-08-01 | 1 | -0/+1

    PROBLEM: Both attach tier and add brick share the same RPC and set of
    code. This becomes a hurdle while trying to implement add brick on a
    tiered volume.

    FIX: This patch separates add brick and attach tier, giving them
    separate RPCs.

    Change-Id: Iec57e972be968a9ff00b15b507e56a4f6dc398a2
    BUG: 1376326
    Signed-off-by: hari gowtham <hgowtham@redhat.com>
    Reviewed-on: https://review.gluster.org/15503
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Tested-by: hari gowtham <hari.gowtham005@gmail.com>
    Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* rpc: prevent logging null client [Invalid argument] | Prashanth Pai | 2017-07-25 | 1 | -1/+3

    The following log entry is observed during volume operations such as
    volume create:

    [2017-07-20 05:13:43.213797] E [client_t.c:321:gf_client_ref]
    (-->/usr/local/lib/libgfrpc.so.0(rpcsvc_request_create+0x1a4) [0x7f987f66cd20]
    -->/usr/local/lib/libgfrpc.so.0(rpcsvc_request_init+0xd0) [0x7f987f66ca23]
    -->/usr/local/lib/libglusterfs.so.0(gf_client_ref+0x56) [0x7f987f91cbd5] )
    0-client_t: null client [Invalid argument]

    Change-Id: I49ba753e8d1a828bb275b0ccb1a181706774f387
    BUG: 1193929
    Signed-off-by: Prashanth Pai <ppai@redhat.com>
    Reviewed-on: https://review.gluster.org/17848
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Amar Tumballi <amarts@redhat.com>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* rpc: include current second in timed out frame cleanup | Milind Changire | 2017-07-21 | 1 | -1/+1

    Problem:
    Frames which time out in the current second are missed.

    Solution:
    Change the test to include frames timing out in the current second,
    i.e. timeout <= current.tv_sec instead of timeout < current.tv_sec.

    Change-Id: I459d47856ade2b657a0289e49f7f63da29186d6e
    BUG: 1468433
    Signed-off-by: Milind Changire <mchangir@redhat.com>
    Reviewed-on: https://review.gluster.org/17722
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
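The boundary condition fixed above can be shown side by side. Both functions are hypothetical illustrations of the comparison quoted in the commit message, with `now` standing in for current.tv_sec.

```c
#include <time.h>

/* Old test: a frame whose timeout equals the current second is missed. */
int demo_timed_out_exclusive(time_t timeout, time_t now)
{
    return timeout < now;
}

/* New test: a frame timing out in the current second is included. */
int demo_timed_out_inclusive(time_t timeout, time_t now)
{
    return timeout <= now;
}
```

The two tests differ only when timeout equals the current second, which is exactly the case the commit says was being missed.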
* libglusterfs: Name threads on creation | Raghavendra Talur | 2017-07-19 | 1 | -1/+1

    Set names to threads on creation for easier debugging.

    Output of top -H -p <PID-OF-GLUSTERFSD>

    Before:
    19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd
    19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd
    19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd
    19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
     5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
     7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd

    After:
    19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustertimer
    19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
    19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustermemsweep
    19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc0
    19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc1
    19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll0
    19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteridxwrker
    19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteriotwr0
    19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrssign
    19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrswrker
    19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterclogecon
    19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd0
    19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd1
    19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd2
    19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixjan
    19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixfsy
    25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll1
     5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll2
     7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixhc

    Change-Id: Id5f333755c1ba168a2ffaa4fce6e71c375e10703
    BUG: 1254002
    Updates: #271
    Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
    Reviewed-on: https://review.gluster.org/11926
    Reviewed-by: Prashanth Pai <ppai@redhat.com>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* program/GF-DUMP: Shield ping processing from traffic to Glusterfs Program | Raghavendra G | 2017-07-18 | 2 | -5/+103

    Since the poller thread bears the brunt of execution until the request
    is handed over to io-threads, the poller thread experiences lock
    contention in the control flow up to io-threads, which slows it down.
    This delay invariably affects reading ping requests from the network and
    responding to them, resulting in increased ping latencies, which
    sometimes results in a ping-timer-expiry on the client, leading to a
    disconnect of the transport. So, this patch aims to free the poller
    thread from executing the code of the Glusterfs Program. We do this by
    making:

    * Glusterfs Program register itself asking rpcsvc to execute its
      actors in its own threads.
    * GF-DUMP Program register itself asking rpcsvc to _NOT_ execute its
      actors in its own threads. Otherwise the program's own threads become
      a bottleneck in processing ping traffic. This means that the poller
      thread reads a ping packet, invokes its actor and hands the response
      msg to the transport queue.

    Change-Id: I526268c10bdd5ef93f322a4f95385137550a6a49
    Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
    BUG: 1421938
    Reviewed-on: https://review.gluster.org/17105
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* Link against missed libraries to resolve symbols | Prashanth Pai | 2017-07-03 | 1 | -1/+2

    When external programs perform a dlopen("..so", RTLD_LAZY|RTLD_LOCAL)
    on some shared objects like xlators, it can fail with dlerror set to
    the error string "undefined symbol <some-type>".

    This was observed for the following shared objects:
    fuse.so, quota.so, quotad.so, server.so, libgfrpc.so and socket.so

    P.S: This was found while running a go program which fetches the list
    of xlator options (volume_option_t) from the xlator's shared object.

    BUG: 1193929
    Change-Id: I7b958409cf11fb67c2be32a3f85a96fb1260236b
    Signed-off-by: Prashanth Pai <ppai@redhat.com>
    Reviewed-on: https://review.gluster.org/17659
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* multiple: fix struct/typedef inconsistencies | Jeff Darcy | 2017-06-30 | 1 | -1/+1

    The most common pattern, both in our code and elsewhere, is this:

        struct _xyz {
            ...
        };
        typedef struct _xyz xyz_t;

    These exceptions - especially call_frame/call_stack - have been slowing
    down code navigation for years. By converging on a single pattern,
    navigating from xyz_t in code to the actual definition of struct _xyz
    (i.e. without having to visit the typedef first) might even be
    automatable.

    Change-Id: I0e5dd1f51f98e000173c62ef4ddc5b21d9ec44ed
    Signed-off-by: Jeff Darcy <jdarcy@fb.com>
    Reviewed-on: https://review.gluster.org/17650
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Tested-by: Jeff Darcy <jeff@pl.atyp.us>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
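A compilable instance of the pattern quoted in the commit above, filled in with a placeholder member. The `id` field and the `xyz_id` accessor are assumptions added only so the example does something observable.

```c
/* The tag and the typedef differ only by the leading underscore and
 * the _t suffix, so one name can be derived from the other. */
struct _xyz {
    int id; /* placeholder member, not from the original message */
};
typedef struct _xyz xyz_t;

/* Code can use either spelling; both name the same type. */
int xyz_id(const xyz_t *x)
{
    return x->id;
}
```

Since `xyz_t` and `struct _xyz` are the same type, a pointer declared with one spelling can be passed where the other is expected.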
* rpc: use GF_ATOMIC_INC to generate rpc_clnt's callid | Zhou Zhengping | 2017-05-08 | 2 | -17/+3

    Change-Id: I57ad970411db1ccd3d2c56c504c7da9cc221051f
    BUG: 1448692
    Signed-off-by: Zhou Zhengping <johnzzpcrystal@gmail.com>
    Reviewed-on: https://review.gluster.org/17198
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* rpc: Remove accidental IPV6 changes | Kaushal M | 2017-05-05 | 3 | -97/+0

    They snuck in with the HALO patch (07cc8679c)

    Change-Id: I8ced6cbb0b49554fc9d348c453d4d5da00f981f6
    BUG: 1447953
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: https://review.gluster.org/17174
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
* Halo Replication feature for AFR translator | Kevin Vigor | 2017-05-02 | 5 | -12/+160

    Summary:
    Halo Geo-replication is a feature which allows Gluster or NFS clients to
    write locally to their region (as defined by a latency "halo" or
    threshold if you like), and have their writes asynchronously propagate
    from their origin to the rest of the cluster. Clients can also write
    synchronously to the cluster simply by specifying a halo-latency which
    is very large (e.g. 10 seconds), which will include all bricks.

    In other words, it allows clients to decide at mount time if they
    desire synchronous or asynchronous IO into a cluster, and the cluster
    can support both of these modes to any number of clients simultaneously.

    There are a few new volume options due to this feature:
      halo-shd-latency: The threshold below which self-heal daemons will
        consider children (bricks) connected.
      halo-nfsd-latency: The threshold below which NFS daemons will consider
        children (bricks) connected.
      halo-latency: The threshold below which all other clients will
        consider children (bricks) connected.
      halo-min-replicas: The minimum number of replicas which are to be
        enforced regardless of the latency specified in the above 3 options.
        If the number of children falls below this threshold, the next
        best (chosen by latency) shall be swapped in.

    New FUSE mount options:
      halo-latency & halo-min-replicas: As described above.

    This feature combined with multi-threaded SHD support (D1271745) results
    in some pretty cool geo-replication possibilities.

    Operational Notes:
    - Global consistency is guaranteed for synchronous clients; this is
      provided by the existing entry-locking mechanism.
    - Asynchronous clients, on the other hand, are merely consistent to
      their region. Writes & deletes will be protected via entry-locks as
      usual, preventing concurrent writes into files which are undergoing
      replication. Read operations, on the other hand, should never block.
    - Writes are allowed from _any_ region and propagated from the origin to
      all other regions. The takeaway from this is that care should be taken
      to ensure multiple writers do not write the same files, resulting in a
      gfid split-brain which will require resolution via split-brain
      policies (majority, mtime & size). The recommended method for
      preventing this is using the nfs-auth feature to define which region
      for each share has RW permissions; tiers not in the origin region
      should have RO perms.

    TODO:
    - Synchronous clients (including the SHD) should choose clients from
      their own region as preferred sources for reads. Most of the plumbing
      is in place for this via the child_latency array.
    - Better GFID split brain handling & better dent type split brain
      handling (i.e. create a trash can and move the offending files into
      it).
    - Tagging in addition to latency as a means of defining which children
      you wish to synchronously write to

    Test Plan:
    - The usual suspects, clang, gcc w/ address sanitizer & valgrind
    - Prove tests

    Reviewers: jackl, dph, cjh, meyering

    Reviewed By: meyering

    Subscribers: ethanr

    Differential Revision: https://phabricator.fb.com/D1272053

    Tasks: 4117827

    Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1
    BUG: 1428061
    Signed-off-by: Kevin Vigor <kvigor@fb.com>
    Reviewed-on: http://review.gluster.org/16099
    Reviewed-on: https://review.gluster.org/16177
    Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
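The interaction between a latency threshold and halo-min-replicas described above can be sketched as a small selection routine. This is not the AFR implementation: `demo_halo_select` and its parameters are hypothetical, and treating a latency equal to the threshold as "inside the halo" is an assumption.

```c
#include <stddef.h>

/* Decide which children (bricks) count as connected.
 * latency_ms[i] is the measured latency of child i; up[i] is set to
 * 1 or 0. Returns the number of children marked up. */
size_t demo_halo_select(const long *latency_ms, size_t n, long halo_ms,
                        size_t min_replicas, int *up)
{
    size_t i, count = 0;

    /* First pass: everything inside the halo is up. */
    for (i = 0; i < n; i++) {
        up[i] = latency_ms[i] <= halo_ms;
        count += (size_t)up[i];
    }

    /* Below the floor: swap in the next best child (lowest latency
     * among those still down) until min_replicas is reached. */
    while (count < min_replicas && count < n) {
        size_t best = n;

        for (i = 0; i < n; i++) {
            if (!up[i] && (best == n || latency_ms[i] < latency_ms[best]))
                best = i;
        }
        up[best] = 1;
        count++;
    }
    return count;
}
```

A very large threshold marks every child up, which corresponds to the synchronous mode the summary describes.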
* glusterd: Fix removing pmap entry on rpc disconnect | Prashanth Pai | 2017-04-28 | 1 | -1/+1

    Problem:
    The following line of code is intended to remove the pmap entry for the
    connection during disconnects:

        pmap_registry_remove (this, 0, NULL, GF_PMAP_PORT_NONE, xprt);

    However, no pmap entry will have its type set to GF_PMAP_PORT_NONE at
    any point in time. So a call to pmap_registry_search_by_xprt() in
    pmap_registry_remove() will always fail to find a match.

    Fix:
    Optionally ignore the pmap entry's type in
    pmap_registry_search_by_xprt().

    BUG: 1193929
    Change-Id: I705f101739ab1647ff52a92820d478354407264a
    Signed-off-by: Prashanth Pai <ppai@redhat.com>
    Reviewed-on: https://review.gluster.org/17129
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* glusterd : Disallow peer detach if snapshot bricks exist on it | Gaurav Yadav | 2017-03-31 | 1 | -0/+1

    Problem:
    - Deploy gluster on 2 nodes, one brick each, one volume replicated
    - Create a snapshot
    - Lose one server
    - Add a replacement peer and new brick with a new IP address
    - replace-brick the missing brick onto the new server
      (wait for replication to finish)
    - peer detach the old server
    - after doing the above steps, glusterd fails to restart.

    Solution:
    With the fix, peer detach reports an error: "N2 is part of existing
    snapshots. Remove those snapshots before proceeding". This forces the
    user either to keep that peer or to delete all snapshots.

    Change-Id: I3699afb9b2a5f915768b77f885e783bd9b51818c
    BUG: 1322145
    Signed-off-by: Gaurav Yadav <gyadav@redhat.com>
    Reviewed-on: https://review.gluster.org/16907
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* rpc: bump up conn->cleanup_gen in rpc_clnt_reconnect_cleanupAtin Mukherjee2017-03-201-1/+3
| | | | | | | | | | | | | | | | | | | | | | | Commit 086436a introduced a generation number (cleanup_gen) to ensure that the rpc layer doesn't end up cleaning up the connection object if the application layer has already destroyed it. Bumping up cleanup_gen was done only in rpc_clnt_connection_cleanup (). However, the same is needed in rpc_clnt_reconnect_cleanup () too: without it, if the object gets destroyed through the reconnect event in the application layer, the rpc layer will still try to delete the object, resulting in a double free and a crash. Peer probing an invalid host/IP was the basic test to catch this issue. Change-Id: Id5332f3239cb324cead34eb51cf73d426733bd46 BUG: 1433578 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/16914 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Milind Changire <mchangir@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
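The generation-number guard mentioned above follows a general pattern that can be sketched as below. This is an assumption-laden illustration, not the rpc-clnt code: `struct conn`, `conn_cleanup` and `conn_destroy_if_current` are hypothetical names, and the "destroy" is simulated with a counter instead of a real free.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified connection object. */
struct conn {
    uint64_t cleanup_gen; /* bumped by every cleanup path */
    int destroy_calls;    /* counts simulated destroys */
};

/* Every cleanup path must bump the generation; forgetting this in one
 * path is the kind of omission the commit above describes. */
void conn_cleanup(struct conn *c)
{
    c->cleanup_gen++;
}

/* Destroy only if no cleanup has run since seen_gen was sampled. */
bool conn_destroy_if_current(struct conn *c, uint64_t seen_gen)
{
    if (c->cleanup_gen != seen_gen)
        return false; /* someone else already cleaned up */
    c->destroy_calls++;
    return true;
}
```

A caller samples cleanup_gen before handing control to another layer and passes that value in afterwards. If any cleanup ran in between, the stale destroy is skipped rather than repeated.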
* rpc: avoid logging success on failureMilind Changire2017-03-071-0/+5
| | | | | | | | | | | | | | | | Avoid logging Success in the event of failure especially when errno has no meaningful value w.r.t. the failure. In this case the errno is set to zero when there's indeed a failure at the RPC level. Change-Id: If2cc81aa1e590023ed22892dacbef7cac213e591 BUG: 1426032 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: https://review.gluster.org/16730 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
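As a rough illustration of the problem described above: strerror(0) commonly yields text such as "Success" (the exact wording is platform dependent), which is misleading in a failure log. The helper below is a hypothetical sketch of one way to avoid that, not the change made in this commit.

```c
#include <string.h>

/* Hypothetical helper: choose the reason text for a failure log line.
 * When err is 0 there is no meaningful errno, so do not call
 * strerror(0), whose text would suggest that nothing went wrong. */
const char *failure_reason(int err)
{
    return err != 0 ? strerror(err) : "no error code reported (errno is 0)";
}
```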
* rpc/clnt: remove locks while notifying CONNECT/DISCONNECTRaghavendra G2017-03-012-31/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Locking during notify was introduced as part of commit aa22f24f5db7659387704998ae01520708869873 [1]. The fix was introduced to fix out-of-order CONNECT/DISCONNECT events from rpc-clnt to parent xlators [2]. However, as part of handling DISCONNECT, protocol/client unwinds saved frames (with failure) that are waiting for responses. This saved_frames_unwind can be a costly operation and hence ideally shouldn't be included in the critical section of notifylock, as it unnecessarily delays the reconnection to the same brick. Also, it's not a good practice to pass control to other xlators while holding a lock, as it can lead to deadlocks. So, this patch removes locking in rpc-clnt while notifying parent xlators. To fix [2], two changes are present in this patch: * notify DISCONNECT before cleaning up rpc connection (same as commit a6b63e11b7758cf1bfcb6798, patch [3]). * protocol/client uses rpc_clnt_cleanup_and_start, which cleans up rpc connection and does a start while handling a DISCONNECT event from rpc. Note that patch [3] was reverted as rpc_clnt_start called in quick_reconnect path of protocol/client didn't invoke connect on transport as the connection was not cleaned up _yet_ (as cleanup was moved post notification in rpc-clnt). This resulted in clients never attempting connect to bricks. Note that one of the neater ways to fix [2] (without using locks) is to introduce generation numbers to map CONNECT and DISCONNECTS across epochs and ignore DISCONNECT events if they don't belong to the current epoch. However, this approach is a bit complex to implement and requires time. So, the current patch is a hacky stop-gap fix until we come up with a cleaner solution. 
[1] http://review.gluster.org/15916 [2] https://bugzilla.redhat.com/show_bug.cgi?id=1386626 [3] http://review.gluster.org/15681 Change-Id: I62daeee8bb1430004e28558f6eb133efd4ccf418 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1427012 Reviewed-on: https://review.gluster.org/16784 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Milind Changire <mchangir@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* Use int instead of int8_t for the 3 variablesMichael Scherer2017-02-281-1/+1
| | | | | | | | | | | | | | | | | | | | | Since strcmp returns an int, and since the spec of strcmp does not specify the returned value beyond its sign, it could return 256, which would overflow an int8_t. Found by Coverity scan. (thanks to Stéphane Marcheusin who explained the details to me) Change-Id: I5195e05b44f8b537226e6cee178d95a1ab904e96 BUG: 789278 Signed-off-by: Michael Scherer <misc@redhat.com> Reviewed-on: https://review.gluster.org/16738 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Michael Scherer <misc@fedoraproject.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
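The overflow described above can be shown with a small sketch. The function names are hypothetical. Converting an out-of-range int to int8_t is implementation-defined in C, and the wrap-to-zero behaviour noted here is what common compilers such as GCC and Clang do.

```c
#include <stdint.h>

/* Store a comparison result in int8_t, as the old code did. With GCC
 * and Clang the value is reduced modulo 256, so a non-zero result such
 * as 256 becomes 0 and looks like "strings are equal". */
int8_t narrow_result(int cmp)
{
    return (int8_t)cmp;
}

/* Keep the full int, as the fix does: no information is lost. */
int wide_result(int cmp)
{
    return cmp;
}
```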
* rpc: fix obvious typo in cleanup code in rpc_clnt_notifyMateusz Slupny2017-02-191-1/+1
| | | | | | | | | | | | | | | Change-Id: I003e38b238704d3345d46688355bcf3702455ba1 BUG: 1399593 Signed-off-by: Mateusz Slupny <mateusz.slupny@appeartv.com> [ndevos: rebased after I8ff5d1a32 moved the code around] Reviewed-on: https://review.gluster.org/15969 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpcsvc: Add rpchdr and proghdr to iobref before submitting to transportPoornima G2017-02-152-3/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Issue: When fio is run on multiple clients (each client writes to its own files), and meanwhile the clients do a readdirp, the client which did a readdirp will now receive the upcalls. In this scenario the client disconnects with rpc decode failed error. RCA: Upcall calls rpcsvc_request_submit to submit the request to socket: rpcsvc_request_submit currently: rpcsvc_request_submit () { iobuf = iobuf_new iov = iobuf->ptr fill iobuf to contain xdrised upcall content - proghdr rpcsvc_callback_submit (..iov..) ... if (iobuf) iobuf_unref (iobuf) } rpcsvc_callback_submit (... iov...) { ... iobuf = iobuf_new iov1 = iobuf->ptr fill iobuf to contain xdrised rpc header - rpchdr msg.rpchdr = iov1 msg.proghdr = iov ... rpc_transport_submit_request (msg) ... if (iobuf) iobuf_unref (iobuf) } rpcsvc_callback_submit assumes that once rpc_transport_submit_request() returns the msg is written on to socket and thus the buffers(rpchdr, proghdr) can be freed, which is not the case. In especially high workload, rpc_transport_submit_request() may not be able to write to socket immediately and hence adds it to its own queue and returns as successful. Thus, we have use after free, for rpchdr and proghdr. Hence the clients get garbage rpchdr and proghdr and thus fail to decode the rpc, resulting in disconnect. To prevent this, we need to add the rpchdr and proghdr to an iobref and send it in msg: iobref_add (iobref, iobufs) msg.iobref = iobref; The socket layer takes a ref on msg.iobref, if it cannot write to socket and is adding to the queue. Thus we do not have use after free. 
Thank You for discussing, debugging and fixing along: Prashanth Pai <ppai@redhat.com> Raghavendra G <rgowdapp@redhat.com> Rajesh Joseph <rjoseph@redhat.com> Kotresh HR <khiremat@redhat.com> Mohammed Rafi KC <rkavunga@redhat.com> Soumya Koduri <skoduri@redhat.com> Change-Id: Ifa6bf6f4879141f42b46830a37c1574b21b37275 BUG: 1421937 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/16613 Reviewed-by: Prashanth Pai <ppai@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
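The ownership rule in the commit above (a transport that queues a message must hold its own reference to the buffers) can be sketched with a toy reference-counted buffer. Everything here is hypothetical: `struct buf`, `struct queue` and the functions are illustrative stand-ins, not the iobuf/iobref or rpc_transport API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted buffer. */
struct buf {
    int refs;
    char data[64];
};

struct buf *buf_new(void)
{
    struct buf *b = calloc(1, sizeof(*b));
    if (b)
        b->refs = 1; /* the creator holds the first reference */
    return b;
}

struct buf *buf_ref(struct buf *b)
{
    assert(b != NULL);
    b->refs++;
    return b;
}

/* Drop one reference; free on the last one. Returns 1 if freed. */
int buf_unref(struct buf *b)
{
    assert(b != NULL && b->refs > 0);
    if (--b->refs > 0)
        return 0;
    free(b);
    return 1;
}

/* A transport that cannot write immediately queues the buffer and
 * takes its own reference, so the caller's unref cannot free it. */
struct queue {
    struct buf *pending;
};

void transport_submit(struct queue *q, struct buf *b)
{
    q->pending = buf_ref(b);
}

/* Simulate the deferred write completing: release the queue's ref. */
int transport_flush(struct queue *q)
{
    struct buf *b = q->pending;
    q->pending = NULL;
    return b ? buf_unref(b) : 0;
}
```

In this sketch the submitter can drop its reference right after transport_submit() returns. The memory stays valid until the queue releases its reference in transport_flush().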
* core: run many bricks within one glusterfsd processJeff Darcy2017-01-302-1/+1
| | | | | | | | | | | | | | | | | | | | | | | This patch adds support for multiple brick translator stacks running in a single brick server process. This reduces our per-brick memory usage by approximately 3x, and our appetite for TCP ports even more. It also creates potential to avoid process/thread thrashing, and to improve QoS by scheduling more carefully across the bricks, but realizing that potential will require further work. Multiplexing is controlled by the "cluster.brick-multiplex" global option. By default it's off, and bricks are started in separate processes as before. If multiplexing is enabled, then *compatible* bricks (mostly those with the same transport options) will be started in the same process. Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb BUG: 1385758 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/14763 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: add a cli command to trigger a statedump on a clientPoornima G2017-01-232-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this, we will be able to trigger statedumps on remote Gluster clients, mainly targeted at applications using libgfapi. Design: SIGUSR signal is the most common way of taking a statedump in Gluster. But it cannot be used for libgfapi based processes, as the process loading the library might have already consumed SIGUSR signal. Hence the command-based approach. One has to issue a Gluster command to initiate a statedump on the libgfapi based client. The command takes hostname and PID as an argument. All the glusterds in the cluster check if they are connected to the specified hostname, and send an RPC request to all the connected clients from that hostname (via the mgmt connection). URL: http://review.gluster.org/16357 Change-Id: Icbe4d2f026b32a2c7d5535e1bfb2cdaaff042e91 BUG: 1169302 Signed-off-by: Poornima G <pgurusid@redhat.com> [ndevos: minor fixes and split patch in smaller pieces] Reviewed-on: https://review.gluster.org/9228 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com>
* tier : Tier as a servicehari gowtham2017-01-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tierd is implemented by separating from rebalance process. The commands affected: 1) Attach tier will trigger this process instead of old one 2) tier start and tier start force will also trigger this process. 3) volume status [tier] will show tier daemon as a process instead of task and normal tier status and tier detach status works. 4) tier stop implemented. 5) detach tier implemented separately along with new detach tier status 6) volume tier volname status will work using the changes. 7) volume set works This patch has separated the tier translator from the legacy DHT rebalance code. It now sends the RPCs from the CLI to glusterd separate to the DHT rebalance code. The daemon is now a service, similar to the snapshot daemon, and can be viewed using the volume status command. The code for the validation and commit phase are the same as the earlier tier validation code in DHT rebalance. The “brickop” phase has been changed so that the status command can use this framework. The service management framework is now used. DHT rebalance does not use this framework. This service framework takes care of : *) spawning the daemon, killing it and other such processes. *) volume set options , which are written on the volfile. *) restart and reconfigure functions. Restart is to restart the daemon at two points 1)after gluster goes down and comes up. 2) to stop detach tier. *) reconfigure is used to make immediate volfile changes. By doing this, we don’t restart the daemon. it has the code to rewrite the volfile for topological changes too (which comes into place during add and remove brick). With this patch the log, pid, and volfile are separated and put into respective directories. 
Change-Id: I3681d0d66894714b55aa02ca2a30ac000362a399 BUG: 1313838 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/13365 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* socket: socket disconnect should wait for poller thread exitRajesh Joseph2016-12-215-8/+9
| | | | | | | | | | | | | | | | | | | | | When SSL is enabled or if "transport.socket.own-thread" option is set then socket_poller is run as different thread. Currently during disconnect or PARENT_DOWN scenario we don't wait for this thread to terminate. PARENT_DOWN will disconnect the socket layer and cleanup resources used by socket_poller. Therefore before disconnect we should wait for poller thread to exit. Change-Id: I71f984b47d260ffd979102f180a99a0bed29f0d6 BUG: 1404181 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/16141 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaushal M <kaushal@redhat.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* rpc: fix for race between rpc and protocol/clientRajesh Joseph2016-12-052-40/+59
| | | | | | | | | | | | | | | | | | | | | | | | It is possible that the notification thread which notifies protocol/client layer about the disconnection is put to sleep and meanwhile, a fuse thread or a timer thread initiates and completes reconnection to the brick. The notification thread is then woken up and protocol/client layer updates its flags to indicate that network is disconnected. No reconnection is initiated because reconnection is rpc-lib layer's responsibility and its flags indicate that connection is connected. Fix: Serialize connect and disconnect notify Credit: Raghavendra Talur <rtalur@redhat.com> Change-Id: I8ff5d1a3283b47f5c26848a42016a40bc34ffc1d BUG: 1386626 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/15916 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: CLI for granular entry heal enablement/disablementKrutika Dhananjay2016-11-281-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When there are already existing non-granular indices created that are yet to be healed, if granular-entry-heal option is toggled from 'off' to 'on', AFR self-heal, whenever it kicks in, will try to look for granular indices in 'entry-changes'. Because of the absence of name indices, granular entry healing logic will fail to heal these directories, and worse yet unset pending extended attributes with the assumption that there are no entries that need heal. To get around this, a new CLI is introduced which will invoke glfsheal program to figure out whether, at the time an attempt is made to enable granular entry heal, there are pending heals on the volume OR there are one or more bricks that are down. If either of them is true, the command will fail with the appropriate error. New CLI: gluster volume heal <VOL> granular-entry-heal {enable,disable} Change-Id: I1f4fe8162813b9068e198965d94169fee4adc099 BUG: 1370410 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15747 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* Revert "rpc: Fix the race between notification and reconnection"Pranith Kumar Karampuri2016-11-161-4/+3
| | | | | | | | | | | | | | | | This reverts commit a6b63e11b7758cf1bfcb67985e25ec02845f0995. Nithya and Rajesh found that the mount fails sometimes after this patch was merged so reverting it. BUG: 1386626 Change-Id: I959a5b6c7da61368cf4c67c98193c6e8fdd1755d Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15838 Reviewed-by: N Balachandran <nbalacha@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* rpc: Fix the race between notification and reconnectionPranith Kumar K2016-10-241-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: There was a hang because unlock on an entry failed with ENOTCONN. Client thinks the connection is down whereas server thinks the connection is up. This is the race we are seeing: 1) Connection from client to the brick disconnects. 2) Saved frames unwind is called which unwinds all frames that were wound before disconnect. 3) connection from client to the brick happens and setvolume. 4) Disconnect notification for the connection in 1) comes now and calls client_rpc_notify() which marks the connection to be offline even when the connection is up. This is happening because I/O can retrigger connection before disconnect notification is sent to the higher layers in rpc. Fix: Notify the higher layers that a disconnect happened and then go ahead with reconnect logic. For the logs which point to the information above check: https://bugzilla.redhat.com/show_bug.cgi?id=1386626#c1 Thanks to Raghavendra G for suggesting the correct fix. BUG: 1386626 Change-Id: I3c84ba1f17010bd69049fa88ec5f0ae431f8cda9 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15681 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* compound fops: Fix file corruption issueKrutika Dhananjay2016-10-244-15/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Address of a local variable @args is copied into state->req in server3_3_compound (). But even after the function has gone out of scope, in server_compound_resume () this pointer is accessed and dereferenced. This patch fixes that. 2. Compound fops, by virtue of NOT having a vector sizer (like the one writev has), end up having both the header and the data (in case one of its member fops is WRITEV) in the same hdr_iobuf. This buffer was not being preserved through the lifetime of the compound fop, causing it to be overwritten by a parallel write fop, even when the writev associated with the currently executing compound fop is yet to hit the disk, thereby corrupting the file's data. This is fixed by associating the hdr_iobuf with the iobref so its memory remains valid through the lifetime of the fop. 3. Also fixed a use-after-free bug in protocol/client in compound fops cbk, missed by Linux but caught by NetBSD. Finally, big thanks to Pranith Kumar K and Raghavendra Gowdappa for their help in debugging this file corruption issue. Change-Id: I6d5c04f400ecb687c9403a17a12683a96c2bf122 BUG: 1378778 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15654 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>