summaryrefslogtreecommitdiffstats
path: root/rpc/rpc-lib/src
Commit message (Collapse)AuthorAgeFilesLines
* rpc: increase RPC/XID with each callbackNiels de Vos2016-10-162-3/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The RPC/XID for callbacks has been hardcoded to GF_UNIVERSAL_ANSWER. In Wireshark these RPC-calls are marked as "RPC retransmissions" because of the repeating RPC/XID. This is most confusing when verifying the callbacks that the upcall framework sends. There is no way to see the difference between real retransmissions and new callbacks. This change was verified by create and removal of files through different Gluster clients. The RPC/XID is increased on a per connection (or client) base. The expectations of the RPC protocol are met this way. > Change-Id: I2116bec0e294df4046d168d8bcbba011284cd0b2 > BUG: 1377097 > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: http://review.gluster.org/15524 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > (cherry picked from commit e9b39527d5dcfba95c4c52a522c8ce1f4512ac21) Change-Id: I2116bec0e294df4046d168d8bcbba011284cd0b2 BUG: 1377291 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/15529 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* rpc : build_prog_details should iterate program list inside critical sectionAtin Mukherjee2016-08-151-13/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/13428 While I was analyzing a glusterd crash from free_prog_details, a code walkthrough detects that we iterate over the rpc svc program list without been inside the criticial section. This opens up a possibility of a crash when there is a concurrent writer updating the same list. Solution is to read the list inside lock. > Reviewed-on: http://review.gluster.org/13428 > Smoke: Gluster Build System <jenkins@build.gluster.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Niels de Vos <ndevos@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1310970 Change-Id: Ib4b4b0022a9535e139cd3c00574aab23f07aa9d2 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-on: http://review.gluster.org/13488 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaushal M <kaushal@redhat.com>
* changelog/rpc: Fix rpc_clnt_t mem leaksKotresh HR2016-07-272-6/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/13658 PROBLEM: 1. Freeing up rpc_clnt object might lead to crashes. Well, it was not a necessity to free rpc-clnt object till now because all the existing use cases needs to reconnect back on disconnects. Hence timer code was not taking ref on rpc-clnt object. Glusterd had some use-cases that led to crash due to ping-timer and they fixed only those code paths that involve ping-timer. Now, since changelog has an use-case where rpc-clnt need to be freed up, we need to fix timer code to take refs 2. In changelog, because of issue 1, only mydata was being freed which is incorrect. And there are races where rpc-clnt object would access the freed mydata which would lead to crashes. Since changelog xlator resides on brick side and is long living process, if multiple libgfchangelog consumers register to changelog and disconnect/reconnect mulitple times, it would result in leak of 'rpc-clnt' object for every connect/disconnect. SOLUTION: 1. Handle ref/unref of 'rpc_clnt' structure in timer functions properly. 2. In changelog, unref 'rpc_clnt' in RPC_CLNT_DISCONNECT after disabling timers and free mydata on RPC_CLNT_DESTROY. RPC SETUP IN CHANGELOG: 1. changelog xlator initiates rpc server say 'changelog_rpc_server' 2. libgfchangelog initiates one rpc server say 'libgfchangelog_rpc_server' 3. libgfchangelog initiates rpc client and connects to 'changelog_rpc_server' 4. In return changelog_rpc_server initiates a rpc client and connects back to 'libgfchangelog_rpc_server' REF/UNREF HANDLING IN TIMER FUNCTIONS: Let's say rpc clnt refcount = 1 1. Take the ref before reigstering callback to timer queue >>>> rpc_clnt_ref (say ref count becomes = 2) 2. Register a callback to timer say 'callback1' 3. If register fails: >>>> rpc_clnt_unref (ref count = 1) 4. On timer expiration, 'callback1' gets called. So unref rpc clnt at the end in 'callback1'. This is corresponding to ref taken in step 1 >>>> rpc_clnt_unref (ref count = 1) 5. The cycle from step-1 to step-4 continues....until timer cancel event happens 6. timer cancel of say 'callback1' If timer cancel fails: Do nothing, Step-4 would have unrefd If timer cancel succeeds: >>>> rpc_clnt_unref (ref count = 1) Change-Id: I91389bc511b8b1a17824941970ee8d2c29a74a09 BUG: 1359363 Signed-off-by: Kotresh HR <khiremat@redhat.com> (cherry picked from commit 637ce9e2e27e9f598a4a6c5a04cd339efaa62076) Reviewed-on: http://review.gluster.org/14993 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* rpc: check the right variable after gf_strdupZhou Zhengping2016-05-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | > Reviewed-on: http://review.gluster.org/13934 > Smoke: Gluster Build System <jenkins@build.gluster.com> > Tested-by: Jeff Darcy <jdarcy@redhat.com> > Reviewed-by: Anoop C S <anoopcs@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Vijay Bellur <vbellur@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> (cherry picked from commit 1c9c776352c60deeda51be66fda6d44bf06d3796) Change-Id: If4628bd37f2c85a070f6c3b3e0583d939100d7ec BUG: 1332404 Signed-off-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Reviewed-on: http://review.gluster.org/14494 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: Anoop C S <anoopcs@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd: add defence mechanism to avoid brick port clashesPrasanna Kumar Kalever2016-05-041-40/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intro: Currently glusterd maintain the portmap registry which contains ports that are free to use between 49152 - 65535, this registry is initialized once, and updated accordingly as an then when glusterd sees they are been used. Glusterd first checks for a port within the portmap registry and gets a FREE port marked in it, then checks if that port is currently free using a connect() function then passes it to brick process which have to bind on it. Problem: We see that there is a time gap between glusterd checking the port with connect() and brick process actually binding on it. In this time gap it could be so possible that any process would have occupied this port because of which brick will fail to bind and exit. Case 1: To avoid the gluster client process occupying the port supplied by glusterd : we have separated the client port map range with brick port map range more @ http://review.gluster.org/#/c/13998/ Case 2: (Handled by this patch) To avoid the other foreign process occupying the port supplied by glusterd : To handle above situation this patch implements a mechanism to return EADDRINUSE error code to glusterd, upon which a new port is allocated and try to restart the brick process with the newly allocated port. Note: Incase of glusterd restarts i.e. runner_run_nowait() there is no way to handle Case 2, becuase runner_run_nowait() will not wait to get the return/exit code of the executed command (brick process). Hence as of now in such case, we cannot know with what error the brick has failed to connect. This patch also fix the runner_end() to perform some cleanup w.r.t return values. Backport of: > Change-Id: Iec52e7f5d87ce938d173f8ef16aa77fd573f2c5e > BUG: 1322805 > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> > Reviewed-on: http://review.gluster.org/14043 > Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Change-Id: Ief247b4d4538c1ca03e73aa31beb5fa99853afd6 BUG: 1323564 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/14208 Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd: try to connect on GF_PMAP_PORT_FOREIGN aswellPrasanna Kumar Kalever2016-05-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fix couple of things mentioned below: 1. previously we use to try to connect on only GF_PMAP_PORT_FREE in the pmap_registry_alloc(), it could happen that some foreign process would have freed the port by this time ?, hence it is worth giving a try on GF_PMAP_PORT_FOREIGN ports as well instead of wasting them all. 2. fix pmap_registry_remove() to mark the port asGF_PMAP_PORT_FREE 3. added useful comments on gf_pmap_port_type enum members Backport of: > Change-Id: Id2aa7ad55e76ae3fdece21bed15792525ae33fe1 > BUG: 1322805 > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> > Reviewed-on: http://review.gluster.org/14080 > Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Change-Id: Ib7852aa1c55611e81c78341aace4d374d516f439 BUG: 1323564 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/14117 Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* Revert "glusterd: Bug fixes for IPv6 support"Kaushal M2016-04-161-0/+6
| | | | | | | | This reverts commit b33f3c95ec9c8112e6677e09cea05c4c462040d0. This commit exposes some issues with management encryption that prevents GlusterFS from operating properly. This will be added again once problems with management encryption are fixed.
* afr: add mtime based split-brain resolution to CLIRavishankar N2016-04-041-0/+1
| | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/13828/ Extended the CLI to include support for split-brain resolution based on mtime. The command syntax is: $:gluster volume heal <VOLNAME> split-brain latest-mtime <FILE> where <FILE> can be either the full file name as seen from the root of the volume (or) the gfid-string representation of the file. Change-Id: I7a16f72ff1a4495aa69f43f22758a9404e958b4f BUG: 1321748 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/13838 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Anuradha Talur <atalur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* rpc: Connect back only if rpc is not disabledKotresh HR2016-03-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of the below fix - http://review.gluster.org/13592 As discussed over http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/14284, patch #13456 caused a regression where in if there are any pending rpc invocations, we end up accessing freed object. This patch fixes it by allowing reconnect during rpc submit only if rpc is not disabled. This is to fix regression caused by below patch - http://review.gluster.org/#/c/13507/ Change-Id: I4ef4dd52bd42368bb89129f98bc973e46c6a39f4 BUG: 1311441 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/13592 Reviewed-on: http://review.gluster.org/13824 Tested-by: soumya k <skoduri@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* rpc: Fix for rpc_transport_t leakSoumya Koduri2016-03-241-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | The transport object needs to get unref'ed when the rpc clnt object is getting destroyed. But currently in rpc_clnt_disable() we set conn->trans to NULL before it gets unref'ed leading to transport object leak. This change is to fix it by setting conn-tran to NULL only when it is being unref'ed. This is backport of the below patch - http://review.gluster.org/13456 Change-Id: I79ba34e28ae19eb616035f36bbed1c2f47875b94 BUG: 1311441 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/13456 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/13507 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* glusterd/rpc : Discard duplicate Disconnect eventsAtin Mukherjee2016-03-232-0/+24
| | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/13790/ If a peer rpc disconnect event has been already processed, skip the furthers as processing them are overheads and sometimes may lead to a crash like due to a double free Change-Id: Iec589ce85daf28fd5b267cb6fc82a4238e0e8adc BUG: 1320374 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/13790 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/13813
* glusterd: Bug fixes for IPv6 supportNithin D2016-03-211-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/11988/ Problem: Glusterd not working using ipv6 transport. The idea is with proper glusterd.vol configuration, 1. glusterd needs to listen on default port (240007) as IPv6 TCP listner. 2. Volume creation/deletion/mounting/add-bricks/delete-bricks/peer-probe needs to work using ipv6 addresses. 3. Bricks needs to listen on ipv6 addresses. All the above functionality is needed to say that glusterd supports ipv6 transport and this is broken. Fix: When "option transport.address-family inet6" option is present in glusterd.vol file, it is made sure that glusterd creates listeners using ipv6 sockets only and also the same information is saved inside brick volume files used by glusterfsd brick process when they are starting. Tests Run: Regression tests using ./run-tests.sh IPv4: Regression tests using ./run-tests.sh for release-3.7 branch verified by comparing with clean repo. IPv6: (Need to add the above mentioned config and also add an entry for "hostname ::1" in /etc/hosts) Started failing at ./tests/basic/glusterd/arbiter-volume-probe.t and ran successfully till here Change-Id: Idd7513aa2347ce0de2b1f68daeecce1b7a39a7af BUG: 1310445 Signed-off-by: Nithin D <nithind1988@yahoo.in> Reviewed-on: http://review.gluster.org/13787 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* all: reduce "inline" usageKaleb S KEITHLEY2016-01-183-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | There are three kinds of inline functions: plain inline, extern inline, and static inline. All three have been removed from .c files, except those in "contrib" which aren't our problem. Inlines in .h files, which are overwhelmingly "static inline" already, have generally been left alone. Over time we should be able to "lower" these into .c files, but that has to be done in a case-by-case fashion requiring more manual effort. This part was easy to do automatically without (as far as I can tell) any ill effect. In the process, several pieces of dead code were flagged by the compiler, and were removed. backport of Change-Id: I56a5e614735c9e0a6ee420dab949eac22e25c155, http://review.gluster.org/11769, BUG: 1245331 Change-Id: Iba1efb0bc578ea4a5e9bf76b7bd93dc1be9eba44 BUG: 1283302 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/12646 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* glusterd: cli command implementation for bitrot scrub statusGaurav Kumar Garg2015-11-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is backport of: http://review.gluster.org/10231 CLI command for bitrot scrub status will be : gluster volume bitrot <volname> scrub status Above command will show the statistics of bitrot scrubber. Upon execution of this command it will show some common scrubber tunable value of volume <VOLNAME> followed by statistics of scrubber statistics of individual nodes. sample ouput for single node: Volume name : <VOLNAME> State of scrub: Active Scrub frequency: biweekly Bitrot error log location: /var/log/glusterfs/bitd.log Scrubber error log location: /var/log/glusterfs/scrub.log ========================================================= Node name: Number of Scrubbed files: Number of Unsigned files: Last completed scrub time: Duration of last scrub: Error count: ========================================================= This is just infrastructure. list of bad file, last scrub time, error count value will be taken care by http://review.gluster.org/#/c/12503/ and http://review.gluster.org/#/c/12654/ patches. >> Change-Id: I3ed3c7057c9d0c894233f4079a7f185d90c202d1 >> BUG: 1207627 >> Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> >> Reviewed-on: http://review.gluster.org/10231 >> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> >> Tested-by: NetBSD Build System <jenkins@build.gluster.org> >> Tested-by: Gluster Build System <jenkins@build.gluster.com> Change-Id: I45ed94e5e0e78a1e007c30eb0b252f74cf3c9187 BUG: 1283881 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/12704 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* rpc: fix possible deadlock left behind in d448fd1Krishnan Parthasarathi2015-11-023-23/+21
| | | | | | | | | | | | | | See http://review.gluster.org/9613 for more details. BUG: 1233019 Change-Id: I05ac0267b8c6f4e9b354acbbdf5469835455fb10 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/10821 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit e902df70f8157db4db503b7ec3c2635b08b3dcb2) Reviewed-on: http://review.gluster.org/11303 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* server/protocol: option for dynamic authorization of client permissionsPrasanna Kumar Kalever2015-10-132-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | problem: assuming gluster volume is already mounted (for gfapi: say client transport connection has already established), now if somebody change the volume permissions say *.allow | *.reject for a client, gluster should allow/terminate the client connection based on the fresh set of volume options immediately, but in existing scenario neither we have any option to set this behaviour nor we take any action until and unless we remount the volume manually solution: Introduce 'dynamic-auth' option (default: on). If 'dynamic-auth' is 'on' gluster will perform dynamic authentication to allow/terminate client transport connection immediately in response to *.allow | *.reject volume set options, thus if volume permissions have changed for a particular client (say client is added to auth.reject list), his transport connection to gluster volume will be terminated immediately. Backport of: > Change-Id: I6243a6db41bf1e0babbf050a8e4f8620732e00d8 > BUG: 1245380 > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> > Reviewed-on: http://review.gluster.org/12229 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > (cherry picked from commit 84e90b756566bc211535a8627ed16d4231110ade) Change-Id: If7e5c9be912412ea388391ef406ee2c8bedb26b8 BUG: 1271065 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/12343 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier: add gluster v tier <vol>Dan Lambright2015-09-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 11984. > Currently the tier feature piggy backs off the rebalance command > syntax to obtain status and this is clumsy. Introduce a new > tier command that can do tier specific operations, starting > with volume status to display counters. > Old commands: > gluster volume attach-tier <vol> [replica count] {bricklist..} > gluster volume detach-tier <vol> {start|stop|commit} > New commands: > gluster volume tier <vol> attach [replica count] {bricklist} | > detach {start|stop|commit} | > status > Change-Id: Ic07b3c6260588162de7d34380f8cbd3d8a7f35d3 > BUG: 1255693 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/11984 > Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: Id45bd0fa6b8606dd47863de83a694908da393229 BUG: 1261664 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12143 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com>
* rpc-clnt: Check for transport object during rpc connection cleanupSoumya Koduri2015-08-191-3/+7
| | | | | | | | | | | | | | | | | | | | | While doing glfs_fini(), all the xlators are first notified of PARENT_DOWN. protocol-client xlator on receving that notification does rpc_clnt_disable which disassociates rpc->conn with its transport object and does socket shutdown. So any further references to conn->trans should not happen during rpc connection cleanup which is done mainly as part of epoll event handling of EPOLLERR/EPOLLHUP. This is a backport of the below fix- http://review.gluster.org/#/c/11845/ BUG: 1254607 Change-Id: I619ec00fd061f77c9b04dfa6fd139620cb44189b Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/11845 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/11953 Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* rpc: add owner xlator argument to rpc_clnt_newKrishnan Parthasarathi2015-08-142-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The @owner argument tells RPC layer the xlator that owns the connection and to which xlator THIS needs be set during network notifications like CONNECT and DISCONNECT. Code paths that originate from the head of a (volume) graph and use STACK_WIND ensure that the RPC local endpoint has the right xlator saved in the frame of the call (callback pair). This guarantees that the callback is executed in the right xlator context. The client handshake process which includes fetching of brick ports from glusterd, setting lk-version on the brick for the session, don't have the correct xlator set in their frames. The problem lies with RPC notifications. It doesn't have the provision to set THIS with the xlator that is registered with the corresponding RPC programs. e.g, RPC_CLNT_CONNECT event received by protocol/client doesn't have THIS set to its xlator. This implies, call(-callbacks) originating from this thread don't have the right xlator set too. The fix would be to save the xlator registered with the RPC connection during rpc_clnt_new. e.g, protocol/client's xlator would be saved with the RPC connection that it 'owns'. RPC notifications such as CONNECT, DISCONNECT, etc inherit THIS from the RPC connection's xlator. Change-Id: I9dea2c35378c511d800ef58f7fa2ea5552f2c409 BUG: 1253212 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/11436 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit f7668938cd7745d024f3d2884e04cd744d0a69ab) Reviewed-on: http://review.gluster.org/11908 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* rpc: check for unprivileged port should start at 1024Milind Changire2015-08-131-1/+1
| | | | | | | | | | | | | | | | The current check for unprivileged port starts beyond 1024 i.e. port > 1024 The actual check should start at 1024 i.e. port >= 1024 Change-Id: I78aff3025891e3e78ca6a9a670c89571752157df BUG: 1248450 Reviewed-on: http://review.gluster.org/#/c/11788/ Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/11804 Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* client, rpc: make ping-timeout configurable for glusterfs clientsKrishnan Parthasarathi2015-08-122-0/+16
| | | | | | | | | | | | Change-Id: Idd94adb0457aaffce7330f56f98cebafa2c4dae8 BUG: 1250810 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/11818 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> (cherry picked from commit 3403370ebeaf16567b79022c6ac48b2e0cd50db5) Reviewed-on: http://review.gluster.org/11848
* rdma : porting missing gf_log to gf_msgManikandan Selvaganesh2015-08-121-1/+3
| | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/11607/ Cherry picked from commit 6beb9ca292a0653d3d082af9d30f519a99569a14 > Change-Id: I036b43007fbcd0e528faab8d44e1a7fc820eaf1f > BUG: 1242333 > Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> > Reviewed-on: http://review.gluster.org/11607 > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Tested-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I036b43007fbcd0e528faab8d44e1a7fc820eaf1f BUG: 1252272 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/11878 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* rpc: fix binding brick issue while bind-insecure is enabledPrasanna Kumar Kalever2015-07-263-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is backport of http://review.gluster.org/#/c/11512/ > problem: > When bind-insecure is turned on (which is the default now), it may happen > that brick is not able to bind to port assigned by Glusterd for example > 49192-49195... > > It seems to occur because the rpc_clnt connections are binding to ports in > the same range. so brick fails to bind to a port which is already used by > someone else > > solution: > > fix for now is to make rpc_clnt to get port numbers from 65535 in a > descending > order, as a result port clash is minimized > > other fixes: > > previously rdma binds to port >= 1024 if it cannot find a free port < 1024, > even when bind insecure was turned off(ref to commit '0e3fd04e'), this patch > add's a check for bind-insecure in gf_rdma_client_bind function > > This patch also re-enable bind-insecure and allow insecure by default > which was reverted (ref: commit cef1720) previously > Change-Id: Ia1cfa93c5454e2ae0ff57813689b75de282ebd07 > BUG: 1238661 > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Change-Id: Iea55f9b2a57b5e24d3df2c5fafae12fe99e9dee0 BUG: 1246481 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/11758 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* Revert "rpc: By default set allow-insecure, bind-insecure to on"Raghavendra G2015-07-023-18/+4
| | | | | | | | | | | | | This reverts commit 243a5b429f225acb8e7132264fe0a0835ff013d5. This patch introduced a regression where client no longer binds to privileged port. This is causing lots of regressions. Hence reverting this patch for now and will be resent after suitable modifications. Change-Id: I302252fd3832b0a5a03b04e30cfa0def37597404 Reviewed-on: http://review.gluster.org/11508 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* rpc: By default set allow-insecure, bind-insecure to onPrasanna Kumar Kalever2015-06-303-4/+18
| | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/11039 since we now use SSL (Secure Sockets Layer) for the security issues, the patch changes the default setting to allow connections/requests from non-privilaged ports by setting allow-insecure and bind-insecure to 1 Also added bind functionality for insecure binding which can select from available local ports dynamically BUG: 1232660 Change-Id: I927e112223f33611452093e38cd846a0b9347e57 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/11274 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* rdma: porting rdma to new message id logging formatHumble Devassy Chirammal2015-06-252-1/+85
| | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/9868/ Cherry picked from 4306245aef7cdcbfa6d7a59dccd031d4ada54105 > Change-Id: I71e940817ae0a9378e82332d5a8569114fc13482 > BUG: 1194640 > Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com> > Reviewed-on: http://review.gluster.org/9868 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Tested-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I71e940817ae0a9378e82332d5a8569114fc13482 BUG: 1217722 Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com> Reviewed-on: http://review.gluster.org/10673 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* rpc: call transport_unref only on non-NULL transportKrishnan Parthasarathi2015-06-091-2/+8
| | | | | | | | | | | BUG: 1228601 Change-Id: Ifac4dd8c633081483e4eba9d7e5a89837b2a453a Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/11041 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/11102 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* glusterd: fix repeated connection to nfssvc failed msgsKrishnan Parthasarathi2015-06-011-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | ... and disable reconnect timer on rpc_clnt_disconnect. Root Cause ---------- gluster-NFS service wouldn't be started if there are no started volumes that have nfs service enabled for them. Before this fix we would initiate a connect even when the gluster-NFS service wasn't (re)started. Compounding that glusterd_conn_disconnect doesn't disable reconnect timer. So, it is possible that the reconnect timer was in execution when the timer event was attempted to be removed. Change-Id: Iadcb5cff9eafefa95eaf3a1a9413eeb682d3aaac BUG: 1222065 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/10830 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/10963 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org>
* Upcall: Process each of the upcall events separatelySoumya Koduri2015-05-071-3/+1
| | | | | | | | | | | | | | | | | | As suggested during the code-review of Bug1200262, have modified GF_CBK_UPCALL to be exlusively GF_CBK_CACHE_INVALIDATION. Thus, for any new upcall event, a new CBK procedure will be added. Also made changes to store upcall data separately based on the upcall event type received. BUG: 1217711 Change-Id: I0f5e53d6f5ece16aecb514a0a426dca40fa1c755 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/10049 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/10562 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com>
* geo-rep: Status EnhancementsAravinda VK2015-05-061-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Discussion in gluster-devel http://www.gluster.org/pipermail/gluster-devel/2015-April/044301.html MASTER NODE - Master Volume Node MASTER VOL - Master Volume name MASTER BRICK - Master Volume Brick SLAVE USER - Slave User to which Geo-rep session is established SLAVE - <SLAVE_NODE>::<SLAVE_VOL> used in Geo-rep Create command SLAVE NODE - Slave Node to which Master worker is connected STATUS - Worker Status(Created, Initializing, Active, Passive, Faulty, Paused, Stopped) CRAWL STATUS - Crawl type(Hybrid Crawl, History Crawl, Changelog Crawl) LAST_SYNCED - Last Synced Time(Local Time in CLI output and UTC in XML output) ENTRY - Number of entry Operations pending.(Resets on worker restart) DATA - Number of Data operations pending(Resets on worker restart) META - Number of Meta operations pending(Resets on worker restart) FAILURES - Number of Failures CHECKPOINT TIME - Checkpoint set Time(Local Time in CLI output and UTC in XML output) CHECKPOINT COMPLETED - Yes/No or N/A CHECKPOINT COMPLETION TIME - Checkpoint Completed Time(Local Time in CLI output and UTC in XML output) XML output: <?xml version="1.0" encoding="UTF-8" standalone="yes"?> cliOutput> geoRep> volume> name> sessions> session> session_slave> pair> master_node> master_brick> slave_user> slave/> slave_node> status> crawl_status> entry> data> meta> failures> checkpoint_completed> master_node_uuid> last_synced> checkpoint_time> checkpoint_completion_time> BUG: 1218586 Change-Id: I944a6c3c67f1e6d6baf9670b474233bec8f61ea3 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/10121 Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/10574 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* rpc: Maintain separate xlator pointer in 'rpcsvc_state'Kotresh HR2015-05-052-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The structure 'rpcsvc_state', which maintains rpc server state had no separate pointer to track the translator. It was using the mydata pointer itself. So callers were forced to send xlator pointer as mydata which is opaque (void pointer) by function prototype. 'rpcsvc_register_init' is setting svc->mydata with xlator pointer. 'rpcsvc_register_notify' is overwriting svc->mydata with mydata pointer. And rpc interprets svc->mydata as xlator pointer internally. If someone passes non xlator structure pointer to rpcsvc_register_notify as libgfchangelog currently does, it might corrupt mydata. So interpreting opaque mydata as xlator pointer is incorrect as it is caller's choice to send mydata as any type of data to 'rpcsvc_register_notify'. Maintaining two different pointers in 'rpcsvc_state' for xlator and mydata solves the issue. BUG: 1218381 Change-Id: I4c28937a30845e3f41b6fc7a09036149c816659b Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/10366 Reviewed-on: http://review.gluster.org/10534 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpc: Perform throttling conditionallyVijay Bellur2015-05-031-6/+19
| | | | | | | | | | | | | | | | This change makes rpc's throttling to be performed only if attribute throttle is set in rpcsvc_t. Change-Id: I24620095570e206f5dc8fc6208fcf55cb22a1658 BUG: 1216310 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/10268 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/10443 Tested-by: NetBSD Build System
* rpc: Introduce attribute throttle for rpcsvc_tVijay Bellur2015-05-033-0/+58
| | | | | | | | | | | | | | | | | | | | | | | | This attribute will be used to set/unset throttling for a rpcsvc_t program subsequently. Following APIs have been added to get/set throttle. int rpcsvc_set_throttle (rpcsvc_t svc, gf_boolean_t value); gf_boolean_t rpcsvc_get_throttle (rpcsvc_t svc); Change-Id: Ica8a9166cef22eb92d81fe68e48d0a5e24a1ef95 BUG: 1216310 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/10267 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/10442 Tested-by: NetBSD Build System
* rpc-lib: Fixing the coverity issuesNandaja Varma2015-04-104-20/+14
| | | | | | | | | | | | | | | | | Coverity CIDs: 1210973 1124887 1124888 1124682 1124849 1124503 Change-Id: I012f6cf9d14753f572ab94aae6d442d1ef8df79a BUG: 789278 Signed-off-by: Nandaja Varma <nandaja.varma@gmail.com> Reviewed-on: http://review.gluster.org/9600 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpc: fix deadlock when unref is inside conn->lockKrishnan Parthasarathi2015-04-101-70/+91
| | | | | | | | | | | | | | | | | In ping-timer implementation, the timer event takes a ref on the rpc object. This ref needs to be removed after every timeout event. ping-timer mechanism could be holding the last ref. For e.g, when a peer is detached and its rpc object was unref'd. In this case, ping-timer mechanism would try to acquire conn->mutex to perform the 'last' unref while being inside the critical section already. This will result in a deadlock. Change-Id: I74f80dd08c9348bd320a1c6d12fc8cd544fa4aea BUG: 1206134 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/9613 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpc: Fixing dereferencing after null checkarao2015-04-021-3/+9
| | | | | | | | | | | | | | CID: 1124607 The pointer variable is checked for NULL and logged accordingly. Change-Id: Ied0d7f7ff33da22198eca65f14816b943cae5541 BUG: 789278 Signed-off-by: arao <arao@redhat.com> Reviewed-on: http://review.gluster.org/9674 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: CLI commands to create and manage tiered volumes.Dan Lambright2015-03-191-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A tiered volume is a normal volume with some number of new bricks representing "hot" storage. The "hot" bricks can be attached or detached dynamically to a normal volume. When this happens, a new graph is constructed. The root of the new graph is an instance of the tier translator. One subvolume of the tier translator leads to the old volume, and another leads to the new hot bricks. attach-tier <VOLNAME> [<replica> <COUNT>] <NEW-BRICK> ... [force] volume detach-tier <VOLNAME> [replica <COUNT>] <BRICK> ... <start|stop|status|commit|force> gluster volume rebalance <volume> tier start gluster volume rebalance <volume> tier stop gluster volume rebalance <volume> tier status The "tier start" CLI command starts a server side daemon. The daemon initiates file level migration based on caching policies. The daemon's status can be monitored and stopped. Note development on the "tier status" command is incomplete. It will be added in a subsequent patch. When the "hot" storage is detached, the tier translator is removed from the graph and the tiered volume reverts to its original state as described in the volume's info file. For more background and design see the feature page [1]. [1] http://www.gluster.org/community/documentation/index.php/Features/data-classification Change-Id: Ic8042ce37327b850b9e199236e5be3dae95d2472 BUG: 1194753 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/9753 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* cli/glusterd: cli command implementation for bitrot featuresGaurav Kumar Garg2015-03-181-0/+1
| | | | | | | | | | | | | | | | | CLI command for bitrot features. volume bitrot <volname> enable|disable Above command will enable/disable bitrot feature for particular volume. BUG: 1170075 Change-Id: Ie84002ef7f479a285688fdae99c7afa3e91b8b99 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Signed-off-by: Anand nekkunti <anekkunt@redhat.com> Signed-off-by: Dominic P Geevarghese <dgeevarg@redhat.com> Reviewed-on: http://review.gluster.org/9866 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* CLI : GLobal option for NFS-GaneshaMeghana Madhusudhan2015-03-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A new global CLI option has been introduced for NFS-Ganesha. gluster features.ganesha enable/disable. This option is persistent and shall be inherited by new volumes created after this option is set. gluster features.ganesha enable It carries out the following functions: 1. Disables gluster-nfs across the cluster 2. Starts NFS-Ganesha server on a subset of nodes and exports '/'. 3. Creates the HA cluster for NFS-Ganesha. 4. Writes the option into the global config file. gluster features.ganesha disable 1. Stops NFS-Ganesha server. 2. Tears down the HA cluster for NFS-Ganesha With this change the older volume set options with keys "nfs-ganesha.host" and "nfs-ganesha.enable" will no longer be supported. This commit has only has the CLI related changes. Another patch will be submitted to support this feature entirely. Change-Id: Ie4b66a16c23b33b795738654b9a68f8e2c34efe3 BUG: 1188184 Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com> Reviewed-on: http://review.gluster.org/9538 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* every/where: add GF_FOP_IPC for inter-translator communicationJeff Darcy2015-03-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several features - e.g. encryption, erasure codes, or NSR - involve multiple cooperating translators which sometimes need a "private" means of communication amongst themselves. Historically we've used virtual or synthetic xattrs, but that's not very elegant and clutters up the getxattr/setxattr path which must also handle real xattr requests. This new fop should address that. The only argument is an int32_t "op" which should be recognized by the target translator. It is recommended that translators using these feature follow some convention regarding the ops that they define, to avoid conflicts. Using a hash of the target translator's type string as a base for a series of ops would probably be a good start. Any other information can be passed in both directions using xdata. The default behavior for this fop, as with any other, is to pass through to FIRST_CHILD. That makes use of this fop "transparent" to other translators that were written before it existed, but it also means that it only really works with pass-through translators. If a routing translator (such as DHT) or a fan-out translator (such as AFR) is involved, the IPC might not reach its intended destination unless those translators are modified to forward IPC fops along all paths. If an IPC gets all the way to storage/posix it is considered an error, much like an uncaught exception. We don't actually *do* anything in that case, but we do log it send back an EOPNOTSUPP error. This makes the "unrecognized opcode" condition distinguishable from the "no IPC support" condition (which would yield an RPC error instead) so clients can probe for the presence of a handler for their own favorite opcode and either use that or use old-school xattrs depending on the result. BUG: 1158628 Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: I84af1b17babe5b30ec03ecf027ae37d09b873968 Reviewed-on: http://review.gluster.org/8812 Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* socket: use TCP_USER_TIMEOUT to detect client failures quickerNiels de Vos2015-03-172-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the network.ping-timeout to set the TCP_USER_TIMEOUT socket option (see 'man 7 tcp'). The option sets the transport.tcp-user-timeout option that is handled in the rpc/socket layer on the protocol/server side. This socket option makes detecting unclean disconnected clients more reliable. When the socket gets closed, any locks that the client held are been released. This makes it possible to reduce the fail-over time for applications that run on systems that became unreachable due to a network partition or general system error client-side (kernel panic, hang, ...). It is not trivial to create a test-case for this at the moment. We need a client that unclean disconnects and an other client that tries to take over the lock from the disconnected client. URL: http://supercolony.gluster.org/pipermail/gluster-devel/2014-May/040755.html Change-Id: I5e5f540a49abfb5f398291f1818583a63a5f4bb4 BUG: 1129787 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/8065 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Santosh Pradhan <santosh.pradhan@gmail.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* Upcall: New xlator to store various states and send cbk eventsSoumya Koduri2015-03-171-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Framework on the server-side, to handle certain state of the files accessed and send notifications to the clients connected. A generic and extensible framework, used to maintain states in the glusterfsd process for each of the files accessed (including the clients info doing the fops) and send notifications to the respective glusterfs clients incase of any change in that state. This patch handles "Inode Update/Invalidation" upcall event. Feature page: URL: http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure Below link has a writeup which explains the code changes done - URL: https://soumyakoduri.wordpress.com/2015/02/25/glusterfs-understanding-upcall-infrastructure-and-cache-invalidation-support/ Change-Id: Ie3d724be9a3419fcf18901a753e8ec2df2ac802f BUG: 1200262 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/9535 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/ec: Add self-heal-daemon command handlersPranith Kumar K2015-03-091-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces the changes required in ec xlator to handle index/full heal. Index healer threads: Ec xlator start an index healer thread per local brick. This thread keeps waking up every minute to check if there are any files to be healed based on the indices kept in index directory. Whenever child_up event comes, then also this index healer thread wakes up and crawls the indices and triggers heal. When self-heal-daemon is disabled on this particular volume then the healer thread keeps waiting until it is enabled again to perform heals. Full healer threads: Ec xlator starts a full healer thread for the local subvolume provided by glusterd to perform full crawl on the directory hierarchy to perform heals. Once the crawl completes the thread exits if no more full heals are issued. Changed xl-op prefix GF_AFR_OP to GF_SHD_OP to make it more generic. Change-Id: Idf9b2735d779a6253717be064173dfde6f8f824b BUG: 1177601 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9787 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpcsvc: New rpc routines defined to send callback requestsSoumya Koduri2015-03-012-0/+47
| | | | | | | | | | | | | Change-Id: I7f95682faada16308314bfbf84298b02d1198efa BUG: 1188184 Signed-off-by: Poornima G <pgurusid@redhat.com> Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/9534 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd: nfs,shd,quotad,snapd daemons refactoringAtin Mukherjee2015-02-202-0/+44
| | | | | | | | | | | | | This patch ports nfs, shd, quotad & snapd with the approach suggested in http://www.gluster.org/pipermail/gluster-devel/2014-December/043180.html Change-Id: I4ea5b38793f87fc85cc9d2cf873727351dedffd2 BUG: 1191486 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/9428 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Nekkunti <anekkunt@redhat.com>
* rdma: reduce log level from E to WMohammed Rafi KC2015-02-172-1/+28
| | | | | | | | | | | | | | glusterd process, when try to initialize default vol file, will always through an error if there is no rdma device. Changing the log levels and log messages to more appropriately. Change-Id: I75b919581c6738446dd2d5bddb7b7658a91efcf4 BUG: 1188232 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/9559 Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* rpc: fix ref leak in ping timerKrishnan Parthasarathi2015-02-041-1/+16
| | | | | | | | Change-Id: I4ddc371d01ec763706a168a215410015ee2a3787 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/9578 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* geo-rep: Adding Slave user field to georep statusAravinda VK2015-02-021-0/+1
| | | | | | | | | | | | | | | | | | | New column introduced in Status output, "SLAVE USER", Slave user is not "root" in non root Geo-replication setup. Added additional tag in XML output <slave_user> BUG: 1180459 Change-Id: Ia48a5a8eb892ce883b9ec114be7bb2d46eff8535 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9409 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: Implement Volume heal enable/disablePranith Kumar K2015-01-201-0/+2
| | | | | | | | | | | | | | | | | | For volumes with replicate, disperse xlators, self-heal daemon should do healing. This patch provides enable/disable functionality for the xlators to be part of self-heal-daemon. Replicate already had this functionality with 'gluster volume set cluster.self-heal-daemon on/off'. But this patch makes it uniform for both types of volumes. Internally it still does 'volume set' based on the volume type. Change-Id: Ie0f3799b74c2afef9ac658ef3d50dce3e8072b29 BUG: 1177601 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9358 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* cluster/afr: split-brain resolution CLIRavishankar N2015-01-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the AFR heal command to include automated split-brain resolution. This patch [3/3] is the final patch for afr automated split-brain resolution implementation. "gluster volume heal <VOLNAME> [full | statistics [heal-count [replica <HOSTNAME:BRICKNAME>]] |info [healed | heal-failed | split-brain]| split-brain {bigger-file <FILE> |source-brick <HOSTNAME:BRICKNAME> [<FILE>]}]" The new additions being: 1.gluster volume heal <VOLNAME> split-brain bigger-file <FILE> Locates the replica containing the FILE, selects bigger-file as source and completes heal. 2.gluster volume heal <VOLNAME> split-brain source-brick <HOSTNAME:BRICKNAME> <FILE> Selects <FILE> present in <HOSTNAME:BRICKNAME> as source and completes heal. 3.gluster volume heal <VOLNAME> split-brain <HOSTNAME:BRICKNAME> Selects all split-brained files in <HOSTNAME:BRICKNAME> as source and completes heal. Note: <FILE> can be either the full file name as seen from the root of the volume (or) the gfid-string representation of the file, which sometimes gets displayed in the heal info command's output. Entry/gfid split-brain resolution is not supported. Example can be found in the test case. Change-Id: I4649733922d406f14f28ee9033a5cb627b9538b3 BUG: 1136769 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/9377 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>