glusterfs.git/rpc/rpc-lib/src, branch v3.11dev

tier : Tier as a service

2017-01-17T04:49:47+00:00

tierd is implemented by separating from rebalance process.

The commands affected:

1) Attach tier will trigger this process instead of old one
2) tier start and tier start force will also trigger this process.
3) volume status [tier] will show tier daemon as a process instead
of task and normal tier status and tier detach status works.
4) tier stop implemented.
5) detach tier implemented separately along with new detach tier
status
6) volume tier volname status will work using the changes.
7) volume set works

This patch has separated the tier translator from the legacy
DHT rebalance code. It now sends the RPCs from the CLI
to glusterd separate to the DHT rebalance code.
The daemon is now a service, similar to the snapshot daemon,
and can be viewed using the volume status command.

The code for the validation and commit phase are the same
as the earlier tier validation code in DHT rebalance.

The “brickop” phase has been changed so that the status
command can use this framework.

The service management framework is now used.
DHT rebalance does not use this framework.

This service framework takes care of :

*) spawning the daemon, killing it and other such processes.
*) volume set options , which are written on the volfile.
*) restart and reconfigure functions. Restart is to restart
the daemon at two points
        1)after gluster goes down and comes up.
        2) to stop detach tier.
*) reconfigure is used to make immediate volfile changes.
By doing this, we don’t restart the daemon.
it has the code to rewrite the volfile for topological
changes too (which comes into place during add and remove brick).

With this patch the log, pid, and volfile are separated
and put into respective directories.

Change-Id: I3681d0d66894714b55aa02ca2a30ac000362a399
BUG: 1313838
Signed-off-by: hari gowtham 
Reviewed-on: http://review.gluster.org/13365
Smoke: Gluster Build System 
Tested-by: hari gowtham 
CentOS-regression: Gluster Build System 
NetBSD-regression: NetBSD Build System 
Reviewed-by: Dan Lambright 
Reviewed-by: Atin Mukherjee

socket: socket disconnect should wait for poller thread exit

2016-12-22T04:49:19+00:00

When SSL is enabled or if "transport.socket.own-thread" option is set
then socket_poller is run as different thread. Currently during
disconnect or PARENT_DOWN scenario we don't wait for this thread
to terminate. PARENT_DOWN will disconnect the socket layer and
cleanup resources used by socket_poller.

Therefore before disconnect we should wait for poller thread to exit.

Change-Id: I71f984b47d260ffd979102f180a99a0bed29f0d6
BUG: 1404181
Signed-off-by: Rajesh Joseph 
Reviewed-on: http://review.gluster.org/16141
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Kaushal M 
Reviewed-by: Raghavendra Talur 
Reviewed-by: Raghavendra G

rpc: fix for race between rpc and protocol/client

2016-12-05T14:09:48+00:00

It is possible that the notification thread which notifies
protocol/client layer about the disconnection is put to sleep
and meanwhile, a fuse thread or a timer thread initiates and
completes reconnection to the brick. The notification thread
is then woken up and protocol/client layer updates its flags
to indicate that network is disconnected. No reconnection is
initiated because reconnection is rpc-lib layer's responsibility
and its flags indicate that connection is connected.

Fix: Serialize connect and disconnect notify

Credit: Raghavendra Talur 
Change-Id: I8ff5d1a3283b47f5c26848a42016a40bc34ffc1d
BUG: 1386626
Signed-off-by: Rajesh Joseph 
Reviewed-on: http://review.gluster.org/15916
Reviewed-by: Raghavendra G 
Smoke: Gluster Build System 
Reviewed-by: Raghavendra Talur 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System

cluster/afr: CLI for granular entry heal enablement/disablement

2016-11-28T11:56:33+00:00

When there are already existing non-granular indices created that are
yet to be healed, if granular-entry-heal option is toggled from 'off' to
'on', AFR self-heal whenever it kicks in, will try to look for granular
indices in 'entry-changes'. Because of the absence of name indices,
granular entry healing logic will fail to heal these directories, and
worse yet unset pending extended attributes with the assumption that
are no entries that need heal.

To get around this, a new CLI is introduced which will invoke glfsheal
program to figure whether at the time an attempt is made to enable
granular entry heal, there are pending heals on the volume OR there
are one or more bricks that are down. If either of them is true, the
command will be failed with the appropriate error.

New CLI: gluster volume heal  granular-entry-heal {enable,disable}

Change-Id: I1f4fe8162813b9068e198965d94169fee4adc099
BUG: 1370410
Signed-off-by: Krutika Dhananjay 
Reviewed-on: http://review.gluster.org/15747
Smoke: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Atin Mukherjee

Revert "rpc: Fix the race between notification and reconnection"

2016-11-16T09:25:31+00:00

This reverts commit a6b63e11b7758cf1bfcb67985e25ec02845f0995.

Nithya and Rajesh found that the mount fails sometimes after this patch
was merged so reverting it.

BUG: 1386626
Change-Id: I959a5b6c7da61368cf4c67c98193c6e8fdd1755d
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/15838
Reviewed-by: N Balachandran 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Smoke: Gluster Build System

rpc: Fix the race between notification and reconnection

2016-10-25T06:42:20+00:00

Problem:
There was a hang because unlock on an entry failed with
ENOTCONN.
Client thinks the connection is down where as server thinks
the connection is up.

This is the race we are seeing:
1) Connection from client to the brick disconnects.
2) Saved frames unwind is called which unwinds all
   frames that were wound before disconnect.
3) connection from client to the brick happens and
   setvolume.
4) Disconnect notification for the connection in 1)
   comes now and calls client_rpc_notify() which
   marks the connection to be offline even when the
   connection is up.

This is happening because I/O can retrigger connection
before disconnect notification is sent to the higher
layers in rpc.

Fix:
Notify the higher layers that a disconnect happened and then
go ahead with reconnect logic.

For the logs which point to the information above check:
https://bugzilla.redhat.com/show_bug.cgi?id=1386626#c1

Thanks to Raghavendra G for suggesting the correct fix.

BUG: 1386626
Change-Id: I3c84ba1f17010bd69049fa88ec5f0ae431f8cda9
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/15681
NetBSD-regression: NetBSD Build System 
Reviewed-by: Niels de Vos 
CentOS-regression: Gluster Build System 
Smoke: Gluster Build System 
Reviewed-by: Raghavendra G

compound fops: Fix file corruption issue

2016-10-24T14:11:08+00:00

1. Address of a local variable @args is copied into state->req
in server3_3_compound (). But even after the function has gone out of
scope, in server_compound_resume () this pointer is accessed and
dereferenced. This patch fixes that.

2. Compound fops, by virtue of NOT having a vector sizer (like the one
writev has), ends up having both the header and the data (in case one of
its member fops is WRITEV) in the same hdr_iobuf. This buffer was not
being preserved through the lifetime of the compound fop, causing it to
be overwritten by a parallel write fop, even when the writev associated
with the currently executing compound fop is yet to hit the desk, thereby
corrupting the file's data. This is fixed by associating the hdr_iobuf with
the iobref so its memory remains valid through the lifetime of the fop.

3. Also fixed a use-after-free bug in protocol/client in compound fops cbk,
missed by Linux but caught by NetBSD.

Finally, big thanks to Pranith Kumar K and Raghavendra Gowdappa for their
help in debugging this file corruption issue.

Change-Id: I6d5c04f400ecb687c9403a17a12683a96c2bf122
BUG: 1378778
Signed-off-by: Krutika Dhananjay 
Reviewed-on: http://review.gluster.org/15654
NetBSD-regression: NetBSD Build System 
Reviewed-by: Raghavendra G 
Smoke: Gluster Build System 
CentOS-regression: Gluster Build System

rpc: increase RPC/XID with each callback

2016-09-19T09:32:03+00:00

The RPC/XID for callbacks has been hardcoded to GF_UNIVERSAL_ANSWER. In
Wireshark these RPC-calls are marked as "RPC retransmissions" because of
the repeating RPC/XID. This is most confusing when verifying the
callbacks that the upcall framework sends. There is no way to see the
difference between real retransmissions and new callbacks.

This change was verified by create and removal of files through
different Gluster clients. The RPC/XID is increased on a per connection
(or client) base. The expectations of the RPC protocol are met this way.

Change-Id: I2116bec0e294df4046d168d8bcbba011284cd0b2
BUG: 1377097
Signed-off-by: Niels de Vos 
Reviewed-on: http://review.gluster.org/15524
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Raghavendra G

build: out-of-tree builds generates files in the wrong directory

2016-09-18T16:34:37+00:00

And minor cleanup of a few of the Makefile.am files while we're
at it.

Rewrite the make rules to do what xdrgen does. Now we can get rid
of xdrgen.

Note 1. netbsd6's sed doesn't do -i. Why are we still running
smoke tests on netbsd6 and not netbsd7? We barely support netbsd7
as it is.

Note 2. Why is/was libgfxdr.so (.../rpc/xdr/src/...) linked with
libglusterfs? A cut-and-paste mistake? It has no references to
symbols in libglusterfs.

Note3. "/#ifndef\|#define\|#endif/" (note the '\'s) is a _basic_
regex that matches the same lines as the _extended_ regex
"/#(ifndef|define|endif)/". To match the extended regex sed needs to
be run with -r on Linux; with -E on *BSD. However NetBSD's and
FreeBSD's sed helpfully also provide -r for compatibility. Using a
basic regex avoids having to use a kludge in order to run sed with
the correct option on OS X.

Note 4. Not copying the bit of xdrgen that inserts copyright/license
boilerplate. AFAIK it's silly to pretend that machine generated
files like these can be copyrighted or need license boilerplate.
The XDR source files have their own copyright and license; and
their copyrights are bound to be more up to date than old
boilerplate inserted by a script. From what I've seen of other
Open Source projects -- e.g. gcc and its C parser files generated
by yacc and lex -- IIRC they don't bother to add copyright/license
boilerplate to their generated files.

It appears that it's a long-standing feature of make (SysV, BSD,
gnu) for out-of-tree builds to helpfully pretend that the source
files it can find in the VPATH "exist" as if they are in the $cwd.
rpcgen doesn't work well in this situation and generates files
with "bad" #include directives.

E.g. if you `rpcgen ../../../../$srcdir/rpc/xdr/src/glusterfs3-xdr.x`,
you get an #include directive in the generated .c file like this:

  ...
  #include "../../../../$srcdir/rpc/xdr/src/glusterfs3-xdr.h"
  ...

which (obviously) results in compile errors on out-of-tree build
because the (generated) header file doesn't exist at that location.
Compared to `rpcgen ./glusterfs3-xdr.x` where you get:

  ...
  #include "glusterfs3-xdr.h"
  ...

Which is what we need. We have to resort to some Stupid Make Tricks
like the addition of various .PHONY targets to work around the VPATH
"help".

Warning: When doing an in-tree build, -I$(top_builddir)/rpc/xdr/...
looks exactly like -I$(top_srcdir)/rpc/xdr/...  Don't be fooled though.
And don't delete the -I$(top_builddir)/rpc/xdr/... bits

Change-Id: Iba6ab96b2d0a17c5a7e9f92233993b318858b62e
BUG: 1330604
Signed-off-by: Kaleb S KEITHLEY 
Reviewed-on: http://review.gluster.org/14085
Tested-by: Niels de Vos 
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Niels de Vos

glusterd : Introduce reset brick

2016-08-30T02:55:53+00:00

The command basically allows replace brick with src and
dst bricks as same.

Usage:
gluster v reset-brick   start
This command kills the brick to be reset. Once this command is run,
admin can do other manual operations that they need to do,
like configuring some options for the brick. Once this is done,
resetting the brick can be continued with the following options.

gluster v reset-brick    commit {force}

Does the job of resetting the brick. 'force' option should be used
when the brick already contains volinfo id.

Problem: On doing a disk-replacement of a brick in a replicate volume
the following 2 scenarios may occur :

a) there is a chance that reads are served from this replaced-disk brick,
which leads to empty reads. b) potential data loss if next writes succeed
only on replaced brick, and heal is done to other bricks from this one.

Solution: After disk-replacement, make sure that reset-brick command is
run for that brick so that pending markers are set for the brick and it
is not chosen as source for reads and heal. But, as of now replace-brick
for the same brick-path is not allowed. In order to fix the above
mentioned problem, same brick-path replace-brick is needed.
With this patch reset-brick commit {force} will be allowed even when
source and destination  are identical as long as
1) destination brick is not alive
2) source and destination brick have the same brick uuid and path.
Also, the destination brick after replace-brick will use the same port
as the source brick.

Change-Id: I440b9e892ffb781ea4b8563688c3f85c7a7c89de
BUG: 1266876
Signed-off-by: Anuradha Talur 
Reviewed-on: http://review.gluster.org/12250
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Ashish Pandey 
Reviewed-by: Pranith Kumar Karampuri