glusterfs.git/xlators/protocol/server, branch v7.0rc2

event: rename event_XXX with gf_ prefixed

2019-08-21T06:13:38+00:00

I hit one crash issue when using the libgfapi.

In the libgfapi it will call glfs_poller() --> event_dispatch()
in file api/src/glfs.c:721, and the event_dispatch() is defined
by libgluster locally, the problem is the name of event_dispatch()
is the extremly the same with the one from libevent package form
the OS.

For example, if a executable program Foo, which will also use and
link the libevent and the libgfapi at the same time, I can hit the
crash, like:

kernel: glfs_glfspoll[68486]: segfault at 1c0 ip 00007fef006fd2b8 sp
00007feeeaffce30 error 4 in libevent-2.0.so.5.1.9[7fef006ed000+46000]

The link for Foo is:
lib_foo_LADD = -levent $(GFAPI_LIBS)
It will crash.

This is because the glfs_poller() is calling the event_dispatch() from
the libevent, not the libglsuter.

The gfapi link info :
GFAPI_LIBS = -lacl -lgfapi -lglusterfs -lgfrpc -lgfxdr -luuid

If I link Foo like:
lib_foo_LADD = $(GFAPI_LIBS) -levent
It will works well without any problem.

And if Foo call one private lib, such as handler_glfs.so, and the
handler_glfs.so will link the GFAPI_LIBS directly, while the Foo won't
and it will dlopen(handler_glfs.so), then the crash will be hit everytime.

The link info will be:
foo_LADD = -levent
libhandler_glfs_LIBADD = $(GFAPI_LIBS)

I can avoid the crash temporarily by linking the GFAPI_LIBS in Foo too like:
foo_LADD = $(GFAPI_LIBS) -levent
libhandler_glfs_LIBADD = $(GFAPI_LIBS)

But this is ugly since the Foo won't use any APIs from the GFAPI_LIBS.

And in some cases when the --as-needed link option is added(on many dists
it is added as default), then the crash is back again, the above workaround
won't work.

Backport of:
> https://review.gluster.org/#/c/glusterfs/+/23110/
> Change-Id: I38f0200b941bd1cff4bf3066fca2fc1f9a5263aa
> Fixes: #699
> Signed-off-by: Xiubo Li 

Change-Id: I38f0200b941bd1cff4bf3066fca2fc1f9a5263aa
updates: bz#1740519
Signed-off-by: Xiubo Li 
(cherry picked from commit 799edc73c3d4f694c365c6a7c27c9ab8eed5f260)

stack: Make sure to have unique call-stacks in all cases

2019-05-30T15:54:33+00:00

At the moment new stack doesn't populate frame->root->unique in all cases. This
makes it difficult to debug hung frames by examining successive state dumps.
Fuse and server xlators populate it whenever they can, but other xlators won't
be able to assign 'unique' when they need to create a new frame/stack because
they don't know what 'unique' fuse/server xlators already used. What we need is
for unique to be correct. If a stack with same unique is present in successive
statedumps, that means the same operation is still in progress. This makes
'finding hung frames' part of debugging hung frames easier.

fixes bz#1714098
Change-Id: I3e9a8f6b4111e260106c48a2ac3a41ef29361b9e
Signed-off-by: Pranith Kumar K

protocol: remove compound fop

2019-04-29T05:29:44+00:00

Compound fops are kept on wire as a backward compatibility with
older AFR modules. The AFR module used beyond 4.x releases are
not using compound fops. Hence removing the compound fop in the
protocol code.

Note that, compound-fops was already an 'option' in AFR, and
completely removed since 4.1.x releases.

So, point to note is, with this change, we have 2 ways to upgrade
when clients of 3.x series are present.

i) set 'use-compound-fops' option to 'false' on any volume which
   is of replica type. And then upgrade the servers.

ii) Do a two step upgrade. First from current version (which will
    already be EOL if it's using compound) to a 4.1..6.x version,
    and then an upgrade to 7.x.

Consider the overall code which we are removing for the option
seems quite high, I believe it is worth it.

updates: bz#1693692
Signed-off-by: Amar Tumballi 
Change-Id: I0a8876d0367a15e1410ec845f251d5d3097ee593

Replace memdup() with gf_memdup()

2019-04-12T13:08:28+00:00

memdup() and gf_memdup() have the same implementation. Removed one API
as the presence of both can be confusing.

Change-Id: I562130c668457e13e4288e592792872d2e49887e
updates: bz#1193929
Signed-off-by: Vijay Bellur

server.c: fix Coverity CID 1399758

2019-03-21T04:38:35+00:00

1399758 Dereference before null check

It was introduced @ commit 67f48bfcc16a38052e6c9ae7c25e69b03b8ae008


updates: bz#789278
Signed-off-by: Yaniv Kaul 

Change-Id: I1424b008b240691fe2a8924e31c708d0fb4f362d

glusterfsd: Brick is getting crash at the time of startup

2019-03-13T10:50:10+00:00

Problem: Brick is getting crash because graph was not activated
         at the time of accessing server_conf

Solution: To avoid the crash check ctx->active before processing
          a request

Change-Id: Ib112e0eace19189e45f430abdac5511c026bed47
fixes: bz#1687705
Signed-off-by: Mohit Agrawal

server.c: use dict_() funcs with key length.

2019-02-18T02:46:35+00:00

Changed to use the dict_() funcs which take the key length.
This happens to also reduce work under the lock in one case as well.

Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul 

Change-Id: I958fcc29e95286fe3c74178cae3f01a8b2db26f2

protocol/server: Use SERVER_REQ_SET_ERROR correctly for dicts

2019-02-15T07:15:55+00:00

Removed op_errno based SERVER_REQ_SET_ERROR() calls which was
dead-code. xdr_to_dict() calls have this check which is used
in 4.0 version of xdr-to-dict.

fixes bz#1676797
Change-Id: I6f56907c85576f1263a6ec04ed7e37f723b01ac3
Signed-off-by: Pranith Kumar K

core: heketi-cli is throwing error "target is busy"

2019-01-31T06:23:39+00:00

Problem: When rpc-transport-disconnect happens, server_connection_cleanup_flush_cbk() 
         is supposed to call rpc_transport_unref() after open-files on 
         that transport are flushed per transport.But open-fd-count is 
         maintained in bound_xl->fd_count, which can be incremented/decremented 
         cumulatively in server_connection_cleanup() by all transport 
         disconnect paths. So instead of rpc_transport_unref() happening 
         per transport, it ends up doing it only once after all the files 
         on all the transports for the brick are flushed leading to 
         rpc-leaks.

Solution: To avoid races maintain fd_cnt at client instead of maintaining
          on brick

Credits: Pranith Kumar Karampuri
Change-Id: I6e8ea37a61f82d9aefb227c5b3ab57a7a36850e6
fixes: bz#1668190
Signed-off-by: Mohit Agrawal

core: heketi-cli is throwing error "target is busy"

2019-01-24T06:54:41+00:00

Problem: At the time of deleting block hosting volume
         through heketi-cli , it is throwing an error "target is busy".
         cli is throwing an error because brick is not detached successfully
         and brick is not detached due to race condition to cleanp xprt
         associated with detached brick

Solution: To avoid xprt specifc race condition introduce an atomic flag
          on rpc_transport

Change-Id: Id4ff1fe8375a63be71fb3343f455190a1b8bb6d4
fixes: bz#1668190
Signed-off-by: Mohit Agrawal