| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[WIP patch as of now, just needs a little tweak]
A pending TODO in the code caused regressions to fail as
bitrot daemons are spawned during volume start (equivalent
to enabling bitrot by default). The problematic part that
casued such failures is during brick disconnections with
unsafe handling of event data structured in the code.
With this patch, data structures are properly cleaned up
with care taken to cleanup all accessors first. This also
fixes potential memory leaks which was bluntly ignored
before.
Change-Id: I70ed82cb1a0fb56c85ef390007e321a97a35c5ce
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
original-author: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9959
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
libgfchangelog initializes global xlator on library load (via
constructor: _ctor) and mangles it's xlator context thereby
messing with certain important members of the command structure.
On receiving an RPC disconnection event, if the point-of-execution
was in libgfchangelogs context, accessing ->cmd_args during RPC
notify resulted in a segfault.
Fix:
Since the libarary needs to be able to work with processes that
have a notion of an xlator (THIS in particular) and without it,
care needs to be taken to allocate the global xlator when needed.
Moreover, the actual fix is to use the correct xlator context
in both cases. A new API is introduces when needs to be invoked
by the conusmer (although this could have been done during
register() call, keeping it a separate API makes thing flexible
and easy).
Test:
The issue is observed when a brick process goes offline. This is
triggered when test cases (.t's) are run in bulk, since each
test essestially spawns bricks processes (on volume start) and
terminates them (volume stop). Since bitrot daemon, as of now,
spawns upon volume start, the issue is much observed when the
volume is taken offline at the end of each test case. With this
fix, running the basic and core test cases along with building
the linux kernel has passed without daemon segfaults.
Thanks to Johnny (rabhat@) for helping in debugging the issue
(and with the fix :)).
Change-Id: I8d3022bf749590b2ee816504ed9b1dfccc65559a
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9953
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
This patch introduces RPC based communication between the changelog
translator and libgfchangelog. It replaces the old pathetic stream
based interaction that existed earlier (due to time constraints :-/).
Changelog, upon initialization starts a RPC server (rpcsvc) allowing
clients to invoke a probe API as a bootup mechanism to request for
event notifications. During probe, clients can choose an event
filter specifying the type(s) of events they are interested in. As
of now there is no way to change the event notification set once
the probe RPC call is made, but that is easier to implement.
The actual event notifications is done on a separate RPC session.
The client (libgfchangelog) itself starts and RPC server which the
changelog translator "connects back" during probe. Notifications
are dispatched by a bunch of threads from the server (translator)
and the client optionally orders them if ordered notifications
are requried. FOPs fill in their respective event details in a
buffer (rot-buffs to be particular) and a bunch of threads
(consumers) swap the buffers out of roatation and dispatch them
via RPC. To avoid writer starvation, then number of dispatcher
threads is one less than the number of buffer list in rot-buffs.x
libgfchangelog becomes purely callback based -- upon event
notification from the server (and re-ordering them if required)
invoke a callback routine specified by consumer(s).
A major part of the patch is also aimed at providing backward
compatibility for geo-replication, which was one of the main
consumer of the stream based API. Also, this patch does not\
"turn on" event notifications for all fops, just a bunch which
is currently in requirement. Another pain point is that the
server does not filter events before dispatching it to the
clients. That load is taken up by the client itself (although
it's done at the library layer rather than making it hard on
the callback implementor). This needs improvement and care
needs to be taken to not load the server up with expensive
filtering mechanisms.
Change-Id: Ibf60a432b68f2dfa60c6f9add2bcfd37a9c41395
BUG: 1170075
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Reviewed-on: http://review.gluster.org/9708
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|