glusterfs.git/xlators/cluster/ec, branch v6.4

cluster/ec: honor contention notifications for partially acquired locks

2019-06-03T04:08:06+00:00

EC was ignoring lock contention notifications received while a lock was
being acquired. When a lock is partially acquired (some bricks have
granted the lock but some others not yet) we can receive notifications
from acquired bricks, which should be honored, since we may not receive
more notifications after that.

Since EC was ignoring them, once the lock was acquired, it was not
released until the eager-lock timeout, causing unnecessary delays on
other clients.

This fix takes into consideration the notifications received before
having completed the full lock acquisition. After that, the lock will
be released as soon as possible.
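
The behaviour described above can be sketched roughly as follows. This
is only an illustration, not the actual EC code: sketch_lock_t and all
sketch_* functions are hypothetical names, bricks are assumed to be
numbered 0..31 so they fit in a 32-bit mask, and the real lock state in
EC is considerably more involved.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified lock state (not the real EC structure). */
typedef struct {
    uint32_t expected_mask; /* bricks the lock was requested from */
    uint32_t granted_mask;  /* bricks that have granted it so far */
    bool contention;        /* a contention notification was seen */
} sketch_lock_t;

/* The lock is fully acquired once every expected brick has granted it. */
bool sketch_lock_acquired(const sketch_lock_t *lock)
{
    return lock->granted_mask == lock->expected_mask;
}

/* Record that a brick (0..31) has granted the lock. */
void sketch_lock_granted(sketch_lock_t *lock, unsigned brick)
{
    lock->granted_mask |= (UINT32_C(1) << brick) & lock->expected_mask;
}

/* Record a contention notification. It is remembered even when the
 * acquisition is still partial, as long as it comes from a brick that
 * has already granted the lock. */
void sketch_lock_contention(sketch_lock_t *lock, unsigned brick)
{
    if (lock->granted_mask & (UINT32_C(1) << brick)) {
        lock->contention = true;
    }
}

/* After acquisition, a remembered notification means the lock should be
 * released as soon as possible instead of at the eager-lock timeout. */
bool sketch_lock_release_now(const sketch_lock_t *lock)
{
    return sketch_lock_acquired(lock) && lock->contention;
}
```

With three bricks, a notification arriving from brick 0 while bricks 1
and 2 are still pending is kept, and sketch_lock_release_now() reports
true as soon as the remaining bricks grant the lock.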

Backport of:
> BUG: bz#1708156
> Change-Id: I2a306dbdb29fb557dcab7788a258bd75d826cc12
> Signed-off-by: Xavi Hernandez 

Fixes: bz#1714172
Change-Id: I2a306dbdb29fb557dcab7788a258bd75d826cc12
Signed-off-by: Xavi Hernandez

cluster/ec: Reopen shouldn't happen with O_TRUNC

2019-05-15T10:36:50+00:00

Problem:
Doing a re-open with O_TRUNC truncates the fragment even when that is
not needed, which makes extra heals necessary.

Fix:
At the time of re-open don't use O_TRUNC.
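
As a minimal sketch of the idea (sketch_reopen_flags is a hypothetical
helper, not a function from the EC sources), the flags saved from the
original open can be masked before they are reused for the re-open:

```c
#include <fcntl.h>

/* Hypothetical helper: derive the flags for a re-open from the flags
 * of the original open, dropping O_TRUNC so that the fragment is not
 * truncated again. All other flags are passed through unchanged. */
int sketch_reopen_flags(int original_flags)
{
    return original_flags & ~O_TRUNC;
}
```

For example, an fd originally opened with O_RDWR | O_TRUNC would be
re-opened with O_RDWR only.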

fixes bz#1709660
Change-Id: Idc6408968efaad897b95a5a52481c66e843d3fb8
Signed-off-by: Pranith Kumar K

cluster/ec: fix fd reopen

2019-05-08T13:54:59+00:00

Currently EC tries to reopen fd's that have been opened while a brick
was down. This is done as part of regular write operations, just after
having acquired the locks, and it's sent as a sub-fop of the main write
fop.

There were two problems:

1. The reopen was attempted on all UP bricks, even if a previous lock
didn't succeed. This is incorrect because most probably the open will
fail.

2. If reopen is sent and fails, the error is propagated to the main
operation, causing it to fail when it shouldn't.

To fix this, we only attempt reopens on bricks where the current fop
owns a lock, and we prevent any error from being propagated to the main
fop.
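
The brick selection can be pictured with bitmasks. The following is
only a sketch under assumptions: the three masks and the function name
are hypothetical, and bricks are assumed to be represented by one bit
each.

```c
#include <stdint.h>

/* Hypothetical sketch: choose the bricks on which a reopen is worth
 * attempting. A brick qualifies only if it is UP, the current fop owns
 * a lock on it, and the fd is not already open there. */
uint32_t sketch_reopen_targets(uint32_t up_mask, uint32_t locked_mask,
                               uint32_t open_mask)
{
    return up_mask & locked_mask & ~open_mask;
}
```

With bricks 0..3 UP (0xf), locks owned on bricks 0..2 (0x7) and the fd
already open on brick 0 (0x1), only bricks 1 and 2 (0x6) are selected.
Brick 3 is skipped because the lock did not succeed there.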

To implement this behaviour, the argument used to indicate the minimum
number of required answers has been overloaded to also include some
flags. To
make the change consistent, it has been necessary to rename the
argument, which means that a lot of files have been changed. However
there are no functional changes.
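
One way to overload a single integer argument like this is sketched
below. The bit layout, the macro names and the helper functions are
assumptions made for illustration only; they are not the definitions
used by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: the low 8 bits carry the minimum number of
 * required answers, the remaining bits carry flags. */
#define SKETCH_MIN_MASK UINT32_C(0xff)
#define SKETCH_FLAG_NO_PROPAGATE_ERROR (UINT32_C(1) << 8)

/* Combine a minimum and a set of flags into one argument. */
uint32_t sketch_fop_flags(uint32_t minimum, uint32_t flags)
{
    return (minimum & SKETCH_MIN_MASK) | (flags & ~SKETCH_MIN_MASK);
}

/* Extract the minimum number of required answers. */
uint32_t sketch_fop_minimum(uint32_t fop_flags)
{
    return fop_flags & SKETCH_MIN_MASK;
}

/* Errors of a sub-fop reach the main fop unless the flag is set. */
bool sketch_fop_propagates_error(uint32_t fop_flags)
{
    return (fop_flags & SKETCH_FLAG_NO_PROPAGATE_ERROR) == 0;
}
```

Callers that only care about the minimum keep passing a plain number,
which is why a rename of the argument touches many files without
changing their behaviour.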

This change has also uncovered a problem in discard code, which didn't
correctly process requests of small sizes because no real discard fop
was being processed, only a write of 0's on some region. In this case
some fields of the fop remained uninitialized or with incorrect values.
To fix this, a new function has been created to simulate success on a
fop and it's used in the discard case.

Thanks to Pranith for providing a test script that has also detected an
issue in this patch. This patch includes a small modification of this
script to force data to be written into bricks before stopping them.

Backport of:
> Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
> BUG: bz#1699866
> Signed-off-by: Xavi Hernandez 

Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
Fixes: bz#1699917
Signed-off-by: Xavi Hernandez

ec: fix truncate lock to cover the write in truncate clean

2019-04-16T10:57:05+00:00

ec_truncate_clean writes under the lock granted for truncate, but the
start of the lock is calculated with ec_adjust_offset_up, so the write
done by ec_truncate_clean falls outside the locked range.
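
The arithmetic behind the problem can be sketched as follows. The
helpers are hypothetical stand-ins, not the real ec_adjust_offset_*
functions, and the sketch assumes that the lock extends from its start
offset to the end of the file and that the stripe size is non-zero. It
only shows why a lock starting at the rounded-up offset misses a write
at the truncate offset, while a lock starting at the rounded-down
offset would cover it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round an offset down to the start of its stripe. */
uint64_t sketch_round_down(uint64_t offset, uint64_t stripe)
{
    return offset - offset % stripe;
}

/* Round an offset up to the next stripe boundary (overflow ignored). */
uint64_t sketch_round_up(uint64_t offset, uint64_t stripe)
{
    uint64_t rem = offset % stripe;

    return rem == 0 ? offset : offset + (stripe - rem);
}

/* A lock from lock_start to EOF covers a write starting at or after
 * lock_start. */
bool sketch_lock_covers(uint64_t lock_start, uint64_t write_offset)
{
    return write_offset >= lock_start;
}
```

With a stripe of 2048 bytes and a truncate to offset 5000, rounding up
gives 6144 and rounding down gives 4096. A write at offset 5000 is
covered by a lock starting at 4096 but not by one starting at 6144.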

Updates: bz#1699499
Change-Id: Idbe1fd48d26afe49c36b77db9f12e0907f5a4134
Signed-off-by: Kinglong Mee 
(cherry picked from commit 0e1223491e964096384edfae5032ed0d50d028ad)

cluster/ec: Don't enqueue an entry if it is already healing

2019-04-16T10:50:49+00:00

Problem:
1 - heal-wait-qlength is by default 128. If shd is disabled
and we need to heal files, client side heal is needed.
Accessing these files triggers the heal.
However, it has been observed that a file will be enqueued
multiple times in the heal wait queue, which in turn causes the
queue to fill up and prevents other files from being enqueued.

2 - While a file is going through healing and a write fop from
mount comes on that file, it sends write on all the bricks including
healing one. At the end it updates version and size on all the
bricks. However, it does not unset dirty flag on all the bricks,
even if this write fop was successful on all the bricks.
After healing completion this dirty flag remains set and never
gets cleaned up if SHD is disabled.

Solution:
1 - If an entry is already in queue or going through heal process,
don't enqueue next client side request to heal the same file.

2 - Unset dirty on all the bricks at the end if fop has succeeded on
all the bricks even if some of the bricks are going through heal.
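
Point 1 of the solution can be illustrated with a small sketch. All
types and functions here are hypothetical and much simpler than the EC
heal queue: a per-inode flag marks an entry as queued or being healed,
and further requests for the same inode are rejected while it is set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_QUEUE_LEN 128 /* mirrors the default heal-wait-qlength */

/* Hypothetical per-inode state. */
typedef struct {
    bool heal_pending; /* queued for heal or currently being healed */
} sketch_inode_t;

/* Hypothetical bounded wait queue. */
typedef struct {
    sketch_inode_t *items[SKETCH_QUEUE_LEN];
    size_t count;
} sketch_heal_queue_t;

/* Enqueue an inode unless it is already queued or healing, or the
 * queue is full. Returns true if the inode was added. */
bool sketch_heal_enqueue(sketch_heal_queue_t *q, sketch_inode_t *inode)
{
    if (inode->heal_pending || q->count == SKETCH_QUEUE_LEN) {
        return false;
    }
    q->items[q->count++] = inode;
    inode->heal_pending = true;
    return true;
}

/* Take the oldest inode out of the queue to heal it. The flag stays
 * set while the heal is running. Returns NULL if the queue is empty. */
sketch_inode_t *sketch_heal_dequeue(sketch_heal_queue_t *q)
{
    sketch_inode_t *inode;

    if (q->count == 0) {
        return NULL;
    }
    inode = q->items[0];
    memmove(q->items, q->items + 1, (q->count - 1) * sizeof(q->items[0]));
    q->count--;
    return inode;
}

/* Mark the heal as finished so the inode can be enqueued again. */
void sketch_heal_done(sketch_inode_t *inode)
{
    inode->heal_pending = false;
}
```

Repeated accesses to the same file then occupy at most one slot of the
queue, leaving room for other files.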

Change-Id: Ia61ffe230c6502ce6cb934425d55e2f40dd1a727
updates: bz#1693223
Signed-off-by: Ashish Pandey 
(cherry picked from commit 313dcefe7a62bd16cd794040df068f9bec9c6927)

cluster/ec: Fix handling of heal info cases without locks

2019-04-09T05:27:52+00:00

The heal info command takes a lot of time because in some cases it
takes locks on entries to find out whether an entry actually needs
heal or not.

There are some cases where we can avoid these locks and still
conclude whether the entry needs heal or not.

1 - We do a lookup (without lock) on an entry, which we found in
.glusterfs/indices/xattrop, and find that lock count is
zero. Now if the file contains dirty bit set on all or any
brick, we can say that this entry needs heal.

2 - If the lock count is one and dirty is greater than 1,
then it also means that some fop had left the dirty bit set
which made the dirty count of current fop (which has taken lock)
more than one. At this point also we can definitely say that
this entry needs heal.
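
The two cases can be condensed into a small decision function. This is
a sketch only: the enum, the function name and the use of a single
dirty value (for instance the highest one seen on any brick) are
assumptions, not the actual implementation.

```c
#include <stdint.h>

/* Hypothetical result of the lock-free check. */
typedef enum {
    SKETCH_NEEDS_HEAL, /* heal is certainly needed, no lock required */
    SKETCH_UNDECIDED   /* cannot tell without taking locks */
} sketch_heal_hint_t;

/* Decide from the lock count and dirty value returned by a lookup
 * done without locks. */
sketch_heal_hint_t sketch_heal_hint(uint32_t lock_count, uint64_t dirty)
{
    /* Case 1: nobody holds a lock but dirty is set. */
    if (lock_count == 0 && dirty > 0) {
        return SKETCH_NEEDS_HEAL;
    }
    /* Case 2: one fop holds a lock, but dirty is higher than what
     * that fop alone would have set. */
    if (lock_count == 1 && dirty > 1) {
        return SKETCH_NEEDS_HEAL;
    }
    return SKETCH_UNDECIDED;
}
```

Only the undecided case still needs the slower path that takes locks.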

This patch modifies the code to take the above two points into
consideration.
It also changes the code so that ec_heal_inspect is not called if
ec_heal_do was called from client side heal. Client side heal triggers
heal only when it is sure that heal is required.

[We have changed the code to not call heal for lookup]

updates bz#1697764
Change-Id: I7f09f0ecd12f65a353297aefd57026fd2bebdf9c
Signed-off-by: Ashish Pandey 
(cherry picked from commit da47caf2405c08c9abafc4a55525a8b2c2dd5bb8)

cluster/ec: NULL pointer dereferencing clang fix

2018-12-14T04:33:45+00:00

Removing VALIDATE_OR_GOTO check on "this"

Change-Id: I154deaca5302b41c1cafd87077de880dd03ec613
Updates: bz#1622665
Signed-off-by: Sheetal Pamecha

xlator: make 'xlator_api' mandatory

2018-12-13T09:11:50+00:00

* Remove the options to load old symbols.
* Export only the 'xlator_api' symbol, using xlator.sym
* Add xlator_api to all the xlators where it is missing

NOTE: This covers all the xlators which have at least one test case
to validate their loading. If there is a translator which doesn't
have any test, then we should probably remove it from the codebase.

fixes: #164
Change-Id: Ibcdc8c9844cda6b4463d907a15813745d14c1ebb
Signed-off-by: Amar Tumballi

libglusterfs: Move devel headers under glusterfs directory

2018-12-05T21:47:04+00:00

libglusterfs devel package headers are referenced in code using
include semantics for a program. While this works, it can be better,
especially when dealing with out of tree xlator builds or, in
general, out of tree devel package usage.

Towards this, the following changes were made:
- moved all devel headers under a glusterfs directory
- Included these headers using system header notation <> in all
code outside of libglusterfs
- Included these headers using own program notation "" within
libglusterfs

This change, although big, just moves the headers around and makes
the includes correct when these headers are used from other sources.

This helps us correctly include libglusterfs includes without
namespace conflicts.

Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR

Multiple xlator .h files: remove unused private gf_* memory types.

2018-11-30T11:51:18+00:00

It seems there were quite a few unused enums (that in turn
cause unneeded memory allocation) in some xlators.
I've removed them, hopefully not causing any damage.

Compile-tested only!

updates: bz#1193929
Signed-off-by: Yaniv Kaul 

Change-Id: I8252bd763dc1506e2d922496d896cd2fc0886ea7