<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/volume.rc, branch v6.8</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>ctime/rebalance: Heal ctime xattr on directory during rebalance</title>
<updated>2019-09-27T11:34:25+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-07-29T13:00:42+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=e152f753013f923f95ebdd63ffc4de0cd44221d1'/>
<id>e152f753013f923f95ebdd63ffc4de0cd44221d1</id>
<content type='text'>
After add-brick and rebalance, the ctime xattr is not present
on rebalanced directories on the new brick. This patch fixes
that.

Note that ctime still doesn't support consistent time across
distribute sub-volumes.

This patch also fixes the in-memory inconsistency of time attributes
when metadata is self-healed.
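
Illustrative sketch of the directory-heal idea (the helper below is
hypothetical; the real fix lives in dht's directory self-heal path):

    #include &lt;sys/types.h&gt;
    #include &lt;sys/xattr.h&gt;

    #define MDATA_XATTR "trusted.glusterfs.mdata"

    /* During directory heal after add-brick/rebalance, copy the ctime
     * (mdata) xattr from a brick that already has it onto the new
     * brick's copy of the directory. */
    static int
    heal_dir_mdata(const char *src_dir, const char *dst_dir)
    {
        char buf[512];
        ssize_t len = lgetxattr(src_dir, MDATA_XATTR, buf, sizeof(buf));

        if (len &lt; 0)
            return -1;  /* source has no mdata xattr; nothing to heal */

        return lsetxattr(dst_dir, MDATA_XATTR, buf, (size_t)len, 0);
    }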

Backport of:
 &gt; Patch: https://review.gluster.org/23127
 &gt; Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
 &gt; BUG: 1734026
 &gt; Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;

Patch: https://review.gluster.org/23127
Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
fixes: bz#1752413
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
</entry>
<entry>
<title>posix/ctime: Fix ctime upgrade issue</title>
<updated>2019-07-02T07:40:33+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-06-13T10:53:21+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=b7b76714691d464b09a6363ccc2783080cb17ea2'/>
<id>b7b76714691d464b09a6363ccc2783080cb17ea2</id>
<content type='text'>
Problem:
On an EC volume, during an upgrade from an older version where
the ctime feature is not enabled (or not present) to a newer
version where the ctime feature is available (enabled by default),
self heal hangs and doesn't complete.

Cause:
The ctime feature has both client-side code (utime) and
server-side code (posix). The feature is driven from the client:
only if the client side sets the time in the frame should the
server side set the time attributes in the xattr. But posix
setattr/fsetattr was not honouring that. When one of the server
nodes is updated, since ctime is enabled by default, it starts
setting the xattr on setattr/fsetattr on the updated node/brick.

On an EC volume the first two updated nodes (bricks) are not a
problem because there are 4 other bricks with consistent data.
However, once the third brick is updated, the new attribute (mdata
xattr) causes a metadata inconsistency across 3 bricks, which
prevents the file from being repaired.

Fix:
Don't create the mdata xattr from the utimes/utimensat system calls.
Only update it if it is already present.
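
Illustrative sketch of the fix (client_requested is a stand-in for
the frame flag; the real logic sits in posix_setattr/posix_fsetattr):

    #include &lt;sys/types.h&gt;
    #include &lt;sys/xattr.h&gt;

    #define MDATA_XATTR "trusted.glusterfs.mdata"

    /* Update the mdata xattr only if it already exists on the brick,
     * or if the client explicitly drives the update. A bare
     * utimes/utimensat on a part-upgraded cluster therefore no longer
     * plants a brand-new mdata xattr on the updated brick. */
    static int
    posix_update_mdata(const char *path, const void *mdata, size_t len,
                       int client_requested)
    {
        char buf[256];
        ssize_t ret = lgetxattr(path, MDATA_XATTR, buf, sizeof(buf));

        if (ret &gt;= 0 || client_requested)
            return lsetxattr(path, MDATA_XATTR, mdata, len, 0);

        return 0;  /* absent and not client-driven: leave it alone */
    }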

Backport of:
 &gt; Patch: https://review.gluster.org/22858
 &gt; Change-Id: Ieacedecb8a738bb437283ef3e0f042fd49dc4c8c
 &gt; BUG: 1720201
 &gt; Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;

Change-Id: Ieacedecb8a738bb437283ef3e0f042fd49dc4c8c
fixes: bz#1722805
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
</entry>
<entry>
<title>cluster/afr: Send truncate on arbiter brick from SHD</title>
<updated>2019-03-12T20:51:47+00:00</updated>
<author>
<name>karthik-us</name>
<email>ksubrahm@redhat.com</email>
</author>
<published>2019-03-07T16:56:49+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=9b58cfc83c26aa09eb1de8187cc65ed0c3390e97'/>
<id>9b58cfc83c26aa09eb1de8187cc65ed0c3390e97</id>
<content type='text'>
Problem:
In an arbiter volume configuration, SHD will not send any writes to the
arbiter brick even if there is a data-pending marker for the arbiter
brick. If we have an arbiter setup on the geo-rep master and there are
data-pending markers for files on the arbiter brick, SHD will not mark
any data changelog during healing. While syncing the data from master
to slave, if the arbiter brick is considered ACTIVE, there is a chance
that the slave will miss some data. If the arbiter brick is newly added
or replaced, the slave may miss all the data during sync.

Fix:
If there is a data-pending marker for the arbiter brick, send a truncate
on the arbiter brick during heal, so that the truncate is recorded as
the data transaction in the changelog.
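
Illustrative sketch of the heal decision (stand-in parameters; the
real SHD winds writev/ftruncate fops through the client stack rather
than writing to local fds):

    #include &lt;stdbool.h&gt;
    #include &lt;sys/types.h&gt;
    #include &lt;unistd.h&gt;

    /* The arbiter brick stores no file data, so instead of writing we
     * send a single truncate to the final size; changelog records it
     * as a data transaction, and geo-rep no longer treats the arbiter
     * as having nothing to sync. */
    static ssize_t
    heal_data_on_sink(int fd, bool is_arbiter, bool data_pending,
                      const void *buf, size_t len, off_t off, off_t size)
    {
        if (is_arbiter)
            return data_pending ? ftruncate(fd, size) : 0;

        return pwrite(fd, buf, len, off);
    }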

Change-Id: I3242ba6cea6da495c418ef860d9c3359c5459dec
fixes: bz#1687672
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;
</content>
</entry>
<entry>
<title>features/shard: Hold a ref on base inode when adding a shard to lru list</title>
<updated>2018-10-16T03:37:44+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2018-10-05T06:02:21+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=e627977617dd765f6b58a70882c6acda6c6aab6e'/>
<id>e627977617dd765f6b58a70882c6acda6c6aab6e</id>
<content type='text'>
In __shard_update_shards_inode_list(), the shard translator previously
did not hold a ref on the base inode when a shard was added to the
lru list. But if the base shard is forgotten and destroyed, either by
fuse due to memory pressure or because the file was deleted at some
point by a different client while this client still holds stale
shards in its lru list, the client would crash when locking
lru_base_inode-&gt;lock owing to illegal memory access.

So now the base shard is ref'd into the inode ctx of every shard that
is added to the lru list, until it gets lru'd out.
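
Illustrative sketch of the ref discipline, with stand-ins for inode_t
and the gf list machinery:

    #include &lt;stddef.h&gt;

    struct inode { int refcount; };  /* stand-in for inode_t */

    static struct inode *inode_ref(struct inode *i)
    {
        i-&gt;refcount++;
        return i;
    }

    static void inode_unref(struct inode *i) { i-&gt;refcount--; }

    struct shard_ctx {
        struct inode *base_inode;   /* ref held while on the lru list */
        struct shard_ctx *lru_next; /* simplified lru linkage */
    };

    /* Adding a shard to the lru list now pins the base inode, so a
     * fuse forget or a delete from another client cannot free the
     * memory that lru_base_inode-&gt;lock lives in. */
    static void
    shard_lru_add(struct shard_ctx **lru_head, struct shard_ctx *ctx,
                  struct inode *base_inode)
    {
        ctx-&gt;base_inode = inode_ref(base_inode);
        ctx-&gt;lru_next = *lru_head;
        *lru_head = ctx;
    }

    /* The ref is dropped only when the shard is lru'd out. */
    static void
    shard_lru_evict(struct shard_ctx **lru_head)
    {
        struct shard_ctx *ctx = *lru_head;

        if (ctx == NULL)
            return;
        *lru_head = ctx-&gt;lru_next;
        inode_unref(ctx-&gt;base_inode);
        ctx-&gt;base_inode = NULL;
    }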

The patch also handles the case where none of the shards associated
with a file that is about to be deleted are part of the LRU list, and
where an unlink at the beginning of the operation destroys the base
inode (because there are no refkeepers), so all of the shards that
are about to be deleted are resolved without a base shard in memory.
This, if not handled properly, could lead to a crash.

Change-Id: Ic15ca41444dd04684a9458bd4a526b1d3e160499
updates: bz#1605056
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
</content>
</entry>
<entry>
<title>tests: kill_brick should wait for brick status to become offline</title>
<updated>2018-08-10T02:03:14+00:00</updated>
<author>
<name>Atin Mukherjee</name>
<email>amukherj@redhat.com</email>
</author>
<published>2018-08-03T15:20:43+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=7d4849499663a42ed81e1f9ebc95c82fc70dc4c3'/>
<id>7d4849499663a42ed81e1f9ebc95c82fc70dc4c3</id>
<content type='text'>
Change-Id: I52e8eec7f334af37de433c444f4ddfc876fa56cc
Fixes: bz#1614088
Signed-off-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</content>
</entry>
<entry>
<title>tests: fix online_brick_count function</title>
<updated>2018-08-03T22:45:33+00:00</updated>
<author>
<name>Atin Mukherjee</name>
<email>amukherj@redhat.com</email>
</author>
<published>2018-08-02T04:47:28+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=871ea43ef0d5e1c76903cdda63ccf2a8764a9615'/>
<id>871ea43ef0d5e1c76903cdda63ccf2a8764a9615</id>
<content type='text'>
online_brick_count should discard the Bitrot and Scrubber daemons.

Change-Id: I301373ccdbeec1d1a5e6c6b137f48ed997f22556
Fixes: bz#1611103
Signed-off-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</content>
</entry>
<entry>
<title>cluster/dht: Fix rename journal in changelog</title>
<updated>2018-06-24T15:47:59+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2018-05-28T07:05:26+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=f9f1d26f93d10c41c40f13ccf9b751acb03f43bd'/>
<id>f9f1d26f93d10c41c40f13ccf9b751acb03f43bd</id>
<content type='text'>
With patch [1], renames are journalled only
on the cached subvolume. dht sends a special
key on the cached subvolume so that the
changelog journals the rename. With a single
distribute sub-volume, however, the key was
not being set. This patch fixes that.

[1] https://review.gluster.org/10410
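
Illustrative sketch of the change, using a hypothetical key name and a
one-slot stand-in for dict_t (the real key constant and dht_rename
code differ):

    /* Hypothetical name; the real changelog rename key differs. */
    #define CHANGELOG_RENAME_KEY "glusterfs.changelog.rename-op"

    struct kv {                    /* one-slot stand-in for dict_t */
        const char *key;
        int value;
    };

    /* Before: the key was set only when more than one distribute
     * subvolume existed. After: set it unconditionally on the rename
     * wound to the cached subvolume, so changelog always journals
     * the rename. */
    static void
    dht_mark_rename(struct kv *xattr_req)
    {
        xattr_req-&gt;key = CHANGELOG_RENAME_KEY;
        xattr_req-&gt;value = 1;
    }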

fixes: bz#1583018
Change-Id: Ic2e35b40535916fa506a714f257ba325e22d0961
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
</entry>
<entry>
<title>client: remove the "connecting" state - it's not used</title>
<updated>2018-06-21T05:37:09+00:00</updated>
<author>
<name>Michael Adam</name>
<email>obnox@samba.org</email>
</author>
<published>2018-05-29T09:56:13+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=a2b65d01f5feb68a3bf56399c7815e5f0a73101f'/>
<id>a2b65d01f5feb68a3bf56399c7815e5f0a73101f</id>
<content type='text'>
The "connecting" state is not used anywhere really.
It's only being set and printed. So remove it.

Change-Id: I11fc8b0bdcda5a812d065543aa447d39957d3b38
fixes: bz#1583583
Signed-off-by: Michael Adam &lt;obnox@samba.org&gt;
</content>
</entry>
<entry>
<title>gluster: Sometimes Brick process is crashed at the time of stopping brick</title>
<updated>2018-04-19T04:31:51+00:00</updated>
<author>
<name>Mohit Agrawal</name>
<email>moagrawa@redhat.com</email>
</author>
<published>2018-03-12T14:13:15+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=0043c63f70776444f69667a4ef9596217ecb42b7'/>
<id>0043c63f70776444f69667a4ef9596217ecb42b7</id>
<content type='text'>
Problem: Sometimes the brick process crashes while stopping a brick
         when brick mux is enabled.

Solution: The brick process was crashing because rpc connections were
          not cleaned up properly while brick mux is enabled. With this
          patch, after sending the GF_EVENT_CLEANUP notification to the
          xlator (server), we wait for all rpc client connections of
          that specific xlator to be destroyed. Once the rpc connections
          of all clients associated with that brick are destroyed in
          server_rpc_notify, xlator_mem_cleanup is called for the brick
          xlator as well as all child xlators. To avoid races during
          cleanup, two new flags, cleanup_starting and call_cleanup, are
          introduced in each xlator.
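
Illustrative sketch of how the two flags gate teardown, with a
stand-in for the relevant xlator_t fields:

    #include &lt;stdatomic.h&gt;

    struct xlator {
        atomic_int cleanup_starting; /* teardown begun: refuse new work */
        atomic_int call_cleanup;     /* safe to run xlator_mem_cleanup */
        atomic_int rpc_clients;      /* live client connections */
    };

    static void
    brick_cleanup_begin(struct xlator *xl)
    {
        atomic_store(&amp;xl-&gt;cleanup_starting, 1);
        /* GF_EVENT_CLEANUP sent; wait for the clients to drop off. */
    }

    /* Called on each client disconnect, as server_rpc_notify would. */
    static void
    on_client_disconnect(struct xlator *xl)
    {
        if (atomic_fetch_sub(&amp;xl-&gt;rpc_clients, 1) == 1 &amp;&amp;
            atomic_load(&amp;xl-&gt;cleanup_starting) == 1) {
            atomic_store(&amp;xl-&gt;call_cleanup, 1);
            /* Last connection gone: xlator_mem_cleanup() may now free
             * the brick xlator and its children without racing
             * in-flight rpc handlers. */
        }
    }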

BUG: 1544090
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;

Note: All test cases were run in a separate build
      (https://review.gluster.org/#/c/19700/) with the same patch after
      forcefully enabling brick mux, and all of them passed.

Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf
updates: bz#1544090
</content>
</entry>
<entry>
<title>storage/posix: Add active-fd-count option in gluster</title>
<updated>2018-03-21T05:06:31+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2018-03-19T09:42:14+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=2da6650dfa402143c7b9ea0e67bbda79d0475ddd'/>
<id>2da6650dfa402143c7b9ea0e67bbda79d0475ddd</id>
<content type='text'>
Problem:
When dd runs on a sharded replicate volume, all the writes on shards
happen through an anon-fd. When the writes don't come quickly enough,
the old anon-fd is closed and a new fd is created to serve the new
writes. open-fd-count is decremented only after the fd is closed as
part of fd_destroy(). So even while one fd is on its way to being
closed, a new fd is created, and during this short period it appears
as though there are multiple fds open on the file. AFR thinks another
application has opened the same file and switches off eager-lock,
leading to extra latency.

Fix:
Have a different count called active-fd-count whose life cycle starts
at fd_bind() and ends just before fd_destroy().
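
Illustrative sketch of the two counters' life cycles (stand-in types;
the real bookkeeping lives on the fd/inode in libglusterfs):

    #include &lt;stdatomic.h&gt;

    struct inode_counts {
        atomic_int open_fd_count;   /* dropped only inside fd_destroy() */
        atomic_int active_fd_count; /* dropped just before fd_destroy() */
    };

    static void
    fd_bind(struct inode_counts *c)
    {
        atomic_fetch_add(&amp;c-&gt;active_fd_count, 1);
    }

    /* A closing fd leaves the "active" window first, so a replacement
     * anon-fd never makes the file look multiply opened. */
    static void
    fd_close_begin(struct inode_counts *c)
    {
        atomic_fetch_sub(&amp;c-&gt;active_fd_count, 1);
    }

    /* AFR's eager-lock decision can key off active_fd_count instead
     * of open_fd_count. */
    static int
    eager_lock_ok(struct inode_counts *c)
    {
        return atomic_load(&amp;c-&gt;active_fd_count) == 1;
    }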

BUG: 1557932
Change-Id: I2e221f6030feeedf29fbb3bd6554673b8a5b9c94
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
</entry>
</feed>
