| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
__nlc_inode_ctx_get assigns a value to nlc_pe_p which is never used by
its parent function or any of the predecessor hence remove the
assignment and also that function argument as it is not being used
anywhere.
fixes: bz#1732496
Change-Id: I5b950e1e251bd50a646616da872a4efe9d2ff8c9
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
EC doesn't allow concurrent writes on overlapping areas, they are
serialized. However non-overlapping writes are serviced in parallel.
When a write is not aligned, EC first needs to read the entire chunk
from disk, apply the modified fragment and write it again.
The problem appears on sparse files because a write to an offset
implicitly creates data on offsets below it (so, in some way, they
are overlapping). For example, if a file is empty and we read 10 bytes
from offset 10, read() will return 0 bytes. Now, if we write one byte
at offset 1M and retry the same read, the system call will return 10
bytes (all containing 0's).
So if we have two writes, the first one at offset 10 and the second one
at offset 1M, EC will send both in parallel because they do not overlap.
However, the first one will try to read missing data from the first chunk
(i.e. offsets 0 to 9) to recombine the entire chunk and do the final write.
This read will happen in parallel with the write to 1M. What could happen
is that half of the bricks process the write before the read, and the
half do the read before the write. Some bricks will return 10 bytes of
data while the otherw will return 0 bytes (because the file on the brick
has not been expanded yet).
When EC tries to recombine the answers from the bricks, it can't, because
it needs more than half consistent answers to recover the data. So this
read fails with EIO error. This error is propagated to the parent write,
which is aborted and EIO is returned to the application.
The issue happened because EC assumed that a write to a given offset
implies that offsets below it exist.
This fix prevents the read of the chunk from bricks if the current size
of the file is smaller than the read chunk offset. This size is
correctly tracked, so this fixes the issue.
Also modifying ec-stripe.t file for Test #13 within it.
In this patch, if a file size is less than the offset we are writing, we
fill zeros in head and tail and do not consider it strip cache miss.
That actually make sense as we know what data that part holds and there is
no need of reading it from bricks.
Change-Id: Ic342e8c35c555b8534109e9314c9a0710b6225d6
Fixes: bz#1730715
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
| |
fixes: bz#1724024
Change-Id: I539fb7248b2cfc037ec29f1413ea648f9ec21ef2
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When frame->local is not null FRAME_DESTROY calls mem_put on it.
Since the stub is already destroyed in call_resume(), it leads
to crash
Fix:
Set frame->local to NULL before calling call_resume()
fixes: bz#1593542
Change-Id: I0f8adf406f4cefdb89d7624ba7a9d9c2eedfb1de
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
| |
This function does length, allocation and serialization for you.
Change-Id: I142a259952a2fe83dd719442afaefe4a43a8e55e
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The files which were created before ctime enabled would not
have "trusted.glusterfs.mdata"(stores time attributes) xattr.
Upon fops which modifies either ctime or mtime, the xattr
gets created with latest ctime, mtime and atime, which is
incorrect. It should update only the corresponding time
attribute and rest from backend
Solution:
Creating xattr with values from brick is not possible as
each brick of replica set would have different times.
So create the xattr upon successful lookup if the xattr
is not created
Note To Reviewers:
The time attributes used to set xattr is got from successful
lookup. Instead of sending the whole iatt over the wire via
setxattr, a structure called mdata_iatt is sent. The mdata_iatt
contains only time attributes.
Change-Id: I5e535631ddef04195361ae0364336410a2895dd4
fixes: bz#1593542
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are two ways to fetch node-uuid information from dht.
1 - #define GF_XATTR_LIST_NODE_UUIDS_KEY "trusted.glusterfs.list-node-uuids"
This key is used by AFR.
2 - #define GF_REBAL_FIND_LOCAL_SUBVOL "glusterfs.find-local-subvol"
This key is used for non-afr volume type.
We do two getxattr operations. First on the #1 key followed by on #2 if
getxattr on #1 key fails.
Since the parent function "dht_init_local_subvols_and_nodeuuids" logs failure,
moving the log-level to DEBUG in dht_find_local_subvol_cbk.
fixes: bz#1730175
Change-Id: I4d88244dc26587b111ca5b00d4c00118efdaac14
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ec_manager_open/opendir memsets ctx->loc which causes
memory/inode leak, and ec_fheal uses ctx->loc out of fd->lock
that loc_copy may copy bad data when memset it.
This patch skips updating ctx->loc when it is initilizaed.
With it, ctx->loc is filled once, and never updated.
Change-Id: I3bf5ffce4caf4c1c667f7acaa14b451d37a3550a
fixes: bz#1729772
Signed-off-by: Kinglong Mee <mijinlong@horiscale.com>
|
|
|
|
|
|
|
|
|
| |
If lock has info, fop should inherit healing mask from it.
Otherwise, fop cannot inherit right healing when changed_flags is zero.
Change-Id: Ife80c9169d2c555024347a20300b0583f7e8a87f
fixes: bz#1727081
Signed-off-by: Kinglong Mee <mijinlong@horiscale.com>
|
|
|
|
|
|
|
|
|
| |
We need to safe-guard against possible zero'ing out of iatt
structure in acl ctx, which can cause many issues.
fixes: bz#1668286
Change-Id: Ie81a57d7453a6624078de3be8c0845bf4d432773
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Manually added '-Wpadded' to get warnings on padding, and reordered
structs to reduce most of them.
Change-Id: I0c505fcb3dfef76399ac9d5d33bfb235354532de
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When iot_worker terminates, its resources have not been reaped, which
will consumes lots of memory.
Detach iot_worker to automically release its resources back to the
system.
fixes: bz#1729107
Change-Id: I71fabb2940e76ad54dc56b4c41aeeead2644b8bb
Signed-off-by: Liguang Li <liguang.lee6@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to send the commit req to peers in case of geo-rep
operations even though it is a no volname operation. In commit
phase peers try to set the txn_opinfo which will fail because
it is a no volname operation where we don't require a commit
phase. We mark skip_locking as true for no volname operations,
but we have to give an exception to geo-rep operations, so that
they can set txn_opinfo in commit phase.
Please refer to detailed RCA at the bug: 1729463
fixes: bz#1729463
Change-Id: I9f2478b12a281f6e052035c0563c40543493a3fc
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
| |
Fixes: bz#1728554
Change-Id: I88357aed7c14988a12616035c3738c32c09a8f9a
Signed-off-by: Patrick Matthäi <pmatthaei@debian.org>
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have proposed about this an year ago, and with recent smoke
failures, it looks like the right time to take such call.
ref: https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html
With this, glusterfs-8.0 wouldn't have rdma feature, and would
allow some modularity changes possible with rpc layer (as we would
have just 1 transport)
Updates: bz#1635688
Change-Id: Ia277dca4d4b1f0cffae20819024a52b075b775e5
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
| |
The current list of snapshots from priv->dirents is obtained outside
the lock.
Change-Id: I8876ec0a38308da5db058397382fbc82cc7ac177
Fixes: bz#1726783
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In case of thin arbiter, before index healer starts crawling the
indices at every heal-timeout interval, even if there is nothing to
be healed it will send an upcall notification to all the clients to
release any AFR_TA_DOM_NOTIFY locks that they hold. SHD will wait
for the upcall to return before proceeding with the heal even though
there is nothing to be healed. This will also invalidates the cached
information about the bricks states on the clients which leads to
extra calls on TA from clients for the next reads & writes if needed.
This will impact the IO performance.
Fix:
- Before sending the upcall to the clients, check for any pending heals
on TA without taking any locks.
- If there is nothing marked bad on TA, then continue with the index
crawl to heal any dirty markings present on the files due to any post-op
failure.
- If there is a brick marked as bad on TA, then take the
AFR_TA_DOM_NOTIFY lock on TA from SHD, get the state on TA and
continue with the current healing process.
Change-Id: Ieb477bc6cb18bbdfd4e7a0453c5ed79b574ec9d6
fixes: bz#1724184
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problems:
1. When checking for type and gfid mismatch, if the type or gfid
is unknown because of missing gfid handle and the gfid xattr
it will be reported as type or gfid mismatch and the heal will
not complete.
2. If the source selected during entry heal has null gfid the same
will be sent to afr_lookup_and_heal_gfid(). In this function when
we try to assign the gfid on the bricks where it does not exist,
we are considering the same gfid and try to assign that on those
bricks. This will fail in posix_gfid_set() since the gfid sent
is null.
Fix:
If the gfid sent to afr_lookup_and_heal_gfid() is null choose a
valid gfid before proceeding to assign the gfid on the bricks
where it is missing.
In afr_selfheal_detect_gfid_and_type_mismatch(), do not report
type/gfid mismatch if the type/gfid is unknown or not set.
Change-Id: Ia06552e4dc4a9f89cb7f5302833604bd21bbf7da
fixes: bz#1722507
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As usleep has been obsoleted, changed all invocations of usleep
to nanosleep. From man 3 usleep:
"4.3BSD, POSIX.1-2001. POSIX.1-2001 declares this function
obsolete; use nanosleep(2) instead. POSIX.1-2008 removes the
specification of usleep()."
Added a helper function gf_nanosleep() to have a single place
for handling edge cases that might arise from the conversion of
usleep to nanosleep and allow the sleep to resume with right
remaining value upon being interrupted.
Fixes: bz#1721686
Change-Id: Ia39ab82c9e0f4669d2c00d4cdf25e38d94ef9f62
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
As Hadoop is no longer supported, dropping code for
handling Hadoop access.
Fixes: bz#1728417
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
Change-Id: I8fcf4faacb364f1c9a8abb0c48faec337087f845
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a normal volume, we are updating the pid from a the
process while we do a daemonization or at the end of the
init if it is no-daemon mode. Along with updating the pid
we also lock the file, to make sure that the process is
running fine.
With brick mux, we were updating the pidfile from gluterd
after an attach/detach request.
There are two problems with this approach.
1) We are not holding a pidlock for any file other than parent
process.
2) There is a chance for possible race conditions with attach/detach.
For example, shd start and a volume stop could race. Let's say
we are starting an shd and it is attached to a volume.
While we trying to link the pid file to the running process,
this would have deleted by the thread that doing a volume stop.
Change-Id: I29a00352102877ce09ea3f376ca52affceb5cf1a
Updates: bz#1722541
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
| |
warning: ‘%s’ directive argument is null
Change-Id: I2ce9560f98a8310886c31384e40c2e101ad2c719
updates: bz#1193929
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
warning: taking address of packed member of
‘struct _quota_limits’ may result in an
unaligned pointer value.
Change-Id: Ib889c99184560f0d24dbcc4fae10b563a2b9b7fd
updates: bz#1193929
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With group-metadata-cache group profile settings
performance.cache-invalidation option when turned on enables both md-cache
and quick-read xlator's cache-invalidation feature. While the intent of
the group-metadata-cache is to set md-cache xlator's cache-invalidation
feature, quick-read xlator also gets affected due to the same. While
md-cache feature and it's profile existed since release-3.9,
quick-read cache-invalidation was introduced in release-4 and due to
this op-version mismatch on any cluster which is >= glusterfs-4 when
this group profile is applied it breaks backward compatibility with the
old clients.
The proposed fix here is to rename the key in quick-read to
'quick-read-cache-invalidation' so that both these features have
distinct identification. While this brings in by itself a backward
compatibility challenge where this feature is enabled in an existing
cluster and when the same is upgraded to a version where this change
exists, it will lead to an unidentified old key. But as a workaround we
can always ask users upgrading to release-7 version to turn off this
option, upgrade the cluster and turn it back on with the new key. This
needs to be documented once the patch is accepted.
Fixes: bz#1698042
Change-Id: I30422ba6496208e21191a8d78ad29b2e21078664
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
| |
fixes: bz#1727248
Change-Id: Iea289032a8feecf2945668d3fb44a6a53089fdea
Signed-off-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
|
|
|
|
|
|
|
|
| |
Move some variables around to reduce compiler padding.
Change-Id: I9cd53ce257fb169270c295ac60d58d4a6a950d4f
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: get-state does not show correct brick status if brick
status is not Started, it always shows started if any value
is set brickinfo->status
Solution: Check the value of brickinfo->status to show correct status
in get-state
Change-Id: I12a79619024c2cf59f338220d144f2f034059b3b
fixes: bz#1726906
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We are logging a warning message saying unknown-key for
tier-enabled kay. although the tier xlator is deprecated,
this key is left behind for handling the peer rejection
issues in a heterogeneous cluster. We need not to log if
this key is not found/recognised.
updates: bz#1193929
Change-Id: Ia68661898a618f99a240ca8d8a124ff6a65ebe9d
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
snapview server xlator makes use of "localhost" as the volfile server while
initing the new glfs instance to talk to a snapshot. While localhost is fine,
better use the same volfile server that was used to start the snapshot
daemon containing the snapview-server xlator.
Change-Id: I4485d39b0e3d066f481adc6958ace53ea33237f7
fixes: bz#1725211
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Fixed a memleak in dht_rename_cbk when creating
a linkto file.
Change-Id: I705adef3cb79e33806520fc2b15558e90e2c211c
fixes: bz#1722698
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glusterfs-fops.h was moved to rpc/xdr to support compound fops.
(ref: https://review.gluster.org/14032, 2f945b86d3)
This was fine as long as all these header files were in single
include directory after 'install'. With the move to separate out
glusterfs specific header files into another directory inside
/usr/include (ref: https://review.gluster.org/21746, 20ef211cfa),
glusterfs-fops.h file was not in the proper path when an external
.c file tried to include any of glusterfs specific .h file (like
xlator.h).
Now, we have removed compound-fops, with that, none of the enums
declared in glusterfs-fops.h are actually getting used on wire
anymore. Hence, it makes sense to get this to libglusterfs/src
as a single point of definition. With this change, the external
programs can use glusterfs header files.
also remove some enum definitions which are not used in code
anymore.
Updates: bz#1636297
Change-Id: I423c44d3dbe2efc777299c544ece3cb172fc7e44
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to man page:
mount -t glusterfs [-o <options>] <server1>,<server2>,
<server3>,..<serverN>:/<volname>[/<subdir>] <mount_point>
When we try
mount -t glusterfs 52:54:00:23:5a:b6,52:54:00:a0:be:ed:/ta-vol1 /mnt/ta
we see the below msg in log:
[2019-06-17 11:41:36.085809] I [MSGID: 100030] [glusterfsd.c:2867:main] 0-/usr/local/sbin/glusterfs: Started running /usr/local/sbin/glusterfs version 7dev (args: /usr/local/sbin/glusterfs --process-name fuse --volfile-server=52:54:00:23:5a --volfile-id=/ta-vol1 /mnt/ta)
With this change, we'll able to give multiple volfile servers
while mounting the volume using ipv6.
After the change, I see the following in log:
[2019-06-17 12:00:21.183658] I [MSGID: 100030] [glusterfsd.c:2867:main] 0-/usr/local/sbin/glusterfs: Started running /usr/local/sbin/glusterfs version 7dev (args: /usr/local/sbin/glusterfs --process-name fuse --volfile-server=52:54:00:23:5a:b6 --volfile-server=52:54:00:a0:be:ed --volfile-id=/ta-vol1 /mnt/ta)
fixes: bz#1719290
credits: Aga <aga_1990@hotmail.com>
Change-Id: Icf89bea3ba15d8374ef428aeb59f2ef55ad544ec
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Fixes following:
https://build.gluster.org/job/clang-scan/744/clangScanBuildBugs/browse/report-dd8d31.html#EndPath
https://build.gluster.org/job/clang-scan/744/clangScanBuildBugs/browse/report-89a1cd.html#EndPath
Updates: bz#1622665
Change-Id: Ibd201fb2ca54ae7ae3fed8a8d87815358b614349
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gluster volume create <VOLNAME> replica 2 thin-arbiter 1 <host1>:<brick1> <host2>:<brick2>
<thin-arbiter-host>:<path-to-store-replica-id-file> [force]
The changes have been made in a way that the last brick in the bricks list
will be treated as the thin-arbiter.
GD1 will be manipulated to consider replica count to be as 2 and continue creating the
volume like any other replica 2 volume but since thin-arbiter volumes need ta-brick
client xlator entries for each subvolume in fuse volfile, volfile generation is
modified in a way to inject these entries seperately in the volfile for every subvolume.
Few more additions -
1- Save the volinfo with new fields ta_bricks list and thin_arbiter_count.
2- Introduce a new option client.ta-brick-port to add remote-port to ta-brick xlator entry
in fuse volfiles. The option can be set using the following CLI syntax -
gluster volume set <VOLNAME> client.ta-brick-port <PORTNO.>
3- Volume Info will contain a Thin-Arbiter-path entry to distinguish
from other replicate volumes.
Change-Id: Ib434e2313b29716f32476c6c211d282c4ef39406
Updates #687
Signed-off-by: Vishal Pandey <vpandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were cleaning xlator from botton to top, which might
lead to problems when upper xlators trying to access
the xlator object loaded below.
One such scenario is when fd_unref happens as part of the
fini call which might lead to calling the releasedir to
lower xlator. This will lead to invalid mem access
Change-Id: I8a6cb619256fab0b0c01a2d564fc88287c4415a0
Updates: bz#1716695
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two reasons:
* ping responses from glusterd may not be relevant for Halo
replication. Instead, it might be interested in only knowing whether
the brick itself is responsive.
* When a brick is killed, propagating GF_EVENT_CHILD_PING of ping
response from glusterd results in GF_EVENT_DISCONNECT spuriously
propagated to parent xlators. These DISCONNECT events are from the
connections client establishes with glusterd as part of its
reconnect logic. Without GF_EVENT_CHILD_PING, the last event
propagated to parent xlators would be the first DISCONNECT event
from brick and hence subsequent DISCONNECTS to glusterd are not
propagated as protocol/client prevents same event being propagated
to parent xlators consecutively. propagating GF_EVENT_CHILD_PING for
ping responses from glusterd would change the last_sent_event to
GF_EVENT_CHILD_PING and hence protocol/client cannot prevent
subsequent DISCONNECT events
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Fixes: bz#1716979
Change-Id: I50276680c52f05ca9e12149a3094923622d6eaef
|
|
|
|
|
|
| |
Change-Id: I0cb5320fea71306e0283509ae47024f23874b53b
fixes: bz#1723761
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were using glusterfs_graph_fini to free the xl rec from
glusterfs_process_volfp as well as glusterfs_graph_cleanup.
Instead we can use glusterfs_graph_deactivate, which is does
fini as well as other common rec free.
Change-Id: Ie4a5f2771e5254aa5ed9f00c3672a6d2cc8e4bc1
Updates: bz#1716695
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
* reverting changes made in
https://review.gluster.org/#/c/glusterfs/+/21686/
* Now storage.reserve can take value in percent or bytes
fixes: bz#1651445
Change-Id: Id4826210ec27991c55b17d1fecd90356bff3e036
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
get_real_filename is implemented as a virtual extended attribute to help
Samba implement the case-insensitive but case preserving SMB protocol
more efficiently. It is implemented as a getxattr call on the parent directory
with the virtual key of "get_real_filename:<entryname>" by looking for a
spelling with different case for the provided file/dir name (<entryname>)
and returning this correct spelling as a result if the entry is found.
Originally (05aaec645a6262d431486eb5ac7cd702646cfcfb), the
implementation used the ENOENT errno to return the authoritative answer
that <entryname> does not exist in any case folding.
Now this implementation is actually a violation or misuse of the defined
API for the getxattr call which returns ENOENT for the case that the dir
that the call is made against does not exist and ENOATTR (or the synonym
ENODATA) for the case that the xattr does not exist.
This was not a problem until the gluster fuse-bridge was changed
to do map ENOENT to ESTALE in 59629f1da9dca670d5dcc6425f7f89b3e96b46bf,
after which we the getxattr call for get_real_filename returned an
ESTALE instead of ENOENT breaking the expectation in Samba.
It is an independent problem that ESTALE should not leak out to user
space but is intended to trigger retries between fuse and gluster.
But nevertheless, the semantics seem to be incorrect here and should
be changed.
This patch changes the implementation of the get_real_filename virtual
xattr to correctly return ENOATTR instead of ENOENT if the file/directory
being looked up is not found.
The Samba glusterfs_fuse vfs module which takes advantage of the
get_real_filename over a fuse mount will receive a corresponding change
to map ENOATTR to ENOENT. Without this change, it will still work
correctly, but the performance optimization for nonexisting files is
lost. On the other hand side, this change removes the distinction
between the old not-implemented case and the implemented case.
So Samba changed to treat ENOATTR like ENOENT will not work correctly
any more against old servers that don't implement get_real_filename.
I.e. existing files will be reported as non-existing
Change-Id: I971b427ab8410636d5d201157d9af70e0d075b67
fixes: bz#1722977
Signed-off-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
| |
This patch enables the lock contention notification by default.
Change-Id: I10131b026a7cb09fc7c93e1e6c8549988c1d7751
Fixes: bz#1717754
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A dict was passed to a function that calls dict_unref() without taking
any additional reference. Given that the same dict is also used after
the function returns, this was causing a use-after-free situation.
To fix the issue, we simply take an additional reference before calling
the function.
Fixes: bz#1723890
Change-Id: I98c6b76b08fe3fa6224edf281a26e9ba1ffe3017
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
| |
...otherwise this leads to a crash when volume status is run on a
heterogeneous mode.
Fixes: bz#1723658
Change-Id: I0d39f412b2e5e9d3ef0a3462b90b38bb5364b09d
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: glusterd populate unix socket file name based
on volname and if volname is lengthy socket
system call's are failed due to breach maximum length
is defined in the kernel.
Solution:Convert unix socket name to hash to resolve the issue
Change-Id: I5072e8184013095587537dbfa4767286307fff65
fixes: bz#1720566
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
| |
user.* options are just custom and they don't contribute anything in
terms of determining the volume compatibility in brick multiplexing
Fixes: bz#1723402
Change-Id: Ic7e0181ab72993d29cab345cde64ae1340bf4faf
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
| |
* found a bug with quiesce fallocate() - fixed.
* found a bug with cloudsync part of code in posix - fixed
updates: bz#1693692
Change-Id: I4f315ffebb612de072ae08761b8cd0f47714080a
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Some macros were not used, so removed.
Some macros were quite local, so moved to the respective users.
Some macros simplified (removed an allocation here and there)
Change-Id: Ifaf1aff15a78f105b1549ab8053378933b35df43
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the shd mux changes, shd was havinga a logfile
with volname of the first started volume.
This was creating a lot confusion, as other volumes data
is also logging to a logfile which has a different vol name.
With this changes the logfile will be changed to a unique name
ie "/var/log/glusterfs/glustershd.log". This was the same
logfile name before the shd mux
Change-Id: I2b94c1f0b2cf3c9493505dddf873687755a46dda
fixes: bz#1721601
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a small race window, where we have a shd proc
without having a connection. That is when we stopped the
last shd running on a process. The list was removed
outside of a lock just after stopping the process.
So there is a window where we stopped the process, but
the shd proc list contains the entry.
Change-Id: Id82a82509e5cd72acac24e8b7b87197626525441
fixes: bz#1722541
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Race:
Thread-1 Thread-2
1) Does ec_get_size_version() to perform
pre-op fxattrop as part of write-1
2) Calls ec_set_dirty_flag() in
ec_get_size_version() for write-2.
This sets dirty[] to 1
3) Completes executing
ec_prepare_update_cbk leading to
ctx->dirty[] = '1'
4) Takes LOCK(inode->lock) to check if there are
any flags and sets dirty-flag because
lock->waiting_flag is 0 now. This leads to
fxattrop to increment on-disk dirty[] to '2'
At the end of the writes the file will be marked for heal even when it doesn't need heal.
Fix:
Perform ec_set_dirty_flag() and other checks inside LOCK() to prevent dirty[] to be marked
as '1' in step 2) above
Updates bz#1593224
Change-Id: Icac2ab39c0b1e7e154387800fbededc561612865
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|