| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: File ownership is not being preserved for root in geo-rep
mountbroker setup.
Analysis and Cause:
Entry creations for geo-rep is overloaded in ga_setxattr.
It happens in two phase, entry creation followed by setattr
to preserve ownership as in master.
If uid and gid of file being synced is root, setattr was
not being sent down. Since, the file creation happens with
non-root user in mountborker geo-rep setup, if setattr is
not done explicitly, file ownership is not preserved for root.
Solution:
Always pass setattr down in overloaded ga_setxattr.
Change-Id: I062215c1b2379d515f28ec7f271077ad37182c7e
BUG: 1104954
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/9051
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use versioned symbols to keep libgfapi at libgfapi.so.0.0.0
Some nits uncovered:
+ there are a couple functions declared that do not have an
associated definition, e.g. glfs_truncate(), glfs_caller_specific_init()
+ there are seven private/internal functions used by heal/src/glfsheal
and the gfapi master xlator (glfs-master.c): glfs_loc_touchup(),
glfs_active_subvol(), and glfs_subvol_done(), glfs_init_done(),
glfs_resolve_at(), glfs_free_from_ctx(), and glfs_new_from_ctx();
which are not declared in glfs.h;
+ for this initial pass at versioned symbols, we use the earliest version
of all public symbols, i.e. those for which there are declarations in
glfs.h or glfs-handles.h.
Further investigation as we do backports to 3.6, 3.4, and 3.4
will be required to determine if older implementations need to
be preserved (forward ported) and their associated alias(es) and
symbol version(s) defined.
FWIW, we should consider linking all of our libraries with a map, it'll
result in a cleaner ABI. Perhaps something for an intern to do or a
Google Summer of Code project.
Change-Id: I499456807a5cd26acb39843216ece4276f8e9b84
BUG: 1160709
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/9036
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In brick statedump file the barriered fop's gfid was showing 0 when
statedump was taken. This is because of statedump code was not
referring to correct gfid.
With this change statedump code will use correct gfid and gfid will
not be 0 in statedump file when barrier is enable and user takes
statedump of volume.
Change-Id: Ia296cba7e132402df53c602daa160c1c2cd21245
BUG: 1099369
Reviewed-on: http://review.gluster.org/7893
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glusterd op-sm infrastructure has some loophole in handing error cases in
locking/unlocking phases which ends up having stale locks restricting
further transactions to go through.
This patch still doesn't handle all possible unlocking error cases as the
framework neither has retry mechanism nor the lock timeout. For eg - if
unlocking fails in one of the peer, cluster wide lock is not released and
further transaction can not be made until and unless originator node/the node
where unlocking failed is restarted.
Following test cases were executed (with the help of gdb) after applying this
patch:
* RPC timesout in lock cbk
* Decoding of RPC response in lock cbk fails
* RPC response is received from unknown peer in lock cbk
* Setting peerinfo in dictionary fails while sending lock request for first peer
in the list
* Setting peerinfo in dictionary fails while sending lock request for other
peers
* Lock RPC could not be sent for peers
For all above test cases the success criteria is not to have any stale locks
Change-Id: Ia1550341c31005c7850ee1b2697161c9ca04b01a
BUG: 1154635
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/9012
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes portability problems so that NetBSD passes tests/features/glupy.t
- Use python-config to detect python build environment on all systems,
not just Linux and Darwin.
- Get the site-package directory from python and make sure we install
glupy.py there, Previously we installed within glusterfs prefix,
which caused a problem if it was different that python's prefix.
- Set PYTHONPATH for tests so that the detected site-packages is used
in python's search path. This should be useless, but let us have it
just in case.
- Pass glupy.so path from glusterfsd to glupy.py through an
environment variable and use it in CDLL instead of "", as the
later seems not portable (at least it fails on NetBSD).
- Use gil_init_key pthread_getspecific to avoid deadlocks (that
code was #ifdef out, perhaps because it was not needed on Linux,
but it seems to be required for NetBSD.
- Recover the error message from Python and send it to the logs
to help debugging problems.
BUG: 1129939
Change-Id: Icc71e77d6940f0759cc14c5c5cf7ca6fa431e0d2
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/8978
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
set help"
gluster volume set help for uss shows "User Servicable Snapshots"
whereas it should be "User Serviceable Snapshots"
Change-Id: I3cc8b3ea2cb6d209e1a12678eb7d0e68f4160d99
BUG: 1160236
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/9041
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bulk remove xattr is internal fop in gluster. Some of the xattrs may have
special behavior. Ex: removexattr("posix.system_acl_access"), removes more than
one xattr on the file that could be present in the bulk-removal request.
Removexattr of these deleted xattrs will fail with either ENODATA/ENOATTR.
Since all this fop cares is removal of the xattrs in bulk-remove request and
if they are already deleted, it can be treated as success.
Change-Id: Id8f2a39b68ab763ec8b04cb71b47977647f22da4
BUG: 1160509
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9049
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When quorum is enabled and the fop fails on all the subvolumes,
op_errno is set to EROFS which overrides the actual errno returned
from bricks.
Fix:
Don't override the errno when fop fails on all subvols.
Change-Id: I61e57bbf1a69407230ec172a983de18d1c624fd2
BUG: 1157976
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8984
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Harshavardhana <harsha@harshavardhana.net>
Tested-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Cases where loc->path is NULL, the current code in create/open/mkdir
would copy the same blindly and as a result coredump.
This is a preventive fix for the coredump. The reason for loc->path
to be NULL in certain cases is yet to be determined.
One such case is when resolve_loc_touchup fails to get inode_path due
to loops in the inode table.
Change-Id: Ic2ddf2cc9f2acaf9b939afc11afd193b4402ee7c
BUG: 1159221
Signed-off-by: Shyam <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/9029
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I195b3309bae7e684b7dbf771e4f3b4778d0dac4c
BUG: 1146377
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/8843
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A lookup on a linkto file whose trusted.glusterfs.dht.linkto
xattr points to a subvol that is not part of the volume
can cause the brick process to segfault due to a null dereference.
Modified to check for a non-null value before attempting to access
the variable.
Change-Id: Ie8f9df058f842cfc0c2b52a8f147e557677386fa
BUG: 1159571
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/9034
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: venkatesh somyajulu <vsomyaju@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: glusterd crashes in non-originator slave node during geo-rep
create push-pem.
Cause: In glusterd_op_copy_file, the value of the key "common_pem_contents"
is freed explicitly even after dict_set is successful when it is
taken cared by dict_free.
Solution: Free only in failure cases before dict_set.
Change-Id: I65b5f32ee2b946107ad279b1fe3d728ec699bc7e
BUG: 1159119
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/9018
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Poornima G <pgurusid@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
quota_statfs() returns aggregated details of space usage
of bricks this causes distribute to be confused during
``rebalance``, where ``statfs()`` values are used to
schedule file migration.
We can make sure the values of ``statfs`` are from
individual bricks by selectively instructing
``quota_statfs()`` to return non aggregated values.
Change-Id: I1397faeee66a1b9c26709cfda693286d227a4170
BUG: 1158262
Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-on: http://review.gluster.org/8996
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Take entrylks in xlator domain before doing post-op (undo-pending) in
entry self-heal. This is to prevent a parallel name self-heal on
an entry under @fd->inode from reading pending xattrs while it is
being modified by SHD after entry sh below, given that
name self-heal takes locks ONLY in xlator domain and is free to read
pending changelog in the absence of the following locking.
Change-Id: Ie083ceab10155c460447f04bdce7688480f1ac4f
BUG: 1128721
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/9020
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like enabling SSL, enabling own-thread has to be done separately for
clients and servers.
* client.own-thread for clients (including internal like self-heal)
* server.own-thread for servers (including e.g. glusterd)
It's very unlikely that you would ever want to set one without the
other, but they're separate anyway just in case. Check for "private
polling thread" in the relevant logs to make sure the option took
effect, because otherwise you might not notice any difference besides
inreased performance. ;)
Change-Id: Ifaee8de52f0b959bcdf7f6b56faeee549ee56604
BUG: 1158648
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/8931
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Id5f9d5a23eb5932a0a53520b08ffba258952e000
BUG: 1151004
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/8999
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The rebalance state was being saved only on the peers participating in
the rebalance on a rebalance start. This change makes sure all nodes
save the rebalance state.
Change-Id: I436e5c34bcfb88f7da7378cec807328ce32397bc
BUG: 1157979
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/8998
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On non Linux systems, we check that seekdir() succeeds and we return
EINVAL if it does not. We need this to avoid infinite loops if some
other component in GlusterFS makes an invalid seekdir() usage. This
was introduced in this change: http://review.gluster.org/#/c/8760/
But seekdir() also fails when using the offset returned for the
last entry, and this is expected behavior. As a result, the seekdir()
test produces a spurious EINVAL when reaching end of directory. That
error is not propagated to calling process, but it may harm internal
GlusterFS processing. At least it produce a spurious error message
in brick's log.
We fix the problem by remembering the last entry offset in fd private
data. When a new posix_readdir() invocation requests that offset,
we avoid returning EINVAL.
BUG: 1129939
Change-Id: I4e67a2ea46538aae63eea663dd4aa33b16ad24c7
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/8926
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Valid SETATTR entries are missing in changelog when more
than one metadata operation happen on same inode within
changelog roll-over time.
Cause: Metadata entries with fop num being GF_FOP_NULL are logged
in changelog which is of no use. Since slice version
checking is done for metadata entries to avoid logging of
subsequent entries of same inode falling into same
changelog, if the entry with GF_FOP_NULL is logged first,
subsequent valid ones will be missed.
Solution: Have a boundary condition to log only those fops whose fop
number falls between GF_FOP_NULL and GF_FOP_MAXVALUE.
Change-Id: Iff585ea573ac5e521a361541c6646225943f0b2d
BUG: 1104954
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/8964
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem : op state machine was relying on the global peer list while sending
lock/stage/unlock commit rpc requests to the peers in the cluster. Trusting on
global peer list structure is dangerous as this structure gets modified if any
peer modification command is attempted in the cluster when there is a ongoing
transaction going through the state machine. An ideal usecase of this problem
when rebalance is in progress and peer probe is executed rebalance op-sm and
peer probe may run into race making peerinfo structure go for toss.
Solution: Use local copy of peer list (xaction_peers) in glusterd op-sm.
Change-Id: I1ff7118dc6a9a72633e2e87b7ab7bae1796595e0
BUG: 1152890
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/8932
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue: stat() on XFS has a check for the filesystem status but
ext4 does not.
Fix: Replacing stat() call with open, write and read to a new file under the
"brick/.glusterfs" directory. This change will work for xfs, ext4 and other
fileystems.
Change-Id: Id03c4bc07df4ee22916a293442bd74819b051839
BUG: 1130242
Signed-off-by: Lalatendu Mohanty <lmohanty@redhat.com>
Reviewed-on: http://review.gluster.org/8213
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The used once MAKE_HTIME_FILE_PATH macro uses strcpy and strcat into a
fixed buffer without checking the input lengths.
Recommend replacing with a snprintf.
Change-Id: Ia0245096774dc84be1b937e1d5750f3634fff034
BUG: 1099645
Reported-by: Keith Schincke <kschinck@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/8977
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I2bd34f063d6bf1835d5ae57a8e9aa03f3ec3deb3
BUG: 1156404
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/8972
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
correct before doing any fop
The following operations might lead to problems:
* Create a file on the glusterfs mount point
* Create a snapshot (say "snap1")
* Access the contents of the snapshot
* Delete the file from the mount point
* Delete the snapshot "snap1"
* Create a new snapshot "snap1"
Now accessing the new snapshot "snap1" gives problems. Because the inode and
dentry created for snap1 would not be deleted upon the deletion of the snapshot
(as deletion of snapshot is a gluster cli operation, not a fop). So next time
upon creation of a new snap with same name, the previous inode and dentry itself
will be used. But the inode context contains old information about the glfs_t
instance and the handle in the gfapi world. Directly accessing them without
proper check leads to ENOTCONN errors. Thus the glfs_t instance should be
checked before accessing. If its wrong, then right instance should be obtained
by doing the lookup.
Change-Id: Idca0c8015ff632447cea206a4807d8ef968424fa
BUG: 1151004
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
Reviewed-on: http://review.gluster.org/8917
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The device to get the inode size from does not get passed to the tool
(tune2fs, xfs_info or the like) that is called. This is probably just an
oversight. While correcting this, cleanup some bits of the function too.
Change-Id: Ida45852cba061631fb304bc7dd5286df1a808010
BUG: 1130462
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/8492
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some issues in ec xlator made that rebalance didn't complete
successfully and generated some warnings and errors in the
log. The most critical error was a race condition that caused
false corruption detection when two specific operations were
executed sequentially and they shared the same lock.
This explains the problem:
1. A setxattr is issued.
2. setxattr: ec locks the inode before updating the xattr.
3. setxattr: The xattr is updated.
4. setxattr: Upper xlator is notified that the operation completed.
5. setxattr: A background task is initiated to update the version
of the file.
6. A stat is issued on the same file.
7. stat: Since the lock is already acquired, it's reused.
8. stat: A lookup is issued to determine version and size
information of the file.
At this point, operations 5 and 8 can interfere. This can make that
lookup sees different information on each brick, determining that
some bricks are corrupted and incorrectly excluding them from the
operation and initiating a self-heal. In some cases this false
detection combined with self-heal could lead to invalid updates of
the trusted.ec.size xattr, leaving the file smaller than it should
be.
This only happens if the first operation does not perform a lookup,
because chained operations reuse the information returned by the
previous one, avoiding this kind of problems.
To solve this, now the background update is executed atomically with
the posterior unlock. This avoids some reuses of the lock while
updating. However this reduces performance because the window in
which new requests can reuse the lock is much smaller now. This has
been alleviated by using the same technique implemented in AFR (i.e.
waiting some time before releasing the lock).
Some minor changes also introduced in this patch:
* Bug in management of 'trusted.glusterfs.pathinfo' that was writing
beyond the allocated space.
* Uninitialized variable.
* trusted.ec.config was not created for regular files created with
mknod.
* An invalid state was used in access fop.
Change-Id: Idfaf69578ed04dbac97a62710326729715b9b395
BUG: 1152902
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/8947
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PROBLEM:
Geo-rep misses few a files to sync when I/O happenned during
geo-rep start.
ANALYSES:
To use the available changelogs to handle deletes/renames,
'xsync upper limit' is introduced which limits the xsync
crawl till the changelog register time. But there is a
small time interval between the changelog register time and
the time changelog actually enabled. If there is I/O between
this interval, it will not be synced through xsync as it is
beyond changelog register time and not through changelog also
as changelog is not actually enabled.
SOLUTION:
Enable changelog and marker during geo-rep create instead
of geo-rep start so that entries are captured in changelog
and above said interval is nullified.
Change-Id: Ic5f0457a4b67a335cbbb37d34db5f8cb8bc901c4
BUG: 1139196
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/8650
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Doing an 'ls' of a directory that has been modified while one
of the bricks was down, sometimes returns the old directory
contents.
Cause: Directories are not marked when they are modified as files are.
The ec xlator balances requests amongst available and healthy
bricks. Since there is no way to detect that a directory is
out of date in one of the bricks, it is used from time to time
to return the directory contents.
Solution: Basically the solution consists in use versioning information
also for directories, however some additional changes have
been necessary.
Changes:
* Use directory versioning:
This required to lock full directory instead of a single entry for
all requests that add or remove entries from it. This is needed to
allow atomic version update. This affects the following fops:
create, mkdir, mknod, link, symlink, rename, unlink, rmdir
Another side effect is that opendir requires to do a previous
lookup to get versioning information and discard out of date
bricks for subsequent readdir(p) calls.
* Restrict directory self-heal:
Till now, when one discrepancy was found in lookup, a self-heal
was automatically started. This caused the versioning information
of a bad directory to be healed instantly, making the original
problem to reapear again.
To solve this, when a missing directory is detected in one or more
bricks on lookup or opendir fops, only a partial self-heal is
performed on it. A partial self-heal basically creates the
directory but does not restore any additional information.
This avoids that an 'ls' could repair the directory and cause the
problem to happen again. With this change, output of 'ls' is
always consistent. However, since the directory has been created
in the brick, this allows any other operation on it (create new
files, for example) to succeed on all bricks and not add additional
work to the self-heal process.
To force a self-heal of a directory, any other operation must be
done on it. For example a getxattr.
With these changes, the correct healing procedure that would avoid
inconsistent directory browsing consists on a post-order traversal
of directoriesi being healed. This way, the directory contents will
be healed before healing the directory itslef.
* Additional changes to fix self-heal errors
- Don't use fop->fd to decide between fd/loc.
open, opendir and create have an fd, but the correct data is in
loc.
- Fix incorrect management of bad bricks per inode/fd.
- Fix incorrect selection of fop's target bricks when there are bad
bricks involved.
- Improved ec_loc_parent() to always return a parent loc as
complete as possible.
Change-Id: Iaf3df174d7857da57d4a87b4a8740a7048b366ad
BUG: 1149726
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/8916
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
stub->fop can be more than FOP_MAX is what static analysis is complaining. This
patch doesn't allow any 'log' to be printed in the case fop value is not in the
definied range. It gives EINVAL instead.
Change-Id: I293381e2c1ad0ab45154b0192a637612becaf744
BUG: 1153935
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8939
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Just after replace-brick the mount logs are filled with ENOENT/ESTALE
warning logs because the file is yet to be self-healed now that the
brick is new.
Fix:
Do conditional logging for the logs. ENOENT/ESTALE will be logged at
lower log level. Only when debug logs are enabled, these logs will
be written to the logfile.
Change-Id: If203d09e2479e8c2415ebc14fb79d4fbb81dfc95
BUG: 1151303
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8918
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some additional 32 bits issues have been added by a recent patch.
This patch solves it.
Change-Id: Ice81032fbe8e36e5ccad19a781b7876891993906
BUG: 1146903
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/8882
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org>
Tested-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The final lookup made to restore final file attributes after a self-heal
did clear the mask of bad bricks, causing that the final setattr won't
modify any brick at all. This caused that some attriutes, specially the
modification time of the file didn't get updated properly.
Now the mask of healed bricks is saved before doing the last lookup.
It's also used to correctly report the repaired bricks.
Change-Id: Ib94083c9e1b562515dfb54f9574120f1f031dccc
BUG: 1149723
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/8905
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Although glusterd currently has statedump support but it doesn't dump its
context information. Implementing glusterd_dump_priv function to export per-node
glusterd information would be useful for debugging bugs. Once implemented, we
could enhance sos-report to fetch this information. This would potentially
reduce our time to root cause and data needed for debugability can be dumped
gradually.
Following is the main items of the dump list targeted in this patch :
* Supported max/min op-version and current op-version
* Information about peer list
* Information about peer list involved while a transaction is going on
(xaction_peers)
* option dictionary in glusterd_conf_t
* mgmt_v3_lock in glusterd_conf_t
* List of connected clients
* uuid of glusterd
* A section of rpc related information like live connections and their
statistics
There are couple of issues which were found during implementation and testing
phase:
- xaction_peers of glusterd_conf_t was not initialized in init because of which
traversing through this list head was crashing when there was no active
transaction
- gf_free was not setting the typestr to NULL if the the alloc count becomes 0
for a mem-type earlier allocated.
Change-Id: Ic9bce2d57682fc1771cd2bc6af0b7316ecbc761f
BUG: 1139682
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/8665
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks a lot to Niels for helping me to get build stuff right.
Change-Id: I634f24d90cd856ceab3cc0c6e9a91003f443403e
BUG: 1147462
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6529
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When geo-rep is in paused state and a node in a cluster
is rebooted, the geo-rep status goes to "faulty (Paused)"
and no worker processes are started on that node yet. In
this state, when geo-rep is resumed, there is a race in
updating status file between glusterd and gsyncd itself
as geo-rep is resumed first and then status is updated.
glusterd tries to update to previous state and gsyncd
tries to update it to "Initializing...(Paused)" on
restart as it was paused previously. If gsyncd on restart
wins, the state is always paused but the process is not
acutally paused. So the solution is glusterd to update
the status file and then resume.
Change-Id: I348761a6e8c3ad2630c79833bc86587d062a8f92
BUG: 1149982
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/8911
Reviewed-by: Aravinda VK <avishwan@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Tested-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch improves the failure message by printing the correct peer name
instead of a blank uuid in case of rpc connection is lost/broken.
Change-Id: Ia232792051f23896883b239982cb48130e3ce60e
BUG: 1146902
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/8597
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When GlusterD starts the brick processes, these will listen on all
interfaces. When the 'transport.socket.bind-address' option is set in
glusterd.vol, the brick processes should only listen on the specified
hostname or IP-address.
Change-Id: I8e7d1f294904081137c23f3446261329d0d13bba
BUG: 1149863
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/8910
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the transport.socket.bind-address option is set to a hostname or
ip-address, the services started by GlusterD fail to connect to the
management daemon. GlusterD always forces the services to connect to the
"localhost" hostname, even if it is not listening on that address.
GlusterD should take the transport.socket.bind-address option into
consideration, and pass that to the glusterfs-clients with the -s or
--volfile commandline parameter.
Note that this is not a change that removes all hard-coded dependencies
on "localhost". This change merely makes it possible to start required
services when the transport.socket.bind-address option is set.
Change-Id: I36a0ed6c69342e6327adc258fea023929055d7f2
BUG: 1149863
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/8908
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After enabling nfs.mount-udp, mounting a subdir on a volume over
NFS fails. Because mountudpproc3_mnt_3_svc() invokes nfs3_rootfh()
which internally calls mnt3_mntpath_to_export() to resolve the
mount path. mnt3_mntpath_to_export() just works if the mount path
requested is volume itself. It is not able to resolve, if the path
is a subdir inside the volume.
MOUNT over TCP uses mnt3_find_export() to resolve subdir path but
UDP can't use this routine because mnt3_find_export() needs the
req data (of type rpcsvc_request_t) and it's available only for
TCP version of RPC.
FIX:
(1) Use syncop_lookup() framework to resolve the MOUNT PATH by
breaking it into components and resolve component-by-component.
i.e. glfs_resolve_at () API from libgfapi shared object.
(2) If MOUNT PATH is subdir, then make sure subdir export is not
disabled.
(3) Add auth mechanism to respect nfs.rpc-auth-allow/reject and
subdir auth i.e. nfs.export-dir
(4) Enhanced error handling for MOUNT over UDP
Change-Id: I42ee69415d064b98af4f49773026562824f684d1
BUG: 1118311
Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com>
Reviewed-on: http://review.gluster.org/8346
Reviewed-by: soumya k <skoduri@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Operations processed by ec_dispatch_one() were not correctly
completed by ec_complete(), leaving some structures in memory.
Now ec_complete() also calls ec_resume() for this type of fops.
Change-Id: Iaf0f2e8227399ebb735db9f1bd007593e0ece041
BUG: 1148520
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/8896
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) Use a system-dependent macro for umount(8) location instead of
relying on $PATH to find it, for security and portability sake.
2) Introduce gf_umount_lazy() to replace umount -l (-l for lazy) invocations,
which is only supported on Linux; On Linux behavior in unchanged. On other
systems, we fork an external process (umountd) that will take care of
periodically attempt to unmount, and optionally rmdir.
BUG: 1129939
Change-Id: Ia91167c0652f8ddab85136324b08f87c5ac1e51d
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/8649
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Csaba Henk <csaba@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
POSIX mandates the filesystem to support paths of lengths up to
_XOPEN_PATH_MAX (1024). This is the PATH_MAX limit here:
http://pubs.opengroup.org/onlinepubs/009604499/basedefs/limits.h.html
When using a path of 1023 bytes, the posix xlator attempts to create
an absolute path by prefixing the 1023 bytes path by the brick
base path. The result is an absolute path of more than _XOPEN_PATH_MAX
bytes which may be rejected by the backend filesystem.
Linux's ext3fs PATH_MAX seems to defaut to 4096, which means it
will work (except if brick base path is longer than 2072 bytes but
it is unlikely to happen. NetBSD's FFS PATH_MAX defaults to 1024,
which means the bug can happen regardless of brick base path length.
If this condition is detected for a brick, the proposed fix is to
chdir() the brick glusterfsd daemon to its brick base directory.
Then when encountering a path that will exceed _XOPEN_PATH_MAX once
prefixed by the brick base path, a relative path is used instead
of an absolute one. We do not always use relative path because some
operations require an absolute path on the brick base path itself
(e.g.: statvfs).
At least on NetBSD, this chdir() uncovers a race condition which
causes file lookup to fail with ENODATA for a few seconds. The
volume quickly reaches a sane state, but regression tests are fast
enough to choke on it. The reason is obscure (as often with race
conditions), but sleeping one second after the chdir() seems to
change scheduling enough that the problem disapear.
Note that since the chdir() is done if brick backend filesystem
does not support path long enough, it will not occur with Linux
ext3fs (except if brick base path is over 2072 bytes long).
BUG: 1129939
Change-Id: I7db3567948bc8fa8d99ca5f5ba6647fe425186a9
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/8596
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Harshavardhana <harsha@harshavardhana.net>
Tested-by: Harshavardhana <harsha@harshavardhana.net>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I4504f3050674dde217e79af28cb4d2b5370fe2d5
BUG: 1148010
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/8891
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Emmanuel Dreyfus <manu@netbsd.org>
Tested-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NFSv3 does not have a TRUNCATE procedure, instead it is part of the
SETATTR (change the 'size' attribute). SETATTR with a new 'size'
succeeds on other NFS-servers, even when the owner of the file does not
have write permissions. Make Gluster/NFS behave the same way, by
checking if the RPC/pid comes from the NFS-server, and allow truncate()
when the file is owned by the user calling SETATTR.
BUG: 955753
Change-Id: I4b7cb8efe5a2032c6cd2eef6af610032f76d8b39
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/8889
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: soumya k <skoduri@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All the special cases v1 handles and also
self-accusing pending changelog from v1 pre-op also is handled
in this patch.
Change-Id: Ie10f71633fb20276f01ecafbd728f20483e7029c
BUG: 1128721
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8536
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
enable
The pgfid extended attributes are used to construct the ancestry path
(from the file to the volume root) for nameless lookups on files.
As NFS relies on nameless lookups heavily, quota enforcement through NFS
would be inconsistent if quota were to be enabled on a volume with
existing data.
Solution is to heal the pgfid extended attributes as a part of lookup
perfomed by quota-crawl process. In a posix lookup check for pgfid xattr
and if it is missing set the xattr.
Change-Id: I5912ea96787625c496bde56d43ac9162596032e9
BUG: 1147378
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/8878
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to POSIX, seekdir() should only be given offset obtained from
telldir() on the same DIR *
http://pubs.opengroup.org/onlinepubs/9699919799/functions/seekdir.html
Code from afr-self-heald.c and index.c is operating outside of the
specification, by doing using seekdir() with offset from a previously
open/close/re-open directory. This seems to work on Linux (although with
no guarantee it will always in the future). On NetBSD the seekdir()
with a in invalid offset is a nilpotent operation, and causes an infinite
loop, since index_fill_readdir() always restart from the beginning of the
directory.
The situation is fixed by using a non anonymous fd in afr-self-heald.c:
we explicitely open the directory so that it remains open on the brick
side during the timeframe where we want to reuse offsets in seekdir().
This requires adding an opendir fop in index xlator.
If the brick was not updated, the opendir will fail and we fallback
to the standard violating approach for backward compatibility on Linux.
On other systems we fail since it never worked.
While there, add tests to check seekdir() success in index and posix
xlators, so that incorrect usage from calling code produce an explicit
error instead of an infinite loop. We can only do it on non Linux systems,
for the sake of backward compatibility when the brick was updated but
not the client.
BUG: 1129939
Change-Id: I88ca90acfcfee280988124bd6addc1a1893ca7ab
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/8760
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently in case of multi node cluster brick-order check for
replicate volume done on every node. Its waste of time to perform
brick order check on every node.
This change will perform brick order check only at originator node.
Change-Id: I8687fd28e587de8a280a9003b015ccd5729c9740
BUG: 1091935
Signed-off-by: ggarg <ggarg@redhat.com>
Reviewed-on: http://review.gluster.org/8881
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Tested-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem : When a lookup is issued, and if the entry is not found
then snapview-client will log failure stating that
"Lookup on normal graph failed with error Stale file handle"
irrespective of type of graph it received call back from.
Solution : Introduced a check to find out the graph from which
the snapview-client received call-back.
Change-Id: Iadd5b525c394be3675d40231711058e1cf1396cd
BUG: 1146479
Signed-off-by: Sachin Pandit <spandit@redhat.com>
Reviewed-on: http://review.gluster.org/8851
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
File goes into split-brain because of wrong erasing of xattrs.
RCA:
The issue happens because index self-heal is triggered even before all the
bricks are up. So what ends up happening while erasing the xattrs is, xattrs
are erased only on the sink brick for the brick that it thinks is up leading to
split-brain
Example:
lets say the xattrs before heal started are:
brick 2:
trusted.afr.vol1-client-2=0x000000020000000000000000
trusted.afr.vol1-client-3=0x000000020000000000000000
brick 3:
trusted.afr.vol1-client-2=0x000010040000000000000000
trusted.afr.vol1-client-3=0x000000000000000000000000
if only brick-2 came up at the time of triggering the self-heal only
'trusted.afr.vol1-client-2' is erased leading to the following xattrs:
brick 2:
trusted.afr.vol1-client-2=0x000000000000000000000000
trusted.afr.vol1-client-3=0x000000020000000000000000
brick 3:
trusted.afr.vol1-client-2=0x000010040000000000000000
trusted.afr.vol1-client-3=0x000000000000000000000000
So the file goes into split-brain.
Change-Id: I1185713c688e0f41fd32bf2a5953c505d17a3173
BUG: 1142601
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8755
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|