| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Running "rm -rf" on a tiered volume sometimes caused
the client to crash because dht_readdirp_cbk referenced
a NULL layout for the hot tier subvol.
Now, entries are skipped if the layout is NULL. This
can cause "rm -rf" to fail with ENOTEMPTY rmdir failures.
Change-Id: Idd71a9d0f7ee712899cc7113bbf2cd3dcb25808b
BUG: 1307208
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/13440
Smoke: Gluster Build System <jenkins@build.gluster.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The start command doesnt restart the tier deamon if the deamon
is running at one node. hence to bring up the tierd on the nodes
where the deamon is down, the force command is implemented.
It skips the check for tierd running.
Change-Id: I0037d3e5ecfe56637d0da201a97903c435d26436
BUG: 1292112
Signed-off-by: hari gowtham <hgowtham@redhat.com>
Reviewed-on: http://review.gluster.org/12983
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
What:
If dht_open is called on a migrating file after the inode_ctx is set,
subsequent FOPs on that fd do not open the fd on the dst subvol.
This is seen when the open-ftruncate-close sequence is repeatedly
called on a migrating file.
A second call to the sequence described above causes dht_truncate_cbk
to call dht_truncate2 as the dht_inode_ctx was already set by the first
call. As dht_rebalance_in_progress_check is not called, the fd is not
opened on the dst subvol.
On a distributed-replicate volume, this causes AFR to
open the fd using afr_fix_open, but with the wrong flags, causing
posix_ftruncate to fail with EINVAL.
The fix: We require fd specific information to make a decision while
handling migrating files.
Set the fd_ctx to indicate the fd has been opened on the dst subvol
and check if it has been set while processing Phase1/Phase2 checks
in the FOP callback functions.
Change-Id: I43cdcd8017b4a11e18afdd210469de7cd9a5ef14
BUG: 1284823
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/12985
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we are creating data file in a hot subvolume
then we will create a linkfile in cold subvolume.
Linkfile creation happens first. If linkfile creation
was successful and data file creation failed, then
linkfile in cold subvolume will become stale.
This patch will delete the linkfile as well, if data
file creation fails.
Also this code duplicates dht_create to make tier_create
Change-Id: I377a90dad47f288e9576c7323b23cf694a91a7a3
BUG: 1290677
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/12948
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Currently even if we have received xattrs from any one of
the subvolume, we unwind with error in case the last subvol (which
unwinds) received a negative response.
To handle the case check if any of the subvolume has received
a response and pass it down.
Change-Id: Ia12a1f9671a6764f7550e6dc223324b1039fcc51
BUG: 1287539
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/12845
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
files deleted during promotion were not deleting as the
files are moving from hashed to non-hashed.
On deleting a file that is undergoing promotion,
the unlink call is not sent to the dst file as the
hashed subvol == cached subvol. This causes
the file to reappear once the migration is complete.
This patch also fixes a problem with stale linkfile
deleting.
Change-Id: I4b02a498218c9d8eeaa4556fa4219e91e7fa71e5
BUG: 1282390
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/12829
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After detach tier start, creates are still going to hot
tier. Because when creating data files we are not checking for
decommissioned bricks.
Change-Id: I8e28258d9b2367dcc8ad6e5e91d0e54d92fdf771
BUG: 1289602
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/12914
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glusterd occasionally loads shared libraries of translators. This
failed for tiering due to a reference to dht_methods which is defined
as a global variable which is not necessary.
The global variable has been removed and this is now a member of
dht_conf and is now initialised in the *_init calls.
Change-Id: Ifa0a21e3962b5cd8d9b927ef1d087d3b25312953
BUG: 1287842
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/12863
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible a file would get migrated in the middle
of a readdir operation. If there are four subvolumes A,B,C,D,
and if readdir reads them in order and reaches subvol B,
then, if a file is moved from D to A, it will not be included
in the readdir output.
This phenonema has pre-existed in DHT migration but is more
apparent in tiering.
When a file is moved off the hashed subvolume a T file is created.
For tiering, we will make the cold subvolume the hashed subvolume.
This will ensure the creation of a T file. Readdir will not skip T
files in the tier translator.
Making the cold subvolume the hashed subvolume ensures the T
files created on promotions or creates will be less likely to
fill the volume.
Creates still put the data on the hot subvolume.
Change-Id: Ifde557d3d0e94a4570ca9f115adee3db2ee75407
BUG: 1281598
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/12530
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ic1393d44a9ed4aaba23d7c9ddea45977b9dae5e4
BUG: 1281265
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/12574
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I0c4c72e2f5a9f8a7c60ef65251c596b54de89479
BUG: 1279705
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/12559
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Susant Palai <spalai@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a part of CHILD_MODIFIED event DHT forgets the current layout and
performs fresh lookup. However this is not required when a replica pair
goes offline as the xattrs can be read from other replica pairs. Hence
setting different event to handle replica pair going down.
Change-Id: I5ede2a6398e63f34f89f9d3c9bc30598974402e3
BUG: 1281230
Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/12573
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After a successful nameless lookup if the directory is not
present on any of the subvol, then we will get the path of
the directory and will recursively send a named lookp on
each parent directory.
This will help particularly for the scenarios like add brick
and attach-tier.
Change-Id: I64c2118a5ab03bbaa59b0dfc62babdf4472a92a3
BUG: 1272949
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/12376
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit bb2370514598a99e6ab268af81df57dc16caa2c5.
issue and impact: readdirp_cbk was not resetting the layout for files,
this causes problem if the files is moved from one cached subvolume
and if the layout was not proper, then there is chance to fail
entry fops if the fops executed with out a lookup. Because the
cached subvolume will not change and the application assumes the
presence of file in cached subvol. so it fails with ENOENT.
The patch preset the layout information in readdirp cbk
for each files in the entry. That leaves the problem the commit
bb2370514598a99e6ab268af81df57dc16caa2c5 try to fix. We will fix the
problem in a separate patch.
Change-Id: I878ec32f44edde2fb9d4f132d9b1b547cde993d9
BUG: 1272949
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/12449
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a quota is disable and the clean-up process terminated
without completely cleaning-up the quota xattrs.
Now when quota is enabled again, this can mess-up the accounting
A version number is suffixed for all quota xattrs and this version
number is specific to marker xaltor, i.e when quota xattrs are
requested by quotad/client marker will remove the version suffix in the
key before sending the response
Change-Id: I1ca2c11460645edba0f6b68db70d476d8d26e1eb
BUG: 1272411
Signed-off-by: vmallika <vmallika@redhat.com>
Reviewed-on: http://review.gluster.org/12386
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Snaps of tiered volumes cannot handle files undergoing migration.
We implement a helper mechanism to "pause" migration. Any files
undergoing migration are aborted. Clean up is done to remove
sticky bits and data at the destination. Migration is restarted
after snap completes.
For testing an internal switch is added. It is not exposed externally.
gluster volume set vol1 tier-pause [true|false]
Change-Id: Ia85bbf89ac142e9b7e73fcbef98bb9da86097799
BUG: 1267950
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/12304
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Determine which DHT level is responsible for
handling fops on a file undergoing migration based
on the name of the the linkto xattr set on the file
being migrated and process accordingly.
Change-Id: I82772e39314d4fe7f2ba0dcf22de0c6a374ee139
BUG: 1254428
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/12090
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
unlink fails with invalid argument for files that
are being present on cold tier, before attaching.
All of the fops will be hashed to hot_tier after
attach-tier (unless explicitly set the "rule"
option). Lookups sent to directory, will eventually
search the directory using readdirp, and will
populate inode_ctx for the inodes based on the output,
in respective dht_xlators. So the readdirp will
populate inodes_ctx for the files (that is already
present in volume before attaching) in cold-dht
only because it got the entries from the cold-tier.
So when an unlink comes on such an inode, the lookup
associated with the unlink will be send as a
re validate request to cold-tier only, since
already a lookup was performed on the inode,
and the new lookup will succeed. So from the
unlink of dht, it will hash to cold-tier but the
cached_subvol will be cold, since there is a
mismatch in hash and cach , it chose hashed
subvolume and will sent the fop to hot dht,
and the fops fail with EINVAL from the hot-dht
since it does not have inode_ctx stored for
that inode (because, no lookup was performed
from hot-dht).
Change-Id: Ib7c14a9297a22d615f7a890a060be4809b5a745a
BUG: 1236032
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/11675
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lookup selfheal race
Locking on all subvols before an rmdir is unable to remove all
directory entries. Hence reverting the patch for now.
Change-Id: I31baf2b2fa2f62c57429cd44f3f229c35eff1939
BUG: 1245065
Signed-off-by: Sakshi <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/12125
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch addresses crash handling if local is NULL. In addition to that,
we were not unwinding if no lock is taken in dht_linkfile_create_cbk(create/mknod).
This patch handles that also.
Change-Id: Ibcff317f10d60e7865fd7ffb9479b3af53c9ef17
BUG: 1260051
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/12160
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: If the hashed subvol of a file has reached cluster.min-free-disk,
for a create opertaion a linkto file will be created on the hashed and
the data file will be created on some other brick.
For creation of the linkfile we populate the dictionary with linkto key
and value as the cached subvol. After successful linkto file creation,
the linkto-key-value pair is not deleted form the dictionary and hence,
the data file will also have linkto xattr which points to itself.This looks
something like this.
client-0 client-1
-------T file rwx------file
linkto.xattr=client-1 linkto.xattr=client-1
Now coming to the data loss part. Hardlink migration highly depend on this
linkto xattr on the data file. This value should be the new hashed subvol
of the first hardlink encountered post fix-layout. But when it tries to
read the linkto xattr it gets the same target as where it is sitting.
Now the source and destination are same for migration. At the end of
migration the source file is truncated and deleted, which in this case
is the destination and also the only data file it self resulting in
data loss.
Change-Id: I36b1d105752bd9467757ecf3f103b45c666783d6
BUG: 1260051
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/12105
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If linkfile_create is failed for some reason, then
we are trying to dereference a null variable
Change-Id: I3c6ff3715821b9b993d1bab7b90167de2861e190
BUG: 1260147
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/12106
Reviewed-by: Susant Palai <spalai@redhat.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- fd_unref should decrement fd->inode->fd_count only if it is present in the
inode's fd list.
- successful open/opendir should perform fd_bind.
Change-Id: I81dd04f330e2fee86369a6dc7147af44f3d49169
BUG: 1207735
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/11044
Reviewed-by: Anoop C S <anoopcs@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a possibility that while an rmdir is completed on
some non-hashed subvol and proceeding to others. A lookup
selfheal can recreate the same directory on those subvols
for which the rmdir had succeeded. The fix is to take a
blocking inodelk on the subvols before starting rmdir.
Since selfheal requires lock on all subvols, if an rmdir
is in progess acquiring locks will fail and vice versa.
Change-Id: I841a44758c3b88f5e04d1cb73ad36e0cac9fdabb
BUG: 1245065
Signed-off-by: Sakshi <sabansal@redhat.com>
Reviewed-on: http://review.gluster.org/11725
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I8c39ce38e257758e27e11ccaaff4798138203e0c
BUG: 1256243
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/11998
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Post remove-brick start till commit phase, the client layout
may not be in sync with disk layout because of lack of lookup.
Hence,a create call may fall on the decommissioned brick.
Solution:
Will acquire a lock on hashed subvol. So that a fix-layout or
selfheal can not step on layout while reading the layout.
Even if we read a layout before remove-brick fix-layout and the
file falls on the decommissioned brick, the file should be
migrated to a new brick as per the fix-layout.
Change-Id: If84a12ec34f981adb2b9b224e80f535cfe5bf9f2
BUG: 1232378
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/11260
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I4a3dd8c00894ceeed4af77df2d960f372281a03b
BUG: 1235989
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/11409
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the files was in hot tier and the look up was done already, then
hashed and cached subvolume will be hot-tier. Once the file is moved
from hot-tier to cold-tier, then subsequent lookup will send a
revalidate lookup to hot-tier and it will find out that the file was
actually moved and there is only link in the cached subvolume. So dht
will return an ESTALE to fuse. Upon receiving ESTALE for a lookup, fuse
will create a new inode and sent a fresh lookup. This lookup will be
successful, and it will locate the file properly. Then fuse try to link
the inode, but the older inode was already there in inmemory inode cache
with same gfid and that is also shared with fuse kernal. So inode_link
will return the older ionode itself. So the subsequent rename fop will
come to gluster with the older inode. From dht_rename, we will take a
lock on the inode and after successful inodelk on inode dht will send
lookup before creating a link. this lookup will again find out that the
file is a link file, and then dht will think that file is
migrating/migrated in the mean time, and will send EBUSY.
Change-Id: Ib3a01e5b1d7f64514b04bb6234026d049f082679
BUG: 1248306
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/11768
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An opendir is done in rebalance. The graph constructed when
EC is used in tiering may have no local volumes (if
all the hot volumes are on one node and all the others on
another node). Previously the opendir only sent fops down
the local subvolumes for migration. They must be sent down
both the hot and cold subvolumes for tiering.
When setxattr2() received a NULL subvolume; this dereferenced
an uninitialized variable.
When a lookup is done during creation of the destination
file, the xattr dict is "polluted" with virtual xattrs.
These cause subsequent xattrs in the new file to not be
written by posix. They are required by EC.
The inode gfid for "entry_loc" in gf_defrag_migrate_single_file()
was not initialized. This made underlying translators
think the gfid was 0, and failed migration.
Change-Id: I6ccda8ca8e43485b9b354341bbfcb302496f632c
BUG: 1236212
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/11433
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a subvol goes down, tier daemon stopped immediately, and
the status shows as "Progressing".
With this change, with respect to tier xlator, when a subvol
goes offline it will update the status as failed.
Change-Id: I9f722ed0d35cda8c7fc1a7e75af52222e2d0fdb7
BUG: 1227803
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/11068
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ib3bb61c5223f409c23c68100f3fe884918d2dc3f
BUG: 1194640
Signed-off-by: arao <arao@redhat.com>
Reviewed-on: http://review.gluster.org/10021
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Joseph Fernandes
Tested-by: Joseph Fernandes
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a tiered volume the cold subvolume is always at a fixed
position in the graph. DHT's layout array, on the other hand,
may have the cold subvolume in either the first or second
index, therefore code cannot make any assumptions. The fix
searches the layout for the correct position dynamically
rather than statically.
The bug manifested itself in NFS, in which a newly attached
subvolume had not received an existing directory. This case
is a "stale entry" and marked as such in the layout for
that directory. The code did not see this, because it
looked at the wrong index in the layout array.
The fix also adds the check for decomissioned bricks, and
fixes a problem in detach tier related to starting the
rebalance process: we never received the right defrag
command and it did not get directed to the tier translator.
Change-Id: I77cdf9fbb0a777640c98003188565a79be9d0b56
BUG: 1214289
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/11092
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Stashing additional information in the inode_ctx to help
decide whether the migration information is stale, which could
happen if a file was migrated several times but FOPs only detected
the P1 migration phase. If no FOP detects the P2 phase, the inode
ctx1 is never reset.
We now save the src subvol as well as the dst subvol in the
inode ctx. The src subvol is the subvol on which the FOP was sent
when the mig info was set in the inode ctx. This information is
considered stale if:
1. The subvol on which the current FOP is sent is the same as
the dst subvol in the ctx
2. The subvol on which the current FOP is sent is not the same
as the src subvol in the ctx
This does not handle the case where the same file might have been
renamed such that the src subvol is the same but the dst subvol
is different. However, that is unlikely to happen very often.
Change-Id: I05a2e9b107ee64750c7ca629aee03b03a02ef75f
BUG: 1142423
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/10834
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The destination subvol used in the fop2 variants is either stored in
inode-ctx1 or local->cached_subvol. However, it is not guaranteed that
a value stored in these locations before invocation of fop2 is still
present after the invocation as these locations are shared among
different concurrent operations. So, to preserve the atomicity of
"check dst-subvol and invoke fop2 variant if dst-subvol found", we
pass down the dst-subvol to fop2 variant.
This patch also fixes error handling in some fop2 variants.
Change-Id: Icc226228a246d3f223e3463519736c4495b364d2
BUG: 1142423
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/10943
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new rebalance performance improvements added new
datastructures which were not initialized in the
tier case. Function dht_find_local_subvol_cbk() needs
to accept a list built by lower level DHT translators
in order to build the local subvolumes list.
Change-Id: Iab03fc8e7fadc22debc08cd5bc781b9e3e270497
BUG: 1222088
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10795
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently with commit 4eaaf5 a mixed version cluster would
have issues if lookup-uhashed is set to auto, as older clients
would fail to validate the layouts if newer clients (i.e 3.7 or
upwards) create directories. Also, in a mixed version cluster
rebalance daemon would set commit hash for some subvolumes and
not for the others.
This commit fixes this problem by moving the enabling of the
functionality introduced in the above mentioned commit to a
new dht option. This option also has a op_version of 3_7_1
thereby preventing it from being set in a mixed version
cluster. It brings in the following changes,
- Option can be set only if min version of the cluster is
3.7.1 or more
- Rebalance and mkdir update the layout with the commit hashes
only if this option is set, hence ensuring rebalance works in a
mixed version cluster, and also directories created by newer
clients do not cause layout errors when read by older clients
- This option also supersedes lookup-unhased, to enable the
optimization for lookups more deterministic and not conflict
with lookup-unhashed settings.
Option added is cluster.lookup-optimize, which is a boolean.
Usage: # gluster volume set VOLNAME cluster.lookup-optimize on
Change-Id: Ifd1d4ce3f6438fcbcd60ffbfdbfb647355ea1ae0
BUG: 1222126
Signed-off-by: Shyam <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/10797
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we did a graph switch on a rebalance daemon, a second call
to gf_degrag_start() was done. This lead to multiple threads
doing migration. When multiple threads try to move the same
file there can be deadlocks.
Change-Id: I931ca7fe600022f245e3dccaabb1ad004f732c56
BUG: 1226005
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10977
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of including config.h in each file, and have the additional
config.h included from the compiler commandline (-include option).
When a .c file tests for a certain #define, and config.h was not
included, incorrect assumtions were made. With this change, it can not
happen again.
BUG: 1222319
Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/10808
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ic92d25db68e40ef4a4388ef42affd1b3ee5a7ec6
BUG: 1221270
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/10773
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a file is under migration, any xattrs created on it
are lost post migration of the file. This is because
the xattrs are set only on the cached subvol of the source
and as the source is under migration, it becomes a linkto file
post migration.
Change-Id: Ib8e233b519cf954e7723c6e26b38fa8f9b8c85c0
BUG: 1193636
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/10212
Tested-by: NetBSD Build System
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The rebalance process determines the local subvols for the
node it is running on and only acts on files in those subvols.
If a dist-rep or dist-disperse volume is created on 2 nodes by
dividing the bricks equally across the nodes, one process might
determine it has no local_subvols.
When trying to update the commit hash, the function attempts to
lock all local subvols. On the node with no local_subvols the dht
inode lock operation fails, in turn causing the rebalance to fail.
In a dist-rep volume with 2 nodes, if brick 0 of each replica
set is on node1 and brick 1 is on node2, node2 will find that it has
no local subvols.
Change-Id: I7d73b5b4bf1c822eae6df2e6f79bd6a1606f4d1c
BUG: 1221696
Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/10786
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CID : 1124521
Change-Id: Ie524935d636195cb6894074095b9b98fe28dbc2c
BUG: 789278
Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Reviewed-on: http://review.gluster.org/10348
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Sakshi Bansal
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The key concept here is to determine whether a directory is "clean" by
comparing its last-known-good topology to the current one for the
volume. These are stored as "commit hashes" on the directory and the
volume root respectively. The volume's commit hash changes whenever a
brick is added or removed, and a fix-layout is done. A directory's
commit hash changes only when a full rebalance (not just fix-layout)
is done on it. If all bricks are present and have a directory
commit hash that matches the volume commit hash, then we can assume
that every file is in its "proper" place. Therefore, if we look for
a file in that proper place and don't find it, we can assume it's not
on any other subvolume and *safely* skip the global (broadcast to all)
lookup.
Change-Id: Id6ce4593ba1f7daffa74cfab591cb45960629ae3
BUG: 1219637
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Signed-off-by: Shyam <srangana@redhat.com>
Reviewed-on: http://review.gluster.org/7702
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fix adds support to view the number of promoted or demoted
files from the cli. The mechanism is isolmorphic to checking
the status of volumes being rebalanced.
gluster volume rebalance <vol> tier status
Change-Id: I1b11ca27355ceec36c488967c23531202030e205
BUG: 1213063
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10292
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the patch http://review.gluster.org/#/c/9657
the client pid set by tiering migration was getting over-
written in dht_start_rebalance_task(). Just corrected it
in dht_setxattr() before calling dht_start_rebalance_task()
and removed it from dht_start_rebalance_task().
Change-Id: I37cfa111f83a4e5d498042575c93799f60b49870
BUG: 1217937
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10502
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we attach a tier, the hot tier becomes the hashed
subvolume. But directories may not yet have been replicated by
the fix layout process. Hence lookups to those directories
will fail on the hot subvolume. We should only go to the hashed
subvolume once the layout has been fixed. This is known if the
layout for the parent directory does not have an error. If
there is an error, the cold tier is considered the hashed
subvolume. The exception to this rules is ENOCON, in which
case we do not know where the file is and must abort.
Note we may revalidate a lookup for a directory even if the
inode has not yet been populated by FUSE. This case can
happen in tiering (where one tier has completed a lookup
but the other has not, in which case we revalidate one tier
when we call lookup the second time). Such inodes are
still invalid and should not be consulted for validation.
Change-Id: Ia2bc62e1d807bd70590bd2a8300496264d73c523
BUG: 1214289
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10435
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow up patch for http://review.gluster.org/#/c/10080
In the above, the suggested change in
http://review.gluster.org/#/c/10080/7/xlators/cluster/dht/src/dht-rebalance.c
doesnot work. The reason it doesnt work is promotion and demotion are done in
a multithread way. Whenever a promotion or demotion thread is called, the frame
of the old sync_op thread is not carried with it. As a result the frame->root->pid
is not set.
Solution:
When the file is getting migrated, we get a tiering.migration key_value in the
xattr dict, so that we pass this dic key-value when we do syncop_setxattr()
to do data migration and set the frame->root->pid GF_CLIENT_PID_TIER_DEFRAG
in dht_setxattr() just before calling dht_start_rebalance_task().
Change-Id: I86fef2d961b32fdd2c0c69d8512cbe846b393404
BUG: 1194753
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10266
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
problem:
1. When two threads execute in parallel in dht_getxattr_cbk
it may so happen that, both may find local->xattr to be NULL. As
a result dht_aggregate_xattr may not get executed.
2. In dht_getxattr_cbk,
thread1 thread2
T1 this_call_cnt = 2 -1
T2 this_call_cnt = 1 - 1
T3 fills local_xattr
T4 DHT_STACK_UNWIND -> local_wipe
T5 tries to dereference local
which is already freed,
leading to crash.
Solution:
for problem1: Execute critical section inside frame lock
to resolve race.
for problem2: Calculate this_call_count just before out section.
Change-Id: I9827ac8fafebb0c733a4e4f3c710b752f1cd45fa
BUG: 1215592
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/10389
Reviewed-by: Anuradha Talur <atalur@redhat.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current patch address two part of the design proposed.
1. Rebalance multiple files in parallel
2. Crawl only bricks that belong to the current node
Brief design explanation for the above two points.
1. Rebalance multiple files in parallel:
-------------------------------------
The existing rebalance engine is single threaded. Hence, introduced
multiple threads which will be running parallel to the crawler. The
current rebalance migration is converted to a "Producer-Consumer"
frame work.
Where Producer is : Crawler
Consumer is : Migrating Threads
Crawler: Crawler is the main thread. The job of the crawler is now
limited to fix-layout of each directory and add the files which are
eligible for the migration to a global queue in a round robin manner
so that we will use all the disk resources efficiently. Hence, the
crawler will not be "blocked" by migration process.
Producer: Producer will monitor the global queue. If any file is
added to this queue, it will dqueue that entry and migrate the file.
Currently 20 migration threads are spawned at the beginning of the
rebalance process. Hence, multiple file migration happens in parallel.
2. Crawl only bricks that belong to the current node:
--------------------------------------------------
As rebalance process is spawned per node, it migrates only the files
that belongs to it's own node for the sake of load balancing. But it
also reads entries from the whole cluster, which is not necessary as
readdir hits other nodes.
New Design:
As part of the new design the rebalancer decides the subvols
that are local to the rebalancer node by checking the node-uuid of
root directory prior to the crawler starts. Hence, readdir won't hit
the whole cluster as it has already the context of local subvols and
also node-uuid request for each file can be avoided. This makes the
rebalance process "more scalable".
Change-Id: I73ed6ff807adea15086eabbb8d9883e88571ebc1
BUG: 1171954
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/9657
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These commands work in a manner analagous to rebalancing when removing a
brick. The existing migration daemon detects "detach start" and switches
to moving data off the hot tier. While in this state all lookups are
directed to the cold tier.
gluster v detach-tier <vol> start
gluster v detach-tier <vol> commit
The status and stop cli commands shall be submitted separately.
Change-Id: I24fda5cc3ba74f5fb8aa9a3234ad51f18b80a8a0
BUG: 1205540
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Signed-off-by: root <root@localhost.localdomain>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10108
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Tested-by: NetBSD Build System
|