| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Each brick on a host will get a separate query file.
2. While reading query record from these query files we
read them in a Round-Robin manner.
3. When an error occurs during migration we rename it to
query file with an time stamp and .err extension for
better debugging.
Backport of http://review.gluster.org/13293
> Change-Id: I27c4285d24fd695d2d5cbd9fd7db3879d277ecc8
> BUG: 1302772
> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
> Reviewed-on: http://review.gluster.org/13293
> Smoke: Gluster Build System <jenkins@build.gluster.com>
> Tested-by: N Balachandran <nbalacha@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Change-Id: I9448f19c10a82fff9b1b3c9387c9026adc0742f1
BUG: 1306514
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/13427
Smoke: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Joseph Fernandes
Tested-by: Joseph Fernandes
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible a file would get migrated in the middle
of a readdir operation. If there are four subvolumes A,B,C,D,
and if readdir reads them in order and reaches subvol B,
then, if a file is moved from D to A, it will not be included
in the readdir output.
This phenonema has pre-existed in DHT migration but is more
apparent in tiering.
When a file is moved off the hashed subvolume a T file is created.
For tiering, we will make the cold subvolume the hashed subvolume.
This will ensure the creation of a T file. Readdir will not skip T
files in the tier translator.
Making the cold subvolume the hashed subvolume ensures the T
files created on promotions or creates will be less likely to
fill the volume.
Creates still put the data on the hot subvolume.
This is a backport of 12530
> Change-Id: Ifde557d3d0e94a4570ca9f115adee3db2ee75407
> BUG: 1281598
> Signed-off-by: Dan Lambright <dlambrig@redhat.com>
> Reviewed-on: http://review.gluster.org/12530
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: N Balachandran <nbalacha@redhat.com>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Conflicts:
xlators/cluster/dht/src/tier.c
Change-Id: I5720a4cd04ae5088e5d7d23439b0f90d6bbc6265
BUG: 1283923
Reviewed-on: http://review.gluster.org/12722
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current DHT migration code will always delete the
src linkto file after migration as dht always moves
files to the hashed subvol. This is not the case in tiering.
The lack of linkto files causes rename to fail leaving 2 files
with the same name but different gfids on the volume.
Modified to leave the linkto file behind if the source
volume is the hashed subvolume.
> Change-Id: I2b99f7d34b4b719aee6232dc40c6a8f8ba88225d
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
> Reviewed-on: http://review.gluster.org/12551
> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
> Tested-by: Dan Lambright <dlambrig@redhat.com>
Change-Id: I210b94cdae0409c87af8ba198e3cd263a6c85190
BUG: 1283480
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/12655
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier, when the database was queried we used to save
all the queried records in an ASCII format in the query file.
This caused issues like filename having ASCII delimiter and used
to take a lot of space. The tier.c file also had a lot of parsing code.
Here we changed the format of the query file to binary.
All the logic of serialization and formating of query record is done
by libgfdb. Libgfdb provides API,
gfdb_write_query_record() and gfdb_read_query_record(),
which the user i.e tier migrator and CTR xlator can use to
write to and read from query file.
With this binary format we save on disk space i.e reduce to 50% atleast
as we are saving GFID's in binary format 16 bytes and not the string format
which takes 36 bytes + We are not saving path of the file + we are also saving on
ASCII delimiters.
The on disk format of query record is as follows,
+---------------------------------------------------------------------------+
| Length of serialized query record | Serialized Query Record |
+---------------------------------------------------------------------------+
4 bytes Length of serialized query record
|
|
-------------------------------------------------|
|
|
V
Serialized Query Record Format:
+---------------------------------------------------------------------------+
| GFID | Link count | <LINK INFO> |..... | FOOTER |
+---------------------------------------------------------------------------+
16 B 4 B Link Length 4 B
| |
| |
-----------------------------| |
| |
| |
V |
Each <Link Info> will be serialized as |
+-----------------------------------------------+ |
| PGID | BASE_NAME_LENGTH | BASE_NAME | |
+-----------------------------------------------+ |
16 B 4 B BASE_NAME_LENGTH |
|
|
------------------------------------------------------------------------|
|
|
V
FOOTER is a magic number 0xBAADF00D indicating the end of the record.
This also serves as a serialized schema validator.
Backport of http://review.gluster.org/#/c/12354/
> Change-Id: I9db7416fd421e118dd44eafab8b535caafe50d5a
> BUG: 1272207
> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
> Reviewed-on: http://review.gluster.org/12354
> Reviewed-by: N Balachandran <nbalacha@redhat.com>
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
> Tested-by: Dan Lambright <dlambrig@redhat.com>
Change-Id: I170c579027f2594a58706f826e3ddf89e34022f4
BUG: 1263619
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/12535
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport fix 12039
This fix introduces infrastructure to support different
policies for promotion and demotion.
Currently the tier feature automatically promotes and demotes
files periodically based on access. This is good for testing
but too stringent for most real workloads. It makes it
difficult to fully utilize a hot tier- data will be demoted
before it is touched- its unlikely a 100GB hot SSD will have
all its data touched in a window of time.
A new parameter "mode" allows the user to pick promotion/demotion
polcies.
The "test mode" will be used for *.t and other general testing.
This is the current mechanism.
The "cache mode" introduces watermarks. The watermarks
represent levels of data residing on the hot tier.
"cache mode" policy:
The % the hot tier is full is called P.
Do not promote or demote more than D MB or F files.
A random number [0-100] is called R.
Rules for migration:
if (P < watermark_low) don't demote, always promote.
if (P >= watermark_low) && (P < watermark_hi) demote if R < P; promote if R > P.
if (P > watermark_hi) always demote, don't promote.
gluster volume set {vol} cluster.watermark-hi %
gluster volume set {vol} cluster.watermark-low %
gluster volume set {vol} cluster.tier-max-mb {D}
gluster volume set {vol} cluster.tier-max-files {F}
gluster volume set {vol} cluster.tier-mode {test|cache}
> Change-Id: I157f19667ec95aa1d53406041c1e3b073be127c2
> BUG: 1257911
> Signed-off-by: Dan Lambright <dlambrig@redhat.com>
> Reviewed-on: http://review.gluster.org/12039
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Conflicts:
xlators/cluster/dht/src/dht-rebalance.c
xlators/cluster/dht/src/tier.c
Change-Id: Ibfe6b89563ceab98708325cf5d5ab0997c64816c
BUG: 1270527
Reviewed-on: http://review.gluster.org/12330
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a backport of 12031.
> Problem: The DB on the brick is been accessed by CTR, for write and
> tier migrator, for read and write. The write from tier migrator is reseting
> the heat counters after a cycle. Since we are using sqlite, two connections
> trying to write would cause a db lock contention. As a result CTR used to fail
> to update the db.
> Solution: Using the same db connection of CTR for reseting the heat counters.
> 1) Introducted a new IPC FOP for CTR
> 2) After the query do a ipc syncop to the underlying client xlator associated
> to the brick.
> 3) CTR in brick will catch the IPC FOP and cleat the heat counters.
> Change-Id: I53306bfc08dcdba479deb4ccc154896521336150
> BUG: 1260730
> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
> Reviewed-on: http://review.gluster.org/12031
> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Conflicts:
xlators/cluster/dht/src/tier.c
Change-Id: I88aa289cdf21e216b42c3d8ccfb4e7e828b43772
BUG: 1262341
Reviewed-on: http://review.gluster.org/12161
Reviewed-by: Joseph Fernandes
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a back port of 11334
1) Ignore creation of T file in ctr_mknod
2) Ignore lookup for T file in ctr_lookup
3) Ctr_lookup:
a. If the gfid and pgfid in empty dont record
b. Decreased log level for multiple heal attempts
c. Inode/File heal happens after an expiry period, which is configurable.
d. Hardlink heal happens after an expiry period, which is configurable.
> Change-Id: Id8eb5092e78beaec22d05f5283645081619e2452
> BUG: 1235269
> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
> Reviewed-on: http://review.gluster.org/11334
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
> Tested-by: Dan Lambright <dlambrig@redhat.com>
Change-Id: Ia28a5cf975e41d318906f707deca447aaa35630f
BUG: 1236288
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/11446
Reviewed-by: Joseph Fernandes
Tested-by: Joseph Fernandes
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We did not set up the graph properly for hot tiers with replicated
subvolumes. Also add check that the file has not already been moved
by another replicated brick on the same node.
Change-Id: I9adef565ab60f6774810962d912168b77a6032fa
BUG: 1206517
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10054
Reviewed-by: Joseph Fernandes <josferna@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch resolves tiering translator issues taken from the list
in bug 1203776. These issues have been selected to be fixed
first. The rest will be fixed in a subsequent patch (or are not a
problem).
3. Replace hardcoded #defines of promote/demote file names
6. Use loc_wipe() in migrate_using_query_file()
9. Only promote/demote files on the same node on which they reside.
14. Replace calloc with GF_CALLOC in tier.c and ensure freeing done
properly.
15. Handle if parse_query_str fails
22. Only load gfdb library on server side, remove SQL references
from client.
Change-Id: I6563b11e58ab2e4c6b1ce44db755781ad6d930fb
BUG: 1203776
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/9987
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FreeBSD does not have sys/xattr.h, including it in tier.h breaks
building on FreeBSD. There is nothing in tier.h that seems to require
definitions from the sys/xattr.h header, just remove it.
BUG: 1194753
Change-Id: If970272a0ce7728e0f18e5ae026880688ac31408
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/9965
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Sachin Pandit <spandit@redhat.com>
|
|
The tier translator shares most of DHT's code. It differs in how
subvolumes are chosen for I/Os, and how file migration (cache promotion
and demotion) is managed. That different functionality is split to either
DHT or tier logic according to the "tier_methods" structure.
A cache promotion and demotion thread is created in a manner
similar to the rebalance daemon. The thread operates a timing
wheel which periodically checks for promotion and demotion candidates
(files). Candidates are queued and then migrated. Candidates must exist on
the same node as the daemon and meet other critera per caching policies.
This patch has two authors (Dan Lambright and Joseph Fernandes). Dan
did the DHT changes and Joe wrote the cache policies. The fix depends on
DHT readidr changes and the database library which have been submitted
separately. Header files in libglusterfs/src/gfdb should be reviewed in
patch 9683.
For more background and design see the feature page [1].
[1]
http://www.gluster.org/community/documentation/index.php/Features/data-classification
Change-Id: Icc26c517ccecf5c42aef039f5b9c6f7afe83e46c
BUG: 1194753
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/9724
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Tested-by: Vijay Bellur <vbellur@redhat.com>
|