| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier, when the database was queried we used to save
all the queried records in an ASCII format in the query file.
This caused issues like filename having ASCII delimiter and used
to take a lot of space. The tier.c file also had a lot of parsing code.
Here we changed the format of the query file to binary.
All the logic of serialization and formating of query record is done
by libgfdb. Libgfdb provides API,
gfdb_write_query_record() and gfdb_read_query_record(),
which the user i.e tier migrator and CTR xlator can use to
write to and read from query file.
With this binary format we save on disk space i.e reduce to 50% atleast
as we are saving GFID's in binary format 16 bytes and not the string format
which takes 36 bytes + We are not saving path of the file + we are also saving on
ASCII delimiters.
The on disk format of query record is as follows,
+---------------------------------------------------------------------------+
| Length of serialized query record | Serialized Query Record |
+---------------------------------------------------------------------------+
4 bytes Length of serialized query record
|
|
-------------------------------------------------|
|
|
V
Serialized Query Record Format:
+---------------------------------------------------------------------------+
| GFID | Link count | <LINK INFO> |..... | FOOTER |
+---------------------------------------------------------------------------+
16 B 4 B Link Length 4 B
| |
| |
-----------------------------| |
| |
| |
V |
Each <Link Info> will be serialized as |
+-----------------------------------------------+ |
| PGID | BASE_NAME_LENGTH | BASE_NAME | |
+-----------------------------------------------+ |
16 B 4 B BASE_NAME_LENGTH |
|
|
------------------------------------------------------------------------|
|
|
V
FOOTER is a magic number 0xBAADF00D indicating the end of the record.
This also serves as a serialized schema validator.
Backport of http://review.gluster.org/#/c/12354/
> Change-Id: I9db7416fd421e118dd44eafab8b535caafe50d5a
> BUG: 1272207
> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
> Reviewed-on: http://review.gluster.org/12354
> Reviewed-by: N Balachandran <nbalacha@redhat.com>
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
> Tested-by: Dan Lambright <dlambrig@redhat.com>
Change-Id: I170c579027f2594a58706f826e3ddf89e34022f4
BUG: 1263619
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/12535
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ignore bitrot related fops since they are internal fops.
Backport of http://review.gluster.org/#/c/12512/
Change-Id: I5b676fe450d266b95bbd25aeca0d54436e098917
BUG: 1278640
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/12533
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Correcting the internal fop calculation method, as it had wrong logic.
backport of http://review.gluster.org/#/c/12418/
Change-Id: I1deeae8b67dd967159853b494e89a3f46572c962
BUG: 1275483
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/12423
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Heal hardlink in the db for already existing data in the cold
tier during attach tier. i.e during fix layout do lookup to files
in the cold tier.
CTR xlator on the brick/server side does db update/insert of the hardlink on a namelookup.
Currently the namedlookup is done synchronous to the fixlayout that is
triggered by attach tier. This is not performant, adding more time to
fixlayout. The performant approach is record the hardlinks on a compressed
datastore and then do the namelookup asynchronously later, giving the ctr db
eventual consistency
master patch : http://review.gluster.org/#/c/11828/
>>Change-Id: I4ffc337fffe7d447804786851a9183a51b5044a9
>>BUG: 1252586
>>Signed-off-by: Joseph Fernandes <josferna@redhat.com>
>>Reviewed-on: http://review.gluster.org/11828
>>Tested-by: Gluster Build System <jenkins@build.gluster.com>
>>Reviewed-by: Dan Lambright <dlambrig@redhat.com>
>>Tested-by: Dan Lambright <dlambrig@redhat.com>
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Change-Id: I61b185a54ae4e8c1d82804b95a278bfbea870987
BUG: 1261146
Reviewed-on: http://review.gluster.org/12331
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
version less than 3.7 i.e rhel 6.7
Problem: On RHEL 6.7, we have sqlite version 3.6.2 which doesnt support
WAL journaling mode, as this journaling mode is only available in sqlite 3.7 and above.
As a result we cannot have to progreses concurrently accessing sqlite, without
running into db locks! Well WAL is also need for performace on CTR side.
Solution: This solution is to use CTR db connection for doing queries when WAL mode is
absent. i,e tier migrator will send sync_op ipc calls to CTR, which in turn will
do the query and create/update the query file suggested by tier migrator.
Pending: Well this solution will stop the db locks but the performance is still an issue for CTR.
We are developing an in-Memory Transaction Log (iMeTaL) which will help boost the CTR
performance by doing in memory udpates on the IO path and later flush the updates to
the db in a batch/segment flush.
Master patch: http://review.gluster.org/#/c/12191
>> Change-Id: Ie3149643ded159234b5cc6aa6cf93b9022c2f124
>> BUG: 1240577
>> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
>> Signed-off-by: Dan Lambright <dlambrig@redhat.com>
>> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
>> Reviewed-on: http://review.gluster.org/12191
>> Tested-by: Gluster Build System <jenkins@build.gluster.com>
>> Reviewed-by: Luis Pabon <lpabon@redhat.com>
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Change-Id: Ie8c7a7e9566244c104531b579126bb57fbc6e32b
BUG: 1270123
Reviewed-on: http://review.gluster.org/12325
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a backport of 12031.
> Problem: The DB on the brick is been accessed by CTR, for write and
> tier migrator, for read and write. The write from tier migrator is reseting
> the heat counters after a cycle. Since we are using sqlite, two connections
> trying to write would cause a db lock contention. As a result CTR used to fail
> to update the db.
> Solution: Using the same db connection of CTR for reseting the heat counters.
> 1) Introducted a new IPC FOP for CTR
> 2) After the query do a ipc syncop to the underlying client xlator associated
> to the brick.
> 3) CTR in brick will catch the IPC FOP and cleat the heat counters.
> Change-Id: I53306bfc08dcdba479deb4ccc154896521336150
> BUG: 1260730
> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
> Reviewed-on: http://review.gluster.org/12031
> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Conflicts:
xlators/cluster/dht/src/tier.c
Change-Id: I88aa289cdf21e216b42c3d8ccfb4e7e828b43772
BUG: 1262341
Reviewed-on: http://review.gluster.org/12161
Reviewed-by: Joseph Fernandes
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in ctr_lookup, the loc variable need not be comes with pargfid,
though there is a parent for the inode. The same for loc->name
also. From this patch, we will generate loc->name from loc->path
Back port of :
>Change-Id: I24a79554748139504ec09f77930f8208d3805977
>BUG: 1236128
>Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
>Reviewed-on: http://review.gluster.org/11459
>Tested-by: NetBSD Build System <jenkins@build.gluster.org>
>Tested-by: Gluster Build System <jenkins@build.gluster.com>
>Reviewed-by: Joseph Fernandes
(cherry picked from commit 2e44d1580497eb75f325ad3104249a425ddf592a)
Change-Id: Ia926bee45059ec97bbe5120f19999b9d5a71aa88
BUG: 1254439
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/11947
Tested-by: NetBSD Build System <jenkins@build.gluster.org>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Dan Lambright <dlambrig@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a back port of 11334
1) Ignore creation of T file in ctr_mknod
2) Ignore lookup for T file in ctr_lookup
3) Ctr_lookup:
a. If the gfid and pgfid in empty dont record
b. Decreased log level for multiple heal attempts
c. Inode/File heal happens after an expiry period, which is configurable.
d. Hardlink heal happens after an expiry period, which is configurable.
> Change-Id: Id8eb5092e78beaec22d05f5283645081619e2452
> BUG: 1235269
> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
> Reviewed-on: http://review.gluster.org/11334
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
> Tested-by: Dan Lambright <dlambrig@redhat.com>
Change-Id: Ia28a5cf975e41d318906f707deca447aaa35630f
BUG: 1236288
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/11446
Reviewed-by: Joseph Fernandes
Tested-by: Joseph Fernandes
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/#/c/10938/
Cherry picked from 00f9a61fe8884062c141edd662424625d349a377
>Change-Id: I66e7ccc5e62482c3ecf0aab302568e6c9ecdc05d
>BUG: 1194640
>Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
>Reviewed-on: http://review.gluster.org/10938
>Tested-by: Gluster Build System <jenkins@build.gluster.com>
>Reviewed-by: Joseph Fernandes
>Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
Change-Id: I66e7ccc5e62482c3ecf0aab302568e6c9ecdc05d
BUG: 1217722
Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com>
Reviewed-on: http://review.gluster.org/11195
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Joseph Fernandes
Tested-by: Joseph Fernandes
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: The CTR xlator records file meta (heat/hardlinks)
into the data. This works fine for files which are created
after ctr xlator is switched ON. But for files which were
created before CTR xlator is ON, CTR xlator is not able to
record either of the meta i.e heat or hardlinks. Thus making
those files immune to promotions/demotions.
Solution: The solution that is implemented in this patch is
do ctr-db heal of all those pre-existent files, using named lookup.
For this purpose we use the inode-xlator context variable option
in gluster.
The inode-xlator context variable for ctr xlator will have the
following,
a. A Lock for the context variable
b. A hardlink list: This list represents the successful looked
up hardlinks.
These are the scenarios when the hardlink list is updated:
1) Named-Lookup: Whenever a named lookup happens on a file, in the
wind path we copy all required hardlink and inode information to
ctr_db_record structure, which resides in the frame->local variable.
We dont update the database in wind. During the unwind, we read the
information from the ctr_db_record and ,
Check if the inode context variable is created, if not we create it.
Check if the hard link is there in the hardlink list.
If its not there we add it to the list and send a update to the
database using libgfdb.
Please note: The database transaction can fail(and we ignore) as there
already might be a record in the db. This update to the db is to heal
if its not there.
If its there in the list we ignore it.
2) Inode Forget: Whenever an inode forget hits we clear the hardlink list in
the inode context variable and delete the inode context variable.
Please note: An inode forget may happen for two reason,
a. when the inode is delete.
b. the in-memory inode is evicted from the inode table due to cache limits.
3) create: whenever a create happens we create the inode context variable and
add the hardlink. The database updation is done as usual by ctr.
4) link: whenever a hardlink is created for the inode, we create the inode context
variable, if not present, and add the hardlink to the list.
5) unlink: whenever a unlink happens we delete the hardlink from the list.
6) mknod: same as create.
7) rename: whenever a rename happens we update the hardlink in list. if the hardlink
was not present for updation, we add the hardlink to the list.
What is pending:
1) This solution will only work for named lookups.
2) We dont track afr-self-heal/dht-rebalancer traffic for healing.
> http://review.gluster.org/#/c/10370/
> Cherry picked from commit cb11dd91a6cc296e4a3808364077f4eacb810e48
> Change-Id: Ia4bbaf84128ad6ce8c3ddd70bcfa82894c79585f
> BUG: 1212037
> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
> Signed-off-by: Dan Lambright <dlambrig@redhat.com>
> Reviewed-on: http://review.gluster.org/10370
> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Tested-by: NetBSD Build System
> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Change-Id: I367aa46c3f4b8f912248fb8be75866507f2538df
BUG: 1219075
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-on: http://review.gluster.org/10370
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10615
Tested-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) ctr_link_consistency option for ctr xaltor is provided so that
the user can choose to switch it on or off.
/* For link consistency we do a double update i.e mark the link
* during the wind and during the unwind we update/delete the link.
* This has a performance hit. We give a choice here whether we need
* link consistency to be spoton or not using link_consistency flag.
* This will have only one link update */
2) In delete the wind time recording is moved to unwind path.
/* Special performance case:
* Updating wind time in unwind for delete. This is done here
* as in the wind path we will not know whether its the last
* link or not. For a last link there is not use to update any
* wind or unwind time!*/
> http://review.gluster.org/#/c/10170/
> Cherry picked from commit 606d9734543208542afcf9df982bf2d560235ef6
> Change-Id: I209472fb816f939db4a868b97ba053b028f17ea6
> BUG: 1217786
> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
> Reviewed-on: http://review.gluster.org/10170
> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
> Tested-by: Gluster Build System <jenkins@build.gluster.com>
> Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Change-Id: I4a89ef80875f36cff91520f712e1f47fde258a63
BUG: 1219066
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10170
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10614
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fix will solve the heating of the files during the promotion
or demotion.
Promotion:
~~~~~~~~~
When a file gets promoted it get the current time stamp
during creation only, but following writes or reads during the
migration wont heat the file.
Demotion:
~~~~~~~~
When a file gets demoted it get the wind/unwind time stamp is set to
zero. The following writes or reads during the migration wont heat
the file.
What is remaining ?
~~~~~~~~~~~~~~~~~
Bug 1209129 ( https://bugzilla.redhat.com/show_bug.cgi?id=1209129 )
Inspite of this fix there is still a issue remaining, i.e the heat of
the file is not keep intact during a internal rebalance activity i.e
a rebalance within a tier.
Change-Id: I01e82dc226355599732d40e699062cee7960b0a5
BUG: 1207867
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Signed-off-by: Dan Lambright <dlambrig@redhat.com>
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/10080
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glusterfs relies on Linux uuid implementation, which
API is incompatible with most other systems's uuid. As
a result, libglusterfs has to embed contrib/uuid,
which is the Linux implementation, on non Linux systems.
This implementation is incompatible with systtem's
built in, but the symbols have the same names.
Usually this is not a problem because when we link
with -lglusterfs, libc's symbols are trumped. However
there is a problem when a program not linked with
-lglusterfs will dlopen() glusterfs component. In
such a case, libc's uuid implementation is already
loaded in the calling program, and it will be used
instead of libglusterfs's implementation, causing
crashes.
A possible workaround is to use pre-load libglusterfs
in the calling program (using LD_PRELOAD on NetBSD for
instance), but such a mechanism is not portable, nor
is it flexible. A much better approach is to rename
libglusterfs's uuid_* functions to gf_uuid_* to avoid
any possible conflict. This is what this change attempts.
BUG: 1206587
Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a
Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
Reviewed-on: http://review.gluster.org/10017
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
**********************************************************************
ChangeTimeRecorder(CTR) Xlator |
**********************************************************************
ChangeTimeRecorder(CTR) is server side xlator(translator) which sits
just above posix xlator. The main role of this xlator is to record the
access/write patterns on a file residing the brick. It records the
read(only data) and write(data and metadata) times and also count on
how many times a file is read or written. This xlator also captures
the hard links to a file(as its required by data tiering to move
files).
CTR Xlator is the consumer of libgfdb.
To Enable/Disable CTR Xlator:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gluster volume set <volume-name> features.ctr-enabled {on/off}
To Enable/Disable Frequency Counter Recording in CTR Xlator:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gluster volume set <volume-name> features.record-counters {on/off}
Change-Id: I5d3cf056af61ac8e3f8250321a27cb240a214ac2
BUG: 1194753
Signed-off-by: Joseph Fernandes <josferna@redhat.com>
Reviewed-on: http://review.gluster.org/9935
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|