summaryrefslogtreecommitdiffstats
path: root/libglusterfs/src/gfdb/gfdb_data_store_types.h
Commit message (Collapse)AuthorAgeFilesLines
* libglusterfs: remove unused gfdb specific files from repoAmar Tumballi2019-11-181-532/+0
| | | | | | Updates: bz#1193929 Change-Id: Idb98394c51917e9b132aeb32facccd112effe672 Signed-off-by: Amar Tumballi <amarts@gmail.com>
* Land clang-format changesGluster Ant2018-09-121-307/+247
| | | | Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
* All: run codespell on the code and fix issues.Yaniv Kaul2018-07-221-1/+1
| | | | | | | | | | | | Please review, it's not always just the comments that were fixed. I've had to revert of course all calls to creat() that were changed to create() ... Only compile-tested! Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* libglusterfs+changetimerecorder: reduce log noiseJeff Darcy2017-02-101-1/+1
| | | | | | | | | | | | | | | The logging about translator options is so verbose that it significantly slows down scalability tests - sometimes even to the point where it induces timing-related failures. Quiet, please. Change-Id: If0766e2a80746bba586e67e6019ff7084d68b425 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16569 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Milind Changire <mchangir@redhat.com>
* cluster/tier: handle fast demotionsMilind Changire2016-10-191-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Demote files on priority if hi-watermark has been breached and continue to demote until the watermark drops below hi-watermark. Monitor watermark more frequently. Trigger demotion as soon as hi-watermark is breached. Add cluster.tier-emergency-demote-query-limit option to limit number of files returned from the database query for every iteration of tier_migrate_using_query_file(). If watermark hasn't dropped below hi-watermark during the first iteration, the next iteration will be triggered approximately 1 second after tier_demote() returns to the main tiering loop. Update changetimerecorder xlator to handle query for emergency demote mode. Add tier-ctr-interface.h: Move tier and ctr interface specific macros and struct definition from libglusterfs/src/gfdb/gfdb_data_store.h to new header libglusterfs/src/tier-ctr-interface.h Change-Id: If56af78c6c81d37529b9b6e65ae606ba5c99a811 BUG: 1366648 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/15158 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: Adding compaction option for metadata databasesDiogenes Nunez2016-09-041-33/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: As metadata in the database fills up, querying the database take a long time. As a result, tier migration slows down. To counteract this, we added a way to enable the compaction methods of the underlying database. The goal is to reduce the size of the underlying file by eliminating database fragmentation. NOTE: There is currently a bug where sometimes a brick will attempt to activate compaction. This happens even compaction is already turned on. The cause is narrowed down to the compact_mode_switch flipping its value. Changes: libglusterfs/src/gfdb - Added a gfdb function to compact the underlying database, compact_db() This is a no-op if the database has no such option. - Added a compaction function for SQLite3 that does the following 1) Changes the auto_vacuum pragma of the database 2) Compacts the database according to the type of compaction requested - Compaction type can be changed by changing the macro GF_SQL_COMPACT_DEF to one of the 4 compaction types in gfdb_sqlite3.h It is currently set to GF_SQL_COMPACT_INCR, or incremental vacuuming. xlators/cluster/dht/src - Added the following command-line option to enable SQLite3 compaction. gluster volume set <vol-name> tier-compact <off|on> - Added the following command-line option to change the frequency the hot and cold tier are ordered to compact. gluster volume set <vol-name> tier-hot-compact-frequency <int> gluster volume set <vol-name> tier-cold-compact-frequency <int> - tier daemon periodically sends the (new) GFDB_IPC_CTR_SET_COMPACT_PRAGMA IPC to the CTR xlator. The IPC triggers compaction of the database. The inputs are both gf_boolean_t. IPC Input: compact_active: Is compaction currently on for the db. compact_mode_switched: Did we flip the compaction switch recently? IPC Output: 0 if the compaction succeeds. Non-zero otherwise. xlators/features/changetimerecorder/src/ - When the CTR gets the compaction IPC, it launches a thread that will perform the compaction. The IPC ends after the thread is launched. To avoid extra allocations, the parameters are passed using static variables. Change-Id: I5e1433becb9eeff2afe8dcb4a5798977bf5ba0dd Signed-off-by: Diogenes Nunez <dnunez@redhat.com> Reviewed-on: http://review.gluster.org/15031 Reviewed-by: Milind Changire <mchangir@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org>
* ctr/sql: Providing for vol set for sqlcachesize and sqlWALsize and skip ↵Joseph Fernandes2015-12-221-8/+11
| | | | | | | | | | | | | | | | | | | | recording path 1. Providing vol set option for cache size and wal autocheck point so that performance can be tuned. 2. Removed recording of file path in the db. Trimming database columns. Path need not be stored in the db, as PARGFID, GFID, Basename is suffice to derive the path during migration. Change-Id: I2cb590451a6d244bc91fe66c6dbffe2c2059dfb8 BUG: 1293034 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12972 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/ctr: Correcting rename logicJoseph Fernandes2015-11-251-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: When a file with old_file_name and GFID_1 is renamed with a new_file_name which already exists and with GFID_2, this is what happens in linux internaly. a. "new_file_name" is unlinked for GFID_2 b. a hardlink "new_file_name" is created to GFID_1 c. "old_file_name" hardlink is unlinked for GFID_2. Well this is all internal to linux, and gluster just issues a rename system call at POSIX layer. But CTR Xlator doesn't delete the entries corresponding to the "new_file_name" and GFID_2. Thus leaving the stale entry in the DB. The following are the implications. a. Promotion are tried on these stale entries which will fail and show false results in the status of migration, b. GFID_2 Files with 2 hardlinks, which will have only one hardlink after the rename will not be promoted or demoted as the DB shows 2 entries. Solution: Delete the older database entry for the replaced hardlink Change-Id: I4eafa0872253e29ff1f0bec4283bcfc579ecf0e2 BUG: 1284090 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12711 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/libgfdb: Replacing ASCII query file with binaryJoseph Fernandes2015-11-061-173/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Earlier, when the database was queried we used to save all the queried records in an ASCII format in the query file. This caused issues like filename having ASCII delimiter and used to take a lot of space. The tier.c file also had a lot of parsing code. Here we changed the format of the query file to binary. All the logic of serialization and formating of query record is done by libgfdb. Libgfdb provides API, gfdb_write_query_record() and gfdb_read_query_record(), which the user i.e tier migrator and CTR xlator can use to write to and read from query file. With this binary format we save on disk space i.e reduce to 50% atleast as we are saving GFID's in binary format 16 bytes and not the string format which takes 36 bytes + We are not saving path of the file + we are also saving on ASCII delimiters. The on disk format of query record is as follows, +---------------------------------------------------------------------------+ | Length of serialized query record | Serialized Query Record | +---------------------------------------------------------------------------+ 4 bytes Length of serialized query record | | -------------------------------------------------| | | V Serialized Query Record Format: +---------------------------------------------------------------------------+ | GFID | Link count | <LINK INFO> |..... | FOOTER | +---------------------------------------------------------------------------+ 16 B 4 B Link Length 4 B | | | | -----------------------------| | | | | | V | Each <Link Info> will be serialized as | +-----------------------------------------------+ | | PGID | BASE_NAME_LENGTH | BASE_NAME | | +-----------------------------------------------+ | 16 B 4 B BASE_NAME_LENGTH | | | ------------------------------------------------------------------------| | | V FOOTER is a magic number 0xBAADF00D indicating the end of the record. This also serves as a serialized schema validator. Change-Id: I9db7416fd421e118dd44eafab8b535caafe50d5a BUG: 1272207 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12354 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/ctr: Solution for db locks for tier migrator and ctr using sqlite ↵Joseph Fernandes2015-10-081-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | version less than 3.7 i.e rhel 6.7 Problem: On RHEL 6.7, we have sqlite version 3.6.2 which doesnt support WAL journaling mode, as this journaling mode is only available in sqlite 3.7 and above. As a result we cannot have to progreses concurrently accessing sqlite, without running into db locks! Well WAL is also need for performace on CTR side. Solution: This solution is to use CTR db connection for doing queries when WAL mode is absent. i,e tier migrator will send sync_op ipc calls to CTR, which in turn will do the query and create/update the query file suggested by tier migrator. Pending: Well this solution will stop the db locks but the performance is still an issue for CTR. We are developing an in-Memory Transaction Log (iMeTaL) which will help boost the CTR performance by doing in memory udpates on the IO path and later flush the updates to the db in a batch/segment flush. Change-Id: Ie3149643ded159234b5cc6aa6cf93b9022c2f124 BUG: 1240577 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12191 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Luis Pabon <lpabon@redhat.com>
* tier/ctr: Solving DB Lock issue due to write contention from db connectionsJoseph Fernandes2015-09-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Problem: The DB on the brick is been accessed by CTR, for write and tier migrator, for read and write. The write from tier migrator is reseting the heat counters after a cycle. Since we are using sqlite, two connections trying to write would cause a db lock contention. As a result CTR used to fail to update the db. Solution: Using the same db connection of CTR for reseting the heat counters. 1) Introducted a new IPC FOP for CTR 2) After the query do a ipc syncop to the underlying client xlator associated to the brick. 3) CTR in brick will catch the IPC FOP and cleat the heat counters. Change-Id: I53306bfc08dcdba479deb4ccc154896521336150 BUG: 1260730 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12031 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: fix 64 bit issue with sql query using timesDan Lambright2015-08-131-2/+2
| | | | | | | | | | | | | We overflowed when converting seconds to usecs in preperation for sql queries. The fix uses uint64_t throughout including subexpressions. Change-Id: I59bdb742197400dede97f54735b52030920b0d19 BUG: 1231268 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/11885 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes
* tier/ctr: Ignore creation of T file and Ctr Lookup heal improvememntsJoseph Fernandes2015-06-261-2/+13
| | | | | | | | | | | | | | | | | | 1) Ignore creation of T file in ctr_mknod 2) Ignore lookup for T file in ctr_lookup 3) Ctr_lookup: a. If the gfid and pgfid in empty dont record b. Decreased log level for multiple heal attempts c. Inode/File heal happens after an expiry period, which is configurable. d. Hardlink heal happens after an expiry period, which is configurable. Change-Id: Id8eb5092e78beaec22d05f5283645081619e2452 BUG: 1235269 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/11334 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* gfdb/libglusterfs : Porting to a new logging frameworkMohamed Ashiq2015-06-101-12/+16
| | | | | | | | | | Change-Id: I61874561fdf2c175d2b3eec0c85c25f12dc60552 BUG: 1194640 Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com> Reviewed-on: http://review.gluster.org/10819 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes
* build: do not #include "config.h" in each fileNiels de Vos2015-05-291-5/+0
| | | | | | | | | | | | | | | | | | Instead of including config.h in each file, and have the additional config.h included from the compiler commandline (-include option). When a .c file tests for a certain #define, and config.h was not included, incorrect assumtions were made. With this change, it can not happen again. BUG: 1222319 Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/10808 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* CTR/Libgfdb: Log typo fixJoseph Fernandes2015-05-101-2/+2
| | | | | | | | | | | | | | Log typo fix for CTR Xlator and Libgfdb Change-Id: I8d7208b9756199f8dc0a72771a3c953b4fa00f3f BUG: 1215896 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10668 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* ctr/libgfdb: Performance enhancer for unlink/create/rename/link fopsJoseph Fernandes2015-05-011-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | 1) ctr_link_consistency option for ctr xaltor is provided so that the user can choose to switch it on or off. /* For link consistency we do a double update i.e mark the link * during the wind and during the unwind we update/delete the link. * This has a performance hit. We give a choice here whether we need * link consistency to be spoton or not using link_consistency flag. * This will have only one link update */ 2) In delete the wind time recording is moved to unwind path. /* Special performance case: * Updating wind time in unwind for delete. This is done here * as in the wind path we will not know whether its the last * link or not. For a last link there is not use to update any * wind or unwind time!*/ Change-Id: I209472fb816f939db4a868b97ba053b028f17ea6 BUG: 1217786 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10170 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* build: make contrib/uuid dependency optionalNiels de Vos2015-04-101-0/+1
| | | | | | | | | | | | | | | | | | | On Linux systems we should use the libuuid from the distribution and not bundle and statically link the contrib/uuid/ bits. libglusterfs/src/compat-uuid.h has been introduced and should become an abstraction layer for different UUID APIs. Non-Linux operating systems should implement their compatibility layer there. Once all operating systems have an implementation in compat-uuid.h, we can remove contrib/uuid/ from the repository completely. Change-Id: I345e5357644be2521685e00358bb8c83c4ea0577 BUG: 1206587 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/10129 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* ctr : Fix for heating of files during promotion/demotionJoseph Fernandes2015-04-081-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fix will solve the heating of the files during the promotion or demotion. Promotion: ~~~~~~~~~ When a file gets promoted it get the current time stamp during creation only, but following writes or reads during the migration wont heat the file. Demotion: ~~~~~~~~ When a file gets demoted it get the wind/unwind time stamp is set to zero. The following writes or reads during the migration wont heat the file. What is remaining ? ~~~~~~~~~~~~~~~~~ Bug 1209129 ( https://bugzilla.redhat.com/show_bug.cgi?id=1209129 ) Inspite of this fix there is still a issue remaining, i.e the heat of the file is not keep intact during a internal rebalance activity i.e a rebalance within a tier. Change-Id: I01e82dc226355599732d40e699062cee7960b0a5 BUG: 1207867 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10080 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Avoid conflict between contrib/uuid and system uuidEmmanuel Dreyfus2015-04-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | glusterfs relies on Linux uuid implementation, which API is incompatible with most other systems's uuid. As a result, libglusterfs has to embed contrib/uuid, which is the Linux implementation, on non Linux systems. This implementation is incompatible with systtem's built in, but the symbols have the same names. Usually this is not a problem because when we link with -lglusterfs, libc's symbols are trumped. However there is a problem when a program not linked with -lglusterfs will dlopen() glusterfs component. In such a case, libc's uuid implementation is already loaded in the calling program, and it will be used instead of libglusterfs's implementation, causing crashes. A possible workaround is to use pre-load libglusterfs in the calling program (using LD_PRELOAD on NetBSD for instance), but such a mechanism is not portable, nor is it flexible. A much better approach is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any possible conflict. This is what this change attempts. BUG: 1206587 Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10017 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* Adding Libgfdb to GlusterFSJoseph Fernandes2015-03-181-0/+733
************************************************************************* Libgfdb | ************************************************************************* Libgfdb provides abstract mechanism to record extra/rich metadata required for data maintenance, such as data tiering/classification. It provides consumer with API for recording and querying, keeping the consumer abstracted from the data store used beneath for storing data. It works in a plug-and-play model, where data stores can be plugged-in. Presently we have plugin for Sqlite3. In the future will provide recording and querying performance optimizer. In the current implementation the schema of metadata is fixed. Schema: ~~~~~~ GF_FILE_TB Table: ~~~~~~~~~~~~~~~~~ This table has one entry per file inode. It holds the metadata required to make decisions in data maintenance. GF_ID (Primary key) : File GFID (Universal Unique IDentifier in the namespace) W_SEC, W_MSEC : Write wind time in sec & micro-sec UW_SEC, UW_MSEC : Write un-wind time in sec & micro-sec W_READ_SEC, W_READ_MSEC : Read wind time in sec & micro-sec UW_READ_SEC, UW_READ_MSEC : Read un-wind time in sec & micro-sec WRITE_FREQ_CNTR INTEGER : Write Frequency Counter READ_FREQ_CNTR INTEGER : Read Frequency Counter GF_FLINK_TABLE: ~~~~~~~~~~~~~~ This table has all the hardlinks to a file inode. GF_ID : File GFID (Composite Primary Key)``| GF_PID : Parent Directory GFID (Composite Primary Key) |-> Primary Key FNAME : File Base Name (Composite Primary Key)__| FPATH : File Full Path (Its redundant for now, this will go) W_DEL_FLAG : This Flag is used for crash consistancy, when a link is unlinked. i.e Set to 1 during unlink wind and during unwind this record is deleted LINK_UPDATE : This Flag is used when a link is changed i.e rename. Set to 1 when rename wind and set to 0 in rename unwind Libgfdb API: ~~~~~~~~~~~ Refer libglusterfs/src/gfdb/gfdb_data_store.h Change-Id: I2e9fbab3878ce630a7f41221ef61017dc43db11f BUG: 1194753 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/9683 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>