path: root/xlators/cluster/dht/src/tier.c
Commit message | Author | Age | Files | Lines
...
* cluster/tier: Files skipped during tier query parsing | N Balachandran | 2015-11-03 | 1 | -2/+2
    The tier query parsing code was using fscanf to read each record. As space is a delimiter for fscanf, filenames containing spaces caused the parsing to return unexpected values, causing various issues in the tier process, including crashes due to buffer overflows.

    Change-Id: Ife602cb7ecb158fccbc2c89e4d2959bd97098a87
    BUG: 1276562
    Signed-off-by: N Balachandran <nbalacha@redhat.com>
    Reviewed-on: http://review.gluster.org/12469
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
    Tested-by: Dan Lambright <dlambrig@redhat.com>
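The failure mode in the commit above comes from `%s` in fscanf stopping at the first whitespace character. The following is a minimal sketch of one way to avoid it, not the actual patch: it assumes a hypothetical text record of the form `<gfid> <filename>` (the real query file layout is not shown in this log) and splits only at the first space, with explicit bounds checks. The name `parse_record` is illustrative.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record layout: "<gfid> <file name, may contain spaces>\n".
 * Returns 0 on success, -1 on malformed input or if a field does not fit. */
int parse_record(const char *line, char *gfid, size_t gfid_sz,
                 char *name, size_t name_sz)
{
    assert(line != NULL && gfid != NULL && name != NULL);

    /* Split at the FIRST space only; later spaces belong to the name. */
    const char *sep = strchr(line, ' ');
    if (sep == NULL)
        return -1;

    size_t gfid_len = (size_t)(sep - line);
    size_t name_len = strcspn(sep + 1, "\n");   /* drop a trailing newline */

    /* Refuse fields that would overflow the caller's buffers. */
    if (gfid_len == 0 || gfid_len >= gfid_sz)
        return -1;
    if (name_len == 0 || name_len >= name_sz)
        return -1;

    memcpy(gfid, line, gfid_len);
    gfid[gfid_len] = '\0';
    memcpy(name, sep + 1, name_len);
    name[name_len] = '\0';
    return 0;
}
```

A caller would typically read each record with fgets into a line buffer and pass that buffer to `parse_record`, rejecting the record if it returns -1.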
* core: use syscall wrappers instead of direct syscalls - miscellaneous | Kaleb S. KEITHLEY | 2015-10-28 | 1 | -5/+6
    Various xlators and other components are invoking system calls directly instead of using the libglusterfs/syscall.[ch] wrappers.

    If not using the system call wrappers there should be a comment in the source explaining why the wrapper isn't used.

    Change-Id: I1f47820534c890a00b452fa61f7438eb2b3f667c
    BUG: 1267967
    Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
    Reviewed-on: http://review.gluster.org/12276
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/tier do not log error message on lookup heal for files on hot tier | Dan Lambright | 2015-10-28 | 1 | -13/+14
    On fix-layout heal, files are scanned. Files found exist on either the hot or the cold subvolume. Those not found in the cold tier would exist on the hot tier. They should not be flagged as an error.

    Replace INFO with TRACE for common tier migration logs. Frequent migration was growing the log files too quickly.

    On migration failures, do not accrue files towards the cycle limit's budget.

    Change-Id: Ie832ee07c43bce5477ae81c939d1fe8416a11615
    BUG: 1275383
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/12430
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Joseph Fernandes
* cluster/tier: add pause tier for snapshots | Dan Lambright | 2015-10-21 | 1 | -1/+67
    Snaps of tiered volumes cannot handle files undergoing migration. We implement a helper mechanism to "pause" migration. Any files undergoing migration are aborted. Clean up is done to remove sticky bits and data at the destination. Migration is restarted after the snap completes.

    For testing, an internal switch is added. It is not exposed externally.

    gluster volume set vol1 tier-pause [true|false]

    Change-Id: Ia85bbf89ac142e9b7e73fcbef98bb9da86097799
    BUG: 1267950
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/12304
    Reviewed-by: N Balachandran <nbalacha@redhat.com>
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier do not abort migration if a single brick is down | Dan Lambright | 2015-10-20 | 1 | -12/+12
    When bricks are down, promotion/demotion should still be possible. For example, if an EC brick is down, the other bricks are able to recover the data and migrate it.

    Change-Id: I8e650c640bce22a3ad23d75c363fbb9fd027d705
    BUG: 1273215
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/12397
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Joseph Fernandes
* cluster/tier remove spurious log messages on valid failed migration | Dan Lambright | 2015-10-19 | 1 | -1/+8
    On a write to a replica volume, we record an entry in every brick's database. When the tier daemon runs, it will only move the file if it is the true owner of the file as defined by the XATTR_NODE_UUID_KEY.

    Change-Id: Ib82717f87a3f94f3d0d9f969773de9e88d6aaf22
    BUG: 1273043
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/12391
    Reviewed-by: Joseph Fernandes
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier: add watermarks and policy driver | Dan Lambright | 2015-10-10 | 1 | -88/+383
    This fix introduces infrastructure to support different policies for promotion and demotion. Currently the tier feature automatically promotes and demotes files periodically based on access. This is good for testing but too stringent for most real workloads. It makes it difficult to fully utilize a hot tier: data will be demoted before it is touched, and it's unlikely a 100GB hot SSD will have all its data touched in a window of time.

    A new parameter "mode" allows the user to pick promotion/demotion policies.

    The "test mode" will be used for *.t and other general testing. This is the current mechanism.

    The "cache mode" introduces watermarks. The watermarks represent levels of data residing on the hot tier.

    "cache mode" policy:

    The % the hot tier is full is called P. Do not promote or demote more than D MB or F files. A random number [0-100] is called R.

    Rules for migration:
    if (P < watermark_low) don't demote, always promote.
    if (P >= watermark_low) && (P < watermark_hi) demote if R < P; promote if R > P.
    if (P > watermark_hi) always demote, don't promote.

    gluster volume set {vol} cluster.watermark-hi %
    gluster volume set {vol} cluster.watermark-low %
    gluster volume set {vol} cluster.tier-max-mb {D}
    gluster volume set {vol} cluster.tier-max-files {F}
    gluster volume set {vol} cluster.tier-mode {test|cache}

    Change-Id: I157f19667ec95aa1d53406041c1e3b073be127c2
    BUG: 1257911
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/12039
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
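The "cache mode" rules in the commit above can be sketched as a small decision function. This is an illustration under stated assumptions, not the code from the patch: the type `tier_decision_t` and the function `cache_mode_decide` are hypothetical names, the D MB and F file limits are not modeled, and the case P == watermark_hi (which the rules leave unspecified) is treated here like P > watermark_hi.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type; only the rules come from the commit message. */
typedef struct {
    bool promote;
    bool demote;
} tier_decision_t;

/* p: percent of the hot tier in use (0-100); r: random number in [0, 100]. */
tier_decision_t cache_mode_decide(int p, int r,
                                  int watermark_low, int watermark_hi)
{
    assert(watermark_low <= watermark_hi);
    tier_decision_t d = { false, false };

    if (p < watermark_low) {
        d.promote = true;        /* always promote, never demote */
    } else if (p < watermark_hi) {
        d.demote  = (r < p);     /* fuller hot tier: demotion more likely */
        d.promote = (r > p);     /* emptier hot tier: promotion more likely */
    } else {
        /* Assumption: p == watermark_hi behaves like p > watermark_hi. */
        d.demote = true;         /* always demote, never promote */
    }
    return d;
}
```

Between the two watermarks the random draw makes the probability of demotion grow with P, so the hot tier drifts toward a fill level inside the band rather than oscillating at one edge.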
* tier/ctr: Solution for db locks for tier migrator and ctr using sqlite version less than 3.7 i.e rhel 6.7 | Joseph Fernandes | 2015-10-08 | 1 | -15/+286
    Problem: On RHEL 6.7, we have sqlite version 3.6.2, which doesn't support WAL journaling mode, as this journaling mode is only available in sqlite 3.7 and above. As a result we cannot have two processes concurrently accessing sqlite without running into db locks. WAL is also needed for performance on the CTR side.

    Solution: The solution is to use the CTR db connection for doing queries when WAL mode is absent, i.e. the tier migrator will send sync_op ipc calls to CTR, which in turn will do the query and create/update the query file suggested by the tier migrator.

    Pending: This solution will stop the db locks, but performance is still an issue for CTR. We are developing an in-Memory Transaction Log (iMeTaL) which will help boost CTR performance by doing in-memory updates on the IO path and later flushing the updates to the db in a batch/segment flush.

    Change-Id: Ie3149643ded159234b5cc6aa6cf93b9022c2f124
    BUG: 1240577
    Signed-off-by: Joseph Fernandes <josferna@redhat.com>
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Signed-off-by: Joseph Fernandes <josferna@redhat.com>
    Reviewed-on: http://review.gluster.org/12191
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Luis Pabon <lpabon@redhat.com>
* cluster/tier do not flag migration error on already migrated file | Dan Lambright | 2015-09-16 | 1 | -15/+13
    In some cases a brick will try to migrate a file that has already been migrated. This is a legal case, e.g. when both bricks are replica pairs.

    Change-Id: If2578b947014cbbdfb3c6591db9044d6b1d92774
    BUG: 1263726
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/12185
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Joseph Fernandes
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier: Fixed a crash in tiering | Nithya Balachandran | 2015-09-16 | 1 | -2/+2
    An incorrect check was causing the arguments to the promote thread to be cleared before the thread was done with them. This caused the process to crash when it tried to dereference a NULL pointer.

    Change-Id: I8348309ef4dad33b7f648c7a2c2703487e401269
    BUG: 1263204
    Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
    Reviewed-on: http://review.gluster.org/12179
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-by: Joseph Fernandes
* cluster/tier fix bug with sql includes introduced by 12031 | Dan Lambright | 2015-09-11 | 1 | -2/+3
    We accidentally introduced a bug where client translators have a dependency on sql. This broke the FreeBSD smoke tests. The fix is to abstract those dependencies away from the client.

    Change-Id: I7152573a489bacc8f32e6eb139f9ff4408288f5b
    BUG: 1260730
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/12155
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* tier/ctr: Solving DB Lock issue due to write contention from db connections | Joseph Fernandes | 2015-09-08 | 1 | -43/+96
    Problem: The DB on the brick is accessed by CTR for write and by the tier migrator for read and write. The write from the tier migrator resets the heat counters after a cycle. Since we are using sqlite, two connections trying to write would cause db lock contention. As a result CTR used to fail to update the db.

    Solution: Use the same db connection of CTR for resetting the heat counters.
    1) Introduced a new IPC FOP for CTR.
    2) After the query, do an ipc syncop to the underlying client xlator associated with the brick.
    3) CTR in the brick will catch the IPC FOP and clear the heat counters.

    Change-Id: I53306bfc08dcdba479deb4ccc154896521336150
    BUG: 1260730
    Signed-off-by: Joseph Fernandes <josferna@redhat.com>
    Reviewed-on: http://review.gluster.org/12031
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
    Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: avoid filling /var/run with tiering files | Dan Lambright | 2015-09-02 | 1 | -4/+28
    We failed to delete old promote/demote workfiles in /var/run. This fix removes the <pid> postfix so there will be only a single pair of files.

    Change-Id: Ib9aafe7b4a9d4b0c05cf03a94cc1057a423a27d2
    BUG: 1253970
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/11931
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: N Balachandran <nbalacha@redhat.com>
* all: reduce "inline" usage | Jeff Darcy | 2015-09-01 | 1 | -2/+2
    There are three kinds of inline functions: plain inline, extern inline, and static inline. All three have been removed from .c files, except those in "contrib" which aren't our problem. Inlines in .h files, which are overwhelmingly "static inline" already, have generally been left alone. Over time we should be able to "lower" these into .c files, but that has to be done in a case-by-case fashion requiring more manual effort. This part was easy to do automatically without (as far as I can tell) any ill effect.

    In the process, several pieces of dead code were flagged by the compiler, and were removed.

    Change-Id: I56a5e614735c9e0a6ee420dab949eac22e25c155
    BUG: 1245331
    Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
    Reviewed-on: http://review.gluster.org/11769
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
    Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
    Reviewed-by: Venky Shankar <vshankar@redhat.com>
* cluster/tier: Use dht_* versions for xlator_fops | N Balachandran | 2015-08-18 | 1 | -16/+28
    The tier xlator was using the default_* versions for some xlator_fops. Changed to use the dht_* versions for all xlator_fops.

    Change-Id: I8252fb3911b8a48a55e9eee42b89bd66bbacf799
    BUG: 1254451
    Signed-off-by: N Balachandran <nbalacha@redhat.com>
    Reviewed-on: http://review.gluster.org/11948
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
    Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/libgfdb: Setting Freq counters of un-selected files to zero | Joseph Fernandes | 2015-08-12 | 1 | -2/+2
    Change Time Recorder increments the write/read frequency counters on a read or write of a file, if "features.record-counters" is "on". It is the responsibility of the tiering migrator to reset these counters to zero for un-selected files, as frequency counters are a function of promotion/demotion cycles.

    If the counters are not set to zero, then
    1) the counters may overflow in the DB;
    2) the file may be wrongly promoted or demoted.

    This fix will reset the freq counters of un-selected files to zero after the promotion/demotion frequency period.

    Change-Id: Ideea2c76a52d421a7e67c37fb0c823f552b3da7a
    BUG: 1242504
    Signed-off-by: Joseph Fernandes <josferna@redhat.com>
    Reviewed-on: http://review.gluster.org/11648
    Tested-by: Joseph Fernandes
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: fix demotion when cold tier is EC | Dan Lambright | 2015-08-12 | 1 | -0/+2
    We did not set the gfid in the loc structure in tier demotion. EC has a sanity check which fails FOPs when the loc gfid mismatches with the file attribute. When the FOP failed, demotion was aborted.

    Change-Id: I69022c9ccb135b86e1feea93b01801b6a4100509
    BUG: 1251121
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/11855
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: mohammed rafi kc <rkavunga@redhat.com>
    Reviewed-by: Venky Shankar <vshankar@redhat.com>
* cluster/tier: stop tier migration after graph switch | Dan Lambright | 2015-06-26 | 1 | -0/+16
    On a graph switch, a new xlator and private structures are created. The tier migration daemon must stop using the old xlator and private structures and begin using the new ones. Otherwise, when RPCs arrive (such as counter queries from glusterd), the new xlator will be consulted but it will not have up-to-date information. The fix detects a graph switch and exits the daemon in this case. Typical graph switches for the tier case would be turning off performance translators.

    Change-Id: Ibfbd4720dc82ea179b77c81b8f534abced21e3c8
    BUG: 1226005
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/11372
* dht: Adding log messages to the new logging framework | arao | 2015-06-23 | 1 | -15/+18
    Change-Id: Ib3bb61c5223f409c23c68100f3fe884918d2dc3f
    BUG: 1194640
    Signed-off-by: arao <arao@redhat.com>
    Reviewed-on: http://review.gluster.org/10021
    Reviewed-by: N Balachandran <nbalacha@redhat.com>
    Reviewed-by: Joseph Fernandes
    Tested-by: Joseph Fernandes
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    Tested-by: Raghavendra G <rgowdapp@redhat.com>
* tier/dht: Fixing non atomic promotion/demotion w.r.t to frequency period | Joseph Fernandes | 2015-06-11 | 1 | -40/+59
    This fixes the ping-pong issue, i.e. files getting demoted immediately after promotion, caused by off-sync promotion/demotion processes. The solution is to do promotion/demotion referring to the system time. For the fix to work, all the file serving nodes should have their system time synchronized with each other, either manually or using an NTP server.

    NOTE: The ping-pong issue can re-appear even with this fix if the admin sets different promotion and demotion frequency periods, but this would be under the control of the admin.

    Change-Id: I1b33a5881d0cac143662ddb48e5b7b653aeb1271
    BUG: 1218717
    Signed-off-by: Joseph Fernandes <josferna@redhat.com>
    Reviewed-on: http://review.gluster.org/11110
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
    Tested-by: Dan Lambright <dlambrig@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
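One way to tie cycles to the system time, as the commit above describes, is to run a cycle only when the wall-clock second is a multiple of the frequency period. The sketch below assumes that approach; the helper name `cycle_is_due` is hypothetical and the log does not show the actual check.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: a cycle is due when the wall-clock second is an
 * exact multiple of the frequency period (in seconds). */
bool cycle_is_due(time_t now, int freq_seconds)
{
    assert(freq_seconds > 0);
    return ((long long)now % freq_seconds) == 0;
}
```

Because every node evaluates the same predicate against a synchronized clock, all nodes start a cycle at the same instant. With different promotion and demotion periods, say 60 and 120 seconds, the two kinds of cycle coincide only at common multiples, which is the case the NOTE in the commit warns about.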
* cluster/tier: account for reordered layouts | Dan Lambright | 2015-06-11 | 1 | -13/+30
    For a tiered volume the cold subvolume is always at a fixed position in the graph. DHT's layout array, on the other hand, may have the cold subvolume in either the first or second index, therefore code cannot make any assumptions. The fix searches the layout for the correct position dynamically rather than statically.

    The bug manifested itself in NFS, in which a newly attached subvolume had not received an existing directory. This case is a "stale entry" and marked as such in the layout for that directory. The code did not see this, because it looked at the wrong index in the layout array.

    The fix also adds the check for decommissioned bricks, and fixes a problem in detach tier related to starting the rebalance process: we never received the right defrag command and it did not get directed to the tier translator.

    Change-Id: I77cdf9fbb0a777640c98003188565a79be9d0b56
    BUG: 1214289
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
    Reviewed-by: Joseph Fernandes <josferna@redhat.com>
    Reviewed-by: Mohammed Rafi KC <rkavunga@redhat.com>
    Reviewed-on: http://review.gluster.org/11092
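The dynamic search described in the commit above amounts to looking up a subvolume in the layout array by identity instead of by a fixed index. The sketch below uses simplified, hypothetical stand-ins (`layout_entry_t`, `layout_find`) for DHT's real layout structures, which are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one entry of a DHT layout. */
typedef struct {
    const void *subvol;   /* which subvolume this entry describes */
    int         err;      /* 0 if healthy, otherwise an errno-style code */
} layout_entry_t;

/* Returns the index of the entry for `subvol`, or -1 if it is absent. */
int layout_find(const layout_entry_t *list, size_t cnt, const void *subvol)
{
    assert(list != NULL || cnt == 0);
    for (size_t i = 0; i < cnt; i++) {
        if (list[i].subvol == subvol)
            return (int)i;
    }
    return -1;
}
```

A caller would check the `err` field of the entry that `layout_find` returns, rather than assuming the cold subvolume is at index 0 or 1.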
* tier/volume set: Validate volume set option for tier | Mohammed Rafi KC | 2015-06-10 | 1 | -0/+6
    Volume set options related to tiering can only be set on a tier volume. Also, currently all volume set options for tier accept a non-negative integer. This patch validates both conditions.

    Change-Id: I3611af048ff4ab193544058cace8db205ea92336
    BUG: 1216960
    Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/10751
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Joseph Fernandes
* tiering: static function called from a non static inline function | Mohammed Rafi KC | 2015-06-02 | 1 | -2/+2
    gcc v5.1.1 throws a warning for calling a static function from a non-static inline function.

    <snippet from compiler warning>
    CC tier.lo
    tier.c:610:15: warning: 'tier_migrate_using_query_file' is static but used in inline function 'tier_migrate_files_using_qfile' which is not static
    ret = tier_migrate_using_query_file ((void *)query_cbk_args);
    ^
    tier.c:585:47: warning: 'tier_process_brick_cbk' is static but used in inline function 'tier_build_migration_qfile' which is not static
    ret = dict_foreach (args->brick_list, tier_process_brick_cbk,
    ^
    tier.c:565:176: warning: 'demotion_qfile' is static but used in inline function 'tier_build_migration_qfile' which is not static
    tier.c:565:158: warning: 'promotion_qfile' is static but used in inline function 'tier_build_migration_qfile' which is not static
    tier.c:563:58: warning: 'demotion_qfile' is static but used in inline function 'tier_build_migration_qfile' which is not static
    tier.c:563:40: warning: 'promotion_qfile' is static but used in inline function 'tier_build_migration_qfile' which is not static
    ret = remove (GET_QFILE_PATH (is_promotion));
    ^
    CCLD tier.la
    </snip>

    Change-Id: I46046feeb79ab4e2724b0ba6b02c9ec8b121ff4e
    BUG: 1226881
    Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
    Reviewed-on: http://review.gluster.org/11032
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
    Reviewed-by: Anoop C S <achiraya@redhat.com>
    Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* build: do not #include "config.h" in each file | Niels de Vos | 2015-05-29 | 1 | -4/+0
    Instead of including config.h in each file, have config.h included from the compiler command line (-include option).

    When a .c file tests for a certain #define, and config.h was not included, incorrect assumptions were made. With this change, it cannot happen again.

    BUG: 1222319
    Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64
    Signed-off-by: Niels de Vos <ndevos@redhat.com>
    Reviewed-on: http://review.gluster.org/10808
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Tested-by: NetBSD Build System
    Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* tiering/rebalance: Use separate pid/socket file for tiering | Mohammed Rafi KC | 2015-05-28 | 1 | -2/+3
    When the promotion/demotion daemon starts, it uses the same pidfile as rebalance. This patch introduces a different pid file for it.

    Change-Id: Ic484c53f51e00ae6b2d697748a9600b14829e23b
    BUG: 1221970
    Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
    Reviewed-on: http://review.gluster.org/10792
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Tested-by: NetBSD Build System
* xlators/cluster/dht: Fix Explicit null dereferenced (CID 1291727). | Günther Deschner | 2015-05-28 | 1 | -1/+1
    Coverity CID 1291727.

    Guenther

    Change-Id: I95f01b638f74370f0ef04383f0f9d5799abe31f5
    BUG: 789278
    Signed-off-by: Guenther Deschner <gd@samba.org>
    Reviewed-on: http://review.gluster.org/10300
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier: load libgfdb.so properly in all cases | Dan Lambright | 2015-05-16 | 1 | -1/+1
    We should load libgfdb.so.0, not libgfdb.so.

    Change-Id: I7a0d64018ccd9893b1685de391e99b5392bd1879
    BUG: 1222092
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/10796
    Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
    Reviewed-by: Joseph Fernandes
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: add counter support for tiered volumes | Dan Lambright | 2015-05-08 | 1 | -0/+6
    This fix adds support to view the number of promoted or demoted files from the cli. The mechanism is isomorphic to checking the status of volumes being rebalanced.

    gluster volume rebalance <vol> tier status

    Change-Id: I1b11ca27355ceec36c488967c23531202030e205
    BUG: 1213063
    Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/10292
    Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* dht/tier/rebalancer: Fix reset of tiering client pid | Joseph Fernandes | 2015-05-05 | 1 | -1/+1
    In the patch http://review.gluster.org/#/c/9657 the client pid set by tiering migration was getting overwritten in dht_start_rebalance_task(). Corrected it by setting the pid in dht_setxattr() before calling dht_start_rebalance_task(), and removed it from dht_start_rebalance_task().

    Change-Id: I37cfa111f83a4e5d498042575c93799f60b49870
    BUG: 1217937
    Signed-off-by: Joseph Fernandes <josferna@redhat.com>
    Reviewed-on: http://review.gluster.org/10502
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Susant Palai <spalai@redhat.com>
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: don't use hot tier until subvolumes ready | Dan Lambright | 2015-05-05 | 1 | -0/+3
    When we attach a tier, the hot tier becomes the hashed subvolume. But directories may not yet have been replicated by the fix layout process. Hence lookups to those directories will fail on the hot subvolume. We should only go to the hashed subvolume once the layout has been fixed. This is known if the layout for the parent directory does not have an error. If there is an error, the cold tier is considered the hashed subvolume. The exception to this rule is ENOCON, in which case we do not know where the file is and must abort.

    Note we may revalidate a lookup for a directory even if the inode has not yet been populated by FUSE. This case can happen in tiering (where one tier has completed a lookup but the other has not, in which case we revalidate one tier when we call lookup the second time). Such inodes are still invalid and should not be consulted for validation.

    Change-Id: Ia2bc62e1d807bd70590bd2a8300496264d73c523
    BUG: 1214289
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/10435
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    Reviewed-by: N Balachandran <nbalacha@redhat.com>
* tier: relax libgfdb required version number | Emmanuel Dreyfus | 2015-05-03 | 1 | -1/+1
    When calling dlopen() for libgfdb, do not specify the library version number "libgfdb.so.0.0.1", since libtool will not always create libraries or link with that name with the full 3-digit version. For instance on NetBSD only up to the 2-digit version is available and "libgfdb.so.0.0.1" does not exist.

    Instead, just specify "libgfdb.so" and rely on symlinks installed by libtool to find the relevant library.

    BUG: 1129939
    Change-Id: I074b1009d3622a122fdaeb4b99658bca3277e211
    Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
    Reviewed-on: http://review.gluster.org/10407
    Tested-by: NetBSD Build System
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* dht/rebalancer: Marking tiering migration fops | Joseph Fernandes | 2015-05-01 | 1 | -0/+18
    This is a follow up patch for http://review.gluster.org/#/c/10080

    In the above, the suggested change in http://review.gluster.org/#/c/10080/7/xlators/cluster/dht/src/dht-rebalance.c does not work. The reason it doesn't work is that promotion and demotion are done in a multithreaded way. Whenever a promotion or demotion thread is called, the frame of the old sync_op thread is not carried with it. As a result the frame->root->pid is not set.

    Solution: When the file is getting migrated, we get a tiering.migration key-value in the xattr dict, so that we pass this dict key-value when we do syncop_setxattr() to do data migration, and set frame->root->pid to GF_CLIENT_PID_TIER_DEFRAG in dht_setxattr() just before calling dht_start_rebalance_task().

    Change-Id: I86fef2d961b32fdd2c0c69d8512cbe846b393404
    BUG: 1194753
    Signed-off-by: Joseph Fernandes <josferna@redhat.com>
    Reviewed-on: http://review.gluster.org/10266
    Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
    Reviewed-by: Susant Palai <spalai@redhat.com>
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tier: fix off-by-one overrun in UUID string | Emmanuel Dreyfus | 2015-04-27 | 1 | -1/+1
    UUID strings are UUID_CANONICAL_FORM_LEN (36) bytes long plus the trailing nul character that various functions (e.g. uuid_unparse) will add. As a consequence, UUID strings must be declared as UUID_CANONICAL_FORM_LEN+1 long, otherwise we get an off-by-one overrun that corrupts the next variable on the stack.

    BUG: 1129939
    Change-Id: I5837ad6ca06fa17cc7ab143eedd02d8099ecca2a
    Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
    Reviewed-on: http://review.gluster.org/10394
    Tested-by: NetBSD Build System
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
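The sizing rule from the commit above can be shown with a self-contained sketch. The formatter `uuid_to_string` below is a hypothetical stand-in for uuid_unparse (so the example needs no external library); it writes 36 characters plus the terminating nul, which is why the destination buffer is declared with UUID_CANONICAL_FORM_LEN + 1 bytes.

```c
#include <assert.h>
#include <stdio.h>

#define UUID_CANONICAL_FORM_LEN 36

/* Hypothetical stand-in for uuid_unparse(): writes the 36-character
 * canonical form plus a terminating nul. Returns the string length. */
int uuid_to_string(const unsigned char uuid[16],
                   char out[UUID_CANONICAL_FORM_LEN + 1])
{
    assert(uuid != NULL && out != NULL);
    return snprintf(out, UUID_CANONICAL_FORM_LEN + 1,
                    "%02x%02x%02x%02x-%02x%02x-%02x%02x-"
                    "%02x%02x-%02x%02x%02x%02x%02x%02x",
                    uuid[0], uuid[1], uuid[2], uuid[3],
                    uuid[4], uuid[5], uuid[6], uuid[7],
                    uuid[8], uuid[9], uuid[10], uuid[11],
                    uuid[12], uuid[13], uuid[14], uuid[15]);
}
```

Declaring the buffer as `char s[UUID_CANONICAL_FORM_LEN]` instead would leave no room for the nul, and a formatter that does not take a size argument would write one byte past the end.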
* xlators/cluster/dht: Fix Resource leak (CID 1291751) | Günther Deschner | 2015-04-24 | 1 | -1/+3
    Coverity CID 1291751.

    Guenther

    Change-Id: Ibe9dc3662811dc5889f85fa063ab9211fcaf7f12
    BUG: 789278
    Signed-off-by: Guenther Deschner <gd@samba.org>
    Reviewed-on: http://review.gluster.org/10301
    Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
    Reviewed-by: Dan Lambright <dlambrig@redhat.com>
    Tested-by: NetBSD Build System
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: support for tier volumes 'detach start' and 'detach commit' | Dan Lambright | 2015-04-22 | 1 | -8/+30
    These commands work in a manner analogous to rebalancing when removing a brick. The existing migration daemon detects "detach start" and switches to moving data off the hot tier. While in this state all lookups are directed to the cold tier.

    gluster v detach-tier <vol> start
    gluster v detach-tier <vol> commit

    The status and stop cli commands shall be submitted separately.

    Change-Id: I24fda5cc3ba74f5fb8aa9a3234ad51f18b80a8a0
    BUG: 1205540
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Signed-off-by: root <root@localhost.localdomain>
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/10108
    Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
    Tested-by: NetBSD Build System
* libglusterfs/syncop: Add xdata to all syncop calls | Raghavendra Talur | 2015-04-08 | 1 | -4/+6
    This patch adds support for xdata in both the request and response path of syncops. A few calls, like lookup, already had the support; variables have been renamed in a few places to maintain uniformity.

    xdata passed downwards is known as xdata_in and xdata passed upwards is known as xdata_out.

    There is an old patch by Jeff Darcy at http://review.gluster.org/#/c/8769/3 which does the same for some selected calls. It also brings in xdata support at gfapi level. xdata support at gfapi level would be introduced in subsequent patches.

    Change-Id: I340e94ebaf2a38e160e65bc30732e8fe1c532dcc
    BUG: 1158621
    Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
    Reviewed-on: http://review.gluster.org/9859
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Support distributed replicated volumes on hot tier | Dan Lambright | 2015-04-08 | 1 | -4/+35
    We did not set up the graph properly for hot tiers with replicated subvolumes. Also add a check that the file has not already been moved by another replicated brick on the same node.

    Change-Id: I9adef565ab60f6774810962d912168b77a6032fa
    BUG: 1206517
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/10054
    Reviewed-by: Joseph Fernandes <josferna@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* cluster/dht: fix spurious smoke test failure | Gaurav Kumar Garg | 2015-04-06 | 1 | -2/+2
    There is a smoke test failure due to implicit declaration of the functions "uuid_parse" and "uuid_compare".

    The fix is to change the callers to use "gf_uuid_parse" and "gf_uuid_compare".

    Change-Id: I79efa00c44d112c2ca732a9d9711c07bd5f1a069
    BUG: 1207532
    Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com>
    Reviewed-on: http://review.gluster.org/10139
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
    Tested-by: Niels de Vos <ndevos@redhat.com>
* cluster/dht: fix tier.c problems found prior to feature freeze | Dan Lambright | 2015-04-06 | 1 | -50/+193
    This patch resolves tiering translator issues taken from the list in bug 1203776. These issues have been selected to be fixed first. The rest will be fixed in a subsequent patch (or are not a problem).

    3. Replace hardcoded #defines of promote/demote file names
    6. Use loc_wipe() in migrate_using_query_file()
    9. Only promote/demote files on the same node on which they reside.
    14. Replace calloc with GF_CALLOC in tier.c and ensure freeing done properly.
    15. Handle if parse_query_str fails
    22. Only load gfdb library on server side, remove SQL references from client.

    Change-Id: I6563b11e58ab2e4c6b1ce44db755781ad6d930fb
    BUG: 1203776
    Signed-off-by: Dan Lambright <dlambrig@redhat.com>
    Reviewed-on: http://review.gluster.org/9987
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: N Balachandran <nbalacha@redhat.com>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
* Avoid conflict between contrib/uuid and system uuid (Emmanuel Dreyfus, 2015-04-04, 1 file, -6/+6)

  glusterfs relies on the Linux uuid implementation, whose API is incompatible with most other systems' uuid. As a result, libglusterfs has to embed contrib/uuid, which is the Linux implementation, on non-Linux systems. This implementation is incompatible with the system's built-in one, but the symbols have the same names.

  Usually this is not a problem, because when we link with -lglusterfs, libc's symbols are trumped. However, there is a problem when a program not linked with -lglusterfs calls dlopen() on a glusterfs component. In such a case, libc's uuid implementation is already loaded in the calling program, and it will be used instead of libglusterfs's implementation, causing crashes.

  A possible workaround is to preload libglusterfs in the calling program (using LD_PRELOAD on NetBSD, for instance), but such a mechanism is neither portable nor flexible. A much better approach is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any possible conflict. This is what this change attempts.

  BUG: 1206587
  Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a
  Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org>
  Reviewed-on: http://review.gluster.org/10017
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
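  The renaming idea in the message above can be illustrated with a minimal sketch. Everything in it is hypothetical: the type `demo_uuid_t` and the function `gf_demo_uuid_compare` are stand-ins invented for this example and are not the actual libglusterfs definitions. The point is only that a project-specific prefix keeps the symbol from colliding with a `uuid_compare()` that the host program may already have loaded.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical 16-byte identifier type, used only in this sketch. */
typedef unsigned char demo_uuid_t[16];

/*
 * Hypothetical prefixed comparison function. Because its name carries a
 * project-specific prefix, it cannot be shadowed by a uuid_compare()
 * exported by libc or a system uuid library in the calling program.
 * Returns <0, 0, or >0, following memcmp() ordering.
 */
static int
gf_demo_uuid_compare (const demo_uuid_t a, const demo_uuid_t b)
{
        assert (a != NULL && b != NULL);

        return memcmp (a, b, sizeof (demo_uuid_t));
}
```

  A caller inside the component would then use only the prefixed name, so symbol resolution at dlopen() time never has an unprefixed uuid_* reference to bind to the wrong implementation.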
* cluster/dht: Fix coverity bug in tiering code (Dan Lambright, 2015-04-04, 1 file, -2/+17)

  The bug was:

  *** CID 1291734: Error handling issues (CHECKED_RETURN)
  /xlators/cluster/dht/src/tier.c: 451 in tier_build_migration_qfile()

  The fix is to check the return code of the remove() library call. It is legal for the call to fail; in that case we just log an INFO level message.

  Change-Id: I026eb49276b394efa3b8092ee2cc209c470aacb2
  BUG: 1194753
  Signed-off-by: Dan Lambright <dlambrig@redhat.com>
  Reviewed-on: http://review.gluster.org/10000
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
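  The pattern described in the message above (check the result of remove(), treat failure as non-fatal, and log it) might look roughly like the following sketch. The helper name `remove_stale_qfile` is hypothetical, and `fprintf` to stderr is only a stand-in for whatever logging call the real tier.c uses; this is not the code from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: try to remove a leftover query file.
 * Returns 0 if the file was removed and -1 otherwise. A failure is
 * reported at an informational level and the caller is expected to
 * carry on, since a missing file is a legal situation here.
 */
static int
remove_stale_qfile (const char *path)
{
        assert (path != NULL);

        if (remove (path) == 0)
                return 0;

        /* Stand-in for an INFO level log message. */
        fprintf (stderr, "info: could not remove %s: %s\n",
                 path, strerror (errno));
        return -1;
}
```

  Checking the return value this way satisfies a CHECKED_RETURN style analysis while keeping the failure non-fatal for the caller.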
* cluster/dht: Add tier translator. (Dan Lambright, 2015-03-21, 1 file, -0/+1007)

  The tier translator shares most of DHT's code. It differs in how subvolumes are chosen for I/Os, and how file migration (cache promotion and demotion) is managed. That different functionality is split to either DHT or tier logic according to the "tier_methods" structure.

  A cache promotion and demotion thread is created in a manner similar to the rebalance daemon. The thread operates a timing wheel which periodically checks for promotion and demotion candidates (files). Candidates are queued and then migrated. Candidates must exist on the same node as the daemon and meet other criteria per caching policies.

  This patch has two authors (Dan Lambright and Joseph Fernandes). Dan did the DHT changes and Joe wrote the cache policies. The fix depends on DHT readdir changes and the database library, which have been submitted separately. Header files in libglusterfs/src/gfdb should be reviewed in patch 9683.

  For more background and design see the feature page [1].

  [1] http://www.gluster.org/community/documentation/index.php/Features/data-classification

  Change-Id: Icc26c517ccecf5c42aef039f5b9c6f7afe83e46c
  BUG: 1194753
  Signed-off-by: Dan Lambright <dlambrig@redhat.com>
  Reviewed-on: http://review.gluster.org/9724
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  Tested-by: Vijay Bellur <vbellur@redhat.com>