summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/dht/src/dht-rebalance.c
Commit message (Collapse)AuthorAgeFilesLines
* dht/rebalance: allocate migrator thread pool dynamicallySusant Palai2016-08-051-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problems: The maximum number of migratior threads created was static set to "40". And the number of these threads get created in rebalance depends on the number of cores user has. If the number of cores exceeds 40, a crash or memory corruption can be seen. Fix: Make the migratior thread pool dynamic. > Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839 > BUG: 1362070 > Reviewed-on: http://review.gluster.org/15000 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit b8e8bfc7e4d3eaf76bb637221bc6392ec10ca54b) Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839 BUG: 1362070 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/15062 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* tier/detach: Clear tier-fix-layout-complete xattr after migration threads joinDan Lambright2016-05-161-33/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously we had wrongly placed the clearing tier-fix-layout-complete xattr before the joining of migration threads. This would lead to situations where failure of clearing the xattr would cause the premature death of migration threads. Now we clear the xattr only after the data movement threads join, ensuring that all migration is done. This is a backport of 14285 > Change-Id: I829b671efa165ae13dbff7b00707434970b37a09 > BUG: 1334839 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I475242e6a05cacd2252dc5c29b160e7abc5d1791 BUG: 1336148 Reviewed-on: http://review.gluster.org/14341 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: N Balachandran <nbalacha@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* tier/detach : During detach check if background fixlayout is doneJoseph Fernandes2016-05-141-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | During detach check if background fixlayout is done, if not done ignore the case and continue detach. Backport of http://review.gluster.org/14147 > Change-Id: I5d5cfc0e73d0eb217fdeab54c432dc4af8bc598d > BUG: 1332136 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/14147 > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> >Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I2161673cf6861b02a8e323366a13a13587258bef BUG: 1333934 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/14246 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: Joseph Fernandes CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* dht/rebalance: Handle GF_DEFRAG_STOPSusant Palai2016-04-261-0/+17
| | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/14004 Problem: On a rebal stop, the migrator threads don't intimate the crawler thread to wake up in case it is waiting on signal from migrator thread. BUG: 1330529 Change-Id: I9019a715c7b4673b8bb5a75d7d33a18add85ce33 Reviewed-by: N Balachandran <nbalacha@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14076 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht/rebalance: rebalance failure handlingSusant Palai2016-04-051-21/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At current state rebalance aborts basically on any failure like fix-layout of a directory, readdirp, opendir etc. Unless it is not a remove-brick process we can ignore these failures. Major impact: Any failure in the gf_defrag_process_dir means there are files left unmigrated in the directory. Fix-layout(setxattr) failure will impact it's child subtree i.e. the child subtree will not be rebalanced. Settle-hash (commit-hash)failure will trigger lookup_everywhere for immediate children until the next commit-hash. Note: Remove-brick opertaion is still sensitive to any kind of failure. Change-Id: I2f67a490e4e7d06423bb5bc010a1373a74a6af1d BUG: 1318196 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12013 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/13749 Tested-by: N Balachandran <nbalacha@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com>
* tier/dht : Attach tier fix layout to run in backgroundJoseph Fernandes2016-04-011-15/+228
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Spawn a thread for background fix-layout for tier process. 2. Once the fix-layout is completed a marker xttr is set on the root of volume to mark the completion of the background fixlayout, so that even if the tier process is spawned again, fixlayout will not be issued, if it was completed last time. 3. Please note that promotion of legacy files will happen eventually as the ctr lookup heal in the fixlayout slowly heals the ctr db for legacy files OR the ctr lookup heal happend due to a name lookup. 4. When a detach tier is successful in evacuation data from hot tier, we remove the marker xattr is removed. So that next attach tier runs the background tier fixlayout. what is remaining ? 1. Instead of clearing the marker xattr of tiering fix layout at the end of detach start clear it during detach commit. But the issue is detach commit is a glusterd operation and the volume is not mounted in glusterd. The reason we want to do it in detach commit is that if the admin wants to attach the same tier again, then a background fixlayout will be triggered, which would not be needed. 2. Clearing the CTR DB of the cold bricks when there is a detach commit, as it will be having entries which will be stale when the volume is used, with ctr off (ctr is switched off only when we have detach commit.) Backport of http://review.gluster.org/13491 > Change-Id: Ibe343572e95865325cd0eef4d0b976b626a3c0c5 > BUG: 1313228 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/13491 > Smoke: Gluster Build System <jenkins@build.gluster.com> > Tested-by: Joseph Fernandes > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: Ic28affdf78d2ac0f394f3dd59f0126df7915d609 BUG: 1323016 Reviewed-on: http://review.gluster.org/13879 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes Tested-by: Joseph Fernandes NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: allow db queries to be interruptableDan Lambright2016-02-181-14/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | A query to the database may take a long time if the database has many entries. The tier daemon also sends IPC calls to the bricks which can run slowly, espcially in RHEL6. While it is possible to track down each such instance, the snapshot feature should not be affected by database operations. It requires no migration be underway. Therefore it is okay to pause tiering at any time except when DHT is moving a file. This fix implements this strategy by monitoring when control passes to DHT to migrate a file using the GF_XATTR_FILE_MIGRATE_KEY trigger. If it is not, the pause operation is successful. > Change-Id: I21f168b1bd424077ad5f38cf82f794060a1fabf6 > BUG: 1287842 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/13104 > Reviewed-by: Joseph Fernandes > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I667e0af24eaa66afefa860c4d73b324e4f39b997 BUG: 1288352 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/13199 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht : Rebalance process crashes due to double fd_unrefN Balachandran2016-01-241-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | | The dst_fd was being unrefed twice in case the call to __dht_rebalance_create_dst_file failed. > BUG: 1296611 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/13193 > Reviewed-by: Susant Palai <spalai@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Tested-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 36fcaf275952202ce3f1e621d3b3a34d58054c99) Change-Id: I56c5aff7fa3827887e67936b8aa1ecbd1a08a9e9 BUG: 1297309 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13212 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* all: reduce "inline" usageKaleb S KEITHLEY2016-01-181-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | There are three kinds of inline functions: plain inline, extern inline, and static inline. All three have been removed from .c files, except those in "contrib" which aren't our problem. Inlines in .h files, which are overwhelmingly "static inline" already, have generally been left alone. Over time we should be able to "lower" these into .c files, but that has to be done in a case-by-case fashion requiring more manual effort. This part was easy to do automatically without (as far as I can tell) any ill effect. In the process, several pieces of dead code were flagged by the compiler, and were removed. backport of Change-Id: I56a5e614735c9e0a6ee420dab949eac22e25c155, http://review.gluster.org/11769, BUG: 1245331 Change-Id: Iba1efb0bc578ea4a5e9bf76b7bd93dc1be9eba44 BUG: 1283302 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/12646 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* tier: Demotion failed if the file was renamed when it was in coldMohammed Rafi KC2015-12-301-11/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During migration if the file is present we just open the file in hashed subvol. Now if the linkfile present on hashed is just linkfile to another subvol, we actually open in hashed subvol. But subsequent operation will go to linkto subvol ie, to non-hashed subvol. This operation will get failed since we haven't opened d on non-hashed. Backport of> >Change-Id: I9753ad3a48f0384c25509612ba76e7e10645add3 >BUG: 1292067 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: http://review.gluster.org/12980 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Susant Palai <spalai@redhat.com> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Dan Lambright <dlambrig@redhat.com> >Tested-by: Dan Lambright <dlambrig@redhat.com> (cherry picked from commit d2f48214d436be633efb1136ee951b0736935143) Change-Id: I562ac4ba73e6b572bc1be91e55a029fa75262b33 BUG: 1293342 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13045 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: do not block in synctask created from pause tierDan Lambright2015-12-251-13/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We had run sleep() in the pause tier callback. Blocking within a synctask is dangerous. The sleep() call does not inform the synctask scheduler that a thread is no longer running. It therefore believes it is running. If a second synctask already exists, it may not be able to run. This occurs if the thread limit in the pool has been reached. Note the pool size only grows when a synctask is created, not when it is moved from wait state to run state, as is the case when an FOP completes. When the tier is paused during migration, synctasks already exist waiting for responses to FOPs to the server with high probability. The fix is to yield() in the RPC callback, which will place the synctask into the wait queue and free up a thread for the FOP callback. A timer wakes the callback after sufficient time has elapsed for the pause to occur. This is a backport of 12987. > Change-Id: I6a947ee04c6e5649946cb6d8207ba17263a67fc6 > BUG: 1267950 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12987 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I9bb7564d57d59abb98e89c8d54c8b1ff4a5fc3f3 BUG: 1274100 Reviewed-on: http://review.gluster.org/13078 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht : Properly free file descriptors during data migrationJoseph Fernandes2015-12-221-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | While tier migration, free src and dst fd's when create of destination or open of source fails. Backport of http://review.gluster.org/12969 > Change-Id: I62978a669c6c9fbab5fed9df2716b9b2ba00ddf1 > BUG: 1291566 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/12969 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I4d745076b3258a64584778ba88dd665ddd907d27 BUG: 1293348 Reviewed-on: http://review.gluster.org/13046 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* Tier: "tier start force" command implementationhari gowtham2015-12-221-2/+1
| | | | | | | | | | | | | | | | | | | | | back port of : http://review.gluster.org/#/c/12983/ The start command doesnt restart the tier deamon if the deamon is running at one node. hence to bring up the tierd on the nodes where the deamon is down, the force command is implemented. It skips the check for tierd running. >Change-Id: I0037d3e5ecfe56637d0da201a97903c435d26436 >BUG: 1292112 >Signed-off-by: hari gowtham <hgowtham@redhat.com> Change-Id: Idaca442c1a41ded8bf555a6e34eed0ebb9ea4034 BUG: 1293698 Signed-off-by: hari <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/13069 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: fix tier-max-files bookeeping and helpDan Lambright2015-12-201-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The option tier-max-files represents the maximum number of files trasferred by a node in a gives cycle. Fix help message to reflect the "per node" aspect. The code transferred one more file per cycle than the given value. Also change the default values of max file and max bytes to very large values, effectively we will not throttle migration unless the administrator requests it via CLI. This is a backport of 12984. > Change-Id: Ic2949ed3d8c35afe7c9ae4db72195603cfb2e28f > BUG: 1292671 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12984 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: Ibb234976f912ff4b20e894f2984c63792acfe814 BUG: 1292945 Reviewed-on: http://review.gluster.org/13004 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* dht/rebalance: Use seperate return variable for destination cleanupJoseph Fernandes2015-12-151-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Use seperate local return variable for destination cleanup, without messing with the function return variable. Backport http://review.gluster.org/12956 > Change-Id: Iaea9ed2927234fdb888aef7a31ec362090e98196 > BUG: 1290975 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/12956 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I2df0e29321647fd6eb1719eb3cb7d657fbed3793 BUG: 1291002 Reviewed-on: http://review.gluster.org/12959 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: fix loading tier.so into glusterdN Balachandran2015-12-041-9/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | The glusterd process loads the shared libraries of client translators. This failed for tiering due to a reference to dht_methods which is defined as a global variable which is not necessary. The global variable has been removed and this is now a member of dht_conf and is now initialised in the *_init calls. > Change-Id: Ifa0a21e3962b5cd8d9b927ef1d087d3b25312953 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12863 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> >Tested-by: Dan Lambright <dlambrig@redhat.com> (cherry picked from commit 96fc7f64da2ef09e82845a7ab97574f511a9aae5) Change-Id: If3cc908ebfcd1f165504f15db2e3079d97f3132e BUG: 1288352 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12877 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: Do not delete linkto file on demotionN Balachandran2015-11-191-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | The current DHT migration code will always delete the src linkto file after migration as dht always moves files to the hashed subvol. This is not the case in tiering. The lack of linkto files causes rename to fail leaving 2 files with the same name but different gfids on the volume. Modified to leave the linkto file behind if the source volume is the hashed subvolume. > Change-Id: I2b99f7d34b4b719aee6232dc40c6a8f8ba88225d > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12551 > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I210b94cdae0409c87af8ba198e3cd263a6c85190 BUG: 1283480 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12655 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: Disallow detach commit when detach in progressDan Lambright2015-11-161-3/+0
| | | | | | | | | | | | | 1. Check if detach is running, disallow detach commit if so. 2. Cleanup shutdown of tier daemon on detach: do not rerun fix-layout, do not send incorrect status back to glusterd. Change-Id: I97202f748773c1176396a4ffd32a4c7fa9b9c1bc BUG: 1264441 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12272 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes
* cluster/tier: fix lookup-unhashed on tiered volumesDan Lambright2015-11-061-2/+1
| | | | | | | | | | | | | | | | | | | | | | During attach tier the commit hash must be copied to the hot tier. This is a backport of 12498. > Change-Id: I91b92fd8e98696993433856e1436409b657c439d > BUG: 1277716 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12498 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I012c8ce9ef1dbb9550f109a1141afbecaa4fa54d BUG: 1278603 Reviewed-on: http://review.gluster.org/12525 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier dont log error on lookup heal for files on hot tierDan Lambright2015-10-301-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 12430 On fix-layout heal files are scanned. Files found are exist on the hot or cold subvolume. Those not found in the cold tier would exist on the hot. They should not be flagged as an error. Replace INFO with TRACE for common tier migration logs. Frequent migration was growing the log files too quickly. On migratation failures, do not acrue files towards cycle limit's budget. > Change-Id: Ie832ee07c43bce5477ae81c939d1fe8416a11615 > BUG: 1275383 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12430 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Joseph Fernandes Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: Ia1ce5c3ac9c8c43cf3f3f7e0bd6161aa13affe5f BUG: 1272398 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12465 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier: add pause tier for snapshotsDan Lambright2015-10-211-3/+122
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of 12304 Snaps of tiered volumes cannot handle files undergoing migration. We implement a helper mechanism to "pause" migration. Any files undergoing migration are aborted. Clean up is done to remove sticky bits and data at the destination. Migration is restarted after snap completes. For testing an internal switch is added. It is not exposed externally. gluster volume set vol1 tier-pause [true|false] > Change-Id: Ia85bbf89ac142e9b7e73fcbef98bb9da86097799 > BUG: 1267950 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12304 > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Conflicts: xlators/mgmt/glusterd/src/glusterd-messages.h Change-Id: I5f039d8d38a4c915bd873969f336b96755a0b8f1 BUG: 1274101 Reviewed-on: http://review.gluster.org/12411 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/dht : Do not migrate files with POSIX locks heldN Balachandran2015-10-181-11/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dht_migrate_file does not migrate file locks to the dst file. Any locks held on the source file are lost once the migration is complete. This issue is magnified in the case of a tier volume as file migrations occur more frequently and repeatedly as compared to a DHT rebalance. The fix makes 2 changes: 1. Before starting the actual migration process, check if there are any locks held on the file. If yes, do not migrate the file. 2. The rebalance process tries to lock on the entire file just before moving into the Phase 2 of the file migration. If the lock acquisition fails, the file migration does not proceed. If the lock is granted, the file migration proceeds. This still leaves a small window where conflicting locks can be granted to different clients. If client1 requests a lock on the src file just after it is converted to a linkto file and client2 requests a lock on the dst data file, they will both be granted, but all FOPs will be redirected to the dst data file. This issue will be taken up in a subsequent patch. Change-Id: I8c895fc3cced50dd2894259d40a827c7b43d58ac BUG: 1272331 Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/12347 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Dan Lambright <dlambrig@redhat.com> > Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12369 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/ctr: CTR DB named lookup heal of cold tier during attach tierJoseph Fernandes2015-10-101-2/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Heal hardlink in the db for already existing data in the cold tier during attach tier. i.e during fix layout do lookup to files in the cold tier. CTR xlator on the brick/server side does db update/insert of the hardlink on a namelookup. Currently the namedlookup is done synchronous to the fixlayout that is triggered by attach tier. This is not performant, adding more time to fixlayout. The performant approach is record the hardlinks on a compressed datastore and then do the namelookup asynchronously later, giving the ctr db eventual consistency master patch : http://review.gluster.org/#/c/11828/ >>Change-Id: I4ffc337fffe7d447804786851a9183a51b5044a9 >>BUG: 1252586 >>Signed-off-by: Joseph Fernandes <josferna@redhat.com> >>Reviewed-on: http://review.gluster.org/11828 >>Tested-by: Gluster Build System <jenkins@build.gluster.com> >>Reviewed-by: Dan Lambright <dlambrig@redhat.com> >>Tested-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I61b185a54ae4e8c1d82804b95a278bfbea870987 BUG: 1261146 Reviewed-on: http://review.gluster.org/12331 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: add watermarks and policy driverDan Lambright2015-10-101-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport fix 12039 This fix introduces infrastructure to support different policies for promotion and demotion. Currently the tier feature automatically promotes and demotes files periodically based on access. This is good for testing but too stringent for most real workloads. It makes it difficult to fully utilize a hot tier- data will be demoted before it is touched- its unlikely a 100GB hot SSD will have all its data touched in a window of time. A new parameter "mode" allows the user to pick promotion/demotion polcies. The "test mode" will be used for *.t and other general testing. This is the current mechanism. The "cache mode" introduces watermarks. The watermarks represent levels of data residing on the hot tier. "cache mode" policy: The % the hot tier is full is called P. Do not promote or demote more than D MB or F files. A random number [0-100] is called R. Rules for migration: if (P < watermark_low) don't demote, always promote. if (P >= watermark_low) && (P < watermark_hi) demote if R < P; promote if R > P. if (P > watermark_hi) always demote, don't promote. gluster volume set {vol} cluster.watermark-hi % gluster volume set {vol} cluster.watermark-low % gluster volume set {vol} cluster.tier-max-mb {D} gluster volume set {vol} cluster.tier-max-files {F} gluster volume set {vol} cluster.tier-mode {test|cache} > Change-Id: I157f19667ec95aa1d53406041c1e3b073be127c2 > BUG: 1257911 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/12039 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Conflicts: xlators/cluster/dht/src/dht-rebalance.c xlators/cluster/dht/src/tier.c Change-Id: Ibfe6b89563ceab98708325cf5d5ab0997c64816c BUG: 1270527 Reviewed-on: http://review.gluster.org/12330 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* dht/rebalance: fix mem-leak in rebalanceSusant Palai2015-10-061-5/+27
| | | | | | | | | | Change-Id: I37faf983fc02996541f3d96a17cb2a2c2cdb6781 BUG: 1261234 Reviewed-on: http://review.gluster.org/12235 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12296
* fd: Do fd_bind on successful openPranith Kumar K2015-10-051-0/+4
| | | | | | | | | | | | | | | | | | | | | | | - fd_unref should decrement fd->inode->fd_count only if it is present in the inode's fd list. - successful open/opendir should perform fd_bind. >Change-Id: I81dd04f330e2fee86369a6dc7147af44f3d49169 >BUG: 1207735 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/11044 >Reviewed-by: Anoop C S <anoopcs@redhat.com> >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1259697 Change-Id: I73b79dd3519aa085fb84dde74b321511cbccce1a Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12100 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/dht: Reset source file mode bits on migration failureNithya Balachandran2015-09-211-3/+94
| | | | | | | | | | | | | | | | | | | | | DHT rebalance uses the sgid and sticky bits to indicate that a file is being migrated. These were not removed if the file migration failed. The fix resets these bits to the original values. >Change-Id: I9801bfc0bd80c0800251ccd66c1c91a51cffd909 >Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> >Reviewed-on: http://review.gluster.org/11454 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: Ia701687819ee7130d6abebad84feb2ee879b7ab2 BUG: 1262700 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12167 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier: make attach/detach work with new rebalance logicDan Lambright2015-09-021-7/+2
| | | | | | | | | | | | | | | | | | | | | | | This is a backport of 10795. > The new rebalance performance improvements added new > datastructures which were not initialized in the > tier case. Function dht_find_local_subvol_cbk() needs > to accept a list built by lower level DHT translators > in order to build the local subvolumes list. > Change-Id: Iab03fc8e7fadc22debc08cd5bc781b9e3e270497 > BUG: 1222088 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/10795 > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Change-Id: Icbd51c96ae4d367d1edf41cdd0edb35095195699 BUG: 1259079 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12085 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: Don't set posix acls on linkto filesNithya Balachandran2015-08-311-0/+34
| | | | | | | | | | | | | | | | | | | | | | Posix acls on a linkto file change the file's permission bits and cause DHT to treat it as a non-linkto file.This happens on the migration failure of a file on which posix acls were set. The fix prevents posix acls from being set on a linkto file and copies them across only after a file has been successfully migrated. Change-Id: Iccf7ff6fba49fe05d691d9b83bf76a240848b212 BUG: 1258377 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12025 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12062 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* dht/tiering : create new dictionary during migrationMohammed Rafi KC2015-08-191-2/+10
| | | | | | | | | | | | | | | | | | | | | | | To avoid setting wrong xattr during creating link file Back port of: >Change-Id: Iad8de3521eae17e510035ed42e3e01933d647096 >BUG: 1250828 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: http://review.gluster.org/11838 >Reviewed-by: N Balachandran <nbalacha@redhat.com> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Dan Lambright <dlambrig@redhat.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit a3faffb259d5288907fac33a2822a8f61c3e86fe) Change-Id: I76ef168cd881c8fd828283a1ae70ed251fc44aaa BUG: 1254438 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/11945 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* dht: Adding log messages to the new logging frameworkarao2015-07-271-19/+25
| | | | | | | | | | | | | | | | | | | | | | | | Backported from: http://review.gluster.org/10021 > Change-Id: Ib3bb61c5223f409c23c68100f3fe884918d2dc3f > BUG: 1194640 > Reviewed-on: http://review.gluster.org/10021 > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Joseph Fernandes <josferna@redhat.com> > Tested-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Tested-by: Raghavendra G <rgowdapp@redhat.com> > Signed-off-by: arao <arao@redhat.com> BUG: 1217722 Change-Id: Ide79c6c1e6a466fb52f955c90a2b22711bec794a Signed-off-by: arao <arao@redhat.com> Signed-off-by: Anusha Rao <arao@redhat.com> Reviewed-on: http://review.gluster.org/11350 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* syncop: Include iatt to 'syncop_link' argsSoumya Koduri2015-07-151-1/+1
| | | | | | | | | | | | | | | | | Include iatt to 'syncop_link' args to fetch proper attributes of the newly linked inode. This is backport of the below fix - http://review.gluster.org/11611 Change-Id: If6b92961bd7a89add3791ed3a9b494087348b492 BUG: 1243408 Reviewed-on: http://review.gluster.org/11611 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/11677 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier: fixes for migration over ec as cold tierDan Lambright2015-07-141-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of fix 11433. > An opendir is done in rebalance. The graph constructed when > EC is used in tiering may have no local volumes (if > all the hot volumes are on one node and all the others on > another node). Previously the opendir only sent fops down > the local subvolumes for migration. They must be sent down > both the hot and cold subvolumes for tiering. > When setxattr2() received a NULL subvolume; this dereferenced > an uninitialized variable. > When a lookup is done during creation of the destination > file, the xattr dict is "polluted" with virtual xattrs. > These cause subsequent xattrs in the new file to not be > written by posix. They are required by EC. > The inode gfid for "entry_loc" in gf_defrag_migrate_single_file() > was not initialized. This made underlying translators > think the gfid was 0, and failed migration. > Change-Id: I6ccda8ca8e43485b9b354341bbfcb302496f632c > BUG: 1236212 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I9b26725e055eecfec235c4291ee90b0e53d0ea62 BUG: 1242274 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/11652 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* dht: Add lookup-optimize configuration option for DHTShyam2015-06-051-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently with commit 4eaaf5 a mixed version cluster would have issues if lookup-uhashed is set to auto, as older clients would fail to validate the layouts if newer clients (i.e 3.7 or upwards) create directories. Also, in a mixed version cluster rebalance daemon would set commit hash for some subvolumes and not for the others. This commit fixes this problem by moving the enabling of the functionality introduced in the above mentioned commit to a new dht option. This option also has a op_version of 3_7_1 thereby preventing it from being set in a mixed version cluster. It brings in the following changes, - Option can be set only if min version of the cluster is 3.7.1 or more - Rebalance and mkdir update the layout with the commit hashes only if this option is set, hence ensuring rebalance works in a mixed version cluster, and also directories created by newer clients do not cause layout errors when read by older clients - This option also supersedes lookup-unhased, to enable the optimization for lookups more deterministic and not conflict with lookup-unhashed settings. Option added is cluster.lookup-optimize, which is a boolean. Usage: # gluster volume set VOLNAME cluster.lookup-optimize on Change-Id: Ifd1d4ce3f6438fcbcd60ffbfdbfb647355ea1ae0 BUG: 1225940 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/10976 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* dht/rebalance : Fixed rebalance failureNithya Balachandran2015-06-041-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The rebalance process determines the local subvols for the node it is running on and only acts on files in those subvols. If a dist-rep or dist-disperse volume is created on 2 nodes by dividing the bricks equally across the nodes, one process might determine it has no local_subvols. When trying to update the commit hash, the function attempts to lock all local subvols. On the node with no local_subvols the dht inode lock operation fails, in turn causing the rebalance to fail. In a dist-rep volume with 2 nodes, if brick 0 of each replica set is on node1 and brick 1 is on node2, node2 will find that it has no local subvols. Change-Id: I7d73b5b4bf1c822eae6df2e6f79bd6a1606f4d1c BUG: 1221656 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on-master: http://review.gluster.org/10786 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/10788 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* dht/rebalance: Change log_level to DEBUGSusant Palai2015-06-011-2/+3
| | | | | | | | | | | BUG: 1221503 Change-Id: I4ac87bb69e05e2dd445fc0a5bcf48d5bd3d0020b Reviewed-on: http://review.gluster.org/10756 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/10778
* glusterd: add counter support for tiered volumesDan Lambright2015-05-281-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | back port of http://review.gluster.org/10292 This fix adds support to view the number of promoted or demoted files from the cli. The mechanism is isolmorphic to checking the status of volumes being rebalanced. gluster volume rebalance <vol> tier status >Change-Id: I1b11ca27355ceec36c488967c23531202030e205 >BUG: 1213063 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Signed-off-by: Dan Lambright <dlambrig@redhat.com> >Reviewed-on: http://review.gluster.org/10292 >Reviewed-by: Atin Mukherjee <amukherj@redhat.com> >Tested-by: Gluster Build System <jenkins@build.gluster.com> Change-Id: I543e886f17132b544274c83fdecca5a8da9d092a BUG: 1221477 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10775 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* dht/tier/rebalancer: Fix reset of tiering client pidJoseph Fernandes2015-05-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the patch http://review.gluster.org/#/c/9657 the client pid set by tiering migration was getting over- written in dht_start_rebalance_task(). Just corrected it in dht_setxattr() before calling dht_start_rebalance_task() and removed it from dht_start_rebalance_task(). > http://review.gluster.org/#/c/10502/ > Cherry picked from commit a5fe0f594d41e1a11661d9074bb19e9c2e2c4776 > Change-Id: I37cfa111f83a4e5d498042575c93799f60b49870 > BUG: 1217937 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/10502 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Susant Palai <spalai@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10502 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Conflicts: xlators/cluster/dht/src/dht-common.c xlators/cluster/dht/src/tier.c Change-Id: Id513114c9a880c6196162dd4b35bbf1155a8cd09 BUG: 1219027 Reviewed-on: http://review.gluster.org/10609 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* dht: make lookup-unhashed=auto do something actually usefulJeff Darcy2015-05-091-9/+100
| | | | | | | | | | | | | | | | | | | | | | | | | The key concept here is to determine whether a directory is "clean" by comparing its last-known-good topology to the current one for the volume. These are stored as "commit hashes" on the directory and the volume root respectively. The volume's commit hash changes whenever a brick is added or removed, and a fix-layout is done. A directory's commit hash changes only when a full rebalance (not just fix-layout) is done on it. If all bricks are present and have a directory commit hash that matches the volume commit hash, then we can assume that every file is in its "proper" place. Therefore, if we look for a file in that proper place and don't find it, we can assume it's not on any other subvolume and *safely* skip the global (broadcast to all) lookup. Change-Id: Id6ce4593ba1f7daffa74cfab591cb45960629ae3 BUG: 1220064 Reviewed-on-master: http://review.gluster.org/#/c/7702/ Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/10729 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: support for tier volumes 'detach start' and 'detach commit'Dan Lambright2015-05-091-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Back port of http://review.gluster.org/10108 These commands work in a manner analagous to rebalancing when removing a brick. The existing migration daemon detects "detach start" and switches to moving data off the hot tier. While in this state all lookups are directed to the cold tier. gluster v detach-tier <vol> start gluster v detach-tier <vol> commit The status and stop cli commands shall be submitted separately. >Change-Id: I24fda5cc3ba74f5fb8aa9a3234ad51f18b80a8a0 >BUG: 1205540 >Signed-off-by: Dan Lambright <dlambrig@redhat.com> >Signed-off-by: root <root@localhost.localdomain> >Signed-off-by: Dan Lambright <dlambrig@redhat.com> >Reviewed-on: http://review.gluster.org/10108 >Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Change-Id: I212d748d077fb5870ee84b316c653acbafbea3f7 BUG: 1220047 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10708 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Restore build on non Linux systemsEmmanuel Dreyfus2015-05-071-2/+3
| | | | | | | | | | | | | | | | | | | | | | This change broke the build on NetBSD, FreeBSD, and MacOS X: http://review.gluster.org/10526/ We restore the build with two fixes: - Use POSIX-compliant sysconf(_SC_NPROCESSORS_ONLN) to get the number of processors, instead of Linux specific get_nprocs(). That let us remove Linux-specific #include <sys/sysinfo.h> - Only define MAX() if it is not already defined. NetBSD defines it in <sys/param.h> which is already included Backport of: I62341c670598670e47ea2f69ab94864f96588b18 BUG: 1212676 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Change-Id: I0f098153e76954bb85b5dca3f054a069e31dd94c Reviewed-on: http://review.gluster.org/10653 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* dht/rebalance: Throttle rebalanceSusant Palai2015-05-071-2/+66
| | | | | | | | | | | | | | | | Throttle value will be "normal" by default. For throttling down, a thread will be put in to sleep. And for throttling up, gf_defrag_process_dir will wake up the sleeping threads. Change-Id: I4892ab14982a1ff305aeb2d8bbd33c79d6877b69 BUG: 1219579 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/10526 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/10629 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* dht/rebalancer: Marking tiering migration fopsJoseph Fernandes2015-05-071-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a follow up patch for http://review.gluster.org/#/c/10080 In the above, the suggested change in http://review.gluster.org/#/c/10080/7/xlators/cluster/dht/src/dht-rebalance.c doesnot work. The reason it doesnt work is promotion and demotion are done in a multithread way. Whenever a promotion or demotion thread is called, the frame of the old sync_op thread is not carried with it. As a result the frame->root->pid is not set. Solution: When the file is getting migrated, we get a tiering.migration key_value in the xattr dict, so that we pass this dic key-value when we do syncop_setxattr() to do data migration and set the frame->root->pid GF_CLIENT_PID_TIER_DEFRAG in dht_setxattr() just before calling dht_start_rebalance_task(). > http://review.gluster.org/#/c/10266/ > Change-Id: I86fef2d961b32fdd2c0c69d8512cbe846b393404 > BUG: 1194753 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/10266 > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> > Reviewed-by: Susant Palai <spalai@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Change-Id: I6ab42b2c7a3c3e21c461d097b7558ee967b62c62 BUG: 1218959 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10266 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10601 Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rebalance: Introducing local crawl and parallel migrationSusant Palai2015-05-071-245/+912
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current patch address two part of the design proposed. 1. Rebalance multiple files in parallel 2. Crawl only bricks that belong to the current node Brief design explanation for the above two points. 1. Rebalance multiple files in parallel: ------------------------------------- The existing rebalance engine is single threaded. Hence, introduced multiple threads which will be running parallel to the crawler. The current rebalance migration is converted to a "Producer-Consumer" frame work. Where Producer is : Crawler Consumer is : Migrating Threads Crawler: Crawler is the main thread. The job of the crawler is now limited to fix-layout of each directory and add the files which are eligible for the migration to a global queue in a round robin manner so that we will use all the disk resources efficiently. Hence, the crawler will not be "blocked" by migration process. Producer: Producer will monitor the global queue. If any file is added to this queue, it will dqueue that entry and migrate the file. Currently 20 migration threads are spawned at the beginning of the rebalance process. Hence, multiple file migration happens in parallel. 2. Crawl only bricks that belong to the current node: -------------------------------------------------- As rebalance process is spawned per node, it migrates only the files that belongs to it's own node for the sake of load balancing. But it also reads entries from the whole cluster, which is not necessary as readdir hits other nodes. New Design: As part of the new design the rebalancer decides the subvols that are local to the rebalancer node by checking the node-uuid of root directory prior to the crawler starts. Hence, readdir won't hit the whole cluster as it has already the context of local subvols and also node-uuid request for each file can be avoided. This makes the rebalance process "more scalable". Change-Id: I6f1b44086a09df8ca23935fd213509c70cc0c050 BUG: 1217381 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/10466 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: N Balachandran <nbalacha@redhat.com>
* cluster/afr,dht: Fix memleak after syncop_readlinkPranith Kumar K2015-05-011-0/+1
| | | | | | | | | | | | Backport of http://review.gluster.org/10305 BUG: 1216302 Change-Id: Icb0f2d6bbff806e1c5827fabcbf46b9b7983491f Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10441 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs/syncop: Add xdata to all syncop callsRaghavendra Talur2015-04-081-53/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for xdata in both the request and response path of syncops. Few calls like lookup already had the support; have renamed variables in few places to maintain uniformity. xdata passed downwards is known as xdata_in and xdata passed upwards is known as xdata_out. There is an old patch by Jeff Darcy at http://review.gluster.org/#/c/8769/3 which does the same for some selected calls. It also brings in xdata support at gfapi level. xdata support at gfapi level would be introduced in subsequent patches. Change-Id: I340e94ebaf2a38e160e65bc30732e8fe1c532dcc BUG: 1158621 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/9859 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* ctr : Fix for heating of files during promotion/demotionJoseph Fernandes2015-04-081-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fix will solve the heating of the files during the promotion or demotion. Promotion: ~~~~~~~~~ When a file gets promoted it get the current time stamp during creation only, but following writes or reads during the migration wont heat the file. Demotion: ~~~~~~~~ When a file gets demoted it get the wind/unwind time stamp is set to zero. The following writes or reads during the migration wont heat the file. What is remaining ? ~~~~~~~~~~~~~~~~~ Bug 1209129 ( https://bugzilla.redhat.com/show_bug.cgi?id=1209129 ) Inspite of this fix there is still a issue remaining, i.e the heat of the file is not keep intact during a internal rebalance activity i.e a rebalance within a tier. Change-Id: I01e82dc226355599732d40e699062cee7960b0a5 BUG: 1207867 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10080 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Avoid conflict between contrib/uuid and system uuidEmmanuel Dreyfus2015-04-041-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | glusterfs relies on Linux uuid implementation, which API is incompatible with most other systems's uuid. As a result, libglusterfs has to embed contrib/uuid, which is the Linux implementation, on non Linux systems. This implementation is incompatible with systtem's built in, but the symbols have the same names. Usually this is not a problem because when we link with -lglusterfs, libc's symbols are trumped. However there is a problem when a program not linked with -lglusterfs will dlopen() glusterfs component. In such a case, libc's uuid implementation is already loaded in the calling program, and it will be used instead of libglusterfs's implementation, causing crashes. A possible workaround is to use pre-load libglusterfs in the calling program (using LD_PRELOAD on NetBSD for instance), but such a mechanism is not portable, nor is it flexible. A much better approach is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any possible conflict. This is what this change attempts. BUG: 1206587 Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10017 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* Fixing build of xlator/cluster/dht/src/dht-rebalance.c when tiering is disabledJoseph Fernandes2015-03-251-1/+0
| | | | | | | | | | | | | 1) Removed unnecessary include tier.h in dht-rebalance.c 2) tier xlator will only compile when tiering is enabled in configure.ac Change-Id: Ia21aa9ff403506dc898a83236e9e2d382a0594da BUG: 1204604 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/9973 Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Sachin Pandit <spandit@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: Add tier translator.Dan Lambright2015-03-211-5/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tier translator shares most of DHT's code. It differs in how subvolumes are chosen for I/Os, and how file migration (cache promotion and demotion) is managed. That different functionality is split to either DHT or tier logic according to the "tier_methods" structure. A cache promotion and demotion thread is created in a manner similar to the rebalance daemon. The thread operates a timing wheel which periodically checks for promotion and demotion candidates (files). Candidates are queued and then migrated. Candidates must exist on the same node as the daemon and meet other critera per caching policies. This patch has two authors (Dan Lambright and Joseph Fernandes). Dan did the DHT changes and Joe wrote the cache policies. The fix depends on DHT readidr changes and the database library which have been submitted separately. Header files in libglusterfs/src/gfdb should be reviewed in patch 9683. For more background and design see the feature page [1]. [1] http://www.gluster.org/community/documentation/index.php/Features/data-classification Change-Id: Icc26c517ccecf5c42aef039f5b9c6f7afe83e46c BUG: 1194753 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/9724 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>