summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/dht
Commit message (Collapse)AuthorAgeFilesLines
* dht: report constant directory sizeJeff Darcy2016-03-204-1/+81
| | | | | | | | | | | | | | | | | | | | | | | | Directory size is meaningless. Every filesystem has its own unpredictable way of increasing or decreasing it, based on internal data structures and even transient conditions. Some filesystems (e.g. ext4) never decrease it at all. Others (e.g. btrfs) don't even report it. Very few programs look at it, and those that do are broken. Unfortunately, one such program is GNU tar, which will complain when it sees different values because at different times we got the value from different DHT subvolumes. To avoid such problems, just report a constant value. Change-Id: Id64ce917c75b5f7ff50cb55b6e997f3b3556e7e3 BUG: 1302948 Original-author: Shyam <srangana@redhat.com> Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13770 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/tier: add tunable to migrate files based on sizeDan Lambright2016-03-163-0/+29
| | | | | | | | | | | | | | | | | This fix adds a paramater "tier-max_promote_size" to control wether a file is migrated or not based on its size. By default the value is 0, meaning all files are migrated. If set to a non-zero value, files larger than the parameter won't be moved in tiered volumes. Change-Id: Ia6b88e9b2508935bef500d956f9192e59670fe00 BUG: 1313495 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/13570 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes
* TIER: stopping the tierd when the volume goes downhari gowtham2016-03-141-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | If there are large number of files to be migrated and by this time if the volume goes down, then the tierd has to be stopped. But on a huge query file list it keeps checking for each file before stopping. If the volume comes up before the old tierd dies then due to the presence of old tierd new one won't be created. After the old one completes the task, it dies and the status ends up as failed. This patch will check if the status is still running and then let it continue its work. Else it will stop running the tierd. Change-Id: I6522a4e2919e84bf502b99b13873795b9274f3cd BUG: 1315659 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/13646 Tested-by: Dan Lambright <dlambrig@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* Tier: Avoiding stale entries from causing demotion to stophari gowtham2016-03-132-3/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | When the parent GFID is a stale entry, the lookup on this parent fails and this in turn fails the demotion process. This patch will make the stale entry error to be skipped. Situation for pargfid to be stale: Consider a folder from a tar file. Once the tar file is untared the files in the tar-file will start to demote. when the demotion is under progress, if we tend to delete the actual folder, then the files under it which are undergoing demotion will do a lookup on the parent which was deleted and become stale entry. This stale entry fails the Lookup and this will fail the demotion of the other files(not from tar) that are supposed to be demoted. Change-Id: I3d47c32c4077526d477a25912b0135bab98b23fc BUG: 1311178 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/13501 Tested-by: hari gowtham <hari.gowtham005@gmail.com> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* tier: Fix unused-but-set-variable warningRaghavendra Talur2016-03-031-2/+0
| | | | | | | | | | | | Change-Id: Ie5eaf1075b1c9c29dd7d85bf7b61b22e1fbce422 BUG: 1314291 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/13591 Reviewed-by: hari gowtham <hari.gowtham005@gmail.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/tier: make promotion and demotion independantDan Lambright2016-02-282-126/+127
| | | | | | | | | | | | | | | | | | | | Currently a main loop in tiering spawns promotion and demotion threads, and does a join to wait for them to complete. When one of the two threads takes a long time, the main thread waits for it before exiting the join. It may wait so long the scheduled time for the other thread is skipped. In the case of demotion, it may be a long time before another attempt. This patch fixes that by making the promotion and demotion activities independant. A side effect of this change is the logic is significantly simplified. Change-Id: I1196bd4bbfc95e8aa326a9bd4ebf395032369d1c BUG: 1306852 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/13433 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* dht: mkdir must unwind with latest ctimeSakshi Bansal2016-02-262-0/+24
| | | | | | | | | | | | | | | | | | Currently fops like mkdir used the the ctime it gets after creating the directory entry. But setting layout also updates the ctime of a directory. Hence DHT must get the ctime after the setxattr call and unwind with the latest ctime to avoid mismatch in time seen by applications like tar. Change-Id: Iecbbe3aac5244af5da9788b48ccf299ca56b4bae BUG: 1302948 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/13352 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht: cleanup dict and free memory in rename code pathSakshi Bansal2016-02-241-0/+4
| | | | | | | | | | | | Change-Id: I2458e18197bdf7565563a85e9021b5b2850c1825 BUG: 1303945 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/13392 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* dht: file rename must take blocking inode locksSakshi Bansal2016-02-211-5/+5
| | | | | | | | | | | | | | | | Currently DHT takes non-blocking locks for file rename. Due to this during parallel renames some clients fail with EBUSY or ESTALE errors. Hence to avoid application discontinuity file rename must take blocking inode locks. Change-Id: I986e9d08b3be359f20b1a3e1564e049b0f3dffd3 BUG: 1304966 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/13366 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* all: fixes for clang compile warningsKaleb S KEITHLEY2016-02-152-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cli/src/cli-cmd-parser.c (chenk) cli/src/cli-xml-output.c (spandit) cli/src/cli.c (chenk) libglusterfs/src/common-utils.c (vmallika) libglusterfs/src/gfdb/gfdb_sqlite3.c (jfernand +1) rpc/rpc-transport/socket/src/socket.c (?) xlators/cluster/afr/src/afr-transaction.c (?) xlators/cluster/dht/src/dht-common.h (srangana +2) xlators/cluster/dht/src/dht-selfheal.c (srangana +2) xlators/debug/io-stats/src/io-stats.c (R. Wareing) xlators/features/barrier/src/barrier.c (vshastry) xlators/features/bit-rot/src/bitd/bit-rot-scrub.h (vshankar +1) xlators/features/shard/src/shard.c (kdhananj +1) xlators/mgmt/glusterd/src/glusterd-ganesha.c (skoduri) xlators/mgmt/glusterd/src/glusterd-handler.c (atinmu) xlators/mgmt/glusterd/src/glusterd-op-sm.h (atinmu) xlators/mgmt/glusterd/src/glusterd-snapshot.c (spandit) xlators/mgmt/glusterd/src/glusterd-syncop.c (atinmu) xlators/mgmt/glusterd/src/glusterd-volgen.c (atinmu) xlators/protocol/client/src/client-messages.h (mselvaga +1) xlators/storage/bd/src/bd-helper.c (M. Mohan Kumar) xlators/storage/bd/src/bd.c (M. Mohan Kumar) xlators/storage/posix/src/posix.c (nbalacha +1) Change-Id: I85934fbcaf485932136ef3acd206f6ebecde61dd BUG: 1293133 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13031 CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* all: fix various cppcheck warningsKaleb S KEITHLEY2016-02-151-3/+1
| | | | | | | | | | | | | | | fixes for various warnings reported by cppcheck N.B. cppcheck output is in the bugzilla Change-Id: I33acec127bc4536935fdd8d52a0c490ec54d50b2 BUG: 1292954 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13006 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: Skip subvols if no layout presentN Balachandran2016-02-141-0/+8
| | | | | | | | | | | | | | | | | | Running "rm -rf" on a tiered volume sometimes caused the client to crash because dht_readdirp_cbk referenced a NULL layout for the hot tier subvol. Now, entries are skipped if the layout is NULL. This can cause "rm -rf" to fail with ENOTEMPTY rmdir failures. Change-Id: Idd71a9d0f7ee712899cc7113bbf2cd3dcb25808b BUG: 1307208 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13440 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier : Fixed wrong variable comparisonN Balachandran2016-02-101-3/+15
| | | | | | | | | | | | | | The wrong variable was being checked to determine the watermark value. Change-Id: If4c97fa70b772187f1fcbdf5193e077cb356a8b1 BUG: 1303895 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13357 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* dht/quota : heal the limit_objects_key xattr needed for inode-quotaManikandan Selvaganesh2016-02-101-0/+11
| | | | | | | | | | | | | | | | | | Whenever a new brick is added, quota related xattr's should be healed but currently, the xattr "quota.limit-objects.<suffix>" needed for inode-quota is not being healed. The patch fixes this issue. Change-Id: I1e7b229126f7b058642bbc3fb5c109bfd8925325 BUG: 1302257 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/13299 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier: Create linkfiles to hardlinks correctlyN Balachandran2016-02-073-1/+132
| | | | | | | | | | | | | | There is a bug in the way hardlinks are handled in tiered volumes. Ideally, the tier linkto files on the cold tier to files that are hardlinks to each other on the hot tier, should themselves be hardlinks of each other. As they are not, they end up being files with the same gfid but different names for the cold tier dht, and end up overwriting the cached-subvol information stored in the dht inode-ctx. Change-Id: Ic658a316836e6a1729cfea848b7d212674b0edd2 BUG: 1305277 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13391 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/dht: Cleanup dict in dht_do_rename()Vijay Bellur2016-02-041-0/+2
| | | | | | | | | | | | Change-Id: Ib4b3a843e78eccf5b8e0e7776cd0128013a59a3e BUG: 1303945 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/13322 CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* tier/gfdb : Round-Robin read of query filesJoseph Fernandes2016-02-033-81/+326
| | | | | | | | | | | | | | | | | | | 1. Each brick on a host will get a separate query file. 2. While reading query record from these query files we read them in a Round-Robin manner. 3. When an error occurs during migration we rename it to query file with an time stamp and .err extension for better debugging. Change-Id: I27c4285d24fd695d2d5cbd9fd7db3879d277ecc8 BUG: 1302772 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/13293 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: N Balachandran <nbalacha@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier : Reset watermarks in tierN Balachandran2016-02-031-9/+36
| | | | | | | | | | | | | | | | A node which contains only cold bricks and has detected that the high watermark value has been breached on the hot tier will never reset the watermark to the correct value. The promotion check will thus always fail and no promotions will occur from that node. Change-Id: I0f0804744cd184c263acbea1ee50cd6010a49ec5 BUG: 1303895 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13341 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: break out of iterating query file once cycle time endsDan Lambright2016-02-031-0/+27
| | | | | | | | | | | | | | | | When iterating the query file during migration, tiering should break out of the loop once cycle time completes. Otherwise it may be possible to stay in the loop for a long time. If that happens updates to files will become stale and have not impact migration. Change-Id: Ib60cf74bc84e8646e6a0da21ff04954b1b83c414 BUG: 1301227 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/13284 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* tier/dht : Default value for demote-freq, max files and mbJoseph Fernandes2016-01-262-5/+6
| | | | | | | | | | | | | | | | | | | Default value for tier-demote-frequency is 3600 sec to avoid frequent demotions. Default value for tier-max-mb is 4000 mb Default value for tier-max-files is 10000 files Change-Id: Ie60951c478a7462c425059699ab82511aa13fa0a BUG: 1300412 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/13270 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: Dan Lambright <dlambrig@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: Ignore quota-deem-statfs for watermark calculationN Balachandran2016-01-251-0/+21
| | | | | | | | | | | | | | | | | | The tier process watermark calculations were incorrect when the quota-deem-statfs option was enabled. We now ignore this while calculating hot tier usage to determine watermark levels. Change-Id: I308bc24432e2fa5ad1d5703e80fc391433538bbb BUG: 1301473 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13288 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Tested-by: mohammed rafi kc <rkavunga@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/dht : Rebalance process crashes due to double fd_unrefN Balachandran2016-01-101-6/+11
| | | | | | | | | | | | | The dst_fd was being unrefed twice in case the call to __dht_rebalance_create_dst_file failed. Change-Id: I56c5aff7fa3827887e67936b8aa1ecbd1a08a9e9 BUG: 1296611 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13193 Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier: allow db queries to be interruptableDan Lambright2016-01-073-30/+110
| | | | | | | | | | | | | | | | | | | | A query to the database may take a long time if the database has many entries. The tier daemon also sends IPC calls to the bricks which can run slowly, espcially in RHEL6. While it is possible to track down each such instance, the snapshot feature should not be affected by database operations. It requires no migration be underway. Therefore it is okay to pause tiering at any time except when DHT is moving a file. This fix implements this strategy by monitoring when control passes to DHT to migrate a file using the GF_XATTR_FILE_MIGRATE_KEY trigger. If it is not, the pause operation is successful. Change-Id: I21f168b1bd424077ad5f38cf82f794060a1fabf6 BUG: 1287842 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/13104 Reviewed-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com>
* dht: missleading indendentation, gcc-6Kaleb S KEITHLEY2016-01-051-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | gcc-6 now has -Wmisleading-indentation as part of -Wall. compiling with gcc-6 gives this warning. ... dht-diskusage.c: In function ‘dht_subvol_has_err’: dht-diskusage.c:361:33: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation] goto out; ^~~~ dht-diskusage.c:358:25: note: ...this ‘if’ clause, but it is not if (conf->decommissioned_bricks[i] && ^~ ... Inspection of the source shows that without braces the loop is terminated prematurely. Change-Id: Ica48a8c59ee5d0a206797827d7920259d33b47ec BUG: 1295784 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13176 Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tier/create: store TIER_LINKFILE_GFID in xattr dictionaryMohammed Rafi KC2016-01-051-17/+68
| | | | | | | | | | | | | | | | | | | In tier_create, a new key TIER_LINKFILE_GFID was introduced to avoid a race in stale linkfile deletion. Storing this key in xattr dictionary instead of using local->params dictionary. Because local->params dictionary was also used to create the file before stale linkfile deletion, that leads posix_create to fail, trying to set the added key as extended attributes Change-Id: I24fecb62b47bee65a1e86103925a67d13304c5df BUG: 1290677 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13130 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: check watermark during migrationDan Lambright2016-01-051-46/+57
| | | | | | | | | | | | | | | Currently we check the watermarks only before a cycle begins. We should also check the hot tier's fullness against the watermarks during the migration so the watermark is not exceeded as files are promoted. Change-Id: I2ff87a1c308d64fbdca14bbdf55f3ec3007290ae BUG: 1293932 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/13103 Reviewed-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* cluster/tier: Additional details in error messagesN Balachandran2016-01-041-41/+45
| | | | | | | | | | | | | | Added file path/gfid when available to the tier log messages to make debugging easier. Change-Id: I22dda329367df2b846dcf254594312c997b66083 BUG: 1273043 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13114 Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/create: Dynamically allocate gfid memoryMohammed Rafi KC2015-12-291-2/+13
| | | | | | | | | | | | | | | | Currently we are storing the memory as a static pointer. There is a chance to go that variable in out of scope. So we should allocate in Dynamic way. Change-Id: I096876deb8055ac3a44681599591a0a032bc0c24 BUG: 1290677 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13102 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/unlink: symlink failed to unlinkMohammed Rafi KC2015-12-291-1/+5
| | | | | | | | | | | | | | | | | | | | | during unlink of a file, we will get stat just after deleting the file, to see if the file is under migration or not. but this stat call will fail for symlink if the actual file was deleted. So it is better not to send stat request from client if it is a symlink as we are not migrating symlink. Change-Id: Idc033b24fa3522b5261e579889d2195b43419682 BUG: 1293963 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13074 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* quota: limit xattr for subdir not healed on newly added bricksvmallika2015-12-291-1/+19
| | | | | | | | | | | | | | DHT after creating missing directory, tries to heal the xattrs. This xattrs operation fails as INTERNAL FOP key was not set Change-Id: I819d373cf7073da014143d9ada908228ddcd140c BUG: 1294479 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/13100 Reviewed-by: Susant Palai <spalai@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* Tier: "tier start force" command implementationhari gowtham2015-12-222-3/+3
| | | | | | | | | | | | | | | | The start command doesnt restart the tier deamon if the deamon is running at one node. hence to bring up the tierd on the nodes where the deamon is down, the force command is implemented. It skips the check for tierd running. Change-Id: I0037d3e5ecfe56637d0da201a97903c435d26436 BUG: 1292112 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/12983 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/dht : Ftruncate on migrating file fails with EINVALN Balachandran2015-12-229-80/+317
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | What: If dht_open is called on a migrating file after the inode_ctx is set, subsequent FOPs on that fd do not open the fd on the dst subvol. This is seen when the open-ftruncate-close sequence is repeatedly called on a migrating file. A second call to the sequence described above causes dht_truncate_cbk to call dht_truncate2 as the dht_inode_ctx was already set by the first call. As dht_rebalance_in_progress_check is not called, the fd is not opened on the dst subvol. On a distributed-replicate volume, this causes AFR to open the fd using afr_fix_open, but with the wrong flags, causing posix_ftruncate to fail with EINVAL. The fix: We require fd specific information to make a decision while handling migrating files. Set the fd_ctx to indicate the fd has been opened on the dst subvol and check if it has been set while processing Phase1/Phase2 checks in the FOP callback functions. Change-Id: I43cdcd8017b4a11e18afdd210469de7cd9a5ef14 BUG: 1284823 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12985 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier:delete the linkfile if data file creation failsMohammed Rafi KC2015-12-224-82/+288
| | | | | | | | | | | | | | | | | | | | | | | If we are creating data file in a hot subvolume then we will create a linkfile in cold subvolume. Linkfile creation happens first. If linkfile creation was successful and data file creation failed, then linkfile in cold subvolume will become stale. This patch will delete the linkfile as well, if data file creation fails. Also this code duplicates dht_create to make tier_create Change-Id: I377a90dad47f288e9576c7323b23cf694a91a7a3 BUG: 1290677 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12948 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: do not block in synctask created from pause tierDan Lambright2015-12-213-18/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | We had run sleep() in the pause tier callback. Blocking within a synctask is dangerous. The sleep() call does not inform the synctask scheduler that a thread is no longer running. It therefore believes it is running. If a second synctask already exists, it may not be able to run. This occurs if the thread limit in the pool has been reached. Note the pool size only grows when a synctask is created, not when it is moved from wait state to run state, as is the case when an FOP completes. When the tier is paused during migration, synctasks already exist waiting for responses to FOPs to the server with high probability. The fix is to yield() in the RPC callback, which will place the synctask into the wait queue and free up a thread for the FOP callback. A timer wakes the callback after sufficient time has elapsed for the pause to occur. Change-Id: I6a947ee04c6e5649946cb6d8207ba17263a67fc6 BUG: 1267950 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12987 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* tier: Demotion failed if the file was renamed when it was in coldMohammed Rafi KC2015-12-211-11/+26
| | | | | | | | | | | | | | | | | | | | During migration if the file is present we just open the file in hashed subvol. Now if the linkfile present on hashed is just linkfile to another subvol, we actually open in hashed subvol. But subsequent operation will go to linkto subvol ie, to non-hashed subvol. This operation will get failed since we haven't opened d on non-hashed. Change-Id: I9753ad3a48f0384c25509612ba76e7e10645add3 BUG: 1292067 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12980 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Susant Palai <spalai@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht : Properly free file descriptors during data migrationJoseph Fernandes2015-12-211-6/+5
| | | | | | | | | | | | | | | While tier migration, free src and dst fd's when create of destination or open of source fails. Change-Id: I62978a669c6c9fbab5fed9df2716b9b2ba00ddf1 BUG: 1291566 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12969 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: fix tier-max-files bookeeping and helpDan Lambright2015-12-182-2/+1
| | | | | | | | | | | | | | | | | | The option tier-max-files represents the maximum number of files trasferred by a node in a gives cycle. Fix help message to reflect the "per node" aspect. The code transferred one more file per cycle than the given value. Also change the default values of max file and max bytes to very large values, effectively we will not throttle migration unless the administrator requests it via CLI. Change-Id: Ic2949ed3d8c35afe7c9ae4db72195603cfb2e28f BUG: 1292671 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12984 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: Handle failure in getxattrSusant Palai2015-12-161-0/+7
| | | | | | | | | | | | | | | | | Problem: Currently even if we have received xattrs from any one of the subvolume, we unwind with error in case the last subvol (which unwinds) received a negative response. To handle the case check if any of the subvolume has received a response and pass it down. Change-Id: Ia12a1f9671a6764f7550e6dc223324b1039fcc51 BUG: 1287539 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12845 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* tier:unlink during migrationMohammed Rafi KC2015-12-165-75/+382
| | | | | | | | | | | | | | | | | | | | | | files deleted during promotion were not deleting as the files are moving from hashed to non-hashed. On deleting a file that is undergoing promotion, the unlink call is not sent to the dst file as the hashed subvol == cached subvol. This causes the file to reappear once the migration is complete. This patch also fixes a problem with stale linkfile deleting. Change-Id: I4b02a498218c9d8eeaa4556fa4219e91e7fa71e5 BUG: 1282390 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12829 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* dht/rebalance: Use seperate return variable for destination cleanupJoseph Fernandes2015-12-121-3/+3
| | | | | | | | | | | | | | | Use seperate local return variable for destination cleanup, without messing with the function return variable. Change-Id: Iaea9ed2927234fdb888aef7a31ec362090e98196 BUG: 1290975 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12956 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht: files are still going to decommissioned subvolMohammed Rafi KC2015-12-091-3/+27
| | | | | | | | | | | | | | After detach tier start, creates are still going to hot tier. Because when creating data files we are not checking for decommissioned bricks. Change-Id: I8e28258d9b2367dcc8ad6e5e91d0e54d92fdf771 BUG: 1289602 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12914 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier : Spawn promotion or demotion thread depending on local bricksJoseph Fernandes2015-12-091-22/+29
| | | | | | | | | | | | | | | | | | | Spawn Promotion or Demotion depending if there are any Cold or Hot bricks present localy. IF the local HOT brick list is empty dont spawn demote thread. IF the local COLD brick list is empty dont spawn promote thread. Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I524730e59414dd156c78ec0bd7a3629212697e6e BUG: 1289578 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12912 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier : Fix double free in tier processN Balachandran2015-12-071-3/+1
| | | | | | | | | | | | | | | | | The tier process tries to free ipc_ctr_params twice if the syncop_ipc call in tier_process_ctr_query fails. ipc_ctr_params is freed when ctr_ipc_in_dict is freed. But ctr_ipc_out_dict is NULL when syncop_ipc fails, causing GF_FREE to be called on a non-NULL ipc_ctr_params ptr again. Change-Id: Ia15f36dfbcd97be5524588beb7caad5cb79efdb4 BUG: 1288995 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12890 Reviewed-by: Joseph Fernandes Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht: Fix mem leak from lookup response dictJoseph Fernandes2015-12-061-0/+6
| | | | | | | | | | | | | Fixing memory leak from response dict during a parent lookup to get the path. Change-Id: I60c23d0b25e7f763f0e53c40e71ee053aba6d555 BUG: 1288019 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12867 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes Tested-by: Joseph Fernandes
* tier/tier: Ignoring status of already migrated filesJoseph Fernandes2015-12-031-5/+16
| | | | | | | | | | | | | Ignore the status of already migrated files and in the process don't count. Change-Id: Idba6402508d51a4285ac96742c6edf797ee51b6a BUG: 1276141 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12758 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: fix loading tier.so into glusterdN Balachandran2015-12-037-41/+72
| | | | | | | | | | | | | | | | | glusterd occasionally loads shared libraries of translators. This failed for tiering due to a reference to dht_methods which is defined as a global variable which is not necessary. The global variable has been removed and this is now a member of dht_conf and is now initialised in the *_init calls. Change-Id: Ifa0a21e3962b5cd8d9b927ef1d087d3b25312953 BUG: 1287842 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12863 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* dht : changing variable type to avoid overflowSakshi Bansal2015-11-261-7/+9
| | | | | | | | | | | | | | | | | For layout computation we find total size of the cluster and store it in an unsigned 32 bit variable. For large clusters this value may overflow which leads to wrong computations and for some bricks the layout may overflow. Hence using unsigned 64 bit to handle large values. Change-Id: I7c3ba26ea2c4158065ea9e74705a7ede1b6759c7 BUG: 1282751 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/12597 Reviewed-by: Susant Palai <spalai@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier: readdirp to cold tier onlyDan Lambright2015-11-236-109/+496
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible a file would get migrated in the middle of a readdir operation. If there are four subvolumes A,B,C,D, and if readdir reads them in order and reaches subvol B, then, if a file is moved from D to A, it will not be included in the readdir output. This phenonema has pre-existed in DHT migration but is more apparent in tiering. When a file is moved off the hashed subvolume a T file is created. For tiering, we will make the cold subvolume the hashed subvolume. This will ensure the creation of a T file. Readdir will not skip T files in the tier translator. Making the cold subvolume the hashed subvolume ensures the T files created on promotions or creates will be less likely to fill the volume. Creates still put the data on the hot subvolume. Change-Id: Ifde557d3d0e94a4570ca9f115adee3db2ee75407 BUG: 1281598 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12530 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier: Do not delete linkto file on demotionN Balachandran2015-11-182-2/+9
| | | | | | | | | | | | | | | | | The current DHT migration code will always delete the src linkto file after migration as dht always moves files to the hashed subvol. This is not the case in tiering. The lack of linkto files causes rename to fail leaving 2 files with the same name but different gfids on the volume. Modified to leave the linkto file behind if the source volume is the hashed subvolume. Change-Id: I2b99f7d34b4b719aee6232dc40c6a8f8ba88225d BUG: 1279376 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12551 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* dht: loc should store proper gfidSakshi Bansal2015-11-171-0/+1
| | | | | | | | | | Change-Id: Ic1393d44a9ed4aaba23d7c9ddea45977b9dae5e4 BUG: 1281265 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/12574 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>