summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/dht/src/dht-helper.c
Commit message (Collapse)AuthorAgeFilesLines
* support for de-commissioning a node using 'remove-brick'Amar Tumballi2011-09-131-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to achieve this, we now create volume-file with 'decommissioned-nodes' option in distribute volume, then just perform the rebalance set of operations (with 'force' flag set). now onwards, the 'remove-brick' (with 'start' option) operation tries to migrate data from removed bricks to existing bricks. 'remove-brick' also supports similar options as of replace-brick. * (no options) -> works as 'force', will have the current behavior of remove-brick, ie., no data-migration, volume changes. * start (starts remove-brick with data-migration/draining process, which takes care of migrating data and once complete, will commit the changes to volume file) * pause (stop data migration, but keep the volume file intact with extra options whatever is set) * abort (stop data-migration, and fall back to old configuration) * commit (if volume is stopped, commits the changes to volumefile) * force (stops the data-migration and commits the changes to volume file) Change-Id: I3952bcfbe604a0952e68b6accace7014d5e401d3 BUG: 1952 Reviewed-on: http://review.gluster.com/118 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* distribute rebalance: handle the open file migrationAmar Tumballi2011-09-121-6/+410
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Complexity involved: To migrate a file with open fd, we have to notify the other client process which has the open fd, and make sure the write()s happening on that fd is properly synced to the migrated file. Once the migration is complete, the client process which has open-fd should get notified and it should start performing all the operations on the new subvolume, instead of earlier cached volume. How to solve the notification part: We can overload the 'postbuf' attribute in the _cbk() function to understand if a file is 'under-migration' or 'migration-complete' state. (This will be something similar to deciding whether a file is DHT-linkfile by its 'mode'). Overall change includes below mentioned major changes: 1. dht_linkfile is decided by only 2 factors (mode(01000), xattr(trusted.glusterfs.dht.linkto)), instead of earlier 3 factors (size==0) 2. in linkfile self-heal part (in 'dht_lookup_everywhere_cbk()'), don't delete a linkfile if there is a open-fd on it. It means, there may be a migration in progress. 3. if a file's revalidate fails with ENOENT, it may be due to file migration, and hence need a lookup_everywhere() 4. There will be 2 phases of file-migration. -> Phase 1: Migration in progress * The source data file will have SGID and STICKY bit set in its mode. * The source data file will have a 'linkto' xattr pointing the destination. * Destination file will have mode set to '01000', and 'linkto' xattr set to itself. -> Phase 2: File migration Complete * The source data file will have mode '01000', and will be 'truncated' to size 0. * The destination file will have inherited mode from the source. (without sgid and sticky bit) and its 'linkto' attribute will be removed. 4. Changes in distribute to work smoothly with a file which is in migration / got migrated. The 'fops' are divided into 3 categories, inode-read, inode-write and others. inode-read fops need to handle only 'phase 2' notification, where as, the inode-write fops need to handle both 'phase 1' and phase2. The inode-write operations will be done on source file, and if any of 'file-migration' procedures are detected in _cbk(), then the operations should be performed on the destination too. when a phase-2 is detected, then the inode-ctx itself should be changed to represent a new layout. With these changes, the open file migration will work smoothly with multiple clients. Change-Id: I512408463814e650f34c62ed009bf2101d016fd6 BUG: 3071 Reviewed-on: http://review.gluster.com/209 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Change Copyright current yearPranith Kumar K2011-08-101-1/+1
| | | | | | | | Change-Id: I2d10f2be44f518f496427f257988f1858e888084 BUG: 3348 Reviewed-on: http://review.gluster.com/200 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* LICENSE: s/GNU Affero General Public/GNU General Public/Pranith Kumar K2011-08-061-3/+3
| | | | | | | | Change-Id: I3914467611e573cccee0d22df93920cf1b2eb79f BUG: 3348 Reviewed-on: http://review.gluster.com/182 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* core: fill 'ia_ino' from 'ia_gfid' in 'storage/posix' to preserve same ino ↵Amar Tumballi2011-06-161-2/+1
| | | | | | | | | | | | | | | number take the least significant 64bit from gfid and assign it to 'ia_ino', hence for a given file (or directory), the 'ia_ino' number is always same, and we need not worry about the 'itransform' in 'cluster/*' translators. Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 3042 (inode number should be constant on storage) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3042
* DHT:first_up_subvol should be the one up the longestshishir gowda2011-05-311-3/+9
| | | | | | | | Signed-off-by: shishir gowda <shishirng@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2684 (Dir missing from mount point) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2684
* cluster/dht: log enhancementsAmar Tumballi2011-03-171-5/+0
| | | | | | | | | Signed-off-by: Shishir Gowda <shishirng@gluster.com> Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2346 (Log message enhancements in GlusterFS - phase 1) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2346
* cluster/dht: whitespace cleanupAmar Tumballi2011-03-171-150/+150
| | | | | | | | | | also fill tabs by spaces (untabify), and indent the code Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2346 (Log message enhancements in GlusterFS - phase 1) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2346
* cluster/dht: Perform self-heal as rootPranith K2011-02-081-31/+0
| | | | | | | | Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 2370 (cluster/afr: Perform self-heal as root) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2370
* Fix DHT getxattr for directoriesshishir gowda2010-11-031-0/+4
| | | | | | | | | | | | When a heal on the directory or layout changes, the user xattrs do not get healed in dht. The current fix sends the getxattr call n all the subvolumes, aggregates it and sends the response Signed-off-by: shishir gowda <shishirng@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 1991 (distribute directory self-heal does not copy user extended attributes) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1991
* Copyright changesVijay Bellur2010-10-111-1/+1
| | | | | | | | Signed-off-by: Vijay Bellur <vijay@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* distribute: check for 'conf' before dereferencing itAmar Tumballi2010-10-051-1/+18
| | | | | | | | Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1806 () URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1806
* Change GNU GPL to GNU AGPLPranith K2010-10-041-3/+3
| | | | | | | | Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1388 () URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1388
* remove 'gen' from iatt/protocol structuresAmar Tumballi2010-09-141-1/+0
| | | | | | | | Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* gfid: changes in distribute to handle uuids in iatt structureAnand Avati2010-09-041-0/+2
| | | | | | | | | | Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* gfid: change in create() prototype to have params dictionary with uuid in itAnand Avati2010-09-041-0/+5
| | | | | | | | | Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* dht enhancementsAmar Tumballi2010-07-271-0/+63
| | | | | | | | | | | | | | * create(filename@<distribute-volume-name>-<subvol-name>), will create the file in corresponding subvol of dht. * same for unlink() same logic can be extended to other fops if there is a need Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 1200 (enhance distribute with hooks so lot of debugging can be done online) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1200
* Memory accounting changesVijay Bellur2010-04-231-4/+5
| | | | | | | | | | | Memory accounting Changes. Thanks to Vinayak Hegde and Csaba Henk for their contributions. Signed-off-by: Vijay Bellur <vijay@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 329 (Replacing memory allocation functions with mem-type functions) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=329
* iatt: changes across the codebaseAnand V. Avati2010-03-161-15/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - libglusterfs -- call-stub -- inode -- protocol - libglusterfsclient - cluster/replicate - cluster/{dht,nufa,switch} - cluster/unify - cluster/HA - cluster/map - cluster/stripe - debug/error-gen - debug/trace - debug/io-stats - encryption/rot-13 - features/filter - features/locks - features/path-converter - features/quota - features/trash - mount/fuse - performance/io-threads - performance/io-cache - performance/quick-read - performance/read-ahead - performance/stat-prefetch - performance/symlink-cache - performance/write-behind - protocol/client - protocol/server - storage-posix Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 361 (GlusterFS 3.0 should work on Mac OS/X) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=361
* distribute: perform self-heal as rootShehjar Tikoo2010-03-041-0/+30
| | | | | | | | | | | prerserve original frame uid and gid and perform self-heal by changing uid/gid to root (0) temporarily. Signed-off-by: Shehjar Tikoo <shehjart@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 597 (miscellaneous fixes for xlators to work well with NFS xlator) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=597
* distribute: Respect end-of-dir on readdir only for last subvolShehjar Tikoo2010-03-041-0/+21
| | | | | | | | | | | | | | | | We need to ensure that only the last subvolume's end-of-directory notification is respected so that directory reading does not stop before all subvolumes have been read. That could happen because the posix for each subvolume sends a ENOENT on end-of-directory but in distribute we're not concerned only with a posix's view of the directory but the aggregated namespace' view of the directory, which is maintained by distribute. Signed-off-by: Shehjar Tikoo <shehjart@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 597 (miscellaneous fixes for xlators to work well with NFS xlator) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=597
* dht: unlink stale linkfiles in rmdir to prevent ENOTEMPTYAnand Avati2010-02-221-3/+43
| | | | | | | | | | | | Thanks to He Xaobing <allreol@gmail.com>, this patch is inspired by http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=188#c2 Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 188 ([ glusterfs 2.0.6rc2 ] - "Directory not empty" on rm -rf) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=188
* cluster/dht: Check for pointers before de-referencing in dht_stat_merge()Vijay Bellur2009-12-161-0/+3
| | | | | | | | Signed-off-by: Vijay Bellur <vijay@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 463 (Crash in dht_stat_merge ()) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=463
* distribute,nufa: layout handling changesAnand V. Avati2009-10-161-3/+18
| | | | | | | | | | changes to make revalidate not fail but instead perform fresh lookup and swap inode context (layout) safely Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 315 (generation number support) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=315
* Changed occurrences of Z Research to Gluster.Vijay Bellur2009-10-071-1/+1
| | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* dht_stat_merge - use the highest uid when ambiguousAnand Avati2009-08-041-2/+3
| | | | | | | | | When directories on different subvolumes have different ownerships, use the highest uid/gid till self-heal resolves the inconsistency Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 191 (random Permission denied errors) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=191
* rename dht_first_up_child to dht_first_up_subvolAnand V. Avati2009-06-261-2/+2
| | | | Signed-off-by: Anand V. Avati <avati@dev.gluster.com>
* log message cleanup in distributeAnand V. Avati2009-04-241-2/+2
| | | | Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* updated copyright header to extend copyright upto 2009Basavanagowda Kanur2009-02-261-1/+1
| | | | | | updated copyright header to include 2009. Signed-off-by: Anand V. Avati <avati@amp.gluster.com>
* Added all filesVikas Gorur2009-02-181-0/+326