| author | Susant Palai <spalai@redhat.com> | 2017-04-25 18:32:45 +0530 | 
|---|---|---|
| committer | Shyamsundar Ranganathan <srangana@redhat.com> | 2017-05-30 00:42:58 +0000 | 
| commit | 1b1f871ca41b08671ebb327dba464aeb6c82e776 (patch) | |
| tree | c03df93f141525db93461c08d48f0ba7511fbaa8 /xlators/cluster/dht/src/dht-common.h | |
| parent | 0a8b2db2d2c3993b25272a65fecad0d3d2866607 (diff) | |
cluster/dht: fix on-demand migration of files from client
    On-demand migration of files, i.e. migration done by clients and
    triggered by a setfattr, was broken.
    
    A dependency on the defrag structure led to a crash when migration was
    triggered from a client: the client side has no defrag context, so state
    kept in gf_defrag_info_ (such as link_lock) is not usable there. This
    patch therefore moves link_lock into dht_conf.
    
    Note: This functionality is not available for tiered volumes. Migration
    from a tier-served client will fail with ENOTSUP.
    
    Usage (but refer to the steps mentioned below to avoid any issues):
    setfattr -n "trusted.distribute.migrate-data" -v "1" <filename>
    
    The purpose of fixing on-demand client migration is to provide a
    workaround for users who have a lot of empty directories compared to
    files and want to run a remove-brick process.
    
    Here are the steps to trigger file migration for a remove-brick process
    from a client. (It is highly recommended to follow the steps below as is.)
    
    Let's say it is a replica volume and the user wants to remove a replica
    pair named brick1 and brick2. (Make sure healing has completed before you
    run these steps.)
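
    One way to verify that healing has completed (a generic check, not
    specific to this patch):
     - gluster v heal <volname> info
    The output should show no entries pending heal on any brick.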
    
    Step-1: Start remove-brick process
     - gluster v remove-brick <volname> brick1 brick2 start
    Step-2: Kill the rebalance daemon
     - ps aux | grep glusterfs | grep rebalance\/ | awk '{print $2}' | xargs kill
    Step-3: Do a fresh mount as mentioned here
     -  glusterfs -s ${localhostname} --volfile-id rebalance/$volume-name /tmp/mount/point
    Step-4: Go to one of the bricks (among brick1 and brick2)
     - cd <brick1 path>
    Step-5: Run the following command.
     - find . -not \( -path ./.glusterfs -prune \) -type f -not -perm 01000 -exec bash -c 'setfattr -n "distribute.fix.layout" -v "1" ${mountpoint}/$(dirname '{}')' \; -exec  setfattr -n "trusted.distribute.migrate-data" -v "1" ${mountpoint}/'{}' \;
    
    This command ignores linkto files and empty directories, does a fix-layout
    of the parent directory, and then triggers a migration operation on the files.
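
    For readability, the same Step-5 one-liner can be written out with
    comments (a sketch; it assumes ${mountpoint} is set to the fresh mount
    from Step-3):
     # prune the .glusterfs metadata tree, select regular files only, and
     # skip DHT linkto files (mode 01000); for each remaining file, first
     # fix-layout the parent directory through the mount, then trigger
     # migrate-data on the file itself
     find . -not \( -path ./.glusterfs -prune \) -type f -not -perm 01000 \
         -exec bash -c 'setfattr -n "distribute.fix.layout" -v "1" ${mountpoint}/$(dirname '{}')' \; \
         -exec setfattr -n "trusted.distribute.migrate-data" -v "1" ${mountpoint}/'{}' \;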
    
    Step-6: Once this process has completed, do a "remove-brick force"
     - gluster v remove-brick <volname> brick1 brick2 force
    
    Note: Use the above script only when there is a large number of empty
    directories. Since the script crawls directly on the brick side and skips
    directories that are empty, the time spent fixing the layout of those
    directories is eliminated. (Even though the script does not do a fix-layout
    on empty directories, a fresh layout will be built for each such directory
    post remove-brick, so application continuity is not affected.)
    
    Detailing the expectation for hardlink migration with this patch:
        Hardlinks are migrated only for the remove-brick process. A fresh
    mount (Step-3) is essential for hardlink migration to happen. Why?
    setfattr is an inode-based operation. Since we are doing the setfattr
    from a FUSE mount here, inode_path will try to build the path from the
    dentries linked to the inode. For a file without hardlinks the path
    construction will be correct, but for hardlinks the inode will have
    multiple dentries linked.
    
            Without a fresh mount, inode_path will always get the most recently
    linked dentry. For example, if there are three hardlinks named dir1/link1,
    dir2/link2 and dir3/link3 on a client where these hardlinks have been looked
    up, inode_path will always return the path dir3/link3 if dir3/link3 was
    looked up most recently. Hence, we will not be able to create linkto files
    for the other hardlinks on the destination (read gf_defrag_handle_hardlink
    for more details on hardlink migration).
    
            With a fresh mount, the lookup and setfattr become serialized; e.g.
    link2 will not be looked up until link1 is looked up and migrated. Hence,
    inode_path will always have the correct path: in this case the link1 dentry
    is picked up (as it is the most recently looked up one) and the path is
    built right.
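
            To illustrate with the example above (hypothetical names, not part
    of this patch): with the script from Step-5 running against a fresh mount,
    each hardlink is looked up immediately before its setfattr, so inode_path
    builds the path of the dentry that was just looked up:
     # dir1/link1, dir2/link2 and dir3/link3 are hardlinks to the same inode;
     # each setfattr below is preceded by a lookup of exactly that dentry
     setfattr -n "trusted.distribute.migrate-data" -v "1" ${mountpoint}/dir1/link1
     setfattr -n "trusted.distribute.migrate-data" -v "1" ${mountpoint}/dir2/link2
     setfattr -n "trusted.distribute.migrate-data" -v "1" ${mountpoint}/dir3/link3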
    
    Note: If you run the above script on an existing mount (all entries already
    looked up), hardlinks may not be migrated, but there should not be any other
    issue. Please raise a bug if you find any issue.
    
    Tests: Manual
Change-Id: I9854cdd4955d9e24494f348fb29ba856ea7ac50a
BUG: 1450975
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: https://review.gluster.org/17115
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Diffstat (limited to 'xlators/cluster/dht/src/dht-common.h')
| -rw-r--r-- | xlators/cluster/dht/src/dht-common.h | 7 | 
1 files changed, 4 insertions, 3 deletions
diff --git a/xlators/cluster/dht/src/dht-common.h b/xlators/cluster/dht/src/dht-common.h
index f982bf6ac1a..786db020427 100644
--- a/xlators/cluster/dht/src/dht-common.h
+++ b/xlators/cluster/dht/src/dht-common.h
@@ -542,9 +542,6 @@ struct gf_defrag_info_ {
         int32_t                      current_thread_count;
         pthread_cond_t               df_wakeup_thread;
 
-        /* Hard link handle requirement */
-        synclock_t                   link_lock;
-
         /* lock migration flag */
         gf_boolean_t                 lock_migration_enabled;
 
@@ -645,6 +642,10 @@ struct dht_conf {
         gf_boolean_t    lock_migration_enabled;
         gf_lock_t       lock;
+
+        /* Hard link handle requirement for migration triggered from client*/
+        synclock_t      link_lock;
+
 };
 
 typedef struct dht_conf dht_conf_t;
