<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/storage/bd/src/bd.h, branch v3.5.1</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>bd: Add Zerofill FOP support</title>
<updated>2013-11-20T22:46:16+00:00</updated>
<author>
<name>M. Mohan Kumar</name>
<email>mohan@in.ibm.com</email>
</author>
<published>2013-11-15T08:49:11+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=2bb025699a8b9b34491c8b13a2bbb6da302a5d77'/>
<id>2bb025699a8b9b34491c8b13a2bbb6da302a5d77</id>
<content type='text'>
BUG: 1028673
Change-Id: I9ba8e3e6cf2f888640b4d2a2eb934a27ff903c42
Signed-off-by: Bharata B Rao &lt;bharata@linux.vnet.ibm.com&gt;
Signed-off-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Reviewed-on: http://review.gluster.org/6290
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
BUG: 1028673
Change-Id: I9ba8e3e6cf2f888640b4d2a2eb934a27ff903c42
Signed-off-by: Bharata B Rao &lt;bharata@linux.vnet.ibm.com&gt;
Signed-off-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Reviewed-on: http://review.gluster.org/6290
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bd: Add support to create clone, snapshot and merge of LV images.</title>
<updated>2013-11-13T19:39:22+00:00</updated>
<author>
<name>M. Mohan Kumar</name>
<email>mohan@in.ibm.com</email>
</author>
<published>2013-11-13T17:14:43+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=81a57679c20ac0ac9b48e313af75036132e3a5ad'/>
<id>81a57679c20ac0ac9b48e313af75036132e3a5ad</id>
<content type='text'>
Special xattr names "clone" &amp; "snapshot" can be used to create full and
linked clone of the LV images. GFID of destination posix file (to be
mapped) is passed as a value to the xattr. Destination posix file must
exist before running this operation.

These operations form a basis for offloading storage related operations
from QEMU to GlusterFS.

Syntax for full clone: xattr name: "clone" value: "gfid-of-dest-file"
Syntax for linked clone: xattr name: "snapshot" value: "gfid-of-dest-file"
Syntax for merging: xattr name: "merge" value: "path-to-snapshot-file"

Example:
	setfattr -n clone -v &lt;gfid-of-dest-file&gt; /media/source
	setfattr -n snapshot -v &lt;gfid-of-dest-file&gt; /media/source
	setfattr -n merge -v "/media/sn" /media/sn

Change-Id: Id9f984a709d4c2e52a64ae75bb12a8ecb01f8776
BUG: 1028672
Signed-off-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Reviewed-on: http://review.gluster.org/5626
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Special xattr names "clone" &amp; "snapshot" can be used to create full and
linked clone of the LV images. GFID of destination posix file (to be
mapped) is passed as a value to the xattr. Destination posix file must
exist before running this operation.

These operations form a basis for offloading storage related operations
from QEMU to GlusterFS.

Syntax for full clone: xattr name: "clone" value: "gfid-of-dest-file"
Syntax for linked clone: xattr name: "snapshot" value: "gfid-of-dest-file"
Syntax for merging: xattr name: "merge" value: "path-to-snapshot-file"

Example:
	setfattr -n clone -v &lt;gfid-of-dest-file&gt; /media/source
	setfattr -n snapshot -v &lt;gfid-of-dest-file&gt; /media/source
	setfattr -n merge -v "/media/sn" /media/sn

Change-Id: Id9f984a709d4c2e52a64ae75bb12a8ecb01f8776
BUG: 1028672
Signed-off-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Reviewed-on: http://review.gluster.org/5626
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bd: Add aio support to BD xlator</title>
<updated>2013-11-13T19:39:11+00:00</updated>
<author>
<name>M. Mohan Kumar</name>
<email>mohan@in.ibm.com</email>
</author>
<published>2013-11-13T17:14:42+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=b222ce817f5f324fe20d4d3614001ed2f177afb8'/>
<id>b222ce817f5f324fe20d4d3614001ed2f177afb8</id>
<content type='text'>
Volume option bd-aio controls AIO feature for BD xlator. Code taken from
posix-aio.c

Change-Id: Ib049bd59c9d3f9101d33939838322cfa808de053
BUG: 1028672
Signed-off-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Reviewed-on: http://review.gluster.org/5748
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Volume option bd-aio controls AIO feature for BD xlator. Code taken from
posix-aio.c

Change-Id: Ib049bd59c9d3f9101d33939838322cfa808de053
BUG: 1028672
Signed-off-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Reviewed-on: http://review.gluster.org/5748
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bd: posix/multi-brick support to BD xlator</title>
<updated>2013-11-13T19:38:42+00:00</updated>
<author>
<name>M. Mohan Kumar</name>
<email>mohan@in.ibm.com</email>
</author>
<published>2013-11-13T17:14:42+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=48c40e1a42efe1b59126406084821947d139dd0e'/>
<id>48c40e1a42efe1b59126406084821947d139dd0e</id>
<content type='text'>
Current BD xlator (block backend) has a few limitations such as
* Creation of directories not supported
* Supports only single brick
* Does not use extended attributes (and client gfid) like posix xlator
* Creation of special files (symbolic links, device nodes etc) not
  supported

Basic limitation of not allowing directory creation is blocking
oVirt/VDSM to consume BD xlator as part of Gluster domain since VDSM
creates multi-level directories when GlusterFS is used as storage
backend for storing VM images.

To overcome these limitations a new BD xlator with following
improvements is suggested.

* New hybrid BD xlator that handles both regular files and block device
  files
* The volume will have both POSIX and BD bricks. Regular files are
  created on POSIX bricks, block devices are created on the BD brick (VG)
* BD xlator leverages exiting POSIX xlator for most POSIX calls and
  hence sits above the POSIX xlator
* Block device file is differentiated from regular file by an extended
  attribute
* The xattr 'user.glusterfs.bd' (BD_XATTR) plays a role in mapping a
  posix file to Logical Volume (LV).
* When a client sends a request to set BD_XATTR on a posix file, a new
  LV is created and mapped to posix file. So every block device will
  have a representative file in POSIX brick with 'user.glusterfs.bd'
  (BD_XATTR) set.
* Here after all operations on this file results in LV related
  operations.

For example opening a file that has BD_XATTR set results in opening
the LV block device, reading results in reading the corresponding LV
block device.

When BD xlator gets request to set BD_XATTR via setxattr call, it
creates a LV and information about this LV is placed in the xattr of the
posix file. xattr "user.glusterfs.bd" used to identify that posix file
is mapped to BD.

Usage:
Server side:
[root@host1 ~]# gluster volume create bdvol host1:/storage/vg1_info?vg1 host2:/storage/vg2_info?vg2
It creates a distributed gluster volume 'bdvol' with Volume Group vg1
using posix brick /storage/vg1_info in host1 and Volume Group vg2 using
/storage/vg2_info in host2.

[root@host1 ~]# gluster volume start bdvol

Client side:
[root@node ~]# mount -t glusterfs host1:/bdvol /media
[root@node ~]# touch /media/posix
It creates regular posix file 'posix' in either host1:/vg1 or host2:/vg2 brick
[root@node ~]# mkdir /media/image
[root@node ~]# touch /media/image/lv1
It also creates regular posix file 'lv1' in either host1:/vg1 or
host2:/vg2 brick
[root@node ~]# setfattr -n "user.glusterfs.bd" -v "lv" /media/image/lv1
[root@node ~]#
Above setxattr results in creating a new LV in corresponding brick's VG
and it sets 'user.glusterfs.bd' with value 'lv:&lt;default-extent-size'
[root@node ~]# truncate -s5G /media/image/lv1
It results in resizig LV 'lv1'to 5G

New BD xlator code is placed in xlators/storage/bd directory.

Also add volume-uuid to the VG so that same VG can't be used for other
bricks/volumes. After deleting a gluster volume, one has to manually
remove the associated tag using vgchange &lt;vg-name&gt; --deltag
&lt;trusted.glusterfs.volume-id:&lt;volume-id&gt;&gt;

Changes from previous version V5:
* Removed support for delayed deleting of LVs

Changes from previous version V4:
* Consolidated the patches
* Removed usage of BD_XATTR_SIZE and consolidated it in BD_XATTR.

Changes from previous version V3:
* Added support in FUSE to support full/linked clone
* Added support to merge snapshots and provide information about origin
* bd_map xlator removed
* iatt structure used in inode_ctx. iatt is cached and updated during
fsync/flush
* aio support
* Type and capabilities of volume are exported through getxattr

Changes from version 2:
* Used inode_context for caching BD size and to check if loc/fd is BD or
  not.
* Added GlusterFS server offloaded copy and snapshot through setfattr
  FOP. As part of this libgfapi is modified.
* BD xlator supports stripe
* During unlinking if a LV file is already opened, its added to delete
  list and bd_del_thread tries to delete from this list when a last
  reference to that file is closed.

Changes from previous version:
* gfid is used as name of LV
* ? is used to specify VG name for creating BD volume in volume
  create, add-brick. gluster volume create volname host:/path?vg
* open-behind issue is fixed
* A replicate brick can be added dynamically and LVs from source brick
  are replicated to destination brick
* A distribute brick can be added dynamically and rebalance operation
  distributes existing LVs/files to the new brick
* Thin provisioning support added.
* bd_map xlator support retained
* setfattr -n user.glusterfs.bd -v "lv" creates a regular LV and
  setfattr -n user.glusterfs.bd -v "thin" creates thin LV
* Capability and backend information added to gluster volume info (and
--xml) so
  that management tools can exploit BD xlator.
* tracing support for bd xlator added

TODO:
* Add support to display snapshots for a given LV
* Display posix filename for list-origin instead of gfid

Change-Id: I00d32dfbab3b7c806e0841515c86c3aa519332f2
BUG: 1028672
Signed-off-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Reviewed-on: http://review.gluster.org/4809
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Current BD xlator (block backend) has a few limitations such as
* Creation of directories not supported
* Supports only single brick
* Does not use extended attributes (and client gfid) like posix xlator
* Creation of special files (symbolic links, device nodes etc) not
  supported

Basic limitation of not allowing directory creation is blocking
oVirt/VDSM to consume BD xlator as part of Gluster domain since VDSM
creates multi-level directories when GlusterFS is used as storage
backend for storing VM images.

To overcome these limitations a new BD xlator with following
improvements is suggested.

* New hybrid BD xlator that handles both regular files and block device
  files
* The volume will have both POSIX and BD bricks. Regular files are
  created on POSIX bricks, block devices are created on the BD brick (VG)
* BD xlator leverages exiting POSIX xlator for most POSIX calls and
  hence sits above the POSIX xlator
* Block device file is differentiated from regular file by an extended
  attribute
* The xattr 'user.glusterfs.bd' (BD_XATTR) plays a role in mapping a
  posix file to Logical Volume (LV).
* When a client sends a request to set BD_XATTR on a posix file, a new
  LV is created and mapped to posix file. So every block device will
  have a representative file in POSIX brick with 'user.glusterfs.bd'
  (BD_XATTR) set.
* Here after all operations on this file results in LV related
  operations.

For example opening a file that has BD_XATTR set results in opening
the LV block device, reading results in reading the corresponding LV
block device.

When BD xlator gets request to set BD_XATTR via setxattr call, it
creates a LV and information about this LV is placed in the xattr of the
posix file. xattr "user.glusterfs.bd" used to identify that posix file
is mapped to BD.

Usage:
Server side:
[root@host1 ~]# gluster volume create bdvol host1:/storage/vg1_info?vg1 host2:/storage/vg2_info?vg2
It creates a distributed gluster volume 'bdvol' with Volume Group vg1
using posix brick /storage/vg1_info in host1 and Volume Group vg2 using
/storage/vg2_info in host2.

[root@host1 ~]# gluster volume start bdvol

Client side:
[root@node ~]# mount -t glusterfs host1:/bdvol /media
[root@node ~]# touch /media/posix
It creates regular posix file 'posix' in either host1:/vg1 or host2:/vg2 brick
[root@node ~]# mkdir /media/image
[root@node ~]# touch /media/image/lv1
It also creates regular posix file 'lv1' in either host1:/vg1 or
host2:/vg2 brick
[root@node ~]# setfattr -n "user.glusterfs.bd" -v "lv" /media/image/lv1
[root@node ~]#
Above setxattr results in creating a new LV in corresponding brick's VG
and it sets 'user.glusterfs.bd' with value 'lv:&lt;default-extent-size'
[root@node ~]# truncate -s5G /media/image/lv1
It results in resizig LV 'lv1'to 5G

New BD xlator code is placed in xlators/storage/bd directory.

Also add volume-uuid to the VG so that same VG can't be used for other
bricks/volumes. After deleting a gluster volume, one has to manually
remove the associated tag using vgchange &lt;vg-name&gt; --deltag
&lt;trusted.glusterfs.volume-id:&lt;volume-id&gt;&gt;

Changes from previous version V5:
* Removed support for delayed deleting of LVs

Changes from previous version V4:
* Consolidated the patches
* Removed usage of BD_XATTR_SIZE and consolidated it in BD_XATTR.

Changes from previous version V3:
* Added support in FUSE to support full/linked clone
* Added support to merge snapshots and provide information about origin
* bd_map xlator removed
* iatt structure used in inode_ctx. iatt is cached and updated during
fsync/flush
* aio support
* Type and capabilities of volume are exported through getxattr

Changes from version 2:
* Used inode_context for caching BD size and to check if loc/fd is BD or
  not.
* Added GlusterFS server offloaded copy and snapshot through setfattr
  FOP. As part of this libgfapi is modified.
* BD xlator supports stripe
* During unlinking if a LV file is already opened, its added to delete
  list and bd_del_thread tries to delete from this list when a last
  reference to that file is closed.

Changes from previous version:
* gfid is used as name of LV
* ? is used to specify VG name for creating BD volume in volume
  create, add-brick. gluster volume create volname host:/path?vg
* open-behind issue is fixed
* A replicate brick can be added dynamically and LVs from source brick
  are replicated to destination brick
* A distribute brick can be added dynamically and rebalance operation
  distributes existing LVs/files to the new brick
* Thin provisioning support added.
* bd_map xlator support retained
* setfattr -n user.glusterfs.bd -v "lv" creates a regular LV and
  setfattr -n user.glusterfs.bd -v "thin" creates thin LV
* Capability and backend information added to gluster volume info (and
--xml) so
  that management tools can exploit BD xlator.
* tracing support for bd xlator added

TODO:
* Add support to display snapshots for a given LV
* Display posix filename for list-origin instead of gfid

Change-Id: I00d32dfbab3b7c806e0841515c86c3aa519332f2
BUG: 1028672
Signed-off-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Reviewed-on: http://review.gluster.org/4809
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
