cloudArchival: Added feature page and design document

Change-Id: Iff9025dc28ae1b12213b564903b03001251e8aff
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: https://review.gluster.org/18854
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Tested-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
author: Susant Palai <spalai@redhat.com> 2017-11-26 11:49:48 +0530
committer: Niels de Vos <ndevos@redhat.com> 2018-01-18 08:50:30 +0000
commit: ee1c8b52721ce815bc98fd60a6b0e867848c8d79 (patch)
tree: d6c4bb262293292e399f5107d9dfa68099584f55
parent: ee5fab234b0a9d317f7ec7cb7eb05c2d1172e94e (diff)
2 files changed, 144 insertions, 0 deletions
diff --git a/accepted/CloudArchival.md b/accepted/CloudArchival.md
new file mode 100644
index 0000000..ed25fea
--- /dev/null
+++ b/accepted/CloudArchival.md
@@ -0,0 +1,84 @@
+# CloudArchival
+
+### Goal
+
+A new Cloud archival story for Glusterfs.
+
+### Summary
+The feature will archive cold data to cloud storage. For applications where the
+majority of the data is not accessed or modified frequently, that data can be
+archived to low-cost cloud storage, and the local storage system (Glusterfs)
+space can be used for files that need high-performance operations.
+
+### Owners
+
+Aravinda Krishna Murthy <avishwan@redhat.com>
+
+Susant Kumar Palai <spalai@redhat.com>
+
+### Current Status
+Feature under development
+### Detailed Description
+
+A scanner/uploader tool will run a policy-based (tunable) scan and will upload
+files to the cloud storage. After the data is migrated to the cloud, the
+downloader xlator will store the size information as an xattr and truncate the
+file. Any meta-data operation will be served locally from Glusterfs until the
+next data modification request. On a data modification, the request will be
+stubbed and the downloader will download the file from the cloud. Upon success,
+the stubbed request will be resumed.
+
+
+### Benefits to GlusterFS
+This archival feature will be of immense benefit to users where the majority of
+the data in their storage system is cold. With this, users can leverage the
+in-house Glusterfs space for high-performance jobs.
+
+### Scope
+
+### Nature of proposed change
+
+- An uploader tool - Its role is to scan the file system and upload files to
+  the cloud based on a user-defined policy.
+
+- Downloader xlator - This xlator will intercept data modification requests on
+  a file which resides in the cloud. A download operation will be initiated,
+  after which the data modification request will be resumed.
+
+### Implications on manageability
+At a high level, commands to enable and configure the downloader xlator.
+
+### Implications on presentation layer
+N/A
+
+### Implications on persistence layer
+N/A
+
+### Implications on 'GlusterFS' backend
+None
+
+### Modification to GlusterFS metadata
+Post archival, a size xattr will be set on the file to serve meta-data requests,
+as the file would have been truncated.
+
+### Implications on 'glusterd'
+Volgen must be able to configure the downloader xlator and store information
+related to the cloud provider and its access.
+
+### How to Test
+
+N/A
+
+### User Experience
+Minimal change, mostly related to new options. Some latency will be experienced
+while the file is being downloaded from the cloud during data modification.
+
+### Dependencies
+N/A
+### Documentation
+TBD.
+### Status
+
+Patches being worked on:
+
+- https://review.gluster.org/#/c/18532/ (Downloader Xlator)
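The truncate, serve-metadata-locally, and stub/download/resume behaviour described in the feature page above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not the downloader xlator's implementation: the xattr key `trusted.example.archived-size`, the `ArchivedFile` and `Downloader` classes, and the dictionary standing in for the cloud store are all hypothetical names, and real xlators are written in C against the GlusterFS translator interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical xattr key; the spec does not name the real one.
SIZE_XATTR = "trusted.example.archived-size"


@dataclass
class ArchivedFile:
    """In-memory stand-in for a file on Glusterfs (illustration only)."""
    data: bytes
    xattrs: Dict[str, str] = field(default_factory=dict)

    @property
    def is_archived(self) -> bool:
        return SIZE_XATTR in self.xattrs


class Downloader:
    """Toy model: truncate after upload, restore on data modification."""

    def __init__(self, cloud: Dict[str, bytes]) -> None:
        self.cloud = cloud  # stand-in for the cloud object store

    def on_upload_complete(self, f: ArchivedFile) -> None:
        # Record the size first, then drop the local data.
        f.xattrs[SIZE_XATTR] = str(len(f.data))
        f.data = b""

    def stat_size(self, f: ArchivedFile) -> int:
        # Metadata is served locally; use the xattr while archived.
        if f.is_archived:
            return int(f.xattrs[SIZE_XATTR])
        return len(f.data)

    def write(self, name: str, f: ArchivedFile, offset: int, buf: bytes) -> None:
        def do_write() -> None:
            f.data = f.data[:offset] + buf + f.data[offset + len(buf):]

        self._resume_after_download(name, f, do_write)

    def _resume_after_download(
        self, name: str, f: ArchivedFile, stub: Callable[[], None]
    ) -> None:
        # Hold the request as a stub, download if needed, then resume it.
        if f.is_archived:
            f.data = self.cloud[name]
            del f.xattrs[SIZE_XATTR]
        stub()
```

In this model the uploader would first copy the data to the cloud (for example `cloud["a"] = f.data`) and then trigger `on_upload_complete(f)`. A later `write` restores the content before applying the change, while `stat_size` never touches the cloud.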
diff --git a/design/Cloud-Archival/CloudArchive.md b/design/Cloud-Archival/CloudArchive.md
new file mode 100644
index 0000000..9506429
--- /dev/null
+++ b/design/Cloud-Archival/CloudArchive.md
@@ -0,0 +1,60 @@
+# CloudArchival-Design.md
+
+This document gives a high level overview of CloudArchival. The design is being
+refined as we go along, and this document will be updated along the way.
+
+## Introduction
+
+This design solves the use case where data that requires high-speed access is
+retained internally, i.e. on Glusterfs, and lower-priority data is moved to
+low-cost cloud-based archive storage. This will reduce storage cost for use
+cases where a majority of the data is cold and can be archived.
+
+## Architectural Overview
+
+CloudArchival has two components: a scanner/uploader tool and a downloader
+xlator in the Glusterfs stack.
+
+### 1. Scanner/uploader
+
+This tool will scan the file system and, based on a policy, will upload the data
+to a pre-decided cloud storage. The policy can be user-defined. A simple example
+would be: upload any file that has not been accessed for one month.
+
+### 2. Downloader
+
+This xlator will download the file from cloud storage when a read/write request
+(basically any data access or modification) is made. This xlator will be
+placed on the client side, as the AFR and EC xlators are client xlators.
+
+## Work Flow
+
+- Phase I - Post scanning, the uploader will filter out the files to be
+  archived to the cloud. Once the data migration to the cloud is complete, the
+  uploader will do a setxattr operation on the file to inform the downloader
+  xlator to truncate the data. As part of this operation, the downloader will
+  store the size information as an xattr on the file to serve lookup/stat etc.,
+  and will then truncate the data.
+
+
+- Phase II - While the data resides in the cloud, all meta-data operations can
+  be performed locally on Glusterfs. The data will be downloaded only when a
+  data modification is requested. For a read/write request, the downloader will
+  stub the request and start downloading the file from the cloud. Upon
+  successful download, the stubbed request will be resumed.
+
+## Cloud Information and Security
+
+Cloud information, such as the cloud provider and its access information, can
+be stored on a per-volume basis through Glusterd. Only one cloud storage can be
+attached to a volume.
+
+Since the communication channel to the cloud needs to be secured, the access
+information for the cloud must reside on the trusted storage pool.
+GF-proxy fits this requirement nicely as it runs on the trusted storage pool
+(as of now). Hence, the downloader will be part of the GF-proxy daemon on the
+trusted storage pool.
+
+#### Note: Initial implementation will integrate with Amazon Web Services (AWS).
+Integration with other cloud storage providers will be left open for development
+by the community.
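The example policy from the Scanner/uploader section of the design document (upload any file that has not been accessed for one month) could look roughly like the sketch below. It is an assumption-laden illustration, not the actual tool: `is_cold`, `scan`, and the 30-day approximation of "one month" are hypothetical, and the real scanner may select files by a different mechanism.

```python
import os
import stat
import time
from typing import Iterator, Optional

# "One month" approximated as 30 days (an assumption, not from the design).
ONE_MONTH_SECONDS = 30 * 24 * 60 * 60


def is_cold(atime: float, now: float, max_idle: float = ONE_MONTH_SECONDS) -> bool:
    """Policy: a file is cold if it was not accessed for more than max_idle seconds."""
    return now - atime > max_idle


def scan(root: str, now: Optional[float] = None) -> Iterator[str]:
    """Walk root and yield paths of regular files that the policy marks as cold."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path, follow_symlinks=False)
            # Only regular files are candidates; symlinks and others are skipped.
            if stat.S_ISREG(st.st_mode) and is_cold(st.st_atime, now):
                yield path
```

Each yielded path would then be handed to the upload step. An access-time policy depends on the file system recording access times, so mount options such as `noatime` or `relatime` affect what this kind of scan sees.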