From 60bdca792b7e572b4d79382dada1c6b93bebdd0e Mon Sep 17 00:00:00 2001
From: Vijay Bellur
Date: Wed, 3 Jul 2013 13:02:29 +0530
Subject: doc: Moved non-relevant documentation files to legacy

Change-Id: I2d34e5a4e47cd03d301d9fd2525fb61ae997fcb8
BUG: 811311
Signed-off-by: Vijay Bellur
Reviewed-on: http://review.gluster.org/5277
Tested-by: Gluster Build System
---
 doc/legacy/docbook/admin_Hadoop.xml | 244 ++++++++++++++++++++++++++++++++++++
 1 file changed, 244 insertions(+)
 create mode 100644 doc/legacy/docbook/admin_Hadoop.xml

diff --git a/doc/legacy/docbook/admin_Hadoop.xml b/doc/legacy/docbook/admin_Hadoop.xml
new file mode 100644
index 00000000000..08bac89615b
--- /dev/null
+++ b/doc/legacy/docbook/admin_Hadoop.xml
@@ -0,0 +1,244 @@

  Managing Hadoop Compatible Storage
  GlusterFS provides compatibility for Apache Hadoop: it uses the standard file system
APIs available in Hadoop to provide a new storage option for Hadoop deployments. Existing
MapReduce-based applications can use GlusterFS seamlessly. This functionality opens up data
within Hadoop deployments to any file-based or object-based application.


    Architecture Overview
    The following diagram illustrates Hadoop integration with GlusterFS:
    (Figure: Hadoop integration with GlusterFS architecture.)

+

    Advantages
    The following are the advantages of Hadoop Compatible Storage with GlusterFS:

      Provides simultaneous file-based and object-based access within Hadoop.

      Eliminates the centralized metadata server.

      Provides compatibility with MapReduce applications without requiring any rewrite.

      Provides a fault-tolerant file system.

+

    Preparing to Install Hadoop Compatible Storage
    This section describes the prerequisites for Hadoop Compatible Storage and the dependencies
that are installed along with it.


    Pre-requisites
    The following are the pre-requisites to install Hadoop Compatible Storage:

      Hadoop 0.20.2 installed, configured, and running on all machines in the cluster.

      Java Runtime Environment.

      Maven (mandatory only if you are building the plugin from source).

      JDK (mandatory only if you are building the plugin from source).

      getfattr command line utility.
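The checks above can be sketched as a small shell script. This is a minimal sketch, assuming the tools are reachable via the standard PATH; the `mvn` and `javac` binary names are assumptions about how Maven and the JDK are installed on your distribution.

```shell
#!/bin/sh
# Minimal sketch: report which of the prerequisites listed above are
# present on this machine. Assumes the tools are reachable via PATH.
check() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

check java      # Java Runtime Environment
check getfattr  # extended-attribute command line utility
check mvn       # Maven; only needed when building the plugin from source
check javac     # JDK compiler; only needed when building from source
```

Run it on every server in the cluster before installing; any `missing:` line points at a package still to be installed.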
+
+

    Installing and Configuring Hadoop Compatible Storage
    This section describes how to install and configure Hadoop Compatible Storage in your storage
environment and verify that it is functioning correctly.

    To install and configure Hadoop compatible storage:

      Download the glusterfs-hadoop-0.20.2-0.1.x86_64.rpm file to each server in your cluster. You can download the file from .

      To install Hadoop Compatible Storage on all servers in your cluster, run the following command:

      # rpm -ivh --nodeps glusterfs-hadoop-0.20.2-0.1.x86_64.rpm

      The following files will be extracted:

        /usr/local/lib/glusterfs-Hadoop-version-gluster_plugin_version.jar

        /usr/local/lib/conf/core-site.xml

      (Optional) To install Hadoop Compatible Storage in a different location, run the following
command:

      # rpm -ivh --nodeps --prefix /usr/local/glusterfs/hadoop glusterfs-hadoop-0.20.2-0.1.x86_64.rpm

      Edit the conf/core-site.xml file. The following is a sample conf/core-site.xml file:

      <configuration>
  <property>
    <name>fs.glusterfs.impl</name>
    <value>org.apache.hadoop.fs.glusterfs.GlusterFileSystem</value>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>glusterfs://fedora1:9000</value>
  </property>

  <property>
    <name>fs.glusterfs.volname</name>
    <value>hadoopvol</value>
  </property>

  <property>
    <name>fs.glusterfs.mount</name>
    <value>/mnt/glusterfs</value>
  </property>

  <property>
    <name>fs.glusterfs.server</name>
    <value>fedora2</value>
  </property>

  <property>
    <name>quick.slave.io</name>
    <value>Off</value>
  </property>
</configuration>

      The following are the configurable fields:

        Property Name
        Default Value
        Description

        fs.default.name
        glusterfs://fedora1:9000
        Any hostname in the cluster as the server, and any port number.

        fs.glusterfs.volname
        hadoopvol
        The GlusterFS volume to mount.

        fs.glusterfs.mount
        /mnt/glusterfs
        The directory used to FUSE-mount the volume.

        fs.glusterfs.server
        fedora2
        Any hostname or IP address in the cluster except the client/master.

        quick.slave.io
        Off
        Performance tunable option. If this option is set to On, the plugin attempts to perform I/O directly against the disk file system (such as ext3 or ext4) on which the file resides, so read performance improves and jobs run faster. This option is not widely tested.

      Create a soft link in Hadoop's library and configuration directory for the files extracted above, using the following command:

      # ln -s <target location> <source location>

      For example:

      # ln -s /usr/local/lib/glusterfs-0.20.2-0.1.jar $HADOOP_HOME/lib/glusterfs-0.20.2-0.1.jar

      # ln -s /usr/local/lib/conf/core-site.xml $HADOOP_HOME/conf/core-site.xml

      (Optional) Instead of repeating the above steps, you can run the following command on the Hadoop
master to build the plugin and deploy it along with the core-site.xml file:

      # build-deploy-jar.py -d $HADOOP_HOME -c

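As a quick sanity check of the configuration above, the file system URI that Hadoop clients will use can be assembled from the sample core-site.xml values. A minimal sketch; `fedora1` and port `9000` are the sample values from the configuration above, not required names:

```shell
#!/bin/sh
# Minimal sketch: assemble the glusterfs:// URI that Hadoop clients use,
# from the sample core-site.xml values above (fedora1:9000 are sample
# values taken from that file, not required names).
GLUSTER_SERVER=fedora1
GLUSTER_PORT=9000
FS_URI="glusterfs://${GLUSTER_SERVER}:${GLUSTER_PORT}"
echo "$FS_URI"

# Smoke test on a configured node (requires a running cluster):
# $HADOOP_HOME/bin/hadoop fs -ls "$FS_URI/"
```

The `fs.default.name` value in core-site.xml must match this URI exactly, or Hadoop will fall back to its default file system.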
+

    Starting and Stopping the Hadoop MapReduce Daemon
    To start and stop the MapReduce daemon:

      To start the MapReduce daemon manually, enter the following command:

      # $HADOOP_HOME/bin/start-mapred.sh

      To stop the MapReduce daemon manually, enter the following command:

      # $HADOOP_HOME/bin/stop-mapred.sh

    You must start the Hadoop MapReduce daemon on all servers.
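To confirm the daemon actually came up after running start-mapred.sh, the JDK's `jps` process listing can be inspected for the JobTracker process (the MapReduce master process name in Hadoop 0.20.x). A minimal sketch; the helper name `mapred_status` is hypothetical:

```shell
#!/bin/sh
# Minimal sketch: decide whether the MapReduce master daemon is up from
# a jps process listing. mapred_status is a hypothetical helper name;
# JobTracker is the MapReduce master process name in Hadoop 0.20.x.
mapred_status() {
    case "$1" in
        *JobTracker*) echo running ;;
        *)            echo stopped ;;
    esac
}

# On a live node (jps ships with the JDK):
# mapred_status "$(jps)"
```

Run the same check on the worker nodes with `TaskTracker` in place of `JobTracker` if you also want to verify the workers.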
+
-- 