From abc043312db6307aabf95f593bfa405ac28a2c8a Mon Sep 17 00:00:00 2001
From: hchiramm
Date: Thu, 25 Sep 2014 13:23:46 +0530
Subject: doc: update glossary documentation

Change-Id: I0da7c3aaaeb43f633b4d6355ef83ac2d21e24d68
Signed-off-by: Divya
Signed-off-by: Pavithra
Signed-off-by: Humble Devassy Chirammal
Reviewed-on: http://review.gluster.org/8847
Tested-by: Gluster Build System
Reviewed-by: Krishnan Parthasarathi
Reviewed-by: Humble Devassy Chirammal
---
 doc/admin-guide/en-US/markdown/glossary.md | 182 ++++++++++++++++++++++-------
 1 file changed, 142 insertions(+), 40 deletions(-)

diff --git a/doc/admin-guide/en-US/markdown/glossary.md b/doc/admin-guide/en-US/markdown/glossary.md
index d047622a93b..496d0a428d4 100644
--- a/doc/admin-guide/en-US/markdown/glossary.md
+++ b/doc/admin-guide/en-US/markdown/glossary.md
@@ -2,31 +2,69 @@ Glossary
 ========
 
 **Brick**
-: A Brick is the basic unit of storage in GlusterFS, represented by an
-  export directory on a server in the trusted storage pool. A Brick is
-  represented by combining a server name with an export directory in the
-  following format:
-
-  `SERVER:EXPORT`
+: A Brick is the basic unit of storage in GlusterFS, represented by an export
+  directory on a server in the trusted storage pool.
+  A brick is expressed by combining a server name with an export directory in
+  the following format:
+  `SERVER:EXPORT`
 
   For example:
+  `myhostname:/exports/myexportdir/`
+
+**Volume**
+: A volume is a logical collection of bricks. Most of the gluster
+  management operations happen on the volume.
 
-  `myhostname:/exports/myexportdir/`
-**Client**
-: Any machine that mounts a GlusterFS volume.
+**Subvolume**
+: A brick after being processed by at least one translator, or, more generally,
+  a set of one or more translators (xlators) stacked together, is called a
+  subvolume.
+
+**Volfile**
+: Volume (vol) files are configuration files that determine the behavior of the
+  GlusterFS trusted storage pool. A volume file is a textual representation of
+  a collection of modules (also known as translators) that together implement
+  the various functions required. The collection of modules is arranged in a
+  graph-like fashion. For example, a replicated volume's volfile, among other
+  things, would have a section describing the replication translator and its
+  tunables. That section describes how the volume replicates data written to
+  it. Further, a client process that serves a mount point interprets its
+  volfile and loads the translators described in it. While serving I/O, it
+  passes each request to the collection of modules in the order specified in
+  the volfile.
+
+  At a high level, GlusterFS has three entities: the server, the client, and
+  the management daemon. Each of these entities has its own volume files.
+  Volume files for servers and clients are generated by the management daemon
+  after the volume is created.
+
+  Server and client vol files are located in the /var/lib/glusterd/vols/VOLNAME
+  directory. The management daemon vol file is named glusterd.vol and is
+  located in the /etc/glusterfs/ directory.
+
+**glusterd**
+: The daemon/service that manages volumes and cluster membership. It is
+  required to run on all the servers in the trusted storage pool.
+
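+  As an illustrative sketch only, a typical sequence for adding a second
+  server to the trusted storage pool and creating a volume from two bricks
+  might look like the following (host names, volume name, and brick paths are
+  placeholders):
+
+      # illustrative names: replace server2, myvol, and the brick paths
+      gluster peer probe server2
+      gluster volume create myvol server1:/exports/brick1 server2:/exports/brick1
+      gluster volume start myvol
+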
 **Cluster**
-: A cluster is a group of linked computers, working together closely
-  thus in many respects forming a single computer.
+: A trusted pool of linked computers working together, resembling a single
+  computing resource. In GlusterFS, a cluster is also referred to as a
+  trusted storage pool.
+
+**Client**
+: Any machine that mounts a GlusterFS volume. Any application that uses the
+  libgfapi access mechanism can also be treated as a client in the GlusterFS
+  context.
+
+**Server**
+: The machine (virtual or bare metal) that hosts the bricks in which data is
+  stored.
+
+**Block Storage**
+: Block special files, or block devices, correspond to devices through which
+  the system moves data in the form of blocks. These device nodes often
+  represent addressable devices such as hard disks, CD-ROM drives, or memory
+  regions. GlusterFS requires a filesystem (such as XFS) that supports
+  extended attributes.
 
-**Distributed File System**
-: A file system that allows multiple clients to concurrently access
-  data over a computer network.
-**Extended Attributes**
-: Extended file attributes (abbreviated xattr) is a file system feature
-  that enables users/programs to associate files/dirs with metadata.
 
 **Filesystem**
 : A method of storing and organizing computer files and their data.
@@ -36,28 +74,43 @@ Glossary
 
     Source: [Wikipedia][]
 
+**Distributed File System**
+: A file system that allows multiple clients to concurrently access data that
+  is spread across servers/bricks in a trusted storage pool. Data sharing
+  among multiple locations is fundamental to all distributed file systems.
+
+**Virtual File System (VFS)**
+: VFS is a kernel software layer that handles all system calls related to the
+  standard Linux filesystem. It provides a common interface to several kinds
+  of file systems.
+
+**POSIX**
+: Portable Operating System Interface (for Unix) is the name of a
+  family of related standards specified by the IEEE to define the
+  application programming interface (API), along with shell and
+  utilities interfaces for software compatible with variants of the
+  Unix operating system. Gluster exports a fully POSIX compliant file
+  system.
+
+**Extended Attributes**
+: Extended file attributes (abbreviated xattr) is a filesystem feature
+  that enables users/programs to associate files/dirs with metadata.
+
 **FUSE**
 : Filesystem in Userspace (FUSE) is a loadable kernel module for
   Unix-like computer operating systems that lets non-privileged users
-  create their own file systems without editing kernel code. This is
-  achieved by running file system code in user space while the FUSE
+  create their own filesystems without editing kernel code. This is
+  achieved by running filesystem code in user space while the FUSE
   module provides only a "bridge" to the actual kernel interfaces.
 
     Source: [Wikipedia][1]
 
-**Geo-Replication**
-: Geo-replication provides a continuous, asynchronous, and incremental
-  replication service from site to another over Local Area Networks
-  (LAN), Wide Area Network (WAN), and across the Internet.
 
 **GFID**
 : Each file/directory on a GlusterFS volume has a unique 128-bit number
   associated with it called the GFID. This is analogous to inode in a
   regular filesystem.
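+
+  As an illustrative sketch only, the GFID of a file can be read from its
+  `trusted.gfid` extended attribute directly on the brick that stores it (the
+  brick path and file name are placeholders):
+
+      # run on the server hosting the brick; the path is a placeholder
+      getfattr -n trusted.gfid -e hex /exports/brick1/file1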
-**glusterd**
-: The Gluster management daemon that needs to run on all servers in
-  the trusted storage pool.
 
 **Infiniband**
     InfiniBand is a switched fabric computer network communications link
@@ -108,13 +161,7 @@ Glossary
 
     Source: [Wikipedia][3]
 
-**POSIX**
-: Portable Operating System Interface (for Unix) is the name of a
-  family of related standards specified by the IEEE to define the
-  application programming interface (API), along with shell and
-  utilities interfaces for software compatible with variants of the
-  Unix operating system. Gluster exports a fully POSIX compliant file
-  system.
+
 
 **Quorum**
 : The configuration of quorum in a trusted storage pool determines the
@@ -123,7 +170,7 @@ Glossary
     unavailable.
 
 **Quota**
-: Quotas allow you to set limits on usage of disk space by directories or
+: Quota allows you to set limits on usage of disk space by directories or
     by volumes.
 
 **RAID**
@@ -178,19 +225,74 @@ Glossary
     start the first server, the storage pool consists of that server
     alone.
 
+**Scale-Up Storage**
+: Increases the capacity of the storage device in a single dimension.
+  For example, adding more disk capacity to an existing trusted storage pool.
+
+**Scale-Out Storage**
+: Scale-out systems are designed to scale in both capacity and performance,
+  increasing the capability of the storage environment in multiple dimensions.
+  For example, adding more systems of the same size, or adding servers to a
+  trusted storage pool, increases CPU, disk capacity, and throughput for the
+  trusted storage pool.
+
 **Userspace**
 : Applications running in user space don’t directly interact with
   hardware, instead using the kernel to moderate access. Userspace
   applications are generally more portable than applications in kernel
   space. Gluster is a user space application.
 
-**Volfile**
-: Volfile is a configuration file used by glusterfs process. Volfile
-  will be usually located at `/var/lib/glusterd/vols/VOLNAME`.
-
-**Volume**
-: A volume is a logical collection of bricks. Most of the gluster
-  management operations happen on the volume.
+**Geo-Replication**
+: Geo-replication provides a continuous, asynchronous, and incremental
+  replication service from one site to another over Local Area Networks
+  (LANs), Wide Area Networks (WANs), and across the Internet.
+
+**N-way Replication**
+: Local synchronous data replication, typically deployed across a campus or
+  across Amazon Web Services Availability Zones.
+
+**Distributed Hash Table Terminology**
+
+**Hashed subvolume**
+: The Distributed Hash Table (DHT) translator subvolume to which a file or
+  directory name hashes.
+
+**Cached subvolume**
+: The Distributed Hash Table (DHT) translator subvolume where the file content
+  is actually present. For directories, the concept of a cached subvolume is
+  not relevant; the term is loosely used to mean subvolumes that are not the
+  hashed subvolume.
+
+**Linkto-file**
+: For a newly created file, the hashed and cached subvolumes are the same.
+  When directory entry operations like rename (which can change the name, and
+  hence the hashed subvolume, of the file) are performed on the file, instead
+  of moving the entire data in the file to a new hashed subvolume, a file is
+  created with the same name on the newly hashed subvolume.
+  The purpose of this file is only to act as a pointer to the node where the
+  data is present; the name of the cached subvolume is stored in its extended
+  attributes. This file on the newly hashed subvolume is called a linkto-file.
+  The linkto-file is relevant only for non-directory entities.
+
+**Directory Layout**
+: The directory layout specifies, for a given directory, the hash ranges of its
+  entries and the subvolumes to which those ranges correspond.
+
+**Properties of directory layouts:**
+: The layouts are created at the time of directory creation and are persisted
+  as extended attributes of the directory.
+  A subvolume is not included in the layout if it remained offline at the time
+  of directory creation and no directory entries (such as files and
+  directories) of that directory are created on that subvolume.
+  The subvolume is not part of the layout until the fix-layout is completed as
+  part of running the rebalance command. If a subvolume is down during access
+  (after directory creation), access to any files that hash to that subvolume
+  fails.
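+
+  As an illustrative sketch only, the layout that a directory has on a given
+  subvolume can be inspected on the corresponding brick through the
+  `trusted.glusterfs.dht` extended attribute (the brick path and directory
+  name are placeholders):
+
+      # run on the server hosting the brick; the path is a placeholder
+      getfattr -n trusted.glusterfs.dht -e hex /exports/brick1/mydir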
+
+**Fix Layout**
+: A command that is executed during the rebalance process.
+  The rebalance process itself comprises two stages:
+
+  1. Fixes the layouts of directories to accommodate any subvolumes that are
+     added or removed. It also heals the directories, checks whether the
+     layout is non-contiguous, and persists the layout in extended attributes,
+     if needed. It also ensures that the directories have the same attributes
+     across all the subvolumes.
+
+  2. Migrates the data from the cached subvolume to the hashed subvolume.
 
 [Wikipedia]: http://en.wikipedia.org/wiki/Filesystem
 [1]: http://en.wikipedia.org/wiki/Filesystem_in_Userspace
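+
+  As an illustrative sketch only, the fix-layout stage and the full rebalance
+  (including data migration) described above are driven by the following
+  commands (the volume name is a placeholder):
+
+      # fix the directory layouts only, without migrating data
+      gluster volume rebalance myvol fix-layout start
+      # full rebalance: fix layouts and migrate data, then check progress
+      gluster volume rebalance myvol start
+      gluster volume rebalance myvol status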