author     Humble Devassy Chirammal <hchiramm@redhat.com>    2015-06-08 20:35:38 +0530
committer  Humble Devassy Chirammal <hchiramm@redhat.com>    2015-06-09 14:52:10 +0530
commit     5be489089407fc410c7157e39c73c6eb910696b8 (patch)
tree       467354ebc4f56e930a0b7d639bd49264b004ee9a /doc/debugging
parent     a2a370db6db80e9365d0777701786ce706957f42 (diff)
doc: Remove doc directories
At present gluster documentation is available at http://gluster.readthedocs.org/en/latest/ and the source project is https://github.com/gluster/glusterdocs. From here on, patches have to be sent against the glusterdocs project in the GitHub repo. For more details refer to http://www.gluster.org/pipermail/gluster-users/2015-May/022065.html

Change-Id: I6d7d20d34ca4ee36356f0dc67204f28350dbf94c
BUG: 1206539
Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>
Diffstat (limited to 'doc/debugging')
-rw-r--r--  doc/debugging/gfid-to-path.md |  73
-rw-r--r--  doc/debugging/split-brain.md  | 251
-rw-r--r--  doc/debugging/statedump.md    | 389
3 files changed, 0 insertions, 713 deletions
diff --git a/doc/debugging/gfid-to-path.md b/doc/debugging/gfid-to-path.md
deleted file mode 100644
index 09c459e52c8..00000000000
--- a/doc/debugging/gfid-to-path.md
+++ /dev/null
@@ -1,73 +0,0 @@
-#Convert GFID to Path
-
-The GlusterFS internal file identifier (GFID) is a UUID that is unique to each
-file across the entire cluster. It is analogous to the inode number in a
-normal filesystem. The GFID of a file is stored in its xattr named
-`trusted.gfid`.
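-
-For example, this xattr can be read directly on a brick (an illustrative sketch;
-the brick path below is only an example and the command needs root):
-~~~
-getfattr -n trusted.gfid -e hex /mnt/brick-test/b/dir/file
-~~~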
-
-####Special mount using [gfid-access translator][1]:
-~~~
-mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol
-~~~
-
-Assume you have the `GFID` of a file from the changelog (or somewhere else).
-To try this out, you can get the `GFID` of a file from the mountpoint:
-~~~
-getfattr -n glusterfs.gfid.string /mnt/testvol/dir/file
-~~~
-
-
----
-###Get file path from GFID (Method 1):
-**(Lists hardlinks delimited by `:`, returns the path as seen from the mountpoint)**
-
-####Turn on build-pgfid option
-~~~
-gluster volume set test build-pgfid on
-~~~
-Read the virtual xattr `glusterfs.ancestry.path`, which contains the file path:
-~~~
-getfattr -n glusterfs.ancestry.path -e text /mnt/testvol/.gfid/<GFID>
-~~~
-
-**Example:**
-~~~
-[root@vm1 glusterfs]# ls -il /mnt/testvol/dir/
-total 1
-10610563327990022372 -rw-r--r--. 2 root root 3 Jul 17 18:05 file
-10610563327990022372 -rw-r--r--. 2 root root 3 Jul 17 18:05 file3
-
-[root@vm1 glusterfs]# getfattr -n glusterfs.gfid.string /mnt/testvol/dir/file
-getfattr: Removing leading '/' from absolute path names
-# file: mnt/testvol/dir/file
-glusterfs.gfid.string="11118443-1894-4273-9340-4b212fa1c0e4"
-
-[root@vm1 glusterfs]# getfattr -n glusterfs.ancestry.path -e text /mnt/testvol/.gfid/11118443-1894-4273-9340-4b212fa1c0e4
-getfattr: Removing leading '/' from absolute path names
-# file: mnt/testvol/.gfid/11118443-1894-4273-9340-4b212fa1c0e4
-glusterfs.ancestry.path="/dir/file:/dir/file3"
-~~~
-
----
-###Get file path from GFID (Method 2):
-**(Does not list all hardlinks, returns backend brick path)**
-~~~
-getfattr -n trusted.glusterfs.pathinfo -e text /mnt/testvol/.gfid/<GFID>
-~~~
-
-**Example:**
-~~~
-[root@vm1 glusterfs]# getfattr -n trusted.glusterfs.pathinfo -e text /mnt/testvol/.gfid/11118443-1894-4273-9340-4b212fa1c0e4
-getfattr: Removing leading '/' from absolute path names
-# file: mnt/testvol/.gfid/11118443-1894-4273-9340-4b212fa1c0e4
-trusted.glusterfs.pathinfo="(<DISTRIBUTE:test-dht> <POSIX(/mnt/brick-test/b):vm1:/mnt/brick-test/b/dir//file3>)"
-~~~
-
----
-###Get file path from GFID (Method 3):
-https://gist.github.com/semiosis/4392640
-
----
-####References and links:
-[posix: placeholders for GFID to path conversion](http://review.gluster.org/5951)
-[1]: https://github.com/gluster/glusterfs/blob/master/doc/features/gfid-access.md
diff --git a/doc/debugging/split-brain.md b/doc/debugging/split-brain.md
deleted file mode 100644
index b0d938e26bc..00000000000
--- a/doc/debugging/split-brain.md
+++ /dev/null
@@ -1,251 +0,0 @@
-Steps to recover from file split-brain
-======================================
-
-Quick Start:
-============
-1. Get the path of the file that is in split-brain:
-> It can be obtained either by
-> a) running the command `gluster volume heal info split-brain`, or
-> b) identifying the files for which file operations performed
-   from the client keep failing with Input/Output errors.
-
-2. Close the applications that opened this file from the mount point.
-In case of VMs, they need to be powered-off.
-
-3. Decide on the correct copy:
-> This is done by observing the afr changelog extended attributes of the file on
-the bricks using the getfattr command; then identifying the type of split-brain
-(data split-brain, metadata split-brain, entry split-brain or split-brain due to
-gfid-mismatch); and finally determining which of the bricks contains the 'good copy'
-of the file.
-> `getfattr -d -m . -e hex <file-path-on-brick>`.
-It is also possible that one brick might contain the correct data while the
-other might contain the correct metadata.
-
-4. Reset the relevant extended attribute on the brick(s) that contains the
-'bad copy' of the file data/metadata using the setfattr command.
-> `setfattr -n <attribute-name> -v <attribute-value> <file-path-on-brick>`
-
-5. Trigger self-heal on the file by performing lookup from the client:
-> `ls -l <file-path-on-gluster-mount>`
-
-Detailed Instructions for steps 3 through 5:
-===========================================
-To understand how to resolve split-brain we need to know how to interpret the
-afr changelog extended attributes.
-
-Execute `getfattr -d -m . -e hex <file-path-on-brick>`
-
-* Example:
-```
-[root@store3 ~]# getfattr -d -e hex -m. brick-a/file.txt
-#file: brick-a/file.txt
-security.selinux=0x726f6f743a6f626a6563745f723a66696c655f743a733000
-trusted.afr.vol-client-2=0x000000000000000000000000
-trusted.afr.vol-client-3=0x000000000200000000000000
-trusted.gfid=0x307a5c9efddd4e7c96e94fd4bcdcbd1b
-```
-
-The extended attributes named `trusted.afr.<volname>-client-<subvolume-index>`
-are used by afr to maintain the changelog of the file. The values of
-`trusted.afr.<volname>-client-<subvolume-index>` are calculated by the glusterfs
-client (fuse or nfs-server) processes. When the glusterfs client modifies a file
-or directory, the client contacts each brick and updates the changelog extended
-attribute according to the response of the brick.
-
-'subvolume-index' is simply (brick number - 1) as listed in the
-`gluster volume info <volname>` output.
-
-* Example:
-```
-[root@pranithk-laptop ~]# gluster volume info vol
- Volume Name: vol
- Type: Distributed-Replicate
- Volume ID: 4f2d7849-fbd6-40a2-b346-d13420978a01
- Status: Created
- Number of Bricks: 4 x 2 = 8
- Transport-type: tcp
- Bricks:
- brick-a: pranithk-laptop:/gfs/brick-a
- brick-b: pranithk-laptop:/gfs/brick-b
- brick-c: pranithk-laptop:/gfs/brick-c
- brick-d: pranithk-laptop:/gfs/brick-d
- brick-e: pranithk-laptop:/gfs/brick-e
- brick-f: pranithk-laptop:/gfs/brick-f
- brick-g: pranithk-laptop:/gfs/brick-g
- brick-h: pranithk-laptop:/gfs/brick-h
-```
-
-In the example above:
-```
-Brick | Replica set | Brick subvolume index
-----------------------------------------------------------------------------
--/gfs/brick-a | 0 | 0
--/gfs/brick-b | 0 | 1
--/gfs/brick-c | 1 | 2
--/gfs/brick-d | 1 | 3
--/gfs/brick-e | 2 | 4
--/gfs/brick-f | 2 | 5
--/gfs/brick-g | 3 | 6
--/gfs/brick-h | 3 | 7
-```
-
-Each file in a brick maintains the changelog of itself and that of the files
-present in all the other bricks in its replica set, as seen by that brick.
-
-In the example volume given above, all files in brick-a will have 2 entries,
-one for itself and the other for the file present in its replica pair, i.e. brick-b:
-```
-trusted.afr.vol-client-0=0x000000000000000000000000 -->changelog for itself (brick-a)
-trusted.afr.vol-client-1=0x000000000000000000000000 -->changelog for brick-b as seen by brick-a
-```
-
-Likewise, all files in brick-b will have:
-```
-trusted.afr.vol-client-0=0x000000000000000000000000 -->changelog for brick-a as seen by brick-b
-trusted.afr.vol-client-1=0x000000000000000000000000 -->changelog for itself (brick-b)
-```
-
-The same can be extended for other replica pairs.
-
-Interpreting Changelog (roughly pending operation count) Value:
-Each extended attribute has a value which is 24 hexadecimal digits.
-The first 8 digits represent the changelog of data, the second 8 digits the
-changelog of metadata, and the last 8 digits the changelog of directory entries.
-
-Pictorially representing the same, we have:
-```
-0x 000003d7 00000001 00000000
- | | |
- | | \_ changelog of directory entries
- | \_ changelog of metadata
- \ _ changelog of data
-```
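-
-For example, a changelog value can be split into its three parts with plain
-shell string slicing (an illustrative sketch, not a gluster command):
-```
-val=000003d70000000100000000   # changelog value without the leading "0x"
-echo "data=${val:0:8} metadata=${val:8:8} entry=${val:16:8}"
-# prints: data=000003d7 metadata=00000001 entry=00000000
-```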
-
-
-For directories, the metadata and entry changelogs are valid.
-For regular files, the data and metadata changelogs are valid.
-For special files like device files, only the metadata changelog is valid.
-When a file split-brain happens it could be a data split-brain, a
-metadata split-brain, or both. When a split-brain happens the changelog of the
-file would look something like this:
-
-* Example: (Let's consider both data and metadata split-brain on the same file.)
-```
-[root@pranithk-laptop vol]# getfattr -d -m . -e hex /gfs/brick-?/a
-getfattr: Removing leading '/' from absolute path names
-#file: gfs/brick-a/a
-trusted.afr.vol-client-0=0x000000000000000000000000
-trusted.afr.vol-client-1=0x000003d70000000100000000
-trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57
-
-#file: gfs/brick-b/a
-trusted.afr.vol-client-0=0x000003b00000000100000000
-trusted.afr.vol-client-1=0x000000000000000000000000
-trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57
-```
-
-###Observations:
-
-####According to changelog extended attributes on file /gfs/brick-a/a:
-The first 8 digits of trusted.afr.vol-client-0 are all
-zeros (0x00000000................), and the first 8 digits of
-trusted.afr.vol-client-1 are not all zeros (0x000003d7................).
-So the changelog on /gfs/brick-a/a implies that some data operations succeeded
-on itself but failed on /gfs/brick-b/a.
-
-The second 8 digits of trusted.afr.vol-client-0 are
-all zeros (0x........00000000........), and the second 8 digits of
-trusted.afr.vol-client-1 are not all zeros (0x........00000001........).
-So the changelog on /gfs/brick-a/a implies that some metadata operations succeeded
-on itself but failed on /gfs/brick-b/a.
-
-####According to Changelog extended attributes on file /gfs/brick-b/a:
-The first 8 digits of trusted.afr.vol-client-0 are not all
-zeros (0x000003b0................), and the first 8 digits of
-trusted.afr.vol-client-1 are all zeros (0x00000000................).
-So the changelog on /gfs/brick-b/a implies that some data operations succeeded
-on itself but failed on /gfs/brick-a/a.
-
-The second 8 digits of trusted.afr.vol-client-0 are not
-all zeros (0x........00000001........), and the second 8 digits of
-trusted.afr.vol-client-1 are all zeros (0x........00000000........).
-So the changelog on /gfs/brick-b/a implies that some metadata operations succeeded
-on itself but failed on /gfs/brick-a/a.
-
-Since both copies have data and metadata changes that are not on the other
-file, the file is in both data and metadata split-brain.
-
-Deciding on the correct copy:
------------------------------
-The user may have to inspect the stat and getfattr output of the files to decide
-which metadata to retain, and the contents of the file to decide which data to
-retain. Continuing with the example above, let's say we want to retain the data
-of /gfs/brick-a/a and the metadata of /gfs/brick-b/a.
-
-Resetting the relevant changelogs to resolve the split-brain:
--------------------------------------------------------------
-For resolving data-split-brain:
-We need to change the changelog extended attributes on the files as if some data
-operations succeeded on /gfs/brick-a/a but failed on /gfs/brick-b/a. But
-/gfs/brick-b/a should NOT have any changelog which says some data operations
-succeeded on /gfs/brick-b/a but failed on /gfs/brick-a/a. We need to reset the
-data part of the changelog on trusted.afr.vol-client-0 of /gfs/brick-b/a.
-
-For resolving metadata-split-brain:
-We need to change the changelog extended attributes on the files as if some
-metadata operations succeeded on /gfs/brick-b/a but failed on /gfs/brick-a/a.
-But /gfs/brick-a/a should NOT have any changelog which says some metadata
-operations succeeded on /gfs/brick-a/a but failed on /gfs/brick-b/a.
-We need to reset metadata part of the changelog on
-trusted.afr.vol-client-1 of /gfs/brick-a/a
-
-So, the intended changes are:
-On /gfs/brick-b/a:
-For trusted.afr.vol-client-0
-0x000003b00000000100000000 to 0x000000000000000100000000
-(Note that the metadata part is still not all zeros)
-Hence execute
-`setfattr -n trusted.afr.vol-client-0 -v 0x000000000000000100000000 /gfs/brick-b/a`
-
-On /gfs/brick-a/a:
-For trusted.afr.vol-client-1
-0x000003d70000000100000000 to 0x000003d70000000000000000
-(Note that the data part is still not all zeros)
-Hence execute
-`setfattr -n trusted.afr.vol-client-1 -v 0x000003d70000000000000000 /gfs/brick-a/a`
-
-Thus after the above operations are done, the changelogs look like this:
-```
-[root@pranithk-laptop vol]# getfattr -d -m . -e hex /gfs/brick-?/a
-getfattr: Removing leading '/' from absolute path names
-#file: gfs/brick-a/a
-trusted.afr.vol-client-0=0x000000000000000000000000
-trusted.afr.vol-client-1=0x000003d70000000000000000
-trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57
-
-#file: gfs/brick-b/a
-trusted.afr.vol-client-0=0x000000000000000100000000
-trusted.afr.vol-client-1=0x000000000000000000000000
-trusted.gfid=0x80acdbd886524f6fbefa21fc356fed57
-```
-
-
-Triggering Self-heal:
----------------------
-Perform `ls -l <file-path-on-gluster-mount>` to trigger healing.
-
-Fixing Directory entry split-brain:
-----------------------------------
-Afr has the ability to conservatively merge differing entries in the directories
-when there is a split-brain on a directory.
-If on one brick directory 'd' has entries '1', '2' and has entries '3', '4' on
-the other brick, then afr will merge all of the entries so that the directory has
-'1', '2', '3', '4' entries on both bricks.
-(Note: this may result in deleted files re-appearing in case the split-brain
-happened because of deletion of files in the directory.)
-Split-brain resolution needs human intervention when there is at least one entry
-which has the same file name but a different gfid in that directory.
-Example:
-On brick-a the directory has entries '1' (with gfid g1), '2' and on brick-b
-directory has entries '1' (with gfid g2) and '3'.
-These kinds of directory split-brains need human intervention to resolve.
-The user needs to remove either the file '1' on brick-a or the file '1' on brick-b
-to resolve the split-brain. In addition, the corresponding gfid-link file also
-needs to be removed. The gfid-link files are present in the .glusterfs folder
-in the top-level directory of the brick. If the gfid of the file is
-0x307a5c9efddd4e7c96e94fd4bcdcbd1b (the trusted.gfid extended attribute obtained
-from the getfattr command earlier), the gfid-link file can be found at
-> /gfs/brick-a/.glusterfs/30/7a/307a5c9efddd4e7c96e94fd4bcdcbd1b
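-
-As an illustrative sketch (using the brick path and gfid from this example), the
-gfid-link path can be derived from the gfid string like so:
-```
-gfid=307a5c9efddd4e7c96e94fd4bcdcbd1b
-echo "/gfs/brick-a/.glusterfs/${gfid:0:2}/${gfid:2:2}/${gfid}"
-# prints: /gfs/brick-a/.glusterfs/30/7a/307a5c9efddd4e7c96e94fd4bcdcbd1b
-```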
-
-####Word of caution:
-Before deleting the gfid-link, we have to ensure that there are no hard links
-to the file present on that brick. If hard links exist, they must be deleted as
-well.
diff --git a/doc/debugging/statedump.md b/doc/debugging/statedump.md
deleted file mode 100644
index f34a5c3436a..00000000000
--- a/doc/debugging/statedump.md
+++ /dev/null
@@ -1,389 +0,0 @@
-#Statedump
-A statedump is a file generated by a glusterfs process that captures the state of its internal data structures. It may contain the active inodes, fds, mempools, iobufs, memory allocation stats of different types of data structures per xlator, etc.
-
-##How to generate statedump
-We can find the directory where statedump files are created using the `gluster --print-statedumpdir` command.
-Create that directory if it is not already present, based on the type of installation.
-Let's call this directory `statedump-directory`.
-
-We can generate a statedump using `kill -USR1 <pid-of-gluster-process>`.
-gluster-process here is a glusterd/glusterfs/glusterfsd process.
-
-There are also commands to generate statedumps for brick processes/nfs server/quotad
-
-For bricks: `gluster volume statedump <volname>`
-
-For nfs server: `gluster volume statedump <volname> nfs`
-
-For quotad: `gluster volume statedump <volname> quotad`
-
-For brick processes, the files will be created in `statedump-directory` with the file name `hyphenated-brick-path.<pid>.dump.timestamp`. For all other processes it will be `glusterdump.<pid>.dump.timestamp`.
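-
-A minimal end-to-end sketch (the volume name `testvol` is illustrative):
-```
-statedump_dir=$(gluster --print-statedumpdir)   # directory where dumps are written
-gluster volume statedump testvol                # dump all brick processes of the volume
-ls -lt "$statedump_dir"                         # newest *.dump.<timestamp> files first
-```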
-
-##How to read statedump
-We shall look at snippets of each section of a statedump.
-
-The first and last lines of the file contain the start and end times of writing the statedump file. Times are in the UTC timezone.
-
-###Mallinfo
-The mallinfo return status is printed in the following format. Please read `man mallinfo` for more information about what each field means.
-```
-[mallinfo]
-mallinfo_arena=100020224 /* Non-mmapped space allocated (bytes) */
-mallinfo_ordblks=69467 /* Number of free chunks */
-mallinfo_smblks=449 /* Number of free fastbin blocks */
-mallinfo_hblks=13 /* Number of mmapped regions */
-mallinfo_hblkhd=20144128 /* Space allocated in mmapped regions (bytes) */
-mallinfo_usmblks=0 /* Maximum total allocated space (bytes) */
-mallinfo_fsmblks=39264 /* Space in freed fastbin blocks (bytes) */
-mallinfo_uordblks=96710112 /* Total allocated space (bytes) */
-mallinfo_fordblks=3310112 /* Total free space (bytes) */
-mallinfo_keepcost=133712 /* Top-most, releasable space (bytes) */
-```
-
-###Data structure allocation stats
-For every translator loaded in the call-graph, the memory usage per data structure is displayed in the following format:
-
-For xlator with name: glusterfs
-```
-[global.glusterfs - Memory usage] #[global.xlator-name - Memory usage]
-num_types=119 #It shows the number of data types it is using
-```
-
-Now for each data-type it prints the memory usage.
-
-```
-[global.glusterfs - usage-type gf_common_mt_gf_timer_t memusage]
-#[global.xlator-name - usage-type <tag associated with the data-type> memusage]
-size=112 #num_allocs times the sizeof(data-type) i.e. num_allocs * sizeof (data-type)
-num_allocs=2 #Number of allocations of the data-type which are active at the time of taking statedump.
-max_size=168 #max_num_allocs times the sizeof(data-type) i.e. max_num_allocs * sizeof (data-type)
-max_num_allocs=3 #Maximum number of active allocations at any point in the life of the process.
-total_allocs=7 #Number of times this data is allocated in the life of the process.
-```
-
-###Mempools
-
-Mempools are an optimization to reduce the number of allocations of a data type. If we create a mem-pool of, let's say, 1024 elements for a data type, new elements will be allocated from the heap using calls like calloc only if all the 1024 elements in the pool are in active use.
-
-Memory pool allocated by each xlator is displayed in the following format:
-
-```
-[mempool] #Section name
------=-----
-pool-name=fuse:fd_t #pool-name=<xlator-name>:<data-type>
-hot-count=1 #number of mempool elements that are in active use. i.e. for this pool it is the number of 'fd_t' s in active use.
-cold-count=1023 #number of mempool elements that are not in use. If a new allocation is required it will be served from here until all the elements in the pool are in use i.e. cold-count becomes 0.
-padded_sizeof=108 #Each mempool element is padded with a doubly-linked-list + ptr of mempool + is-in-use info to operate the pool of elements, this size is the element-size after padding
-pool-misses=0 #Number of times the element had to be allocated from heap because all elements from the pool are in active use.
-alloc-count=314 #Number of times this type of data is allocated through out the life of this process. This may include pool-misses as well.
-max-alloc=3 #Maximum number of elements from the pool in active use at any point in the life of the process. This does *not* include pool-misses.
-cur-stdalloc=0 #Denotes the number of allocations made from heap once cold-count reaches 0, that are yet to be released via mem_put().
-max-stdalloc=0 #Maximum number of allocations from heap that are in active use at any point in the life of the process.
-```
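-
-To inspect a single pool, a plain grep over the statedump file is enough (a sketch;
-the file name is just an example taken from the FAQ below):
-```
-grep -A 8 'pool-name=glusterfs:dict_t' glusterdump.5225.dump.1405493251
-```
-A `cur-stdalloc` or `pool-misses` value that keeps growing across successive dumps is a
-hint that elements are not being released via mem_put().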
-
-###Iobufs
-```
-[iobuf.global]
-iobuf_pool=0x1f0d970 #The memory pool for iobufs
-iobuf_pool.default_page_size=131072 #The default size of iobuf (if no iobuf size is specified the default size is allocated)
-#iobuf_arena: One arena represents a group of iobufs of a particular size
-iobuf_pool.arena_size=12976128 # The initial size of the iobuf pool (doesn't include the stdalloc'd memory or the newly added arenas)
-iobuf_pool.arena_cnt=8 #Total number of arenas in the pool
-iobuf_pool.request_misses=0 #The number of iobufs that were stdalloc'd (as they exceeded the default max page size provided by iobuf_pool).
-```
-
-There are 3 lists of arenas:
-
-1. Arena list: arenas allocated during iobuf pool creation and arenas that are in use (active_cnt != 0) are part of this list.
-2. Purge list: arenas that can be purged (no active iobufs, active_cnt == 0).
-3. Filled list: arenas without free iobufs.
-
-```
-[purge.1] #purge.<S.No.>
-purge.1.mem_base=0x7fc47b35f000 #The address of the arena structure
-purge.1.active_cnt=0 #The number of iobufs active in that arena
-purge.1.passive_cnt=1024 #The number of unused iobufs in the arena
-purge.1.alloc_cnt=22853 #Total allocs in this pool(number of times the iobuf was allocated from this arena)
-purge.1.max_active=7 #Max active iobufs from this arena, at any point in the life of this process.
-purge.1.page_size=128 #Size of all the iobufs in this arena.
-
-[arena.5] #arena.<S.No.>
-arena.5.mem_base=0x7fc47af1f000
-arena.5.active_cnt=0
-arena.5.passive_cnt=64
-arena.5.alloc_cnt=0
-arena.5.max_active=0
-arena.5.page_size=32768
-```
-
-If the active_cnt of any arena is non-zero, then the statedump will also have the iobuf list.
-```
-[arena.6.active_iobuf.1] #arena.<S.No>.active_iobuf.<iobuf.S.No.>
-arena.6.active_iobuf.1.ref=1 #refcount of the iobuf
-arena.6.active_iobuf.1.ptr=0x7fdb921a9000 #address of the iobuf
-
-[arena.6.active_iobuf.2]
-arena.6.active_iobuf.2.ref=1
-arena.6.active_iobuf.2.ptr=0x7fdb92189000
-```
-
-If at any given point in time there are lots of filled arenas, that could be a sign of iobuf leaks.
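-
-A quick way to check this is to count arenas whose active_cnt is non-zero (a sketch;
-the file name is illustrative):
-```
-grep 'active_cnt=' glusterdump.5225.dump.1405493251 | grep -vc 'active_cnt=0$'
-```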
-
-###Call stack
-All the fops received by gluster are handled using call-stacks. A call-stack contains information about the uid/gid/pid etc. of the process that is executing the fop. Each call-stack contains different call-frames, one per xlator that handles that fop.
-
-```
-[global.callpool.stack.3] #global.callpool.stack.<Serial-Number>
-stack=0x7fc47a44bbe0 #Stack address
-uid=0 #Uid of the process which is executing the fop
-gid=0 #Gid of the process which is executing the fop
-pid=6223 #Pid of the process which is executing the fop
-unique=2778 #Xlators like afr do copy_frame and perform the operation in a different stack, this id is useful to find out the stacks that are inter-related because of copy-frame
-lk-owner=0000000000000000 #Some of the fuse fops have lk-owner.
-op=LOOKUP #Fop
-type=1 #Type of the op i.e. FOP/MGMT-OP
-cnt=9 #Number of frames in this stack.
-```
-###Call-frame
-Each frame has information about which xlator the frame belongs to, which function it wound to/from, and which function it will unwind to. It also mentions whether the unwind has happened or not. If we observe hangs in the system and want to find out which xlator is causing them, take a statedump and see which is the last xlator that is yet to be unwound.
-
-```
-[global.callpool.stack.3.frame.2]#global.callpool.stack.<stack-serial-number>.frame.<frame-serial-number>
-frame=0x7fc47a611dbc #Frame address
-ref_count=0 #Incremented at the time of wind and decremented at the time of unwind.
-translator=r2-client-1 #Xlator this frame belongs to
-complete=0 #if this value is 1 that means this frame is already unwound. 0 if it is yet to unwind.
-parent=r2-replicate-0 #Parent xlator of this frame
-wind_from=afr_lookup #Parent xlator function from which the wind happened
-wind_to=priv->children[i]->fops->lookup
-unwind_to=afr_lookup_cbk #Parent xlator function to which unwind happened
-```
-
-###History of operations in Fuse
-
-Fuse maintains a history of the operations that happened in fuse.
-
-```
-[xlator.mount.fuse.history]
-TIME=2014-07-09 16:44:57.523364
-message=[0] fuse_release: RELEASE(): 4590:, fd: 0x1fef0d8, gfid: 3afb4968-5100-478d-91e9-76264e634c9f
-
-TIME=2014-07-09 16:44:57.523373
-message=[0] send_fuse_err: Sending Success for operation 18 on inode 3afb4968-5100-478d-91e9-76264e634c9f
-
-TIME=2014-07-09 16:44:57.523394
-message=[0] fuse_getattr_resume: 4591, STAT, path: (/iozone.tmp), gfid: (3afb4968-5100-478d-91e9-76264e634c9f)
-```
-
-###Xlator configuration
-```
-[cluster/replicate.r2-replicate-0] #Xlator type, name information
-child_count=2 #Number of children to the xlator
-#Xlator specific configuration below
-child_up[0]=1
-pending_key[0]=trusted.afr.r2-client-0
-child_up[1]=1
-pending_key[1]=trusted.afr.r2-client-1
-data_self_heal=on
-metadata_self_heal=1
-entry_self_heal=1
-data_change_log=1
-metadata_change_log=1
-entry-change_log=1
-read_child=1
-favorite_child=-1
-wait_count=1
-```
-
-###Graph/inode table
-```
-[active graph - 1]
-
-conn.1.bound_xl./data/brick01a/homegfs.hashsize=14057
-conn.1.bound_xl./data/brick01a/homegfs.name=/data/brick01a/homegfs/inode
-conn.1.bound_xl./data/brick01a/homegfs.lru_limit=16384 #Least recently used size limit
-conn.1.bound_xl./data/brick01a/homegfs.active_size=690 #Number of inodes undergoing some kind of fop to be precise on which there is at least one ref.
-conn.1.bound_xl./data/brick01a/homegfs.lru_size=183 #Number of inodes present in lru list
-conn.1.bound_xl./data/brick01a/homegfs.purge_size=0 #Number of inodes present in purge list
-```
-
-###Inode
-```
-[conn.1.bound_xl./data/brick01a/homegfs.active.324] #324th inode in active inode list
-gfid=e6d337cf-97eb-44b3-9492-379ba3f6ad42 #Gfid of the inode
-nlookup=13 #Number of times lookups happened from the client or from fuse kernel
-fd-count=4 #Number of fds opened on the inode
-ref=11 #Number of refs taken on the inode
-ia_type=1 #Type of the inode. This should be changed to some string :-(
-
-[conn.1.bound_xl./data/brick01a/homegfs.lru.1] #1st inode in lru list. Note that ref count is zero for these inodes.
-gfid=5114574e-69bc-412b-9e52-f13ff087c6fc
-nlookup=5
-fd-count=0
-ref=0
-ia_type=2
-```
-###Inode context
-Each xlator can store some context per inode. This context can also be printed in the statedump. Here is the inode ctx of the locks xlator:
-```
-[xlator.features.locks.homegfs-locks.inode]
-path=/homegfs/users/dfrobins/gfstest/r4/SCRATCH/fort.5102 - path of the file
-mandatory=0
-inodelk-count=5 #Number of inode locks
-lock-dump.domain.domain=homegfs-replicate-0:self-heal #Domain name where self-heals take locks to prevent more than one heal on the same file
-inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 18446744073709551615, owner=080b1ada117f0000, client=0xb7fc30, connection-id=compute-30-029.com-3505-2014/06/29-14:46:12:477358-homegfs-client-0-0-1, granted at Sun Jun 29 11:01:00 2014 #Active lock information
-
-inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 18446744073709551615, owner=c0cb091a277f0000, client=0xad4f10, connection-id=gfs01a.com-4080-2014/06/29-14:41:36:917768-homegfs-client-0-0-0, blocked at Sun Jun 29 11:04:44 2014 #Blocked lock information
-
-lock-dump.domain.domain=homegfs-replicate-0:metadata #Domain name where metadata operations take locks to maintain replication consistency
-lock-dump.domain.domain=homegfs-replicate-0 #Domain name where entry/data operations take locks to maintain replication consistency
-inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=11141120, len=131072, pid = 18446744073709551615, owner=080b1ada117f0000, client=0xb7fc30, connection-id=compute-30-029.com-3505-2014/06/29-14:46:12:477358-homegfs-client-0-0-1, granted at Sun Jun 29 11:10:36 2014 #Active lock information
-```
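-
-When debugging a hang, a quick scan for blocked locks across the dump can narrow things
-down (a sketch; the file name is illustrative):
-```
-grep 'BLOCKED' glusterdump.5225.dump.1405493251
-```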
-
-##FAQ
-###How to debug Memory leaks using statedump?
-
-####Using memory accounting feature:
-
-`https://bugzilla.redhat.com/show_bug.cgi?id=1120151` is one of the bugs that was debugged using statedump to see which data structure was leaking. Here is the process used to find the leak using statedump. According to the bug, the observation was that the process memory usage kept increasing whenever one of the bricks was wiped in a replicate volume and a `full` self-heal was invoked to heal the contents. A statedump of the process was taken using `kill -USR1 <pid-of-gluster-self-heal-daemon>`.
-```
-grep -w num_allocs glusterdump.5225.dump.1405493251
-num_allocs=77078
-num_allocs=87070
-num_allocs=117376
-....
-
-grep hot-count glusterdump.5225.dump.1405493251
-hot-count=16384
-hot-count=16384
-hot-count=4095
-....
-```
-
-Find the occurrences in the statedump file to figure out the tags.
-
-A grep of the statedump revealed too many allocations for the following data types under replicate:
-
-1. gf_common_mt_asprintf
-2. gf_common_mt_char
-3. gf_common_mt_mem_pool.
-
-After checking the afr code for allocations with tag `gf_common_mt_char`, it was found that the `data-self-heal` code path does not free one such allocated memory. `gf_common_mt_mem_pool` suggests that there is a leak in pool memory. The `replicate-0:dict_t`, `glusterfs:data_t` and `glusterfs:data_pair_t` pools are using a lot of memory, i.e. cold_count is `0` and there are too many allocations. Checking the source code of dict.c revealed that `key` in `dict` is allocated with `gf_common_mt_char`, i.e. tag `2.` above, and the value is created using gf_asprintf, which in turn uses `gf_common_mt_asprintf`, i.e. tag `1.`. Browsing the code for leaks in the self-heal code paths led to a line which overwrites a variable with a new dictionary even when it is already holding a reference to another dictionary. After fixing these leaks, the same test was run to verify that none of the `num_allocs` values increase, even after healing a 10,000-file directory hierarchy, in the statedump of the self-heal daemon.
-Please check http://review.gluster.org/8316 for more info about the patch/code.
-
-####Debugging leaks in memory pools:
-Statedump output of memory pools was used to test and verify the fixes to https://bugzilla.redhat.com/show_bug.cgi?id=1134221. On code analysis, dict_t objects were found to be leaking (in terms of not being unref'd enough times) during name self-heal. The test involved creating 100 files on a plain replicate volume, removing them from one of the brick's backends, and then triggering lookup on them from the mount point. A statedump of the mount process was taken before executing the test case and after it, after compiling glusterfs with -DDEBUG flags (to have cold count set to 0 by default).
-
-Statedump output of the fuse mount process before the test case was executed:
-
-```
-
-pool-name=glusterfs:dict_t
-hot-count=0
-cold-count=0
-padded_sizeof=140
-alloc-count=33
-max-alloc=0
-pool-misses=33
-cur-stdalloc=14
-max-stdalloc=18
-
-```
-Statedump output of the fuse mount process after the test case was executed:
-
-```
-
-pool-name=glusterfs:dict_t
-hot-count=0
-cold-count=0
-padded_sizeof=140
-alloc-count=2841
-max-alloc=0
-pool-misses=2841
-cur-stdalloc=214
-max-stdalloc=220
-
-```
-Here, with cold count being 0 by default, cur-stdalloc indicated the number of dict_t objects that were allocated in heap using mem_get(), and yet to be freed using mem_put() (refer to https://github.com/gluster/glusterfs/blob/master/doc/data-structures/mem-pool.md for more details on how mempool works). After the test case (name selfheal of 100 files), there was a rise in the cur-stdalloc value (from 14 to 214) for dict_t.
-
-After these leaks were fixed, glusterfs was again compiled with -DDEBUG flags, the same steps were performed again, and statedumps of the mount were taken before and after executing the test case. This was done to ascertain the validity of the fix. The following are the results:
-
-Statedump output of the fuse mount process before executing the test case:
-
-```
-pool-name=glusterfs:dict_t
-hot-count=0
-cold-count=0
-padded_sizeof=140
-alloc-count=33
-max-alloc=0
-pool-misses=33
-cur-stdalloc=14
-max-stdalloc=18
-
-```
-Statedump output of the fuse mount process after executing the test case:
-
-```
-pool-name=glusterfs:dict_t
-hot-count=0
-cold-count=0
-padded_sizeof=140
-alloc-count=2837
-max-alloc=0
-pool-misses=2837
-cur-stdalloc=14
-max-stdalloc=119
-
-```
-The value of cur-stdalloc remained 14 before and after the test, indicating that the fix indeed does what it's supposed to do.
-
-###How to debug hangs because of frame-loss?
-`https://bugzilla.redhat.com/show_bug.cgi?id=994959` is one of the bugs where statedump was helpful in finding where the frame was lost. Here is the process used to find the hang using statedump.
-When the hang was observed, statedumps were taken for all the processes. The mount's statedump showed the following stack:
-```
-[global.callpool.stack.1.frame.1]
-ref_count=1
-translator=fuse
-complete=0
-
-[global.callpool.stack.1.frame.2]
-ref_count=0
-translator=r2-client-1
-complete=1 <<----- Client xlator completed the readdirp call and unwound to afr
-parent=r2-replicate-0
-wind_from=afr_do_readdir
-wind_to=children[call_child]->fops->readdirp
-unwind_from=client3_3_readdirp_cbk
-unwind_to=afr_readdirp_cbk
-
-[global.callpool.stack.1.frame.3]
-ref_count=0
-translator=r2-replicate-0
-complete=0 <<---- Afr xlator is not unwinding for some reason.
-parent=r2-dht
-wind_from=dht_do_readdir
-wind_to=xvol->fops->readdirp
-unwind_to=dht_readdirp_cbk
-
-[global.callpool.stack.1.frame.4]
-ref_count=1
-translator=r2-dht
-complete=0
-parent=r2-io-cache
-wind_from=ioc_readdirp
-wind_to=FIRST_CHILD(this)->fops->readdirp
-unwind_to=ioc_readdirp_cbk
-
-[global.callpool.stack.1.frame.5]
-ref_count=1
-translator=r2-io-cache
-complete=0
-parent=r2-quick-read
-wind_from=qr_readdirp
-wind_to=FIRST_CHILD (this)->fops->readdirp
-unwind_to=qr_readdirp_cbk
-
-```
-`unwind_to` shows that the call was unwound to `afr_readdirp_cbk` from the client xlator.
-Inspecting that function revealed that afr was not unwinding the stack when the fop failed.
-Check http://review.gluster.org/5531 for more info about the patch/code changes.
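-
-As a quick first pass when hunting such hangs, frames that are yet to unwind can be listed
-with a grep over the statedump (a sketch; the file name is illustrative):
-```
-grep -B 3 'complete=0' glusterdump.5225.dump.1405493251
-```
-Combined with the `wind_from`/`unwind_to` fields of those frames, this points at the last
-xlator that has not yet unwound.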