Troubleshooting GlusterFS
=========================

This section describes how to manage GlusterFS logs and the most common
troubleshooting scenarios related to GlusterFS.

Managing GlusterFS Logs
=======================

This section describes how to manage GlusterFS logs by performing the
following operation:

- Rotating Logs

Rotating Logs
-------------

Administrators can rotate the log file of a volume, as needed.

**To rotate a log file**

- Rotate the log file using the following command:

    `# gluster volume log rotate <VOLNAME>`

    For example, to rotate the log file on test-volume:

        # gluster volume log rotate test-volume
        log rotate successful

    > **Note**
    >
    > When a log file is rotated, the contents of the current log file
    > are moved to `<log-file-name>.<epoch-time-stamp>`.

Troubleshooting Geo-replication
===============================

This section describes the most common troubleshooting scenarios related
to GlusterFS Geo-replication.

Locating Log Files
------------------

For every Geo-replication session, the following three log files are
associated with it (four, if the slave is a gluster volume):

- Master-log-file - log file for the process that monitors the Master
  volume

- Slave-log-file - log file for the process that initiates the changes
  on the slave

- Master-gluster-log-file - log file for the maintenance mount point
  that the Geo-replication module uses to monitor the master volume

- Slave-gluster-log-file - the slave's counterpart of the above

**Master Log File**

To get the Master-log-file for geo-replication, use the following
command:

`gluster volume geo-replication <MASTER> <SLAVE> config log-file`

For example:

`# gluster volume geo-replication Volume1 example.com:/data/remote_dir config log-file`

**Slave Log File**

To get the log file for Geo-replication on the slave (glusterd must be
running on the slave machine), use the following commands:

1. On the master, run the following command:

        # gluster volume geo-replication Volume1 example.com:/data/remote_dir config session-owner
        5f6e5200-756f-11e0-a1f0-0800200c9a66

    This displays the session owner details.

2. On the slave, run the following command:

        # gluster volume geo-replication /data/remote_dir config log-file
        /var/log/gluster/${session-owner}:remote-mirror.log

3. Substitute the session owner details (output of Step 1) into the
    path shown in Step 2 to get the location of the log file:

    `/var/log/gluster/5f6e5200-756f-11e0-a1f0-0800200c9a66:remote-mirror.log`
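Tying this together, the log-file lookup can be combined with ordinary
shell tools to follow a session's master log. This is only a sketch and
assumes the `config log-file` query prints the bare path, as in the
example above; volume and slave names are the ones used there:

    # query the configured master log file for the session and follow it;
    # if your version prefixes the value with a label, strip it first
    LOGFILE=$(gluster volume geo-replication Volume1 example.com:/data/remote_dir config log-file)
    tail -f "$LOGFILE"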
Rotating Geo-replication Logs
-----------------------------

Administrators can rotate the log file of a particular master-slave
session, as needed. When you run geo-replication's `log-rotate`
command, the log file is backed up with the current timestamp suffixed
to the file name, and a signal is sent to gsyncd to start logging to a
new log file.

**To rotate a geo-replication log file**

- Rotate the log file of a particular master-slave session using the
  following command:

    `# gluster volume geo-replication <MASTER> <SLAVE> log-rotate`

    For example, to rotate the log file of master `Volume1` and slave
    `example.com:/data/remote_dir`:

        # gluster volume geo-replication Volume1 example.com:/data/remote_dir log rotate
        log rotate successful

- Rotate the log files of all sessions of a master volume using the
  following command:

    `# gluster volume geo-replication <MASTER> log-rotate`

    For example, to rotate the log file of master `Volume1`:

        # gluster volume geo-replication Volume1 log rotate
        log rotate successful

- Rotate the log files of all sessions using the following command:

    `# gluster volume geo-replication log-rotate`

    For example, to rotate the log files for all sessions:

        # gluster volume geo-replication log rotate
        log rotate successful

Synchronization is not complete
-------------------------------

**Description**: GlusterFS Geo-replication did not synchronize the data
completely, but the geo-replication status is still displayed as OK.

**Solution**: You can enforce a full sync of the data by erasing the
index and restarting GlusterFS Geo-replication. After restarting,
GlusterFS Geo-replication begins synchronizing all the data. All files
are compared using checksum, which can be a lengthy and
resource-intensive operation on large data sets. If the error situation
persists, contact Red Hat Support.

For more information about erasing the index, see ?.

Issues in Data Synchronization
------------------------------

**Description**: Geo-replication displays status as OK, but the files
do not get synced; only directories and symlinks get synced, with the
following error message in the log:

    [2011-05-02 13:42:13.467644] E [master:288:regjob] GMaster: failed to sync ./some_file

**Solution**: Geo-replication invokes rsync v3.0.0 or higher on the host
and the remote machine. Verify that the required version is installed
on both.

Geo-replication status displays Faulty very often
--------------------------------------------------

**Description**: Geo-replication displays status as faulty very often
with a backtrace similar to the following:

    [2011-04-28 14:06:18.378859] E [syncdutils:131:log_raise_exception] <top>: FAIL:
    Traceback (most recent call last):
      File "/usr/local/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 152, in twrap
        tf(*aa)
      File "/usr/local/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in listen
        rid, exc, res = recv(self.inf)
      File "/usr/local/libexec/glusterfs/python/syncdaemon/repce.py", line 42, in recv
        return pickle.load(inf)
    EOFError

**Solution**: This error indicates that the RPC communication between
the master gsyncd module and the slave gsyncd module is broken, which
can happen for various reasons. Check that all of the following
prerequisites are satisfied:

- Password-less SSH is set up properly between the host and the remote
  machine.

- FUSE is installed on the machine, because the geo-replication module
  mounts the GlusterFS volume using FUSE to sync data.

- If the **Slave** is a volume, that volume is started.

- If the Slave is a plain directory, the directory has already been
  created with the required permissions.

- If GlusterFS 3.2 or higher is not installed in the default location
  on the Master and has instead been installed under a custom prefix,
  configure the `gluster-command` option to point to the exact
  location.

- If GlusterFS 3.2 or higher is not installed in the default location
  on the slave and has instead been installed under a custom prefix,
  configure the `remote-gsyncd-command` option to point to the exact
  place where gsyncd is located.
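A few of these prerequisites can be checked quickly from the master
node. The sketch below is illustrative only: the slave host follows the
examples above, and `slave-vol` is a hypothetical slave volume name.

    # password-less SSH to the slave should succeed without prompting
    ssh -o BatchMode=yes root@example.com true && echo "password-less SSH OK"

    # FUSE must be available, since gsyncd mounts the master volume through it
    modinfo fuse >/dev/null && echo "FUSE module available"

    # if the slave is a gluster volume, it must be started
    ssh root@example.com "gluster volume info slave-vol" | grep "^Status"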
Intermediate Master goes to Faulty State
----------------------------------------

**Description**: In a cascading set-up, the intermediate master goes to
the faulty state with the following log:

    raise RuntimeError ("aborting on uuid change from %s to %s" % \
    RuntimeError: aborting on uuid change from af07e07c-427f-4586-ab9f-4bf7d299be81 to de6b5040-8f4e-4575-8831-c4f55bd41154

**Solution**: In a cascading set-up, the intermediate master is loyal to
the original primary master. The above log means that the
geo-replication module has detected a change in the primary master. If
this is the desired behavior, delete the `volume-id` configuration
option in the session initiated from the intermediate master.

Troubleshooting POSIX ACLs
==========================

This section describes the most common troubleshooting issues related to
POSIX ACLs.

setfacl command fails with “setfacl: \<file or directory name\>: Operation not supported” error
------------------------------------------------------------------------------------------------

You may see this error when the backend file system on one of the
servers is not mounted with the `-o acl` option. This can be confirmed
by the following error message in that server's log file: "Posix access
control list is not supported".

**Solution**: Remount the backend file system with the `-o acl` option.
For more information, see ?.

Troubleshooting Hadoop Compatible Storage
=========================================

This section describes the most common troubleshooting issues related to
Hadoop Compatible Storage.

Time Sync
---------

Running a MapReduce job may throw exceptions if the time is out of sync
on the hosts in the cluster.

**Solution**: Synchronize the time on all hosts using the ntpd program.

Troubleshooting NFS
===================

This section describes the most common troubleshooting issues related to
NFS.

mount command on NFS client fails with “RPC Error: Program not registered”
----------------------------------------------------------------------------

Start the portmap or rpcbind service on the NFS server.

This error is encountered when the server has not started correctly.

On most Linux distributions this is fixed by starting portmap:

`$ /etc/init.d/portmap start`

On some distributions where portmap has been replaced by rpcbind, the
following command is required:

`$ /etc/init.d/rpcbind start`

After starting portmap or rpcbind, the Gluster NFS server needs to be
restarted.
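A quick way to confirm that the portmapper is reachable and that the
NFS-related RPC programs have registered is `rpcinfo`. This is only a
sketch; the exact set of registered programs depends on the
installation:

    # list RPC registrations on the server; portmapper (100000), mountd (100005)
    # and nfs (100003) should appear once the Gluster NFS server is up
    rpcinfo -p localhost | grep -E 'portmapper|mountd|nfs'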
NFS server start-up fails with “Port is already in use” error in the log file
--------------------------------------------------------------------------------

Another Gluster NFS server is running on the same machine.

This error arises when a Gluster NFS server is already running on the
same machine. The situation can be confirmed from the log file, if the
following error lines exist:

    [2010-05-26 23:40:49] E [rpc-socket.c:126:rpcsvc_socket_listen] rpc-socket: binding socket failed:Address already in use
    [2010-05-26 23:40:49] E [rpc-socket.c:129:rpcsvc_socket_listen] rpc-socket: Port is already in use
    [2010-05-26 23:40:49] E [rpcsvc.c:2636:rpcsvc_stage_program_register] rpc-service: could not create listening connection
    [2010-05-26 23:40:49] E [rpcsvc.c:2675:rpcsvc_program_register] rpc-service: stage registration of program failed
    [2010-05-26 23:40:49] E [rpcsvc.c:2695:rpcsvc_program_register] rpc-service: Program registration failed: MOUNT3, Num: 100005, Ver: 3, Port: 38465
    [2010-05-26 23:40:49] E [nfs.c:125:nfs_init_versions] nfs: Program init failed
    [2010-05-26 23:40:49] C [nfs.c:531:notify] nfs: Failed to initialize protocols

To resolve this error, one of the Gluster NFS servers has to be shut
down. At this time, Gluster NFS does not support running multiple NFS
servers on the same machine.

mount command fails with “rpc.statd” related error message
------------------------------------------------------------

If the mount command fails with the following error message:

    mount.nfs: rpc.statd is not running but is required for remote locking.
    mount.nfs: Either use '-o nolock' to keep locks local, or start statd.

start rpc.statd.

For NFS clients to mount the NFS server, the rpc.statd service must be
running on the clients.

Start the rpc.statd service by running the following command:

`$ rpc.statd`

mount command takes too long to finish.
----------------------------------------

Start the rpcbind or portmap service on the NFS client.

The problem is that the rpcbind or portmap service is not running on
the NFS client. The resolution for this is to start either of these
services by running the following command:

`$ /etc/init.d/portmap start`

On some distributions where portmap has been replaced by rpcbind, the
following command is required:

`$ /etc/init.d/rpcbind start`
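On the client, a quick check of which of these services are actually
running can save some guesswork. The sketch below is illustrative;
init-script and service names vary between distributions:

    # on the NFS client: verify that rpcbind (or portmap) and rpc.statd are running
    pgrep -l rpcbind || pgrep -l portmap
    pgrep -l rpc.statd

    # start whichever is missing (use the init script your distribution provides)
    /etc/init.d/rpcbind start 2>/dev/null || /etc/init.d/portmap start
    rpc.statd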
NFS server glusterfsd starts but initialization fails with “nfsrpc-service: portmap registration of program failed” error message in the log.
------------------------------------------------------------------------------------------------------------------------------------------------

NFS start-up can succeed but the initialization of the NFS service can
still fail, preventing clients from accessing the mount points. Such a
situation can be confirmed from the following error messages in the log
file:

    [2010-05-26 23:33:47] E [rpcsvc.c:2598:rpcsvc_program_register_portmap] rpc-service: Could notregister with portmap
    [2010-05-26 23:33:47] E [rpcsvc.c:2682:rpcsvc_program_register] rpc-service: portmap registration of program failed
    [2010-05-26 23:33:47] E [rpcsvc.c:2695:rpcsvc_program_register] rpc-service: Program registration failed: MOUNT3, Num: 100005, Ver: 3, Port: 38465
    [2010-05-26 23:33:47] E [nfs.c:125:nfs_init_versions] nfs: Program init failed
    [2010-05-26 23:33:47] C [nfs.c:531:notify] nfs: Failed to initialize protocols
    [2010-05-26 23:33:49] E [rpcsvc.c:2614:rpcsvc_program_unregister_portmap] rpc-service: Could not unregister with portmap
    [2010-05-26 23:33:49] E [rpcsvc.c:2731:rpcsvc_program_unregister] rpc-service: portmap unregistration of program failed
    [2010-05-26 23:33:49] E [rpcsvc.c:2744:rpcsvc_program_unregister] rpc-service: Program unregistration failed: MOUNT3, Num: 100005, Ver: 3, Port: 38465

1. Start the portmap or rpcbind service on the NFS server.

    On most Linux distributions, portmap can be started using the
    following command:

    `$ /etc/init.d/portmap start`

    On some distributions where portmap has been replaced by rpcbind,
    run the following command:

    `$ /etc/init.d/rpcbind start`

    After starting portmap or rpcbind, the Gluster NFS server needs to
    be restarted.

2. Stop another NFS server running on the same machine.

    Such an error is also seen when there is another NFS server running
    on the same machine which is not the Gluster NFS server. On Linux
    systems, this could be the kernel NFS server. Resolution involves
    stopping the other NFS server or not running the Gluster NFS server
    on the machine. Before stopping the kernel NFS server, ensure that
    no critical service depends on access to that NFS server's exports.

    On Linux, kernel NFS servers can be stopped by using either of the
    following commands, depending on the distribution in use:

    `$ /etc/init.d/nfs-kernel-server stop`

    `$ /etc/init.d/nfs stop`

3. Restart the Gluster NFS server.

mount command fails with NFS server failed error.
---------------------------------------------------

The mount command fails with the following error:

*mount: mount to NFS server '10.1.10.11' failed: timed out (retrying).*

Perform one of the following to resolve this issue:

1. Disable name lookup requests from the NFS server to a DNS server.

    The NFS server attempts to authenticate NFS clients by performing a
    reverse DNS lookup to match hostnames in the volume file with the
    client IP addresses. There can be a situation where the NFS server
    either is not able to connect to the DNS server or the DNS server
    is taking too long to respond to DNS requests. These delays can
    result in delayed replies from the NFS server to the NFS client,
    resulting in the timeout error seen above.

    The NFS server provides a workaround that disables DNS requests and
    instead relies only on the client IP addresses for authentication.
    The following option can be added for successful mounting in such
    situations:

    `option rpc-auth.addr.namelookup off`

    > **Note**
    >
    > Remember that disabling name lookups forces the NFS server to
    > authenticate clients using only their IP addresses; if the
    > authentication rules in the volume file use hostnames, those
    > rules will fail and mounting will be disallowed for those
    > clients.

    or

2. The NFS version used by the NFS client is other than version 3.

    The Gluster NFS server supports version 3 of the NFS protocol. In
    recent Linux kernels, the default NFS version has been changed from
    3 to 4. It is possible that the client machine is unable to connect
    to the Gluster NFS server because it is using version 4 messages,
    which are not understood by the Gluster NFS server. The timeout can
    be resolved by forcing the NFS client to use version 3. The
    **vers** option to the mount command is used for this purpose:

    `$ mount -o vers=3 <HOSTNAME-OR-IP>:/<VOLNAME> <MOUNTDIR>`

showmount fails with clnt\_create: RPC: Unable to receive
-----------------------------------------------------------

Check your firewall settings to open port 111 for portmap
requests/replies as well as the Gluster NFS server requests/replies.
The Gluster NFS server operates over the following port numbers: 38465,
38466, and 38467.

For more information, see ?.
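As an illustration only, with a plain iptables setup the relevant ports
could be opened as follows; adapt this to whatever firewall management
your distribution uses:

    # allow portmap/rpcbind queries
    iptables -I INPUT -p tcp --dport 111 -j ACCEPT
    iptables -I INPUT -p udp --dport 111 -j ACCEPT

    # allow the Gluster NFS server ports listed above
    iptables -I INPUT -p tcp --dport 38465:38467 -j ACCEPT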
Application fails with "Invalid argument" or "Value too large for defined data type" error.
---------------------------------------------------------------------------------------------

These two errors generally happen for 32-bit NFS clients, or for
applications that do not support 64-bit inode numbers or large files.
Use the following option from the CLI to make Gluster NFS return 32-bit
inode numbers instead:

`nfs.enable-ino32 <on | off>`

Applications that will benefit are those that were either:

- built 32-bit and run on 32-bit machines, such that they do not
  support large files by default

- built 32-bit on 64-bit systems

This option is disabled by default, so NFS returns 64-bit inode numbers
by default.

Applications that can be rebuilt from source should be rebuilt using
the following gcc flag:

`-D_FILE_OFFSET_BITS=64`

Troubleshooting File Locks
==========================

In GlusterFS 3.3 you can use the `statedump` command to list the locks
held on files. The statedump output also provides information on each
lock with its range, basename, PID of the application holding the lock,
and so on. You can analyze the output to find the locks whose
owner/application is no longer running or interested in that lock.
After ensuring that no application is using the file, you can clear the
lock using the following `clear-locks` command:

`# gluster volume clear-locks <VOLNAME> <path> kind {blocked | granted | all} {inode <range> | entry <basename> | posix <range>}`

For more information on performing `statedump`, see ?.

**To identify locked file and clear locks**

1. Perform statedump on the volume to view the files that are locked
    using the following command:

    `# gluster volume statedump <VOLNAME> inode`

    For example, to display statedump of test-volume:

        # gluster volume statedump test-volume
        Volume statedump successful

    The statedump files are created on the brick servers in the `/tmp`
    directory or in the directory set using the `server.statedump-path`
    volume option. The naming convention of the dump file is
    `<brick-path>.<brick-pid>.dump`.

    The following are the sample contents of the statedump file. It
    indicates that GlusterFS has entered into a state where there is an
    entry lock (entrylk) and an inode lock (inodelk). Ensure that those
    are stale locks and no resources own them.

        [xlator.features.locks.vol-locks.inode]
        path=/
        mandatory=0
        entrylk-count=1
        lock-dump.domain.domain=vol-replicate-0
        xlator.feature.locks.lock-dump.domain.entrylk.entrylk[0](ACTIVE)=type=ENTRYLK_WRLCK on basename=file1, pid = 714782904, owner=ffffff2a3c7f0000, transport=0x20e0670, , granted at Mon Feb 27 16:01:01 2012

        conn.2.bound_xl./gfs/brick1.hashsize=14057
        conn.2.bound_xl./gfs/brick1.name=/gfs/brick1/inode
        conn.2.bound_xl./gfs/brick1.lru_limit=16384
        conn.2.bound_xl./gfs/brick1.active_size=2
        conn.2.bound_xl./gfs/brick1.lru_size=0
        conn.2.bound_xl./gfs/brick1.purge_size=0

        [conn.2.bound_xl./gfs/brick1.active.1]
        gfid=538a3d4a-01b0-4d03-9dc9-843cd8704d07
        nlookup=1
        ref=2
        ia_type=1
        [xlator.features.locks.vol-locks.inode]
        path=/file1
        mandatory=0
        inodelk-count=1
        lock-dump.domain.domain=vol-replicate-0
        inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 714787072, owner=00ffff2a3c7f0000, transport=0x20e0670, , granted at Mon Feb 27 16:01:01 2012

2. Clear the entry lock using the following command:

    `# gluster volume clear-locks <VOLNAME> <path> kind granted entry <basename>`

    For example, to clear the entry lock on `file1` of test-volume:

        # gluster volume clear-locks test-volume / kind granted entry file1
        Volume clear-locks successful
        vol-locks: entry blocked locks=0 granted locks=1
3. Clear the inode lock using the following command:

    `# gluster volume clear-locks <VOLNAME> <path> kind granted inode <range>`

    For example, to clear the inode lock on `file1` of test-volume:

        # gluster volume clear-locks test-volume /file1 kind granted inode 0,0-0
        Volume clear-locks successful
        vol-locks: inode blocked locks=0 granted locks=1

    You can perform statedump on test-volume again to verify that the
    above inode and entry locks are cleared.
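A minimal way to re-check, assuming the dump files land in the default
`/tmp` directory on the brick servers and follow the
`<brick-path>.<brick-pid>.dump` naming convention described above:

    # take a fresh statedump and look for remaining lock entries
    gluster volume statedump test-volume inode
    grep -E 'entrylk-count|inodelk-count' /tmp/*.dump*
    # a count of 0 (or no remaining ACTIVE lock lines) means the locks are gone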
