summaryrefslogtreecommitdiffstats
path: root/xlators/storage/posix/src/posix-handle.h
Commit message (Collapse)AuthorAgeFilesLines
* Posix: Optimize posix code to improve file creationMohit Agrawal2020-04-061-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Before executing a fop in POSIX xlator it builds an internal path based on GFID.To validate the path it call's (l)stat system call and while .glusterfs is heavily loaded kernel takes time to lookup inode and due to that performance drops Solution: In this patch we followed two ways to improve the performance. 1) Keep open fd specific to first level directory(gfid[0]) in .glusterfs, it would force to kernel keep the inodes from all those files in cache. In case of memory pressure kernel won't uncache first level inodes. We need to open 256 fd's per brick to access the entry faster. 2) Use at based call's to access relative path to reduce path based lookup time. Note: To verify the patch we have executed kernel untar 100 times on 6 different clients after enabling metadata group-cache and some other option.We were getting more than 20 percent improvement in kenel untar after applying the patch. Credits: Xavi Hernandez <xhernandez@redhat.com> Change-Id: I1643e6b01ed669b2bb148d02f4e6a8e08da45343 updates: #891 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
* posix: Improve MAKE_HANDLE_GFID_PATH and MAKE_HANDLE_RELPATHMohit Agrawal2019-11-111-10/+10
| | | | | | | | Avoid one function call to set the gfid_path in buffer Change-Id: If9b95801b05c34d262fac9a275492c794d12bf58 Updates #748 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* posix*.c: remove unneeded strlen() callsYaniv Kaul2019-09-051-1/+1
| | | | | | | | | In various places, we can re-use knowledge of string length or result of snprintf() and such instead of strlen(). Change-Id: I4c9b1decf1169b3f8ac83699a0afbd7c38fad746 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* server: don't allow '/' in basenameAmar Tumballi2018-11-051-2/+2
| | | | | | | | | | | | | | | | Server stack needs to have all the sort of validation, assuming clients can be compromized. It is possible for a compromized client to send basenames with paths with '/', and with that create files without permission on server. By sanitizing the basename, and not allowing anything other than actual directory as the parent for any entry creation, we can mitigate the effects of clients not able to exploit the server. Fixes: CVE-2018-14651 Fixes: bz#1644755 Change-Id: I5dc0da0da2713452ff2b65ac2ddbccf1a267dc20 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* Land clang-format changesGluster Ant2018-09-121-168/+175
| | | | Change-Id: I6f5d8140a06f3c1b2d196849299f8d483028d33b
* server-protocol: don't allow '../' path in 'name'Amar Tumballi2018-09-051-0/+6
| | | | | | | | | | | | | This will prevent any arbitrary file creation through glusterfs by modifying the client bits. Also check for the similar flaw inside posix too, so we prevent any changes in layers in-between. Fixes: bz#1625095 Signed-off-by: Amar Tumballi <amarts@redhat.com> Change-Id: Id9fe0ef6e86459e8ed85ab947d977f058c5ae06e
* posix/ctime: posix hooks to get consistent time xattrKotresh HR2018-05-061-3/+5
| | | | | | | | | | | | This patch uses the ctime posix APIs to get consistent time across replica. The time attributes are got from from inode context or from on disk if not found and merged with iatt to be returned. Credits: Rafi KC <rkavunga@redhat.com> Updates: #208 Change-Id: Id737038ce52468f1f5ebc8a42cbf9c6ffbd63850 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix: reserve option behavior is not correct while using fallocateMohit Agrawal2018-04-111-0/+2
| | | | | | | | | | | | | | | | | Problem: storage.reserve option is not working correctly while disk space is allocate throguh fallocate Solution: In posix_disk_space_check_thread_proc after every 5 sec interval it calls posix_disk_space_check to monitor disk space and set the flag in posix priv.In 5 sec timestamp user can create big file with fallocate that can reach posix reserve limit and no error is shown on terminal even limit has reached. To resolve the same call posix_disk_space for every fallocate fop instead to call by a thread after 5 second BUG: 1560411 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Change-Id: I39ba9390e2e6d084eedbf3bcf45cd6d708591577
* cluster/dht: avoid overwriting client writes during migrationSusant Palai2018-02-021-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | For more details on this issue see https://github.com/gluster/glusterfs/issues/308 Solution: This is a restrictive solution where a file will not be migrated if a client writes to it during the migration. This does not check if the writes from the rebalance and the client actually do overlap. If dht_writev_cbk finds that the file is being migrated (PHASE1) it will set an xattr on the destination file indicating the file was updated by a non-rebalance client. Rebalance checks if any other client has written to the dst file and aborts the file migration if it finds the xattr. updates gluster/glusterfs#308 Change-Id: I73aec28bc9dbb8da57c7425ec88c6b6af0fbc9dd Signed-off-by: Susant Palai <spalai@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: N Balachandran <nbalacha@redhat.com>
* posix: Reorganize posix xlator to prepare for reuse with rioShyamsundarR2017-12-021-103/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Split out entry and inode/fd based FOPs into separate files from posix.c 2. Split out common routines (init, fini, reconf, and such) into its own file, from posix.c 3. Retain just the method assignments in posix.c (such that posix2 for RIO can assign its own methods in the future for entry operations and such) 4. Based on the split in (1) and (2) split out posix-handle.h into 2 files, such that macros that are needed for inode ops are in one and rest are in the other If the split is done as above, posix2 can compile with its own entry ops, and hence not compile, the entry ops as split in (1) above. The split described in (4) can again help posix2 to define its own macros to make entry and inode handles, thus not impact existing POSIX xlator code. Noted problems - There are path references in certain cases where quota is used (in the xattr FOPs), and thus will fail on reuse in posix2, this needs to be handled when we get there. - posix_init does set root GFID on the brick root, and this is incorrect for posix2, again will need handling later when posix2 evolves based on this code (other init checks seem fine on current inspection) Merge of experimental branch patches with the following gerrit change-IDs > Change-Id: I965ce6dffe70a62c697f790f3438559520e0af20 > Change-Id: I089a4d9cf470c2f9c121611e8ef18dea92b2be70 > Change-Id: I2cec103f6ba8f3084443f3066bcc70b2f5ecb49a Fixes gluster/glusterfs#327 Change-Id: I0ccfa78559a7c5a68f5e861e144cf856f5c9e19c Signed-off-by: ShyamsundarR <srangana@redhat.com>
* Coverity Issue: PW.INCLUDE_RECURSION in several filesGirjesh Rajoria2017-11-091-1/+0
| | | | | | | | | | | | | | Coverity ID: 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 423, 424, 425, 426, 427, 428, 429, 436, 437, 438, 439, 440, 441, 442, 443 Issue: Event include_recursion Removed redundant, recursive includes from the files. Change-Id: I920776b1fa089a2d4917ca722d0075a9239911a7 BUG: 789278 Signed-off-by: Girjesh Rajoria <grajoria@redhat.com>
* cluster/afr: Fix data loss due to race between sh and ongoing writeKrutika Dhananjay2015-12-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: When IO is happening on a file and a brick goes down comes back up during this time, protocol/client translator attempts reopening of the fd on the gfid handle of the file. But if another client renames this file while a brick was down && writes were in progress on it, once this brick is back up, there can be a race between reopening of the fd and entry self-heal replaying the effect of the rename() on the sink brick. If the reopening of the fd happens first, the application's writes continue to go into the data blocks associated with the gfid. Now entry-self-heal deletes 'src' and creates 'dst' file on the sink, marking dst as a 'newentry'. Data self-heal is also completed on 'dst' as a result and self-heal terminates. If at this point the application is still writing to this fd, all writes on the file after self-heal would go into the data blocks associated with this fd, which would be lost once the fd is closed. The result - the 'dst' file on the source and sink are not the same and there is no pending heal on the file, leading to silent corruption on the sink. Fix: Leverage http://review.gluster.org/#/c/12816/ to ensure the gfid handle path gets saved in .glusterfs/unlink until the fd is closed on the file. During this time, when self-heal sends mknod() with gfid of the file, do the following: link() the gfid handle under .glusterfs/unlink to the new path to be created in mknod() and rename() the gfid handle to go back under .glusterfs/ab/cd/. Change-Id: I86ef1f97a76ffe11f32653bb995f575f7648f798 BUG: 1292379 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/13001 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* posix: posix_make_ancestryfromgfid shouldn't log ENOENTvmallika2015-08-131-1/+1
| | | | | | | | | | | | | | posix_make_ancestryfromgfid shouldn't log ENOENT and it should set proper op_errno Change-Id: I8a87f30bc04d33cab06c91c74baa9563a1c7b45d BUG: 1251449 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/11861 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* Porting new log messages for posixHari Gowtham2015-06-171-23/+30
| | | | | | | | | | Change-Id: I29bdeefb755805858e3cb1817b679cb6f9a476a9 BUG: 1194640 Signed-off-by: Hari Gowtham <hgowtham@dhcp35-85.lab.eng.blr.redhat.com> Reviewed-on: http://review.gluster.org/9893 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* build: do not #include "config.h" in each fileNiels de Vos2015-05-291-5/+0
| | | | | | | | | | | | | | | | | | Instead of including config.h in each file, and have the additional config.h included from the compiler commandline (-include option). When a .c file tests for a certain #define, and config.h was not included, incorrect assumtions were made. With this change, it can not happen again. BUG: 1222319 Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/10808 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* Avoid conflict between contrib/uuid and system uuidEmmanuel Dreyfus2015-04-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | glusterfs relies on Linux uuid implementation, which API is incompatible with most other systems's uuid. As a result, libglusterfs has to embed contrib/uuid, which is the Linux implementation, on non Linux systems. This implementation is incompatible with systtem's built in, but the symbols have the same names. Usually this is not a problem because when we link with -lglusterfs, libc's symbols are trumped. However there is a problem when a program not linked with -lglusterfs will dlopen() glusterfs component. In such a case, libc's uuid implementation is already loaded in the calling program, and it will be used instead of libglusterfs's implementation, causing crashes. A possible workaround is to use pre-load libglusterfs in the calling program (using LD_PRELOAD on NetBSD for instance), but such a mechanism is not portable, nor is it flexible. A much better approach is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any possible conflict. This is what this change attempts. BUG: 1206587 Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10017 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* Storage/posix : Adding error checks in path formationNithya Balachandran2015-02-181-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Renaming directories can cause the size of the buffer required for posix_handle_path to increase between the first call, which calculates the size, and the second call which forms the path in the buffer allocated based on the size calculated in the first call. The path created in the second call overflows the allocated buffer and overwrites the stack causing the brick process to crash. The fix adds a buffer size check to prevent the buffer overflow. It also checks and returns an error if the posix_handle_path call is unable to form the path instead of working on the incomplete path, which is likely to cause subsequent calls using the path to fail with ELOOP. Preventing buffer overflow and handling errors BUG: 1113960 Change-Id: If3d3c1952e297ad14f121f05f90a35baf42923aa Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/9289 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* POSIX filesystem compliance: PATH_MAXEmmanuel Dreyfus2014-10-031-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POSIX mandates the filesystem to support paths of lengths up to _XOPEN_PATH_MAX (1024). This is the PATH_MAX limit here: http://pubs.opengroup.org/onlinepubs/009604499/basedefs/limits.h.html When using a path of 1023 bytes, the posix xlator attempts to create an absolute path by prefixing the 1023 bytes path by the brick base path. The result is an absolute path of more than _XOPEN_PATH_MAX bytes which may be rejected by the backend filesystem. Linux's ext3fs PATH_MAX seems to defaut to 4096, which means it will work (except if brick base path is longer than 2072 bytes but it is unlikely to happen. NetBSD's FFS PATH_MAX defaults to 1024, which means the bug can happen regardless of brick base path length. If this condition is detected for a brick, the proposed fix is to chdir() the brick glusterfsd daemon to its brick base directory. Then when encountering a path that will exceed _XOPEN_PATH_MAX once prefixed by the brick base path, a relative path is used instead of an absolute one. We do not always use relative path because some operations require an absolute path on the brick base path itself (e.g.: statvfs). At least on NetBSD, this chdir() uncovers a race condition which causes file lookup to fail with ENODATA for a few seconds. The volume quickly reaches a sane state, but regression tests are fast enough to choke on it. The reason is obscure (as often with race conditions), but sleeping one second after the chdir() seems to change scheduling enough that the problem disapear. Note that since the chdir() is done if brick backend filesystem does not support path long enough, it will not occur with Linux ext3fs (except if brick base path is over 2072 bytes long). BUG: 1129939 Change-Id: I7db3567948bc8fa8d99ca5f5ba6647fe425186a9 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8596 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/quota: Heal pgfid xattr on existing data when the quota isvmallika2014-09-301-0/+16
| | | | | | | | | | | | | | | | | | | | | enable The pgfid extended attributes are used to construct the ancestry path (from the file to the volume root) for nameless lookups on files. As NFS relies on nameless lookups heavily, quota enforcement through NFS would be inconsistent if quota were to be enabled on a volume with existing data. Solution is to heal the pgfid extended attributes as a part of lookup perfomed by quota-crawl process. In a posix lookup check for pgfid xattr and if it is missing set the xattr. Change-Id: I5912ea96787625c496bde56d43ac9162596032e9 BUG: 1147378 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/8878 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Always check for ENODATA with ENOATTREmmanuel Dreyfus2014-09-081-1/+1
| | | | | | | | | | | | | | | | | | Linux defines ENODATA and ENOATTR with the same value, which means that code can miss on on the two without breaking. FreeBSD does not have ENODATA and GlusterFS defines it as ENOATTR just like Linux does. On NetBSD, ENODATA != ENOATTR, hence we need to check for both values to get portable behavior. BUG: 764655 Change-Id: I003a3af055fdad285d235f2a0c192c9cce56fab8 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8447 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* storage/posix: Prefer gfid links for inode-handlePranith Kumar K2014-09-021-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: File path could change by other entry operations in-flight so if renames are in progress at the time of other operations like open, it may lead to failures. We observed that this issue can also happen while renames and readdirps/lookups are in progress because dentry-table is going stale sometimes. Fix: Prefer gfid-handles over paths for files. For directory handles prefering gfid-handles hits performance issues because it needs to resolve paths traversing up the symlinks. Tests which test if files are opened should check on gfid path after this change. So changed couple of tests to reflect the same. Note: This patch doesn't fix the issue for directories. I think a complete fix is to come up with an entry operation serialization xlator. Until then lets live with this. Change-Id: I10bda1083036d013f3a12588db7a71039d9da6c3 BUG: 1136159 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8575 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: do not dereference gfid symlinks before ↵Xavier Hernandez2014-05-021-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | posix_handle_mkdir_hashes() Whenever a new directory is created, its corresponding gfid file must also be created. This was done first calling MAKE_HANDLE_PATH() to get the path of the gfid file, then calling posix_handle_mkdir_hashes() to create the parent directories of the gfid, and finally creating the soft-link. In normal circumstances, the gfid we want to create won't exist and MAKE_HANDLE_PATH() will return a simple path to the new gfid. However if the volume is damaged and a self-heal is running, it is possible that we try to create an already existing gfid. In this case, MAKE_HANDLE_PATH() will return a path to the directory instead of the path to the gfid. To solve this problem, every time a path to a gfid is needed, a call to MAKE_HANDLE_ABSPATH() is made instead of the call to MAKE_HANDLE_PATH(). Change-Id: Ic319cc38c170434db8e86e2f89f0b8c28c0d611a BUG: 859581 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/5075 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: HANDLE_PFX is redundant use GF_HIDDEN_PATH insteadHarshavardhana2014-01-221-1/+0
| | | | | | | | | | | | | GF_HIDDEN_PATH usage would help in better readability of the code and avoids bugs produced from redundant macro constants. Change-Id: I2fd7e92e87783ba462ae438ced2cf4f720a25f5c BUG: 990028 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/6756 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: placeholders for GFID to path conversionRaghavendra G2013-11-261-5/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | what? ===== The following is an attempt to generate the paths of a file when only its gfid is known. To find the path of a directory, the symlink handle to the directory maintained in the ".glusterfs" backend directory is read. The symlink handle is generated using the gfid of the directory. It (handle) contains the directory's name and parent gfid, which are used to recursively construct the absolute path as seen by the user from the mount point. A similar approach cannot be used for a regular file or a symbolic link since its hardlink handle, generated using its gfid, doesn't contain its parent gfid and basename. So xattrs are set to store the parent gfids and the number of hardlinks to a file or a symlink having the same parent gfid. When an user/application requests for the paths of a regular file or a symlink with multiple hardlinks, using the parent gfids stored in the xattrs, the paths of the parent directories are generated as mentioned earlier. The base names of the hardlinks (with the same parent gfid) are determined by matching the actual backend inode numbers of each entry in the parent directory with that of the hardlink handle. Xattr is set on a regular file, link, and symbolic link as follows, Xattr name : trusted.pgfid.<pargfidstr> Xattr value : <number of hardlinks to a regular file/symlink with the same parentgfid> If a regular file, hard link, symbolic link is created then an xattr in the above format is set in the backend. how to use? =========== This functionality can be used through getxattr interface. Two keys - glusterfs.ancestry.dentry and glusterfs.ancestry.path - enable usage of this functionality. A successful getxattr will have the result stored under same keys. Values will be, glusterfs.ancestry.dentry: -------------------------- A linked list of gf-dirent structures for all possible paths from root to this gfid. If there are multiple paths, the linked-list will be a series of paths one after another. Each path will be a series of dentries representing all components of the path. This key is primarily for internal usage within glusterfs. glusterfs.ancestry.path: ------------------------ A string containing all possible paths from root to this gfid. Multiple hardlinks of a file or a symlink are displayed as a colon seperated list (this could interfere with path components containing ':'). e.g. If there is a file "file1" in root directory with two hardlinks, "/dir2/link2tofile1" and "/dir1/link1tofile1", then [root@alpha gfsmntpt]# getfattr -n glusterfs.ancestry.path -e text file1 glusterfs.ancestry.path="/file1:/dir2/link2tofile1:/dir1/link1tofile1" Thanks Amar, Avati and Venky for the inputs. Original Author: Ramana Raja <rraja@redhat.com> BUG: 990028 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I0eaa9101e333e0c1f66ccefd9e95944dd4a27497 Reviewed-on: http://review.gluster.org/5951 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* All: License message changeVarun Shastry2012-09-131-7/+6
| | | | | | | | | | | | License message changed for server-side, dual license GPLV2 and LGPLv3+. Change-Id: Ia9e53061b9d2df3b3ef3bc9778dceff77db46a09 BUG: 852318 Signed-off-by: Varun Shastry <vshastry@redhat.com> Reviewed-on: http://review.gluster.org/3940 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* All: License message changeVarun Shastry2012-08-281-16/+7
| | | | | | | | | | | | | | | | | | The license message is changed to Copyright (c) 2008-2012 Red Hat, Inc. <http://www.redhat.com> This file is part of GlusterFS. This file is licensed to you under your choice of the GNU Lesser General Public License, version 3 or any later version (LGPLv3 or later), or the GNU General Public License, version 2 (GPLv2), in all cases as published by the Free Software Foundation. Change-Id: I07d2b63ed5fbbbd1884f1e74f2dd56013d15b0f4 BUG: 852318 Signed-off-by: Varun Shastry <vshastry@redhat.com> Reviewed-on: http://review.gluster.org/3858 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Move landfill inside .glusterfsPranith Kumar K2012-05-311-0/+2
| | | | | | | | | Change-Id: Ia2944f891dd62e72f3c79678c3a1fed389854a90 BUG: 811970 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3158 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* storage/posix: prefer absolute path handles over GFID handlesv3.3.0qa37Anand Avati2012-04-211-12/+12
| | | | | | | | | | Change-Id: I9afefa2f8a39c5f2c77271ea64aff95249c88821 BUG: 791187 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.com/3204 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* posix: handle some internal behavior in posix_mknod()Amar Tumballi2012-02-161-0/+3
| | | | | | | | | | | | | | | | | | | | assume a case of link() systemcall, which is handled in distribute by creating a 'linkfile' in hashed subvolume, if the 'oldloc' is present in different subvolume. we have same 'gfid' for the linkfile as that of file for consistency. Now, a file with multiple hardlinks, we may end up with 'hardlinked' linkfiles. dht create linkfile using 'mknod()' fop, and as now posix_mknod() is not equipped to handle this situation. this patch fixes the situation by looking at the 'internal' key set in the dictionary to differentiate the call which originates from inside with regular system calls. Change-Id: Ibff7c31f8e0c8bdae035c705c93a295f080ff985 BUG: 763844 Signed-off-by: Amar Tumballi <amar@gluster.com> Reviewed-on: http://review.gluster.com/2755 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: GFID filehandle based backend and anonymous FDsAnand Avati2012-01-201-0/+148
1. What -------- This change introduces an infrastructure change in the filesystem which lets filesystem operation address objects (inodes) just by its GFID. Thus far GFID has been a unique identifier of a user-visible inode. But in terms of addressability the only mechanism thus far has been the backend filesystem path, which could be derived from the GFID only if it was cached in the inode table along with the entire set of dentry ancestry leading up to the root. This change essentially decouples addressability from the namespace. It is no more necessary to be aware of the parent directory to address a file or directory. 2. Why ------- The biggest use case for such a feature is NFS for generating persistent filehandles. So far the technique for generating filehandles in NFS has been to encode path components so that the appropriate inode_t can be repopulated into the inode table by means of a recursive lookup of each component top-down. Another use case is the ability to perform more intelligent self-healing and rebalancing of inodes with hardlinks and also to detect renames. A derived feature from GFID filehandles is anonymous FDs. An anonymous FD is an internal USABLE "fd_t" which does not map to a user opened file descriptor or to an internal ->open()'d fd. The ability to address a file by the GFID eliminates the need to have a persistent ->open()'d fd for the purpose of avoiding the namespace. This improves NFS read/write performance significantly eliminating open/close calls and also fixes some of today's limitations (like keeping an FD open longer than necessary resulting in disk space leakage) 3. How ------- At each storage/posix translator level, every file is hardlinked inside a hidden .glusterfs directory (under the top level export) with the name as the ascii-encoded standard UUID format string. For reasons of performance and scalability there is a two-tier classification of those hardlinks under directories with the initial parts of the UUID string as the directory names. For directories (which cannot be hardlinked), the approach is to use a symlink which dereferences the parent GFID path along with basename of the directory. The parent GFID dereference will in turn be a dereference of the grandparent with the parent's basename, and so on recursively up to the root export. 4. Development --------------- 4a. To leverage the ability to address an inode by its GFID, the technique is to perform a "nameless lookup". This means, to populate a loc_t structure as: loc_t { pargfid: NULL parent: NULL name: NULL path: NULL gfid: GFID to be looked up [out parameter] inode: inode_new () result [in parameter] } and performing such lookup will return in its callback an inode_t populated with the right contexts and a struct iatt which can be used to perform an inode_link () on the inode (without a parent and basename). The inode will now be hashed and linked in the inode table and findable via inode_find(). A fundamental change moving forward is that the primary fields in a loc_t structure are now going to be (pargfid, name) and (gfid) depending on the kind of FOP. So far path had been the primary field for operations. The remaining fields only serve as hints/helpers. 4b. If read/write is to be performed on an inode_t, the approach so far has been to: fd_create(), STACK_WIND(open, fd), fd_bind (in callback) and then perform STACK_WIND(read, fd) etc. With anonymous fds now you can do fd_anonymous (inode), STACK_WIND (read, fd). This results in great boost in performance in the inbuilt NFS server. 5. Misc ------- The inode_ctx_put[2] has been renamed to inode_ctx_set[2] to be consistent with the rest of the codebase. Change-Id: Ie4629edf6bd32a595f4d7f01e90c0a01f16fb12f BUG: 781318 Reviewed-on: http://review.gluster.com/669 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>