summaryrefslogtreecommitdiffstats
path: root/xlators/storage/posix/src/posix.c
Commit message (Collapse)AuthorAgeFilesLines
* Adding ChangeTimeRecorder(CTR) Xlator to GlusterFSJoseph Fernandes2015-03-191-1/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ********************************************************************** ChangeTimeRecorder(CTR) Xlator | ********************************************************************** ChangeTimeRecorder(CTR) is server side xlator(translator) which sits just above posix xlator. The main role of this xlator is to record the access/write patterns on a file residing the brick. It records the read(only data) and write(data and metadata) times and also count on how many times a file is read or written. This xlator also captures the hard links to a file(as its required by data tiering to move files). CTR Xlator is the consumer of libgfdb. To Enable/Disable CTR Xlator: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ gluster volume set <volume-name> features.ctr-enabled {on/off} To Enable/Disable Frequency Counter Recording in CTR Xlator: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ gluster volume set <volume-name> features.record-counters {on/off} Change-Id: I5d3cf056af61ac8e3f8250321a27cb240a214ac2 BUG: 1194753 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/9935 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* every/where: add GF_FOP_IPC for inter-translator communicationJeff Darcy2015-03-171-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several features - e.g. encryption, erasure codes, or NSR - involve multiple cooperating translators which sometimes need a "private" means of communication amongst themselves. Historically we've used virtual or synthetic xattrs, but that's not very elegant and clutters up the getxattr/setxattr path which must also handle real xattr requests. This new fop should address that. The only argument is an int32_t "op" which should be recognized by the target translator. It is recommended that translators using these feature follow some convention regarding the ops that they define, to avoid conflicts. Using a hash of the target translator's type string as a base for a series of ops would probably be a good start. Any other information can be passed in both directions using xdata. The default behavior for this fop, as with any other, is to pass through to FIRST_CHILD. That makes use of this fop "transparent" to other translators that were written before it existed, but it also means that it only really works with pass-through translators. If a routing translator (such as DHT) or a fan-out translator (such as AFR) is involved, the IPC might not reach its intended destination unless those translators are modified to forward IPC fops along all paths. If an IPC gets all the way to storage/posix it is considered an error, much like an uncaught exception. We don't actually *do* anything in that case, but we do log it send back an EOPNOTSUPP error. This makes the "unrecognized opcode" condition distinguishable from the "no IPC support" condition (which would yield an RPC error instead) so clients can probe for the presence of a handler for their own favorite opcode and either use that or use old-school xattrs depending on the result. BUG: 1158628 Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: I84af1b17babe5b30ec03ecf027ae37d09b873968 Reviewed-on: http://review.gluster.org/8812 Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: add ACL translation for the GF_POSIX_ACL_*_KEY xattrNiels de Vos2015-03-091-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding support for two virtual extended attributes that are used for converting a binary POSIX ACL to a POSIX.1e long ACL text format. This makes it possible to transfer the ACL over the network to a different OS which can convert the POSIX.1e text format to its native structures. The following xattrs are sent over RPC in SETXATTR/GETXATTR procedures, and contain the POSIX.1e long ACL text format: - glusterfs.posix.acl: maps to ACL_TYPE_ACCESS - glusterfs.posix.default_acl: maps to ACL_TYPE_DEFAULT acl_from_text() (from libacl) converts the text format into an acl_t structure. This structure is then used by acl_set_file() to set the ACL in the filesystem. libacl-devel is needed for linking against libacl, so it has been added to the BuildRequires in the .spec. NetBSD does not support POSIX ACLs. Trying to get/set POSIX ACLs on a storage server running NetBSD, an error will be returned with errno set to ENOTSUP. Faking support, but not enforcing ACLs seems wrong to me. URL: http://www.gluster.org/community/documentation/index.php/Features/Improved_POSIX_ACLs BUG: 1185654 Change-Id: Ic5eb73d69190d3492df2f711d0436775eeea7de3 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9627 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* Fix dictionary leaks in ancestry-building code.Pranith Kumar K2015-03-091-5/+0
| | | | | | | | | | Change-Id: I7a4a24ed95f897d1c14d89f3869c20ba40f85b7f BUG: 1188636 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9839 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: Moved common functions as utils in syncop/common-utilsPranith Kumar K2015-02-271-3/+9
| | | | | | | | | | | | | | | These will be used by both afr and ec. Moved syncop_dirfd, syncop_ftw, syncop_dir_scan functions also into syncop-utils.c Change-Id: I467253c74a346e1e292d36a8c1a035775c3aa670 BUG: 1177601 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9740 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Storage/posix : Adding error checks in path formationNithya Balachandran2015-02-191-2/+3
| | | | | | | | | | | | | Modified a few log messages added for this fix. Also set the op_errno in an error check. Change-Id: I87caf2f89031aedad1aaee001aef54896dbecd3b BUG: 1113960 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/9702 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* posix: Fix unlink failing under specific conditionPrashanth Pai2015-02-191-1/+3
| | | | | | | | | | | | | | | | | | | | PROBLEM: Files are undeletable when these three conditions are met: 1. File does not have trusted.pgfid.<gfid> xattr set. This won't be set when build-pgfid is off (default). 2. File has hardlink count > 1. 3. build-pgfid option is turned on. FIX: Allow unlink on files not having trusted.pgfid.<gfid> xattr. Change-Id: I58a9d9a1b29a0cb07f4959daabbd6dd04fab2b34 BUG: 1122028 Signed-off-by: Prashanth Pai <ppai@redhat.com> Reviewed-on: http://review.gluster.org/8352 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* Storage/posix : Adding error checks in path formationNithya Balachandran2015-02-181-10/+123
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Renaming directories can cause the size of the buffer required for posix_handle_path to increase between the first call, which calculates the size, and the second call which forms the path in the buffer allocated based on the size calculated in the first call. The path created in the second call overflows the allocated buffer and overwrites the stack causing the brick process to crash. The fix adds a buffer size check to prevent the buffer overflow. It also checks and returns an error if the posix_handle_path call is unable to form the path instead of working on the incomplete path, which is likely to cause subsequent calls using the path to fail with ELOOP. Preventing buffer overflow and handling errors BUG: 1113960 Change-Id: If3d3c1952e297ad14f121f05f90a35baf42923aa Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/9289 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* storage/posix: Don't try to set gfid in case of INTERNAL-mknodPranith Kumar K2015-01-161-7/+12
| | | | | | | | | | Change-Id: I96540ed07f08e54d2a24a3b22c2437bddd558c85 BUG: 1088649 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9446 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* storage/posix: Set gfid after all xattrs, uid/gid are setPranith Kumar K2015-01-121-32/+32
| | | | | | | | | | | | | | | | | | | | | | | | Problem: When a new entry is created gfid is set even before uid/gid, xattrs are set on the entry. This can lead to dht/afr healing that file/dir with the uid/gid it sees just after the gfid is set, i.e. root/root. Sometimes setattr/setxattr are failing on that file/dir. Fix: Set gfid of the file/directory only after uid/gid, xattrs are setup properly. Readdirp, lookup either wait for the gfid to be assigned to the entry or not update the in-memory inode ctx in posix-acl xlator which was producing lot EACCESS/EPERM to the application or dht/afr self-heals. Change-Id: I0a6ced579daabe3452f5a85010a00ca6e8579451 BUG: 1088649 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9434 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* storage/posix: Set correct fsgid before doing symlinkPranith Kumar K2015-01-121-2/+2
| | | | | | | | | | Change-Id: Ic50dfa5e5084c7b148e42a5014cca2b47c8ab5ed BUG: 1180986 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9431 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* telldir()/seekdir() portability fixesEmmanuel Dreyfus2014-12-171-9/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POSIX says that an offset obtained from telldir() can only be used on the same DIR *. Linux is abls to reuse the offset accross closedir()/opendir() for a given directory, but this is not portable and such a behavior should be fixed. An incomplete fix for the posix xlator was merged in http://review.gluster.com/8926 This change set completes it. - Perform the same fix index xlator. - Use appropriate casts and variable types so that 32 bit signed offsets obtained by telldir() do not get clobbered when copied into 64 bit signed types. - modify glfs-heal.c and afr-self-heald.c so that they do not use anonymous fd, since this will cause closedir()/opendir() between each syncop_readdir(). On failure we fallback to anonymous fs only for Linux so that we can cope with updated client vs not updated brick. - Avoid sending an EINVAL when the client request for the EOF offset. Here we fix an error in previous fix for posix xlator: since we fill each directory entry with the offset of the next entry, we must consider as EOF the offset of the last entry, and not the value of telldir() after we read it. - Add checks in regression tests that we do not hit cases where offsets fed to seekdir() are wrong. Introduce log_newer() shell function to check for messages produced by the current script. This fix gather changes from http://review.gluster.org/9047 and http://review.gluster.org/8936 making them obsolete. BUG: 1129939 Change-Id: I59fb7f06a872c4f98987105792d648141c258c6a Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/9071 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Raghavendra Bhat <raghavendra@redhat.com>
* storage/posix: Set errno for xattrop failuresPranith Kumar K2014-12-101-0/+3
| | | | | | | | | | Change-Id: I4d44068c8da5257227d62906ec18ae16f6ed6c02 BUG: 1172477 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9261 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Raghavendra Bhat <raghavendra@redhat.com>
* posix: remove duplicate dirfd calls in posix_opendirAtin Mukherjee2014-11-301-1/+1
| | | | | | | | | BUG: 1168910 Change-Id: I285d352d20374bb3edee2db42d062d4724198425 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/9186 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: Changed order of chown and chmodVenkatesh Somyajulu2014-11-141-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Rebalance process runs in the root mode. If a normal user create a file and if it requires migration then because the migrated file is created by root, its owner and mode should be changed to the source normal user and permission should be changed the previous mode. If the suid bit is also set, then at the destination suid bit should also be set. Two operations are performed in the given order: 1. chmod 2. chown But chown resets the suid bit. So changed the order of these two operations so that first chown will be performed and then chmod will be performd so that suid bit will be preserved. Change-Id: Ib63b5cf528f8336b69bf090ad43bb02eec1d1602 BUG: 1086228 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/7435 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Treat ENODATA/ENOATTR as success in bulk removexattrPranith Kumar K2014-11-051-0/+14
| | | | | | | | | | | | | | | | | Bulk remove xattr is internal fop in gluster. Some of the xattrs may have special behavior. Ex: removexattr("posix.system_acl_access"), removes more than one xattr on the file that could be present in the bulk-removal request. Removexattr of these deleted xattrs will fail with either ENODATA/ENOATTR. Since all this fop cares is removal of the xattrs in bulk-remove request and if they are already deleted, it can be treated as success. Change-Id: Id8f2a39b68ab763ec8b04cb71b47977647f22da4 BUG: 1160509 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9049 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Avoid spurious EINVAL in posix_readdir()Emmanuel Dreyfus2014-10-291-3/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | On non Linux systems, we check that seekdir() succeeds and we return EINVAL if it does not. We need this to avoid infinite loops if some other component in GlusterFS makes an invalid seekdir() usage. This was introduced in this change: http://review.gluster.org/#/c/8760/ But seekdir() also fails when using the offset returned for the last entry, and this is expected behavior. As a result, the seekdir() test produces a spurious EINVAL when reaching end of directory. That error is not propagated to calling process, but it may harm internal GlusterFS processing. At least it produce a spurious error message in brick's log. We fix the problem by remembering the last entry offset in fd private data. When a new posix_readdir() invocation requests that offset, we avoid returning EINVAL. BUG: 1129939 Change-Id: I4e67a2ea46538aae63eea663dd4aa33b16ad24c7 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8926 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* POSIX filesystem compliance: PATH_MAXEmmanuel Dreyfus2014-10-031-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POSIX mandates the filesystem to support paths of lengths up to _XOPEN_PATH_MAX (1024). This is the PATH_MAX limit here: http://pubs.opengroup.org/onlinepubs/009604499/basedefs/limits.h.html When using a path of 1023 bytes, the posix xlator attempts to create an absolute path by prefixing the 1023 bytes path by the brick base path. The result is an absolute path of more than _XOPEN_PATH_MAX bytes which may be rejected by the backend filesystem. Linux's ext3fs PATH_MAX seems to defaut to 4096, which means it will work (except if brick base path is longer than 2072 bytes but it is unlikely to happen. NetBSD's FFS PATH_MAX defaults to 1024, which means the bug can happen regardless of brick base path length. If this condition is detected for a brick, the proposed fix is to chdir() the brick glusterfsd daemon to its brick base directory. Then when encountering a path that will exceed _XOPEN_PATH_MAX once prefixed by the brick base path, a relative path is used instead of an absolute one. We do not always use relative path because some operations require an absolute path on the brick base path itself (e.g.: statvfs). At least on NetBSD, this chdir() uncovers a race condition which causes file lookup to fail with ENODATA for a few seconds. The volume quickly reaches a sane state, but regression tests are fast enough to choke on it. The reason is obscure (as often with race conditions), but sleeping one second after the chdir() seems to change scheduling enough that the problem disapear. Note that since the chdir() is done if brick backend filesystem does not support path long enough, it will not occur with Linux ext3fs (except if brick base path is over 2072 bytes long). BUG: 1129939 Change-Id: I7db3567948bc8fa8d99ca5f5ba6647fe425186a9 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8596 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/quota: Heal pgfid xattr on existing data when the quota isvmallika2014-09-301-0/+21
| | | | | | | | | | | | | | | | | | | | | enable The pgfid extended attributes are used to construct the ancestry path (from the file to the volume root) for nameless lookups on files. As NFS relies on nameless lookups heavily, quota enforcement through NFS would be inconsistent if quota were to be enabled on a volume with existing data. Solution is to heal the pgfid extended attributes as a part of lookup perfomed by quota-crawl process. In a posix lookup check for pgfid xattr and if it is missing set the xattr. Change-Id: I5912ea96787625c496bde56d43ac9162596032e9 BUG: 1147378 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/8878 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Fix invalid seekdir() usageEmmanuel Dreyfus2014-09-301-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to POSIX, seekdir() should only be given offset obtained from telldir() on the same DIR * http://pubs.opengroup.org/onlinepubs/9699919799/functions/seekdir.html Code from afr-self-heald.c and index.c is operating outside of the specification, by doing using seekdir() with offset from a previously open/close/re-open directory. This seems to work on Linux (although with no guarantee it will always in the future). On NetBSD the seekdir() with a in invalid offset is a nilpotent operation, and causes an infinite loop, since index_fill_readdir() always restart from the beginning of the directory. The situation is fixed by using a non anonymous fd in afr-self-heald.c: we explicitely open the directory so that it remains open on the brick side during the timeframe where we want to reuse offsets in seekdir(). This requires adding an opendir fop in index xlator. If the brick was not updated, the opendir will fail and we fallback to the standard violating approach for backward compatibility on Linux. On other systems we fail since it never worked. While there, add tests to check seekdir() success in index and posix xlators, so that incorrect usage from calling code produce an explicit error instead of an infinite loop. We can only do it on non Linux systems, for the sake of backward compatibility when the brick was updated but not the client. BUG: 1129939 Change-Id: I88ca90acfcfee280988124bd6addc1a1893ca7ab Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8760 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterfs: allow setxattr of keys with null values.Ravishankar N2014-09-291-2/+2
| | | | | | | | | | | | | | | Disk based file systems allow to get/set extended attribute key-value pairs where value can be null. Fuse/libgfapi clients must be able to do the same on a gluster volume. Change-Id: Ifc11134cc07f1a3ede43f9d027554dcd10b5c930 BUG: 1135514 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/8567 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Do not forbid fallocate on non Linux systemsEmmanuel Dreyfus2014-09-261-7/+2
| | | | | | | | | | | | | | | | Linux fallocate() differs from posix_fallocate() by an extra flag that can have the FALLOC_FL_KEEP_SIZE value; Do not test FALLOC_FL_KEEP_SIZE existence to enable fallocate() in posix xlator, as sys_fallocate() in libglusterfs provides support for both implementations. BUG: 1129939 Change-Id: Idf41a0396028a15e81281791bf6912d7fd674e3f Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8856 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Log when mkdir is on an existing gfid but non-existentRaghavendra G2014-09-181-1/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | path. consider following steps on a distribute volume 1. rename (src, dst) on hashed subvolume 2. snapshot taken 3. restore snapshots and do stat on src and dst Now, we end up with two directories src and dst having same gfid, because of distribute creating directories on non-existent subvolumes as part of directory healing. This can happen even with race between rename and directory healing in dht-lookup. This can lead to undefined behaviour while accessing any of both directories. Hence, we are logging paths of both directories, so that a sysadmin can take some corrective action when (s)he sees this log. One of the corrective action can be to copy contents of both directories from backend into a new directory and delete both directories. Since effort involved to fix this issue is non-trivial, giving this workaround till we come up with a fix. Change-Id: I38f4520e6787ee33180a9cd1bf2f36f46daea1ea BUG: 1105082 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8008 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* Always check for ENODATA with ENOATTREmmanuel Dreyfus2014-09-081-8/+12
| | | | | | | | | | | | | | | | | | Linux defines ENODATA and ENOATTR with the same value, which means that code can miss on on the two without breaking. FreeBSD does not have ENODATA and GlusterFS defines it as ENOATTR just like Linux does. On NetBSD, ENODATA != ENOATTR, hence we need to check for both values to get portable behavior. BUG: 764655 Change-Id: I003a3af055fdad285d235f2a0c192c9cce56fab8 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8447 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/dht: Added code to capture races in dht-lookup pathVenkatesh Somyajulu2014-09-031-0/+6
| | | | | | | | | Change-Id: I9270d2d40ebd4b113ff961583dfda7754741f15b BUG: 1116150 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8430 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix : Missing space in log messageNithya Balachandran2014-09-031-1/+1
| | | | | | | | | | | Added a space in a log message Change-Id: Iabd50e6b5c9ff4673f59d6b52b785894b3dcdaf9 BUG: 1116150 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/8585 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* storage/posix: Don't unlink .glusterfs-hardlink before linkto checkVenkatesh Somyajulu2014-08-281-8/+37
| | | | | | | | | | BUG: 1116150 Change-Id: I90a10ac54123fbd8c7383ddcbd04e8879ae51232 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8559 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: fix issue in posix_fsyncZhang Huan2014-08-011-1/+2
| | | | | | | | | | | | | | Fix the issue that posix_fsync does not correctly return and save error code in op_errno when call to sys_fdatasync fails. Change-Id: Id0b62cfa009dbb52c8a0992abd5c46330fa0a8c0 BUG: 1125814 Signed-off-by: Zhang Huan <zhhuan@gmail.com> Reviewed-on: http://review.gluster.org/8398 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Modified logic of linkto file deletion on non-hashedVenkatesh Somyajulu2014-07-311-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently whenever dht_lookup_everywhere gets called, if in dht_lookup_everywhere_cbk, a linkto file is found on non-hashed subvolume, file is unlinked. But there are cases when this file is under migration. Under such condition, we should avoid deletion of file. When some other rebalance process changes the layout of parent such that dst_file (w.r.t. migration) falls on non-hashed node, then may be lookup could have found it as linkto file but just before unlink, file is under migration or already migrated In such cased unlink can be avoided. Race: ------- If we have two bricks (brick-1 and brick-2) with initial file "a" under BaseDir which is hashed as well as cached on (brick-1). Assume "a" hashing gives 44. Brick-1 Brick-2 Initial Setup: BaseDir/a BaseDir [1-50] [51-100] Now add new-brick Brick-3. 1. Rebalance-1 on node Node-1 (Brick-1 node) will reset the BaseDir Layout. 2. After that it will perform a) Create linkto file on new-hashed (brick-2) b) Perform file migration. 1.Rebalance-1 Fixes the base-layout: Brick-1 Brick-2 Brick-3 --------- ---------- ------------ BaseDir/a BaseDir BaseDir [1-33] [34-66] [67-100] 2. Only a) is BaseDir/a BaseDir/a(linkto) BaseDir performed Create linktofile Now rebalance 2 on node-2 jumped in and it will perform step 1 and 2-a. After (rebal-2, step-1), it changes the layout of the BaseDir. BaseDir/a BaseDir/a(link) BaseDir [67-100] [1-33] [34-66] For (rebale-2, step-2), It will perform lookup at Brick-3 as w.r.t new layout 44 falls for brick-3. But lookup will fail. So dht_lookup_everywhere gets called. NOTE: On brick-2 by rebalance-1, a linkto file was created. Currently that linkto files gets deleted by rebalance-2 lookup as it is considered as stale linkto file. But with patch if rebalance is already in progress or rebalance is over, linkto file will not be unlinked. If rebalance is in progress fd will be open and if rebalance is over then linkto file wont be set. Change-Id: I3fee0d28de3c76197325536a9e30099d2413f079 BUG: 1116150 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8345 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: removing deleting entries in case of creation failuresRaghavendra G2014-07-301-40/+68
| | | | | | | | | | | | | The code is not atomic enough to not to delete a dentry created by a prallel dentry creation operation. Change-Id: I9bd6d2aa9e7a1c0688c0a937b02a4b4f56d7aa3d BUG: 1117851 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8327 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Fix races to avoid deletion of linkto fileVenkatesh Somyajulu2014-07-181-11/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Explanation of Race between rebalance processes: https://bugzilla.redhat.com/show_bug.cgi?id=1110694#c4 STATE 1: BRICK-1 only one brick Cached File in the system STATE 2: Add brick-2 BRICK-1 BRICK-2 STATE 3: Lookup of File on brick-2 by this node's rebalance will fail because hashed file is not created yet. So dht_lookup_everywhere is about to get called. STATE 4: As part of lookup link file at brick-2 will be created. STATE 5: getxattr to check that cached file belongs to this node is done STATE 6: dht_lookup_everywhere_cbk detects the link created by rebalance-1. It will unlink it. STATE 7: getxattr at the link file with "pathinfo" key will be called will fail as the link file is deleted by rebalance on node-2 Fix: So in the STATE 6, we should avoid the deletion of link file. Every time dht_lookup_everywhere gets called, lookup will be performed on all the nodes. So to avoid STATE 6, if linkto file is found, it is not deleted until valid case is found in dht_lookup_everywhere_done. Case 1: if linkto file points to cached node, and cached file exists, uwind with success. Case 2: if linkto does not point to current cached node, and cached file exists: a) Unlink stale link file b) Create new link file Case 3: Only linkto file exists: Delete linkto file Case 4: Only cached file Create link file (Handled event without patch) Case 5: Neither cached nor hashed file is present Return with ENOENT (handled even without patch) Change-Id: Ibf53671410d8d613b8e2e7e5d0ec30fc7dcc0298 BUG: 1116150 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8231 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* porting: Provide fallocate and fremovexattr for OSXHarshavardhana2014-07-021-1/+1
| | | | | | | | | Change-Id: I563216f83edaff6d01a251ef0c1746a14aec700c BUG: 1089172 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/8217 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* porting: Port for FreeBSD rebased from Mike Ma's effortsHarshavardhana2014-07-021-2/+2
| | | | | | | | | | | | | | | | | | | - Provides a working Gluster Management Daemon, CLI - Provides a working GlusterFS server, GlusterNFS server - Provides a working GlusterFS client - execinfo port from FreeBSD is moved into ./contrib/libexecinfo for ease of portability on NetBSD. (FreeBSD 10 and OSX provide execinfo natively) - More portability cleanups for Darwin, FreeBSD and NetBSD - Provides a new rc script for FreeBSD Change-Id: I8dff336f97479ca5a7f9b8c6b730051c0f8ac46f BUG: 1111774 Original-Author: Mike Ma <mikemandarine@gmail.com> Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/8141 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* Posix/readdirp : Avoid filling wrong stat info for dentry in readdirpSusant Palai2014-05-281-2/+5
| | | | | | | | | | | | | | | | | Problem: Upon no entry found for a dentry, posix_readdirp_fill used to fill the stat for the current entry with the previous one. Solution: Continue with other entries if lstat failed for current one Change-Id: Ic96b5900451ed6c8de59acf9fee2e116649d3cdb BUG: 1096578 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/7733 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Don't allow open on symlinkPranith Kumar K2014-05-221-2/+8
| | | | | | | | | | | | Change-Id: Ie38ecad621d5cb351c607c9676814c573834cb0b BUG: 1099858 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7823 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Remove dead code in xattropPranith Kumar K2014-05-011-32/+0
| | | | | | | | | Change-Id: I48ba81ad342e3c8fe1d81f8e57e61676dd356fb0 BUG: 1092749 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7318 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: add list xattr capability to lookupPranith Kumar K2014-04-291-6/+12
| | | | | | | | | BUG: 1078061 Change-Id: Ie26d28b8a74aa0d1eceff14a84c3cd3e302dcdb5 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7293 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Update references to the maillinglist to gluster-devel@gluster.orgNiels de Vos2014-04-271-1/+1
| | | | | | | | | | | | | | gluster-devel@nongnu.org has moved to gluster-devel@gluster.org. All occurrences in the current (non legacy) documentation and code have been adjusted. Change-Id: I053162e633f7ea14fd3eed239ded017df165147c BUG: 1091705 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/7573 Reviewed-by: Justin Clift <justin@gluster.org> Reviewed-by: Anand Avati <avati@redhat.com> Tested-by: Anand Avati <avati@redhat.com>
* build: MacOSX Porting fixesHarshavardhana2014-04-241-25/+157
| | | | | | | | | | | | | | | | | | | | | git@forge.gluster.org:~schafdog/glusterfs-core/osx-glusterfs Working functionality on MacOSX - GlusterD (management daemon) - GlusterCLI (management cli) - GlusterFS FUSE (using OSXFUSE) - GlusterNFS (without NLM - issues with rpc.statd) Change-Id: I20193d3f8904388e47344e523b3787dbeab044ac BUG: 1089172 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Signed-off-by: Dennis Schafroth <dennis@schafroth.com> Tested-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Dennis Schafroth <dennis@schafroth.com> Reviewed-on: http://review.gluster.org/7503 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Reduce logging caused by non-existing extended attributesNiels de Vos2014-03-011-2/+4
| | | | | | | | | | | | | | | | | This changes the following log messages from INFO (default value) to DEBUG. We do not really care if someone tries to read extended attributes that do not exist. [2013-12-09 12:19:05.924497] E [posix.c:3539:posix_fgetxattr] 0-dis-rep-posix: fgetxattr failed on key system.posix_acl_access (No data available) [2013-12-09 12:19:05.924545] I [server-rpc-fops.c:863:server_fgetxattr_cbk] 0-dis-rep-server: 13074: FGETXATTR 1 (b8381953-ffa5-40fa-90dd-ae122335cc4b) (system.posix_acl_access) ==> (No data available) Change-Id: Idbbeb026f81e67025a2b36d7bfeb125ad2a1f61b BUG: 1027174 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/7171 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Anand Avati <avati@redhat.com>
* add build-gfid option to enable pgfid tracking ...Krishnan Parthasarathi2014-02-141-1/+1
| | | | | | | | | | | .. for inode to pathname mapping Change-Id: I0486d85b02e86d739fc1d8ea16d118fb666abf60 BUG: 1064863 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/6989 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: calculate checksum only over read dataAnand Avati2014-02-131-2/+2
| | | | | | | | | | | | | | If the last block of a file is not aligned to the requested size, checksum is calculated over junk data in the iobuf. Or it could be zeroes, resulting in a spurious checksum match in self-heal. Change-Id: I41422e08de90013dabfc348ec6fbb8ecdd4f8fb8 BUG: 1021686 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6932 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* storage/posix: perform chmod after chown.Ravishankar N2014-02-111-6/+6
| | | | | | | | | | | | | | | | | | | | | | | Problem: When a replica brick is added to a volume, set-user-ID and set-group-ID permission bits of files are not set correctly in the new brick. The issue is in the posix_setattr() call where we do a chmod followed by a chown. But according to the man pages for chown: When the owner or group of an executable file are changed by an unprivileged user the S_ISUID and S_ISGID mode bits are cleared. POSIX does not specify whether this also should happen when root does the chown(). Fix: Swap the chmod and chown calls in posix_setattr() Change-Id: I094e47a995c210d2fdbc23ae7a5718286e7a9cf8 BUG: 1058797 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/6862 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/marker-quota: more stringent error handling in rename.Raghavendra G2014-02-081-1/+4
| | | | | | | | | | | | | | | If an error occurs and op_errno is not set to non-zero value, we can end up in loosing a frame resulting in a hung syscall. This patch adds code setting op_errno appropriately in storage/posix and makes marker to set err to a default non-zero value in case of op_errno being zero. Change-Id: Idc2c3e843b932709a69b32ba67deb284547168f2 BUG: 833586 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/5032 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Problem : getfattr fails and mount point becomes inaccessible while Atin Mukherjee2014-02-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | fetching glusterfs.ancestry.path key value for a gfid from aux-gfid-mount based mount point. Analysis : The caller of posix_make_ancestryfromgfid() function i.e. posix_get_ancestry_non_directory() is sending an incorrect type value as an argument leading to a crash in posix_make_ancestral_node() (head dirent pointer is NULL and an attempt to dereference causes seg fault). For a non directory operation this piece of code should not have been executed but due to incorrect type value this happened. Solution : In posix_make_ancestryfromgfid() call type is passed as logical or of type and POSIX_ANCESTRY_PATH instead of logical or of POSIX_ANCESTRY_PATH and POSX_ANCESTRY_DENTRY. Change-Id: Iaf844bea91396c5e2ee295d5a082998fda66f0a9 BUG: 1060104 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/6892 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/quota: remove in-memory accounting of files in enforcerRaghavendra G2014-01-251-1/+0
| | | | | | | | | | | | | | | Accounting was done in enforcer (though marker is the ultimate source of truth) to offset cached directory size becoming stale. However, with enforcer being moved to brick we can no longer maintain correct cluster wide size for a directory. Hence removing accounting code from enforcer. Change-Id: I5ea94234da4da85ed5f5ced1354d8de3454b3fcb BUG: 969461 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/6434 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* consolidate code for #ifdef HAVE_LINKAT usageVijay Bellur2014-01-141-12/+3
| | | | | | | | | | | | | | | | | | | | | sys_link() now does ifdef HAVE_LINKAT linkat (...) else link (...) endif Use sys_link() in all places where we previously had the conditional behavior. Change-Id: I8bce5ac1175efd2ba7ab4bb5b372f6d1e0365d28 BUG: 764655 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/6633 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Anand Avati <avati@redhat.com>
* storage/posix: UNWIND right op_error and op_errno in *setxattr()Vijay Bellur2014-01-141-2/+6
| | | | | | | | | | | | | | | | | | | | | | 1. errno was being set after gf_log() in posix_{f}handle_pair, this would cause errno to be overwritten. 2. dht would expect -1 for indication of failure in setxattr callback (dht_err_cbk()). posix_{f}setxattr has been changed to set op_ret as -1 instead of -op_errno. 3. dict_foreach() has been changed to return an error if the invoked fn() returns < 0. Bug report and test case credits to Zorro Lang <zlang@redhat.com> Change-Id: I96c15f12a5d7717b7584ba392f390a0b4f704a98 BUG: 1051896 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/6684 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* pathinfo: Provide user namespace access.Vijaykumar M2013-12-161-4/+2
| | | | | | | | | | | | | | | | | Locality can be now queried by unprivileged users with key "glusterfs.pathinfo". Setting both "glusterfs.pathinfo" and "trusted.glusterfs.pathinfo" on disk is prevented with this patch. Original Author: Vijay Bellur <vbellur@redhat.com> Change-Id: I4f7a0db8ad59165c4aeda04b23173255157a8b79 Signed-off-by: Vijaykumar M <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/5101 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: if brick-uid or brick-gid is not specified, do not setAnand Avati2013-12-121-12/+29
| | | | | | | | | | | | | | Current code would set owner uid/gid explicitly to 0/0 on start even if none was specified. Fix it. Change-Id: I72dec9e79c51bd1eb3af5334c42b7c23b01d0258 BUG: 1040275 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6476 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: Lukáš Bezdička <lukas.bezdicka@gooddata.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>