summaryrefslogtreecommitdiffstats
path: root/xlators/storage/posix/src/posix.c
Commit message (Collapse)AuthorAgeFilesLines
* storage/posix: Treat ENODATA/ENOATTR as success in bulk removexattrPranith Kumar K2014-11-051-0/+14
| | | | | | | | | | | | | | | | | Bulk remove xattr is internal fop in gluster. Some of the xattrs may have special behavior. Ex: removexattr("posix.system_acl_access"), removes more than one xattr on the file that could be present in the bulk-removal request. Removexattr of these deleted xattrs will fail with either ENODATA/ENOATTR. Since all this fop cares is removal of the xattrs in bulk-remove request and if they are already deleted, it can be treated as success. Change-Id: Id8f2a39b68ab763ec8b04cb71b47977647f22da4 BUG: 1160509 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9049 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Avoid spurious EINVAL in posix_readdir()Emmanuel Dreyfus2014-10-291-3/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | On non Linux systems, we check that seekdir() succeeds and we return EINVAL if it does not. We need this to avoid infinite loops if some other component in GlusterFS makes an invalid seekdir() usage. This was introduced in this change: http://review.gluster.org/#/c/8760/ But seekdir() also fails when using the offset returned for the last entry, and this is expected behavior. As a result, the seekdir() test produces a spurious EINVAL when reaching end of directory. That error is not propagated to calling process, but it may harm internal GlusterFS processing. At least it produce a spurious error message in brick's log. We fix the problem by remembering the last entry offset in fd private data. When a new posix_readdir() invocation requests that offset, we avoid returning EINVAL. BUG: 1129939 Change-Id: I4e67a2ea46538aae63eea663dd4aa33b16ad24c7 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8926 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* POSIX filesystem compliance: PATH_MAXEmmanuel Dreyfus2014-10-031-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POSIX mandates the filesystem to support paths of lengths up to _XOPEN_PATH_MAX (1024). This is the PATH_MAX limit here: http://pubs.opengroup.org/onlinepubs/009604499/basedefs/limits.h.html When using a path of 1023 bytes, the posix xlator attempts to create an absolute path by prefixing the 1023 bytes path by the brick base path. The result is an absolute path of more than _XOPEN_PATH_MAX bytes which may be rejected by the backend filesystem. Linux's ext3fs PATH_MAX seems to defaut to 4096, which means it will work (except if brick base path is longer than 2072 bytes but it is unlikely to happen. NetBSD's FFS PATH_MAX defaults to 1024, which means the bug can happen regardless of brick base path length. If this condition is detected for a brick, the proposed fix is to chdir() the brick glusterfsd daemon to its brick base directory. Then when encountering a path that will exceed _XOPEN_PATH_MAX once prefixed by the brick base path, a relative path is used instead of an absolute one. We do not always use relative path because some operations require an absolute path on the brick base path itself (e.g.: statvfs). At least on NetBSD, this chdir() uncovers a race condition which causes file lookup to fail with ENODATA for a few seconds. The volume quickly reaches a sane state, but regression tests are fast enough to choke on it. The reason is obscure (as often with race conditions), but sleeping one second after the chdir() seems to change scheduling enough that the problem disapear. Note that since the chdir() is done if brick backend filesystem does not support path long enough, it will not occur with Linux ext3fs (except if brick base path is over 2072 bytes long). BUG: 1129939 Change-Id: I7db3567948bc8fa8d99ca5f5ba6647fe425186a9 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8596 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/quota: Heal pgfid xattr on existing data when the quota isvmallika2014-09-301-0/+21
| | | | | | | | | | | | | | | | | | | | | enable The pgfid extended attributes are used to construct the ancestry path (from the file to the volume root) for nameless lookups on files. As NFS relies on nameless lookups heavily, quota enforcement through NFS would be inconsistent if quota were to be enabled on a volume with existing data. Solution is to heal the pgfid extended attributes as a part of lookup perfomed by quota-crawl process. In a posix lookup check for pgfid xattr and if it is missing set the xattr. Change-Id: I5912ea96787625c496bde56d43ac9162596032e9 BUG: 1147378 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/8878 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Fix invalid seekdir() usageEmmanuel Dreyfus2014-09-301-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to POSIX, seekdir() should only be given offset obtained from telldir() on the same DIR * http://pubs.opengroup.org/onlinepubs/9699919799/functions/seekdir.html Code from afr-self-heald.c and index.c is operating outside of the specification, by doing using seekdir() with offset from a previously open/close/re-open directory. This seems to work on Linux (although with no guarantee it will always in the future). On NetBSD the seekdir() with a in invalid offset is a nilpotent operation, and causes an infinite loop, since index_fill_readdir() always restart from the beginning of the directory. The situation is fixed by using a non anonymous fd in afr-self-heald.c: we explicitely open the directory so that it remains open on the brick side during the timeframe where we want to reuse offsets in seekdir(). This requires adding an opendir fop in index xlator. If the brick was not updated, the opendir will fail and we fallback to the standard violating approach for backward compatibility on Linux. On other systems we fail since it never worked. While there, add tests to check seekdir() success in index and posix xlators, so that incorrect usage from calling code produce an explicit error instead of an infinite loop. We can only do it on non Linux systems, for the sake of backward compatibility when the brick was updated but not the client. BUG: 1129939 Change-Id: I88ca90acfcfee280988124bd6addc1a1893ca7ab Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8760 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterfs: allow setxattr of keys with null values.Ravishankar N2014-09-291-2/+2
| | | | | | | | | | | | | | | Disk based file systems allow to get/set extended attribute key-value pairs where value can be null. Fuse/libgfapi clients must be able to do the same on a gluster volume. Change-Id: Ifc11134cc07f1a3ede43f9d027554dcd10b5c930 BUG: 1135514 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/8567 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Do not forbid fallocate on non Linux systemsEmmanuel Dreyfus2014-09-261-7/+2
| | | | | | | | | | | | | | | | Linux fallocate() differs from posix_fallocate() by an extra flag that can have the FALLOC_FL_KEEP_SIZE value; Do not test FALLOC_FL_KEEP_SIZE existence to enable fallocate() in posix xlator, as sys_fallocate() in libglusterfs provides support for both implementations. BUG: 1129939 Change-Id: Idf41a0396028a15e81281791bf6912d7fd674e3f Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8856 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Log when mkdir is on an existing gfid but non-existentRaghavendra G2014-09-181-1/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | path. consider following steps on a distribute volume 1. rename (src, dst) on hashed subvolume 2. snapshot taken 3. restore snapshots and do stat on src and dst Now, we end up with two directories src and dst having same gfid, because of distribute creating directories on non-existent subvolumes as part of directory healing. This can happen even with race between rename and directory healing in dht-lookup. This can lead to undefined behaviour while accessing any of both directories. Hence, we are logging paths of both directories, so that a sysadmin can take some corrective action when (s)he sees this log. One of the corrective action can be to copy contents of both directories from backend into a new directory and delete both directories. Since effort involved to fix this issue is non-trivial, giving this workaround till we come up with a fix. Change-Id: I38f4520e6787ee33180a9cd1bf2f36f46daea1ea BUG: 1105082 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8008 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* Always check for ENODATA with ENOATTREmmanuel Dreyfus2014-09-081-8/+12
| | | | | | | | | | | | | | | | | | Linux defines ENODATA and ENOATTR with the same value, which means that code can miss on on the two without breaking. FreeBSD does not have ENODATA and GlusterFS defines it as ENOATTR just like Linux does. On NetBSD, ENODATA != ENOATTR, hence we need to check for both values to get portable behavior. BUG: 764655 Change-Id: I003a3af055fdad285d235f2a0c192c9cce56fab8 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/8447 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/dht: Added code to capture races in dht-lookup pathVenkatesh Somyajulu2014-09-031-0/+6
| | | | | | | | | Change-Id: I9270d2d40ebd4b113ff961583dfda7754741f15b BUG: 1116150 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8430 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix : Missing space in log messageNithya Balachandran2014-09-031-1/+1
| | | | | | | | | | | Added a space in a log message Change-Id: Iabd50e6b5c9ff4673f59d6b52b785894b3dcdaf9 BUG: 1116150 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/8585 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* storage/posix: Don't unlink .glusterfs-hardlink before linkto checkVenkatesh Somyajulu2014-08-281-8/+37
| | | | | | | | | | BUG: 1116150 Change-Id: I90a10ac54123fbd8c7383ddcbd04e8879ae51232 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8559 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: fix issue in posix_fsyncZhang Huan2014-08-011-1/+2
| | | | | | | | | | | | | | Fix the issue that posix_fsync does not correctly return and save error code in op_errno when call to sys_fdatasync fails. Change-Id: Id0b62cfa009dbb52c8a0992abd5c46330fa0a8c0 BUG: 1125814 Signed-off-by: Zhang Huan <zhhuan@gmail.com> Reviewed-on: http://review.gluster.org/8398 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Modified logic of linkto file deletion on non-hashedVenkatesh Somyajulu2014-07-311-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently whenever dht_lookup_everywhere gets called, if in dht_lookup_everywhere_cbk, a linkto file is found on non-hashed subvolume, file is unlinked. But there are cases when this file is under migration. Under such condition, we should avoid deletion of file. When some other rebalance process changes the layout of parent such that dst_file (w.r.t. migration) falls on non-hashed node, then may be lookup could have found it as linkto file but just before unlink, file is under migration or already migrated In such cased unlink can be avoided. Race: ------- If we have two bricks (brick-1 and brick-2) with initial file "a" under BaseDir which is hashed as well as cached on (brick-1). Assume "a" hashing gives 44. Brick-1 Brick-2 Initial Setup: BaseDir/a BaseDir [1-50] [51-100] Now add new-brick Brick-3. 1. Rebalance-1 on node Node-1 (Brick-1 node) will reset the BaseDir Layout. 2. After that it will perform a) Create linkto file on new-hashed (brick-2) b) Perform file migration. 1.Rebalance-1 Fixes the base-layout: Brick-1 Brick-2 Brick-3 --------- ---------- ------------ BaseDir/a BaseDir BaseDir [1-33] [34-66] [67-100] 2. Only a) is BaseDir/a BaseDir/a(linkto) BaseDir performed Create linktofile Now rebalance 2 on node-2 jumped in and it will perform step 1 and 2-a. After (rebal-2, step-1), it changes the layout of the BaseDir. BaseDir/a BaseDir/a(link) BaseDir [67-100] [1-33] [34-66] For (rebale-2, step-2), It will perform lookup at Brick-3 as w.r.t new layout 44 falls for brick-3. But lookup will fail. So dht_lookup_everywhere gets called. NOTE: On brick-2 by rebalance-1, a linkto file was created. Currently that linkto files gets deleted by rebalance-2 lookup as it is considered as stale linkto file. But with patch if rebalance is already in progress or rebalance is over, linkto file will not be unlinked. If rebalance is in progress fd will be open and if rebalance is over then linkto file wont be set. Change-Id: I3fee0d28de3c76197325536a9e30099d2413f079 BUG: 1116150 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8345 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: removing deleting entries in case of creation failuresRaghavendra G2014-07-301-40/+68
| | | | | | | | | | | | | The code is not atomic enough to not to delete a dentry created by a prallel dentry creation operation. Change-Id: I9bd6d2aa9e7a1c0688c0a937b02a4b4f56d7aa3d BUG: 1117851 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/8327 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Fix races to avoid deletion of linkto fileVenkatesh Somyajulu2014-07-181-11/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Explanation of Race between rebalance processes: https://bugzilla.redhat.com/show_bug.cgi?id=1110694#c4 STATE 1: BRICK-1 only one brick Cached File in the system STATE 2: Add brick-2 BRICK-1 BRICK-2 STATE 3: Lookup of File on brick-2 by this node's rebalance will fail because hashed file is not created yet. So dht_lookup_everywhere is about to get called. STATE 4: As part of lookup link file at brick-2 will be created. STATE 5: getxattr to check that cached file belongs to this node is done STATE 6: dht_lookup_everywhere_cbk detects the link created by rebalance-1. It will unlink it. STATE 7: getxattr at the link file with "pathinfo" key will be called will fail as the link file is deleted by rebalance on node-2 Fix: So in the STATE 6, we should avoid the deletion of link file. Every time dht_lookup_everywhere gets called, lookup will be performed on all the nodes. So to avoid STATE 6, if linkto file is found, it is not deleted until valid case is found in dht_lookup_everywhere_done. Case 1: if linkto file points to cached node, and cached file exists, uwind with success. Case 2: if linkto does not point to current cached node, and cached file exists: a) Unlink stale link file b) Create new link file Case 3: Only linkto file exists: Delete linkto file Case 4: Only cached file Create link file (Handled event without patch) Case 5: Neither cached nor hashed file is present Return with ENOENT (handled even without patch) Change-Id: Ibf53671410d8d613b8e2e7e5d0ec30fc7dcc0298 BUG: 1116150 Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com> Reviewed-on: http://review.gluster.org/8231 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* porting: Provide fallocate and fremovexattr for OSXHarshavardhana2014-07-021-1/+1
| | | | | | | | | Change-Id: I563216f83edaff6d01a251ef0c1746a14aec700c BUG: 1089172 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/8217 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* porting: Port for FreeBSD rebased from Mike Ma's effortsHarshavardhana2014-07-021-2/+2
| | | | | | | | | | | | | | | | | | | - Provides a working Gluster Management Daemon, CLI - Provides a working GlusterFS server, GlusterNFS server - Provides a working GlusterFS client - execinfo port from FreeBSD is moved into ./contrib/libexecinfo for ease of portability on NetBSD. (FreeBSD 10 and OSX provide execinfo natively) - More portability cleanups for Darwin, FreeBSD and NetBSD - Provides a new rc script for FreeBSD Change-Id: I8dff336f97479ca5a7f9b8c6b730051c0f8ac46f BUG: 1111774 Original-Author: Mike Ma <mikemandarine@gmail.com> Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/8141 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* Posix/readdirp : Avoid filling wrong stat info for dentry in readdirpSusant Palai2014-05-281-2/+5
| | | | | | | | | | | | | | | | | Problem: Upon no entry found for a dentry, posix_readdirp_fill used to fill the stat for the current entry with the previous one. Solution: Continue with other entries if lstat failed for current one Change-Id: Ic96b5900451ed6c8de59acf9fee2e116649d3cdb BUG: 1096578 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/7733 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Don't allow open on symlinkPranith Kumar K2014-05-221-2/+8
| | | | | | | | | | | | Change-Id: Ie38ecad621d5cb351c607c9676814c573834cb0b BUG: 1099858 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7823 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Remove dead code in xattropPranith Kumar K2014-05-011-32/+0
| | | | | | | | | Change-Id: I48ba81ad342e3c8fe1d81f8e57e61676dd356fb0 BUG: 1092749 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7318 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: add list xattr capability to lookupPranith Kumar K2014-04-291-6/+12
| | | | | | | | | BUG: 1078061 Change-Id: Ie26d28b8a74aa0d1eceff14a84c3cd3e302dcdb5 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7293 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Update references to the maillinglist to gluster-devel@gluster.orgNiels de Vos2014-04-271-1/+1
| | | | | | | | | | | | | | gluster-devel@nongnu.org has moved to gluster-devel@gluster.org. All occurrences in the current (non legacy) documentation and code have been adjusted. Change-Id: I053162e633f7ea14fd3eed239ded017df165147c BUG: 1091705 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/7573 Reviewed-by: Justin Clift <justin@gluster.org> Reviewed-by: Anand Avati <avati@redhat.com> Tested-by: Anand Avati <avati@redhat.com>
* build: MacOSX Porting fixesHarshavardhana2014-04-241-25/+157
| | | | | | | | | | | | | | | | | | | | | git@forge.gluster.org:~schafdog/glusterfs-core/osx-glusterfs Working functionality on MacOSX - GlusterD (management daemon) - GlusterCLI (management cli) - GlusterFS FUSE (using OSXFUSE) - GlusterNFS (without NLM - issues with rpc.statd) Change-Id: I20193d3f8904388e47344e523b3787dbeab044ac BUG: 1089172 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Signed-off-by: Dennis Schafroth <dennis@schafroth.com> Tested-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Dennis Schafroth <dennis@schafroth.com> Reviewed-on: http://review.gluster.org/7503 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Reduce logging caused by non-existing extended attributesNiels de Vos2014-03-011-2/+4
| | | | | | | | | | | | | | | | | This changes the following log messages from INFO (default value) to DEBUG. We do not really care if someone tries to read extended attributes that do not exist. [2013-12-09 12:19:05.924497] E [posix.c:3539:posix_fgetxattr] 0-dis-rep-posix: fgetxattr failed on key system.posix_acl_access (No data available) [2013-12-09 12:19:05.924545] I [server-rpc-fops.c:863:server_fgetxattr_cbk] 0-dis-rep-server: 13074: FGETXATTR 1 (b8381953-ffa5-40fa-90dd-ae122335cc4b) (system.posix_acl_access) ==> (No data available) Change-Id: Idbbeb026f81e67025a2b36d7bfeb125ad2a1f61b BUG: 1027174 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/7171 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Anand Avati <avati@redhat.com>
* add build-gfid option to enable pgfid tracking ...Krishnan Parthasarathi2014-02-141-1/+1
| | | | | | | | | | | .. for inode to pathname mapping Change-Id: I0486d85b02e86d739fc1d8ea16d118fb666abf60 BUG: 1064863 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/6989 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: calculate checksum only over read dataAnand Avati2014-02-131-2/+2
| | | | | | | | | | | | | | If the last block of a file is not aligned to the requested size, checksum is calculated over junk data in the iobuf. Or it could be zeroes, resulting in a spurious checksum match in self-heal. Change-Id: I41422e08de90013dabfc348ec6fbb8ecdd4f8fb8 BUG: 1021686 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6932 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* storage/posix: perform chmod after chown.Ravishankar N2014-02-111-6/+6
| | | | | | | | | | | | | | | | | | | | | | | Problem: When a replica brick is added to a volume, set-user-ID and set-group-ID permission bits of files are not set correctly in the new brick. The issue is in the posix_setattr() call where we do a chmod followed by a chown. But according to the man pages for chown: When the owner or group of an executable file are changed by an unprivileged user the S_ISUID and S_ISGID mode bits are cleared. POSIX does not specify whether this also should happen when root does the chown(). Fix: Swap the chmod and chown calls in posix_setattr() Change-Id: I094e47a995c210d2fdbc23ae7a5718286e7a9cf8 BUG: 1058797 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/6862 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/marker-quota: more stringent error handling in rename.Raghavendra G2014-02-081-1/+4
| | | | | | | | | | | | | | | If an error occurs and op_errno is not set to non-zero value, we can end up in loosing a frame resulting in a hung syscall. This patch adds code setting op_errno appropriately in storage/posix and makes marker to set err to a default non-zero value in case of op_errno being zero. Change-Id: Idc2c3e843b932709a69b32ba67deb284547168f2 BUG: 833586 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/5032 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Problem : getfattr fails and mount point becomes inaccessible while Atin Mukherjee2014-02-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | fetching glusterfs.ancestry.path key value for a gfid from aux-gfid-mount based mount point. Analysis : The caller of posix_make_ancestryfromgfid() function i.e. posix_get_ancestry_non_directory() is sending an incorrect type value as an argument leading to a crash in posix_make_ancestral_node() (head dirent pointer is NULL and an attempt to dereference causes seg fault). For a non directory operation this piece of code should not have been executed but due to incorrect type value this happened. Solution : In posix_make_ancestryfromgfid() call type is passed as logical or of type and POSIX_ANCESTRY_PATH instead of logical or of POSIX_ANCESTRY_PATH and POSX_ANCESTRY_DENTRY. Change-Id: Iaf844bea91396c5e2ee295d5a082998fda66f0a9 BUG: 1060104 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/6892 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/quota: remove in-memory accounting of files in enforcerRaghavendra G2014-01-251-1/+0
| | | | | | | | | | | | | | | Accounting was done in enforcer (though marker is the ultimate source of truth) to offset cached directory size becoming stale. However, with enforcer being moved to brick we can no longer maintain correct cluster wide size for a directory. Hence removing accounting code from enforcer. Change-Id: I5ea94234da4da85ed5f5ced1354d8de3454b3fcb BUG: 969461 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/6434 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* consolidate code for #ifdef HAVE_LINKAT usageVijay Bellur2014-01-141-12/+3
| | | | | | | | | | | | | | | | | | | | | sys_link() now does ifdef HAVE_LINKAT linkat (...) else link (...) endif Use sys_link() in all places where we previously had the conditional behavior. Change-Id: I8bce5ac1175efd2ba7ab4bb5b372f6d1e0365d28 BUG: 764655 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/6633 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Anand Avati <avati@redhat.com>
* storage/posix: UNWIND right op_error and op_errno in *setxattr()Vijay Bellur2014-01-141-2/+6
| | | | | | | | | | | | | | | | | | | | | | 1. errno was being set after gf_log() in posix_{f}handle_pair, this would cause errno to be overwritten. 2. dht would expect -1 for indication of failure in setxattr callback (dht_err_cbk()). posix_{f}setxattr has been changed to set op_ret as -1 instead of -op_errno. 3. dict_foreach() has been changed to return an error if the invoked fn() returns < 0. Bug report and test case credits to Zorro Lang <zlang@redhat.com> Change-Id: I96c15f12a5d7717b7584ba392f390a0b4f704a98 BUG: 1051896 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/6684 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* pathinfo: Provide user namespace access.Vijaykumar M2013-12-161-4/+2
| | | | | | | | | | | | | | | | | Locality can be now queried by unprivileged users with key "glusterfs.pathinfo". Setting both "glusterfs.pathinfo" and "trusted.glusterfs.pathinfo" on disk is prevented with this patch. Original Author: Vijay Bellur <vbellur@redhat.com> Change-Id: I4f7a0db8ad59165c4aeda04b23173255157a8b79 Signed-off-by: Vijaykumar M <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/5101 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: if brick-uid or brick-gid is not specified, do not setAnand Avati2013-12-121-12/+29
| | | | | | | | | | | | | | Current code would set owner uid/gid explicitly to 0/0 on start even if none was specified. Fix it. Change-Id: I72dec9e79c51bd1eb3af5334c42b7c23b01d0258 BUG: 1040275 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6476 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: Lukáš Bezdička <lukas.bezdicka@gooddata.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: do not allow to set/get "trusted.glusterfs.volume-id" xattrVijaykumar M2013-11-261-0/+15
| | | | | | | | | Change-Id: I2e9a2264b1fd5ebc1ed0aff30225e89acbd0bcb4 BUG: 1034716 Signed-off-by: Vijaykumar M <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/6361 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: fix errno for non-existent GFIDAnand Avati2013-11-261-0/+6
| | | | | | | | | | | | | | | | | | | When clients refer to a GFID which does not exist, the errno to be returned in ESTALE (and not ENOENT). Even though ENOENT might look "proper" most of the time, as the application eventually expects ENOENT even if a parent directory does not exist, not returning ESTALE results in resolvers (FUSE and GFAPI) to not retry resolution in uncached mode. This can result in spurious ENOENTs during concurrent path modification operations. Change-Id: I7a06ea6d6a191739f2e9c6e333a1969615e05936 BUG: 1032894 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6318 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@gmail.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: placeholders for GFID to path conversionRaghavendra G2013-11-261-114/+645
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | what? ===== The following is an attempt to generate the paths of a file when only its gfid is known. To find the path of a directory, the symlink handle to the directory maintained in the ".glusterfs" backend directory is read. The symlink handle is generated using the gfid of the directory. It (handle) contains the directory's name and parent gfid, which are used to recursively construct the absolute path as seen by the user from the mount point. A similar approach cannot be used for a regular file or a symbolic link since its hardlink handle, generated using its gfid, doesn't contain its parent gfid and basename. So xattrs are set to store the parent gfids and the number of hardlinks to a file or a symlink having the same parent gfid. When an user/application requests for the paths of a regular file or a symlink with multiple hardlinks, using the parent gfids stored in the xattrs, the paths of the parent directories are generated as mentioned earlier. The base names of the hardlinks (with the same parent gfid) are determined by matching the actual backend inode numbers of each entry in the parent directory with that of the hardlink handle. Xattr is set on a regular file, link, and symbolic link as follows, Xattr name : trusted.pgfid.<pargfidstr> Xattr value : <number of hardlinks to a regular file/symlink with the same parentgfid> If a regular file, hard link, symbolic link is created then an xattr in the above format is set in the backend. how to use? =========== This functionality can be used through getxattr interface. Two keys - glusterfs.ancestry.dentry and glusterfs.ancestry.path - enable usage of this functionality. A successful getxattr will have the result stored under same keys. Values will be, glusterfs.ancestry.dentry: -------------------------- A linked list of gf-dirent structures for all possible paths from root to this gfid. If there are multiple paths, the linked-list will be a series of paths one after another. Each path will be a series of dentries representing all components of the path. This key is primarily for internal usage within glusterfs. glusterfs.ancestry.path: ------------------------ A string containing all possible paths from root to this gfid. Multiple hardlinks of a file or a symlink are displayed as a colon seperated list (this could interfere with path components containing ':'). e.g. If there is a file "file1" in root directory with two hardlinks, "/dir2/link2tofile1" and "/dir1/link1tofile1", then [root@alpha gfsmntpt]# getfattr -n glusterfs.ancestry.path -e text file1 glusterfs.ancestry.path="/file1:/dir2/link2tofile1:/dir1/link1tofile1" Thanks Amar, Avati and Venky for the inputs. Original Author: Ramana Raja <rraja@redhat.com> BUG: 990028 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I0eaa9101e333e0c1f66ccefd9e95944dd4a27497 Reviewed-on: http://review.gluster.org/5951 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Build storage/posix xlator if fallocate() does not existsEmmanuel Dreyfus2013-11-201-1/+12
| | | | | | | | | | | If fallocate() does not exists, just return EOPNOTSUPP BUG: 764655 Change-Id: I808114f733c88985519dc47fb7537e1ced1db077 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/6289 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Fixes for ZF reported by coverityM. Mohan Kumar2013-11-191-1/+5
| | | | | | | | | BUG: 1028673 Change-Id: I7c75738cca22c81c5629d579ef5bea24000e622e Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/6291 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* zerofill: Change the type of len argument of glfs_zerofill() to off_tBharata B Rao2013-11-141-8/+8
| | | | | | | | | | | | | | glfs_zerofill() can be potentially called to zero-out entire file and hence allow for bigger value of length parameter. Change-Id: I75f1d11af298915049a3f3a7cb3890a2d72fca63 BUG: 1028673 Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-on: http://review.gluster.org/6266 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: M. Mohan Kumar <mohan@in.ibm.com> Tested-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterfs: zerofill supportM. Mohan Kumar2013-11-101-16/+184
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in the specified range. This fop will be useful when a whole file needs to be initialized with zero (could be useful for zero filled VM disk image provisioning or during scrubbing of VM disk images). Client/application can issue this FOP for zeroing out. Gluster server will zero out required range of bytes ie server offloaded zeroing. In the absence of this fop, client/application has to repetitively issue write (zero) fop to the server, which is very inefficient method because of the overheads involved in RPC calls and acknowledgements. WRITESAME is a SCSI T10 command that takes a block of data as input and writes the same data to other blocks and this write is handled completely within the storage and hence is known as offload . Linux ,now has support for SCSI WRITESAME command which is exposed to the user in the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to implement this fop. Thus zeroing out operations can be completely offloaded to the storage device , making it highly efficient. The fop takes two arguments offset and size. It zeroes out 'size' number of bytes in an opened file starting from 'offset' position. This patch adds zerofill support to the following areas: - libglusterfs - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi Client applications can exloit this fop by using glfs_zerofill introduced in libgfapi.FUSE support to this fop has not been added as there is no system call for this fop. Changes from previous version 3: * Removed redundant memory failure log messages Changes from previous version 2: * Rebased and fixed build error Changes from previous version 1: * Rebased for latest master TODO : * Add zerofill support to trace xlator * Expose zerofill capability as part of gluster volume info Here is a performance comparison of server offloaded zeofill vs zeroing out using repeated writes. [root@llmvm02 remote]# time ./offloaded aakash-test log 20 real 3m34.155s user 0m0.018s sys 0m0.040s [root@llmvm02 remote]# time ./manually aakash-test log 20 real 4m23.043s user 0m2.197s sys 0m14.457s [root@llmvm02 remote]# time ./offloaded aakash-test log 25; real 4m28.363s user 0m0.021s sys 0m0.025s [root@llmvm02 remote]# time ./manually aakash-test log 25 real 5m34.278s user 0m2.957s sys 0m18.808s The argument log is a file which we want to set for logging purpose and the third argument is size in GB . As we can see there is a performance improvement of around 20% with this fop. Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18 BUG: 1028673 Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com> Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/5327 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: Fix excessive logging resulting from get_real_filename failure from sambaKrutika Dhananjay2013-11-071-2/+3
| | | | | | | | | | Change-Id: I641d028165da7b8501bd372c62d2df89a9d4db1f BUG: 1027174 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/6237 Reviewed-by: poornima g <pgurusid@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: Fix readv FOPM. Mohan Kumar2013-10-171-5/+1
| | | | | | | | | | | Suggested by Anand Avati in BD xlator code review. Change-Id: I31c353a26dfdeb3d0023c3f7e03ed25461d13c16 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> BUG: 837495 Reviewed-on: http://review.gluster.org/6077 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: Lower log severity for ENODATA errors in getxattr()Vijay Bellur2013-10-151-1/+2
| | | | | | | | | Change-Id: I101e329cce4c1305615c63ebcb42355f6c3e85e0 BUG: 918052 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/6084 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gNFS: Incorrect NFS ACL encoding for XFSSantosh Kumar Pradhan2013-09-291-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | Problem: Incorrect NFS ACL encoding causes "system.posix_acl_default" setxattr failure on bricks on XFS file system. XFS (potentially others?) doesn't understand when the 0x10 prefix is added to the ACL type field for default ACLs (which the Linux NFS client adds) which causes setfacl()->setxattr() to fail silently. NFS client adds NFS_ACL_DEFAULT(0x1000) for default ACL. FIX: Mask the prefix (added by NFS client) OFF, so the setfacl is not rejected when it hits the FS. Original patch by: "Richard Wareing" Change-Id: I17ad27d84f030cdea8396eb667ee031f0d41b396 BUG: 1009210 Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com> Reviewed-on: http://review.gluster.org/5980 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: block unused signals in created threadsAnand Avati2013-09-251-1/+1
| | | | | | | | | | | | | | | Block all signal except those which are set for explicit handling in glusterfs_signals_setup(). Since thread spawning code in libglusterfs and xlators can get called from application threads when used through libgfapi, it is necessary to do this blocking. Change-Id: Ia320f80521a83d2edcda50b9ad414583a0175281 BUG: 1011662 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/5995 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* core: remove GLUSTERFS_CREATE_MODE_KEY usageAmar Tumballi2013-08-231-14/+0
| | | | | | | | | Change-Id: I23b8cb7223b91a55af1cd4214f61bbe0e87351f6 BUG: 952029 Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/5683 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: changes to support gfid-accessAmar Tumballi2013-08-211-0/+14
| | | | | | | | | Change-Id: I38d2fdc47e4b805deafca6805e54807976ffdb7e Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 952029 Reviewed-on: http://review.gluster.org/5496 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Revert "fuse: auxiliary gfid mount support"Amar Tumballi2013-08-211-14/+0
| | | | | | | | | | | | | | | | | This reverts commit 4c0f4c8a89039b1fa1c9c015fb6f273268164c20. Conflicts: xlators/mount/fuse/src/fuse-bridge.c For build issues added CREATE_MODE_KEY definition in: libglusterfs/src/glusterfs.h Change-Id: I8093c2a0b5349b01e1ee6206025edbdbee43055e BUG: 952029 Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/5495 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>