summaryrefslogtreecommitdiffstats
path: root/xlators/storage/posix/src/posix-helpers.c
Commit message (Collapse)AuthorAgeFilesLines
...
* storage/posix: Do not fail entry creation fops if gfid handle already existsKrutika Dhananjay2018-10-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: tests/bugs/shard/bug-1251824.t fails occasionally with EIO due to gfid mismatch across replicas on the same shard when dd is executed. CAUSE: Turns out this is due to a race between posix_mknod() and posix_lookup(). posix mknod does 3 operations, among other things: 1. creation of the entry itself under its parent directory 2. setting the gfid xattr on the file, and 3. creating the gfid link under .glusterfs. Consider a case where the thread doing posix_mknod() (initiated by shard) has executed steps 1 and 2 and is on its way to executing 3. And a parallel LOOKUP from another thread on noting that loc->inode->gfid is NULL, tries to perform gfid_heal where it attempts to create the gfid link under .glusterfs and succeeds. As a result, posix_gfid_set() through MKNOD (step 3) fails with EEXIST. In the older code, MKNOD under such conditions was NOT being treated as a failure. But commit e37ee6d changes this behavior by failing MKNOD, causing the entry creation to be undone in posix_mknod() (it's another matter that the stale gfid handle gets left behind if lookup has gone ahead and gfid-healed it). All of this happens on only one replica while on the other MKNOD succeeds. Now if a parallel write causes shard translator to send another MKNOD of the same shard (shortly after AFR releases entrylk from the first MKNOD), the file is created on the other replica too, although with a new gfid (since "gfid-req" that is passed now is a new UUID. This leads to a gfid-mismatch across the replicas. FIX: The solution is to not fail MKNOD (or any other entry fop for that matter that does posix_gfid_set()) if the .glusterfs link creation fails with EEXIST. Change-Id: I84a5e54d214b6c47ed85671a880bb1c767a29f4d fixes: bz#1638453 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* all: fix warnings on non 64-bits architecturesXavi Hernandez2018-10-101-1/+1
| | | | | | | | | | When compiling in other architectures there appear many warnings. Some of them are actual problems that prevent gluster to work correctly on those architectures. Change-Id: Icdc7107a2bc2da662903c51910beddb84bdf03c0 fixes: bz#1632717 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* Land part 2 of clang-format changesGluster Ant2018-09-121-2618/+2562
| | | | | Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
* posix: remove not supported get/set contentAmar Tumballi2018-09-041-168/+1
| | | | | | | | | | | | | | getting and setting a file's content using extended attribute worked great as a GET/PUT alternative when an object storage is supported on top of Gluster. But it needs application changes, and also, it skips some caching layers. It is not used over years, and not supported any more. Remove the dead code. Fixes: bz#1625102 Change-Id: Ide3b3f1f644f6ca58558bbe45561f346f96b95b7 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* Multiple files: calloc -> mallocYaniv Kaul2018-09-041-11/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xlators/storage/posix/src/posix-inode-fd-ops.c: xlators/storage/posix/src/posix-helpers.c: xlators/storage/bd/src/bd.c: xlators/protocol/client/src/client-lk.c: xlators/performance/quick-read/src/quick-read.c: xlators/performance/io-cache/src/page.c xlators/nfs/server/src/nfs3-helpers.c xlators/nfs/server/src/nfs-fops.c xlators/nfs/server/src/mount3udp_svc.c xlators/nfs/server/src/mount3.c xlators/mount/fuse/src/fuse-helpers.c xlators/mount/fuse/src/fuse-bridge.c xlators/mgmt/glusterd/src/glusterd-utils.c xlators/mgmt/glusterd/src/glusterd-syncop.h xlators/mgmt/glusterd/src/glusterd-snapshot.c xlators/mgmt/glusterd/src/glusterd-rpc-ops.c xlators/mgmt/glusterd/src/glusterd-replace-brick.c xlators/mgmt/glusterd/src/glusterd-op-sm.c xlators/mgmt/glusterd/src/glusterd-mgmt.c xlators/meta/src/subvolumes-dir.c xlators/meta/src/graph-dir.c xlators/features/trash/src/trash.c xlators/features/shard/src/shard.h xlators/features/shard/src/shard.c xlators/features/marker/src/marker-quota.c xlators/features/locks/src/common.c xlators/features/leases/src/leases-internal.c xlators/features/gfid-access/src/gfid-access.c xlators/features/cloudsync/src/cloudsync-plugins/src/cloudsyncs3/src/libcloudsyncs3.c xlators/features/bit-rot/src/bitd/bit-rot.c xlators/features/bit-rot/src/bitd/bit-rot-scrub.c bxlators/encryption/crypt/src/metadata.c xlators/encryption/crypt/src/crypt.c xlators/performance/md-cache/src/md-cache.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible It doesn't make sense to calloc (allocate and clear) memory when the code right away fills that memory with data. It may be optimized by the compiler, or have a microscopic performance improvement. In some cases, also changed allocation size to be sizeof some struct or type instead of a pointer - easier to read. In some cases, removed redundant strlen() calls by saving the result into a variable. 1. Only done for the straightforward cases. There's room for improvement. 2. Please review carefully, especially for string allocation, with the terminating NULL string. Only compile-tested! .. and allocate memory as much as needed. xlators/nfs/server/src/mount3.c : Don't blindly allocate PATH_MAX, but strlen() the string and allocate appropriately. Also, align error messges. updates: bz#1193929 Original-Author: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Change-Id: Ibda6f33dd180b7f7694f20a12af1e9576fe197f5
* multiple xlators (storage/posix): strncpy()->sprintf(), reduce strlen()'sYaniv Kaul2018-08-311-9/+14
| | | | | | | | | | | | | | | | | | | | | | | | xlators/storage/posix/src/posix-gfid-path.c xlators/storage/posix/src/posix-handle.c xlators/storage/posix/src/posix-helpers.c xlators/storage/posix/src/posix-inode-fd-ops.c xlators/storage/posix/src/posix.h strncpy may not be very efficient for short strings copied into a large buffer: If the length of src is less than n, strncpy() writes additional null bytes to dest to ensure that a total of n bytes are written. Instead, use snprintf(). Also: - save the result of strlen() and re-use it when possible. - move from strlen to SLEN (sizeof() ) for const strings. Compile-tested only! Change-Id: I056939f111a4ec6bc8ebd539ebcaf9eb67da6c95 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* clang-scan: fix multiple issuesAmar Tumballi2018-08-311-2/+2
| | | | | | | | | | | * Buffer overflow issue in glusterfsd * Null argument passed to function expecting non-null (event-epoll) * Make sure the op_ret value is set in macro (posix) Updates: bz#1622665 Change-Id: I32b378fc40a5e3ee800c0dfbc13335d44c9db9ac Signed-off-by: Amar Tumballi <amarts@redhat.com>
* posix: Delete the entry if gfid link creation failskarthik-us2018-08-201-4/+22
| | | | | | | | | | | | | | | Problem: If the gfid link file inside .glusterfs is not present for a file, the operations which are dependent on the gfid will fail, complaining the link file does not exists inside .glusterfs. Fix: If the link file creation fails, fail the entry creation operation and delete the original file. Change-Id: Id767511de2da46b1f45aea45cb68b98d965ac96d fixes: bz#1612037 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* posix: FORWARD_NULL coverity fixShwetha Acharya2018-08-171-1/+1
| | | | | | | | | | | | Problem: filler could be null. Solution: Modified the condition check to avoid NULL pointer dereferencing. BUG: 789278 Change-Id: I0c3e29ede3c226295a9860ddcb3b432832c381dd Signed-off-by: Shwetha Acharya <shwetha174@gmail.com>
* md-cache: Do not invalidate cache post set/remove xattrPoornima G2018-07-111-9/+40
| | | | | | | | | | | | | | | Since setxattr and removexattr fops cbk do not carry poststat, the stat cache was being invalidated in setxatr/remoxattr cbk. Hence the further lookup wouldn't be served from cache. To prevent this invalidation, md-cache is modified to get the poststat in set/removexattr_cbk in dict. Co-authored with Xavi Hernandez. Change-Id: I6b946be2d20b807e2578825743c25ba5927a60b4 fixes: bz#1586018 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Signed-off-by: Poornima G <pgurusid@redhat.com>
* Fix compile warningsXavi Hernandez2018-07-101-7/+23
| | | | | | | | | | | This patch fixes compile warnings that appear with newer compilers. The solution applied is only to remove the warnings, but it doesn't always solve the problem in the best way. It assumes that the problem will never happen, as the previous code assumed. Change-Id: I6e8470d6c2e2dbd3bd7d324b5fd2f92ffdc3d6ec updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* storage/posix: Fix excessive logging in WRITE fop pathKrutika Dhananjay2018-06-131-0/+3
| | | | | | | | | | | | I was running some write-intensive tests on my volume, and in a matter of 2 hrs, the 50GB space in my root partition was exhausted. On inspecting further, figured that excessive logging in bricks was the cause - specifically in posix write when posix_check_internal_writes() does dict_get() without a NULL-check on xdata. Change-Id: I89de57a3a90ca5c375e5b9477801a9e5ff018bbf fixes: bz#1590655 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* posix/ctime: Fix gfid heal on first lookupKotresh HR2018-05-241-10/+42
| | | | | | | | | | | | | | | With ctime feature enabled, the gfid is not healing on first lookup. The fresh file logic depends on ctime and it was fetching from backend instead of xattr with ctime feature enabled. Fixed the same. Also fixed a possible hang with inode lock Change-Id: I020875c0462b284d6fa0e68304a422fa3d6a3e73 fixes: bz#1580532 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* glusterd/ctime: Provide option to enable/disable ctime featureKotresh HR2018-05-061-3/+6
| | | | | | Updates: #208 Change-Id: If6f52b9b1b5b823ad64faeed662e96ceb848c54c Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix/ctime: posix hook to set ctime xattr in relevant fopsKotresh HR2018-05-061-2/+42
| | | | | | | | | | | This patch uses the ctime posix APIs to set consistent time across replica on disk. It also stores the time attributes in the inode context. Credits: Rafi KC <rkavunga@redhat.com> Updates: #208 Change-Id: I1a8d74d1e251f1d6d142f066fc99258025c0bcdd Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix/ctime: posix hooks to get consistent time xattrKotresh HR2018-05-061-6/+54
| | | | | | | | | | | | This patch uses the ctime posix APIs to get consistent time across replica. The time attributes are got from from inode context or from on disk if not found and merged with iatt to be returned. Credits: Rafi KC <rkavunga@redhat.com> Updates: #208 Change-Id: Id737038ce52468f1f5ebc8a42cbf9c6ffbd63850 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* gluster: Sometimes Brick process is crashed at the time of stopping brickMohit Agrawal2018-04-191-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | Problem: Sometimes brick process is getting crashed at the time of stop brick while brick mux is enabled. Solution: Brick process was getting crashed because of rpc connection was not cleaning properly while brick mux is enabled.In this patch after sending GF_EVENT_CLEANUP notification to xlator(server) waits for all rpc client connection destroy for specific xlator.Once rpc connections are destroyed in server_rpc_notify for all associated client for that brick then call xlator_mem_cleanup for for brick xlator as well as all child xlators.To avoid races at the time of cleanup introduce two new flags at each xlator cleanup_starting, call_cleanup. BUG: 1544090 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Note: Run all test-cases in separate build (https://review.gluster.org/#/c/19700/) with same patch after enable brick mux forcefully, all test cases are passed. Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf updates: bz#1544090
* posix: check file state before continuing with fopsSusant Palai2018-04-101-0/+482
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In context of Cloudsync: In scenarios where a data modification fop e.g. a write landed in POSIX thinking that the file is local, while the file was actually remote, can be dangerous. Ofcourse we don’t want to take inodelk for every read/write operation to check the archival status or coordinate with an upload or a download of a file. To avoid inodelk, we will check the status of the file in POSIX it self, before we resume the fop. This helps us avoiding any races mentioned above. Now e.g. if a write reached POSIX for a file which was actually remote, it can check the status of the file and will get to know that the file is remote. It can error out with this status “remote” and cloudsync xlator will retry the same operation, once it finished downloading the file. This patch includes the setxattr changes to do the post processing of upload i.e. truncate and setting the remote xattr "trusted.glusterfs.cs.remote" to indicate the file is REMOTE Each file will have no xattr if the file is LOCAL, one remote xattr if the file is REMOTE and a combination of REMOTE and DOWNLOADING xattr if the file is getting downloaded. There is healing logic of these xattrs to recover from crash inconsitencies. Fixes: #387 Change-Id: Ie93c2d41aa8d6a798a39bdbef9d1669f057e5fdb Signed-off-by: Susant Palai <spalai@redhat.com>
* Quota: heal directory on newly added bricks when quota limit is reachedSanoj Unnikrishnan2018-03-281-0/+4
| | | | | | | | | | | | | | | | | Problem: if a lookup is done on a newly added brick for a path on which limit has been reached, the lookup fails to heal the directory tree due to quota. Solution: Tag the lookup as an internal fop and ignore it in quota. Since marking internal fop does not usually give enough contextual information. Introducing new flags to pass the contextual info. Adding dict_check_flag and dict_set_flag to aid flag operations. A flag is a single bit in a bit array (currently limited to 256 bits). Change-Id: Ifb6a68bcaffedd425dd0f01f7db24edd5394c095 fixes: bz#1505355 BUG: 1505355 Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com>
* storage/posix: Add active-fd-count option in glusterPranith Kumar K2018-03-211-32/+20
| | | | | | | | | | | | | | | | | | | | Problem: when dd happens on sharded replicate volume all the writes on shards happen through anon-fd. When the writes don't come quick enough, old anon-fd closes and new fd gets created to serve the new writes. open-fd-count is decremented only after the fd is closed as part of fd_destroy(). So even when one fd is on the way to be closed a new fd will be created and during this short period it appears as though there are multiple fds opened on the file. AFR thinks another application opened the same file and switches off eager-lock leading to extra latency. Fix: Have a different option called active-fd whose life cycle starts at fd_bind() and ends just before fd_destroy() BUG: 1557932 Change-Id: I2e221f6030feeedf29fbb3bd6554673b8a5b9c94 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* glusterfsd: Memleak in glusterfsd process while brick mux is onMohit Agrawal2018-02-271-0/+1
| | | | | | | | | | | | | | | | | | Problem: At the time of stopping the volume while brick multiplex is enabled memory is not cleanup from all server side xlators. Solution: To cleanup memory for all server side xlators call fini in glusterfs_handle_terminate after send GF_EVENT_CLEANUP notification to top xlator. BUG: 1544090 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Note: Run all test-cases in separate build (https://review.gluster.org/19574) with same patch after enable brick mux forcefully, all test cases are passed. Change-Id: Ia10dc7f2605aa50f2b90b3fe4eb380ba9299e2fc
* Revert "glusterfsd: Memleak in glusterfsd process while brick mux is on"Mohit Agrawal2018-02-191-1/+0
| | | | | | | | | | | There are still remain some code paths where cleanup is required while brick mux is on.I will upload a new patch after resolve all code paths. This reverts commit b313d97faa766443a7f8128b6e19f3d2f1b267dd. BUG: 1544090 Change-Id: I26ef1d29061092bd9a409c8933d5488e968ed90e Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* glusterfsd: Memleak in glusterfsd process while brick mux is onMohit Agrawal2018-02-151-0/+1
| | | | | | | | | | | | | Problem: At the time of stopping the volume while brick multiplex is enabled memory is not cleanup from all server side xlators. Solution: To cleanup memory for all server side xlators call fini in glusterfs_handle_terminate after send GF_EVENT_CLEANUP notification to top xlator. BUG: 1544090 Change-Id: Ifa1525e25b697371276158705026b421b4f81140 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* cluster/dht: avoid overwriting client writes during migrationSusant Palai2018-02-021-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | For more details on this issue see https://github.com/gluster/glusterfs/issues/308 Solution: This is a restrictive solution where a file will not be migrated if a client writes to it during the migration. This does not check if the writes from the rebalance and the client actually do overlap. If dht_writev_cbk finds that the file is being migrated (PHASE1) it will set an xattr on the destination file indicating the file was updated by a non-rebalance client. Rebalance checks if any other client has written to the dst file and aborts the file migration if it finds the xattr. updates gluster/glusterfs#308 Change-Id: I73aec28bc9dbb8da57c7425ec88c6b6af0fbc9dd Signed-off-by: Susant Palai <spalai@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: N Balachandran <nbalacha@redhat.com>
* posix: In getxattr, honor the wildcard '*'Poornima G2018-01-191-23/+41
| | | | | | | | | | | | | | | | | Currently, the posix_xattr_fill performas a sys_getxattr on all the keys requested, there are requirements where the keys could contain a wildcard, in which case sys_getxattr would return ENODATA, eg: if the xattr requested is user.* all the xattrs with prefix user. should be returned, with their values. This patch, changes posix_xattr_fill, to honor wildcard in the keys requested. Updates #297 Change-Id: I3d52da2957ac386fca3c156e26ff4cdf0b2c79a9 Signed-off-by: Poornima G <pgurusid@redhat.com>
* dict: add more types for valuesAmar Tumballi2018-01-051-4/+4
| | | | | | | | | | Added 2 more types which are present in gluster codebase, mainly IATT and UUID. Updates #203 Change-Id: Ib6d6d6aefb88c3494fbf93dcbe08d9979484968f Signed-off-by: Amar Tumballi <amarts@redhat.com>
* posix: Introduce flags for validity of iatt membersRavishankar N2017-12-291-0/+4
| | | | | | | | | | | | | | | | | | v1 of the patch started off as adding new fields to iatt that can be filled up using statx but the discussions were more around introducing masks to check the validity of different fields from a RIO perspective. To that extent, I have dropped the statx call in this version and introduced a 64 bit mask for existing fields. The masks I have defined are similar with the statx() flags' masks. I have *not* changed iatt_to_stat() to use the macros IATT_TYPE_VALID, IATT_GFID_VALID etc before blindly copying from struct iatt to struct. Also fixed warnings in xlators because of atime/mtime/ctime seconds field change from uint32_t to int64_t. Change-Id: I4ac614f1e8d5c8246fc99d5bc2d2a23e7941512b Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* cluster/dht: Add migration checks to dht_(f)xattropN Balachandran2017-12-261-0/+32
| | | | | | | | | | | | The dht_(f)xattrop implementation did not implement migration phase1/phase2 checks which could cause issues with rebalance on sharded volumes. This does not solve the issue where fops may reach the target out of order. Change-Id: I2416fc35115e60659e35b4b717fd51f20746586c BUG: 1471031 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* posix: Add option to disable nftw() based deletes when purging the landfill ↵Shreyas Siravara2017-12-111-8/+15
| | | | | | | | | | | | | | | | directory Summary: - We may have found an issue where certain directories were being moved into .landfill and then being quickly purged via nftw(). - We would like to have an emergency option to disable these purges. > Reviewed-on: https://review.gluster.org/18253 > Reviewed-by: Shreyas Siravara <sshreyas@fb.com> Fixes #371 Change-Id: I90b54c535930c1ca2925a928728199b6b80eadd9 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* posix: Reorganize posix xlator to prepare for reuse with rioShyamsundarR2017-12-021-0/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Split out entry and inode/fd based FOPs into separate files from posix.c 2. Split out common routines (init, fini, reconf, and such) into its own file, from posix.c 3. Retain just the method assignments in posix.c (such that posix2 for RIO can assign its own methods in the future for entry operations and such) 4. Based on the split in (1) and (2) split out posix-handle.h into 2 files, such that macros that are needed for inode ops are in one and rest are in the other If the split is done as above, posix2 can compile with its own entry ops, and hence not compile, the entry ops as split in (1) above. The split described in (4) can again help posix2 to define its own macros to make entry and inode handles, thus not impact existing POSIX xlator code. Noted problems - There are path references in certain cases where quota is used (in the xattr FOPs), and thus will fail on reuse in posix2, this needs to be handled when we get there. - posix_init does set root GFID on the brick root, and this is incorrect for posix2, again will need handling later when posix2 evolves based on this code (other init checks seem fine on current inspection) Merge of experimental branch patches with the following gerrit change-IDs > Change-Id: I965ce6dffe70a62c697f790f3438559520e0af20 > Change-Id: I089a4d9cf470c2f9c121611e8ef18dea92b2be70 > Change-Id: I2cec103f6ba8f3084443f3066bcc70b2f5ecb49a Fixes gluster/glusterfs#327 Change-Id: I0ccfa78559a7c5a68f5e861e144cf856f5c9e19c Signed-off-by: ShyamsundarR <srangana@redhat.com>
* posix: Convert posix_fs_health_check asynchrnously to save timestampMohit Agrawal2017-12-011-16/+82
| | | | | | | | | | | Problem: Sometime posix_fs_health_check thread is blocked on write/read call while backend device deleted abruptly. Solution: To resolve it convert code to update timestamp asynchrnously. BUG: 1501132 Change-Id: Id68ea6a572bf68fbf437e1d9be5221b63d47ff9c Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* posix: Ignore disk space reserve check for internal FOPSMohit Agrawal2017-10-301-7/+5
| | | | | | | | | | | | | Problem: Currently disk space reserve check is applicable for internal FOP also it needs to be ignore for internal FOP. Solution: Update the DISK_SPACE_CHECK_AND_GOTO macro at posix component. Macro will call only while key "GLUSTERFS_INTERNAL_FOP_KEY" exists in xdata. BUG: 1506083 Change-Id: I2b0840bbf4fa14bc247855b024ca136773d68d16 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* geo-rep: Filter out volume-mark xattrKotresh HR2017-10-131-0/+1
| | | | | | | | | | | | | The volume-mark xattr, maintained at brick root of slave volume is specific to geo-replication and should be filtered out for all other clients. It should also be filtered out from list getxattr from all mounts including geo-rep mount as it might cause rsync to read and set. Change-Id: If9eb5a3af18051083c853e70d93b2819e8eea222 BUG: 1500433 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix: add null gfid checksRavishankar N2017-08-081-0/+6
| | | | | | | | | | | | | | | | | | ...in file/dir creation and lookup codepaths. The check is relaxed for fops coming from trash xlator at the moment until trash has client side logic to send the create fops with gfid-req. Also fixed the missing trash pid assignment in creates sent by trash xlator. Without this, truncated files won't be moved to .trashcan. Change-Id: Ieddd7f0634850e7c7010e4fbb4ad1eead35888c8 BUG: 1478297 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/17975 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
* storage/posix: Add virtual xattr to fetch path from gfidKotresh HR2017-07-281-0/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gfid2path infra stores the "pargfid/bname" as on xattr value for each non directory entry. Hardlinks would have a separate xattr. This xattr key is internal and is not exposed to applications. A virtual xattr is exposed for the applications to fetch the path from gfid. Internal xattr: trusted.gfid2path.<xxhash> Virtual xattr: glusterfs.gfidtopath getfattr -h -n glusterfs.gfidtopath /<aux-mnt>/.gfid/<gfid> If there are hardlinks, it returns all the paths separated by ':'. A volume set option is introduced to change the delimiter to required string of max length 7. gluster vol set gfid2path-separator ":::" Updates: #139 Change-Id: Ie3b0c3fd8bd5333c4a27410011e608333918c02a Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: https://review.gluster.org/17785 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* posix: Needs to reserve disk space to prevent the brick from getting fullMohit Agrawal2017-07-251-0/+115
| | | | | | | | | | | | | | | | | | | | Problem: Currently there is no option available at posix xlator to save the disk from getting full Solution: Introduce a new option storage.reserve at posix xlator to configure disk threshold.posix xlator spawn a thread to update the disk space status in posix private structure and same flag is checked by every posix fop before start operation.If flag value is 1 then it sets op_errno to ENOSPC and goto out from the fop. BUG: 1471366 Change-Id: I98287cd409860f4c754fc69a332e0521bfb1b67e Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: https://review.gluster.org/17780 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* storage/posix: Use the ret value of posix_gfid_heal()Krutika Dhananjay2017-07-251-9/+7
| | | | | | | | | | | | | | | | ... to make the change in commit acf8cfdf truly useful. Without this, a race between entry creation fops and lookup at posix layer can cause lookups to fail with ENODATA, as opposed to ENOENT. Change-Id: I44a226872283a25f1f4812f03f68921c5eb335bb BUG: 1472758 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: https://review.gluster.org/17821 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* posix/gfid2path: Block access to gfid2path xattr via mountKotresh HR2017-07-241-0/+10
| | | | | | | | | | | | | | | | gfid2path xattr is an internal xattr and should not be allowed to modify by other applications via gluster mount. This patch blocks the same. Updates: #139 Change-Id: Id2cb29797ee1bd77e0e0d2203a47469fd7203355 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: https://review.gluster.org/17744 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Aravinda VK <avishwan@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* libglusterfs: Name threads on creationRaghavendra Talur2017-07-191-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set names to threads on creation for easier debugging. Output of top -H -p <PID-OF-GLUSTERFSD> Before: 19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd 19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd 19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd 19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd After: 19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustertimer 19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd 19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustermemsweep 19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc0 19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc1 19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll0 19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteridxwrker 19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteriotwr0 19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrssign 19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrswrker 19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterclogecon 19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd0 19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd1 19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd2 19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixjan 19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixfsy 25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll1 5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll2 7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixhc Change-Id: Id5f333755c1ba168a2ffaa4fce6e71c375e10703 BUG: 1254002 Updates: #271 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: https://review.gluster.org/11926 Reviewed-by: Prashanth Pai <ppai@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* storage/posix: Don't allow gfid/volume-id xattr to be removedPranith Kumar K2017-07-181-18/+10
| | | | | | | | | | | | | | | | | | | | | | | Problem: Bulk xattr removal doesn't check if the xattrs that are coming in xdata have gfid/volume-id xattrs, so there is potential for bulkremovexattr removing gfid/volume-id. I also observed that bulkremovexattr is not available for fremovexattr. Fix: Do proper checks in bulk removexattr to remove gfid/volume-id. Refactor [f]removexattr to reduce the differences. BUG: 1470489 Change-Id: Ia845b31846a149500111c0996646e648f72cdce6 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: https://review.gluster.org/17765 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* cluster/ec: Get size of file in EC [f]xattropPranith Kumar K2017-07-131-0/+5
| | | | | | | | | | | | | | | | | | | | Problem: For allowing parallel writes we shouldn't depend on ia_size to be same for all the bricks in each write_cbk(). But we need to make sure backend size is correct on all the bricks and no crashes/manual modifications happened. Fix: At the time of get_size_version() we do 1 check to make sure size of the file is same across the bricks. From then on the FOPs will give the status of the fop, so we rely on this information to keep which bricks are good/bad. Updates #251 Change-Id: I1df645347e2e9f2e09cfa4411b6cc305d7f4e4e5 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: https://review.gluster.org/17741 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* storage/posix: Handle [f]xattrop xdataPranith Kumar K2017-07-131-16/+7
| | | | | | | | | | | | Updates #251 Change-Id: I13d89c3b5dc39aa0a232a70be8ec6b64394cfa6e Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: https://review.gluster.org/17740 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* posix: Avoid one extra call of l(get|list)xattr call after use buffer in ↵Mohit Agrawal2017-07-031-10/+21
| | | | | | | | | | | | | | | | | | | | | posix_getxattr Problem: In posix xlator posix_(f)getxattr is calling system call(sys_lgetxattr) two times to fetch the xattr value. Solution: After use the extra buffer for first time calling we can avoid second attempt of system call(sys_lgetxattr) calling in posix_getxattr for most of the xattrs. BUG: 1460659 Change-Id: I0d8da776c5bc86653d874a4629a73bbf65c621b8 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: https://review.gluster.org/17520 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kinglong Mee
* posix: Revert modifying op_errno in __posix_fd_ctx_getRavishankar N2017-06-191-10/+6
| | | | | | | | | | | | | | | | | https://review.gluster.org/#/c/17414/ converted ENOENT to EBADFD because ENOENT is not a valid error for fd based operations, but this apparently breaks dht rebalance behaviour (see comments in the backport 17517. So reverting that part of the change. Change-Id: Idcf5c65a47b096a3766cf7f20ca938d988572052 BUG: 1456582 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/17565 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterfsd: Deletion of brick dir throw emerg msgs after stop volumeMohit Agrawal2017-06-091-3/+3
| | | | | | | | | | | | | | | | | | | | Problem: Deletion of brick directories throw emerg messages after stop volume while brick mux is enabled. Solution: Modify the posix health check monitor thread code to handled correctly. BUG: 1459781 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Change-Id: I2d22a84f9a98b0da261e5fb7850ba1368f3601d7 Reviewed-on: https://review.gluster.org/17492 Tested-by: MOHIT AGRAWAL <moagrawa@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* glusterfs: Not able to mount running volume after enable brick mux and ↵Mohit Agrawal2017-05-311-15/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | stopped any volume Problem: After enabled brick mux if any volume has down and then try ot run mount with running volume , mount command is hung. Solution: After enable brick mux server has shared one data structure server_conf for all associated subvolumes.After down any subvolume in some ungraceful manner (remove brick directory) posix xlator sends GF_EVENT_CHILD_DOWN event to parent xlatros and server notify updates the child_up to false in server_conf.When client is trying to communicate with server through mount it checks conf->child_up and it is FALSE so it throws message "translator are not yet ready". From this patch updated structure server_conf to save child_up status for xlator wise. Another improtant correction from this patch is cleanup threads from server side xlators after stop the volume. BUG: 1453977 Change-Id: Ic54da3f01881b7c9429ce92cc569236eb1d43e0d Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: https://review.gluster.org/17356 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* posix: use the correct op_errnoRavishankar N2017-05-311-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | Problem: If readdir/fstat was performed on a directory that was removed, posix_fd_ctx_get() fails with ENOENT but we incorrectly use the ret value (-1 in this case) as op_errno, logging "Operation not permitted" messages in the brick logs. Also in case of fstat, the -1 op_errno was also propagated to the client via stack unwind, causing the message to appear in protocol/client logs as well. Fix: Use the right op_errno in readdir, fstat and writev. Also, if posix_fd_ctx_get() failed with ENOENT, convert it into EBADF because ENOENT is not a valid error for an fd operation. Change-Id: Ie43c0789d5040ec73b7cf885d015a183b8c64d70 BUG: 1456582 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/17414 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* brick mux: Detach brick on posix health check failureAtin Mukherjee2017-05-151-3/+22
| | | | | | | | | | | | | | With brick mux enabled, we'd need to detach a particular brick if the underlying backend has gone bad. This patch addresses the same. Change-Id: Icfd469c7407cd2d21d02e4906375ec770afeacc3 BUG: 1450630 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/17287 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* posix: Send SIGKILL in 2nd attemptAtin Mukherjee2017-05-091-2/+2
| | | | | | | | | | | | | | | Commit 21c7f7ba changed the signal from SIGKILL to SIGTERM for the 2nd attempt to terminate the brick process if SIGTERM fails. This patch fixes this problem. Change-Id: I856df607b7109a215f2a2a4827ba3ea42d8a9729 BUG: 1444596 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/17208 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* glusterd: socketfile & pidfile related fixes for brick multiplexing featureMohit Agrawal2017-05-091-9/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: While brick-muliplexing is on after restarting glusterd, CLI is not showing pid of all brick processes in all volumes. Solution: While brick-mux is on all local brick process communicated through one UNIX socket but as per current code (glusterd_brick_start) it is trying to communicate with separate UNIX socket for each volume which is populated based on brick-name and vol-name.Because of multiplexing design only one UNIX socket is opened so it is throwing poller error and not able to fetch correct status of brick process through cli process. To resolve the problem write a new function glusterd_set_socket_filepath_for_mux that will call by glusterd_brick_start to validate about the existence of socketpath. To avoid the continuous EPOLLERR erros in logs update socket_connect code. Test: To reproduce the issue followed below steps 1) Create two distributed volumes(dist1 and dist2) 2) Set cluster.brick-multiplex is on 3) kill glusterd 4) run command gluster v status After apply the patch it shows correct pid for all volumes BUG: 1444596 Change-Id: I5d10af69dea0d0ca19511f43870f34295a54a4d2 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: https://review.gluster.org/17101 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>