summaryrefslogtreecommitdiffstats
path: root/xlators/storage/posix/src/posix-metadata.c
Commit message (Collapse)AuthorAgeFilesLines
* multiple: fix bad type castXavi Hernandez2020-01-101-10/+25
| | | | | | | | | | | | When using inode_ctx_get() or inode_ctx_set(), a 'uint64_t *' is expected. In many cases, the value to retrieve or store is a pointer, which will be of smaller size in some architectures (for example 32-bits). In this case, directly passing the address of the pointer casted to an 'uint64_t *' is wrong and can cause memory corruption. Change-Id: Iae616da9dda528df6743fa2f65ae5cff5ad23258 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Fixes: bz#1785611
* posix-metadata.c: try to getxattr in one call.Yaniv Kaul2020-01-061-48/+61
| | | | | | | | | | | | | | | | | Another location where instead of 2 sys calls we strive to get the xattr in a single call, by guesstimating the required size And avoid (or try to) not to first read the xattr len, then another call to actually fetch. Instead, use a sane size (256 bytes - worth checking if it makes sense or by default use a larger size), and see if we can fetch it. If we fail, we'll read the size and re-fetch. Such changes are needed elsewhere too (see https://github.com/gluster/glusterfs/issues/720 ) Change-Id: I466cea9d8b12fc45f6b37d202b1294ca28cd1fdd updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* Avoid buffer overwrite due to uuid_utoa() misuseDmitry Antipov2019-12-271-2/+5
| | | | | | | | | | | | | | | Code like: f(..., uuid_utoa(x), uuid_utoa(y)); is not valid (causes undefined behaviour) because uuid_utoa() uses the only static thread-local buffer which will be overwritten by the subsequent call. All such cases should be converted to use uuid_utoa_r() with explicitly specified buffer. Change-Id: I5e72bab806d96a9dd1707c28ed69ca033b9c8d6c Updates: bz#1193929 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
* posix/ctime: Fix coverity issueKotresh HR2019-09-181-1/+1
| | | | | | | | | | | | | | | posix-metadata.c: 462 in posix_set_mdata_xattr() ... 460 GF_VALIDATE_OR_GOTO(this->name, time, out); 461 >>> CID 1405665: Control flow issues (DEADCODE) >>> Execution cannot reach the expression "flag->atime" inside this >>> statement: "if (update_utime && (flag->...". 462 if (update_utime && (flag->ctime && !time) && (flag->atime && !u_atime) && Change-Id: Id31d81d04ea2785a669eafe0dc1307303cb2271b updates: bz#789278 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* ctime/rebalance: Heal ctime xattr on directory during rebalanceKotresh HR2019-09-161-4/+61
| | | | | | | | | | | | | | | | After add-brick and rebalance, the ctime xattr is not present on rebalanced directories on new brick. This patch fixes the same. Note that ctime still doesn't support consistent time across distribute sub-volume. This patch also fixes the in-memory inconsistency of time attributes when metadata is self healed. Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df fixes: bz#1734026 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* ctime: Fix ctime issue with utime family of syscallsKotresh HR2019-08-201-46/+50
| | | | | | | | | When atime|mtime is updated via utime family of syscalls, ctime is not updated. This patch fixes the same. Change-Id: I7f86d8f8a1e06a332c3449b5bbdbf128c9690f25 fixes: bz#1738786 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* features/utime: always update ctime at setattrKinglong Mee2019-08-061-1/+1
| | | | | | | | | | | | | | | | | | | | For the nfs EXCLUSIVE mode create may sets a later time to mtime (at verifier), it should not set to ctime for storage.ctime does not allowed set ctime to a earlier time. /* Earlier, mdata was updated only if the existing time is less * than the time to be updated. This would fail the scenarios * where mtime can be set to any time using the syscall. Hence * just updating without comparison. But the ctime is not * allowed to changed to older date. */ According to kernel's setattr, always set ctime at setattr, and doesnot set ctime from mtime at storage.ctime. Change-Id: I5cfde6cb7f8939da9617506e3dc80bd840e0d749 fixes: bz#1737288 Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
* posix/ctime: Fix race during lookup ctime xattr healKotresh HR2019-08-011-18/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Ctime heals the ctime xattr ("trusted.glusterfs.mdata") in lookup if it's not present. In a multi client scenario, there is a race which results in updating the ctime xattr to older value. e.g. Let c1 and c2 be two clients and file1 be the file which doesn't have the ctime xattr. Let the ctime of file1 be t1. (from backend, ctime heals time attributes from backend when not present). Now following operations are done on mount c1 -> ls -l /mnt/file1 | c2 -> ls -l /mnt/file1;echo "append" >> /mnt/file1; The race is that the both c1 and c2 didn't fetch the ctime xattr in lookup, so both of them tries to heal ctime to time 't1'. If c2 wins the race and appends the file before c1 heals it, it sets the time to 't1' and updates it to 't2' (because of append). Now c1 proceeds to heal and sets it to 't1' which is incorrect. Solution: Compare the times during heal and only update the larger time. This is the general approach used in ctime feature but got missed with healing legacy files. fixes: bz#1734299 Change-Id: I930bda192c64c3d49d0aed431ce23d3bc57e51b7 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* ctime: Set mdata xattr on legacy filesKotresh HR2019-07-221-41/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | Problem: The files which were created before ctime enabled would not have "trusted.glusterfs.mdata"(stores time attributes) xattr. Upon fops which modifies either ctime or mtime, the xattr gets created with latest ctime, mtime and atime, which is incorrect. It should update only the corresponding time attribute and rest from backend Solution: Creating xattr with values from brick is not possible as each brick of replica set would have different times. So create the xattr upon successful lookup if the xattr is not created Note To Reviewers: The time attributes used to set xattr is got from successful lookup. Instead of sending the whole iatt over the wire via setxattr, a structure called mdata_iatt is sent. The mdata_iatt contains only time attributes. Change-Id: I5e535631ddef04195361ae0364336410a2895dd4 fixes: bz#1593542 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix/ctime: Fix ctime upgrade issueKotresh HR2019-06-211-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: On a EC volume, during upgrade from the older version where ctime feature is not enabled(or not present) to the newer version where the ctime feature is available (enabled default), the self heal hangs and doesn't complete. Cause: The ctime feature has both client side code (utime) and server side code (posix). The feature is driven from client. Only if the client side sets the time in the frame, should the server side sets the time attributes in xattr. But posix setattr/fseattr was not doing that. When one of the server nodes is updated, since ctime is enabled by default, it starts setting xattr on setattr/fseattr on the updated node/brick. On a EC volume the first two updated nodes(bricks) are not a problem because there are 4 other bricks with consistent data. However once the third brick is updated, the new attribute(mdata xattr) will cause an inconsistency on metadata on 3 bricks, which prevents the file to be repaired. Fix: Don't create mdata xattr with utimes/utimensat system call. Only update if already present. Change-Id: Ieacedecb8a738bb437283ef3e0f042fd49dc4c8c fixes: bz#1720201 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* ctime: Fix log repeated logging during openKotresh HR2019-04-241-10/+5
| | | | | | | | | | | | The log "posix set mdata failed, No ctime" logged repeatedly after the fix [1]. Those could be internal fops. This patch fixes the same. [1] https://review.gluster.org/22540 fixes: bz#1701457 Change-Id: I42799a90b976982cedb0ca11fa224d555eb05650 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix/ctime: Fix stat(time attributes) inconsistency during readdirpKotresh HR2019-04-151-16/+25
| | | | | | | | | | | | | | | | | | | | Problem: Creation of tar file on gluster volume throws warning 'file changed as we read it' Cause: During readdirp, for few of the files whose inode is not present, time attributes were served from backend. This caused the ctime of few files to be different between before readdir and after readdir by tar. Solution: If ctime feature is enabled and inode is not present, don't serve the time attributes from backend file, serve it from xattr. fixes: bz#1698078 Change-Id: I427ef865f97399475faf5aa6ca495f7e317603ae Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix: fix coverity issueIraj Jamali2019-01-211-6/+1
| | | | | | | | | | Logically dead code CID: 1398468 Updates: bz#789278 Change-Id: I8713a0c51777eb64e617d00ab72fd1db4994b6ab Signed-off-by: Iraj Jamali <ijamali@redhat.com>
* copy_file_range support in GlusterFSRaghavendra Bhat2018-12-121-0/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * libglusterfs changes to add new fop * Fuse changes: - Changes in fuse bridge xlator to receive and send responses * posix changes to perform the op on the backend filesystem * protocol and rpc changes for sending and receiving the fop * gfapi changes for performing the fop * tools: glfs-copy-file-range tool for testing copy_file_range fop - Although, copy_file_range support has been added to the upstream fuse kernel module, no release has been made yet of a kernel which contains the support. It is expected to come in the upcoming release of linux-4.20 So, as of now, executing copy_file_range fop on a fused based filesystem results in fuse kernel module sending read on the source fd and write on the destination fd. Therefore a small gfapi based tool has been written to be able test the copy_file_range fop. This tool is similar (in functionality) to the example program given in copy_file_range man page. So, running regular copy_file_range on a fuse mount point and running gfapi based glfs-copy-file-range tool gives some idea about how fast, the copy_file_range (or reflink) can be. On the local machine this was the result obtained. mount -t glusterfs workstation:new /mnt/glusterfs [root@workstation ~]# cd /mnt/glusterfs/ [root@workstation glusterfs]# ls file [root@workstation glusterfs]# cd [root@workstation ~]# time /tmp/a.out /mnt/glusterfs/file /mnt/glusterfs/new real 0m6.495s user 0m0.000s sys 0m1.439s [root@workstation ~]# time glfs-copy-file-range $(hostname) new /tmp/glfs.log /file /rrr OPEN_SRC: opening /file is success OPEN_DST: opening /rrr is success FSTAT_SRC: fstat on /rrr is success copy_file_range successful real 0m0.309s user 0m0.039s sys 0m0.017s This tool needs following arguments 1) hostname 2) volume name 3) log file path 4) source file path (relative to the gluster volume root) 5) destination file path (relative to the gluster volume root) "glfs-copy-file-range <hostname> <volume> <log file path> <source> <destination>" - Added a testcase as well to run glfs-copy-file-range tool * io-stats changes to capture the fop for profiling * NOTE: - Added conditional check to see whether the copy_file_range syscall is available or not. If not, then return ENOSYS. - Added conditional check for kernel minor version in fuse_kernel.h and fuse-bridge while referring to copy_file_range. And the kernel minor version is kept as it is. i.e. 24. Increment it in future when there is a kernel release which contains the support for copy_file_range fop in fuse kernel module. * The document which contains a writeup on this enhancement can be found at https://docs.google.com/document/d/1BSILbXr_knynNwxSyyu503JoTz5QFM_4suNIh2WwrSc/edit Change-Id: I280069c814dd21ce6ec3be00a884fc24ab692367 updates: #536 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* libglusterfs: Move devel headers under glusterfs directoryShyamsundarR2018-12-051-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | libglusterfs devel package headers are referenced in code using include semantics for a program, this while it works can be better especially when dealing with out of tree xlator builds or in general out of tree devel package usage. Towards this, the following changes are done, - moved all devel headers under a glusterfs directory - Included these headers using system header notation <> in all code outside of libglusterfs - Included these headers using own program notation "" within libglusterfs This change although big, is just moving around the headers and making it correct when including these headers from other sources. This helps us correctly include libglusterfs includes without namespace conflicts. Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b Updates: bz#1193929 Signed-off-by: ShyamsundarR <srangana@redhat.com>
* posix: fix memory leakAmar Tumballi2018-11-281-0/+7
| | | | | | | | | | | | | | | | | | | | | | | Direct leak of 609960 byte(s) in 4485 object(s) allocated from: #0 0x7f0d719bea50 in __interceptor_calloc (/lib64/libasan.so.5+0xefa50) #1 0x7f0d716dc08f in __gf_calloc ../../../libglusterfs/src/mem-pool.c:111 #2 0x7f0d5d41d9b2 in __posix_get_mdata_xattr ../../../../../xlators/storage/posix/src/posix-metadata.c:240 #3 0x7f0d5d41dd6b in posix_get_mdata_xattr ../../../../../xlators/storage/posix/src/posix-metadata.c:317 #4 0x7f0d5d39e855 in posix_fdstat ../../../../../xlators/storage/posix/src/posix-helpers.c:685 #5 0x7f0d5d3d65ec in posix_create ../../../../../xlators/storage/posix/src/posix-entry-ops.c:2173 Direct leak of 609960 byte(s) in 4485 object(s) allocated from: #0 0x7f0d719bea50 in __interceptor_calloc (/lib64/libasan.so.5+0xefa50) #1 0x7f0d716dc08f in __gf_calloc ../../../libglusterfs/src/mem-pool.c:111 #2 0x7f0d5d41ced2 in posix_set_mdata_xattr ../../../../../xlators/storage/posix/src/posix-metadata.c:359 #3 0x7f0d5d41e70f in posix_set_ctime ../../../../../xlators/storage/posix/src/posix-metadata.c:616 #4 0x7f0d5d3d662c in posix_create ../../../../../xlators/storage/posix/src/posix-entry-ops.c:2181 We were freeing only the first context in inode during forget, and not the second. updates: bz#1633930 Change-Id: Ib61b4453aa3d2039d6ce660f52ef45390539b9db Signed-off-by: Amar Tumballi <amarts@redhat.com>
* features/ctime: Fix Coverity issueSusant Palai2018-11-121-0/+1
| | | | | | | | CID : 1394632 Dereference after null check Change-Id: If0bef48b070935854e9d2988393dba07c9001cd2 updates: bz#789278 Signed-off-by: Susant Palai <spalai@redhat.com>
* posix/ctime: Avoid log flood in posix_update_utime_in_mdataKotresh HR2018-11-021-4/+0
| | | | | | | | | | | posix_update_utime_in_mdata() unconditionally logs an error if consistent time attributes features is not enabled. This log does not add any value, prints an incorrect errno & floods the log file. Hence nuking this log message in this patch. fixes: bz#1644129 Change-Id: I9a1f9e7ada3366d2830f18d81f16a1461040092e Signed-off-by: Kotresh HR <khiremat@redhat.com>
* ctime: Provide noatime optionKotresh HR2018-09-251-0/+8
| | | | | | | | | | | | | | | | | Most of the applications are {c|m}time dependant and very few are atime dependant. So provide noatime option to not update atime when ctime feature is enabled. Also this option has to be enabled with ctime feature to avoid unnecessary self heal. Since AFR/EC reads data from single subvolume, atime is only updated in one subvolume triggering self heal. updates: bz#1593538 Change-Id: I085fb33c882296545345f5df194cde7b6cbc337e Signed-off-by: Kotresh HR <khiremat@redhat.com>
* Land part 2 of clang-format changesGluster Ant2018-09-121-540/+523
| | | | | Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
* storage/posix: Avoid log flood in posix_set_parent_ctime()Vijay Bellur2018-07-231-4/+0
| | | | | | | | | | | | posix_set_parent_ctime() unconditionally logs an error if consistent time attributes is not enabled. This log does not add any value, prints an incorrect errno & floods the log file. Hence nuking this log message in this patch. Change-Id: I82a78f2f8ce5ab518f8cdf6d9086a97049712f75 fixes: bz#1607049 Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* All: run codespell on the code and fix issues.Yaniv Kaul2018-07-221-1/+1
| | | | | | | | | | | | Please review, it's not always just the comments that were fixed. I've had to revert of course all calls to creat() that were changed to create() ... Only compile-tested! Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5 updates: bz#1193929 Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* posix/ctime: Fix differential ctime duing entry operationsKotresh HR2018-06-201-51/+60
| | | | | | | | | | | | | | | | | We should not be relying on backend file's time attributes to load the initial ctime time attribute structure. This is incorrect as each replica set would have witnessed the file creation at different times. For new file creation, ctime, atime and mtime should be same, hence initiate the ctime structure with the time from the frame. But for the files which were created before ctime feature is enabled, this is not accurate but still fine as the times would get eventually accurate. fixes: bz#1592275 Change-Id: I206a469c83ee7b26da2fe096ae7bf8ff5986ad67 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix/ctime: Fix fops racing in updating mtime/atimeKotresh HR2018-06-031-11/+31
| | | | | | | | | | | | | | | | In distributed systems, there could be races with fops updating mtime/atime which could result in different mtime/atime for same file. So updating them only if time is greater than the existing makes sure, only the highest time is retained. If the mtime/atime update comes from the explicit utime syscall, it is allowed to set to previous time. Thanks Xavi for helping in rooting the issue. fixes: bz#1584981 Change-Id: If1230a75b96d7f9a828795189fcc699049e7826e Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix/ctime: Fix updating mtime to older timeKotresh HR2018-05-251-5/+11
| | | | | | | | | | | With ctime feature enabled, the mtime is not updated when it's set to time older than the existing one. Fixed the same. But the ctime is not allowed to change to older dates. fixes: bz#1581035 Change-Id: If520922df42d6ce084c8df3046c138f8367164e5 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix/ctime: Fix gfid heal on first lookupKotresh HR2018-05-241-16/+18
| | | | | | | | | | | | | | | With ctime feature enabled, the gfid is not healing on first lookup. The fresh file logic depends on ctime and it was fetching from backend instead of xattr with ctime feature enabled. Fixed the same. Also fixed a possible hang with inode lock Change-Id: I020875c0462b284d6fa0e68304a422fa3d6a3e73 fixes: bz#1580532 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix/ctime: Fix atime update for hardlinkKotresh HR2018-05-241-8/+19
| | | | | | | | | | | | | With ctime feature enabled, atime is not being updated for a hardlink when the file is accessed. e.g., touch -a <hardlink_file> fails to update atime. This patch fixes the same. fixes: bz#1580529 Change-Id: I2201c88d502d0070300a1f5023af1b36951284ec Signed-off-by: Kotresh HR <khiremat@redhat.com>
* ctime: Fix updating ctime in rename and unlinkKotresh HR2018-05-241-7/+13
| | | | | | | | | | | | 1. Successful rename was not updating ctime. Fixed the same. 2. Successful unlink when link count is more than 1 was not updating ctime. Fixed the same. 3. Copy ctime and flags during frame copy. fixes: bz#1580020 Change-Id: Ied47275a36aea60254b2add7a59128a9c83b3645 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* glusterd/ctime: Provide option to enable/disable ctime featureKotresh HR2018-05-061-11/+13
| | | | | | Updates: #208 Change-Id: If6f52b9b1b5b823ad64faeed662e96ceb848c54c Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix/ctime: posix hook to set ctime xattr in relevant fopsKotresh HR2018-05-061-21/+121
| | | | | | | | | | | This patch uses the ctime posix APIs to set consistent time across replica on disk. It also stores the time attributes in the inode context. Credits: Rafi KC <rkavunga@redhat.com> Updates: #208 Change-Id: I1a8d74d1e251f1d6d142f066fc99258025c0bcdd Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix/ctime: posix hooks to get consistent time xattrKotresh HR2018-05-061-0/+8
| | | | | | | | | | | | This patch uses the ctime posix APIs to get consistent time across replica. The time attributes are got from from inode context or from on disk if not found and merged with iatt to be returned. Credits: Rafi KC <rkavunga@redhat.com> Updates: #208 Change-Id: Id737038ce52468f1f5ebc8a42cbf9c6ffbd63850 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* posix: APIs in posix to get and set time attributesKotresh HR2018-05-061-0/+510
This is part of the effort to provide consistent time across distribute and replica set for time attributes (ctime, atime, mtime) of the object. This patch contains the APIs to set and get the attributes from on disk and in inode context. Credits: Rafi KC <rkavunga@redhat.com> Updates: #208 Change-Id: I5d3cba53eef90ac252cb8299c0da42ebab3bde9f Signed-off-by: Kotresh HR <khiremat@redhat.com>