| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Prefer timespec_now_realtime() and gf_time() over clock_gettime()
and time(), use gf_tvdiff() and gf_tsdiff() where appropriate,
drop unused time_elapsed() and leftovers in 'struct posix_private'.
Change-Id: Ie1f0229df5b03d0862193ce2b7fb91d27b0981b6
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Updates: #1002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In the commit fb20713b380e1df8d7f9e9df96563be2f9144fd6 we use
syntask to close fd but we have found the patch is reducing the
performance
Solution: Use janitor thread to close fd's and save the pfd ctx into
ctx janitor list and also save the posix_xlator into pfd object to
avoid the race condition during cleanup in brick_mux environment
Change-Id: Ifb3d18a854b267333a3a9e39845bfefb83fbc092
Fixes: #1396
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
xlators/storage/posix/src/posix-common.c:1440:18: warning: initializer overrides prior
initialization of this subobject [-Winitializer-overrides]
.validate = GF_OPT_VALIDATE_MAX,
^~~~~~~~~~~~~~~~~~~
xlators/storage/posix/src/posix-common.c:1439:18: note: previous initialization is here
.validate = GF_OPT_VALIDATE_MIN,
^~~~~~~~~~~~~~~~~~~
[4 times]
Use GF_OPT_VALIDATE_BOTH for min/max-bounded values.
Fixes: #1208
Change-Id: I073a27d23176f3b4a126f2eb50c079374a11418d
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When bringing back a downed brick and performing lookup from the client
side, the permission on said brick aren't updated on the first lookup,
but only on the second.
This patch modifies permission update logic so the first lookup will
trigger a permission update on the downed brick.
LIMITATIONS OF THE PATCH:
As the choice of source depends on whether the directory has layout or not.
Even the directories on the newly added brick will have layout xattr[zeroed], but the same is not true for a root directory.
Hence, in case in the entire cluster only the newly added bricks are up [and others are down], then any change in permission during this time will be overwritten by the older permissions when the cluster is restarted.
fixes: #999
Change-Id: Ieb70246d41e59f9cae9f70bc203627a433dfbd33
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Before executing a fop in POSIX xlator it builds an internal
path based on GFID.To validate the path it call's (l)stat
system call and while .glusterfs is heavily loaded kernel takes
time to lookup inode and due to that performance drops
Solution: In this patch we followed two ways to improve the performance.
1) Keep open fd specific to first level directory(gfid[0])
in .glusterfs, it would force to kernel keep the inodes
from all those files in cache. In case of memory pressure
kernel won't uncache first level inodes. We need to open
256 fd's per brick to access the entry faster.
2) Use at based call's to access relative path to reduce
path based lookup time.
Note: To verify the patch we have executed kernel untar 100 times on 6
different clients after enabling metadata group-cache and some
other option.We were getting more than 20 percent improvement in
kenel untar after applying the patch.
Credits: Xavi Hernandez <xhernandez@redhat.com>
Change-Id: I1643e6b01ed669b2bb148d02f4e6a8e08da45343
updates: #891
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: posix_release(dir) functions add the fd's into a ctx->janitor_fds
and janitor thread closes the fd's.In brick_mux environment it is
difficult to handle race condition in janitor threads because brick
spawns a single janitor thread for all bricks.
Solution: Use synctask to execute posix_release(dir) functions instead of
using background a thread to close fds.
Credits: Pranith Karampuri <pkarampu@redhat.com>
Change-Id: Iffb031f0695a7da83d5a2f6bac8863dad225317e
Fixes: bz#1811631
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
| |
Do not include ftw.h twice.
Change-Id: Id9e8d1813aafd890940adcd6883d90fa1b4beaf9
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Updates: bz#1193929
|
|
|
|
|
|
|
|
|
|
|
| |
'volume-id' is good to have for a graph for uniquely identifying it.
Add it to graph->volume_id while generating volfile itself.
This can be further used in many other places.
Updates: #763
Change-Id: I80516d62d28a284e8ff4707841570ced97a37e73
Signed-off-by: Amar Tumballi <amar@kadalu.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In many places we use it, compare to it, etc. It could be a static variable,
as it really doesn't change. I think it's better than initializing to 0
and then doing gfid[15] = 1 or other tricks.
I think there are additional oppportunuties to make more variables static.
This is an attempt at an easy one.
Change-Id: I7f23a30a94056d8f043645371ab841cbd0f90d19
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With added check of volume-id during handshake, we can be sure to not
connect with a brick if this gets re-used in another volume. This
prevents any accidental issues which can happen with a stale client
process lurking along.
Also added test case for testing same volume name which would fetch a
different volfile (ie, different bricks, different type), and a
different volume name, but same brick.
For reference:
Currently a client<->server handshake happens in glusterfs through
protocol/client translator (setvolume) to protocol/server using a
dictionary which containes many keys. Rejection happens in server
side if some of the required keys are missing in handshake
dictionary.
Till now, there was no single unique identifier to validate for a
client to tell server if it is actually talking to a corresponding
server. All we look in protocol/client is a key called
'remote-subvolume', which should match with a subvolume name in server
volume file, and for any volume with same brick name (can be present
in same cluster due to recreate), it would be same. This could cause
major issue, when a client was connected to a given brick, in one
volume would be connected to another volume's brick if its
re-created/re-used.
To prevent this behavior, we are now passing along 'volume-id' in
handshake, which would be preserved for the life of client process,
which can prevent this accidental connections.
NOTE: This behavior wouldn't be applicable for user-snapshot enabled
volumes, as snapshotted volume's would have different volume-id.
Fixes: bz#1620580
Change-Id: Ie98286e94ce95ae09c2135fd6ec7d7c2ca1e8095
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In OCS environment heketidbstorage is detached due
to health_check thread is failed.Sometime aio_write
is not successfully finished within default health-check-timeout
limit and the brick is detached.
Solution: To avoid the issue increase default timeout to 20s
Change-Id: Idff283d5713da571f9d20a6b296274f69c3e5b7b
Fixes: bz#1755900
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
| |
In various places, we can re-use knowledge of string length
or result of snprintf() and such instead of strlen().
Change-Id: I4c9b1decf1169b3f8ac83699a0afbd7c38fad746
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In brick_mux environment sometime brick is crashed while
volume stop/start in a loop.Brick is crashed in janitor task
at the time of accessing priv.If posix priv is cleaned up before
call janitor task then janitor task is crashed.
Solution: To avoid the crash in brick_mux environment introduce a new
flag janitor_task_stop in posix_private and before send CHILD_DOWN event
wait for update the flag by janitor_task_done
Change-Id: Id9fa5d183a463b2b682774ab5cb9868357d139a4
fixes: bz#1730409
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
seg-fault
CID: 1124831
Updates: bz#789278
Change-Id: Ia6550be3742849809cf3e0a4a39d9d6e77003b35
Signed-off-by: Barak Sason <bsasonro@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
* reverting changes made in
https://review.gluster.org/#/c/glusterfs/+/21686/
* Now storage.reserve can take value in percent or bytes
fixes: bz#1651445
Change-Id: Id4826210ec27991c55b17d1fecd90356bff3e036
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are many include statements that are not needed.
A previous more ambitious attempt failed because of *BSD plafrom
(see https://review.gluster.org/#/c/glusterfs/+/21929/ )
Now trying a more conservative reduction.
It does not solve all circular deps that we have, but it
does reduce some of them. There is just too much to handle
reasonably (dht-common.h includes dht-lock.h which includes
dht-common.h ...), but it does reduce the overall number of lines
of include we need to look at in the future to understand and fix
the mess later one.
Change-Id: I550cd001bdefb8be0fe67632f783c0ef6bee3f9f
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
storage.reserve-size option will take size as input
instead of percentage. If set, priority will be given to
storage.reserve-size over storage.reserve. Default value
of this option is 0.
fixes: bz#1651445
Change-Id: I7a7342c68e436e8bf65bd39c567512ee04abbcea
Signed-off-by: Sheetal Pamecha <sheetal.pamecha08@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nr_files is supposed to represent the number of files opened in posix.
Present logic doesn't seem to handle anon-fds because of which the
counts would always be wrong.
I don't remember anyone using this value in debugging any problem probably
because we always have 'ls -l /proc/<pid>/fd' which not only prints the
fds that are active but also prints their paths. It also handles directories
and anon-fds which actually opened the file. So removing this code
instead of fixing the buggy logic to have the nr_files.
fixes bz#1688106
Change-Id: Ibf8713fdfdc1ef094e08e6818152637206a54040
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In commit 2261e444a47ffffb5d64305efceee1d5a734cd75
wrong data type of nr_files was changed to dump
nr_files to statedump so build is failing on 32bit
environment
Solution: Change data type to avoid errors
Change-Id: Ifb4b19feda6e0e56d110b23351e7a0efd5bfa29b
fixes: bz#1657607
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
| |
Change-Id: I629698d8ddf6f15428880bdc1501d36bc37b8ebb
fixes: bz#1657607
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
| |
fixes: bz#1622665
Change-Id: I777d67b1b62c284c62a02277238ad7538eef001e
Signed-off-by: Iraj Jamali <ijamali@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: brick is getting crashed at the time of calling
pthread_detach after just call gf_thread_create.If
sufficient resources are not available on the system
pthread_create returns EAGAIN (non-negative) but the
caller function expects negative error code in case of failure
Solution: Change the condition in caller function to avoid the crash
Change-Id: Ifeaa49f809957eb6c33aa9792f5af1b55566756d
fixes: bz#1662906
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With brick mux, the number of threads increases as the number of
bricks increases. As an initiative to reduce the number of
threads in brick mux scenario, replacing janitor thread to use
synctask infra.
Now close() and closedir() handle by separate janitor
thread which is linked with glusterfs_ctx.
Updates #475
Change-Id: I0c4aaf728125ab7264442fde59f3d08542785f73
Signed-off-by: Poornima G <pgurusid@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: posix_fini sends a cancellation request to health_check
thread and cleanup priv without ensuring health_check thread
is running
Solution: Make health_check && disk_space thread joinable and call
gf_thread_cleanup_xint to wait unless thread is not finished
Change-Id: I4d37b08138766881dab0922a47ed68a2c3411f13
fixes: bz#1636570
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
| |
Fixes: #164
Change-Id: I93ad6f0232a1dc534df099059f69951e1339086f
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libglusterfs devel package headers are referenced in code using
include semantics for a program, this while it works can be better
especially when dealing with out of tree xlator builds or in
general out of tree devel package usage.
Towards this, the following changes are done,
- moved all devel headers under a glusterfs directory
- Included these headers using system header notation <> in all
code outside of libglusterfs
- Included these headers using own program notation "" within
libglusterfs
This change although big, is just moving around the headers and
making it correct when including these headers from other sources.
This helps us correctly include libglusterfs includes without
namespace conflicts.
Change-Id: Id2a98854e671a7ee5d73be44da5ba1a74252423b
Updates: bz#1193929
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch does following.
1. Enable ctime feature by default.
2. Earlier, to enable the ctime feature, two options
needed to be enabled
a. gluster vol set <volname> utime on
b. gluster vol set <volname> ctime on
This is inconvenient from the usability point of
view. Hence changed it to following single option
a. gluster vol set <volname> ctime on
fixes: bz#1624724
Change-Id: I04af0e5de1ea6126c58a06ba8a26e22f9f06344e
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, there are possibilities in few places, where a user-controlled
(like filename, program parameter etc) string can be passed as 'fmt' for
printf(), which can lead to segfault, if the user's string contains '%s',
'%d' in it.
While fixing it, makes sense to make the explicit check for such issues
across the codebase, by making the format call properly.
Fixes: CVE-2018-14661
Fixes: bz#1644763
Change-Id: Ib547293f2d9eb618594cbff0df3b9c800e88bde4
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Current resource cleanup sequence is not
perfect while brick mux is enabled
Solution: 1) Destroying xprt after cleanup all fd associated
with a client
2) Before call fini for brick xlators ensure no stub
should be running on a brick
Change-Id: I86195785e428f57d3ef0da3e4061021fafacd435
fixes: bz#1631357
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No default value was specified for `export-statfs-size` in posix
option table. Glusterd2 sets default value as `off` since the
option type is `bool`. Posix treats `export-statfs-size=on` if
not specified in volfile(That means default value is `on`)
This patch sets default value as `on`
Change-Id: I5c6341183be9b62a78fdbc94621220f9284e1382
updates: #302
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
| |
Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
Signed-off-by: Nigel Babu <nigelb@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Fixes CID: 1388886
https://scan6.coverity.com/reports.htm#v42607/p10714/fileInstanceId=85287446&defectInstanceId=25997291&mergedDefectId=1388886
Change-Id: Ic4e558bba7e15d213c07bc31affb2e175ace5502
updates: bz#789278
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
the macro 'PATH_SET_TIMESPEC_OR_TIMEVAL' is defined in posix.h,
which seems to be good place for it
updates: bz#1193929
Change-Id: I521a2a6efeb26891c24637a5f06b1d90c00b1135
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
| |
updates: #302
Change-Id: I9c1b9c9751c21866b074ac5d3ef15a58ae7aa707
Signed-off-by: Prashanth Pai <ppai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Please review, it's not always just the comments that were fixed.
I've had to revert of course all calls to creat() that were changed
to create() ...
Only compile-tested!
Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
| |
Updates: #208
Change-Id: If6f52b9b1b5b823ad64faeed662e96ceb848c54c
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Sometimes brick process is getting crashed at the time
of stop brick while brick mux is enabled.
Solution: Brick process was getting crashed because of rpc connection
was not cleaning properly while brick mux is enabled.In this patch
after sending GF_EVENT_CLEANUP notification to xlator(server)
waits for all rpc client connection destroy for specific xlator.Once rpc
connections are destroyed in server_rpc_notify for all associated client
for that brick then call xlator_mem_cleanup for for brick xlator as well as
all child xlators.To avoid races at the time of cleanup introduce
two new flags at each xlator cleanup_starting, call_cleanup.
BUG: 1544090
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Note: Run all test-cases in separate build (https://review.gluster.org/#/c/19700/)
with same patch after enable brick mux forcefully, all test cases are
passed.
Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf
updates: bz#1544090
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: At the time of stopping the volume while brick multiplex is
enabled memory is not cleanup from all server side xlators.
Solution: To cleanup memory for all server side xlators call fini
in glusterfs_handle_terminate after send GF_EVENT_CLEANUP
notification to top xlator.
BUG: 1544090
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Note: Run all test-cases in separate build (https://review.gluster.org/19574)
with same patch after enable brick mux forcefully, all test cases are
passed.
Change-Id: Ia10dc7f2605aa50f2b90b3fe4eb380ba9299e2fc
|
|
|
|
|
|
|
|
|
|
|
| |
There are still remain some code paths where cleanup is required while
brick mux is on.I will upload a new patch after resolve all code paths.
This reverts commit b313d97faa766443a7f8128b6e19f3d2f1b267dd.
BUG: 1544090
Change-Id: I26ef1d29061092bd9a409c8933d5488e968ed90e
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Added a volume option 'fips-mode-rchecksum' tied to op version 4.
If not set, rchecksum fop will use MD5 instead of SHA256.
updates: #230
Change-Id: Id8ea1303777e6450852c0bc25503cda341a6aec2
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: At the time of stopping the volume while brick multiplex is
enabled memory is not cleanup from all server side xlators.
Solution: To cleanup memory for all server side xlators call fini
in glusterfs_handle_terminate after send GF_EVENT_CLEANUP
notification to top xlator.
BUG: 1544090
Change-Id: Ifa1525e25b697371276158705026b421b4f81140
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
directory
Summary:
- We may have found an issue where certain directories were being moved into .landfill and then being quickly purged via nftw().
- We would like to have an emergency option to disable these purges.
> Reviewed-on: https://review.gluster.org/18253
> Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Fixes #371
Change-Id: I90b54c535930c1ca2925a928728199b6b80eadd9
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Too may hard links blow up btrfs by exceeding max xattr size (recordign
pgfid for each hardlink). Add a limit to prevent this explosion.
> Reviewed-on: https://review.gluster.org/18232
> Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Fixes gluster/glusterfs#370
Signed-off-by: ShyamsundarR <srangana@redhat.com>
Change-Id: I614a247834fb8f2b2743c0c67d11cefafff0dbaa
|
|
1. Split out entry and inode/fd based FOPs into
separate files from posix.c
2. Split out common routines (init, fini, reconf,
and such) into its own file, from posix.c
3. Retain just the method assignments in posix.c
(such that posix2 for RIO can assign its own methods in
the future for entry operations and such)
4. Based on the split in (1) and (2) split out
posix-handle.h into 2 files, such that macros that are
needed for inode ops are in one and rest are in the other
If the split is done as above, posix2 can compile with
its own entry ops, and hence not compile, the entry ops
as split in (1) above.
The split described in (4) can again help posix2 to
define its own macros to make entry and inode handles,
thus not impact existing POSIX xlator code.
Noted problems
- There are path references in certain cases where
quota is used (in the xattr FOPs), and thus will fail
on reuse in posix2, this needs to be handled when we
get there.
- posix_init does set root GFID on the brick root,
and this is incorrect for posix2, again will need
handling later when posix2 evolves based on this
code (other init checks seem fine on current inspection)
Merge of experimental branch patches with the following
gerrit change-IDs
> Change-Id: I965ce6dffe70a62c697f790f3438559520e0af20
> Change-Id: I089a4d9cf470c2f9c121611e8ef18dea92b2be70
> Change-Id: I2cec103f6ba8f3084443f3066bcc70b2f5ecb49a
Fixes gluster/glusterfs#327
Change-Id: I0ccfa78559a7c5a68f5e861e144cf856f5c9e19c
Signed-off-by: ShyamsundarR <srangana@redhat.com>
|