| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
gf_rev_dns_lookup_cached() allocated struct dnscache->dict if it was null
but the freeing was left to the caller.
Fix:
Moved dict allocation and freeing into corresponding init and fini
routines so that it's easier for the caller to avoid such leaks.
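The ownership rule described in the fix can be sketched roughly as follows. This is only an illustration: `demo_rev_dict`, `demo_dnscache` and the `demo_*` functions are hypothetical stand-ins, not the real dict_t, struct dnscache or gluster helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real dict_t and struct dnscache. */
struct demo_rev_dict {
    int entries;
};

struct demo_dnscache {
    struct demo_rev_dict *dict; /* owned by the cache, never by the caller */
    unsigned ttl;
};

/* init allocates the cache together with its dict. */
struct demo_dnscache *
demo_dnscache_init(unsigned ttl)
{
    struct demo_dnscache *cache = calloc(1, sizeof(*cache));
    if (!cache)
        return NULL;

    cache->dict = calloc(1, sizeof(*cache->dict));
    if (!cache->dict) {
        free(cache);
        return NULL;
    }
    cache->ttl = ttl;
    return cache;
}

/* fini releases everything init allocated; NULL is accepted. */
void
demo_dnscache_fini(struct demo_dnscache *cache)
{
    if (!cache)
        return;
    free(cache->dict);
    free(cache);
}

/* A lookup can rely on the dict being present and never allocates it. */
int
demo_dnscache_entries(const struct demo_dnscache *cache)
{
    assert(cache && cache->dict);
    return cache->dict->entries;
}
```

With this split, a caller only has to pair one init with one fini and cannot leak the dict by forgetting a separate free.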
Updates: #1000
Change-Id: I90d6a6f85ca2dd4fe0ab461177aaa9ac9c1fbcf9
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 079f7a7d8a2bd85070c1da4dde2452ca82a1cdbb)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In commit fb20713b380e1df8d7f9e9df96563be2f9144fd6 we used a
synctask to close fds, but we have found that the patch reduces
performance.
Solution: Use a janitor thread to close fds, save the pfd ctx into
the ctx janitor list, and also save the posix_xlator into the pfd
object to avoid a race condition during cleanup in a brick_mux
environment.
Change-Id: Ifb3d18a854b267333a3a9e39845bfefb83fbc092
Fixes: #1396
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
(cherry picked from commit 41b9616435cbdf671805856e487e373060c9455b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was a critical flaw in the previous implementation of open-behind.
When an open is done in the background, it's necessary to take a
reference on the fd_t object because once we "fake" the open answer,
the fd could be destroyed. However as long as there's a reference,
the release function won't be called. So, if the application closes
the file descriptor without having actually opened it, there will
always remain at least 1 reference, causing a leak.
To avoid this problem, the previous implementation didn't take a
reference on the fd_t, so there were races where the fd could be
destroyed while it was still in use.
To fix this, I've implemented a new xlator cbk that gets called from
fuse when the application closes a file descriptor.
The whole logic of handling background opens has been simplified and
it's more efficient now. A stub is created only if the fop needs to be
delayed until an open completes. Otherwise no memory allocations are
needed.
Correctly handling the close request while the open is still pending
has added a bit of complexity, but overall normal operation is simpler.
Change-Id: I6376a5491368e0e1c283cc452849032636261592
Fixes: #1225
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current scaling of the syncop thread pool is not working properly
and can leave some tasks in the run queue more time than necessary
when the maximum number of threads is not reached.
This patch provides a better scaling condition to react faster to
pending work.
Condition variables and sleep in the context of a synctask have also
been implemented. Their purpose is to replace regular condition
variables and sleeps that block synctask threads and prevent other
tasks from being executed.
The new features have been applied to several places in glusterd.
Change-Id: Ic50b7c73c104f9e41f08101a357d30b95efccfbf
Fixes: #1116
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Before executing a fop, the POSIX xlator builds an internal
path based on the GFID. To validate the path it calls the (l)stat
system call, and while .glusterfs is heavily loaded the kernel takes
time to look up the inode, so performance drops.
Solution: This patch improves performance in two ways.
1) Keep an fd open for each first-level directory (gfid[0])
in .glusterfs. This forces the kernel to keep the inodes
of all those files in cache. Under memory pressure the
kernel won't uncache first-level inodes. We need to open
256 fds per brick to access the entries faster.
2) Use *at()-based calls to access relative paths, reducing
path-based lookup time.
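A minimal sketch of the second point, assuming a directory descriptor that is opened once and reused. The `demo_*` helpers are hypothetical; only open() and fstatat() are real POSIX calls, and the actual patch may use other *at() variants.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open a directory once and keep the descriptor for later relative lookups. */
int
demo_open_dir(const char *dirpath)
{
    return open(dirpath, O_RDONLY | O_DIRECTORY);
}

/* Resolve 'name' relative to dirfd instead of walking a full path again. */
int
demo_stat_at(int dirfd, const char *name, struct stat *st)
{
    assert(dirfd >= 0 && name && st);
    return fstatat(dirfd, name, st, AT_SYMLINK_NOFOLLOW);
}
```

The kernel starts path resolution at the already-open directory, so only the remaining path components have to be looked up.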
Note: To verify the patch we executed a kernel untar 100 times on 6
different clients after enabling the metadata group-cache and some
other options. We saw more than 20 percent improvement in
kernel untar after applying the patch.
Credits: Xavi Hernandez <xhernandez@redhat.com>
Change-Id: I1643e6b01ed669b2bb148d02f4e6a8e08da45343
updates: #891
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: posix_release(dir) functions add the fds into ctx->janitor_fds
and the janitor thread closes the fds. In a brick_mux environment it is
difficult to handle race conditions in janitor threads because a single
janitor thread is spawned for all bricks.
Solution: Use a synctask to execute posix_release(dir) functions instead
of using a background thread to close fds.
Credits: Pranith Karampuri <pkarampu@redhat.com>
Change-Id: Iffb031f0695a7da83d5a2f6bac8863dad225317e
Fixes: bz#1811631
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If uid is not set to 0, the acl xlator can return errors.
So, set `uid = 0;` with the pid indicating that this is
set from UTIME activity.
The message "E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-dev_SNIP_data-utime: dict set of key for set-ctime-mdata failed [Permission denied]" repeated 2 times between [2019-12-19 21:27:55.042634] and [2019-12-19 21:27:55.047887]
Change-Id: Ieadf329835a40a13ac0bf908dac776e66954466c
Fixes: #832
Signed-off-by: Amar Tumballi <amar@kadalu.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Objects allocated from a per-thread memory pool keep a reference to it
to be able to return the object to the pool when not used anymore. The
object holding this reference can have a long life cycle that could
survive a glfs_fini() call.
This means that it's unsafe to destroy memory pools from glfs_fini().
Another side effect of destroying memory pools from glfs_fini() is that
the TLS variable that points to one of those pools cannot be reset for
all alive threads. This means that any attempt to allocate memory from
those threads will access already freed memory, which is very
dangerous.
To fix these issues, mem_pools_fini() doesn't destroy pool lists
anymore. The pools are destroyed only at process termination.
Change-Id: Ib189a5510ab6bdac78983c6c65a022e9634b0965
Fixes: bz#1801684
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
| |
convert all gf_msg() to gf_smsg()
Change-Id: Id542e05faadb8041b472a2298c71fe62730e65fc
Updates: #657
Signed-off-by: yatipadia <ypadia@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
The thin-arbiter module uses the 'pending-xattr' name of the translator
as the name of the file that gets created on the thin-arbiter node. By
making this unique, we can host a single thin-arbiter node for multiple
clusters.
Updates: #763
Change-Id: Ib3c732e7e04e6dba229e71ae3e64f1f3cb6d794d
Signed-off-by: Amar Tumballi <amar@kadalu.io>
|
|
|
|
|
|
|
|
| |
convert all gf_msg() to gf_smsg()
Change-Id: I8f1ff462b9c8012ed676c51450930a65ac403bf3
Updates: #657
Signed-off-by: yatipadia <ypadia@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The number of signing process threads (glfs_brpobj)
is set to 4 by default. The recommendation is to set
it to the number of cores available. This patch makes it
configurable as follows:
gluster vol bitrot <volname> signer-threads <count>
fixes: bz#1797869
Change-Id: Ia883b3e5e34e0bc8d095243508d320c9c9c58adc
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
| |
convert gf_msg() to gf_smsg()
Change-Id: Idf5bfc826b0c9f1a2674eea2a2e6164f30806b00
Updates: #657
Signed-off-by: yatipadia <ypadia@redhat.com>
|
|
|
|
|
|
|
|
| |
convert gf_msg() to gf_smsg()
Change-Id: I1cd6a5ac6f4361195d5d925efb2cc194045d0bba
Updates: #657
Signed-off-by: yatip <ypadia@redhat.com>
|
|
|
|
|
|
|
|
| |
changes all the gf_msg() to gf_smsg()
Change-Id: I3524bbd8f8070df2f2c888ea9b0fb7db1ee44453
Updates: #657
Signed-off-by: yatipadia <ypadia@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This parameter may have been used in the past, but is no longer
needed. Removing it and the few locations where it was actually
referenced.
This also allows removing an extra memdup that was not needed
in the first place in the server_setvolume() and
unserialize_rsp_direntp() functions.
A separate follow-up patch will remove the extra_stdfree parameter
from the dictionary structure.
Change-Id: Ica0ff0a330672373aaa60e808b7e76ec489a0fe3
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of using a boolean parameter, we can use the key variable.
If it's NULL, the pair is unused and available.
Otherwise, it's in use and must not be reused.
It saves this annoying boolean, which causes the compiler
(or us explicitly) to pad the dict struct with additional bytes.
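The idea can be illustrated with a small sketch. The `demo_pair` and `demo_kv_dict` types and the embedded `free_pair` slot are assumptions made for this example; the real dict structure has more fields and its layout is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified pair. */
struct demo_pair {
    const char *key; /* NULL means the slot is free */
    int value;
};

/* Hypothetical dict with one pre-allocated pair and no separate "in use" flag. */
struct demo_kv_dict {
    struct demo_pair free_pair;
};

int
demo_slot_is_free(const struct demo_kv_dict *d)
{
    return d->free_pair.key == NULL;
}

/* Returns the embedded slot when it is free, or NULL so the caller allocates. */
struct demo_pair *
demo_claim_slot(struct demo_kv_dict *d, const char *key, int value)
{
    assert(d && key);
    if (!demo_slot_is_free(d))
        return NULL;
    d->free_pair.key = key;
    d->free_pair.value = value;
    return &d->free_pair;
}

/* Releasing the slot is just resetting the key. */
void
demo_release_slot(struct demo_kv_dict *d)
{
    d->free_pair.key = NULL;
}
```

Since every pair in use must have a key, the key pointer already carries the information the boolean used to duplicate.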
Change-Id: I89f52db57f35b3ef8acf57b7de2cee37f5d18e06
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This volume option was not made available to the `gluster volume set` CLI.
Reported-by: epolakis(https://github.com/kinsu) in
https://github.com/gluster/glusterfs/issues/781
fixes: bz#1787554
Change-Id: I7141bdd4e53ee99e22b354edde8d023bfc0b2cd7
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Remove TIER_LINKFILE_GFID related code from posix
The tier xlator was removed, but there is some code related to it
scattered around in the DHT and Posix xlators. Remove some of it.
Change-Id: I3a878b31ed4a045ed419f936aa1d567ded1a273f
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changes in locks xlator:
Added support for per-domain inodelk count requests.
Caller needs to set GLUSTERFS_MULTIPLE_DOM_LK_CNT_REQUESTS key in the
dict and then set each key with name
'GLUSTERFS_INODELK_DOM_PREFIX:<domain name>'.
In the response dict, the xlator will send the per domain count as
values for each of these keys.
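As a rough illustration of the key naming scheme, a caller could build each per-domain key like this. The prefix string below is an assumed value used only for the example, and `demo_domain_key` is a hypothetical helper; the real macro lives in the glusterfs headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed value, for illustration only. */
#define DEMO_INODELK_DOM_PREFIX "glusterfs.inodelk-dom-prefix"

/* Writes "<prefix>:<domain>" into buf; returns 0 on success, -1 if it does not fit. */
int
demo_domain_key(char *buf, size_t len, const char *domain)
{
    assert(buf && domain);
    int n = snprintf(buf, len, "%s:%s", DEMO_INODELK_DOM_PREFIX, domain);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The requester would set one such key per lock domain it is interested in, and read the count back from the same key in the response dict.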
Changes in AFR:
Replaced afr_selfheal_locked_inspect() with afr_lockless_inspect(). Logic has
been added to make the latter behave the same as the former, thus not
breaking the current heal info output behaviour.
fixes: bz#1774011
Change-Id: Ie9e83c162aa77f44a39c2ba7115de558120ada4d
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When touch is used to create a file, the ctime does not match the
atime and mtime, which ideally should match. There is a difference
in nanoseconds.
Cause:
When touch is used to modify atime or mtime to the current time
(UTIME_NOW), the current time is taken from the kernel. The ctime gets
updated to the current time when atime or mtime is updated. But the
current time used to update ctime is taken from the utime xlator. Hence
the difference in nanoseconds.
Fix:
When utimensat uses UTIME_NOW, use the current time from the kernel.
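For readers unfamiliar with UTIME_NOW, the sketch below shows what a touch-like caller asks of the kernel. It uses futimens(), the file-descriptor variant of utimensat(), on a scratch file; `demo_touch_now` and the /tmp path are assumptions for the example and are unrelated to the xlator code.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Creates a scratch file, asks the kernel to stamp atime and mtime with its
 * own notion of "now", and reports the resulting timestamps through *st.
 * Returns 0 on success, -1 on failure.
 */
int
demo_touch_now(struct stat *st)
{
    char path[] = "/tmp/utime-now-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    /* UTIME_NOW in tv_nsec: the kernel picks the time, tv_sec is ignored. */
    struct timespec times[2] = {
        {.tv_sec = 0, .tv_nsec = UTIME_NOW}, /* atime */
        {.tv_sec = 0, .tv_nsec = UTIME_NOW}, /* mtime */
    };
    int ret = futimens(fd, times);
    if (ret == 0)
        ret = fstat(fd, st);

    close(fd);
    unlink(path);
    return ret;
}
```

Because the caller never supplies a timestamp in this mode, any layer that wants a matching ctime has to take its value from the same clock.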
fixes: bz#1773530
Change-Id: I9ccfa47dcd39df23396852b4216f1773c49250ce
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
'volume-id' is good to have for a graph for uniquely identifying it.
Add it to graph->volume_id while generating volfile itself.
This can be further used in many other places.
Updates: #763
Change-Id: I80516d62d28a284e8ff4707841570ced97a37e73
Signed-off-by: Amar Tumballi <amar@kadalu.io>
|
|
|
|
|
|
|
|
| |
This reverts commit fce5f68bc72d448490a0d41be494ac54a9181b3c.
I merged the wrong patch by mistake! Hence reverting it.
updates: bz#1774011
Change-Id: Id7d6ed1d727efc02467c8a9aea3374331261ebd5
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changes in locks xlator:
Added support for per-domain inodelk count requests.
Caller needs to set GLUSTERFS_MULTIPLE_DOM_LK_CNT_REQUESTS key in the
dict and then set each key with name
'GLUSTERFS_INODELK_DOM_PREFIX:<domain name>'.
In the response dict, the xlator will send the per domain count as
values for each of these keys.
Changes in AFR:
Replaced afr_selfheal_locked_inspect() with afr_lockless_inspect(). Logic has
been added to make the latter behave the same as the former, thus not
breaking the current heal info output behaviour.
fixes: bz#1774011
Change-Id: I9ae08ce768b39aeb6ee230207b5b7fa744176952
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The function takes a string and its length and, based on them,
returns whether the string is a boolean. It's identical in functionality
to gf_string2boolean, only with far fewer string comparisons since
it takes into account the length of the string.
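A length-driven parser along these lines might look like the sketch below. `demo_strn2boolean` is a hypothetical name, and the list of accepted words is an assumption about what gf_string2boolean() recognises, not something stated in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Returns 0 and sets *b for a recognised boolean word, -1 otherwise. */
int
demo_strn2boolean(const char *str, size_t len, bool *b)
{
    assert(str && b);

    /* The length selects at most two candidates, so few comparisons run. */
    switch (len) {
    case 1:
        if (str[0] == '1') { *b = true; return 0; }
        if (str[0] == '0') { *b = false; return 0; }
        break;
    case 2:
        if (strncasecmp(str, "on", 2) == 0) { *b = true; return 0; }
        if (strncasecmp(str, "no", 2) == 0) { *b = false; return 0; }
        break;
    case 3:
        if (strncasecmp(str, "yes", 3) == 0) { *b = true; return 0; }
        if (strncasecmp(str, "off", 3) == 0) { *b = false; return 0; }
        break;
    case 4:
        if (strncasecmp(str, "true", 4) == 0) { *b = true; return 0; }
        break;
    case 5:
        if (strncasecmp(str, "false", 5) == 0) { *b = false; return 0; }
        break;
    case 6:
        if (strncasecmp(str, "enable", 6) == 0) { *b = true; return 0; }
        break;
    case 7:
        if (strncasecmp(str, "disable", 7) == 0) { *b = false; return 0; }
        break;
    default:
        break;
    }
    return -1;
}
```

A version that ignores the length would have to try every accepted word in turn before rejecting an input.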
dict_get_str_boolean() has been converted to use it.
Other cases of gf_string2boolean() across the code base can be
converted as well, but more importantly, callers that use
dict_get_str() and then call gf_string2boolean() should be converted
to simply call dict_get_str_boolean(), which would take care of this
for them.
This is therefore a first step in the conversion.
Change-Id: I9ee93abfc676f6e123a3919d8df8c25e8848b799
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of querying for the file size and allocating a char array
according to its size, let's just use a fixed size.
Those calls are not really needed, and are either expensive or
cached anyway. Since we do dynamic allocation/free, let's just use
a fixed array instead.
I'll see if there are other sys_stat() calls that are not really
useful and try to eliminate them in separate patches.
Change-Id: I76b40e78a52ab38f613fc0cdef4be60e6253bf20
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
| |
Updates: bz#1193929
Change-Id: Idb98394c51917e9b132aeb32facccd112effe672
Signed-off-by: Amar Tumballi <amarts@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Configure the list of gluster servers in the key
GLUSTERD_BRICK_SERVERS at the time of the GETSPEC RPC call
and access the value on the client side to update the volfile
server list, so that the client is able to connect to the
next volfile server in case the current volfile server
is down.
Updates #741
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Change-Id: I23f36ddb92982bb02ffd83937a8bd8a2c97e8104
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Align structures
- gf_iobuf_get_pagesize() will now also return the index, reducing the need
for an additional very similar call.
- Removal of an inefficient loop I've inadvertently added previously.
It was harmless, but just inefficient.
- New pool initialization does not need to be done under lock - no
one can touch that pool yet, so no need to protect it.
Change-Id: I61c50f2f14fa79edc131e515e9615a9928ee2dca
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
posix_xattr_fill() is called from several POSIX functions.
Made minor changes to it and the functions called from it:
1. Dict functions to use known lengths (dict_getn() instead of dict_get(), etc.)
2. Re-ordered some static char[] arrays, to account (hopefully)
for the frequency of the xattrs usage (based on grep in the code...)
3. Before strcmp(), check if the string lengths match.
4. Removed some dead code.
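Item 3 can be pictured with a tiny helper. `demo_key_equal` is a hypothetical function, shown only to illustrate the length check that precedes the comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Compares two NUL-terminated keys whose lengths are already known. */
bool
demo_key_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    if (alen != blen)
        return false; /* cheap rejection, no bytes are compared */
    return strcmp(a, b) == 0;
}
```

Most xattr names differ in length, so the common mismatch case costs a single integer comparison.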
Hopefully, no functional changes.
Change-Id: I510c0d2785e54ffe0f82c4c449782f2302d63a32
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
| |
I'm not sure why it was there and I did not see any use for it.
In the hope I did not miss anything, I removed it.
Change-Id: I02fa2e8e2a598b488fddbff4c7168dc4a41929b2
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the added check of volume-id during handshake, we can be sure not
to connect to a brick if it gets re-used in another volume. This
prevents any accidental issues which can happen with a stale client
process lurking around.
Also added a test case for the same volume name fetching a
different volfile (i.e., different bricks, different type), and for a
different volume name with the same brick.
For reference:
Currently a client<->server handshake happens in glusterfs through the
protocol/client translator (setvolume) to protocol/server using a
dictionary which contains many keys. Rejection happens on the server
side if some of the required keys are missing in the handshake
dictionary.
Until now, there was no single unique identifier that a client could
use to validate that it is actually talking to the corresponding
server. All we look at in protocol/client is a key called
'remote-subvolume', which should match a subvolume name in the server
volume file, and for any volume with the same brick name (which can be
present in the same cluster due to a recreate), it would be the same.
This could cause a major issue: a client connected to a given brick in
one volume would be connected to another volume's brick if it's
re-created/re-used.
To prevent this behavior, we are now passing along 'volume-id' in the
handshake, which is preserved for the life of the client process and
can prevent these accidental connections.
NOTE: This behavior wouldn't be applicable for user-snapshot enabled
volumes, as snapshotted volumes would have a different volume-id.
Fixes: bz#1620580
Change-Id: Ie98286e94ce95ae09c2135fd6ec7d7c2ca1e8095
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
There could be cases (either due to insufficient memory or
corrupted mem-pool) due to which frame creation fails. Bail out
with error in such cases.
Change-Id: I8cc0a5852f6f04d2bac991e4eb79ecb42577da11
Fixes: bz#1748448
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
|
|
|
|
|
|
| |
fixes: #721
Change-Id: I5333540e3c635ccf441cf1f4696e4c8986e38ea8
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were unconditionally cleaning up the graph when we got
child_down followed by parent_down. But this is prone to a
race condition when some of the bricks are already disconnected.
In this case, we might have freed the graph even before the last
child_down is executed in the client xlator code, because the
child_down event was already received.
To fix this race, we have introduced a check to see if all client
xlators have cleared their reconnect chain and called child_down
for the last time.
Change-Id: I7d02813bc366dac733a836e0cd7b14a6fac52042
fixes: bz#1727329
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
On systems that don't support "timespec_get" (e.g., centos6), it
was using "clock_gettime" with "CLOCK_MONOTONIC" to get the unix epoch
time, which is incorrect. This patch introduces "timespec_now_realtime",
which uses "clock_gettime" with "CLOCK_REALTIME" and fixes
the issue.
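The difference between the two clocks can be sketched as below. `demo_timespec_now_realtime` is a hypothetical stand-in for the helper named above, and the zeroed fallback on error is an assumption made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/*
 * CLOCK_REALTIME counts from the unix epoch. CLOCK_MONOTONIC counts from an
 * unspecified starting point, so it cannot be used as a wall-clock timestamp.
 */
void
demo_timespec_now_realtime(struct timespec *ts)
{
    assert(ts);
    if (clock_gettime(CLOCK_REALTIME, ts) != 0) {
        /* Assumed fallback for this sketch only. */
        ts->tv_sec = 0;
        ts->tv_nsec = 0;
    }
}
```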
Change-Id: I57be35ce442d7e05319e82112b687eb4f28d7612
Signed-off-by: Kotresh HR <khiremat@redhat.com>
fixes: bz#1743652
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To convert the existing `gf_msg` to `gf_smsg`:
- Define `_STR` of respective Message ID as below(In `*-messages.h`)
#define PC_MSG_REMOTE_OP_FAILED_STR "remote operation failed."
- Change `gf_msg` to use `gf_smsg`. Convert values into fields and
add any missing fields. Note: `errno` and `error` fields will be
added automatically to log message in case errnum is specified.
Example:
gf_smsg(
this->name, // Name or log domain
GF_LOG_WARNING, // Log Level
rsp.op_errno, // Error number
PC_MSG_REMOTE_OP_FAILED, // Message ID
"path=%s", local->loc.path, // Key Value 1
"gfid=%s", loc_gfid_utoa(&local->loc), // Key Value 2
NULL // Log End
);
Key value pairs formatting Help:
gf_slog(
this->name, // Name or log domain
GF_LOG_WARNING, // Log Level
rsp.op_errno, // Error number
PC_MSG_REMOTE_OP_FAILED, // Message ID
"op=CREATE", // Static Key and Value
"path=%s", local->loc.path, // Format for Value
"brick-%d-status=%s", brkidx, brkstatus, // Use format for key and val
NULL // Log End
);
Before:
[2019-07-03 08:16:18.226819] W [MSGID: 114031] [client-rpc-fops_v2.c \
:2633:client4_0_lookup_cbk] 0-gv3-client-0: remote operation failed. \
Path: / (00000000-0000-0000-0000-000000000001) [Transport endpoint \
is not connected]
After:
[2019-07-29 07:50:15.773765] W [MSGID: 114031] \
[client-rpc-fops_v2.c:2633:client4_0_lookup_cbk] 0-gv1-client-0: \
remote operation failed. [{path=/f1}, \
{gfid=00000000-0000-0000-0000-000000000000}, \
{errno=107}, {error=Transport endpoint is not connected}]
To add a new `gf_smsg`, add a Message ID in the respective `*-messages.h`
file and then follow the steps mentioned above.
Change-Id: I4e7d37f27f106ab398e991d931ba2ac7841a44b1
Updates: #657
Signed-off-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In a brick_mux environment the brick sometimes crashes while
volumes are stopped/started in a loop. The brick crashes in the janitor
task at the time of accessing priv. If posix priv is cleaned up before
the janitor task is called, then the janitor task crashes.
Solution: To avoid the crash in a brick_mux environment, introduce a new
flag janitor_task_stop in posix_private and, before sending the
CHILD_DOWN event, wait for the flag to be updated by janitor_task_done.
Change-Id: Id9fa5d183a463b2b682774ab5cb9868357d139a4
fixes: bz#1730409
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
| |
Change-Id: Id9f5f448db305f3135a1fdca61b1d7ec898c63a4
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Goal: 'libglusterfs' files shouldn't have any dependency outside of
the tree. In particular, the header files shouldn't '#include'
anything from outside the tree.
Fixes:
* Had to introduce libglusterd, so that methods and structures required
only for mgmt/glusterd and cli/ are separated from 'libglusterfs/'
* Remove rpc/xdr/gen from build, which was used mainly so that the
dependency for libglusterfs could be properly satisfied.
* Move rpcsvc_auth_data to client_t.h, so all dependencies could
be handled.
Updates: bz#1636297
Change-Id: I0e80243a5a3f4615e6fac6e1b947ad08a9363fce
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the glusterfs fuse client process is unable to
process the invalidate requests quickly enough, the
number of such requests quickly grows large enough
to use a significant amount of memory.
We are now introducing another option to set an upper
limit on these to prevent runaway memory usage.
Change-Id: Iddfff1ee2de1466223e6717f7abd4b28ed947788
Fixes: bz#1732717
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
| |
Fixes: bz#1644322
Change-Id: I53e8fa362cd8c7d04fb1c4abb606a9abb642c592
Signed-off-by: Csaba Henk <csaba@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I hit a crash when using libgfapi.
libgfapi calls glfs_poller() --> event_dispatch()
in file api/src/glfs.c:721, and event_dispatch() is defined
locally by libglusterfs. The problem is that the name event_dispatch()
is exactly the same as the one from the libevent package from
the OS.
For example, if an executable program Foo uses and
links libevent and libgfapi at the same time, I can hit the
crash, like:
kernel: glfs_glfspoll[68486]: segfault at 1c0 ip 00007fef006fd2b8 sp
00007feeeaffce30 error 4 in libevent-2.0.so.5.1.9[7fef006ed000+46000]
The link for Foo is:
lib_foo_LADD = -levent $(GFAPI_LIBS)
It will crash.
This is because glfs_poller() is calling event_dispatch() from
libevent, not the one from libglusterfs.
The gfapi link info :
GFAPI_LIBS = -lacl -lgfapi -lglusterfs -lgfrpc -lgfxdr -luuid
If I link Foo like:
lib_foo_LADD = $(GFAPI_LIBS) -levent
It works well without any problem.
And if Foo loads a private lib, such as handler_glfs.so, where
handler_glfs.so links GFAPI_LIBS directly while Foo doesn't
and instead calls dlopen(handler_glfs.so), then the crash is hit every time.
The link info will be:
foo_LADD = -levent
libhandler_glfs_LIBADD = $(GFAPI_LIBS)
I can avoid the crash temporarily by linking the GFAPI_LIBS in Foo too like:
foo_LADD = $(GFAPI_LIBS) -levent
libhandler_glfs_LIBADD = $(GFAPI_LIBS)
But this is ugly since Foo won't use any APIs from GFAPI_LIBS.
And in some cases, when the --as-needed link option is added (on many
distros it is added by default), the crash is back again and the above
workaround won't work.
Fixes: #699
Change-Id: I38f0200b941bd1cff4bf3066fca2fc1f9a5263aa
Signed-off-by: Xiubo Li <xiubli@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The files which were created before ctime was enabled would not
have the "trusted.glusterfs.mdata" xattr (which stores time attributes).
Upon fops which modify either ctime or mtime, the xattr
gets created with the latest ctime, mtime and atime, which is
incorrect. It should update only the corresponding time
attribute and take the rest from the backend.
Solution:
Creating the xattr with values from the brick is not possible, as
each brick of a replica set would have different times.
So create the xattr upon successful lookup if the xattr
is not created.
Note To Reviewers:
The time attributes used to set the xattr are obtained from a successful
lookup. Instead of sending the whole iatt over the wire via
setxattr, a structure called mdata_iatt is sent. The mdata_iatt
contains only time attributes.
Change-Id: I5e535631ddef04195361ae0364336410a2895dd4
fixes: bz#1593542
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As usleep has been obsoleted, changed all invocations of usleep
to nanosleep. From man 3 usleep:
"4.3BSD, POSIX.1-2001. POSIX.1-2001 declares this function
obsolete; use nanosleep(2) instead. POSIX.1-2008 removes the
specification of usleep()."
Added a helper function gf_nanosleep() to have a single place
for handling edge cases that might arise from the conversion of
usleep to nanosleep and to allow the sleep to resume with the right
remaining value upon being interrupted.
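The resume-on-interrupt behaviour could look roughly like this. `demo_nanosleep` is a hypothetical sketch, not the actual gf_nanosleep() implementation, whose signature is not given in this message.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <time.h>

/*
 * Sleeps for 'nsec' nanoseconds and resumes after signal interruptions.
 * Returns 0 on success, -1 (with errno set) on any other error.
 */
int
demo_nanosleep(uint64_t nsec)
{
    struct timespec req = {
        .tv_sec = (time_t)(nsec / 1000000000ULL),
        .tv_nsec = (long)(nsec % 1000000000ULL),
    };
    struct timespec rem = {0, 0};

    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR)
            return -1;
        req = rem; /* continue with whatever time is left */
    }
    return 0;
}
```

Splitting the value into seconds and nanoseconds up front also keeps tv_nsec below one second, which nanosleep() requires.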
Fixes: bz#1721686
Change-Id: Ia39ab82c9e0f4669d2c00d4cdf25e38d94ef9f62
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
As Hadoop is no longer supported, dropping code for
handling Hadoop access.
Fixes: bz#1728417
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
Change-Id: I8fcf4faacb364f1c9a8abb0c48faec337087f845
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a normal volume, we update the pid from the
process while we daemonize, or at the end of
init if it is in no-daemon mode. Along with updating the pid
we also lock the file, to make sure that the process is
running fine.
With brick mux, we were updating the pidfile from glusterd
after an attach/detach request.
There are two problems with this approach.
1) We are not holding a pidlock for any file other than that of the
parent process.
2) There is a chance of race conditions with attach/detach.
For example, shd start and a volume stop could race. Let's say
we are starting an shd and it is attached to a volume.
While we are trying to link the pid file to the running process,
it could have been deleted by the thread that is doing the volume stop.
Change-Id: I29a00352102877ce09ea3f376ca52affceb5cf1a
Updates: bz#1722541
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glusterfs-fops.h was moved to rpc/xdr to support compound fops.
(ref: https://review.gluster.org/14032, 2f945b86d3)
This was fine as long as all these header files were in single
include directory after 'install'. With the move to separate out
glusterfs specific header files into another directory inside
/usr/include (ref: https://review.gluster.org/21746, 20ef211cfa),
glusterfs-fops.h file was not in the proper path when an external
.c file tried to include any of glusterfs specific .h file (like
xlator.h).
Now that we have removed compound-fops, none of the enums
declared in glusterfs-fops.h are actually used on the wire
anymore. Hence, it makes sense to move this file to libglusterfs/src
as a single point of definition. With this change, external
programs can use glusterfs header files.
Also remove some enum definitions which are not used in the code
anymore.
Updates: bz#1636297
Change-Id: I423c44d3dbe2efc777299c544ece3cb172fc7e44
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two problems have been identified that caused gluster's memory
usage to be twice as high as required.
1. An off-by-one error caused all objects allocated from the memory
pools to be taken from a pool bigger than required. Since each pool
corresponds to a size equal to a power of two, this was wasting half
of the available memory.
2. The header information used for accounting on each memory object was
not taken into consideration when searching for a suitable memory
pool. It was added later when each individual block was allocated.
This made this space "invisible" to memory accounting.
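A pool selection that avoids both problems might be sketched as follows. The constants and `demo_pool_power` are hypothetical values chosen for the example and are not taken from the real mem-pool code.

```c
#include <stddef.h>

#define DEMO_POOL_SMALLEST 7  /* hypothetical: smallest pool holds 2^7 bytes */
#define DEMO_POOL_LARGEST 20  /* hypothetical: largest pool holds 2^20 bytes */
#define DEMO_HEADER_SIZE 32   /* hypothetical per-object accounting header */

/*
 * Returns the exponent of the smallest pool that fits header + payload,
 * or -1 when the request is too large for any pool.
 */
int
demo_pool_power(size_t payload)
{
    /* Count the header before choosing the pool, not afterwards. */
    size_t total = payload + DEMO_HEADER_SIZE;
    int power = DEMO_POOL_SMALLEST;

    /* Strict '<': an exact fit stays in its pool instead of moving up one. */
    while (power <= DEMO_POOL_LARGEST && ((size_t)1 << power) < total)
        power++;

    return power <= DEMO_POOL_LARGEST ? power : -1;
}
```

With these example constants, a 96-byte payload plus the header fills the 128-byte pool exactly, while a 97-byte payload correctly moves to the 256-byte pool.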
Credits: Thanks to Nithya Balachandran for identifying this problem and
testing this patch.
Fixes: bz#1722802
Change-Id: I90e27ad795fe51ca11c13080f62207451f6c138c
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a good chance that the inode on which unref came has already been
zero refed and added to the purge list. This can happen when the inode table is
being destroyed (glfs_fini is something which destroys the inode table).
Consider a directory 'a' which has a file 'b'. As part of inode table destruction,
zero refing of inodes does not happen from leaf to the root. It happens in the order
in which inodes are present in the list. So, in this example, the dentry of 'b' would have its
parent set to the inode of 'a'. So if 'a' gets zero refed first (as part of
inode table cleanup) and then 'b' has to be zero refed, then dentry_unset is called on
the dentry of 'b' and it further goes on to call inode_unref on b's parent which is
'a'. In this situation, GF_ASSERT would be called as the refcount of 'a' has
already been set to zero.
So, return the inode (in the function inode_unref without doing anything) if the
inode table cleanup has already started and the inode's refcount is zero.
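The early return can be sketched with simplified types. `demo_table`, `demo_inode` and the `cleanup_started` flag are hypothetical; the real inode_t and inode_table_t carry locks and many more fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified table and inode. */
struct demo_table {
    bool cleanup_started;
};

struct demo_inode {
    struct demo_table *table;
    unsigned ref;
};

struct demo_inode *
demo_inode_unref(struct demo_inode *inode)
{
    if (!inode)
        return NULL;

    /* During table teardown a parent may already be at zero refs. */
    if (inode->table->cleanup_started && inode->ref == 0)
        return inode;

    assert(inode->ref > 0); /* outside teardown, unref at zero is a bug */
    inode->ref--;
    return inode;
}
```

The assertion is kept for the normal path, so an unbalanced unref outside of table destruction is still caught.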
Change-Id: I91e0a807d5c9ce0daae5a611c38da379fd11076e
fixes: bz#1722546
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|