| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
dht_opendir should wind the open to all subvols
whether or not local->subvols is set. This is
because dht_readdirp winds the calls to all subvols.
Change-Id: I67a96b06dad14a08967c3721301e88555aa01017
updates: bz#1566820
Signed-off-by: N Balachandran <nbalacha@redhat.com>
(cherry picked from commit c4251edec654b4e0127577e004923d9729bc323d)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch limits itself to only handling the case
where no file (data or linkto) exists on the subvol.
Additional cases to be handled:
1. A linkto file was found on the only child subvol. This currently
calls dht_lookup_everywhere which eventually deletes it. It can be
deleted directly as it will not be pointing to a valid subvol.
2. Directory lookups - locking might be unnecessary in some cases.
> Change-Id: I940ba34531f2aaee1d36fd9ca45ecfd46be662a4
> BUG: 1546620
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
Change-Id: I940ba34531f2aaee1d36fd9ca45ecfd46be662a4
BUG: 1548270
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dht_populate_inode_for_dentry tries to update the layout
for the '..' entry when listing the root of the volume.
This entry does not correspond to an entry in the volume
and therefore does not have a gfid or a layout on disk,
causing layout processing to fail.
> Change-Id: I2b7470e1c5e20d87b5545160697f24d041045140
> BUG: 1537457
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
Change-Id: I2b7470e1c5e20d87b5545160697f24d041045140
BUG: 1539516
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Non-privileged users cannot delete linkto
files. However the failure to unlink a stale linkto
causes DHT to fail the lookup with EIO and hence
prevent access to the file.
> Change-Id: Id295362d41e52263790694602f36f1219f0646a2
> BUG: 1542318
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
Change-Id: Id295362d41e52263790694602f36f1219f0646a2
BUG: 1543016
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixed an issue in dht_populate_inode_for_dentry where a layout is
set in the inode without checking if it is already set. This overwrites
the value each time without freeing the already existing layout.
Also includes the changes in https://review.gluster.org/19471
> Change-Id: I651bf539a0b82b4ddc4c355890c16a8e91f5f1fd
> BUG: 1541264
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
Change-Id: I651bf539a0b82b4ddc4c355890c16a8e91f5f1fd
BUG: 1541267
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The dht_(f)xattrop implementation did not implement
migration phase1/phase2 checks which could cause issues
with rebalance on sharded volumes.
This does not solve the issue where fops may reach the target
out of order.
> Change-Id: I2416fc35115e60659e35b4b717fd51f20746586c
> BUG: 1471031
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
Change-Id: I2416fc35115e60659e35b4b717fd51f20746586c
BUG: 1540224
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Superflous dentries that cannot be fit in the buffer size provided by
kernel are thrown away by fuse-bridge. This means,
* the next readdir(p) seen by readdir-ahead would have an offset of a
dentry returned in a previous readdir(p) response. When readdir-ahead
detects non-monotonic offset it turns itself off which can result in
poor readdir performance.
* readdirp can be cpu-intensive on brick and there is no point to read
all those dentries just to be thrown away by fuse-bridge.
So, the best strategy would be to fill the buffer optimally - neither
overfill nor underfill.
> Change-Id: Idb3d85dd4c08fdc4526b2df801d49e69e439ba84
> BUG: 1492625
> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
(cherry picked from commit e785faead91f74dce7c832848f2e8f3f43bd0be5)
Change-Id: Idb3d85dd4c08fdc4526b2df801d49e69e439ba84
BUG: 1478411
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... in readdirp response if dentry points to a directory inode. This
is a special case where the entire layout is stored in one single
subvolume and hence no need for lookup to construct the layout
>Change-Id: I44fd951e2393ec9dac2af120469be47081a32185
>BUG: 1492625
>Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
(cherry picked from commit 59d1cc720f52357f7a6f20bb630febc6a622c99c)
Change-Id: I44fd951e2393ec9dac2af120469be47081a32185
BUG: 1478411
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In DHT, after locks on all subvolumes are acquired, it would perform the
following steps sequentially,
1. send remove dir on all other subvolumes except the hashed one in a loop;
2. wait for all pending rmdir to be done
3. remove dir on the hashed subvolume
The problem is that in step 1 there is a check to skip hashed subvolume
in the loop. If the last subvolume to check is actually the
hashed one, and step 3 is quickly done before the last and hashed
subvolume is checked, by accessing shared context data be destroyed in
step 3, would cause a crash.
Fix by saving shared data in a local variable to access later in the
loop.
> BUG: 1490642
> Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
(cherry picked from commit 206120126d455417a81a48ae473d49be337e9463)
Change-Id: I8db7cf7cb262d74efcb58eb00f02ea37df4be4e2
BUG: 1505221
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Comparing the uuid string of the local node against that stored in the
local_subvol information is inefficient, especially as it is
done for every file to be migrated. The code has now been changed
to set the value of info to 1 if the nodeuuid is that of the node
making the comparison so this becomes an integer comparison.
> BUG: 1451434
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
> https://review.gluster.org/#/c/17851
(cherry picked from commit c4a608799a577a4f38139f6bb8a47da8efb0fec3)
Change-Id: I7491d59caad3b71dbf5facc94dcde0cd53962775
BUG: 1500472
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If dht_discover finds data files on more than one subvol,
racing calls to dht_discover_cbk could end up calling
dht_aggregate_xattr which could delete dictionary data
that is being accessed by higher layer translators.
Fixed to call dht_aggregate_xattr only for directories and
consider only the first file to be found.
> BUG: 1484709
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
> Reviewed-on: https://review.gluster.org/18137
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Change-Id: I4f3d2a405ec735d4f1bb33a04b7255eb2d179f8a
BUG: 1486538
(cherry picked from commit 9420022df0962b1fa4f3ea3774249be81bc945cc)
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://review.gluster.org/18146
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add EBADF handling for dht_fremovexattr and dht_fsetxattr.
> BUG: 1476665
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
> Reviewed-on: https://review.gluster.org/17999
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
(cherry picked from commit 747a08d34e2a1e94d7fce68a3577370288bb1955)
Change-Id: Ide0d5812dae79655d2565157e5baabcd753b4309
BUG: 1479303
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://review.gluster.org/18012
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The local->call_cnt was being accessed and updated inside
the loop where the entries were being processed and the calls
were being wound.
This could end up in a scenario where the local->call_cnt became
0 before the processing was complete causing the crash when the
next entry was being processed.
Change-Id: I930f61f1a1d1948f90d4e58e80b7d6680cf27f2f
BUG: 1472949
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://review.gluster.org/17825
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set names to threads on creation for easier
debugging.
Output of top -H -p <PID-OF-GLUSTERFSD>
Before:
19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd
19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd
19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterfsd
19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
After:
19773 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19774 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustertimer
19775 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterfsd
19776 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustermemsweep
19777 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc0
19778 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glustersproc1
19779 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll0
19780 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteridxwrker
19781 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusteriotwr0
19782 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrssign
19783 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterbrswrker
19784 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterclogecon
19785 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd0
19786 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd1
19787 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.01 glusterclogd2
19789 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixjan
19790 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixfsy
25178 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll1
5398 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterepoll2
7881 root 20 0 1301.3m 12.6m 8.4m S 0.0 0.1 0:00.00 glusterposixhc
Change-Id: Id5f333755c1ba168a2ffaa4fce6e71c375e10703
BUG: 1254002
Updates: #271
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-on: https://review.gluster.org/11926
Reviewed-by: Prashanth Pai <ppai@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If an fd is opened on a file, the file is migrated
and the cached subvol is updated in the inode_ctx
before an fd based fop is sent, the fop is sent to
the dst subvol on which the fd is not opened.
This causes the FOP to fail with EBADF.
Now, every fd based fop will check to see that the fd
has been opened on the dst subvol before winding it down.
Change-Id: Id92ef5eb7a5b5226688e2d2868b15e383f5f240e
BUG: 1465075
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://review.gluster.org/17630
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Afr has introduced a new key GF_XATTR_LIST_NODE_UUIDS_KEY,
through which rebalance will figure out its local subvolumes.(Reference
bugid=1463250)
key: GF_XATTR_NODE_UUID_KEY will continue to serve it's old
purpose of returning the first afr chiild.
test: prove tests/basic/distribute/rebal-all-nodes-migrate.t
Change-Id: I4d602feda2a05b29d2210c712a07a4ac6b8bc112
BUG: 1463648
Signed-off-by: Susant Palai <spalai@redhat.com>
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://review.gluster.org/17595
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a similar issue fixed for rename in https://review.gluster.org/#/c/16016/
For hardlinks, if cached and hashed subvolumes are different, then it will first
create linkto file in hashed using root permission, but actually hardlink creation
fails with EACESS and stale linkto file is never removed.All the followup hardlink
calls with file name will result ESTALE because linktofile creation fails with EEXIST
and follow up lookup on linkto file returns gfid-mismatching(old linkto file) and
finally fails with ESTALE
Steps to produce :
(From link/00.t test from posix-testsuite)
Steps executed in script
* create a file "abc" using root
* change the ownership of file to a non root user
* create hardlink "link" for "abc" using a non root user, it fails with EACESS
* delete "abc"
* create directory "abc" using root
* again try to create hadrlink "link" for "abc" using non root user, fails with ESTALE
Also tried to fix other bugs in dht_linkfile_create_cbk() and posix_lookup.
Thanks Susant for the help in debugging the issue and suggestion for this patch.
Change-Id: I7a5a1899d3fd1fdb13578b37f9d52a084492e35d
BUG: 1452084
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
Reviewed-on: https://review.gluster.org/17331
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Empty directories were not being considered while
calculating rebalance estimates leading to negative
time-left values being displayed as part of the
rebalance status.
Change-Id: I48d41d702e72db30af10e6b87b628baa605afa98
BUG: 1457985
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://review.gluster.org/17448
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dht_readdirp must unwind with list of entries only after
the entire buffer requested by kernel is filled to avoid
extra syscalls occuring when returning partially filled
buffer. Also wind readdir call to next subvol on reaching
EOD for directory on that subvol to avoid extra network call.
Change-Id: If2e1a2722f813d95457c7542bff25fef56c7a041
BUG: 1356453
Signed-off-by: Sakshi <sabansal@redhat.com>
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: https://review.gluster.org/12271
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On demand migration of files i.e. migration done by clients
triggered by a setfattr was broken.
Dependency on defrag led to crash when migration was triggered from
client.
Note: This functionality is not available for tiered volumes. Migration
from tier served client will fail with ENOTSUP.
usage (But refer to the steps mentioned below to avoid any issues) :
setfattr -n "trusted.distribute.migrate-data" -v "1" <filename>
The purpose of fixing the on-demand client migration was to give a
workaround where the user has lots of empty directories compared to
files and want to do a remove-brick process.
Here are the steps to trigger file migration for remove-brick process from
client. (This is highly recommended to follow below steps as is)
Let's say it is a replica volume and user want to remove a replica pair
named brick1 and brick2. (Make sure healing is completed before you run
these steps)
Step-1: Start remove-brick process
- gluster v remove-brick <volname> brick1 brick2 start
Step-2: Kill the rebalance daemon
- ps aux | grep glusterfs | grep rebalance\/ | awk '{print $2}' | xargs kill
Step-3: Do a fresh mount as mentioned here
- glusterfs -s ${localhostname} --volfile-id rebalance/$volume-name /tmp/mount/point
Step-4: Go to one of the bricks (among brick1 and brick2)
- cd <brick1 path>
Step-5: Run the following command.
- find . -not \( -path ./.glusterfs -prune \) -type f -not -perm 01000 -exec bash -c 'setfattr -n "distribute.fix.layout" -v "1" ${mountpoint}/$(dirname '{}')' \; -exec setfattr -n "trusted.distribute.migrate-data" -v "1" ${mountpoint}/'{}' \;
This command will ignore the linkto files and empty directories. Do a fix-layout of
the parent directory. And trigger a migration operation on the files.
Step-6: Once this process is completed do "remove-brick force"
- gluster v remove-brick <volname> brick1 brick2 force
Note: Use the above script only when there are large number of empty directories.
Since the script does a crawl on the brick side directly and avoids directories those
are empty, the time spent on fixing layout on those directories are eliminated(even if the script
does not do fix-layout on empty directories, post remove-brick a fresh layout will be built
for the directory, hence not affecting application continuity).
Detailing the expectation for hardlink migartion with this patch:
Hardlink is migrated only for remove-brick process. It is highly essential
to have a new mount(step-3) for the hardlink migration to happen. Why?:
setfattr operation is an inode based operation. Since, we are doing setfattr from
fuse mount here, inode_path will try to build path from the linked dentries to the inode.
For a file without hardlinks the path construction will be correct. But for hardlinks,
the inode will have multiple dentries linked.
Without fresh mount, inode_path will always get the most recently linked dentry.
e.g. if there are three hardlinks named dir1/link1, dir2/link2, dir3/link3, on a client
where these hardlinks are looked up, inode_path will always return the path dir3/link3
if dir3/link3 was looked up most recently. Hence, we won't be able to create linkto
files for all other hardlinks on destination (read gf_defrag_handle_hardlink for more details
on hardlink migration).
With a fresh mount, the lookup and setfattr become serialized. e.g. link2 won't be
looked up until link1 is looked up and migrated. Hence, inode_path will always have the correct
path, in this case link1 dentry is picked up(as this is the most recently looked up inode) and
the path is built right.
Note: If you run the above script on an existing mount(all entries looked up), hard links may
not be migrated, but there should not be any other issue. Please raise a bug, if you find any
issue.
Tests: Manual
Change-Id: I9854cdd4955d9e24494f348fb29ba856ea7ac50a
BUG: 1450975
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: https://review.gluster.org/17115
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Rebalance compares the node-uuid of a file against its own
to and migrates a file only if they match. However, the
current behaviour in both AFR and EC is to return
the node-uuid of the first brick in a replica set for all
files. This means a single node ends up migrating all
the files if the first brick of every replica set is on the
same node.
Fix:
AFR and EC will return all node-uuids for the replica set.
The rebalance process will divide the files to be migrated
among all the nodes by hashing the gfid of the file and
using that value to select a node to perform the migration.
This patch makes the required DHT and tiering changes.
Some tests in rebal-all-nodes-migrate.t will need to be
uncommented once the AFR and EC changes are merged.
Change-Id: I5ce41600f5ba0e244ddfd986e2ba8fa23329ff0c
BUG: 1366817
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://review.gluster.org/17239
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Amar Tumballi <amarts@redhat.com>
Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using local->call_cnt to check STACK_WINDs can
cause dht_rmdir_do to be called erroneously if
dht_rmdir_readdirp_cbk unwinds before we check if
local->call_cnt is zero in dht_rmdir_opendir_cbk.
This can cause frame corruptions and crashes.
Thanks to Shyam (srangana@redhat.com) for the
analysis.
Change-Id: I5362cf78f97f21b3fade0b9e94d492002a8d4a11
BUG: 1451083
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://review.gluster.org/17305
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Id84bc87e48f435573eba3b24d3fb3c411fd2445d
BUG: 1440051
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: https://review.gluster.org/17126
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Removing redundant logs were introduced in
https://review.gluster.org/#/c/17065/
Change-Id: I0d6055488b51a13c91d2121e87f653cdb94888b0
BUG: 1445590
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://review.gluster.org/17118
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Susant Palai <spalai@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Design doc: https://review.gluster.org/16876
Directory creation is now synchronized with blocking inodelk of the
parent on the hashed subvolume followed by the entrylk on the hashed
subvolume between dht_mkdir, dht_rmdir, dht_rename_dir and lookup
selfheal mkdir.
To maintain internal consistency of directories across all subvols of
dht, we need locks. Specifically we are interested in:
1. Consistency of layout of a directory. Only one writer should modify
the layout at a time. A writer (layout setting during directory heal
as part of lookup) shouldn't modify the layout while there are
readers (all other fops like create, mkdir etc., which consume
layout) and readers shouldn't read the layout while a writer is in
progress. Readers can read the layout simultaneously. Writer takes
a WRITE inodelk on the directory (whose layout is being modified)
across ALL subvols. Reader takes a READ inodelk on the directory
(whose layout is being read) on ANY subvol.
2. Consistency of directory namespace across subvols. The path and
associated gfid should be same on all subvols. A gfid should not be
associated with more than one path on any subvol. All fops that can
change directory names (mkdir, rmdir, renamedir, directory creation
phase in lookup-heal) takes an entrylk on hashed subvol of the
directory.
NOTE1: In point 2 above, since dht takes entrylk on hashed subvol of a
directory, the transaction itself is a consumer of layout on
parent directory. So, the transaction is a reader of parent
layout and does an inodelk on parent directory just like any
other layout reader. So a mkdir (dir/subdir) would:
> Acquire a READ inodelk on "dir" on any subvol.
> Acquire an entrylk (dir, "subdir") on hashed subvol of "subdir".
> creates directory on hashed subvol and possibly on non-hashed subvols.
> UNLOCK (entrylk)
> UNLOCK (inodelk)
NOTE2: mkdir fop while setting the layout of the directory being created
is considered as a reader, but NOT a writer. The reason is for
a fop which can consume the layout of a directory to come either
of the following conditions has to be true:
> mkdir syscall from application has to complete. In this case no
need of synchronization.
> A lookup issued on the directory racing with mkdir has to complete.
Since layout setting by a lookup is considered as a writer, only
one of either mkdir or lookup will set the layout.
Code re-organization:
All the lock related routines are moved to "dht-lock.c" file.
New wrapper function is introduced to take blocking inodelk
followed by entrylk 'dht_protect_namespace'
Updates #191
Change-Id: I01569094dfbe1852de6f586475be79c1ba965a31
Signed-off-by: Kotresh HR <khiremat@redhat.com>
BUG: 1443373
Reviewed-on: https://review.gluster.org/15472
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This bug was causing VMs to pause during rebalance. When qemu winds
down a STAT, shard fills the trusted.glusterfs.shard.file-size attribute
in the req dict which DHT doesn't wind its STAT fop with upon detecting
the file has undergone migration. As a result shard doesn't find the
value to this key in the unwind path, causing it to fail the STAT
with EINVAL.
Also, the same bug exists in other fops too, which is also fixed in
this patch.
Change-Id: Id7823fd932b4e5a9b8779ebb2b612a399c0ef5f0
BUG: 1440051
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: https://review.gluster.org/17085
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rm -rf <dir> fails with ENOENT if dir contains a lot of
stale linkto files. This is because a single
readdirp is sent as part of the rmdir which would return
and delete only as many linkto files on the bricks as would fit
in one readdirp buffer. Running rm -rf <dir> multiple times
will eventually delete all the files. The fix sends readdirp
on each subvol until no more entries are returned.
Change-Id: I447f2d193de4bd8ac16e4541c6b919d22250e39e
BUG: 1442724
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://review.gluster.org/17065
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I6adce98f52e17953f501bc590ff7189cceac3c31
BUG: 1431908
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: https://review.gluster.org/17057
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As readdir-ahead can be loaded as a child of dht, dht has to specify
the xattrs it is intrested in, as part of opendir call itself.
Change-Id: I012ef96cc143b0cef942df78aa7150d85ec38606
BUG: 1431908
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: https://review.gluster.org/16902
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
local->loc.gfid in dht_lookup_directory() will be null-gfid for a fresh lookup.
dht_lookup_dir_cbk() updates local->loc.gfid while in other thread dht_lookup_directory()
is still winding lookup calls to subvolumes so there is a chance of partial gfid being
seen by EC.
We saw in 12x(4+2) volume, ec is receiving an loc where the gfid has last 10 bytes matching
with the gfid of the directory and the first 4 bytes are all-zeros. This is leading to EC
erroring out the lookup with EINVAL which leads to NFS failing lookup with EIO.
snip from gdb:
$37 = (dht_local_t *) 0x7fde5de5b3cc
(gdb) p /x $37->loc.gfid
$39 = {0x3b, 0x82, 0x10, 0x5e, 0x40, 0x65, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5,
0x6c, 0x2c, 0xb8, 0x56}
(gdb) fr 7
state=<optimized out>) at ec-generic.c:837
837 ec_lookup_rebuild(fop->xl->private, fop, cbk);
(gdb) p /x fop->loc[0].gfid
$40 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c,
0x2c, 0xb8, 0x56}
snip from log:
[2017-01-29 03:22:30.132328] W [MSGID: 122019]
[ec-helpers.c:354:ec_loc_gfid_check] 0-butcher-disperse-4: Mismatching GFID's
in loc [2017-01-29 03:22:30.132709] W [MSGID: 112199]
[nfs3-helpers.c:3515:nfs3_log_newfh_res] 0-nfs-nfsv3:
/linux-4.9.5/Documentation => (XID: b27b9474, MKDIR: NFS: 5(I/O error), POSIX:
5(Input/output error)), FH: exportid 00000000-0000-0000-0000-000000000000, gfid
00000000-0000-0000-0000-000000000000, mountid
00000000-0000-0000-0000-000000000000 [Invalid argument]
Fix:
update local->loc.gfid in last-call to make sure there are no races.
BUG: 1438411
Change-Id: Ifcb7e911568c1f1f83123da6ff0cf742b91800a0
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: https://review.gluster.org/16986
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
My patch at https://review.gluster.org/16419 is resulting
in core dumps everytime I run tests/features/nuke.t.
Turns out dht, upon successfully "nuking" a directory,
which was initiated through a setxattr, unwinds the operation
with rmdir fop signature, resulting in readdir-ahead casting
a struct iatt (preparent) to dict_t, leading to a crash.
Change-Id: If5f50417be9eb93e731b06c79b9bf027e5dd4d55
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: https://review.gluster.org/16829
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix up use after free bugs and dead code
Change-Id: I8f79ed6b5108926c1fac31c147b5ecba79d10785
BUG: 1424905
Signed-off-by: Nigel Babu <nigelb@redhat.com>
Reviewed-on: https://review.gluster.org/16666
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Coverty warn of the defect.
Change-Id: Ie86684520e1d5b41237ab8d3247c24564a1a8639
BUG: 1424802
Signed-off-by: Michael Scherer <misc@redhat.com>
Reviewed-on: https://review.gluster.org/16673
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Tested-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Susant Palai <spalai@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Corrected the op_errno assignments and NULL checks in
the dht_sexattr2 and dht_removexattr2 functions. Earlier,
they unwound with the default EINVAL op_errno if the
file had been deleted.
Change-Id: Iaf837a473d769cea40132487a966c7f452990071
BUG: 1421653
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://review.gluster.org/16610
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: MOHIT AGRAWAL <moagrawa@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is sufficient to pass an int value as opposed to a "yes" against the
DHT_IATT_IN_XDATA_KEY key since all posix cares about is whether the
key is present in the dict or not. Also note that this patch does not
violate backward compatibility since the handling of the key in posix
remains untouched.
Change-Id: I2f881494a257488709c8c1d2002f2d124ddcc089
BUG: 1390050
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: https://review.gluster.org/16591
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: I9c5e65b32e316e6a2fc7e1f5c79fce79386b78e2
BUG: 1401812
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: https://review.gluster.org/16071
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
tierd is implemented by separating from rebalance process.
The commands affected:
1) Attach tier will trigger this process instead of old one
2) tier start and tier start force will also trigger this process.
3) volume status [tier] will show tier daemon as a process instead
of task and normal tier status and tier detach status works.
4) tier stop implemented.
5) detach tier implemented separately along with new detach tier
status
6) volume tier volname status will work using the changes.
7) volume set works
This patch has separated the tier translator from the legacy
DHT rebalance code. It now sends the RPCs from the CLI
to glusterd separate to the DHT rebalance code.
The daemon is now a service, similar to the snapshot daemon,
and can be viewed using the volume status command.
The code for the validation and commit phase are the same
as the earlier tier validation code in DHT rebalance.
The “brickop” phase has been changed so that the status
command can use this framework.
The service management framework is now used.
DHT rebalance does not use this framework.
This service framework takes care of :
*) spawning the daemon, killing it and other such processes.
*) volume set options , which are written on the volfile.
*) restart and reconfigure functions. Restart is to restart
the daemon at two points
1)after gluster goes down and comes up.
2) to stop detach tier.
*) reconfigure is used to make immediate volfile changes.
By doing this, we don’t restart the daemon.
it has the code to rewrite the volfile for topological
changes too (which comes into place during add and remove brick).
With this patch the log, pid, and volfile are separated
and put into respective directories.
Change-Id: I3681d0d66894714b55aa02ca2a30ac000362a399
BUG: 1313838
Signed-off-by: hari gowtham <hgowtham@redhat.com>
Reviewed-on: http://review.gluster.org/13365
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: hari gowtham <hari.gowtham005@gmail.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue:
frame has a void * cookie pointer.
In case of STACK_WIND_COOKIE frame->cookie is assigned
to what is sent by the caller.
In case of STACK_WIND frame->cookie is assigned to point
point to the frame itself.
For ease of coding, at many places, the cookie in the cbk
is used to get the pointer to the next xl. This is
inconsistent when STACK_WIND_TAIL comes into picture.
Eg: dht_setxattr () {
for (i = 0 ; i < conf->subvolume_cnt ; i++) {
STACK_WIND (..dht_checking_pathinfo_cbk,
conf->subvolumes[i] ..);
}
dht_checking_pathinfo_cbk (...void *cookie...) {
prev = cookie;
...
for (i = 0; i < conf->subvolume_cnt; i++) {
if (conf->subvolumes[i] == prev->this) {
...
}
}
}
Consider the below graph:
dht (parent)
readdir-ahead => Doesn't define setxattr and uses STACK_WIND_TAIL
protocol-client
With this graph, when dht_checking_pathinfo_cbk is called,
cookie will have frame pointing to protocol-client.
i.e. prev->this will be protocol-client. But dht was expecting
it to be readdir-ahead as it has stored in conf->subvolumes[i]
Solution:
Hence, as a thumb rule, if cbk is using cookie, then we explicitly
call STACK_WIND_COOKIE.
Change-Id: I83aea1e24c809c5a91a0db7283e908e125471bd4
BUG: 1401812
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/16039
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upcall as a part of setattr, sends an invalidation and the
invalidation carries the resulting stat value. When a file
is converted to linkto files, even then an invalidation
is set and as a result the mountpoint shows the sticky
bit in the stat of the file.
eg: ---------T. 945 root root 0 Nov 8 10:14 hardlink.999
Fix:
When dht recieves a notification of sticky bit change, it updates
the flag, to indicate md-cache to send the subsequent lookup.
Change-Id: Ic2fd7a5b196db0754f9b97072e644e6bf69da606
BUG: 1392713
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15789
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Susant Palai <spalai@redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generally linkto file is created using root user. Consider following
case, a user is trying to rename a file which he is not permitted.
So the rename fails with EACESS and when rename tries to cleanup the
linkto file, it fails.
The above issue happens when rename/00.t test executed on nfs-ganesha
clients :
Steps executed in script
* create a file "abc" using root
* rename the file "abc" to "xyz" using a non root user, it fails with EACESS
* delete "abc"
* create directory "abc" using root
* again try ot rename "abc" to "xyz" using non root user, test hungs here
which slowly leds to OOM kill of ganesha process
RCA put forwarded by Du for OOM kill of ganesha
Note that when we hit this bug, we've a scenario of a dentry being
present as:
* a linkto file on one subvol
* a directory on rest of subvols
When a lookup happens on the dentry in such a scenario, the control flow
goes into an infinite loop of:
dht_lookup_everywhere
dht_lookup_everywhere_cbk
dht_lookup_unlink_cbk
dht_lookup_everywhere_done
dht_lookup_directory (as local->dir_count > 0)
dht_lookup_dir_cbk (sets to local->need_selfheal = 1 as the entry is a linkto file on one of the subvol)
dht_lookup_everywhere (as need_selfheal = 1).
This infinite loop can cause increased consumption of memory due to:
1) dht_lookup_directory assigns a new layout to local->layout unconditionally
2) Most of the functions in this loop do a stack_wind of various fops.
This results in growing of call stack (note that call-stack is destroyed only after lookup response is
received by fuse - which never happens in this case)
Thanks Du for root causing the oom kill and Sushant for suggesting the fix
Change-Id: I1e16bc14aa685542afbd21188426ecb61fd2689d
BUG: 1397052
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
Reviewed-on: http://review.gluster.org/15894
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DHT does not set the layout for newly created
directories as root. This causes EPERM failures
when a non-root user with insufficient permissions
creates directories.
credit: srangana@redhat.com for RCA
Change-Id: Ia646e41665ce172c43c5f01d2707455e8eb374ed
BUG: 1392772
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/15794
Reviewed-by: Susant Palai <spalai@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently these are few events related to child_up/down:
GF_EVENT_CHILD_UP : Issued when any of the protocol client
connects.
GF_EVENT_CHILD_MODIFIED : Issued by afr/dht/ec
GF_EVENT_CHILD_DOWN : Issued when any of the protocol client
disconnects.
These events get modified at the dht/afr/ec layers. Here is a
brief on the same.
DHT:
- All the subvolumes reported once, and atleast one child came
up, then GF_EVENT_CHILD_UP is issued
- connect GF_EVENT_CHILD_UP is issued
- disconnect GF_EVENT_CHILD_MODIFIED is issued
- All the subvolumes disconnected, GF_EVENT_CHILD_DOWN is issued
AFR:
- First subvolume came up, then GF_EVENT_CHILD_UP is issued
- Subsequent subvolumes coming up, results in GF_EVENT_CHILD_MODIFIED
- Any of the subvolumes go down, then GF_EVENT_SOME_CHILD_DOWN is issued
- Last up subvolume goes down, then GF_EVENT_CHILD_DOWN is issued
Until the patch [1] introduced GF_EVENT_SOME_CHILD_UP,
GF_EVENT_CHILD_MODIFIED was issued by afr/dht when any of the subvolumes
go up or down.
Now with md-cache changes, there is a necessity to differentiate between
child up and down. Hence, introducing GF_EVENT_SOME_DESCENDENT_DOWN/UP and
getting rid of GF_EVENT_CHILD_MODIFIED.
[1] http://review.gluster.org/12573
Change-Id: I704140b6598f7ec705493251d2dbc4191c965a58
BUG: 1396038
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15764
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Check for NULL inode before attempting to
set dht inode ctx.
Change-Id: I7693c18445f138221d8417df5e95b118cedb818a
BUG: 1395261
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/15847
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BUG: 1385593
Change-Id: Icfae9e557a284182c6c22e9606fdd641528906f0
Reported-by: Patrick Matthäi <pmatthaei@debian.org>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/15656
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
http://review.gluster.org/14085 fixes a "pragma leak" where the
generated rpc/xdr headers have a pair of pragmas that disable these
warnings. With the warnings disabled, many unused variables have
crept into the code base.
And 14085 won't pass its own smoke test until all these warnings are
fixed.
BUG: 1369124
Change-Id: I6feee863927254ae9f59013112bcc297ad09e89b
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Reviewed-on: http://review.gluster.org/15477
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Dan Lambright <dlambrig@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In a distributed-replicate volume attribute
"replica.split-brain-status" value does not display split-brain
condition though directory is in split-brain.
If directory is in split brain on mutiple replica-pairs
it does not show full list of replica pairs.
Solution: Update the dht_aggregate code to aggregate the xattr
value in this specific condition.
Fix: 1) function getChoices returns the choices from split-brain
status string.
2) function add_opt adding the choices to local buffer to
store in dictionary
3) For the key "replica.split-brain-status" function dht_aggregate
call dht_aggregate_split_brain_xattr to prepare the list.
Test: To verify the patch followed below steps
1) Create a distributed replica volume and create mount point
2) Stop heal daemon
3) Touch file and directories on mount point
mkdir test{1..5};touch tmp{1..5}
4) Down brick process on one of the replica set
pkill -9 glusterfsd
5) Change permission of dir on mount point
chmod 755 test{1..5}
6) Restart brick process on node with force option
7) kill brick process on other node in same replica set
8) Change permission of dir again on mount point
chmod 766 test{1..5}
9) Reexecute same step from 4-9 on other replica set also
10) After check heal status on server it will show dir's are
in split brain on all replica sets
11) After check the replica.split-brain-status attr on mount
point it will show wrong status of split brain.
12) After apply the patch the attribute shows correct value.
BUG: 1368312
Change-Id: Icdfd72005a4aa82337c342762775a3d1761bbe4a
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Reviewed-on: http://review.gluster.org/15201
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Post add-brick event the new brick will have permission of 755
by default. If the root directory permission was other than 755,
that does not get healed to the new brick leading to permission
errors/inconsistencies.
For choosing source of attr heal we can trust the subvols which
have layouts with latest ctime(as part of missing directory heal,
we heal the proper attr). In case none of the subvols have layout,
return ESTALE to retrigger a fresh lookup.
Note: This patch heals the permission of the root directories only.
Since, permission healing of directory is not straight forward and
required intrusive fix, those are not addressed here.
Change-Id: If894e3895d070d46b62d2452e52c1eaafcf56c29
BUG: 1368012
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/15195
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: For healing of uid/gid we check if local->stbuf.ia_ctime is
lesser than stbuf->ia_ctime (received from brick). If yes then uid/gid
is updated to local->prebuf(source of healing).
But we merge local->stbuf also form the newly added brick. So if we
receive response from the newly added brick first and update the
local->stbuf, then local->prebuf will remain empty since the newly added
brick will have the latest ctime among all servers. And this can result
in healing wrong uid/gids to the rest of servers.
Hence, we should update local->stbuf from servers with a layout which
will ignore merging stbufs from newly added bricks.
Change-Id: If4b64f75a0ea669abdbe9f5a3d1d18ff19374c2f
BUG: 1365740
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/15126
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Iad96256218be643b272762b5638a3f6837aff28d
BUG: 1366495
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/15343
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue:
dht_layout is built as a part of lookup only. The layout can be
modified by rebalance process. Since every IO fop is preceded
by a lookup, there are very less issues of stale layout. But
with enhancements of aggressive caching of stats in md-cache,
the lookup will reduce and expose the stale layout issue often.
Solution:
Since stale layout is already an issue on dht, there is already
a plan to fix this at the dht layer, but this fix is not currently
planned for any release. Until this fix comes out, we can have
a workaround where, the upcall will send a notification to md-cache
when a layout xattr is changed. As a part of layout change notification
the existing cache is invalidated and the next lookup will fetch the
latest layout.
This is not a foolproof solution as the window between the layout change
and the next lookup(after invalidation of stat), where there will be stale
layout. But until the final fix comes in, this reduces the stale layout
window.
Change-Id: Iacf871a38b35880c1fc0bc68fe7ce291265e71d4
BUG: 1369638
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15300
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|