summaryrefslogtreecommitdiffstats
path: root/xlators
Commit message (Collapse)AuthorAgeFilesLines
* mount/fuse: Fix the place where graph-switch event is loggedKrutika Dhananjay2017-01-111-3/+4
| | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/16302 fuse-bridge changes the active subvol in fuse_graph_sync(), and not in fuse_graph_setup(). This patch is written to get more accurate timestamp in the logs as to when a graph switch actually happened and in the hope that this will help better in identifying the issue of VM corruption that users are seeing when they add-bricks. Change-Id: I3c8577b87db02a2a6ce6159e7d04cf58a2bda0c1 BUG: 1411613 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16366 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* posix: make sure atime and mtime are set when calling lutimes()Niels de Vos2017-01-081-6/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When overwriting an existing file with O_TRUNC, the 'atime' was set to 0, meaning the Epoch (01-Jan-1970 UTC). However, the 'mtime' gets updated correcty. In case 'atime' or 'mtime' is not passed in the 'struct iatt', the time values passed to the systemcall are taken from the current values are returned by lstat(). Cherry picked from commit 9bed81ada6f91f998e9abd915b18e3f06557cdcb: > Change-Id: I7021b7161dcd6c9a3e515d98f6d4847533c434b3 > BUG: 1401777 > Reported-by: Eivind Sarto <eivindsarto@gmail.com> > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: http://review.gluster.org/16034 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Change-Id: I7021b7161dcd6c9a3e515d98f6d4847533c434b3 BUG: 1411011 Reported-by: Eivind Sarto <eivindsarto@gmail.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/16356 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* afr: Ignore event_generation checks post inode refresh for write txnsRavishankar N2017-01-083-1/+3
| | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/16205/ Before http://review.gluster.org/#/c/16091/, after inode refresh, we failed read txns in case of EIO or event_generation being zero. For write transactions, the check was only for EIO. 16091 re-factored the code to fail both read and write when event_generation=0. This seems to have caused a regression as explained in the BZ. This patch restores that behaviour in afr_txn_refresh_done(). Change-Id: Id763ed2d420b6d045d4505893a18959d998c91a3 BUG: 1378547 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16322 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* dht/rebalance: remove errno check for failure detectionSusant Palai2017-01-081-16/+13
| | | | | | | | | | | | | | | | | | | | | > BUG: 1410355 > Change-Id: I867419ca36a81ef7209e6911a46c1c2c898b8eab > Signed-off-by: Susant Palai <spalai@redhat.com> > Reviewed-on: http://review.gluster.org/16328 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 451ca272d12f2d49522c845d53585520f71525f8) Change-Id: If05094346fc1fc48f4982db6a195740171ae2fad BUG: 1410764 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/16348 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* glusterd/geo-rep: Fix geo-rep config issueKotresh HR2017-01-081-14/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Geo-replication config commands fail when geo-rep status is in 'Created'. Cause: During staging phase of geo-rep config, it checks for the existence of 'monitor.pid' file. But the pid file gets created only on start of geo-rep. Hence it fails. Fix: Do not check for the existence of pid-file during staging for config commands. It is not required. > Change-Id: I626d368b249cf0423c7f49b4284465508371f566 > BUG: 1404678 > Signed-off-by: Kotresh HR <khiremat@redhat.com> > Reviewed-on: http://review.gluster.org/16132 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Aravinda VK <avishwan@redhat.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Change-Id: I626d368b249cf0423c7f49b4284465508371f566 BUG: 1410699 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/16342 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* glusterd (geo-rep): fix unused variable warnings/errorsKaleb S. KEITHLEY2017-01-081-20/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the fd leak when geo-rep config command is run while geo-rep is running. NOTE: The patch is backport of http://review.gluster.org/15263 which was one of the collection of patches to fix the bug https://bugzilla.redhat.com/show_bug.cgi?id=1369124 > Change-Id: I2edacd3d0f3924b1be0f0398ba9ac076459c5a61 > BUG: 1369124 > Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> > Reviewed-on: http://review.gluster.org/15263 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com> Change-Id: I2edacd3d0f3924b1be0f0398ba9ac076459c5a61 BUG: 1410708 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/16344 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht: Fix dict_leak in migration check tasksN Balachandran2017-01-081-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | Fixed a memleak where dict was not being unrefed in the dht_migration_complete_check_task and dht_rebalance_inprogress_task functions. BUG: 1410369 > Change-Id: I3d42e9a2e5c8596c985bf6431a68fd3905227383 > BUG: 1409186 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/16308 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: MOHIT AGRAWAL <moagrawa@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> > (cherry picked from commit 11b6a2c9fc5232b58774cab29873406c0fbfef19) Change-Id: I4e92e4c87b52f6f5ffb859005aa3d525225b4bda Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/16331 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* afr: allow I/O when favorite-child-policy is enabledRavishankar N2017-01-088-61/+388
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Currently, I/O on a split-brained file fails even when the favorite-child-policy is set until the self-heal is complete. Fix: If a valid 'source' is found using the set favorite-child-policy, inspect and reset the afr pending xattrs on the 'sinks' (inside appropriate locks), refresh the inode and then proceed with the read or write transaction. The resetting itself happens in the self-heal code and hence can also happen in the client side background-heal or by the shd's index-heal in addition to the txn code path explained above. When it happens in via heal, we also add checks in undo-pending to not reset the sink xattrs again. > Reviewed-on: http://review.gluster.org/15673 > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626 BUG: 1378547 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Simon Turcotte-Langevin <simon.turcotte-langevin@ubisoft.com> Reviewed-on: http://review.gluster.org/16091 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/dht: Fix memory corruption while accessing regex stored inRaghavendra G2017-01-033-45/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | private If reconfigure is executed parallely (or concurrently with dht_init), there are races that can corrupt memory. One such race is modification of regexes stored in conf (conf->rsync_regex_valid and conf->extra_regex_valid) through dht_init_regex. With change [1], reconfigure codepath can get executed parallely (with itself or with dht_init) and this fix is needed. Also, a reconfigure can race with any thread doing dht_layout_search, resulting in dht_layout_search accessing regex freed up by reconfigure (like in bz 1399134). [1] http://review.gluster.org/15046 >Change-Id: I039422a65374cf0ccbe0073441f0e8c442ebf830 >BUG: 1399134 >Signed-off-by: Raghavendra G <rgowdapp@redhat.com> >Reviewed-on: http://review.gluster.org/15945 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: N Balachandran <nbalacha@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Change-Id: I039422a65374cf0ccbe0073441f0e8c442ebf830 BUG: 1399423 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit 64451d0f25e7cc7aafc1b6589122648281e4310a) Reviewed-on: http://review.gluster.org/15793 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* logging: Avoid re-initing log level in io-statsVijay Bellur2017-01-021-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If log level is already set via api or command line, initialization of io-stats xlator overwrites the log level to GF_LOG_INFO. This patch prevents re-initialization of log level if already set. Backport of: > Change-Id: I1f74d94ef8068b95ec696638c0a8b17d8d71aabe > BUG: 1368882 > Signed-off-by: Vijay Bellur <vbellur@redhat.com> > Reported-by: Colin Lord <clord@redhat.com> > Reviewed-on: http://review.gluster.org/15112 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Niels de Vos <ndevos@redhat.com> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Change-Id: I1f74d94ef8068b95ec696638c0a8b17d8d71aabe BUG: 1378384 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/15544 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* dht/rename : Incase of failure remove linkto file properlyJiffin Tony Thottan2017-01-022-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generally linkto file is created using root user. Consider following case, a user is trying to rename a file which he is not permitted. So the rename fails with EACESS and when rename tries to cleanup the linkto file, it fails. The above issue happens when rename/00.t test executed on nfs-ganesha clients : Steps executed in script * create a file "abc" using root * rename the file "abc" to "xyz" using a non root user, it fails with EACESS * delete "abc" * create directory "abc" using root * again try ot rename "abc" to "xyz" using non root user, test hungs here which slowly leds to OOM kill of ganesha process RCA put forwarded by Du for OOM kill of ganesha Note that when we hit this bug, we've a scenario of a dentry being present as: * a linkto file on one subvol * a directory on rest of subvols When a lookup happens on the dentry in such a scenario, the control flow goes into an infinite loop of: dht_lookup_everywhere dht_lookup_everywhere_cbk dht_lookup_unlink_cbk dht_lookup_everywhere_done dht_lookup_directory (as local->dir_count > 0) dht_lookup_dir_cbk (sets to local->need_selfheal = 1 as the entry is a linkto file on one of the subvol) dht_lookup_everywhere (as need_selfheal = 1). This infinite loop can cause increased consumption of memory due to: 1) dht_lookup_directory assigns a new layout to local->layout unconditionally 2) Most of the functions in this loop do a stack_wind of various fops. This results in growing of call stack (note that call-stack is destroyed only after lookup response is received by fuse - which never happens in this case) Thanks Du for root causing the oom kill and Sushant for suggesting the fix Upstream reference : >Change-Id: I1e16bc14aa685542afbd21188426ecb61fd2689d >BUG: 1397052 >Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> >Reviewed-on: http://review.gluster.org/15894 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> >(cherry picked from commit 57d59f4be205ae0c7888758366dc0049bdcfe449) Change-Id: I1e16bc14aa685542afbd21188426ecb61fd2689d BUG: 1401029 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/16015 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* nfs: revalidate lookup converted to fresh lookupMohammed Rafi KC2017-01-024-10/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/15580 when an inode ctx is missing for a linked inode the revalidate lookups are converted to fresh. This could result in sending ESTALE when the gfid are recreated We are not able to reproduce the issue with normal setup, most part of RCA was done with code reading. Possible scenario in which this bug can reproduce, Delete a file and recreate a new file with same name, at the same time from another client process try to list/or access the file. In this case the second client may throw an ESTALE error for such files Thanks to Soumya and Pranith for doing the complete RCA >Change-Id: I73992a65844b09a169cefaaedc0dcfb129d66ea1 >BUG: 1379720 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: http://review.gluster.org/15580 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: soumya k <skoduri@redhat.com> >Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Change-Id: Ifd0d92e29e409e5b23790b677034cfc8f3184d1a BUG: 1394635 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/15840 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cluster/dht: Check for null inodeN Balachandran2017-01-021-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | Check for NULL inode before attempting to set dht inode ctx. > Change-Id: I7693c18445f138221d8417df5e95b118cedb818a > BUG: 1395261 > Signed-off-by: N Balachandran <nbalacha@redhat.com> > Reviewed-on: http://review.gluster.org/15847 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 8313d53accaa22feb14d284fb91245be0a32e16e) Change-Id: I7607d32d38d707dd5d71b98efffd1a458ffe90d7 BUG: 1395510 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15850 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* geo-rep: Do not restart workers when log-rsync-performance config changeAravinda VK2016-12-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | Geo-rep restarts workers when any of the configurations changed. We don't need to restart workers if tunables like log-rsync-performance is modified. With this patch, Geo-rep workers will get new "log-rsync-performance" config automatically without restart. > Reviewed-on: http://review.gluster.org/15816 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Kotresh HR <khiremat@redhat.com> BUG: 1402728 Change-Id: I40ec253892ea7e70c727fa5d3c540a11e891897b Signed-off-by: Aravinda VK <avishwan@redhat.com> (cherry picked from commit a268e2865c21ec8d2b4fed26715e986cfcc66fad) Reviewed-on: http://review.gluster.org/16069 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kotresh HR <khiremat@redhat.com>
* cluster/afr: Fix missing name indices due to EEXIST errorKrutika Dhananjay2016-12-291-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/16286 PROBLEM: Consider a volume with granular-entry-heal and sharding enabled. When a replica is down and a shard is created as part of a write, the name index is correctly created under indices/entry-changes/<dot-shard-gfid>. Now when a read on the same region triggers another MKNOD, the fop fails on the online bricks with EEXIST. By virtue of this being a symmetric error, the failed_subvols[] array is reset to all zeroes. Because of this, before post-op, the GF_XATTROP_ENTRY_OUT_KEY will be set, causing the name index, which was created in the previous MKNOD operation, to be wrongly deleted in THIS MKNOD operation. FIX: The ideal fix would have been for a transaction to delete the name index ONLY if it knows it is the one that created the index in the first place. This would involve gathering information as to whether THIS xattrop created the index from individual bricks, aggregating their responses and based on the various posisble combinations of responses, decide whether to delete the index or not. This is rather complex. Simpler fix would be for post-op to examine local->op_ret in the event of no failed_subvols to figure out whether to delete the name index or not. This can occasionally lead to creation of stale name indices but they won't be affecting the IO path or mess with pending changelogs in any way and self-heal in its crawl of "entry-changes" directory would take care to delete such indices. Change-Id: Icc642a987d1b6a5097562315aecf1263ed35ceb6 BUG: 1408786 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16293 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: use accused matrix instead of readable matrix for deciding healsRavishankar N2016-12-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: afr_replies_interpret() used the 'readable' matrix to trigger client side heals after inode refresh. But for arbiter, readable is always zero. So when `dd` is run with a data brick down, spurious data heals are are triggered. These heals open an fd, causing eager lock to be disabled (open fd count >1) in afr transactions, leading to extra FXATTROPS Fix: Use the accused matrix (derived from interpreting the afr pending xattrs) to decide whether we can start heal or not. > Reviewed-on: http://review.gluster.org/16277 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 5a7c86e578f5bbd793126a035c30e6b052177a9f) Change-Id: Ibbd56c9aed6026de6ec42422e60293702aaf55f9 BUG: 1408772 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16291 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* dht/rebalance: reverify lookup failuresSusant Palai2016-12-272-58/+130
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | race: readdirp has read one entry, and doing a lookup on that entry, but user might have renamed/removed that entry just after readdirp but before lookup. Since remove-brick is a costly opertaion,will ingore any ENOENT/ESTALE failures and move on. > Change-Id: I62c7fa93c0b9b7e764065ad1574b97acf51b5996 > BUG: 1408115 > Signed-off-by: Susant Palai <spalai@redhat.com> > Reviewed-on: http://review.gluster.org/15846 > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Change-Id: I62c7fa93c0b9b7e764065ad1574b97acf51b5996 BUG: 1408414 Reviewed-on: http://review.gluster.org/15846 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> (cherry picked from commit c20febcb1ffaef3fa29563987e7a3b554aea27b3) Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/16278
* glusterd: Handle volinfo->refcnt properly during volume start commandAvra Sengupta2016-12-251-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | While running the volume start command, the refcnt of the volume is incremented. At the end of the command, the refcnt should also be decremented. This is currently not the case. This patch, makes sure the refcnt is also decremented at the end of the volume start command. > Reviewed-on: http://review.gluster.org/16108 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 9b1c9395a397e337e4a0acac55b935cb1ce094b7) Change-Id: I017b5039be5948df41dde6bc89d2955d5d18971f BUG: 1404105 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/16114 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd/ganesha : handle volume reset properly for ganesha optionsJiffin Tony Thottan2016-12-231-16/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "gluster volume reset" should first unexport the volume and then delete export configuration file. Also reset option is not applicable for ganesha.enable if volume value is "all". This patch also changes the name of create_export_config into manange_export_config Upstream reference : >Change-Id: Ie81a49e7d3e39a88bca9fbae5002bfda5cab34af >BUG: 1397795 >Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> >Reviewed-on: http://review.gluster.org/15914 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: soumya k <skoduri@redhat.com> >Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> >Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Change-Id: Ie81a49e7d3e39a88bca9fbae5002bfda5cab34af BUG: 1405951 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/16054 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/16197
* protocol/client: Fix potential mem-leaksKrutika Dhananjay2016-12-202-4/+1
| | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/16156 Commit 93eaeb9c93be3232f24e840044d560f9f0e66f71 introduces leaks in INODELK callback where a dict is unserialized twice, leading to dict leaks. Change-Id: I2ad6f4243d78ba30841731d331f5a4a0006827da BUG: 1405886 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16187 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* upcall: Fix 'use after free' in a log messageNiels de Vos2016-12-191-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | There is chance of accessing freed pointer in a log message at TRACE level while cleaning up expired client entries. Cherry picked from commit 212c7600d2070a4414bc89fd7d2c186b5994cd54: > Change-Id: I06b4dad755df63978ab04ca52442bfd4600d139a > BUG: 1404168 > Reported-by: Ravishankar N <ravishankar@redhat.com> > Signed-off-by: Soumya Koduri <skoduri@redhat.com> > Reviewed-on: http://review.gluster.org/16117 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Niels de Vos <ndevos@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> Change-Id: I06b4dad755df63978ab04ca52442bfd4600d139a BUG: 1404583 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/16128 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: soumya k <skoduri@redhat.com>
* cluster/dht: A hard link is lost during rebalance + lookupMohit Agrawal2016-12-141-35/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: A hard link is lost during rebalance + lookup.Rebalance skip files if file has hardlink.In dht_migrate_file __is_file_migratable () function checks if a file has hardlink, if yes file is not migrated but if link is created after call this function then link will lost. Solution: Call __check_file_has_hardlink to check hardlink existence after (S+T) bits in migration process ,if file has hardlink then skip the file for migrate rebalance process. > BUG: 1396048 > Change-Id: Ia53c07ef42f1128c2eedf959a757e8df517b9d12 > Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> > Reviewed-on: http://review.gluster.org/15866 > Reviewed-by: Susant Palai <spalai@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > (cherry picked from commit 71dd2e914d4a537bf74e1ec3a24512fc83bacb1d) BUG: 1399432 Change-Id: I30e21efd5a054d8a3e640ab3ed8aa7955d083926 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/15954 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd/geo-rep: Fix glusterd crashKotresh HR2016-12-132-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: glusterd crashes when geo-rep mountbroker setup is created if the slave user length is more than 8 characters. Cause: _POSIX_LOGIN_NAME_MAX is used which is 9 including NULL byte. Analysis: While the man page says it sufficient for portability, but acutally it's not. Linux allows the creation of username upto 32 characters by default where the max length is 256. And NetBSD's max is 17. Linux: #getconf LOGIN_NAME_MAX 256 NetBSD: #getconf LOGIN_NAME_MAX 17 Fix: Use LOGIN_NAME_MAX instead of _POSIX_LOGIN_NAME_MAX >Reviewed-on: http://review.gluster.org/16053 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Aravinda VK <avishwan@redhat.com> >Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Change-Id: I26b7230433ecbbed6e6914ed39221a478c0266a8 BUG: 1403109 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/16085 Tested-by: Atin Mukherjee <amukherj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/afr: Fix per-txn optimistic changelog initialisationKrutika Dhananjay2016-12-132-29/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/16075 Incorrect initialisation of local->optimistic_change_log was leading to skipped pre-op and post-op even when a brick didn't participate in the txn because it was down. The result - missing granular name index resulting in some entries never getting healed. FIX: Initialise local->optimistic_change_log just before pre-op. Also fixed granular entry heal to create the granular name index in pre-op as opposed to post-op. This is to prevent loss of granular information when during an entry txn, the good (src) brick goes offline before the post-op is done. This would cause self-heal to do conservative merge (since dirty xattr is the only information available), which when granular-entry-heal is enabled, expects granular indices, the lack of which can lead to loss of data in the worst case. Change-Id: Ibc0fbfb3fa21c578e28868d9e30b274e33c12064 BUG: 1403646 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16105 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* uss: snapd should enable SSL if SSL is enabled on volumeRajesh Joseph2016-12-111-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | During snapd graph generation we should check if SSL is enabled on main volume or not. This is because clients will communicate with snapd as if it is communicating to a brick. > Reviewed-on: http://review.gluster.org/15979 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Kaushal M <kaushal@redhat.com> (cherry picked from commit 182f0d12040dab5081ca645a3f370f65cd68b528) Change-Id: I0d7fe86c567b297a8528a48faf06161d4c3cb415 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> BUG: 1400459 Reviewed-on: http://review.gluster.org/15986 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* selfheal: fix memory leak on client side healing queueMateusz Slupny2016-12-043-3/+8
| | | | | | | | | | | | | | | | | | | | | | > Reviewed-on: http://review.gluster.org/15968 > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Reviewed-by: Ravishankar N <ravishankar@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit fb95eb4da6f4fc0b9c69e3b159a2214fe47e6d1d) Change-Id: I2beaba829710565a3246f7449a5cd21755cf5f7d BUG: 1400927 Signed-off-by: Mateusz Slupny <mateusz.slupny@appeartv.com> Reviewed-on: http://review.gluster.org/16012 Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr: CLI for granular entry heal enablement/disablementKrutika Dhananjay2016-11-292-15/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/15747 When there are already existing non-granular indices created that are yet to be healed, if granular-entry-heal option is toggled from 'off' to 'on', AFR self-heal whenever it kicks in, will try to look for granular indices in 'entry-changes'. Because of the absence of name indices, granular entry healing logic will fail to heal these directories, and worse yet unset pending extended attributes with the assumption that are no entries that need heal. To get around this, a new CLI is introduced which will invoke glfsheal program to figure whether at the time an attempt is made to enable granular entry heal, there are pending heals on the volume OR there are one or more bricks that are down. If either of them is true, the command will be failed with the appropriate error. New CLI: gluster volume heal <VOL> granular-entry-heal {enable,disable} Change-Id: I342e0390f847fcb015a50ef58aedfcbcb58f4ed3 BUG: 1398501 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15942 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* protocol/server: capture offset in seekRavishankar N2016-11-293-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: http://review.gluster.org/11482 implemented seek FOP but http://review.gluster.org/#/c/14137/ 'undid' the change where we pack the offset returned by seek in server xlator before sending it to the client. As a result, seek always returns zero to the client for SEEK_HOLE/ SEEK_DATA. Fix: I think 14137 removed it unintentionally, hence adding it back again. > Reviewed-on: http://review.gluster.org/15920 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit cc37e5929d1e3ea4eaf4c4576a82066bf131ad05) Signed-off-by: Ravishankar N <ravishankar@redhat.com> Change-Id: I67a1f7b53214b043c5291f5704be4a50b698f91c BUG: 1399130 Reviewed-on: http://review.gluster.org/15943 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/index: Delete granular entry indices of already healed directories ↵Krutika Dhananjay2016-11-261-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | during crawl Backport of: http://review.gluster.org/15880 If granular name indices are already in existence for a volume, and before they are healed, granular entry heal be disabled, a crawl on indices/xattrop will clear the changelogs on these directories. When their corresponding entry-changes indices are crawled subsequently, if it is found that the directories don't need heal anymore, the granular indices are not cleaned up. This patch fixes that problem by ensuring that the zero-xattrop also deletes the stale indices at the level of index translator. Change-Id: Iae0a560c1c9d37b083cad89f16d3dcf83c4f7dc7 BUG: 1398501 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15927 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* glusterd: clean up old port and allocate new one on every restartAtin Mukherjee2016-11-235-30/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/15005/9. GlusterD as of now was blindly assuming that the brick port which was already allocated would be available to be reused and that assumption is absolutely wrong. Solution : On first attempt, we thought GlusterD should check if the already allocated brick ports are free, if not allocate new port and pass it to the daemon. But with that approach there is a possibility that if PMAP_SIGNOUT is missed out, the stale port will be given back to the clients where connection will keep on failing. Now given the port allocation always start from base_port, if everytime a new port has to be allocated for the daemons, the port range will still be under control. So this fix tries to clean up old port using pmap_registry_remove () if any and then goes for pmap_registry_alloc () This patch is being ported to 3.8 branch because, the brick process blindly re-using old port, without registering with the pmap server, causes snapd daemon to not start properly, even though snapd registers with the pmap server. With this patch, all the brick processes and snapd will register with the pmap server to either get the same port, or a new port, and avoid port collision. > Reviewed-on: http://review.gluster.org/15005 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Avra Sengupta <asengupt@redhat.com> (cherry picked from commit c3dee6d35326c6495591eb5bbf7f52f64031e2c4) Change-Id: If54a055d01ab0cbc06589dc1191d8fc52eb2c84f BUG: 1369766 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/15308 Tested-by: Avra Sengupta <asengupt@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* xlators/trash : Remove upper limit for trash max file sizeJiffin Tony Thottan2016-11-232-18/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently file which size exceeds more than 1GB never moved to trash directory. This is due to the hard coded check using GF_ALLOWED_MAX_FILE_SIZE. Upstream reference : >Change-Id: I2ed707bfe1c3114818896bb27a9856b9a164be92 >BUG: 1386766 >Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> >Reviewed-on: http://review.gluster.org/15689 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Anoop C S <anoopcs@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Vijay Bellur <vbellur@redhat.com> >(cherry picked from commit cd9be49f6fe05d424989c0686a7e55a3f3ead27e) Change-Id: I2ed707bfe1c3114818896bb27a9856b9a164be92 BUG: 1392364 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/15785 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Anoop C S <anoopcs@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* marker: Fix inode value in loc, in setxattr fopPoornima G2016-11-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/15826 On recieving a rename fop, marker_rename() stores the, oldloc and newloc in its 'local' struct, once the rename is done, the xtime marker(last updated time) is set on the file, but sending a setxattr fop. When upcall receives the setxattr fop, the loc->inode is NULL and it crashes. The loc->inode can be NULL only in one valid case, i.e. in rename case where the inode of new loc can be NULL. Hence, marker should have filled the inode of the new_loc before issuing a setxattr. > Reviewed-on: http://review.gluster.org/15826 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Kotresh HR <khiremat@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> (cherry picked from commit 46e5466850311ee69e6ae9a11c2bba2aabadd5de) Change-Id: Id638f678c3daaf4a5c29b970b58929d377ae8977 BUG: 1396418 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15878 Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* upcall: Fix a log levelPoornima G2016-11-181-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | In upcall_cache_invalidation(), the gfid can be NULL in certain valid test cases(eg: entry for ".." in readdirp), hence change the log level from WARNING to DEBUG. Backport of http://review.gluster.org/15777 > Reviewed-on: http://review.gluster.org/15777 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> (cherry picked from commit 68d1480e6056d1be91cde5129a6809642eeee857) Change-Id: Ic90167a0e2076694e9131913114460df7b939b30 BUG: 1394187 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15828 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: When failing fop due to lack of quorum, also log error stringKrutika Dhananjay2016-11-111-11/+12
| | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/15800/ Change-Id: I2dd7ed69a456e8b9e54a4093f14dc16950bef081 BUG: 1393630 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15813 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* performance/open-behind: Avoid deadlock in statedumpPranith Kumar K2016-11-101-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: open-behind is taking fd->lock then inode->lock where as statedump is taking inode->lock then fd->lock, so it is leading to deadlock In open-behind, following code exists: void ob_fd_free (ob_fd_t *ob_fd) { loc_wipe (&ob_fd->loc); <<--- this takes (inode->lock) ....... } int ob_wake_cbk (call_frame_t *frame, void *cookie, xlator_t *this, int op_ret, int op_errno, fd_t *fd_ret, dict_t *xdata) { ....... LOCK (&fd->lock); <<---- fd->lock { ....... __fd_ctx_del (fd, this, NULL); ob_fd_free (ob_fd); <<<--------------- } UNLOCK (&fd->lock); ....... } ================================================================= In statedump this code exists: inode_dump (inode_t *inode, char *prefix) { ....... ret = TRY_LOCK(&inode->lock); <<---- inode->lock ....... fd_ctx_dump (fd, prefix); <<<----- ....... } fd_ctx_dump (fd_t *fd, char *prefix) { ....... LOCK (&fd->lock); <<<------------------ this takes fd-lock { ....... } Fix: Make sure open-behind doesn't call ob_fd_free() inside fd->lock >BUG: 1393259 >Change-Id: I4abdcfc5216270fa1e2b43f7b73445f49e6d6e6e >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15808 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Poornima G <pgurusid@redhat.com> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1393682 Change-Id: I45a0fbed683ef6acb7900df87534927f332fdaaa Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15818 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* posix-acl: check dictionary before using itRajesh Joseph2016-11-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If extended attributes are not present in md-cache it returns NULL as xattr. posix acl xlator should check for NULL before using xattr. If normal and default ACLs are not set on file then md-cache will not contain system.posix_acl_access and system.posix_acl_default extended attributes in its cache. Therefore posix_acl_lookup_cbk should check xattr before using it, otherwise the logs will get filled with dictionary errors. > Reviewed-on: http://review.gluster.org/15769 > Reviewed-by: Raghavendra Talur <rtalur@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Vijay Bellur <vbellur@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit de7fe24663713fff364cfc2b52b675e3e979ee68) Change-Id: Icebf73cf0b313bd3e82ca8cbda63786dd0fa47da BUG: 1392868 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/15799 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* features/shard: Fill loc.pargfid too for named lookups on individual shardsKrutika Dhananjay2016-11-082-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/15788/ On a sharded volume when a brick is replaced while IO is going on, named lookup on individual shards as part of read/write was failing with ENOENT on the replaced brick, and as a result AFR initiated name heal in lookup callback. But since pargfid was empty (which is what this patch attempts to fix), the resolution of the shards by protocol/server used to fail and the following pattern of logs was seen: Brick-logs: [2016-11-08 07:41:49.387127] W [MSGID: 115009] [server-resolve.c:566:server_resolve] 0-rep-server: no resolution type for (null) (LOOKUP) [2016-11-08 07:41:49.387157] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null) (00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82) ==> (Invalid argument) [Invalid argument] Client-logs: [2016-11-08 07:41:27.497687] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.497755] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.498500] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.499680] E [MSGID: 133010] Also, this patch makes AFR by itself choose a non-NULL pargfid even if its ancestors fail to initialize all pargfid placeholders. Change-Id: Ica9e1b5b196ac37aafe6128e7aa0694a07245fdb BUG: 1392846 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15796 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterd/quota: upgrade quota.conf file during an upgradeManikandan Selvaganesh2016-11-084-16/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem ======= When quota is enabled on 3.6, it will have quota conf version in quota.conf as v1.1. This node gets upgraded to 3.7 but it will still have quota conf version as v1.1 until a quota enable/disable/set limit is initiated. When this is not initiated and when this node tries to peer probe a node which is a fresh install of 3.7 (which will have quota conf version as v1.2), then this will result in "Peer rejected" state. This patch fixes the issue. Solution ======== When an upgrade happens from 3.6 to 3.7, quota.conf file needs to be modified as well. With 3.6, in quota.conf the version will be v1.1 and it needs to be changed to v1.2 from 3.7. This is because in 3.7, inode quota feature is introduced. So when an op-version bumpup happens quota.conf needs to be upgraded with quota conf version v1.2 and all the 16 byte uuid needs to be changed to 17 bytes uuid as well. Previously, when the cluster version is upgraded to 3.7, the quota.conf got upgraded as well. But, the upgradation was done only when quota enable/disable/set limit is done. With this patch, the upgradation is done during a cluster op version bump up as well. > Reviewed-on: http://review.gluster.org/15352 > Tested-by: Atin Mukherjee <amukherj@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 4b2cff614462508eef529c5d128e0974720e3f50) Change-Id: Idb5ba29d3e1ea0e45c85d87c952c75da9e0f99f0 BUG: 1392716 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/15791 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com> Tested-by: Manikandan Selvaganesh <manikandancs333@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterd/shared storage: Check for hook-script at stagingAvra Sengupta2016-11-062-6/+25
| | | | | | | | | | | | | | | | | | | | | | | | | Check if S32gluster_enable_shared_storage.sh is present at /var/lib/glusterd/hooks/1/set/post/ at staging before proceeding with the command. Fail the command with the appropriate error message in case it is not present. > Reviewed-on: http://review.gluster.org/15718 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 29587a91716e1e55bd172d63340c40249fb343c9) Change-Id: I84e3912f1cdffb927f8a40d74d52be43ee69388b BUG: 1377448 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15741 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* snapshot: Fix for memory leaks in snapshot code pathAvra Sengupta2016-11-033-12/+47
| | | | | | | | | | | | | | | | | | | > Reviewed-on: http://review.gluster.org/15668 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> (cherry picked from commit 48a4c4e525665115f7e8c478d3bf51764427378d) Change-Id: Idc2cb16574d166e3c0ee1f7c3a485f1acb19fc8c BUG: 1388354 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15720 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* md-cache: Invalidate cache entry for open() with O_TRUNCSoumya Koduri2016-11-031-0/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a file is opened with O_TRUNC flag set, its size gets set to '0'. This case needs to be handled in md-cache to avoid sending incorrect cached stat. This is backport of below mainline patch - http://review.gluster.org/#/c/15618/ > Change-Id: I95d1f8a6634734898883ede010c3e7b0b7eb97d9 > BUG: 1382266 > Signed-off-by: Soumya Koduri <skoduri@redhat.com> > Reviewed-on: http://review.gluster.org/15618 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> > Tested-by: jiffin tony Thottan <jthottan@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > (cherry picked from commit 6ca5d6382f03685b31b12accb095093cf1486603) Change-Id: I92349f5b48aef07f3790db7aae25bfa2ddb5947e BUG: 1391450 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/15771 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* performance/write-behind: fix flush stuck by former failed writesRyan Ding2016-11-031-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the issue is happened in this case: assume a file is opened with fd1 and fd2. 1. some WRITE opto fd1 got error, they were add back to 'todo' queue because of those error. 2. fd2 closed, a FLUSH op is send to write-behind. 3. FLUSH can not be unwind because it's not a legal waiter for those failed write(as func __wb_request_waiting_on() say). and those failed WRITE also can not be ended if fd1 is not closed. fd2 stuck in close syscall. to resolve this issue, we can change the way we determine 2 requests is 'conflict': flush/fsync is not conflict with those write that is not belonged to them. so __wb_pick_winds() can wind the FLUSH op. below is some information when the stuck issue happen: glusterdump logs: [xlator.performance.write-behind.wb_inode] path=/ltp-F9eG0ZSOME/rw-buffered-16436 inode=0x7fdbe8039b9c window_conf=1048576 window_current=249856 transit-size=0 dontsync=0 [.WRITE] request-ptr=0x7fdbe8020200 refcount=1 wound=no generation-number=4 req->op_ret=-1 req->op_errno=116 sync-attempts=3 sync-in-progress=no size=131072 offset=1220608 lied=-1 append=0 fulfilled=0 go=0 [.WRITE] request-ptr=0x7fdbe8068c30 refcount=1 wound=no generation-number=5 req->op_ret=-1 req->op_errno=116 sync-attempts=2 sync-in-progress=no size=118784 offset=1351680 lied=-1 append=0 fulfilled=0 go=0 [.FLUSH] request-ptr=0x7fdbe8021cd0 refcount=1 wound=no generation-number=6 req->op_ret=0 req->op_errno=0 sync-attempts=0 gdb detail about above 3 requests: (gdb) print *((wb_request_t *)0x7fdbe8021cd0) $2 = {all = {next = 0x7fdbe803a608, prev = 0x7fdbe8068c30}, todo = {next = 0x7fdbe803a618, prev = 0x7fdbe8068c40}, lie = {next = 0x7fdbe8021cf0, prev = 0x7fdbe8021cf0}, winds = {next = 0x7fdbe8021d00, prev = 0x7fdbe8021d00}, unwinds = {next = 0x7fdbe8021d10, prev = 0x7fdbe8021d10}, wip = { next = 0x7fdbe8021d20, prev = 0x7fdbe8021d20}, stub = 0x7fdbe80224dc, write_size = 0, orig_size = 0, total_size = 0, op_ret = 0, op_errno = 0, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_FLUSH, lk_owner = {len = 8, data = "W\322T\f\271\367y$", '\000' <repeats 1015 times>}, iobref = 0x0, gen = 6, fd = 0x7fdbe800f0dc, wind_count = 0, ordering = {size = 0, off = 0, append = 0, tempted = 0, lied = 0, fulfilled = 0, go = 0}} (gdb) print *((wb_request_t *)0x7fdbe8020200) $3 = {all = {next = 0x7fdbe8068c30, prev = 0x7fdbe803a608}, todo = {next = 0x7fdbe8068c40, prev = 0x7fdbe803a618}, lie = {next = 0x7fdbe8068c50, prev = 0x7fdbe803a628}, winds = {next = 0x7fdbe8020230, prev = 0x7fdbe8020230}, unwinds = {next = 0x7fdbe8020240, prev = 0x7fdbe8020240}, wip = { next = 0x7fdbe8020250, prev = 0x7fdbe8020250}, stub = 0x7fdbe8062c3c, write_size = 131072, orig_size = 4096, total_size = 0, op_ret = -1, op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>}, iobref = 0x7fdbe80311a0, gen = 4, fd = 0x7fdbe805c89c, wind_count = 3, ordering = {size = 131072, off = 1220608, append = 0, tempted = -1, lied = -1, fulfilled = 0, go = 0}} (gdb) print *((wb_request_t *)0x7fdbe8068c30) $4 = {all = {next = 0x7fdbe8021cd0, prev = 0x7fdbe8020200}, todo = {next = 0x7fdbe8021ce0, prev = 0x7fdbe8020210}, lie = {next = 0x7fdbe803a628, prev = 0x7fdbe8020220}, winds = {next = 0x7fdbe8068c60, prev = 0x7fdbe8068c60}, unwinds = {next = 0x7fdbe8068c70, prev = 0x7fdbe8068c70}, wip = { next = 0x7fdbe8068c80, prev = 0x7fdbe8068c80}, stub = 0x7fdbe806746c, write_size = 118784, orig_size = 4096, total_size = 0, op_ret = -1, op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>}, iobref = 0x7fdbe8052b10, gen = 5, fd = 0x7fdbe805c89c, wind_count = 2, ordering = {size = 118784, off = 1351680, append = 0, tempted = -1, lied = -1, fulfilled = 0, go = 0}} you can see they are all on 'todo' queue, and FLUSH op fd is not the same WRITE op fd. > Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab > BUG: 1372211 > Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab BUG: 1390838 Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> Reviewed-on: http://review.gluster.org/15762 Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* dht: Proper log message if data migration is skippedankitraj2016-11-021-8/+8
| | | | | | | | | | | | | | | | | | | There was a misleading message from logs about available disk space while rebalancing of bricks while calculating free space. Bug: 1390870 Backprt of http://review.gluster.org/#/c/15345/ Change-Id: Ie9df0b2cbf00faaf13a0a3f0dbd657377a082755 Signed-off-by: ankitraj <anraj@redhat.com> Reviewed-on: http://review.gluster.org/15765 Tested-by: ankitraj CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* crypt: changes needed for openssl-1.1 (coming in Fedora 26)Kaleb S. KEITHLEY2016-10-271-4/+17
| | | | | | | | | | | | | | | | Fedora has updated openssl-1.1.0b in/for Fedora 26 HMAC_CTX is now an opaque type and instances of it must be created and released by calling HMAC_CTX_new() and HMAC_CTX_free(). Change-Id: I3a09751d7b0d9fc25fe18aac6527e5431e9ab19a BUG: 1388580 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15727 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* afr,ec: Heal device files with correct major, minor numbersPranith Kumar K2016-10-262-2/+4
| | | | | | | | | | | | | | | | | | | | | | | Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the fix. >BUG: 1384297 >Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15728 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Change-Id: I7646adc3771ff76cdf9c979b575bbcd0b3bc1b9a BUG: 1388948 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15735 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* storage/posix: Fix race in posix_pstatPranith Kumar K2016-10-251-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: When one thread is in the process of creating a file/directory and the other thread is doing readdirp, there is a chance that posix_pstat, creation fops race in the following manner which will lead to wrong stat values to be read by parent xlators like posix-acl. Creation fops posix_pstat() as part of readdirp 1) file is created with uid/gid 0/0 1) does stat of the path that is created just now. 2) Does chown to set the correct uid/gid 3) Sets the acl/user/internal xattrs 4) Sets the gfid on the entry and completes the creation of the file/dir 2) fills the gfid in the iatt If unwind of readdirp hits server xlator before creation fop, then posix-acl remembers uid/gid of the file to be root/root and fails fops like open etc on it. Fix: Reverse the order of filling gfid and filling lstat() values in posix_pstat() so that if there is gfid in iatt buffer uid/gid are valid. >Change-Id: I46caa7f6da7abfa40a0b1d70e35b88de9c64959c >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15564 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> BUG: 1386071 Change-Id: I81c4c6e6b8d4037cee9b23da262b254c126c0a19 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15665 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* performance/write-behind: remove the request from liability queue inRaghavendra G2016-10-171-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | wb_fulfill_request Before this patch, a request is removed from liability queue only when ref count of request hits 0. Though, wb_fulfill_request does an unref, it need not be the last unref and hence the request may survive in liability queue till the last unref. Let, T1: the time at which wb_fulfill_request is invoked T2: the time at which last unref is done on request Let's consider a case of T2 > T1. In the time window between T1 and T2, any other request (waiter) conflicting with request in liability queue (blocker - basically a write which has been lied) is blocked from winding. If T2 happens to be when wb_do_unwinds is invoked, no further processing of request list happens and "waiter" would get blocked forever. An example imaginary sequence of events is given below: 1. A write request w1 is picked up for unwinding in __wb_pick_unwinds (but unwind is not done _yet_ and hence reference remains). However, w1 is moved to liability queue. Let's call this invocation of wb_process_queue by wb_writev as PQ1. 2. A flush (f1) request hits write behind. Since the liability queue of inode is not empty, f1 is not picked for unwinding. Let's call the invocation of wb_process_queue by wb_flush as PQ2. 3. PQ2 continues and picks w1 for fulfilling and invokes wb_fulfill. As part of successful wb_fulfill_cbk, wb_fulfill_request (w1) is invoked. But, w1 is not freed (and hence not removed from liability queue) as w1 is not unwound _yet_ and a ref remains (PQ1 has not invoked wb_do_unwinds _yet_). 4. wb_fulfill_cbk (triggered by PQ2) invokes a wb_process_queue (let's say PQ3). f1 is not resumed in PQ3 as w1 is still in liability queue. At this time, PQ2 and PQ3 are complete. 5. PQ1 continues, unwinds w1 and does last unref on w1 and w1 is freed (and removed from liability queue). Since PQ1 didn't invoke wb_fulfill on any other write requests, there won't be any future codepaths that would invoke wb_process_queue and f1 is stuck forever. With this fix, w1 is removed from liability queue in step 3 above and PQ3 resumes f1 in step 4 (as there are no requests conflicting with f1 in liability queue during execution of PQ3). > Signed-off-by: Raghavendra G <rgowdapp@redhat.com> > BUG: 1379655 > Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb > Reviewed-on: http://review.gluster.org/15579 > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit a8b2a981881221925bb5edfe7bb65b25ad855c04) Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1385620 Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb Reviewed-on: http://review.gluster.org/15658 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Prevent dict_set() on NULL dictPranith Kumar K2016-10-171-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In afr lookup when NULL dict is received in lookup, afr is supposed to set all the xattrs it requires in a new dict it creates, but for 'link-count' it is trying to set to the dict that is passed in lookup which can be NULL sometimes. This is leading to error logs. Fixed the same in this patch. >BUG: 1385104 >Change-Id: I679af89cfc410cbc35557ae0691763a05eb5ed0e >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15646 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >BUG: 1385236 >Change-Id: I802e74e7ad24e183b6653101ad7bf5ab0bf6e55b >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15650 >Smoke: Gluster Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> BUG: 1385442 Change-Id: Ie93b25e8cf52b0d58ef335929dfa10a78a0dd734 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15651 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
* afr: Take full locks in arbiter only for data transactionsRavishankar N2016-10-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Sharding exposed a bug in arbiter config. where `dd` throughput was extremely slow. Shard xlator was sending a fxattrop to update the file size immediately after a writev. Arbiter was incorrectly over-riding the LLONGMAX-1 start offset (for metadata domain locks) for this fxattrop, causing the inodelk to be taken on the data domain. And since the preceeding writev hadn't released the lock (afr does a 'lazy' unlock if write succeeds on all bricks), this degraded to a blocking lock causing extra lock/unlock calls and delays. Fix: Modify flock.l_len and flock.l_start to take full locks only for data transactions. > Reviewed-on: http://review.gluster.org/15641 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 3a97486d7f9d0db51abcb13dcd3bc9db935e3a60) Change-Id: I906895da2f2d16813607e6c906cb4defb21d7c3b BUG: 1375125 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Max Raba <max.raba@comsysto.com> Reviewed-on: http://review.gluster.org/15647 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Use locks for opendirPranith Kumar K2016-10-141-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In some cases we see that readdir keeps winding to the brick that doesn't have any blocked locks i.e. first brick. This is leading to the client assuming that there are no blocking locks on the inode so it won't give away the lock. Other clients end up blocked on the lock as if the command hung. Fix: Proper way to fix this issue is to use infra present in http://review.gluster.org/14736 This is a stop gap fix where we start taking inodelks in opendir which goes to all the bricks, this will detect if there is any contention. cherry picked from commit f013335400d033a9677797377b90b968803135f4: >BUG: 1346719 >Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15309 >Smoke: Gluster Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Ashish Pandey <aspandey@redhat.com> >Signed-off-by: Ashish Pandey <aspandey@redhat.com> Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9 BUG: 1371397 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15405 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>