summaryrefslogtreecommitdiffstats
path: root/tests/bugs
Commit message (Collapse)AuthorAgeFilesLines
* features/index: Choose different base file on EMLINK errorPranith Kumar K2018-04-101-0/+52
| | | | | | | Change-Id: I4648816af908539efdc2528608aa2ebf7f0d0e2f fixes: bz#1565654 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> (cherry picked from commit bb12f2109a01856e8184e13cf984210d20155b13)
* cluster/ec: avoid delays in self-healXavi Hernandez2018-03-151-0/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Self-heal creates a thread per brick to sweep the index looking for files that need to be healed. These threads are started before the volume comes online, so nothing is done but waiting for the next sweep. This happens once per minute. When a replace brick command is executed, the new graph is loaded and all index sweeper threads started. When all bricks have reported, a getxattr request is sent to the root directory of the volume. This causes a heal on it (because the new brick doesn't have good data), and marks its contents as pending to be healed. This is done by the index sweeper thread on the next round, one minute later. This patch solves this problem by waking all index sweeper threads after a successful check on the root directory. Additionally, the index sweep thread scans the index directory sequentially, but it might happen that after healing a directory entry more index entries are created but skipped by the current directory scan. This causes the remaining entries to be processed on the next round, one minute later. The same can happen in the next round, so the heal is running in bursts and taking a lot to finish, specially on volumes with many directory levels. This patch solves this problem by immediately restarting the index sweep if a directory has been healed. Backport of: > BUG: 1547662 Change-Id: I58d9ab6ef17b30f704dc322e1d3d53b904e5f30e BUG: 1555198 Signed-off-by: Xavi Hernandez <jahernan@redhat.com>
* glusterd: import volumes in separate synctaskAtin Mukherjee2018-02-211-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With brick multiplexing, to attach a brick to an existing brick process the prerequisite is to have the compatible brick to finish it's initialization and portmap sign in and hence the thread might have to go to a sleep and context switch the synctask to allow the brick process to communicate with glusterd. In normal code path, this works fine as glusterd_restart_bricks () is launched through a separate synctask. In case there's a mismatch of the volume when glusterd restarts, glusterd_import_friend_volume is invoked and then it tries to call glusterd_start_bricks () from the main thread which eventually may land into the similar situation. Now since this is not done through a separate synctask, the 1st brick will never be able to get its turn to finish all of its handshaking and as a consequence to it, all the bricks will fail to get attached to it. Solution : Execute import volume and glusterd restart bricks in separate synctask. Importing snaps had to be also done through synctask as there's a dependency of the parent volume need to be available for the importing snap functionality to work. >mainline patch : https://review.gluster.org/#/c/19357/ https://review.gluster.org/#/c/19536/ https://review.gluster.org/#/c/19539/ Change-Id: I290b244d456afcc9b913ab30be4af040d340428c BUG: 1543706 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit cb0339f9229fc5c05d7ef4cfcc4ca9c4569f3755)
* afr: don't treat all cases all bricks being blamed as split-brainRavishankar N2018-02-062-0/+101
| | | | | | | | | | | | | | | | | | | | | | | | Problem: We currently don't have a roll-back/undoing of post-ops if quorum is not met. Though the FOP is still unwound with failure, the xattrs remain on the disk. Due to these partial post-ops and partial heals (healing only when 2 bricks are up), we can end up in split-brain purely from the afr xattrs point of view i.e each brick is blamed by atleast one of the others. These scenarios are hit when there is frequent connect/disconnect of the client/shd to the bricks while I/O or heal are in progress. Fix: Instead of undoing the post-op, pick a source based on the xattr values. If 2 bricks blame one, the blamed one must be treated as sink. If there is no majority, all are sources. Once we pick a source, self-heal will then do the heal instead of erroring out due to split-brain. Change-Id: I3d0224b883eb0945785ade0e9697a1c828aec0ae BUG: 1542380 Signed-off-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit 0e6e8216823c2d9dafb81aae0f6ee3497c23d140)
* md-cache: Implement dynamic configuration of xattr list for cachingPoornima G2018-01-221-1/+1
| | | | | | | | | | | | | | | Currently, the list of xattrs that md-cache can cache is hard coded in the md-cache.c file, this necessiates code change and rebuild everytime a new xattr needs to be added to md-cache xattr cache list. With this patch, the user will be able to configure a comma seperated list of xattrs to be cached by md-cache Updates #297 Change-Id: Ie35ed607d17182d53f6bb6e6c6563ac52bc3132e Signed-off-by: Poornima G <pgurusid@redhat.com>
* afr: add quorum checks in post-opRavishankar N2018-01-191-1/+1
| | | | | | | | | | afr relies on pending changelog xattrs to identify source and sinks and the setting of these xattrs happen in post-op. So if post-op fails, we need to unwind the write txn with a failure. Change-Id: I0f019ac03890108324ee7672883d774918b20be1 BUG: 1506140 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* upcall: Allow md-cache to specify invalidations on xattr with wildcardPoornima G2018-01-191-0/+0
| | | | | | | | | | | | | Currently, md-cache sends a list of xattrs, it is inttrested in recieving invalidations for. But, it cannot specify any wildcard in the xattr names Eg: user.* - invalidate on updating any xattr with user. prefix. This patch, enable upcall to honor wildcard in the xattr key names Updates: #297 Change-Id: I98caf0ed72f11ef10770bf2067d4428880e0a03a Signed-off-by: Poornima G <pgurusid@redhat.com>
* posix: delete stale gfid handles in nameless lookupRavishankar N2018-01-161-0/+65
| | | | | | | | | ..in order for self-heal of symlinks to work properly (see BZ for details). Change-Id: I9a011d00b07a690446f7fd3589e96f840e8b7501 BUG: 1529488 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* tests: check volume status for shd being upRavishankar N2018-01-121-0/+1
| | | | | | | | | | | | | | so that glusterd is also aware that shd is up and running. While not reproducible locally, on the jenkins slaves, 'gluster vol heal patchy' fails with "Self-heal daemon is not running. Check self-heal daemon log file.", while infact the afr_child_up_status_in_shd() checks before that passed. In the shd log also, I see the shd being up and connected to at least one brick before the heal is launched. Change-Id: Id3801fa4ab56a70b1f0bd6a7e240f69bea74a5fc BUG: 1515163 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* Revert "rpc: merge ssl infra with epoll infra"Milind Changire2018-01-071-1/+0
| | | | | | | This reverts commit 56e5fdae74845dfec0ff7ad0c8fee77695d36ad5. Change-Id: Ia62cee5440bbe8e23f5da9cff692d792091d544a Signed-off-by: Milind Changire <mchangir@redhat.com>
* cluster/ec: OpenFD heal implementation for ECSunil Kumar Acharya2018-01-051-11/+1
| | | | | | | | | | | | | Existing EC code doesn't try to heal the OpenFD to avoid unnecessary healing of the data later. Fix implements the healing of open FDs before carrying out file operations on them by making an attempt to open the FDs on required up nodes. BUG: 1431955 Change-Id: Ib696f59c41ffd8d5678a484b23a00bb02764ed15 Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
* cluster/dht: Use percentages for space checkN Balachandran2018-01-021-0/+5
| | | | | | | | | | | | | | | | | With heterogenous bricks now being supported in DHT we could run into issues where files are not migrated even though there is sufficient space in newly added bricks which just happen to be considerably smaller than older bricks. Using percentages instead of absolute available space for space checks can mitigate that to some extent. Marking bug-1247563.t as that used to depend on the easier code to prevent a file from migrating. This will be removed once we find a way to force a file migration failure. Change-Id: I3452520511f304dbf5af86f0632f654a92fcb647 BUG: 1529440 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* dht: Fill first_up_subvol before use in dht_opendirPoornima G2017-12-151-0/+23
| | | | | | | | Reported by: Sam McLeod Change-Id: Ic8f9b46b173796afd70aff1042834b03ac3e80b2 BUG: 1512437 Signed-off-by: Poornima G <pgurusid@redhat.com>
* quick-read: Discard cache for fallocate, zerofill and discard opsSachin Prabhu2017-12-132-0/+228
| | | | | | | | | The fallocate, zerofill and discard modify file data on the server thus rendering stale any cache held by the xlator on the client. BUG: 1524252 Change-Id: I432146c6390a0cd5869420c373f598da43915f3f Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
* rpc: merge ssl infra with epoll infraMilind Changire2017-12-121-0/+1
| | | | | | | | | | | | | | | | | | | Patch attempts to use the epoll infra for handling SSL connections as well instead of the socket_poller() thread func. This essentially makes priv->own_thread flag redundant. SSL_connect()/SSL_accept() is now non-blocking which has done away with the localised poll() in ssl_do(). So, ssl_do() has been updated appropriately. own_thread and coincidently socket_poller() thread for SSL processing is now deprecated. Added a timeout to test whether seal-heal daemon is up and running as per Ravi's suggestion. Change-Id: If2b5d7b4fd19e321cb289e08d49a718d2161aafe Signed-off-by: Milind Changire <mchangir@redhat.com>
* metrics: provide options to dump metrics from xlatorsAmar Tumballi2017-12-061-1/+1
| | | | | | | | | | * Introduce xlator methods to allow dumping of metrics * Separate options to get the metrics dumped in a path Updates #168 Change-Id: I7df80df33b71d6f449f03c2332665b4a45f6ddf2 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* tests: Fix prove test bug-1292020.tKevin Vigor2017-12-051-8/+4
| | | | | | | | | | | dd is doing a statfs and failing with ENOSPC instead of writign and getting EDQUOTA. Make either error a success in this test. > Signed-off-by: Kevin Vigor <kvigor@fb.com> > Reviewed-on: http://review.gluster.org/16352 BUG: 1521116 Signed-off-by: ShyamsundarR <srangana@redhat.com> Change-Id: I9f580d9e4a4dd293df55a1d954f86a9862fcae7b
* storage/posix : options to override umaskSubha sree Mohankumar2017-12-021-0/+100
| | | | | | | | | | | | | | | | | | | | | Options "create-mask" and "create-directory-mask" are added to remove the mode bits set on a file or directory when its created. Default value of these options is 0777. Options "force-create-mode" and "force-create-directory" sets the default permission for a file or directory irrespective of the clients umask. Default value of these options is 0000. Command to set option: volume set <volume name> storage.<option-name> <value> The valid value range from 0000 to 0777. Updates #301 Change-Id: Ia33d13f2117202ca55a056c747ccc3674eb8bae1 Signed-off-by: Subha sree Mohankumar <smohanku@redhat.com>
* tests: Fix a broken test casePoornima G2017-12-011-8/+4
| | | | | | | | | | | | | | | | | | | | Issue: When using statedump command to take statedump of the gfapi process, we specify the following things: $gluster volume statedump <volname> client <host>:<pid> pid: Pid of the gfapi application host: This should be the IP/hostname as seen by the glusterd, the gfapi application is connected to. In this test case, if gfapi application is running locally, and is connected to $H1 glusterd, the <host> need not be $H1. <host> could be localhost, 127.0.0.1, 127.1.1.1 etc. based on the configuration of the system. Hence use netstat to find the right <host> value. Change-Id: I6efb9d1ccaf9c6841a9ab7c9ebfecafc03c0bc5e BUG: 1517961 Signed-off-by: Poornima G <pgurusid@redhat.com>
* tests: fix for bug-1260185-donot-allow-detach-commit-unnecessarily.t failurehari gowtham2017-11-301-49/+0
| | | | | | | | | | problem: detach commit was issues before detach start was completed. fix: wait for detach start to finish and then detach commit. Change-Id: I639962be6de6dbd1512f0a5617050d1e6872eac8 BUG: 1517961 Signed-off-by: hari gowtham <hgowtham@redhat.com>
* tests: mark currently failing regression tests as known issuesAmar Tumballi2017-11-283-0/+18
| | | | | | Change-Id: If6c36dc6c395730dfb17b5b4df6f24629d904926 BUG: 1517961 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* tests: fix bug-1432542-mpx-restart-crash.t spurious failureAtin Mukherjee2017-11-281-1/+7
| | | | | | Change-Id: Ied1215bfec0ccf2ec8ee55a0aaf618517b67bd55 BUG: 1517961 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* snapshot : snapshot creation failed after brick reset/replaceSunny Kumar2017-11-281-0/+39
| | | | | | | | | | | | Problem : snapshot creation was failing after brick reset/replace Fix : changed code to set mount_dir value in rsp_dict during prerequisites phase i.e glusterd_brick_op_prerequisites call and removed form prevalidate phase. Signed-off-by: Sunny Kumar <sunkumar@redhat.com> Change-Id: Ief5d0fafe882a7eb1a7da8535b7c7ce6f011604c BUG: 1512451
* tests: Spurious failure multiplex-limit-issue-151.tN Balachandran2017-11-271-1/+1
| | | | | | | | | A timing issue caused the remove-brick commit to fail. Replaced 'remove-brick commit' with 'remove-brick force'. Change-Id: I69144b2f7be34095dbd3a7d182e0bf01b27fb0a4 BUG: 1517904 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* cluster/dht: Serialize mds update code path with lookup unwind in selfhealMohit Agrawal2017-11-221-2/+0
| | | | | | | | | | | | | | | | Problem: Sometime test case ./tests/bugs/bug-1371806_1.t is failing on centos due to race condition between fresh lookup and setxattr fop. Solution: In selfheal code path we do save mds on inode_ctx, it was not serialize with lookup unwind. Due to this behavior after lookup unwind if mds is not saved on inode_ctx and if any subsequent setxattr fop call it has failed with ENOENT because no mds has found on inode ctx.To resolve it save mds on inode ctx has been serialize with lookup unwind. BUG: 1498966 Change-Id: I8d4bb40a6cbf0cec35d181ec0095cc7142b02e29 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* afr: add checks for allowing lookupsRavishankar N2017-11-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | Problem: In an arbiter volume, lookup was being served from one of the sink bricks (source brick was down). shard uses the iatt values from lookup cbk to calculate the size and block count, which in this case were incorrect values. shard_local_t->last_block was thus initialised to -1, resulting in an infinite while loop in shard_common_resolve_shards(). Fix: Use client quorum logic to allow or fail the lookups from afr if there are no readable subvolumes. So in replica-3 or arbiter vols, if there is no good copy or if quorum is not met, fail lookup with ENOTCONN. With this fix, we are also removing support for quorum-reads xlator option. So if quorum is not met, neither read nor write txns are allowed and we fail the fop with ENOTCONN. Change-Id: Ic65c00c24f77ece007328b421494eee62a505fa0 BUG: 1467250 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* tests: fix bug-1483058-replace-brick-quorum-validation.t spurious failureAtin Mukherjee2017-11-121-1/+8
| | | | | | Change-Id: I04c35305bfb663eabbf715eee78695adfd4a2d20 BUG: 1511310 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* mount/fuse: use fstat in getattr implementation if any opened fd is availableRaghavendra G2017-11-091-11/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The restriction of using fds opened by the same Pid means fds cannot be shared across threads of multithreaded application. Note that fops from kernel have different Pid for different threads. Imagine following sequence of operations: * Turn off performance.open-behind * Thread t1 opens an fd - fd1 - on file "file". Let's assume nodeid of "file" is "nodeid-file". * Thread t2 does RENAME ("newfile", "file"). Let's assume nodeid of "newfile" as "nodeid-newfile". * t2 proceeds to do fstat (fd1) The above set of operations can sometimes result in ESTALE/ENOENT errors. RENAME overwrites "file" with "newfile" changing its nodeid from "nodeid-file" to "nodeid-newfile" and post RENAME, "nodeid-file" is removed from the backend. If fstat carries nodeid-file as argument, which can happen if lookup has not refreshed the nodeid of "file" and since t2 doesn't have an fd opened, fuse_getattr_resume uses STAT which will fail as "nodeid-file" no longer exists. Since the above set of operations and sharing of fds across multiple threads are valid, this is a bug. The fix is to use any fd opened on the inode. In this specific example fuse_getattr_resume will find fd1 and winds down the call as fstat (fd1) which won't fail. Cross-checked with "Miklos Szeredi" <mszeredi.at.redhat.dot.com> for any security issues with this solution and he approves the solution. Thanks to "Miklos Szeredi" <mszeredi.at.redhat.dot.com> for all the pointers and discussions. Change-Id: I88dd29b3607cd2594eee9d72a1637b5346c8d49c BUG: 1510401 Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
* tests: fix spurious failure in bug-1345727-bricks-stop-on-no-quorum-validation.tAtin Mukherjee2017-11-071-0/+1
| | | | | | | | Add peer_count check before checking for brick status Change-Id: I0179ec29729ab6bbc3571eb6ffd631b7b0d15f7c BUG: 1510415 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/ec: create eager-lock option for non-regular filesXavier Hernandez2017-11-051-0/+1
| | | | | | | | | A new option is added to allow independent configuration of eager locking for regular files and non-regular files. Change-Id: I8f80e46d36d8551011132b15c0fac549b7fb1c60 BUG: 1502610 Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
* glusterd: delete source brick only once in reset-brick commit forceAtin Mukherjee2017-10-311-0/+24
| | | | | | | | | | | While stopping the brick which is to be reset and replaced delete_brick flag was passed as true which resulted glusterd to free up to source brick before the actual operation. This results commit force to fail failing to find the source brickinfo. Change-Id: I1aa7508eff7cc9c9b5d6f5163f3bb92736d6df44 BUG: 1507466 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
* tests: Update tier CLI in .t filesN Balachandran2017-10-306-7/+7
| | | | | | | | Update .t tier tests to use the new tier CLI. Change-Id: I0e7f1769071108d8266fc86378c4466bcaf96e7d BUG: 1505253 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* cluster/afr: Fail open on split-brainPranith Kumar K2017-10-261-0/+1
| | | | | | | | | | | | | | | | | Problem: Append on a file with split-brain succeeds. Open is intercepted by open-behind, when write comes on the file, open-behind does open+write. Open succeeds because afr doesn't fail it. Then write succeeds because write-behind intercepts it. Flush is also intercepted by write-behind, so the application never gets to know that the write failed. Fix: Fail open on split-brain, so that when open-behind does open+write open fails which leads to write failure. Application will know about this failure. Change-Id: I4bff1c747c97bb2925d6987f4ced5f1ce75dbc15 BUG: 1294051 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* snapshot: Issue with other processes accessing the mounted brickSunny Kumar2017-10-232-0/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added code for unmount of activated snapshot brick during snapshot deactivation process which make sense as mount point for deactivated bricks should not exist. Removed code for mounting newly created snapshot, as newly created snapshots should not mount until it is activated. Added code for mount point creation and snapshot mount during snapshot activation. Added validation during glusterd init for mounting only those snapshot whose status is either STARTED or RESTORED. During snapshot restore, mount point for stopped snap should exist as it is required to set extended attribute. During handshake, after getting updates from friend mount point for activated snapshot should exist and should not for deactivated snapshot. While getting snap status we should show relevent information for deactivated snapshots, after this pathch 'gluster snap status' command will show output like- Snap Name : snap1 Snap UUID : snap-uuid Brick Path : server1:/run/gluster/snaps/snap-vol-name/brick Volume Group : N/A (Deactivated Snapshot) Brick Running : No Brick PID : N/A Data Percentage : N/A LV Size : N/A Fixes: #276 Change-Id: I65783488e35fac43632615ce1b8ff7b8e84834dc BUG: 1482023 Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
* glusterd:Marking all the brick status as stopped when a process goes down in ↵Sanju Rakonde2017-10-121-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | brick multiplexing In brick multiplexing environment, if a brick process goes down i.e., if we kill it with SIGKILL, the status of the brick for which the process came up for the first time is only changing to stopped. all other brick statuses are remain started. This is happening because the process was killed abruptly using SIGKILL signal and signal handler wasn't invoked and further cleanup wasn't triggered. When we try to start a volume using force, it shows error saying "Request timed out", since all the brickinfo->status are still in started state, we're waiting for one of the brick process to come up which never going to happen since the brick process was killed. To resolve this, In the disconnect event, We are checking all the processes that whether the brick which got disconnected belongs the process. Once we get the process we are calling a function named glusterd_mark_bricks_stopped_by_proc() and sending brick_proc_t object as an argument. From the glusterd_brick_proc_t we can get all the bricks attached to that process. but these are duplicated ones. To get the original brickinfo we are reading volinfo from brick. In volinfo we will have original brickinfo copies. We are changing brickinfo->status to stopped for all the bricks. Change-Id: Ifb9054b3ee081ef56b39b2903ae686984fe827e7 BUG: 1499509 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* test: Mark test case ./tests/bugs/bug-1371806_1.t as a bad test caseMohit Agrawal2017-10-091-1/+2
| | | | | | | | | | Problem: Test case ./tests/bugs/bug-1371806_1.t is failing. Solution: Mark test case ./tests/bugs/bug-1371806_1.t as a bad test case. BUG: 1499663 Change-Id: Icb3f41d23dcc74cce6fde05ca343c158d5f58cdd Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* glusterd : fix client io-threads option for replicate volumesRavishankar N2017-10-091-0/+48
| | | | | | | | | | | | | | | | | | | Problem: Commit ff075a3d6f9b142911d25c27fd209838782bfff0 disabled loading client-io-threads for replicate volumes (it was set to on by default in commit e068c1997314046658dd502e9118dab32decf879) due to performance issues but in doing so, inadvertently failed to load the xlator even if the user explicitly enabled the option using the volume set command. This was despite returning returning sucess for the volume set. Fix: Modify the check in perfxl_option_handler() and add checks in volume create/add-brick/remove-brick code paths, tying it all to GD_OP_VERSION_3_12_2. Change-Id: Ib612973a999a7da818cc926f5c2601b1f0794fcf BUG: 1498570 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* afr: heal gfid as a part of entry healRavishankar N2017-10-091-0/+68
| | | | | | | | | | | | | | | | | | Problem: If a brick crashes after an entry (file or dir) is created but before gfid is assigned, the good bricks will have pending entry heal xattrs but the heal won't complete because afr_selfheal_recreate_entry() tries to create the entry again and it fails with EEXIST. Fix: We could have fixed posx_mknod/mkdir etc to assign the gfid if the file already exists but the right thing to do seems to be to trigger a lookup on the bad brick and let it heal the gfid instead of winding an mknod/mkdir in the first place. Change-Id: I82f76665a7541f1893ef8d847b78af6466aff1ff BUG: 1493415 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* cluster/dht : User xattrs are not healed after brick stop/startMohit Agrawal2017-10-047-19/+352
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In a distributed volume custom extended attribute value for a directory does not display correct value after stop/start or added newly brick. If any extended(acl) attribute value is set for a directory after stop/added the brick the attribute(user|acl|quota) value is not updated on brick after start the brick. Solution: First store hashed subvol or subvol(has internal xattr) on inode ctx and consider it as a MDS subvol.At the time of update custom xattr (user,quota,acl, selinux) on directory first check the mds from inode ctx, if mds is not present on inode ctx then throw EINVAL error to application otherwise set xattr on MDS subvol with internal xattr value of -1 and then try to update the attribute on other non MDS volumes also.If mds subvol is down in that case throw an error "Transport endpoint is not connected". In dht_dir_lookup_cbk| dht_revalidate_cbk|dht_discover_complete call dht_call_dir_xattr_heal to heal custom extended attribute. In case of gnfs server if hashed subvol has not found based on loc then wind a call on all subvol to update xattr. Fix: 1) Save MDS subvol on inode ctx 2) Check if mds subvol is present on inode ctx 3) If mds subvol is down then call unwind with error ENOTCONN and if it is up then set new xattr "GF_DHT_XATTR_MDS" to -1 and wind a call on other subvol. 4) If setxattr fop is successful on non-mds subvol then increment the value of internal xattr to +1 5) At the time of directory_lookup check the value of new xattr GF_DHT_XATTR_MDS 6) If value is not 0 in dht_lookup_dir_cbk(other cbk) functions then call heal function to heal user xattr 7) syncop_setxattr on hashed_subvol to reset the value of xattr to 0 if heal is successful on all subvol. Test : To reproduce the issue followed below steps 1) Create a distributed volume and create mount point 2) Create some directory from mount point mkdir tmp{1..5} 3) Kill any one brick from the volume 4) Set extended attribute from mount point on directory setfattr -n user.foo -v "abc" ./tmp{1..5} It will throw error " Transport End point is not connected " for those hashed subvol is down 5) Start volume with force option to start brick process 6) Execute getfattr command on mount point for directory 7) Check extended attribute on brick getfattr -n user.foo <volume-location>/tmp{1..5} It shows correct value for directories for those xattr fop were executed successfully. Note: The patch will resolve xattr healing problem only for fuse mount not for nfs mount. BUG: 1371806 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Change-Id: I4eb137eace24a8cb796712b742f1d177a65343d5
* cluster/afr: Make choose-local "reconfigurable"Krutika Dhananjay2017-09-301-0/+18
| | | | | | | | | | With this change, enabling choose-local (which means its state makes transition from "off" to "on") will be effective after the first gfid-lookup on "/" since volume-set was executed. Change-Id: Ibab292ba705d993b475cd0303fb3318211fb2500 BUG: 1480525 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* afr: auto-resolve split-brains for zero-byte filesRavishankar N2017-09-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Problems: As described in BZ 1491670, renaming hardlinks can result in data/mdata split-brain of the DHT link-to files (T files) without any mismatch of data and metadata. As described in BZ 1486063, for a zero-byte file with only dirty bits set, arbiter brick will likely be chosen as the source brick. Fix: For zero byte files in split-brain, pick first brick as a) data source if file size is zero on all bricks. b) metadata source if metadata is the same on all bricks In arbiter case, if file size is zero on all bricks and there are no pending afr xattrs, pick 1st brick as data source. Change-Id: I0270a9a2f97c3b21087e280bb890159b43975e04 BUG: 1491670 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Rahul Hinduja <rhinduja@redhat.com> Reported-by: Mabi <mabi@protonmail.ch>
* mount/fuse: Make event-history feature configurableKrutika Dhananjay2017-09-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | ... and disable it by default. This is because having it disabled seems to improve performance. This could be due to the lock contention by the different epoll threads on the circular buff lock in the fop cbks just before writing their response to /dev/fuse. Just to provide some data - wrt ovirt-gluster hyperconverged environment, I saw an increase in IOPs by 12K with event-history disabled for randrom read workload. Usage: mount -t glusterfs -o event-history=on $HOSTNAME:$VOLNAME $MOUNTPOINT OR glusterfs --event-history=on --volfile-server=$HOSTNAME --volfile-id=$VOLNAME $MOUNTPOINT Change-Id: Ia533788d309c78688a315dc8cd04d30fad9e9485 BUG: 1467614 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: disallow replace brick for dist only volumesAtin Mukherjee2017-09-194-17/+17
| | | | | | | | | | | | | | | | | | Allowing replace-brick on dist only volumes will lead to data loss. This patch blocks replace brick commit force to fail if a volume is dist only. Also removing tests/basic/pump.t as its of no use as per the discussion in http://lists.gluster.org/pipermail/gluster-devel/2017-September/053652.html Change-Id: Iabb0c16f865f3fc361b64a19bfcf0c0fbb5c2682 BUG: 1489432 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/18226 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* features/shard: Change default shard-block-size to 64MBKrutika Dhananjay2017-09-146-0/+7
| | | | | | | | | | | Change-Id: I55fa87e07136cff10b0d725ee24dd3151016e64e BUG: 1489823 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: https://review.gluster.org/18243 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: Sunil Kumar Acharya <sheggodu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* features/shard: Return aggregated size in stbuf of LINK fopKrutika Dhananjay2017-09-061-0/+25
| | | | | | | | | | Change-Id: I42df7679d63fec9b4c03b8dbc66c5625f097fac0 BUG: 1488546 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: https://review.gluster.org/18209 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* feature/posix: Enabled gfid2path by defaultKotresh HR2017-08-291-1/+1
| | | | | | | | | | | | | | | | Enable gfid2path feature by default. The basic performance tests are carried out and it doesn't show significant depreciation. The results are updated in issue. Updates: #139 Change-Id: I5f1949a608d0827018ef9d548d5d69f3bb7744fd Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: https://review.gluster.org/17950 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* glusterd: glusterd fails to start if peers file has blank lineGaurav Yadav2017-08-241-0/+29
| | | | | | | | | | | | | | | | | | | | | | | Problem: On start of glusterd service, glusterd fetch data from store, while parsing data from store if peers file consists of blank line glusterd fails to start. Fix: With this fix while parsing peers file glusterd will skip blank lines if it contains any. Signed-off-by: Gaurav Yadav <gyadav@redhat.com> Change-Id: I53cd65a54de5f57baef292b2118b70ffb7f99388 BUG: 1482906 Reviewed-on: https://review.gluster.org/18066 Tested-by: Gaurav Yadav <gyadav@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* glusterd: do not create .glusterfs/indicesPrashanth Pai2017-08-233-6/+3
| | | | | | | | | | | | | | | | | | | | | glusterd shouldn't concern itself with creating directories specific to certain xlators. The index xlator will now proceed creating './glusterfs/indices' dir only if the parent '.glusterfs' directory exists, which still fixes the original problem reported i.e 'volume start force' command shouldn't create brick path if it doesn't exist (BUG 1457202) This reverts most of the changes done by the commit b58a15948fb3fc37b6c0b70171482f50ed957f42 Change-Id: I7fc52ad64dce220e336c218fb4d85933ca2e61c0 Signed-off-by: Prashanth Pai <ppai@redhat.com> Reviewed-on: https://review.gluster.org/18003 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* glusterd: replace-brick executing successfully when quorum does not metGaurav Yadav2017-08-221-0/+51
| | | | | | | | | | | | | | | | | | Problem: replace-brick command on a setup where quorum does not met executing successfully. Fix: With the fix glusterd is validating whether server is in quorum or not during replace-brick staging Change-Id: I8017154bb62bdcc6c6490e720ecfe9cde090c161 BUG: 1483058 Signed-off-by: Gaurav Yadav <gyadav@redhat.com> Reviewed-on: https://review.gluster.org/18068 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: disallow volume specific options to be set with all as volume nameAtin Mukherjee2017-08-181-0/+25
| | | | | | | | | | | | | | | | All the .validate_fn defined in volume map entry table refers to volinfo object. And if we end up in trying to set a volume level option cluster wide glusterd results into a crash. Change-Id: I7c877aee0ff5c8c1d8c95662fdc8c8923355ae7b BUG: 1482344 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: https://review.gluster.org/18052 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Reviewed-by: Gaurav Yadav <gyadav@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>