glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	nl-cache: Fix a possible crash and stale cache	Poornima G	2017-06-13	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue1: Consider the followinf sequence of operations: ... nlc_ctx = nlc_ctx_get (inode i1) ....... -> nlc_clear_cache (i1) gets called as a part of nlc_invalidate or any other callers ... GF_FREE (ii nlc_ctx) LOCK (nlc_ctx->lock); -> This will result in crash as the ctx got freed in nlc_clear_cache. Issue2: lookup on dir1/file1 result in ENOENT add cache to dir1 at time T1 .... CHILD_DOWN at T2 lookup on dir1/file2 result in ENOENT add cache to dir1, but the cache time is still T1 lookup on dir1/file2 - should have been served from cache but the cache time is T1 < T2, hence cache is considered as invalid. So, after CHILD_DOWN the right thing would be to clear the cache and restart caching on that inode. Solution: Do not free nlc_ctx in nlc_clear_cache, but only in inode_forget() The fix for both issue1 and 2 is interleaved hence sending it as single patch. Change-Id: I83d8ed36c049a93567c6d7e63d045dc14ccbb397 BUG: 1458539 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/17453 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	ec: fix ec-data-heal.t failure	Atin Mukherjee	2017-06-12	1	-16/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With brick mux enabled, this test was constantly failing. Further it was found that the the test does a series of killing a particular brick and bringing it up in cmdline where as just starting the volume with force would suffice. Change-Id: Iee491d0777eaa28dca5c78f92d4b400fcc897fd2 BUG: 1460638 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: https://review.gluster.org/17508 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	nl-cache: add group volume set option for ease of use	Poornima G	2017-06-12	1	-3/+6
\| \| \| \| \| \| \| \| \| \|	Change-Id: Id03643a9598da53051a01ca09e1d2a62bc195ab6 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/17495 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	protocol/server: make listen backlog value as configurable	Mohammed Rafi KC	2017-06-08	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	problem: When we call listen from protocol/server, we are giving a hard coded valie of 10 if it is not manually given. With multiplexing, especially when glusterd restarts all clients may try to connect to the server at a time. Which will result in overflowing the queue, and kernel will complain about the errors. Solution: This patch will introduce a volume set command to make backlog value as a configurable. This patch also changes the default values for backlog from 10 to 128. This changes is only applicable for sockets listening from protocol. Example: gluster volume set <volname> transport.listen-backlog 1024 Note: 1 Brick has to be restarted to get this value in effect 2 This changes won't be reflected in glusterd, or other xlators which calls listen. If you need, you have to add this option to the volfile. Change-Id: I0c5a2bbf28b5db612f9979e7560e05dd82b41477 BUG: 1456405 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: https://review.gluster.org/17411 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
*	cluster/ec: Update xattr and heal size properly	Ashish Pandey	2017-06-06	1	-0/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem-1 : Recursive healing of same file is happening when IO is going on even after data heal completes. Solution: RCA: At the end of the write, when ec_update_size_version gets called, we send it only on good bricks and not on healing brick. Due to this, xattr on healing brick will always remain out of sync and when the background heal check source and sink, it finds this brick to be healed and start healing from scratch. That involve ftruncate and writing all of the data again. To solve this, send xattrop on all the good bricks as well as healing bricks. Problem-2: The above fix exposes the data corruption during heal. If the write on a file is going on and heal finishes, we find that the file gets corrupted. RCA: The real problem happens in ec_rebuild_data(). Here we receive the 'size' argument which contains the real file size at the time of starting self-heal and it's assigned to heal->total_size. After that, a sequence of calls to ec_sync_heal_block() are done. Each call ends up calling ec_manager_heal_block(), which does the actual work of healing a block. First a lock on the inode is taken in state EC_STATE_INIT using ec_heal_inodelk(). When the lock is acquired, ec_heal_lock_cbk() is called. This function calls ec_set_inode_size() to store the real size of the inode (it uses heal->total_size). The next step is to read the block to be healed. This is done using a regular ec_readv(). One of the things this call does is to trim the returned size if the file is smaller than the requested size. In our case, when we read the last block of a file whose size was = 512 mod 1024 at the time of starting self-heal, ec_readv() will return only the first 512 bytes, not the whole 1024 bytes. This isn't a problem since the following ec_writev() sent from the heal code only attempts to write the amount of data read, so it shouldn't modify the remaining 512 bytes. However ec_writev() also checks the file size. If we are writing the last block of the file (determined by the size stored on the inode that we have set to heal->total_size), any data beyond the (imposed) end of file will be cleared with 0's. This causes the 512 bytes after the heal->total_size to be cleared. Since the file was written after heal started, the these bytes contained data, so the block written to the damaged brick will be incorrect. Solution: Align heal->total_size to a multiple of the stripe size. Thanks "Xavier Hernandez" <xhernandez@datalab.es> to find out the root cause and to fix the issue. Change-Id: I6c9f37b3ff9dd7f5dc1858ad6f9845c05b4e204e BUG: 1428673 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: https://review.gluster.org/16985 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
*	core: fix spelling errors	Kaleb S. KEITHLEY	2017-06-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	fixes for various minor spelling errors and typos Reported-by: Patrick Matthäi <pmatthaei@debian.org> Change-Id: Ic1be36f82e3d822bbdc9559878bd79520fc0fcd5 BUG: 1457808 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: https://review.gluster.org/17442 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	cluster/ec: Implement FALLOCATE FOP for EC	Sunil Kumar Acharya	2017-05-23	2	-0/+132
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	FALLOCATE file operations is not implemented in the existing EC code. This change set implements it for EC. BUG: 1448293 Change-Id: Id9ed914db984c327c16878a5b2304a0ea461b623 Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com> Reviewed-on: https://review.gluster.org/15200 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	Tier/cli: detach status xml output	hari gowtham	2017-05-17	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: detach status xml output was broken because of the wrong argument. The status_op sent to verify whether it is a tier status command was as false. Fix: the argument being passed was changed from false to true. Change-Id: I8cdd4dd972d6bfbb61c1182cbf4097767f83c7c5 BUG: 1446362 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/17131 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
*	cluster/dht: Rebalance on all nodes should migrate files	N Balachandran	2017-05-16	1	-0/+143
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Rebalance compares the node-uuid of a file against its own to and migrates a file only if they match. However, the current behaviour in both AFR and EC is to return the node-uuid of the first brick in a replica set for all files. This means a single node ends up migrating all the files if the first brick of every replica set is on the same node. Fix: AFR and EC will return all node-uuids for the replica set. The rebalance process will divide the files to be migrated among all the nodes by hashing the gfid of the file and using that value to select a node to perform the migration. This patch makes the required DHT and tiering changes. Some tests in rebal-all-nodes-migrate.t will need to be uncommented once the AFR and EC changes are merged. Change-Id: I5ce41600f5ba0e244ddfd986e2ba8fa23329ff0c BUG: 1366817 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: https://review.gluster.org/17239 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	afr: gfid-mismatch-resolution-with-fav-child-policy.t to bad tests	Ravishankar N	2017-05-15	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gfid-mismatch-resolution-with-fav-child-policy.t does a `TEST ls $M0/f3` (line #170) to trigger healing of a file in gfid split-brain in a rep-3 volume. But the code to trigger name heal of gfid split-brain file is not yet there. The test is passing due a lookup/ stat on $M0 which triggers a background entry self heal (which has the code to heal gfid split-brain files) which may or may not complete the heal before line 170. If it doesn't, lookup on f3 is failing with EIO. Add the .t to bad tests until Karthik's patch for CLI based gfid split-brain resolution fixes name heal also. Change-Id: Iba6e9d81db386bc406aff1ecb6a18851f09bf7c0 BUG: 1450730 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/17290 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	Fixes quota aux mount failure	Sanoj Unnikrishnan	2017-05-08	6	-10/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The aux mount is created on the first limit/remove_limit/list command and it remains until volume is stopped / deleted / (quota is disabled) , where we do a lazy unmount. If the process is uncleanly terminated, then the mount entry remains and we get (Transport disconnected) error on subsequent attempts to run quota list/limit-usage/remove commands. Second issue, There is also a risk of inadvertent rm -rf on the /var/run/gluster causing data loss for the user. Ideally, /var/run is a temp path for application use and should not cause any data loss to persistent storage. Solution: 1) unmount the aux mount after each use. 2) clean stale mount before mounting, if any. One caveat with doing mount/unmount on each command is that we cannot use same mount point for both list and limit commands. The reason for this is that list command needs mount to be accessible in cli after response from glusterd, So it could be unmounted by a limit command if executed in parallel (had we used same mount point) Hence we use separate mount points for list and limit commands. Change-Id: I4f9e39da2ac2b65941399bffb6440db8a6ba59d0 BUG: 1433906 Signed-off-by: Sanoj Unnikrishnan <sunnikri@redhat.com> Reviewed-on: https://review.gluster.org/16938 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	Tier: Watermark check for hi and low value being equal	hari gowtham	2017-05-08	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Both low and hi watermark can be set to same value as the check missed the case for being equal. Fix: Add the check to both the hi and low values being equal along with the low value being higher than hi value. Change-Id: Ia235163aeefdcb2a059e2e58a5cfd8fb7f1a4c64 BUG: 1447960 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/17175 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Milind Changire <mchangir@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	gfapi/handleops: Introducing glfs_xreaddirplus_r() fop for handleops	Soumya Koduri	2017-05-02	2	-0/+134
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Its known that readdirplus operation fetches stat as well for each of the dirents. But often applications may need extra information, like for eg., NFS-Ganesha which operates on handles needs handles for each of those dirents returned. So this would require extra calls to the backend, in this case LOOKUP (which is very expensive operation) resulting in very low readdir performance. To address that introducing this new API using which applications can make request for any extra information to be returned as part of readdirplus response. Currently this new api returns stat and handles as demanded by application. The synopsis of the API is noted in glfs.h. @todo: * Enhance test script using this new API Below were the perf results on single brick volume with and without these changes - Dataset used - 10*100 directories and each directory containing 100 empty files. I used NFS-Ganesha application to test these changes - >for i in {1..5}; do systemctl restart nfs-ganesha; sleep 10; mount -t nfs -o vers=4 localhost:/brick_vol /mnt; cd /mnt; echo "ITERATION$i"; date; find . > tmp-nfs.log; date; cd /; umount /mnt; sleep 2; done; Without these changes - ITERATION1 Mon Mar 20 17:22:26 IST 2017 Mon Mar 20 17:23:18 IST 2017 ITERATION2 Mon Mar 20 17:23:39 IST 2017 Mon Mar 20 17:24:28 IST 2017 ITERATION3 Mon Mar 20 17:24:49 IST 2017 Mon Mar 20 17:25:36 IST 2017 ITERATION4 Mon Mar 20 17:30:57 IST 2017 Mon Mar 20 17:31:37 IST 2017 ITERATION5 Mon Mar 20 17:31:57 IST 2017 Mon Mar 20 17:32:40 IST 2017 [root@dhcp35-197 /]# On an average ~46.2 sec With these changes applied - ITERATION1 Mon Mar 20 17:35:03 IST 2017 Mon Mar 20 17:35:15 IST 2017 ITERATION2 Mon Mar 20 17:35:36 IST 2017 Mon Mar 20 17:35:46 IST 2017 ITERATION3 Mon Mar 20 17:36:06 IST 2017 Mon Mar 20 17:36:17 IST 2017 ITERATION4 Mon Mar 20 17:41:38 IST 2017 Mon Mar 20 17:41:49 IST 2017 ITERATION5 Mon Mar 20 17:42:10 IST 2017 Mon Mar 20 17:42:20 IST 2017 On an average ~10.8 sec Updates #174 BUG: 1442950 Change-Id: I0f74f74dc62085ca4c4a23c38e3edc84bd850876 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: https://review.gluster.org/15663 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	Halo Replication feature for AFR translator	Kevin Vigor	2017-05-02	2	-1/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Halo Geo-replication is a feature which allows Gluster or NFS clients to write locally to their region (as defined by a latency "halo" or threshold if you like), and have their writes asynchronously propagate from their origin to the rest of the cluster. Clients can also write synchronously to the cluster simply by specifying a halo-latency which is very large (e.g. 10seconds) which will include all bricks. In other words, it allows clients to decide at mount time if they desire synchronous or asynchronous IO into a cluster and the cluster can support both of these modes to any number of clients simultaneously. There are a few new volume options due to this feature: halo-shd-latency: The threshold below which self-heal daemons will consider children (bricks) connected. halo-nfsd-latency: The threshold below which NFS daemons will consider children (bricks) connected. halo-latency: The threshold below which all other clients will consider children (bricks) connected. halo-min-replicas: The minimum number of replicas which are to be enforced regardless of latency specified in the above 3 options. If the number of children falls below this threshold the next best (chosen by latency) shall be swapped in. New FUSE mount options: halo-latency & halo-min-replicas: As descripted above. This feature combined with multi-threaded SHD support (D1271745) results in some pretty cool geo-replication possibilities. Operational Notes: - Global consistency is gaurenteed for synchronous clients, this is provided by the existing entry-locking mechanism. - Asynchronous clients on the other hand and merely consistent to their region. Writes & deletes will be protected via entry-locks as usual preventing concurrent writes into files which are undergoing replication. Read operations on the other hand should never block. - Writes are allowed from _any_ region and propagated from the origin to all other regions. The take away from this is care should be taken to ensure multiple writers do not write the same files resulting in a gfid split-brain which will require resolution via split-brain policies (majority, mtime & size). Recommended method for preventing this is using the nfs-auth feature to define which region for each share has RW permissions, tiers not in the origin region should have RO perms. TODO: - Synchronous clients (including the SHD) should choose clients from their own region as preferred sources for reads. Most of the plumbing is in place for this via the child_latency array. - Better GFID split brain handling & better dent type split brain handling (i.e. create a trash can and move the offending files into it). - Tagging in addition to latency as a means of defining which children you wish to synchronously write to Test Plan: - The usual suspects, clang, gcc w/ address sanitizer & valgrind - Prove tests Reviewers: jackl, dph, cjh, meyering Reviewed By: meyering Subscribers: ethanr Differential Revision: https://phabricator.fb.com/D1272053 Tasks: 4117827 Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1 BUG: 1428061 Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on: http://review.gluster.org/16099 Reviewed-on: https://review.gluster.org/16177 Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	cluster/dht: Make rebalance throttle option tuned by number	Susant Palai	2017-04-29	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current rebalance throttle options: lazy/normal/aggressive may not always be sufficient for the purpose of throttling. In our recent test, we observed for certain setups, normal and aggressive modes behaved similarly consuming full disk bandwidth. So in cases like this admin should be able to tune it down(or vice versa) depending on the need. Along with old throttle configurations, thread counts are tuned based on number. e.g. gluster v set vol-name cluster-rebal.throttle 5. Admin can tune up/down between 0 and the number of cores available. Note: For heterogenous servers, validation will fail on the old server if "number" is given for throttle configuration. The message looks something like this: "volume set: failed: Staging failed on vm2. Error: cluster.rebal-throttle should be {lazy\|normal\|aggressive}" Test: Manual test by logging active thread number after reconfiguring throttle option. testcase: tests/basic/distribute/throttle-rebal.t Change-Id: I46e3cde546900307831028b344ecf601fd9b02c3 BUG: 1438370 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/16980 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	libglusterfs: accept random volname in glusterfs_graph_prepare()	Niels de Vos	2017-04-26	4	-1/+108
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the call to glfs_new("volname") passes a name for the volume and it does not match the name of the subvolume in the graph, glfs_init() will fail. This is easily reproducible by a gfapi program that loads the volume from a .vol file, and not from a GlusterD server. Change-Id: I33e77fbee7d12eaefe7c384fad6aecfa3582ea5a BUG: 1425623 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/16796 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	debug/sink: add xlator to aid in resource leak debugging	Niels de Vos	2017-04-25	2	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This new xlator does not allocate any resources on init(). This makes it a good option to use for debugging xlator releated resources leaks on fini(). By putting the sink xlator as single xlator in a .vol file, and loading it through gfapi, we can investigate the resource leaks that are happening through gfapi (and the Gluster core). By extending the .vol file with additional xlators, it is possible to analyze resource leaks of single xlators. Change-Id: Idb5faa861b623dd5b2a988b181e669b0d52c2a0e BUG: 1425623 Fixes: #176 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/16806 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	cluster/afr: GFID split brain resolution with favorite-child-policy	karthik-us	2017-04-20	1	-0/+228
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Currently the automatic split brain resolution with favorite child policy is not resolving the GFID split brains. Fix: When there is a GFID split brain and the favorite child policy is set to size/mtime/ctime/majority, based on the policy decide on the source and sinks. Delete the entry from the sinks and recreate it from the source. Mark the appropriate pending attributes and resolve the GFID split brain. When the heal takes place it will complete the pending heals and reset the attributes. Change-Id: Ie30e5373f94ca6f276745d9c3ad662b8acca6946 BUG: 1430719 Signed-off-by: karthik-us <ksubrahm@redhat.com> Reviewed-on: https://review.gluster.org/16878 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	Implement negative lookup cache	Poornima G	2017-04-20	1	-0/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before creating any file negative lookups(1 in Fuse, 4 in SMB etc.) are sent to verify if the file already exists. By serving these lookups from the cache when possible, increases the create performance by multiple folds in SMB access and some percentage in Fuse/NFS access. Feature page: https://review.gluster.org/#/c/16436 Updates #82 Change-Id: Ib1c0e7ac7a386f943d84f6398c27f9a03665b2a4 BUG: 1442569 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/16952 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	geo-rep: filter out xtime attribute during getxattr	Saravanakumar Arumugam	2017-04-11	1	-10/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	georep gsyncd's xtime needs to filtered irrespective of any process access. This way, we can avoid (unnecessarily)syncing xtime attribute to slave, which may raise permission denied errors. test case modified to check for xtime xattr only in backend. Change-Id: I2390b703048d5cc747d91fa2ae884dc55de58669 BUG: 1353952 Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com> Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: https://review.gluster.org/14880 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Kotresh HR <khiremat@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	TIER/TESTS: improving regression test for tier	hari gowtham	2017-04-10	6	-78/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The test files that were marked as bad test were checked and updated for centos. The tests that had issue were fixed. Tests that aren't needed anymore are removed. REASON: tests/basic/tier/tier-file-create.t This test checks one line after creating a tiered volume (which is done in every tier test). So this line is moved along with other test in tier and the file is deleted. tests/bugs/tier/bug-1286974.t This bug checks for the tier as a task and tier has been moved from a task to service as a part of the tier as a service patch https://review.gluster.org/#/c/13365/ So it is removed from bad tests. tests/basic/tier/record-metadata-heat.t This test had a bug and has been fixed. tests/basic/tier/bug-1214222-directories_missing_after_attach_tier.t tests/basic/tier/fops-during-migration.t tests/basic/tier/tier-snapshot.t tests/basic/tier/tier_lookup_heal.t These test seem to work fine on centos now. Change-Id: I05537f4bbb91584410177ce43543897eff8761a1 BUG: 1421600 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/16605 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
*	TIER: watermark check during low watermark reset	hari gowtham	2017-03-18	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PROBLEM: during a low watermark reset, checking of whether the low watermark is lower than hi watermark is not done. FIX: This patch checks if the hi watermark value is higher the default low watermark. Else throws an failure of the reset command Change-Id: I8b49090c6bccce6d45c2e8076ab766047a2a6162 BUG: 1328342 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/14028 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd: disallow increasing replica count for arbiter volumes	Ravishankar N	2017-03-06	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: add-brick command to increase replica count in an arbiter volume succeeds, causing undesirable effects like the 4th brick being loaded with the arbiter xlator, the 3rd one losing the arbiter xlator (when the brick process is restarted), arbitration logic in afr going for a toss etc. Fix: Arbiter configuration should always be a replica 3 volume (of which 3rd brick is arbiter). Hence disallow increasing replica count for arbiter volume configurations. Change-Id: I9fe4edac880d0f711e6d44324ad5562974e53e51 BUG: 1429200 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: https://review.gluster.org/16845 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	cluster/ec: Introduce optimistic changelog in EC	Pranith Kumar K	2017-03-04	1	-0/+152
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Fix to https://bugzilla.redhat.com/show_bug.cgi?id=1316873 has made changes to set dirty flag before every update fop, data or metadata, and unset it after successful operation. That makes some of the fops very slow such as entry operations or metadata operations. Solution: File data operations are the only operation which take some time and setting dirty flag before a fop and unsetting it after serves the purpose as probability of failure of a fop is high when the time duration is more. For all the other operations, set dirty flag at the end of the fop, if any brick is down and need heal. Providing following option to choose between high performance or better heal marking for metadata and entry fops. Set/Unset dirty flag for every update fop at the start of the fop. If ON, this option impacts performance of entry operations or metadata operations as it will set dirty flag at the start and unset it at the end of ALL update fop. If OFF and all the bricks are good, dirty flag will be set at the start only for file fops For metadata and entry fops dirty flag will not be set at the start, if all the bricks are good. This does not impact performance for metadata operations and entry operation but has a very small window to miss marking entry as dirty in case it is required to be healed. Thanks to Xavi and Ashish for the design Picked the .t file from Ashish' patch https://review.gluster.org/16298 BUG: 1408809 Change-Id: I3ce860063f0e2901e50754dcfc3e4ed22daf819f Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: https://review.gluster.org/16821 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Xavier Hernandez <xhernandez@datalab.es> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	tests: Fix and enable split-brain-healing mtime check	Niklas Hambüchen	2017-03-01	1	-6/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This test was commented out with the belief that it depended on utimensat() support, but in fact it was not necessary because `stat -c %Y` only outputs second resolution. Simply commenting in the test made it fail because it checked the values before the heal, while intended was to check them after the heal. This commit fixes that. Change-Id: I4194ac645b365a1f906a3ac9bcbbdb1f05000e27 BUG: 1422074 Signed-off-by: Niklas Hambüchen <mail@nh2.me> Reviewed-on: https://review.gluster.org/16789 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niklas Hambüchen
*	cluster/ec: Don't trigger data/metadata heal on Lookups	Pranith Kumar K	2017-02-26	3	-10/+86
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem-1 If Lookup which doesn't take any locks observes version mismatch it can't be trusted. If we launch a heal based on this information it will lead to self-heals which will affect I/O performance in the cases where Lookup is wrong. Considering self-heal-daemon and operations on the inode from client which take locks can still trigger heal we can choose to not attempt a heal on Lookup. Problem-2: Fixed spurious failure of tests/bitrot/bug-1373520.t For the issues above, what was happening was that ec_heal_inspect() is preventing 'name' heal to happen Problem-3: tests/basic/ec/ec-background-heals.t To be honest I don't know what the problem was, while fixing the 2 problems above, I made some changes to ec_heal_inspect() and ec_need_heal() after which when I tried to recreate the spurious failure it just didn't happen even after a long time. BUG: 1414287 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Change-Id: Ife2535e1d0b267712973673f6d474e288f3c6834 Reviewed-on: https://review.gluster.org/16468 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ashish Pandey <aspandey@redhat.com>
*	tests: Added check for NFS export availability to quota-anon-fd-nfs.t	Shyam	2017-02-22	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: I15a9441267c18bb1073d14db325c98fa497f2fb7 BUG: 1425515 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: https://review.gluster.org/16701 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: sanoj-unnikrishnan <sunnikri@redhat.com>
*	trash: fix problem with trash feature under multiplexing	Jeff Darcy	2017-02-09	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With multiplexing, the trash translator gets a reconfigure call before a notify(CHILD_UP). In this case, priv->trash_itable was not yet initialized, so the reconfigure would get a SEGV. Moving the itable allocation to init seems to fix it, so trash can be reenabled. Change-Id: I21ac2d7fc66bac1bc4ec70fbc8bae306d73ac565 BUG: 1420434 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16567 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Anoop C S <anoopcs@redhat.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	tests: fix online_brick_count for multiplexing	Jeff Darcy	2017-02-07	1	-7/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The number of brick processes no longer matches the number of bricks, therefore counting processes doesn't work. Counting pidfiles does. Ironically, the fix broke multiplex.t which used this function, so it now uses a different function with the old process-counting behavior. Also had to fix online_brick_count and kill_node in cluster.rc to be consistent with the new reality. Change-Id: I4e81a6633b93227e10604f53e18a0b802c75cbcc BUG: 1385758 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16527 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	CLI/TIER: removing old tier commands under rebalance	hari gowtham	2017-02-07	1	-1/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PROBLEM: gluster v rebalance <volname> tier start works even after the switch of tier to service framework. This lets the user have two tierd for the same volume. FIX: checking for each process will make the new code hard to maintain. So we are removing the support for old commands. Change-Id: I5b0974b2dbb74f0bee8344b61c7f924300ad73f2 BUG: 1415590 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: https://review.gluster.org/16463 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	extras: Provide group set for md-cache and invalidation options	Poornima G	2017-02-03	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To enable the integration of md-cache and invalidation features we need to perform 3 volume set options in a specific order. In order to ease this for user provide a group volume set option. Usage: gluster vol set <VOLNAME> group metadata-cache Change-Id: I9bf0fd4217aa2a1c7ffbdc93e879b10f87addeac BUG: 1418249 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/16503 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	libglusterfs: make memory pools more thread-friendly	Jeff Darcy	2017-02-02	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Early multiplexing tests revealed massive contention on certain pools' global locks - especially for dictionaries and secondarily for call stubs. For the thread counts that multiplexing can create, a more lock-free solution is clearly needed. Also, the current mem-pool implementation does a poor job releasing memory back to the system, artificially inflating memory usage to match whatever the worst case was since the process started. This is bad in general, but especially so for multiplexing where there are more pools and a major point of the whole exercise is to reduce memory consumption. The basic ideas for the new design are these There is one pool, globally, for each power-of-two size range. Every attempt to create a new pool within this range will instead add a reference to the existing pool. Instead of adding pools for each translator within each multiplexed brick (potentially infinite and quite possibly thousands), we allocate one set of size-based pools per thread (hundreds at worst). Each per-thread pool is divided into hot and cold lists. Every allocation first attempts to use the hot list, then the cold list. When objects are freed, they always go on the hot list. There is one global "pool sweeper" thread, which periodically reclaims everything in each pool's cold list and then "demotes" the current hot list to be the new cold list. For normal allocation activity, only a per-thread lock need be taken, and even that only to guard against very rare contention from the pool sweeper. When threads start and stop, a global lock must be taken to add them to the pool sweeper's list. Lock contention is therefore extremely low, and the hot/cold lists also provide good locality. A more complete explanation (of a similar earlier design) can be found here: http://www.gluster.org/pipermail/gluster-devel/2016-October/051160.html Change-Id: I5bc8a1ba57cfb553998f979a498886e0d006e665 BUG: 1385758 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/15645 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	core: run many bricks within one glusterfsd process	Jeff Darcy	2017-01-30	31	-38/+218
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for multiple brick translator stacks running in a single brick server process. This reduces our per-brick memory usage by approximately 3x, and our appetite for TCP ports even more. It also creates potential to avoid process/thread thrashing, and to improve QoS by scheduling more carefully across the bricks, but realizing that potential will require further work. Multiplexing is controlled by the "cluster.brick-multiplex" global option. By default it's off, and bricks are started in separate processes as before. If multiplexing is enabled, then compatible bricks (mostly those with the same transport options) will be started in the same process. Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb BUG: 1385758 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/14763 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/ec: mark ec-background-heal.t as bad	Xavier Hernandez	2017-01-26	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: I0c54c62cdeb40b983da2392296762471a5474652 BUG: 1416689 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: https://review.gluster.org/16470 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	md-cache: Cache security.ima xattrs	Poornima G	2017-01-20	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From kernel version 3.X or greater, creating of a file results in removexattr call on security.ima xattr. But this xattr is not set on the file unless IMA feature is active. With this patch, removxattr call returns ENODATA if it is not found in the cache. Change-Id: I8136096598a983aebc09901945eba1db1b2f93c9 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/16296 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	gfapi: add API to trigger events for debugging and troubleshooting	Niels de Vos	2017-01-19	2	-0/+100
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce glfs_sysrq() as a generic API for triggering debug and troubleshoot events. This interface will be used by the feature to get statedumps for applications using libgfapi. The current events that can be requested through this API are: - 'h'elp: log a mesage with all supported events - 's'tatedump: trigger a statedump for the passed glfs_t In future, this API can be used by a CLI to trigger statedumps from storage servers. At the moment it is limited to take statedumps, but it is extensible to set the log-level, clear caches, force reconnects and much more. BUG: 1169302 Change-Id: I18858359a3957870cea5139c79efe1365a15a992 Original-author: Poornima G <pgurusid@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/16414 Reviewed-by: Prashanth Pai <ppai@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	tier : Tier as a service	hari gowtham	2017-01-16	4	-5/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tierd is implemented by separating from rebalance process. The commands affected: 1) Attach tier will trigger this process instead of old one 2) tier start and tier start force will also trigger this process. 3) volume status [tier] will show tier daemon as a process instead of task and normal tier status and tier detach status works. 4) tier stop implemented. 5) detach tier implemented separately along with new detach tier status 6) volume tier volname status will work using the changes. 7) volume set works This patch has separated the tier translator from the legacy DHT rebalance code. It now sends the RPCs from the CLI to glusterd separate to the DHT rebalance code. The daemon is now a service, similar to the snapshot daemon, and can be viewed using the volume status command. The code for the validation and commit phase are the same as the earlier tier validation code in DHT rebalance. The “brickop” phase has been changed so that the status command can use this framework. The service management framework is now used. DHT rebalance does not use this framework. This service framework takes care of : ) spawning the daemon, killing it and other such processes. ) volume set options , which are written on the volfile. ) restart and reconfigure functions. Restart is to restart the daemon at two points 1)after gluster goes down and comes up. 2) to stop detach tier. ) reconfigure is used to make immediate volfile changes. By doing this, we don’t restart the daemon. it has the code to rewrite the volfile for topological changes too (which comes into place during add and remove brick). With this patch the log, pid, and volfile are separated and put into respective directories. Change-Id: I3681d0d66894714b55aa02ca2a30ac000362a399 BUG: 1313838 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/13365 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	readdir-ahead : Perform STACK_UNWIND outside of mutex locks	Poornima G	2017-01-09	1	-0/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently STACK_UNWIND is performnd within ctx->lock. If readdir-ahead is loaded as a child of dht, then there can be scenarios where the function calling STACK_UNWIND becomes re-entrant. Its a good practice to not call STACK_WIND/UNWIND within local mutex's Change-Id: If4e869849d99ce233014a8aad7c4d5eef8dc2e98 BUG: 1401812 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/16068 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	tests: Remove rpm test	Nigel Babu	2017-01-06	1	-145/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is already tested by our smoke test. Testing it again is redundant. Change-Id: Icae4e8650002cd847dcb7ea76fd0447df7e72816 BUG: 1408755 Signed-off-by: Nigel Babu <nigelb@redhat.com> Reviewed-on: http://review.gluster.org/16287 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Anoop C S <anoopcs@redhat.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	glusterd: Fail add-brick on replica count change, if brick is down	karthik-us	2017-01-06	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: 1. Have a replica 2 volume with bricks b1 and b2 2. Before setting the layout, b1 goes down 3. Set the layout write some data, which gets populated on b2 4. b2 goes down then b1 comes up 5. Add another brick b3, and heal will take place from b1 to b3, which basically have no data 6. Write some data. Both b1 and b3 will mark b2 for pending writes 7. b1 goes down, and b2 comes up 8. b2 gets heald from b1. During heal it removes the data which is already in b2, considering that as stale data. This leads to data loss. Solution: 1. In glusterd stage-op, while adding bricks, check whether the replica count is being increased 2. If yes, then check whether any of the bricks are down at that time 3. If yes, then fail the add-brick to avoid such data loss 4. Else continue the normal operation. This check will work enen when we convert plain distribute volume to replicate Test: 1. Create a replica 2 volume 2. Kill one brick from the volume 3. Try adding a brick to the volume 4. It should fail with all bricks are not up error 5. Cretae a distribute volume and kill one of the brick 6. Try to convert it to replicate volume, by adding bricks. 7. This should also fail. Change-Id: I9c8d2ab104263e4206814c94c19212ab914ed07c BUG: 1406411 Signed-off-by: karthik-us <ksubrahm@redhat.com> Reviewed-on: http://review.gluster.org/16330 Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	tests: Fix split-brain-favorite-child-policy.t failures	Ravishankar N	2017-01-02	1	-4/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In CentOS-7, the file was receving an extra removexattr(security.ima) FOP which changed its ctime, breaking the assumption that a particular brick had the latest ctime based on the writevs done in the .t Fix: 1. Compare the ctime of both files in the backend and pick the one with the latest ctime for the fav-child policy. Also unmount the volume before comparing, to avoid any further FOPS on the file that can possibly modify the timestamps. 2. Added floating point handling in stat function. Thanks to Pranith for the helping debugging the regex. Change-Id: I06041a0f39a29d2593b867af8685d65c7cd99150 BUG: 1408757 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/16288 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	Repair EC tests for FB environment (/mnt/gvfs is teh sux00r)	Kevin Vigor	2016-12-29	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Don't blindly 'df', target the volume in question. (this is because FB build environment has a /mnt/gvfs which fails df). Test Plan: runtests.sh Reviewers: Subscribers: Tasks: Blame Revision: Change-Id: Ic2c5883dd102835db64be9594657257e20711ba0 BUG: 1406878 Signed-off-by: Kevin Vigor <kvigor@fb.com> Reviewed-on-3.8-fb: http://review.gluster.org/16182 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Tested-by: Shreyas Siravara <sshreyas@fb.com> Reviewed-by: Shreyas Siravara <sshreyas@fb.com> Reviewed-on: http://review.gluster.org/16226 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	tests: Move tests/basic/gfapi/bug1291259.t to bad tests list	Krutika Dhananjay	2016-12-19	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	See thread http://www.gluster.org/pipermail/gluster-devel/2016-December/051714.html for more information. Change-Id: I9abe4b0e40499e53c1276a10a6bc192fd0f2cef7 BUG: 1405301 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/16159 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	glfsheal: Explicitly enable self-heal xlator options	Ravishankar N	2016-12-15	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enable data, metadata and entry self-heal as xlator-options so that glfs-heal.c can heal split-brain files even if they are disabled on the volume via volume set commands. Change-Id: Ic191a1017131db1ded94d97c932079d7bfd79457 BUG: 1234054 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/11333 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	cluster/afr: CLI for granular entry heal enablement/disablement	Krutika Dhananjay	2016-11-28	6	-7/+149
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When there are already existing non-granular indices created that are yet to be healed, if granular-entry-heal option is toggled from 'off' to 'on', AFR self-heal whenever it kicks in, will try to look for granular indices in 'entry-changes'. Because of the absence of name indices, granular entry healing logic will fail to heal these directories, and worse yet unset pending extended attributes with the assumption that are no entries that need heal. To get around this, a new CLI is introduced which will invoke glfsheal program to figure whether at the time an attempt is made to enable granular entry heal, there are pending heals on the volume OR there are one or more bricks that are down. If either of them is true, the command will be failed with the appropriate error. New CLI: gluster volume heal <VOL> granular-entry-heal {enable,disable} Change-Id: I1f4fe8162813b9068e198965d94169fee4adc099 BUG: 1370410 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15747 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	cluster/afr: Fix bugs in [f]inodelk/[f]entrylk	Pranith Kumar K	2016-11-26	1	-0/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problems: 1) Inodelk is not taking quorum into account 2) finodelk, [f]entrylk are not implemented correctly 3) By default afr doesn't go for non-blocking parallel locks. Fix: Implemented a common framework which can be used by [f]inodelk/[f]entrylk. Used quorum for the same. Change-Id: I239f13875a065298630d266941df10cfa3addc85 BUG: 1369077 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15802 Tested-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
*	features/index: Delete granular entry indices of already healed directories ↵	Krutika Dhananjay	2016-11-24	1	-0/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	during crawl If granular name indices are already in existence for a volume, and before they are healed, granular entry heal be disabled, a crawl on indices/xattrop will clear the changelogs on these directories. When their corresponding entry-changes indices are crawled subsequently, if it is found that the directories don't need heal anymore, the granular indices are not cleaned up. This patch fixes that problem by ensuring that the zero-xattrop also deletes the stale indices at the level of index translator. Change-Id: Ifbaa6bec2a14e3041addfee4054131babbf4d35e BUG: 1370410 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15880 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
*	jbr: Sending rollback from failed fop to fdl	Avra Sengupta	2016-11-08	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In case of a failed fop, the failure is detected by the leader in the jbr-server in two places. First during a quorum check of +ve responses when it receives responses from all the followers. At this point if the fop hasn't been successfully journaled at a quorum of followers (as in there is no merit in trying the fop in the leader as the quorum will never be met), then we fail the fop. Also if this quorum is met, then the fop is tried on the leader, and after the leader completes the fop a quorum check similar to the previous one is done again, this time including the leaders outcome. If quorum is not met, then we fail the fop. In both these cases, when the fop fails we send a -ve ack to the client. With this patch, now we will also send a rollback through a GF_FOP_IPC to all the followers(and also to the leader in the second case of failure). This rollback will contain the index and term number of the fop which failed. This will be recorded in the respective journals of the bricks and will be used to rollback the fop on that brick later. A subsequent write, and it's respective rollback would look something like the following in the journal. The trusted.jbr.term and trusted.jbr.index present in the dict of both the logs, relate them, and the presence of "rollback-fop" in the dict of IPC indicates that it is a rollback fop, and the value 13(stands for GF_FOP_WRITE) indicates what kind of rollback operation it is. === GF_FOP_WRITE fd = <gfid 77f12ea2-ca56-40e3-a46e-ba2308baa035> vector = <158 bytes> offset = 0 (0x0) flags = 32769 (0x8001) xdata = dict { trusted.jbr.term = 0 <2 bytes> trusted.jbr.index = 4 <2 bytes> } === GF_FOP_IPC xdata = dict { trusted.jbr.term = 0 <2 bytes> trusted.jbr.index = 4 <2 bytes> rollback-fop = 13 <3 bytes> } Change-Id: I70b6a143d20697153d58e2f719e34ecd1ed160a5 BUG: 1349385 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/14783 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
*	gfapi: async fops should unref in callbacks	Raghavendra Talur	2016-11-05	2	-0/+203
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If fd is unref'd at the end of async call then the unref in cbks would lead to double unref and possible crash. Removing duplicate unrefs. Added unref only in failure cases. A simple test case has been added to test async write case. Need to extend the same for other async APIs too. Details: All glfd based calls in libgfapi, except for glfs_open and glfs_close, behave in the same way. At the start of the operation, they take a ref on glfd and fd. At the end of the operation, they unref it. Async calls are a little different as they unref in the cbk function. A successfull open call does not unref either the glfd or fd, thereby functioning as a reference for a OPEN file object. glfs_close makes a syncop_flush call sandwiched between a fd ref and unref(this can be removed, more on this below), followed by a call to glfs_mark_glfd_for_deletion which unrefs glfd and also calls glfs_fd_destroy as a release function thereby doing a unref on fd too. Functionally, there is no problem with how everything works when as described above. However, it is a little non-intuitive that we need to perform a fd_unref as a consequence of a implicit fd_ref that happens within glfs_resolve_fd. As we perform a GF_REF_GET(glfd) at the start of every operation, it would be worthwhile to remove the fd_ref that glfs_resovle_fd takes and do away with explicit fd_unref()s at the end of every operation. This is the same reason why we don't need the fd_ref in glfs_close. This is however not in the scope of this patch. Change-Id: I86b1d3b2ad846b16ea527d541dc82b5e90b0ba85 BUG: 1391086 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/15768 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Prasanna Kumar Kalever <pkalever@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
*	snapshot: Fix the failure to recreate clones with same name	Avra Sengupta	2016-11-01	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The brick path of snapshot clones contained the clonename, thereby failing to create newer clones with the same name after the original clone had been deleted. This fix creates the brick path with the clone's vol id instead of the clones name. Hence future clones with the same name will not have the namespace clash. Change-Id: I262712adc576122f051b5d1ce171d020efaefd1a BUG: 1387160 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15683 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>