summaryrefslogtreecommitdiffstats
path: root/xlators/cluster
Commit message (Collapse)AuthorAgeFilesLines
* afr: fix unused variable warnings/errorsKaleb S. KEITHLEY2016-08-291-1/+0
| | | | | | | | | | | | | | | | | | | | http://review.gluster.org/14085 fixes a/the "leak" - via the generated rpc/xdr headers - of pragmas that mask these warnings. However 14085 won't pass the smoke test until all the warnings are fixed. Change-Id: I98e3308a2548ae095048caa99c86edec15b5e782 BUG: 1369124 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/15241 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* dht: Implement ipc fopPoornima G2016-08-273-0/+87
| | | | | | | | | | | | | | | | | | ipc is used by md-cache to communicate the list of xattrs that it is caching, to the upcall xlator. Hence implement this in dht, such that it winds to all the bricks if the ipc op is GF_IPC_MDC_TARGET_UPCALL. The ips should not fail if any of the bricks is down, as md-cache will replay the ipc late when the brick comes back up. Change-Id: Ica551a550c04cbb1240c0d211fe831c2e5eb6017 BUG: 1211863 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/15225 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/ec: Use locks for opendirPranith Kumar K2016-08-251-1/+21
| | | | | | | | | | | | | | | | | | | | | | | Problem: In some cases we see that readdir keeps winding to the brick that doesn't have any blocked locks i.e. first brick. This is leading to the client assuming that there are no blocking locks on the inode so it won't give away the lock. Other clients end up blocked on the lock as if the command hung. Fix: Proper way to fix this issue is to use infra present in http://review.gluster.org/14736 This is a stop gap fix where we start taking inodelks in opendir which goes to all the bricks, this will detect if there is any contention. BUG: 1346719 Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15309 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ashish Pandey <aspandey@redhat.com>
* quotad: fix potential buffer overflowsRaghavendra G2016-08-252-4/+23
| | | | | | | | | | | | | | | | | | | This converts sprintf to gf_asprintf in following components: * quotad.c * dht * afr * protocol/client * rpc/rpc-lib * rpc/rpc-transport Change-Id: If8a267bab3d91003bdef3a92664077a0136745ee BUG: 1332073 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/14102 Tested-by: Manikandan Selvaganesh <mselvaga@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
* cluster/ec: Do multi-threaded self-healPranith Kumar K2016-08-244-5/+42
| | | | | | | | | | | BUG: 1368451 Change-Id: I5d6b91d714ad6906dc478a401e614115c89a8fbb Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15083 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ashish Pandey <aspandey@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Give option to do consistent-ioPranith Kumar K2016-08-229-41/+229
| | | | | | | | | | | | | | | | | | | | | | | Problem: When tiering/rebalance does migrations and afr with 2-way replica is in picture, migration can read stale data if the source brick goes down and writes to the destination. After this deletion of the file leads to permanent loss of data after migration. Fix: Rebalance/tiering should migrate only when the data is definitely not stale. So introduce an option in afr called consistent-io which will be enabled in migration daemons. BUG: 1306398 Change-Id: I750f65091cc70a3ed4bf3c12f83d0949af43920a Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13425 Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Prevent split-brain when bricks are brought off and on in ↵Krutika Dhananjay2016-08-229-51/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cyclic order When the bricks are brought offline and then online in cyclic order while writes are in progress on a file, thanks to inode refresh in write txns, AFR will mostly fail the write attempt when the only good copy is offline. However, there is still a remote possibility that the file will run into split-brain if the brick that has the lone good copy goes offline *after* the inode refresh but *before* the write txn completes (I call it in-flight split-brain in the patch for ease of reference), requiring intervention from admin to resolve the split-brain before the IO can resume normally on the file. To get around this, the patch does the following things: i) retains the dirty xattrs on the file ii) avoids marking the last of the good copies as bad (or accused) in case it is the one to go down during the course of a write. iii) fails that particular write with the appropriate errno. This way, we still have one good copy left despite the split-brain situation which when it is back online, will be chosen as source to do the heal. Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a BUG: 1363721 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15080 Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Oleksandr Natalenko <oleksandr@natalenko.name> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* stripe: Fix wrong pathinfo in striped-replicated volumeEkasit Kijsipongse2016-08-161-1/+1
| | | | | | | | | | | | | Change-Id: I05b3ba6757d5b786daf7cb3a64e6ac6676e9c997 BUG: 1200914 Signed-off-by: Ekasit Kijsipongse <ekasit.kijsipongse@nectec.or.th> Reviewed-on: http://review.gluster.org/11375 Tested-by: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/afr: Bug fixes in txn codepathKrutika Dhananjay2016-08-151-2/+2
| | | | | | | | | | | | | | | | | | AFR sets transaction.pre_op[] array even before actually doing the pre-op on-disk. Therefore, AFR must not only consider the pre_op[] array but also the failed_subvols[] information before setting the pre_op_done[] flag. This patch fixes that. Change-Id: I78ccd39106bd4959441821355a82572659e3affb BUG: 1363721 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15145 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* libglusterfs: move alloca0 definition to common-utilsRavishankar N2016-08-122-2/+0
| | | | | | | | | | | | | | ...and remove their definitons from EC and AFR. Also `s/alloca+memset0/alloca0` wherever it is used. Change-Id: I3b71e596d12a7d8900f5d761af6b98305c8874d5 BUG: 1366226 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/15147 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: copy loc before passing to syncopPranith Kumar K2016-08-041-1/+2
| | | | | | | | | | | | | | | | | | | | | Problem: When io-threads is enabled on the client side, io-threads destroys the call-stub in which the loc is stored as soon as the c-stack unwinds. Because afr is creating a syncop with the address of loc passed in setxattr by the time syncop tries to access it, io-threads would have already freed the call-stub. This will lead to crash. Fix: Copy loc to frame->local and use it's address. BUG: 1361678 Change-Id: I16987e491e24b0b4e3d868a6968e802e47c77f7a Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15070 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/tier: dont promote if estimated block consumption > hi watermarkMilind Changire2016-07-292-50/+153
| | | | | | | | | | | | | | | | | | Add test to fail promotion if estimated block consumption grows beyond hi watermark. Skip file migrations until next cycle if tier_get_fs_stat() fails in tier_migrate_using_query_file() Change-Id: Ice04572fa739c09109c4433e65965197482a7beb BUG: 1349284 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-on: http://review.gluster.org/14780 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* dht/rebalance: allocate migrator thread pool dynamicallySusant Palai2016-07-281-3/+14
| | | | | | | | | | | | | | | | | | Problems: The maximum number of migratior threads created was static set to "40". And the number of these threads get created in rebalance depends on the number of cores user has. If the number of cores exceeds 40, a crash or memory corruption can be seen. Fix: Make the migratior thread pool dynamic. Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839 BUG: 1359711 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/15000 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier: fix statfs for dht/tiered volumesDan Lambright2016-07-276-22/+265
| | | | | | | | | | | | | | | | | | | | | | | | | Return the correct size of the tiered volume in statfs. It should be the size of the cold tier, not the sum of the hot and cold tier, because the hot tier is a cache and not an extension of the volume's capacity. The number of free blocks, etc is the cold tier's capacity subtracted by the sum of utilization on the hot and cold tiers. Note if both tiers are part of the same file system this must be accounted for as well. The patch also fixes a pre-existing bug in the DHT/tier translators. If statfs was taken on a file, the code only calculated free space on the cached subvolume, not all subvolumes in the replica group. With the fix, this is corrected, except in the case where quota is used with the deem-statfs option set to "on". Change-Id: I2b8bcb4511edf83f12130960aad0a609fcf8f513 BUG: 1339689 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/14536 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* afr: some coverity fixesRavishankar N2016-07-267-103/+155
| | | | | | | | | | | | | | | Thanks to Krutika for a cleaner way to track inode refs in afr_set_split_brain_choice(). Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df BUG: 1355604 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/14895 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* dict: Don't expose get_new_dict/dict_destroyPranith Kumar K2016-07-254-14/+12
| | | | | | | | | | | | | | | get_new_dict/dict_destroy is causing confusion where, dict_new/dict_destroy or get_new_dict/dict_unref are used instead of dict_new/dict_unref. Change-Id: I4cc69f5b6711d720823395e20fd624a0c6c1168c BUG: 1296043 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13183 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* cluster/ec: Handle absence of keys in some callback dictAshish Pandey2016-07-251-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: This issue arises when we do a rolling update from 3.7.5 to 3.7.9. For 4+2 volume running 3.7.5, if we update 2 nodes and after heal completion kill 2 older nodes, this problem can be seen. After update and killing of bricks, 2 nodes will return inodelk count key in dict while other 2 nodes will not have inodelk count in dict. This is also true for get-link-count. During dictionary match , ec_dict_compare, this will lead to mismatch of answers and the file operation on mount point will fail with IO error. Solution: Don't match inode, entry and link count keys while comparing two dictionaries. However, while combining the data in ec_dict_combine, go through all the dictionaries and select the maximum values received in different dicts for these keys. Change-Id: I33546e3619fe8f909286ee48fb0df2009cd3d22f BUG: 1347686 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/14761 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* core: add a basis function to reduce verbose codeZhou Zhengping2016-07-181-15/+0
| | | | | | | | | | | Change-Id: Icebe1b865edb317685e93f3ef11d98fd9b2c2e9a BUG: 1357226 Signed-off-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Reviewed-on: http://review.gluster.org/14936 Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* afr, index: Clean up stale directory and file indices in granular entry shKrutika Dhananjay2016-07-113-13/+53
| | | | | | | | | | | | | | | | | | | | | | | | Specifically when a directory tree is removed (rm -rf) while a brick is down, both the directory index and the name indices of the files and subdirs under it will remain. Self-heal will need to pick up these and remove them. Towards this, afr sh will now also crawl indices/entry-changes and call an rmdir on the dir if the directory index is stale. On the brick side, rmdir fop has been implemented for index xl, which would delete the directory index and its contents if present in a synctask. Change-Id: I8b527331c2547e6c141db6c57c14055ad1198a7e BUG: 1331323 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/14832 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/afr: Fix a minor typo in afr-self-heald.cVijay Bellur2016-07-091-6/+6
| | | | | | | | | | | | s/outout/output/ Change-Id: I2aec770cdae513cd4932e5fd56e0267584e44cae Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/13930 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com>
* afr:Don't wind reads for files in metadata split-brainRavishankar N2016-06-241-10/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: For a read on a file in metadata split-brain: 1.lookup_done resets event_generation to zero. 2. readv is issued, goes to inode refresh due to mismatching event_gen. 3. After refresh is successful, we update event_generation, data and metdata readable. 3. We then call afr_read_txn_refresh_done() which in turn calls afr_inode_get_readable() but doesn't check for EIO. So afr_readv_wind is called with local->readable (which is populated with data_readable), thus winding the read to a brick. 4. Also, further parallel reads that come directly go to the wind path because there is no inode_refresh needed. Fix: 1.For any afr_read_txn(), readable must be an intersection of data and metadata readable. 2.Check for EIO in afr_read_txn_refresh_done(). Change-Id: I22dd221fdfaf96d7aced2f474e28ed1337d69f0e BUG: 1305031 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/13389 Reviewed-by: Ashish Pandey <aspandey@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/dht: Wrong type of function's parameter when calling ↵Zhou Zhengping2016-06-231-7/+5
| | | | | | | | | | | | | | | | dht_selfheal_dir_xattr_cbk The second parameter's type is call_frame_t *, and we change it to be type xlator_t *, it is exactly what we need in this function. Change-Id: I6a154edcaa5a11084d837ca925efbfac853d0786 BUG: 1346551 Signed-off-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Reviewed-on: http://review.gluster.org/14737 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/dht: initialize cbk before attempting inode-linkRaghavendra G2016-06-171-3/+3
| | | | | | | | | | | | | | | | | Otherwise inode-link failures in selfheal codepath will result in a crash. Change-Id: I9061629ae9d1eb1ac945af5f448d0d8b397a5022 BUG: 1345748 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/14707 Reviewed-by: N Balachandran <nbalacha@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Poornima G <pgurusid@redhat.com> Reviewed-by: Susant Palai <spalai@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/dht: Fix unsafe iteration on inode->fd_listXavier Hernandez2016-06-151-16/+70
| | | | | | | | | | | | | | | | | | | | | When DHT traverses the inode->fd_list, it does that in an unsafe way that can generate races with fd_unref() called from other threads. This patch fixes this problem taking the inode->lock and adding a reference to the fd while it's being used outside of the mutex protected region. A minor change in storage/posix has been done to also access the inode->fd_list in a safe way. Change-Id: I10d469ca6a8f76e950a8c9779ae9c8b70f88ef93 BUG: 1344340 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/14682 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/ec: Unlock stale locks when inodelk/entrylk/lk failsPranith Kumar K2016-06-141-6/+6
| | | | | | | | | | | | | | | | Thanks to Rafi for hinting a while back that this kind of problem he saw once. I didn't think the theory was valid. Could have caught it earlier if I had tested his theory. Change-Id: Iac6ffcdba2950aa6f8cf94f8994adeed6e6a9c9b BUG: 1344836 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14703 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: mohammed rafi kc <rkavunga@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* cluster/ec: Fix race in timer cancellationXavier Hernandez2016-06-131-15/+56
| | | | | | | | | | | | | | | | | | | A race in timer cancellation for delayed unlock could cause a crash if the cancelling thread fails to cancel the timer because it has already been fired but not executed, and the callback is scheduled out of the CPU, delaying it until the thread has released important resources needed by the callback. This patch improves the handling of this case to make it robust. Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4 BUG: 1345855 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/14712 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Fix invalid __fd_unref() callXavier Hernandez2016-06-131-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | __fd_unref() doesn't do any cleanup, so it cannot be called to release fd references, specially if it's the last reference. The code has been changed to avoid a call to this function. In the previous version we always tried to keep the newest fd in the ec_lock_t structure. However this is not necessary. We'll always keep one reference to an open file on the same inode. It's irrelevant if the reference is new or old. The function __fd_unref() has also been removed from fd.h to avoid being used in the future since it's useless as it's defined now. Change-Id: Ia728777fc8e464758d5ea4d3bf020f0603919039 BUG: 1344396 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/14683 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ashish Pandey <aspandey@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* afr: Consider ENOSPC and EDQUOT as symmetric errorsRavishankar N2016-06-121-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Since commit 8eaa3506ead4f11b81b146a9e56575c79f3aad7b, in replica 3, if a brick is down and a create fails on the other 2 brick with EDQUOT, we consider it an unsymmetric error and hence do not do post-op. So the dirty xattr remains set on the parent dir, leading to conservative merges during heal when all bricks are up. i.e. a file deleted on the source might re-appear after heal. Fix: Consider ENOSPC and EDQUOT as symmetric errors since there is no possibility of partial inode or entry modification operations possible when quota is enabled. IOW, if quota reports EDQUOT, the no. of bytes written (or not written) will be the same on all bricks of the replica. Likewise, the entry operation (create, mkdir...) will either succeed or not succeed on all bricks. Change-Id: Iacb1108e9ef4a918e36242fb4a957455133744e9 BUG: 1341650 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/14604 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/ec: Pass xdata to dht in case of errorAshish Pandey2016-06-101-4/+6
| | | | | | | | | | | | | | | | | | | | | | | Problem: In case of mkdir failure, dht expects error information so that it can act accordingly. Aftre adding bricks and re balance, layout gets changed. Fop "mkdir" with old layout returns EIO. EC gets this error in xdata but does not pass it back to dht. In this case dht will not be able to take corrective action. Solution: Return xdata back to dht Change-Id: I24def8038e6880607689b7b046dc6428f564c6ab BUG: 1344277 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/14679 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Atin Mukherjee <amukherj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* afr: afr-pending-xattr fallback checkRavishankar N2016-06-091-31/+63
| | | | | | | | | | | | | | | | | | | | Commit 6e635284a4411b816d4d860a28262c9e6dc4bd6a introduced a comma separated list of values to be used as AFR's pending changelogs. If this xlator option is missing in the volfile, fall back to using client xlator names for constructing the pending changelog names. Also, since the aforementioned commit was reverted from 3.7 and 3.8 branches, introduce GD_OP_VERSION_3_9_0 and change the op-version for this feature to GD_OP_VERSION_3_9_0. Change-Id: I3639b9ab475bd8d9929cc7527d9f4584dee1ad1b BUG: 1285152 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/14642 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* core, shard: Make shards inherit main file's O_DIRECT flag if presentKrutika Dhananjay2016-06-071-1/+2
| | | | | | | | | | | | | | | | | | If the application opens a file with O_DIRECT, the shards' anon fds would also need to inherit the flag. Towards this, shard xl would be passing the odirect flag in the @flags parameter to the WRITEV fop. This will be used in anon fd resolution and subsequent opening by posix xl. Change-Id: Iddb75c9ed14ce5a8c5d2128ad09b749f46e3b0c2 BUG: 1342171 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/14191 Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Restrict the launch of replace brick healAshish Pandey2016-06-061-1/+2
| | | | | | | | | | | | | | | | | | | | Problem: When features.cache-invalidation is ON, a lot of ec_notify function gets called which leads to launch of too many heals. This leads to no heal completion, which causes accumulation of heals. Solution: ec_launch_replace_heal should not be launch for every event. Replace brick will trigger a child up event and then only this heal function should be called. Change-Id: I57b44c6a279d57230daea1d93229be6069245b7d BUG: 1342796 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/14649 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/ec: Add/Modify description for eager-lock optionAshish Pandey2016-06-032-6/+17
| | | | | | | | | | | | | | | | | | | This patch provides description for disperse.eager-lock option for disperse volume. It also modifies the description for cluster.eager-lock option to indicate that this option is only for replica volume. Change-Id: Ie73298947fcaaa6aaf825978bc2d27ceaff386d2 BUG: 1327171 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/13999 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* dht : add metalock/unlockSusant Palai2016-06-031-5/+98
| | | | | | | | | | | | Change-Id: I842a7ea1b286f1b893b200fe647597e7fd0f2105 BUG: 1331720 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14252 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/afr: Unwind with xdata in inode-write fopsPranith Kumar K2016-06-014-10/+12
| | | | | | | | | | | | | | | When there is a failure afr was not unwinding xdata to xlators above. xdata need not be NULL on failures. So it is important to send it to parent xlators. Change-Id: Ic36aac10a79fa91121961932dd1920cb1c2c3a4c BUG: 1340623 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14567 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/afr: Unwind xdata_rsp even in case of failuresPranith Kumar K2016-05-307-21/+120
| | | | | | | | | | | | | | | | | | | DHT expects GF_PREOP_CHECK_FAILED to be present in xdata_rsp in case of mkdir failures because of stale layout. But AFR was unwinding null xdata_rsp in case of failures. This was leading to mkdir failures just after remove-brick. Unwind the xdata_rsp in case of failures to make sure the response from brick reaches dht. BUG: 1340623 Change-Id: Idd3f7b95730e8ea987b608e892011ff190e181d1 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14553 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* cluster/afr: Fix warning about unused variablePranith Kumar K2016-05-291-1/+0
| | | | | | | | | | | BUG: 1336612 Change-Id: Ife1ce4b11776a303df04321b4a8fc5de745389d6 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14545 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
* core: assorted spelling mistakes reported by DebianKaleb S KEITHLEY2016-05-261-1/+1
| | | | | | | | | | | | | | | | | See also > Change-Id: I567a4be8f0f31f6285550f243fe802895f6bc43b Reported-by: Patrick Matthäi <pmatthaei@debian.org> BUG: 1336793 Change-Id: Icb9a6ff94d86663a5bca4ba931d810439c02556e Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/14526 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/afr: Attempt name-index purge even on full-heal of directoryKrutika Dhananjay2016-05-261-58/+72
| | | | | | | | | | | Change-Id: Ief71cc68a4fbf8113e15b4254ebcabf7e30f74e2 BUG: 1339181 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/14516 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: Automagic unsplit-brain by [ctime|mtime|size|majority]Ravishankar N2016-05-257-22/+356
| | | | | | | | | | | | | | | | | | | | | Introduce cluster.favorite-child-policy which when enabled with [ctime|mtime|size|majority], automatically heals files that are in split-brian. The majority policy will not pick a source if there is no majority. The other three policies pick the first brick with a valid reply and non-zero ctime/mtime/size as source. Change-Id: I3c099a0404082213860f74f2c9b4d207cfaedb76 BUG: 1328224 Original-author: Richard Wareing <rwareing@fb.com> Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/14026 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* dht: selfheal should wind mkdir call to subvols with ESTALE errorSakshi Bansal2016-05-251-1/+2
| | | | | | | | | | | | Change-Id: I7140e50263b5f28b900829592c664fa1d79f3f99 BUG: 1338634 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/14496 Reviewed-by: N Balachandran <nbalacha@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* dht/rebalance: mark hardlink failures as skipped in rebalanceSusant Palai2016-05-251-0/+8
| | | | | | | | | | | | | | | Since rebalance(not remove-brick) process does not migrate hardlinks mark them as skipped rather than failed as it creates confusion for the users. Change-Id: I5d469d10146274f00bb91482d0373c5235a9b8b2 BUG: 1339071 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14493 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* cluster/ec: Use correct log levelsAshish Pandey2016-05-242-2/+2
| | | | | | | | | | | | | | | | | | | | | | Problem : Misleading messages are getting logged in mount logs and bricks log. "Mismatching xdata" and "Heal failed" are getting logged Solution : Reduce the level of logs from INFO, WARNING and NOTICE to DEBUG level wherever applicable OR use fop_log_level to get proper log level. Change-Id: Ia824c71e75ab683d3cb8949e1966ea09c9ccce72 BUG: 1231224 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/13266 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr: Check for required number of entrylksRavishankar N2016-05-241-5/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Parallel rmdir operations on the same directory results in ENOTCONN messages eventhough there was no network disconnect. In blocking entry lock during rmdir, AFR takes 2 set of locks on all its children-One (parentdir,name of dir to be deleted), the other (full lock on the dir being deleted). We proceed to pre-op stage even if only a single lock (but not all the needed locks) was obtained, only to fail it with ENOTCONN because afr_locked_nodes_get() returns zero nodes in afr_changelog_pre_op(). Fix: After we get replies for all blocking lock requests, if we don't have the minimum number of locks to carry out the FOP, unlock and fail the FOP. The op_errno will be that of the last failed reply we got, i.e. whatever is set in afr_lock_cbk(). Change-Id: Ibef25e65b468ebb5ea6ae1f5121a5f1201072293 BUG: 1336381 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/14358 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* features/shard: Get hard-link-count in {unlink,rename}_cbk before deleting ↵Krutika Dhananjay2016-05-201-8/+13
| | | | | | | | | | | | | shards Change-Id: I0606b74f11f5412c4d9af44a6505635ed9022c15 BUG: 1335858 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/14334 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Do not inode_link in afrPranith Kumar K2016-05-203-5/+46
| | | | | | | | | | | | | | | | | | Race is explained at https://bugzilla.redhat.com/show_bug.cgi?id=1337405#c0 This patch also handles performing of self-heal with shd-pid. Also performs the healing with this->itable's inode rather than main itable. BUG: 1337405 Change-Id: Id657a6623b71998b027b1dff6af5bbdf8cab09c9 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14422 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* cluster/afr: Refresh inode for inode-write fops in needPranith Kumar K2016-05-194-37/+98
| | | | | | | | | | | | | | | | | | | | | | | | | Problem: If a named fresh-lookup is done on an loc and the fop fails on one of the bricks or not sent on one of the bricks, but by the time response comes to afr, if the brick is up, 'can_interpret' will be set to false in afr_lookup_done(), this will lead to inode-ctx for that inode to be not set, this can lead to EIO in case of a transaction as it depends on 'readable' array to be available by that point. Fix: Refresh inode for inode-write fops for the ctx to be set if it is not already done at the time of named fresh-lookup or if the file is in split-brain where we need to perform one more refresh before failing the fop to check if the file is still in split-brain or not. BUG: 1336612 Change-Id: I5c50b62c8de06129b8516039f7c252e5008c47a5 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14368 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: If possible give errno received from lower xlatorsPranith Kumar K2016-05-181-1/+3
| | | | | | | | | | | | | | | | | | In case of 3 way replication with quorum enabled with sharding, if one bricks is brought down and brought back up sometimes fops fail with EROFS because the mknod of shard file fails with two good nodes with EEXIST. So even when quorum is not met, it makes sense to unwind with the errno returned by lower xlators as much as possible. Change-Id: Iabd91cd7c270f5dfe6cbd18c50e59c299a331552 BUG: 1336612 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14369 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
* dht: rename takes lock on parent directory if destination existsSakshi Bansal2016-05-181-7/+32
| | | | | | | | | | | | | | | | For directory rename if destination exists the source directory is created as a child of the given destination directory. Since the new child directory does not exist take lock on parent of the child directory. Change-Id: I24a34605a2cd65984910643ff5462f35e8fc7e71 BUG: 1336698 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/14371 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* core: assorted typos and spelling mistakes reported by Debian lintianKaleb S KEITHLEY2016-05-181-1/+1
| | | | | | | | | | | | | | | Also missing bang (!) in #!/bin/bash in shell scripts. Change-Id: I567a4be8f0f31f6285550f243fe802895f6bc43b BUG: 1336793 Reported-by: Patrick Matthäi <pmatthaei@debian.org> Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/14398 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>