summaryrefslogtreecommitdiffstats
path: root/xlators/cluster
Commit message (Collapse)AuthorAgeFilesLines
* cluster/ec: add separate versions for data/entry, metadataAshish Pandey2015-05-068-29/+104
| | | | | | | | | | | | | | | | | | | Adding 64 bits in "version" key of extended attributes. First 64 bits (Left) represents Data version. Last 64 bits (right) represents Meta Data version. Note: 3.7 and 3.6 version ec can't co-exist with this change because xattrop in 3.6 will fail with ERANGE as the buffer passed to it will be '8' bytes where as the value will be 16 bytes in 3.7. Where as 3.7 version clients can work with old version files. For upgrades we need to tell users to complete heals and then upgrade BUG: 1215265 Change-Id: Ib85114680cb7e75b8371c984d9f7b6401c1ffb93 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/10312 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* dht/rebalance: Go to exit if defrag is NULLSusant Palai2015-05-051-5/+5
| | | | | | | | | | Problem: In case a defrag is null, going to out section will crash rebalance. Change-Id: I8b3ee1ad85dc23ef0e2f2dd6f912d07216bd619f Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/10582 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* guster/dht: tiered volumes may not allow access to files undergoing migrationDan Lambright2015-05-051-0/+22
| | | | | | | | | | | | | | | | | | | | If a read IO occurs against a file that has reached rebalance phase 2, we redirect the IO to the destination. For tiered volumes, when we try to reopen the file (on the destination), the lower level DHT receives the open call and fails; it does not have a "cached subvol". Fix is to "teach" the lower level DHT of the new location by sending a locate before the open. Change-Id: Ia4acb0035ff1da15f6a8f9ed54f43c76e8b98f5f BUG: 1214048 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: root <root@gprfs018.sbu.lab.eng.bos.redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10324 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* geo-rep: rename handling in dht volumeNithya Balachandran2015-05-051-0/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Background: Glusterfs changelogs are stored in each brick, which records the changes happened in that brick. Georep will run in all the nodes of master and processes changelogs "independently". Processing changelogs is in brick level, but all the fops will be replayed on "slave mount" point. Problem: With a DHT volume, in changelog "internal fops" are NOT recorded. For Rename case, Rename is recorded in "hashed" brick changelog. (DHT's internal fops like creating linkto file, unlink is NOT recorded). This lead us to inconsistent rename operations. For example, Distribute volume created with Two bricks B1, B2. //Consider master volume mounted @ /mnt/master and following operations executed: cd /mnt/master touch f1 // f1 falls on B1 Hash mv f1 f2 // f2 falls on B2 Hash // Here, Changelogs are recorded as below: @B1 CREATE f1 @B2 RENAME f1 f2 Here, race exists between Brick B1 and B2, say B2 will get executed first. Source file f1 itself is "NOT PRESENT", so it will go ahead and create f2 (Current implementation). We have this problem When rename falls in another brick and file is unlinked in Master. Similar kind of issue exists in following case too(multiple rename): CREATE f1 RENAME f1 f2 RENAME f2 f1 Solution: Instead of carrying out "changelogging" at "HASHED volume", carry out at the "CACHED volume". This way we have rename operations carried out where actual files are present. So,Changelog recorded as : @B1 CREATE f1 RENAME f1 f2 credit: sarumuga@redhat.com PS: Some of the races as the one below are _NOT_ fixed by this patch * f1 and f2 exist. B1 and B2 are their respective cached subvols. For both files hashed-subvol == cached-subvol * mv f1 f2 on master. * B1 has change-log entry of rename f1 f2 * rebalance migrates f2 from B1 and B2 * mv f2 f1 on master. * B2 has change-log entry of rename f2 f1 Since changelog entries (rename f1 f2) and (rename f2 f1) are processed independently by gsyncds, which of either f1 and f2 survives on slave is subject to race. Note that on master its file f1 with name f1 which survived. On slave it can be either file f1 with name f1 or file f2 with name f2 based on who wins the race of processing changelog. Change-Id: Iebc222f582613924c3a7cba37fb6d3e2d8332eda BUG: 1141379 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/10410 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* dht/tier/rebalancer: Fix reset of tiering client pidJoseph Fernandes2015-05-054-3/+5
| | | | | | | | | | | | | | | | In the patch http://review.gluster.org/#/c/9657 the client pid set by tiering migration was getting over- written in dht_start_rebalance_task(). Just corrected it in dht_setxattr() before calling dht_start_rebalance_task() and removed it from dht_start_rebalance_task(). Change-Id: I37cfa111f83a4e5d498042575c93799f60b49870 BUG: 1217937 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10502 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: don't use hot tier until subvolumes readyDan Lambright2015-05-052-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | When we attach a tier, the hot tier becomes the hashed subvolume. But directories may not yet have been replicated by the fix layout process. Hence lookups to those directories will fail on the hot subvolume. We should only go to the hashed subvolume once the layout has been fixed. This is known if the layout for the parent directory does not have an error. If there is an error, the cold tier is considered the hashed subvolume. The exception to this rules is ENOCON, in which case we do not know where the file is and must abort. Note we may revalidate a lookup for a directory even if the inode has not yet been populated by FUSE. This case can happen in tiering (where one tier has completed a lookup but the other has not, in which case we revalidate one tier when we call lookup the second time). Such inodes are still invalid and should not be consulted for validation. Change-Id: Ia2bc62e1d807bd70590bd2a8300496264d73c523 BUG: 1214289 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10435 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* afr: add arbitration supportRavishankar N2015-05-058-32/+207
| | | | | | | | | | | | | | | | | | | | | | | | | Add logic in afr to work in conjunction with the arbiter xlator when a replica 3 arbiter volume is created. More specifically, this patch: * Enables full locks for afr data transaction for such volumes. * Removes the upfront marking of pending xattrs at the time of pre-op and defer it to post-op. (This is an arbiter independent change and is made for all afr transactions.) * After pre-op stage, check if we can proceed with the fop stage without ending up in split-brain by examining the changelog xattrs. * Unwinds the fop with failure if only one source was available at the time of pre-op and the fop happened to fail on particular source brick. * Skips data self-heal if arbiter brick is the only source available. * Adds the arbiter-count option to the shd graph. This patch is a part of the arbiter logic implementation for 3 way AFR details of which can be found at http://review.gluster.org/#/c/9656/ Change-Id: I9603db9d04de5626eb2f4d8d959ef5b46113561d BUG: 1199985 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/10258 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Fix dictionary compare functionPranith Kumar K2015-05-044-118/+57
| | | | | | | | | | | | | | | | | | | | If both dicts are NULL then equal. If one of the dicts is NULL but the other has only ignorable keys then also they are equal. If both dicts are non-null then check if for each non-ignorable key, values are same or not. value_ignore function is used to skip comparing values for the keys which must be present in both the dictionaries but the value could be different. geo-rep's stime xattr doesn't need to be present in list xattr but when getxattr comes on stime xattr even if there aren't enough responses with the xattr we should still give out an answer which is maximum of the stimes available. Change-Id: I8de2ceaa2db785b797f302f585d88e73b154167d BUG: 1207712 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10078 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* cluster/ec: Use fd instead of loc for get_size_versionAshish Pandey2015-05-043-57/+63
| | | | | | | | | Change-Id: Ia7d43cb3b222db34ecb0e35424f1766715ed8e6a BUG: 1188242 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/10176 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* dht: Add a null check before freeing dir_dfmeta and tmp_containerSusant Palai2015-05-041-10/+19
| | | | | | | | | | | Change-Id: Ifbba6f340adfe2b4e3ad07260fbf4a25698ad8df BUG: 1217949 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/10459 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* tier: relax libgfdb required version numberEmmanuel Dreyfus2015-05-031-1/+1
| | | | | | | | | | | | | | | | | | | When calling dlopen() for libgfdb, do not specify the library version number "libgfdb.so.0.0.1", since libtool will not always create libraries or link with that name with the full 3-digit version. For instance on NetBSD only up to the 2-digit version is available and "libgfdb.so.0.0.1" does not exist. Instead, just specify "libgfdb.so" and rely on smymlinks installed by libtool to find the relevant library. BUG: 1129939 Change-Id: I074b1009d3622a122fdaeb4b99658bca3277e211 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10407 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* dht/rebalancer: Marking tiering migration fopsJoseph Fernandes2015-05-013-6/+34
| | | | | | | | | | | | | | | | | | | | | | | | | This is a follow up patch for http://review.gluster.org/#/c/10080 In the above, the suggested change in http://review.gluster.org/#/c/10080/7/xlators/cluster/dht/src/dht-rebalance.c doesnot work. The reason it doesnt work is promotion and demotion are done in a multithread way. Whenever a promotion or demotion thread is called, the frame of the old sync_op thread is not carried with it. As a result the frame->root->pid is not set. Solution: When the file is getting migrated, we get a tiering.migration key_value in the xattr dict, so that we pass this dic key-value when we do syncop_setxattr() to do data migration and set the frame->root->pid GF_CLIENT_PID_TIER_DEFRAG in dht_setxattr() just before calling dht_start_rebalance_task(). Change-Id: I86fef2d961b32fdd2c0c69d8512cbe846b393404 BUG: 1194753 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10266 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* dht: tackle thread race in dht_getxattr_cbkSusant Palai2015-04-291-17/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | problem: 1. When two threads execute in parallel in dht_getxattr_cbk it may so happen that, both may find local->xattr to be NULL. As a result dht_aggregate_xattr may not get executed. 2. In dht_getxattr_cbk, thread1 thread2 T1 this_call_cnt = 2 -1 T2 this_call_cnt = 1 - 1 T3 fills local_xattr T4 DHT_STACK_UNWIND -> local_wipe T5 tries to dereference local which is already freed, leading to crash. Solution: for problem1: Execute critical section inside frame lock to resolve race. for problem2: Calculate this_call_count just before out section. Change-Id: I9827ac8fafebb0c733a4e4f3c710b752f1cd45fa BUG: 1215592 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/10389 Reviewed-by: Anuradha Talur <atalur@redhat.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* rebalance: Introducing local crawl and parallel migrationSusant Palai2015-04-296-248/+1121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current patch address two part of the design proposed. 1. Rebalance multiple files in parallel 2. Crawl only bricks that belong to the current node Brief design explanation for the above two points. 1. Rebalance multiple files in parallel: ------------------------------------- The existing rebalance engine is single threaded. Hence, introduced multiple threads which will be running parallel to the crawler. The current rebalance migration is converted to a "Producer-Consumer" frame work. Where Producer is : Crawler Consumer is : Migrating Threads Crawler: Crawler is the main thread. The job of the crawler is now limited to fix-layout of each directory and add the files which are eligible for the migration to a global queue in a round robin manner so that we will use all the disk resources efficiently. Hence, the crawler will not be "blocked" by migration process. Producer: Producer will monitor the global queue. If any file is added to this queue, it will dqueue that entry and migrate the file. Currently 20 migration threads are spawned at the beginning of the rebalance process. Hence, multiple file migration happens in parallel. 2. Crawl only bricks that belong to the current node: -------------------------------------------------- As rebalance process is spawned per node, it migrates only the files that belongs to it's own node for the sake of load balancing. But it also reads entries from the whole cluster, which is not necessary as readdir hits other nodes. New Design: As part of the new design the rebalancer decides the subvols that are local to the rebalancer node by checking the node-uuid of root directory prior to the crawler starts. Hence, readdir won't hit the whole cluster as it has already the context of local subvols and also node-uuid request for each file can be avoided. This makes the rebalance process "more scalable". Change-Id: I73ed6ff807adea15086eabbb8d9883e88571ebc1 BUG: 1171954 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/9657 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/ec: Handle unhandled statesPranith Kumar K2015-04-281-0/+3
| | | | | | | | | | | | This was leading to hangs when get_size_and_version fails Change-Id: Iad9408c2dacc9a74594b8d2f94c95f402533b0f1 BUG: 1215265 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10390 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* arbiter: load arbiter xlator on every 3rd brick of a replica 3 AFR subvolRavishankar N2015-04-272-0/+8
| | | | | | | | | | | | | | | | | | | Logic for adding the 'glusterd_brickinfo->group' member and using it to find the brick positon has been taken from http://review.gluster.org/#/c/9919. Thanks to Jeff Darcy for that. This patch is a part of the arbiter logic implementation for 3 way AFR details of which can be found at http://review.gluster.org/#/c/9656/ Change-Id: Idbfe4f29ee8e098e0102def8f38b32314316b188 BUG: 1199985 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/10257 Tested-by: NetBSD Build System Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* tier: fix off-by-one overrun in UUID stringEmmanuel Dreyfus2015-04-271-1/+1
| | | | | | | | | | | | | | | | | UUID strings are UUID_CANONICAL_FORM_LEN (36) bytes long plus the trailing nul character that various function (e.g.: uuid_unparse) will add. As a consequence, UUID strings must be declared as UUID_CANONICAL_FORM_LEN+1 long, otherwise we get a off-by-one overrun that corrupts the next variable on stack. BUG: 1129939 Change-Id: I5837ad6ca06fa17cc7ab143eedd02d8099ecca2a Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10394 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/ec: Perform inode-write on healing subvolsPranith Kumar K2015-04-243-129/+88
| | | | | | | | | | | | | | | | | | If the version numbers do not match, then writes are performed only on at least N-R bricks which have same version. But if we want to do healing of files which are constantly modified we need to allow writes on subvols that are undergoing heal. Data healing will mark 62nd bit while the heal is going on. When the data transaction sees that this bit is set it needs to perform the fop on that subvol irrespective of whether the versions match or do not match. Fop is considered successful only if N-R non-healing bricks succeed. Change-Id: I69a17582df397aaf6e8ca4b5e746c7ca802cbbde BUG: 1215265 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10372 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* xlators/cluster/dht: Fix Resource leak (CID 1291751)Günther Deschner2015-04-241-1/+3
| | | | | | | | | | | | | | | Coverity CID 1291751. Guenther Change-Id: Ibe9dc3662811dc5889f85fa063ab9211fcaf7f12 BUG: 789278 Signed-off-by: Guenther Deschner <gd@samba.org> Reviewed-on: http://review.gluster.org/10301 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com>
* dht: Fixing dereference after null checkarao2015-04-231-1/+1
| | | | | | | | | | | | | | | CID: 1175023 The check for null is made before dereferencing the pointer variable alongside. Change-Id: I827927f2fcf6d1f365f4d6b5a678373020b9daf9 BUG: 789278 Signed-off-by: arao <arao@redhat.com> Reviewed-on: http://review.gluster.org/9663 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/afr,dht: Fix memleak after syncop_readlinkPranith Kumar K2015-04-232-0/+2
| | | | | | | | | | | | Change-Id: Ia71c14c2c2709c541075748c9011437e0d8cac4b BUG: 1213542 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10305 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: support for tier volumes 'detach start' and 'detach commit'Dan Lambright2015-04-224-8/+62
| | | | | | | | | | | | | | | | | | | | | These commands work in a manner analagous to rebalancing when removing a brick. The existing migration daemon detects "detach start" and switches to moving data off the hot tier. While in this state all lookups are directed to the cold tier. gluster v detach-tier <vol> start gluster v detach-tier <vol> commit The status and stop cli commands shall be submitted separately. Change-Id: I24fda5cc3ba74f5fb8aa9a3234ad51f18b80a8a0 BUG: 1205540 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: root <root@localhost.localdomain> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10108 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: NetBSD Build System
* quota/disperse: handle inode quotas in xattr aggregatevmallika2015-04-141-13/+22
| | | | | | | | | | | | | | | | | | | | with the inode quota feature, quota size is now increased from 64bit to 192bits which contains values of 'file size', 'file count' and 'dir count' This change in quota size xattr needs to be handled in disperse xattr aggregation Signed-off-by: vmallika <vmallika@redhat.com> Change-Id: I5fd28aa9f5b8b6cba83a98360236417a97ac16ee BUG: 1207967 Reviewed-on: http://review.gluster.org/10112 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Sachin Pandit <spandit@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/ec: Entry self-heal fixesPranith Kumar K2015-04-131-79/+27
| | | | | | | | | | | | | | | | - Directory deletion should always happen with 'rm -rf' flag, otherwise the call may fail with ENOTEMPTY. - Instead of doing an explicit 'link' call, perform mknod call with GLUSTERFS_INTERNAL_FOP_KEY which acts as 'link' if the gfid already exists. Change-Id: I8826f92170421db37efb67dfc00afad4ab695907 BUG: 1207085 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10045 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* cluster/ec: Fix readdir de-itransformPranith Kumar K2015-04-113-8/+73
| | | | | | | | | | | | | | | | | | | Problem: gf_deitransform returns the glbal client-id in the complete graph. So except for the first disperse subvolume under dht, all the other disperse subvolumes will return a client-id greater than ec->nodes, so readdir will always error out in those subvolumes. Fix: Get the client subvolume whose client-id matches the client-id returned by gf_deitransform of offset. Change-Id: I26aa17504352d48d7ff14b390b62f49d7ab2d699 BUG: 1209113 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10165 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* cluster/ec: Ignore volume-mark key for comparing dictsPranith Kumar K2015-04-101-0/+1
| | | | | | | | | Change-Id: Id60107e9fb96588d24fa2f3be85c764b7f08e3d1 BUG: 1207712 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10077 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* cluster/ec: Fix iobuf mem leakRaghavendra Talur2015-04-101-2/+2
| | | | | | | | | | | | | | | | iobuf_get and iobref_add implicitly ref the iobuf. Hence, it is necessary to unref iobuf before setting it to NULL. Change-Id: Icadd8925574cf04fe708d8090868e49356653a8e Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/9818 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Cluster/DHT Mismatching gfid values in dht_local_tNithya Balachandran2015-04-081-11/+6
| | | | | | | | | | | | | | | | | | If multiple files with the same name but different gfids exist on different subvolumes, dht_lookup_everywhere_cbk() copies the gfid from the last received response into local->gfid but does not update the local->stbuf structure. dht_linkfile_create() uses the value in local->gfid, but dht_linkfile_attr_heal() uses the one in local->stbuf, causing a mismatch and failure while trying to heal the linkfile attrs. Change-Id: I80d152be95b42d736c5d9182b955f42e374b82a5 BUG: 1205785 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/9998 Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Shyamsundar Ranganathan <srangana@redhat.com>
* dht : null dereference coverity fixManikandan Selvaganesh2015-04-081-1/+1
| | | | | | | | | | | Change-Id: I700e7ebdfe4929a6d74406ea081059bdddcf7a79 BUG: 789278 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/9628 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* libglusterfs/syncop: Add xdata to all syncop callsRaghavendra Talur2015-04-0811-113/+135
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for xdata in both the request and response path of syncops. Few calls like lookup already had the support; have renamed variables in few places to maintain uniformity. xdata passed downwards is known as xdata_in and xdata passed upwards is known as xdata_out. There is an old patch by Jeff Darcy at http://review.gluster.org/#/c/8769/3 which does the same for some selected calls. It also brings in xdata support at gfapi level. xdata support at gfapi level would be introduced in subsequent patches. Change-Id: I340e94ebaf2a38e160e65bc30732e8fe1c532dcc BUG: 1158621 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/9859 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* ctr : Fix for heating of files during promotion/demotionJoseph Fernandes2015-04-081-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fix will solve the heating of the files during the promotion or demotion. Promotion: ~~~~~~~~~ When a file gets promoted it get the current time stamp during creation only, but following writes or reads during the migration wont heat the file. Demotion: ~~~~~~~~ When a file gets demoted it get the wind/unwind time stamp is set to zero. The following writes or reads during the migration wont heat the file. What is remaining ? ~~~~~~~~~~~~~~~~~ Bug 1209129 ( https://bugzilla.redhat.com/show_bug.cgi?id=1209129 ) Inspite of this fix there is still a issue remaining, i.e the heat of the file is not keep intact during a internal rebalance activity i.e a rebalance within a tier. Change-Id: I01e82dc226355599732d40e699062cee7960b0a5 BUG: 1207867 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10080 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/ec: Have same ec_manager_* for [f]set/[f]removexattrPranith Kumar K2015-04-081-206/+80
| | | | | | | | | | | | | ec_manager_xxx() function for [f]set/[f]remove xattr is exactly same except the reporting part. So moved that to common function and use same ec_manager_xattr() function for all these fops. Change-Id: Iaa57023b800f8d1f3f6a827f4ceba9b0a0337336 BUG: 1199767 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10036 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* glusterd: Support distributed replicated volumes on hot tierDan Lambright2015-04-082-4/+36
| | | | | | | | | | | | | | We did not set up the graph properly for hot tiers with replicated subvolumes. Also add check that the file has not already been moved by another replicated brick on the same node. Change-Id: I9adef565ab60f6774810962d912168b77a6032fa BUG: 1206517 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10054 Reviewed-by: Joseph Fernandes <josferna@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* afr : null dereference coverity fix.Manikandan Selvaganesh2015-04-081-14/+25
| | | | | | | | | | | | CID : 1194648 Change-Id: Ib26e7cdbf412d563240885fb3113bcc1fe5c9c49 BUG: 789278 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/9571 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* Cluster/afr : Coverity fix.Manikandan Selvaganesh2015-04-081-4/+0
| | | | | | | | | | | | | | | CID:1194644 Childup[] value will not be equal to -1 when afr_xl_op() function gets called Change-Id: Iaf7a9d41a54f6b2d52d9ba5dadb638f328afe14b BUG: 789278 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/9540 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* dht : coverity fixesManikandan Selvaganesh2015-04-073-10/+12
| | | | | | | | | | | | | | | | CID : 1124352,1124365 (unchecked return value), 1124377 ( logically dead code), 1124511 (null dereference) Change-Id: I61e029a078559cfe15d36bf0aa53418f6214e5cb BUG: 789278 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/9622 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/ec: Refactor inode-writevPranith Kumar K2015-04-064-478/+145
| | | | | | | | | | | | | | All _cbk() functions in inode-write.c do same things, i.e. store op_ret/op_errno, stat structures if they are available and combine them. Moved this common operation into one function ec_inode_write_cbk() and made all the other _cbk() functions to use this instead. Change-Id: I2387b9f2d9598ced6299a26ea1900e9deb9fadc4 BUG: 1199767 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9981 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* cluster/dht: Unwind with proper op_retRaghavendra Talur2015-04-061-12/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Expected behavior of get_real_filename feature. A getxattr on a existing dir with glusterfs.get_real_filename:<filename> as key should result in one of the following things. a. A value returned for that key having the real filename (a file whose match is a case insensitive match to the filename passed in key). b. op_ret = -1 and errno set to ENOENT meaning that no such file exists under the specified dir in any case. c. op_ret = -1 and errno set to ENODATA. This is a case assuming no xlator interprets the glusterfs.get_real_filename key and it get passed down to the posix xlator. Naturally, posix xlator would not find any xattr with this key and would return ENODATA. This will be interpreted specially by the caller as the feature not being supported by underlying glusterfs. 2. What assumptions are wrong? Initially the key used to be user.glusterfs.get_real_filename. In that case, when posix xlator did a getxattr call it would have received ENODATA as error. However, the key has now changed to glusterfs.get_real_filename. This leads to a EOPNOTSUPP error instead. Considering the above information, this is a rewrite of get_real_filename logic in dht. Change-Id: I012e9150047fc8563be91b0d112a368ac1cbf598 BUG: 1204140 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/9956 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* cluster/dht: fix spurious smoke test failureGaurav Kumar Garg2015-04-061-2/+2
| | | | | | | | | | | | | | | There is smoke test failure due to implici declaration of function "uuid_parse" and "uuid_compare". Fix is to change these function caller name to "gf_uuid_parse" and "gf_uuid_compare." Change-Id: I79efa00c44d112c2ca732a9d9711c07bd5f1a069 BUG: 1207532 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/10139 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com>
* cluster/dht: fix tier.c problems found prior to feature freezeDan Lambright2015-04-063-58/+205
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch resolves tiering translator issues taken from the list in bug 1203776. These issues have been selected to be fixed first. The rest will be fixed in a subsequent patch (or are not a problem). 3. Replace hardcoded #defines of promote/demote file names 6. Use loc_wipe() in migrate_using_query_file() 9. Only promote/demote files on the same node on which they reside. 14. Replace calloc with GF_CALLOC in tier.c and ensure freeing done properly. 15. Handle if parse_query_str fails 22. Only load gfdb library on server side, remove SQL references from client. Change-Id: I6563b11e58ab2e4c6b1ce44db755781ad6d930fb BUG: 1203776 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/9987 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* Avoid conflict between contrib/uuid and system uuidEmmanuel Dreyfus2015-04-0428-198/+198
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | glusterfs relies on Linux uuid implementation, which API is incompatible with most other systems's uuid. As a result, libglusterfs has to embed contrib/uuid, which is the Linux implementation, on non Linux systems. This implementation is incompatible with systtem's built in, but the symbols have the same names. Usually this is not a problem because when we link with -lglusterfs, libc's symbols are trumped. However there is a problem when a program not linked with -lglusterfs will dlopen() glusterfs component. In such a case, libc's uuid implementation is already loaded in the calling program, and it will be used instead of libglusterfs's implementation, causing crashes. A possible workaround is to use pre-load libglusterfs in the calling program (using LD_PRELOAD on NetBSD for instance), but such a mechanism is not portable, nor is it flexible. A much better approach is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any possible conflict. This is what this change attempts. BUG: 1206587 Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10017 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/dht: Fix coverity bug in tiering codeDan Lambright2015-04-041-2/+17
| | | | | | | | | | | | | | | | | The bug was: *** CID 1291734: Error handling issues (CHECKED_RETURN) /xlators/cluster/dht/src/tier.c: 451 in tier_build_migration_qfile() The fix is to check the return code to the remove library call. It is legal to fail, we just log an INFO level message. Change-Id: I026eb49276b394efa3b8092ee2cc209c470aacb2 BUG: 1194753 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10000 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* Tests: fix spurious failure in read-subvol-entry.tEmmanuel Dreyfus2015-04-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | read-subvol-entry.t tests that if a brick has pending operations, it is not used for readdir operations. On NetBSD this test exhibits spurious failures, with the wrong brick being used to perform readdir. It happens because when afr_replies_interpret() looks at xattr for pending attributes, it uses alternative bahvior whether it is working on a directory or another object. The decision is based on inode->ia_type, which may be IA_INVAL at that time if we come there from: afr_replies_interpret.() afr_xattrs_are_equal() afr_lookup_metadata_heal_chec() afr_lookup_entry_heal() afr_lookup_cbk() Using replies[i].poststat.ia_type, which is correctly set, works around the problem. BUG: 1129939 Change-Id: Id9ccdd8604f79a69db5f1902697f8913acac50ad Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/9831 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* DHT: move regular expression log message to DEBUGvmallika2015-04-021-1/+1
| | | | | | | | | | | Change-Id: I8652208aa2c3a600816c911a9e8af557c67d37c4 BUG: 1197585 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/9777 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* Xlators : Fixed typosManikandan Selvaganesh2015-04-027-8/+8
| | | | | | | | | | | Change-Id: I948f85cb369206ee8ce8b8cd5e48cae9adb971c9 BUG: 1075417 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/9529 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
* cluster/ec: Implement heal info for ecPranith Kumar K2015-03-302-1/+28
| | | | | | | | | | | | This also lists the files that are on-going I/O, which will be fixed later. Change-Id: Ib3f60a8b7e8798d068658cf38eaef2a904f9e327 BUG: 1203581 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10020 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* Gfdb Query Fix and Volume option fixJoseph Fernandes2015-03-301-2/+2
| | | | | | | | | | | | | | 1) Query fix in find_changed_with_freq() 2) Volume option typo fix for write_freq_threshold and read_freq_threshold Change-Id: I38e154818178aab412b2d7b2914cd29acef66ffb BUG: 1207343 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10050 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* cluster/ec: Use fd when appropriate for updating size/versionPranith Kumar K2015-03-271-3/+9
| | | | | | | | | | Change-Id: I5d3aca101c8cdda406d31d06c40404fa6a2b7170 BUG: 1192378 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9995 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* libxlator: Change marker xattr handling interfacePranith Kumar K2015-03-258-188/+88
| | | | | | | | | | | | | | | | | | | | | - Changed the implementation of marker xattr handling to take just a function which populates important data that is different from default 'gauge' values and subvolumes where the call needs to be wound. - Removed duplicate code I found while reading the code and moved it to cluster_marker_unwind. Removed unused structure members. - Changed dht/afr/stripe implementations to follow the new implementation - Implemented marker xattr handling for ec. Change-Id: Ib0c3626fe31eb7c8aae841eabb694945bf23abd4 BUG: 1200372 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9892 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Fixing build of xlator/cluster/dht/src/dht-rebalance.c when tiering is disabledJoseph Fernandes2015-03-252-4/+8
| | | | | | | | | | | | | 1) Removed unnecessary include tier.h in dht-rebalance.c 2) tier xlator will only compile when tiering is enabled in configure.ac Change-Id: Ia21aa9ff403506dc898a83236e9e2d382a0594da BUG: 1204604 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/9973 Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Sachin Pandit <spandit@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>