summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* features/changelog: Version support for Changelog ParserAravinda VK2015-05-062-11/+70
| | | | | | | | | | | | | | | | | | As optional feature, during unlink, full path will be recorded. Changelog Version number to be bumped up to 1.2. With this patch, parser checks the version number before parsing and handles accordingly. Change-Id: Ic1ad98259c39e417029a08e26a1d4b467817e65a BUG: 1214561 Signed-off-by: Aravinda VK <avishwan@redhat.com> Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/10166 Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com> Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* cluster/ec: add separate versions for data/entry, metadataAshish Pandey2015-05-068-29/+104
| | | | | | | | | | | | | | | | | | | Adding 64 bits in "version" key of extended attributes. First 64 bits (Left) represents Data version. Last 64 bits (right) represents Meta Data version. Note: 3.7 and 3.6 version ec can't co-exist with this change because xattrop in 3.6 will fail with ERANGE as the buffer passed to it will be '8' bytes where as the value will be 16 bytes in 3.7. Where as 3.7 version clients can work with old version files. For upgrades we need to tell users to complete heals and then upgrade BUG: 1215265 Change-Id: Ib85114680cb7e75b8371c984d9f7b6401c1ffb93 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/10312 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* quota/marker: turn off inode quotas by defaultvmallika2015-05-0613-42/+289
| | | | | | | | | | | | | | | | | | | | | inode quota is a new feature implemented in glusterfs-3.7 if quota is enabled in the older version and is upgraded to a new version, we can hit setxattr spike during self-heal of inode quotas. So, when a quota is enabled, turn off inode-quotas with a xlator option. With this patch, we still account for inode quotas but only when a write operation is performed for a particular file. User will be able to query inode quotas once the Inode-quota xlator option is enabled. Change-Id: I52fb28bf7024989ce7bb08ac63a303bf3ec1ec9a BUG: 1209430 Signed-off-by: vmallika <vmallika@redhat.com> Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/10152 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* NFS-Ganesha : Do not empty global config file when cluster is tornMeghana Madhusudhan2015-05-061-1/+1
| | | | | | | | | | | | | | | The global config file will need new blocks to support client lock recovery. Current cleanup function empties the entire file. Deleting only "include" lines in the config file. Change-Id: I21f09e30a738d2ba01861ce480ecf906667d887b BUG: 1218854 Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com> Reviewed-on: http://review.gluster.org/10592 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* NFS-Ganesha : Locking global options fileMeghana Madhusudhan2015-05-066-19/+111
| | | | | | | | | | | | | | | | | | | | | Global option gluster features.ganesha enable writes into the global 'option' file. The snapshot feature also writes into the same file. To handle concurrent multiple transactions correctly, a new lock has to be introduced on this file. Every operation using this file needs to contest for the new lock type. Change-Id: Ia8a324d2a466717b39f2700599edd9f345b939a9 BUG: 1200254 Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com> Reviewed-on: http://review.gluster.org/10130 Reviewed-by: Avra Sengupta <asengupt@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Tested-by: NetBSD Build System Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* doc/admin-guide: BitRot command line usageVenky Shankar2015-05-061-0/+58
| | | | | | | | | Change-Id: Ic3c9da566673f4d9b1d0974c5aa15121da087fba BUG: 1170075 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/10607 Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Tested-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
* ctr/xlator: Named lookup heal of pre-existing files, before ctr was ON.Joseph Fernandes2015-05-068-6/+1021
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: The CTR xlator records file meta (heat/hardlinks) into the data. This works fine for files which are created after ctr xlator is switched ON. But for files which were created before CTR xlator is ON, CTR xlator is not able to record either of the meta i.e heat or hardlinks. Thus making those files immune to promotions/demotions. Solution: The solution that is implemented in this patch is do ctr-db heal of all those pre-existent files, using named lookup. For this purpose we use the inode-xlator context variable option in gluster. The inode-xlator context variable for ctr xlator will have the following, a. A Lock for the context variable b. A hardlink list: This list represents the successful looked up hardlinks. These are the scenarios when the hardlink list is updated: 1) Named-Lookup: Whenever a named lookup happens on a file, in the wind path we copy all required hardlink and inode information to ctr_db_record structure, which resides in the frame->local variable. We dont update the database in wind. During the unwind, we read the information from the ctr_db_record and , Check if the inode context variable is created, if not we create it. Check if the hard link is there in the hardlink list. If its not there we add it to the list and send a update to the database using libgfdb. Please note: The database transaction can fail(and we ignore) as there already might be a record in the db. This update to the db is to heal if its not there. If its there in the list we ignore it. 2) Inode Forget: Whenever an inode forget hits we clear the hardlink list in the inode context variable and delete the inode context variable. Please note: An inode forget may happen for two reason, a. when the inode is delete. b. the in-memory inode is evicted from the inode table due to cache limits. 3) create: whenever a create happens we create the inode context variable and add the hardlink. The database updation is done as usual by ctr. 4) link: whenever a hardlink is created for the inode, we create the inode context variable, if not present, and add the hardlink to the list. 5) unlink: whenever a unlink happens we delete the hardlink from the list. 6) mknod: same as create. 7) rename: whenever a rename happens we update the hardlink in list. if the hardlink was not present for updation, we add the hardlink to the list. What is pending: 1) This solution will only work for named lookups. 2) We dont track afr-self-heal/dht-rebalancer traffic for healing. Change-Id: Ia4bbaf84128ad6ce8c3ddd70bcfa82894c79585f BUG: 1212037 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10370 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* nfs : fix for coredump caused by export/netgroup feature in the regressionJiffin Tony Thottan2015-05-061-0/+1
| | | | | | | | | Change-Id: Idaae234b9e81c40040393e748db1f61363a48ed0 BUG: 1211913 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/10250 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* features/bitrot: Follow xattr naming conventionsVenky Shankar2015-05-065-7/+9
| | | | | | | | | | | | | | | | | Instead of "trusted.glusterfs.bit-rot.*" use "trusted.bit-rot.*" NOTE: With this patch, data on existing volumes would be resigned (which should be OK as of now since we do not expect many users as of now :-)) Change-Id: I926c7bca266a9c8f2cb35d57c4d0359aa5cecfa0 BUG: 1170075 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/10181 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* dht/rebalance: Go to exit if defrag is NULLSusant Palai2015-05-051-5/+5
| | | | | | | | | | Problem: In case a defrag is null, going to out section will crash rebalance. Change-Id: I8b3ee1ad85dc23ef0e2f2dd6f912d07216bd619f Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/10582 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* features/shard: Implement readv() fopKrutika Dhananjay2015-05-052-269/+635
| | | | | | | | | Change-Id: I4cc060710482de8633141170dd35f669f01f639b BUG: 1207615 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10528 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* libgfapi : anonymous fd supportJiffin Tony Thottan2015-05-059-1/+354
| | | | | | | | | | | | | | | | Anonymous fd's are floating fd assigned to a glusterfs client without a explicit file open. Here either it will create a new anonymous fd or existing anonymous fd in the client stack for requested file.The anonymous fd's are mainly used for IO's. This patch introduces two api's glfs_h_anonymous_read and glfs_h_anonymous_write which performs read and write respectively Change-Id: Id646f2220e8387b2f8bb244c848dc1db6761444f BUG: 1204651 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9971 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Cache-invalidation : set to on/off depending on ganesha.enable valueMeghana Madhusudhan2015-05-052-1/+15
| | | | | | | | | | | | | | | | Multi-Head NFS-Ganesha servers need upcall (cache-invalidation) support to notify them in case of any changes to the files in the backend. Hence, upcall xlator option "features.cache-invalidation" needs to be enabled when ganesha.enable is set to 'on'. Similarly, this feature needs to be disabled when ganesha.enable is set to 'off' Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com> Change-Id: Ifdd1d50e48a2bd2a388f73c0b9e318c6092ac190 BUG: 1213752 Reviewed-on: http://review.gluster.org/10581 Reviewed-by: soumya k <skoduri@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* api/libgfapi: Add support for statfs handleMohammed Rafi KC2015-05-054-0/+57
| | | | | | | | | | | | | snapd feature require statfs fops and thereby handle of statfs Change-Id: I037ee846aee3971826a73c592e15f1be27796c64 BUG: 1176837 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10356 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* build: gluster-api-devel requires libacl-develKaleb S. KEITHLEY2015-05-051-0/+1
| | | | | | | | | | | building libvirt fails Change-Id: I355879e6c6f56e49c4ee78e1a9763598acacd074 BUG: 1218625 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/10579 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cleanup: Unmount all fuse mountsPranith Kumar K2015-05-051-0/+3
| | | | | | | | | Change-Id: I851b4bd74b595d739f7bf9eea920d07e31fa5fbc BUG: 1202244 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10540 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* guster/dht: tiered volumes may not allow access to files undergoing migrationDan Lambright2015-05-051-0/+22
| | | | | | | | | | | | | | | | | | | | If a read IO occurs against a file that has reached rebalance phase 2, we redirect the IO to the destination. For tiered volumes, when we try to reopen the file (on the destination), the lower level DHT receives the open call and fails; it does not have a "cached subvol". Fix is to "teach" the lower level DHT of the new location by sending a locate before the open. Change-Id: Ia4acb0035ff1da15f6a8f9ed54f43c76e8b98f5f BUG: 1214048 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: root <root@gprfs018.sbu.lab.eng.bos.redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10324 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* geo-rep: rename handling in dht volumeNithya Balachandran2015-05-053-0/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Background: Glusterfs changelogs are stored in each brick, which records the changes happened in that brick. Georep will run in all the nodes of master and processes changelogs "independently". Processing changelogs is in brick level, but all the fops will be replayed on "slave mount" point. Problem: With a DHT volume, in changelog "internal fops" are NOT recorded. For Rename case, Rename is recorded in "hashed" brick changelog. (DHT's internal fops like creating linkto file, unlink is NOT recorded). This lead us to inconsistent rename operations. For example, Distribute volume created with Two bricks B1, B2. //Consider master volume mounted @ /mnt/master and following operations executed: cd /mnt/master touch f1 // f1 falls on B1 Hash mv f1 f2 // f2 falls on B2 Hash // Here, Changelogs are recorded as below: @B1 CREATE f1 @B2 RENAME f1 f2 Here, race exists between Brick B1 and B2, say B2 will get executed first. Source file f1 itself is "NOT PRESENT", so it will go ahead and create f2 (Current implementation). We have this problem When rename falls in another brick and file is unlinked in Master. Similar kind of issue exists in following case too(multiple rename): CREATE f1 RENAME f1 f2 RENAME f2 f1 Solution: Instead of carrying out "changelogging" at "HASHED volume", carry out at the "CACHED volume". This way we have rename operations carried out where actual files are present. So,Changelog recorded as : @B1 CREATE f1 RENAME f1 f2 credit: sarumuga@redhat.com PS: Some of the races as the one below are _NOT_ fixed by this patch * f1 and f2 exist. B1 and B2 are their respective cached subvols. For both files hashed-subvol == cached-subvol * mv f1 f2 on master. * B1 has change-log entry of rename f1 f2 * rebalance migrates f2 from B1 and B2 * mv f2 f1 on master. * B2 has change-log entry of rename f2 f1 Since changelog entries (rename f1 f2) and (rename f2 f1) are processed independently by gsyncds, which of either f1 and f2 survives on slave is subject to race. Note that on master its file f1 with name f1 which survived. On slave it can be either file f1 with name f1 or file f2 with name f2 based on who wins the race of processing changelog. Change-Id: Iebc222f582613924c3a7cba37fb6d3e2d8332eda BUG: 1141379 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/10410 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* dht/tier/rebalancer: Fix reset of tiering client pidJoseph Fernandes2015-05-054-3/+5
| | | | | | | | | | | | | | | | In the patch http://review.gluster.org/#/c/9657 the client pid set by tiering migration was getting over- written in dht_start_rebalance_task(). Just corrected it in dht_setxattr() before calling dht_start_rebalance_task() and removed it from dht_start_rebalance_task(). Change-Id: I37cfa111f83a4e5d498042575c93799f60b49870 BUG: 1217937 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10502 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* tiering: Send both attach-tier and tier-start togetherMohammed Rafi KC2015-05-054-12/+122
| | | | | | | | | | | | After attaching tier, we have to start tier rebalance process. This patch is to trigger tier start along with attch-tier. Change-Id: I39380f95123f0087a82213ef263f9f33adcc5adc BUG: 1214222 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10363 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* glusterd/tiering: disable tiering if cluster runs older gluster versionsDan Lambright2015-05-051-1/+43
| | | | | | | | | | | | Do not allow attach or detach tier to work if any of the nodes is running gluster code < 3.7. Change-Id: Id9af8f4057f6fad9cb703ec7645bc01eccb11fc1 BUG: 1218287 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10531 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/tier: don't use hot tier until subvolumes readyDan Lambright2015-05-052-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | When we attach a tier, the hot tier becomes the hashed subvolume. But directories may not yet have been replicated by the fix layout process. Hence lookups to those directories will fail on the hot subvolume. We should only go to the hashed subvolume once the layout has been fixed. This is known if the layout for the parent directory does not have an error. If there is an error, the cold tier is considered the hashed subvolume. The exception to this rules is ENOCON, in which case we do not know where the file is and must abort. Note we may revalidate a lookup for a directory even if the inode has not yet been populated by FUSE. This case can happen in tiering (where one tier has completed a lookup but the other has not, in which case we revalidate one tier when we call lookup the second time). Such inodes are still invalid and should not be consulted for validation. Change-Id: Ia2bc62e1d807bd70590bd2a8300496264d73c523 BUG: 1214289 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10435 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* Tests: wworkaround NetBSD failures in cdc.tEmmanuel Dreyfus2015-05-051-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | The volume reset network.compression operation cause brick processes to be restarted. If the volume is already started, a brick process is already there and the restart will fail, as the brick TCP port is already in use. Because the new brick process is not started, the volume is left with no brick online, and the volume stop operation will timeout waiting for bricks to stop. Obviosuly we have two bugs here - If volume reset network.compression needs to restart the bricks, it should first make sure the previous brick process is terminated - volume stop should not wait forever for bricks to come back online This change does not fix the bugs but just makes sure the volume is stoped before volume reset network.compression, so that the failure oes not happen. BUG: 1129939 Change-Id: I9cd5cdc767ef6ee9dd31f2121d672dc3bfdce45f Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10553 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Upcall: Send stat as part of cache_invalidation notificationsSoumya Koduri2015-05-0514-198/+317
| | | | | | | | | | | | | | | | Have added support to send attributes of both entries and its parent (include oldparent in case of RENAME fop) in the same notification request to avoid multiple rpc requests. Also, made changes in gfapi to send parent object and its attributes changed in a single upcall event. Change-Id: I92833da3bcec38d65216921c2ce4d10367c32ef1 BUG: 1200262 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/10460 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* geo-rep: Minimize rm -rf race in Geo-repAravinda VK2015-05-053-26/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While doing RMDIR worker gets ENOTEMPTY because same directory will have files from other bricks which are not deleted since that worker is slow processing. So geo-rep does recursive_delete. Recursive delete was done using shutil.rmtree. once started, it will not check disk_gfid in between. So it ends up deleting the new files created by other workers. Also if other worker creates files after one worker gets list of files to be deleted, then first worker will again get ENOTEMPTY again. To fix these races, retry is added when it gets ENOTEMPTY/ESTALE/ENODATA. And disk_gfid check added for original path for which recursive_delete is called. This disk gfid check executed before every Unlink/Rmdir. If disk gfid is not matching with GFID from Changelog, that means other worker deleted the directory. Even if the subdir/file present, it belongs to different parent. Exit without performing further deletes. Retry on ENOENT during create is ignored, since if CREATE/MKNOD/MKDIR failed with ENOENT will not succeed unless parent directory is created again. Rsync errors handling was handling unlinked_gfids_list only for one Changelog, but when processed in batch it fails to detect unlinked_gfids and retries again. Finally skips the entire Changelogs in that batch. Fixed this issue by moving self.unlinked_gfids reset logic before batch start and after batch end. Most of the Geo-rep races with rm -rf is eliminated with this patch, but in some cases stale directories left in some bricks and in mount point we get ENOTEMPTY.(DHT issue, Error will be logged in Slave log) BUG: 1211037 Change-Id: I8716b88e4c741545f526095bf789f7c1e28008cb Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/10204 Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rdma:properly handle iobuf_pool when rdma transport is unloadedMohammed Rafi KC2015-05-052-20/+62
| | | | | | | | | | | | | | | | | We are registering iobuf_pool with rdma. When rdma transport is unloaded, we need to deregister all the buffers registered with rdma. Otherwise iobuf_arena destroy will fail. Also if rdma.so is loaded again, then register iobuf_pool with rdma Change-Id: Ic197721a44ba11dce41e03058e0a73901248c541 BUG: 1200704 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/9854 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* build: Autogenerated files delivered in dist tarballKaleb S. KEITHLEY2015-05-051-0/+2
| | | | | | | | Change-Id: I99028fe6ff4547f837b0997543234a9020b75607 BUG: 1216067 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/10429 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: Coverity fixGaurav Kumar Garg2015-05-052-4/+21
| | | | | | | | | | | | | | | | | | | CID: 1293504 (Calling xlator_set_option without checking return value ) CID: 1293502 (Dereferencing a pointer that might be null xl when calling xlator_set_option) CID: 1293500 (Assigning value from dict_get_int32(dict, "type", &type) to ret here, but that stored value is overwritten before it can be used.) Change-Id: I5314fb399480df70bd77bc374e3b573f2efd5710 BUG: 1093692 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/10201 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Kaushal M <kaushal@redhat.com>
* snapshot/scheduler: Use os.path.realpath() for path validationAvra Sengupta2015-05-051-1/+1
| | | | | | | | | | | | | | In order to accomodate systems, where /var/run is a symlink to /run, we are using os.path.realpath() for path validations. Change-Id: I4eae536867ec6c88f92c762b92f5c1966b622bde BUG: 1216931 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/10464 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* doc: AFR arbiter volume usageRavishankar N2015-05-052-0/+60
| | | | | | | | | | | Contains information on creation and behaviour of replica 3 arbiter volumes. Signed-off-by: Ravishankar N <ravishankar@redhat.com> Change-Id: I6af4aa3488649686fdb9b839c733046160e0785b BUG: 1199985 Reviewed-on: http://review.gluster.org/10541 Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Tested-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
* tests: Unmount aux quota mountPranith Kumar K2015-05-051-0/+2
| | | | | | | | | | Change-Id: Ie4ec40230e6b92d2e694b804a991246050b5fa51 BUG: 1202244 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10539 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Sachin Pandit <spandit@redhat.com>
* geo-rep: Status EnhancementsAravinda VK2015-05-0513-717/+917
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Discussion in gluster-devel http://www.gluster.org/pipermail/gluster-devel/2015-April/044301.html MASTER NODE - Master Volume Node MASTER VOL - Master Volume name MASTER BRICK - Master Volume Brick SLAVE USER - Slave User to which Geo-rep session is established SLAVE - <SLAVE_NODE>::<SLAVE_VOL> used in Geo-rep Create command SLAVE NODE - Slave Node to which Master worker is connected STATUS - Worker Status(Created, Initializing, Active, Passive, Faulty, Paused, Stopped) CRAWL STATUS - Crawl type(Hybrid Crawl, History Crawl, Changelog Crawl) LAST_SYNCED - Last Synced Time(Local Time in CLI output and UTC in XML output) ENTRY - Number of entry Operations pending.(Resets on worker restart) DATA - Number of Data operations pending(Resets on worker restart) META - Number of Meta operations pending(Resets on worker restart) FAILURES - Number of Failures CHECKPOINT TIME - Checkpoint set Time(Local Time in CLI output and UTC in XML output) CHECKPOINT COMPLETED - Yes/No or N/A CHECKPOINT COMPLETION TIME - Checkpoint Completed Time(Local Time in CLI output and UTC in XML output) XML output: <?xml version="1.0" encoding="UTF-8" standalone="yes"?> cliOutput> geoRep> volume> name> sessions> session> session_slave> pair> master_node> master_brick> slave_user> slave/> slave_node> status> crawl_status> entry> data> meta> failures> checkpoint_completed> master_node_uuid> last_synced> checkpoint_time> checkpoint_completion_time> BUG: 1212410 Change-Id: I944a6c3c67f1e6d6baf9670b474233bec8f61ea3 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/10121 Tested-by: NetBSD Build System Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* doc/snapshot: doc for snapshot cloneMohammed Rafi KC2015-05-051-1/+18
| | | | | | | | | | Change-Id: Ie53e6ab780ab67ffe0c4f6d92fe4c0b779cec2c9 BUG: 1199894 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10187 Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Tested-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* doc/geo-rep: Doc changes w.r.t common shared gluster meta volumeKotresh HR2015-05-051-8/+23
| | | | | | | | | Change-Id: I71c466537a191caa5c5a4a11e67df384c24c73ef BUG: 1210344 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/10447 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Aravinda VK <avishwan@redhat.com>
* doc: Feature doc for shard translatorKrutika Dhananjay2015-05-051-0/+68
| | | | | | | | | | | Source: https://gist.github.com/avati/af04f1030dcf52e16535#sharding-xlator-stripe-20 Change-Id: I127685bacb6cf924bb41a3b782ade1460afb91d6 BUG: 1200082 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10537 Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Tested-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
* afr: add arbitration supportRavishankar N2015-05-0510-47/+272
| | | | | | | | | | | | | | | | | | | | | | | | | Add logic in afr to work in conjunction with the arbiter xlator when a replica 3 arbiter volume is created. More specifically, this patch: * Enables full locks for afr data transaction for such volumes. * Removes the upfront marking of pending xattrs at the time of pre-op and defer it to post-op. (This is an arbiter independent change and is made for all afr transactions.) * After pre-op stage, check if we can proceed with the fop stage without ending up in split-brain by examining the changelog xattrs. * Unwinds the fop with failure if only one source was available at the time of pre-op and the fop happened to fail on particular source brick. * Skips data self-heal if arbiter brick is the only source available. * Adds the arbiter-count option to the shd graph. This patch is a part of the arbiter logic implementation for 3 way AFR details of which can be found at http://review.gluster.org/#/c/9656/ Change-Id: I9603db9d04de5626eb2f4d8d959ef5b46113561d BUG: 1199985 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/10258 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* mgmt/glusterd: Porting messages to new logging frameworkNandaja Varma2015-05-045-150/+258
| | | | | | | | | | Change-Id: I25f3536446798ea1cffd6b5dfbb3d2398766fcf3 BUG: 1194640 Signed-off-by: Nandaja Varma <nandaja.varma@gmail.com> Reviewed-on: http://review.gluster.org/9808 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com>
* libglusterfs: Fix cluster_entrylk retryPranith Kumar K2015-05-042-5/+7
| | | | | | | | | Change-Id: I92ff46bae36d39a449d4bbaedc88a322992f65eb BUG: 1215265 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10391 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* cluster/ec: Fix dictionary compare functionPranith Kumar K2015-05-047-119/+152
| | | | | | | | | | | | | | | | | | | | If both dicts are NULL then equal. If one of the dicts is NULL but the other has only ignorable keys then also they are equal. If both dicts are non-null then check if for each non-ignorable key, values are same or not. value_ignore function is used to skip comparing values for the keys which must be present in both the dictionaries but the value could be different. geo-rep's stime xattr doesn't need to be present in list xattr but when getxattr comes on stime xattr even if there aren't enough responses with the xattr we should still give out an answer which is maximum of the stimes available. Change-Id: I8de2ceaa2db785b797f302f585d88e73b154167d BUG: 1207712 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10078 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* cluster/ec: Use fd instead of loc for get_size_versionAshish Pandey2015-05-043-57/+63
| | | | | | | | | Change-Id: Ia7d43cb3b222db34ecb0e35424f1766715ed8e6a BUG: 1188242 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/10176 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* snapshot/uss: fix regression failure in bug-1162498.tAvra Sengupta2015-05-043-7/+6
| | | | | | | | | | | | | | | .snaps seems to take some time, before it is available based on the state of the system. Using EXPECT_WITHIN instead of TEST to check the contents of .snaps, hence giving it some time to come up. Change-Id: Iac166500d5a09ba8bab00d994c27a9ad0a01b9c3 BUG: 1218120 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/10518 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* tests: fix failures due to not unmounting $M2 (/mnt/glusterfs/2)Jeff Darcy2015-05-041-0/+7
| | | | | | | | | | | | | | | | | | | | | | | Our failure to unmount meant that both mkdir and rmdir would fail in cleanup(). Because one of those mkdirs was the last thing cleanup() executed, it would fail, so the test would fail, so the entire regression run would fail. The fix has two parts. (1) Unmount the offending directory. (2) Make sure cleanup() returns success even if that last mkdir failed. That might keep us from consistently blowing up regression runs on the very first tests (basic/afr/data-self-heal.t) that we execute. Change-Id: I7a9761bd28761a5ee2face3db8112e9c3f6c5dc8 BUG: 1163543 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/10536 Reviewed-by: Justin Clift <justin@gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* rpc: Included necessary stats in 'gfs3_cbk_cache_invalidation_req'Soumya Koduri2015-05-041-0/+6
| | | | | | | | | | | | | | | To avoid extra network fops, it will benifit clients if the server can send the updated stat of the file/dir and of the parent directory (including oldparent in case of RENAME fop) while sending GF_CBK_CACHE_INVALIDATION upcall event. This patch handles rpc protocol changes. Change-Id: Ic096a61c4d24a8d75a8285be78deb237d7ef9484 BUG: 1200262 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/10452 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Upcall: Cleanup expired client entriesSoumya Koduri2015-05-044-18/+195
| | | | | | | | | | | | | | | | | | | | | To cleanup expired client entries (with access_time > 2*CACHE_INVALIDATION_TIMEOUT), have * defined a global list to contain all the upcall_inode_ctx allocated * Every time a upcall_inode_ctx is allocated, it is added to the global list * during inode_forget, that upcall_inode_ctx is marked for destroy * created a reaper thread which scans through that list * cleans up expired client entries * frees the inode_ctx with destroy_mode set. Note: This reaper thread is initialized only when features.cache_invalidation option is enabled. Change-Id: Iea2a63eb31b8e08d5709e7e090cf26fd13d01265 BUG: 1200267 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/10342 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* gfapi: Skip and delete the upcall entry if not foundSoumya Koduri2015-05-041-20/+32
| | | | | | | | | | | | | | We could run into situations where in gfapi received a upcall notification for a file and the file may have got deleted by then. Hence, incase of error while trying to create handle from gfid, log it, delete the entry and return with CBK_NULL reason. Change-Id: I191f10f44d6804be07734a6be63a3ca08f455f0e BUG: 1213773 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/10341 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* feature/changelog: Capture path for deletesKotresh HR2015-05-047-6/+194
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PROBLEM: There is no way to get the path of deleted file if we have gfid from changelog since the file is already deleted. SOLUTION: Do a recursive readlink on parent gfid in backend .glusterfs path to get the complete path in I/O callpath in changelog translator and capture it in callback. The path captured is relative from the brick root. The field separator used is '\0'. e.g., ......\0<pgfid>/bname\0<relative-path>\0<next-record> ADDITIONAL REQUIRED CHANGES: 1. The changelog translator option called "changelog.capture-del-path" is introduced to enable or disable the capturing of deleted entry path. e.g., gluster vol set <vol-name> changelog.capture-del-path on/off If capture-del-path is disabled, '\0' is captured instead of relative path. e.g., ......\0<pgfid>/bname\0\0\0<next-record> 2. The minor number in the version of changelog is bumped up from v1.1 to v1.2. 3. If recursive readlink is failed for some reason, it will capture \0 in place of <relative path>. e.g., ......\0<pgfid>/bname\0\0\0<next-record> (same as when caputre-del-path option is disabled) 4. If bname argument passed to "resolve_pargfid_to_path" function is NULL and pargfid is ROOT, "." is returned. This is not the case with changelog, where bname is always passed. This is applicable to other consumers of "resolve_pargfid_to_path" routine. NOTE: Changelog parser should consider the above new changes and should parse accordingly. Change-Id: I040ed429b5aa7d391033fc6a540edbf07fc37827 BUG: 1214561 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/10288 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: NetBSD Build System
* glusterd: Enable readdir-ahead by default on new volumesanand2015-05-042-4/+17
| | | | | | | | | | | | | | With gluster-3.7, 'performance.readdir-ahead' will be enabled by default on new volumes when the cluster op-version supports it. Change-Id: I44e76a69e7d1c11e6dfad72c941caf887bb810ee BUG: 1216187 Signed-off-by: anand <anekkunt@redhat.com> Reviewed-on: http://review.gluster.org/10433 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
* Upcall: Feature doc for Upcall infrastrucutreSoumya Koduri2015-05-041-0/+33
| | | | | | | | | Change-Id: I802ba2f13fde6c05da1ed355e340f071e9d20d30 BUG: 1200262 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/10525 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* rpc: Maintain separate xlator pointer in 'rpcsvc_state'Kotresh HR2015-05-046-12/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The structure 'rpcsvc_state', which maintains rpc server state had no separate pointer to track the translator. It was using the mydata pointer itself. So callers were forced to send xlator pointer as mydata which is opaque (void pointer) by function prototype. 'rpcsvc_register_init' is setting svc->mydata with xlator pointer. 'rpcsvc_register_notify' is overwriting svc->mydata with mydata pointer. And rpc interprets svc->mydata as xlator pointer internally. If someone passes non xlator structure pointer to rpcsvc_register_notify as libgfchangelog currently does, it might corrupt mydata. So interpreting opaque mydata as xlator pointer is incorrect as it is caller's choice to send mydata as any type of data to 'rpcsvc_register_notify'. Maintaining two different pointers in 'rpcsvc_state' for xlator and mydata solves the issue. Change-Id: I7874933fefc68f3fe01d44f92016a8e4e9768378 BUG: 1215161 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/10366 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* tiering/cli: Check replica count and bricks are proper or notMohammed Rafi KC2015-05-042-1/+59
| | | | | | | | | | | | | | Right now, attach-tier calls parsing function for add-brick. Add-brick does not have any check for brick count and replca count compatibility. Change-Id: I44ec13eadffc003a9ebf8c4eb0193df559933a68 BUG: 1215122 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10428 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>