glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	tier/glusterd : Validation for frequency thresholds and record-counters	Joseph Fernandes	2015-12-01	2	-0/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	1) if record-counters is set to off check if both the frequency thresholds are non-zero, then pop an error message, with volume set failed. 2) if record-counters is set to on check if both the frequency thresholds are zero, then pop an note, but volume set is not failed. 3) If any of the frequency thresholds are set to a non-zero value, switch record-counters on, if not already on 4) If both the frequency thresholds are set to zero, switch record-counters off, if not already off NOTE: In this fix we have 1) removed unnecessary ctr vol set options. 2) changed ctr_hardlink_heal_expire_period to ctr_lookupheal_link_timeout Change-Id: Ie7ccfd3f6e021056905a79de5a3d8f199312f315 BUG: 1286346 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12780 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
*	cluster/ec: Create copy of dict for setting internal xattrs	Pranith Kumar K	2015-12-01	1	-0/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Ec takes a ref of the request xdata and sets trusted.ec.version/algo etc xattrs as part of it. But this request xdata could be using same dictionary to do the operation on multiple subvolumes, due to which other subvolumes will have internal xattrs of ec in it and will be created on subvols where they are not supposed to appear. Fix: Take a copy of the request xdata/dict to prevent this from happening. Most of the debugging work and test script is contributed by Nitya. BUG: 1286910 Change-Id: If146435dfb89656158dbed3862a3e9a0cda60581 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12831 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
*	Upcall: Read gfid from iatt in case of invalid inode	Soumya Koduri	2015-12-01	2	-0/+156
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When any file/dir is looked upon for the first time, inode created shall be invalid till it gets linked to the inode table. In such cases, read the gfid from the iatt structure returned as part of such fops for UPCALL processing. Change-Id: Ie5eb2f3be18c34cf7ef172e126c9db5ef7a8512b BUG: 1283983 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/12773 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	glusterd: copy snapshot object during duplication of volfile	Mohammed Rafi KC	2015-11-25	1	-0/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When creating duplicate volfile for hot/cold tier, we need to copy the snapshot object in to volfile as it requires to generate snapshot brick volfile. Change-Id: I39ccfa20cd1c16ef2801901e3cd3a31c76f8995d BUG: 1284789 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12734 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com>
*	tier/test: Fixing tests/basic/tier/legacy-many.t	Joseph Fernandes	2015-11-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Change-Id: Icbd83afdeac053aec5b3b8fa19665a2908e87d8e BUG: 1285483 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12751 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	tier/ctr: Correcting rename logic	Joseph Fernandes	2015-11-25	1	-0/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When a file with old_file_name and GFID_1 is renamed with a new_file_name which already exists and with GFID_2, this is what happens in linux internaly. a. "new_file_name" is unlinked for GFID_2 b. a hardlink "new_file_name" is created to GFID_1 c. "old_file_name" hardlink is unlinked for GFID_2. Well this is all internal to linux, and gluster just issues a rename system call at POSIX layer. But CTR Xlator doesn't delete the entries corresponding to the "new_file_name" and GFID_2. Thus leaving the stale entry in the DB. The following are the implications. a. Promotion are tried on these stale entries which will fail and show false results in the status of migration, b. GFID_2 Files with 2 hardlinks, which will have only one hardlink after the rename will not be promoted or demoted as the DB shows 2 entries. Solution: Delete the older database entry for the replaced hardlink Change-Id: I4eafa0872253e29ff1f0bec4283bcfc579ecf0e2 BUG: 1284090 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12711 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	glusterd/bitrot : Integration of bad files from bitd with scrub status command	Gaurav Kumar Garg	2015-11-23	2	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently scrub status command is not displaying list of all the bad files. All the bad files are avaliable in the bitd daemon. With this patch it will dispaly list of all the bad file's in the scrub status command. Change-Id: If09babafaf5d7cf158fa79119abbf5b986027748 BUG: 1207627 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/12720 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	cluster/tier: readdirp to cold tier only	Dan Lambright	2015-11-23	2	-10/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is possible a file would get migrated in the middle of a readdir operation. If there are four subvolumes A,B,C,D, and if readdir reads them in order and reaches subvol B, then, if a file is moved from D to A, it will not be included in the readdir output. This phenonema has pre-existed in DHT migration but is more apparent in tiering. When a file is moved off the hashed subvolume a T file is created. For tiering, we will make the cold subvolume the hashed subvolume. This will ensure the creation of a T file. Readdir will not skip T files in the tier translator. Making the cold subvolume the hashed subvolume ensures the T files created on promotions or creates will be less likely to fill the volume. Creates still put the data on the hot subvolume. Change-Id: Ifde557d3d0e94a4570ca9f115adee3db2ee75407 BUG: 1281598 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12530 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	cluster/tier: Do not delete linkto file on demotion	N Balachandran	2015-11-18	1	-0/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current DHT migration code will always delete the src linkto file after migration as dht always moves files to the hashed subvol. This is not the case in tiering. The lack of linkto files causes rename to fail leaving 2 files with the same name but different gfids on the volume. Modified to leave the linkto file behind if the source volume is the hashed subvolume. Change-Id: I2b99f7d34b4b719aee6232dc40c6a8f8ba88225d BUG: 1279376 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12551 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	core: Add experimental xlator directory	Shyam	2015-11-18	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added an experimental xlator directory under ./xlators/ The intent of this directory is presented in the README.md that accompanies this commit. This directory can be disabled from being compiled using, - configure --disable-experimental Change-Id: I047f380c91a082d111432f8bbdbd4d7bdcbaa809 Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/12321 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	core: use syscall wrappers instead of direct syscalls - regression test	Kaleb S. KEITHLEY	2015-11-17	2	-0/+127
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	various xlators and other components are invoking system calls directly instead of using the libglusterfs/syscall.[ch] wrappers. If not using the system call wrappers there should be a comment in the source explaining why the wrapper isn't used. Change-Id: Id2207deb81a75e1af6f34bf857e74725f8bb532f BUG: 1267967 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/12410 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	cluster/tier make cache mode default for tiered volumes	Dan Lambright	2015-11-17	6	-5/+11
\| \| \| \| \| \| \| \| \| \| \| \| \|	The default mode for tiered volumes must be cache. The current test mode was for engineering and should ordinarily not be used by customers. Change-Id: I20583f54a9269ce75daade645be18ab8575b0b9b BUG: 1282076 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12581 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com>
*	quota: fix spurious failure	vmallika	2015-11-17	1	-2/+1
\| \| \| \| \| \| \| \| \| \|	Change-Id: I5d18533d66df3175752a73430f680dcdfdb3c12a BUG: 1278689 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12546 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	tier/ctr: Providing option to record or ignore metadata heat	Joseph Fernandes	2015-11-15	1	-0/+101
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we heat up a file for both data and metadata write. Here we provide a ctr xlator option called "ctr-record-metadata-heat" were the admin can decide on recording metadata heat i.e heatup a file on metadata writes or not. Metadata data operation are a. setattr: explicit changing of atime/mtime using utimes, changing of posix permissions of the file b. rename: Renaming a file, c. unlink, link: adding or deleting hardlinks d. xattrs: setting or removal of xattrs. NOTE: atime, mtime and ctime change through writev, readv, truncate, mknod and create will not be considered here as these fops are data and primary metadata fops. Defaultly "ctr-record-metadata-heat" is off. Admin can switch it on using gluster volume set command. Change-Id: I91157509255dd5cb429cda2b6d4f64582e155e7b BUG: 1279166 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12540 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	tests: fix spurious error in fops-during-migration-pause.t	Dan Lambright	2015-11-11	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The test did not spend long enough time moving the file for the pause to occur simultaneously, leading to failure. Solution is to elongate that time by increasing the file size. Change-Id: I1727fa9e3f7a987dfa07dd5da44c68d3f17218d9 BUG: 1280428 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12570 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes
*	features/gfid-access: Fix entry creation via setxattr for geo-rep	Kotresh HR	2015-11-10	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GEO-REP INTEROP WITH SHARD FEATURE Problem: Geo-replication uses setxattr interface of gfid-access xlator to create entries and send explicit setattr after entry creation to set uid and gid. But between entry creation and setattr, the inode would not be linked. Hence operation which accesses inode structure during setattr by any the below xlator fails. Solution: Linking inode would seem the obvious solution but, gfid-access xlator cannot link inodes and maintain it as it would result in same inode pointing to two different paths one being virtual .gfid/<gfid> path and other being actual path. The solution is to set uid and gid in frame->root->uid and frame->root->gid respectively from which posix extracts and sets. Change-Id: Ic0749ee471432caeb8ded3152a07de6e64d8538d BUG: 1265148 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/12206 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
*	tests: make mount-nfs-auth.t more stable	Niels de Vos	2015-11-09	1	-15/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mount-nfs-auth.t has a funky way of restarting the Gluster/NFS service. It is a little racy and does not always work. Disabling and enabling the nfs.disable volume option triggers a restart of the Gluster/NFS service too, and is much simpler. Also adding a little more EXPECT_WITHIN statements to prevent the occasional failures. Change-Id: I6765e9f021abbe995dfac00fbfc67298e2ec769c BUG: 1278476 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/12542 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	geo-rep: Fix syncing chown in xsync crawl	Kotresh HR	2015-11-08	3	-22/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GEO-REP INTEROP WITH SHARD FEATURE Problem: The sequence of entry creation and chown in master is recorded as creation of entry with resulted user:group in xsync changelog. During sync, entry creation is always split into two ops, MKNOD and SETATTR. Hence the issue is not being hit otherwise it would have failed with EPERM if parent is owned by different user. But with shard translator being enabled on slave, doing entry creation with MKNOD and SETATTR is not allowed, SETATTR fails as it accesses inode structure which is not linked. Solution: The sequence of entry creation and chown in master should be recorded as MKNOD and SETATTR separately always and do entry creation with single op in gfid-access xlator. The gfid-access patch will be sent separately. Change-Id: I93e554bf9342397a7660503f5128e9709f8a0cd8 BUG: 1265148 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/12205 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com>
*	cluster/tier correct promotion cycle calculation	Dan Lambright	2015-11-07	3	-19/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The tier translator should only choose candidate files for promotion from the most recent cycle, not a multiple of the most recent cycles. Otherwise user observed behavior can be inconsistent. Remove related test in tier.t that is subject to race condition. Change-Id: I9ad1523cac00f904097ce468efa6ddd515857024 BUG: 1275524 Signed-off-by: root <root@rhs-cli-15.gdev.lab.eng.bos.redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12480 Reviewed-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	quota: fix for spurious failure	vmallika	2015-11-06	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Filed a bug# 1278689. For now marking the testcase tests/bugs/quota/bug-1235182.t' bad once the bug# 1278689, remove the testcase from bad list Change-Id: I224f907153d3e5f35834007a40b0050246d8787a BUG: 1278689 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12526 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	tier/libgfdb: Replacing ASCII query file with binary	Joseph Fernandes	2015-11-06	1	-7/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Earlier, when the database was queried we used to save all the queried records in an ASCII format in the query file. This caused issues like filename having ASCII delimiter and used to take a lot of space. The tier.c file also had a lot of parsing code. Here we changed the format of the query file to binary. All the logic of serialization and formating of query record is done by libgfdb. Libgfdb provides API, gfdb_write_query_record() and gfdb_read_query_record(), which the user i.e tier migrator and CTR xlator can use to write to and read from query file. With this binary format we save on disk space i.e reduce to 50% atleast as we are saving GFID's in binary format 16 bytes and not the string format which takes 36 bytes + We are not saving path of the file + we are also saving on ASCII delimiters. The on disk format of query record is as follows, +---------------------------------------------------------------------------+ \| Length of serialized query record \| Serialized Query Record \| +---------------------------------------------------------------------------+ 4 bytes Length of serialized query record \| \| -------------------------------------------------\| \| \| V Serialized Query Record Format: +---------------------------------------------------------------------------+ \| GFID \| Link count \| <LINK INFO> \|..... \| FOOTER \| +---------------------------------------------------------------------------+ 16 B 4 B Link Length 4 B \| \| \| \| -----------------------------\| \| \| \| \| \| V \| Each <Link Info> will be serialized as \| +-----------------------------------------------+ \| \| PGID \| BASE_NAME_LENGTH \| BASE_NAME \| \| +-----------------------------------------------+ \| 16 B 4 B BASE_NAME_LENGTH \| \| \| ------------------------------------------------------------------------\| \| \| V FOOTER is a magic number 0xBAADF00D indicating the end of the record. This also serves as a serialized schema validator. Change-Id: I9db7416fd421e118dd44eafab8b535caafe50d5a BUG: 1272207 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12354 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	snapshot: Making bug-1275616.t more regression failure tolerant	Avra Sengupta	2015-11-06	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	snapshot clone creation fails 'spuriously' on the regression setup coz the brick rpc connect for snap3 in the testcase, happens way after the snap was created. So adding a EXPECT_WITHIN $PROCESS_UP_TIMEOUT check(read delay) to help the cause. But this isn't a 100% guaranteed fix, as on an even slower machine, even this check will fail followed by the subsequent failures that this patch is trying to fix in the first place Change-Id: I2f31558b717fd610111f14e451fe444c09f3f254 BUG: 1278418 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/12516 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	mgmt/glusterd: Store arbiter-count and restore it	Pranith Kumar K	2015-11-04	3	-0/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: 1) Glusterd doesn't remember about arbiter information of replica volume in store. When glusterd goes down and comes backup, arbiter volumes will become replica volumes. 2) Glusterd doesn't import/export arbiter information to/from the other peers. 3) Volume info doesn't show any arbiter count in the output. Fix: 1) Persist arbiter information in glusterd-store 2) Import/Export arbiter information of the volume 3) Change volume info output to show arbiter count. Change-Id: I2db81e73d2694b01f7d07b08a17b41ad5a55c361 BUG: 1276675 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12475 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
*	cluster/tier : Files skipped during tier query parsing	N Balachandran	2015-11-03	2	-0/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The tier query parsing code was using fscanf to read each record. As space is a delimiter for fscanf, filenames containing spaces caused the parsing to return unexpected values causing various issues in the tier process, including crashes due to buffer overflows. Change-Id: Ife602cb7ecb158fccbc2c89e4d2959bd97098a87 BUG: 1276562 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12469 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	snapshot: Inherit snap-max-hard-limit from original volume	Avra Sengupta	2015-11-02	2	-0/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A snapshot should inherit snap-max-hard-limit from the original volume while being created and when being restored to, it should restore the same. Similarly a clone taken from a snapshot should inherit snap-max-hard-limit from the snapshot. Change-Id: If8e90e2ffc10e22086b803ac8e2638a16bcec968 BUG: 1275616 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/12437 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
*	snapshot: Don't display snapshot's hard-limit and soft-limit in vol info	Avra Sengupta	2015-11-02	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The snap-max-hard-limit being displayed in the volume info currently is propagated from system's snap-max-hard-limit as that is a global option common for all volumes, and hence ends up showing the system's snap-max-hard-limit. We should not be displaying snap-max-hard-limit and snap-max-soft-limit in the volume info at all, as these are snap config options and should be set and displayed via snap config command. Modified bug-1113476.t to test the same behaviour. Change-Id: I90891f0cf7fb39fd686787297c7f7cd8c1e7daa1 BUG: 1276018 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/12443 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
*	quota: add version to quota xattrs	vmallika	2015-11-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a quota is disable and the clean-up process terminated without completely cleaning-up the quota xattrs. Now when quota is enabled again, this can mess-up the accounting A version number is suffixed for all quota xattrs and this version number is specific to marker xaltor, i.e when quota xattrs are requested by quotad/client marker will remove the version suffix in the key before sending the response Change-Id: I1ca2c11460645edba0f6b68db70d476d8d26e1eb BUG: 1272411 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12386 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	debug/io-stats: Add FOP sampling feature	Richard Wareing	2015-11-01	1	-0/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Using sampling feature you can record details about every Nth FOP. The fields in each sample are: FOP type, hostname, uid, gid, FOP priority, port and time taken (latency) to fufill the request. - Implemented using a ring buffer which is not (m/c) allocated in the IO path, this should make the sampling process pretty cheap. - DNS resolution done @ dump time not @ sample time for performance w/ cache - Metrics can be used for both diagnostics, traffic/IO profiling as well as P95/P99 calculations - To control this feature there are two new volume options: diagnostics.fop-sample-interval - The sampling interval, e.g. 1 means sample every FOP, 100 means sample every 100th FOP diagnostics.fop-sample-buf-size - The size (in bytes) of the ring buffer used to store the samples. In the even more samples are collected in the stats dump interval than can be held in this buffer, the oldest samples shall be discarded. Samples are stored in the log directory under /var/log/glusterfs/samples. - Uses DNS cache written by sshreyas@fb.com (Thank-you!), the DNS cache TTL is controlled by the diagnostics.stats-dnscache-ttl-sec option and defaults to 24hrs. Test Plan: - Valgrind'd to ensure it's leak free - Run prove test(s) - Shadow testing on 100+ brick cluster Change-Id: I9ee14c2fa18486b7efb38e59f70687249d3f96d8 BUG: 1271310 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/12210 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/tier enable CTR on attach tier	Dan Lambright	2015-10-30	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	CTR is currently disabled by default, and must be manually enabled for tiering to start. This is an overhead on the administrator and easy to overlook. Enable it automatically when a tier is attached. Change-Id: I0c29de8762faec1bfe6d1376a57eeef3357ad15a BUG: 1274847 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12420 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com>
*	afr: write zeros to sink for non-sparse files	Ravishankar N	2015-10-28	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: If a file is created with zeroes ('dd', 'fallocate' etc.) when a brick is down, the self-heal does not write the zeroes to the sink after it comes up. Consequenty, there is a mismatch in disk-usage amongst the bricks of the replica. Fix: If we definitely know that the file is not sparse, then write the zeroes to the sink even if the checksums match. Change-Id: Ic739b3da5dbf47d99801c0e1743bb13aeb3af864 BUG: 1272460 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12371 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	features/shard: Force cache-refresh when lookup/readdirp/stat detect that ↵	Krutika Dhananjay	2015-10-28	1	-0/+36
\| \| \| \| \| \| \| \| \| \| \|	xattr value has changed Change-Id: Ia3225a523287f6689b966ba4f893fc1b1fa54817 BUG: 1272986 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12400 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	tests: Separate logs for each test	Raghavendra Talur	2015-10-27	1	-0/+14
\| \| \| \| \| \| \| \| \| \|	Change-Id: Ib286e3d4d7c432dab8073fce582ccbf723eb31d2 BUG: 1251592 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/12110 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	tests: fix timeout in mount-nfs-auth.t	Jeff Darcy	2015-10-27	1	-11/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The mount timeout was too short. The normal configuration-change path (construct graph, call reconfigure) and the auth-refresh path might in effect run serially. Therefore we have to wait for the sum of those two intervals. As with all too-short-timeout problems, the result was that the test would run fine most of the time. However, it has caused spurious failures on my own patches a half dozen times, and I have a half dozen other emails about it nuking other people's as well (most often but not always on NetBSD). The fix, obviously, is to calculate and use the right timeout value for NFS mount actions. Other actions and timeouts have been left alone. Change-Id: Ic8f013c8c830e33c48bcc6d1b603d6d22a8ba3c5 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/12396 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	tests/tier: Move common functions to tier.rc	N Balachandran	2015-10-21	6	-170/+95
\| \| \| \| \| \| \| \| \| \| \| \|	Move common functions in tier .t files to tier.rc Change-Id: Ibc312d987be9d93e7cc7fc47d0bf598bb1c944c2 BUG: 1272319 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12404 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	cluster/tier: add pause tier for snapshots	Dan Lambright	2015-10-21	1	-0/+87
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Snaps of tiered volumes cannot handle files undergoing migration. We implement a helper mechanism to "pause" migration. Any files undergoing migration are aborted. Clean up is done to remove sticky bits and data at the destination. Migration is restarted after snap completes. For testing an internal switch is added. It is not exposed externally. gluster volume set vol1 tier-pause [true\|false] Change-Id: Ia85bbf89ac142e9b7e73fcbef98bb9da86097799 BUG: 1267950 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12304 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	features/snap : cleanup the root loc in statfs	Ashish Pandey	2015-10-20	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem : In svc_statfs function, wipe_loc is getting called on loc passed by nfs. This loc is being used by svc_stat which throws erro if loc->inode is NULL. Solution : wipe_loc should be called on local root_loc. Change-Id: I9cc5ee3b1bd9f352f2362a6d997b7b09051c0f68 BUG: 1260848 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/12123 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	cluster/dht : Do not migrate files with POSIX locks held	N Balachandran	2015-10-15	2	-0/+187
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dht_migrate_file does not migrate file locks to the dst file. Any locks held on the source file are lost once the migration is complete. This issue is magnified in the case of a tier volume as file migrations occur more frequently and repeatedly as compared to a DHT rebalance. The fix makes 2 changes: 1. Before starting the actual migration process, check if there are any locks held on the file. If yes, do not migrate the file. 2. The rebalance process tries to lock on the entire file just before moving into the Phase 2 of the file migration. If the lock acquisition fails, the file migration does not proceed. If the lock is granted, the file migration proceeds. This still leaves a small window where conflicting locks can be granted to different clients. If client1 requests a lock on the src file just after it is converted to a linkto file and client2 requests a lock on the dst data file, they will both be granted, but all FOPs will be redirected to the dst data file. This issue will be taken up in a subsequent patch. Change-Id: I8c895fc3cced50dd2894259d40a827c7b43d58ac BUG: 1271148 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12347 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	glusterd: disabling enable-shared-storage option should not delete volume	Gaurav Kumar Garg	2015-10-13	1	-0/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously when you create volume with "glusterd_shared_storage" name and if user disable enable-shared-storage option then gluster will delete the "glusterd_shared_storage" volume. With this fix gluster will do appropriate validation of enable-shared-storage option and it will not delete volume with "glusterd_shared_storage" name if it is a user created volume. Change-Id: I2bd92f938fb3de6ef496a934933bdcea9f251491 BUG: 1266818 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/12232 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Anand Nekkunti <anekkunt@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	tier/shd: make shd commands compatible with tiering	Mohammed Rafi KC	2015-10-12	1	-0/+97
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tiering volfiles may contain afr and disperse together or multiple time based on configuration. And the informations for those configurations are stored in tier_info. So most of the volgen code generation need to be changed to make compatible with it. Change-Id: I563d1ca6f281f59090ebd470b7fda1cc4b1b7e1d BUG: 1261276 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12135 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	quota/marker: dir_count accounting is not atomic	vmallika	2015-10-12	1	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Consider below scenario: Quota enabled on pre-existing data Now quota-crawl process will start healing xattrs Now if write is performed where healing is not complete, there is a possibility that 'update txn' is started before 'create xattr txn', in this case dir count can be missed on a dir where quota size xattr is not yet created. Solution is to get size xattr and if xattr is missing, add 1 for dir_count, this requires one additional fop if done in marker during each update iteration Better solution is to us xattrop GF_XATTROP_ADD_ARRAY64_WITH_DEFAULT Change-Id: Idc8978860a3914e70c98f96effeff52e9a24e6ba BUG: 1243798 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/11694 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	tier/ctr: CTR DB named lookup heal of cold tier during attach tier	Joseph Fernandes	2015-10-10	1	-0/+122
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Heal hardlink in the db for already existing data in the cold tier during attach tier. i.e during fix layout do lookup to files in the cold tier. CTR xlator on the brick/server side does db update/insert of the hardlink on a namelookup. Currently the namedlookup is done synchronous to the fixlayout that is triggered by attach tier. This is not performant, adding more time to fixlayout. The performant approach is record the hardlinks on a compressed datastore and then do the namelookup asynchronously later, giving the ctr db eventual consistency Change-Id: I4ffc337fffe7d447804786851a9183a51b5044a9 BUG: 1252586 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/11828 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	cluster/tier: add watermarks and policy driver	Dan Lambright	2015-10-10	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fix introduces infrastructure to support different policies for promotion and demotion. Currently the tier feature automatically promotes and demotes files periodically based on access. This is good for testing but too stringent for most real workloads. It makes it difficult to fully utilize a hot tier- data will be demoted before it is touched- its unlikely a 100GB hot SSD will have all its data touched in a window of time. A new parameter "mode" allows the user to pick promotion/demotion polcies. The "test mode" will be used for *.t and other general testing. This is the current mechanism. The "cache mode" introduces watermarks. The watermarks represent levels of data residing on the hot tier. "cache mode" policy: The % the hot tier is full is called P. Do not promote or demote more than D MB or F files. A random number [0-100] is called R. Rules for migration: if (P < watermark_low) don't demote, always promote. if (P >= watermark_low) && (P < watermark_hi) demote if R < P; promote if R > P. if (P > watermark_hi) always demote, don't promote. gluster volume set {vol} cluster.watermark-hi % gluster volume set {vol} cluster.watermark-low % gluster volume set {vol} cluster.tier-max-mb {D} gluster volume set {vol} cluster.tier-max-files {F} gluster volume set {vol} cluster.tier-mode {test\|cache} Change-Id: I157f19667ec95aa1d53406041c1e3b073be127c2 BUG: 1257911 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12039 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	cluster/tier: fix transpoint endpoint not connected in tier.t (rare)	Dan Lambright	2015-10-09	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The script did not cleanly unmount/mount gluster and change the current working directory when stopping and starting the volume. Most of the time this problem would self-resolve before subsequent tests, but very occasionally races would lead to the errors/failures. Change-Id: I128b913a71e2745512ee81c3d71852311e3b4a1b BUG: 1270328 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12327 Reviewed-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	cluster/ec: Implement gfid-hash read-policy	Pranith Kumar K	2015-10-09	1	-0/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a policy in ec to performs reads from same bricks as long as they are good. Based on the gfid of the file/directory it determines the bricks to be considered for reading. Change-Id: Ic97b5c54c086a28b5e07a330a4fd448551b49376 BUG: 1261260 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12133 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
*	xlators: add JSON FOP statistics dumps every N seconds	Richard Wareing	2015-10-08	5	-3/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Adds a thread to the io-stats translator which dumps out statistics every N seconds where N is configurable by an option called "diagnostics.stats-dump-interval" - Thread cleanly starts/stops when translator is unloaded - Updates macros to use "Atomic Builtins" (e.g. intel CPU extentions) to use memory barries to update counters vs using locks. This should reduce overhead and prevent any deadlock bugs due to lock contention. Test Plan: - Test on development machine - Run prove -v tests/basic/stats-dump.t Change-Id: If071239d8fdc185e4e8fd527363cc042447a245d BUG: 1266476 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/12209 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com>
*	quota: use copy_frame when creating new frame during quota_check_limit	vmallika	2015-10-06	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DHT re-balance, sets frame root PID < 0 and quota_check_limit skips enforcement if this PID is less than 0. When creating new frame for quota_check_limit we need to use copy_frame instead of create_frame, so that all auth information are copied from original frame. Change-Id: Ib3b4a3744f8b0d72a8bc32826f6edae836d6faed BUG: 1267812 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/12265 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	cluster/ec : Mark new entry changelog in entry self-heal	Ashish Pandey	2015-10-06	1	-0/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem : When a new entry is created dirty mark xattrs are not created this will need full heal to be performed, even when there are partial failures. Solution : Marks new entry changelog in self-heal. PS: Also fixed erasing of dirty markers when no data heal is required. BUG: 1254121 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Change-Id: I156e3d3201afa77efe118e1aaace1d91c90a9613 Reviewed-on: http://review.gluster.org/11938 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
*	glusterd: validate function for replica volume options	Sakshi	2015-10-01	1	-0/+67
\| \| \| \| \| \| \| \| \| \|	Change-Id: I5b4a28db101e9f7e07f4b388c7a2594051c9e8dd BUG: 1265479 Signed-off-by: Sakshi <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/12215 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
*	glusterd, dht: volume set for use-readdirp in dht	Pranith Kumar K	2015-10-01	1	-0/+39
\| \| \| \| \| \| \| \| \| \| \| \|	Change-Id: Icab246b1d02808864d878d949fa56f9f889b538a BUG: 1265677 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12221 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
*	cluster/tier re-enable tier.t in automatic tests	Dan Lambright	2015-09-29	1	-15/+29
\| \| \| \| \| \| \| \| \| \| \| \| \|	Re-enable tier.t in automatic tests. Disable check for BSD until recurring problem with SQLlite on it is understood. Change-Id: Ib13b269ab841a59a0a41d8478c8627b180b16c61 BUG: 1231268 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12208 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>