summaryrefslogtreecommitdiffstats
path: root/xlators/cluster
Commit message (Collapse)AuthorAgeFilesLines
...
* cluster/ec: Create this->itable in all casesPranith Kumar K2016-01-283-5/+5
| | | | | | | | | | | | | | | | | | | Problem: glfsheal operates based on mount's volfile which doesn't have iamshd flag due to which this->itable is NULL, this leads to "inode not found" logs in glfsheal logs. Fix: Ec only allocates itable with 10 inodes, so allocating this->itable in all cases in init. Change-Id: I01d3c05e93a17007a4716a2d6f392d2aa306a34b BUG: 1294743 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13112 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Move remaining gf_logs to gf_msgsPranith Kumar K2016-01-273-14/+19
| | | | | | | | | | | | Change-Id: I48d9e5313bd3ccf9fe26c90a7051a8a174d75c49 BUG: 1296818 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13195 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* tier/dht : Default value for demote-freq, max files and mbJoseph Fernandes2016-01-262-5/+6
| | | | | | | | | | | | | | | | | | | Default value for tier-demote-frequency is 3600 sec to avoid frequent demotions. Default value for tier-max-mb is 4000 mb Default value for tier-max-files is 10000 files Change-Id: Ie60951c478a7462c425059699ab82511aa13fa0a BUG: 1300412 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/13270 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: Dan Lambright <dlambrig@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: Ignore quota-deem-statfs for watermark calculationN Balachandran2016-01-251-0/+21
| | | | | | | | | | | | | | | | | | The tier process watermark calculations were incorrect when the quota-deem-statfs option was enabled. We now ignore this while calculating hot tier usage to determine watermark levels. Change-Id: I308bc24432e2fa5ad1d5703e80fc391433538bbb BUG: 1301473 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13288 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Tested-by: mohammed rafi kc <rkavunga@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Dan Lambright <dlambrig@redhat.com>
* afr : Check if dict is valid in afr_replies_interpret()Ravishankar N2016-01-211-1/+2
| | | | | | | | | | | | | | | | | posix_mkdir does not send response xdata. So even though replies are valid, the response xdata dict is NULL. Check if dict is non-null in afr_replies_interpret before doing dict_get Signed-off-by: Ravishankar N <ravishankar@redhat.com> Change-Id: If543d68d8bfd2433519105839d5be106076cc276 BUG: 1294053 Reviewed-on: http://review.gluster.org/13185 Tested-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: skip healing data blocks for arbiterRavishankar N2016-01-181-9/+49
| | | | | | | | | | | | | | | | | | | 1 ....but still do other parts of data-self-heal like restoring the time and undo pending xattrs. 2. Perform undo_pending inside inodelks. 3. If arbiter is the only sink, do these other parts of data-self-heal inside a single lock-unlock sequence. Change-Id: I64c9d5b594375f852bfb73dee02c66a9a67a7176 BUG: 1286017 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12777 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Get size and config for invalid inodeAshish Pandey2016-01-131-11/+20
| | | | | | | | | | | | | | | | | | | Problem: After creating an inode and before linking it to inode table, if there is a request to setattr for that file, it fails and leads to crash. Before linking inode to inode table ia_type is IA_INVAL which will casue have_size and have_config as zero. Solution: Check and get size and config if an inode is invalid Change-Id: I0c0e564940b1b9f351369a76ab14f6b4aa81f23b BUG: 1293223 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/13039 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* cluster/dht : Rebalance process crashes due to double fd_unrefN Balachandran2016-01-101-6/+11
| | | | | | | | | | | | | The dst_fd was being unrefed twice in case the call to __dht_rebalance_create_dst_file failed. Change-Id: I56c5aff7fa3827887e67936b8aa1ecbd1a08a9e9 BUG: 1296611 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13193 Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/tier: allow db queries to be interruptableDan Lambright2016-01-073-30/+110
| | | | | | | | | | | | | | | | | | | | A query to the database may take a long time if the database has many entries. The tier daemon also sends IPC calls to the bricks which can run slowly, espcially in RHEL6. While it is possible to track down each such instance, the snapshot feature should not be affected by database operations. It requires no migration be underway. Therefore it is okay to pause tiering at any time except when DHT is moving a file. This fix implements this strategy by monitoring when control passes to DHT to migrate a file using the GF_XATTR_FILE_MIGRATE_KEY trigger. If it is not, the pause operation is successful. Change-Id: I21f168b1bd424077ad5f38cf82f794060a1fabf6 BUG: 1287842 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/13104 Reviewed-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com>
* dht: missleading indendentation, gcc-6Kaleb S KEITHLEY2016-01-051-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | gcc-6 now has -Wmisleading-indentation as part of -Wall. compiling with gcc-6 gives this warning. ... dht-diskusage.c: In function ‘dht_subvol_has_err’: dht-diskusage.c:361:33: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation] goto out; ^~~~ dht-diskusage.c:358:25: note: ...this ‘if’ clause, but it is not if (conf->decommissioned_bricks[i] && ^~ ... Inspection of the source shows that without braces the loop is terminated prematurely. Change-Id: Ica48a8c59ee5d0a206797827d7920259d33b47ec BUG: 1295784 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13176 Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tier/create: store TIER_LINKFILE_GFID in xattr dictionaryMohammed Rafi KC2016-01-051-17/+68
| | | | | | | | | | | | | | | | | | | In tier_create, a new key TIER_LINKFILE_GFID was introduced to avoid a race in stale linkfile deletion. Storing this key in xattr dictionary instead of using local->params dictionary. Because local->params dictionary was also used to create the file before stale linkfile deletion, that leads posix_create to fail, trying to set the added key as extended attributes Change-Id: I24fecb62b47bee65a1e86103925a67d13304c5df BUG: 1290677 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13130 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: check watermark during migrationDan Lambright2016-01-051-46/+57
| | | | | | | | | | | | | | | Currently we check the watermarks only before a cycle begins. We should also check the hot tier's fullness against the watermarks during the migration so the watermark is not exceeded as files are promoted. Change-Id: I2ff87a1c308d64fbdca14bbdf55f3ec3007290ae BUG: 1293932 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/13103 Reviewed-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com>
* cluster/tier: Additional details in error messagesN Balachandran2016-01-041-41/+45
| | | | | | | | | | | | | | Added file path/gfid when available to the tier log messages to make debugging easier. Change-Id: I22dda329367df2b846dcf254594312c997b66083 BUG: 1273043 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/13114 Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/create: Dynamically allocate gfid memoryMohammed Rafi KC2015-12-291-2/+13
| | | | | | | | | | | | | | | | Currently we are storing the memory as a static pointer. There is a chance to go that variable in out of scope. So we should allocate in Dynamic way. Change-Id: I096876deb8055ac3a44681599591a0a032bc0c24 BUG: 1290677 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13102 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/unlink: symlink failed to unlinkMohammed Rafi KC2015-12-291-1/+5
| | | | | | | | | | | | | | | | | | | | | during unlink of a file, we will get stat just after deleting the file, to see if the file is under migration or not. but this stat call will fail for symlink if the actual file was deleted. So it is better not to send stat request from client if it is a symlink as we are not migrating symlink. Change-Id: Idc033b24fa3522b5261e579889d2195b43419682 BUG: 1293963 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/13074 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* quota: limit xattr for subdir not healed on newly added bricksvmallika2015-12-291-1/+19
| | | | | | | | | | | | | | DHT after creating missing directory, tries to heal the xattrs. This xattrs operation fails as INTERNAL FOP key was not set Change-Id: I819d373cf7073da014143d9ada908228ddcd140c BUG: 1294479 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/13100 Reviewed-by: Susant Palai <spalai@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* afr: Fix excessive logging in afr_accuse_smallfiles()Ravishankar N2015-12-281-1/+2
| | | | | | | | | | | | | Commit 2b7226f9d3470d8fe4c98c1fddb06e7f641e364d did not check for the validity of a dict before doing a dict_get. Fix that. Change-Id: Ie21f4da19256b17196f242cd8fd5bb76b0a69c1e BUG: 1294053 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/13077 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* Tier: "tier start force" command implementationhari gowtham2015-12-222-3/+3
| | | | | | | | | | | | | | | | The start command doesnt restart the tier deamon if the deamon is running at one node. hence to bring up the tierd on the nodes where the deamon is down, the force command is implemented. It skips the check for tierd running. Change-Id: I0037d3e5ecfe56637d0da201a97903c435d26436 BUG: 1292112 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/12983 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/dht : Ftruncate on migrating file fails with EINVALN Balachandran2015-12-229-80/+317
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | What: If dht_open is called on a migrating file after the inode_ctx is set, subsequent FOPs on that fd do not open the fd on the dst subvol. This is seen when the open-ftruncate-close sequence is repeatedly called on a migrating file. A second call to the sequence described above causes dht_truncate_cbk to call dht_truncate2 as the dht_inode_ctx was already set by the first call. As dht_rebalance_in_progress_check is not called, the fd is not opened on the dst subvol. On a distributed-replicate volume, this causes AFR to open the fd using afr_fix_open, but with the wrong flags, causing posix_ftruncate to fail with EINVAL. The fix: We require fd specific information to make a decision while handling migrating files. Set the fd_ctx to indicate the fd has been opened on the dst subvol and check if it has been set while processing Phase1/Phase2 checks in the FOP callback functions. Change-Id: I43cdcd8017b4a11e18afdd210469de7cd9a5ef14 BUG: 1284823 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12985 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* build: export minimum symbols from xlators for correct resolutionKaleb S KEITHLEY2015-12-225-6/+6
| | | | | | | | | | | | | | | | | | | | | | Revisiting http://review.gluster.org/#/c/11814/, which unintentionally introduced warnings from libtool about the xlator .so names. According to [1], the -module option must appear in the Makefile.am file(s); if -module is defined in a macro, e.g. in configure(.ac), then libtool will not recognize that this is a module and will emit a warning. [1] http://www.gnu.org/software/automake/manual/automake.html#Libtool-Modules Change-Id: Ifa5f9327d18d139597791c305aa10cc4410fb078 BUG: 1248669 Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/13003 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: soumya k <skoduri@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* tier:delete the linkfile if data file creation failsMohammed Rafi KC2015-12-224-82/+288
| | | | | | | | | | | | | | | | | | | | | | | If we are creating data file in a hot subvolume then we will create a linkfile in cold subvolume. Linkfile creation happens first. If linkfile creation was successful and data file creation failed, then linkfile in cold subvolume will become stale. This patch will delete the linkfile as well, if data file creation fails. Also this code duplicates dht_create to make tier_create Change-Id: I377a90dad47f288e9576c7323b23cf694a91a7a3 BUG: 1290677 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12948 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* afr: warn if pending xattrs missing during init()Ravishankar N2015-12-212-1/+17
| | | | | | | | | | | | | | | | | | Since commit 6e635284a4411b816d4d860a28262c9e6dc4bd6a (glusterfs-3.7.7), the afr pending xattrs are stored in the volfile and used by afr when it initializes. If a cluster is upgraded, prevent afr from loading until the op-version has been bumped up to 3.7.7 and the volfiles have been regenerated using a volume set command. Without this fix, AFR will crash when initialzing. Change-Id: I14249dedb3f2f77cd754d78d8a9a70fdc5fc8c10 BUG: 1293293 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/13038 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/tier: do not block in synctask created from pause tierDan Lambright2015-12-213-18/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | We had run sleep() in the pause tier callback. Blocking within a synctask is dangerous. The sleep() call does not inform the synctask scheduler that a thread is no longer running. It therefore believes it is running. If a second synctask already exists, it may not be able to run. This occurs if the thread limit in the pool has been reached. Note the pool size only grows when a synctask is created, not when it is moved from wait state to run state, as is the case when an FOP completes. When the tier is paused during migration, synctasks already exist waiting for responses to FOPs to the server with high probability. The fix is to yield() in the RPC callback, which will place the synctask into the wait queue and free up a thread for the FOP callback. A timer wakes the callback after sufficient time has elapsed for the pause to occur. Change-Id: I6a947ee04c6e5649946cb6d8207ba17263a67fc6 BUG: 1267950 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12987 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* tier: Demotion failed if the file was renamed when it was in coldMohammed Rafi KC2015-12-211-11/+26
| | | | | | | | | | | | | | | | | | | | During migration if the file is present we just open the file in hashed subvol. Now if the linkfile present on hashed is just linkfile to another subvol, we actually open in hashed subvol. But subsequent operation will go to linkto subvol ie, to non-hashed subvol. This operation will get failed since we haven't opened d on non-hashed. Change-Id: I9753ad3a48f0384c25509612ba76e7e10645add3 BUG: 1292067 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12980 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Susant Palai <spalai@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht : Properly free file descriptors during data migrationJoseph Fernandes2015-12-211-6/+5
| | | | | | | | | | | | | | | While tier migration, free src and dst fd's when create of destination or open of source fails. Change-Id: I62978a669c6c9fbab5fed9df2716b9b2ba00ddf1 BUG: 1291566 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12969 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* afr: handle bad objects during lookup/inode_refreshRavishankar N2015-12-201-1/+10
| | | | | | | | | | | | | | | | | If an object (file) is marked bad by bitrot, do not consider the brick on which the object is present as a potential read subvolume for AFR irrespective of the pending xattr values. Also do not consider the brick containing the bad object while performing afr_accuse_smallfiles(). Otherwise if the bad object's size is bigger, we may end up considering that as the source. Change-Id: I4abc68e51e5c43c5adfa56e1c00b46db22c88cf7 BUG: 1290965 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12955 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: refresh inode using fstatRavishankar N2015-12-201-26/+91
| | | | | | | | | | | | | | | For fd based operations (fgetxattr, readv etc.) if an inode refresh is required, do so using fstat instead of lookup. This is because the file might have been deleted by another client before refresh but posix mandates that FOPS using already open fds must still succeed. Change-Id: Id5f71c3af4892b648eb747f363dffe6208e7ac09 BUG: 1285230 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12894 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* cluster/tier: fix tier-max-files bookeeping and helpDan Lambright2015-12-182-2/+1
| | | | | | | | | | | | | | | | | | The option tier-max-files represents the maximum number of files trasferred by a node in a gives cycle. Fix help message to reflect the "per node" aspect. The code transferred one more file per cycle than the given value. Also change the default values of max file and max bytes to very large values, effectively we will not throttle migration unless the administrator requests it via CLI. Change-Id: Ic2949ed3d8c35afe7c9ae4db72195603cfb2e28f BUG: 1292671 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12984 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/dht: Handle failure in getxattrSusant Palai2015-12-161-0/+7
| | | | | | | | | | | | | | | | | Problem: Currently even if we have received xattrs from any one of the subvolume, we unwind with error in case the last subvol (which unwinds) received a negative response. To handle the case check if any of the subvolume has received a response and pass it down. Change-Id: Ia12a1f9671a6764f7550e6dc223324b1039fcc51 BUG: 1287539 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12845 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* tier:unlink during migrationMohammed Rafi KC2015-12-165-75/+382
| | | | | | | | | | | | | | | | | | | | | | files deleted during promotion were not deleting as the files are moving from hashed to non-hashed. On deleting a file that is undergoing promotion, the unlink call is not sent to the dst file as the hashed subvol == cached subvol. This causes the file to reappear once the migration is complete. This patch also fixes a problem with stale linkfile deleting. Change-Id: I4b02a498218c9d8eeaa4556fa4219e91e7fa71e5 BUG: 1282390 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12829 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/afr: During name heal, propagate EIO only on gfid or type mismatchKrutika Dhananjay2015-12-161-5/+13
| | | | | | | | | | | | | | | | When the disk associated with a brick returns EIO during lookup, chances are that name heal would return an EIO because one of the syncop_XXX() operations as part of it returned an EIO. This is inherently treated by afr_lookup_selfheal_wrap() as a split-brain and the lookup is aborted prematurely with EIO even if it succeeded on the other replica(s). Change-Id: Ib9b7f2974bff8e206897bb4f689f0482264c61e5 BUG: 1291701 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/12973 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* glusterd/afr: store afr pending xattrs as a volume optionRavishankar N2015-12-151-9/+23
| | | | | | | | | | | | | | | | | | | | | | | Problem: When AFR xlator initialises, it uses the name of the client xlators below it for storing the pending changelogs (xattrs). This can be problem when some other xlator is loaded in between AFR and the client. Though that is a trivial 'traverse-graph-till-the-client-and-use-the-name' fix in AFR's init(), there are other issues like when there's no client xlator at all when, say, AFR is moved to the server side. Fix: The client xlator names are currenly unique and stored as brickinfo->brick_ids. So persist these ids as comma separated values in AFR's volume_options and use them as xattr values during init(). Change-Id: Ie761ffeb3373a4c4d85ad05c84a768c4188aa90d BUG: 1285152 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12738 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* dht/rebalance: Use seperate return variable for destination cleanupJoseph Fernandes2015-12-121-3/+3
| | | | | | | | | | | | | | | Use seperate local return variable for destination cleanup, without messing with the function return variable. Change-Id: Iaea9ed2927234fdb888aef7a31ec362090e98196 BUG: 1290975 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12956 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht: files are still going to decommissioned subvolMohammed Rafi KC2015-12-091-3/+27
| | | | | | | | | | | | | | After detach tier start, creates are still going to hot tier. Because when creating data files we are not checking for decommissioned bricks. Change-Id: I8e28258d9b2367dcc8ad6e5e91d0e54d92fdf771 BUG: 1289602 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/12914 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* tier : Spawn promotion or demotion thread depending on local bricksJoseph Fernandes2015-12-091-22/+29
| | | | | | | | | | | | | | | | | | | Spawn Promotion or Demotion depending if there are any Cold or Hot bricks present localy. IF the local HOT brick list is empty dont spawn demote thread. IF the local COLD brick list is empty dont spawn promote thread. Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I524730e59414dd156c78ec0bd7a3629212697e6e BUG: 1289578 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12912 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier : Fix double free in tier processN Balachandran2015-12-071-3/+1
| | | | | | | | | | | | | | | | | The tier process tries to free ipc_ctr_params twice if the syncop_ipc call in tier_process_ctr_query fails. ipc_ctr_params is freed when ctr_ipc_in_dict is freed. But ctr_ipc_out_dict is NULL when syncop_ipc fails, causing GF_FREE to be called on a non-NULL ipc_ctr_params ptr again. Change-Id: Ia15f36dfbcd97be5524588beb7caad5cb79efdb4 BUG: 1288995 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12890 Reviewed-by: Joseph Fernandes Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* tier/dht: Fix mem leak from lookup response dictJoseph Fernandes2015-12-061-0/+6
| | | | | | | | | | | | | Fixing memory leak from response dict during a parent lookup to get the path. Change-Id: I60c23d0b25e7f763f0e53c40e71ee053aba6d555 BUG: 1288019 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12867 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes Tested-by: Joseph Fernandes
* tier/tier: Ignoring status of already migrated filesJoseph Fernandes2015-12-031-5/+16
| | | | | | | | | | | | | Ignore the status of already migrated files and in the process don't count. Change-Id: Idba6402508d51a4285ac96742c6edf797ee51b6a BUG: 1276141 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/12758 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/tier: fix loading tier.so into glusterdN Balachandran2015-12-037-41/+72
| | | | | | | | | | | | | | | | | glusterd occasionally loads shared libraries of translators. This failed for tiering due to a reference to dht_methods which is defined as a global variable which is not necessary. The global variable has been removed and this is now a member of dht_conf and is now initialised in the *_init calls. Change-Id: Ifa0a21e3962b5cd8d9b927ef1d087d3b25312953 BUG: 1287842 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12863 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* heal : Changed heal info to process all indices directoriesAnuradha Talur2015-12-027-58/+102
| | | | | | | | | | Change-Id: Ida863844e14309b6526c1b8434273fbf05c410d2 BUG: 1250803 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/12658 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/ec: Create copy of dict for setting internal xattrsPranith Kumar K2015-12-012-19/+19
| | | | | | | | | | | | | | | | | | | | | | Problem: Ec takes a ref of the request xdata and sets trusted.ec.version/algo etc xattrs as part of it. But this request xdata could be using same dictionary to do the operation on multiple subvolumes, due to which other subvolumes will have internal xattrs of ec in it and will be created on subvols where they are not supposed to appear. Fix: Take a copy of the request xdata/dict to prevent this from happening. Most of the debugging work and test script is contributed by Nitya. BUG: 1286910 Change-Id: If146435dfb89656158dbed3862a3e9a0cda60581 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12831 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* cluster/afr : Readdirp performance enhancementAnuradha Talur2015-11-305-82/+162
| | | | | | | | | | | | | | | | | | | | Things done : 1) during lookup and inode_refresh as part of read_txn, request is sent to detect if heal is required or not. 2) If heal is required, be conservative in setting the readdirp entry inodes to NULL, otherwise don't be. 3) Self-heal-daemon now crawls both indices/xattrop and indices/dirty directory while healing. Change-Id: Ic4a4da63fb7e0726eab5f341a200859b29cf7eb7 BUG: 1250803 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/12507 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* dht : changing variable type to avoid overflowSakshi Bansal2015-11-261-7/+9
| | | | | | | | | | | | | | | | | For layout computation we find total size of the cluster and store it in an unsigned 32 bit variable. For large clusters this value may overflow which leads to wrong computations and for some bricks the layout may overflow. Hence using unsigned 64 bit to handle large values. Change-Id: I7c3ba26ea2c4158065ea9e74705a7ede1b6759c7 BUG: 1282751 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/12597 Reviewed-by: Susant Palai <spalai@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/afr: change data self-heal size check for arbiterPranith Kumar K2015-11-261-0/+4
| | | | | | | | | | | | | Size mismatch should consider that arbiter brick will have zero size file to prevent data self-heal to spuriously trigger/assuming need of self-heals. Change-Id: I179775d604236b9c8abfa360657abbb36abae829 BUG: 1285634 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12755 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* cluster/afr: Remember flags sent by create fopPranith Kumar K2015-11-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | Problem: In create fop, afr doesn't remember the flags. When afr has to perform fixing of the fd that needs to be opened on other bricks because the brick was down at the time of create, the flags with which it needs to send open are not present. Fix: Remember the flags in the fd_ctx. Thanks to Nitya for showing us the problem in re-open with the flags. Change-Id: I8ce1eb50c35fc0722cfc25cb4b6d234ef56180e5 BUG: 1285173 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12739 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier: readdirp to cold tier onlyDan Lambright2015-11-236-109/+496
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible a file would get migrated in the middle of a readdir operation. If there are four subvolumes A,B,C,D, and if readdir reads them in order and reaches subvol B, then, if a file is moved from D to A, it will not be included in the readdir output. This phenonema has pre-existed in DHT migration but is more apparent in tiering. When a file is moved off the hashed subvolume a T file is created. For tiering, we will make the cold subvolume the hashed subvolume. This will ensure the creation of a T file. Readdir will not skip T files in the tier translator. Making the cold subvolume the hashed subvolume ensures the T files created on promotions or creates will be less likely to fill the volume. Creates still put the data on the hot subvolume. Change-Id: Ifde557d3d0e94a4570ca9f115adee3db2ee75407 BUG: 1281598 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12530 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/ec: Mark internal fops appropriatelyXavier Hernandez2015-11-194-27/+58
| | | | | | | | | | | | | 1) Mark read fops in read-modify-write by EC as internal. 2) Handle uid/gid set/reset correctly BUG: 1282761 Change-Id: I5c1ce0cd6213367eaead5fed33aa2397c4e46df7 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/12599 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* afr: Drop compatibility lock for data self-healRavishankar N2015-11-181-18/+0
| | | | | | | | | | | | | | | | | | | In glusterfs 3.4 and older, AFR did not take locks in self-heal domain during data self-heal. So this compat lock in data domain was added to prevent older clients from trying to heal a file while an existing self-heal was going on by a newer client. But the side effect was that all appending writes (which take full locks in data domain) from mounts would be stalled until self-heal was complete. Since glusterfs 3.4 is not supported anymore, remove the compat lock. Change-Id: I31c8e4d7f3364f769a14eec295154e3c40d9f78e BUG: 1283032 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/12602 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/tier: Do not delete linkto file on demotionN Balachandran2015-11-182-2/+9
| | | | | | | | | | | | | | | | | The current DHT migration code will always delete the src linkto file after migration as dht always moves files to the hashed subvol. This is not the case in tiering. The lack of linkto files causes rename to fail leaving 2 files with the same name but different gfids on the volume. Modified to leave the linkto file behind if the source volume is the hashed subvolume. Change-Id: I2b99f7d34b4b719aee6232dc40c6a8f8ba88225d BUG: 1279376 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12551 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
* cluster/ec: Mark self-heal fops as internalPranith Kumar K2015-11-186-11/+13
| | | | | | | | | | Change-Id: I8ae7af266d3e00460f0cfdc9389a926e5f2fee36 BUG: 1282761 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/12598 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>