summaryrefslogtreecommitdiffstats
path: root/xlators
Commit message (Collapse)AuthorAgeFilesLines
* dht/tier/rebalancer: Fix reset of tiering client pidJoseph Fernandes2015-05-104-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the patch http://review.gluster.org/#/c/9657 the client pid set by tiering migration was getting over- written in dht_start_rebalance_task(). Just corrected it in dht_setxattr() before calling dht_start_rebalance_task() and removed it from dht_start_rebalance_task(). > http://review.gluster.org/#/c/10502/ > Cherry picked from commit a5fe0f594d41e1a11661d9074bb19e9c2e2c4776 > Change-Id: I37cfa111f83a4e5d498042575c93799f60b49870 > BUG: 1217937 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/10502 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Susant Palai <spalai@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10502 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Conflicts: xlators/cluster/dht/src/dht-common.c xlators/cluster/dht/src/tier.c Change-Id: Id513114c9a880c6196162dd4b35bbf1155a8cd09 BUG: 1219027 Reviewed-on: http://review.gluster.org/10609 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/tiering: Exchange tier info during glusterd handshakeMohammed Rafi KC2015-05-101-0/+154
| | | | | | | | | | | Back port of http://review.gluster.org/#/c/10449 Change-Id: Ibc2f8eeb32d3e5dfd6945ca8a6d5f0f80a78ebac BUG: 1219846 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10678 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* dht: make lookup-unhashed=auto do something actually usefulJeff Darcy2015-05-0910-79/+664
| | | | | | | | | | | | | | | | | | | | | | | | | The key concept here is to determine whether a directory is "clean" by comparing its last-known-good topology to the current one for the volume. These are stored as "commit hashes" on the directory and the volume root respectively. The volume's commit hash changes whenever a brick is added or removed, and a fix-layout is done. A directory's commit hash changes only when a full rebalance (not just fix-layout) is done on it. If all bricks are present and have a directory commit hash that matches the volume commit hash, then we can assume that every file is in its "proper" place. Therefore, if we look for a file in that proper place and don't find it, we can assume it's not on any other subvolume and *safely* skip the global (broadcast to all) lookup. Change-Id: Id6ce4593ba1f7daffa74cfab591cb45960629ae3 BUG: 1220064 Reviewed-on-master: http://review.gluster.org/#/c/7702/ Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Signed-off-by: Shyam <srangana@redhat.com> Reviewed-on: http://review.gluster.org/10729 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cli/tiering: volume info should display details about tierMohammed Rafi KC2015-05-096-3/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Back port of http://review.gluster.org/#/c/10339/ >> gluster volume info patchy Volume Name: patchy Type: Tier Volume ID: 8bf1a1ca-6417-484f-821f-18973a7502a8 Status: Created Number of Bricks: 8 Transport-type: tcp Hot Tier : Hot Tier Type : Replicate Number of Bricks: 1 x 2 = 2 Brick1: hostname:/home/brick30 Brick2: hostname:/home/brick31 Cold Bricks: Cold Tier Type : Disperse Number of Bricks: 1 x (4 + 2) = 6 Brick3: hostname:/home/brick20 Brick4: hostname:/home/brick21 Brick5: hostname:/home/brick23 Brick6: hostname:/home/brick24 Brick7: hostname:/home/brick25 Brick8: hostname:/home/brick26 Change-Id: I7b9025af81263ebecd641b4b6897b20db8b67195 BUG: 1219842 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10676 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cli/tiering: display hot tier, and cold tier separatelyMohammed Rafi KC2015-05-093-0/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | back port of http://review.gluster.org/#/c/10328 cli commands display the brick information without a way to distinguish hot tier, and cold tier. This patch will change all the cli related output, without changing the corresponding xml output. This patch will change following things >> gluster volume info Volume Name: patchy Type: Tier Volume ID: 7745d367-811a-4fe9-a500-d04e7afa94bf Status: Created Number of Bricks: 3 x 2 = 6 Transport-type: tcp Hot Bricks: Brick1: hostname:/home/brick21 Brick2: hostname:/home/brick20 Cold Bricks: Brick3: hostname:/home/brick19 Brick4: hostname:/home/brick16 Brick5: hostname:/home/brick17 Brick6: hostname:/home/brick18 >>gluster volume status Status of volume: patchy Gluster process TCP Port RDMA Port Online Pid ------------------------------------------------------------------------------ Hot Bricks: Brick hostname:/home/brick21 49152 0 Y 4690 Brick hostname:/home/brick20 49153 0 Y 4707 Cold Bricks: Brick hostname:/home/brick19 49154 0 Y 4724 Brick hostname:/home/brick16 49155 0 Y 4741 Brick hostname:/home/brick17 49156 0 Y 4758 Brick hostname:/home/brick18 49157 0 Y 4775 NFS Server on localhost 2049 0 Y 4793 Task Status of Volume patchy ------------------------------------------------------------------------------ There are no active volume tasks >>gluster volume status pathy detail Status of volume: patchy Hot Bricks: ------------------------------------------------------------------------------ Brick : Brick hostname:/home/brick21 TCP Port : 49162 RDMA Port : 0 Online : Y Pid : 22677 File System : ext4 Device : /dev/mapper/luks-cd077c56-42ba-44b1-8195-f214b9bc990c Mount Options : rw,seclabel,relatime,data=ordered Inode Size : 256 Disk Space Free : 127.3GB Total Disk Space : 165.4GB Inode Count : 11026432 Free Inodes : 10998043 ------------------------------------------------------------------------------ Brick : Brick hostname:/home/brick20 TCP Port : 49161 RDMA Port : 0 Online : Y Pid : 22660 File System : ext4 Device : /dev/mapper/luks-cd077c56-42ba-44b1-8195-f214b9bc990c Mount Options : rw,seclabel,relatime,data=ordered Inode Size : 256 Disk Space Free : 127.3GB Total Disk Space : 165.4GB Inode Count : 11026432 Free Inodes : 10998043 Cold Bricks: ------------------------------------------------------------------------------ Brick : Brick hostname:/home/brick19 TCP Port : 49157 RDMA Port : 0 Online : Y Pid : 22501 File System : ext4 Device : /dev/mapper/luks-cd077c56-42ba-44b1-8195-f214b9bc990c Mount Options : rw,seclabel,relatime,data=ordered Inode Size : 256 Disk Space Free : 127.3GB Total Disk Space : 165.4GB Inode Count : 11026432 Free Inodes : 10998043 ------------------------------------------------------------------------------ Brick : Brick hostname:/home/brick16 TCP Port : 49158 RDMA Port : 0 Online : Y Pid : 22518 File System : ext4 Device : /dev/mapper/luks-cd077c56-42ba-44b1-8195-f214b9bc990c Mount Options : rw,seclabel,relatime,data=ordered Inode Size : 256 Disk Space Free : 127.3GB Total Disk Space : 165.4GB Inode Count : 11026432 Free Inodes : 10998043 ------------------------------------------------------------------------------ Brick : Brick hostname:/home/brick17 TCP Port : 49159 RDMA Port : 0 Online : Y Pid : 22535 File System : ext4 Device : /dev/mapper/luks-cd077c56-42ba-44b1-8195-f214b9bc990c Mount Options : rw,seclabel,relatime,data=ordered Inode Size : 256 Disk Space Free : 127.3GB Total Disk Space : 165.4GB Inode Count : 11026432 Free Inodes : 10998043 ------------------------------------------------------------------------------ Brick : Brick hostname:/home/brick18 TCP Port : 49160 RDMA Port : 0 Online : Y Pid : 22552 File System : ext4 Device : /dev/mapper/luks-cd077c56-42ba-44b1-8195-f214b9bc990c Mount Options : rw,seclabel,relatime,data=ordered Inode Size : 256 Disk Space Free : 127.3GB Total Disk Space : 165.4GB Inode Count : 11026432 Free Inodes : 10998043 Change-Id: I7d584eb8782129c12876cce2ba8ffba6c0a620bd BUG: 1219842 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10675 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* glusterd/tiering: disable tiering if cluster runs older gluster versionsDan Lambright2015-05-091-1/+43
| | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of fix 10531 to Gluster 3.7. Do not allow attach or detach tier to work if any of the nodes is running gluster code < 3.7. > http://review.gluster.org/#/c/10531/ > Change-Id: Id9af8f4057f6fad9cb703ec7645bc01eccb11fc1 > BUG: 1218287 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/10531 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: Id9af8f4057f6fad9cb703ec7645bc01eccb11fc1 BUG: 1219600 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10651 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* bitrot: Scrubber log should report 'bad' file detection as ALERT in logGaurav Kumar Garg2015-05-091-2/+2
| | | | | | | | | | | | | | | | | | | If scrubber detect any bad object by mismatching of checksum of scrubber and signer then log messages shold come as a Alert instead of warning. > Change-Id: I075d80700cbe6182e525a04419a80ab18419ff91 > BUG: 1210687 > Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> > Reviewed-on: http://review.gluster.org/10226 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Change-Id: I7c733c82aed5a00c74e60dc7baca0aa9acf26fad BUG: 1220041 Reviewed-on: http://review.gluster.org/10715 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: Update Not Started to Created in code and docAravinda VK2015-05-091-1/+1
| | | | | | | | | | | | | "Not Started" status is now "Created", replaced "Not Started" string in code and doc. Change-Id: If7d606c2cc8156e41291e7eebe9d0da4ad7ac28d Signed-off-by: Aravinda VK <avishwan@redhat.com> BUG: 1219938 Reviewed-on: http://review.gluster.org/10698 Reviewed-on: http://review.gluster.org/10699 Reviewed-by: Kotresh HR <khiremat@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* tiering: Remove unwanted check for tier type volumeMohammed Rafi KC2015-05-092-2/+3
| | | | | | | | | | | | | | | | | | | | Back port of http://review.gluster.org/10494 >Change-Id: I2def61ebf348558e5f6a138265e3329d9a5407a3 >BUG: 1216898 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: http://review.gluster.org/10494 >Reviewed-by: Dan Lambright <dlambrig@redhat.com> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Tested-by: NetBSD Build System Change-Id: I35f460297dfd6b6883c62a6826c99e4f1f3aece2 BUG: 1220051 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10712 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System
* bitrot/scrub: fix induced throttling in syncop_ftw_throttle()Venky Shankar2015-05-091-12/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Failing to reset scanning counter causes "incorrect" delay of around 50 seconds per directory entry. This causes scrubber to run extremely slowly. [ NOTE: This is a temporary fix. With the introduction of token bucket based throttling, inducing throttle via sleep() call would be unneeded. ] Also, fix logging messages in scrubber to log brick and full path of the object which is identified/marked as corrupted. > Change-Id: Id501bd15dcdbd8a09613f80f9d84050304740027 > BUG: 1170075 > Signed-off-by: Venky Shankar <vshankar@redhat.com> > Reviewed-on: http://review.gluster.org/10375 > Tested-by: NetBSD Build System > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> > Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com> Change-Id: I78f227f52f12549d62ecb35cbb70121424f7c2a7 BUG: 1220041 Reviewed-on: http://review.gluster.org/10714 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Invoking daemon reconfigure functions correctlyanand2015-05-092-7/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem : Scrub and bitd reconfigure functions were not invoking if quota is not enabled. Reason : In glusterd_svcs_reconfigure, if quota is not enabled then it is returning in the middle of the function without calling bitd and scrub reconfigure functions. Fix : If quota is not enable on volume, skip quota reconfigure and continue for other daemon reconfigure. This patch also address the state dump issue for bitd and quotad (logs the scrub and bitd info into state dump). Change-Id: I39ea004b70c95543c08496245be595b3ea044a29 BUG: 1219355 Signed-off-by: anand <anekkunt@redhat.com> Reviewed-on: http://review.gluster.org/10622 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com> (cherry picked from commit 5faf4801bd6839cec58324a45dec31f977992236) Reviewed-on: http://review.gluster.org/10703 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* ec: Fix failures with missing filesXavier Hernandez2015-05-095-283/+164
| | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.com/9407 When a file does not exist on a brick but it does on others, there could be problems trying to access it because there was some loc_t structures with null 'pargfid' but 'name' was set. This forced inode resolution based on <pargfid>/name instead of <gfid> which would be the correct one. To solve this problem, 'name' is always set to NULL when 'pargfid' is not present. Another problem was caused by an incorrect management of errors while doing incremental locking. The only allowed error during an incremental locking was ENOTCONN, but missing files on a brick can be returned as ESTALE. This caused an EIO on the operation. This patch doesn't care of errors during an incremental locking. At the end of the operation it will check if there are enough successfully locked bricks to continue or not. BUG: 1220011 Change-Id: I4a1e6235d80e20ef7ef12daba0807b859ee5c435 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/10701 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glupy: fix makefile issuesHumble Devassy Chirammal2015-05-091-5/+0
| | | | | | | | | | Change-Id: I421d5fd57f12941b956e44456974e3a52e34835b BUG: 1220075 Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com> Reviewed-on: http://review.gluster.org/10734 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com>
* core: use reference counting for mem_acct structuresJeff Darcy2015-05-092-12/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When freeing memory, our memory-accounting code expects to be able to dereference from the (previously) allocated block to its owning translator. However, as we have already found once in option validation and twice in logging, that translator might itself have been freed and the dereference attempt causes on of our daemons to crash with SIGSEGV. This patch attempts to fix that as follows: * We no longer embed a struct mem_acct directly in a struct xlator, but instead allocate it separately. * Allocated memory blocks now contain a pointer to the mem_acct instead of the xlator. * The mem_acct structure contains a reference count, manipulated in both the normal and translator allocate/free code using atomic increments and decrements. * Because it's now a separate structure, we can defer freeing the mem_acct until its reference count reaches zero (either way). * Some unit tests were disabled, because they embedded their own copies of the implementation for what they were supposedly testing. Life's too short to spend time fixing tests that seem designed to impede progress by requiring a certain implementation as well as behavior. Change-Id: Id929b11387927136f78626901729296b6c0d0fd7 BUG: 1219026 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/10417 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10723 Tested-by: NetBSD Build System
* cli/tiering: Enhance cli output for tieringMohammed Rafi KC2015-05-092-1/+10
| | | | | | | | | | | | | | | | | | | | | | | back port of http://review.gluster.org/10284 Fix for handling cli output for attach-tier and detach-tier >Change-Id: I4d17f4b09612754fe1b8cec6c2e14927029b9678 >BUG: 1211562 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: http://review.gluster.org/10284 >Reviewed-by: Dan Lambright <dlambrig@redhat.com> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Tested-by: NetBSD Build System >Reviewed-by: Vijay Bellur <vbellur@redhat.com> Change-Id: Ic30c8f0a104d89bf98f5d0069937a34674ee3f2d BUG: 1220052 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10713 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: support for tier volumes 'detach start' and 'detach commit'Dan Lambright2015-05-0912-26/+164
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Back port of http://review.gluster.org/10108 These commands work in a manner analagous to rebalancing when removing a brick. The existing migration daemon detects "detach start" and switches to moving data off the hot tier. While in this state all lookups are directed to the cold tier. gluster v detach-tier <vol> start gluster v detach-tier <vol> commit The status and stop cli commands shall be submitted separately. >Change-Id: I24fda5cc3ba74f5fb8aa9a3234ad51f18b80a8a0 >BUG: 1205540 >Signed-off-by: Dan Lambright <dlambrig@redhat.com> >Signed-off-by: root <root@localhost.localdomain> >Signed-off-by: Dan Lambright <dlambrig@redhat.com> >Reviewed-on: http://review.gluster.org/10108 >Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Change-Id: I212d748d077fb5870ee84b316c653acbafbea3f7 BUG: 1220047 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10708 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glupy: package glupy as a subpackage under gluster namespaceNiels de Vos2015-05-092-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently glupy files resides in gluster namespace of python site packages. The other projects like libgfapi-python ..etc are evolving and need to share the gluster namespace. The current structure makes things difficult as all subpackages have its own __init__ files and other files. One subpackage can not any more own gluster namespace. The attempt is to make below structure for gluster namespace so that it is more portable and scalable for future use. <sitepackages>/gluster/ | -- __init__.py | | -- glupy | -- __init__.py -- glupy.py -- ........ | | -- gfapi | -- __init__.py -- gfapi.py -- ........ By above structure clients can import: >>> from gluster import glupy >>> from gluster import gfapi libgfapi-python project has been moved to this structure via http://review.gluster.org/#/c/9668/ Cherry picked from commit 40df2ed4d098d4cd2c6abbed23e497ac3e2e5804: > Change-Id: I54886200ddb6a4153a74d9e187aeca7cad79ef9e > BUG: 1211900 > Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com> > Reviewed-on: http://review.gluster.org/10248 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Niels de Vos <ndevos@redhat.com> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> This backport is really minimal, because commit 44036808 removed the need for the changes in the glusterfs.spec.in file. Change-Id: I54886200ddb6a4153a74d9e187aeca7cad79ef9e BUG: 1220022 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/10706 Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: change log level of developer logs to DEBUGVijay Bellur2015-05-091-2/+2
| | | | | | | | | | | | | | | | | | Backport of : http://review.gluster.org/10281 A few log messages in dht directory self heal at log level INFO are useful only for developers and these logs tend to casue excessive logs in our log files. Hence moving the log level of such logs to DEBUG. Change-Id: I8a543f4ddeb5c20b2978a0f7b18d8baccc935a54 BUG: 1217949 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/10281 Reviewed-by: N Balachandran <nbalacha@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-on: http://review.gluster.org/10704 Tested-by: NetBSD Build System Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* geo-rep: Fix corrupt gsyncd outputAravinda VK2015-05-091-2/+2
| | | | | | | | | | | | | | | When gsyncd fails with Python traceback, glusterd fails parsing gsyncd output and shows error. BUG: 1219938 Change-Id: Ic32fd897c49a5325294a6588351b539c6e124338 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/10694 Reviewed-on: http://review.gluster.org/10695 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: rename handling in dht volume(changelog changes)Saravanakumar Arumugam2015-05-091-38/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Background: Glusterfs changelogs are stored in each brick, which records the changes happened in that brick. Georep will run in all the nodes of master and processes changelogs "independently". Processing changelogs is in brick level, but all the fops will be replayed on "slave mount" point. Problem: With a DHT volume, in changelog "internal fops" are NOT recorded. For Rename case, Rename is recorded in "hashed" brick changelog. (DHT's internal fops like creating linkto file, unlink is NOT recorded). This lead us to inconsistent rename operations. For example, Distribute volume created with Two bricks B1, B2. //Consider master volume mounted @ /mnt/master and following operations executed: cd /mnt/master touch f1 // f1 falls on B1 Hash mv f1 f2 // f2 falls on B2 Hash // Here, Changelogs are recorded as below: @B1 CREATE f1 @B2 RENAME f1 f2 Here, race exists between Brick B1 and B2, say B2 will get executed first. Source file f1 itself is "NOT PRESENT", so it will go ahead and create f2 (Current implementation). We have this problem When rename falls in another brick and file is unlinked in Master. Similar kind of issue exists in following case too(multiple rename): CREATE f1 RENAME f1 f2 RENAME f2 f1 Solution: Instead of carrying out "changelogging" at "HASHED volume", carry out at the "CACHED volume". This way we have rename operations carried out where actual files are present. So,Changelog recorded as : @B1 CREATE f1 RENAME f1 f2 Note: This patch is dependent on dht changes from this patch. http://review.gluster.org/10410/ changelog related changes are separated out for review. In changelog, xdata passed from DHT is considered as: 1. In case of unlink (internal operation as part of rename), xdata value is set , it is considered as RENAME and recorded accordingly. 2. In case of rename (Hash and Cache different), xdata value is NOT set, recording rename operation is SKIPPED. BUG: 1219412 Change-Id: I7691166c84991482b2cfe073df64e2317c935b13 Reviewed-On: http://review.gluster.org/#/c/10220/ Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com> Reviewed-on: http://review.gluster.org/10633 Reviewed-by: Aravinda VK <avishwan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr : Prevent inode-evict during split-brain resolutionAnuradha2015-05-095-58/+303
| | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/10134/ 1) Provided setfattr command to set timeout for split-brain choice. 2) If split-brain inspection/resolution is being done from the mount for a file, ref the inode when split-brain-choice is set. This inode will be unconditionally unref-ed after timeout seconds set by the user/default otherwise. 3) Updated the doc and testcase to reflect the changes. Change-Id: I15c9037dee28855f21e680e7e3632e1f48dba4e1 BUG: 1219388 Reviewed-on: http://review.gluster.org/10134 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Signed-off-by: Anuradha <atalur@redhat.com> Reviewed-on: http://review.gluster.org/10679
* features/changelog: Fix buffer overflow in snprintfKotresh HR2015-05-081-8/+8
| | | | | | | | | | BUG: 1219823 Change-Id: I3a1c7b7742671847ed3fec13e06d861c3d09f7a9 Reviewed-On: http://review.gluster.org/#/c/10687/ Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/10688 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Aravinda VK <avishwan@redhat.com>
* cluster/ec: Change meaning of trusted.ec.dirtyPranith Kumar K2015-05-087-208/+555
| | | | | | | | | | | | | | | | | - With this change, the xattr will represent if the file needs to be healed or not. It will have different values for data/entry and metadata changes. - inode ref leaks and dict_set_dynstr related leaks fixed - Added support for trylock/lock based on heal-cmd execution or not in data heal. - Made fixes to pass regression runs Change-Id: I9d8def4c2badde18a76b7898816fecfac113737a BUG: 1216303 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10385 Reviewed-on: http://review.gluster.org/10693 Tested-by: NetBSD Build System Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/ec: data heal implementation for ecPranith Kumar K2015-05-083-8/+788
| | | | | | | | | | | | | | | | | | | | | | | | | Data self-heal: 1) Take inode lock in domain 'this->name:self-heal' on 0-0 range (full file), So that no other processes try to do self-heal at the same time. 2) Take inode lock in domain 'this->name' on 0-0 range (full file), 3) perform fxattrop+fstat and get the xattrs on all the bricks 3) Choose the brick with ec->fragment number of same version as source 4) Truncate sinks 5) Unlock lock taken in 2) 5) For each block take full file lock, Read from sources write to the sinks, Unlock 6) Take full file lock and see if the file is still sane copy i.e. File didn't become unusable while the bricks are offline. Update mtime to before healing 7) xattrop with -ve values of 'dirty' and difference of highest and its own version values for version xattr 8) unlock lock acquired in 6) 9) unlock lock acquired in 1) Change-Id: I6f4d42cd5423c767262c9d7bb5ca7767adb3e5fd BUG: 1216303 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10384 Reviewed-on: http://review.gluster.org/10692 Tested-by: Gluster Build System <jenkins@build.gluster.com>
* cluster/ec: metadata/name/entry heal implementation for ecPranith Kumar K2015-05-082-0/+1058
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Metadata self-heal: 1) Take inode lock in domain 'this->name' on 0-0 range (full file) 2) perform lookup and get the xattrs on all the bricks 3) Choose the brick with highest version as source 4) Setattr uid/gid/permissions 5) removexattr stale xattrs 6) Setxattr existing/new xattrs 7) xattrop with -ve values of 'dirty' and difference of highest and its own version values for version xattr 8) unlock lock acquired in 1) Entry self-heal: 1) take directory lock in domain 'this->name:self-heal' on 'NULL' to prevent more than one self-heal 2) we take directory lock in domain 'this->name' on 'NULL' 3) Perform lookup on version, dirty and remember the values 4) unlock lock acquired in 2) 5) readdir on all the bricks and trigger name heals 6) xattrop with -ve values of 'dirty' and difference of highest and its own version values for version xattr 7) unlock lock acquired in 1) Name heal: 1) Take 'name' lock in 'this->name' on 'NULL' 2) Perform lookup on 'name' and get stat and xattr structures 3) Build gfid_db where for each gfid we know what subvolumes/bricks have a file with 'name' 4) Delete all the stale files i.e. the file does not exist on more than ec->redundancy number of bricks 5) On all the subvolumes/bricks with missing entry create 'name' with same type,gfid,permissions etc. 6) Unlock lock acquired in 1) Known limitation: At the moment with present design, it conservatively preserves the 'name' in case it can not decide whether to delete it. this can happen in the following scenario: 1) we have 3=2+1 (bricks: A, B, C) ec volume and 1 brick is down (Lets say A) 2) rename d1/f1 -> d2/f2 is performed but the rename is successful only on one of the bricks (Lets say B) 3) Now name self-heal on d1 and d2 would re-create the file on both d1 and d2 resulting in d1/f1 and d2/f2. Because we wanted to prevent data loss in the case above, the following scenario is not healable, i.e. it needs manual intervention: 1) we have 3=2+1 (bricks: A, B, C) ec volume and 1 brick is down (Lets say A) 2) We have two hard links: d1/a, d2/b and another file d3/c even before the brick went down 3) rename d3/c -> d2/b is performed 4) Now name self-heal on d2/b doesn't heal because d2/b with older gfid will not be deleted. One could think why not delete the link if there is more than 1 hardlink, but that leads to similar data loss issue I described earlier: Scenario: 1) we have 3=2+1 (bricks: A, B, C) ec volume and 1 brick is down (Lets say A) 2) We have two hard links: d1/a, d2/b 3) rename d1/a -> d3/c, d2/b -> d4/d is performed and both the operations are successful only on one of the bricks (Lets say B) 4) Now name self-heal on the 'names' above which can happen in parallel can decide to delete the file thinking it has 2 links but after all the self-heals do unlinks we are left with data loss. Change-Id: I3a68218a47bb726bd684604efea63cf11cfd11be BUG: 1216303 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10298 Reviewed-on: http://review.gluster.org/10691 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System
* cluster/ec: Fix dictionary compare functionPranith Kumar K2015-05-084-118/+57
| | | | | | | | | | | | | | | | | | | | | If both dicts are NULL then equal. If one of the dicts is NULL but the other has only ignorable keys then also they are equal. If both dicts are non-null then check if for each non-ignorable key, values are same or not. value_ignore function is used to skip comparing values for the keys which must be present in both the dictionaries but the value could be different. geo-rep's stime xattr doesn't need to be present in list xattr but when getxattr comes on stime xattr even if there aren't enough responses with the xattr we should still give out an answer which is maximum of the stimes available. Change-Id: I8de2ceaa2db785b797f302f585d88e73b154167d BUG: 1216303 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10078 Reviewed-on: http://review.gluster.org/10690 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System
* cluster/ec: add separate versions for data/entry, metadataAshish Pandey2015-05-088-29/+104
| | | | | | | | | | | | | | | | | | | Adding 64 bits in "version" key of extended attributes. First 64 bits (Left) represents Data version. Last 64 bits (right) represents Meta Data version. Note: 3.7 and 3.6 version ec can't co-exist with this change because xattrop in 3.6 will fail with ERANGE as the buffer passed to it will be '8' bytes where as the value will be 16 bytes in 3.7. Where as 3.7 version clients can work with old version files. For upgrades we need to tell users to complete heals and then upgrade BUG: 1215265 Change-Id: Ib85114680cb7e75b8371c984d9f7b6401c1ffb93 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/10312 Reviewed-on: http://review.gluster.org/10626 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Use fd instead of loc for get_size_versionAshish Pandey2015-05-083-57/+63
| | | | | | | | | | Change-Id: Ia7d43cb3b222db34ecb0e35424f1766715ed8e6a BUG: 1219358 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/10176 Reviewed-on: http://review.gluster.org/10625 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/changelog: Fixing compilation issueAravinda VK2015-05-082-2/+5
| | | | | | | | | | | | | This issue introduced due to manual rebase. Change-Id: I0589f4a0a1270190340f419b8022d6483bcf853d Signed-off-by: Aravinda VK <avishwan@redhat.com> BUG: 1219479 Reviewed-on: http://review.gluster.org/10685 Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/10686
* features/changelog: Avoid creation of empty changelogsSaravanakumar Arumugam2015-05-083-23/+154
| | | | | | | | | | | | | | | | | An empty changelog when rolled over gets unlinked and indexed with a modified path-name in htime file. The modification is "changelog" not "CHANGELOG" in basename of the empty changelog file. BUG: 1219479 Change-Id: Ib5b825ab563fa34d8dcf4368cf6cbf4b25d78a6d Original-Author: Ajeet Jha <ajha@redhat.com> Original-Author: Saravanakumar Arumugam <sarumuga@redhat.com> Reviewed-On: http://review.gluster.org/#/c/9572/ Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/10642 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Aravinda VK <avishwan@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* quota/marker: turn off inode quotas by defaultvmallika2015-05-089-37/+213
| | | | | | | | | | | | | | | | | | | | | | | inode quota is a new feature implemented in glusterfs-3.7 if quota is enabled in the older version and is upgraded to a new version, we can hit setxattr spike during self-heal of inode quotas. So, when a quota is enabled, turn off inode-quotas with a xlator option. With this patch, we still account for inode quotas but only when a write operation is performed for a particular file. User will be able to query inode quotas once the Inode-quota xlator option is enabled. Change-Id: I52fb28bf7024989ce7bb08ac63a303bf3ec1ec9a BUG: 1218243 Signed-off-by: vmallika <vmallika@redhat.com> Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/10152 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/10621
* cluster/tier: don't use hot tier until subvolumes readyDan Lambright2015-05-082-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of fix 10435 to Gluster 3.7. When we attach a tier, the hot tier becomes the hashed subvolume. But directories may not yet have been replicated by the fix layout process. Hence lookups to those directories will fail on the hot subvolume. We should only go to the hashed subvolume once the layout has been fixed. This is known if the layout for the parent directory does not have an error. If there is an error, the cold tier is considered the hashed subvolume. The exception to this rules is ENOCON, in which case we do not know where the file is and must abort. Note we may revalidate a lookup for a directory even if the inode has not yet been populated by FUSE. This case can happen in tiering (where one tier has completed a lookup but the other has not, in which case we revalidate one tier when we call lookup the second time). Such inodes are still invalid and should not be consulted for validation. > http://review.gluster.org/#/c/10435/ > Change-Id: Ia2bc62e1d807bd70590bd2a8300496264d73c523 > BUG: 1214289 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/10435 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: Ia2bc62e1d807bd70590bd2a8300496264d73c523 BUG: 1219547 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10649 Tested-by: NetBSD Build System Reviewed-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Implement [f]truncate fopsKrutika Dhananjay2015-05-083-309/+804
| | | | | | | | | | | | | | | Backport of: http://review.gluster.org/10631 To-Do: * Make ftruncate work even in the absence of path * Aggregate and update ia_blocks appropriately when a file is truncated to a lower size. Change-Id: Icd424430066233ba61a030e72fdddf692d2b3f22 BUG: 1214247 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10638 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* geo-rep: rename handling in dht volumeNithya Balachandran2015-05-081-0/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Background: Glusterfs changelogs are stored in each brick, which records the changes happened in that brick. Georep will run in all the nodes of master and processes changelogs "independently". Processing changelogs is in brick level, but all the fops will be replayed on "slave mount" point. Problem: With a DHT volume, in changelog "internal fops" are NOT recorded. For Rename case, Rename is recorded in "hashed" brick changelog. (DHT's internal fops like creating linkto file, unlink is NOT recorded). This lead us to inconsistent rename operations. For example, Distribute volume created with Two bricks B1, B2. //Consider master volume mounted @ /mnt/master and following operations executed: cd /mnt/master touch f1 // f1 falls on B1 Hash mv f1 f2 // f2 falls on B2 Hash // Here, Changelogs are recorded as below: @B1 CREATE f1 @B2 RENAME f1 f2 Here, race exists between Brick B1 and B2, say B2 will get executed first. Source file f1 itself is "NOT PRESENT", so it will go ahead and create f2 (Current implementation). We have this problem When rename falls in another brick and file is unlinked in Master. Similar kind of issue exists in following case too(multiple rename): CREATE f1 RENAME f1 f2 RENAME f2 f1 Solution: Instead of carrying out "changelogging" at "HASHED volume", carry out at the "CACHED volume". This way we have rename operations carried out where actual files are present. So,Changelog recorded as : @B1 CREATE f1 RENAME f1 f2 credit: sarumuga@redhat.com PS: Some of the races as the one below are _NOT_ fixed by this patch * f1 and f2 exist. B1 and B2 are their respective cached subvols. For both files hashed-subvol == cached-subvol * mv f1 f2 on master. * B1 has change-log entry of rename f1 f2 * rebalance migrates f2 from B1 and B2 * mv f2 f1 on master. * B2 has change-log entry of rename f2 f1 Since changelog entries (rename f1 f2) and (rename f2 f1) are processed independently by gsyncds, which of either f1 and f2 survives on slave is subject to race. Note that on master its file f1 with name f1 which survived. On slave it can be either file f1 with name f1 or file f2 with name f2 based on who wins the race of processing changelog. BUG: 1219412 Change-Id: I43725d69635e2ce065135691ef629014e8df7d50 Original-Author: Nithya Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/10410 Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com> Reviewed-on: http://review.gluster.org/10628 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Kotresh HR <khiremat@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/changelog: Version support for Changelog ParserAravinda VK2015-05-082-11/+70
| | | | | | | | | | | | | | | | | | As optional feature, during unlink, full path will be recorded. Changelog Version number to be bumped up to 1.2. With this patch, parser checks the version number before parsing and handles accordingly. Change-Id: Ic1ad98259c39e417029a08e26a1d4b467817e65a BUG: 1218383 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/10166 Reviewed-on: http://review.gluster.org/10620 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* snapshot: Handshake with glusterd is not properMohammed Rafi KC2015-05-081-20/+127
| | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/9664 If a snap is activated or deactivated, when a node is down, it is not retrieving the data properly during the handshake of glusterd With this patch, a version check will made when a glusterd is started running. If there is a mismach in version, then peers will exchange the healed data. Change-Id: I8bd2a347723db2194d3fa73295878b4dd2e9be5d BUG: 1219744 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/9664 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/10661
* guster/dht: tiered volumes may not allow access to files undergoing migrationDan Lambright2015-05-081-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a backport of fix 10324 to Gluster 3.7. If a read IO occurs against a file that has reached rebalance phase 2, we redirect the IO to the destination. For tiered volumes, when we try to reopen the file (on the destination), the lower level DHT receives the open call and fails; it does not have a "cached subvol". Fix is to "teach" the lower level DHT of the new location by sending a locate before the open. > http://review.gluster.org/#/c/10324/ > Change-Id: Ia4acb0035ff1da15f6a8f9ed54f43c76e8b98f5f > BUG: 1214048 > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Signed-off-by: root <root@gprfs018.sbu.lab.eng.bos.redhat.com> > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/10324 > Tested-by: NetBSD Build System > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Tested-by: Raghavendra G <rgowdapp@redhat.com> > Signed-off-by: Dan Lambright <dlambrig@redhat.com> Change-Id: Ia4acb0035ff1da15f6a8f9ed54f43c76e8b98f5f BUG: 1219608 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10654 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Joseph Fernandes Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* ctr/xlator: Named lookup heal of pre-existing files, before ctr was ON.Joseph Fernandes2015-05-086-4/+945
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: The CTR xlator records file meta (heat/hardlinks) into the data. This works fine for files which are created after ctr xlator is switched ON. But for files which were created before CTR xlator is ON, CTR xlator is not able to record either of the meta i.e heat or hardlinks. Thus making those files immune to promotions/demotions. Solution: The solution that is implemented in this patch is do ctr-db heal of all those pre-existent files, using named lookup. For this purpose we use the inode-xlator context variable option in gluster. The inode-xlator context variable for ctr xlator will have the following, a. A Lock for the context variable b. A hardlink list: This list represents the successful looked up hardlinks. These are the scenarios when the hardlink list is updated: 1) Named-Lookup: Whenever a named lookup happens on a file, in the wind path we copy all required hardlink and inode information to ctr_db_record structure, which resides in the frame->local variable. We dont update the database in wind. During the unwind, we read the information from the ctr_db_record and , Check if the inode context variable is created, if not we create it. Check if the hard link is there in the hardlink list. If its not there we add it to the list and send a update to the database using libgfdb. Please note: The database transaction can fail(and we ignore) as there already might be a record in the db. This update to the db is to heal if its not there. If its there in the list we ignore it. 2) Inode Forget: Whenever an inode forget hits we clear the hardlink list in the inode context variable and delete the inode context variable. Please note: An inode forget may happen for two reason, a. when the inode is delete. b. the in-memory inode is evicted from the inode table due to cache limits. 3) create: whenever a create happens we create the inode context variable and add the hardlink. The database updation is done as usual by ctr. 4) link: whenever a hardlink is created for the inode, we create the inode context variable, if not present, and add the hardlink to the list. 5) unlink: whenever a unlink happens we delete the hardlink from the list. 6) mknod: same as create. 7) rename: whenever a rename happens we update the hardlink in list. if the hardlink was not present for updation, we add the hardlink to the list. What is pending: 1) This solution will only work for named lookups. 2) We dont track afr-self-heal/dht-rebalancer traffic for healing. > http://review.gluster.org/#/c/10370/ > Cherry picked from commit cb11dd91a6cc296e4a3808364077f4eacb810e48 > Change-Id: Ia4bbaf84128ad6ce8c3ddd70bcfa82894c79585f > BUG: 1212037 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Signed-off-by: Dan Lambright <dlambrig@redhat.com> > Reviewed-on: http://review.gluster.org/10370 > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Tested-by: NetBSD Build System > Reviewed-by: Vijay Bellur <vbellur@redhat.com> Change-Id: I367aa46c3f4b8f912248fb8be75866507f2538df BUG: 1219075 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10370 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10615 Tested-by: Vijay Bellur <vbellur@redhat.com>
* quota: support for inode quota in quota.confvmallika2015-05-073-111/+180
| | | | | | | | | | | | | | | | | | Currently when quota limit is set, corresponding gfid is set in quota.conf. This patch supports storing inode-quota limits in quota.conf and also stores additional byte for each gfid to differentiate between usage quota limit and inode quota limit. Change-Id: I444d7399407594edd280e640681679a784d4c46a BUG: 1218170 Signed-off-by: vmallika <vmallika@redhat.com> Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/10069 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/10524
* glusterd: Use generation number to find peerinfo in RPC notificationsKaushal M2015-05-075-8/+53
| | | | | | | | | | | | | | | | | | | | | | The generation number for each peerinfo object is unique. It can be used to find the exact peerinfo object, which is required for peer RPC notifications. Using hostname and uuid matching to find peerinfos can return incorrect peerinfos to be returned in certain cases like multi network peer probe. This could cause updates to happen to incorrect peerinfos. Change-Id: Ia0aada8214fd6d43381e5afd282e08d53a277251 BUG: 1215018 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/10495 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Atin Mukherjee <amukherj@redhat.com> (cherry picked from commit 02583099a219ce327aac62af22b486c7b9fcb531) Reviewed-on: http://review.gluster.org/10623 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* glupy: fix tuntime search path and python module directory layoEmmanuel Dreyfus2015-05-073-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | ut 1) The glupy.so xlator should embed the runtime search path for the python libraries. Unfortunately, python-config does not gives the appprioate flags, therefore we need to also use pkg-config to obtain them 2) Fix the glupy python module directory layout so that python can import the module without problem That two fixes seems to let glupy.t pass on NetBSD again. Backport of: I397aa726ab8bf7d91fa0d6d870a30910a5f4a5d9 BUG: 1212676 Change-Id: Ie48916d71f0f1a357d65c3c22b5e7d7276d720db Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10650 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* nfs: allocate the auth_cache->cache_dict on auth_cache_init()Niels de Vos2015-05-071-14/+10
| | | | | | | | | | | | | | | | | | | | | | | | It seems possible that auth_cache->cache_dict is not always allocated before it is accessed. Instead of allocating the dict upon the 1st access, just create it in auth_cache_init(). Cherry picked from commit eb8847703b8560a045e7ed0336f895bcceda98ea: > Change-Id: I00e60522478b433cb0aae0c1f0948eac544dfd2b > URL: http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10710 > BUG: 1143880 > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: http://review.gluster.org/10600 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> > Tested-by: NetBSD Build System Change-Id: I00e60522478b433cb0aae0c1f0948eac544dfd2b BUG: 1212182 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/10655 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Restore build on non Linux systemsEmmanuel Dreyfus2015-05-072-4/+7
| | | | | | | | | | | | | | | | | | | | | | This change broke the build on NetBSD, FreeBSD, and MacOS X: http://review.gluster.org/10526/ We restore the build with two fixes: - Use POSIX-compliant sysconf(_SC_NPROCESSORS_ONLN) to get the number of processors, instead of Linux specific get_nprocs(). That let us remove Linux-specific #include <sys/sysinfo.h> - Only define MAX() if it is not already defined. NetBSD defines it in <sys/param.h> which is already included Backport of: I62341c670598670e47ea2f69ab94864f96588b18 BUG: 1212676 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Change-Id: I0f098153e76954bb85b5dca3f054a069e31dd94c Reviewed-on: http://review.gluster.org/10653 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* dht/rebalance: Throttle rebalanceSusant Palai2015-05-074-4/+169
| | | | | | | | | | | | | | | | Throttle value will be "normal" by default. For throttling down, a thread will be put in to sleep. And for throttling up, gf_defrag_process_dir will wake up the sleeping threads. Change-Id: I4892ab14982a1ff305aeb2d8bbd33c79d6877b69 BUG: 1219579 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/10526 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/10629 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* dht/rebalancer: Marking tiering migration fopsJoseph Fernandes2015-05-073-6/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a follow up patch for http://review.gluster.org/#/c/10080 In the above, the suggested change in http://review.gluster.org/#/c/10080/7/xlators/cluster/dht/src/dht-rebalance.c doesnot work. The reason it doesnt work is promotion and demotion are done in a multithread way. Whenever a promotion or demotion thread is called, the frame of the old sync_op thread is not carried with it. As a result the frame->root->pid is not set. Solution: When the file is getting migrated, we get a tiering.migration key_value in the xattr dict, so that we pass this dic key-value when we do syncop_setxattr() to do data migration and set the frame->root->pid GF_CLIENT_PID_TIER_DEFRAG in dht_setxattr() just before calling dht_start_rebalance_task(). > http://review.gluster.org/#/c/10266/ > Change-Id: I86fef2d961b32fdd2c0c69d8512cbe846b393404 > BUG: 1194753 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/10266 > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> > Reviewed-by: Susant Palai <spalai@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> Change-Id: I6ab42b2c7a3c3e21c461d097b7558ee967b62c62 BUG: 1218959 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10266 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Susant Palai <spalai@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10601 Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* ctr/libgfdb: Performance enhancer for unlink/create/rename/link fopsJoseph Fernandes2015-05-075-1/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1) ctr_link_consistency option for ctr xaltor is provided so that the user can choose to switch it on or off. /* For link consistency we do a double update i.e mark the link * during the wind and during the unwind we update/delete the link. * This has a performance hit. We give a choice here whether we need * link consistency to be spoton or not using link_consistency flag. * This will have only one link update */ 2) In delete the wind time recording is moved to unwind path. /* Special performance case: * Updating wind time in unwind for delete. This is done here * as in the wind path we will not know whether its the last * link or not. For a last link there is not use to update any * wind or unwind time!*/ > http://review.gluster.org/#/c/10170/ > Cherry picked from commit 606d9734543208542afcf9df982bf2d560235ef6 > Change-Id: I209472fb816f939db4a868b97ba053b028f17ea6 > BUG: 1217786 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/10170 > Reviewed-by: Dan Lambright <dlambrig@redhat.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: I4a89ef80875f36cff91520f712e1f47fde258a63 BUG: 1219066 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10170 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10614 Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Revert "bitrot: Scrubber log should report 'bad' file detection as ALERT in log"Raghavendra Bhat2015-05-071-3/+3
| | | | | | | | | This reverts commit 580759939dbcf835cb5293638060e8dbc41c7bea. Change-Id: Ie5018eaa6350279c488f12a933662e7aaadaa907 Reviewed-on: http://review.gluster.org/10643 Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Raghavendra Bhat <raghavendra@redhat.com>
* changelog: Fixing buffer overrun coverity issues.Nandaja Varma2015-05-072-7/+7
| | | | | | | | | | | | | | | | | | | | | | Coverity IDs: 1214630 1214631 1214633 1234643 Backport of: http://review.gluster.org/9557 Change-Id: I172c4f49bf651b2324522f9e661023f73ca05339 BUG: 789278 Signed-off-by: Nandaja Varma <nvarma@redhat.com> Reviewed-on: http://review.gluster.org/9557 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Sakshi Bansal Reviewed-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 7d7b80efe8c745f3ff7de76fc31c4977098cae01) Reviewed-on: http://review.gluster.org/10595 Tested-by: NetBSD Build System Tested-by: Venky Shankar <vshankar@redhat.com>
* glusterd/quota/tiering: Fixing volgen of quotadJoseph Fernandes2015-05-071-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | The quotad's graph generation was happening wrongly for tiered volume. The check is been inserted. > http://review.gluster.org/#/c/10474/ > Cherry picked from commit cfb9ea4dc68440a18b7f07422901a715b00776f0 > Change-Id: I5554bc5280b0fbaec750e9008fdd930ad53a774f > BUG: 1214219 > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/10474 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I0ad0bdea58c50b24e5f48b5af25a97f995889c5c BUG: 1219048 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10474 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/10611 Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* nfs : fix for coredump caused by export/netgroup feature in the regressionJiffin Tony Thottan2015-05-071-0/+1
| | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/10250/ cherry-picked from 306585d2e57aadc7d15951ab1114d49fd9dbf5aa >Change-Id: Idaae234b9e81c40040393e748db1f61363a48ed0 >BUG: 1211913 >Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> >Reviewed-on: http://review.gluster.org/10250 >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Niels de Vos <ndevos@redhat.com> Change-Id: Ic6d6b2f8fc0629678f0adb59e967e73c0923977a BUG: 1212182 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/10618 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>