summaryrefslogtreecommitdiffstats
path: root/xlators/features
Commit message (Collapse)AuthorAgeFilesLines
* changelog xlator: avoid unsupported threaded event-poll usageEmmanuel Dreyfus2015-03-301-10/+5
| | | | | | | | | | | | | | | | | | | | | The changelog xlator was modified to use a poller thread, which uses the same event pool as the xlator stack. Unfortunately, such threaded usage of event pool is not supported by event-poll code, only event-epoll supports it. As a result, platforms such as NetBSD that lack epoll support got broken. The fix is to remove the poller thread, which does not cause any functionnality loss because the xlator stack event poll is functionnal. That lets NetBSD pass AFR tests again. BUG: 1129939 Change-Id: I3d73cf58e2ed8d92d9e0191f7abda3c37dea4159 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10030 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* qemu-block: Fixing coverity issue(Unused value)Nandaja Varma2015-03-301-5/+4
| | | | | | | | | | | | | Coverity ID: 1124889 1124889 1124889 1124889 1124889 1124889 1124889 1124889 1124889 1124889 1124889 Change-Id: I82d57022c4a06fb70573b4e9436f934185f8fb84 BUG: 789278 Signed-off-by: Nandaja Varma <nandaja.varma@gmail.com> Reviewed-on: http://review.gluster.org/9642 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Humble Devassy Chirammal <humble.devassy@gmail.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* glupy: correct the definition of GlusterFS in setup.pyHumble Devassy Chirammal2015-03-301-1/+1
| | | | | | | | | Change-Id: I31597a623b4ebf3d3129067eb20c661c910b97fe BUG: 1198849 Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com> Reviewed-on: http://review.gluster.org/9958 Reviewed-by: Lalatendu Mohanty <lmohanty@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* doc: Fix Jeff Darcy's links in docsSaravanakumar Arumugam2015-03-301-1/+1
| | | | | | | | | Change-Id: I4bc859662088a55fe70714feb3c47ed02ad08e94 BUG: 1206539 Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com> Reviewed-on: http://review.gluster.org/10041 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* features/trash : Discarding extended truncate for trash-translatorJiffin Tony Thottan2015-03-301-4/+8
| | | | | | | | | | Change-Id: I5c571cbb2d6da1e95831ec206639926722a9d281 BUG: 1132465 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/9984 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anoop C S <achiraya@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Bug fixesKrutika Dhananjay2015-03-272-30/+50
| | | | | | | | | | | | | | | | | * Return number of bytes written in writev cbk on success * Eliminate separate inode table for sharding xlator. * Fix appearance of "shard" as an option within the volfile for subvolume of type features/shard. * Fix values of min and max allowed shard block size * Return @new as opposed to NULL in shard_create_gfid_dict() on success Change-Id: I6319d377a196d1c5ceed1f65d337ff8eabcb21f8 BUG: 1205661 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/10003 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/trash: Removing dead callbacksAnoop C S2015-03-271-405/+20
| | | | | | | | | | | | | | | Since ftruncate create, mkdir, writev, readv and unlink calls are being re-directed to corresponding truncate calls, we no longer need their cbks. So removing those cbks for now. Change-Id: I41ecde7093a555b3bf69b66afaa8eca835b4982a BUG: 1132465 Signed-off-by: Anoop C S <achiraya@redhat.com> Reviewed-on: http://review.gluster.org/10002 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/snapview-client: Don't free un-allocated memoryPranith Kumar K2015-03-271-2/+0
| | | | | | | | | | Change-Id: I8636ced27448dde4f2c11370fe2026067d4a7e74 BUG: 1203637 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10004 Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/ctr: Removing rpc-lib, rpc-xdr and api from CPPFLAGSAnoop C S2015-03-251-6/+1
| | | | | | | | | | | | | Changetimerecorder doesn't seem to use rpc-lib, rpc-xdr and gfapi in the source. So removing those from Makefile.am Change-Id: I21c71db6212c10ba3821c6c456958a45c5312d41 BUG: 1198849 Signed-off-by: Anoop C S <achiraya@redhat.com> Reviewed-on: http://review.gluster.org/9997 Reviewed-by: Joseph Fernandes <josferna@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* features/bit-rot: filesystem scrubberVenky Shankar2015-03-246-61/+502
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Scrubber performs signature verification for objects that were signed by signer. This is done by recalculating the signature (using the hash algorithm the object was signed with) and verifying it aginst the objects persisted signature. Since the object could be undergoing IO opretaion at the time of hash calculation, the signature may not match objects persisted signature. Bitrot stub provides additional information about the stalesness of an objects signature (determinted by it's versioning mechanism). This additional bit of information is used by scrubber to determine the staleness of the signature, and in such cases the object is skipped verification (although signature staleness is performed twice: once before initiation of hash calculation and another after it (an object could be modified after staleness checks). The implmentation is a part of the bitrot xlator (signer) which acts as a signer or scrubber based on a translator option. As of now the scrub process is ever running (but has some form of weak throttling mechanism during filesystem scan). Going forward, there needs to be some form of scrub scheduling and IO throttling (during hash calculation) tunables (via CLI). Change-Id: I665ce90208f6074b98c5a1dd841ce776627cc6f9 BUG: 1170075 Original-Author: Raghavendra Bhat <rabhat@redhat.com> Original-Author: Venky Shankar <vshankar@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9914 Tested-by: Vijay Bellur <vbellur@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/bit-rot: Implementation of bit-rot xlatorVenky Shankar2015-03-2411-288/+1506
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the "Signer" -- responsible for signing files with their checksums upon last file descriptor close (last release()). The event notification facility provided by the changelog xlator is made use of. Moreover, checksums are as of now SHA256 hash of the object data and is the only available hash at this point of time. Therefore, there is no special "what hash to use" type check, although it's does not take much to add various hashing algorithms to sign objects with. Signatures are stored in extended attributes of the objects along with the the type of hashing used to calculate the signature. This makes thing future proof when other hash types are added. The signature infrastructure is provided by bitrot stub: a little piece of code that sits over the POSIX xlator providing interfaces to "get or set" objects signature and it's staleness. Since objects are signed upon receiving release() notification, pre-existing data which are "never" modified would never be signed. To counter this, an initial crawler thread is spawned The crawler scans the entire brick for objects that are unsigned or "missed" signing due to the server going offline (node reboots, crashes, etc..) and triggers an explicit sign. This would also sign objects when bit-rot is enabled for a volume and/or after upgrade. Change-Id: I1d9a98bee6cad1c39c35c53c8fb0fc4bad2bf67b BUG: 1170075 Original-Author: Raghavendra Bhat <raghavendra@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9711 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: generate volfile for BitDVenky Shankar2015-03-244-0/+163
| | | | | | | | | | | | | | * Implement the skeleton of bit-rot xlator. Original-Author: Raghavendra Bhat <raghavendra@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Signed-off-by: Anand Nekkunti <anekkunt@redhat.com> Change-Id: If33218bdc694f5f09cb7b8097c4fdb74d7a23b2d BUG: 1170075 Reviewed-on: http://review.gluster.org/9710 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* read-only: read-only/worm translator should be in brick graph by defaultAtin Mukherjee2015-03-247-51/+386
| | | | | | | | | | | | | | | | | | | | | Problem: read-only/worm translator is not loaded by default in brick graph because of which when read-only option is set through volume set volume still remains writable untill the bricks are restarted as the translator does not have an inmemory flag to decide whether the read-only/worm option is turned or not. Solution: read-only/worm should be loaded by default in brick graph and the read-only/worm option can be toggled through volume set command. read-only/worm translator now' has an in-memory flag to decide whether the volume is read-only or not and based on that either reject the fop or proceed. Change-Id: Ic79328698f6a72c50433cff15ecadb1a92acc643 BUG: 1134822 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/8571 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Bitrot StubVenky Shankar2015-03-249-7/+1922
| | | | | | | | | | | | | Bitrot stub implements object versioning required for identifying signature freshness. More details about versioning is explained as a part of the "bitrot feature documentation" patch. Change-Id: I2ad70d9eb109ba4a12148ab8d81336afda529ad9 BUG: 1170075 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9709 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/changelog: Make use of IPC fopVenky Shankar2015-03-241-0/+46
| | | | | | | | | | | | | Translators which wish to send event notifications can send "down" an IPC FOP with op_type as GF_IPC_TARGET_CHANGELOG and xdata carrying event structures (changelog_event_t). Change-Id: I0e5f8c9170161c186f0e58d07105813e34e18786 BUG: 1170075 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9775 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libgfchangelog: cleanups on disconnectionVenky Shankar2015-03-245-140/+280
| | | | | | | | | | | | | | | | | | | | | | | [WIP patch as of now, just needs a little tweak] A pending TODO in the code caused regressions to fail as bitrot daemons are spawned during volume start (equivalent to enabling bitrot by default). The problematic part that casued such failures is during brick disconnections with unsafe handling of event data structured in the code. With this patch, data structures are properly cleaned up with care taken to cleanup all accessors first. This also fixes potential memory leaks which was bluntly ignored before. Change-Id: I70ed82cb1a0fb56c85ef390007e321a97a35c5ce BUG: 1170075 Signed-off-by: Venky Shankar <vshankar@redhat.com> original-author: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9959 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* libgfchangelog: Fix faulty reference to xlator contextVenky Shankar2015-03-244-55/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: libgfchangelog initializes global xlator on library load (via constructor: _ctor) and mangles it's xlator context thereby messing with certain important members of the command structure. On receiving an RPC disconnection event, if the point-of-execution was in libgfchangelogs context, accessing ->cmd_args during RPC notify resulted in a segfault. Fix: Since the libarary needs to be able to work with processes that have a notion of an xlator (THIS in particular) and without it, care needs to be taken to allocate the global xlator when needed. Moreover, the actual fix is to use the correct xlator context in both cases. A new API is introduces when needs to be invoked by the conusmer (although this could have been done during register() call, keeping it a separate API makes thing flexible and easy). Test: The issue is observed when a brick process goes offline. This is triggered when test cases (.t's) are run in bulk, since each test essestially spawns bricks processes (on volume start) and terminates them (volume stop). Since bitrot daemon, as of now, spawns upon volume start, the issue is much observed when the volume is taken offline at the end of each test case. With this fix, running the basic and core test cases along with building the linux kernel has passed without daemon segfaults. Thanks to Johnny (rabhat@) for helping in debugging the issue (and with the fix :)). Change-Id: I8d3022bf749590b2ee816504ed9b1dfccc65559a BUG: 1170075 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9953 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* features/trash: Handling hardlinks in trash translatorJiffin Tony Thottan2015-03-241-1/+29
| | | | | | | | | | | | | | | | | In the current code of trash translator, file is moved to trash directory without checking whether it is the last hardlink.This may lead to inconsistency for a file in that gluster volume.To avoid those scenarios,so a file is moved to trash directory only if it is the last hardlink. Change-Id: Id098e53a2236c6406ef91e6e2599ea2cff9bace3 BUG: 1132465 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/9926 Reviewed-by: Anoop C S <achiraya@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* tiering/CTR:build CTR only if tiering is enabledMohammed Rafi KC2015-03-201-1/+5
| | | | | | | | | | | | Thanks to Niels for this fix Change-Id: I9a13c3de3ed5d4eb06c6af61a2519bf27f1b6259 BUG: 1194753 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/9957 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* build: remove broken and unused _la_LIBADD variableNiels de Vos2015-03-191-1/+0
| | | | | | | | | | | | | | | | | | | While building, the following warning is displayed: xlators/features/arbiter/src/Makefile.am:7: variable `_la_LIBADD' is defined but no program or xlators/features/arbiter/src/Makefile.am:7: library has `_la' as canonical name (possible typo) The _la_LIBADD really seems like a typo, dropping it should not be harmful to anything, except for the warning that will now be gone. BUG: 1199985 Change-Id: I3f3ba911f59df2e51fdc6387295fff4bbcc5a12d Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9950 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* build: pass the correct CFLAGS and LDFLAGS when linking (to) libgfdbNiels de Vos2015-03-191-3/+2
| | | | | | | | | Change-Id: Id9a7d0f457d9759ab7d0a52a4000b5ae36d211f8 BUG: 1194753 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9946 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* features/shard: Introducing sharding translatorKrutika Dhananjay2015-03-196-1/+1951
| | | | | | | | | | | | | | | | | | | | | | | | | | | Based on the high-level design by Anand V. Avati which can be found @ https://gist.github.com/avati/af04f1030dcf52e16535#sharding-xlator-stripe-20 Still to-do: * complete implementation of inode write fops - [f]truncate, zerofill, fallocate, discard * introduce transaction mechanism in inode write fops * complete readv * Handle open with O_TRUNC * Handle unlinking of all shards during unlink/rename * Compute total ia_size and ia_blocks in lookup, readdirp, etc * wind fsync/flush on all shards Note: Most of the items above are related. Once we come up with a clean way to determine the last shard/shard count for a file/file size and the mgmt of sparse regions of the file, implementing them becomes trivial. Change-Id: Id871379b53a4a916e4baa2e06f197dd8c0043b0f BUG: 1200082 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9841 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* afr: arbiter xlatorRavishankar N2015-03-196-1/+389
| | | | | | | | | | | | | | | | | | | | | | | | This patch adds the arbiter translator into the tree. This is a server side xlator used for replica 3 volumes. It sits above posix and will be loaded on the 3rd (last) brick of every afr subvolume in a replica 3 configuration. It intercepts inode read/write operations: reads are unwound with ENOTCONN, inode writes are unwound with success without actually passing them down to posix. Metadata operations are allowed to pass through. The CLI for creating a 3 way replica with arbiter is also added but kept disabled (A 'normal' 3 way replica is created instead). This patch is a part of the arbiter logic implementation for 3 way AFR, details of which can be found at http://review.gluster.org/#/c/9656/ Change-Id: I395b81f49d5da52c466daf5c8518f1bbad9c16fa BUG: 1199985 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/9840 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Adding ChangeTimeRecorder(CTR) Xlator to GlusterFSJoseph Fernandes2015-03-198-2/+1961
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ********************************************************************** ChangeTimeRecorder(CTR) Xlator | ********************************************************************** ChangeTimeRecorder(CTR) is server side xlator(translator) which sits just above posix xlator. The main role of this xlator is to record the access/write patterns on a file residing the brick. It records the read(only data) and write(data and metadata) times and also count on how many times a file is read or written. This xlator also captures the hard links to a file(as its required by data tiering to move files). CTR Xlator is the consumer of libgfdb. To Enable/Disable CTR Xlator: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ gluster volume set <volume-name> features.ctr-enabled {on/off} To Enable/Disable Frequency Counter Recording in CTR Xlator: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ gluster volume set <volume-name> features.record-counters {on/off} Change-Id: I5d3cf056af61ac8e3f8250321a27cb240a214ac2 BUG: 1194753 Signed-off-by: Joseph Fernandes <josferna@redhat.com> Reviewed-on: http://review.gluster.org/9935 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/quota : Introducing inode quotavmallika2015-03-186-219/+376
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ========================================================================== Inode quota ========================================================================== = Currently, the only way to retrieve the number of files/objects in a = = directory or volume is to do a crawl of the entire directory/volume. = = This is expensive and is not scalable. = = = = The proposed mechanism will provide an easier alternative to determine = = the count of files/objects in a directory or volume. = = = = The new mechanism proposes to store count of objects/files as part of = = an extended attribute of a directory. Each directory's extended = = attribute value will indicate the number of files/objects present = = in a tree with the directory being considered as the root of the tree. = = = = The count value can be accessed by performing a getxattr(). = = Cluster translators like afr, dht and stripe will perform aggregation = = of count values from various bricks when getxattr() happens on the key = = associated with file/object count. = A new interface is introduced: ------------------------------ limit-objects : limit the number of inodes at directory level list-objects : list the directories where the limit is set remove-objects : remove the limit from the directory ========================================================================== CLI COMMAND: gluster volume quota <volname> limit-objects <path> <number> [<percent>] * <number> is a hard-limit for number of objects limitation for path "<path>" If hard-limit is exceeded, creation of file/directory is no longer permitted. * <percent> is a soft-limit for number of objects creation for path "<path>" If soft-limit is exceeded, a warning is issued for each creation. CLI COMMAND: gluster volume quota <volname> remove-objects [path] ========================================================================== CLI COMMAND: gluster volume quota <volname> list-objects [path] ... Sample output: ------------------ Path Hard-limit Soft-limit Used Available Soft-limit exceeded? Hard-limit exceeded? ------------------------------------------------------------------------ -------------------------------------- /dir 10 80% 10 0 Yes Yes ========================================================================== [root@snapshot-28 dir]# ls a b file11 file12 file13 file14 file15 file16 file17 [root@snapshot-28 dir]# touch a1 touch: cannot touch `a1': Disk quota exceeded * Nine files are created in directory "dir" and directory is included in * the count too. Hence the limit "10" is reached and further file creation fails ========================================================================== Note: We have also done some re-factoring in cli for volume name validation. New function cli_validate_volname is created ========================================================================== Change-Id: I1823497de4f790a2a20ebb1770293472ea33ee2b BUG: 1190108 Signed-off-by: Sachin Pandit <spandit@redhat.com> Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/9769 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/changelog: RPC'fy {libgf}changelogVenky Shankar2015-03-1829-1301/+3936
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces RPC based communication between the changelog translator and libgfchangelog. It replaces the old pathetic stream based interaction that existed earlier (due to time constraints :-/). Changelog, upon initialization starts a RPC server (rpcsvc) allowing clients to invoke a probe API as a bootup mechanism to request for event notifications. During probe, clients can choose an event filter specifying the type(s) of events they are interested in. As of now there is no way to change the event notification set once the probe RPC call is made, but that is easier to implement. The actual event notifications is done on a separate RPC session. The client (libgfchangelog) itself starts and RPC server which the changelog translator "connects back" during probe. Notifications are dispatched by a bunch of threads from the server (translator) and the client optionally orders them if ordered notifications are requried. FOPs fill in their respective event details in a buffer (rot-buffs to be particular) and a bunch of threads (consumers) swap the buffers out of roatation and dispatch them via RPC. To avoid writer starvation, then number of dispatcher threads is one less than the number of buffer list in rot-buffs.x libgfchangelog becomes purely callback based -- upon event notification from the server (and re-ordering them if required) invoke a callback routine specified by consumer(s). A major part of the patch is also aimed at providing backward compatibility for geo-replication, which was one of the main consumer of the stream based API. Also, this patch does not\ "turn on" event notifications for all fops, just a bunch which is currently in requirement. Another pain point is that the server does not filter events before dispatching it to the clients. That load is taken up by the client itself (although it's done at the library layer rather than making it hard on the callback implementor). This needs improvement and care needs to be taken to not load the server up with expensive filtering mechanisms. Change-Id: Ibf60a432b68f2dfa60c6f9add2bcfd37a9c41395 BUG: 1170075 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9708 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* marker: fix compile time warning on buf arg.Humble Devassy Chirammal2015-03-181-2/+0
| | | | | | | | | | | | | | | | | Problem: marker-quota.c: In function 'mq_inspect_directory_xattr_task': marker-quota.c:3451:31: warning: variable 'buf' set but not used [-Wunused-but-set-variable] struct iatt buf = {0,}; Change-Id: I211378328bdb2509a5d2a186d173f7f30a670c8a BUG: 1198849 Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com> Reviewed-on: http://review.gluster.org/9928 Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Quota: Build ancestry in the lookupvmallika2015-03-183-10/+113
| | | | | | | | | | | | | | | | | | Marker can fail or can account incorrect numbers when it doesn't find a ancestry for a inode. Solution: Current build_ancestry is done only on demand in the write/create FOPs in quota enforcer. It is good to do this in the quota_lookup as well. Change-Id: I8aaf5b3e05a3ca51e7ab1eaa1b636a90f659a872 BUG: 1184885 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/9478 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* NFS-Ganesha: Volume set option for managing NFS-Ganesha exports.Meghana Madhusudhan2015-03-186-1/+151
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | A dummy translator has been introduced as a place holder for functions related to managing NFS-Ganesha exports. A volume set option is introduced to manage volume level exports. gluster vol set <volname> ganesha.enable ON/OFF 1. gluster volume set <volname> ganesha.enable ON It creates the export config file with a unique export ID. Sends a DBus signal to export this volume dynamically. 2. gluster vol set <volname> ganesha.enable OFF Unexports the specific volume. Deletes the specfic config file related to the volume. This change also removes the handling of the older keys "nfs-ganesha.enable" and "nfs-ganesha.host" Change-Id: I8d4a0b542326a6a0c8e4711600b106274d666587 BUG: 1188184 Signed-off-by: Meghana Madhusudhan <mmadhusu@redhat.com> Reviewed-on: http://review.gluster.org/9585 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* features/trash: Avoid unnecessary logging from trash_local_wipeAnoop C S2015-03-171-1/+2
| | | | | | | | | | | | | | | | | | | Even when trash translator is disabled, the following error is being logged for each unlink/truncate/ftruncate calls. [...] E [trash.c:221:trash_local_wipe] (--> ... ... ) 0-trash: invalid argument: local This change replaces GF_VALIDATE_OR_GOTO macro with simple if condition. Change-Id: I7e6754cd53ec7c2d84669b6d40d883a2d1eee41e BUG: 1132465 Signed-off-by: Anoop C S <achiraya@redhat.com> Reviewed-on: http://review.gluster.org/9909 Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* Quota/marker : Support for inode quotavmallika2015-03-176-248/+1658
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the only way to retrieve the number of files/objects in a directory or volume is to do a crawl of the entire directory/volume. This is expensive and is not scalable. The new mechanism proposes to store count of objects/files as part of an extended attribute of a directory. Each directory's extended attribute value will indicate the number of files/objects present in a tree with the directory being considered as the root of the tree. Currently file usage is accounted in marker by doing multiple FOPs like setting and getting xattrs. Doing this with STACK WIND and UNWIND can be harder to debug as involves multiple callbacks. In this code we are replacing current mechanism with syncop approach as syncop code is much simpler to follow and help us implement inode quota in an organized way. Change-Id: Ibf366fbe07037284e89a241ddaff7750fc8771b4 BUG: 1188636 Signed-off-by: vmallika <vmallika@redhat.com> Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/9567 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* Upcall: New xlator to store various states and send cbk eventsSoumya Koduri2015-03-179-1/+1705
| | | | | | | | | | | | | | | | | | | | | | | | | | Framework on the server-side, to handle certain state of the files accessed and send notifications to the clients connected. A generic and extensible framework, used to maintain states in the glusterfsd process for each of the files accessed (including the clients info doing the fops) and send notifications to the respective glusterfs clients incase of any change in that state. This patch handles "Inode Update/Invalidation" upcall event. Feature page: URL: http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure Below link has a writeup which explains the code changes done - URL: https://soumyakoduri.wordpress.com/2015/02/25/glusterfs-understanding-upcall-infrastructure-and-cache-invalidation-support/ Change-Id: Ie3d724be9a3419fcf18901a753e8ec2df2ac802f BUG: 1200262 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/9535 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Features/trash : Combined patches for trash translatorAnoop C S2015-03-165-726/+1952
| | | | | | | | | | | | | | | | | | | | | | | | | This is the combined patch set for supporting trash feature. http://www.gluster.org/community/documentation/index.php/Features/Trash Current patch includes the following features: * volume set options for enabling trash globally and exclusively for internal operations like self-heal and re-balance * volume set options for setting the eliminate path, trash directory path and maximum trashable file size. * test script for checking the functionality of the feature * brief documentation on different aspects of trash feature. Change-Id: Ic7486982dcd6e295d1eba0f4d5ee6d33bf1b4cb3 BUG: 1132465 Signed-off-by: Anoop C S <achiraya@redhat.com> Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Reviewed-on: http://review.gluster.org/8312 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* changelog: Unchecked buffer fill in gf_history_changelog_next_changeNiels de Vos2015-03-041-0/+5
| | | | | | | | | | | | | | | | | | | A gf_history_changelog_next_change() calls gf_readline() to fill a buffer without checking buffer size. The size of maxlen is not verified to be less than the lenght of buffer. This could result in the over filling of buffer of maxlen is greater than PATH_MAX. Check the size of maxlen to be less than PATH_MAX and return a fail code as needed. BUG: 1174017 Change-Id: Ic53b1a6e25af69a339bc15fb2d233dc1e457910f Reported-by: Keith Schincke <kschinck@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/9275 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* quota: quotad.socket files should be created under /var/run/glustervmallika2015-03-032-2/+2
| | | | | | | | | Change-Id: I49502a4f7516c02f7e321c16eebd748545afde07 BUG: 1197587 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/9778 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/index: Add option to track specific xattrs for xattrop64Pranith Kumar K2015-02-242-42/+147
| | | | | | | | | | | This enables trusted.ec.dirty to be tracked in index Change-Id: Ief1619110859f6f9ccee3da229f0688b73e2124b BUG: 1177601 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9602 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* feature/gfid-access: Send a named lookup before trying to create a file.Kotresh HR2015-02-162-10/+55
| | | | | | | | | | | | | | | | | | | Normally a named-lookup is done by the kernel on an entry before it issues a dentry creation fop like create/mknod etc. This will enable cluster translators like dht to maintain internal consistency like deleting a linkto file if no corresponding datafile is present etc. While handling file creation on auxiliary gfid mounts, we issue dentry creation fop without issuing a lookup. If there are stale-linkto files, creation would fail with EEXIST, however access would fail since there is no datafile. A named lookup would cleanup the linkto file allowing create to succeed. Change-Id: I2932107296adac710dd179df7d0946b8697a497a BUG: 1191413 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/9634 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* feature/changelog: Fix changelog and history scan failure.Kotresh HR2015-01-271-2/+4
| | | | | | | | | | | | | Fixes bad file descriptor issue while cleaning up scratch directory during gf_changelog_register. Change-Id: Ia6aa8d55dcc2209144b48b6583681a155d919c42 BUG: 1162057 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/9495 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* features/marker: do not call inode_path on the inode not yet linkedRaghavendra Bhat2015-01-231-7/+19
| | | | | | | | | | | | | * in readdirp callbak marker is calling inode_path on the inodes that are not yet linked to the inode table. Change-Id: I7f5db29c6a7e778272044f60f8e73c60574df3a9 BUG: 1176393 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/9320 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* quota: For a link operation, do quota_check_limit only till thevmallika2015-01-192-41/+143
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | common ancestor of src and dst file In a dht_rename, if src_cached and dst_hashed are different, then rename is split into link and unlink. We need to handle quota_link properly. We have fixed quota_rename in patch# 8940, we need to handle quota_link similarly > http://review.gluster.org/#/c/8940/ > quota: For a rename operation, do quota_check_limit only till the > common ancestor of src and dst file > Example: > set quota limit set to 1GB on / > create a file /a1/b1/file1 of 600MB > mv /a1/b1/file1 /a1/b1/file2 > This rename fails as it takes delta into account which sums up to 1.2BG. > Though we are not creating new file, we still get quota exceeded error. > So quota enforce should happen only till b1. > Similarly: > mv /a/b/c/file /a/b/x/y/file > quota enforce should happen only till dir 'b' > Change-Id: Ia1e5363da876c3d71bd424e67a8bb28b7ac1c7c1 > BUG: 1153964 > Signed-off-by: vmallika <vmallika@redhat.com> > Reviewed-on: http://review.gluster.org/8940 > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Tested-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I2c814018d17f7af1807c1d1d162d8bdcbb31e491 BUG: 1153964 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/9419 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* features/changelog: Cleanup .processing and .current directoryAravinda VK2015-01-181-0/+21
| | | | | | | | | | | | | | | | | On changelog_register cleanup .processing, .history/.processing, .current and .history/.current from the working directory. Moved glusterd_recursive_rmdir and glusterd_for_each_entry to common place(libglusterfs) and renamed as recursive_rmdir and GF_FOR_EACH_ENTRY_IN_DIR respectively BUG: 1162057 Change-Id: I1f98468a344cead039026762a805437b2f9e507b Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/9082 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* feature/changelog: Logging entry as well for explicit sync virtual xattr.Kotresh HR2015-01-093-7/+135
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is an improvement over the patch 'http://review.gluster.org/9337' to trigger explicit geo-rep sync on regular files even if entry is not present on the slave. An attempt is made to find the pargfid and if available captures CREATE along with DATA in changelog. CREATE is captured with default file permissions. Setting this virtual setxattr on directories captures MKDIR in changelog. The value of setxattr can be as follows. If value = "1" : Both CREATE and DATA is captured in changelog if pargfid is available, else on DATA is captured. value = "any other: ENOTSUP is returned. Usage: setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <file-path> NOTE: This patch supports explicit record of entries only for directories and regular files. Change-Id: Iedde8b2c8bc3b78db524050d8c866ff664811d01 BUG: 1176934 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/9370 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* avoid memory leaks in snapview-serverRaghavendra Bhat2015-01-081-0/+3
| | | | | | | | | | | | | * snapview-server in readdirp, creates the inode for entries with names "." and ".." for each readdirp operation without creating dentries leading to memleak. It should have avoided creation of inodes for those entries Change-Id: I3b2025fd10872fcc3303d0becec764ffd4e37601 BUG: 1179663 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/9404 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: change signature of syncop_(f)getxattrRavishankar N2015-01-051-1/+1
| | | | | | | | | | | | | | | | | Pass xdata dict to syncop_(f)getxattr calls. This patch [1/3] is required as a part of afr automated split-brain resolution implementation. Change-Id: I3970b3dd6daf64681a031e37f8e9afb14fb3d668 BUG: 1136769 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/9375 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/uss: Perform NULL check on @name in svc_getxattrKrutika Dhananjay2015-01-021-1/+4
| | | | | | | | | | | | | | | | | | | LISTXATTR fop is internally converted into a GETXATTR with the "name" parameter set to NULL. In svc_getxattr(), a listxattr was causing a crash because of a NULL pointer dereference on @name. FIX: Add the necessary NULL check. Change-Id: I70024d40dc0695648c6d41b423c2665d030e1232 BUG: 1178079 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/9378 Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com> Reviewed-by: Sachin Pandit <spandit@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* feature/changelog: Virtual xattr to trigger explicit sync in geo-rep.Kotresh HR2014-12-291-0/+11
| | | | | | | | | | | | | | | | | | | A virtual xattr "glusterfs.geo-rep.trigger-sync" is provided in glusterfs through changelog translator. Geo-rep triggers a explicit data sync on setting this xattr on a file. Changelog captures a DATA entry on file's gfid on setting this virtual xattr on a file. This is supported only for files. It doesn't support directories. Usage: setfattr -n glusterfs.geo-rep.trigger-sync <file-path> Change-Id: Ia689326ac2dcb31035ffbecad2c548eda4eb9245 BUG: 1176934 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/9337 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Venky Shankar <vshankar@redhat.com>
* mgmt/glusterd: Add option to enable lock tracePranith Kumar K2014-12-281-10/+15
| | | | | | | | | | | Change-Id: I24ed0f866d53e91a8323c043a38f73207cbfd7d2 BUG: 1168189 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9351 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
* quota: For a rename operation, do quota_check_limit only till thevmallika2014-12-272-54/+291
| | | | | | | | | | | | | | | | | | | | | | | | common ancestor of src and dst file Example: set quota limit set to 1GB on / create a file /a1/b1/file1 of 600MB mv /a1/b1/file1 /a1/b1/file2 This rename fails as it takes delta into account which sums up to 1.2BG. Though we are not creating new file, we still get quota exceeded error. So quota enforce should happen only till b1. Similarly: mv /a/b/c/file /a/b/x/y/file quota enforce should happen only till dir 'b' Change-Id: Ia1e5363da876c3d71bd424e67a8bb28b7ac1c7c1 BUG: 1153964 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/8940 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* features/locks: Add lk-owner checks in entrylkPranith Kumar K2014-12-262-21/+54
| | | | | | | | | | | | | For backward compatibility of entry-self-heal we need entrylks to be accepted by same lk-owner and same client. This patch introduces these changes. Change-Id: I67004cc5e657ba5ac09ceefbea823afdf06929e0 BUG: 1168189 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/9125 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/marker: log message clean upVijay Bellur2014-12-252-21/+18
| | | | | | | | | | | | | | | | | 1. Changed log messages to be more appropriate. 2. Changed loglevel of failures in fop_cbks to be recorded as TRACE. Logging of failures at higher loglevels is unessential in non-endpoint translators. 3. Removed a log message related to memory allocation failure. BUG: 1174087 Change-Id: I63c560c3bbd12706357fb3f696378c1a1e1efb44 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/8168 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>