summaryrefslogtreecommitdiffstats
path: root/libglusterfs/src/xlator.c
Commit message (Collapse)AuthorAgeFilesLines
* features/shard: Fill loc.pargfid too for named lookups on individual shardsKrutika Dhananjay2016-11-081-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/15788/ On a sharded volume when a brick is replaced while IO is going on, named lookup on individual shards as part of read/write was failing with ENOENT on the replaced brick, and as a result AFR initiated name heal in lookup callback. But since pargfid was empty (which is what this patch attempts to fix), the resolution of the shards by protocol/server used to fail and the following pattern of logs was seen: Brick-logs: [2016-11-08 07:41:49.387127] W [MSGID: 115009] [server-resolve.c:566:server_resolve] 0-rep-server: no resolution type for (null) (LOOKUP) [2016-11-08 07:41:49.387157] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null) (00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82) ==> (Invalid argument) [Invalid argument] Client-logs: [2016-11-08 07:41:27.497687] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.497755] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.498500] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.499680] E [MSGID: 133010] Also, this patch makes AFR by itself choose a non-NULL pargfid even if its ancestors fail to initialize all pargfid placeholders. Change-Id: I34b9f90d0f09766b6d87b3994d5cd7a77b622dcb BUG: 1392853 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15797 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* core : Use correct path in dlopen for socket.soAtin Mukherjee2015-12-271-1/+6
| | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/12656 This patch fixes the path for socket.so file while loading the so dynamically. Also for config.memory-accounting & config.transport voltype is changed to glusterd to fix the warning message coming from xlator_volopt_dynload Change-Id: I0f7964814586f2018d4922b23c683f4e1eb3098e BUG: 1283833 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/12656 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/12670 Tested-by: NetBSD Build System <jenkins@build.gluster.org>
* libglusterfs:Porting log messages to new frameworkMohamed Ashiq2015-09-081-4/+6
| | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/11896 Cherry picked from 038dfe57cf0c5944b0392332dbf5a00bb1208150 >Change-Id: I8625b7dc8941720cc7a864b8fddbcc7b4c485fcd >BUG: 1252836 >Signed-off-by: Mohamed Ashiq <mliyazud@redhat.com> >Reviewed-on: http://review.gluster.org/11896 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Tested-by: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com> >Reviewed-by: Niels de Vos <ndevos@redhat.com> Change-Id: I8625b7dc8941720cc7a864b8fddbcc7b4c485fcd BUG: 1258769 Signed-off-by: Mohamed Ashiq <mliyazud@redhat.com> Reviewed-on: http://review.gluster.org/12076 Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* defaults,globals,iobuf,latency,logging,options,xlator/libglusterfs : porting ↵Mohamed Ashiq2015-07-131-40/+45
| | | | | | | | | | | | | | | | | | | | | | to a new logging framework Backport of http://review.gluster.org/10826 Cherry picked from ef374774b8727b2f76915a1f88b26c83c77dde43 >Change-Id: If6a55186cddc3d1c4d22e3d56b45358b84feeb49 >BUG: 1194640 >Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com> >Reviewed-on: http://review.gluster.org/10826 >Tested-by: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Change-Id: If6a55186cddc3d1c4d22e3d56b45358b84feeb49 BUG: 1217722 Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com> Reviewed-on: http://review.gluster.org/11405 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* core: use reference counting for mem_acct structuresJeff Darcy2015-05-091-19/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When freeing memory, our memory-accounting code expects to be able to dereference from the (previously) allocated block to its owning translator. However, as we have already found once in option validation and twice in logging, that translator might itself have been freed and the dereference attempt causes on of our daemons to crash with SIGSEGV. This patch attempts to fix that as follows: * We no longer embed a struct mem_acct directly in a struct xlator, but instead allocate it separately. * Allocated memory blocks now contain a pointer to the mem_acct instead of the xlator. * The mem_acct structure contains a reference count, manipulated in both the normal and translator allocate/free code using atomic increments and decrements. * Because it's now a separate structure, we can defer freeing the mem_acct until its reference count reaches zero (either way). * Some unit tests were disabled, because they embedded their own copies of the implementation for what they were supposedly testing. Life's too short to spend time fixing tests that seem designed to impede progress by requiring a certain implementation as well as behavior. Change-Id: Id929b11387927136f78626901729296b6c0d0fd7 BUG: 1219026 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/10417 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/10723 Tested-by: NetBSD Build System
* Avoid conflict between contrib/uuid and system uuidEmmanuel Dreyfus2015-04-041-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | glusterfs relies on Linux uuid implementation, which API is incompatible with most other systems's uuid. As a result, libglusterfs has to embed contrib/uuid, which is the Linux implementation, on non Linux systems. This implementation is incompatible with systtem's built in, but the symbols have the same names. Usually this is not a problem because when we link with -lglusterfs, libc's symbols are trumped. However there is a problem when a program not linked with -lglusterfs will dlopen() glusterfs component. In such a case, libc's uuid implementation is already loaded in the calling program, and it will be used instead of libglusterfs's implementation, causing crashes. A possible workaround is to use pre-load libglusterfs in the calling program (using LD_PRELOAD on NetBSD for instance), but such a mechanism is not portable, nor is it flexible. A much better approach is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any possible conflict. This is what this change attempts. BUG: 1206587 Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10017 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* core : free up mem_acct.rec in xlator_destroyAtin Mukherjee2015-03-301-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: We've observed that glusterd was OOM killed after some minutes when volume set command was run in a loop. Analysis: Initially the suspection was in glusterd code, but a deep dive into the codebase revealed that while validating all the options as part of graph reconfiguration at the time of freeing up the xlator object its one of the member mem_acct is left over which causes memory leak. Solution: Free up xlator's mem_acct.rec in xlator_destroy () Change-Id: Ie9e7267e1ac4ab7b8af6e4d7c6660dfe99b4d641 BUG: 1201203 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/9862 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* features/bit-rot: Implementation of bit-rot xlatorVenky Shankar2015-03-241-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the "Signer" -- responsible for signing files with their checksums upon last file descriptor close (last release()). The event notification facility provided by the changelog xlator is made use of. Moreover, checksums are as of now SHA256 hash of the object data and is the only available hash at this point of time. Therefore, there is no special "what hash to use" type check, although it's does not take much to add various hashing algorithms to sign objects with. Signatures are stored in extended attributes of the objects along with the the type of hashing used to calculate the signature. This makes thing future proof when other hash types are added. The signature infrastructure is provided by bitrot stub: a little piece of code that sits over the POSIX xlator providing interfaces to "get or set" objects signature and it's staleness. Since objects are signed upon receiving release() notification, pre-existing data which are "never" modified would never be signed. To counter this, an initial crawler thread is spawned The crawler scans the entire brick for objects that are unsigned or "missed" signing due to the server going offline (node reboots, crashes, etc..) and triggers an explicit sign. This would also sign objects when bit-rot is enabled for a volume and/or after upgrade. Change-Id: I1d9a98bee6cad1c39c35c53c8fb0fc4bad2bf67b BUG: 1170075 Original-Author: Raghavendra Bhat <raghavendra@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9711 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Quota/marker : Support for inode quotavmallika2015-03-171-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the only way to retrieve the number of files/objects in a directory or volume is to do a crawl of the entire directory/volume. This is expensive and is not scalable. The new mechanism proposes to store count of objects/files as part of an extended attribute of a directory. Each directory's extended attribute value will indicate the number of files/objects present in a tree with the directory being considered as the root of the tree. Currently file usage is accounted in marker by doing multiple FOPs like setting and getting xattrs. Doing this with STACK WIND and UNWIND can be harder to debug as involves multiple callbacks. In this code we are replacing current mechanism with syncop approach as syncop code is much simpler to follow and help us implement inode quota in an organized way. Change-Id: Ibf366fbe07037284e89a241ddaff7750fc8771b4 BUG: 1188636 Signed-off-by: vmallika <vmallika@redhat.com> Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/9567 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* every/where: add GF_FOP_IPC for inter-translator communicationJeff Darcy2015-03-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several features - e.g. encryption, erasure codes, or NSR - involve multiple cooperating translators which sometimes need a "private" means of communication amongst themselves. Historically we've used virtual or synthetic xattrs, but that's not very elegant and clutters up the getxattr/setxattr path which must also handle real xattr requests. This new fop should address that. The only argument is an int32_t "op" which should be recognized by the target translator. It is recommended that translators using these feature follow some convention regarding the ops that they define, to avoid conflicts. Using a hash of the target translator's type string as a base for a series of ops would probably be a good start. Any other information can be passed in both directions using xdata. The default behavior for this fop, as with any other, is to pass through to FIRST_CHILD. That makes use of this fop "transparent" to other translators that were written before it existed, but it also means that it only really works with pass-through translators. If a routing translator (such as DHT) or a fan-out translator (such as AFR) is involved, the IPC might not reach its intended destination unless those translators are modified to forward IPC fops along all paths. If an IPC gets all the way to storage/posix it is considered an error, much like an uncaught exception. We don't actually *do* anything in that case, but we do log it send back an EOPNOTSUPP error. This makes the "unrecognized opcode" condition distinguishable from the "no IPC support" condition (which would yield an RPC error instead) so clients can probe for the presence of a handler for their own favorite opcode and either use that or use old-school xattrs depending on the result. BUG: 1158628 Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: I84af1b17babe5b30ec03ecf027ae37d09b873968 Reviewed-on: http://review.gluster.org/8812 Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* xlator: avoiding possibility of a crash if (xl->ctx) is NULL.Humble Devassy Chirammal2015-03-151-0/+3
| | | | | | | | | | Change-Id: I41acd9970bef04bb16cd4d8532a84a95d5fb642a BUG: 1199003 Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>. Reviewed-on: http://review.gluster.org/9810 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Use common loc-touchup in fuse/server/gfapiPranith Kumar K2015-03-081-0/+33
| | | | | | | | | | | Change-Id: Id41fb29480bb6d22c34469339163da05b98c1a98 BUG: 1115907 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8226 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: Add functions for xlator and graph cleanup.Poornima G2015-03-021-35/+80
| | | | | | | | | | Change-Id: If341e3c0a559aa5bbca9c1263a241c6592c59706 BUG: 1093594 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/9696 Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* protocol/server: reflect lru limit in inode table alsoRaghavendra Bhat2014-06-131-0/+19
| | | | | | | | | | | | | | | | | | | | | Upon reconfigure, when lru limit of the inode table is changed, the new value was just saved in the private structure of the protocol/server xlator and the inode table used to have the older values still. A brick start was required for the changes to get reflected. To handle it, traverse through the xlator tree and check whether a xlator is a bound_xl or not (if it is a bound_xl it would have its itable pointer set). If a xlator is a bound_xl, then get the inode table of that bound_xl and set its lru limit to new value given via cli. Also prune the inode table so that extra inodes are purged from the inode table. Change-Id: I6909be028c116adaa1d1a5108470015b5fc6f09d BUG: 1103756 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/7957 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* Core: Fix issues reported by CppcheckLalatendu Mohanty2014-06-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixed in this patch: [glusterfs/extras/geo-rep/gsync-sync-gfid.c:105]: (error) Resource leak: fp [glusterfs/libglusterfs/src/xlator.c:651]: (error) Uninitialized variable: gfid [glusterfs/libglusterfs/src/xlator.c:652]: (error) Uninitialized variable: gfid [glusterfs/xlators/cluster/ha/src/ha.c:2699]: (error) Possible null pointer dereference: priv [glusterfs/xlators/features/changelog/src/changelog.c:1464]: (error) Possible null pointer dereference: priv [glusterfs/xlators/mgmt/glusterd/src/glusterd-mgmt-handler.c:865]: (error) Possible null pointer dereference: ctx [glusterfs/xlators/mgmt/glusterd/src/glusterd-mgmt-handler.c:194]: (error) Possible null pointer dereference: ctx [glusterfs/xlators/mgmt/glusterd/src/glusterd-syncop.c:1408]: (error) Possible null pointer dereference: this [glusterfs/xlators/mgmt/glusterd/src/glusterd-utils.c:7002]: (error) Possible null pointer dereference: path_tokens Fixed in 3.4 and 3.5 branch (http://review.gluster.org/#/c/7583/ , http://review.gluster.org/#/c/7605/ will be backported in a separate patch) [glusterfs/xlators/mount/fuse/src/fuse-bridge.c:4688]: (error) Uninitialized variable: finh [glusterfs/xlators/mount/fuse/src/fuse-bridge.c:3081]: (error) Possible null pointer dereference: state [glusterfs/xlators/cluster/dht/src/dht-rebalance.c:1719]: (error) Possible null pointer dereference: ctx [glusterfs/xlators/cluster/stripe/src/stripe.c:4940]: (error) Possible null pointer dereference: local [glusterfs/xlators/mgmt/glusterd/src/glusterd-replace-brick.c:915]: (error) Resource leak: file [glusterfs/xlators/mgmt/glusterd/src/glusterd-replace-brick.c:999]: (error) Resource leak: file [glusterfs/xlators/mgmt/glusterd/src/glusterd-sm.c:248]: (error) Possible null pointer dereference: new_ev_ctx [glusterfs/xlators/mgmt/glusterd/src/glusterd-utils.c:5297]: (error) Possible null pointer dereference: this [glusterfs/xlators/mgmt/glusterd/src/glusterd-utils.c:6273]: (error) Possible null pointer dereference: this [glusterfs/xlators/performance/quick-read/src/quick-read.c:586]: (error) Possible null pointer dereference: iobuf [glusterfs/xlators/nfs/server/src/nfs-common.c:89]: (error) Dangerous usage of 'volname' (strncpy doesn't always null-terminate it). False positives [glusterfs/geo-replication/src/gsyncd.c:99]: (error) Memory leak: str [glusterfs/geo-replication/src/gsyncd.c:395]: (error) Memory leak: argv [glusterfs/xlators/nfs/server/src/nlm4.c:1199]: (error) Possible null pointer dereference: fde [glusterfs/xlators/mgmt/glusterd/src/glusterd-geo-rep.c:1659]: (error) Possible null pointer dereference: command [glusterfs/xlators/mgmt/glusterd/src/glusterd-utils.c:7001]: (error) Possible null pointer dereference: path_tokens Insignificant/Don't care [glusterfs/contrib/uuid/gen_uuid.c:369]: (warning) %ld in format string (no. 2) requires 'long *' but the argument type is 'unsigned long *'. [glusterfs/contrib/uuid/gen_uuid.c:369]: (warning) %ld in format string (no. 3) requires 'long *' but the argument type is 'unsigned long *'. [glusterfs/extras/test/test-ffop.c:27]: (error) Buffer overrun possible for long command line arguments. [glusterfs/xlators/cluster/afr/src/afr-self-heal-common.c:138]: (error) Possible null pointer dereference: __ptr [glusterfs/xlators/cluster/afr/src/afr-self-heal-common.c:140]: (error) Possible null pointer dereference: __ptr [glusterfs/xlators/cluster/afr/src/afr-self-heal-common.c:331]: (error) Possible null pointer dereference: __ptr Change-Id: I7696ed1a2a9553b79f9714e10210a8d563a5abd8 BUG: 1091677 Signed-off-by: Lalatendu Mohanty <lmohanty@redhat.com> Reviewed-on: http://review.gluster.org/7693 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: refactorAnand Avati2014-03-221-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | - Remove client side self-healing completely (opendir, openfd, lookup) - Re-work readdir-failover to work reliably in case of NFS - Remove unused/dead lock recovery code - Consistently use xdata in both calls and callbacks in all FOPs - Per-inode event generation, used to force inode ctx refresh - Implement dirty flag support (in place of pending counts) - Eliminate inode ctx structure, use read subvol bits + event_generation - Implement inode ctx refreshing based on event generation - Provide backward compatibility in transactions - remove unused variables and functions - make code more consistent in style and pattern - regularize and clean up inode-write transaction code - regularize and clean up dir-write transaction code - regularize and clean up common FOPs - reorganize transaction framework code - skip setting xattrs in pending dict if nothing is pending - re-write self-healing code using syncops - re-write simpler self-heal-daemon Change-Id: I1e4080c9796c8a2815c2dab4be3073f389d614a8 BUG: 1021686 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6010 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs..xlator.c: Close a library handle.Christopher R. Hertel2014-02-081-0/+3
| | | | | | | | | | | | | | Simple fix to ensure that a library handle is closed if it is not actually used. BUG: 789278 CID: 1124783 Change-Id: Ia3e734c46e1ad8c97cb8cc7f1a5616606bfbc550 Signed-off-by: Christopher R. Hertel <crh@redhat.com> Reviewed-on: http://review.gluster.org/6933 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/gfid-access: fix lookup on .gfid/<parent>/bnameVenky Shankar2014-01-271-0/+39
| | | | | | | | | | | | | | In gfid translator, lookup was not handling the case when the lookup is sent on .gfid/<parent>/bname. In this case, we flip with fake inode of the parent with the real inode in loc and send it downwards. Change-Id: I639ff1dce10ffc045da419e333d455e208b6a0f0 BUG: 1057881 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/6795 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/quota: Improvements to quotaRaghavendra G2013-11-261-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Two stages of quota enforcement is done: Soft and hard quota Upon reaching soft quota limit on the directory it logs/alerts in the quota daemon log (ie DEFAULT_LOG_DIR/quotad.log) and no more writes allowed after hard quota limit. After reaching the soft-limit the daemon alerts the user/admin repeatively for every 'alert-time', which is configurable. * Quota enforcer is moved to server-side. It takes care of enforcing quota. Since enforcer doesn't have the cluster view, it relies on another service called quota-aggregator. Aggregator, on query can return the size of a directory based on the cluster view. Enforcer is always loaded in the server graph and is by passed if the feature is not enabled. Options specific to enforcer: server-quota - Specifies whether the feature is on/off. It is used to by pass the quota if turned off. deem-statfs - If set to on, it takes quota limits into consideration while estimating fs size. (df command). The algorithm followed is, i. Adjust statvfs based on limit configured on root. ii. If limit is set on the inode passed, use size/limits on that inode to populate statvfs. Otherwise, use size/limits configured on root. iii. Upon statvfs, update the ctx->size on the inode. iv. Don't let DHT aggregate, instead take the maximum of the usages from the subvols of the DHT, since each of it contains the complete information. Enforcer also makes use of gfid-to-path conversion functionality to work correctly when a client like nfs predominently relies on nameless lookups. * Quota Aggregator acts as a thin client to provide cluster view Its a lightweight *gluster client* process with no mount point, started upon enabling quota or restarting the volume. This is a single process run on each brick, which can answer queries on all volumes in the cluster. Its volfile stored in GLUSTERD_DEFAULT_WORKING_DIR/quotad/quotad.vol. Credits: Raghavendra Bhat <rabhat@redhat.com> Varun Shastry <vshastry@redhat.com> Shishir Gowda <sgowda@redhat.com> Kruthika Dhananjay <kdhananj@redhat.com> Brian Foster <bfoster@redhat.com> Krishnan Parthasarathi <kparthas@redhat.com> Change-Id: Id1cb25b414951da34c665a55f77385d482e0f9de BUG: 969461 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/5952 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* bd_map: Remove bd_map xlatorM. Mohan Kumar2013-11-131-23/+0
| | | | | | | | | | | Remove bd_map xlator and CLI related changes. Change-Id: If7086205df1907127c1a1fa4ba603f1c48421d09 BUG: 1028672 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/5747 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterfs: zerofill supportM. Mohan Kumar2013-11-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in the specified range. This fop will be useful when a whole file needs to be initialized with zero (could be useful for zero filled VM disk image provisioning or during scrubbing of VM disk images). Client/application can issue this FOP for zeroing out. Gluster server will zero out required range of bytes ie server offloaded zeroing. In the absence of this fop, client/application has to repetitively issue write (zero) fop to the server, which is very inefficient method because of the overheads involved in RPC calls and acknowledgements. WRITESAME is a SCSI T10 command that takes a block of data as input and writes the same data to other blocks and this write is handled completely within the storage and hence is known as offload . Linux ,now has support for SCSI WRITESAME command which is exposed to the user in the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to implement this fop. Thus zeroing out operations can be completely offloaded to the storage device , making it highly efficient. The fop takes two arguments offset and size. It zeroes out 'size' number of bytes in an opened file starting from 'offset' position. This patch adds zerofill support to the following areas: - libglusterfs - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi Client applications can exloit this fop by using glfs_zerofill introduced in libgfapi.FUSE support to this fop has not been added as there is no system call for this fop. Changes from previous version 3: * Removed redundant memory failure log messages Changes from previous version 2: * Rebased and fixed build error Changes from previous version 1: * Rebased for latest master TODO : * Add zerofill support to trace xlator * Expose zerofill capability as part of gluster volume info Here is a performance comparison of server offloaded zeofill vs zeroing out using repeated writes. [root@llmvm02 remote]# time ./offloaded aakash-test log 20 real 3m34.155s user 0m0.018s sys 0m0.040s [root@llmvm02 remote]# time ./manually aakash-test log 20 real 4m23.043s user 0m2.197s sys 0m14.457s [root@llmvm02 remote]# time ./offloaded aakash-test log 25; real 4m28.363s user 0m0.021s sys 0m0.025s [root@llmvm02 remote]# time ./manually aakash-test log 25 real 5m34.278s user 0m2.957s sys 0m18.808s The argument log is a file which we want to set for logging purpose and the third argument is size in GB . As we can see there is a performance improvement of around 20% with this fop. Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18 BUG: 1028673 Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com> Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/5327 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterfsd, libgfapi: destroy the temporary graphs constructed for comparisonRaghavendra Bhat2013-10-031-16/+27
| | | | | | | | | | | | | | * The new and the oldgraphs which have been constructed whenever there is a volfile change (either reconfigure of the existing graph or creating a new graph) for comparison should be freed. Otherwise frequent graph changes will lead to huge memory leak Change-Id: I4faddb1aa9393b34cd2de6732e537a60f600026a BUG: 948178 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/5388 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Handle REPLICATE_TRASH_DIR from old bricksPranith Kumar K2013-07-261-0/+11
| | | | | | | | | Change-Id: Ib99f79d3fa607c818dbc62006516480f598d8add BUG: 886998 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4640 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterfs: discard (hole punch) supportBrian Foster2013-06-131-0/+1
| | | | | | | | | | | | | | | | Add support for the DISCARD file operation. Discard punches a hole in a file in the provided range. Block de-allocation is implemented via fallocate() (as requested via fuse and passed on to the brick fs) but a separate fop is created within gluster to emphasize the fact that discard changes file data (the discarded region is replaced with zeroes) and must invalidate caches where appropriate. BUG: 963678 Change-Id: I34633a0bfff2187afeab4292a15f3cc9adf261af Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/5090 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gluster: add fallocate fop supportBrian Foster2013-06-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement support for the fallocate file operation. fallocate allocates blocks for a particular inode such that future writes to the associated region of the file are guaranteed not to fail with ENOSPC. This patch adds fallocate support to the following areas: - libglusterfs - mount/fuse - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi BUG: 949242 Change-Id: Ice8e61351f9d6115c5df68768bc844abbf0ce8bd Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/4969 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Avoid order mismatch in blocking entrylksPranith Kumar K2013-06-051-3/+13
| | | | | | | | | | | | | | | | | | | | | | | Problem: When taking blocking entrylks, afr orders the entrylks based on uuid_compare of gfids of parent dirs, if they are equal then it orders them based on the basenames. While this approach works fine, the implementation assumes loc->gfids to be populated at the time of the comparison, but loc may have gfid in loc->inode->gfid instead of loc->gfid which was leading to order mismatches and dead-locks. Fix: Implemented loc_gfid which gives gfid by checking both loc->gfid, loc->inode->gfid. Used this for ordering the blocking entrylks. Change-Id: Ib0db36bbaf0df09fa87c3c3bb6a834db74fc2154 BUG: 965987 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5062 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd-volgen: Improve volume op-versions calculationKaushal M2013-05-311-24/+5
| | | | | | | | | | | | | | | | | | Volume op-versions calculations now take into account if an option, a. enables/disables an xlator, or b. is a boolean option. This prevents op-versions from being updated when a feature is disabled. Also, correctly close the dynamically loaded xlators in xlator_volopt_dynload() and prevent leaks. Change-Id: I895ddeeec6f6a33e509325f0ce6f01b7aad3cf5c BUG: 954256 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/4952 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* protocol/client: Print valid loc identifiersPranith Kumar K2013-04-031-1/+15
| | | | | | | | | | Change-Id: I45f91105862a2484b8906a7a63b98ab4aaf80d05 BUG: 924643 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/4683 Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* dht: make nufa/switch call dht's init/finiJeff Darcy2013-04-031-12/+15
| | | | | | | | | | | | | These functions keep changing as new functionality is added, so copying and pasting the code is not a good solution. This way ensures that all fields get initialized properly no matter how much new stuff we throw in. Change-Id: I9e9b043d2d305d31e80cf5689465555b70312756 BUG: 924488 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4710 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: add dispatch table for init/finiJeff Darcy2013-04-031-10/+22
| | | | | | | | | | | | | This adds a layer of indirection so that derivative translators such as NUFA and switch can refer to the parent's init/fini (in both cases DHT's) without having to create stub functions. Change-Id: I1af1fea70a9ddd2aa20485af7ae65f9660f19dd6 BUG: 924490 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/4709 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterfsd: dump the in-memory graph rather than volfileAnand Avati2013-03-241-0/+18
| | | | | | | | | | | | | | | | | | | | | Currently we have been printing in the logfile, the volfile verbatim as received from the server. However we perform pre-processing on the graph we receive from the server, like adding ACL translator, applying --xlator-option cli params, etc. So print the serialized in-memory graph as the "volfile" in the log. This can be very handy to double check if certain --xlator-option param actually got applied or not, and in general is showing a "truer" representation of the real graph actually used. Change-Id: I0221dc56e21111b48a1ee3e5fe17a5ef820dc0c6 BUG: 924504 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4708 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* rpc: before freeing the volume options object, delete it from the listRaghavendra Bhat2013-03-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | * Suppose there is an xlator option which is considered by the xlator only if the source was built with debug mode enabled (the only example in the current code base is run-with-valgrind option for glusterd), then giving that option would make the process crash if the source was not built with debug mode enabled. Reason: In rpc, after getting the options symbol dynamically, it was stored in the newly allocated volume options structure and the structure's list head was added to the xlator's volume_options list. But while freeing the structure the list was not deleted. Thus when the list was traversed, already freed structure was accessed leading to segfault. Change-Id: I3e9e51dd2099e34b206199eae7ba44d9d88a86ad BUG: 922877 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/4687 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* core: fixes for gcc's '-pedantic' flag buildAvra Sengupta2013-01-211-5/+5
| | | | | | | | | | | | | * warnings on 'void *' arguments * warnings on empty initializations * warnings on empty array (array[0]) Change-Id: Iae440f54cbd59580eb69f3ecaed5a9926c0edf95 BUG: 875913 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/4219 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* BD Backend: CLI commands to create/delete imageM. Mohan Kumar2012-11-291-0/+22
| | | | | | | | | | | | | | | | | Cli commands added to create/delete a LV device. The following command creates lv in a given vg. $ gluster bd create <volname>:<vgname>/<lvname> <size> The following command deletes lv in a given vg. $ gluster bd delete <volname>:<vgname>/<lvname> BUG: 805138 Change-Id: Ie4e100eca14e2ee32cf2bb4dd064b17230d673bf Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/3718 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* core: reduce the usage of global variablesAmar Tumballi2012-08-031-3/+3
| | | | | | | | | | | | | | | | | | | | | | | * move all the 'logging' related global variables into ctx * make gf_fop_list a 'const' global array, hence no init(), no edits. * make sure ctx is allocated without any dependancy on memory-accounting infrastructure, so it can be the first one to get allocated * globals_init() should happen with ctx as argument not yet fixed below in this patchset: * anything with 'THIS' related globals * anything related to compat_errno related globals as its one time init'd and not changed later on. * statedump related globals Change-Id: Iab8fc30d4bfdbded6741d66ff1ed670fdc7b7ad2 Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 764890 Reviewed-on: http://review.gluster.com/3767 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterfs_ctx_t: un-globalize the filesystem contextAnand Avati2012-07-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | So far there has been a global glusterfs_ctx_t object which represents the running instance of the filesystem (client or server). It contains the various graphs, connection to the management daemon over which new graphs are obtained, calls stacks issued on this filesystem, and a bunch of such things. With the introduction of libgfapi, it is no more true that there will be only one filesystem context in a process. Applications can be written to use libgfapi and obtain serveral instances of different filesystems/volumes in the same process. This involves messy untangling of assumptions inside libglusterfs that there would only be one global glusterfs_ctx_t and offload that assumption to glusterfsd/ and cli/ (where it is true). Change-Id: Ifd7d1259428c26076140a5764a2dc7361694139c BUG: 839950 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.com/3678 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* remove useless if-before-free (and free-like) functionsJim Meyering2012-07-131-8/+4
| | | | | | | | | | | | See comments in http://bugzilla.redhat.com/839925 for the code to perform this change. Signed-off-by: Jim Meyering <meyering@redhat.com> BUG: 839925 Change-Id: I10e4ecff16c3749fe17c2831c516737e08a3205a Reviewed-on: http://review.gluster.com/3661 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: coverity fixes (mostly resource leak fixes)Amar Tumballi2012-06-051-0/+3
| | | | | | | | | | | currently working on obvious resource leak reports in coverity Change-Id: I261f4c578987b16da399ab5a504ad0fda0b176b1 Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 789278 Reviewed-on: http://review.gluster.com/3265 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* license: dual license under GPLV2 and LGPLV3+Kaleb KEITHLEY2012-05-101-14/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note that the license was not changed in any of the following: .../argp-standalone/... .../booster/... .../cli/... .../contrib/... .../extras/... .../glusterfsd/... .../glusterfs-hadoop/... .../mod_clusterfs/... .../scheduler/... .../swift/... The license was not changed in any of the non-building xlators. The license was not changed in any of the xlators that seemed — to me — to be clearly server-side only, e.g. protocol/server Note too that copyright was changed along with the license; I did not change the copyright in files where the license did not change. If you find any errors or ommissions please don't hesitate to let me know. The complete list of files with the license change is: libglusterfs/src/byte-order.h libglusterfs/src/call-stub.c libglusterfs/src/call-stub.h libglusterfs/src/checksum.c libglusterfs/src/checksum.h libglusterfs/src/circ-buff.c libglusterfs/src/circ-buff.h libglusterfs/src/common-utils.c libglusterfs/src/common-utils.h libglusterfs/src/compat-errno.c libglusterfs/src/compat-errno.h libglusterfs/src/compat.c libglusterfs/src/compat.h libglusterfs/src/daemon.c libglusterfs/src/daemon.h libglusterfs/src/defaults.c libglusterfs/src/defaults.h libglusterfs/src/dict.c libglusterfs/src/dict.h libglusterfs/src/event-history.c libglusterfs/src/event-history.h libglusterfs/src/event.c libglusterfs/src/event.h libglusterfs/src/fd-lk.c libglusterfs/src/fd-lk.h libglusterfs/src/fd.c libglusterfs/src/fd.h libglusterfs/src/gf-dirent.c libglusterfs/src/gf-dirent.h libglusterfs/src/globals.c libglusterfs/src/globals.h libglusterfs/src/glusterfs.h libglusterfs/src/graph-print.c libglusterfs/src/graph-utils.h libglusterfs/src/graph.c libglusterfs/src/hashfn.c libglusterfs/src/hashfn.h libglusterfs/src/iatt.h libglusterfs/src/inode.c libglusterfs/src/inode.h libglusterfs/src/iobuf.c libglusterfs/src/iobuf.h libglusterfs/src/latency.c libglusterfs/src/latency.h libglusterfs/src/list.h libglusterfs/src/lkowner.h libglusterfs/src/locking.h libglusterfs/src/logging.c libglusterfs/src/logging.h libglusterfs/src/mem-pool.c libglusterfs/src/mem-pool.h libglusterfs/src/mem-types.h libglusterfs/src/options.c libglusterfs/src/options.h libglusterfs/src/rbthash.c libglusterfs/src/rbthash.h libglusterfs/src/run.c libglusterfs/src/run.h libglusterfs/src/scheduler.c libglusterfs/src/scheduler.h libglusterfs/src/stack.c libglusterfs/src/stack.h libglusterfs/src/statedump.c libglusterfs/src/statedump.h libglusterfs/src/syncop.c libglusterfs/src/syncop.h libglusterfs/src/syscall.c libglusterfs/src/syscall.h libglusterfs/src/timer.c libglusterfs/src/timer.h libglusterfs/src/trie.c libglusterfs/src/trie.h libglusterfs/src/xlator.c libglusterfs/src/xlator.h libglusterfsclient/src/libglusterfsclient-dentry.c libglusterfsclient/src/libglusterfsclient-internals.h libglusterfsclient/src/libglusterfsclient.c libglusterfsclient/src/libglusterfsclient.h rpc/rpc-lib/src/auth-glusterfs.c rpc/rpc-lib/src/auth-null.c rpc/rpc-lib/src/auth-unix.c rpc/rpc-lib/src/protocol-common.h rpc/rpc-lib/src/rpc-clnt.c rpc/rpc-lib/src/rpc-clnt.h rpc/rpc-lib/src/rpc-transport.c rpc/rpc-lib/src/rpc-transport.h rpc/rpc-lib/src/rpcsvc-auth.c rpc/rpc-lib/src/rpcsvc-common.h rpc/rpc-lib/src/rpcsvc.c rpc/rpc-lib/src/rpcsvc.h rpc/rpc-lib/src/xdr-common.h rpc/rpc-lib/src/xdr-rpc.c rpc/rpc-lib/src/xdr-rpc.h rpc/rpc-lib/src/xdr-rpcclnt.c rpc/rpc-lib/src/xdr-rpcclnt.h rpc/rpc-transport/rdma/src/name.c rpc/rpc-transport/rdma/src/name.h rpc/rpc-transport/rdma/src/rdma.c rpc/rpc-transport/rdma/src/rdma.h rpc/rpc-transport/socket/src/name.c rpc/rpc-transport/socket/src/name.h rpc/rpc-transport/socket/src/socket.c rpc/rpc-transport/socket/src/socket.h xlators/cluster/afr/src/afr-common.c xlators/cluster/afr/src/afr-dir-read.c xlators/cluster/afr/src/afr-dir-read.h xlators/cluster/afr/src/afr-dir-write.c xlators/cluster/afr/src/afr-dir-write.h xlators/cluster/afr/src/afr-inode-read.c xlators/cluster/afr/src/afr-inode-read.h xlators/cluster/afr/src/afr-inode-write.c xlators/cluster/afr/src/afr-inode-write.h xlators/cluster/afr/src/afr-lk-common.c xlators/cluster/afr/src/afr-mem-types.h xlators/cluster/afr/src/afr-open.c xlators/cluster/afr/src/afr-self-heal-algorithm.c xlators/cluster/afr/src/afr-self-heal-algorithm.h xlators/cluster/afr/src/afr-self-heal-common.c xlators/cluster/afr/src/afr-self-heal-common.h xlators/cluster/afr/src/afr-self-heal-data.c xlators/cluster/afr/src/afr-self-heal-entry.c xlators/cluster/afr/src/afr-self-heal-metadata.c xlators/cluster/afr/src/afr-self-heal.h xlators/cluster/afr/src/afr-self-heald.c xlators/cluster/afr/src/afr-self-heald.h xlators/cluster/afr/src/afr-transaction.c xlators/cluster/afr/src/afr-transaction.h xlators/cluster/afr/src/afr.c xlators/cluster/afr/src/afr.h xlators/cluster/afr/src/pump.c xlators/cluster/afr/src/pump.h xlators/cluster/dht/src/dht-common.c xlators/cluster/dht/src/dht-common.h xlators/cluster/dht/src/dht-diskusage.c xlators/cluster/dht/src/dht-hashfn.c xlators/cluster/dht/src/dht-helper.c xlators/cluster/dht/src/dht-inode-read.c xlators/cluster/dht/src/dht-inode-write.c xlators/cluster/dht/src/dht-layout.c xlators/cluster/dht/src/dht-linkfile.c xlators/cluster/dht/src/dht-mem-types.h xlators/cluster/dht/src/dht-rebalance.c xlators/cluster/dht/src/dht-rename.c xlators/cluster/dht/src/dht-selfheal.c xlators/cluster/dht/src/dht.c xlators/cluster/dht/src/nufa.c xlators/cluster/dht/src/switch.c xlators/cluster/stripe/src/stripe-helpers.c xlators/cluster/stripe/src/stripe-mem-types.h xlators/cluster/stripe/src/stripe.c xlators/cluster/stripe/src/stripe.h xlators/features/index/src/index-mem-types.h ¹ xlators/features/index/src/index.c ¹ xlators/features/index/src/index.h ¹ xlators/performance/io-cache/src/io-cache.c xlators/performance/io-cache/src/io-cache.h xlators/performance/io-cache/src/ioc-inode.c xlators/performance/io-cache/src/ioc-mem-types.h xlators/performance/io-cache/src/page.c xlators/performance/io-threads/src/io-threads.c xlators/performance/io-threads/src/io-threads.h xlators/performance/io-threads/src/iot-mem-types.h xlators/performance/md-cache/src/md-cache-mem-types.h xlators/performance/md-cache/src/md-cache.c xlators/performance/quick-read/src/quick-read-mem-types.h xlators/performance/quick-read/src/quick-read.c xlators/performance/quick-read/src/quick-read.h xlators/performance/read-ahead/src/page.c xlators/performance/read-ahead/src/read-ahead-mem-types.h xlators/performance/read-ahead/src/read-ahead.c xlators/performance/read-ahead/src/read-ahead.h xlators/performance/symlink-cache/src/symlink-cache.c xlators/performance/write-behind/src/write-behind-mem-types.h xlators/performance/write-behind/src/write-behind.c xlators/protocol/auth/addr/src/addr.c ¹ xlators/protocol/auth/login/src/login.c ¹ xlators/protocol/client/src/client-callback.c xlators/protocol/client/src/client-handshake.c xlators/protocol/client/src/client-helpers.c xlators/protocol/client/src/client-lk.c xlators/protocol/client/src/client-mem-types.h xlators/protocol/client/src/client.c xlators/protocol/client/src/client.h xlators/protocol/client/src/client3_1-fops.c ¹ Copyright only, license reverted to original Change-Id: If560e826c61b6b26f8b9af7bed6e4bcbaeba31a8 BUG: 820551 Signed-off-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.com/3304 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Resolve: Assign correct path while resolvingPranith Kumar K2012-05-081-0/+36
| | | | | | | | | Change-Id: Ia17ff38a60225dd2e9115aaa298bed42f9e43f56 BUG: 812277 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3248 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Make option code quieter to make logs more readable.Jeff Darcy2012-04-231-5/+5
| | | | | | | | | | | | | | A lot of this stuff is extremely uninformative, such as messages about default values that can be read from the code. Having them in the logs makes more important messages harder to find, so they're down to GF_DEBUG_TRACE now. Change-Id: I18921b3deac7a01ad94c50670b90342d1130837b Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.com/3198 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* libglusterfs: loc_copy should set name only if src->name is setPranith Kumar K2012-04-121-1/+2
| | | | | | | | | Change-Id: If06d97dd1ff1293730e0cd4933f6ab3283f0475e BUG: 788051 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3137 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/marker: GFID file handle based backend related fixesv3.3.0qa26Raghavendra G2012-03-071-0/+1
| | | | | | | | | | | | * make sure loc->gfid is filled. Change-Id: I5304e650d9fc181ee1f3b65be7f2ff3847d9722e BUG: 790389 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.com/2888 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendrabhat@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* libglusterfs: Handle loc_copy for nameless locPranith Kumar K2012-02-291-13/+9
| | | | | | | | | BUG: 787671 Change-Id: I7601f482ae753ead83be16d1df33c2187e76dc5c Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/2825 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: utilize mempool for frame->local allocationsAmar Tumballi2012-02-211-0/+3
| | | | | | | | | | | | | | | in each translator, which uses 'frame->local', we are using GF_CALLOC/GF_FREE, which would be costly considering the number of allocation happening in a lifetime of 'fop'. It would be good to utilize the mem pool framework for xlator's local structures, so there is no allocation overhead. Change-Id: Ida6e65039a24d9c219b380aa1c3559f36046dc94 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 765336 Reviewed-on: http://review.gluster.com/2772 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/dht: Support for hardlink rebalance when decommissioningshishir gowda2012-02-191-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The support for hardlink rebalance is only available for decommissioning of a node. this can be triggered in two ways 1. remove-brick start 2. if decommission node value is set in vol file, then a normal rebalance command The way we handle it is- if (nlink > 1) do * if src file doesnt have linkto xattr * mark src's linkto to the dst * else * perform a link on the dst * do a look up * if nlinks = dst.nlinks * migrate data * else * continue crawling done Signed-off-by: shishir gowda <shishirng@gluster.com> Change-Id: If43b5524b872fd1413e9f7aa7f436cb244e30d8d BUG: 763844 Reviewed-on: http://review.gluster.com/2737 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* core: add 'fremovexattr()' fopAmar Tumballi2012-01-251-0/+1
| | | | | | | | | | | so operations can be done on fd for extended attribute removal Change-Id: Ie026f1b53793aeb4ae33e96ea5408c7a97f34bf6 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 766571 Reviewed-on: http://review.gluster.com/778 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* core: GFID filehandle based backend and anonymous FDsAnand Avati2012-01-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. What -------- This change introduces an infrastructure change in the filesystem which lets filesystem operation address objects (inodes) just by its GFID. Thus far GFID has been a unique identifier of a user-visible inode. But in terms of addressability the only mechanism thus far has been the backend filesystem path, which could be derived from the GFID only if it was cached in the inode table along with the entire set of dentry ancestry leading up to the root. This change essentially decouples addressability from the namespace. It is no more necessary to be aware of the parent directory to address a file or directory. 2. Why ------- The biggest use case for such a feature is NFS for generating persistent filehandles. So far the technique for generating filehandles in NFS has been to encode path components so that the appropriate inode_t can be repopulated into the inode table by means of a recursive lookup of each component top-down. Another use case is the ability to perform more intelligent self-healing and rebalancing of inodes with hardlinks and also to detect renames. A derived feature from GFID filehandles is anonymous FDs. An anonymous FD is an internal USABLE "fd_t" which does not map to a user opened file descriptor or to an internal ->open()'d fd. The ability to address a file by the GFID eliminates the need to have a persistent ->open()'d fd for the purpose of avoiding the namespace. This improves NFS read/write performance significantly eliminating open/close calls and also fixes some of today's limitations (like keeping an FD open longer than necessary resulting in disk space leakage) 3. How ------- At each storage/posix translator level, every file is hardlinked inside a hidden .glusterfs directory (under the top level export) with the name as the ascii-encoded standard UUID format string. For reasons of performance and scalability there is a two-tier classification of those hardlinks under directories with the initial parts of the UUID string as the directory names. For directories (which cannot be hardlinked), the approach is to use a symlink which dereferences the parent GFID path along with basename of the directory. The parent GFID dereference will in turn be a dereference of the grandparent with the parent's basename, and so on recursively up to the root export. 4. Development --------------- 4a. To leverage the ability to address an inode by its GFID, the technique is to perform a "nameless lookup". This means, to populate a loc_t structure as: loc_t { pargfid: NULL parent: NULL name: NULL path: NULL gfid: GFID to be looked up [out parameter] inode: inode_new () result [in parameter] } and performing such lookup will return in its callback an inode_t populated with the right contexts and a struct iatt which can be used to perform an inode_link () on the inode (without a parent and basename). The inode will now be hashed and linked in the inode table and findable via inode_find(). A fundamental change moving forward is that the primary fields in a loc_t structure are now going to be (pargfid, name) and (gfid) depending on the kind of FOP. So far path had been the primary field for operations. The remaining fields only serve as hints/helpers. 4b. If read/write is to be performed on an inode_t, the approach so far has been to: fd_create(), STACK_WIND(open, fd), fd_bind (in callback) and then perform STACK_WIND(read, fd) etc. With anonymous fds now you can do fd_anonymous (inode), STACK_WIND (read, fd). This results in great boost in performance in the inbuilt NFS server. 5. Misc ------- The inode_ctx_put[2] has been renamed to inode_ctx_set[2] to be consistent with the rest of the codebase. Change-Id: Ie4629edf6bd32a595f4d7f01e90c0a01f16fb12f BUG: 781318 Reviewed-on: http://review.gluster.com/669 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* xlator: prevent a crash in xlator_destroy due to uninitialized listAmar Tumballi2011-11-161-2/+2
| | | | | | | | | | | | | | | | was happening when any of the dlopen() fails for xlator loading, and xlator_destroy() was called as it was using a uninitialized list for traversal. now, the list gets initialized at the begining of the xl-init(), so xlator_destroy() works smoothly. Change-Id: I320f6fe922e6d351e6d7c0a3e8da1f6b414d3c47 BUG: 3731 Reviewed-on: http://review.gluster.com/606 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: remove 'ino' variable from 'loc_t' structureAmar Tumballi2011-11-161-3/+0
| | | | | | | | Change-Id: I53b007fbdb42313d207d5d63fbfaaa6aaf033f95 BUG: 3518 Reviewed-on: http://review.gluster.com/523 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>