summaryrefslogtreecommitdiffstats
path: root/libglusterfs/src/xlator.c
Commit message (Collapse)AuthorAgeFilesLines
* Revert "gluster: Sometimes Brick process is crashed at the time of stopping ↵Mohit Agrawal2018-05-251-5/+3
| | | | | | | | brick" Updates: bz#1582286 This reverts commit 0043c63f70776444f69667a4ef9596217ecb42b7. Change-Id: Iab3b4f4a54e122c589e515add93c6effc966b3e0
* Revert "server: fix unresolved symbols by moving them to libglusterfs"Mohit Agrawal2018-05-251-101/+0
| | | | | | Updates: bz#1582286 This reverts commit 408a6d07ababde234ddeafe16687aacd2b810b42. Change-Id: If8247d7980d698141f47130a3c532b942408ec2b
* core: move logs which are only developer relevant to DEBUG levelAmar Tumballi2018-05-091-1/+1
| | | | | | | Change-Id: I8b38e231b6160db8075f73773d4e7dc115a90d95 updates: bz#1542829 BUG: 1542829 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* server: fix unresolved symbols by moving them to libglusterfsMohit Agrawal2018-04-201-0/+101
| | | | | | | | | | | | | | | | Problem: glusterd2 build is failed due to undefined symbol (xlator_mem_cleanup , glusterfsd_ctx) in server.so Solution: To resolve the same done below two changes 1) Move xlator_mem_cleanup code from glusterfsd-mgmt.c to xlator.c to be part of libglusterfs.so 2) replace glusterfsd_ctx to this->ctx because symbol glusterfsd_ctx is not part of server.so BUG: 1544090 Change-Id: Ie5e6fba9ed458931d08eb0948d450aa962424ae5 fixes: bz#1544090 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* gluster: Sometimes Brick process is crashed at the time of stopping brickMohit Agrawal2018-04-191-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | Problem: Sometimes brick process is getting crashed at the time of stop brick while brick mux is enabled. Solution: Brick process was getting crashed because of rpc connection was not cleaning properly while brick mux is enabled.In this patch after sending GF_EVENT_CLEANUP notification to xlator(server) waits for all rpc client connection destroy for specific xlator.Once rpc connections are destroyed in server_rpc_notify for all associated client for that brick then call xlator_mem_cleanup for for brick xlator as well as all child xlators.To avoid races at the time of cleanup introduce two new flags at each xlator cleanup_starting, call_cleanup. BUG: 1544090 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Note: Run all test-cases in separate build (https://review.gluster.org/#/c/19700/) with same patch after enable brick mux forcefully, all test cases are passed. Change-Id: Ic4ab9c128df282d146cf1135640281fcb31997bf updates: bz#1544090
* core: provide infra to make any xlator pass-throughAmar Tumballi2018-03-091-3/+15
| | | | | | | updates: #304 Change-Id: If6a13d2e56b195390a386d720103a882e077f66c Signed-off-by: Amar Tumballi <amarts@redhat.com>
* metrics: set latency min value during xlator initAmar Tumballi2018-02-161-0/+7
| | | | | | | | | | | otherwise, the very first metrics will have all the min as 0. also no need to print pending-fops if it is 0. Updates #168 Change-Id: I233de6c92b1a73977bb468ba211ac6ec3c05298f Signed-off-by: Amar Tumballi <amarts@redhat.com>
* Use RTLD_LOCAL for symbol resolutionPrashanth Pai2017-12-271-2/+2
| | | | | | | | | | | | | RTLD_LOCAL is the default value for symbol visibility flag of dlopen() in Linux and NetBSD. Using it avoids conflicts during symbol resolution. This also allows us to detect xlators that have not been explicitly linked with libraries that they use. This used to go unnoticed when RTLD_GLOBAL was being used. BUG: 1193929 Change-Id: I50db6ea14ffdee96596060c4d6bf71cd3c432f7b Signed-off-by: Prashanth Pai <ppai@redhat.com>
* core/memacct: save allocs in mem_acct_rec listN Balachandran2017-12-071-0/+3
| | | | | | | | | | | | | | | With configure --enable-debug, add all object allocations to a list in the corresponding mem_acct_rec. This allows us to see all objects of a particular type and allows for additional debugging in case of memory leaks. This is not compiled in by default and must be explicitly enabled. It is intended to be used by developers. Change-Id: I7cf2dbeadecf994423d7e7591e85f18d2575cce8 BUG: 1522662 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* libglusterfs: specify ctx in gf_log_set_loglevelZhang Huan2017-12-061-1/+1
| | | | | | | | | specify ctx in gf_log_set_loglevel, instead of getting it from a thread specific variable. Change-Id: I498f826e8e32231235a6b0005026a27c327727fd BUG: 1521213 Signed-off-by: Zhang Huan <zhanghuan@open-fs.com>
* metrics: provide options to dump metrics from xlatorsAmar Tumballi2017-12-061-0/+6
| | | | | | | | | | * Introduce xlator methods to allow dumping of metrics * Separate options to get the metrics dumped in a path Updates #168 Change-Id: I7df80df33b71d6f449f03c2332665b4a45f6ddf2 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* rio/everywhere: add icreate/namelink fopSusant Palai2017-12-051-0/+2
| | | | | | | | | | | | | | | | | | | | | icreate creates inode, while namelink links the basename to it's parent gfid. For now mkdir is the primary user of these fops. Better distribution is acheived by creating the inode on ,(say) mds1 and linking the basename to it's parent gfid on mds2. The inode serves readdirp, stat etc. More details about the fops are present at: https://review.gluster.org/#/c/13395/3/design/DHT2/DHT2_Icreate_Namelink_Notes.md This backport of three patches from experimental branch. 1- https://review.gluster.org/#/c/18085/ 2- https://review.gluster.org/#/c/18086/ 3- https://review.gluster.org/#/c/18094/ Updates gluster/glusterfs#243 Change-Id: I1bd3d5a441a3cfab1acfeb52f15c6c867d362592 Signed-off-by: Susant Palai <spalai@redhat.com>
* libglusterfs: Add put fopPoornima G2017-12-051-0/+1
| | | | | | | | | | | | | | | | | Problem: It had been a longtime request to implement put fop in gluster. put fop in gluster may not have the exact sementics of HTTP PUT, but can be easily extended to do so. The subsequent patches, will contain more semantics on the put fop and its guarentees. Why compound fop framework is not used for put? Compound fop framework currently doesn't allow compounding of entry fop and inode fops, i.e. fops on multiple inodes cannot be combined in compound fop. Updates #353 Change-Id: Idb7891b3e056d46d570bb7e31bad1b6a28656ada Signed-off-by: Poornima G <pgurusid@redhat.com>
* xlator: provide a xlator_api_t structure to include all exported optionsAmar Tumballi2017-11-301-37/+183
| | | | | | | | | | each translator from now on can have just 1 symbol exported called 'xlator_api', which has all the required fields in it. Updates: #164 Change-Id: I48d54f5ec59fee842b1d55877e3ac5e9ec9b6bdd Signed-off-by: Amar Tumballi <amarts@redhat.com>
* xlator: add more metrics per fopsAmar Tumballi2017-11-081-0/+12
| | | | | | | | | | | Make sure to handle these counters in STACK_WIND/UNWIND macro, and keep the counters as part of xlator_t structure itself, to provide infra to monitoring. Updates #137 Change-Id: Ib54d45e2321c2b095dac5810c37e6cdffe1f71b7 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* global options: add a sample option to handleAmar Tumballi2017-11-031-1/+14
| | | | | | | Fixes #303 Change-Id: Icdaa804711c43c65b9684f2649437aae1b5c1ed5 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* Add framework for global xlator in graphAmar Tumballi2017-11-031-1/+19
| | | | | | | Updates #303 Change-Id: Id0b9050c93ea87532dc80b4fda650c5663d285bd Signed-off-by: Amar Tumballi <amarts@redhat.com>
* mgtm/core : use sha hash function for volfile checkMohammed Rafi KC2017-07-101-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are storing the entire volfile and using this to check volfile change. With brick multiplexing there will be lot of graphs per process which will increase the memory foot print of the process. So instead of storing the entire graph we could use sha256 and we can compare the hash to see whether volfile change happened or not. Also with Brick multiplexing, the direct comparison of vol file is not correct. There are two problems. Problem 1: We are currently storing one single graph (the last updated volfile) whereas, what we need is the entire graph with all atttached bricks. If we fix this issue, we have second problem Problem 2: With multiplexing we have a graph that contains multiple bricks. But what we are checking as part of the reconfigure is, comparing the entire graph with one single graph, which will always fail. Solution: We create list in glusterfs_ctx_t that stores sha256 hash of individual brick graphs. When a graph changes happens we compare the stored hash and the current hash. If the hash matches, then no need for reconfigure. Otherwise we first do the reconfigure and then update the hash. For now, gfapi has not changed this way. Meaning when gfapi volfile fetch or reconfigure happens, we still store the entire graph and compare, each memory. This is fine, because libgfapi will not load brick graphs. But changing the libgfapi will make the code similar in both glusterfsd-mgmt and api. Also it helps to reduce some memory. Change-Id: I9df917a771a52b95622ab8f63af34ec390163a77 BUG: 1467986 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: https://review.gluster.org/17709 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Amar Tumballi <amarts@redhat.com>
* nl-cache: In case of nameless operations do not cachePoornima G2017-05-221-0/+15
| | | | | | | | | | | | | | | | | | Issue: In nameless lookup/other fops, parent inode will be NULL, when we try to add the cache to the NULL inode, it causes a crash. Hence handle the scenario of nameless fops, and do not cache/serve the nameless fops. Change-Id: I3b90f882ac89e6aaf3419db89e6f890797f37700 BUG: 1451588 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: https://review.gluster.org/17316 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* dht: send lookup on old name inside rename with bname and pargfidSusant Palai2017-04-291-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | Inside rename, a lookup is done on the source name to make sure that the file is there. But we used to do a gfid based lookup and hence, even if the source name was renamed to a new name from some other client, lookup will be successful as server3_3_lookup will fetch the new path based on the gfid. So even if the source file does not exist any more rename will carry on, and as server3_3_link(destination is hashed to a different brick other than source cached scenario) also does gfid based resolve, it wont detect that the source name does not exist and hardlink creation will be successful (since gfid based resolve will get the new dentry). To solve this problem, do a name based lookup inside rename. So that rename will fail right away if the source does not exist. Change-Id: Ieba8bdd6675088dbf18de90ed4622df043d163bd BUG: 1412135 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: https://review.gluster.org/16375 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: N Balachandran <nbalacha@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* xlator: do not call dlclose() when debuggingNiels de Vos2017-04-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Valgrind can not show the symbols if a .so after calling dlclose(). The unhelpful ??? in the output gets resolved properly with this change: ==25170== 344 bytes in 1 blocks are definitely lost in loss record 233 of 324 ==25170== at 0x4C29975: calloc (vg_replace_malloc.c:711) ==25170== by 0x52C7C0B: __gf_calloc (mem-pool.c:117) ==25170== by 0x12B0638A: ??? ==25170== by 0x528FCE6: __xlator_init (xlator.c:472) ==25170== by 0x528FE16: xlator_init (xlator.c:498) ==25170== by 0x52DA8D6: glusterfs_graph_init (graph.c:321) ==25170== by 0x52DB587: glusterfs_graph_activate (graph.c:695) ==25170== by 0x5046407: glfs_process_volfp (glfs-mgmt.c:79) ==25170== by 0x5043B9E: glfs_volumes_init (glfs.c:281) ==25170== by 0x5044FEC: glfs_init_common (glfs.c:986) ==25170== by 0x50451A7: glfs_init@@GFAPI_3.4.0 (glfs.c:1031) By not calling dlclose(), the dynamically loaded .so is still available upon program exit, and Valgrind is able to resolve the symbols. This will add an additional leak, so dlclose() is called for normal builds, but skipped when configuring with "./configure --enable-valgrind" or passing the "run-with-valgrind" xlator option. URL: http://valgrind.org/docs/manual/faq.html#faq.unhelpful Change-Id: I2044e21b1b8fcce32ad1a817fdd795218f967731 BUG: 1425623 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/16809 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Samikshan Bairagya <samikshan@gmail.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* libglusterfs: provide standardized atomic operationsNiels de Vos2017-04-051-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The current macros ATOMIC_INCREMENT() and ATOMIC_DECREMENT() expect a lock as first argument. There are at least two issues with this approach: 1. this lock is unused on architectures that have atomic operations 2. some structures use a single lock for multiple variables By defining a gf_atomic_t type, the unused lock can be removed, saving a few bytes on modern architectures. Because the gf_atomic_t type locates the lock for the variable (in case of older architectures), each variable is protected the same on all architectures. This makes the behaviour across all architectures more equal (per variable locking, by a gf_lock_t or compiler optimization). BUG: 1437037 Change-Id: Ic164892b06ea676e6a9566f8a98b7faf0efe76d6 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: https://review.gluster.org/16963 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>
* libglusterfs: fix serious leak of xlator_t structuresJeff Darcy2017-02-091-0/+1
| | | | | | | | | | | | | | | There's a lot of logic (and some long comments) around how to free these structures safely, but then we didn't do it. Now we do. Change-Id: I9731ae75c60e99cc43d33d0813a86912db97fd96 BUG: 1420571 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/16570 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Poornima G <pgurusid@redhat.com> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* core: run many bricks within one glusterfsd processJeff Darcy2017-01-301-0/+72
| | | | | | | | | | | | | | | | | | | | | | | This patch adds support for multiple brick translator stacks running in a single brick server process. This reduces our per-brick memory usage by approximately 3x, and our appetite for TCP ports even more. It also creates potential to avoid process/thread thrashing, and to improve QoS by scheduling more carefully across the bricks, but realizing that potential will require further work. Multiplexing is controlled by the "cluster.brick-multiplex" global option. By default it's off, and bricks are started in separate processes as before. If multiplexing is enabled, then *compatible* bricks (mostly those with the same transport options) will be started in the same process. Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb BUG: 1385758 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: https://review.gluster.org/14763 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: serialize init/reconfigure callsJeff Darcy2017-01-051-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | These functions do not generally "expect" to be called more than once in parallel, and many are likely to misbehave in that case (one case in DHT already). Such parallel calls have not generally happened because there are only a few places where we call these functions, and those have been implicitly serialized until recently. However, recent changes in the epoll layer change that, as does brick multiplexing. Therefore, the serialization is now explicit at the init/reconfigure level. It would be sufficient to serialize calls to a particular translator's init and reconfigure functions, but that would require per-translator locks and a bit more complexity in maintaining/using them. Since there's no clear reason why we would need or want to support a higher level of parallelism, the simpler approach of a global lock should suffice. Change-Id: I26296c2826e91dc00b7f0c2061bcc2964ef90c4c BUG: 1399134 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/16030 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* features/shard: Fill loc.pargfid too for named lookups on individual shardsKrutika Dhananjay2016-11-081-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On a sharded volume when a brick is replaced while IO is going on, named lookup on individual shards as part of read/write was failing with ENOENT on the replaced brick, and as a result AFR initiated name heal in lookup callback. But since pargfid was empty (which is what this patch attempts to fix), the resolution of the shards by protocol/server used to fail and the following pattern of logs was seen: Brick-logs: [2016-11-08 07:41:49.387127] W [MSGID: 115009] [server-resolve.c:566:server_resolve] 0-rep-server: no resolution type for (null) (LOOKUP) [2016-11-08 07:41:49.387157] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null) (00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82) ==> (Invalid argument) [Invalid argument] Client-logs: [2016-11-08 07:41:27.497687] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.497755] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.498500] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.499680] E [MSGID: 133010] Also, this patch makes AFR by itself choose a non-NULL pargfid even if its ancestors fail to initialize all pargfid placeholders. Change-Id: I5f85b303ede135baaf92e87ec8e09941f5ded6c1 BUG: 1392445 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15788 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* core: add setactivelk () fopSusant Palai2016-05-011-0/+1
| | | | | | | | | | | Change-Id: Ic2ba77a1fdd27801a6e579e04e6c0dd93cd7127b BUG: 1326085 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/14011 Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* core: add getactivelk () fopSusant Palai2016-05-011-0/+1
| | | | | | | | | | | Change-Id: Ifd0ff278dcf43da064021f5c25e5dcd34347fcde BUG: 1326085 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/13970 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* core: add lease fopPoornima G2016-04-211-0/+1
| | | | | | | | | | | | | Change-Id: Ia27d66b1061b0377857827515590eb89b18515c9 BUG: 1319992 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/11596 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* core: add seek() FOPNiels de Vos2016-01-311-0/+1
| | | | | | | | | | | | | | Minimal infrastructure changes for the seek() FOP. This will provide SEEK_HOLE and SEEK_DATA functionalities. BUG: 1220173 Change-Id: I4b74fce8b0bad2f45291fd2c2b9e243c4f4a1aa9 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/11480 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* core : Use correct path in dlopen for socket.soAtin Mukherjee2015-11-191-1/+6
| | | | | | | | | | | | | | This patch fixes the path for socket.so file while loading the so dynamically. Also for config.memory-accounting & config.transport voltype is changed to glusterd to fix the warning message coming from xlator_volopt_dynload Change-Id: I0f7964814586f2018d4922b23c683f4e1eb3098e BUG: 1283485 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/12656 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* xlators: add JSON FOP statistics dumps every N secondsRichard Wareing2015-10-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: - Adds a thread to the io-stats translator which dumps out statistics every N seconds where N is configurable by an option called "diagnostics.stats-dump-interval" - Thread cleanly starts/stops when translator is unloaded - Updates macros to use "Atomic Builtins" (e.g. intel CPU extentions) to use memory barries to update counters vs using locks. This should reduce overhead and prevent any deadlock bugs due to lock contention. Test Plan: - Test on development machine - Run prove -v tests/basic/stats-dump.t Change-Id: If071239d8fdc185e4e8fd527363cc042447a245d BUG: 1266476 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/12209 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Avra Sengupta <asengupt@redhat.com>
* libglusterfs:Porting log messages to new frameworkMohamed Ashiq2015-09-011-4/+6
| | | | | | | | | | | Change-Id: I8625b7dc8941720cc7a864b8fddbcc7b4c485fcd BUG: 1252836 Signed-off-by: Mohamed Ashiq <mliyazud@redhat.com> Reviewed-on: http://review.gluster.org/11896 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* defaults,globals,iobuf,latency,logging,options,xlator/libglusterfs : porting ↵Mohamed Ashiq2015-06-241-40/+45
| | | | | | | | | | | to a new logging framework Change-Id: If6a55186cddc3d1c4d22e3d56b45358b84feeb49 BUG: 1194640 Signed-off-by: Mohamed Ashiq <ashiq333@gmail.com> Reviewed-on: http://review.gluster.org/10826 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* build: do not #include "config.h" in each fileNiels de Vos2015-05-291-5/+0
| | | | | | | | | | | | | | | | | | Instead of including config.h in each file, and have the additional config.h included from the compiler commandline (-include option). When a .c file tests for a certain #define, and config.h was not included, incorrect assumtions were made. With this change, it can not happen again. BUG: 1222319 Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/10808 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* core: use reference counting for mem_acct structuresJeff Darcy2015-05-091-19/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When freeing memory, our memory-accounting code expects to be able to dereference from the (previously) allocated block to its owning translator. However, as we have already found once in option validation and twice in logging, that translator might itself have been freed and the dereference attempt causes on of our daemons to crash with SIGSEGV. This patch attempts to fix that as follows: * We no longer embed a struct mem_acct directly in a struct xlator, but instead allocate it separately. * Allocated memory blocks now contain a pointer to the mem_acct instead of the xlator. * The mem_acct structure contains a reference count, manipulated in both the normal and translator allocate/free code using atomic increments and decrements. * Because it's now a separate structure, we can defer freeing the mem_acct until its reference count reaches zero (either way). * Some unit tests were disabled, because they embedded their own copies of the implementation for what they were supposedly testing. Life's too short to spend time fixing tests that seem designed to impede progress by requiring a certain implementation as well as behavior. Change-Id: Id929b11387927136f78626901729296b6c0d0fd7 BUG: 1211749 Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/10417 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* Avoid conflict between contrib/uuid and system uuidEmmanuel Dreyfus2015-04-041-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | glusterfs relies on Linux uuid implementation, which API is incompatible with most other systems's uuid. As a result, libglusterfs has to embed contrib/uuid, which is the Linux implementation, on non Linux systems. This implementation is incompatible with systtem's built in, but the symbols have the same names. Usually this is not a problem because when we link with -lglusterfs, libc's symbols are trumped. However there is a problem when a program not linked with -lglusterfs will dlopen() glusterfs component. In such a case, libc's uuid implementation is already loaded in the calling program, and it will be used instead of libglusterfs's implementation, causing crashes. A possible workaround is to use pre-load libglusterfs in the calling program (using LD_PRELOAD on NetBSD for instance), but such a mechanism is not portable, nor is it flexible. A much better approach is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any possible conflict. This is what this change attempts. BUG: 1206587 Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10017 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* core : free up mem_acct.rec in xlator_destroyAtin Mukherjee2015-03-301-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: We've observed that glusterd was OOM killed after some minutes when volume set command was run in a loop. Analysis: Initially the suspection was in glusterd code, but a deep dive into the codebase revealed that while validating all the options as part of graph reconfiguration at the time of freeing up the xlator object its one of the member mem_acct is left over which causes memory leak. Solution: Free up xlator's mem_acct.rec in xlator_destroy () Change-Id: Ie9e7267e1ac4ab7b8af6e4d7c6660dfe99b4d641 BUG: 1201203 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/9862 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* features/bit-rot: Implementation of bit-rot xlatorVenky Shankar2015-03-241-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the "Signer" -- responsible for signing files with their checksums upon last file descriptor close (last release()). The event notification facility provided by the changelog xlator is made use of. Moreover, checksums are as of now SHA256 hash of the object data and is the only available hash at this point of time. Therefore, there is no special "what hash to use" type check, although it's does not take much to add various hashing algorithms to sign objects with. Signatures are stored in extended attributes of the objects along with the the type of hashing used to calculate the signature. This makes thing future proof when other hash types are added. The signature infrastructure is provided by bitrot stub: a little piece of code that sits over the POSIX xlator providing interfaces to "get or set" objects signature and it's staleness. Since objects are signed upon receiving release() notification, pre-existing data which are "never" modified would never be signed. To counter this, an initial crawler thread is spawned The crawler scans the entire brick for objects that are unsigned or "missed" signing due to the server going offline (node reboots, crashes, etc..) and triggers an explicit sign. This would also sign objects when bit-rot is enabled for a volume and/or after upgrade. Change-Id: I1d9a98bee6cad1c39c35c53c8fb0fc4bad2bf67b BUG: 1170075 Original-Author: Raghavendra Bhat <raghavendra@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/9711 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Quota/marker : Support for inode quotavmallika2015-03-171-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the only way to retrieve the number of files/objects in a directory or volume is to do a crawl of the entire directory/volume. This is expensive and is not scalable. The new mechanism proposes to store count of objects/files as part of an extended attribute of a directory. Each directory's extended attribute value will indicate the number of files/objects present in a tree with the directory being considered as the root of the tree. Currently file usage is accounted in marker by doing multiple FOPs like setting and getting xattrs. Doing this with STACK WIND and UNWIND can be harder to debug as involves multiple callbacks. In this code we are replacing current mechanism with syncop approach as syncop code is much simpler to follow and help us implement inode quota in an organized way. Change-Id: Ibf366fbe07037284e89a241ddaff7750fc8771b4 BUG: 1188636 Signed-off-by: vmallika <vmallika@redhat.com> Signed-off-by: Sachin Pandit <spandit@redhat.com> Reviewed-on: http://review.gluster.org/9567 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
* every/where: add GF_FOP_IPC for inter-translator communicationJeff Darcy2015-03-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several features - e.g. encryption, erasure codes, or NSR - involve multiple cooperating translators which sometimes need a "private" means of communication amongst themselves. Historically we've used virtual or synthetic xattrs, but that's not very elegant and clutters up the getxattr/setxattr path which must also handle real xattr requests. This new fop should address that. The only argument is an int32_t "op" which should be recognized by the target translator. It is recommended that translators using these feature follow some convention regarding the ops that they define, to avoid conflicts. Using a hash of the target translator's type string as a base for a series of ops would probably be a good start. Any other information can be passed in both directions using xdata. The default behavior for this fop, as with any other, is to pass through to FIRST_CHILD. That makes use of this fop "transparent" to other translators that were written before it existed, but it also means that it only really works with pass-through translators. If a routing translator (such as DHT) or a fan-out translator (such as AFR) is involved, the IPC might not reach its intended destination unless those translators are modified to forward IPC fops along all paths. If an IPC gets all the way to storage/posix it is considered an error, much like an uncaught exception. We don't actually *do* anything in that case, but we do log it send back an EOPNOTSUPP error. This makes the "unrecognized opcode" condition distinguishable from the "no IPC support" condition (which would yield an RPC error instead) so clients can probe for the presence of a handler for their own favorite opcode and either use that or use old-school xattrs depending on the result. BUG: 1158628 Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: I84af1b17babe5b30ec03ecf027ae37d09b873968 Reviewed-on: http://review.gluster.org/8812 Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* xlator: avoiding possibility of a crash if (xl->ctx) is NULL.Humble Devassy Chirammal2015-03-151-0/+3
| | | | | | | | | | Change-Id: I41acd9970bef04bb16cd4d8532a84a95d5fb642a BUG: 1199003 Signed-off-by: Humble Devassy Chirammal <hchiramm@redhat.com>. Reviewed-on: http://review.gluster.org/9810 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Use common loc-touchup in fuse/server/gfapiPranith Kumar K2015-03-081-0/+33
| | | | | | | | | | | Change-Id: Id41fb29480bb6d22c34469339163da05b98c1a98 BUG: 1115907 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/8226 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs: Add functions for xlator and graph cleanup.Poornima G2015-03-021-35/+80
| | | | | | | | | | Change-Id: If341e3c0a559aa5bbca9c1263a241c6592c59706 BUG: 1093594 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/9696 Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* protocol/server: reflect lru limit in inode table alsoRaghavendra Bhat2014-06-131-0/+19
| | | | | | | | | | | | | | | | | | | | | Upon reconfigure, when lru limit of the inode table is changed, the new value was just saved in the private structure of the protocol/server xlator and the inode table used to have the older values still. A brick start was required for the changes to get reflected. To handle it, traverse through the xlator tree and check whether a xlator is a bound_xl or not (if it is a bound_xl it would have its itable pointer set). If a xlator is a bound_xl, then get the inode table of that bound_xl and set its lru limit to new value given via cli. Also prune the inode table so that extra inodes are purged from the inode table. Change-Id: I6909be028c116adaa1d1a5108470015b5fc6f09d BUG: 1103756 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/7957 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* Core: Fix issues reported by CppcheckLalatendu Mohanty2014-06-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixed in this patch: [glusterfs/extras/geo-rep/gsync-sync-gfid.c:105]: (error) Resource leak: fp [glusterfs/libglusterfs/src/xlator.c:651]: (error) Uninitialized variable: gfid [glusterfs/libglusterfs/src/xlator.c:652]: (error) Uninitialized variable: gfid [glusterfs/xlators/cluster/ha/src/ha.c:2699]: (error) Possible null pointer dereference: priv [glusterfs/xlators/features/changelog/src/changelog.c:1464]: (error) Possible null pointer dereference: priv [glusterfs/xlators/mgmt/glusterd/src/glusterd-mgmt-handler.c:865]: (error) Possible null pointer dereference: ctx [glusterfs/xlators/mgmt/glusterd/src/glusterd-mgmt-handler.c:194]: (error) Possible null pointer dereference: ctx [glusterfs/xlators/mgmt/glusterd/src/glusterd-syncop.c:1408]: (error) Possible null pointer dereference: this [glusterfs/xlators/mgmt/glusterd/src/glusterd-utils.c:7002]: (error) Possible null pointer dereference: path_tokens Fixed in 3.4 and 3.5 branch (http://review.gluster.org/#/c/7583/ , http://review.gluster.org/#/c/7605/ will be backported in a separate patch) [glusterfs/xlators/mount/fuse/src/fuse-bridge.c:4688]: (error) Uninitialized variable: finh [glusterfs/xlators/mount/fuse/src/fuse-bridge.c:3081]: (error) Possible null pointer dereference: state [glusterfs/xlators/cluster/dht/src/dht-rebalance.c:1719]: (error) Possible null pointer dereference: ctx [glusterfs/xlators/cluster/stripe/src/stripe.c:4940]: (error) Possible null pointer dereference: local [glusterfs/xlators/mgmt/glusterd/src/glusterd-replace-brick.c:915]: (error) Resource leak: file [glusterfs/xlators/mgmt/glusterd/src/glusterd-replace-brick.c:999]: (error) Resource leak: file [glusterfs/xlators/mgmt/glusterd/src/glusterd-sm.c:248]: (error) Possible null pointer dereference: new_ev_ctx [glusterfs/xlators/mgmt/glusterd/src/glusterd-utils.c:5297]: (error) Possible null pointer dereference: this [glusterfs/xlators/mgmt/glusterd/src/glusterd-utils.c:6273]: (error) Possible null pointer dereference: this [glusterfs/xlators/performance/quick-read/src/quick-read.c:586]: (error) Possible null pointer dereference: iobuf [glusterfs/xlators/nfs/server/src/nfs-common.c:89]: (error) Dangerous usage of 'volname' (strncpy doesn't always null-terminate it). False positives [glusterfs/geo-replication/src/gsyncd.c:99]: (error) Memory leak: str [glusterfs/geo-replication/src/gsyncd.c:395]: (error) Memory leak: argv [glusterfs/xlators/nfs/server/src/nlm4.c:1199]: (error) Possible null pointer dereference: fde [glusterfs/xlators/mgmt/glusterd/src/glusterd-geo-rep.c:1659]: (error) Possible null pointer dereference: command [glusterfs/xlators/mgmt/glusterd/src/glusterd-utils.c:7001]: (error) Possible null pointer dereference: path_tokens Insignificant/Don't care [glusterfs/contrib/uuid/gen_uuid.c:369]: (warning) %ld in format string (no. 2) requires 'long *' but the argument type is 'unsigned long *'. [glusterfs/contrib/uuid/gen_uuid.c:369]: (warning) %ld in format string (no. 3) requires 'long *' but the argument type is 'unsigned long *'. [glusterfs/extras/test/test-ffop.c:27]: (error) Buffer overrun possible for long command line arguments. [glusterfs/xlators/cluster/afr/src/afr-self-heal-common.c:138]: (error) Possible null pointer dereference: __ptr [glusterfs/xlators/cluster/afr/src/afr-self-heal-common.c:140]: (error) Possible null pointer dereference: __ptr [glusterfs/xlators/cluster/afr/src/afr-self-heal-common.c:331]: (error) Possible null pointer dereference: __ptr Change-Id: I7696ed1a2a9553b79f9714e10210a8d563a5abd8 BUG: 1091677 Signed-off-by: Lalatendu Mohanty <lmohanty@redhat.com> Reviewed-on: http://review.gluster.org/7693 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/afr: refactorAnand Avati2014-03-221-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | - Remove client side self-healing completely (opendir, openfd, lookup) - Re-work readdir-failover to work reliably in case of NFS - Remove unused/dead lock recovery code - Consistently use xdata in both calls and callbacks in all FOPs - Per-inode event generation, used to force inode ctx refresh - Implement dirty flag support (in place of pending counts) - Eliminate inode ctx structure, use read subvol bits + event_generation - Implement inode ctx refreshing based on event generation - Provide backward compatibility in transactions - remove unused variables and functions - make code more consistent in style and pattern - regularize and clean up inode-write transaction code - regularize and clean up dir-write transaction code - regularize and clean up common FOPs - reorganize transaction framework code - skip setting xattrs in pending dict if nothing is pending - re-write self-healing code using syncops - re-write simpler self-heal-daemon Change-Id: I1e4080c9796c8a2815c2dab4be3073f389d614a8 BUG: 1021686 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6010 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* libglusterfs..xlator.c: Close a library handle.Christopher R. Hertel2014-02-081-0/+3
| | | | | | | | | | | | | | Simple fix to ensure that a library handle is closed if it is not actually used. BUG: 789278 CID: 1124783 Change-Id: Ia3e734c46e1ad8c97cb8cc7f1a5616606bfbc550 Signed-off-by: Christopher R. Hertel <crh@redhat.com> Reviewed-on: http://review.gluster.org/6933 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/gfid-access: fix lookup on .gfid/<parent>/bnameVenky Shankar2014-01-271-0/+39
| | | | | | | | | | | | | | In gfid translator, lookup was not handling the case when the lookup is sent on .gfid/<parent>/bname. In this case, we flip with fake inode of the parent with the real inode in loc and send it downwards. Change-Id: I639ff1dce10ffc045da419e333d455e208b6a0f0 BUG: 1057881 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/6795 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/quota: Improvements to quotaRaghavendra G2013-11-261-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Two stages of quota enforcement is done: Soft and hard quota Upon reaching soft quota limit on the directory it logs/alerts in the quota daemon log (ie DEFAULT_LOG_DIR/quotad.log) and no more writes allowed after hard quota limit. After reaching the soft-limit the daemon alerts the user/admin repeatively for every 'alert-time', which is configurable. * Quota enforcer is moved to server-side. It takes care of enforcing quota. Since enforcer doesn't have the cluster view, it relies on another service called quota-aggregator. Aggregator, on query can return the size of a directory based on the cluster view. Enforcer is always loaded in the server graph and is by passed if the feature is not enabled. Options specific to enforcer: server-quota - Specifies whether the feature is on/off. It is used to by pass the quota if turned off. deem-statfs - If set to on, it takes quota limits into consideration while estimating fs size. (df command). The algorithm followed is, i. Adjust statvfs based on limit configured on root. ii. If limit is set on the inode passed, use size/limits on that inode to populate statvfs. Otherwise, use size/limits configured on root. iii. Upon statvfs, update the ctx->size on the inode. iv. Don't let DHT aggregate, instead take the maximum of the usages from the subvols of the DHT, since each of it contains the complete information. Enforcer also makes use of gfid-to-path conversion functionality to work correctly when a client like nfs predominently relies on nameless lookups. * Quota Aggregator acts as a thin client to provide cluster view Its a lightweight *gluster client* process with no mount point, started upon enabling quota or restarting the volume. This is a single process run on each brick, which can answer queries on all volumes in the cluster. Its volfile stored in GLUSTERD_DEFAULT_WORKING_DIR/quotad/quotad.vol. Credits: Raghavendra Bhat <rabhat@redhat.com> Varun Shastry <vshastry@redhat.com> Shishir Gowda <sgowda@redhat.com> Kruthika Dhananjay <kdhananj@redhat.com> Brian Foster <bfoster@redhat.com> Krishnan Parthasarathi <kparthas@redhat.com> Change-Id: Id1cb25b414951da34c665a55f77385d482e0f9de BUG: 969461 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/5952 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>