summaryrefslogtreecommitdiffstats
path: root/xlators/cluster/afr/src/afr-lk-common.c
Commit message (Collapse)AuthorAgeFilesLines
* cluster/afr: post-op-delay supportAnand Avati2012-07-041-0/+2
| | | | | | | | | | | | | | | | post-op-delay introduces an artificial delay between the OP and POST-OP-CHANGELOG phases of a write transaction to increase the probability of changelog-piggyback and eager-locking to work more efficiently. Also enable eager-locking by default. Change-Id: I865ca4b68512c44818719c7e388952f15d53e6c2 BUG: 836033 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.com/3621 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pranithk@gluster.com>
* cluster/afr: cleanup lk_owner and PID messAnand Avati2012-07-041-22/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Historically PID (frame->root->pid) was used by the locks translator to identify a locker (and make decisions about which locks contend or cooperate/merge). Since the introduction of lock_owner parameter the usage of PID (for locks) was deprecated and is now unused. This patch nukes the usage of PID in AFR The usage of lk_owner has also ended up being a mess, because of the differentiation required between ->lk() and ->inodelk(), (->lk() needs to be identified by the process (roughly) and ->inodelk() needs to be identified by the transaction) and also because of optimizations like eager locking (locks are no more identified by the transaction as they now get inherited by the next transaction). The scheme (and technique) now is: - All FOPs (the third phase of the transaction) happen with the lk_owner which is set by the topmost layer (FUSE, NFS etc.) - All entrylks are issued with lk_owner set to the frame->root address. - Inodelks which will not be subject to eager locking are issued with lk_owner set to frame->root. - Inodelks which are subject to eager locking are issued with lk_owner set to the address of fd_t (which are the only type of frames which get subject to the eager locking optimization) - At the start of the transaction, the transaction frame's lk_owner is set to the either frame->root or fd_t (and never unmodified) depending on the type of transaction. - Just before the third phase (FOP phase) the set lk_owner is "saved" away and overwritten by the lk_owner submitted by the top layer (FUSE or NFS) - Right after the third phase, the saved lk_owner is "restored" to resume the transaction into the POST-OP and eventually UNLOCK using the same lk_owner which was used during the LOCK phase. Change-Id: I6ab8e4d6b65ae4185fa85ad3fded8e9188b2f929 BUG: 836033 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.com/3620 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pranithk@gluster.com>
* cluster/afr: Unlock higher entry locks in rename entrylk failure.Pranith Kumar K2012-05-221-0/+4
| | | | | | | | | BUG: 823255 Change-Id: Ic6ad33518ea42c9518a21381518bd4f4afdd87cb Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3382 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* license: dual license under GPLV2 and LGPLV3+Kaleb KEITHLEY2012-05-101-14/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note that the license was not changed in any of the following: .../argp-standalone/... .../booster/... .../cli/... .../contrib/... .../extras/... .../glusterfsd/... .../glusterfs-hadoop/... .../mod_clusterfs/... .../scheduler/... .../swift/... The license was not changed in any of the non-building xlators. The license was not changed in any of the xlators that seemed — to me — to be clearly server-side only, e.g. protocol/server Note too that copyright was changed along with the license; I did not change the copyright in files where the license did not change. If you find any errors or ommissions please don't hesitate to let me know. The complete list of files with the license change is: libglusterfs/src/byte-order.h libglusterfs/src/call-stub.c libglusterfs/src/call-stub.h libglusterfs/src/checksum.c libglusterfs/src/checksum.h libglusterfs/src/circ-buff.c libglusterfs/src/circ-buff.h libglusterfs/src/common-utils.c libglusterfs/src/common-utils.h libglusterfs/src/compat-errno.c libglusterfs/src/compat-errno.h libglusterfs/src/compat.c libglusterfs/src/compat.h libglusterfs/src/daemon.c libglusterfs/src/daemon.h libglusterfs/src/defaults.c libglusterfs/src/defaults.h libglusterfs/src/dict.c libglusterfs/src/dict.h libglusterfs/src/event-history.c libglusterfs/src/event-history.h libglusterfs/src/event.c libglusterfs/src/event.h libglusterfs/src/fd-lk.c libglusterfs/src/fd-lk.h libglusterfs/src/fd.c libglusterfs/src/fd.h libglusterfs/src/gf-dirent.c libglusterfs/src/gf-dirent.h libglusterfs/src/globals.c libglusterfs/src/globals.h libglusterfs/src/glusterfs.h libglusterfs/src/graph-print.c libglusterfs/src/graph-utils.h libglusterfs/src/graph.c libglusterfs/src/hashfn.c libglusterfs/src/hashfn.h libglusterfs/src/iatt.h libglusterfs/src/inode.c libglusterfs/src/inode.h libglusterfs/src/iobuf.c libglusterfs/src/iobuf.h libglusterfs/src/latency.c libglusterfs/src/latency.h libglusterfs/src/list.h libglusterfs/src/lkowner.h libglusterfs/src/locking.h libglusterfs/src/logging.c libglusterfs/src/logging.h libglusterfs/src/mem-pool.c libglusterfs/src/mem-pool.h libglusterfs/src/mem-types.h libglusterfs/src/options.c libglusterfs/src/options.h libglusterfs/src/rbthash.c libglusterfs/src/rbthash.h libglusterfs/src/run.c libglusterfs/src/run.h libglusterfs/src/scheduler.c libglusterfs/src/scheduler.h libglusterfs/src/stack.c libglusterfs/src/stack.h libglusterfs/src/statedump.c libglusterfs/src/statedump.h libglusterfs/src/syncop.c libglusterfs/src/syncop.h libglusterfs/src/syscall.c libglusterfs/src/syscall.h libglusterfs/src/timer.c libglusterfs/src/timer.h libglusterfs/src/trie.c libglusterfs/src/trie.h libglusterfs/src/xlator.c libglusterfs/src/xlator.h libglusterfsclient/src/libglusterfsclient-dentry.c libglusterfsclient/src/libglusterfsclient-internals.h libglusterfsclient/src/libglusterfsclient.c libglusterfsclient/src/libglusterfsclient.h rpc/rpc-lib/src/auth-glusterfs.c rpc/rpc-lib/src/auth-null.c rpc/rpc-lib/src/auth-unix.c rpc/rpc-lib/src/protocol-common.h rpc/rpc-lib/src/rpc-clnt.c rpc/rpc-lib/src/rpc-clnt.h rpc/rpc-lib/src/rpc-transport.c rpc/rpc-lib/src/rpc-transport.h rpc/rpc-lib/src/rpcsvc-auth.c rpc/rpc-lib/src/rpcsvc-common.h rpc/rpc-lib/src/rpcsvc.c rpc/rpc-lib/src/rpcsvc.h rpc/rpc-lib/src/xdr-common.h rpc/rpc-lib/src/xdr-rpc.c rpc/rpc-lib/src/xdr-rpc.h rpc/rpc-lib/src/xdr-rpcclnt.c rpc/rpc-lib/src/xdr-rpcclnt.h rpc/rpc-transport/rdma/src/name.c rpc/rpc-transport/rdma/src/name.h rpc/rpc-transport/rdma/src/rdma.c rpc/rpc-transport/rdma/src/rdma.h rpc/rpc-transport/socket/src/name.c rpc/rpc-transport/socket/src/name.h rpc/rpc-transport/socket/src/socket.c rpc/rpc-transport/socket/src/socket.h xlators/cluster/afr/src/afr-common.c xlators/cluster/afr/src/afr-dir-read.c xlators/cluster/afr/src/afr-dir-read.h xlators/cluster/afr/src/afr-dir-write.c xlators/cluster/afr/src/afr-dir-write.h xlators/cluster/afr/src/afr-inode-read.c xlators/cluster/afr/src/afr-inode-read.h xlators/cluster/afr/src/afr-inode-write.c xlators/cluster/afr/src/afr-inode-write.h xlators/cluster/afr/src/afr-lk-common.c xlators/cluster/afr/src/afr-mem-types.h xlators/cluster/afr/src/afr-open.c xlators/cluster/afr/src/afr-self-heal-algorithm.c xlators/cluster/afr/src/afr-self-heal-algorithm.h xlators/cluster/afr/src/afr-self-heal-common.c xlators/cluster/afr/src/afr-self-heal-common.h xlators/cluster/afr/src/afr-self-heal-data.c xlators/cluster/afr/src/afr-self-heal-entry.c xlators/cluster/afr/src/afr-self-heal-metadata.c xlators/cluster/afr/src/afr-self-heal.h xlators/cluster/afr/src/afr-self-heald.c xlators/cluster/afr/src/afr-self-heald.h xlators/cluster/afr/src/afr-transaction.c xlators/cluster/afr/src/afr-transaction.h xlators/cluster/afr/src/afr.c xlators/cluster/afr/src/afr.h xlators/cluster/afr/src/pump.c xlators/cluster/afr/src/pump.h xlators/cluster/dht/src/dht-common.c xlators/cluster/dht/src/dht-common.h xlators/cluster/dht/src/dht-diskusage.c xlators/cluster/dht/src/dht-hashfn.c xlators/cluster/dht/src/dht-helper.c xlators/cluster/dht/src/dht-inode-read.c xlators/cluster/dht/src/dht-inode-write.c xlators/cluster/dht/src/dht-layout.c xlators/cluster/dht/src/dht-linkfile.c xlators/cluster/dht/src/dht-mem-types.h xlators/cluster/dht/src/dht-rebalance.c xlators/cluster/dht/src/dht-rename.c xlators/cluster/dht/src/dht-selfheal.c xlators/cluster/dht/src/dht.c xlators/cluster/dht/src/nufa.c xlators/cluster/dht/src/switch.c xlators/cluster/stripe/src/stripe-helpers.c xlators/cluster/stripe/src/stripe-mem-types.h xlators/cluster/stripe/src/stripe.c xlators/cluster/stripe/src/stripe.h xlators/features/index/src/index-mem-types.h ¹ xlators/features/index/src/index.c ¹ xlators/features/index/src/index.h ¹ xlators/performance/io-cache/src/io-cache.c xlators/performance/io-cache/src/io-cache.h xlators/performance/io-cache/src/ioc-inode.c xlators/performance/io-cache/src/ioc-mem-types.h xlators/performance/io-cache/src/page.c xlators/performance/io-threads/src/io-threads.c xlators/performance/io-threads/src/io-threads.h xlators/performance/io-threads/src/iot-mem-types.h xlators/performance/md-cache/src/md-cache-mem-types.h xlators/performance/md-cache/src/md-cache.c xlators/performance/quick-read/src/quick-read-mem-types.h xlators/performance/quick-read/src/quick-read.c xlators/performance/quick-read/src/quick-read.h xlators/performance/read-ahead/src/page.c xlators/performance/read-ahead/src/read-ahead-mem-types.h xlators/performance/read-ahead/src/read-ahead.c xlators/performance/read-ahead/src/read-ahead.h xlators/performance/symlink-cache/src/symlink-cache.c xlators/performance/write-behind/src/write-behind-mem-types.h xlators/performance/write-behind/src/write-behind.c xlators/protocol/auth/addr/src/addr.c ¹ xlators/protocol/auth/login/src/login.c ¹ xlators/protocol/client/src/client-callback.c xlators/protocol/client/src/client-handshake.c xlators/protocol/client/src/client-helpers.c xlators/protocol/client/src/client-lk.c xlators/protocol/client/src/client-mem-types.h xlators/protocol/client/src/client.c xlators/protocol/client/src/client.h xlators/protocol/client/src/client3_1-fops.c ¹ Copyright only, license reverted to original Change-Id: If560e826c61b6b26f8b9af7bed6e4bcbaeba31a8 BUG: 820551 Signed-off-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.com/3304 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Fix inodelk-trace logs to print lk-ownersPranith Kumar K2012-05-071-11/+11
| | | | | | | | | Change-Id: Icc983effcf1b6283410a162f260755e97d41ee65 BUG: 810502 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3228 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Perform Flush with lk-owner given by parent xlator.Pranith Kumar K2012-05-071-0/+3
| | | | | | | | | | | | Lk-owner of posix-lk and flush should be same, flush can't clear posix-lks without that lk-owner. Change-Id: If775abb5741a0beb00c419b54d023fbd429e3cb7 BUG: 810502 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3221 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Fix race in nonblocking entrylkPranith Kumar K2012-05-031-6/+6
| | | | | | | | | Change-Id: I6d96c1aed4bf08b90b6918e3694c688eccdc2445 BUG: 818578 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/3270 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: adding extra data for fopsAmar Tumballi2012-03-221-38/+43
| | | | | | | | | | | | | with this change, the xlator APIs will have a dictionary as extra argument, which is passed between all the layers. This can be utilized for overloading in some of the operations. Change-Id: I58a8186b3ef647650280e63f3e5e9b9de7827b40 Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 782265 Reviewed-on: http://review.gluster.com/2960 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Logs: Improved logs in lock/unlock execution pathPranith Kumar K2012-03-181-1/+1
| | | | | | | | | | | Statedump will now start showing the lk-owner of the stack. Change-Id: I9f650ce9a8b528cd626c8bb595c1bd1050462c86 BUG: 803209 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/2968 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cluster/afr: Enable eager-lockPranith Kumar K2012-03-171-99/+134
| | | | | | | | | | | | | | | | | | Eager-lock is disabled by default. Use cluster.eager-lock on/off to change the config. write-behind on and eager-lock off is not supported configuration. In afr, when eager-lock is enabled the inode lock on fd is taken using the fd address as the lk-owner. So the lock is interchangableale between the inode-locks on the same fd. Change-Id: I7eef1ecd510f8028f5395dee882782da53c0de3f BUG: 802515 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/2925 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* Revert "afr: [Un]Set the 'right' lkowner for [f]{inode|entry}_lk and the ↵Vijay Bellur2012-03-031-80/+11
| | | | | | | | | | | 'enclosed' fop." This reverts commit 2e80fdbeb6abbb23ff6789c2b98c82704883af0a. Change-Id: I417fd43e4195d63e5b8b83dd3beb712887130e1e Reviewed-on: http://review.gluster.com/2860 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Handle errors in build_parent_locPranith Kumar K2012-03-011-1/+1
| | | | | | | | | BUG: 787671 Change-Id: I0b01b0f9e14a26d757748413dd71909e915c7573 Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Reviewed-on: http://review.gluster.com/2826 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* afr: [Un]Set the 'right' lkowner for [f]{inode|entry}_lk and the 'enclosed' fop.Krishnan Parthasarathi2012-03-011-11/+80
| | | | | | | | | | | | | | afr 'mangles' the lkowner inorder to ensure [f]inodelk/[f]entrylk fops from the same application contend. But other fops that are 'visible' to the application should operate with the lkowner provided by fuse for correct functioning of posix-locks xlator. Change-Id: I7e71f35ae7df2a070f1f46d4fc77eed26a717673 BUG: 790743 Signed-off-by: Krishnan Parthasarathi <kp@gluster.com> Reviewed-on: http://review.gluster.com/2752 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* core: utilize mempool for frame->local allocationsAmar Tumballi2012-02-211-1/+1
| | | | | | | | | | | | | | | in each translator, which uses 'frame->local', we are using GF_CALLOC/GF_FREE, which would be costly considering the number of allocation happening in a lifetime of 'fop'. It would be good to utilize the mem pool framework for xlator's local structures, so there is no allocation overhead. Change-Id: Ida6e65039a24d9c219b380aa1c3559f36046dc94 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 765336 Reviewed-on: http://review.gluster.com/2772 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* core: change lk-owner as a 1k bufferAmar Tumballi2012-01-241-13/+14
| | | | | | | | | | | | | so, NLM can send the lk-owner field directly to the locks translators, while doing the same effort, also enabled sending maximum of 500 aux gid over protocol. Change-Id: I87c2514392748416f7ffe21d5154faad2e413969 Signed-off-by: Amar Tumballi <amar@gluster.com> BUG: 767229 Reviewed-on: http://review.gluster.com/779 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/afr: Handle error cases in local initPranith Kumar K2011-12-281-14/+12
| | | | | | | | | | | | - Fop should unwind with appropriate errno - Local is de-allocated on errors Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Change-Id: I4db40342ae184fe1cc29e51072e8fea72ef2cb15 BUG: 770513 Reviewed-on: http://review.gluster.com/2539 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Remove treating sh_frame as special loop_framePranith Kumar K2011-11-231-0/+25
| | | | | | | | Change-Id: I0d87f06f989b2d4b971967c52d4898331693a801 BUG: 3675 Reviewed-on: http://review.gluster.com/735 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* glusterfs: An effort to fix all the spell mistakes and typoHarshavardhana2011-11-161-4/+4
| | | | | | | | | | | | | | | in the entire glusterfs codebase. This patch fixes many of spell mistakes and typo in the entire glusterfs codebase and all supported modules. Change-Id: I83238a41aa08118df3cf4d1d605505dd3cda35a1 BUG: 3809 Signed-off-by: Harshavardhana <fharshav@redhat.com> Reviewed-on: http://review.gluster.com/731 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* build: warning suppression (round n)Amar Tumballi2011-10-201-4/+5
| | | | | | | | | | with this patch, there are no more warnings with gcc (GCC) 4.6.1 20110908 Change-Id: Ice0d52d304b9846395f8a4a191c98eb53125f792 BUG: 2550 Reviewed-on: http://review.gluster.com/607 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* Second round of warning suppression.Jeff Darcy2011-09-291-2/+0
| | | | | | | | | | | | | | | | | | | | | | Used a #pragma to kill ~170 in rpcgen code. Added GF_UNUSED to deal with a few more from macros elsewhere. The remainder are function return values (mostly context and dict calls) that really should be checked. Those would be harder to fix without real understanding of the code where they occur, so they remain as reminders. (Patchset 2: deal with older gcc that doesn't handle #pragma GCC diagnostic) (Patchset 3: fix include paths in generated files) (Patchset 4: keep up with trunk, squash 9 new warnings) (Patchset 5: six more, all in AFR) Change-Id: I29760c8c81be4d7e6489312c5d0e92cc24814b7b BUG: 2550 Reviewed-on: http://review.gluster.com/378 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: Make local->child_up immutablePranith Kumar K2011-09-211-4/+0
| | | | | | | | | | | | | | | | | | | | | | | Afr transaction performs lock, pre-op, op, post-op and unlock steps in that order. The child_up[] is overloaded with the information of where all the first two steps succeeded. This works perfectly fine for Transaction, but the locking/unlocking part of the code is re-used by data self-heal. In that each loop_frame does lock, rchecksum, read-from-source and write-to-sinks, unlock steps. Rchecksum fop assumes that the fop needs to happen on one source + all sinks and sets the call_count to that number. But if the lock step fails on any of the sinks it will mark the child_up of that child to 0, which will result in call_count mismatch and the frame will hang thinking that some more cbks need to come. When this happens loop_frame will never go to unlock step leading to hangs on that file. Change-Id: I3dd0449cc6193a980bacf637d935881f4b22210a BUG: 3597 Reviewed-on: http://review.gluster.com/474 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: eager locking of FD writesAnand Avati2011-09-081-51/+142
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is a change in the way write transactions hold a lock which optimizes the case of sequential writes from a single writer. Lock phase of a transaction has two sub-phases. First is an attempt to acquire locks in parallel by broadcasting non-blocking lock requests. If lock aquistion fails on any server, then the held locks are unlocked and revert to a blocking locked mode sequentially on one server after another. The change in this patch is to make the initial broadcasting lock request attempt to acquire lock on the entire file. If this fails, we revert back to the sequential "regional" blocking lock as before. In the case where such an "eager" lock is granted in the non-blocking phase, it gives rise to an opportunity for optimization. i.e, if the next write transaction on the same FD arrives before the unlock phase of the first transaction, it "takes over" the full file lock. Similarly if yet another transaction arrives before the unlock phase of the "optimized" transaction, that in turn "takes over" the lock as well. The actual unlock now happens at the end of the last "optimzed" transaction. Any operation which arrives before the unlock phase of the previous transaction is a potential candidate to become an "optimized" transaction. In cases where the previous transaction had aquired lock as a "regional" blocking lock, and the next transaction comes in before its unlock phase, then it would not be an "optimized" transaction. Implied assumption ------------------ Since two or more transactions can now operate within the same large lock, there is a possibility that overlapping transactions can arrive at oppoosite orders on the servers. However in the larger picture this is not possible as write-behind already ensures that no two overlapping writes on an inode are in transit at the same time. Overlapping writes across clients are not a problem as they compete at locks anyways. Theoretical benefits and potential harms ---------------------------------------- In case of a single writer: The benefits are large for sequential writes. In the best case the entire file write can happen with just one lock and unlock per server, provided writes are coming in fast enough and getting pipelined by write-behind soon enough (which is usually the case). If the writes are not coming in fast enough, then the optimization "kicks in" for only those subsets of writes which are close enough to get "piggybacked". For random writes the benefits are the same as well. In any case the overall performance is better than or equal to the performance without this optimization for a single writer. In case of multiple writers: When multiple writers are not writing concurrently, there is no negative performance impact. When multiple writers are writing concurrently to the same region, there is no negative impact either, as they were previously getting arbitrated at the locks translator too. In the case of multiple writers writing to different regions concurrently, there will be an increased number of "failovers" from failed parallel non-blocking to sequential blocking regional locks. This above "worst case" has a simple workaround that as soon as we detect > 1 open-fd-count in lookup xattr, we can disable this optimization on those fds. Beneficial side-effects ----------------------- There is another similar optimization in AFR for changelogs which goes by the name of "changelog-piggybacking". That works in a similar way where pending flags get 'taken over' or 'piggybacked' by the next transaction if its 'pre-op' phase kicks in before the 'post-op' phase of the previous transaction. It has been observed that this changelog-piggybacking optimization gives a saving of about ~55% savings of xattr calls hitting the wire, measured across various types of network interfaces. The side effect of this eager-lock optimization is that it gives an almost 100% saving of xattr calls by making the optimistic-changelog work much more efficiently as it gives a wider overlap of the xattr phases of two consecutive transactions. Change-Id: I41c02eb3b64c14c68ef66a344610ec3f024cd59d BUG: 3409 Reviewed-on: http://review.gluster.com/240 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* Eliminate many "var set but not used" warnings with newer gcc.Jeff Darcy2011-09-071-44/+0
| | | | | | | | | | | | | | | | This fixes ~200 such warnings, but leaves three categories untouched. (1) Rpcgen code. (2) Macros which set variables in the outer (calling function) scope. (3) Variables which are set via function calls which may have side effects. Change-Id: I6554555f78ed26134251504b038da7e94adacbcd BUG: 2550 Reviewed-on: http://review.gluster.com/371 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/afr: Perform self-heal without locking the whole filePranith Kumar K2011-08-201-46/+76
| | | | | | | | Change-Id: I206571c77f2d7b3c9f9d7bb82a936366fd99ce5c BUG: 3182 Reviewed-on: http://review.gluster.com/141 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Change Copyright current yearPranith Kumar K2011-08-101-1/+1
| | | | | | | | Change-Id: I2d10f2be44f518f496427f257988f1858e888084 BUG: 3348 Reviewed-on: http://review.gluster.com/200 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* LICENSE: s/GNU Affero General Public/GNU General Public/Pranith Kumar K2011-08-061-3/+3
| | | | | | | | Change-Id: I3914467611e573cccee0d22df93920cf1b2eb79f BUG: 3348 Reviewed-on: http://review.gluster.com/182 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/afr: Don't depend on fuse lk_owner for inodelksPranith K2011-07-171-6/+4
| | | | | | | | Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 3182 (Afr self-heal should happen with out big lock) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=3182
* cluster/afr: propagate proper errno returned by lock fopsAnand Avati2011-06-101-4/+0
| | | | | | | | | | If locks could not be held on any of the servers, then propagate the errno returned by the lock FOPs instead of hardcoding EAGAIN/EINVAL. Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2993 ([glusterfs-3.2.0qa2]: hang while doing the selfheal) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2993
* cluster/afr: Log errors in afr self-heal with GF_LOG_ERRORPranith Kumar K2011-06-081-1/+1
| | | | | | | | Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2986 (Failed operations should should be logged `E' or `W') URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2986
* cluster/afr: Send Non-blocking lock in non-blocking entrylkPranith Kumar K2011-05-301-1/+1
| | | | | | | | Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Signed-off-by: Anand Avati <avati@gluster.com> BUG: 2949 (self-heal hangs) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2949
* removed reference to GF_LOG_NORMALAmar Tumballi2011-04-071-4/+4
| | | | | | | | | | instead used GF_LOG_INFO, which is more standard log level. Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@gluster.com> BUG: 2669 (RuntimeError: cannot recognize log level "normal") URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2669
* cluster/afr: log enhancement - part 3Amar Tumballi2011-04-011-51/+37
| | | | | | | | Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2346 (Log message enhancements in GlusterFS - phase 1) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2346
* cluster/afr: white-space cleanup - part 2Amar Tumballi2011-03-311-186/+186
| | | | | | | | Signed-off-by: Amar Tumballi <amar@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 2346 (Log message enhancements in GlusterFS - phase 1) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2346
* cluster/replicate: fix warnings due to format string mismatches during ↵Raghavendra G2010-12-141-1/+1
| | | | | | | | | | invocation of gf_log. Signed-off-by: Raghavendra G <raghavendra@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 2211 ((re)introduce warnings for format string/parameter mismatch) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2211
* check the return value after setting the fd context in afrRaghavendra Bhat2010-12-121-2/+2
| | | | | | | | Signed-off-by: Raghavendra Bhat <raghavendrabhat@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 865 (Add locks recovery support in GlusterFS) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=865
* replicate: fix hang/missing frame during lockingAnand Avati2010-10-271-1/+15
| | | | | | | | | | | | nonblocking style locking would result in a missing frame when all subvolumes are down or when no subvolume on which fd was opened is up. Check for this condition and unlock gracefully Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 918 (AFR write fails when subvolumes' state is swapped) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=918
* Copyright changesVijay Bellur2010-10-111-1/+1
| | | | | | | | Signed-off-by: Vijay Bellur <vijay@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 971 (dynamic volume management) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=971
* features/locks: cluster/afr: Misc fixes for lock recovery.Pavan Sondur2010-10-051-2/+5
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 865 (Add locks recovery support in GlusterFS) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=865
* Change GNU GPL to GNU AGPLPranith K2010-10-041-3/+3
| | | | | | | | Signed-off-by: Pranith Kumar K <pranithk@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1388 () URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1388
* Set the lock owner properly for lock self heal.Pavan Sondur2010-10-021-0/+2
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 865 (Add locks recovery support in GlusterFS) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=865
* Changes to replace flock with gf_flock across GlusterFS.Pavan Sondur2010-10-011-13/+13
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 865 (Add locks recovery support in GlusterFS) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=865
* cluster/afr: Recover locks on child_up from source to sink.Pavan Sondur2010-10-011-0/+383
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 865 (Add locks recovery support in GlusterFS) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=865
* protocol/client: cluster/afr: Support lock recovery and self heal.Pavan Sondur2010-09-301-0/+101
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 865 (Add locks recovery support in GlusterFS) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=865
* replicate: replace first-write-to-flush optimizationAnand V. Avati2010-09-291-4/+0
| | | | | | | | | | | | use a changelog piggybacking optimization instead of first-write-to-flush optimization and do other cleanups (removal of post-post-op hook etc.) Signed-off-by: Anand V. Avati <avati@blackhole.gluster.com> Signed-off-by: Anand V. Avati <avati@amp.gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1235 (Bug for all pump/migrate commits) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1235
* cluster/afr: Fix hang in create when one subvol is down.Pavan Sondur2010-09-061-17/+26
| | | | | | | | Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1544 (Create fails when 1 server is down) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1544
* cluster/replicate: fix warnings during build.Raghavendra G2010-09-031-2/+2
| | | | | | | | Signed-off-by: Raghavendra G <raghavendra@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 960 () URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=960
* cluster/afr: Break STACK_WIND loop when the call count is reached.Pavan Sondur2010-08-311-0/+18
| | | | | | | | | | | Fix also has a check for self heal relevant to pump. Tested with dbench with AFR client and pump on server. Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Vijay Bellur <vijay@dev.gluster.com> BUG: 1443 (Crash in afr_nonblocking_entrylk_cbk) URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1443
* cluster/afr: Use 2 phase locking for transactions and self heal.Pavan Sondur2010-08-221-0/+1657
Signed-off-by: Pavan Vilas Sondur <pavan@gluster.com> Signed-off-by: Anand V. Avati <avati@dev.gluster.com> BUG: 960 () URL: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=960