summaryrefslogtreecommitdiffstats
path: root/rpc
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'upstream'HEADmasterJeff Darcy2014-04-2840-4014/+471
|\ | | | | | | | | | | | | | | | | | | | | | | Conflicts: rpc/xdr/src/glusterfs3-xdr.c rpc/xdr/src/glusterfs3-xdr.h xlators/features/changelog/src/Makefile.am xlators/features/changelog/src/changelog-helpers.h xlators/features/changelog/src/changelog.c xlators/mgmt/glusterd/src/glusterd-sm.c Change-Id: I9972a5e6184503477eb77a8b56c50a4db4eec3e2
| * glusterd/snapshot: Compare and update snapshots during peer handshakeAvra Sengupta2014-04-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During a peer-handshake, after the volumes have synced, and the list of missed snapshots have synced, the node will perform the pending deletes and restores on this list. At this point, the current snapshot list in the node will be updated, and hence in case of conflicts arising during snapshot handshake, the peer hosting the bricks will be given precedence Likewise, if there will be a conflict, and both peers will be in the same state, i.e either both would be hosting bricks or both would not be hosting bricks, then a decision can't be taken and a peer-reject will happen. glusterd_compare_and_update_snap() implements the following algorithm to perform the above task: Step 1: Start. Step 2: Check if the peer is missing a delete on the said snap. If yes, goto step 6. Step 3: Check if there is a conflict between the peer's data and the local snap. If no, goto step 5. Step 4: As there is a conflict, check if both the peer and the local nodes are hosting bricks. Based on the results perform the following: Peer Hosts Bricks Local Node Hosts Bricks Action Yes Yes Goto Step 7 No No Goto Step 7 Yes No Goto Step 8 No Yes Goto Step 6 Step 5: Check if the local node is missing the peer's data. If yes, goto step 9. Step 6: It's a no-op. Goto step 10 Step 7: Peer Reject. Goto step 10 Step 8: Delete local node's data. Step 9: Accept Peer Data. Step 10: Stop Change-Id: I79be0f0f5f2a4f5c72277a4e77c2be732af432e1 BUG: 1061685 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/7525 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| * gNFS: gNFS drc cache failed to detecte duplicates.Yuan Ding2014-04-281-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After the drc cache get full, message "DRC failed to detect duplicates" keep printed in the log. The root cause is drc_compare_reqs use the wrong compare type. This function should use type drc_cache_op_t as its input type. Since all rbtree related code (except function rpcsvc_drc_lookup) in drc cache pass drc_cache_op_t as compare type. Only rpcsvc_drc_lookup use type rpcsvc_request_t. It has been modified too. Change-Id: I925c097debe6b82f267986961fd4e7755f3de9af BUG: 1089676 Signed-off-by: Yuan Ding <beback198611@gmail.com> Reviewed-on: http://review.gluster.org/7519 Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
| * rpcgen: After recent changes parallel builds failedHarshavardhana2014-04-272-9/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Parallel builds failed due to make file would overrun xdrgen (internally xdrgen uses tempfiles to add License header). Seperate out header and source generation and add explicit dependency to fix it. Change-Id: Id20f548493540b0f17a2300f0775646f9f20789c BUG: 1090807 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/7572 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
| * glusterd/snapshot-handshake: Perform handshake of missed_snaps_list.Avra Sengupta2014-04-251-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In a handshake, create a union of the missed_snap_lists of the two peers. If an entry is present, its no op. If an entry is pendng, and the peer entry is done, mark own entry as done. If an entry is done, and the peer ertry is pending, its a no-op. If its a new entry, add it. Change-Id: Idbfa49cc34871631ba8c7c56d915666311024887 BUG: 1061685 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/7453 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| * rpcgen: 'hyper' is 64bit undefined on Darwin use quad_tHarshavardhana2014-04-253-69/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | "9819fcedf10f1430d4969c86e6df4dfe975b7dcf" After the commit i observed that we never used "hyper" as defined in .x previously in the .[c,h] files. Change-Id: I26152141bca6e789c4a3b3158fffe0a65e4b0878 BUG: 1090807 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/7566 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
| * rpcgen: Remove autogenerated files instead build on demandHarshavardhana2014-04-2529-7338/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | Avoid modifying autogenerated files and keeping them in repository - autogenerate them on demand from ".x" files Change-Id: I2cdb1fe9b99768ceb80a8cb100fa00bd1d8fe2c6 BUG: 1090807 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/7526 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
| * rpcsvc: Ignore INODELK/ENTRYLK/LK for throttlingPranith Kumar K2014-04-241-16/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: When iozone is in progress, number of blocking inodelks sometimes becomes greater than the threshold number of rpc requests allowed for that client (RPCSVC_DEFAULT_OUTSTANDING_RPC_LIMIT). Subsequent requests from that client will not be read until all the outstanding requests are processed and replied to. But because no more requests are read from that client, unlocks on the already granted locks will never come thus the number of outstanding requests would never come down. This leads to a ping-timeout on the client. Fix: Do not account INODELK/ENTRYLK/LK for throttling Change-Id: I59c6b54e7ec24ed7375ff977e817a9cb73469806 BUG: 1089470 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/7531 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
| * build: MacOSX Porting fixesHarshavardhana2014-04-2412-47/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git@forge.gluster.org:~schafdog/glusterfs-core/osx-glusterfs Working functionality on MacOSX - GlusterD (management daemon) - GlusterCLI (management cli) - GlusterFS FUSE (using OSXFUSE) - GlusterNFS (without NLM - issues with rpc.statd) Change-Id: I20193d3f8904388e47344e523b3787dbeab044ac BUG: 1089172 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Signed-off-by: Dennis Schafroth <dennis@schafroth.com> Tested-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Dennis Schafroth <dennis@schafroth.com> Reviewed-on: http://review.gluster.org/7503 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
| * gNFS: Support wildcard in RPC auth allow/rejectSantosh Kumar Pradhan2014-04-221-2/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RFE: Support wildcard in "nfs.rpc-auth-allow" and "nfs.rpc-auth-reject". e.g. *.redhat.com 192.168.1[1-5].* 192.168.1[1-5].*, *.redhat.com, 192.168.21.9 Along with wildcard, support for subnetwork or IP range e.g. 192.168.10.23/24 The option will be validated for following categories: 1) Anonymous i.e. "*" 2) Wildcard pattern i.e. string containing any ('*', '?', '[') 3) IPv4 address 4) IPv6 address 5) FQDN 6) subnetwork or IPv4 range Currently this does not support IPv6 subnetwork. Change-Id: Iac8caf5e490c8174d61111dad47fd547d4f67bf4 BUG: 1086097 Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com> Reviewed-on: http://review.gluster.org/7485 Reviewed-by: Poornima G <pgurusid@redhat.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* | Merge branch 'upstream'Jeff Darcy2014-04-2211-39/+729
|\| | | | | | | | | | | | | | | | | | | Conflicts: glusterfs.spec.in xlators/mgmt/glusterd/src/Makefile.am xlators/mgmt/glusterd/src/glusterd-utils.c xlators/mgmt/glusterd/src/glusterd.h Change-Id: I27bdcf42b003cfc42d6ad981bd2bf8180176806d
| * rdma: correct some spelling mistakesNiels de Vos2014-04-171-4/+4
| | | | | | | | | | | | | | | | | | | | Change-Id: Ib7018aa8a79d36ab942516457a79039cb3e67355 BUG: 1088849 Reported-by: Patrick Matthäi <pmatthaei@debian.org> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/7498 Reviewed-by: Vikhyat Umrao <vumrao@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
| * gluster: GlusterFS Volume Snapshot FeatureAvra Sengupta2014-04-117-26/+637
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the initial patch for the Snapshot feature. Current patch includes following features: * Snapshot create * Snapshot delete * Snapshot restore * Snapshot list * Snapshot info * Snapshot status * Snapshot config Change-Id: I2f46920c0d61c515f6a60e0f8b46fff886d9f6a9 BUG: 1061685 Signed-off-by: shishir gowda <sgowda@redhat.com> Signed-off-by: Sachin Pandit <spandit@redhat.com> Signed-off-by: Vijaikumar M <vmallika@redhat.com> Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/7128 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| * rpc: warn and truncate grouplist if RPC/AUTH can not hold everythingNiels de Vos2014-04-083-8/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The GlusterFS protocol currently uses AUTH_GLUSTERFS_V2 in the RPC/AUTH header. This header contains the uid, gid and auxiliary groups of the user/process that accesses the Gluster Volume. The AUTH_GLUSTERFS_V2 structure allows up to 65535 auxiliary groups to be passed on. Unfortunately, the RPC/AUTH header is limited to 400 bytes by the RPC specification: http://tools.ietf.org/html/rfc5531#section-8.2 In order to not cause complete failures on the client-side when trying to encode a AUTH_GLUSTERFS_V2 that would result in more than 400 bytes, we can calculate the expected size of the other elements: 1 | pid 1 | uid 1 | gid 1 | groups_len XX | groups_val (GF_MAX_AUX_GROUPS=65535) 1 | lk_owner_len YY | lk_owner_val (GF_MAX_LOCK_OWNER_LEN=1024) ----+------------------------------------------- 5 | total xdr-units one XDR-unit is defined as BYTES_PER_XDR_UNIT = 4 bytes MAX_AUTH_BYTES = 400 is the maximum, this is 100 xdr-units. XX + YY can be 95 to fill the 100 xdr-units. Note that the on-wire protocol has tighter requirements than the internal structures. It is possible for xlators to use more groups and a bigger lk_owner than that can be sent by a GlusterFS-client. This change prevents overflows when allocating the RPC/AUTH header. Two new macros are introduced to calculate the number of groups that fit in the RPC/AUTH header, when taking the size of the lk_owner in account. In case the list of groups exceeds the maximum possible, only the first groups are passed over the RPC/GlusterFS protocol to the bricks. A warning is added to the logs, so that most system administrators will get informed. The reducing of the number of groups is not a new inventions. The RPC/AUTH header (AUTH_SYS or AUTH_UNIX) that NFS uses has a limit of 16 groups. Most, if not all, NFS-clients will reduce any bigger number of groups to 16. (nfs.server-aux-gids can be used to workaround the limit of 16 groups, but the Gluster NFS-server will be limited to a maximum of 93 groups, or fewer in case the lk_owner structure contains more items.) Change-Id: I8410e59d0fd246d601b54b961d3ae9cb5a858c10 BUG: 1053579 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/7202 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
| * log: enhance gluster log format with message ID and standardize errno reportingShyamsundarR2014-03-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently there are quite a slew of logs in Gluster that do not lend themselves to trivial analysis by various tools that help collect and monitor logs, due to the textual nature of the logs. This FEAT is to make this better by giving logs message IDs so that the tools do not have to do complex log parsing to break it down to problem areas and suggest troubleshooting options. With this patch, a new set of logging APIs are introduced that take additionally a message ID and an error number, so as to print the message ID and the descriptive string for the error. New APIs: - gf_msg, gf_msg_debug/trace, gf_msg_nomem, gf_msg_callingfn These APIs follow the functionality of the previous gf_log* counterparts, and hence are 1:1 replacements, with the delta that, gf_msg, gf_msg_callingfn take additional parameters as specified above. Defining the log messages: Each invocation of gf_msg/gf_msg_callingfn, should provide an ID and an errnum (if available). Towards this, a common message id file is provided, which contains defines to various messages and their respective strings. As other messages are changed to the new infrastructure APIs, it is intended that this file is edited to add these messages as well. Framework enhanced: The logging framework is also enhanced to be able to support different logging backends in the future. Hence new configuration options for logging framework and logging formats are introduced. Backward compatibility: Currently the framework supports logging in the traditional format, with the inclusion of an error string based on the errnum passed in. Hence the shift to these new APIs would retain the log file names, locations, and format with the exception of an additional error string where applicable. Testing done: Tested the new APIs with different messages in normal code paths Tested with configurations set to gluster logs (syslog pending) Tested nomem variants, inducing the message in normal code paths Tested ident generation for normal code paths (other paths pending) Tested with sample gfapi program for gfapi messages Test code is stripped from the commit Pending work (not to be addressed in this patch (future)): - Logging framework should be configurable - Logging format should be configurable - Once all messages move to the new APIs deprecate/delete older APIs to prevent misuse/abuse using the same - Repeated log messages should be suppressed (as a configurable option) - Logging framework assumes that only one init is possible, but there is no protection around the same (in existing code) - gf_log_fini is not invoked anywhere and does very little cleanup (in existing code) - DOxygen comments to message id headers for each message Change-Id: Ia043fda99a1c6cf7817517ef9e279bfcf35dcc24 BUG: 1075611 Signed-off-by: ShyamsundarR <srangana@redhat.com> Reviewed-on: http://review.gluster.org/6547 Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* | Merge branch 'upstream'Jeff Darcy2014-03-242-60/+102
|\|
| * rpc: transport may be destroyed while rpc isn'tKrishnan Parthasarathi2014-03-052-60/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rpc_clnt object is destroyed after the corresponding transport object is destroyed. But rpc_clnt_reconnect, a timer driven function, refers to the transport object beyond its 'life'. Instead, using the embedded connection object prevents use after free problem wrt transport object. Also, access transport object under conn->lock. Change-Id: Iae28e8a657d02689963c510114ad7cb7e6764e62 BUG: 962619 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/6751 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* | Add GF_FOP_IPC for inter-translator communication.Jeff Darcy2014-03-114-44/+71
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several features - e.g. encryption, erasure codes, or NSR - involve multiple cooperating translators which sometimes need a "private" means of communication amongst themselves. Historically we've used virtual or synthetic xattrs, but that's not very elegant and clutters up the getxattr/setxattr path which must also handle real xattr requests. This new fop should address that. The only argument is an int32_t "op" which should be recognized by the target translator. It is recommended that translators using these feature follow some convention regarding the ops that they define, to avoid conflicts. Using a hash of the target translator's type string as a base for a series of ops would probably be a good start. Any other information can be passed in both directions using xdata. The default behavior for this fop, as with any other, is to pass through to FIRST_CHILD. That makes use of this fop "transparent" to other translators that were written before it existed, but it also means that it only really works with pass-through translators. If a routing translator (such as DHT) or a fan-out translator (such as AFR) is involved, the IPC might not reach its intended destination unless those translators are modified to forward IPC fops along all paths. If an IPC gets all the way to storage/posix it is considered an error, much like an uncaught exception. We don't actually *do* anything in that case, but we do flag it as an error in the log. Change-Id: I7f37c9247ee35536f8136c7aea758e6fe04616c4 Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
* socket: don't send notification 'up' on socket_writev failureKrishnan Parthasarathi2014-02-271-1/+2
| | | | | | | | Change-Id: If4e4b95fe025a412f25313d83c780046dfec5116 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/6716 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* rpc/socket: Avoid excessive INFO logs when SSL is not configured.Vijay Bellur2014-02-201-2/+4
| | | | | | | | | Change-Id: I7f4dd2ae4225c8d3783417d0c3d415178f04c0da BUG: 1067011 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/7031 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* rpc: Fix a crash due to NULL dereferenceVijay Bellur2014-02-161-2/+2
| | | | | | | | | Change-Id: Ib2bf6dd564fb7e754d5441c96715b65ad2e21441 BUG: 1065611 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/7007 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* Clean up some weirdness with the gf_resolve inet definesJustin Clift2014-02-131-7/+0
| | | | | | | | Change-Id: I6bf6101aa0b5d6624891a8ebed2ac1fec2e11e1c Reviewed-on: http://review.gluster.org/6948 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Fix possible leaks on failure code pathSantosh Kumar Pradhan2014-02-121-1/+14
| | | | | | | | | | | | | | | Fix the memory leaks in socket and glusterd in failure code paths reported by Coverity. CIDs: 1124777, 1124781, 124782 Change-Id: I63472c6b5900f308f19e64fc93bf7ed2f7b06ade BUG: 789278 Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com> Reviewed-on: http://review.gluster.org/6954 Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* protocol/server: do not do root-squashing for trusted clientsRaghavendra Bhat2014-02-102-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * As of now clients mounting within the storage pool using that machine's ip/hostname are trusted clients (i.e clients local to the glusterd). * Be careful when the request itself comes in as nfsnobody (ex: posix tests). So move the squashing part to protocol/server when it creates a new frame for the request, instead of auth part of rpc layer. * For nfs servers do root-squashing without checking if it is trusted client, as all the nfs servers would be running within the storage pool, hence will be trusted clients for the bricks. * Provide one more option for mounting which actually says root-squash should/should not happen. This value is given priority only for the trusted clients. For non trusted clients, the volume option takes the priority. But for trusted clients if root-squash should not happen, then they have to be mounted with root-squash=no option. (This is done because by default blocking root-squashing for the trusted clients will cause problems for smb and UFO clients for which the requests have to be squashed if the option is enabled). * For geo-replication and defrag clients do not do root-squashing. * Introduce a new option in open-behind for doing read after successful open. Change-Id: I8a8359840313dffc34824f3ea80a9c48375067f0 BUG: 954057 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-on: http://review.gluster.org/4863 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Volume locks and transaction specific opinfosAvra Sengupta2014-02-104-0/+170
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this patch we are replacing the existing cluster-wide lock taken on glusterds across the cluster, with volume locks which are also taken on glusterds across the cluster, but are volume specific. So with the volume locks we are able to perform more than one gluster operation at the same time, as long as the operations are being performed on different volumes. We maintain a global list of volume-locks (using a dict for a list) where the key is the volume name, and which saves the uuid of the originator glusterd. These locks are held and released per volume transaction. In order to acheive multiple gluster operations occuring at the same time, we also separate opinfos in the op-state-machine, as a part of this patch. To do so, we generate a unique transaction-id (uuid) per gluster transaction. An opinfo is then associated with this transaction id, which is used throughout the transaction. We maintain a run-time global list(using a dict) of transaction-ids, and their respective opinfos to achieve this. Upstream Feature Page: http://www.gluster.org/community/documentation/index.php/Features/glusterd-volume-locks Change-Id: Iaad505a854bac8de8f83beec0357eb6cde3f7ea8 BUG: 1011470 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5994 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: make sure quota enforcer has established connection with ↵Raghavendra G2014-01-251-0/+28
| | | | | | | | | | | | | | | | | quotad before marking quota as enabled. without this patch there is a window of time when quota is marked as enabled in quota-enforcer, but connection to quotad wouldn't have been established. Any checklimit done during this period can result in a failed fop because of unavailability of quotad. Change-Id: I0d509fabc434dd55ce9ec59157123524197fcc80 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 969461 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/6572 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cli: Add options to the CLI that let the user control the reset ofDawit Alemu2014-01-243-3/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | stats "volume profile info" automatically clears incremental stats. There isn't a command to: - fetch stats without clearing incremental stats and - clear cumulative and incremental stats This change introduces two arguments (i.e. peek and clear). 'clear' will wipe both incremental and cumulative stats. 'peek' fetches stats without wiping incremental stats. 'volume profile info peek' - fetches incremental and cumulative stats without wiping incremental stats 'volume profile info incremental peek' - fetches incremental stats without wiping incremental stats 'volume profile info clear' - clears both incremental and cumultiave stats Change-Id: I91834515ad672eca5f882809941147d7d997c4c9 BUG: 1047416 Signed-off-by: Dawit Alemu <dalemu@redhat.com> Reviewed-on: http://review.gluster.org/6620 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpc: use GF_FREE when a string is gf_strdup'd.Krishnan Parthasarathi2014-01-221-1/+1
| | | | | | | | Change-Id: I522c30a600e712be9cc09393104e228e4d8e13f5 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/6752 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* build: Start using library versioning for various librariesHarshavardhana2014-01-182-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to libtool three individual numbers stand for CURRENT:REVISION:AGE, or C:R:A for short. The libtool script typically tacks these three numbers onto the end of the name of the .so file it creates. The formula for calculating the file numbers on Linux and Solaris is /path/to/library/<library_name>.(C - A).(A).(R) As you release new versions of your library, you will update the library's C:R:A. Although the rules for changing these version numbers can quickly become confusing, a few simple tips should help keep you on track. The libtool documentation goes into greater depth. In essence, every time you make a change to the library and release it, the C:R:A should change. A new library should start with 0:0:0. Each time you change the public interface (i.e., your installed header files), you should increment the CURRENT number. This is called your interface number. The main use of this interface number is to tag successive revisions of your API. The AGE number is how many consecutive versions of the API the current implementation supports. Thus if the CURRENT library API is the sixth published version of the interface and it is also binary compatible with the fourth and fifth versions (i.e., the last two), the C:R:A might be 6:0:2. When you break binary compatibility, you need to set AGE to 0 and of course increment CURRENT. The REVISION marks a change in the source code of the library that doesn't affect the interface-for example, a minor bug fix. Anytime you increment CURRENT, you should set REVISION back to 0. Change-Id: Id72e74c1642c804fea6f93ec109135c7c16f1810 BUG: 862082 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/5645 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* gNFS: Set default outstanding RPC limit to 16Santosh Kumar Pradhan2014-01-152-23/+28
| | | | | | | | | | | | | With 64, NFS server hangs with large I/O load (~ 64 threads writing to NFS server). The test results from Ben England (Performance expert) suggest to set it as 16 instead of 64. Change-Id: I418ff5ba0a3e9fdb14f395b8736438ee1bbd95f4 BUG: 1008301 Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com> Reviewed-on: http://review.gluster.org/6696 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: ben england <bengland@redhat.com>
* socket: propogate connect failure in socket_event_handlerKrishnan Parthasarathi2014-01-141-1/+1
| | | | | | | | | | | | | | | | | This patch prevents spurious handling of pollin/pollout events on an 'un-connected' socket, when outgoing packets to its remote endpoint are 'dropped' using iptables(8) rules. For eg, iptables -I OUTPUT -p tcp --dport 24007 -j DROP Change-Id: I1d3f3259dc536adca32330bfb7566e0b9a521e3c BUG: 1048188 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/6627 Reviewed-by: Anand Avati <avati@redhat.com> Tested-by: Anand Avati <avati@redhat.com>
* rpc/auth: Avoid NULL dereference in rpcsvc_auth_request_init()Harshavardhana2014-01-086-46/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Code section is bogus! ------------------------------------------ 370: if (!auth->authops->request_init) 371: ret = auth->authops->request_init (req, auth->authprivate); ------------------------------------------ Seems to have been never been used historically since logically above code has never been true to actually execute "authops->request_init() --> auth_glusterfs_{v2,}_request_init()" On top of that under "rpcsvc_request_init()" verf.flavour and verf.datalen are initialized from what is provided through 'callmsg'. ------------------------------------------ req->verf.flavour = rpc_call_verf_flavour (callmsg); req->verf.datalen = rpc_call_verf_len (callmsg); /* AUTH */ rpcsvc_auth_request_init (req); return req; ------------------------------------------ So the code in 'auth_glusterfs_{v2,}_request_init()' performing this operation will over-write the original flavour and datalen. ------------------------------------------ if (!req) return -1; memset (req->verf.authdata, 0, GF_MAX_AUTH_BYTES); req->verf.datalen = 0; req->verf.flavour = AUTH_NULL; ------------------------------------------ Refactoring the whole code into a more understandable version and also avoiding a potential NULL dereference Change-Id: I1a430fcb4d26de8de219bd0cb3c46c141649d47d BUG: 1049735 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/6591 Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* gNFS: Small memory leak in rpcsvc_drc_init()Santosh Kumar Pradhan2014-01-031-25/+17
| | | | | | | | | | | | | | | | | | | | | | 1. The routine rpcsvc_drc_init() is only used while initialization of NFS xlator. It should just check for nfs.drc option and init DRC feature accordingly. If it's set to OFF, then rpcsvc_drc_init() allocates memory for svc.drc and set ret value to 0 and goes to out: block where drc is leaked. 2. rpcsvc_drc_init() should just allocate svc.drc and init it. Here svc.drc can never be valid. 3. If svc.drc gets init'd here, no point of checking for drc type here. Change-Id: I4085771cdb8c9c15d1b9c548b77929a37f27c124 BUG: 1047902 Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com> Reviewed-on: http://review.gluster.org/6628 Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* gNFS: Possible SEGV crash in NFS while DRC is OFFSantosh Kumar Pradhan2014-01-031-1/+1
| | | | | | | | | | | | | | | | In rpcsvc_submit_generic(), FILE: rpc/rpc-lib/src/rpcsvc.c, while caching the reply (DRC), the code does not check if DRC is ON and goes ahead assuming DRC is on and try to take a LOCK on drc. FIX: Put a check on svc->drc by rpcsvc_need_drc(). Change-Id: I52c57280487e6061c68fd0b784e1cafceb2f3690 BUG: 1048072 Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com> Reviewed-on: http://review.gluster.org/6632 Reviewed-by: Niels de Vos <ndevos@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpc/server: add anonuid and anongid options for root-squashNiels de Vos2013-12-303-4/+21
| | | | | | | | | | | | | | | | | | | Introduce new options to modify the behaviour of server.root-squash. With server.anonuid and server.anongid the uid/gid can be specified and the root user (uid=0 and gid=0) will be mapped to the given uid/gid instead of nfsnobody (uid=65534 and gid=65534). Many thanks to Vikhyat Umrao for writing the majority of the test-case! Change-Id: I6379a3d2ef52b9b9707f2f6f0529657580c8d779 BUG: 1043886 CC: Vikhyat Umrao <vumrao@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/6546 Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Reviewed-by: Vikhyat Umrao <vumrao@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* socket: unix socket connect path can't be greater than UNIX_PATH_MAX charactersKrishnan Parthasarathi2013-12-261-2/+2
| | | | | | | | | | Change-Id: I74788b63dd1c14507aa6d65182ea4b87a2e1f389 BUG: 1046308 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/6589 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpc,glusterd: Use rpc_clnt notifyfn to cleanup mydataKaushal M2013-12-162-2/+10
| | | | | | | | | | | | | | | | | | | | | | | rpc: - On a RPC_TRANSPORT_CLEANUP event, rpc_clnt_notify calls the registered notifyfn with a RPC_CLNT_DESTROY event. The notifyfn should properly cleanup the saved mydata on this event. - Break the reconnect chain when an rpc client is disabled. This will prevent new disconnect events which can lead to crashes. glusterd: - Added support for RPC_CLNT_DESTROY in glusterd_brick_rpc_notify - Use a common glusterd_rpc_clnt_unref() function throught glusterd in place of rpc_clnt_unref(). This function correctly gives up the big-lock before performing the unref. Change-Id: I93230441c5089039643fc9f5632477ef1b695348 BUG: 962619 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/5512 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/geo-rep: more glusterd and cli fixes for geo-rep.Ajeet Jha2013-12-121-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | -> handle option validation cases in reset case. -> Creating valid conf path when glusterd restarts. -> Reading the gsyncd worker thread status and displaying it. -> Displaying status-detail per worker. -> Fetch checkpoint info in geo-rep status. -> use-tarssh value validation added. misc: misc geo-rep fixes based on cluster, logrotate etc.. -> cluster/dht: fix 'stime' getxattr getting overwritten. -> cluster/afr: return max of 'stime' values in subvol. -> geo-rep-logrotate: Sending SIGHUP to geo-rep auxiliary. -> cluster/dht: fix convoluted logic while aggregating. -> cluster/*: fix 'stime' min/max fetch logic. Change-Id: I811acea0bbd6194797a3e55d89295d1ea021ac85 BUG: 1036552 Signed-off-by: Ajeet Jha <ajha@redhat.com> Reviewed-on: http://review.gluster.org/6405 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@gmail.com> Reviewed-by: Anand Avati <avati@redhat.com>
* cli: Add an option to fetch just incremental or cumulative I/0Dawit Alemu2013-12-031-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | information 'volume profile info' fetches both cumulative and incremental I/O statistics. There isn't a way to fetch just cumulative or incremental statistics. This change introduces two optional arguments, namely "incremental" and "cumulative", that can be tacked on to 'volume profile info'. In other words, the new command format is volume profile <VOLNAME> {start | info [incremental | cumulative] | stop} [nfs] 'volume profile info incremental' - fetches incremental stats 'volume profile info cumulative' - fetches cumulative stats 'volume profile info' - fetches incremental and cumulative stats Change-Id: I5ddb45d990542ea611d23d251efebfec46f472d0 BUG: 1030580 Signed-off-by: Dawit Alemu <dalemu@redhat.com> Reviewed-on: http://review.gluster.org/6264 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
* cli: More checks in rebalance status outputKaushal M2013-12-032-1/+3
| | | | | | | | | | Change-Id: Ibd2edc5608ae6d3370607bff1c626c8347c4deda BUG: 1031887 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/6337 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* socket: limit vector count to IOV_MAXAnand Avati2013-11-261-2/+2
| | | | | | | | | | | | | IOV_MAX is the maximum supported vector count on a given platform. Limit the count to IOV_MAX if higher. As we are performing non-blocking IO getting a smaller return value is handled naturally. Change-Id: I94ef67a03ed0e10da67a776af2b55506bf721611 BUG: 1034398 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6354 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@gmail.com>
* cli/glusterd: Changes to quota command Quota featureRaghavendra G2013-11-262-13/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | re-work. Following are the cli commands that are new/re-worked: ====================================================== volume quota <VOLNAME> {enable|disable|list [<path> ...]|remove <path>| default-soft-limit <percent>} | volume quota <VOLNAME> {limit-usage <path> <size> [<percent>]} | volume quota <VOLNAME> {alert-time|soft-timeout|hard-timeout} {<time>} volume status [all | <VOLNAME> [nfs|shd|<BRICK>|quotad]] [detail|clients|mem|inode|fd|callpool] volume statedump <VOLNAME> [nfs|quotad] [all|mem|iobuf|callpool|priv|fd|inode|history] glusterd changes: ================= * Quota limits are now set as extended attributes by glusterd from the aux mount created by the cli. * The gfids of the directories on which quota limits are set for a given volume are stored in /var/lib/glusterd/vols/<volname>/quota.conf file in binary format, and whose cksum and version is stored in /var/lib/glusterd/vols/<volname>/quota.cksum. Original-author: Krutika Dhananjay <kdhananj@redhat.com> Original-author: Krishnan Parthasarathi <kparthas@redhat.com> BUG: 969461 Change-Id: If32bba36c67f9c2a30417af9c6389045b2b7c13b Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/6003 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* features/quota: Improvements to quotaRaghavendra G2013-11-261-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Two stages of quota enforcement is done: Soft and hard quota Upon reaching soft quota limit on the directory it logs/alerts in the quota daemon log (ie DEFAULT_LOG_DIR/quotad.log) and no more writes allowed after hard quota limit. After reaching the soft-limit the daemon alerts the user/admin repeatively for every 'alert-time', which is configurable. * Quota enforcer is moved to server-side. It takes care of enforcing quota. Since enforcer doesn't have the cluster view, it relies on another service called quota-aggregator. Aggregator, on query can return the size of a directory based on the cluster view. Enforcer is always loaded in the server graph and is by passed if the feature is not enabled. Options specific to enforcer: server-quota - Specifies whether the feature is on/off. It is used to by pass the quota if turned off. deem-statfs - If set to on, it takes quota limits into consideration while estimating fs size. (df command). The algorithm followed is, i. Adjust statvfs based on limit configured on root. ii. If limit is set on the inode passed, use size/limits on that inode to populate statvfs. Otherwise, use size/limits configured on root. iii. Upon statvfs, update the ctx->size on the inode. iv. Don't let DHT aggregate, instead take the maximum of the usages from the subvols of the DHT, since each of it contains the complete information. Enforcer also makes use of gfid-to-path conversion functionality to work correctly when a client like nfs predominently relies on nameless lookups. * Quota Aggregator acts as a thin client to provide cluster view Its a lightweight *gluster client* process with no mount point, started upon enabling quota or restarting the volume. This is a single process run on each brick, which can answer queries on all volumes in the cluster. Its volfile stored in GLUSTERD_DEFAULT_WORKING_DIR/quotad/quotad.vol. Credits: Raghavendra Bhat <rabhat@redhat.com> Varun Shastry <vshastry@redhat.com> Shishir Gowda <sgowda@redhat.com> Kruthika Dhananjay <kdhananj@redhat.com> Brian Foster <bfoster@redhat.com> Krishnan Parthasarathi <kparthas@redhat.com> Change-Id: Id1cb25b414951da34c665a55f77385d482e0f9de BUG: 969461 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/5952 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* gNFS: More clean up for Gluster NFSSantosh Kumar Pradhan2013-11-252-39/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 1) Fix the typo in NFS default ACL The typo was introduced as part of the Fix to BZ 1009210 i.e. http://review.gluster.org/5980. The user ACL xattr structure was passed to default ACL xattr. 2) Clean up NFS code to avoid unnecessary SEGV in rpcsvc_drc_reconfigure() which was not validating the svc->drc. Add a routine rpcsvc_drc_deinit() to handle the clean up of DRC specific data structures. For init(), use rpcsvc_drc_init(). 3) nfs_init_state() was returning wrong value even if the registration with portmapper failed, causing the NFS server process to hang around. As a result it used to get SEGV during rpcsvc_drc_reconfigure(). 4) Clean up memfactor usage across nfs.c nfs3.c. Change-Id: I5cea26cb68dd8a822ec0ae104952f67fe63fa703 BUG: 1009210 Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com> Reviewed-on: http://review.gluster.org/6329 Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* gNFS: RFE for NFS connection behaviorSantosh Kumar Pradhan2013-11-147-39/+269
| | | | | | | | | | | | | | | Implement reconfigure() for NFS xlator so that volume set/reset wont restart the NFS server process. But few options can not be reconfigured dynamically e.g. nfs.mem-factor, nfs.port etc which needs NFS to be restarted. Change-Id: Ic586fd55b7933c0a3175708d8c41ed0475d74a1c BUG: 1027409 Signed-off-by: Santosh Kumar Pradhan <spradhan@redhat.com> Reviewed-on: http://review.gluster.org/6236 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
* bd_map: Remove bd_map xlatorM. Mohan Kumar2013-11-131-10/+0
| | | | | | | | | | | Remove bd_map xlator and CLI related changes. Change-Id: If7086205df1907127c1a1fa4ba603f1c48421d09 BUG: 1028672 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/5747 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterfs: zerofill supportM. Mohan Kumar2013-11-104-0/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in the specified range. This fop will be useful when a whole file needs to be initialized with zero (could be useful for zero filled VM disk image provisioning or during scrubbing of VM disk images). Client/application can issue this FOP for zeroing out. Gluster server will zero out required range of bytes ie server offloaded zeroing. In the absence of this fop, client/application has to repetitively issue write (zero) fop to the server, which is very inefficient method because of the overheads involved in RPC calls and acknowledgements. WRITESAME is a SCSI T10 command that takes a block of data as input and writes the same data to other blocks and this write is handled completely within the storage and hence is known as offload . Linux ,now has support for SCSI WRITESAME command which is exposed to the user in the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to implement this fop. Thus zeroing out operations can be completely offloaded to the storage device , making it highly efficient. The fop takes two arguments offset and size. It zeroes out 'size' number of bytes in an opened file starting from 'offset' position. This patch adds zerofill support to the following areas: - libglusterfs - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi Client applications can exloit this fop by using glfs_zerofill introduced in libgfapi.FUSE support to this fop has not been added as there is no system call for this fop. Changes from previous version 3: * Removed redundant memory failure log messages Changes from previous version 2: * Rebased and fixed build error Changes from previous version 1: * Rebased for latest master TODO : * Add zerofill support to trace xlator * Expose zerofill capability as part of gluster volume info Here is a performance comparison of server offloaded zeofill vs zeroing out using repeated writes. [root@llmvm02 remote]# time ./offloaded aakash-test log 20 real 3m34.155s user 0m0.018s sys 0m0.040s [root@llmvm02 remote]# time ./manually aakash-test log 20 real 4m23.043s user 0m2.197s sys 0m14.457s [root@llmvm02 remote]# time ./offloaded aakash-test log 25; real 4m28.363s user 0m0.021s sys 0m0.025s [root@llmvm02 remote]# time ./manually aakash-test log 25 real 5m34.278s user 0m2.957s sys 0m18.808s The argument log is a file which we want to set for logging purpose and the third argument is size in GB . As we can see there is a performance improvement of around 20% with this fop. Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18 BUG: 1028673 Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com> Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/5327 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpcsvc: implement per-client RPC throttlingAnand Avati2013-10-286-0/+107
| | | | | | | | | | | | | | | | Implement a limit on the total number of outstanding RPC requests from a given cient. Once the limit is reached the client socket is removed from POLL-IN event polling. Change-Id: I8071b8c89b78d02e830e6af5a540308199d6bdcd BUG: 1008301 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6114 Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* rpc: add remote peer's hostname to call_bail log msgsKrishnan Parthasarathi2013-10-171-3/+4
| | | | | | | | | | Change-Id: I982cf7619463983c04b401d70a76635991d072d2 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/6091 Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Kaushal M <kaushal@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Harshavardhana <harsha@harshavardhana.net>
* libglusterfs: Add monotonic clocking counter for timer threadHarshavardhana2013-10-151-11/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gettimeofday() returns the current wall clock time and timezone. Using these functions in order to measure the passage of time (how long an operation took) therefore seems like a no-brainer. This time suffer's from some limitations: a. They have a low resolution: “High-performance” timing by definition, requires clock resolutions into the microseconds or better. b. They can jump forwards and backwards in time: Computer clocks all tick at slightly different rates, which causes the time to drift. Most systems have NTP enabled which periodically adjusts the system clock to keep them in sync with “actual” time. The adjustment can cause the clock to suddenly jump forward (artificially inflating your timing numbers) or jump backwards (causing your timing calculations to go negative or hugely positive). In such cases timer thread could go into an infinite loop. From 'man gettimeofday': ---------- .. .. The time returned by gettimeofday() is affected by discontinuous jumps in the system time (e.g., if the system administrator manually changes the system time). If you need a monotonically increasing clock, see clock_gettime(2). .. .. ---------- Rationale: For calculating interval timing for Timer thread, all that’s needed should be clock as a simple counter that increments at a stable rate. This is necessary to avoid the jumps which are caused by using "wall time", this counter must be monotonic that can never “tick” backwards, ever. Change-Id: I701d31e71a85a73d21a6c5cd15583e7a5a645eeb BUG: 1017993 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/6070 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>