path: root/xlators
Commit message (Author, Age, Files, Lines)
...
* storage/posix: Filter custom getxattrs in lookup (Pranith Kumar K, 2011-09-20, 1 file, -2/+33)
| | | | | | | | Change-Id: If948ff1b355ea4fd92036bcc43e7b32325aeb3e4 BUG: 3470 Reviewed-on: http://review.gluster.com/325 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/distribute: minor fixes in open file migration (Amar Tumballi, 2011-09-19, 5 files, -37/+91)
| | | | | | | | | | | * incorporated Avati's comments on the first patch. * send proper stat information while unwinding Change-Id: I36982cec610753c241c372272620ab2bd581fd9f BUG: 3071 Reviewed-on: http://review.gluster.com/408 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* features/locks: free the string allocated by inode_path (Raghavendra Bhat, 2011-09-19, 1 file, -3/+8)
| | | | | | | | Change-Id: I1b7d4059610713b92c4bb78676c3b48335e3a0fe BUG: 3468 Reviewed-on: http://review.gluster.com/465 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* glusterd: run 'volume top read-perf/write-perf' in different thread (Kaushal M, 2011-09-19, 1 file, -195/+0)
| | | | | | | | | | | | | | | Runs the 'volume top read-perf/write-perf' operations in a different thread without blocking glusterd. Prevents glusterd from becoming unresponsive when large values of 'bs' and 'count' are given. Also increases the cli timeout for top/profile commands from 120s to 300s, to allow large i/o top read-perf and write-perf runs to return a result. Change-Id: I4b7de1d735f33643d836772db7f25133f112b75a BUG: 2720 Reviewed-on: http://review.gluster.com/375 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shishir Gowda <shishirng@gluster.com>
* s@GFS_PREFIX"/sbin@SBIN_DIR@ (Csaba Henk, 2011-09-19, 7 files, -12/+11)
| | | | | | | | | | | $sbindir is the install path for gluster* binaries, so this is what should be used in their invocation Change-Id: Ie748b4cbf59c3ee77f721ff6e0ab7151742ce0ab BUG: 2825 Reviewed-on: http://review.gluster.com/458 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* glusterd: provide an option to start processes with valgrind (Rajesh Amaravathi, 2011-09-19, 3 files, -6/+50)
| | | | | | | | | | | | By enabling the brick-with-valgrind option in glusterd, one can automatically start all bricks with valgrind monitoring them. Change-Id: Ib0a97a83c4461c0878454e96bc84462f6cad6bc8 BUG: 3461 Reviewed-on: http://review.gluster.com/311 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* glusterd/top: volume top succeeds on partial brickpath (Rajesh Amaravathi, 2011-09-19, 7 files, -60/+79)
| | | | | | | | | | | | | | Rewrite of glusterd_volume_brickinfo_get in glusterd-utils.c An additional argument to glusterd_volume_brick_info_get_by_brick and glusterd_volume_brickinfo_get enables matching brick path in two ways: Complete or partial(ancestor and descendent paths matched). Change-Id: Ia87833a6f0c139599c3e40b59d60c64281b4084b BUG: 3271 Reviewed-on: http://review.gluster.com/162 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Shishir Gowda <shishirng@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
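The two matching modes mentioned above (complete, or partial where ancestor and descendant paths also match) can be sketched as follows. This is only an illustration: the enum, the function names and the handling of trailing slashes are assumptions, not the actual signature of glusterd_volume_brickinfo_get.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical match modes; the names are illustrative only. */
enum path_match { PATH_COMPLETE, PATH_PARTIAL };

/* True if `anc` equals `path` or is a directory prefix of it.
 * Trailing slashes on `anc` are ignored. */
static bool is_ancestor_or_same(const char *anc, const char *path)
{
    size_t n = strlen(anc);

    while (n > 1 && anc[n - 1] == '/')
        n--;
    if (strncmp(anc, path, n) != 0)
        return false;
    /* The common prefix must end on a path-component boundary. */
    return path[n] == '\0' || path[n] == '/' || (n == 1 && anc[0] == '/');
}

/* Complete mode: exact string match.
 * Partial mode: either path may be an ancestor of the other. */
bool brick_path_matches(const char *brick, const char *query,
                        enum path_match mode)
{
    assert(brick != NULL && query != NULL);
    if (mode == PATH_COMPLETE)
        return strcmp(brick, query) == 0;
    return is_ancestor_or_same(brick, query) ||
           is_ancestor_or_same(query, brick);
}
```

With these assumptions, "/export/b1" matches "/export" and "/export/b1/dir" in partial mode, but not "/export/b10", because the shared prefix does not end at a component boundary.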
* glusterd: make sort portable (Rajesh Amaravathi, 2011-09-19, 1 file, -8/+1)
| | | | | | | | | | fixed for fd leaking. reopening of file was not needed BUG: 3491 Change-Id: I1351bdcaa41a5901574f5e779c33bf6f80a938f9 Reviewed-on: http://review.gluster.com/453 Reviewed-by: Csaba Henk <csaba@gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* protocol/client: minor log enhancements (Rajesh Amaravathi, 2011-09-19, 1 file, -21/+14)
| | | | | | | | | | minor changes to the log enhancements of bug 3473. Change-Id: Id38d29db5a744e0ab7342d10ead6d16866228062 BUG: 3473 Reviewed-on: http://review.gluster.com/452 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* Fix typo in log message. (Sachidananda Urs, 2011-09-19, 1 file, -1/+1)
| | | | | | | | Change-Id: Ia51ffe03c8b94ddfe21c6609bc0d54b5bd29eca7 BUG: 3158 Reviewed-on: http://review.gluster.com/392 Reviewed-by: Vijay Bellur <vijay@gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/marker: Use appropriate loc struct to do removexattr on newpath after rename. (Raghavendra G, 2011-09-19, 1 file, -1/+12)
| | | | | | | | | | Change-Id: I060e62c1fbb288179063a6d64d73bad1a6572661 BUG: 3493 Reviewed-on: http://review.gluster.com/390 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* glusterd: make sort portable (Rajesh Amaravathi, 2011-09-18, 1 file, -9/+21)
| | | | | | | | | | | | | The result of sorting the volume info file has been programmatically redirected, instead of using the -o option. Change-Id: Id789fab8dc92b254571a4fc7239e4872f3ac055f BUG: 3491 Reviewed-on: http://review.gluster.com/395 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* glusterfsd: enable max fetch attempts (Kaushal M, 2011-09-18, 1 file, -1/+7)
| | | | | | | | | | | | | | Enables usage of 'volfile-max-fetch-attempts' option of glusterfsd. Also, adds an option to 'mount.glusterfs' for setting the max fetch attempts. For a server with multiple ips, each call to gf_resolve_ip6() returns a different ip. Since gf_resolve_ip6() is called for each fetch attempt, this change also enables rrdns support for gluster. Change-Id: I3edadbf0ff43ff414b30eb50dd9ca4a6fd6b1089 BUG: 2441 Reviewed-on: http://review.gluster.com/239 Reviewed-by: Amar Tumballi <amar@gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
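The interaction between a bounded number of fetch attempts and a resolver that hands out a different address per call can be sketched like this. The address list stands in for what round-robin resolution would return; fetch_volfile, fetch_fn and fetch_only_from are hypothetical names, not glusterfsd functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callback that tries to fetch the volfile from one address.
 * Returns 0 on success, non-zero on failure. */
typedef int (*fetch_fn)(const char *addr, void *ctx);

/* Try up to max_attempts fetches, moving to the next address on every
 * attempt, the way a round-robin resolver would hand them out.
 * Returns the 1-based attempt number that succeeded, or -1. */
int fetch_volfile(const char *const *addrs, size_t naddrs,
                  int max_attempts, fetch_fn fetch, void *ctx)
{
    assert(addrs != NULL && naddrs > 0 && fetch != NULL);
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        const char *addr = addrs[(size_t)attempt % naddrs];

        if (fetch(addr, ctx) == 0)
            return attempt + 1;
    }
    return -1;
}

/* Example callback: only the address passed in ctx is reachable. */
int fetch_only_from(const char *addr, void *ctx)
{
    return strcmp(addr, (const char *)ctx) == 0 ? 0 : -1;
}
```

In this model, a server with three addresses of which only the third is reachable needs at least three attempts, which is why a single fetch attempt cannot benefit from rrdns.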
* performance/quick-read: fix memory corruption. (Raghavendra G, 2011-09-18, 1 file, -4/+0)
| | | | | | | | | | | | - macro QR_STACK_UNWIND destroys the stub present in local and hence no need of explicitly calling call_stub_destroy on it. Change-Id: Ib81c9a0d382765e783722b14fdbd7877086b1bec BUG: 3562 Reviewed-on: http://review.gluster.com/439 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol/client: log enhancements (Rajesh Amaravathi, 2011-09-18, 1 file, -90/+109)
| | | | | | | | | | | * print paths wherever it is possible to log, to help debugging. * bring uniformity in log level. Change-Id: I2fa85b629de5dd0f0057ed96cba08ecb0ff1a798 BUG: 3473 Reviewed-on: http://review.gluster.com/328 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* glusterd: profile cmd incorrectly reports all bricks down. (Krishnan Parthasarathi, 2011-09-15, 4 files, -30/+19)
| | | | | | | | | | | | If there are no bricks of a volume running 'local' to glusterd where the 'profile info' command is issued, glusterd incorrectly reports that all bricks of the volume are down. Change-Id: Idd703c991f0bcf59b76b9ef8f4ad8cd71960a55b BUG: 3553 Reviewed-on: http://review.gluster.com/430 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* Proactive self heal process implementation (Pranith Kumar K, 2011-09-14, 23 files, -523/+1592)
| | | | | | | | Change-Id: I96db0d94566ceabf1649f890318363f738c06553 BUG: 2458 Reviewed-on: http://review.gluster.com/403 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* debug/io-stats: Allow multiple children in graph (Pranith Kumar K, 2011-09-14, 1 file, -2/+2)
| | | | | | | | Change-Id: Ie4fb75d8000ff95daa8bf9f6757926822de28a65 BUG: 2458 Reviewed-on: http://review.gluster.com/401 Reviewed-by: Vijay Bellur <vijay@gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* features/quota: explicitly create xattrs in marker_create_cbk (Raghavendra G, 2011-09-14, 2 files, -3/+11)
| | | | | | | | | | | | | | - the earlier approach of creating quota related xattrs through side-effect of updating size and contribution values won't work, since when no contribution xattr is present, the updation process treats contribution value as zero and hence will be equal to size of freshly created files Change-Id: If9b2063b1ac3a4cf50d3fe2c81e907bc8eccb677 BUG: 3531 Reviewed-on: http://review.gluster.com/385 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Mohammed Junaid <junaid@gluster.com>
* features/quota: implement mknod fop. (Raghavendra G, 2011-09-14, 1 file, -0/+142)
| | | | | | | | Change-Id: If8f2a0bb635160ee78f35787ee9f8a4db87ae8ac BUG: 3531 Reviewed-on: http://review.gluster.com/384 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Mohammed Junaid <junaid@gluster.com>
* GlusterFS Hadoop specific DSL for mountbroker (Venky Shankar, 2011-09-13, 3 files, -12/+65)
| | | | | | | | Change-Id: Ie379992bdea0974c8c5e1a4d7bc3e87cefe0d256 BUG: 3539 Reviewed-on: http://review.gluster.com/404 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* glusterd rebalance: fix minor issues (Amar Tumballi, 2011-09-13, 2 files, -35/+42)
| | | | | | | | | | | | | | | there were bugs introduced due to parallelizing rebalance op. * argument to dict_set_str () should be static as for the life of dict * uuid_utoa() output should not be considered as static * overloading 'volinfo->defrag' in other nodes is a overkill, just KISS Change-Id: I43d00c8e22beb2dd5c5f9824552f7337543b2255 BUG: 2112 Reviewed-on: http://review.gluster.com/407 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* support for de-commissioning a node using 'remove-brick' (Amar Tumballi, 2011-09-13, 15 files, -137/+588)
    To achieve this, we now create the volume file with the 'decommissioned-nodes' option in the distribute volume, then just perform the rebalance set of operations (with the 'force' flag set).

    From now on, the 'remove-brick' operation (with the 'start' option) tries to migrate data from the removed bricks to the existing bricks.

    'remove-brick' also supports options similar to those of replace-brick:

    * (no options) -> works as 'force'; keeps the current behavior of remove-brick, i.e., no data-migration, volume changes.
    * start (starts remove-brick with the data-migration/draining process, which takes care of migrating data and, once complete, commits the changes to the volume file)
    * pause (stops data migration, but keeps the volume file intact with whatever extra options are set)
    * abort (stops data-migration, and falls back to the old configuration)
    * commit (if the volume is stopped, commits the changes to the volume file)
    * force (stops the data-migration and commits the changes to the volume file)

    Change-Id: I3952bcfbe604a0952e68b6accace7014d5e401d3
    BUG: 1952
    Reviewed-on: http://review.gluster.com/118
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* mgmt/glusterd: check the availability of fuse for few glusterd operations (Kaushik BV, 2011-09-13, 5 files, -0/+76)
| | | | | | | | | Change-Id: I410cc6a86c32637566e5498f69f46cb40322e7fb BUG: 2715 Reviewed-on: http://review.gluster.com/364 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* mgmt/glusterd: fail glusterd if gsyncd does not behave as expected (Kaushik BV, 2011-09-13, 1 file, -10/+26)
| | | | | | | | Change-Id: Ic54220328f15c579dcf441de2aad8620751a97ef BUG: 2744 Reviewed-on: http://review.gluster.com/331 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Csaba Henk <csaba@gluster.com>
* distribute rebalance: handle the open file migration (Amar Tumballi, 2011-09-12, 15 files, -1654/+2906)
    Complexity involved:

    To migrate a file with an open fd, we have to notify the other client process which has the open fd, and make sure the write()s happening on that fd are properly synced to the migrated file. Once the migration is complete, the client process which has the open fd should get notified, and it should start performing all operations on the new subvolume instead of the earlier cached volume.

    How to solve the notification part:

    We can overload the 'postbuf' attribute in the _cbk() function to understand if a file is in the 'under-migration' or 'migration-complete' state. (This is similar to deciding whether a file is a DHT linkfile by its 'mode'.)

    The overall change includes the following major changes:

    1. dht_linkfile is decided by only 2 factors (mode(01000), xattr(trusted.glusterfs.dht.linkto)), instead of the earlier 3 factors (which also included size==0).
    2. In the linkfile self-heal part (in 'dht_lookup_everywhere_cbk()'), don't delete a linkfile if there is an open fd on it. It means there may be a migration in progress.
    3. If a file's revalidate fails with ENOENT, it may be due to file migration, and hence needs a lookup_everywhere().
    4. There will be 2 phases of file migration.
       -> Phase 1: Migration in progress
          * The source data file will have the SGID and STICKY bits set in its mode.
          * The source data file will have a 'linkto' xattr pointing to the destination.
          * The destination file will have its mode set to '01000', and its 'linkto' xattr set to itself.
       -> Phase 2: File migration complete
          * The source data file will have mode '01000', and will be truncated to size 0.
          * The destination file will have inherited the mode from the source (without the SGID and sticky bits) and its 'linkto' attribute will be removed.
    5. Changes in distribute to work smoothly with a file which is in migration / got migrated.

    The 'fops' are divided into 3 categories: inode-read, inode-write and others. inode-read fops need to handle only the 'phase 2' notification, whereas the inode-write fops need to handle both 'phase 1' and 'phase 2'. The inode-write operations will be done on the source file, and if any of the 'file-migration' procedures are detected in _cbk(), then the operations should be performed on the destination too. When phase 2 is detected, the inode-ctx itself should be changed to represent the new layout.

    With these changes, the open file migration will work smoothly with multiple clients.

    Change-Id: I512408463814e650f34c62ed009bf2101d016fd6
    BUG: 3071
    Reviewed-on: http://review.gluster.com/209
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* geo-rep: partial support for unprivileged gsyncd via mountbroker (Csaba Henk, 2011-09-12, 5 files, -46/+197)
    gsyncd:
    - mounting code is split into a direct and a mountbroker based backend
    - option gluster-command gone
    - new options: gluster-params, gluster-cli-options, mountbroker
    - the mountbroker mount backend is used if either a mountbroker label is given through the mountbroker option, or if gsyncd is unprivileged; in this case the username is used as label
    - have gluster cli invocations log to stderr so that we don't hit a permission issue with the logfiles

    glusterd:
    - do gsyncd pre-config with new options
    - add option geo-replication-log-group, so that, if it is specified, geo-rep logfile directories are given to that group (and thus members of the given group can do logging there)

    This is just WIP as geo-rep relies on trusted extended attributes and those are not accessible for unprivileged users. Even if we solved this issue, glusterd security settings are too coarse, so that if we made it possible for an unprivileged gsyncd to operate, we would open up too far.

    Change-Id: Icd520b58cbadccea3fad7c0f437b99de1e22db14
    BUG: 2825
    Reviewed-on: http://review.gluster.com/399
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
* glusterd / cli: mount-broker service (Csaba Henk, 2011-09-12, 7 files, -3/+1027)
    Mountbroker is configured in the glusterd volfile through a DSL which is restricted enough to be able to appear in the role of the value of a volfile knob.

    Basically the DSL describes set-theoretical requirements against the option set which is sent by the cli (in the hope of getting a mount with these options). If the requirements are met and the volume id and the uid who is to "own" the mount can be unambiguously deduced from the given request, glusterd does the mount with the given parameters.

    The use case of geo-replication is sugared by means of volume options which then generate a complete mount-broker option set.

    Demo:

    - add the following options to your glusterd volfile:

        option mountbroker-root /tmp/mbr
        option mountbroker.fool EQL(volfile-id=pop*|user-map-root=*|volfile-server=localhost)&MEET(user-map-root=john|user-map-root=jane)

    - before starting glusterd, create /tmp/mbr owned by root with mode 0755
    - with cli, do

        $ gluster system:: mount fool volfile-id=pop33 user-map-root=jane volfile-server=localhost

    - on successful completion (volume pop33 exists and is started, jane is a valid username), the mount path will be echoed to you
    - you can get rid of the mount by

        $ gluster system:: umount <mount-path>

    Change-Id: I629cf64add0a45500d05becc3316f67cdb5b42ff
    BUG: 3482
    Reviewed-on: http://review.gluster.com/128
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vijay@gluster.com>
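To make the demo rule above concrete, here is one plausible reading of its two operators, written as a small sketch. It assumes that each '|'-separated item is a glob pattern, that EQL holds when the requested options and the patterns cover each other exactly, and that MEET holds when at least one requested option matches at least one pattern. These semantics are inferred from the commit message and the demo only; the functions below are not glusterd's parser or evaluator.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* True if option `opt` matches at least one glob in pats[]. */
static bool matches_any(const char *opt, const char *const *pats, size_t npats)
{
    for (size_t i = 0; i < npats; i++)
        if (fnmatch(pats[i], opt, 0) == 0)
            return true;
    return false;
}

/* MEET, read as: the option set and the pattern set intersect. */
bool meet(const char *const *opts, size_t nopts,
          const char *const *pats, size_t npats)
{
    assert(opts != NULL && pats != NULL);
    for (size_t i = 0; i < nopts; i++)
        if (matches_any(opts[i], pats, npats))
            return true;
    return false;
}

/* EQL, read as: every option matches some pattern, and every pattern
 * is matched by some option. */
bool eql(const char *const *opts, size_t nopts,
         const char *const *pats, size_t npats)
{
    assert(opts != NULL && pats != NULL);
    for (size_t i = 0; i < nopts; i++)
        if (!matches_any(opts[i], pats, npats))
            return false;
    for (size_t j = 0; j < npats; j++) {
        bool hit = false;

        for (size_t i = 0; i < nopts && !hit; i++)
            hit = fnmatch(pats[j], opts[i], 0) == 0;
        if (!hit)
            return false;
    }
    return true;
}
```

Under these assumptions the demo request (volfile-id=pop33, user-map-root=jane, volfile-server=localhost) satisfies both halves of the rule, while the same request with user-map-root=root passes EQL but fails MEET.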
* add --user-map-root option (Csaba Henk, 2011-09-12, 2 files, -0/+15)
| | | | | | | | | | | | | | | | | | This makes client fake that given user is a superuser, by changing FUSE requests coming with uid of user so that uid is set to 0. User can be given in numeric form, in which case it's treated as an uid directly, or else it's tried to be resolved to an uid with getpwnam(3). Implies --acl. Change-Id: I2d5a3d3e178be7ffdf22b46a56f33a7eeaaa7fe1 BUG: 3242 Reviewed-on: http://review.gluster.com/127 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
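The behaviour described for --user-map-root (numeric values taken as a uid directly, anything else resolved with getpwnam(3), and requests from that uid rewritten to uid 0) can be sketched as below. resolve_user and map_request_uid are hypothetical helper names used for illustration; only strtoul(3) and getpwnam(3) are real library calls.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>

/* Resolve a user given either as a number or as a login name.
 * Returns 0 and stores the uid on success, -1 otherwise. */
int resolve_user(const char *user, uid_t *uid_out)
{
    assert(user != NULL && uid_out != NULL);

    if (isdigit((unsigned char)user[0])) {
        char *end = NULL;

        errno = 0;
        unsigned long n = strtoul(user, &end, 10);
        if (errno == 0 && *end == '\0') {
            *uid_out = (uid_t)n;          /* numeric form: used as a uid */
            return 0;
        }
    }

    struct passwd *pw = getpwnam(user);   /* otherwise look the name up */
    if (pw == NULL)
        return -1;
    *uid_out = pw->pw_uid;
    return 0;
}

/* Rewrite the uid of an incoming request: the mapped user appears as
 * the superuser, everybody else is left unchanged. */
uid_t map_request_uid(uid_t request_uid, uid_t mapped_uid)
{
    return request_uid == mapped_uid ? 0 : request_uid;
}
```

For example, with the option set to "1000", a request arriving with uid 1000 would be passed on with uid 0, and a request with uid 1001 would keep its uid.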
* gsyncd: python3 compat fixes (Csaba Henk, 2011-09-12, 4 files, -5/+36)
| | | | | | | | | | | Also add __codecheck script which can verify if source is OK at the syntactical level with a given Python interpreter. Change-Id: Ieff34bcd3efd1cdc0e8f9a510c05488f35897bbe BUG: 1570 Reviewed-on: http://review.gluster.com/320 Reviewed-by: Kaushik BV <kaushikbv@gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: fix cleaning up of runner object (Csaba Henk, 2011-09-12, 1 file, -2/+1)
| | | | | | | | | | | in lack of that, if geo-rep component is not installed, glusterd got a zombie child Change-Id: Ic4a2a4ffc943de68dd02db76a32b1618821ddf56 BUG: 2744 Reviewed-on: http://review.gluster.com/317 Reviewed-by: Kaushik BV <kaushikbv@gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* glusterd: free the allocated string to avoid memory leak (Raghavendra Bhat, 2011-09-12, 2 files, -56/+24)
| | | | | | | | Change-Id: I520abf3c57a15be8bb7dd1e92ad0b049ef5c8970 BUG: 3341 Reviewed-on: http://review.gluster.com/394 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* protocol/client: avoid code duplication in fd based operations (Raghavendra Bhat, 2011-09-11, 2 files, -340/+42)
| | | | | | | | | Change-Id: I012f78bac8ba82333628c59ef51d5e5f43d05ac7 BUG: 3158 Reviewed-on: http://review.gluster.com/329 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* features/marker: unref the local in case of errors before unwinding (Raghavendra Bhat, 2011-09-11, 1 file, -3/+5)
| | | | | | | | Change-Id: I4dcad7ddf84bf98b4b7f4a0e407a418426674280 BUG: 2784 Reviewed-on: http://review.gluster.com/299 Reviewed-by: Vijay Bellur <vijay@gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* mgmt/glusterd: volume set help-xml format change (Vijay Bellur, 2011-09-11, 1 file, -1/+1)
| | | | | | | | Change-Id: I503364c855d52605e301f4d3c205af6c9fc0e1df BUG: 3366 Reviewed-on: http://review.gluster.com/380 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com>
* features/marker-quota: Prefix the function names with mq (marker-quota). (Junaid, 2011-09-09, 5 files, -310/+310)
| | | | | | | | | | | This fixes the bug where the marker translator and the quota translator cannot co-exist in the same process. Change-Id: I9f132b663f03641f4f2c7e168df8400adbc5570f BUG: 3020 Reviewed-on: http://review.gluster.com/381 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amar@gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* cluster/afr: perform self-heal with least priority (Pranith Kumar K, 2011-09-09, 1 file, -0/+7)
| | | | | | | | Change-Id: Id8a1dffa3c3200234ad154d1749278a2d7c7021b BUG: 3502 Reviewed-on: http://review.gluster.com/336 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* glusterd rebalance: make co-operate with all other 'op' (Amar Tumballi, 2011-09-09, 4 files, -56/+380)
| | | | | | | | | | | | that way, we can share the rebalance state with other peers and can prevent confusion/conflicts when multiple rebalances are done by different peers. Change-Id: I24159e69332644718df7314f6f1da7fce9ff740e BUG: 2112 Reviewed-on: http://review.gluster.com/343 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* features/marker-quota: Perform xattr related operations with root permissions in rename fop. (Junaid, 2011-09-09, 2 files, -6/+39)
| | | | | | | | | | Change-Id: Id9ac1ecdd9753377c9eb24464f51dcbdc0cd2821 BUG: 3194 Reviewed-on: http://review.gluster.com/367 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vijay@gluster.com>
* performance/io-threads: treat -ve pid as request for fop with least priority (Pranith Kumar K, 2011-09-08, 1 file, -63/+325)
| | | | | | | | Change-Id: Ib6730a708f008054fbd379889a0f6dd3b051b6ad BUG: 3502 Reviewed-on: http://review.gluster.com/335 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
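The rule in the subject line above (a negative pid marks a request as least priority) amounts to a one-line override of whatever priority the fop type would normally get. A tiny sketch follows; the enum values and the function name are illustrative and not taken from io-threads itself.

```c
#include <assert.h>
#include <sys/types.h>

/* Hypothetical priority levels, highest first. */
enum fop_priority { PRI_HIGH, PRI_NORMAL, PRI_LOW, PRI_LEAST };

/* A negative pid overrides the priority derived from the fop type. */
enum fop_priority effective_priority(pid_t pid, enum fop_priority by_fop_type)
{
    assert(by_fop_type <= PRI_LEAST);
    return pid < 0 ? PRI_LEAST : by_fop_type;
}
```

An internal client such as a self-heal process could then issue its fops with a negative pid so that they queue behind application traffic.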
* cluster/afr: Make data selfheal trigger to be configurable. (Pranith Kumar K, 2011-09-08, 11 files, -112/+217)
| | | | | | | | | | | | | | | | | | | | By default, lookup triggers data self-heal but that is not the preferred way of operating replicated volumes. We would like the data self heals to be triggered in open instead. Number of back-ground self-heals allowed is 16 and lookups block until self-heal is completed. We want to prevent blocking in fops. We can not make lookups independent of self-heal frames because when there are gfid conflicts the decision of which file is correct is determined in self-heal phase. So in afr, lookup self-heal is going to guarantee name space consistency and open/fd fops will take responsibility for data consistency, these are non blocking. The user needs to set the option cluster.data-self-heal "open" for this behavior. Change-Id: If9463cdb9ebac114708558ec13bbca0270acd659 BUG: 3503 Reviewed-on: http://review.gluster.com/334 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* posix-acl: configurable super user ID (Anand Avati, 2011-09-08, 2 files, -7/+61)
| | | | | | | | | | | | In configurations with a uid mapper, super user ID could be mapped to a non-zero value. Hence making it configurable in access control would be necessary for proper super-user semantics. Change-Id: I51e8e0395680e9b96a99657a0af547659bd9affe BUG: 2815 Reviewed-on: http://review.gluster.com/332 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* cluster/afr: eager locking of FD writes (Anand Avati, 2011-09-08, 4 files, -58/+182)
    This patch changes the way write transactions hold a lock, which optimizes the case of sequential writes from a single writer.

    The lock phase of a transaction has two sub-phases. First is an attempt to acquire locks in parallel by broadcasting non-blocking lock requests. If lock acquisition fails on any server, then the held locks are unlocked and the transaction reverts to a blocking lock mode, locking sequentially on one server after another.

    The change in this patch is to make the initial broadcast lock request attempt to acquire a lock on the entire file. If this fails, we revert back to the sequential "regional" blocking lock as before.

    In the case where such an "eager" lock is granted in the non-blocking phase, it gives rise to an opportunity for optimization: if the next write transaction on the same FD arrives before the unlock phase of the first transaction, it "takes over" the full file lock. Similarly, if yet another transaction arrives before the unlock phase of the "optimized" transaction, that in turn "takes over" the lock as well. The actual unlock now happens at the end of the last "optimized" transaction.

    Any operation which arrives before the unlock phase of the previous transaction is a potential candidate to become an "optimized" transaction. In cases where the previous transaction had acquired its lock as a "regional" blocking lock, and the next transaction comes in before its unlock phase, then it would not be an "optimized" transaction.

    Implied assumption
    ------------------

    Since two or more transactions can now operate within the same large lock, there is a possibility that overlapping transactions can arrive in opposite orders on the servers. However, in the larger picture this is not possible, as write-behind already ensures that no two overlapping writes on an inode are in transit at the same time. Overlapping writes across clients are not a problem as they compete at locks anyway.

    Theoretical benefits and potential harms
    ----------------------------------------

    In case of a single writer:

    The benefits are large for sequential writes. In the best case the entire file write can happen with just one lock and unlock per server, provided writes are coming in fast enough and getting pipelined by write-behind soon enough (which is usually the case). If the writes are not coming in fast enough, then the optimization "kicks in" for only those subsets of writes which are close enough to get "piggybacked". For random writes the benefits are the same as well. In any case the overall performance is better than or equal to the performance without this optimization for a single writer.

    In case of multiple writers:

    When multiple writers are not writing concurrently, there is no negative performance impact. When multiple writers are writing concurrently to the same region, there is no negative impact either, as they were previously getting arbitrated at the locks translator too. In the case of multiple writers writing to different regions concurrently, there will be an increased number of "failovers" from failed parallel non-blocking to sequential blocking regional locks.

    This "worst case" has a simple workaround: as soon as we detect an open-fd-count > 1 in the lookup xattr, we can disable this optimization on those fds.

    Beneficial side-effects
    -----------------------

    There is another similar optimization in AFR for changelogs which goes by the name of "changelog-piggybacking". That works in a similar way, where pending flags get 'taken over' or 'piggybacked' by the next transaction if its 'pre-op' phase kicks in before the 'post-op' phase of the previous transaction. It has been observed that this changelog-piggybacking optimization gives a saving of about 55% of xattr calls hitting the wire, measured across various types of network interfaces. The side effect of this eager-lock optimization is that it gives an almost 100% saving of xattr calls by making the optimistic-changelog work much more efficiently, as it gives a wider overlap of the xattr phases of two consecutive transactions.

    Change-Id: I41c02eb3b64c14c68ef66a344610ec3f024cd59d
    BUG: 3409
    Reviewed-on: http://review.gluster.com/240
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@gluster.com>
* storage/posix: posix getxattr log enhancement (Rajesh Amaravathi, 2011-09-08, 1 file, -4/+4)
| | | | | | | | | | Now the key is logged with getxattr failure. Change-Id: I96a9234cf138ae0922dc403e2fddcd4df0d89df8 BUG: 3283 Reviewed-on: http://review.gluster.com/373 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* mount/nfs: Gluster nfs crashes with subdirectory mount (Rajesh Amaravathi, 2011-09-08, 1 file, -1/+4)
| | | | | | | | | | | | | Glusterfs used to crash trying to dereference a NULL pointer. Also, in mnt3_resolve_export_subdir, volume name was prefixed to sub directory exported, resulting in mount fail of sub directory. Fixed both issues. Change-Id: I746f0c244b4cbf03033d73ac3e40518762d76385 BUG: 3481 Reviewed-on: http://review.gluster.com/323 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* Save the mode flags set by the application when ACLs are in use (Pavan T C, 2011-09-08, 1 file, -1/+2)
| | | | | | | | | | | | | | | | | While inheriting the ACLs from a directory that has default ACLs, make sure that the mode flags set by the application are saved. It is required to inherit only the Read, Write and Execute permissions while leaving the others viz. setuid, setgid and sticky bit untouched hence honouring the requests made by the application during create operations (mknod, mkdir et al). For a description of the problem, root cause and evaluation, refer: http://bugs.gluster.com/show_bug.cgi?id=3522 Change-Id: I994077fb321a35d8254f0cc5a7de99a17ec40c47 BUG: 3522 Reviewed-on: http://review.gluster.com/368 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
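The bit handling this commit describes (take only the read, write and execute permissions from the inherited ACL, keep setuid, setgid and sticky as the application requested them) comes down to a mask operation. A minimal sketch, assuming the inherited permissions have already been reduced to a plain rwx value; inherit_mode is a hypothetical name, and how that value is derived from the default ACL is outside the sketch.

```c
#include <assert.h>

#define PERM_BITS 0777u   /* read/write/execute for user, group and other */

/* Combine the mode requested by the application with the rwx permissions
 * inherited from the parent directory's default ACL. Bits outside
 * PERM_BITS (setuid, setgid, sticky) come from the request only. */
unsigned int inherit_mode(unsigned int requested, unsigned int inherited_perms)
{
    assert((inherited_perms & ~PERM_BITS) == 0);  /* only rwx is inherited */
    return (requested & ~PERM_BITS) | inherited_perms;
}
```

For instance, a mkdir requesting mode 02775 under a directory whose default ACL yields 0750 ends up with 02750: the setgid bit survives and the rwx bits follow the ACL.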
* gsyncd: do the homework, document _everything_ (Csaba Henk, 2011-09-08, 9 files, -17/+483)
| | | | | | | | Change-Id: I559e6a0709b8064cfd54c693e289c741f9c4c4ab BUG: 1570 Reviewed-on: http://review.gluster.com/319 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushik BV <kaushikbv@gluster.com>
* nfs3: Resolve entry vs. hash conflict at same dir depth (Shehjar Tikoo, 2011-09-07, 2 files, -15/+46)
    Intro Note
    ==========

    The current code in hard fh resolution takes the first-match approach, i.e., whichever dirent either matches the hash or matches the gfid first is the one chosen as the result for the next step of fh resolution.

    In the latter case, i.e., the dirent matches the gfid, the next step is to conclude the fh resolution by returning the entry whose gfid matched.

    In the former, i.e., the hash matches the dirent, we choose the hash-matching dirent as the next directory to descend into, for searching the file to be operated upon.

    Problem
    =======

    When performing hard fh resolution, there can be a situation where:

    o the hash of the primary entry, i.e. the entry we're looking for, and the hash of another sibling directory match. Note the use of "sibling", meaning both the primary entry and the hash-matching one are in the same directory, i.e., their filehandle.hashcount will be the same.
    o the sibling directory is encountered first during the dir search.

    Because of the current code described in "Intro", we'll end up descending into the sibling directory even though the correct behaviour is to ignore this and wait till we encounter the primary entry in the same parent directory. Once we end up descending into this sibling directory, the directory depth validation check fails. The check fails because it notices that the resolution is attempting to open a directory that is deeper in the fs tree than the file we're looking for. When this check fails, we return an ESTALE. So basically, a false positive results in an ESTALE to specsfs.

    This is not a theoretical situation. Avati and I saw this on a specsfs test where sfs created a terabyte-sized file system for its tests. The number of files was so huge in a single directory that the hashes of two entries ended up colliding.

    Change-Id: I4a6df11d326a67a507b1cd716c2c8e00b5a858a4
    BUG: 3510
    Reviewed-on: http://review.gluster.com/357
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Shehjar Tikoo <shehjart@gluster.com>
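One way to avoid the first-match problem described above is to postpone the hash decision until the whole directory has been scanned, so that a gfid match always wins over a hash match found earlier. The sketch below illustrates that idea only; struct entry and pick_entry are hypothetical, and the commit message does not say that the actual fix is implemented this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified directory entry (hypothetical layout). */
struct entry {
    unsigned char gfid[16];
    unsigned int  hash;
    bool          is_dir;
};

/* Scan one directory. Returns the index of the entry whose gfid matches;
 * failing that, the first directory whose hash matches; -1 if neither.
 * *exact reports whether the result is a gfid match. */
int pick_entry(const struct entry *ents, size_t n,
               const unsigned char want_gfid[16], unsigned int want_hash,
               bool *exact)
{
    assert(ents != NULL && want_gfid != NULL && exact != NULL);
    int hash_hit = -1;

    *exact = false;
    for (size_t i = 0; i < n; i++) {
        if (memcmp(ents[i].gfid, want_gfid, 16) == 0) {
            *exact = true;         /* primary entry found: stop here */
            return (int)i;
        }
        if (hash_hit < 0 && ents[i].is_dir && ents[i].hash == want_hash)
            hash_hit = (int)i;     /* remember it, but keep scanning */
    }
    return hash_hit;
}
```

With a sibling directory listed first and the primary entry second, both carrying the same hash, this returns the primary entry instead of descending into the sibling.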
* Eliminate many "var set but not used" warnings with newer gcc. (Jeff Darcy, 2011-09-07, 41 files, -396/+0)
| | | | | | | | | | | | | | | | This fixes ~200 such warnings, but leaves three categories untouched. (1) Rpcgen code. (2) Macros which set variables in the outer (calling function) scope. (3) Variables which are set via function calls which may have side effects. Change-Id: I6554555f78ed26134251504b038da7e94adacbcd BUG: 2550 Reviewed-on: http://review.gluster.com/371 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>
* glusterd: send the 'stripe-count' value to peer during handshake (Amar Tumballi, 2011-09-07, 1 file, -0/+17)
| | | | | | | | | | | without which, if a peer is added after volume of type 'stripe-replica' is created, it won't be reflected in the newly added peer. Change-Id: I77ee6aa3f33994bd4c6dbfefd853cc7e7491c1db BUG: 3523 Reviewed-on: http://review.gluster.com/369 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@gluster.com>