path: root/xlators/mgmt
Commit message (Author, Date, Files changed, Lines -/+)
* posix, quota, glusterd, dht: Modification to the pgfid xattr handling (Varun Shastry, 2013-09-19; 2 files, -5/+17)
    This commit makes the following changes to the source:
    i.   Updating the hard link count for the parent directory is now
         configurable, and takes effect only when quota is enabled.
    ii.  Heal nlinks of the pgfid xattr in lookup.
    iii. Start the quota crawler without the readdirp optimization.
    iv.  Rename: handle the internal fops properly, using
         GLUSTERFS_INTERNAL_FOP_KEY to represent internal fops (see the
         sketch after this entry).

    Change-Id: Ic6586a82a8bb6eb4329eb6cbd5430da11418e753
    BUG: 969461
    Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
    Signed-off-by: Varun Shastry <vshastry@redhat.com>
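    To illustrate point iv, a minimal sketch of how a rename path might
    skip pgfid bookkeeping for internal fops. GLUSTERFS_INTERNAL_FOP_KEY
    is named by the commit itself and dict_get() is the stock libglusterfs
    accessor; the helper and its use are hypothetical and this fragment
    only compiles inside the glusterfs tree:

        /* Hypothetical helper: returns nonzero when the fop carries the
         * internal-fop marker, in which case the trusted pgfid hard-link
         * counters must be left untouched. */
        static int
        is_internal_fop (dict_t *xdata)
        {
                return (xdata &&
                        dict_get (xdata, GLUSTERFS_INTERNAL_FOP_KEY) != NULL);
        }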
* Revert "glusterd: Relax op-version check in quota command"Krutika Dhananjay2013-09-161-3/+36
| | | | | | | | | | | | | | | This reverts commit 34ffc3b71ad96b9be6fa34cad44f92eceb56f5e7. Reverting this patch because quota command in glusterd only reads (but NOT modify) the op-version and therefore can never bump up the op-version of the cluster. As of today, that kind of intelligence rests only with 'volume set' operation. Hence 'volume set' interface is going to be used in the post upgrade script to explicitly trigger an increment in the cluster's op-version. Change-Id: Id38bbc044429ebb1349772f37b637925032618bc Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* Merge "glusterd: Validate timeout options during volume quota stage op" into ↵Krishnan Parthasarathi2013-09-161-7/+68
|\ | | | | | | upstream_on_quota
* glusterd: Validate timeout options during volume quota stage op (Krutika Dhananjay, 2013-09-12; 1 file, -7/+68)
    Change-Id: If9ae015ab189f57f3a3f9a56cbb38a5e8491fe6f
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* | Merge "glusterd: Relax op-version check in quota command" into upstream_on_quotaKrishnan Parthasarathi2013-09-161-32/+0
|\ \
* glusterd: Relax op-version check in quota command (Krutika Dhananjay, 2013-09-13; 1 file, -32/+0)
    This allows the post-upgrade script to set default-soft-limit as a
    way of bumping up the op-version before setting limits.

    Change-Id: I6693bf6a6d7f761c55f83120cf686ffa0951bd50
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* / glusterd: Ignore "features.limit-usage" in glusterd's (re)start pathKrutika Dhananjay2013-09-131-0/+8
|/ | | | | Change-Id: I08f919e0bf3f014c553070dbd7710ba5db9fd2b4 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: set soft-limit as -1, if not specified in quota-limit-usage (Krutika Dhananjay, 2013-09-06; 1 file, -18/+19)
    Original-author: Krishnan Parthasarathi <kparthas@redhat.com>
    Change-Id: I07d7f01af597cbec836972fd06076dcee82eff7d
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: Conditionally aggregate peer responses for quota subcommands (Krutika Dhananjay, 2013-09-06; 2 files, -16/+32)
    In the function _gd_syncop_stage_op_cbk (), aggregate rsp dicts
    only during REPLACE_BRICK and QUOTA commands. Similarly, in the
    function _gd_syncop_commit_op_cbk (), aggregate the rsp dict from
    the peers only for the quota sub-command 'list', and for all other
    commands unconditionally. The unconditional aggregation was the
    cause of the log messages seen in bug 1001432 (the stage-side
    policy is sketched after this entry).

    Also, read the interim 'count' from op_ctx before aggregating it
    with the count in rsp_dict.

    Change-Id: I9ecb832e83354d62a8f841db2ce6b2377920abad
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
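    A minimal standalone sketch of the stage-side policy just described;
    the enum and function names are illustrative only, not glusterd's
    real types or callbacks:

        /* Aggregate stage-op rsp dicts only for these two operations. */
        typedef enum { OP_REPLACE_BRICK, OP_QUOTA, OP_OTHER } sketch_op_t;

        static int
        should_aggregate_stage_rsp (sketch_op_t op)
        {
                return (op == OP_REPLACE_BRICK || op == OP_QUOTA);
        }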
* glusterd: Destroy volinfo->quota_conf_shandle during 'volume delete' (Krutika Dhananjay, 2013-09-06; 4 files, -39/+48)
    ... and also remove the auxiliary mount, if it exists.

    Change-Id: I91ac3f434df3e03ea914051d1d6890e7a05a3cad
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: probe quota.conf format changes (Krutika Dhananjay, 2013-09-06; 3 files, -56/+93)
    Original-author: Krishnan Parthasarathi <kparthas@redhat.com>
    Change-Id: Iff4ea98276ffdba5b39cadceff63739107eafd77
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: Add cksum file (quota.cksum) for quota.conf file (Krutika Dhananjay, 2013-09-06; 6 files, -96/+514)
    .. and use the quota checksum and version to validate one's own
    quota store config.

    cli: clean up the quota-list-all implementation.

    Also, change the format in which we store the directory quota
    configurations: we store the list of gfids as 16-byte unsigned
    chars, in binary mode (sketched after this entry).
    Original-author: Krishnan Parthasarathi <kparthas@redhat.com>

    glusterd: Store quota checksum and version in quota.cksum
    The quota version is incremented AND the quota checksum is computed
    every time quota.conf is modified. The checksum and version are
    also retrieved from the store into memory whenever glusterd is
    restarted.

    glusterd: Unlink quota.conf and quota.cksum on quota disable
    Also destroy volinfo->quota_conf_shandle and reset it to NULL, and
    reset volinfo->quota_conf_version to 0 in memory.

    Change-Id: Ie71da3a75bc80e1ffddf4f2e38a99a48ad4de164
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
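    A minimal standalone C sketch of the record format described above,
    with each configured directory's gfid written as 16 raw bytes, back
    to back; the helper names are illustrative, not glusterd's:

        #include <stdio.h>

        #define GFID_SIZE 16

        /* Append one gfid record to an open quota.conf stream. */
        static int
        append_gfid (FILE *conf, const unsigned char gfid[GFID_SIZE])
        {
                return (fwrite (gfid, GFID_SIZE, 1, conf) == 1) ? 0 : -1;
        }

        /* Read the next gfid record; returns -1 at EOF or on error. */
        static int
        read_next_gfid (FILE *conf, unsigned char gfid[GFID_SIZE])
        {
                return (fread (gfid, GFID_SIZE, 1, conf) == 1) ? 0 : -1;
        }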
* glusterd: add quota conf to probe payload (Krutika Dhananjay, 2013-09-06; 2 files, -0/+142)
    also fix FILE* leak in cli

    Original-author: Krishnan Parthasarathi <kparthas@redhat.com>
    Change-Id: Icb9b58ef065ce1a150d98b4c26bbcddeeb390e44
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* volgen: use volname instead of volume-id (Krutika Dhananjay, 2013-09-06; 1 file, -2/+2)
    As part of the volume-quota-list command implementation, the
    gluster cli process makes an RPC call to quotad, querying for
    quota-related information of a gfid in a volume. Prior to this
    patch, we used the volume-id for quotad to uniquely identify a
    volume. Since the cli doesn't have a way (today) to fetch the
    volume-id for a given volname, we had to use a (nearly) unique
    key, the volume name, to identify a volume.

    Original-author: Krishnan Parthasarathi <kparthas@redhat.com>
    Change-Id: I49b66f46fc7bb8dc4075cc855a1d854166d0e89c
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: Implement persistent store for quota (Krutika Dhananjay, 2013-08-25; 8 files, -24/+550)
    Setting quota limits on a given directory will cause glusterd to
    persist the gfid of the path in WD/vols/<volname>/quota.conf.
    Also, executing 'quota remove' will cause glusterd to remove the
    gfid of the given path. This is needed for implementing the
    'list-all' variant of the 'quota list' command.

    To-Do:
    1. Exchange quota.conf when a new node is added into the cluster;
    2. Unlink quota.conf on disabling quota (?)

    Change-Id: I7d75a9cdb43e4e1389ddb08ffe09b294d36f87d8
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* mgmt/glusterd: Correct auto variable initialization. (Vijay Bellur, 2013-08-19; 1 file, -2/+5)
    Previously

        struct foo {
                int a;
                int b;
        } bar1, bar2 = {0,};

    initialized only the members of bar2 to 0. This has been set right
    (see the sketch after this entry).

    Also added verification tests for soft-quota configuration.

    Change-Id: I9e3b4d65286e59d7dad8db8fa649b1b91a5d25bc
    Signed-off-by: Vijay Bellur <vbellur@redhat.com>
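    The pitfall in a minimal standalone C sketch; the struct is taken
    from the commit message, and the fix shown is the generic one (an
    initializer per declarator), not necessarily the exact patched code:

        struct foo {
                int a;
                int b;
        };

        struct foo bar1, bar2 = {0,};          /* bar1 left uninitialized */
        struct foo bar3 = {0,}, bar4 = {0,};   /* both zero-initialized   */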
* mgmt/glusterd: Initialize auto variables (Vijay Bellur, 2013-08-15; 1 file, -1/+1)
    Change-Id: I72e97bf57bd4103506324b5caf8dffb3fd7d7f71
    Signed-off-by: Vijay Bellur <vbellur@redhat.com>
* glusterd,cli: Use 'packed' attribute while reading/writing xattrs from/to backend (Krutika Dhananjay, 2013-08-12; 1 file, -2/+8)
    Change-Id: I9229899361794d48bb2f741fb989bf025081987f
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: Remove aux mounts on every node in cluster during 'quota disable' (Krutika Dhananjay, 2013-08-12; 2 files, -1/+49)
    With this patch, upon disablement of quota on a given volume, the
    auxiliary mounts, if present, are removed on every node by glusterd
    during the commit op.

    Change-Id: Ie50fda8da28635f3f65a25106e2277b4a2fc45f9
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: Treat default value of default-soft-limit as 80% in 'quota list' (Krutika Dhananjay, 2013-08-12; 1 file, -1/+1)
    Change-Id: I3ce0bb417b6faa9430689039741d0477cb7a582f
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* common-utils: Move glusterd_is_service_running() to common-utils (Krutika Dhananjay, 2013-08-12; 4 files, -46/+9)
    Change-Id: I96cbe03511cbecb112418da82c44c00fbab74ba3
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd, quotad: volume-id option fixups (Brian Foster, 2013-08-12; 1 file, -0/+14)
    A few little hacks to set the volume id on the quota server and a
    mapping option on quotad to map the volume name to the uuid passed
    via the lookup request.

    Change-Id: Ic151acb18ed29d2ee4ae5d1bc6841ae4a4de176a
    Original-author: Brian Foster <bfoster@redhat.com>
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
* glusterd,cli: changes to quota list <path> ... (Krutika Dhananjay, 2013-08-12; 1 file, -67/+11)
    Change-Id: Ia37020c3aa11af6eed3af09cfe390b848b028b6a
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: Clean up and fix glusterd_op_quota() (Krutika Dhananjay, 2013-08-12; 2 files, -100/+70)
    ... and also fix cli logging.

    In glusterd_op_quota():
    * do not modify ret after going to 'out', as this causes the
      failure status (-1) to be overwritten, thereby causing the
      command to return 0 even on failure;
    * knock off additional labels like create_vol;
    * replace 'if' statements with a 'switch case' statement;
    * delete only the 3 timeouts and the default-soft-limit, if
      present, from volinfo->dict upon disablement of quota.

    Change-Id: If486a5373b66f2379d6d041a974d9b824fcb8518
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: Changes to 'quota remove' subcommand behavior (Krutika Dhananjay, 2013-08-12; 1 file, -44/+39)
    Change-Id: Ifdc60071146587dc5c60d9a53a92d49b3487fd82
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: club limit-usage and soft-limit into a single command (Krutika Dhananjay, 2013-08-12; 1 file, -165/+83)
    Change-Id: I5f680675576aeec584b497eb25dd804a9dd6d690
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd, cli: Provide status of quotad in 'volume status' (Krutika Dhananjay, 2013-08-12; 5 files, -19/+80)
    Change-Id: I5e90376ecfe11ae5a3bca936d9d9acdd54c337d7
    BUG: 969461
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: Move timeout options and default-soft-limit to quota xlator (Krutika Dhananjay, 2013-08-12; 2 files, -80/+6)
    Write the 3 timeout options {soft-timeout, hard-timeout,
    alert-time} and default-soft-limit, if explicitly set, into the
    brick volfiles.

    Change-Id: Ie3229a8ab1b081a5936defd4f977afc8a19dad50
    BUG: 969461
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: Spawn quotad on every node (Krutika Dhananjay, 2013-08-12; 3 files, -31/+46)
    ... and also trigger a reconfigure when the following quota
    sub-commands are executed:
    a. default-soft-limit,
    b. hard-timeout,
    c. soft-timeout, and
    d. alert-time.

    Also start/restart/stop quotad only when quota is enabled or
    disabled on any volume (see the sketch after this entry).

    Tests performed in a two-node cluster:
    a. Create and start a volume, enable quota on it and check if
       quotad is spawned on both the nodes.
    b. Execute all quota sub-commands on the volume except 'enable'
       and 'disable' and verify that the pid of the quota daemon
       doesn't change.
    c. Stop the volume and verify that quotad is stopped.
    d. Start it again. Quotad must be started now.
    e. Create, start and enable quota on a second volume, and verify
       that the pid of quotad changes on both the nodes (indicating a
       restart).
    f. Disable quota on one of the volumes and verify that quotad's
       pid changes.
    g. Disable quota on the second volume too and verify that quotad
       is stopped on both the nodes.
    h. Enable quota again on one of the volumes, and verify that
       quotad is started on both the nodes.
    i. Add a new node into this cluster and verify that quotad is
       spawned on this node too.

    Change-Id: Ie93ab69c685051e196c377cff15078a1cde17fca
    BUG: 969461
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
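    A standalone sketch of the start/stop decision described above:
    quotad should run on a node exactly when at least one started
    volume has quota enabled. The types and names here are illustrative,
    not glusterd's:

        typedef struct {
                int started;
                int quota_enabled;
        } sketch_volume_t;

        /* Returns 1 if quotad should be (re)started, 0 if stopped. */
        static int
        quotad_should_run (const sketch_volume_t *vols, int nvols)
        {
                for (int i = 0; i < nvols; i++)
                        if (vols[i].started && vols[i].quota_enabled)
                                return 1;
                return 0;
        }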
* features/quota: Improvements to quota (Varun Shastry, 2013-08-12; 10 files, -249/+1117)
    Old implementation:
    * Client-side implementation of quota
      - not secure
      - increased traffic in updating the ctx

    New implementation:
    * Quota enforcement is done in 2 stages: soft and hard quota.
      Upon reaching the soft quota limit on a directory, an alert is
      logged in the quota daemon log (i.e. DEFAULT_LOG_DIR/quotad.log),
      and no more writes are allowed after the hard quota limit. After
      the soft limit is reached, the daemon alerts the user/admin
      repeatedly every 'alert-time', which is configurable.
    * Quota is moved to the server side. There will be 2 quota xlators:

      i. Quota server
         It takes care of enforcing the quota and maintains the context
         specific to the brick. Since it doesn't have the complete
         picture of the cluster, cluster-wide usage is updated from the
         quota daemon. This updated context is saved and used for the
         enforcement. It updates its context by searching for
         QUOTA_UPDATE_KEY in the dict in the setxattr call, and is
         updated from nowhere else. The quota xlator is always loaded
         in the server graph and is bypassed if the feature is not
         enabled.
         Options specific to the quota server:
           server-quota - specifies whether the feature is on/off; used
                          to bypass quota when turned off.
           deem-statfs  - if set to on, takes quota limits into
                          consideration while estimating fs size
                          (df command).

      ii. Quota daemon
         This is the new xlator introduced with this patch. It is the
         *gluster client* process with no mount point, started upon
         enabling quota or restarting the volume. It is a single
         process for all the volumes in the cluster; its volfile is
         stored in GLUSTERD_DEFAULT_WORK_DIR/quotad/quotad.vol. It
         queries for the sizes on all the bricks, aggregates the size
         and sends back the updated size, periodically. The timeout
         between successive updates is configurable, and is typically
         (by default) larger for below-soft-quota usage and smaller
         for above-soft-quota usage. The timeout is maintained inside
         the limit structure based on the usage: below soft limit or
         above soft limit. A thread runs per volume, iterating through
         the list and deciding whether the size should be queried in
         the current iteration based on its timeout; the next
         iteration time is the least of the timeouts in the list of
         entries. A separate inode table is maintained for each volume
         in quotad; in the first iteration it builds the table for
         quota-dirs (dirs on which a limit is set) and their
         components.
         Options specific to quotad:
           hard-timeout - timeout for updating usage to the quota
                          server when usage has crossed the soft limit.
           soft-timeout - timeout for updating usage to the quota
                          server when usage is below the soft limit.
           alert-time   - frequency of logging after usage has reached
                          the soft limit.
         Options common to both:
           default-soft-limit - used when individual paths are not
                          configured with a soft limit; the default
                          value is 90% of the hard limit.
           limit-set    - string containing all the limits.

    Thus in the current implementation we'll have 2 quota xlators: one
    in the server graph and one in the trusted client (quota daemon),
    whose sole purpose is to aggregate the quota size xattrs from all
    the bricks and send them to the server quota xlator.

    * Changes in glusterd and CLI
      A single volfile is created for all the volumes, similar to the
      nfs volfile. All files related to the quota client (volfile, pid
      etc.) are stored in GLUSTERD_DEFAULT_WORK_DIR/quotad/. The new
      pattern of the quota limit store is (a parsing sketch follows
      this entry):
        limit-set = <single-dir-limit>[,<single-dir-limit>]
        single-dir-limit = <abs-path>:<hard-limit>[:<soft-limit-in-percent>]
      It also introduces new options:
        volume quota <VOLNAME> {enable|disable|list [<path> ...]|
            remove <path>|default-soft-limit <percent>} |
        volume quota <VOLNAME> {limit-usage <path> <size> |
            soft-limit <path> <percent>} |
        volume quota <VOLNAME> {alert-time|soft-timeout|hard-timeout}
            {<time>}

    Credit:
        Raghavendra Bhat <rabhat@redhat.com>
        Varun Shastry <vshastry@redhat.com>
        Shishir Gowda <sgowda@redhat.com>
        Krutika Dhananjay <kdhananj@redhat.com>
        Brian Foster <bfoster@redhat.com>
        Krishnan Parthasarathi <kparthas@redhat.com>

    Change-Id: I16ec5be0c2faaf42b14034b9ccaf17796adef082
    BUG: 969461
    Signed-off-by: Varun Shastry <vshastry@redhat.com>
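    A standalone C sketch parsing the limit-set grammar quoted above
    (<abs-path>:<hard-limit>[:<soft-limit-in-percent>], entries
    comma-separated); the sample values are illustrative and this is
    not glusterd's parser:

        #include <stdio.h>
        #include <string.h>

        static void
        parse_limit_set (char *spec)  /* modified in place */
        {
                char *entry, *saveptr = NULL;

                for (entry = strtok_r (spec, ",", &saveptr); entry;
                     entry = strtok_r (NULL, ",", &saveptr)) {
                        char *path = strtok (entry, ":");
                        char *hard = strtok (NULL, ":");
                        char *soft = strtok (NULL, ":");  /* optional */

                        printf ("path=%s hard=%s soft=%s\n", path, hard,
                                soft ? soft : "(default-soft-limit)");
                }
        }

        int
        main (void)
        {
                char spec[] = "/dir1:10GB:75,/dir2:5GB";
                parse_limit_set (spec);
                return 0;
        }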
* Correcting a log message in glusterd-geo-rep.c (M S Vishwanath Bhat, 2013-08-05; 1 file, -1/+1)
    Change-Id: I4352f513fc5616daa20e9a4ad51a63fb13a27dff
    BUG: 847839
    Signed-off-by: M S Vishwanath Bhat <vbhat@redhat.com>
    Reviewed-on: http://review.gluster.org/5472
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Add switch and nufa options to 'gluster cli' (Harshavardhana, 2013-08-03; 2 files, -17/+53)
    Change-Id: Ic3c43291e0e1ead0d89c0436e8d70aa5dee2f543
    BUG: 924488
    Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
    Reviewed-on: http://review.gluster.org/5391
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* cli,glusterd: Fix when tasks are shown in 'volume status' (Kaushal M, 2013-08-03; 1 file, -0/+4)
    Asynchronous tasks are shown in 'volume status' only for a normal
    volume status request, for either all volumes or a single volume.

    Change-Id: I9d47101511776a179d213598782ca0bbdf32b8c2
    BUG: 888752
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.org/5308
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Use volume op-versions during volgen (Kaushal M, 2013-08-02; 4 files, -19/+14)
    Instead of using the cluster op-version, the volume op-version is
    used to enable open-behind during volgen. For doing this, the
    volume op-versions are updated before regenerating the volfiles.

    Change-Id: I675bb549bf7c7c0279030dca698fb530781addc6
    BUG: 990830
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.org/5385
    Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Re-initialize skipped file count in glusterd (shishir gowda, 2013-07-31; 1 file, -0/+1)
    Change-Id: I42d08b3a6a7a3839f5e9953e1f83959222c080f8
    Signed-off-by: shishir gowda <sgowda@redhat.com>
    BUG: 989846
    Reviewed-on: http://review.gluster.org/5446
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd: Check whether the session is created in case of geo-rep stop (Avra Sengupta, 2013-07-31; 1 file, -2/+6)
    Performing a statefile check in case of geo-rep stop, so as to
    provide a proper error message in case the session is not created.
    However, in case of geo-rep stop force, we allow the command to
    succeed even if the session is not created, because the stop
    command is a failsafe command to stop running geo-rep sessions on
    any node.

    Change-Id: I2b6a0253de977633606c422cbbc9e37cede9a268
    BUG: 989541
    Signed-off-by: Avra Sengupta <asengupt@redhat.com>
    Reviewed-on: http://review.gluster.org/5417
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: initiating gsyncd restart during add-brick (Avra Sengupta, 2013-07-31; 3 files, -25/+150)
    During add-brick, when a new brick is added on one of the nodes
    that was already a part of the existing volume, and gsyncd was
    already running on that node, all gsyncd processes running on that
    node, for that particular master and any slave sessions, will be
    restarted.

    If a new brick is added on a new node, then after adding the brick
    the user has to perform the following steps:
    1. gluster system:: execute gsec_create
    2. gluster volume geo-replication <master-vol> <slave-vol> create
       push-pem force
    3. gluster volume geo-replication <master-vol> <slave-vol> start
       force

    Change-Id: I4b9633e176c80e4a7cf33f42ebfa47ab8fc283f1
    BUG: 989532
    Signed-off-by: Avra Sengupta <asengupt@redhat.com>
    Reviewed-on: http://review.gluster.org/5416
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: Fix a minor typo. (Vijay Bellur, 2013-07-31; 1 file, -1/+1)
    Thanks to Patrick Matthäi <pmatthaei@debian.org> for the patch.

    Signed-off-by: Vijay Bellur <vbellur@redhat.com>
    Change-Id: I59da74298894ccc2ab30967ffe44cc844aa73f82
    BUG: 814534
    Reviewed-on: http://review.gluster.org/5436
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Anand Avati <avati@redhat.com>
* cluster/dht: Treat migration failures due to space constraints as skipped (shishir gowda, 2013-07-30; 2 files, -0/+28)
    Currently, rebalance/remove-brick ops display a failed-migration
    count even for files which failed due to space issues (not enough
    space for the file, or migration leading to cluster imbalance).

    These will now be counted as skipped, and rebalance/remove-brick
    status will display the additional counter.

    Change-Id: I674904d380b5f8300e9ca9e6af557c3d30d6cff4
    BUG: 989846
    Signed-off-by: shishir gowda <sgowda@redhat.com>
    Reviewed-on: http://review.gluster.org/5399
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Fixing create force issues where it returned true every time. (Avra Sengupta, 2013-07-29; 1 file, -42/+71)
    Now geo-rep create force will return true if a node is down, and
    log an appropriate message. It will also return true, with an
    appropriate log message, if the slave verification fails.

    However, it will not return true if the config file is deleted or
    corrupted, since the state_file's path can then no longer be
    fetched. It will also fail if the slave url is invalid. If the
    push-pem option is given and
    /var/lib/glusterd/geo-replication/common_secret.pem.pub is not
    present, then the create force command will also fail.

    Change-Id: Ie7532a0884ddf9c3008bd30832d171d5b53b540e
    BUG: 988314
    Signed-off-by: Avra Sengupta <asengupt@redhat.com>
    Reviewed-on: http://review.gluster.org/5405
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cli, glusterd: Cleanup logging of bd op commands. (Vijay Bellur, 2013-07-27; 1 file, -1/+0)
    This patch prevents messages of the form "bd op: %s : SUCCESS"
    from being logged in .cmd_log_history.

    Change-Id: Iebeb7e26d409bf99b9c8df0a5c1c5a5d30d78a61
    BUG: 823081
    Signed-off-by: Vijay Bellur <vbellur@redhat.com>
    Reviewed-on: http://review.gluster.org/4871
    Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
    Reviewed-by: M. Mohan Kumar <mohan@in.ibm.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* mgmt/glusterd: let each brick write the valgrind o/p to a different file (Raghavendra Bhat, 2013-07-26; 1 file, -1/+5)
    Till now, all the brick processes were writing the valgrind
    information to the same log file.

    Change-Id: I0251c943935e2901b729c71f21d0677edb9f6867
    BUG: 922877
    Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
    Reviewed-on: http://review.gluster.org/5394
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/cli changes for distributed geo-rep (Avra Sengupta, 2013-07-26; 12 files, -574/+2677)
    Commands:
        gluster system:: execute gsec_create
        gluster volume geo-rep <master> <slave-url> create [push-pem] [force]
        gluster volume geo-rep <master> <slave-url> start [force]
        gluster volume geo-rep <master> <slave-url> stop [force]
        gluster volume geo-rep <master> <slave-url> delete
        gluster volume geo-rep <master> <slave-url> config
        gluster volume geo-rep <master> <slave-url> status

    The geo-replication is distributed: the session will be created,
    and gsyncd will be spawned, on all relevant nodes, instead of only
    one node.

    geo-rep: collecting status-detail related data. Added a persistent
    store for saving information about TotalFilesSynced, TotalSyncTime
    and TotalBytesSynced.

    Changes in the status information in the socket:
        Existing (Ex):
            FilesSynced=2;BytesSynced=2507;Uptime=00:26:01;
        New (Ex):
            FilesSynced=2;BytesSynced=2507;Uptime=00:26:01;SyncTime=0.69978;
            TotalSyncTime=2.890044;TotalFilesSynced=6;TotalBytesSynced=143640;

    Persistent details are stored in
    /var/lib/glusterd/geo-replication/${mastervol}/${eSlave}-detail.status

    Change-Id: I1db7fc13ffca2e415c05200b0109b1254067f111
    BUG: 847839
    Original Author: Avra Sengupta <asengupt@redhat.com>
    Original Author: Venky Shankar <vshankar@redhat.com>
    Original Author: Aravinda VK <avishwan@redhat.com>
    Original Author: Amar Tumballi <amarts@redhat.com>
    Original Author: Csaba Henk <csaba@redhat.com>
    Signed-off-by: Avra Sengupta <asengupt@redhat.com>
    Reviewed-on: http://review.gluster.org/5132
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
    Tested-by: Vijay Bellur <vbellur@redhat.com>
* gsyncd: distribute the crawling load (Avra Sengupta, 2013-07-26; 2 files, -1/+15)
    * also consume changelog for change detection
    * status fixes
    * use new libgfchangelog done API
    * process (and sync) one changelog at a time

    Change-Id: I24891615bb762e0741b1819ddfdef8802326cb16
    BUG: 847839
    Original Author: Csaba Henk <csaba@redhat.com>
    Original Author: Aravinda VK <avishwan@redhat.com>
    Original Author: Venky Shankar <vshankar@redhat.com>
    Original Author: Amar Tumballi <amarts@redhat.com>
    Original Author: Avra Sengupta <asengupt@redhat.com>
    Signed-off-by: Avra Sengupta <asengupt@redhat.com>
    Reviewed-on: http://review.gluster.org/5131
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
    Tested-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: implement batched fsync in a single thread (Anand Avati, 2013-07-23; 1 file, -0/+8)
    Because of the extra fsync()s issued by the AFR transaction, they
    could potentially "clog" all the io-threads, denying unrelated
    operations from making progress. This patch assigns a dedicated
    thread to issue fsyncs, as an experimental feature to understand
    performance characteristics with the approach.

    As a basis, incoming individual fsync requests are grouped into
    batches, falling in the same @batch-fsync-delay-usec window of
    time. These windows can extend in practice, as processing of the
    previous batch can take longer than @batch-fsync-delay-usec while
    new requests are getting batched.

    The feature supports the following modes (similar to the -S modes
    of fs_mark; the 'syncfs-reverse-fsync' flushing step is sketched
    after this entry):

    - syncfs: in this mode one syncfs() is issued per batch, instead
      of N fsync()s (one per file).

    - syncfs-single-fsync: in this mode one syncfs() is issued per
      batch (which, on Linux, guarantees the completion of write-out
      of dirty pages in the filesystem up to that point) and one
      single fsync() to synchronize or flush the controller/drive
      cache. This corresponds to -S 2 of fs_mark.

    - syncfs-reverse-fsync: in this mode, one syncfs() is issued per
      batch, and all the open files in that batch are fsync()'ed in
      the reverse order of the queue. This corresponds to -S 4 of
      fs_mark.

    - reverse-fsync: in this mode, no syncfs() is issued and all the
      files in the batch are fsync()'ed in the reverse order. This
      corresponds to -S 3 of fs_mark.

    Change-Id: Ia1e170a810c780c8d80e02cf910accc4170c4cd4
    BUG: 927146
    Signed-off-by: Anand Avati <avati@redhat.com>
    Reviewed-on: http://review.gluster.org/4746
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
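    A minimal Linux-only C sketch of the 'syncfs-reverse-fsync'
    flushing step described above, assuming the caller has already
    collected the batch's file descriptors within one
    @batch-fsync-delay-usec window; this is an illustration, not the
    posix xlator's actual queueing code:

        #define _GNU_SOURCE
        #include <unistd.h>

        static void
        flush_batch_syncfs_reverse_fsync (int fds[], int count)
        {
                if (count == 0)
                        return;

                /* One syncfs() per batch writes out the filesystem's
                 * dirty pages up to this point... */
                syncfs (fds[0]);

                /* ...then fsync() each file in reverse queue order to
                 * flush the controller/drive cache per open file. */
                for (int i = count - 1; i >= 0; i--)
                        fsync (fds[i]);
        }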
* features/changelog: changelog translator (Avra Sengupta, 2013-07-22; 3 files, -11/+46)
    This is the initial version of the changelog translator.

    What is it
    -----------
    The goal is to capture changes performed on a GlusterFS volume.
    The translator needs to be loaded on the server (bricks) and
    captures changes in a plain text file inside a configured
    directory path (controlled by "changelog-dir"; it should be
    somewhere in <export>/.glusterfs/changelog by default).

    Changes are classified into 3 types:
    - Data     : TYPE-I
    - Metadata : TYPE-II
    - Entry    : TYPE-III

    The changelog file is rolled over after a certain time interval
    (defaults to 60 seconds), after which a new changelog is started.
    The thing to be noted here is that within a time interval (time
    slice), multiple changes to an inode are recorded only once (i.e.,
    100+ writes on an inode within the time slice have only a single
    corresponding entry in the changelog file). That way we do not
    bloat up the changelog and also save lots of writes.

    Changelog Format
    -----------------
    TYPE-I and TYPE-II changes have the gfid of the entity on which
    the operation happened. TYPE-III, being an entry op, requires the
    parent gfid and the basename. The changelog format has been kept
    minimal and it's up to the consumers to do the heavy lifting of
    figuring out deletes, renames etc. A single changelog file records
    all three types of changes, with each change starting with an
    identifier ("D": DATA, "M": METADATA and "E": ENTRY). An option is
    provided for the encoding type (see TUNABLES); a dispatch sketch
    follows this entry.

    Consumers
    ----------
    The only consumer as of today would be geo-replication, although
    backup utilities, self-heal and bit-rot detection could be
    possible consumers in the future.

    CLI
    ----
    By default, change-logging is disabled (the translator is present
    in the server graph but does nothing). When enabled (via cli),
    each brick starts to log the changes. There is a set of tunables
    that can be used to change the translator's behaviour:

    - enable/disable changelog (disabled by default)
        gluster volume set <volume> changelog {on|off}
    - set the logging directory (<brick>/.glusterfs/changelogs is the
      default)
        gluster volume set <volume> changelog-dir /path/to/dir
    - select encoding type (binary (default) or ascii)
        gluster volume set <volume> encoding {binary|ascii}
    - change the rollover time for the logs (60 secs by default)
        gluster volume set <volume> rollover-time <secs>
    - when secs > 0, the changelog file is not open()'d with the
      O_SYNC flag, and fsync is triggered periodically every <secs>
      seconds
        gluster volume set <volume> fsync-interval <secs>

    features/changelog: changelog consumer library (libgfchangelog)
    A shared library is provided for the consumers of the changelogs
    for easy access via APIs. An application can link against this
    library and request changelog updates. Conversion of binary logs
    to human-readable ascii format is also taken care of by the
    library, which keeps a copy of the changelog in an
    application-provided working directory.

    Change-Id: I75575fb7f1c53d2bec3dba1a329ea7bb3c628497
    BUG: 847839
    Original Author: Venky Shankar <vshankar@redhat.com>
    Signed-off-by: Avra Sengupta <asengupt@redhat.com>
    Reviewed-on: http://review.gluster.org/5127
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
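    A standalone C sketch dispatching on the record identifiers just
    described. The exact record layout beyond the leading identifier is
    an assumption here; a real consumer would go through libgfchangelog
    rather than parse the files directly:

        #include <stdio.h>

        static void
        handle_record (const char *rec)
        {
                switch (rec[0]) {
                case 'D':  /* TYPE-I: data change; gfid follows */
                case 'M':  /* TYPE-II: metadata change; gfid follows */
                        printf ("gfid touched: %s\n", rec + 1);
                        break;
                case 'E':  /* TYPE-III: entry op; parent gfid + basename */
                        printf ("entry op: %s\n", rec + 1);
                        break;
                default:
                        fprintf (stderr, "unknown record: %s\n", rec);
                }
        }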
* glusterd: Give up biglock before brick's rpc unref (Krishnan Parthasarathi, 2013-07-11; 1 file, -1/+5)
    This is to prevent the possibility of a deadlock when
    rpc_connection_cleanup is called in the same thread as
    rpc_clnt_unref.

    Change-Id: Ia4dcc0a8a6e6158d4ddec68b780fccbc4cd64adb
    BUG: 962619
    Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
    Reviewed-on: http://review.gluster.org/5321
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Correct op-version of some options (Kaushal M, 2013-07-11; 1 file, -23/+23)
    New options being introduced in the master branch should now have
    their op-version set to GD_OP_VERSION_MAX (3). Some of the options
    have been backported to the release-3.3 branch and hence should
    have their op-version reduced. Some other options had their
    op-version incorrectly set as 1.

    Change-Id: If40325b7b2da7aa36f90261024117cd18cf51ef0
    BUG: 981278
    Signed-off-by: Kaushal M <kaushal@redhat.com>
    Reviewed-on: http://review.gluster.org/5318
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd/common-utils: move hostname helper functions to common-utils (Krishnan Parthasarathi, 2013-07-04; 6 files, -255/+20)
    Change-Id: If47e209cb61ea0eb74ee2d6ef9e9342b2d6ee13a
    BUG: 980838
    Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
    Reviewed-on: http://review.gluster.org/5261
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: add a simple health-checker (Niels de Vos, 2013-07-03; 1 file, -0/+4)
    The goal of this health-checker is to detect fatal issues of the
    underlying storage that is used for exporting a brick. The current
    implementation requires the filesystem to detect the storage
    error, after which it will notify the parent xlators and exit the
    glusterfsd (brick) process to prevent further troubles (a minimal
    sketch of this loop follows this entry).

    The interval the health-check runs at can be configured per volume
    with the storage.health-check-interval option. The default
    interval is 30 seconds.

    It is not trivial to write an automated test-case with the current
    prove-framework. These are the manual steps that can be done to
    verify the functionality:

    - setup a Logical Volume (/dev/bz970960/xfs) and format it as XFS
      for brick usage
    - create a volume with the one brick
        # gluster volume create failing_xfs glufs1:/bricks/failing_xfs/data
        # gluster volume start failing_xfs
    - mount the volume and verify the functionality
    - make the storage fail (use device-mapper, or pull disks)
        # dmsetup table
        ..
        bz970960-xfs: 0 196608 linear 7:0 2048
        # echo 0 196608 error > dmsetup-error-target
        # dmsetup load bz970960-xfs dmsetup-error-target
        # dmsetup resume bz970960-xfs
        # dmsetup table
        ...
        bz970960-xfs: 0 196608 error
    - notice the errors caught by syslog:
        Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 buf count 512
        Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): I/O Error Detected. Shutting down filesystem
        Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): Please umount the filesystem and rectify the problem(s)
        Jun 24 11:31:49 vm130-32 kernel: VFS:Filesystem freeze failed
        Jun 24 11:31:50 vm130-32 GlusterFS[1969]: [2013-06-24 10:31:50.500674] M [posix-helpers.c:1114:posix_health_check_thread_proc] 0-failing_xfs-posix: health-check failed, going down
        Jun 24 11:32:09 vm130-32 kernel: XFS (dm-2): xfs_log_force: error 5 returned.
        Jun 24 11:32:20 vm130-32 GlusterFS[1969]: [2013-06-24 10:32:20.508690] M [posix-helpers.c:1119:posix_health_check_thread_proc] 0-failing_xfs-posix: still alive! -> SIGTERM
    - these errors are in the log of the brick as well:
        [2013-06-24 10:31:50.500607] W [posix-helpers.c:1102:posix_health_check_thread_proc] 0-failing_xfs-posix: stat() on /bricks/failing_xfs/data returned: Input/output error
        [2013-06-24 10:31:50.500674] M [posix-helpers.c:1114:posix_health_check_thread_proc] 0-failing_xfs-posix: health-check failed, going down
        [2013-06-24 10:32:20.508690] M [posix-helpers.c:1119:posix_health_check_thread_proc] 0-failing_xfs-posix: still alive! -> SIGTERM
    - the glusterfsd process has exited correctly:
        # gluster volume status
        Status of volume: failing_xfs
        Gluster process                         Port    Online  Pid
        ------------------------------------------------------------------------------
        Brick glufs1:/bricks/failing_xfs/data   N/A     N       N/A
        NFS Server on localhost                 2049    Y       1897

    Change-Id: Ic247fbefb97f7e861307a5998a9a7a3ecc80aa07
    BUG: 971774
    Signed-off-by: Niels de Vos <ndevos@redhat.com>
    Reviewed-on: http://review.gluster.org/5176
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
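    A minimal standalone C sketch of the loop the brick log above hints
    at: stat() the brick directory every interval and bring the process
    down on failure. This assumes POSIX threads; the real code lives in
    posix-helpers.c and differs in detail:

        #include <errno.h>
        #include <pthread.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <sys/stat.h>
        #include <unistd.h>

        struct hc_args {
                const char *brick_path;
                unsigned    interval;  /* storage.health-check-interval */
        };

        static void *
        health_check_thread (void *data)
        {
                struct hc_args *args = data;
                struct stat     st;

                for (;;) {
                        sleep (args->interval);  /* 30 seconds by default */
                        if (stat (args->brick_path, &st) < 0) {
                                fprintf (stderr,
                                         "stat() on %s returned: %s; "
                                         "health-check failed, going down\n",
                                         args->brick_path, strerror (errno));
                                /* exit the brick so clients can fail over */
                                exit (EXIT_FAILURE);
                        }
                }
                return NULL;
        }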