path: root/xlators/mgmt
Commit message (Author, Age, Files, Lines)
* common-utils: Move glusterd_is_service_running() to common-utils (Krutika Dhananjay, 2013-08-12, 4 files, -46/+9)
  Change-Id: I96cbe03511cbecb112418da82c44c00fbab74ba3
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd, quotad: volume-id option fixups (Brian Foster, 2013-08-12, 1 file, -0/+14)
  A few little hacks to set the volume id on the quota server and a mapping option on quotad to map the volume name to the uuid passed via the lookup request.
  Change-Id: Ic151acb18ed29d2ee4ae5d1bc6841ae4a4de176a
  Original-author: Brian Foster <bfoster@redhat.com>
  Signed-off-by: Brian Foster <bfoster@redhat.com>
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
* glusterd,cli: changes to quota list <path> ... (Krutika Dhananjay, 2013-08-12, 1 file, -67/+11)
  Change-Id: Ia37020c3aa11af6eed3af09cfe390b848b028b6a
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: Clean up and fix glusterd_op_quota() (Krutika Dhananjay, 2013-08-12, 2 files, -100/+70)
  ... and also fix cli logging
  In glusterd_op_quota(),
  * do not modify ret after going to 'out', as this causes the failure status (-1) to be overwritten, thereby causing the command to return 0 even on failure.
  * knock off additional labels like create_vol.
  * replace 'if' statements with a 'switch case' statement.
  * delete only the 3 timeouts and the default-soft-limit, if present, from volinfo->dict, upon disablement of quota.
  Change-Id: If486a5373b66f2379d6d041a974d9b824fcb8518
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: Changes to 'quota remove' subcommand behavior (Krutika Dhananjay, 2013-08-12, 1 file, -44/+39)
  Change-Id: Ifdc60071146587dc5c60d9a53a92d49b3487fd82
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: club limit-usage and soft-limit into a single command (Krutika Dhananjay, 2013-08-12, 1 file, -165/+83)
  Change-Id: I5f680675576aeec584b497eb25dd804a9dd6d690
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd, cli: Provide status of quotad in 'volume status' (Krutika Dhananjay, 2013-08-12, 5 files, -19/+80)
  Change-Id: I5e90376ecfe11ae5a3bca936d9d9acdd54c337d7
  BUG: 969461
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: Move timeout options and default-soft-limit to quota xlator (Krutika Dhananjay, 2013-08-12, 2 files, -80/+6)
  Write the 3 timeout options {soft-timeout, hard-timeout, alert-time} and default-soft-limit, if explicitly set, into brick volfiles.
  Change-Id: Ie3229a8ab1b081a5936defd4f977afc8a19dad50
  BUG: 969461
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
* glusterd: Spawn quotad on every node (Krutika Dhananjay, 2013-08-12, 3 files, -31/+46)
  ... and also trigger a reconfigure when the following quota sub-commands are executed:
    a. default-soft-limit,
    b. hard-timeout,
    c. soft-timeout, and
    d. alert-time.
  Also start/restart/stop quotad only when quota is enabled or disabled on any volume.
  Tests performed in a two node cluster:
    a. Create and start a volume, enable quota on it and check if quotad is spawned on both the nodes.
    b. Execute all quota sub-commands on the volume except 'enable' and 'disable' and verify that the pid of quota daemon doesn't change.
    c. Stop the volume and verify that quotad is stopped.
    d. Start it again. Quotad must be started now.
    e. Create, start and enable quota on a second volume, verify that the pid of quotad changes on both the nodes (indicating a restart).
    f. Disable quota on one of the volumes and verify that quotad's pid changes.
    g. Disable quota on the second volume too and verify that quotad is stopped on both the nodes.
    h. Enable quota again on one of the volumes, and verify that quotad is started on both the nodes.
    i. Add a new node into this cluster and verify that quotad is spawned on this node too.
  Change-Id: Ie93ab69c685051e196c377cff15078a1cde17fca
  BUG: 969461
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
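The start/restart/stop behaviour exercised by tests a, e, f and g above can be summarised as a small decision function. This is only a sketch inferred from the test list, not glusterd's actual code; `quotad_action` and its parameters are hypothetical names.

```python
def quotad_action(enabled_volumes_after: int, was_running: bool) -> str:
    """Decide what to do with quotad after a quota enable/disable.

    enabled_volumes_after: started volumes that still have quota enabled
    once the operation completes (hypothetical bookkeeping).
    """
    if enabled_volumes_after == 0:
        # Nothing left to serve: stop the daemon if it is up.
        return "stop" if was_running else "none"
    # At least one volume needs quotad; a running daemon is restarted
    # so that it picks up the changed set of volumes.
    return "restart" if was_running else "start"


print(quotad_action(1, False))  # first volume gets quota enabled
print(quotad_action(2, True))   # second volume gets quota enabled
print(quotad_action(0, True))   # quota disabled on the last volume
```

Sub-commands other than enable and disable are not covered here, since the commit message says they trigger a reconfigure without changing the daemon's pid.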
* features/quota: Improvements to quota (Varun Shastry, 2013-08-12, 10 files, -249/+1117)
  Old implementation
  * Client side implementation of quota
    - Not secure
    - Increased traffic in updating the ctx

  New implementation
  * Quota is implemented in 2 stages: soft and hard quota.
    Upon reaching the soft quota limit on a directory, it logs/alerts in the quota daemon log (i.e. DEFAULT_LOG_DIR/quotad.log), and no more writes are allowed after the hard quota limit. After the soft limit is reached, the daemon alerts the user/admin repeatedly every 'alert-time', which is configurable.
  * Quota is moved to server-side. There will be 2 quota xlators:
    i. Quota Server
       It takes care of enforcing the quota and maintains the context specific to the brick. Since it doesn't have the complete picture of the cluster, cluster-wide usage is updated from the quota daemon. This updated context is saved and used for the enforcement. It updates its context by searching for QUOTA_UPDATE_KEY in the dict of the setxattr call, and is updated from nowhere else. The quota xlator is always loaded in the server graph and is bypassed if the feature is not enabled.
       Options specific to quota-server:
         server-quota - Specifies whether the feature is on/off. It is used to bypass the quota if turned off.
         deem-statfs - If set to on, it takes quota limits into consideration while estimating fs size (df command).
    ii. Quota Daemon
       This is the new xlator introduced with this patch. It is the *gluster client* process with no mount point, started upon enabling quota or restarting the volume. This is a single process for all the volumes in the cluster. Its volfile is stored in GLUSTERD_DEFAULT_WORK_DIR/quotad/quotad.vol. It queries the sizes on all the bricks, aggregates them and sends back the updated size, periodically.
       The timeout between successive updates is configurable and, by default, longer for below-soft-quota usage and shorter for above-soft-quota usage. It maintains the timeout inside the limit structure based on the usage (below soft limit or above soft limit). There will be a thread running per volume which iterates through the list and decides whether the size is to be queried in the current iteration based on its timeout. The next iteration time is the least of the timeouts in the list of entries.
       It maintains a separate inode table for each volume in quotad. In the first iteration it builds the table for quota-dirs (dirs on which a limit is set) and their components.
       Options specific to quotad:
         hard-timeout - Timeout for updating the usage to the quota-server when the usage crosses the soft-limit.
         soft-timeout - Timeout for updating the usage to the quota-server when the usage is below the soft-limit.
         alert-time - Frequency of logging after the usage has reached the soft limit.
    Options common to both:
      default-soft-limit - This is used when individual paths are not configured with a soft-limit; the default value of this option is 90% of the hard-limit.
      limit-set - String containing all the limits.
    Thus in the current implementation we'll have 2 quota xlators: one in the server graph and one in the trusted client (quota daemon), whose sole purpose is to aggregate the quota size xattrs from all the bricks and send them to the server quota xlator.
  * Changes in glusterd and CLI
    A single volfile is created for all the volumes, similar to the nfs volfile. All files related to the quota client (volfile, pid etc) are stored in GLUSTERD_DEFAULT_WORK_DIR/quotad/.
    The new pattern in which the quota limit is stored:
      limit-set = <single-dir-limit>[,<single-dir-limit>]
      single-dir-limit = <abs-path>:<hard-limit>[:<soft-limit-in-percent>]
    It also introduces new options:
      volume quota <VOLNAME> {enable|disable|list [<path> ...]|remove <path>| default-soft-limit <percent>} |
      volume quota <VOLNAME> {limit-usage <path> <size> |soft-limit <path> <percent>} |
      volume quota <VOLNAME> {alert-time|soft-timeout|hard-timeout} {<time>}
  Credit:
    Raghavendra Bhat <rabhat@redhat.com>
    Varun Shastry <vshastry@redhat.com>
    Shishir Gowda <sgowda@redhat.com>
    Kruthika Dhananjay <kdhananj@redhat.com>
    Brian Foster <bfoster@redhat.com>
    Krishnan Parthasarathi <kparthas@redhat.com>
  Change-Id: I16ec5be0c2faaf42b14034b9ccaf17796adef082
  BUG: 969461
  Signed-off-by: Varun Shastry <vshastry@redhat.com>
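To make the limit-set grammar in this commit message concrete, here is a minimal parsing sketch. It is not the xlator's parser: `parse_limit_set` and `DirLimit` are hypothetical names, the hard limit is kept as an opaque string because its size syntax is not given here, and the sketch assumes paths contain neither ',' nor ':'. The 90% fallback is the default-soft-limit value stated in the message.

```python
from typing import List, NamedTuple

DEFAULT_SOFT_LIMIT_PERCENT = 90  # default-soft-limit as stated in the commit message


class DirLimit(NamedTuple):
    path: str
    hard_limit: str          # raw string; size syntax is not specified in the source
    soft_limit_percent: int


def parse_limit_set(value: str,
                    default_soft: int = DEFAULT_SOFT_LIMIT_PERCENT) -> List[DirLimit]:
    """Parse <abs-path>:<hard-limit>[:<soft-limit-in-percent>][,...]."""
    limits = []
    for entry in value.split(","):
        if not entry:
            continue
        parts = entry.split(":")
        if len(parts) not in (2, 3) or not parts[0].startswith("/"):
            raise ValueError(f"malformed single-dir-limit: {entry!r}")
        # Assumption: the percentage may or may not carry a trailing '%'.
        soft = int(parts[2].rstrip("%")) if len(parts) == 3 else default_soft
        limits.append(DirLimit(parts[0], parts[1], soft))
    return limits


# The size strings below are illustrative values only.
for limit in parse_limit_set("/dir1:10GB,/dir2/sub:500MB:75"):
    print(limit)
```

Entries without an explicit soft limit fall back to the default, which mirrors how default-soft-limit is described as applying to paths that have none configured.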
* Correcting a log message in glusterd-geo-rep.c (M S Vishwanath Bhat, 2013-08-05, 1 file, -1/+1)
  Change-Id: I4352f513fc5616daa20e9a4ad51a63fb13a27dff
  BUG: 847839
  Signed-off-by: M S Vishwanath Bhat <vbhat@redhat.com>
  Reviewed-on: http://review.gluster.org/5472
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* Add switch and nufa options to 'gluster cli' (Harshavardhana, 2013-08-03, 2 files, -17/+53)
  Change-Id: Ic3c43291e0e1ead0d89c0436e8d70aa5dee2f543
  BUG: 924488
  Signed-off-by: Harshavardhana <harsha@harshavardhana.net>
  Reviewed-on: http://review.gluster.org/5391
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* cli,glusterd: Fix when tasks are shown in 'volume status' (Kaushal M, 2013-08-03, 1 file, -0/+4)
  Asynchronous tasks are shown in 'volume status' only for a normal volume status request for either all volumes or a single volume.
  Change-Id: I9d47101511776a179d213598782ca0bbdf32b8c2
  BUG: 888752
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/5308
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Use volume op-versions during volgen (Kaushal M, 2013-08-02, 4 files, -19/+14)
  Instead of using the cluster op-version, volume op-version is used to enable open-behind during volgen. For doing this, the volume op-versions are updated before regenerating the volfiles.
  Change-Id: I675bb549bf7c7c0279030dca698fb530781addc6
  BUG: 990830
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/5385
  Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cluster/dht: Re-initialize skipped file count in glusterd (shishir gowda, 2013-07-31, 1 file, -0/+1)
  Change-Id: I42d08b3a6a7a3839f5e9953e1f83959222c080f8
  Signed-off-by: shishir gowda <sgowda@redhat.com>
  BUG: 989846
  Reviewed-on: http://review.gluster.org/5446
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd : Checking session created or not in case of geo-rep stop (Avra Sengupta, 2013-07-31, 1 file, -2/+6)
  Perform a statefile check in case of geo-rep stop, so as to provide a proper error message when the session has not been created. However, in case of geo-rep stop force, we allow the command to succeed even if the session has not been created, because the stop command is a failsafe command to stop running geo-rep sessions on any nodes.
  Change-Id: I2b6a0253de977633606c422cbbc9e37cede9a268
  BUG: 989541
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/5417
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd : initiating gsyncd restart during add-brick (Avra Sengupta, 2013-07-31, 3 files, -25/+150)
  During add-brick, when a new brick is added on one of the nodes that was already a part of the existing volume, and gsyncd was already running on that node, then all gsyncd processes running on that node, for that particular master and any slave sessions, will be restarted.
  If a new brick is added on a new node, then after adding the brick, the user has to perform the following steps:
    1. gluster system:: execute gsec_create
    2. gluster volume geo-replication <master-vol> <slave-vol> create push-pem force
    3. gluster volume geo-replication <master-vol> <slave-vol> start force
  Change-Id: I4b9633e176c80e4a7cf33f42ebfa47ab8fc283f1
  BUG: 989532
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/5416
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* mgmt/glusterd: Fix a minor typo. (Vijay Bellur, 2013-07-31, 1 file, -1/+1)
  Thanks to Patrick Matthäi <pmatthaei@debian.org> for the patch.
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
  Change-Id: I59da74298894ccc2ab30967ffe44cc844aa73f82
  BUG: 814534
  Reviewed-on: http://review.gluster.org/5436
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Anand Avati <avati@redhat.com>
* cluster/dht: Treat migration failures due to space constraints as skipped (shishir gowda, 2013-07-30, 2 files, -0/+28)
  Currently rebalance/remove-brick ops display the migration failed count even for files which failed due to space issues (not enough space for the file, or migration leading to cluster imbalance). These will now be counted as skipped, and rebalance/remove-brick status will display the additional counter.
  Change-Id: I674904d380b5f8300e9ca9e6af557c3d30d6cff4
  BUG: 989846
  Signed-off-by: shishir gowda <sgowda@redhat.com>
  Reviewed-on: http://review.gluster.org/5399
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Fixing create force issues while it returned true everytime. (Avra Sengupta, 2013-07-29, 1 file, -42/+71)
  Now geo-rep create force will return true if a node is down, and log an appropriate message. It will also return true with an appropriate log message if the slave verification fails. However it will not return true if the config file is deleted or corrupted, so as not to get the state_file's path. It will also fail if the slave url is invalid. If the push-pem option is given and /var/lib/glusterd/geo-replication/common_secret.pem.pub is not present, then the create force command will also fail.
  Change-Id: Ie7532a0884ddf9c3008bd30832d171d5b53b540e
  BUG: 988314
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/5405
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* cli, glusterd: Cleanup logging of bd op commands. (Vijay Bellur, 2013-07-27, 1 file, -1/+0)
  This patch prevents messages of the form "bd op: %s : SUCCESS" from being logged in .cmd_log_history.
  Change-Id: Iebeb7e26d409bf99b9c8df0a5c1c5a5d30d78a61
  BUG: 823081
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
  Reviewed-on: http://review.gluster.org/4871
  Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-by: M. Mohan Kumar <mohan@in.ibm.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* mgmt/glusterd: let each brick write the valgrind o/p to different file (Raghavendra Bhat, 2013-07-26, 1 file, -1/+5)
  Till now all the brick processes were writing the valgrind information to the same log file.
  Change-Id: I0251c943935e2901b729c71f21d0677edb9f6867
  BUG: 922877
  Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-on: http://review.gluster.org/5394
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd/cli changes for distributed geo-rep (Avra Sengupta, 2013-07-26, 12 files, -574/+2677)
  Commands:
    gluster system:: execute gsec_create
    gluster volume geo-rep <master> <slave-url> create [push-pem] [force]
    gluster volume geo-rep <master> <slave-url> start [force]
    gluster volume geo-rep <master> <slave-url> stop [force]
    gluster volume geo-rep <master> <slave-url> delete
    gluster volume geo-rep <master> <slave-url> config
    gluster volume geo-rep <master> <slave-url> status
  The geo-replication is distributed. The session will be created, and gsyncd will be spawned on all relevant nodes, instead of only one node.

  geo-rep: Collecting status detail related data
  Added persistent store for saving information about TotalFilesSynced, TotalSyncTime, TotalBytesSynced.
  Changes in the status information in socket:
    Existing(Ex): FilesSynced=2;BytesSynced=2507;Uptime=00:26:01;
    New(Ex): FilesSynced=2;BytesSynced=2507;Uptime=00:26:01;SyncTime=0.69978; TotalSyncTime=2.890044;TotalFilesSynced=6;TotalBytesSynced=143640;
  Persistent details stored in /var/lib/glusterd/geo-replication/${mastervol}/${eSlave}-detail.status
  Change-Id: I1db7fc13ffca2e415c05200b0109b1254067f111
  BUG: 847839
  Original Author: Avra Sengupta <asengupt@redhat.com>
  Original Author: Venky Shankar <vshankar@redhat.com>
  Original Author: Aravinda VK <avishwan@redhat.com>
  Original Author: Amar Tumballi <amarts@redhat.com>
  Original Author: Csaba Henk <csaba@redhat.com>
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/5132
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  Tested-by: Vijay Bellur <vbellur@redhat.com>
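The new status line in this commit message is a flat sequence of key=value pairs terminated by semicolons. As an illustration only (this is not gsyncd's or glusterd's parser, and `parse_status` is a hypothetical helper), such a line could be split like this:

```python
def parse_status(line: str) -> dict:
    """Split 'Key=Value;Key=Value;' into a dict of strings."""
    fields = {}
    for item in line.split(";"):
        item = item.strip()          # tolerate the space after 'SyncTime=...;'
        if not item:
            continue                 # skip the empty piece after the final ';'
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


status = parse_status(
    "FilesSynced=2;BytesSynced=2507;Uptime=00:26:01;SyncTime=0.69978; "
    "TotalSyncTime=2.890044;TotalFilesSynced=6;TotalBytesSynced=143640;"
)
print(status["TotalFilesSynced"], status["Uptime"])
```

Values are left as strings because the fields mix integers, floats and a clock-style uptime; a consumer would convert only the ones it needs.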
* gsyncd: distribute the crawling load (Avra Sengupta, 2013-07-26, 2 files, -1/+15)
  * also consume changelog for change detection.
  * Status fixes
  * Use new libgfchangelog done API
  * process (and sync) one changelog at a time
  Change-Id: I24891615bb762e0741b1819ddfdef8802326cb16
  BUG: 847839
  Original Author: Csaba Henk <csaba@redhat.com>
  Original Author: Aravinda VK <avishwan@redhat.com>
  Original Author: Venky Shankar <vshankar@redhat.com>
  Original Author: Amar Tumballi <amarts@redhat.com>
  Original Author: Avra Sengupta <asengupt@redhat.com>
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/5131
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  Tested-by: Vijay Bellur <vbellur@redhat.com>
* storage/posix: implement batched fsync in a single thread (Anand Avati, 2013-07-23, 1 file, -0/+8)
  Because of the extra fsync()s issued by AFR transactions, they could potentially "clog" all the io-threads, denying unrelated operations from making progress.
  This patch assigns a dedicated thread to issue fsyncs, as an experimental feature to understand performance characteristics with the approach.
  As a basis, incoming individual fsync requests are grouped into batches, falling in the same @batch-fsync-delay-usec window of time. These windows can extend in practice, as processing of the previous batch can take longer than @batch-fsync-delay-usec while new requests are getting batched.
  The feature supports the following modes (similar to the -S modes of fs_mark):
  - syncfs: In this mode one syncfs() is issued per batch, instead of N fsync()s (one per file).
  - syncfs-single-fsync: In this mode one syncfs() is issued per batch (which, on Linux, guarantees the completion of write-out of dirty pages in the filesystem up to that point) and one single fsync() to synchronize or flush the controller/drive cache. This corresponds to -S 2 of fsmark.
  - syncfs-reverse-fsync: In this mode, one syncfs() is issued per batch, and all the open files in that batch are fsync()'ed in the reverse order of the queue. This corresponds to -S 4 of fsmark.
  - reverse-fsync: In this mode, no syncfs() is issued and all the files in the batch are fsync()'ed in the reverse order. This corresponds to -S 3 of fsmark.
  Change-Id: Ia1e170a810c780c8d80e02cf910accc4170c4cd4
  BUG: 927146
  Signed-off-by: Anand Avati <avati@redhat.com>
  Reviewed-on: http://review.gluster.org/4746
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
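The batching idea described above can be illustrated with a small, purely computational sketch. It groups request arrival times into fixed windows and shows the reverse ordering used by the reverse-fsync modes. It issues no real fsync() or syncfs() calls, it does not model the window extension the message mentions, and `batch_by_window` and `reverse_fsync_order` are hypothetical names rather than functions from storage/posix.

```python
from typing import List


def batch_by_window(arrivals_usec: List[int], delay_usec: int) -> List[List[int]]:
    """Group arrival timestamps into batches.

    A window opens at the first request and closes delay_usec later;
    the next request after that opens a new window.
    """
    batches: List[List[int]] = []
    window_start = None
    for t in sorted(arrivals_usec):
        if window_start is None or t - window_start >= delay_usec:
            batches.append([])
            window_start = t
        batches[-1].append(t)
    return batches


def reverse_fsync_order(batch: List[int]) -> List[int]:
    """Order in which a reverse-fsync style mode would walk one batch."""
    return list(reversed(batch))


for batch in batch_by_window([0, 100, 250, 1000, 1200], delay_usec=500):
    print(batch, "->", reverse_fsync_order(batch))
```

With a 500 microsecond window, five requests collapse into two batches, which is the saving the syncfs modes aim for: one call per batch instead of one per request.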
* features/changelog: changelog translator (Avra Sengupta, 2013-07-22, 3 files, -11/+46)
  This is the initial version of the Changelog Translator.

  What is it
  -----------
  The goal is to capture changes performed on a GlusterFS volume. The translator needs to be loaded on the server (bricks) and captures changes in a plain text file inside a configured directory path (controlled by "changelog-dir", should be somewhere in <export>/.glusterfs/changelog by default).
  Changes are classified into 3 types:
  - Data     : TYPE-I
  - Metadata : TYPE-II
  - Entry    : TYPE-III
  The changelog file is rolled over after a certain time interval (defaults to 60 seconds), after which a new changelog is started. The thing to be noted here is that for a time interval (time slice) multiple changes for an inode are recorded only once (i.e. 100+ writes on an inode that happen within the time slice have only a single corresponding entry in the changelog file). That way we do not bloat up the changelog and also save lots of writes.

  Changelog Format
  -----------------
  TYPE-I and TYPE-II changes have the gfid of the entity on which the operation happened. TYPE-III, being an entry op, requires the parent gfid and the basename. The changelog format has been kept minimal and it's up to the consumers to do the heavy lifting of figuring out deletes, renames etc. A single changelog file records all three types of changes, with each change starting with an identifier ("D": DATA, "M": METADATA and "E": ENTRY). An option is provided for the encoding type (see TUNABLES).

  Consumers
  ----------
  The only consumer as of today would be geo-replication, although backup utilities, self-heal and bit-rot detection could be possible consumers in the future.

  CLI
  ----
  By default, change-logging is disabled (the translator is present in the server graph but does nothing). When enabled (via cli) each brick starts to log the changes.
  There is a set of tunables that can be used to change the translator's behaviour:
  - enable/disable changelog (disabled by default)
      gluster volume set <volume> changelog {on|off}
  - set the logging directory (<brick>/.glusterfs/changelogs is the default)
      gluster volume set <volume> changelog-dir /path/to/dir
  - select encoding type (binary (default) or ascii)
      gluster volume set <volume> encoding {binary|ascii}
  - change the rollover time for the logs (60 secs by default)
      gluster volume set <volume> rollover-time <secs>
  - when secs > 0, the changelog file is not open()'d with the O_SYNC flag and fsync is triggered periodically every <secs> seconds.
      gluster volume set <volume> fsync-interval <secs>

  features/changelog: changelog consumer library (libgfchangelog)
  A shared library is provided for the consumers of the changelogs for easy access via APIs. An application can link against this library and request changelog updates. Conversion of binary logs to human-readable ascii format is also taken care of by the library, which keeps a copy of the changelog in an application-provided working directory.
  Change-Id: I75575fb7f1c53d2bec3dba1a329ea7bb3c628497
  BUG: 847839
  Original Author: Venky Shankar <vshankar@redhat.com>
  Signed-off-by: Avra Sengupta <asengupt@redhat.com>
  Reviewed-on: http://review.gluster.org/5127
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
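The "recorded only once per time slice" behaviour can be sketched as a de-duplication step over the changes seen in one rollover interval. This is a toy model, not the translator's on-disk format: the line layout, the `record_slice` name and the choice to de-duplicate every change type by (identifier, entity) are assumptions for illustration. Only the "D", "M" and "E" identifiers come from the commit message.

```python
from typing import Iterable, List, Tuple


def record_slice(events: Iterable[Tuple[str, str]]) -> List[str]:
    """Collapse repeated changes within one time slice.

    Each event is (kind, ident): kind is "D", "M" or "E"; ident is a gfid
    for data/metadata changes, or "parent-gfid/basename" for entry changes.
    """
    seen = set()
    lines = []
    for kind, ident in events:
        if kind not in ("D", "M", "E"):
            raise ValueError(f"unknown change type: {kind!r}")
        if (kind, ident) in seen:
            continue                 # already recorded in this slice
        seen.add((kind, ident))
        lines.append(f"{kind} {ident}")
    return lines


# 100 writes plus one metadata change on the same inode, and one entry op.
events = [("D", "gfid-1")] * 100 + [("M", "gfid-1"), ("E", "gfid-0/file.txt")]
print(record_slice(events))
```

The `seen` set would be reset at each rollover, so the same inode can appear again in the next changelog file.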
* glusterd: Give up biglock before brick's rpc unref (Krishnan Parthasarathi, 2013-07-11, 1 file, -1/+5)
  This is to prevent the possibility of a deadlock when rpc_connection_cleanup is called in the same thread as rpc_clnt_unref.
  Change-Id: Ia4dcc0a8a6e6158d4ddec68b780fccbc4cd64adb
  BUG: 962619
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/5321
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Correct op-version of some options (Kaushal M, 2013-07-11, 1 file, -23/+23)
  New options being introduced in the master branch should now have op-version set to the GD_OP_VERSION_MAX (3). Some of the options have been backported to release-3.3 branch and hence should have their op-version reduced. Some other options had op-version incorrectly set as 1.
  Change-Id: If40325b7b2da7aa36f90261024117cd18cf51ef0
  BUG: 981278
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/5318
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd/common-utils: move hostname helper functions to common-utils (Krishnan Parthasarathi, 2013-07-04, 6 files, -255/+20)
  Change-Id: If47e209cb61ea0eb74ee2d6ef9e9342b2d6ee13a
  BUG: 980838
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/5261
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* posix: add a simple health-checker (Niels de Vos, 2013-07-03, 1 file, -0/+4)
  The goal of this health-checker is to detect fatal issues of the underlying storage that is used for exporting a brick. The current implementation requires the filesystem to detect the storage error, after which it will notify the parent xlators and exit the glusterfsd (brick) process to prevent further troubles.
  The interval at which the health-check runs can be configured per volume with the storage.health-check-interval option. The default interval is 30 seconds.
  It is not trivial to write an automated test-case with the current prove-framework. These are the manual steps that can be done to verify the functionality:
  - setup a Logical Volume (/dev/bz970960/xfs) and format it as XFS for brick usage
  - create a volume with the one brick
      # gluster volume create failing_xfs glufs1:/bricks/failing_xfs/data
      # gluster volume start failing_xfs
  - mount the volume and verify the functionality
  - make the storage fail (use device-mapper, or pull disks)
      # dmsetup table
      ..
      bz970960-xfs: 0 196608 linear 7:0 2048
      # echo 0 196608 error > dmsetup-error-target
      # dmsetup load bz970960-xfs dmsetup-error-target
      # dmsetup resume bz970960-xfs
      # dmsetup table
      ...
      bz970960-xfs: 0 196608 error
  - notice the errors caught by syslog:
      Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 buf count 512
      Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): I/O Error Detected. Shutting down filesystem
      Jun 24 11:31:49 vm130-32 kernel: XFS (dm-2): Please umount the filesystem and rectify the problem(s)
      Jun 24 11:31:49 vm130-32 kernel: VFS:Filesystem freeze failed
      Jun 24 11:31:50 vm130-32 GlusterFS[1969]: [2013-06-24 10:31:50.500674] M [posix-helpers.c:1114:posix_health_check_thread_proc] 0-failing_xfs-posix: health-check failed, going down
      Jun 24 11:32:09 vm130-32 kernel: XFS (dm-2): xfs_log_force: error 5 returned.
      Jun 24 11:32:20 vm130-32 GlusterFS[1969]: [2013-06-24 10:32:20.508690] M [posix-helpers.c:1119:posix_health_check_thread_proc] 0-failing_xfs-posix: still alive! -> SIGTERM
  - these errors are in the log of the brick as well:
      [2013-06-24 10:31:50.500607] W [posix-helpers.c:1102:posix_health_check_thread_proc] 0-failing_xfs-posix: stat() on /bricks/failing_xfs/data returned: Input/output error
      [2013-06-24 10:31:50.500674] M [posix-helpers.c:1114:posix_health_check_thread_proc] 0-failing_xfs-posix: health-check failed, going down
      [2013-06-24 10:32:20.508690] M [posix-helpers.c:1119:posix_health_check_thread_proc] 0-failing_xfs-posix: still alive! -> SIGTERM
  - the glusterfsd process has exited correctly:
      # gluster volume status
      Status of volume: failing_xfs
      Gluster process                          Port   Online  Pid
      ------------------------------------------------------------------------------
      Brick glufs1:/bricks/failing_xfs/data    N/A    N       N/A
      NFS Server on localhost                  2049   Y       1897
  Change-Id: Ic247fbefb97f7e861307a5998a9a7a3ecc80aa07
  BUG: 971774
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/5176
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
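The brick log above shows the check failing on a stat() of the brick directory. A rough sketch of such a periodic check follows. It is written in Python for illustration, whereas the real checker lives in posix-helpers.c; `brick_is_healthy` and `health_check_loop` are hypothetical names, and the `checks` and `sleep` parameters exist only so the loop can be exercised without waiting.

```python
import os
import time
from typing import Callable, Optional


def brick_is_healthy(brick_path: str) -> bool:
    """Return False if stat() on the brick directory raises an OS error."""
    try:
        os.stat(brick_path)
    except OSError:
        return False
    return True


def health_check_loop(brick_path: str,
                      interval_sec: float = 30,      # default interval from the commit message
                      checks: Optional[int] = None,  # None means run until a failure
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Check the brick every interval_sec; return False on the first failure."""
    done = 0
    while checks is None or done < checks:
        if not brick_is_healthy(brick_path):
            return False             # the caller would notify parents and shut down
        done += 1
        sleep(interval_sec)
    return True


print(brick_is_healthy(os.getcwd()))
print(health_check_loop("/nonexistent/brick/path", checks=1, sleep=lambda s: None))
```

As the message notes, this style of check relies on the filesystem surfacing the storage error; a device that fails silently would not be caught.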
* cluster/afr: Provide an option to disable afr durability (Pranith Kumar K, 2013-07-03, 1 file, -0/+5)
  Change-Id: I40eec20ca6b3f857245a2438883822e251077ee9
  BUG: 979365
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Reviewed-on: http://review.gluster.org/5269
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: More checks before starting rebalance/remove-brick (Kaushal M, 2013-07-02, 4 files, -12/+60)
  Check if a previous remove-brick operation has been committed before starting a new rebalance/remove-brick task.
  Change-Id: I553e5ba64a6a352ca91032ab1a17997051a4494e
  BUG: 963541
  Signed-off-by: Kaushal M <kaushal@redhat.com>
  Reviewed-on: http://review.gluster.org/5019
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* rpc: duplicate request cache for nfs (Rajesh Amaravathi, 2013-06-21, 5 files, -57/+66)
  Duplicate request cache provides a mechanism for detecting duplicate rpc requests from clients. DRC caches replies and on duplicate requests, sends the cached reply instead of re-processing the request.
  Change-Id: I3d62a6c4aa86c92bf61f1038ca62a1a46bf1c303
  BUG: 847624
  Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com>
  Reviewed-on: http://review.gluster.org/4049
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Anand Avati <avati@redhat.com>
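The DRC mechanism described here (cache the reply, and answer a duplicate from the cache) can be modelled in a few lines. This is a conceptual sketch, not the rpc library's implementation: keying on (client, xid) is an assumption, there is no eviction or size limit, and `DuplicateRequestCache` is a hypothetical class name.

```python
from typing import Any, Callable, Dict, Hashable, Tuple


class DuplicateRequestCache:
    """Remember replies so a retransmitted request is not processed twice."""

    def __init__(self) -> None:
        self._replies: Dict[Tuple[Hashable, int], Any] = {}

    def handle(self, client: Hashable, xid: int, request: Any,
               process: Callable[[Any], Any]) -> Any:
        key = (client, xid)          # assumed identity of a request
        if key in self._replies:
            return self._replies[key]    # duplicate: send the cached reply
        reply = process(request)         # first sighting: do the real work
        self._replies[key] = reply
        return reply


calls = []


def process(req):
    calls.append(req)
    return "reply:" + req


drc = DuplicateRequestCache()
print(drc.handle("client-a", 7, "WRITE", process))
print(drc.handle("client-a", 7, "WRITE", process))  # served from the cache
print(len(calls))
```

The point of the cache is visible in `calls`: the second, duplicate request never reaches `process`, which matters for operations that are not safe to repeat.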
* store: move glusterd_store functions from mgmt/glusterd to libglusterfs (Niels de Vos, 2013-06-20, 6 files, -867/+179)
  Making the glusterd_store_* functions re-usable will help with future changes that need to read/write lists of items.
  BUG: 904065
  Change-Id: I99fb8eced76d12d5a254567eccff9790b43d8da3
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/4676
  Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Log peer op status at the appropriate time (Krutika Dhananjay, 2013-06-18, 6 files, -72/+284)
  Change-Id: Ia8e1af082078f2f791708ba4faa4992bf291dd6e
  BUG: 961339
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/5023
  Reviewed-by: Amar Tumballi <amarts@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Disable transport before cleaning up rpc object (Krishnan Parthasarathi, 2013-06-18, 3 files, -19/+99)
  Problem: The rpc_transport object, which is part of rpc_clnt, is destroyed prematurely. This is because the rpc_transport object is ref'd by both the socket layer and the rpc layer. These refs, until the synctask'izing of operations, were unref'd sequentially in the epoll thread. With more threads at play, the sequential unref guarantee is off.
  Fix: Shutting down the transport before proceeding with the cleanup of the rpc_clnt object serializes the unrefs on the rpc_transport object, thus eliminating the race.
  Also, we don't store the address of brickinfo in the brick's rpc notify function, to avoid the possibility of referring to a freed brickinfo. Instead we use a string based id to 'reach' the corresponding brickinfo.
  Change-Id: If2739e2eeaee1e8b071ab2b6754b7ea0f81cfceb
  BUG: 962619
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/5000
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* nfs: option to disable acl (Rajesh Amaravathi, 2013-06-15, 2 files, -0/+13)
| | | | | | | | | | | | | | | | 1. Option to disable or enable acl with nfs.acl boolean option. 2. Deregister the acl service with the portmapper service when no longer required. Change-Id: I6562b6b40138d040aa2bf1e5641f4c0e0e9f9d09 BUG: 970070 Signed-off-by: Rajesh Amaravathi <rajesh@redhat.com> Reviewed-on: http://review.gluster.org/5136 Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
* glusterd: Ignore directories matching *.tmp in store (Krishnan Parthasarathi, 2013-06-13, 1 file, -0/+1)
| | | | | | | | | | | | store being glusterd's persistent store under /var/lib/glusterd/ Change-Id: I1c01a09a8ce4a73ea612f05e7f14d4ab39ad1628 BUG: 971796 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/5177 Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* mgmt/glusterd, socket: Change logging for brick disconnects (Pranith Kumar K, 2013-06-11, 1 file, -2/+6)
| | | | | | | | | | | | | | | | | | For unix path based sockets, the socket path is cryptic (md5sum of path) and may not be useful to the user for debugging, so log it at DEBUG level. Changed logging in brick_rpc_notify to log brickinfo for disconnects. Change-Id: I69174bbbbde8352d38837723e950ad8fc15232aa BUG: 963153 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5009 Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Add a cmd for getting uuid of local node (Krishnan Parthasarathi, 2013-06-10, 1 file, -0/+99)
| | | | | | | | | | | | | | | | | | | Usage: gluster system:: uuid get This is needed since we generate the uuid of a node lazily, i.e., we generate a uuid for the node only on the first volume or peer operation, when the node needs an external identity. With this command, we can force[1] the uuid generation without performing a volume or peer operation. [1]: Querying for the uuid (uuid get) forces the uuid to come into existence. Change-Id: I62c8b6754117756aa4d773dd48af4ddeb1a1d878 BUG: 971661 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/5175 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaushal M <kaushal@redhat.com>
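The lazy generation that this command works around can be sketched as follows. This is a minimal Python model under stated assumptions, not glusterd's implementation: the `NodeIdentity` class and its method names are hypothetical, and persistence to the store is left out.

```python
import uuid
from typing import Optional


class NodeIdentity:
    """Hypothetical model of a node whose uuid is created on first use."""

    def __init__(self) -> None:
        self._uuid: Optional[uuid.UUID] = None  # nothing generated at startup

    def is_generated(self) -> bool:
        return self._uuid is not None

    def get_uuid(self) -> uuid.UUID:
        # Querying the uuid forces it into existence, which is the effect
        # the commit message attributes to "uuid get".
        if self._uuid is None:
            self._uuid = uuid.uuid4()
        return self._uuid
```

In this model, the first call to `get_uuid()` plays the role of the first volume or peer operation, and every later call returns the same value.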
* nfs: gluster volume set help shows null as default value (Rajesh Joseph, 2013-06-06, 1 file, -11/+17)
| | | | | | | | | | | | | | Bug(967445): The default value for all nfs options is displayed as "(null)" Fix: Changed nfs options to show default value. Change-Id: I3b1f27439c19a6655f7dcc7891df40706db9e474 BUG: 967445 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/5098 Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* mgmt/glusterd: Set task op at the time of task-id set (Pranith Kumar K, 2013-06-06, 2 files, -0/+2)
| | | | | | | | | | | | | | | | | | Problem: If a remove-brick start is executed on m1 with brick from m2 on local subvolume, no rebalance process is launched. Because of this, volinfo->rebal.op is not set. This leads to volume status failures. Fix: Set rebal.op even when the rebalance process is not started. Change-Id: I71c7e6f09353be14c1e8edca3c8685ebfdf226d6 BUG: 964059 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5030 Reviewed-by: Kaushal M <kaushal@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd-syncop: Fix unlocking and collating errors (Kaushal M, 2013-06-04, 3 files, -50/+86)
| | | | | | | | | | | | | * Only those peers which were locked need to be unlocked. * Fix location of collating errors in callbacks. The callback functions could miss collating errors if there was an rpc error. Change-Id: Ie27c2f1ec197da4f5077a4d6e032127954ce87cd BUG: 948686 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/5087 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
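Both points of this fix can be modelled in a few lines of Python. The sketch below is an assumption-laden illustration, not the syncop code: `lock_then_unlock`, `try_lock`, and `unlock` are hypothetical names, and `ConnectionError` stands in for an rpc-level failure.

```python
from typing import Callable, List, Tuple


def lock_then_unlock(
    peers: List[str],
    try_lock: Callable[[str], bool],
    unlock: Callable[[str], None],
) -> Tuple[bool, List[str]]:
    """Try to lock every peer, then unlock only the peers that were locked."""
    locked: List[str] = []
    errors: List[str] = []
    for peer in peers:
        try:
            ok = try_lock(peer)
        except ConnectionError as exc:
            # An rpc-level error must still be collated, not silently dropped.
            ok = False
            errors.append(f"{peer}: rpc error: {exc}")
        else:
            if not ok:
                errors.append(f"{peer}: lock refused")
        if ok:
            locked.append(peer)
    # The cluster operation itself would run here if every lock was acquired.
    for peer in locked:
        unlock(peer)  # never unlock a peer we failed to lock
    return len(locked) == len(peers), errors
```

Tracking the `locked` list separately from the full peer list is what keeps the unlock phase from touching peers whose lock request failed.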
* mgmt/glusterd: Make sure peerinfo->uuid_str is assigned (Pranith Kumar K, 2013-05-31, 4 files, -5/+24)
| | | | | | | | | Change-Id: I9e2743ab61c8baee92a1dfd376ec4bb145776176 BUG: 963524 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/5016 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
* glusterd-volgen: Improve volume op-versions calculation (Kaushal M, 2013-05-31, 4 files, -488/+616)
| | | | | | | | | | | | | | | | | | Volume op-versions calculations now take into account if an option, a. enables/disables an xlator, or b. is a boolean option. This prevents op-versions from being updated when a feature is disabled. Also, correctly close the dynamically loaded xlators in xlator_volopt_dynload() and prevent leaks. Change-Id: I895ddeeec6f6a33e509325f0ce6f01b7aad3cf5c BUG: 954256 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/4952 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
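One way to read the rule above is sketched below in Python. Everything in it is an assumption made for illustration: the option names, their op-version numbers, the set of "false" spellings, and the function name are hypothetical and do not come from glusterd-volgen.

```python
from typing import Dict

# Hypothetical table: the op-version each option requires when in effect.
OPTION_OP_VERSION: Dict[str, int] = {
    "example.new-feature": 2,  # boolean option that enables an xlator
    "example.cache-size": 1,   # ordinary value option
}
# Hypothetical set of options that are boolean or toggle an xlator.
BOOLEAN_OPTIONS = {"example.new-feature"}
FALSE_VALUES = {"off", "false", "no", "0", "disable"}


def volume_op_version(options: Dict[str, str], base: int = 1) -> int:
    """Return the highest op-version required by the options actually in effect."""
    version = base
    for name, value in options.items():
        if name in BOOLEAN_OPTIONS and value.lower() in FALSE_VALUES:
            # A disabled feature must not raise the volume's op-version.
            continue
        version = max(version, OPTION_OP_VERSION.get(name, base))
    return version
```

With this table, setting `example.new-feature` to `on` yields op-version 2, while setting it to `off` leaves the volume at the base version.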
* glusterd-volgen: Enable open-behind based on op-version (Kaushal M, 2013-05-28, 3 files, -11/+50)
| | | | | | | | | | | | | | | This patch enables the open-behind by default only when the op-version allows it. Also the volume op-version calculations take account of this enablement. Change-Id: Idf7a3c274ec4828aafc815cdd1df829ecb853354 BUG: 954256 Signed-off-by: Kaushal M <kaushal@redhat.com> Reviewed-on: http://review.gluster.org/4866 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Allow volume start force to succeed if brick directories are recreated (Krutika Dhananjay, 2013-05-25, 1 file, -1/+15)
| | | | | | | | | | | Change-Id: I4fc3c5c829adca256bb131f4a2722abc95741158 BUG: 963665 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/5020 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* glusterd: Give up biglock during rpc conn cleanup (Krishnan Parthasarathi, 2013-05-23, 1 file, -0/+9)
| | | | | | | | | | | | | | | | | | | | | | | | | glusterd could deadlock after a peer-detach command as follows: 1) The glusterd_friend_cleanup function 'flushes' out messages in the rpc layer's queue that haven't received a response. At this point, glusterd has already acquired the big lock. 2) The side effect of flushing out the messages is that the corresponding callbacks are called. The callbacks themselves are executed after acquiring the big lock. This results in the big lock being acquired in a nested manner (in the same thread), which causes a deadlock. This can also happen during brick/NFS/SHD disconnect in volume-stop. Change-Id: Iab3aad143cd8ebbab53ea0b69687f0e7627dc8a9 BUG: 965533 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/5061 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
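The nested acquisition described in steps 1 and 2 can be reproduced with any non-reentrant lock. The Python sketch below is a model of the pattern only: `big_lock`, `callback`, and the two cleanup functions are hypothetical names, and the callback uses a non-blocking acquire so the example reports the would-be deadlock instead of hanging.

```python
import threading
from typing import Callable, List

big_lock = threading.Lock()  # non-reentrant, like the big lock in this model


def callback(results: List[str]) -> None:
    # Callbacks take the big lock themselves before doing any work.
    if not big_lock.acquire(blocking=False):
        results.append("would deadlock")  # a blocking acquire would hang here
        return
    try:
        results.append("callback ran")
    finally:
        big_lock.release()


def cleanup_holding_lock(pending: List[Callable], results: List[str]) -> None:
    """Buggy pattern: flush pending callbacks while still holding the lock."""
    with big_lock:
        for cb in pending:
            cb(results)


def cleanup_giving_up_lock(pending: List[Callable], results: List[str]) -> None:
    """Fixed pattern: give up the lock around the flush, then retake it."""
    with big_lock:
        big_lock.release()
        try:
            for cb in pending:
                cb(results)
        finally:
            big_lock.acquire()
```

Releasing the lock only around the flush keeps the rest of the cleanup protected while letting the callbacks acquire the lock on their own.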
* glusterd: Set op_errstr to error string received from peer (Krutika Dhananjay, 2013-05-16, 1 file, -0/+4)
| | | | | | | | | | | | ... in case of volume op failure on remote host Change-Id: I7177dc02369dffa82f217496559532d18b7c7c7a BUG: 963628 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/5018 Reviewed-by: Amar Tumballi <amarts@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* rpc-transport: Moved unix socket options function to rpc-transport (Krishnan Parthasarathi, 2013-05-16, 2 files, -6/+6)
| | | | | | | | | | | | | This change removes the asymmetry in the 'layer' (read rpc, transport etc) in which transport options were being filled for inet and unix sockets. Change-Id: Iaa080691fd5e4c3baedffa97e9c3f16642c1fc12 BUG: 955919 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/4850 Reviewed-by: Raghavendra G <raghavendra@gluster.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>