<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/extras/ganesha/ocf, branch v3.9dev</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>common-ha: continuous grace_mon log messages in /var/log/messages</title>
<updated>2016-04-28T10:19:50+00:00</updated>
<author>
<name>Kaleb S KEITHLEY</name>
<email>kkeithle@redhat.com</email>
</author>
<published>2016-04-06T15:18:30+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=6530521700d94834d110424f525b95e40bf68094'/>
<id>6530521700d94834d110424f525b95e40bf68094</id>
<content type='text'>
messages are seen on RHEL6.x and RHEL7.1 and earlier versions of
pacemaker. (And RHEL7.2 with RHEL7.1 pacemaker packages.)

It's not possible to query attrd attributes in the older version,
only set/update/clear them. The messages come from invalid attempts
to query the attributes.

However it is possible to query crm attributes. The fix here is to
create a "shadow" crm attribute for the attrd attribute. Changes are
made to both, queries are made on the crm attribute.

(Resource Agents "follow" the attrd attribute using constraint locations,
so we must keep the attrd attribute.)

Change-Id: I84ac1a80673e528d98b67b7d5062e21dcf744d4a
BUG: 1324509
Signed-off-by: Kaleb S KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/13919
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: soumya k &lt;skoduri@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
messages are seen on RHEL6.x and RHEL7.1 and earlier versions of
pacemaker. (And RHEL7.2 with RHEL7.1 pacemaker packages.)

It's not possible to query attrd attributes in the older version,
only set/update/clear them. The messages come from invalid attempts
to query the attributes.

However it is possible to query crm attributes. The fix here is to
create a "shadow" crm attribute for the attrd attribute. Changes are
made to both, queries are made on the crm attribute.

(Resource Agents "follow" the attrd attribute using constraint locations,
so we must keep the attrd attribute.)

Change-Id: I84ac1a80673e528d98b67b7d5062e21dcf744d4a
BUG: 1324509
Signed-off-by: Kaleb S KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/13919
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: soumya k &lt;skoduri@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>common-ha: reliable grace using pacemaker notify actions</title>
<updated>2016-03-15T04:34:27+00:00</updated>
<author>
<name>Kaleb S KEITHLEY</name>
<email>kkeithle@redhat.com</email>
</author>
<published>2015-12-14T14:24:57+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=40a24f5ab917863d1549508ae9cf31085955d174'/>
<id>40a24f5ab917863d1549508ae9cf31085955d174</id>
<content type='text'>
Using *-dead_ip-1 resources to track on which nodes the ganesha.nfsd
had died was found to be unreliable.

Running `pcs status` in the ganesha_grace monitor action was seen to
time out during failover; the HA devs opined that it was, generally,
not a good idea to run `pcs status` in a monitor action in any event.
They suggested using the notify feature, where the resources on all
the nodes are notified when a clone resource agent dies.

This change adds a notify action to the ganesha_grace RA. The ganesha_mon
RA monitors its ganesha.nfsd daemon. While the daemon is running, it
creates two attributes: ganesha-active and grace-active. When the daemon
stops for any reason, the attributes are deleted. Deleting the
ganesha-active attribute triggers the failover of the virtual IP (the
IPaddr RA) to another node where ganesha.nfsd is still running. The
ganesha_grace RA monitors the grace-active attribute. When the
grace-active attibute is deleted, the ganesha_grace RA stops, and will
not restart. This triggers pacemaker to trigger the notify action in
the ganesha_grace RAs on the other nodes in the cluster; which send a
DBUS message to their ganesha.nfsd.

(N.B. grace-active is a bit of a misnomer. while the grace-active
attribute exists, everything is normal and healthy. Deleting the
attribute triggers putting the surviving ganesha.nfsds into GRACE.)

To ensure that the remaining/surviving ganesha.nfsds are put into
NFS-GRACE before the IPaddr (virtual IP) fails over there is a short
delay (sleep) between deleting the grace-active attribute and the
ganesha-active attribute. To summarize:
  1. on node 2 ganesha_mon:monitor notices that ganesha.nfsd has died
  2. on node 2 ganesha_mon:monitor deletes its grace-active attribute
  3. on node 2 ganesha_grace:monitor notices that grace-active is gone
     and returns OCF_ERR_GENERIC, a.k.a. new error. When pacemaker
     tries to (re)start ganesha_grace, its start action will return
     OCF_NOT_RUNNING, a.k.a. known error, don't attempt further
     restarts.
  4. on nodes 1, 3, etc., ganesha_grace:notify receives a post-stop
     notification indicating that node 2 is gone, and sends a DBUS
     message to its ganesha.nfsd putting it into NFS-GRACE.
  5. on node 2 ganesha_mon:monitor waits a short period, then deletes
     its ganesha-active attribute. This triggers the IPaddr (virt IP)
     failover according to constraint location rules.

ganesha_nfsd modified to run for the duration, start action is invoked
to setup the /var/lib/nfs symlink, stop action is invoked to restore it.
ganesha-ha.sh modified accordingly to create it as a clone resource.

BUG: 1290865
Change-Id: I1ba24f38fa4338b3aeb17c65645e9f439387ff57
Signed-off-by: Kaleb S KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/12964
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-on: http://review.gluster.org/13725
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using *-dead_ip-1 resources to track on which nodes the ganesha.nfsd
had died was found to be unreliable.

Running `pcs status` in the ganesha_grace monitor action was seen to
time out during failover; the HA devs opined that it was, generally,
not a good idea to run `pcs status` in a monitor action in any event.
They suggested using the notify feature, where the resources on all
the nodes are notified when a clone resource agent dies.

This change adds a notify action to the ganesha_grace RA. The ganesha_mon
RA monitors its ganesha.nfsd daemon. While the daemon is running, it
creates two attributes: ganesha-active and grace-active. When the daemon
stops for any reason, the attributes are deleted. Deleting the
ganesha-active attribute triggers the failover of the virtual IP (the
IPaddr RA) to another node where ganesha.nfsd is still running. The
ganesha_grace RA monitors the grace-active attribute. When the
grace-active attibute is deleted, the ganesha_grace RA stops, and will
not restart. This triggers pacemaker to trigger the notify action in
the ganesha_grace RAs on the other nodes in the cluster; which send a
DBUS message to their ganesha.nfsd.

(N.B. grace-active is a bit of a misnomer. while the grace-active
attribute exists, everything is normal and healthy. Deleting the
attribute triggers putting the surviving ganesha.nfsds into GRACE.)

To ensure that the remaining/surviving ganesha.nfsds are put into
NFS-GRACE before the IPaddr (virtual IP) fails over there is a short
delay (sleep) between deleting the grace-active attribute and the
ganesha-active attribute. To summarize:
  1. on node 2 ganesha_mon:monitor notices that ganesha.nfsd has died
  2. on node 2 ganesha_mon:monitor deletes its grace-active attribute
  3. on node 2 ganesha_grace:monitor notices that grace-active is gone
     and returns OCF_ERR_GENERIC, a.k.a. new error. When pacemaker
     tries to (re)start ganesha_grace, its start action will return
     OCF_NOT_RUNNING, a.k.a. known error, don't attempt further
     restarts.
  4. on nodes 1, 3, etc., ganesha_grace:notify receives a post-stop
     notification indicating that node 2 is gone, and sends a DBUS
     message to its ganesha.nfsd putting it into NFS-GRACE.
  5. on node 2 ganesha_mon:monitor waits a short period, then deletes
     its ganesha-active attribute. This triggers the IPaddr (virt IP)
     failover according to constraint location rules.

ganesha_nfsd modified to run for the duration, start action is invoked
to setup the /var/lib/nfs symlink, stop action is invoked to restore it.
ganesha-ha.sh modified accordingly to create it as a clone resource.

BUG: 1290865
Change-Id: I1ba24f38fa4338b3aeb17c65645e9f439387ff57
Signed-off-by: Kaleb S KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/12964
Smoke: Gluster Build System &lt;jenkins@build.gluster.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-on: http://review.gluster.org/13725
</pre>
</div>
</content>
</entry>
<entry>
<title>common-ha: cluster setup issues on RHEL7</title>
<updated>2015-06-19T08:25:46+00:00</updated>
<author>
<name>Kaleb S. KEITHLEY</name>
<email>kkeithle@redhat.com</email>
</author>
<published>2015-06-16T13:33:48+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=46bf15e897ee9711835af211c19351a9920d490b'/>
<id>46bf15e897ee9711835af211c19351a9920d490b</id>
<content type='text'>
 * use --name on RHEL7 (later versions of pcs drop --name) we guessed
   wrong and did not get the version that dropped use of --name option
 * more robust config file param parsing for n/v with ""s in the value
   after not sourcing the config file
 * pid file fix. RHEL6 init.d adds -p /var/run/ganesha.nfsd.pid to
   cmdline options. RHEL7 systemd does not, so defaults to
   /var/run/ganesha.pid.

Change-Id: I575aa13c98f05523cca10c55f2c387200bad3f93
BUG: 1229948
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/11257
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: soumya k &lt;skoduri@redhat.com&gt;
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Meghana M &lt;mmadhusu@redhat.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
 * use --name on RHEL7 (later versions of pcs drop --name) we guessed
   wrong and did not get the version that dropped use of --name option
 * more robust config file param parsing for n/v with ""s in the value
   after not sourcing the config file
 * pid file fix. RHEL6 init.d adds -p /var/run/ganesha.nfsd.pid to
   cmdline options. RHEL7 systemd does not, so defaults to
   /var/run/ganesha.pid.

Change-Id: I575aa13c98f05523cca10c55f2c387200bad3f93
BUG: 1229948
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/11257
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: soumya k &lt;skoduri@redhat.com&gt;
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Meghana M &lt;mmadhusu@redhat.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>common-ha: handle long node names and node names with '-' and '.' in them</title>
<updated>2015-06-11T20:12:11+00:00</updated>
<author>
<name>Kaleb S. KEITHLEY</name>
<email>kkeithle@redhat.com</email>
</author>
<published>2015-06-01T18:02:58+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=04791e4d53650eb108890e9ad9a809768a06987b'/>
<id>04791e4d53650eb108890e9ad9a809768a06987b</id>
<content type='text'>
sourcing the /etc/ganesha/ganesha-ha.conf file seemed like a simple
and elegant solution for reading config params, but bash variable names
do not allow '-' and '.' in them.

Also fix incorrect path in shared volume mount

Change-Id: I40140e5da0903221efd316de94dce40229263e15
BUG: 1225572
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/11035
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
sourcing the /etc/ganesha/ganesha-ha.conf file seemed like a simple
and elegant solution for reading config params, but bash variable names
do not allow '-' and '.' in them.

Also fix incorrect path in shared volume mount

Change-Id: I40140e5da0903221efd316de94dce40229263e15
BUG: 1225572
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/11035
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>common-ha: fix race between setting grace and virt IP fail-over</title>
<updated>2015-06-01T18:13:45+00:00</updated>
<author>
<name>Kaleb S. KEITHLEY</name>
<email>kkeithle@redhat.com</email>
</author>
<published>2015-05-07T13:22:04+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=751c4583bbaa59ebfe492ab6ecfab3108711f4c5'/>
<id>751c4583bbaa59ebfe492ab6ecfab3108711f4c5</id>
<content type='text'>
Also send stderr output of `pcs resource {create,delete} $node-dead_ip-1`
to /dev/null to avoid flooding the logs

Change-Id: I29d526429cc4d7521971cd5e2e69bfb64bfc5ca9
BUG: 1219485
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10646
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: soumya k &lt;skoduri@redhat.com&gt;
Reviewed-by: Meghana M &lt;mmadhusu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Also send stderr output of `pcs resource {create,delete} $node-dead_ip-1`
to /dev/null to avoid flooding the logs

Change-Id: I29d526429cc4d7521971cd5e2e69bfb64bfc5ca9
BUG: 1219485
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10646
Tested-by: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: soumya k &lt;skoduri@redhat.com&gt;
Reviewed-by: Meghana M &lt;mmadhusu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>common-ha: delete-node implementation</title>
<updated>2015-04-30T01:25:12+00:00</updated>
<author>
<name>Kaleb S. KEITHLEY</name>
<email>kkeithle@redhat.com</email>
</author>
<published>2015-04-14T12:17:10+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=d28a99d6f20650e5d78accb1e16bd3721a2b6d02'/>
<id>d28a99d6f20650e5d78accb1e16bd3721a2b6d02</id>
<content type='text'>
omnibus patch consisting of:
+ completed implemenation of delete-node         (BZ 1212823)
+ teardown leaves /var/lib/nfs symlink           (BZ 1210712)
+ setup copy config, teardown clean /etc/cluster (BZ 1212823)

setup for copy config, teardown clean /etc/cluster:
1. on one (primary) node in the cluster, run:
  `ssh-keygen -f /var/lib/glusterd/nfs/secret.pem`

  Press Enter twice to avoid passphrase.

2. deploy the pubkey ~root/.ssh/authorized keys on _all_ nodes, run:
  `ssh-copy-id -i /var/lib/glusterd/nfs/secret.pem.pub root@$node`

3. copy the keys to _all_ nodes in the cluster, run:
  `scp /var/lib/glusterd/nfs/secret.*  $node:/var/lib/glusterd/nfs/`
  N.B. this allows setup, teardown, etc., to be run on any node

Change-Id: I9fcd3a57073ead24cd2d0ef0ee7a67c524f3d4b0
BUG: 1213933
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10234
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Reviewed-by: soumya k &lt;skoduri@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
omnibus patch consisting of:
+ completed implemenation of delete-node         (BZ 1212823)
+ teardown leaves /var/lib/nfs symlink           (BZ 1210712)
+ setup copy config, teardown clean /etc/cluster (BZ 1212823)

setup for copy config, teardown clean /etc/cluster:
1. on one (primary) node in the cluster, run:
  `ssh-keygen -f /var/lib/glusterd/nfs/secret.pem`

  Press Enter twice to avoid passphrase.

2. deploy the pubkey ~root/.ssh/authorized keys on _all_ nodes, run:
  `ssh-copy-id -i /var/lib/glusterd/nfs/secret.pem.pub root@$node`

3. copy the keys to _all_ nodes in the cluster, run:
  `scp /var/lib/glusterd/nfs/secret.*  $node:/var/lib/glusterd/nfs/`
  N.B. this allows setup, teardown, etc., to be run on any node

Change-Id: I9fcd3a57073ead24cd2d0ef0ee7a67c524f3d4b0
BUG: 1213933
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10234
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Reviewed-by: soumya k &lt;skoduri@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ganesha-ha: more robust pid file handling</title>
<updated>2015-04-08T17:50:09+00:00</updated>
<author>
<name>Kaleb S. KEITHLEY</name>
<email>kkeithle@redhat.com</email>
</author>
<published>2015-04-08T12:03:58+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=6ae434eeb3bfe421f34e624182d46ee11f57c736'/>
<id>6ae434eeb3bfe421f34e624182d46ee11f57c736</id>
<content type='text'>
fix bug with reading pid file to determine if ganesha.nfsd is running

Change-Id: I4050a119e2be93578045a221b67f616e152546d9
BUG: 1188184
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10163
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
fix bug with reading pid file to determine if ganesha.nfsd is running

Change-Id: I4050a119e2be93578045a221b67f616e152546d9
BUG: 1188184
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/10163
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>NFS-Ganesha: Install scripts, config files, and resource agent scripts</title>
<updated>2015-03-17T20:39:44+00:00</updated>
<author>
<name>Kaleb S. KEITHLEY</name>
<email>kkeithle@redhat.com</email>
</author>
<published>2015-03-17T13:27:05+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=d81182cf69a4f188f304fcce6d651ffd56b67aac'/>
<id>d81182cf69a4f188f304fcce6d651ffd56b67aac</id>
<content type='text'>
Resubmitting after a gerrit bug bungled the merge of
http://review.gluster.org/9621 (was it really a gerrit bug?)

Scripts related to NFS-Ganesha are in extras/ganesha/scripts.
Config files are in extras/ganesha/config.
Resource Agent files are in extras/ganesha/ocf

Files are copied to appropriate locations.

Change-Id: I137169f4d653ee2b7d6df14d41e2babd0ae8d10c
BUG: 1188184
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/9912
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Resubmitting after a gerrit bug bungled the merge of
http://review.gluster.org/9621 (was it really a gerrit bug?)

Scripts related to NFS-Ganesha are in extras/ganesha/scripts.
Config files are in extras/ganesha/config.
Resource Agent files are in extras/ganesha/ocf

Files are copied to appropriate locations.

Change-Id: I137169f4d653ee2b7d6df14d41e2babd0ae8d10c
BUG: 1188184
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/9912
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
