<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/protocol/client, branch v3.13.0</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>protocol/client: handle the subdir handshake properly for add-brick</title>
<updated>2017-10-29T07:55:42+00:00</updated>
<author>
<name>Amar Tumballi</name>
<email>amarts@redhat.com</email>
</author>
<published>2017-10-22T07:11:38+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=9aa574a51b84717c1f3949ed2e28a49e49840a93'/>
<id>9aa574a51b84717c1f3949ed2e28a49e49840a93</id>
<content type='text'>
There should be different way we handle handshake in case of subdir
mount for the first time, and in case of subsequent graph changes.

Change-Id: I2a7ba836433bb0a0f4a861809e2bb0d7fbc4da54
BUG: 1505323
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There should be different way we handle handshake in case of subdir
mount for the first time, and in case of subsequent graph changes.

Change-Id: I2a7ba836433bb0a0f4a861809e2bb0d7fbc4da54
BUG: 1505323
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Coverity issue fix: checked return</title>
<updated>2017-10-04T10:00:12+00:00</updated>
<author>
<name>Kartik_Burmee</name>
<email>kburmee@redhat.com</email>
</author>
<published>2017-09-25T10:19:36+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=fabee9d6816077d0f44a7d82c0b35aef60771711'/>
<id>fabee9d6816077d0f44a7d82c0b35aef60771711</id>
<content type='text'>
issue:Calling "client_submit_request" without checking return value (as is done elsewhere 52 out of 58 times).

function: client_fdctx_destroy

Change-Id: I66a295dd114fc20f04eb1aca9a5b274df53be090
BUG: 789278
fix: typecasted function return value using void
Signed-off-by: Kartik_Burmee &lt;kburmee@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
issue:Calling "client_submit_request" without checking return value (as is done elsewhere 52 out of 58 times).

function: client_fdctx_destroy

Change-Id: I66a295dd114fc20f04eb1aca9a5b274df53be090
BUG: 789278
fix: typecasted function return value using void
Signed-off-by: Kartik_Burmee &lt;kburmee@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>protocol/client: Coverity Fix IDENTICAL_BRANCHES in client3_3_flush_cbk</title>
<updated>2017-09-29T09:27:46+00:00</updated>
<author>
<name>Mohammed Azhar Padariyakam</name>
<email>mpadariy@redhat.com</email>
</author>
<published>2017-09-21T15:25:05+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=d01a674077f2411ecde7bd6037efbfcf6faa65ba'/>
<id>d01a674077f2411ecde7bd6037efbfcf6faa65ba</id>
<content type='text'>
Issue : The same code is executed when the condition "ret &lt; 0" is true
or false, because the code in the if-then branch and after the if
statement is identical.

Solution : Remove the if-condition whose output does not alter the
program flow of control

Fix : The identical branching condition was removed.

Change-Id: Iae40f068e5a03946273e1091886ba7011460fcc8
BUG: 789278
Signed-off-by: Mohammed Azhar Padariyakam &lt;mpadariy@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Issue : The same code is executed when the condition "ret &lt; 0" is true
or false, because the code in the if-then branch and after the if
statement is identical.

Solution : Remove the if-condition whose output does not alter the
program flow of control

Fix : The identical branching condition was removed.

Change-Id: Iae40f068e5a03946273e1091886ba7011460fcc8
BUG: 789278
Signed-off-by: Mohammed Azhar Padariyakam &lt;mpadariy@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Infra to indentify process</title>
<updated>2017-08-16T15:38:49+00:00</updated>
<author>
<name>hari gowtham</name>
<email>hgowtham@redhat.com</email>
</author>
<published>2017-07-25T12:37:05+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=def09c4fbb2805618715c5a125ddf482d7e53de3'/>
<id>def09c4fbb2805618715c5a125ddf482d7e53de3</id>
<content type='text'>
Problem: currently we can't identify which process is running and
how many instances of it are available.

Fix: name the process when its spawned and send it to the server
and save it in the client_t

The processes that abide by this change from this patch are:
1) fuse mount,
2) rebalance,
3) selfheal,
4) tier,
5) quota,
6) snapshot,
7) brick.
8) gfapi (by default. gfapi.&lt;processname&gt; if processname is found)

Note: fuse gets a process name as native-fuse-client by default.
If the user gives a name for the fuse and spawns it, it will be of
this type --process-name native-fuse-client.&lt;name_specified&gt;.
This can be made use by the process like aux mount done by quota,
geo-rep, etc by adding another option in the aux mount " -o
process-name=gsync_mount"

Updates: #178
Signed-off-by: hari gowtham &lt;hgowtham@redhat.com&gt;
Change-Id: Ie4d02257216839338043737691753bab9a974d5e
Reviewed-on: https://review.gluster.org/17957
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: hari gowtham &lt;hari.gowtham005@gmail.com&gt;
Reviewed-by: Amar Tumballi &lt;amarts@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
Reviewed-by: Aravinda VK &lt;avishwan@redhat.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem: currently we can't identify which process is running and
how many instances of it are available.

Fix: name the process when its spawned and send it to the server
and save it in the client_t

The processes that abide by this change from this patch are:
1) fuse mount,
2) rebalance,
3) selfheal,
4) tier,
5) quota,
6) snapshot,
7) brick.
8) gfapi (by default. gfapi.&lt;processname&gt; if processname is found)

Note: fuse gets a process name as native-fuse-client by default.
If the user gives a name for the fuse and spawns it, it will be of
this type --process-name native-fuse-client.&lt;name_specified&gt;.
This can be made use by the process like aux mount done by quota,
geo-rep, etc by adding another option in the aux mount " -o
process-name=gsync_mount"

Updates: #178
Signed-off-by: hari gowtham &lt;hgowtham@redhat.com&gt;
Change-Id: Ie4d02257216839338043737691753bab9a974d5e
Reviewed-on: https://review.gluster.org/17957
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: hari gowtham &lt;hari.gowtham005@gmail.com&gt;
Reviewed-by: Amar Tumballi &lt;amarts@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
Reviewed-by: Aravinda VK &lt;avishwan@redhat.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>glusterfsd: allow subdir mount</title>
<updated>2017-08-04T05:26:42+00:00</updated>
<author>
<name>Amar Tumballi</name>
<email>amarts@redhat.com</email>
</author>
<published>2017-07-19T17:38:05+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=590ae48c65a60c93c2e5407e3f663cef3daacc55'/>
<id>590ae48c65a60c93c2e5407e3f663cef3daacc55</id>
<content type='text'>
Changes:

1. Take subdir mount option in client (mount.gluster / glusterfsd)
2. Pass the subdir mount to server-handshake (from client-handshake)
3. Handle subdir-mount dir's lookup in server-first-lookup and handle
   all fops resolution accordingly with proper gfid of subdir
4. Change the auth/addr module to handle the multiple subdir entries
   in option, and valid parsing.

How to use the feature:

`# mount -t glusterfs $hostname:/$volname/$subdir /$mount_point`
Or
`# mount -t glusterfs $hostname:/$volname -osubdir_mount=$subdir /$mount_point`

Option can be set like:

`# gluster volume set &lt;volname&gt; auth.allow "/subdir1(192.168.1.*),/(192.168.10.*),/subdir2(192.168.8.*)"`

Updates #175

Change-Id: I7ea57f76ddbe6c3862cfe02e13f89e8a39719e11
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17141
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Shyamsundar Ranganathan &lt;srangana@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Changes:

1. Take subdir mount option in client (mount.gluster / glusterfsd)
2. Pass the subdir mount to server-handshake (from client-handshake)
3. Handle subdir-mount dir's lookup in server-first-lookup and handle
   all fops resolution accordingly with proper gfid of subdir
4. Change the auth/addr module to handle the multiple subdir entries
   in option, and valid parsing.

How to use the feature:

`# mount -t glusterfs $hostname:/$volname/$subdir /$mount_point`
Or
`# mount -t glusterfs $hostname:/$volname -osubdir_mount=$subdir /$mount_point`

Option can be set like:

`# gluster volume set &lt;volname&gt; auth.allow "/subdir1(192.168.1.*),/(192.168.10.*),/subdir2(192.168.8.*)"`

Updates #175

Change-Id: I7ea57f76ddbe6c3862cfe02e13f89e8a39719e11
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17141
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Shyamsundar Ranganathan &lt;srangana@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Remove uneeded gotos, as it jump to the next line</title>
<updated>2017-05-08T13:14:21+00:00</updated>
<author>
<name>Michael Scherer</name>
<email>misc@redhat.com</email>
</author>
<published>2017-05-05T15:20:14+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=5b8e9261f1e8aab1a3475578d56d54037459a5e6'/>
<id>5b8e9261f1e8aab1a3475578d56d54037459a5e6</id>
<content type='text'>
Found by coverity
Signed-off-by: Michael Scherer &lt;misc@redhat.com&gt;

Change-Id: I1fef09a427d18bbeb9af0bf4f5c82c77aa3c0d3d
BUG: 789278
Reviewed-on: https://review.gluster.org/17195
Tested-by: Michael Scherer &lt;misc@fedoraproject.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Prashanth Pai &lt;ppai@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Found by coverity
Signed-off-by: Michael Scherer &lt;misc@redhat.com&gt;

Change-Id: I1fef09a427d18bbeb9af0bf4f5c82c77aa3c0d3d
BUG: 789278
Reviewed-on: https://review.gluster.org/17195
Tested-by: Michael Scherer &lt;misc@fedoraproject.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Prashanth Pai &lt;ppai@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Halo Replication feature for AFR translator</title>
<updated>2017-05-02T10:23:53+00:00</updated>
<author>
<name>Kevin Vigor</name>
<email>kvigor@fb.com</email>
</author>
<published>2017-03-21T15:23:25+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=07cc8679cdf3b29680f4f105d0222da168d8bfc1'/>
<id>07cc8679cdf3b29680f4f105d0222da168d8bfc1</id>
<content type='text'>
Summary:
Halo Geo-replication is a feature which allows Gluster or NFS clients to write
locally to their region (as defined by a latency "halo" or threshold if you
like), and have their writes asynchronously propagate from their origin to the
rest of the cluster.  Clients can also write synchronously to the cluster
simply by specifying a halo-latency which is very large (e.g. 10seconds) which
will include all bricks.

In other words, it allows clients to decide at mount time if they desire
synchronous or asynchronous IO into a cluster and the cluster can support both
of these modes to any number of clients simultaneously.

There are a few new volume options due to this feature:
  halo-shd-latency:  The threshold below which self-heal daemons will
  consider children (bricks) connected.

  halo-nfsd-latency: The threshold below which NFS daemons will consider
  children (bricks) connected.

  halo-latency: The threshold below which all other clients will
  consider children (bricks) connected.

  halo-min-replicas: The minimum number of replicas which are to
  be enforced regardless of latency specified in the above 3 options.
  If the number of children falls below this threshold the next
  best (chosen by latency) shall be swapped in.

New FUSE mount options:
  halo-latency &amp; halo-min-replicas: As descripted above.

This feature combined with multi-threaded SHD support (D1271745) results in
some pretty cool geo-replication possibilities.

Operational Notes:
- Global consistency is gaurenteed for synchronous clients, this is provided by
  the existing entry-locking mechanism.
- Asynchronous clients on the other hand and merely consistent to their region.
  Writes &amp; deletes will be protected via entry-locks as usual preventing
  concurrent writes into files which are undergoing replication.  Read operations
  on the other hand should never block.
- Writes are allowed from _any_ region and propagated from the origin to all
  other regions.  The take away from this is care should be taken to ensure
  multiple writers do not write the same files resulting in a gfid split-brain
  which will require resolution via split-brain policies (majority, mtime &amp;
  size).  Recommended method for preventing this is using the nfs-auth feature to
  define which region for each share has RW permissions, tiers not in the origin
  region should have RO perms.

TODO:
- Synchronous clients (including the SHD) should choose clients from their own
  region as preferred sources for reads.  Most of the plumbing is in place for
  this via the child_latency array.
- Better GFID split brain handling &amp; better dent type split brain handling
  (i.e. create a trash can and move the offending files into it).
- Tagging in addition to latency as a means of defining which children you wish
  to synchronously write to

Test Plan:
- The usual suspects, clang, gcc w/ address sanitizer &amp; valgrind
- Prove tests

Reviewers: jackl, dph, cjh, meyering

Reviewed By: meyering

Subscribers: ethanr

Differential Revision: https://phabricator.fb.com/D1272053

Tasks: 4117827

Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1
BUG: 1428061
Signed-off-by: Kevin Vigor &lt;kvigor@fb.com&gt;
Reviewed-on: http://review.gluster.org/16099
Reviewed-on: https://review.gluster.org/16177
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
Halo Geo-replication is a feature which allows Gluster or NFS clients to write
locally to their region (as defined by a latency "halo" or threshold if you
like), and have their writes asynchronously propagate from their origin to the
rest of the cluster.  Clients can also write synchronously to the cluster
simply by specifying a halo-latency which is very large (e.g. 10seconds) which
will include all bricks.

In other words, it allows clients to decide at mount time if they desire
synchronous or asynchronous IO into a cluster and the cluster can support both
of these modes to any number of clients simultaneously.

There are a few new volume options due to this feature:
  halo-shd-latency:  The threshold below which self-heal daemons will
  consider children (bricks) connected.

  halo-nfsd-latency: The threshold below which NFS daemons will consider
  children (bricks) connected.

  halo-latency: The threshold below which all other clients will
  consider children (bricks) connected.

  halo-min-replicas: The minimum number of replicas which are to
  be enforced regardless of latency specified in the above 3 options.
  If the number of children falls below this threshold the next
  best (chosen by latency) shall be swapped in.

New FUSE mount options:
  halo-latency &amp; halo-min-replicas: As descripted above.

This feature combined with multi-threaded SHD support (D1271745) results in
some pretty cool geo-replication possibilities.

Operational Notes:
- Global consistency is gaurenteed for synchronous clients, this is provided by
  the existing entry-locking mechanism.
- Asynchronous clients on the other hand and merely consistent to their region.
  Writes &amp; deletes will be protected via entry-locks as usual preventing
  concurrent writes into files which are undergoing replication.  Read operations
  on the other hand should never block.
- Writes are allowed from _any_ region and propagated from the origin to all
  other regions.  The take away from this is care should be taken to ensure
  multiple writers do not write the same files resulting in a gfid split-brain
  which will require resolution via split-brain policies (majority, mtime &amp;
  size).  Recommended method for preventing this is using the nfs-auth feature to
  define which region for each share has RW permissions, tiers not in the origin
  region should have RO perms.

TODO:
- Synchronous clients (including the SHD) should choose clients from their own
  region as preferred sources for reads.  Most of the plumbing is in place for
  this via the child_latency array.
- Better GFID split brain handling &amp; better dent type split brain handling
  (i.e. create a trash can and move the offending files into it).
- Tagging in addition to latency as a means of defining which children you wish
  to synchronously write to

Test Plan:
- The usual suspects, clang, gcc w/ address sanitizer &amp; valgrind
- Prove tests

Reviewers: jackl, dph, cjh, meyering

Reviewed By: meyering

Subscribers: ethanr

Differential Revision: https://phabricator.fb.com/D1272053

Tasks: 4117827

Change-Id: I694a9ab429722da538da171ec528406e77b5e6d1
BUG: 1428061
Signed-off-by: Kevin Vigor &lt;kvigor@fb.com&gt;
Reviewed-on: http://review.gluster.org/16099
Reviewed-on: https://review.gluster.org/16177
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>protocol-client: Initialize the list_head before using</title>
<updated>2017-03-27T00:56:19+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2017-03-26T02:29:53+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=98f602cd64e3f8cc7e4c4a7245e132c0afd7347f'/>
<id>98f602cd64e3f8cc7e4c4a7245e132c0afd7347f</id>
<content type='text'>
In client3_3_readdir(p)_cbk, in case of error conditions,
it is possible that the list_head is used before initializing.
Hence move the initialization before usage.

Change-Id: Ie58902d079fdc58416d17b5fa5f61375decb1c99
BUG: 1435943
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16948
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In client3_3_readdir(p)_cbk, in case of error conditions,
it is possible that the list_head is used before initializing.
Hence move the initialization before usage.

Change-Id: Ie58902d079fdc58416d17b5fa5f61375decb1c99
BUG: 1435943
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16948
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rpc/clnt: remove locks while notifying CONNECT/DISCONNECT</title>
<updated>2017-03-01T14:35:48+00:00</updated>
<author>
<name>Raghavendra G</name>
<email>rgowdapp@redhat.com</email>
</author>
<published>2017-02-28T07:43:59+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=773f32caf190af4ee48818279b6e6d3c9f2ecc79'/>
<id>773f32caf190af4ee48818279b6e6d3c9f2ecc79</id>
<content type='text'>
Locking during notify was introduced as part of commit
aa22f24f5db7659387704998ae01520708869873 [1]. The fix was introduced
to fix out-of-order CONNECT/DISCONNECT events from rpc-clnt to parent
xlators [2]. However as part of handling DISCONNECT protocol/client
does unwind saved frames (with failure) waiting for responses. This
saved_frames_unwind can be a costly operation and hence ideally
shouldn't be included in the critical section of notifylock, as it
unnecessarily delays the reconnection to same brick. Also, its not a
good practise to pass control to other xlators holding a lock as it
can lead to deadlocks. So, this patch removes locking in rpc-clnt
while notifying parent xlators.

To fix [2], two changes are present in this patch:

* notify DISCONNECT before cleaning up rpc connection (same as commit
  a6b63e11b7758cf1bfcb6798, patch [3]).
* protocol/client uses rpc_clnt_cleanup_and_start, which cleans up rpc
  connection and does a start while handling a DISCONNECT event from
  rpc. Note that patch [3] was reverted as rpc_clnt_start called in
  quick_reconnect path of protocol/client didn't invoke connect on
  transport as the connection was not cleaned up _yet_ (as cleanup was
  moved post notification in rpc-clnt). This resulted in clients never
  attempting connect to bricks.

Note that one of the neater ways to fix [2] (without using locks) is
to introduce generation numbers to map CONNECT and DISCONNECTS across
epochs and ignore DISCONNECT events if they don't belong to current
epoch. However, this approach is a bit complex to implement and
requires time. So, current patch is a hacky stop-gap fix till we come
up with a more cleaner solution.

[1] http://review.gluster.org/15916
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1386626
[3] http://review.gluster.org/15681

Change-Id: I62daeee8bb1430004e28558f6eb133efd4ccf418
Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
BUG: 1427012
Reviewed-on: https://review.gluster.org/16784
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Milind Changire &lt;mchangir@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Locking during notify was introduced as part of commit
aa22f24f5db7659387704998ae01520708869873 [1]. The fix was introduced
to fix out-of-order CONNECT/DISCONNECT events from rpc-clnt to parent
xlators [2]. However as part of handling DISCONNECT protocol/client
does unwind saved frames (with failure) waiting for responses. This
saved_frames_unwind can be a costly operation and hence ideally
shouldn't be included in the critical section of notifylock, as it
unnecessarily delays the reconnection to same brick. Also, its not a
good practise to pass control to other xlators holding a lock as it
can lead to deadlocks. So, this patch removes locking in rpc-clnt
while notifying parent xlators.

To fix [2], two changes are present in this patch:

* notify DISCONNECT before cleaning up rpc connection (same as commit
  a6b63e11b7758cf1bfcb6798, patch [3]).
* protocol/client uses rpc_clnt_cleanup_and_start, which cleans up rpc
  connection and does a start while handling a DISCONNECT event from
  rpc. Note that patch [3] was reverted as rpc_clnt_start called in
  quick_reconnect path of protocol/client didn't invoke connect on
  transport as the connection was not cleaned up _yet_ (as cleanup was
  moved post notification in rpc-clnt). This resulted in clients never
  attempting connect to bricks.

Note that one of the neater ways to fix [2] (without using locks) is
to introduce generation numbers to map CONNECT and DISCONNECTS across
epochs and ignore DISCONNECT events if they don't belong to current
epoch. However, this approach is a bit complex to implement and
requires time. So, current patch is a hacky stop-gap fix till we come
up with a more cleaner solution.

[1] http://review.gluster.org/15916
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1386626
[3] http://review.gluster.org/15681

Change-Id: I62daeee8bb1430004e28558f6eb133efd4ccf418
Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
BUG: 1427012
Reviewed-on: https://review.gluster.org/16784
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Milind Changire &lt;mchangir@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>protocol/client: Add gfid in request for better tcpdumps</title>
<updated>2017-02-28T04:28:07+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2017-02-22T04:19:05+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=874059dc94a123f78c19ba14bee2d5540c106a9e'/>
<id>874059dc94a123f78c19ba14bee2d5540c106a9e</id>
<content type='text'>
It is difficult to match opendir/releasedir or
open/release calls without gfid associlation to find
fd-leaks on the bricks. So assigning gfid also in
the request which was already there in the on-wire
format.

BUG: 1425676
Change-Id: Iec908eeaa2f97295d45140a529b7f8fb834a1553
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16706
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It is difficult to match opendir/releasedir or
open/release calls without gfid associlation to find
fd-leaks on the bricks. So assigning gfid also in
the request which was already there in the on-wire
format.

BUG: 1425676
Change-Id: Iec908eeaa2f97295d45140a529b7f8fb834a1553
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16706
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
