| author | Kaushal M <kaushal@redhat.com> | 2015-03-26 15:18:54 +0530 | 
|---|---|---|
| committer | Krishnan Parthasarathi <kparthas@redhat.com> | 2015-04-13 06:30:02 +0000 | 
| commit | 1efa50861b2cee68de9c9b751d9fc5eed08f5e5b (patch) | |
| tree | 218ca5aba1eb404ff0d7ac25f20716915768f4ec /libglusterfs | |
| parent | 7c7bbc027feb4c5b233e3078951e5bb1d9fc4618 (diff) | |
glusterd: Replace transaction peers lists
Transaction peer lists were used in GlusterD to track the peers belonging
to a transaction. This was needed to prevent newly added peers from
performing partial transactions, which could be incorrect.
This was accomplished by creating a separate transaction peers list at
the beginning of every transaction. A transaction peers list referenced
the peerinfo data structures of the peers which were present at the
beginning of the transaction. RCU protection of peerinfos referenced by
the transaction peers list is a hard problem and difficult to do
correctly.
To have proper RCU protection of peerinfos, the transaction peers lists
have been replaced by an alternative method to identify peers that
belong to a transaction. The alternative method is to use the global
peers list along with generation numbers to identify peers that should
belong to a transaction.
This change introduces a global peer list generation number, and a
generation number for each peerinfo object. Whenever a peerinfo object
is created, the global generation number is bumped, and the peerinfo's
generation number is set to the bumped global generation.
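
As a rough illustration of the generation bump described above, here is a
minimal single-threaded sketch. The names `struct peer`, `struct peer_list`
and `peer_new` are hypothetical stand-ins invented for this example and are
not the actual GlusterD structures or functions; a real implementation would
also need to make the bump safe against concurrent writers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not the actual GlusterD types. */
struct peer {
    uint32_t     generation; /* global generation at creation time */
    struct peer *next;
};

struct peer_list {
    uint32_t     generation; /* global peer list generation number */
    struct peer *head;
};

/* Create a peer: bump the global generation first, then stamp the new
 * peer with the bumped value and link it into the list. */
static struct peer *peer_new(struct peer_list *list)
{
    assert(list != NULL);

    struct peer *p = calloc(1, sizeof(*p));
    if (p == NULL)
        return NULL;

    p->generation = ++list->generation;
    p->next = list->head;
    list->head = p;
    return p;
}
```

Because the bump happens before the stamp, every peer created after a
transaction has saved the global generation ends up with a strictly larger
generation number than the saved one.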
With the above changes, the algorithm to identify peers belonging to a
transaction with RCU protection is as follows:
- At the beginning of a transaction, the current global generation
  number is saved
- To identify whether a peer belongs to the transaction,
  - Start an RCU read critical section
  - For each peer in the global peers list,
    - If the peer's generation number is not greater than the saved
      generation number, continue with the action on the peer
  - End the RCU read critical section
The above algorithm guarantees that:
- The peer list is not modified when a transaction is iterating through
  it
- The transaction actions are only done on peers that were present when
  the transaction started
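
The selection step of the algorithm can be sketched roughly as follows. This
is an illustration under stated assumptions, not the code from this change:
`struct peer`, `for_each_txn_peer`, `add_generation`, `read_lock` and
`read_unlock` are hypothetical names, and the two lock functions are no-op
placeholders marking where liburcu's `rcu_read_lock()` and
`rcu_read_unlock()` would be called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified peer; not the actual GlusterD peerinfo. */
struct peer {
    uint32_t     generation;
    struct peer *next;
};

/* No-op placeholders so the sketch builds without liburcu. Real code
 * would call rcu_read_lock() and rcu_read_unlock() here. */
static void read_lock(void) {}
static void read_unlock(void) {}

typedef void (*peer_action)(struct peer *peer, void *data);

/* Run `action` on every peer whose generation is not greater than the
 * generation saved at the start of the transaction. Returns the number
 * of peers acted upon. */
static int for_each_txn_peer(struct peer *head, uint32_t txn_generation,
                             peer_action action, void *data)
{
    int visited = 0;

    assert(action != NULL);

    read_lock();
    for (struct peer *p = head; p != NULL; p = p->next) {
        if (p->generation > txn_generation)
            continue; /* added after the transaction started: skip */
        action(p, data);
        visited++;
    }
    read_unlock();

    return visited;
}

/* Example action: accumulate the generations of the selected peers. */
static void add_generation(struct peer *peer, void *data)
{
    *(uint32_t *)data += peer->generation;
}
```

A transaction would save the global generation once at its start and pass
that saved value as `txn_generation` on every later pass over the list.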
However, as a transaction could iterate over the peers list multiple
times, the algorithm cannot guarantee that the same set of peers will be
selected every time. A peer could get deleted between two iterations of
the list within a transaction. This problem existed with transaction
peers lists as well, but unlike before, it will no longer lead to invalid
memory access and potential crashes. This problem will be addressed
separately.
This change was developed on the git branch at [1]. This commit is a
combination of the following commits on the development branch.
  52ded5b Add timespec_cmp
  44aedd8 Add create timestamp to peerinfo
  7bcbea5 Fix some silly mistakes
  13e3241 Add start time to opinfo
  17a6727 Use timestamp comparisions to identify xaction peers instead
          of a xaction peer list
  3be05b6 Correct check for peerinfo age
  70d5b58 Use read-critical sections for peer list iteration
  ba4dbca Use peerinfo timestamp checks in op-sm instead of xaction peer
          list
  d63f811 Add more peer status checks when iterating peers list in
          glusterd-syncop
  1998a2a Timestamp based peer list traversal of mgmtv3 xactions
  f3c1a42 Remove transaction peer lists
  b8b08ee Remove unused labels
  32e5f5b Remove 'npeers' usage
  a075fb7 Remove 'npeers' from mgmt-v3 framework
  12c9df2 Use generation number instead of timestamps.
  9723021 Remove timespec_cmp
  80ae2c6 Remove timespec.h include
  a9479b0 Address review comments on 10147/4
[1]: https://github.com/kshlm/glusterfs/tree/urcu
Change-Id: I9be1033525c0a89276f5b5d83dc2eb061918b97f
BUG: 1205186
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/10147
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Anand Nekkunti <anekkunt@redhat.com>
Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
Tested-by: Krishnan Parthasarathi <kparthas@redhat.com>
Diffstat (limited to 'libglusterfs')
| -rw-r--r-- | libglusterfs/src/mem-types.h | 1 | 
1 files changed, 0 insertions, 1 deletions
diff --git a/libglusterfs/src/mem-types.h b/libglusterfs/src/mem-types.h
index f4d3974f0b2..fc06d52239b 100644
--- a/libglusterfs/src/mem-types.h
+++ b/libglusterfs/src/mem-types.h
@@ -150,7 +150,6 @@ enum gf_common_mem_types_ {
         gf_common_mt_nfs_exports          = 131,
         gf_common_mt_gf_brick_spec_t      = 132,
         gf_common_mt_gf_timer_entry_t     = 133,
-        gf_common_mt_list_head_t          = 134,
         gf_common_mt_end
 };
 #endif
