<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/features, branch release-3.8-fb</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>build: many rpm-build fixes</title>
<updated>2017-12-20T17:40:19+00:00</updated>
<author>
<name>Jeff Darcy</name>
<email>jdarcy@fb.com</email>
</author>
<published>2017-12-05T21:06:02+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=21e9d065b779841cc161975ff2b66e8ce27dddbe'/>
<id>21e9d065b779841cc161975ff2b66e8ce27dddbe</id>
<content type='text'>
Summary:
Highlights include:

 * Fixed GF_CONF_OPTS (dev builds) and RPM_BUILD_FLAGS (rpm builds)

 * Fixed version in configure.ac

 * Fixed handling of files only present when BUILD_FB_EXTRAS is set

 * Fixed disable-georeplication (upstream bug)

 * Fixed disable-tiering (upstream bug)

 * Removed .service files which should be generated from .in versions

 * Fixed tirpc (previously fbtirpc) references

 * Fixed init_enable problems

 * Removed delay-gen references

Test Plan: Use build.sh to build an RPM, and install it.

Differential Revision: https://phabricator.intern.facebook.com/D6611299

Change-Id: If61a4964a149f782038ea47362a82b813e6b7738
Signed-off-by: Jeff Darcy &lt;jdarcy@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
Highlights include:

 * Fixed GF_CONF_OPTS (dev builds) and RPM_BUILD_FLAGS (rpm builds)

 * Fixed version in configure.ac

 * Fixed handling of files only present when BUILD_FB_EXTRAS is set

 * Fixed disable-georeplication (upstream bug)

 * Fixed disable-tiering (upstream bug)

 * Removed .service files which should be generated from .in versions

 * Fixed tirpc (previously fbtirpc) references

 * Fixed init_enable problems

 * Removed delay-gen references

Test Plan: Use build.sh to build an RPM, and install it.

Differential Revision: https://phabricator.intern.facebook.com/D6611299

Change-Id: If61a4964a149f782038ea47362a82b813e6b7738
Signed-off-by: Jeff Darcy &lt;jdarcy@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Replace namespace/io-stats/io-threads with 3.6-fb versions</title>
<updated>2017-09-15T20:47:01+00:00</updated>
<author>
<name>Jeff Darcy</name>
<email>jdarcy@fb.com</email>
</author>
<published>2017-09-15T13:59:01+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=8dfdecf220d1c9365e1f8d6af9ead5e48c61e2eb'/>
<id>8dfdecf220d1c9365e1f8d6af9ead5e48c61e2eb</id>
<content type='text'>
This rolls up multiple patches related to namespace identification and
throttling/QoS.  This primarily includes the following, all by Michael
Goulet &lt;mgoulet@fb.com&gt;.

  io-threads: Add weighted round robin queueing by namespace
  https://phabricator.facebook.com/D5615269

  io-threads: Add per-namespace queue sizes to IO_THREADS_QUEUE_SIZE_KEY
  https://phabricator.facebook.com/D5683162

  io-threads: Implement better slot allocation algorithm
  https://phabricator.facebook.com/D5683186

  io-threads: Only enable weighted queueing on bricks
  https://phabricator.facebook.com/D5700062

  io-threads: Update queue sizes on drain
  https://phabricator.facebook.com/D5704832

  Fix parsing (-1) as default NS weight
  https://phabricator.facebook.com/D5723383

Parts of the following patches have also been applied to satisfy
dependencies.

  io-throttling: Calculate moving averages and throttle offending hosts
  https://phabricator.fb.com/D2516161
  Shreyas Siravara &lt;sshreyas@fb.com&gt;

  Hook up ODS logging for FUSE clients.
  https://phabricator.facebook.com/D3963376
  Kevin Vigor &lt;kvigor@fb.com&gt;

  Add the flag --skip-nfsd-start to skip the NFS daemon starting, even if
  it is enabled
  https://phabricator.facebook.com/D4575368
  Alex Lorca &lt;alexlorca@fb.com&gt;

There are also some "standard" changes: dealing with code that moved,
reindenting to comply with Gluster coding standards, gf_uuid_xxx, etc.

This patch *does* revert some changes which have occurred upstream since
3.6; these will be re-applied as appropriate on top of this new base.
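
The weighted round-robin queueing by namespace rolled up above can be
sketched roughly as follows (an illustrative Python model only; the real
implementation is C inside the io-threads xlator, and all names here are
invented for the sketch):

```python
def weighted_round_robin(queues, weights):
    """Drain per-namespace request queues in weighted round-robin order:
    on each pass, a namespace with weight w may dequeue up to w requests,
    so heavier-weighted namespaces get proportionally more service."""
    while any(queues.values()):
        for ns in list(queues):
            for _ in range(weights.get(ns, 1)):  # default weight of 1
                if queues[ns]:
                    yield queues[ns].pop(0)
```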

Change-Id: I69024115da7a60811e5b86beae781d602bdb558d
Signed-off-by: Jeff Darcy &lt;jdarcy@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This rolls up multiple patches related to namespace identification and
throttling/QoS.  This primarily includes the following, all by Michael
Goulet &lt;mgoulet@fb.com&gt;.

  io-threads: Add weighted round robin queueing by namespace
  https://phabricator.facebook.com/D5615269

  io-threads: Add per-namespace queue sizes to IO_THREADS_QUEUE_SIZE_KEY
  https://phabricator.facebook.com/D5683162

  io-threads: Implement better slot allocation algorithm
  https://phabricator.facebook.com/D5683186

  io-threads: Only enable weighted queueing on bricks
  https://phabricator.facebook.com/D5700062

  io-threads: Update queue sizes on drain
  https://phabricator.facebook.com/D5704832

  Fix parsing (-1) as default NS weight
  https://phabricator.facebook.com/D5723383

Parts of the following patches have also been applied to satisfy
dependencies.

  io-throttling: Calculate moving averages and throttle offending hosts
  https://phabricator.fb.com/D2516161
  Shreyas Siravara &lt;sshreyas@fb.com&gt;

  Hook up ODS logging for FUSE clients.
  https://phabricator.facebook.com/D3963376
  Kevin Vigor &lt;kvigor@fb.com&gt;

  Add the flag --skip-nfsd-start to skip the NFS daemon starting, even if
  it is enabled
  https://phabricator.facebook.com/D4575368
  Alex Lorca &lt;alexlorca@fb.com&gt;

There are also some "standard" changes: dealing with code that moved,
reindenting to comply with Gluster coding standards, gf_uuid_xxx, etc.

This patch *does* revert some changes which have occurred upstream since
3.6; these will be re-applied as appropriate on top of this new base.

Change-Id: I69024115da7a60811e5b86beae781d602bdb558d
Signed-off-by: Jeff Darcy &lt;jdarcy@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inodelk-count: Add stats to count the number of lock objects</title>
<updated>2017-09-13T14:13:58+00:00</updated>
<author>
<name>krad</name>
<email>krad@fb.com</email>
</author>
<published>2017-06-22T17:53:41+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=5c30bda609f99e3360e11dc3e6ac2c727a11171a'/>
<id>5c30bda609f99e3360e11dc3e6ac2c727a11171a</id>
<content type='text'>
Summary:
We want to track the number of locks held by the locks xlator. One of the ways to do it would be to track the
total number of pl_lock objects in the system.

This patch tracks the total number of pl_lock objects and exposes the stat via the io-stats JSON dump.
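
The counting scheme can be modeled like this (a Python sketch under
assumed names; the real counter lives in the C locks xlator and is read
out through io-stats):

```python
import threading

class LockStats:
    """Toy model of the pl_lock object counter: bump on allocation,
    drop on destruction, report the total in a JSON-style dump."""
    def __init__(self):
        self._count = 0
        self._mutex = threading.Lock()  # the C code uses the xlator's lock

    def lock_created(self):
        with self._mutex:
            self._count += 1

    def lock_destroyed(self):
        with self._mutex:
            self._count -= 1

    def dump(self):
        # stands in for the io-stats JSON dump entry
        return {"locks.total_pl_lock_objects": self._count}
```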

Test Plan: WIP; no passing run yet. Posting the diff to get feedback on whether this approach yields what you are looking for.

Reviewers: kvigor, sshreyas, jdarcy

Reviewed By: jdarcy

Differential Revision: https://phabricator.intern.facebook.com/D5303071

Change-Id: I946debcbff61699ec28b4d6f243042440107a224
Signed-off-by: Jeff Darcy &lt;jdarcy@fb.com&gt;
Reviewed-on: https://review.gluster.org/18273
Reviewed-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
Tested-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
We want to track the number of locks held by the locks xlator. One of the ways to do it would be to track the
total number of pl_lock objects in the system.

This patch tracks the total number of pl_lock objects and exposes the stat via the io-stats JSON dump.

Test Plan: WIP; no passing run yet. Posting the diff to get feedback on whether this approach yields what you are looking for.

Reviewers: kvigor, sshreyas, jdarcy

Reviewed By: jdarcy

Differential Revision: https://phabricator.intern.facebook.com/D5303071

Change-Id: I946debcbff61699ec28b4d6f243042440107a224
Signed-off-by: Jeff Darcy &lt;jdarcy@fb.com&gt;
Reviewed-on: https://review.gluster.org/18273
Reviewed-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
Tested-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>features/namespace: Add namespace xlator and link into brick graph</title>
<updated>2017-09-07T23:02:56+00:00</updated>
<author>
<name>Michael Goulet</name>
<email>mgoulet@fb.com</email>
</author>
<published>2017-08-15T23:50:13+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=dbd30776f26e9c3c0ce1cf8ad66ee95fc1adf484'/>
<id>dbd30776f26e9c3c0ce1cf8ad66ee95fc1adf484</id>
<content type='text'>
Summary:
This translator tags namespaces with a unique hash that corresponds to the
top-level directory (right under the gluster root) of the file the fop acts
on. The hash information is injected into the call frame by this translator,
so this namespace information can later be used to do throttling, QoS and
other namespace-specific stats collection and actions in later xlators
further down the stack.

When the translator can't find a path directly for the fd_t or loc_t, it winds
a GET_ANCESTRY_PATH_KEY down to the posix xlator to get the path manually.
Caching this namespace information in the inode makes sure that most requests
don't need to recalculate the hash, so that typically fops are just doing an
inode_ctx_get instead of the more expensive code paths that this xlator can take.

Right now the xlator is hard-coded to only hash the top-level directory, but
this could be easily extended to more sophisticated matching by modification
of the parse_path function.
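
The tagging and caching described above can be sketched as follows (an
illustrative Python model; the actual xlator is C, parse_path and the
inode-ctx cache are only mimicked here, and the hash function is an
arbitrary stand-in):

```python
import zlib

_ns_cache = {}  # stands in for the per-inode ctx cache of namespace hashes

def parse_path(path):
    """Return the top-level directory component right under the root;
    this is the only matching the xlator currently hard-codes."""
    parts = [p for p in path.split("/") if p]
    return parts[0] if parts else "/"

def namespace_hash(path):
    """Hash the namespace for a path, caching so most fops skip the
    expensive path (the analogue of just doing an inode_ctx_get)."""
    ns = parse_path(path)
    if ns not in _ns_cache:
        _ns_cache[ns] = zlib.crc32(ns.encode())
    return _ns_cache[ns]
```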

Test Plan:
Run `prove -v tests/basic/namespace.t` to see that tagging works.

Change-Id: I960ddadba114120ac449d27a769d409cc3759ebc
Reviewed-on: https://review.gluster.org/18041
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Shreyas Siravara &lt;sshreyas@fb.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
This translator tags namespaces with a unique hash that corresponds to the
top-level directory (right under the gluster root) of the file the fop acts
on. The hash information is injected into the call frame by this translator,
so this namespace information can later be used to do throttling, QoS and
other namespace-specific stats collection and actions in later xlators
further down the stack.

When the translator can't find a path directly for the fd_t or loc_t, it winds
a GET_ANCESTRY_PATH_KEY down to the posix xlator to get the path manually.
Caching this namespace information in the inode makes sure that most requests
don't need to recalculate the hash, so that typically fops are just doing an
inode_ctx_get instead of the more expensive code paths that this xlator can take.

Right now the xlator is hard-coded to only hash the top-level directory, but
this could be easily extended to more sophisticated matching by modification
of the parse_path function.

Test Plan:
Run `prove -v tests/basic/namespace.t` to see that tagging works.

Change-Id: I960ddadba114120ac449d27a769d409cc3759ebc
Reviewed-on: https://review.gluster.org/18041
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Shreyas Siravara &lt;sshreyas@fb.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge remote-tracking branch 'origin/release-3.8' into release-3.8-fb</title>
<updated>2017-08-31T19:33:59+00:00</updated>
<author>
<name>Jeff Darcy</name>
<email>jdarcy@fb.com</email>
</author>
<published>2017-08-31T19:33:59+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=ed23e379ee397b3fed479c15b7551d2dbba9a05f'/>
<id>ed23e379ee397b3fed479c15b7551d2dbba9a05f</id>
<content type='text'>
Change-Id: Ie35cd1c8c7808949ddf79b3189f1f8bf0ff70ed8
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Ie35cd1c8c7808949ddf79b3189f1f8bf0ff70ed8
</pre>
</div>
</content>
</entry>
<entry>
<title>features/locks: Fix crash bug in connection (lock) clean-up flow</title>
<updated>2017-08-28T17:05:09+00:00</updated>
<author>
<name>Richard Wareing</name>
<email>rwareing@fb.com</email>
</author>
<published>2015-11-25T04:45:23+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=ddd37db74fdda99874e9087b43a106a72ec256f8'/>
<id>ddd37db74fdda99874e9087b43a106a72ec256f8</id>
<content type='text'>
Summary:
- Fixes a crash bug where bricks crash when the "clear locks" command is
  run (by CLI or by revocation code) and sockets are later cleaned up.
  The crash is a use-after-free: refs to the lock are left in the
  client-list, and when that list is later traversed the brick
  dereferences pointers that now point to freed memory.
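
The shape of the fix can be modeled like this (a Python sketch with
invented names; the real code is C in the locks xlator):

```python
class Client:
    """Per-connection state; `locks` models the client-list of lock refs."""
    def __init__(self):
        self.locks = []

def clear_lock(client, lock):
    """Model of the fix: drop the client-list ref before the lock object
    is destroyed, so a later traversal of client.locks never walks a
    dangling pointer."""
    if lock in client.locks:
        client.locks.remove(lock)
    # ...the lock is freed here; in the C code the leftover ref was what
    # turned connection clean-up into a use-after-free.
```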

Test Plan:
- Ran with monkey-unlock and tested connection clean-ups after lock
  revocation

Reviewers: sshreyas, dph, moox

Reviewed By: moox

Differential Revision: https://phabricator.fb.com/D2695087

Tasks: 6207062

Change-Id: Iea26efe4bfbadc26431a3c50a0a8bda218bb5219
Signed-off-by: Jeff Darcy &lt;jdarcy@fb.com&gt;
Reviewed-on: https://review.gluster.org/18122
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
Tested-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
- Fixes a crash bug where bricks crash when the "clear locks" command is
  run (by CLI or by revocation code) and sockets are later cleaned up.
  The crash is a use-after-free: refs to the lock are left in the
  client-list, and when that list is later traversed the brick
  dereferences pointers that now point to freed memory.

Test Plan:
- Ran with monkey-unlock and tested connection clean-ups after lock
  revocation

Reviewers: sshreyas, dph, moox

Reviewed By: moox

Differential Revision: https://phabricator.fb.com/D2695087

Tasks: 6207062

Change-Id: Iea26efe4bfbadc26431a3c50a0a8bda218bb5219
Signed-off-by: Jeff Darcy &lt;jdarcy@fb.com&gt;
Reviewed-on: https://review.gluster.org/18122
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
Tested-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>features/quota: Fix brick crash in quota unlink callback</title>
<updated>2017-08-01T02:26:45+00:00</updated>
<author>
<name>Richard Wareing</name>
<email>rwareing@fb.com</email>
</author>
<published>2015-10-20T02:29:55+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=eac58993014a23276adf036da91f14efbfe6c84f'/>
<id>eac58993014a23276adf036da91f14efbfe6c84f</id>
<content type='text'>
Summary: - Get log message to use loc.gfid not loc.inode-&gt;gfid

Test Plan: - Run prove -v tests/basic/quota*.t

Reviewers: dph, moox, sshreyas

Reviewed By: sshreyas

Signature: t1:2559107:1445311668:61ca5809fa977326d0fb503e874363a29cd31dfe

Change-Id: Iad16d7b2102376380eb0f6918111249af370aaeb
Reviewed-on: https://review.gluster.org/17938
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
Tested-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary: - Get log message to use loc.gfid not loc.inode-&gt;gfid

Test Plan: - Run prove -v tests/basic/quota*.t

Reviewers: dph, moox, sshreyas

Reviewed By: sshreyas

Signature: t1:2559107:1445311668:61ca5809fa977326d0fb503e874363a29cd31dfe

Change-Id: Iad16d7b2102376380eb0f6918111249af370aaeb
Reviewed-on: https://review.gluster.org/17938
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
Tested-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr/cluster: PGFID heal support</title>
<updated>2017-07-31T21:23:56+00:00</updated>
<author>
<name>Richard Wareing</name>
<email>rwareing@fb.com</email>
</author>
<published>2015-10-07T03:09:35+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=7c422b6fd0028300f7b46a084bcd5123c2439cc9'/>
<id>7c422b6fd0028300f7b46a084bcd5123c2439cc9</id>
<content type='text'>
Summary:
  PGFID healing enables heals which might otherwise fail due
  to the lack of an entry heal to succeed, by performing
  the entry healing within the same heal flow.

  It does this by leveraging the PGFID tracking feature of
  the POSIX xlator, and examining lookup replies for the
  PGFID attribute.  If detected, the pgfid will be decoded
  and stored for later use in case the heal fails for whatever
  reason.  Cascading heal failures are handled through
  recursion.

  This feature is critical for a couple reasons:
      1. General healing predictability - When the SHD
         attempts to heal a given GFID, it should be able
         to do so without having to wait for some other
         dependent heal to take place.
      2. Reliability - In some cases the parent directory
         may require healing, but the req'd entry in the
         indices/xattrop directory may not exist
         (e.g. bugs/crashes etc).  Prior to PGFID heal support
         some sort of external script would be required to
         queue up these heals by using FS specific utilities
         to lookup the parent directory by hardlink or
         worse...do a costly full heal to clean them up.
      3. Performance - In combination with multi-threaded SHD
         this feature will make SHD healing _much_ faster, as
         directories with a large number of files to be healed
         will no longer have to wait for an entry heal to
         come along; the first file in that directory queued
         for healing will trigger an entry heal for the directory,
         allowing the other files in that directory to be
         healed (immediately) in parallel.
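
The cascading-heal flow above can be sketched roughly as follows (a
Python model only; the real logic is C in AFR/SHD, and pgfid_of and
do_heal are invented stand-ins for the stored PGFID lookup and the heal
attempt):

```python
def heal_with_pgfid(gfid, pgfid_of, do_heal):
    """Model of PGFID healing: if healing a gfid fails, heal its parent
    (recovered from the stored PGFID) first, then retry the original
    heal. Cascading failures are handled by recursing up the tree."""
    if do_heal(gfid):
        return True
    parent = pgfid_of(gfid)       # decoded earlier from the lookup reply
    if parent is None:
        return False              # no parent known; heal stays failed
    if heal_with_pgfid(parent, pgfid_of, do_heal):
        return do_heal(gfid)      # parent entry-healed; retry this gfid
    return False
```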

Test Plan:
- run prove tests/basic/afr/shd_pgfid_heal.t
- run prove tests/basic/afr/shd*.t
- run prove tests/basic/afr/gfid*.t

Differential Revision: https://phabricator.fb.com/D2546133

Change-Id: I25f586047f8bcafa900c0cc9ee8f0e2128688c73
Signed-off-by: Jeff Darcy &lt;jdarcy@fb.com&gt;
Reviewed-on: https://review.gluster.org/17929
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
  PGFID healing enables heals which might otherwise fail due
  to the lack of an entry heal to succeed, by performing
  the entry healing within the same heal flow.

  It does this by leveraging the PGFID tracking feature of
  the POSIX xlator, and examining lookup replies for the
  PGFID attribute.  If detected, the pgfid will be decoded
  and stored for later use in case the heal fails for whatever
  reason.  Cascading heal failures are handled through
  recursion.

  This feature is critical for a couple reasons:
      1. General healing predictability - When the SHD
         attempts to heal a given GFID, it should be able
         to do so without having to wait for some other
         dependent heal to take place.
      2. Reliability - In some cases the parent directory
         may require healing, but the req'd entry in the
         indices/xattrop directory may not exist
         (e.g. bugs/crashes etc).  Prior to PGFID heal support
         some sort of external script would be required to
         queue up these heals by using FS specific utilities
         to lookup the parent directory by hardlink or
         worse...do a costly full heal to clean them up.
      3. Performance - In combination with multi-threaded SHD
         this feature will make SHD healing _much_ faster, as
         directories with a large number of files to be healed
         will no longer have to wait for an entry heal to
         come along; the first file in that directory queued
         for healing will trigger an entry heal for the directory,
         allowing the other files in that directory to be
         healed (immediately) in parallel.

Test Plan:
- run prove tests/basic/afr/shd_pgfid_heal.t
- run prove tests/basic/afr/shd*.t
- run prove tests/basic/afr/gfid*.t

Differential Revision: https://phabricator.fb.com/D2546133

Change-Id: I25f586047f8bcafa900c0cc9ee8f0e2128688c73
Signed-off-by: Jeff Darcy &lt;jdarcy@fb.com&gt;
Reviewed-on: https://review.gluster.org/17929
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Tested-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Handle gfid-less directories in heal flow</title>
<updated>2017-07-12T17:04:32+00:00</updated>
<author>
<name>Richard Wareing</name>
<email>rwareing@fb.com</email>
</author>
<published>2015-10-01T01:49:15+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=90d375de2e1ea49468c432126babbaee66d85fc0'/>
<id>90d375de2e1ea49468c432126babbaee66d85fc0</id>
<content type='text'>
Summary:
- Updates the heal flow to handle the case where a directory does not
  have a gfid assigned.  In this case we remove _only_ empty
  directories, such that the parent can regain consistency and the
  files within can be correctly healed.
- Also adds a test for the case where a file does not have a gfid; this
  is already handled by the metadata heal flow, but tests were lacking
  for this code path.
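
The "remove only if empty" policy can be sketched as (an illustrative
Python model; the real check is C in the heal flow, and has_gfid is an
invented stand-in for reading the trusted.gfid xattr):

```python
import os

def fix_gfidless_dir(path, has_gfid):
    """Model of the policy above: a directory missing its gfid is removed
    only when empty, so the parent regains consistency without risking
    data loss; non-empty gfid-less directories are left for entry heal."""
    if has_gfid(path):
        return "keep"
    if os.listdir(path):
        return "skip"     # non-empty: never remove real data
    os.rmdir(path)
    return "removed"
```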

Test Plan:
- prove -v tests/basic/shd_autofix_nogfid.t
- prove -v tests/basic/gfid_unsplit_shd.t

Reviewers: dph, moox, sshreyas

Reviewed By: sshreyas

Differential Revision: https://phabricator.fb.com/D2502067

Tasks: 8549168

Change-Id: I8dd3e6a6d62807cb38aafe597eced3d4b402351b
Signed-off-by: Jeff Darcy &lt;jdarcy@fb.com&gt;
Reviewed-on: https://review.gluster.org/17750
Tested-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Summary:
- Updates the heal flow to handle the case where a directory does not
  have a gfid assigned.  In this case we remove _only_ empty
  directories, such that the parent can regain consistency and the
  files within can be correctly healed.
- Also adds a test for the case where a file does not have a gfid; this
  is already handled by the metadata heal flow, but tests were lacking
  for this code path.

Test Plan:
- prove -v tests/basic/shd_autofix_nogfid.t
- prove -v tests/basic/gfid_unsplit_shd.t

Reviewers: dph, moox, sshreyas

Reviewed By: sshreyas

Differential Revision: https://phabricator.fb.com/D2502067

Tasks: 8549168

Change-Id: I8dd3e6a6d62807cb38aafe597eced3d4b402351b
Signed-off-by: Jeff Darcy &lt;jdarcy@fb.com&gt;
Reviewed-on: https://review.gluster.org/17750
Tested-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Jeff Darcy &lt;jeff@pl.atyp.us&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>features/shard: Fix vm corruption upon fix-layout</title>
<updated>2017-04-10T15:45:51+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2017-04-06T12:40:41+00:00</published>
<link rel='alternate' type='text/html' href='http://git.gluster.org/cgit/glusterfs.git/commit/?id=d71ec72b981d110199c3376f39f91b704241975c'/>
<id>d71ec72b981d110199c3376f39f91b704241975c</id>
<content type='text'>
        Backport of: https://review.gluster.org/17010

shard's writev implementation, as part of identifying
presence of participant shards that aren't in memory,
first sends an MKNOD on these shards, and upon EEXIST error,
looks up the shards before proceeding with the writes.

The VM corruption was caused when the following happened:
1. DHT had n subvolumes initially.
2. Upon add-brick + fix-layout, the layout of .shard changed
   although the existing shards under it were yet to be migrated
   to their new hashed subvolumes.
3. During this time, there were writes on the VM falling in regions
   of the file whose corresponding shards were already existing under
   .shard.
4. Sharding xl sent MKNOD on these shards, now creating them in their
   new hashed subvolumes although there already exist shard blocks for
   this region with valid data.
5. All subsequent writes were wound on these newly created copies.

The net outcome is that both copies of the shard didn't have the correct
data. This caused the affected VMs to be unbootable.

FIX:
For want of better alternatives in DHT, the fix changes shard fops to do
a LOOKUP before the MKNOD and, upon EEXIST error, perform another lookup.
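
The LOOKUP-before-MKNOD ordering can be modeled like this (a Python
sketch of the control flow only; the real fops are C in the shard
xlator, and lookup/mknod here are invented stand-ins for the wound
fops):

```python
def ensure_shard(lookup, mknod, shard):
    """Model of the fix: LOOKUP first, so an existing shard (possibly
    still on its old hashed subvolume) is found and reused; only MKNOD
    when it truly does not exist, and re-LOOKUP on EEXIST instead of
    trusting a freshly created copy."""
    existing = lookup(shard)
    if existing is not None:
        return existing           # never shadow a shard with valid data
    try:
        return mknod(shard)
    except FileExistsError:       # EEXIST race: created meanwhile
        return lookup(shard)
```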

Change-Id: I1a5d3515b42e2e5583c407d1b4aff44d7ce472eb
BUG: 1440635
RCA'd-by: Raghavendra Gowdappa &lt;rgowdapp@redhat.com&gt;
Reported-by: Mahdi Adnan &lt;mahdi.adnan@outlook.com&gt;
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17019
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: jiffin tony Thottan &lt;jthottan@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
        Backport of: https://review.gluster.org/17010

shard's writev implementation, as part of identifying
presence of participant shards that aren't in memory,
first sends an MKNOD on these shards, and upon EEXIST error,
looks up the shards before proceeding with the writes.

The VM corruption was caused when the following happened:
1. DHT had n subvolumes initially.
2. Upon add-brick + fix-layout, the layout of .shard changed
   although the existing shards under it were yet to be migrated
   to their new hashed subvolumes.
3. During this time, there were writes on the VM falling in regions
   of the file whose corresponding shards were already existing under
   .shard.
4. Sharding xl sent MKNOD on these shards, now creating them in their
   new hashed subvolumes although there already exist shard blocks for
   this region with valid data.
5. All subsequent writes were wound on these newly created copies.

The net outcome is that both copies of the shard didn't have the correct
data. This caused the affected VMs to be unbootable.

FIX:
For want of better alternatives in DHT, the fix changes shard fops to do
a LOOKUP before the MKNOD and, upon EEXIST error, perform another lookup.

Change-Id: I1a5d3515b42e2e5583c407d1b4aff44d7ce472eb
BUG: 1440635
RCA'd-by: Raghavendra Gowdappa &lt;rgowdapp@redhat.com&gt;
Reported-by: Mahdi Adnan &lt;mahdi.adnan@outlook.com&gt;
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17019
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: jiffin tony Thottan &lt;jthottan@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
