glusterfs.git/tests/bugs, branch v9dev

tests: Fix for spurious failure for some test cases

2020-04-16T11:15:51+00:00

Problem: Test cases sometimes fail when creating files on the mount
         point right after mounting the volume

Solution: After starting the volume, wait until all brick instances
          are fully started by adding an online_brick_count check
          right after the volume start
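
The waiting described above amounts to polling a counter until it reaches
the expected value or a timeout expires. A minimal sketch of that idea in
C, not the test framework's own code: wait_for_count() and
fake_brick_probe() are hypothetical names, and the probe merely stands in
for whatever reports the number of online bricks.

```c
#define _POSIX_C_SOURCE 200809L /* for nanosleep() under strict -std modes */
#include <stdbool.h>
#include <time.h>

/* Hypothetical probe type: returns how many bricks are currently online. */
typedef int (*count_probe_fn)(void *ctx);

/* Poll `probe` every `interval_ms` until it reports `expected`, or give up
 * once `timeout_ms` has been spent sleeping. Returns true on success. */
static bool
wait_for_count(count_probe_fn probe, void *ctx, int expected,
               long timeout_ms, long interval_ms)
{
    long waited_ms = 0;

    for (;;) {
        if (probe(ctx) == expected)
            return true;
        if (waited_ms >= timeout_ms)
            return false;

        struct timespec ts = {
            .tv_sec = interval_ms / 1000,
            .tv_nsec = (interval_ms % 1000) * 1000000L,
        };
        nanosleep(&ts, NULL);
        waited_ms += interval_ms;
    }
}

/* Stand-in probe for illustration: one more brick comes online on each
 * poll, up to a maximum of three. `ctx` points to the current count. */
static int
fake_brick_probe(void *ctx)
{
    int *online = ctx;

    if (*online < 3)
        (*online)++;
    return *online;
}
```

Calling wait_for_count(fake_brick_probe, &online, 3, 1000, 1) with online
initialised to 0 returns true once the third poll sees three bricks; the
files would only be created after that point.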

Change-Id: I5020e7e417539377277ca00189f9c51d2cf877a6
Fixes: #1162
Signed-off-by: Mohit Agrawal

tests: do not truncate file offsets and sizes to 32-bit

2020-04-15T07:58:33+00:00

Do not truncate file offsets and sizes to 32 bits, to
prevent spurious test failures on files larger than 2 GB.
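
To illustrate why 32 bits are not enough here, the sketch below shows
what happens to a 64-bit offset once only 32 bits are kept. It is a
generic C example, not taken from the patched tests; fits_in_int32() and
low_32_bits() are hypothetical helper names.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a 64-bit offset or size can be stored in a signed 32-bit
 * integer without loss. Anything above INT32_MAX (2 GiB - 1) cannot. */
static bool
fits_in_int32(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}

/* What survives when only the low 32 bits are kept. Conversion to an
 * unsigned type is well defined in C: the value is reduced modulo 2^32. */
static uint32_t
low_32_bits(int64_t value)
{
    return (uint32_t)value;
}
```

For a 5 GiB offset, fits_in_int32() is false and low_32_bits() yields
1 GiB, so a test comparing the truncated value against the real file
size would fail even though the file itself is correct.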

Signed-off-by: Dmitry Antipov 

Change-Id: I2a77ea5f9f415249b23035eecf07129f19194ac2
Fixes: #1161

test: Fix test "bug-1064148" to pass in mux

2020-04-13T05:28:08+00:00

Parts of the test weren't designed to run in mux mode; this is now fixed.

Change-Id: I428c2fcce6d047e324ca5dcaef677ee1794e3dfe
updates: #1154
Signed-off-by: Barak Sason Rofman

tests: Fix spurious failure of tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t

2020-04-13T05:25:40+00:00

Problem: Volume status sometimes fails after restarting glusterd
         on one cluster node

Solution: Wait for the glusterd handshake to finish on the node
          that was down

Change-Id: Ib23ca41c943caf2903c61ebf42dc437c1b9d6054
Fixes: #1158
Signed-off-by: Mohit Agrawal

posix: Avoid dict_del logs in posix_is_layout_stale while key is NULL

2020-04-09T04:26:28+00:00

Problem: The key "GF_PREOP_PARENT_KEY" is populated by dht, so for a
         non-distribute volume such as 1x3 the key is not populated and
         posix_is_layout_stale logs a message whenever a file is created

Solution: To avoid the log message, check a condition before deleting the key
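
The pattern of guarding a delete so that it does not log can be sketched
with a toy dictionary. This is not the GlusterFS dict_t API and not the
actual patch: every toy_* name is hypothetical, and the assumption that
the delete warns on a NULL or missing key is made only for illustration.

```c
#include <stdio.h>
#include <string.h>

#define TOY_DICT_MAX 8

/* Toy stand-in for a dictionary that stores keys only. */
struct toy_dict {
    const char *keys[TOY_DICT_MAX];
    int count;
    int warnings; /* how many times a delete has complained */
};

/* Returns the index of `key`, or -1 if it is NULL or not present. */
static int
toy_dict_find(const struct toy_dict *d, const char *key)
{
    if (key == NULL)
        return -1;
    for (int i = 0; i < d->count; i++)
        if (strcmp(d->keys[i], key) == 0)
            return i;
    return -1;
}

/* Unguarded delete: warns when the key is NULL or missing (assumed
 * behavior, modelling the noisy path described in the commit). */
static void
toy_dict_del(struct toy_dict *d, const char *key)
{
    int idx = toy_dict_find(d, key);

    if (idx < 0) {
        d->warnings++;
        fprintf(stderr, "toy_dict_del: key %s not found\n",
                key ? key : "(null)");
        return;
    }
    d->keys[idx] = d->keys[--d->count]; /* fill the hole with the last key */
}

/* Guarded delete: only calls toy_dict_del() when there is something to
 * delete, so the warning path is never reached. */
static void
toy_dict_del_if_present(struct toy_dict *d, const char *key)
{
    if (key != NULL && toy_dict_find(d, key) >= 0)
        toy_dict_del(d, key);
}
```

With the guarded variant, a volume type that never sets the key goes
through the same code path without producing a log line per file create.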

Change-Id: I813ee7960633e7f9f5e9ad2f42f288053d9eb71f
Fixes: #1150
Signed-off-by: Mohit Agrawal

mount/fuse: Wait for 'mount' child to exit before dying

2020-04-09T04:15:17+00:00

Problem:
tests/bugs/protocol/bug-1433815-auth-allow.t sometimes fails
because of a stale mount. The stale mount appears when the
parent process dies without waiting for the child process
that mounts the fuse fs to die.

Fix:
Wait for the mounting child process to die before dying.
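
Waiting for a child before exiting is the standard fork()/waitpid()
pattern. The sketch below shows that pattern in isolation and is not the
code from this patch; run_child_and_wait() and fake_mount_step() are
hypothetical names, and the child work is a placeholder for the real
mount step.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that runs `child_fn` and exits with its return value, then
 * block until that child has terminated. Returns the child's exit code,
 * or -1 if fork/waitpid failed or the child did not exit normally. */
static int
run_child_and_wait(int (*child_fn)(void))
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0)
        _exit(child_fn()); /* child: do the work, never return to caller */

    int status = 0;
    pid_t waited;

    do {
        waited = waitpid(pid, &status, 0);
    } while (waited < 0 && errno == EINTR); /* retry if interrupted */

    if (waited != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Placeholder for the work done by the mounting child. */
static int
fake_mount_step(void)
{
    return 7;
}
```

Because the parent returns only after waitpid() has reaped the child,
anything the caller does next (including exiting) happens after the
child's work is complete, which is the ordering the fix asks for.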

Fixes: #1152
Change-Id: I8baee8720e88614fdb762ea822d5877973eef8dc
Signed-off-by: Pranith Kumar K

dht - fixing a permission update issue

2020-04-08T06:57:53+00:00

When bringing back a downed brick and performing a lookup from the client
side, the permissions on that brick aren't updated on the first lookup,
but only on the second.

This patch modifies the permission update logic so that the first lookup
triggers a permission update on the downed brick.

LIMITATIONS OF THE PATCH:
The choice of source depends on whether the directory has a layout or not.
Even the directories on a newly added brick have a layout xattr [zeroed], but the same is not true for the root directory.
Hence, if only the newly added bricks are up in the entire cluster [and the others are down], any permission change made during this time will be overwritten by the older permissions when the cluster is restarted.

fixes: #999
Change-Id: Ieb70246d41e59f9cae9f70bc203627a433dfbd33
Signed-off-by: Barak Sason Rofman

tests: Fix spurious failure of tests/bugs/snapshot/bug-1111041.t

2020-04-07T15:31:01+00:00

The test should wait for the process-down notification to be received
by glusterd.

Fixes: #1153
Change-Id: I9162b58a92c1a909ca98097f14c0714f9086bdd1
Signed-off-by: Pranith Kumar K

fuse: Add error-logs to debug bug-1433815-auth-allow.t failures

2020-04-06T01:38:05+00:00

Fixes: #1149
Change-Id: I38483fc7d76d7fe0ac9fb649669a46bdf9c82234
Signed-off-by: Pranith Kumar K

write-behind: fix data corruption

2020-04-03T14:28:33+00:00

There was a bug in write-behind that allowed a previously completed write
to overwrite the overlapping region of data from a future write.

Suppose we want to send three writes (W1, W2 and W3). W1 and W2 are
sequential, and W3 writes at the same offset as W2:

    W2.offset = W3.offset = W1.offset + W1.size

Both W1 and W2 are sent in parallel. W3 is only sent after W2 completes.
So W3 should *always* overwrite the overlapping part of W2.

Suppose write-behind processes the requests from 2 concurrent threads:

    Thread 1                    Thread 2

    
                                
    wb_enqueue_tempted(W1)
    /* W1 is assigned gen X */
                                wb_enqueue_tempted(W2)
                                /* W2 is assigned gen X */

                                wb_process_queue()
                                  __wb_preprocess_winds()
                                    /* W1 and W2 are sequential and all
                                     * other requisites are met to merge
                                     * both requests. */
                                    __wb_collapse_small_writes(W1, W2)
                                    __wb_fulfill_request(W2)

                                  __wb_pick_unwinds() -> W2
                                  /* In this case, since the request is
                                   * already fulfilled, wb_inode->gen
                                   * is not updated. */

                                wb_do_unwinds()
                                  STACK_UNWIND(W2)

                                /* The application has received the
                                 * result of W2, so it can send W3. */
                                

                                wb_enqueue_tempted(W3)
                                /* W3 is assigned gen X */

                                wb_process_queue()
                                  /* Here we have W1 (which contains
                                   * the conflicting W2) and W3 with
                                   * same gen, so they are interpreted
                                   * as concurrent writes that do not
                                   * conflict. */
                                  __wb_pick_winds() -> W3

                                wb_do_winds()
                                  STACK_WIND(W3)

    wb_process_queue()
      /* Eventually W1 will be
       * ready to be sent */
      __wb_pick_winds() -> W1
      __wb_pick_unwinds() -> W1
        /* Here wb_inode->gen is
         * incremented. */

    wb_do_unwinds()
      STACK_UNWIND(W1)

    wb_do_winds()
      STACK_WIND(W1)

So, as we can see, W3 is sent before W1, which shouldn't happen.

The problem is that wb_inode->gen is only incremented for requests that
have not been fulfilled but, after a merge, the request is marked as
fulfilled even though it has not been sent to the brick. This allows
future requests to be assigned to the same generation, so they can be
reordered internally.

Solution:

Increment wb_inode->gen before any unwind, even if it's for a fulfilled
request.
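
The effect of the change can be replayed with a small model of the
generation counter. This is a deliberately simplified sketch, not the
write-behind translator's code: all model_* names are hypothetical, and
the rule "same generation means the requests may be reordered" is an
assumption taken from the description above.

```c
#include <stdbool.h>

/* Per-inode state: only the generation counter is modelled. */
struct model_inode {
    unsigned long gen;
};

struct model_req {
    unsigned long gen; /* generation assigned at enqueue time */
    bool fulfilled;    /* merged into another request, not yet on the brick */
};

/* A new request takes the inode's current generation. */
static void
model_enqueue(struct model_inode *inode, struct model_req *req)
{
    req->gen = inode->gen;
    req->fulfilled = false;
}

/* Unwind a request to the application. With `bump_always` false this
 * mimics the old behavior: fulfilled requests leave gen unchanged. */
static void
model_unwind(struct model_inode *inode, const struct model_req *req,
             bool bump_always)
{
    if (bump_always || !req->fulfilled)
        inode->gen++;
}

/* Assumed rule: a later request that does not have a newer generation is
 * treated as concurrent, so it is not forced to wait for the earlier one. */
static bool
model_may_reorder(const struct model_req *earlier,
                  const struct model_req *later)
{
    return later->gen <= earlier->gen;
}

/* Replays the W1/W2/W3 sequence; returns true if W3 may overtake W1. */
static bool
model_w3_can_overtake_w1(bool bump_always)
{
    struct model_inode inode = { .gen = 0 };
    struct model_req w1, w2, w3;

    model_enqueue(&inode, &w1);
    model_enqueue(&inode, &w2);
    w2.fulfilled = true;                    /* W2 is merged into W1 */
    model_unwind(&inode, &w2, bump_always); /* application gets W2's reply */
    model_enqueue(&inode, &w3);             /* so it can now send W3 */
    return model_may_reorder(&w1, &w3);
}
```

In this model, model_w3_can_overtake_w1(false) is true (W3 shares W1's
generation), while model_w3_can_overtake_w1(true) is false because the
unwind of the fulfilled W2 already moved the inode to a newer generation.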

Special thanks to Stefan Ring for writing a reproducer that has been
crucial in identifying the issue.

Change-Id: Id4ab0f294a09aca9a863ecaeef8856474662ab45
Signed-off-by: Xavi Hernandez 
Fixes: #884