# Release notes for Gluster 7.0

This is a major release that includes a range of code improvements and stability
fixes along with a few features as noted below.

A selection of the key features and changes is documented on this page.
A full list of bugs that have been addressed is included further below.

- [Announcements](#announcements)
- [Major changes and features](#major-changes-and-features)
- [Major issues](#major-issues)
- [Bugs addressed in the release](#bugs-addressed)

## Announcements

1. Releases that receive maintenance updates after the release of 7 are 6 and 7
([reference](https://www.gluster.org/release-schedule/)).

2. Release 7 will receive maintenance updates around the 10th of every month
for the first 3 months after release (i.e. Oct'19, Nov'19, Dec'19). After the
initial 3 months, it will receive maintenance updates every 2 months until EOL.



## Major changes and features

### Highlights

- Several stability fixes addressing:
  - issues reported by Coverity, clang-scan, address sanitizer and valgrind
  - removal of unused, and hence deprecated, code and features
- Performance improvements


### Features

#### 1. Rpcbind not required in glusterd.service when gnfs isn't built.

#### 2. Latency-based read child to improve read workload latency in a cluster, especially in a cloud setup. It also provides load balancing based on outstanding pending requests.

#### 3. Glusterfind: integrate with gfid2path to improve performance.

#### 4. Issue #532: Work towards implementing global thread pooling has started

#### 5. This release includes extra coverage for glfs public APIs in our regression tests, so we don't break anything.



## Major issues

**None**

## Note

Any new volumes created with the release will have the `fips-mode-rchecksum` volume option set to `on` by default.

If a client older than glusterfs-4.x (i.e. a 3.x client) accesses a volume that has the `fips-mode-rchecksum` volume option enabled, it can cause erroneous checksum computation or unwanted behaviour during afr self-heal. Enable this option only when all clients are also >=4.x. If you are using these older clients, explicitly turn this option `off`.
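
The check described in this note can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the full option key `storage.fips-mode-rchecksum` is taken from the title of bug #1717782 in the list below, the command follows the usual `gluster volume set <VOLNAME> <KEY> <VALUE>` form, and the helper names (`major_version`, `has_pre_4_clients`, `plan_rchecksum_change`) are hypothetical. How the list of client versions is collected is left to the operator.

```python
import re

# Full option key as it appears in the title of bug #1717782 (assumption:
# this is the key to pass to `gluster volume set`).
OPTION = "storage.fips-mode-rchecksum"


def major_version(version):
    """Return the major version from strings like '3.12.5' or 'glusterfs-6.0'."""
    match = re.match(r"\s*(?:glusterfs-)?(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return int(match.group(1))


def has_pre_4_clients(client_versions):
    """True if any client is older than glusterfs-4.x (i.e. a 3.x client)."""
    return any(major_version(v) < 4 for v in client_versions)


def plan_rchecksum_change(volume, client_versions):
    """Return the command (as an argument list) that turns the option off,
    or None when every client is >=4.x and the default can stay on."""
    if has_pre_4_clients(client_versions):
        return ["gluster", "volume", "set", volume, OPTION, "off"]
    return None


if __name__ == "__main__":
    # Hypothetical volume name and client versions, for illustration only.
    command = plan_rchecksum_change("myvol", ["3.12.5", "6.0"])
    print(" ".join(command) if command else "no change needed")
```

The sketch only builds the command rather than running it, so the result can be reviewed before it is applied to a live volume.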

## Bugs addressed

Bugs addressed since release-6 are listed below.

- [#789278](https://bugzilla.redhat.com/789278): Issues reported by Coverity static analysis tool
- [#1098991](https://bugzilla.redhat.com/1098991): Dist-geo-rep: Invalid slave url (::: three or more colons)  error out with unclear error message.
- [#1193929](https://bugzilla.redhat.com/1193929): GlusterFS can be improved
- [#1241494](https://bugzilla.redhat.com/1241494): [Backup]: Glusterfind CLI commands need to verify the accepted names for session/volume, before failing with error(s)
- [#1512093](https://bugzilla.redhat.com/1512093): Value of pending entry operations in detail status output is going up after each synchronization.
- [#1535511](https://bugzilla.redhat.com/1535511): Gluster CLI shouldn't stop if log file couldn't be opened
- [#1542072](https://bugzilla.redhat.com/1542072): Syntactical errors in hook scripts for managing SELinux context on bricks #2 (S10selinux-label-brick.sh + S10selinux-del-fcontext.sh)
- [#1573226](https://bugzilla.redhat.com/1573226): eventsapi: ABRT report for package glusterfs has reached 10 occurrences
- [#1580315](https://bugzilla.redhat.com/1580315): gluster volume status inode getting timed out after 30 minutes with no output/error
- [#1590385](https://bugzilla.redhat.com/1590385): Refactor dht lookup code
- [#1593224](https://bugzilla.redhat.com/1593224): [Disperse] : Client side heal is not removing dirty flag for some of the files.
- [#1596787](https://bugzilla.redhat.com/1596787): glusterfs rpc-clnt.c: error returned while attempting to connect to host: (null), port 0
- [#1622665](https://bugzilla.redhat.com/1622665): clang-scan report: glusterfs issues
- [#1624701](https://bugzilla.redhat.com/1624701): error-out {inode,entry}lk fops with all-zero lk-owner
- [#1628194](https://bugzilla.redhat.com/1628194): tests/dht: Additional tests for dht operations
- [#1633930](https://bugzilla.redhat.com/1633930): ASan (address sanitizer) fixes - Blanket bug
- [#1634664](https://bugzilla.redhat.com/1634664): Inconsistent quorum checks during open and fd based operations
- [#1635688](https://bugzilla.redhat.com/1635688): Keep only the valid (maintained/supported) components in the build
- [#1642168](https://bugzilla.redhat.com/1642168): changes to cloudsync xlator
- [#1642810](https://bugzilla.redhat.com/1642810): remove glupy from code and build
- [#1648169](https://bugzilla.redhat.com/1648169): Fuse mount would crash if features.encryption is on in the version from 3.13.0 to 4.1.5
- [#1648768](https://bugzilla.redhat.com/1648768): Tracker bug for all leases related issues
- [#1650095](https://bugzilla.redhat.com/1650095): Regression tests for geo-replication on EC volume is not available. It should be added.
- [#1651246](https://bugzilla.redhat.com/1651246): Failed to dispatch handler
- [#1651439](https://bugzilla.redhat.com/1651439): gluster-NFS crash while expanding volume
- [#1651445](https://bugzilla.redhat.com/1651445): [RFE] storage.reserve option should take size of disk as input instead of percentage
- [#1652887](https://bugzilla.redhat.com/1652887): Geo-rep help looks to have a typo.
- [#1654021](https://bugzilla.redhat.com/1654021): Gluster volume heal causes continuous info logging of "invalid argument"
- [#1654270](https://bugzilla.redhat.com/1654270): glusterd crashed with seg fault possibly during node reboot while volume creates and deletes were happening
- [#1659334](https://bugzilla.redhat.com/1659334): FUSE mount seems to be hung and not accessible
- [#1659708](https://bugzilla.redhat.com/1659708): Optimize by not stopping (restart) selfheal daemon (shd) when a volume is stopped unless it is the last volume
- [#1664934](https://bugzilla.redhat.com/1664934): glusterfs-fuse client not benefiting from page cache on read after write
- [#1670031](https://bugzilla.redhat.com/1670031): performance regression seen with smallfile workload tests
- [#1672480](https://bugzilla.redhat.com/1672480): Bugs Test Module tests failing on s390x
- [#1672711](https://bugzilla.redhat.com/1672711): Upgrade from glusterfs 3.12 to gluster 4/5 broken
- [#1672727](https://bugzilla.redhat.com/1672727): Fix timeouts so the tests pass on AWS
- [#1672851](https://bugzilla.redhat.com/1672851): With parallel-readdir enabled, deleting a directory containing stale linkto files fails with "Directory not empty"
- [#1674389](https://bugzilla.redhat.com/1674389): [thin arbiter] : rpm - add thin-arbiter package
- [#1674406](https://bugzilla.redhat.com/1674406): glusterfs FUSE client crashing every few days with 'Failed to dispatch handler'
- [#1674412](https://bugzilla.redhat.com/1674412): listing a file while writing to it causes deadlock
- [#1675076](https://bugzilla.redhat.com/1675076): [posix]: log the actual path wherever possible
- [#1676400](https://bugzilla.redhat.com/1676400): rm -rf fails with  "Directory not empty"
- [#1676430](https://bugzilla.redhat.com/1676430): distribute: Perf regression in mkdir path
- [#1676736](https://bugzilla.redhat.com/1676736): tests: ./tests/bugs/distribute/bug-1161311.t times out
- [#1676797](https://bugzilla.redhat.com/1676797): server xlator doesn't handle dict unserialization failures correctly
- [#1677559](https://bugzilla.redhat.com/1677559): gNFS crashed when processing "gluster v profile [vol] info nfs"
- [#1678726](https://bugzilla.redhat.com/1678726): Integer Overflow possible in md-cache.c due to data type inconsistency
- [#1679401](https://bugzilla.redhat.com/1679401): Geo-rep setup creates an incorrectly formatted authorized_keys file
- [#1679406](https://bugzilla.redhat.com/1679406): glustereventsd does not start on Ubuntu 16.04 LTS
- [#1680587](https://bugzilla.redhat.com/1680587): Building RPM packages with _for_fedora_koji_builds enabled fails on el6
- [#1683352](https://bugzilla.redhat.com/1683352): remove experimental xlators information from glusterd-volume-set.c
- [#1683594](https://bugzilla.redhat.com/1683594): nfs ltp ftest* fstat gets mismatch size as except after turn on md-cache
- [#1683816](https://bugzilla.redhat.com/1683816): Memory leak when peer detach fails
- [#1684385](https://bugzilla.redhat.com/1684385): [ovirt-gluster] Rolling gluster upgrade from 3.12.5 to 5.3 led to shard on-disk xattrs disappearing
- [#1684404](https://bugzilla.redhat.com/1684404): Multiple shd processes are running on brick_mux environment
- [#1685027](https://bugzilla.redhat.com/1685027): Error handling in /usr/sbin/gluster-eventsapi produces IndexError: tuple index out of range
- [#1685120](https://bugzilla.redhat.com/1685120): upgrade from 3.12, 4.1 and 5 to 6 broken
- [#1685414](https://bugzilla.redhat.com/1685414): glusterd memory usage grows at 98 MB/h while running "gluster v profile" in a loop
- [#1685944](https://bugzilla.redhat.com/1685944): WORM-XLator: Maybe integer overflow when computing new atime
- [#1686371](https://bugzilla.redhat.com/1686371): Cleanup nigel access and document it
- [#1686398](https://bugzilla.redhat.com/1686398): Thin-arbiter minor fixes
- [#1686568](https://bugzilla.redhat.com/1686568): [geo-rep]: Checksum mismatch when 2x2 vols are converted to arbiter
- [#1686711](https://bugzilla.redhat.com/1686711): [Thin-arbiter] : send correct error code in case of failure
- [#1687326](https://bugzilla.redhat.com/1687326): [RFE] Revoke access from nodes using Certificate Revoke List in SSL
- [#1687705](https://bugzilla.redhat.com/1687705): Brick process has coredumped, when starting glusterd
- [#1687811](https://bugzilla.redhat.com/1687811): core dump generated while running the test ./tests/00-geo-rep/georep-basic-dr-rsync-arbiter.t
- [#1688068](https://bugzilla.redhat.com/1688068): Proper error message needed for FUSE mount failure when /var is filled.
- [#1688106](https://bugzilla.redhat.com/1688106): Remove implementation of number of files opened in posix xlator
- [#1688116](https://bugzilla.redhat.com/1688116): Spurious failure in test ./tests/bugs/glusterfs/bug-844688.t
- [#1688287](https://bugzilla.redhat.com/1688287): ganesha crash on glusterfs with shard volume
- [#1689097](https://bugzilla.redhat.com/1689097): gfapi: provide an option for changing statedump path in glfs-api.
- [#1689799](https://bugzilla.redhat.com/1689799): [cluster/ec] : Fix handling of heal info cases without locks
- [#1689920](https://bugzilla.redhat.com/1689920): lots of "Matching lock not found for unlock xxx" when using disperse (ec) xlator
- [#1690753](https://bugzilla.redhat.com/1690753): Volume stop when quorum not met is successful
- [#1691164](https://bugzilla.redhat.com/1691164): glusterd leaking memory when issued gluster vol status all tasks continuously
- [#1691616](https://bugzilla.redhat.com/1691616): client log flooding with intentional socket shutdown message when a brick is down
- [#1692093](https://bugzilla.redhat.com/1692093): Network throughput usage increased x5
- [#1692612](https://bugzilla.redhat.com/1692612): Locking issue when restarting bricks
- [#1692666](https://bugzilla.redhat.com/1692666): ssh-port config set is failing
- [#1693575](https://bugzilla.redhat.com/1693575): gfapi: do not block epoll thread for upcall notifications
- [#1693648](https://bugzilla.redhat.com/1693648): Geo-rep: Geo replication failing in "cannot allocate memory"
- [#1693692](https://bugzilla.redhat.com/1693692): Increase code coverage from regression tests
- [#1694820](https://bugzilla.redhat.com/1694820): Geo-rep: Data inconsistency while syncing heavy renames with constant destination name
- [#1694925](https://bugzilla.redhat.com/1694925): GF_LOG_OCCASSIONALLY API doesn't log at first instance
- [#1695327](https://bugzilla.redhat.com/1695327): regression test fails with brick mux enabled.
- [#1696046](https://bugzilla.redhat.com/1696046): Log level changes do not take effect until the process is restarted
- [#1696077](https://bugzilla.redhat.com/1696077): Add pause and resume test case for geo-rep
- [#1696136](https://bugzilla.redhat.com/1696136): gluster fuse mount crashed, when deleting 2T image file from oVirt Manager UI
- [#1696512](https://bugzilla.redhat.com/1696512): glusterfs build is failing on rhel-6
- [#1696599](https://bugzilla.redhat.com/1696599): Fops hang when inodelk fails on the first fop
- [#1697316](https://bugzilla.redhat.com/1697316): Getting SEEK-2 and SEEK7 errors with [Invalid argument] in the bricks' logs
- [#1697486](https://bugzilla.redhat.com/1697486): bug-1650403.t && bug-858215.t are throwing error "No such file" at the time of access glustershd pidfile
- [#1697866](https://bugzilla.redhat.com/1697866): Provide a way to detach a failed node
- [#1697907](https://bugzilla.redhat.com/1697907): ctime feature breaks old client to connect to new server
- [#1697930](https://bugzilla.redhat.com/1697930): Thin-Arbiter SHD minor fixes
- [#1698078](https://bugzilla.redhat.com/1698078): ctime: Creation of tar file on gluster mount throws warning "file changed as we read it"
- [#1698449](https://bugzilla.redhat.com/1698449): thin-arbiter lock release fixes
- [#1699025](https://bugzilla.redhat.com/1699025): Brick is not able to detach successfully in brick_mux environment
- [#1699176](https://bugzilla.redhat.com/1699176): rebalance start command doesn't throw up error message if the command fails
- [#1699189](https://bugzilla.redhat.com/1699189): fix truncate lock to cover the write in truncate clean
- [#1699339](https://bugzilla.redhat.com/1699339): With 1800+ vol and simultaneous 2 gluster pod restarts, running gluster commands gives issues once all pods are up
- [#1699394](https://bugzilla.redhat.com/1699394): [geo-rep]: Geo-rep goes FAULTY with OSError
- [#1699866](https://bugzilla.redhat.com/1699866): I/O error on writes to a disperse volume when replace-brick is executed
- [#1700078](https://bugzilla.redhat.com/1700078): disable + reenable of bitrot leads to files marked as bad
- [#1700865](https://bugzilla.redhat.com/1700865): FUSE mount seems to be hung and not accessible
- [#1701337](https://bugzilla.redhat.com/1701337): issues with 'building' glusterfs packages if we do 'git clone --depth 1'
- [#1701457](https://bugzilla.redhat.com/1701457): ctime: Logs are flooded with "posix set mdata failed, No ctime" error during open
- [#1702131](https://bugzilla.redhat.com/1702131): The source file is left in EC volume after rename when glusterfsd out of service
- [#1702185](https://bugzilla.redhat.com/1702185): coredump reported by test ./tests/bugs/glusterd/bug-1699339.t
- [#1702299](https://bugzilla.redhat.com/1702299): Custom xattrs are not healed on newly added brick
- [#1702303](https://bugzilla.redhat.com/1702303): Enable fips-mode-rchecksum for new volumes by default
- [#1702952](https://bugzilla.redhat.com/1702952): remove tier related information from manual pages
- [#1703020](https://bugzilla.redhat.com/1703020): The cluster.heal-timeout option is unavailable for ec volume
- [#1703629](https://bugzilla.redhat.com/1703629): statedump is not capturing info related to glusterd
- [#1703948](https://bugzilla.redhat.com/1703948): Self-heal daemon resources are not cleaned properly after a ec fini
- [#1704252](https://bugzilla.redhat.com/1704252): Creation of bulkvoldict thread logic is not correct while brick_mux is enabled for single volume
- [#1704888](https://bugzilla.redhat.com/1704888): delete the snapshots and volume at the end of uss.t
- [#1705865](https://bugzilla.redhat.com/1705865): VM stuck in a shutdown because of a pending fuse request
- [#1705884](https://bugzilla.redhat.com/1705884): Image size as reported from the fuse mount is incorrect
- [#1706603](https://bugzilla.redhat.com/1706603): Glusterfsd crashing in ec-inode-write.c, in GF_ASSERT
- [#1707081](https://bugzilla.redhat.com/1707081): Self heal daemon not coming up after upgrade to glusterfs-6.0-2 (intermittently) on a brick mux setup
- [#1707700](https://bugzilla.redhat.com/1707700): maintain consistent values across for options when fetched at cluster level or volume level
- [#1707728](https://bugzilla.redhat.com/1707728): geo-rep: Sync hangs with tarssh as sync-engine
- [#1707742](https://bugzilla.redhat.com/1707742): tests/geo-rep: arequal checksum comparison always succeeds
- [#1707746](https://bugzilla.redhat.com/1707746): AFR-v2 does not log before attempting data self-heal
- [#1708051](https://bugzilla.redhat.com/1708051): Capture memory consumption for gluster process at the time of throwing no memory available message
- [#1708156](https://bugzilla.redhat.com/1708156): ec ignores lock contention notifications for partially acquired locks
- [#1708163](https://bugzilla.redhat.com/1708163): tests: fix bug-1319374.c compile warnings.
- [#1708926](https://bugzilla.redhat.com/1708926): Invalid memory access while executing cleanup_and_exit
- [#1708929](https://bugzilla.redhat.com/1708929): Add more test coverage for shd mux
- [#1709248](https://bugzilla.redhat.com/1709248): [geo-rep]: Non-root - Unable to set up mountbroker root directory and group
- [#1709653](https://bugzilla.redhat.com/1709653): geo-rep: With heavy rename workload geo-rep log is flooded
- [#1710054](https://bugzilla.redhat.com/1710054): Optimize the glustershd manager to send reconfigure
- [#1710159](https://bugzilla.redhat.com/1710159): glusterd: While upgrading (3-node cluster) 'gluster v status' times out on node to be upgraded
- [#1711240](https://bugzilla.redhat.com/1711240): [GNFS]  gf_nfs_mt_inode_ctx serious memory leak
- [#1711250](https://bugzilla.redhat.com/1711250): bulkvoldict thread is not handling all volumes while brick multiplex is enabled
- [#1711297](https://bugzilla.redhat.com/1711297): Optimize glusterd code to copy dictionary in handshake code path
- [#1711764](https://bugzilla.redhat.com/1711764): Files inaccessible if one rebalance process is killed in a multinode volume
- [#1711820](https://bugzilla.redhat.com/1711820): Typo in cli return string.
- [#1711827](https://bugzilla.redhat.com/1711827): test case bug-1399598-uss-with-ssl.t  is generating crash
- [#1712322](https://bugzilla.redhat.com/1712322): Brick logs  inundated with [2019-04-27 22:14:53.378047] I [dict.c:541:dict_get] (-->/usr/lib64/glusterfs/6.0/xlator/features/worm.so(+0x7241) [0x7fe857bb3241] -->/usr/lib64/glusterfs/6.0/xlator/features/locks.so(+0x1c219) [0x7fe857dda219] [Invalid argumen
- [#1712668](https://bugzilla.redhat.com/1712668): Remove-brick shows warning cluster.force-migration enabled  where as cluster.force-migration is disabled on the volume
- [#1712741](https://bugzilla.redhat.com/1712741): glusterd_svcs_stop should call individual wrapper function to stop rather than calling the glusterd_svc_stop
- [#1713730](https://bugzilla.redhat.com/1713730): Failure when glusterd is configured to bind specific IPv6 address. If bind-address is IPv6, *addr_len will be non-zero and it goes to ret = -1 branch, which will cause listen failure eventually
- [#1714098](https://bugzilla.redhat.com/1714098): Make debugging hung frames easier
- [#1714415](https://bugzilla.redhat.com/1714415): Script to make it easier to find hung frames
- [#1714973](https://bugzilla.redhat.com/1714973): upgrade after tier code removal results in peer rejection.
- [#1715921](https://bugzilla.redhat.com/1715921): uss.t tests times out with brick-mux regression
- [#1716695](https://bugzilla.redhat.com/1716695): Fix memory leaks that are present even after an xlator fini [client side xlator]
- [#1716766](https://bugzilla.redhat.com/1716766): [Thin-arbiter] TA process is not picking 24007 as port while starting up
- [#1716812](https://bugzilla.redhat.com/1716812): Failed to create volume which transport_type is "tcp,rdma"
- [#1716830](https://bugzilla.redhat.com/1716830): DHT: directory permissions are wiped out
- [#1717757](https://bugzilla.redhat.com/1717757): WORM: Segmentation Fault if bitrot stub do signature
- [#1717782](https://bugzilla.redhat.com/1717782): gluster v get <VolumeName> all still showing storage.fips-mode-rchecksum off
- [#1717819](https://bugzilla.redhat.com/1717819): Changes to self-heal logic w.r.t. detecting metadata split-brains
- [#1717953](https://bugzilla.redhat.com/1717953): SELinux context labels are missing for newly added bricks using add-brick command
- [#1718191](https://bugzilla.redhat.com/1718191): Regression: Intermittent test failure for quick-read-with-upcall.t
- [#1718273](https://bugzilla.redhat.com/1718273): markdown formatting errors in files present under /doc directory of the project
- [#1718316](https://bugzilla.redhat.com/1718316): Ganesha-gfapi logs are flooded with error messages related to "gf_uuid_is_null(gfid)) [Invalid argument]" when lookups are running from multiple clients
- [#1718338](https://bugzilla.redhat.com/1718338): Upcall: Avoid sending upcalls for invalid Inode
- [#1718848](https://bugzilla.redhat.com/1718848): False positive logging of mount failure
- [#1718998](https://bugzilla.redhat.com/1718998): Fix test case "tests/basic/afr/split-brain-favorite-child-policy.t" failure
- [#1720201](https://bugzilla.redhat.com/1720201): Healing not proceeding during in-service upgrade on a disperse volume
- [#1720290](https://bugzilla.redhat.com/1720290): ctime changes:  tar still complains file changed as we read it if uss is enabled
- [#1720615](https://bugzilla.redhat.com/1720615): [RHEL-8.1] yum update fails for rhel-8 glusterfs client packages 6.0-5.el8
- [#1720993](https://bugzilla.redhat.com/1720993): tests/features/subdir-mount.t is failing for brick_mux regression
- [#1721385](https://bugzilla.redhat.com/1721385): glusterfs-libs: usage of inet_addr() may impact IPv6
- [#1721435](https://bugzilla.redhat.com/1721435): DHT: Internal xattrs visible on the mount
- [#1721441](https://bugzilla.redhat.com/1721441): geo-rep: Fix permissions for GEOREP_DIR in non-root setup
- [#1721601](https://bugzilla.redhat.com/1721601): [SHD] : logs of one volume are going to log file of other volume
- [#1722541](https://bugzilla.redhat.com/1722541): stale shd process files leading to heal timing out and heal daemon not coming up for all volumes