summaryrefslogtreecommitdiffstats
path: root/xlators
Commit message (Collapse)AuthorAgeFilesLines
* cluster/afr: When failing fop due to lack of quorum, also log error stringKrutika Dhananjay2016-11-111-11/+12
| | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/15800 Change-Id: I1e73b4518bcf26196d6326065ad404f878e70bd4 BUG: 1393631 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15814 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* posix-acl: check dictionary before using itRajesh Joseph2016-11-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If extended attributes are not present in md-cache it returns NULL as xattr. posix acl xlator should check for NULL before using xattr. If normal and default ACLs are not set on file then md-cache will not contain system.posix_acl_access and system.posix_acl_default extended attributes in its cache. Therefore posix_acl_lookup_cbk should check xattr before using it, otherwise the logs will get filled with dictionary errors. > Reviewed-on: http://review.gluster.org/15769 > Reviewed-by: Raghavendra Talur <rtalur@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Vijay Bellur <vbellur@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit de7fe24663713fff364cfc2b52b675e3e979ee68) Change-Id: Icebf73cf0b313bd3e82ca8cbda63786dd0fa47da BUG: 1392867 Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-on: http://review.gluster.org/15798 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* features/shard: Fill loc.pargfid too for named lookups on individual shardsKrutika Dhananjay2016-11-082-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/15788/ On a sharded volume when a brick is replaced while IO is going on, named lookup on individual shards as part of read/write was failing with ENOENT on the replaced brick, and as a result AFR initiated name heal in lookup callback. But since pargfid was empty (which is what this patch attempts to fix), the resolution of the shards by protocol/server used to fail and the following pattern of logs was seen: Brick-logs: [2016-11-08 07:41:49.387127] W [MSGID: 115009] [server-resolve.c:566:server_resolve] 0-rep-server: no resolution type for (null) (LOOKUP) [2016-11-08 07:41:49.387157] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null) (00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82) ==> (Invalid argument) [Invalid argument] Client-logs: [2016-11-08 07:41:27.497687] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.497755] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.498500] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.499680] E [MSGID: 133010] Also, this patch makes AFR by itself choose a non-NULL pargfid even if its ancestors fail to initialize all pargfid placeholders. Change-Id: I34b9f90d0f09766b6d87b3994d5cd7a77b622dcb BUG: 1392853 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15797 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* glusterd/quota: upgrade quota.conf file during an upgradeManikandan Selvaganesh2016-11-084-16/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem ======= When quota is enabled on 3.6, it will have quota conf version in quota.conf as v1.1. This node gets upgraded to 3.7 but it will still have quota conf version as v1.1 until a quota enable/disable/set limit is initiated. When this is not initiated and when this node tries to peer probe a node which is a fresh install of 3.7 (which will have quota conf version as v1.2), then this will result in "Peer rejected" state. This patch fixes the issue. Solution ======== When an upgrade happens from 3.6 to 3.7, quota.conf file needs to be modified as well. With 3.6, in quota.conf the version will be v1.1 and it needs to be changed to v1.2 from 3.7. This is because in 3.7, inode quota feature is introduced. So when an op-version bumpup happens quota.conf needs to be upgraded with quota conf version v1.2 and all the 16 byte uuid needs to be changed to 17 bytes uuid as well. Previously, when the cluster version is upgraded to 3.7, the quota.conf got upgraded as well. But, the upgradation was done only when quota enable/disable/set limit is done. With this patch, the upgradation is done during a cluster op version bump up as well. >Reviewed-on: http://review.gluster.org/15352 >Tested-by: Atin Mukherjee <amukherj@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Change-Id: Idb5ba29d3e1ea0e45c85d87c952c75da9e0f99f0 BUG: 1392715 Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com> Reviewed-on: http://review.gluster.org/15790 Tested-by: sanoj-unnikrishnan <sunnikri@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Manikandan Selvaganesh <manikandancs333@gmail.com>
* md-cache: Invalidate cache entry for open() with O_TRUNCSoumya Koduri2016-11-041-0/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a file is opened with O_TRUNC flag set, its size gets set to '0'. This case needs to be handled in md-cache to avoid sending incorrect cached stat. This is backport of below mainline patch - http://review.gluster.org/#/c/15618/ > Change-Id: I95d1f8a6634734898883ede010c3e7b0b7eb97d9 > BUG: 1382266 > Signed-off-by: Soumya Koduri <skoduri@redhat.com> > Reviewed-on: http://review.gluster.org/15618 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> > Tested-by: jiffin tony Thottan <jthottan@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > (cherry picked from commit 6ca5d6382f03685b31b12accb095093cf1486603) Change-Id: I00d2b0a6dbafa690235bbc7cd714ec7cc8a13808 BUG: 1391451 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/15772 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* performance/write-behind: fix flush stuck by former failed writesRyan Ding2016-11-031-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the issue is happened in this case: assume a file is opened with fd1 and fd2. 1. some WRITE opto fd1 got error, they were add back to 'todo' queue because of those error. 2. fd2 closed, a FLUSH op is send to write-behind. 3. FLUSH can not be unwind because it's not a legal waiter for those failed write(as func __wb_request_waiting_on() say). and those failed WRITE also can not be ended if fd1 is not closed. fd2 stuck in close syscall. to resolve this issue, we can change the way we determine 2 requests is 'conflict': flush/fsync is not conflict with those write that is not belonged to them. so __wb_pick_winds() can wind the FLUSH op. below is some information when the stuck issue happen: glusterdump logs: [xlator.performance.write-behind.wb_inode] path=/ltp-F9eG0ZSOME/rw-buffered-16436 inode=0x7fdbe8039b9c window_conf=1048576 window_current=249856 transit-size=0 dontsync=0 [.WRITE] request-ptr=0x7fdbe8020200 refcount=1 wound=no generation-number=4 req->op_ret=-1 req->op_errno=116 sync-attempts=3 sync-in-progress=no size=131072 offset=1220608 lied=-1 append=0 fulfilled=0 go=0 [.WRITE] request-ptr=0x7fdbe8068c30 refcount=1 wound=no generation-number=5 req->op_ret=-1 req->op_errno=116 sync-attempts=2 sync-in-progress=no size=118784 offset=1351680 lied=-1 append=0 fulfilled=0 go=0 [.FLUSH] request-ptr=0x7fdbe8021cd0 refcount=1 wound=no generation-number=6 req->op_ret=0 req->op_errno=0 sync-attempts=0 gdb detail about above 3 requests: (gdb) print *((wb_request_t *)0x7fdbe8021cd0) $2 = {all = {next = 0x7fdbe803a608, prev = 0x7fdbe8068c30}, todo = {next = 0x7fdbe803a618, prev = 0x7fdbe8068c40}, lie = {next = 0x7fdbe8021cf0, prev = 0x7fdbe8021cf0}, winds = {next = 0x7fdbe8021d00, prev = 0x7fdbe8021d00}, unwinds = {next = 0x7fdbe8021d10, prev = 0x7fdbe8021d10}, wip = { next = 0x7fdbe8021d20, prev = 0x7fdbe8021d20}, stub = 0x7fdbe80224dc, write_size = 0, orig_size = 0, total_size = 0, op_ret = 0, op_errno = 0, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_FLUSH, lk_owner = {len = 8, data = "W\322T\f\271\367y$", '\000' <repeats 1015 times>}, iobref = 0x0, gen = 6, fd = 0x7fdbe800f0dc, wind_count = 0, ordering = {size = 0, off = 0, append = 0, tempted = 0, lied = 0, fulfilled = 0, go = 0}} (gdb) print *((wb_request_t *)0x7fdbe8020200) $3 = {all = {next = 0x7fdbe8068c30, prev = 0x7fdbe803a608}, todo = {next = 0x7fdbe8068c40, prev = 0x7fdbe803a618}, lie = {next = 0x7fdbe8068c50, prev = 0x7fdbe803a628}, winds = {next = 0x7fdbe8020230, prev = 0x7fdbe8020230}, unwinds = {next = 0x7fdbe8020240, prev = 0x7fdbe8020240}, wip = { next = 0x7fdbe8020250, prev = 0x7fdbe8020250}, stub = 0x7fdbe8062c3c, write_size = 131072, orig_size = 4096, total_size = 0, op_ret = -1, op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>}, iobref = 0x7fdbe80311a0, gen = 4, fd = 0x7fdbe805c89c, wind_count = 3, ordering = {size = 131072, off = 1220608, append = 0, tempted = -1, lied = -1, fulfilled = 0, go = 0}} (gdb) print *((wb_request_t *)0x7fdbe8068c30) $4 = {all = {next = 0x7fdbe8021cd0, prev = 0x7fdbe8020200}, todo = {next = 0x7fdbe8021ce0, prev = 0x7fdbe8020210}, lie = {next = 0x7fdbe803a628, prev = 0x7fdbe8020220}, winds = {next = 0x7fdbe8068c60, prev = 0x7fdbe8068c60}, unwinds = {next = 0x7fdbe8068c70, prev = 0x7fdbe8068c70}, wip = { next = 0x7fdbe8068c80, prev = 0x7fdbe8068c80}, stub = 0x7fdbe806746c, write_size = 118784, orig_size = 4096, total_size = 0, op_ret = -1, op_errno = 116, refcount = 1, wb_inode = 0x7fdbe803a5f0, fop = GF_FOP_WRITE, lk_owner = {len = 8, data = '\000' <repeats 1023 times>}, iobref = 0x7fdbe8052b10, gen = 5, fd = 0x7fdbe805c89c, wind_count = 2, ordering = {size = 118784, off = 1351680, append = 0, tempted = -1, lied = -1, fulfilled = 0, go = 0}} you can see they are all on 'todo' queue, and FLUSH op fd is not the same WRITE op fd. > Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab > BUG: 1372211 > Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> Change-Id: Id687f9cd3b9f281e1a97c83f1ce981ede272b8ab BUG: 1390840 Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> Reviewed-on: http://review.gluster.org/15763 Tested-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* snapshot/cli: Fix snapshot status xml outputAvra Sengupta2016-11-011-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/14018/ snap status --xml errors out if a brick is down and doesn't have pid. It is handled in the cli of the snap status where "N/A" is displayed in such a scenario. Handled the same in xml snap status <snapname> --xml fails as the writer is not initialised for the same. Using GF_SNAP_STATUS_TYPE_ITER instead of GF_SNAP_STATUS_TYPE_SNAP for all snap's status to differentiate between the two scenarios. Added testcase volume-snapshot-xml.t to check all snapshot commands xml outputs > Reviewed-on: http://review.gluster.org/14018 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Rajesh Joseph <rjoseph@redhat.com> Change-Id: I99563e8f3e84f1aaeabd865326bb825c44f5c745 BUG: 1369363 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15290 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* afr,ec: Heal device files with correct major, minor numbersPranith Kumar K2016-10-262-2/+4
| | | | | | | | | | | | | | | | | | | | | | | Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the fix. >BUG: 1384297 >Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15728 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Change-Id: I28636a741592335cebcaa1abc2af8460ebc740e1 BUG: 1388949 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15736 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
* system/posix-acl: Unwind with NULL xdata on errorPranith Kumar K2016-10-201-17/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In posix-acl when there are errors xdata that comes as part of input is used to unwind which can be used as response xdata which may lead to problems as the keys in the input will match with keys in the output but the values the response xdata may expect can be completely different. For example, we see that dht sends DHT_IATT_IN_XDATA_KEY in setxattr which will be unwound with the same key in the xdata-response which dht thinks is valid response and fills stbuf with invalid values leading to EIO > BUG: 1374093 > Change-Id: I6b77a1fa1ee99cb62e181e1db2e6fea73f6eaaa3 > Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> > Reviewed-on: http://review.gluster.org/15421 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> > Reviewed-by: Niels de Vos <ndevos@redhat.com> (cherry picked from commit c9271ff14d3efa8279cf67907548b3f43970d4fb) Change-Id: I6b77a1fa1ee99cb62e181e1db2e6fea73f6eaaa3 BUG: 1374641 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/15475 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* performance/write-behind: remove the request from liability queue inRaghavendra G2016-10-171-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | wb_fulfill_request Before this patch, a request is removed from liability queue only when ref count of request hits 0. Though, wb_fulfill_request does an unref, it need not be the last unref and hence the request may survive in liability queue till the last unref. Let, T1: the time at which wb_fulfill_request is invoked T2: the time at which last unref is done on request Let's consider a case of T2 > T1. In the time window between T1 and T2, any other request (waiter) conflicting with request in liability queue (blocker - basically a write which has been lied) is blocked from winding. If T2 happens to be when wb_do_unwinds is invoked, no further processing of request list happens and "waiter" would get blocked forever. An example imaginary sequence of events is given below: 1. A write request w1 is picked up for unwinding in __wb_pick_unwinds (but unwind is not done _yet_ and hence reference remains). However, w1 is moved to liability queue. Let's call this invocation of wb_process_queue by wb_writev as PQ1. 2. A flush (f1) request hits write behind. Since the liability queue of inode is not empty, f1 is not picked for unwinding. Let's call the invocation of wb_process_queue by wb_flush as PQ2. 3. PQ2 continues and picks w1 for fulfilling and invokes wb_fulfill. As part of successful wb_fulfill_cbk, wb_fulfill_request (w1) is invoked. But, w1 is not freed (and hence not removed from liability queue) as w1 is not unwound _yet_ and a ref remains (PQ1 has not invoked wb_do_unwinds _yet_). 4. wb_fulfill_cbk (triggered by PQ2) invokes a wb_process_queue (let's say PQ3). f1 is not resumed in PQ3 as w1 is still in liability queue. At this time, PQ2 and PQ3 are complete. 5. PQ1 continues, unwinds w1 and does last unref on w1 and w1 is freed (and removed from liability queue). Since PQ1 didn't invoke wb_fulfill on any other write requests, there won't be any future codepaths that would invoke wb_process_queue and f1 is stuck forever. With this fix, w1 is removed from liability queue in step 3 above and PQ3 resumes f1 in step 4 (as there are no requests conflicting with f1 in liability queue during execution of PQ3). > Signed-off-by: Raghavendra G <rgowdapp@redhat.com> > BUG: 1379655 > Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb > Reviewed-on: http://review.gluster.org/15579 > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> (cherry picked from commit a8b2a981881221925bb5edfe7bb65b25ad855c04) Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1385622 Change-Id: Idacda1fcd520ac27f30224f8dfe8360dba6ac6cb Reviewed-on: http://review.gluster.org/15657 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* afr: Take full locks in arbiter only for data transactionsRavishankar N2016-10-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Sharding exposed a bug in arbiter config. where `dd` throughput was extremely slow. Shard xlator was sending a fxattrop to update the file size immediately after a writev. Arbiter was incorrectly over-riding the LLONGMAX-1 start offset (for metadata domain locks) for this fxattrop, causing the inodelk to be taken on the data domain. And since the preceeding writev hadn't released the lock (afr does a 'lazy' unlock if write succeeds on all bricks), this degraded to a blocking lock causing extra lock/unlock calls and delays. Fix: Modify flock.l_len and flock.l_start to take full locks only for data transactions. > Reviewed-on: http://review.gluster.org/15641 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> (cherry picked from commit 3a97486d7f9d0db51abcb13dcd3bc9db935e3a60) Change-Id: I906895da2f2d16813607e6c906cb4defb21d7c3b BUG: 1385226 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reported-by: Max Raba <max.raba@comsysto.com> Reviewed-on: http://review.gluster.org/15649 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* Upcall/cache-invalidation: Use parent stbuf while updating parent entrySoumya Koduri2016-10-032-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For *create* fops (CREATE, MKDIR, MKNOD), we invalidate the parent entry. Hence send parent attributes in the stat field. Also "UP_PARENT_DENTRY_FLAGS" has to be set only for the fops which shall result in two invalidations requests - one for the inode on which fop is being performed and another on parent entry. In case of CREATE/MKDIR/MKNOD fops, there shall be only one invalidation request sent, that too on parent inode. We send invalidation directly on parent inode's gfid. So there is no necessity to set these flags which when set shall endup invalidating the parent's parent entry. Cherry picked from commit f4282bd927e2e0d826d62cf1192102382c5697b2: > Change-Id: I7514ee08382081e3e060818ede497dbca26987dc > BUG: 1291259 > Signed-off-by: Soumya Koduri <skoduri@redhat.com> > Reviewed-on: http://review.gluster.org/12962 > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Niels de Vos <ndevos@redhat.com> Change-Id: I7514ee08382081e3e060818ede497dbca26987dc BUG: 1347715 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/15600 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaushal M <kaushal@redhat.com>
* performance/open-behind: Pass O_DIRECT flags for anon fd reads when requiredKrutika Dhananjay2016-09-232-39/+28
| | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/15537 cherry-picked from a412a4f50d8ca2ae68dbfa93b80757889150ce99 Writes are already passing the correct flags at the time of open(). Also, make io-cache honor direct-io for anon-fds with O_DIRECT flag during reads. Change-Id: I221d6e8e7431931a0c1fc93f1a886a62ab58d0ca BUG: 1378695 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15551 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Use locks for opendirPranith Kumar K2016-09-151-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In some cases we see that readdir keeps winding to the brick that doesn't have any blocked locks i.e. first brick. This is leading to the client assuming that there are no blocking locks on the inode so it won't give away the lock. Other clients end up blocked on the lock as if the command hung. Fix: Proper way to fix this issue is to use infra present in http://review.gluster.org/14736 This is a stop gap fix where we start taking inodelks in opendir which goes to all the bricks, this will detect if there is any contention. cherry picked from commit f013335400d033a9677797377b90b968803135f4: >BUG: 1346719 >Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/15309 >Smoke: Gluster Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Ashish Pandey <aspandey@redhat.com> Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9 BUG: 1373392 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15406 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* Tier: failing detach commit on detach failure and in-progresshari gowtham2016-09-131-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | back-port of: http://review.gluster.org/#/c/15438/ PROBLEM: if detach status has failed or if it remains in progress we allow detach commit to happen. only detach force should be allowed. FIX: check the detach status for failure or inprogress and disallow with the apt error message. >Change-Id: Ib97d540fec67717bb55c18d133187c665cf69ef1 >BUG: 1374584 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/15438 Smoke: Gluster Build System <jenkins@build.gluster.org> Tested-by: hari gowtham <hari.gowtham005@gmail.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com> Change-Id: I932b1074de277361fe7c3fe247d799f772cf4658 BUG: 1375474 Signed-off-by: hari gowtham <hgowtham@redhat.com> Reviewed-on: http://review.gluster.org/15491 Tested-by: hari gowtham <hari.gowtham005@gmail.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* dht: "replica.split-brain-status" attribute value is not correctMohit Agrawal2016-09-122-12/+210
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: In a distributed-replicate volume attribute "replica.split-brain-status" value does not display split-brain condition though directory is in split-brain. If directory is in split brain on mutiple replica-pairs it does not show full list of replica pairs. Solution: Update the dht_aggregate code to aggregate the xattr value in this specific condition. Fix: 1) function getChoices returns the choices from split-brain status string. 2) function add_opt adding the choices to local buffer to store in dictionary 3) For the key "replica.split-brain-status" function dht_aggregate call dht_aggregate_split_brain_xattr to prepare the list. Test: To verify the patch followed below steps 1) Create a distributed replica volume and create mount point 2) Stop heal daemon 3) Touch file and directories on mount point mkdir test{1..5};touch tmp{1..5} 4) Down brick process on one of the replica set pkill -9 glusterfsd 5) Change permission of dir on mount point chmod 755 test{1..5} 6) Restart brick process on node with force option 7) kill brick process on other node in same replica set 8) Change permission of dir again on mount point chmod 766 test{1..5} 9) Reexecute same step from 4-9 on other replica set also 10) After check heal status on server it will show dir's are in split brain on all replica sets 11) After check the replica.split-brain-status attr on mount point it will show wrong status of split brain. 12) After apply the patch the attribute shows correct value. > Change-Id: Icdfd72005a4aa82337c342762775a3d1761bbe4a > Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> > Reviewed-on: http://review.gluster.org/15201 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > (cherry picked from commit c4e9ec653c946002ab6d4c71ee8e6df056438a04) Change-Id: Ia5234e8a2291a7e8a7211c82368f4df1c99fa099 Backport of commit c4e9ec653c946002ab6d4c71ee8e6df056438a04 BUG: 1375099 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/15466 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* gluster: Fixed typosN Balachandran2016-09-081-1/+1
| | | | | | | | | | | | | | | | | | This is not a backport from master as one of these typos has already been fixed in master. Posting this on behalf of Patrick M. credit: pmatthaei@debian.org Change-Id: I15dca6fc2c7df2fcb84db8f01c8585eeff26a114 BUG: 1223935 Original-author: Patrick Matthäi <pmatthaei@debian.org> Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15122 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* build: correctly format some (s)size_t messagesNiels de Vos2016-09-074-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 32-bit builds the are are warnings like these: posix.c:6438: warning: format '%ld' expects type 'long int', but argument 11 has type 'ssize_t' Instead of using "%l" for (signed) size_t variables, "%z" should be used. Cherry picked from commit 3af889f02722f4636d2ea30570de6477e8b5a3a9: > BUG: 1198849 > Change-Id: I6f57b5e8ea174dd9e3056aff5da685e497894ccf > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: http://review.gluster.org/14933 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> This patch is not really recommended for backporting, but we do have a new smoke test that fails when these warnings pop-up. It is cleaner to correct the code in the release-3.7 branch then to modify the smoke test to skip this branch. Change-Id: I6f57b5e8ea174dd9e3056aff5da685e497894ccf BUG: 1225842 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/15401 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org>
* features/upcall: segment fault while join thread reaper_thr in fini()Niels de Vos2016-09-052-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | reaper_thr thread may not be started according to option 'cache-invalidation', if it's not started, join it in fini will cause a segment fault. Cherry picked from commit 7f0042dce94edb58c92662d9e4f852ba006d12dc: > Change-Id: I1c145a5feb137767880a08e79f810537283fb6b9 > BUG: 1369524 > Signed-off-by: Ryan Ding <ryan.ding@open-fs.com> > [ndevos: check .reaper_init_done and make it a boolean] > Reviewed-on: http://review.gluster.org/15298 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: soumya k <skoduri@redhat.com> > Reviewed-by: Niels de Vos <ndevos@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Change-Id: I1c145a5feb137767880a08e79f810537283fb6b9 BUG: 1371196 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/15337 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ryan Ding <ryan.ding@open-fs.com> Reviewed-by: soumya k <skoduri@redhat.com>
* features/locks: fix fdctx leak in locks xlatorsyanping.gao2016-08-261-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Locks xlators is leaking fdctx in pl_release when inode_ctx_get return non-zero Fix: This patch fixes fdctx leak in pl_release path > Reviewed-on: http://review.gluster.org/15302 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Change-Id: Icd5c5c681b7d890e7971b3b06d4258a51d45097d BUG: 1370388 Signed-off-by: Yanping.gao <yanping.gao@xtaotech.com> Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-on: http://review.gluster.org/15322 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* nfs: allow hostnames with dashes in exports/netgroups filesNiels de Vos2016-08-255-3/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hostnames with dashes (like "vagrant-testVM") are not correctly parsed when reading the exports/netgroups files. This bacomes obvious when running ./run-tests-in-vagrant.sh because it causes tests/basic/mount-nfs-auth.t and tests/basic/netgroup_parsing.t to fail. The regex for hostname (in exports) and the entry and hostname (netgroups) parsing does not include the "-" sign, and hence the hostnames are splitted at it. Cherry picked from commit e5221d288e41d29d89d52f8deab657d2285a852c: > BUG: 1350237 > Change-Id: I38146a283561e1fa386cc841c43fd3b1e30a87ad > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: http://review.gluster.org/14809 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: I38146a283561e1fa386cc841c43fd3b1e30a87ad BUG: 1357835 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/14956 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
* glusterd/geo-rep: Add relative path validation to copy file commandAravinda VK2016-08-251-0/+34
| | | | | | | | | | | | | | | Added validation for input file, command fails if input file path is relative path pointing outside of GLUSTERD_WORKDIR. BUG: 1350784 Change-Id: I329d43ebed69bfe9fe03d6be70dc8c78a605ffc5 Signed-off-by: Aravinda VK <avishwan@redhat.com> Reviewed-on: http://review.gluster.org/14772 Reviewed-on: http://review.gluster.org/14948 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kotresh HR <khiremat@redhat.com>
* protocol/client: Unserialize xdata even if lookup failsAnuradha Talur2016-08-241-9/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: AFR relies on xdata returned by lookup to determine if there are any files that need healing. This info is further used to optimize readdirp. In case of lookups with negative return value, client xlator was sending NULL xdata. Due to absence of xdata, AFR conservatively assumes that there are files that need healing, which is incorrect. Solution: Even in case of unsuccessful lookups, send the xdata received by protocol client so that higher xlators can get the info that they rely on. >Change-Id: Id3a1023eb536180888eb2c0b39050000b76f7226 >BUG: 1366284 >Signed-off-by: Anuradha Talur <atalur@redhat.com> >Reviewed-on: http://review.gluster.org/15120 >Smoke: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Poornima G <pgurusid@redhat.com> >Tested-by: Poornima G <pgurusid@redhat.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Ashish Pandey <aspandey@redhat.com> >Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> >Signed-off-by: Anuradha Talur <atalur@redhat.com> Change-Id: Ia22bcb200d599b78677e429d25877c78f7d27612 BUG: 1369211 Signed-off-by: Anuradha Talur <atalur@redhat.com> Reviewed-on: http://review.gluster.org/15259 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* arbiter: Fix memleak in arbiter_inode ctxRavishankar N2016-08-243-27/+13
| | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/15289/ Problem: The iattbuf ptr stored in arbiter's inode context was not freed during inode forget. Fix: Change it to a statically allocated value so that we don't have to deal with allocating/freeing it. Change-Id: Id1b73b8aee1fb5c4174d0734bd20e168432b1abd BUG: 1369752 Reported-by: Benjamin Edgar <benedgar8@gmail.com> Signed-off-by: Ravishankar N <ravishankar@redhat.com> (cherry picked from commit 4aa52061a51b97c4f865b402f977b3b43f5471a7) Reviewed-on: http://review.gluster.org/15307 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr: Prevent split-brain when bricks are brought off and on in ↵Krutika Dhananjay2016-08-229-51/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cyclic order Backport of: http://review.gluster.org/15080 When the bricks are brought offline and then online in cyclic order while writes are in progress on a file, thanks to inode refresh in write txns, AFR will mostly fail the write attempt when the only good copy is offline. However, there is still a remote possibility that the file will run into split-brain if the brick that has the lone good copy goes offline *after* the inode refresh but *before* the write txn completes (I call it in-flight split-brain in the patch for ease of reference), requiring intervention from admin to resolve the split-brain before the IO can resume normally on the file. To get around this, the patch does the following things: i) retains the dirty xattrs on the file ii) avoids marking the last of the good copies as bad (or accused) in case it is the one to go down during the course of a write. iii) fails that particular write with the appropriate errno. This way, we still have one good copy left despite the split-brain situation which when it is back online, will be chosen as source to do the heal. Change-Id: I7c13c6ddd5b8fe88b0f2684e8ce5f4a9c3a24a08 BUG: 1367270 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15222 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* snapshot/snapd: Don't display pid when snapd is offlineAvra Sengupta2016-08-221-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/14981/ We were previously reading the pidfile, and displaying the pid even if snapd daemon is not running. Now to fix it, we re-assign pid value to -1, if snapd is offline. > Reviewed-on: http://review.gluster.org/14981 > Tested-by: Vijay Bellur <vbellur@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> (cherry picked from commit ec6925a379c7bee071df1638bc2751b266cee346) Change-Id: I4baff8d489fe9380061c52aea006db90fa421cd7 BUG: 1360979 Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/15032 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* glusterd/geo-rep: Handle empty monitor.status during upgradeSaravanakumar Arumugam2016-08-192-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Consider geo-replication is in Stopped state. Following which, glusterfs is upgraded (where monitor.status is the new status file). Now, When geo-replication status command is run, empty monitor status file gets created. Now, if glusterd is restarted, it reads empty monitor status and starts geo-replication session. This is incorrect as session was in Stopped state earlier. Solution: If monitor status is empty, error out and avoid starting geo-replication session. Note: if monitor status is empty, geo-rep session is displayed as Stopped state. Change-Id: Ifb3db896e5ed92b927764cf1163503765cb08bb4 BUG: 1368055 Signed-off-by: Saravanakumar Arumugam <sarumuga@redhat.com> > Reviewed-on: http://review.gluster.org/14830 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> (cherry picked from commit f938b3a26ffab9482d5f910ee76d2bb2b370517f) Reviewed-on: http://review.gluster.org/15197 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com>
* features/libgfchangelog: Log failure in gf_histroy_changelogKotresh HR2016-08-181-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | Add error logs if gf_history_changelog fails. If requested changelog range is not available, log the error and exit instead of continuing the loop and exiting in readdir without logging. Also fixed the duplicate MSGID number in 'changelog-lib-messages.h' > Change-Id: Icd71b89ae23b48a71380657ba5649029c32fabfd > BUG: 1362151 > Signed-off-by: Kotresh HR <khiremat@redhat.com> > Reviewed-on: http://review.gluster.org/15064 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Aravinda VK <avishwan@redhat.com> Change-Id: Icd71b89ae23b48a71380657ba5649029c32fabfd BUG: 1365877 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/15139 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Aravinda VK <avishwan@redhat.com>
* glusterd: Convert volume to replica after adding brick self heal is not ↵Mohit Agrawal2016-08-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | triggered Problem: After add brick to a distribute volume to convert to replica is not triggering self heal. Solution: Modify the condition in brick_graph_add_index to set trusted.afr.dirty attribute in xlator. Test : To verify the patch followd below steps 1) Create a single node volume gluster volume create <DIS> <IP:/dist1/brick1> 2) Start volume and create mount point mount -t glusterfs <IP>:/DIS /mnt 3) Touch some file and write some data on file 4) Add another brick along with replica 2 gluster volume add-brick DIS replica 2 <IP>:/dist2/brick2 5) Before apply the patch file size is 0 bytes in mount point. Backport of commit 87bb8d0400d4ed18dd3954b1d9e5ca6ee0fb9742 BUG: 1366444 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> > Change-Id: Ief0ccbf98ea21b53d0e27edef177db6cabb3397f > Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> > Reviewed-on: http://review.gluster.org/15118 > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Ravishankar N <ravishankar@redhat.com> > Reviewed-by: Anuradha Talur <atalur@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > (cherry picked from commit 87bb8d0400d4ed18dd3954b1d9e5ca6ee0fb9742) Change-Id: I9c21ba4d7b1a2d7c5c79a6bb86cc05831b0cd120 Reviewed-on: http://review.gluster.org/15152 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Ravishankar N <ravishankar@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Tested-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/afr: copy loc before passing to syncopPranith Kumar K2016-08-171-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: When io-threads is enabled on the client side, io-threads destroys the call-stub in which the loc is stored as soon as the c-stack unwinds. Because afr is creating a syncop with the address of loc passed in setxattr by the time syncop tries to access it, io-threads would have already freed the call-stub. This will lead to crash. Fix: Copy loc to frame->local and use it's address. > Reviewed-on: http://review.gluster.org/15070 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Ravishankar N <ravishankar@redhat.com> BUG: 1367305 Change-Id: I16987e491e24b0b4e3d868a6968e802e47c77f7a Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-on: http://review.gluster.org/15168 Reviewed-by: Ravishankar N <ravishankar@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cluster/afr: Bug fixes in txn codepathKrutika Dhananjay2016-08-171-2/+2
| | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/15145 AFR sets transaction.pre_op[] array even before actually doing the pre-op on-disk. Therefore, AFR must not only consider the pre_op[] array but also the failed_subvols[] information before setting the pre_op_done[] flag. This patch fixes that. Change-Id: I8163256a6de254be43a7a526c6d2f9dc30e0e1df BUG: 1367270 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15162 Reviewed-by: Ravishankar N <ravishankar@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Anuradha Talur <atalur@redhat.com>
* storage/posix: Log EEXIST failures at DEBUG log-levelKrutika Dhananjay2016-08-171-3/+6
| | | | | | | | | | | | | Backport of: http://review.gluster.org/15161 Change-Id: I98faa1170dae89879f678e8836f499bbc6ace675 BUG: 1367362 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15174 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/dht: initialize cbk before attempting inode-linkRaghavendra G2016-08-161-9/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise inode-link failures in selfheal codepath will result in a crash. This regression was introduced in master as fix to 1334164. But, that patch never made into 3.7. Hence, in essence this patch is 3.7 version of fix to 1334164, minus the regression. > Change-Id: I9061629ae9d1eb1ac945af5f448d0d8b397a5022 > BUG: 1345748 > Signed-off-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-on: http://review.gluster.org/14707 > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Poornima G <pgurusid@redhat.com> > Reviewed-by: Susant Palai <spalai@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I9061629ae9d1eb1ac945af5f448d0d8b397a6022 BUG: 1366483 Reviewed-on: http://review.gluster.org/15163 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* xlator/trash : append '/' at the end in trash_notify_lookup_cbkJiffin Tony Thottan2016-08-151-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the notify function in trash xlator, a lookup is performed to obtain path of old trash directory. The result usually contains path without '/' at the end. The trash xlator maintains expects '/' at the end for the values such as 'old trash dir' and 'new trash dir'. Otherwise certian checks in the code will fail. Upstream reference: >Change-Id: I89e02e4b249314fb6536297f959865feee182c83 >BUG: 1357397 >Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> >Reviewed-on: http://review.gluster.org/14938 >Smoke: Gluster Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Anoop C S <anoopcs@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Reviewed-by: Jeff Darcy <jdarcy@redhat.com> >(cherry picked from commit d90307c1b0245e0e6a39044a28819cde520a100c) Change-Id: I89e02e4b249314fb6536297f959865feee182c83 BUG: 1358268 Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com> Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-on: http://review.gluster.org/14966 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* gluster v set help does not show ssl optionsMohit Agrawal2016-08-111-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: "gluster v set help" does not show ssl options. Solution: Remove NO_DOC option for client.ssl/server.ssl from glusterd_volopt_map. Backport of commit 9733c68e878869daec196bf7bca16780eef73f74 BUG: 1365757 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> > Change-Id: Iabe982ea56398209bbf30d41260798e5ad7fce7b > BUG: 1351134 > Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> > Reviewed-on: http://review.gluster.org/14829 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > (cherry picked from commit 9733c68e878869daec196bf7bca16780eef73f74) Change-Id: Ic602ede19342fb8c507e1daed56d4883392fb172 Reviewed-on: http://review.gluster.org/15131 Tested-by: MOHIT AGRAWAL <moagrawa@redhat.com> Reviewed-by: Prashanth Pai <ppai@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* posix: Do not move and recreate .glusterfs/unlink directoryAshish Pandey2016-08-101-11/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: At the time of start of a volume, it is checked if .glusterfs/unlink exist or not. If it does, move it to landfill and recreate unlink directory. If a volume is mounted and we write data on it till we face ENOSPC, restart of that volume fails as it will not be able to create unlink dir. mkdir will fail with ENOSPC. This will not allow volume to restart. Solution: If .glusterfs/unlink directory exist, don't move it to landfill. Delete all the entries inside it. master - http://review.gluster.org/#/c/15030/ Change-Id: Icde3fb36012f2f01aeb119a2da042f761203c11f BUG: 1364354 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15092 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* storage/posix: fix inode leaksRaghavendra G2016-08-102-1/+6
| | | | | | | | | | | | | | | | | | | | | | > Reviewed-on: http://review.gluster.org/14739 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: N Balachandran <nbalacha@redhat.com> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> BUG: 1364886 Change-Id: Ibd221ba62af4db17bea5c52d37f5c0ba30b60a7d Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-on: http://review.gluster.org/15107 Reviewed-by: N Balachandran <nbalacha@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* posix: leverage FALLOC_FL_ZERO_RANGE in zerofill fopRavishankar N2016-08-081-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | posix_zerofill() implements zerofilling of a given (offset,length) by doing a writev in a loop followed by an optional fsync on the file. fallocate(2) has a FALLOC_FL_ZERO_RANGE flag which does away with all this and provides the same result (from a userspace application point of view) with a single syscall. This patch attempts the zerofill with the latter and falls back to the former if it fails. Tested using a libgfapi based C program on XFS and observed using gdb that posix_zerofill()'s call to fallocate with FALLOC_FL_ZERO_RANGE was a success. > Reviewed-on: http://review.gluster.org/15037 > Reviewed-on: http://review.gluster.org/15100 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> BUG: 1363750 Change-Id: I77e9b7de0d59c255f06b0c39c43a276990081727 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-on: http://review.gluster.org/15082 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* meta: fix memory leak in meta xlatorsMohammed Rafi KC2016-08-081-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | meta xlator is leaking iobuf and iobrefs in read path This patch fixes memleak in meta_default_read code path > Reviewed-on: http://review.gluster.org/15068 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> > Reviewed-by: Poornima G <pgurusid@redhat.com> BUG: 1364506 Change-Id: Ieb413267604d9870dbe6e11258fffd279a7bd7cf Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-on: http://review.gluster.org/15102 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* posix: fix posix_fgetxattr to return the correct errorSusant Palai2016-08-071-3/+23
| | | | | | | | | | | | | | | | | | | | posix_fgetxattr used to not updating op_ret and op_errno (initialized to -1 and ENOENT respectively) on success cases. Hence, it can return ENOENT even if all the opertions were sucessful. BUG: 1358823 Change-Id: I1799ca7e48a4bd800a678366d2b6927d19c31f88 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/12745 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/13449 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* dht/rebalance: allocate migrator thread pool dynamicallySusant Palai2016-08-051-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problems: The maximum number of migratior threads created was static set to "40". And the number of these threads get created in rebalance depends on the number of cores user has. If the number of cores exceeds 40, a crash or memory corruption can be seen. Fix: Make the migratior thread pool dynamic. > Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839 > BUG: 1362070 > Reviewed-on: http://review.gluster.org/15000 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit b8e8bfc7e4d3eaf76bb637221bc6392ec10ca54b) Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839 BUG: 1362070 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/15062 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd: Fix memory leak in glusterd (un)lock RPCsroot2016-08-031-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: At the time of execute "gluster volume profile <vol> info" command It does have memory leak in glusterd. Solution: Modify the code to prevent memory leak in glusterd. Fix : 1) Unref dict and free dict_val buffer in glusterd_mgmt_v3_lock_peer and glusterd_mgmt_v3_unlock_peers. Test : To verify the patch run below loop to generate io traffic for (( i=0 ; i<=1000000 ; i++ )); do echo "hi Start Line " > file$i; cat file$i >> /dev/null; done To verify the improvement in memory leak specific to glusterd run below command cnt=0;while [ $cnt -le 1000 ]; do pmap -x <glusterd-pid> | grep total; gluster volume profile distributed info > /dev/null; cnt=`expr $cnt + 1`; done After apply this patch it will reduce leak significantly. > Reviewed-on: http://review.gluster.org/14862 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > Reviewed-by: Prashanth Pai <ppai@redhat.com> BUG: 1363747 Change-Id: I52a0ca47adb20bfe4b1848a11df23e5e37c5cea9 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-on: http://review.gluster.org/15081 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* storage/posix: Look for file in "unlink" dir IFF open on real-path fails ↵Krutika Dhananjay2016-07-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | with ENOENT Backport of: http://review.gluster.org/#/c/15039/ PROBLEM: In some of our users' setups, open() on the anon fd failed for a reason other than ENOENT. But this error code is getting masked by a subsequent open() under posix's hidden "unlink" directory, which will fail with ENOENT because the gfid handle still exists under .glusterfs. And the log message following the two open()s ends up logging ENOENT, causing much confusion. FIX: Look for the presence of the file under "unlink" ONLY if the open() on the real_path failed with ENOENT. Change-Id: Id68bbe98740eea9889b17f8ea3126ed45970d26f BUG: 1360785 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15041 Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* protocol/client: Filter o-direct in readv/writevPranith Kumar K2016-07-291-8/+15
| | | | | | | | | | | | | | | | | | | | | | | >Change-Id: I519c666b3a7c0db46d47e08a6a7e2dbecc05edf2 >BUG: 1322214 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/14215 >Smoke: Gluster Build System <jenkins@build.gluster.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> >(cherry picked from commit 74837896c38bafdd862f164d147b75fcbb619e8f) BUG: 1360785 Pranith Kumar K <pkarampu@redhat.com> Change-Id: Ib4013b10598b0b988b9f9f163296b6afa425f8fd Reviewed-on: http://review.gluster.org/15050 Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/ec: Unlock stale locks when inodelk/entrylk/lk failsPranith Kumar K2016-07-291-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | Thanks to Rafi for hinting a while back that this kind of problem he saw once. I didn't think the theory was valid. Could have caught it earlier if I had tested his theory. >Change-Id: Iac6ffcdba2950aa6f8cf94f8994adeed6e6a9c9b >BUG: 1344836 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/14703 >Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> >Smoke: Gluster Build System <jenkins@build.gluster.org> >Tested-by: mohammed rafi kc <rkavunga@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> BUG: 1361402 Change-Id: If9ccf0b3db7159b87ddcdc7b20e81cde8c3c76f0 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15040 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* storage/posix: Give correct errno for anon-fd operationsPranith Kumar K2016-07-294-46/+51
| | | | | | | | | | | | | | | | | | | | >Change-Id: Ia9e61d3baa6881eb7dc03dd8ddb6bfdde5a01958 >BUG: 1343906 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/14669 >Smoke: Gluster Build System <jenkins@build.gluster.org> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.org> >Reviewed-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1360140 Change-Id: I553dd59424aaff9538e0798370846d653ca1cf32 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/15011 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
* afr: some coverity fixesRavishankar N2016-07-287-103/+155
| | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/14895/ Thanks to Krutika for a cleaner way to track inode refs in afr_set_split_brain_choice(). Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df BUG: 1360549 Signed-off-by: Ravishankar N <ravishankar@redhat.com> Reviewed-on: http://review.gluster.org/15017 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* features/ctr: Check for NULL localN Balachandran2016-07-273-2/+509
| | | | | | | | | | | | | | | | | | | | | This is a defensive fix to prevent a crash reported during a rename operation. This is not reproducible under normal circumstances. This patch also moves ctr-messages.h to the src dir of the changetimerecorder xlator. Backported from master: http://review.gluster.org/#/c/14964/ Change-Id: I2aac2d4da5752f6a0b45a70e0d97a4d506532ff4 BUG: 1360125 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15007 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Milind Changire <mchangir@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* changelog/rpc: Fix rpc_clnt_t mem leaksKotresh HR2016-07-274-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/13658 PROBLEM: 1. Freeing up rpc_clnt object might lead to crashes. Well, it was not a necessity to free rpc-clnt object till now because all the existing use cases needs to reconnect back on disconnects. Hence timer code was not taking ref on rpc-clnt object. Glusterd had some use-cases that led to crash due to ping-timer and they fixed only those code paths that involve ping-timer. Now, since changelog has an use-case where rpc-clnt need to be freed up, we need to fix timer code to take refs 2. In changelog, because of issue 1, only mydata was being freed which is incorrect. And there are races where rpc-clnt object would access the freed mydata which would lead to crashes. Since changelog xlator resides on brick side and is long living process, if multiple libgfchangelog consumers register to changelog and disconnect/reconnect mulitple times, it would result in leak of 'rpc-clnt' object for every connect/disconnect. SOLUTION: 1. Handle ref/unref of 'rpc_clnt' structure in timer functions properly. 2. In changelog, unref 'rpc_clnt' in RPC_CLNT_DISCONNECT after disabling timers and free mydata on RPC_CLNT_DESTROY. RPC SETUP IN CHANGELOG: 1. changelog xlator initiates rpc server say 'changelog_rpc_server' 2. libgfchangelog initiates one rpc server say 'libgfchangelog_rpc_server' 3. libgfchangelog initiates rpc client and connects to 'changelog_rpc_server' 4. In return changelog_rpc_server initiates a rpc client and connects back to 'libgfchangelog_rpc_server' REF/UNREF HANDLING IN TIMER FUNCTIONS: Let's say rpc clnt refcount = 1 1. Take the ref before reigstering callback to timer queue >>>> rpc_clnt_ref (say ref count becomes = 2) 2. Register a callback to timer say 'callback1' 3. If register fails: >>>> rpc_clnt_unref (ref count = 1) 4. On timer expiration, 'callback1' gets called. So unref rpc clnt at the end in 'callback1'. This is corresponding to ref taken in step 1 >>>> rpc_clnt_unref (ref count = 1) 5. The cycle from step-1 to step-4 continues....until timer cancel event happens 6. timer cancel of say 'callback1' If timer cancel fails: Do nothing, Step-4 would have unrefd If timer cancel succeeds: >>>> rpc_clnt_unref (ref count = 1) Change-Id: I91389bc511b8b1a17824941970ee8d2c29a74a09 BUG: 1359363 Signed-off-by: Kotresh HR <khiremat@redhat.com> (cherry picked from commit 637ce9e2e27e9f598a4a6c5a04cd339efaa62076) Reviewed-on: http://review.gluster.org/14993 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* cluster/ec: Handle absence of keys in some callback dictAshish Pandey2016-07-271-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: This issue arises when we do a rolling update from 3.7.5 to 3.7.9. For 4+2 volume running 3.7.5, if we update 2 nodes and after heal completion kill 2 older nodes, this problem can be seen. After update and killing of bricks, 2 nodes will return inodelk count key in dict while other 2 nodes will not have inodelk count in dict. This is also true for get-link-count. During dictionary match , ec_dict_compare, this will lead to mismatch of answers and the file operation on mount point will fail with IO error. Solution: Don't match inode, entry and link count keys while comparing two dictionaries. However, while combining the data in ec_dict_combine, go through all the dictionaries and select the maximum values received in different dicts for these keys. master- http://review.gluster.org/#/c/14761/ Change-Id: I33546e3619fe8f909286ee48fb0df2009cd3d22f BUG: 1360152 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/14761 Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/15012