summaryrefslogtreecommitdiffstats
path: root/libglusterfs/src
Commit message (Collapse)AuthorAgeFilesLines
* features/shard: Fill loc.pargfid too for named lookups on individual shardsKrutika Dhananjay2016-11-082-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/#/c/15788/ On a sharded volume when a brick is replaced while IO is going on, named lookup on individual shards as part of read/write was failing with ENOENT on the replaced brick, and as a result AFR initiated name heal in lookup callback. But since pargfid was empty (which is what this patch attempts to fix), the resolution of the shards by protocol/server used to fail and the following pattern of logs was seen: Brick-logs: [2016-11-08 07:41:49.387127] W [MSGID: 115009] [server-resolve.c:566:server_resolve] 0-rep-server: no resolution type for (null) (LOOKUP) [2016-11-08 07:41:49.387157] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null) (00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82) ==> (Invalid argument) [Invalid argument] Client-logs: [2016-11-08 07:41:27.497687] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.497755] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.498500] W [MSGID: 114031] [client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument] [2016-11-08 07:41:27.499680] E [MSGID: 133010] Also, this patch makes AFR by itself choose a non-NULL pargfid even if its ancestors fail to initialize all pargfid placeholders. Change-Id: I34b9f90d0f09766b6d87b3994d5cd7a77b622dcb BUG: 1392853 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15797 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* libglusterfs: add gf_get_mem_type()Niels de Vos2016-10-032-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gfapi needs to provide a function towards applications to free memory that it allocated. Depending on how the application is compiled/linked, it could use a different memory allocator than Gluster itself. Therefore it is not safe for gfapi to request applications to free memory with 'standard' free(). Examples for this are Gluster allocated structures with GF_CALLOC() when memory accounting is enabled (the default). Some gfapi functions use malloc() to allocate memory as a workaround, but the free() from the jemalloc implementation should not be combined with the malloc() from glibc. Cherry picked from commit db4e26ed71a01e5f760fbc3c7051962426f102c9: > Change-Id: I626cd1a60abf8965f9263290f4045d1f69fc2093 > BUG: 1344714 > Signed-off-by: Niels de Vos <ndevos@redhat.com> > Reviewed-on: http://review.gluster.org/15108 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: soumya k <skoduri@redhat.com> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Change-Id: I626cd1a60abf8965f9263290f4045d1f69fc2093 BUG: 1347715 Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-on: http://review.gluster.org/15601 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Kaushal M <kaushal@redhat.com>
* cluster/afr: Prevent split-brain when bricks are brought off and on in ↵Krutika Dhananjay2016-08-222-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cyclic order Backport of: http://review.gluster.org/15080 When the bricks are brought offline and then online in cyclic order while writes are in progress on a file, thanks to inode refresh in write txns, AFR will mostly fail the write attempt when the only good copy is offline. However, there is still a remote possibility that the file will run into split-brain if the brick that has the lone good copy goes offline *after* the inode refresh but *before* the write txn completes (I call it in-flight split-brain in the patch for ease of reference), requiring intervention from admin to resolve the split-brain before the IO can resume normally on the file. To get around this, the patch does the following things: i) retains the dirty xattrs on the file ii) avoids marking the last of the good copies as bad (or accused) in case it is the one to go down during the course of a write. iii) fails that particular write with the appropriate errno. This way, we still have one good copy left despite the split-brain situation which when it is back online, will be chosen as source to do the heal. Change-Id: I7c13c6ddd5b8fe88b0f2684e8ce5f4a9c3a24a08 BUG: 1367270 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/15222 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* inode: Adjust lru_size while retiring entries in lru listSoumya Koduri2016-08-102-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | As part of inode_table_destroy(), we first retire entries in the lru list but the lru_size is not adjusted accordingly. This may result in invalid memory reference in inode_table_prune if the lru_size > lru_limit. > Reviewed-on: http://review.gluster.org/15087 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Reviewed-by: Prashanth Pai <ppai@redhat.com> BUG: 1365746 Change-Id: I29ee3c03b0eaa8a118d06dc0cefba85877daf963 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-on: http://review.gluster.org/15128 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* libglusterfs/client_t: Dump the 0th client tooRaghavendra G2016-08-091-1/+1
| | | | | | | | | | | | | | | | | | > Reviewed-on: http://review.gluster.org/14704 > Smoke: Gluster Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> BUG: 1364886 Change-Id: I565e81944b6670d26ed1962689dcfd147181b61e Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reviewed-on: http://review.gluster.org/15106 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Prashanth Pai <ppai@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* dht/rebalance: allocate migrator thread pool dynamicallySusant Palai2016-08-051-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problems: The maximum number of migratior threads created was static set to "40". And the number of these threads get created in rebalance depends on the number of cores user has. If the number of cores exceeds 40, a crash or memory corruption can be seen. Fix: Make the migratior thread pool dynamic. > Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839 > BUG: 1362070 > Reviewed-on: http://review.gluster.org/15000 > Smoke: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> (cherry picked from commit b8e8bfc7e4d3eaf76bb637221bc6392ec10ca54b) Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839 BUG: 1362070 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/15062 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* features/ctr: Check for NULL localN Balachandran2016-07-272-497/+1
| | | | | | | | | | | | | | | | | | | | | This is a defensive fix to prevent a crash reported during a rename operation. This is not reproducible under normal circumstances. This patch also moves ctr-messages.h to the src dir of the changetimerecorder xlator. Backported from master: http://review.gluster.org/#/c/14964/ Change-Id: I2aac2d4da5752f6a0b45a70e0d97a4d506532ff4 BUG: 1360125 Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/15007 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Milind Changire <mchangir@redhat.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* rpc/socket.c: Modify approach to cleanup threads of socket_poller in ↵Mohit Agrawal2016-07-263-1/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | socket_spawn. Problem: Current approach to cleanup threads of socket_poller is not appropriate. Solution: Enable detach flag at the time of thread creation in socket_spawn. Fix: Write a new wrapper(gf_create_detach_thread) to create detachable thread instead of store thread ids in a queue. Test: Fix is verfied on gluster process, To test the patch followed below procedure Enable the client.ssl and server.ssl option on the volume Start the volume and count anon segment in pmap output for glusterd process pmap -x <glusterd-pid> | grep "\[ anon \]" | wc -l Stop the volume and check again count of anon segment it should not increase. Backport of commit 2ee48474be32f6ead2f3834677fee89d88348382 > Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> > Change-Id: Ib8f7ec7504ec8f6f74b45ce6719b6fb47f9fdc37 > BUG: 1336508 > Reviewed-on: http://review.gluster.org/14694 > Smoke: Gluster Build System <jenkins@build.gluster.org> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.org> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> BUG: 1354394 Change-Id: I271e83e7a210ecd27a7471c53147ceb837a33cad Signed-off-by: Mohit Agrawal <moagrawa@redhat.com> Reviewed-on: http://review.gluster.org/14886 CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* features/bitrot: Move throttling code to libglusterfsKotresh HR2016-07-194-2/+371
| | | | | | | | | | | | | | | | Backport of http://review.gluster.org/14846 Since throttling is a separate feature by itself, move throttling code to libglusterfs. Change-Id: If9b99885ceb46e5b1865a4af18b2a2caecf59972 BUG: 1357514 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/14944 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* libglusterfs: Implement API that provides page-aligned iobufsKrutika Dhananjay2016-06-292-0/+39
| | | | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/14672 One of the consumers of a page aligned buffer would be posix's readv fop on O_DIRECT fds. Today the way it works is by getting a page-aligned buffer through calloc, pread()ing into this buffer and then copying its contents into a newly created iobuf's ptr. This results in an extra memcpy() which can be avoided if we could implement an api that would return an iobuf whose ptr is page-aligned. That way the iobuf->ptr can be directly passed to sys_pread() as a parameter by posix translator. Change-Id: Ie44c300dc773fef8e1669d609987d47dbd340ac2 BUG: 1351024 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/14826 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> Reviewed-by: Zhou Zhengping <johnzzpcrystal@gmail.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* libglusterfs: Negate all but O_DIRECT flag if present on anon fdsKrutika Dhananjay2016-06-161-10/+4
| | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/14665 This is to prevent any unforeseen problems that might arise due to writevs and readvs being wound with @flag parameter containing O_TRUNC or O_APPEND especially wrt translators like sharding and ec where O_TRUNC write or O_APPEND write on individual shards/fragments is not the same as O_TRUNC write or O_APPEND write as expected by the application. Change-Id: Ib0110731d6099bc888b7ada552a2abbea8e76051 BUG: 1342903 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/14735 Smoke: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* core, shard: Make shards inherit main file's O_DIRECT flag if presentKrutika Dhananjay2016-06-162-9/+36
| | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/14191 If the application opens a file with O_DIRECT, the shards' anon fds would also need to inherit the flag. Towards this, shard xl would be passing the odirect flag in the @flags parameter to the WRITEV fop. This will be used in anon fd resolution and subsequent opening by posix xl. Change-Id: I3a0593fa46cc25e390a5762a0354b469c2a1532d BUG: 1342903 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/14663 Smoke: Gluster Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.org> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* cluster/ec: Fix invalid __fd_unref() callXavier Hernandez2016-06-131-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | __fd_unref() doesn't do any cleanup, so it cannot be called to release fd references, specially if it's the last reference. The code has been changed to avoid a call to this function. In the previous version we always tried to keep the newest fd in the ec_lock_t structure. However this is not necessary. We'll always keep one reference to an open file on the same inode. It's irrelevant if the reference is new or old. The function __fd_unref() has also been removed from fd.h to avoid being used in the future since it's useless as it's defined now. Backport of http://review.gluster.org/14683 Change-Id: Ia728777fc8e464758d5ea4d3bf020f0603919039 BUG: 1344422 Signed-off-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-on: http://review.gluster.org/14685 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* libglusterfs: Even anonymous fds must have fd->flags setRaghavendra Talur2016-06-082-3/+6
| | | | | | | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/10219 We do not set the same flags to anonymous fd that posix uses to open an anonymous fd in the backend. If there are any xlators which rely on these flags for their operation they may not work well. Add proper flags to anonymous fds at the time of their creation and refer to them for subsequent operations. Change-Id: I60d5b04ab851c31fbe4e5e74bee1af1b14daa52e BUG: 1342903 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/14662 Tested-by: Krutika Dhananjay <kdhananj@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* cluster/afr: Don't let NFS cache stat after writes.Pranith Kumar K2016-06-032-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Afr does post-ops after write but the stat buffer it unwinds is at the time of write, so if nfs client caches this, it will see different ctime when it does stat on it after post-op is done. From NFS client's perspective it thinks the file is changed. Tar which depends on this to be correct keeps giving 'file changed as we read it' warning. If Afr instead has to choose to unwind after post-op, eager-lock, delayed-post-op will have to be disabled which will lead to bad performance for all write usecases. Fix: Don't let client cache stat after write. >Change-Id: Ic6062acc6e5cdd97a9c83c56bd529ec83cee8a23 >BUG: 1302948 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Signed-off-by: Anuradha Talur <atalur@redhat.com> >Reviewed-on: http://review.gluster.org/13785 >Smoke: Gluster Build System <jenkins@build.gluster.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Niels de Vos <ndevos@redhat.com> BUG: 1312721 Change-Id: I42a5d524bcf2a2034fe48ee8454812ca26a98c37 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14454 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com>
* cluster/afr: Do heals with shd pidPranith Kumar K2016-05-242-12/+13
| | | | | | | | | | | | | | | | | | | | | | Multi-threaded healing doesn't create synctask with shd pid, this leads to healing problems when quota exceeds. >BUG: 1332994 >Change-Id: I80f57c1923756f3298730b8820498127024e1209 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/14211 >Smoke: Gluster Build System <jenkins@build.gluster.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> Change-Id: Id3f3ee44b27db7dbf94f3e7a9a6bfd7412d44ab8 BUG: 1335686 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14313 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* features/shard: Get hard-link-count in {unlink,rename}_cbk before deleting ↵Krutika Dhananjay2016-05-231-0/+1
| | | | | | | | | | | | | | | shards Backport of http://review.gluster.org/#/c/14334/ Change-Id: I41321d8b00a10f1bd5b0a7b008f673b1aa240d0c BUG: 1337837 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/14450 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* gfapi: fill iatt in readdirp_cbk if entry->inode is nullMohammed Rafi KC2016-05-232-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | If any of dirent have inode as null in readdirp_cbk, which indicates that the stat information is not valid. So for such entries, we send explicit lookup to fill the stat information. Backport of> >Change-Id: I0604bce34583db0bb04b5aae8933766201c6ddad >BUG: 1330567 >Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> >Reviewed-on: http://review.gluster.org/14079 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >Smoke: Gluster Build System <jenkins@build.gluster.com> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Niels de Vos <ndevos@redhat.com> (cherry picked from commit 9423bdeed169076ebedd9af40b52aaac58c9839e) Change-Id: I90a218c78d5544a3b49b29079c64a8b76e7939df BUG: 1331263 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/14109 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* libglusterfs/gfapi: set appropriate errno for inode_link failuresSoumya Koduri2016-05-111-4/+17
| | | | | | | | | | | | | | | | | | | | | | We do not seem to be setting errno appropriately in case of inode_link failures. This errno may be used by any application (for eg., nfs-ganesha) to determine the error encountered. This patch addresses the same. This is backport of below mainline fix - http://review.gluster.org/14278 Change-Id: I674f747c73369d0597a9c463e6ea4c85b9091355 BUG: 1335016 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/14278 Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/14287 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* readdir-ahead: Prefetch xattrs needed by md-cachePrashanth Pai2016-05-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Negative cache feature implementation in md-cache requires xattrs returned by posix to be intercepted for every call that can possibly return xattrs. This includes readdirp(). This is crucial to treat missing keys in cache as a case of negative entry (returns ENODATA) md-cache puts names of xattrs that it wants to cache in xdata and passes it down to posix which returns the specified xattrs in the callback. This is done in lookup() and readdirp(). Hence, a xattr that is cached can be invalidated during readdirp_cbk too. This is based on the assumption that readdirp() will always return all xattrs that md-cache is interested in. However, this is not the case when readdirp() call is served from readdir-ahead's cache. readdir-ahead xlator will pre-fetch dentries during opendir_cbk and readdirp. These internal readdirp() calls made by readdir-ahead xlator does not set xdata in it's requests. Hence, no xattrs are fetched and stored in it's internal cache. This causes metadata loss in gluster-swift. md-cache returns ENODATA during getxattr() call even though the xattr for that object exists on the brick. On receiving ENODATA, gluster-swift will create new metadata and do setxattr(). This results in loss of information stored in existing xattr. Fix: During opendir, md-cache will communicate to readdir-ahead asking it to store the names of xattrs it's interested in so that readdir-ahead can fetch those in all subsequent internal readdirp() calls issued by it. This stored names of xattrs is invalidated/updated on the next real readdirp() call issued by application. This readdirp() call will have xdata set correctly by md-cache xlator. > Reviewed-on: http://review.gluster.org/14214 > Tested-by: Prashanth Pai <ppai@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Smoke: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Tested-by: Raghavendra G <rgowdapp@redhat.com> BUG: 1334700 Change-Id: I32d46f93a99d4ec34c741f3c52b0646d141614f9 (cherry picked from commit 0c73e7050c4d30ace0c39cc9b9634e9c1b448cfb) Reviewed-on: http://review.gluster.org/14282 Tested-by: Prashanth Pai <ppai@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* rpc: define client port rangePrasanna Kumar Kalever2016-05-041-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: when bind-insecure is 'off', all the clients bind to secure ports, if incase all the secure ports exhaust the client will no more bind to secure ports and tries gets a random port which is obviously insecure. we have seen the client obtaining a port number in the range 49152-65535 which are actually reserved as part of glusterd's pmap_registry for bricks, hence this will lead to port clashes between client and brick processes. Solution: If we can define different port ranges for clients incase where secure ports exhaust, we can avoid the maximum port clashes with in gluster processes. Still we are prone to have clashes with other non-gluster processes, but the chances being very low, but that's a different story on its own, which will be handled in upcoming patches. > Change-Id: Ib5ce05991aa1290ccb17f6f04ffd65caf411feaf > BUG: 1322805 > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> > Reviewed-on: http://review.gluster.org/13998 > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Change-Id: I712676d3e79145d78a17f2c361525e6ef82a4732 BUG: 1323564 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/14205 Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd: add defence mechanism to avoid brick port clashesPrasanna Kumar Kalever2016-05-041-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intro: Currently glusterd maintain the portmap registry which contains ports that are free to use between 49152 - 65535, this registry is initialized once, and updated accordingly as an then when glusterd sees they are been used. Glusterd first checks for a port within the portmap registry and gets a FREE port marked in it, then checks if that port is currently free using a connect() function then passes it to brick process which have to bind on it. Problem: We see that there is a time gap between glusterd checking the port with connect() and brick process actually binding on it. In this time gap it could be so possible that any process would have occupied this port because of which brick will fail to bind and exit. Case 1: To avoid the gluster client process occupying the port supplied by glusterd : we have separated the client port map range with brick port map range more @ http://review.gluster.org/#/c/13998/ Case 2: (Handled by this patch) To avoid the other foreign process occupying the port supplied by glusterd : To handle above situation this patch implements a mechanism to return EADDRINUSE error code to glusterd, upon which a new port is allocated and try to restart the brick process with the newly allocated port. Note: Incase of glusterd restarts i.e. runner_run_nowait() there is no way to handle Case 2, becuase runner_run_nowait() will not wait to get the return/exit code of the executed command (brick process). Hence as of now in such case, we cannot know with what error the brick has failed to connect. This patch also fix the runner_end() to perform some cleanup w.r.t return values. Backport of: > Change-Id: Iec52e7f5d87ce938d173f8ef16aa77fd573f2c5e > BUG: 1322805 > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> > Reviewed-on: http://review.gluster.org/14043 > Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Change-Id: Ief247b4d4538c1ca03e73aa31beb5fa99853afd6 BUG: 1323564 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/14208 Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* features/marker: Fix dict_get errors when key is NULLKotresh HR2016-05-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Backport of: >Change-Id: I25e497459441334c13af77b3fec83c42a7a92ac4 >BUG: 1319581 >Signed-off-by: Kotresh HR <khiremat@redhat.com> >Reviewed-on: http://review.gluster.org/13793 >Smoke: Gluster Build System <jenkins@build.gluster.com> >Tested-by: Vijaikumar Mallikarjuna <vmallika@redhat.com> >Reviewed-by: Vijaikumar Mallikarjuna <vmallika@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Venky Shankar <vshankar@redhat.com> >Signed-off-by: Kotresh HR <khiremat@redhat.com> Change-Id: I8054ffab3574b6ceb1c7d4290e9f6de3dbf38724 BUG: 1332074 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/14144 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* features/trash: wind mkdir with special pidAnoop C S2016-05-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent changes done w.r.t handling of mkdir calls in posix translator resulted in crashing the brick process from trash translator. This was due to the changes made in posix translator to return EPERM for every mkdir calls without 'gfid-req' set in dictionary. In order to avoid gfid mismatches during directory creation from brick side trash translator does not set 'gfid-req'. This patch is to have an exemption for trash based on a special pid set for those mkdir calls originating from trash translator and to reset it in callback. This patch also includes a small optimization to the existing test case for trash feature. > Reviewed-on: http://review.gluster.org/13776 > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> (cherry picked from commit b5cfe948cb3569f034da80ac97b5d2f028b3b0e5) Change-Id: I59f084ac875e54342ecf2bffa6e43ebd84814153 BUG: 1332372 Signed-off-by: Anoop C S <anoopcs@redhat.com> Reviewed-on: http://review.gluster.org/14173 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com>
* Rename enum _gf_client_pid to _gf_special_pidAnoop C S2016-05-021-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Till now _gf_client_pid enum has been used to define special PIDs used by glusterfs clients like shd, quotad etc. In order to have this enum capable of holding all other special PIDs including the one used by trash translator, _gf_client_pid is being renamed to _gf_special_pid. > Change-Id: Id123127771f18aa55d39f335801a54810848d7bc > BUG: 1330616 > Reviewed-on: http://review.gluster.org/14083 > Reviewed-by: Joseph Fernandes > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> (cherry picked from commit 2a6c6de35130328cfb6f95a6d7794dcb01b4004d) Change-Id: Id123127771f18aa55d39f335801a54810848d7bc Signed-off-by: Anoop C S <anoopcs@redhat.com> Reviewed-on: http://review.gluster.org/14097 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* rpc: fix gf_process_reserved_portsPrasanna Kumar Kalever2016-05-021-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | this patch also does minor code cleanups. Backport of: > Change-Id: I0d005bd0f9baaaae498aa1df4faa6fcb65fa7a6e > BUG: 1198849 > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> > Reviewed-on: http://review.gluster.org/13997 > Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> > Smoke: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Atin Mukherjee <amukherj@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Change-Id: Ia53ba724f6d31cb2fc609786e31a1b676f55fe01 BUG: 1331941 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/14128 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* libglusterfs: Add debug and trace logs for stack traceRaghavendra Talur2016-05-021-4/+49
| | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/13448 commit c4d67a8338b42d6485f49999f310cbb9ed5359c5 It has become very difficult to identify the xlator which returned negative op_ret. Being able to just change the log level and visualize the stack is helpful in such cases. Change-Id: I6545b4802c1ab4d0d230d5e9e036afb2384882e1 BUG: 1330739 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/14099 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* runner: extract and return actual exit status of childPrasanna Kumar Kalever2016-05-011-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intro: pid_t waitpid(pid_t pid, int *status, int options); The waitpid() system call suspends execution of the calling process until a child specified by pid argument has changed state. Here the ret (pid) value is not equal to the exit status of the child process. Check manpages for more info on this. Problem: In the current runner framework we always return the pid i.e ret value of the waitpid, as said above it is not the exit value of the child process Solution: Extract the actual exit code/status in case if the child terminated normally, that is, by calling exit(3) or _exit(2), or by returning from main() Backport of: > Change-Id: Iffae99a43e540af66917b3745f21ea3c2a5a3c2d > BUG: 1329129 > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> > Reviewed-on: http://review.gluster.org/14042 > Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > Smoke: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: I137b2fc120ec2b1137bf8a4e6b180f1787bf5908 BUG: 1331759 Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-on: http://review.gluster.org/14116 Tested-by: Prasanna Kumar Kalever <pkalever@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* inode: Always fetch first entry from the inode lists during inode_table_destroySoumya Koduri2016-04-291-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | In inode_table_destroy, we iterate through lru and active lists to move the entries to purge list so that they can be destroyed during inode_table_prune. But if used "list_for_each_entry" or "list_for_each_entry_safe" to iterate, we could end up accessing the entries which may have got moved to different(purge) lists in the process and can result in either infinite loop or crash. The safe approach seems to fetch the first entry of the list in each iteration till it gets empty. This is backport of below mainline fix - http://review.gluster.org/13987 Change-Id: I24a18881833bd9419c2d8e5e8807bc71ec396479 BUG: 1330892 Signed-off-by: Soumya Koduri <skoduri@redhat.com> Reviewed-on: http://review.gluster.org/13987 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Reviewed-on: http://review.gluster.org/14089 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cli/quota: Sort the list output alphabetically by pathvmallika2016-04-273-0/+61
| | | | | | | | | | | | | | | | | | | | | This is a backport of http://review.gluster.org/14000 > Change-Id: I0b124e119d167817be2ae3eb52ac6c80fc7db5d1 > BUG: 1320716 > Signed-off-by: vmallika <vmallika@redhat.com> > Reviewed-on: http://review.gluster.org/14000 > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Kaushal M <kaushal@redhat.com> Change-Id: I87e12d58c8e267b2af67e287998e7313efc70af4 BUG: 1330018 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/14061 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* afr: replica pair going offline does not require CHILD_MODIFIED eventSakshi Bansal2016-04-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As a part of CHILD_MODIFIED event DHT forgets the current layout and performs fresh lookup. However this is not required when a replica pair goes offline as the xattrs can be read from other replica pairs. Hence setting different event to handle replica pair going down. > Backport of http://review.gluster.org/#/c/12573/ > Change-Id: I5ede2a6398e63f34f89f9d3c9bc30598974402e3 > BUG: 1281230 > Signed-off-by: Sakshi Bansal <sabansal@redhat.com> > Reviewed-on: http://review.gluster.org/12573 > Reviewed-by: Ravishankar N <ravishankar@redhat.com> > Reviewed-by: Susant Palai <spalai@redhat.com> > Tested-by: NetBSD Build System <jenkins@build.gluster.org> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Jeff Darcy <jdarcy@redhat.com> Change-Id: Ida30240d1ad8b8730af7ab50b129dfb05264fdf9 BUG: 1283972 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/12767 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* quota: setting 'read-only' option in xdata to instruct DHT to not healSakshi Bansal2016-04-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When quota is enabled the quota enforcer tries to get the size of the source directory by sending nameless lookup to quotad. But if the rename is successful even on one subvol or the source layout has anomalies then this nameless lookup in quotad tries to heal the directory which requires a lock on as many subvols as it can. But src is already locked as part of rename. For rename to proceed in brick it needs to complete a cluster-wide lookup. But cluster-wide lookup in quotad is blocked on locks held by rename, hence a deadlock. To avoid this quota sends an option in xdata which instructs DHT not to heal. Backport of http://review.gluster.org/#/c/13988/ > Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0 > BUG: 1252244 > Signed-off-by: Sakshi Bansal <sabansal@redhat.com> > Reviewed-on: http://review.gluster.org/13988 > Smoke: Gluster Build System <jenkins@build.gluster.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Tested-by: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0 BUG: 1328473 Signed-off-by: Sakshi Bansal <sabansal@redhat.com> Reviewed-on: http://review.gluster.org/14031 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Raghavendra G <rgowdapp@redhat.com>
* tier/libgfdb: Ordering query results from libgfdbJoseph Fernandes2016-04-261-5/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When querying we will order the query result to get the hotest or the coldest files in the queried list so that these files are migrated first. Now here we are giving priority to the write heat(time and counters), as it requires complex queries to have a composite ordering of write and read + it has it impact on performance. Backport of http://review.gluster.org/13607 > Change-Id: I2e0415dcfad4218b42c68fc5c2ed8d1f075ce9ea > Signed-off-by: Joseph Fernandes <josferna@redhat.com> > Reviewed-on: http://review.gluster.org/13607 > Smoke: Gluster Build System <jenkins@build.gluster.com> > Tested-by: Joseph Fernandes > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: Joseph Fernandes <josferna@redhat.com> Change-Id: If5fad07f8d0f50016b10e256803abd5266cd708f BUG: 1323017 Reviewed-on: http://review.gluster.org/13881 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Joseph Fernandes Tested-by: Joseph Fernandes NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
* cluster/distribute: detect stale layouts in entry fopsRaghavendra G2016-04-251-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dht_mkdir () { first-hashed-subvol = hashed-subvol for "bname" in in-memory layout of "parent"; inodelk (SETLKW, parent, "LAYOUT_HEAL_DOMAIN", "can be any subvol, but we choose first-hashed-subvol randomly"); { begin: hashed-subvol = hashed-subvol for "bname" in in-memory layout of "parent"; hash-range = extract hashe-range from layout of "parent"; ret = mkdir (parent/bname, hashed-subvol, hash-range); if (ret == "hash-value doesn't fall into layout stored on the brick (this error is returned by posix-mkdir)") { refresh_parent_layout (); goto begin; } } inodelk (UNLCK, parent, "LAYOUT_HEAL_DOMAIN", "first-hashed-subvol"); proceed with other parts of dht_mkdir; } posix_mkdir (parent/bname, client-hash-range) { disk-hash-range = getxattr (parent, "dht-layout-key"); if (disk-hash-range != client-hash-range) { fail-with-error ("hash-value doesn't fall into layout stored on the brick"); return 0; } continue-with-posix-mkdir; } Similar changes need to be done for dentry operations like create, symlink, link, unlink, rmdir, rename. These will be addressed in subsequent patches. This patch addresses only mkdir codepath. This change breaks stripe tests, as on some striped subvols dht layout xattrs are not set for some reason. This results in failure of mkdir. Since striped volumes are always created with dht, some tests associated with stripe also fail. So, I am making following tests changes (since stripe is out of maintainance): * modify ./tests/basic/rpc-coverage.t to not to use striped volumes * mark all (2) tests in tests/bugs/stripe/ as bad tests Change-Id: Idd1ae879f24a48303dc743c1bb4d91f89a629e25 BUG: 1329062 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/14040 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* op-version: Bump up op-version to 3.7.12Pranith Kumar K2016-04-181-1/+1
| | | | | | | | | | | BUG: 1325857 Change-Id: I49286ba60281d543f2acacf45c4f824627ef4167 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14017 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
* cluster/afr: Use parallel dir scan functionalityPranith Kumar K2016-04-171-0/+2
| | | | | | | | | | | | | | | | | | | | >BUG: 1221737 >Change-Id: I0ed71a72f0e33bd733723e00a01cf28378c5534e >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/13755 >Reviewed-on: http://review.gluster.org/13992 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Smoke: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Jeff Darcy <jdarcy@redhat.com> BUG: 1325857 Change-Id: I7c6b2ea065edd7f5dafffeb42fd6c601b4ab8d14 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/14010 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* syncop: Add parallel dir scan functionalityPranith Kumar K2016-04-163-0/+246
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most of this functionality's ideas are contributed by Richard Wareing, in his patch: https://bugzilla.redhat.com/show_bug.cgi?id=1221737#c1 VERY BIG thanks to him :-). After starting porting/testing the patch above, I found a few things we can improve in this patch based on the results we got in testing. 1) We are reading all the indices before we launch self-heals. In some customer cases I worked on there were almost 5million files/directories that needed heal. With such a big number self-heal daemon will be OOM killed if we go this route. So I modified this to launch heals based on a queue length limit. 2) We found that for directory hierarchies, multi-threaded self-heal patch was not giving better results compared to single-threaded self-heal because of the order problems. We improved index xlator to give gfid type to make sure that all directories in the indices are healed before the files that follow in that iteration of readdir output(http://review.gluster.org/13553). In our testing this lead to zero errors of self-heals as we were only doing self-heals in parallel for files and not directories. I think we can further improve self-heal speed for directories by doing name heals in parallel based on similar techniques Richard's patch showed. I think the best thing there would be to introduce synccond_t infra (pthread_cond_t kind of infra for syncops) which I am planning to implement for future releases. 3) Based on 1), 2) and the fact that afr already does retries of the indices in a loop I removed retries again in the threads. 4) After the refactor, the changes required to bring in multi-threaded self-heal for ec would just be ~10 lines, most of it will be about options initialization. Our tests found that we are able to easily saturate network :-). High level description of the final feature: Traditionally self-heal daemon reads the indices (gfids) that need to be healed from the brick and initiates heal one gfid at a time. Goal of this feature is to add parallelization to the way we do self-heals in a way we do not regress in any case but increase parallelization wherever we can. As part of this following knobs are introduced to improve parallelization: 1) We can launch 'max-jobs' number of heals in parallel. 2) We can keep reading indices as long as the wait-q for heals doesn't go over 'max-qlen' passed as arguments to multi-threaded dir_scan. As a first cut, we always do healing of directories in serial order one at a time but for files we launch heals in parallel. In future we can do name-heals of dir in parallel, but this is not implemented as of now. Reason for this is mentioned already in '2)' above. AFR/EC can introduce options like max-shd-threads/wait-qlength which can be set by users to increase the rate of heals when they want. Please note that the options will take effect only for the next crawl. >BUG: 1221737 >Change-Id: I8fc0afc334def87797f6d41e309cefc722a317d2 >Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> >Reviewed-on: http://review.gluster.org/13569 >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Jeff Darcy <jdarcy@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.com> BUG: 1325857 Change-Id: I23235bbb923208eee6a8be711bbfb14350edb11b Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/13967 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* Revert "glusterd: Bug fixes for IPv6 support"Kaushal M2016-04-162-30/+11
| | | | | | | | This reverts commit b33f3c95ec9c8112e6677e09cea05c4c462040d0. This commit exposes some issues with management encryption that prevents GlusterFS from operating properly. This will be added again once problems with management encryption are fixed.
* marker: optimize mq_update_dirty_inode_taskvmallika2016-04-062-0/+14
| | | | | | | | | | | | | | | | | | | | | This is a backport of http://review.gluster.org/#/c/13892/ In function mq_update_dirty_inode_task we do readdirp on a dirty directory and for entry we again do lookup to fecth the contribution xattr. We can fetch this contribution as part of readdirp > Change-Id: I766593c0dba793f1ab3b43625acce1c7d9af8d7f > BUG: 1320818 > Signed-off-by: vmallika <vmallika@redhat.com> Change-Id: Id826a09a72529f7435372ea7f04068dd10da5fcb BUG: 1324040 Signed-off-by: vmallika <vmallika@redhat.com> Reviewed-on: http://review.gluster.org/13908 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* Revert "cluster/ec: Rebalance hangs during rename"Kaushal M2016-04-011-8/+1
| | | | | | | This reverts commit 3d34c495d547866a533bc0614b14163381830095, which broke building rpms and possibly other packages as well. Change-Id: I2c10a613599e63bc0cbdb1b405cd87be9efa4a99
* features/changelog: Don't modify 'pargfid' in 'resolve_pargfid_to_path'Kotresh HR2016-03-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/13845/ If 'changelog' is enabled and 'changelog.capture-del-path' option is on it calls 'resolve_pargfid_to_path' which modifies 'pargfid' sent by caller. 'changelog_unlink' calls this routine directly with 'loc->pargfid' resulting it being modified and point to root instead of actual pargfid. This is a nasty bug and could cause the deletion of entry on root directory instead on actual parent when 'loc->path' is not present. Hence this fix to make 'pargfid' a const pointer and 'resolve_pargfid' to work on copy of pargfid. Glusterfind session creation enables these options by default to capture deleted entry path in changelog. Thanks Pranith for root causing this. BUG: 1322552 Change-Id: I9f2bc44b5604b224462594c12b7d79e68198d693 Signed-off-by: Kotresh HR <khiremat@redhat.com> Reviewed-on: http://review.gluster.org/13861 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
* cluster/ec: Rebalance hangs during renameAshish Pandey2016-03-311-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: During the rename of a particular file (ec is holding blocking inodelk on the parent directory), if the rename of another file under the same directory comes. EC does not release the lock and goes ahead and renames the "new" file with the "already held lock". That causes rebalance process to be blocked on a lock which has been acquired by rename. Solution: While rename fop comes, ec takes blocking inodelk on old and new parent of the file. Before releasing, every lock held by ec, it waits for some "time" to see if that lock can be reused by the next fop. If within this "time" some other request comes, it releases this lock based on condition "lock count > 1" To get this "lock count" for rename fop, we have implemented "pl_rename" in feature/lock. Also, on ec side, changed the condition to release the lock based on the type of fop and old and new parent directories. master- http://review.gluster.org/#/c/13460/ Change-Id: I979dbab1185df962e8f305a6074ae1186ffe7db0 Bug: 1322299 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/13849 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* libglusterfs: open cmd_history log file with O_APPEND and O_WRONLYAtin Mukherjee2016-03-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 8fdfa0c introduced a fix to ensure cmd_history file is log rotated properly. However with this fix fdopen() is called with mode "a" on a fd which was not opened with O_WRONLY & O_APPEND resulting into a fdopen() failure. Fix is to open cmd_history.log file with O_CREATE|O_WRONLY|O_APPEND mode Backport of commit 207289621f6c5b75bdb80aa14ddaf72efd5eb9b1: > Change-Id: I75ef350560aa6d5435c78c5fd83adfde1a73bfc3 > BUG: 1286959 > Signed-off-by: Atin Mukherjee <amukherj@redhat.com> > Reviewed-on: http://review.gluster.org/13829 > Smoke: Gluster Build System <jenkins@build.gluster.com> > Reviewed-by: Niels de Vos <ndevos@redhat.com> > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> > CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Change-Id: I75ef350560aa6d5435c78c5fd83adfde1a73bfc3 BUG: 1304963 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Reviewed-on: http://review.gluster.org/13847 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* gfapi: Fix the crashes caused by global_xlator and THISPoornima G2016-03-304-5/+10
| | | | | | | | | | | | | | | | | | | | | | Issue: http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10922 The right fix for this is elaborate and intrusive, until it is in place, this patch provides a temperory fix. This fix is necessary, as without this libgfapi applications like qemu, samba, NFS ganesha are prone to crashes. This patch will be reverted completely, once the actual fix gets accepted. Credits: Rajesh Joseph, Raghavendra Talur, Anoop CS Back-port of: http://review.gluster.org/#/c/13784/ Change-Id: I8a8a0572bea0eec94ece6aa0d7afcf2f459b4a43 BUG: 1319989 Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/13803 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anoop C S <anoopcs@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
* libglusterfs: pass buffer size to gf_store_read_and_tokenize functionGaurav Kumar Garg2016-03-232-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is backport of: http://review.gluster.org/#/c/12346/ Previously if user set an option where length of key=value goes beyond PATH_MAX (4096) character then tokenzing the option at the time of reading configuration file will fail. This is because of the we was having restraction in fgets to read maximum of PATH_MAX (4096) length of character. Consequence of this is when user try to restart glusterd, after setting key=value length beyond PATH_MAX (4096) character, glusterd will not restart. With this fix instead of PATH_MAX, consumer of gf_store_read_and_tokenize function will decide the size of the buffer length. Cherry picked from commit 816ca94f5dd49f34f395caf501de3c71f0ba113d: >> Change-Id: I655a8ce982effdfff8f3e785ea31f543dbe39301 >> BUG: 1271150 >> Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> >> Reviewed-on: http://review.gluster.org/12346 >> Tested-by: NetBSD Build System <jenkins@build.gluster.org> >> Tested-by: Gluster Build System <jenkins@build.gluster.com> >> Reviewed-by: Anand Nekkunti <anekkunt@redhat.com> >> Reviewed-by: Niels de Vos <ndevos@redhat.com> Change-Id: I5b76e81a4ad31a286fb4298ba27f3230fba99ab4 BUG: 1319649 Signed-off-by: Gaurav Kumar Garg <ggarg@redhat.com> Reviewed-on: http://review.gluster.org/13795 NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com>
* glusterd / afr : Enable auto heal when replica count increasesAnuradha Talur2016-03-232-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/12451 In replicate volumes, when a brick is added to a replicate group, heal to the new brick should be triggered. Also, the new brick should not be considered as source for healing till it is up to date. Previously, extended attributes had to be set manually on the bricks for this to happen. This patch is part 1 patch to automate this process. >Change-Id: I29958448618372bfde23bf1dac5dd23dba1ad98f >BUG: 1276203 >Signed-off-by: Anuradha Talur <atalur@redhat.com> >Reviewed-on: http://review.gluster.org/12451 >Reviewed-by: Atin Mukherjee <amukherj@redhat.com> >NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> >CentOS-regression: Gluster Build System <jenkins@build.gluster.com> >Reviewed-by: Ravishankar N <ravishankar@redhat.com> >Smoke: Gluster Build System <jenkins@build.gluster.com> Signed-off-by: Anuradha Talur <atalur@redhat.com> Conflicts: libglusterfs/src/globals.h xlators/mgmt/glusterd/src/glusterd-replace-brick.c Change-Id: Ica83592aab8edbe49e2bb9d8d4824cf5c76324b7 BUG: 1320020 Reviewed-on: http://review.gluster.org/13806 Smoke: Gluster Build System <jenkins@build.gluster.com> Tested-by: Anuradha Talur <atalur@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* glusterd: Bug fixes for IPv6 supportNithin D2016-03-212-11/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport of http://review.gluster.org/#/c/11988/ Problem: Glusterd not working using ipv6 transport. The idea is with proper glusterd.vol configuration, 1. glusterd needs to listen on default port (240007) as IPv6 TCP listner. 2. Volume creation/deletion/mounting/add-bricks/delete-bricks/peer-probe needs to work using ipv6 addresses. 3. Bricks needs to listen on ipv6 addresses. All the above functionality is needed to say that glusterd supports ipv6 transport and this is broken. Fix: When "option transport.address-family inet6" option is present in glusterd.vol file, it is made sure that glusterd creates listeners using ipv6 sockets only and also the same information is saved inside brick volume files used by glusterfsd brick process when they are starting. Tests Run: Regression tests using ./run-tests.sh IPv4: Regression tests using ./run-tests.sh for release-3.7 branch verified by comparing with clean repo. IPv6: (Need to add the above mentioned config and also add an entry for "hostname ::1" in /etc/hosts) Started failing at ./tests/basic/glusterd/arbiter-volume-probe.t and ran successfully till here Change-Id: Idd7513aa2347ce0de2b1f68daeecce1b7a39a7af BUG: 1310445 Signed-off-by: Nithin D <nithind1988@yahoo.in> Reviewed-on: http://review.gluster.org/13787 Smoke: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
* cluster/ec: Provide an option to enable/disable eager lockAshish Pandey2016-03-201-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: If a fop takes lock, and completes its operation, it waits for 1 second before releasing the lock. However, If ec find any lock contention within this time period, it release the lock immediately before time expires. As we take lock on first brick, for few operations, like read, it might happen that discovery of lock contention might take long time and can degrades the performance. Solution: Provide an option to enable/disable eager lock. If eager lock is disabled, lock will be released as soon as fop completes. gluster v set <VOLUME NAME> disperse.eager-lock on gluster v set <VOLUME NAME> disperse.eager-lock off master- http://review.gluster.org/13605 Change-Id: I000985a787eba3c190fdcd5981dfbf04e64af166 BUG: 1318965 Signed-off-by: Ashish Pandey <aspandey@redhat.com> Reviewed-on: http://review.gluster.org/13773 Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* fuse: Add a new mount option capabilityPoornima G2016-03-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally all security.* xattrs were forbidden if selinux is disabled, which was causing Samba's acl_xattr module to not work, as it would store the NTACL in security.NTACL. To fix this http://review.gluster.org/#/c/12826/ was sent, which forbid only security.selinux. This opened up a getxattr call on security.capability before every write fop and others. Capabilities can be used without selinux, hence if selinux is disabled, security.capability cannot be forbidden. Hence adding a new mount option called capability. Only when "--capability" or "--selinux" mount option is used, security.capability is sent to the brick, else it is forbidden. Backport of : http://review.gluster.org/#/c/13540/ & http://review.gluster.org/#/c/13653/ BUG: 1309462 Change-Id: Ib8d4f32d9f1458f4d71a05785f92b526aa7033ff Signed-off-by: Poornima G <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/13626 Tested-by: Vijay Bellur <vbellur@redhat.com> Smoke: Gluster Build System <jenkins@build.gluster.com> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
* features/shard: Return ENOTSUP for unsupported fallocate flagsKrutika Dhananjay2016-03-091-1/+4
| | | | | | | | | | | | | | | | Backport of: http://review.gluster.org/13523 Basis: http://lists.gnu.org/archive/html/qemu-devel/2016-02/msg05101.html Change-Id: I681eb4d0c43c635cf96a2deab0996dec7a255fe5 BUG: 1299712 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/13652 Smoke: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Ravishankar N <ravishankar@redhat.com> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org> CentOS-regression: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>