path: root/xlators/nfs
Commit message | Author | Age | Files | Lines
* nfs: add NULL check for call state in nfs3_call_state_wipe | Jiffin Tony Thottan | 2017-08-11 | 1 | -3/+4

  Refcounting was added for the nfs call state in https://review.gluster.org/17696. That change assumed the call state is never NULL when it is freed. However, the gluster nfs server currently crashes in different scenarios at nfs3_getattr() with the following backtrace:

    #0  0x00007ff1cfea9205 in _gf_ref_put (ref=ref@entry=0x0) at refcount.c:36
    #1  0x00007ff1c1997455 in nfs3_call_state_wipe (cs=cs@entry=0x0) at nfs3.c:559
    #2  0x00007ff1c1998931 in nfs3_getattr (req=req@entry=0x7ff1bc0b26d0, fh=fh@entry=0x7ff1c2f76ae0) at nfs3.c:962
    #3  0x00007ff1c1998c8a in nfs3svc_getattr (req=0x7ff1bc0b26d0) at nfs3.c:987
    #4  0x00007ff1cfbfd8c5 in rpcsvc_handle_rpc_call (svc=0x7ff1bc03e500, trans=trans@entry=0x7ff1bc0c8020, msg=<optimized out>) at rpcsvc.c:695
    #5  0x00007ff1cfbfdaab in rpcsvc_notify (trans=0x7ff1bc0c8020, mydata=<optimized out>, event=<optimized out>, data=<optimized out>) at rpcsvc.c:789
    #6  0x00007ff1cfbff9e3 in rpc_transport_notify (this=this@entry=0x7ff1bc0c8020, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7ff1bc0038d0) at rpc-transport.c:538
    #7  0x00007ff1c4a2e3d6 in socket_event_poll_in (this=this@entry=0x7ff1bc0c8020, notify_handled=<optimized out>) at socket.c:2306
    #8  0x00007ff1c4a3097c in socket_event_handler (fd=21, idx=9, gen=19, data=0x7ff1bc0c8020, poll_in=1, poll_out=0, poll_err=0) at socket.c:2458
    #9  0x00007ff1cfe950f6 in event_dispatch_epoll_handler (event=0x7ff1c2f76e80, event_pool=0x5618154d5ee0) at event-epoll.c:572
    #10 event_dispatch_epoll_worker (data=0x56181551cbd0) at event-epoll.c:648
    #11 0x00007ff1cec99e25 in start_thread () from /lib64/libpthread.so.0
    #12 0x00007ff1ce56634d in clone () from /lib64/libc.so.6

  This patch moves the previous NULL check from __nfs3_call_state_wipe() to nfs3_call_state_wipe().

  Cherry picked from commit 111d6bda9259126b0429113c9b8ba479958a4398:
  > Change-Id: I2d73632f4be23f14d8467be3d908b09b3a2d87ea
  > BUG: 1479030
  > Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
  > Reviewed-on: https://review.gluster.org/17989
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Niels de Vos <ndevos@redhat.com>

  Change-Id: I2d73632f4be23f14d8467be3d908b09b3a2d87ea
  BUG: 1480594
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: https://review.gluster.org/18027
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
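  The backtrace above shows a NULL call state reaching the refcount helper. As a rough illustration only, the guard could look like the sketch below. All names (demo_call_state_t, demo_ref_put(), demo_call_state_wipe()) are hypothetical stand-ins, not the Gluster structures or functions, and the counter is not thread-safe.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted call state (not the Gluster type). */
typedef struct demo_call_state {
    int refs;       /* outstanding references */
    int *released;  /* optional flag set to 1 when the object is freed */
} demo_call_state_t;

/* Allocates a call state holding one reference; returns NULL on failure. */
demo_call_state_t *demo_call_state_new(int *released)
{
    demo_call_state_t *cs = calloc(1, sizeof(*cs));
    if (cs == NULL)
        return NULL;
    cs->refs = 1;
    cs->released = released;
    return cs;
}

/* Inner helper: dereferences cs unconditionally, like the crashing frame. */
static void demo_ref_put(demo_call_state_t *cs)
{
    assert(cs != NULL);
    if (--cs->refs == 0) {
        if (cs->released != NULL)
            *cs->released = 1;
        free(cs);
    }
}

/* Outer wipe: the NULL check sits here, before the refcount helper runs.
 * Returns 0 when a reference was dropped, -1 when cs was NULL. */
int demo_call_state_wipe(demo_call_state_t *cs)
{
    if (cs == NULL)
        return -1;
    demo_ref_put(cs);
    return 0;
}
```

  Placing the check in the outer function means callers on error paths, where no call state was ever allocated, never reach the helper that dereferences the pointer.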
* nfs/nlm: keep track of the call-state and frame for notifications | Niels de Vos | 2017-08-11 | 2 | -24/+84

  When blocking locks are used, a new frame is allocated that is used to send the notification to the client once the lock becomes available. In all other cases, the frame that contains the request from the client will be used for the reply.

  Because there was no way to track the different clients with their requests (captured in the call-state), the call-state could be free'd before the notification was sent to the client. This caused a use-after-free of the call-state and could trigger segfaults of the Gluster/NFS server or incorrect replies on (un)lock requests.

  By introducing a nlm4_notify_args structure, the call-state and frame can be tracked better. This prevents the possibility of segfaulting when the call-state is used after being free'd.

  Cherry picked from commit b81997264f079983fa02bd5fa2b3715224942b00:
  > BUG: 1467313
  > Change-Id: I285d2bc552f509e5145653b7a50afcff827cd612
  > Signed-off-by: Niels de Vos <ndevos@redhat.com>
  > Reviewed-on: https://review.gluster.org/17700
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>

  Change-Id: I285d2bc552f509e5145653b7a50afcff827cd612
  BUG: 1471870
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: https://review.gluster.org/17796
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
* nfs/nlm: use refcounting for nfs3_call_state_t | Niels de Vos | 2017-08-11 | 1 | -11/+35

  In order to track down a potential use-after-free of the nfs3_call_state_t structure in the NLM component, add reference counting where the structure is used. This should prevent premature free'ing of the structure.

  Cherry picked from commit 01bfdd4d1759423681d311da33f4ac2346ace445:
  > Change-Id: Ib1f13b0463ab1e012b7b49a623c91f0f3e73e1fb
  > BUG: 1467313
  > Signed-off-by: Niels de Vos <ndevos@redhat.com>
  > Reviewed-on: https://review.gluster.org/17699
  > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>

  Change-Id: Ib1f13b0463ab1e012b7b49a623c91f0f3e73e1fb
  BUG: 1471870
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: https://review.gluster.org/17795
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* nfs/nlm: handle reconnect for non-NLM4_LOCK requests | Niels de Vos | 2017-08-11 | 1 | -22/+79

  When a reply on an NLM-procedure gets stuck, the NFS-client will resend the request. This can happen through a re-connect in case the connection was terminated (long delay in the reply on the initial request). Once that happens, not all NLM-procedures are handled correctly.

  Testing this is difficult and time-consuming. There still may be problems with certain operations, but this definitely makes it behave much better than before.

  The problem occurred due to a problem in EC; change-id I18a782903ba addressed the root cause.

  Cherry picked from commit fafe1491ead527ba1024c521013aa90d2ee2b355:
  > Change-Id: I23b385568e27232951fa3fbd7198a0e5d775a8c2
  > BUG: 1467313
  > Signed-off-by: Niels de Vos <ndevos@redhat.com>
  > Reviewed-on: https://review.gluster.org/17698
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>

  Change-Id: I23b385568e27232951fa3fbd7198a0e5d775a8c2
  BUG: 1471870
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: https://review.gluster.org/17794
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* nfs/nlm: unref fds in nlm_client_free() | Niels de Vos | 2017-08-11 | 1 | -13/+12

  When a nlm_clnt is getting free'd, the FDs associated with this client should be unref'd as well.

  Cherry picked from commit e9a482f94e748ea12e73ddd2e275bad9aa314b4c:
  > Change-Id: Ifa4ea4b7ed45a454413cfc0c820f2516c534a9aa
  > BUG: 1467313
  > Signed-off-by: Niels de Vos <ndevos@redhat.com>
  > Reviewed-on: https://review.gluster.org/17697
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Amar Tumballi <amarts@redhat.com>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
  > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>

  Change-Id: Ifa4ea4b7ed45a454413cfc0c820f2516c534a9aa
  BUG: 1471870
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: https://review.gluster.org/17793
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
* nfs: make nfs3_call_state_t refcounted | Niels de Vos | 2017-08-11 | 3 | -39/+42

  There is no refcounting done of the nfs3_call_state_t structure, which seems to result in use-after-free problems in the NLM part of Gluster/NFS.

  The structure is initialized with two different functions; it is easier to have a single place to do this.

  The Gluster/NFS part will not use the refcounting, for now. This is being added to make the NLM code more stable. nfs3_call_state_wipe() will behave as before for Gluster/NFS, but cleanup is triggered through the refcounting now. This prevents major changes to the stable part of the NFS-server, and makes it possible to improve the NLM component separately.

  Cherry picked from commit daed52b8ebcac7ef36f11e944f83826f46593867:
  > Change-Id: I2e15bcf12af74e8a46c2727e4a160e9444d29ece
  > BUG: 1467313
  > Signed-off-by: Niels de Vos <ndevos@redhat.com>
  > Reviewed-on: https://review.gluster.org/17696
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Amar Tumballi <amarts@redhat.com>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>

  Change-Id: I2e15bcf12af74e8a46c2727e4a160e9444d29ece
  BUG: 1471870
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: https://review.gluster.org/17792
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
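  To make the idea of "a single place to initialize, cleanup triggered through the refcounting" concrete, here is a minimal sketch. It assumes a generic embedded-counter design with a release callback; the rc_* names are hypothetical and are not the Gluster refcount API. The counter is deliberately not atomic, so real multi-threaded code would need atomic operations or a lock.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical embedded reference counter with a release callback. */
typedef struct rc_ref rc_ref_t;
struct rc_ref {
    unsigned int count;
    void (*release)(rc_ref_t *ref);
};

/* Hypothetical call state; the counter is the first member so that a
 * pointer to it can be converted back to the containing structure. */
typedef struct rc_call_state {
    rc_ref_t ref;
    int xid;
} rc_call_state_t;

/* Number of call states released so far, for demonstration only. */
static int rc_released_objects = 0;

static void rc_call_state_release(rc_ref_t *ref)
{
    free((rc_call_state_t *)ref);
    rc_released_objects++;
}

/* Single place where the structure and its counter are initialized. */
rc_call_state_t *rc_call_state_init(int xid)
{
    rc_call_state_t *cs = calloc(1, sizeof(*cs));
    if (cs == NULL)
        return NULL;
    cs->ref.count = 1;
    cs->ref.release = rc_call_state_release;
    cs->xid = xid;
    return cs;
}

/* Takes a reference and returns the same pointer for convenient chaining. */
rc_call_state_t *rc_call_state_ref(rc_call_state_t *cs)
{
    assert(cs != NULL && cs->ref.count > 0);
    cs->ref.count++;
    return cs;
}

/* Drops a reference; returns 1 when this was the last one and the
 * release callback ran, 0 otherwise. */
int rc_call_state_unref(rc_call_state_t *cs)
{
    assert(cs != NULL && cs->ref.count > 0);
    if (--cs->ref.count == 0) {
        cs->ref.release(&cs->ref);
        return 1;
    }
    return 0;
}
```

  A caller that never takes extra references sees the old behaviour: the first unref frees the object. A component that needs the object to outlive the request takes its own reference and drops it when done.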
* nfs/nlm: remove lock request from the list after cancel | Niels de Vos | 2017-05-14 | 1 | -5/+11

  Once an NLM client cancels a lock request, it should be removed from the list. The list can also be cleaned of unneeded entries once the client does not have any outstanding lock/share requests/granted.

  Cherry picked from commit 71cb7f3eb4fb706aab7f83906592942a2ff2e924:
  > Change-Id: I2f2b666b627dcb52cddc6d5b95856e420b2b2e26
  > BUG: 1381970
  > Signed-off-by: Niels de Vos <ndevos@redhat.com>
  > Reviewed-on: https://review.gluster.org/17188
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>

  Change-Id: I2f2b666b627dcb52cddc6d5b95856e420b2b2e26
  BUG: 1450378
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: https://review.gluster.org/17273
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* nfs/nlm: free the nlm_client upon RPC_DISCONNECT | Niels de Vos | 2017-05-14 | 1 | -12/+20

  When an NLM client disconnects, it should be removed from the list and free'd.

  > Cherry picked from commit 6897ba5c51b29c05b270c447adb1a34cb8e61911:
  > Change-Id: Ib427c896bfcdc547a3aee42a652578ffd076e2ad
  > BUG: 1381970
  > Signed-off-by: Niels de Vos <ndevos@redhat.com>
  > Reviewed-on: https://review.gluster.org/17189
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>

  Change-Id: Ib427c896bfcdc547a3aee42a652578ffd076e2ad
  BUG: 1450378
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: https://review.gluster.org/17272
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* nfs/nlm: log the caller_name if nlm_client_t can be found | Niels de Vos | 2017-05-14 | 1 | -2/+4

  To help track down possibly misbehaving clients, log the 'caller_name' (hostname of the NFS client) that does not have a matching nlm_client_t structure.

  Cherry picked from commit 9bfb74a39954a7e63bfd762c816efc7e64b9df65:
  > Change-Id: Ib514a78d1809719a3d0274acc31ee632727d746d
  > BUG: 1381970
  > Signed-off-by: Niels de Vos <ndevos@redhat.com>
  > Reviewed-on: https://review.gluster.org/17186
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: soumya k <skoduri@redhat.com>
  > Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>

  Change-Id: Ib514a78d1809719a3d0274acc31ee632727d746d
  BUG: 1450378
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: https://review.gluster.org/17271
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* nfs/nlm: ignore notify when there is no matching rpc request | Niels de Vos | 2017-05-14 | 1 | -1/+7

  On certain (unclear) occasions, notifications seem to be sent to the Gluster/NFS NLM service while no call-state can be found. Instead of segfaulting, log an error but keep on running.

  Cherry picked from commit e997d752ba08f80b1b00d2c0035874befafe5200:
  > Change-Id: I0f186e56e46a86ca40314d230c1cc7719c61f0b5
  > BUG: 1381970
  > Signed-off-by: Niels de Vos <ndevos@redhat.com>
  > Reviewed-on: https://review.gluster.org/17185
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: soumya k <skoduri@redhat.com>
  > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
  > Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>

  Change-Id: I0f186e56e46a86ca40314d230c1cc7719c61f0b5
  BUG: 1450378
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: https://review.gluster.org/17270
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* nfs/nlm: unref rpc-client after nlm4svc_send_granted() | Niels de Vos | 2017-05-14 | 1 | -1/+1

  nlm4svc_send_granted() uses the rpc_clnt by getting it from the call-state structure. It is safer to unref the rpc_clnt after the function is done with it.

  Cherry picked from commit 52c28c0c04722a9ffaa7c39c49ffebdf0a5c75e1:
  > Change-Id: I7cb7c4297801463d21259c58b50d7df7c57aec5e
  > BUG: 1381970
  > Signed-off-by: Niels de Vos <ndevos@redhat.com>
  > Reviewed-on: https://review.gluster.org/17187
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: soumya k <skoduri@redhat.com>
  > Reviewed-by: Jeff Darcy <jeff@pl.atyp.us>

  Change-Id: I7cb7c4297801463d21259c58b50d7df7c57aec5e
  BUG: 1450378
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: https://review.gluster.org/17269
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
* nfs: make subdir mounting work for Solaris 10 clients | Bipin Kunal | 2017-04-27 | 2 | -30/+84

  This fixes the segfault caused by Solaris clients in Gluster/NFS.

  The volname was not being parsed properly: instead of the volume name, the complete path was being used in nfs_mntpath_to_xlator(). Fixed it by stripping the volume name from the complete path in nfs_mntpath_to_xlator().

  Renamed nfs3_funge_solaris_zerolen_fh() to nfs3_funge_webnfs_zerolen_fh(), as the zero-filled filehandle is specific to WebNFS. RFC: https://tools.ietf.org/html/rfc2055

  Solaris uses WebNFS; the zero-filled FH is defined in the WebNFS spec.

  Logic was also added in nfs3_funge_webnfs_zerolen_fh() to send the subdir path in glfs_resolve_at() instead of the complete path for a subdir mount.

  > Change-Id: I19aae3547b8910e7ed4974ee5385424cab3e834a
  > BUG: 1426667
  > Signed-off-by: Bipin Kunal <bkunal@redhat.com>
  > Reviewed-on: https://review.gluster.org/16770
  > Reviewed-by: Niels de Vos <ndevos@redhat.com>
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > (cherry picked from commit 40e571339b3c19ab2a5b6a93bc46eadf2252d006)

  Change-Id: I0adfb1555be0c5bb43941530c5d87a820929a3cf
  BUG: 1440278
  Signed-off-by: Bipin Kunal <bkunal@redhat.com>
  Reviewed-on: https://review.gluster.org/17018
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* gNFS: Keep the mountdict as long as the service is active | Niels de Vos | 2017-02-15 | 1 | -3/+7

  We initialize and take a ref once on mountdict during NFS/MNT3 server initialization, but seem to be unref'ing it for every UMNTALL request. This can lead to a crash when there are multiple UMNTALL requests with one or more active mount entries in the mountlist.

  Since we take the ref only once, we should keep the mountdict throughout the life of the process and unref it only during uninitialization of the mnt3 service.

  Cherry picked from commit a88ae92de190af0956013780939ba6bdfd509ff8:
  > Change-Id: I3238a8df09b8972e56dd93fee426d866d40d9959
  > BUG: 1421759
  > Signed-off-by: Soumya Koduri <skoduri@redhat.com>
  > Reviewed-on: https://review.gluster.org/16611
  > Smoke: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  > Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
  > Reviewed-by: Niels de Vos <ndevos@redhat.com>

  Change-Id: I3238a8df09b8972e56dd93fee426d866d40d9959
  BUG: 1422391
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: https://review.gluster.org/16625
  Reviewed-by: soumya k <skoduri@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* core: run many bricks within one glusterfsd process | Jeff Darcy | 2017-02-01 | 1 | -1/+3

  This patch adds support for multiple brick translator stacks running in a single brick server process. This reduces our per-brick memory usage by approximately 3x, and our appetite for TCP ports even more. It also creates potential to avoid process/thread thrashing, and to improve QoS by scheduling more carefully across the bricks, but realizing that potential will require further work.

  Multiplexing is controlled by the "cluster.brick-multiplex" global option. By default it's off, and bricks are started in separate processes as before. If multiplexing is enabled, then *compatible* bricks (mostly those with the same transport options) will be started in the same process.

  Backport of:
  > Change-Id: I45059454e51d6f4cbb29a4953359c09a408695cb
  > BUG: 1385758
  > Reviewed-on: https://review.gluster.org/14763

  Change-Id: I4bce9080f6c93d50171823298fdf920258317ee8
  BUG: 1418091
  Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-on: https://review.gluster.org/16496
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
* refcount: return pointer to the structure instead of a counter | Niels de Vos | 2016-12-11 | 1 | -4/+6

  There are no real users of the counter. It was thought of as a handy tool to track and debug refcounting, but it is not used at all. Some parts of the code would benefit from a pointer getting returned instead.

  BUG: 1399780
  Change-Id: I97e52c48420fed61be942ea27ff4849b803eed12
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/15971
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* afr,dht,ec: Replace GF_EVENT_CHILD_MODIFIED with event SOME_DESCENDENT_DOWN/UP | Poornima G | 2016-11-21 | 1 | -1/+2

  Currently these are the events related to child_up/down:
  GF_EVENT_CHILD_UP : Issued when any of the protocol clients connects.
  GF_EVENT_CHILD_MODIFIED : Issued by afr/dht/ec
  GF_EVENT_CHILD_DOWN : Issued when any of the protocol clients disconnects.

  These events get modified at the dht/afr/ec layers. Here is a brief on the same.

  DHT:
  - All the subvolumes reported once, and at least one child came up, then GF_EVENT_CHILD_UP is issued
  - connect GF_EVENT_CHILD_UP is issued
  - disconnect GF_EVENT_CHILD_MODIFIED is issued
  - All the subvolumes disconnected, GF_EVENT_CHILD_DOWN is issued

  AFR:
  - First subvolume came up, then GF_EVENT_CHILD_UP is issued
  - Subsequent subvolumes coming up, results in GF_EVENT_CHILD_MODIFIED
  - Any of the subvolumes go down, then GF_EVENT_SOME_CHILD_DOWN is issued
  - Last up subvolume goes down, then GF_EVENT_CHILD_DOWN is issued

  Until the patch [1] introduced GF_EVENT_SOME_CHILD_UP, GF_EVENT_CHILD_MODIFIED was issued by afr/dht when any of the subvolumes go up or down.

  Now with md-cache changes, there is a necessity to differentiate between child up and down. Hence, introducing GF_EVENT_SOME_DESCENDENT_DOWN/UP and getting rid of GF_EVENT_CHILD_MODIFIED.

  [1] http://review.gluster.org/12573

  Change-Id: I704140b6598f7ec705493251d2dbc4191c965a58
  BUG: 1396038
  Signed-off-by: Poornima G <pgurusid@redhat.com>
  Reviewed-on: http://review.gluster.org/15764
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: N Balachandran <nbalacha@redhat.com>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
* nfs: revalidate lookup converted to fresh lookup | Mohammed Rafi KC | 2016-11-10 | 4 | -10/+15

  When an inode ctx is missing for a linked inode, revalidate lookups are converted to fresh lookups. This could result in sending ESTALE when the gfids are recreated.

  We are not able to reproduce the issue with a normal setup; most of the RCA was done by code reading.

  Possible scenario in which this bug can be reproduced: delete a file and recreate a new file with the same name while, at the same time, another client process tries to list or access the file. In this case the second client may throw an ESTALE error for such files.

  Thanks to Soumya and Pranith for doing the complete RCA.

  Change-Id: I73992a65844b09a169cefaaedc0dcfb129d66ea1
  BUG: 1379720
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
  Reviewed-on: http://review.gluster.org/15580
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: soumya k <skoduri@redhat.com>
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* trivial: correct some spelling mistakes in comments and logs | Niels de Vos | 2016-10-18 | 1 | -1/+1

  BUG: 1385593
  Change-Id: Icfae9e557a284182c6c22e9606fdd641528906f0
  Reported-by: Patrick Matthäi <pmatthaei@debian.org>
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/15656
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
  Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
  Reviewed-by: Kotresh HR <khiremat@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
* build: out-of-tree builds generates files in the wrong directory | Kaleb S KEITHLEY | 2016-09-18 | 1 | -5/+8

  And minor cleanup of a few of the Makefile.am files while we're at it.

  Rewrite the make rules to do what xdrgen does. Now we can get rid of xdrgen.

  Note 1. netbsd6's sed doesn't do -i. Why are we still running smoke tests on netbsd6 and not netbsd7? We barely support netbsd7 as it is.

  Note 2. Why is/was libgfxdr.so (.../rpc/xdr/src/...) linked with libglusterfs? A cut-and-paste mistake? It has no references to symbols in libglusterfs.

  Note 3. "/#ifndef\|#define\|#endif/" (note the '\'s) is a _basic_ regex that matches the same lines as the _extended_ regex "/#(ifndef|define|endif)/". To match the extended regex sed needs to be run with -r on Linux; with -E on *BSD. However NetBSD's and FreeBSD's sed helpfully also provide -r for compatibility. Using a basic regex avoids having to use a kludge in order to run sed with the correct option on OS X.

  Note 4. Not copying the bit of xdrgen that inserts copyright/license boilerplate. AFAIK it's silly to pretend that machine generated files like these can be copyrighted or need license boilerplate. The XDR source files have their own copyright and license; and their copyrights are bound to be more up to date than old boilerplate inserted by a script. From what I've seen of other Open Source projects -- e.g. gcc and its C parser files generated by yacc and lex -- IIRC they don't bother to add copyright/license boilerplate to their generated files.

  It appears that it's a long-standing feature of make (SysV, BSD, gnu) for out-of-tree builds to helpfully pretend that the source files it can find in the VPATH "exist" as if they are in the $cwd. rpcgen doesn't work well in this situation and generates files with "bad" #include directives.

  E.g. if you `rpcgen ../../../../$srcdir/rpc/xdr/src/glusterfs3-xdr.x`, you get an #include directive in the generated .c file like this:

    ...
    #include "../../../../$srcdir/rpc/xdr/src/glusterfs3-xdr.h"
    ...

  which (obviously) results in compile errors on out-of-tree build because the (generated) header file doesn't exist at that location. Compared to `rpcgen ./glusterfs3-xdr.x` where you get:

    ...
    #include "glusterfs3-xdr.h"
    ...

  Which is what we need. We have to resort to some Stupid Make Tricks like the addition of various .PHONY targets to work around the VPATH "help".

  Warning: When doing an in-tree build, -I$(top_builddir)/rpc/xdr/... looks exactly like -I$(top_srcdir)/rpc/xdr/... Don't be fooled though. And don't delete the -I$(top_builddir)/rpc/xdr/... bits.

  Change-Id: Iba6ab96b2d0a17c5a7e9f92233993b318858b62e
  BUG: 1330604
  Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
  Reviewed-on: http://review.gluster.org/14085
  Tested-by: Niels de Vos <ndevos@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* nfs: fix unused variable warnings/errors | Kaleb S. KEITHLEY | 2016-08-23 | 2 | -2/+0

  http://review.gluster.org/14085 fixes a/the "leak" - via the generated rpc/xdr headers - of pragmas that mask these warnings. However 14085 won't pass the smoke test until all the warnings are fixed.

  Change-Id: I0e872a8025c3b1b5e2aa15d8fe66248e2fd96bf1
  BUG: 1369124
  Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
  Reviewed-on: http://review.gluster.org/15253
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
* nfs: Reset cs->resolvedhard while resolving an entry | Soumya Koduri | 2016-07-17 | 1 | -0/+1

  If an entry is not found in the inode table, the nfs xlator should resolve it by sending an explicit lookup to the brick process. But currently this is broken in the case of the NFS3_LOOKUP fop, wherein the server bails out early, resulting in sending pargfid attributes to the client.

  To fix this, reset 'cs->resolvedhard' so that an explicit lookup is done for the entry in the resume_fn "nfs3_lookup_resume()".

  Change-Id: I999f8bca7ad008526c174d13f69886dc809d9552
  Signed-off-by: Soumya Koduri <skoduri@redhat.com>
  BUG: 1356068
  Reviewed-on: http://review.gluster.org/14911
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* nfs: allow hostnames with dashes in exports/netgroups files | Niels de Vos | 2016-06-28 | 5 | -3/+27

  Hostnames with dashes (like "vagrant-testVM") are not correctly parsed when reading the exports/netgroups files. This becomes obvious when running ./run-tests-in-vagrant.sh because it causes tests/basic/mount-nfs-auth.t and tests/basic/netgroup_parsing.t to fail.

  The regex for hostname (in exports) and the entry and hostname (netgroups) parsing does not include the "-" sign, and hence the hostnames are split at it.

  BUG: 1350237
  Change-Id: I38146a283561e1fa386cc841c43fd3b1e30a87ad
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/14809
  Smoke: Gluster Build System <jenkins@build.gluster.org>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
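  The effect of a character class that lacks "-" can be reproduced with POSIX regular expressions. The sketch below is illustrative only: the two patterns are assumptions chosen for the demonstration, not the expressions used in the Gluster exports/netgroups parser, and first_match_len() is a hypothetical helper.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns the length of the first match of the POSIX extended regex
 * `pattern` in `text`, or -1 if the pattern does not compile or match. */
int first_match_len(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m;
    int len = -1;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, text, 1, &m, 0) == 0)
        len = (int)(m.rm_eo - m.rm_so);
    regfree(&re);
    return len;
}
```

  With the class "[a-zA-Z0-9.]+", matching against "vagrant-testVM" stops at the dash and yields only "vagrant" (7 characters). Adding the dash as the last character of the class, "[a-zA-Z0-9.-]+", matches all 14 characters of the hostname.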
* nfs: build exportlist with multiple groupnodes | Bipin Kunal | 2016-06-09 | 1 | -18/+66

  The EXPORT procedure of the MOUNT protocol does not correctly create structures for the 'groupnodes' in the reply. Each 'groupnode' should be a single entry in the 'nfs.rpc-auth-allow' volume option. Because the value is handled as a single string, the encoding of the groupnode->gr_name fails when the value of the volume option is longer than 255 characters.

  In the error case, encoding the EXPORTS reply fails, and the waiting 'showmount' command will not receive a reply and times out.

  Splitting the allowed entries and creating a groupnode for each one prevents the too long ->gr_name. This is following the structures for the EXPORTS reply in the MOUNT protocol more correctly as well.

  Note that the contents of ->gr_name is expected to be server dependent.

  Change-Id: Ibbabad581cc9aa00feb80fbbc851a1b10b28383d
  BUG: 1343286
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/14667
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: bipin kunal <kunalbipin@gmail.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
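  The splitting described above might look roughly like the following sketch. It assumes the allow list is a comma-separated string; demo_groupnode_t and the demo_group_list_* helpers are hypothetical and only mirror the idea of one list node per entry, each with a name of at most 255 characters.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() and strtok_r() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_NAME_MAX 255  /* per-entry limit mentioned in the commit message */

/* Hypothetical list node: one node per allowed entry. */
typedef struct demo_groupnode {
    char *name;
    struct demo_groupnode *next;
} demo_groupnode_t;

/* Frees a whole list; accepts NULL. */
void demo_group_list_free(demo_groupnode_t *head)
{
    while (head != NULL) {
        demo_groupnode_t *next = head->next;
        free(head->name);
        free(head);
        head = next;
    }
}

/* Counts the nodes in a list. */
size_t demo_group_list_len(const demo_groupnode_t *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* Splits a comma-separated allow list into one node per entry, keeping the
 * input order. Returns NULL for an empty list, on allocation failure, or
 * when a single entry exceeds DEMO_NAME_MAX characters. */
demo_groupnode_t *demo_group_list_build(const char *allow)
{
    demo_groupnode_t *head = NULL;
    demo_groupnode_t **tail = &head;
    char *copy, *tok, *save = NULL;

    assert(allow != NULL);
    copy = strdup(allow);  /* strtok_r() modifies its input */
    if (copy == NULL)
        return NULL;

    for (tok = strtok_r(copy, ",", &save); tok != NULL;
         tok = strtok_r(NULL, ",", &save)) {
        demo_groupnode_t *node;

        if (strlen(tok) > DEMO_NAME_MAX)
            goto fail;
        node = calloc(1, sizeof(*node));
        if (node == NULL)
            goto fail;
        node->name = strdup(tok);
        if (node->name == NULL) {
            free(node);
            goto fail;
        }
        *tail = node;
        tail = &node->next;
    }
    free(copy);
    return head;

fail:
    demo_group_list_free(head);
    free(copy);
    return NULL;
}
```

  Because each entry becomes its own node, the overall length of the option value no longer matters; only a single over-long entry is rejected.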
* nfs : store sattr properly in nfs3_setattr() call [Jiffin Tony Thottan, 2016-06-07, 2 files, -4/+6]
  nfs3_setattr stores the input arguments in cs->stbuf. However, inode/entry resolution code overwrites cs->stbuf after a successful resolution, thereby overwriting the input arguments with the iatt values stored on the backend. Hence operations like chmod/chown turn out to be a NOP.

  Specifically, the following functions overwrite cs->stbuf:
    nfs3_fh_resolve_inode_lookup_cbk
    nfs3_fh_resolve_entry_lookup_cbk

  Since we resort to inode resolution only when the inode is not found in the inode table, and the lru limit guards the number of inodes in the itable, we run into this issue only when the data set is bigger than the lru limit of the itable.

  The fix is to store the input arguments in a member other than cs->stbuf.

  Thanks Du for suggesting the fix.

  Change-Id: I7caef48839d4f177c3557d7823fc1d35c8294939
  BUG: 1318204
  Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
  Reviewed-on: http://review.gluster.org/14657
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: Atin Mukherjee <amukherj@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: soumya k <skoduri@redhat.com>
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  Tested-by: Kaleb KEITHLEY <kkeithle@redhat.com>
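The overwrite described above can be pictured with a reduced model. Everything in this sketch is hypothetical (struct layout, member name attr_in, function names); it only shows why keeping the request arguments in a separate member protects them from the resolution callback.

```c
#include <sys/types.h>

/* Reduced, hypothetical attribute set. */
struct attrs_sketch {
    mode_t mode;
    uid_t uid;
    gid_t gid;
};

/* Hypothetical call state: stbuf is shared with the resolver. */
struct call_state_sketch {
    struct attrs_sketch stbuf;   /* overwritten when resolution succeeds */
    struct attrs_sketch attr_in; /* request arguments, never touched by it */
};

/* Models a resolution callback that stores the backend attributes. */
void resolve_cbk_sketch(struct call_state_sketch *cs,
                        const struct attrs_sketch *on_disk)
{
    cs->stbuf = *on_disk;
}

/* The setattr path reads the requested mode from the separate member. */
mode_t mode_to_apply(const struct call_state_sketch *cs)
{
    return cs->attr_in.mode;
}
```

If the requested mode were read back from stbuf after resolution, it would equal the mode already on disk and the chmod would change nothing.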
* core, shard: Make shards inherit main file's O_DIRECT flag if present [Krutika Dhananjay, 2016-06-07, 2 files, -19/+15]
  If the application opens a file with O_DIRECT, the shards' anon fds would also need to inherit the flag. Towards this, shard xl would be passing the odirect flag in the @flags parameter to the WRITEV fop. This will be used in anon fd resolution and subsequent opening by posix xl.

  Change-Id: Iddb75c9ed14ce5a8c5d2128ad09b749f46e3b0c2
  BUG: 1342171
  Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
  Reviewed-on: http://review.gluster.org/14191
  Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* nfs: fix a typo in the help text for option rpc-auth.addr.reject [Dustin Black, 2016-06-06, 1 file, -1/+1]
  Added space to .description

  Reported-by: James Shubin <purpleidea@gmail.com>
  Change-Id: Ie4dd8774567ac4d8e1e8ec39aa3ab595d037101a
  BUG: 1005257
  Signed-off-by: Dustin Black <dblack@redhat.com>
  Reviewed-on: http://review.gluster.org/14621
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
* nfs: strip trailing / when clients do subdir mounts [Niels de Vos, 2016-05-23, 1 file, -0/+5]
  Mounting a volume over NFS with a subdir followed by a / does not work:

    # mount -t nfs -o vers=3 storage.example.com:/media/installation/ /mnt
    mount.nfs: an incorrect mount option was specified

  In the nfs.log:

    [client-rpc-fops.c:2930:client3_3_lookup_cbk] 0-media-client-0: remote operation failed. Path: /installation/ (00000000-0000-0000-0000-000000000000) [Invalid argument]
    [client-rpc-fops.c:2930:client3_3_lookup_cbk] 0-media-client-1: remote operation failed. Path: /installation/ (00000000-0000-0000-0000-000000000000) [Invalid argument]
    [mount3.c:1134:mnt3_resolve_subdir_cbk] 0-nfs: path=/installation/ (Invalid argument) [Invalid argument]

  It is not possible to resolve paths with a trailing /. Stripping trailing /'s from the subdir to mount is sufficient to make it work again.

  Change-Id: I4075d4cd351438de58e1ff81f0fb65a1ff076da4
  BUG: 1337597
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/14421
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
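Stripping the trailing slashes could look roughly like this. The function name is hypothetical, and keeping a lone "/" intact is an assumption made for the sketch rather than something stated in the commit message.

```c
#include <string.h>

/*
 * Remove trailing '/' characters from `path` in place.
 * Assumption for this sketch: a path consisting only of "/" is kept.
 */
void strip_trailing_slashes(char *path)
{
    size_t len = strlen(path);

    while (len > 1 && path[len - 1] == '/')
        path[--len] = '\0';
}
```

Applied to the subdir from the example above, "/installation/" becomes "/installation", which can then be resolved.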
* cluster/afr: Don't let NFS cache stat after writes [Pranith Kumar K, 2016-05-04, 3 files, -24/+2]
  Problem:
  Afr does post-ops after write but the stat buffer it unwinds is at the time of write, so if the nfs client caches this, it will see a different ctime when it does stat on the file after the post-op is done. From the NFS client's perspective the file has changed. Tar, which depends on this being correct, keeps giving the 'file changed as we read it' warning. If Afr instead chose to unwind after the post-op, eager-lock and delayed-post-op would have to be disabled, which would lead to bad performance for all write use cases.

  Fix:
  Don't let the client cache stat after write.

  Change-Id: Ic6062acc6e5cdd97a9c83c56bd529ec83cee8a23
  BUG: 1302948
  Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
  Signed-off-by: Anuradha Talur <atalur@redhat.com>
  Reviewed-on: http://review.gluster.org/13785
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* NFS: new option nfs.rdirplus added [Sakshi Bansal, 2016-04-07, 4 files, -1/+29]
  When this option is 'disabled', NFS falls back to standard readdir instead of readdirp

  Change-Id: Icaaf4da6533bee56160d4a81e42bb60f7d341945
  BUG: 1302948
  Signed-off-by: Sakshi Bansal <sabansal@redhat.com>
  Reviewed-on: http://review.gluster.org/13782
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* nfs: fix lock variable type [Prasanna Kumar Kalever, 2016-03-17, 1 file, -1/+1]
  The variable 'mountlock' should have a generic type since it is used by the LOCK_* macros; it can be either a spinlock or a mutex.

  Change-Id: If558bcf8debd98c4e1a615df0f9f0caec586e39b
  BUG: 1312346
  Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
  Reviewed-on: http://review.gluster.org/13532
  Smoke: Gluster Build System <jenkins@build.gluster.com>
  CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Gluster Build System <jenkins@build.gluster.com>
  NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* nfs: send lookup if inode_ctx is not set [Mohammed Rafi KC, 2016-01-13, 4 files, -15/+28]
  During resolving of an entry or inode, if the inode ctx was not set, we will send a lookup.

  This patch also makes sure that inode_ctx will be created after every inode_link.

  Change-Id: I137a7e2510635ff4ea6d007b671961341f89c949
  BUG: 1297311
  Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
  Reviewed-on: http://review.gluster.org/13224
  Reviewed-by: soumya k <skoduri@redhat.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Dan Lambright <dlambrig@redhat.com>
  Tested-by: Dan Lambright <dlambrig@redhat.com>
* nfs : Inform client to perform extra GETATTR call for 'T' files [Jiffin Tony Thottan, 2015-12-16, 1 file, -1/+15]
  Due to the changes from http://review.gluster.org/#/c/12722/, for a tier volume the readdirp will be sent only to the cold subvol, therefore the resulting list may contain 'T' files. For those files, performing an additional getattr call will populate the attributes correctly. This check should be based on the inode value passed from the readdirp (both T files and directories have a NULL value) and should skip directories.

  Change-Id: Ieb6724b05301cdbf0a0ef15ad9db51014faa0457
  BUG: 1291212
  Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
  Reviewed-on: http://review.gluster.org/12960
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: soumya k <skoduri@redhat.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
* debug/io-stats: Add FOP sampling feature [Richard Wareing, 2015-11-01, 3 files, -6/+15]
  Summary:
  - Using the sampling feature you can record details about every Nth FOP. The fields in each sample are: FOP type, hostname, uid, gid, FOP priority, port and time taken (latency) to fulfill the request.
  - Implemented using a ring buffer which is not (m/c) allocated in the IO path; this should make the sampling process pretty cheap.
  - DNS resolution done @ dump time not @ sample time for performance w/ cache
  - Metrics can be used for diagnostics, traffic/IO profiling as well as P95/P99 calculations
  - To control this feature there are two new volume options:
      diagnostics.fop-sample-interval - The sampling interval, e.g. 1 means sample every FOP, 100 means sample every 100th FOP
      diagnostics.fop-sample-buf-size - The size (in bytes) of the ring buffer used to store the samples. In the event more samples are collected in the stats dump interval than can be held in this buffer, the oldest samples shall be discarded. Samples are stored in the log directory under /var/log/glusterfs/samples.
  - Uses DNS cache written by sshreyas@fb.com (Thank you!); the DNS cache TTL is controlled by the diagnostics.stats-dnscache-ttl-sec option and defaults to 24hrs.

  Test Plan:
  - Valgrind'd to ensure it's leak free
  - Run prove test(s)
  - Shadow testing on 100+ brick cluster

  Change-Id: I9ee14c2fa18486b7efb38e59f70687249d3f96d8
  BUG: 1271310
  Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-on: http://review.gluster.org/12210
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Vijay Bellur <vbellur@redhat.com>
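The combination of "sample every Nth FOP" and "discard the oldest sample when the buffer is full" can be sketched with a small preallocated ring. All names, the sample fields, and the tiny slot count are hypothetical; the real translator sizes its buffer from the volume option and records more fields.

```c
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_BUF_SLOTS 4 /* tiny fixed capacity, for illustration only */

struct fop_sample {
    int fop_type;
    uint64_t latency_us;
};

struct sample_ring {
    struct fop_sample slots[SAMPLE_BUF_SLOTS]; /* preallocated storage */
    size_t next;       /* slot that the next sample is written to */
    size_t count;      /* number of valid samples, at most the capacity */
    uint64_t seen;     /* FOPs observed so far */
    uint64_t interval; /* record every interval-th FOP; 0 disables sampling */
};

/* Observe one FOP. Returns 1 if it was recorded, 0 if it was skipped. */
int ring_observe(struct sample_ring *r, int fop_type, uint64_t latency_us)
{
    r->seen++;
    if (r->interval == 0 || r->seen % r->interval != 0)
        return 0;

    /* No allocation here: a full ring overwrites its oldest slot. */
    r->slots[r->next].fop_type = fop_type;
    r->slots[r->next].latency_us = latency_us;
    r->next = (r->next + 1) % SAMPLE_BUF_SLOTS;
    if (r->count < SAMPLE_BUF_SLOTS)
        r->count++;
    return 1;
}

/* Return the oldest sample still held, or NULL when the ring is empty. */
const struct fop_sample *ring_oldest(const struct sample_ring *r)
{
    size_t idx;

    if (r->count == 0)
        return NULL;
    idx = (r->next + SAMPLE_BUF_SLOTS - r->count) % SAMPLE_BUF_SLOTS;
    return &r->slots[idx];
}
```

With an interval of 2 and ten observed FOPs, five samples are taken; the four-slot ring keeps the last four and the first one is overwritten.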
* nfs : avoid invalid usage of `cs` variable in nfs fops [Jiffin Tony Thottan, 2015-10-30, 1 file, -20/+20]
  Due to changes from http://review.gluster.org/#/c/12162/ a path variable was added to nfs3_log_common_res(), and usually `cs->resolvedloc.path` is passed for it. But in certain fop functions `cs` may not be filled in due to an error, and logging it through nfs3_log_common_res() then results in a crash. This patch fixes that.

  Change-Id: I5a709818923e7884bd04e329834ee352a1b3a58f
  BUG: 1276243
  Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
  Reviewed-on: http://review.gluster.org/12458
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: N Balachandran <nbalacha@redhat.com>
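One way to avoid dereferencing an unset call state when logging is to resolve the path through a small guard first. This is a sketch under assumptions: the struct layout, both helper names and the message format are hypothetical and do not mirror nfs3_log_common_res().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, reduced versions of the structures involved. */
struct loc_sketch {
    const char *path;
};

struct nfs_cs_sketch {
    struct loc_sketch resolvedloc;
};

/* Return the resolved path, or NULL when the call state was never set up. */
const char *cs_path_or_null(const struct nfs_cs_sketch *cs)
{
    return cs ? cs->resolvedloc.path : NULL;
}

/* Format a result message; a missing path is printed as "(unknown)". */
int format_result(char *buf, size_t len, const char *op, int status,
                  const char *path)
{
    return snprintf(buf, len, "%s: status %d, path: %s", op, status,
                    path ? path : "(unknown)");
}
```

A caller on an error path can then pass cs_path_or_null(cs) to the logging helper without checking `cs` itself.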
* core: use syscall wrappers instead of direct syscalls - miscellaneous [Kaleb S. KEITHLEY, 2015-10-28, 1 file, -2/+3]
  Various xlators and other components are invoking system calls directly instead of using the libglusterfs/syscall.[ch] wrappers.

  If not using the system call wrappers there should be a comment in the source explaining why the wrapper isn't used.

  Change-Id: I1f47820534c890a00b452fa61f7438eb2b3f667c
  BUG: 1267967
  Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
  Reviewed-on: http://review.gluster.org/12276
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
* xlators: add JSON FOP statistics dumps every N seconds [Richard Wareing, 2015-10-08, 1 file, -1/+19]
  Summary:
  - Adds a thread to the io-stats translator which dumps out statistics every N seconds, where N is configurable by an option called "diagnostics.stats-dump-interval"
  - Thread cleanly starts/stops when translator is unloaded
  - Updates macros to use "Atomic Builtins" (e.g. Intel CPU extensions) to use memory barriers to update counters vs using locks. This should reduce overhead and prevent any deadlock bugs due to lock contention.

  Test Plan:
  - Test on development machine
  - Run prove -v tests/basic/stats-dump.t

  Change-Id: If071239d8fdc185e4e8fd527363cc042447a245d
  BUG: 1266476
  Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-on: http://review.gluster.org/12209
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Avra Sengupta <asengupt@redhat.com>
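Lock-free counter updates of the kind mentioned above can be written with the GCC/Clang __sync builtins. The two wrapper functions are hypothetical names for this sketch and are not the macros from io-stats.

```c
#include <stdint.h>

/* Atomically add `delta` to `*counter` without taking a lock. */
void counter_add(uint64_t *counter, uint64_t delta)
{
    __sync_fetch_and_add(counter, delta);
}

/* Atomically read the counter (adding 0 returns the current value). */
uint64_t counter_read(uint64_t *counter)
{
    return __sync_fetch_and_add(counter, 0);
}
```

Because no lock is held while a counter is updated, a thread that dumps the statistics cannot deadlock against the IO path; it only observes each counter at some point in time.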
* build: export minimum symbols from xlators for correct resolution [Kaleb S. KEITHLEY, 2015-09-24, 2 files, -1/+23]
  We've been lucky that we haven't had any symbol collisions until now. Now we have a collision between the snapview-client's svc_lookup() and libntirpc's svc_lookup() with nfs-ganesha's FSAL_GLUSTER and libgfapi.

  As a short term solution all the snapview-client's FOP methods were changed to static scope. See http://review.gluster.org/11805. This works in snapview-client because all the FOP methods are defined in a single source file. This solution doesn't work for other xlators with FOP methods defined in multiple source files.

  To address this we link with libtool's '-export-symbols $symbol-file' (a wrapper around `ld --version-script ...`, on linux anyway) and only export the minimum required symbols from the xlator sharedlib.

  N.B. the libtool man page says that the symbol file should be named foo.sym, thus the rename of *.exports to *.sym. While foo.exports worked, we will follow the documentation.

  Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
  BUG: 1248669
  Change-Id: I1de68b3e3be58ae690d8bfb2168bfc019983627c
  Reviewed-on: http://review.gluster.org/11814
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: soumya k <skoduri@redhat.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* nfs : logging improvements [Manikandan Selvaganesh, 2015-09-11, 4 files, -347/+336]
  NFS log-warning messages are logged twice in cbk functions. Though the log messages are not exact duplicates, instead of logging twice they can be merged into one log message, and the other log message is removed from the cbk functions.

  Example:
  (1) W [nfs3.c:2075:nfs3svc_write_cbk] 0-nfs: 16f4dce6: /f.195 => -1 (Disk quota exceeded)
  (2) W [nfs3-helpers.c:3443:nfs3_log_write_res] 0-nfs-nfsv3: XID: 16f4dce6, WRITE: NFS: 69(Resource (quota) hard limit exceeded), POSIX: 122 (Disk quota exceeded), count: 0, UNSTABLE, wverf: 1381508849

  Here, the second message is more detailed, and is similar to (1). Since the file name is not present in (2), it is added to (2), and all messages of type (1) are then removed.

  Change-Id: I6028ab17b23948493a065dfad92fe4984548511f
  BUG: 1254146
  Signed-off-by: Manikandan Selvaganesh <mselvaga@redhat.com>
  Reviewed-on: http://review.gluster.org/11936
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
* nfs: Fixes "Remote I/O error" mount failures [Richard Wareing, 2015-09-01, 1 file, -1/+20]
  - Fixes an issue where NFS mounts fail with "Remote I/O error" after the target directory has been deleted and re-created after the gNFSd has already cached the inode of the first generation of the target directory.
  - The solution is to follow the guidance of the AFR2 comments and refresh the inode by deleting it from cache and looking it up again.

  BUG: 1258196
  Change-Id: I9c7d8bd460ee9e5ea0b5b47d23886b1afcdcd563
  Reported-by: Richard Wareing <rwareing@fb.com>
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/12046
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* all: reduce "inline" usage [Jeff Darcy, 2015-09-01, 3 files, -6/+6]
  There are three kinds of inline functions: plain inline, extern inline, and static inline. All three have been removed from .c files, except those in "contrib" which aren't our problem. Inlines in .h files, which are overwhelmingly "static inline" already, have generally been left alone. Over time we should be able to "lower" these into .c files, but that has to be done in a case-by-case fashion requiring more manual effort. This part was easy to do automatically without (as far as I can tell) any ill effect.

  In the process, several pieces of dead code were flagged by the compiler, and were removed.

  Change-Id: I56a5e614735c9e0a6ee420dab949eac22e25c155
  BUG: 1245331
  Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
  Reviewed-on: http://review.gluster.org/11769
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Dan Lambright <dlambrig@redhat.com>
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
  Reviewed-by: Venky Shankar <vshankar@redhat.com>
* rpc: add owner xlator argument to rpc_clnt_new [Krishnan Parthasarathi, 2015-08-12, 1 file, -1/+1]
  The @owner argument tells the RPC layer which xlator owns the connection and to which xlator THIS needs to be set during network notifications like CONNECT and DISCONNECT.

  Code paths that originate from the head of a (volume) graph and use STACK_WIND ensure that the RPC local endpoint has the right xlator saved in the frame of the call (callback pair). This guarantees that the callback is executed in the right xlator context.

  The client handshake process, which includes fetching of brick ports from glusterd and setting lk-version on the brick for the session, doesn't have the correct xlator set in its frames. The problem lies with RPC notifications. They have no provision to set THIS to the xlator that is registered with the corresponding RPC programs. E.g., an RPC_CLNT_CONNECT event received by protocol/client doesn't have THIS set to its xlator. This implies that call(-callbacks) originating from this thread don't have the right xlator set either.

  The fix is to save the xlator registered with the RPC connection during rpc_clnt_new. E.g., protocol/client's xlator would be saved with the RPC connection that it 'owns'. RPC notifications such as CONNECT, DISCONNECT, etc. inherit THIS from the RPC connection's xlator.

  Change-Id: I9dea2c35378c511d800ef58f7fa2ea5552f2c409
  BUG: 1235582
  Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com>
  Reviewed-on: http://review.gluster.org/11436
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
* nfs: log disabled export/netgroups feature as INFO [Jiffin Tony Thottan, 2015-07-19, 1 file, -1/+1]
  If the export/netgroups feature is disabled for gluster/nfs, the "nfs.log" contains a Warning message which is misleading for users. Logging the message as Info is sufficient.

  Change-Id: I3d07e8bc4f09f3eb32014f5a10390d0484b838cf
  BUG: 1243805
  Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
  Reviewed-on: http://review.gluster.org/11695
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* nfs: a unix-domain-socket should not be created as fifo [Niels de Vos, 2015-06-28, 1 file, -7/+8]
  Change-Id: Ic6a23165df1703b330636a059967c3c674dbde57
  BUG: 1235231
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/11355
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* nfs: refcount each auth_cache_entry and related data_t [Niels de Vos, 2015-06-28, 1 file, -9/+86]
  This makes sure that all the auth_cache_entry structures are only free'd when there is no reference to them anymore. When an entry is free'd, the associated data_t from the auth_cache->cache_dict gets unref'd too.

  Upon calling auth_cache_purge(), the auth_cache->cache_dict will free each auth_cache_entry in a secure way.

  Change-Id: If097cc11838e43599040f5414f82b30fc0fd40c6
  BUG: 1226717
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/11023
  Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
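The lifetime rule described above (free the entry only when the last reference is dropped, and release the associated data at the same time) can be sketched as follows. The struct and function names are hypothetical, and this reduced version is not thread-safe; in real use the counter would need a lock or atomic operations.

```c
#include <stdlib.h>

/* Hypothetical refcounted entry that owns a piece of associated data. */
struct cache_entry_sketch {
    int refcount;
    void *data;
    void (*release_data)(void *data); /* may be NULL */
};

/* Create an entry holding one reference for the caller. */
struct cache_entry_sketch *entry_new(void *data, void (*release_data)(void *))
{
    struct cache_entry_sketch *e = calloc(1, sizeof(*e));

    if (!e)
        return NULL;
    e->refcount = 1;
    e->data = data;
    e->release_data = release_data;
    return e;
}

/* Take an additional reference. */
struct cache_entry_sketch *entry_ref(struct cache_entry_sketch *e)
{
    e->refcount++;
    return e;
}

/*
 * Drop one reference. Returns 1 if this was the last reference and the
 * entry (plus its data) was released, 0 otherwise.
 */
int entry_unref(struct cache_entry_sketch *e)
{
    if (--e->refcount > 0)
        return 0;
    if (e->release_data)
        e->release_data(e->data);
    free(e);
    return 1;
}
```

A purge of the cache then only has to drop the cache's own reference; entries that are still in use by a request stay valid until that request drops its reference as well.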
* nfs: add a gf_lock_t for the auth_cache->cache_dict [Niels de Vos, 2015-06-28, 2 files, -40/+126]
  This is the 1st step towards implementing reference counters for the auth_cache_entry structure. Access to the structures should always be done atomically, but this can not be guaranteed by a dict.

  Change-Id: Ic165221d72f11832177976c989823d861cf12f01
  BUG: 1226717
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/11021
  Tested-by: NetBSD Build System <jenkins@build.gluster.org>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
* nfs: Authentication performance improvements [Shreyas Siravara, 2015-06-16, 1 file, -3/+3]
  When file operations are sent to the NFS server, authorized filehandles are cached using the exportid, mountid, gfid and host as the key to the cache. This meant that any file OR directory will always fail on the *first* fop to that filehandle since the cache used the gfid as part of the key to the cache. However, if an export is authorized, this effectively means that ALL subdirectories and files in the export directory are authorized per the permissions of the export.

  This results in slow times when walking a directory structure over an NFS mount.

  Change-Id: Iad811ad7255b454d1712e75a637478401d40791e
  BUG: 1232165
  Signed-off-by: Shreyas Siravara <sshreyas@fb.com>
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/11245
  Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
* nfs: allocate and return the hashkey for the auth_cache_entry [Niels de Vos, 2015-06-03, 1 file, -7/+25]
  The allocation of the hashkey was never returned to the calling function. Allocating it with alloca() puts it on the stack; returning from the function makes the pointer invalid.

  Functions that are annotated with "inline" and call alloca() will not always be inlined. Returning a pointer allocated with alloca() is in those cases not correct. One such confirmation was provided by GCC developer Alexandre Oliva:
  - http://gcc.gnu.org/ml/gcc-help/2004-04/msg00158.html

  It is more correct to call GF_MALLOC() and GF_FREE() for the hashkey. If this would result in a performance hit, we can always think of using alloca() again and turn make_hashkey() into a macro (yuck).

  Change-Id: Ia86a1f79d33240af4713bfb92f702b0ee6e87eb7
  BUG: 1226714
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/11019
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
  Reviewed-by: soumya k <skoduri@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
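Returning a heap-allocated key that the caller frees could look like the sketch below. It uses plain malloc()/free() instead of GF_MALLOC()/GF_FREE() so that it compiles on its own, and the function name, parameters and key layout are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Build a key of the form "<exportid>:<mountid>:<host>" on the heap.
 * The caller owns the result and must free() it. Returns NULL on error.
 * Memory from alloca() would become invalid as soon as this function
 * returns, which is why the key is not placed on the stack.
 */
char *make_hashkey_sketch(const char *exportid, const char *mountid,
                          const char *host)
{
    char *key;
    int len = snprintf(NULL, 0, "%s:%s:%s", exportid, mountid, host);

    if (len < 0)
        return NULL;

    key = malloc((size_t)len + 1);
    if (!key)
        return NULL;

    snprintf(key, (size_t)len + 1, "%s:%s:%s", exportid, mountid, host);
    return key;
}
```

The cost compared to alloca() is one allocation and one free per lookup, which is the performance trade-off the commit message mentions.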
* build: do not #include "config.h" in each file [Niels de Vos, 2015-05-29, 22 files, -110/+0]
  Instead of including config.h in each file, have config.h included from the compiler command line (-include option).

  When a .c file tests for a certain #define, and config.h was not included, incorrect assumptions were made. With this change, it can not happen again.

  BUG: 1222319
  Change-Id: I4f9097b8740b81ecfe8b218d52ca50361f74cb64
  Signed-off-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-on: http://review.gluster.org/10808
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Tested-by: NetBSD Build System
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
* nfs: Use uuid_clear() instead of memset [Vijay Bellur, 2015-05-16, 1 file, -6/+2]
  Fixes the following incorrect usage:

    mount3.c: In function '__mnt3_build_mountid_from_path':
    mount3.c:705:24: warning: 'sizeof' on array function parameter 'mountid' will return size of 'unsigned char *' [-Wsizeof-array-argument]
            length = sizeof(mountid);
                     ^
    mount3.c:699:58: note: declared here
     __mnt3_build_mountid_from_path (const char *path, uuid_t mountid)
                                                              ^
    mount3.c: In function '__mnt3_get_mount_id':
    mount3.c:732:24: warning: 'sizeof' on array function parameter 'mountid' will return size of 'unsigned char *' [-Wsizeof-array-argument]
            length = sizeof(mountid);
                     ^
    mount3.c:726:46: note: declared here
     __mnt3_get_mount_id (xlator_t *mntxl, uuid_t mountid)

  Change-Id: I08f46c5994578fc99a7b61681e808d1115e41d71
  BUG: 1221095
  Signed-off-by: Vijay Bellur <vbellur@redhat.com>
  Reviewed-on: http://review.gluster.org/10765
  Reviewed-by: Niels de Vos <ndevos@redhat.com>
  Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
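The warning comes from C's rule that an array parameter is adjusted to a pointer, so sizeof on the parameter yields the pointer size. The sketch below uses a hypothetical 16-byte array type as a stand-in for uuid_t and sizes the memset by the type instead of the parameter; the helper names are made up for illustration.

```c
#include <string.h>

/* Stand-in for uuid_t: an array type of 16 unsigned chars. */
typedef unsigned char id16_t[16];

/*
 * Inside this function `id` is really an `unsigned char *`, so
 * sizeof(id) would be the pointer size (commonly 8), not 16.
 * Using the size of the array type clears all 16 bytes.
 */
void clear_id(id16_t id)
{
    memset(id, 0, sizeof(id16_t));
}

/* Return 1 if all 16 bytes are zero, 0 otherwise. */
int id_is_null(const id16_t id)
{
    size_t i;

    for (i = 0; i < sizeof(id16_t); i++) {
        if (id[i] != 0)
            return 0;
    }
    return 1;
}
```

A dedicated helper such as uuid_clear() avoids the question altogether, because the size is fixed inside the helper rather than computed at each call site.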
* nfs.c nfs3.c: port log messages to a new framework [Hari Gowtham, 2015-05-08, 3 files, -375/+585]
  Change-Id: I9ddb90d66d3ad3adb2916c0c949834794ee7bdf3
  BUG: 1194640
  Signed-off-by: Hari Gowtham <hari.gowtham005@gmail.com>
  Reviewed-on: http://review.gluster.org/10216
  Tested-by: Gluster Build System <jenkins@build.gluster.com>
  Reviewed-by: Meghana M <mmadhusu@redhat.com>
  Reviewed-by: Niels de Vos <ndevos@redhat.com>