glusterfs.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	features/quota: Fix NULL dereference error reported by coverity.	Poornima	2014-02-10	1	-0/+4
\| \| \| \| \| \| \| \| \|	Change-Id: I48c1ebcb3261fa5da721e5b48d17c6c873df256d BUG: 789278 Signed-off-by: Poornima <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/6907 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	features/marker-quota: more stringent error handling in rename.	Raghavendra G	2014-02-08	1	-10/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an error occurs and op_errno is not set to non-zero value, we can end up in loosing a frame resulting in a hung syscall. This patch adds code setting op_errno appropriately in storage/posix and makes marker to set err to a default non-zero value in case of op_errno being zero. Change-Id: Idc2c3e843b932709a69b32ba67deb284547168f2 BUG: 833586 Signed-off-by: Raghavendra G <raghavendra@gluster.com> Reviewed-on: http://review.gluster.org/5032 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	features/quota: fix crash in error handling after building ancestry.	Raghavendra G	2014-02-08	1	-1/+1
\| \| \| \| \| \| \| \| \|	Change-Id: Ifbebf1aa496d49a6c4bb30258b83aaf9792828e5 BUG: 1059833 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/6923 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	features/index: Close directories left open on error.	Christopher R. Hertel	2014-02-07	1	-3/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Two directory streams are opened, but not always closed if the function exits with an error. This patch aims to ensure that the directories are closed in call cases. Some return values may be explicitly discarded. This is done to avoid having a successful call to closedir() overwrite the value of ret. Also, the errno value is preserved in case a calling function needs to check errno. BUG: 789278 CID: 1124710 CID: 1124711 Change-Id: I6bf3b5c9c6a1ec9a99cc9178243ea98572c2bac2 Signed-off-by: Christopher R. Hertel <crh@redhat.com> Reviewed-on: http://review.gluster.org/6883 Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	changelog: When connection is unsuccessful, close socket too.	Raghavendra Talur	2014-02-07	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Bug fix for Coverity CID: 1124791 Change-Id: I0362d45123ebc250290f3a5231f7fb113fa41212 BUG: 789278 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/6900 Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Santosh Pradhan <spradhan@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	changelog: Restrict length of unix socket files to UNIX_PATH_MAX.	Raghavendra Talur	2014-02-05	4	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Bug fix for Coverity-CID 1124847. Change-Id: I410ef8e06cbc491b1f72535298fae5e9bc77220d BUG: 789278 Signed-off-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-on: http://review.gluster.org/6870 Reviewed-by: Venky Shankar <vshankar@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/index: Loop exit without freeing in-use memory.	Christopher R. Hertel	2014-02-04	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the while loop is exited via the goto, the call to closedir(3) is skipped, leaving an open directory stream. This patch ensure that the directory is closed if the loop is exited via the goto. BUG: 789278 CID: 1124763 Change-Id: Iec5da4b8d32c0a93c10d035ff38d0c063fd97cda Signed-off-by: Christopher R. Hertel <crh@redhat.com> Reviewed-on: http://review.gluster.org/6828 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	quota: Fixing a possibility of GF_FREE on an array pointer	Lalatendu Mohanty	2014-02-03	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	Issue: when the charecter pointer is not heap allocated by the program we should not do GF_FREE on it. Change-Id: Ibccc25491a9ab924fd5b82c14566f551a061ec2e BUG: 789278 Signed-off-by: Lalatendu Mohanty <lmohanty@redhat.com> Reviewed-on: http://review.gluster.org/6741 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	cluster/afr: Add gfid to index in wind un-conditionally	Pranith Kumar K	2014-01-29	1	-9/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: If brick crashes just after performing xattrop before reaching index _cbk then the gfid will not be added to/deleted from index. If it is not deleted it will be deleted when self-heal is triggered. But if the crash happens before the file is added to index, user may think the system does not need self-heal even when there is this file that needs to be healed. Fix: Add file to index un-conditionally in wind phase. This way the file remains in index even when brick process crashes before reaching _cbk. This does not affect performance because of caching(check _index_action) built into index xlator. Change-Id: Ie83ac6aa1ac0ff66862e757864865b47ab39404d BUG: 1058713 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/6836 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	features/qemu-block: Remove unref of freed iobuf	Poornima	2014-01-29	1	-1/+0
\| \| \| \| \| \| \| \| \| \|	Change-Id: I507a6504b379eef54be77b54d6e2beee63975ebf BUG: 789278 Signed-off-by: Poornima <pgurusid@redhat.com> Reviewed-on: http://review.gluster.org/6824 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Talur <rtalur@redhat.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	features/quota: Correct the statfs calculation when set limit is beyond ...	Varun Shastry	2014-01-28	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	actual disk space. Fixed one of the cases where 'df' values data show wrong when the quota limit is greater than back-end disk space. Change-Id: I09fb71a37602c6f3daf6b91dd3fd19b7f5f76817 BUG: 969461 Signed-off-by: Varun Shastry <vshastry@redhat.com> Reviewed-on: http://review.gluster.org/6126 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/gfid-access: populating inode during virtual_lookup_cbk.	Ajeet Jha	2014-01-28	2	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	Setting appropriate ia_type and gfid for the inode, obtained during virtual_lookup_cbk of a directory by doing an "inode_link". Change-Id: I9582570c82e70ff5f1d4e441f9e9868ef82e9dc6 BUG: 1054199 Signed-off-by: Ajeet Jha <ajha@redhat.com> Reviewed-on: http://review.gluster.org/6723 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@gmail.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/gfid-access: fix lookup on .gfid/<parent>/bname	Venky Shankar	2014-01-27	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In gfid translator, lookup was not handling the case when the lookup is sent on .gfid/<parent>/bname. In this case, we flip with fake inode of the parent with the real inode in loc and send it downwards. Change-Id: I639ff1dce10ffc045da419e333d455e208b6a0f0 BUG: 1057881 Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-on: http://review.gluster.org/6795 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	mgmt/glusterd: make sure quota enforcer has established connection with ↵	Raghavendra G	2014-01-25	2	-6/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	quotad before marking quota as enabled. without this patch there is a window of time when quota is marked as enabled in quota-enforcer, but connection to quotad wouldn't have been established. Any checklimit done during this period can result in a failed fop because of unavailability of quotad. Change-Id: I0d509fabc434dd55ce9ec59157123524197fcc80 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 969461 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/6572 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/quota: maintain correct link_count in ancestry building	Raghavendra G	2014-01-25	1	-53/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	codepath. Ancestry building codepath can be executed simultaneously when a file has hardlinks. So, following fixes are needed: * modify local->link_count under locks in ancestry building codepath. * before invoking check_limit on newly constructed parents, link_count should not be set, but instead incremented by as many number of new invocations of check_limit. * decrementing link_count and the check whether the count has hit zero to resume the waiting call_stub should be atomic. Change-Id: I6f52e50547362a5ded6bd9f085570223aea372d1 BUG: 969461 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/6491 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/quota: remove in-memory accounting of files in enforcer	Raghavendra G	2014-01-25	2	-221/+210
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Accounting was done in enforcer (though marker is the ultimate source of truth) to offset cached directory size becoming stale. However, with enforcer being moved to brick we can no longer maintain correct cluster wide size for a directory. Hence removing accounting code from enforcer. Change-Id: I5ea94234da4da85ed5f5ced1354d8de3454b3fcb BUG: 969461 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/6434 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/quota: use STACK_WIND_TAIL when quota is turned off.	Raghavendra G	2014-01-24	1	-168/+207
\| \| \| \| \| \| \| \| \|	Change-Id: I8a0b7f3a1995a72560c210efbad1eaafb0bdf329 BUG: 969461 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/6373 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	gfid-access: fix the issue of entry creation with wrong gfid	Amar Tumballi	2014-01-23	1	-6/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* dict_set was happening with a string instead of 'uuid_t' causing entry creations to happen with wrong gfid * revalidate was causing excessive ESTALE logs as lookup was happening on '.gfid/' path itself causing server to become clueless Change-Id: I3b76ce7fdec9c2ff785be21f506f859f489f80f0 BUG: 1054199 Signed-off-by: Amar Tumballi <amarts@redhat.com> Reviewed-on: http://review.gluster.org/6520 Reviewed-by: Anand Avati <avati@redhat.com> Tested-by: Anand Avati <avati@redhat.com>
*	quota: filter glusterfs quota xattrs	Susant Palai	2014-01-22	1	-34/+22
\| \| \| \| \| \| \| \| \|	Change-Id: I86ebe02735ee88598640240aa888e02b48ecc06c BUG: 1040423 Signed-off-by: Susant Palai <spalai@redhat.com> Reviewed-on: http://review.gluster.org/6490 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	locks: set @lock->frame = NULL when lock is granted	Anand Avati	2014-01-22	2	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	This way disconnect cleanup code can differentiate which locks are granted vs blocked. Change-Id: I2a835c6865b6c804231d852953ea84eeccef35a3 BUG: 849630 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6730 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
*	quota: get directory size before enforcing quota on rename	Krishnan Parthasarathi	2014-01-20	2	-12/+72
\| \| \| \| \| \| \| \| \| \|	Change-Id: If18cab5992ddc91457782786942971deb1b51ead BUG: 1023974 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/6155 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	syncop: Change return value of syncop	Pranith Kumar K	2014-01-19	3	-17/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: We found a day-1 bug when syncop_xxx() infra is used inside a synctask with compilation optimization (CFLAGS -O2). Detailed explanation of the Root cause: We found the bug in 'gf_defrag_migrate_data' in rebalance operation: Lets look at interesting parts of the function: int gf_defrag_migrate_data (xlator_t this, gf_defrag_info_t defrag, loc_t loc, dict_t migrate_data) { ..... code section - [ Loop ] while ((ret = syncop_readdirp (this, fd, 131072, offset, NULL, &entries)) != 0) { ..... code section - [ ERRNO-1 ] (errno of readdirp is stored in readdir_operrno by a thread) /* Need to keep track of ENOENT errno, that means, there is no need to send more readdirp() / readdir_operrno = errno; ..... code section - [ SYNCOP-1 ] (syncop_getxattr is called by a thread) ret = syncop_getxattr (this, &entry_loc, &dict, GF_XATTR_LINKINFO_KEY); code section - [ ERRNO-2] (checking for failures of syncop_getxattr(). This may not always be executed in same thread which executed [SYNCOP-1]) if (ret < 0) { if (errno != ENODATA) { loglevel = GF_LOG_ERROR; defrag->total_failures += 1; ..... } the function above could be executed by thread(t1) till [SYNCOP-1] and code from [ERRNO-2] can be executed by a different thread(t2) because of the way syncop-infra schedules the tasks. when the code is compiled with -O2 optimization this is the assembly code that is generated: [ERRNO-1] 1165 readdir_operrno = errno; <<---- errno gets expanded as (__errno_location()) 0x00007fd149d48b60 <+496>: callq 0x7fd149d410c0 <address@hidden> 0x00007fd149d48b72 <+514>: mov %rax,0x50(%rsp) <<------ Address returned by __errno_location() is stored in a special location in stack for later use. 0x00007fd149d48b77 <+519>: mov (%rax),%eax 0x00007fd149d48b79 <+521>: mov %eax,0x78(%rsp) .... [ERRNO-2] 1281 if (errno != ENODATA) { 0x00007fd149d492ae <+2366>: mov 0x50(%rsp),%rax <<----- Because it already stored the address returned by __errno_location(), it just dereferences the address to get the errno value. BUT THIS CODE NEED NOT BE EXECUTED BY SAME THREAD!!! 0x00007fd149d492b3 <+2371>: mov $0x9,%ebp 0x00007fd149d492b8 <+2376>: mov (%rax),%edi 0x00007fd149d492ba <+2378>: cmp $0x3d,%edi The problem is that __errno_location() value of t1 and t2 are different. So [ERRNO-2] ends up reading errno of t1 instead of errno of t2 even though t2 is executing [ERRNO-2] code section. When code is compiled without any optimization for [ERRNO-2]: 1281 if (errno != ENODATA) { 0x00007fd58e7a326f <+2237>: callq 0x7fd58e797300 <address@hidden><<--- As it is calling __errno_location() again it gets the location from t2 so it works as intended. 0x00007fd58e7a3274 <+2242>: mov (%rax),%eax 0x00007fd58e7a3276 <+2244>: cmp $0x3d,%eax 0x00007fd58e7a3279 <+2247>: je 0x7fd58e7a32a1 <gf_defrag_migrate_data+2287> Fix: Make syncop_xxx() return (-errno) value as the return value in case of errors and all the functions which make syncop_xxx() will need to use (-ret) to figure out the reason for failure in case of syncop_xxx() failures. Change-Id: I314d20dabe55d3e62ff66f3b4adb1cac2eaebb57 BUG: 1040356 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/6475 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	build: Start using library versioning for various libraries	Harshavardhana	2014-01-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	According to libtool three individual numbers stand for CURRENT:REVISION:AGE, or C:R:A for short. The libtool script typically tacks these three numbers onto the end of the name of the .so file it creates. The formula for calculating the file numbers on Linux and Solaris is /path/to/library/<library_name>.(C - A).(A).(R) As you release new versions of your library, you will update the library's C:R:A. Although the rules for changing these version numbers can quickly become confusing, a few simple tips should help keep you on track. The libtool documentation goes into greater depth. In essence, every time you make a change to the library and release it, the C:R:A should change. A new library should start with 0:0:0. Each time you change the public interface (i.e., your installed header files), you should increment the CURRENT number. This is called your interface number. The main use of this interface number is to tag successive revisions of your API. The AGE number is how many consecutive versions of the API the current implementation supports. Thus if the CURRENT library API is the sixth published version of the interface and it is also binary compatible with the fourth and fifth versions (i.e., the last two), the C:R:A might be 6:0:2. When you break binary compatibility, you need to set AGE to 0 and of course increment CURRENT. The REVISION marks a change in the source code of the library that doesn't affect the interface-for example, a minor bug fix. Anytime you increment CURRENT, you should set REVISION back to 0. Change-Id: Id72e74c1642c804fea6f93ec109135c7c16f1810 BUG: 862082 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/5645 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/quota: Handle the corner case in statfs call	Varun Shastry	2014-01-17	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Problem: Even though limit is not set quota used to send 'quota-deem-statfs' key in the dictionary resulting in incorrect calculations. Change-Id: I643cb35cca6648e40f1c745135280876958ddaa9 BUG: 1046894 Signed-off-by: Varun Shastry <vshastry@redhat.com> Reviewed-on: http://review.gluster.org/6607 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	quota: fix recording of last alert log message	Krishnan Parthasarathi	2014-01-16	1	-6/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PROBLEM: Alert log messages, corresponding to disk usage crossing soft-limit on a directory, weren't being logged as often as expected. CAUSE: The mechanism in place to log alert messages, once every alert-time seconds, set the previous logged time incorrectly. FIX: Update previous logged time only if we logged an alert message, ie. when the "time was right" to alert. Change-Id: Iafcb7be11e3240e2d04b0bb29ddb88e4f4a1bd25 BUG: 969461 Signed-off-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-on: http://review.gluster.org/6532 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/quota: Metadata cleanup	Varun Shastry	2014-01-16	1	-4/+139
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Quota and marker uses 'trusted.glusterfs.quota' and 'trusted.pgfid' xattrs to store its configurations and accounting information and also to build the parent inode chain in case of absense of path. Problem: After disabling and then enabling quota back, the xattrs may contain stale data leading to impaired accounting and thus improper enforcement. Solution: Clean up all the quota related xattrs after quota disable. Marker xlator implements a virtual xattr to cleanup quota and pgfid xattrs. In this approach glusterd mounts an auxiliary mount and sends the below command to all the files by crawling the mountpoint. #setfattr -n "glusterfs.quota-xattr-cleanup" -v 1 <path/to/file> Credit: Krishnan Parthasarathi <kparthas@redhat.com> Varun Shastry <vshastry@redhat.com> Change-Id: I9380eca58a285dc27dd572de1767aac8f2cd8049 BUG: 969461 Signed-off-by: Varun Shastry <vshastry@redhat.com> Reviewed-on: http://review.gluster.org/6369 Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	consolidate code for #ifdef HAVE_LINKAT usage	Vijay Bellur	2014-01-14	1	-25/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sys_link() now does ifdef HAVE_LINKAT linkat (...) else link (...) endif Use sys_link() in all places where we previously had the conditional behavior. Change-Id: I8bce5ac1175efd2ba7ab4bb5b372f6d1e0365d28 BUG: 764655 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/6633 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Xavier Hernandez <xhernandez@datalab.es> Reviewed-by: Anand Avati <avati@redhat.com>
*	locks: various fixes	Anand Avati	2014-01-13	7	-626/+383
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- implement ref/unref of entry locks (and fix bad pointer deref crashes) - code cleanup and deleted various data types - fix improper read/write lock conflict detection in entrylk - fix indefinite hang of blocked locks on disconnect - register locks in client_t synchronously, fix crashes in disconnect path Change-Id: Id273690c9111b8052139d1847060d1fb5a711924 BUG: 849630 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6638 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	Use linkat() instead of link() for portability sake	Emmanuel Dreyfus	2014-01-02	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	POSIX does not says wether link(2) on symlink should link on symlink itself or on target. Linux use symlink, most other systems use target. Using linkat(2) allows the behavior to be specified, so that the behavior is portable. Also fix configure test for NetBSD linkat(2), which ceased to work. BUG: 764655 Change-Id: I2633fde3b0828ca8c199e11c827720c513e15852 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/6613 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	Use linkat() instead of link() for portability sake	Emmanuel Dreyfus	2014-01-02	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	POSIX does not says wether link(2) on symlink should link on symlink itself or on target. Linux use symlink, most other systems use target. Using linkat(2) allows the behavior to be specified, so that the behavior is portable. Also fix configure test for NetBSD linkata(2), which ceased to work. BUG: 764655 Change-Id: Iccd27ac076b7a74e40dcbaa1c4762fd3ad59da5f Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/6539 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/index: Minor improvement in log message.	Vijay Bellur	2013-12-27	1	-1/+1
\| \| \| \| \| \| \| \| \|	Change-Id: Ic4f39785dab5ad64def4c06d7bd2f2dec09e19ab BUG: 1045690 Signed-off-by: Vijay Bellur <vbellur@redhat.com> Reviewed-on: http://review.gluster.org/6606 Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	features/quota: log usage only if hard limit not exceeded.	Raghavendra G	2013-12-15	1	-4/+9
\| \| \| \| \| \| \| \| \|	Change-Id: I60abf576999996e0d0d65534e1e416f6e10994c8 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> BUG: 969461 Reviewed-on: http://review.gluster.org/6479 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/changelog: more changelog fixes.	Ajeet Jha	2013-12-12	8	-38/+212
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	-> log additional records. -> include FOP number for metadata. -> prevent crash if inode is not found in a fop. Change-Id: I9edd4b71819ebd68c6a2b4150ae279c471d129da BUG: 1036536 Signed-off-by: Ajeet Jha <ajha@redhat.com> Reviewed-on: http://review.gluster.org/6403 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Amar Tumballi <amarts@gmail.com> Reviewed-by: Venky Shankar <vshankar@redhat.com>
*	gfid-access: do chown() after creating the new entries	Amar Tumballi	2013-12-10	2	-8/+73
\| \| \| \| \| \| \| \| \| \| \| \|	changing the 'frame->root->uid' on the fly is not a good idea as posix-acl xlator on brick process would fail the op. Change-Id: I996b43e4ce6efb04f52949976339dad6eb89bede Signed-off-by: Amar Tumballi <amarts@redhat.com> BUG: 847839 Reviewed-on: http://review.gluster.org/5833 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	index: fix the order of initialization	Anand Avati	2013-12-04	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	thread spawning must be the last action. if not, failures after thread creation can set this->private to NULL and cause index_worker to segfault Change-Id: I71c85c25d7d6d1ed5f8d9c951db0262b8a3f1d90 BUG: 1034085 Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/6426 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	features/marker: Filter quota xattrs on file as well	Pranith Kumar K	2013-12-01	2	-3/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: Quota contributions of a file/directory are tracked by quota xlator using xattrs on the file. Quota allows these xattrs to be healed as part of metadata self-heal. This leads to wrong quota calculations on this brick after self-heal because quota xattrs don't represent the actual contributions on the brick anymore. Fix: Don't let self-heal of this xattr happen as part of self-heal by filtering quota xattrs on file in listxattr. Change-Id: Iea68a116595ba271e58c6fdcc3dd21c7bb55ebb3 BUG: 1035576 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com> Reviewed-on: http://review.gluster.org/6374 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cli, glusterd: More quota fixes ...	Krutika Dhananjay	2013-11-30	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	... which may be grouped under the following categories: 1. Fix incorrect cli exit status for 'quota list' cmd 2. Print appropriate error message on quota parse errors in cli Authored by: Anuradha Talur <atalur@redhat.com> 3. glusterd: Improve quota validation during stage-op 4. Fix peer probe issues resulting from quota conf checksum mismatches 5. Enhancements to CLI output in the event of quota command failures Authored by: Kaushal Madappa <kmadappa@redhat.com> 7. Move aux mount location from /tmp to /var/run/gluster Authored by: Krishnan Parthasarathi <kparthas@redhat.com> 8. Fix performance issues in quota limit-usage Authored by: Krutika Dhananjay <kdhananj@redhat.com> Note: Some functions that were used in earlier version of quota, that aren't called anymore have been removed. Change-Id: I9d874f839ae5fdcfbe6d4f2d727eac091f27ac57 BUG: 969461 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com> Reviewed-on: http://review.gluster.org/6366 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/index : Creation of indices directory as soon as brick is up.	Anuradha	2013-11-28	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Missing indices directory in the bricks leads to unwanted log messages. Therefore, indices directory needs to be created as soon as the brick comes up. This patch results in creation of indices/xattrop directory as required. Also includes a testcase to test the same. Change-Id: Ic2aedce00c6c1fb24929ca41f6085aed28709d2c BUG: 1034085 Signed-off-by: Anuradha <atalur@redhat.com> Reviewed-on: http://review.gluster.org/6343 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	cluster/dht: instruct marker whenever it shouldn't do accounting	Raghavendra G	2013-11-26	1	-40/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is needed for two reasons: * since dht-linkfiles are internal, they shouldn't be accounted. * hardlink handling in marker is broken. link/unlink of hardlinks present in same directory can break marker accounting. Hence, if src and dst are in same directory in case of rename, dht - if it breaks rename into link/unlink operations - should instruct marker to not to do accounting. Change-Id: I9c9f7384569f75a2792f6450ee7a5279bf751ae7 BUG: 1022995 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/6203 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	features/marker-quota: exclude dht-linkfiles from being accounted.	Raghavendra G	2013-11-26	1	-2/+2
\| \| \| \| \| \| \| \| \|	Change-Id: I3239f5e8477664dcc04434e4d455ae447493a7ac BUG: 1022995 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/6153 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	features/quota: make writes short when the entire write cannot fit into ↵	Raghavendra G	2013-11-26	2	-10/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	available space. This patch aims to prevent creation of infinite zero byte sized files due to amount of storage available before exceeding quota limit being less than write sizes. Imagine x bytes of storage is available before we exceed quota limit and quota enforcer is receiving writes of size y and (y > x). In this scenario, if we run a shell script like: # for i in $(seq 1 10); do dd if=/dev/zero of=$i bs=y count=1; done Then, we would end up with 10 zero byte sized files, because we allow only complete writes and all writes will fail because of lack of space. However, creates succeed since a create itself will consume zero bytes. In this pattern of creates and writes, size of volume would never grow and x bytes of space will always be available and we can end up with an infinite number of zero byte sized files. Change-Id: Ice148d6a2207883e41759f7b0be73abaa3198b41 BUG: 1012216 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/6035 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	features/quota: Improvements to quota	Raghavendra G	2013-11-26	10	-984/+2947
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Two stages of quota enforcement is done: Soft and hard quota Upon reaching soft quota limit on the directory it logs/alerts in the quota daemon log (ie DEFAULT_LOG_DIR/quotad.log) and no more writes allowed after hard quota limit. After reaching the soft-limit the daemon alerts the user/admin repeatively for every 'alert-time', which is configurable. * Quota enforcer is moved to server-side. It takes care of enforcing quota. Since enforcer doesn't have the cluster view, it relies on another service called quota-aggregator. Aggregator, on query can return the size of a directory based on the cluster view. Enforcer is always loaded in the server graph and is by passed if the feature is not enabled. Options specific to enforcer: server-quota - Specifies whether the feature is on/off. It is used to by pass the quota if turned off. deem-statfs - If set to on, it takes quota limits into consideration while estimating fs size. (df command). The algorithm followed is, i. Adjust statvfs based on limit configured on root. ii. If limit is set on the inode passed, use size/limits on that inode to populate statvfs. Otherwise, use size/limits configured on root. iii. Upon statvfs, update the ctx->size on the inode. iv. Don't let DHT aggregate, instead take the maximum of the usages from the subvols of the DHT, since each of it contains the complete information. Enforcer also makes use of gfid-to-path conversion functionality to work correctly when a client like nfs predominently relies on nameless lookups. * Quota Aggregator acts as a thin client to provide cluster view Its a lightweight gluster client process with no mount point, started upon enabling quota or restarting the volume. This is a single process run on each brick, which can answer queries on all volumes in the cluster. Its volfile stored in GLUSTERD_DEFAULT_WORKING_DIR/quotad/quotad.vol. Credits: Raghavendra Bhat <rabhat@redhat.com> Varun Shastry <vshastry@redhat.com> Shishir Gowda <sgowda@redhat.com> Kruthika Dhananjay <kdhananj@redhat.com> Brian Foster <bfoster@redhat.com> Krishnan Parthasarathi <kparthas@redhat.com> Change-Id: Id1cb25b414951da34c665a55f77385d482e0f9de BUG: 969461 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/5952 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	features/marker: quota friendly changes	Raghavendra G	2013-11-26	6	-90/+387
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* handles renames on dht linkfiles correctly * nameless lookup friendly changes. uses gfid-to-path conversion functionality from storage/posix to build ancestry till root. * log message cleanup. * build inode contexts in readdirp * Accounting still not correct with hardlinks. Credits: ======== Vijay Bellur <vbellur@redhat.com> Raghavendra Bhat <rabhat@redhat.com> Change-Id: I415b6fbbc9691f5a38d9fd3c5d083a61e578bb81 BUG: 969461 Signed-off-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-on: http://review.gluster.org/5953 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	NetBSD missing backtrace(3) portability fix	Emmanuel Dreyfus	2013-11-19	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Implement backtrace(3) and backtrace_symbols(3) which do not exist in NetBSD While there, remove duplicate #include <stdio.h> BUG: 764655 Change-Id: Iccd695765906e085c3f8fcb670506d4fea68fa39 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/6285 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	zerofill: Change the type of len argument of glfs_zerofill() to off_t	Bharata B Rao	2013-11-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	glfs_zerofill() can be potentially called to zero-out entire file and hence allow for bigger value of length parameter. Change-Id: I75f1d11af298915049a3f3a7cb3890a2d72fca63 BUG: 1028673 Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-on: http://review.gluster.org/6266 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: M. Mohan Kumar <mohan@in.ibm.com> Tested-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	features/compress: Compression/DeCompression translator	Prashanth Pai	2013-11-11	7	-1/+1039
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* When a writev call occurs, the client compresses the data before sending it to server. On the server, compressed data is decompressed. Similarly, when a readv call occurs, the server compresses the data before sending it to client. On the client, the compressed data is decompressed. Thus the amount of data sent over the wire is minimized. * Compression/Decompression is done using Zlib library. * During normal operation, this is the format of data sent over wire : <compressed-data> + trailer(8) The trailer contains the CRC32 checksum and length of original uncompressed data. This is used for validation. HOW TO USE ---------- Turning on compression xlator: gluster volume set <vol_name> compress on Configurable options: gluster volume set <vol_name> compress.compression-level 8 gluster volume set <vol_name> compress.min-size 50 Change-Id: Ib7a66b6f1f70fe002b7c513588cdf75c69370805 BUG: 923540 Original-author : Venky Shankar <vshankar@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Prashanth Pai <nullpai@gmail.com> Signed-off-by: Prashanth Pai <ppai@redhat.com> Reviewed-on: http://review.gluster.org/3251 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	features/qemu-block: simplify coroutine model to use single synctask, ucontext	Brian Foster	2013-11-10	7	-248/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current coroutine model, mapping synctasks 1-1 with qemu internal Coroutines, has some unresolved raciness issues. This problem usually manifests as lifecycle mismatches between top-level (gluster created) synctasks and the subsequently created internal coroutines from that context. Qemu's internal queueing (and locking) can cause situations where the top-level synctask is destroyed before the internal scheduler has released references to memory, leading to use after free crashes and asserts. Simplify the coroutine model to use a single synctask as a coroutine processor and rely on the existing native ucontext coroutine implementation. The syncenv thread is donated to qemu and ensures a single top-level coroutine is processed at a time. Qemu now has complete control over coroutine scheduling. BUG: 986775 Change-Id: I38223479a608d80353128e390f243933fc946fd6 Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/6110 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	features/qemu-block: add qemu backing image support (clone)	Brian Foster	2013-11-10	4	-23/+166
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add basic backing image support to the block-format mechanism. This is a functionality checkpoint that enables the raw mechanism required to support client driven "snapshot" and "clone" requests. This change enhances the block-format setxattr command to support an additional and optional backing image reference. For example: setxattr -n trusted.glusterfs.block-format -v "qcow2:10GB:<bimg>" ./newimage ... where <bimg> refers to the backing image for unallocated blocks in newimage. <bimg> can be provided in one of two formats: - a gfid string in the following format (assuming a valid gfid): <gfid:00000000-0000-0000-0000-000000000000> - or a filename that must be resident in the same directory as the new clone file being formatted. E.g., setxattr -n trusted.glusterfs.block-format -v "qcow2:10GB:baseimg" ./newimage This latter format is more restrictive, simply provided for convenience or until something more refined is available. This change makes no assumptions about the backing image file and affords no additional protection. It is up to the user/client to recognize the relationship between the files and manage them appropriately (i.e., no writes to the backing image, etc.). BUG: 986775 Change-Id: I7aff7bdc59b85a6459001a6bfeae4db6bf74f703 Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-on: http://review.gluster.org/5967 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
*	glusterfs: zerofill support	M. Mohan Kumar	2013-11-10	1	-0/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in the specified range. This fop will be useful when a whole file needs to be initialized with zero (could be useful for zero filled VM disk image provisioning or during scrubbing of VM disk images). Client/application can issue this FOP for zeroing out. Gluster server will zero out required range of bytes ie server offloaded zeroing. In the absence of this fop, client/application has to repetitively issue write (zero) fop to the server, which is very inefficient method because of the overheads involved in RPC calls and acknowledgements. WRITESAME is a SCSI T10 command that takes a block of data as input and writes the same data to other blocks and this write is handled completely within the storage and hence is known as offload . Linux ,now has support for SCSI WRITESAME command which is exposed to the user in the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to implement this fop. Thus zeroing out operations can be completely offloaded to the storage device , making it highly efficient. The fop takes two arguments offset and size. It zeroes out 'size' number of bytes in an opened file starting from 'offset' position. This patch adds zerofill support to the following areas: - libglusterfs - io-stats - performance/md-cache,open-behind - quota - cluster/afr,dht,stripe - rpc/xdr - protocol/client,server - io-threads - marker - storage/posix - libgfapi Client applications can exloit this fop by using glfs_zerofill introduced in libgfapi.FUSE support to this fop has not been added as there is no system call for this fop. Changes from previous version 3: * Removed redundant memory failure log messages Changes from previous version 2: * Rebased and fixed build error Changes from previous version 1: * Rebased for latest master TODO : * Add zerofill support to trace xlator * Expose zerofill capability as part of gluster volume info Here is a performance comparison of server offloaded zeofill vs zeroing out using repeated writes. [root@llmvm02 remote]# time ./offloaded aakash-test log 20 real 3m34.155s user 0m0.018s sys 0m0.040s [root@llmvm02 remote]# time ./manually aakash-test log 20 real 4m23.043s user 0m2.197s sys 0m14.457s [root@llmvm02 remote]# time ./offloaded aakash-test log 25; real 4m28.363s user 0m0.021s sys 0m0.025s [root@llmvm02 remote]# time ./manually aakash-test log 25 real 5m34.278s user 0m2.957s sys 0m18.808s The argument log is a file which we want to set for logging purpose and the third argument is size in GB . As we can see there is a performance improvement of around 20% with this fop. Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18 BUG: 1028673 Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com> Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/5327 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	client_t: phase 2, refactor server_ctx and locks_ctx out	Kaleb S. KEITHLEY	2013-10-31	7	-117/+423
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	remove server_ctx and locks_ctx from client_ctx directly and store as into discrete entities in the scratch_ctx hooking up dump will be in phase 3 BUG: 849630 Change-Id: I94cea328326db236cdfdf306cb381e4d58f58d4c Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.org/5678 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>