path: root/xlators/features/bit-rot/src/stub/bit-rot-stub.c
Commit message (Collapse)AuthorAgeFilesLines
* features/bitrot: stub improvements and fixesVenky Shankar2015-05-281-319/+381
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch refactors the signing trigger mechanism used by bitrot daemon as a "catch up" meachanism to sign files which _missed_ signing on the last run either due to bitrot being disabled and enabled again or if bitrot is enabled for a volume with existing data. Existing implementation relies on overloading writev() to trigger signing which just by the looks sounded dangerous and I hated it to the core. This change moves all that business to the setxattr interface thereby keeping the writev path strictly for client IO. Why not use IPC fop to trigger signing? There's a need to access the object's inode to perform various maintainance operations. inode is not _directly_ accessible in the IPC fop (although, it can be found via inode_grep() for the object's GFID - the inode just needs to be pinned in memory, which is the case if there's an active fd on the inode). This patch relies on good old technique of overloading fsetxattr() to do the job instead of using IPC fop. There are some pretty nice cleanups along the lines of memory deallocations, unncessary allocations and redundant ref()ing of structures (such as fd's) provided by this patch. All in all - much improved code navigation. Change-Id: Id93fe90b1618802d1a95a5072517dac342b96cb8 BUG: 1224600 Signed-off-by: Venky Shankar <> Reviewed-on: Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <>
* features/bit-rot-stub: versioning of objects in write/truncate fop instead ↵Raghavendra Bhat2015-05-081-266/+804
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | of open * This patch brings in the changes where object versioning is done in write and truncate fops instead of tracking them in open and create fops. This model works for both regular and anonymous fds. It also removes the race associated with open calls, create and lookups. This patch follows the below method for object versioning and notifications: Before sending writev on the fd, increase the ongoing version first. This makes anonymous fd write similar to the regular fd write by having the ongoing version increased before doing the write. Do following steps to do versioning: 1) For anonymous fds set the fd context (so that release is invoked) and add the fd context to the list maintained in the inode context. For regular fds the above think would have been done in open itself. 2) Increase the on-disk ongoing version 3) Increase the in memory ongoing version and mark inode as non-dirty 3) Once versioning is successfully done send write operation. If versioning fails, then fail the write fop. 5) In writev_cbk mark inode as modified. Change-Id: I7104391bbe076d8fc49b68745d2ec29a6e92476c BUG: 1207979 Signed-off-by: Raghavendra Bhat <> Reviewed-on: Tested-by: Gluster Build System <> Reviewed-by: Vijay Bellur <>
* features/bit-rot: Mark versioning fsetxattr as internal fopKotresh HR2015-04-111-5/+9
| | | | | | | | | | | | | | | | | Changelog xlator was capturing bitrot-stub's fsetxattr sent for versioning. Since it was using the same frame as of the create fop, there was inconsistency in fop number and gfid of capturing metadata. So fix is to mark fsetxattr used for versioning as internal and add internal fop filter in changelog_fsetxattr. Change-Id: I51ff468995139838b22bf293a59a0713a92ee7a5 BUG: 1170075 Signed-off-by: Kotresh HR <> Reviewed-on: Tested-by: Gluster Build System <> Reviewed-by: Venky Shankar <> Tested-by: Venky Shankar <>
* bitrot/scrub: Scrubber fixesVenky Shankar2015-04-081-6/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a handful of problem with scrubber which are detailed below. Scrubber used to skip objects for verification due to missing fd iterface to fetch versioning extended attributes. Similar to the inode interface, an fd based interface in POSIX is now introduced. Moreover, this patch also fixes potential false reporting by scrubber due to: An object gets dirtied and signed when scrubber is busy calculatingobject checksum. This is fixed by caching the signed version when an object is first inspected for stalenes, i.e., during pre-compute stage. This version is used to verify checksum in the post-compute stage when the signatures are compared for possible corruption. Side effect of _not_ sending signature length during signing resulted in "truncated" signature to be set for an object. Now, at the time of signing, the signature length is sent and is used in place of invoking strlen() to get signature length (which could have possible 00s). The signature length itself is not persisted in the signature xattr, but is calculated on-the-fly by substracting the xattr length by the "structure" header size. Some of the log entries are made more meaningful (as and aid for debugging). Change-Id: I938bee5aea6688d5d99eb2640053613af86d6269 BUG: 1207624 Signed-off-by: Venky Shankar <> Reviewed-on: Reviewed-by: Raghavendra Bhat <> Tested-by: Gluster Build System <> Reviewed-by: Vijay Bellur <>
* features/bitrot-stub: Enhancement to versioning protocolVenky Shankar2015-04-081-53/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | .. and potential bug fixes / memleak. While assigning initial version to an object, both extended attributes (namely, ongoing version and the default signing version) were persisted. This is optimized to just persist the ongoing version along with safe handling of xattr request(s) in it's absence. This is better than the earlier approach as the two xattr sets were not atomic anyway (allowing a request to sneak in between between two set operations). This also allows to perform sanity checks on objects during lookup()/getxattr(): objects with missing ongoing version but presence of signature are possible candidates of tampering (and catching implementation bugs). There were couple of instances in the code where versioning xattrs were incorrectly removed before in-memory versions were initialized, which have been fixed with this patch. A memory leak in the IPC code path is also fixed. Change-Id: I01c690ccfe7156a883582275f40f79a7c10c0900 BUG: 1207054 Signed-off-by: Venky Shankar <> Reviewed-on: Reviewed-by: Raghavendra Bhat <> Tested-by: Gluster Build System <> Reviewed-by: Vijay Bellur <>
* Avoid conflict between contrib/uuid and system uuidEmmanuel Dreyfus2015-04-041-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | glusterfs relies on Linux uuid implementation, which API is incompatible with most other systems's uuid. As a result, libglusterfs has to embed contrib/uuid, which is the Linux implementation, on non Linux systems. This implementation is incompatible with systtem's built in, but the symbols have the same names. Usually this is not a problem because when we link with -lglusterfs, libc's symbols are trumped. However there is a problem when a program not linked with -lglusterfs will dlopen() glusterfs component. In such a case, libc's uuid implementation is already loaded in the calling program, and it will be used instead of libglusterfs's implementation, causing crashes. A possible workaround is to use pre-load libglusterfs in the calling program (using LD_PRELOAD on NetBSD for instance), but such a mechanism is not portable, nor is it flexible. A much better approach is to rename libglusterfs's uuid_* functions to gf_uuid_* to avoid any possible conflict. This is what this change attempts. BUG: 1206587 Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a Signed-off-by: Emmanuel Dreyfus <> Reviewed-on: Tested-by: Gluster Build System <> Reviewed-by: Niels de Vos <>
* features/bit-rot: filesystem scrubberVenky Shankar2015-03-241-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Scrubber performs signature verification for objects that were signed by signer. This is done by recalculating the signature (using the hash algorithm the object was signed with) and verifying it aginst the objects persisted signature. Since the object could be undergoing IO opretaion at the time of hash calculation, the signature may not match objects persisted signature. Bitrot stub provides additional information about the stalesness of an objects signature (determinted by it's versioning mechanism). This additional bit of information is used by scrubber to determine the staleness of the signature, and in such cases the object is skipped verification (although signature staleness is performed twice: once before initiation of hash calculation and another after it (an object could be modified after staleness checks). The implmentation is a part of the bitrot xlator (signer) which acts as a signer or scrubber based on a translator option. As of now the scrub process is ever running (but has some form of weak throttling mechanism during filesystem scan). Going forward, there needs to be some form of scrub scheduling and IO throttling (during hash calculation) tunables (via CLI). Change-Id: I665ce90208f6074b98c5a1dd841ce776627cc6f9 BUG: 1170075 Original-Author: Raghavendra Bhat <> Original-Author: Venky Shankar <> Signed-off-by: Venky Shankar <> Reviewed-on: Tested-by: Vijay Bellur <> Reviewed-by: Vijay Bellur <>
* Bitrot StubVenky Shankar2015-03-241-0/+1428
Bitrot stub implements object versioning required for identifying signature freshness. More details about versioning is explained as a part of the "bitrot feature documentation" patch. Change-Id: I2ad70d9eb109ba4a12148ab8d81336afda529ad9 BUG: 1170075 Signed-off-by: Venky Shankar <> Reviewed-on: Tested-by: Gluster Build System <> Reviewed-by: Vijay Bellur <>