===============
Persistent data
===============

Introduction
============

The more sophisticated device-mapper targets require complex metadata
that is managed in the kernel.  In late 2010 we saw that various
targets were rolling their own data structures, for example:

- Mikulas Patocka's multisnap implementation
- Heinz Mauelshagen's thin provisioning target
- Another btree-based caching target posted to dm-devel
- Another multi-snapshot target based on a design of Daniel Phillips

Maintaining these data structures takes a lot of work, so if possible
we'd like to reduce the number.

The persistent-data library is an attempt to provide a re-usable
framework for people who want to store metadata in device-mapper
targets.  It's currently used by the thin-provisioning target and an
upcoming hierarchical storage target.

Overview
========

The main documentation is in the header files, which can all be found
under drivers/md/persistent-data.

The block manager
-----------------

dm-block-manager.[hc]

This provides access to the data on disk in fixed-size blocks.  There
is a read/write locking interface to prevent concurrent accesses and
to keep data that is in use in the cache.

Clients of persistent-data are unlikely to use this directly.
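
As a rough illustration of the read/write locking idea, here is a
minimal user-space sketch in C.  It is a toy model and not the kernel
interface: the ``toy_*`` names, the block size and the non-blocking
"refuse instead of wait" behaviour are all assumptions made for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_BLOCK_SIZE 4096	/* assumed fixed block size for this model */

/* Hypothetical in-memory stand-in for one cached block. */
struct toy_block {
	uint8_t data[TOY_BLOCK_SIZE];
	unsigned readers;	/* number of read locks currently held */
	bool writer;		/* true while a write lock is held */
};

/* Take a shared lock; refused while a writer holds the block. */
bool toy_read_lock(struct toy_block *b)
{
	if (b->writer)
		return false;
	b->readers++;
	return true;
}

/* Take an exclusive lock; refused while anyone holds the block. */
bool toy_write_lock(struct toy_block *b)
{
	if (b->writer || b->readers > 0)
		return false;
	b->writer = true;
	return true;
}

void toy_read_unlock(struct toy_block *b)
{
	assert(b->readers > 0);
	b->readers--;
}

void toy_write_unlock(struct toy_block *b)
{
	assert(b->writer);
	b->writer = false;
}

/* A block may only be dropped from the cache when nobody holds it. */
bool toy_can_evict(const struct toy_block *b)
{
	return !b->writer && b->readers == 0;
}
```

In this model any number of readers may hold a block at once, a writer
excludes everyone else, and a held block is never a candidate for
eviction, which mirrors the two purposes the locking interface is
described as serving.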

The transaction manager
-----------------------

dm-transaction-manager.[hc]

This restricts access to blocks and enforces copy-on-write semantics.
The only way you can get hold of a writable block through the
transaction manager is by shadowing an existing block (i.e. doing
copy-on-write) or allocating a fresh one.  Shadowing is elided within
the same transaction (a block already shadowed in the current
transaction is not copied again), so performance is reasonable.  The
commit method ensures that all data is flushed before it writes the
superblock.  On power failure your metadata will be as it was when
last committed.
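
The shadowing rule can be sketched with a small user-space model in C.
This is only an illustration under stated assumptions, not the kernel
API: the ``toy_tm`` structure, its bump allocator and the ``fresh``
flags are hypothetical, and reference counting is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_NR_BLOCKS	16
#define TOY_BLOCK_SIZE	64

/* Hypothetical model of a transaction manager over an in-memory "disk". */
struct toy_tm {
	uint8_t blocks[TOY_NR_BLOCKS][TOY_BLOCK_SIZE];
	bool fresh[TOY_NR_BLOCKS];	/* allocated or shadowed in the open transaction */
	unsigned next_free;		/* trivial bump allocator, never reuses blocks */
	unsigned committed_root;	/* what the "superblock" points at */
};

void toy_tm_init(struct toy_tm *tm)
{
	memset(tm, 0, sizeof(*tm));
	tm->next_free = 1;		/* block 0 is the initial committed root */
}

/* Allocate a fresh, zeroed, writable block. */
unsigned toy_tm_new_block(struct toy_tm *tm)
{
	unsigned b;

	assert(tm->next_free < TOY_NR_BLOCKS);
	b = tm->next_free++;
	memset(tm->blocks[b], 0, TOY_BLOCK_SIZE);
	tm->fresh[b] = true;
	return b;
}

/*
 * Return a writable block holding the contents of orig.  If orig was
 * already created or shadowed in this transaction the copy is elided
 * and orig itself is returned.
 */
unsigned toy_tm_shadow(struct toy_tm *tm, unsigned orig)
{
	unsigned copy;

	assert(orig < TOY_NR_BLOCKS);
	if (tm->fresh[orig])
		return orig;
	copy = toy_tm_new_block(tm);
	memcpy(tm->blocks[copy], tm->blocks[orig], TOY_BLOCK_SIZE);
	return copy;
}

/*
 * Commit: a real implementation would flush every dirty block first;
 * only then is the new root published in the superblock.
 */
void toy_tm_commit(struct toy_tm *tm, unsigned new_root)
{
	memset(tm->fresh, 0, sizeof(tm->fresh));	/* transaction closed */
	tm->committed_root = new_root;
}
```

Until ``toy_tm_commit()`` runs, the committed root and every block
reachable from it are untouched, which is why, in this model, losing
the uncommitted blocks leaves the metadata exactly as it was at the
last commit.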

The space maps
--------------

dm-space-map.h
dm-space-map-metadata.[hc]
dm-space-map-disk.[hc]

These are on-disk data structures that keep track of the reference
counts of blocks.  They also act as the allocator of new blocks.
There are currently two implementations: a simpler one for managing
blocks on a different device (e.g. thinly-provisioned data blocks),
and one for managing the metadata space.  The latter is complicated by
the need to store its own data within the space it's managing.
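
The pairing of reference counting with allocation can be sketched as
follows.  This is a hypothetical in-memory model in C, not either of
the real implementations: the ``toy_sm`` names are invented for the
example, nothing is stored on disk, and a freed block is reused
immediately, a simplification that a crash-safe implementation would
have to treat more carefully.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SM_BLOCKS	8
#define TOY_SM_NONE	UINT32_MAX	/* returned when no block is free */

/* Hypothetical space map: one reference count per block. */
struct toy_sm {
	uint32_t ref[TOY_SM_BLOCKS];
};

/* Allocate: find the first block with a zero count and give it count 1. */
uint32_t toy_sm_new_block(struct toy_sm *sm)
{
	uint32_t b;

	for (b = 0; b < TOY_SM_BLOCKS; b++) {
		if (sm->ref[b] == 0) {
			sm->ref[b] = 1;
			return b;
		}
	}
	return TOY_SM_NONE;
}

/* Another structure now shares this block. */
void toy_sm_inc(struct toy_sm *sm, uint32_t b)
{
	assert(b < TOY_SM_BLOCKS);
	sm->ref[b]++;
}

/* Drop one reference; the block becomes free when the count hits zero. */
void toy_sm_dec(struct toy_sm *sm, uint32_t b)
{
	assert(b < TOY_SM_BLOCKS && sm->ref[b] > 0);
	sm->ref[b]--;
}

uint32_t toy_sm_count(const struct toy_sm *sm, uint32_t b)
{
	assert(b < TOY_SM_BLOCKS);
	return sm->ref[b];
}
```

Because "free" simply means "count is zero", the same structure answers
both questions a client asks: whether a block is shared, and where the
next unused block is.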

The data structures
-------------------

dm-btree.[hc]
dm-btree-remove.c
dm-btree-spine.c
dm-btree-internal.h

Currently there is only one data structure, a hierarchical btree.
There are plans to add more.  For example, something with an
array-like interface would see a lot of use.

The btree is 'hierarchical' in that you can define it to be composed
of nested btrees, so that a lookup takes multiple keys.  For example,
the thin-provisioning target uses a btree with two levels of nesting.
The first maps a device id to a mapping tree, and that in turn maps a
virtual block to a physical block.

Values stored in the btrees can have arbitrary size.  Keys are always
64 bits, although nesting allows you to use multiple keys.
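
The two-level lookup used by thin provisioning can be modelled in user
space as below.  This sketch is an assumption-laden illustration and
not the dm-btree API: each "tree" is replaced by a sorted array
searched with binary search, the ``toy_*`` names are hypothetical, and
the top level stores an index into an array of sub-maps where the real
structure would refer to another tree.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One key/value pair; keys are always 64 bits, as in the text. */
struct toy_entry {
	uint64_t key;
	uint64_t value;
};

/* Stand-in for a single btree: entries sorted by ascending key. */
struct toy_map {
	const struct toy_entry *entries;
	size_t n;
};

/* Binary search for key; stores the value and returns true if found. */
bool toy_map_lookup(const struct toy_map *m, uint64_t key, uint64_t *value)
{
	size_t lo = 0, hi = m->n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (m->entries[mid].key < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo < m->n && m->entries[lo].key == key) {
		*value = m->entries[lo].value;
		return true;
	}
	return false;
}

/*
 * Nested lookup with one key per level:
 *   keys[0] = device id     -> which sub-map to use
 *   keys[1] = virtual block -> physical block
 */
bool toy_nested_lookup(const struct toy_map *top,
		       const struct toy_map *subtrees, size_t nr_subtrees,
		       const uint64_t keys[2], uint64_t *physical)
{
	uint64_t sub;

	if (!toy_map_lookup(top, keys[0], &sub) || sub >= nr_subtrees)
		return false;
	return toy_map_lookup(&subtrees[sub], keys[1], physical);
}
```

The same virtual block number can map to different physical blocks on
different devices, because the device id selects the inner map before
the virtual block is looked up.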