0001 =========
0002 RPC Cache
0003 =========
0004 
0005 This document gives a brief introduction to the caching
0006 mechanisms in the sunrpc layer that is used, in particular,
0007 for NFS authentication.
0008 
0009 Caches
0010 ======
0011 
0012 The caching replaces the old exports table and allows for
0013 a wide variety of values to be cached.
0014 
0015 There are a number of caches that are similar in structure though
0016 quite possibly very different in content and use.  There is a corpus
0017 of common code for managing these caches.
0018 
0019 Examples of caches that are likely to be needed are:
0020 
0021   - mapping from IP address to client name
0022   - mapping from client name and filesystem to export options
0023   - mapping from UID to list of GIDs, to work around NFS's limitation
0024     of 16 gids.
0025   - mappings between local UID/GID and remote UID/GID for sites that
0026     do not have uniform uid assignment
0027   - mapping from network identity to public key for crypto authentication.
0028 
0029 The common code handles such things as:
0030 
0031    - general cache lookup with correct locking
0032    - supporting 'NEGATIVE' as well as positive entries
0033    - allowing an EXPIRED time on cache items, and removing
0034      items after they expire and are no longer in use.
0035    - making requests to user-space to fill in cache entries
0036    - allowing user-space to directly set entries in the cache
0037    - delaying RPC requests that depend on as-yet incomplete
0038      cache entries, and replaying those requests when the cache entry
0039      is complete.
0040    - cleaning out old entries as they expire.
0041 
0042 Creating a Cache
0043 ----------------
0044 
0045 -  A cache needs a datum to store.  This is in the form of a
0046    structure definition that must contain a struct cache_head
0047    as an element, usually the first.
0048    It will also contain a key and some content.
0049    Each cache element is reference counted and contains
0050    expiry and update times for use in cache management.
0051 -  A cache needs a "cache_detail" structure that
0052    describes the cache.  This stores the hash table, some
0053    parameters for cache management, and some operations detailing how
0054    to work with particular cache items.
0055 
0056    The operations are:
0057 
0058     struct cache_head \*alloc(void)
0059       This simply allocates appropriate memory and returns
0060       a pointer to the cache_detail embedded within the
0061       structure
0062 
0063     void cache_put(struct kref \*)
0064       This is called when the last reference to an item is
0065       dropped.  The pointer passed is to the 'ref' field
0066       in the cache_head.  cache_put should release any
0067       references created by 'cache_init' and, if CACHE_VALID
0068       is set, any references created by cache_update.
0069       It should then release the memory allocated by
0070       'alloc'.
0071 
0072     int match(struct cache_head \*orig, struct cache_head \*new)
0073       Test if the keys in the two structures match.  Return
0074       1 if they do, 0 if they don't.
0075 
0076     void init(struct cache_head \*orig, struct cache_head \*new)
0077       Set the 'key' fields in 'new' from 'orig'.  This may
0078       include taking references to shared objects.
0079 
0080     void update(struct cache_head \*orig, struct cache_head \*new)
0081       Set the 'content' fields in 'new' from 'orig'.
0082 
0083     int cache_show(struct seq_file \*m, struct cache_detail \*cd, struct cache_head \*h)
0084       Optional.  Used to provide a /proc file that lists the
0085       contents of a cache.  This should show one item,
0086       usually on just one line.
0087 
0088     int cache_request(struct cache_detail \*cd, struct cache_head \*h, char \*\*bpp, int \*blen)
0089       Format a request to be sent to user-space for an item
0090       to be instantiated.  \*bpp is a buffer of size \*blen.
0091       \*bpp should be moved forward over the encoded message,
0092       and \*blen should be reduced to show how much free
0093       space remains.  Return 0 on success or <0 if there is not
0094       enough room or another problem occurs.
0095 
0096     int cache_parse(struct cache_detail \*cd, char \*buf, int len)
0097       A message from user space has arrived to fill out a
0098       cache entry.  It is in 'buf' of length 'len'.
0099       cache_parse should parse this, find the item in the
0100       cache with sunrpc_cache_lookup_rcu, and update the item
0101       with sunrpc_cache_update.
0102 
0103 
0104 -  A cache needs to be registered using cache_register().  This
0105    includes it on a list of caches that will be regularly
0106    cleaned to discard old data.
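
As an illustration of the datum and of the match, init and update
operations described above, the following is a minimal user-space sketch,
not kernel code.  The struct cache_head shown is a reduced stand-in for the
kernel definition, and ip_map_item, its field names and its buffer sizes
are hypothetical.  The parameter roles follow the descriptions above
(copy from 'orig' into 'new').

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Reduced stand-in for the kernel's struct cache_head (assumption:
 * only a few illustrative fields are kept). */
struct cache_head {
	long expiry_time;
	long last_refresh;
	unsigned long flags;
};

/* User-space equivalent of the kernel's container_of() helper. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical datum: maps an IP address string to a client name. */
struct ip_map_item {
	struct cache_head h;	/* embedded head, first by convention */
	char addr[48];		/* key */
	char client[64];	/* content */
};

/* match: return 1 if the keys are equal, 0 if they are not. */
static int ip_map_match(struct cache_head *orig, struct cache_head *new)
{
	struct ip_map_item *a = container_of(orig, struct ip_map_item, h);
	struct ip_map_item *b = container_of(new, struct ip_map_item, h);

	return strcmp(a->addr, b->addr) == 0;
}

/* init: set the key fields in 'new' from 'orig'. */
static void ip_map_init(struct cache_head *orig, struct cache_head *new)
{
	struct ip_map_item *src = container_of(orig, struct ip_map_item, h);
	struct ip_map_item *dst = container_of(new, struct ip_map_item, h);

	assert(src != dst);
	snprintf(dst->addr, sizeof(dst->addr), "%s", src->addr);
}

/* update: set the content fields in 'new' from 'orig'. */
static void ip_map_update(struct cache_head *orig, struct cache_head *new)
{
	struct ip_map_item *src = container_of(orig, struct ip_map_item, h);
	struct ip_map_item *dst = container_of(new, struct ip_map_item, h);

	assert(src != dst);
	snprintf(dst->client, sizeof(dst->client), "%s", src->client);
}
```

Keeping the cache_head as the first member is only a convention; the
container_of() arithmetic works wherever the head is embedded.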
0107 
0108 Using a cache
0109 -------------
0110 
0111 To find a value in a cache, call sunrpc_cache_lookup_rcu passing a pointer
0112 to the cache_head in a sample item with the 'key' fields filled in.
0113 This will be passed to ->match to identify the target entry.  If no
0114 entry is found, a new entry will be created, added to the cache, and
0115 marked as not containing valid data.
0116 
0117 The item returned is typically passed to cache_check which will check
0118 if the data is valid, and may initiate an upcall to get fresh data.
0119 cache_check will return -ENOENT if the entry is negative or if an
0120 upcall is needed but not possible, -EAGAIN if an upcall is pending,
0121 or 0 if the data is valid.
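
The three return values above can be summarised as a small dispatch
table.  This is only a sketch: the enum and the function name are
hypothetical, and what a caller actually does in each case is its own
policy.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical names for what a caller might do next. */
enum lookup_action {
	LOOKUP_USE,	/* data is valid, go ahead and use the item */
	LOOKUP_DEFER,	/* upcall pending, retry the request later */
	LOOKUP_DENY,	/* negative entry, or no upcall possible */
	LOOKUP_ERROR	/* any other value */
};

/* Map a cache_check-style return code to an action. */
static enum lookup_action action_for(int rc)
{
	switch (rc) {
	case 0:
		return LOOKUP_USE;
	case -EAGAIN:
		return LOOKUP_DEFER;
	case -ENOENT:
		return LOOKUP_DENY;
	default:
		return LOOKUP_ERROR;
	}
}
```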
0122 
0123 cache_check can be passed a "struct cache_req\*".  This structure is
0124 typically embedded in the actual request and can be used to create a
0125 deferred copy of the request (struct cache_deferred_req).  This is
0126 done when the found cache item is not uptodate, but there is reason to
0127 believe that userspace might provide information soon.  When the cache
0128 item does become valid, the deferred copy of the request will be
0129 revisited (->revisit).  It is expected that this method will
0130 reschedule the request for processing.
0131 
0132 The value returned by sunrpc_cache_lookup_rcu can also be passed to
0133 sunrpc_cache_update to set the content for the item.  A second item is
0134 passed which should hold the content.  If the item found by _lookup
0135 has valid data, then it is discarded and a new item is created.  This
0136 saves any user of an item from worrying about content changing while
0137 it is being inspected.  If the item found by _lookup does not contain
0138 valid data, then the content is copied across and CACHE_VALID is set.
0139 
0140 Populating a cache
0141 ------------------
0142 
0143 Each cache has a name, and when the cache is registered, a directory
0144 with that name is created in /proc/net/rpc
0145 
0146 This directory contains a file called 'channel' which is a channel
0147 for communicating between kernel and user for populating the cache.
0148 This directory may later contain other files of interacting
0149 with the cache.
0150 
0151 The 'channel' works a bit like a datagram socket. Each 'write' is
0152 passed as a whole to the cache for parsing and interpretation.
0153 Each cache can treat the write requests differently, but it is
0154 expected that a message written will contain:
0155 
0156   - a key
0157   - an expiry time
0158   - a content.
0159 
0160 with the intention that an item in the cache with the give key
0161 should be create or updated to have the given content, and the
0162 expiry time should be set on that item.
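
A message of this shape could be built in user space roughly as follows.
This is a sketch under assumptions: the function name is hypothetical, the
field order simply follows the list above, the expiry is treated as a plain
number, and the fields are assumed not to need quoting.  Each real cache
defines its own message layout.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format "key expiry content\n" into buf.
 * Returns the message length, or -1 if it does not fit.
 */
static int format_update(char *buf, size_t len, const char *key,
			 long expiry, const char *content)
{
	int n;

	assert(buf && key && content);
	n = snprintf(buf, len, "%s %ld %s\n", key, expiry, content);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The whole message would then be handed to the channel in a single write,
since each write is parsed as one unit.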
0163 
0164 Reading from a channel is a bit more interesting.  When a cache
0165 lookup fails, or when it succeeds but finds an entry that may soon
0166 expire, a request is lodged for that cache item to be updated by
0167 user-space.  These requests appear in the channel file.
0168 
0169 Successive reads will return successive requests.
0170 If there are no more requests to return, read will return EOF, but a
0171 select or poll for read will block waiting for another request to be
0172 added.
0173 
0174 Thus a user-space helper is likely to::
0175 
0176   open the channel.
0177     select for readable
0178     read a request
0179     write a response
0180   loop.
0181 
0182 If it dies and needs to be restarted, any requests that have not been
0183 answered will still appear in the file and will be read by the new
0184 instance of the helper.
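
One iteration of that loop might look like the sketch below.  The names
serve_one, answer_fn and sample_answer are hypothetical, the reply produced
by sample_answer is a placeholder rather than a real cache format, and the
buffer size is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical callback: build a response for 'req' in 'resp' and
 * return its length, or -1 on failure. */
typedef int (*answer_fn)(const char *req, char *resp, size_t resp_len);

/*
 * Wait for one request on the channel 'fd' and answer it.
 * Returns 1 if a request was answered, 0 if read reported EOF
 * (nothing queued), or -1 on error.
 */
static int serve_one(int fd, answer_fn answer)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	char req[8192], resp[8192];
	ssize_t n;
	int len;

	assert(answer != NULL);
	if (poll(&pfd, 1, -1) < 0)		/* select for readable */
		return -1;
	n = read(fd, req, sizeof(req) - 1);	/* read a request */
	if (n < 0)
		return -1;
	if (n == 0)
		return 0;
	req[n] = '\0';
	len = answer(req, resp, sizeof(resp));
	if (len < 0)
		return -1;
	if (write(fd, resp, (size_t)len) != len)	/* write a response */
		return -1;
	return 1;
}

/* Placeholder handler: echoes the request behind a fixed prefix. */
static int sample_answer(const char *req, char *resp, size_t resp_len)
{
	int n = snprintf(resp, resp_len, "reply-to %s", req);

	return (n < 0 || (size_t)n >= resp_len) ? -1 : n;
}
```

A helper would open the channel file once and then call serve_one in a
loop until it returns -1.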
0185 
0186 Each cache should define a "cache_parse" method which takes a message
0187 written from user-space and processes it.  It should return an error
0188 (which propagates back to the write syscall) or 0.
0189 
0190 Each cache should also define a "cache_request" method which
0191 takes a cache item and encodes a request into the buffer
0192 provided.
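
The buffer convention used by cache_request (advance \*bpp, shrink \*blen)
can be sketched in plain C as follows.  Both function names are
hypothetical, the two-field key is only an example, and no quoting is
applied, so the fields are assumed to contain no spaces or newlines.

```c
#include <assert.h>
#include <string.h>

/*
 * Append 'word' followed by a space at *bpp, then move *bpp forward
 * and reduce *blen by the number of bytes used.
 * Returns 0 on success, -1 if there is not enough room.
 */
static int append_word(char **bpp, int *blen, const char *word)
{
	size_t need = strlen(word) + 1;	/* word plus separator */

	assert(bpp && *bpp && blen);
	if (*blen < 0 || need > (size_t)*blen)
		return -1;
	memcpy(*bpp, word, need - 1);
	(*bpp)[need - 1] = ' ';
	*bpp += need;
	*blen -= (int)need;
	return 0;
}

/* Encode a two-field request and terminate the record with a newline. */
static int encode_request(char **bpp, int *blen,
			  const char *domain, const char *addr)
{
	if (append_word(bpp, blen, domain) < 0 ||
	    append_word(bpp, blen, addr) < 0)
		return -1;
	(*bpp)[-1] = '\n';	/* final separator becomes the terminator */
	return 0;
}
```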
0193 
0194 .. note::
0195   If a cache has no active readers on the channel, and has had not
0196   active readers for more than 60 seconds, further requests will not be
0197   added to the channel but instead all lookups that do not find a valid
0198   entry will fail.  This is partly for backward compatibility: The
0199   previous nfs exports table was deemed to be authoritative and a
0200   failed lookup meant a definite 'no'.
0201 
0202 request/response format
0203 -----------------------
0204 
0205 While each cache is free to use its own format for requests
0206 and responses over the channel, the following is recommended as
0207 appropriate and support routines are available to help:
0208 Each request or response record should be printable ASCII
0209 with precisely one newline character which should be at the end.
0210 Fields within the record should be separated by spaces, normally one.
0211 If spaces, newlines, or nul characters are needed in a field they
0212 must be quoted.  Two mechanisms are available:
0213 
0214 -  If a field begins '\x' then it must contain an even number of
0215    hex digits, and pairs of these digits provide the bytes in the
0216    field.
0217 -  Otherwise, a \ in the field must be followed by 3 octal digits
0218    which give the code for a byte.  Other characters are treated
0219    as themselves.  At the very least, space, newline, nul, and
0220    '\' must be quoted in this way.
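
The two quoting mechanisms can be sketched in user-space C as below.  The
function names are hypothetical and this is not the kernel's own support
code: quote_field emits the octal form for the four characters that must
be quoted, and unquote_field accepts either form.

```c
#include <assert.h>
#include <stddef.h>

/* Quote 'len' bytes of 'src' into 'dst' using \ooo escapes.
 * Returns the quoted length (excluding the NUL), or -1 if too small. */
static int quote_field(const unsigned char *src, size_t len,
		       char *dst, size_t dstlen)
{
	size_t out = 0;

	assert(src && dst);
	if (dstlen == 0)
		return -1;
	for (size_t i = 0; i < len; i++) {
		unsigned char c = src[i];

		if (c == ' ' || c == '\n' || c == '\0' || c == '\\') {
			if (out + 4 >= dstlen)
				return -1;
			dst[out++] = '\\';
			dst[out++] = (char)('0' + ((c >> 6) & 3));
			dst[out++] = (char)('0' + ((c >> 3) & 7));
			dst[out++] = (char)('0' + (c & 7));
		} else {
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = (char)c;
		}
	}
	dst[out] = '\0';
	return (int)out;
}

/* Value of one hex digit, or -1 if 'c' is not a hex digit. */
static int hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Decode one NUL-terminated quoted field into raw bytes.
 * Returns the number of bytes produced, or -1 on malformed input. */
static int unquote_field(const char *src, unsigned char *dst, size_t dstlen)
{
	size_t out = 0;

	assert(src && dst);
	if (src[0] == '\\' && src[1] == 'x') {	/* hex form */
		for (src += 2; *src; src += 2) {
			int hi = hexval(src[0]);
			int lo = hi < 0 ? -1 : hexval(src[1]);

			if (lo < 0 || out >= dstlen)
				return -1;
			dst[out++] = (unsigned char)((hi << 4) | lo);
		}
		return (int)out;
	}
	while (*src) {				/* octal form */
		if (out >= dstlen)
			return -1;
		if (*src != '\\') {
			dst[out++] = (unsigned char)*src++;
			continue;
		}
		if (src[1] < '0' || src[1] > '3' || src[2] < '0' ||
		    src[2] > '7' || src[3] < '0' || src[3] > '7')
			return -1;
		dst[out++] = (unsigned char)(((src[1] - '0') << 6) |
			     ((src[2] - '0') << 3) | (src[3] - '0'));
		src += 4;
	}
	return (int)out;
}
```

The octal form keeps mostly-printable fields readable, while the hex form
is the simpler choice for arbitrary binary data.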