============================
A block layer cache (bcache)
============================

Say you've got a big slow RAID 6, and an SSD or three. Wouldn't it be
nice if you could use them as a cache... Hence bcache.

The bcache wiki can be found at:
  https://bcache.evilpiepirate.org

This is the git repository of bcache-tools:
  https://git.kernel.org/pub/scm/linux/kernel/git/colyli/bcache-tools.git/

The latest bcache kernel code can be found in the mainline Linux kernel:
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/

It's designed around the performance characteristics of SSDs - it only allocates
in erase block sized buckets, and it uses a hybrid btree/log to track cached
extents (which can be anywhere from a single sector to the bucket size). It's
designed to avoid random writes at all costs; it fills up an erase block
sequentially, then issues a discard before reusing it.

Both writethrough and writeback caching are supported. Writeback defaults to
off, but can be switched on and off arbitrarily at runtime. Bcache goes to
great lengths to protect your data - it reliably handles unclean shutdown. (It
doesn't even have a notion of a clean shutdown; bcache simply doesn't return
writes as completed until they're on stable storage).

Writeback caching can use most of the cache for buffering writes - writing
dirty data to the backing device is always done sequentially, scanning from the
start to the end of the index.

Since random IO is what SSDs excel at, there generally won't be much benefit
to caching large sequential IO. Bcache detects sequential IO and skips it;
it also keeps a rolling average of the IO sizes per task, and as long as the
average is above the cutoff it will skip all IO from that task - instead of
caching the first 512k after every seek. Backups and large file copies should
thus entirely bypass the cache.

In the event of a data IO error on the flash it will try to recover by reading
from disk or invalidating cache entries.  For unrecoverable errors (metadata
or dirty data), caching is automatically disabled; if dirty data was present
in the cache it first disables writeback caching and waits for all dirty data
to be flushed.

Getting started
---------------

You'll need the bcache utility from the bcache-tools repository. Both the cache
device and backing device must be formatted before use::

  bcache make -B /dev/sdb
  bcache make -C /dev/sdc

`bcache make` has the ability to format multiple devices at the same time - if
you format your backing devices and cache device at the same time, you won't
have to manually attach::

  bcache make -B /dev/sda /dev/sdb -C /dev/sdc

If your bcache-tools is not updated to the latest version and does not have the
unified `bcache` utility, you may use the legacy `make-bcache` utility to format
bcache devices with the same -B and -C parameters.

bcache-tools now ships udev rules, and bcache devices are known to the kernel
immediately.  Without udev, you can manually register devices like this::

  echo /dev/sdb > /sys/fs/bcache/register
  echo /dev/sdc > /sys/fs/bcache/register

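If you register devices by hand often (for instance from an early boot script),
the two echo commands can be wrapped in a small helper. This is only a sketch:
the function name and the SYSFS_ROOT override are made up for illustration and
are not part of bcache or bcache-tools.

```shell
#!/bin/sh
# Sketch: register bcache devices by hand when udev is not doing it.
# SYSFS_ROOT is a hypothetical override (not a bcache feature) so that the
# function can be tried against a scratch directory; it defaults to /sys.

bcache_register() {
    reg="${SYSFS_ROOT:-/sys}/fs/bcache/register"
    if [ ! -e "$reg" ]; then
        echo "bcache: $reg not found (is the bcache module loaded?)" >&2
        return 1
    fi
    for dev in "$@"; do
        # One device path per write, exactly like the echo commands above.
        echo "$dev" > "$reg" || return 1
    done
}

# Usage (as root):
#   bcache_register /dev/sdb /dev/sdc
```

The helper stops at the first failed write, so a typo in one device name does
not silently skip the rest.
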
Registering the backing device makes the bcache device show up in /dev; you can
now format it and use it as normal. But the first time using a new bcache
device, it'll be running in passthrough mode until you attach it to a cache.
If you are thinking about using bcache later, it is recommended to set up all
your slow devices as bcache backing devices without a cache, and you can choose
to add a caching device later.
See the 'Attaching' section below.

The devices show up as::

  /dev/bcache<N>

As well as (with udev)::

  /dev/bcache/by-uuid/<uuid>
  /dev/bcache/by-label/<label>

To get started::

  mkfs.ext4 /dev/bcache0
  mount /dev/bcache0 /mnt

You can control bcache devices through sysfs at /sys/block/bcache<N>/bcache.
You can also control them through /sys/fs/bcache/<cset-uuid>/.

Cache devices are managed as sets; multiple caches per set aren't supported yet
but will allow for mirroring of metadata and dirty data in the future. Your new
cache set shows up as /sys/fs/bcache/<UUID>.

Attaching
---------

After your cache device and backing device are registered, the backing device
must be attached to your cache set to enable caching. Attaching a backing
device to a cache set is done thusly, with the UUID of the cache set in
/sys/fs/bcache::

  echo <CSET-UUID> > /sys/block/bcache0/bcache/attach

This only has to be done once. The next time you reboot, just reregister all
your bcache devices. If a backing device has data in a cache somewhere, the
/dev/bcache<N> device won't be created until the cache shows up - particularly
important if you have writeback caching turned on.

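Looking up the cache set UUID and attaching can be scripted roughly as follows.
This is a sketch under two assumptions: every subdirectory of /sys/fs/bcache is
a registered cache set, and SYSFS_ROOT (a made-up variable, as are the function
names) is only there so the functions can be pointed at a scratch tree.

```shell
#!/bin/sh
# Sketch: find registered cache sets and attach a backing device to one.
# SYSFS_ROOT and both function names are hypothetical, for illustration only.

bcache_list_csets() {
    base="${SYSFS_ROOT:-/sys}/fs/bcache"
    # Assumption: only cache sets appear as directories here; the register
    # files in the same place are regular files and are skipped.
    for d in "$base"/*/; do
        [ -d "$d" ] || continue
        basename "$d"
    done
}

bcache_attach() {
    # $1: bcache device name (e.g. bcache0), $2: cache set UUID
    root="${SYSFS_ROOT:-/sys}"
    if [ ! -d "$root/fs/bcache/$2" ]; then
        echo "bcache: cache set $2 is not registered" >&2
        return 1
    fi
    echo "$2" > "$root/block/$1/bcache/attach"
}

# Usage (as root):
#   bcache_list_csets
#   bcache_attach bcache0 <CSET-UUID>
```

Checking for the cache set directory first turns the "cache set not found"
kernel message into an error you see before anything is written.
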
If you're booting up and your cache device is gone and never coming back, you
can force run the backing device::

  echo 1 > /sys/block/sdb/bcache/running

(You need to use /sys/block/sdb (or whatever your backing device is called), not
/sys/block/bcache0, because bcache0 doesn't exist yet. If you're using a
partition, the bcache directory would be at /sys/block/sdb/sdb2/bcache)

The backing device will still use that cache set if it shows up in the future,
but all the cached data will be invalidated. If there was dirty data in the
cache, don't expect the filesystem to be recoverable - you will have massive
filesystem corruption, though ext4's fsck does work miracles.

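Working out whether the bcache directory sits under the disk or under one of
its partitions can be avoided by going through /sys/class/block, which has one
entry per block device, partitions included. A minimal sketch, with
hypothetical function names and a SYSFS_ROOT override that exists only to make
the code testable:

```shell
#!/bin/sh
# Sketch: locate the bcache sysfs directory of a backing device and force it
# to run without its cache. Names and SYSFS_ROOT are illustrative only.

bcache_dir() {
    # $1: kernel name of the backing device, e.g. sdb or sdb2
    echo "${SYSFS_ROOT:-/sys}/class/block/$1/bcache"
}

bcache_force_run() {
    dir=$(bcache_dir "$1")
    if [ ! -d "$dir" ]; then
        echo "bcache: $1 is not a registered bcache device" >&2
        return 1
    fi
    # Same effect as: echo 1 > /sys/block/sdb/bcache/running
    echo 1 > "$dir/running"
}

# Usage (as root), only when the cache device is really gone:
#   bcache_force_run sdb2
```

The warning above still applies: forcing a device that had dirty data in the
cache will most likely leave you with a corrupted filesystem.
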
Error Handling
--------------

Bcache tries to transparently handle IO errors to/from the cache device without
affecting normal operation; if it sees too many errors (the threshold is
configurable, and defaults to 0) it shuts down the cache device and switches all
the backing devices to passthrough mode.

 - For reads from the cache, if they error we just retry the read from the
   backing device.

 - For writethrough writes, if the write to the cache errors we just switch to
   invalidating the data at that LBA in the cache (i.e. the same thing we do for
   a write that bypasses the cache)

 - For writeback writes, we currently pass that error back up to the
   filesystem/userspace. This could be improved - we could retry it as a write
   that skips the cache so we don't have to error the write.

 - When we detach, we first try to flush any dirty data (if we were running in
   writeback mode). It currently doesn't do anything intelligent if it fails to
   read some of the dirty data, though.


Howto/cookbook
--------------

A) Starting a bcache with a missing caching device

If registering the backing device doesn't help because it's already registered,
you just need to force it to run without the cache::

        host:~# echo /dev/sdb1 > /sys/fs/bcache/register
        [  119.844831] bcache: register_bcache() error opening /dev/sdb1: device already registered

Next, you try to register your caching device if it's present. However
if it's absent, or registration fails for some reason, you can still
start your bcache without its cache, like so::

        host:/sys/block/sdb/sdb1/bcache# echo 1 > running

Note that this may cause data loss if you were running in writeback mode.


B) Bcache does not find its cache::

        host:/sys/block/md5/bcache# echo 0226553a-37cf-41d5-b3ce-8b1e944543a8 > attach
        [ 1933.455082] bcache: bch_cached_dev_attach() Couldn't find uuid for md5 in set
        [ 1933.478179] bcache: __cached_dev_store() Can't attach 0226553a-37cf-41d5-b3ce-8b1e944543a8
        [ 1933.478179] : cache set not found

In this case, the caching device was simply not registered at boot
or disappeared and came back, and needs to be (re-)registered::

        host:/sys/block/md5/bcache# echo /dev/sdh2 > /sys/fs/bcache/register


C) Corrupt bcache crashes the kernel at device registration time:

This should never happen.  If it does happen, then you have found a bug!
Please report it to the bcache development list: linux-bcache@vger.kernel.org

Be sure to provide as much information as you can, including kernel dmesg
output if available, so that we may assist.


D) Recovering data without bcache:

If bcache is not available in the kernel, a filesystem on the backing
device is still available at an 8KiB offset, or at whatever offset was
given with --data-offset when you originally formatted bcache with
`bcache make`. You can reach it via a loopdev of the backing device
created with that offset (--offset 8K by default).

For example::

        losetup -o 8192 /dev/loop0 /dev/your_bcache_backing_dev

This should present your unmodified backing device data in /dev/loop0.

If your cache is in writethrough mode, then you can safely discard the
cache device without losing data.


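When the device was formatted with a non-default data offset, the losetup
offset has to be computed from it. The sketch below assumes the offset is
known in 512-byte sectors (16 sectors being the 8KiB default); whether your
`bcache make --data-offset` value was given in sectors is an assumption you
should check against your bcache-tools version. The function name is made up.

```shell
#!/bin/sh
# Sketch: turn a bcache data offset given in 512-byte sectors into the byte
# offset that losetup -o expects. Assumes sector units; defaults to 16
# sectors, i.e. the 8KiB offset described above.

bcache_data_offset_bytes() {
    sectors="${1:-16}"
    echo $((sectors * 512))
}

# Usage (as root), mapping the data read-only to be on the safe side:
#   losetup -r -o "$(bcache_data_offset_bytes 16)" /dev/loop0 /dev/your_bcache_backing_dev
```

Mapping the loop device read-only (-r) is a precaution rather than a
requirement; it keeps a recovery attempt from modifying the backing device.
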
E) Wiping a cache device

::

        host:~# wipefs -a /dev/sdh2
        16 bytes were erased at offset 0x1018 (bcache)
        they were: c6 85 73 f6 4e 1a 45 ca 82 65 f5 7f 48 ba 6d 81

After you boot back with bcache enabled, you recreate the cache and attach it::

        host:~# bcache make -C /dev/sdh2
        UUID:                   7be7e175-8f4c-4f99-94b2-9c904d227045
        Set UUID:               5bc072a8-ab17-446d-9744-e247949913c1
        version:                0
        nbuckets:               106874
        block_size:             1
        bucket_size:            1024
        nr_in_set:              1
        nr_this_dev:            0
        first_bucket:           1
        [  650.511912] bcache: run_cache_set() invalidating existing data
        [  650.549228] bcache: register_cache() registered cache device sdh2

start backing device with missing cache::

        host:/sys/block/md5/bcache# echo 1 > running

attach new cache::

        host:/sys/block/md5/bcache# echo 5bc072a8-ab17-446d-9744-e247949913c1 > attach
        [  865.276616] bcache: bch_cached_dev_attach() Caching md5 as bcache0 on set 5bc072a8-ab17-446d-9744-e247949913c1


F) Remove or replace a caching device::

        host:/sys/block/sda/sda7/bcache# echo 1 > detach
        [  695.872542] bcache: cached_dev_detach_finish() Caching disabled for sda7

        host:~# wipefs -a /dev/nvme0n1p4
        wipefs: error: /dev/nvme0n1p4: probing initialization failed: Device or resource busy
        Ooops, it's disabled, but not unregistered, so it's still protected

We need to go and unregister it::

        host:/sys/fs/bcache/b7ba27a1-2398-4649-8ae3-0959f57ba128# ls -l cache0
        lrwxrwxrwx 1 root root 0 Feb 25 18:33 cache0 -> ../../../devices/pci0000:00/0000:00:1d.0/0000:70:00.0/nvme/nvme0/nvme0n1/nvme0n1p4/bcache/
        host:/sys/fs/bcache/b7ba27a1-2398-4649-8ae3-0959f57ba128# echo 1 > stop
        kernel: [  917.041908] bcache: cache_set_free() Cache set b7ba27a1-2398-4649-8ae3-0959f57ba128 unregistered

Now we can wipe it::

        host:~# wipefs -a /dev/nvme0n1p4
        /dev/nvme0n1p4: 16 bytes were erased at offset 0x00001018 (bcache): c6 85 73 f6 4e 1a 45 ca 82 65 f5 7f 48 ba 6d 81


G) dm-crypt and bcache

First set up bcache unencrypted and then install dmcrypt on top of
/dev/bcache<N>. This will work faster than if you dmcrypt both the backing
and caching devices and then install bcache on top. [benchmarks?]


H) Stop/free a registered bcache to wipe and/or recreate it

Suppose that you need to free up all bcache references so that you can
run fdisk and re-register a changed partition table, which won't work
if there are any active backing or caching devices left on it:

1) Is it present in /dev/bcache* ? (there are times where it won't be)

   If so, it's easy::

        host:/sys/block/bcache0/bcache# echo 1 > stop

2) But if your backing device is gone, this won't work::

        host:/sys/block/bcache0# cd bcache
        bash: cd: bcache: No such file or directory

   In this case, you may have to unregister the dmcrypt block device that
   references this bcache to free it up::

        host:~# dmsetup remove oldds1
        bcache: bcache_device_free() bcache0 stopped
        bcache: cache_set_free() Cache set 5bc072a8-ab17-446d-9744-e247949913c1 unregistered

   This causes the backing bcache to be removed from /sys/fs/bcache and
   then it can be reused.  This would be true of any block device stacking
   where bcache is a lower device.

3) In other cases, you can also look in /sys/fs/bcache/::

        host:/sys/fs/bcache# ls -l */{cache?,bdev?}
        lrwxrwxrwx 1 root root 0 Mar  5 09:39 0226553a-37cf-41d5-b3ce-8b1e944543a8/bdev1 -> ../../../devices/virtual/block/dm-1/bcache/
        lrwxrwxrwx 1 root root 0 Mar  5 09:39 0226553a-37cf-41d5-b3ce-8b1e944543a8/cache0 -> ../../../devices/virtual/block/dm-4/bcache/
        lrwxrwxrwx 1 root root 0 Mar  5 09:39 5bc072a8-ab17-446d-9744-e247949913c1/cache0 -> ../../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/ata10/host9/target9:0:0/9:0:0:0/block/sdl/sdl2/bcache/

   The device names will show which UUID is relevant, cd in that directory
   and stop the cache::

        host:/sys/fs/bcache/5bc072a8-ab17-446d-9744-e247949913c1# echo 1 > stop

   This will free up bcache references and let you reuse the partition for
   other purposes.



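The `ls -l */{cache?,bdev?}` listing from step 3 can be condensed into one line
per member device, which is easier to grep when several cache sets are
registered. A rough sketch; the function name and the SYSFS_ROOT override are
hypothetical, and readlink is assumed to be available on the system:

```shell
#!/bin/sh
# Sketch: print "<cset-uuid> <member> <sysfs target>" for every cache and
# backing device symlink found under /sys/fs/bcache.
# SYSFS_ROOT is an illustrative override, not a bcache feature.

bcache_list_members() {
    base="${SYSFS_ROOT:-/sys}/fs/bcache"
    for link in "$base"/*/cache[0-9]* "$base"/*/bdev[0-9]*; do
        # Unmatched globs stay literal and regular files are not symlinks;
        # both are skipped here.
        [ -L "$link" ] || continue
        cset=$(basename "$(dirname "$link")")
        printf '%s %s %s\n' "$cset" "$(basename "$link")" "$(readlink "$link")"
    done
}

# Usage:
#   bcache_list_members | grep sdl2
```

The first column of the matching line is the directory to cd into before
writing to stop.
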
Troubleshooting performance
---------------------------

Bcache has a bunch of config options and tunables. The defaults are intended to
be reasonable for typical desktop and server workloads, but they're not what you
want for getting the best possible numbers when benchmarking.

 - Backing device alignment

   The default metadata size in bcache is 8k.  If your backing device is
   RAID based, then be sure to align this by a multiple of your stride
   width using `bcache make --data-offset`. If you intend to expand your
   disk array in the future, then multiply a series of primes by your
   raid stripe size to get the disk multiples that you would like.

   For example:  If you have a 64k stripe size, then the following offset
   would provide alignment for many common RAID5 data spindle counts::

        64k * 2*2*2*3*3*5*7 bytes = 161280k

   That space is wasted, but for only 157.5MB you can grow your RAID 5
   volume to the following data-spindle counts without re-aligning::

        3,4,5,6,7,8,9,10,12,14,15,18,20,21 ...

 - Bad write performance

   If write performance is not what you expected, you probably wanted to be
   running in writeback mode, which isn't the default (not due to a lack of
   maturity, but simply because in writeback mode you'll lose data if something
   happens to your SSD)::

        # echo writeback > /sys/block/bcache0/bcache/cache_mode

 - Bad performance, or traffic not going to the SSD that you'd expect

   By default, bcache doesn't cache everything. It tries to skip sequential IO -
   because you really want to be caching the random IO, and if you copy a 10
   gigabyte file you probably don't want that pushing 10 gigabytes of randomly
   accessed data out of your cache.

   But if you want to benchmark reads from cache, and you start out with fio
   writing an 8 gigabyte test file, you'll want to disable that::

        # echo 0 > /sys/block/bcache0/bcache/sequential_cutoff

   To set it back to the default (4 MB), do::

        # echo 4M > /sys/block/bcache0/bcache/sequential_cutoff

 - Traffic's still going to the spindle/still getting cache misses

   In the real world, SSDs don't always keep up with disks - particularly with
   slower SSDs, many disks being cached by one SSD, or mostly sequential IO. So
   you want to avoid being bottlenecked by the SSD and having it slow everything
   down.

   To avoid that, bcache tracks latency to the cache device, and gradually
   throttles traffic if the latency exceeds a threshold (it does this by
   cranking down the sequential bypass).

   You can disable this if you need to by setting the thresholds to 0::

        # echo 0 > /sys/fs/bcache/<cache set>/congested_read_threshold_us
        # echo 0 > /sys/fs/bcache/<cache set>/congested_write_threshold_us

   The default is 2000 us (2 milliseconds) for reads, and 20000 us
   (20 milliseconds) for writes.

 - Still getting cache misses, of the same data

   One last issue that sometimes trips people up is actually an old bug, due to
   the way cache coherency is handled for cache misses. If a btree node is full,
   a cache miss won't be able to insert a key for the new data and the data
   won't be written to the cache.

   In practice this isn't an issue because as soon as a write comes along it'll
   cause the btree node to be split, and you need almost no write traffic for
   this to not show up enough to be noticeable (especially since bcache's btree
   nodes are huge and index large regions of the device). But when you're
   benchmarking, if you're trying to warm the cache by reading a bunch of data
   and there's no other traffic - that can be a problem.

   Solution: warm the cache by doing writes, or use the testing branch (there's
   a fix for the issue there).


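The arithmetic behind the backing device alignment example can be checked with
a few lines of shell. The offset is a whole number of full stripes for any
data-spindle count that divides the multiplier evenly, which is where the list
of spindle counts comes from. This is just a worked example; the variable and
function names are made up:

```shell
#!/bin/sh
# Worked example: data offset for a 64k stripe and the data-spindle counts it
# stays aligned for. All names here are illustrative.

stripe_kib=64
multiplier=$((2 * 2 * 2 * 3 * 3 * 5 * 7))   # 2520
offset_kib=$((stripe_kib * multiplier))      # 161280

aligned_spindles() {
    # $1: multiplier, $2: highest spindle count to test
    mult=$1
    max=$2
    n=3          # start at 3 to match the list in the text
    out=""
    while [ "$n" -le "$max" ]; do
        # Aligned when the offset is a multiple of n * stripe size,
        # i.e. when n divides the multiplier.
        if [ $((mult % n)) -eq 0 ]; then
            out="${out}${out:+,}${n}"
        fi
        n=$((n + 1))
    done
    echo "$out"
}

echo "data offset: ${offset_kib}k"
echo "aligned data-spindle counts: $(aligned_spindles "$multiplier" 21)"
```

Running it prints 161280k and the counts 3,4,5,6,7,8,9,10,12,14,15,18,20,21;
dropping a prime factor from the multiplier shrinks both the wasted space and
the list.
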
Sysfs - backing device
----------------------

Available at /sys/block/<bdev>/bcache, /sys/block/bcache*/bcache and
(if attached) /sys/fs/bcache/<cset-uuid>/bdev*

attach
  Echo the UUID of a cache set to this file to enable caching.

cache_mode
  One of writethrough, writeback, writearound or none.

clear_stats
  Writing to this file resets the running total stats (not the day/hour/5 minute
  decaying versions).

detach
  Write to this file to detach from a cache set. If there is dirty data in the
  cache, it will be flushed first.

dirty_data
  Amount of dirty data for this backing device in the cache. Continuously
  updated unlike the cache set's version, but may be slightly off.

label
  Name of underlying device.

readahead
  Size of readahead that should be performed.  Defaults to 0.  If set to e.g.
  1M, it will round cache miss reads up to that size, but without overlapping
  existing cache entries.

running
  1 if bcache is running (i.e. whether the /dev/bcache device exists, whether
  it's in passthrough mode or caching).

sequential_cutoff
  A sequential IO will bypass the cache once it passes this threshold; the
  most recent 128 IOs are tracked so sequential IO can be detected even when
  it isn't all done at once.

sequential_merge
  If nonzero, bcache keeps a list of the last 128 requests submitted to compare
  against all new requests to determine which new requests are sequential
  continuations of previous requests for the purpose of determining sequential
  cutoff. This is necessary if the sequential cutoff value is greater than the
  maximum acceptable sequential size for any single request.

state
  The backing device can be in one of four different states:

  no cache: Has never been attached to a cache set.

  clean: Part of a cache set, and there is no cached dirty data.

  dirty: Part of a cache set, and there is cached dirty data.

  inconsistent: The backing device was forcibly run by the user when there was
  dirty data cached but the cache set was unavailable; whatever data was on the
  backing device has likely been corrupted.

stop
  Write to this file to shut down the bcache device and close the backing
  device.

writeback_delay
  When dirty data is written to the cache and it previously did not contain
  any, waits some number of seconds before initiating writeback. Defaults to
  30.

writeback_percent
  If nonzero, bcache tries to keep around this percentage of the cache dirty by
  throttling background writeback and using a PD controller to smoothly adjust
  the rate.

writeback_rate
  Rate in sectors per second - if writeback_percent is nonzero, background
  writeback is throttled to this rate. Continuously adjusted by bcache but may
  also be set by the user.

writeback_running
  If off, writeback of dirty data will not take place at all. Dirty data will
  still be added to the cache until it is mostly full; only meant for
  benchmarking. Defaults to on.

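A quick way to see what a backing device is doing is to read cache_mode and
state together. The sketch below assumes that cache_mode reads back as the full
list of modes with the active one in square brackets (for example
"writethrough [writeback] writearound none"); that output format is not
described in this document, so treat it as an assumption. The function names
and SYSFS_ROOT are made up.

```shell
#!/bin/sh
# Sketch: one-line summary of a bcache device's cache mode and state.
# Assumes cache_mode marks the active mode with [brackets].
# SYSFS_ROOT is an illustrative override, not a bcache feature.

bcache_active_mode() {
    # $1: bcache sysfs directory; prints the bracketed entry of cache_mode
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1/cache_mode"
}

bcache_summary() {
    # $1: bcache device name, e.g. bcache0
    dir="${SYSFS_ROOT:-/sys}/block/$1/bcache"
    printf '%s: mode=%s state=%s\n' \
        "$1" "$(bcache_active_mode "$dir")" "$(cat "$dir/state")"
}

# Usage:
#   bcache_summary bcache0
```

A device reporting state=dirty still has data that exists only in the cache,
so it is worth checking this before pulling a cache device.
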
Sysfs - backing device stats
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are directories with these numbers for a running total, as well as
versions that decay over the past day, hour and 5 minutes; they're also
aggregated in the cache set directory as well.

bypassed
  Amount of IO (both reads and writes) that has bypassed the cache

cache_hits, cache_misses, cache_hit_ratio
  Hits and misses are counted per individual IO as bcache sees them; a
  partial hit is counted as a miss.

cache_bypass_hits, cache_bypass_misses
  Hits and misses for IO that is intended to skip the cache are still counted,
  but broken out here.

cache_miss_collisions
  Counts instances where data was going to be inserted into the cache from a
  cache miss, but raced with a write and data was already present (usually 0
  since the synchronization for cache misses was rewritten)

cache_readaheads
  Count of times readahead occurred.

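The hit ratio can also be recomputed from the raw counters, which is handy when
comparing two points in time. A minimal sketch: the function name is
hypothetical, and the stats directory names in the usage line (stats_total,
stats_hour and so on) are the names used by current kernels rather than
something spelled out above.

```shell
#!/bin/sh
# Sketch: integer hit percentage from the cache_hits and cache_misses files
# of one stats directory.

bcache_hit_ratio() {
    # $1: a stats directory, e.g. /sys/block/bcache0/bcache/stats_total
    hits=$(cat "$1/cache_hits") || return 1
    misses=$(cat "$1/cache_misses") || return 1
    total=$((hits + misses))
    if [ "$total" -eq 0 ]; then
        echo 0      # no IO counted yet; avoid dividing by zero
    else
        echo $((hits * 100 / total))
    fi
}

# Usage:
#   bcache_hit_ratio /sys/block/bcache0/bcache/stats_hour
```

Remember that a partial hit counts as a miss, so large IOs that are only partly
cached pull this number down.
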
Sysfs - cache set
~~~~~~~~~~~~~~~~~

Available at /sys/fs/bcache/<cset-uuid>

average_key_size
  Average data per key in the btree.

bdev<0..n>
  Symlink to each of the attached backing devices.

block_size
  Block size of the cache devices.

btree_cache_size
  Amount of memory currently used by the btree cache

bucket_size
  Size of buckets

cache<0..n>
  Symlink to each of the cache devices comprising this cache set.

cache_available_percent
  Percentage of cache device which doesn't contain dirty data, and could
  potentially be used for writeback.  This doesn't mean this space isn't used
  for clean cached data; the unused statistic (in priority_stats) is typically
  much lower.

clear_stats
  Clears the statistics associated with this cache

dirty_data
  Amount of dirty data in the cache (updated when garbage collection runs).

flash_vol_create
  Echoing a size to this file (in human readable units, k/M/G) creates a thinly
  provisioned volume backed by the cache set.

io_error_halflife, io_error_limit
  These determine how many errors we accept before disabling the cache.
  Each error is decayed by the half life (in # ios).  If the decaying count
  reaches io_error_limit, dirty data is written out and the cache is disabled.

journal_delay_ms
  Journal writes will delay for up to this many milliseconds, unless a cache
  flush happens sooner. Defaults to 100.

root_usage_percent
  Percentage of the root btree node in use.  If this gets too high the node
  will split, increasing the tree depth.

stop
  Write to this file to shut down the cache set - waits until all attached
  backing devices have been shut down.

tree_depth
  Depth of the btree (A single node btree has depth 0).

unregister
  Detaches all backing devices and closes the cache devices; if dirty data is
  present it will disable writeback caching and wait for it to be flushed.

Sysfs - cache set internal
~~~~~~~~~~~~~~~~~~~~~~~~~~

This directory also exposes timings for a number of internal operations, with
separate files for average duration, average frequency, last occurrence and max
duration: garbage collection, btree read, btree node sorts and btree splits.

active_journal_entries
  Number of journal entries that are newer than the index.

btree_nodes
  Total nodes in the btree.

btree_used_percent
  Average fraction of btree in use.

bset_tree_stats
  Statistics about the auxiliary search trees

btree_cache_max_chain
  Longest chain in the btree node cache's hash table

cache_read_races
  Counts instances where while data was being read from the cache, the bucket
  was reused and invalidated - i.e. where the pointer was stale after the read
  completed. When this occurs the data is reread from the backing device.

trigger_gc
  Writing to this file forces garbage collection to run.

Sysfs - Cache device
~~~~~~~~~~~~~~~~~~~~

Available at /sys/block/<cdev>/bcache

block_size
  Minimum granularity of writes - should match hardware sector size.

btree_written
  Sum of all btree writes, in (kilo/mega/giga) bytes

bucket_size
  Size of buckets

cache_replacement_policy
  One of lru, fifo or random.

discard
  Boolean; if on a discard/TRIM will be issued to each bucket before it is
  reused. Defaults to off, since SATA TRIM is an unqueued command (and thus
  slow).

freelist_percent
  Size of the freelist as a percentage of nbuckets. Can be written to in order
  to increase the number of buckets kept on the freelist, which lets you
  artificially reduce the size of the cache at runtime. Mostly for testing
  purposes (i.e. testing how different size caches affect your hit rate), but
  since buckets are discarded when they move on to the freelist will also make
  the SSD's garbage collection easier by effectively giving it more reserved
  space.

io_errors
  Number of errors that have occurred, decayed by io_error_halflife.

metadata_written
  Sum of all non data writes (btree writes and all other metadata).

nbuckets
  Total buckets in this cache

priority_stats
  Statistics about how recently data in the cache has been accessed.
  This can reveal your working set size.  Unused is the percentage of
  the cache that doesn't contain any data.  Metadata is bcache's
  metadata overhead.  Average is the average priority of cache buckets.
  Next is a list of quantiles with the priority threshold of each.

written
  Sum of all data that has been written to the cache; comparison with
  btree_written gives the amount of write inflation in bcache.