0001 ============================
0002 A block layer cache (bcache)
0003 ============================
0004
Say you've got a big, slow RAID 6 and an SSD or three. Wouldn't it be
nice if you could use them as a cache? Hence bcache.
0007
0008 The bcache wiki can be found at:
0009 https://bcache.evilpiepirate.org
0010
0011 This is the git repository of bcache-tools:
0012 https://git.kernel.org/pub/scm/linux/kernel/git/colyli/bcache-tools.git/
0013
The latest bcache kernel code can be found in the mainline Linux kernel:
0015 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
0016
0017 It's designed around the performance characteristics of SSDs - it only allocates
0018 in erase block sized buckets, and it uses a hybrid btree/log to track cached
0019 extents (which can be anywhere from a single sector to the bucket size). It's
0020 designed to avoid random writes at all costs; it fills up an erase block
0021 sequentially, then issues a discard before reusing it.
0022
0023 Both writethrough and writeback caching are supported. Writeback defaults to
0024 off, but can be switched on and off arbitrarily at runtime. Bcache goes to
0025 great lengths to protect your data - it reliably handles unclean shutdown. (It
0026 doesn't even have a notion of a clean shutdown; bcache simply doesn't return
0027 writes as completed until they're on stable storage).
0028
0029 Writeback caching can use most of the cache for buffering writes - writing
0030 dirty data to the backing device is always done sequentially, scanning from the
0031 start to the end of the index.
0032
0033 Since random IO is what SSDs excel at, there generally won't be much benefit
0034 to caching large sequential IO. Bcache detects sequential IO and skips it;
0035 it also keeps a rolling average of the IO sizes per task, and as long as the
0036 average is above the cutoff it will skip all IO from that task - instead of
0037 caching the first 512k after every seek. Backups and large file copies should
0038 thus entirely bypass the cache.
0039
0040 In the event of a data IO error on the flash it will try to recover by reading
from disk or invalidating cache entries. For unrecoverable errors (metadata
0042 or dirty data), caching is automatically disabled; if dirty data was present
0043 in the cache it first disables writeback caching and waits for all dirty data
0044 to be flushed.
0045
Getting started:
You'll need the bcache utility from the bcache-tools repository. Both the cache
device and backing device must be formatted before use::
0049
0050 bcache make -B /dev/sdb
0051 bcache make -C /dev/sdc
0052
0053 `bcache make` has the ability to format multiple devices at the same time - if
0054 you format your backing devices and cache device at the same time, you won't
0055 have to manually attach::
0056
0057 bcache make -B /dev/sda /dev/sdb -C /dev/sdc
0058
If your bcache-tools is not up to date and lacks the unified `bcache` utility,
you may use the legacy `make-bcache` utility to format bcache devices with the
same -B and -C parameters.
0062
0063 bcache-tools now ships udev rules, and bcache devices are known to the kernel
0064 immediately. Without udev, you can manually register devices like this::
0065
0066 echo /dev/sdb > /sys/fs/bcache/register
0067 echo /dev/sdc > /sys/fs/bcache/register
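
The manual registration step can be wrapped in a small helper. This is a
minimal sketch, not something shipped with bcache-tools: ``bcache_register``
and the ``SYSFS_ROOT`` variable are names invented here, and ``SYSFS_ROOT``
exists only so the function can be tried against a scratch directory instead of
the real /sys.

```shell
# Root of sysfs. Overridable so the helper can be exercised without touching
# real devices. SYSFS_ROOT and bcache_register are hypothetical names.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Write each device path given as an argument to the register file, one at a
# time, stopping at the first failure.
bcache_register() (
    reg="$SYSFS_ROOT/fs/bcache/register"
    if [ ! -e "$reg" ]; then
        echo "bcache_register: $reg not found (is the bcache module loaded?)" >&2
        exit 1
    fi
    for dev in "$@"; do
        echo "$dev" > "$reg" || exit 1
    done
)
```

On a real system this would be run as root, for example
``bcache_register /dev/sdb /dev/sdc``.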
0068
Registering the backing device makes the bcache device show up in /dev; you can
now format it and use it as normal. But the first time you use a new bcache
device, it'll be running in passthrough mode until you attach it to a cache.
If you are thinking about using bcache later, it is recommended to set up all
your slow devices as bcache backing devices without a cache; you can then add
a caching device later.
See the 'Attaching' section below.
0076
0077 The devices show up as::
0078
0079 /dev/bcache<N>
0080
0081 As well as (with udev)::
0082
0083 /dev/bcache/by-uuid/<uuid>
0084 /dev/bcache/by-label/<label>
0085
0086 To get started::
0087
0088 mkfs.ext4 /dev/bcache0
0089 mount /dev/bcache0 /mnt
0090
You can control bcache devices through sysfs at /sys/block/bcache<N>/bcache .
You can also control them through /sys/fs/bcache/<cset-uuid>/ .
0093
Cache devices are managed as sets; multiple caches per set aren't supported yet
but will allow for mirroring of metadata and dirty data in the future. Your new
cache set shows up as /sys/fs/bcache/<UUID>.
0097
0098 Attaching
0099 ---------
0100
0101 After your cache device and backing device are registered, the backing device
0102 must be attached to your cache set to enable caching. Attaching a backing
0103 device to a cache set is done thusly, with the UUID of the cache set in
0104 /sys/fs/bcache::
0105
0106 echo <CSET-UUID> > /sys/block/bcache0/bcache/attach
0107
0108 This only has to be done once. The next time you reboot, just reregister all
0109 your bcache devices. If a backing device has data in a cache somewhere, the
0110 /dev/bcache<N> device won't be created until the cache shows up - particularly
0111 important if you have writeback caching turned on.
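
Looking up the cache set UUID and attaching can be scripted. The sketch below
assumes that every directory under /sys/fs/bcache is a cache set named by its
UUID; ``bcache_list_csets``, ``bcache_attach`` and ``SYSFS_ROOT`` are made-up
names, and ``SYSFS_ROOT`` is only there to allow a dry run against a scratch
directory.

```shell
# SYSFS_ROOT, bcache_list_csets and bcache_attach are hypothetical names used
# only in this sketch.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the UUID of every registered cache set, one per line. Plain files
# such as "register" are skipped by the directory test.
bcache_list_csets() (
    for dir in "$SYSFS_ROOT"/fs/bcache/*/; do
        [ -d "$dir" ] || continue
        basename "$dir"
    done
)

# Attach a bcache device (e.g. bcache0) to a cache set. If no UUID is given,
# use the only registered cache set and refuse to guess otherwise.
bcache_attach() (
    bdev="$1"
    uuid="${2:-}"
    if [ -z "$uuid" ]; then
        uuid=$(bcache_list_csets)
        count=$(printf '%s\n' "$uuid" | grep -c . || true)
        if [ "$count" -ne 1 ]; then
            echo "bcache_attach: found $count cache sets, pass a UUID" >&2
            exit 1
        fi
    fi
    echo "$uuid" > "$SYSFS_ROOT/block/$bdev/bcache/attach"
)
```

With a single cache set registered, ``bcache_attach bcache0`` is enough; with
several, pass the UUID as the second argument.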
0112
0113 If you're booting up and your cache device is gone and never coming back, you
0114 can force run the backing device::
0115
0116 echo 1 > /sys/block/sdb/bcache/running
0117
0118 (You need to use /sys/block/sdb (or whatever your backing device is called), not
0119 /sys/block/bcache0, because bcache0 doesn't exist yet. If you're using a
partition, the bcache directory would be at /sys/block/sdb/sdb2/bcache.)
0121
0122 The backing device will still use that cache set if it shows up in the future,
0123 but all the cached data will be invalidated. If there was dirty data in the
0124 cache, don't expect the filesystem to be recoverable - you will have massive
0125 filesystem corruption, though ext4's fsck does work miracles.
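
The difference between the whole-disk and partition paths described above can
be hidden behind one lookup. This sketch relies on /sys/class/block, which has
one entry per disk and per partition; the function names and ``SYSFS_ROOT`` are
illustrative, not part of any bcache tool.

```shell
# SYSFS_ROOT, bcache_sysfs_dir and bcache_force_run are hypothetical names.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the bcache sysfs directory of a backing device, given either a device
# node (/dev/sdb2) or a bare name (sdb2).
bcache_sysfs_dir() (
    name=$(basename "$1")
    dir="$SYSFS_ROOT/class/block/$name/bcache"
    if [ ! -d "$dir" ]; then
        echo "bcache_sysfs_dir: $name is not a registered bcache device" >&2
        exit 1
    fi
    echo "$dir"
)

# Force the backing device to run without its cache. As warned above, any
# dirty data left in the missing cache is lost.
bcache_force_run() (
    dir=$(bcache_sysfs_dir "$1") || exit 1
    echo 1 > "$dir/running"
)
```

For example, ``bcache_force_run /dev/sdb2`` writes to the same ``running`` file
as the manual command, without having to spell out the partition path.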
0126
0127 Error Handling
0128 --------------
0129
0130 Bcache tries to transparently handle IO errors to/from the cache device without
0131 affecting normal operation; if it sees too many errors (the threshold is
0132 configurable, and defaults to 0) it shuts down the cache device and switches all
0133 the backing devices to passthrough mode.
0134
0135 - For reads from the cache, if they error we just retry the read from the
0136 backing device.
0137
0138 - For writethrough writes, if the write to the cache errors we just switch to
0139 invalidating the data at that lba in the cache (i.e. the same thing we do for
0140 a write that bypasses the cache)
0141
0142 - For writeback writes, we currently pass that error back up to the
0143 filesystem/userspace. This could be improved - we could retry it as a write
0144 that skips the cache so we don't have to error the write.
0145
0146 - When we detach, we first try to flush any dirty data (if we were running in
0147 writeback mode). It currently doesn't do anything intelligent if it fails to
0148 read some of the dirty data, though.
0149
0150
0151 Howto/cookbook
0152 --------------
0153
0154 A) Starting a bcache with a missing caching device
0155
If registering the backing device doesn't help because it's already registered,
you just need to force it to run without the cache::
0158
0159 host:~# echo /dev/sdb1 > /sys/fs/bcache/register
0160 [ 119.844831] bcache: register_bcache() error opening /dev/sdb1: device already registered
0161
0162 Next, you try to register your caching device if it's present. However
0163 if it's absent, or registration fails for some reason, you can still
0164 start your bcache without its cache, like so::
0165
0166 host:/sys/block/sdb/sdb1/bcache# echo 1 > running
0167
0168 Note that this may cause data loss if you were running in writeback mode.
0169
0170
0171 B) Bcache does not find its cache::
0172
0173 host:/sys/block/md5/bcache# echo 0226553a-37cf-41d5-b3ce-8b1e944543a8 > attach
0174 [ 1933.455082] bcache: bch_cached_dev_attach() Couldn't find uuid for md5 in set
0175 [ 1933.478179] bcache: __cached_dev_store() Can't attach 0226553a-37cf-41d5-b3ce-8b1e944543a8
0176 [ 1933.478179] : cache set not found
0177
0178 In this case, the caching device was simply not registered at boot
0179 or disappeared and came back, and needs to be (re-)registered::
0180
0181 host:/sys/block/md5/bcache# echo /dev/sdh2 > /sys/fs/bcache/register
0182
0183
0184 C) Corrupt bcache crashes the kernel at device registration time:
0185
0186 This should never happen. If it does happen, then you have found a bug!
0187 Please report it to the bcache development list: linux-bcache@vger.kernel.org
0188
Be sure to provide as much information as you can, including kernel dmesg
output if available, so that we may assist.
0191
0192
0193 D) Recovering data without bcache:
0194
If bcache is not available in the kernel, a filesystem on the backing
device is still available at an 8KiB offset. You can reach it through a loop
device of the backing device created with --offset 8K, or with whatever value
you passed to --data-offset when you originally formatted bcache with
`bcache make`.
0199
0200 For example::
0201
0202 losetup -o 8192 /dev/loop0 /dev/your_bcache_backing_dev
0203
This should present your unmodified backing device data in /dev/loop0.

If your cache is in writethrough mode, then you can safely discard the
cache device without losing data.
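
If the device was formatted with a custom --data-offset, the losetup offset has
to be converted to bytes first. The sketch below assumes the offset was given
in 512-byte sectors (so the 8KiB default corresponds to 16 sectors); check what
your bcache-tools version expects. It only prints the command, and adds
``--read-only`` as a precaution. All function and variable names are made up
for the example.

```shell
# SECTOR_BYTES, DEFAULT_OFFSET_SECTORS, offset_bytes and loop_command are
# illustrative names only.
SECTOR_BYTES=512
DEFAULT_OFFSET_SECTORS=16   # 16 * 512 bytes = the 8KiB default

# Convert a data offset in 512-byte sectors to bytes; defaults to 8KiB.
offset_bytes() {
    echo $(( ${1:-$DEFAULT_OFFSET_SECTORS} * SECTOR_BYTES ))
}

# Print, without running it, a losetup command that exposes the data area of
# a backing device read-only.
# Arguments: backing device, loop device, optional offset in sectors.
loop_command() {
    printf 'losetup --read-only -o %s %s %s\n' \
        "$(offset_bytes "${3:-}")" "$2" "$1"
}
```

``loop_command /dev/your_bcache_backing_dev /dev/loop0`` prints the same
command as above with the 8192 byte default, plus the read-only flag.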
0208
0209
0210 E) Wiping a cache device
0211
0212 ::
0213
0214 host:~# wipefs -a /dev/sdh2
0215 16 bytes were erased at offset 0x1018 (bcache)
0216 they were: c6 85 73 f6 4e 1a 45 ca 82 65 f5 7f 48 ba 6d 81
0217
0218 After you boot back with bcache enabled, you recreate the cache and attach it::
0219
0220 host:~# bcache make -C /dev/sdh2
0221 UUID: 7be7e175-8f4c-4f99-94b2-9c904d227045
0222 Set UUID: 5bc072a8-ab17-446d-9744-e247949913c1
0223 version: 0
0224 nbuckets: 106874
0225 block_size: 1
0226 bucket_size: 1024
0227 nr_in_set: 1
0228 nr_this_dev: 0
0229 first_bucket: 1
0230 [ 650.511912] bcache: run_cache_set() invalidating existing data
0231 [ 650.549228] bcache: register_cache() registered cache device sdh2
0232
Start the backing device with the missing cache::
0234
0235 host:/sys/block/md5/bcache# echo 1 > running
0236
Attach the new cache::
0238
0239 host:/sys/block/md5/bcache# echo 5bc072a8-ab17-446d-9744-e247949913c1 > attach
0240 [ 865.276616] bcache: bch_cached_dev_attach() Caching md5 as bcache0 on set 5bc072a8-ab17-446d-9744-e247949913c1
0241
0242
0243 F) Remove or replace a caching device::
0244
0245 host:/sys/block/sda/sda7/bcache# echo 1 > detach
0246 [ 695.872542] bcache: cached_dev_detach_finish() Caching disabled for sda7
0247
0248 host:~# wipefs -a /dev/nvme0n1p4
0249 wipefs: error: /dev/nvme0n1p4: probing initialization failed: Device or resource busy
0250 Ooops, it's disabled, but not unregistered, so it's still protected
0251
0252 We need to go and unregister it::
0253
0254 host:/sys/fs/bcache/b7ba27a1-2398-4649-8ae3-0959f57ba128# ls -l cache0
0255 lrwxrwxrwx 1 root root 0 Feb 25 18:33 cache0 -> ../../../devices/pci0000:00/0000:00:1d.0/0000:70:00.0/nvme/nvme0/nvme0n1/nvme0n1p4/bcache/
0256 host:/sys/fs/bcache/b7ba27a1-2398-4649-8ae3-0959f57ba128# echo 1 > stop
0257 kernel: [ 917.041908] bcache: cache_set_free() Cache set b7ba27a1-2398-4649-8ae3-0959f57ba128 unregistered
0258
0259 Now we can wipe it::
0260
0261 host:~# wipefs -a /dev/nvme0n1p4
0262 /dev/nvme0n1p4: 16 bytes were erased at offset 0x00001018 (bcache): c6 85 73 f6 4e 1a 45 ca 82 65 f5 7f 48 ba 6d 81
0263
0264
0265 G) dm-crypt and bcache
0266
First set up bcache unencrypted and then install dmcrypt on top of
/dev/bcache<N>. This will work faster than if you dmcrypt both the backing
and caching devices and then install bcache on top. [benchmarks?]
0270
0271
0272 H) Stop/free a registered bcache to wipe and/or recreate it
0273
Suppose that you need to free up all bcache references so that you can
run fdisk and re-register a changed partition table, which won't work
if there are any active backing or caching devices left on it:
0277
0278 1) Is it present in /dev/bcache* ? (there are times where it won't be)
0279
0280 If so, it's easy::
0281
0282 host:/sys/block/bcache0/bcache# echo 1 > stop
0283
0284 2) But if your backing device is gone, this won't work::
0285
0286 host:/sys/block/bcache0# cd bcache
0287 bash: cd: bcache: No such file or directory
0288
0289 In this case, you may have to unregister the dmcrypt block device that
0290 references this bcache to free it up::
0291
0292 host:~# dmsetup remove oldds1
0293 bcache: bcache_device_free() bcache0 stopped
0294 bcache: cache_set_free() Cache set 5bc072a8-ab17-446d-9744-e247949913c1 unregistered
0295
0296 This causes the backing bcache to be removed from /sys/fs/bcache and
0297 then it can be reused. This would be true of any block device stacking
0298 where bcache is a lower device.
0299
0300 3) In other cases, you can also look in /sys/fs/bcache/::
0301
0302 host:/sys/fs/bcache# ls -l */{cache?,bdev?}
0303 lrwxrwxrwx 1 root root 0 Mar 5 09:39 0226553a-37cf-41d5-b3ce-8b1e944543a8/bdev1 -> ../../../devices/virtual/block/dm-1/bcache/
0304 lrwxrwxrwx 1 root root 0 Mar 5 09:39 0226553a-37cf-41d5-b3ce-8b1e944543a8/cache0 -> ../../../devices/virtual/block/dm-4/bcache/
0305 lrwxrwxrwx 1 root root 0 Mar 5 09:39 5bc072a8-ab17-446d-9744-e247949913c1/cache0 -> ../../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/ata10/host9/target9:0:0/9:0:0:0/block/sdl/sdl2/bcache/
0306
The device names will show which UUID is relevant; cd into that directory
and stop the cache::
0309
0310 host:/sys/fs/bcache/5bc072a8-ab17-446d-9744-e247949913c1# echo 1 > stop
0311
0312 This will free up bcache references and let you reuse the partition for
0313 other purposes.
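
The ``ls -l`` listing in step 3 can be reduced to one line per member device.
This is a rough sketch that resolves the ``cache<N>`` and ``bdev<N>`` symlinks
and prints the cache set UUID, the link name and the underlying device;
``bcache_list_members`` and ``SYSFS_ROOT`` are names invented for the example.

```shell
# SYSFS_ROOT and bcache_list_members are hypothetical names.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# For every cache<N> and bdev<N> symlink in every cache set, print
# "<cset-uuid> <link-name> <device-name>".
bcache_list_members() (
    for link in "$SYSFS_ROOT"/fs/bcache/*/cache[0-9]* \
                "$SYSFS_ROOT"/fs/bcache/*/bdev[0-9]*; do
        [ -L "$link" ] || continue          # also skips unmatched globs
        target=$(readlink "$link")
        target=${target%/}                  # tolerate a trailing slash
        dev=$(basename "${target%/bcache}") # .../sdl/sdl2/bcache -> sdl2
        uuid=$(basename "$(dirname "$link")")
        printf '%s %s %s\n' "$uuid" "$(basename "$link")" "$dev"
    done
)
```

The first column of the line that names your device is the directory under
/sys/fs/bcache whose ``stop`` file you want.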
0314
0315
0316
0317 Troubleshooting performance
0318 ---------------------------
0319
0320 Bcache has a bunch of config options and tunables. The defaults are intended to
0321 be reasonable for typical desktop and server workloads, but they're not what you
0322 want for getting the best possible numbers when benchmarking.
0323
0324 - Backing device alignment
0325
0326 The default metadata size in bcache is 8k. If your backing device is
0327 RAID based, then be sure to align this by a multiple of your stride
0328 width using `bcache make --data-offset`. If you intend to expand your
0329 disk array in the future, then multiply a series of primes by your
0330 raid stripe size to get the disk multiples that you would like.
0331
0332 For example: If you have a 64k stripe size, then the following offset
0333 would provide alignment for many common RAID5 data spindle counts::
0334
0335 64k * 2*2*2*3*3*5*7 bytes = 161280k
0336
0337 That space is wasted, but for only 157.5MB you can grow your RAID 5
0338 volume to the following data-spindle counts without re-aligning::
0339
0340 3,4,5,6,7,8,9,10,12,14,15,18,20,21 ...
0341
0342 - Bad write performance
0343
0344 If write performance is not what you expected, you probably wanted to be
0345 running in writeback mode, which isn't the default (not due to a lack of
0346 maturity, but simply because in writeback mode you'll lose data if something
0347 happens to your SSD)::
0348
0349 # echo writeback > /sys/block/bcache0/bcache/cache_mode
0350
0351 - Bad performance, or traffic not going to the SSD that you'd expect
0352
0353 By default, bcache doesn't cache everything. It tries to skip sequential IO -
0354 because you really want to be caching the random IO, and if you copy a 10
0355 gigabyte file you probably don't want that pushing 10 gigabytes of randomly
0356 accessed data out of your cache.
0357
  But if you want to benchmark reads from cache, and you start out with fio
  writing an 8 gigabyte test file, that sequential write will bypass the cache -
  so you want to disable that::
0360
0361 # echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
0362
  To set it back to the default (4 MB), do::
0364
0365 # echo 4M > /sys/block/bcache0/bcache/sequential_cutoff
0366
0367 - Traffic's still going to the spindle/still getting cache misses
0368
0369 In the real world, SSDs don't always keep up with disks - particularly with
0370 slower SSDs, many disks being cached by one SSD, or mostly sequential IO. So
0371 you want to avoid being bottlenecked by the SSD and having it slow everything
0372 down.
0373
0374 To avoid that bcache tracks latency to the cache device, and gradually
0375 throttles traffic if the latency exceeds a threshold (it does this by
0376 cranking down the sequential bypass).
0377
0378 You can disable this if you need to by setting the thresholds to 0::
0379
0380 # echo 0 > /sys/fs/bcache/<cache set>/congested_read_threshold_us
0381 # echo 0 > /sys/fs/bcache/<cache set>/congested_write_threshold_us
0382
  The default is 2000 us (2 milliseconds) for reads, and 20000 us (20
  milliseconds) for writes.
0384
0385 - Still getting cache misses, of the same data
0386
0387 One last issue that sometimes trips people up is actually an old bug, due to
0388 the way cache coherency is handled for cache misses. If a btree node is full,
0389 a cache miss won't be able to insert a key for the new data and the data
0390 won't be written to the cache.
0391
0392 In practice this isn't an issue because as soon as a write comes along it'll
0393 cause the btree node to be split, and you need almost no write traffic for
0394 this to not show up enough to be noticeable (especially since bcache's btree
0395 nodes are huge and index large regions of the device). But when you're
0396 benchmarking, if you're trying to warm the cache by reading a bunch of data
0397 and there's no other traffic - that can be a problem.
0398
0399 Solution: warm the cache by doing writes, or use the testing branch (there's
0400 a fix for the issue there).
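
The arithmetic in the 'Backing device alignment' item above can be checked with
a few lines of shell. The sketch recomputes the 64k example, converts it to
512-byte sectors (an assumption about the unit --data-offset takes; check your
bcache-tools version), and lists the data-spindle counts that divide the factor
evenly. Variable and function names are illustrative.

```shell
# Worked version of the 64k stripe example; all names are illustrative.
stripe_kib=64
factor=$(( 2*2*2*3*3*5*7 ))            # 2520
offset_kib=$(( stripe_kib * factor ))  # 161280
offset_sectors=$(( offset_kib * 2 ))   # two 512-byte sectors per KiB

# Print the spindle counts between $2 and $3 that divide $1 evenly, i.e. the
# counts for which the offset is a whole number of full stripes.
aligned_counts() (
    factor=$1
    n=$2
    out=""
    while [ "$n" -le "$3" ]; do
        if [ $(( factor % n )) -eq 0 ]; then
            out="${out:+$out,}$n"
        fi
        n=$(( n + 1 ))
    done
    echo "$out"
)

tenths=$(( offset_kib * 10 / 1024 ))   # size in tenths of a MB
echo "data offset: ${offset_kib}k = ${offset_sectors} sectors"
echo "wasted: $(( tenths / 10 )).$(( tenths % 10 ))MB"
echo "aligned data-spindle counts: $(aligned_counts "$factor" 3 21)"
```

Changing ``stripe_kib`` or the list of prime factors shows how much space a
different growth plan would cost.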
0401
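When benchmarking, the sequential bypass and congestion tunables described
above are usually changed together and put back afterwards. A minimal sketch,
using only the files and default values named in this section;
``bcache_bench_setup``, ``bcache_bench_restore`` and ``SYSFS_ROOT`` are
made-up names.

```shell
# SYSFS_ROOT and the two function names are hypothetical.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Disable sequential bypass and congestion throttling before a benchmark.
# $1 is the bcache device (e.g. bcache0), $2 the cache set UUID.
bcache_bench_setup() (
    echo 0 > "$SYSFS_ROOT/block/$1/bcache/sequential_cutoff" &&
    echo 0 > "$SYSFS_ROOT/fs/bcache/$2/congested_read_threshold_us" &&
    echo 0 > "$SYSFS_ROOT/fs/bcache/$2/congested_write_threshold_us"
)

# Put the documented defaults back: 4M cutoff, 2000 us for reads and
# 20000 us for writes.
bcache_bench_restore() (
    echo 4M > "$SYSFS_ROOT/block/$1/bcache/sequential_cutoff" &&
    echo 2000 > "$SYSFS_ROOT/fs/bcache/$2/congested_read_threshold_us" &&
    echo 20000 > "$SYSFS_ROOT/fs/bcache/$2/congested_write_threshold_us"
)
```

If you had tuned these values yourself, read and save the current contents of
the three files before the benchmark instead of restoring the defaults.
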
0402
0403 Sysfs - backing device
0404 ----------------------
0405
0406 Available at /sys/block/<bdev>/bcache, /sys/block/bcache*/bcache and
0407 (if attached) /sys/fs/bcache/<cset-uuid>/bdev*
0408
0409 attach
0410 Echo the UUID of a cache set to this file to enable caching.
0411
cache_mode
  Can be one of writethrough, writeback, writearound or none.
0414
0415 clear_stats
0416 Writing to this file resets the running total stats (not the day/hour/5 minute
0417 decaying versions).
0418
0419 detach
0420 Write to this file to detach from a cache set. If there is dirty data in the
0421 cache, it will be flushed first.
0422
0423 dirty_data
0424 Amount of dirty data for this backing device in the cache. Continuously
0425 updated unlike the cache set's version, but may be slightly off.
0426
0427 label
0428 Name of underlying device.
0429
0430 readahead
0431 Size of readahead that should be performed. Defaults to 0. If set to e.g.
0432 1M, it will round cache miss reads up to that size, but without overlapping
0433 existing cache entries.
0434
running
  1 if bcache is running (i.e. the /dev/bcache device exists, regardless of
  whether it's in passthrough mode or caching).
0438
0439 sequential_cutoff
0440 A sequential IO will bypass the cache once it passes this threshold; the
0441 most recent 128 IOs are tracked so sequential IO can be detected even when
0442 it isn't all done at once.
0443
0444 sequential_merge
0445 If non zero, bcache keeps a list of the last 128 requests submitted to compare
0446 against all new requests to determine which new requests are sequential
0447 continuations of previous requests for the purpose of determining sequential
0448 cutoff. This is necessary if the sequential cutoff value is greater than the
0449 maximum acceptable sequential size for any single request.
0450
0451 state
0452 The backing device can be in one of four different states:
0453
0454 no cache: Has never been attached to a cache set.
0455
0456 clean: Part of a cache set, and there is no cached dirty data.
0457
0458 dirty: Part of a cache set, and there is cached dirty data.
0459
0460 inconsistent: The backing device was forcibly run by the user when there was
0461 dirty data cached but the cache set was unavailable; whatever data was on the
0462 backing device has likely been corrupted.
0463
0464 stop
0465 Write to this file to shut down the bcache device and close the backing
0466 device.
0467
0468 writeback_delay
0469 When dirty data is written to the cache and it previously did not contain
0470 any, waits some number of seconds before initiating writeback. Defaults to
0471 30.
0472
0473 writeback_percent
0474 If nonzero, bcache tries to keep around this percentage of the cache dirty by
0475 throttling background writeback and using a PD controller to smoothly adjust
0476 the rate.
0477
0478 writeback_rate
0479 Rate in sectors per second - if writeback_percent is nonzero, background
0480 writeback is throttled to this rate. Continuously adjusted by bcache but may
0481 also be set by the user.
0482
0483 writeback_running
0484 If off, writeback of dirty data will not take place at all. Dirty data will
0485 still be added to the cache until it is mostly full; only meant for
0486 benchmarking. Defaults to on.
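
Several of the files above are handy to read together when checking on a
device. The sketch below prints the state, the active cache mode and the amount
of dirty data on one line. It assumes that ``cache_mode`` lists every mode and
marks the active one with square brackets, which is not stated above; verify
that on your kernel. Function names and ``SYSFS_ROOT`` are illustrative.

```shell
# SYSFS_ROOT, bcache_active_mode and bcache_summary are hypothetical names.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the active cache mode of a bcache device (e.g. bcache0).
# Assumption: the file reads like "writethrough [writeback] writearound none"
# with the active mode in brackets.
bcache_active_mode() (
    sed -n 's/.*\[\([a-z]*\)].*/\1/p' "$SYSFS_ROOT/block/$1/bcache/cache_mode"
)

# One-line summary built from the state, cache_mode and dirty_data files.
bcache_summary() (
    dir="$SYSFS_ROOT/block/$1/bcache"
    printf '%s: state=%s mode=%s dirty_data=%s\n' "$1" \
        "$(cat "$dir/state")" "$(bcache_active_mode "$1")" \
        "$(cat "$dir/dirty_data")"
)
```

A state of ``dirty`` together with a writeback mode is the case where losing
the cache device costs you data.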
0487
0488 Sysfs - backing device stats
0489 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0490
There are directories with these numbers for a running total, as well as
versions that decay over the past day, hour and 5 minutes; they're also
aggregated in the cache set directory.
0494
0495 bypassed
0496 Amount of IO (both reads and writes) that has bypassed the cache
0497
0498 cache_hits, cache_misses, cache_hit_ratio
0499 Hits and misses are counted per individual IO as bcache sees them; a
0500 partial hit is counted as a miss.
0501
0502 cache_bypass_hits, cache_bypass_misses
0503 Hits and misses for IO that is intended to skip the cache are still counted,
0504 but broken out here.
0505
0506 cache_miss_collisions
0507 Counts instances where data was going to be inserted into the cache from a
0508 cache miss, but raced with a write and data was already present (usually 0
0509 since the synchronization for cache misses was rewritten)
0510
0511 cache_readaheads
0512 Count of times readahead occurred.
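
The hit ratio can be recomputed from the raw counters, which is useful for
checking it over a window of your own choosing. This sketch assumes the stats
directories are named ``stats_total``, ``stats_day``, ``stats_hour`` and
``stats_five_minute`` (the names are not given above) and reports a whole
percentage; ``bcache_hit_ratio`` and ``SYSFS_ROOT`` are made-up names.

```shell
# SYSFS_ROOT and bcache_hit_ratio are hypothetical names.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print hits as an integer percentage of hits + misses.
# $1 is the bcache device (e.g. bcache0); $2 is the stats directory, which
# defaults to stats_total (assumed name, see the note above).
bcache_hit_ratio() (
    dir="$SYSFS_ROOT/block/$1/bcache/${2:-stats_total}"
    hits=$(cat "$dir/cache_hits") || exit 1
    misses=$(cat "$dir/cache_misses") || exit 1
    total=$(( hits + misses ))
    if [ "$total" -eq 0 ]; then
        echo 0                      # no IO counted yet
    else
        echo $(( hits * 100 / total ))
    fi
)
```

Remember that a partial hit is counted as a miss, so the ratio understates how
much data actually came from the cache.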
0513
0514 Sysfs - cache set
0515 ~~~~~~~~~~~~~~~~~
0516
0517 Available at /sys/fs/bcache/<cset-uuid>
0518
0519 average_key_size
0520 Average data per key in the btree.
0521
0522 bdev<0..n>
0523 Symlink to each of the attached backing devices.
0524
0525 block_size
0526 Block size of the cache devices.
0527
0528 btree_cache_size
0529 Amount of memory currently used by the btree cache
0530
0531 bucket_size
0532 Size of buckets
0533
0534 cache<0..n>
0535 Symlink to each of the cache devices comprising this cache set.
0536
0537 cache_available_percent
0538 Percentage of cache device which doesn't contain dirty data, and could
0539 potentially be used for writeback. This doesn't mean this space isn't used
0540 for clean cached data; the unused statistic (in priority_stats) is typically
0541 much lower.
0542
0543 clear_stats
0544 Clears the statistics associated with this cache
0545
dirty_data
  Amount of dirty data in the cache (updated when garbage collection runs).
0548
0549 flash_vol_create
0550 Echoing a size to this file (in human readable units, k/M/G) creates a thinly
0551 provisioned volume backed by the cache set.
0552
io_error_halflife, io_error_limit
  These determine how many errors we accept before disabling the cache.
  Each error is decayed by the half life (in # ios). If the decaying count
  reaches io_error_limit, dirty data is written out and the cache is disabled.
0557
0558 journal_delay_ms
0559 Journal writes will delay for up to this many milliseconds, unless a cache
0560 flush happens sooner. Defaults to 100.
0561
0562 root_usage_percent
0563 Percentage of the root btree node in use. If this gets too high the node
0564 will split, increasing the tree depth.
0565
0566 stop
0567 Write to this file to shut down the cache set - waits until all attached
0568 backing devices have been shut down.
0569
0570 tree_depth
0571 Depth of the btree (A single node btree has depth 0).
0572
0573 unregister
0574 Detaches all backing devices and closes the cache devices; if dirty data is
0575 present it will disable writeback caching and wait for it to be flushed.
0576
0577 Sysfs - cache set internal
0578 ~~~~~~~~~~~~~~~~~~~~~~~~~~
0579
0580 This directory also exposes timings for a number of internal operations, with
0581 separate files for average duration, average frequency, last occurrence and max
0582 duration: garbage collection, btree read, btree node sorts and btree splits.
0583
0584 active_journal_entries
0585 Number of journal entries that are newer than the index.
0586
0587 btree_nodes
0588 Total nodes in the btree.
0589
0590 btree_used_percent
0591 Average fraction of btree in use.
0592
0593 bset_tree_stats
0594 Statistics about the auxiliary search trees
0595
0596 btree_cache_max_chain
0597 Longest chain in the btree node cache's hash table
0598
0599 cache_read_races
0600 Counts instances where while data was being read from the cache, the bucket
0601 was reused and invalidated - i.e. where the pointer was stale after the read
0602 completed. When this occurs the data is reread from the backing device.
0603
0604 trigger_gc
0605 Writing to this file forces garbage collection to run.
0606
0607 Sysfs - Cache device
0608 ~~~~~~~~~~~~~~~~~~~~
0609
0610 Available at /sys/block/<cdev>/bcache
0611
0612 block_size
0613 Minimum granularity of writes - should match hardware sector size.
0614
0615 btree_written
0616 Sum of all btree writes, in (kilo/mega/giga) bytes
0617
0618 bucket_size
0619 Size of buckets
0620
0621 cache_replacement_policy
0622 One of either lru, fifo or random.
0623
0624 discard
0625 Boolean; if on a discard/TRIM will be issued to each bucket before it is
0626 reused. Defaults to off, since SATA TRIM is an unqueued command (and thus
0627 slow).
0628
freelist_percent
  Size of the freelist as a percentage of nbuckets. Can be written to in order
  to increase the number of buckets kept on the freelist, which lets you
  artificially reduce the size of the cache at runtime. Mostly for testing
  purposes (i.e. testing how different size caches affect your hit rate), but
  since buckets are discarded when they move on to the freelist, it will also
  make the SSD's garbage collection easier by effectively giving it more
  reserved space.
0637
0638 io_errors
0639 Number of errors that have occurred, decayed by io_error_halflife.
0640
0641 metadata_written
0642 Sum of all non data writes (btree writes and all other metadata).
0643
0644 nbuckets
0645 Total buckets in this cache
0646
0647 priority_stats
0648 Statistics about how recently data in the cache has been accessed.
0649 This can reveal your working set size. Unused is the percentage of
0650 the cache that doesn't contain any data. Metadata is bcache's
0651 metadata overhead. Average is the average priority of cache buckets.
0652 Next is a list of quantiles with the priority threshold of each.
0653
0654 written
0655 Sum of all data that has been written to the cache; comparison with
0656 btree_written gives the amount of write inflation in bcache.