0001 ===========================================
0002 Fault injection capabilities infrastructure
0003 ===========================================
0004
0005 See also drivers/md/md-faulty.c and "every_nth" module option for scsi_debug.
0006
0007
0008 Available fault injection capabilities
0009 --------------------------------------
0010
0011 - failslab
0012
0013 injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)
0014
0015 - fail_page_alloc
0016
0017 injects page allocation failures. (alloc_pages(), get_free_pages(), ...)
0018
0019 - fail_usercopy
0020
0021 injects failures in user memory access functions. (copy_from_user(), get_user(), ...)
0022
0023 - fail_futex
0024
0025 injects futex deadlock and uaddr fault errors.
0026
0027 - fail_sunrpc
0028
0029 injects kernel RPC client and server failures.
0030
0031 - fail_make_request
0032
0033 injects disk IO errors on devices permitted by setting
0034 /sys/block/<device>/make-it-fail or
0035 /sys/block/<device>/<partition>/make-it-fail. (submit_bio_noacct())
0036
0037 - fail_mmc_request
0038
0039 injects MMC data errors on devices permitted by setting
0040 debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request
0041
0042 - fail_function
0043
0044 injects error return on specific functions, which are marked by
0045 ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
0046 under /sys/kernel/debug/fail_function. No boot option supported.
0047
0048 - NVMe fault injection
0049
injects an NVMe status code and retry flag on devices permitted by setting
debugfs entries under /sys/kernel/debug/nvme*/fault_inject. The default
status code is NVME_SC_INVALID_OPCODE with no retry. The status code and
retry flag can be set via debugfs.
0054
0055
0056 Configure fault-injection capabilities behavior
0057 -----------------------------------------------
0058
0059 debugfs entries
0060 ^^^^^^^^^^^^^^^
0061
0062 fault-inject-debugfs kernel module provides some debugfs entries for runtime
0063 configuration of fault-injection capabilities.
0064
0065 - /sys/kernel/debug/fail*/probability:
0066
0067 likelihood of failure injection, in percent.
0068
0069 Format: <percent>
0070
Note that one-failure-per-hundred is a very high error rate
for some testcases. Consider setting probability=100 and configuring
/sys/kernel/debug/fail*/interval for such testcases.
0074
0075 - /sys/kernel/debug/fail*/interval:
0076
0077 specifies the interval between failures, for calls to
0078 should_fail() that pass all the other tests.
0079
0080 Note that if you enable this, by setting interval>1, you will
0081 probably want to set probability=100.
0082
0083 - /sys/kernel/debug/fail*/times:
0084
specifies how many times failures may happen at most. A value of -1
means "no limit". Note, though, that this file only accepts unsigned
values. So, to specify -1, use 'printf' instead of 'echo', e.g.:
$ printf %#x -1 > times
0089
0090 - /sys/kernel/debug/fail*/space:
0091
0092 specifies an initial resource "budget", decremented by "size"
0093 on each call to should_fail(,size). Failure injection is
0094 suppressed until "space" reaches zero.
0095
- /sys/kernel/debug/fail*/verbose:
0097
0098 Format: { 0 | 1 | 2 }
0099
0100 specifies the verbosity of the messages when failure is
0101 injected. '0' means no messages; '1' will print only a single
0102 log line per failure; '2' will print a call trace too -- useful
0103 to debug the problems revealed by fault injection.
0104
0105 - /sys/kernel/debug/fail*/task-filter:
0106
0107 Format: { 'Y' | 'N' }
0108
A value of 'N' disables filtering by process (default).
A value of 'Y' limits failures to only those processes for which
/proc/<pid>/make-it-fail is set to 1.
0112
0113 - /sys/kernel/debug/fail*/require-start,
0114 /sys/kernel/debug/fail*/require-end,
0115 /sys/kernel/debug/fail*/reject-start,
0116 /sys/kernel/debug/fail*/reject-end:
0117
0118 specifies the range of virtual addresses tested during
0119 stacktrace walking. Failure is injected only if some caller
0120 in the walked stacktrace lies within the required range, and
0121 none lies within the rejected range.
0122 Default required range is [0,ULONG_MAX) (whole of virtual address space).
0123 Default rejected range is [0,0).
0124
0125 - /sys/kernel/debug/fail*/stacktrace-depth:
0126
0127 specifies the maximum stacktrace depth walked during search
0128 for a caller within [require-start,require-end) OR
0129 [reject-start,reject-end).
0130
0131 - /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:
0132
0133 Format: { 'Y' | 'N' }
0134
0135 default is 'Y', setting it to 'N' will also inject failures into
0136 highmem/user allocations (__GFP_HIGHMEM allocations).
0137
0138 - /sys/kernel/debug/failslab/ignore-gfp-wait:
0139 - /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:
0140
0141 Format: { 'Y' | 'N' }
0142
0143 default is 'Y', setting it to 'N' will also inject failures
0144 into allocations that can sleep (__GFP_DIRECT_RECLAIM allocations).
0145
0146 - /sys/kernel/debug/fail_page_alloc/min-order:
0147
specifies the minimum page allocation order at which failures
are injected.
0150
0151 - /sys/kernel/debug/fail_futex/ignore-private:
0152
0153 Format: { 'Y' | 'N' }
0154
0155 default is 'N', setting it to 'Y' will disable failure injections
0156 when dealing with private (address space) futexes.
0157
0158 - /sys/kernel/debug/fail_sunrpc/ignore-client-disconnect:
0159
0160 Format: { 'Y' | 'N' }
0161
0162 default is 'N', setting it to 'Y' will disable disconnect
0163 injection on the RPC client.
0164
0165 - /sys/kernel/debug/fail_sunrpc/ignore-server-disconnect:
0166
0167 Format: { 'Y' | 'N' }
0168
0169 default is 'N', setting it to 'Y' will disable disconnect
0170 injection on the RPC server.
0171
0172 - /sys/kernel/debug/fail_sunrpc/ignore-cache-wait:
0173
0174 Format: { 'Y' | 'N' }
0175
0176 default is 'N', setting it to 'Y' will disable cache wait
0177 injection on the RPC server.
0178
0179 - /sys/kernel/debug/fail_function/inject:
0180
0181 Format: { 'function-name' | '!function-name' | '' }
0182
specifies the target function of error injection by name.
If the function name is prefixed with '!', the given function is
removed from the injection list. If nothing is specified (''),
the injection list is cleared.
0187
0188 - /sys/kernel/debug/fail_function/injectable:
0189
(read only) shows error injectable functions and what type of
error values can be specified. The error type will be one of
the following:

- NULL: retval must be 0.
- ERRNO: retval must be -1 to -MAX_ERRNO (-4095).
- ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4095).
0196
0197 - /sys/kernel/debug/fail_function/<function-name>/retval:
0198
specifies the "error" return value to inject to the given function.
This file is created when the user specifies a new injection entry.
Note that this file only accepts unsigned values. So, to use a
negative errno, use 'printf' instead of 'echo', e.g.:
$ printf %#x -12 > retval
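
As a minimal sketch of how the generic entries above fit together, the
following bash helpers write a "fail every N-th eligible call" setup,
following the advice to pair interval with probability=100. The names
DEBUGFS_DIR, to_unsigned_hex, set_fault_attr and fail_every_nth are
hypothetical and not part of the kernel tree. The sketch assumes bash on a
64-bit system and, for real use, a mounted debugfs and root privileges.

```shell
#!/bin/bash
# Sketch only: DEBUGFS_DIR and the helper names are illustrative assumptions.
DEBUGFS_DIR=${DEBUGFS_DIR:-/sys/kernel/debug}

# Print a signed integer as an unsigned hexadecimal string, the form
# suggested above for the "times" and "retval" files (e.g. -1, -12).
to_unsigned_hex()
{
	printf '%#x\n' "$1"
}

# set_fault_attr <capability> <attribute> <value>
# Write one value to one entry, e.g. failslab/probability.
set_fault_attr()
{
	printf '%s\n' "$3" > "$DEBUGFS_DIR/$1/$2"
}

# fail_every_nth <capability> <interval>
# probability=100 plus interval=<interval>, with no limit on the
# number of injected failures (times=-1).
fail_every_nth()
{
	set_fault_attr "$1" probability 100
	set_fault_attr "$1" interval "$2"
	set_fault_attr "$1" times "$(to_unsigned_hex -1)"
}
```

For example, "fail_every_nth failslab 50" would configure failslab with an
interval of 50. Pointing DEBUGFS_DIR at a scratch directory allows a dry run
without touching the running kernel.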
0204
0205 Boot option
0206 ^^^^^^^^^^^
0207
0208 In order to inject faults while debugfs is not available (early boot time),
0209 use the boot option::
0210
    failslab=
    fail_page_alloc=
    fail_usercopy=
    fail_make_request=
    fail_futex=
    mmc_core.fail_request=<interval>,<probability>,<space>,<times>
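
To make the argument order concrete, here is a small sketch that formats
such a parameter for the kernel command line. It assumes that the
<interval>,<probability>,<space>,<times> format shown for the last entry
applies to each of the options listed. fault_boot_option is a hypothetical
helper that only builds the string.

```shell
#!/bin/bash
# Assumption: fault_boot_option is an illustrative helper, not an existing
# tool. It only formats text and does not touch the bootloader.

# fault_boot_option <name> <interval> <probability> <space> <times>
fault_boot_option()
{
	printf '%s=%s,%s,%s,%s\n' "$1" "$2" "$3" "$4" "$5"
}

# interval=100, probability=10 (percent), space=0, at most 5 failures
fault_boot_option failslab 100 10 0 5
# prints: failslab=100,10,0,5
```

The resulting string would then be appended to the kernel command line by
whatever means the bootloader in use provides.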
0217
0218 proc entries
0219 ^^^^^^^^^^^^
0220
0221 - /proc/<pid>/fail-nth,
0222 /proc/self/task/<tid>/fail-nth:
0223
Writing an integer N to this file makes the N-th call in the task fail.
Reading from this file returns an integer value. A value of '0' indicates
that the fault set up by a previous write to this file has been injected.
A positive integer N indicates that the fault has not yet been injected.
0228 Note that this file enables all types of faults (slab, futex, etc).
0229 This setting takes precedence over all other generic debugfs settings
0230 like probability, interval, times, etc. But per-capability settings
0231 (e.g. fail_futex/ignore-private) take precedence over it.
0232
0233 This feature is intended for systematic testing of faults in a single
0234 system call. See an example below.
0235
0236 How to add new fault injection capability
0237 -----------------------------------------
0238
0239 - #include <linux/fault-inject.h>
0240
0241 - define the fault attributes
0242
0243 DECLARE_FAULT_ATTR(name);
0244
0245 Please see the definition of struct fault_attr in fault-inject.h
0246 for details.
0247
0248 - provide a way to configure fault attributes
0249
0250 - boot option
0251
If you need to enable the fault injection capability from boot time, you can
provide a boot option to configure it. There is a helper function for it:
0254
0255 setup_fault_attr(attr, str);
0256
0257 - debugfs entries
0258
failslab, fail_page_alloc, fail_usercopy, and fail_make_request use this method.
Helper function:
0261
0262 fault_create_debugfs_attr(name, parent, attr);
0263
0264 - module parameters
0265
0266 If the scope of the fault injection capability is limited to a
0267 single kernel module, it is better to provide module parameters to
0268 configure the fault attributes.
0269
0270 - add a hook to insert failures
0271
0272 Upon should_fail() returning true, client code should inject a failure:
0273
0274 should_fail(attr, size);
0275
0276 Application Examples
0277 --------------------
0278
0279 - Inject slab allocation failures into module init/exit code::
0280
    #!/bin/bash

    FAILTYPE=failslab
    echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    printf %#x -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

    faulty_system()
    {
        # Mark the new shell as a fault-injection target (task-filter is Y),
        # then replace it with the given command.
        bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
    }

    if [ $# -eq 0 ]
    then
        echo "Usage: $0 modulename [ modulename ... ]"
        exit 1
    fi

    for m in "$@"
    do
        echo inserting $m...
        faulty_system modprobe $m

        echo removing $m...
        faulty_system modprobe -r $m
    done
0311
0312 ------------------------------------------------------------------------------
0313
0314 - Inject page allocation failures only for a specific module::
0315
    #!/bin/bash

    FAILTYPE=fail_page_alloc
    module=$1

    if [ -z "$module" ]
    then
        echo "Usage: $0 <modulename>"
        exit 1
    fi

    modprobe $module

    if [ ! -d /sys/module/$module/sections ]
    then
        echo Module $module is not loaded
        exit 1
    fi

    # Only inject when a caller lies between the module's .text and .data.
    cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
    cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end

    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    printf %#x -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
    echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

    # Stop injecting when the script exits or is interrupted.
    trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

    echo "Injecting errors into the module $module... (interrupt to stop)"
    sleep 1000000
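
The script above confines injection to one module through require-start and
require-end. The range test it relies on, as the require/reject entries
describe it, can be sketched in bash as below. The function name
stack_allows_failure is hypothetical and the small addresses are purely
illustrative. Bash arithmetic is signed, so this sketch is not suitable for
comparing real kernel addresses against the default bounds.

```shell
#!/bin/bash
# Sketch of the documented rule: inject only if some caller lies in
# [require-start, require-end) and none lies in [reject-start, reject-end).
# Assumption: stack_allows_failure is an illustrative name, not a kernel API.

# stack_allows_failure <req-start> <req-end> <rej-start> <rej-end> <addr>...
# Returns 0 if injection would be allowed for the given caller addresses.
stack_allows_failure()
{
	local req_start=$(( $1 )) req_end=$(( $2 ))
	local rej_start=$(( $3 )) rej_end=$(( $4 ))
	local addr found=1
	shift 4

	for addr in "$@"
	do
		addr=$(( addr ))	# accept hex (0x...) or decimal input
		if [ "$addr" -ge "$rej_start" ] && [ "$addr" -lt "$rej_end" ]
		then
			return 1	# a caller in the rejected range vetoes injection
		fi
		if [ "$addr" -ge "$req_start" ] && [ "$addr" -lt "$req_end" ]
		then
			found=0		# at least one caller is in the required range
		fi
	done
	return $found
}
```

For instance, "stack_allows_failure 0x1000 0x2000 0 0 0x0800 0x1500" succeeds
because 0x1500 falls inside the required range and the rejected range [0,0)
is empty. The number of addresses passed plays the role of stacktrace-depth.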
0352
0353 ------------------------------------------------------------------------------
0354
- Inject an open_ctree error during btrfs mount::
0356
    #!/bin/bash

    rm -f testfile.img
    dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
    DEVICE=$(losetup --show -f testfile.img)
    mkfs.btrfs -f $DEVICE
    mkdir -p tmpmnt

    FAILTYPE=fail_function
    FAILFUNC=open_ctree
    echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
    # Make open_ctree return -12 (-ENOMEM).
    printf %#x -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 100 > /sys/kernel/debug/$FAILTYPE/probability
    echo 0 > /sys/kernel/debug/$FAILTYPE/interval
    printf %#x -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 1 > /sys/kernel/debug/$FAILTYPE/verbose

    # The test succeeds when the mount fails because of the injected error.
    mount -t btrfs $DEVICE tmpmnt
    if [ $? -ne 0 ]
    then
        echo "SUCCESS!"
    else
        echo "FAILED!"
        umount tmpmnt
    fi

    # Clear the injection list.
    echo > /sys/kernel/debug/$FAILTYPE/inject

    rmdir tmpmnt
    losetup -d $DEVICE
    rm testfile.img
0390
0391
0392 Tool to run command with failslab or fail_page_alloc
0393 ----------------------------------------------------
0394 In order to make it easier to accomplish the tasks mentioned above, we can use
0395 tools/testing/fault-injection/failcmd.sh. Please run a command
0396 "./tools/testing/fault-injection/failcmd.sh --help" for more information and
0397 see the following examples.
0398
0399 Examples:
0400
Run a command "make -C tools/testing/selftests/ run_tests" while injecting slab
allocation failures::
0403
0404 # ./tools/testing/fault-injection/failcmd.sh \
0405 -- make -C tools/testing/selftests/ run_tests
0406
Same as above, except allow at most 100 failures instead of the default of at
most one::
0409
0410 # ./tools/testing/fault-injection/failcmd.sh --times=100 \
0411 -- make -C tools/testing/selftests/ run_tests
0412
Same as above, except inject page allocation failures instead of slab
allocation failures::
0415
0416 # env FAILCMD_TYPE=fail_page_alloc \
0417 ./tools/testing/fault-injection/failcmd.sh --times=100 \
0418 -- make -C tools/testing/selftests/ run_tests
0419
0420 Systematic faults using fail-nth
0421 ---------------------------------
0422
The following code systematically injects a fault into the 1st, 2nd, 3rd, and
each subsequent fault-injection call made within the socketpair() system call::
0425
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/socket.h>
    #include <sys/syscall.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <errno.h>

    int main(void)
    {
        int i, err, res, fail_nth, fds[2];
        char buf[128];

        /* Let failslab also fail allocations that can sleep. */
        system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
        sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
        fail_nth = open(buf, O_RDWR);
        if (fail_nth < 0) {
            perror("open fail-nth");
            return 1;
        }
        for (i = 1;; i++) {
            /* Arm the i-th fault-injection call of this task. */
            sprintf(buf, "%d", i);
            write(fail_nth, buf, strlen(buf));
            res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
            err = errno;
            /* Reading back 0 means the armed fault was injected. */
            pread(fail_nth, buf, sizeof(buf), 0);
            if (res == 0) {
                close(fds[0]);
                close(fds[1]);
            }
            printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
                res, err);
            /* A non-zero value means no i-th call exists: stop. */
            if (atoi(buf))
                break;
        }
        return 0;
    }
0462
0463 An example output::
0464
    1-th fault Y: res=-1/23
    2-th fault Y: res=-1/23
    3-th fault Y: res=-1/12
    4-th fault Y: res=-1/12
    5-th fault Y: res=-1/23
    6-th fault Y: res=-1/23
    7-th fault Y: res=-1/23
    8-th fault Y: res=-1/12
    9-th fault Y: res=-1/12
    10-th fault Y: res=-1/12
    11-th fault Y: res=-1/12
    12-th fault Y: res=-1/12
    13-th fault Y: res=-1/12
    14-th fault Y: res=-1/12
    15-th fault Y: res=-1/12
    16-th fault N: res=0/12