0001 ================================================================
0002 Documentation for Kdump - The kexec-based Crash Dumping Solution
0003 ================================================================
0004
0005 This document includes overview, setup, installation, and analysis
0006 information.
0007
0008 Overview
0009 ========
0010
0011 Kdump uses kexec to quickly boot to a dump-capture kernel whenever a
0012 dump of the system kernel's memory needs to be taken (for example, when
0013 the system panics). The system kernel's memory image is preserved across
0014 the reboot and is accessible to the dump-capture kernel.
0015
You can use common commands, such as cp, scp, or makedumpfile, to copy
the memory image to a dump file on the local disk, or across the network
to a remote system.
0019
0020 Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
0021 s390x, arm and arm64 architectures.
0022
0023 When the system kernel boots, it reserves a small section of memory for
0024 the dump-capture kernel. This ensures that ongoing Direct Memory Access
0025 (DMA) from the system kernel does not corrupt the dump-capture kernel.
0026 The kexec -p command loads the dump-capture kernel into this reserved
0027 memory.
0028
On x86 machines, the first 640 KB of physical memory is needed for boot,
regardless of where the kernel loads. For simpler handling, the whole
low 1M is reserved so that no later kernel or device driver writes
data into this area. This way, the low 1M can be reused as system RAM
by the kdump kernel without extra handling.

On PPC64 machines, the first 32KB of physical memory is needed for booting,
regardless of where the kernel is loaded. To support a 64K page size,
kexec backs up the first 64KB of memory.
0038
0039 For s390x, when kdump is triggered, the crashkernel region is exchanged
0040 with the region [0, crashkernel region size] and then the kdump kernel
0041 runs in [0, crashkernel region size]. Therefore no relocatable kernel is
0042 needed for s390x.
0043
0044 All of the necessary information about the system kernel's core image is
0045 encoded in the ELF format, and stored in a reserved area of memory
0046 before a crash. The physical address of the start of the ELF header is
0047 passed to the dump-capture kernel through the elfcorehdr= boot
0048 parameter. Optionally the size of the ELF header can also be passed
0049 when using the elfcorehdr=[size[KMG]@]offset[KMG] syntax.
0050
0051 With the dump-capture kernel, you can access the memory image through
0052 /proc/vmcore. This exports the dump as an ELF-format file that you can
write out using file copy commands such as cp or scp. You can also use
the makedumpfile utility to analyze and write out filtered contents with
options, e.g. with '-d 31' it will only write out kernel data. Further,
0056 you can use analysis tools such as the GNU Debugger (GDB) and the Crash
0057 tool to debug the dump file. This method ensures that the dump pages are
0058 correctly ordered.
0059
0060 Setup and Installation
0061 ======================
0062
0063 Install kexec-tools
0064 -------------------
0065
1) Log in as the root user.
0067
0068 2) Download the kexec-tools user-space package from the following URL:
0069
0070 http://kernel.org/pub/linux/utils/kernel/kexec/kexec-tools.tar.gz
0071
0072 This is a symlink to the latest version.
0073
0074 The latest kexec-tools git tree is available at:
0075
0076 - git://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
0077 - http://www.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
0078
0079 There is also a gitweb interface available at
0080 http://www.kernel.org/git/?p=utils/kernel/kexec/kexec-tools.git
0081
0082 More information about kexec-tools can be found at
0083 http://horms.net/projects/kexec/
0084
0085 3) Unpack the tarball with the tar command, as follows::
0086
0087 tar xvpzf kexec-tools.tar.gz
0088
0089 4) Change to the kexec-tools directory, as follows::
0090
0091 cd kexec-tools-VERSION
0092
0093 5) Configure the package, as follows::
0094
0095 ./configure
0096
0097 6) Compile the package, as follows::
0098
0099 make
0100
0101 7) Install the package, as follows::
0102
0103 make install
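
After installation, it can be worth confirming that the kexec binary is
reachable through PATH before continuing. The following is a minimal sketch;
the helper name have_tool is hypothetical and is not part of kexec-tools.

```shell
# have_tool NAME: succeed if NAME resolves to a command on PATH.
# (have_tool is an illustrative helper, not part of kexec-tools.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool kexec; then
    echo "kexec found at: $(command -v kexec)"
else
    echo "kexec not found; check the install prefix and PATH"
fi
```

If the check fails after "make install", the install prefix chosen by
./configure is probably not on root's PATH.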
0104
0105
0106 Build the system and dump-capture kernels
0107 -----------------------------------------
0108 There are two possible methods of using Kdump.
0109
0110 1) Build a separate custom dump-capture kernel for capturing the
0111 kernel core dump.
0112
2) Or use the system kernel binary itself as the dump-capture kernel, in
which case there is no need to build a separate dump-capture kernel. This
is possible only on architectures that support a relocatable kernel. As
of today, the i386, x86_64, ppc64, ia64, arm and arm64 architectures
support a relocatable kernel.

Building a relocatable kernel is advantageous in that one does not have to
build a second kernel for capturing the dump. At the same time, one might
want to build a custom dump-capture kernel suited to one's needs.

The following configuration settings are required for the system and
dump-capture kernels to enable kdump support.
0126
0127 System kernel config options
0128 ----------------------------
0129
0130 1) Enable "kexec system call" or "kexec file based system call" in
0131 "Processor type and features."::
0132
0133 CONFIG_KEXEC=y or CONFIG_KEXEC_FILE=y
0134
0135 And both of them will select KEXEC_CORE::
0136
0137 CONFIG_KEXEC_CORE=y
0138
0139 Subsequently, CRASH_CORE is selected by KEXEC_CORE::
0140
0141 CONFIG_CRASH_CORE=y
0142
0143 2) Enable "sysfs file system support" in "Filesystem" -> "Pseudo
0144 filesystems." This is usually enabled by default::
0145
0146 CONFIG_SYSFS=y
0147
0148 Note that "sysfs file system support" might not appear in the "Pseudo
0149 filesystems" menu if "Configure standard kernel features (expert users)"
0150 is not enabled in "General Setup." In this case, check the .config file
0151 itself to ensure that sysfs is turned on, as follows::
0152
0153 grep 'CONFIG_SYSFS' .config
0154
0155 3) Enable "Compile the kernel with debug info" in "Kernel hacking."::
0156
   CONFIG_DEBUG_INFO=y
0158
0159 This causes the kernel to be built with debug symbols. The dump
0160 analysis tools require a vmlinux with debug symbols in order to read
0161 and analyze a dump file.
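
The options listed above can be checked against a kernel .config file in one
pass. This is a sketch only; check_kconfig is a hypothetical helper written
for this example, and it assumes each option is built in (=y).

```shell
# check_kconfig FILE OPTION...: report each OPTION that is not set to "y"
# in FILE; return non-zero if any option is missing.
# (check_kconfig is a hypothetical helper, not a kernel build tool.)
check_kconfig() {
    cfg=$1
    shift
    missing=0
    for opt in "$@"; do
        if ! grep -q "^${opt}=y\$" "$cfg"; then
            echo "missing: ${opt}=y"
            missing=1
        fi
    done
    return "$missing"
}

# Check the .config in the current directory, if there is one.
# CONFIG_KEXEC_CORE is selected by either CONFIG_KEXEC or CONFIG_KEXEC_FILE.
if [ -f .config ]; then
    check_kconfig .config CONFIG_KEXEC_CORE CONFIG_SYSFS CONFIG_DEBUG_INFO \
        || echo "some required options are not enabled"
fi
```

The same helper can be pointed at the dump-capture kernel's .config with
CONFIG_CRASH_DUMP and CONFIG_PROC_VMCORE as arguments.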
0162
0163 Dump-capture kernel config options (Arch Independent)
0164 -----------------------------------------------------
0165
0166 1) Enable "kernel crash dumps" support under "Processor type and
0167 features"::
0168
0169 CONFIG_CRASH_DUMP=y
0170
0171 2) Enable "/proc/vmcore support" under "Filesystems" -> "Pseudo filesystems"::
0172
0173 CONFIG_PROC_VMCORE=y
0174
0175 (CONFIG_PROC_VMCORE is set by default when CONFIG_CRASH_DUMP is selected.)
0176
0177 Dump-capture kernel config options (Arch Dependent, i386 and x86_64)
0178 --------------------------------------------------------------------
0179
0180 1) On i386, enable high memory support under "Processor type and
0181 features"::
0182
0183 CONFIG_HIGHMEM64G=y
0184
0185 or::
0186
   CONFIG_HIGHMEM4G=y
0188
2) With CONFIG_SMP=y, nr_cpus=1 usually needs to be specified on the kernel
command line when loading the dump-capture kernel, because one
CPU is enough for the kdump kernel to dump the vmcore on most systems.

However, you can also specify nr_cpus=X to enable multiple processors
in the kdump kernel. In this case, "disable_cpu_apicid=" is needed to
tell the kdump kernel which CPU is the first kernel's BSP. Please refer to
admin-guide/kernel-parameters.txt for more details.

With CONFIG_SMP=n, the above does not apply.
0199
3) Building a relocatable kernel is recommended. If it is not already
enabled, enable "Build a relocatable kernel" support under "Processor type
and features"::
0203
0204 CONFIG_RELOCATABLE=y
0205
4) Use a suitable value for "Physical address where the kernel is
loaded" (under "Processor type and features"). This only appears when
"kernel crash dumps" is enabled. A suitable value depends upon
whether the kernel is relocatable or not.

If you are using a relocatable kernel, use CONFIG_PHYSICAL_START=0x100000.
This will compile the kernel for physical address 1MB, but because the
kernel is relocatable, it can be run from any physical address, so the
kexec boot loader will load it into the memory region reserved for the
dump-capture kernel.

Otherwise, it should be the start of the memory region reserved for the
second kernel using the boot parameter "crashkernel=Y@X". Here X is the
start of the memory region reserved for the dump-capture kernel.
Generally X is 16MB (0x1000000), so you can set
CONFIG_PHYSICAL_START=0x1000000.
0222
0223 5) Make and install the kernel and its modules. DO NOT add this kernel
0224 to the boot loader configuration files.
0225
0226 Dump-capture kernel config options (Arch Dependent, ppc64)
0227 ----------------------------------------------------------
0228
0229 1) Enable "Build a kdump crash kernel" support under "Kernel" options::
0230
0231 CONFIG_CRASH_DUMP=y
0232
0233 2) Enable "Build a relocatable kernel" support::
0234
0235 CONFIG_RELOCATABLE=y
0236
0237 Make and install the kernel and its modules.
0238
0239 Dump-capture kernel config options (Arch Dependent, ia64)
0240 ----------------------------------------------------------
0241
0242 - No specific options are required to create a dump-capture kernel
0243 for ia64, other than those specified in the arch independent section
0244 above. This means that it is possible to use the system kernel
0245 as a dump-capture kernel if desired.
0246
0247 The crashkernel region can be automatically placed by the system
0248 kernel at runtime. This is done by specifying the base address as 0,
or omitting it altogether::
0250
0251 crashkernel=256M@0
0252
0253 or::
0254
0255 crashkernel=256M
0256
0257 Dump-capture kernel config options (Arch Dependent, arm)
0258 ----------------------------------------------------------
0259
- To use a relocatable kernel,
enable "AUTO_ZRELADDR" support under "Boot" options::
0262
0263 AUTO_ZRELADDR=y
0264
0265 Dump-capture kernel config options (Arch Dependent, arm64)
0266 ----------------------------------------------------------
0267
- Please note that KVM in the dump-capture kernel will not be enabled
0269 on non-VHE systems even if it is configured. This is because the CPU
0270 will not be reset to EL2 on panic.
0271
0272 crashkernel syntax
0273 ===========================
0274 1) crashkernel=size@offset
0275
0276 Here 'size' specifies how much memory to reserve for the dump-capture kernel
0277 and 'offset' specifies the beginning of this reserved memory. For example,
0278 "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
0279 starting at physical address 0x01000000 (16MB) for the dump-capture kernel.
0280
0281 The crashkernel region can be automatically placed by the system
0282 kernel at run time. This is done by specifying the base address as 0,
or omitting it altogether::
0284
0285 crashkernel=256M@0
0286
0287 or::
0288
0289 crashkernel=256M
0290
If the start address is specified, note that the start address of the
kernel will be aligned to a value (which is Arch dependent), so if the
start address is not aligned, any space below the alignment point will be
wasted.
0295
0296 2) range1:size1[,range2:size2,...][@offset]
0297
While the "crashkernel=size[@offset]" syntax is sufficient for most
configurations, sometimes it's handy to have the reserved memory depend
on the amount of System RAM. This is mostly useful for distributors that
pre-set the kernel command line, to avoid an unbootable system after some
memory has been removed from the machine.
0303
0304 The syntax is::
0305
0306 crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
0307 range=start-[end]
0308
0309 For example::
0310
0311 crashkernel=512M-2G:64M,2G-:128M
0312
0313 This would mean:
0314
1) if the RAM is smaller than 512M, then don't reserve anything
(this is the "rescue" case)
2) if the RAM size is at least 512M and less than 2G, then reserve 64M
3) if the RAM size is 2G or larger, then reserve 128M
0319
0320 3) crashkernel=size,high and crashkernel=size,low
0321
If memory above 4G is preferred, crashkernel=size,high can be used. With
it, physical memory may be allocated from the top, so the region can be
above 4G if the system has more than 4G of RAM installed. Otherwise, the
memory region will be allocated below 4G if available.

When crashkernel=X,high is passed, the kernel may allocate the physical
memory region above 4G. Some low memory under 4G is still needed in this
case. There are three ways to get low memory:
0330
0331 1) Kernel will allocate at least 256M memory below 4G automatically
0332 if crashkernel=Y,low is not specified.
0333 2) Let user specify low memory size instead.
0334 3) Specified value 0 will disable low memory allocation::
0335
0336 crashkernel=0,low
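
The size suffixes and the range-based form described above can be worked
through with a small shell sketch. The helpers to_bytes and
crashkernel_for_ram are hypothetical names used only for this illustration;
the kernel's own parser remains authoritative. The sketch assumes that each
range includes its start and excludes its end, and that the shell has 64-bit
arithmetic.

```shell
# to_bytes SIZE: convert a size such as 64M or 2G to bytes (K, M, G suffixes).
to_bytes() {
    num=${1%[KMGkmg]}
    case $1 in
        *[Kk]) echo $((num * 1024)) ;;
        *[Mm]) echo $((num * 1024 * 1024)) ;;
        *[Gg]) echo $((num * 1024 * 1024 * 1024)) ;;
        *)     echo "$num" ;;
    esac
}

# crashkernel_for_ram RAM SPEC: print the size that a range-based SPEC such
# as "512M-2G:64M,2G-:128M" selects for a machine with RAM of memory, or 0
# when no range matches.
crashkernel_for_ram() {
    ram=$(to_bytes "$1")
    spec=${2%@*}                # drop an optional @offset
    old_ifs=$IFS
    IFS=,
    set -- $spec                # split the spec into range:size entries
    IFS=$old_ifs
    for entry in "$@"; do
        range=${entry%%:*}
        size=${entry#*:}
        start=$(to_bytes "${range%%-*}")
        end=${range#*-}         # empty when the range is open-ended
        if [ "$ram" -ge "$start" ]; then
            if [ -z "$end" ] || [ "$ram" -lt "$(to_bytes "$end")" ]; then
                echo "$size"
                return 0
            fi
        fi
    done
    echo 0
}

to_bytes 64M                                      # prints 67108864
crashkernel_for_ram 1G "512M-2G:64M,2G-:128M"     # prints 64M
```

For instance, to_bytes 16M gives 16777216, which is the 0x01000000 offset
used in the "crashkernel=64M@16M" example.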
0337
0338 Boot into System Kernel
0339 -----------------------
0340 1) Update the boot loader (such as grub, yaboot, or lilo) configuration
0341 files as necessary.
0342
0343 2) Boot the system kernel with the boot parameter "crashkernel=Y@X".
0344
On x86 and x86_64, use "crashkernel=Y[@X]". Most of the time, the
start address 'X' is not necessary because the kernel will search for a
suitable area. Specify it only when an explicit start address is required.
0348
0349 On ppc64, use "crashkernel=128M@32M".
0350
0351 On ia64, 256M@256M is a generous value that typically works.
0352 The region may be automatically placed on ia64, see the
0353 dump-capture kernel config option notes above.
If sparse memory is used, the size should be rounded to GRANULE boundaries.
0355
0356 On s390x, typically use "crashkernel=xxM". The value of xx is dependent
0357 on the memory consumption of the kdump system. In general this is not
0358 dependent on the memory size of the production system.
0359
0360 On arm, the use of "crashkernel=Y@X" is no longer necessary; the
0361 kernel will automatically locate the crash kernel image within the
0362 first 512MB of RAM if X is not given.
0363
0364 On arm64, use "crashkernel=Y[@X]". Note that the start address of
0365 the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
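
After rebooting, the reservation can be checked from user space. The sketch
below extracts the crashkernel= value from a command line string and checks
the 2MiB alignment required for an explicit start address on arm64. The
function names are hypothetical; /proc/cmdline and
/sys/kernel/kexec_crash_size are read only if they exist.

```shell
# crashkernel_arg CMDLINE: print the value of each crashkernel= parameter
# found in CMDLINE, or nothing if there is none.
crashkernel_arg() {
    for word in $1; do
        case $word in
            crashkernel=*) echo "${word#crashkernel=}" ;;
        esac
    done
}

# is_2m_aligned ADDR: succeed if ADDR (decimal or 0x-prefixed hex) is a
# multiple of 2 MiB (0x200000).
is_2m_aligned() {
    [ $(( $1 % 0x200000 )) -eq 0 ]
}

if [ -r /proc/cmdline ]; then
    echo "crashkernel parameter: $(crashkernel_arg "$(cat /proc/cmdline)")"
fi

# Size in bytes of the region the running kernel actually reserved.
if [ -r /sys/kernel/kexec_crash_size ]; then
    echo "reserved bytes: $(cat /sys/kernel/kexec_crash_size)"
fi
```

A reserved size of 0 suggests that the parameter was not passed or that the
reservation failed; the kernel log usually says which.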
0366
0367 Load the Dump-capture Kernel
0368 ============================
0369
After booting to the system kernel, the dump-capture kernel needs to be
loaded.

Based on the architecture and type of image (relocatable or not), one
can choose to load the uncompressed vmlinux or compressed bzImage/vmlinuz
of the dump-capture kernel. The following is a summary.
0376
0377 For i386 and x86_64:
0378
0379 - Use bzImage/vmlinuz if kernel is relocatable.
0380 - Use vmlinux if kernel is not relocatable.
0381
0382 For ppc64:
0383
0384 - Use vmlinux
0385
0386 For ia64:
0387
0388 - Use vmlinux or vmlinuz.gz
0389
0390 For s390x:
0391
0392 - Use image or bzImage
0393
0394 For arm:
0395
0396 - Use zImage
0397
0398 For arm64:
0399
0400 - Use vmlinux or Image
0401
If you are using an uncompressed vmlinux image, use the following command
to load the dump-capture kernel::

   kexec -p <dump-capture-kernel-vmlinux-image> \
   --initrd=<initrd-for-dump-capture-kernel> --args-linux \
   --append="root=<root-dev> <arch-specific-options>"
0408
If you are using a compressed bzImage/vmlinuz, use the following command
to load the dump-capture kernel::

   kexec -p <dump-capture-kernel-bzImage> \
   --initrd=<initrd-for-dump-capture-kernel> \
   --append="root=<root-dev> <arch-specific-options>"
0415
If you are using a compressed zImage, use the following command
to load the dump-capture kernel::

   kexec --type zImage -p <dump-capture-kernel-zImage> \
   --initrd=<initrd-for-dump-capture-kernel> \
   --dtb=<dtb-for-dump-capture-kernel> \
   --append="root=<root-dev> <arch-specific-options>"
0423
If you are using an uncompressed Image, use the following command
to load the dump-capture kernel::

   kexec -p <dump-capture-kernel-Image> \
   --initrd=<initrd-for-dump-capture-kernel> \
   --append="root=<root-dev> <arch-specific-options>"
0430
Please note that --args-linux does not need to be specified for ia64.
It is planned to make this a no-op on that architecture, but for now
it should be omitted.
0434
The following arch-specific command line options should be used when
loading the dump-capture kernel.
0437
0438 For i386, x86_64 and ia64:
0439
0440 "1 irqpoll nr_cpus=1 reset_devices"
0441
0442 For ppc64:
0443
0444 "1 maxcpus=1 noirqdistrib reset_devices"
0445
0446 For s390x:
0447
0448 "1 nr_cpus=1 cgroup_disable=memory"
0449
0450 For arm:
0451
0452 "1 maxcpus=1 reset_devices"
0453
0454 For arm64:
0455
0456 "1 nr_cpus=1 reset_devices"
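
The per-architecture options above can be combined with the load commands
shown earlier. The following sketch only prints a kexec command line for
review and does not run it. The function names and the example file paths
are hypothetical; the architecture names are the ones used in this document,
which may differ from what "uname -m" reports.

```shell
# kdump_args ARCH: print the arch-specific options listed above.
kdump_args() {
    case $1 in
        i386|x86_64|ia64) echo "1 irqpoll nr_cpus=1 reset_devices" ;;
        ppc64)            echo "1 maxcpus=1 noirqdistrib reset_devices" ;;
        s390x)            echo "1 nr_cpus=1 cgroup_disable=memory" ;;
        arm)              echo "1 maxcpus=1 reset_devices" ;;
        arm64)            echo "1 nr_cpus=1 reset_devices" ;;
        *)                echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# kexec_load_cmd ARCH KERNEL INITRD ROOTDEV: print (not run) a "kexec -p"
# command line in the form used for a bzImage/vmlinuz or an Image.
kexec_load_cmd() {
    args=$(kdump_args "$1") || return 1
    echo "kexec -p $2 --initrd=$3 --append=\"root=$4 $args\""
}

# Example with made-up paths; review the output before running it as root.
kexec_load_cmd x86_64 /boot/vmlinuz-kdump /boot/initrd-kdump.img /dev/sda1
```

Printing rather than executing keeps the sketch safe to run on any machine.
An uncompressed vmlinux would additionally need --args-linux, and a zImage
would need --type zImage and --dtb, as shown in the commands above.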
0457
0458 Notes on loading the dump-capture kernel:
0459
0460 * By default, the ELF headers are stored in ELF64 format to support
0461 systems with more than 4GB memory. On i386, kexec automatically checks if
0462 the physical RAM size exceeds the 4 GB limit and if not, uses ELF32.
0463 So, on non-PAE systems, ELF32 is always used.
0464
0465 The --elf32-core-headers option can be used to force the generation of ELF32
0466 headers. This is necessary because GDB currently cannot open vmcore files
0467 with ELF64 headers on 32-bit systems.
0468
0469 * The "irqpoll" boot parameter reduces driver initialization failures
0470 due to shared interrupts in the dump-capture kernel.
0471
0472 * You must specify <root-dev> in the format corresponding to the root
0473 device name in the output of mount command.
0474
0475 * Boot parameter "1" boots the dump-capture kernel into single-user
0476 mode without networking. If you want networking, use "3".
0477
* We generally don't have to bring up an SMP kernel just to capture the
dump. Hence it is generally useful either to build a UP dump-capture
kernel or to specify the maxcpus=1 option while loading the dump-capture
kernel. Note that, although maxcpus always works, it is better to use
nr_cpus instead to save memory if the current ARCH supports it, such as x86.

* You should enable multi-cpu support in the dump-capture kernel if you
intend to use multi-threaded programs with it, such as the parallel dump
feature of makedumpfile. Otherwise, the multi-threaded program may suffer a
significant performance degradation. To enable multi-cpu support, you should
bring up an SMP dump-capture kernel and specify the maxcpus/nr_cpus and
disable_cpu_apicid=[X] options while loading it.
0490
* For s390x there are two kdump modes: If an ELF header is specified with
the elfcorehdr= kernel parameter, it is used by the kdump kernel as on
all other architectures. If no elfcorehdr= kernel parameter is
specified, the s390x kdump kernel dynamically creates the header. The
second mode has the advantage that, for CPU and memory hotplug, kdump does
not have to be reloaded with kexec_load().
0497
0498 * For s390x systems with many attached devices the "cio_ignore" kernel
0499 parameter should be used for the kdump kernel in order to prevent allocation
0500 of kernel memory for devices that are not relevant for kdump. The same
0501 applies to systems that use SCSI/FCP devices. In that case the
0502 "allow_lun_scan" zfcp module parameter should be set to zero before
0503 setting FCP devices online.
0504
0505 Kernel Panic
0506 ============
0507
0508 After successfully loading the dump-capture kernel as previously
0509 described, the system will reboot into the dump-capture kernel if a
0510 system crash is triggered. Trigger points are located in panic(),
0511 die(), die_nmi() and in the sysrq handler (ALT-SysRq-c).
0512
0513 The following conditions will execute a crash trigger point:
0514
0515 If a hard lockup is detected and "NMI watchdog" is configured, the system
0516 will boot into the dump-capture kernel ( die_nmi() ).
0517
0518 If die() is called, and it happens to be a thread with pid 0 or 1, or die()
0519 is called inside interrupt context or die() is called and panic_on_oops is set,
0520 the system will boot into the dump-capture kernel.
0521
0522 On powerpc systems when a soft-reset is generated, die() is called by all cpus
0523 and the system will boot into the dump-capture kernel.
0524
For testing purposes, you can trigger a crash by using "ALT-SysRq-c" or
"echo c > /proc/sysrq-trigger", or by writing a module to force the panic.
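
Before forcing a test crash, it helps to confirm that a dump-capture kernel
is actually loaded, since a panic without one produces no dump. The sketch
below reads /sys/kernel/kexec_crash_loaded and prints the trigger command
instead of running it. The function name is hypothetical, and the optional
file argument exists only so that the check can be tried against a test file.

```shell
# crash_kernel_loaded [FILE]: succeed if FILE (default:
# /sys/kernel/kexec_crash_loaded) contains 1, meaning that a dump-capture
# kernel is currently loaded.
crash_kernel_loaded() {
    file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$file" ] && [ "$(cat "$file")" = "1" ]
}

if crash_kernel_loaded; then
    echo "dump-capture kernel is loaded; to crash on purpose, run as root:"
    echo "  echo c > /proc/sysrq-trigger"
else
    echo "no dump-capture kernel loaded; a test crash would not produce a dump"
fi
```

The crash command is deliberately left for the operator to type, because it
takes the system down immediately.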
0527
0528 Write Out the Dump File
0529 =======================
0530
0531 After the dump-capture kernel is booted, write out the dump file with
0532 the following command::
0533
0534 cp /proc/vmcore <dump-file>
0535
or use scp to write out the dump file between hosts on a network, e.g.::
0537
0538 scp /proc/vmcore remote_username@remote_ip:<dump-file>
0539
You can also use the makedumpfile utility to write out the dump file
with specified options to filter out unwanted contents, e.g.::
0542
0543 makedumpfile -l --message-level 1 -d 31 /proc/vmcore <dump-file>
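
The value given to -d is a bitmask of page types to exclude. The sketch below
decodes a level into those page types, using the bit meanings from the
makedumpfile(8) manual page as understood here (1 zero pages, 2 non-private
cache pages, 4 private cache pages, 8 user data pages, 16 free pages). The
helper name is hypothetical, and the manual page of the installed version
should be treated as authoritative.

```shell
# describe_dump_level LEVEL: list the page types that "makedumpfile -d LEVEL"
# leaves out of the dump file, one per line.
describe_dump_level() {
    level=$1
    bit=1
    for name in "zero pages" "non-private cache pages" "private cache pages" \
                "user data pages" "free pages"; do
        if [ $((level & bit)) -ne 0 ]; then
            echo "$name"
        fi
        bit=$((bit * 2))
    done
}

# Level 31 sets all five bits, so every listed page type is excluded.
describe_dump_level 31
```

Level 31 therefore keeps essentially only kernel data, which is why it
produces the smallest dump files.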
0544
0545 Analysis
0546 ========
0547
0548 Before analyzing the dump image, you should reboot into a stable kernel.
0549
0550 You can do limited analysis using GDB on the dump file copied out of
0551 /proc/vmcore. Use the debug vmlinux built with -g and run the following
0552 command::
0553
0554 gdb vmlinux <dump-file>
0555
0556 Stack trace for the task on processor 0, register display, and memory
0557 display work fine.
0558
Note: GDB cannot analyze core files generated in ELF64 format for x86.
On systems with a maximum of 4GB of memory, you can generate
ELF32-format headers by passing the --elf32-core-headers option to kexec
when loading the dump-capture kernel.
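
To see which header format a saved dump file uses, the ELF class byte can be
inspected directly. This is a minimal sketch; elf_class is a hypothetical
helper, and it only reads the EI_CLASS byte at offset 4 without validating
the ELF magic number.

```shell
# elf_class FILE: print ELF32 or ELF64 according to the EI_CLASS byte
# (offset 4 of the ELF identification), or "unknown" for any other value.
elf_class() {
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -dc '0-9')
    case $class in
        1) echo ELF32 ;;
        2) echo ELF64 ;;
        *) echo unknown ;;
    esac
}

# Usage: elf_class <dump-file>
```

If the result is ELF64 on a 32-bit x86 system, reloading the dump-capture
kernel with ELF32 headers is needed before GDB can open later dumps.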
0563
0564 You can also use the Crash utility to analyze dump files in Kdump
0565 format. Crash is available at the following URL:
0566
0567 https://github.com/crash-utility/crash
0568
Crash documentation can be found at:
0570 https://crash-utility.github.io/
0571
0572 Trigger Kdump on WARN()
0573 =======================
0574
0575 The kernel parameter, panic_on_warn, calls panic() in all WARN() paths. This
0576 will cause a kdump to occur at the panic() call. In cases where a user wants
0577 to specify this during runtime, /proc/sys/kernel/panic_on_warn can be set to 1
0578 to achieve the same behaviour.
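
The runtime setting can be sketched as follows. The helper name is
hypothetical, and the optional file argument is there only so that the
function can be exercised without root; by default it writes to
/proc/sys/kernel/panic_on_warn.

```shell
# enable_panic_on_warn [FILE]: write 1 to FILE
# (default: /proc/sys/kernel/panic_on_warn, which requires root).
enable_panic_on_warn() {
    file=${1:-/proc/sys/kernel/panic_on_warn}
    echo 1 > "$file"
}

# Show the current setting without changing it.
if [ -r /proc/sys/kernel/panic_on_warn ]; then
    echo "panic_on_warn=$(cat /proc/sys/kernel/panic_on_warn)"
fi
```

The setting does not persist across reboots; use the panic_on_warn kernel
parameter when it should apply from boot.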
0579
0580 Trigger Kdump on add_taint()
0581 ============================
0582
0583 The kernel parameter panic_on_taint facilitates a conditional call to panic()
0584 from within add_taint() whenever the value set in this bitmask matches with the
0585 bit flag being set by add_taint().
0586 This will cause a kdump to occur at the add_taint()->panic() call.
0587
0588 Contact
0589 =======
0590
0591 - kexec@lists.infradead.org
0592
0593 GDB macros
0594 ==========
0595
0596 .. include:: gdbmacros.txt
0597 :literal: