Ramoops oops/panic logger
=========================

Sergiu Iordache <sergiu@chromium.org>

Updated: 10 Feb 2021

Introduction
------------

Ramoops is an oops/panic logger that writes its logs to RAM before the system
crashes. It works by logging oopses and panics in a circular buffer. Ramoops
needs a system with persistent RAM so that the content of that area can
survive after a restart.

Ramoops concepts
----------------

Ramoops uses a predefined memory area to store the dump. The start, size,
and type of the memory area are set using three variables:

  * ``mem_address`` for the start
  * ``mem_size`` for the size. The memory size will be rounded down to a
    power of two.
  * ``mem_type`` to specify the memory type (default is pgprot_writecombine).

Typically the default value of ``mem_type=0`` should be used, as that sets the
pstore mapping to pgprot_writecombine. Setting ``mem_type=1`` attempts to use
``pgprot_noncached``, which only works on some platforms. This is because pstore
depends on atomic operations. At least on ARM, pgprot_noncached causes the
memory to be mapped strongly ordered, and atomic operations on strongly ordered
memory are implementation defined and won't work on many ARMs such as OMAPs.
Setting ``mem_type=2`` attempts to treat the memory region as normal memory,
which enables full caching on it. This can improve performance.

The memory area is divided into ``record_size`` chunks (also rounded down to
a power of two) and each kmsg dump writes a ``record_size`` chunk of
information.

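The rounding and chunking described above can be sketched in plain userspace C.
This is only an illustration: ``round_down_pow2()`` and ``dump_record_count()``
are hypothetical helper names, not kernel functions, and the sketch assumes the
caller already knows how many bytes of the area are available for dumps.

```c
#include <stddef.h>

/* Hypothetical helper: largest power of two that is <= v (0 if v is 0). */
unsigned long round_down_pow2(unsigned long v)
{
	unsigned long p = 1;

	if (v == 0)
		return 0;
	/* Compare against v / 2 so that p never overflows. */
	while (p <= v / 2)
		p *= 2;
	return p;
}

/*
 * Hypothetical helper: number of record_size chunks that fit into
 * dump_area bytes, after rounding record_size down to a power of two.
 */
unsigned long dump_record_count(unsigned long dump_area,
				unsigned long record_size)
{
	unsigned long rs = round_down_pow2(record_size);

	return rs ? dump_area / rs : 0;
}
```

For example, under these assumptions a 0x100000-byte dump area with a
``record_size`` of 0x5000 is treated as 0x4000-byte records, giving 64 chunks.
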
Which kinds of kmsg dumps are stored can be limited via the ``max_reason``
value, as defined in include/linux/kmsg_dump.h's ``enum kmsg_dump_reason``.
For example, to store both Oopses and Panics, ``max_reason`` should be set to
2 (KMSG_DUMP_OOPS); to store only Panics, ``max_reason`` should be set to 1
(KMSG_DUMP_PANIC). Setting this to 0 (KMSG_DUMP_UNDEF) means the reason
filtering will be controlled by the ``printk.always_kmsg_dump`` boot param:
if unset, it'll be KMSG_DUMP_OOPS, otherwise KMSG_DUMP_MAX.

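The filtering rule can be modelled in a few lines of userspace C. This is a
sketch under stated assumptions, not the kernel implementation: the ``SKETCH_``
names are hypothetical, only the values 0, 1 and 2 come from the text above,
and ``SKETCH_DUMP_MAX`` is a placeholder upper bound rather than the real value
of KMSG_DUMP_MAX.

```c
/* Hypothetical mirror of the reason values quoted in the text. */
enum sketch_dump_reason {
	SKETCH_DUMP_UNDEF = 0,
	SKETCH_DUMP_PANIC = 1,
	SKETCH_DUMP_OOPS  = 2,
	SKETCH_DUMP_MAX   = 255,	/* placeholder, not the kernel's value */
};

/* Resolve max_reason=0 using the printk.always_kmsg_dump setting. */
int effective_max_reason(int max_reason, int always_kmsg_dump)
{
	if (max_reason != SKETCH_DUMP_UNDEF)
		return max_reason;
	return always_kmsg_dump ? SKETCH_DUMP_MAX : SKETCH_DUMP_OOPS;
}

/* A dump is kept when its reason does not exceed the effective limit. */
int dump_is_stored(int reason, int max_reason, int always_kmsg_dump)
{
	return reason <= effective_max_reason(max_reason, always_kmsg_dump);
}
```

With ``max_reason=1`` this model keeps a panic (reason 1) and drops an oops
(reason 2), matching the example in the paragraph above.
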
The module uses a counter to record multiple dumps, but the counter gets reset
on restart (i.e. new dumps after the restart will overwrite old ones).

Ramoops also supports software ECC protection of persistent memory regions.
This might be useful when a hardware reset was used to bring the machine back
to life (e.g. a watchdog triggered). In such cases, RAM may be somewhat
corrupt, but usually it is restorable.

Setting the parameters
----------------------

The ramoops parameters can be set in several ways:

 A. Use the module parameters (which have the names of the variables described
 above). For quick debugging, you can also reserve parts of memory during
 boot and then use the reserved memory for ramoops. For example, assuming a
 machine with > 128 MB of memory, the following kernel command line will tell
 the kernel to use only the first 128 MB of memory, and place an ECC-protected
 ramoops region at the 128 MB boundary::

        mem=128M ramoops.mem_address=0x8000000 ramoops.ecc=1

 B. Use Device Tree bindings, as described in
 ``Documentation/devicetree/bindings/reserved-memory/ramoops.yaml``.
 For example::

        reserved-memory {
                #address-cells = <2>;
                #size-cells = <2>;
                ranges;

                ramoops@8f000000 {
                        compatible = "ramoops";
                        reg = <0 0x8f000000 0 0x100000>;
                        record-size = <0x4000>;
                        console-size = <0x4000>;
                };
        };

 C. Use a platform device and set the platform data. The parameters can then
 be set through that platform data. An example of doing that is:

 .. code-block:: c

  #include <linux/pstore_ram.h>
  [...]

  static struct ramoops_platform_data ramoops_data = {
        .mem_size               = <...>,
        .mem_address            = <...>,
        .mem_type               = <...>,
        .record_size            = <...>,
        .max_reason             = <...>,
        .ecc                    = <...>,
  };

  static struct platform_device ramoops_dev = {
        .name = "ramoops",
        .dev = {
                .platform_data = &ramoops_data,
        },
  };

  [... inside a function ...]
  int ret;

  ret = platform_device_register(&ramoops_dev);
  if (ret) {
        printk(KERN_ERR "unable to register platform device\n");
        return ret;
  }

You can specify either RAM memory or peripheral devices' memory. However, when
specifying RAM, be sure to reserve the memory by issuing memblock_reserve()
very early in the architecture code, e.g.::

        #include <linux/memblock.h>

        memblock_reserve(ramoops_data.mem_address, ramoops_data.mem_size);

Dump format
-----------

The data dump begins with a header, currently defined as ``====`` followed by a
timestamp and a new line. The dump then continues with the actual data.

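As an illustration of this layout, the following userspace sketch splits a raw
record into its timestamp text and its payload. ``skip_dump_header()`` is a
hypothetical helper, and the sketch deliberately treats the timestamp as opaque
text, because its exact layout is not specified here.

```c
#include <stddef.h>
#include <string.h>

#define DUMP_HDR "===="

/*
 * Hypothetical helper: check that rec starts with "====", copy the
 * timestamp text (everything up to the newline) into ts, and return a
 * pointer to the first byte of the actual data. Returns NULL if the
 * header is missing, unterminated, or the timestamp does not fit in ts.
 */
const char *skip_dump_header(const char *rec, size_t len,
			     char *ts, size_t ts_cap)
{
	const size_t hdr_len = strlen(DUMP_HDR);
	const char *nl;
	size_t ts_len;

	if (len < hdr_len || memcmp(rec, DUMP_HDR, hdr_len) != 0)
		return NULL;

	nl = memchr(rec + hdr_len, '\n', len - hdr_len);
	if (!nl)
		return NULL;

	ts_len = (size_t)(nl - (rec + hdr_len));
	if (ts_cap == 0 || ts_len >= ts_cap)
		return NULL;

	memcpy(ts, rec + hdr_len, ts_len);
	ts[ts_len] = '\0';
	return nl + 1;
}
```

The helper only applies to a record that still carries its header, for example
one taken from a raw copy of the memory area.
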
Reading the data
----------------

The dump data can be read from the pstore filesystem. The files are named
``dmesg-ramoops-N``, where N is the record number in memory. To delete
a stored record from RAM, simply unlink the respective pstore file.

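A userspace tool could recognise and delete such files roughly as follows. This
is a minimal sketch: ``record_number()`` and ``delete_record()`` are
hypothetical helpers, the pstore mount point is passed in by the caller, and
any extra suffix on the file name is treated as a non-match.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Return N for a name of the form "dmesg-ramoops-N", or -1 otherwise. */
long record_number(const char *name)
{
	static const char prefix[] = "dmesg-ramoops-";
	const size_t plen = sizeof(prefix) - 1;
	char *end;
	long n;

	if (strncmp(name, prefix, plen) != 0)
		return -1;
	/* Require at least one digit, so signs and blanks are rejected. */
	if (!isdigit((unsigned char)name[plen]))
		return -1;
	n = strtol(name + plen, &end, 10);
	if (*end != '\0')
		return -1;
	return n;
}

/* Unlink record n below pstore_dir. Returns 0 on success, -1 on error. */
int delete_record(const char *pstore_dir, long n)
{
	char path[4096];
	int len = snprintf(path, sizeof(path), "%s/dmesg-ramoops-%ld",
			   pstore_dir, n);

	if (len < 0 || (size_t)len >= sizeof(path))
		return -1;
	return unlink(path);
}
```
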
Persistent function tracing
---------------------------

Persistent function tracing might be useful for debugging software- or
hardware-related hangs. The function call chain log is stored in a
``ftrace-ramoops`` file. Here is an example of usage::

 # mount -t debugfs debugfs /sys/kernel/debug/
 # echo 1 > /sys/kernel/debug/pstore/record_ftrace
 # reboot -f
 [...]
 # mount -t pstore pstore /mnt/
 # tail /mnt/ftrace-ramoops
 0 ffffffff8101ea64  ffffffff8101bcda  native_apic_mem_read <- disconnect_bsp_APIC+0x6a/0xc0
 0 ffffffff8101ea44  ffffffff8101bcf6  native_apic_mem_write <- disconnect_bsp_APIC+0x86/0xc0
 0 ffffffff81020084  ffffffff8101a4b5  hpet_disable <- native_machine_shutdown+0x75/0x90
 0 ffffffff81005f94  ffffffff8101a4bb  iommu_shutdown_noop <- native_machine_shutdown+0x7b/0x90
 0 ffffffff8101a6a1  ffffffff8101a437  native_machine_emergency_restart <- native_machine_restart+0x37/0x40
 0 ffffffff811f9876  ffffffff8101a73a  acpi_reboot <- native_machine_emergency_restart+0xaa/0x1e0
 0 ffffffff8101a514  ffffffff8101a772  mach_reboot_fixups <- native_machine_emergency_restart+0xe2/0x1e0
 0 ffffffff811d9c54  ffffffff8101a7a0  __const_udelay <- native_machine_emergency_restart+0x110/0x1e0
 0 ffffffff811d9c34  ffffffff811d9c80  __delay <- __const_udelay+0x30/0x40
 0 ffffffff811d9d14  ffffffff811d9c3f  delay_tsc <- __delay+0xf/0x20
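
For post-processing, lines in the layout shown above can be split into fields
with a short userspace sketch. The field names are assumptions: the first
column is taken to be the CPU number and the two hexadecimal values to be the
address of the traced function and the address inside its caller.
``parse_ftrace_line()`` is a hypothetical helper, and other kernel versions may
print a different layout.

```c
#include <stdio.h>

/* Hypothetical record type for one line of the example output. */
struct ftrace_entry {
	int cpu;			/* assumed: CPU number */
	unsigned long long ip;		/* assumed: traced function address */
	unsigned long long parent_ip;	/* assumed: address in the caller */
	char callee[64];		/* symbol left of "<-" */
	char caller[128];		/* symbol+offset right of "<-" */
};

/* Parse one line; returns 0 on success, -1 if the layout does not match. */
int parse_ftrace_line(const char *line, struct ftrace_entry *e)
{
	int n = sscanf(line, "%d %llx %llx %63s <- %127s",
		       &e->cpu, &e->ip, &e->parent_ip, e->callee, e->caller);

	return n == 5 ? 0 : -1;
}
```
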