What: /sys/devices/system/machinecheck/machinecheckX/
Contact: Andi Kleen <ak@linux.intel.com>
Date: Feb, 2007
Description:
(X = CPU number)

Machine checks report internal hardware error conditions
detected by the CPU. Uncorrected errors typically cause a
machine check exception (often with a panic); corrected ones
cause a machine check log entry.

For more details about the x86 machine check architecture,
see the Intel and AMD architecture manuals on their
developer websites.

For more details about the architecture, see
http://one.firstfloor.org/~andi/mce.pdf

Each CPU has its own directory.

What: /sys/devices/system/machinecheck/machinecheckX/bank<Y>
Contact: Andi Kleen <ak@linux.intel.com>
Date: Feb, 2007
Description:
(Y = bank number)

64-bit hexadecimal bitmask enabling/disabling specific
subevents for bank Y.

When a bit in the bitmask is zero, the respective
subevent will not be reported.

By default all events are enabled.

Note that the BIOS maintains another mask to disable specific
events per bank. That mask is not visible here.
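
A minimal sketch of how the bitmask described above can be
interpreted, assuming the file content is a plain hexadecimal string.
The helper names are hypothetical and are not part of any kernel or
mcelog interface; the meaning of each bit is bank specific and is not
modelled here.

```python
MASK64 = (1 << 64) - 1  # a bank mask is 64 bits wide


def parse_bank_mask(text: str) -> int:
    """Parse the hexadecimal string read from a bank<Y> file."""
    value = int(text.strip(), 16)
    if not 0 <= value <= MASK64:
        raise ValueError("bank mask must fit in 64 bits")
    return value


def subevent_enabled(mask: int, bit: int) -> bool:
    """A set bit means the subevent is reported, a zero bit means it is not."""
    if not 0 <= bit < 64:
        raise ValueError("bit index must be in 0..63")
    return bool((mask >> bit) & 1)


def clear_subevent(mask: int, bit: int) -> int:
    """Return a new mask with one subevent disabled."""
    if not 0 <= bit < 64:
        raise ValueError("bit index must be in 0..63")
    return mask & ~(1 << bit) & MASK64
```

The value returned by clear_subevent() can be formatted with
format(mask, "x") before being written back to the bank file.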

What: /sys/devices/system/machinecheck/machinecheckX/check_interval
Contact: Andi Kleen <ak@linux.intel.com>
Date: Feb, 2007
Description:
The entries appear for each CPU, but they are shared
between all CPUs.

How often to poll for corrected machine check errors, in
seconds (note that the output is hexadecimal). The default is
5 minutes. When the poller finds MCEs, it triggers an
exponential speedup (polling more often) of the polling
interval. When the poller stops finding MCEs, it triggers an
exponential backoff (polling less often) of the polling
interval. The check_interval variable is both the initial and
the maximum polling interval. 0 means no polling for corrected
machine check errors (but some corrected errors might still
be reported in other ways).
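
The adaptive polling described above can be illustrated with the
following sketch. It is a simplified model, not the kernel
implementation: the factor of two and the lower bound min_interval
are assumptions chosen for illustration, and the function names are
hypothetical.

```python
def parse_check_interval(text: str) -> int:
    """Parse check_interval, which is printed in hexadecimal seconds."""
    return int(text.strip(), 16)


def next_interval(current, found_mce, check_interval, min_interval=0.01):
    """Return the next polling interval in seconds, or None if disabled.

    Assumed model: halve the interval after a poll that found MCEs,
    double it after a poll that found none, and never exceed
    check_interval, which is both the initial and the maximum value.
    """
    if check_interval == 0:
        return None  # 0 means no polling for corrected errors
    if found_mce:
        return max(current / 2, min_interval)
    return min(current * 2, check_interval)
```

Starting from the default of 5 minutes (0x12c seconds), one poll that
finds MCEs shortens the interval to 150 seconds in this model, and
one quiet poll afterwards restores it to 300 seconds.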

What: /sys/devices/system/machinecheck/machinecheckX/trigger
Contact: Andi Kleen <ak@linux.intel.com>
Date: Feb, 2007
Description:
The entries appear for each CPU, but they are shared
between all CPUs.

Program to run when a machine check event is detected.
This is an alternative to running mcelog regularly from cron
and allows events to be detected faster.
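
A sketch of setting the trigger from a management script. It uses the
machinecheck0 directory because the entry is shared between all CPUs.
The absolute-path check is a precaution added by this sketch, not a
requirement stated above, and the function names are hypothetical.
Writing to the real file requires root privileges.

```python
import os

# Any CPU's directory works, since the entry is shared between all CPUs.
TRIGGER = "/sys/devices/system/machinecheck/machinecheck0/trigger"


def set_trigger(program: str, trigger_file: str = TRIGGER) -> None:
    """Register the program to run when a machine check event is detected."""
    if not os.path.isabs(program):
        # Assumption of this sketch: only accept absolute paths.
        raise ValueError("program must be given as an absolute path")
    with open(trigger_file, "w") as f:
        f.write(program + "\n")


def get_trigger(trigger_file: str = TRIGGER) -> str:
    """Return the currently configured trigger program ('' if none)."""
    with open(trigger_file) as f:
        return f.read().strip()
```

For example, set_trigger("/usr/local/sbin/mce-notify") would register
a (hypothetical) notification script.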

What: /sys/devices/system/machinecheck/machinecheckX/monarch_timeout
Contact: Andi Kleen <ak@linux.intel.com>
Date: Feb, 2007
Description:
How long to wait, on a machine check exception, for the other
CPUs to machine check as well. 0 disables waiting for other
CPUs.

Unit: us (microseconds)

What: /sys/devices/system/machinecheck/machinecheckX/ignore_ce
Contact: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Date: Jun 2009
Description:
Disables polling and CMCI for corrected errors.
Corrected events are not cleared; they are all kept in the
bank MSRs.

What: /sys/devices/system/machinecheck/machinecheckX/dont_log_ce
Contact: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Date: Jun 2009
Description:
Disables logging for corrected errors.
All reported corrected errors will be cleared silently.

This option is useful if you do not care about corrected
errors.

What: /sys/devices/system/machinecheck/machinecheckX/cmci_disabled
Contact: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Date: Jun 2009
Description:
Disables the CMCI (Corrected Machine Check Interrupt) feature.
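
The three corrected-error switches (ignore_ce, dont_log_ce and
cmci_disabled) can be inspected together. The sketch below assumes
each file contains a decimal integer where non-zero means the option
is set; that format is an assumption, and read_ce_settings is a
hypothetical helper name.

```python
import os

CE_FLAGS = ("ignore_ce", "dont_log_ce", "cmci_disabled")


def read_ce_settings(cpu_dir: str) -> dict:
    """Read the corrected-error switches from one machinecheckX directory."""
    settings = {}
    for name in CE_FLAGS:
        with open(os.path.join(cpu_dir, name)) as f:
            # Assumed format: a decimal integer, non-zero meaning "set".
            settings[name] = int(f.read().strip()) != 0
    return settings
```

A typical call would be
read_ce_settings("/sys/devices/system/machinecheck/machinecheck0").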