PCIe Device AER statistics
--------------------------

These attributes show up under all the devices that are AER capable. The
statistical counters indicate the errors "as seen/reported by the device".
Consequently, if an endpoint is causing problems, the AER counters may
increment at its link partner (e.g. root port) rather than at the endpoint:
the errors may be "seen" / reported by the link partner, while the
problematic endpoint itself may report all counters as 0 because it never
saw any problems.
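
Because the counters of interest may live at the link partner, it can help to
find the device directly upstream of an endpoint. The following is a minimal
sketch, assuming the usual sysfs layout in which a device directory is nested
inside the directory of its upstream bridge. The names ``upstream_device`` and
``PCI_ADDR`` are illustrative and not part of any kernel interface.

```python
import os
import re

# Matches a PCI address such as 0000:00:1c.0 (domain:bus:device.function).
PCI_ADDR = re.compile(
    r"^[0-9a-fA-F]{4,}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$"
)


def upstream_device(resolved_path):
    """Return the sysfs path of the device directly upstream, or None.

    Assumption: resolved_path is the symlink-resolved device directory,
    e.g. os.path.realpath("/sys/bus/pci/devices/<dev>").
    """
    parent = os.path.dirname(resolved_path.rstrip("/"))
    if PCI_ADDR.match(os.path.basename(parent)):
        return parent
    # The parent is not a PCI device (e.g. the host bridge pci0000:00).
    return None
```

For example, ``upstream_device(os.path.realpath(path))`` applied to an
endpoint's directory under /sys/bus/pci/devices/ yields the directory whose
AER counters should be checked alongside the endpoint's own.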

What:           /sys/bus/pci/devices/<dev>/aer_dev_correctable
Date:           July 2018
KernelVersion:  4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    List of correctable errors seen and reported by this
                PCI device using ERR_COR. Note that because multiple errors
                may be reported using a single ERR_COR message,
                TOTAL_ERR_COR at the end of the file may not match the
                sum of the individual error counters in the file. Sample
                output::

                    localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_correctable
                    Receiver Error 2
                    Bad TLP 0
                    Bad DLLP 0
                    RELAY_NUM Rollover 0
                    Replay Timer Timeout 0
                    Advisory Non-Fatal 0
                    Corrected Internal Error 0
                    Header Log Overflow 0
                    TOTAL_ERR_COR 2
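
The sample output can be consumed by a small parser. This is a sketch only,
assuming every line has the form "<error name> <count>" as in the sample above
and that the final TOTAL_ERR_* line is kept separate from the per-error
counters. ``parse_aer_counters`` is an illustrative name.

```python
def parse_aer_counters(text):
    """Parse the contents of an aer_dev_* file into (counters, total)."""
    counters = {}
    total = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The count is the last whitespace-separated field; the error
        # name itself may contain spaces ("Receiver Error").
        name, value = line.rsplit(None, 1)
        if name.startswith("TOTAL_ERR_"):
            total = int(value)
        else:
            counters[name] = int(value)
    return counters, total


sample = """\
Receiver Error 2
Bad TLP 0
Bad DLLP 0
RELAY_NUM Rollover 0
Replay Timer Timeout 0
Advisory Non-Fatal 0
Corrected Internal Error 0
Header Log Overflow 0
TOTAL_ERR_COR 2
"""

counters, total = parse_aer_counters(sample)
nonzero = {name: count for name, count in counters.items() if count}
print(nonzero, total)  # prints: {'Receiver Error': 2} 2
```

Keeping the total apart from the counters matters because the two need not
agree: one message may report several errors, so the sum of the counters can
exceed the total.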

What:           /sys/bus/pci/devices/<dev>/aer_dev_fatal
Date:           July 2018
KernelVersion:  4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    List of uncorrectable fatal errors seen and reported by this
                PCI device using ERR_FATAL. Note that because multiple errors
                may be reported using a single ERR_FATAL message,
                TOTAL_ERR_FATAL at the end of the file may not match the
                sum of the individual error counters in the file. Sample
                output::

                    localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_fatal
                    Undefined 0
                    Data Link Protocol 0
                    Surprise Down Error 0
                    Poisoned TLP 0
                    Flow Control Protocol 0
                    Completion Timeout 0
                    Completer Abort 0
                    Unexpected Completion 0
                    Receiver Overflow 0
                    Malformed TLP 0
                    ECRC 0
                    Unsupported Request 0
                    ACS Violation 0
                    Uncorrectable Internal Error 0
                    MC Blocked TLP 0
                    AtomicOp Egress Blocked 0
                    TLP Prefix Blocked Error 0
                    TOTAL_ERR_FATAL 0

What:           /sys/bus/pci/devices/<dev>/aer_dev_nonfatal
Date:           July 2018
KernelVersion:  4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    List of uncorrectable nonfatal errors seen and reported by this
                PCI device using ERR_NONFATAL. Note that because multiple
                errors may be reported using a single ERR_NONFATAL message,
                TOTAL_ERR_NONFATAL at the end of the file may not match the
                sum of the individual error counters in the file. Sample
                output::

                    localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_nonfatal
                    Undefined 0
                    Data Link Protocol 0
                    Surprise Down Error 0
                    Poisoned TLP 0
                    Flow Control Protocol 0
                    Completion Timeout 0
                    Completer Abort 0
                    Unexpected Completion 0
                    Receiver Overflow 0
                    Malformed TLP 0
                    ECRC 0
                    Unsupported Request 0
                    ACS Violation 0
                    Uncorrectable Internal Error 0
                    MC Blocked TLP 0
                    AtomicOp Egress Blocked 0
                    TLP Prefix Blocked Error 0
                    TOTAL_ERR_NONFATAL 0

PCIe Rootport AER statistics
----------------------------

These attributes show up only under the rootports (or root complex event
collectors) that are AER capable. They indicate the number of error messages
"reported to" the rootport. Note that a rootport also transmits (internally)
the ERR_* messages for errors seen by its own internal PCI device, so these
counters include them and are thus cumulative over all the error messages on
the PCI hierarchy originating at that root port.

What:           /sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_cor
Date:           July 2018
KernelVersion:  4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    Total number of ERR_COR messages reported to rootport.

What:           /sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_fatal
Date:           July 2018
KernelVersion:  4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    Total number of ERR_FATAL messages reported to rootport.

What:           /sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_nonfatal
Date:           July 2018
KernelVersion:  4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    Total number of ERR_NONFATAL messages reported to rootport.
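
Since the rootport totals are cumulative, a monitoring tool would typically
compare two snapshots rather than look at absolute values. The sketch below
assumes each attribute file holds a single decimal integer and that the caller
passes the directory containing the attributes. ``read_rootport_totals`` and
``delta`` are illustrative names.

```python
import os

ROOTPORT_ATTRS = (
    "aer_rootport_total_err_cor",
    "aer_rootport_total_err_fatal",
    "aer_rootport_total_err_nonfatal",
)


def read_rootport_totals(attr_dir):
    """Return {attribute name: message count} for the files in attr_dir.

    Attributes that are absent (for instance because the device is not
    an AER-capable rootport) are skipped rather than treated as errors.
    Assumption: each file contains one decimal integer.
    """
    totals = {}
    for name in ROOTPORT_ATTRS:
        try:
            with open(os.path.join(attr_dir, name)) as f:
                totals[name] = int(f.read().strip())
        except FileNotFoundError:
            continue
    return totals


def delta(before, after):
    """Messages reported between two snapshots, per attribute."""
    return {name: after[name] - before[name]
            for name in after if name in before}
```

Calling ``read_rootport_totals`` twice on the same directory and passing both
results to ``delta`` gives the number of ERR_* messages reported to the
rootport in the interval between the two reads.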