0001 .. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
0002 
0003 ====================================
0004 Marvell OcteonTx2 RVU Kernel Drivers
0005 ====================================
0006 
0007 Copyright (c) 2020 Marvell International Ltd.
0008 
0009 Contents
0010 ========
0011 
0012 - `Overview`_
0013 - `Drivers`_
0014 - `Basic packet flow`_
0015 - `Devlink health reporters`_
0016 
0017 Overview
0018 ========
0019 
0020 The resource virtualization unit (RVU) on Marvell's OcteonTX2 SoC maps HW
0021 resources from the network, crypto and other functional blocks into
0022 PCI-compatible physical and virtual functions. Each functional block
0023 in turn has multiple local functions (LFs) for provisioning to PCI devices.
0024 RVU supports multiple PCIe SR-IOV physical functions (PFs) and virtual
0025 functions (VFs). PF0 is called the administrative / admin function (AF)
0026 and has privileges to provision the LFs of RVU functional blocks to each
0027 PF/VF.
0028 
0029 RVU managed networking functional blocks
0030  - Network pool or buffer allocator (NPA)
0031  - Network interface controller (NIX)
0032  - Network parser CAM (NPC)
0033  - Schedule/Synchronize/Order unit (SSO)
0034  - Loopback interface (LBK)
0035 
0036 RVU managed non-networking functional blocks
0037  - Crypto accelerator (CPT)
0038  - Scheduled timers unit (TIM)
0039  - Schedule/Synchronize/Order unit (SSO)
0040    Used for both networking and non-networking use cases
0041 
0042 Resource provisioning examples
0043  - A PF/VF with NIX-LF & NPA-LF resources works as a pure network device.
0044  - A PF/VF with CPT-LF resource works as a pure crypto offload device.
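The two provisioning examples above can be sketched as a small classifier. This is only an illustration: the function name `device_role` and the role strings are hypothetical, and the block names are plain strings rather than anything read from hardware or from the driver.

```python
def device_role(attached_lfs):
    """Classify a PF/VF by the set of LF types provisioned to it.

    Hypothetical helper: only the two combinations named in the text
    are recognised; anything else is reported as "mixed".
    """
    lfs = frozenset(attached_lfs)
    if lfs == {"NIX", "NPA"}:
        return "network"        # pure network device
    if lfs == {"CPT"}:
        return "crypto"         # pure crypto offload device
    if not lfs:
        return "unprovisioned"  # nothing attached yet
    return "mixed"


print(device_role(["NIX", "NPA"]))
print(device_role(["CPT"]))
```

Any other combination (for example NIX, NPA and SSO together) falls into the catch-all branch, since the text does not name a role for it.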
0045 
0046 RVU functional blocks are highly configurable as per software requirements.
0047 
0048 Firmware sets up the following before the kernel boots
0049  - Enables the required number of RVU PFs based on the number of physical links.
0050  - The number of VFs per PF is either static or configurable at compile time.
0051    Based on the config, firmware assigns VFs to each of the PFs.
0052  - Also assigns MSI-X vectors to each of the PFs and VFs.
0053  - These are not changed after kernel boot.
0054 
0055 Drivers
0056 =======
0057 
0058 The Linux kernel has multiple drivers registering to different PFs and VFs
0059 of RVU. With respect to networking there are 3 flavours of drivers.
0060 
0061 Admin Function driver
0062 ---------------------
0063 
0064 As mentioned above, RVU PF0 is called the admin function (AF). This driver
0065 supports resource provisioning and configuration of functional blocks.
0066 It doesn't handle any I/O. It sets up a few basic things, but most of the
0067 functionality is achieved via configuration requests from PFs and VFs.
0068 
0069 PFs/VFs communicate with AF via a shared memory region (mailbox). Upon
0070 receiving requests, AF does resource provisioning and other HW configuration.
0071 AF is always attached to the host kernel, but PFs and their VFs may be used by
0072 the host kernel itself, or attached to VMs or to userspace applications like
0073 DPDK etc. So AF has to handle provisioning/configuration requests sent
0074 by any device from any domain.
0075 
0076 The AF driver also interacts with the underlying firmware to
0077  - Manage physical ethernet links, i.e. CGX LMACs.
0078  - Retrieve information like speed, duplex, autoneg etc.
0079  - Retrieve PHY EEPROM and stats.
0080  - Configure FEC, PAM modes.
0081  - etc
0082 
0083 From the pure networking side, the AF driver supports the following functionality.
0084  - Map a physical link to a RVU PF to which a netdev is registered.
0085  - Attach NIX and NPA block LFs to RVU PF/VF which provide buffer pools, RQs, SQs
0086    for regular networking functionality.
0087  - Flow control (pause frames) enable/disable/config.
0088  - HW PTP timestamping related config.
0089  - NPC parser profile config, basically how to parse pkt and what info to extract.
0090  - NPC extract profile config, what to extract from the pkt to match data in MCAM entries.
0091  - Manage NPC MCAM entries, upon request can frame and install requested packet forwarding rules.
0092  - Defines receive side scaling (RSS) algorithms.
0093  - Defines segmentation offload algorithms (e.g. TSO).
0094  - VLAN stripping, capture and insertion config.
0095  - SSO and TIM blocks config which provide packet scheduling support.
0096  - Debugfs support, to check current resource provisioning, current status of
0097    NPA pools, NIX RQs, SQs and CQs, various stats etc, which helps in debugging issues.
0098  - And many more.
0099 
0100 Physical Function driver
0101 ------------------------
0102 
0103 This RVU PF handles I/O and is mapped to a physical ethernet link, and this
0104 driver registers a netdev. It supports SR-IOV. As said above, this driver
0105 communicates with AF via a mailbox. To retrieve information from physical
0106 links, this driver talks to AF; AF gets that info from firmware and responds
0107 back, i.e. the PF driver cannot talk to firmware directly.
0108 
0109 It supports ethtool for configuring links, RSS, queue count, queue size,
0110 flow control and ntuple filters, dumping PHY EEPROM, configuring FEC etc.
0111 
0112 Virtual Function driver
0113 -----------------------
0114 
0115 There are two types of VFs: VFs that share the physical link with their parent
0116 SR-IOV PF, and VFs which work in pairs using internal HW loopback channels (LBK).
0117 
0118 Type1:
0119  - These VFs and their parent PF share a physical link, which is used for outside communication.
0120  - VFs cannot communicate with AF directly; they send a mbox message to PF and PF
0121    forwards that to AF. After processing, AF responds back to PF and PF forwards
0122    the reply to VF.
0123  - From a functionality point of view there is no difference between PF and VF, as the same type
0124    of HW resources are attached to both. But the user can configure a few things only
0125    from PF, as PF is treated as owner/admin of the link.
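The Type1 relay path described above (VF to PF to AF and back) can be modelled with a toy sketch. All class and method names here are hypothetical and do not correspond to the driver's mailbox API; the shared memory region is replaced by plain method calls.

```python
class AdminFunction:
    """Stand-in for the AF (PF0) end of the mailbox."""

    def __init__(self):
        # (relayed_by, origin, request) tuples, kept only for illustration
        self.log = []

    def handle(self, relayed_by, origin, request):
        self.log.append((relayed_by, origin, request))
        return {"origin": origin, "request": request, "status": "ok"}


class PhysicalFunction:
    def __init__(self, name, af):
        self.name = name
        self.af = af

    def send(self, request):
        # A PF talks to AF directly.
        return self.af.handle(self.name, self.name, request)

    def relay(self, vf_name, request):
        # Forward a VF's request to AF and hand the reply back to the VF.
        return self.af.handle(self.name, vf_name, request)


class Type1VF:
    def __init__(self, name, pf):
        self.name = name
        self.pf = pf

    def send(self, request):
        # No direct path to AF: every request goes through the parent PF.
        return self.pf.relay(self.name, request)


af = AdminFunction()
pf1 = PhysicalFunction("PF1", af)
vf0 = Type1VF("PF1_VF0", pf1)
reply = vf0.send("attach NIX LF")
print(reply["status"], af.log)
```

The log entry records both the relaying PF and the originating VF, which mirrors the point that AF must serve requests from any device even when they arrive through an intermediary.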
0126 
0127 Type2:
0128  - RVU PF0, i.e. the admin function, creates these VFs and maps them to loopback block's channels.
0129  - A set of two VFs (VF0 & VF1, VF2 & VF3 .. so on) works as a pair, i.e. pkts sent out of
0130    VF0 will be received by VF1 and vice versa.
0131  - These VFs can be used by applications or virtual machines to communicate between them
0132    without sending traffic outside. There is no switch present in HW, hence the support
0133    for loopback VFs.
0134  - These communicate directly with AF (PF0) via mbox.
0135 
0136 Except for the IO channels or links used for packet reception and transmission, there is
0137 no other difference between these VF types. AF driver takes care of IO channel mapping,
0138 hence the same VF driver works for both types of devices.
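The Type2 pairing (VF0 with VF1, VF2 with VF3, and so on) can be written as a one-line index calculation. This is a minimal sketch assuming pairs are always formed from consecutive even/odd VF indices, as the example in the text suggests; `lbk_peer` is a hypothetical name, not a driver function.

```python
def lbk_peer(vf_index):
    """Return the index of the loopback VF paired with vf_index.

    Assumes pairs are (0, 1), (2, 3), ... so the peer differs only
    in the lowest bit of the index.
    """
    if vf_index < 0:
        raise ValueError("VF index must be non-negative")
    return vf_index ^ 1


for vf in range(4):
    print(f"VF{vf} <-> VF{lbk_peer(vf)}")
```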
0139 
0140 Basic packet flow
0141 =================
0142 
0143 Ingress
0144 -------
0145 
0146 1. CGX LMAC receives the packet.
0147 2. It forwards the packet to the NIX block.
0148 3. The packet is then submitted to the NPC block for parsing and MCAM lookup to get the destination RVU device.
0149 4. The NIX LF attached to the destination RVU device allocates a buffer from the RQ-mapped buffer pool of the NPA block LF.
0150 5. The RQ may be selected by RSS or by configuring an MCAM rule with a RQ number.
0151 6. The packet is DMA'ed and the driver is notified.
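The RQ selection in step 5 can be illustrated with a small sketch. Everything here is an assumption for illustration: `select_rq` is a hypothetical helper, the RSS indirection table is a plain list, and CRC32 is only a stand-in for whatever hash the hardware applies to the flow key.

```python
import zlib
from typing import Optional, Sequence


def select_rq(flow_key: bytes,
              rss_table: Sequence[int],
              mcam_rq: Optional[int] = None) -> int:
    """Pick a receive queue for a packet.

    An RQ number supplied by a matching MCAM rule wins; otherwise the
    flow key is hashed into the RSS indirection table.
    """
    if mcam_rq is not None:
        return mcam_rq
    if not rss_table:
        raise ValueError("RSS table must not be empty")
    # CRC32 is a placeholder hash, not the hardware algorithm.
    return rss_table[zlib.crc32(flow_key) % len(rss_table)]


table = [0, 1, 2, 3]
print(select_rq(b"10.0.0.1:1234->10.0.0.2:80", table))             # RSS path
print(select_rq(b"10.0.0.1:1234->10.0.0.2:80", table, mcam_rq=7))  # MCAM rule path
```

Because the hash depends only on the flow key, packets of the same flow land on the same RQ, which is the property RSS relies on.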
0152 
0153 Egress
0154 ------
0155 
0156 1. Driver prepares a send descriptor and submits it to SQ for transmission.
0157 2. The SQ is already configured (by AF) to transmit on a specific link/channel.
0158 3. The SQ descriptor ring is maintained in buffers allocated from the SQ-mapped pool of the NPA block LF.
0159 4. NIX block transmits the pkt on the designated channel.
0160 5. NPC MCAM entries can be installed to divert pkt onto a different channel.
0161 
0162 Devlink health reporters
0163 ========================
0164 
0165 NPA Reporters
0166 -------------
0167 The NPA reporters are responsible for reporting and recovering the following groups of errors:
0168 
0169 1. GENERAL events
0170 
0171    - Error due to operation of unmapped PF.
0172    - Error due to disabled alloc/free for other HW blocks (NIX, SSO, TIM, DPI and AURA).
0173 
0174 2. ERROR events
0175 
0176    - Fault due to NPA_AQ_INST_S read or NPA_AQ_RES_S write.
0177    - AQ Doorbell Error.
0178 
0179 3. RAS events
0180 
0181    - RAS Error Reporting for NPA_AQ_INST_S/NPA_AQ_RES_S.
0182 
0183 4. RVU events
0184 
0185    - Error due to unmapped slot.
0186 
0187 Sample Output::
0188 
0189         ~# devlink health
0190         pci/0002:01:00.0:
0191           reporter hw_npa_intr
0192               state healthy error 2872 recover 2872 last_dump_date 2020-12-10 last_dump_time 09:39:09 grace_period 0 auto_recover true auto_dump true
0193           reporter hw_npa_gen
0194               state healthy error 2872 recover 2872 last_dump_date 2020-12-11 last_dump_time 04:43:04 grace_period 0 auto_recover true auto_dump true
0195           reporter hw_npa_err
0196               state healthy error 2871 recover 2871 last_dump_date 2020-12-10 last_dump_time 09:39:17 grace_period 0 auto_recover true auto_dump true
0197           reporter hw_npa_ras
0198               state healthy error 0 recover 0 last_dump_date 2020-12-10 last_dump_time 09:32:40 grace_period 0 auto_recover true auto_dump true
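For scripted monitoring, the text layout shown above can be split into per-reporter fields. The following is a minimal sketch that assumes exactly the layout in the sample output (a device line ending in a colon, a `reporter <name>` line, then one line of alternating keys and values); `parse_devlink_health` is a hypothetical helper, and all values are kept as strings.

```python
SAMPLE = """\
pci/0002:01:00.0:
  reporter hw_npa_intr
      state healthy error 2872 recover 2872 last_dump_date 2020-12-10 last_dump_time 09:39:09 grace_period 0 auto_recover true auto_dump true
  reporter hw_npa_ras
      state healthy error 0 recover 0 last_dump_date 2020-12-10 last_dump_time 09:32:40 grace_period 0 auto_recover true auto_dump true
"""


def parse_devlink_health(text):
    """Return {(device, reporter): {field: value}} from `devlink health` text."""
    reporters = {}
    device = None
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("~#"):
            continue  # skip blank lines and the shell prompt
        if line.endswith(":"):
            device = line[:-1]
        elif line.startswith("reporter "):
            name = line.split()[1]
        elif name is not None:
            tokens = line.split()
            # The status line alternates key, value, key, value, ...
            reporters[(device, name)] = dict(zip(tokens[0::2], tokens[1::2]))
    return reporters


for (dev, rep), fields in parse_devlink_health(SAMPLE).items():
    print(dev, rep, fields["state"], fields["error"], fields["recover"])
```

A reporter whose `error` and `recover` counts differ would be a natural thing to flag, since the sample shows them equal when recovery succeeds.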
0199 
0200 Each reporter dumps the
0201 
0202  - Error Type
0203  - Error Register value
0204  - Reason in words
0205 
0206 For example::
0207 
0208         ~# devlink health dump show  pci/0002:01:00.0 reporter hw_npa_gen
0209          NPA_AF_GENERAL:
0210                  NPA General Interrupt Reg : 1
0211                  NIX0: free disabled RX
0212         ~# devlink health dump show  pci/0002:01:00.0 reporter hw_npa_intr
0213          NPA_AF_RVU:
0214                  NPA RVU Interrupt Reg : 1
0215                  Unmap Slot Error
0216         ~# devlink health dump show  pci/0002:01:00.0 reporter hw_npa_err
0217          NPA_AF_ERR:
0218                 NPA Error Interrupt Reg : 4096
0219                 AQ Doorbell Error
0220 
0221 
0222 NIX Reporters
0223 -------------
0224 The NIX reporters are responsible for reporting and recovering the following groups of errors:
0225 
0226 1. GENERAL events
0227 
0228    - Receive mirror/multicast packet drop due to insufficient buffer.
0229    - SMQ Flush operation.
0230 
0231 2. ERROR events
0232 
0233    - Memory Fault due to WQE read/write from multicast/mirror buffer.
0234    - Receive multicast/mirror replication list error.
0235    - Receive packet on an unmapped PF.
0236    - Fault due to NIX_AQ_INST_S read or NIX_AQ_RES_S write.
0237    - AQ Doorbell Error.
0238 
0239 3. RAS events
0240 
0241    - RAS Error Reporting for NIX Receive Multicast/Mirror Entry Structure.
0242    - RAS Error Reporting for WQE/Packet Data read from Multicast/Mirror Buffer.
0243    - RAS Error Reporting for NIX_AQ_INST_S/NIX_AQ_RES_S.
0244 
0245 4. RVU events
0246 
0247    - Error due to unmapped slot.
0248 
0249 Sample Output::
0250 
0251         ~# ./devlink health
0252         pci/0002:01:00.0:
0253           reporter hw_npa_intr
0254             state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
0255           reporter hw_npa_gen
0256             state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
0257           reporter hw_npa_err
0258             state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
0259           reporter hw_npa_ras
0260             state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
0261           reporter hw_nix_intr
0262             state healthy error 1121 recover 1121 last_dump_date 2021-01-19 last_dump_time 05:42:26 grace_period 0 auto_recover true auto_dump true
0263           reporter hw_nix_gen
0264             state healthy error 949 recover 949 last_dump_date 2021-01-19 last_dump_time 05:42:43 grace_period 0 auto_recover true auto_dump true
0265           reporter hw_nix_err
0266             state healthy error 1147 recover 1147 last_dump_date 2021-01-19 last_dump_time 05:42:59 grace_period 0 auto_recover true auto_dump true
0267           reporter hw_nix_ras
0268             state healthy error 409 recover 409 last_dump_date 2021-01-19 last_dump_time 05:43:16 grace_period 0 auto_recover true auto_dump true
0269 
0270 Each reporter dumps the
0271 
0272  - Error Type
0273  - Error Register value
0274  - Reason in words
0275 
0276 For example::
0277 
0278         ~# devlink health dump show pci/0002:01:00.0 reporter hw_nix_intr
0279          NIX_AF_RVU:
0280                 NIX RVU Interrupt Reg : 1
0281                 Unmap Slot Error
0282         ~# devlink health dump show pci/0002:01:00.0 reporter hw_nix_gen
0283          NIX_AF_GENERAL:
0284                 NIX General Interrupt Reg : 1
0285                 Rx multicast pkt drop
0286         ~# devlink health dump show pci/0002:01:00.0 reporter hw_nix_err
0287          NIX_AF_ERR:
0288                 NIX Error Interrupt Reg : 64
0289                 Rx on unmapped PF_FUNC