.. SPDX-License-Identifier: GPL-2.0

=====================
Segmentation Offloads
=====================


Introduction
============

This document describes a set of techniques in the Linux networking stack
to take advantage of segmentation offload capabilities of various NICs.

The following technologies are described:
 * TCP Segmentation Offload - TSO
 * UDP Fragmentation Offload - UFO
 * IPIP, SIT, GRE, and UDP Tunnel Offloads
 * Generic Segmentation Offload - GSO
 * Generic Receive Offload - GRO
 * Partial Generic Segmentation Offload - GSO_PARTIAL
 * SCTP acceleration with GSO - GSO_BY_FRAGS


TCP Segmentation Offload
========================

TCP segmentation allows a device to segment a single frame into multiple
frames with a data payload size specified in skb_shinfo()->gso_size.
When TCP segmentation is requested, the bit for either SKB_GSO_TCPV4 or
SKB_GSO_TCPV6 should be set in skb_shinfo()->gso_type and
skb_shinfo()->gso_size should be set to a non-zero value.
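
The two conditions above can be sketched as a small user-space model.
This is only an illustration: struct model_shinfo, the MODEL_GSO_* flags
and model_is_tso_request() are hypothetical names, and the flag values are
placeholders rather than the kernel's real SKB_GSO_* values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flags; the real SKB_GSO_* values live in the kernel headers. */
enum model_gso_type {
	MODEL_GSO_TCPV4 = 1u << 0,
	MODEL_GSO_TCPV6 = 1u << 1,
};

/* Toy counterpart of the two skb_shinfo() fields discussed above. */
struct model_shinfo {
	uint16_t gso_size;	/* payload bytes per resulting frame */
	uint32_t gso_type;	/* bitmask of model_gso_type values */
};

/* True when the fields describe a well-formed TCP segmentation request. */
static bool model_is_tso_request(const struct model_shinfo *si)
{
	bool tcp = (si->gso_type & (MODEL_GSO_TCPV4 | MODEL_GSO_TCPV6)) != 0;

	return tcp && si->gso_size != 0;
}
```

A request with a TCP type bit but a zero gso_size, or a non-zero gso_size
without a TCP type bit, is rejected by this model.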

TCP segmentation is dependent on support for the use of partial checksum
offload.  For this reason TSO is normally disabled if the Tx checksum
offload for a given device is disabled.

In order to support TCP segmentation offload it is necessary to populate
the network and transport header offsets of the skbuff so that the device
drivers will be able to determine the offsets of the IP or IPv6 header and
the TCP header.  In addition, as CHECKSUM_PARTIAL is required, csum_start
should also point to the TCP header of the packet.
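
As a rough illustration of how these offsets relate, the sketch below
computes them for a frame with a single set of headers.  It is a
user-space model, not kernel code: struct model_offsets and
model_tcp_offsets() are hypothetical, and offsets are measured from the
start of the frame rather than from the skb head.  The checksum offset of
16 is the position of the checksum field within the TCP header.

```c
#include <stdint.h>

/* Hypothetical summary of the offsets a driver needs for TSO. */
struct model_offsets {
	uint16_t network_header;	/* start of the IP or IPv6 header */
	uint16_t transport_header;	/* start of the TCP header */
	uint16_t csum_start;		/* where checksumming begins */
	uint16_t csum_offset;		/* checksum field, relative to csum_start */
};

/* Derive the offsets from the MAC and IP header lengths. */
static struct model_offsets model_tcp_offsets(uint16_t mac_len,
					      uint16_t ip_hdr_len)
{
	struct model_offsets o;

	o.network_header = mac_len;
	o.transport_header = (uint16_t)(mac_len + ip_hdr_len);
	o.csum_start = o.transport_header;	/* points at the TCP header */
	o.csum_offset = 16;			/* TCP checksum field offset */
	return o;
}
```

For an Ethernet frame (14 bytes) carrying IPv4 without options (20 bytes),
the model places both the transport header and csum_start at offset 34.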

For IPv4 segmentation we support one of two behaviors for the IP ID.
The default behavior is to increment the IP ID with every segment.  If the
GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
ID and all segments will use the same IP ID.  If a device has
NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when performing TSO
and we will either increment the IP ID for all frames, or leave it at a
static value based on driver preference.
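
The two IP ID behaviors can be modeled as below.  This is a sketch under
the assumption that the ID wraps as a 16-bit value; model_segment_ip_id()
is a hypothetical helper, and its fixed_id argument stands in for
SKB_GSO_TCP_FIXEDID being set.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * IP ID carried by segment number seg_index (0-based) of a TSO frame whose
 * first segment uses first_id.
 */
static uint16_t model_segment_ip_id(uint16_t first_id, unsigned int seg_index,
				    bool fixed_id)
{
	if (fixed_id)
		return first_id;		/* every segment repeats the ID */
	return (uint16_t)(first_id + seg_index);	/* wraps modulo 65536 */
}
```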


UDP Fragmentation Offload
=========================

UDP fragmentation offload allows a device to fragment an oversized UDP
datagram into multiple IPv4 fragments.  Many of the requirements for UDP
fragmentation offload are the same as TSO.  However, the IPv4 ID for
fragments should not increment, as a single IPv4 datagram is fragmented.
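
The following user-space sketch shows what fragmenting a single datagram
looks like: every fragment keeps the same IPv4 ID, while the fragment
offset (counted in 8-byte units, as in the IPv4 header) and the
more-fragments flag change.  struct model_frag and model_ufo_fragment()
are hypothetical names used only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one IPv4 fragment; not a kernel type. */
struct model_frag {
	uint16_t ip_id;		/* identical for every fragment */
	uint16_t frag_off;	/* offset in 8-byte units */
	uint16_t len;		/* bytes of the UDP datagram carried */
	bool more_frags;	/* set on all but the last fragment */
};

/*
 * Split an l4_len-byte UDP datagram (UDP header included) into fragments
 * carrying at most max_payload bytes each.  Returns the number of
 * fragments written, or 0 if the input is invalid or out[] is too small.
 */
static unsigned int model_ufo_fragment(unsigned int l4_len,
				       unsigned int max_payload,
				       uint16_t ip_id,
				       struct model_frag *out,
				       unsigned int out_cap)
{
	unsigned int off = 0, n = 0;

	max_payload &= ~7u;	/* non-final fragments are 8-byte multiples */
	if (max_payload == 0 || l4_len == 0)
		return 0;

	while (off < l4_len) {
		unsigned int len = l4_len - off;

		if (len > max_payload)
			len = max_payload;
		if (n == out_cap)
			return 0;
		out[n].ip_id = ip_id;
		out[n].frag_off = (uint16_t)(off / 8);
		out[n].len = (uint16_t)len;
		out[n].more_frags = off + len < l4_len;
		off += len;
		n++;
	}
	return n;
}
```

With a 3008-byte datagram and 1480 bytes of room per fragment, the model
yields three fragments of 1480, 1480 and 48 bytes.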

UFO is deprecated: modern kernels will no longer generate UFO skbs, but can
still receive them from tuntap and similar devices. Offload of UDP-based
tunnel protocols is still supported.


IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
========================================================

In addition to the offloads described above it is possible for a frame to
contain additional headers such as an outer tunnel.  In order to account
for such instances an additional set of segmentation offload types was
introduced including SKB_GSO_IPXIP4, SKB_GSO_IPXIP6, SKB_GSO_GRE, and
SKB_GSO_UDP_TUNNEL.  These extra segmentation types are used to identify
cases where there is more than one set of headers.  For example, in the
case of IPIP and SIT we should have the network and transport headers moved
from the standard list of headers to "inner" header offsets.

Currently only two levels of headers are supported.  The convention is to
refer to the tunnel headers as the outer headers, while the encapsulated
data is normally referred to as the inner headers.  Below is the list of
calls to access the given headers:

IPIP/SIT Tunnel::

             Outer                  Inner
  MAC        skb_mac_header
  Network    skb_network_header     skb_inner_network_header
  Transport  skb_transport_header

UDP/GRE Tunnel::

             Outer                  Inner
  MAC        skb_mac_header         skb_inner_mac_header
  Network    skb_network_header     skb_inner_network_header
  Transport  skb_transport_header   skb_inner_transport_header
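
To make the UDP/GRE layout more concrete, the sketch below computes where
each of the six headers would start in a frame, given the length of every
header in order.  It is a user-space model with hypothetical names
(struct model_tunnel_offsets, model_udp_tunnel_offsets()); the fields only
mirror the accessors listed in the table and do not reproduce how the
kernel stores these offsets.

```c
#include <stdint.h>

/* Hypothetical mirror of the six accessors in the UDP/GRE table. */
struct model_tunnel_offsets {
	uint16_t mac_header;
	uint16_t network_header;
	uint16_t transport_header;
	uint16_t inner_mac_header;
	uint16_t inner_network_header;
	uint16_t inner_transport_header;
};

/*
 * Offsets from the start of the frame.  outer_l4_len covers the outer
 * transport header plus any tunnel-specific header that precedes the
 * inner MAC header.
 */
static struct model_tunnel_offsets
model_udp_tunnel_offsets(uint16_t outer_mac_len, uint16_t outer_ip_len,
			 uint16_t outer_l4_len, uint16_t inner_mac_len,
			 uint16_t inner_ip_len)
{
	struct model_tunnel_offsets o;

	o.mac_header = 0;
	o.network_header = outer_mac_len;
	o.transport_header = (uint16_t)(o.network_header + outer_ip_len);
	o.inner_mac_header = (uint16_t)(o.transport_header + outer_l4_len);
	o.inner_network_header = (uint16_t)(o.inner_mac_header + inner_mac_len);
	o.inner_transport_header =
		(uint16_t)(o.inner_network_header + inner_ip_len);
	return o;
}
```

For example, with Ethernet and IPv4 on both levels and an assumed 16 bytes
of outer UDP plus tunnel header, the inner transport header starts at
offset 84.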

In addition to the above tunnel types there are also SKB_GSO_GRE_CSUM and
SKB_GSO_UDP_TUNNEL_CSUM.  These two additional tunnel types indicate that
the outer header also requests a non-zero checksum.

Finally there is SKB_GSO_TUNNEL_REMCSUM which indicates that a given tunnel
header has requested a remote checksum offload.  In this case the inner
headers will be left with a partial checksum and only the outer header
checksum will be computed.


Generic Segmentation Offload
============================

Generic segmentation offload is a pure software offload that is meant to
deal with cases where device drivers cannot perform the offloads described
above.  What occurs in GSO is that a given skbuff will have its data broken
out over multiple skbuffs that have been resized to match the MSS provided
via skb_shinfo()->gso_size.
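
The splitting step can be pictured with the user-space sketch below, which
only tracks where each output segment's data sits in the original payload.
struct model_seg and model_gso_segment() are hypothetical; the real work
in the kernel also involves copying and fixing up headers, which this
model leaves out.

```c
#include <stdint.h>

/* Hypothetical description of one output segment; not a kernel type. */
struct model_seg {
	unsigned int offset;	/* position in the original payload */
	unsigned int len;	/* payload bytes in this segment */
};

/*
 * Break payload_len bytes into segments of at most gso_size bytes.
 * Returns the number of segments written, or 0 if gso_size is zero or
 * out[] is too small.
 */
static unsigned int model_gso_segment(unsigned int payload_len,
				      uint16_t gso_size,
				      struct model_seg *out,
				      unsigned int out_cap)
{
	unsigned int off = 0, n = 0;

	if (gso_size == 0)
		return 0;

	while (off < payload_len) {
		unsigned int len = payload_len - off;

		if (len > gso_size)
			len = gso_size;	/* all but the last are full-sized */
		if (n == out_cap)
			return 0;
		out[n].offset = off;
		out[n].len = len;
		off += len;
		n++;
	}
	return n;
}
```

A 4000-byte payload with a gso_size of 1448 comes out as segments of 1448,
1448 and 1104 bytes.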

Before enabling any hardware segmentation offload a corresponding software
offload is required in GSO.  Otherwise it becomes possible for a frame to
be re-routed between devices and end up on a device that cannot transmit
it.


Generic Receive Offload
=======================

Generic receive offload is the complement to GSO.  Ideally any frame
assembled by GRO should be segmented to create an identical sequence of
frames using GSO, and any sequence of frames segmented by GSO should be
able to be reassembled back to the original by GRO.  The only exception to
this is IPv4 ID in the case that the DF bit is set for a given IP header.
If the IPv4 ID values are not sequentially incrementing, they will be
rewritten to be sequential when a frame assembled via GRO is segmented via
GSO.


Partial Generic Segmentation Offload
====================================

Partial generic segmentation offload is a hybrid between TSO and GSO.  What
it effectively does is take advantage of certain traits of TCP and tunnels
so that instead of having to rewrite the packet headers for each segment
only the inner-most transport header and possibly the outer-most network
header need to be updated.  This allows devices that do not support tunnel
offloads or tunnel offloads with checksum to still make use of segmentation.

With the partial offload what occurs is that all headers excluding the
inner transport header are updated such that they will contain the correct
values for the case where the header is simply duplicated.  The one
exception to this is the outer IPv4 ID field.  It is up to the device
drivers to guarantee that the IPv4 ID field is incremented in the case that
a given header does not have the DF bit set.
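
A minimal model of the per-segment work left to the device might look like
the sketch below.  It assumes a TCP inner transport header whose sequence
number advances by gso_size per segment, and it keeps the outer IPv4 ID
unchanged when DF is set, which is only one of the choices open to a
driver.  struct model_partial_fields and model_partial_segment() are
hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* The only fields this model lets vary from one segment to the next. */
struct model_partial_fields {
	uint32_t inner_tcp_seq;	/* inner-most transport header */
	uint16_t outer_ip_id;	/* outer-most network header */
};

/*
 * Fields for segment number seg_index (0-based), given the values used by
 * the first segment.  Everything else in the headers is duplicated as-is.
 */
static struct model_partial_fields
model_partial_segment(struct model_partial_fields first,
		      unsigned int seg_index, uint16_t gso_size,
		      bool outer_df)
{
	struct model_partial_fields f = first;

	f.inner_tcp_seq = first.inner_tcp_seq + seg_index * gso_size;
	if (!outer_df)	/* without DF the outer ID must increment */
		f.outer_ip_id = (uint16_t)(first.outer_ip_id + seg_index);
	return f;
}
```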


SCTP acceleration with GSO
==========================

SCTP - despite the lack of hardware support - can still take advantage of
GSO to pass one large packet through the network stack, rather than
multiple small packets.

This requires a different approach from other offloads, as SCTP packets
cannot be just segmented to (P)MTU. Rather, the chunks must be contained in
IP segments, padding respected. So unlike regular GSO, SCTP can't just
generate a big skb, set gso_size to the fragmentation point and deliver it
to the IP layer.

Instead, the SCTP protocol layer builds an skb with the segments correctly
padded and stored as chained skbs, and skb_segment() splits based on those.
To signal this, gso_size is set to the special value GSO_BY_FRAGS.

Therefore, any code in the core networking stack must be aware of the
possibility that gso_size will be GSO_BY_FRAGS and handle that case
appropriately.

There are some helpers to make this easier:

- skb_is_gso(skb) && skb_is_gso_sctp(skb) is the best way to see if
  an skb is an SCTP GSO skb.

- For size checks, the skb_gso_validate_*_len family of helpers correctly
  considers GSO_BY_FRAGS.

- For manipulating packets, skb_increase_gso_size and skb_decrease_gso_size
  will check for GSO_BY_FRAGS and WARN if asked to manipulate these skbs.
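
The kind of special-casing these helpers perform can be sketched in user
space as follows.  Everything here is a model: struct model_skb and the
model_* functions are hypothetical, and MODEL_GSO_BY_FRAGS is assumed to
be 0xFFFF purely for illustration.  Real code should use the kernel
helpers listed above rather than open-coding such checks.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel value, for illustration only. */
#define MODEL_GSO_BY_FRAGS 0xFFFF

/* Toy skb: a gso_size plus the data lengths of its chained skbs. */
struct model_skb {
	uint16_t gso_size;
	const unsigned int *frag_lens;
	size_t nr_frags;
};

/* Size check that honours the sentinel: every resulting packet must fit. */
static bool model_gso_validate_len(const struct model_skb *skb,
				   unsigned int hdr_len, unsigned int max_len)
{
	size_t i;

	if (skb->gso_size != MODEL_GSO_BY_FRAGS)
		return hdr_len + skb->gso_size <= max_len;

	for (i = 0; i < skb->nr_frags; i++)
		if (hdr_len + skb->frag_lens[i] > max_len)
			return false;
	return true;
}

/* Refuses to touch sentinel-marked skbs; returns false where the kernel
 * helper would WARN. */
static bool model_increase_gso_size(struct model_skb *skb, uint16_t inc)
{
	if (skb->gso_size == MODEL_GSO_BY_FRAGS)
		return false;
	skb->gso_size = (uint16_t)(skb->gso_size + inc);
	return true;
}
```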

This also affects drivers with the NETIF_F_FRAGLIST and NETIF_F_GSO_SCTP
bits set. Note also that NETIF_F_GSO_SCTP is included in
NETIF_F_GSO_SOFTWARE.