.. SPDX-License-Identifier: GPL-2.0

==============================================
Management Component Transport Protocol (MCTP)
==============================================

net/mctp/ contains protocol support for MCTP, as defined by DMTF standard
DSP0236. Physical interface drivers ("bindings" in the specification) are
provided in drivers/net/mctp/.

The core code provides a socket-based interface to send and receive MCTP
messages, through an AF_MCTP, SOCK_DGRAM socket.

Structure: interfaces & networks
================================

The kernel models the local MCTP topology through two items: interfaces and
networks.

An interface (or "link") is an instance of an MCTP physical transport binding
(as defined by DSP0236, section 3.2.47), likely connected to a specific hardware
device. This is represented as a ``struct net_device``.

A network defines a unique address space for MCTP endpoints by endpoint-ID
(described by DSP0236, section 3.2.31). A network has a user-visible identifier
to allow references from userspace. Route definitions are specific to one
network.

Interfaces are associated with one network. A network may be associated with one
or more interfaces.

If multiple networks are present, each may contain endpoint IDs (EIDs) that are
also present on other networks.
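
Because an EID is only unique within its own network, userspace code that
tracks endpoints would typically key them by the (network, EID) pair rather than
by the EID alone. A minimal sketch of that idea follows; ``struct endpoint_key``
and ``endpoint_key_equal()`` are hypothetical names used only for illustration,
and are not part of any kernel interface:

.. code-block:: C

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical userspace key: an EID is only meaningful within a network */
    struct endpoint_key {
            unsigned int net;   /* network identifier */
            uint8_t      eid;   /* endpoint ID within that network */
    };

    /* Two endpoints are the same only if both the network and the EID match */
    static bool endpoint_key_equal(struct endpoint_key a, struct endpoint_key b)
    {
            return a.net == b.net && a.eid == b.eid;
    }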

Sockets API
===========

Protocol definitions
--------------------

MCTP uses ``AF_MCTP`` / ``PF_MCTP`` for the address and protocol families.
Since MCTP is message-based, only ``SOCK_DGRAM`` sockets are supported.

.. code-block:: C

    int sd = socket(AF_MCTP, SOCK_DGRAM, 0);

The only (current) value for the ``protocol`` argument is 0.

As with all socket address families, source and destination addresses are
specified with a ``sockaddr`` type, with a single-byte endpoint address:

.. code-block:: C

    typedef __u8                mctp_eid_t;

    struct mctp_addr {
            mctp_eid_t          s_addr;
    };

    struct sockaddr_mctp {
            __kernel_sa_family_t smctp_family;
            unsigned int         smctp_network;
            struct mctp_addr     smctp_addr;
            __u8                 smctp_type;
            __u8                 smctp_tag;
    };

    #define MCTP_NET_ANY        0x0
    #define MCTP_ADDR_ANY       0xff
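
As an illustration of filling in these fields, the sketch below builds a
complete address in one place. ``mctp_make_addr()`` is a hypothetical helper,
not something the kernel headers provide, and the ``AF_MCTP`` fallback is only
there for C libraries whose headers predate MCTP. Zeroing the structure first
is a defensive habit: it also clears any padding or reserved fields that a
given header version may carry.

.. code-block:: C

    #include <string.h>
    #include <sys/socket.h>
    #include <linux/mctp.h>

    #ifndef AF_MCTP
    #define AF_MCTP 45      /* value from linux/socket.h, for older libc headers */
    #endif

    /* Hypothetical helper: build a fully-initialised MCTP socket address */
    static struct sockaddr_mctp mctp_make_addr(unsigned int net, mctp_eid_t eid,
                                               __u8 type, __u8 tag)
    {
            struct sockaddr_mctp addr;

            /* zero everything first, so that no byte is left uninitialised */
            memset(&addr, 0, sizeof(addr));
            addr.smctp_family = AF_MCTP;
            addr.smctp_network = net;
            addr.smctp_addr.s_addr = eid;
            addr.smctp_type = type;
            addr.smctp_tag = tag;

            return addr;
    }

A wildcard address would then be
``mctp_make_addr(MCTP_NET_ANY, MCTP_ADDR_ANY, type, MCTP_TAG_OWNER)``.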

Syscall behaviour
-----------------

The following sections describe the MCTP-specific behaviours of the standard
socket system calls. These behaviours have been chosen to map closely to the
existing sockets APIs.

``bind()`` : set local socket address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Sockets that receive incoming request packets will bind to a local address,
using the ``bind()`` syscall.

.. code-block:: C

    /* zero the whole structure before filling in the fields */
    struct sockaddr_mctp addr = { 0 };

    addr.smctp_family = AF_MCTP;
    addr.smctp_network = MCTP_NET_ANY;
    addr.smctp_addr.s_addr = MCTP_ADDR_ANY;
    addr.smctp_type = MCTP_TYPE_PLDM;
    addr.smctp_tag = MCTP_TAG_OWNER;

    int rc = bind(sd, (struct sockaddr *)&addr, sizeof(addr));

This establishes the local address of the socket. Incoming MCTP messages that
match the network, address, and message type will be received by this socket.
The reference to 'incoming' is important here; a bound socket will only receive
messages with the TO bit set, to indicate an incoming request message, rather
than a response.

The ``smctp_tag`` value will configure the tags accepted from the remote side of
this socket. Given the above, the only valid value is ``MCTP_TAG_OWNER``, which
will result in remotely "owned" tags being routed to this socket. Since
``MCTP_TAG_OWNER`` is set, the 3 least-significant bits of ``smctp_tag`` are not
used; callers must set them to zero.

A ``smctp_network`` value of ``MCTP_NET_ANY`` will configure the socket to
receive incoming packets from any locally-connected network. A specific network
value will cause the socket to only receive incoming messages from that network.

The ``smctp_addr`` field specifies a local address to bind to. A value of
``MCTP_ADDR_ANY`` configures the socket to receive messages addressed to any
local destination EID.

The ``smctp_type`` field specifies which message types to receive. Only the
lower 7 bits of the type are matched on incoming messages (i.e., the
most-significant IC bit is not part of the match). This results in the socket
receiving packets with and without a message integrity check footer.
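
The type-matching rule can be sketched in a few lines of userspace C. This only
mirrors the rule as described above, and is not the kernel's own code;
``mctp_type_matches()`` and the two mask macros are hypothetical names
introduced for this example:

.. code-block:: C

    #include <stdbool.h>
    #include <stdint.h>

    #define EXAMPLE_TYPE_IC    0x80  /* hypothetical name: integrity-check bit */
    #define EXAMPLE_TYPE_MASK  0x7f  /* hypothetical name: message type bits */

    /* Compare only the low 7 bits, so the IC bit never affects the match */
    static bool mctp_type_matches(uint8_t bound_type, uint8_t msg_type)
    {
            return (bound_type & EXAMPLE_TYPE_MASK) ==
                   (msg_type & EXAMPLE_TYPE_MASK);
    }

Under this rule, a socket bound to type ``0x01`` would see messages whose first
byte is either ``0x01`` or ``0x81``.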

``sendto()``, ``sendmsg()``, ``send()`` : transmit an MCTP message
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

An MCTP message is transmitted using one of the ``sendto()``, ``sendmsg()`` or
``send()`` syscalls. Using ``sendto()`` as the primary example:

.. code-block:: C

    /* zero the whole structure before filling in the fields */
    struct sockaddr_mctp addr = { 0 };
    char buf[14];
    ssize_t len;

    /* set message destination */
    addr.smctp_family = AF_MCTP;
    addr.smctp_network = 0;
    addr.smctp_addr.s_addr = 8;
    addr.smctp_tag = MCTP_TAG_OWNER;
    addr.smctp_type = MCTP_TYPE_ECHO;

    /* arbitrary message to send, with message-type header */
    buf[0] = MCTP_TYPE_ECHO;
    memcpy(buf + 1, "hello, world!", sizeof(buf) - 1);

    len = sendto(sd, buf, sizeof(buf), 0,
                    (struct sockaddr *)&addr, sizeof(addr));

The network and address fields of ``addr`` define the remote address to send to.
If ``smctp_tag`` has ``MCTP_TAG_OWNER`` set, the kernel will ignore any bits set
in the tag value (``MCTP_TAG_MASK``), and generate a tag value suitable for the
destination EID. If ``MCTP_TAG_OWNER`` is not set, the message will be sent with
the tag value as specified. If a tag value cannot be allocated, the system call
will report an errno of ``EAGAIN``.

The application must provide the message type byte as the first byte of the
message buffer passed to ``sendto()``. If a message integrity check is to be
included in the transmitted message, it must also be provided in the message
buffer, and the most-significant bit of the message type byte must be 1.
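
The buffer layout described above (type byte, then payload, then an optional
integrity check) might be assembled as in the following sketch.
``mctp_build_msg()`` is a hypothetical helper. The format of the integrity
check is defined by the message type in use, so it is treated here as opaque
bytes supplied by the caller:

.. code-block:: C

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /*
     * Hypothetical helper: lay out [type byte][payload][optional IC] in buf.
     * Returns the total message length, or 0 if buf is too small.
     */
    static size_t mctp_build_msg(uint8_t *buf, size_t buflen, uint8_t type,
                                 const void *payload, size_t payload_len,
                                 const void *ic, size_t ic_len)
    {
            size_t total = 1 + payload_len + ic_len;

            if (total > buflen)
                    return 0;

            /* the IC flag is the most-significant bit of the type byte */
            buf[0] = (uint8_t)((type & 0x7f) | (ic_len ? 0x80 : 0x00));

            if (payload_len)
                    memcpy(buf + 1, payload, payload_len);
            if (ic_len)
                    memcpy(buf + 1 + payload_len, ic, ic_len);

            return total;
    }

The returned length is the value to pass as the length argument of
``sendto()``.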

The ``sendmsg()`` system call allows a more compact argument interface, and the
message buffer to be specified as a scatter-gather list. At present no ancillary
message types (used for the ``msg_control`` data passed to ``sendmsg()``) are
defined.
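
One use of the scatter-gather form is to keep the message type byte separate
from the payload, so the payload does not need to be copied into a combined
buffer. The sketch below assumes that approach; ``mctp_send_iov()`` is a
hypothetical helper, and it takes the type byte from the ``smctp_type`` field of
the destination address:

.. code-block:: C

    #include <string.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <linux/mctp.h>

    /* Hypothetical helper: send type byte and payload from separate buffers */
    static ssize_t mctp_send_iov(int sd, const struct sockaddr_mctp *dst,
                                 const void *payload, size_t len)
    {
            __u8 type = dst->smctp_type;
            struct iovec iov[2] = {
                    { .iov_base = &type, .iov_len = sizeof(type) },
                    { .iov_base = (void *)payload, .iov_len = len },
            };
            struct msghdr msg;

            memset(&msg, 0, sizeof(msg));
            msg.msg_name = (void *)dst;
            msg.msg_namelen = sizeof(*dst);
            msg.msg_iov = iov;
            msg.msg_iovlen = 2;
            /* no ancillary data: msg_control stays NULL */

            /* returns the number of bytes sent, or -1 with errno set */
            return sendmsg(sd, &msg, 0);
    }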

Transmitting a message on an unconnected socket with ``MCTP_TAG_OWNER``
specified will cause an allocation of a tag, if no valid tag is already
allocated for that destination. The (destination-eid,tag) tuple acts as an
implicit local socket address, to allow the socket to receive responses to this
outgoing message. If any previous allocation has been performed (for a
different remote EID), that allocation is lost.

Sockets will only receive responses to requests they have sent (with TO=1) and
may only respond (with TO=0) to requests they have received.

``recvfrom()``, ``recvmsg()``, ``recv()`` : receive an MCTP message
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

An MCTP message can be received by an application using one of the
``recvfrom()``, ``recvmsg()``, or ``recv()`` system calls. Using ``recvfrom()``
as the primary example:

.. code-block:: C

    struct sockaddr_mctp addr;
    socklen_t addrlen;
    char buf[14];
    ssize_t len;

    addrlen = sizeof(addr);

    len = recvfrom(sd, buf, sizeof(buf), 0,
                    (struct sockaddr *)&addr, &addrlen);

    /* We can expect addr to describe an MCTP address */
    assert(addrlen >= sizeof(addr));
    assert(addr.smctp_family == AF_MCTP);

    printf("received %zd bytes from remote EID %d\n", len,
           addr.smctp_addr.s_addr);

The address argument to ``recvfrom`` and ``recvmsg`` is populated with the
remote address of the incoming message, including tag value (this will be needed
in order to reply to the message).

The first byte of the message buffer will contain the message type byte. If an
integrity check follows the message, it will be included in the received buffer.

The ``recv()`` system call behaves in a similar way, but does not provide a
remote address to the application. Therefore, it is only useful if the
remote address is already known, or the message does not require a reply.

Like the send calls, sockets will only receive responses to requests they have
sent (TO=1) and may only respond (TO=0) to requests they have received.
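
For a responder, this suggests a simple pattern: reuse the address returned by
``recvfrom()`` as the destination of the reply, keeping the tag value but
clearing the tag-owner (TO) bit. A minimal sketch, assuming that the TO bit of a
received request is reported through ``MCTP_TAG_OWNER`` in ``smctp_tag``;
``mctp_reply_addr()`` is a hypothetical helper:

.. code-block:: C

    #include <linux/mctp.h>

    /* Hypothetical helper: derive the reply address from a received request */
    static struct sockaddr_mctp mctp_reply_addr(const struct sockaddr_mctp *from)
    {
            struct sockaddr_mctp to = *from;

            /* keep the tag value, but clear TO: the requester owns this tag */
            to.smctp_tag &= (__u8)~MCTP_TAG_OWNER;

            return to;
    }

The result can be passed directly as the address argument of ``sendto()`` when
sending the response.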

``ioctl(SIOCMCTPALLOCTAG)`` and ``ioctl(SIOCMCTPDROPTAG)``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These ioctls give applications more control over MCTP message tags, by
allocating (and dropping) tag values explicitly, rather than the kernel
automatically allocating a per-message tag at ``sendmsg()`` time.

In general, you will only need to use these ioctls if your MCTP protocol does
not fit the usual request/response model. For example, if you need to persist
tags across multiple requests, or a request may generate more than one response.
In these cases, the ioctls allow you to decouple the tag allocation (and
release) from individual message send and receive operations.

Both ioctls are passed a pointer to a ``struct mctp_ioc_tag_ctl``:

.. code-block:: C

    struct mctp_ioc_tag_ctl {
        mctp_eid_t      peer_addr;
        __u8            tag;
        __u16           flags;
    };

``SIOCMCTPALLOCTAG`` allocates a tag for a specific peer, which an application
can use in future ``sendmsg()`` calls. The application populates the
``peer_addr`` member with the remote EID. Other fields must be zero.

On return, the ``tag`` member will be populated with the allocated tag value.
The allocated tag will have the following tag bits set:

 - ``MCTP_TAG_OWNER``: it only makes sense to allocate tags if you're the tag
   owner

 - ``MCTP_TAG_PREALLOC``: to indicate to ``sendmsg()`` that this is a
   preallocated tag.

 - ... and the actual tag value, within the least-significant three bits
   (``MCTP_TAG_MASK``). Note that zero is a valid tag value.

The tag value should be used as-is for the ``smctp_tag`` member of ``struct
sockaddr_mctp``.

``SIOCMCTPDROPTAG`` releases a tag that has been previously allocated by a
``SIOCMCTPALLOCTAG`` ioctl. The ``peer_addr`` must be the same as used for the
allocation, and the ``tag`` value must match exactly the tag returned from the
allocation (including the ``MCTP_TAG_OWNER`` and ``MCTP_TAG_PREALLOC`` bits).
The ``flags`` field must be zero.
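
The allocate/drop pair might be wrapped as in the sketch below. It assumes
installed kernel headers recent enough to provide both ioctls and
``struct mctp_ioc_tag_ctl`` through ``<linux/mctp.h>``. ``mctp_tag_alloc()``
and ``mctp_tag_drop()`` are hypothetical helper names:

.. code-block:: C

    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/mctp.h>

    /*
     * Hypothetical helper: allocate a tag for 'peer'. On success, returns 0
     * and stores the tag (including the OWNER and PREALLOC bits) in *tag.
     * On failure, returns -1 with errno set by ioctl().
     */
    static int mctp_tag_alloc(int sd, mctp_eid_t peer, __u8 *tag)
    {
            struct mctp_ioc_tag_ctl ctl;

            memset(&ctl, 0, sizeof(ctl));   /* tag and flags must be zero */
            ctl.peer_addr = peer;

            if (ioctl(sd, SIOCMCTPALLOCTAG, &ctl) < 0)
                    return -1;

            *tag = ctl.tag;
            return 0;
    }

    /* Hypothetical helper: release a tag exactly as returned by the allocation */
    static int mctp_tag_drop(int sd, mctp_eid_t peer, __u8 tag)
    {
            struct mctp_ioc_tag_ctl ctl;

            memset(&ctl, 0, sizeof(ctl));   /* flags must be zero */
            ctl.peer_addr = peer;
            ctl.tag = tag;

            return ioctl(sd, SIOCMCTPDROPTAG, &ctl) < 0 ? -1 : 0;
    }

Between the two calls, the stored tag would be assigned unchanged to
``smctp_tag`` for each message sent to that peer.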

Kernel internals
================

There are a few possible packet flows in the MCTP stack:

1. local TX to remote endpoint, message <= MTU::

        sendmsg()
        -> mctp_local_output()
           : route lookup
           -> rt->output() (== mctp_route_output)
              -> dev_queue_xmit()

2. local TX to remote endpoint, message > MTU::

        sendmsg()
        -> mctp_local_output()
           -> mctp_do_fragment_route()
              : creates packet-sized skbs. For each new skb:
              -> rt->output() (== mctp_route_output)
                 -> dev_queue_xmit()

3. remote TX to local endpoint, single-packet message::

        mctp_pkttype_receive()
        : route lookup
        -> rt->output() (== mctp_route_input)
           : sk_key lookup
           -> sock_queue_rcv_skb()

4. remote TX to local endpoint, multiple-packet message::

        mctp_pkttype_receive()
        : route lookup
        -> rt->output() (== mctp_route_input)
           : sk_key lookup
           : stores skb in struct sk_key->reasm_head

        mctp_pkttype_receive()
        : route lookup
        -> rt->output() (== mctp_route_input)
           : sk_key lookup
           : finds existing reassembly in sk_key->reasm_head
           : appends new fragment
           -> sock_queue_rcv_skb()
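
To give a feel for the fragmentation step in flow 2, the following userspace
sketch estimates how many packets a message would be split into. It is not
kernel code. It assumes that the MTU includes a 4-byte MCTP transport header,
so that each packet carries at most ``mtu - 4`` bytes of the message;
``mctp_fragment_count()`` and ``EXAMPLE_MCTP_HDR_LEN`` are hypothetical names:

.. code-block:: C

    #include <stddef.h>

    #define EXAMPLE_MCTP_HDR_LEN 4  /* assumed size of the transport header */

    /*
     * Hypothetical helper: number of packets needed to carry msg_len bytes
     * at the given MTU. Returns 0 if the MTU leaves no room for payload.
     */
    static size_t mctp_fragment_count(size_t msg_len, size_t mtu)
    {
            size_t per_pkt;

            if (mtu <= EXAMPLE_MCTP_HDR_LEN)
                    return 0;

            per_pkt = mtu - EXAMPLE_MCTP_HDR_LEN;

            /* round up: a partial final fragment still needs its own packet */
            return (msg_len + per_pkt - 1) / per_pkt;
    }

Under these assumptions, a 200-byte message at an MTU of 68 would need four
packets, while a 64-byte message would fit in one and take flow 1 instead.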

Key refcounts
-------------

 * keys are referenced by:

   - a skb: during route output, stored in ``skb->cb``.

   - netns and sock lists.

 * keys can be associated with a device, in which case they hold a
   reference to the dev (set through ``key->dev``, counted through
   ``dev->key_count``). Multiple keys can reference the device.