.. SPDX-License-Identifier: GPL-2.0

==============================================
Management Component Transport Protocol (MCTP)
==============================================

net/mctp/ contains protocol support for MCTP, as defined by DMTF standard
DSP0236. Physical interface drivers ("bindings" in the specification) are
provided in drivers/net/mctp/.

The core code provides a socket-based interface to send and receive MCTP
messages, through an AF_MCTP, SOCK_DGRAM socket.

Structure: interfaces & networks
================================

The kernel models the local MCTP topology through two items: interfaces and
networks.

An interface (or "link") is an instance of an MCTP physical transport binding
(as defined by DSP0236, section 3.2.47), likely connected to a specific hardware
device. This is represented as a ``struct net_device``.

A network defines a unique address space for MCTP endpoints by endpoint-ID
(described by DSP0236, section 3.2.31). A network has a user-visible identifier
to allow references from userspace. Route definitions are specific to one
network.

Each interface is associated with exactly one network. A network may be
associated with one or more interfaces.

If multiple networks are present, each may contain endpoint IDs (EIDs) that are
also present on other networks.
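
Because EIDs are only unique within a network, an endpoint is identified by the
pair (network, EID) rather than by the EID alone. A minimal sketch of that
comparison follows; ``struct endpoint_id`` and ``endpoint_equal()`` are
hypothetical names used only for illustration, and are not part of the kernel
or its UAPI.

.. code-block:: C

    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical illustration type: not a kernel or UAPI structure. */
    struct endpoint_id {
        unsigned int net;   /* user-visible network identifier */
        uint8_t eid;        /* endpoint ID, unique only within 'net' */
    };

    /* Two endpoints are the same only if both network and EID match. */
    static bool endpoint_equal(struct endpoint_id a, struct endpoint_id b)
    {
        return a.net == b.net && a.eid == b.eid;
    }

With this model, EID 8 on network 1 and EID 8 on network 2 compare as different
endpoints.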

Sockets API
===========

Protocol definitions
--------------------

MCTP uses ``AF_MCTP`` / ``PF_MCTP`` for the address and protocol families.
Since MCTP is message-based, only ``SOCK_DGRAM`` sockets are supported.

.. code-block:: C

    int sd = socket(AF_MCTP, SOCK_DGRAM, 0);

The only (current) value for the ``protocol`` argument is 0.

As with all socket address families, source and destination addresses are
specified with a ``sockaddr`` type, with a single-byte endpoint address:

.. code-block:: C

    typedef __u8 mctp_eid_t;

    struct mctp_addr {
        mctp_eid_t s_addr;
    };

    struct sockaddr_mctp {
        __kernel_sa_family_t smctp_family;
        unsigned int         smctp_network;
        struct mctp_addr     smctp_addr;
        __u8                 smctp_type;
        __u8                 smctp_tag;
    };

    #define MCTP_NET_ANY    0x0
    #define MCTP_ADDR_ANY   0xff


Syscall behaviour
-----------------

The following sections describe the MCTP-specific behaviours of the standard
socket system calls. These behaviours have been chosen to map closely to the
existing sockets APIs.

``bind()`` : set local socket address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Sockets that receive incoming request packets will bind to a local address,
using the ``bind()`` syscall.

.. code-block:: C

    /* zero-initialise so that any unused fields are cleared */
    struct sockaddr_mctp addr = { 0 };

    addr.smctp_family = AF_MCTP;
    addr.smctp_network = MCTP_NET_ANY;
    addr.smctp_addr.s_addr = MCTP_ADDR_ANY;
    addr.smctp_type = MCTP_TYPE_PLDM;
    addr.smctp_tag = MCTP_TAG_OWNER;

    int rc = bind(sd, (struct sockaddr *)&addr, sizeof(addr));

This establishes the local address of the socket. Incoming MCTP messages that
match the network, address, and message type will be received by this socket.
The reference to 'incoming' is important here; a bound socket will only receive
messages with the TO bit set, to indicate an incoming request message, rather
than a response.

The ``smctp_tag`` value will configure the tags accepted from the remote side of
this socket. Given the above, the only valid value is ``MCTP_TAG_OWNER``, which
will result in remotely "owned" tags being routed to this socket. Since
``MCTP_TAG_OWNER`` is set, the 3 least-significant bits of ``smctp_tag`` are not
used; callers must set them to zero.

A ``smctp_network`` value of ``MCTP_NET_ANY`` will configure the socket to
receive incoming packets from any locally-connected network. A specific network
value will cause the socket to only receive incoming messages from that network.

The ``smctp_addr`` field specifies a local address to bind to. A value of
``MCTP_ADDR_ANY`` configures the socket to receive messages addressed to any
local destination EID.

The ``smctp_type`` field specifies which message types to receive. Only the
lower 7 bits of the type are matched on incoming messages (i.e., the
most-significant IC bit is not part of the match). This results in the socket
receiving packets with and without a message integrity check footer.
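
The matching rules above can be summarised in a small model. This is a sketch
for illustration only, not the kernel's lookup code: ``struct bind_addr`` and
``bind_matches()`` are hypothetical names, the constants are repeated from the
definitions above so that the block compiles on its own, and the message is
assumed to already have the TO bit set.

.. code-block:: C

    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define MCTP_NET_ANY   0x0
    #define MCTP_ADDR_ANY  0xff

    /* Hypothetical model of the fields a bound socket matches on. */
    struct bind_addr {
        unsigned int net;
        uint8_t eid;
        uint8_t type;
    };

    /*
     * Returns true if an incoming request on network 'net', addressed to
     * local EID 'dest', with first message byte 'type_byte', would match
     * the bound address 'b'.
     */
    static bool bind_matches(const struct bind_addr *b, unsigned int net,
                             uint8_t dest, uint8_t type_byte)
    {
        if (b->net != MCTP_NET_ANY && b->net != net)
            return false;
        if (b->eid != MCTP_ADDR_ANY && b->eid != dest)
            return false;
        /* The IC bit (0x80) is ignored: only the low 7 bits are compared. */
        return (b->type & 0x7f) == (type_byte & 0x7f);
    }

Under this model, a socket bound to type ``0x01`` matches messages whose first
byte is either ``0x01`` or ``0x81``.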

``sendto()``, ``sendmsg()``, ``send()`` : transmit an MCTP message
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

An MCTP message is transmitted using one of the ``sendto()``, ``sendmsg()`` or
``send()`` syscalls. Using ``sendto()`` as the primary example:

.. code-block:: C

    /* zero-initialise so that any unused fields are cleared */
    struct sockaddr_mctp addr = { 0 };
    char buf[14];
    ssize_t len;

    /* set message destination */
    addr.smctp_family = AF_MCTP;
    addr.smctp_network = 0;
    addr.smctp_addr.s_addr = 8;
    addr.smctp_tag = MCTP_TAG_OWNER;
    addr.smctp_type = MCTP_TYPE_ECHO;

    /* arbitrary message to send, with message-type header */
    buf[0] = MCTP_TYPE_ECHO;
    memcpy(buf + 1, "hello, world!", sizeof(buf) - 1);

    len = sendto(sd, buf, sizeof(buf), 0,
                 (struct sockaddr *)&addr, sizeof(addr));

The network and address fields of ``addr`` define the remote address to send to.
If ``smctp_tag`` has the ``MCTP_TAG_OWNER`` bit set, the kernel will ignore any
bits set in ``MCTP_TAG_MASK``, and generate a tag value suitable for the
destination EID. If ``MCTP_TAG_OWNER`` is not set, the message will be sent with
the tag value as specified. If a tag value cannot be allocated, the system call
will report an errno of ``EAGAIN``.

The application must provide the message type byte as the first byte of the
message buffer passed to ``sendto()``. If a message integrity check is to be
included in the transmitted message, it must also be provided in the message
buffer, and the most-significant bit of the message type byte must be 1.
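
The buffer layout described above (type byte first, IC bit in the
most-significant position) can be sketched as a small helper. ``build_msg()``
and ``MSG_TYPE_IC_BIT`` are hypothetical names introduced for this example; the
helper does not compute an integrity check, which remains the caller's
responsibility.

.. code-block:: C

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define MSG_TYPE_IC_BIT 0x80  /* hypothetical name for the IC bit */

    /*
     * Hypothetical helper: writes the message type byte followed by the
     * payload into 'buf'. Returns the total length, or 0 if 'buf' is too
     * small. If 'ic' is non-zero, the caller must still append the
     * integrity check bytes after the payload.
     */
    static size_t build_msg(uint8_t *buf, size_t cap, uint8_t type, int ic,
                            const void *payload, size_t len)
    {
        if (len >= cap)
            return 0;
        buf[0] = (uint8_t)((type & 0x7f) | (ic ? MSG_TYPE_IC_BIT : 0));
        memcpy(buf + 1, payload, len);
        return len + 1;
    }

The returned length is what would be passed as the length argument to
``sendto()``.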

The ``sendmsg()`` system call allows a more compact argument interface, and the
message buffer to be specified as a scatter-gather list. At present no ancillary
message types (used for the ``msg_control`` data passed to ``sendmsg()``) are
defined.

Transmitting a message on an unconnected socket with ``MCTP_TAG_OWNER``
specified will cause an allocation of a tag, if no valid tag is already
allocated for that destination. The (destination-eid,tag) tuple acts as an
implicit local socket address, to allow the socket to receive responses to this
outgoing message. If any previous allocation has been performed (for a
different remote EID), that allocation is lost.

Sockets will only receive responses to requests they have sent (with TO=1) and
may only respond (with TO=0) to requests they have received.

``recvfrom()``, ``recvmsg()``, ``recv()`` : receive an MCTP message
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

An MCTP message can be received by an application using one of the
``recvfrom()``, ``recvmsg()``, or ``recv()`` system calls. Using ``recvfrom()``
as the primary example:

.. code-block:: C

    struct sockaddr_mctp addr;
    socklen_t addrlen;
    char buf[14];
    ssize_t len;

    addrlen = sizeof(addr);

    len = recvfrom(sd, buf, sizeof(buf), 0,
                   (struct sockaddr *)&addr, &addrlen);

    /* We can expect addr to describe an MCTP address */
    assert(addrlen >= sizeof(addr));
    assert(addr.smctp_family == AF_MCTP);

    printf("received %zd bytes from remote EID %d\n", len,
           addr.smctp_addr.s_addr);

The address argument to ``recvfrom()`` and ``recvmsg()`` is populated with the
remote address of the incoming message, including tag value (this will be needed
in order to reply to the message).

The first byte of the message buffer will contain the message type byte. If an
integrity check follows the message, it will be included in the received buffer.

The ``recv()`` system call behaves in a similar way, but does not provide a
remote address to the application. Therefore, it is only useful if the
remote address is already known, or the message does not require a reply.

As with the send calls, sockets will only receive responses to requests they
have sent (TO=1) and may only respond (TO=0) to requests they have received.
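
To reply to a received request, the address returned by ``recvfrom()`` is
typically reused as the destination, with the tag-owner bit cleared so that the
response is sent with TO=0. The helper below sketches only the tag
manipulation; ``reply_tag()`` is a hypothetical name, and the two constants are
repeated here (with the values used by ``<linux/mctp.h>``) so that the block
compiles on its own.

.. code-block:: C

    #include <assert.h>
    #include <stdint.h>

    #ifndef MCTP_TAG_OWNER
    #define MCTP_TAG_MASK   0x07  /* 3-bit tag value */
    #define MCTP_TAG_OWNER  0x08  /* tag-owner (TO) bit */
    #endif

    /*
     * Hypothetical helper: derive the smctp_tag for a response from the
     * smctp_tag of the received request. The tag value is kept; the
     * owner bit is cleared, since the remote side owns the tag.
     */
    static uint8_t reply_tag(uint8_t rx_tag)
    {
        return (uint8_t)(rx_tag & ~MCTP_TAG_OWNER);
    }

With a received ``addr``, this would be applied as
``addr.smctp_tag = reply_tag(addr.smctp_tag);`` before passing ``addr`` to
``sendto()``.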

``ioctl(SIOCMCTPALLOCTAG)`` and ``ioctl(SIOCMCTPDROPTAG)``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These ioctls give applications more control over MCTP message tags, by
allocating (and dropping) tag values explicitly, rather than the kernel
automatically allocating a per-message tag at ``sendmsg()`` time.

In general, you will only need to use these ioctls if your MCTP protocol does
not fit the usual request/response model. For example, if you need to persist
tags across multiple requests, or a request may generate more than one response.
In these cases, the ioctls allow you to decouple the tag allocation (and
release) from individual message send and receive operations.

Both ioctls are passed a pointer to a ``struct mctp_ioc_tag_ctl``:

.. code-block:: C

    struct mctp_ioc_tag_ctl {
        mctp_eid_t peer_addr;
        __u8       tag;
        __u16      flags;
    };

``SIOCMCTPALLOCTAG`` allocates a tag for a specific peer, which an application
can use in future ``sendmsg()`` calls. The application populates the
``peer_addr`` member with the remote EID. Other fields must be zero.

On return, the ``tag`` member will be populated with the allocated tag value.
The allocated tag will have the following tag bits set:

- ``MCTP_TAG_OWNER``: it only makes sense to allocate tags if you're the tag
  owner

- ``MCTP_TAG_PREALLOC``: to indicate to ``sendmsg()`` that this is a
  preallocated tag.

- ... and the actual tag value, within the least-significant three bits
  (``MCTP_TAG_MASK``). Note that zero is a valid tag value.

The tag value should be used as-is for the ``smctp_tag`` member of ``struct
sockaddr_mctp``.

``SIOCMCTPDROPTAG`` releases a tag that has been previously allocated by a
``SIOCMCTPALLOCTAG`` ioctl. The ``peer_addr`` must be the same as used for the
allocation, and the ``tag`` value must match exactly the tag returned from the
allocation (including the ``MCTP_TAG_OWNER`` and ``MCTP_TAG_PREALLOC`` bits).
The ``flags`` field must be zero.
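
The request and result conventions above can be sketched without touching a
socket. In the block below, ``struct tag_ctl`` is a local stand-in for ``struct
mctp_ioc_tag_ctl``, and ``tag_ctl_prepare()``, ``tag_is_preallocated()`` and
``tag_value()`` are hypothetical helpers; the constants repeat the values used
by ``<linux/mctp.h>`` so that the block compiles on its own.

.. code-block:: C

    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    #ifndef MCTP_TAG_OWNER
    #define MCTP_TAG_MASK      0x07
    #define MCTP_TAG_OWNER     0x08
    #define MCTP_TAG_PREALLOC  0x10
    #endif

    /* Local stand-in for struct mctp_ioc_tag_ctl, for illustration only. */
    struct tag_ctl {
        uint8_t  peer_addr;
        uint8_t  tag;
        uint16_t flags;
    };

    /* Prepare an allocation request: peer_addr set, all other fields zero. */
    static void tag_ctl_prepare(struct tag_ctl *ctl, uint8_t peer)
    {
        memset(ctl, 0, sizeof(*ctl));
        ctl->peer_addr = peer;
    }

    /* Check that a tag carries both bits an allocation is expected to set. */
    static bool tag_is_preallocated(uint8_t tag)
    {
        uint8_t want = MCTP_TAG_OWNER | MCTP_TAG_PREALLOC;

        return (tag & want) == want;
    }

    /* Extract the 3-bit tag value; note that zero is a valid value. */
    static uint8_t tag_value(uint8_t tag)
    {
        return tag & MCTP_TAG_MASK;
    }

In real code the prepared structure would be the kernel's ``struct
mctp_ioc_tag_ctl``, passed by pointer to ``ioctl(sd, SIOCMCTPALLOCTAG, ...)``,
and the same structure, unmodified, would later be passed to
``SIOCMCTPDROPTAG``.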

Kernel internals
================

There are a few possible packet flows in the MCTP stack:

1. local TX to remote endpoint, message <= MTU::

    sendmsg()
     -> mctp_local_output()
        : route lookup
        -> rt->output() (== mctp_route_output)
           -> dev_queue_xmit()

2. local TX to remote endpoint, message > MTU::

    sendmsg()
    -> mctp_local_output()
       -> mctp_do_fragment_route()
          : creates packet-sized skbs. For each new skb:
          -> rt->output() (== mctp_route_output)
             -> dev_queue_xmit()

3. remote TX to local endpoint, single-packet message::

    mctp_pkttype_receive()
    : route lookup
    -> rt->output() (== mctp_route_input)
       : sk_key lookup
       -> sock_queue_rcv_skb()

4. remote TX to local endpoint, multiple-packet message::

    mctp_pkttype_receive()
    : route lookup
    -> rt->output() (== mctp_route_input)
       : sk_key lookup
       : stores skb in struct sk_key->reasm_head

    mctp_pkttype_receive()
    : route lookup
    -> rt->output() (== mctp_route_input)
       : sk_key lookup
       : finds existing reassembly in sk_key->reasm_head
       : appends new fragment
       -> sock_queue_rcv_skb()
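
Flow 2 above splits a message into packet-sized fragments. The sketch below
estimates how many packets a message needs and which start-of-message,
end-of-message and sequence bits each one carries. It assumes the 4-byte MCTP
transport header from DSP0236, that the MTU includes that header, and that the
packet sequence starts at zero; ``frag_count()`` and ``frag_flags()`` are
hypothetical helpers, not kernel functions.

.. code-block:: C

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    #define HDR_LEN   4     /* MCTP transport header size (DSP0236) */
    #define FLAG_SOM  0x80  /* start of message */
    #define FLAG_EOM  0x40  /* end of message */

    /*
     * Number of packets needed to carry 'msg_len' bytes of message data.
     * Returns 0 if the MTU leaves no room for payload.
     */
    static size_t frag_count(size_t msg_len, size_t mtu)
    {
        size_t per_pkt;

        if (mtu <= HDR_LEN)
            return 0;
        per_pkt = mtu - HDR_LEN;
        return msg_len / per_pkt + (msg_len % per_pkt != 0);
    }

    /* SOM/EOM/sequence bits for fragment 'i' (0-based) of 'n' fragments. */
    static uint8_t frag_flags(size_t i, size_t n)
    {
        uint8_t f = (uint8_t)((i & 0x3) << 4);  /* 2-bit packet sequence */

        if (i == 0)
            f |= FLAG_SOM;
        if (i + 1 == n)
            f |= FLAG_EOM;
        return f;
    }

A single-packet message (flows 1 and 3) is the case where one fragment carries
both the SOM and EOM flags.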

Key refcounts
-------------

* keys are referenced by:

  - a skb: during route output, stored in ``skb->cb``.

  - netns and sock lists.

* keys can be associated with a device, in which case they hold a
  reference to the dev (set through ``key->dev``, counted through
  ``dev->key_count``). Multiple keys can reference the device.