==========================================
Xillybus driver for generic FPGA interface
==========================================

:Author: Eli Billauer, Xillybus Ltd. (http://xillybus.com)
:Email:  eli.billauer@gmail.com or as advertised on Xillybus' site.

.. Contents:

 - Introduction
  -- Background
  -- Xillybus Overview

 - Usage
  -- User interface
  -- Synchronization
  -- Seekable pipes

 - Internals
  -- Source code organization
  -- Pipe attributes
  -- Host never reads from the FPGA
  -- Channels, pipes, and the message channel
  -- Data streaming
  -- Data granularity
  -- Probing
  -- Buffer allocation
  -- The "nonempty" message (supporting poll)


Introduction
============

Background
----------

An FPGA (Field Programmable Gate Array) is a piece of logic hardware, which
can be programmed to become virtually anything that is usually found as a
dedicated chipset: For instance, a display adapter, network interface card,
or even a processor with its peripherals. FPGAs are the LEGO of hardware:
Based upon certain building blocks, you make your own toys the way you like
them. It's usually pointless to reimplement something that is already
available on the market as a chipset, so FPGAs are mostly used when some
special functionality is needed, and the production volume is relatively low
(hence not justifying the development of an ASIC).

The challenge with FPGAs is that everything is implemented at a very low
level, even lower than assembly language. In order to allow FPGA designers to
focus on their specific project, and not reinvent the wheel over and over
again, pre-designed building blocks, IP cores, are often used. These are the
FPGA parallels of library functions. IP cores may implement certain
mathematical functions, a functional unit (e.g. a USB interface), an entire
processor (e.g. ARM) or anything that might come in handy. Think of them as
building blocks, with electrical wires dangling on the sides for connection
to other blocks.

One of the daunting tasks in FPGA design is communicating with a full-blown
operating system (actually, with the processor running it): Implementing the
low-level bus protocol and the somewhat higher-level interface with the host
(registers, interrupts, DMA etc.) is a project in itself. When the FPGA's
function is a well-known one (e.g. a video adapter card, or a NIC), it can
make sense to design the FPGA's interface logic specifically for the project.
A special driver is then written to present the FPGA as a well-known interface
to the kernel and/or user space. In that case, there is no reason to treat the
FPGA differently than any other device on the bus.

It's common, however, that the desired data communication doesn't fit any
well-known peripheral function. Also, the effort of designing an elegant
abstraction for the data exchange is often considered too big. In those cases,
a quicker and possibly less elegant solution is sought: The driver is
effectively written as a user space program, leaving the kernel space part
with just elementary data transport. This still requires designing some
interface logic for the FPGA, and writing a simple ad-hoc driver for the
kernel.

Xillybus Overview
-----------------

Xillybus is an IP core and a Linux driver. Together, they form a kit for
elementary data transport between an FPGA and the host, providing pipe-like
data streams with a straightforward user interface. It's intended as a
low-effort solution for mixed FPGA-host projects, for which it makes sense to
have the project-specific part of the driver running in a user-space program.

Since the communication requirements may vary significantly from one FPGA
project to another (the number of data pipes needed in each direction and
their attributes), there isn't one specific chunk of logic being the Xillybus
IP core. Rather, the IP core is configured and built based upon a
specification given by its end user.

Xillybus presents independent data streams, which resemble pipes or TCP/IP
communication to the user. At the host side, a character device file is used
just like any pipe file. On the FPGA side, hardware FIFOs are used to stream
the data. This contrasts with the common method of communicating through
fixed-sized buffers (even though such buffers are used by Xillybus under the
hood). There may be more than a hundred of these streams on a single IP core,
or as few as one, depending on the configuration.

In order to ease the deployment of the Xillybus IP core, it contains a simple
data structure which completely defines the core's configuration. The Linux
driver fetches this data structure during its initialization process, and sets
up the DMA buffers and character devices accordingly. As a result, a single
driver works out of the box with any Xillybus IP core.

The data structure just mentioned should not be confused with PCI's
configuration space or the Flattened Device Tree.

Usage
=====

User interface
--------------

On the host, all interface with Xillybus is done through /dev/xillybus_*
device files, which are generated automatically as the driver loads. The
names of these files depend on the IP core that is loaded in the FPGA (see
Probing below). To communicate with the FPGA, open the device file that
corresponds to the hardware FIFO you want to send data to or receive data
from, and use plain write() or read() calls, just like with a regular pipe.
In particular, it makes perfect sense to go::

        $ cat mydata > /dev/xillybus_thisfifo

        $ cat /dev/xillybus_thatfifo > hisdata

possibly pressing CTRL-C at some stage, even though the xillybus_* pipes have
the capability to send an EOF (but may not use it).
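
The same kind of transfer can be done from a C program. The following is a
minimal sketch, not code taken from the Xillybus sources: copy_stream() is a
hypothetical helper name, and it works on any pair of file descriptors, so
nothing in it is specific to Xillybus.

.. code-block:: c

    #include <errno.h>
    #include <unistd.h>

    /*
     * Copy everything from in_fd to out_fd until EOF, like "cat".
     * Returns the number of bytes copied, or -1 on error (errno is set).
     */
    long long copy_stream(int in_fd, int out_fd)
    {
        char buf[4096];
        long long total = 0;

        for (;;) {
            ssize_t got = read(in_fd, buf, sizeof(buf));
            ssize_t done = 0;

            if (got < 0) {
                if (errno == EINTR)
                    continue;       /* interrupted, just retry */
                return -1;
            }
            if (got == 0)           /* EOF */
                return total;

            /* write() may accept only part of the data, so loop */
            while (done < got) {
                ssize_t put = write(out_fd, buf + done, got - done);

                if (put < 0) {
                    if (errno == EINTR)
                        continue;
                    return -1;
                }
                done += put;
            }
            total += got;
        }
    }

With in_fd opened on /dev/xillybus_thatfifo for reading and out_fd opened on
a regular file, this does roughly what the second cat command above does.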

The driver and hardware are designed to behave sensibly as pipes, including:

* Supporting non-blocking I/O (by setting O_NONBLOCK on open()).

* Supporting poll() and select().

* Being bandwidth efficient under load (using DMA) but also handling small
  pieces of data sent across (like TCP/IP) by autoflushing.
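
As an illustration of the poll() support, the sketch below waits a limited
time for data before reading. It is an example under assumptions, not part of
the driver: read_with_timeout() is a hypothetical name, and the code uses
only standard POSIX calls, so it behaves the same on any pollable file
descriptor.

.. code-block:: c

    #include <errno.h>
    #include <poll.h>
    #include <unistd.h>

    /*
     * Wait up to timeout_ms for fd to become readable, then read.
     * Returns the number of bytes read, 0 on EOF, or -1 on error.
     * If no data arrived in time, returns -1 with errno set to EAGAIN.
     */
    ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
    {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int rc;

        do {
            /* a signal restarts the wait with the full timeout */
            rc = poll(&pfd, 1, timeout_ms);
        } while (rc < 0 && errno == EINTR);

        if (rc < 0)
            return -1;
        if (rc == 0) {
            errno = EAGAIN;         /* timed out, nothing to read */
            return -1;
        }
        return read(fd, buf, len);
    }

A device file for this purpose would typically be opened with
open("/dev/xillybus_thatfifo", O_RDONLY | O_NONBLOCK), where the file name is
the placeholder used above.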

A device file can be read only, write only or bidirectional. Bidirectional
device files are treated like two independent pipes (except for sharing a
"channel" structure in the implementation code).

Synchronization
---------------

Xillybus pipes are configured (on the IP core) to be either synchronous or
asynchronous. For a synchronous pipe, write() returns successfully only after
some data has been submitted and acknowledged by the FPGA. This slows down
bulk data transfers, and is nearly impossible to use with streams that
require data at a constant rate: There is no data transmitted to the FPGA
between write() calls, in particular when the process loses the CPU.

When a pipe is configured asynchronous, write() returns as soon as there is
enough room in the buffers to store any of the data.

For FPGA to host pipes, asynchronous pipes allow data transfer from the FPGA
as soon as the respective device file is opened, regardless of whether the
data has been requested by a read() call. On synchronous pipes, only the
amount of data requested by a read() call is transmitted.

In summary, for synchronous pipes, data between the host and FPGA is
transmitted only to satisfy the read() or write() call currently handled
by the driver, and those calls wait for the transmission to complete before
returning.

Note that the synchronization attribute has nothing to do with the possibility
that read() or write() completes fewer bytes than requested. There is a
separate configuration flag ("allowpartial") that determines whether such a
partial completion is allowed.
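
When partial completion is allowed, a program that needs a fixed amount of
data has to loop, as with any UNIX pipe. A minimal sketch of such a loop
follows; read_full() is a hypothetical helper name, not something the driver
provides.

.. code-block:: c

    #include <errno.h>
    #include <unistd.h>

    /*
     * Keep calling read() until len bytes have arrived, or until EOF
     * or an error. Returns the number of bytes read (less than len
     * only at EOF), or -1 on error.
     */
    ssize_t read_full(int fd, void *buf, size_t len)
    {
        char *p = buf;
        size_t done = 0;

        while (done < len) {
            ssize_t got = read(fd, p + done, len - done);

            if (got < 0) {
                if (errno == EINTR)
                    continue;       /* interrupted, just retry */
                return -1;
            }
            if (got == 0)           /* EOF */
                break;
            done += got;            /* partial completion: go on */
        }
        return (ssize_t)done;
    }

The same pattern applies to write() in the opposite direction.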

Seekable pipes
--------------

A synchronous pipe can be configured to have the stream's position exposed
to the user logic at the FPGA. Such a pipe is also seekable on the host API.
With this feature, a memory or register interface can be attached on the
FPGA side to the seekable stream. Reading or writing to a certain address in
the attached memory is done by seeking to the desired address, and calling
read() or write() as required.
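
The seek-then-access pattern might look as follows in a user-space program.
This is a sketch under assumptions: mem_write32() and mem_read32() are
hypothetical names, and the code assumes a device file opened for both
reading and writing, byte addresses and 32-bit accesses in host byte order.
The actual address units and word width depend on how the IP core was
configured.

.. code-block:: c

    #include <fcntl.h>      /* open() flags, for the caller */
    #include <stdint.h>
    #include <unistd.h>

    /* Write one 32-bit word at address addr. Returns 0 or -1. */
    int mem_write32(int fd, off_t addr, uint32_t value)
    {
        if (lseek(fd, addr, SEEK_SET) == (off_t)-1)
            return -1;
        if (write(fd, &value, sizeof(value)) != (ssize_t)sizeof(value))
            return -1;
        return 0;
    }

    /* Read one 32-bit word from address addr. Returns 0 or -1. */
    int mem_read32(int fd, off_t addr, uint32_t *value)
    {
        if (lseek(fd, addr, SEEK_SET) == (off_t)-1)
            return -1;
        if (read(fd, value, sizeof(*value)) != (ssize_t)sizeof(*value))
            return -1;
        return 0;
    }

Since these helpers only use lseek(), read() and write(), they can be tried
out on a regular file before being pointed at a seekable device file.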


Internals
=========

Source code organization
------------------------

The Xillybus driver consists of a core module, xillybus_core.c, and modules
that depend on the specific bus interface (xillybus_of.c and xillybus_pcie.c).

The bus specific modules are those probed when a suitable device is found by
the kernel. Since the DMA mapping and synchronization functions, which are bus
dependent by their nature, are used by the core module, a
xilly_endpoint_hardware structure is passed to the core module on
initialization. This structure is populated with pointers to wrapper functions
which execute the DMA-related operations on the bus.

Pipe attributes
---------------

Each pipe has a number of attributes which are set when the FPGA component
(IP core) is built. They are fetched from the IDT (the data structure which
defines the core's configuration, see Probing below) by xilly_setupchannels()
in xillybus_core.c as follows:

* is_writebuf: The pipe's direction. A non-zero value means it's an FPGA to
  host pipe (the FPGA "writes").

* channelnum: The pipe's identification number in communication between the
  host and FPGA.

* format: The underlying data width. See Data granularity below.

* allowpartial: A non-zero value means that a read() or write() (whichever
  applies) may return with fewer than the requested number of bytes. The
  common choice is a non-zero value, to match standard UNIX behavior.

* synchronous: A non-zero value means that the pipe is synchronous. See
  Synchronization above.

* bufsize: Each DMA buffer's size. Always a power of two.

* bufnum: The number of buffers allocated for this pipe. Always a power of
  two.

* exclusive_open: A non-zero value forces exclusive opening of the associated
  device file. If the device file is bidirectional, and already opened only in
  one direction, the opposite direction may be opened once.

* seekable: A non-zero value indicates that the pipe is seekable. See
  Seekable pipes above.

* supports_nonempty: A non-zero value (which is typical) indicates that the
  hardware will send the messages that are necessary to support select() and
  poll() for this pipe.

Host never reads from the FPGA
------------------------------

Even though PCI Express is hotpluggable in general, a typical motherboard
doesn't expect a card to go away all of a sudden. But since the PCIe card
is based upon reprogrammable logic, a sudden disappearance from the bus is
quite likely as a result of an accidental reprogramming of the FPGA while the
host is up. In practice, nothing happens immediately in such a situation. But
if the host attempts to read from an address that is mapped to the PCI Express
device, that leads to an immediate freeze of the system on some motherboards,
even though the PCIe standard requires a graceful recovery.

In order to avoid these freezes, the Xillybus driver refrains completely from
reading from the device's register space. All communication from the FPGA to
the host is done through DMA. In particular, the Interrupt Service Routine
doesn't follow the common practice of checking a status register when it's
invoked. Rather, the FPGA prepares a small buffer which contains short
messages, which inform the host what the interrupt was about.

This mechanism is used on non-PCIe buses as well for the sake of uniformity.


Channels, pipes, and the message channel
----------------------------------------

Each of the (possibly bidirectional) pipes presented to the user is allocated
a data channel between the FPGA and the host. The distinction between channels
and pipes is necessary only because of channel 0, which is used for
interrupt-related messages from the FPGA, and has no pipe attached to it.

Data streaming
--------------

Even though a non-segmented data stream is presented to the user at both
sides, the implementation relies on a set of DMA buffers which is allocated
for each channel. For the sake of illustration, let's take the FPGA to host
direction: As data streams into the respective channel's interface in the
FPGA, the Xillybus IP core writes it to one of the DMA buffers. When the
buffer is full, the FPGA informs the host about that (appending a
XILLYMSG_OPCODE_RELEASEBUF message to channel 0 and sending an interrupt if
necessary). The host responds by making the data available for reading through
the character device. When all data has been read, the host writes to the
FPGA's buffer control register, allowing the buffer to be overwritten. Flow
control mechanisms exist on both sides to prevent underflows and overflows.

This is not good enough for creating a TCP/IP-like stream: If the data flow
stops momentarily before a DMA buffer is filled, the intuitive expectation is
that the partial data in the buffer will arrive anyhow, despite the buffer not
being full. This is implemented by adding a field in the
XILLYMSG_OPCODE_RELEASEBUF message, through which the FPGA informs the host
not just which buffer is submitted, but also how much data it contains.

But the FPGA will submit a partially filled buffer only if directed to do so
by the host. This situation occurs when the read() method has been blocking
for XILLY_RX_TIMEOUT jiffies (currently 10 ms), after which the host commands
the FPGA to submit a DMA buffer as soon as it can. This timeout mechanism
balances between bus bandwidth efficiency (preventing a lot of partially
filled buffers being sent) and keeping latency fairly low for tails of data.

A similar setting is used in the host to FPGA direction. The handling of
partial DMA buffers is somewhat different, though. The user can tell the
driver to submit all data it has in the buffers to the FPGA, by issuing a
write() with the byte count set to zero. This is similar to a flush request,
but it doesn't block. There is also an autoflushing mechanism, which triggers
an equivalent flush roughly XILLY_RX_TIMEOUT jiffies after the last write().
This allows the user to be oblivious about the underlying buffering mechanism
and yet enjoy a stream-like interface.
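
The explicit flush described above amounts to a single call in user space.
The wrapper below is only a sketch, and stream_flush() is a hypothetical
name; the one thing it relies on is a write() with a zero byte count.

.. code-block:: c

    #include <unistd.h>

    /*
     * Issue a write() with a zero byte count on fd, which a Xillybus
     * host to FPGA pipe takes as a request to submit buffered data.
     * Returns 0 on success, -1 on error (errno is set by write()).
     */
    int stream_flush(int fd)
    {
        /* the buffer pointer is irrelevant when the count is zero */
        return write(fd, "", 0) < 0 ? -1 : 0;
    }

A program would typically call this after writing the last piece of a
message, when waiting for the autoflush is not acceptable.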

Note that the issue of partial buffer flushing is irrelevant for pipes having
the "synchronous" attribute nonzero, since synchronous pipes don't allow data
to lie around in the DMA buffers between read() or write() calls anyhow.

Data granularity
----------------

The data arrives or is sent at the FPGA as 8, 16 or 32 bit wide words, as
configured by the "format" attribute. Whenever possible, the driver attempts
to hide this when the pipe is accessed differently from its natural alignment.
For example, reading single bytes from a pipe with 32 bit granularity works
with no issues. Writing single bytes to pipes with 16 or 32 bit granularity
will also work, but the driver can't send partially completed words to the
FPGA, so the transmission of up to one word may be held until it's fully
occupied with user data.

This somewhat complicates the handling of host to FPGA streams, because
when a buffer is flushed, it may contain up to 3 bytes that don't form a word
in the FPGA, and hence can't be sent. To prevent loss of data, these leftover
bytes need to be moved to the next buffer. The parts in xillybus_core.c
that mention "leftovers" in some way are related to this complication.
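
The arithmetic behind the leftovers can be illustrated with the sketch below.
It is not the code in xillybus_core.c: flush_whole_words() is a hypothetical
name, the buffers are plain memory rather than DMA buffers, and word_bytes is
assumed to be 1, 2 or 4 (for 8, 16 or 32 bit words).

.. code-block:: c

    #include <stddef.h>
    #include <string.h>

    /*
     * buf holds nbytes of user data. Returns how many of these bytes
     * form complete words and can be sent. The trailing bytes that
     * don't form a word are copied to the start of next_buf, and
     * their number (0 to word_bytes - 1) is stored in *leftover.
     */
    size_t flush_whole_words(const unsigned char *buf, size_t nbytes,
                             size_t word_bytes, unsigned char *next_buf,
                             size_t *leftover)
    {
        /* word_bytes is a power of two, so masking gives the remainder */
        *leftover = nbytes & (word_bytes - 1);
        memcpy(next_buf, buf + nbytes - *leftover, *leftover);
        return nbytes - *leftover;
    }

For example, flushing 10 bytes on a pipe with 32 bit words sends 8 bytes and
carries the last 2 bytes over to the next buffer.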

Probing
-------

As mentioned earlier, the number of pipes that are created when the driver
loads and their attributes depend on the Xillybus IP core in the FPGA. During
the driver's initialization, a blob containing configuration info, the
Interface Description Table (IDT), is sent from the FPGA to the host. The
bootstrap process is done in three phases:

1. Acquire the length of the IDT, so a buffer can be allocated for it. This
   is done by sending a quiesce command to the device, since the acknowledge
   for this command contains the IDT's buffer length.

2. Acquire the IDT itself.

3. Create the interfaces according to the IDT.

Buffer allocation
-----------------

In order to simplify the logic that prevents illegal boundary crossings of
PCIe packets, the following rule applies: If a buffer is smaller than 4kB,
it must not cross a 4kB boundary. Otherwise, it must be 4kB aligned. The
xilly_setupchannels() function allocates these buffers by requesting whole
pages from the kernel, and dividing them into DMA buffers as necessary. Since
all buffers' sizes are powers of two, it's possible to pack any set of such
buffers, with a maximal waste of one page of memory.

All buffers are allocated when the driver is loaded. This is necessary,
since large contiguous physical memory segments are sometimes requested,
which are more likely to be available when the system is freshly booted.

Buffer memory is allocated in the same order as the pipes appear in the IDT.
The driver relies on a rule that the pipes are sorted with decreasing
buffer size in the IDT. If a requested buffer is larger than or equal to a
page, the necessary number of pages is requested from the kernel, and these
are used for this buffer. If the requested buffer is smaller than a page, one
single page is requested from the kernel, and that page is partially used.
Or, if there already is a partially used page at hand, the buffer is packed
into that page. It can be shown that all pages requested from the kernel
(except possibly for the last) are 100% utilized this way.
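
The packing rule can be tried out with a small user-space model. This is an
illustrative sketch, not the code of xilly_setupchannels(): it assumes 4 kB
pages and models the requested pages as consecutive slots in one flat address
space, which real kernel allocations don't guarantee. pack_buffers() and
PAGE_BYTES are made-up names.

.. code-block:: c

    #include <stddef.h>

    #define PAGE_BYTES 4096u    /* assumed page size */

    /*
     * Count the pages needed for n buffers whose sizes are powers of
     * two and sorted in decreasing order. If offsets is non-NULL, it
     * receives each buffer's byte offset in the modelled address space.
     */
    size_t pack_buffers(const size_t *sizes, size_t n, size_t *offsets)
    {
        size_t pages = 0;   /* pages requested so far */
        size_t left = 0;    /* free bytes at the end of the last page */
        size_t i;

        for (i = 0; i < n; i++) {
            if (sizes[i] >= PAGE_BYTES) {
                /* a page or more: request whole pages for it */
                if (offsets)
                    offsets[i] = pages * PAGE_BYTES;
                pages += sizes[i] / PAGE_BYTES;
                left = 0;
            } else {
                if (left < sizes[i]) {
                    /* no partially used page with room */
                    pages++;
                    left = PAGE_BYTES;
                }
                if (offsets)
                    offsets[i] = pages * PAGE_BYTES - left;
                left -= sizes[i];
            }
        }
        return pages;
    }

With buffer sizes of 8192, 4096, 2048, 2048, 1024 and 512 bytes, this model
requests five pages, and only the last one is left partially used. None of
the buffers smaller than 4 kB crosses a 4 kB boundary.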

The "nonempty" message (supporting poll)
----------------------------------------

In order to support the "poll" method (and hence select()), there is a small
catch regarding the FPGA to host direction: The FPGA may have filled a DMA
buffer with some data, but not submitted that buffer. If the host waited for
the buffer's submission by the FPGA, there would be a possibility that the
FPGA side has sent data, but a select() call would still block, because the
host has not received any notification about this. This is solved with
XILLYMSG_OPCODE_NONEMPTY messages sent by the FPGA when a channel goes from
completely empty to containing some data.

These messages are used only to support poll() and select(). The IP core can
be configured not to send them, which slightly reduces bandwidth usage.