.. SPDX-License-Identifier: GPL-2.0

Introduction of Uacce
---------------------

Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from conventional data sharing between a CPU and an I/O device,
which shares only the data content rather than the address.
Because of the unified address space, the hardware and the user space of a
process can use the same virtual addresses when communicating.
Uacce treats the hardware accelerator as a heterogeneous processor: the
IOMMU shares the CPU page tables and, as a result, performs the same
translation from virtual address (va) to physical address (pa).

::

         __________________________       __________________________
        |                          |     |                          |
        |  User application (CPU)  |     |   Hardware Accelerator   |
        |__________________________|     |__________________________|

                     |                                 |
                     | va                              | va
                     V                                 V
                 __________                        __________
                |          |                      |          |
                |   MMU    |                      |  IOMMU   |
                |__________|                      |__________|
                     |                                 |
                     |                                 |
                     V pa                              V pa
                 _______________________________________
                |                                       |
                |              Memory                   |
                |_______________________________________|
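
As a minimal sketch of what the shared translation means for a user program,
the fragment below stores ordinary process pointers in a request descriptor.
The demo_desc layout and the demo_fill_desc helper are hypothetical and not
part of Uacce; real descriptor formats are device specific.

::

  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical request descriptor; real layouts are device specific. */
  struct demo_desc {
          uint64_t src_va;        /* source buffer, process virtual address */
          uint64_t dst_va;        /* destination buffer, process virtual address */
          uint32_t len;           /* number of bytes to process */
  };

  /*
   * With SVA the device translates the same virtual addresses as the CPU,
   * so the pointers are stored as they are, with no address conversion
   * done by the program.
   */
  static void demo_fill_desc(struct demo_desc *d, const void *src,
                             void *dst, uint32_t len)
  {
          d->src_va = (uint64_t)(uintptr_t)src;
          d->dst_va = (uint64_t)(uintptr_t)dst;
          d->len = len;
  }

Without a shared address space, the same descriptor would have to carry
device-visible addresses obtained through a separate mapping step.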



Architecture
------------

Uacce is the kernel module that takes charge of the IOMMU and address sharing.
The user-space drivers and libraries are called WarpDrive.

The uacce device, built around the IOMMU SVA API, can access multiple
address spaces, including the one without PASID.

Communication goes through a virtual concept called a queue. A queue provides
a FIFO-like interface and maintains a unified address space between the
application and all involved hardware.

::

                             ___________________                  ________________
                            |                   |   user API     |                |
                            | WarpDrive library | ------------>  |  user driver   |
                            |___________________|                |________________|
                                     |                                    |
                                     |                                    |
                                     | queue fd                           |
                                     |                                    |
                                     |                                    |
                                     v                                    |
     ___________________         _________                                |
    |                   |       |         |                               | mmap memory
    | Other framework   |       |  uacce  |                               | r/w interface
    | crypto/nic/others |       |_________|                               |
    |___________________|                                                 |
             |                       |                                    |
             | register              | register                           |
             |                       |                                    |
             |                       |                                    |
             |                _________________       __________          |
             |               |                 |     |          |         |
              -------------  |  Device Driver  |     |  IOMMU   |         |
                             |_________________|     |__________|         |
                                     |                                    |
                                     |                                    V
                                     |                            ___________________
                                     |                           |                   |
                                     --------------------------  |  Device(Hardware) |
                                                                 |___________________|


How does it work
----------------

Uacce uses mmap and the IOMMU to play the trick.

Uacce creates a chrdev for every device registered to it. A new queue is
created when a user application opens the chrdev, and the file descriptor is
used as the user handle of the queue.
The accelerator device presents itself as a Uacce object, which is exported
as a chrdev to user space. The user application communicates with the
hardware by ioctl (as the control path) or shared memory (as the data path).

The control path to the hardware is via file operations, while the data path
is via the mmap space of the queue fd.
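
The control path described above can be sketched from user space as follows.
This is only an illustration: the demo_* helpers are hypothetical, the device
path is whatever chrdev the accelerator exports, and the two ioctl numbers are
written out here on the assumption that they match the uapi header
misc/uacce/uacce.h, which should be included instead when available.

::

  #include <fcntl.h>
  #include <sys/ioctl.h>
  #include <unistd.h>

  /*
   * Queue ioctls as this sketch assumes them from the uapi header;
   * redefined only to keep the example self-contained.
   */
  #ifndef UACCE_CMD_START_Q
  #define UACCE_CMD_START_Q       _IO('W', 0)
  #define UACCE_CMD_PUT_Q         _IO('W', 1)
  #endif

  /* Opening the chrdev creates a queue; the fd is its handle. */
  static int demo_queue_open(const char *dev_path)
  {
          return open(dev_path, O_RDWR);
  }

  /* Control path: start the queue with an ioctl on the queue fd. */
  static int demo_queue_start(int qfd)
  {
          return ioctl(qfd, UACCE_CMD_START_Q);
  }

  /* Stop the queue, then release the handle. */
  static void demo_queue_close(int qfd)
  {
          ioctl(qfd, UACCE_CMD_PUT_Q);
          close(qfd);
  }

Both helpers return a negative value on failure, for example when the path
does not exist or the descriptor is not a queue fd.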

The queue file address space:

::

  /**
   * enum uacce_qfrt: qfrt type
   * @UACCE_QFRT_MMIO: device mmio region
   * @UACCE_QFRT_DUS: device user share region
   */
  enum uacce_qfrt {
          UACCE_QFRT_MMIO = 0,
          UACCE_QFRT_DUS = 1,
  };

All regions are optional and differ from one device type to another.
Each region can be mmapped only once; a further attempt returns -EEXIST.

The device mmio region is mapped to the hardware mmio space. It is generally
used for doorbells or other notifications to the hardware. It is not fast
enough to serve as a data channel.

The device user share region is used for sharing data buffers between the
user process and the device.
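
A user-space mapping of one region might look like the sketch below. It
assumes that the region is selected by the page offset given to mmap(), so
that region N sits at N pages into the queue file; this convention is not
stated in this document and should be checked against the driver. The demo_*
names are hypothetical, and the region size is expected to come from the
device, for example from its sysfs attributes.

::

  #include <stddef.h>
  #include <sys/mman.h>
  #include <sys/types.h>
  #include <unistd.h>

  /* Region types, with the same values as enum uacce_qfrt. */
  enum demo_qfrt {
          DEMO_QFRT_MMIO = 0,
          DEMO_QFRT_DUS = 1,
  };

  /* Assumption: region N starts at N * page_size in the queue file. */
  static off_t demo_qfrt_offset(enum demo_qfrt type, long page_size)
  {
          return (off_t)type * page_size;
  }

  /* Map one region of the queue fd; returns MAP_FAILED on error. */
  static void *demo_qfrt_mmap(int qfd, enum demo_qfrt type, size_t size)
  {
          long page_size = sysconf(_SC_PAGESIZE);

          return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                      qfd, demo_qfrt_offset(type, page_size));
  }

Since each region can be mapped only once per queue, a caller would keep the
returned pointer for the lifetime of the queue instead of mapping again.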


The Uacce register API
----------------------

The register API is defined in uacce.h.

::

  struct uacce_interface {
          char name[UACCE_MAX_NAME_SIZE];
          unsigned int flags;
          const struct uacce_ops *ops;
  };

According to the IOMMU capability, uacce_interface flags can be:

::

  /**
   * UACCE Device flags:
   * UACCE_DEV_SVA: Shared Virtual Addresses
   *                Support PASID
   *                Support device page faults (PCI PRI or SMMU Stall)
   */
  #define UACCE_DEV_SVA               BIT(0)

  struct uacce_device *uacce_alloc(struct device *parent,
                                   struct uacce_interface *interface);
  int uacce_register(struct uacce_device *uacce);
  void uacce_remove(struct uacce_device *uacce);

uacce_alloc results can be:

a. If the uacce module is not compiled, ERR_PTR(-ENODEV)

b. Success with the desired flags

c. Success with the negotiated flags, for example

   uacce_interface.flags = UACCE_DEV_SVA but UACCE_DEV_SVA is cleared in
   uacce->flags

So the driver using this API needs to check the return value as well as the
negotiated uacce->flags.
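
The flag check can be modelled in a few lines. The fragment below is a
standalone model and not kernel code: demo_sva_granted is a hypothetical
helper, and DEMO_UACCE_DEV_SVA repeats the UACCE_DEV_SVA bit without the
kernel BIT() macro so that the example builds on its own.

::

  #include <stdbool.h>

  /* Same bit as UACCE_DEV_SVA, written without the kernel BIT() macro. */
  #define DEMO_UACCE_DEV_SVA      (1U << 0)

  /*
   * SVA counts as available only if it was requested and the bit is
   * still set in the negotiated flags.
   */
  static bool demo_sva_granted(unsigned int requested,
                               unsigned int negotiated)
  {
          return (requested & DEMO_UACCE_DEV_SVA) &&
                 (negotiated & DEMO_UACCE_DEV_SVA);
  }

In a driver, the two arguments would correspond to uacce_interface.flags and
uacce->flags; when the check fails, the driver has to fall back to a mode
that does not rely on SVA or give up on the device.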


The user driver
---------------

The queue file mmap space needs a user driver to wrap the communication
protocol. Uacce provides some attributes in sysfs so that the user driver
can match the right accelerator.
More details are in Documentation/ABI/testing/sysfs-driver-uacce.
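
Reading such an attribute from user space could be sketched as below. The
demo_read_attr helper is hypothetical; the directory and attribute names to
pass in are the ones listed in the ABI document and are not repeated here.

::

  #include <stdio.h>
  #include <string.h>

  /*
   * Read one attribute file into buf as a NUL-terminated string without
   * the trailing newline. Returns 0 on success, -1 on failure.
   */
  static int demo_read_attr(const char *dir, const char *attr,
                            char *buf, size_t len)
  {
          char path[256];
          FILE *f;
          int n;

          if (len == 0)
                  return -1;
          n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
          if (n < 0 || (size_t)n >= sizeof(path))
                  return -1;
          f = fopen(path, "r");
          if (!f)
                  return -1;
          if (!fgets(buf, (int)len, f)) {
                  fclose(f);
                  return -1;
          }
          fclose(f);
          buf[strcspn(buf, "\n")] = '\0';
          return 0;
  }

A user driver would typically read a few attributes of each candidate device
this way and compare them with what it supports before opening the chrdev.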