VFIO - "Virtual Function I/O"[1]
-------------------------------------------------------------------------------
Many modern systems now provide DMA and interrupt remapping facilities
to help ensure I/O devices behave within the boundaries they've been
allotted.  This includes x86 hardware with AMD-Vi and Intel VT-d,
POWER systems with Partitionable Endpoints (PEs) and embedded PowerPC
systems such as Freescale PAMU.  The VFIO driver is an IOMMU/device
agnostic framework for exposing direct device access to userspace, in
a secure, IOMMU protected environment.  In other words, this allows
safe[2], non-privileged, userspace drivers.

Why do we want that?  Virtual machines often make use of direct device
access ("device assignment") when configured for the highest possible
I/O performance.  From a device and host perspective, this simply
turns the VM into a userspace driver, with the benefits of
significantly reduced latency, higher bandwidth, and direct use of
bare-metal device drivers[3].

Some applications, particularly in the high performance computing
field, also benefit from low-overhead, direct device access from
userspace.  Examples include network adapters (often non-TCP/IP based)
and compute accelerators.  Prior to VFIO, these drivers had to either
go through the full development cycle to become a proper upstream
driver, be maintained out of tree, or make use of the UIO framework,
which has no notion of IOMMU protection, has limited interrupt support,
and requires root privileges to access things like PCI configuration
space.

The VFIO driver framework intends to unify these, replacing the
KVM PCI specific device assignment code and providing a more
secure, more featureful userspace driver environment than UIO.

Groups, Devices, and IOMMUs
-------------------------------------------------------------------------------

Devices are the main target of any I/O driver.  Devices typically
create a programming interface made up of I/O access, interrupts,
and DMA.  Without going into the details of each of these, DMA is
by far the most critical aspect for maintaining a secure environment
as allowing a device read-write access to system memory poses the
greatest risk to the overall system integrity.

To help mitigate this risk, many modern IOMMUs now incorporate
isolation properties into what was, in many cases, an interface only
meant for translation (i.e. solving the addressing problems of devices
with limited address spaces).  With this, devices can now be isolated
from each other and from arbitrary memory access, thus allowing
things like secure direct assignment of devices into virtual machines.

This isolation is not always at the granularity of a single device
though.  Even when an IOMMU is capable of this, properties of devices,
interconnects, and IOMMU topologies can each reduce this isolation.
For instance, an individual device may be part of a larger
multi-function enclosure.  While the IOMMU may be able to distinguish
between devices within the enclosure, the enclosure may not require
transactions between devices to reach the IOMMU.  Examples of this
could be anything from a multi-function PCI device with backdoors
between functions to a non-PCI-ACS (Access Control Services) capable
bridge allowing redirection without reaching the IOMMU.  Topology
can also play a factor in terms of hiding devices.  A PCIe-to-PCI
bridge masks the devices behind it, making transactions appear as if
from the bridge itself.  Obviously IOMMU design plays a major factor
as well.

Therefore, while for the most part an IOMMU may have device level
granularity, any system is susceptible to reduced granularity.  The
IOMMU API therefore supports a notion of IOMMU groups.  A group is
a set of devices which is isolatable from all other devices in the
system.  Groups are therefore the unit of ownership used by VFIO.

While the group is the minimum granularity that must be used to
ensure secure user access, it's not necessarily the preferred
granularity.  In IOMMUs which make use of page tables, it may be
possible to share a set of page tables between different groups,
reducing the overhead both to the platform (reduced TLB thrashing,
reduced duplicate page tables), and to the user (programming only
a single set of translations).  For this reason, VFIO makes use of
a container class, which may hold one or more groups.  A container
is created by simply opening the /dev/vfio/vfio character device.

On its own, the container provides little functionality, with all
but a couple of version and extension query interfaces locked away.
The user needs to add a group into the container for the next level
of functionality.  To do this, the user first needs to identify the
group associated with the desired device.  This can be done using
the sysfs links described in the example below.  By unbinding the
device from the host driver and binding it to a VFIO driver, a new
VFIO group will appear as /dev/vfio/$GROUP, where $GROUP is the
IOMMU group number of which the device is a member.
If the IOMMU group contains multiple devices, each will need to
be bound to a VFIO driver before operations on the VFIO group
are allowed (it's also sufficient to only unbind the device from
host drivers if a VFIO driver is unavailable; this will make the
group available, but not that particular device).  TBD - interface
for disabling driver probing/locking a device.
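
As a rough illustration of the group lookup described above, the
following sketch resolves the iommu_group symlink of a PCI device and
extracts the trailing group number.  The function names are made up
for this example and are not part of any VFIO API; the sysfs path
layout is the one shown in the usage example in this document.

```c
#define _POSIX_C_SOURCE 200809L   /* for readlink() and PATH_MAX */
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Extract the trailing group number from a link target such as
 * "../../../../kernel/iommu_groups/26".  Returns -1 if the last path
 * component is not a plain decimal number. */
int iommu_group_from_link(const char *target)
{
        const char *base = strrchr(target, '/');
        char *end;
        long n;

        base = base ? base + 1 : target;
        if (*base < '0' || *base > '9')
                return -1;
        n = strtol(base, &end, 10);
        if (*end != '\0' || n > INT_MAX)
                return -1;
        return (int)n;
}

/* Look up the IOMMU group of a PCI device named like "0000:06:0d.0".
 * Returns the group number, or -1 on any error. */
int iommu_group_of_pci_device(const char *bdf)
{
        char path[PATH_MAX], target[PATH_MAX];
        ssize_t len;
        int n;

        n = snprintf(path, sizeof(path),
                     "/sys/bus/pci/devices/%s/iommu_group", bdf);
        if (n < 0 || (size_t)n >= sizeof(path))
                return -1;
        len = readlink(path, target, sizeof(target) - 1);
        if (len < 0)
                return -1;
        target[len] = '\0';     /* readlink() does not NUL-terminate */
        return iommu_group_from_link(target);
}
```

The returned number is the $GROUP component of /dev/vfio/$GROUP.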

Once the group is ready, it may be added to the container by opening
the VFIO group character device (/dev/vfio/$GROUP) and using the
VFIO_GROUP_SET_CONTAINER ioctl, passing the file descriptor of the
previously opened container file.  If desired and if the IOMMU driver
supports sharing the IOMMU context between groups, multiple groups may
be set to the same container.  If a group fails to set to a container
with existing groups, a new empty container will need to be used
instead.
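
The fallback described above could be handled along these lines.  This
is only a sketch: vfio_attach_group() is a hypothetical helper, and it
assumes the caller keeps track of which container each group ended up
in.  Only VFIO_GROUP_SET_CONTAINER and /dev/vfio/vfio come from this
document.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

/* Attach group_fd to shared_container if possible.  If there is no
 * shared container (negative fd) or the attach fails, open a new empty
 * container and attach the group to that instead.
 * Returns the container fd the group now belongs to, or -1 on failure. */
int vfio_attach_group(int group_fd, int shared_container)
{
        int fresh;

        if (shared_container >= 0 &&
            ioctl(group_fd, VFIO_GROUP_SET_CONTAINER, &shared_container) == 0)
                return shared_container;

        fresh = open("/dev/vfio/vfio", O_RDWR);
        if (fresh < 0)
                return -1;
        if (ioctl(group_fd, VFIO_GROUP_SET_CONTAINER, &fresh) < 0) {
                close(fresh);
                return -1;
        }
        return fresh;
}
```

A caller that receives a descriptor different from the one it passed
in knows that the group could not share the existing IOMMU context.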

With a group (or groups) attached to a container, the remaining
ioctls become available, enabling access to the VFIO IOMMU interfaces.
Additionally, it now becomes possible to get file descriptors for each
device within a group using an ioctl on the VFIO group file descriptor.

The VFIO device API includes ioctls for describing the device, the I/O
regions and their read/write/mmap offsets on the device descriptor, as
well as mechanisms for describing and registering interrupt
notifications.

VFIO Usage Example
-------------------------------------------------------------------------------

Assume the user wants to access PCI device 0000:06:0d.0

$ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
../../../../kernel/iommu_groups/26

This device is therefore in IOMMU group 26.  This device is on the
PCI bus, therefore the user will make use of vfio-pci to manage the
group:

# modprobe vfio-pci

Binding this device to the vfio-pci driver creates the VFIO group
character devices for this group:

$ lspci -n -s 0000:06:0d.0
06:0d.0 0401: 1102:0002 (rev 08)
# echo 0000:06:0d.0 > /sys/bus/pci/devices/0000:06:0d.0/driver/unbind
# echo 1102 0002 > /sys/bus/pci/drivers/vfio-pci/new_id
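
A program that performs the same rebinding without a shell needs to
build the "vendor device" string and write it to the sysfs file.  The
two helpers below are a minimal sketch of that; their names are
invented for this example, and the ID format is simply the one used by
the echo command above.

```c
#include <stdio.h>

/* Format a PCI vendor/device pair the way the new_id file is written
 * above, e.g. 0x1102, 0x0002 -> "1102 0002".
 * Returns the string length, or -1 if buf is too small. */
int vfio_format_new_id(char *buf, size_t len, unsigned vendor, unsigned device)
{
        int n = snprintf(buf, len, "%04x %04x",
                         vendor & 0xffffu, device & 0xffffu);

        return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Write a string to a sysfs attribute.  Returns 0 on success, -1 if the
 * file cannot be opened or the write is rejected. */
int sysfs_write_string(const char *path, const char *value)
{
        FILE *f = fopen(path, "w");
        int ok;

        if (!f)
                return -1;
        ok = fputs(value, f) >= 0;
        /* Errors from sysfs are often only reported when the buffer is
         * flushed, so the result of fclose() matters here. */
        ok = (fclose(f) == 0) && ok;
        return ok ? 0 : -1;
}
```

The formatted string would then be passed to sysfs_write_string()
together with the /sys/bus/pci/drivers/vfio-pci/new_id path.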

Now we need to look at what other devices are in the group to free
it for use by VFIO:

$ ls -l /sys/bus/pci/devices/0000:06:0d.0/iommu_group/devices
total 0
lrwxrwxrwx. 1 root root 0 Apr 23 16:13 0000:00:1e.0 ->
        ../../../../devices/pci0000:00/0000:00:1e.0
lrwxrwxrwx. 1 root root 0 Apr 23 16:13 0000:06:0d.0 ->
        ../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.0
lrwxrwxrwx. 1 root root 0 Apr 23 16:13 0000:06:0d.1 ->
        ../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.1

This device is behind a PCIe-to-PCI bridge[4], therefore we also
need to add device 0000:06:0d.1 to the group following the same
procedure as above.  Device 0000:00:1e.0 is a bridge that does
not currently have a host driver, therefore it's not required to
bind this device to the vfio-pci driver (vfio-pci does not currently
support PCI bridges).

The final step is to provide the user with access to the group if
unprivileged operation is desired (note that /dev/vfio/vfio provides
no capabilities on its own and is therefore expected to be set to
mode 0666 by the system).

# chown user:user /dev/vfio/26

The user now has full access to all the devices and the IOMMU for this
group and can access them as follows:

        /* Needs <fcntl.h>, <stdint.h>, <sys/ioctl.h>, <sys/mman.h>
         * and <linux/vfio.h>.  Error handling is left as comments. */
        int container, group, device, i;
        struct vfio_group_status group_status =
                                        { .argsz = sizeof(group_status) };
        struct vfio_iommu_type1_info iommu_info = { .argsz = sizeof(iommu_info) };
        struct vfio_iommu_type1_dma_map dma_map = { .argsz = sizeof(dma_map) };
        struct vfio_device_info device_info = { .argsz = sizeof(device_info) };

        /* Create a new container */
        container = open("/dev/vfio/vfio", O_RDWR);

        if (ioctl(container, VFIO_GET_API_VERSION) != VFIO_API_VERSION) {
                /* Unknown API version */
        }

        if (!ioctl(container, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) {
                /* Doesn't support the IOMMU driver we want. */
        }

        /* Open the group */
        group = open("/dev/vfio/26", O_RDWR);

        /* Test the group is viable and available */
        ioctl(group, VFIO_GROUP_GET_STATUS, &group_status);

        if (!(group_status.flags & VFIO_GROUP_FLAGS_VIABLE)) {
                /* Group is not viable (i.e. not all devices bound for vfio) */
        }

        /* Add the group to the container */
        ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);

        /* Enable the IOMMU model we want */
        ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);

        /* Get additional IOMMU info */
        ioctl(container, VFIO_IOMMU_GET_INFO, &iommu_info);

        /* Allocate some space and setup a DMA mapping.  vaddr is a 64-bit
         * integer field, so the pointer returned by mmap() is cast. */
        dma_map.vaddr = (uintptr_t)mmap(0, 1024 * 1024, PROT_READ | PROT_WRITE,
                                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        dma_map.size = 1024 * 1024;
        dma_map.iova = 0; /* 1MB starting at 0x0 from device view */
        dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;

        ioctl(container, VFIO_IOMMU_MAP_DMA, &dma_map);

        /* Get a file descriptor for the device */
        device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, "0000:06:0d.0");

        /* Test and setup the device */
        ioctl(device, VFIO_DEVICE_GET_INFO, &device_info);

        for (i = 0; i < device_info.num_regions; i++) {
                struct vfio_region_info reg = { .argsz = sizeof(reg) };

                reg.index = i;

                ioctl(device, VFIO_DEVICE_GET_REGION_INFO, &reg);

                /* Setup mappings... read/write offsets, mmaps
                 * For PCI devices, config space is a region */
        }

        for (i = 0; i < device_info.num_irqs; i++) {
                struct vfio_irq_info irq = { .argsz = sizeof(irq) };

                irq.index = i;

                ioctl(device, VFIO_DEVICE_GET_IRQ_INFO, &irq);

                /* Setup IRQs... eventfds, VFIO_DEVICE_SET_IRQS */
        }

        /* Gratuitous device reset and go... */
        ioctl(device, VFIO_DEVICE_RESET);
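
The "read/write offsets" step in the region loop above can be sketched
as follows, assuming the region is accessed with pread() on the device
file descriptor at the offset reported by VFIO_DEVICE_GET_REGION_INFO.
vfio_region_read32() is a hypothetical helper written for this
example; it checks the region's READ flag and size before touching the
descriptor.

```c
#define _XOPEN_SOURCE 700       /* for pread() */
#define _FILE_OFFSET_BITS 64    /* region offsets can exceed 32 bits */
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/vfio.h>

/* Read a 32-bit value at byte offset `off` inside the region described
 * by `reg`.  Returns 0 on success, or -1 if the region is not readable,
 * the access falls outside the region, or pread() fails. */
int vfio_region_read32(int device, const struct vfio_region_info *reg,
                       uint64_t off, uint32_t *val)
{
        if (!(reg->flags & VFIO_REGION_INFO_FLAG_READ))
                return -1;
        if (reg->size < sizeof(*val) || off > reg->size - sizeof(*val))
                return -1;
        if (pread(device, val, sizeof(*val), (off_t)(reg->offset + off)) !=
            (ssize_t)sizeof(*val))
                return -1;
        return 0;
}
```

For a PCI device this could be used to read a dword of config space
once the region holding config space has been queried.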

VFIO User API
-------------------------------------------------------------------------------

Please see include/uapi/linux/vfio.h for complete API documentation.

VFIO bus driver API
-------------------------------------------------------------------------------

VFIO bus drivers, such as vfio-pci, make use of only a few interfaces
into VFIO core.  When devices are bound to and unbound from the driver,
the driver should call vfio_add_group_dev() and vfio_del_group_dev()
respectively:

extern int vfio_add_group_dev(struct iommu_group *iommu_group,
                              struct device *dev,
                              const struct vfio_device_ops *ops,
                              void *device_data);

extern void *vfio_del_group_dev(struct device *dev);

vfio_add_group_dev() indicates to the core to begin tracking the
specified iommu_group and register the specified dev as owned by
a VFIO bus driver.  The driver provides an ops structure for callbacks
similar to a file operations structure:

struct vfio_device_ops {
        int     (*open)(void *device_data);
        void    (*release)(void *device_data);
        ssize_t (*read)(void *device_data, char __user *buf,
                        size_t count, loff_t *ppos);
        ssize_t (*write)(void *device_data, const char __user *buf,
                         size_t size, loff_t *ppos);
        long    (*ioctl)(void *device_data, unsigned int cmd,
                         unsigned long arg);
        int     (*mmap)(void *device_data, struct vm_area_struct *vma);
};

Each function is passed the device_data that was originally registered
in the vfio_add_group_dev() call above.  This allows the bus driver
an easy place to store its opaque, private data.  The open/release
callbacks are issued when a new file descriptor is created for a
device (via VFIO_GROUP_GET_DEVICE_FD).  The ioctl interface provides
a direct pass through for VFIO_DEVICE_* ioctls.  The read/write/mmap
interfaces implement the device region access defined by the device's
own VFIO_DEVICE_GET_REGION_INFO ioctl.


PPC64 sPAPR implementation note
-------------------------------------------------------------------------------

This implementation has some specifics:

1) On older systems (POWER7 with P5IOC2/IODA1) only one IOMMU group per
container is supported as an IOMMU table is allocated at boot time,
one table per IOMMU group, which is a Partitionable Endpoint (PE)
(a PE is often a PCI domain but not always).
Newer systems (POWER8 with IODA2) have an improved hardware design which
removes this limitation and allows multiple IOMMU groups per VFIO container.

2) The hardware supports so-called DMA windows: the PCI address range
within which DMA transfer is allowed.  Any attempt to access address space
outside the window leads to isolation of the whole PE.

3) PPC64 guests are paravirtualized but not fully emulated. There is an API
to map/unmap pages for DMA, and it normally maps 1..32 pages per call and
currently there is no way to reduce the number of calls. In order to make things
faster, the map/unmap handling has been implemented in real mode, which provides
excellent performance but has limitations such as the inability to do
locked pages accounting in real time.

4) According to the sPAPR specification, a Partitionable Endpoint (PE) is an I/O
subtree that can be treated as a unit for the purposes of partitioning and
error recovery. A PE may be a single or multi-function IOA (IO Adapter), a
function of a multi-function IOA, or multiple IOAs (possibly including switch
and bridge structures above the multiple IOAs). PPC64 guests detect PCI errors
and recover from them via EEH RTAS services, which work on the basis of
additional ioctl commands.

So 4 additional ioctls have been added:

        VFIO_IOMMU_SPAPR_TCE_GET_INFO - returns the size and the start
                of the DMA window on the PCI bus.

        VFIO_IOMMU_ENABLE - enables the container. The locked pages accounting
                is done at this point. This lets the user first learn what
                the DMA window is and adjust rlimit before doing any real work.

        VFIO_IOMMU_DISABLE - disables the container.

        VFIO_EEH_PE_OP - provides an API for EEH setup, error detection and recovery.
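
Since an access outside the DMA window isolates the whole PE, it is
worth validating a mapping request against the window reported by
VFIO_IOMMU_SPAPR_TCE_GET_INFO before issuing it.  The helper below is
a sketch of such a check on plain integers; its name is made up, and
the caller is assumed to pass the window start and size taken from the
returned info structure (believed to be the dma32_window_start and
dma32_window_size fields).

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if [iova, iova + size) lies entirely inside the window
 * [win_start, win_start + win_size).  Empty requests are rejected. */
bool dma_range_in_window(uint64_t iova, uint64_t size,
                         uint64_t win_start, uint64_t win_size)
{
        uint64_t off;

        if (size == 0 || iova < win_start)
                return false;
        /* Compare offsets instead of end addresses so that nothing
         * can wrap around. */
        off = iova - win_start;
        return off < win_size && size <= win_size - off;
}
```

With a 2GB window starting at 0, a 1MB mapping at iova 0 passes, while
an 8KB mapping that starts 4KB below the end of the window does not.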

The code flow from the example above should be slightly changed:

        struct vfio_eeh_pe_op pe_op = { .argsz = sizeof(pe_op), .flags = 0 };

        .....
        /* Add the group to the container */
        ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);

        /* Enable the IOMMU model we want */
        ioctl(container, VFIO_SET_IOMMU, VFIO_SPAPR_TCE_IOMMU);

        /* Get additional sPAPR IOMMU info */
        struct vfio_iommu_spapr_tce_info spapr_iommu_info =
                                { .argsz = sizeof(spapr_iommu_info) };
        ioctl(container, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &spapr_iommu_info);

        if (ioctl(container, VFIO_IOMMU_ENABLE)) {
                /* Cannot enable container, may be low rlimit */
        }

        /* Allocate some space and setup a DMA mapping.  vaddr is a 64-bit
         * integer field, so the pointer returned by mmap() is cast. */
        dma_map.vaddr = (uintptr_t)mmap(0, 1024 * 1024, PROT_READ | PROT_WRITE,
                                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        dma_map.size = 1024 * 1024;
        dma_map.iova = 0; /* 1MB starting at 0x0 from device view */
        dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;

        /* Check here if .iova/.size are within DMA window from spapr_iommu_info */
        ioctl(container, VFIO_IOMMU_MAP_DMA, &dma_map);

        /* Get a file descriptor for the device */
        device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, "0000:06:0d.0");

        ....

        /* Gratuitous device reset and go... */
        ioctl(device, VFIO_DEVICE_RESET);

        /* Make sure EEH is supported */
        ioctl(container, VFIO_CHECK_EXTENSION, VFIO_EEH);

        /* Enable the EEH functionality on the device */
        pe_op.op = VFIO_EEH_PE_ENABLE;
        ioctl(container, VFIO_EEH_PE_OP, &pe_op);

        /* It is suggested to create an additional data struct to represent
         * the PE, and put child devices belonging to the same IOMMU group
         * into the PE instance for later reference.
         */

        /* Check the PE's state and make sure it's in functional state */
        pe_op.op = VFIO_EEH_PE_GET_STATE;
        ioctl(container, VFIO_EEH_PE_OP, &pe_op);

        /* Save device state using pci_save_state().
         * EEH should be enabled on the specified device.
         */

        ....

        /* Inject EEH error, which is expected to be caused by 32-bit
         * config load.
         */
        pe_op.op = VFIO_EEH_PE_INJECT_ERR;
        pe_op.err.type = EEH_ERR_TYPE_32;
        pe_op.err.func = EEH_ERR_FUNC_LD_CFG_ADDR;
        pe_op.err.addr = 0ul;
        pe_op.err.mask = 0ul;
        ioctl(container, VFIO_EEH_PE_OP, &pe_op);

        ....

        /* When 0xFF's are returned from reading PCI config space or IO
         * BARs of the PCI device, check the PE's state to see if it has
         * been frozen.
         */
        pe_op.op = VFIO_EEH_PE_GET_STATE;
        ioctl(container, VFIO_EEH_PE_OP, &pe_op);

        /* Wait for pending PCI transactions to complete and don't
         * produce any more PCI traffic from/to the affected PE until
         * recovery is finished.
         */

        /* Enable IO for the affected PE and collect logs. Usually, the
         * standard part of PCI config space and the AER registers are
         * dumped as logs for further analysis.
         */
        pe_op.op = VFIO_EEH_PE_UNFREEZE_IO;
        ioctl(container, VFIO_EEH_PE_OP, &pe_op);

        /*
         * Issue PE reset: hot or fundamental reset. Usually, hot reset
         * is enough. However, the firmware of some PCI adapters would
         * require fundamental reset.
         */
        pe_op.op = VFIO_EEH_PE_RESET_HOT;
        ioctl(container, VFIO_EEH_PE_OP, &pe_op);
        pe_op.op = VFIO_EEH_PE_RESET_DEACTIVATE;
        ioctl(container, VFIO_EEH_PE_OP, &pe_op);

        /* Configure the PCI bridges for the affected PE */
        pe_op.op = VFIO_EEH_PE_CONFIGURE;
        ioctl(container, VFIO_EEH_PE_OP, &pe_op);

        /* Restore the state we saved at initialization time.
         * pci_restore_state() is good enough as an example.
         */

        /* Hopefully, the error is recovered successfully. Now, you can
         * resume PCI traffic to/from the affected PE.
         */

        ....

5) There is a v2 of the SPAPR TCE IOMMU. It deprecates VFIO_IOMMU_ENABLE/
VFIO_IOMMU_DISABLE and implements 2 new ioctls:
VFIO_IOMMU_SPAPR_REGISTER_MEMORY and VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY
(which are unsupported in the v1 IOMMU).

PPC64 paravirtualized guests generate a lot of map/unmap requests,
and the handling of those includes pinning/unpinning pages and updating
the mm::locked_vm counter to make sure we do not exceed the rlimit.
The v2 IOMMU splits accounting and pinning into separate operations:

- VFIO_IOMMU_SPAPR_REGISTER_MEMORY/VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY ioctls
receive a user space address and size of the block to be pinned.
Bisecting is not supported and VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY is
expected to be called with the exact address and size used for registering
the memory block. The userspace is not expected to call these often.
The ranges are stored in a linked list in a VFIO container.

- VFIO_IOMMU_MAP_DMA/VFIO_IOMMU_UNMAP_DMA ioctls only update the actual
IOMMU table and do not do pinning; instead these check that the userspace
address is from a pre-registered range.

This separation helps in optimizing DMA for guests.
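
The bookkeeping rules of the v2 interface (exact-match unregistration,
map requests accepted only inside a registered range) can be modelled
in userspace as below.  This is an illustrative model only, not the
kernel implementation; all type and function names are invented, and
the model does not attempt to reject overlapping registrations.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct mem_range {
        uint64_t vaddr, size;
        struct mem_range *next;
};

struct mem_registry {
        struct mem_range *head;         /* singly linked list of ranges */
};

/* Record a block; returns 0 on success, -1 on bad input or no memory. */
int registry_register(struct mem_registry *r, uint64_t vaddr, uint64_t size)
{
        struct mem_range *m;

        if (size == 0 || vaddr + size < vaddr)
                return -1;
        m = malloc(sizeof(*m));
        if (!m)
                return -1;
        m->vaddr = vaddr;
        m->size = size;
        m->next = r->head;
        r->head = m;
        return 0;
}

/* Only an exact (vaddr, size) match is removed; splitting a registered
 * block ("bisecting") is rejected with -1. */
int registry_unregister(struct mem_registry *r, uint64_t vaddr, uint64_t size)
{
        struct mem_range **pp;

        for (pp = &r->head; *pp; pp = &(*pp)->next) {
                if ((*pp)->vaddr == vaddr && (*pp)->size == size) {
                        struct mem_range *dead = *pp;

                        *pp = dead->next;
                        free(dead);
                        return 0;
                }
        }
        return -1;
}

/* The check a map request would perform: is [vaddr, vaddr + size)
 * fully inside one registered block? */
bool registry_covers(const struct mem_registry *r, uint64_t vaddr, uint64_t size)
{
        const struct mem_range *m;

        for (m = r->head; m; m = m->next)
                if (vaddr >= m->vaddr && size <= m->size &&
                    vaddr - m->vaddr <= m->size - size)
                        return true;
        return false;
}
```

In this model registration is the expensive, infrequent step, while
the per-map check is a cheap list walk, which mirrors the split
between accounting and mapping described above.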

6) The sPAPR specification allows guests to have additional DMA window(s) on
a PCI bus with a variable page size. Two ioctls have been added to support
this: VFIO_IOMMU_SPAPR_TCE_CREATE and VFIO_IOMMU_SPAPR_TCE_REMOVE.
The platform has to support the functionality or an error will be returned to
the userspace. The existing hardware supports up to 2 DMA windows: one is
2GB long, uses 4K pages and is called the "default 32bit window"; the other
can be as big as the entire RAM and use a different page size. It is optional;
guests create it at run-time if the guest driver supports 64bit DMA.

VFIO_IOMMU_SPAPR_TCE_CREATE receives a page shift, a DMA window size and
a number of TCE table levels (useful if a TCE table is going to be big enough
that the kernel may not be able to allocate enough physically contiguous
memory).
It creates a new window in the available slot and returns the bus address where
the new window starts. Due to a hardware limitation, the user space cannot
choose the location of DMA windows.

VFIO_IOMMU_SPAPR_TCE_REMOVE receives the bus start address of the window
and removes it.
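
To see why the page shift and the number of table levels matter, it
helps to estimate the size of a single-level TCE table for a given
window.  The arithmetic below is a sketch that assumes one 8-byte
entry per IOMMU page; that entry size is an assumption of this
example, not something stated in this document, and the function
names are invented.

```c
#include <stdint.h>

#define TCE_ENTRY_BYTES 8u      /* assumption: one 64-bit entry per page */

/* Number of table entries needed to cover window_size bytes with pages
 * of (1 << page_shift) bytes, rounding up.  page_shift must be < 64. */
uint64_t tce_entries(uint64_t window_size, unsigned int page_shift)
{
        uint64_t page = UINT64_C(1) << page_shift;

        return (window_size + page - 1) >> page_shift;
}

/* Size in bytes of a flat (single-level) table for that window. */
uint64_t tce_table_bytes(uint64_t window_size, unsigned int page_shift)
{
        return tce_entries(window_size, page_shift) * TCE_ENTRY_BYTES;
}
```

Under that assumption the default 2GB window with 4K pages needs
524288 entries (4MB of table), while a 64GB window needs 128MB of
contiguous table with 4K pages but only 8MB with 64K pages.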

-------------------------------------------------------------------------------

[1] VFIO was originally an acronym for "Virtual Function I/O" in its
initial implementation by Tom Lyon while at Cisco.  We've since
outgrown the acronym, but it's catchy.

[2] "safe" also depends upon a device being "well behaved".  It's
possible for multi-function devices to have backdoors between
functions and even for single function devices to have alternative
access to things like PCI config space through MMIO registers.  To
guard against the former we can include additional precautions in the
IOMMU driver to group multi-function PCI devices together
(iommu=group_mf).  The latter we can't prevent, but the IOMMU should
still provide isolation.  For PCI, SR-IOV Virtual Functions are the
best indicator of "well behaved", as these are designed for
virtualization usage models.

[3] As always there are trade-offs to virtual machine device
assignment that are beyond the scope of VFIO.  It's expected that
future IOMMU technologies will reduce some, but maybe not all, of
these trade-offs.

[4] In this case the device is below a PCI bridge, so transactions
from either function of the device are indistinguishable to the IOMMU:

-[0000:00]-+-1e.0-[06]--+-0d.0
                        \-0d.1

00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)