DMA mask setup for xHCI

The xHCI interface defines data structures that the xhci-hcd driver and the xHC host controller share to manage USB devices. The buffers referenced by these data structures are allocated in host memory, and transfers between those buffers and the xHC host controller are performed using DMA. DMA is critical for achieving USB 3.0 speeds.

DMA allows the xHC host controller to access host memory without the CPU’s intervention. To do so, the xHC host controller must be capable of obtaining control of the bus and initiating a bus transaction. On PCI platforms, the xHC host controller obtains PCI bus-mastering capability during PCI device setup via a call to pci_set_master().

As a peripheral, the xHC host controller must use bus addresses to address the DMA buffers allocated in host memory. The bus and the xHC host controller therefore need an initial mechanism through which the controller can learn the bus addresses of those buffers. On PCI platforms, the xHC host controller’s register set is mapped into the host’s address space during PCI enumeration. The xhci-hcd driver uses some of these registers to report the bus address (or DMA address) of certain DMA buffers. Those buffers in turn contain the DMA addresses of further buffers, and so on. The xHC then uses these addresses to perform DMA transfers to and from the corresponding buffers.

For the data structures used by the xHC host controller, the xhci-hcd driver must allocate memory and then map it to DMA addresses. This memory must be physically contiguous and reside in a DMA-able region, meaning that it can be reached using a bus address. If a DMA buffer has been allocated outside the DMA-able region, the DMA mapping function must set up a bounce buffer, and transfers between the xHC host controller and the original buffer are then performed via the bounce buffer.

The Linux kernel provides peripheral driver developers with a generic DMA API that is architecture and bus independent. It simplifies the allocation and mapping of DMA buffers by abstracting the architecture-specific and bus-specific DMA setup layers. The xhci-hcd driver uses this DMA API to allocate its data structures and map their virtual addresses to DMA addresses.

The DMA API defines the dma_addr_t opaque type to hold a bus address and distinguishes between two types of DMA mappings, streaming and coherent. Coherent mappings guarantee that changes to the content of a DMA buffer are immediately visible to both the driver and the peripheral. Streaming mappings, on the other hand, do not guarantee cache coherency, so correct DMA operation depends on the developer using the streaming mapping functions correctly. Coherent mappings carry additional overhead to set up and use compared with streaming mappings, so streaming mappings are preferred for single transfers.

In xhci-hcd, all the buffers used for communicating data between the xHC host controller and the driver are allocated using coherent mappings. The exception is the URB buffers, which hold the USB packets passed between the host controller and the device driver and are mapped using streaming mappings.

The size of DMA addresses is constrained by the width of the internal address register of the peripheral’s DMA engine and by the number of bus address lines. DMA transactions to buffers whose bus addresses need more bits than the smaller of these two widths cannot be performed. The number of bits that can be used to hold a DMA address is hardware specific and is declared by setting the DMA mask. Setting the DMA mask to the highest supported value enables the xHC host controller to address a larger memory region and improves the system’s performance by eliminating the use of bounce buffers.
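
The relationship between address width and mask can be made concrete with a small userspace sketch. The DMA_BIT_MASK() macro below uses the same arithmetic as the kernel macro of that name. The helpers effective_dma_bits() and dma_addr_reachable() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* n low bits set. The n == 64 case is handled separately because
 * shifting a 64-bit value by 64 is undefined behaviour in C. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: the usable width is the smaller of the DMA
 * engine's address register width and the number of bus address lines. */
static unsigned int effective_dma_bits(unsigned int engine_bits,
                                       unsigned int bus_bits)
{
    assert(engine_bits >= 1 && engine_bits <= 64);
    assert(bus_bits >= 1 && bus_bits <= 64);
    return engine_bits < bus_bits ? engine_bits : bus_bits;
}

/* Hypothetical helper: true if every byte of [addr, addr + len)
 * can be addressed under the given mask. */
static bool dma_addr_reachable(uint64_t addr, uint64_t len, uint64_t mask)
{
    uint64_t last;

    if (len == 0)
        return false;
    last = addr + len - 1;
    if (last < addr)        /* address range wrapped around */
        return false;
    return last <= mask;
}
```

In this model, a buffer that ends exactly at the 4 GiB boundary is still reachable with a 32-bit mask, while a buffer that extends one byte past it is not and would need a bounce buffer.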

To set up the DMA mask for a device, the fields dma_mask and coherent_dma_mask of the generic ‘struct device’ must be set. dma_mask is a pointer to the DMA mask used in streaming DMA transfers, while coherent_dma_mask holds the DMA mask used in coherent DMA transfers.

The xHC host controller reports its addressing capabilities via the HCCPARAMS register. The host system’s addressing capability is architecture and bus dependent. Hence, to check whether the DMA addressing mode supported by the xHC host controller is also supported by the host, the device’s dma_mask and coherent_dma_mask must be set using dma_set_mask() and dma_set_coherent_mask().
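
As a rough sketch of how the capability bit might be turned into a requested mask: in the xHCI specification, bit 0 of the capability parameters register is the 64-bit Addressing Capability (AC64) flag. The HCC_64BIT_ADDR() macro below is modelled on the macro of the same name in the kernel’s xHCI headers. The function xhci_wanted_dma_mask() is a hypothetical name used only here.

```c
#include <stdint.h>

/* n low bits set, with n == 64 special-cased to avoid an undefined shift. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* AC64: bit 0 of the capability parameters value. */
#define HCC_64BIT_ADDR(p) ((p) & (1u << 0))

/* Hypothetical helper: the widest mask the controller itself can use.
 * Whether the host accepts it is a separate question that only
 * dma_set_mask() and dma_set_coherent_mask() can answer. */
static uint64_t xhci_wanted_dma_mask(uint32_t hcc_params)
{
    return HCC_64BIT_ADDR(hcc_params) ? DMA_BIT_MASK(64)
                                      : DMA_BIT_MASK(32);
}
```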

The definition of dma_set_mask() is found in the arch/ subdirectory since its implementation is architecture specific. Before setting dma_mask to the requested DMA bitmask, the code checks whether the pointer has been initialized and whether the requested bitmask is supported by the current architecture. For instance, on x86 the implementation of dma_set_mask() is:

int dma_set_mask(struct device *dev, u64 mask)
{
         /* Fail if the mask pointer is unset or the platform rejects the mask */
         if (!dev->dma_mask || !dma_supported(dev, mask))
                  return -EIO;

         *dev->dma_mask = mask;
         return 0;
}

Hence, on success dma_set_mask() returns 0; otherwise it returns a negative error code.

On PCI platforms, dma_mask is initialized during PCI device enumeration by pci_device_add(), which points it at the dma_mask field of the pci_dev and sets that field to a 32-bit mask. So, by default, devices attached to PCI are assumed to be 32-bit DMA capable (single address cycle, or SAC). On other platforms, the caller of dma_set_mask() must verify that the dma_mask pointer is not NULL; otherwise the function will fail even if the bitmask is supported by the platform.
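
The NULL-pointer failure mode is easy to reproduce in a userspace model. The sketch below is not kernel code: struct mock_device, struct mock_pci_dev and the mock_* functions are hypothetical stand-ins, and the rule in mock_dma_supported() (reject masks narrower than 24 bits) is an assumption chosen only to give the check something to do.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Stand-in for the streaming mask field of the generic device. */
struct mock_device {
    uint64_t *dma_mask;     /* pointer: stays NULL unless the bus sets it */
};

/* Stand-in for a PCI device: it owns the storage that the generic
 * device's dma_mask pointer refers to. */
struct mock_pci_dev {
    uint64_t dma_mask;
    struct mock_device dev;
};

/* Assumed platform rule, for illustration only. */
static int mock_dma_supported(uint64_t mask)
{
    return mask >= DMA_BIT_MASK(24);
}

/* Same control flow as the x86 dma_set_mask() quoted above. */
static int mock_dma_set_mask(struct mock_device *dev, uint64_t mask)
{
    if (!dev->dma_mask || !mock_dma_supported(mask))
        return -EIO;
    *dev->dma_mask = mask;
    return 0;
}

/* Mimics the step the text attributes to pci_device_add():
 * default to 32 bits and wire up the pointer. */
static void mock_pci_device_add(struct mock_pci_dev *pdev)
{
    pdev->dma_mask = DMA_BIT_MASK(32);
    pdev->dev.dma_mask = &pdev->dma_mask;
}
```

With a mock_device whose pointer was never initialized, mock_dma_set_mask() returns -EIO even for a mask the assumed platform rule would accept. After mock_pci_device_add(), the same call succeeds and updates the storage in the mock_pci_dev.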

On x86, the implementation of dma_set_coherent_mask() is:

int dma_set_coherent_mask(struct device *dev, u64 mask)
{
         /* coherent_dma_mask is a plain field, so there is no pointer to check */
         if (!dma_supported(dev, mask))
                  return -EIO;

         dev->coherent_dma_mask = mask;
         return 0;
}

As mentioned above, the addressing capabilities of the USB 3.0 host controller are reported via the HCCPARAMS register and can be 32-bit or 64-bit. If the xHC is 64-bit (dual address cycle, or DAC) capable, the xhci-hcd driver has to set the dma_mask to 64 bits to avoid bounce buffers and IOMMU utilization.
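
A common way to act on this is to request a 64-bit mask when the controller supports it and fall back to 32 bits if the host refuses. The following userspace sketch models that flow under stated assumptions. It is not the xhci-hcd source: every mock_* name is hypothetical, and the platform_64bit_ok flag is an invented knob that stands in for whatever the real dma_supported() would decide.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

struct mock_device {
    uint64_t dma_mask_storage;  /* backing store for the streaming mask */
    uint64_t *dma_mask;         /* must point at the storage before use */
    uint64_t coherent_dma_mask;
    bool platform_64bit_ok;     /* invented knob: host accepts > 32 bits */
};

/* Assumed platform rule, for illustration only. */
static int mock_dma_supported(const struct mock_device *dev, uint64_t mask)
{
    return dev->platform_64bit_ok || mask <= DMA_BIT_MASK(32);
}

static int mock_dma_set_mask(struct mock_device *dev, uint64_t mask)
{
    if (!dev->dma_mask || !mock_dma_supported(dev, mask))
        return -EIO;
    *dev->dma_mask = mask;
    return 0;
}

static int mock_dma_set_coherent_mask(struct mock_device *dev, uint64_t mask)
{
    if (!mock_dma_supported(dev, mask))
        return -EIO;
    dev->coherent_dma_mask = mask;
    return 0;
}

/* Returns the configured address width in bits (64 or 32),
 * or a negative errno value if no mask could be set. */
static int mock_xhci_setup_dma(struct mock_device *dev, bool ac64)
{
    int ret;

    /* Prefer 64 bits when the controller advertises it. */
    if (ac64 &&
        !mock_dma_set_mask(dev, DMA_BIT_MASK(64)) &&
        !mock_dma_set_coherent_mask(dev, DMA_BIT_MASK(64)))
        return 64;

    /* Otherwise, or if the host refused, settle for 32 bits. */
    ret = mock_dma_set_mask(dev, DMA_BIT_MASK(32));
    if (ret)
        return ret;
    ret = mock_dma_set_coherent_mask(dev, DMA_BIT_MASK(32));
    return ret ? ret : 32;
}
```

Both masks are always set through the setter functions, never by direct assignment, so a refusal from the platform check is noticed and handled in one place.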

In general, it is recommended to unmap DMA regions as soon as they are no longer needed for DMA transfers, and to use dma_set_mask() and dma_set_coherent_mask() instead of assigning values directly to the DMA and coherent DMA masks.
