SPI performance limitations and analysis – Total Phase

Introduction

This article discusses latencies that limit how frequently API calls can be made and the overall data throughput of an SPI bus. Although we focus primarily on SPI with the Aardvark and Cheetah host adapters, the general concepts apply to our other products as well.

USB Packet Framing Background

USB packets are sent across the bus as frames of data. Each frame is sent at precisely timed intervals, as shown in the table below. Once data is ready to be transmitted, it must wait to be sent out with the next frame. The frame limits how quickly data can be transmitted over USB.

| USB Speed | Frame Rate |
| --- | --- |
| Low-Speed | 1ms |
| Full-Speed | 1ms |
| High-Speed | 125us* |
| SuperSpeed | 125us* |

* Data can be transmitted over High- and SuperSpeed USB at 125us intervals, although the frame rate is technically 1ms. Each 1ms frame is divided into eight 125us microframes, and the microframe interval is the most relevant figure for this discussion.

For more information about packet framing, see this USB Background article.

Inter-transaction Delay

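The USB frame rate determines how quickly data can be sent to and retrieved from our products. A frame rate latency is incurred on each API call that touches the hardware, and a blocking API call pays it twice: once for the request to reach the device and once for the response to come back. Over a full-speed link, that roundtrip amounts to about 2ms per call.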

It is important to keep this in mind when high performance is desired.
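As a rough illustration of this limit, the sketch below estimates the ceiling on blocking API calls per second for a given USB speed. It assumes each blocking call costs exactly two frame periods and nothing else; the dictionary and function names are hypothetical and are not part of any Total Phase API.

```python
# Frame (or microframe) period per USB speed, in seconds, from the table above.
FRAME_PERIOD_S = {
    "low": 1e-3,
    "full": 1e-3,
    "high": 125e-6,
    "super": 125e-6,
}


def max_blocking_calls_per_second(usb_speed: str) -> float:
    """Upper bound on blocking API calls per second.

    Assumption: one frame period for the request plus one for the
    response, with no other host or device overhead.
    """
    roundtrip_s = 2 * FRAME_PERIOD_S[usb_speed]
    return 1.0 / roundtrip_s


for speed in ("full", "high"):
    print(f"{speed}-speed: {max_blocking_calls_per_second(speed):.0f} calls/s at most")
```

Under these assumptions a full-speed link tops out near 500 blocking calls per second and a high-speed link near 4000, regardless of how fast the SPI clock runs.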

Total Bus Throughput Example

Let's consider a case where the Aardvark adapter is an SPI master and sends 1Mib (128KiB) of data at 1MHz. How would the full-speed framing rate affect the overall performance?

Theoretically, this transaction should take 1.05 seconds (1024*1024 bits at 1Mbps). However, this ignores many factors which have an impact on the actual performance. As demonstrated below, the actual performance can vary depending on the SPI transaction size. Other factors including inter-byte delays and setup time requirements can also reduce the total bus throughput. For an overview of these topics, please see the Aardvark adapter user manual.
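The ideal figure quoted above is simply the number of bits divided by the bit rate. A minimal sketch of that arithmetic follows; the function name is illustrative only.

```python
def ideal_transfer_time_s(num_bits: int, bitrate_bps: float) -> float:
    """Time to clock out num_bits back to back, with no overhead at all."""
    return num_bits / bitrate_bps


total_bits = 1024 * 1024  # 1 Mib, i.e. 128 KiB
ideal_s = ideal_transfer_time_s(total_bits, 1_000_000)
print(f"{total_bits} bits at 1 Mbps: {ideal_s:.6f} s")
```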

1 Byte SPI Transactions

This section considers the case where 1 byte is sent per transaction. A transaction starts when the slave select line is asserted and ends when the slave select line is deasserted.

The aa_spi_write call is a synchronous function and must complete before the next API call can be made. Each call incurs a 2ms roundtrip latency caused by the full-speed USB link between the computer and the Aardvark adapter.

Aardvark Adapter SPI Timing for 1 Byte Transaction

| Parameter | Duration | Units | Notes |
| --- | --- | --- | --- |
| Roundtrip USB latency | 2 | ms | |
| Data transmission time | 8 | µs | 8 bits at 1MHz |
| SS# assertion to first clock | 10-20 | µs | See Aardvark adapter user manual. |
| Last clock to SS# deassertion | 5-10 | µs | See Aardvark adapter user manual. |
| Minimum SS# inactive time | 1+ | µs | Must be at least 1 clock period. Slave might need more. |
| Master setup time | 0 | s | N/A if transaction is 1 byte |
| Total transaction time per byte | 2.024+ | ms | |

In the best case scenario, sending 1 byte takes 2.024ms, and the entire 128KiB transaction takes 4.42 minutes, reducing our ideal bitrate of 1Mbps to 3.95kbps. Oh no!
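The best-case numbers above can be reproduced by summing the minimum value of each row in the table and scaling up to 128KiB. This is a sketch of that arithmetic only; the constant names are made up for the example.

```python
# Best-case (minimum) figures from the 1-byte timing table, in seconds.
USB_ROUNDTRIP_S = 2e-3
DATA_TIME_S = 8 / 1_000_000        # 8 bits at 1 MHz
SS_ASSERT_TO_CLOCK_S = 10e-6
CLOCK_TO_SS_DEASSERT_S = 5e-6
SS_INACTIVE_S = 1e-6               # one clock period at 1 MHz

TOTAL_BYTES = 128 * 1024           # one transaction per byte

per_byte_s = (USB_ROUNDTRIP_S + DATA_TIME_S + SS_ASSERT_TO_CLOCK_S
              + CLOCK_TO_SS_DEASSERT_S + SS_INACTIVE_S)
total_s = per_byte_s * TOTAL_BYTES
effective_bps = TOTAL_BYTES * 8 / total_s

print(f"per byte:  {per_byte_s * 1e3:.3f} ms")
print(f"total:     {total_s / 60:.2f} min")
print(f"effective: {effective_bps / 1e3:.2f} kbps")
```

The USB roundtrip accounts for nearly all of the per-byte time, which is why the SPI clock rate hardly matters in this case.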

Clearly, sending 1 byte at a time is not very efficient.

4 KiB SPI Transactions

As discussed in the Aardvark adapter user manual, 4KiB is the maximum recommended transfer size for aa_spi_write. The USB latency remains the same in this case, but it has less of an impact because the transaction size is larger.

Aardvark Adapter SPI Timing for 4KiB Transaction

| Parameter | Duration | Units | Notes |
| --- | --- | --- | --- |
| Roundtrip USB latency | 2 | ms | |
| Data transmission time | 32.768 | ms | 4KiB (32Kib) at 1MHz: 32,768 bits x 1µs = 32.768ms |
| SS# assertion to first clock | 10-20 | µs | See Aardvark adapter user manual. |
| Last clock to SS# deassertion | 5-10 | µs | See Aardvark adapter user manual. |
| Minimum SS# inactive time | 1+ | µs | Must be at least 1 clock period. Slave might need more. |
| Master setup time | 28.665 | ms | 7µs x (4096 - 1) bytes = 28.665ms |
| Total transaction time per 4KiB | 63.449+ | ms | |

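In this case, sending 4KiB takes 63.449ms once the 7µs master setup time between bytes is included, so the entire 128KiB transfer takes 2.03 seconds and our ideal bitrate drops from 1Mbps to about 516kbps. If the master setup time is left out, each 4KiB transaction takes 34.8ms and the entire transfer takes 1.11 seconds, only reducing the bitrate to 942kbps.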

This is the best SPI performance that the Aardvark adapter can reliably deliver at 1MHz. This kind of analysis can be used to estimate the performance for any system.
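To apply the same analysis to other transaction sizes, the rows of the timing table can be folded into one function. The sketch below is a simplified model, not a measurement: it uses the best-case values from the table as defaults, treats the 7µs master setup time as a fixed gap between consecutive bytes, and lets that gap be set to zero to see its effect. All names are hypothetical.

```python
def transaction_time_s(num_bytes, clock_hz=1_000_000, usb_roundtrip_s=2e-3,
                       ss_assert_s=10e-6, ss_deassert_s=5e-6,
                       ss_inactive_s=1e-6, inter_byte_setup_s=7e-6):
    """Estimated duration of one blocking SPI write of num_bytes."""
    data_s = num_bytes * 8 / clock_hz
    # Setup time applies between bytes, so a 1-byte transaction has none.
    setup_s = inter_byte_setup_s * (num_bytes - 1)
    return (usb_roundtrip_s + data_s + ss_assert_s
            + ss_deassert_s + ss_inactive_s + setup_s)


def effective_bitrate_bps(num_bytes, **timing):
    """Payload bits per second when sending back-to-back transactions."""
    return num_bytes * 8 / transaction_time_s(num_bytes, **timing)


for size in (1, 64, 1024, 4096):
    with_setup = effective_bitrate_bps(size)
    without_setup = effective_bitrate_bps(size, inter_byte_setup_s=0.0)
    print(f"{size:5d} bytes: {transaction_time_s(size) * 1e3:8.3f} ms, "
          f"{with_setup / 1e3:7.1f} kbps "
          f"({without_setup / 1e3:7.1f} kbps without inter-byte setup)")
```

For 4096 bytes this reproduces the 63.449ms table total, and with the setup time zeroed out it gives roughly 34.8ms per transaction. Swapping in the clock rate and latencies of another adapter or slave device gives a first-order estimate for that system.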

Cheetah's Solution to Latency Problem

The Cheetah High-Speed SPI host adapter has several advantages that help overcome the latency problem discussed above:

  1. It features a high-speed link, with a frame rate that is 8x faster.
  2. It features a command queue which can send a large number of commands at once, reducing the effect of the USB frame rate latency.
  3. Commands can be sent to the Cheetah asynchronously rather than blocking until completion. Using ch_spi_async_submit, new batches of commands can be submitted to the Cheetah before previously submitted commands have finished. This method gets around the USB frame rate problem completely. For more information, see the Cheetah API documentation.
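The benefit of asynchronous submission can be illustrated with a simple timing model. The sketch below does not call the Cheetah API. It only compares paying the USB roundtrip on every batch against paying it once, under the assumption that the host always submits the next batch before the adapter runs out of queued work. The 250µs roundtrip (two high-speed microframes), the 64-byte batch size, and the function names are all assumptions chosen for illustration.

```python
def blocking_total_s(num_batches, bus_time_s, roundtrip_s):
    """Each batch waits for its own USB roundtrip before the next starts."""
    return num_batches * (roundtrip_s + bus_time_s)


def pipelined_total_s(num_batches, bus_time_s, roundtrip_s):
    """USB latency is paid once; later batches are already queued.

    Assumption: the adapter's queue never runs empty between batches.
    """
    return roundtrip_s + num_batches * bus_time_s


ROUNDTRIP_S = 2 * 125e-6              # assumed: two high-speed microframes
BATCH_BYTES = 64                      # assumed batch size
BUS_TIME_S = BATCH_BYTES * 8 / 1e6    # at 1 MHz, ignoring SS# overheads
NUM_BATCHES = (128 * 1024) // BATCH_BYTES

blocking_s = blocking_total_s(NUM_BATCHES, BUS_TIME_S, ROUNDTRIP_S)
pipelined_s = pipelined_total_s(NUM_BATCHES, BUS_TIME_S, ROUNDTRIP_S)
print(f"blocking:  {blocking_s:.3f} s")
print(f"pipelined: {pipelined_s:.3f} s")
```

In this model the pipelined total approaches the ideal transfer time as the number of batches grows, because the link latency is no longer multiplied by the batch count.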

USB Link Speeds of Total Phase Products

| Product | USB Link Speed |
| --- | --- |
| Aardvark I2C/SPI Host Adapter | Full-Speed |
| Komodo CAN Duo Interface | Full-Speed |
| Cheetah SPI Host Adapter | High-Speed |
| Beagle I2C/SPI Protocol Analyzer | High-Speed |
| Beagle USB 12 Protocol Analyzer | High-Speed |
| Beagle USB 480 Protocol Analyzer | High-Speed |
| Beagle USB 5000 SuperSpeed Protocol Analyzer | High-Speed |