LL10GEMAC-IP with AAT QDMA Demo Instruction

 

1      Overview. 2

2      Target System Setup. 6

2.1   IP Address Configuration for Two 10G Ethernet Ports. 6

2.2   Installation of “tcpreplay” 8

3      Host System Setup. 9

4      Run AAT QDMA Demo. 14

4.1   Initialization. 14

4.2   Market Data Transmission. 16

4.3   AAT QDMA Demo. 18

4.3.1     Ethernet Subsystem.. 18

4.3.2     Feed Handler Submodule. 19

4.3.3     Order Book Submodule. 20

4.3.4     Data Mover Submodule. 21

4.3.5     Pricing Engine Submodule. 23

4.3.6     Order Entry Submodule. 24

5      Update Hardware via PCIe. 25

5.1   Create the MCS file. 25

5.2   Download the MCS file via PCIe. 26

6      Revision History. 27

 

1        Overview

This document provides detailed instructions for configuring the Alveo accelerator card and setting up the test environment to run the customized Accelerated Algorithmic Trading (AAT) demo from Design Gateway – DG AAT QDMA demo. This demo is a modified version of AMD’s original AAT demo, where the development platform has been migrated from Vitis to Vivado. Furthermore, the DG AAT QDMA demo is specifically designed to achieve lower latency by utilizing the LL10GEMAC-IP from Design Gateway, replacing the 10G/25G Ethernet subsystem.

Detailed information about the DG AAT QDMA demo is available in the reference design document:

https://dgway.com/products/IP/Lowlatency-IP/ll10gemac-ip-aat-qdma-refdesign-amd/

Latency details for the LL10GMEAC-IP can be found in its datasheet:

https://dgway.com/products/IP/Lowlatency-IP/dg_ll10gemacip_data_sheet_xilinx_en/

 

The demo operates on the Alveo accelerator card and demonstrates performance on a 10G Ethernet connection. The Gigabit Ethernet connectors available on each Alveo card vary depending on the model’s capabilities.

·       U250, U55C    : Equipped with two QSFP28 ports, supporting up to four 10G Ethernet channels per QSFP28.

·       U50                : Equipped with one QSFP28 port, supporting up to four 10G Ethernet channels.

·       X3522             : Equipped with two DSFP28 ports, supporting up to two 10G Ethernet channels per DSFP28.

In this demo, two 10G Ethernet channels are required:

1)     Market Data Transmission: Transmits sample market data using the UDP protocol.

2)     Order Transmission: Handles order data transmission using FIX over TCP.

To set up the system, a target PC with two 10G Ethernet ports is required. The sample market data is transmitted using the “tcpreplay” tool, while order reception is monitored by opening a TCP port on the target system. The demo on the Alveo accelerator is initiated by executing the “aat_qdma_exe” application.

The following test environment was configured to produce the results presented in this document.

1)     Supported Alveo cards: U50, U250, U55C, and X3522.

2)     Host system for Alveo accelerator card: Turnkey accelerator system (TKAS-D2101). Detailed specifications are available at https://dgway.com/solutions.html.

3)     The Vivado Design Suite installed on the host system to program the Alveo card.

4)     10G Ethernet cable

·      U50/U250/U55C card: QSFP+ to four SFP+ breakout cable:

https://www.sfpcables.com/5-meter-40g-qsfp-to-4-sfp-aoc-cable-om3-mmf-cisco-oem-compatible.

·      X3522 card: 2xSFP+ Active Optical Cable (AOC): https://www.10gtek.com/10gsfp+aoc.

5)     Programming cable

·      U250/U55C card      : Micro-USB cable.

·      U50 card                 : Alveo programming cable.

·      X3522 card              : Alveo Debug Kit (ADK2).

6)     Target system is configured with the following specifications.

·      Operation System: Ubuntu 20.04 LTS Server.

·      Market Data: Sample market data file (cme_input_arb.pcap).

·      Packet Replay: “tcpreplay” package for transmitting market data.

·      Ethernet Ports: Two 10G Ethernet ports using Intel X710-DA2 adapters.

https://ark.intel.com/content/www/us/en/ark/products/83964/intel-ethernet-converged-network-adapter-x710da2.html

 

 

Figure 1 LL10GEMAC-IP with AAT QDMA Demo using Alveo U250/U55C Card

 

 

Figure 2 LL10GEMAC-IP with AAT QDMA Demo using Alveo U50 Card

 

 

Figure 3 LL10GEMAC-IP with AAT QDMA Demo using Alveo X3522 Card

 

2        Target System Setup

This section provides step-by-step instructions for preparing the target system, equipped with two 10G Ethernet ports, to transfer market data and order packets with the Alveo accelerator card. The system runs Ubuntu 20.04 LTS Server OS.

2.1      IP Address Configuration for Two 10G Ethernet Ports

First, identify the logical names of the two 10G Ethernet ports, which connect to the SFP+#1 and SFP+#2 cables. These logical names may vary based on your test environment, so it is important to configure the correct IP address for the SFP+#1 and SFP+#2 connections.

1)     Open a Linux terminal and use the following command to list the logical names of the 10G Ethernet ports:

“lshw -C network”.

 

 

Figure 4 Display Logical Name of 10G Ethernet Ports

 

The output will display information about the network interfaces. For example, Figure 4 shows logical names such as “enp6s0f0np0” for SFP+#1 and “enp6s0f1np1” for SFP+#2.

 

2)     Configure the IP address for each port using the “ifconfig” command.

·       Set SFP+#1 (enp6s0f0np0) to “192.168.10.100”.

·       Set SFP+#2 (enp6s0f1np1) to “192.168.20.100”.

Additionally, configure the netmask to 255.255.255.0 (i.e., /24 subnet). The command format is as follows.

 

 

Figure 5 Configure IP Address and Netmask

 

3)     After configuring the IP addresses and netmask, verify the settings using the “ifconfig” command.

 

 

Figure 6 Verify IP Address and Netmask Setting

 

Ensure that both Ethernet ports are correctly assigned with their respective IP addresses and netmask values.

 

2.2      Installation of “tcpreplay”

To run the AAT QDMA demo, the target PC must have the “tcpreplay” tool installed, which is used to replay packet capture files over a network interface. Run the following command to install the “tcpreplay” package:

>> sudo apt-get install tcpreplay

 

 

Figure 7 “tcpreplay” Installation

 

This command will install “tcpreplay”, as illustrated in Figure 7. Once the installation is completed, “tcpreplay” will be ready for use in the demo.

 

3        Host System Setup

This section outlines the steps to prepare the Turnkey Acceleration System (TKAS-D2101 with an Alveo card), which is the host system for running the AAT QDMA demo.

1)     Install QDMA DPDK driver. Since DG AAT QDMA demo requires the packet transfer through PCIe for the Alveo card configuration, QDMA DPDK driver must be installed on the host system. The installation guide can be found on the AMD website, “QDMA DPDK Driver” on the topic of “Building QDMA DPDK Software”:

https://xilinx.github.io/dma_ip_drivers/master/QDMA/DPDK/html/build.html

2)     Connect the Ethernet and programming cable between Alveo card and the target system. The detail is slightly different depending on the Alveo card installed on the host system.

Alveo U250/U55C

i)       Insert a QSFP+ transceiver into the QSFP+ connector on the Alveo accelerator card. There are two QSFP+ channels on the U250/U55C card. Make sure to utilize the QSFP1 channel.

ii)     Connect SFP+ no.1 (IP: 192.168.10.100) and SFP+ no.2 (IP: 192.168.20.100) to the 10G Ethernet ports on the target system.

iii)    For programming the card, connect a Micro USB cable from the Alveo accelerator card to the target system.

 

 

Figure 8 QSFP+ and Micro USB Cable Connection on U250/U55C

 

Alveo U50

i)       Insert a QSFP+ transceiver into the QSFP+ connector on the Alveo accelerator card.

ii)     Connect SFP+ no.1 (IP: 192.168.10.100) and SFP+ no.2 (IP: 192.168.20.100) to the 10G Ethernet ports on the target system.

iii)    For programming the card, connect the Ribbon cable from the Alveo accelerator card to the Alveo Programming Module.

 

 

Figure 9 QSFP+ and Ribbon Cable Connection on U50

 

Alveo X3522

i)       Insert two SFP+ transceivers into the SFP+ connectors on the Alveo accelerator card.

ii)     Connect SFP+ no.1 (IP: 192.168.10.100) and SFP+ no.2 (IP: 192.168.20.100) to the 10G Ethernet ports on the target system.

iii)    For programming the card, connect the Flex cable from the Alveo accelerator card to the Alveo Debug kit (ADK2). Make sure the Flex cable is connected firmly.

 

 

Figure 10 SFP+ and Flex Cable Connection on X3522

 

3)     Utilize the Vivado Hardware Manager to program the Alveo card. Open the Vivado Hardware Manager and program the board with the required bit file as illustrated in Figure 11.

 

 

Figure 11 Program Alveo by Vivado Tool

 

4)     Warm reboot the system and confirm that the hardware is implemented on the card using the “lspci” command. The console must display “Network controller: Xilinx Corporation Device 903f” as shown in Figure 12.

 

 

Figure 12 Output of the “lspci” Command After Programming the Alveo Card

 

5)     Bind the previously installed DPDK driver from step (1), with the Alveo card on the host system.

i)       Navigate to the “usertools” folder within the DPDK directory:

>> cd <DPDK directory>/dpdk-20.11/usertools

ii)     User the following command to bind the vfio-pci driver to the Alveo card:

>> sudo ./dpdk-devbind.py -b vfio-pci 01:00.0

6)     Boost the AAT QDMA demo on the Alveo card by executing the “aat_qdma_exe” application.

i)       Navigate to the “download” folder:

>> cd <download directory>

ii)     Execute the application:

>> sudo ./software/aat_qdma_exe

After execution, the DPDK and AAT applications will initialize, as shown in Figure 13.

 

 

Figure 13 Status Displayed After Executing the Demo Application on the Host System

 

4        Run AAT QDMA Demo

To execute the demo, the user must follow three key steps: Initialization, market data transmission, and test result display. The initialization process establishes the connection and configure the necessary parameters. The market data transmission process involves sending market data from the target system, while the results are received from the Alveo card and displayed on the TKAS-D2101 console. Detailed instructions for each process are provided below.

 

4.1      Initialization

To run the AAT QDMA demo, both the target system and the Alveo card need to be properly configured. The target must listen on a specific port to receive order packets from the Alveo card, once it has completed processing the market data. Similarly, the Alveo card must be configured to process the market data and send the order packet. This is done using “demo_setup.cfg” or “demo_setup_with_datamover.cfg” script file. Follow the steps below for system initialization.

1)     On the target system’s console, enter the following command to listen on port 12345.

>> nc -l 192.168.20.100 12345 -v

This will configure the target to listen for incoming packets on the specified IP and port.

 

 

Figure 14 Listen TCP Port on Target PC

 

2)     After entering the command, you should see a confirmation message on the console “Listening on <Target PC name> 12345” indicating that the port is listening, as shown in Figure 14.

3)     On the TKAS-D2101 console, run the script file to configure the parameters for processing market data. There are two options for the configuration depending on the implementation of Pricing Engine.

·       Configure to implement Pricing Engine on the Alveo card by using “demo_setup.cfg” script file. This method offers more latency reduction for market data processing.

>> run support/demo_setup.cfg

·       Configure to implement Pricing Engine on the host software by using “demo_setup_with_datamover.cfg” script file. This method suits for more complicated algorithm for market data processing.

>> run support/ demo_setup_with_datamover.cfg

Note: The script file can only be executed once after the demo has already been boosted up (execute application in step (5) of section 3-Host System Setup). If user requires to rerun the script file, please use “exit” command to cease the demo application and repeat from step (5) of section 3-Host System Setup.

 

 

Figure 15 Run Demo Configuration Script (using demo_setup.cfg)

 

4)     TKAS-D2101 will display messages during the setup process, as shown in Figure 15. These messages will indicate the progress and status of the configuration process.

5)     If the parameter configuration is successful, the target console will display a message indicating that the port has been opened successfully, such as “Connection received on 192.168.20.200 62303”, as shown in Figure 16.

 

 

Figure 16 Port Opened Success

 

4.2      Market Data Transmission

To transmit sample market data, follow these steps using the “tcpreplay” on the target PC. You will need to open two terminal windows on the target PC: Target Console#1 and Target Console#2. Target Console#1 will display the details of the received order packet via TCP protocol (through SFP+#2: enp6s0f1np1), while Target Console#2 will be used to send the sample market data via UDP protocol (through SFP+#1: enp6s0f0np0). Follow the steps below.

1)     Send the sample market data. Use the “tcpreplay” command to send the provided sample market data file from AMD AAT demo (cme_input_arb.pcap). Execute the following command on Target Console#2 with the four parameters.

>> sudo tcpreplay –intf1=<eth I/F> --pps=<pac/sec> --stats=<stat period> <replay file>

i)       <eth I/F>                 : Ethernet interface used to send market data (SFP+#1: enp6s0f0np0).

ii)     <pac/sec>               : Transfer speed, defined as the number of packets per second.

iii)    <stat period>           : Time interval (in seconds) to display the transmission status on the console.

iv)    <replay file>            : File name of the market data to be transmitted (e.g., cme_input_arb.pcap).

 

 

Figure 17 Sample Market Data Transmission by “tcpreplay”

 

2)     As the data is transmitted, the console will display the status every second, showing the total number of packets transmitted.

3)     On Target Console#1, which is already connected to the listening port, the console will display the received data. The data represents the sample order packet returned by the Alveo card and serves as the result of the AAT QDMA demo.

 

 

Figure 18 The Sample Data of Order Packet on SFP+#2 Channel

 

4.3      AAT QDMA Demo

This section provides example results from market data processing on the Alveo card. The AAT QDMA demo design includes the subsystem and multiple submodules responsible for processing the sample market data. This section focuses on six key components: Ethernet subsystem, Feed Handler submodule, Order Book submodule, Data Mover submodule, Pricing Engine submodule, and Order Entry submodule. The following sections describe the sample results obtained from these components within the demo design.

 

4.3.1     Ethernet Subsystem

To check the status of the Ethernet subsystem in the AAT QDMA demo system, follow these steps.

1)     Enter the following command in the terminal to display the status of Ethernet subsystem.

>> ethernet getstatus

 

 

Figure 19: Ethernet Submodule Status

 

2)     The AAT QDMA demo system uses four Ethernet channels: channel0 to channel3. In this example, channel#0 is used for receiving the sample market data, and channel#1 is used for returning the sample order packet. Check the status of channel#0 and channel#1 to ensure they are operating correctly. The following parameters should be verified.

·       Rx Block Lock Status (Live)    : LOCKED

·       GT Power Good (Live)           : true

 

4.3.2     Feed Handler Submodule

To check the status of the Feed Handler submodule in the AAT QDMA demo system, follow these steps.

1)     Enter the following command in the terminal to display the status of the Feed Handler submodule.

>> feedhandler getstatus

2)     The console will display the count of processed data in various units, such as bytes, packets, and messages. Before transmitting the sample market data, the processed data count will be zero. After the market data transmission, this count will increase, indicating that the Feed Handler submodule has begun processing the market data.

 

 

Figure 20 Feed Handler Submodule Status

 

For example, in Figure 20, the left window shows that the processed data count is initially zero. After transmitting the sample market data, the count increases, confirming the submodule is processing the data.

In Figure 17, an example is shown where 104 packets of sample market data were sent by the target PC. Out of these packets, 53 packets were processed by the Feed Handler submodule, while the remaining packets were rejected.

 

4.3.3     Order Book Submodule

To check the status of the Order Book submodule in the AAT QDMA demo system, follow these steps.

1)     Enter the following command to read and display the current order book output from the Order Book submodule.

>> orderbook readdata

2)     The console will show the current values in the order book. Initially, before transmitting any sample market data, the order book will be in a clean state. However, after the transmission of all sample market data, the Order Book submodule will update the bid/ask quantities and prices in the order book to reflect the changes in market conditions.

 

 

Figure 21 Updated Order Book upon the Processing Completion

 

For instance, in Figure 21, the left window displays the clean status of the order book before the market data is transmitted. The right window shows the updated order book after all the sample market data has been processed. The bid/ask quantities and prices are adjusted based on the market data, as updated by the Order Book submodule.

 

4.3.4     Data Mover Submodule

To check the status of the Data Mover submodule in the AAT QDMA demo system, follow these steps.

1)     Enter the following command in the terminal to display the status of Data Mover submodule.

>> datamover getstatus

2)     The console will display the current status of Data Mover submodule. This Data Mover is active only when user selects to implement Pricing Engine by host software (utilized “demo_setup_with_datamover.cfg” file in step (3) of section 4.1-Initialization). Before transmitting the sample market data, the processed data count will be zero. After the market data transmission, this count will increase, indicating that the Data Mover submodule has begun delivering the market data. Before transmitting the sample market data, make sure that the software thread and hardware submodule (hardware kernel) is running:

·       HW Kernel Is Running            : true

·       SW Thread Is Running           : true

 

 

Figure 22 Data Mover Submodule Status

 

In Figure 22, the total number of packets moved to Pricing Engine on the host software is 53 packets. After the host Pricing engine completes the processing, the Data Mover receives 17 packets representing the orders to be delivered to the Order Entry submodule. The remaining 36 packets are not executed by the Pricing Engine.

 

3)     To measure the Round-Trip Time (RTT) – measured once packet is out from Data Mover submodule, be processed in host Pricing Engine, and order packet returns back to Data Mover submodule, execute the following command while transmitting the sample market data.

>> datamover timing

4)     After entering the command, the measurement is operated for 10 seconds.

 

 

Figure 23 Data Mover RTT Result

 

Figure 23 shows the RTT measurement result including the maximum, minimum, average values, and the total number of collected sample market data packets transmitted over a 10-second interval.

 

4.3.5     Pricing Engine Submodule

To check the status of the Pricing Engine submodule in the AAT QDMA demo system, follow these steps.

1)     Enter the following command in the terminal to display the status of Pricing Engine submodule.

>> pricingengine getstatus

2)     Pricing Engine submodule, operating on the Alveo card, is activated only when user selects to implement Pricing Engine on the FPGA (selected “demo_setup.cfg” file in step (3) of section 4.1-Initialization). Similar to the other components, before transmitting any sample market data, the Pricing Engine will be in a clean state.

 

 

Figure 24 Pricing Engine Submodule Status

 

As shown in Figure 24, after market data is transmitted and processed, Pricing Engine submodule receives 53 data sets from the updated order book. As the consequences of computation and algorithm of Pricing Engine, 17 data sets match the conditions and require trading actions, which is informed to Order Entry as the next step.

 

4.3.6     Order Entry Submodule

To check the current status of the Order Entry submodule in the AAT demo system, follow these steps.

1)     Enter the following command to display the current status of the Order Entry submodule.

>> orderentry getstatus

2)     Before transmitting the sample market data, check the connection status of Ethernet channel#1 in the Order Entry submodule status. Two key indicators to confirm are as follows.

·       Connection Established          : true

·       Connection Status                 : SUCCESS

 

 

Figure 25 Order Entry Submodule Status

 

3)     After transmitting the market data completely, the packet count in the Order Entry submodule will update from 0 to reflect the total number of messages or frames processed by the Order Entry submodule.

 

5        Update Hardware via PCIe

In certain environments, remote hardware updates are necessary. The AAT-QDMA includes a hardware module for supporting hardware updates via PCIe, eliminating the need for a programming cable. This section outlines two steps: mcs file creation and downloading.

For downloading the mcs file via PCIe, the "xbflash2" utility is required. Users can download and install "xbflash2" by following the link below:

https://www.amd.com/en/products/accelerators/alveo/u250/a-u250-a64g-pq-g.html#tabs-ca1f6f6dc7-item-2760c1c14c-tab

 

5.1      Create the MCS file

 

Figure 26 MCS File Creation

 

1)     Open the Vivado tcl console by using the following command: “vivado -mode tcl”.

2)     Execute the following command to generate the mcs file:

>> write_cfgmem -force -format mcs -interface <interface_type> -size <size> -loadbit “up <user_config_region_offset> <input_file.bit>” -file “output_file.mcs”

Note: The values for “interface_type”, “size”, and “user_config_region_offset” depend on the Alveo card model, as detailed in Table 1.

 

Table 1 Parameters for MCS File Creation

Alveo card model

Interface_type

size

user_config_region

X3522

spix4

256

0x01002000

U250/U50/U55C

128

 

3)     The operation result will be displayed upon completion, confirming the successful creation of the mcs file.

5.2      Download the MCS file via PCIe

1)     Execute the “xbflash2” utility with root permissions using the following command:

>> sudo xbflash2 program --spi --image <target_mcs_file>.mcs --bar 2 --bar-offset 0x80000 -d <BDF>

Note: The parameters “bar” and “bar-offset” are specific to the default AAT QDMA demo. If the current hardware on the card is not configured by the default AAT QDMA demo, these parameters may need to be adjusted accordingly.

2)     Input ‘Y’ to confirm the operation.

3)     Wait for the following message to appear on the console “Cold reboot machine to load the new image on device”.

4)     Perform a cold reboot of the system. After rebooting, the new hardware configuration will be permanently implemented on the card.

 

 

Figure 27 “xbflash2” Programming

 

6        Revision History

Revision

Date (D-M-Y)

Description

1.00

2-Dec-24

Initial version release