TOE100G-IP Core Data Sheet

Features
Applications
General Description
Functional Description
Control Block
      Reg
      TCP Stack
Transmit Block
      Tx Data Buffer
      Tx Packet Buffer
      Packet Builder
      Async Buffer (Tx)
Receive Block
      Async Buffer (Rx)
      Packet Filtering
      Packet Splitter
      Rx Data Buffer
User Block
100G Ethernet MAC and PHY
Core I/O Signals
Timing Diagram
IP Initialization
Register Interface
Tx FIFO Interface
Rx FIFO Interface
EMAC Interface
Example usage
Client mode (SRV[1:0]=00b)
Server mode (SRV[1:0]=01b)
Fixed MAC mode (SRV[1] = 1b)
PKL and TDL setting in Send command
TDL = N times of PKL
TDL = N times of PKL + Residue
Connection termination of unusual case
Verification Methods
Recommended Design Experience
Ordering Information
Revision History

 

 

 

 

Core Facts

Provided with Core
Documentation: Reference design manual, Demo instruction manual
Design File Formats: Encrypted File
Instantiation Templates: VHDL
Reference Designs & Application Notes: Quartus Project, See Reference Design Manual
Additional Items: Demo on Stratix10 MX, Stratix10 TX, and Agilex F-Series development kit

Support
Support Provided by Design Gateway Co., Ltd.

 

 

Design Gateway Co.,Ltd

E-mail:    ip-sales@design-gateway.com

URL:       design-gateway.com

 

Features

·     TCP/IP stack implementation

·     Support IPv4 protocol

·     Support one session for each TOE100G IP (Multisession can be implemented by using multiple TOE100G IPs)

·     Support both Server and Client mode (Passive/Active open and close)

·     Support Jumbo frame

·     Transmitted packet size aligned to 512 bits, the width of the transmit data bus

·     Total amount of received data aligned to 512 bits, the width of the receive data bus

·     Simple data interface by standard FIFO interface at 512-bit data bus

·     Simple control interface by 32-bit single-port RAM interface

·     512-bit Avalon stream interface with 100G Ethernet MAC

·     Support window scaling feature with selectable buffer size up to 1 MB

·     At least 220 MHz user clock frequency recommended

·     Reference designs available on Stratix10 MX, Stratix10 TX, and Agilex F-Series FPGA development kits

·     Data fragmentation is not supported

·     Customized service available for the following features

      ·     Unaligned 512-bit data transfer

      ·     Network parameter assignment by other methods

 

 

Table 1: Example Implementation Statistics

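| Family | Example Device | Buffer size (Tx and Rx) | Fmax (MHz) | ALMs (1) | Registers (1) | Pin | Block Memory bit | Design Tools |
|---|---|---|---|---|---|---|---|---|
| Stratix10 MX | 1SG280HU2F50E1VG | 64KB | 350 | 7,713 | 16,563 | - | 1,706,496 | QuartusII20.4 |
| Stratix10 MX | 1SG280HU2F50E1VG | 1MB | 300 | 9,725 | 16,252 | - | 17,435,136 | QuartusII20.4 |
| Stratix10 TX | 1ST280EY2F55E1VG | 64KB | 350 | 7,674 | 16,178 | - | 1,706,496 | QuartusII20.4 |
| Stratix10 TX | 1ST280EY2F55E1VG | 1MB | 300 | 9,707 | 16,077 | - | 17,435,136 | QuartusII20.4 |
| Agilex7 F-Series | AGFB014R24A2E3VR0 | 64KB | 350 | 7,916 | 18,997 | - | 1,706,496 | QuartusII20.4 |
| Agilex7 F-Series | AGFB014R24A2E3VR0 | 1MB | 300 | 9,908 | 20,630 | - | 17,435,136 | QuartusII20.4 |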

 

Notes:

1) Actual logic resource dependent on percentage of unrelated logic

 

Applications

The TOE100G IP core enables reliable, high-speed data transfer using the TCP/IP protocol over 100G Ethernet. It is frequently used in servers that process large amounts of data and in test systems that require high-bandwidth data logging from multiple sources. Figure 1 and Figure 2 illustrate two FPGA-based applications of the TOE100G IP.

 

 

Figure 1: NVMe over TCP (NVMe-oF) application

 

The first application is the NVMe-oF system, which uses the NVMe over TCP (NVMe/TCP) protocol to enable network access to storage through the NVMe protocol. This allows for low-latency, high-bandwidth data transfer. NVMe-oF protocols include RDMA, InfiniBand, and NVMe/TCP, with the latter being a more cost-effective and extensible option that can be implemented using common network hardware.

Figure 1 shows an NVMe/TCP Host implemented with two TOE100G IPs, one for Admin command transfer and the other for data transfer. With the TOE100G IP, the Host controller only needs to implement the NVMe and NVMe-oF protocols, not the TCP/IP protocol. The TOE100G IP on the data port also provides high-speed data transfer, which is particularly advantageous for CPU-less systems because the host controller can be implemented without a CPU or DDR.

The other side of NVMe-oF is the NVMe/TCP Target, which connects to the SSD rack and translates data and commands from 100G Ethernet into the NVMe over PCIe protocol for the NVMe SSDs. As with the Host, the Target can be designed using a CPU system or hardwired logic with the TOE100G IP.

 

 

Figure 2: Data acquisition system

 

For high-resolution data sources, achieving data transfer rates of up to 12 Gbytes/s can be challenging because of the limits of the available hardware and communication channels. One of the fastest storage solutions for this problem is the NVMe SSD. Over a 4-lane PCIe Gen5 link, an NVMe SSD can read or write data at up to 12000 Mbytes/s, and combining multiple NVMe SSDs as a RAID0 system can increase the transfer speed further. This speed of 12000 Mbytes/s is comparable to the performance of 100G Ethernet, which can achieve transfer rates of up to 12 Gbytes/s. By integrating both NVMe Gen5 SSDs and 100G Ethernet, it is possible to design a powerful data acquisition system with remote monitoring capabilities, as illustrated in Figure 2.

 

General Description

 

Figure 3: TOE100G IP Block Diagram

 

The TOE100G IP core is a hardware module that implements the TCP/IP stack and connects to a 100G Ethernet MAC and PHY module for the lower-layer hardware. Its user interface is composed of a Register interface for control signals and a FIFO interface for data signals. The TOE100G IP operates using three clock domains: Clk for the user interface, MacTxClk for transmitting data through the 100G EMAC, and MacRxClk for receiving data through the 100G EMAC. Some EMACs use the same clock source for both the Tx and Rx interfaces, in which case MacTxClk and MacRxClk are the same signal.

The Register interface uses a 5-bit address to access up to 32 registers. The registers store the network parameters, commands, and system parameters. Each TOE100G IP can operate one session to communicate with a single target device. Network parameters must be set before de-asserting the reset signal to execute IP initialization. After the reset operation and parameter initialization are complete, the IP is ready to transfer data with the target device. Network parameters cannot be changed without a reset process. The TOE100G IP has three initialization modes for obtaining the MAC address of the target device. Further details of each mode can be found in the IP Initialization topic.

Data is transferred to and from the user through a 512-bit FIFO interface. The FIFO interface has no byte enable, so the transmitted data from the user must be aligned to 512 bits. The packet length and the total amount of transmitted data must also be aligned to 512 bits. In the receive direction, data on the Rx FIFO I/F can be read when at least one 512-bit word is available in the Rx data buffer. If the total amount of received data is not aligned to 512 bits, the user cannot read the last word and must wait until more data is received to fill the remaining bytes of that 512-bit word.
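The alignment rule above can be illustrated with a small sketch. It is not part of the IP deliverables; the function names are hypothetical and only model the arithmetic of a 64-byte (512-bit) word.

```python
WORD_BYTES = 64  # one 512-bit FIFO word holds 64 bytes


def is_aligned(length_bytes: int) -> bool:
    """Return True when a length is a whole number of 512-bit words."""
    return length_bytes % WORD_BYTES == 0


def rx_readable(received_bytes: int) -> tuple:
    """Split a received byte count into (readable 512-bit words,
    leftover bytes that must wait for more data to complete a word)."""
    return divmod(received_bytes, WORD_BYTES)


print(is_aligned(1408))   # True
print(rx_readable(1000))  # (15, 40)
```

In the second call, 15 words can be read immediately, while the last 40 bytes stay in the buffer until 24 more bytes arrive.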

The TOE100G IP uses a 512-bit Avalon-ST interface to connect with the 100G Ethernet MAC. The Ethernet MAC and PHY, provided by Intel, include EMAC, PCS, and PMA functionality. The clock frequency of the EMAC user interface for both the Tx and Rx interfaces is either 390.625 MHz or 402.832 MHz, depending on the type of Ethernet MAC IP core.

The TOE100G IP has two Async buffers that allow the user interface to run on an independent clock with a lower frequency than the Ethernet MAC clock. However, a user clock frequency of 220 MHz or more is recommended. If the user clock is too slow, the Async buffer may become full and some packets may be lost. Transfer performance is reduced whenever the data recovery process is required.
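As a rough check of the clock recommendation, the raw bandwidth of a 512-bit bus can be compared with the nominal 100 Gb/s line rate. This is a back-of-the-envelope sketch only: it ignores packet headers and inter-packet gaps, the function names are hypothetical, and reading the 220 MHz figure as margin above the computed minimum is an assumption rather than a statement from the IP vendor.

```python
BUS_BITS = 512          # width of the user data bus
LINE_RATE_GBPS = 100    # nominal 100G Ethernet line rate


def bus_throughput_gbps(clk_mhz: float) -> float:
    """Raw bus bandwidth in Gb/s when one 512-bit word moves per clock."""
    return BUS_BITS * clk_mhz / 1000


def min_clk_mhz(line_rate_gbps: float = LINE_RATE_GBPS) -> float:
    """Lowest clock (MHz) at which the raw bus bandwidth equals the line rate."""
    return line_rate_gbps * 1000 / BUS_BITS


print(bus_throughput_gbps(220))  # 112.64
print(min_clk_mhz())             # 195.3125
```

At 220 MHz the bus carries about 112 Gb/s of raw data, which leaves some headroom above the roughly 195 MHz break-even point.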

In accordance with TCP/IP standard, connection establishment is the first step before transferring data. TOE100G IP supports both active open (the IP opens the port) and passive open (the target device opens the port) modes. After a successful connection, data can be transferred via the new connection. To send TCP payload data, the user must set the total transfer size, packet size, and send command to the IP. The TCP payload data is transferred via the TxFIFO interface. Conversely, when the TCP packet is received from the target, the TCP payload data is extracted and stored in the Rx data buffer. The user logic monitors FIFO status to detect the amount of received data and then asserts read enable to read the data via the RxFIFO interface. When there is no more data to transfer, the connection can be terminated by closing the port. TOE100G IP supports both active close (the IP closes the port) and passive close (the target device closes the port) modes.

To meet the requirements of user systems that may be sensitive to memory resources or performance, the buffer size inside the TOE100G IP can be adjusted by the user to accommodate these needs. Specifically, the sizes of the Tx data buffer and Rx data buffer can be modified with a maximum size of 1 MB. Utilizing larger buffer sizes can enhance transfer performance, but 1 MB size requires the use of the window scaling feature of TCP options, which is already implemented in the TOE100G IP. This feature is particularly useful for users who require high-speed data transfers. Further details about the hardware inside the IP are described in the next topic.

 

Functional Description

As shown in Figure 3, TOE100G IP can be divided into three parts, i.e., control block, transmit block, and receive block. The details of each block are described as follows.

Control Block

·       Reg

All parameters of the IP are set via Register interface that consists of 5-bit address signals and 32-bit data signals. The timing diagram of the Register interface is similar to a single-port RAM interface, as shown in Figure 7. The write and read address are the same signals. Table 2 provides a description of each register.

 

Table 2: Register map Definition

| RegAddr[4:0] | Reg Name | Dir | Bit | Description |
|---|---|---|---|---|
| 00000b | RST | Wr/Rd | [0] | Reset IP. 0b: No reset, 1b: Reset. Default value is 1b. Once the network parameters have been assigned, the user executes system initialization by setting this register to 1b and then 0b. This action loads the parameters into the IP and starts the initialization. To update any parameter, this process must be repeated by setting this register to 1b and then 0b again. The RST register controls the following network parameters: SML, SMH, DML, DMH, DIP, SIP, DPN, SPN, and SRV. |
| 00001b | CMD | Wr | [1:0] | User command. 00b: Send data, 10b: Open connection (active), 11b: Close connection (active), 01b: Undefined. The command operation begins after the user sets the CMD register. The system must be in the Idle state before a new operation is started by setting this register. To confirm that the system is not busy, the user should read bit[0] of the CMD register or the RegDataA1 output, which should be equal to 0b. |
| 00001b | CMD | Rd | [0] | System busy flag. 0b: Idle, 1b: IP is busy. |
| 00001b | CMD | Rd | [3:1] | Current IP status. 000b: Send data, 001b: Idle, 010b: Active open, 011b: Active close, 100b: Receive data, 101b: Initialization, 110b: Passive open, 111b: Passive close. |
| 00010b | SML | Wr/Rd | [31:0] | Define 32-bit lower MAC address (bit [31:0]) for this IP. To update this value, the IP must be reset by RST register. |
| 00011b | SMH | Wr/Rd | [15:0] | Define 16-bit upper MAC address (bit [47:32]) for this IP. To update this value, the IP must be reset by RST register. |
| 00100b | DIP | Wr/Rd | [31:0] | Define 32-bit target IP address. To update this value, the IP must be reset by RST register. |
| 00101b | SIP | Wr/Rd | [31:0] | Define 32-bit IP address for this IP. To update this value, the IP must be reset by RST register. |
| 00110b | DPN | Wr/Rd | [15:0] | Define 16-bit target port number. Unused when the port is opened in passive mode. To update this value, the IP must be reset by RST register. |
| 00111b | SPN | Wr/Rd | [15:0] | Define 16-bit port number for this IP. To update this value, the IP must be reset by RST register. |
| 01000b | TDL | Wr | [31:0] | Total Tx data length in bytes. The value must be aligned to 64 bytes because bit[5:0] are not used. Valid range is 64-0xFFFFFFC0. The user must set this register before setting CMD register = Send data (00b). When the IP executes the 'Send data' command and asserts Busy to 1b, the IP reads this register, so the user can then set the TDL register for the next command. If the same TDL is used in the subsequent command, the user is not required to set TDL again. |
| 01000b | TDL | Rd | [31:0] | Remaining transfer length in bytes that has not yet been transmitted. |

 

| RegAddr[4:0] | Reg Name | Dir | Bit | Description |
|---|---|---|---|---|
| 01001b | TMO | Wr | [31:0] | Define the timeout value for awaiting the return of an Rx packet from the target. The counter runs on the Clk signal provided by the user, so the timer unit is equal to 1/Clk. If the packet is not received within the specified time, TimerInt is asserted to 1b. For further information about TimerInt, refer to the read value of TMO[7:0]. It is recommended to set TMO to a value greater than 0x6000. |
| 01001b | TMO | Rd | [31:0] | The details of the timeout interrupt are shown in TMO[7:0]. Other bits are read for IP monitoring. |
| | | | [0] | Timeout from not receiving ARP reply packet. After timeout, the IP resends the ARP request until an ARP reply is received. |
| | | | [1] | Timeout from not receiving SYN and ACK flag during active open operation. After timeout, the IP resends the SYN packet 16 times and then sends a FIN packet to close the connection. |
| | | | [2] | Timeout from not receiving ACK flag during passive open operation. After timeout, the IP resends the SYN/ACK packet 16 times and then sends a FIN packet to close the connection. |
| | | | [3] | Timeout from not receiving FIN and ACK flag during active close operation. After the 1st timeout, the IP sends an RST packet to close the connection. |
| | | | [4] | Timeout from not receiving ACK flag during passive close operation. After timeout, the IP resends the FIN/ACK packet 16 times and then sends an RST packet to close the connection. |
| | | | [5] | Timeout from not receiving ACK flag during data transmit operation. After timeout, the IP resends the previous data packet. |
| | | | [6] | Timeout from Rx packet lost, Rx data FIFO full, or wrong sequence number. The IP generates a duplicate ACK to request data retransmission. |
| | | | [7] | Timeout from too small receive window size when running the Send data command with PSH[2] set to 1b. After timeout, the IP retransmits the data packet, similar to the TMO[5] recovery process. |
| | | | [21] | Lost flag when the sequence number of the received ACK packet is skipped. As a result, TimerInt is asserted and TMO[6] is equal to 1b. |
| | | | [22] | FIN flag is detected during sending operation. |
| | | | [23] | Rx packet is ignored due to Rx data buffer full (fatal error). |
| | | | [27] | Rx packet lost detected. |
| | | | [30] | RST flag is detected in Rx packet. |
| | | | [31], [29:28], [26:24] | Internal test status. |
| 01010b | PKL | Wr/Rd | [15:0] | TCP data length of each Tx packet in bytes. The value must be aligned to 64 bytes because bit[5:0] are not used. Valid range is 64-8960. Default value is 1408 bytes, which is the maximum size of a non-jumbo frame that is aligned to 64 bytes. The user must not set this register while the Send data command is running (Busy=1b). Similar to the TDL register, the user does not need to set the PKL register again if the next command uses the same packet length. |
| 01011b | PSH | Wr/Rd | [2:0] | Sending mode when running Send data command. |
| | | | [0] | Disable the duplicate of the last data packet. 0b: Generate a duplicate of the last data packet in the Send data command when the TDL value is not equal to N times the PKL value, to accelerate the ACK packet (default). 1b: Disable the duplicate data packet. |
| | | | [1] | PSH flag value in the TCP header for all transmitted packets. 0b: PSH flag = 0b (default). 1b: PSH flag = 1b. |

 



| RegAddr[4:0] | Reg Name | Dir | Bit | Description |
|---|---|---|---|---|
| 01011b | PSH | Wr/Rd | [2] | Enable data packet retransmission when the Send data command is paused until timeout because the receive window size is smaller than the packet size. This flag is designed to resolve the system hang problem resulting from a lost window update packet. Activating data retransmission prompts the target device to regenerate the lost window update packet. All of the following conditions must be met to initiate data retransmission. (1) PSH[2] is set to 1b. (2) The current command is 'Send data' and not all data has been sent. (3) The receive window size is smaller than the packet size. (4) The timer set by the TMO register has overflowed. 0b: Disable the feature (default), 1b: Enable the feature. |
| 01100b | WIN | Wr/Rd | [9:0] | Threshold value in 1 Kbyte units to initiate window update packet transmission. Default value is 0 (window update transmission is not enabled). The IP sends the window update packet when the free space in the Rx data buffer increases by an amount greater than the threshold value from the value in the most recently transmitted packet. For example, if the user sets WIN=1 (1 Kbyte) and the window size of the most recently transmitted packet is 2 Kbyte, when the user reads 1 Kbyte of data from the IP and the free space in the Rx data buffer is updated from 2 Kbyte to 3 Kbyte, the IP detects that the increase in window size reaches the threshold value of 1 Kbyte (3 Kbyte - 2 Kbyte). As a result, the IP sends the window update packet to update the receive buffer size. |
| 01101b | ETL | Wr | [31:0] | Extended total Tx data length in bytes. The value must be aligned to 64 bytes and bit[5:0] are not used. The user can set this register while the Send data command is operating (Busy=1b) to extend the total Tx data length. This allows continuous data transmission without sending a new command to the IP. However, there are some important considerations when using this feature. 1) The ETL register must be programmed while the read value of TDL is greater than the size of the Tx data buffer, to ensure that Busy is not de-asserted to 0b before the ETL register is set. 2) The set value of ETL must be less than the maximum value of TDL (0xFFFFFFC0) minus the read value of TDL, to avoid overflow. For example, the user sets TDL to 3.5 Gbytes and then sets the CMD register to Send data. After the IP completes 2 Gbytes of data (remaining size = 1.5 Gbytes), the user sets the ETL register to 1.5 Gbytes. The total transmit length is equal to 5 Gbytes (3.5 Gbytes of TDL + 1.5 Gbytes of ETL). |
| 01110b | SRV | Wr/Rd | [1:0] | 00b: Client mode (default). When the RST register changes from 1b to 0b, the IP sends an ARP request to obtain the Target MAC address from the ARP reply returned by the target device. The IP busy is de-asserted to 0b after receiving the ARP reply. 01b: Server mode. When the RST register changes from 1b to 0b, the IP waits for an ARP request from the target to obtain the Target MAC address. After receiving the ARP request, the IP generates an ARP reply and then de-asserts the IP busy signal to 0b. 1Xb: Fixed MAC mode. When the RST register changes from 1b to 0b, the IP updates all internal parameters and then de-asserts IP busy to 0b. The Target MAC address is loaded through the DML/DMH registers. Note: In Server mode, when the RST register changes from 1b to 0b, the target device must resend an ARP request for the TOE100G IP to complete the IP initialization process. |
| 01111b | VER | Rd | [31:0] | IP version |
| 10000b | DML | Wr/Rd | [31:0] | Define 32-bit lower target MAC address (bit [31:0]) for this IP when SRV[1]=1b (Fixed MAC). To update this value, the IP must be reset by RST register. |
| 10001b | DMH | Wr/Rd | [15:0] | Define 16-bit upper target MAC address (bit [47:32]) for this IP when SRV[1]=1b (Fixed MAC). To update this value, the IP must be reset by RST register. |
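The length rules that Table 2 gives for TDL, PKL, and ETL can be summarized in a short sketch. The function names are hypothetical. The derivation of 1408 and 8960 from MTUs of 1500 and 9000 bytes with 20-byte IPv4 and TCP headers is an assumption that happens to match the documented default and maximum, not a statement from the register description.

```python
WORD_BYTES = 64            # 512-bit data bus, bit[5:0] of lengths unused
TDL_MAX = 0xFFFFFFC0       # upper limit of TDL in Table 2
PKL_MIN, PKL_MAX = 64, 8960


def is_valid_tdl(tdl: int) -> bool:
    """TDL must be 64-byte aligned and within 64..0xFFFFFFC0."""
    return tdl % WORD_BYTES == 0 and WORD_BYTES <= tdl <= TDL_MAX


def is_valid_pkl(pkl: int) -> bool:
    """PKL must be 64-byte aligned and within 64..8960."""
    return pkl % WORD_BYTES == 0 and PKL_MIN <= pkl <= PKL_MAX


def largest_aligned_payload(mtu: int = 1500, ip_hdr: int = 20,
                            tcp_hdr: int = 20) -> int:
    """Largest 64-byte-aligned TCP payload for an assumed MTU."""
    return (mtu - ip_hdr - tcp_hdr) // WORD_BYTES * WORD_BYTES


def can_extend(etl: int, tdl_remaining: int, tx_buf_bytes: int) -> bool:
    """Check the two ETL considerations against the read value of TDL."""
    return (etl % WORD_BYTES == 0
            and tdl_remaining > tx_buf_bytes        # consideration 1
            and etl < TDL_MAX - tdl_remaining)      # consideration 2


GBYTE = 1 << 30  # binary Gbyte, chosen here for illustration
print(largest_aligned_payload())        # 1408
print(largest_aligned_payload(9000))    # 8960
print(can_extend(3 * GBYTE // 2, 3 * GBYTE // 2, 64 * 1024))  # True
```

The last call mirrors the ETL example in Table 2: 1.5 Gbytes remain and 1.5 Gbytes are added, with a 64 KB Tx data buffer assumed.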

 

 

·       TCP Stack

The TCP stack is responsible for controlling the modules involved in interfacing with the user and transferring packets via EMAC. The IP operation involves two phases - IP initialization and data transfer. After the RST register transitions from 1b to 0b, the initialization phase begins. The SRV[1:0] are used to set the initialization mode, which can be Client mode, Server mode, or Fixed MAC mode. The TCP stack reads the parameters from the Reg module and sets them in the Transmit and Receive blocks for packet transfer with the target device. Once initialization is complete, the IP enters the data transfer phase.

To transfer data between the TOE100G IP and the target device, three processes are involved: port opening, data transfer, and port closing. The IP supports active open or close by sending SYN or FIN packets when the user sets the CMD register to 10b (port opening) or 11b (port closing). Alternatively, the port can be opened or closed by the target device (passive mode) when the TCP Stack receives a SYN or FIN packet. While the port is being opened or closed, the Busy flag is asserted to 1b. Once all packets are transferred, Busy is de-asserted to 0b. The ConnOn signal can be used to check whether the port is completely opened or closed. Data can be transferred when ConnOn is asserted to 1b (indicating that the port is completely opened).

To send the data, user data is stored in the Tx data and Tx packet buffers. Packet Builder uses the network parameters set by the user to build TCP header, and then the data of Tx data buffer is appended to the TCP packet. The Tx packet is stored to Async buffer (Tx) to cross the clock domain from Clk (user’s clock) to MacTxClk (TxEMAC I/F clock), before forwarding it to the EMAC for transmission to the target device. Once the target device receives data successfully, it sends an ACK packet to the Receive block. The TCP Stack monitors the status of both the Transmit and Receive blocks to confirm that the data has been sent successfully. If the data is lost, the TCP Stack pauses the current data transmission and initiates the data retransmission process in the Transmit block.

When the Receive block receives data, TCP Stack checks the order of the received data. If the data is in the correct order, a normal ACK packet is generated by the Transmit block. Otherwise, the TCP Stack starts the lost data recovery process by instructing the Transmit block to generate duplicate ACKs to the target device.
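The retransmission and recovery events described above are reported through the read value of the TMO register (Table 2). The following sketch decodes that value into readable flag names and converts a TMO count to time. The flag strings and function names are hypothetical labels; the bit positions follow Table 2.

```python
# Bit positions of the TMO read value, taken from Table 2.
TMO_FLAGS = {
    0: "ARP reply timeout",
    1: "SYN/ACK timeout (active open)",
    2: "ACK timeout (passive open)",
    3: "FIN/ACK timeout (active close)",
    4: "ACK timeout (passive close)",
    5: "ACK timeout (data transmit)",
    6: "Rx packet lost, Rx data FIFO full, or wrong sequence number",
    7: "Receive window too small (PSH[2]=1b)",
    21: "ACK sequence number skipped",
    22: "FIN detected during sending",
    23: "Rx packet ignored, Rx data buffer full",
    27: "Rx packet lost detected",
    30: "RST detected in Rx packet",
}


def decode_tmo(value: int) -> list:
    """List the names of all documented flags set in a TMO read value."""
    return [name for bit, name in sorted(TMO_FLAGS.items())
            if (value >> bit) & 1]


def tmo_to_microseconds(tmo: int, clk_mhz: float) -> float:
    """Convert a TMO count to microseconds (timer unit is 1/Clk)."""
    return tmo / clk_mhz


print(decode_tmo((1 << 5) | (1 << 30)))
```

The internal test status bits ([31], [29:28], [26:24]) are deliberately left out of the decoder.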

 

Table 3: TxBuf/RxBufBitWidth Parameter description

| Value of BitWidth | Buffer Size |
|---|---|
| 9 | 32KByte |
| 10 | 64KByte |
| 11 | 128KByte |
| 12 | 256KByte |
| 13 | 512KByte |
| 14 | 1MByte |
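The values in Table 3 follow from treating BitWidth as the address width of a buffer made of 512-bit (64-byte) words. A minimal sketch of that relationship, with a hypothetical function name:

```python
WORD_BYTES = 64  # each buffer address holds one 512-bit word


def buffer_size_bytes(bit_width: int) -> int:
    """Buffer size for a TxBufBitWidth/RxBufBitWidth value of 9 to 14."""
    if not 9 <= bit_width <= 14:
        raise ValueError("BitWidth must be in the range 9-14")
    return (1 << bit_width) * WORD_BYTES


for bw in range(9, 15):
    print(bw, buffer_size_bytes(bw) // 1024, "KByte")
```

For example, a BitWidth of 9 gives 512 words of 64 bytes, which is 32 KByte, and a BitWidth of 14 gives 1024 KByte (1 MByte).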

 

 

Transmit Block

Transmit block contains two buffers - Tx data buffer and Tx packet buffer – whose sizes can be adjusted through parameter assignment. A larger buffer size may improve transmit performance. Data from the Tx data buffer is split into packets based on the packet size and stored in the Tx packet buffer. TCP header is constructed using the network parameters from the Reg module and then combined with the TCP data from the Tx packet buffer to form a complete TCP packet. The data in the Tx data buffer is flushed after the target device sends an ACK packet. Once the Send data command is completed, the user can initiate the next command.

·       Tx Data Buffer

The size of this buffer is determined by the “TxBufBitWidth” parameter of the IP, with valid value ranging from 9 – 14 (32KB to 1MB), which corresponds to the address size of a 512-bit buffer as shown in Table 3. This buffer stores data from the user to prepare the transmit packet sent to the target device. Data is removed from the buffer when the target device confirms that the data has been completely received. When the buffer size is large enough, the IP can send multiple data packets to the target device without waiting for an ACK packet to clear the buffer. The user can continuously store new data in the Tx data buffer without waiting for long periods. This results in the best transmit performance on a 100G Ethernet connection. However, if there is significant latency time due to the carrier, networking interface, or target system, all the data in the Tx data buffer may be transferred before an ACK packet is returned to flush the buffer. In such cases, the user must pause filling the buffer with new data, resulting in reduced transmit performance.

If the total user data is greater than the value of the TDL register, the buffer still holds the remaining data after the current Send command completes. This data can be used for the next Send command. All data in the buffer is flushed when the connection is closed or the IP is reset.

Note: The IP cannot send the packet if the data stored in the buffer is less than the transmit size. The IP must wait until the data from user is sufficient to create one packet.

·       Tx Packet Buffer

This buffer stores at least one complete packet before forwarding a packet to Async buffer (Tx).

·       Packet Builder

The TCP packet is comprised of a header and data. The Packet builder first receives network parameters from the Reg module and uses them to construct the TCP header. The TCP and IP checksum are also calculated for the header. Once the header is fully constructed, it is combined with the data from the Tx packet buffer and then transmitted to the Async buffer (Tx).

·       Async Buffer (Tx)

Async buffer (Tx) transfers packets from the Clk domain to the MacTxClk domain and includes essential logic to interface with 100G EMAC. When the Clk frequency is too low, it can result in a lower data transfer rate than EMAC interface, causing decreased performance. To avoid this issue, a Clk frequency of at least 220 MHz is recommended.

 

Receive Block

The Receive block contains the Rx data buffer, which stores the data received from the target device. The received data is stored in the buffer when the header in the packet matches the expected value, set by the network parameters inside the Reg module, and when the IP and TCP checksums are correct. If any of these conditions is not met, the received packet is rejected. Increasing the size of the Rx data buffer may improve the receive performance. Additionally, the TOE100G IP can reorder packets if only one packet is out of order. For example, if the packet order is #1, #3, #2, and #4 (where packet #2 is interchanged with packet #3), the TOE100G IP can fix the order. However, if more than one packet is out of order, such as #1, #3, #4, and #2 (where packets #3 and #4 are received before packet #2), the TOE100G IP is unable to reorder the packets. In this scenario, the data needs to be retransmitted, and duplicate ACK packets must be generated.

·       Async Buffer (Rx)

The Async buffer (Rx) forwards EMAC packets from the MacRxClk domain to the Clk domain, and includes logic for interfacing with the 100G EMAC. As with the Async buffer (Tx), a Clk frequency of at least 220 MHz is recommended. This prevents performance drops in the transmit direction and keeps the Async buffer (Rx) from becoming full and discarding packets received from the EMAC.

·       Packet Filtering

This module is responsible for verifying the header of the Rx packet to determine its validity. The packet is valid if all of the following conditions are met.

(1)   The network parameters must match the values set in the Reg module, such as the MAC address, IP address, and Port number.

(2)   The packet must either be an ARP packet or a TCP/IPv4 packet without a data fragment flag.

(3)   The IP header length and TCP header length must be valid, with the IP length being equal to 20 bytes and the TCP header length being between 20 and 60 bytes.

(4)   Both the IP checksum and TCP checksum must be correct.

(5)   The data pointer, as decoded by the sequence number, must be within a valid range.

(6)   The acknowledge number must be within a valid range.
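The filtering conditions above can be sketched as a single predicate for TCP/IPv4 packets. This is only an illustration: the class and field names are hypothetical, conditions that the text does not detail (parameter match, checksums, sequence and acknowledge ranges) are modeled as precomputed booleans, and ARP packets are assumed to be handled by a separate path.

```python
from dataclasses import dataclass


@dataclass
class RxTcpHeader:
    params_match: bool   # (1) MAC address, IP address, and port match Reg
    fragmented: bool     # (2) data fragment flag present
    ip_hdr_len: int      # (3) IP header length in bytes
    tcp_hdr_len: int     # (3) TCP header length in bytes
    ip_csum_ok: bool     # (4) IP checksum correct
    tcp_csum_ok: bool    # (4) TCP checksum correct
    seq_in_range: bool   # (5) data pointer from sequence number is valid
    ack_in_range: bool   # (6) acknowledge number is valid


def accept_tcp_packet(h: RxTcpHeader) -> bool:
    """Return True only when every listed condition holds."""
    return (h.params_match
            and not h.fragmented
            and h.ip_hdr_len == 20
            and 20 <= h.tcp_hdr_len <= 60
            and h.ip_csum_ok and h.tcp_csum_ok
            and h.seq_in_range and h.ack_in_range)


good = RxTcpHeader(True, False, 20, 32, True, True, True, True)
print(accept_tcp_packet(good))  # True
```

A packet with IP options (IP header longer than 20 bytes) would be rejected by this predicate, in line with condition (3).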

 

·       Packet Splitter

The purpose of this module is to extract TCP payload data from incoming packets and store it in the Rx data buffer, after removing the packet header.

·       Rx Data Buffer

The size of the Rx data buffer is determined by the “RxBufBitWidth” parameter of the IP and can range from 9 to 14 (32KB to 1MB). The size of the Rx data buffer is also applied as the window size of the transmit packet, using the window scaling feature. When the Rx data buffer is sufficiently large, the target device can send multiple data packets to the TOE100G IP without having to wait for an ACK packet, which may be delayed by the networking system. Consequently, a larger Rx data buffer can improve the receive performance.

The data is stored in the buffer until it is read by the user. If the user does not read the data from the buffer for a long time, the buffer becomes full and the target device can no longer send data to the IP, resulting in reduced performance. To achieve optimal receive performance, it is recommended that the user logic reads the data from the IP as soon as it is available. By doing so, the Rx data buffer does not become full, and the receive performance is not limited by a full window.

 

User Block

The core engine of the user module can be designed as a state machine that sets the commands and parameters through the Register interface. The status can also be monitored to confirm that each operation has completed without errors. The data path can be connected to FIFOs for sending or receiving data with the IP.
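As a rough model of what such a state machine does, the sketch below lists the register writes for Client mode initialization and decodes the CMD read value. The register addresses come from Table 2. The function names, the order of the parameter writes, and the example addresses are assumptions; the source only requires that the parameters are set before RST changes from 1b to 0b.

```python
# Register addresses from Table 2 (RegAddr[4:0]).
REG = {"RST": 0x00, "CMD": 0x01, "SML": 0x02, "SMH": 0x03, "DIP": 0x04,
       "SIP": 0x05, "DPN": 0x06, "SPN": 0x07, "TDL": 0x08, "TMO": 0x09,
       "PKL": 0x0A, "PSH": 0x0B, "WIN": 0x0C, "ETL": 0x0D, "SRV": 0x0E,
       "VER": 0x0F, "DML": 0x10, "DMH": 0x11}

CMD_OPEN = 0b10  # Open connection (active)


def init_writes(src_mac: int, src_ip: int, dst_ip: int,
                src_port: int, dst_port: int, srv: int = 0b00) -> list:
    """Return (address, data) writes: parameters first, then RST 1b -> 0b."""
    return [
        (REG["SML"], src_mac & 0xFFFFFFFF),      # MAC bits [31:0]
        (REG["SMH"], (src_mac >> 32) & 0xFFFF),  # MAC bits [47:32]
        (REG["DIP"], dst_ip),
        (REG["SIP"], src_ip),
        (REG["DPN"], dst_port),
        (REG["SPN"], src_port),
        (REG["SRV"], srv),
        (REG["RST"], 1),
        (REG["RST"], 0),
    ]


def is_idle(cmd_read: int) -> bool:
    """CMD read bit[0] is the busy flag; 0b means Idle."""
    return cmd_read & 1 == 0


def ip_status(cmd_read: int) -> int:
    """CMD read bits [3:1] hold the current IP status code."""
    return (cmd_read >> 1) & 0x7


writes = init_writes(0x001122334455, 0xC0A80B2A, 0xC0A80B19, 4000, 4001)
print(writes[0])  # (2, 573785173)
```

After these writes, the user logic would poll the CMD register until `is_idle` returns True and then write `CMD_OPEN` to the CMD register to open the connection actively.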

 

100G Ethernet MAC and PHY

The 100G Ethernet MAC and PHY block consists of an Ethernet MAC and a PHY that together enable 100G Ethernet operation. The user interface for connecting with the TOE100G IP is a 512-bit Avalon stream that runs at 390.625 MHz for Stratix10 MX and 402.832 MHz for Stratix10 TX or Agilex 7. The 100GBASE-R standard is used as the physical interface to connect with 100G Ethernet.

For more information about the IP, please visit the following websites.

For Stratix10 MX: Low Latency 100G Ethernet MAC

https://www.intel.com/content/www/us/en/docs/programmable/683100/21-1-19-2-0/ip-overview.html

For Agilex 7 and Stratix 10 TX: E-Tile Hard IP Intel

https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/ug/ug-s10-etile-hip-ethernet.pdf

 

Core I/O Signals

Descriptions of all parameters and I/O signals are provided in Table 4 to Table 7. The EMAC interface follows the 512-bit Avalon stream standard.

 

Table 4: Core Parameters

| Name | Value | Description |
|---|---|---|
| TxBufBitWidth | 9-14 | Sets the Tx data buffer size. The value is the address bus width of this buffer. |
| RxBufBitWidth | 9-14 | Sets the Rx data buffer size. The value is the address bus width of this buffer. |

 

Table 5: User I/O Signals (Synchronous to Clk)

| Signal | Dir | Description |
|---|---|---|
| Common Interface Signal | | |
| RstB | In | Reset IP core. Active Low. |
| Clk | In | User clock. The clock frequency must be equal to or greater than 220 MHz to maintain good performance. |
| User Interface | | |
| RegAddr[4:0] | In | Register address bus. Valid when RegWrEn=1b in Write access. |
| RegWrData[31:0] | In | Register write data bus. Valid when RegWrEn=1b. |
| RegWrEn | In | Register write enable. Valid at the same clock as RegAddr and RegWrData. |
| RegRdData[31:0] | Out | Register read data bus. Valid in the next clock after RegAddr is valid. |
| ConnOn | Out | Connection status. 1b: connection is opened, 0b: connection is closed. |
| TimerInt | Out | Timer interrupt. Asserted to 1b for 1 clock cycle when timeout is detected. More details of the interrupt status can be read from the TMO[7:0] register. |
| RegDataA1[31:0] | Out | 32-bit read value of CMD register (RegAddr=00001b). Bit[0] is the TOE100G IP busy flag. |
| RegDataA8[31:0] | Out | 32-bit read value of TDL register (RegAddr=01000b). |
| RegDataA9[31:0] | Out | 32-bit read value of TMO register (RegAddr=01001b). |
| Tx Data Buffer Interface | | |
| TCPTxFfFlush | Out | Tx data buffer within the IP is reset. Asserted to 1b when the connection is closed or the IP is reset. |
| TCPTxFfFull | Out | Asserted to 1b when Tx data buffer is full. User needs to stop writing data within 4 clock cycles after this flag is asserted to 1b. |
| TCPTxFfWrCnt[13:0] | Out | Data counter of Tx data buffer, in 512-bit units, showing the amount of data in Tx data buffer. |
| TCPTxFfWrEn | In | Write enable to Tx data buffer. Asserted to 1b to write data to Tx data buffer. |
| TCPTxFfWrData[511:0] | In | Write data to Tx data buffer. Valid when TCPTxFfWrEn=1b. |
| Rx Data Buffer Interface | | |
| TCPRxFfFlush | Out | Rx data buffer within the IP is reset. Asserted to 1b when the connection is opened. |
| TCPRxFfRdCnt[13:0] | Out | Data counter of Rx data buffer, showing the amount of received data in 512-bit units. |
| TCPRxFfLastRdCnt[5:0] | Out | Number of remaining bytes of the last data in Rx data buffer when the total amount of received data in the buffer is not aligned to a 64-byte unit. User cannot read this data until all 64 bytes are received. |
| TCPRxFfRdEmpty | Out | Asserted to 1b when Rx data buffer is empty. User needs to stop reading data immediately when this signal is asserted to 1b. |
| TCPRxFfRdEn | In | Asserted to 1b to read data from Rx data buffer. |
| TCPRxFfRdData[511:0] | Out | Data output from Rx data buffer. Valid in the next clock cycle after TCPRxFfRdEn is asserted to 1b. |

 

Table 6: Tx EMAC I/O Signals (Synchronous to MacTxClk)

| Signal | Dir | Description |
|---|---|---|
| Tx MAC Interface | | |
| MacTxClk | In | User interface clock for transmitting data to the 100G Ethernet MAC. The frequency depends on the device: 390.625 MHz for Stratix 10 devices, and 402.832 MHz for the E-tile Hard IP (Agilex 7 and Stratix 10 TX devices). |
| MacTxData[511:0] | Out | Transmitted data. Valid when MacTxValid=1b. |
| MacTxEmpty[5:0] | Out | Number of unused bytes in the final word of the frame. |
| MacTxValid | Out | Valid signal of transmitted data. |
| MacTxSOP | Out | Control signal to indicate the first word in the frame. Valid when MacTxValid=1b. |
| MacTxEOP | Out | Control signal to indicate the final word in the frame. Valid when MacTxValid=1b. |
| MacTxReady | In | Handshaking signal. Asserted to 1b when MacTxData has been accepted. |

 

Table 7: Rx EMAC I/O Signals (Synchronous to MacRxClk)

| Signal | Dir | Description |
|---|---|---|
| Rx MAC Interface | | |
| MacRxClk | In | User interface clock for receiving data from the 100G Ethernet MAC. The frequency depends on the device: 390.625 MHz for Stratix 10 devices. For the E-tile Hard IP (Agilex 7 and Stratix 10 TX devices), MacRxClk and MacTxClk share the same signal, and the frequency is 402.832 MHz. |
| MacRxData[511:0] | In | Received data. Valid when MacRxValid=1b. |
| MacRxValid | In | Valid signal of received data. |
| MacRxEOP | In | Control signal to indicate the final word in the frame. Valid when MacRxValid=1b. |
| MacRxError | In | Control signal asserted at the end of the received frame (MacRxValid=1b and MacRxEOP=1b) to indicate that the frame has a CRC error. 0b: normal packet, 1b: error packet. For Intel EMAC IP, connect with the rx_error[1] signal. |
| MacRxReady | Out | Handshaking signal. Asserted to 1b when MacRxData has been accepted. Typically, MacRxReady is always asserted to 1b. However, when the Clk frequency is too low, the available space of the Async buffer (Rx) may be insufficient to store a packet, and as a result, MacRxReady may be de-asserted to 0b after receiving the end of packet. The signal is re-asserted to 1b when the buffer has enough free space to store a packet of maximum size. |

 

 

Timing Diagram

 

IP Initialization

After the RST register value is changed from 1b to 0b, the TOE100G IP starts its initialization process. Three modes are available: Client mode (SRV=00b), Server mode (SRV=01b), and Fixed MAC mode (SRV=1Xb). The behavior of each mode is presented in the timing diagrams below.

 

 

Figure 4: IP Initialization in Client mode

 

As shown in Figure 4, in Client mode, the TOE100G IP sends an ARP request packet and waits for the ARP reply packet returned from the target device. The Target MAC address is extracted from the ARP reply packet. Upon completion, the Busy signal (bit0 of RegDataA1) is de-asserted to 0b.

 

 

Figure 5: IP Initialization in Server mode

 

As shown in Figure 5, after the reset process in Server mode is completed, the TOE100G IP waits for an ARP request packet from the target device. Upon receipt, the TOE100G IP generates an ARP reply packet. The Target MAC address is extracted from the ARP request packet. Once the ARP reply packet has been transmitted, the Busy signal is de-asserted to 0b.

 

 

Figure 6: IP Initialization in Fixed mode

 

As shown in Figure 6, after the reset process in Fixed MAC mode is completed, the TOE100G IP updates all parameters from the registers. The Target MAC address is loaded from the DML and DMH registers. Once this process is finished, the Busy signal is de-asserted to 0b.
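The three initialization modes can be summarized as a small decode of SRV[1:0]. The following is only a sketch of that decode for use in a testbench or host-side model; the function name and the returned strings are hypothetical.

```python
def init_mode(srv: int) -> tuple:
    """Decode SRV[1:0] into (mode, source of the Target MAC address)."""
    srv &= 0b11
    if srv & 0b10:
        # SRV=1Xb: bit 0 is a don't-care
        return ("fixed_mac", "DML/DMH registers")
    if srv == 0b01:
        return ("server", "ARP request from target")
    return ("client", "ARP reply from target")


if __name__ == "__main__":
    for value in range(4):
        print(format(value, "02b"), init_mode(value))
```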

 

Register Interface

The Register interface is responsible for setting and monitoring all control signals and network parameters during operation. The timing diagram of the interface is similar to that of Single-port RAM, which shares the address bus for write and read access, and has a read latency time of one clock cycle. A Register map of this interface is provided in Table 2.

As shown in Figure 7, to write to the register, the user sets RegWrEn to 1b with the valid values for RegAddr and RegWrData. Before setting RegWrEn to 1b, please confirm that RstB is de-asserted to 1b for at least 4 clock cycles. To read from the register, the user only sets RegAddr, and RegRdData becomes valid in the next clock cycle.

 

 

Figure 7: Register interface timing diagram

 

As shown in Figure 8, before the user sets the CMD register to start a new command, the Busy flag must be equal to 0b to confirm that the IP is in the Idle state. After the CMD register is set, the Busy flag is asserted to 1b. Busy is de-asserted to 0b when the command is completed.

 

 

Figure 8: CMD register timing diagram
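The shared address bus and the one-clock read latency can be modeled with a few lines of code, which may help when writing a testbench reference model. The sketch below is a toy cycle-level model, not the IP's implementation: the class and function names are hypothetical, the register contents are plain storage, and only the addresses of TDL (01000b) and TMO (01001b) and the position of the Busy flag (bit 0 of CMD) are taken from Table 5.

```python
TDL_ADDR = 0b01000   # from Table 5 (RegDataA8)
TMO_ADDR = 0b01001   # from Table 5 (RegDataA9)
BUSY_MASK = 0x1      # bit 0 of the CMD register read value


class RegisterPort:
    """Toy model: one address bus for read and write, read latency of one clock."""

    def __init__(self):
        self.regs = [0] * 32          # RegAddr[4:0] selects one of 32 locations
        self._prev_addr = None        # address presented in the previous clock

    def clock(self, addr, wr_en=False, wr_data=0):
        """Advance one clock and return RegRdData as seen during this clock.

        RegRdData reflects the address presented in the previous clock,
        or None in the very first clock, when no address has been presented yet.
        """
        rd_data = None if self._prev_addr is None else self.regs[self._prev_addr]
        if wr_en:
            self.regs[addr] = wr_data & 0xFFFFFFFF
        self._prev_addr = addr
        return rd_data


def is_busy(cmd_read_value: int) -> bool:
    """Return True when the Busy flag in a CMD read value is set."""
    return bool(cmd_read_value & BUSY_MASK)


if __name__ == "__main__":
    port = RegisterPort()
    port.clock(TDL_ADDR, wr_en=True, wr_data=4096)   # write TDL
    print(port.clock(TMO_ADDR))                      # TDL value appears one clock later
```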

 

Tx FIFO Interface

Tx FIFO interface provides two control signals for the flow control, the full flag (TCPTxFfFull) and the write data counter (TCPTxFfWrCnt). TCPTxFfWrCnt is updated two clock cycles after asserting TCPTxFfWrEn. TCPTxFfFull serves as an indicator of when the internal buffer is almost full and is asserted before it reaches its capacity. It is recommended to pause sending data within four clock cycles after TCPTxFfFull is asserted. Figure 9 shows an example timing diagram for the Tx FIFO interface.

 

 

Figure 9: Tx FIFO interface timing diagram

 

(1)   Before asserting TCPTxFfWrEn to 1b to write the data to TOE100G IP, the full flag (TCPTxFfFull) must not be asserted to 1b and ConnOn must be equal to 1b. To write the data, assert TCPTxFfWrEn to 1b along with TCPTxFfWrData.

(2)   If TCPTxFfFull is asserted to 1b, TCPTxFfWrEn must be de-asserted to 0b within four clock cycles to pause sending data.

(3)   When there is no more data to transfer, the connection may be terminated by an active or a passive close. After the port is closed, the following events occur.

i)       ConnOn changes from 1b to 0b.

ii)     TCPTxFfFlush is asserted to 1b for a period to flush all data inside the Tx FIFO and is then de-asserted to 0b.

iii)    TCPTxFfWrCnt is reset to 0.

iv)    TCPTxFfFull is asserted to 1b to block the new user data and then de-asserted to 0b, similar to TCPTxFfFlush.
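The write-side flow control described above can be sketched as a simple per-cycle rule. The example below is a conservative model that reacts to ConnOn and TCPTxFfFull one clock after they change, which stays well inside the four-clock limit. The function name, the trace representation, and the one-clock reaction time are assumptions made for illustration.

```python
def tx_write_enables(conn_on, full, data_ready):
    """Return the per-cycle TCPTxFfWrEn values for the given input traces.

    Each argument is a list with one boolean per clock cycle. The flags are
    sampled through a register, so the decision in cycle t uses the values
    of ConnOn and TCPTxFfFull from cycle t-1.
    """
    wr_en = []
    prev_conn, prev_full = False, True   # safe values before the first sample
    for conn, is_full, ready in zip(conn_on, full, data_ready):
        wr_en.append(bool(ready and prev_conn and not prev_full))
        prev_conn, prev_full = conn, is_full
    return wr_en


if __name__ == "__main__":
    conn = [True] * 5
    full = [False, False, True, True, False]
    ready = [True] * 5
    print(tx_write_enables(conn, full, ready))
```

In this trace the writer stops one clock after TCPTxFfFull rises and resumes one clock after it falls.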

 

Rx FIFO Interface

The Rx FIFO interface is used to retrieve data stored in the Rx data buffer. To determine if data is available for reading, the Empty flag (TCPRxFfEmpty) is monitored, and the read enable signal (TCPRxFfRdEn) is then asserted to access the data, like a typical FIFO read interface, as illustrated in Figure 10.

 

 

Figure 10: Rx FIFO interface timing diagram by using Empty flag

 

(1)   Check the TCPRxFfEmpty flag to confirm data availability. When data is ready (TCPRxFfEmpty=0b), set TCPRxFfRdEn to 1b to read data from the Rx data buffer.

(2)   The TCPRxFfRdData signal is valid in the next clock cycle.

(3)   Reading data must be immediately paused by setting TCPRxFfRdEn=0b when TCPRxFfEmpty is equal to 1b.

(4)   The user must read all data from the Rx data buffer before creating a new connection. When a new connection is established, all data in the Rx data buffer is flushed, and TCPRxFfFlush is set to 1b. Once the new connection is completed, the ConnOn value changes from 0b to 1b.

(5)   After finishing the Flush operation, TCPRxFfEmpty is asserted to 1b.

 

 

Figure 11: Rx FIFO interface timing diagram by using read counter

 

When the user logic reads data in burst mode, the TOE100G IP provides a read data counter signal (TCPRxFfRdCnt) that indicates the total amount of data stored in the Rx data buffer in 512-bit units. For instance, in Figure 11, there are five units of data available in the Rx data buffer. Therefore, the user can set TCPRxFfRdEn to 1b for five clock cycles to read all the data from the Rx data buffer. TCPRxFfRdCnt is updated two clock cycles after TCPRxFfRdEn is set to 1b.
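A burst read based on the read counter can be sketched as follows. Because the counter lags the read enable by two clock cycles, this sketch samples the counter once before the burst and reads exactly that many words rather than re-sampling it during the burst. The FIFO is modeled with a Python deque, and the function names are hypothetical. The second helper assumes that TCPRxFfLastRdCnt is the number of bytes waiting in the incomplete last word.

```python
from collections import deque

WORD_BYTES = 64  # one 512-bit word


def burst_read(fifo: deque, rd_cnt_snapshot: int) -> list:
    """Read as many words as the counter reported when it was sampled."""
    count = min(rd_cnt_snapshot, len(fifo))
    return [fifo.popleft() for _ in range(count)]


def readable_and_pending_bytes(rd_cnt: int, last_rd_cnt: int) -> tuple:
    """Return (bytes readable now, bytes still waiting for a full 64-byte word)."""
    return rd_cnt * WORD_BYTES, last_rd_cnt


if __name__ == "__main__":
    fifo = deque(range(5))                     # five words, as in Figure 11
    print(burst_read(fifo, 5))
    print(readable_and_pending_bytes(5, 10))
```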

 

EMAC Interface

The EMAC interface of the TOE100G IP uses a 512-bit Avalon-stream interface to transmit packets. When sending a packet, the TOE100G IP sets the MacTxValid signal to 1b and drives the associated signals (MacTxSOP, MacTxData, MacTxEmpty, and MacTxEOP) with valid values. During data transmission, the target EMAC can temporarily pause the transfer by de-asserting MacTxReady to 0b when it is not ready to accept data. Figure 12 provides additional details about the EMAC interface in the Transmit direction.

 

 

Figure 12: Transmit EMAC interface timing diagram

 

(1)   To initiate packet transmission, the TOE100G IP asserts MacTxSOP and MacTxValid to 1b along with the first data on MacTxData. The remaining data is transferred without asserting MacTxSOP.

(2)   During packet transmission, if the target EMAC is not ready to receive data, MacTxReady is de-asserted to 0b. In such cases, the TOE100G IP holds the same value of all signals until MacTxReady is re-asserted to 1b.

(3)   Upon transmission of the final data of the packet, both MacTxEOP and MacTxValid are asserted to 1b. According to the EMAC specification, MacTxValid must always remain asserted to 1b during packet transmission and cannot be de-asserted to 0b before the end of the packet transmission.
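To make the roles of MacTxSOP, MacTxEOP, and MacTxEmpty concrete, the sketch below splits a frame into 64-byte words and computes the control values for each word. It is an illustration only: the function name and the dictionary layout are hypothetical, the final word is padded with zeros as an assumption, and the byte ordering on MacTxData is not modeled.

```python
WORD_BYTES = 64  # 512-bit data bus


def frame_to_beats(frame: bytes) -> list:
    """Split a frame into bus words with SOP, EOP, and Empty values."""
    if not frame:
        raise ValueError("frame must not be empty")
    beats = []
    for offset in range(0, len(frame), WORD_BYTES):
        chunk = frame[offset:offset + WORD_BYTES]
        is_last = offset + WORD_BYTES >= len(frame)
        beats.append({
            "sop": offset == 0,
            "eop": is_last,
            # Empty counts the unused bytes and is meaningful on the final word only
            "empty": WORD_BYTES - len(chunk) if is_last else 0,
            "data": chunk.ljust(WORD_BYTES, b"\x00"),
        })
    return beats


if __name__ == "__main__":
    for beat in frame_to_beats(bytes(130)):
        print(beat["sop"], beat["eop"], beat["empty"])
```

A 130-byte frame occupies three words, and the final word carries 2 valid bytes, so Empty is 62.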

 

The Receive EMAC interface of the TOE100G IP can handle a discontinuous data stream within a packet, similar to the Transmit EMAC interface. The behavior of MacRxReady depends on the frequency of the Clk signal. When the Clk frequency is equal to or greater than the recommended value (220 MHz) and the data stream is not continuously transferred using a small packet size (less than 1408 bytes), MacRxReady is always asserted to 1b for receiving data from the EMAC, as shown in Figure 13. However, if the Clk frequency is too low and packets are continuously transferred, MacRxReady may be de-asserted to 0b after receiving the final data of a packet. This occurs when there is insufficient free space in the Async buffer (Rx) to store a packet of the maximum size, as shown in Figure 14.

 

 

Figure 13: Receive EMAC interface timing diagram (Normal)

 

(1)   The TOE100G IP detects the new received packet when MacRxValid changes from 0b to 1b. In the same clock cycle, the first data is valid on MacRxData, and EMAC keeps asserting MacRxValid to 1b for the continuous transfer of the data packet.

(2)   During the transfer of a data packet, MacRxValid can be de-asserted to 0b to pause the data transfer. The data transfer resumes when MacRxValid is re-asserted to 1b.

(3)   The end of the packet is detected when both MacRxEOP and MacRxValid are asserted to 1b. In this cycle, the final data of the packet is valid on MacRxData.

(4)   In the normal case, MacRxReady is always asserted to 1b because the TOE100G IP can process all packets in time.

(5)   After the final data of a packet has been transferred, the EMAC may assert MacRxValid to 1b to transfer the first data of the next packet.

 

 

Figure 14: Receive EMAC interface timing diagram (Data lost)

 

(1)   To ensure that all data in a packet is received from the EMAC, the TOE100G IP always keeps MacRxReady asserted to 1b while a packet is being transferred. This signal is de-asserted to 0b only after the final data of the packet has been completely transferred.

(2)   If the frequency of the Clk signal is too low and the packet is continuously transferred, it may cause the Async buffer (Rx) inside the TOE100G IP to not have enough free space to store the next packet. MacRxReady is de-asserted to 0b after receiving the final data of the packet (when MacRxEOP=1b and MacRxValid=1b).

(3)   If MacRxReady is de-asserted to 0b, the TOE100G IP discards any incoming packets.

(4)   After the Async buffer (Rx) has enough free space, MacRxReady is re-asserted to 1b to indicate that the TOE100G IP is ready to receive and process the next packet from the EMAC.
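The packet-level behavior in Figure 14 can be approximated with a small model in which MacRxReady is re-evaluated only between packets and any packet that arrives while it is 0b is discarded. This is a simplified sketch: the function name, the event format, and the buffer capacity and maximum packet size used in the example are hypothetical values, not parameters of the IP.

```python
def receive_packets(events, capacity, max_packet):
    """Model packet acceptance based on free space in the Async buffer (Rx).

    events is a list of ("pkt", size_in_bytes) or ("drain", bytes_removed).
    Returns (accepted_sizes, dropped_sizes).
    """
    used = 0
    ready = True            # MacRxReady, changed only at packet boundaries
    accepted, dropped = [], []
    for kind, amount in events:
        if kind == "drain":
            used = max(0, used - amount)
        elif ready:
            used += amount  # a packet that has started is always fully received
            accepted.append(amount)
        else:
            dropped.append(amount)
        # Ready requires room for one packet of the maximum size
        ready = (capacity - used) >= max_packet
    return accepted, dropped


if __name__ == "__main__":
    trace = [("pkt", 1500), ("pkt", 1500), ("pkt", 1500),
             ("drain", 2000), ("pkt", 1500)]
    print(receive_packets(trace, capacity=4096, max_packet=1536))
```

In this trace the third packet is discarded because only 1096 bytes are free after the second packet, and reception resumes once the user side has drained the buffer.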

 

 

Example usage

 

Client mode (SRV[1:0]=00b)

The steps to set the registers for transferring data in Client mode are outlined below.

1)     Set RST register=1b to reset the IP.

2)     Set SML/SMH for MAC address, DIP/SIP for IP address, and DPN/SPN for port number.

Note: DPN is optional setting when the port is opened by IP (Active open).

3)     Set RST register=0b to start the IP initialization process. The TOE100G IP will send an ARP request packet to get the Target MAC address from the ARP reply packet. The Busy signal is de-asserted to 0b after completing the initialization process.

4)     The new connection can be created by two modes.

a.     Active open: Write CMD register = “Open connection” to create the connection (the TOE100G IP sends the SYN packet first). After that, wait until the Busy flag is de-asserted to 0b.

b.     Passive open: Wait until the “ConnOn” signal = 1b (the target device sends the SYN packet to the TOE100G IP first).

5)     a. For sending data, set TDL register (total transmit length) and PKL register (packet size). Then, set CMD register = “Send Data” to start data transmission. The user can send the data to TOE100G IP via the TxFIFO interface before or after setting the CMD register. Once the command is finished, the Busy flag is de-asserted to 0b. The user can set a new value to the TDL/PKL register and then set CMD register = “Send Data” to start the next transmission.

b. For receiving data, the user should monitor RxFIFO status and read the data until RxFIFO is empty.

6)     Similar to creating the connection, the connection can be terminated by two modes.

a.     Active close: Set CMD register = “Close connection” to close the connection (the TOE100G IP sends the FIN packet first). After that, wait until the Busy flag is de-asserted to 0b.

b.     Passive close: Wait until the “ConnOn” signal = 0b (the target sends the FIN packet to the TOE100G IP first).
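The register sequence above, using active open and active close, can be sketched as host-side pseudologic. The register names come from the steps above. Everything else is hypothetical and exists only for illustration: the RecordingBus class, its methods, the use of strings for the CMD values, and the parameter values in the example. A real design would use the register addresses and command encodings from the register map and would poll the Busy flag where this sketch calls wait_not_busy.

```python
class RecordingBus:
    """Hypothetical stand-in for the Register interface; records writes by name."""

    def __init__(self):
        self.log = []

    def write(self, reg, value):
        self.log.append((reg, value))

    def wait_not_busy(self):
        # Placeholder: real logic would poll bit 0 of the CMD register
        self.log.append(("WAIT_BUSY_0", None))


def client_send(bus, cfg, tdl, pkl):
    """Run steps 1 to 6 of Client mode with active open and active close."""
    bus.write("RST", 1)                                   # step 1
    for reg in ("SML", "SMH", "DIP", "SIP", "DPN", "SPN"):
        bus.write(reg, cfg[reg])                          # step 2
    bus.write("RST", 0)                                   # step 3
    bus.wait_not_busy()
    bus.write("CMD", "Open connection")                   # step 4a
    bus.wait_not_busy()
    bus.write("TDL", tdl)                                 # step 5a
    bus.write("PKL", pkl)
    bus.write("CMD", "Send Data")
    bus.wait_not_busy()
    bus.write("CMD", "Close connection")                  # step 6a
    bus.wait_not_busy()


if __name__ == "__main__":
    bus = RecordingBus()
    example_cfg = {"SML": 0x1, "SMH": 0x2, "DIP": 0x3, "SIP": 0x4,
                   "DPN": 4000, "SPN": 4001}              # illustrative values only
    client_send(bus, example_cfg, tdl=35840, pkl=8960)
    for entry in bus.log:
        print(entry)
```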

 

Server mode (SRV[1:0]=01b)

In Server mode, the Target MAC address is extracted from the ARP request packet rather than from the ARP reply packet as in Client mode. However, the process for transferring data is the same as that of Client mode. The following steps illustrate an example of Server mode.

1)     Set RST register=1b to reset the IP.

2)     Set SML/SMH for MAC address, DIP/SIP for IP address, and DPN/SPN for port number.

3)     Set RST register=0b to begin the IP initialization process by waiting for an ARP request packet to get the Target MAC address. The IP then creates an ARP reply packet to return to the target device. Once the initialization process is completed, the Busy signal is de-asserted to 0b.

4)     The remaining steps are the same as steps 4 to 6 of Client mode.

 

Fixed MAC mode (SRV[1] = 1b)

In Fixed MAC mode, the MAC address of the target device is loaded from the DML and DMH registers. The process for transferring data is the same as that of Client and Server mode. The following steps provide an example of how to run the TOE100G IP in Fixed MAC mode.

1)     Set RST register=1b to reset the IP.

2)     Set SML/SMH for MAC address of TOE100G IP, DML/DMH for MAC address of the target device, DIP/SIP for IP address, and DPN/SPN for port number.

3)     Set RST register=0b to begin the IP initialization process. Once initialization is completed, the busy signal will be de-asserted to 0b.

4)     The remaining steps are the same as steps 4 to 6 of Client mode.

 

 

PKL and TDL setting in Send command

When executing the Send command, the TOE100G IP can operate in two modes based on the value of TDL compared to N times of PKL. The details for each mode are described as follows.

 

TDL = N times of PKL

 

Figure 15: TCP packet when TDL = N times of PKL

 

If the TDL value is equal to N times the PKL value, the user data is split into N packets and transmitted to the target device, as shown in Figure 15. If the target device responds with an ACK packet for each TCP packet, there will be N ACK packets in the network system. To improve network performance, several ACK packets can be combined into one packet using the TCP delayed ACK technique. Therefore, the number of ACK packets returned from the target device (M) may be less than the number of data packets from the TOE100G IP (N) when running the Send command. The PSH[0] value does not affect this condition. The last data packet (TCP Data#N) is sent only once.

 

TDL = N times of PKL + Residue

 

Figure 16: TCP packet when TDL = (N times of PKL) + Residue

 

If the TDL value is not equal to N times the PKL value, the data sent to the target device is split into N packets of PKL-byte data and one last packet that contains Res-byte data, as shown in Figure 16. The first step is similar to the condition where TDL is equal to N times PKL. The IP needs to receive an ACK packet from the target device to confirm that all N packets have been received completely. After that, the last packet, which contains the residue data, is sent to the target device. If the PSH[0] register is set to 0b (default value), the residue packet is sent twice. Otherwise, the last packet is sent only once. The Send command is completed when the target returns an ACK to confirm that the last packet has been received.

Note: If the target device runs an OS that enables the delayed ACK feature, the ACK#M packet, which confirms the acceptance of TCP Data#N, may arrive late in some conditions because of the delayed ACK timeout. Therefore, in systems that are sensitive to this latency, either disable the delayed ACK feature on the target device or align the TDL value to the PKL value.
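The two cases above follow directly from an integer division of TDL by PKL. The sketch below computes N and the residue, and counts the data packets sent for one Send command, including the duplicate residue packet when PSH[0]=0b. The function names are hypothetical, and retransmissions caused by packet loss or timeout are not counted.

```python
def split_transfer(tdl: int, pkl: int) -> tuple:
    """Return (N, Res): the number of PKL-byte packets and the residue in bytes."""
    if tdl <= 0 or pkl <= 0:
        raise ValueError("TDL and PKL must be positive")
    return divmod(tdl, pkl)


def data_packets_sent(tdl: int, pkl: int, psh0: int) -> int:
    """Count data packets for one Send command, excluding retransmissions."""
    full_packets, residue = split_transfer(tdl, pkl)
    if residue == 0:
        return full_packets
    # With PSH[0]=0b the residue packet is sent twice, otherwise once
    return full_packets + (2 if psh0 == 0 else 1)


if __name__ == "__main__":
    print(split_transfer(35840, 8960))        # TDL aligned to PKL
    print(split_transfer(36000, 8960))        # TDL with residue
    print(data_packets_sent(36000, 8960, 0))
```

For example, TDL=36000 with PKL=8960 gives N=4 and a 160-byte residue, while TDL=35840 gives N=4 with no residue packet.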

 

Connection termination of unusual case

 

Figure 17: Terminate connection sequence

 

The process of terminating a connection in the normal case is illustrated in Figure 17, where four packets are exchanged between two devices. The first device (Device#0) initiates the connection termination by sending a FIN packet. If the second device (Device#1) agrees to terminate the connection, it responds with an ACK and FIN packet, which may be sent together in one packet or in separate packets. Finally, Device#0 confirms the termination by sending an ACK packet. The TOE100G IP can execute the close connection in two modes, Active and Passive. This section describes the operation of TOE100G IP in some unusual cases.

  1. In the Active mode, the TOE100G IP sends a FIN packet to initiate the close and expects to receive ACK and FIN packets from the target. Assuming that the FIN packet carries sequence number (SeqNum) N and acknowledgment number (AckNum) M, the expected ACK and FIN packet must contain SeqNum=M and AckNum=N+1. If the TOE100G IP does not receive the expected packets before the timeout (set by the TMO register), it sends a RST packet to terminate the connection immediately, without retrying 16 times. The TOE100G IP also asserts TimerInt and TMO[3] to 1b.
  2. If the TOE100G IP receives new data from the target while executing the Active close command, it rejects the data and continues to wait for the expected ACK and FIN packets. As in the first case, if the expected packets are not received before the timeout, the TOE100G IP sends the RST packet to terminate the connection.
  3. In the Passive mode, the TOE100G IP may receive a FIN packet from the target to terminate the connection while it is still transmitting data to the target. The TOE100G IP sends an ACK and FIN packet in response, with SeqNum set to the most recently confirmed data acceptance value. After the connection is terminated, the ConnOn and Busy outputs are set to 0b. The user can check the amount of untransmitted data in the TDL register.
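The check applied to the reply in the first case can be written as a short predicate. This sketch uses the SeqNum=M and AckNum=N+1 relationship given above and applies the standard 32-bit wrap-around of TCP sequence numbers. The function and parameter names are hypothetical, and the IP's internal logic may differ.

```python
SEQ_MODULO = 1 << 32  # TCP sequence and acknowledgment numbers are 32 bits wide


def is_expected_fin_ack(sent_seq: int, sent_ack: int,
                        rx_seq: int, rx_ack: int) -> bool:
    """Check a received ACK/FIN against the FIN sent with SeqNum=N and AckNum=M."""
    expected_seq = sent_ack                      # SeqNum must equal M
    expected_ack = (sent_seq + 1) % SEQ_MODULO   # AckNum must equal N+1
    return rx_seq == expected_seq and rx_ack == expected_ack


if __name__ == "__main__":
    print(is_expected_fin_ack(1000, 5000, 5000, 1001))
    print(is_expected_fin_ack(1000, 5000, 5000, 1000))
```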

 

 

Verification Methods

The TOE100G IP Core functionality was verified by simulation and also proven on real board designs using the Stratix10 MX, Stratix10 TX, and Agilex F-series FPGA development kits.

 

Recommended Design Experience

Users must be familiar with HDL design methodology to integrate this IP into their design.

 

Ordering Information

This product is available directly from Design Gateway Co., Ltd. Please contact Design Gateway Co., Ltd. for pricing and additional information about this product using the contact information on the front page of this datasheet.

 

Revision History

 

| Revision | Date | Description |
|---|---|---|
| 2.0 | 11-May-2023 | Support TCP window scaling feature, add more selectable buffer sizes, add TCPTxFfWrCnt signal, update TCPRxFfRdCnt, and add Connection termination of unusual case section. |
| 1.1 | 7-Mar-2022 | Add Stratix10 TX device support. |
| 1.0 | 19-Apr-2021 | New release. |