NVMeTCP IP Core for 10G Datasheet

Features
Applications
General Description
Functional Description
NVMe/TCP
·      Register File
·      Admin Command Handler
·      IO Command Handler
·      Read buffer
TCP/IP
·      Admin TCP/IP Controller
·      IO TCP/IP Controller
User Logic
10G Ethernet System (MAC+PHY)
Core I/O Signals
Timing Diagram
Reset process
Connection Establishment
Write command
Read command
Mixed Write-Read commands
Connection Termination
Error
EMAC Interface
Verification Methods
Recommended Design Experience
Ordering Information
Revision History

 

Core Facts

Provided with Core

| Item | Details |
|---|---|
| Documentation | Reference Design Manual, Demo Instruction Manual |
| Design File Formats | Encrypted File |
| Instantiation Templates | VHDL |
| Reference Designs & Application Notes | Vivado Project, See Reference Design Manual |
| Additional Items | Demo on KCU105/ZCU102/ZCU106/VCK190 |

Support

Support Provided by Design Gateway Co., Ltd.

 

Design Gateway Co.,Ltd

E-mail:    ip-sales@design-gateway.com

URL:       design-gateway.com

Features

·     Protocol Support: Implement an NVMe/TCP host controller (Initiator) based on NVMe-oF specification rev 1.1 and NVMe specification rev 1.4

·     Target Access: Enable access to an NVMe SSD at the target (Subsystem) using a specified NVMe name (NQN)

·     Command Support: Write and Read commands

·     Performance: High performance with speeds up to 1200 MB/s for both Write and Read (1MB buffer)

·     Data Interface: Memory-mapped interface

·     Data Size per Command: Support a fixed data size of 4 KB

·     Maximum Command: Up to 256 commands, limited by Read buffer size for the Read command

·     Read Buffer Size: Configurable from 32 KB (8 commands) to 1 MB (256 commands)

·     Target Compatibility: Compatible with NVMe/TCP targets that meet the following criteria:

      - I/O Queue Command Capsule Supported Size (IOCCSZ): At least 260 (104h)

      - Maximum Queue Entries Supported (MQES): At least 511 (1FFh)

      - Maximum Outstanding Commands (MAXCMD): At least 256 (100h)

      - Controller PDU Data

      - Authentication: No authentication required

·     Networking: 10G Ethernet speed in the same network domain for ARP packet transfer, utilizing jumbo frame packets

·     Ethernet MAC Interface: 64-bit AXI4 Stream interface operating at 156.25 MHz

·     User clock frequency: 156.25 MHz, synchronized with the EMAC interface

·     Reference Design: Available for KCU105, ZCU102, ZCU106, and VCK190 boards

·     Customization Options:

      - Access an NVMe/TCP target across different network domains without requiring ARP packet transfers

      - Support Ethernet packet transfer with non-jumbo frame sizes

 

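Table 1: Example Implementation Statistics (UltraScale/UltraScale+)

| Family | Example Device | Read BufSize | Fmax (MHz) | CLB Regs | CLB LUTs | CLB | BRAM Tile | URAM | Design Tools |
|---|---|---|---|---|---|---|---|---|---|
| Kintex-UltraScale | XCKU040FFVA1156-2E | 32KB (min) | 156.25 | 13164 | 12485 | 2548 | 46.5 | - | Vivado2023.2 |
| Kintex-UltraScale | XCKU040FFVA1156-2E | 1MB (max) | 156.25 | 13280 | 12737 | 2473 | 294.5 | - | Vivado2023.2 |
| Zynq-Ultrascale+ | XCZU9EG-FFVB1156-2E | 32KB (min) | 156.25 | 13164 | 12449 | 2473 | 46.5 | - | Vivado2023.2 |
| Zynq-Ultrascale+ | XCZU9EG-FFVB1156-2E | 1MB (max) | 156.25 | 13280 | 12706 | 2554 | 294.5 | - | Vivado2023.2 |
| Zynq-Ultrascale+ | XCZU7EV-FFVC1156-2E | 32KB (min) | 156.25 | 13192 | 12440 | 2559 | 4 | 6 | Vivado2023.2 |
| Zynq-Ultrascale+ | XCZU7EV-FFVC1156-2E | 1MB (max) | 156.25 | 13309 | 12669 | 2459 | 4 | 37 | Vivado2023.2 |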

 

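Table 2: Example Implementation Statistics (Versal)

| Family | Example Device | Read BufSize | Fmax (MHz) | CLB Regs | CLB LUTs | Slice | BRAM Tile | URAM | Design Tools |
|---|---|---|---|---|---|---|---|---|---|
| Versal AI Core | XCVC1902-VSVA2197-2MP-ES | 32KB (min) | 156.25 | 13283 | 11770 | 2695 | 4 | 6 | Vivado2023.2 |
| Versal AI Core | XCVC1902-VSVA2197-2MP-ES | 1MB (max) | 156.25 | 13397 | 11876 | 2470 | 4 | 37 | Vivado2023.2 |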

Note: Actual logic resource usage depends on the percentage of unrelated logic in the design.

 

Applications

 

Figure 1: NVMeTCP IP for 10G Application

 

The NVMeTCP10G IP core is designed to be integrated into FPGA platforms, serving as a host controller that enhances data transfer capabilities. It is adept at facilitating rapid data movement from sensors and other data-generating devices to server-based storage systems through a 10G Ethernet network leveraging the NVMe/TCP protocol. This capability is particularly vital in environments where real-time data acquisition and storage are crucial.

Moreover, the NVMeTCP10G IP core supports simultaneous data transfer from multiple host systems to a single storage server, as illustrated in Figure 1. This multiplexing ability streamlines the data transfer process and improves operational efficiency.

The NVMeTCP10G IP uses the NVMe Qualified Name (NQN) to identify SSDs, which ensures that data is routed to the intended storage destination. However, switching between target SSDs requires the current NVMeTCP10G connection to be terminated before a new one is established.

 

General Description

 

Figure 2: NVMeTCP IP for 10G Block Diagram

 

The NVMeTCP10G IP plays a crucial role as a host controller, also referred to as an NVMe/TCP initiator, enabling access to an SSD within an NVMe/TCP target through a 10G Ethernet connection. Its user interface includes three primary interfaces: the Parameters interface, the Control interface, and the Memory Map interface. Each interface has a distinct function: the Parameters interface configures network settings and timeout durations for establishing connection; the Control interface manages and tracks the operational status and errors, overseeing connection establishment and termination process; and the Memory Map interface facilitates data transfer, managing address allocation for Write and Read operations.

Internally, the NVMeTCP10G IP establishes two crucial TCP connections to communicate with the NVMe/TCP target. The Admin connection is responsible for the setup and termination of the connections and maintaining the link with Keep Alive commands. On the other hand, the IO connection processes Write and Read commands. Each connection is managed by separate TCP/IP Controllers and Command Handlers, ensuring dedicated and efficient control. The EMAC IF acts as a multiplexer, channeling Ethernet packets from both connections through the same EMAC.

The IP’s Register File interface is another component, receiving user inputs to initiate the connection process, accompanied by a Read buffer that reorders incoming data from Read commands to maintain the same sequence of command requests. The IO Command Handler can support up to 256 Write/Read commands, with the maximum number of commands queued being limited by the Read buffer’s size, which is a trade-off between read performance and resource consumption.

Operating within a single clock domain at 156.25 MHz, the NVMeTCP10G IP is tailored for 10G Ethernet speeds, aligning with the common user clock domain of 10G Ethernet systems found in most FPGA devices. Special consideration is given to specific FPGA models like the Versal, which employs a Multirate Ethernet MAC (MRMAC) operating on a different clock domain and data bus, 322.265 MHz at 32 bits. For such cases, a MAC Adapter is introduced to ensure seamless data transfer across clock domains and data bus widths.
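As a quick arithmetic check of the two bus configurations mentioned above, the raw bandwidth of a data bus is its clock frequency multiplied by its width. The sketch below is illustrative only: the function name is hypothetical, and the clock values are simply the ones quoted in this datasheet.

```python
def bus_gbps(clock_mhz: float, width_bits: int) -> float:
    """Raw data-bus bandwidth in Gb/s: clock frequency (MHz) x bus width (bits)."""
    return clock_mhz * width_bits / 1000.0


ip_bus = bus_gbps(156.25, 64)      # NVMeTCP10G IP side: 64-bit at 156.25 MHz
mrmac_bus = bus_gbps(322.265, 32)  # Versal MRMAC side: 32-bit at 322.265 MHz

print(f"IP bus:    {ip_bus:.2f} Gb/s")
print(f"MRMAC bus: {mrmac_bus:.2f} Gb/s")
```

Both buses carry roughly 10 Gb/s but differ in clock frequency and width, which is the gap the MAC Adapter bridges.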

To assist potential users in evaluating the NVMeTCP10G IP’s capabilities, reference designs are provided for various FPGA evaluation boards, enabling an evaluation before making a purchasing decision.

 

Functional Description

The operational sequence of the NVMeTCP10G IP post-reset is presented in Figure 3, illustrating two main phases: ‘No connection’ and ‘Connection ready’. Initially, users activate the connection, waiting for its successful establishment, a process that includes setting up two TCP ports. Once these ports are successfully initialized, the IP transitions from the ‘No connection’ phase to ‘Connection ready’ phase.

During the ‘Connection ready’ phase, the IP manages two types of commands: Admin and IO. The Admin command type, which includes the Keep Alive command, operates continuously in the background. In contrast, the IO command type, encompassing Write and Read commands, allows user execution. Users can initiate multiple Write and Read commands, subject to a limit of 256 commands or until the Read buffer is full.

If there is no further data to transmit, users can terminate the connection. This process moves the IP back to the ‘No connection’ phase once the termination is successfully completed.

 

 

Figure 3: NVMeTCP10G IP Operation Flow

 

1)     The IP awaits the establishment of the Ethernet link. Once established, the Ethernet system is ready for packet transfer.

2)     User sets HostConnEn to 1b, initiating the connection establishment. This transition from 0b to 1b for HostConnEn activates the process.

3)     The Admin port, forming the initial communication pathway with the target, is established. With the Admin TCP connection ready, NVMe/TCP Protocol Data Units (PDUs) are transferred between the NVMeTCP10G IP and the target. Following a successful response from the target acknowledging the connection, the IO port is set up as the connection for Write/Read command execution. Completion of this step signifies the IP’s readiness, transitioning to the ‘Connection ready’ phase.

Note: The NVMe Qualified Names (NQNs) exchanged between the host and the target must be unique values, set via HostNQN and TrgNQN (NVMeTCP10G IP inputs/parameters).

 

4)     In ‘Connection ready’ phase, the Keep Alive command is dispatched at user-defined intervals. If a timeout occurs without a response, an error is asserted, prompting the termination of both Admin and IO ports.

5)     To deactivate, users switch HostConnEn to 0b when the IP is idle (HostBusy is 0b), triggering the termination process for both ports and transitioning the IP to ‘No connection’ status.

6)     In the ‘Connection ready’ phase, initiating a Write command involves setting HostMMWrite to 1b, followed by data transmission. After this, step 8) checks if the IO command queue has sufficient space for more commands.

7)     Similarly, initiating a Read command with HostMMRead set to 1b requires checking the Read buffer’s remaining capacity. If space is insufficient, the process waits until more becomes available.

8)     The remaining transfer size is monitored. If no more data needs transferring, the IP switches to Idle. Otherwise, it ensures the IO command queue does not exceed 256 commands before continuing with additional Write or Read commands. If the queue is full, the process pauses until space frees up.
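The admission checks in steps 6) to 8) can be sketched as a small behavioral model. This is a simplified illustration, not the IP's internal logic: the class and method names are hypothetical, and it assumes that each outstanding Read command occupies one 4 KB Read buffer slot until it completes.

```python
from dataclasses import dataclass

MAX_IO_COMMANDS = 256  # IO command queue limit stated in this datasheet


@dataclass
class IssueModel:
    """Hypothetical model of when a new Write/Read command can be accepted."""
    rd_buf_slots: int          # 8..256 slots, determined by RdBufDepth
    outstanding: int = 0       # Write + Read commands currently in the IO queue
    reads_in_buffer: int = 0   # Read commands holding a Read buffer slot

    def can_issue(self, is_read: bool) -> bool:
        if self.outstanding >= MAX_IO_COMMANDS:
            return False       # step 8: IO command queue is full
        if is_read and self.reads_in_buffer >= self.rd_buf_slots:
            return False       # step 7: Read buffer has no free space
        return True

    def issue(self, is_read: bool) -> bool:
        if not self.can_issue(is_read):
            return False       # the request must wait
        self.outstanding += 1
        if is_read:
            self.reads_in_buffer += 1
        return True

    def complete(self, is_read: bool) -> None:
        self.outstanding -= 1
        if is_read:
            self.reads_in_buffer -= 1
```

For example, with `IssueModel(rd_buf_slots=8)` a ninth Read is refused until an earlier Read completes, while Write commands are still accepted up to the 256-command limit.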

 

The NVMeTCP IP is designed with a dual-layer hardware structure to independently manage two distinct protocols: NVMe/TCP and TCP/IP. The Command Handler is responsible for NVMe/TCP operations, while the TCP/IP Controller manages TCP/IP functions.

 

NVMe/TCP

The NVMe/TCP protocol implementation encompasses three protocol layers: NVMe, NVMe over Fabrics (NVMe-oF), and NVMe/TCP transport layer. This hardware layer acts as a bridge between the user interface and the TCP/IP Controller, ensuring that both Admin and IO commands are processed individually. Below are further details on the submodules within the NVMe/TCP layer.

·       Register File

This submodule is used for storing both system and network parameters set by the user, which are required for communication with the target, like the IP address. These parameters are inputted through the Parameters interface and are required for the generation and interpretation of PDUs exchanged between the Command Handlers (Admin and IO) and the target device. Additionally, the Register File holds system-specific parameters, such as the timeout duration to wait for a response from the target.

·       Admin Command Handler

The Admin Command Handler is the submodule that manages the Admin connection, which precedes the IO connection setup. Its operations unfold in a sequential three-step process.

Initially, the Admin Command Handler undertakes the connection establishment process, involving an extensive exchange of packets. Following the establishment, the submodule engages in periodic execution of the Keep Alive command. This command is crucial for maintaining the connection’s active status, facilitating uninterrupted Write and Read command operations. Users can set the Keep Alive interval, ranging from 0 to 3600 seconds. It’s noted that initiating the Keep Alive command temporarily interrupts packet transfer for Write/Read commands, which may slightly impact performance. The final step involves the execution of the connection termination process, effectively deactivating the connection when necessary.

To support these functions, the Admin Command Handler is structured into three subblocks, the control engine, the transmit (Tx) module, and the receive (Rx) module. The control engine orchestrates the sequence of the PDU transmission for each process. The Tx module generates the PDU during the connection initialization, termination, command transmission, and data transmission. Concurrently, the Rx module interprets incoming PDUs to confirm the success of connection initialization, command transmission, or data reception. If the control engine encounters a timeout while waiting for a PDU, it triggers the termination of both Admin and IO connections.

·       IO Command Handler

The IO Command Handler is a module to manage data transfer operations within the NVMeTCP IP, specifically handling Write and Read commands to optimize data transfer performance. This module manages the flow of data, ensuring that commands are processed efficiently and in the correct order.

The module’s functionality is distributed across four main processes: establishing connection, executing Write commands, processing Read commands, and terminating connections. This submodule shares the same structural design with the Admin Command Handler, comprising the control engine, the Tx module, and the Rx module. This design facilitates an approach to manage the various stages of data transfer.

The IO Command Handler has a queue memory, which can store up to 256 Write and Read commands. This capacity is critical for buffering commands and managing the flow of data. However, it is noted that data returned from the target may not always align with the sequence of Read command requests. To address this, the Read buffer is employed to reorder incoming data, ensuring that the data sequence presented to the user matches the original order of the Read command requests. When the Read buffer reaches its capacity, the IP de-asserts the ready signal, preventing additional command requests.

Additionally, the IO Command Handler includes a timing mechanism to monitor the wait time for receiving a PDU. If a timeout occurs, indicating that a response has not been received within the expected timeframe, the IP terminates both Admin and IO connections.

·       Read buffer

The Read buffer is designed to manage and store incoming data from Read commands. The buffer’s size is configured through the “RdBufDepth” parameter, which can be set to a value ranging from 3 to 8, as illustrated in Table 3.

 

Table 3: RdBufDepth parameter description

| RdBufDepth | Buffer size | Maximum Read commands | Estimated Read performance*(1) |
|---|---|---|---|
| 3 | 32 KB | 8 | 600 MB/s |
| 4 | 64 KB | 16 | 600 – 800 MB/s |
| 5 | 128 KB | 32 | 600 – 900 MB/s |
| 6 | 256 KB | 64 | 800 – 900 MB/s |
| 7 | 512 KB | 128 | 1000 MB/s |
| 8 | 1 MB | 256 | 1200 MB/s |

Remark *(1): Estimated Read performance was measured in a specific test environment.

The Read buffer size determines how many Read commands can be queued and processed. For instance, setting ‘RdBufDepth’ to 8 allocates 1 MB of buffer space, which is enough to handle 256 Read commands (each command involves 4 KB of data). Users need to balance resource utilization against read performance: a larger buffer holds more outstanding Read commands and therefore raises the maximum read throughput, but it does not affect write performance.
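The values in Table 3 follow a power-of-two pattern: the number of Read commands equals 2 to the power of RdBufDepth, and each command occupies 4 KB. The helper below is a minimal sketch of that relationship as inferred from the table; the function name is hypothetical.

```python
CMD_SIZE_BYTES = 4 * 1024  # fixed data size per command


def read_buffer_config(rd_buf_depth: int) -> tuple[int, int]:
    """Return (buffer size in bytes, maximum Read commands) for a RdBufDepth value."""
    if not 3 <= rd_buf_depth <= 8:
        raise ValueError("RdBufDepth must be in the range 3 to 8")
    max_cmds = 1 << rd_buf_depth           # 2^RdBufDepth commands
    return max_cmds * CMD_SIZE_BYTES, max_cmds


for depth in range(3, 9):
    size, cmds = read_buffer_config(depth)
    print(f"RdBufDepth={depth}: {size // 1024} KB, {cmds} Read commands")
```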

 

TCP/IP

The TCP/IP layer is a fundamental component in the NVMeTCP IP, facilitating the transmission of NVMe/TCP protocol packets over the network. Using TCP/IP achieves reliable Ethernet packet transfer. Within this layer, two dedicated modules are designed to handle two distinct TCP ports (Admin and IO) simultaneously, alongside an integrated EMAC IF for packet multiplexing and forwarding to the Ethernet MAC through a 64-bit AXI4-Stream interface.

·       Admin TCP/IP Controller

This controller is segmented into three main logic components: the main controller, the Tx module, and the Rx module. The main controller manages the opening and closing of ports and oversees data transmission and reception. The Tx module encapsulates the PDU into an Ethernet packet, appending TCP, IP, and Ethernet headers to ensure proper transmission. Conversely, the Rx module decodes the Ethernet packet, extracts the TCP payload, and conveys it back to the Admin Command Handler.

The TCP/IP Controller provides lossless data transmission by supporting data retransmission to recover lost data. Additionally, flow control is implemented by monitoring the available space in the receiver’s buffer before dispatching further data, preventing buffer overflow.

This controller resolves the MAC address of the NVMe/TCP target through ARP transfers. This operation requires that both the host and target reside within the same network domain. For scenarios requiring cross-domain communication, please contact our sales team for support.

·       IO TCP/IP Controller

The IO TCP/IP Controller is designed for high-speed data transfers, particularly for managing Write and Read commands integral to the operation of the NVMeTCP IP. This demand for speed translates to increased resource utilization, due to the necessity for larger buffer sizes. A sufficiently sized buffer is crucial to ensure that data transfers with the NVMe/TCP target are executed at peak performance. The controller is optimized for using jumbo frames to enhance data transmission efficiency. However, for environments where jumbo frames are not supported, contact our sales team for further information.

·       EMAC IF

The EMAC IF manages the shared Ethernet system utilized by both the Admin and IO TCP/IP Controllers. It functions as a switch, selecting which TCP/IP Controller is active at any given moment, ensuring efficient packet transfer management. The switching mechanism is designed to alternate the active module following the complete transfer of a packet.
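The packet-level switching described above can be illustrated with a short behavioral model. The datasheet does not detail the arbitration policy beyond switching after a complete packet, so the round-robin choice below is an assumption, and all names are hypothetical.

```python
from collections import deque


def arbitrate(admin_pkts, io_pkts):
    """Forward whole packets from two sources, alternating after each packet.

    Assumption: plain round-robin, falling back to the other source when the
    preferred one has nothing to send.
    """
    queues = {"admin": deque(admin_pkts), "io": deque(io_pkts)}
    order = ["admin", "io"]
    turn = 0
    forwarded = []
    while queues["admin"] or queues["io"]:
        src = order[turn]
        if not queues[src]:
            src = order[1 - turn]          # preferred source idle: serve the other
        forwarded.append((src, queues[src].popleft()))  # one complete packet
        turn = 1 - order.index(src)        # switch only after the packet ends
    return forwarded


print(arbitrate(["keep_alive"], ["wr0", "wr1", "wr2"]))
```

Because a switch happens only at packet boundaries, packets from the two connections are never interleaved within a frame.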

 

User Logic

The user interface of the NVMeTCP10G IP employs a Memory-mapped interface for executing both Write and Read commands. The core engine of the user logic can be designed as a state machine that manages the connection processes, including establishment and termination. This state machine is also responsible for data transfer, issuing the Write and Read requests and assigning the address values.

If the parameters remain constant, users can assign them as constant values in the HDL code, simplifying the design. Additionally, the data flow interface utilizes ‘valid’ and ‘ready’ signals.

 

10G Ethernet System (MAC+PHY)

The 10G Ethernet system, integral to the NVMeTCP10G IP, encompasses the Media Access Control (MAC) and the Physical (PHY) layers, providing the foundation for Ethernet-based data transmission. The EMAC layer can be implemented with either a Soft IP core or a Hard IP core. The PHY layer is typically provided at no additional cost by the FPGA vendor. The available Ethernet system solutions are listed below.

Solution 1: DG 10G25GEMAC IP core + 10G Ethernet PCS/PMA by AMD Xilinx

https://dgway.com/products/IP/GEMAC-IP/dg_10g25gemacip_data_sheet_xilinx/

Solution 2: 10G/25G Ethernet Subsystem (Soft IP)

https://www.xilinx.com/products/intellectual-property/ef-di-25gemac.html

Solution 3: Versal Multirate Ethernet MAC Subsystem (Hard IP)

https://www.xilinx.com/products/intellectual-property/mrmac.html

 

Core I/O Signals

Descriptions of all parameters and I/O signals are provided in Table 4 and Table 5.

Table 4: Core Parameters

| Name | Value | Description |
|---|---|---|
| HostNQNH[1783:128] | Unicode char | Represents the upper 207 bytes of the NVMe Qualified Name (NQN) of the host system. Note: If the HostNQN fits within 16 bytes, the size assignable to the HostNQNL input, HostNQNH should be set to all zeros. |
| KeepAliveSet | 0 – 3600 | Specifies the interval for sending Keep Alive in seconds. Set this to 0 to disable the Keep Alive transmission feature. |
| RdBufDepth | 3 - 8 | Configures the Read buffer size. Refer to Table 3 for additional details. |

 

Table 5: Core I/O Signals

Common Interface

| Signal | Dir | Description |
|---|---|---|
| RstB | In | Reset the IP core. Active Low. |
| Clk | In | Clock input, synchronized with the EMAC interface, set to 156.25 MHz for 10G Ethernet. |

Parameters Interface

Note: All inputs must be valid when HostConnEn=1b and HostConnStatus=0b.

| Signal | Dir | Description |
|---|---|---|
| HostMAC[47:0] | In | The MAC address of the host system. |
| HostIPAddr[31:0] | In | The IP address of the host system. |
| HostAdmPort[15:0] | In | The Admin port number used by the host system. |
| HostIOPort[15:0] | In | The IO port number used by the host system. |
| TrgIPAddr[31:0] | In | The IP address of the target system. |
| TCPTimeOutSet[31:0] | In | The timeout duration before initiating the retransmission process, measured in clock periods (1/Clk frequency, or 6.4 ns at 156.25 MHz). It is recommended to set this to 3 times the round-trip time (RTT), or to 1 second if 3 times the RTT is less than 1 second. |
| NVMeTimeOutSet[31:0] | In | The timeout duration to wait for a response from the target before asserting an error, measured in clock periods (1/Clk frequency, or 6.4 ns at 156.25 MHz). Setting this to 0 disables the timeout. It is recommended to set this to 4 times TCPTimeOutSet to accommodate TCP/IP retransmission before asserting an error. |
| HostNQNL[127:0] | In | The lower 16 bytes of the host system’s NVMe Qualified Name (NQN), defined as a Unicode string. The total HostNQN size is 223 bytes, with 207 bytes set by HostNQNH and 16 bytes by HostNQNL. Unused characters should be filled with 00h, similar to TrgNQN. This is used when the target system grants access permissions for the SSD to specific hosts. |
| TrgNQN[1783:0] | In | The NVMe Qualified Name (NQN) of the SSD within the target system, defined as a Unicode string. The maximum size of TrgNQN is 223 bytes. For example, if the NQN is “dgnvmettest”, configure TrgNQN[7:0]=’d’, TrgNQN[15:8]=’g’, …, TrgNQN[87:80]=’t’ accordingly, with the remaining characters (TrgNQN[1783:88]) set to 0 (null). This NQN is used by the host to identify and interact with the correct SSD in the target system. |
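The byte ordering of the NQN inputs and the timeout recommendations above can be sketched as follows. This is a hedged illustration: the helper names are hypothetical, the strings are assumed to be UTF-8 encoded (identical to ASCII for the example NQN), and the timeout rule is read as "the larger of 3 x RTT and 1 second".

```python
NQN_BYTES = 223            # total NQN size (HostNQN and TrgNQN)
CLK_HZ = 156_250_000       # 156.25 MHz user clock, 6.4 ns period
MAX_U32 = 0xFFFF_FFFF      # both timeout inputs are 32 bits wide


def pack_nqn(nqn: str) -> int:
    """Pack an NQN so that character i lands on bits [8*i+7 : 8*i], zero padded."""
    raw = nqn.encode("utf-8")
    if len(raw) > NQN_BYTES:
        raise ValueError("NQN exceeds 223 bytes")
    return int.from_bytes(raw.ljust(NQN_BYTES, b"\x00"), "little")


def split_host_nqn(nqn: str) -> tuple[int, int]:
    """Return (HostNQNH, HostNQNL): upper 207 bytes and lower 16 bytes."""
    value = pack_nqn(nqn)
    return value >> 128, value & ((1 << 128) - 1)


def tcp_timeout_cycles(rtt_seconds: float) -> int:
    """TCPTimeOutSet in clock cycles: max(3 x RTT, 1 s)."""
    cycles = round(max(3 * rtt_seconds, 1.0) * CLK_HZ)
    if cycles > MAX_U32:
        raise ValueError("value does not fit in TCPTimeOutSet[31:0]")
    return cycles


def nvme_timeout_cycles(tcp_cycles: int) -> int:
    """NVMeTimeOutSet in clock cycles: 4 x TCPTimeOutSet."""
    cycles = 4 * tcp_cycles
    if cycles > MAX_U32:
        raise ValueError("value does not fit in NVMeTimeOutSet[31:0]")
    return cycles


trg_nqn = pack_nqn("dgnvmettest")
print(hex(trg_nqn & 0xFF), tcp_timeout_cycles(0.001))
```

With a 1 ms RTT, the recommendation gives 1 second for TCPTimeOutSet (156,250,000 cycles) and 4 seconds for NVMeTimeOutSet.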

 

Control Interface

| Signal | Dir | Description |
|---|---|---|
| IPVersion[31:0] | Out | The IP version number. |
| TestPin[127:0] | Out | Internal test pin. |
| HostConnEn | In | Enable/Disable the connection with the target. Setting it from 0b to 1b initiates the connection establishment process. Setting it from 1b to 0b triggers the connection termination process. Ensure the IP is idle (HostBusy=0b) before modifying this signal. HostBusy is then set to 1b to indicate the start of the connection establishment or termination process and reverts to 0b once the operation is completed. |
| HostConnStatus | Out | The connection status with the target. 0b: No connection, 1b: An active connection. |
| HostBusy | Out | The busy status of the IP. 0b: Idle, 1b: Busy. Set to 1b after changing the HostConnEn value or while processing Write/Read commands. |
| HostError | Out | Set to 1b when an error is detected, indicated by HostErrorType being non-zero. It can be reset by setting RstB to 0b. |
| HostErrorType[31:0] | Out | Error types. The bit assignments are listed below this table. |

HostErrorType[31:0] bit assignments:

[0] – An error if the target system is not found.

[1] – An error if the target requires authentication.

[2] – An error when certain target parameters are unsupported (refer to ‘TrgCAPStatus’ signal for details).

- I/O Queue Command Capsule Supported Size (IOCCSZ) is less than 260.

- Maximum Queue Entries Supported (MQES) is less than 511.

- Maximum Outstanding Commands (MAXCMD) is less than 256.

[7:3] – Reserved.

[8] – An error when the Admin port fails to establish a connection successfully.

[9] – An error when the Admin port does not receive a response within the specified time.

[10] – An error when the status register of the Admin completion entry is incorrect, with additional information available in the ‘TrgAdmStatus’ signal.

[11] – Reserved.

[12] – An error when the Admin port receives an unrecognized packet.

[13] – An error when the Admin port receives a termination request unexpectedly.

[14] – An error when the Admin port detects unsupported CPDA during connection establishment.

[15] – A critical error within the Admin TCP/IP Controller, with more details in the ‘AdmTCPStatus’ signal.

[16] – An error when the IO port is unable to establish a connection successfully.

[17] – An error when the IO port does not receive a response within the specified time.

[18] – An error when the status register of the IO completion entry is incorrect, with additional information available in the ‘TrgIOStatus’ signal.

[19] – Reserved.

[20] – An error when the IO port receives an unrecognized packet.

[21] – An error when the IO port receives a termination request unexpectedly.

[22] – An error when the IO port detects unsupported CPDA during connection establishment.

[23] – A critical error within the IO TCP/IP Controller, with more details in the ‘IOTCPStatus’ signal.

[31:24] – Reserved.
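When debugging in simulation or from captured register values, the HostErrorType bit assignments above can be decoded with a lookup table. The snippet below is an illustrative host-side helper, not part of the IP deliverables; the function and dictionary names are hypothetical and the descriptions are shortened from the list above.

```python
HOST_ERROR_BITS = {
    0: "Target system not found",
    1: "Target requires authentication",
    2: "Unsupported target parameters (see TrgCAPStatus)",
    8: "Admin port: connection establishment failed",
    9: "Admin port: response timeout",
    10: "Admin port: completion entry status error (see TrgAdmStatus)",
    12: "Admin port: unrecognized packet received",
    13: "Admin port: unexpected termination request",
    14: "Admin port: unsupported CPDA",
    15: "Admin TCP/IP Controller critical error (see AdmTCPStatus)",
    16: "IO port: connection establishment failed",
    17: "IO port: response timeout",
    18: "IO port: completion entry status error (see TrgIOStatus)",
    20: "IO port: unrecognized packet received",
    21: "IO port: unexpected termination request",
    22: "IO port: unsupported CPDA",
    23: "IO TCP/IP Controller critical error (see IOTCPStatus)",
}


def decode_host_error(value: int) -> list[str]:
    """Return the descriptions of all defined bits set in a HostErrorType value."""
    return [text for bit, text in sorted(HOST_ERROR_BITS.items())
            if (value >> bit) & 1]


for line in decode_host_error((1 << 9) | (1 << 23)):
    print(line)
```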

 

Control Interface (continued)

| Signal | Dir | Description |
|---|---|---|
| TrgLBASize[47:0] | Out | The total capacity of the SSD in 512-byte units. Bits[2:0] are set to 000b so that the value aligns with 4 KB units, because the IP transfers data in 4 KB units. The default value is 0, and this signal becomes valid after the connection establishment process is completed. |
| TrgCAPStatus[63:0] | Out | Status of the target capabilities. [31:0] – I/O Queue Command Capsule Supported Size (IOCCSZ). [47:32] – Maximum Queue Entries Supported (MQES). [63:48] – Maximum Outstanding Commands (MAXCMD). |
| TrgAdmStatus[15:0] | Out | Status output from the Admin command. [0] – Reserved. [15:1] – Status field value of the Admin Completion Entry. |
| TrgIOStatus[15:0] | Out | Status output from the I/O command. [0] – Reserved. [15:1] – Status field value of the IO Completion Entry. |
| AdmTCPStatus[31:0] | Out | Status from the Admin TCP/IP Controller. The bit assignments are listed below this table. |
| IOTCPStatus[31:0] | Out | Status from the IO TCP/IP Controller. The description of each bit matches AdmTCPStatus. |

AdmTCPStatus[31:0] bit assignments:

[0] – Set to 1b when the TCP/IP Controller retries sending an ARP request packet.

[1] – Set to 1b when the TCP/IP Controller retransmits a ‘SYN’ packet.

[2] – Reserved.

[3] – Set to 1b when the TCP/IP Controller transmits a ‘RST’ packet during TCP open operation.

[4] – Set to 1b when the TCP/IP Controller retransmits FIN and ACK packets.

[5] – Set to 1b when the TCP/IP Controller retransmits a data packet.

[6] – Set to 1b when the TCP/IP Controller generates duplicate ACK packets to request the retransmission of lost data.

[7] – Set to 1b when the TCP/IP Controller retransmits a data packet to prompt the target to return an ACK packet for updating the window size value.

[20:8] – Reserved.

[21] – Set to 1b when a received ACK packet indicates a lost packet.

[22] – Set to 1b upon receiving a connection termination request during data transmission to the target (Fatal error).

[23] – Set to 1b when the receive buffer of the TCP/IP Controller overflows (Fatal error).

[26:24] – Internal test status.

[27] – Set to 1b when a received data packet indicates a lost packet.

[29:28] – Internal test status.

[30] – Set to 1b when receiving a RST packet (Fatal error).

[31] – Internal test status.
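The TrgCAPStatus fields and the compatibility thresholds listed under Features can be checked with a few lines of code, for example when analyzing a failed connection flagged by HostErrorType[2]. This is a sketch with hypothetical helper names; the field positions and thresholds are taken from this datasheet.

```python
def decode_trg_cap_status(value: int) -> dict[str, int]:
    """Split TrgCAPStatus[63:0] into its three capability fields."""
    return {
        "IOCCSZ": value & 0xFFFF_FFFF,       # bits [31:0]
        "MQES": (value >> 32) & 0xFFFF,      # bits [47:32]
        "MAXCMD": (value >> 48) & 0xFFFF,    # bits [63:48]
    }


def target_supported(caps: dict[str, int]) -> bool:
    """Apply the minimum requirements: IOCCSZ >= 260, MQES >= 511, MAXCMD >= 256."""
    return caps["IOCCSZ"] >= 260 and caps["MQES"] >= 511 and caps["MAXCMD"] >= 256


def capacity_bytes(trg_lba_size: int) -> int:
    """Convert TrgLBASize (512-byte units) to bytes."""
    return trg_lba_size * 512


example = (256 << 48) | (511 << 32) | 260    # a target that just meets the limits
caps = decode_trg_cap_status(example)
print(caps, target_supported(caps))
```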

 

Memory-mapped Interface

| Signal | Dir | Description |
|---|---|---|
| HostMMAddr[47:0] | In | The memory-mapped address for Write/Read commands in 512-byte units, which must be aligned to a 4 KB boundary. The maximum value is (TrgLBASize – 8). |
| HostMMRead | In | Set to 1b to send a 4 KB Read command. This signal must not be asserted while a 4 KB Write command is being executed (HostMMWrite set to 1b). For each Read command request, 4 KB of read data is returned via HostMMRdData. |
| HostMMWrite | In | Set to 1b to send a 4 KB Write command and 4 KB of Write data. This signal must not be asserted simultaneously with HostMMRead. |
| HostMMWtReq | Out | Indicates whether the IP is ready to accept the Write/Read command request and Write data. 0b: Accept the request/data, 1b: Not ready. |
| HostMMWrData[63:0] | In | Write data during Write command execution, valid when HostMMWrite is set to 1b. |
| HostMMRdData[63:0] | Out | Read data returned from a Read command, valid when HostMMRdValid is set to 1b. |
| HostMMRdValid | Out | Set to 1b to indicate that HostMMRdData is valid. Read data transmission can be paused by setting HostMMRdPause to 1b, after which HostMMRdValid is set to 0b with a latency of 2 clock cycles from the assertion of HostMMRdPause. |
| HostMMRdPause | In | Set to 1b to pause receiving Read data when the user is not ready to accept data. |

MAC Interface

| Signal | Dir | Description |
|---|---|---|
| EthLinkup | In | The status of the Ethernet link. Set to 1b when the Ethernet link is successfully established. |
| MacTxData[63:0] | Out | Transmitted data to the Ethernet MAC (EMAC). Valid when MacTxValid is set to 1b. |
| MacTxKeep[7:0] | Out | The byte enable for MacTxData. Each bit is set to 1b to indicate that the corresponding byte is valid for transmission, with bit[0] for MacTxData[7:0], bit[1] for MacTxData[15:8], and so on. |
| MacTxValid | Out | Valid signal of MacTxData. Set to 1b to indicate MacTxData is ready for transmission. |
| MacTxLast | Out | Asserted to 1b to indicate the final data of a packet. Valid when MacTxValid is set to 1b. |
| MacTxReady | In | Handshaking signal. Set to 1b when the EMAC is ready to accept MacTxData. To allow continuous transmission, this signal must be held at 1b from the first data of a packet until the last data. |
| MacRxData | In | Received data. Valid when MacRxValid is set to 1b. |
| MacRxValid | In | Valid signal of MacRxData. This signal is set to 1b continuously from the reception of the first data of a packet until the last data. |
| MacRxLast | In | Asserted to 1b to indicate the final data of the packet. Valid when MacRxValid is set to 1b. |
| MacRxUser | In | Control signal asserted at the end of a received frame (when MacRxValid and MacRxLast are set to 1b) to indicate the frame status. 1b: Normal packet, 0b: A packet with CRC error. |
| MacRxReady | Out | Handshaking signal. Asserted to 1b when MacRxData has been accepted. After the reception of the final data of a packet, MacRxReady is set to 0b for two clock cycles, preparing for the next data reception. |
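For the Memory-mapped Interface signals in Table 5, the address rules and the HostMMWtReq handshake can be modeled in a few lines. This is a behavioral sketch under stated assumptions, with hypothetical helper names: it assumes one 64-bit word is accepted in every clock cycle where HostMMWtReq is 0b, and it does not model any internal latency of the IP.

```python
from itertools import chain, repeat

SECTOR_BYTES = 512                 # HostMMAddr and TrgLBASize unit
CMD_BYTES = 4096                   # fixed data size per command
SECTORS_PER_CMD = CMD_BYTES // SECTOR_BYTES   # 8
WORDS_PER_CMD = CMD_BYTES // 8     # 512 beats of 64-bit data


def mm_addr_from_byte_offset(byte_offset: int, trg_lba_size: int) -> int:
    """Convert a byte offset into a HostMMAddr value, enforcing the stated rules."""
    if byte_offset % CMD_BYTES:
        raise ValueError("address must be aligned to 4 KB")
    addr = byte_offset // SECTOR_BYTES          # bits[2:0] are 000b by construction
    if addr > trg_lba_size - SECTORS_PER_CMD:
        raise ValueError("address beyond (TrgLBASize - 8)")
    return addr


def write_command_cycles(wt_req_pattern=()) -> int:
    """Count clock cycles needed to hand over one 4 KB Write command.

    wt_req_pattern lists HostMMWtReq per cycle (1 = stall); once the pattern
    is exhausted, HostMMWtReq is assumed to stay at 0.
    """
    accepted = 0
    for cycle, wt_req in enumerate(chain(wt_req_pattern, repeat(0)), start=1):
        if wt_req == 0:
            accepted += 1
            if accepted == WORDS_PER_CMD:
                return cycle


print(mm_addr_from_byte_offset(4096, 1024), write_command_cycles([1, 1, 0, 1]))
```

Without stalls, one command takes 512 cycles, or about 3.28 us at 156.25 MHz.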

 

Timing Diagram

Reset process

 

Figure 4: Timing diagram of Reset process

 

1)     Ensure that the Clk signal is stable, then de-assert the active-low reset by setting RstB to 1b, which starts the IP’s reset process.

2)     The IP then waits for the EthLinkup to be asserted, which confirms that the Ethernet link has been successfully established.

3)     Throughout the reset operation, the HostBusy signal is maintained at 1b, indicating that the IP is in the process of resetting. Once the reset process is finalized, HostBusy is reverted to 0b, signaling that the IP is ready for further operations, specifically for establishing a connection with the target system.

4)     To commence the connection establishment, the user changes HostConnEn from 0b to 1b. This action triggers the process that enables the IP to start establishing a connection with the target system.

 

Connection Establishment

 

Figure 5: Timing diagram of Connection establishment

 

1)     Ensure that both HostConnStatus and HostBusy are set to 0b, indicating that the IP is idle and the connection is inactive. Next, configure the necessary parameters, which include HostMAC, HostIPAddr, HostAdmPort, HostIOPort, TrgIPAddr, TCPTimeOutSet, NVMeTimeOutSet, HostNQNL, and TrgNQN. After setting these parameters, change HostConnEn to 1b to initiate the connection establishment. Keep the parameter values stable throughout this process.

2)     Once the process begins, HostBusy is set to 1b, indicating the IP is establishing a connection.

3)     Upon the successful establishment, HostConnStatus changes to 1b. Following this, TrgLBASize provides the storage capacity and HostBusy is set to 0b, signifying that the IP is ready for data transmission, allowing the user to send Write/Read commands.

 

 

Figure 6: Error during running Connection establishment

 

Figure 6 illustrates an example of a failed connection establishment, which can occur if the parameter values are incorrect. A reset is required to recover from this condition. The step-by-step details are described below.

1)     If the connection establishment process fails, the IP sets HostError to 1b. The error status is identified via the HostErrorType signal. During this failure state, HostConnStatus remains at 0b (no active connection), while HostBusy remains at 1b (incomplete operation).

2)     To address the error, the system requires a reset. Before attempting to re-establish the connection, ensure that any issues leading to the error are resolved. Then, initiate the reset process by setting RstB to 0b.

3)     During the reset, the error flag (HostError) and the error status (HostErrorType) are cleared, returning them to normal status.
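
The status signals used in the steps above can be summarized in a small lookup. The following Python fragment is only an illustrative sketch for testbench or host-software use; the function name and the returned state labels are hypothetical and are not part of the IP interface.

```python
def describe_host_state(conn_status: int, busy: int, error: int) -> str:
    """Map sampled HostConnStatus, HostBusy and HostError to a coarse state label.

    The labels are illustrative names, not signals or values defined by the IP.
    """
    if error:
        # HostError = 1b: check HostErrorType, then reset the IP via RstB.
        return "error"
    if not conn_status and not busy:
        # Idle without a connection: parameters may be set and HostConnEn raised.
        return "idle"
    if not conn_status and busy:
        # Reset process or connection establishment is still running.
        return "busy_disconnected"
    if conn_status and not busy:
        # Connection is active and no command is being processed.
        return "connected_idle"
    # Connection is active and commands (or a termination request) are in progress.
    return "connected_busy"


print(describe_host_state(conn_status=0, busy=1, error=0))
```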

 

Write command

 

 

Figure 7: Timing diagram of Write command (Single mode)

 

During a Write command, the memory-mapped interface utilizes HostMMWrite to initiate a write request along with the corresponding write data to the IP. The first cycle of asserting HostMMWrite includes sending the target address and the first data of the transfer. Each write request involves transferring 4 KB of data, equivalent to 512 cycles of 64-bit data. The IP can use HostMMWtReq set to 1b to pause a write request or data transfer at any point during the operation. Below are detailed steps for executing a single Write command request.

1)     The user asserts HostMMWrite to 1b to send a Write request. The target address is specified on HostMMAddr, and the first data (D0) is placed on HostMMWrData. If HostMMWtReq is set to 0b in that clock cycle, it indicates that the request and the first data have been accepted. Subsequent data (starting with D1) can be sent by maintaining HostMMWrite at 1b. HostMMAddr is not required for the remaining data transfer, but ensure that HostMMAddr values at the first cycle are aligned to 4 KB, with bits[2:0] set to 000b.

2)     Once the write request is accepted, the IP sets HostMMWtReq to 1b for at least 10 clock cycles to pause the next write request or data transfer, allowing time for pre-processing of the Write command. After this pause, data transfer can resume once the IP is ready. HostBusy is also set to 1b, indicating that the IP is processing the command.

3)     The user has the option to pause the data transmission by setting HostMMWrite to 0b and can resume by reasserting it to 1b.

4)     The IP may assert HostMMWtReq to 1b to pause data transmission. During this pause, the user must maintain the current value of HostMMWrData.

5)     Upon transferring the last data (D511) for this Write request, the IP sets HostMMWtReq to 1b for at least 1 clock cycle, signifying the commencement of post-processing for the Write command.

6)     If there are no further commands pending in the queue, the IP de-asserts HostBusy to 0b, indicating the system has returned to an idle state.
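
The handshake in the steps above can be modeled by counting the cycles in which a 64-bit data word is actually transferred, that is, cycles where HostMMWrite is 1b and HostMMWtReq is 0b. The Python sketch below is a behavioral illustration only; the function name and the example trace are assumptions, with the trace using the minimum 10-cycle pre-processing pause described above.

```python
WORDS_PER_COMMAND = 4096 // 8  # 4 KB of 64-bit data = 512 transfers


def count_accepted_beats(host_mm_write, host_mm_wt_req):
    """Count cycles where a data word is accepted (HostMMWrite=1, HostMMWtReq=0).

    Both arguments are per-clock-cycle samples (0 or 1) of the two signals.
    """
    return sum(1 for wr, wt in zip(host_mm_write, host_mm_wt_req)
               if wr == 1 and wt == 0)


# Hypothetical trace: D0 accepted, 10 wait cycles, then D1..D511 without pauses.
write_trace = [1] * (1 + 10 + 511)
wt_req_trace = [0] + [1] * 10 + [0] * 511

beats = count_accepted_beats(write_trace, wt_req_trace)
print(beats, beats == WORDS_PER_COMMAND)
```

A trace that yields fewer than 512 accepted transfers indicates that the Write request has not yet delivered its full 4 KB.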

 

 

Figure 8: Timing diagram of Write command (Multiple mode)

 

After completing the first Write command, the user can initiate subsequent Write requests. This is done by immediately asserting HostMMWrite to 1b, while simultaneously assigning the target address on HostMMAddr and placing the first data of the new request on HostMMWrData. Note that during the first cycle of this new request, HostMMWtReq may still be set to 1b due to the post-processing activities of the preceding command.

In each Write request, HostMMWtReq is set to 1b for at least 11 clock cycles. This includes 10 cycles allocated for the pre-processing of the current request and 1 cycle dedicated to the post-processing of the previous request. The completion of all command executions within the sequence is indicated by HostBusy being reset to 0b, signaling that the IP has processed all command requests and has returned to an idle state.
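
As a rough worked example of the cycle counts above, the following Python sketch estimates the minimum number of clock cycles one Write request occupies on the memory-mapped interface in Multiple mode. It assumes the wait time is exactly the stated minimum of 11 cycles and that the user never pauses the transfer; the actual wait time may be longer, and overall throughput also depends on the network and the target, so the result is only an upper bound for the user interface itself.

```python
DATA_BEATS = 512          # 4 KB transferred as 64-bit words
MIN_WAIT_CYCLES = 10 + 1  # pre-processing + post-processing of previous request


def min_cycles_per_write() -> int:
    """Minimum interface cycles per Write request (best case, no extra pauses)."""
    return DATA_BEATS + MIN_WAIT_CYCLES


def max_interface_efficiency() -> float:
    """Best-case fraction of cycles that carry data on the user interface."""
    return DATA_BEATS / min_cycles_per_write()


print(min_cycles_per_write(), round(max_interface_efficiency(), 3))
```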

 

Read command

 

 

Figure 9: Timing diagram of Read command (Single mode)

 

When executing a Read command, the data received upon request is consistently fixed at 4 KB, or 512 cycles of 64-bit data. This data is transferred continuously, but the user has the option to pause the data transmission during any cycle using HostMMRdPause. Below is a detailed step-by-step guide on sending a single Read command request.

1)     To send a Read command, the user sets HostMMRead to 1b, while specifying the desired memory address on HostMMAddr. It is crucial to ensure alignment to 4 KB units by setting bits[2:0] to 000b.

2)     Upon receiving the Read command, the IP sets HostMMWtReq to 1b for a minimum of 10 clock cycles to handle pre-processing. During this time, HostBusy is also set to 1b to indicate that the command is currently being processed.

3)     Once the data is returned to the IP by the target, the IP begins transferring the 4 KB data to the user. This is done by setting HostMMRdValid to 1b and streaming the data via HostMMRdData. HostMMRdValid remains asserted at 1b throughout the transfer of 512 cycles of 64-bit data, ensuring continuous data flow.

4)     If the user is not ready to receive more data during the data transfer, HostMMRdPause can be set to 1b to request a pause of the data transmission.

5)     Two clock cycles after HostMMRdPause is asserted to 1b, HostMMRdValid is set to 0b, pausing the data transmission. The data transmission can be resumed by setting HostMMRdPause back to 0b.

6)     After all commands are processed and there are no remaining commands, HostBusy is de-asserted to 0b, indicating that the IP has returned to an idle state.
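
The address requirement in step 1 can be checked in host software or a testbench before a request is issued. The Python sketch below only tests the condition stated above (bits[2:0] of HostMMAddr equal to 000b); the helper names are hypothetical.

```python
def is_valid_host_mm_addr(host_mm_addr: int) -> bool:
    """Return True when bits[2:0] of the HostMMAddr value are 000b."""
    return (host_mm_addr & 0b111) == 0


def align_host_mm_addr(host_mm_addr: int) -> int:
    """Clear bits[2:0], rounding the value down to the nearest valid address."""
    return host_mm_addr & ~0b111


print(is_valid_host_mm_addr(0x10), is_valid_host_mm_addr(0x13))
print(hex(align_host_mm_addr(0x13)))
```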

 

 

Figure 10: Timing diagram of Read command (Multiple mode)

 

The IP features a command queue that enables the handling of multiple Write/Read command requests concurrently, significantly enhancing overall data transfer performance. The maximum number of Read commands that can be queued depends on two factors: the total capacity of the command queue, which is capped at 256 commands, and the configured Read buffer size. The maximum number of Read commands that can be requested is determined by dividing the Read buffer size by 4 KB.
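
The limit described above can be written as a one-line calculation. The Python sketch below is for illustration only; the function name is hypothetical, and the two example sizes are the minimum and maximum Read buffer sizes listed in the Features section.

```python
MAX_QUEUE_COMMANDS = 256     # total capacity of the command queue
DATA_SIZE_PER_COMMAND = 4096  # fixed 4 KB per command


def max_read_commands(read_buffer_bytes: int) -> int:
    """Maximum number of Read commands that can be requested at once."""
    return min(MAX_QUEUE_COMMANDS, read_buffer_bytes // DATA_SIZE_PER_COMMAND)


print(max_read_commands(32 * 1024))    # 32 KB Read buffer
print(max_read_commands(1024 * 1024))  # 1 MB Read buffer
```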

To send multiple Read requests, the user sets HostMMRead to 1b and assigns the corresponding address on HostMMAddr for each command. This process allows the user to stack multiple commands without waiting for the completion of earlier ones. As data is received from the target, 4 KB of data corresponding to each command request is transferred back to the user. This transfer is initiated by setting HostMMRdValid to 1b. There is a single clock cycle gap between consecutive 4 KB data transfers. The sequence in which the 4 KB data blocks appear on HostMMRdData corresponds directly to the sequence of Read command requests specified by HostMMAddr. Once all queued commands have been processed and no further commands remain, HostBusy is reset to 0b, indicating that the IP has returned to an idle state.
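
Because the 4 KB data blocks are returned in the same order as the Read requests, user logic or a testbench can track outstanding requests with a simple first-in first-out queue. The Python class below is a minimal sketch of such a tracker; the class and method names are hypothetical and do not correspond to any IP signal.

```python
from collections import deque


class ReadOrderTracker:
    """Tracks outstanding Read requests in issue order (illustrative model)."""

    def __init__(self):
        self._pending = deque()

    def request(self, host_mm_addr: int) -> None:
        """Record a Read request issued with the given HostMMAddr value."""
        self._pending.append(host_mm_addr)

    def next_block_address(self) -> int:
        """Address whose 4 KB block is expected next on HostMMRdData."""
        return self._pending.popleft()

    def outstanding(self) -> int:
        """Number of Read requests whose data has not been returned yet."""
        return len(self._pending)


tracker = ReadOrderTracker()
for addr in (0x00, 0x08, 0x10):
    tracker.request(addr)
print(hex(tracker.next_block_address()), tracker.outstanding())
```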

 

Mixed Write-Read commands

 

 

Figure 11: Timing diagram of Mixed Write-Read commands

 

The IP allows Write and Read commands to be issued without needing to wait for the completion of preceding commands. This flexibility helps optimize the throughput and efficiency of data operations. However, it is crucial to manage the shared signals, HostMMAddr and HostMMWtReq, to ensure accurate command processing. To send mixed Write and Read commands, the user sequentially initiates each command by setting either HostMMWrite or HostMMRead to 1b, along with the corresponding address on HostMMAddr. HostMMWrite and HostMMRead must not be set to 1b simultaneously, as this is not supported by the interface.

Upon receiving each command, HostMMWtReq is set to 1b during the pre-processing and the post-processing phases. For Read commands, the sequence of 4 KB data blocks received will correspond directly to the order of Read command requests made. After all command requests have been processed, including any Write or Read operations, HostBusy is reset to 0b.
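
The restriction that HostMMWrite and HostMMRead must never be asserted in the same cycle is easy to check on a recorded signal trace. The Python sketch below illustrates one possible check; the function name and the sample trace are hypothetical.

```python
def find_conflict_cycles(host_mm_write, host_mm_read):
    """Return the cycle indices where HostMMWrite and HostMMRead are both 1b."""
    return [cycle
            for cycle, (wr, rd) in enumerate(zip(host_mm_write, host_mm_read))
            if wr == 1 and rd == 1]


# Hypothetical per-cycle samples; cycle 3 violates the interface rule.
write_trace = [1, 1, 0, 1, 0]
read_trace = [0, 0, 1, 1, 0]
print(find_conflict_cycles(write_trace, read_trace))
```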

 

Connection Termination

The process of terminating a connection in the IP involves a few steps to ensure that the termination is handled completely.

 

 

Figure 12: Timing diagram of Connection termination

 

1)     Before initiating termination, confirm that HostBusy is set to 0b. This indicates that the IP is currently idle, with no commands being processed. This check ensures that this request does not interrupt any active command processing. Change the value of HostConnEn from 1b to 0b. This action signals the request to terminate the current connection.

2)     Following the request, HostBusy is set to 1b. This change signifies that the IP has started processing the termination request.

3)     Once the termination process is completed, both HostConnStatus and HostBusy are reset to 0b. HostConnStatus returning to 0b indicates that the connection is no longer active, and HostBusy being set to 0b confirms that the IP has returned to an idle state.

 

Error

 

Figure 13: Timing diagram of HostError and HostErrorType

 

When the IP encounters an error during the execution of an operation, it triggers an error notification process.

1)     The IP detects an issue, setting the HostError flag to 1b to indicate an error has occurred. Concurrently, the HostErrorType signal is updated to a non-zero value, specifying the details of the error. Each bit within HostErrorType corresponds to a different type of error, pointing to specific issues that can be further investigated using related status signals:

HostErrorType[2]           : Check TrgCAPStatus for unsupported features of the target system.

HostErrorType[10]         : Check TrgAdmStatus to review the Status register, output from the Admin command execution.

HostErrorType[15]         : Check AdmTCPStatus for the Admin TCP/IP controller’s status.

HostErrorType[18]         : Check TrgIOStatus to review the Status register, output from the IO command execution.

HostErrorType[23]         : Check IOTCPStatus for the IO TCP/IP controller’s status.

 

2)     After identifying and addressing the cause of the error, the user should reset the IP to clear the error state and restart operations. This is done by setting RstB to 0b, which initiates the IP reset process.

3)     The reset action clears the HostError and HostErrorType signals, effectively resetting all error statuses related to previous processes.
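
The bit assignments listed in step 1 can be decoded as shown in the following Python sketch. Only the five bits named above are covered; the dictionary and function names are hypothetical, and any other HostErrorType bits are ignored by this example.

```python
# Bit position of HostErrorType -> status signal to inspect (from the list above).
ERROR_STATUS_SIGNALS = {
    2: "TrgCAPStatus",
    10: "TrgAdmStatus",
    15: "AdmTCPStatus",
    18: "TrgIOStatus",
    23: "IOTCPStatus",
}


def status_signals_to_check(host_error_type: int) -> list:
    """List the status signals to inspect for the set bits of HostErrorType."""
    return [name for bit, name in ERROR_STATUS_SIGNALS.items()
            if (host_error_type >> bit) & 1]


# Example value with bits 2 and 23 set.
print(status_signals_to_check((1 << 2) | (1 << 23)))
```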

 

EMAC Interface

The EMAC (Ethernet MAC) interface of the NVMeTCP10G IP employs a 64-bit AXI4-stream interface, optimized for data handling. This interface is designed to transmit packet data continuously, without the ability to pause, until the final data of the packet is transmitted. This characteristic requires that MacTxReady be continuously asserted to 1b from the transmission of the first data of a packet until the final data of that packet, as illustrated in Figure 14.

Because of this limitation, the NVMeTCP10G IP is compatible with the DG 10G25GEMAC IP core and can be adapted to work with the AMD Xilinx 10G/25G Ethernet Subsystem by adding adapter logic with a small FIFO to meet the interface requirements.

 

 

Figure 14: Transmit EMAC interface timing diagram

 

1)     The IP asserts MacTxValid to 1b and begins transmitting the first data (D0) on MacTxData. Except for the final data, all 64-bit data are valid, thus MacTxKeep is set to FFh for each data transfer, indicating full byte validity. The first data values are held until MacTxReady is confirmed to be asserted at 1b.

2)     Upon receiving the first data, EMAC asserts MacTxReady to 1b, indicating readiness to accept this and all subsequent data in the packet. This readiness continues until the end of the packet, ensuring a continuous data transfer in each packet.

3)     When the final data of the packet (Dn-1) is sent, the IP asserts MacTxValid and MacTxLast to 1b, signaling the end of the packet. For this final transfer, MacTxKeep may not be set to FFh if the upper bytes of the data are not valid; its value depends on the actual data length.

4)     Once the entire packet has been transferred, MacTxReady may be de-asserted to 0b. This action pauses the data transmission, allowing for any necessary processing before the start of the next packet.
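
The relationship between packet length and the MacTxKeep value of the final transfer can be illustrated with the Python sketch below. It assumes the usual AXI4-Stream convention that valid bytes occupy the lower byte lanes, which is consistent with the statement above that the upper bytes may be invalid; the function names are hypothetical.

```python
BYTES_PER_BEAT = 8  # 64-bit MacTxData


def beat_count(packet_len_bytes: int) -> int:
    """Number of 64-bit transfers needed for a packet of the given length."""
    if packet_len_bytes <= 0:
        raise ValueError("packet length must be positive")
    return (packet_len_bytes + BYTES_PER_BEAT - 1) // BYTES_PER_BEAT


def last_beat_keep(packet_len_bytes: int) -> int:
    """Byte-enable value for the final transfer (one bit per valid byte)."""
    if packet_len_bytes <= 0:
        raise ValueError("packet length must be positive")
    valid_bytes = packet_len_bytes % BYTES_PER_BEAT or BYTES_PER_BEAT
    return (1 << valid_bytes) - 1


# A 60-byte packet needs 8 transfers; the last one carries 4 valid bytes.
print(beat_count(60), hex(last_beat_keep(60)))
```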

Similar to Transmit EMAC interface, the Receive EMAC interface ensures that the data within a single packet is received continuously without interruption. The valid signal, MacRxValid, must be asserted to 1b from the start to the end of the packet, as depicted in Figure 15.

 

 

Figure 15: Receive EMAC interface timing diagram

 

1)     The reception process begins when MacRxValid transitions from 0b to 1b, indicating the arrival of the first data (D0) of a packet on MacRxData. Following this, both MacRxValid and MacRxReady are maintained at 1b throughout the packet transfer. This continuous assertion is necessary to facilitate the uninterrupted transfer of the remaining data in the packet.

2)     The final data within a packet (Dn-1) is identified when both MacRxValid and MacRxLast are asserted to 1b simultaneously. During this cycle, MacRxUser is also valid to read. If the packet is free of errors, MacRxUser is set to 1b; otherwise, MacRxUser is set to 0b and the packet is discarded by the IP.

3)     Once the packet reception is completed, the IP de-asserts MacRxReady for 2 clock cycles to allow for packet post-processing. This brief pause is crucial for the EMAC to accommodate any final processing steps before resuming the reception of subsequent packets.

Note: Typically, a gap of at least two cycles is observed after the transmission of an Ethernet frame to handle the transfer of the control word.
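
The receive-side rules above (MacRxValid held at 1b for the whole packet, frame status read from MacRxUser on the final data) can be checked on a recorded trace. The Python sketch below is a simple illustrative checker; the function name and the sample trace are hypothetical.

```python
def check_rx_frames(samples):
    """Check per-cycle (MacRxValid, MacRxLast, MacRxUser) samples.

    Returns one boolean per received frame: True for a normal packet
    (MacRxUser = 1b on the final data), False for a packet with CRC error.
    Raises ValueError if MacRxValid drops to 0b in the middle of a packet.
    """
    results = []
    in_packet = False
    for cycle, (valid, last, user) in enumerate(samples):
        if valid:
            in_packet = True
            if last:
                results.append(user == 1)
                in_packet = False
        elif in_packet:
            raise ValueError(f"MacRxValid de-asserted inside a packet at cycle {cycle}")
    return results


# Hypothetical trace: a 3-word normal frame, a 2-cycle gap, a 1-word errored frame.
trace = [(1, 0, 0), (1, 0, 0), (1, 1, 1), (0, 0, 0), (0, 0, 0), (1, 1, 0)]
print(check_rx_frames(trace))
```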

 

Verification Methods

The NVMeTCP10G IP Core functionality was verified by simulation and also validated on real hardware using the KCU105, ZCU102, ZCU106, and VCK190 boards.

 

Recommended Design Experience

Experienced design engineers with a knowledge of Vivado Tools should easily integrate this IP into their design.

 

Ordering Information

This product is available directly from Design Gateway Co., Ltd. Please contact Design Gateway Co., Ltd. for pricing and additional information about this product using the contact information on the front page of this datasheet.

 

Revision History

Revision    Date         Description
2.00        29-Apr-24    Add HostMMRdPause, AdmTCPStatus, and IOTCPStatus signals
1.01        24-Mar-22    Correct IOCCSZ
1.00        4-Nov-21     Initial Release