NVMe IP Core with PCIe Gen4 Soft IP Datasheet

Features
Applications
General Description
Functional Description
    NVMe
        · NVMe Host Controller
        · Command Parameter
        · Data Buffer
        · NVMe Data Controller
    PCIe
        · PCIe Transaction Controller
        · PCIe Data Link Controller
        · PCIe Media Access Controller (PCIe MAC)
    User Logic
    Xilinx PCI Express PHY
Core I/O Signals
Timing Diagram
    Initialization
    Control interface of dgIF typeS
    Data interface of dgIF typeS
    IdenCtrl/IdenName
    Shutdown
    SMART
    Secure Erase
    Flush
    Error
Verification Methods
Recommended Design Experience
Ordering Information
Revision History

 

 

 

 

Core Facts

| Provided with Core | |
|---|---|
| Documentation | Reference Design Manual, Demo Instruction Manual |
| Design File Formats | Encrypted File |
| Instantiation Templates | VHDL |
| Reference Designs & Application Notes | Vivado Project, See Reference Design Manual |
| Additional Items | Demo on KCU116, ZCU102, ZCU106, and VCU118 |
| Support | Support Provided by Design Gateway Co., Ltd. |

 

 

Design Gateway Co., Ltd.

E-mail:    ip-sales@design-gateway.com

URL:       design-gateway.com

Features

·     NVMe host controller with PCIe Soft IP to directly access an NVMe Gen4 SSD without the need for CPU or external memory

·     Two Data buffer modes: High speed (1 MB RAM) or Small memory (256 KB RAM), implemented by URAM

·     Simple user interface through dgIF typeS

·     Supports seven commands: Identify, Shutdown, Write, Read, SMART, Secure Erase, and Flush

·     Supported NVMe devices:

-       PCIe I/F: 4-lane PCIe Gen4 (16.0 GT/s)

-       Base Class Code: 01h (mass storage), Sub Class Code: 08h (non-volatile), Programming Interface: 02h (NVMHCI)

-       MPSMIN (Memory Page Size Minimum): 0 (4 KB)

-       MDTS (Maximum Data Transfer Size): At least 5 (128 KB) or 0 (no limitation)

-       MQES (Maximum Queue Entries Supported): At least 15

-       LBA unit: 512 bytes or 4 KB

·     User clock frequency: At least the PCIe clock frequency (250 MHz for Gen4)

·     Xilinx PCIe PHY IP configuration: 4-lane PCIe Gen4 with 256-bit data bus

·     Available reference designs:

-       1-ch demo on KCU116, ZCU102, ZCU106, and VCU118 with AB17-M2FMC and AB18-PCIeX16 adapter boards

-       2-ch RAID0 demo on ZCU106 and VCU118 with AB17-M2FMC and AB18-PCIeX16 adapter boards

·     Customized service for the following features:

-       Additional NVMe commands

-       RAM type (BRAM) modification

 

 

Table 1: Example Implementation Statistics

| Family | Example Device | Buf Mode | Fmax (MHz) | CLB Regs | CLB LUTs | CLB(2) | BRAM Tile | URAM | Design Tools |
|---|---|---|---|---|---|---|---|---|---|
| Kintex-UltraScale+ (URAM) | XCKU5P-FFVB676-2E | 1 MB | 295 | 19411 | 22185 | 4408 | 12.5 | 32 | Vivado 2019.1 |
| | | 256 KB | 300 | 19399 | 22205 | 4640 | 12.5 | 8 | |
| Zynq-UltraScale+ (URAM) | XCZU7EV-FFVC1156-2E | 1 MB | 295 | 19411 | 22190 | 4340 | 12.5 | 32 | Vivado 2019.1 |
| | | 256 KB | 300 | 19399 | 22202 | 4483 | 12.5 | 8 | |
| Zynq-UltraScale+ (BRAM)(1) | XCZU9EG-FFVB1156-2E | 1 MB | 295 | 19155 | 22345 | 4547 | 268.5 | - | Vivado 2019.1 |
| | | 256 KB | 300 | 19143 | 22236 | 4462 | 76.5 | - | |
| Virtex-UltraScale+ (URAM) | XCVU9P-FLGA2104-2LE | 1 MB | 295 | 19411 | 22205 | 4451 | 12.5 | 32 | Vivado 2019.1 |
| | | 256 KB | 300 | 19399 | 22208 | 4432 | 12.5 | 8 | |

Note:        (1) The BRAM version is a customized IP core.

(2) Actual logic resource usage depends on the percentage of unrelated logic.

 

Applications

 

Figure 1: NVMeG4 IP Application

 

The NVMe IP Core with PCIe Gen4 Soft IP (NVMeG4 IP) provides a solution for accessing an NVMe Gen4 SSD without PCIe Hard IP, a CPU, or external memory. Because it includes PCIe Gen4 Soft IP and a 1 MB/256 KB data buffer, the NVMeG4 IP is well suited to applications that require large storage capacity and high-speed performance on FPGAs that do not integrate a PCIe Gen4 Hard IP or do not provide enough PCIe Gen4 Hard IP instances, as shown in Figure 1.

However, when the selected FPGA integrates PCIe Hard IP and the number of PCIe Hard IP instances is sufficient, the DG NVMe IP Core is recommended instead in order to optimize FPGA resource utilization.

We also offer alternative IP cores based on PCIe Hard IP for specific applications such as multi-user and random access.

NVMe IP Core – Accesses an SSD using PCIe hard IP to minimize FPGA resource utilization.

https://dgway.com/NVMe-IP_X_E.html

Multiple User NVMe IP Core – Enables multiple users to access an NVMe SSD for high-performance write and read operation simultaneously.

https://dgway.com/muNVMe-IP_X_E.html

Random Access by Multiple User NVMe IP Core – Enables two users to write and read to the same NVMe SSD simultaneously, providing high random-access performance for applications with non-contiguous storage requirements.

https://dgway.com/rmNVMe-IP_X_E.html

NVMe IP Core for PCIe Switch – Accesses multiple NVMe SSDs via a PCIe switch to extend storage capacity, enabling high-speed write and read access to shared storage.

https://dgway.com/NVMe-IP_X_E.html

General Description

 

Figure 2: NVMeG4 IP Block Diagram

 

Design Gateway has developed the NVMeG4 IP, which serves as an NVMe host controller including PCIe Soft IP for accessing an NVMe Gen4 SSD. The user interface of the NVMeG4 IP is similar to the standard DG NVMe IP (Gen4). It offers the additional benefit of integrating PCIe Soft IP (Gen4), which implements the Data link layer and a part of the Physical layer of PCIe protocol. The Physical interface of NVMeG4 IP connects to the Xilinx PCIe PHY through a 256-bit PIPE interface. The Xilinx PCIe PHY contains the transceiver and equalizer logic.

The NVMeG4 IP core implements seven NVMe commands: Identify, Shutdown, Write, Read, SMART, Secure Erase, and Flush. It uses two user interface groups to transfer commands and data. The Control interface transfers commands and their parameters, while the Data interface transfers data when a command requires it. For Write and Read commands, the Control and Data interfaces use dgIF typeS, our standard interface for storage. The Control interface of dgIF typeS includes start address, transfer length, and request signals, and the Data interface uses a standard FIFO interface.

SMART, Secure Erase, and Flush are Custom commands that use the Ctm Cmd I/F for the control path and the Ctm RAM I/F for the data path. Meanwhile, the Identify command uses its own data interface, the Iden I/F, together with the same Control interface as the Write and Read commands, as shown in Figure 2.

If abnormal conditions are detected during initialization or command operation, the NVMeG4 IP may assert an error signal. The error status can then be read from the IP for further details. Once the error cause is resolved, both the NVMeG4 IP and the SSD must be reset.

To ensure continuous packet transmission until the end of each packet on the user interface of the Xilinx PCIe PHY, the user logic clock frequency must be equal to or greater than the 250 MHz PHY clock frequency. This requires data to be valid on every clock cycle between the start and the end of a frame. This clock frequency constraint guarantees that the bandwidth on the user interface is equal to or greater than the PCIe PHY bandwidth.

The NVMeG4 IP consists of both the NVMe IP (Gen4) and the PCIe Soft IP (Gen4), so all features of the standard IP are retained. Table 2 compares the NVMe IP (Gen4) and the NVMeG4 IP.

 

Table 2: Comparison of NVMe IP (Gen4) and NVMeG4 IP

| Feature | NVMe IP (Gen4) | NVMeG4 IP |
|---|---|---|
| PCIe Interface | 256-bit AXI4-Stream | 256-bit PIPE |
| Xilinx PCIe IP | Integrated Block for PCIe (PCIe Hard IP) | Xilinx PCIe PHY (transceiver and equalizer) |
| PCIe Hard IP | Necessary | Not used |
| PCIe Speed | 1-4 lanes at Gen4 or lower speed | 4-lane PCIe Gen4 only |
| User Interface | dgIF typeS | dgIF typeS |
| FPGA Resource | Smaller | Larger |
| Maximum SSDs | Depends on the number of PCIe Hard IPs (5 PCIe Gen4 Hard IPs = 5 SSDs on VC1902) | Depends on the number of transceivers (44 GTY = 11 SSDs on VC1902) |
| SSD Performance* | Write: up to 6900 MB/s, Read: up to 7500 MB/s | Write: up to 6900 MB/s, Read: up to 7500 MB/s |

*Note: Performance is measured using a 1 TB Samsung 990 Pro SSD.

 

As shown in Table 2, the key advantage of the NVMeG4 IP is that it does not require PCIe Hard IP. Therefore, the maximum number of supported SSDs is limited not by the number of PCIe Hard IP instances but by the number of transceivers and the resource utilization. The drawback of the NVMeG4 IP is its higher resource utilization compared to the NVMe IP (Gen4), due to the PCIe Soft IP implementation. In addition, the NVMeG4 IP supports only 4-lane PCIe Gen4 SSDs.

To facilitate evaluation before purchase, we offer reference designs for FPGA evaluation boards, enabling the user to assess the performance and compatibility of the NVMeG4 IP.

 

Functional Description

The NVMeG4 IP operation is divided into three phases: IP initialization, Operating command, and Inactive status, as shown in Figure 3. After the IP reset is de-asserted, the initialization phase begins, and the user should execute the Identify command to check the device status and capacity. During the Operating command phase, the user can perform write and read operations and execute Custom commands such as SMART and Flush. Finally, before shutting down the system, it is recommended to execute the Shutdown command to ensure safe operation.

 

Figure 3: NVMeG4 IP operation flow

 

The operation of NVMeG4 IP is described as follows.

1)     The IP performs PCIe link initialization and training at 4-lane PCIe Gen4 speed. The equalizer parameters are adjusted to ensure optimal signal quality, and the flow control parameters are configured. Once the process is completed without any errors, the IP is ready to transfer PCIe packets.

2)     The IP begins the initialization process by initializing flow control and configuring the PCIe and NVMe registers. Upon successful completion of the initialization, the IP transitions to the Idle state, where it awaits a new command request from the user. If any error is detected during the initialization process, the IP switches to the Inactive state with UserError set to 1b.

3)     The first command from the user must be the Identify command (UserCmd=000b), which updates the LBASize (disk capacity) and LBAMode (LBA unit=512 byte or 4 KB).

4)     The last command before powering down the system must be the Shutdown command (UserCmd=001b). This command is recommended to guarantee that the SSD is powered down in a proper sequence. Without the Shutdown command, the integrity of the data written to the SSD cannot be guaranteed. After the Shutdown command finishes, both the NVMeG4 IP and the SSD change to the Inactive state. No new command can be operated until the IP is reset.

5)     When executing a Write command (UserCmd=010b), the maximum data size for each command is limited to 128 KB. If the total data length from the user exceeds 128 KB, the IP automatically repeats the following steps, 5a) – 5b), until all data has been completely transferred.

a)     The IP waits until the write data, sent by the user, is sufficient for one command. The transfer size of each command in the NVMeG4 IP is 128 KB, except for the last loop, which may be less than 128 KB.

b)     The IP sends the Write command to the SSD and then waits for the status response from the SSD. The IP returns to the Idle state only when all the data has been completely transferred. Otherwise, the IP goes back to step 5a) to send the next Write command.

6)     Similar to the Write command, when executing a Read command (UserCmd=011b) with a transfer size exceeding 128 KB, the IP iterates through the following steps, 6a) – 6c).

a)     If the remaining transfer size is zero, the IP proceeds to step 6c). Otherwise, it waits until there is sufficient free space in the Data buffer of the NVMeG4 IP for one command (either 128 KB or the remaining transfer size for the last loop).

b)     The IP sends the Read command to the SSD and then returns to step 6a).

c)     The IP waits until all the data has been completely transferred from the Data buffer to the user logic and then returns to the Idle state. Therefore, the Data buffer becomes empty after the Read command is completed.

7)     When executing a SMART command (UserCmd=100b and CtmSubmDW0-15=SMART), 512-byte data is returned upon operation completion.

a)     The IP sends a Get Log Page command to retrieve SMART/Health information from the SSD.

b)     The 512-byte data response is received from the SSD, and the IP forwards this data through the Custom command RAM interface (CtmRamAddr=0x000 – 0x00F).

8)     When executing a Secure Erase command (UserCmd=100b and CtmSubmDW0-15=Secure Erase), no data transfer occurs during the operation.

a)     The IP sends the Secure Erase command to the SSD.

b)     The IP waits until the SSD returns a status response to confirm the completion of the operation.

9)     When executing a Flush command (UserCmd=110b), no data transfer occurs during the operation.

a)     The IP sends the Flush command to the SSD.

b)     The IP waits until the SSD returns a status response to confirm the completion of the operation.

 

The design of an NVMeG4 IP implements two protocols: NVMe and PCIe. The NVMe protocol is used to interface with the user, while the PCIe protocol is used to interface with Xilinx PCIe PHY. Figure 2 shows the hardware inside the NVMeG4 IP which consists of two groups, NVMe and PCIe.

 

NVMe

The NVMe group supports seven commands, which are split into two types - Admin commands and NVM commands. Admin commands include Identify, Shutdown, SMART, and Secure Erase, while NVM commands include Write, Read, and Flush. After executing a command, the status returned from the SSD is latched either to AdmCompStatus (for status returned from Admin commands) or IOCompStatus (for status returned from NVM commands), depending on the command type.

The parameters of the Write and Read commands are configured through the Control interface of dgIF typeS, while the parameters of the SMART, Secure Erase, and Flush commands are set via CtmSubmDW0-15 of the Ctm Cmd interface. Data for Write and Read commands is transferred through the FIFO interface, which is part of dgIF typeS, and is stored in the Data buffer inside the IP. Other command types use dedicated data interfaces: the Identify I/F for the Identify command and the Custom RAM I/F for the SMART command.

Further details of each submodule are described as follows.

·       NVMe Host Controller

The NVMe host controller serves as the core controller within the NVMe IP. It operates in two phases: the initialization phase and the command operation phase. The initialization phase runs once at system boot-up to configure the NVMe registers within the SSD. Once the initialization phase is completed, the controller enters the command operation phase. During this phase, the controller controls the sequence of transmitted and received packets for each command.

To execute each command, the command parameters are stored in the Command Parameter module, which facilitates packet creation. The packet is then forwarded to the AsyncCtrl, which converts NVMe packets into PCIe packets. After each command operation, a status packet is received from the SSD. The controller decodes the status value to verify whether the operation completed successfully or an error occurred. For commands that involve data transfer, such as Write and Read, the controller also handles the ordering of the data packets, which are created and decoded by the NVMe Data controller.

·       Command Parameter

The Command Parameter module creates the command packet sent to the SSD and decodes the status packet returned from the SSD. The input and output of this module are controlled by the NVMe host controller. Typically, a command consists of 16 Dwords (1 Dword = 32 bits). When executing Identify, Shutdown, Write, and Read commands, all 16 Dwords are created by the Command parameter module, which are initialized by the user inputs on dgIF typeS. When executing SMART, Secure Erase, and Flush commands, all 16 Dwords are directly loaded via CtmSubmDW0-CtmSubmDW15 of Ctm Cmd interface.

·       Data Buffer

Two data buffer modes are supported: High speed mode, which uses 1 MB of RAM, and Small memory mode, which uses 256 KB of RAM. The RAM is implemented using UltraRAM. The buffer stores the data transferred between the user logic and the SSD while operating Write and Read commands.

·       NVMe Data Controller

The NVMe data controller module is used when a command transfers data, such as the Identify, SMART, Write, and Read commands. This module manages three data interfaces for data transfers with the SSD.

1)     The FIFO interface is used with the Data buffer during the execution of Write or Read commands.

2)     The Custom RAM interface is used when executing SMART command.

3)     The Identify interface is used when executing Identify command.

The NVMe data controller is responsible for creating and decoding data packets. Similar to the Command Parameter module, the input and output signals of the NVMe data controller module are controlled by the NVMe host controller.

 

PCIe

The PCIe protocol is a widely adopted low-layer protocol for high-speed applications, and the NVMe protocol runs on top of it. Therefore, the NVMe layer can operate only after the PCIe layer has completed its initialization. Four modules are designed to support the PCIe protocol: the PCIe transaction controller, AsyncCtrl, PCIe data link controller, and PCIe MAC. Additional details of these modules are provided below.

·       PCIe Transaction Controller

This module implements the Transaction layer of the PCIe protocol. NVMe commands and data from the NVMe layer are transformed into Transaction Layer Packet (TLP) format. Conversely, received TLPs are decoded and forwarded to the NVMe layer logic.

·       AsyncCtrl

The AsyncCtrl includes asynchronous registers and asynchronous buffers to facilitate clock domain crossing. The user clock frequency must match or exceed the PCIe PHY clock frequency to ensure sufficient bandwidth for continuous packet data transmission. The majority of the logic within the NVMeG4 IP operates in the user clock domain, while the PCIe PHY IP operates in the PCIe clock domain.

·       PCIe Data Link Controller

The PCIe Data Link Controller implements the Data Link Layer of the PCIe protocol. This layer ensures reliable delivery of TLPs, the packet format transferred between the PCIe Transaction Controller and the PCIe Data Link Controller. Each TLP is appended with a Link Cyclic Redundancy Code (LCRC) to enable error checking at the receiver. In addition, a Sequence Number is included to ensure that packets are received in the order in which they were sent. The receiver can also detect any missing TLPs in the transmission.

Upon successful LCRC verification and Sequence Number validation, Ack DLLPs (Data Link Layer Packets) are generated to confirm the error-free reception of TLPs. In case of a transmission error, Nak DLLPs are created to indicate the issue, prompting the transmitter to re-send the TLPs to resolve the problem. The PCIe Soft IP also incorporates two 4KB RAMs, serving as a Replay buffer and a data buffer for bidirectional data transfer.

·       PCIe Media Access Controller (PCIe MAC)

The PCIe MAC module is designed to interface with the Xilinx PCIe PHY using the PIPE (PHY Interface for PCI Express) standard. This module serves two main purposes. Firstly, it manages the Link initialization and training process. Secondly, it controls the flow of data packets in accordance with the PCIe physical specification.

During Link initialization and training, certain processes are implemented within the Xilinx PCIe PHY, such as Clock and Data Recovery (CDR) for Bit lock and Block lock at Gen4 speed. The PCIe MAC incorporates the Link Training and Status State Machine (LTSSM), which is responsible for controlling Link width, Lane reversal, Polarity inversion, and the Link data rate at Gen4 speed. Since Gen4 operates at 16.0 GT/s and is highly sensitive to signal quality, additional features such as DC balance and equalization are implemented in the PCIe Gen4 MAC to ensure optimal performance.

Once the initialization and training process is completed, data packets can be transferred. To transmit packets, the PCIe MAC module includes a multiplexer for selecting data types, byte striping for organizing the data format in each lane, and data scrambling to minimize noise. On the receiving end, the receiver logic handles tasks such as data de-scrambling, byte un-striping, and data filtering.

 

User Logic

The user logic can be implemented using a small state machine responsible for sending commands along with their corresponding parameters. For instance, simple registers are used to specify parameters for Write or Read command, such as address and transfer size. Two separate FIFOs are connected to manage data transfer for Write and Read commands independently.

When executing the SMART and Identify commands, each data output interface connects to a simple dual-port RAM with byte-enable capability. Both the FIFOs and RAMs have a data width of 256 bits, while their memory depths can be configured to different values. Specifically, the data size for the Identify command is 8 KB, while for the SMART command it is 512 bytes.
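As an illustration of these sizes, the sketch below declares hypothetical user-side memory types in VHDL; it is a minimal sketch only, and the package and type names are not part of the IP deliverables.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical user-side storage types matching the data sizes above:
--   Identify data : 8 KB  = 256 words x 256 bits (two 4 KB halves, controller/namespace)
--   SMART data    : 512 B =  16 words x 256 bits
package user_storage_pkg is
  type iden_ram_t  is array (0 to 255) of std_logic_vector(255 downto 0);
  type smart_ram_t is array (0 to 15)  of std_logic_vector(255 downto 0);
end package user_storage_pkg;
```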

 

Xilinx PCI Express PHY

Xilinx provides the PCI Express PHY IP to enable the use of Soft IP instead of Hard IP to construct a PCIe MAC. The PCIe PHY uses the PHY Interface for PCI Express (PIPE) as its user interface. When operating with the NVMeG4 IP, the PCIe PHY is configured with a lane width of 4 and a link speed of 16.0 GT/s. For more detailed information on the Xilinx PCIe PHY, please refer to the “PG239: PCI Express PHY” document, available on the Xilinx website.

https://docs.xilinx.com/r/en-US/pg239-pcie-phy

 

 

Figure 4: Block Diagram of the PCI Express PHY

 

Core I/O Signals

Descriptions of all I/O signals are provided in Table 4 and Table 5.

Table 3: Core Parameters

| Name | Value | Description |
|---|---|---|
| BufMode | 0 or 1 | Data buffer mode. 1: High speed mode using a 1 MB buffer. 0: Small memory mode using a 256 KB buffer. |

 

Table 4: User logic I/O Signals (Synchronous to Clk signal)

Signal

Dir

Description

Control I/F of dgIF typeS

RstB

In

Synchronous reset. Active low. It should be de-asserted to 1b when the Clk signal is stable.

Clk

In

User clock to run the NVMeG4 IP. The frequency of this clock must be equal to or greater than the PhyClk (250 MHz for PCIe Gen4)

UserCmd[2:0]

In

User Command. Valid when UserReq=1b. The possible values are

000b: Identify, 001b: Shutdown, 010b: Write SSD, 011b: Read SSD,

100b: SMART/Secure Erase, 110b: Flush, 101b/111b: Reserved.

UserAddr[47:0]

In

Start address to write/read SSD in 512-byte unit. Valid when UserReq=1b.

If the LBA unit = 4 KB, UserAddr[2:0] must be always set to 000b to align with 4KB unit.

If the LBA unit = 512 byte, it is recommended to set UserAddr[2:0]=000b to align with 4KB size (SSD page size). The 4KB address unalignment results in reduced write/read performance for most SSDs.

UserLen[47:0]

In

The total transfer size to write/read from the SSD in 512-byte units. Valid from 1 to (LBASize-UserAddr). If the LBA unit = 4 KB, UserLen[2:0] must be always set to 000b to align with the 4KB unit. This parameter is applicable when UserReq=1b.

UserReq

In

Asserts to 1b to send the new command request and de-asserts to 0b after the IP starts the operation by asserting UserBusy to 1b. This signal can only be asserted when the IP is Idle (UserBusy=0b). Command parameters (UserCmd, UserAddr, UserLen, and CtmSubmDW0-DW15) must be valid and stable during UserReq=1b. UserAddr and UserLen are inputs for Write/Read command while CtmSubmDW0-DW15 are inputs for SMART, Secure Erase, or Flush command.

UserBusy

Out

Asserted to 1b when the IP is busy. A new request must not be sent (by asserting UserReq to 1b) while the IP is still busy.

LBASize[47:0]

Out

The total capacity of the SSD in 512-byte units. Default value is 0.

This value is valid after finishing the Identify command.

LBAMode

Out

The LBA unit size of the SSD (0b: 512 bytes, 1b: 4 KB). Default value is 0b.

This value is valid after finishing the Identify command.

UserError

Out

Error flag. Asserted to 1b when UserErrorType is not equal to 0.

The flag is de-asserted to 0b by asserting RstB to 0b.

 


UserErrorType[31:0]

Out

Error status.

[0] – An error when PCIe class code is incorrect.

[1] – An error from Controller capabilities (CAP) register, which can occur due to various reasons.

- Memory Page Size Minimum (MPSMIN) is not equal to 0.

- NVM command set flag (bit 37 of CAP register) is not set to 1.

- Doorbell Stride (DSTRD) is not equal to 0.

- Maximum Queue Entries Supported (MQES) is less than 15.

More details of each register can be found in the NVMeCAPReg signal.

[2] – An error when the Admin completion entry is not received within the specified timeout.

[3] – An error when the status register in the Admin completion entry is not 0 or when the phase tag/command ID is invalid. More details can be found in the AdmCompStatus signal.

[4] – An error when the IO completion entry is not received within the specified timeout.

[5] – An error when the status register in the IO completion entry is not 0 or when the phase tag is invalid. More details can be found in the IOCompStatus signal.

[6] – An error from unsupported LBA unit (not equal to 512 bytes or 4 KB).

[7] – An error from PCIe PHY.

[8] – An error when receiving TLP packet with an incorrect size.

[9] – Reserved

Bit[10], [11], [24], and [28] are mapped to Uncorrectable Error Status Register

[10] – Mapped to Unsupported Request Error Status (bit[20]).

[11] – Mapped to Completer Abort Status (bit[15]).

[23:12] – Reserved

[24] – Mapped to Data Link Protocol Error Status (bit[4]).

[27:25] – Reserved

[28] – Mapped to Uncorrectable Internal Error Status (bit[22]).

[31:29] – Reserved

Note: Timeout period of bit[2]/[4] is set from TimeOutSet input.

Data I/F of dgIF typeS

UserFifoWrCnt[15:0]

In

Write data counter for the Receive FIFO. Used to monitor the FIFO full status. When the FIFO becomes full, data transmission for the Read command temporarily halts. If the FIFO data counter is narrower than 16 bits, the upper bits should be padded with 1b to complete the 16-bit count.

UserFifoWrEn

Out

Asserted to 1b to write data to the Receive FIFO when executing the Read command.

UserFifoWrData[255:0]

Out

Write data bus of the Receive FIFO. Valid when UserFifoWrEn=1b.

UserFifoRdCnt[15:0]

In

Read data counter for the Transmit FIFO. Used to monitor the amount of data stored in the FIFO. If the counter indicates an empty status, the transmission of data packets for the Write command temporarily pauses. If the FIFO data counter is narrower than 16 bits, the upper bits should be padded with 0b to complete the 16-bit count.

UserFifoEmpty

In

Unused for this IP.

UserFifoRdEn

Out

Asserted to 1b to read data from the Transmit FIFO when executing the Write command.

UserFifoRdData[255:0]

In

Read data returned from the Transmit FIFO.

Valid in the next clock after UserFifoRdEn is asserted to 1b.

 


NVMeG4 IP Interface

IPVersion[31:0]

Out

IP version number

TestPin[31:0]

Out

Reserved to be IP Test point.

TimeOutSet[31:0]

In

Timeout value to wait for completion from SSD. The time unit is equal to 1/(Clk frequency).

When TimeOutSet is equal to 0, Timeout function is disabled.

AdmCompStatus[15:0]

Out

Status output from Admin Completion Entry

[0] – Set to 1b when the Phase tag or Command ID in Admin Completion Entry is invalid.

[15:1] – Status field value of Admin Completion Entry

IOCompStatus[15:0]

Out

Status output from IO Completion Entry

[0] – Set to 1b when the Phase tag in IO completion Entry is invalid.

[15:1] – Status field value of IO Completion Entry

NVMeCAPReg[31:0]

Out

The parameter value of the NVMe capability register when UserErrorType[1] is asserted to 1b.

[15:0] – Maximum Queue Entries Supported (MQES)

[19:16] – Doorbell Stride (DSTRD)

[20] – NVM command set flag

[24:21] – Memory Page Size Minimum (MPSMIN)

[31:25] – Undefined

Identify Interface

IdenWrEn

Out

Asserted to 1b for sending data output from the Identify command.

IdenWrDWEn[7:0]

Out

Dword (32 bits) enable of IdenWrData. Valid when IdenWrEn=1b.

1b: This Dword data is valid, 0b: This Dword data is not available.

Bit[0], [1], …, [7] correspond to IdenWrData[31:0], [63:32], …, [255:224], respectively.

IdenWrAddr[7:0]

Out

Index of IdenWrData in 256-bit unit. Valid when IdenWrEn=1b.

0x000-0x07F: 4KB Identify controller data,

0x080-0x0FF: 4KB Identify namespace data.

IdenWrData[255:0]

Out

4KB Identify controller data or Identify namespace data. Valid when IdenWrEn=1b.

Custom Interface (Command and RAM)

CtmSubmDW0[31:0] – CtmSubmDW15[31:0]

In

16 Dwords of Submission queue entry for SMART, Secure Erase, or Flush command.

DW0: Command Dword0, DW1: Command Dword1, …, and DW15: Command Dword15.

These inputs must be valid and stable when UserReq=1b and UserCmd=100b (SMART/Secure Erase) or 110b (Flush).

CtmCompDW0[31:0] –

CtmCompDW3[31:0]

Out

4 Dwords of Completion queue entry, output from SMART, Secure Erase, or Flush command.

DW0: Completion Dword0, DW1: Completion Dword1, …, and DW3: Completion Dword3

CtmRamWrEn

Out

Asserted to 1b for sending data output from Custom command such as SMART command.

CtmRamWrDWEn[7:0]

Out

Dword (32 bits) enable of CtmRamWrData. Valid when CtmRamWrEn=1b.

1b: This Dword data is valid, 0b: This Dword data is not available.

Bit[0], [1], …, [7] correspond to CtmRamWrData[31:0], [63:32], …, [255:224], respectively.

CtmRamAddr[7:0]

Out

Index of CtmRamWrData when SMART data is received. Valid when CtmRamWrEn=1b.

(Optional) Index to request data input through CtmRamRdData for customized Custom commands.

CtmRamWrData[255:0]

Out

512-byte data output from SMART command. Valid when CtmRamWrEn=1b.

CtmRamRdData[255:0]

In

(Optional) Data input for customized Custom commands.

 

Table 5: Physical I/O Signals (Synchronous to PhyClk)

Signal

Dir

Description

PHY Clock and Reset

PhyRstB

In

Synchronous reset signal. Active low.

De-assert to 1b when the PCIe PHY is not in the reset state, indicated by the phy_phystatus_rst signal being de-asserted to 0b.

PhyClk

In

Clock output from PCIe PHY (250 MHz).

Other PHY Interface

MACTestPin[127:0]

Out

Test point of PCIe MAC.

MACStatus[7:0]

Out

Status output from PCIe MAC.

[5:0] – LTSSM

- 000000b: Detect.Quiet

- 000001b: Detect.Active

- 000010b: Polling.Active

- 000011b: Polling.Configuration

- 000100b: Config.Linkwidth

- 000101b: Config.Lanenum

- 000110b: Config.Complete

- 000111b: Config.Idle

- 001000b: Recovery.RcvrLock

- 001001b: Recovery.Speed

- 001010b: Recovery.RcvrCfg

- 001011b: Recovery.Idle

- 010000b: L0

- 100000b: Recovery.EqP0

- 100001b: Recovery.EqP1

- 100010b: Recovery.EqP2

- 100011b: Recovery.EqP3

[6] – Reserved

[7] – Asserted to 1b after PCIe initialization is completed.

PIPE Data Interface

PhyTxData[255:0]

Out

Parallel data output to PHY.

PhyTxDataK[7:0]

Out

Control data to indicate whether PhyTxData is control or data.

PhyTxDataValid[3:0]

Out

Asserted to 1b when the valid data is presented on PhyTxData.

PhyTxStartBlock[3:0]

Out

Asserted to 1b at the first clock of 128b block to indicate the start of block.

Valid when PhyTxDataValid=1b.

PhyTxSyncHeader[7:0]

Out

Indicates whether data block is ordered set or data stream.

Valid at the first clock of 128b block, together with PhyTxStartBlock.

PhyRxData[255:0]

In

Data input from PHY.

PhyRxDataK[7:0]

In

Control data to indicate whether PhyRxData is control or data.

PhyRxDataValid[3:0]

In

Asserts to 1b when the valid data is presented on PhyRxData.

PhyRxStartBlock[7:0]

In

Asserts to 1b at the first clock of 128b block to indicate the start of block.

Valid when PhyRxDataValid=1b.

PhyRxSyncHeader[7:0]

In

Indicates whether data block is ordered set or data stream.

Valid at the first clock of 128b block, together with PhyRxStartBlock.

 


PIPE Control and Status Signal

PhyTxDetectRx

Out

Requests PCIe PHY to begin a receiver detection operation.

PhyTxElecIdle[3:0]

Out

Forces Tx to enter electrical idle state.

PhyTxCompliance[3:0]

Out

Asserted to 1b to set the running disparity to negative.

PhyRxPolarity[3:0]

Out

Requests PCIe PHY to perform polarity inversion on the received data.

PhyPowerdown[1:0]

Out

Requests PCIe PHY to change the power state.

PhyRate[1:0]

Out

Requests PCIe PHY to change link rate.

PhyRxValid[3:0]

In

Indicates symbol lock and valid data when logic high.

PhyPhyStatus[3:0]

In

Used to communicate completion of several PIPE operations.

PhyRxElecIdle[3:0]

In

Indicates Rx electrical idle detected.

PhyRxStatus[11:0]

In

Rx status and error codes.

Driver and Equalization Signal

PhyTxMargin[2:0]

Out

Selects Tx voltage levels. This signal is fixed to 000b.

PhyTxSwing

Out

Controls Tx voltage swing level. This signal is fixed to 0b.

PhyTxDeEmph

Out

Selects Tx de-emphasis. This signal is fixed to 1b.

PhyTxEqCtrl[7:0]

Out

Tx equalization control.

PhyTxEqPreset[15:0]

Out

Tx equalization preset.

PhyTxEqCoeff[23:0]

Out

Tx equalization coefficient.

PhyTxEqFS[5:0]

In

Indicates the full swing of the Tx driver. Static value based on characteristics of Tx driver.

PhyTxEqLF[5:0]

In

Indicates the low frequency of the Tx driver. Static value based on characteristics of Tx driver.

PhyTxEqNewCoeff[71:0]

In

Status of the current Tx equalization coefficient.

PhyTxEqDone[3:0]

In

Asserts to 1b when Tx equalization is done.

PhyRxEqCtrl[7:0]

Out

Rx equalization control.

PhyRxEqTxPreset[15:0]

Out

Link partner status for Tx preset.

PhyRxEqPresetSel[3:0]

In

Indicates whether the value on PhyRxEqNewTxCoeff is a coefficient or a preset.

PhyRxEqNewTxCoeff

[71:0]

In

New Tx coefficient or preset to request the link partner.

PhyRxEqAdaptDone[3:0]

In

Asserts to 1b when RX equalization is successfully done.

Valid when PhyRxEqDone is asserted to 1b.

PhyRxEqDone[3:0]

In

Asserts to 1b when Rx equalization is finished.

Assist Signal

AsMacInDetect

Out

Assists PCIe PHY to switch the receiver termination between VTT and GND.

AsCdrHoldReq

Out

Assists PCIe PHY to hold CDR.

 

Note: For more detailed information about the signals of Xilinx PCIe PHY, please refer to the “PG239: PCI Express PHY” document, available at Xilinx website.

 

Timing Diagram

 

Initialization

 

 

Figure 5: Timing diagram of the reset sequence

 

The initialization process of the NVMeG4 IP follows the steps below, as shown in the timing diagram. A minimal VHDL sketch of the reset sequencing follows the list.

1)     Monitor the phy_phystatus_rst signal from the PCIe PHY and wait until it is de-asserted to 0b, indicating the completion of the PCIe PHY reset process. After that, the user logic de-asserts PhyRstB to 1b to initiate the link training process by NVMeG4 IP.

2)     After de-asserting PhyRstB and ensuring the stability of the Clk signal, the user logic can de-assert RstB to 1b. This action triggers the NVMe logic within the NVMeG4 IP to initiate the operation.

3)     Once the NVMeG4 IP completes the initialization processes, including link training, flow control initialization, and configuration of PCIe and NVMe registers, UserBusy is de-asserted to 0b.

Following these sequences, the NVMeG4 IP is ready to receive the command from the user.
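The fragment below is a minimal sketch of this reset sequencing. It assumes that phy_phystatus_rst has already been synchronized into the PhyClk domain and that PhyRstB_Clk_sync is a synchronized copy of PhyRstB in the Clk domain; these helper signals are hypothetical, and only PhyRstB, RstB, PhyClk, and Clk are IP ports documented in this datasheet.

```vhdl
-- Step 1): release PhyRstB once the PCIe PHY leaves reset (phy_phystatus_rst = 0b)
process(PhyClk)
begin
  if rising_edge(PhyClk) then
    if phy_phystatus_rst = '1' then
      PhyRstB <= '0';
    else
      PhyRstB <= '1';
    end if;
  end if;
end process;

-- Step 2): release RstB after PhyRstB, using a copy of PhyRstB synchronized to Clk
process(Clk)
begin
  if rising_edge(Clk) then
    RstB <= PhyRstB_Clk_sync;
  end if;
end process;
```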

 

Control interface of dgIF typeS

The dgIF typeS signals can be split into two groups: the Control interface for sending commands and monitoring status, and the Data interface for transferring data streams in both directions.

Figure 6 shows an example of how to send a new command to the IP via the Control interface of dgIF typeS; a minimal VHDL sketch of the same handshake appears after the note below.

 

Figure 6: Control Interface of dgIF typeS timing diagram

 

1)     UserBusy must be equal to 0b before sending a new command request to confirm that the IP is Idle.

2)     Command and its parameters such as UserCmd, UserAddr, and UserLen must be valid when asserting UserReq to 1b to send the new command request.

3)     IP asserts UserBusy to 1b after starting the new command operation.

4)     After UserBusy is asserted to 1b, UserReq is de-asserted to 0b to finish the current request. New parameters for the next command can be prepared on the bus. UserReq for the new command must not be asserted to 1b until the current command operation is finished.

5)     UserBusy is de-asserted to 0b after the command operation is completed. Next, a new command request can be initiated by asserting UserReq to 1b.

Note: The number of parameters used in each command is different. More details are described below.

·       Write and Read commands: UserCmd, UserAddr, and UserLen.

·       SMART, Secure Erase and Flush commands: UserCmd and CtmSubmDW0-DW15.

·       Identify and Shutdown commands: UserCmd.
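The following VHDL fragment is a minimal sketch of this handshake for a Write command, intended for the user-logic architecture. Only the dgIF typeS ports are from this datasheet; StartWrite, NextAddr, NextLen, and the state names are hypothetical.

```vhdl
-- Declarations for the architecture declarative part
type req_state_t is (stIdle, stWaitDone);
signal rState : req_state_t := stIdle;

-- Request/busy handshake following steps 1) - 5) above
process(Clk)
begin
  if rising_edge(Clk) then
    if RstB = '0' then
      UserReq <= '0';
      rState  <= stIdle;
    else
      case rState is
        when stIdle =>
          if StartWrite = '1' and UserBusy = '0' then   -- step 1): IP must be Idle
            UserCmd  <= "010";                          -- Write command
            UserAddr <= NextAddr;                       -- step 2): parameters valid while UserReq = 1b
            UserLen  <= NextLen;
            UserReq  <= '1';
            rState   <= stWaitDone;
          end if;
        when stWaitDone =>
          if UserBusy = '1' then                        -- steps 3) - 4): request accepted
            UserReq <= '0';
          end if;
          if UserReq = '0' and UserBusy = '0' then      -- step 5): command completed
            rState <= stIdle;
          end if;
      end case;
    end if;
  end if;
end process;
```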

 

Data interface of dgIF typeS

The Data interface of dgIF typeS is used for transferring the data stream when operating a Write or Read command, and it is compatible with a general FIFO interface. Figure 7 shows the data interface of dgIF typeS when transferring Write data to the IP in the Write command.

 

Figure 7: Transmit FIFO Interface for Write command

 

The 16-bit FIFO read data counter (UserFifoRdCnt) shows the total amount of data stored in the Transmit FIFO. When the amount of data is sufficient, 512-byte bursts (16 x 256 bits) are transferred.

In the Write command, data is read from the Transmit FIFO until the total data is transferred completely. The transfer process is described below; a wiring sketch for the Transmit FIFO connection follows the steps.

1)     Before starting a new burst transfer, the IP waits until at least 512 bytes of data are available in the Transmit FIFO by monitoring UserFifoRdCnt[15:4], which must not be equal to 0.

2)     The IP asserts UserFifoRdEn to 1b for 16 clock cycles to read 512-byte data from the Transmit FIFO.

3)     UserFifoRdData is valid in the next clock cycle after UserFifoRdEn is asserted to 1b, and 16 data words are transferred continuously.

4)     After reading the 16th data word (D15), UserFifoRdEn is de-asserted to 0b.

5)     Steps 1) – 4) are repeated to transfer the next 512-byte data until the total amount of data equals the transfer size specified in the command.

6)     After the total data is completely transferred, UserBusy is de-asserted to 0b.
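As a sketch of the Transmit FIFO connection, assuming a 256-bit FIFO in standard (non-FWFT) read mode with one-cycle read latency and a 12-bit read data count; the TxFifo* signal names are hypothetical, only the UserFifo* ports are from this datasheet.

```vhdl
-- Pad the 12-bit FIFO read data count with 0b to form the 16-bit UserFifoRdCnt
UserFifoRdCnt(15 downto 12) <= (others => '0');
UserFifoRdCnt(11 downto 0)  <= TxFifoRdDataCount;

-- The IP drives the read enable; read data returns one clock later, matching Figure 7
TxFifoRdEn     <= UserFifoRdEn;
UserFifoRdData <= TxFifoRdData;
```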

 

 

Figure 8: Receive FIFO Interface for Read command

 

When executing the Read command, data is transferred from the SSD to the Receive FIFO until the entire data set has been transferred. The steps for transferring a burst of data are described below; a wiring sketch for the Receive FIFO connection follows the steps.

1)     Before starting a new burst transmission, UserFifoWrCnt[15:5] is checked to verify that there is enough free space in the Receive FIFO, indicated by UserFifoWrCnt[15:5] not being all 1b (2047). The IP also waits until the amount of received data from the SSD reaches at least 512 bytes. Once both conditions are satisfied, the new burst transmission begins.

2)     The IP asserts UserFifoWrEn to 1b for 16 clock cycles to transfer 512-byte data from the Data buffer to the user logic.

3)     Once the transfer of 512-byte data is completed, UserFifoWrEn is de-asserted to 0b for one clock cycle. If additional data remains to be transferred, steps 1) – 3) are repeated until the total data size matches the transfer size specified in the command.

4)     After the total data is completely transferred, UserBusy is de-asserted to 0b.
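Similarly, a sketch of the Receive FIFO connection, assuming a 256-bit FIFO with a 12-bit write data count; the RxFifo* names are hypothetical.

```vhdl
-- Pad the 12-bit FIFO write data count with 1b to form the 16-bit UserFifoWrCnt,
-- so that a nearly full FIFO appears as all 1b to the IP
UserFifoWrCnt(15 downto 12) <= (others => '1');
UserFifoWrCnt(11 downto 0)  <= RxFifoWrDataCount;

RxFifoWrEn   <= UserFifoWrEn;
RxFifoWrData <= UserFifoWrData;
```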

 

IdenCtrl/IdenName

To ensure proper operation of the system, it is recommended to send the Identify command to the IP as the first command after the system boots up. This command updates important information about the SSD, such as its total capacity (LBASize) and LBA unit size (LBAMode), which are necessary for Write and Read commands to operate correctly. The following rules apply to the input parameters of these commands; a simple user-side check is sketched after the list.

1)     The sum of the address (UserAddr) and transfer length (UserLen), inputs of Write and Read commands, must not exceed the total capacity (LBASize) of the SSD.

2)     If LBAMode is 1b (LBA unit size is 4 KB), the three lower bits (bits[2:0]) of UserAddr and UserLen must be set to 000b to align with the 4 KB unit.
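A minimal sketch of checking these two rules on the user side before asserting UserReq; AlignOk and RangeOk are hypothetical helper signals, and the comparison uses ieee.numeric_std.

```vhdl
-- Rule 2): when LBAMode = 1b, UserAddr and UserLen must be aligned to the 4 KB unit
AlignOk <= '1' when (LBAMode = '0') or
                    (UserAddr(2 downto 0) = "000" and UserLen(2 downto 0) = "000") else '0';

-- Rule 1): UserAddr + UserLen must not exceed LBASize (49-bit arithmetic avoids overflow)
RangeOk <= '1' when resize(unsigned(UserAddr), 49) + resize(unsigned(UserLen), 49)
                    <= resize(unsigned(LBASize), 49) else '0';
```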

 

 

Figure 9: Identify command timing diagram

 

When executing the Identify command, the following steps are taken.

1)     Send the Identify command to the IP (UserCmd=000b and UserReq=1b).

2)     The IP asserts UserBusy to 1b after receiving the Identify command.

3)     The IP returns 4KB Identify controller data to the user with IdenWrAddr equal to 0-127 and asserts IdenWrEn. IdenWrData and IdenWrDWEn are valid at the same clock as IdenWrEn=1b.

4)     The IP returns 4KB Identify namespace data to the user with IdenWrAddr equal to 128-255. IdenWrAddr[7] can be used to determine the data type as Identify controller data or Identify namespace data.

5)     UserBusy is de-asserted to 0b after finishing the Identify command.

6)     The LBASize and LBAMode of the SSD are simultaneously updated with the values obtained from the Identify command.

 

 

Figure 10: IdenWrDWEn timing diagram

 

IdenWrDWEn is an 8-bit signal used to validate the 32-bit Dwords of IdenWrData. Some SSDs return the 4KB Identify controller data and Identify namespace data one Dword (32 bits) at a time instead of continuously. In that case, only one bit of IdenWrDWEn is asserted to 1b in each write cycle, as illustrated in Figure 10. Each bit of IdenWrDWEn (IdenWrDWEn[0], [1], …, [7]) corresponds to one 32-bit Dword of IdenWrData (IdenWrData[31:0], [63:32], …, [255:224]).
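A sketch of storing the returned Identify data in two byte-enabled RAMs on the user side is shown below. IdenWrAddr(7) selects between controller and namespace data, and each IdenWrDWEn bit expands to four byte enables of the 256-bit write port; the IdenCtrlRam*/IdenNameRam* names are hypothetical.

```vhdl
-- Select controller (IdenWrAddr(7) = 0b) or namespace (IdenWrAddr(7) = 1b) RAM
IdenCtrlRamWrEn <= IdenWrEn and (not IdenWrAddr(7));
IdenNameRamWrEn <= IdenWrEn and IdenWrAddr(7);
IdenRamWrAddr   <= IdenWrAddr(6 downto 0);    -- word 0-127 within each 4 KB half

-- Expand each Dword enable bit into four byte-enable bits of the 256-bit write port
gen_iden_be : for i in 0 to 7 generate
  IdenRamWrByteEn(4*i+3 downto 4*i) <= (others => IdenWrDWEn(i));
end generate;
```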

 

Shutdown

The Shutdown command should be sent as the last command before the system is powered down. The SSD ensures that the data in its internal cache is written to the flash memory before the shutdown process finishes. After the shutdown operation is completed, both the NVMeG4 IP and the SSD enter the Inactive state. If the SSD is powered down without executing the Shutdown command, the count of unsafe shutdowns, as reported in the data returned by the SMART command, is increased.

 

Figure 11: Shutdown command timing diagram

 

The process for executing the Shutdown command is described below.

1)     Ensure that the IP is in an Idle state (UserBusy=0b) before sending the Shutdown command. The user must set UserReq=1b and UserCmd=001b to request the Shutdown command.

2)     Once the NVMeG4 IP runs the Shutdown command, UserBusy is asserted to 1b.

3)     To clear the current request, UserReq is de-asserted to 0b after UserBusy is asserted to 1b.

4)     UserBusy is de-asserted to 0b when the SSD is completely shut down. After the shutdown process is completed, the IP will not receive any further user commands.

 

SMART

The SMART command is used to check the health of the SSD. When this command is sent, the SSD returns 512 bytes of health information. The SMART command parameters are loaded from the CtmSubmDW0-DW15 signals of the Custom command interface. The user must set these 16 Dwords to constant values before asserting UserReq. Once the SMART data is returned, it can be read via the CtmRAM port, as shown in Figure 12.

 

 

Figure 12: SMART command timing diagram

Below are the details of how to run the SMART command.

1)     The NVMeG4 IP must be Idle (UserBusy=0b) before sending the command request. All input parameters must be stable when UserReq is asserted to 1b to send the request. CtmSubmDW0-DW15 are set to the following constant values for the SMART command (see the VHDL sketch after these steps).

CtmSubmDW0                                         = 0x0000_0002

CtmSubmDW1                                         = 0xFFFF_FFFF

CtmSubmDW2 – CtmSubmDW5               = 0x0000_0000

CtmSubmDW6                                         = 0x2000_0000

CtmSubmDW7 – CtmSubmDW9               = 0x0000_0000

CtmSubmDW10                                       = 0x007F_0002

CtmSubmDW11 – CtmSubmDW15           = 0x0000_0000

2)     UserBusy is asserted to 1b after the NVMeG4 IP executes the SMART command.

3)     UserReq is de-asserted to 0b to clear the current request. Next, user logic can change the input parameters for the next command request.

4)     The 512-byte SMART data is returned on the CtmRamWrData signal with CtmRamWrEn asserted to 1b. CtmRamAddr runs from 0 to 15 as the index into the 512-byte data; when CtmRamAddr=0, bytes 0-31 of the SMART data are valid on CtmRamWrData. CtmRamWrDWEn is the Dword enable for each 32 bits of CtmRamWrData; if CtmRamWrDWEn=FFh, all 256 bits of CtmRamWrData are valid.

5)     UserBusy is de-asserted to 0b when finishing the SMART command.
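The constant submission entry above can be kept in the user logic as a VHDL constant array, as in the sketch below; the dw_array_t type and constant name are hypothetical, and the values are those listed in step 1).

```vhdl
-- Get Log Page (SMART/Health Information) submission entry, per step 1) above
type dw_array_t is array (0 to 15) of std_logic_vector(31 downto 0);
constant C_SMART_SUBM : dw_array_t := (
  0      => x"00000002",   -- CtmSubmDW0
  1      => x"FFFFFFFF",   -- CtmSubmDW1
  6      => x"20000000",   -- CtmSubmDW6
  10     => x"007F0002",   -- CtmSubmDW10
  others => x"00000000"    -- all remaining Dwords
);
```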

 

 

Figure 13: CtmRamWrDWEn timing diagram

 

Similar to the Identify command, some SSDs return only one Dword (32 bits) of data at a time instead of 512-byte data continuously. In such cases, one bit of CtmRamWrDWEn is asserted to 1b in each write cycle as the valid signal for the corresponding 32 bits of CtmRamWrData. Each bit of CtmRamWrDWEn (bit[0], [1], …, [7]) corresponds to 32 bits of CtmRamWrData (bit[31:0], [63:32], …, [255:224]).

 

Secure Erase

Secure Erase is a command that erases all user data in the SSD. After the Secure Erase command is executed, the contents of the user data are indeterminate. Since this command may take a long time to complete, the user must disable the IP timer by setting the TimeOutSet signal to zero.

 

Figure 14: Secure Erase command timing diagram

 

Below are the details of how to run the Secure Erase command.

1)     The IP must be in an Idle state (UserBusy=0b) before sending the command request. All input parameters must be stable when UserReq is asserted to 1b to send the request. TimeOutSet and CtmSubmDW0-DW15 are set to the following constant values for the Secure Erase command.

TimeOutSet                                             = 0x0000_0000 (Disable Timer)

CtmSubmDW0                                         = 0x0000_0080

CtmSubmDW1                                         = 0x0000_0001

CtmSubmDW2 – CtmSubmDW9               = 0x0000_0000

CtmSubmDW10                                       = 0x0000_0200

CtmSubmDW11 – CtmSubmDW15           = 0x0000_0000

2)     After the NVMeG4 IP executes the Secure Erase command, UserBusy is asserted to 1b.

3)     UserReq is then de-asserted to 0b to clear the current request, the user logic can change the input parameters for the next command request.

4)     UserBusy is de-asserted to 0b when the Secure Erase command is completed. After the operation finishes, TimeOutSet can be changed to another value to re-enable the Timeout function of the IP.

Note: Some SSDs may experience a decrease in performance after long data transfer. For such SSDs, the Secure Erase command can help restore their performance.

 

Flush

SSDs typically enhance write performance by caching write data before writing it to the flash memory. However, an unexpected power loss can result in data loss because cached data may not yet be stored in the flash memory. To avoid data loss, the Flush command can be used to force the SSD controller to write cached data to the flash memory.

 

Figure 15: Flush command timing diagram

 

To execute the Flush command, the following steps are taken.

1)     The IP must be Idle (UserBusy=0b) before sending the command request, and all input parameters must be stable when UserReq is asserted to 1b for sending the request. The CtmSubmDW0-DW15 is set as a constant value with the following values for Flush command.

CtmSubmDW0                             = 0x0000_0000

CtmSubmDW1                             = 0x0000_0001

CtmSubmDW2 – CtmSubmDW15 = 0x0000_0000

2)     UserBusy is asserted to 1b after the NVMeG4 IP executes the Flush command.

3)     UserReq is de-asserted to 0b to clear the current request, and the user logic can change the input parameters for the next command request.

4)     UserBusy is de-asserted to 0b when the Flush command is completed.

Using the Flush command ensures that all data from the previous Write command is guaranteed to be stored in flash memory, thus preventing data loss in the event of unexpected power loss.

 

Error

 

Figure 16: Error flag timing diagram

 

If an error occurs during the initialization process or when running some commands, the UserError flag is set to 1b. To check the type of error, the UserErrorType should be read. The NVMeCAPReg, AdmCompStatus, and IOCompStatus signals can be used to monitor the error details after UserError is set to 1b.

If an error occurs during the initialization process, it is recommended to read the NVMeCAPReg signal to check the capabilities of the NVMe SSD. If an error occurs while operating a command, it is recommended to read the AdmCompStatus and IOCompStatus signals.

The UserError flag is cleared only by the RstB signal. After the failure is resolved, RstB is asserted to 0b to clear the error flag.
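A small sketch of decoding a few UserErrorType bits on the user side, using the bit assignments from Table 4; the *Err flag names are hypothetical.

```vhdl
-- Flag selected error causes while UserError is asserted (bit map per Table 4)
AdmTimeoutErr <= UserError and UserErrorType(2);   -- Admin completion timeout
IoTimeoutErr  <= UserError and UserErrorType(4);   -- IO completion timeout
LbaUnitErr    <= UserError and UserErrorType(6);   -- unsupported LBA unit
PhyErr        <= UserError and UserErrorType(7);   -- PCIe PHY error
```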

 

Verification Methods

The NVMeG4 IP Core functionality was verified by simulation and proven on real hardware using the KCU116, ZCU102, ZCU106, and VCU118 evaluation boards.

 

Recommended Design Experience

Experienced design engineers with knowledge of the Vivado tools should easily integrate this IP into their designs.

 

Ordering Information

This product is available directly from Design Gateway Co., Ltd. Please contact Design Gateway Co., Ltd. for pricing and additional information about this product using the contact information on the front page of this datasheet.

 

Revision History

| Revision | Date | Description |
|---|---|---|
| 2.01 | 15-Dec-23 | Update FPGA resource and support Secure Erase command |
| 2.00 | 19-Jul-23 | Add data buffer mode and update physical I/O signals |
| 1.04 | 19-Jul-21 | Update FPGA resource and memory usage |
| 1.03 | 21-Dec-20 | Support KCU116, ZCU102 and ZCU106 |
| 1.02 | 12-Oct-20 | Update company info |
| 1.01 | 8-May-20 | Update FPGA resource and memory usage |
| 1.00 | 30-Jan-20 | Initial Release |