NVMe IP Core with PCIe Gen3 Soft IP Datasheet

Features

Applications

General Description

Functional Description

·      PCIe Data Link Controller

·      PCIe Media Access Controller (PCIe MAC)

User Logic

Xilinx PCI Express PHY

Core I/O Signals

Timing Diagram

Initialization

Control interface of dgIF typeS

Data interface of dgIF typeS

Verification Methods

Recommended Design Experience

Ordering Information

Revision History

 

 

 

 

Core Facts

Provided with Core
  Documentation:                         Reference Design Manual, Demo Instruction Manual
  Design File Formats:                   Encrypted File
  Instantiation Templates:               VHDL
  Reference Designs & Application Notes: Vivado Project, see Reference Design Manual
  Additional Items:                      Demo on ZCU102, ZCU106, VCU118, KCU105, KCU116

Support
  Support provided by Design Gateway Co., Ltd.

 

 

Design Gateway Co., Ltd.

E-mail:    ip-sales@design-gateway.com

URL:       design-gateway.com

Features

·     NVMe host controller with PCIe Soft IP to access an NVMe Gen3 SSD without a CPU or external memory

·     Connects directly to an NVMe SSD without a PCIe switch

·     Includes a 256-KB RAM as data buffer

·     Simple user interface via dgIF typeS

·     Supports seven commands: Identify, Shutdown, Write, Read, SMART, Secure Erase, and Flush

·     Supported NVMe devices

      ·     Base Class Code: 01h (mass storage), Sub Class Code: 08h (Non-volatile), Programming Interface: 02h (NVMHCI)

      ·     MPSMIN (Memory Page Size Minimum): 0 (4 KB)

      ·     MDTS (Maximum Data Transfer Size): at least 5 (128 KB) or 0 (no limitation)

      ·     LBA unit: 512 bytes or 4096 bytes

·     User clock frequency: at least 250 MHz (the PHY clock frequency for Gen3)

·     Operates with the Xilinx PCIe PHY, 4-lane PCIe Gen3 (128-bit bus interface)

·     Available reference designs:

      ·     KCU105, KCU116, ZCU102, ZCU106, VCU118 with AB17-M2FMC, AB18-PCIeX16, or AB19-M2PCI adapter board

·     Customized service for the following features

      ·     Additional NVMe commands

      ·     RAM size or RAM type (URAM) modification

 

 

Table 1: Example Implementation Statistics

Family                    Example Device       Fmax (MHz)  CLB Regs  CLB LUTs  CLB   IOB  BRAM Tile  Design Tools
Kintex-Ultrascale (GTH)   XCKU040-FFVA1156-2E  300         11941     10618     2187  -    70         Vivado 2019.1
Kintex-Ultrascale+ (GTY)  XCKU5P-FFVB676-2E    300         11941     10589     2291  -    70         Vivado 2019.1
Zynq-Ultrascale+ (GTH)    XCZU9EG-FFVB1156-2E  300         11941     10590     2292  -    70         Vivado 2019.1
Zynq-Ultrascale+ (GTH)    XCZU7EV-FFVC1156-2E  300         11941     10596     2317  -    70         Vivado 2019.1
Virtex-Ultrascale+ (GTY)  XCVU9P-FLGA2104-2LE  300         11941     10592     2287  -    70         Vivado 2019.1

Note: Actual logic resource usage depends on the percentage of unrelated logic.

 

 

Applications

 

Figure 1: NVMeG3 IP Application

 

NVMe IP Core with PCIe Gen3 Soft IP (NVMeG3 IP) provides an ideal solution for accessing an NVMe Gen3 SSD without the need for PCIe Hard IP, a CPU, or external memory. With its integrated PCIe Gen3 Soft IP and 256 KB memory, the NVMeG3 IP is an ideal option for applications that require vast storage capacity and high-speed performance on low-cost FPGAs that do not integrate a PCIe Hard IP. In scenarios where the selected device does not have enough PCIe Hard IP instances to connect all NVMe SSDs, a system can be designed using both the NVMe IP and the NVMeG3 IP, as shown in Figure 1.

However, when the selected FPGA integrates PCIe Hard IP and the number of PCIe Hard IP instances is sufficient, it is recommended to use the DG NVMe IP Core to optimize FPGA resource utilization.

 

We also offer alternative IP cores for specific applications using PCIe Hard IP, such as multiple users, random access, and PCIe switch support.

NVMe IP Core – Accesses an SSD using PCIe hard IP to minimize FPGA resource utilization.

https://dgway.com/NVMe-IP_X_E.html

Multiple User NVMe IP Core – Enables multiple users to simultaneously access an NVMe SSD for high-performance write and read operations.

https://dgway.com/muNVMe-IP_X_E.html

Random Access by Multiple User NVMe IP Core – Enables two users to write and read to the same NVMe SSD simultaneously, providing high random-access performance for applications with non-contiguous storage requirements.

https://dgway.com/rmNVMe-IP_X_E.html

NVMe IP Core for PCIe Switch – Accesses multiple NVMe SSDs via a PCIe switch to extend storage capacity, enabling high-speed write and read access to shared storage.

https://dgway.com/NVMe-IP_X_E.html

 

 

General Description

 

 

Figure 2: NVMeG3 IP Block Diagram

 

Design Gateway has developed the NVMeG3 IP, which serves as an NVMe host controller including PCIe Soft IP for accessing an NVMe Gen3 SSD. While the user interface of the NVMeG3 IP is similar to that of the standard DG NVMe IP, it offers the additional benefit of integrating PCIe Soft IP, which implements the Data Link layer and part of the Physical layer of the PCIe protocol. The physical interface of the NVMeG3 IP connects to the Xilinx PCIe PHY through a 128-bit PIPE interface. The Xilinx PCIe PHY contains the transceiver and equalizer logic.

The NVMeG3 IP consists of both the NVMe IP and PCIe Soft IP, so all features of the standard IP are retained. Table 2 shows the comparison between the NVMe IP and the NVMeG3 IP.

 

Table 2: The comparison of NVMe IP and NVMeG3 IP

Feature          NVMe IP                                    NVMeG3 IP
PCIe Interface   128-bit AXI4-Stream                        128-bit PIPE
Xilinx PCIe IP   Integrated Block for PCIe (PCIe Hard IP)   Xilinx PCIe PHY (Transceiver and equalizer)
PCIe Hard IP     Necessary                                  Not used
PCIe Speed       1-4 lanes, Gen3 or lower speed             4-lane PCIe Gen3 only
User Interface   dgIF typeS                                 dgIF typeS
FPGA resource    Smaller                                    Larger
Maximum SSDs     Depends on the number of PCIe Hard IPs     Depends on the number of transceivers
SSD Performance  Up to 3300 MB/s*                           Up to 3300 MB/s*

*Note: This performance was measured with a 500 GB Samsung 970 PRO SSD.

 

As shown in Table 2, the key advantage of the NVMeG3 IP is that it does not require PCIe Hard IP. Therefore, the maximum number of SSDs supported is limited not by the number of PCIe Hard IP instances, but by the number of transceivers and the resource utilization. However, a drawback of the NVMeG3 IP is its higher resource utilization compared to the NVMe IP, due to the implementation of PCIe Soft IP. In addition, the NVMeG3 IP supports only 4-lane PCIe Gen3 SSDs.

For more detailed information about the standard NVMe IP, please refer to the NVMe IP datasheet from our website.

https://dgway.com/products/IP/NVMe-IP/dg_nvme_ip_data_sheet_en/

To facilitate evaluation before purchase, we offer reference designs for FPGA evaluation boards, enabling the user to assess the performance and compatibility of the NVMeG3 IP.

 

Functional Description

Figure 3 shows the operation flow of the NVMeG3 IP after the IP reset is de-asserted.

 

Figure 3: NVMeG3 IP Operation Flow

 

As shown in Figure 3, the operation of the NVMeG3 IP is divided into three phases: Initialization, Command operation, and No operation. Compared to the NVMe IP, all operations of the NVMeG3 IP are similar. However, the NVMeG3 IP includes an additional process in the Initialization phase for executing PCIe link initialization and training. This process is controlled by the Physical layer to configure and initialize the link and port. The steps involved in this process are as follows.

1)     Detects device connection by monitoring the electrical idle signal.

2)     Waits until Bit/Symbol of each lane is locked.

3)     Configures the number of lanes to 4 lanes.

4)     Sets the PCIe speed to Gen3.

5)     Adjusts the equalizer parameters.

6)     Sets flow control parameters.

Upon completing the above steps, the link signal quality is ensured and the link is ready for the transfer of PCIe packets. The subsequent steps are similar to those of the NVMe IP. For more information, please refer to the NVMe IP datasheet.

As shown in Figure 2, the NVMeG3 IP includes a PCIe Soft IP that is optimized for executing the NVMe protocol. As a result, its resource utilization is lower than that of other PCIe Soft IPs. Further details of the PCIe Soft IP are described below.

·       PCIe Data Link Controller

The PCIe Data Link Controller implements the Data Link layer of the PCIe protocol. This layer ensures reliable delivery of TLPs, the packet format transferred between the PCIe Transaction Controller and the PCIe Data Link Controller. Each TLP is appended with a Link Cyclic Redundancy Code (LCRC) to facilitate error checking at the receiver. In addition, a Sequence Number is included to ensure the packets are received in the order they were sent. The receiver can also detect any missing TLPs in the transmission.

Upon successful LCRC verification and Sequence Number validation, Ack DLLPs (Data Link Layer Packets) are generated to confirm the error-free reception of TLPs. In case of a transmission error, Nak DLLPs are created to indicate the issue, prompting the transmitter to re-send the TLPs. The PCIe Soft IP also incorporates two 8 KB RAMs, serving as a Replay buffer and a data buffer for bidirectional data transfer.
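
The Ack/Nak protection described above can be modeled in a few lines. The sketch below is illustrative only: it uses Python's `zlib.crc32` (same CRC-32 polynomial family as the PCIe LCRC, but without the spec's exact bit ordering) and hypothetical helper names, to show how a framed TLP carries a sequence number plus CRC and how the receiver Acks in-order, error-free packets and Naks everything else.

```python
import zlib

def build_framed_tlp(seq: int, tlp: bytes) -> bytes:
    """Prepend a 12-bit sequence number and append a 32-bit LCRC.
    Illustrative model; real PCIe LCRC bit ordering is not reproduced."""
    framed = seq.to_bytes(2, "big") + tlp          # 2-byte field holds the 12-bit seq
    lcrc = zlib.crc32(framed).to_bytes(4, "big")   # CRC over seq number + TLP
    return framed + lcrc

def receive(framed: bytes, expected_seq: int):
    """Return ('Ack', seq) for a good in-order packet, else ('Nak', expected_seq)."""
    body, lcrc = framed[:-4], framed[-4:]
    seq = int.from_bytes(body[:2], "big")
    if zlib.crc32(body).to_bytes(4, "big") != lcrc or seq != expected_seq:
        return ("Nak", expected_seq)   # transmitter must replay from expected_seq
    return ("Ack", seq)

pkt = build_framed_tlp(seq=5, tlp=b"\x60\x00\x00\x01" + b"\xAA" * 16)
print(receive(pkt, expected_seq=5))                    # ('Ack', 5)
corrupted = pkt[:-1] + bytes([pkt[-1] ^ 0xFF])         # flip bits in the stored LCRC
print(receive(corrupted, expected_seq=5))              # ('Nak', 5)
```

On a Nak, the transmitter replays from its Replay buffer, which is exactly the role of one of the two 8 KB RAMs mentioned above.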

·       PCIe Media Access Controller (PCIe MAC)

The PCIe MAC module interfaces with the Xilinx PCIe PHY using the PIPE (PHY Interface for PCI Express) standard. This module serves two main purposes. First, it manages the link initialization and training process. Second, it controls the flow of data packets in accordance with the PCIe Physical layer specification.

During link initialization and training, certain processes are implemented within the Xilinx PCIe PHY, such as Clock and Data Recovery (CDR) for Bit lock and Block lock at Gen3 speed. The PCIe MAC incorporates the Link Training and Status State Machine (LTSSM), which is responsible for controlling link width, lane reversal, polarity inversion, and the Gen3 link data rate. As Gen3 operates at 8.0 GT/s, which is more sensitive to signal integrity than Gen1 and Gen2, additional features are implemented in the PCIe Gen3 MAC, including DC balance and equalization.

Once the initialization and training process is completed, data packets can be transferred. On the transmit side, the PCIe MAC module includes a multiplexer for selecting data types, byte striping for organizing the data format on each lane, and data scrambling to minimize noise. On the receive side, the receiver logic handles data de-scrambling, byte un-striping, and data filtering.
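
The scrambling and byte-striping steps can be illustrated with a small model. This is a sketch, not the PCIe Gen3 scrambler (which uses a per-lane 23-bit LFSR defined in the PCIe 3.0 specification; the tap set below is illustrative). The key property shown, that an additive (XOR) scrambler is its own inverse, holds for the real one as well.

```python
def lfsr_stream(seed: int, nbytes: int, taps=(22, 20, 15, 7, 4, 1)) -> bytes:
    """Generate a pseudo-random byte stream from a 23-bit Fibonacci LFSR.
    The tap set is illustrative, not the exact PCIe Gen3 polynomial."""
    state, out = seed & 0x7FFFFF, []
    for _ in range(nbytes):
        byte = 0
        for _ in range(8):
            fb = 0
            for t in taps:
                fb ^= (state >> t) & 1      # feedback = XOR of tapped bits
            byte = (byte << 1) | (state & 1)
            state = ((state >> 1) | (fb << 22)) & 0x7FFFFF
        out.append(byte)
    return bytes(out)

def scramble(data: bytes, seed: int = 0x1DBFBC) -> bytes:
    """Additive scrambler: XOR with the LFSR stream. Applying it twice restores data."""
    return bytes(d ^ k for d, k in zip(data, lfsr_stream(seed, len(data))))

def stripe(data: bytes, lanes: int = 4):
    """Byte striping: byte i goes to lane i % lanes, as on a 4-lane link."""
    return [data[i::lanes] for i in range(lanes)]

payload = bytes(range(32))
assert scramble(scramble(payload)) == payload   # de-scrambling is the same operation
print(stripe(payload)[0])                       # lane 0 carries bytes 0, 4, 8, ...
```

The receive path simply runs the same two steps in reverse order: un-stripe the lanes back into one stream, then XOR with the identical keystream.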

 

User Logic

User logic for operating the NVMeG3 IP is similar to that used for the NVMe IP, allowing users to utilize the same logic for both IP implementations.

 

Xilinx PCI Express PHY

Xilinx provides the Xilinx PCI Express PHY to enable the use of Soft IP instead of Hard IP to construct a PCIe MAC. The PCIe PHY uses the PHY Interface for PCI Express (PIPE) as its user interface. When operating with the NVMeG3 IP, the PCIe PHY is configured with a lane width of 4 and a link speed of 8.0 GT/s. For more detailed information on the Xilinx PCIe PHY, please refer to the "PG239: PCI Express PHY" document, available on the Xilinx website.

https://docs.xilinx.com/r/en-US/pg239-pcie-phy

 

Figure 4: Block Diagram of Xilinx PCI Express PHY

 

Core I/O Signals

Descriptions of all I/O signals are provided in Table 3 and Table 4.

Table 3: User logic I/O Signals (Synchronous to Clk signal)

Signal

Dir

Description

Control I/F of dgIF typeS

RstB

In

Synchronous reset signal. Active low. De-asserts to 1b when Clk signal is stable.

Clk

In

System clock for running the NVMeG3 IP. The frequency of this clock must be equal to or greater than the PhyClk frequency (250 MHz for PCIe Gen3).

UserCmd[2:0]

In

User Command. Valid when UserReq=1b. The possible values are

000b: Identify, 001b: Shutdown, 010b: Write SSD, 011b: Read SSD,

100b: SMART/Secure Erase, 110b: Flush, 101b/111b: Reserved.

UserAddr[47:0]

In

Start address to write/read SSD in 512-byte unit. Valid when UserReq=1b.

In case LBA unit = 4 KB, UserAddr[2:0] must always be set to 000b to align to the 4 KB unit.

In case LBA unit = 512 bytes, it is recommended to set UserAddr[2:0]=000b to align to the 4 KB SSD page size. The write/read performance of most SSDs is reduced when the start address is not aligned with the page size.

UserLen[47:0]

In

The total transfer size to write/read data from the SSD in 512-byte units. Valid from 1 to (LBASize-UserAddr). In case LBA unit = 4 KB, UserLen[2:0] must always be set to 000b to align to the 4 KB unit. Valid when UserReq=1b.

UserReq

In

Asserts to 1b to send the new command request and de-asserts to 0b after the IP starts the operation by asserting UserBusy to 1b. This signal can only be asserted when the IP is Idle (UserBusy=0b). Command parameters (UserCmd, UserAddr, UserLen, and CtmSubmDW0-DW15) must be valid and stable during UserReq=1b. UserAddr and UserLen are inputs for Write/Read command while CtmSubmDW0-DW15 are inputs for SMART, Secure Erase, or Flush command.

UserBusy

Out

Asserted to 1b when IP is busy. New request must not be sent (UserReq=1b) when IP is still busy.

LBASize[47:0]

Out

The total capacity of SSD in 512-byte unit. Default value is 0.

This value is valid after finishing Identify command.

LBAMode

Out

The LBA unit size of SSD (0b: 512 bytes, 1b: 4 Kbytes). Default value is 0b.

This value is valid after finishing Identify command.

UserError

Out

Error flag. Asserted to 1b when UserErrorType is not equal to 0.

The flag is de-asserted to 0b by asserting RstB to 0b.

UserErrorType[31:0]

Out

Error status.

[0] – An error when PCIe class code is incorrect.

[1] – An error from Controller capabilities (CAP) register, which can occur due to various reasons.

- Memory Page Size Minimum (MPSMIN) is not equal to 0.

- NVM command set flag (bit 37 of CAP register) is not set to 1.

- Doorbell Stride (DSTRD) is not equal to 0.

- Maximum Queue Entries Supported (MQES) is less than 7.

More details of each register can be found in NVMeCAPReg signal.

[2] – An error when the Admin completion entry is not received within the specified timeout.

[3] – An error when the status register in the Admin completion entry is not 0 or when the phase tag/command ID is invalid. More details can be found in the AdmCompStatus signal.

[4] – An error when the IO completion entry is not received within the specified timeout.

[5] – An error when the status register in the IO completion entry is not 0 or when the phase tag is invalid. More details can be found in the IOCompStatus signal.

[6] – An error from unsupported LBA unit (not equal to 512 bytes or 4 KB).

 

[7] – An error from PCIe PHY.

[8] – An error when receiving TLP packet with an incorrect size.

[9] – Reserved

[10] – An error from Unsupported Request (UR) flag in Completion TLP packet.

[11] – An error from Completer Abort (CA) flag in Completion TLP packet.

[23:12] – Reserved

[24] – An error from Data Link Layer protocol.

[31:25] – Reserved

Note: Timeout period of bit[2]/[4] is set from TimeOutSet input.

Data I/F of dgIF typeS

UserFifoWrCnt[15:0]

In

Write data counter for the Receive FIFO, used to monitor the FIFO full status. When the FIFO becomes full, data transmission for the Read command temporarily halts. If the FIFO data counter is narrower than 16 bits, the upper bits should be padded with 1b to complete the 16-bit count.

UserFifoWrEn

Out

Asserted to 1b to write data to the Receive FIFO when executing the Read command.

UserFifoWrData[127:0]

Out

Write data bus of Receive FIFO. Valid when UserFifoWrEn=1b.

UserFifoRdCnt[15:0]

In

Read data counter for the Transmit FIFO, used to monitor the amount of data stored in the FIFO. If the counter indicates an empty status, the transmission of data packets for the Write command temporarily pauses. If the FIFO data counter is narrower than 16 bits, the upper bits should be padded with 0b to complete the 16-bit count.

UserFifoEmpty

In

Unused for this IP.

UserFifoRdEn

Out

Asserted to 1b to read data from the Transmit FIFO when executing the Write command.

UserFifoRdData[127:0]

In

Read data returned from the Transmit FIFO.

Valid in the next clock after UserFifoRdEn is asserted to 1b.

NVMeG3 IP Interface

IPVesion[31:0]

Out

IP version number

TestPin[31:0]

Out

Reserved to be IP Test point.

TimeOutSet[31:0]

In

Timeout value to wait for completion from SSD. The time unit is equal to 1/(Clk frequency).

When TimeOutSet is equal to 0, Timeout function is disabled.

AdmCompStatus[15:0]

Out

Status output from Admin Completion Entry

[0] – Set to 1b when the Phase tag or Command ID in Admin Completion Entry is invalid.

[15:1] – Status field value of Admin Completion Entry

IOCompStatus[15:0]

Out

Status output from IO Completion Entry

[0] – Set to 1b when the Phase tag in IO Completion Entry is invalid.

[15:1] – Status field value of IO Completion Entry

NVMeCAPReg[31:0]

Out

The parameter value of the NVMe capability register when UserErrorType[1] is asserted to 1b.

[15:0] – Maximum Queue Entries Supported (MQES)

[19:16] – Doorbell Stride (DSTRD)

[20] – NVM command set flag

[24:21] – Memory Page Size Minimum (MPSMIN)

[31:25] – Undefined

 


Identify Interface

IdenWrEn

Out

Asserted to 1b for sending data output from the Identify command.

IdenWrDWEn[3:0]

Out

Dword (32 bit) enable of IdenWrData. Valid when IdenWrEn=1b.

1b: This Dword data is valid, 0b: This Dword data is not available.

Bit[0], [1], [2], and [3] correspond to IdenWrData[31:0], [63:32], [95:64] and [127:96], respectively.

IdenWrAddr[8:0]

Out

Index of IdenWrData in 128-bit unit. Valid when IdenWrEn=1b.

0x000-0x0FF: 4KB Identify controller data,

0x100-0x1FF: 4KB Identify namespace data.

IdenWrData[127:0]

Out

4KB Identify controller data or Identify namespace data. Valid when IdenWrEn=1b.

Custom interface (Command and RAM)

CtmSubmDW0[31:0] – CtmSubmDW15[31:0]

In

16 Dwords of Submission queue entry for SMART, Secure Erase, or Flush command.

DW0: Command Dword0, DW1: Command Dword1, …, and DW15: Command Dword15.

These inputs must be valid and stable during UserReq=1b and UserCmd=100b (SMART/Secure Erase) or 110b (Flush).

CtmCompDW0[31:0] –

CtmCompDW3[31:0]

Out

4 Dwords of Completion queue entry, output from SMART, Secure Erase, or Flush command.

DW0: Completion Dword0, DW1: Completion Dword1, …, and DW3: Completion Dword3

CtmRamWrEn

Out

Asserted to 1b for sending data output from Custom command such as SMART command.

CtmRamWrDWEn[3:0]

Out

Dword (32 bit) enable of CtmRamWrData. Valid when CtmRamWrEn=1b.

1b: This Dword data is valid, 0b: This Dword data is not available.

Bit[0], [1], [2], and [3] correspond to CtmRamWrData[31:0], [63:32], [95:64], and [127:96], respectively.

CtmRamAddr[8:0]

Out

Index of CtmRamWrData when SMART data is received. Valid when CtmRamWrEn=1b.

(Optional) Index to request data input through CtmRamRdData for customized Custom commands.

CtmRamWrData[127:0]

Out

512-byte data output from SMART command. Valid when CtmRamWrEn=1b.

CtmRamRdData[127:0]

In

(Optional) Data input for customized Custom commands.
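
The parameter rules in Table 3 (command encoding, 512-byte addressing, 4 KB alignment when LBAMode=1b) can be captured in a small host-side check before asserting UserReq. Below is a sketch in Python with hypothetical names; the IP itself performs no such check, so out-of-range values must be avoided by user logic.

```python
# UserCmd[2:0] encoding from Table 3
USER_CMD = {"identify": 0b000, "shutdown": 0b001, "write": 0b010,
            "read": 0b011, "smart_secure_erase": 0b100, "flush": 0b110}

def check_request(cmd: str, addr: int, length: int, lba_size: int, lba_mode: int) -> int:
    """Validate UserCmd/UserAddr/UserLen against the rules in Table 3.
    addr/length are in 512-byte units; lba_size mirrors LBASize, lba_mode mirrors LBAMode."""
    if cmd not in USER_CMD:
        raise ValueError("unknown command")
    if cmd in ("write", "read"):
        if not (1 <= length <= lba_size - addr):
            raise ValueError("UserLen must be 1 .. LBASize - UserAddr")
        if lba_mode == 1 and (addr & 0b111 or length & 0b111):
            raise ValueError("4 KB LBA unit: UserAddr[2:0] and UserLen[2:0] must be 000b")
    return USER_CMD[cmd]

# Example: 4 KB-aligned Write on an SSD reporting a 4 KB LBA unit
print(check_request("write", addr=0x1000, length=8, lba_size=1 << 30, lba_mode=1))  # 2
```

CtmSubmDW0-DW15 would additionally be driven for the SMART, Secure Erase, and Flush commands, as noted in the UserReq description.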

 

Table 4: Physical I/O Signals (Synchronous to PhyClk)

Signal

Dir

Description

PHY Clock and Reset

PhyRstB

In

Synchronous reset signal. Active low.

De-asserts to 1b when PCIe PHY is not in reset state, detected by phy_phystatus_rst signal being de-asserted to 0b.

PhyClk

In

Clock output from PCIe PHY (250 MHz).

Other PHY Interface

MACTestPin[63:0]

Out

Test point of PCIe MAC.

MACStatus[7:0]

Out

Status output from PCIe MAC.

[5:0] – LTSSM

- 000000b: Detect.Quiet

- 000001b: Detect.Active

- 000010b: Polling.Active

- 000011b: Polling.Configuration

- 000100b: Config.Linkwidth

- 000101b: Config.Lanenum

- 000110b: Config.Complete

- 000111b: Config.Idle

- 001000b: Recovery.RcvrLock

- 001001b: Recovery.Speed

- 001010b: Recovery.RcvrCfg

- 001011b: Recovery.Idle

- 010000b: L0

- 100000b: Recovery.EqP0

- 100001b: Recovery.EqP1

- 100010b: Recovery.EqP2

- 100011b: Recovery.EqP3

[6] – Reserved

[7] – Asserted to 1b after PCIe initialization is completed.

PIPE Data Interface

PhyTxData[255:0]

Out

Parallel data output to PHY.

Note: When connecting to PHY on Ultrascale device, only 128-bit is used.

PhyTxData[31:0] must connect to phy_txdata[31:0]

PhyTxData[95:64] must connect to phy_txdata[63:32]

PhyTxData[159:128] must connect to phy_txdata[95:64]

PhyTxData[223:192] must connect to phy_txdata[127:96]

PhyTxDataK[7:0]

Out

Control data to indicate whether PhyTxData is control or data.

PhyTxDataValid[3:0]

Out

Asserted to 1b when the valid data is presented on PhyTxData.

PhyTxStartBlock[3:0]

Out

Asserted to 1b at the first clock of 128b block to indicate the start of block.

Valid when PhyTxDataValid=1b.

PhyTxSyncHeader[7:0]

Out

Indicates whether data block is ordered set or data stream.

Valid at the first clock of 128b block, together with PhyTxStartBlock.

 


PhyRxData[255:0]

In

Data input from PHY.

Note: When connecting to PCIe PHY on Ultrascale device, only 128-bit is used.

PhyRxData[31:0] must connect to phy_rxdata[31:0]

PhyRxData[95:64] must connect to phy_rxdata[63:32]

PhyRxData[159:128] must connect to phy_rxdata[95:64]

PhyRxData[223:192] must connect to phy_rxdata[127:96].

NVMeG3 IP ignores the remaining bits (bit[63:32], bit[127:96], bit[191:160], and bit[255:224]).

PhyRxDataK[7:0]

In

Control data to indicate whether PhyRxData is control or data.

PhyRxDataValid[3:0]

In

Asserts to 1b when the valid data is presented on PhyRxData.

PhyRxStartBlock[7:0]

In

Asserts to 1b at the first clock of 128b block to indicate the start of block.

Valid when PhyRxDataValid=1b.

Note: When connecting to PCIe PHY on Ultrascale device, only 4-bit is used.

PhyRxStartBlock[0] must connect to phy_rxstart_block[0]

PhyRxStartBlock[2] must connect to phy_rxstart_block[1]

PhyRxStartBlock[4] must connect to phy_rxstart_block[2]

PhyRxStartBlock[6] must connect to phy_rxstart_block[3].

NVMeG3 IP ignores the remaining bits (bit[1], bit[3], bit[5], and bit[7]).

PhyRxSyncHeader[7:0]

In

Indicates whether data block is ordered set or data stream.

Valid at the first clock of 128b block, together with PhyRxStartBlock.

PIPE Control and Status Signal

PhyTxDetectRx

Out

Requests PCIe PHY to begin a receiver detection operation.

PhyTxElecIdle[3:0]

Out

Forces Tx to enter electrical idle state.

PhyTxCompliance[3:0]

Out

Asserted to 1b to set the running disparity to negative.

PhyRxPolarity[3:0]

Out

Requests PCIe PHY to perform polarity inversion on the received data.

PhyPowerdown[1:0]

Out

Requests PCIe PHY to change the power state.

PhyRate[1:0]

Out

Requests PCIe PHY to change link rate.

PhyRxValid[3:0]

In

Indicates symbol lock and valid data when logic high.

PhyPhyStatus[3:0]

In

Used to communicate completion of several PIPE operations.

PhyRxElecIdle[3:0]

In

Indicates Rx electrical idle detected.

PhyRxStatus[11:0]

In

Rx status and error codes.

Driver and Equalization Signal

PhyTxMargin[2:0]

Out

Selects Tx voltage levels. This signal is fixed to 000b.

PhyTxSwing

Out

Controls Tx voltage swing level. This signal is fixed to 0b.

PhyTxDeEmph

Out

Selects Tx de-emphasis. This signal is fixed to 1b.

PhyTxEqCtrl[7:0]

Out

Tx equalization control.

PhyTxEqPreset[15:0]

Out

Tx equalization preset.

PhyTxEqCoeff[23:0]

Out

Tx equalization coefficient.

PhyTxEqFS[5:0]

In

Indicates the full swing of the Tx driver. Static value based on characteristics of Tx driver.

PhyTxEqLF[5:0]

In

Indicates the low frequency of the Tx driver. Static value based on characteristics of Tx driver.

PhyTxEqNewCoeff[71:0]

In

Status of the current Tx equalization coefficient.

PhyTxEqDone[3:0]

In

Asserts to 1b when Tx equalization is done.

 


PhyRxEqCtrl[7:0]

Out

Rx equalization control.

PhyRxEqTxPreset[15:0]

Out

Link partner status for Tx preset.

PhyRxEqPresetSel[3:0]

In

Indicates whether the equalization request is a coefficient or a preset.

PhyRxEqNewTxCoeff[71:0]

In

New Tx coefficient or preset to request the link partner.

PhyRxEqAdaptDone[3:0]

In

Asserts to 1b when RX equalization is successfully done.

Valid when PhyRxEqDone is asserted to 1b.

PhyRxEqDone[3:0]

In

Asserts to 1b when Rx equalization is finished.

Assist signal

AsMacInDetect

Out

Assists PCIe PHY to switch the receiver termination between VTT and GND.

AsCdrHoldReq

Out

Assists PCIe PHY to hold CDR.

Note: For more detailed information about the signals of Xilinx PCIe PHY, please refer to the “PG239: PCI Express PHY” document, available at Xilinx website.
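
For bring-up debugging, the MACStatus[7:0] encoding above maps directly to a lookup table. The helper below is a hypothetical Python decode of that byte (not part of the IP deliverables), using only the LTSSM values listed in Table 4.

```python
# MACStatus[5:0] LTSSM encoding from Table 4
LTSSM = {
    0b000000: "Detect.Quiet",       0b000001: "Detect.Active",
    0b000010: "Polling.Active",     0b000011: "Polling.Configuration",
    0b000100: "Config.Linkwidth",   0b000101: "Config.Lanenum",
    0b000110: "Config.Complete",    0b000111: "Config.Idle",
    0b001000: "Recovery.RcvrLock",  0b001001: "Recovery.Speed",
    0b001010: "Recovery.RcvrCfg",   0b001011: "Recovery.Idle",
    0b010000: "L0",
    0b100000: "Recovery.EqP0",      0b100001: "Recovery.EqP1",
    0b100010: "Recovery.EqP2",      0b100011: "Recovery.EqP3",
}

def decode_mac_status(status: int):
    """Split MACStatus[7:0] into the LTSSM state name and the init-done flag (bit 7)."""
    state = LTSSM.get(status & 0x3F, "Unknown")
    init_done = bool(status & 0x80)
    return state, init_done

print(decode_mac_status(0x90))   # ('L0', True): link trained, initialization complete
```

A link that stalls in a Recovery.* state rather than reaching L0 with bit 7 set typically points at signal-integrity or equalization problems during training.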

 

Timing Diagram

 

Initialization

 

 

Figure 5: Timing diagram of the reset sequence

 

The initialization process of the NVMeG3 IP follows the steps below, as shown in the timing diagram.

1)     Monitor the phy_phystatus_rst signal from the PCIe PHY and wait until it is de-asserted to 0b, indicating the completion of the PCIe PHY reset process. After that, the user logic de-asserts PhyRstB to 1b to initiate the link training process by NVMeG3 IP.

2)     After de-asserting PhyRstB and ensuring the stability of the Clk signal, the user logic can de-assert RstB to 1b. This action triggers the NVMe logic within the NVMeG3 IP to initiate the operation.

3)     Once the NVMeG3 IP completes the initialization processes, including link training, flow control initialization, and configuration of PCIe and NVMe registers, UserBusy is de-asserted to 0b.

Following these sequences, the NVMeG3 IP is ready to receive the command from the user.
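
The ordering constraint above (phy_phystatus_rst falls, then PhyRstB is released, then RstB, then UserBusy goes low) can be expressed as a tiny check. The sketch below is a hypothetical Python model over named events mirroring Figure 5, not RTL.

```python
def reset_sequence(events) -> bool:
    """Walk (time, signal) events and verify the reset milestones occur in order:
    phy_phystatus_rst deasserted -> PhyRstB released -> RstB released -> UserBusy low."""
    order = ["phy_phystatus_rst=0", "PhyRstB=1", "RstB=1", "UserBusy=0"]
    seen = []
    for t, sig in sorted(events):
        if sig in order:
            # each milestone may only appear after all earlier ones
            if order.index(sig) != len(seen):
                raise RuntimeError(f"{sig} at t={t} violates the reset ordering")
            seen.append(sig)
    return seen == order

good = [(0, "phy_phystatus_rst=0"), (5, "PhyRstB=1"), (9, "RstB=1"), (400, "UserBusy=0")]
print(reset_sequence(good))   # True
```

Releasing RstB before PhyRstB, for example, would raise an error in this model, matching the requirement that link training starts before the NVMe logic begins its initialization.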

 

Control interface of dgIF typeS

The dgIF typeS signals can be split into two groups: the Control interface for sending commands and monitoring status, and the Data interface for transferring data streams in both directions.

Figure 6 shows an example of how to send a new command to the IP via the Control interface of dgIF typeS.

 

 

Figure 6: Control Interface of dgIF typeS timing diagram

 

1)     UserBusy must be equal to 0b before sending a new command request to confirm that the IP is Idle.

2)     Command and its parameters such as UserCmd, UserAddr, and UserLen must be valid when asserting UserReq to 1b to send the new command request.

3)     IP asserts UserBusy to 1b after starting the new command operation.

4)     After UserBusy is asserted to 1b, UserReq is de-asserted to 0b to finish the current request. New parameters for the next command can be prepared on the bus. UserReq for the new command must not be asserted to 1b until the current command operation is finished.

5)     UserBusy is de-asserted to 0b after the command operation is completed. Next, new command request can be initiated by asserting UserReq to 1b.

Note: The number of parameters used in each command is different. More details are described below.
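
The five-step handshake can be modeled as a simple state machine. The sketch below is a hypothetical Python behavioral model of the dgIF typeS control handshake (the real interface is clocked RTL); it shows that a request is only accepted while the IP is idle.

```python
class DgIfControl:
    """Minimal behavioral model of the UserReq/UserBusy handshake."""
    def __init__(self):
        self.user_busy = 0

    def request(self, user_cmd: int, user_addr: int, user_len: int):
        # step 1: UserBusy must be 0b before a new request
        if self.user_busy:
            raise RuntimeError("UserReq must not be asserted while UserBusy=1b")
        # steps 2-3: parameters captured, IP starts the operation
        self.user_busy = 1
        return {"cmd": user_cmd, "addr": user_addr, "len": user_len}

    def complete(self):
        # step 5: operation finished, IP returns to Idle
        self.user_busy = 0

ip = DgIfControl()
req = ip.request(user_cmd=0b010, user_addr=0, user_len=8)   # Write, 4 KB
assert ip.user_busy == 1
ip.complete()
assert ip.user_busy == 0
```

Step 4 (preparing the next command's parameters while the current one runs) is permitted in this model too: the parameters are only sampled inside request(), never while busy.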

 

Data interface of dgIF typeS

The Data interface of dgIF typeS is used for transferring a data stream when operating a Write or Read command, and it is compatible with a general FIFO interface. Figure 7 shows the data interface of dgIF typeS when transferring Write data to the IP in the Write command.

 

 

Figure 7: Transmit FIFO Interface for Write command

 

The 16-bit FIFO read data counter (UserFifoRdCnt) shows the total amount of data stored in the Transmit FIFO; when sufficient data is available, 512 bytes (32 x 128-bit words) are transferred per burst.

In the Write command, data is read from the Transmit FIFO until all data has been transferred. The details of the transfer process are as follows.

1)     Before starting a new burst transfer, the IP waits until at least 512 bytes of data are available in the Transmit FIFO by monitoring UserFifoRdCnt[15:5], which must not be equal to 0.

2)     The IP asserts UserFifoRdEn to 1b for 32 clock cycles to read 512 bytes from the Transmit FIFO.

3)     UserFifoRdData is valid in the next clock cycle after UserFifoRdEn is asserted to 1b, and 32 words are transferred continuously.

4)     After the 32nd word (D31) is read, UserFifoRdEn is de-asserted to 0b.

5)     Steps 1) - 4) are repeated to transfer the next 512 bytes until the total amount of data equals the transfer size specified in the command.

6)     After total data is completely transferred, UserBusy is de-asserted to 0b.
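
The burst condition in step 1 (UserFifoRdCnt[15:5] not equal to 0, i.e. at least 32 words buffered) can be sketched directly. Below is a hypothetical Python model of the Transmit FIFO draining in 512-byte bursts, not the RTL behavior cycle for cycle.

```python
def drain_transmit_fifo(fifo_words: list, total_words: int) -> list:
    """Read 512-byte (32-word) bursts from the Transmit FIFO until
    total_words 128-bit words have been transferred."""
    read = []
    while len(read) < total_words:
        rd_cnt = len(fifo_words)           # models UserFifoRdCnt
        if rd_cnt >> 5 == 0:               # UserFifoRdCnt[15:5] == 0: under 32 words
            break                          # would wait for data; abridged to stop here
        # UserFifoRdEn held high for 32 clocks: one 512-byte burst
        burst = [fifo_words.pop(0) for _ in range(32)]
        read.extend(burst)
    return read

words = list(range(64))                    # two full 512-byte bursts queued
out = drain_transmit_fifo(words, total_words=64)
print(len(out))   # 64
```

With only 40 words queued, the model reads one 32-word burst and then stops at the threshold check, mirroring the IP pausing until user logic refills the FIFO.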

 

 

Figure 8: Receive FIFO Interface for Read command

 

When executing the Read command, the data is transferred from the SSD to the Receive FIFO until the entire data is transferred. The steps for transferring a burst of data are below.

1)     Before starting a new burst transmission, UserFifoWrCnt[15:6] is checked to verify that there is enough free space in the Receive FIFO, indicated by UserFifoWrCnt[15:6] not being all 1b (1023). The IP also waits until at least 512 bytes of data have been received from the SSD. Once both conditions are satisfied, the new burst transmission begins.

2)     The IP asserts UserFifoWrEn to 1b for 32 clock cycles to transfer 512-byte data from the Data buffer to user logic.

3)     Once the transfer of 512-byte data is completed, UserFifoWrEn is de-asserted to 0b for a single clock cycle. If additional data remains to be transferred, steps 1) - 3) are repeated until the total data size matches the transfer size specified in the command.

4)     After the total data is completely transferred, UserBusy is de-asserted to 0b.
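
The free-space test in step 1 can be written down directly. The helper below is a hypothetical Python expression of that condition, using the signal names from Table 3.

```python
def can_start_read_burst(user_fifo_wr_cnt: int, received_bytes: int) -> bool:
    """Step 1 of the Read-command flow: a burst starts only when the Receive FIFO
    has room (UserFifoWrCnt[15:6] != 1023, i.e. not all 1b) and at least
    512 bytes have arrived from the SSD."""
    fifo_has_room = (user_fifo_wr_cnt >> 6) != 1023
    enough_data = received_bytes >= 512
    return fifo_has_room and enough_data

print(can_start_read_burst(user_fifo_wr_cnt=100, received_bytes=512))      # True
print(can_start_read_burst(user_fifo_wr_cnt=0xFFC0, received_bytes=4096))  # False: FIFO full
```

This also explains the padding rule in the UserFifoWrCnt description: a narrower counter padded with 1b in the upper bits reaches the "all 1b" full condition at the correct, smaller FIFO depth.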

 

For the timing diagrams of the user interface during the execution of other commands, such as Identify, SMART, Flush, Secure Erase, and Shutdown, please refer to the NVMe IP datasheet, available on our website.

 

Verification Methods

The NVMeG3 IP Core functionality was verified by simulation and proven in real board designs using KCU105, KCU116, ZCU102, ZCU106, and VCU118 evaluation boards.

 

Recommended Design Experience

Experienced design engineers with knowledge of Vivado tools should easily integrate this IP into their designs.

 

Ordering Information

This product is available directly from Design Gateway Co., Ltd. Please contact Design Gateway Co., Ltd. for pricing and additional information about this product using the contact information on the front page of this datasheet.

 

Revision History

Revision  Date       Description
1.6       23-Feb-24  - Update IP resource utilization
                     - Update UserErrorType and MACStatus description in Core I/O Signals
                     - Support Secure Erase command
1.5       22-Jun-23  Update NVMeG3 IP block diagram and UserErrorType description in Core I/O Signals
1.4       14-Mar-22  Update IP resource utilization and board support
1.3       18-Dec-20  Update IP resource utilization
1.2       12-Oct-20  Update company info
1.1       22-Apr-20  Change example device to XCZU9EG, update resources in Table 1, and support URAM customization
1.0       29-Aug-19  Initial release