Jetson TX2 NVMe Hotplug/Hotswap PCIe Switch
Mar 04, 2024
@vidyas
Hello,
I have another question that expands on the NVMe hotplug / hotswap problem you helped me with back in February.
We now need to hotplug / hotswap an NVMe device (a CFX card, to be precise) that is connected to the Xavier NX via a PCIe switch (Pericom PI7C9X2G).
If the CFX card is plugged in before the system boots, it enumerates as 0004:05:00.0 Non-Volatile memory controller and works as expected.
factory@localhost:~$ sudo lspci
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1)
0004:01:00.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:01.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:02.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:03.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:03:00.0 RAM memory: Xilinx Corporation Device d021
0004:05:00.0 Non-Volatile memory controller: Device 1987:5013 (rev 01)
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
0005:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981
$ ls /sys/class/nvme/
nvme0 nvme1
If the CFX card is plugged in after the system has booted and I trigger hotplug through the debugfs hot_plug node of the controller the switch is attached to, nothing happens: the lspci output is unchanged.
$ sudo lspci
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1)
0004:01:00.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:01.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:02.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:03.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:03:00.0 RAM memory: Xilinx Corporation Device d021
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
0005:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981
** plug in the CFX card
$ sudo cat /sys/kernel/debug/pcie-4/hot_plug
$ sudo lspci
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1)
0004:01:00.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:01.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:02.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:03.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:03:00.0 RAM memory: Xilinx Corporation Device d021
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
0005:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981
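The before/after comparison above can be reduced to a short diff instead of reading the full listings by eye. This is only a minimal sketch: the function name `new_pci_devices` and the snapshot file names are hypothetical, and it assumes each snapshot holds `lspci -D` output, so every line starts with a full domain:bus:device.function address.

```shell
# Hypothetical helper: list PCI addresses that appear in the "after"
# snapshot but not in the "before" snapshot.
# $1 = file with lspci output taken before plugging in the card
# $2 = file with lspci output taken after the hotplug attempt
new_pci_devices() {
    before_sorted=$(mktemp)
    after_sorted=$(mktemp)
    # Keep only the first column (the PCI address) and sort it for comm.
    awk '{print $1}' "$1" | sort > "$before_sorted"
    awk '{print $1}' "$2" | sort > "$after_sorted"
    # -13 hides lines unique to "before" and lines common to both files,
    # leaving only the addresses that are new in "after".
    comm -13 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
}

# Usage on the target (file names are placeholders):
#   sudo lspci -D > before.txt
#   ... plug in the CFX card and trigger hot_plug ...
#   sudo lspci -D > after.txt
#   new_pci_devices before.txt after.txt
```

For the hot_plug attempt shown above this would print nothing, since the two listings are identical.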
After a rescan, the CFX card is enumerated, but no NVMe device is created.
$ sudo lspci
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1)
0004:01:00.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:01.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:02.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:03.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:03:00.0 RAM memory: Xilinx Corporation Device d021
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
0005:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981
** plug in the CFX card
$ sudo sh -c "echo 1 > /sys/bus/pci/rescan"
factory@localhost:~$ sudo lspci
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1)
0004:01:00.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:01.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:02.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:02:03.0 PCI bridge: Pericom Semiconductor Device 2404 (rev 05)
0004:03:00.0 RAM memory: Xilinx Corporation Device d021
0004:05:00.0 Non-Volatile memory controller: Device 1987:5013 (rev 01)
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
0005:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981
$ ls /sys/class/nvme/
nvme0
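Since the card shows up in lspci but not under /sys/class/nvme/, one thing worth checking is whether the nvme driver actually bound to 0004:05:00.0. The sketch below reads the standard `driver` symlink in the device's sysfs directory. The helper name `pci_driver_of` and the `SYSFS_ROOT` override are hypothetical; the override exists only so the function can be tried without the hardware.

```shell
# Hypothetical helper: print the name of the driver bound to a PCI function,
# "none" if the device exists but has no driver, "absent" if it is not listed.
# SYSFS_ROOT defaults to /sys; the override is an assumption for dry runs.
pci_driver_of() {
    bdf="$1"
    dev="${SYSFS_ROOT:-/sys}/bus/pci/devices/$bdf"
    if [ ! -e "$dev" ]; then
        echo "absent"
    elif [ -L "$dev/driver" ]; then
        # The driver symlink points at /sys/bus/pci/drivers/<name>.
        basename "$(readlink "$dev/driver")"
    else
        echo "none"
    fi
}

# Usage on the target:
#   pci_driver_of 0004:05:00.0   # hot-added CFX card
#   pci_driver_of 0005:01:00.0   # Samsung SSD that works
```

If the hot-added card reports "none", the kernel log around the time of the rescan (for example lines mentioning nvme or 0004:05:00.0 in dmesg) would be the next place to look for a probe or resource assignment failure.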
Is there any way to hotplug an NVMe device when it is connected to the root port via a PCIe switch?
Thanks