
Getting PCIe RX Error when attempting to test PCIe on Jetson AGX Xavier on Jetpack 5.1

Mar 04, 2024

Hi. I am attempting to test PCIe using two Jetson AGX Xaviers on Jetpack 5.1. However, I have run into issues, and need help.


I am currently following this guide:
https://docs.nvidia.com/jetson/archives/r35.2.1/DeveloperGuide/text/SD/Communications/PcieEndpointMode.html


However, when I get to the step where I need to boot the root port device, the following error messages are printed repeatedly on the terminal:


[   71.231818] pcieport 0005:00:00.0:    [ 0] RxErr
[   71.240052] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[   71.247907] pcieport 0005:00:00.0: device [10de:1ad0] error status/mask=00000001/0000e000
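For reference, the status/mask pair in the last log line can be decoded bit by bit. The sketch below assumes the bit positions of the PCIe AER Corrected Error Status register and uses short names modelled on the ones the kernel prints; the function name decode_aer_correctable is only illustrative.

```shell
#!/bin/sh
# Decode an AER Corrected Error Status (or Mask) value, given as hex digits,
# into the names of the bits that are set.
decode_aer_correctable() {
    val=$((0x$1))
    out=""
    # bit:name pairs for the defined bits of the register
    for entry in 0:RxErr 6:BadTLP 7:BadDLLP 8:Rollover 12:Timeout \
                 13:NonFatalErr 14:CorrIntErr 15:HeaderOF; do
        bit=${entry%%:*}
        name=${entry#*:}
        if [ $(( (val >> bit) & 1 )) -eq 1 ]; then
            out="${out:+$out }$name"
        fi
    done
    echo "$out"
}

decode_aer_correctable 00000001   # status value from the log
decode_aer_correctable 0000e000   # mask value from the log
```

With the values from the log, the status decodes to the single receiver error bit, which matches the "[ 0] RxErr" line, and the mask covers bits 13 to 15 only, so receiver errors are not masked.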

I have to remove the PCIe cable for the error messages to stop and for the board to finish booting. However, when I connect the PCIe cable again, the same error reappears and keeps repeating until I remove the cable.


I have tried disabling ASPM by running the following command on the root port:
echo "performance" > /sys/module/pcie_aspm/parameters/policy


However, as soon as I plug the PCIe cable back in after running this command, the same error still comes up.
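One way to confirm that the policy write was accepted is to read the same sysfs file back: the kernel lists all policies and puts the active one in square brackets. The helper below is a minimal sketch that extracts the bracketed entry; the name active_aspm_policy is hypothetical.

```shell
#!/bin/sh
# Print the active ASPM policy, i.e. the entry shown in [brackets],
# from the policy list supplied on stdin.
active_aspm_policy() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Only attempt the read where the ASPM module parameter is exposed.
policy_file=/sys/module/pcie_aspm/parameters/policy
if [ -r "$policy_file" ]; then
    active_aspm_policy < "$policy_file"
fi
```

If this does not print "performance" after the echo, the write did not take effect, for example because it was not run as root.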


I have also tried reducing the speed to Gen1 by modifying the device tree, but I still get the same error.


Although I have modified the device tree, I am unsure whether the change actually took effect. How do I verify that the speed is indeed Gen1?
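A possible check, sketched under the assumption that the kernel exposes the standard current_link_speed and max_link_speed sysfs attributes for the root port: read both and map the reported transfer rate to a generation. The device address is the one from the log lines above, the helper name speed_to_gen is hypothetical, and the values are only meaningful while the link is up.

```shell
#!/bin/sh
# Map a link speed string such as "2.5 GT/s PCIe" to a PCIe generation.
speed_to_gen() {
    case "$1" in
        2.5*) echo Gen1 ;;
        5*)   echo Gen2 ;;
        8*)   echo Gen3 ;;
        16*)  echo Gen4 ;;
        *)    echo unknown ;;
    esac
}

# Root port address taken from the log lines in this post.
dev=/sys/bus/pci/devices/0005:00:00.0
if [ -r "$dev/current_link_speed" ]; then
    cur=$(cat "$dev/current_link_speed")
    max=$(cat "$dev/max_link_speed")
    echo "current: $cur ($(speed_to_gen "$cur")), max: $max ($(speed_to_gen "$max"))"
fi
```

The same rate also appears in the LnkSta line of lspci -vv output for the device, where 2.5GT/s corresponds to Gen1.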


I made the following changes to the BSP sources:



  1. Changed the nvidia,max-speed property of pcie@141a0000 and pcie_ep@141a0000 from 4 to 1 in the Linux_for_Tegra/source/public/hardware/nvidia/soc/t19x/kernel-dts/tegra194-soc/tegra194-soc-pcie.dtsi file.

  2. Changed CONFIG_PCIEASPM_POWER_SUPERSAVE=y to CONFIG_PCIEASPM_POWER_SUPERSAVE=n in the Linux_for_Tegra/source/public/kernel/kernel-5.10/arch/arm64/configs/tegra_defconfig file.
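To see whether the defconfig change in step 2 reached the running kernel, the ASPM default-policy options can be read from the kernel's embedded config. This is a sketch that assumes the kernel was built with CONFIG_IKCONFIG_PROC so that /proc/config.gz exists; the function name aspm_config_choice is hypothetical. In the upstream 5.10 Kconfig these four options form one choice, so turning one off without enabling another leaves the kernel to fall back to the choice's default.

```shell
#!/bin/sh
# Print whichever ASPM default-policy option is enabled in a kernel
# config supplied on stdin.
aspm_config_choice() {
    grep -E '^CONFIG_PCIEASPM_(DEFAULT|POWERSAVE|POWER_SUPERSAVE|PERFORMANCE)=y$'
}

# /proc/config.gz is only present when CONFIG_IKCONFIG_PROC is enabled.
if [ -r /proc/config.gz ]; then
    zcat /proc/config.gz | aspm_config_choice || echo "no ASPM policy option enabled"
fi
```

If CONFIG_PCIEASPM_POWER_SUPERSAVE=y still shows up on the flashed board, the rebuilt kernel image was probably not the one that got flashed.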


Then I recompiled the kernel, replaced the required files in the Jetpack 5.1 BSP, applied binaries, and flashed both the root port and endpoint devices using the flash.sh script.


I tried running the following on the Root Port:
cat /proc/device-tree/pcie@141a0000/nvidia\,max-speed


But this results in no output.
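A likely reason for the empty-looking output is that integer device tree properties are stored as raw 32-bit big-endian cells rather than text, so cat writes non-printable bytes. The sketch below dumps the first cell as a decimal number instead; dt_u32 is a hypothetical helper, and the property path is the one from the command above, which may differ on a given release.

```shell
#!/bin/sh
# Print a one-cell device tree property (a 32-bit big-endian integer)
# as a decimal number.
dt_u32() {
    hex=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ -n "$hex" ]; then
        echo $((0x$hex))
    fi
}

# Path copied from the post; whether the node sits here is an assumption.
prop='/proc/device-tree/pcie@141a0000/nvidia,max-speed'
if [ -r "$prop" ]; then
    dt_u32 "$prop"
fi
```

A result of 1 would mean the modified device tree is the one in use, while 4 would mean the original value is still being loaded.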


How do I verify that the speed is indeed Gen1? Did I miss anything when setting speed to Gen1? And how do I resolve the original error? Any help would be much appreciated.


The UART log for root port is attached. Kindly let me know if any more info is needed.
Root_Port_UART_Log.txt (208.4 KB)


EDIT: I also get the following error repeatedly before the PCIe RX Error mentioned above.


[    9.080027] pcieport 0005:00:00.0: AER: can't find device of ID0000
[    9.080030] pcieport 0005:00:00.0: AER: Corrected error received: 0005:00:00.0

Regards,

Sana Ur Rehman