Getting PCIe RX Error when attempting to test PCIe on Jetson AGX Xavier on Jetpack 5.1
Hi. I am attempting to test PCIe using two Jetson AGX Xaviers on Jetpack 5.1. However, I have run into issues, and need help.
I am currently following this guide:
https://docs.nvidia.com/jetson/archives/r35.2.1/DeveloperGuide/text/SD/Communications/PcieEndpointMode.html
However, when I reach the step where I need to boot the root port device, the following error message is printed repeatedly on the terminal:
[ 71.231818] pcieport 0005:00:00.0: [ 0] RxErr
[ 71.240052] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 71.247907] pcieport 0005:00:00.0: device [10de:1ad0] error status/mask=00000001/0000e000
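For reference, the `error status/mask=00000001/0000e000` field can be decoded by hand. The sketch below assumes the standard bit layout of the AER Correctable Error Status and Mask registers; the function names are made up for illustration and are not part of any NVIDIA or kernel tool.

```python
from typing import List, Tuple

# Bit positions of the AER Correctable Error Status/Mask registers
# (assumed standard PCIe layout; names are descriptive, not kernel strings).
CORRECTABLE_BITS = {
    0: "Receiver Error (RxErr)",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def parse_status_mask(field: str) -> Tuple[int, int]:
    """Split a 'status/mask' hex pair as printed in the kernel log."""
    status, mask = field.split("/")
    return int(status, 16), int(mask, 16)


def decode_correctable(value: int) -> List[str]:
    """Return the names of all known bits set in a correctable-error register."""
    return [name for bit, name in sorted(CORRECTABLE_BITS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    status, mask = parse_status_mask("00000001/0000e000")
    print("reported:", decode_correctable(status))
    print("masked:  ", decode_correctable(mask))
```

Under that assumption, the status value has only bit 0 set, which matches the `RxErr` line in the log, and the mask covers bits 13 to 15.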
I have to remove the PCIe cable for the error messages to stop and for the board to boot. When I reconnect the PCIe cable, the same error reappears and repeats until I remove the cable again.
I have tried disabling ASPM by running the following command on the root port:
echo "performance" > /sys/module/pcie_aspm/parameters/policy
However, as soon as I plug in the PCIe cable after running this command, the same error still comes up.
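To confirm that the policy write took effect, the same file can be read back; the kernel marks the active policy with square brackets. This is a minimal sketch, and the helper name `active_aspm_policy` is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional


def active_aspm_policy(policy_text: str) -> Optional[str]:
    """Return the policy shown in [brackets], or None if none is marked."""
    match = re.search(r"\[(\w+)\]", policy_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    policy_file = Path("/sys/module/pcie_aspm/parameters/policy")
    if policy_file.exists():
        print("active ASPM policy:", active_aspm_policy(policy_file.read_text()))
    else:
        print("not found:", policy_file)
```

If the result is anything other than `performance`, the write did not take effect and ASPM may still be enabled.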
I have also tried reducing the speed to Gen1 by modifying the device tree, but I still get the same error.
Although I have modified the device tree, I am unsure whether the change actually took effect. How do I verify that the speed is indeed Gen1?
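One way to check the negotiated speed, sketched here under assumptions, is the `current_link_speed` attribute that the Linux PCI core exposes in sysfs. The device address `0005:00:00.0` is taken from the log above, the exact text the kernel prints may vary between versions, and the helper `pcie_gen` is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Per-lane transfer rate in GT/s for each PCIe generation.
GEN_BY_RATE = {2.5: 1, 5.0: 2, 8.0: 3, 16.0: 4, 32.0: 5}


def pcie_gen(link_speed_text: str) -> Optional[int]:
    """Map text such as '2.5 GT/s PCIe' to a generation number, else None."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)\s*GT/s", link_speed_text)
    if not match:
        return None
    return GEN_BY_RATE.get(float(match.group(1)))


if __name__ == "__main__":
    # Assumption: 0005:00:00.0 is the root port, as in the log above.
    attr = Path("/sys/bus/pci/devices/0005:00:00.0/current_link_speed")
    if attr.exists():
        text = attr.read_text().strip()
        print(text, "-> Gen", pcie_gen(text))
    else:
        print("not found:", attr)
```

The sibling attribute `max_link_speed` reports what the port advertises, so comparing the two may help separate a device tree limit from a link that trained down on its own.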
I made the following changes to the BSP sources:
- Changed the nvidia,max-speed property of pcie@141a0000 and pcie_ep@141a0000 from 4 to 1 in the Linux_for_Tegra/source/public/hardware/nvidia/soc/t19x/kernel-dts/tegra194-soc/tegra194-soc-pcie.dtsi file.
- Changed CONFIG_PCIEASPM_POWER_SUPERSAVE=y to CONFIG_PCIEASPM_POWER_SUPERSAVE=n in the Linux_for_Tegra/source/public/kernel/kernel-5.10/arch/arm64/configs/tegra_defconfig file.
Then I recompiled the kernel, replaced the required files in the Jetpack 5.1 BSP files, applied binaries, and then flashed the root port and end point devices using the flash.sh script.
I tried running the following on the root port:
cat /proc/device-tree/pcie@141a0000/nvidia\,max-speed
But this produces no output.
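One possible explanation for the empty output, offered as an assumption rather than a confirmed diagnosis: integer device tree properties are stored as 32-bit big-endian cells, so `cat` would emit non-printable bytes instead of a readable number. A minimal sketch for decoding such a property follows; the helper name `decode_dt_u32` is hypothetical.

```python
import struct
from pathlib import Path


def decode_dt_u32(raw: bytes) -> int:
    """Decode a single-cell device tree property (32-bit big-endian)."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes, got %d" % len(raw))
    return struct.unpack(">I", raw)[0]


if __name__ == "__main__":
    # Same node and property as the cat command above; no shell escaping needed.
    prop = Path("/proc/device-tree/pcie@141a0000/nvidia,max-speed")
    if prop.exists():
        print("nvidia,max-speed =", decode_dt_u32(prop.read_bytes()))
    else:
        print("not found:", prop)
```

A result of 1 would indicate that the flashed device tree carries the Gen1 setting, while 4 would suggest the modified .dtsi did not reach the board.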
How do I verify that the speed is indeed Gen1? Did I miss anything when setting speed to Gen1? And how do I resolve the original error? Any help would be much appreciated.
The UART log for root port is attached. Kindly let me know if any more info is needed.
Root_Port_UART_Log.txt (208.4 KB)
EDIT: I also get the following error repeatedly before the PCIe RX Error mentioned above.
[ 9.080027] pcieport 0005:00:00.0: AER: can't find device of ID0000
[ 9.080030] pcieport 0005:00:00.0: AER: Corrected error received: 0005:00:00.0
Regards,
Sana Ur Rehman