
PCIe driver error on Orin PCIe CEM interface

Mar 04, 2024

Hi


I have several NVMe drives from different vendors, but one of them causes the following errors in the bootloader and the kernel:


In the bootloader:


ASSERT [NvmExpressDxe] /dvs/git/dirty/git-master_linux/out/nvidia/bootloader/uefi/Jetson_RELEASE/edk2/MdeModulePkg/Bus/Pci/NvmExpressDxe/NvmExpressHci.c(772): (Private->Cap.Mpsmin + 12) <= 12
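
For reference, the asserted expression involves the MPSMIN field of the NVMe CAP register. The controller's minimum memory page size is 2^(12 + MPSMIN) bytes, so the condition only holds when MPSMIN is 0 (4 KiB pages). Below is a minimal Python sketch of that check. The function names are hypothetical, and the all-ones CAP value is only an assumed illustration of what a register read over an unstable link might return, not a value captured from this system.

```python
# Sketch of the check behind the UEFI assertion
# "(Private->Cap.Mpsmin + 12) <= 12". Function names are hypothetical.

def mpsmin_from_cap(cap: int) -> int:
    """Extract CAP.MPSMIN (bits 51:48 of the 64-bit NVMe CAP register)."""
    return (cap >> 48) & 0xF

def min_page_size_bytes(mpsmin: int) -> int:
    """Minimum memory page size the controller supports: 2^(12 + MPSMIN)."""
    return 1 << (12 + mpsmin)

def passes_uefi_check(mpsmin: int, host_page_shift: int = 12) -> bool:
    """Mirror the asserted condition: only MPSMIN == 0 passes for 4 KiB pages."""
    return (mpsmin + 12) <= host_page_shift

# Assumed example: a read that returns all ones yields MPSMIN = 0xF,
# which fails the check.
bad_cap = 0xFFFF_FFFF_FFFF_FFFF
print(mpsmin_from_cap(bad_cap), passes_uefi_check(mpsmin_from_cap(bad_cap)))  # 15 False
print(min_page_size_bytes(0), passes_uefi_check(0))                           # 4096 True
```

If the drive really reports a non-zero MPSMIN, the assertion points at a page-size mismatch. If the value is garbage, the read itself may have failed, which would be consistent with the link errors seen later in the kernel.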


In the kernel, the following errors are printed in a loop:


[ 132.683420] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 132.693298] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 132.701917] pcieport 0005:00:00.0: [ 0] RxErr

[ 132.708229] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 132.718097] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 132.726704] pcieport 0005:00:00.0: [ 0] RxErr

[ 132.734137] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 132.744040] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 132.752646] pcieport 0005:00:00.0: [ 0] RxErr

[ 132.764151] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 132.774028] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 132.782660] pcieport 0005:00:00.0: [ 0] RxErr

[ 132.788987] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 132.798861] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 132.807543] pcieport 0005:00:00.0: [ 0] RxErr

[ 132.813887] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 132.823767] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 132.832404] pcieport 0005:00:00.0: [ 0] RxErr

[ 132.838718] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 132.848649] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 132.857274] pcieport 0005:00:00.0: [ 0] RxErr

[ 132.863600] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 132.873478] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 132.882109] pcieport 0005:00:00.0: [ 0] RxErr

[ 132.895624] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 132.905486] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 132.914098] pcieport 0005:00:00.0: [ 0] RxErr

[ 132.923483] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 132.933354] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 132.941991] pcieport 0005:00:00.0: [ 0] RxErr

[ 132.948402] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 132.958285] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 132.966905] pcieport 0005:00:00.0: [ 0] RxErr

[ 132.973225] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 132.983108] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 132.991736] pcieport 0005:00:00.0: [ 0] RxErr

[ 132.998054] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 133.007923] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 133.016541] pcieport 0005:00:00.0: [ 0] RxErr

[ 133.022883] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 133.032751] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 133.041374] pcieport 0005:00:00.0: [ 0] RxErr

[ 133.047706] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 133.057650] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 133.066280] pcieport 0005:00:00.0: [ 0] RxErr

[ 133.072600] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 133.082467] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 133.091087] pcieport 0005:00:00.0: [ 0] RxErr

[ 133.097728] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 133.107597] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 133.116214] pcieport 0005:00:00.0: [ 0] RxErr

[ 133.122544] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 133.132458] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 133.141095] pcieport 0005:00:00.0: [ 0] RxErr

[ 133.147429] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 133.157376] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 133.166020] pcieport 0005:00:00.0: [ 0] RxErr

[ 133.172356] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 133.182227] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 133.190831] pcieport 0005:00:00.0: [ 0] RxErr

[ 133.197263] pcieport 0005:00:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, (Receiver ID)

[ 133.208284] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000020/00400000

[ 133.216901] pcieport 0005:00:00.0: [ 5] SDES (First)

[ 133.223911] nvme nvme0: frozen state error detected, reset controller

[ 134.271882] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)

[ 134.281954] pcieport 0005:00:00.0: device [10de:229a] error status/mask=00000001/0000e000

[ 134.290825] pcieport 0005:00:00.0: [ 0] RxErr
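
To make the status/mask values in the log easier to read, here is a small Python sketch that decodes them into the short bit names the Linux AER driver prints. The bit tables are partial and the helper name `decode` is hypothetical. It is meant only as a reading aid for the lines above, not as a diagnosis.

```python
# Partial bit tables for the AER Correctable / Uncorrectable Error Status
# registers, using the short names that appear in the kernel log.
CORRECTABLE = {
    0: "RxErr", 6: "BadTLP", 7: "BadDLLP", 8: "Rollover",
    12: "Timeout", 13: "NonFatalErr", 14: "CorrIntErr", 15: "HeaderOF",
}
UNCORRECTABLE = {
    4: "DLP", 5: "SDES", 12: "TLP", 13: "FCP", 14: "CmpltTO",
    15: "CmpltAbrt", 16: "UnxCmplt", 17: "RxOF", 18: "MalfTLP",
    19: "ECRC", 20: "UnsupReq", 21: "ACSViol", 22: "UncorrIntErr",
}

def decode(value: int, names: dict) -> list:
    """Return the names of all bits set in a 32-bit status or mask value."""
    return [names.get(bit, f"bit{bit}") for bit in range(32) if value & (1 << bit)]

# Values copied from the log lines "error status/mask=...".
print(decode(0x00000001, CORRECTABLE))    # ['RxErr']
print(decode(0x0000E000, CORRECTABLE))    # ['NonFatalErr', 'CorrIntErr', 'HeaderOF']
print(decode(0x00000020, UNCORRECTABLE))  # ['SDES']
print(decode(0x00400000, UNCORRECTABLE))  # ['UncorrIntErr']
```

So the repeated entries are receiver errors (bit 0) at the physical layer on the root port, and the fatal one is a Surprise Down Error (bit 5), meaning the link dropped unexpectedly right before the nvme driver reset the controller.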


Does anyone know the cause of this issue? Is it related to PCIe power or the reference clock?