RDMA - PCIe module cannot be inserted into kernel
Mar 04, 2024
Dear all,
The problem reported in the threads "GPUDirect RDMA - Module can not be insert into kernel" and "PCIe DMA driver can not be loaded" is still present in the JetPack 5.1 release.
I installed JetPack 5.1 on a Jetson AGX Orin Development Kit using SDK Manager.
Then I built the RDMA example from GitHub - NVIDIA/jetson-rdma-picoevb: Minimal HW-based demo of GPUDirect RDMA on NVIDIA Jetson AGX Xavier running L4T.
Trying to insert the kernel module with RDMA support results in the following error:
$ insmod picoevb-rdma.ko
> insmod: ERROR: could not insert module picoevb-rdma.ko: Invalid parameters
The following errors appear in the kernel log:
[ 2104.873824] picoevb_rdma: disagrees about version of symbol nvidia_p2p_dma_unmap_pages
[ 2104.882091] picoevb_rdma: Unknown symbol nvidia_p2p_dma_unmap_pages (err -22)
[ 2104.889571] picoevb_rdma: disagrees about version of symbol nvidia_p2p_get_pages
[ 2104.897208] picoevb_rdma: Unknown symbol nvidia_p2p_get_pages (err -22)
[ 2104.904057] picoevb_rdma: disagrees about version of symbol nvidia_p2p_put_pages
[ 2104.911675] picoevb_rdma: Unknown symbol nvidia_p2p_put_pages (err -22)
[ 2104.918525] picoevb_rdma: disagrees about version of symbol nvidia_p2p_dma_map_pages
[ 2104.926493] picoevb_rdma: Unknown symbol nvidia_p2p_dma_map_pages (err -22)
[ 2104.933702] picoevb_rdma: disagrees about version of symbol nvidia_p2p_free_page_table
[ 2104.941855] picoevb_rdma: Unknown symbol nvidia_p2p_free_page_table (err -22)
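The "disagrees about version of symbol" message is what the kernel prints when the symbol CRC recorded in the module at build time does not match the CRC exported by the running kernel or by the module that provides the symbol (module versioning, CONFIG_MODVERSIONS). To see at a glance which symbols are affected, a small filter like the one below can be run over the log. This is only a sketch: the function name mismatched_symbols is made up for illustration, and the sample input is copied from the log above.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not part of any NVIDIA tool):
# reads kernel log lines on stdin and prints each symbol that the kernel
# reported as having a version (CRC) mismatch, one per line, de-duplicated.
mismatched_symbols() {
    sed -n 's/.*disagrees about version of symbol \([A-Za-z0-9_]*\).*/\1/p' \
        | LC_ALL=C sort -u
}

# Sample input taken from the kernel log in this post.
mismatched_symbols <<'EOF'
[ 2104.873824] picoevb_rdma: disagrees about version of symbol nvidia_p2p_dma_unmap_pages
[ 2104.882091] picoevb_rdma: Unknown symbol nvidia_p2p_dma_unmap_pages (err -22)
[ 2104.889571] picoevb_rdma: disagrees about version of symbol nvidia_p2p_get_pages
[ 2104.897208] picoevb_rdma: Unknown symbol nvidia_p2p_get_pages (err -22)
EOF
```

On the target, the same function could be fed from the live log, for example with dmesg | mismatched_symbols. All affected symbols here belong to the nvidia_p2p_* interface, which suggests that the example was built against symbol version information that does not match the installed provider of those symbols.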
A possible workaround is presented in "GPUDirect RDMA - Module can not be insert into kernel - #28 by YK_SpartanRadar".
When can we expect this problem to be fixed in the stock installation?
Best regards,
Gerrit