
IOMMU related issues #718

Open
hzc12321 opened this issue Nov 27, 2024 · 8 comments

@hzc12321

I have built v1.2.0 RC2 from source. After running
sudo build/gatekeeper
an error is shown:
cannot add vfio group to container, error 22 (invalid argument)

and I'm unable to start Gatekeeper. While troubleshooting, I found the error message below:
sudo dmesg | grep -i vfio
Firmware has requested this device have a 1:1 IOMMU mapping, rejecting configuring the device without a 1:1 mapping. Contact your platform vendor.
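
For anyone hitting the same message, the generic checks below help confirm the IOMMU state (nothing here is Gatekeeper-specific; they only read kernel logs and sysfs):

sudo dmesg | grep -e DMAR -e IOMMU    # confirm VT-d is detected and the IOMMU is enabled
sudo dmesg | grep -i rmrr             # look for reserved-memory (RMRR) regions reported by the firmware
cat /proc/cmdline                     # see which iommu-related kernel parameters are actually in effect
ls /sys/kernel/iommu_groups/          # list the IOMMU groups created by the kernel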

From the preliminary study, I think the issue is more related to hardware / BIOS. However, I can't find an exact solution to actually solve this. I'm trying my luck here to see if anyone with deeper understanding of Intel VT-d, IOMMU and vfio-pci can assist to provide any idea.

The same error occurred on both GT and GK. Below are the specifications of the testbed:
Bare-metal deployment, isolated lab environment.
GK :
OS : Ubuntu 24.04 LTS
Server : HPE ProLiant
RAM : 256GB
CPU : Intel Xeon E5-2665 2.4GHz, 32 cores
NUMA : 2 NUMA nodes
NIC : Intel I350 1G, both front and back (confirmed that DPDK is supported). The server also has an Intel 82599ES 10G interface that supports DPDK, but we don't have a 10G uplink router available at the moment, so we didn't use it for the testbed.

GT :
OS : Ubuntu 24.04 LTS
Server : HPE ProLiant
RAM : 256GB
CPU : Intel Xeon E5-2640 2.6GHz, 32 cores
NUMA : 2 NUMA nodes
NIC : Intel I350 1G, front

Solutions tried on GT (which didn't work):
Adding vfio_iommu_type1.allow_unsafe_interrupts=1 in GRUB_CMDLINE_LINUX_DEFAULT (roughly as sketched below)
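
(Standard Ubuntu procedure, nothing specific to Gatekeeper; the "..." stands for whatever parameters were already present:)

# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="... vfio_iommu_type1.allow_unsafe_interrupts=1"

sudo update-grub
sudo reboot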

Current thoughts :

  1. In https://www.kernel.org/doc/Documentation/devicetree/bindings/iommu/iommu.txt , it is mentioned that "The device tree node of the IOMMU device's parent bus must contain a valid "dma-ranges" property that describes how the physical address space of the IOMMU maps to memory. An empty "dma-ranges" property means that there is a 1:1 mapping from IOMMU to memory.".
    Is there a way to verify the "dma-ranges" property? If yes, at least I can know what is causing a non-1:1 mapping, and probably trace down the root cause from there (a rough inspection sketch follows this list).
  2. The "Contact your platform vendor" mentioned in the error message raises my suspicion of BIOS incompatibility. If these machines can't do the job, machines of which specification / brand can?

Some links that are probably relevant, but whose content I can't fully understand due to my lack of relevant experience:
https://github.com/kiler129/relax-intel-rmrr/blob/master/deep-dive.md#what-vendors-did-wrong
https://lore.kernel.org/linux-iommu/BN9PR11MB5276E84229B5BD952D78E9598C639@BN9PR11MB5276.namprd11.prod.outlook.com/
https://lore.kernel.org/linux-iommu/BN9PR11MB52768ACA721898D5C43CBE9B8C27A@BN9PR11MB5276.namprd11.prod.outlook.com/t/
https://lore.kernel.org/linux-iommu/[email protected]/
https://community.hpe.com/t5/proliant-servers-ml-dl-sl/proliant-dl360-gen9-getting-error-quot-rejecting-configuring-the/td-p/7220298
https://www.reddit.com/r/VFIO/comments/1gi95zf/rejecting_configuring_the_device_without_a_11/?rdt=37975
https://forum.proxmox.com/threads/qemu-exited-with-code-1-pcie-passthrough-not-working.146297/

@AltraMayor
Owner

The Intel I350 NIC has four ports, doesn't it? Is it built into the mainboard or a discrete NIC? If it's discrete, does your server have an onboard NIC? Have you configured any of the ports to be used by the kernel?

@hzc12321
Author

The Intel I350 NIC has four ports, doesn't it?

The I350 on GK has 2 ports, while the one on GT has 4.

Is it built into the mainboard or a discrete NIC? If it's discrete, does your server have an onboard NIC?

They're discrete; the onboard NICs on both machines are Broadcom NetXtreme, which don't support DPDK.

Have you configured any of the ports to be used by the kernel?

I'm not quite sure about this, but here's all I have done to the ports:

  1. Prior to setting up Gatekeeper, I first ensured network connectivity by assigning IP addresses and a default gateway using netplan, just like setting up any Ubuntu server. Then I brought the interfaces up and pinged between devices, including the routers.

  2. Referring to Tips for Deployment, I changed their names to "front" and "back" respectively using systemd.link (a sketch follows this list).

  3. Ran sudo dependencies/dpdk/usertools/dpdk-devbind.py --bind=vfio-pci front and verified that the bind was successful using sudo dependencies/dpdk/usertools/dpdk-devbind.py --status . The same goes for the back port.
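
For points 2 and 3, this is roughly what I used (the MAC address and file name below are placeholders, not my actual values):

# /etc/systemd/network/10-front.link   (one file per port)
[Match]
MACAddress=aa:bb:cc:dd:ee:01
[Link]
Name=front

# after a reboot, bind the renamed ports and check the result:
sudo dependencies/dpdk/usertools/dpdk-devbind.py --bind=vfio-pci front
sudo dependencies/dpdk/usertools/dpdk-devbind.py --bind=vfio-pci back
sudo dependencies/dpdk/usertools/dpdk-devbind.py --status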

I'm not sure whether it is necessary to bind all the ports on the same NIC to vfio-pci. However, this is already the case on the GK server, which has only 2 ports on its NIC.
While troubleshooting on the GT server, I also checked /sys/kernel/iommu_groups/ and confirmed that the PCI address of the front port is under a group, and that it's the only address in that group.
lspci -nnk -s 81:00.0 shows that the kernel driver in use is vfio-pci and the kernel module is igb. I'm not sure if this is the expected result.
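
(For reference, a quick way to list all groups and the devices they contain:)

for g in /sys/kernel/iommu_groups/*; do echo "IOMMU group ${g##*/}:"; ls "$g"/devices; done
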
The machine was rebooted whenever necessary, and I actually did it a few more times; as the old saying goes, "if something is not working, always try rebooting it".

@hzc12321
Author

Btw, after ensuring network connectivity to check that they were working fine, I removed their netplan configuration and changed their state to DOWN before proceeding to modify them for use by DPDK.

@AltraMayor
Copy link
Owner

AltraMayor commented Nov 28, 2024

I encountered a similar error while testing a NIC some time ago. I solved the problem then by binding all ports of the NIC with vfio-pci, as you did by calling the script dpdk-devbind.py. Since you already did that for your Gatekeeper server, and it still doesn't work, your problem is different from mine.
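
In my case that meant something along these lines (the PCI addresses below are placeholders; use the --status output or lspci to find all ports of your NIC):

sudo dependencies/dpdk/usertools/dpdk-devbind.py --status                                          # list ports and their current drivers
sudo dependencies/dpdk/usertools/dpdk-devbind.py --bind=vfio-pci 81:00.0 81:00.1 81:00.2 81:00.3   # bind every port of the same NIC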

Hardware issues are time-consuming to address. I recommend getting another NIC model and moving forward.

Section NIC of our wiki page "Hardware Requirements" lists NICs that Gatekeeper deployers have been using in production.

@hzc12321
Author

hzc12321 commented Dec 8, 2024

Thanks for the recommendation; I will see if I can find the same model of NICs. Meanwhile, can you advise some of the brands, models, and BIOS versions of the bare-metal servers that Gatekeeper has been successfully deployed on? While this error is not necessarily caused by the BIOS, some proven working examples would be useful information while trying to resolve it.

@AltraMayor
Owner

All shared notes on hardware are centralized on the wiki page Hardware Requirements. That said, my personal experience is with Dell servers.

@hzc12321
Copy link
Author

Update:
GT server model used : HP DL380 Gen9
GK server model used : HP DL380p Gen8

After a bunch of study plus trial & error, I have managed to bring the GT server up. It is related to the kernel. The problem was fixed by using the System Configuration utility on Gen9+ servers to disable the "HP Shared Memory features"; more details at https://github.com/kiler129/relax-intel-rmrr/blob/master/deep-dive.md
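
(A quick way to confirm the change took effect after the reboot is to re-run the same checks; the rejection message should be gone, and ideally the RMRR covering the NIC disappears as well:)

sudo dmesg | grep -i vfio
sudo dmesg | grep -i rmrr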

On the other hand, since the GK server is a Gen8, the same solution doesn't apply to this machine. It seems like I would have to patch the kernel manually using https://github.com/kiler129/relax-intel-rmrr . Alternatively, this looks promising as well. However, both solutions seem time-consuming and troublesome.
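
If I do go down the kernel-patch route, my understanding from that repo's README (treat the exact option name as an assumption and double-check it there) is that the patched kernel is switched on via an extra intel_iommu flag in GRUB, roughly:

# option name taken from the relax-intel-rmrr README; verify before use
GRUB_CMDLINE_LINUX_DEFAULT="... intel_iommu=on,relax_rmrr"
sudo update-grub && sudo reboot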

The good news is that I350 is not the culprit.

Did you ever have to deal with these kinds of workarounds on Dell servers? If Dell servers don't have this kind of problem, they will be my primary option, and I would advise anyone trying to deploy Gatekeeper afterwards to avoid old HP servers.

Another link with good reference value:
https://forum.proxmox.com/threads/qemu-exited-with-code-1-pcie-passthrough-not-working.146297/

@AltraMayor
Owner

Based on my experience, I recommend Dell. Nevertheless, I don't want to suggest Dell servers are trouble-free; see issue #703 for an example. My recommendation is based on the fact that we've been able to overcome the problems with a clean solution.
