OpenWRT on SR-IOV: Good Idea?
If you’re not familiar, SR-IOV is a technology that allows you to expose a single physical network device (or more accurately, “physical function”) as multiple virtual functions (VFs). Each VF is a separate logical PCIe device, so a board equipped with an IOMMU should be able to pass them into VMs individually. This allows for basically the same level of performance as a passthrough of an entire PCIe device, but still allows it to be shared amongst many VMs. At the hardware level, it acts as essentially a network switch, instead of doing that in software like with typical virtual NICs.
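For reference, on Linux VFs are usually created by writing a count to the PF's sriov_numvfs attribute in sysfs. The sketch below wraps that in a small helper; the function name set_numvfs and the SYSFS_ROOT override are made up for illustration (the override exists only so the function can be tried against a fake directory tree), and eno3 is just the interface name used later in this post.

```shell
#!/bin/sh
# Sketch: create VFs on a PF via the kernel's sriov_numvfs sysfs attribute.
# SYSFS_ROOT is a hypothetical override so the helper can be pointed at a fake tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

set_numvfs() {   # usage: set_numvfs <pf-interface> <vf-count>
    pf="$1"; count="$2"
    dev="$SYSFS_ROOT/class/net/$pf/device"
    [ -e "$dev/sriov_numvfs" ] || { echo "$pf: no sriov_numvfs attribute" >&2; return 1; }
    # The kernel rejects a change from one non-zero VF count to another,
    # so drop back to 0 before setting the new value.
    echo 0 > "$dev/sriov_numvfs" || return 1
    echo "$count" > "$dev/sriov_numvfs"
}

# Example (needs root and an SR-IOV capable NIC):
#   set_numvfs eno3 4
#   cat /sys/class/net/eno3/device/sriov_totalvfs   # upper limit the card reports
```

Each VF then appears as its own PCIe device on the host, which is what gets handed to the VM.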
Now, in my previous post, I talked about some of the new hardware going into my new router. The star of the show is the X10SDV-8C-TLN4F. So why not run OpenWRT baremetal on here? A few reasons:
- Sometimes, you have to reboot. Rebooting a board like this can take several minutes. Not good for uptime. Meanwhile, a VM takes about 20 seconds to reboot fully. A custom kernel plus bypassing the bootloader (either via efistub or KVM direct kernel boot) could probably get it down to 10 seconds.
- Lockouts – if you screw up your network configuration, it can be annoying to get back in. Not impossible, since it still has a VGA port and USB ports, but it’s definitely nicer to be able to still access the host and fix the VM from there.
- OpenWRT isn’t necessarily going to have support for every bit of hardware on a board like this. It may lack drivers, or the userland applications necessary to control said hardware. IPMI tools, for example, are lacking.
- I can easily benchmark routing performance using virtual networks.
- Since I can run ZFS on the host, I get ZFS snapshots and all that.
But… was it difficult? Was it worth it?
Yes and yes.
First of all, VFs are NOT meant for something like this out of the box. They’re designed to be fairly secure, so by default they don’t allow promiscuous mode, overriding the MAC address, or sending packets from a MAC address different from the one assigned to the VF. Unfortunately, all three of those restrictions are the opposite of what you’d want from a network device like this.
Secondly, the exact feature set you require will depend on what you’re trying to do. For example, bonding will set every constituent NIC to the same MAC address, so you’ll need support for trusted VFs. Bridging will require promiscuous mode and the ability to turn anti-spoof off, as it will need to receive and transmit packets with MAC addresses different from that of the interface. The reason I’ve posted a couple of reviews about 10GbE NICs lately is precisely because SR-IOV feature sets are all over the place. What I’ve found so far is that Intel is consistently the best here.
Let’s assume you want all the features you’d get out of a physical NIC. For the PF, you’ll need to enable promiscuous mode. For the VF, you’ll need to turn ‘trusted’ mode on, and disable spoof checking. Assuming your NIC is eno3, and you’ll be using vf 0:
ip link set eno3 promisc on
ip link set eno3 vf 0 trust on spoofchk off
Truth be told, I’m not 100% sure why the promisc is necessary. It’s the VF that matters here, not the PF, but setting promisc on at the PF level makes it work for some reason.
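Since these are runtime settings, they have to be reapplied after a host reboot or whenever the VFs are recreated. A minimal sketch of a helper that prints the commands for one VF so they can be reviewed before running; the name vf_router_cmds is made up for illustration, and spoofchk is the keyword as documented in ip-link(8).

```shell
#!/bin/sh
# Sketch: print the ip(8) commands that make one VF behave like a "full" NIC.
# Nothing is executed here; the output is meant to be reviewed, then piped to sh.
vf_router_cmds() {   # usage: vf_router_cmds <pf-interface> <vf-index>
    pf="$1"; vf="$2"
    echo "ip link set $pf promisc on"
    echo "ip link set $pf vf $vf trust on spoofchk off"
}

# Apply as root on the host:
#   vf_router_cmds eno3 0 | sh
# Then check the per-VF state in the PF's link listing:
#   ip link show eno3
```

Printing instead of executing keeps the helper safe to run anywhere, and the same output can be dropped into whatever boot-time hook the host uses.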
OpenWRT does build an official VM image, and even one with EFI support. One problem: it is somewhat lacking in builtin drivers. Namely, the network driver. The virtio driver is built in and works fine, but the ixgbevf driver was not. Not a big issue. Just make a normal virtual interface, and then bridge or NAT as appropriate, and install the drivers necessary for the VFs to work (for Intel cards, likely kmod-igbvf, kmod-ixgbevf, or kmod-iavf depending on what generation of card).
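As a rough sketch of that step, assuming an image that still uses opkg as its package manager: install the kmod over the temporary virtio interface, then check which driver each interface is actually bound to. The helper name list_nic_drivers and the SYSFS_ROOT override are made up for illustration.

```shell
#!/bin/sh
# Inside the OpenWRT guest, with the temporary virtio NIC providing connectivity:
#   opkg update
#   opkg install kmod-ixgbevf      # or kmod-igbvf / kmod-iavf, depending on the card
#
# Sketch: list each network interface and the kernel driver it is bound to.
# SYSFS_ROOT is a hypothetical override so the helper can be pointed at a fake tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

list_nic_drivers() {
    for dev in "$SYSFS_ROOT"/class/net/*; do
        # Purely virtual interfaces (lo, bridges) have no device/driver link.
        [ -e "$dev/device/driver" ] || continue
        drv=$(basename "$(readlink "$dev/device/driver")")
        echo "$(basename "$dev") $drv"
    done
}

# Example: list_nic_drivers
```

Once the VF driver is loaded, the passed-through VF should show up in this list next to the virtio interface.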
But, after doing all that, it works quite well. I’ve been using it for several months now, and it’s pretty much flawless. Routing performance via iperf3 was about 20 Gb/s, giving it more than enough to handle the line rate of its 10GbE NICs, at least for larger packet sizes.
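The packet-size caveat matters because a router's cost is mostly per packet, not per byte. A quick back-of-the-envelope sketch using standard Ethernet framing: every frame occupies an extra 20 bytes on the wire (8 bytes of preamble plus a 12-byte inter-frame gap), so the packet rate at line rate depends heavily on frame size. The function name line_rate_pps is made up for illustration.

```shell
#!/bin/sh
# Sketch: packets per second needed to saturate a link at a given frame size.
# Assumes 64-bit shell arithmetic, which is the norm on 64-bit systems.
line_rate_pps() {   # usage: line_rate_pps <link-bits-per-second> <frame-bytes>
    # Frame plus 20 bytes of preamble and inter-frame gap, converted to bits.
    echo $(( $1 / (($2 + 20) * 8) ))
}

line_rate_pps 10000000000 1518   # full-size frames: prints 812743
line_rate_pps 10000000000 64     # minimum-size frames: prints 14880952
```

So 10GbE at full-size frames is under a million packets per second, while minimum-size frames need almost 15 million, which is a much harder target for any software router.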
Conclusion
It’s a little bit more setup work, but it works fine, and might pay off in the long run. Great alternative to either running baremetal or passing in entire PFs. As long as you’ve set up your network such that the host will be accessible even if the router is down, it’s basically impossible to lock yourself out: virsh console or virt-manager will get you back into the router. Or, if you really messed up, revert to an earlier ZFS snapshot. Turris’s snapshotting is nice, and this is one way to get similar functionality on generic hardware.
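A minimal sketch of that recovery path, assuming the VM's disk lives on its own ZFS dataset. The domain name openwrt and the dataset tank/vm/openwrt are placeholders, not names from my setup, and the RUN variable is just a dry-run switch.

```shell
#!/bin/sh
# Sketch: snapshot the router VM's dataset and roll it back from the host.
# DOMAIN and DATASET are hypothetical names; substitute your own.
DOMAIN="${DOMAIN:-openwrt}"
DATASET="${DATASET:-tank/vm/openwrt}"
RUN="${RUN:-}"   # set RUN=echo to print the commands instead of running them

snapshot_router() {   # usage: snapshot_router <snapshot-name>
    $RUN zfs snapshot "$DATASET@$1"
}

rollback_router() {   # usage: rollback_router <snapshot-name>
    $RUN virsh destroy "$DOMAIN"         # hard power-off; the disk is reverted next anyway
    $RUN zfs rollback -r "$DATASET@$1"   # -r also destroys snapshots newer than the target
    $RUN virsh start "$DOMAIN"
}

# Before a risky change:   snapshot_router before-upgrade
# To look around first:    virsh console openwrt
# If it all went wrong:    rollback_router before-upgrade
```

Running with RUN=echo first is a cheap way to double-check the dataset and snapshot names before anything destructive happens.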