Before you deploy your cluster, make sure your host has a SR-IOV capable NIC plugged in, and that it supports it. You may need to adjust BIOS settings to make it work. For example, make sure VT-x (hardware virtualization) and Intel VT for Directed I/O are enabled in BIOS.
You should also configure kernel to enable IOMMU. This can be achieved by adding the following kernel parameters to kernel command line:
intel_iommu=on
Some hosts may experience problems detecting VFs. If you experience issues configuring VFs, you may try to add one of the following parameters:
pci=realloc
pci=assign-busses
If all goes well, after reboot you should be able to enable SR-IOV VFs for capable NICs.
To enable VFs, you should do the following:
$ find /sys -name *vfs*
/sys/devices/pci0000:00/0000:00:09.0/0000:05:00.0/sriov_numvfs
/sys/devices/pci0000:00/0000:00:09.0/0000:05:00.0/sriov_totalvfs
/sys/devices/pci0000:00/0000:00:09.0/0000:05:00.1/sriov_numvfs
/sys/devices/pci0000:00/0000:00:09.0/0000:05:00.1/sriov_totalvfs
$ cat /sys/devices/pci0000:00/0000:00:09.0/0000:05:00.0/sriov_totalvfs
7
$ echo 7 > /sys/devices/pci0000:00/0000:00:09.0/0000:05:00.0/sriov_numvfs
If all goes well you should see VFs in lspci output:
$ lspci
...
05:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
05:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
05:10.2 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
05:10.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
05:10.4 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
05:10.5 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
05:10.6 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
05:10.7 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
...
Kubevirt will use vfio
userspace driver to pass through PCI devices into
qemu
. For this to work, load the following driver:
$ modprobe vfio-pci
Depending on your hardware platform, the driver may need additional kernel options. For example, if your platform does not support interrupt remapping, you may need to configure the host as follows:
$ echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > /etc/modprobe.d/iommu_unsafe_interrupts.conf
Finally, we need to unbind each device from its respective network driver and
register it with vfio subsystem. You can find an example on how to do it under:
tools/util/vfio.sh
Now you are ready to set up your cluster.
You can use your preferred mechanism to deploy your kubernetes cluster as long as you deploy on bare metal.
Current recommendation is to use kubevirt-ansible
to deploy the cluster.
Ansible playbooks will also deploy all the relevant SR-IOV components
for you. See here.
You may still want to deploy software using local
provider if you'd like to
deploy from Kubevirt sources though.
In the following example, we configure the cluster using local
provider which
is part of kubevirt/kubevirt repo. Please consult cluster/local/README.md for
general information on setting up a host using the local
provider.
The local
provider does not install default CNI plugins like loopback
. So
first, install default CNI plugins:
$ go get -u -d github.com/containernetworking/plugins/
$ cd $GOPATH/src/github.com/containernetworking/plugins/
$ ./build.sh
$ mkdir -p /opt/cni/bin/
$ cp bin/* /opt/cni/bin/
Then, prepare kubernetes tree for CNI enabled deployment:
$ go get -u -d k8s.io/kubernetes
$ cd $GOPATH/src/k8s.io/kubernetes
$ git diff
diff --git a/hack/local-up-cluster.sh b/hack/local-up-cluster.sh
index bcf988b..9911eed 100755
--- a/hack/local-up-cluster.sh
+++ b/hack/local-up-cluster.sh
@@ -639,6 +639,8 @@ function start_controller_manager {
--use-service-account-credentials \
--controllers="${KUBE_CONTROLLERS}" \
--leader-elect=false \
+ --cert-dir="$CERT_DIR" \
+ --allocate-node-cidrs=true --cluster-cidr=10.244.0.0/16 \
--master="https://${API_HOST}:${API_SECURE_PORT}" >"${CTLRMGR_LOG}" 2>&1 &
CTLRMGR_PID=$!
}
export NET_PLUGIN=cni
export CNI_CONF_DIR=/etc/cni/net.d/
export CNI_BIN_DIR=/opt/cni/bin/
Install etcd:
$ ./hack/install-etcd.sh
Use local
provider for kubevirt:
$ export KUBEVIRT_PROVIDER=local
Now finally, deploy kubernetes:
$ cd $GOPATH/src/kubevirt.io/kubevirt
$ make cluster-up
Once the cluster is deployed, we can move to SR-IOV specific components.
First, deploy latest Multus with default Flannel backend. We will need to use
the latest code from their tree, hence using snapshot
image tag instead of
latest
. The snapshot
image adds support for reading IDs of devices
allocated by device plugins from "checkpoint" files, which is needed to make
the whole setup work.
$ go get -u -d github.com/intel/multus-cni
$ cd $GOPATH/src/github.com/intel/multus-cni/
$ vi images/multus-daemonset.yml # change to refer to nfvpe/multus:snapshot
$ mkdir -p /etc/cni/net.d
$ cp images/70-multus.conf /etc/cni/net.d/
$ ./cluster/kubectl.sh create -f $GOPATH/src/github.com/intel/multus-cni/images/multus-daemonset.yml
$ ./cluster/kubectl.sh create -f $GOPATH/src/github.com/intel/multus-cni/images/flannel-daemonset.yml
Now, deploy SR-IOV device plugin. Adjust config.json file for your particular setup. More information about configuration file format: https://github.com/intel/sriov-network-device-plugin/blob/master/README.md#configurations
$ go get -u -d github.com/intel/sriov-network-device-plugin/
$ cat <<EOF > /etc/pcidp/config.json
{
"resourceList":
[
{
"resourceName": "sriov",
"rootDevices": ["05:00.0", "05:00.1"],
"sriovMode": true,
"deviceType": "vfio"
}
]
}
EOF
$ ./cluster/kubectl.sh create -f $GOPATH/src/github.com/intel/sriov-network-device-plugin/images/sriovdp-daemonset.yaml
Deploy SR-IOV CNI plugin.
$ go get -u -d github.com/intel/sriov-cni/
$ ./cluster/kubectl.sh create -f $GOPATH/src/github.com/intel/sriov-cni/images/sriov-cni-daemonset.yaml
Finally, create a new SR-IOV network CRD that will use SR-IOV device plugin to allocate devices.
./cluster/kubectl.sh create -f $GOPATH/src/github.com/intel/sriov-network-device-plugin/deployments/sriov-crd.yaml
This particular step is not specific to SR-IOV.
make cluster-sync
If all goes well, you should be able to post a VMI spec referring to the SR-IOV multus network and get a PCI device allocated to virt-launcher and passed through into qemu. Please consult cluster/examples/vmi-sriov.yaml for example.