add annotation cni.spidernet.io/network-resource-inject #4421

Open
wants to merge 5 commits into
base: main
4 changes: 2 additions & 2 deletions docs/usage/install/ai/get-started-macvlan-zh_CN.md
@@ -422,7 +422,7 @@

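## Auto Inject RDMA Network Resources Based on Webhook

In the steps above, we showed how to use SR-IOV technology to provide RDMA communication capabilities for containers in RoCE and Infiniband network environments. However, configuring AI applications with multiple NICs becomes complex. To simplify this process, Spiderpool supports classifying a set of NIC configurations through an annotation (`cni.spidernet.io/rdma-resource-inject`). Users only need to add the same annotation to the application, and Spiderpool will automatically inject all NICs and network resources that carry the same annotation into the application through a webhook.
In the steps above, we showed how to use SR-IOV technology to provide RDMA communication capabilities for containers in RoCE and Infiniband network environments. However, configuring AI applications with multiple NICs becomes complex. To simplify this process, Spiderpool supports classifying a set of NIC configurations through annotations (`cni.spidernet.io/rdma-resource-inject` or `cni.spidernet.io/network-resource-inject`). Users only need to add the same annotation to the application, and Spiderpool will automatically inject all NICs and network resources that carry the same annotation into the application through a webhook. `cni.spidernet.io/rdma-resource-inject` applies only to AI scenarios and automatically injects RDMA NICs and RDMA resources; `cni.spidernet.io/network-resource-inject` can be used for AI scenarios and also supports Underlay scenarios. In the future, we hope to use `cni.spidernet.io/network-resource-inject` uniformly for both scenarios.

> This feature only supports NIC configurations whose cniType is one of [ macvlan, ipvlan, sriov, ib-sriov, ipoib ].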

@@ -440,7 +440,7 @@
>
> Currently, after changing the configuration, you need to restart spiderpool-controller for the change to take effect.

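2. When creating all SpiderMultusConfig instances for the AI computing network, add an annotation with the key "cni.spidernet.io/rdma-resource-inject"; the value can be any custom value.
2. When creating all SpiderMultusConfig instances for the AI computing network, add an annotation with the key "cni.spidernet.io/rdma-resource-inject" or "cni.spidernet.io/network-resource-inject"; the value can be any custom value.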

```yaml
apiVersion: spiderpool.spidernet.io/v2beta1
4 changes: 2 additions & 2 deletions docs/usage/install/ai/get-started-macvlan.md
@@ -420,7 +420,7 @@ The network planning for the cluster is as follows:

## Auto Inject RDMA Resources Based on Webhook

In the steps above, we demonstrated how to use SR-IOV technology to provide RDMA communication capabilities for containers in RoCE and Infiniband network environments. However, the process can become complex when configuring AI applications with multiple network cards. To simplify this process, Spiderpool supports classifying a set of network card configurations through annotations (`cni.spidernet.io/rdma-resource-inject`). Users only need to add the same annotation to the application, and Spiderpool will automatically inject all corresponding network cards and network resources with the same annotation into the application through a webhook.
In the steps above, we demonstrated how to use SR-IOV technology to provide RDMA communication capabilities for containers in RoCE and Infiniband network environments. However, the process can become complex when configuring AI applications with multiple network cards. To simplify this process, Spiderpool supports classifying a set of network card configurations through annotations (`cni.spidernet.io/rdma-resource-inject` or `cni.spidernet.io/network-resource-inject`). Users only need to add the same annotation to the application, and Spiderpool will automatically inject all network cards and network resources that carry the same annotation into the application through a webhook. The `cni.spidernet.io/rdma-resource-inject` annotation applies only to AI scenarios and automatically injects RDMA network cards and RDMA resources. The `cni.spidernet.io/network-resource-inject` annotation supports both AI and underlay scenarios. In the future, we hope to use `cni.spidernet.io/network-resource-inject` uniformly for both scenarios.

> This feature only supports network card configurations with cniType of [macvlan, ipvlan, sriov, ib-sriov, ipoib].

@@ -438,7 +438,7 @@ In the steps above, we demonstrated how to use SR-IOV technology to provide RDMA
>
> Currently, after completing the configuration change, you need to restart the spiderpool-controller for the configuration to take effect.

2. When creating all SpiderMultusConfig instances for AI computing networks, add an annotation with the key "cni.spidernet.io/rdma-resource-inject" and a customizable value.
2. When creating all SpiderMultusConfig instances for AI computing networks, add an annotation with the key "cni.spidernet.io/rdma-resource-inject" (or "cni.spidernet.io/network-resource-inject") and a customizable value.

```yaml
apiVersion: spiderpool.spidernet.io/v2beta1
4 changes: 2 additions & 2 deletions docs/usage/install/ai/get-started-sriov-zh_CN.md
@@ -609,7 +609,7 @@ Spiderpool 使用了 [sriov-network-operator](https://github.com/k8snetworkplumb

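## Auto Inject RDMA Network Resources Based on Webhook

In the steps above, we showed how to use SR-IOV technology to provide RDMA communication capabilities for containers in RoCE and Infiniband network environments. However, configuring AI applications with multiple NICs becomes complex. To simplify this process, Spiderpool supports classifying a set of NIC configurations through an annotation (`cni.spidernet.io/rdma-resource-inject`). Users only need to add the same annotation to the application, and Spiderpool will automatically inject all NICs and network resources that carry the same annotation into the application through a webhook.
In the steps above, we showed how to use SR-IOV technology to provide RDMA communication capabilities for containers in RoCE and Infiniband network environments. However, configuring AI applications with multiple NICs becomes complex. To simplify this process, Spiderpool supports classifying a set of NIC configurations through annotations (`cni.spidernet.io/rdma-resource-inject` or `cni.spidernet.io/network-resource-inject`). Users only need to add the same annotation to the application, and Spiderpool will automatically inject all NICs and network resources that carry the same annotation into the application through a webhook. `cni.spidernet.io/rdma-resource-inject` applies only to AI scenarios and automatically injects RDMA NICs and RDMA resources; `cni.spidernet.io/network-resource-inject` can be used for AI scenarios and also supports Underlay scenarios. In the future, we hope to use `cni.spidernet.io/network-resource-inject` uniformly for both scenarios.

> This feature only supports NIC configurations whose cniType is one of [ macvlan, ipvlan, sriov, ib-sriov, ipoib ].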

@@ -627,7 +627,7 @@ Spiderpool 使用了 [sriov-network-operator](https://github.com/k8snetworkplumb
>
> Currently, after changing the configuration, you need to restart spiderpool-controller for the change to take effect.

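2. When creating all SpiderMultusConfig instances for the AI computing network, add an annotation with the key "cni.spidernet.io/rdma-resource-inject"; the value can be any custom value.
2. When creating all SpiderMultusConfig instances for the AI computing network, add an annotation with the key "cni.spidernet.io/rdma-resource-inject" or "cni.spidernet.io/network-resource-inject"; the value can be any custom value.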

```yaml
apiVersion: spiderpool.spidernet.io/v2beta1
4 changes: 2 additions & 2 deletions docs/usage/install/ai/get-started-sriov.md
@@ -610,7 +610,7 @@ For clusters using Infiniband networks, if there is a [UFM management platform](

## Auto Inject RDMA Resources Based on Webhook

In the steps above, we demonstrated how to use SR-IOV technology to provide RDMA communication capabilities for containers in RoCE and Infiniband network environments. However, the process can become complex when configuring AI applications with multiple network cards. To simplify this process, Spiderpool supports classifying a set of network card configurations through annotations (`cni.spidernet.io/rdma-resource-inject`). Users only need to add the same annotation to the application, and Spiderpool will automatically inject all corresponding network cards and network resources with the same annotation into the application through a webhook.
In the steps above, we demonstrated how to use SR-IOV technology to provide RDMA communication capabilities for containers in RoCE and Infiniband network environments. However, the process can become complex when configuring AI applications with multiple network cards. To simplify this process, Spiderpool supports classifying a set of network card configurations through annotations (`cni.spidernet.io/rdma-resource-inject` or `cni.spidernet.io/network-resource-inject`). Users only need to add the same annotation to the application, and Spiderpool will automatically inject all network cards and network resources that carry the same annotation into the application through a webhook. The `cni.spidernet.io/rdma-resource-inject` annotation applies only to AI scenarios and automatically injects RDMA network cards and RDMA resources. The `cni.spidernet.io/network-resource-inject` annotation supports both AI and underlay scenarios. In the future, we hope to use `cni.spidernet.io/network-resource-inject` uniformly for both scenarios.

> This feature only supports network card configurations with cniType of [macvlan, ipvlan, sriov, ib-sriov, ipoib].

@@ -628,7 +628,7 @@ In the steps above, we demonstrated how to use SR-IOV technology to provide RDMA
>
> Currently, after completing the configuration change, you need to restart the spiderpool-controller for the configuration to take effect.

2. When creating all SpiderMultusConfig instances for AI computing networks, add an annotation with the key "cni.spidernet.io/rdma-resource-inject" and a customizable value.
2. When creating all SpiderMultusConfig instances for AI computing networks, add an annotation with the key "cni.spidernet.io/rdma-resource-inject" (or "cni.spidernet.io/network-resource-inject") and a customizable value.

```yaml
apiVersion: spiderpool.spidernet.io/v2beta1
1 change: 1 addition & 0 deletions docs/usage/install/overlay/get-started-calico-zh_cn.md
@@ -159,6 +159,7 @@ EOF
```

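> `spec.macvlan.master` is set to `ens192`, and `ens192` must exist on the host. The subnet configured in `spec.macvlan.spiderpoolConfigPools.IPv4IPPool` must match the subnet of `ens192`.
> If you need to add multiple Underlay NICs to a Pod, see [**Multi-Underlay-NIC**](./multi-underlay-nic-zh_CN.md).

After creation, check whether the Multus NAD was created successfully: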

1 change: 1 addition & 0 deletions docs/usage/install/overlay/get-started-calico.md
@@ -130,6 +130,7 @@ EOF
```

> The subnet should be consistent with the subnet of `ens192` on the nodes, and ensure that the IP addresses do not conflict with any existing ones.
> If you need to add multiple underlay NICs to a Pod, see [**Multi-Underlay-NIC**](./multi-underlay-nic.md).

### Create SpiderMultusConfig

1 change: 1 addition & 0 deletions docs/usage/install/overlay/get-started-cilium-zh_cn.md
@@ -135,6 +135,7 @@ EOF
Note:

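> The subnet should be consistent with the subnet of the node NIC ens192 and must not conflict with any existing IP address.
> If you need to add multiple Underlay NICs to a Pod, see [**Multi-Underlay-NIC**](./multi-underlay-nic-zh_CN.md).

### Create SpiderMultusConfig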

1 change: 1 addition & 0 deletions docs/usage/install/overlay/get-started-cilium.md
@@ -133,6 +133,7 @@ EOF
```

> The subnet should be consistent with the subnet of `ens192` on the nodes, and ensure that the IP addresses do not conflict with any existing ones.
> If you need to add multiple underlay NICs to a Pod, see [**Multi-Underlay-NIC**](./multi-underlay-nic.md).

### Create SpiderMultusConfig

96 changes: 96 additions & 0 deletions docs/usage/install/overlay/multi-underlay-nic-zh_CN.md
@@ -0,0 +1,96 @@
# Calico with multi underlay NIC

[**English**](./multi-underlay-nic.md) | **简体中文**

## Auto attach multiple underlay NICs to Pod based on Webhook

On the cluster nodes in this document, NIC `ens192` is on subnet `10.6.0.0/16` and NIC `ens193` is on subnet `10.7.0.0/16`. Create the SpiderIPPools accordingly:

```shell
$ cat <<EOF | kubectl apply -f -
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderIPPool
metadata:
  name: macvlan-ens192
spec:
  disable: false
  gateway: 10.6.0.1
  subnet: 10.6.0.0/16
  ips:
    - 10.6.212.100-10.6.212.200
---
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderIPPool
metadata:
  name: macvlan-ens193
spec:
  disable: false
  gateway: 10.7.0.1
  subnet: 10.7.0.0/16
  ips:
    - 10.7.212.100-10.7.212.200
---
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderMultusConfig
metadata:
  name: macvlan-ens192
  namespace: spiderpool
  annotations:
    cni.spidernet.io/network-resource-inject: multi-network
spec:
  cniType: macvlan
  macvlan:
    master:
      - ens192
    ippools:
      ipv4:
        - macvlan-ens192
    vlanID: 0
---
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderMultusConfig
metadata:
  name: macvlan-ens193
  namespace: spiderpool
  annotations:
    cni.spidernet.io/network-resource-inject: multi-network
spec:
  cniType: macvlan
  macvlan:
    master:
      - ens193
    ippools:
      ipv4:
        - macvlan-ens193
    vlanID: 0
EOF
```

## Create a test application

1. Add the same annotation to the application:

```yaml
...
spec:
  template:
    metadata:
      annotations:
        cni.spidernet.io/network-resource-inject: multi-network
```

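> Note: When using the webhook to automatically inject network resources, do not add other network configuration annotations (such as `k8s.v1.cni.cncf.io/networks` and `ipam.spidernet.io/ippools`) to the application; otherwise automatic resource injection will be affected.

2. After the Pod is created, you can observe that the NIC annotation has been automatically injected into the Pod.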

```yaml
...
spec:
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: |-
          [{"name":"macvlan-ens192","namespace":"spiderpool"},
          {"name":"macvlan-ens193","namespace":"spiderpool"}]
...
```
96 changes: 96 additions & 0 deletions docs/usage/install/overlay/multi-underlay-nic.md
@@ -0,0 +1,96 @@
# Calico with multiple underlay NICs

**English** | [**简体中文**](./multi-underlay-nic-zh_CN.md)

## Auto attach multiple underlay NICs to Pod based on Webhook

On the cluster nodes in this example, interface `ens192` is on subnet `10.6.0.0/16` and interface `ens193` is on subnet `10.7.0.0/16`. Create one SpiderIPPool per subnet, together with the matching SpiderMultusConfig instances:

```shell
$ cat <<EOF | kubectl apply -f -
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderIPPool
metadata:
  name: macvlan-ens192
spec:
  disable: false
  gateway: 10.6.0.1
  subnet: 10.6.0.0/16
  ips:
    - 10.6.212.100-10.6.212.200
---
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderIPPool
metadata:
  name: macvlan-ens193
spec:
  disable: false
  gateway: 10.7.0.1
  subnet: 10.7.0.0/16
  ips:
    - 10.7.212.100-10.7.212.200
---
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderMultusConfig
metadata:
  name: macvlan-ens192
  namespace: spiderpool
  annotations:
    cni.spidernet.io/network-resource-inject: multi-network
spec:
  cniType: macvlan
  macvlan:
    master:
      - ens192
    ippools:
      ipv4:
        - macvlan-ens192
    vlanID: 0
---
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderMultusConfig
metadata:
  name: macvlan-ens193
  namespace: spiderpool
  annotations:
    cni.spidernet.io/network-resource-inject: multi-network
spec:
  cniType: macvlan
  macvlan:
    master:
      - ens193
    ippools:
      ipv4:
        - macvlan-ens193
    vlanID: 0
EOF
```

## Create an application

1. Add the same annotation to the application:

```yaml
...
spec:
  template:
    metadata:
      annotations:
        cni.spidernet.io/network-resource-inject: multi-network
```

> Note: When using the webhook to inject network resources automatically, do not add other network configuration annotations (such as `k8s.v1.cni.cncf.io/networks` and `ipam.spidernet.io/ippools`) to the application, because they interfere with the automatic injection.

2. Once the Pod is created, you can observe that the Pod has been automatically injected with network card annotations.

```yaml
...
spec:
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: |-
          [{"name":"macvlan-ens192","namespace":"spiderpool"},
          {"name":"macvlan-ens193","namespace":"spiderpool"}]
...
```
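
The injected `k8s.v1.cni.cncf.io/networks` value shown above is a JSON list with one entry per SpiderMultusConfig that carries the same annotation value as the application. The Go sketch below illustrates that matching step only. It is not Spiderpool's webhook code: the `multusConfig` and `networkSelection` types and the `buildNetworksAnnotation` function are hypothetical names introduced for illustration, and the real webhook may select and order networks differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Annotation key taken from the documentation above.
const annoNetworkResourceInject = "cni.spidernet.io/network-resource-inject"

// networkSelection mirrors the two fields visible in the injected annotation.
type networkSelection struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

// multusConfig is a hypothetical, trimmed-down stand-in for a SpiderMultusConfig.
type multusConfig struct {
	Name        string
	Namespace   string
	Annotations map[string]string
}

// buildNetworksAnnotation returns a JSON list of every config whose inject
// annotation equals value, in the order the configs were given.
func buildNetworksAnnotation(configs []multusConfig, value string) (string, error) {
	selected := []networkSelection{}
	for _, c := range configs {
		if c.Annotations[annoNetworkResourceInject] == value {
			selected = append(selected, networkSelection{Name: c.Name, Namespace: c.Namespace})
		}
	}
	if len(selected) == 0 {
		return "", fmt.Errorf("no config carries %s=%q", annoNetworkResourceInject, value)
	}
	out, err := json.Marshal(selected)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	anno := map[string]string{annoNetworkResourceInject: "multi-network"}
	configs := []multusConfig{
		{Name: "macvlan-ens192", Namespace: "spiderpool", Annotations: anno},
		{Name: "macvlan-ens193", Namespace: "spiderpool", Annotations: anno},
	}
	value, err := buildNetworksAnnotation(configs, "multi-network")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(value)
}
```

Returning an error when nothing matches is a design choice of this sketch; it makes a mistyped annotation value visible instead of silently producing a Pod with no extra NICs.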
5 changes: 3 additions & 2 deletions pkg/constant/k8s.go
@@ -105,8 +105,9 @@ const (
AnnoDraCdiVersion = AnnotationPre + "/cdi-version"

// webhook
PodMutatingWebhookName = "pods.spiderpool.spidernet.io"
AnnoPodResourceInject = CNIAnnotationPre + "/rdma-resource-inject"
PodMutatingWebhookName = "pods.spiderpool.spidernet.io"
AnnoPodResourceInject = CNIAnnotationPre + "/rdma-resource-inject"
AnnoNetworkResourceInject = CNIAnnotationPre + "/network-resource-inject"
)
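
With two annotation keys now defined, code that inspects a Pod has to decide which one to honor. The sketch below shows one possible lookup. It is an assumption rather than the PR's implementation: the lowercase constant names, the `findInjectAnnotation` helper, and the choice to check `network-resource-inject` before `rdma-resource-inject` are all hypothetical, and the value of `cniAnnotationPre` is inferred from the annotation keys quoted in the docs.

```go
package main

import "fmt"

const (
	// Prefix inferred from the documented annotation keys.
	cniAnnotationPre          = "cni.spidernet.io"
	annoPodResourceInject     = cniAnnotationPre + "/rdma-resource-inject"
	annoNetworkResourceInject = cniAnnotationPre + "/network-resource-inject"
)

// findInjectAnnotation reports which inject annotation, if any, is set to a
// non-empty value. It checks the newer key first; that precedence is an
// assumption of this sketch.
func findInjectAnnotation(annotations map[string]string) (key, value string, ok bool) {
	for _, k := range []string{annoNetworkResourceInject, annoPodResourceInject} {
		if v, found := annotations[k]; found && v != "" {
			return k, v, true
		}
	}
	return "", "", false
}

func main() {
	podAnnotations := map[string]string{
		annoNetworkResourceInject: "multi-network",
	}
	if key, value, ok := findInjectAnnotation(podAnnotations); ok {
		fmt.Printf("inject requested via %s=%s\n", key, value)
	} else {
		fmt.Println("no inject annotation found")
	}
}
```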

const (
1 change: 1 addition & 0 deletions pkg/coordinatormanager/coordinator_informer.go
@@ -506,6 +506,7 @@ func (cc *CoordinatorController) updatePodAndServerCIDR(ctx context.Context, log
return coordCopy
}


k8sPodCIDR, k8sServiceCIDR = utils.ExtractK8sCIDRFromKCMPod(&podList.Items[0])
logger.Sugar().Infof("kube-controller-manager k8sPodCIDR %v, k8sServiceCIDR %v", k8sPodCIDR, k8sServiceCIDR)
}
54 changes: 54 additions & 0 deletions pkg/coordinatormanager/coordinator_informer_test.go
@@ -0,0 +1,54 @@
// Copyright 2022 Authors of spidernet-io
// SPDX-License-Identifier: Apache-2.0

package coordinatormanager

import (
	"encoding/json"
	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
	corev1 "k8s.io/api/core/v1"
)

var _ = Describe("Coordinator Manager", Label("coordinatorinformer", "informer_test"), Serial, func() {
	DescribeTable("should extract CIDRs correctly",
		func(testName, cmStr string, expectedPodCIDR, expectedServiceCIDR []string, expectError bool) {
			var cm corev1.ConfigMap
			err := json.Unmarshal([]byte(cmStr), &cm)
			Expect(err).NotTo(HaveOccurred(), "Failed to unmarshal configMap: %v\n", err)

			podCIDR, serviceCIDR, err := ExtractK8sCIDRFromKubeadmConfigMap(&cm)

Check failure on line 20 in pkg/coordinatormanager/coordinator_informer_test.go (GitHub Actions / lint-golang): undefined: ExtractK8sCIDRFromKubeadmConfigMap (typecheck)

			if expectError {
				Expect(err).To(HaveOccurred(), "Expected an error but got none")
			} else {
				Expect(err).NotTo(HaveOccurred(), "Did not expect an error but got one: %v", err)
			}

			Expect(podCIDR).To(Equal(expectedPodCIDR), "Pod CIDR does not match")
			Expect(serviceCIDR).To(Equal(expectedServiceCIDR), "Service CIDR does not match")
		},
		Entry("ClusterConfiguration",
			"ClusterConfiguration",
			clusterConfigurationJson,
			[]string{"192.168.165.0/24"},
			[]string{"245.100.128.0/18"},
			false,
		),
		Entry("No ClusterConfiguration",
			"No ClusterConfiguration",
			noClusterConfigurationJson,
			nil,
			nil,
			true,
		),
		Entry("No CIDR",
			"No CIDR",
			noCIDRJson,
			nil,
			nil,
			false,
		),
	)

})
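
The lint failure above reports that `ExtractK8sCIDRFromKubeadmConfigMap` is undefined in this package, and the PR does not show an implementation. The following is only a rough sketch of what such a helper could do, assuming it reads the `ClusterConfiguration` key of the kubeadm-config ConfigMap and extracts `networking.podSubnet` and `networking.serviceSubnet`. The name `extractCIDRs`, the `map[string]string` argument (standing in for `ConfigMap.Data`), and the line-based parsing are assumptions made to keep the example dependency-free; a real implementation would more likely unmarshal the YAML. The error and nil behavior follows the three table entries in the test.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// extractCIDRs is a hypothetical stand-in for the undefined helper. It returns
// an error when the ClusterConfiguration key is missing, and nil slices
// without an error when the key exists but lists no subnets.
func extractCIDRs(data map[string]string) (podCIDR, serviceCIDR []string, err error) {
	cfg, ok := data["ClusterConfiguration"]
	if !ok {
		return nil, nil, errors.New("ClusterConfiguration key not found")
	}
	scanner := bufio.NewScanner(strings.NewReader(cfg))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		switch {
		case strings.HasPrefix(line, "podSubnet:"):
			podCIDR = splitCIDRs(strings.TrimPrefix(line, "podSubnet:"))
		case strings.HasPrefix(line, "serviceSubnet:"):
			serviceCIDR = splitCIDRs(strings.TrimPrefix(line, "serviceSubnet:"))
		}
	}
	return podCIDR, serviceCIDR, scanner.Err()
}

// splitCIDRs splits a comma-separated (dual-stack) value and strips
// surrounding whitespace and quotes from each entry.
func splitCIDRs(v string) []string {
	var out []string
	for _, part := range strings.Split(v, ",") {
		part = strings.Trim(strings.TrimSpace(part), `"'`)
		if part != "" {
			out = append(out, part)
		}
	}
	return out
}

func main() {
	data := map[string]string{
		"ClusterConfiguration": "networking:\n  podSubnet: 192.168.165.0/24\n  serviceSubnet: 245.100.128.0/18\n",
	}
	pod, svc, err := extractCIDRs(data)
	fmt.Println(pod, svc, err)
}
```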