Add Swift network for RGW to HCI scenario #386
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: fultonj |
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/ecc7ae499dd0462b97f23c375b46bf28 ✔️ noop SUCCESS in 0s |
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/0f4c9701ca2e421eb71dfa238ec09a19 ✔️ noop SUCCESS in 0s |
Regarding ❌ rhoso-architecture-validate-hci FAILURE in 3m 16s: I ran the following in my own environment and could not reproduce the problem.
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/805ad2339a114206ab86d7c3a3002c29 ✔️ noop SUCCESS in 0s |
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/97539585015f4394a62edc009c2db3dd ✔️ noop SUCCESS in 0s |
This failure from rhoso-architecture-validate-hci
should just be running When I do that with these versions in my environment it works.
So I'm having trouble reproducing this error locally. Since I've added a depends-on to openstack-k8s-operators/ci-framework#2301 I might try updating ci-framework next to add debug info. |
recheck |
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/4008a91b5d2640d68e606d99492fced6 ✔️ noop SUCCESS in 0s |
The CI is using the same versions:
https://softwarefactory-project.io/zuul/t/rdoproject.org/build/92be298ca3ca468c88dd08b92b03d2fb |
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/3b6271692ffa4200ade4b8122c4f5c31 ✔️ noop SUCCESS in 0s |
...
https://github.com/openstack-k8s-operators/architecture/pull/386/files#r1742876830 |
Hi @fultonj, |
Right now I'm trying to get a separate network (not the storage network) for RGW so we can put an end to our workaround in CI of running Ceph on the provisioning network. It's easiest to implement this new network, named swift, as another VLAN. It could be changed to an external network if we wanted to expose the service externally but that would be done in another patch if desired. |
7eaefc6 to a2f0598
recheck |
I tested this in my environment with VA1. I ran the Tempest object storage tests with the following and they passed.
kind: Tempest
spec:
  networkAttachments:
  - swift
Waiting for a test-project run downstream before merging. |
cidr: 172.22.0.0/24
gateway: 172.22.0.1
name: subnet1
vlan: 25
VLAN 24 and 172.21 is the next unused range, so why VLAN 25 and 172.22 in this patch?
Because #376 is using VLAN 24 and 172.21
@fultonj I think it might be possible to do this without needing to remove VA1's dependency on I created a diff of your patch's generated content and my version, and I think it's just a white-space or trailing characters difference. I don't see any actual value differences (just showing one NNCP here):
What do you think? |
That's much nicer though it's missing a few things I'll still need.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  labels:
    osp/net: swift
    osp/net-attach-def-type: standard
  name: swift
  namespace: openstack
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "swift",
      "type": "macvlan",
      "master": "swift",
      "ipam": {
        "type": "whereabouts",
        "range": "172.22.0.0/24",
        "range_start": "172.22.0.100",
        "range_end": "172.22.0.250"
      }
    }
---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  labels:
    osp/lb-addresses-type: standard
  name: swift
  namespace: metallb-system
spec:
  addresses:
  - 172.22.0.80-172.22.0.90
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: swift
  namespace: metallb-system
spec:
  interfaces:
  - swift
  ipAddressPools:
  - swift
---
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm
  namespace: openstack
spec:
  env:
  - name: ANSIBLE_FORCE_COLOR
    value: "True"
  networkAttachments:
  - ctlplane
  nodeTemplate:
    ansible:
      ansiblePort: 22
      ansibleUser: cloud-admin
      ansibleVars:
        edpm_ceph_hci_pre_enabled_services:
        - ceph_mon
        - ceph_mgr
        - ceph_osd
        - ceph_rgw
        - ceph_nfs
        - ceph_rgw_frontend
        - ceph_nfs_frontend
        edpm_network_config_hide_sensitive_logs: false
        edpm_network_config_os_net_config_mappings:
          edpm-compute-0:
            nic2: 6a:fe:54:3f:8a:02
          edpm-compute-1:
            nic2: 6b:fe:54:3f:8a:02
          edpm-compute-2:
            nic2: 6c:fe:54:3f:8a:02
        edpm_network_config_template: |
          ---
          {% set mtu_list = [ctlplane_mtu] %}
          {% for network in nodeset_networks %}
          {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
          {%- endfor %}
          {% set min_viable_mtu = mtu_list | max %}
          network_config:
          - type: interface
            name: nic1
            use_dhcp: true
            mtu: {{ min_viable_mtu }}
          - type: ovs_bridge
            name: {{ neutron_physical_bridge_name }}
            mtu: {{ min_viable_mtu }}
            use_dhcp: false
            dns_servers: {{ ctlplane_dns_nameservers }}
            domain: {{ dns_search_domains }}
            addresses:
            - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
            routes: {{ ctlplane_host_routes }}
            members:
            - type: interface
              name: nic2
              mtu: {{ min_viable_mtu }}
              # force the MAC address of the bridge to this interface
              primary: true
          {% for network in nodeset_networks %}
            - type: vlan
              mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
              vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
              addresses:
              - ip_netmask:
                  {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
              routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
          {% endfor %}
        edpm_nodes_validation_validate_controllers_icmp: false
        edpm_nodes_validation_validate_gateway_icmp: false
        edpm_sshd_allowed_ranges:
        - 192.168.122.0/24
        edpm_sshd_configure_firewall: true
        gather_facts: false
        neutron_physical_bridge_name: br-ex
        neutron_public_interface_name: eth0
        storage_mgmt_cidr: "24"
        storage_mgmt_host_routes: []
        storage_mgmt_mtu: 9000
        storage_mgmt_vlan_id: 23
        storage_mtu: 9000
        timesync_ntp_servers:
        - hostname: pool.ntp.org
    ansibleSSHPrivateKeySecret: dataplane-ansible-ssh-private-key-secret
    extraMounts:
    - extraVolType: Ceph
      mounts:
      - mountPath: /etc/ceph
        name: ceph
        readOnly: true
      volumes:
      - name: ceph
        secret:
          secretName: ceph-conf-files
    managementNetwork: ctlplane
    networks:
    - defaultRoute: true
      name: ctlplane
      subnetName: subnet1
    - name: internalapi
      subnetName: subnet1
    - name: storage
      subnetName: subnet1
    - name: tenant
      subnetName: subnet1
    - name: swift
      subnetName: subnet1
  nodes:
    edpm-compute-0:
      ansible:
        ansibleHost: 192.168.122.100
      hostName: edpm-compute-0
      networks:
      - defaultRoute: true
        fixedIP: 192.168.122.100
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: storagemgmt
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
      - name: swift
        subnetName: subnet1
    edpm-compute-1:
      ansible:
        ansibleHost: 192.168.122.101
      hostName: edpm-compute-1
      networks:
      - defaultRoute: true
        fixedIP: 192.168.122.101
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: storagemgmt
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
      - name: swift
        subnetName: subnet1
    edpm-compute-2:
      ansible:
        ansibleHost: 192.168.122.102
      hostName: edpm-compute-2
      networks:
      - defaultRoute: true
        fixedIP: 192.168.122.102
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: storagemgmt
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
      - name: swift
        subnetName: subnet1
  preProvisioned: true
  services:
  - bootstrap
  - configure-network
  - validate-network
  - install-os
  - ceph-hci-pre
  - configure-os
  - ssh-known-hosts
  - run-os
  - reboot-os
  - install-certs
  - ceph-client
  - ovn
  - neutron-metadata
  - libvirt
  - nova-custom-ceph
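The manifests above carve two address ranges out of the swift subnet: the whereabouts range in the NetworkAttachmentDefinition and the MetalLB IPAddressPool. The following is only a rough sketch, not part of the patch, of how one might sanity-check that both ranges sit inside 172.22.0.0/24, avoid the gateway, and do not overlap. The addresses are copied from this PR; the `check_ranges` helper and the range labels are hypothetical names introduced for illustration.

```python
import ipaddress

# Values copied from the manifests and the values diff in this PR.
SUBNET = ipaddress.ip_network("172.22.0.0/24")
GATEWAY = ipaddress.ip_address("172.22.0.1")
RANGES = {
    "whereabouts (NAD)": ("172.22.0.100", "172.22.0.250"),
    "metallb (IPAddressPool)": ("172.22.0.80", "172.22.0.90"),
}


def check_ranges(subnet, ranges, reserved=()):
    """Hypothetical helper: return a list of problems (empty means consistent)."""
    problems = []
    parsed = {}
    for name, (start, end) in ranges.items():
        lo, hi = ipaddress.ip_address(start), ipaddress.ip_address(end)
        if lo > hi:
            problems.append(f"{name}: start {lo} is after end {hi}")
        # Both ends of the range must belong to the subnet.
        for addr in (lo, hi):
            if addr not in subnet:
                problems.append(f"{name}: {addr} is outside {subnet}")
        # Reserved addresses (for example the gateway) must not be handed out.
        for addr in reserved:
            if lo <= addr <= hi:
                problems.append(f"{name}: contains reserved address {addr}")
        parsed[name] = (lo, hi)
    # Pairwise overlap check between the inclusive ranges.
    names = sorted(parsed)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (a_lo, a_hi), (b_lo, b_hi) = parsed[a], parsed[b]
            if a_lo <= b_hi and b_lo <= a_hi:
                problems.append(f"{a} overlaps {b}")
    return problems


if __name__ == "__main__":
    issues = check_ranges(SUBNET, RANGES, reserved=[GATEWAY])
    print("OK" if not issues else "\n".join(issues))
```

With the values from this PR the helper should report no problems; it only checks address arithmetic and says nothing about whether the VLAN or the interfaces are configured correctly.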
@fultonj Yes, it's missing the other stuff. I just wanted to use NNCPs as an example approach to what might work. It's possible though that the pattern might break for the other components. But first I just wanted to see if you all thought this might be worth investigating. |
It is certainly worth investigating as it's much cleaner so thank you very much! I'll try implementing this change with that approach. |
todo:
|
/trigger github-experimental |
@fultonj Here's a new version of the rework for NNCP generation that doesn't hardcode NNCP names, as we discussed yesterday: abays@462eb45. You can see it uses labels instead. I tested it and it works. |
When Ceph RGW is used, the endpoint for Swift storage is hosted not in a pod on k8s but on an EDPM node. A service hosted on an EDPM node therefore needs to be reachable by cloud users over a separate network.
This patch adds the Swift storage network (swift) with VLAN 25 and range 172.22.0.0/24 in the HCI values example. The Swift network is configured on the HCI EDPM nodes, and an NNCP, NAD, L2Advertisement and IPAddressPool are defined so that a pod in k8s, such as the Tempest pod that runs the object storage tests, can connect to it.
To make these changes, va/hci now keeps its own copy of the nncp and networking directories, since they differ (by the new network) from the generic ones in the lib directory.
Jira: https://issues.redhat.com/browse/OSPRH-6675
Depends-On: openstack-k8s-operators/ci-framework#2301
Signed-off-by: John Fulton <[email protected]>
This change depends on a change that failed to merge. Change openstack-k8s-operators/ci-framework#2301 is needed. |
/trigger github-experimental |
Obsoleted by #404 |