
ETCD-695: Add job parallelism to recurrent backups #1381

Open

Elbehery wants to merge 5 commits into master from use_parallel_backup_jobs_test

Conversation

Elbehery (Contributor) commented Dec 28, 2024

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Dec 28, 2024
openshift-ci-robot commented Dec 28, 2024

@Elbehery: This pull request references ETCD-695 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the spike to target either version "4.19." or "openshift-4.19.", but it targets "4.18" instead.

In response to this:

/hold

testing ...

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 28, 2024
@openshift-ci openshift-ci bot requested review from dusk125 and tjungblu December 28, 2024 14:52
openshift-ci bot commented Dec 28, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Elbehery

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 28, 2024
@Elbehery Elbehery force-pushed the use_parallel_backup_jobs_test branch from 8c463b6 to efbaa09 Compare December 28, 2024 15:00
@Elbehery Elbehery force-pushed the use_parallel_backup_jobs_test branch 3 times, most recently from fd87df7 to e684567 Compare December 28, 2024 18:04
@Elbehery Elbehery force-pushed the use_parallel_backup_jobs_test branch from e684567 to a7e1fb0 Compare December 28, 2024 20:09
openshift-ci-robot commented Dec 28, 2024

@Elbehery: This pull request references ETCD-695 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the spike to target either version "4.19." or "openshift-4.19.", but it targets "4.18" instead.

In response to this:

resolves https://issues.redhat.com/browse/ETCD-695

/hold

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Elbehery (Contributor, Author) commented:

/label tide/merge-method-squash

@openshift-ci openshift-ci bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Dec 28, 2024
Elbehery (Contributor, Author) commented Dec 28, 2024

Tested this PR atop a 4.19.0-ec.0 OCP cluster.

The Backup CR used during testing:

apiVersion: config.openshift.io/v1alpha1
kind: Backup
metadata:
  name: default
spec:
  etcd:
    schedule: "*/3 * * * *"
    timeZone: "UTC"
    retentionPolicy:
      retentionType: RetentionNumber
      retentionNumber:
        maxNumberOfBackups: 3

Note that the PVC field is omitted on purpose.
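
For reference, a minimal way to apply and inspect the CR above; the file name backup-default.yaml is just a placeholder for this example:

# apply the Backup CR and read it back (the Backup CRD is cluster-scoped)
oc apply -f backup-default.yaml
oc get backups.config.openshift.io default -o yaml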


Upon applying the CR above, the PeriodicBackupController within CEO reacts by creating the following CronJob:

apiVersion: batch/v1
kind: CronJob
metadata:
  creationTimestamp: "2024-12-28T20:17:23Z"
  generation: 1
  labels:
    app: cluster-backup-cronjob
    backup-name: default
  name: default
  namespace: openshift-etcd
  ownerReferences:
    - apiVersion: config.openshift.io/v1alpha1
      kind: Backup
      name: default
      uid: 0a0c1663-e70b-475a-bea1-1e36c20c0b65
  resourceVersion: "101174"
  uid: e127e972-bbfc-4886-b042-ee7c1f2647e2
spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 10
  jobTemplate:
    metadata:
      creationTimestamp: null
      labels:
        app: cluster-backup-cronjob
    spec:
      template:
        metadata:
          creationTimestamp: null
          labels:
            app: cluster-backup-cronjob
        spec:
          containers:
            - args:
                - request-backup
                - --pvc-name=no-config
              command:
                - cluster-etcd-operator
              env:
                - name: MY_JOB_NAME
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: metadata.labels['batch.kubernetes.io/job-name']
                - name: MY_JOB_UID
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: metadata.labels['batch.kubernetes.io/controller-uid']
              image: registry.build05.ci.openshift.org/ci-ln-6pd8862/stable@sha256:c97a42f7872a95ce78fe185e46a6a04f3e6da42009462b88da9f2766259bf80a
              imagePullPolicy: IfNotPresent
              name: cluster-backup
              resources: {}
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: FallbackToLogsOnError
          dnsPolicy: ClusterFirst
          initContainers:
            - args:
                - prune-backups
                - --type=RetentionNumber
                - --maxNumberOfBackups=3
              command:
                - cluster-etcd-operator
              image: registry.build05.ci.openshift.org/ci-ln-6pd8862/stable@sha256:c97a42f7872a95ce78fe185e46a6a04f3e6da42009462b88da9f2766259bf80a
              imagePullPolicy: IfNotPresent
              name: retention
              resources: {}
              securityContext:
                privileged: true
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: FallbackToLogsOnError
              volumeMounts:
                - mountPath: /etc/kubernetes/cluster-backup
                  name: etc-kubernetes-cluster-backup
          nodeSelector:
            node-role.kubernetes.io/master: ""
          restartPolicy: OnFailure
          schedulerName: default-scheduler
          securityContext: {}
          serviceAccount: etcd-backup-sa
          serviceAccountName: etcd-backup-sa
          terminationGracePeriodSeconds: 30
          tolerations:
            - operator: Exists
          volumes:
            - hostPath:
                path: /etc/kubernetes/cluster-backup
                type: DirectoryOrCreate
              name: etc-kubernetes-cluster-backup
  schedule: '*/3 * * * *'
  successfulJobsHistoryLimit: 5
  suspend: false
  timeZone: UTC
status:
  lastScheduleTime: "2024-12-28T20:54:00Z"
  lastSuccessfulTime: "2024-12-28T20:54:04Z"
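
As a quick check, the CronJob and the Jobs it spawns (they carry the app=cluster-backup-cronjob label from the manifest above) can be listed with:

oc get cronjob default -n openshift-etcd
oc get jobs -n openshift-etcd -l app=cluster-backup-cronjob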

The preceding CronJob creates a Job resource according to the specified schedule. See the example below:

apiVersion: batch/v1
kind: Job
metadata:
  annotations:
    batch.kubernetes.io/cronjob-scheduled-timestamp: "2024-12-28T21:09:00Z"
  creationTimestamp: "2024-12-28T21:09:00Z"
  generation: 1
  labels:
    app: cluster-backup-cronjob
  name: default-28923669
  namespace: openshift-etcd
  ownerReferences:
    - apiVersion: batch/v1
      blockOwnerDeletion: true
      controller: true
      kind: CronJob
      name: default
      uid: e127e972-bbfc-4886-b042-ee7c1f2647e2
  resourceVersion: "108546"
  uid: 5f51b888-7eda-47e6-8f4e-a576a831fada
spec:
  backoffLimit: 6
  completionMode: NonIndexed
  completions: 1
  manualSelector: false
  parallelism: 1
  podReplacementPolicy: TerminatingOrFailed
  selector:
    matchLabels:
      batch.kubernetes.io/controller-uid: 5f51b888-7eda-47e6-8f4e-a576a831fada
  suspend: false
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: cluster-backup-cronjob
        batch.kubernetes.io/controller-uid: 5f51b888-7eda-47e6-8f4e-a576a831fada
        batch.kubernetes.io/job-name: default-28923669
        controller-uid: 5f51b888-7eda-47e6-8f4e-a576a831fada
        job-name: default-28923669
    spec:
      containers:
        - args:
            - request-backup
            - --pvc-name=no-config
          command:
            - cluster-etcd-operator
          env:
            - name: MY_JOB_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.labels['batch.kubernetes.io/job-name']
            - name: MY_JOB_UID
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.labels['batch.kubernetes.io/controller-uid']
          image: registry.build05.ci.openshift.org/ci-ln-6pd8862/stable@sha256:c97a42f7872a95ce78fe185e46a6a04f3e6da42009462b88da9f2766259bf80a
          imagePullPolicy: IfNotPresent
          name: cluster-backup
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: FallbackToLogsOnError
      dnsPolicy: ClusterFirst
      initContainers:
        - args:
            - prune-backups
            - --type=RetentionNumber
            - --maxNumberOfBackups=3
          command:
            - cluster-etcd-operator
          image: registry.build05.ci.openshift.org/ci-ln-6pd8862/stable@sha256:c97a42f7872a95ce78fe185e46a6a04f3e6da42009462b88da9f2766259bf80a
          imagePullPolicy: IfNotPresent
          name: retention
          resources: {}
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: FallbackToLogsOnError
          volumeMounts:
            - mountPath: /etc/kubernetes/cluster-backup
              name: etc-kubernetes-cluster-backup
      nodeSelector:
        node-role.kubernetes.io/master: ""
      restartPolicy: OnFailure
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: etcd-backup-sa
      serviceAccountName: etcd-backup-sa
      terminationGracePeriodSeconds: 30
      tolerations:
        - operator: Exists
      volumes:
        - hostPath:
            path: /etc/kubernetes/cluster-backup
            type: DirectoryOrCreate
          name: etc-kubernetes-cluster-backup
status:
  completionTime: "2024-12-28T21:09:04Z"
  conditions:
    - lastProbeTime: "2024-12-28T21:09:04Z"
      lastTransitionTime: "2024-12-28T21:09:04Z"
      message: Reached expected number of succeeded pods
      reason: CompletionsReached
      status: "True"
      type: SuccessCriteriaMet
    - lastProbeTime: "2024-12-28T21:09:04Z"
      lastTransitionTime: "2024-12-28T21:09:04Z"
      message: Reached expected number of succeeded pods
      reason: CompletionsReached
      status: "True"
      type: Complete
  ready: 0
  startTime: "2024-12-28T21:09:00Z"
  succeeded: 1
  terminating: 0
  uncountedTerminatedPods: {}
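
The init/main container split described in the list below can also be read directly from the Job spec; the Job name here is taken from this test run:

oc get job default-28923669 -n openshift-etcd \
  -o jsonpath='{.spec.template.spec.initContainers[*].name} {.spec.template.spec.containers[*].name}'
# expected output, per the manifest above: retention cluster-backup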

  • The created Job resource contains two containers.

  • The prune-backups command runs as an init container to prune existing backups according to the specified retention criteria.

                - prune-backups
                - --type=RetentionNumber
                - --maxNumberOfBackups=3

and

  • The request-backup command runs as the main container to request a backup according to the schedule specified in the Backup CR, and to store the backups within the given PVC.
    • This command creates another CR (i.e. an EtcdBackup).
    • This CR is handled by the BackupController within CEO, which reacts by creating another Job resource.
                - request-backup
                - --pvc-name=no-config
  • The Job resource's name is always prefixed with the Backup CR's name (e.g. job.batch/default-28923624 is created upon applying a Backup CR with name=default).

  • The preceding Job deploys a Pod which runs both containers as explained above.

  • An example Pod is pod/default-28923684-z8vb4, shown below:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.ovn.org/pod-networks: '{"default":{"ip_addresses":["10.128.0.107/23"],"mac_address":"0a:58:0a:80:00:6b","gateway_ips":["10.128.0.1"],"routes":[{"dest":"10.128.0.0/14","nextHop":"10.128.0.1"},{"dest":"172.30.0.0/16","nextHop":"10.128.0.1"},{"dest":"169.254.0.5/32","nextHop":"10.128.0.1"},{"dest":"100.64.0.0/16","nextHop":"10.128.0.1"}],"ip_address":"10.128.0.107/23","gateway_ip":"10.128.0.1","role":"primary"}}'
    k8s.v1.cni.cncf.io/network-status: |-
      [{
          "name": "ovn-kubernetes",
          "interface": "eth0",
          "ips": [
              "10.128.0.107"
          ],
          "mac": "0a:58:0a:80:00:6b",
          "default": true,
          "dns": {}
      }]
  creationTimestamp: "2024-12-28T21:24:00Z"
  generateName: default-28923684-
  labels:
    app: cluster-backup-cronjob
    batch.kubernetes.io/controller-uid: 07150b95-eea1-41ec-b366-2861b7e0b029
    batch.kubernetes.io/job-name: default-28923684
    controller-uid: 07150b95-eea1-41ec-b366-2861b7e0b029
    job-name: default-28923684
  name: default-28923684-z8vb4
  namespace: openshift-etcd
  ownerReferences:
    - apiVersion: batch/v1
      blockOwnerDeletion: true
      controller: true
      kind: Job
      name: default-28923684
      uid: 07150b95-eea1-41ec-b366-2861b7e0b029
  resourceVersion: "115659"
  uid: c03cf4f3-5799-4ff4-b15c-f3f78af20b04
spec:
  containers:
    - args:
        - request-backup
        - --pvc-name=no-config
      command:
        - cluster-etcd-operator
      env:
        - name: MY_JOB_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.labels['batch.kubernetes.io/job-name']
        - name: MY_JOB_UID
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.labels['batch.kubernetes.io/controller-uid']
      image: registry.build05.ci.openshift.org/ci-ln-6pd8862/stable@sha256:c97a42f7872a95ce78fe185e46a6a04f3e6da42009462b88da9f2766259bf80a
      imagePullPolicy: IfNotPresent
      name: cluster-backup
      resources: {}
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: FallbackToLogsOnError
      volumeMounts:
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-gbsvk
          readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  imagePullSecrets:
    - name: etcd-backup-sa-dockercfg-qs7fr
  initContainers:
    - args:
        - prune-backups
        - --type=RetentionNumber
        - --maxNumberOfBackups=3
      command:
        - cluster-etcd-operator
      image: registry.build05.ci.openshift.org/ci-ln-6pd8862/stable@sha256:c97a42f7872a95ce78fe185e46a6a04f3e6da42009462b88da9f2766259bf80a
      imagePullPolicy: IfNotPresent
      name: retention
      resources: {}
      securityContext:
        privileged: true
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: FallbackToLogsOnError
      volumeMounts:
        - mountPath: /etc/kubernetes/cluster-backup
          name: etc-kubernetes-cluster-backup
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-gbsvk
          readOnly: true
  nodeName: ip-10-0-59-75.us-west-1.compute.internal
  nodeSelector:
    node-role.kubernetes.io/master: ""
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: OnFailure
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: etcd-backup-sa
  serviceAccountName: etcd-backup-sa
  terminationGracePeriodSeconds: 30
  tolerations:
    - operator: Exists
  volumes:
    - hostPath:
        path: /etc/kubernetes/cluster-backup
        type: DirectoryOrCreate
      name: etc-kubernetes-cluster-backup
    - name: kube-api-access-gbsvk
      projected:
        defaultMode: 420
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              items:
                - key: ca.crt
                  path: ca.crt
              name: kube-root-ca.crt
          - downwardAPI:
              items:
                - fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
                  path: namespace
          - configMap:
              items:
                - key: service-ca.crt
                  path: service-ca.crt
              name: openshift-service-ca.crt
status:
  conditions:
    - lastProbeTime: null
      lastTransitionTime: "2024-12-28T21:24:03Z"
      status: "False"
      type: PodReadyToStartContainers
    - lastProbeTime: null
      lastTransitionTime: "2024-12-28T21:24:01Z"
      reason: PodCompleted
      status: "True"
      type: Initialized
    - lastProbeTime: null
      lastTransitionTime: "2024-12-28T21:24:00Z"
      reason: PodCompleted
      status: "False"
      type: Ready
    - lastProbeTime: null
      lastTransitionTime: "2024-12-28T21:24:00Z"
      reason: PodCompleted
      status: "False"
      type: ContainersReady
    - lastProbeTime: null
      lastTransitionTime: "2024-12-28T21:24:00Z"
      status: "True"
      type: PodScheduled
  containerStatuses:
    - containerID: cri-o://2d52aace47048284af15f4488256d77ccd706006b81d45d46c004508eb8d8791
      image: registry.build05.ci.openshift.org/ci-ln-6pd8862/stable@sha256:c97a42f7872a95ce78fe185e46a6a04f3e6da42009462b88da9f2766259bf80a
      imageID: registry.build05.ci.openshift.org/ci-ln-6pd8862/stable@sha256:c97a42f7872a95ce78fe185e46a6a04f3e6da42009462b88da9f2766259bf80a
      lastState: {}
      name: cluster-backup
      ready: false
      restartCount: 0
      started: false
      state:
        terminated:
          containerID: cri-o://2d52aace47048284af15f4488256d77ccd706006b81d45d46c004508eb8d8791
          exitCode: 0
          finishedAt: "2024-12-28T21:24:01Z"
          reason: Completed
          startedAt: "2024-12-28T21:24:01Z"
      volumeMounts:
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-gbsvk
          readOnly: true
          recursiveReadOnly: Disabled
  hostIP: 10.0.59.75
  hostIPs:
    - ip: 10.0.59.75
  initContainerStatuses:
    - containerID: cri-o://0f4ea084f3539cf1d1847a1b575158717db45a6bdf48d7f8af7514b2546c4d23
      image: registry.build05.ci.openshift.org/ci-ln-6pd8862/stable@sha256:c97a42f7872a95ce78fe185e46a6a04f3e6da42009462b88da9f2766259bf80a
      imageID: registry.build05.ci.openshift.org/ci-ln-6pd8862/stable@sha256:c97a42f7872a95ce78fe185e46a6a04f3e6da42009462b88da9f2766259bf80a
      lastState: {}
      name: retention
      ready: true
      restartCount: 0
      started: false
      state:
        terminated:
          containerID: cri-o://0f4ea084f3539cf1d1847a1b575158717db45a6bdf48d7f8af7514b2546c4d23
          exitCode: 0
          finishedAt: "2024-12-28T21:24:00Z"
          reason: Completed
          startedAt: "2024-12-28T21:24:00Z"
      volumeMounts:
        - mountPath: /etc/kubernetes/cluster-backup
          name: etc-kubernetes-cluster-backup
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-gbsvk
          readOnly: true
          recursiveReadOnly: Disabled
  phase: Succeeded
  podIP: 10.128.0.107
  podIPs:
    - ip: 10.128.0.107
  qosClass: BestEffort
  startTime: "2024-12-28T21:24:00Z"
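
The EtcdBackup CRs created by the request-backup container (described next) have no namespace in their manifest, so they can be listed cluster-wide with:

oc get etcdbackups.operator.openshift.io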

  • The request-backup container creates an EtcdBackup CR. An example is shown below:
apiVersion: operator.openshift.io/v1alpha1
kind: EtcdBackup
metadata:
  creationTimestamp: "2024-12-28T21:30:01Z"
  generation: 1
  labels:
    state: processed
  name: default-28923690
  ownerReferences:
    - apiVersion: batch/v1
      kind: Job
      name: default-28923690
      uid: 0fc2121a-56fe-43ac-8225-df0ee10837cd
  resourceVersion: "118529"
  uid: 54fb6a7d-92ea-466c-aadf-da4003329547
spec:
  pvcName: no-config
status:
  backupJob:
    name: cluster-backup-job-7tvrz
    namespace: openshift-etcd
  conditions:
    - lastTransitionTime: "2024-12-28T21:30:01Z"
      message: Executing job cluster-backup-job-7tvrz to save backup file backup-default-28923690-2024-12-28_213001
      reason: BackupPending
      status: "False"
      type: BackupPending
    - lastTransitionTime: "2024-12-28T21:30:06Z"
      message: Complete
      reason: BackupCompleted
      status: "True"
      type: BackupCompleted
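
The second-level Jobs spawned from these CRs carry the app=cluster-backup-job label (visible in the manifest further below), so they can be listed separately from the CronJob-spawned Jobs:

oc get jobs -n openshift-etcd -l app=cluster-backup-job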

  • The EtcdBackup CR is handled by the BackupController within CEO.

  • The BackupController reacts by creating another Job resource, as shown below.

  • An example Job is job.batch/cluster-backup-job-k7j4k.


Below is the Job manifest created by the BackupController as a result of creating an EtcdBackup CR instance:

apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: "2024-12-28T20:21:01Z"
  generation: 1
  labels:
    app: cluster-backup-job
    backup-name: default-28923621
    state: processed
  name: cluster-backup-job-k7j4k
  namespace: openshift-etcd
  ownerReferences:
    - apiVersion: operator.openshift.io/v1alpha1
      kind: EtcdBackup
      name: default-28923621
      uid: f1436db0-3713-4b33-94dc-5d8fd3d76ce3
    - apiVersion: batch/v1
      kind: Job
      name: default-28923621
      uid: 0ba1832a-3926-4ca4-b90b-e2b21f593586
  resourceVersion: "85259"
  uid: dbe24b55-ae86-4bb7-b3ec-066a908ef37d
spec:
  backoffLimit: 6
  completionMode: NonIndexed
  completions: 3
  manualSelector: false
  parallelism: 3
  podReplacementPolicy: TerminatingOrFailed
  selector:
    matchLabels:
      batch.kubernetes.io/controller-uid: dbe24b55-ae86-4bb7-b3ec-066a908ef37d
  suspend: false
  template:
    metadata:
      creationTimestamp: null
      labels:
        batch.kubernetes.io/controller-uid: dbe24b55-ae86-4bb7-b3ec-066a908ef37d
        batch.kubernetes.io/job-name: cluster-backup-job-k7j4k
        controller-uid: dbe24b55-ae86-4bb7-b3ec-066a908ef37d
        job-name: cluster-backup-job-k7j4k
    spec:
      activeDeadlineSeconds: 900
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: Exists
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: batch.kubernetes.io/job-name
                    operator: In
                    values:
                      - cluster-backup-job-k7j4k
              topologyKey: kubernetes.io/hostname
      containers:
        - command:
            - /bin/sh
            - -c
            - |
              #!/bin/sh
              set -exuo pipefail

              cluster-etcd-operator cluster-backup --backup-dir "${CLUSTER_BACKUP_PATH}"
          env:
            - name: CLUSTER_BACKUP_PATH
              value: /etc/kubernetes/cluster-backup/backup-default-28923621-2024-12-28_202101
            - name: ETCDCTL_CERT
              value: /var/run/secrets/etcd-client/tls.crt
            - name: ETCDCTL_KEY
              value: /var/run/secrets/etcd-client/tls.key
            - name: ETCDCTL_CACERT
              value: /var/run/configmaps/etcd-ca/ca-bundle.crt
          image: registry.build05.ci.openshift.org/ci-ln-6pd8862/stable@sha256:c97a42f7872a95ce78fe185e46a6a04f3e6da42009462b88da9f2766259bf80a
          imagePullPolicy: IfNotPresent
          name: cluster-backup
          resources:
            requests:
              cpu: 10m
              memory: 80Mi
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: FallbackToLogsOnError
          volumeMounts:
            - mountPath: /usr/local/bin
              name: usr-local-bin
            - mountPath: /etc/kubernetes/static-pod-resources
              name: resources-dir
            - mountPath: /etc/kubernetes/static-pod-certs
              name: cert-dir
            - mountPath: /etc/kubernetes/manifests
              name: static-pod-dir
            - mountPath: /etc/kubernetes/cluster-backup
              name: etc-kubernetes-cluster-backup
            - mountPath: /var/run/secrets/etcd-client
              name: etcd-client
            - mountPath: /var/run/configmaps/etcd-ca
              name: etcd-ca
      dnsPolicy: ClusterFirst
      hostNetwork: true
      initContainers:
        - command:
            - cluster-etcd-operator
            - verify
            - backup-storage
          image: registry.build05.ci.openshift.org/ci-ln-6pd8862/stable@sha256:c97a42f7872a95ce78fe185e46a6a04f3e6da42009462b88da9f2766259bf80a
          imagePullPolicy: IfNotPresent
          name: verify-storage
          resources:
            requests:
              cpu: 5m
              memory: 50Mi
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: FallbackToLogsOnError
          volumeMounts:
            - mountPath: /etc/kubernetes/cluster-backup
              name: etc-kubernetes-cluster-backup
            - mountPath: /var/run/secrets/etcd-client
              name: etcd-client
            - mountPath: /var/run/configmaps/etcd-ca
              name: etcd-ca
      nodeSelector:
        node-role.kubernetes.io/master: ""
      priorityClassName: system-node-critical
      restartPolicy: Never
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      tolerations:
        - operator: Exists
      volumes:
        - hostPath:
            path: /usr/local/bin
            type: ""
          name: usr-local-bin
        - hostPath:
            path: /etc/kubernetes/manifests
            type: ""
          name: static-pod-dir
        - hostPath:
            path: /etc/kubernetes/static-pod-resources
            type: ""
          name: resources-dir
        - hostPath:
            path: /etc/kubernetes/static-pod-resources/etcd-certs
            type: ""
          name: cert-dir
        - name: etcd-client
          secret:
            defaultMode: 420
            secretName: etcd-client
        - configMap:
            defaultMode: 420
            name: etcd-ca-bundle
          name: etcd-ca
        - hostPath:
            path: /etc/kubernetes/cluster-backup
            type: DirectoryOrCreate
          name: etc-kubernetes-cluster-backup
status:
  completionTime: "2024-12-28T20:21:06Z"
  conditions:
    - lastProbeTime: "2024-12-28T20:21:06Z"
      lastTransitionTime: "2024-12-28T20:21:06Z"
      message: Reached expected number of succeeded pods
      reason: CompletionsReached
      status: "True"
      type: SuccessCriteriaMet
    - lastProbeTime: "2024-12-28T20:21:06Z"
      lastTransitionTime: "2024-12-28T20:21:06Z"
      message: Reached expected number of succeeded pods
      reason: CompletionsReached
      status: "True"
      type: Complete
  ready: 0
  startTime: "2024-12-28T20:21:01Z"
  succeeded: 3
  terminating: 0
  uncountedTerminatedPods: {}
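
With parallelism: 3, completions: 3, and the podAntiAffinity rule shown below, the three backup pods of this Job are expected to land on three different masters. A quick way to verify, using the Job name from this run:

oc get pods -n openshift-etcd -o wide \
  -l batch.kubernetes.io/job-name=cluster-backup-job-k7j4k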

The preceding Job contains an init container and a main container:

  • The init container verify-storage:
    initContainers:
        - command:
            - cluster-etcd-operator
            - verify
            - backup-storage
  • The main container cluster-backup, which takes the actual backup:
        - command:
            - /bin/sh
            - -c
            - |
              #!/bin/sh
              set -exuo pipefail

              cluster-etcd-operator cluster-backup --backup-dir "${CLUSTER_BACKUP_PATH}"

  • The Job takes advantage of Job parallelism (parallelism: 3, completions: 3), nodeAffinity, and podAntiAffinity, as shown below:
    spec:
      activeDeadlineSeconds: 900
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: Exists
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: batch.kubernetes.io/job-name
                    operator: In
                    values:
                      - cluster-backup-job-k7j4k
              topologyKey: kubernetes.io/hostname
  • The result is that a backup is scheduled and stored on every master node.

  • The limitation is that the pruning runs on only one master node at a time.

  • This could be addressed by applying job parallelism to the Job resource that contains the prune and request-backup containers.

  • However, this would cause a problem, as the request-backup container would succeed in only one pod and fail in the other two.

  • Again, this could be mitigated by applying a random suffix to the EtcdBackup CR being created. However, this would result in exponentially more backups being created. Here is the line where the suffix would be added.

  • Meanwhile, I have parallelized only the actual backup pods, and I am waiting for further reviews and suggestions.

  • Below is the result of testing this approach on an OCP 4.19 cluster.


  • running oc debug node/ip-10-0-121-76.us-west-1.compute.internal
sh-5.1# ls -l /etc/kubernetes/cluster-backup
total 0
drwxr-xr-x. 2 root root 96 Dec 28 20:21 backup-default-28923621-2024-12-28_202101
drwxr-xr-x. 2 root root 96 Dec 28 20:24 backup-default-28923624-2024-12-28_202401
drwxr-xr-x. 2 root root 96 Dec 28 20:27 backup-default-28923627-2024-12-28_202701
drwxr-xr-x. 2 root root 96 Dec 28 20:30 backup-default-28923630-2024-12-28_203001
drwxr-xr-x. 2 root root 96 Dec 28 20:33 backup-default-28923633-2024-12-28_203302
drwxr-xr-x. 2 root root 96 Dec 28 20:36 backup-default-28923636-2024-12-28_203601

  • running oc debug node/ip-10-0-119-89.us-west-1.compute.internal
sh-5.1# ls -l /etc/kubernetes/cluster-backup
total 0
drwxr-xr-x. 2 root root 96 Dec 28 20:21 backup-default-28923621-2024-12-28_202101
drwxr-xr-x. 2 root root 96 Dec 28 20:24 backup-default-28923624-2024-12-28_202401
drwxr-xr-x. 2 root root 96 Dec 28 20:27 backup-default-28923627-2024-12-28_202701
drwxr-xr-x. 2 root root 96 Dec 28 20:30 backup-default-28923630-2024-12-28_203001
drwxr-xr-x. 2 root root 96 Dec 28 20:33 backup-default-28923633-2024-12-28_203302
drwxr-xr-x. 2 root root 96 Dec 28 20:36 backup-default-28923636-2024-12-28_203601

  • running oc debug node/ip-10-0-59-75.us-west-1.compute.internal
sh-5.1# ls -l /etc/kubernetes/cluster-backup
total 0
drwxr-xr-x. 2 root root 96 Dec 28 20:30 backup-default-28923630-2024-12-28_203001
drwxr-xr-x. 2 root root 96 Dec 28 20:33 backup-default-28923633-2024-12-28_203302
drwxr-xr-x. 2 root root 96 Dec 28 20:36 backup-default-28923636-2024-12-28_203601

Note: this is the only master node where the prune step was scheduled.
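
Similarly, the node that ran the prune init container can be identified from the CronJob-spawned pods (they carry the app=cluster-backup-cronjob label); the NODE column of the wide output shows where they were scheduled:

oc get pods -n openshift-etcd -o wide -l app=cluster-backup-cronjob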


To summarize, this approach works, but as you can see, the DaemonSet approach is much more flexible.

Looking for your consensus on which approach to merge.

Happy New Year 🥳 🥳 🥳 🥳

cc @JoelSpeed @jmhbnz @deads2k @wking @rhuss @vrutkovs @tjungblu @sjenning @csrwng @cuppett

Elbehery (Contributor, Author) commented:

@JoelSpeed FYI ^^

openshift-ci bot commented Dec 28, 2024

@Elbehery: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-etcd-certrotation a7e1fb0 link false /test e2e-aws-etcd-certrotation
ci/prow/e2e-aws-ovn-etcd-scaling a7e1fb0 link true /test e2e-aws-ovn-etcd-scaling
ci/prow/e2e-metal-ovn-sno-cert-rotation-shutdown a7e1fb0 link false /test e2e-metal-ovn-sno-cert-rotation-shutdown
ci/prow/e2e-aws-etcd-recovery a7e1fb0 link false /test e2e-aws-etcd-recovery
ci/prow/e2e-aws-ovn-single-node a7e1fb0 link true /test e2e-aws-ovn-single-node
ci/prow/okd-scos-e2e-aws-ovn a7e1fb0 link false /test okd-scos-e2e-aws-ovn
ci/prow/e2e-metal-ovn-ha-cert-rotation-shutdown a7e1fb0 link false /test e2e-metal-ovn-ha-cert-rotation-shutdown

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

JoelSpeed (Contributor) commented:

The limitation is that the pruning runs on only one master node at a time.

I guess I didn't fully understand the flow here; I did not expect to see a CronJob create the EtcdBackup which then triggers the actual backups. I thought that would be a controller of its own and that we only had one Job involved here.

Anyway, is there any particular reason why the pruning has to be done prior to the EtcdBackup being created? Why does the pruning not happen as part of the second Job, the one triggered by the EtcdBackup being created? (Do you have well-known names for these two jobs?)

To summarize, this approach works, but as you can see, the DaemonSet approach is much more flexible

How would you make the argument that it is more flexible? Which parts are more flexible? Is this flexibility required, and is it worth having to implement and maintain a completely new backup system, versus just re-using/extending the existing system and having one path?

spec:
  pvcName: no-config

The intention in the future would be to extend this API and create a specific area to specify what kind of storage is being used, rather than assuming it based on a magic key name.

I believe we discussed making this something like

spec:
  backupStorage:
    type: HostPath | PVC
    pvc: # Only valid when type is PVC
      name: ...
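
For illustration only, a hypothetical instance of that proposed shape could look like the following (field and value names are not final):

spec:
  backupStorage:
    type: PVC
    pvc:
      name: etcd-backup-pvc   # hypothetical PVC name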

Elbehery (Contributor, Author) commented Jan 2, 2025

I guess I didn't fully understand the flow here; I did not expect to see a CronJob create the EtcdBackup which then triggers the actual backups. I thought that would be a controller of its own and that we only had one Job involved here.

So this is the current implementation, as I explained in #1381 (comment)

Anyway, is there any particular reason why the pruning has to be done prior to the EtcdBackup being created? Why does the pruning not happen as part of the second Job, the one triggered by the EtcdBackup being created? (Do you have well known names for these two jobs?)

I am not the one who can answer this; I believe @tjungblu and @hasbro17 are best placed to answer.

However, IIRC, the prune container had to run as an init container, since it should run in serial order with the cluster-backup container.

I can also assume that, since we have a one-time cluster backup (supported by the request-backup container and handled by the BackupController) and a recurring backup (handled by the PeriodicBackupController), this may be the reason we have two separate resources.

The templates used for creating the CronJob and the Job resources are below

How would you make the argument that it is more flexible? Which parts are more flexible? Is this flexibility required, and, is it worth having to implement and maintain a completely new backup system, vs just re-using/extending the existing system and having 1 path?

As the whole etcd-backup-server runs as a single entity (i.e. a Pod), it is easier to adapt and edit.

For instance, moving the prune container from the CronJob level to the backup Job level could be a big change. On the other hand, making the same change in the etcd-backup-server Pod is bound to a single entity and would not have a big impact on the customer experience with the product.

The intention in the future would be to extend this API and create a specific area to specify what kind of storage is being used, rather than assuming it based on a magic key name.

Indeed, if we were to move forward with this approach, then we would make the API changes 👍🏽
