Backup to S3 hangs indefinitely when the bucket does not exist #448

Closed
jinyingsunny opened this issue Feb 22, 2024 · 2 comments
Labels:
affects/master (PR/issue: this bug affects master version.)
process/fixed (Process of bug)
severity/major (Severity of bug)
type/bug (Type: something is unexpected)

@jinyingsunny

As the title says.

# kubectl -n nebula get nb nb20240222
NAME         TYPE   BACKUP                       STATUS   STARTED   COMPLETED   AGE
nb20240222   full   BACKUP_2024_02_22_02_45_37            29m                   29m

# kubectl -n nebula get nb nb20240222 -o yaml
apiVersion: apps.nebula-graph.io/v1alpha1
kind: NebulaBackup
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps.nebula-graph.io/v1alpha1","kind":"NebulaBackup","metadata":{"annotations":{},"name":"nb20240222","namespace":"nebula"},"spec":{"autoRemoveFinished":false,"cleanBackupData":false,"config":{"clusterName":"nebulav","s3":{"bucket":"nebula-test","endpoint":"https://s3.us-east-1.amazonaws.com","region":"us-east-1","secretName":"aws-secret"}},"image":"reg.vesoft-inc.com/cloud-dev/br-ent","imagePullSecrets":[{"name":"image-pull-secret"}],"resources":{"limits":{"cpu":"200m","memory":"300Mi"},"requests":{"cpu":"100m","memory":"200Mi"}},"version":"v3.7.0"}}
  creationTimestamp: "2024-02-22T02:45:26Z"
  generation: 1
  name: nb20240222
  namespace: nebula
  resourceVersion: "58077445"
  uid: 2f3fe4e4-e371-4f4e-a658-33ad97e969a4
spec:
  autoRemoveFinished: false
  cleanBackupData: false
  config:
    clusterName: nebulav
    s3:
      bucket: nebula-test
      endpoint: https://s3.us-east-1.amazonaws.com
      region: us-east-1
      secretName: aws-secret
  image: reg.vesoft-inc.com/cloud-dev/br-ent
  imagePullPolicy: Always
  imagePullSecrets:
  - name: image-pull-secret
  resources:
    limits:
      cpu: 200m
      memory: 300Mi
    requests:
      cpu: 100m
      memory: 200Mi
  version: v3.7.0
status:
  backupName: BACKUP_2024_02_22_02_45_37
  conditions:
  - lastTransitionTime: "2024-02-22T02:45:26Z"
    status: "True"
    type: Running
  timeStarted: "2024-02-22T02:45:26Z"
  type: full

My full backup config is as follows:

apiVersion: apps.nebula-graph.io/v1alpha1
kind: NebulaBackup
metadata:
  name: nb20240222
  namespace: nebula
spec:
  image: reg.vesoft-inc.com/cloud-dev/br-ent
  version: v3.7.0
  resources:
    limits:
      cpu: "200m"
      memory: 300Mi
    requests:
      cpu: 100m
      memory: 200Mi
  imagePullSecrets:
  - name: image-pull-secret
  autoRemoveFinished: false
  cleanBackupData: false
  config:
    clusterName: nebulav
    s3:
      region: "us-east-1"
      bucket: "nebula-test"
      endpoint: "https://s3.us-east-1.amazonaws.com"
      secretName: "aws-secret"

I eventually found that the bucket nebula-test does not exist in S3:

$ aws s3 ls
2023-09-14 17:11:40 nebula-br-test2
2024-02-18 07:44:01 nebula-e2e
2023-11-07 17:57:14 nebula-jerry-test
2023-11-17 07:25:44 nebula-kevin-liu-test
2023-09-19 10:53:41 nebula-terraform-dev-ap-southeast-1
2023-09-11 14:52:44 nebula-terraform-dev-us-east-2
2024-01-08 17:58:55 vesoft-harris-test
2023-12-14 10:49:00 vesoft-sc-v2
2023-12-05 14:14:07 vesoft-veezhang-test
2023-12-14 01:48:32 vesoft-zj
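
The same check can be scripted before a backup is submitted. This is a minimal sketch, assuming the listing is plain `aws s3 ls` output with one bucket per line and the bucket name in the third column; `bucket_in_listing` is a hypothetical helper, not part of the operator or the AWS CLI, and the sample listing is a subset of the output above.

```python
def bucket_in_listing(listing: str, bucket: str) -> bool:
    """Return True if `bucket` appears as a bucket name in `aws s3 ls` output."""
    for line in listing.splitlines():
        parts = line.split()
        # Each line is expected to look like: "<date> <time> <bucket-name>"
        if len(parts) == 3 and parts[2] == bucket:
            return True
    return False


# Sample input: a few lines copied from the listing above.
listing = """\
2023-09-14 17:11:40 nebula-br-test2
2024-02-18 07:44:01 nebula-e2e
2023-11-07 17:57:14 nebula-jerry-test
"""

print(bucket_in_listing(listing, "nebula-test"))  # False
print(bucket_in_listing(listing, "nebula-e2e"))   # True
```

Matching on the whole third column (rather than a substring) avoids treating nebula-test as present just because nebula-jerry-test is.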

Your Environments (required)

operator image: reg.vesoft-inc.com/cloud-dev/nebula-operator:snap-1.35
br-ent: reg.vesoft-inc.com/cloud-dev/br-ent:v3.7.0

Expected behavior
The backup should report an error instead of staying in the Running state.

@jinyingsunny jinyingsunny added severity/major Severity of bug type/bug Type: something is unexpected affects/master PR/issue: this bug affects master version. labels Feb 22, 2024
@MegaByte875
Contributor

{"level":"error","msg":"Cleanup full backup BACKUP_2024_02_22_11_39_35 successfully after backup failed.","time":"2024-02-22T19:39:39.228Z"}
Error: upload file by agent failed: rpc error: code = Unknown desc = upload from /usr/local/nebula/data/meta/nebula/0/0/checkpoints/BACKUP_2024_02_22_11_39_35/__disk_parts__.sst to BACKUP_2024_02_22_11_39_35/meta/__disk_parts__.sst failed: upload from /usr/local/nebula/data/meta/nebula/0/0/checkpoints/BACKUP_2024_02_22_11_39_35/__disk_parts__.sst to BACKUP_2024_02_22_11_39_35/meta/__disk_parts__.sst failed: NoSuchBucket: The specified bucket does not exist status code: 404, request id: SXYCXZ5C6N7KGTCQ, host id: yJXUCzZyQwVTCuPdob97Po1j+NiNOprAHDNF5O3j/NLB/I2GqM9hxlekhsauInYyoNgFd89priBSYFnWd44SBg==

The backup got stuck because of a DNS resolution problem.

@jinyingsunny
Author

Resolved. I adjusted the DNS resolv.conf on every node in the cluster except the master node:
disable systemd-resolved.service
and set the same nameserver on each node.
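
To confirm on a given node whether DNS is the culprit, one can check if the S3 endpoint's hostname resolves there. This is a minimal sketch using only the Python standard library; `endpoint_resolves` is a hypothetical helper name, and the endpoint is the one from the backup spec in this issue.

```python
import socket
from urllib.parse import urlparse


def endpoint_resolves(endpoint: str) -> bool:
    """Return True if the host part of `endpoint` resolves with this node's DNS setup."""
    # Accept either a full URL or a bare hostname.
    host = urlparse(endpoint).hostname or endpoint
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        # Raised when the resolver cannot map the name to an address.
        return False


# Run on each node; a False result points at that node's resolver configuration.
print(endpoint_resolves("https://s3.us-east-1.amazonaws.com"))
```

This only tests name resolution, not credentials or bucket existence, so a True result does not by itself mean the backup will succeed.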

@github-actions github-actions bot added the process/fixed Process of bug label Feb 23, 2024