This repository has been archived by the owner on Sep 2, 2024. It is now read-only.

[WSL2] Kim builder install fails: CrashLoopBackOff #90

Open
tobiasoort opened this issue Feb 22, 2022 · 2 comments

tobiasoort commented Feb 22, 2022

I'm running:

  • Windows 11 with WSL2
  • Rancher Desktop 1.0.1 with Kubernetes v1.23.3, on containerd
  • Ubuntu 20.04 in WSL2 as a client/ui
  • Rancher Desktop WSL integration
  • Installed Arkade, installed Kim via arkade get kim on Ubuntu
  • Tried running kim builder install on Ubuntu

Result:

INFO[0000] Applying node-role `builder` to `myhostname-redacted`
INFO[0000] Asserting namespace `kube-image`
INFO[0000] Asserting TLS secrets
INFO[0000] Asserting service/endpoints
INFO[0000] Installing builder daemon
INFO[0000] Waiting on builder daemon availability...
INFO[0006] Waiting on builder daemon availability...
INFO[0013] Waiting on builder daemon availability...
INFO[0018] Waiting on builder daemon availability...
INFO[0024] Waiting on builder daemon availability...
INFO[0030] Waiting on builder daemon availability...
INFO[0036] Waiting on builder daemon availability...
INFO[0041] Waiting on builder daemon availability...
INFO[0047] Waiting on builder daemon availability...
INFO[0052] Waiting on builder daemon availability...
INFO[0059] Waiting on builder daemon availability...
INFO[0065] Waiting on builder daemon availability...
INFO[0070] Waiting on builder daemon availability...
INFO[0075] Waiting on builder daemon availability...
INFO[0081] Waiting on builder daemon availability...
Error: timeout waiting for builder to become available

On the kubectl side:

$ kubectl get pods -A
NAMESPACE     NAME                                      READY   STATUS             RESTARTS      AGE
kube-system   helm-install-traefik-crd-45xtb            0/1     Completed          0             28m
kube-system   helm-install-traefik-j5hws                0/1     Completed          1             28m
kube-system   svclb-traefik-vr9hp                       2/2     Running            2 (12m ago)   28m
kube-system   local-path-provisioner-6c79684f77-pzzpw   1/1     Running            1 (12m ago)   28m
kube-system   coredns-5789895cd-ngcvk                   1/1     Running            1 (12m ago)   28m
kube-system   metrics-server-7cd5fcb6b7-jhbwm           1/1     Running            1 (12m ago)   28m
kube-system   traefik-6bb96f9bd8-zrqtm                  1/1     Running            1 (12m ago)   28m
kube-image    builder-4rcj8                             1/2     CrashLoopBackOff   5 (81s ago)   4m24s
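
On a larger cluster the failing pod is harder to spot by eye. A minimal sketch of one way to filter a listing like the one above, assuming the default column layout of kubectl get pods -A; the function name unready_pods is hypothetical and the sample rows are copied from the output above:

```shell
#!/bin/sh
# Print namespace/name and status for pods whose READY count is below the
# total, skipping pods that finished normally (STATUS "Completed").
# unready_pods is a hypothetical helper name, not part of kubectl or kim.
unready_pods() {
  awk 'NR > 1 {
    split($3, r, "/")
    if (r[1] != r[2] && $4 != "Completed") print $1 "/" $2, $4
  }'
}

# Sample rows from the listing above stand in for live output.
# Against a real cluster this would be: kubectl get pods -A | unready_pods
result=$(unready_pods <<'EOF'
NAMESPACE     NAME                                      READY   STATUS             RESTARTS      AGE
kube-system   helm-install-traefik-j5hws                0/1     Completed          1             28m
kube-system   coredns-5789895cd-ngcvk                   1/1     Running            1 (12m ago)   28m
kube-image    builder-4rcj8                             1/2     CrashLoopBackOff   5 (81s ago)   4m24s
EOF
)
echo "$result"
```

Only the first four columns are read, so the spaces inside the RESTARTS column do not affect the result.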

So let's describe the offending pod:

$ kubectl -n kube-image describe pods builder-4rcj8
Name:         builder-4rcj8
Namespace:    kube-image
Priority:     0
Node:         myhostname-redacted/192.168.98.213
Start Time:   Tue, 22 Feb 2022 21:53:25 +0100
Labels:       app=kim
              app.kubernetes.io/component=builder
              app.kubernetes.io/managed-by=kim
              app.kubernetes.io/name=kim
              component=builder
              controller-revision-hash=6df6b4765c
              pod-template-generation=1
Annotations:  <none>
Status:       Running
IP:           192.168.98.213
IPs:
  IP:           192.168.98.213
Controlled By:  DaemonSet/builder
Init Containers:
  rshared-tmp:
    Container ID:  containerd://0b6b9560c261531abfaa779b3a5701f683d2b0f0b99af0c0b3d04dbd428656f6
    Image:         docker.io/moby/buildkit:v0.8.3
    Image ID:      docker.io/moby/buildkit@sha256:171689e43026533b48701ab6566b72659dd1839488d715c73ef3fe387fab9a80
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
    Args:
      (if mountpoint $_DIR; then set -x; nsenter -m -p -t 1 -- env PATH=$_PATH sh -c 'mount --make-rshared $_DIR'; fi) || true
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 22 Feb 2022 21:53:31 +0100
      Finished:     Tue, 22 Feb 2022 21:53:31 +0100
    Ready:          True
    Restart Count:  0
    Environment:
      _DIR:   /tmp
      _PATH:  /usr/sbin:/usr/bin:/sbin:/bin:/bin/aux
    Mounts:
      /tmp from host-tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nw66x (ro)
  rshared-buildkit:
    Container ID:  containerd://feade611fdc670eab306f9dbe44a8a34c2f2fd1f0cdbaa94a4310c6c3af748e1
    Image:         docker.io/moby/buildkit:v0.8.3
    Image ID:      docker.io/moby/buildkit@sha256:171689e43026533b48701ab6566b72659dd1839488d715c73ef3fe387fab9a80
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
    Args:
      (if mountpoint $_DIR; then set -x; nsenter -m -p -t 1 -- env PATH=$_PATH sh -c 'mount --make-rshared $_DIR'; fi) || true
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 22 Feb 2022 21:53:32 +0100
      Finished:     Tue, 22 Feb 2022 21:53:32 +0100
    Ready:          True
    Restart Count:  0
    Environment:
      _DIR:   /var/lib/buildkit
      _PATH:  /usr/sbin:/usr/bin:/sbin:/bin:/bin/aux
    Mounts:
      /var/lib/buildkit from host-var-lib-buildkit (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nw66x (ro)
  rshared-containerd:
    Container ID:  containerd://e7a8850075281fe5167447e62a33d3309b2502f5a9233b4c0f5f4d61de06465f
    Image:         docker.io/moby/buildkit:v0.8.3
    Image ID:      docker.io/moby/buildkit@sha256:171689e43026533b48701ab6566b72659dd1839488d715c73ef3fe387fab9a80
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
    Args:
      (if mountpoint $_DIR; then set -x; nsenter -m -p -t 1 -- env PATH=$_PATH sh -c 'mount --make-rshared $_DIR'; fi) || true
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 22 Feb 2022 21:53:33 +0100
      Finished:     Tue, 22 Feb 2022 21:53:33 +0100
    Ready:          True
    Restart Count:  0
    Environment:
      _DIR:   /var/lib/rancher
      _PATH:  /usr/sbin:/usr/bin:/sbin:/bin:/bin/aux
    Mounts:
      /var/lib/rancher from host-containerd (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nw66x (ro)
Containers:
  buildkit:
    Container ID:  containerd://073df55342e8e3d59254525aeba6b6fca60cb0777a3f3bcd0152eace779c2c13
    Image:         docker.io/moby/buildkit:v0.8.3
    Image ID:      docker.io/moby/buildkit@sha256:171689e43026533b48701ab6566b72659dd1839488d715c73ef3fe387fab9a80
    Port:          1234/TCP
    Host Port:     1234/TCP
    Args:
      --addr=unix:///run/buildkit/buildkitd.sock
      --addr=tcp://0.0.0.0:1234
      --containerd-worker=true
      --containerd-worker-addr=/run/k3s/containerd/containerd.sock
      --containerd-worker-gc
      --oci-worker=false
      --tlscacert=/certs/ca/tls.crt
      --tlscert=/certs/server/tls.crt
      --tlskey=/certs/server/tls.key
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 22 Feb 2022 21:56:28 +0100
      Finished:     Tue, 22 Feb 2022 21:56:28 +0100
    Ready:          False
    Restart Count:  5
    Liveness:       exec [buildctl debug workers] delay=5s timeout=1s period=20s #success=1 #failure=3
    Readiness:      exec [buildctl debug workers] delay=5s timeout=1s period=20s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /certs/ca from certs-ca (ro)
      /certs/server from certs-server (ro)
      /run from host-run (rw)
      /sys/fs/cgroup from host-ctl (rw)
      /tmp from host-tmp (rw)
      /var/lib/buildkit from host-var-lib-buildkit (rw)
      /var/lib/rancher from host-containerd (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nw66x (ro)
  agent:
    Container ID:  containerd://bf79e9def3ff4eb74481a13c8a0f4b0a663a80b0e22facd69f5b8dd77bb7b172
    Image:         rancher/kim:v0.1.0-beta.4
    Image ID:      docker.io/rancher/kim@sha256:091daceebc3f3b9f9e126d39f6e8b6ef96d3085813f4afbd35efc1a8a94e7bf4
    Port:          1233/TCP
    Host Port:     1233/TCP
    Command:
      kim
      --debug
      agent
    Args:
      --agent-port=1233
      --buildkit-socket=unix:///run/buildkit/buildkitd.sock
      --buildkit-port=1234
      --containerd-socket=/run/k3s/containerd/containerd.sock
      --tlscacert=/certs/ca/tls.crt
      --tlscert=/certs/server/tls.crt
      --tlskey=/certs/server/tls.key
    State:          Running
      Started:      Tue, 22 Feb 2022 21:53:38 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /certs/ca from certs-ca (ro)
      /certs/server from certs-server (ro)
      /etc/pki from host-etc-pki (ro)
      /etc/ssl from host-etc-ssl (ro)
      /run from host-run (rw)
      /sys/fs/cgroup from host-ctl (rw)
      /var/lib/buildkit from host-var-lib-buildkit (rw)
      /var/lib/rancher from host-containerd (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nw66x (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  host-ctl:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/fs/cgroup
    HostPathType:  Directory
  host-etc-pki:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/pki
    HostPathType:  DirectoryOrCreate
  host-etc-ssl:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/ssl
    HostPathType:  DirectoryOrCreate
  host-run:
    Type:          HostPath (bare host directory volume)
    Path:          /run
    HostPathType:  Directory
  host-tmp:
    Type:          HostPath (bare host directory volume)
    Path:          /tmp
    HostPathType:  Directory
  host-var-lib-buildkit:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/buildkit
    HostPathType:  DirectoryOrCreate
  host-containerd:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/rancher
    HostPathType:  DirectoryOrCreate
  certs-ca:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kim-tls-ca
    Optional:    false
  certs-server:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kim-tls-server
    Optional:    false
  kube-api-access-nw66x:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              node-role.kubernetes.io/builder=true
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  5m23s                  default-scheduler  Successfully assigned kube-image/builder-4rcj8 to myhostname-redacted
  Normal   Pulling    5m23s                  kubelet            Pulling image "docker.io/moby/buildkit:v0.8.3"
  Normal   Pulled     5m17s                  kubelet            Successfully pulled image "docker.io/moby/buildkit:v0.8.3" in 5.713335137s
  Normal   Started    5m17s                  kubelet            Started container rshared-tmp
  Normal   Created    5m17s                  kubelet            Created container rshared-tmp
  Normal   Created    5m16s                  kubelet            Created container rshared-buildkit
  Normal   Pulled     5m16s                  kubelet            Container image "docker.io/moby/buildkit:v0.8.3" already present on machine
  Normal   Started    5m16s                  kubelet            Started container rshared-buildkit
  Normal   Started    5m15s                  kubelet            Started container rshared-containerd
  Normal   Pulled     5m15s                  kubelet            Container image "docker.io/moby/buildkit:v0.8.3" already present on machine
  Normal   Created    5m15s                  kubelet            Created container rshared-containerd
  Normal   Pulling    5m14s                  kubelet            Pulling image "rancher/kim:v0.1.0-beta.4"
  Normal   Pulled     5m10s                  kubelet            Successfully pulled image "rancher/kim:v0.1.0-beta.4" in 4.262237397s
  Normal   Created    5m10s                  kubelet            Created container agent
  Normal   Started    5m10s                  kubelet            Started container agent
  Normal   Created    5m9s (x2 over 5m14s)   kubelet            Created container buildkit
  Normal   Started    5m9s (x2 over 5m14s)   kubelet            Started container buildkit
  Normal   Pulled     4m50s (x3 over 5m14s)  kubelet            Container image "docker.io/moby/buildkit:v0.8.3" already present on machine
  Warning  BackOff    22s (x32 over 5m8s)    kubelet            Back-off restarting failed container

I have no clue why this pod is crashlooping, and I have not been able to get kim to work on this machine. I'd love to use it so that I don't have to shuffle images around by hand.

@tobiasoort

Looking at the logs for the buildkit container in the pod (which is the only one that actually failed):

$ kubectl logs -n kube-image -p builder-4rcj8 -c buildkit
buildkitd: could not lock /var/lib/buildkit/buildkitd.lock, another instance running?

That's weird, right? Why wouldn't it be able to get a lock?
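
The error message suggests that buildkitd takes an exclusive lock on a file in its state directory and exits when something else already holds it. A minimal sketch of that behaviour, assuming an flock(2)-style advisory lock (the log does not confirm the exact mechanism), using the flock utility from util-linux and a scratch file standing in for /var/lib/buildkit/buildkitd.lock:

```shell
#!/bin/sh
# Scratch file standing in for /var/lib/buildkit/buildkitd.lock (assumption:
# the real lock behaves like an flock(2) advisory lock).
lockfile=$(mktemp)

# First "instance": hold an exclusive lock for a few seconds in the background.
flock -x "$lockfile" sleep 3 &
holder=$!
sleep 1  # give the holder time to acquire the lock

# Second "instance": non-blocking attempt, which fails while the lock is held.
if flock -n -x "$lockfile" true; then
  second_result="acquired"
else
  second_result="could not lock"
fi
echo "second instance: $second_result"

# Once the holder exits, the lock is released and a new attempt succeeds.
wait "$holder"
if flock -n -x "$lockfile" true; then
  third_result="acquired"
else
  third_result="could not lock"
fi
echo "after holder exit: $third_result"

rm -f "$lockfile"
```

If the real lock works this way, the message would mean that some other process with /var/lib/buildkit open is still holding the lock, not that the file is damaged.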


dweomer commented Jun 23, 2022

While kim was originally developed to support rancher-desktop as well as generic uses, I believe that RD has stopped shipping kim and now starts up its own buildkit instance (it looks like they still ship kim but remove the kim builder). Even though kim runs buildkit in a container, it bind-mounts a number of things from the host, including /var/lib/buildkit. This might explain the conflict: two buildkit instances sharing that directory would contend for the same buildkitd.lock.

It's worth asking about this over at https://github.com/rancher-sandbox/rancher-desktop (I no longer work at SUSE/Rancher).
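
One way to test the conflict theory is to look for a second buildkitd process. A rough sketch that reads /proc directly, so it does not depend on a particular ps implementation; it assumes it is run in the same PID namespace as the node (for example inside the distribution that hosts the Kubernetes node), and the function name find_buildkitd is hypothetical:

```shell
#!/bin/sh
# Print the PID and command line of every process named buildkitd that is
# visible in the current PID namespace, then a summary count.
# find_buildkitd is a hypothetical helper name used only for this sketch.
find_buildkitd() {
  found=0
  for comm in /proc/[0-9]*/comm; do
    # Skip processes that are not buildkitd or that exited while scanning.
    [ "$(cat "$comm" 2>/dev/null)" = "buildkitd" ] || continue
    pid=${comm#/proc/}
    pid=${pid%/comm}
    # cmdline is NUL-separated; turn the separators into spaces for display.
    printf '%s %s\n' "$pid" "$(cat "/proc/$pid/cmdline" 2>/dev/null | tr '\0' ' ')"
    found=$((found + 1))
  done
  echo "buildkitd processes found: $found"
}

find_buildkitd
```

A buildkitd listed here while the kim builder pod is crashlooping would be consistent with another instance holding the lock on the shared /var/lib/buildkit directory.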
