This issue was moved to a discussion.

Issues with k3s install script when using --cluster-init option (etcd) #11635

Closed
bootsie123 opened this issue Jan 22, 2025 · 1 comment

Comments

@bootsie123

bootsie123 commented Jan 22, 2025

Environmental Info:
K3s Version:

k3s version v1.31.4+k3s1 (a562d090)
go version go1.22.9

Node(s) CPU architecture, OS, and Version:

Linux test 5.15.0-112-generic #122-Ubuntu SMP Thu May 23 07:48:21 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:
My target configuration is 3 servers and 2 agents; however, the results here are from a single-server test.

Describe the bug:

Several "namespace not found" and "user forbidden" errors appear while the k3s installation script runs. The k3s service eventually crashes and ends up in a crash loop.

The full logs are attached, but I think these three lines are the likely culprit:

Jan 21 20:55:35 test k3s[1330]: E0121 20:55:35.716849    1330 controller.go:148] "Unhandled Error" err="while syncing ConfigMap \"kube-system/kube-apiserver-legacy-service-account-token-tracking\", err: namespaces \"kube-system\" not found"
Jan 21 20:55:35 test k3s[1330]: E0121 20:55:35.716912    1330 controller.go:145] "Failed to ensure lease exists, will retry" err="namespaces \"kube-system\" not found" interval="200ms"
Jan 21 20:55:35 test k3s[1330]: E0121 20:55:35.803764    1330 server.go:666] "Failed to retrieve node info" err="nodes \"test\" is forbidden: User \"system:kube-proxy\" cannot get resource \"nodes\" in API group \"\" at the cluster scope"
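
For anyone triaging a similar crash loop, here is a minimal Python sketch that pulls only the error-level entries out of the journal output and counts repeats. It assumes the lines follow the klog layout seen above (level letter plus date, time, PID, source file and line, then the message). The names `KLOG_RE` and `error_entries` are made up for this example and are not part of k3s.

```python
import re
import sys
from collections import Counter

# Assumed klog layout: E0121 20:55:35.716849    1330 controller.go:148] message
KLOG_RE = re.compile(
    r"(?P<level>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<pid>\d+) (?P<source>[^\s\]]+:\d+)\] (?P<message>.*)"
)


def error_entries(lines):
    """Return (source, message) pairs for error (E) and fatal (F) entries."""
    entries = []
    for line in lines:
        match = KLOG_RE.search(line)
        if match and match.group("level") in ("E", "F"):
            entries.append((match.group("source"), match.group("message")))
    return entries


if __name__ == "__main__":
    # Count how often each source location reports an error.
    counts = Counter(source for source, _ in error_entries(sys.stdin))
    for source, count in counts.most_common():
        print(f"{count:6d}  {source}")
```

One way to use it, assuming the script is saved under a hypothetical name such as `klog_errors.py`, is to pipe `journalctl -u k3s --no-pager` into it. Errors that repeat from the first seconds of startup tend to be more telling than the last line before the crash.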

Steps To Reproduce:

  1. Install k3s using the installation script:
     curl -sfL https://get.k3s.io | K3S_TOKEN=test K3S_KUBECONFIG_MODE="644" sh -s - server --cluster-init --tls-san=192.168.3.131
  2. Wait and monitor the k3s service

Expected behavior:

Simply put, I'd expect the k3s installation script to run to completion without errors and bring up a healthy cluster.

Actual behavior:

The k3s installation script stalls and the k3s service enters a crash loop. Interestingly, when --cluster-init is omitted (running without etcd), the installation script succeeds and a healthy cluster is created.

Let me know if there are any additional debugging steps I could try! I'm at a loss for what else to investigate here. Thanks!

Additional context / logs:
k3s Journal Logs - See attached file

k3s logs.txt

@brandond
Member

brandond commented Jan 22, 2025

I suspect your disk performance is not sufficient to support the combination of etcd and workload IO even at a baseline level. Kine/sqlite is less demanding.
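
A rough way to sanity-check the disk is to time small synchronous writes, which is the pattern etcd's write-ahead log depends on. The Python sketch below is only an approximation and not a replacement for a proper benchmark such as fio. The function names, the 2300-byte block size, and the sample count are assumptions chosen for illustration. The etcd hardware guidance is commonly summarized as keeping the 99th percentile of fdatasync latency under roughly 10 ms.

```python
import math
import os
import tempfile
import time


def measure_fdatasync_latencies(directory, writes=200, block_size=2300):
    """Append small blocks to a temp file, calling fdatasync after each write.

    Returns the fdatasync latencies in seconds.
    """
    payload = b"\0" * block_size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fdatasync(fd)  # force the data to stable storage
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Assumption: run this from a directory on the same disk as the k3s data directory.
    samples = measure_fdatasync_latencies(".")
    print(f"p99 fdatasync latency: {percentile(samples, 99) * 1000:.2f} ms")
```

Run it on the filesystem that backs the k3s data directory, since a different disk or a tmpfs mount would give misleading numbers. A p99 well above the 10 ms range would be consistent with etcd struggling while the sqlite-backed default still works.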

@k3s-io k3s-io locked and limited conversation to collaborators Jan 22, 2025
@brandond brandond converted this issue into discussion #11636 Jan 22, 2025
@github-project-automation github-project-automation bot moved this from New to Done Issue in K3s Development Jan 22, 2025
