Issues with k3s install script when using --cluster-init option (etcd) #11635

bootsie123 · 2025-01-22T02:35:13Z

Environmental Info:
K3s Version:

k3s version v1.31.4+k3s1 (a562d090)
go version go1.22.9

Node(s) CPU architecture, OS, and Version:

Linux test 5.15.0-112-generic #122-Ubuntu SMP Thu May 23 07:48:21 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:
My eventual configuration is 3 servers and 2 agents; however, the results here are from a single-server test.

Describe the bug:

A few "namespace not found" and "user forbidden" errors appear while the k3s installation script runs. The k3s service then eventually crashes and ends up in a crash loop.

The full logs are attached, but I think these three lines are the likely culprits:

Jan 21 20:55:35 test k3s[1330]: E0121 20:55:35.716849    1330 controller.go:148] "Unhandled Error" err="while syncing ConfigMap \"kube-system/kube-apiserver-legacy-service-account-token-tracking\", err: namespaces \"kube-system\" not found"
Jan 21 20:55:35 test k3s[1330]: E0121 20:55:35.716912    1330 controller.go:145] "Failed to ensure lease exists, will retry" err="namespaces \"kube-system\" not found" interval="200ms"
Jan 21 20:55:35 test k3s[1330]: E0121 20:55:35.803764    1330 server.go:666] "Failed to retrieve node info" err="nodes \"test\" is forbidden: User \"system:kube-proxy\" cannot get resource \"nodes\" in API group \"\" at the cluster scope"
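For anyone reading along with the attached log, here is a minimal sketch of how error-level lines like the three above can be filtered out of the full journal. It assumes the klog header shape visible in the excerpt (`E`, then MMDD, then a timestamp). The function name `klog_errors` is made up for illustration and is not part of k3s.

```shell
#!/bin/sh
# klog_errors: print only error-level klog lines from the given files,
# or from stdin when no file is given.
# Assumes the header shape seen in the excerpt: "E0121 20:55:35.716849 ...".
# (Hypothetical helper, not a k3s tool.)
klog_errors() {
    grep -E '(^|[[:space:]])E[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]+' "$@"
}
```

It can be pointed at the attached file, as in `klog_errors "k3s logs.txt"`, or fed from the journal with `journalctl -u k3s --no-pager | klog_errors`.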

Steps To Reproduce:

Install k3s using the installation script

curl -sfL https://get.k3s.io | K3S_TOKEN=test K3S_KUBECONFIG_MODE="644" sh -s - server --cluster-init --tls-san=192.168.3.131

Wait and monitor k3s service

Expected behavior:

Simply put, I'd expect the k3s installation script to run to completion without errors and result in a healthy cluster being initialized.

Actual behavior:

The k3s installation script stalls and the k3s service enters a crash loop. Interestingly, when --cluster-init is omitted (running without etcd), the installation script succeeds and a healthy cluster is created.

Let me know if there are any additional debugging steps I could take! I'm kind of at a loss for what else to investigate here. Thanks!

Additional context / logs:
k3s Journal Logs - See attached file

k3s logs.txt

brandond · 2025-01-22T04:31:32Z

I suspect your disk performance is not sufficient to support the combination of etcd and workload IO even at a baseline level. Kine/SQLite is less demanding.
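One rough way to check this suspicion is to time small synchronous writes on the filesystem that holds the k3s data directory, since etcd syncs its write-ahead log to disk on every commit. The sketch below is only an approximation under stated assumptions: it needs GNU `dd` (for `oflag=dsync`) and GNU `date` (for `%N`), which Ubuntu ships, and the function name `sync_write_probe` is hypothetical. The 2300-byte block size is chosen to resemble a small WAL entry. It is not a substitute for a proper `fio` benchmark.

```shell
#!/bin/sh
# sync_write_probe DIR [COUNT]: write COUNT small blocks into DIR with
# O_DSYNC (each write waits for the disk) and print the elapsed time.
# Assumes GNU dd and GNU date. Hypothetical helper, not part of k3s or etcd.
sync_write_probe() {
    dir=$1
    count=${2:-500}
    tmp="$dir/.sync_write_probe.$$"
    start=$(date +%s%N)
    if ! dd if=/dev/zero of="$tmp" bs=2300 count="$count" oflag=dsync 2>/dev/null; then
        rm -f "$tmp"
        return 1
    fi
    end=$(date +%s%N)
    rm -f "$tmp"
    # Nanoseconds to milliseconds.
    echo "$count writes in $(( (end - start) / 1000000 )) ms"
}
```

Running something like `sync_write_probe /var/lib/rancher/k3s 500` (as root) and dividing the total by the write count gives an average per-write latency. Averages well above roughly 10 ms would be consistent with the disk being too slow for etcd, though the threshold here is a rule of thumb rather than a hard limit.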

k3s-io locked and limited conversation to collaborators Jan 22, 2025

brandond converted this issue into discussion #11636 Jan 22, 2025
