When creating a basic Talos cluster with Terraform, I think there's a logical problem that makes health checks fail in the plan phase when adding new nodes. The talos_cluster_health data source takes a list of all control plane and worker nodes. On the first run this is fine: the nodes are configured and the health check passes. On a later run, though, say you increase your worker node count. Now the health check times out in the plan phase, because it expects the additional worker node to be available, but that node hasn't yet been given its machine configuration, which only happens in the apply step.
So, the health check runs during the plan phase, expects a new healthy worker node that has yet to be configured, and so fails the check.
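For illustration, a setup that runs into this might look roughly like the sketch below. It is not the actual config from this report: the variable name, the local names, and the IP addresses are all assumptions made up for the example.

```hcl
# Hypothetical sketch: worker_count, the locals, and every IP are assumed values.
terraform {
  required_providers {
    talos = {
      source = "siderolabs/talos"
    }
  }
}

variable "worker_count" {
  type    = number
  default = 2
}

locals {
  control_plane_ips = ["10.0.0.10"]

  # All worker IPs are computed up front, so raising worker_count adds an IP
  # for a node that has not received its machine configuration yet.
  worker_ips = [for i in range(var.worker_count) : "10.0.0.${20 + i}"]
}

resource "talos_machine_secrets" "this" {}

data "talos_cluster_health" "this" {
  client_configuration = talos_machine_secrets.this.client_configuration
  endpoints            = local.control_plane_ips
  control_plane_nodes  = local.control_plane_ips

  # Includes the new, still unconfigured worker as soon as worker_count grows.
  worker_nodes = local.worker_ips
}
```

In a layout like this, every input to the data source is already known at plan time on the second run, so Terraform reads it during the plan rather than deferring it to the apply, and the read waits on the new worker's IP.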
Facing the exact same issue here. We only need the health check on the first run, to confirm the cluster is up before deploying the CNI. There should be something like a skip_node_check option, so that the check doesn't worry about the nodes at all and just makes sure the cluster endpoint is healthy.
That does not work. It keeps trying to connect to the IP of the new worker node (we pass the IPs of all worker nodes, calculated in a local var, so the new worker's IP is in there) and then times out.
Here's the cluster health config:
Add a new node, and the plan times out with:
data.talos_cluster_health.this: Still reading... [20s elapsed]
Not sure how to fix this. I wish the health check knew not to check nodes that have yet to be configured during the planning phase.