Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use context.WithTimeoutCause & context.WithCancelCause instead of context.WithTimeout & context.WithCancel #11280

Open
sbueringer opened this issue Oct 9, 2024 · 11 comments · May be fixed by #11705
Assignees
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. priority/backlog Higher priority than priority/awaiting-more-evidence. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@sbueringer
Copy link
Member

sbueringer commented Oct 9, 2024

I think we should start using the With*Cause variants of these functions as they allow adding additional context. This should lead to better messages than the standard "context canceled" when a context is canceled.

I took a look at our code base and we have roughly 20 context.WithCancel & 20 context.WithTimeout so it shouldn't be a lot of work to audit all of them and see if we want to use the With*Cause funcs instead

@k8s-ci-robot k8s-ci-robot added needs-priority Indicates an issue lacks a `priority/foo` label and requires one. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Oct 9, 2024
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Oct 9, 2024
@sbueringer sbueringer added priority/backlog Higher priority than priority/awaiting-more-evidence. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates an issue lacks a `priority/foo` label and requires one. labels Oct 9, 2024
@sbueringer
Copy link
Member Author

@fabriziopandini @chrischdi @vincepri @enxebre Opinions?

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 7, 2025
@sbueringer
Copy link
Member Author

/remove-lifecycle stale

/help

@k8s-ci-robot
Copy link
Contributor

@sbueringer:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/remove-lifecycle stale

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 7, 2025
@Amulyam24
Copy link
Contributor

Hi @sbueringer, I'm interested and would like to work on this issue if this is still relevant. Can you please confirm?

@sbueringer
Copy link
Member Author

It's still relevant. Feel free to go ahead

@Amulyam24
Copy link
Contributor

/assign

@Amulyam24
Copy link
Contributor

Hi @sbueringer, I looked around the repo and as you mentioned there are around 40 instances of context.WithCancel and context.WithTimeout. The change seems pretty minimal with adding errors wherever applicable for additional info after switching to context.With*Cause
I also understand that the switch is not needed at every occurrence.

Is this what you've been looking for or is there anything else that specific that needs to be done?

For example

diff --git a/controllers/clustercache/cluster_accessor_client.go b/controllers/clustercache/cluster_accessor_client.go
index 55fe61ca6..908800133 100644
--- a/controllers/clustercache/cluster_accessor_client.go
+++ b/controllers/clustercache/cluster_accessor_client.go
@@ -261,7 +261,7 @@ func createCachedClient(ctx context.Context, clusterAccessorConfig *clusterAcces
        go cache.Start(cacheCtx) //nolint:errcheck
 
        // Wait until the cache is initially synced.
-       cacheSyncCtx, cacheSyncCtxCancel := context.WithTimeout(ctx, clusterAccessorConfig.Cache.InitialSyncTimeout)
+       cacheSyncCtx, cacheSyncCtxCancel := context.WithTimeoutCause(ctx, clusterAccessorConfig.Cache.InitialSyncTimeout, errors.New("timed out while waiting for caches to sync"))
        defer cacheSyncCtxCancel()
        if !cache.WaitForCacheSync(cacheSyncCtx) {
                cache.Stop()
@@ -297,13 +297,13 @@ type clientWithTimeout struct {
 var _ client.Client = &clientWithTimeout{}
 
 func (c clientWithTimeout) Get(ctx context.Context, key client.ObjectKey, obj client.Object, opts ...client.GetOption) error {
-       ctx, cancel := context.WithTimeout(ctx, c.timeout)
+       ctx, cancel := context.WithTimeoutCause(ctx, c.timeout, errors.New("timed out while retrieving object in cluster"))
        defer cancel()
        return c.Client.Get(ctx, key, obj, opts...)
 }

@sbueringer
Copy link
Member Author

@Amulyam24 Yup, that is what I was looking for. I'll take a look at your PR

@chrischdi
Copy link
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Jan 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. priority/backlog Higher priority than priority/awaiting-more-evidence. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
5 participants