Provide Full Exception Details for Retry Handling #345

Kobudzik · 2024-10-02T07:31:09Z

Description:

In the current implementation of durabletask-dotnet, the retry handler receives a TaskFailureDetails instance as its Func argument, but that class exposes only limited information about the failure. The original exception is not available, which complicates retry scenarios that need granular control over exception handling. #314 introduced the HandleFailure method to allow TaskFailureDetails filtering on RetryPolicy, so it is now possible to combine a backoff coefficient with basic exception filtering. However, TaskFailureDetails carries very little information about the real exception, so it does not fit some scenarios.

Use Case:
I need to retry on two types of transient errors:

1. When SqlServerTransientExceptionDetector.ShouldRetryOn(Exception ex) resolves to true.
2. When the exception is an ApiException with a status code of 400, 401, or 404.

Unfortunately, the current TaskFailureDetails implementation does not provide the full exception details, making it impossible to inspect the original exception. This issue hinders scenarios where both transient database issues and HTTP API failures need to be handled in the same orchestration.
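
To illustrate the limitation, here is a minimal sketch of the kind of filter that is possible when only a failure summary is available. FailureSummary is a hypothetical stand-in, not the library's TaskFailureDetails, and its two fields, the RetryFilter helper, and the listed type names are all assumptions made for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the summary a retry handler receives.
// This is NOT the library's TaskFailureDetails; the fields are assumed.
public sealed record FailureSummary(string ErrorType, string ErrorMessage);

public static class RetryFilter
{
    // Assumed exception type names, chosen only for illustration.
    private static readonly string[] RetryableTypes =
    {
        "Microsoft.Data.SqlClient.SqlException",
        "System.TimeoutException",
    };

    // With only a type name and a message, the filter is reduced to string
    // matching. It cannot pass the exception to a detector such as
    // ShouldRetryOn(Exception), and it cannot read a status code property.
    public static bool ShouldRetry(FailureSummary failure) =>
        RetryableTypes.Contains(failure.ErrorType, StringComparer.Ordinal);
}
```

A filter like this can tell exception types apart by name, but it cannot distinguish a 404 from a 500 on the same ApiException type unless the status code happens to appear in the message text.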

Suggested Solutions:
Would it be feasible to either:

1. Serialize the original Exception into TaskFailureDetails so that users have access to full exception details?
2. Or, if TaskFailureDetails shouldn’t grow too much, introduce an alternative flow that allows accessing the original exception?

Additionally, there was a line in the codebase:

    public Func<Exception, Task<bool>>? HandleAsync { get; set; }

This function seems to fit my use case more closely but was left unimplemented. Is there a specific reason why it wasn’t implemented?

Why This Matters:
Retry policies are commonly used to handle transient issues like database connection problems or temporary service unavailability. However, without full exception details, it’s challenging to make fine-tuned decisions about when and how to retry.

This enhancement would significantly improve the flexibility of retry logic by allowing users to filter retries based on the complete exception context.

Looking forward to your thoughts and feedback!
@cgillum


jerry-dixon · 2025-01-06T22:19:50Z

I have this need as well, except that I wish to handle specific CosmosDB issues. Has there been any movement on this?

Thanks!

microsoft-github-policy-service bot added the Needs: Triage 🔍 label Oct 2, 2024

bachuv added P2 and removed Needs: Triage 🔍 labels Oct 2, 2024
