Skip to content

Comments

feat(retrier): add context-aware retries via DoWithContext#17

Merged
hyp3rd merged 2 commits intomainfrom
feat/improvements
Jan 20, 2026
Merged

feat(retrier): add context-aware retries via DoWithContext#17
hyp3rd merged 2 commits intomainfrom
feat/improvements

Conversation

@hyp3rd
Copy link
Owner

@hyp3rd hyp3rd commented Jan 20, 2026

Introduce RetryableFuncWithContext and DoWithContext to support cancellation and deadlines using context.Context. Implement supporting helpers for attempt handling and backoff that respect context (e.g., shouldRetry, waitRetry, handleAttemptError, runAttemptWithContext, ensureInitialized). Update README with guidance and examples for long-running operations using DoWithContext. Expand PRD with acceptance criteria and test plan for context-aware retries. Add TestDoWithContext_Cancel to verify cancellation propagates (expects context.Canceled).

No breaking changes; existing Do() behavior remains unchanged.

Introduce RetryableFuncWithContext and DoWithContext to support cancellation and deadlines using context.Context.
Implement supporting helpers for attempt handling and backoff that respect context (e.g., shouldRetry, waitRetry, handleAttemptError, runAttemptWithContext, ensureInitialized).
Update README with guidance and examples for long-running operations using DoWithContext.
Expand PRD with acceptance criteria and test plan for context-aware retries.
Add TestDoWithContext_Cancel to verify cancellation propagates (expects context.Canceled).

No breaking changes; existing Do() behavior remains unchanged.
Copilot AI review requested due to automatic review settings January 20, 2026 14:01
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces context-aware retry functionality by adding a new DoWithContext method that accepts functions observing context cancellation. The implementation refactors the existing Do method to extract reusable helper functions (shouldRetry, waitRetry, handleAttemptError) and adds new infrastructure (runAttemptWithContext, attemptOutcome type) to support concurrent execution with proper cancellation handling.

Changes:

  • Added RetryableFuncWithContext type and DoWithContext method for context-aware retries
  • Refactored Do method to use extracted helper functions
  • Updated documentation in README and PRD to explain when to use context-aware retries

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
retrier.go Added DoWithContext method, RetryableFuncWithContext type, and helper methods (shouldRetry, waitRetry, handleAttemptError, runAttemptWithContext, ensureInitialized); refactored Do to use helpers
tests/retrier_test.go Added TestDoWithContext_Cancel to verify context cancellation behavior
README.md Added example and guidance for using DoWithContext with long-running operations
PRD.md Updated goals, acceptance criteria, and test plan to include context-aware retry support

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +503 to +533
func (r *Retrier) runAttemptWithContext(
ctx context.Context,
timeoutTimer *time.Timer,
retryableFunc RetryableFuncWithContext,
) (attemptErr, terminalErr error) {
attemptCtx, attemptCancel := context.WithCancelCause(ctx)
resultCh := make(chan error, 1)

go func() {
resultCh <- retryableFunc(attemptCtx)
}()

select {
case err := <-resultCh:
attemptCancel(nil)

return err, nil
case <-ctx.Done():
attemptCancel(context.Cause(ctx))

return nil, ewrap.Wrap(ctx.Err(), "context done")
case <-r.ctx.Done():
attemptCancel(ErrOperationStopped)
//nolint:contextcheck
return nil, ewrap.Wrap(context.Cause(r.ctx), "retrier context done")
case <-timeoutTimer.C:
attemptCancel(ErrTimeoutReached)

return nil, ErrTimeoutReached
}
}
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goroutine launched in runAttemptWithContext may leak if the select statement exits through the ctx.Done(), r.ctx.Done(), or timeoutTimer.C cases. After calling attemptCancel, the function returns immediately without waiting for the goroutine to complete. The goroutine will continue executing retryableFunc(attemptCtx) and attempt to send to resultCh, but no receiver will be reading from it. While the channel is buffered with size 1, this prevents immediate blocking, but the goroutine itself is not guaranteed to terminate promptly if retryableFunc is long-running or doesn't observe cancellation. Consider adding a mechanism to ensure the goroutine completes, or document that retryableFunc must observe context cancellation to avoid goroutine leaks.

Copilot uses AI. Check for mistakes.
Comment on lines +289 to +312
func TestDoWithContext_Cancel(t *testing.T) {
r, err := again.NewRetrier(
context.Background(),
again.WithMaxRetries(1),
again.WithInterval(10*time.Millisecond),
again.WithTimeout(1*time.Second),
)
if err != nil {
t.Fatalf("failed to create retrier: %v", err)
}

ctx, cancel := context.WithCancel(context.Background())

errs := r.DoWithContext(ctx, func(ctx context.Context) error {
cancel()
<-ctx.Done()

return ctx.Err()
})

if errs.Last == nil || !errors.Is(errs.Last, context.Canceled) {
t.Errorf("expected last error to match %v, got %v", context.Canceled, errs.Last)
}
}
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test has a race condition. The function calls cancel() and then waits for ctx.Done() before returning ctx.Err(). However, in runAttemptWithContext, there's a race between the function completing (case err := <-resultCh) and the context being detected as done (case <-ctx.Done()). Depending on which case wins, the error handling path differs: if resultCh wins, context.Canceled is treated as attemptErr and goes through shouldRetry logic; if ctx.Done() wins, it's returned as a terminalErr. This makes the test behavior non-deterministic. Consider using a more controlled scenario, such as a longer-running operation or explicit synchronization to ensure the desired execution path is consistently tested.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@hyp3rd hyp3rd merged commit 297136b into main Jan 20, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant