Skip to content

Comments

feat(retrier): per-attempt error tracking, early stop, and init guards#16

Merged
hyp3rd merged 2 commits intomainfrom
feat/improvements
Jan 20, 2026
Merged

feat(retrier): per-attempt error tracking, early stop, and init guards#16
hyp3rd merged 2 commits intomainfrom
feat/improvements

Conversation

@hyp3rd
Copy link
Owner

@hyp3rd hyp3rd commented Jan 20, 2026

• Track errors per attempt (populate Errors.Attempts); remove shared r.err usage. • Return early when a non-temporary error is encountered (when a temporary-error list/registry is provided). • Wrap timeouts with ErrTimeoutReached and include attempt context. • Add ensureInitialized/ensureContext; default registry, logger, timer pool; support manual initialization. • Tests: cover early-stop behavior, timeouts, and manual init (e.g., TestDo_NonTemporaryStopsEarly, TestDo_ManualRetrierInitialization). • Docs: add PRD for Retrier Hardening; update README and examples (temporary error registry usage).

BREAKING CHANGE: Errors.Retries renamed to Errors.Attempts. Update usages and expectations: • Use Errors.Attempts for attempt traces.
• Register temporary errors via retrier.Registry.RegisterTemporaryError. • Expect timeout errors as ErrTimeoutReached and max-retries errors wrapped by ErrMaxRetriesReached.

•  Track errors per attempt (populate Errors.Attempts); remove shared r.err usage.
•  Return early when a non-temporary error is encountered (when a temporary-error list/registry is provided).
•  Wrap timeouts with ErrTimeoutReached and include attempt context.
•  Add ensureInitialized/ensureContext; default registry, logger, timer pool; support manual initialization.
•  Tests: cover early-stop behavior, timeouts, and manual init (e.g., TestDo_NonTemporaryStopsEarly, TestDo_ManualRetrierInitialization).
•  Docs: add PRD for Retrier Hardening; update README and examples (temporary error registry usage).

BREAKING CHANGE: Errors.Retries renamed to Errors.Attempts. Update usages and expectations:
•  Use Errors.Attempts for attempt traces.
•  Register temporary errors via retrier.Registry.RegisterTemporaryError.
•  Expect timeout errors as ErrTimeoutReached and max-retries errors wrapped by ErrMaxRetriesReached.
Copilot AI review requested due to automatic review settings January 20, 2026 13:04
…temp error policy

- MaxRetries now defined as “retries after the first attempt”; total attempts = MaxRetries + 1.
- Validation permits MaxRetries >= 0 and updates the error message accordingly.
- Default behavior when temporaryErrors is omitted:
  - If a registry is configured, retry only registry-identified temporary errors.
  - If the registry is empty, treat all errors as temporary and retry.
- Update retrier loop/logic, option docs, tests, README, and PRD to reflect the above.

Refs: options.go, retrier.go, tests/retrier_test.go, README.md, PRD.md
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request hardens the go-again retrier implementation by fixing concurrency issues, improving error handling, and adding support for manually constructed Retrier instances. The PR introduces per-attempt error tracking, early stopping on non-temporary errors, and proper timeout error wrapping.

Changes:

  • Removed shared mutable state (r.err field) to enable safe concurrent calls to Do
  • Added initialization guards (ensureInitialized, ensureContext) to support manual Retrier construction without NewRetrier
  • Implemented early return when non-temporary errors are encountered with a temporary error list provided
  • Wrapped timeout errors with ErrTimeoutReached for errors.Is compatibility

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
retrier.go Core implementation changes: removed shared error state, added per-call error tracking, implemented initialization guards, improved timeout error wrapping
tests/retrier_test.go Added tests for early-stop behavior, timeout error detection, and manual Retrier initialization
README.md Updated API documentation to reflect Errors.Attempts field rename and corrected registry usage examples
PRD.md Added Product Requirements Document describing the hardening goals and implementation approach

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

PRD.md Outdated
Comment on lines 69 to 73
arly on non-temporary errors when a list is provided.

arly on non-temporary errors when a list is provided.

- Update README snippets and API notes to match current code.
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "Rollout" section contains duplicated and incomplete text fragments. The section should be properly completed or removed if not needed. The text "arly on non-temporary errors when a list is provided" appears to be a copy-paste error from the "Proposed Changes" section.

Suggested change
arly on non-temporary errors when a list is provided.
arly on non-temporary errors when a list is provided.
- Update README snippets and API notes to match current code.
- Release changes as a minor version (no breaking API changes).
- Coordinate with dependent services to update the `go-again` dependency and run regression tests.
- Monitor retry behavior, error rates, and timeout frequency after rollout.
- Be prepared to roll back to the previous version if regressions in retry or timeout handling are observed.

Copilot uses AI. Check for mistakes.
@hyp3rd hyp3rd merged commit 1fcc886 into main Jan 20, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant