Improves/request retry#131
Conversation
There was a problem hiding this comment.
Pull request overview
This PR centralizes “retryable error” handling and reduces direct HTTP/JSON parsing by routing more network-asset reads through common.SafeReadAssetUntilSufficient, while also standardizing retry loops to use mtg.CheckRetryableError.
Changes:
- Replace direct Mixin Network asset HTTP calls with
common.SafeReadAssetUntilSufficientin observer/keeper/cmd paths. - Standardize transient error retry logic across Mixin/Ethereum/Bitcoin RPC call loops via
mtg.CheckRetryableError. - Add an Ethereum
FilterLogsretry helper (filterLogsUntilSufficient) and use it for ERC20 transfer log retrieval.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| observer/bond.go | Switch asset fetch + retry gating to shared helpers (SafeReadAssetUntilSufficient, mtg.CheckRetryableError). |
| mtg/utils.go | Expand/normalize retryable error string matching using lowercase comparisons. |
| keeper/network.go | Switch Mixin asset fetch + retry gating to shared helpers. |
| common/wallet.go | Treat retryable UTXO listing failures as non-fatal and continue loop. |
| cmd/transaction.go | Use SafeReadAssetUntilSufficient for CLI asset-id lookup; add context. |
| apps/mixin/rpc.go | Replace local retry string matching with mtg.CheckRetryableError. |
| apps/ethereum/rpc.go | Use shared retry logic and add filterLogsUntilSufficient for log filtering. |
| apps/bitcoin/rpc.go | Replace local retry string matching with mtg.CheckRetryableError. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| asset, err := common.SafeReadAssetUntilSufficient(ctx, mixinId) | ||
| if err != nil { | ||
| panic(mixinId) | ||
| } | ||
| defer resp.Body.Close() | ||
|
|
||
| var body struct { | ||
| Data struct { | ||
| AssetId string `json:"asset_id"` | ||
| MixinId string `json:"mixin_id"` | ||
| } `json:"data"` | ||
| } | ||
| json.NewDecoder(resp.Body).Decode(&body) | ||
| if body.Data.MixinId != mixinId { | ||
| if asset.KernelAssetID != mixinId { | ||
| panic(mixinId) | ||
| } | ||
| return body.Data.AssetId | ||
| return asset.AssetID |
There was a problem hiding this comment.
SafeReadAssetUntilSufficient returns (nil, nil) on 404 (see common/mixin.go), so asset can be nil here; dereferencing asset.KernelAssetID / asset.AssetID will panic. Handle the asset == nil case explicitly (e.g., panic with a clearer message or return an empty string / error depending on expected CLI behavior).
| @@ -278,14 +254,10 @@ func (node *Node) fetchAssetMeta(ctx context.Context, id string) (*store.Asset, | |||
| if err == nil { | |||
There was a problem hiding this comment.
fetchMixinAsset can return (nil, nil) when the asset is not found (404), which makes meta nil here; calling WriteAssetMeta(ctx, meta) will panic because it dereferences asset. Add a meta == nil branch (similar to observer/bond.go) to return (nil, nil) without writing.
| if err == nil { | |
| if err == nil { | |
| if meta == nil { | |
| return nil, nil | |
| } |
| func filterLogsUntilSufficient(ctx context.Context, rpc string, query ethereum.FilterQuery) ([]types.Log, error) { | ||
| client, err := ethclient.Dial(rpc) | ||
| if err != nil { | ||
| return nil, err | ||
| } |
There was a problem hiding this comment.
This dials an ethclient but never closes it; over time this can leak TCP connections/file descriptors (especially if called frequently). Ensure the client is closed (or reuse a shared client) and also consider breaking out when ctx is done to avoid retrying forever on context-related failures.
| for { | ||
| logs, err := client.FilterLogs(ctx, query) | ||
| if mtg.CheckRetryableError(err) { | ||
| time.Sleep(7 * time.Second) | ||
| continue | ||
| } | ||
| time.Sleep(7 * time.Second) | ||
| return logs, err |
There was a problem hiding this comment.
If client.FilterLogs returns a retryable error caused by the passed-in ctx (e.g., deadline exceeded), this loop will retry indefinitely with the same expired context. Add an explicit if ctx.Err() != nil { return nil, ctx.Err() } (or avoid treating context deadline errors as retryable) before sleeping/retrying.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a3a8d672c1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| logs, err := filterLogsUntilSufficient(ctx, rpc, ethereum.FilterQuery{ | ||
| FromBlock: big.NewInt(height), | ||
| ToBlock: big.NewInt(height), | ||
| }) |
There was a problem hiding this comment.
Return filter-log errors before parsing transfers
GetERC20TransferLogFromBlock now calls filterLogsUntilSufficient but never checks the returned err, so any non-retryable FilterLogs failure is silently treated as success and the function returns an empty transfer list. This can hide RPC/query failures (for example invalid filter params or provider-side errors) and cause missed ERC-20 transfer processing instead of surfacing an error to callers.
Useful? React with 👍 / 👎.
| err = json.NewDecoder(resp.Body).Decode(&body) | ||
| if err != nil { | ||
| func (node *Node) fetchMixinAsset(ctx context.Context, id string) (*store.Asset, error) { | ||
| asset, err := common.SafeReadAssetUntilSufficient(ctx, id) |
There was a problem hiding this comment.
Preserve messenger API config in asset metadata fetch
This change removes use of node.conf.MixinMessengerAPI from keeper asset lookups by switching to common.SafeReadAssetUntilSufficient(ctx, id), which takes no API endpoint argument. As a result, deployments that rely on a non-default mixin-messenger-api value will no longer query their configured endpoint here, and asset metadata fetches can fail in those environments.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
| members := []string{mw.client.ClientID} | ||
| utxos, err := listUnspentUTXOsUntilSufficient(ctx, mw.client, members, 1, "", checkpoint) | ||
| if CheckRetryableError(err) { |
There was a problem hiding this comment.
This retries on CheckRetryableError(err) by immediately continueing without logging the error. Since drainOutputsFromNetwork runs forever, persistent retryable failures would become silent and make it hard to diagnose network/API issues. Consider logging the error (and possibly using a slightly longer backoff on repeated failures) before continuing.
| if CheckRetryableError(err) { | |
| if CheckRetryableError(err) { | |
| fmt.Printf("drainOutputsFromNetwork retryable error at checkpoint %d: %v\n", checkpoint, err) |
| } | ||
| b.Height = blockHeight | ||
| b.Height = uint64(height) | ||
| for _, tx := range b.Tx { |
There was a problem hiding this comment.
RPCGetBlockWithTransactions no longer validates the RPC response’s number field and now unconditionally sets b.Height from the requested height. If the RPC returns null (common when querying beyond the tip) or returns an unexpected block, json.Unmarshal can still succeed with a zero-value struct and this function will incorrectly return a “valid” block with empty hash/number. Consider unmarshalling into a pointer and checking for nil, and/or parsing/validating b.Number (and matching it to height) before returning success.
| case strings.Contains(es, "bad gateway"): | ||
| case strings.Contains(es, "internal server error"): | ||
| case strings.Contains(es, "invalid character"): | ||
| case strings.Contains(es, "unexpected end of json input"): |
There was a problem hiding this comment.
CheckRetryableError treats any error message containing the generic substring "invalid character" as retryable. This broad match will also include non-transient parsing/validation errors (not just transient HTML/JSON corruption from upstream), potentially causing infinite retry loops in the many UntilSufficient callers. Consider narrowing this condition to specific known transient messages (e.g. the previous "invalid character '<'" case) and/or prefer typed checks (e.g. net.Error timeout, io.EOF, context.DeadlineExceeded) over string matching for JSON parse errors.
| func (node *Node) fetchMixinAsset(ctx context.Context, id string) (*Asset, error) { | ||
| asset, err := common.SafeReadAssetUntilSufficient(ctx, id) | ||
| if err != nil || asset == nil { | ||
| return nil, err | ||
| } |
There was a problem hiding this comment.
common.SafeReadAssetUntilSufficient returns (asset=nil, err=nil) when the upstream returns 404 (see common/mixin.go). In that case this function returns (nil, nil), which makes it easy for callers to treat "not found" as success and then panic/nil-deref. Consider converting the asset == nil case into an explicit error (e.g. ErrNotFound) so callers can branch reliably.
No description provided.