fix: add RPC fallback with timeout for on-chain reads #154

Open
0x-SquidSol wants to merge 1 commit into dcccrypto:main from 0x-SquidSol:fix/rpc-fallback-connection

@0x-SquidSol
Contributor

Summary

  • Bug: All on-chain RPC calls (fetchSlab, getSlot) use a single Connection from getConnection() with no failover. If the primary Solana RPC is down, rate-limited, or hung, /markets/:slab, /api/adl/rankings, and /health fail with no recovery path.
  • Fix: Adds withRpcFallback() — tries primary connection with a timeout, on ANY error tries FALLBACK_RPC_URL with its own independent timeout. Also adds withRpcTimeout() using Promise.race to prevent indefinite hangs.
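
The timeout half of the fix can be sketched as follows. This is a minimal illustration assuming only what the summary states (a Promise.race deadline and an RpcTimeoutError class); the exact signature and error message are hypothetical, and the PR's actual src/utils/rpc-timeout.ts may differ.

```typescript
// Hypothetical sketch; names follow the PR description, details are assumed.
class RpcTimeoutError extends Error {
  constructor(public readonly timeoutMs: number) {
    super(`RPC call timed out after ${timeoutMs}ms`);
    this.name = "RpcTimeoutError";
  }
}

// Race the RPC call against a timer. Whichever settles first wins.
async function withRpcTimeout<T>(
  call: () => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new RpcTimeoutError(timeoutMs)), timeoutMs);
  });
  try {
    return await Promise.race([call(), deadline]);
  } finally {
    // Always clear the timer so it cannot keep the process alive
    // or reject after the call has already settled.
    clearTimeout(timer);
  }
}
```

Note that Promise.race only stops waiting: the losing RPC request is abandoned, not cancelled, and may still complete in the background.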

Changes

| File | Change |
| --- | --- |
| src/utils/rpc-timeout.ts | New: withRpcTimeout(), RpcTimeoutError, env-configurable defaults |
| src/utils/rpc-fallback.ts | New: withRpcFallback() with primary→fallback failover |
| src/routes/markets.ts | Wraps fetchSlab() with withRpcFallback, returns 504 on timeout |
| src/routes/adl.ts | Wraps fetchSlab() with withRpcFallback, returns 504 on timeout |
| src/routes/health.ts | Wraps getSlot() with withRpcFallback (5s timeout) |
| .env.example | Documents FALLBACK_RPC_URL |
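
As a rough illustration of the "returns 504 on timeout" behaviour in the route files, a handler could translate a caught error into a status code as sketched below. The helper name toHttpError, the response body shape, and the 500 status for non-timeout errors are assumptions for illustration; the PR does not say what the routes return for other failures or which web framework they use.

```typescript
// Hypothetical sketch; RpcTimeoutError is redeclared here so the block is
// self-contained. The PR defines it in src/utils/rpc-timeout.ts.
class RpcTimeoutError extends Error {
  constructor(message = "RPC call timed out") {
    super(message);
    this.name = "RpcTimeoutError";
  }
}

interface HttpError {
  status: number;
  body: { error: string };
}

// Map an error thrown by an on-chain read to an HTTP response.
function toHttpError(err: unknown): HttpError {
  if (err instanceof RpcTimeoutError) {
    // The upstream RPC did not answer in time: 504 Gateway Timeout.
    return { status: 504, body: { error: "Upstream RPC timed out" } };
  }
  // Assumed default for everything else; not confirmed by the PR.
  return { status: 500, body: { error: "Internal server error" } };
}
```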

Design decisions

  • Promise.race: fetchSlab/getSlot don't accept an AbortSignal
  • Fallback on ANY error — not just 429 like rateLimitedCall. Connection failures, timeouts, 5xx all trigger failover. Better for read-only API.
  • Independent timeout per attempt — primary gets 10s, then fallback gets a fresh 10s. No shared budget.
  • Guard against devnet default: getFallbackConnection() silently defaults to devnet when FALLBACK_RPC_URL is unset. We check Boolean(process.env.FALLBACK_RPC_URL) before attempting fallback to prevent silent devnet routing on mainnet.
  • 504 Gateway Timeout for RPC timeouts: the API acts as a gateway to an upstream RPC that did not respond in time, which is exactly what 504 denotes
  • Env-configurable: RPC_TIMEOUT_MS (default 10s), HEALTH_RPC_TIMEOUT_MS (default 5s), FALLBACK_RPC_URL
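
The decisions above can be combined into one sketch: fall back on any error, give each attempt a fresh timeout, and never touch the fallback unless FALLBACK_RPC_URL is set. This is an illustration under assumptions, not the PR's src/utils/rpc-fallback.ts. The generic connection type C, the options object, the helper names timeoutFromEnv and raceDeadline, and passing env in as a parameter (instead of reading process.env directly) are all hypothetical.

```typescript
// Hypothetical sketch of primary→fallback failover with independent timeouts.
type Env = Record<string, string | undefined>;

// Read a positive millisecond value from an env string, else use the default.
function timeoutFromEnv(raw: string | undefined, defaultMs: number): number {
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : defaultMs;
}

// Reject if `call` has not settled within `ms` (the call is not cancelled).
async function raceDeadline<T>(call: () => Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`RPC timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([call(), deadline]);
  } finally {
    clearTimeout(timer);
  }
}

interface FallbackOptions<C> {
  primary: C;            // stands in for the Connection from getConnection()
  getFallback: () => C;  // stands in for getFallbackConnection()
  env: Env;              // e.g. process.env in a Node service
}

async function withRpcFallback<C, T>(
  op: (conn: C) => Promise<T>,
  opts: FallbackOptions<C>,
): Promise<T> {
  const timeoutMs = timeoutFromEnv(opts.env.RPC_TIMEOUT_MS, 10_000);
  try {
    return await raceDeadline(() => op(opts.primary), timeoutMs);
  } catch (primaryErr) {
    // Guard: without an explicit FALLBACK_RPC_URL, rethrow instead of
    // building a fallback connection that would default to devnet.
    if (!opts.env.FALLBACK_RPC_URL) throw primaryErr;
    // Any primary error triggers failover; the fallback gets a fresh budget.
    return await raceDeadline(() => op(opts.getFallback()), timeoutMs);
  }
}
```

Because the two budgets are independent, a request whose primary and fallback both hang waits for the sum of both timeouts: 10s + 10s = 20s with the default RPC_TIMEOUT_MS.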

Note

This PR supersedes #153 (timeout-only fix) by including both timeout AND fallback in one cohesive change.

Test plan

  • tsc --noEmit passes (zero type errors)
  • vitest run passes (186/186 tests)
  • Manual: confirm 504 response when primary RPC is unreachable
  • Manual: confirm fallback activates when FALLBACK_RPC_URL is set and primary fails
  • Verify no devnet requests when FALLBACK_RPC_URL is unset

🤖 Generated with Claude Code

All on-chain RPC calls (fetchSlab, getSlot) use a single Connection
from getConnection() with no failover.  If the primary Solana RPC is
down, rate-limited, or hung, /markets/:slab, /api/adl/rankings, and
/health fail with no recovery path.

This adds:
- withRpcTimeout() — Promise.race deadline (10s routes, 5s health)
- withRpcFallback() — tries primary, on ANY error tries FALLBACK_RPC_URL
  - Each attempt gets its own independent timeout budget
  - Only activates when FALLBACK_RPC_URL is explicitly set (prevents
    silent failover to the devnet default in @percolator/shared)
- RpcTimeoutError class for 504 Gateway Timeout responses
- FALLBACK_RPC_URL documented in .env.example

Configurable via env: RPC_TIMEOUT_MS, HEALTH_RPC_TIMEOUT_MS, FALLBACK_RPC_URL

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai bot commented Apr 9, 2026

Warning

Rate limit exceeded

@0x-SquidSol has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 20 minutes and 49 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 20 minutes and 49 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 12bd6590-21a5-4a99-8db3-33849c0bfcf6

📥 Commits

Reviewing files that changed from the base of the PR and between fb94c29 and 27a64f5.

📒 Files selected for processing (6)
  • .env.example
  • src/routes/adl.ts
  • src/routes/health.ts
  • src/routes/markets.ts
  • src/utils/rpc-fallback.ts
  • src/utils/rpc-timeout.ts

@0x-SquidSol
Contributor Author

⚠️ Merge conflict — needs rebase against current main

This PR has merge conflicts due to the SDK migration from @percolator/sdk (GitHub commit-pinned) to @percolatorct/sdk@1.0.0-beta.13 (npm registry). The imports in markets.ts and adl.ts changed from @percolator/sdk to @percolatorct/sdk.

The fix itself is still needed: no merged PR adds an RPC timeout or fallback to the fetchSlab/getSlot calls. The merged #133 only adds a timeout to resolvePrice in the oracle router; this PR covers the on-chain read paths in markets, ADL, and health.

To resolve: Rebase against main, update imports from @percolator/sdk to @percolatorct/sdk, and resolve the src/routes/markets.ts / src/routes/adl.ts conflicts.

Also supersedes #153 (timeout-only version of this PR).
