first phase of db improvements #400
Request: split this PR for safer rollout and clean attribution
This PR bundles six different change shapes that all touch hot ingest paths: fanout window, strategy-performance query rewrite, current APY/APR lookback bounds, things.get() SQL pushdown, JSONB projection narrowing, and test runner cleanup. Two of those are behavior changes (dropping stale-output fallback past the lookback windows), not pure perf.
Bundled, this is hard to operate:
- If anything regresses post-deploy, a revert has to unwind six unrelated shapes at once.
- `pg_stat_statements` deltas only attribute cleanly per deploy. Bundled, we see aggregate movement but can't tie a queryid improvement (or regression) to a specific change. That's the whole point of the HackMD baseline.
- The fanout query is the 99%+ row-reduction win. Landing it on its own starts clawing back egress immediately instead of waiting on consensus about the smaller changes.
Suggested split
PR 1 — test runner cleanup
- `packages/lib/run-tests.ts` default spec discovery.
- Remove `describe.only` at `packages/lib/strider.spec.ts:4` (still there on this branch; the PR only removes `it.only`).
- Confirm `bun run --filter lib test` runs the full suite, not just strider.
PR 2 — fanout timestamp discovery
- Just the `series_time` bounded discovery in `packages/ingest/fanout/timeseries.ts`.
- Maps to the top HackMD offender (`SELECT DISTINCT block_time FROM output…`).
- Deploy and measure before adding more.
PR 3 — current performance / output lookups
- Bounded current APY/APR pivots. Replace the hardcoded `7 days` with a dedicated env var, e.g. `CURRENT_PERFORMANCE_LOOKBACK_DAYS`, defaulting to `7`.
- Reuse that same env var for the 14-day strategy performance lookback. If the 7d/14d gap was deliberate (e.g. strategy reports lag), call out why and keep them named separately instead.
- Preserve the single-timestamp invariant in `fetchStrategyPerformance` (`packages/ingest/abis/yearn/3/vault/snapshot/hook.ts`). The old CTE used `GROUP BY address, label`, so every component for a given `(address, label)` came from the same `block_time`. The new `DISTINCT ON (address, label, component)` shape picks the latest `block_time` per component independently, so a strategy can return `net` from one output point and `weeklyNet` from another.
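To make the invariant concern concrete, here is a minimal in-memory sketch in TypeScript (not the actual SQL) that contrasts the two selection shapes. The `OutputRow` type, the `performance` label, the sample values, and both function names are hypothetical; only `address`, `label`, `component`, `block_time`, `net`, and `weeklyNet` come from the comment above.

```typescript
// Hypothetical row shape; the real `output` table likely carries more columns.
type OutputRow = {
  address: string;
  label: string;
  component: string;
  blockTime: number; // seconds since epoch
  value: number;
};

// Mirrors DISTINCT ON (address, label, component): the latest row is chosen
// per component, so components may come from different blockTimes.
function latestPerComponent(rows: OutputRow[]): OutputRow[] {
  const best = new Map<string, OutputRow>();
  for (const row of rows) {
    const key = `${row.address}|${row.label}|${row.component}`;
    const seen = best.get(key);
    if (!seen || row.blockTime > seen.blockTime) best.set(key, row);
  }
  return Array.from(best.values());
}

// Mirrors the GROUP BY address, label shape: find the latest blockTime per
// (address, label), then keep only the rows written at that blockTime.
function latestSnapshot(rows: OutputRow[]): OutputRow[] {
  const latest = new Map<string, number>();
  for (const row of rows) {
    const key = `${row.address}|${row.label}`;
    latest.set(key, Math.max(latest.get(key) ?? -Infinity, row.blockTime));
  }
  return rows.filter(
    (row) => latest.get(`${row.address}|${row.label}`) === row.blockTime,
  );
}

// Made-up data: `net` was written at t=200, `weeklyNet` only at t=100.
const rows: OutputRow[] = [
  { address: "0xabc", label: "performance", component: "net", blockTime: 200, value: 0.05 },
  { address: "0xabc", label: "performance", component: "weeklyNet", blockTime: 100, value: 0.04 },
  { address: "0xabc", label: "performance", component: "net", blockTime: 100, value: 0.03 },
];

const mixed = latestPerComponent(rows);
const consistent = latestSnapshot(rows);
const mixedTimes = new Set(mixed.map((r) => r.blockTime));
const consistentTimes = new Set(consistent.map((r) => r.blockTime));

console.log(mixedTimes.size);      // 2: net@200 mixed with weeklyNet@100
console.log(consistentTimes.size); // 1: only rows from t=200
```

In this sketch the per-component shape mixes two output points, while the snapshot shape stays on one timestamp. The flip side is that any component not written at the latest timestamp (here `weeklyNet`) drops out of the snapshot. Whichever behavior is intended is worth stating explicitly in the split PR.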
After 1–3 are measured, decide whether the remaining projection changes (thing.defaults, things.get() pushdown, evmlog.args keys) still carry their weight against the new baseline.
PR 1 is pretty lightweight; if convenient, combine it with PR 2.
Split per review feedback:
Closing this in favor of the split PRs.
Summary
Implements the first phase of the Neon egress reduction plan from the HackMD. The branch reduces rows/bytes returned by hot ingest queries without adding a cache layer.
Main changes:
- Bound fanout timestamp discovery with `series_time`.
- Replace the strategy performance `latest_times` CTE/JOIN with a bounded `DISTINCT ON (address, label, component)` scan.
- Read `thing.defaults` in hot timeseries hooks instead of full `thing` rows.
- Push `things.get()` equality / inequality filters into SQL and keep semver filtering in JS.
- Narrow `evmlog.args` reads to the fields each caller actually uses.
Expected savings
These are directional estimates from the HackMD diagnosis, not proof of production impact. The real pass/fail check is the post-deploy `pg_stat_statements` resample after about 3 days.
- The `DISTINCT ON` scan should turn the full-history CTE/JOIN into a recent-window lookup.
- `thing` and `evmlog` projection changes reduce JSONB bytes returned at hot call sites. These are expected to help most where callers only need a few fields from large `defaults` / `args` blobs.
How to review
Start with:
- `packages/ingest/fanout/timeseries.ts`
- `packages/ingest/abis/yearn/3/vault/snapshot/hook.ts`
- `packages/ingest/things.ts`
- `packages/ingest/helpers/apy-apr.ts`
Then skim the small hook updates under `packages/ingest/abis/yearn/**` and `packages/ingest/abis/erc4626/**` to confirm shape-preserving query projection changes.
Behavior should stay the same for normal/current data. The intentional tradeoff is that current strategy performance and APY/APR helpers now ignore stale output outside their bounded lookback windows instead of returning old historical values.
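The stale-output tradeoff can be sketched in a few lines of TypeScript. This is an illustration under stated assumptions, not the code in `apy-apr.ts`: the `Point` shape and both helper names are hypothetical, and `CURRENT_PERFORMANCE_LOOKBACK_DAYS` is the env var name suggested in the review comment rather than one that necessarily exists on this branch.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: read a positive lookback window (in days) from an
// env-like object, falling back when the value is missing or invalid.
function lookbackDays(env: Env, name: string, fallback: number): number {
  const parsed = Number(env[name]);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Hypothetical shape of one output point; blockTime is seconds since epoch.
type Point = { blockTime: number; value: number };

// Return the latest value inside the lookback window, or undefined when
// every point is older than the cutoff (the "ignore stale output" behavior).
function currentValue(
  points: Point[],
  nowSeconds: number,
  days: number,
): number | undefined {
  const cutoff = nowSeconds - days * 86400;
  let latest: Point | undefined;
  for (const point of points) {
    if (point.blockTime < cutoff) continue; // stale: outside the window
    if (!latest || point.blockTime > latest.blockTime) latest = point;
  }
  return latest?.value;
}

const now = 1700000000;
// Made-up data: the only point is 10 days old.
const points: Point[] = [{ blockTime: now - 10 * 86400, value: 0.08 }];

const defaultDays = lookbackDays({}, "CURRENT_PERFORMANCE_LOOKBACK_DAYS", 7);
const widerDays = lookbackDays(
  { CURRENT_PERFORMANCE_LOOKBACK_DAYS: "14" },
  "CURRENT_PERFORMANCE_LOOKBACK_DAYS",
  7,
);

console.log(currentValue(points, now, defaultDays)); // undefined (stale at 7 days)
console.log(currentValue(points, now, widerDays));   // 0.08 (inside 14 days)
```

The same 10-day-old point is dropped under a 7-day window and returned under a 14-day one. That is why the window size is a behavior decision and not only a performance knob.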
Test plan
Automated checks run:
What those checks cover:
- `things.get()` filters and affected Yearn snapshot hooks.
What tests do not prove:
- `pg_stat_statements.rows` drops for production queryids.
GraphQL/API smoke: run representative frontend-style queries and confirm response shapes stay unchanged.
Production measurement after deploy:
Expected production pass criteria:
Risk / impact
No migrations or cache layer. Query changes are additive/narrowing, but production validation still needs the post-deploy `pg_stat_statements` resample above. Main behavior tradeoff: stale output outside the bounded lookback windows is no longer used for current strategy performance and APY/APR helpers.
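One way to read the resample is as a per-queryid diff between a pre-deploy and a post-deploy snapshot of `pg_stat_statements`. The TypeScript sketch below assumes the snapshots have already been fetched into memory. `queryid`, `calls`, and `rows` are real columns of that view, but the `StatRow` type, the `diffSnapshots` name, treating `queryid` as a string, and the sample numbers are all assumptions for illustration.

```typescript
// Subset of pg_stat_statements columns. Counters are cumulative since the
// last stats reset, so only the difference between two snapshots is meaningful.
type StatRow = { queryid: string; calls: number; rows: number };

type Delta = { queryid: string; calls: number; rows: number; rowsPerCall: number };

// Hypothetical helper: per-queryid movement between two snapshots.
// A negative delta would suggest the stats were reset between snapshots.
function diffSnapshots(before: StatRow[], after: StatRow[]): Delta[] {
  const base = new Map<string, StatRow>();
  for (const row of before) base.set(row.queryid, row);

  return after.map((row) => {
    const prev = base.get(row.queryid);
    const calls = row.calls - (prev?.calls ?? 0);
    const rows = row.rows - (prev?.rows ?? 0);
    return {
      queryid: row.queryid,
      calls,
      rows,
      rowsPerCall: calls > 0 ? rows / calls : 0,
    };
  });
}

// Made-up numbers: 50 new calls returned 500 rows in total after the deploy.
const before: StatRow[] = [{ queryid: "1", calls: 100, rows: 1000000 }];
const after: StatRow[] = [{ queryid: "1", calls: 150, rows: 1000500 }];

const deltas = diffSnapshots(before, after);
console.log(deltas.map((d) => d.rowsPerCall)); // [ 10 ]
```

Comparing `rowsPerCall` for the same queryid across the window is what lets a drop be tied to one deploy. With several query changes shipped together, the diff shows the movement but cannot say which change caused it.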