first phase of db improvements by matheus1lva · Pull Request #400 · yearn/kong

matheus1lva · 2026-05-07T21:39:34Z

Summary

Implements the first phase of the Neon egress reduction plan from the HackMD. The branch reduces rows/bytes returned by hot ingest queries without adding a cache layer.

Main changes:

Bound fanout timestamp discovery to the requested series window using series_time.
Replace the strategy-performance latest_times CTE/JOIN with a bounded DISTINCT ON (address, label, component) scan.
Fetch only thing.defaults in hot timeseries hooks instead of full thing rows.
Push things.get() equality / inequality filters into SQL and keep semver filtering in JS.
Bound current APY/APR MAX pivots to the last 7 days.
Narrow targeted evmlog.args reads to the fields each caller actually uses.
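
A minimal sketch of the things.get() pushdown listed above, to make the SQL/JS split concrete. The filter shape, the `planThingQuery` and `matchesJsFilters` helpers, and the `thing` column names used here are assumptions for illustration, not the code in packages/ingest/things.ts. Equality and inequality filters become parameterized WHERE clauses, while semver range filters are returned for evaluation in JS.

```typescript
// Hypothetical filter shapes: the real things.get() signature is not shown in this PR.
type EqFilter = { field: string; op: "=" | "!="; value: string };
type SemverFilter = { field: string; op: ">=" | "<"; value: string };
type Filter = EqFilter | SemverFilter;

interface ThingQueryPlan {
  sql: string;
  params: string[];
  jsFilters: SemverFilter[];
}

function isEqFilter(f: Filter): f is EqFilter {
  return f.op === "=" || f.op === "!=";
}

// Equality / inequality filters go into WHERE as bound parameters;
// semver range filters are handed back to the caller untouched.
function planThingQuery(label: string, filters: Filter[]): ThingQueryPlan {
  const params: string[] = [label];
  const clauses: string[] = ["label = $1"];
  const jsFilters: SemverFilter[] = [];
  for (const f of filters) {
    if (isEqFilter(f)) {
      params.push(f.field, f.value);
      const fieldRef = `defaults->>$${params.length - 1}::text`;
      const valueRef = `$${params.length}`;
      // IS DISTINCT FROM also matches rows where the key is absent (NULL).
      clauses.push(
        f.op === "=" ? `${fieldRef} = ${valueRef}` : `${fieldRef} IS DISTINCT FROM ${valueRef}`
      );
    } else {
      jsFilters.push(f);
    }
  }
  // Assumed column names; adjust to the actual thing table.
  const sql = `SELECT chain_id, address, label, defaults FROM thing WHERE ${clauses.join(" AND ")}`;
  return { sql, params, jsFilters };
}

// Simplistic dotted-number comparison; ignores pre-release tags.
function compareSemver(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const d = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (d !== 0) return Math.sign(d);
  }
  return 0;
}

// Applied in JS to the rows the narrowed SQL query returns.
function matchesJsFilters(defaults: Record<string, string>, jsFilters: SemverFilter[]): boolean {
  return jsFilters.every((f) => {
    const v = defaults[f.field];
    if (v === undefined) return false;
    const c = compareSemver(v, f.value);
    return f.op === ">=" ? c >= 0 : c < 0;
  });
}
```

The point of the split is that the database only returns rows that already satisfy the cheap predicates, so the JS semver pass runs over a smaller result set.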

Expected savings

These are directional estimates from the HackMD diagnosis, not proof of production impact. The real pass/fail check is the post-deploy pg_stat_statements resample after about 3 days.

Fanout timestamp discovery was the biggest measured row offender: about 4M calls / 3.5d, 2B rows, and 16 GB estimated egress. Bounding it to the requested window should plausibly remove 99%+ of returned rows for that query shape.
Strategy performance fetch had the largest CPU cost: about 91K calls / 3.5d and roughly 35 days of database CPU over that window. The bounded DISTINCT ON scan should turn the full-history CTE/JOIN into a recent-window lookup.
thing and evmlog projection changes reduce JSONB bytes returned at hot call sites. These are expected to help most where callers only need a few fields from large defaults / args blobs.
Overall expectation: targeted queryids should drop by at least 80% in returned rows, with the fanout query expected at 99%+. Invoice-level savings still need the production measurement window because JSONB/TOAST wire bytes are not fully captured by local tests.
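
As a rough worked example of the fanout figures above, the sketch below derives per-call and per-row numbers from the quoted baseline and projects a 99% row reduction. It assumes 1 GB = 1e9 bytes and that egress scales linearly with returned rows; the helper names are illustrative only.

```typescript
interface QueryStats {
  calls: number;
  rows: number;
  egressBytes: number;
}

// Fanout timestamp discovery baseline quoted above: ~4M calls, 2B rows, 16 GB over 3.5 days.
const fanoutBaseline: QueryStats = {
  calls: 4_000_000,
  rows: 2_000_000_000,
  egressBytes: 16e9, // assumes 1 GB = 1e9 bytes
};

function rowsPerCall(s: QueryStats): number {
  return s.rows / s.calls;
}

function bytesPerRow(s: QueryStats): number {
  return s.egressBytes / s.rows;
}

// Project stats after cutting returned rows by `rowReduction` (0..1),
// assuming egress scales linearly with rows and call volume is unchanged.
function project(s: QueryStats, rowReduction: number): QueryStats {
  const keep = 1 - rowReduction;
  return {
    calls: s.calls,
    rows: Math.round(s.rows * keep),
    egressBytes: Math.round(s.egressBytes * keep),
  };
}

const projected = project(fanoutBaseline, 0.99);
console.log(rowsPerCall(fanoutBaseline), bytesPerRow(fanoutBaseline), projected);
```

Under these assumptions the baseline works out to 500 rows per call at about 8 bytes per row, and a 99% reduction leaves roughly 20M rows and 160 MB over the same window. As noted above, JSONB/TOAST wire bytes are not captured by this kind of estimate.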

How to review

Start with:

packages/ingest/fanout/timeseries.ts
packages/ingest/abis/yearn/3/vault/snapshot/hook.ts
packages/ingest/things.ts
packages/ingest/helpers/apy-apr.ts

Then skim the small hook updates under packages/ingest/abis/yearn/** and packages/ingest/abis/erc4626/** to confirm shape-preserving query projection changes.

Behavior should stay the same for normal/current data. The intentional tradeoff is that current strategy performance and APY/APR helpers now ignore stale output outside their bounded lookback windows instead of returning old historical values.

Test plan

Automated checks run:

bun run --filter ingest lint
bunx tsc -p packages/ingest/tsconfig.json --noEmit
cd packages/ingest && bun run test things.spec.ts abis/yearn/2/vault/snapshot/hook.spec.ts abis/yearn/2/strategy/snapshot/hook.spec.ts
bun run --filter ingest test
bun run --filter lib test

What those checks cover:

Type safety and syntax of changed SQL call sites.
Existing behavior for things.get() filters and affected Yearn snapshot hooks.
No obvious shape regression in the tested ingest flows.

What tests do not prove:

They do not prove Neon egress reduction by themselves.
They do not prove pg_stat_statements.rows drops for production queryids.
They do not fully smoke-test GraphQL/API response shapes for frontend consumers.

GraphQL/API smoke: run representative frontend-style queries and confirm response shapes stay unchanged.

(
  GQL_URL="http://localhost:3001/api/gql"
  CHAIN_ID=1
  VAULT_ADDRESS="0x0000000000000000000000000000000000000000" # replace with a known production vault
  STRATEGY_ADDRESS="0x0000000000000000000000000000000000000000" # replace with one of that vault's strategies

  # Vault response shape
  curl -sS "$GQL_URL" \
    -H "content-type: application/json" \
    --data "$(jq -nc --argjson chainId "$CHAIN_ID" --arg address "$VAULT_ADDRESS" '{query:"query Vault($chainId:Int!,$address:String!){ vault(chainId:$chainId,address:$address){ chainId address name symbol apy { net weeklyNet monthlyNet } tvl { close } strategies { address name status performance { oracle { apr apy } historical { net weeklyNet monthlyNet inceptionNet } } } } }", variables:{chainId:$chainId,address:$address}}')" \
    | jq '{errors, vault: .data.vault | {chainId, address, name, symbol, apy, tvl, strategiesCount: (.strategies | length), firstStrategy: .strategies[0]}}'

  # Strategy response shape
  curl -sS "$GQL_URL" \
    -H "content-type: application/json" \
    --data "$(jq -nc --argjson chainId "$CHAIN_ID" --arg address "$STRATEGY_ADDRESS" '{query:"query Strategy($chainId:Int!,$address:String!){ strategy(chainId:$chainId,address:$address){ chainId address name apiVersion vault { address name } apr { net gross } reports { blockNumber transactionHash gain loss totalDebt } } }", variables:{chainId:$chainId,address:$address}}')" \
    | jq '{errors, strategy: .data.strategy | {chainId, address, name, apiVersion, vault, apr, reportsCount: (.reports | length), firstReport: .reports[0]}}'
)

Production measurement after deploy:

psql "$DATABASE_URL" -c "SELECT pg_stat_statements_reset();"
# Wait about 3 days, then sample targeted queryids. This emits JSON so reviewers can inspect with jq.
psql "$DATABASE_URL" -Atc "SELECT jsonb_pretty(jsonb_agg(jsonb_build_object('queryid', queryid, 'calls', calls, 'rows', rows, 'mean_exec_time', mean_exec_time, 'query', left(query, 220)) ORDER BY rows DESC)) FROM pg_stat_statements WHERE query ILIKE '%FROM output%' OR query ILIKE '%FROM thing%' OR query ILIKE '%FROM evmlog%';" | jq .

Expected production pass criteria:

Targeted queryids show at least 80% row reduction versus the HackMD baseline.
Fanout timestamp query shows 99%+ row reduction.
Targeted queries drop out of the top-N egress offenders, or fall materially lower within that ranking.
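
The pass criteria above could be checked mechanically along these lines. This is a sketch under stated assumptions: samples are keyed by a hypothetical human-assigned label rather than queryid (pg_stat_statements derives queryid from the parsed query, so a rewritten query will generally appear under a new queryid), and rows are normalized per day because the baseline window (3.5 days) and the resample window (about 3 days) differ. The non-fanout numbers in the usage are made up for illustration.

```typescript
interface RowSample {
  label: string; // hypothetical label tying a baseline query to its rewritten form
  rows: number;
  windowDays: number;
}

interface Verdict {
  label: string;
  reduction: number; // fraction of rows/day removed; NaN when not measured
  pass: boolean;
}

// Compare rows per day so that different sampling windows stay comparable.
function rowReduction(before: RowSample, after: RowSample): number {
  const beforePerDay = before.rows / before.windowDays;
  const afterPerDay = after.rows / after.windowDays;
  return 1 - afterPerDay / beforePerDay;
}

function evaluate(
  baseline: RowSample[],
  resample: RowSample[],
  thresholds: Record<string, number> = {},
  defaultThreshold = 0.8
): Verdict[] {
  const afterByLabel = new Map<string, RowSample>();
  for (const s of resample) afterByLabel.set(s.label, s);
  return baseline.map((before) => {
    const after = afterByLabel.get(before.label);
    // A query missing from the resample is reported as not measured, not as a pass.
    if (!after) return { label: before.label, reduction: NaN, pass: false };
    const reduction = rowReduction(before, after);
    const threshold = thresholds[before.label] ?? defaultThreshold;
    return { label: before.label, reduction, pass: reduction >= threshold };
  });
}

const verdicts = evaluate(
  [
    { label: "fanout", rows: 2_000_000_000, windowDays: 3.5 }, // baseline from the PR description
    { label: "strategy-performance", rows: 1000, windowDays: 3.5 }, // illustrative only
  ],
  [
    { label: "fanout", rows: 10_000_000, windowDays: 3 }, // illustrative only
    { label: "strategy-performance", rows: 400, windowDays: 3 }, // illustrative only
  ],
  { fanout: 0.99 }
);
console.log(verdicts);
```

With these illustrative inputs the fanout query clears its 99% bar, while the second query falls short of the 80% default.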

Risk / impact

No migrations or cache layer. Query changes are additive/narrowing, but production validation still needs the post-deploy pg_stat_statements resample above. Main behavior tradeoff: stale output outside the bounded lookback windows is no longer used for current strategy performance and APY/APR helpers.

vercel · 2026-05-07T21:39:40Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
kong	Ready	Preview, Comment	May 13, 2026 11:10pm

murderteeth

Request: split this PR for safer rollout and clean attribution

This PR bundles six different change shapes that all touch hot ingest paths: fanout window, strategy-performance query rewrite, current APY/APR lookback bounds, things.get() SQL pushdown, JSONB projection narrowing, and test runner cleanup. Two of those are behavior changes (dropping stale-output fallback past the lookback windows), not pure perf.

Bundled, this is hard to operate:

If anything regresses post-deploy, revert has to unwind six unrelated shapes at once.
pg_stat_statements deltas only attribute cleanly per deploy. Bundled, we see aggregate movement but can't tie a queryid improvement (or regression) to a specific change. That's the whole point of the HackMD baseline.
The fanout query is the 99%+ row-reduction win. Landing it on its own starts clawing back egress immediately instead of waiting on consensus about the smaller changes.

Suggested split

PR 1 — test runner cleanup

packages/lib/run-tests.ts default spec discovery.
Remove describe.only at packages/lib/strider.spec.ts:4 (still there on this branch — the PR only removes it.only).
Confirm bun run --filter lib test runs the full suite, not just strider.

PR 2 — fanout timestamp discovery

Just the series_time bounded discovery in packages/ingest/fanout/timeseries.ts.
Maps to the top HackMD offender (SELECT DISTINCT block_time FROM output…).
Deploy and measure before adding more.

PR 3 — current performance / output lookups

Bounded current APY/APR pivots. Replace the hardcoded 7 days with a dedicated env var, e.g. CURRENT_PERFORMANCE_LOOKBACK_DAYS, defaulting to 7.
Reuse that same env var for the 14-day strategy performance lookback. If the 7d/14d gap was deliberate (e.g. strategy reports lag), call out why and keep them named separately instead.
Preserve the single-timestamp invariant in fetchStrategyPerformance (packages/ingest/abis/yearn/3/vault/snapshot/hook.ts). The old CTE used GROUP BY address, label, so every component for a given (address, label) came from the same block_time. The new DISTINCT ON (address, label, component) shape picks the latest block_time per component independently, so a strategy can return net from one output point and weeklyNet from another.
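
To illustrate the single-timestamp concern, here is a small in-memory sketch of the two selection strategies. It models the semantics only and is not the SQL in fetchStrategyPerformance; the row shape, the sample address and label, and the helper names are hypothetical.

```typescript
interface OutputRow {
  address: string;
  label: string;
  component: string;
  blockTime: number;
  value: number;
}

// Mirrors DISTINCT ON (address, label, component) ... ORDER BY block_time DESC:
// each component independently keeps its own latest row.
function latestPerComponent(rows: OutputRow[]): OutputRow[] {
  const latest = new Map<string, OutputRow>();
  for (const r of rows) {
    const key = `${r.address}|${r.label}|${r.component}`;
    const cur = latest.get(key);
    if (!cur || r.blockTime > cur.blockTime) latest.set(key, r);
  }
  return Array.from(latest.values());
}

// Mirrors the old GROUP BY address, label shape: find one latest block_time
// per (address, label), then keep only the components recorded at that time.
function latestPerGroup(rows: OutputRow[]): OutputRow[] {
  const maxTime = new Map<string, number>();
  for (const r of rows) {
    const key = `${r.address}|${r.label}`;
    maxTime.set(key, Math.max(maxTime.get(key) ?? -Infinity, r.blockTime));
  }
  return rows.filter((r) => r.blockTime === maxTime.get(`${r.address}|${r.label}`));
}

function distinctTimes(rows: OutputRow[]): number {
  return new Set(rows.map((r) => r.blockTime)).size;
}

// Hypothetical data: the newest output point only recorded `net`.
const sample: OutputRow[] = [
  { address: "0xabc", label: "performance", component: "net", blockTime: 100, value: 0.05 },
  { address: "0xabc", label: "performance", component: "weeklyNet", blockTime: 100, value: 0.04 },
  { address: "0xabc", label: "performance", component: "net", blockTime: 200, value: 0.06 },
];

console.log(distinctTimes(latestPerComponent(sample))); // net from t=200 mixed with weeklyNet from t=100
console.log(distinctTimes(latestPerGroup(sample))); // only rows from t=200
```

The per-component shape returns more fields but mixes timestamps; the per-group shape keeps one timestamp and omits components that the latest point did not record. Which behavior is wanted is the decision this review item asks for.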

After 1–3 are measured, decide whether the remaining projection changes (thing.defaults, things.get() pushdown, evmlog.args keys) still carry their weight against the new baseline.

PR 1 is pretty lightweight; if convenient, combine it with PR 2.

matheus1lva · 2026-05-27T12:09:41Z

Split per review feedback:

#413, fix: test runner cleanup + fanout timestamp bounded discovery (PR 1+2 combined)
#414, fix: add lookback bounds to current performance queries: current performance lookback bounds + strategy performance CTE fix (PR 3)
Remaining projection changes (things.ts pushdown, first→firstRow, other JSONB projections) are deferred until #413 and #414 are measured.

Closing this in favor of the split PRs.

first phase of db improvements

be65499

matheus1lva added 2 commits May 8, 2026 21:47

saving

951af0f

fix(ingest): normalize series_time fanout

cc57e1b

vercel Bot deployed to Preview May 9, 2026 22:14 View deployment

Merge branch 'main' into fix/first-phase-egress

3ce59dc

yearn deleted a comment from matheusilva-stord May 13, 2026

pr updates

a4ea40b

vercel Bot deployed to Preview May 13, 2026 23:10 View deployment

matheus1lva marked this pull request as ready for review May 13, 2026 23:11

matheus1lva added the ready for review label May 13, 2026

murderteeth requested changes May 27, 2026

This was referenced May 27, 2026

fix: test runner cleanup + fanout timestamp bounded discovery #413

Merged

fix: add lookback bounds to current performance queries #414

Open

matheus1lva closed this May 27, 2026

matheus1lva wants to merge 5 commits into main from fix/first-phase-egress
