fix: PostgreSQL + TanStack Start migration (merge-ready) #9

Merged
fazil-raja-glean merged 30 commits into main from dev on Jun 1, 2026

Conversation

@ken-cavanagh-glean
Collaborator

Summary

Promotes the dev migration milestone to main: Seer moves from SQLite → PostgreSQL and the web app from Next.js → TanStack Start, plus the customer-readiness work (unified Glean API key, internal references removed). This PR also includes the pre-merge review fixes that make the migration actually merge-ready.

Two infra migrations land together:

  • Storage: SQLite (data/seer.db) → PostgreSQL via Docker Compose + Drizzle (pg, jsonb columns, drizzle-kit push).
  • Web: Next.js app router → TanStack Start (web/src/routes/*, web/src/server/*, feature slices under web/features/*). All 9 API routes reimplemented.

Pre-merge review fixes (this branch's final commit)

A full review of main...dev (155 files) surfaced blockers that are fixed here:

  • CI was guaranteed to fail: check.yml ran pnpm test:web-api-smoke with no Postgres. Added a postgres:16-alpine service, DATABASE_URL, and a db:push step.
  • Silent DB footgun: initializeDB swallowed a missing-schema error and permanently marked the DB initialized. It now throws an actionable error and only marks initialized on success.
  • CLI hangs + token misattribution: the open pg pool kept the event loop alive (CLI never exited on success), and --parallel runs used shared ledger state. The CLI now uses parseAsync + closeDB() and withLedgerContext per case (matching the web run-service).
  • PATCH /api/cases now returns 404 for a missing ID instead of an empty 200.
  • Smoke test injected a failing case by writing invalid JSON into the now-jsonb metadata column (rejected by Postgres at write). Reworked to fail the case via a sentinel query that yields an empty agent response.
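
The initializeDB fix above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `SchemaProbe` type, the `isInitialized` helper, and the error wording are assumptions made for the example, and the real function presumably queries Postgres rather than taking an injected probe.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not Seer's actual code.
// A probe is any async check that fails when the schema is missing,
// for example a trivial SELECT against a table that db:push should have created.
type SchemaProbe = () => Promise<void>;

let initialized = false;

async function initializeDB(probeSchema: SchemaProbe): Promise<void> {
  if (initialized) return;
  try {
    await probeSchema();
  } catch (cause) {
    // Surface the failure instead of swallowing it, and leave `initialized`
    // false so a later call can retry once the schema exists.
    throw new Error(
      `Database schema is missing or unreachable (${String(cause)}). ` +
        "Apply the schema (for example via the db:push step) and retry.",
    );
  }
  // Only reached when the probe succeeded.
  initialized = true;
}

function isInitialized(): boolean {
  return initialized;
}
```

The point of the ordering is that the flag is set after the awaited probe, so a failed first call does not poison later calls.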

Test plan

  • pnpm typecheck — pass
  • pnpm lint — pass
  • pnpm test — 79/79 pass
  • pnpm --filter web typecheck + build — pass
  • pnpm test:web-api-smoke — pass against a fresh Postgres (verified locally on a disposable container)
  • CI green on this PR (the point of the CI fix)

Follow-up / open decision

  • No migration path for existing data/seer.db data. Versioned migrations were removed in favor of drizzle-kit push. Greenfield is fine; anyone with an existing SQLite DB (local dev included) has no ETL path. Decide whether that data needs to be preserved before this is relied on.
  • Workflow agents no longer return traces/tool calls (public Agents Runs API move, documented in CHANGELOG) — intentional, but it does reduce faithfulness/hallucination grounding for workflow-agent runs.
  • Stale docs in a few places still reference SQLite (web/README.md, docs/architecture.md, CLAUDE.md).

kenneth cavanagh and others added 30 commits May 19, 2026 20:29
Replace text nav links with icon buttons (gear for settings).
Convert new eval set page into a modal with sticky header/footer,
backdrop blur, escape-to-close, and SSE abort on close. Remove
settings breadcrumb. /sets/new now redirects to dashboard.

Components: NewEvalSetContext (provider), Header, AppShell,
NewEvalSetModal, NewEvalSetButton.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rences

Migrate workflow agents from internal /rest/api/v1/runworkflow to public
/rest/api/v1/agents/runs/wait. Autonomous agents remain on /rest/api/v1/chat
with full trace data. Workflow agents lose traces (accepted limitation —
documented in docs/TRACE_API_LIMITATIONS.md).

Remove internal docs (gko-eval-presentation, harness-engineering-plan,
issues, glean-api-needs), unused @gleanwork/api-client dep, and all
scio-prod references. Fix settings password field rendering and CaseEditor
alert() calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… Start and Node based

Co-authored-by: Cursor <cursoragent@cursor.com>
Move the web app from Next.js/Bun to TanStack Start on React 19 with pnpm/Node runtime support. Preserve the Node CLI, shared SQLite/settings behavior, and migrate tests to Vitest.

The prompt snapshot diff is format-only from Bun snapshots to Vitest snapshots; prompt contents were compared by snapshot name and are unchanged.
Remove dead modal create flow, move new-eval-set and eval-config into
focused modules with React Query hooks, and fix set creation API.
Export AgentSchema from generate-agent and use it in CLI schema
fetch paths so agent/set/generate commands share typed field shapes.

Co-authored-by: Cursor <cursoragent@cursor.com>
Restore dev ui-refresh UX: gear-only header, global create-eval-set modal,
icon dashboard CTA, /sets/new redirect, settings breadcrumb removal, and
abort in-flight generation when the modal closes. Use native password input
for the API key field.
Restore web token usage recording, make failed web runs surface as failed, and bring active web code under Biome coverage with the route/service cleanup integrated.
Preserve failed and running runs in latest-run displays so the TanStack UI does not hide partial failures behind completed-only stats.
Defines the architecture for Seer as a deployable, cloud-agnostic
evaluation platform for Glean's solutions library. Covers repo
structure, OAuth passthrough auth, data model, API contract,
deployment model (Terraform), and 9-phase migration path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Frame Seer around Glean-specific knowledge-work evals, human calibration, macro behavior patterns, and business outcome evidence.
Drop temporary TanStack migration planning artifacts now that the migration has landed.
Replace better-sqlite3/bun:sqlite with pg (node-postgres) across CLI and
TanStack web app. Schema uses pgTable with typed jsonb columns via
$type<T>(), eliminating ~25 manual JSON.stringify/parse calls. Bootstrap
simplified from 180 lines of raw SQL to criteria seeding only — schema
management moves to drizzle-kit push.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use --env-file-if-exists so the CLI works locally with .env and in cloud
deployments where DATABASE_URL is set by the platform.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pre-merge review of dev surfaced several blockers in the SQLite→Postgres
and Next.js→TanStack Start migrations:

- CI: provision a Postgres service + DATABASE_URL + `db:push` step in
  check.yml so `pnpm test:web-api-smoke` (and any DB-touching job) can run.
- DB bootstrap: initializeDB no longer swallows a missing-schema error and
  permanently marks the DB initialized; it now throws an actionable error
  and only marks initialized on success.
- CLI: use withLedgerContext (AsyncLocalStorage) per case so --parallel runs
  attribute token usage to the correct case, and parseAsync + closeDB() so
  the open pg pool no longer keeps the event loop alive and hangs the CLI.
- web: PATCH /api/cases returns 404 when the case ID does not exist.
- smoke test: inject the partial-failure case via a sentinel query (empty
  agent response) instead of writing invalid JSON into the now-jsonb
  metadata column, which Postgres rejected at write time.

Verified: typecheck, lint, unit tests (79), web build, and web-api-smoke
(against a fresh Postgres) all pass.
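
To illustrate the withLedgerContext item above, here is a minimal sketch of per-case token attribution with Node's AsyncLocalStorage. Only the name withLedgerContext and the use of AsyncLocalStorage come from the commit message; the `Ledger` shape, `recordTokens`, `runCase`, and the signature shown are hypothetical.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical ledger shape; the real one likely tracks more than a token count.
interface Ledger {
  caseId: string;
  tokens: number;
}

const ledgerStore = new AsyncLocalStorage<Ledger>();

// Runs `fn` with a fresh ledger bound to its async call chain.
function withLedgerContext<T>(
  caseId: string,
  fn: () => Promise<T>,
): Promise<{ result: T; ledger: Ledger }> {
  const ledger: Ledger = { caseId, tokens: 0 };
  return ledgerStore.run(ledger, async () => ({ result: await fn(), ledger }));
}

// Called from deep inside a run; finds the ledger of the current case.
function recordTokens(count: number): void {
  const ledger = ledgerStore.getStore();
  if (!ledger) throw new Error("recordTokens called outside withLedgerContext");
  ledger.tokens += count;
}

// Stand-in for an agent call that finishes after `delayMs` and uses `tokens`.
async function runCase(caseId: string, tokens: number, delayMs: number): Promise<string> {
  await new Promise((resolve) => setTimeout(resolve, delayMs));
  recordTokens(tokens);
  return caseId;
}

// Two cases in flight at once, each with its own ledger.
async function runParallel(): Promise<Ledger[]> {
  const runs = await Promise.all([
    withLedgerContext("case-a", () => runCase("case-a", 100, 20)),
    withLedgerContext("case-b", () => runCase("case-b", 7, 5)),
  ]);
  return runs.map((run) => run.ledger);
}
```

With a single shared ledger object, the interleaved `recordTokens` calls would land in whichever case was set last; binding the ledger to the async context avoids that without threading it through every function.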

Co-authored-by: Cursor <cursoragent@cursor.com>
Route all outbound Glean requests through gleanFetch (undici) so
SEER_GLEAN_TLS_INSECURE=1 bypasses cert verification in local dev.
TLS/cert errors now fail fast (no retry) with actionable messages
in both CLI and web UI. Agent lookup card shows errors inline and
handles missing schemas gracefully.
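
The fail-fast behavior for TLS errors can be sketched as a retry wrapper that classifies the error before deciding to try again. This is an assumption-laden illustration, not gleanFetch itself: `isTlsError`, `withRetry`, the attempt count, and the message text are hypothetical. The error codes listed are standard Node/OpenSSL certificate-verification codes, and the sketch assumes the fetch error exposes the underlying error through `cause`.

```typescript
// Certificate-verification error codes reported by Node's TLS layer.
const TLS_ERROR_CODES = new Set<string>([
  "UNABLE_TO_VERIFY_LEAF_SIGNATURE",
  "UNABLE_TO_GET_ISSUER_CERT_LOCALLY",
  "SELF_SIGNED_CERT_IN_CHAIN",
  "DEPTH_ZERO_SELF_SIGNED_CERT",
  "CERT_HAS_EXPIRED",
]);

// Walks the `cause` chain (bounded depth) looking for a TLS error code.
function isTlsError(err: unknown): boolean {
  let current: unknown = err;
  for (let depth = 0; depth < 5; depth++) {
    if (typeof current !== "object" || current === null) return false;
    const { code, cause } = current as { code?: unknown; cause?: unknown };
    if (typeof code === "string" && TLS_ERROR_CODES.has(code)) return true;
    current = cause;
  }
  return false;
}

// Retries transient failures, but gives up immediately on TLS errors,
// since repeating the request cannot fix a certificate problem.
async function withRetry<T>(attempt: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let i = 1; i <= maxAttempts; i++) {
    try {
      return await attempt();
    } catch (err) {
      if (isTlsError(err)) {
        throw new Error(
          "TLS certificate verification failed; retrying will not help. " +
            "For local development only, SEER_GLEAN_TLS_INSECURE=1 bypasses verification.",
        );
      }
      lastError = err;
    }
  }
  throw lastError;
}
```

A real implementation would probably also add backoff between attempts; it is left out here to keep the classification logic visible.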

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add undici to web/package.json (was only in root, fragile hoisting)
- Defer initializeDB() to preAction hook so --help/--version work
  without a running database
- Update CLAUDE.md: SQLite → PostgreSQL (7 references)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PG migration, TanStack Start, TLS resilience.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Merge product vision and technical architecture into AGENTS.md.
CLAUDE.md is now a symlink so both Claude and Codex read the same
context. Updated architecture section for TanStack Start, PostgreSQL,
gleanFetch, and current file layout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CI failed because pnpm-lock.yaml was missing the undici specifier
for the web workspace package.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@fazil-raja-glean fazil-raja-glean merged commit 384213e into main Jun 1, 2026
7 checks passed

2 participants