
v0.28.5 fix(wave): PGLite upgrade wedge + embedding dim corruption + bun-link foot-gun #697

Merged
garrytan merged 22 commits into master from
garrytan/fix-wave-v0.28.5
May 7, 2026

Conversation

Owner

@garrytan garrytan commented May 7, 2026

Summary

Fix wave bundling 9 community PRs to unwedge users stuck since v0.27. Three independent failure clusters all close in this release:

Master's v0.28.1 (zombie-reap + health-timeout) merged in cleanly — no code-level conflicts, only version-string conflicts in VERSION/package.json/CHANGELOG.

Contributors: @brandonlipman (#682, #683, #684), @mdcruz88 (#668), @ChenyqThu (#627), @alan-mathison-enigma (#610), @oyi77 (#652 building block), @abkrim (#655), @100yenadmin / Eva (#641), @WilliamCourterWelch (#658 reporter).

Test Coverage

178 unit tests pass across the wave-touched files (migrate, schema-bootstrap-coverage, embedding-dim-check, upgrade, ai/, e2e/v0_28_5-fix-wave). Full unit suite: 3857 pass, 0 fail. bun run verify clean (privacy + jsonb + progress + test-isolation + wasm + admin-build + cli-exec + typecheck).

New E2E: test/e2e/v0_28_5-fix-wave.test.ts (6 tests, no DATABASE_URL needed)

  • Cluster A regression: pre-v0.20 brain (stripped v0.20+v0.26.3+v0.27 columns) re-runs initSchema cleanly + lands at LATEST_VERSION
  • hasPendingMigrations lifecycle (fresh / migrated / rewound / re-applied)
  • Cluster B fresh init at 768d AND 2048d: verifies no HNSW idx for >2000 (codex finding #8)
  • A4 mismatch message structure (HNSW conditional, docs/ link, four-step recipe)
  • Codex composite-index second-column case (provider_id in (job_id, provider_id))

New unit: test/embedding-dim-check.test.ts (5 tests) and SQL-parser coverage in test/schema-bootstrap-coverage.test.ts (5 tests, replaces hand-maintained array).

Pre-Landing Review

/plan-eng-review ran in plan mode (7 architecture decisions A1-A4, C1, T1, P1) and /codex consult (8 outside-voice findings, all folded into plan). Codex pivot X1 (post-upgrade auto-apply) corrected the headline outcome claim. All decisions are in the plan file at ~/.claude/plans/system-instruction-you-are-working-polymorphic-popcorn.md.

Existing E2E suite: 2 PGLite failures in dream-cycle-eight-phase-pglite.test.ts are pre-existing on master (test predates v0.26.5's 9th purge phase). Postgres-required E2E skip without DATABASE_URL (existing test infra).

Plan Completion

All 11 implementation steps DONE per the planning artifact. Codex's 8 follow-up corrections folded into the plan and into commits. UNRESOLVED: 0.

TODOS

Deferred to v0.29 / v0.28.6 (captured in CHANGELOG + plan file):

Test plan

  • All wave-touched unit tests pass (178 tests, 559 expects)
  • All schema-bootstrap-coverage tests pass (5/5 including auto-derived parser test)
  • All embedding-dim-check tests pass (5/5 including HNSW conditional)
  • New E2E test/e2e/v0_28_5-fix-wave.test.ts passes (6/6, PGLite-only, no DATABASE_URL needed)
  • bun run verify clean (8 checks: privacy, jsonb, progress, test-isolation, wasm, admin-build, cli-exec, typecheck)
  • Master merged in cleanly, post-merge tests still pass
  • Postgres-side E2E (test/e2e/postgres-bootstrap.test.ts) — needs DATABASE_URL; deferred to CI
  • Manual smoke: bun add -g gbrain@1.3.1 → gbrain upgrade shows recovery message (release-only)
  • Manual smoke: existing 1536d brain + gbrain init --embedding-dimensions 768 → exit 1 with recipe (release-only)

🤖 Generated with Claude Code



brandonlipman and others added 22 commits May 6, 2026 11:58
…otstrap

The forward-reference bootstrap (PostgresEngine + PGLiteEngine
applyForwardReferenceBootstrap) covered v0.18 + v0.19 + v0.26.5 columns
but missed two later groups. Brains upgrading from v0.14-era to current
master crash before the migration ladder runs:

1. v0.20 Cathedral II — content_chunks.search_vector,
   parent_symbol_path, doc_comment, symbol_name_qualified.
   `CREATE INDEX idx_chunks_search_vector` and
   `CREATE INDEX idx_chunks_symbol_qualified` in schema.sql/PGLITE_SCHEMA_SQL
   crash with "column search_vector does not exist" / "column
   symbol_name_qualified does not exist".

2. v0.26.3 — mcp_request_log.agent_name, params, error_message.
   `CREATE INDEX idx_mcp_log_agent_time ON mcp_request_log(agent_name,...)`
   crashes with "column agent_name does not exist".

Reproduces deterministically on a v0.13/v0.14 brain upgraded straight
to current master. The user hits the wall before any of v15-v36 can run.

Both engines now probe for these columns and pre-add them via
`ALTER TABLE ADD COLUMN IF NOT EXISTS` before SCHEMA_SQL runs. Migrations
v26, v27, v33 still run later via runMigrations and remain idempotent
(they handle backfill on top of the bootstrap-added columns).
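
A minimal sketch of the probe-then-pre-add idea described above, not the repository's `applyForwardReferenceBootstrap`. The `ForwardRef` shape, the `buildBootstrapStatements` helper, and the SQL types are assumptions made for illustration; only the table and column names come from this commit message.

```typescript
// Hypothetical shape: one column that SCHEMA_SQL indexes before its migration runs.
interface ForwardRef {
  table: string;
  column: string;
  sqlType: string; // assumed types, for illustration only
}

// Column names are taken from this commit message; the SQL types are guesses.
const FORWARD_REFS: ForwardRef[] = [
  { table: "content_chunks", column: "search_vector", sqlType: "tsvector" },
  { table: "content_chunks", column: "symbol_name_qualified", sqlType: "text" },
  { table: "mcp_request_log", column: "agent_name", sqlType: "text" },
];

// Given the "table.column" pairs the probe found, return the ALTER statements
// that must run before SCHEMA_SQL so its CREATE INDEX statements cannot crash.
function buildBootstrapStatements(existing: Set<string>, refs: ForwardRef[]): string[] {
  return refs
    .filter((r) => !existing.has(`${r.table}.${r.column}`))
    .map((r) => `ALTER TABLE ${r.table} ADD COLUMN IF NOT EXISTS ${r.column} ${r.sqlType}`);
}
```

Because `ADD COLUMN IF NOT EXISTS` is idempotent, re-running the generated statements on an already-bootstrapped brain is harmless, which is why the later migrations can still run unchanged.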

Test coverage extended in test/schema-bootstrap-coverage.test.ts:
REQUIRED_BOOTSTRAP_COVERAGE now lists 6 new forward references; the
strip-and-rebuild block drops the corresponding indexes/triggers so the
test exercises a brain that pre-dates v0.20 + v0.26.3 migrations.

Repro: brain on schema v13/v14 + run `gbrain init --migrate-only` against
current master → fails. With this patch → succeeds; ladder runs to v36.
PR #682 covered v0.20 (chunks) + v0.26.3 (mcp_request_log) but missed
v0.27's subagent_messages.provider_id. The composite index
`idx_subagent_messages_provider ON subagent_messages (job_id, provider_id)`
in PGLITE_SCHEMA_SQL crashes on brains pinned at v0.18-v0.26 because
provider_id is the SECOND column in the composite — array-extraction
patterns that scan only first-column references miss it entirely.

This is the wedge surfaced by issue #670 (v0.22.0 → v0.27.0 init
--migrate-only crashes with "column 'provider_id' does not exist") and
contributing to #661/#657.

Both engines now probe for subagent_messages.provider_id and pre-add
the column via ALTER TABLE ADD COLUMN IF NOT EXISTS before SCHEMA_SQL
runs. Migration v36 (subagent_provider_neutral_persistence_v0_27) still
runs later via runMigrations and remains idempotent.

Note on the test side: REQUIRED_BOOTSTRAP_COVERAGE is hand-maintained
and just gained a v0.27 entry. v0.28.5's Step 3 replaces this array
with a SQL parser that auto-derives coverage from PGLITE_SCHEMA_SQL,
including composite-index columns. This commit is the targeted
follow-up to PR #682's cherry-pick; A2's parser closes the class
permanently.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `hasPendingMigrations(engine)` next to `runMigrations` in migrate.ts:
single getConfig('version') probe, returns true when current < LATEST_VERSION,
defensively returns true on getConfig failure (treats wedged-config as pending).

`connectEngine` in cli.ts now wraps `engine.initSchema()` in a probe gate:
short-lived CLI calls (gbrain stats, query, doctor, etc.) on already-migrated
brains skip the bootstrap-probe + SCHEMA_SQL replay + ledger-check entirely.
Wedged brains still auto-heal — the probe says "yes pending" and initSchema
runs as before.
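
The probe and gate described above could look roughly like the sketch below. This is not the code in migrate.ts or cli.ts: the `ProbeEngine` interface, the `versionIsBehind` helper, the `connectWithProbe` name, and the value 36 for the latest version are assumptions.

```typescript
// Minimal stand-in for the engine surface this sketch needs (assumed shape).
interface ProbeEngine {
  getConfig(key: string): Promise<string | null>;
  initSchema(): Promise<void>;
}

// Pure core: a missing or unparseable version counts as "behind".
function versionIsBehind(raw: string | null | undefined, latest: number): boolean {
  const current = Number.parseInt(raw ?? "", 10);
  return Number.isNaN(current) || current < latest;
}

// Single getConfig('version') probe; any failure is treated as pending so a
// wedged config table still routes into initSchema.
async function hasPendingMigrations(engine: ProbeEngine, latest = 36): Promise<boolean> {
  try {
    return versionIsBehind(await engine.getConfig("version"), latest);
  } catch {
    return true;
  }
}

// Probe gate: pay for initSchema only when the probe reports pending work.
// Returns true when initSchema actually ran.
async function connectWithProbe(engine: ProbeEngine): Promise<boolean> {
  if (!(await hasPendingMigrations(engine))) return false;
  await engine.initSchema();
  return true;
}
```

Keeping the comparison in a pure helper makes the three cases named in the test coverage (fully migrated, rewound, missing version) checkable without a database.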

Building on oyi77's investigation in PR #652. Same correctness as #652's
unconditional initSchema-on-every-connect, but no perf regression on the
hot path. Failure non-fatal: if probe or init throws, log a hint and let
subsequent operations surface the real error in context.

Test coverage in test/migrate.test.ts: 3 cases covering fully-migrated
(false), version-rewound (true), and missing-version-config (defensive
true). Pairs with v0.28.5's X1 (post-upgrade auto-apply) — the upgrade
path runs initSchema explicitly while every other code path that goes
through connectEngine gets the cheap probe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Prior behavior: `gbrain upgrade` → `gbrain post-upgrade` → `apply-migrations`
only WARNs at apply-migrations.ts:296-302 when schema version is behind
LATEST_VERSION, telling the user to run `gbrain init --migrate-only`. 11
wedge incidents over 2 years have proven users don't read that WARN —
they file an issue instead.

This commit makes `runPostUpgrade` explicitly call `engine.initSchema()`
after the orchestrator migration pass, mirroring `init --migrate-only`'s
flow. Side-effect: `gbrain upgrade` now walks away with a healthy brain
in the cluster A wedge case (#670, #661, #657, #651, #625, #615, #609).

Defensive: wrapped in try/catch so a connection or DDL failure falls
back to the existing user-facing WARN. The hint to run
`gbrain init --migrate-only` is preserved as the manual escape hatch.

Pairs with v0.28.5's A1 (hasPendingMigrations probe in connectEngine):
the upgrade path runs initSchema explicitly here, while every other code
path that goes through connectEngine gets the cheap probe.

Codex outside-voice review caught this gap during plan review: "the plan
still does not prove `upgrade` will actually run schema migrations."
This is the load-bearing fix that makes v0.28.5's headline outcome
("run upgrade, brain works") literally true for cluster A.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the hand-maintained REQUIRED_BOOTSTRAP_COVERAGE assertion with a
SQL-parser-backed structural check. The new test:

1. parseIndexColumnReferences(PGLITE_SCHEMA_SQL) extracts every column
   referenced by every CREATE INDEX — including composite-index second
   and third columns. Codex outside-voice review caught that earlier
   first-col-only patterns missed v0.27's
   `idx_subagent_messages_provider ON subagent_messages (job_id, provider_id)`,
   which is exactly how the v0.28.5 wedge happened.
2. parseBaseTableColumns(PGLITE_SCHEMA_SQL) extracts every column declared
   in CREATE TABLE bodies (including via ALTER TABLE ADD COLUMN inside
   the schema blob).
3. parseAlterAddColumns(pglite-engine.ts source) extracts every column
   that applyForwardReferenceBootstrap adds.
4. Static contract: every (table, column) pair from step 1 must appear in
   either step 2 or step 3. Otherwise the test fails loud, names every
   uncovered pair, and points at the bootstrap function for the fix.
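
As a rough illustration of steps 1 and 4, the sketch below extracts every column from `CREATE INDEX` statements, including composite, function-wrapped, and operator-class forms, and then reports uncovered pairs. It reuses the name `parseIndexColumnReferences` for readability but is not the test file's implementation; `splitTopLevel`, `extractColumn`, and `findUncovered` are hypothetical helpers, and the parsing rules are simplified.

```typescript
interface ColumnRef {
  table: string;
  column: string;
}

// Split a column list on commas that are not nested inside parentheses.
function splitTopLevel(list: string): string[] {
  const parts: string[] = [];
  let depth = 0;
  let current = "";
  for (let i = 0; i < list.length; i++) {
    const ch = list[i];
    if (ch === "(") depth++;
    if (ch === ")") depth--;
    if (ch === "," && depth === 0) {
      parts.push(current.trim());
      current = "";
    } else {
      current += ch;
    }
  }
  if (current.trim().length > 0) parts.push(current.trim());
  return parts;
}

// Reduce one index expression to a column name: a function-wrapped column
// yields the first identifier inside the parentheses; otherwise the leading
// identifier wins, which drops operator-class suffixes and sort directions.
function extractColumn(expr: string): string | null {
  const wrapped = /\(\s*"?(\w+)"?/.exec(expr);
  if (wrapped) return wrapped[1];
  const bare = /^"?(\w+)"?/.exec(expr);
  return bare ? bare[1] : null;
}

function parseIndexColumnReferences(sql: string): ColumnRef[] {
  const refs: ColumnRef[] = [];
  const header =
    /CREATE\s+(?:UNIQUE\s+)?INDEX\s+(?:IF\s+NOT\s+EXISTS\s+)?\w+\s+ON\s+(\w+)\s*(?:USING\s+\w+\s*)?\(/gi;
  let match: RegExpExecArray | null;
  while ((match = header.exec(sql)) !== null) {
    const table = match[1];
    // Scan to the parenthesis that balances the one the header ended on.
    const start = header.lastIndex;
    let depth = 1;
    let i = start;
    while (i < sql.length && depth > 0) {
      if (sql[i] === "(") depth++;
      else if (sql[i] === ")") depth--;
      i++;
    }
    if (depth !== 0) break; // unbalanced input: stop rather than guess
    const columns = splitTopLevel(sql.slice(start, i - 1));
    for (let c = 0; c < columns.length; c++) {
      const column = extractColumn(columns[c]);
      if (column !== null) refs.push({ table, column });
    }
  }
  return refs;
}

// Static contract: every indexed (table, column) pair must be covered by the
// base schema or by bootstrap. Returns the pairs that are not.
function findUncovered(refs: ColumnRef[], covered: Set<string>): string[] {
  return refs.map((r) => `${r.table}.${r.column}`).filter((key) => !covered.has(key));
}
```

A test built on this would fail whenever `findUncovered` returns a non-empty list and print the pairs, which is the "fails loud, names every uncovered pair" behavior described in step 4.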

Self-updating: any future CREATE INDEX added to PGLITE_SCHEMA_SQL on a
column that bootstrap doesn't yet provide fails this test at PR time. No
human required to remember to update an array. Closes the 11-incident
wedge class identified in CLAUDE.md (#239, #243, #266, #357, #366, #374,
#375, #378, #395, #396).

Helper parsers also have their own unit tests covering composite-index
second columns, function-wrapped columns (lower(col)), HNSW operator-class
suffixes (vector_cosine_ops), and ALTER TABLE column extraction. Existing
REQUIRED_BOOTSTRAP_COVERAGE-based tests preserved as a coarse-grained
lower bound; the new parser-based test is the load-bearing structural
gate going forward.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add voyage-4-large/4/4-lite/4-nano + domain models to Voyage recipe
- Fix AI SDK compatibility: strip encoding_format (Voyage rejects 'float'),
  patch response to add prompt_tokens from total_tokens
- Add embedding_provider doctor check: live smoke test verifying model,
  API key, dimensions, and DB column alignment
- Add embedding provider eval qrels for post-migration quality testing

Closes: Voyage AI integration for gbrain embedding pipeline
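
The two compatibility fixes in the list above amount to a request rewrite and a response rewrite. The sketch below shows both as pure functions under assumed shapes; `stripEncodingFormat`, `patchUsage`, `Usage`, and `EmbeddingResponse` are hypothetical names, and the real shim sits inside a custom fetch handler rather than operating on plain objects.

```typescript
// Request side: drop encoding_format before the body is sent to the provider.
function stripEncodingFormat(body: Record<string, unknown>): Record<string, unknown> {
  const copy: Record<string, unknown> = { ...body };
  delete copy["encoding_format"];
  return copy;
}

interface Usage {
  total_tokens?: number;
  prompt_tokens?: number;
}

// Assumed minimal response shape; real responses carry more fields.
interface EmbeddingResponse {
  data: unknown[];
  usage?: Usage;
}

// Response side: when only total_tokens is reported, mirror it into
// prompt_tokens so a caller that expects prompt_tokens does not break.
function patchUsage(response: EmbeddingResponse): EmbeddingResponse {
  const usage = response.usage;
  if (!usage || usage.prompt_tokens !== undefined || usage.total_tokens === undefined) {
    return response;
  }
  return { ...response, usage: { ...usage, prompt_tokens: usage.total_tokens } };
}
```

Both functions return new objects instead of mutating their input, so the original request and response remain available for logging.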
Voyage's tokenizer is 3-4x denser than OpenAI tiktoken, causing batches
of 50+ texts to exceed the 120K token-per-batch limit even when DB
token counts (from tiktoken) suggest they'd fit.

Changes:
- Add max_batch_tokens to EmbeddingTouchpoint type (provider-declared limit)
- Set Voyage recipe to 120K token limit
- Gateway embed() now auto-splits batches using conservative char-to-token
  estimate (1:1 ratio, 80% budget utilization)
- On token-limit errors, embedSubBatch recursively halves and retries
  (down to single-text batches before giving up)
- Reduce embedding.ts BATCH_SIZE from 100 to 50 as a secondary guard
- Add tests for batch splitting logic and error pattern matching

Fixes infinite retry loops where the same oversized batch would fail
repeatedly because WHERE embedding IS NULL re-fetches identical rows.
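
The batching changes above combine a pre-split by estimated tokens with a halve-and-retry fallback. The sketch below illustrates both under simplifying assumptions: the embed call is synchronous, and `splitByTokenBudget`, `embedWithHalving`, `EmbedFn`, and the error predicate are hypothetical names rather than the gateway's own.

```typescript
// Pre-split: estimate 1 token per character and fill each batch to at most
// utilization * maxBatchTokens. A single oversized text gets its own batch.
function splitByTokenBudget(texts: string[], maxBatchTokens: number, utilization = 0.8): string[][] {
  const budget = Math.floor(maxBatchTokens * utilization);
  const batches: string[][] = [];
  let current: string[] = [];
  let used = 0;
  for (let i = 0; i < texts.length; i++) {
    const cost = texts[i].length;
    if (current.length > 0 && used + cost > budget) {
      batches.push(current);
      current = [];
      used = 0;
    }
    current.push(texts[i]);
    used += cost;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

// Synchronous stand-in for the provider call (the real call is asynchronous).
type EmbedFn = (batch: string[]) => number[][];

// Fallback: on a token-limit error, halve the batch and retry each half,
// giving up only when a single-text batch still fails.
function embedWithHalving(
  batch: string[],
  embed: EmbedFn,
  isTokenLimitError: (err: unknown) => boolean,
): number[][] {
  try {
    return embed(batch);
  } catch (err) {
    if (!isTokenLimitError(err) || batch.length <= 1) throw err;
    const mid = Math.ceil(batch.length / 2);
    const left = embedWithHalving(batch.slice(0, mid), embed, isTokenLimitError);
    const right = embedWithHalving(batch.slice(mid), embed, isTokenLimitError);
    return left.concat(right);
  }
}
```

Halving changes the batch on every retry, which is what breaks the loop where the same oversized batch was re-fetched and failed again.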
… recipe

Adds A4 hard-error path: when `gbrain init --embedding-dimensions N` is
run against an existing brain whose `content_chunks.embedding` column is
a different `vector(M)`, init exits 1 with an inline four-step ALTER
recipe and a pointer to docs/embedding-migrations.md.

This kills the silent-corruption pattern surfaced by issue #673: the
v0.27 schema seeded `('embedding_dimensions', '1536')` regardless of the
flag, so users got a config saying 768 but a column at 1536 — first
sync write blew up with "expected 1536, got 768."

A4's contract:
  1. Connect to engine BEFORE saveConfig so we can read the live column type
  2. If column exists AND dim != requested, exit 1 (loud failure)
  3. If column doesn't exist (fresh init) OR dim matches, proceed normally
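
The contract above reduces to a small decision over the live column dimension. The sketch below assumes the column type is available as text such as `vector(1536)`; `parseVectorDim`, `checkEmbeddingDim`, and `DimCheck` are hypothetical names, not the helpers in src/core/embedding-dim-check.ts.

```typescript
// Parse a pgvector column type such as "vector(1536)" into its dimension.
// Returns null when the column is absent or the type has no dimension.
function parseVectorDim(columnType: string | null): number | null {
  if (columnType === null) return null;
  const m = /^vector\((\d+)\)$/i.exec(columnType.trim());
  return m ? Number.parseInt(m[1], 10) : null;
}

type DimCheck = { ok: true } | { ok: false; existing: number; requested: number };

// Fresh init (no column) or a matching dimension proceeds; anything else is a
// mismatch that the caller should turn into a loud exit.
function checkEmbeddingDim(existingDim: number | null, requested: number): DimCheck {
  if (existingDim === null || existingDim === requested) return { ok: true };
  return { ok: false, existing: existingDim, requested };
}
```

The check has to run before the new dimension is saved to config, as in step 1 of the contract, so a failed init never leaves config and column disagreeing.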

Recipe in docs/embedding-migrations.md (and inlined in init's error
output) covers all four destructive steps codex's plan-review caught:
  1. DROP INDEX IF EXISTS idx_chunks_embedding (HNSW won't survive ALTER)
  2. ALTER TABLE content_chunks ALTER COLUMN embedding TYPE vector(N)
  3. UPDATE content_chunks SET embedding = NULL, embedded_at = NULL
  4. CREATE INDEX HNSW *only if N <= 2000* (pgvector cap)

Step 4 is conditional: dims > 2000 (e.g. Voyage 4 Large 2048d) cannot
be HNSW-indexed in pgvector; the recipe explicitly says "Skip reindex"
in that case so the user doesn't paste a CREATE INDEX that crashes.
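
A sketch of how the conditional fourth step might be generated. The step order follows the recipe as listed above; `buildDimMigrationRecipe` and `HNSW_MAX_DIMS` are hypothetical names, and the exact `CREATE INDEX` text is an assumption pieced together from the index name and operator class mentioned elsewhere in this PR.

```typescript
const HNSW_MAX_DIMS = 2000; // pgvector's HNSW cap, per this commit message

// Build the four-step recipe for moving content_chunks.embedding to newDim.
function buildDimMigrationRecipe(newDim: number): string[] {
  const steps = [
    "DROP INDEX IF EXISTS idx_chunks_embedding;",
    `ALTER TABLE content_chunks ALTER COLUMN embedding TYPE vector(${newDim});`,
    "UPDATE content_chunks SET embedding = NULL, embedded_at = NULL;",
  ];
  if (newDim <= HNSW_MAX_DIMS) {
    // Assumed index definition; adjust to match the real schema.
    steps.push(
      "CREATE INDEX idx_chunks_embedding ON content_chunks USING hnsw (embedding vector_cosine_ops);",
    );
  } else {
    steps.push(`-- Skip reindex: ${newDim} dims exceeds the HNSW cap of ${HNSW_MAX_DIMS}.`);
  }
  return steps;
}
```

Emitting a SQL comment instead of omitting the step keeps the recipe at four lines in both cases, so the user can see that the reindex was skipped on purpose.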

Helper `readContentChunksEmbeddingDim` and message builder
`embeddingMismatchMessage` live in src/core/embedding-dim-check.ts so
doctor 8b (next commit) can reuse the same source of truth.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ipe (#672)

Previous error message recommended running `gbrain migrate --embedding-model
… --embedding-dimensions …`, but `gbrain migrate` only handles engine
migration (postgres ↔ pglite), not embedding reconfiguration. Following
that hint produced a different error and confused users further.

New message:
  - Names the actual options: change models OR migrate the existing brain
  - Inlines a one-line quick recipe (DROP INDEX → ALTER → UPDATE NULL →
    config set → embed --stale)
  - Points at docs/embedding-migrations.md (added in commit 306fc0e)
    for the full four-step recipe with HNSW conditional handling

Closes #672. Note: #671 (config show hides embedding_model / dimensions)
appears to be already fixed on master — `Object.entries(loadConfig())`
in config.ts:24 correctly enumerates all keys including embedding_*. Will
close #671 with that note when shipping v0.28.5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#665's doctor 8b dim-probe used `engine.sql\`...\`` directly (Postgres
template literal) which doesn't typecheck against the BrainEngine
interface (only PostgresEngine has the .sql getter; PGLite does not).
Refactored to use `readContentChunksEmbeddingDim` from
src/core/embedding-dim-check.ts — same helper init's A4 hard-error
path uses, runs portably on both engines.

#680's Voyage fetch-shim passes a custom fetch handler to
`createOpenAICompatible` for the encoding_format + prompt_tokens
normalization. The SDK accepts the field at runtime but the typed
parameter on the pinned version doesn't expose it. Cast to the
parameter type so the shim ships without a type error.

Both fixes are mechanical cleanup of cherry-picked PRs that didn't
typecheck against current master's stricter shape. No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`package.json` declares `"bin": { "gbrain": "src/cli.ts" }`, and bun's
linker creates `~/.bun/bin/gbrain` as a symlink to the file. The shebang
`#!/usr/bin/env bun` takes effect only when the target file is executable.
Without the exec bit, bun can still run the file when it is passed as a
script argument, but executing the symlinked target itself fails:

  $ ls -la ~/.bun/bin/gbrain
  lrwxrwxrwx ... -> ../install/global/node_modules/gbrain/src/cli.ts
  $ ~/.bun/bin/gbrain --version
  /opt/homebrew/bin/bash: line 1: /Users/brandon/.bun/bin/gbrain: Permission denied

This bites the postinstall hook that calls `gbrain apply-migrations`
(masked by the `||` fallback) and any subprocess that invokes the
binary by absolute path (e.g., subagent_messages migration v0.16's
`execSync('gbrain init --migrate-only', ...)`).

Setting the mode in-tree to 755 fixes both. No content change.
Cluster C cherry-pick (#683) restored the executable bit on src/cli.ts.
This commit adds scripts/check-cli-executable.sh that asserts the git
index mode is 100755 and wires it into `bun run verify` (and check:all).

Why a CI guard: bun-link installs symlink to src/cli.ts directly. If the
mode bit ever regresses to 100644, the very first `gbrain --version`
fails with `permission denied` — the exact symptom that motivated #683.
This guard runs in <100ms, fast enough for the inner verify loop.

Failure mode: clear instructions on what command to run to fix
(`chmod +x src/cli.ts && git add --chmod=+x src/cli.ts`) plus a pointer
back to issue #683 so future maintainers know why the guard exists.
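
The guard itself is a shell script; the TypeScript sketch below only illustrates the check it performs, assuming the index entry is read with `git ls-files -s src/cli.ts`, whose output has the form `<mode> <object> <stage><TAB><path>`. `parseIndexMode` and `checkCliExecutable` are hypothetical names.

```typescript
// Extract the six-digit mode from one line of `git ls-files -s` output.
function parseIndexMode(line: string): string | null {
  const m = /^(\d{6}) [0-9a-f]+ \d\t/.exec(line);
  return m ? m[1] : null;
}

// Pass only when the index records src/cli.ts as an executable regular file.
function checkCliExecutable(lsFilesLine: string): { ok: boolean; message: string } {
  const mode = parseIndexMode(lsFilesLine);
  if (mode === "100755") {
    return { ok: true, message: "src/cli.ts is executable in the git index" };
  }
  return {
    ok: false,
    message:
      `src/cli.ts has index mode ${mode ?? "unknown"}; run: ` +
      "chmod +x src/cli.ts && git add --chmod=+x src/cli.ts (see #683)",
  };
}
```

Checking the index mode rather than the working-tree permission means the guard tests what a fresh clone or install will actually receive.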

Note: the filesystem exec bit matters on darwin and linux only. Windows
preserves the git-stored mode regardless of filesystem chmod, so the
index-mode check works the same on every platform CI uses.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rewrites detectInstallMethod() in src/commands/upgrade.ts:247 with three
layered signals per v0.28.5 plan cluster D + codex finding C1:

1. bun-link signal (closes #656): when argv[1] is a symlink, walk up
   from realpath(argv[1]) up to 6 levels looking for a .git/config whose
   contents include `garrytan/gbrain` (case-insensitive substring).
   Returns 'bun-link'. Best-effort: forks, tarballs, and detached source
   trees fall through to the existing chain.

2. canonical bun authenticity check (closes #658 detection half): when
   the install lives in node_modules, read package.json and verify
   repository.url contains `garrytan/gbrain` OR src/cli.ts coexists
   (squatter ships compiled binary, not source). On 'suspect' verdict,
   print printSquatterRecovery() — names both git-clone AND
   release-binary recovery paths so users without a local clone can
   still recover.

3. Source-marker fallback inside (2). Codex flagged this is spoofable
   by a determined squatter; accepted — best-effort warning, not
   assertion. The structural fix is publishing under @garrytan/gbrain
   (tracked v0.29 follow-up).
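
The bounded walk-up in signal 1 can be sketched as below. This is an illustration, not the code in src/commands/upgrade.ts: `findSourceClone` is a hypothetical name, the file reader is injected so the sketch has no filesystem dependency, and only POSIX-style paths are handled.

```typescript
// Walk up from a resolved file path, checking at most maxLevels directories
// for a .git/config that mentions the expected repo (case-insensitive).
// readFile returns the file contents, or null when the file does not exist.
function findSourceClone(
  resolvedPath: string,
  readFile: (path: string) => string | null,
  repoSlug = "garrytan/gbrain",
  maxLevels = 6,
): string | null {
  const parts = resolvedPath.split("/").filter((p) => p.length > 0);
  parts.pop(); // start from the directory that contains the file
  for (let level = 0; level < maxLevels && parts.length > 0; level++) {
    const dir = "/" + parts.join("/");
    const config = readFile(dir + "/.git/config");
    if (config !== null && config.toLowerCase().indexOf(repoSlug.toLowerCase()) !== -1) {
      return dir;
    }
    parts.pop();
  }
  return null;
}
```

Returning null for forks, tarballs, and detached trees lets the caller fall through to the remaining detection chain, matching the best-effort behavior described in signal 1.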

The squatter's `name: gbrain` field doesn't disambiguate (codex caught
this in plan review of my original heuristic). repository.url is the
field a careless squatter is least likely to set correctly; src/cli.ts
presence is the secondary signal.
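
A minimal sketch of the two-signal check, assuming the verdict is a simple OR of the signals. The function name mirrors `classifyBunInstall` from the test description, but the signature, the `PackageJsonLike` shape, and the `'canonical'` verdict label are assumptions; only `'suspect'` appears in this commit message.

```typescript
type InstallVerdict = "canonical" | "suspect";

// Only the package.json fields this sketch reads.
interface PackageJsonLike {
  name?: string;
  repository?: string | { url?: string };
}

// Signal 1: repository URL mentions garrytan/gbrain.
// Signal 2: src/cli.ts exists next to package.json.
// The name field is deliberately ignored because a squatter can set it freely.
function classifyBunInstall(pkg: PackageJsonLike, hasSourceCli: boolean): InstallVerdict {
  const repo =
    typeof pkg.repository === "string" ? pkg.repository : pkg.repository?.url ?? "";
  const repoMatches = repo.indexOf("garrytan/gbrain") !== -1;
  return repoMatches || hasSourceCli ? "canonical" : "suspect";
}
```

Both signals are spoofable, as the commit message notes, so a `'suspect'` verdict is best used to trigger a warning rather than to block the command.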

bun-link installs return 'bun-link' from the switch in runUpgrade, which
prints the source-clone upgrade path (`git pull && bun install && bun
link`) instead of trying `bun update gbrain` which doesn't apply.

README updated with the corresponding "DO NOT use `bun add -g gbrain`"
callout naming both #658 and the v0.29 scoped-name plan.

Tests in test/upgrade.test.ts cover return-type extension, bun-link
signal shape, classifyBunInstall's two-signal check, and the recovery
message contents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n-link foot-gun

Fix wave bundling 9 community PRs to unwedge users stuck since v0.27.

Cluster A — PGLite upgrade wedge (#670, #661, #657, #651, #625, #615, #609):
  - Bootstrap now covers v0.20+v0.26.3+v0.27 forward references (both engines)
  - hasPendingMigrations() probe gates initSchema() in connectEngine
  - Post-upgrade auto-applies pending schema migrations (X1)
  - SQL-parser-backed bootstrap coverage replaces hand-maintained array (A2)

Cluster B — Embedding dim corruption (#673, #672, #666, #640):
  - Schema templating cascade fixed end-to-end (#641 from @100yenadmin)
  - gbrain doctor 8b live embedding-provider probe (#665)
  - Voyage adaptive batch sizing for 120K-token cap (#680)
  - gbrain init A4 hard-error on existing-brain dim mismatch
  - docs/embedding-migrations.md with conditional-HNSW four-step recipe
  - #672 misleading migrate-suggestion error replaced with inline recipe

Cluster C — CLI exec bit (#683, dupe of #655):
  - src/cli.ts mode 100644 → 100755 (#683 from @brandonlipman)
  - scripts/check-cli-executable.sh CI guard against future regression

Cluster D — bun add -g foot-gun (#656, #658):
  - 3-signal detectInstallMethod rewrite (bun-link, repo.url, source-marker)
  - Loud-red recovery message names source-clone AND release-binary paths
  - README "DO NOT use bun add -g gbrain" callout

Contributors: @brandonlipman (#682, #683), @mdcruz88 (#668), @ChenyqThu
(#627), @alan-mathison-enigma (#610), @oyi77 (#652 building block),
@abkrim (#655), @100yenadmin (#641).

VERSION 0.27.0 → 0.28.5
package.json 0.27.0 → 0.28.5
schema-embedded.ts regenerated via bun run build:schema
llms-full.txt regenerated via bun run build:llms

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PGLite-only E2E covering the three regression scenarios v0.28.5 was shipped
to fix:

  1. cluster A — pre-v0.20 brain (missing v0.20 + v0.26.3 + v0.27 columns)
     re-runs initSchema cleanly. Strips the column set v0.28.5's bootstrap
     claims to restore (search_vector, parent_symbol_path, doc_comment,
     symbol_name_qualified, agent_name, params, error_message, provider_id),
     resets the version row to 13, then re-runs initSchema. Asserts every
     column comes back AND version reaches LATEST_VERSION.

     Closes the gap that pre-v0.28.5 produced 11 wedge incidents.

  2. cluster B — fresh init at non-default dims templates the column
     correctly (768d AND 2048d cases). The 2048d case explicitly verifies
     idx_chunks_embedding is NOT created (codex finding #8 — pgvector's
     HNSW cap is 2000).

  3. A4 — existing-brain dim mismatch helper produces a recipe that inlines
     all four steps (DROP INDEX, ALTER TYPE, NULL, conditional reindex).
     Validates the conditional CREATE INDEX HNSW for dims <= 2000 AND its
     omission for dims > 2000. The recipe a user copy-pastes won't crash
     them on Voyage 4 Large.

Plus a hasPendingMigrations() lifecycle test covering the four states
(fresh / migrated / rewound / re-applied) — pairs with the unit test in
test/migrate.test.ts but exercises the engine end-to-end.

PGLite-only because none of these cases need real Postgres. Postgres-side
bootstrap is covered by test/e2e/postgres-bootstrap.test.ts.

Run: bun test test/e2e/v0_28_5-fix-wave.test.ts (no DATABASE_URL needed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Test-isolation lint (R3+R4) requires PGLiteEngine in beforeAll() context
with afterAll() disconnect. Refactored to single-engine-per-file pattern;
the fresh-brain test uses a one-off engine inside its own try/finally so
the file-level engine stays at LATEST schema for the migrated-brain test.
No behavior change to the assertions.

`bun run verify` now passes clean (privacy + jsonb + progress +
test-isolation + wasm + admin-build + cli-exec + typecheck).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…wave

Conflicts resolved:
- VERSION: kept 0.28.5 (ahead of master's 0.28.1)
- package.json: kept 0.28.5
- CHANGELOG.md: kept v0.28.5 entry above v0.28.1 entry; both released today

Master's v0.28.1 lands the zombie-reap + health-timeout + engine-disconnect
cascade. v0.28.5 builds on top with the PGLite upgrade wedge fix, embedding
dim corruption fix, and bun-link foot-gun fix. No code-level conflicts —
master touches src/cli.ts (SIGCHLD handler import) and src/core/minions/*
(spawn helpers); v0.28.5 touches engine bootstrap, init, upgrade, and
embedding paths. Auto-merge resolved cleanly outside the version bump.

llms-full.txt and src/core/schema-embedded.ts regenerated post-merge.

115 wave-touched tests pass on merged state. typecheck clean.
Conflicts resolved:
- VERSION: kept 0.28.5 (ahead of master's 0.28.3)
- package.json: kept 0.28.5
- CHANGELOG.md: kept v0.28.5 entry above master's v0.28.3 entry

Master added v0.28.3 (restart-sweep recipe for OpenClaw gateway dropouts),
which is orthogonal to this fix wave — no code-level conflicts.

llms-full.txt and src/core/schema-embedded.ts regenerated post-merge.
Wave-touched tests (115) still pass; typecheck clean.
CI Tier 1 was failing on `gbrain doctor exits 0 on healthy DB` because the
v0.28.5 doctor 8b check (cherry-picked from #665) pushed `status: 'fail'`
in two non-fatal scenarios:
  1. No API key configured (`isAvailable('embedding')` returns false)
  2. Probe throws (network blip, transient 5xx, DNS, rate limit)

Both are noise in CI and on offline workstations — the brain is healthy,
the provider just isn't reachable from this environment. The v0.28.5 plan
P1 decision called for non-fatal-on-offline behavior:

  > Doctor 8b probes live every run (taken as-is). Non-fatal on network
  > failure (warns rather than errors); silently skipped when no API key
  > configured.

This commit aligns the implementation with that decision:
  - !available → status 'ok' with "Skipped (no provider credentials)"
    message so the run is visible in --json output without failing exit code
  - catch block → status 'warn' (was 'fail') so probe failures surface
    informationally without crashing CI / autopilot's periodic doctor runs
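
The status mapping described above can be sketched as follows. The probe is synchronous here for brevity, while the real check calls a live provider; `embeddingProviderCheck`, `CheckResult`, and the message for a successful or failed probe are assumptions, and only the skip message and the status values come from this commit message.

```typescript
type CheckStatus = "ok" | "warn" | "fail";

interface CheckResult {
  name: string;
  status: CheckStatus;
  message: string;
}

// available: whether provider credentials are configured.
// probe: performs the live smoke test and throws on any failure.
function embeddingProviderCheck(available: boolean, probe: () => void): CheckResult {
  const name = "embedding_provider";
  if (!available) {
    // No credentials: report the skip without affecting the exit code.
    return { name, status: "ok", message: "Skipped (no provider credentials)" };
  }
  try {
    probe();
    return { name, status: "ok", message: "Provider reachable" };
  } catch (err) {
    // Network blips, 5xx responses, DNS, and rate limits are informational.
    const reason = err instanceof Error ? err.message : String(err);
    return { name, status: "warn", message: `Probe failed: ${reason}` };
  }
}
```

With this mapping, a doctor run on an offline workstation still lists the check in its output, but nothing in this path can produce `'fail'`.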

The mismatch slipped past plan-time review because #665 was cherry-picked
before P1 was finalized; the type-fix pass in 4c26e48 only adjusted the
DB-column probe shape, not the API-availability gate.

CI Tier 1 (Mechanical) — `test/e2e/mechanical.test.ts:1220` —
"gbrain doctor exits 0 on healthy DB" now passes against a fresh Postgres
without `OPENAI_API_KEY` / `VOYAGE_API_KEY` set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…x-wave

Conflicts resolved:
- VERSION: kept 0.28.5 (ahead of master's 0.28.4)
- package.json: kept 0.28.5
- CHANGELOG.md: kept v0.28.5 entry above master's v0.28.4 entry

Master added v0.28.4 (skillify cross-modal eval quality gate, #674) and a
new src/commands/eval-cross-modal.ts. Orthogonal to this fix wave — no
code-level conflicts.

llms-full.txt and src/core/schema-embedded.ts regenerated post-merge.
Typecheck clean.
@garrytan garrytan merged commit 1d78013 into master May 7, 2026
7 checks passed
garrytan added a commit that referenced this pull request May 7, 2026
Master shipped v0.28.5 (PGLite upgrade wedge + embedding dim corruption +
bun-link foot-gun fix wave, PR #697). This release stays on v0.28.6.

Conflicts resolved:
- VERSION → 0.28.6 (kept ours; master had 0.28.5)
- package.json version → 0.28.6
- package.json scripts → kept BOTH new check scripts: my
  check:admin-scope-drift (from v0.28.2 cherry) + master's
  check:cli-exec (new in v0.28.5). Verify pipeline now runs both;
  check:all runs both.
- CHANGELOG.md → kept "## [0.28.6]" header on top; inserted master's
  full v0.28.5 entry between v0.28.6 and v0.28.4 in version-descending
  order. The "## To take advantage of v0.28.5" interleaved conflict
  was untangled by extracting master's entry from origin/master:CHANGELOG.md
  rather than trying to weave the two "to take advantage of" blocks
  back together inline.

Verified post-merge:
- bun run verify: PASS (privacy + jsonb + progress + test-isolation +
  wasm + admin-build + admin-scope-drift + cli-exec + typecheck)
- 121 tests pass: migrate + apply-migrations + takes-engine
- CHANGELOG order intact: 0.28.6 → 0.28.5 → 0.28.4 → 0.28.3 → 0.28.2 → 0.28.1
@abkrim abkrim mentioned this pull request May 7, 2026
1 task