feat: behavioral verification in the Validator#1666
Conversation
Prevent the settings save split from serializing custom provider definitions through the global or project patch paths. - skip `customProviders` when building global settings patches - skip `customProviders` when building project settings patches - add a regression test that keeps custom providers out of both serialized patches Files changed: .../app/__tests__/settings-save-split.test.ts | 24 ++++++++++++++++++++++ .../app/components/settings/save-split.ts | 4 ++++ 2 files changed, 28 insertions(+) Fusion-Task-Id: FN-6205 Fusion-Task-Lineage: 84c3f42b-c621-44ff-b977-ecd3a6729e04
Move undocumented flaky engine tests into the formal quarantine path. - add quarantine ledger entries for flaky engine test files observed under concurrent/full-suite load - exclude the quarantined engine test files from vitest projects so they no longer run in gate and reliability pools - remove the undocumented it.skip markers by quarantining at the file/config level instead Files changed: .../__tests__/merger-file-scope-invariant.test.ts | 4 ++-- .../src/__tests__/project-engine-manager.test.ts | 2 +- packages/engine/vitest.config.ts | 10 ++++++++-- scripts/lib/test-quarantine.json | 23 +++++++++++++++++++++- 4 files changed, 33 insertions(+), 6 deletions(-) Fusion-Task-Id: FN-6206 Fusion-Task-Lineage: 8d88725e-1e92-4a06-aaa6-be34287e613a
Synchronize workflow setting values through node settings sync. - Include workflow settings in settings payloads, diffs, status responses, and API types. - Apply inbound and pulled workflow settings through the TaskStore while ignoring rejected setting ids. - Add core, route, hook, API, and Nodes view coverage for workflow settings sync behavior. - Add a patch changeset for the published Fusion package. Files changed: .changeset/loud-nodes-sync.md | 5 + packages/core/src/__tests__/central-core.test.ts | 37 ++++ packages/core/src/central-core.ts | 13 +- packages/core/src/types.ts | 4 + packages/dashboard/app/__tests__/api-node.test.ts | 4 +- packages/dashboard/app/api-node.ts | 3 + .../app/components/__tests__/NodesView.test.tsx | 8 +- .../hooks/__tests__/useNodeSettingsSync.test.ts | 33 +++- .../dashboard/app/hooks/useNodeSettingsSync.ts | 4 +- .../__tests__/routes-nodes-sync-contract.test.ts | 18 +- .../src/__tests__/routes-nodes-sync.test.ts | 191 ++++++++++++++++++++- .../src/routes/register-settings-memory-routes.ts | 7 +- .../register-settings-sync-inbound-routes.ts | 73 +++++++- .../src/routes/register-settings-sync-routes.ts | 128 ++++++++++++-- 14 files changed, 488 insertions(+), 40 deletions(-) Fusion-Task-Id: FN-6208 Fusion-Task-Lineage: 9b5e7e56-6287-4bc5-966a-2bcc2ba292a7
Adds a persisted `type` field (behavioral|static) to MissionContractAssertion, the schema column + migration 115 + version-sweep, and store population with a conservative static default so existing rows preserve legacy judging. Foundation for scoping the validator's default-to-fail / verification posture.
… (U2,U3) Behavioral/bug assertions now default to fail unless a bounded verification run confirms them; static assertions keep the existing read-only judging path. Adds mission-verification.ts (injected capability): disposable git checkout at a trusted revision, explicit isolating sandbox backend with fail-closed + scrubbed env, fixed command template + validated test path, merge-base pre-fix baseline (rejects pass-on-both), git-clean post-condition, inconclusive as a first-class verdict. Judge session stays tools:readonly.
The migration-115 bump in U1 left many test files asserting the old version 114; update them (db-migrate.test.ts now asserts the exported SCHEMA_VERSION constant to drop it from future sweeps).
… (U6) Threads observed-vs-expected failure detail into the generated Fix Feature (R6); routes inconclusive verdicts to needs-attention with no Fix Feature and a distinguishable infra event (R21); makes createGeneratedFixFeature idempotent on (sourceFeatureId, runId) / skip-when-open so recovery/reaper re-drives don't duplicate features or exhaust the retry budget (R22); persists mission/audit events for failures, inconclusive, and previously-swallowed triage errors (R16).
…ock (U7) Widens the stranded-done recovery branch so a reaped task-less validating/done feature is re-driven to a terminal verdict instead of stalling its slice forever (closes the reaper->slice-deadlock P0). Adds a Surface Enumeration + adversarial re-drive reliability suite (git-clean, no duplicate Fix Features, terminal verdict, no deadlock) gating release. Updates missions docs + CONCEPTS for the default-to-fail posture, non-mutating verification, and inconclusive verdict. Adds changeset.
launchIsolatedApp() binds a non-reserved port (reuses the 4040/FUSION_RESERVED_PORTS guards), serves a freshly-built bundle, runs against a fresh/empty disposable DB under a run-unique tmpdir with a scrubbed env (no inherited credentials), and tears down unconditionally on crash/timeout/abort. Process-spawn and bundle-build are injected seams; 18 tests characterize the R13 isolation contracts. Real end-to-end launch wires the default spawner behind the seam in a heavier lane.
Replaces the metadata/probe stub with a playwright-core driver (no bundled browser download; uses a probe-discovered Chrome) exposing navigate/click/type/ observe + idempotent dispose. Browser-unavailable and un-exercisable assertions return inconclusive (never a false pass/fail); a missing selector returns a distinct 'absent' negative observation. Driver is a direct export scoped to the verification run, not registered as a coding-agent tool.
…sm (U5) Adds the app-driving channel (U4 isolated launch + U8 driver, via an injected seam to keep the engine decoupled from the plugin), a dispatching capability that routes by assertion shape (test/app/both), the R21 three-outcome mapping (absent-vs-found by expectation; all infra failures -> inconclusive), and the R20 determinism contract (N-run agreement before an authoritative fail; flaky -> inconclusive). Channel runs only against the isolated instance (R11).
- Drop the unnecessary regex escape in the shell-metacharacter class. - Move the R17 git-clean post-condition out of finally (no-unsafe-finally): a dirty tree now fails closed to an inconclusive verdict instead of a finally-throw that masked the verdict. Test updated to the new behavior.
|
Warning Review limit reached
More reviews will be available in 29 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR introduces a behavioral verification pipeline for mission contract assertions: adds a ChangesBehavioral verification and workflow sync
Sequence Diagram(s)sequenceDiagram
rect rgba(70, 130, 180, 0.5)
note over MissionExecutionLoop,VerificationCapability: Behavioral posture application
end
participant MissionExecutionLoop
participant applyBehavioralPosture
participant VerificationCapability
participant MissionStore
MissionExecutionLoop->>applyBehavioralPosture: judgeResult + linked assertions
applyBehavioralPosture->>applyBehavioralPosture: static assertions keep judge verdict
applyBehavioralPosture->>VerificationCapability: verifyBehavioralAssertion(request) per behavioral assertion
VerificationCapability-->>applyBehavioralPosture: pass | fail | inconclusive
applyBehavioralPosture-->>MissionExecutionLoop: ValidationResult (pass | fail | inconclusive)
alt pass
MissionExecutionLoop->>MissionStore: completeValidatorRun("passed")
else fail
MissionExecutionLoop->>MissionStore: completeValidatorRun("failed") + createGeneratedFixFeature(failureReason)
MissionStore->>MissionStore: findGeneratedFixFeature / findOpenGeneratedFixFeature (dedup)
else inconclusive
MissionExecutionLoop->>MissionStore: completeValidatorRun("blocked")
MissionExecutionLoop->>MissionStore: emitMissionEvent("verification_inconclusive")
end
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
|
Greptile SummaryThis PR wires behavioral verification into the mission Validator: contract assertions typed
Confidence Score: 4/5Safe to merge after fixing the repairJson infinite loop in mission-execution-loop.ts. The behavioral verification machinery, sandbox isolation, and deadlock fix are well-structured and the two previously-flagged issues (blocked-verdict rewrite, global sandbox race) have been addressed. The one remaining defect is in repairJson — const counters are loop-invariant so any malformed AI response with unbalanced braces produces a non-terminating synchronous while loop, permanently blocking the Node.js event loop since the validation timeout does not cover the JSON-repair step. packages/engine/src/mission-execution-loop.ts — repairJson infinite-loop at lines 961–967 Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Task completes] --> B[runFeatureValidation]
B --> C[AI Judge Session\ntools: readonly]
C --> D[parseValidationResult]
D --> E[applyBehavioralPosture]
E --> F{error or blocked?}
F -- Yes --> G[Return early]
F -- No --> H{Has behavioral assertions?}
H -- No --> I[Return static judge verdict]
H -- Yes --> J[Promise.all: verifyBehavioralAssertion]
J --> K{Capability injected?}
K -- No --> L[Default to fail]
K -- Yes --> M[Select isolating backend R18]
M --> N{Available?}
N -- No --> O[inconclusive: fail-closed]
N -- Yes --> P[Disposable checkout R11/R17]
P --> Q[Run in sandbox R19]
Q --> R{Proof supplied?}
R -- Yes --> S[Baseline checkout R5/AE5]
S --> T{Result?}
T -- pass-on-both --> U[fail: proof invalid]
T -- fail-base pass-impl --> V[pass]
T -- fail-impl --> W[fail]
R -- No --> X{Suite passed?}
X -- Yes --> V
X -- No --> W
O --> Y[Aggregate]
L --> Y
V --> Y
U --> Y
W --> Y
Y --> Z{allPassed?}
Z -- Yes --> AA[pass: feature done]
Z -- sawInconclusive --> AB[inconclusive: no Fix Feature]
Z -- No --> AC[fail: createGeneratedFixFeature]
|
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
packages/core/src/mission-store.ts (1)
2986-2990:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate run-to-feature ownership before dedup/creation.
At Line 2986, the code verifies the run exists but not that
run.featureId === sourceFeatureId. A mismatched call can attach lineage to the wrong run and consume retry budget on the wrong source feature.Suggested fix
const run = this.getValidatorRun(runId); if (!run) { throw new Error(`Validator run ${runId} not found`); } + if (run.featureId !== sourceFeatureId) { + throw new Error( + `Validator run ${runId} belongs to feature ${run.featureId}, expected ${sourceFeatureId}`, + ); + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/core/src/mission-store.ts` around lines 2986 - 2990, The code in the method using getValidatorRun(runId) checks existence but not ownership: after retrieving run via getValidatorRun(runId) verify that run.featureId === sourceFeatureId before proceeding with dedup/creation; if they differ, throw an Error (or return a clear failure) to prevent attaching lineage to the wrong run and consuming the wrong source feature's retry budget. Ensure you update the block around getValidatorRun/run to perform this equality check and fail fast when mismatched.packages/engine/src/mission-execution-loop.ts (3)
590-596:⚠️ Potential issue | 🟠 Major | ⚡ Quick winClear the validation timeout once the session finishes.
Every successful validation leaves this 10-minute timer armed. Repeated validations will accumulate pending timers until they fire, which keeps unnecessary work alive long after the run completed.
Suggested fix
- const timeoutPromise = new Promise<never>((_, reject) => { - setTimeout(() => reject(new Error("Validation timeout")), VALIDATION_TIMEOUT_MS); - }); - - const validationPromise = this.runValidationSession(session.session, prompt); - - await Promise.race([validationPromise, timeoutPromise]); + let timeoutHandle: ReturnType<typeof setTimeout> | undefined; + const timeoutPromise = new Promise<never>((_, reject) => { + timeoutHandle = setTimeout(() => reject(new Error("Validation timeout")), VALIDATION_TIMEOUT_MS); + }); + + const validationPromise = this.runValidationSession(session.session, prompt); + + try { + await Promise.race([validationPromise, timeoutPromise]); + } finally { + if (timeoutHandle) clearTimeout(timeoutHandle); + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/engine/src/mission-execution-loop.ts` around lines 590 - 596, The validation timeout Promise created as timeoutPromise uses setTimeout but never clears the timer, so repeated validations accumulate pending timers; modify the logic around timeoutPromise and validationPromise (the call to runValidationSession and the use of VALIDATION_TIMEOUT_MS) to capture the timer id from setTimeout, race the promises, and ensure clearTimeout(timerId) is called once validationPromise settles (use then/finally on the raced result or attach finally to validationPromise) so the timer is always cleared whether the validation succeeds or errors.
1014-1051:⚠️ Potential issue | 🟠 Major | ⚡ Quick winBackfill omitted linked assertions before recomputing the result.
This only synthesizes defaults when the judge returns zero assertion rows. A partial response can silently drop linked assertions, which means a feature can pass without evaluating every assertion, and skipped behavioral assertions never reach
verifyBehavioralAssertion(...).Suggested fix
private extractAssertionResults( parsed: Record<string, unknown>, assertions: MissionContractAssertion[], ): Array<{ assertionId: string; passed: boolean; message?: string; expected?: string; actual?: string }> { const results: Array<{ @@ if (Array.isArray(parsed.assertions)) { for (const item of parsed.assertions) { @@ } } - // If no assertion results but we have assertions, create default results based on status - if (results.length === 0 && assertions.length > 0) { - const overallPassed = parsed.status === "pass"; - for (const assertion of assertions) { - results.push({ - assertionId: assertion.id, - passed: overallPassed, - message: overallPassed ? "Passed" : "Failed", - }); - } + const resultIds = new Set(results.map((r) => r.assertionId)); + const overallPassed = parsed.status === "pass"; + for (const assertion of assertions) { + if (resultIds.has(assertion.id)) continue; + results.push({ + assertionId: assertion.id, + passed: overallPassed, + message: overallPassed ? "Passed" : "Failed", + }); } return results; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/engine/src/mission-execution-loop.ts` around lines 1014 - 1051, The current logic only synthesizes defaults when parsed.assertions is empty, allowing partial responses to omit linked assertions; update the processing in mission-execution-loop.ts so after iterating parsed.assertions you build a set/map of assertion IDs seen (from assertionId or id), then iterate the original assertions array and for any assertion.id not present push a default result (use parsed.status to set passed and suitable message) so every linked assertion is represented; reference parsed.assertions, assertions, results, and parsed.status (and ensure downstream verifyBehavioralAssertion(...) will receive these synthesized rows).
1332-1352:⚠️ Potential issue | 🟠 Major | ⚡ Quick winNotify autopilot with
blockedwhen the retry budget is exhausted.
createGeneratedFixFeature(...)blocks the source feature on exhausted budget, but this method still falls through tonotifyValidationComplete(featureId, "failed"). That leaves autopilot out of sync with the persisted terminal state and can trigger the wrong follow-up path. Based on learnings from the provided store contract, exhausted budgets transition the source feature to blocked.Suggested fix
private async handleValidationFail( featureId: string, runId: string | undefined, result: ValidationResult, ): Promise<void> { try { + let terminalStatus: "failed" | "blocked" = "failed"; + @@ if (message.includes("retry budget exhausted") || message.includes("exhausted its retry budget")) { + terminalStatus = "blocked"; loopLog.warn(`Feature ${featureId} retry budget exhausted; marking as blocked`); // completeValidatorRun already handles the blocked transition when budget is exhausted this.logFeatureMissionEvent(featureId, "error", "retry_budget_exhausted", `Feature ${featureId} exhausted its retry budget`, { runId: runId ?? null, }); @@ // Notify autopilot if configured if (this.missionAutopilot?.notifyValidationComplete) { - await this.missionAutopilot.notifyValidationComplete(featureId, "failed"); + await this.missionAutopilot.notifyValidationComplete(featureId, terminalStatus); } } catch (err) {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/engine/src/mission-execution-loop.ts` around lines 1332 - 1352, The code currently always calls this.missionAutopilot.notifyValidationComplete(featureId, "failed") after createGeneratedFixFeature errors; change it so when the error message indicates a retry budget exhaustion (the same condition that logs "retry_budget_exhausted" and emits "validation:budget_exhausted") you call this.missionAutopilot.notifyValidationComplete(featureId, "blocked") instead of "failed". Locate the block that checks message.includes("retry budget exhausted") || message.includes("exhausted its retry budget") (and the else branch that logs fix_feature_creation_failed) and add the alternate notifyValidationComplete call for the exhausted-budget branch, keeping the existing logging/emission behavior for that branch.packages/dashboard/src/routes/register-settings-sync-routes.ts (2)
322-350:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPersist pulled global settings through the dashboard store.
CentralCore.applyRemoteSettings()only validates/strips the global payload; it does not write~/.fusion/settings.json. Unlike/settings/sync-receive, this last-write-wins path never callsstore.updateGlobalSettings(...), so a successful pull can report remote global keys inappliedFieldswhile leaving the local global settings unchanged.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/dashboard/src/routes/register-settings-sync-routes.ts` around lines 322 - 350, The pull path currently validates but does not persist global settings; after a successful central.applyRemoteSettings(...) and before building applied/skipped lists, call the dashboard store to persist the pulled global settings (e.g. invoke store.updateGlobalSettings(...) or the equivalent method) using remoteSettings.global (or an empty object if missing), ensuring this only runs when result.success is true so the local ~/.fusion/settings.json is updated to reflect the remote last-write-wins state.
227-234:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winPreserve
workflowIdin the surfaced field lists.These responses flatten workflow changes down to bare setting ids. If two workflows both sync
timeoutMs(or any shared declaration id), callers only see duplicate"timeoutMs"entries and cannot tell which workflow changed. Return qualified keys (for exampleworkflowId.settingId) or structured{ workflowId, settingId }entries instead. The same ambiguity exists in the inbound route’sappliedFields.Also applies to: 339-350
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/dashboard/src/routes/register-settings-sync-routes.ts` around lines 227 - 234, The current syncedFields array (built from globalSettings, projectSettings.project, and workflowSettings) flattens workflow setting ids and loses which workflow they came from; update the collection logic that builds syncedFields (and the similar appliedFields usage referenced around the other block) to preserve workflow identity by emitting either qualified keys like `${workflowId}.${settingId}` or small objects `{ workflowId, settingId }` for entries coming from workflowSettings, while leaving global and project keys unchanged; locate usages of workflowSettings, syncedFields, and appliedFields in register-settings-sync-routes.ts and adjust the mapping for Object.values(workflowSettings).flatMap(...) and the inbound appliedFields builder so callers receive per-workflow qualified/structured entries.
🧹 Nitpick comments (3)
packages/engine/vitest.config.ts (1)
90-90: 💤 Low valueRedundant exclude entry in engine-core.
Line 90 adds
merger-file-scope-invariant.test.tsto theengine-coreexclude list, butengine-coreuses an explicit include allow-list (lines 66-86) that never contained this file. Excluding a file that was never included is redundant (though harmless). If the file is being removed from the merge gate, simply not listing it in the include is sufficient.♻️ Simplify by removing redundant exclude
Since
engine-coreuses an explicit include list andmerger-file-scope-invariant.test.tsisn't in it, the exclude entry is unnecessary:- exclude: [ - "node_modules/**", - "dist/**", - "src/__tests__/merger-file-scope-invariant.test.ts", - ],However, keep the exclude if it serves as documentation that this file was intentionally removed from the gate.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/engine/vitest.config.ts` at line 90, Remove the redundant exclude entry "src/__tests__/merger-file-scope-invariant.test.ts" from the engine-core vitest exclude list because engine-core already uses an explicit include allow-list (the include array) and that test file is not present there; simply delete the string from the exclude array in vitest.config.ts so the config is simpler and less confusing (keep the exclude only if you want an explicit documentation note that the test was intentionally removed).packages/core/src/__tests__/db-migrate.test.ts (1)
718-719: ⚡ Quick winRemove duplicated schema-version assertions.
Line 718 (and the other listed ranges) repeats the exact same
expect(db.getSchemaVersion()).toBe(SCHEMA_VERSION)twice in a row. Keep one assertion per site to reduce noise and keep failures crisp.Proposed cleanup pattern
- expect(db.getSchemaVersion()).toBe(SCHEMA_VERSION); expect(db.getSchemaVersion()).toBe(SCHEMA_VERSION);Also applies to: 752-753, 803-804, 833-834, 875-876, 910-911, 948-949, 1229-1230, 1237-1238, 1244-1245
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/core/src/__tests__/db-migrate.test.ts` around lines 718 - 719, Duplicate assertions calling db.getSchemaVersion() against SCHEMA_VERSION appear back-to-back (e.g., expect(db.getSchemaVersion()).toBe(SCHEMA_VERSION)); remove the redundant second occurrence at each location so only a single assertion remains per check, leaving the test logic unchanged; look for uses of db.getSchemaVersion and SCHEMA_VERSION in the test cases (lines shown around the duplicate pairs) and delete the extra expect(...) lines at each listed spot.packages/dashboard/app/components/settings/save-split.ts (1)
79-81: ⚡ Quick winDocument why
customProvidersis excluded from both patches.
customProvidersis skipped in both the global patch (lines 79-81) and project patch (line 96), even though the test at line 121 ofsettings-save-split.test.tsconfirmsisGlobalSettingsKey("customProviders")returnstrue. This suggestscustomProvidersis managed through a separate persistence path (perhaps for security/credential reasons), but the rationale is not documented in the code.Consider adding a brief comment explaining why this field is intentionally excluded from the save-split flow, similar to the existing comments for
persistAgentThinkingLog(line 76) andgithubTrackingDefaultRepo(line 73).Also applies to: 96-96
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/dashboard/app/components/settings/save-split.ts` around lines 79 - 81, Add a brief explanatory comment above both places where the code skips the "customProviders" key (the if (key === "customProviders") continue; checks) stating why customProviders is intentionally excluded from the global/project save-split flow (e.g., persisted separately for security/credential or provider-specific reasons), mirroring the style of the existing comments for persistAgentThinkingLog and githubTrackingDefaultRepo; mention the isGlobalSettingsKey("customProviders") test behavior to clarify that exclusion is deliberate and not a bug.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/dashboard/app/api-node.ts`:
- Line 83: NodeSettingsSyncStatus currently declares workflowSettings as
required in api-node.ts but runtime/consumer code (useNodeSettingsSync) treats
it as possibly undefined; update the type in api-node.ts by changing the
declaration of workflowSettings to optional (workflowSettings?: Record<string,
string[]>) so it matches observed runtime behavior and avoid the type/runtime
mismatch; locate the NodeSettingsSyncStatus type and change the field there,
then run typechecks and keep the existing defensive usage in useNodeSettingsSync
(or remove it only after verifying backend always returns {}) to ensure safety.
In `@packages/dashboard/app/hooks/useNodeSettingsSync.ts`:
- Around line 35-37: The code defensively treats
NodeSettingsSyncStatus.diff.workflowSettings as possibly undefined, so update
the type to match runtime behavior: make the workflowSettings property optional
(workflowSettings?: Record<string, string[]>) on the NodeSettingsSyncStatus.diff
type declaration and ensure any consumers (e.g., useNodeSettingsSync's use of
diff.workflowSettings) rely on that optional typing; verify and adjust related
imports/usages of NodeSettingsSyncStatus and the workflowSettings symbol so the
compiler no longer requires the null-coalescing guard everywhere.
In `@packages/engine/src/__tests__/mission-verification.test.ts`:
- Around line 316-334: The test currently allows either "inconclusive" or "fail"
for timeout outcomes, which masks regressions; update the assertion in the
"inconclusive when the run times out (R9), still asserts source clean" test so
that verifyBehavioralAssertion's result (outcome.verdict) must be exactly
"inconclusive" (remove "fail" from the allowed set) — locate the test that
constructs TestExecutionVerificationCapability (using makeScriptedBackend with
outcome: "timeout") and change the expect that checks outcome.verdict to assert
a single-value equality to "inconclusive" while leaving the
materializer.cleanCalls assertion intact.
In `@packages/engine/src/mission-execution-loop.ts`:
- Around line 658-676: The code currently only short-circuits when
judgeResult.status === "error", so runs with judgeResult.status === "blocked"
get re-evaluated and may be downgraded to "fail" if any behavioral assertions
exist; update the early-exit logic to also preserve blocked verdicts by
returning judgeResult when judgeResult.status is "blocked" (i.e., change the
check in the initial short-circuit that references judgeResult to include
"blocked"), and apply the same fix to the analogous block handling around the
second occurrence (the other branch that re-aggregates assertions) so functions
like normalizeMissionAssertionType, the typeById map, and the hasBehavioral
gating do not override blocked runs.
In `@packages/engine/src/mission-verification.ts`:
- Around line 476-483: verifyBehavioralAssertion currently mutates process-wide
state by calling __setSandboxBackendForTests(isolating) and restoreBackend which
races under concurrent Promise.all; instead remove any calls to
__setSandboxBackendForTests in verifyBehavioralAssertion and pass the selected
backend object (obtained via backendFactory(selection.backendId) — the isolating
variable) directly into the execution path (e.g., add a backend parameter to
runVerificationCommand or to the lower-level resolveSandboxBackend call so
runVerificationCommand uses the provided backend rather than reading global
state). Update all call sites of runVerificationCommand (and the other affected
block referenced at the second occurrence) to forward the selected backend and
remove the temporary restoreBackend logic.
- Around line 504-553: The current logic treats any result with success ===
false as a test failure; change it to first distinguish
execution-level/infrastructure failures from real test failures by using the
richer outcome from runVerificationCommand (e.g., an returned enum or fields
like timedOut, aborted, exitCode, or executionError) rather than only success.
Update the branches that read baselineResult and implResult (inside the
validatedTestPath block and the whole-suite fallback) to: if either result
indicates an infra failure (timeout/abort/executionError), return inconclusive
via this.inconclusive(assertionId, "...") with an explanatory message; only
treat results as failing tests when the verifier specifically indicates a test
failure (e.g., exitCode/testStatus shows a failing test). Modify
runVerificationCommand or its returned shape and then use those new fields in
the checks around baselineResult, implResult, and the pass/fail logic so
timeouts/aborts do not count as reproduction or passes.
In `@packages/engine/vitest.config.ts`:
- Around line 87-91: The Vitest config excludes three test files
("src/__tests__/merger-file-scope-invariant.test.ts",
"src/__tests__/project-engine-manager.test.ts",
"src/__tests__/merger-ai-cleanup.test.ts") but no engine project (engine-core,
engine-default, engine-reliability, engine-slow) actually includes them, so they
never run; fix by either moving/renaming those tests to match an existing
project's include pattern (e.g., put them under a path matched by
engine-default's "src/**/*.test.ts" or engine-reliability/engine-slow patterns)
or update the appropriate project's include array to explicitly add those three
paths (or remove them from the exclude list if they should run in that project),
making sure the include/exclude rules for engine-core, engine-default,
engine-reliability, or engine-slow are consistent so the tests are discoverable.
In `@plugins/fusion-plugin-agent-browser/src/driver.ts`:
- Around line 243-253: The observe function currently swallows all errors from
page.waitForSelector and treats them as "absent"; change the catch to capture
the error (e.g., catch (err)) and distinguish timeout errors from others: if err
is a playwright.errors.TimeoutError return the absent observation as before,
otherwise return an inconclusive result (include the current page.url() and the
error message/details) or rethrow depending on caller expectations; update
references around observe, page.waitForSelector, obsOpts?.expectAbsent, and
DEFAULT_OP_TIMEOUT_MS to implement this narrow mapping.
---
Outside diff comments:
In `@packages/core/src/mission-store.ts`:
- Around line 2986-2990: The code in the method using getValidatorRun(runId)
checks existence but not ownership: after retrieving run via
getValidatorRun(runId) verify that run.featureId === sourceFeatureId before
proceeding with dedup/creation; if they differ, throw an Error (or return a
clear failure) to prevent attaching lineage to the wrong run and consuming the
wrong source feature's retry budget. Ensure you update the block around
getValidatorRun/run to perform this equality check and fail fast when
mismatched.
In `@packages/dashboard/src/routes/register-settings-sync-routes.ts`:
- Around line 322-350: The pull path currently validates but does not persist
global settings; after a successful central.applyRemoteSettings(...) and before
building applied/skipped lists, call the dashboard store to persist the pulled
global settings (e.g. invoke store.updateGlobalSettings(...) or the equivalent
method) using remoteSettings.global (or an empty object if missing), ensuring
this only runs when result.success is true so the local ~/.fusion/settings.json
is updated to reflect the remote last-write-wins state.
- Around line 227-234: The current syncedFields array (built from
globalSettings, projectSettings.project, and workflowSettings) flattens workflow
setting ids and loses which workflow they came from; update the collection logic
that builds syncedFields (and the similar appliedFields usage referenced around
the other block) to preserve workflow identity by emitting either qualified keys
like `${workflowId}.${settingId}` or small objects `{ workflowId, settingId }`
for entries coming from workflowSettings, while leaving global and project keys
unchanged; locate usages of workflowSettings, syncedFields, and appliedFields in
register-settings-sync-routes.ts and adjust the mapping for
Object.values(workflowSettings).flatMap(...) and the inbound appliedFields
builder so callers receive per-workflow qualified/structured entries.
In `@packages/engine/src/mission-execution-loop.ts`:
- Around line 590-596: The validation timeout Promise created as timeoutPromise
uses setTimeout but never clears the timer, so repeated validations accumulate
pending timers; modify the logic around timeoutPromise and validationPromise
(the call to runValidationSession and the use of VALIDATION_TIMEOUT_MS) to
capture the timer id from setTimeout, race the promises, and ensure
clearTimeout(timerId) is called once validationPromise settles (use then/finally
on the raced result or attach finally to validationPromise) so the timer is
always cleared whether the validation succeeds or errors.
- Around line 1014-1051: The current logic only synthesizes defaults when
parsed.assertions is empty, allowing partial responses to omit linked
assertions; update the processing in mission-execution-loop.ts so after
iterating parsed.assertions you build a set/map of assertion IDs seen (from
assertionId or id), then iterate the original assertions array and for any
assertion.id not present push a default result (use parsed.status to set passed
and suitable message) so every linked assertion is represented; reference
parsed.assertions, assertions, results, and parsed.status (and ensure downstream
verifyBehavioralAssertion(...) will receive these synthesized rows).
- Around line 1332-1352: The code currently always calls
this.missionAutopilot.notifyValidationComplete(featureId, "failed") after
createGeneratedFixFeature errors; change it so when the error message indicates
a retry budget exhaustion (the same condition that logs "retry_budget_exhausted"
and emits "validation:budget_exhausted") you call
this.missionAutopilot.notifyValidationComplete(featureId, "blocked") instead of
"failed". Locate the block that checks message.includes("retry budget
exhausted") || message.includes("exhausted its retry budget") (and the else
branch that logs fix_feature_creation_failed) and add the alternate
notifyValidationComplete call for the exhausted-budget branch, keeping the
existing logging/emission behavior for that branch.
---
Nitpick comments:
In `@packages/core/src/__tests__/db-migrate.test.ts`:
- Around line 718-719: Duplicate assertions calling db.getSchemaVersion()
against SCHEMA_VERSION appear back-to-back (e.g.,
expect(db.getSchemaVersion()).toBe(SCHEMA_VERSION)); remove the redundant second
occurrence at each location so only a single assertion remains per check,
leaving the test logic unchanged; look for uses of db.getSchemaVersion and
SCHEMA_VERSION in the test cases (lines shown around the duplicate pairs) and
delete the extra expect(...) lines at each listed spot.
In `@packages/dashboard/app/components/settings/save-split.ts`:
- Around line 79-81: Add a brief explanatory comment above both places where the
code skips the "customProviders" key (the if (key === "customProviders")
continue; checks) stating why customProviders is intentionally excluded from the
global/project save-split flow (e.g., persisted separately for
security/credential or provider-specific reasons), mirroring the style of the
existing comments for persistAgentThinkingLog and githubTrackingDefaultRepo;
mention the isGlobalSettingsKey("customProviders") test behavior to clarify that
exclusion is deliberate and not a bug.
In `@packages/engine/vitest.config.ts`:
- Line 90: Remove the redundant exclude entry
"src/__tests__/merger-file-scope-invariant.test.ts" from the engine-core vitest
exclude list because engine-core already uses an explicit include allow-list
(the include array) and that test file is not present there; simply delete the
string from the exclude array in vitest.config.ts so the config is simpler and
less confusing (keep the exclude only if you want an explicit documentation note
that the test was intentionally removed).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 641b44d1-bd9c-4003-ab03-d16707c99f72
⛔ Files ignored due to path filters (1)
pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (56)
.changeset/loud-nodes-sync.md.changeset/proud-validators-verify.mdCONCEPTS.mddocs/brainstorms/2026-06-11-validator-behavioral-verification-requirements.mddocs/missions-completion-contract.mddocs/missions.mddocs/plans/2026-06-11-001-feat-validator-behavioral-verification-plan.mdpackages/core/src/__tests__/central-core.test.tspackages/core/src/__tests__/db-migrate.test.tspackages/core/src/__tests__/db.test.tspackages/core/src/__tests__/goals-schema.test.tspackages/core/src/__tests__/insight-store.test.tspackages/core/src/__tests__/merge-request-record.test.tspackages/core/src/__tests__/mission-store.test.tspackages/core/src/__tests__/run-audit.test.tspackages/core/src/__tests__/store-merge-queue.test.tspackages/core/src/__tests__/task-documents.test.tspackages/core/src/central-core.tspackages/core/src/db.tspackages/core/src/index.tspackages/core/src/mission-store.tspackages/core/src/mission-types.tspackages/core/src/types.tspackages/dashboard/app/__tests__/api-node.test.tspackages/dashboard/app/__tests__/settings-save-split.test.tspackages/dashboard/app/api-node.tspackages/dashboard/app/components/__tests__/NodesView.test.tsxpackages/dashboard/app/components/settings/save-split.tspackages/dashboard/app/hooks/__tests__/useNodeSettingsSync.test.tspackages/dashboard/app/hooks/useNodeSettingsSync.tspackages/dashboard/src/__tests__/routes-nodes-sync-contract.test.tspackages/dashboard/src/__tests__/routes-nodes-sync.test.tspackages/dashboard/src/routes/register-settings-memory-routes.tspackages/dashboard/src/routes/register-settings-sync-inbound-routes.tspackages/dashboard/src/routes/register-settings-sync-routes.tspackages/engine/src/__tests__/merger-file-scope-invariant.test.tspackages/engine/src/__tests__/mission-execution-loop.test.tspackages/engine/src/__tests__/mission-verification-app-driving.test.tspackages/engine/src/__tests__/mission-verification-app-harness.test.tspackages/engine/src/__tests__/mission-verification.test.tspackages/engine/src/__tests__/project-engine-manager.test.tspackages/engine/src/__tests__/reliability-interactions/mission-validator-behavioral-posture.test.tspackages/engine/src/__tests__/reliability-interactions/mission-verification-redrive-surface.test.tspackages/engine/src/mission-execution-loop.tspackages/engine/src/mission-verification-app-harness.tspackages/engine/src/mission-verification.tspackages/engine/vitest.config.tsplugins/fusion-plugin-agent-browser/package.jsonplugins/fusion-plugin-agent-browser/src/__tests__/driver-scope.test.tsplugins/fusion-plugin-agent-browser/src/__tests__/driver.test.tsplugins/fusion-plugin-agent-browser/src/__tests__/probe-browser.test.tsplugins/fusion-plugin-agent-browser/src/driver.tsplugins/fusion-plugin-agent-browser/src/index.tsplugins/fusion-plugin-agent-browser/src/probe.tsplugins/fusion-plugin-roadmap/src/store/__tests__/roadmap-store.test.tsscripts/lib/test-quarantine.json
- mission-execution-loop: preserve judge "blocked" verdict (don't downgrade to fail), clear validation timeout, backfill omitted linked assertions, notify autopilot "blocked" on exhausted retry budget
- mission-verification: pass selected sandbox backend explicitly to fix concurrent global-override race (fail-closed isolation); distinguish infra outcomes (timeout/abort) from real test failures so they stay inconclusive
- mission-store: validate run-to-feature ownership before fix-feature dedup/creation
- settings-sync routes: persist pulled global settings via store; qualify workflow field keys with workflowId
- useNodeSettingsSync: drop redundant ?? {} guard (backend always emits diff.workflowSettings)
- driver(agent-browser): map only Playwright TimeoutError to "absent" in observe(); other faults -> inconclusive
- vitest.config: remove redundant engine-core exclude (quarantine lives in engine-default)
- tests: tighten timeout verdict assertion to inconclusive; remove duplicate schema-version assertions
- save-split: document why customProviders is excluded (persisted via dedicated provider routes)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Addressed CodeRabbit out-of-diff review findings (commit
Addressed:
Addressed: the
Addressed:
Addressed:
Addressed: on a successful pull the route now calls
Addressed: workflow-sourced entries in
Addressed: removed the no-op
Addressed: removed 10 back-to-back duplicate
Addressed: added comments explaining Validation: engine/core/dashboard/agent-browser typecheck clean; engine (104), core (298), dashboard (518), agent-browser (19) targeted suites pass. |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
packages/engine/vitest.config.ts (1)
108-110:⚠️ Potential issue | 🟠 Major | ⚡ Quick winThese excludes disable the files in every engine Vitest lane.
engine-coreno longer includes these files,engine-defaultnow excludes them here, and they do not matchengine-reliabilityorengine-slow. That leavesmerger-file-scope-invariant.test.ts,project-engine-manager.test.ts, andmerger-ai-cleanup.test.tsunrunnable anywhere instead of merely evicting them from the merge gate. If the intent is merge-gate eviction, drop these per-file excludes and let a non-gate project still own them.As per coding guidelines, "A flake INSIDE the merge gate is evicted, not skipped: remove its line from
engine-coreallow-list inpackages/engine/vitest.config.ts."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/engine/vitest.config.ts` around lines 108 - 110, The per-file excludes in the Vitest lane configuration are removing these tests from every engine project instead of only evicting them from the merge gate. Update the test selection in vitest.config.ts around the engine-core/engine-default lane config so merger-file-scope-invariant.test.ts, project-engine-manager.test.ts, and merger-ai-cleanup.test.ts are not excluded everywhere; keep them runnable in a non-gate lane and only remove them from engine-core’s allow-list if that is the intended merge-gate eviction behavior.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/engine/src/verification-utils.ts`:
- Around line 431-437: Reorder the condition checks in the error handling block
to prioritize abort detection over timeout detection. Currently, the check for
ETIMEDOUT or killed-process conditions (which sets result.timedOut) executes
before the ABORT_ERR check (which sets result.aborted), causing abort cases that
have both killed and aborted flags to be incorrectly labeled as timeouts. Swap
the order of these two if/else if branches so that the condition checking for
errish.code === "ABORT_ERR" || errish.aborted executes first, followed by the
check for errish.code === "ETIMEDOUT" || (errish.killed && result.exitCode ===
null). This ensures abort outcomes are correctly distinguished from timeout
outcomes for downstream routing decisions.
---
Duplicate comments:
In `@packages/engine/vitest.config.ts`:
- Around line 108-110: The per-file excludes in the Vitest lane configuration
are removing these tests from every engine project instead of only evicting them
from the merge gate. Update the test selection in vitest.config.ts around the
engine-core/engine-default lane config so merger-file-scope-invariant.test.ts,
project-engine-manager.test.ts, and merger-ai-cleanup.test.ts are not excluded
everywhere; keep them runnable in a non-gate lane and only remove them from
engine-core’s allow-list if that is the intended merge-gate eviction behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 8e946b8c-c262-448a-bf9f-4cd927d0e039
📒 Files selected for processing (14)
packages/core/src/__tests__/db-migrate.test.tspackages/core/src/mission-store.tspackages/dashboard/app/components/settings/save-split.tspackages/dashboard/app/hooks/useNodeSettingsSync.tspackages/dashboard/src/__tests__/routes-nodes-sync-contract.test.tspackages/dashboard/src/__tests__/routes-nodes-sync.test.tspackages/dashboard/src/routes/register-settings-sync-routes.tspackages/engine/src/__tests__/mission-verification.test.tspackages/engine/src/mission-execution-loop.tspackages/engine/src/mission-verification.tspackages/engine/src/verification-utils.tspackages/engine/vitest.config.tsplugins/fusion-plugin-agent-browser/src/__tests__/driver.test.tsplugins/fusion-plugin-agent-browser/src/driver.ts
💤 Files with no reviewable changes (1)
- packages/core/src/tests/db-migrate.test.ts
🚧 Files skipped from review as they are similar to previous changes (10)
- packages/dashboard/app/hooks/useNodeSettingsSync.ts
- packages/dashboard/app/components/settings/save-split.ts
- packages/dashboard/src/tests/routes-nodes-sync-contract.test.ts
- packages/engine/src/tests/mission-verification.test.ts
- packages/core/src/mission-store.ts
- plugins/fusion-plugin-agent-browser/src/tests/driver.test.ts
- packages/dashboard/src/tests/routes-nodes-sync.test.ts
- packages/dashboard/src/routes/register-settings-sync-routes.ts
- packages/engine/src/mission-execution-loop.ts
- packages/engine/src/mission-verification.ts
| if (!result.success && !maxBufferExceeded) { | ||
| const errish = err as { code?: number | string; killed?: boolean; aborted?: boolean }; | ||
| if (errish.code === "ETIMEDOUT" || (errish.killed && result.exitCode === null)) { | ||
| result.timedOut = true; | ||
| } else if (errish.code === "ABORT_ERR" || errish.aborted) { | ||
| result.aborted = true; | ||
| } else if (result.exitCode === null) { |
There was a problem hiding this comment.
Check ABORT_ERR before the generic killed-process timeout heuristic.
toLegacyExecResult() marks in-flight aborts with { code: "ABORT_ERR", aborted: true, killed: true }. With the current branch order, those cases hit timedOut first and never set result.aborted, so user cancels get mislabeled downstream.
🔧 Minimal fix
- if (errish.code === "ETIMEDOUT" || (errish.killed && result.exitCode === null)) {
- result.timedOut = true;
- } else if (errish.code === "ABORT_ERR" || errish.aborted) {
+ if (errish.code === "ABORT_ERR" || errish.aborted) {
result.aborted = true;
+ } else if (errish.code === "ETIMEDOUT" || (errish.killed && result.exitCode === null)) {
+ result.timedOut = true;
} else if (result.exitCode === null) {Based on PR objectives, timeout and abort outcomes are supposed to stay distinct for inconclusive routing.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (!result.success && !maxBufferExceeded) { | |
| const errish = err as { code?: number | string; killed?: boolean; aborted?: boolean }; | |
| if (errish.code === "ETIMEDOUT" || (errish.killed && result.exitCode === null)) { | |
| result.timedOut = true; | |
| } else if (errish.code === "ABORT_ERR" || errish.aborted) { | |
| result.aborted = true; | |
| } else if (result.exitCode === null) { | |
| if (!result.success && !maxBufferExceeded) { | |
| const errish = err as { code?: number | string; killed?: boolean; aborted?: boolean }; | |
| if (errish.code === "ABORT_ERR" || errish.aborted) { | |
| result.aborted = true; | |
| } else if (errish.code === "ETIMEDOUT" || (errish.killed && result.exitCode === null)) { | |
| result.timedOut = true; | |
| } else if (result.exitCode === null) { |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/engine/src/verification-utils.ts` around lines 431 - 437, Reorder
the condition checks in the error handling block to prioritize abort detection
over timeout detection. Currently, the check for ETIMEDOUT or killed-process
conditions (which sets result.timedOut) executes before the ABORT_ERR check
(which sets result.aborted), causing abort cases that have both killed and
aborted flags to be incorrectly labeled as timeouts. Swap the order of these two
if/else if branches so that the condition checking for errish.code ===
"ABORT_ERR" || errish.aborted executes first, followed by the check for
errish.code === "ETIMEDOUT" || (errish.killed && result.exitCode === null). This
ensures abort outcomes are correctly distinguished from timeout outcomes for
downstream routing decisions.
Resolve conflicts from concurrent main work: - db.ts: renumber behavioral-verification migration 115 -> 118 after main's workflow-work-items (115), graphResumeRetryCount (116), autoMergeProvenance (117); SCHEMA_VERSION = 118. Update all schema-version test assertions to 118 (keep SCHEMA_VERSION constant where used). - vitest.config / test-quarantine.json: union both branches' quarantine sets (FN-6206 behavioral-verification files + main's merger-ai-cleanup/merger-ai/ QuickEntryBox/soft-delete-blocker-residue/routes-settings entries). - settings-sync routes: keep workflowId-qualified field keys + persisted global pull; drop duplicate workflowApplyResult from auto-merge interleave. - Keep behavioral-verification changes in mission-execution-loop imports, useNodeSettingsSync, save-split, driver, and sync tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- mission-execution-loop: drop now-unused @fusion/core imports (TEST_MODE_RESOLVED, isTestModeActive, resolveTaskValidatorModel) left over from the merge; main refactored their consumers away. Keep normalizeMissionAssertionType. - mission-verification-app-harness: add port-4040-allowlist marker; the "4040" references are documentation of the reserved-port guard contract, not a kill/bind. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Ready to review this PR? Stage has broken it down into 13 individual chapters for you: Chapters generated by Stage for commit c7f7a0c on Jun 16, 2026 9:20am UTC. |
Behavioral verification in the Validator
Makes the mission Validator's "done" verdict reflect observed behavior instead of the diff's apparent intent. Behavioral/bug Contract Assertions now default to fail unless a bounded, non-mutating verification run confirms them by exercising the code; non-behavioral assertions keep today's static read-only judging.
Origin:
docs/brainstorms/2026-06-11-validator-behavioral-verification-requirements.md· Plan:docs/plans/2026-06-11-001-feat-validator-behavioral-verification-plan.mdWhat's live and verified (test-execution quality gate)
type(behavioral|static) on Contract Assertions; schema migration 115 + repo-wide version sweep; conservativestaticdefault so existing rows preserve current behavior.mission-verification.ts: disposable git checkout at a trusted revision, mandatory isolating sandbox backend (fail-closed) + scrubbed env, fixed command template + validated test-path, merge-base pre-fix baseline (rejects pass-on-both), git-clean post-condition,inconclusiveas a first-class verdict. Judge session staystools: readonly.createGeneratedFixFeatureidempotent on (sourceFeature, run); durable mission/audit events.docs/missions.md,docs/missions-completion-contract.md,CONCEPTS.md; changeset.What's built but NOT yet live-wired (app-driving — read before merging)
U4 (isolated-app-launch harness), U8 (playwright-core browser driver), U5 (dispatch + determinism contract) are implemented and unit-tested via injected seams, but app-driving is dormant in production: the real app-server spawner (U4) is left behind an injected seam for a heavier lane, and the per-assertion channel/UI-spec classifier is the plan's deferred Open Question. Today behavioral assertions resolve through the test-execution channel only; the app-driving channel is reachable in code/tests but nothing populates
channel: "app"yet. Wiring the live spawner + classifier is follow-up work.Verification performed
@fusion/core,@fusion/engine, agent-browser plugin.eslintclean on all changed source files.playwright-coreadded (no bundled-browser download; CI-safe). One local headless-Chrome smoke of the U8 driver passed.Not verified locally (CI will run these)
eslint ., recursivebuild, Boot Smoke, andtest:gatewere not run locally — relying on CI.reliability-interactions/soft-delete-blocker-residue.test.ts(task soft-delete; flakes independent of this change).🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Documentation