feat: Interactive shell streaming with live output by vsumner · Pull Request #562 · spacedriveapp/spacebot

vsumner · 2026-04-12T19:35:09Z

Summary

Implements spawn-based streaming shell execution with real-time output and quiesce detection. This eliminates the #1 worker reliability gap: interactive commands that silently hang and timeout.

What Changed

Backend

ShellTool: Rewrote from batch cmd.output() to spawn-based streaming
ProcessEvent::ToolOutput: New event variant for line-by-line output streaming
Quiesce Detection: 5-second silence watchdog detects interactive prompts
Timeout Enforcement: Full timeout wraps entire streaming operation (not just spawn)
Watchdog Kill: When quiesce fires, watchdog now kills the child process to unblock stream readers

Frontend

ToolCall Component: Renders live output in scrollable panel with auto-scroll
useLiveContext: Accumulates streaming lines and clears on tool completion
call_id Matching: Uses pending call IDs to properly pair streaming output with tool calls

Prompts

Shell tool description now explains interactive behavior and --yes/-y flags
Worker prompt mentions retry patterns for interactive commands

Key Design Decisions

Separate Broadcast Channel: ToolOutput uses capacity-1024 channel, lossy-by-design (dropping lines acceptable)
Arc<Mutex>: Watchdog and main task both need kill access to the child process
Exit Code -2: Signals "waiting_for_input" state when quiesce detection fires
Line-by-line Streaming: Each stdout/stderr line broadcasts immediately via SSE

Verification

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Adds end-to-end live tool-output streaming: new ProcessEvent::ToolOutput and ApiEvent::ToolOutput, a dedicated tool-output broadcast channel, streaming ShellTool per-line emission with input-detection, API forwarding of per-line events to SSE, backend accumulation into worker transcripts, frontend SSE handling/rendering of liveOutput, and a UI primitives compatibility wrapper.

Changes

Cohort / File(s)	Summary
Core types & bus wiring `src/lib.rs`, `src/main.rs`, `src/api/agents.rs`	Introduced `ProcessEvent::ToolOutput`, `TOOL_OUTPUT_BUS_CAPACITY`, and a third broadcast sender; threaded `tool_output_tx` into AgentDeps and main agent wiring.
API state & SSE `src/api/state.rs`, `src/api/system.rs`, `interface/src/api/client.ts`	Added `ApiEvent::ToolOutput`, ApiState.register_tool_output_stream task to forward per-line events and accumulate live transcripts, and SSE mapping to `"tool_output"`; client type `ToolOutputEvent` added.
Shell tool & worker tool-server `src/tools.rs`, `src/tools/shell.rs`	`create_worker_tool_server` now accepts `tool_output_tx`; `ShellTool` gains streaming mode (with with_streaming) that emits per-line `ToolOutput`, scrubs secrets, streams stdout/stderr, and detects/kills on ~5s interactive waits (waiting_for_input flag).
Agent & worker integration `src/agent/worker.rs`, `src/agent/cortex.rs`, `src/agent/channel_history.rs`	Wired tool_output_tx into worker startup; track ToolOutput as worker activity; exclude ToolOutput from channel routing and from cortex signals where appropriate.
Transcript model & conversation handling `src/conversation/worker_transcript.rs`, `src/api/state.rs`	Added `live_output: Option<String>` to `TranscriptStep::ToolResult`; constructors set `live_output: None` for non-streaming; ApiState accumulates live_output for worker processes.
Frontend live handling & rendering `interface/src/hooks/useLiveContext.tsx`, `interface/src/components/ToolCall.tsx`	Added SSE `tool_output` handler to append/create `tool_result` steps with `live_output`; `ToolCallPair` gains `liveOutput?: string` and running-state renders scrollable `liveOutput`.
Frontend primitives shim `interface/src/lib/primitives.tsx`	New compatibility wrappers re-exporting `@spacedrive/primitives` and providing legacy-prop mappings for `Button`, `Banner`, `FilterButton`, `NumberStepper`.
Docs / prompts `prompts/en/tools/shell_description.md.j2`, `prompts/en/worker.md.j2`	Clarified shell tool runs with no interactive stdin/TTY, 5s inactivity-driven input detection, and guidance for non-interactive flags/pipes/heredocs.
Tests `tests/bulletin.rs`, `tests/context_dump.rs`	Updated test bootstrapping to request third bus capacity and pass `tool_output_tx` into AgentDeps and worker tool-server setup.
Interface types `interface/src/api/client.ts`	Exported `ToolOutputEvent` and extended `ApiEvent` union to include `tool_output`.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Workers tab: full transcript viewer, live SSE streaming, introspection tool #192 — Overlaps on SSE/transcript plumbing and frontend live-stream handling; likely closely related to these streaming/event changes.
fix: persist transcripts on idle + reconnect idle workers on restart #334 — Touches worker_transcript shape and history handling; directly related to ToolResult.live_output changes.
refactor: merge exec tool into shell, add per-command env vars #360 — Modifies shell tool behavior; related to streaming/call-path changes in src/tools/shell.rs.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'feat: Interactive shell streaming with live output' clearly and concisely summarizes the primary change—implementing real-time streaming shell output with interactive command detection.
Description check	✅ Passed	The description provides comprehensive context about the streaming shell execution feature, architectural changes, design decisions, and verification steps, all directly related to the changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/interactive-shell-streaming

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

src/tools/shell.rs

interface/src/hooks/useLiveContext.tsx

coderabbitai

Actionable comments posted: 6

🧹 Nitpick comments (4)

src/agent/channel_history.rs (1)

470-471: Add a regression test for ToolOutput exclusion.

This changes channel-scoping semantics for a brand-new ProcessEvent variant, but the tests below still only lock in ToolStarted, agent-message, and TextDelta behavior. A small assertion for ProcessEvent::ToolOutput { .. } => false would make this much harder to regress later.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/channel_history.rs` around lines 470 - 471, Add a regression test
that asserts ProcessEvent::ToolOutput is excluded from channel-scoped history:
in the tests near the existing assertions for ToolStarted, agent-message, and
TextDelta in channel_history.rs, construct a ProcessEvent::ToolOutput { .. }
instance and assert that the channel-scoping predicate (the same helper/assert
used for ToolStarted/TextDelta) returns false; this ensures the new ToolOutput
variant is explicitly covered and prevents regressions to the channel-scoping
semantics for ProcessEvent::ToolOutput.

tests/context_dump.rs (1)

90-121: Extract the test AgentDeps bootstrap into a shared helper.

Adding tool_output_tx required touching this bootstrap and the near-copy in tests/bulletin.rs. At this point, the event-bus setup plus AgentDeps skeleton looks worth centralizing in a shared test utility so future deps changes stay one-edit instead of many.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/context_dump.rs` around lines 90 - 121, Extract the shared event-bus +
AgentDeps bootstrap into a single test helper (e.g., make_test_agent_deps or
build_agent_deps) that encapsulates calling
create_process_event_buses_with_capacity(...) and constructing
spacebot::AgentDeps (including agent_id, mcp_manager, sandbox, memory_search,
llm_manager, task_store, project_store, runtime_config, event_tx,
memory_event_tx, tool_output_tx, sqlite_pool, etc.); replace the duplicated
inline setup in tests that currently call
create_process_event_buses_with_capacity and build AgentDeps (such as where
AgentDeps is instantiated) to call this helper so future changes to AgentDeps
fields only need one edit and the helper returns the fully wired AgentDeps and
any event channels you need.

interface/src/components/ToolCall.tsx (1)

61-62: Redundant type cast.

step is already typed as TranscriptStep (line 54 parameter), which is aliased to ExtendedTranscriptStep (line 12). The cast is unnecessary.

✨ Suggested simplification

 		if (step.type === "tool_result") {
-			const liveOutput = (step as ExtendedTranscriptStep).live_output;
+			const liveOutput = step.live_output;
 			resultsById.set(step.call_id, {name: step.name, text: step.text, liveOutput});
 		}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@interface/src/components/ToolCall.tsx` around lines 61 - 62, The explicit
cast "(step as ExtendedTranscriptStep)" is redundant because the parameter
"step" is already typed as TranscriptStep (an alias of ExtendedTranscriptStep);
remove the cast and read the property directly (use step.live_output) when
computing liveOutput and calling resultsById.set with step.call_id, step.name,
and step.text in the ToolCall component to simplify the code.

src/tools/shell.rs (1)

425-429: Watchdog task not aborted on normal completion.

When the command completes normally before the 5-second quiesce timeout, the watchdog task continues running in the background until its next wake-up (up to 5 seconds later). This is a minor resource leak.

Consider aborting the watchdog when streaming completes:

♻️ Proposed fix

     let watchdog = spawn_quiesce_watchdog(
         Arc::clone(&last_output_ms),
         Arc::clone(&interactive_detected),
         Arc::clone(&child_arc),
     );
 
     // ... streaming code ...
 
     let (stdout_lines, stderr_lines) = match streaming_result {
         Ok((stdout, stderr)) => (stdout, stderr),
         Err(_) => {
             // Timeout - kill child and return error
+            watchdog.abort();
             if let Ok(mut child_guard) = child_arc.try_lock() {
                 let _ = child_guard.kill().await;
             }
             return Err(ShellError {
                 message: format!("Command timed out after {} seconds", timeout.as_secs()),
                 exit_code: -1,
             });
         }
     };
+    
+    // Clean up watchdog if still running
+    watchdog.abort();
 
     let waiting_for_input = interactive_detected.load(Ordering::SeqCst);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/tools/shell.rs` around lines 425 - 429, The watchdog started via
spawn_quiesce_watchdog is never aborted on normal command completion (let
_watchdog ...), so it can linger up to the quiesce timeout; change the code to
retain the returned JoinHandle (e.g., let watchdog =
spawn_quiesce_watchdog(...)) and explicitly abort it when
streaming/child-handling finishes (call watchdog.abort() or equivalent in the
code path that runs after the child exits or streaming completes), ensuring the
JoinHandle is dropped/awaited as appropriate to avoid the background task leak;
reference spawn_quiesce_watchdog, last_output_ms, interactive_detected, and
child_arc to locate where to capture and abort the handle.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@interface/src/components/ToolCall.tsx`:
- Around line 394-402: Remove the unreachable shell live-output block in
ToolCall.tsx: inside the renderResult function, delete the early-return
conditional that checks `if (pair.status === "running" && pair.liveOutput) { ...
}` because `renderResult` already delegates to `renderer.resultView(pair)` which
handles live output; ensure no other code expects that branch and keep
`renderer.resultView(pair)` as the single path for rendering the result.

In `@interface/src/lib/primitives.tsx`:
- Around line 4-10: The file imports PrimitiveBanner, PrimitiveButton,
PrimitiveFilterButton, PrimitiveNumberStepper and re-exports the package using a
hard-coded ../../node_modules path; change both import and export to import from
the package name "@spacedrive/primitives" instead (e.g., update the named import
that yields
PrimitiveBanner/PrimitiveButton/PrimitiveFilterButton/PrimitiveNumberStepper and
the export statement) so the module resolution uses the package entrypoint
rather than a workspace-specific node_modules path.

In `@prompts/en/worker.md.j2`:
- Around line 90-91: Update the wording that currently says "Commands have no
stdin" in prompts/en/worker.md.j2 to clarify "no interactive stdin / no attached
TTY" and explicitly state that shell-created pipes and here-docs (e.g., echo y |
..., heredoc <<EOF) are valid non-interactive input paths; then mirror the exact
revised phrasing in prompts/en/tools/shell_description.md.j2 so both templates
use the same wording and examples.

In `@src/agent/cortex.rs`:
- Around line 1414-1415: The ProcessEvent::ToolOutput events are currently
ignored in signal_from_event()/cortex signal handling but must still update
worker activity timestamps; modify observe_health_event() so that when it
receives a ProcessEvent::ToolOutput originating from a ProcessId::Worker(...) it
updates HealthRuntimeState::last_activity_at (same way other worker activity
events do) while leaving signal_from_event() behavior unchanged if desired;
locate references to ProcessEvent::ToolOutput, observe_health_event(),
HealthRuntimeState::last_activity_at, and ProcessId::Worker and add the
activity-timestamp update for worker tool output events.

In `@src/api/state.rs`:
- Around line 312-322: ToolOutput lacks a stable per-call identifier so streamed
lines can’t be deterministically associated with a specific invocation; add a
server-generated unique call_id (e.g., UUID/String) field to the ToolOutput
payload and ensure the same call_id is emitted on ToolStarted and ToolCompleted
events for the same invocation so consumers can join streams reliably; update
the ToolOutput struct and the code paths that construct/emits ToolOutput,
ToolStarted and ToolCompleted to generate/populate the call_id (preserve
existing process_id/tool_name fields) and include it in serialization so
downstream consumers receive the stable identifier.
- Around line 824-875: register_tool_output_stream currently forwards
ProcessEvent::ToolOutput only to SSE and does not update the in-memory
transcript cache, so streamed shell lines are lost on refresh; modify
register_tool_output_stream to also update live_worker_transcripts (the same
structure updated by register_agent_events for
ToolStarted/ToolCompleted/WorkerText) whenever a ToolOutput arrives — either
call the existing transcript-accumulation helper used by register_agent_events
or replicate its logic: derive the (process_type, process_id) via
process_id_info(process_id), locate/update the live_worker_transcripts entry for
agent_id and channel_id, append the streamed line/stream metadata to the
transcript, and then continue sending the ApiEvent::ToolOutput to api_tx; ensure
the update is performed inside the tokio::spawn task before/after sending the
ApiEvent so get_live_transcript will recover in-flight output.

---

Nitpick comments:
In `@interface/src/components/ToolCall.tsx`:
- Around line 61-62: The explicit cast "(step as ExtendedTranscriptStep)" is
redundant because the parameter "step" is already typed as TranscriptStep (an
alias of ExtendedTranscriptStep); remove the cast and read the property directly
(use step.live_output) when computing liveOutput and calling resultsById.set
with step.call_id, step.name, and step.text in the ToolCall component to
simplify the code.

In `@src/agent/channel_history.rs`:
- Around line 470-471: Add a regression test that asserts
ProcessEvent::ToolOutput is excluded from channel-scoped history: in the tests
near the existing assertions for ToolStarted, agent-message, and TextDelta in
channel_history.rs, construct a ProcessEvent::ToolOutput { .. } instance and
assert that the channel-scoping predicate (the same helper/assert used for
ToolStarted/TextDelta) returns false; this ensures the new ToolOutput variant is
explicitly covered and prevents regressions to the channel-scoping semantics for
ProcessEvent::ToolOutput.

In `@src/tools/shell.rs`:
- Around line 425-429: The watchdog started via spawn_quiesce_watchdog is never
aborted on normal command completion (let _watchdog ...), so it can linger up to
the quiesce timeout; change the code to retain the returned JoinHandle (e.g.,
let watchdog = spawn_quiesce_watchdog(...)) and explicitly abort it when
streaming/child-handling finishes (call watchdog.abort() or equivalent in the
code path that runs after the child exits or streaming completes), ensuring the
JoinHandle is dropped/awaited as appropriate to avoid the background task leak;
reference spawn_quiesce_watchdog, last_output_ms, interactive_detected, and
child_arc to locate where to capture and abort the handle.

In `@tests/context_dump.rs`:
- Around line 90-121: Extract the shared event-bus + AgentDeps bootstrap into a
single test helper (e.g., make_test_agent_deps or build_agent_deps) that
encapsulates calling create_process_event_buses_with_capacity(...) and
constructing spacebot::AgentDeps (including agent_id, mcp_manager, sandbox,
memory_search, llm_manager, task_store, project_store, runtime_config, event_tx,
memory_event_tx, tool_output_tx, sqlite_pool, etc.); replace the duplicated
inline setup in tests that currently call
create_process_event_buses_with_capacity and build AgentDeps (such as where
AgentDeps is instantiated) to call this helper so future changes to AgentDeps
fields only need one edit and the helper returns the fully wired AgentDeps and
any event channels you need.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fef29a4a-6652-4570-b229-4ef6e266e925

📥 Commits

Reviewing files that changed from the base of the PR and between 55ad26a and a006718.

📒 Files selected for processing (18)

interface/src/api/client.ts
interface/src/components/ToolCall.tsx
interface/src/hooks/useLiveContext.tsx
interface/src/lib/primitives.tsx
prompts/en/tools/shell_description.md.j2
prompts/en/worker.md.j2
src/agent/channel_history.rs
src/agent/cortex.rs
src/agent/worker.rs
src/api/agents.rs
src/api/state.rs
src/api/system.rs
src/lib.rs
src/main.rs
src/tools.rs
src/tools/shell.rs
tests/bulletin.rs
tests/context_dump.rs

interface/src/components/ToolCall.tsx

interface/src/lib/primitives.tsx

prompts/en/worker.md.j2

src/agent/cortex.rs

src/api/state.rs

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (4)

interface/src/lib/primitives.tsx (1)
8-10: ⚠️ Potential issue | 🟠 Major

Use package entrypoint imports instead of hard-coded node_modules paths.

Line 8 and Line 10 couple this module to a specific workspace layout (../../node_modules/.../dist/index.js) and bypass @spacedrive/primitives export resolution. Import/re-export from @spacedrive/primitives directly.
Proposed fix
 import {
 	Banner as PrimitiveBanner,
 	Button as PrimitiveButton,
 	FilterButton as PrimitiveFilterButton,
 	NumberStepper as PrimitiveNumberStepper,
-} from "../../node_modules/@spacedrive/primitives/dist/index.js";
+} from "@spacedrive/primitives";
 
-export * from "../../node_modules/@spacedrive/primitives/dist/index.js";
+export * from "@spacedrive/primitives";
#!/bin/bash
# Verify direct node_modules imports and confirm package-entrypoint usage.

# 1) Find hard-coded node_modules imports in interface source.
rg -n --type ts --type tsx 'from\s+["'\'']\.\./\.\./node_modules/@spacedrive/primitives/dist/index\.js["'\'']|from\s+["'\''].*node_modules/'

# 2) Inspect this file specifically.
cat -n interface/src/lib/primitives.tsx | sed -n '1,40p'

# 3) Check dependency declaration presence for `@spacedrive/primitives`.
fd -i package.json | xargs rg -n '"@spacedrive/primitives"'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/lib/primitives.tsx` around lines 8 - 10, The file imports and
re-exports from a hard-coded
"../../node_modules/@spacedrive/primitives/dist/index.js"; update both the
import and the export to use the package entrypoint " `@spacedrive/primitives` "
instead so Node/TS resolution and package exports are respected. Locate the
import statement and the export-all statement in primitives.tsx (the lines that
reference "../../node_modules/@spacedrive/primitives/dist/index.js") and replace
them to import/export directly from `@spacedrive/primitives`; ensure any named
imports still match exported symbols from the package and run type/checking to
confirm no breakage.
interface/src/hooks/useLiveContext.tsx (1)
247-275: ⚠️ Potential issue | 🟡 Minor

Preserve streamed lines as-is and bound the buffer.

The backend is sending event.line from read_line(), so event.line + "\n" will double-space normal output. This path also grows live_output without any cap, which will balloon React state for noisy commands.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useLiveContext.tsx` around lines 247 - 275, The code
appends "\n" to every streamed event.line and unboundedly grows live_output;
update all places in useLiveContext.tsx where live_output is built (both when
updating an existing TranscriptStep and when creating new tool_result steps) to
preserve event.line exactly (append event.line without forcing an extra newline,
or only add a newline when event.line does not already endWith("\n")), and clamp
the live_output buffer to a sensible maximum (introduce a MAX_LIVE_OUTPUT_LENGTH
constant and after concatenation truncate to the most recent N characters) so
pendingToolCallIdsRef-related handlers (and the fallback synthetic call_id path)
won't let React state grow without bound.
src/tools/shell.rs (2)
135-149: ⚠️ Potential issue | 🟠 Major

The quiesce watchdog will kill valid quiet commands.

last_output_ms only advances when read_line() completes, so commands that are legitimately quiet for >5s, or that print progress/prompts without newline terminators, are classified as waiting_for_input and killed. That will false-positive on things like sleep, quiet build phases, and \r-driven progress output.

Also applies to: 178-197

194-195: ⚠️ Potential issue | 🔴 Critical

Don't make process termination best-effort here.

Both kill paths use try_lock(), so a brief lock holder can cause the kill to be skipped entirely, leaving the child running after quiesce/timeout. Even when the lock succeeds, let _ = child_guard.kill().await hides kill failures.

As per coding guidelines "Don't silently discard errors. No let _ = on Results. Handle them, log them, or propagate them. The only exception is .ok() on channel sends where the receiver may be dropped."

Also applies to: 471-472
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/shell.rs` around lines 194 - 195, The current termination uses
child.try_lock() and then swallows kill() errors with let _ = ..., which can
skip killing or hide failures; replace the non-blocking try_lock() with an
awaited lock acquisition (e.g., child.lock().await) so we reliably obtain the
guard before attempting termination, and do not discard the Result from
child_guard.kill().await—either propagate the error (use ? or return a Result)
or log it with context; apply this change to both sites where child.try_lock() /
child_guard.kill().await are used so termination is not best-effort and failures
are handled.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@interface/src/hooks/useLiveContext.tsx`:
- Around line 247-275: The code appends "\n" to every streamed event.line and
unboundedly grows live_output; update all places in useLiveContext.tsx where
live_output is built (both when updating an existing TranscriptStep and when
creating new tool_result steps) to preserve event.line exactly (append
event.line without forcing an extra newline, or only add a newline when
event.line does not already endWith("\n")), and clamp the live_output buffer to
a sensible maximum (introduce a MAX_LIVE_OUTPUT_LENGTH constant and after
concatenation truncate to the most recent N characters) so
pendingToolCallIdsRef-related handlers (and the fallback synthetic call_id path)
won't let React state grow without bound.

In `@interface/src/lib/primitives.tsx`:
- Around line 8-10: The file imports and re-exports from a hard-coded
"../../node_modules/@spacedrive/primitives/dist/index.js"; update both the
import and the export to use the package entrypoint " `@spacedrive/primitives` "
instead so Node/TS resolution and package exports are respected. Locate the
import statement and the export-all statement in primitives.tsx (the lines that
reference "../../node_modules/@spacedrive/primitives/dist/index.js") and replace
them to import/export directly from `@spacedrive/primitives`; ensure any named
imports still match exported symbols from the package and run type/checking to
confirm no breakage.

In `@src/tools/shell.rs`:
- Around line 194-195: The current termination uses child.try_lock() and then
swallows kill() errors with let _ = ..., which can skip killing or hide
failures; replace the non-blocking try_lock() with an awaited lock acquisition
(e.g., child.lock().await) so we reliably obtain the guard before attempting
termination, and do not discard the Result from child_guard.kill().await—either
propagate the error (use ? or return a Result) or log it with context; apply
this change to both sites where child.try_lock() / child_guard.kill().await are
used so termination is not best-effort and failures are handled.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 09ee23a5-9bf2-4a44-8400-eb1da0f3d567

📥 Commits

Reviewing files that changed from the base of the PR and between a006718 and dbd87b0.

📒 Files selected for processing (18)

interface/src/api/client.ts
interface/src/components/ToolCall.tsx
interface/src/hooks/useLiveContext.tsx
interface/src/lib/primitives.tsx
prompts/en/tools/shell_description.md.j2
prompts/en/worker.md.j2
src/agent/channel_history.rs
src/agent/cortex.rs
src/agent/worker.rs
src/api/agents.rs
src/api/state.rs
src/api/system.rs
src/lib.rs
src/main.rs
src/tools.rs
src/tools/shell.rs
tests/bulletin.rs
tests/context_dump.rs

✅ Files skipped from review due to trivial changes (6)

src/agent/worker.rs
src/api/system.rs
prompts/en/worker.md.j2
src/agent/cortex.rs
prompts/en/tools/shell_description.md.j2
src/lib.rs

🚧 Files skipped from review as they are similar to previous changes (7)

src/agent/channel_history.rs
tests/context_dump.rs
src/tools.rs
src/main.rs
interface/src/api/client.ts
tests/bulletin.rs
src/api/state.rs

interface/src/hooks/useLiveContext.tsx

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (4)

interface/src/lib/primitives.tsx (1)
3-10: ⚠️ Potential issue | 🟠 Major

Use package entrypoint imports instead of filesystem node_modules paths.

Line 8 and Line 10 are tightly coupled to repo layout and internal build output. Import/export from @spacedrive/primitives directly.
Suggested patch
 import {
 	Banner as PrimitiveBanner,
 	Button as PrimitiveButton,
 	FilterButton as PrimitiveFilterButton,
 	NumberStepper as PrimitiveNumberStepper,
-} from "../../node_modules/@spacedrive/primitives/dist/index.js";
+} from "@spacedrive/primitives";
 
-export * from "../../node_modules/@spacedrive/primitives/dist/index.js";
+export * from "@spacedrive/primitives";
#!/bin/bash
# Verify no hard-coded primitives node_modules path remains in interface sources.
rg -n --type=ts --type=tsx 'node_modules/@spacedrive/primitives|@spacedrive/primitives/dist/index\.js' interface/src
Expected result: no matches after applying the fix.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/lib/primitives.tsx` around lines 3 - 10, The imports and
re-export are using a hard-coded filesystem path
("../../node_modules/@spacedrive/primitives/dist/index.js"); update the import
source and the export statement to use the package entrypoint
"@spacedrive/primitives" instead. Replace the current import that brings in
PrimitiveBanner, PrimitiveButton, PrimitiveFilterButton, and
PrimitiveNumberStepper from the dist path so they import from
"@spacedrive/primitives", and change the export * line to export everything from
"@spacedrive/primitives" to remove the tight coupling to node_modules layout.
interface/src/hooks/useLiveContext.tsx (2)
242-266: ⚠️ Potential issue | 🟠 Major

Don't create a tool_result before tool_completed.

pairTranscriptSteps() in interface/src/components/ToolCall.tsx treats the existence of a tool_result as completion. Synthesizing one on the first tool_output line will flip a long-running shell call out of the running state before the real completion event arrives. Store live output separately, or attach it to the pending call entry instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useLiveContext.tsx` around lines 242 - 266, The code
currently creates a TranscriptStep of type "tool_result" on the first
tool_output event, which makes pairTranscriptSteps() (in ToolCall.tsx) treat the
call as completed; instead, on tool_output events update the existing pending
call entry or a separate live-output store and do not synthesize a "tool_result"
until a "tool_completed" event arrives. Modify setLiveTranscripts in
useLiveContext.tsx to: 1) locate the pending transcript step for
event.process_id and event.call_id (e.g., the existing "tool_call" step) and
append to its live_output field if present, or 2) if no pending call exists,
store output in a dedicated liveOutputs map keyed by call_id/process_id; only
create a "tool_result" step when handling tool_completed. Keep references to
TranscriptStep, tool_result, tool_output, tool_completed, setLiveTranscripts and
pairTranscriptSteps to locate and implement this change.
251-264: ⚠️ Potential issue | 🟡 Minor

Don't append an extra newline to streamed lines.

The backend reads with read_line(), so event.line already usually includes its trailing newline. Adding another "\n" here double-spaces most live output.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useLiveContext.tsx` around lines 251 - 264, In
useLiveContext.tsx (in the streaming update where you build/update
TranscriptStep), stop appending an extra "\n" to event.line: change the
concatenation that sets newOutput from existingOutput + event.line + "\n" to
existingOutput + event.line, and change the new step's live_output from
event.line + "\n" to event.line so live_output uses the backend-provided newline
only; update the code paths that set live_output (the updatedStep creation and
the new step object) accordingly.
src/tools/shell.rs (1)

188-200: ⚠️ Potential issue | 🟠 Major

Don't treat every 5-second quiet period as an interactive prompt.

This will kill valid commands that are simply silent for a while (sleep 10, long compile phases, tools that update with \r instead of \n). Those runs will incorrectly surface as waiting_for_input with exit code -2. The watchdog needs a stronger signal than “no completed read_line() for 5 seconds.”

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@interface/src/lib/primitives.tsx`:
- Around line 172-177: The current prop forwarding sets allowFloat to false when
both allowFloat and legacy type are undefined, overriding
PrimitiveNumberStepper's own default; update the prop expression in the wrapper
(the arrow component that spreads {...props} into PrimitiveNumberStepper) so
allowFloat is only passed when explicitly provided or when type === "float"
(e.g. compute allowFloatProp = allowFloat !== undefined ? allowFloat : (type ===
"float" ? true : undefined) and pass allowFloat={allowFloatProp}) so
PrimitiveNumberStepper can use its internal default when neither is specified.

In `@src/tools/shell.rs`:
- Around line 197-198: The code is currently discarding the Result of
child_guard.kill().await (call site using child.try_lock() and
child_guard.kill().await) which can hide failures and leave subprocesses
running; update both occurrences (the block around child.try_lock() /
child_guard.kill().await and the similar block at the other occurrence) to
handle the Result instead of using let _ =: if kill() returns Err, log the error
with context (including which branch triggered the kill, e.g., timeout or
waiting_for_input) and, where appropriate, propagate or surface the failure so
callers/tests can observe it rather than silently ignoring it.
- Around line 437-460: The stream code is creating a new call_id with
uuid::Uuid::new_v4() which breaks correlation with the ToolStarted/ToolCompleted
IDs; instead, propagate the existing invocation ID used when emitting
ToolStarted into the StreamContext structs so ToolOutput events use the same
call_id. Replace the local generation of call_id in this block with the existing
invocation/tool-call identifier (the same variable used when you send
ToolStarted) and pass that same identifier into both stdout_ctx and stderr_ctx
StreamContext.call_id fields so all ToolStarted, ToolOutput, and ToolCompleted
share the same ID.

---

Duplicate comments:
In `@interface/src/hooks/useLiveContext.tsx`:
- Around line 242-266: The code currently creates a TranscriptStep of type
"tool_result" on the first tool_output event, which makes pairTranscriptSteps()
(in ToolCall.tsx) treat the call as completed; instead, on tool_output events
update the existing pending call entry or a separate live-output store and do
not synthesize a "tool_result" until a "tool_completed" event arrives. Modify
setLiveTranscripts in useLiveContext.tsx to: 1) locate the pending transcript
step for event.process_id and event.call_id (e.g., the existing "tool_call"
step) and append to its live_output field if present, or 2) if no pending call
exists, store output in a dedicated liveOutputs map keyed by call_id/process_id;
only create a "tool_result" step when handling tool_completed. Keep references
to TranscriptStep, tool_result, tool_output, tool_completed, setLiveTranscripts
and pairTranscriptSteps to locate and implement this change.
- Around line 251-264: In useLiveContext.tsx (in the streaming update where you
build/update TranscriptStep), stop appending an extra "\n" to event.line: change
the concatenation that sets newOutput from existingOutput + event.line + "\n" to
existingOutput + event.line, and change the new step's live_output from
event.line + "\n" to event.line so live_output uses the backend-provided newline
only; update the code paths that set live_output (the updatedStep creation and
the new step object) accordingly.

In `@interface/src/lib/primitives.tsx`:
- Around line 3-10: The imports and re-export are using a hard-coded filesystem
path ("../../node_modules/@spacedrive/primitives/dist/index.js"); update the
import source and the export statement to use the package entrypoint
"@spacedrive/primitives" instead. Replace the current import that brings in
PrimitiveBanner, PrimitiveButton, PrimitiveFilterButton, and
PrimitiveNumberStepper from the dist path so they import from
"@spacedrive/primitives", and change the export * line to export everything from
"@spacedrive/primitives" to remove the tight coupling to node_modules layout.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 32995432-7fa1-43de-ba2f-fd8f2da08cd8

📥 Commits

Reviewing files that changed from the base of the PR and between dbd87b0 and 236bd42.

📒 Files selected for processing (18)

interface/src/api/client.ts
interface/src/components/ToolCall.tsx
interface/src/hooks/useLiveContext.tsx
interface/src/lib/primitives.tsx
prompts/en/tools/shell_description.md.j2
prompts/en/worker.md.j2
src/agent/channel_history.rs
src/agent/cortex.rs
src/agent/worker.rs
src/api/agents.rs
src/api/state.rs
src/api/system.rs
src/lib.rs
src/main.rs
src/tools.rs
src/tools/shell.rs
tests/bulletin.rs
tests/context_dump.rs

✅ Files skipped from review due to trivial changes (4)

src/api/system.rs
prompts/en/tools/shell_description.md.j2
prompts/en/worker.md.j2
src/agent/cortex.rs

🚧 Files skipped from review as they are similar to previous changes (6)

src/agent/worker.rs
src/agent/channel_history.rs
tests/context_dump.rs
src/main.rs
interface/src/api/client.ts
src/api/state.rs

interface/src/lib/primitives.tsx

src/tools/shell.rs

coderabbitai

♻️ Duplicate comments (5)

src/tools/shell.rs (3)
197-199: ⚠️ Potential issue | 🟠 Major

Don't let timeout/quiesce report success if the subprocess kill was skipped or failed.

Both branches use try_lock() and then discard kill().await. If the mutex is busy or kill() returns an error, the tool still reports timeout / waiting_for_input while the child may keep running in the background. Take the lock with a bounded await and at least log the failure.

As per coding guidelines "Don't silently discard errors. No let _ = on Results. Handle them, log them, or propagate them. The only exception is .ok() on channel sends where the receiver may be dropped."

Also applies to: 483-486
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/shell.rs` around lines 197 - 199, The timeout/quiesce branches
currently call child.try_lock() and discard the result of child.kill().await,
which can silently skip killing the subprocess; replace try_lock() with an
awaited, bounded lock acquisition (e.g., with a timeout) on the child mutex and
handle the Result from child.kill().await instead of using let _ = …; if the
lock cannot be acquired or kill() returns Err, log an error indicating the
failure (including the error) and ensure the function reports failure rather
than success for the timeout/waiting_for_input paths; apply the same change to
the other occurrence referenced around lines 483-486.
437-460: ⚠️ Potential issue | 🟠 Major

This call_id still can't correlate live output with the originating tool call.

interface/src/hooks/useLiveContext.tsx still synthesizes IDs when tool_started arrives and reuses those for tool_completed, but ToolOutput now carries a separate UUID minted only here. That leaves streamed shell lines orphaned from the initiating tool_call, so the live panel never attaches to the running invocation. Reuse one server-side invocation ID across ToolStarted, ToolOutput, and ToolCompleted instead of generating a stream-only ID here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/shell.rs` around lines 437 - 460, The code generates a fresh UUID
into call_id and embeds it in StreamContext, which breaks correlation with the
server-side invocation id used by ToolStarted/ToolCompleted; instead stop
creating a new UUID and reuse the server invocation id passed into this routine
(or add an invocation_id parameter) so the same id flows into StreamContext for
both stdout_ctx and stderr_ctx and into all emitted ToolOutput events; update
the call_id binding to use that existing invocation id (replace the call to
uuid::Uuid::new_v4()) and ensure StreamContext.call_id is set from that single
server-side id so ToolStarted, ToolOutput and ToolCompleted share the same
identifier.
137-145: ⚠️ Potential issue | 🟠 Major

The quiesce watchdog kills any quiet command after 5s, not just interactive ones.

This fires on any 5-second gap between completed read_line() calls, so legitimate quiet commands like sleep 10, some cargo build phases, or progress that uses \r/no newline will be misclassified as waiting_for_input and killed.

Also applies to: 181-199
interface/src/lib/primitives.tsx (1)
172-177: ⚠️ Potential issue | 🟠 Major

Preserve the primitive default when neither allowFloat nor legacy type is set.

allowFloat ?? type === "float" collapses to false when both props are undefined, so this wrapper stops PrimitiveNumberStepper from using its own default behavior.
Suggested patch
 >(({type, variant: _variant, allowFloat, ...props}, ref) => (
 	<PrimitiveNumberStepper
 		{...props}
 		ref={ref}
-		allowFloat={allowFloat ?? type === "float"}
+		allowFloat={
+			allowFloat !== undefined
+				? allowFloat
+				: type === "float"
+					? true
+					: undefined
+		}
 	/>
 ));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/lib/primitives.tsx` around lines 172 - 177, The wrapper
currently forces allowFloat to false when both allowFloat and type are undefined
by using allowFloat ?? type === "float"; update the wrapper around
PrimitiveNumberStepper (the arrow function destructuring {type, variant:
_variant, allowFloat, ...props}, ref) so it only passes allowFloat when
explicitly provided or when type === "float"; e.g., compute a resolvedAllowFloat
= allowFloat !== undefined ? allowFloat : (type === "float" ? true : undefined)
and pass allowFloat={resolvedAllowFloat} (or conditionally include the prop) so
PrimitiveNumberStepper can use its own default when neither prop is set.
interface/src/hooks/useLiveContext.tsx (1)
242-267: ⚠️ Potential issue | 🟠 Major

Store streamed chunks verbatim and cap the live buffer.

event.line is already newline-terminated in the common case, so + "\n" double-spaces shell output. More importantly, this keeps copying an ever-growing string on every SSE event, which will balloon browser memory and rerender cost on noisy commands.
Suggested patch
+const MAX_LIVE_OUTPUT_BYTES = 32_000;
+
+function appendLiveOutput(current: string, chunk: string): string {
+	const combined = current + chunk;
+	return combined.length > MAX_LIVE_OUTPUT_BYTES
+		? combined.slice(-MAX_LIVE_OUTPUT_BYTES)
+		: combined;
+}
+
 	const handleToolOutput = useCallback((data: unknown) => {
 		const event = data as ToolOutputEvent;
 		if (event.process_type === "worker") {
 			setLiveTranscripts((prev) => {
 				const steps = prev[event.process_id] ?? [];
@@
 				if (existingIndex >= 0) {
 					// Append to existing result step
 					const step = steps[existingIndex];
 					const existingOutput = (step as TranscriptStep).live_output ?? "";
-					const newOutput = existingOutput + event.line + "\n";
+					const newOutput = appendLiveOutput(existingOutput, event.line);
 					const updatedStep = { ...step, live_output: newOutput };
 					const newSteps = [...steps];
 					newSteps[existingIndex] = updatedStep;
 					return { ...prev, [event.process_id]: newSteps };
 				}
@@
 				const step: TranscriptStep = {
 					type: "tool_result",
 					call_id: event.call_id,
 					name: event.tool_name,
 					text: "",
-					live_output: event.line + "\n",
+					live_output: appendLiveOutput("", event.line),
 				};
 				return { ...prev, [event.process_id]: [...steps, step] };
 			});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useLiveContext.tsx` around lines 242 - 267, In
useLiveContext's setLiveTranscripts update (the block that finds/creates a
"tool_result" step by event.call_id), stop appending an extra "\n" and instead
append event.line verbatim, and enforce a fixed max buffer for
TranscriptStep.live_output to avoid unbounded growth; when updating
existingStep.live_output concatenate event.line (no added newline) and then trim
the resulting string to a configured MAX_LIVE_OUTPUT_SIZE (e.g., keep the last N
characters or last M lines) before setting updatedStep.live_output, so the
update logic in setLiveTranscripts (and the creation of the new step) both use
the same capped, verbatim buffering behavior.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@interface/src/hooks/useLiveContext.tsx`:
- Around line 242-267: In useLiveContext's setLiveTranscripts update (the block
that finds/creates a "tool_result" step by event.call_id), stop appending an
extra "\n" and instead append event.line verbatim, and enforce a fixed max
buffer for TranscriptStep.live_output to avoid unbounded growth; when updating
existingStep.live_output concatenate event.line (no added newline) and then trim
the resulting string to a configured MAX_LIVE_OUTPUT_SIZE (e.g., keep the last N
characters or last M lines) before setting updatedStep.live_output, so the
update logic in setLiveTranscripts (and the creation of the new step) both use
the same capped, verbatim buffering behavior.

In `@interface/src/lib/primitives.tsx`:
- Around line 172-177: The wrapper currently forces allowFloat to false when
both allowFloat and type are undefined by using allowFloat ?? type === "float";
update the wrapper around PrimitiveNumberStepper (the arrow function
destructuring {type, variant: _variant, allowFloat, ...props}, ref) so it only
passes allowFloat when explicitly provided or when type === "float"; e.g.,
compute a resolvedAllowFloat = allowFloat !== undefined ? allowFloat : (type ===
"float" ? true : undefined) and pass allowFloat={resolvedAllowFloat} (or
conditionally include the prop) so PrimitiveNumberStepper can use its own
default when neither prop is set.

In `@src/tools/shell.rs`:
- Around line 197-199: The timeout/quiesce branches currently call
child.try_lock() and discard the result of child.kill().await, which can
silently skip killing the subprocess; replace try_lock() with an awaited,
bounded lock acquisition (e.g., with a timeout) on the child mutex and handle
the Result from child.kill().await instead of using let _ = …; if the lock
cannot be acquired or kill() returns Err, log an error indicating the failure
(including the error) and ensure the function reports failure rather than
success for the timeout/waiting_for_input paths; apply the same change to the
other occurrence referenced around lines 483-486.
- Around line 437-460: The code generates a fresh UUID into call_id and embeds
it in StreamContext, which breaks correlation with the server-side invocation id
used by ToolStarted/ToolCompleted; instead stop creating a new UUID and reuse
the server invocation id passed into this routine (or add an invocation_id
parameter) so the same id flows into StreamContext for both stdout_ctx and
stderr_ctx and into all emitted ToolOutput events; update the call_id binding to
use that existing invocation id (replace the call to uuid::Uuid::new_v4()) and
ensure StreamContext.call_id is set from that single server-side id so
ToolStarted, ToolOutput and ToolCompleted share the same identifier.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2a35e9be-754d-4f6a-ba93-ba812e1abb48

📥 Commits

Reviewing files that changed from the base of the PR and between 236bd42 and 58fe5f4.

📒 Files selected for processing (19)

interface/src/api/client.ts
interface/src/components/ToolCall.tsx
interface/src/hooks/useLiveContext.tsx
interface/src/lib/primitives.tsx
prompts/en/tools/shell_description.md.j2
prompts/en/worker.md.j2
src/agent/channel_history.rs
src/agent/cortex.rs
src/agent/worker.rs
src/api/agents.rs
src/api/state.rs
src/api/system.rs
src/conversation/worker_transcript.rs
src/lib.rs
src/main.rs
src/tools.rs
src/tools/shell.rs
tests/bulletin.rs
tests/context_dump.rs

✅ Files skipped from review due to trivial changes (4)

src/api/system.rs
prompts/en/worker.md.j2
prompts/en/tools/shell_description.md.j2
src/api/agents.rs

🚧 Files skipped from review as they are similar to previous changes (8)

src/agent/worker.rs
src/agent/channel_history.rs
src/tools.rs
tests/context_dump.rs
src/main.rs
interface/src/api/client.ts
src/agent/cortex.rs
src/api/state.rs

Replace batch-output shell execution with spawn-based streaming: - Shell commands now stream output line-by-line via ProcessEvent::ToolOutput - 5-second quiesce detection kills interactive prompts (waiting_for_input) - Frontend renders live output in ToolCall component with auto-scroll - ToolOutput events broadcast via separate SSE channel (capacity 1024) - Timeout enforcement wraps entire streaming operation - Watchdog now kills child process when quiesce detected Fixes: - P1: Timeout was ignored in streaming mode (now enforced) - P1: Watchdog didn't kill child (now passes Arc<Mutex<Child>>) - P2: Frontend call_id mismatch (now uses pendingToolCallIdsRef) Files changed: - src/tools/shell.rs: Complete rewrite with streaming - src/lib.rs: Add ProcessEvent::ToolOutput variant - src/api/state.rs: Wire SSE for tool_output events - interface/: Live output rendering components - prompts/: Interactive guidance for LLM

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (2)

src/tools/shell.rs (1)

186-219: ⚠️ Potential issue | 🟠 Major

The quiesce watchdog will misclassify quiet commands as interactive.

This loop treats any ~5s window without a completed line read as waiting_for_input. Commands like sleep 10, long quiet build phases, or tools that only repaint with \r will be terminated even though they never asked for input. Please gate this on prompt-like output or make the heuristic configurable/opt-in.
src/api/state.rs (1)
653-653: ⚠️ Potential issue | 🟠 Major

The live transcript builder still uses mismatched call IDs.

TranscriptStep::ToolResult already carries call_id in src/conversation/worker_transcript.rs, Lines 35-42, but this path appends streamed output under the real call_id while Line 596 still mints a separate live_* ID for the ToolCall step and Lines 649-653 append a second ToolResult with an empty call_id. Because register_tool_output_stream() mutates the same cache from a different task, a fast ToolOutput can also land before the ToolCall/cache entry exists. Refreshing mid-run can therefore show the same shell invocation as missing, reordered, or duplicated transcript steps. Reuse one lifecycle call_id end-to-end and serialize transcript accumulation through one ordered path.

Also applies to: 823-876
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/state.rs` at line 653, The live transcript path mints a separate
live_* call_id and appends ToolResult with empty call_id causing
mismatched/duplicated steps; update the code so the same lifecycle call_id is
created once and reused end-to-end: generate the call_id deterministically when
creating the ToolCall entry, store that call_id into the live_output entry
instead of minting a new live_* id, and make register_tool_output_stream()
append streamed output to that same call_id (and not an empty call_id); also
handle the race where ToolOutput can arrive before ToolCall by creating the
cache entry (or buffering the output keyed by the deterministic call_id)
immediately when the call_id is generated so subsequent
register_tool_output_stream() mutations serialize into the same transcript path
(apply the same fixes in the second region referenced).

🧹 Nitpick comments (1)

src/lib.rs (1)
361-390: Consider returning a named bus struct here.

All three return values now have the same Sender<ProcessEvent> type, so swapping memory_event_tx and tool_output_tx will still compile and silently misroute events. A small ProcessEventBuses { event_tx, memory_event_tx, tool_output_tx } return type would make this API much harder to misuse.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib.rs` around lines 361 - 390, The three broadcast senders returned by
create_process_event_buses and create_process_event_buses_with_capacity are all
the same type and can be accidentally swapped; introduce a new ProcessEventBuses
struct (e.g. pub struct ProcessEventBuses { pub control:
tokio::sync::broadcast::Sender<ProcessEvent>, pub memory:
tokio::sync::broadcast::Sender<ProcessEvent>, pub tool_output:
tokio::sync::broadcast::Sender<ProcessEvent> }) and change both functions to
return that struct instead of a tuple, constructing it with the existing local
variables (event_tx, memory_event_tx, tool_output_tx) so callers must use named
fields and cannot misroute events.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@interface/src/components/ToolCall.tsx`:
- Around line 56-63: The code incorrectly treats any tool_result step as a final
completion; update the indexing in ToolCall.tsx (the resultsById Map created in
the steps loop) to include an explicit status field (e.g., status: "pending" |
"final" | "waiting_for_input" ) and the liveOutput separately instead of
inferring completion from the presence of tool_result. Change the producer in
useLiveContext.tsx to set that explicit status on the temporary tool_result it
emits while streaming, and then update the consumer logic (the conditional
around result ? ... : "running" and the live-output branch referenced around
lines ~82-93) to switch behavior based on the new status value (treat status ===
"pending" as running, status === "final" as completed, and handle
"waiting_for_input" distinctly).

In `@interface/src/lib/primitives.tsx`:
- Around line 89-99: The href branch rendering via hasHref(props) and
PrimitiveButton must honor the loading guard like the action-button path: when
props.loading (or the component's loading state) is true, set the element to
disabled/aria-busy (prevent activation/clicks and focus as appropriate), prevent
navigation/activation handlers from running, and render the spinner instead of
interactive content; update the PrimitiveButton usage in that branch (the JSX
that spreads {...props} and passes ref, variant/buttonVariant,
className/buttonClassName, and children/content) to include the same loading
props/guards (disabled, aria-busy, and click/keypress suppression) used by the
action-button code path so the link appears and behaves disabled while loading.

In `@src/tools/shell.rs`:
- Around line 53-84: with_streaming currently stores a single call_id on the
ShellTool instance causing reuse across multiple shell invocations; instead
generate a fresh call_id per execution inside the execution path (e.g., in
ShellTool::call or ShellTool::run_streaming) and thread that per-run ID into all
lifecycle event emissions (ToolStarted/ToolOutput/ToolCompleted) and wherever
run_streaming needs it; update ShellTool::with_streaming to stop setting a
persistent call_id (or remove the field) and change run_streaming/call
signatures to accept a call_id (or create it at start of call() and pass to
run_streaming) so each invocation uses a unique ID.
- Around line 149-169: The emitted ToolOutput currently forwards the raw buffer
(in the read_line loop that constructs and sends ProcessEvent::ToolOutput),
which preserves the trailing newline and causes double newlines when the
accumulator in src/api/state.rs appends another '\n'; fix it by normalizing the
line before sending — call a trim that removes trailing newline(s) (e.g.,
trim_end_matches or trim_end) on the local `line`/`scrubbed` value in the
reader.read_line loop prior to populating ProcessEvent::ToolOutput.line so the
sent payload has no trailing line ending; ensure this change is applied where
ctx.tool_output_tx.send(ProcessEvent::ToolOutput { ... line: ... }) is
constructed (leave the accumulator unchanged).

---

Duplicate comments:
In `@src/api/state.rs`:
- Line 653: The live transcript path mints a separate live_* call_id and appends
ToolResult with empty call_id causing mismatched/duplicated steps; update the
code so the same lifecycle call_id is created once and reused end-to-end:
generate the call_id deterministically when creating the ToolCall entry, store
that call_id into the live_output entry instead of minting a new live_* id, and
make register_tool_output_stream() append streamed output to that same call_id
(and not an empty call_id); also handle the race where ToolOutput can arrive
before ToolCall by creating the cache entry (or buffering the output keyed by
the deterministic call_id) immediately when the call_id is generated so
subsequent register_tool_output_stream() mutations serialize into the same
transcript path (apply the same fixes in the second region referenced).

---

Nitpick comments:
In `@src/lib.rs`:
- Around line 361-390: The three broadcast senders returned by
create_process_event_buses and create_process_event_buses_with_capacity are all
the same type and can be accidentally swapped; introduce a new ProcessEventBuses
struct (e.g. pub struct ProcessEventBuses { pub control:
tokio::sync::broadcast::Sender<ProcessEvent>, pub memory:
tokio::sync::broadcast::Sender<ProcessEvent>, pub tool_output:
tokio::sync::broadcast::Sender<ProcessEvent> }) and change both functions to
return that struct instead of a tuple, constructing it with the existing local
variables (event_tx, memory_event_tx, tool_output_tx) so callers must use named
fields and cannot misroute events.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: eb3a5e3f-ba19-4d82-81d5-c10ab5476a90

📥 Commits

Reviewing files that changed from the base of the PR and between 58fe5f4 and 4d081e0.

📒 Files selected for processing (19)

interface/src/api/client.ts
interface/src/components/ToolCall.tsx
interface/src/hooks/useLiveContext.tsx
interface/src/lib/primitives.tsx
prompts/en/tools/shell_description.md.j2
prompts/en/worker.md.j2
src/agent/channel_history.rs
src/agent/cortex.rs
src/agent/worker.rs
src/api/agents.rs
src/api/state.rs
src/api/system.rs
src/conversation/worker_transcript.rs
src/lib.rs
src/main.rs
src/tools.rs
src/tools/shell.rs
tests/bulletin.rs
tests/context_dump.rs

🚧 Files skipped from review as they are similar to previous changes (12)

src/agent/channel_history.rs
src/agent/worker.rs
prompts/en/tools/shell_description.md.j2
tests/context_dump.rs
prompts/en/worker.md.j2
src/api/system.rs
src/agent/cortex.rs
src/api/agents.rs
src/tools.rs
src/conversation/worker_transcript.rs
interface/src/hooks/useLiveContext.tsx
tests/bulletin.rs

coderabbitai · 2026-04-12T22:53:03Z

interface/src/components/ToolCall.tsx

+	const resultsById = new Map<string, {name: string; text: string; liveOutput?: string}>();

 	// First pass: index all tool_result steps by call_id
 	for (const step of steps) {
 		if (step.type === "tool_result") {
-			resultsById.set(step.call_id, {name: step.name, text: step.text});
+			const liveOutput = step.live_output;
+			resultsById.set(step.call_id, {name: step.name, text: step.text, liveOutput});
 		}


⚠️ Potential issue | 🟠 Major

Don't use placeholder tool_result steps as the completion signal.

interface/src/hooks/useLiveContext.tsx:239-275 now creates a temporary tool_result with text: "" just to carry live_output while the tool is still running. Line 91's result ? ... : "running" heuristic treats that placeholder as a real completion, so the card flips out of the running state on the first streamed line and the live-output branch below becomes effectively dead. It also leaves no way to represent the new waiting_for_input shell outcome distinctly from a hard error. Please carry an explicit pending/final status instead of deriving it from the presence of a tool_result record.

Also applies to: 82-93

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@interface/src/components/ToolCall.tsx` around lines 56 - 63, The code incorrectly treats any tool_result step as a final completion; update the indexing in ToolCall.tsx (the resultsById Map created in the steps loop) to include an explicit status field (e.g., status: "pending" | "final" | "waiting_for_input" ) and the liveOutput separately instead of inferring completion from the presence of tool_result. Change the producer in useLiveContext.tsx to set that explicit status on the temporary tool_result it emits while streaming, and then update the consumer logic (the conditional around result ? ... : "running" and the live-output branch referenced around lines ~82-93) to switch behavior based on the new status value (treat status === "pending" as running, status === "final" as completed, and handle "waiting_for_input" distinctly).

coderabbitai · 2026-04-12T22:53:03Z

interface/src/lib/primitives.tsx

+	if (hasHref(props)) {
+		return (
+			<PrimitiveButton
+				{...props}
+				ref={ref}
+				variant={buttonVariant}
+				className={buttonClassName}
+			>
+				{content}
+			</PrimitiveButton>
+		);


⚠️ Potential issue | 🟡 Minor

Apply the loading guard in the href branch too.

This branch renders the spinner, but unlike the action-button path it never marks the control busy/disabled or blocks activation. A loading link can still be focused and clicked, which makes the wrapper look disabled while remaining interactive.

Suggested patch

if (hasHref(props)) { + const {onClick, tabIndex, ...linkProps} = props; return ( <PrimitiveButton - {...props} + {...linkProps} ref={ref} + aria-busy={loading || undefined} + aria-disabled={loading || undefined} + tabIndex={loading ? -1 : tabIndex} + onClick={(event) => { + if (loading) { + event.preventDefault(); + event.stopPropagation(); + return; + } + onClick?.(event); + }} variant={buttonVariant} - className={buttonClassName} + className={cx(buttonClassName, loading && "pointer-events-none")} > {content} </PrimitiveButton> ); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (hasHref(props)) {

return (

<PrimitiveButton

{...props}

ref={ref}

variant={buttonVariant}

className={buttonClassName}

>

{content}

</PrimitiveButton>

);

if (hasHref(props)) {

const {onClick, tabIndex, ...linkProps} = props;

return (

<PrimitiveButton

{...linkProps}

ref={ref}

aria-busy={loading || undefined}

aria-disabled={loading || undefined}

tabIndex={loading ? -1 : tabIndex}

onClick={(event) => {

if (loading) {

event.preventDefault();

event.stopPropagation();

return;

}

onClick?.(event);

}}

variant={buttonVariant}

className={cx(buttonClassName, loading && "pointer-events-none")}

>

{content}

</PrimitiveButton>

);

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@interface/src/lib/primitives.tsx` around lines 89 - 99, The href branch rendering via hasHref(props) and PrimitiveButton must honor the loading guard like the action-button path: when props.loading (or the component's loading state) is true, set the element to disabled/aria-busy (prevent activation/clicks and focus as appropriate), prevent navigation/activation handlers from running, and render the spinner instead of interactive content; update the PrimitiveButton usage in that branch (the JSX that spreads {...props} and passes ref, variant/buttonVariant, className/buttonClassName, and children/content) to include the same loading props/guards (disabled, aria-busy, and click/keypress suppression) used by the action-button code path so the link appears and behaves disabled while loading.

src/tools/shell.rs

coderabbitai · 2026-04-12T22:53:03Z

src/tools/shell.rs

+        match reader.read_line(&mut line).await {
+            Ok(0) => break,
+            Ok(_) => {
+                let now_ms = std::time::SystemTime::now()
+                    .duration_since(std::time::UNIX_EPOCH)
+                    .unwrap_or_default()
+                    .as_millis() as u64;
+                ctx.last_output_ms.store(now_ms, Ordering::SeqCst);
+
+                if let (Some(tx), Some(agent_id), Some(process_id)) =
+                    (&ctx.tool_output_tx, &ctx.agent_id, &ctx.process_id)
+                {
+                    let scrubbed = crate::secrets::scrub::scrub_leaks(&line);
+                    if let Err(err) = tx.send(ProcessEvent::ToolOutput {
+                        agent_id: agent_id.clone(),
+                        process_id: process_id.clone(),
+                        channel_id: ctx.channel_id.clone(),
+                        call_id: ctx.call_id.clone(),
+                        tool_name: "shell".to_string(),
+                        line: scrubbed,
+                        stream: ctx.stream_name.clone(),


⚠️ Potential issue | 🟡 Minor

Normalize ToolOutput.line before emitting it.

This forwards the raw line buffer, and in src/api/state.rs, Lines 862-867 and Line 875 append another \n while rebuilding live_output. That double-delimits most lines and produces blank rows after a refresh. Strip the trailing line ending here or stop appending a second newline in the accumulator.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/tools/shell.rs` around lines 149 - 169, The emitted ToolOutput currently forwards the raw buffer (in the read_line loop that constructs and sends ProcessEvent::ToolOutput), which preserves the trailing newline and causes double newlines when the accumulator in src/api/state.rs appends another '\n'; fix it by normalizing the line before sending — call a trim that removes trailing newline(s) (e.g., trim_end_matches or trim_end) on the local `line`/`scrubbed` value in the reader.read_line loop prior to populating ProcessEvent::ToolOutput.line so the sent payload has no trailing line ending; ensure this change is applied where ctx.tool_output_tx.send(ProcessEvent::ToolOutput { ... line: ... }) is constructed (leave the accumulator unchanged).

vsumner force-pushed the feat/interactive-shell-streaming branch from a006718 to dbd87b0 Compare April 12, 2026 19:41

tembo bot reviewed Apr 12, 2026

View reviewed changes

src/tools/shell.rs Show resolved Hide resolved

src/tools/shell.rs Outdated Show resolved Hide resolved

interface/src/hooks/useLiveContext.tsx Outdated Show resolved Hide resolved