Skip to content

feat: Interactive shell streaming with live output#562

Open
vsumner wants to merge 2 commits intomainfrom
feat/interactive-shell-streaming
Open

feat: Interactive shell streaming with live output#562
vsumner wants to merge 2 commits intomainfrom
feat/interactive-shell-streaming

Conversation

@vsumner
Copy link
Copy Markdown
Collaborator

@vsumner vsumner commented Apr 12, 2026

Summary

Implements spawn-based streaming shell execution with real-time output and quiesce detection. This eliminates the #1 worker reliability gap: interactive commands that silently hang and timeout.

What Changed

Backend

  • ShellTool: Rewrote from batch cmd.output() to spawn-based streaming
  • ProcessEvent::ToolOutput: New event variant for line-by-line output streaming
  • Quiesce Detection: 5-second silence watchdog detects interactive prompts
  • Timeout Enforcement: Full timeout wraps entire streaming operation (not just spawn)
  • Watchdog Kill: When quiesce fires, watchdog now kills the child process to unblock stream readers

Frontend

  • ToolCall Component: Renders live output in scrollable panel with auto-scroll
  • useLiveContext: Accumulates streaming lines and clears on tool completion
  • call_id Matching: Uses pending call IDs to properly pair streaming output with tool calls

Prompts

  • Shell tool description now explains interactive behavior and --yes/-y flags
  • Worker prompt mentions retry patterns for interactive commands

Key Design Decisions

  1. Separate Broadcast Channel: ToolOutput uses capacity-1024 channel, lossy-by-design (dropping lines acceptable)
  2. Arc<Mutex>: Watchdog and main task both need kill access to the child process
  3. Exit Code -2: Signals "waiting_for_input" state when quiesce detection fires
  4. Line-by-line Streaming: Each stdout/stderr line broadcasts immediately via SSE

Verification

  • cargo check --lib passes
  • cargo clippy --lib passes
  • cargo check --tests passes
  • bun run build (frontend) passes
  • just preflight passes

Related

Closes the interactive shell reliability gap. CI=true and DEBIAN_FRONTEND=noninteractive already set in sandbox (pre-existing).

Note

Transforms shell execution from blocking batch mode to live streaming with automatic detection of interactive prompts. Fixes three P1/P2 gaps: timeout enforcement, process termination, and UI state synchronization.

Written by Tembo for commit a006718. This will update automatically on new commits.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 12, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds end-to-end live tool-output streaming: new ProcessEvent::ToolOutput and ApiEvent::ToolOutput, a dedicated tool-output broadcast channel, streaming ShellTool per-line emission with input-detection, API forwarding of per-line events to SSE, backend accumulation into worker transcripts, frontend SSE handling/rendering of liveOutput, and a UI primitives compatibility wrapper.

Changes

Cohort / File(s) Summary
Core types & bus wiring
src/lib.rs, src/main.rs, src/api/agents.rs
Introduced ProcessEvent::ToolOutput, TOOL_OUTPUT_BUS_CAPACITY, and a third broadcast sender; threaded tool_output_tx into AgentDeps and main agent wiring.
API state & SSE
src/api/state.rs, src/api/system.rs, interface/src/api/client.ts
Added ApiEvent::ToolOutput, ApiState.register_tool_output_stream task to forward per-line events and accumulate live transcripts, and SSE mapping to "tool_output"; client type ToolOutputEvent added.
Shell tool & worker tool-server
src/tools.rs, src/tools/shell.rs
create_worker_tool_server now accepts tool_output_tx; ShellTool gains streaming mode (with with_streaming) that emits per-line ToolOutput, scrubs secrets, streams stdout/stderr, and detects/kills on ~5s interactive waits (waiting_for_input flag).
Agent & worker integration
src/agent/worker.rs, src/agent/cortex.rs, src/agent/channel_history.rs
Wired tool_output_tx into worker startup; track ToolOutput as worker activity; exclude ToolOutput from channel routing and from cortex signals where appropriate.
Transcript model & conversation handling
src/conversation/worker_transcript.rs, src/api/state.rs
Added live_output: Option<String> to TranscriptStep::ToolResult; constructors set live_output: None for non-streaming; ApiState accumulates live_output for worker processes.
Frontend live handling & rendering
interface/src/hooks/useLiveContext.tsx, interface/src/components/ToolCall.tsx
Added SSE tool_output handler to append/create tool_result steps with live_output; ToolCallPair gains liveOutput?: string and running-state renders scrollable liveOutput.
Frontend primitives shim
interface/src/lib/primitives.tsx
New compatibility wrappers re-exporting @spacedrive/primitives and providing legacy-prop mappings for Button, Banner, FilterButton, NumberStepper.
Docs / prompts
prompts/en/tools/shell_description.md.j2, prompts/en/worker.md.j2
Clarified shell tool runs with no interactive stdin/TTY, 5s inactivity-driven input detection, and guidance for non-interactive flags/pipes/heredocs.
Tests
tests/bulletin.rs, tests/context_dump.rs
Updated test bootstrapping to request third bus capacity and pass tool_output_tx into AgentDeps and worker tool-server setup.
Interface types
interface/src/api/client.ts
Exported ToolOutputEvent and extended ApiEvent union to include tool_output.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: Interactive shell streaming with live output' clearly and concisely summarizes the primary change—implementing real-time streaming shell output with interactive command detection.
Description check ✅ Passed The description provides comprehensive context about the streaming shell execution feature, architectural changes, design decisions, and verification steps, all directly related to the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/interactive-shell-streaming

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@vsumner vsumner force-pushed the feat/interactive-shell-streaming branch from a006718 to dbd87b0 Compare April 12, 2026 19:41
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (4)
src/agent/channel_history.rs (1)

470-471: Add a regression test for ToolOutput exclusion.

This changes channel-scoping semantics for a brand-new ProcessEvent variant, but the tests below still only lock in ToolStarted, agent-message, and TextDelta behavior. A small assertion for ProcessEvent::ToolOutput { .. } => false would make this much harder to regress later.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/channel_history.rs` around lines 470 - 471, Add a regression test
that asserts ProcessEvent::ToolOutput is excluded from channel-scoped history:
in the tests near the existing assertions for ToolStarted, agent-message, and
TextDelta in channel_history.rs, construct a ProcessEvent::ToolOutput { .. }
instance and assert that the channel-scoping predicate (the same helper/assert
used for ToolStarted/TextDelta) returns false; this ensures the new ToolOutput
variant is explicitly covered and prevents regressions to the channel-scoping
semantics for ProcessEvent::ToolOutput.
tests/context_dump.rs (1)

90-121: Extract the test AgentDeps bootstrap into a shared helper.

Adding tool_output_tx required touching this bootstrap and the near-copy in tests/bulletin.rs. At this point, the event-bus setup plus AgentDeps skeleton looks worth centralizing in a shared test utility so future deps changes stay one-edit instead of many.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/context_dump.rs` around lines 90 - 121, Extract the shared event-bus +
AgentDeps bootstrap into a single test helper (e.g., make_test_agent_deps or
build_agent_deps) that encapsulates calling
create_process_event_buses_with_capacity(...) and constructing
spacebot::AgentDeps (including agent_id, mcp_manager, sandbox, memory_search,
llm_manager, task_store, project_store, runtime_config, event_tx,
memory_event_tx, tool_output_tx, sqlite_pool, etc.); replace the duplicated
inline setup in tests that currently call
create_process_event_buses_with_capacity and build AgentDeps (such as where
AgentDeps is instantiated) to call this helper so future changes to AgentDeps
fields only need one edit and the helper returns the fully wired AgentDeps and
any event channels you need.
interface/src/components/ToolCall.tsx (1)

61-62: Redundant type cast.

step is already typed as TranscriptStep (line 54 parameter), which is aliased to ExtendedTranscriptStep (line 12). The cast is unnecessary.

✨ Suggested simplification
 		if (step.type === "tool_result") {
-			const liveOutput = (step as ExtendedTranscriptStep).live_output;
+			const liveOutput = step.live_output;
 			resultsById.set(step.call_id, {name: step.name, text: step.text, liveOutput});
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/components/ToolCall.tsx` around lines 61 - 62, The explicit
cast "(step as ExtendedTranscriptStep)" is redundant because the parameter
"step" is already typed as TranscriptStep (an alias of ExtendedTranscriptStep);
remove the cast and read the property directly (use step.live_output) when
computing liveOutput and calling resultsById.set with step.call_id, step.name,
and step.text in the ToolCall component to simplify the code.
src/tools/shell.rs (1)

425-429: Watchdog task not aborted on normal completion.

When the command completes normally before the 5-second quiesce timeout, the watchdog task continues running in the background until its next wake-up (up to 5 seconds later). This is a minor resource leak.

Consider aborting the watchdog when streaming completes:

♻️ Proposed fix
     let watchdog = spawn_quiesce_watchdog(
         Arc::clone(&last_output_ms),
         Arc::clone(&interactive_detected),
         Arc::clone(&child_arc),
     );
 
     // ... streaming code ...
 
     let (stdout_lines, stderr_lines) = match streaming_result {
         Ok((stdout, stderr)) => (stdout, stderr),
         Err(_) => {
             // Timeout - kill child and return error
+            watchdog.abort();
             if let Ok(mut child_guard) = child_arc.try_lock() {
                 let _ = child_guard.kill().await;
             }
             return Err(ShellError {
                 message: format!("Command timed out after {} seconds", timeout.as_secs()),
                 exit_code: -1,
             });
         }
     };
+    
+    // Clean up watchdog if still running
+    watchdog.abort();
 
     let waiting_for_input = interactive_detected.load(Ordering::SeqCst);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/shell.rs` around lines 425 - 429, The watchdog started via
spawn_quiesce_watchdog is never aborted on normal command completion (let
_watchdog ...), so it can linger up to the quiesce timeout; change the code to
retain the returned JoinHandle (e.g., let watchdog =
spawn_quiesce_watchdog(...)) and explicitly abort it when
streaming/child-handling finishes (call watchdog.abort() or equivalent in the
code path that runs after the child exits or streaming completes), ensuring the
JoinHandle is dropped/awaited as appropriate to avoid the background task leak;
reference spawn_quiesce_watchdog, last_output_ms, interactive_detected, and
child_arc to locate where to capture and abort the handle.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@interface/src/components/ToolCall.tsx`:
- Around line 394-402: Remove the unreachable shell live-output block in
ToolCall.tsx: inside the renderResult function, delete the early-return
conditional that checks `if (pair.status === "running" && pair.liveOutput) { ...
}` because `renderResult` already delegates to `renderer.resultView(pair)` which
handles live output; ensure no other code expects that branch and keep
`renderer.resultView(pair)` as the single path for rendering the result.

In `@interface/src/lib/primitives.tsx`:
- Around line 4-10: The file imports PrimitiveBanner, PrimitiveButton,
PrimitiveFilterButton, PrimitiveNumberStepper and re-exports the package using a
hard-coded ../../node_modules path; change both import and export to import from
the package name "@spacedrive/primitives" instead (e.g., update the named import
that yields
PrimitiveBanner/PrimitiveButton/PrimitiveFilterButton/PrimitiveNumberStepper and
the export statement) so the module resolution uses the package entrypoint
rather than a workspace-specific node_modules path.

In `@prompts/en/worker.md.j2`:
- Around line 90-91: Update the wording that currently says "Commands have no
stdin" in prompts/en/worker.md.j2 to clarify "no interactive stdin / no attached
TTY" and explicitly state that shell-created pipes and here-docs (e.g., echo y |
..., heredoc <<EOF) are valid non-interactive input paths; then mirror the exact
revised phrasing in prompts/en/tools/shell_description.md.j2 so both templates
use the same wording and examples.

In `@src/agent/cortex.rs`:
- Around line 1414-1415: The ProcessEvent::ToolOutput events are currently
ignored in signal_from_event()/cortex signal handling but must still update
worker activity timestamps; modify observe_health_event() so that when it
receives a ProcessEvent::ToolOutput originating from a ProcessId::Worker(...) it
updates HealthRuntimeState::last_activity_at (same way other worker activity
events do) while leaving signal_from_event() behavior unchanged if desired;
locate references to ProcessEvent::ToolOutput, observe_health_event(),
HealthRuntimeState::last_activity_at, and ProcessId::Worker and add the
activity-timestamp update for worker tool output events.

In `@src/api/state.rs`:
- Around line 312-322: ToolOutput lacks a stable per-call identifier so streamed
lines can’t be deterministically associated with a specific invocation; add a
server-generated unique call_id (e.g., UUID/String) field to the ToolOutput
payload and ensure the same call_id is emitted on ToolStarted and ToolCompleted
events for the same invocation so consumers can join streams reliably; update
the ToolOutput struct and the code paths that construct/emits ToolOutput,
ToolStarted and ToolCompleted to generate/populate the call_id (preserve
existing process_id/tool_name fields) and include it in serialization so
downstream consumers receive the stable identifier.
- Around line 824-875: register_tool_output_stream currently forwards
ProcessEvent::ToolOutput only to SSE and does not update the in-memory
transcript cache, so streamed shell lines are lost on refresh; modify
register_tool_output_stream to also update live_worker_transcripts (the same
structure updated by register_agent_events for
ToolStarted/ToolCompleted/WorkerText) whenever a ToolOutput arrives — either
call the existing transcript-accumulation helper used by register_agent_events
or replicate its logic: derive the (process_type, process_id) via
process_id_info(process_id), locate/update the live_worker_transcripts entry for
agent_id and channel_id, append the streamed line/stream metadata to the
transcript, and then continue sending the ApiEvent::ToolOutput to api_tx; ensure
the update is performed inside the tokio::spawn task before/after sending the
ApiEvent so get_live_transcript will recover in-flight output.

---

Nitpick comments:
In `@interface/src/components/ToolCall.tsx`:
- Around line 61-62: The explicit cast "(step as ExtendedTranscriptStep)" is
redundant because the parameter "step" is already typed as TranscriptStep (an
alias of ExtendedTranscriptStep); remove the cast and read the property directly
(use step.live_output) when computing liveOutput and calling resultsById.set
with step.call_id, step.name, and step.text in the ToolCall component to
simplify the code.

In `@src/agent/channel_history.rs`:
- Around line 470-471: Add a regression test that asserts
ProcessEvent::ToolOutput is excluded from channel-scoped history: in the tests
near the existing assertions for ToolStarted, agent-message, and TextDelta in
channel_history.rs, construct a ProcessEvent::ToolOutput { .. } instance and
assert that the channel-scoping predicate (the same helper/assert used for
ToolStarted/TextDelta) returns false; this ensures the new ToolOutput variant is
explicitly covered and prevents regressions to the channel-scoping semantics for
ProcessEvent::ToolOutput.

In `@src/tools/shell.rs`:
- Around line 425-429: The watchdog started via spawn_quiesce_watchdog is never
aborted on normal command completion (let _watchdog ...), so it can linger up to
the quiesce timeout; change the code to retain the returned JoinHandle (e.g.,
let watchdog = spawn_quiesce_watchdog(...)) and explicitly abort it when
streaming/child-handling finishes (call watchdog.abort() or equivalent in the
code path that runs after the child exits or streaming completes), ensuring the
JoinHandle is dropped/awaited as appropriate to avoid the background task leak;
reference spawn_quiesce_watchdog, last_output_ms, interactive_detected, and
child_arc to locate where to capture and abort the handle.

In `@tests/context_dump.rs`:
- Around line 90-121: Extract the shared event-bus + AgentDeps bootstrap into a
single test helper (e.g., make_test_agent_deps or build_agent_deps) that
encapsulates calling create_process_event_buses_with_capacity(...) and
constructing spacebot::AgentDeps (including agent_id, mcp_manager, sandbox,
memory_search, llm_manager, task_store, project_store, runtime_config, event_tx,
memory_event_tx, tool_output_tx, sqlite_pool, etc.); replace the duplicated
inline setup in tests that currently call
create_process_event_buses_with_capacity and build AgentDeps (such as where
AgentDeps is instantiated) to call this helper so future changes to AgentDeps
fields only need one edit and the helper returns the fully wired AgentDeps and
any event channels you need.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fef29a4a-6652-4570-b229-4ef6e266e925

📥 Commits

Reviewing files that changed from the base of the PR and between 55ad26a and a006718.

📒 Files selected for processing (18)
  • interface/src/api/client.ts
  • interface/src/components/ToolCall.tsx
  • interface/src/hooks/useLiveContext.tsx
  • interface/src/lib/primitives.tsx
  • prompts/en/tools/shell_description.md.j2
  • prompts/en/worker.md.j2
  • src/agent/channel_history.rs
  • src/agent/cortex.rs
  • src/agent/worker.rs
  • src/api/agents.rs
  • src/api/state.rs
  • src/api/system.rs
  • src/lib.rs
  • src/main.rs
  • src/tools.rs
  • src/tools/shell.rs
  • tests/bulletin.rs
  • tests/context_dump.rs

@vsumner vsumner force-pushed the feat/interactive-shell-streaming branch from dbd87b0 to 236bd42 Compare April 12, 2026 19:49
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
interface/src/lib/primitives.tsx (1)

8-10: ⚠️ Potential issue | 🟠 Major

Use package entrypoint imports instead of hard-coded node_modules paths.

Line 8 and Line 10 couple this module to a specific workspace layout (../../node_modules/.../dist/index.js) and bypass @spacedrive/primitives export resolution. Import/re-export from @spacedrive/primitives directly.

Proposed fix
 import {
 	Banner as PrimitiveBanner,
 	Button as PrimitiveButton,
 	FilterButton as PrimitiveFilterButton,
 	NumberStepper as PrimitiveNumberStepper,
-} from "../../node_modules/@spacedrive/primitives/dist/index.js";
+} from "@spacedrive/primitives";
 
-export * from "../../node_modules/@spacedrive/primitives/dist/index.js";
+export * from "@spacedrive/primitives";
#!/bin/bash
# Verify direct node_modules imports and confirm package-entrypoint usage.

# 1) Find hard-coded node_modules imports in interface source.
rg -n --type ts --type tsx 'from\s+["'\'']\.\./\.\./node_modules/@spacedrive/primitives/dist/index\.js["'\'']|from\s+["'\''].*node_modules/'

# 2) Inspect this file specifically.
cat -n interface/src/lib/primitives.tsx | sed -n '1,40p'

# 3) Check dependency declaration presence for `@spacedrive/primitives`.
fd -i package.json | xargs rg -n '"@spacedrive/primitives"'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/lib/primitives.tsx` around lines 8 - 10, The file imports and
re-exports from a hard-coded
"../../node_modules/@spacedrive/primitives/dist/index.js"; update both the
import and the export to use the package entrypoint " `@spacedrive/primitives` "
instead so Node/TS resolution and package exports are respected. Locate the
import statement and the export-all statement in primitives.tsx (the lines that
reference "../../node_modules/@spacedrive/primitives/dist/index.js") and replace
them to import/export directly from `@spacedrive/primitives`; ensure any named
imports still match exported symbols from the package and run type/checking to
confirm no breakage.
interface/src/hooks/useLiveContext.tsx (1)

247-275: ⚠️ Potential issue | 🟡 Minor

Preserve streamed lines as-is and bound the buffer.

The backend is sending event.line from read_line(), so event.line + "\n" will double-space normal output. This path also grows live_output without any cap, which will balloon React state for noisy commands.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useLiveContext.tsx` around lines 247 - 275, The code
appends "\n" to every streamed event.line and unboundedly grows live_output;
update all places in useLiveContext.tsx where live_output is built (both when
updating an existing TranscriptStep and when creating new tool_result steps) to
preserve event.line exactly (append event.line without forcing an extra newline,
or only add a newline when event.line does not already endWith("\n")), and clamp
the live_output buffer to a sensible maximum (introduce a MAX_LIVE_OUTPUT_LENGTH
constant and after concatenation truncate to the most recent N characters) so
pendingToolCallIdsRef-related handlers (and the fallback synthetic call_id path)
won't let React state grow without bound.
src/tools/shell.rs (2)

135-149: ⚠️ Potential issue | 🟠 Major

The quiesce watchdog will kill valid quiet commands.

last_output_ms only advances when read_line() completes, so commands that are legitimately quiet for >5s, or that print progress/prompts without newline terminators, are classified as waiting_for_input and killed. That will false-positive on things like sleep, quiet build phases, and \r-driven progress output.

Also applies to: 178-197


194-195: ⚠️ Potential issue | 🔴 Critical

Don't make process termination best-effort here.

Both kill paths use try_lock(), so a brief lock holder can cause the kill to be skipped entirely, leaving the child running after quiesce/timeout. Even when the lock succeeds, let _ = child_guard.kill().await hides kill failures.

As per coding guidelines "Don't silently discard errors. No let _ = on Results. Handle them, log them, or propagate them. The only exception is .ok() on channel sends where the receiver may be dropped."

Also applies to: 471-472

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/shell.rs` around lines 194 - 195, The current termination uses
child.try_lock() and then swallows kill() errors with let _ = ..., which can
skip killing or hide failures; replace the non-blocking try_lock() with an
awaited lock acquisition (e.g., child.lock().await) so we reliably obtain the
guard before attempting termination, and do not discard the Result from
child_guard.kill().await—either propagate the error (use ? or return a Result)
or log it with context; apply this change to both sites where child.try_lock() /
child_guard.kill().await are used so termination is not best-effort and failures
are handled.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@interface/src/hooks/useLiveContext.tsx`:
- Around line 247-275: The code appends "\n" to every streamed event.line and
unboundedly grows live_output; update all places in useLiveContext.tsx where
live_output is built (both when updating an existing TranscriptStep and when
creating new tool_result steps) to preserve event.line exactly (append
event.line without forcing an extra newline, or only add a newline when
event.line does not already endWith("\n")), and clamp the live_output buffer to
a sensible maximum (introduce a MAX_LIVE_OUTPUT_LENGTH constant and after
concatenation truncate to the most recent N characters) so
pendingToolCallIdsRef-related handlers (and the fallback synthetic call_id path)
won't let React state grow without bound.

In `@interface/src/lib/primitives.tsx`:
- Around line 8-10: The file imports and re-exports from a hard-coded
"../../node_modules/@spacedrive/primitives/dist/index.js"; update both the
import and the export to use the package entrypoint " `@spacedrive/primitives` "
instead so Node/TS resolution and package exports are respected. Locate the
import statement and the export-all statement in primitives.tsx (the lines that
reference "../../node_modules/@spacedrive/primitives/dist/index.js") and replace
them to import/export directly from `@spacedrive/primitives`; ensure any named
imports still match exported symbols from the package and run type/checking to
confirm no breakage.

In `@src/tools/shell.rs`:
- Around line 194-195: The current termination uses child.try_lock() and then
swallows kill() errors with let _ = ..., which can skip killing or hide
failures; replace the non-blocking try_lock() with an awaited lock acquisition
(e.g., child.lock().await) so we reliably obtain the guard before attempting
termination, and do not discard the Result from child_guard.kill().await—either
propagate the error (use ? or return a Result) or log it with context; apply
this change to both sites where child.try_lock() / child_guard.kill().await are
used so termination is not best-effort and failures are handled.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 09ee23a5-9bf2-4a44-8400-eb1da0f3d567

📥 Commits

Reviewing files that changed from the base of the PR and between a006718 and dbd87b0.

📒 Files selected for processing (18)
  • interface/src/api/client.ts
  • interface/src/components/ToolCall.tsx
  • interface/src/hooks/useLiveContext.tsx
  • interface/src/lib/primitives.tsx
  • prompts/en/tools/shell_description.md.j2
  • prompts/en/worker.md.j2
  • src/agent/channel_history.rs
  • src/agent/cortex.rs
  • src/agent/worker.rs
  • src/api/agents.rs
  • src/api/state.rs
  • src/api/system.rs
  • src/lib.rs
  • src/main.rs
  • src/tools.rs
  • src/tools/shell.rs
  • tests/bulletin.rs
  • tests/context_dump.rs
✅ Files skipped from review due to trivial changes (6)
  • src/agent/worker.rs
  • src/api/system.rs
  • prompts/en/worker.md.j2
  • src/agent/cortex.rs
  • prompts/en/tools/shell_description.md.j2
  • src/lib.rs
🚧 Files skipped from review as they are similar to previous changes (7)
  • src/agent/channel_history.rs
  • tests/context_dump.rs
  • src/tools.rs
  • src/main.rs
  • interface/src/api/client.ts
  • tests/bulletin.rs
  • src/api/state.rs

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (4)
interface/src/lib/primitives.tsx (1)

3-10: ⚠️ Potential issue | 🟠 Major

Use package entrypoint imports instead of filesystem node_modules paths.

Line 8 and Line 10 are tightly coupled to repo layout and internal build output. Import/export from @spacedrive/primitives directly.

Suggested patch
 import {
 	Banner as PrimitiveBanner,
 	Button as PrimitiveButton,
 	FilterButton as PrimitiveFilterButton,
 	NumberStepper as PrimitiveNumberStepper,
-} from "../../node_modules/@spacedrive/primitives/dist/index.js";
+} from "@spacedrive/primitives";
 
-export * from "../../node_modules/@spacedrive/primitives/dist/index.js";
+export * from "@spacedrive/primitives";
#!/bin/bash
# Verify no hard-coded primitives node_modules path remains in interface sources.
rg -n --type=ts --type=tsx 'node_modules/@spacedrive/primitives|@spacedrive/primitives/dist/index\.js' interface/src

Expected result: no matches after applying the fix.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/lib/primitives.tsx` around lines 3 - 10, The imports and
re-export are using a hard-coded filesystem path
("../../node_modules/@spacedrive/primitives/dist/index.js"); update the import
source and the export statement to use the package entrypoint
"@spacedrive/primitives" instead. Replace the current import that brings in
PrimitiveBanner, PrimitiveButton, PrimitiveFilterButton, and
PrimitiveNumberStepper from the dist path so they import from
"@spacedrive/primitives", and change the export * line to export everything from
"@spacedrive/primitives" to remove the tight coupling to node_modules layout.
interface/src/hooks/useLiveContext.tsx (2)

242-266: ⚠️ Potential issue | 🟠 Major

Don't create a tool_result before tool_completed.

pairTranscriptSteps() in interface/src/components/ToolCall.tsx treats the existence of a tool_result as completion. Synthesizing one on the first tool_output line will flip a long-running shell call out of the running state before the real completion event arrives. Store live output separately, or attach it to the pending call entry instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useLiveContext.tsx` around lines 242 - 266, The code
currently creates a TranscriptStep of type "tool_result" on the first
tool_output event, which makes pairTranscriptSteps() (in ToolCall.tsx) treat the
call as completed; instead, on tool_output events update the existing pending
call entry or a separate live-output store and do not synthesize a "tool_result"
until a "tool_completed" event arrives. Modify setLiveTranscripts in
useLiveContext.tsx to: 1) locate the pending transcript step for
event.process_id and event.call_id (e.g., the existing "tool_call" step) and
append to its live_output field if present, or 2) if no pending call exists,
store output in a dedicated liveOutputs map keyed by call_id/process_id; only
create a "tool_result" step when handling tool_completed. Keep references to
TranscriptStep, tool_result, tool_output, tool_completed, setLiveTranscripts and
pairTranscriptSteps to locate and implement this change.

251-264: ⚠️ Potential issue | 🟡 Minor

Don't append an extra newline to streamed lines.

The backend reads with read_line(), so event.line already usually includes its trailing newline. Adding another "\n" here double-spaces most live output.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useLiveContext.tsx` around lines 251 - 264, In
useLiveContext.tsx (in the streaming update where you build/update
TranscriptStep), stop appending an extra "\n" to event.line: change the
concatenation that sets newOutput from existingOutput + event.line + "\n" to
existingOutput + event.line, and change the new step's live_output from
event.line + "\n" to event.line so live_output uses the backend-provided newline
only; update the code paths that set live_output (the updatedStep creation and
the new step object) accordingly.
src/tools/shell.rs (1)

188-200: ⚠️ Potential issue | 🟠 Major

Don't treat every 5-second quiet period as an interactive prompt.

This will kill valid commands that are simply silent for a while (sleep 10, long compile phases, tools that update with \r instead of \n). Those runs will incorrectly surface as waiting_for_input with exit code -2. The watchdog needs a stronger signal than “no completed read_line() for 5 seconds.”

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@interface/src/lib/primitives.tsx`:
- Around line 172-177: The current prop forwarding sets allowFloat to false when
both allowFloat and legacy type are undefined, overriding
PrimitiveNumberStepper's own default; update the prop expression in the wrapper
(the arrow component that spreads {...props} into PrimitiveNumberStepper) so
allowFloat is only passed when explicitly provided or when type === "float"
(e.g. compute allowFloatProp = allowFloat !== undefined ? allowFloat : (type ===
"float" ? true : undefined) and pass allowFloat={allowFloatProp}) so
PrimitiveNumberStepper can use its internal default when neither is specified.

In `@src/tools/shell.rs`:
- Around line 197-198: The code is currently discarding the Result of
child_guard.kill().await (call site using child.try_lock() and
child_guard.kill().await) which can hide failures and leave subprocesses
running; update both occurrences (the block around child.try_lock() /
child_guard.kill().await and the similar block at the other occurrence) to
handle the Result instead of using let _ =: if kill() returns Err, log the error
with context (including which branch triggered the kill, e.g., timeout or
waiting_for_input) and, where appropriate, propagate or surface the failure so
callers/tests can observe it rather than silently ignoring it.
- Around line 437-460: The stream code is creating a new call_id with
uuid::Uuid::new_v4() which breaks correlation with the ToolStarted/ToolCompleted
IDs; instead, propagate the existing invocation ID used when emitting
ToolStarted into the StreamContext structs so ToolOutput events use the same
call_id. Replace the local generation of call_id in this block with the existing
invocation/tool-call identifier (the same variable used when you send
ToolStarted) and pass that same identifier into both stdout_ctx and stderr_ctx
StreamContext.call_id fields so all ToolStarted, ToolOutput, and ToolCompleted
share the same ID.

---

Duplicate comments:
In `@interface/src/hooks/useLiveContext.tsx`:
- Around line 242-266: The code currently creates a TranscriptStep of type
"tool_result" on the first tool_output event, which makes pairTranscriptSteps()
(in ToolCall.tsx) treat the call as completed; instead, on tool_output events
update the existing pending call entry or a separate live-output store and do
not synthesize a "tool_result" until a "tool_completed" event arrives. Modify
setLiveTranscripts in useLiveContext.tsx to: 1) locate the pending transcript
step for event.process_id and event.call_id (e.g., the existing "tool_call"
step) and append to its live_output field if present, or 2) if no pending call
exists, store output in a dedicated liveOutputs map keyed by call_id/process_id;
only create a "tool_result" step when handling tool_completed. Keep references
to TranscriptStep, tool_result, tool_output, tool_completed, setLiveTranscripts
and pairTranscriptSteps to locate and implement this change.
- Around line 251-264: In useLiveContext.tsx (in the streaming update where you
build/update TranscriptStep), stop appending an extra "\n" to event.line: change
the concatenation that sets newOutput from existingOutput + event.line + "\n" to
existingOutput + event.line, and change the new step's live_output from
event.line + "\n" to event.line so live_output uses the backend-provided newline
only; update the code paths that set live_output (the updatedStep creation and
the new step object) accordingly.

In `@interface/src/lib/primitives.tsx`:
- Around line 3-10: The imports and re-export are using a hard-coded filesystem
path ("../../node_modules/@spacedrive/primitives/dist/index.js"); update the
import source and the export statement to use the package entrypoint
"@spacedrive/primitives" instead. Replace the current import that brings in
PrimitiveBanner, PrimitiveButton, PrimitiveFilterButton, and
PrimitiveNumberStepper from the dist path so they import from
"@spacedrive/primitives", and change the export * line to export everything from
"@spacedrive/primitives" to remove the tight coupling to node_modules layout.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 32995432-7fa1-43de-ba2f-fd8f2da08cd8

📥 Commits

Reviewing files that changed from the base of the PR and between dbd87b0 and 236bd42.

📒 Files selected for processing (18)
  • interface/src/api/client.ts
  • interface/src/components/ToolCall.tsx
  • interface/src/hooks/useLiveContext.tsx
  • interface/src/lib/primitives.tsx
  • prompts/en/tools/shell_description.md.j2
  • prompts/en/worker.md.j2
  • src/agent/channel_history.rs
  • src/agent/cortex.rs
  • src/agent/worker.rs
  • src/api/agents.rs
  • src/api/state.rs
  • src/api/system.rs
  • src/lib.rs
  • src/main.rs
  • src/tools.rs
  • src/tools/shell.rs
  • tests/bulletin.rs
  • tests/context_dump.rs
✅ Files skipped from review due to trivial changes (4)
  • src/api/system.rs
  • prompts/en/tools/shell_description.md.j2
  • prompts/en/worker.md.j2
  • src/agent/cortex.rs
🚧 Files skipped from review as they are similar to previous changes (6)
  • src/agent/worker.rs
  • src/agent/channel_history.rs
  • tests/context_dump.rs
  • src/main.rs
  • interface/src/api/client.ts
  • src/api/state.rs

@vsumner vsumner force-pushed the feat/interactive-shell-streaming branch from 236bd42 to 58fe5f4 Compare April 12, 2026 20:00
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (5)
src/tools/shell.rs (3)

197-199: ⚠️ Potential issue | 🟠 Major

Don't let timeout/quiesce report success if the subprocess kill was skipped or failed.

Both branches use try_lock() and then discard kill().await. If the mutex is busy or kill() returns an error, the tool still reports timeout / waiting_for_input while the child may keep running in the background. Take the lock with a bounded await and at least log the failure.

As per coding guidelines "Don't silently discard errors. No let _ = on Results. Handle them, log them, or propagate them. The only exception is .ok() on channel sends where the receiver may be dropped."

Also applies to: 483-486

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/shell.rs` around lines 197 - 199, The timeout/quiesce branches
currently call child.try_lock() and discard the result of child.kill().await,
which can silently skip killing the subprocess; replace try_lock() with an
awaited, bounded lock acquisition (e.g., with a timeout) on the child mutex and
handle the Result from child.kill().await instead of using let _ = …; if the
lock cannot be acquired or kill() returns Err, log an error indicating the
failure (including the error) and ensure the function reports failure rather
than success for the timeout/waiting_for_input paths; apply the same change to
the other occurrence referenced around lines 483-486.

437-460: ⚠️ Potential issue | 🟠 Major

This call_id still can't correlate live output with the originating tool call.

interface/src/hooks/useLiveContext.tsx still synthesizes IDs when tool_started arrives and reuses those for tool_completed, but ToolOutput now carries a separate UUID minted only here. That leaves streamed shell lines orphaned from the initiating tool_call, so the live panel never attaches to the running invocation. Reuse one server-side invocation ID across ToolStarted, ToolOutput, and ToolCompleted instead of generating a stream-only ID here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/shell.rs` around lines 437 - 460, The code generates a fresh UUID
into call_id and embeds it in StreamContext, which breaks correlation with the
server-side invocation id used by ToolStarted/ToolCompleted; instead stop
creating a new UUID and reuse the server invocation id passed into this routine
(or add an invocation_id parameter) so the same id flows into StreamContext for
both stdout_ctx and stderr_ctx and into all emitted ToolOutput events; update
the call_id binding to use that existing invocation id (replace the call to
uuid::Uuid::new_v4()) and ensure StreamContext.call_id is set from that single
server-side id so ToolStarted, ToolOutput and ToolCompleted share the same
identifier.

137-145: ⚠️ Potential issue | 🟠 Major

The quiesce watchdog kills any quiet command after 5s, not just interactive ones.

This fires on any 5-second gap between completed read_line() calls, so legitimate quiet commands like sleep 10, some cargo build phases, or progress that uses \r/no newline will be misclassified as waiting_for_input and killed.

Also applies to: 181-199

interface/src/lib/primitives.tsx (1)

172-177: ⚠️ Potential issue | 🟠 Major

Preserve the primitive default when neither allowFloat nor legacy type is set.

allowFloat ?? type === "float" collapses to false when both props are undefined, so this wrapper stops PrimitiveNumberStepper from using its own default behavior.

Suggested patch
 >(({type, variant: _variant, allowFloat, ...props}, ref) => (
 	<PrimitiveNumberStepper
 		{...props}
 		ref={ref}
-		allowFloat={allowFloat ?? type === "float"}
+		allowFloat={
+			allowFloat !== undefined
+				? allowFloat
+				: type === "float"
+					? true
+					: undefined
+		}
 	/>
 ));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/lib/primitives.tsx` around lines 172 - 177, The wrapper
currently forces allowFloat to false when both allowFloat and type are undefined
by using allowFloat ?? type === "float"; update the wrapper around
PrimitiveNumberStepper (the arrow function destructuring {type, variant:
_variant, allowFloat, ...props}, ref) so it only passes allowFloat when
explicitly provided or when type === "float"; e.g., compute a resolvedAllowFloat
= allowFloat !== undefined ? allowFloat : (type === "float" ? true : undefined)
and pass allowFloat={resolvedAllowFloat} (or conditionally include the prop) so
PrimitiveNumberStepper can use its own default when neither prop is set.
interface/src/hooks/useLiveContext.tsx (1)

242-267: ⚠️ Potential issue | 🟠 Major

Store streamed chunks verbatim and cap the live buffer.

event.line is already newline-terminated in the common case, so + "\n" double-spaces shell output. More importantly, this keeps copying an ever-growing string on every SSE event, which will balloon browser memory and rerender cost on noisy commands.

Suggested patch
+const MAX_LIVE_OUTPUT_BYTES = 32_000;
+
+function appendLiveOutput(current: string, chunk: string): string {
+	const combined = current + chunk;
+	return combined.length > MAX_LIVE_OUTPUT_BYTES
+		? combined.slice(-MAX_LIVE_OUTPUT_BYTES)
+		: combined;
+}
+
 	const handleToolOutput = useCallback((data: unknown) => {
 		const event = data as ToolOutputEvent;
 		if (event.process_type === "worker") {
 			setLiveTranscripts((prev) => {
 				const steps = prev[event.process_id] ?? [];
@@
 				if (existingIndex >= 0) {
 					// Append to existing result step
 					const step = steps[existingIndex];
 					const existingOutput = (step as TranscriptStep).live_output ?? "";
-					const newOutput = existingOutput + event.line + "\n";
+					const newOutput = appendLiveOutput(existingOutput, event.line);
 					const updatedStep = { ...step, live_output: newOutput };
 					const newSteps = [...steps];
 					newSteps[existingIndex] = updatedStep;
 					return { ...prev, [event.process_id]: newSteps };
 				}
@@
 				const step: TranscriptStep = {
 					type: "tool_result",
 					call_id: event.call_id,
 					name: event.tool_name,
 					text: "",
-					live_output: event.line + "\n",
+					live_output: appendLiveOutput("", event.line),
 				};
 				return { ...prev, [event.process_id]: [...steps, step] };
 			});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/hooks/useLiveContext.tsx` around lines 242 - 267, In
useLiveContext's setLiveTranscripts update (the block that finds/creates a
"tool_result" step by event.call_id), stop appending an extra "\n" and instead
append event.line verbatim, and enforce a fixed max buffer for
TranscriptStep.live_output to avoid unbounded growth; when updating
existingStep.live_output concatenate event.line (no added newline) and then trim
the resulting string to a configured MAX_LIVE_OUTPUT_SIZE (e.g., keep the last N
characters or last M lines) before setting updatedStep.live_output, so the
update logic in setLiveTranscripts (and the creation of the new step) both use
the same capped, verbatim buffering behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@interface/src/hooks/useLiveContext.tsx`:
- Around line 242-267: In useLiveContext's setLiveTranscripts update (the block
that finds/creates a "tool_result" step by event.call_id), stop appending an
extra "\n" and instead append event.line verbatim, and enforce a fixed max
buffer for TranscriptStep.live_output to avoid unbounded growth; when updating
existingStep.live_output concatenate event.line (no added newline) and then trim
the resulting string to a configured MAX_LIVE_OUTPUT_SIZE (e.g., keep the last N
characters or last M lines) before setting updatedStep.live_output, so the
update logic in setLiveTranscripts (and the creation of the new step) both use
the same capped, verbatim buffering behavior.

In `@interface/src/lib/primitives.tsx`:
- Around line 172-177: The wrapper currently forces allowFloat to false when
both allowFloat and type are undefined by using allowFloat ?? type === "float";
update the wrapper around PrimitiveNumberStepper (the arrow function
destructuring {type, variant: _variant, allowFloat, ...props}, ref) so it only
passes allowFloat when explicitly provided or when type === "float"; e.g.,
compute a resolvedAllowFloat = allowFloat !== undefined ? allowFloat : (type ===
"float" ? true : undefined) and pass allowFloat={resolvedAllowFloat} (or
conditionally include the prop) so PrimitiveNumberStepper can use its own
default when neither prop is set.

In `@src/tools/shell.rs`:
- Around line 197-199: The timeout/quiesce branches currently call
child.try_lock() and discard the result of child.kill().await, which can
silently skip killing the subprocess; replace try_lock() with an awaited,
bounded lock acquisition (e.g., with a timeout) on the child mutex and handle
the Result from child.kill().await instead of using let _ = …; if the lock
cannot be acquired or kill() returns Err, log an error indicating the failure
(including the error) and ensure the function reports failure rather than
success for the timeout/waiting_for_input paths; apply the same change to the
other occurrence referenced around lines 483-486.
- Around line 437-460: The code generates a fresh UUID into call_id and embeds
it in StreamContext, which breaks correlation with the server-side invocation id
used by ToolStarted/ToolCompleted; instead stop creating a new UUID and reuse
the server invocation id passed into this routine (or add an invocation_id
parameter) so the same id flows into StreamContext for both stdout_ctx and
stderr_ctx and into all emitted ToolOutput events; update the call_id binding to
use that existing invocation id (replace the call to uuid::Uuid::new_v4()) and
ensure StreamContext.call_id is set from that single server-side id so
ToolStarted, ToolOutput and ToolCompleted share the same identifier.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2a35e9be-754d-4f6a-ba93-ba812e1abb48

📥 Commits

Reviewing files that changed from the base of the PR and between 236bd42 and 58fe5f4.

📒 Files selected for processing (19)
  • interface/src/api/client.ts
  • interface/src/components/ToolCall.tsx
  • interface/src/hooks/useLiveContext.tsx
  • interface/src/lib/primitives.tsx
  • prompts/en/tools/shell_description.md.j2
  • prompts/en/worker.md.j2
  • src/agent/channel_history.rs
  • src/agent/cortex.rs
  • src/agent/worker.rs
  • src/api/agents.rs
  • src/api/state.rs
  • src/api/system.rs
  • src/conversation/worker_transcript.rs
  • src/lib.rs
  • src/main.rs
  • src/tools.rs
  • src/tools/shell.rs
  • tests/bulletin.rs
  • tests/context_dump.rs
✅ Files skipped from review due to trivial changes (4)
  • src/api/system.rs
  • prompts/en/worker.md.j2
  • prompts/en/tools/shell_description.md.j2
  • src/api/agents.rs
🚧 Files skipped from review as they are similar to previous changes (8)
  • src/agent/worker.rs
  • src/agent/channel_history.rs
  • src/tools.rs
  • tests/context_dump.rs
  • src/main.rs
  • interface/src/api/client.ts
  • src/agent/cortex.rs
  • src/api/state.rs

@vsumner vsumner force-pushed the feat/interactive-shell-streaming branch from 58fe5f4 to 4d081e0 Compare April 12, 2026 22:42
Replace batch-output shell execution with spawn-based streaming:
- Shell commands now stream output line-by-line via ProcessEvent::ToolOutput
- 5-second quiesce detection kills interactive prompts (waiting_for_input)
- Frontend renders live output in ToolCall component with auto-scroll
- ToolOutput events broadcast via separate SSE channel (capacity 1024)
- Timeout enforcement wraps entire streaming operation
- Watchdog now kills child process when quiesce detected

Fixes:
- P1: Timeout was ignored in streaming mode (now enforced)
- P1: Watchdog didn't kill child (now passes Arc<Mutex<Child>>)
- P2: Frontend call_id mismatch (now uses pendingToolCallIdsRef)

Files changed:
- src/tools/shell.rs: Complete rewrite with streaming
- src/lib.rs: Add ProcessEvent::ToolOutput variant
- src/api/state.rs: Wire SSE for tool_output events
- interface/: Live output rendering components
- prompts/: Interactive guidance for LLM
@vsumner vsumner force-pushed the feat/interactive-shell-streaming branch from 4d081e0 to ada9281 Compare April 12, 2026 22:44
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (2)
src/tools/shell.rs (1)

186-219: ⚠️ Potential issue | 🟠 Major

The quiesce watchdog will misclassify quiet commands as interactive.

This loop treats any ~5s window without a completed line read as waiting_for_input. Commands like sleep 10, long quiet build phases, or tools that only repaint with \r will be terminated even though they never asked for input. Please gate this on prompt-like output or make the heuristic configurable/opt-in.

src/api/state.rs (1)

653-653: ⚠️ Potential issue | 🟠 Major

The live transcript builder still uses mismatched call IDs.

TranscriptStep::ToolResult already carries call_id in src/conversation/worker_transcript.rs, Lines 35-42, but this path appends streamed output under the real call_id while Line 596 still mints a separate live_* ID for the ToolCall step and Lines 649-653 append a second ToolResult with an empty call_id. Because register_tool_output_stream() mutates the same cache from a different task, a fast ToolOutput can also land before the ToolCall/cache entry exists. Refreshing mid-run can therefore show the same shell invocation as missing, reordered, or duplicated transcript steps. Reuse one lifecycle call_id end-to-end and serialize transcript accumulation through one ordered path.

Also applies to: 823-876

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/state.rs` at line 653, The live transcript path mints a separate
live_* call_id and appends ToolResult with empty call_id causing
mismatched/duplicated steps; update the code so the same lifecycle call_id is
created once and reused end-to-end: generate the call_id deterministically when
creating the ToolCall entry, store that call_id into the live_output entry
instead of minting a new live_* id, and make register_tool_output_stream()
append streamed output to that same call_id (and not an empty call_id); also
handle the race where ToolOutput can arrive before ToolCall by creating the
cache entry (or buffering the output keyed by the deterministic call_id)
immediately when the call_id is generated so subsequent
register_tool_output_stream() mutations serialize into the same transcript path
(apply the same fixes in the second region referenced).
🧹 Nitpick comments (1)
src/lib.rs (1)

361-390: Consider returning a named bus struct here.

All three return values now have the same Sender<ProcessEvent> type, so swapping memory_event_tx and tool_output_tx will still compile and silently misroute events. A small ProcessEventBuses { event_tx, memory_event_tx, tool_output_tx } return type would make this API much harder to misuse.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib.rs` around lines 361 - 390, The three broadcast senders returned by
create_process_event_buses and create_process_event_buses_with_capacity are all
the same type and can be accidentally swapped; introduce a new ProcessEventBuses
struct (e.g. pub struct ProcessEventBuses { pub control:
tokio::sync::broadcast::Sender<ProcessEvent>, pub memory:
tokio::sync::broadcast::Sender<ProcessEvent>, pub tool_output:
tokio::sync::broadcast::Sender<ProcessEvent> }) and change both functions to
return that struct instead of a tuple, constructing it with the existing local
variables (event_tx, memory_event_tx, tool_output_tx) so callers must use named
fields and cannot misroute events.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@interface/src/components/ToolCall.tsx`:
- Around line 56-63: The code incorrectly treats any tool_result step as a final
completion; update the indexing in ToolCall.tsx (the resultsById Map created in
the steps loop) to include an explicit status field (e.g., status: "pending" |
"final" | "waiting_for_input" ) and the liveOutput separately instead of
inferring completion from the presence of tool_result. Change the producer in
useLiveContext.tsx to set that explicit status on the temporary tool_result it
emits while streaming, and then update the consumer logic (the conditional
around result ? ... : "running" and the live-output branch referenced around
lines ~82-93) to switch behavior based on the new status value (treat status ===
"pending" as running, status === "final" as completed, and handle
"waiting_for_input" distinctly).

In `@interface/src/lib/primitives.tsx`:
- Around line 89-99: The href branch rendering via hasHref(props) and
PrimitiveButton must honor the loading guard like the action-button path: when
props.loading (or the component's loading state) is true, set the element to
disabled/aria-busy (prevent activation/clicks and focus as appropriate), prevent
navigation/activation handlers from running, and render the spinner instead of
interactive content; update the PrimitiveButton usage in that branch (the JSX
that spreads {...props} and passes ref, variant/buttonVariant,
className/buttonClassName, and children/content) to include the same loading
props/guards (disabled, aria-busy, and click/keypress suppression) used by the
action-button code path so the link appears and behaves disabled while loading.

In `@src/tools/shell.rs`:
- Around line 53-84: with_streaming currently stores a single call_id on the
ShellTool instance causing reuse across multiple shell invocations; instead
generate a fresh call_id per execution inside the execution path (e.g., in
ShellTool::call or ShellTool::run_streaming) and thread that per-run ID into all
lifecycle event emissions (ToolStarted/ToolOutput/ToolCompleted) and wherever
run_streaming needs it; update ShellTool::with_streaming to stop setting a
persistent call_id (or remove the field) and change run_streaming/call
signatures to accept a call_id (or create it at start of call() and pass to
run_streaming) so each invocation uses a unique ID.
- Around line 149-169: The emitted ToolOutput currently forwards the raw buffer
(in the read_line loop that constructs and sends ProcessEvent::ToolOutput),
which preserves the trailing newline and causes double newlines when the
accumulator in src/api/state.rs appends another '\n'; fix it by normalizing the
line before sending — call a trim that removes trailing newline(s) (e.g.,
trim_end_matches or trim_end) on the local `line`/`scrubbed` value in the
reader.read_line loop prior to populating ProcessEvent::ToolOutput.line so the
sent payload has no trailing line ending; ensure this change is applied where
ctx.tool_output_tx.send(ProcessEvent::ToolOutput { ... line: ... }) is
constructed (leave the accumulator unchanged).

---

Duplicate comments:
In `@src/api/state.rs`:
- Line 653: The live transcript path mints a separate live_* call_id and appends
ToolResult with empty call_id causing mismatched/duplicated steps; update the
code so the same lifecycle call_id is created once and reused end-to-end:
generate the call_id deterministically when creating the ToolCall entry, store
that call_id into the live_output entry instead of minting a new live_* id, and
make register_tool_output_stream() append streamed output to that same call_id
(and not an empty call_id); also handle the race where ToolOutput can arrive
before ToolCall by creating the cache entry (or buffering the output keyed by
the deterministic call_id) immediately when the call_id is generated so
subsequent register_tool_output_stream() mutations serialize into the same
transcript path (apply the same fixes in the second region referenced).

---

Nitpick comments:
In `@src/lib.rs`:
- Around line 361-390: The three broadcast senders returned by
create_process_event_buses and create_process_event_buses_with_capacity are all
the same type and can be accidentally swapped; introduce a new ProcessEventBuses
struct (e.g. pub struct ProcessEventBuses { pub control:
tokio::sync::broadcast::Sender<ProcessEvent>, pub memory:
tokio::sync::broadcast::Sender<ProcessEvent>, pub tool_output:
tokio::sync::broadcast::Sender<ProcessEvent> }) and change both functions to
return that struct instead of a tuple, constructing it with the existing local
variables (event_tx, memory_event_tx, tool_output_tx) so callers must use named
fields and cannot misroute events.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: eb3a5e3f-ba19-4d82-81d5-c10ab5476a90

📥 Commits

Reviewing files that changed from the base of the PR and between 58fe5f4 and 4d081e0.

📒 Files selected for processing (19)
  • interface/src/api/client.ts
  • interface/src/components/ToolCall.tsx
  • interface/src/hooks/useLiveContext.tsx
  • interface/src/lib/primitives.tsx
  • prompts/en/tools/shell_description.md.j2
  • prompts/en/worker.md.j2
  • src/agent/channel_history.rs
  • src/agent/cortex.rs
  • src/agent/worker.rs
  • src/api/agents.rs
  • src/api/state.rs
  • src/api/system.rs
  • src/conversation/worker_transcript.rs
  • src/lib.rs
  • src/main.rs
  • src/tools.rs
  • src/tools/shell.rs
  • tests/bulletin.rs
  • tests/context_dump.rs
🚧 Files skipped from review as they are similar to previous changes (12)
  • src/agent/channel_history.rs
  • src/agent/worker.rs
  • prompts/en/tools/shell_description.md.j2
  • tests/context_dump.rs
  • prompts/en/worker.md.j2
  • src/api/system.rs
  • src/agent/cortex.rs
  • src/api/agents.rs
  • src/tools.rs
  • src/conversation/worker_transcript.rs
  • interface/src/hooks/useLiveContext.tsx
  • tests/bulletin.rs

Comment on lines +56 to 63
const resultsById = new Map<string, {name: string; text: string; liveOutput?: string}>();

// First pass: index all tool_result steps by call_id
for (const step of steps) {
if (step.type === "tool_result") {
resultsById.set(step.call_id, {name: step.name, text: step.text});
const liveOutput = step.live_output;
resultsById.set(step.call_id, {name: step.name, text: step.text, liveOutput});
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't use placeholder tool_result steps as the completion signal.

interface/src/hooks/useLiveContext.tsx:239-275 now creates a temporary tool_result with text: "" just to carry live_output while the tool is still running. Line 91's result ? ... : "running" heuristic treats that placeholder as a real completion, so the card flips out of the running state on the first streamed line and the live-output branch below becomes effectively dead. It also leaves no way to represent the new waiting_for_input shell outcome distinctly from a hard error. Please carry an explicit pending/final status instead of deriving it from the presence of a tool_result record.

Also applies to: 82-93

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/components/ToolCall.tsx` around lines 56 - 63, The code
incorrectly treats any tool_result step as a final completion; update the
indexing in ToolCall.tsx (the resultsById Map created in the steps loop) to
include an explicit status field (e.g., status: "pending" | "final" |
"waiting_for_input" ) and the liveOutput separately instead of inferring
completion from the presence of tool_result. Change the producer in
useLiveContext.tsx to set that explicit status on the temporary tool_result it
emits while streaming, and then update the consumer logic (the conditional
around result ? ... : "running" and the live-output branch referenced around
lines ~82-93) to switch behavior based on the new status value (treat status ===
"pending" as running, status === "final" as completed, and handle
"waiting_for_input" distinctly).

Comment on lines +89 to +99
if (hasHref(props)) {
return (
<PrimitiveButton
{...props}
ref={ref}
variant={buttonVariant}
className={buttonClassName}
>
{content}
</PrimitiveButton>
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Apply the loading guard in the href branch too.

This branch renders the spinner, but unlike the action-button path it never marks the control busy/disabled or blocks activation. A loading link can still be focused and clicked, which makes the wrapper look disabled while remaining interactive.

Suggested patch
 	if (hasHref(props)) {
+		const {onClick, tabIndex, ...linkProps} = props;
 		return (
 			<PrimitiveButton
-				{...props}
+				{...linkProps}
 				ref={ref}
+				aria-busy={loading || undefined}
+				aria-disabled={loading || undefined}
+				tabIndex={loading ? -1 : tabIndex}
+				onClick={(event) => {
+					if (loading) {
+						event.preventDefault();
+						event.stopPropagation();
+						return;
+					}
+					onClick?.(event);
+				}}
 				variant={buttonVariant}
-				className={buttonClassName}
+				className={cx(buttonClassName, loading && "pointer-events-none")}
 			>
 				{content}
 			</PrimitiveButton>
 		);
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (hasHref(props)) {
return (
<PrimitiveButton
{...props}
ref={ref}
variant={buttonVariant}
className={buttonClassName}
>
{content}
</PrimitiveButton>
);
if (hasHref(props)) {
const {onClick, tabIndex, ...linkProps} = props;
return (
<PrimitiveButton
{...linkProps}
ref={ref}
aria-busy={loading || undefined}
aria-disabled={loading || undefined}
tabIndex={loading ? -1 : tabIndex}
onClick={(event) => {
if (loading) {
event.preventDefault();
event.stopPropagation();
return;
}
onClick?.(event);
}}
variant={buttonVariant}
className={cx(buttonClassName, loading && "pointer-events-none")}
>
{content}
</PrimitiveButton>
);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@interface/src/lib/primitives.tsx` around lines 89 - 99, The href branch
rendering via hasHref(props) and PrimitiveButton must honor the loading guard
like the action-button path: when props.loading (or the component's loading
state) is true, set the element to disabled/aria-busy (prevent activation/clicks
and focus as appropriate), prevent navigation/activation handlers from running,
and render the spinner instead of interactive content; update the
PrimitiveButton usage in that branch (the JSX that spreads {...props} and passes
ref, variant/buttonVariant, className/buttonClassName, and children/content) to
include the same loading props/guards (disabled, aria-busy, and click/keypress
suppression) used by the action-button code path so the link appears and behaves
disabled while loading.

Comment on lines +149 to +169
match reader.read_line(&mut line).await {
Ok(0) => break,
Ok(_) => {
let now_ms = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.unwrap_or_default()
.as_millis() as u64;
ctx.last_output_ms.store(now_ms, Ordering::SeqCst);

if let (Some(tx), Some(agent_id), Some(process_id)) =
(&ctx.tool_output_tx, &ctx.agent_id, &ctx.process_id)
{
let scrubbed = crate::secrets::scrub::scrub_leaks(&line);
if let Err(err) = tx.send(ProcessEvent::ToolOutput {
agent_id: agent_id.clone(),
process_id: process_id.clone(),
channel_id: ctx.channel_id.clone(),
call_id: ctx.call_id.clone(),
tool_name: "shell".to_string(),
line: scrubbed,
stream: ctx.stream_name.clone(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Normalize ToolOutput.line before emitting it.

This forwards the raw line buffer, and in src/api/state.rs, Lines 862-867 and Line 875 append another \n while rebuilding live_output. That double-delimits most lines and produces blank rows after a refresh. Strip the trailing line ending here or stop appending a second newline in the accumulator.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/shell.rs` around lines 149 - 169, The emitted ToolOutput currently
forwards the raw buffer (in the read_line loop that constructs and sends
ProcessEvent::ToolOutput), which preserves the trailing newline and causes
double newlines when the accumulator in src/api/state.rs appends another '\n';
fix it by normalizing the line before sending — call a trim that removes
trailing newline(s) (e.g., trim_end_matches or trim_end) on the local
`line`/`scrubbed` value in the reader.read_line loop prior to populating
ProcessEvent::ToolOutput.line so the sent payload has no trailing line ending;
ensure this change is applied where
ctx.tool_output_tx.send(ProcessEvent::ToolOutput { ... line: ... }) is
constructed (leave the accumulator unchanged).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants