feat: Interactive shell streaming with live output#562
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds end-to-end live tool-output streaming: new Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
a006718 to
dbd87b0
Compare
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (4)
src/agent/channel_history.rs (1)
470-471: Add a regression test forToolOutputexclusion.This changes channel-scoping semantics for a brand-new
ProcessEventvariant, but the tests below still only lock inToolStarted, agent-message, andTextDeltabehavior. A small assertion forProcessEvent::ToolOutput { .. } => falsewould make this much harder to regress later.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/channel_history.rs` around lines 470 - 471, Add a regression test that asserts ProcessEvent::ToolOutput is excluded from channel-scoped history: in the tests near the existing assertions for ToolStarted, agent-message, and TextDelta in channel_history.rs, construct a ProcessEvent::ToolOutput { .. } instance and assert that the channel-scoping predicate (the same helper/assert used for ToolStarted/TextDelta) returns false; this ensures the new ToolOutput variant is explicitly covered and prevents regressions to the channel-scoping semantics for ProcessEvent::ToolOutput.tests/context_dump.rs (1)
90-121: Extract the testAgentDepsbootstrap into a shared helper.Adding
tool_output_txrequired touching this bootstrap and the near-copy intests/bulletin.rs. At this point, the event-bus setup plusAgentDepsskeleton looks worth centralizing in a shared test utility so future deps changes stay one-edit instead of many.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/context_dump.rs` around lines 90 - 121, Extract the shared event-bus + AgentDeps bootstrap into a single test helper (e.g., make_test_agent_deps or build_agent_deps) that encapsulates calling create_process_event_buses_with_capacity(...) and constructing spacebot::AgentDeps (including agent_id, mcp_manager, sandbox, memory_search, llm_manager, task_store, project_store, runtime_config, event_tx, memory_event_tx, tool_output_tx, sqlite_pool, etc.); replace the duplicated inline setup in tests that currently call create_process_event_buses_with_capacity and build AgentDeps (such as where AgentDeps is instantiated) to call this helper so future changes to AgentDeps fields only need one edit and the helper returns the fully wired AgentDeps and any event channels you need.interface/src/components/ToolCall.tsx (1)
61-62: Redundant type cast.
stepis already typed asTranscriptStep(line 54 parameter), which is aliased toExtendedTranscriptStep(line 12). The cast is unnecessary.✨ Suggested simplification
if (step.type === "tool_result") { - const liveOutput = (step as ExtendedTranscriptStep).live_output; + const liveOutput = step.live_output; resultsById.set(step.call_id, {name: step.name, text: step.text, liveOutput}); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/components/ToolCall.tsx` around lines 61 - 62, The explicit cast "(step as ExtendedTranscriptStep)" is redundant because the parameter "step" is already typed as TranscriptStep (an alias of ExtendedTranscriptStep); remove the cast and read the property directly (use step.live_output) when computing liveOutput and calling resultsById.set with step.call_id, step.name, and step.text in the ToolCall component to simplify the code.src/tools/shell.rs (1)
425-429: Watchdog task not aborted on normal completion.When the command completes normally before the 5-second quiesce timeout, the watchdog task continues running in the background until its next wake-up (up to 5 seconds later). This is a minor resource leak.
Consider aborting the watchdog when streaming completes:
♻️ Proposed fix
let watchdog = spawn_quiesce_watchdog( Arc::clone(&last_output_ms), Arc::clone(&interactive_detected), Arc::clone(&child_arc), ); // ... streaming code ... let (stdout_lines, stderr_lines) = match streaming_result { Ok((stdout, stderr)) => (stdout, stderr), Err(_) => { // Timeout - kill child and return error + watchdog.abort(); if let Ok(mut child_guard) = child_arc.try_lock() { let _ = child_guard.kill().await; } return Err(ShellError { message: format!("Command timed out after {} seconds", timeout.as_secs()), exit_code: -1, }); } }; + + // Clean up watchdog if still running + watchdog.abort(); let waiting_for_input = interactive_detected.load(Ordering::SeqCst);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tools/shell.rs` around lines 425 - 429, The watchdog started via spawn_quiesce_watchdog is never aborted on normal command completion (let _watchdog ...), so it can linger up to the quiesce timeout; change the code to retain the returned JoinHandle (e.g., let watchdog = spawn_quiesce_watchdog(...)) and explicitly abort it when streaming/child-handling finishes (call watchdog.abort() or equivalent in the code path that runs after the child exits or streaming completes), ensuring the JoinHandle is dropped/awaited as appropriate to avoid the background task leak; reference spawn_quiesce_watchdog, last_output_ms, interactive_detected, and child_arc to locate where to capture and abort the handle.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@interface/src/components/ToolCall.tsx`:
- Around line 394-402: Remove the unreachable shell live-output block in
ToolCall.tsx: inside the renderResult function, delete the early-return
conditional that checks `if (pair.status === "running" && pair.liveOutput) { ...
}` because `renderResult` already delegates to `renderer.resultView(pair)` which
handles live output; ensure no other code expects that branch and keep
`renderer.resultView(pair)` as the single path for rendering the result.
In `@interface/src/lib/primitives.tsx`:
- Around line 4-10: The file imports PrimitiveBanner, PrimitiveButton,
PrimitiveFilterButton, PrimitiveNumberStepper and re-exports the package using a
hard-coded ../../node_modules path; change both import and export to import from
the package name "@spacedrive/primitives" instead (e.g., update the named import
that yields
PrimitiveBanner/PrimitiveButton/PrimitiveFilterButton/PrimitiveNumberStepper and
the export statement) so the module resolution uses the package entrypoint
rather than a workspace-specific node_modules path.
In `@prompts/en/worker.md.j2`:
- Around line 90-91: Update the wording that currently says "Commands have no
stdin" in prompts/en/worker.md.j2 to clarify "no interactive stdin / no attached
TTY" and explicitly state that shell-created pipes and here-docs (e.g., echo y |
..., heredoc <<EOF) are valid non-interactive input paths; then mirror the exact
revised phrasing in prompts/en/tools/shell_description.md.j2 so both templates
use the same wording and examples.
In `@src/agent/cortex.rs`:
- Around line 1414-1415: The ProcessEvent::ToolOutput events are currently
ignored in signal_from_event()/cortex signal handling but must still update
worker activity timestamps; modify observe_health_event() so that when it
receives a ProcessEvent::ToolOutput originating from a ProcessId::Worker(...) it
updates HealthRuntimeState::last_activity_at (same way other worker activity
events do) while leaving signal_from_event() behavior unchanged if desired;
locate references to ProcessEvent::ToolOutput, observe_health_event(),
HealthRuntimeState::last_activity_at, and ProcessId::Worker and add the
activity-timestamp update for worker tool output events.
In `@src/api/state.rs`:
- Around line 312-322: ToolOutput lacks a stable per-call identifier so streamed
lines can’t be deterministically associated with a specific invocation; add a
server-generated unique call_id (e.g., UUID/String) field to the ToolOutput
payload and ensure the same call_id is emitted on ToolStarted and ToolCompleted
events for the same invocation so consumers can join streams reliably; update
the ToolOutput struct and the code paths that construct/emits ToolOutput,
ToolStarted and ToolCompleted to generate/populate the call_id (preserve
existing process_id/tool_name fields) and include it in serialization so
downstream consumers receive the stable identifier.
- Around line 824-875: register_tool_output_stream currently forwards
ProcessEvent::ToolOutput only to SSE and does not update the in-memory
transcript cache, so streamed shell lines are lost on refresh; modify
register_tool_output_stream to also update live_worker_transcripts (the same
structure updated by register_agent_events for
ToolStarted/ToolCompleted/WorkerText) whenever a ToolOutput arrives — either
call the existing transcript-accumulation helper used by register_agent_events
or replicate its logic: derive the (process_type, process_id) via
process_id_info(process_id), locate/update the live_worker_transcripts entry for
agent_id and channel_id, append the streamed line/stream metadata to the
transcript, and then continue sending the ApiEvent::ToolOutput to api_tx; ensure
the update is performed inside the tokio::spawn task before/after sending the
ApiEvent so get_live_transcript will recover in-flight output.
---
Nitpick comments:
In `@interface/src/components/ToolCall.tsx`:
- Around line 61-62: The explicit cast "(step as ExtendedTranscriptStep)" is
redundant because the parameter "step" is already typed as TranscriptStep (an
alias of ExtendedTranscriptStep); remove the cast and read the property directly
(use step.live_output) when computing liveOutput and calling resultsById.set
with step.call_id, step.name, and step.text in the ToolCall component to
simplify the code.
In `@src/agent/channel_history.rs`:
- Around line 470-471: Add a regression test that asserts
ProcessEvent::ToolOutput is excluded from channel-scoped history: in the tests
near the existing assertions for ToolStarted, agent-message, and TextDelta in
channel_history.rs, construct a ProcessEvent::ToolOutput { .. } instance and
assert that the channel-scoping predicate (the same helper/assert used for
ToolStarted/TextDelta) returns false; this ensures the new ToolOutput variant is
explicitly covered and prevents regressions to the channel-scoping semantics for
ProcessEvent::ToolOutput.
In `@src/tools/shell.rs`:
- Around line 425-429: The watchdog started via spawn_quiesce_watchdog is never
aborted on normal command completion (let _watchdog ...), so it can linger up to
the quiesce timeout; change the code to retain the returned JoinHandle (e.g.,
let watchdog = spawn_quiesce_watchdog(...)) and explicitly abort it when
streaming/child-handling finishes (call watchdog.abort() or equivalent in the
code path that runs after the child exits or streaming completes), ensuring the
JoinHandle is dropped/awaited as appropriate to avoid the background task leak;
reference spawn_quiesce_watchdog, last_output_ms, interactive_detected, and
child_arc to locate where to capture and abort the handle.
In `@tests/context_dump.rs`:
- Around line 90-121: Extract the shared event-bus + AgentDeps bootstrap into a
single test helper (e.g., make_test_agent_deps or build_agent_deps) that
encapsulates calling create_process_event_buses_with_capacity(...) and
constructing spacebot::AgentDeps (including agent_id, mcp_manager, sandbox,
memory_search, llm_manager, task_store, project_store, runtime_config, event_tx,
memory_event_tx, tool_output_tx, sqlite_pool, etc.); replace the duplicated
inline setup in tests that currently call
create_process_event_buses_with_capacity and build AgentDeps (such as where
AgentDeps is instantiated) to call this helper so future changes to AgentDeps
fields only need one edit and the helper returns the fully wired AgentDeps and
any event channels you need.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: fef29a4a-6652-4570-b229-4ef6e266e925
📒 Files selected for processing (18)
interface/src/api/client.tsinterface/src/components/ToolCall.tsxinterface/src/hooks/useLiveContext.tsxinterface/src/lib/primitives.tsxprompts/en/tools/shell_description.md.j2prompts/en/worker.md.j2src/agent/channel_history.rssrc/agent/cortex.rssrc/agent/worker.rssrc/api/agents.rssrc/api/state.rssrc/api/system.rssrc/lib.rssrc/main.rssrc/tools.rssrc/tools/shell.rstests/bulletin.rstests/context_dump.rs
dbd87b0 to
236bd42
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (4)
interface/src/lib/primitives.tsx (1)
8-10:⚠️ Potential issue | 🟠 MajorUse package entrypoint imports instead of hard-coded
node_modulespaths.Line 8 and Line 10 couple this module to a specific workspace layout (
../../node_modules/.../dist/index.js) and bypass@spacedrive/primitivesexport resolution. Import/re-export from@spacedrive/primitivesdirectly.Proposed fix
import { Banner as PrimitiveBanner, Button as PrimitiveButton, FilterButton as PrimitiveFilterButton, NumberStepper as PrimitiveNumberStepper, -} from "../../node_modules/@spacedrive/primitives/dist/index.js"; +} from "@spacedrive/primitives"; -export * from "../../node_modules/@spacedrive/primitives/dist/index.js"; +export * from "@spacedrive/primitives";#!/bin/bash # Verify direct node_modules imports and confirm package-entrypoint usage. # 1) Find hard-coded node_modules imports in interface source. rg -n --type ts --type tsx 'from\s+["'\'']\.\./\.\./node_modules/@spacedrive/primitives/dist/index\.js["'\'']|from\s+["'\''].*node_modules/' # 2) Inspect this file specifically. cat -n interface/src/lib/primitives.tsx | sed -n '1,40p' # 3) Check dependency declaration presence for `@spacedrive/primitives`. fd -i package.json | xargs rg -n '"@spacedrive/primitives"'🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/lib/primitives.tsx` around lines 8 - 10, The file imports and re-exports from a hard-coded "../../node_modules/@spacedrive/primitives/dist/index.js"; update both the import and the export to use the package entrypoint " `@spacedrive/primitives` " instead so Node/TS resolution and package exports are respected. Locate the import statement and the export-all statement in primitives.tsx (the lines that reference "../../node_modules/@spacedrive/primitives/dist/index.js") and replace them to import/export directly from `@spacedrive/primitives`; ensure any named imports still match exported symbols from the package and run type/checking to confirm no breakage.interface/src/hooks/useLiveContext.tsx (1)
247-275:⚠️ Potential issue | 🟡 MinorPreserve streamed lines as-is and bound the buffer.
The backend is sending
event.linefromread_line(), soevent.line + "\n"will double-space normal output. This path also growslive_outputwithout any cap, which will balloon React state for noisy commands.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/hooks/useLiveContext.tsx` around lines 247 - 275, The code appends "\n" to every streamed event.line and unboundedly grows live_output; update all places in useLiveContext.tsx where live_output is built (both when updating an existing TranscriptStep and when creating new tool_result steps) to preserve event.line exactly (append event.line without forcing an extra newline, or only add a newline when event.line does not already endWith("\n")), and clamp the live_output buffer to a sensible maximum (introduce a MAX_LIVE_OUTPUT_LENGTH constant and after concatenation truncate to the most recent N characters) so pendingToolCallIdsRef-related handlers (and the fallback synthetic call_id path) won't let React state grow without bound.src/tools/shell.rs (2)
135-149:⚠️ Potential issue | 🟠 MajorThe quiesce watchdog will kill valid quiet commands.
last_output_msonly advances whenread_line()completes, so commands that are legitimately quiet for >5s, or that print progress/prompts without newline terminators, are classified aswaiting_for_inputand killed. That will false-positive on things likesleep, quiet build phases, and\r-driven progress output.Also applies to: 178-197
194-195:⚠️ Potential issue | 🔴 CriticalDon't make process termination best-effort here.
Both kill paths use
try_lock(), so a brief lock holder can cause the kill to be skipped entirely, leaving the child running after quiesce/timeout. Even when the lock succeeds,let _ = child_guard.kill().awaithides kill failures.As per coding guidelines "Don't silently discard errors. No
let _ =on Results. Handle them, log them, or propagate them. The only exception is.ok()on channel sends where the receiver may be dropped."Also applies to: 471-472
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tools/shell.rs` around lines 194 - 195, The current termination uses child.try_lock() and then swallows kill() errors with let _ = ..., which can skip killing or hide failures; replace the non-blocking try_lock() with an awaited lock acquisition (e.g., child.lock().await) so we reliably obtain the guard before attempting termination, and do not discard the Result from child_guard.kill().await—either propagate the error (use ? or return a Result) or log it with context; apply this change to both sites where child.try_lock() / child_guard.kill().await are used so termination is not best-effort and failures are handled.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@interface/src/hooks/useLiveContext.tsx`:
- Around line 247-275: The code appends "\n" to every streamed event.line and
unboundedly grows live_output; update all places in useLiveContext.tsx where
live_output is built (both when updating an existing TranscriptStep and when
creating new tool_result steps) to preserve event.line exactly (append
event.line without forcing an extra newline, or only add a newline when
event.line does not already endWith("\n")), and clamp the live_output buffer to
a sensible maximum (introduce a MAX_LIVE_OUTPUT_LENGTH constant and after
concatenation truncate to the most recent N characters) so
pendingToolCallIdsRef-related handlers (and the fallback synthetic call_id path)
won't let React state grow without bound.
In `@interface/src/lib/primitives.tsx`:
- Around line 8-10: The file imports and re-exports from a hard-coded
"../../node_modules/@spacedrive/primitives/dist/index.js"; update both the
import and the export to use the package entrypoint " `@spacedrive/primitives` "
instead so Node/TS resolution and package exports are respected. Locate the
import statement and the export-all statement in primitives.tsx (the lines that
reference "../../node_modules/@spacedrive/primitives/dist/index.js") and replace
them to import/export directly from `@spacedrive/primitives`; ensure any named
imports still match exported symbols from the package and run type/checking to
confirm no breakage.
In `@src/tools/shell.rs`:
- Around line 194-195: The current termination uses child.try_lock() and then
swallows kill() errors with let _ = ..., which can skip killing or hide
failures; replace the non-blocking try_lock() with an awaited lock acquisition
(e.g., child.lock().await) so we reliably obtain the guard before attempting
termination, and do not discard the Result from child_guard.kill().await—either
propagate the error (use ? or return a Result) or log it with context; apply
this change to both sites where child.try_lock() / child_guard.kill().await are
used so termination is not best-effort and failures are handled.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 09ee23a5-9bf2-4a44-8400-eb1da0f3d567
📒 Files selected for processing (18)
interface/src/api/client.tsinterface/src/components/ToolCall.tsxinterface/src/hooks/useLiveContext.tsxinterface/src/lib/primitives.tsxprompts/en/tools/shell_description.md.j2prompts/en/worker.md.j2src/agent/channel_history.rssrc/agent/cortex.rssrc/agent/worker.rssrc/api/agents.rssrc/api/state.rssrc/api/system.rssrc/lib.rssrc/main.rssrc/tools.rssrc/tools/shell.rstests/bulletin.rstests/context_dump.rs
✅ Files skipped from review due to trivial changes (6)
- src/agent/worker.rs
- src/api/system.rs
- prompts/en/worker.md.j2
- src/agent/cortex.rs
- prompts/en/tools/shell_description.md.j2
- src/lib.rs
🚧 Files skipped from review as they are similar to previous changes (7)
- src/agent/channel_history.rs
- tests/context_dump.rs
- src/tools.rs
- src/main.rs
- interface/src/api/client.ts
- tests/bulletin.rs
- src/api/state.rs
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (4)
interface/src/lib/primitives.tsx (1)
3-10:⚠️ Potential issue | 🟠 MajorUse package entrypoint imports instead of filesystem
node_modulespaths.Line 8 and Line 10 are tightly coupled to repo layout and internal build output. Import/export from
@spacedrive/primitivesdirectly.Suggested patch
import { Banner as PrimitiveBanner, Button as PrimitiveButton, FilterButton as PrimitiveFilterButton, NumberStepper as PrimitiveNumberStepper, -} from "../../node_modules/@spacedrive/primitives/dist/index.js"; +} from "@spacedrive/primitives"; -export * from "../../node_modules/@spacedrive/primitives/dist/index.js"; +export * from "@spacedrive/primitives";#!/bin/bash # Verify no hard-coded primitives node_modules path remains in interface sources. rg -n --type=ts --type=tsx 'node_modules/@spacedrive/primitives|@spacedrive/primitives/dist/index\.js' interface/srcExpected result: no matches after applying the fix.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/lib/primitives.tsx` around lines 3 - 10, The imports and re-export are using a hard-coded filesystem path ("../../node_modules/@spacedrive/primitives/dist/index.js"); update the import source and the export statement to use the package entrypoint "@spacedrive/primitives" instead. Replace the current import that brings in PrimitiveBanner, PrimitiveButton, PrimitiveFilterButton, and PrimitiveNumberStepper from the dist path so they import from "@spacedrive/primitives", and change the export * line to export everything from "@spacedrive/primitives" to remove the tight coupling to node_modules layout.interface/src/hooks/useLiveContext.tsx (2)
242-266:⚠️ Potential issue | 🟠 MajorDon't create a
tool_resultbeforetool_completed.
pairTranscriptSteps()ininterface/src/components/ToolCall.tsxtreats the existence of atool_resultas completion. Synthesizing one on the firsttool_outputline will flip a long-running shell call out of the running state before the real completion event arrives. Store live output separately, or attach it to the pending call entry instead.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/hooks/useLiveContext.tsx` around lines 242 - 266, The code currently creates a TranscriptStep of type "tool_result" on the first tool_output event, which makes pairTranscriptSteps() (in ToolCall.tsx) treat the call as completed; instead, on tool_output events update the existing pending call entry or a separate live-output store and do not synthesize a "tool_result" until a "tool_completed" event arrives. Modify setLiveTranscripts in useLiveContext.tsx to: 1) locate the pending transcript step for event.process_id and event.call_id (e.g., the existing "tool_call" step) and append to its live_output field if present, or 2) if no pending call exists, store output in a dedicated liveOutputs map keyed by call_id/process_id; only create a "tool_result" step when handling tool_completed. Keep references to TranscriptStep, tool_result, tool_output, tool_completed, setLiveTranscripts and pairTranscriptSteps to locate and implement this change.
251-264:⚠️ Potential issue | 🟡 MinorDon't append an extra newline to streamed lines.
The backend reads with
read_line(), soevent.linealready usually includes its trailing newline. Adding another"\n"here double-spaces most live output.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/hooks/useLiveContext.tsx` around lines 251 - 264, In useLiveContext.tsx (in the streaming update where you build/update TranscriptStep), stop appending an extra "\n" to event.line: change the concatenation that sets newOutput from existingOutput + event.line + "\n" to existingOutput + event.line, and change the new step's live_output from event.line + "\n" to event.line so live_output uses the backend-provided newline only; update the code paths that set live_output (the updatedStep creation and the new step object) accordingly.src/tools/shell.rs (1)
188-200:⚠️ Potential issue | 🟠 MajorDon't treat every 5-second quiet period as an interactive prompt.
This will kill valid commands that are simply silent for a while (
sleep 10, long compile phases, tools that update with\rinstead of\n). Those runs will incorrectly surface aswaiting_for_inputwith exit code-2. The watchdog needs a stronger signal than “no completedread_line()for 5 seconds.”
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@interface/src/lib/primitives.tsx`:
- Around line 172-177: The current prop forwarding sets allowFloat to false when
both allowFloat and legacy type are undefined, overriding
PrimitiveNumberStepper's own default; update the prop expression in the wrapper
(the arrow component that spreads {...props} into PrimitiveNumberStepper) so
allowFloat is only passed when explicitly provided or when type === "float"
(e.g. compute allowFloatProp = allowFloat !== undefined ? allowFloat : (type ===
"float" ? true : undefined) and pass allowFloat={allowFloatProp}) so
PrimitiveNumberStepper can use its internal default when neither is specified.
In `@src/tools/shell.rs`:
- Around line 197-198: The code is currently discarding the Result of
child_guard.kill().await (call site using child.try_lock() and
child_guard.kill().await) which can hide failures and leave subprocesses
running; update both occurrences (the block around child.try_lock() /
child_guard.kill().await and the similar block at the other occurrence) to
handle the Result instead of using let _ =: if kill() returns Err, log the error
with context (including which branch triggered the kill, e.g., timeout or
waiting_for_input) and, where appropriate, propagate or surface the failure so
callers/tests can observe it rather than silently ignoring it.
- Around line 437-460: The stream code is creating a new call_id with
uuid::Uuid::new_v4() which breaks correlation with the ToolStarted/ToolCompleted
IDs; instead, propagate the existing invocation ID used when emitting
ToolStarted into the StreamContext structs so ToolOutput events use the same
call_id. Replace the local generation of call_id in this block with the existing
invocation/tool-call identifier (the same variable used when you send
ToolStarted) and pass that same identifier into both stdout_ctx and stderr_ctx
StreamContext.call_id fields so all ToolStarted, ToolOutput, and ToolCompleted
share the same ID.
---
Duplicate comments:
In `@interface/src/hooks/useLiveContext.tsx`:
- Around line 242-266: The code currently creates a TranscriptStep of type
"tool_result" on the first tool_output event, which makes pairTranscriptSteps()
(in ToolCall.tsx) treat the call as completed; instead, on tool_output events
update the existing pending call entry or a separate live-output store and do
not synthesize a "tool_result" until a "tool_completed" event arrives. Modify
setLiveTranscripts in useLiveContext.tsx to: 1) locate the pending transcript
step for event.process_id and event.call_id (e.g., the existing "tool_call"
step) and append to its live_output field if present, or 2) if no pending call
exists, store output in a dedicated liveOutputs map keyed by call_id/process_id;
only create a "tool_result" step when handling tool_completed. Keep references
to TranscriptStep, tool_result, tool_output, tool_completed, setLiveTranscripts
and pairTranscriptSteps to locate and implement this change.
- Around line 251-264: In useLiveContext.tsx (in the streaming update where you
build/update TranscriptStep), stop appending an extra "\n" to event.line: change
the concatenation that sets newOutput from existingOutput + event.line + "\n" to
existingOutput + event.line, and change the new step's live_output from
event.line + "\n" to event.line so live_output uses the backend-provided newline
only; update the code paths that set live_output (the updatedStep creation and
the new step object) accordingly.
In `@interface/src/lib/primitives.tsx`:
- Around line 3-10: The imports and re-export are using a hard-coded filesystem
path ("../../node_modules/@spacedrive/primitives/dist/index.js"); update the
import source and the export statement to use the package entrypoint
"@spacedrive/primitives" instead. Replace the current import that brings in
PrimitiveBanner, PrimitiveButton, PrimitiveFilterButton, and
PrimitiveNumberStepper from the dist path so they import from
"@spacedrive/primitives", and change the export * line to export everything from
"@spacedrive/primitives" to remove the tight coupling to node_modules layout.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 32995432-7fa1-43de-ba2f-fd8f2da08cd8
📒 Files selected for processing (18)
interface/src/api/client.tsinterface/src/components/ToolCall.tsxinterface/src/hooks/useLiveContext.tsxinterface/src/lib/primitives.tsxprompts/en/tools/shell_description.md.j2prompts/en/worker.md.j2src/agent/channel_history.rssrc/agent/cortex.rssrc/agent/worker.rssrc/api/agents.rssrc/api/state.rssrc/api/system.rssrc/lib.rssrc/main.rssrc/tools.rssrc/tools/shell.rstests/bulletin.rstests/context_dump.rs
✅ Files skipped from review due to trivial changes (4)
- src/api/system.rs
- prompts/en/tools/shell_description.md.j2
- prompts/en/worker.md.j2
- src/agent/cortex.rs
🚧 Files skipped from review as they are similar to previous changes (6)
- src/agent/worker.rs
- src/agent/channel_history.rs
- tests/context_dump.rs
- src/main.rs
- interface/src/api/client.ts
- src/api/state.rs
236bd42 to
58fe5f4
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (5)
src/tools/shell.rs (3)
197-199:⚠️ Potential issue | 🟠 MajorDon't let timeout/quiesce report success if the subprocess kill was skipped or failed.
Both branches use
try_lock()and then discardkill().await. If the mutex is busy orkill()returns an error, the tool still reports timeout /waiting_for_inputwhile the child may keep running in the background. Take the lock with a bounded await and at least log the failure.As per coding guidelines "Don't silently discard errors. No
let _ =on Results. Handle them, log them, or propagate them. The only exception is.ok()on channel sends where the receiver may be dropped."Also applies to: 483-486
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tools/shell.rs` around lines 197 - 199, The timeout/quiesce branches currently call child.try_lock() and discard the result of child.kill().await, which can silently skip killing the subprocess; replace try_lock() with an awaited, bounded lock acquisition (e.g., with a timeout) on the child mutex and handle the Result from child.kill().await instead of using let _ = …; if the lock cannot be acquired or kill() returns Err, log an error indicating the failure (including the error) and ensure the function reports failure rather than success for the timeout/waiting_for_input paths; apply the same change to the other occurrence referenced around lines 483-486.
437-460:⚠️ Potential issue | 🟠 MajorThis
call_idstill can't correlate live output with the originating tool call.
interface/src/hooks/useLiveContext.tsxstill synthesizes IDs whentool_startedarrives and reuses those fortool_completed, butToolOutputnow carries a separate UUID minted only here. That leaves streamed shell lines orphaned from the initiatingtool_call, so the live panel never attaches to the running invocation. Reuse one server-side invocation ID acrossToolStarted,ToolOutput, andToolCompletedinstead of generating a stream-only ID here.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tools/shell.rs` around lines 437 - 460, The code generates a fresh UUID into call_id and embeds it in StreamContext, which breaks correlation with the server-side invocation id used by ToolStarted/ToolCompleted; instead stop creating a new UUID and reuse the server invocation id passed into this routine (or add an invocation_id parameter) so the same id flows into StreamContext for both stdout_ctx and stderr_ctx and into all emitted ToolOutput events; update the call_id binding to use that existing invocation id (replace the call to uuid::Uuid::new_v4()) and ensure StreamContext.call_id is set from that single server-side id so ToolStarted, ToolOutput and ToolCompleted share the same identifier.
137-145:⚠️ Potential issue | 🟠 MajorThe quiesce watchdog kills any quiet command after 5s, not just interactive ones.
This fires on any 5-second gap between completed
read_line()calls, so legitimate quiet commands likesleep 10, somecargo buildphases, or progress that uses\r/no newline will be misclassified aswaiting_for_inputand killed.Also applies to: 181-199
interface/src/lib/primitives.tsx (1)
172-177:⚠️ Potential issue | 🟠 MajorPreserve the primitive default when neither
allowFloatnor legacytypeis set.
allowFloat ?? type === "float"collapses tofalsewhen both props areundefined, so this wrapper stopsPrimitiveNumberStepperfrom using its own default behavior.Suggested patch
>(({type, variant: _variant, allowFloat, ...props}, ref) => ( <PrimitiveNumberStepper {...props} ref={ref} - allowFloat={allowFloat ?? type === "float"} + allowFloat={ + allowFloat !== undefined + ? allowFloat + : type === "float" + ? true + : undefined + } /> ));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/lib/primitives.tsx` around lines 172 - 177, The wrapper currently forces allowFloat to false when both allowFloat and type are undefined by using allowFloat ?? type === "float"; update the wrapper around PrimitiveNumberStepper (the arrow function destructuring {type, variant: _variant, allowFloat, ...props}, ref) so it only passes allowFloat when explicitly provided or when type === "float"; e.g., compute a resolvedAllowFloat = allowFloat !== undefined ? allowFloat : (type === "float" ? true : undefined) and pass allowFloat={resolvedAllowFloat} (or conditionally include the prop) so PrimitiveNumberStepper can use its own default when neither prop is set.interface/src/hooks/useLiveContext.tsx (1)
242-267:⚠️ Potential issue | 🟠 MajorStore streamed chunks verbatim and cap the live buffer.
event.lineis already newline-terminated in the common case, so+ "\n"double-spaces shell output. More importantly, this keeps copying an ever-growing string on every SSE event, which will balloon browser memory and rerender cost on noisy commands.Suggested patch
+const MAX_LIVE_OUTPUT_BYTES = 32_000; + +function appendLiveOutput(current: string, chunk: string): string { + const combined = current + chunk; + return combined.length > MAX_LIVE_OUTPUT_BYTES + ? combined.slice(-MAX_LIVE_OUTPUT_BYTES) + : combined; +} + const handleToolOutput = useCallback((data: unknown) => { const event = data as ToolOutputEvent; if (event.process_type === "worker") { setLiveTranscripts((prev) => { const steps = prev[event.process_id] ?? []; @@ if (existingIndex >= 0) { // Append to existing result step const step = steps[existingIndex]; const existingOutput = (step as TranscriptStep).live_output ?? ""; - const newOutput = existingOutput + event.line + "\n"; + const newOutput = appendLiveOutput(existingOutput, event.line); const updatedStep = { ...step, live_output: newOutput }; const newSteps = [...steps]; newSteps[existingIndex] = updatedStep; return { ...prev, [event.process_id]: newSteps }; } @@ const step: TranscriptStep = { type: "tool_result", call_id: event.call_id, name: event.tool_name, text: "", - live_output: event.line + "\n", + live_output: appendLiveOutput("", event.line), }; return { ...prev, [event.process_id]: [...steps, step] }; });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@interface/src/hooks/useLiveContext.tsx` around lines 242 - 267, In useLiveContext's setLiveTranscripts update (the block that finds/creates a "tool_result" step by event.call_id), stop appending an extra "\n" and instead append event.line verbatim, and enforce a fixed max buffer for TranscriptStep.live_output to avoid unbounded growth; when updating existingStep.live_output concatenate event.line (no added newline) and then trim the resulting string to a configured MAX_LIVE_OUTPUT_SIZE (e.g., keep the last N characters or last M lines) before setting updatedStep.live_output, so the update logic in setLiveTranscripts (and the creation of the new step) both use the same capped, verbatim buffering behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@interface/src/hooks/useLiveContext.tsx`:
- Around line 242-267: In useLiveContext's setLiveTranscripts update (the block
that finds/creates a "tool_result" step by event.call_id), stop appending an
extra "\n" and instead append event.line verbatim, and enforce a fixed max
buffer for TranscriptStep.live_output to avoid unbounded growth; when updating
existingStep.live_output concatenate event.line (no added newline) and then trim
the resulting string to a configured MAX_LIVE_OUTPUT_SIZE (e.g., keep the last N
characters or last M lines) before setting updatedStep.live_output, so the
update logic in setLiveTranscripts (and the creation of the new step) both use
the same capped, verbatim buffering behavior.
In `@interface/src/lib/primitives.tsx`:
- Around line 172-177: The wrapper currently forces allowFloat to false when
both allowFloat and type are undefined by using allowFloat ?? type === "float";
update the wrapper around PrimitiveNumberStepper (the arrow function
destructuring {type, variant: _variant, allowFloat, ...props}, ref) so it only
passes allowFloat when explicitly provided or when type === "float"; e.g.,
compute a resolvedAllowFloat = allowFloat !== undefined ? allowFloat : (type ===
"float" ? true : undefined) and pass allowFloat={resolvedAllowFloat} (or
conditionally include the prop) so PrimitiveNumberStepper can use its own
default when neither prop is set.
In `@src/tools/shell.rs`:
- Around line 197-199: The timeout/quiesce branches currently call
child.try_lock() and discard the result of child.kill().await, which can
silently skip killing the subprocess; replace try_lock() with an awaited,
bounded lock acquisition (e.g., with a timeout) on the child mutex and handle
the Result from child.kill().await instead of using let _ = …; if the lock
cannot be acquired or kill() returns Err, log an error indicating the failure
(including the error) and ensure the function reports failure rather than
success for the timeout/waiting_for_input paths; apply the same change to the
other occurrence referenced around lines 483-486.
- Around line 437-460: The code generates a fresh UUID into call_id and embeds
it in StreamContext, which breaks correlation with the server-side invocation id
used by ToolStarted/ToolCompleted; instead stop creating a new UUID and reuse
the server invocation id passed into this routine (or add an invocation_id
parameter) so the same id flows into StreamContext for both stdout_ctx and
stderr_ctx and into all emitted ToolOutput events; update the call_id binding to
use that existing invocation id (replace the call to uuid::Uuid::new_v4()) and
ensure StreamContext.call_id is set from that single server-side id so
ToolStarted, ToolOutput and ToolCompleted share the same identifier.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2a35e9be-754d-4f6a-ba93-ba812e1abb48
📒 Files selected for processing (19)
interface/src/api/client.tsinterface/src/components/ToolCall.tsxinterface/src/hooks/useLiveContext.tsxinterface/src/lib/primitives.tsxprompts/en/tools/shell_description.md.j2prompts/en/worker.md.j2src/agent/channel_history.rssrc/agent/cortex.rssrc/agent/worker.rssrc/api/agents.rssrc/api/state.rssrc/api/system.rssrc/conversation/worker_transcript.rssrc/lib.rssrc/main.rssrc/tools.rssrc/tools/shell.rstests/bulletin.rstests/context_dump.rs
✅ Files skipped from review due to trivial changes (4)
- src/api/system.rs
- prompts/en/worker.md.j2
- prompts/en/tools/shell_description.md.j2
- src/api/agents.rs
🚧 Files skipped from review as they are similar to previous changes (8)
- src/agent/worker.rs
- src/agent/channel_history.rs
- src/tools.rs
- tests/context_dump.rs
- src/main.rs
- interface/src/api/client.ts
- src/agent/cortex.rs
- src/api/state.rs
58fe5f4 to
4d081e0
Compare
Replace batch-output shell execution with spawn-based streaming: - Shell commands now stream output line-by-line via ProcessEvent::ToolOutput - 5-second quiesce detection kills interactive prompts (waiting_for_input) - Frontend renders live output in ToolCall component with auto-scroll - ToolOutput events broadcast via separate SSE channel (capacity 1024) - Timeout enforcement wraps entire streaming operation - Watchdog now kills child process when quiesce detected Fixes: - P1: Timeout was ignored in streaming mode (now enforced) - P1: Watchdog didn't kill child (now passes Arc<Mutex<Child>>) - P2: Frontend call_id mismatch (now uses pendingToolCallIdsRef) Files changed: - src/tools/shell.rs: Complete rewrite with streaming - src/lib.rs: Add ProcessEvent::ToolOutput variant - src/api/state.rs: Wire SSE for tool_output events - interface/: Live output rendering components - prompts/: Interactive guidance for LLM
4d081e0 to
ada9281
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (2)
src/tools/shell.rs (1)
186-219:⚠️ Potential issue | 🟠 MajorThe quiesce watchdog will misclassify quiet commands as interactive.
This loop treats any ~5s window without a completed line read as
waiting_for_input. Commands likesleep 10, long quiet build phases, or tools that only repaint with\rwill be terminated even though they never asked for input. Please gate this on prompt-like output or make the heuristic configurable/opt-in.src/api/state.rs (1)
653-653:⚠️ Potential issue | 🟠 MajorThe live transcript builder still uses mismatched call IDs.
TranscriptStep::ToolResultalready carriescall_idinsrc/conversation/worker_transcript.rs, Lines 35-42, but this path appends streamed output under the realcall_idwhile Line 596 still mints a separatelive_*ID for the ToolCall step and Lines 649-653 append a second ToolResult with an emptycall_id. Becauseregister_tool_output_stream()mutates the same cache from a different task, a fastToolOutputcan also land before the ToolCall/cache entry exists. Refreshing mid-run can therefore show the same shell invocation as missing, reordered, or duplicated transcript steps. Reuse one lifecyclecall_idend-to-end and serialize transcript accumulation through one ordered path.Also applies to: 823-876
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/api/state.rs` at line 653, The live transcript path mints a separate live_* call_id and appends ToolResult with empty call_id causing mismatched/duplicated steps; update the code so the same lifecycle call_id is created once and reused end-to-end: generate the call_id deterministically when creating the ToolCall entry, store that call_id into the live_output entry instead of minting a new live_* id, and make register_tool_output_stream() append streamed output to that same call_id (and not an empty call_id); also handle the race where ToolOutput can arrive before ToolCall by creating the cache entry (or buffering the output keyed by the deterministic call_id) immediately when the call_id is generated so subsequent register_tool_output_stream() mutations serialize into the same transcript path (apply the same fixes in the second region referenced).
🧹 Nitpick comments (1)
src/lib.rs (1)
361-390: Consider returning a named bus struct here.All three return values now have the same
Sender<ProcessEvent>type, so swappingmemory_event_txandtool_output_txwill still compile and silently misroute events. A smallProcessEventBuses { event_tx, memory_event_tx, tool_output_tx }return type would make this API much harder to misuse.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib.rs` around lines 361 - 390, The three broadcast senders returned by create_process_event_buses and create_process_event_buses_with_capacity are all the same type and can be accidentally swapped; introduce a new ProcessEventBuses struct (e.g. pub struct ProcessEventBuses { pub control: tokio::sync::broadcast::Sender<ProcessEvent>, pub memory: tokio::sync::broadcast::Sender<ProcessEvent>, pub tool_output: tokio::sync::broadcast::Sender<ProcessEvent> }) and change both functions to return that struct instead of a tuple, constructing it with the existing local variables (event_tx, memory_event_tx, tool_output_tx) so callers must use named fields and cannot misroute events.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@interface/src/components/ToolCall.tsx`:
- Around line 56-63: The code incorrectly treats any tool_result step as a final
completion; update the indexing in ToolCall.tsx (the resultsById Map created in
the steps loop) to include an explicit status field (e.g., status: "pending" |
"final" | "waiting_for_input" ) and the liveOutput separately instead of
inferring completion from the presence of tool_result. Change the producer in
useLiveContext.tsx to set that explicit status on the temporary tool_result it
emits while streaming, and then update the consumer logic (the conditional
around result ? ... : "running" and the live-output branch referenced around
lines ~82-93) to switch behavior based on the new status value (treat status ===
"pending" as running, status === "final" as completed, and handle
"waiting_for_input" distinctly).
In `@interface/src/lib/primitives.tsx`:
- Around line 89-99: The href branch rendering via hasHref(props) and
PrimitiveButton must honor the loading guard like the action-button path: when
props.loading (or the component's loading state) is true, set the element to
disabled/aria-busy (prevent activation/clicks and focus as appropriate), prevent
navigation/activation handlers from running, and render the spinner instead of
interactive content; update the PrimitiveButton usage in that branch (the JSX
that spreads {...props} and passes ref, variant/buttonVariant,
className/buttonClassName, and children/content) to include the same loading
props/guards (disabled, aria-busy, and click/keypress suppression) used by the
action-button code path so the link appears and behaves disabled while loading.
In `@src/tools/shell.rs`:
- Around line 53-84: with_streaming currently stores a single call_id on the
ShellTool instance causing reuse across multiple shell invocations; instead
generate a fresh call_id per execution inside the execution path (e.g., in
ShellTool::call or ShellTool::run_streaming) and thread that per-run ID into all
lifecycle event emissions (ToolStarted/ToolOutput/ToolCompleted) and wherever
run_streaming needs it; update ShellTool::with_streaming to stop setting a
persistent call_id (or remove the field) and change run_streaming/call
signatures to accept a call_id (or create it at start of call() and pass to
run_streaming) so each invocation uses a unique ID.
- Around line 149-169: The emitted ToolOutput currently forwards the raw buffer
(in the read_line loop that constructs and sends ProcessEvent::ToolOutput),
which preserves the trailing newline and causes double newlines when the
accumulator in src/api/state.rs appends another '\n'; fix it by normalizing the
line before sending — call a trim that removes trailing newline(s) (e.g.,
trim_end_matches or trim_end) on the local `line`/`scrubbed` value in the
reader.read_line loop prior to populating ProcessEvent::ToolOutput.line so the
sent payload has no trailing line ending; ensure this change is applied where
ctx.tool_output_tx.send(ProcessEvent::ToolOutput { ... line: ... }) is
constructed (leave the accumulator unchanged).
---
Duplicate comments:
In `@src/api/state.rs`:
- Line 653: The live transcript path mints a separate live_* call_id and appends
ToolResult with empty call_id causing mismatched/duplicated steps; update the
code so the same lifecycle call_id is created once and reused end-to-end:
generate the call_id deterministically when creating the ToolCall entry, store
that call_id into the live_output entry instead of minting a new live_* id, and
make register_tool_output_stream() append streamed output to that same call_id
(and not an empty call_id); also handle the race where ToolOutput can arrive
before ToolCall by creating the cache entry (or buffering the output keyed by
the deterministic call_id) immediately when the call_id is generated so
subsequent register_tool_output_stream() mutations serialize into the same
transcript path (apply the same fixes in the second region referenced).
---
Nitpick comments:
In `@src/lib.rs`:
- Around line 361-390: The three broadcast senders returned by
create_process_event_buses and create_process_event_buses_with_capacity are all
the same type and can be accidentally swapped; introduce a new ProcessEventBuses
struct (e.g. pub struct ProcessEventBuses { pub control:
tokio::sync::broadcast::Sender<ProcessEvent>, pub memory:
tokio::sync::broadcast::Sender<ProcessEvent>, pub tool_output:
tokio::sync::broadcast::Sender<ProcessEvent> }) and change both functions to
return that struct instead of a tuple, constructing it with the existing local
variables (event_tx, memory_event_tx, tool_output_tx) so callers must use named
fields and cannot misroute events.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: eb3a5e3f-ba19-4d82-81d5-c10ab5476a90
📒 Files selected for processing (19)
interface/src/api/client.tsinterface/src/components/ToolCall.tsxinterface/src/hooks/useLiveContext.tsxinterface/src/lib/primitives.tsxprompts/en/tools/shell_description.md.j2prompts/en/worker.md.j2src/agent/channel_history.rssrc/agent/cortex.rssrc/agent/worker.rssrc/api/agents.rssrc/api/state.rssrc/api/system.rssrc/conversation/worker_transcript.rssrc/lib.rssrc/main.rssrc/tools.rssrc/tools/shell.rstests/bulletin.rstests/context_dump.rs
🚧 Files skipped from review as they are similar to previous changes (12)
- src/agent/channel_history.rs
- src/agent/worker.rs
- prompts/en/tools/shell_description.md.j2
- tests/context_dump.rs
- prompts/en/worker.md.j2
- src/api/system.rs
- src/agent/cortex.rs
- src/api/agents.rs
- src/tools.rs
- src/conversation/worker_transcript.rs
- interface/src/hooks/useLiveContext.tsx
- tests/bulletin.rs
| const resultsById = new Map<string, {name: string; text: string; liveOutput?: string}>(); | ||
|
|
||
| // First pass: index all tool_result steps by call_id | ||
| for (const step of steps) { | ||
| if (step.type === "tool_result") { | ||
| resultsById.set(step.call_id, {name: step.name, text: step.text}); | ||
| const liveOutput = step.live_output; | ||
| resultsById.set(step.call_id, {name: step.name, text: step.text, liveOutput}); | ||
| } |
There was a problem hiding this comment.
Don't use placeholder tool_result steps as the completion signal.
interface/src/hooks/useLiveContext.tsx:239-275 now creates a temporary tool_result with text: "" just to carry live_output while the tool is still running. Line 91's result ? ... : "running" heuristic treats that placeholder as a real completion, so the card flips out of the running state on the first streamed line and the live-output branch below becomes effectively dead. It also leaves no way to represent the new waiting_for_input shell outcome distinctly from a hard error. Please carry an explicit pending/final status instead of deriving it from the presence of a tool_result record.
Also applies to: 82-93
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@interface/src/components/ToolCall.tsx` around lines 56 - 63, The code
incorrectly treats any tool_result step as a final completion; update the
indexing in ToolCall.tsx (the resultsById Map created in the steps loop) to
include an explicit status field (e.g., status: "pending" | "final" |
"waiting_for_input" ) and the liveOutput separately instead of inferring
completion from the presence of tool_result. Change the producer in
useLiveContext.tsx to set that explicit status on the temporary tool_result it
emits while streaming, and then update the consumer logic (the conditional
around result ? ... : "running" and the live-output branch referenced around
lines ~82-93) to switch behavior based on the new status value (treat status ===
"pending" as running, status === "final" as completed, and handle
"waiting_for_input" distinctly).
| if (hasHref(props)) { | ||
| return ( | ||
| <PrimitiveButton | ||
| {...props} | ||
| ref={ref} | ||
| variant={buttonVariant} | ||
| className={buttonClassName} | ||
| > | ||
| {content} | ||
| </PrimitiveButton> | ||
| ); |
There was a problem hiding this comment.
Apply the loading guard in the href branch too.
This branch renders the spinner, but unlike the action-button path it never marks the control busy/disabled or blocks activation. A loading link can still be focused and clicked, which makes the wrapper look disabled while remaining interactive.
Suggested patch
if (hasHref(props)) {
+ const {onClick, tabIndex, ...linkProps} = props;
return (
<PrimitiveButton
- {...props}
+ {...linkProps}
ref={ref}
+ aria-busy={loading || undefined}
+ aria-disabled={loading || undefined}
+ tabIndex={loading ? -1 : tabIndex}
+ onClick={(event) => {
+ if (loading) {
+ event.preventDefault();
+ event.stopPropagation();
+ return;
+ }
+ onClick?.(event);
+ }}
variant={buttonVariant}
- className={buttonClassName}
+ className={cx(buttonClassName, loading && "pointer-events-none")}
>
{content}
</PrimitiveButton>
);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (hasHref(props)) { | |
| return ( | |
| <PrimitiveButton | |
| {...props} | |
| ref={ref} | |
| variant={buttonVariant} | |
| className={buttonClassName} | |
| > | |
| {content} | |
| </PrimitiveButton> | |
| ); | |
| if (hasHref(props)) { | |
| const {onClick, tabIndex, ...linkProps} = props; | |
| return ( | |
| <PrimitiveButton | |
| {...linkProps} | |
| ref={ref} | |
| aria-busy={loading || undefined} | |
| aria-disabled={loading || undefined} | |
| tabIndex={loading ? -1 : tabIndex} | |
| onClick={(event) => { | |
| if (loading) { | |
| event.preventDefault(); | |
| event.stopPropagation(); | |
| return; | |
| } | |
| onClick?.(event); | |
| }} | |
| variant={buttonVariant} | |
| className={cx(buttonClassName, loading && "pointer-events-none")} | |
| > | |
| {content} | |
| </PrimitiveButton> | |
| ); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@interface/src/lib/primitives.tsx` around lines 89 - 99, The href branch
rendering via hasHref(props) and PrimitiveButton must honor the loading guard
like the action-button path: when props.loading (or the component's loading
state) is true, set the element to disabled/aria-busy (prevent activation/clicks
and focus as appropriate), prevent navigation/activation handlers from running,
and render the spinner instead of interactive content; update the
PrimitiveButton usage in that branch (the JSX that spreads {...props} and passes
ref, variant/buttonVariant, className/buttonClassName, and children/content) to
include the same loading props/guards (disabled, aria-busy, and click/keypress
suppression) used by the action-button code path so the link appears and behaves
disabled while loading.
| match reader.read_line(&mut line).await { | ||
| Ok(0) => break, | ||
| Ok(_) => { | ||
| let now_ms = std::time::SystemTime::now() | ||
| .duration_since(std::time::UNIX_EPOCH) | ||
| .unwrap_or_default() | ||
| .as_millis() as u64; | ||
| ctx.last_output_ms.store(now_ms, Ordering::SeqCst); | ||
|
|
||
| if let (Some(tx), Some(agent_id), Some(process_id)) = | ||
| (&ctx.tool_output_tx, &ctx.agent_id, &ctx.process_id) | ||
| { | ||
| let scrubbed = crate::secrets::scrub::scrub_leaks(&line); | ||
| if let Err(err) = tx.send(ProcessEvent::ToolOutput { | ||
| agent_id: agent_id.clone(), | ||
| process_id: process_id.clone(), | ||
| channel_id: ctx.channel_id.clone(), | ||
| call_id: ctx.call_id.clone(), | ||
| tool_name: "shell".to_string(), | ||
| line: scrubbed, | ||
| stream: ctx.stream_name.clone(), |
There was a problem hiding this comment.
Normalize ToolOutput.line before emitting it.
This forwards the raw line buffer, and in src/api/state.rs, Lines 862-867 and Line 875 append another \n while rebuilding live_output. That double-delimits most lines and produces blank rows after a refresh. Strip the trailing line ending here or stop appending a second newline in the accumulator.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/tools/shell.rs` around lines 149 - 169, The emitted ToolOutput currently
forwards the raw buffer (in the read_line loop that constructs and sends
ProcessEvent::ToolOutput), which preserves the trailing newline and causes
double newlines when the accumulator in src/api/state.rs appends another '\n';
fix it by normalizing the line before sending — call a trim that removes
trailing newline(s) (e.g., trim_end_matches or trim_end) on the local
`line`/`scrubbed` value in the reader.read_line loop prior to populating
ProcessEvent::ToolOutput.line so the sent payload has no trailing line ending;
ensure this change is applied where
ctx.tool_output_tx.send(ProcessEvent::ToolOutput { ... line: ... }) is
constructed (leave the accumulator unchanged).
Summary
Implements spawn-based streaming shell execution with real-time output and quiesce detection. This eliminates the #1 worker reliability gap: interactive commands that silently hang and timeout.
What Changed
Backend
cmd.output()to spawn-based streamingFrontend
Prompts
--yes/-yflagsKey Design Decisions
Verification
cargo check --libpassescargo clippy --libpassescargo check --testspassesbun run build(frontend) passesjust preflightpassesRelated
Closes the interactive shell reliability gap. CI=true and DEBIAN_FRONTEND=noninteractive already set in sandbox (pre-existing).
Note
Transforms shell execution from blocking batch mode to live streaming with automatic detection of interactive prompts. Fixes three P1/P2 gaps: timeout enforcement, process termination, and UI state synchronization.
Written by Tembo for commit a006718. This will update automatically on new commits.