feat(agent): live "Thinking…" indicator for reasoning models#98

Open
Mariomarquezt wants to merge 8 commits into CoreBunch:main from Mariomarquezt:feat/agent-reasoning-display

@Mariomarquezt

What & why

Reasoning models reached over the chat/completions wire stream their chain-of-thought in delta.reasoning_content (or delta.reasoning on some gateways) while the visible answer (delta.content) stays empty for several seconds. The agent panel showed nothing during that window — it looked frozen. This surfaces a lightweight, ephemeral "Thinking…" indicator plus an on-demand "Show reasoning" expander.

Approach

  • New ephemeral stream event { type: 'reasoning'; text } in AiStreamEvent.
  • The chat/completions translator emits it from delta.reasoning_content ?? delta.reasoning. It is never appended to the assistant message, so reasoning is not persisted and never replayed to the provider.
  • The runner forwards it (no persistence). The agent store accumulates it (rAF-batched, like text) into a session-only AgentMessage.reasoning.
  • AgentPanel shows an animated "Thinking…" indicator while a turn reasons with no answer yet, then a collapsed "Show reasoning" expander once the answer arrives. UI extracted to a small MessageReasoning component. No DB changes.
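As a rough illustration of the translator step above, here is a minimal sketch. The shapes are simplified assumptions: the real AiStreamEvent has more variants, and ChunkDelta, AssistantMessage, and translateDelta are hypothetical names, not the PR's actual code.

```typescript
// Simplified, hypothetical shapes; only the 'reasoning' variant is described in this PR.
type AiStreamEvent =
  | { type: 'text'; text: string }
  | { type: 'reasoning'; text: string };

interface ChunkDelta {
  content?: string | null;
  reasoning_content?: string | null;
  reasoning?: string | null;
}

interface AssistantMessage {
  content: string;
}

// Translate one streamed delta into events. Reasoning is emitted as an
// ephemeral event only; it is never written into the assistant message.
function translateDelta(delta: ChunkDelta, message: AssistantMessage): AiStreamEvent[] {
  const events: AiStreamEvent[] = [];

  // Prefer reasoning_content, fall back to reasoning (gateway-dependent field name).
  const reasoning = delta.reasoning_content ?? delta.reasoning;
  if (reasoning) {
    events.push({ type: 'reasoning', text: reasoning });
  }

  // Only visible answer text is appended to the persisted message.
  if (delta.content) {
    message.content += delta.content;
    events.push({ type: 'text', text: delta.content });
  }

  return events;
}
```

Because the message object is only touched in the content branch, nothing from the reasoning stream can be persisted or replayed in this sketch.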

Stacking note

This builds on #97 (it depends on the shared chatCompletions.ts adapter introduced there). Until #97 merges, this PR's diff also shows the provider commits. Happy to rebase onto main once #97 lands so this becomes a clean, isolated diff.

User / developer impact

Reasoning models (and any chat/completions model that emits reasoning) no longer appear idle; users see live "thinking" feedback and can inspect the reasoning on demand. Non-reasoning models are unaffected (no indicator shown).
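The display rule described above could be expressed as a small pure function. This is a sketch under assumptions: AgentMessageView, ReasoningUiState, and reasoningUiState are hypothetical names, and the actual MessageReasoning component may decide differently.

```typescript
// Hypothetical view of a message: visible answer text plus session-only reasoning.
interface AgentMessageView {
  text: string;
  reasoning?: string;
}

type ReasoningUiState = 'none' | 'thinking' | 'expander';

// Decide which reasoning UI to show for a message.
function reasoningUiState(message: AgentMessageView): ReasoningUiState {
  // Non-reasoning models never populate reasoning, so nothing is shown.
  if (!message.reasoning) return 'none';
  // Reasoning has streamed but no answer yet: animated "Thinking…" indicator.
  // Once answer text exists: collapsed "Show reasoning" expander.
  return message.text.length === 0 ? 'thinking' : 'expander';
}
```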

Verification

  • bun test — translator tests (reasoning emitted for both field names; never leaks into the answer) + stream-reducer routing test.
  • bun run build and bun run lint — green.
  • Live-tested against reasoning models on the OpenCode Zen gateway.

Disclosure

Authored with Claude Code (Claude Opus 4.8), reviewed and live-tested by the submitter. Harness: Claude Code.

Mario and others added 8 commits June 27, 2026 17:51
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…olish

- Add normalizeOpenAiBaseUrl() to chatCompletions.ts that strips trailing
  slashes and an optional trailing /v1 segment, preventing the /v1/v1/
  double-append footgun when users paste provider-documented URLs.
- Use normalizeOpenAiBaseUrl in makeChatCompletionsAdapter (endpoint) and
  fetchOpenAiCompatibleModels (/v1/models fetch); drop the now-unused
  trimSlash import from openaiCompatible.ts.
- Remove redundant 'as AiProviderId' cast (M4); drop the unused import.
- Add normalizeOpenAiBaseUrl test coverage in chatCompletions.test.ts and
  a /v1-suffixed base-URL normalization case in openaiCompatible.test.ts.
- Update AiAuthMode baseUrl JSDoc to reflect Ollama + openai-compatible (M1).
- Add OpenAI-Compatible to contextTokens.ts comment for parity (M3).
- Update ProvidersTab base-URL placeholder to https://api.groq.com/openai/v1
  so the UI matches the now-correct /v1-inclusive provider-documented form.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_0115W5vEDNwsWeaeS5PyTFgG
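For illustration, the normalization described in this commit might look roughly like the sketch below. The function name comes from the commit message, but the body is an assumption, and chatCompletionsEndpoint is a hypothetical helper added only to show the /v1/v1/ case.

```typescript
// Assumed behaviour: strip trailing slashes, then an optional trailing /v1 segment.
function normalizeOpenAiBaseUrl(baseUrl: string): string {
  return baseUrl
    .replace(/\/+$/, '') // drop trailing slashes
    .replace(/\/v1$/, '') // drop an optional trailing /v1 segment
    .replace(/\/+$/, ''); // drop any slashes exposed by removing /v1
}

// Hypothetical helper: the caller always appends /v1/..., so a pasted
// ".../v1" base URL no longer produces ".../v1/v1/chat/completions".
function chatCompletionsEndpoint(baseUrl: string): string {
  return `${normalizeOpenAiBaseUrl(baseUrl)}/v1/chat/completions`;
}
```

With this sketch, both https://api.groq.com/openai/v1 and https://api.groq.com/openai/ resolve to the same endpoint.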

The id stays 'openai-compatible' (stable registry/DB identifier); only the
user-facing display label changes: dropdown, credential card, driver label,
and docs. Protocol descriptions and the filename are unchanged.

Real OpenAI-compatible gateways (OpenCode Zen, OpenRouter, vLLM, …) send
explicit `null` for optional per-chunk fields (`usage: null`,
`tool_calls: null`, `delta.content: null`) on every chunk. The chunk schema
used Type.Optional, which accepts absent-or-value but not null, so parseValue
threw, the frame was dropped in translate()'s catch, and the model's entire
reply silently vanished — reasoning models (GLM, DeepSeek, Qwen, MiniMax)
appeared to 'not reply'. Wrap the optional fields in a nullable() helper so
both absent and null validate. Verified against real gateway frames.
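The absent-versus-null distinction can be sketched with hand-rolled type guards. This deliberately does not use the project's schema library: Guard, optional, and this nullable are hypothetical stand-ins meant only to show why an optional-only field rejects an explicit null.

```typescript
// A minimal validator shape (hypothetical, not the project's schema library).
type Guard<T> = (value: unknown) => value is T;

const isString: Guard<string> = (v: unknown): v is string => typeof v === 'string';

// Optional-only: accepts an absent field or a valid value, but rejects null.
function optional<T>(guard: Guard<T>): Guard<T | undefined> {
  return (v: unknown): v is T | undefined => v === undefined || guard(v);
}

// Nullable wrapper: accepts absent, explicit null, or a valid value.
function nullable<T>(guard: Guard<T>): Guard<T | null | undefined> {
  return (v: unknown): v is T | null | undefined =>
    v === null || v === undefined || guard(v);
}

// A frame like { delta: { content: null } } fails the first check and
// passes the second, which is the difference this commit relies on.
const contentOptional = optional(isString);
const contentNullable = nullable(isString);
```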
Reasoning models reached over the chat/completions wire stream their
chain-of-thought in delta.reasoning_content / delta.reasoning while the answer
stays empty — the panel looked frozen for seconds. Add an ephemeral reasoning
stream event: the chat/completions translator emits { type: 'reasoning' }
(never added to the assistant message, so it is not persisted or replayed); the
runner forwards it; the agent store accumulates it (rAF-batched, session-only)
into AgentMessage.reasoning; the panel shows an animated 'Thinking…' indicator
while reasoning streams with no answer yet, then an on-demand 'Show reasoning'
expander. No DB/schema changes. Reasoning UI extracted to MessageReasoning to
keep AgentPanel under the module-size ceiling.
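The batched accumulation mentioned above might be sketched as follows. This is an assumption about the general technique, not the store's actual code: createReasoningAccumulator, Schedule, and commit are hypothetical names, and the scheduler is injected so the sketch runs outside a browser.

```typescript
// A scheduler that runs the flush callback later (one frame later in a browser).
type Schedule = (flush: () => void) => void;

// Accumulate reasoning chunks and commit at most once per scheduled flush.
function createReasoningAccumulator(
  schedule: Schedule,
  commit: (reasoning: string) => void,
) {
  let committed = ''; // reasoning already handed to the store
  let pending = ''; // chunks received since the last flush
  let scheduled = false;

  return {
    push(chunk: string): void {
      pending += chunk;
      if (scheduled) return; // a flush is already queued for this batch
      scheduled = true;
      schedule(() => {
        scheduled = false;
        committed += pending;
        pending = '';
        commit(committed); // one store update per batch, not per chunk
      });
    },
  };
}
```

In a browser the scheduler would typically wrap requestAnimationFrame, so many small chunks arriving within one frame cause a single update to the session-only reasoning field.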
@Mariomarquezt Mariomarquezt marked this pull request as ready for review June 28, 2026 06:32