feat: add thinking/reasoning display UI by gadenbuie · Pull Request #208 · posit-dev/shinychat

gadenbuie · 2026-05-06T15:35:44Z

Summary

Adds a collapsible ThinkingDisplay component that shows LLM reasoning tokens in real-time, with topic labels, animated transitions, and auto-collapse on completion
Supports thinking via content_type: "thinking" transport (R/ellmer, Python/chatlas) and client-side <thinking> tag detection (local models, Python bookmark restore)
Includes accessibility (aria-expanded, inert, focus-visible, prefers-reduced-motion), stick-to-bottom for overflow, and comprehensive reducer tests (54 cases)
Handles edge cases: disconnect mid-stream (finalizes in-flight blocks), <thinking> tags inside fenced code blocks (preserved as content), empty thinking blocks (hidden), history replay (suppresses flash/duration artifacts), and multiple thinking-response cycles within a single turn

Examples

Local thinking models

Models that emit <thinking>...</thinking> tags (DeepSeek, QwQ, Gemma, etc.) get thinking UI automatically with no server-side changes:

Python:

from chatlas import ChatOpenAI

chat_model = ChatOpenAI(
    base_url="http://localhost:1234/v1",
    model="google/gemma-4-26b-a4b",
    api_key="lm-studio",
)

app = chat_model.app()

R:

library(shinychat)
library(ellmer)

chat <- chat_lmstudio(
    base_url = "http://localhost:1234",
    model = "google/gemma-4-26b-a4b"
)

live_browser(chat)

Structured thinking APIs

Models with dedicated reasoning APIs (Claude with extended thinking, OpenAI with reasoning) emit ContentThinking objects that flow through the content_type: "thinking" transport:

from chatlas import ChatAnthropic, ChatOpenAI

# Claude with extended thinking
chat = ChatAnthropic(reasoning=2000, model="claude-sonnet-4-6")
chat.app()

# OpenAI with reasoning
chat = ChatOpenAI(reasoning="medium", model="gpt-5.5")
chat.app()

Full demo app (Python) — mock thinking stream for testing

import asyncio
import random

from chatlas._content import ContentThinking
from shiny import App, reactive, ui



async def split_words_stream(texts: list[str]):
    """Yield characters with random delays to simulate streaming."""
    full = "".join(texts)
    for char in full:
        yield char
        await asyncio.sleep(random.uniform(0.005, 0.05))


thinking_chunks = [
    "The user is asking about the relationship between violent video games ",
    "and violent behavior in children. This is a well-studied topic in ",
    "psychology and media studies. I need to consider the research carefully.\n\n",
    "There have been several major meta-analyses on this topic. ",
    "Anderson et al. (2010) found small but significant effects, ",
    "while Ferguson (2015) argued these effects disappear when ",
    "publication bias is accounted for. The APA task force (2015) ",
    "concluded there's a link to aggression but insufficient evidence ",
    "for a link to criminal violence.\n\n",
    "Important confounding variables include family environment, ",
    "pre-existing behavioral issues, socioeconomic status, and ",
    "peer influences. Longitudinal studies have had mixed results ",
    "when controlling for these factors.\n\n",
    "I should present the nuanced scientific consensus: ",
    "short-term increases in aggressive thoughts/feelings are documented, ",
    "but the leap to real-world violent behavior is not well-supported. ",
    "I'll note the difference between aggression and violence.",
]

response_chunks = [
    "## The Short Answer\n\n",
    "The relationship between violent video games and real-world violence ",
    "is **far more nuanced** than popular media often suggests.\n\n",
    "## What the Research Shows\n\n",
    "### Short-term effects\n",
    "Studies do find that playing violent video games can temporarily ",
    "increase aggressive *thoughts* and *feelings*. ",
    "However, these effects are small and short-lived.\n\n",
    "### Long-term behavioral effects\n",
    "The evidence for a causal link to actual violent *behavior* is ",
    "much weaker. Major meta-analyses disagree:\n\n",
    "- **Anderson et al. (2010)**: Found small correlations with aggressive behavior\n",
    "- **Ferguson (2015)**: Argued effects vanish when correcting for publication bias\n",
    "- **APA (2015, 2020)**: Insufficient evidence to link games to criminal violence\n\n",
    "### The bigger picture\n",
    "As video game sales have skyrocketed globally since the 1990s, ",
    "youth violence rates have actually *declined* significantly in most countries. ",
    "This ecological data doesn't prove games reduce violence, but it strongly ",
    "undermines claims of a major causal effect.\n\n",
    "## Key Takeaway\n\n",
    "There's a meaningful difference between *aggression* (competitive feelings, ",
    "frustration) and *violence* (intent to harm). Video games may briefly ",
    "increase the former, but the scientific consensus does not support a ",
    "causal link to the latter.",
]

local_model_thinking_chunks = [
    "The user is asking about the relationship between violent video games ",
    "and violent behavior in children. This is a well-studied topic in ",
    "psychology and media studies. I need to consider the research carefully.\n\n",
    "<topic>Reviewing the research</topic>\n",
    "There have been several major meta-analyses on this topic. ",
    "Anderson et al. (2010) found small but significant effects, ",
    "while Ferguson (2015) argued these effects disappear when ",
    "publication bias is accounted for. The APA task force (2015) ",
    "concluded there's a link to aggression but insufficient evidence ",
    "for a link to criminal violence.\n\n",
    "I should present the nuanced scientific consensus.",
]

local_model_response_chunks = [
    "## The Short Answer\n\n",
    "The relationship between violent video games and real-world violence ",
    "is **far more nuanced** than popular media often suggests.\n\n",
    "## What the Research Shows\n\n",
    "Studies do find that playing violent video games can temporarily ",
    "increase aggressive *thoughts* and *feelings*. ",
    "However, these effects are small and short-lived.\n\n",
    "The evidence for a causal link to actual violent *behavior* is ",
    "much weaker.",
]


async def mock_thinking_stream(n: int = 1):
    """Yield ContentThinking and str chunks simulating structured thinking API."""
    indexes: list[dict[str, list[list[int]]]] = [
        {
            "thinking": [list(range(len(thinking_chunks)))],
            "response": [list(range(len(response_chunks)))],
        },
        {
            "thinking": [list(range(11)), list(range(11, len(thinking_chunks)))],
            "response": [list(range(15)), list(range(15, len(response_chunks)))],
        },
        {
            "thinking": [
                list(range(5)),
                list(range(5, 10)),
                list(range(10, len(thinking_chunks))),
            ],
            "response": [
                list(range(10)),
                list(range(10, 20)),
                list(range(20, len(response_chunks))),
            ],
        },
    ]

    idx = indexes[n - 1]

    for i in range(n):
        selected_thinking = [thinking_chunks[j] for j in idx["thinking"][i]]
        async for char in split_words_stream(selected_thinking):
            yield ContentThinking(thinking=char)

        selected_response = [response_chunks[j] for j in idx["response"][i]]
        async for char in split_words_stream(selected_response):
            yield char


async def mock_local_model_stream():
    """Yield plain strings with <thinking> tags (simulates local model output)."""
    async for char in split_words_stream(["<thinking>\n"]):
        yield char
    async for char in split_words_stream(local_model_thinking_chunks):
        yield char
    async for char in split_words_stream(["\n</thinking>\n\n"]):
        yield char
    async for char in split_words_stream(local_model_response_chunks):
        yield char


app_ui = ui.page_sidebar(
    ui.sidebar(
        ui.p(
            "This app demonstrates the thinking/reasoning display using mock data."
        ),
        ui.p(
            "Type anything (or just hit Enter) to trigger a simulated thinking response."
        ),
        ui.input_numeric(
            "num_responses",
            "Number of responses to simulate:",
            value=1,
            min=1,
            max=3,
        ),
        ui.input_checkbox(
            "local_model",
            "Simulate local model (raw <thinking> tags)",
            value=False,
        ),
        ui.input_action_button(
            "reset", "Clear Chat", class_="btn-outline-secondary btn-sm"
        ),
        width=300,
    ),
    ui.chat_ui("chat", placeholder="Ask me anything..."),
    title="Thinking UI Demo",
)


def server(input, output, session):
    chat = ui.Chat("chat")

    @chat.on_user_submit
    async def _():
        if input.local_model():
            stream = mock_local_model_stream()
        else:
            stream = mock_thinking_stream(input.num_responses())
        await chat.append_message_stream(stream)

    @reactive.effect
    @reactive.event(input.reset)
    async def _():
        await chat.clear_messages()


app = App(app_ui, server)

Full demo app (R) — mock thinking stream for testing

library(shiny)
library(bslib)
library(shinychat)
library(coro)
library(ellmer)

split_words <- function(texts) {
  texts <- unlist(strsplit(texts, ""))

  new_word <- TRUE

  purrr::reduce(texts, .init = list(), .f = function(acc, char) {
    if (new_word) {
      new_word <<- FALSE
      acc[[length(acc) + 1]] <- list(word = char, delay = runif(1, 0.01, 0.1))
      return(acc)
    }
    last_word <- acc[[length(acc)]]
    last_word$word <- paste0(last_word$word, char)
    acc[[length(acc)]] <- last_word
    if (char == " ") {
      new_word <<- TRUE
    }
    acc
  })
}

mock_local_model_stream <- async_generator(function(n = 1) {
  thinking_chunks <- c(
    "The user is asking about the relationship between violent video games ",
    "and violent behavior in children. This is a well-studied topic in ",
    "psychology and media studies. I need to consider the research carefully.\n\n",
    "<topic>Reviewing the research</topic>\n",
    "There have been several major meta-analyses on this topic. ",
    "Anderson et al. (2010) found small but significant effects, ",
    "while Ferguson (2015) argued these effects disappear when ",
    "publication bias is accounted for. The APA task force (2015) ",
    "concluded there's a link to aggression but insufficient evidence ",
    "for a link to criminal violence.\n\n",
    "I should present the nuanced scientific consensus."
  )

  response_chunks <- c(
    "## The Short Answer\n\n",
    "The relationship between violent video games and real-world violence ",
    "is **far more nuanced** than popular media often suggests.\n\n",
    "## What the Research Shows\n\n",
    "Studies do find that playing violent video games can temporarily ",
    "increase aggressive *thoughts* and *feelings*. ",
    "However, these effects are small and short-lived.\n\n",
    "The evidence for a causal link to actual violent *behavior* is ",
    "much weaker."
  )

  for (chunk in split_words(c("<thinking>\n"))) {
    yield(ContentText(chunk$word))
    await(async_sleep(chunk$delay))
  }
  for (chunk in split_words(thinking_chunks)) {
    yield(ContentText(chunk$word))
    await(async_sleep(chunk$delay))
  }
  for (chunk in split_words(c("\n</thinking>\n\n"))) {
    yield(ContentText(chunk$word))
    await(async_sleep(chunk$delay))
  }
  for (chunk in split_words(response_chunks)) {
    yield(ContentText(chunk$word))
    await(async_sleep(chunk$delay))
  }
})

mock_thinking_stream <- async_generator(function(n = 1) {
  thinking_chunks <- c(
    "The user is asking about the relationship between violent video games ",
    "and violent behavior in children. This is a well-studied topic in ",
    "psychology and media studies. I need to consider the research carefully.\n\n",
    "There have been several major meta-analyses on this topic. ",
    "Anderson et al. (2010) found small but significant effects, ",
    "while Ferguson (2015) argued these effects disappear when ",
    "publication bias is accounted for. The APA task force (2015) ",
    "concluded there's a link to aggression but insufficient evidence ",
    "for a link to criminal violence.\n\n",
    "Important confounding variables include family environment, ",
    "pre-existing behavioral issues, socioeconomic status, and ",
    "peer influences. Longitudinal studies have had mixed results ",
    "when controlling for these factors.\n\n",
    "I should present the nuanced scientific consensus: ",
    "short-term increases in aggressive thoughts/feelings are documented, ",
    "but the leap to real-world violent behavior is not well-supported. ",
    "I'll note the difference between aggression and violence.",
    paste(lorem::ipsum(4), collapse = "\n\n")
  )

  response_chunks <- c(
    "## The Short Answer\n\n",
    "The relationship between violent video games and real-world violence ",
    "is **far more nuanced** than popular media often suggests.\n\n",
    "## What the Research Shows\n\n",
    "### Short-term effects\n",
    "Studies do find that playing violent video games can temporarily ",
    "increase aggressive *thoughts* and *feelings*. ",
    "However, these effects are small and short-lived.\n\n",
    "### Long-term behavioral effects\n",
    "The evidence for a causal link to actual violent *behavior* is ",
    "much weaker. Major meta-analyses disagree:\n\n",
    "- **Anderson et al. (2010)**: Found small correlations with aggressive behavior\n",
    "- **Ferguson (2015)**: Argued effects vanish when correcting for publication bias\n",
    "- **APA (2015, 2020)**: Insufficient evidence to link games to criminal violence\n\n",
    "### The bigger picture\n",
    "As video game sales have skyrocketed globally since the 1990s, ",
    "youth violence rates have actually *declined* significantly in most countries. ",
    "This ecological data doesn't prove games reduce violence, but it strongly ",
    "undermines claims of a major causal effect.\n\n",
    "## Key Takeaway\n\n",
    "There's a meaningful difference between *aggression* (competitive feelings, ",
    "frustration) and *violence* (intent to harm). Video games may briefly ",
    "increase the former, but the scientific consensus does not support a ",
    "causal link to the latter."
  )

  indexes <- list(
    list(
      thinking = list(seq_along(thinking_chunks)),
      response = list(seq_along(response_chunks))
    ),
    list(
      thinking = list(1:11, 12:length(thinking_chunks)),
      response = list(1:15, 16:length(response_chunks))
    ),
    list(
      thinking = list(1:5, 6:10, 11:length(thinking_chunks)),
      response = list(1:10, 11:20, 21:length(response_chunks))
    )
  )

  for (i in seq_len(n)) {
    for (chunk in split_words(thinking_chunks[indexes[[n]]$thinking[[i]]])) {
      yield(ContentThinking(chunk$word))
      await(async_sleep(chunk$delay))
    }

    for (chunk in split_words(response_chunks[indexes[[n]]$response[[i]]])) {
      yield(ContentText(chunk$word))
      await(async_sleep(chunk$delay))
    }
  }
})

ui <- page_sidebar(
  title = "Thinking UI Demo",
  sidebar = sidebar(
    width = 300,
    p("This app demonstrates the thinking/reasoning display using mock data."),
    p(
      "Type anything (or just hit Enter) to trigger a simulated thinking response."
    ),
    numericInput(
      "num_responses",
      "Number of responses to simulate:",
      value = 1,
      min = 1,
      max = 3
    ),
    checkboxInput(
      "local_model",
      "Simulate local model (raw <thinking> tags)",
      value = FALSE
    ),
    actionButton("reset", "Clear Chat", class = "btn-outline-secondary btn-sm")
  ),
  chat_ui("chat", placeholder = "Ask me anything...")
)

server <- function(input, output, session) {
  observeEvent(input$chat_user_input, {
    stream <- if (input$local_model) {
      mock_local_model_stream()
    } else {
      mock_thinking_stream(input$num_responses)
    }
    chat_append("chat", stream)
  })

  observeEvent(input$reset, {
    chat_clear("chat")
  })
}

shinyApp(ui, server)

Test plan

cd js && npm run lint — TypeScript + ESLint pass
npx vitest run — 54 state reducer tests pass
Manual test with thinking-enabled model (Claude extended thinking via ellmer)
Verify collapsed/expanded states, topic transitions, auto-collapse behavior
Verify <thinking> tags inside fenced code blocks are NOT promoted to thinking UI
Verify bookmark/restore preserves thinking blocks (R via stream replay, Python via <thinking> tag wrapper)

Add thinking_start, thinking, and thinking_end ChatAction variants for streaming LLM reasoning tokens. Extend ChatMessageData with role "thinking", durationMs, topic, and startedAt fields. Handle all three actions in the reducer following the chunk_start/chunk/chunk_end pattern.

Create pkg-r/R/thinking.R with ThinkingAccumulator pattern: - new_thinking_state() for mutable state tracking - handle_thinking_chunk() sends thinking_start/thinking actions - end_thinking() computes duration and sends thinking_end - extract_topics() strips <topic> tags and buffers partial tags across chunk boundaries

Add contents_shinychat method for ellmer::ContentThinking that returns a sentinel object. Modify chat_append_stream_impl to detect thinking content, route to handle_thinking_chunk(), and call end_thinking() when transitioning to non-thinking content or at stream end.

Collapsible panel showing LLM reasoning with: - Animated dots during streaming, auto-collapse on completion - Topic label display in header during streaming - Duration display ("Thought for Xs") when complete - Accessible markup with aria-expanded and role="region" - CSS custom properties for theming (--shinychat-thinking-*) - prefers-reduced-motion support

Route messages with role "thinking" to ThinkingDisplay in both the finalized message list (ChatMessages) and the streaming message slot (ChatContainer).

…integration Add ThinkingStartAction, ThinkingAction, ThinkingEndAction to chat types. Create _thinking.py with ThinkingAccumulator (topic tag extraction with chunk boundary buffering) and ThinkingState (timing). Integrate into _append_message_stream to intercept ContentThinking objects, send thinking actions directly, and transition cleanly to content streaming.

On bookmark restore, thinking blocks are replayed instantly so duration_ms will be 0. Show "Thinking" instead of "Thought for less than a second" in that case.

ThinkingDisplay.css was imported via JS but never reached the SCSS build output. Moved styles into chat.scss, replaced text chevron with SVG, and refined hover/spacing for the thinking header button.

The stream loop was sending chunk_start before any content arrived. When thinking_start replaced the streaming message, subsequent chunk actions were dropped because streamingMessage was null after thinking_end. Now chunk_start is deferred until the first non-thinking content, and if no content follows thinking, a remove_loading action re-enables input. Also fixes named vector warning from proc.time() in thinking duration.

Remove server-side thinking state management (ThinkingState, ThinkingAccumulator, topic extraction) from both R and Python backends. Thinking content now flows through the existing chunk pipeline with content_type="thinking". The JS reducer handles state transitions (thinking↔markdown), topic tag extraction with cross-chunk buffering, and duration computation — one implementation instead of three. Removes pkg-r/R/thinking.R and pkg-py/src/shinychat/_thinking.py.

Add smooth expand/collapse animation using CSS grid transitions, debounce topic text updates with a minimum display time to prevent rapid flickering, and respect prefers-reduced-motion.

Fade out the old topic text and fade in the new one using a 200ms opacity transition, preventing jarring text swaps during streaming.

Replace <topic> tags with a styled div instead of stripping them, so topics appear as section markers in the expanded thinking trace. Default style is bold; users can override .shinychat-thinking-topic.

Instead of thinking being a separate message with role "thinking", thinking data is now stored as a field on the assistant message. ThinkingDisplay renders inside ChatMessage above the response content, so it participates in scroll management and is visually part of the assistant response.

Support interleaved thinking blocks by using a single `blocks` array on ChatMessageData. Each block is either `{ type: "content" }` or `{ type: "thinking" }`, rendered in order. This allows multiple thinking-then-content cycles within a single assistant message.

Bump dot size from 4px to 5px and add aspect-ratio: 1 to prevent subpixel rendering from distorting the circles.

Use a single SVG circle with scale+opacity pulse animation (matching the streaming dot style) instead of three staggered opacity dots. Includes a 1s animation delay so the dot sits still briefly before pulsing.

Append remaining topicBuffer content when finalizing thinking blocks and when transitioning from thinking to non-thinking content, preventing silent text loss when a stream ends with a partial topic tag.

Expand PendingMessage tuple and _flush_pending_messages to carry content_type_override through queued chunks, ensuring thinking content renders correctly under concurrent stream conditions.

- Use proper aria-hidden="true" string value instead of boolean - Respect prefers-reduced-motion in useFadingText hook by skipping the fade delay entirely, preventing a confusing 200ms pause - Fix auto-collapse effect to only update prevStreamingRef on actual transitions

…ience) When the connection drops mid-thinking-stream, remove_loading now finalizes the in-flight streaming message rather than discarding it, preserving accumulated thinking content for the user.

Prevents very long thinking content from pushing the rest of the conversation off-screen.

Cover: thinking chunk creation, appending, content type transitions, topic extraction, cross-chunk topic buffering, topicBuffer flush on finalization and transition, multiple thinking/content cycles, remove_loading disconnect resilience, empty chunks, missing startedAt.

Use dvh units for better mobile viewport handling and cap expanded thinking content at 33dvh instead of 60vh to keep response text visible.

…ython restore Python: accumulate thinking text separately during streaming and wrap in <thinking> tags before storage, so thinking content survives the lossy StoredMessage round-trip on bookmark/restore. JS: detect top-level <thinking>...</thinking> tags in message content and promote them to ThinkingBlock entries. Works both on the non-streaming restore path (messagePayloadToData) and at stream finalization for models that emit raw <thinking> tags in their markdown output.

…tags When a model emits raw <thinking>...</thinking> tags in its markdown output (common with DeepSeek, QwQ, and other local models), the JS reducer now detects the tags during streaming and routes content to a ThinkingBlock in real-time — rather than waiting until stream finalization. Uses a state machine (insideThinkingTag + tagBuffer on ChatMessageData) to track whether we're inside an unclosed <thinking> tag across chunk boundaries. Only top-level tags are detected (non-whitespace before the tag means it's inline content, not a thinking block).

Adds a @section describing how the thinking/reasoning panel works, including structured ContentThinking support via ellmer and automatic detection of raw <thinking> tags from local models. Covers optional topic label prompt engineering.

Adds a section describing how the thinking/reasoning panel works, including structured ContentThinking support via chatlas and automatic detection of raw <thinking> tags from local models. Covers optional topic label prompt engineering.

…d thinking Buffer partial closing </topic> tags across chunk boundaries so they don't leak raw markup into rendered thinking text. Add `inert` attribute to collapsed thinking content to prevent hidden tab stops on focusable descendants.

- Don't render ThinkingDisplay when block has no non-whitespace content - Require minimum 500ms duration before showing "Thought for..." text - Delay streaming dot appearance by 500ms to avoid flash during history replay - Apply delay in prefers-reduced-motion as well

When the thinking panel is expanded during streaming but hasn't yet hit the max-height, useStickToBottom isn't engaged. Detect the transition from non-overflowing to overflowing and call scrollToBottom() at that moment. Also only stop outer scroll on collapse (not expand).

Required for ContentThinking support in the thinking UI feature.

splitThinkingBlocks now identifies code fence regions and excludes <thinking> tag matches within them. Also skips splitting for non-markdown content types where tags are literal content.

cpsievert · 2026-05-08T14:20:03Z


        # Chunked messages get accumulated (using this property) before changing state
        self._current_stream_message: str = ""
+        self._current_stream_thinking: str = ""


I would like to avoid adding this piece of state -- #211 proposes a way of doing that.

Admittedly, the state management stuff in Python has always been problematic from the standpoint of complexity with somewhat questionable upside. By adding more state, we're compounding that problem and also contributing to drift between R and Python.

It's also worth calling out that for nested streams we track a separate piece of "checkpoint" state. I'm pretty sure if we were to keep this approach, we would also need a piece of checkpoint state for thinking as well.

Another problem I just realized with the current implementation: thinking state never gets reset when the stream is finished.

Yeah most of this is about me not being sure about the machinery and usage behind _current_stream_message. My more direct take is that I'd rather consider dropping _current_stream_thinking entirely unless it's absolutely required. So my first question in response is really: do we even need it in the first place?

Okay, I took another look and understand how _current_stream_message operates now. As an aside, I think we should revisit this at some point to move away from accumulating strings into this field, but that's for another time.

In 1b2a514, I updated the internal logic to buffer thinking tokens until thinking is complete and then append them to _current_stream_message when other content is received (or in a finally block if interrupted). This way, we stream tokens to the UI as they arrive, but also keep the current semantics of _current_stream_message.

So my first question in response is really: do we even need it in the first place?

If we want to properly restore streams that are getting transformed as they're sent out, then yes. I know we'll probably disagree on whether that is an important use case, but I can tell you right now there are real apps doing useful things by transforming the stream (shiny assistant, etc).

Can I follow up by asking what your objection to heading in the direction of #211 would be?

In 1b2a514, I updated the internal logic to buffer thinking tokens

This is a step in the right direction, but I still feel like it's more complex than it needs to be. If part of your resistance to #211 is that we currently don't store and restore the content type properly, then I think we should fix that separately (we already have this problem on main). Once we have that, I think all we would really need is to make sure we set the right content type (similar to the R changes).

If we want to properly restore streams that are getting transformed as they're sent out, then yes.

Do you mean _current_stream_thinking? That's what I was asking if we need (or more specifically if we need to accumulate thinking tokens somewhere). But it seems the answer is yes, we need to accumulate the tokens somewhere but not in a separate field. Does the new approach that appends them to _current_stream_message work?

Can I follow up by asking what your objection to heading in the direction of #211 would be?

I would say that it's less about having an objection and more about needing context about the problems you're seeing and wanting to solve in #211. I'm finding this conversation here to be a lot more productive (because I'm talking with you and not Claude) than it is to try to read between the lines of Claude's PR description and code changes.

As I see it now, I generally like the idea of richer data types for delta objects, but at this stage it seems like #211 is trying to solve shinychat problems by pushing behavior changes into chatlas and ellmer. I think that at this point in time, we have enough available to us that we can solve problems locally in shinychat before we need to make changes in chatlas/ellmer.

Do you mean _current_stream_thinking?

Not specifically. I mean more generally any state that is tracking outgoing messages with the intention of restoring them on bookmark.

Does the new approach that appends them to _current_stream_message work?

Mostly yes, but not if the content type is changing mid-stream. Doing that actually wasn't supported before the React migration. Now that we more officially support it, I think we should update the Python state management to mirror how the JS side works.

solve problems locally in shinychat before we need to make changes in chatlas/ellmer.

I think I agree, but can we also agree that the chatlas/ellmer changes are an improvement regardless of shinychat? If so, I think we'll want to wait for those changes to land so that we can detect the right delta class.

Mostly yes, but not if the content type is changing mid-stream.

That part of this PR does work, though, because we're storing the thinking as <thinking>...</thinking> in _current_stream_message, and those are handled client side. I think this an acceptable trade-off for now. I think in the long run we should stop storing or passing state via strings, but that's not something we should try to thread through this PR.

I think I agree, but can we also agree that the chatlas/ellmer changes are an improvement regardless of shinychat? If so, I think we'll want to wait for those changes to land so that we can detect the right delta class.

I think ContentDelta types would be useful, but they're a big enough feature that they should be developed carefully and independently. I think they'll be an improvement, but from ellmer and chatlas' perspectives the thinking deltas are just one type of delta that we'd want to support. I don't think it's a good idea to commit to a design now that without taking into consideration the other ways that delta types would be used.

Also, because ellmer just released, it will likely be more than a month before delta content classes will land in a usable way in ellmer. So I don't think we should wait for a ContentDeltaThinking class to land upstream.

I think in the long run we should stop storing or passing state via strings, but that's not something we should try to thread through this PR.

Yeah, agreed. If we do that, it would open a path where the python thinking logic becomes very minimal, while also more generally fixing the "incorrect restoration of mixed content types" problem.

BTW, I appreciate your pragmatic approach, but it would also bother me to keep kicking this pre-existing (& self-inflicted) problem down the road, making it harder to undo. It doesn't necessarily have to block this PR, but I'd at least like to take a look first.

Replace instance-level `_current_stream_thinking` state with a local `thinking_buffer` in `_append_message_stream`. Thinking chunks now flow through `_append_message_chunk` for UI streaming (skipping accumulation into `_current_stream_message` via a `content_type == "thinking"` guard), and are flushed as a single wrapped `<thinking>...</thinking>` block at the transition to non-thinking content or at stream end. Also adds `content_type` to `ChatMessage` so the wire-format content type is carried on the message itself, removing the need for the `content_type_override` escape hatch on `_append_message_chunk` and `_send_append_message`.

Thinking chunks were being sent to the UI via _append_message_chunk during streaming, then sent a second time wrapped in <thinking> tags when flush_thinking fired at the transition point. flush_thinking now only updates _current_stream_message directly for bookmark/restore without triggering another client-side send.

cpsievert · 2026-05-08T20:12:28Z

+                if _is_content_thinking(msg):
+                    thinking_text = msg.thinking if hasattr(msg, "thinking") else str(msg)


If is_content_thinking was a TypeGuard, you could drop the hasattr() part

cpsievert · 2026-05-08T20:13:25Z

+                    thinking_text = msg.thinking if hasattr(msg, "thinking") else str(msg)
+                    thinking_buffer += thinking_text
+                    await self._append_message_chunk(
+                        message_content_chunk(msg), chunk=True, stream_id=id


I'm not sure the message_content_chunk() is necessary?

cpsievert · 2026-05-08T21:40:49Z

+          : undefined) ?? "markdown"
+
+      // If server explicitly says "thinking", use the direct thinking path
+      if (explicitType === "thinking") {


I think this path leaves the reducer in an inconsistent state: the outer message is streaming, but the inner thinking block can still have streaming: false. Since ThinkingDisplay keys off the block-level flag, should we set that block to streaming: true here (or at creation time) so the UI state stays aligned?

cpsievert · 2026-05-08T22:17:48Z

  content: string
  contentType: ContentType
  streaming: boolean
  /** True for the empty placeholder message shown while waiting for the assistant to respond. */
  isPlaceholder?: boolean
  icon?: string
-  segments?: ContentSegment[]
+  blocks: MessageBlock[]


Here's another area where I'm now realizing that I introduced a problem that's getting more exposed. That is, there are two levels of content and content type: top level content as well as block level content. I think it'd be worth simplifying this to have just one (block) level of content and building that in at the protocol level. That way we can restore properly and have less confusion around the source of truth.

gadenbuie added 30 commits May 4, 2026 15:41

feat(js): render ThinkingDisplay for thinking messages

cf1c948

Route messages with role "thinking" to ThinkingDisplay in both the finalized message list (ChatMessages) and the streaming message slot (ChatContainer).

fix(js): treat durationMs of 0 as no timing data

4925f3a

On bookmark restore, thinking blocks are replayed instantly so duration_ms will be 0. Show "Thinking" instead of "Thought for less than a second" in that case.

chore(pkg-r): Use dev ellmer

687f23e

fix(js): move thinking styles into SCSS pipeline and polish component

d4ecb5e

ThinkingDisplay.css was imported via JS but never reached the SCSS build output. Moved styles into chat.scss, replaced text chevron with SVG, and refined hover/spacing for the thinking header button.

feat(js): animate thinking panel and debounce topic display

353db78

Add smooth expand/collapse animation using CSS grid transitions, debounce topic text updates with a minimum display time to prevent rapid flickering, and respect prefers-reduced-motion.

feat(js): crossfade thinking topic label on change

1c5d08d

Fade out the old topic text and fade in the new one using a 200ms opacity transition, preventing jarring text swaps during streaming.

feat(js): render topic labels inline in thinking content

c927343

Replace <topic> tags with a styled div instead of stripping them, so topics appear as section markers in the expanded thinking trace. Default style is bold; users can override .shinychat-thinking-topic.

fix(js): make thinking dots perfectly circular

acaaead

Bump dot size from 4px to 5px and add aspect-ratio: 1 to prevent subpixel rendering from distorting the circles.

feat(js): replace three thinking dots with single pulsing dot

91183f4

Use a single SVG circle with scale+opacity pulse animation (matching the streaming dot style) instead of three staggered opacity dots. Includes a 1s animation delay so the dot sits still briefly before pulsing.

fix(js): flush topicBuffer on thinking block finalization

e641530

Append remaining topicBuffer content when finalizing thinking blocks and when transitioning from thinking to non-thinking content, preventing silent text loss when a stream ends with a partial topic tag.

fix(js): add missing id to thinking header button for aria-labelledby

346f9e4

fix(py): preserve content_type_override in pending message queue

3ad150f

Expand PendingMessage tuple and _flush_pending_messages to carry content_type_override through queued chunks, ensuring thinking content renders correctly under concurrent stream conditions.

chore: sync JS dist assets to R and Python packages

9c7132d

fix(js): finalize thinking blocks on remove_loading (disconnect resil…

520daf0

…ience) When the connection drops mid-thinking-stream, remove_loading now finalizes the in-flight streaming message rather than discarding it, preserving accumulated thinking content for the user.

fix(js): cap expanded thinking content at 60vh with scroll

e68ab2f

Prevents very long thinking content from pushing the rest of the conversation off-screen.

chore: sync JS dist assets to R and Python packages

0febdaa

fix(js): add focus-visible outline to thinking header button

6bc635f

chore: sync JS dist assets to R and Python packages

ff8b989

fix(css): reduce thinking content max-height to 33dvh

7902f10

Use dvh units for better mobile viewport handling and cap expanded thinking content at 33dvh instead of 60vh to keep response text visible.

gadenbuie and others added 14 commits May 5, 2026 16:00

feat(context): add ChatScrollContext and useChatStopScroll hook

0ed8957

chore(py): bump chatlas minimum to >=0.15.0

97c1b6c

Required for ContentThinking support in the thinking UI feature.

chore: build and update assets

2fae0ec

fix(js): skip thinking tag detection inside fenced code blocks

c5357d0

splitThinkingBlocks now identifies code fence regions and excludes <thinking> tag matches within them. Also skips splitting for non-markdown content types where tags are literal content.

chore: build and update assets

2e030cb

air format (GitHub Actions)

e461570

devtools::document() (GitHub Actions)

a58bf13

gadenbuie requested a review from cpsievert May 6, 2026 15:45

cpsievert force-pushed the ui/thinking branch from 98a2531 to a58bf13 Compare May 6, 2026 23:01

cpsievert reviewed May 8, 2026

View reviewed changes

cpsievert mentioned this pull request May 8, 2026

Use ContentThinkingDelta for streaming thinking #211

Closed

2 tasks

cpsievert reviewed May 8, 2026

View reviewed changes

Comment thread pkg-py/src/shinychat/_chat.py

cpsievert reviewed May 8, 2026

View reviewed changes

Comment thread pkg-py/src/shinychat/_chat.py Outdated

gadenbuie and others added 4 commits May 8, 2026 11:07

Merge remote-tracking branch 'origin/main' into ui/thinking

ebbf3d7

usethis::use_tidy_description() (GitHub Actions)

8f09b48

cpsievert reviewed May 8, 2026

View reviewed changes

		if _is_content_thinking(msg):
		thinking_text = msg.thinking if hasattr(msg, "thinking") else str(msg)

Conversation

gadenbuie commented May 6, 2026

Summary

Examples

Local thinking models

Structured thinking APIs

Test plan

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cpsievert May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cpsievert May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gadenbuie May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cpsievert May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gadenbuie May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cpsievert May 8, 2026 •

edited

Loading

cpsievert May 8, 2026 •

edited

Loading

gadenbuie May 8, 2026 •

edited

Loading

cpsievert May 8, 2026 •

edited

Loading

gadenbuie May 8, 2026 •

edited

Loading