Tool calling is currently entirely limited to the back-end chat interface. Tools are registered with ellmer or chatlas (see below), but shinychat does not do anything in the UI to indicate that a tool call is being made.
For background, here is how ellmer and chatlas register and store tool definitions.
ellmer
library(ellmer)

get_current_time <- function(tz = "UTC") {
  format(Sys.time(), tz = tz, usetz = TRUE)
}

tool_get_current_time <- tool(
  get_current_time,
  .description = "Gets the current time in the given time zone.",
  tz = type_string(
    "The time zone to get the current time in. Defaults to `\"UTC\"`.",
    required = FALSE
  )
)

chat <- chat_openai(model = "gpt-4o", echo = "all")
chat$register_tool(tool_get_current_time)
chat$chat("How long ago exactly was the moment Neil Armstrong touched down on the moon?")
#> > How long ago exactly was the moment Neil Armstrong touched down on the moon?
#> < [tool request (call_WkRmaly9E7kgpMB5RPWVzijh)]: get_current_time(tz = "UTC")
#> < [tool request (call_0OqdPpugMw3wjX9IEz1xVwxd)]: get_current_time(tz =
#> < "America/New_York")
#> > [tool result (call_WkRmaly9E7kgpMB5RPWVzijh)]: 2025-02-27 17:53:17 UTC
#> > [tool result (call_0OqdPpugMw3wjX9IEz1xVwxd)]: 2025-02-27 12:53:17 EST
#> < Neil Armstrong touched down on the moon on July 20, 1969, at 20:17 UTC.
#> <
#> < As of now, which is February 27, 2025, at 17:53 UTC, it has been
#> < approximately 55 years, 7 months, and 7 days since that historic moment.
#> <
Internally, tool() creates a ToolDef instance with name, description and arguments properties.
chatlas
import requests
from chatlas import ChatOpenAI
from dotenv import load_dotenv

load_dotenv()

def get_current_temperature(latitude: float, longitude: float):
    """
    Get the current weather given a latitude and longitude.

    Parameters
    ----------
    latitude
        The latitude of the location.
    longitude
        The longitude of the location.
    """
    lat_lng = f"latitude={latitude}&longitude={longitude}"
    url = f"https://api.open-meteo.com/v1/forecast?{lat_lng}&current=temperature_2m,wind_speed_10m&hourly=temperature_2m,relative_humidity_2m,wind_speed_10m"
    response = requests.get(url)
    json = response.json()
    return json["current"]

chat = ChatOpenAI(model="gpt-4o-mini")
chat.register_tool(get_current_temperature)
chat.chat("What's the weather like today in Duluth, MN?", echo="all")
#> 👤 User turn:
#>
#> What's the weather like today in Duluth, MN?
#>
#> 🤖 Assistant turn:
#>
#> # tool request (call_YRma1FOUHVPGHkfylqdHw886)
#> get_current_temperature(latitude=46.7833, longitude=-92.1062)
#>
#> << 🤖 finish reason: tool_calls >>
#>
#>
#> 👤 User turn:
#>
#> # tool result (call_YRma1FOUHVPGHkfylqdHw886)
#> {'time': '2025-02-27T17:45', 'interval': 900, 'temperature_2m': 3.2, 'wind_speed_10m': 21.6}
#>
#> 🤖 Assistant turn:
#>
#> Today in Duluth, MN, the temperature is approximately 3.2°C with a wind speed of 21.6 km/h.
Chat.register_tool() has signature Chat.register_tool(func, *, model=None), i.e. it takes a function func and determines input parameters from the function docstring. For more complicated tools, you can pass a pydantic model to model.
Internally, .register_tool() creates a Tool instance with properties .func, .schema and .name. The schema stores the function description.
Tool calls in shinychat
Here are a few design sketches for what tool calling might look like:
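| Step | Block | Inline |
| --- | --- | --- |
| Tool call starts | (sketch image) | (sketch image) |
| Tool call completes | (sketch image) | (sketch image) |
| Extra info [1] | (sketch image) | (sketch image) |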
1. The extra info shown here relies on two different mechanisms. In the block-style UI, the tool call completion hook returns custom Shiny UI elements that show the results. In the inline-style UI, we could use popovers to display additional information about the call, such as the parameters used.
There are two points in the tool-calling lifecycle where we need status updates:
When the tool is called
When the tool call completes
I'm envisioning that tools would be registered with shinychat in a way similar to how they're registered with chatlas or ellmer. Internally (or alternatively) shinychat could provide classes that extend chatlas.Turn or UI methods for ellmer::TurnDef.
Ideally, the minimum amount of work required would be to register the tools; from there, our default methods could create the UI as needed, using only information in the tool definition. For example, the name of a get_current_weather tool could be converted into the UI label "Get current weather".
Depending on the use case, I can also see wanting to pick between a block display and an inline display. The block display works best for larger tasks, and the inline display suits situations where each turn is likely to include many tool calls.
Both of these variants would be encapsulated in functions available to users, which we would call with sensible defaults. For example, we could default to showing "Get current weather", but a user could provide their own method that uses our block-display design but changes the title to "Get weather in Duluth, MN".
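As a rough illustration of the default label and the user override, here is a minimal Python sketch. The function names (`default_tool_title`, `tool_title`, `weather_title`) and the `city` argument are hypothetical and not part of shinychat, chatlas or ellmer.

```python
from typing import Callable, Optional


def default_tool_title(tool_name: str) -> str:
    """Turn a snake_case tool name into a sentence-case UI label."""
    words = tool_name.replace("-", "_").split("_")
    label = " ".join(w for w in words if w)
    return label[:1].upper() + label[1:]


def tool_title(
    tool_name: str,
    arguments: dict,
    custom: Optional[Callable[[str, dict], str]] = None,
) -> str:
    """Use the caller's title function when given, else the default label."""
    if custom is not None:
        return custom(tool_name, arguments)
    return default_tool_title(tool_name)


# Hypothetical user-supplied override that uses the call arguments.
def weather_title(name: str, args: dict) -> str:
    return f"Get weather in {args['city']}"


print(tool_title("get_current_weather", {}))
print(tool_title("get_current_weather", {"city": "Duluth, MN"}, weather_title))
```

The override receives the call arguments so that a title can reflect the specific request rather than only the tool name.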
The completion method would, by default, find the tool call UI added at the start of the tool call and simply mark it as completed by updating its attributes.
That said, we also want it to be possible for the completion callback to send entirely custom UI that replaces the initial tool call UI. This is shown in the last row of the table for the block method, where instead of simply marking the flight search tool call as complete, the data received in the tool call is presented using custom UI.
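One way to picture the two completion behaviors is sketched below in Python. Everything here (`ToolCallUI`, `ToolCallDisplay`, the `render` argument) is a hypothetical stand-in for whatever shinychat would actually provide; the point is only the difference between flipping a status and replacing the element.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class ToolCallUI:
    """Hypothetical stand-in for the UI element shown for one tool call."""
    request_id: str
    title: str
    status: str = "running"
    body: Optional[str] = None  # custom content, if any


class ToolCallDisplay:
    """Tracks tool-call UI by request id (all names are illustrative)."""

    def __init__(self) -> None:
        self.elements: Dict[str, ToolCallUI] = {}

    def on_start(self, request_id: str, title: str) -> ToolCallUI:
        element = ToolCallUI(request_id, title)
        self.elements[request_id] = element
        return element

    def on_complete(
        self,
        request_id: str,
        result: Any,
        render: Optional[Callable[[Any], str]] = None,
    ) -> ToolCallUI:
        element = self.elements[request_id]
        if render is None:
            # Default: keep the existing element and just mark it complete.
            element.status = "complete"
        else:
            # Custom: replace the element with UI built from the result.
            element = ToolCallUI(request_id, element.title, "complete", render(result))
            self.elements[request_id] = element
        return element
```

With no `render` function the initial element is reused; passing one swaps in new content, which is roughly what the block-style flight search sketch does.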
Static Case
To start with the simplest case, we'll consider a chat with a single tool call.
User: What time is it in London?
Assistant:
Sure, I can look up the time.
<tool_request id="123" name="get_current_time" tz="Europe/London">
User: <tool_result id="123">2025-03-31 11:12:13</tool_result>
Assistant: It's 11am in London.
ellmer handles the tool request in the first assistant message, automatically invokes the tool, and returns the result. Note that in the chat turns, ellmer stores the tool result as a user turn, because we ran code locally to invoke the tool. Note also that, in a live session, control of the chat isn't returned to the user until after the second assistant message.
Structurally, the turns look like this at the end of this chat:
Turn(role = "user")
  ContentText (user message)
Turn(role = "assistant")
  ContentText (assistant message)
  ContentToolRequest
Turn(role = "user")
  ContentToolResult
Turn(role = "assistant")
  ContentText
Currently, when launching live_browser(client) on a chat client object, ellmer calls contents_markdown() on the turns. For contents_markdown(<ContentText>), this is a simple transformation that extracts the text entered by the user or returned from the LLM. On the other hand, contents_markdown(<ContentToolRequest>) and contents_markdown(<ContentToolResult>) are no-ops and hide the tool request/result from the chat UI.
I propose that we introduce a new generic -- contents_shinychat() -- that we use instead and that can be used to create a default display for ContentToolRequest or ContentToolResult. We'd also have a contents_shinychat(<Chat>) method, wherein we'd reorganize the turns to coalesce tool results into a single assistant message. We would also suppress the tool request display and only show the tool results.
sequenceDiagram
participant R as R Session
participant client
participant ellmer
participant Terminal as Terminal (emit)
participant UI as UI (yield)
rect rgba(255, 165, 0, 0.05)
Note over client: ContentText (user)
client-->>UI: contents_shinychat(turn)
Note over UI: User message
end
rect rgba(100, 100, 255, 0.05)
Note over client: ContentText (assistant)
client-->>UI: contents_shinychat(turn)
Note over UI: Assistant message
Note over client: ContentToolRequest
Note over client: ContentToolResult
client-->>UI: contents_shinychat(ContentToolResult)
Note over UI: Tool result display
end
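The reorganization proposed for the static case could be sketched as follows in Python. The classes are simplified stand-ins for ellmer's turn and content objects, and `coalesce_turns` is a hypothetical name. Folding the follow-up assistant text into the same message is an assumption about what "a single assistant message" should mean.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class ContentText:
    text: str


@dataclass
class ContentToolRequest:
    id: str
    name: str


@dataclass
class ContentToolResult:
    id: str
    value: str


Content = Union[ContentText, ContentToolRequest, ContentToolResult]


@dataclass
class Turn:
    role: str
    contents: List[Content] = field(default_factory=list)


def coalesce_turns(turns: List[Turn]) -> List[Turn]:
    """Fold tool-result user turns into the preceding assistant message.

    Tool requests are dropped from the display, and assistant text that
    follows a tool result continues the same assistant message.
    """
    display: List[Turn] = []
    for turn in turns:
        only_results = bool(turn.contents) and all(
            isinstance(c, ContentToolResult) for c in turn.contents
        )
        visible = [c for c in turn.contents if not isinstance(c, ContentToolRequest)]
        last = display[-1] if display else None
        after_assistant = last is not None and last.role == "assistant"
        if after_assistant and turn.role == "user" and only_results:
            last.contents.extend(visible)
        elif after_assistant and turn.role == "assistant":
            last.contents.extend(visible)
        else:
            # New display turn; the input turns are left unmodified.
            display.append(Turn(turn.role, visible))
    return display
```

Applied to the four turns of the London example, this yields two display turns: the user message, and one assistant message containing the first text, the tool result, and the final answer.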
Live Case
When running live, the Content* objects are not directly used in the UI. In general, they're created by ellmer and recorded in the chat's turns, but the responses from the LLM (API) are streamed to the UI via yielded strings.
sequenceDiagram
participant R as R Session
participant client as client<br>(record)
participant ellmer
participant Terminal as Terminal<br>(emit)
participant UI as UI<br>(yield)
UI->>client: User input submitted
Note over client: ContentText (user)
client->>ellmer: chat_append(client$stream())
activate client
activate ellmer
ellmer-->>Terminal: emit(chunk)
Note over Terminal: Text of assistant response
ellmer-->>UI: yield(chunk)
Note over UI: Display of assistant response
ellmer->>client:
Note over client: ContentText (assistant)
deactivate client
rect rgba(100, 100, 255, 0.05)
ellmer->>client: Record tool request
activate ellmer
Note over client: ContentToolRequest
ellmer-->>Terminal: emit(request)
Note over Terminal: echo="all"
ellmer-->>UI: yield(request)
Note over UI: on_tool_request(request)
Note over UI: contents_shinychat(request)
ellmer->>R: Invoke tool
activate R
R-->>UI: chat_append_stream()
Note over UI: Text appended during tool call
R->>ellmer: tool result
deactivate R
ellmer-->>Terminal: emit(result)
Note over Terminal: echo="all"
ellmer-->>UI: yield(result)
Note over UI: on_tool_result(result)
Note over UI: contents_shinychat(result)
deactivate ellmer
deactivate ellmer
ellmer->>client:
Note over client: ContentToolResult
end
Here's a description of the process depicted in the sequence diagram. I've added bold to the steps that we would be modifying to make this approach work.
User Input: The user provides input through the UI. This input is sent to the client as ContentText from the user.
Client to ellmer: The client calls client$stream(input) and ellmer makes the API request to the LLM and returns a generator as a response. Calling shinychat::chat_append(stream) directs yielded strings to the UI.
Assistant Response (Text): ellmer emits chunks of the assistant's response to the Terminal (assuming echo="all") and yields chunks to the UI for display, showing the text of the assistant's response in both the Terminal and the UI.
Assistant Response (Content): ellmer records the assistant message with ContentText and ContentToolRequest objects in the assistant turn.
Tool Request: ellmer currently doesn't yield the tool request, but I propose we add a yield_all option to $stream() and $stream_async(). When TRUE, we yield all non-text content at the end of the assistant turn.
Tool Request Display: shinychat receives the yielded ContentToolRequest and transforms it with contents_shinychat() before appending to the current chat message.
Tool Invocation: ellmer invokes the tool in the R session.
Text Appended During Tool Call: During the tool invocation (inside the tool function body), tool authors can use chat_append_stream() to append content to the UI. This content is ephemeral unless it ends up recorded in the tool result.
Tool Result: The tool function returns a result.
Currently, this can be any jsonifiable object but doesn't have a special class. In addition to returning a regular R object, I propose we also allow the tool to return a ContentToolResult object, which might be a custom user-defined class that inherits from ContentToolResult.
In addition to the properties currently used by ContentToolResult -- id, value, error -- we would add detail (additional data), call_tool (the tool def of the calling tool) and call_args (the arguments used when calling the tool).
Currently, the tool results are converted to ContentToolResult and stored as a new user turn, but with yield_all = TRUE, ellmer would yield the contents of the turn into the generator.
Tool Result Display: Again, shinychat would receive the yielded ContentToolResult, call contents_shinychat() on the ContentToolResult and append the formatted result to the chat message.
On Tool Callbacks: With this approach, shinychat can also own the on_tool_request() and on_tool_result() callbacks. The immediate need is to remove, replace or hide any UI from the ContentToolRequest when we receive a ContentToolResult. These callbacks do not need to be user-facing at this point.
The final goal is that, once we have a paired ContentToolRequest and ContentToolResult, the final live state is equivalent to the chat state in the static case.
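The live flow above can be approximated with a Python generator. This is only a sketch of the proposed `yield_all` behavior, not ellmer's implementation: `stream()` and `render()` are hypothetical, the assistant turn is simulated from a list of chunks, and the bracketed placeholder strings stand in for real UI.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Iterator, List


@dataclass
class ContentToolRequest:
    id: str
    name: str
    arguments: dict


@dataclass
class ContentToolResult:
    id: str
    value: Any


def stream(
    chunks: Iterable[str],
    requests: List[ContentToolRequest],
    tools: Dict[str, Callable[..., Any]],
    yield_all: bool = False,
) -> Iterator[Any]:
    """Simulate one assistant turn followed by tool invocation.

    Text chunks are always yielded. With yield_all=True, tool requests are
    yielded once the assistant turn ends, and each tool result is yielded
    after its tool has run.
    """
    for chunk in chunks:
        yield chunk
    if yield_all:
        for request in requests:
            yield request
    for request in requests:
        value = tools[request.name](**request.arguments)
        if yield_all:
            yield ContentToolResult(request.id, value)


def render(items: Iterable[Any]) -> List[str]:
    """Consume a stream as a chat UI might, pairing results with requests."""
    message: List[str] = []
    pending: Dict[str, int] = {}  # request id -> index of its placeholder
    for item in items:
        if isinstance(item, ContentToolRequest):
            pending[item.id] = len(message)
            message.append(f"[running {item.name}]")
        elif isinstance(item, ContentToolResult):
            # Replace the request placeholder with the result display.
            message[pending.pop(item.id)] = f"[result: {item.value}]"
        else:
            message.append(item)
    return message
```

Once every request has been paired with its result, the rendered message contains only text and result displays, which matches the static case where requests are suppressed.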
Things we need
A way to list attached tools from chat, e.g. chat$get_tools() or chat.get_tools(). Would be useful if we want to take a chat client and register its tools with shinychat. Not sure if this is strictly required, but could be a nice addition regardless. This now exists in ellmer and chatlas.
shinychat gains contents_shinychat() generic, with methods for Chat, ContentToolRequest, ContentToolResult, etc., otherwise falling back to contents_html() or contents_markdown().
Expand data included in the ContentToolResult object: id, call_tool, call_args are added by ellmer when the tool is invoked. value is the value that's sent to the LLM. detail is a list that collects any other data that someone would want to add to the tool result (analogous to CustomEvent.detail in JavaScript).
ellmer gains support for tools to return ContentToolResult objects. If a tool returns a ContentToolResult, ellmer fills in id, call_tool and call_args. Otherwise, ellmer creates the ContentToolResult.
ellmer gains yield_all in Chat$stream() and Chat$stream_async(). When TRUE, we yield non-text assistant responses after the assistant turn completes (text is already yielded) and we yield the contents of the user turn added by tool invocation.
With the above in place, tool authors could return custom ContentToolResult objects with custom contents_shinychat methods for formatting for display in Shiny.
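To make the expanded result object concrete, here is a Python sketch of the fill-in behavior. The field names follow the list above, but the classes, `invoke_tool()` and the `WeatherResult` subclass are hypothetical stand-ins, and catching exceptions into `error` is an assumption about how failures would be reported.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class ToolDef:
    name: str
    description: str
    fun: Callable[..., Any]


@dataclass
class ContentToolResult:
    value: Any = None                           # what is sent back to the LLM
    error: Optional[str] = None
    detail: dict = field(default_factory=dict)  # extra data, e.g. for display
    id: Optional[str] = None                    # the next three are filled in
    call_tool: Optional[ToolDef] = None         # by the invoker, not the tool
    call_args: dict = field(default_factory=dict)


def invoke_tool(tool: ToolDef, request_id: str, arguments: dict) -> ContentToolResult:
    """Run a tool and normalise whatever it returns into a ContentToolResult."""
    try:
        value = tool.fun(**arguments)
    except Exception as exc:  # assumption: failures become an error result
        value = ContentToolResult(error=str(exc))
    # Plain values are wrapped; ContentToolResult (or a subclass) passes through.
    result = value if isinstance(value, ContentToolResult) else ContentToolResult(value=value)
    result.id = request_id
    result.call_tool = tool
    result.call_args = arguments
    return result


@dataclass
class WeatherResult(ContentToolResult):
    """A tool-author subclass that could get its own display method."""


def get_weather(city: str) -> WeatherResult:
    return WeatherResult(value="3.2 C", detail={"city": city, "wind_kmh": 21.6})
```

Because the subclass passes through unchanged, a display method could dispatch on `WeatherResult` and read `detail` without the tool author having to know the request id.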
Additionally, ToolDef should gain an annotations property that allows tool definitions to carry additional properties that would be used in display. annotations were recently added to the MCP schema. contents_shinychat() would hook into these annotations for displaying the tool name or knowing which tools modify their environment, etc.
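A small Python sketch of how display code might read such annotations follows. The `ToolDef` class and `display_info()` are hypothetical, and the `title` and `read_only_hint` keys are loosely modeled on MCP's `title` and `readOnlyHint` annotations rather than taken from ellmer.

```python
from dataclasses import dataclass, field


@dataclass
class ToolDef:
    name: str
    description: str
    annotations: dict = field(default_factory=dict)


def display_info(tool: ToolDef) -> dict:
    """Collect the display properties a chat UI might read from a tool."""
    fallback = tool.name.replace("_", " ").capitalize()
    return {
        "title": tool.annotations.get("title", fallback),
        # Assume a tool may modify its environment unless it says otherwise.
        "read_only": bool(tool.annotations.get("read_only_hint", False)),
    }
```

Tools without annotations still get a usable title from their name, so annotations stay optional for tool authors.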
Relatedly, shinychat needs a way to append to the current chat without knowing the ID of the chat. This would allow tool authors to write tool functions that append to shinychat when it's available or do something else when used without a shinychat UI. That might look something like this:
my_tool <- function(...) {
  chat_ui <- local_shinychat()
  chat_ui$append("Progress: 0%") # no-op if called outside `chat_append()`
  # ... do stuff ...
  chat_ui$replace("Progress: 50%")
  # or maybe
  chat_ui$append("Progress: 50%", operation = "replace")
  # finally...
  chat_ui$replace("All done!")
  result
}
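On the Python side, the same idea could be sketched with `contextvars`, which lets a tool find the active chat without being handed its ID. All names here (`ChatUI`, `NullChatUI`, `chat_append_context`, `local_chat`) are hypothetical and only illustrate the no-op-outside-a-chat behavior.

```python
import contextvars
from contextlib import contextmanager
from typing import Iterator, List


class ChatUI:
    """Hypothetical handle on the chat message currently being streamed."""

    def __init__(self) -> None:
        self.parts: List[str] = []

    def append(self, text: str) -> None:
        self.parts.append(text)

    def replace(self, text: str) -> None:
        # Swap out the most recently appended piece, or add it if none exists.
        if self.parts:
            self.parts[-1] = text
        else:
            self.parts.append(text)


class NullChatUI(ChatUI):
    """Used outside a chat: accepts the same calls but discards them."""

    def append(self, text: str) -> None:
        pass

    def replace(self, text: str) -> None:
        pass


_current_chat = contextvars.ContextVar("current_chat", default=None)


@contextmanager
def chat_append_context(chat: ChatUI) -> Iterator[ChatUI]:
    """Mark `chat` as the active chat for the duration of the block."""
    token = _current_chat.set(chat)
    try:
        yield chat
    finally:
        _current_chat.reset(token)


def local_chat() -> ChatUI:
    """Return the active chat, or a do-nothing stand-in if there is none."""
    chat = _current_chat.get()
    return chat if chat is not None else NullChatUI()


def my_tool() -> str:
    ui = local_chat()
    ui.append("Progress: 0%")
    # ... do stuff ...
    ui.replace("Progress: 50%")
    ui.replace("All done!")
    return "result"
```

Calling `my_tool()` on its own does nothing to any UI, while calling it inside `with chat_append_context(chat):` leaves the final progress text in `chat.parts`.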