diff --git a/.github/upstream-projects.yaml b/.github/upstream-projects.yaml
index 89bfb331..1451198e 100644
--- a/.github/upstream-projects.yaml
+++ b/.github/upstream-projects.yaml
@@ -69,7 +69,7 @@ projects:
- id: toolhive-studio
repo: stacklok/toolhive-studio
- version: v0.34.0
+ version: v0.35.0
docs_paths:
- docs/toolhive/guides-ui
diff --git a/docs/toolhive/guides-ui/playground.mdx b/docs/toolhive/guides-ui/playground.mdx
index e7fd779d..7de215f3 100644
--- a/docs/toolhive/guides-ui/playground.mdx
+++ b/docs/toolhive/guides-ui/playground.mdx
@@ -1,50 +1,43 @@
---
-title: Test MCP servers
+title: Chat with AI agents
description:
- Use the playground feature to test and validate MCP servers directly in the
- ToolHive UI with AI model providers.
+ Chat with built-in or custom AI agents in the ToolHive playground, with
+ per-thread agent, model, MCP tool, and skill selection.
---
import useBaseUrl from '@docusaurus/useBaseUrl';
import ThemedImage from '@theme/ThemedImage';
-ToolHive's playground lets you test and validate MCP servers directly in the UI
-without requiring additional client setup. This streamlined testing environment
-helps you quickly evaluate functionality and behavior before deploying MCP
-servers to production environments.
+The ToolHive playground is a chat workspace for built-in or custom AI agents
+that runs inside the desktop app. Each thread runs against an agent that sets
+the system prompt and a default toolset. You can then give that agent specific
+MCP tools and skills, and chat with it directly. Testing MCP servers is one
+workflow the playground supports, alongside building skills and managing your
+ToolHive setup through chat.
## Key capabilities
-### Instant testing of MCP servers
-
-Configure your AI model providers, select your MCP servers and tools, and begin
-testing immediately in the desktop app. The playground eliminates the friction
-of setting up external AI clients just to validate that your MCP servers work
-correctly.
-
-### Detailed interaction logs
-
-See tool details, parameters, and execution results directly in the UI, ensuring
-full visibility into tool performance and responses. Every interaction is
-logged, making it easy to understand exactly what your MCP servers are doing and
-how they respond to requests.
-
-### Integrated ToolHive management
-
-The playground includes a built-in MCP server that lets you manage your other
-MCP servers directly through natural language commands. You can list servers,
-check their status, start or stop them, and perform other management tasks using
-conversational AI.
-
-### Threaded conversations
-
-Keep multiple chats organized in a sidebar with starred and recent groups. See
-[Manage playground threads](#manage-playground-threads).
-
-### Attachments
-
-Send images and PDFs alongside your prompt. See
-[Attach files to a message](#attach-files-to-a-message).
+- **Built-in and custom agents**: switch between the built-in **ToolHive
+ Assistant** and **Skill Engineer** agents, or create your own with a custom
+ name, description, and system prompt. See
+ [Choose an agent for a thread](#choose-an-agent-for-a-thread).
+- **Per-thread settings**: each thread keeps its own agent, model, MCP tool
+ selection, and skill selection, so switching threads doesn't reshuffle your
+ setup.
+- **Conversational ToolHive management**: the default agent uses a built-in MCP
+ server (`toolhive mcp`) to list, start, stop, and inspect your other MCP
+ servers from the chat. See
+ [Manage MCP servers through conversation](#manage-mcp-servers-through-conversation).
+- **Inline tool calls**: see tool calls, parameters, status, response data, and
+ timing in the chat so you can verify exactly what an agent is doing.
+- **Message actions**: copy any message, edit and rewind a previous user
+ message, or queue a follow-up while a response is still streaming. See
+ [Work with chat messages](#work-with-chat-messages).
+- **Per-message cost**: see an estimated USD cost next to the token totals on
+ each assistant message for paid providers. See
+ [See per-message cost](#see-per-message-cost).
+- **Attachments**: send images and PDFs alongside your prompt. See
+ [Attach files to a message](#attach-files-to-a-message).
## Getting started
@@ -67,136 +60,101 @@ To start using the playground:
- **OpenRouter**: Add OpenRouter API key for access to multiple model
providers
-3. **Select MCP tools**: Click the tools icon to manage which MCP servers and
- tools are available in the playground.
- - View all your running MCP servers
- - Enable or disable specific tools from each server
- - Search and filter tools by name or functionality
- - The `toolhive mcp` server is included by default, providing management
- capabilities
+3. **Start a thread**: Click **New chat** in the sidebar. The first time you use
+ the playground, new threads open with the **ToolHive Assistant** agent. After
+ that, new threads inherit the agent, model, MCP tool selection, and skill
+ selection from your most recently used thread.
-
+4. **Pick an agent (optional)**: Use the agent selector in the chat toolbar to
+ switch to **Skill Engineer**, a custom agent, or open **Manage agents** to
+ build a new one. See
+ [Choose an agent for a thread](#choose-an-agent-for-a-thread).
- :::tip
+5. **Add MCP tools and skills (optional)**: Pick which MCP tools and installed
+ skills the agent can use in this thread. See
+ [Configure MCP tools and skills for a thread](#configure-mcp-tools-and-skills-for-a-thread).
- For more control over tool availability, use
- [Customize tools](./customize-tools.mdx) to permanently configure which tools
- are enabled for each registry server. The playground tool selection is
- temporary and only affects your testing session.
+6. **Start chatting**: Send a message. The agent uses its system prompt together
+ with the tools and skills you enabled for this thread.
- :::
+## Choose an agent for a thread
-4. **Start testing**: Begin chatting with your chosen AI model. The model will
- have access to all enabled MCP tools and can execute them based on your
- requests.
+Each thread runs against an agent that sets the system prompt and a default
+toolset. Two agents are built in:
-5. **Manage chat threads**: See
- [Manage playground threads](#manage-playground-threads) for sidebar, rename,
- star, and delete actions.
+- **ToolHive Assistant** is the default. It's tuned to manage your MCP servers,
+ run tools through them, and answer questions about ToolHive itself.
+- **Skill Engineer** is tuned to design, build, and audit
+ [skills](./skills.mdx).
-6. **Attach images or PDFs**: See
- [Attach files to a message](#attach-files-to-a-message) for supported types
- and limits.
+To switch agents on the active thread, open the agent selector in the chat
+toolbar and pick one. Agent selection is per-thread, so different threads can
+run different agents at the same time. New threads inherit the agent you used
+most recently.
-## Using the playground
+### Manage custom agents
-### Testing MCP server functionality
+You can build your own agents alongside the built-ins. Open the agent selector
+and choose **Manage agents** to open the **Agents** page. From there you can:
-Use the playground to validate that your MCP servers work as expected:
+- **Create an agent** with a name, description, and system prompt.
+- **Edit** an existing custom agent.
+- **Delete** a custom agent you no longer need.
-```text
-Can you list all my MCP servers and show their current status?
-```
+Custom agents appear in the agent selector alongside the built-ins so you can
+pick them for any thread.
-The AI will use the `list_servers` tool from the ToolHive MCP server to provide
-a comprehensive overview of your server status.
+## Configure MCP tools and skills for a thread
+
+After you pick an agent, choose which MCP tools and skills it can use in the
+active thread. Both selections are scoped to the thread and persist across
+reloads. New threads inherit your most recent choices.
+
+### MCP tools
+
+Click the tools icon in the chat toolbar to manage which MCP servers and tools
+are available in the active thread:
+
+- View all your running MCP servers
+- Enable or disable specific tools from each server
+- Search and filter tools by name or functionality
+- The `toolhive mcp` server is included by default, providing management
+ capabilities
-Or test that an individual MCP tool is working as expected:
-
-```text
-Use the GitHub MCP server to search for recent issues in the microsoft/vscode repository
-```
-
-If you have the GitHub MCP server running, the AI will execute the appropriate
-GitHub API calls and return formatted results.
-
-### Managing servers through conversation
-
-The ToolHive desktop app automatically starts a dedicated MCP server
-(`toolhive mcp`) that orchestrates ToolHive operations through natural language
-commands. This approach provides several key benefits:
-
-- **Unified interface**: Manage your MCP infrastructure using the same
- conversational AI interface you use for testing.
-- **Contextual operations**: The AI understands your current server state and
- can make intelligent decisions about which servers to start, stop, or
- troubleshoot.
-- **Reduced complexity**: No need to switch between the chat interface and
- traditional UI controls. Everything can be done through conversation.
-- **Audit trail**: All management operations are logged in the same transparent
- way as tool executions, providing clear visibility into what actions were
- taken.
-
-Take advantage of these integrated ToolHive management tools:
-
-```text
-Start the fetch MCP server for me
-```
-
-```text
-Stop all unhealthy MCP servers
-```
-
-```text
-Show me the logs for the fetch MCP server
-```
+:::tip
-### Validating tool responses
+For more control over tool availability, use
+[Customize tools](./customize-tools.mdx) to permanently configure which tools
+are enabled for each registry server across all clients and threads. Playground
+tool selection applies only to the active thread.
-The playground shows detailed information about each tool execution:
+:::
-- **Tool name and description**: What tool was called and its purpose
-- **Input parameters**: The exact parameters passed to the tool
-- **Execution status**: Whether the tool succeeded or failed
-- **Response data**: The complete response from the tool
-- **Timing information**: How long the tool took to execute
+### Skills
-This visibility helps you understand exactly how your MCP servers are behaving
-and identify any issues with tool implementation or configuration.
+If you've installed [skills](./skills.mdx), pick which ones the agent can use in
+this thread alongside its MCP tools. Like MCP tool selection, skill selection is
+per-thread, so different threads can give the same agent different skill sets.
-### Manage playground threads
+## Manage chat threads
The playground keeps each conversation in a separate thread so you can run
-several testing sessions in parallel without losing context. Open the sidebar to
-see your threads, with **Starred** entries pinned at the top and **Recents**
-below. Untitled threads show as `New chat` until you give them a name.
+several sessions in parallel without losing context. Open the sidebar to see
+your threads, with **Starred** entries pinned at the top and **Recents** below.
+Untitled threads show as `New chat` until you give them a name.
Each row shows a relative timestamp such as `just now`, `5m ago`, `2h ago`, or
`3d ago`. Older threads show a short date instead.
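As a rough illustration of the timestamp labels above, the following sketch formats a thread's age. The function name and the cutoffs (including the one-week switch to a short date) are assumptions for illustration, not the app's actual logic.

```typescript
// Illustrative formatter; function name and cutoffs are assumptions.
function formatThreadAge(updatedAt: Date, now: Date): string {
  const minutes = Math.floor((now.getTime() - updatedAt.getTime()) / 60000);
  if (minutes < 1) return "just now";
  if (minutes < 60) return `${minutes}m ago`;
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `${hours}h ago`;
  const days = Math.floor(hours / 24);
  if (days < 7) return `${days}d ago`; // hypothetical one-week cutoff
  // Older threads fall back to a short, locale-dependent date.
  return updatedAt.toLocaleDateString(undefined, {
    month: "short",
    day: "numeric",
  });
}

const now = new Date("2024-01-10T12:00:00Z");
console.log(formatThreadAge(new Date("2024-01-10T11:55:00Z"), now));
```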
@@ -221,9 +179,66 @@ To work with threads:
Confirm with **Delete**, or back out with **Cancel**.
+## Work with chat messages
+
+Hover over any message in the chat to reveal message actions:
+
+- **Copy** copies the message text to your clipboard. Tool inputs, tool outputs,
+ and internal reasoning are excluded; tool result blocks have their own
+ per-block **Copy** button.
+- **Edit** is only available on your own messages. It pre-fills the composer
+ with the message text so you can revise and resend it.
+
+The behavior of **Edit** depends on whether the assistant is currently
+responding:
+
+- **Idle**: the edited message is sent as a new message at the end of the
+ thread. The original message stays in the history.
+- **Streaming a response to your last message**: the composer shows a chip reading
+ **Editing last message - submit to rewind and retry**, and the submit button
+ switches to a refresh icon. Submitting cancels the in-flight response, drops
+ the partial assistant reply and the original user message, and sends your
+ edited text as a fresh turn.
+
+To exit edit mode, click **cancel** on the chip or empty the composer.
+
+### Queue a message while a response is streaming
+
+If you type into the composer while the assistant is still streaming, the submit
+button switches to a send icon. Clicking it queues your message instead of
+stopping the response. The composer clears and shows a chip:
+
+> Queued: `` - sends when the current response finishes
+
+When the current response finishes, the queued message sends automatically.
+Click the **X** on the chip to cancel the queued message at any time. Only one
+message can be queued at a time; submitting a second one replaces the first.
+Switching threads also clears the queue.
+
+If the streaming response fails instead of finishing cleanly, the queued message
+stays in the chip but isn't sent automatically. Click the **X** to discard it.
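The queueing rules in this section can be modeled as a single-slot queue. The sketch below is illustrative only: the `MessageQueue` class and its method names are hypothetical and don't reflect how ToolHive Studio implements the feature.

```typescript
// Illustrative single-slot queue; class and method names are hypothetical.
type StreamResult = "finished" | "failed";

class MessageQueue {
  private queued: string | null = null;

  // Queueing a second message replaces the first: only one slot exists.
  enqueue(message: string): void {
    this.queued = message;
  }

  // Clicking the X on the chip, or switching threads, empties the slot.
  clear(): void {
    this.queued = null;
  }

  peek(): string | null {
    return this.queued;
  }

  // Returns the message to send automatically, or null if nothing is sent.
  // On failure the message stays queued until the user discards it.
  onStreamEnd(result: StreamResult): string | null {
    if (result === "failed" || this.queued === null) {
      return null;
    }
    const next = this.queued;
    this.queued = null;
    return next;
  }
}

const queue = new MessageQueue();
queue.enqueue("first follow-up");
queue.enqueue("second follow-up"); // replaces the first
const sent = queue.onStreamEnd("finished");
console.log(sent);
```

Keeping a single nullable slot rather than an array makes the "second submit replaces the first" rule fall out of plain assignment.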
+
+### See per-message cost
+
+For assistant messages that use a paid provider (OpenAI, Anthropic, Google, xAI,
+OpenRouter), the playground shows an estimated USD cost next to the token totals
+(for example, `100 → 50 = 150 • $0.0012`). Hover over the totals to see a
+breakdown of input, cached, output, and total cost.
+
+Pricing comes from [models.dev](https://models.dev). It's cached locally and
+refreshed daily. Local providers like Ollama and LM Studio, and any model
+without published pricing, render without a cost line.
+
+:::note
+
+These figures are estimates for guidance only. Refer to your provider's billing
+dashboard for authoritative usage and charges.
+
+:::
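To make the estimate concrete, here is a minimal sketch of how a cost breakdown could be derived from token counts and per-million-token prices. The type names, the `estimateCost` function, and the prices are all hypothetical placeholders; they aren't ToolHive's implementation or real models.dev data, and the sketch assumes cached tokens are counted separately from uncached input tokens.

```typescript
// Illustrative cost estimate; all names and prices are made-up placeholders.
interface ModelPricing {
  inputPerMillion: number; // USD per 1M uncached input tokens
  cachedPerMillion: number; // USD per 1M cached input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

interface TokenUsage {
  inputTokens: number;
  cachedTokens: number;
  outputTokens: number;
}

interface CostBreakdown {
  input: number;
  cached: number;
  output: number;
  total: number;
}

function estimateCost(usage: TokenUsage, pricing: ModelPricing): CostBreakdown {
  const input = (usage.inputTokens * pricing.inputPerMillion) / 1e6;
  const cached = (usage.cachedTokens * pricing.cachedPerMillion) / 1e6;
  const output = (usage.outputTokens * pricing.outputPerMillion) / 1e6;
  return { input, cached, output, total: input + cached + output };
}

const cost = estimateCost(
  { inputTokens: 100, cachedTokens: 0, outputTokens: 50 },
  { inputPerMillion: 2, cachedPerMillion: 0.5, outputPerMillion: 8 },
);
console.log(`$${cost.total.toFixed(4)}`);
```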
+
### Attach files to a message
-Add images and PDFs to a message so the model can read them while it works with
+Add images and PDFs to a message so the agent can read them while it works with
your MCP tools. The composer accepts up to 5 files per message, each 10 MB or
smaller, and supports image files and PDFs.
@@ -244,63 +259,100 @@ message:
- **PDFs** show as `📎 ` with a **Download** link so you can save the
original file.
-If a file is rejected, the playground shows a toast that explains why:
+If a file exceeds these limits or isn't a supported type, the playground rejects
+it and shows a notification explaining why.
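The attachment limits described in this section can be sketched as a simple validation function. Everything here is an assumption for illustration: the `Attachment` shape, the `validateAttachments` name, the MIME-type checks, and treating 10 MB as 10 × 1024 × 1024 bytes are not taken from the app's source.

```typescript
// Illustrative attachment check; names and exact byte math are assumptions.
const MAX_FILES = 5;
const MAX_BYTES = 10 * 1024 * 1024; // assumes "10 MB" means 10 MiB

interface Attachment {
  name: string;
  mimeType: string;
  sizeBytes: number;
}

// Returns a rejection reason, or null when every file is acceptable.
function validateAttachments(files: Attachment[]): string | null {
  if (files.length > MAX_FILES) {
    return `too many files (limit ${MAX_FILES})`;
  }
  for (const file of files) {
    if (file.sizeBytes > MAX_BYTES) {
      return `${file.name} is larger than 10 MB`;
    }
    const isImage = file.mimeType.startsWith("image/");
    const isPdf = file.mimeType === "application/pdf";
    if (!isImage && !isPdf) {
      return `${file.name} is not an image or a PDF`;
    }
  }
  return null;
}

console.log(
  validateAttachments([
    { name: "notes.txt", mimeType: "text/plain", sizeBytes: 120 },
  ]),
);
```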
-- When you exceed the per-message limit:
+## Example workflows
- > You reached the maximum number of files
- >
- > You can only upload up to 5 files
+The playground supports any chat-driven task you want an agent for. A few common
+starting points:
-- When a file is over 10 MB:
+### Manage MCP servers through conversation
- > File size too large
- >
- > The file size must be less than 10MB
+The desktop app starts a dedicated MCP server (`toolhive mcp`) that orchestrates
+ToolHive operations through natural language. With the default **ToolHive
+Assistant** agent, you can list, start, stop, and inspect servers without
+leaving the chat:
-- When a file isn't an image or a PDF:
+```text
+Can you list all my MCP servers and show their current status?
+```
- > File type not supported
- >
- > Only images and PDFs are supported
+```text
+Start the fetch MCP server for me
+```
-The composer placeholder reflects the playground state:
+```text
+Stop all unhealthy MCP servers
+```
-- Before you select a model:
+```text
+Show me the logs for the fetch MCP server
+```
- > Select an AI model to get started
+The agent calls the matching `toolhive mcp` tools and shows the results inline,
+giving you a unified interface and an audit trail in the same place as any other
+tool execution.
-- After you select a model:
+
- > Type your message...
+### Test MCP server functionality
+
+Use the playground to validate that an MCP server works as expected before you
+connect external clients to it. Enable the server's tools in the active thread,
+then prompt the agent to call them:
+
+```text
+Use the GitHub MCP server to search for recent issues in the
+microsoft/vscode repository
+```
+
+If the GitHub MCP server is running, the agent makes the appropriate API calls
+and returns formatted results. The playground shows each tool execution inline:
+
+- **Tool name and description**: what tool was called and its purpose
+- **Input parameters**: the exact parameters passed to the tool
+- **Execution status**: whether the tool succeeded or failed
+- **Response data**: the complete response from the tool
+- **Timing information**: how long the tool took to execute
+
+This makes it easy to spot tool implementation or configuration issues.
+
+### Build and audit skills
+
+Switch to the **Skill Engineer** agent on a thread to design new skills, refine
+an existing one, or audit a skill's behavior. See [Skills](./skills.mdx) for
+details on skill formats and how to install or build your own.
## Recommended practices
### Provider security
-- Use dedicated API keys for testing that have appropriate rate limits
-- Regularly rotate API keys used in development environments
-- Consider using API keys with restricted permissions for testing purposes
+- Use dedicated API keys for testing that have appropriate rate limits.
+- Regularly rotate API keys used in development environments.
+- Consider using API keys with restricted permissions for testing purposes.
- When using local providers like Ollama or LM Studio, ensure the server URLs
- are only accessible on your local network to prevent unauthorized access
+ are only accessible on your local network to prevent unauthorized access.
-### Server management
+### Agents, servers, and tools
-- Start only the MCP servers you need for testing to improve performance
+- Start only the MCP servers you need so that agents only see relevant tools.
+- Save reusable prompts as custom agents so you don't have to retype the system
+ prompt for every new thread.
- Use the playground to validate new server configurations before connecting
- them to production AI clients
-- Test different combinations of tools to understand how they work together
-
-### Testing workflow
-
-1. **Isolated testing**: Test individual MCP servers one at a time to validate
- their functionality
-2. **Integration testing**: Enable multiple servers to test how they work
- together
-3. **Performance validation**: Monitor tool execution times and responses under
- different loads
-4. **Error handling**: Intentionally trigger error conditions to validate proper
- error handling
+ them to external AI clients.
### Thread and attachment hygiene
@@ -314,18 +366,21 @@ The composer placeholder reflects the playground state:
## Next steps
-- Learn about [client configuration](./client-configuration.mdx) to connect
- ToolHive to external AI applications
+- Browse the [Skills](./skills.mdx) section to install or build skills you can
+ enable in a thread
+- Set up [client configuration](./client-configuration.mdx) to connect ToolHive
+ to external AI applications
- Set up [secrets management](./secrets-management.mdx) for secure handling of
API keys and tokens
- Explore [network isolation](./network-isolation.mdx) for enhanced security
when testing untrusted MCP servers
-- Browse the [registry](./registry.mdx) to discover new MCP servers to test in
- the playground
+- Discover more MCP servers to add to your agent threads in the
+ [registry](./registry.mdx)
## Related information
- [Run MCP servers](./run-mcp-servers.mdx)
+- [Skills](./skills.mdx)
- [Install ToolHive](./install.mdx)
## Troubleshooting
@@ -359,7 +414,7 @@ If your MCP server tools aren't showing up:
1. Verify the MCP server is running on the **MCP Servers** page.
2. Click the tools icon in the playground and confirm the server's tools are
- enabled for this session.
+ enabled for this thread.
3. Restart the MCP server if it shows as unhealthy.
4. Check the server logs for errors.
diff --git a/sidebars.ts b/sidebars.ts
index 9fa905bf..bd77117e 100644
--- a/sidebars.ts
+++ b/sidebars.ts
@@ -75,8 +75,8 @@ const sidebars: SidebarsConfig = {
'toolhive/guides-ui/skills-manage',
],
},
- 'toolhive/guides-ui/cli-access',
'toolhive/guides-ui/playground',
+ 'toolhive/guides-ui/cli-access',
],
},