diff --git a/.github/upstream-projects.yaml b/.github/upstream-projects.yaml index 89bfb331..1451198e 100644 --- a/.github/upstream-projects.yaml +++ b/.github/upstream-projects.yaml @@ -69,7 +69,7 @@ projects: - id: toolhive-studio repo: stacklok/toolhive-studio - version: v0.34.0 + version: v0.35.0 docs_paths: - docs/toolhive/guides-ui diff --git a/docs/toolhive/guides-ui/playground.mdx b/docs/toolhive/guides-ui/playground.mdx index e7fd779d..7de215f3 100644 --- a/docs/toolhive/guides-ui/playground.mdx +++ b/docs/toolhive/guides-ui/playground.mdx @@ -1,50 +1,43 @@ --- -title: Test MCP servers +title: Chat with AI agents description: - Use the playground feature to test and validate MCP servers directly in the - ToolHive UI with AI model providers. + Chat with built-in or custom AI agents in the ToolHive playground, with + per-thread agent, model, MCP tool, and skill selection. --- import useBaseUrl from '@docusaurus/useBaseUrl'; import ThemedImage from '@theme/ThemedImage'; -ToolHive's playground lets you test and validate MCP servers directly in the UI -without requiring additional client setup. This streamlined testing environment -helps you quickly evaluate functionality and behavior before deploying MCP -servers to production environments. +The ToolHive playground is a chat workspace for built-in or custom AI agents +that runs inside the desktop app. Each thread runs against an agent that sets +the system prompt and a default toolset. You can then give that agent specific +MCP tools and skills, and chat with it directly. Testing MCP servers is one +workflow the playground supports, alongside building skills and managing your +ToolHive setup through chat. ## Key capabilities -### Instant testing of MCP servers - -Configure your AI model providers, select your MCP servers and tools, and begin -testing immediately in the desktop app. The playground eliminates the friction -of setting up external AI clients just to validate that your MCP servers work -correctly. 
- -### Detailed interaction logs - -See tool details, parameters, and execution results directly in the UI, ensuring -full visibility into tool performance and responses. Every interaction is -logged, making it easy to understand exactly what your MCP servers are doing and -how they respond to requests. - -### Integrated ToolHive management - -The playground includes a built-in MCP server that lets you manage your other -MCP servers directly through natural language commands. You can list servers, -check their status, start or stop them, and perform other management tasks using -conversational AI. - -### Threaded conversations - -Keep multiple chats organized in a sidebar with starred and recent groups. See -[Manage playground threads](#manage-playground-threads). - -### Attachments - -Send images and PDFs alongside your prompt. See -[Attach files to a message](#attach-files-to-a-message). +- **Built-in and custom agents**: switch between the built-in **ToolHive + Assistant** and **Skill Engineer** agents, or create your own with a custom + name, description, and system prompt. See + [Choose an agent for a thread](#choose-an-agent-for-a-thread). +- **Per-thread settings**: each thread keeps its own agent, model, MCP tool + selection, and skill selection, so switching threads doesn't reshuffle your + setup. +- **Conversational ToolHive management**: the default agent uses a built-in MCP + server (`toolhive mcp`) to list, start, stop, and inspect your other MCP + servers from the chat. See + [Manage MCP servers through conversation](#manage-mcp-servers-through-conversation). +- **Inline tool calls**: see tool calls, parameters, status, response data, and + timing in the chat so you can verify exactly what an agent is doing. +- **Message actions**: copy any message, edit and rewind a previous user + message, or queue a follow-up while a response is still streaming. See + [Work with chat messages](#work-with-chat-messages). 
+- **Per-message cost**: see an estimated USD cost next to the token totals on + each assistant message for paid providers. See + [See per-message cost](#see-per-message-cost). +- **Attachments**: send images and PDFs alongside your prompt. See + [Attach files to a message](#attach-files-to-a-message). ## Getting started @@ -67,136 +60,101 @@ To start using the playground: - **OpenRouter**: Add OpenRouter API key for access to multiple model providers -3. **Select MCP tools**: Click the tools icon to manage which MCP servers and - tools are available in the playground. - - View all your running MCP servers - - Enable or disable specific tools from each server - - Search and filter tools by name or functionality - - The `toolhive mcp` server is included by default, providing management - capabilities +3. **Start a thread**: Click **New chat** in the sidebar. The first time you use + the playground, new threads open with the **ToolHive Assistant** agent. After + that, new threads inherit the agent, model, MCP tool selection, and skill + selection from your most recently used thread. - +4. **Pick an agent (optional)**: Use the agent selector in the chat toolbar to + switch to **Skill Engineer** or a custom agent, or open **Manage agents** to + build a new one. See + [Choose an agent for a thread](#choose-an-agent-for-a-thread). - :::tip +5. **Add MCP tools and skills (optional)**: Pick which MCP tools and installed + skills the agent can use in this thread. See + [Configure MCP tools and skills for a thread](#configure-mcp-tools-and-skills-for-a-thread). - For more control over tool availability, use - [Customize tools](./customize-tools.mdx) to permanently configure which tools - are enabled for each registry server. The playground tool selection is - temporary and only affects your testing session. +6. **Start chatting**: Send a message. The agent uses its system prompt together + with the tools and skills you enabled for this thread. 
- ::: +## Choose an agent for a thread -4. **Start testing**: Begin chatting with your chosen AI model. The model will - have access to all enabled MCP tools and can execute them based on your - requests. +Each thread runs against an agent that sets the system prompt and a default +toolset. Two agents are built in: -5. **Manage chat threads**: See - [Manage playground threads](#manage-playground-threads) for sidebar, rename, - star, and delete actions. +- **ToolHive Assistant** is the default. It's tuned to manage your MCP servers, + run tools through them, and answer questions about ToolHive itself. +- **Skill Engineer** is tuned to design, build, and audit + [skills](./skills.mdx). -6. **Attach images or PDFs**: See - [Attach files to a message](#attach-files-to-a-message) for supported types - and limits. +To switch agents on the active thread, open the agent selector in the chat +toolbar and pick one. Agent selection is per-thread, so different threads can +run different agents at the same time. New threads inherit the agent you used +most recently. -## Using the playground +### Manage custom agents -### Testing MCP server functionality +You can build your own agents alongside the built-ins. Open the agent selector +and choose **Manage agents** to open the **Agents** page. From there you can: -Use the playground to validate that your MCP servers work as expected: +- **Create an agent** with a name, description, and system prompt. +- **Edit** an existing custom agent. +- **Delete** a custom agent you no longer need. -```text -Can you list all my MCP servers and show their current status? -``` +Custom agents appear in the agent selector alongside the built-ins so you can +pick them for any thread. -The AI will use the `list_servers` tool from the ToolHive MCP server to provide -a comprehensive overview of your server status. 
+## Configure MCP tools and skills for a thread + +After you pick an agent, choose which MCP tools and skills it can use in the +active thread. Both selections are scoped to the thread and persist across +reloads. New threads inherit your most recent choices. + +### MCP tools + +Click the tools icon in the chat toolbar to manage which MCP servers and tools +are available in the active thread: + +- View all your running MCP servers +- Enable or disable specific tools from each server +- Search and filter tools by name or functionality +- The `toolhive mcp` server is included by default, providing management + capabilities -Or test that an individual MCP tool is working as expected: - -```text -Use the GitHub MCP server to search for recent issues in the microsoft/vscode repository -``` - -If you have the GitHub MCP server running, the AI will execute the appropriate -GitHub API calls and return formatted results. - -### Managing servers through conversation - -The ToolHive desktop app automatically starts a dedicated MCP server -(`toolhive mcp`) that orchestrates ToolHive operations through natural language -commands. This approach provides several key benefits: - -- **Unified interface**: Manage your MCP infrastructure using the same - conversational AI interface you use for testing. -- **Contextual operations**: The AI understands your current server state and - can make intelligent decisions about which servers to start, stop, or - troubleshoot. -- **Reduced complexity**: No need to switch between the chat interface and - traditional UI controls. Everything can be done through conversation. -- **Audit trail**: All management operations are logged in the same transparent - way as tool executions, providing clear visibility into what actions were - taken. 
- -Take advantage of these integrated ToolHive management tools: - -```text -Start the fetch MCP server for me -``` - -```text -Stop all unhealthy MCP servers -``` - -```text -Show me the logs for the fetch MCP server -``` +:::tip -### Validating tool responses +For more control over tool availability, use +[Customize tools](./customize-tools.mdx) to permanently configure which tools +are enabled for each registry server across all clients and threads. Playground +tool selection applies only to the active thread. -The playground shows detailed information about each tool execution: +::: -- **Tool name and description**: What tool was called and its purpose -- **Input parameters**: The exact parameters passed to the tool -- **Execution status**: Whether the tool succeeded or failed -- **Response data**: The complete response from the tool -- **Timing information**: How long the tool took to execute +### Skills -This visibility helps you understand exactly how your MCP servers are behaving -and identify any issues with tool implementation or configuration. +If you've installed [skills](./skills.mdx), pick which ones the agent can use in +this thread alongside its MCP tools. Skill selection is per-thread, same as MCP +tools, so different threads can give the same agent different skill sets. -### Manage playground threads +## Manage chat threads The playground keeps each conversation in a separate thread so you can run -several testing sessions in parallel without losing context. Open the sidebar to -see your threads, with **Starred** entries pinned at the top and **Recents** -below. Untitled threads show as `New chat` until you give them a name. +several sessions in parallel without losing context. Open the sidebar to see +your threads, with **Starred** entries pinned at the top and **Recents** below. +Untitled threads show as `New chat` until you give them a name. Each row shows a relative timestamp such as `just now`, `5m ago`, `2h ago`, or `3d ago`. 
Older threads show a short date instead. @@ -221,9 +179,66 @@ To work with threads: Confirm with **Delete**, or back out with **Cancel**. +## Work with chat messages + +Hover over any message in the chat to reveal message actions: + +- **Copy** copies the message text to your clipboard. Tool inputs, tool outputs, + and internal reasoning are excluded; tool result blocks have their own + per-block **Copy** button. +- **Edit** is only available on your own messages. It pre-fills the composer + with the message text so you can revise and resend it. + +The behavior of **Edit** depends on whether the assistant is currently +responding: + +- **Idle**: the edited message is sent as a new message at the end of the + thread. The original message stays in the history. +- **Streaming the last user message**: the composer shows a chip reading + **Editing last message - submit to rewind and retry**, and the submit button + switches to a refresh icon. Submitting cancels the in-flight response, drops + the partial assistant reply and the original user message, and sends your + edited text as a fresh turn. + +To exit edit mode, click **cancel** on the chip or empty the composer. + +### Queue a message while a response is streaming + +If you type into the composer while the assistant is still streaming, the submit +button switches to a send icon. Clicking it queues your message instead of +stopping the response. The composer clears and shows a chip: + +> Queued: `` - sends when the current response finishes + +When the current response finishes, the queued message sends automatically. +Click the **X** on the chip to cancel the queued message at any time. Only one +message can be queued at a time; submitting a second one replaces the queued +slot. Switching threads also clears the queue. + +If the streaming response fails instead of finishing cleanly, the queued message +stays in the chip but isn't sent automatically. Click the **X** to discard it. 
+ +### See per-message cost + +For assistant messages that use a paid provider (OpenAI, Anthropic, Google, xAI, +OpenRouter), the playground shows an estimated USD cost next to the token totals +(for example, `100 → 50 = 150 • $0.0012`). Hover over the totals to see a breakdown +of input, cached, output, and total cost. + +Pricing comes from [models.dev](https://models.dev) and is cached locally and +refreshed daily. Local providers like Ollama and LM Studio, and any model +without published pricing, render without a cost line. + +:::note + +These figures are estimates for guidance only. Refer to your provider's billing +dashboard for authoritative usage and charges. + +::: + ### Attach files to a message -Add images and PDFs to a message so the model can read them while it works with +Add images and PDFs to a message so the agent can read them while it works with your MCP tools. The composer accepts up to 5 files per message, each 10 MB or smaller, and supports image files and PDFs. @@ -244,63 +259,100 @@ message: - **PDFs** show as `📎 ` with a **Download** link so you can save the original file. -If a file is rejected, the playground shows a toast that explains why: +If a file exceeds these limits or isn't a supported type, the playground rejects +it and shows a notification explaining why. -- When you exceed the per-message limit: +## Example workflows - > You reached the maximum number of files - > - > You can only upload up to 5 files +The playground supports any chat-driven task you want an agent for. A few common +starting points: -- When a file is over 10 MB: +### Manage MCP servers through conversation - > File size too large - > - > The file size must be less than 10MB +The desktop app starts a dedicated MCP server (`toolhive mcp`) that orchestrates +ToolHive operations through natural language. 
With the default **ToolHive +Assistant** agent, you can list, start, stop, and inspect servers without +leaving the chat: -- When a file isn't an image or a PDF: +```text +Can you list all my MCP servers and show their current status? +``` - > File type not supported - > - > Only images and PDFs are supported +```text +Start the fetch MCP server for me +``` -The composer placeholder reflects the playground state: +```text +Stop all unhealthy MCP servers +``` -- Before you select a model: +```text +Show me the logs for the fetch MCP server +``` - > Select an AI model to get started +The agent calls the matching `toolhive mcp` tools and shows the results inline, +giving you a unified interface and an audit trail in the same place as any other +tool execution. -- After you select a model: + - > Type your message... +### Test MCP server functionality + +Use the playground to validate that an MCP server works as expected before you +connect external clients to it. Enable the server's tools in the active thread, +then prompt the agent to call them: + +```text +Use the GitHub MCP server to search for recent issues in the +microsoft/vscode repository +``` + +If the GitHub MCP server is running, the agent makes the appropriate API calls +and returns formatted results. The playground shows each tool execution inline: + +- **Tool name and description**: what tool was called and its purpose +- **Input parameters**: the exact parameters passed to the tool +- **Execution status**: whether the tool succeeded or failed +- **Response data**: the complete response from the tool +- **Timing information**: how long the tool took to execute + +This makes it easy to spot tool implementation or configuration issues. + +### Build and audit skills + +Switch to the **Skill Engineer** agent on a thread to design new skills, refine +an existing one, or audit a skill's behavior. See [Skills](./skills.mdx) for +details on skill formats and how to install or build your own. 
## Recommended practices ### Provider security -- Use dedicated API keys for testing that have appropriate rate limits -- Regularly rotate API keys used in development environments -- Consider using API keys with restricted permissions for testing purposes +- Use dedicated API keys for testing that have appropriate rate limits. +- Regularly rotate API keys used in development environments. +- Consider using API keys with restricted permissions for testing purposes. - When using local providers like Ollama or LM Studio, ensure the server URLs - are only accessible on your local network to prevent unauthorized access + are only accessible on your local network to prevent unauthorized access. -### Server management +### Agents, servers, and tools -- Start only the MCP servers you need for testing to improve performance +- Start only the MCP servers you need so that agents only see relevant tools. +- Save reusable prompts as custom agents so you don't have to retype the system + prompt for every new thread. - Use the playground to validate new server configurations before connecting - them to production AI clients -- Test different combinations of tools to understand how they work together - -### Testing workflow - -1. **Isolated testing**: Test individual MCP servers one at a time to validate - their functionality -2. **Integration testing**: Enable multiple servers to test how they work - together -3. **Performance validation**: Monitor tool execution times and responses under - different loads -4. **Error handling**: Intentionally trigger error conditions to validate proper - error handling + them to external AI clients. 
### Thread and attachment hygiene @@ -314,18 +366,21 @@ The composer placeholder reflects the playground state: ## Next steps -- Learn about [client configuration](./client-configuration.mdx) to connect - ToolHive to external AI applications +- Browse the [Skills](./skills.mdx) section to install or build skills you can + enable in a thread +- Set up [client configuration](./client-configuration.mdx) to connect ToolHive + to external AI applications - Set up [secrets management](./secrets-management.mdx) for secure handling of API keys and tokens - Explore [network isolation](./network-isolation.mdx) for enhanced security when testing untrusted MCP servers -- Browse the [registry](./registry.mdx) to discover new MCP servers to test in - the playground +- Discover more MCP servers to add to your agent threads in the + [registry](./registry.mdx) ## Related information - [Run MCP servers](./run-mcp-servers.mdx) +- [Skills](./skills.mdx) - [Install ToolHive](./install.mdx) ## Troubleshooting @@ -359,7 +414,7 @@ If your MCP server tools aren't showing up: 1. Verify the MCP server is running on the **MCP Servers** page. 2. Click the tools icon in the playground and confirm the server's tools are - enabled for this session. + enabled for this thread. 3. Restart the MCP server if it shows as unhealthy. 4. Check the server logs for errors. diff --git a/sidebars.ts b/sidebars.ts index 9fa905bf..bd77117e 100644 --- a/sidebars.ts +++ b/sidebars.ts @@ -75,8 +75,8 @@ const sidebars: SidebarsConfig = { 'toolhive/guides-ui/skills-manage', ], }, - 'toolhive/guides-ui/cli-access', 'toolhive/guides-ui/playground', + 'toolhive/guides-ui/cli-access', ], },