feat(fetch-url): support fetching images from URLs#1266
Closes MoonshotAI#1016

The LLM sometimes passes 'completed' as the status for TodoList items, but the schema only accepted 'pending' | 'in_progress' | 'done'. This produced two problems:
1. Validation failed when the model used 'completed'.
2. Even if validation passed, statusMarker() had no case for 'completed' and fell through to the unreachable default branch.

Changes:
- Extend TodoStatus union to include 'completed' so it is accepted at the type level.
- Map 'completed' -> 'done' in setTodos() so persisted state stays clean.
- Handle 'completed' in statusMarker() so it renders as '[done]'.
- Update the markdown description to explicitly warn against using 'completed'.
- Add a test confirming 'completed' is accepted and mapped to 'done'.
Extend FetchURL to support image responses:
- Add 'image' kind to UrlFetchResult with base64 imageData field
- LocalFetchURLProvider: detect image/* content-type, stream binary data, and return base64-encoded imageData
- MoonshotFetchURLProvider: handle image responses from coding-fetch service
- FetchURLTool: render returned imageData as image_url content parts so the model can view fetched images directly
- Update tool description to mention image support
- Add tests for image fetching in all three layers

Fixes: Fetch tool should also support downloading images
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6909f86392
  title: z.string().min(1).describe('Short, actionable title for the todo.'),
- status: z.enum(['pending', 'in_progress', 'done']).describe('Current status of the todo.'),
+ status: z
+   .preprocess((val) => (val === 'completed' ? 'done' : val), z.enum(['pending', 'in_progress', 'done']))
Normalize completed after tool-call validation
When the model actually calls TodoList, args are validated by AJV from tool.parameters before resolveExecution runs, and toInputJsonSchema explicitly notes there is no Zod parse/strip step before dispatch (packages/agent-core/src/tools/support/input-schema.ts:41, packages/agent-core/src/loop/tool-call.ts:202-225). With zod's input JSON schema for this preprocess still derived from the inner enum, status: "completed" is rejected before the new setTodos normalization can run, so the added acceptance only works in tests that bypass preflight validation. Please either make the JSON schema explicitly allow the handled alias or normalize after validation in the loop/tool path.
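The ordering problem in this comment can be sketched without any schema library. Everything below is hypothetical: `validateStatus`, `normalizeStatus`, `acceptStatus`, and the two enum arrays are illustrative names rather than the project's AJV or Zod wiring. The only assumption is that an enum check runs before any normalization does.

```typescript
// Hypothetical sketch: these names are illustrative, not the project's API.
type TodoStatus = 'pending' | 'in_progress' | 'done';

// The enum as a preflight validator sees it when derived from the inner enum only.
const STRICT_ENUM: readonly string[] = ['pending', 'in_progress', 'done'];
// The same enum with the handled alias explicitly allowed.
const ENUM_WITH_ALIAS: readonly string[] = [...STRICT_ENUM, 'completed'];

// Stand-in for the JSON-schema enum check that runs before the tool executes.
function validateStatus(value: unknown, allowed: readonly string[]): value is string {
  return typeof value === 'string' && allowed.includes(value);
}

// Normalization that only runs once validation has passed.
function normalizeStatus(value: string): TodoStatus {
  return value === 'completed' ? 'done' : (value as TodoStatus);
}

// Validate first, then normalize; null means preflight rejected the input.
function acceptStatus(value: unknown, allowed: readonly string[]): TodoStatus | null {
  return validateStatus(value, allowed) ? normalizeStatus(value) : null;
}
```

With `STRICT_ENUM`, `acceptStatus('completed', STRICT_ENUM)` is rejected before normalization is ever reached, which mirrors the failure described above. With `ENUM_WITH_ALIAS` the alias passes validation and is then mapped to 'done'.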
// Image responses: stream as binary and return as base64-encoded image data
// so the tool can render them directly as image_url content parts.
if (contentType.startsWith('image/')) {
Restrict URL images to supported MIME types
For any URL that returns image/svg+xml, image/bmp, image/tiff, etc., this new branch emits a base64 image_url with that MIME type. Downstream Anthropic conversion only accepts base64 PNG/JPEG/GIF/WebP (packages/kosong/src/providers/anthropic.ts:446-470), and the file media path already treats those as the accepted image formats (packages/agent-core/src/tools/support/file-type.ts:379-381), so fetching one of these URLs can make the next model call fail instead of producing a useful tool result. Please gate this path, and the matching Moonshot image path, to supported image MIME types or return a controlled error/text fallback for the rest.
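One way to gate the image path is an allowlist check on the media type instead of a bare `image/` prefix test. The sketch below is hypothetical: `SUPPORTED_IMAGE_MIME_TYPES`, `mediaTypeOf`, and `isSupportedImage` are illustrative names, and the allowlist simply copies the four formats named in this comment.

```typescript
// Hypothetical helper; the set mirrors the formats named in the review comment.
const SUPPORTED_IMAGE_MIME_TYPES: ReadonlySet<string> = new Set([
  'image/png',
  'image/jpeg',
  'image/gif',
  'image/webp',
]);

// Extract the bare media type from a Content-Type header value,
// dropping parameters such as "; charset=..." and normalizing case.
function mediaTypeOf(contentType: string): string {
  const [mediaType = ''] = contentType.split(';');
  return mediaType.trim().toLowerCase();
}

// True only for image types on the allowlist; SVG, BMP, TIFF, etc. return false
// so the caller can fall back to a text result or a controlled error.
function isSupportedImage(contentType: string): boolean {
  return SUPPORTED_IMAGE_MIME_TYPES.has(mediaTypeOf(contentType));
}
```

A caller could then take the image branch only when `isSupportedImage(contentType)` is true and treat every other `image/*` response as unsupported.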
Problem
The FetchURL tool can only return text content. When a URL points directly to an image (e.g. a chart PNG, a diagram SVG, or a photo), the tool receives binary image data but has no way to present it to the model as a viewable image.
Solution
Extend the fetch URL pipeline to detect image responses and render them as image_url content parts:
- Type extension: Add 'image' to UrlFetchKind and introduce UrlFetchImageData with base64 + MIME type.
- Local provider: LocalFetchURLProvider now checks content-type before deciding how to consume the response body. For image/* it streams the body as arrayBuffer(), encodes to base64, and returns kind: 'image' with imageData.
- Moonshot provider: MoonshotFetchURLProvider also checks the response content-type and handles image responses the same way, falling back to local fetcher on any error as before.
- Tool execution: FetchURLTool.execution() now branches: when imageData is present it returns a ContentPart[] with a text note + an image_url part using a data:…base64,… URL, so the model sees the image directly.
- Docs & tests: Updated tool description to mention image support. Added tests covering image fetch in all three layers (tool, local provider, moonshot provider integration via fallback).
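The encoding and tool-execution steps can be sketched roughly as follows. This is a minimal illustration and not the PR's code. `FetchedImage`, `ContentPart`, `encodeImage`, and `toContentParts` are hypothetical names, and the field layout of the content parts is an assumption. The real UrlFetchImageData and ContentPart types may differ. It assumes a Node.js runtime for `Buffer`.

```typescript
// Hypothetical shapes; the project's real types may use different field names.
interface FetchedImage {
  base64: string;
  mimeType: string;
}

type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; imageUrl: { url: string } };

// Encode raw response bytes (e.g. from response.arrayBuffer()) as base64.
function encodeImage(bytes: Uint8Array, mimeType: string): FetchedImage {
  return { base64: Buffer.from(bytes).toString('base64'), mimeType };
}

// Build a text note plus an image part whose URL is a base64 data: URL.
function toContentParts(url: string, image: FetchedImage): ContentPart[] {
  return [
    { type: 'text', text: `Fetched image from ${url} (${image.mimeType})` },
    {
      type: 'image_url',
      imageUrl: { url: `data:${image.mimeType};base64,${image.base64}` },
    },
  ];
}
```

For example, calling `toContentParts(url, encodeImage(new Uint8Array(buffer), 'image/png'))` on the bytes of a PNG response would yield a two-element array: the note, then the viewable image part.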
Files changed
- packages/agent-core/src/tools/builtin/web/fetch-url.ts
- packages/agent-core/src/tools/builtin/web/fetch-url.md
- packages/agent-core/src/tools/providers/local-fetch-url.ts
- packages/agent-core/src/tools/providers/moonshot-fetch-url.ts
- packages/agent-core/test/tools/fetch-url.test.ts
- packages/agent-core/test/tools/providers/local-fetch-url.test.ts

All 3348 tests in packages/agent-core/test pass (212 test files).