-
-
Notifications
You must be signed in to change notification settings - Fork 232
fix: proxy external images in package readmes #1143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix: proxy external images in package readmes #1143
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
2 Skipped Deployments
|
📝 WalkthroughWalkthroughAdds a server-side image proxy endpoint at /api/registry/image-proxy with strict request validation, DNS/host resolution checks to prevent SSRF, content-type and size enforcement, a 10 MB streaming cap, redirect re-validation, and security response headers. Introduces server/utils/image-proxy.ts with trusted-domain logic, URL classification, isTrustedImageDomain, isAllowedImageUrl, toProxiedImageUrl and resolveAndValidateHost. README image resolution is updated to route external images through the proxy. Adds unit tests for the proxy utilities and README routing, and adds the ipaddr.js dependency. 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
No actionable comments were generated in the recent review. 🎉 Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
shared/utils/image-proxy.ts (1)
71-83: Consider refining the private IP range checks.Two observations on the SSRF protection:
172.x range is overly broad: The check
hostname.startsWith('172.')blocks all 172.0.0.0–172.255.255.255, but only 172.16.0.0–172.31.255.255 is private (RFC 1918). This could inadvertently block legitimate public IPs.Missing IPv6 private ranges: The check only covers
::1but misses other internal IPv6 addresses likefe80::/10(link-local),fc00::/7(unique local), and IPv4-mapped addresses like::ffff:127.0.0.1.The current approach errs on the side of caution, which is reasonable for a first iteration, but you may want to refine it to avoid false positives and close IPv6 gaps.
🛡️ Suggested refinements
if ( hostname === 'localhost' || hostname === '127.0.0.1' || hostname === '::1' || hostname === '0.0.0.0' || hostname.startsWith('10.') || hostname.startsWith('192.168.') || - hostname.startsWith('172.') || + /^172\.(1[6-9]|2[0-9]|3[0-1])\./.test(hostname) || hostname.endsWith('.local') || - hostname.endsWith('.internal') + hostname.endsWith('.internal') || + // IPv6 private/internal ranges + hostname.startsWith('fe80:') || + hostname.startsWith('fc') || + hostname.startsWith('fd') || + hostname.startsWith('::ffff:127.') || + hostname.startsWith('::ffff:10.') || + hostname.startsWith('::ffff:192.168.') || + hostname === '[::1]' ) { return false }
| const query = getQuery(event) | ||
| const url = query.url as string | undefined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle potential array values in query parameter.
If a request contains multiple url parameters (e.g., ?url=a&url=b), query.url will be an array, making the type cast unsafe. This also affects the cache key on line 116.
🔧 Suggested fix
const query = getQuery(event)
- const url = query.url as string | undefined
+ const rawUrl = query.url
+ const url = Array.isArray(rawUrl) ? rawUrl[0] : rawUrl as string | undefined📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const query = getQuery(event) | |
| const url = query.url as string | undefined | |
| const query = getQuery(event) | |
| const rawUrl = query.url | |
| const url = Array.isArray(rawUrl) ? rawUrl[0] : rawUrl as string | undefined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
okineadev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this PR AI generated?
Please remove all links to issue #1138, because I don't think it will be needed since it will be closed anyway
Although this PR looks good overall for me 👍
|
You probably need to add "Resolves https://github.com/npmx-dev/npmx.dev/security/advisories/GHSA-fhmf-6j9f-wmmq" to the PR description to close the private Security Issue at the same time. |
|
We need to make sure this PR works exactly like Camo - https://github.com/atmos/camo/blob/master/server.coffee |
|
Please wait 🙏 |
|
@okineadev can you explain more? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
First, we need to use the same request and response headers as Camo - https://github.com/atmos/camo/blob/master/server.coffee#L39-L57 I don't know much about encryption, I asked Claude to explain how Camo does it (see https://github.com/atmos/camo#url-formats) Like we need to make it so that only npmx can proxy only approved URLs (i.e. only from the readme of any package on npmx) and not any that the user specifies directly, as can be done now. We have two options - either update the current code or ... just port https://github.com/atmos/camo/blob/master/server.js to this project |
|
There is another similar well-commented project - https://github.com/rehypejs/camomile It might be useful |
Summary
Resolves #1138
README images from third-party servers are currently loaded directly by the visitor's browser, which exposes their IP address, User-Agent, and other request metadata to potentially malicious image hosts.
This PR introduces a server-side image proxy (similar to GitHub's camo proxy) that routes untrusted external images through
/api/registry/image-proxy, so visitors never make direct requests to third-party servers.Changes
New files
server/api/registry/image-proxy/index.get.ts— Cached Nitro API endpoint that fetches images server-side and forwards them to the client. Includes:image/*responses)X-Content-Type-Options: nosniff, restrictive CSP)defineCachedEventHandlershared/utils/image-proxy.ts— Shared utilities:isTrustedImageDomain()— Allowlist of trusted domains (GitHub, GitLab, jsdelivr, shields.io, etc.) that don't need proxyingisAllowedImageUrl()— Validates URLs are safe to proxy (blocks SSRF vectors)toProxiedImageUrl()— Converts external image URLs to proxied URLs; trusted domains pass through unchangedModified files
server/utils/readme.ts— UpdatedresolveImageUrl()to route resolved image URLs throughtoProxiedImageUrl(), which proxies untrusted external images while leaving trusted sources (GitHub raw, jsdelivr, shields.io) unchanged.Tests
test/unit/shared/utils/image-proxy.spec.ts— 19 unit tests covering trusted domain detection, URL validation, SSRF blocking, and proxy URL generation.test/unit/server/utils/readme.spec.ts— 6 new integration tests verifying that README rendering correctly proxies untrusted images and passes through trusted ones.How it works
resolveImageUrl()toProxiedImageUrl()checks if the domain is trusted/api/registry/image-proxy?url=<encoded_url>