fix(security): validate IPs on every redirect hop to prevent SSRF bypass #5711

Open
iris-clawd wants to merge 2 commits into main from fix/oss-51-ssrf-redirect-bypass

Conversation

@iris-clawd
Contributor

Summary

Fixes OSS-51 — SSRF bypass via redirect following in scraping tools.

Problem

validate_url() in crewai_tools/security/safe_path.py checked only the initial URL before requests.get() was called. Because requests.get() follows redirects by default, an attacker could host a public URL that 302-redirects to an internal IP (e.g. 169.254.169.254, the cloud metadata endpoint) — the validator never saw the redirect target.

This was reported by Casco Security and confirmed end-to-end on a live app.crewai.com deployment worker (CVE-2026-2286 follow-up).
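The bypass pattern is easy to reproduce locally. The sketch below stands up a throwaway HTTP server playing the attacker's public URL (the `AttackerPage` handler and port choice are illustrative, not from the report) and fetches it with `allow_redirects=False` to expose the hop that a check on the initial URL alone never inspects:

```python
import http.server
import threading

import requests


class AttackerPage(http.server.BaseHTTPRequestHandler):
    """Stands in for the attacker's public URL: 302-redirects inward."""

    def do_GET(self):
        self.send_response(302)
        self.send_header("Location", "http://169.254.169.254/latest/meta-data/")
        self.end_headers()

    def log_message(self, *args):
        pass  # silence per-request logging


server = http.server.HTTPServer(("127.0.0.1", 0), AttackerPage)
threading.Thread(target=server.serve_forever, daemon=True).start()

# requests.get() follows redirects by default, so validating only the
# initial URL never inspects the Location target. Fetch without
# following to see the hop a per-hop validator must check:
resp = requests.get(f"http://127.0.0.1:{server.server_port}/",
                    allow_redirects=False)
print(resp.status_code, resp.headers["Location"])
server.shutdown()
```

With redirects enabled (the default), that `Location` would be fetched silently, after the original validator had already approved the request.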

Fix

Added a custom HTTPAdapter (_SSRFSafeAdapter) that intercepts every request — including redirect hops — and validates the resolved IP against the private/reserved blocklist before the connection proceeds.

New public API:

  • safe_request_session() — returns a requests.Session with the SSRF-safe adapter mounted
  • safe_get(url, **kwargs) — drop-in replacement for requests.get() that validates both the initial URL and every redirect destination
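The diff itself isn't inlined in this description; the following is a minimal sketch of the mechanism it describes, reusing the names from this PR (`_SSRFSafeAdapter`, `safe_request_session`, `safe_get`) but with bodies that are assumptions, not the merged code:

```python
import ipaddress
import socket
from urllib.parse import urlparse

import requests
from requests.adapters import HTTPAdapter


class _SSRFSafeAdapter(HTTPAdapter):
    """Validate the resolved IP before each send(). Redirect hops are
    resolved through the mounted adapter too, so every hop is checked,
    not just the initial URL."""

    def send(self, request, **kwargs):
        host = urlparse(request.url).hostname or ""
        for info in socket.getaddrinfo(host, None):
            ip = ipaddress.ip_address(info[4][0])
            if (ip.is_private or ip.is_loopback or ip.is_link_local
                    or ip.is_reserved or ip.is_multicast):
                raise requests.exceptions.ConnectionError(
                    f"Blocked request to non-public address {ip} ({host})")
        return super().send(request, **kwargs)


def safe_request_session() -> requests.Session:
    """Session with the SSRF-safe adapter mounted for both schemes."""
    session = requests.Session()
    adapter = _SSRFSafeAdapter()
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def safe_get(url: str, **kwargs) -> requests.Response:
    """Drop-in replacement for requests.get(); every hop is validated."""
    with safe_request_session() as session:
        return session.get(url, **kwargs)
```

The key design point is that `requests` funnels each redirect through the mounted adapter's `send()`, so no hop can skip the check; the actual patch presumably also handles DNS resolution failures and the escape hatch described under Tests.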

Updated tools:

  • ScrapeWebsiteTool — now uses safe_get()
  • ScrapeElementFromWebsiteTool — now uses safe_get()
  • WebPageLoader (RAG) — now uses safe_get()

Tests

5 new tests in test_safe_path.py:

  • ✅ Public URL allowed through
  • ✅ Redirect to localhost (127.0.0.1) blocked
  • ✅ Redirect to cloud metadata (169.254.169.254) blocked
  • ✅ Redirect to private range (10.0.0.1) blocked
  • ✅ Escape hatch (CREWAI_TOOLS_ALLOW_UNSAFE_PATHS=true) bypasses redirect check

All 32 tests pass (27 existing + 5 new).
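The escape-hatch test implies an environment-variable gate. A hypothetical sketch of such a check (the variable name comes from the test list above; the parsing rules are an assumption):

```python
import os


def unsafe_paths_allowed() -> bool:
    """Hypothetical helper: True when the operator explicitly opts out
    of SSRF validation via CREWAI_TOOLS_ALLOW_UNSAFE_PATHS=true."""
    value = os.getenv("CREWAI_TOOLS_ALLOW_UNSAFE_PATHS", "false")
    return value.strip().lower() in ("1", "true", "yes")
```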

Note

Other tools that use third-party scraping services (Firecrawl, Spider, Scrapfly, etc.) are not affected — they delegate fetching to external APIs rather than using requests.get() directly. The RAG PDF/docs loaders (pdf_loader.py, utils.py) also use bare requests.get() and could benefit from the same treatment in a follow-up.

Closes OSS-51

fix(security): validate IPs on every redirect hop to prevent SSRF bypass (OSS-51)

Adds a custom HTTPAdapter (_SSRFSafeAdapter) that intercepts every
request — including redirect hops — and validates the resolved IP
against the private/reserved blocklist before the connection proceeds.

New public API:
- safe_request_session(): returns a Session with the adapter mounted
- safe_get(url, **kwargs): drop-in replacement for requests.get() that
  validates the initial URL AND every redirect destination

Updated tools to use safe_get() instead of validate_url() + requests.get():
- ScrapeWebsiteTool
- ScrapeElementFromWebsiteTool
- WebPageLoader (RAG)

Closes OSS-51

linear Bot commented May 5, 2026

github-actions Bot added the size/M label May 5, 2026
…ocks

- Add proper type annotations to _SSRFSafeAdapter.send() to satisfy mypy
- Add 'Any' import from typing
- Update webpage_loader tests to mock safe_get instead of requests.get
  (the loader now uses safe_get for SSRF protection)