Skip to content

feat(artifacts): handle HTTP redirects for presigned cloud-storage URLs#196

Open
devin-ai-integration[bot] wants to merge 1 commit into
mainfrom
devin/1781146607-artifacts-redirect-handling
Open

feat(artifacts): handle HTTP redirects for presigned cloud-storage URLs#196
devin-ai-integration[bot] wants to merge 1 commit into
mainfrom
devin/1781146607-artifacts-redirect-handling

Conversation

@devin-ai-integration

@devin-ai-integration devin-ai-integration Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Summary

When the /artifacts API responds with a redirect (301–308) to a presigned cloud-storage URL (S3, GCS, etc.), the SDK now follows it without forwarding the Authorization header — presigned URLs carry their own auth via query params and reject requests that also include a Bearer token.

Key changes:

  • APIRequestor.request() — new allow_redirects parameter (default True) passed through to requests.Session.request().
  • Artifacts.get() — calls the API with allow_redirects=False, detects 3xx responses, and delegates to _follow_redirect() which fetches the Location URL using only browser-like headers:
    response, status_code, headers = self._requestor.request(..., allow_redirects=False)
    if status_code in _REDIRECT_STATUS_CODES:
        redirect_resp = _follow_redirect(headers["Location"])
        response, headers = redirect_resp.content, dict(redirect_resp.headers)
  • _filename_from_url() — replaces hand-rolled path.name.split("?")[0] with urlparse-based extraction, fixing a double-extension bug where url artifacts were saved as e.g. file.jpg.jpg.
  • Content-type assertions now accept application/octet-stream for file-type artifacts (vid/aud/doc/recon) since cloud storage may not preserve the original MIME type.
  • 13 new tests covering redirect flows (raw, image, video, doc), filename extraction from presigned URLs, missing Location header error, and application/octet-stream tolerance.

Link to Devin session: https://app.devin.ai/sessions/91bd0ee339014c0689a7ea5112830c6f
Requested by: @spillai


Open in Devin Review

- Add allow_redirects parameter to APIRequestor.request() to let callers
  disable automatic redirect following when auth headers would leak.
- Artifacts.get() now uses allow_redirects=False and manually follows
  redirect responses (301-308) via _follow_redirect(), which strips the
  Authorization header before fetching from presigned S3/GCS URLs.
- Replace hand-rolled query-param stripping with _filename_from_url()
  using urlparse for robust presigned-URL filename extraction.
- Fix double-extension bug in url artifact file naming (was foo.jpg.jpg).
- Accept application/octet-stream Content-Type for redirected responses
  since cloud storage may not preserve the original MIME type.
- Add comprehensive tests for redirect handling, filename extraction, and
  content-type tolerance.

Co-Authored-By: Sudeep Pillai <sudeep.pillai@gmail.com>
@devin-ai-integration

Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment, CI, and merge conflict monitoring

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 3 potential issues.

View 3 additional findings in Devin Review.

Open in Devin Review

Comment on lines +147 to +149
raise ValueError(
f"Received redirect ({status_code}) without a Location header"
)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 New code raises ValueError for API-related error, violating AGENTS.md rule

The AGENTS.md rule states: "Use the existing exception hierarchy in vlmrun/client/exceptions.py. Do not raise generic Exception or ValueError for API-related errors." The new redirect-handling code at vlmrun/client/artifacts.py:147-149 raises a plain ValueError when the server responds with a redirect status code but no Location header. This is an API-related error (malformed server response) and should use the custom exception hierarchy — e.g. APIError from vlmrun/client/exceptions.py. This means callers wrapping the call with except APIError would miss this error, leading to unhandled exceptions.

Prompt for agents
In vlmrun/client/artifacts.py, lines 147-149, a ValueError is raised when a redirect response lacks a Location header. Per the AGENTS.md rule, API-related errors must use the custom exception hierarchy from vlmrun/client/exceptions.py. Replace the ValueError with an appropriate exception such as APIError (imported from vlmrun.client.exceptions). For example:

from vlmrun.client.exceptions import APIError

raise APIError(
    message=f"Received redirect ({status_code}) without a Location header",
    http_status=status_code,
    headers=headers,
    suggestion="The API returned a redirect response without specifying a target URL.",
)

Also update the docstring Raises section (currently lists ValueError for this case) to reflect the new exception type.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +192 to +194
assert (
content_type in allowed_img_types
), f"Expected one of {allowed_img_types}, got {content_type}"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 assert used for content-type validation in production code for img type

At vlmrun/client/artifacts.py:192-194, the img content-type check uses assert, which is silently stripped when Python runs with -O (optimize mode). For the vid/aud/doc/recon types (lines 215-222), the PR correctly replaced the old assert with a logger.warning, but didn't apply the same treatment to the img branch. This inconsistency means the img path has no content-type validation at all under -O. The pre-existing assert was modified (to add application/octet-stream to the allowed set), so while the mechanism is pre-existing, the inconsistency with the other branches is new.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +215 to +222
if actual_content_type not in (
expected_content_type,
"application/octet-stream",
):
logger.warning(
f"Unexpected Content-Type for {obj_type} artifact: "
f"expected {expected_content_type}, got {actual_content_type}"
)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Content-type validation downgraded from assertion to warning for file artifacts

For vid/aud/doc/recon artifact types, the old code used assert actual_content_type == expected_content_type which would crash on mismatch. The new code at vlmrun/client/artifacts.py:215-222 only logs a warning and proceeds to write the file regardless. This is a deliberate behavioral change to handle cloud storage providers that return application/octet-stream instead of specific MIME types. While this improves robustness for the redirect case, it also means that a completely wrong content type (e.g., text/html for a video artifact, possibly an error page) will be silently accepted and written to disk. This trade-off seems intentional but worth noting.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant