feat(artifacts): handle HTTP redirects for presigned cloud-storage URLs#196
feat(artifacts): handle HTTP redirects for presigned cloud-storage URLs#196devin-ai-integration[bot] wants to merge 1 commit into
Conversation
- Add allow_redirects parameter to APIRequestor.request() to let callers disable automatic redirect following when auth headers would leak. - Artifacts.get() now uses allow_redirects=False and manually follows redirect responses (301-308) via _follow_redirect(), which strips the Authorization header before fetching from presigned S3/GCS URLs. - Replace hand-rolled query-param stripping with _filename_from_url() using urlparse for robust presigned-URL filename extraction. - Fix double-extension bug in url artifact file naming (was foo.jpg.jpg). - Accept application/octet-stream Content-Type for redirected responses since cloud storage may not preserve the original MIME type. - Add comprehensive tests for redirect handling, filename extraction, and content-type tolerance. Co-Authored-By: Sudeep Pillai <sudeep.pillai@gmail.com>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
| raise ValueError( | ||
| f"Received redirect ({status_code}) without a Location header" | ||
| ) |
There was a problem hiding this comment.
🟡 New code raises ValueError for API-related error, violating AGENTS.md rule
The AGENTS.md rule states: "Use the existing exception hierarchy in vlmrun/client/exceptions.py. Do not raise generic Exception or ValueError for API-related errors." The new redirect-handling code at vlmrun/client/artifacts.py:147-149 raises a plain ValueError when the server responds with a redirect status code but no Location header. This is an API-related error (malformed server response) and should use the custom exception hierarchy — e.g. APIError from vlmrun/client/exceptions.py. This means callers wrapping the call with except APIError would miss this error, leading to unhandled exceptions.
Prompt for agents
In vlmrun/client/artifacts.py, lines 147-149, a ValueError is raised when a redirect response lacks a Location header. Per the AGENTS.md rule, API-related errors must use the custom exception hierarchy from vlmrun/client/exceptions.py. Replace the ValueError with an appropriate exception such as APIError (imported from vlmrun.client.exceptions). For example:
from vlmrun.client.exceptions import APIError
raise APIError(
message=f"Received redirect ({status_code}) without a Location header",
http_status=status_code,
headers=headers,
suggestion="The API returned a redirect response without specifying a target URL.",
)
Also update the docstring Raises section (currently lists ValueError for this case) to reflect the new exception type.
Was this helpful? React with 👍 or 👎 to provide feedback.
| assert ( | ||
| content_type in allowed_img_types | ||
| ), f"Expected one of {allowed_img_types}, got {content_type}" |
There was a problem hiding this comment.
🚩 assert used for content-type validation in production code for img type
At vlmrun/client/artifacts.py:192-194, the img content-type check uses assert, which is silently stripped when Python runs with -O (optimize mode). For the vid/aud/doc/recon types (lines 215-222), the PR correctly replaced the old assert with a logger.warning, but didn't apply the same treatment to the img branch. This inconsistency means the img path has no content-type validation at all under -O. The pre-existing assert was modified (to add application/octet-stream to the allowed set), so while the mechanism is pre-existing, the inconsistency with the other branches is new.
Was this helpful? React with 👍 or 👎 to provide feedback.
| if actual_content_type not in ( | ||
| expected_content_type, | ||
| "application/octet-stream", | ||
| ): | ||
| logger.warning( | ||
| f"Unexpected Content-Type for {obj_type} artifact: " | ||
| f"expected {expected_content_type}, got {actual_content_type}" | ||
| ) |
There was a problem hiding this comment.
🚩 Content-type validation downgraded from assertion to warning for file artifacts
For vid/aud/doc/recon artifact types, the old code used assert actual_content_type == expected_content_type which would crash on mismatch. The new code at vlmrun/client/artifacts.py:215-222 only logs a warning and proceeds to write the file regardless. This is a deliberate behavioral change to handle cloud storage providers that return application/octet-stream instead of specific MIME types. While this improves robustness for the redirect case, it also means that a completely wrong content type (e.g., text/html for a video artifact, possibly an error page) will be silently accepted and written to disk. This trade-off seems intentional but worth noting.
Was this helpful? React with 👍 or 👎 to provide feedback.
Summary
When the
/artifactsAPI responds with a redirect (301–308) to a presigned cloud-storage URL (S3, GCS, etc.), the SDK now follows it without forwarding theAuthorizationheader — presigned URLs carry their own auth via query params and reject requests that also include a Bearer token.Key changes:
APIRequestor.request()— newallow_redirectsparameter (defaultTrue) passed through torequests.Session.request().Artifacts.get()— calls the API withallow_redirects=False, detects 3xx responses, and delegates to_follow_redirect()which fetches theLocationURL using only browser-like headers:_filename_from_url()— replaces hand-rolledpath.name.split("?")[0]withurlparse-based extraction, fixing a double-extension bug whereurlartifacts were saved as e.g.file.jpg.jpg.application/octet-streamfor file-type artifacts (vid/aud/doc/recon) since cloud storage may not preserve the original MIME type.Locationheader error, andapplication/octet-streamtolerance.Link to Devin session: https://app.devin.ai/sessions/91bd0ee339014c0689a7ea5112830c6f
Requested by: @spillai