[image_spec]: add FLYTEKIT_USE_DEPOT env var override and docker fallback on depot auth errors#36

Closed
ryanjwong wants to merge 4 commits into master from devin/1771394502-depot-fallback



@ryanjwong ryanjwong commented Feb 18, 2026

Why are the changes needed?

The DEPOT_TOKEN stored in AWS Secrets Manager expired/became invalid, causing all Flyte image builds to fail with permission_denied: Invalid token. Since use_depot=True is the default on ImageSpec, there was no way to bypass depot without code changes, and no fallback mechanism.

What changes were proposed in this pull request?

Refactors DefaultImageBuilder._build_image to add resilience against depot auth failures:

  1. FLYTEKIT_USE_DEPOT env var — overrides ImageSpec.use_depot at runtime. Set FLYTEKIT_USE_DEPOT=false to skip depot entirely without code changes.
  2. Preflight auth check + automatic docker fallback — before starting a build, runs depot projects list to verify credentials. If the preflight detects an auth error (or times out), automatically falls back to docker image build (if docker is available and running). The actual build command still uses run(command, check=True) so stdout/stderr stream in real-time.
  3. Extracted helpers _resolve_use_depot, _build_command, _validate_build_tool, _is_depot_auth_error, and _depot_auth_preflight for testability.
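
A minimal sketch of how the env var override in item 1 might be resolved. The function name, the `env` parameter, and the set of accepted truthy/falsy strings are assumptions for illustration; the PR only confirms that `FLYTEKIT_USE_DEPOT=false` disables depot and that the helper otherwise falls back to `image_spec.use_depot`.

```python
import os
from typing import Mapping, Optional

# Assumed spellings; the PR description only shows "false".
_FALSE_VALUES = {"0", "false", "no", "off"}
_TRUE_VALUES = {"1", "true", "yes", "on"}


def resolve_use_depot(spec_use_depot: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return the effective use_depot flag, letting the env var win when it is set."""
    env = os.environ if env is None else env
    raw = env.get("FLYTEKIT_USE_DEPOT")
    if raw is None:
        return spec_use_depot
    value = raw.strip().lower()
    if value in _FALSE_VALUES:
        return False
    if value in _TRUE_VALUES:
        return True
    # Unrecognized value: keep the ImageSpec setting (an assumption, not confirmed by the PR).
    return spec_use_depot
```

Passing `env` explicitly keeps the helper testable without mutating `os.environ`.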

_depot_auth_preflight returns a tri-state bool | None:

  • True — auth ok, proceed with depot
  • False — auth failed, fall back to docker
  • None — depot not installed, fall through to _validate_build_tool (which raises the proper "not installed" error)
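
The tri-state contract above can be sketched roughly as follows. This is not the PR's implementation: the function signature, the injected `is_auth_error`, `run`, and `which` parameters are hypothetical and exist only to make the sketch testable. The `depot projects list` command and the 15s timeout come from the description.

```python
import shutil
import subprocess
from typing import Callable, Optional


def depot_auth_preflight(
    is_auth_error: Callable[[str], bool],
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    which: Callable[[str], Optional[str]] = shutil.which,
    timeout: float = 15.0,
) -> Optional[bool]:
    """Return True (auth ok), False (fall back to docker), or None (depot missing)."""
    if which("depot") is None:
        # Let the later tool validation raise the proper "not installed" error.
        return None
    try:
        result = run(
            ["depot", "projects", "list"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # A hung preflight is treated like an auth failure so the build can fall back.
        return False
    if result.returncode != 0 and is_auth_error(f"{result.stdout}\n{result.stderr}"):
        return False
    # Success, or a non-auth failure: proceed with depot.
    return True
```

Capturing output is only done for this short preflight call; the build command itself is left to stream normally.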

Updates since last revision

  • Replaced capture_output=True approach with a preflight auth check (_depot_auth_preflight) to avoid buffering build output during long builds. The fallback decision now happens before the build starts, preserving real-time streaming.
  • _depot_auth_preflight now catches subprocess.TimeoutExpired (15s timeout on depot projects list) and returns False to trigger docker fallback instead of crashing the build.
  • _depot_auth_preflight now distinguishes "depot not installed" (None) from "auth failed" (False), so missing-depot errors propagate correctly through _validate_build_tool instead of printing a misleading auth failure message.
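
To illustrate the "decide before the build starts" ordering described in these updates, here is a hedged sketch of the selection step. `choose_build_tool` is a hypothetical name, and the check that docker is available and running is deliberately left out.

```python
from typing import Callable, Optional


def choose_build_tool(use_depot: bool, preflight: Callable[[], Optional[bool]]) -> str:
    """Pick "depot" or "docker" before any build command is launched."""
    if not use_depot:
        # Depot disabled (for example via the env var): skip the preflight entirely.
        return "docker"
    if preflight() is False:
        # Auth failure or timeout: fall back to docker.
        return "docker"
    # True means auth is fine; None means depot is missing, which the later
    # tool validation is expected to report as "not installed".
    return "depot"
```

Because the choice is made up front, the subsequent build call can keep using `run(command, check=True)` without capturing output.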

How was this patch tested?

Inline Python tests verifying:

  • _resolve_use_depot respects env var override and falls back to image_spec.use_depot
  • _is_depot_auth_error matches expected patterns ("permission_denied", "not authorized", "invalid token", "unauthenticated") and rejects unrelated errors
  • _build_command generates correct depot/docker commands, includes --project for nix, skips --push for nix
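
As an illustration of the second check, an inline test for the auth-error matcher might look like the sketch below. The four patterns are the ones listed above; the case-insensitive substring match and the sample messages (other than `permission_denied: Invalid token`) are assumptions.

```python
AUTH_ERROR_PATTERNS = (
    "permission_denied",
    "not authorized",
    "invalid token",
    "unauthenticated",
)


def is_depot_auth_error(output: str) -> bool:
    """Return True if the command output looks like a depot authentication failure."""
    lowered = output.lower()
    return any(pattern in lowered for pattern in AUTH_ERROR_PATTERNS)


# Table-driven inline checks: (sample output, expected result).
CASES = [
    ("permission_denied: Invalid token", True),
    ("Error: not authorized", True),
    ("UNAUTHENTICATED", True),
    ("connection reset by peer", False),
    ("", False),
]
for text, expected in CASES:
    assert is_depot_auth_error(text) is expected, text
```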

No integration tests for the actual fallback path (requires depot installed with an invalid token).

Human review checklist

  • Preflight latency on happy path: _depot_auth_preflight runs depot projects list (with 15s timeout) on every build where use_depot=True. This adds a network round-trip even when depot auth is fine. Consider whether caching the preflight result or skipping the check would be better.
  • Non-auth depot failures still proceed: If depot projects list fails with a non-auth error (e.g. rate limit, 500), _depot_auth_preflight returns True and the build proceeds with depot. The build could then fail for the same transient reason.
  • Race between shutil.which and run: _validate_build_tool checks shutil.which("docker") then calls run(["docker", "info"]). The original code wrapped this in a broader try/except; the new code lets FileNotFoundError propagate uncaught if the binary disappears between those calls.
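
One possible way to close the race in the last item is to catch `FileNotFoundError` around the `docker info` call. The sketch below is a suggestion, not the PR's code: `validate_docker`, its injectable `run` and `which` parameters, and the error messages are all hypothetical.

```python
import shutil
import subprocess
from typing import Callable, Optional


def validate_docker(
    run: Callable[..., object] = subprocess.run,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> None:
    """Raise RuntimeError if docker is missing or its daemon is not reachable."""
    if which("docker") is None:
        raise RuntimeError("docker is not installed")
    try:
        run(["docker", "info"], check=True, capture_output=True)
    except FileNotFoundError as exc:
        # The binary vanished between the which() lookup and the run() call.
        raise RuntimeError("docker is not installed") from exc
    except subprocess.CalledProcessError as exc:
        raise RuntimeError("docker is installed but not running") from exc
```

Mapping both failure modes to the same exception type means callers need only one handler.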

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Link to Devin run | Requested by: @ryanjwong

…fallback on depot auth errors

Co-Authored-By: ryan@exa.ai <ryanjwong007@gmail.com>
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

…rve build streaming

Co-Authored-By: ryan@exa.ai <ryanjwong007@gmail.com>
devin-ai-integration[bot]

This comment was marked as resolved.

Co-Authored-By: ryan@exa.ai <ryanjwong007@gmail.com>
devin-ai-integration[bot]

This comment was marked as resolved.

…led (False) in preflight

Co-Authored-By: ryan@exa.ai <ryanjwong007@gmail.com>
@ryanjwong ryanjwong closed this Mar 2, 2026