Handle Oz API rate limits during post-run apply #435

@captainsafia

Summary

A respond-to-pr-comment run can push its commits successfully, yet the Vercel cron post-run apply step can still overwrite the progress comment with “I ran into an unexpected error” if the Oz API returns 429 Too Many Requests while that step is applying optional artifacts or results.

Example

PR: warpdotdev/warp#10204 (comment)

Target run:

  • Workflow: respond-to-pr-comment
  • Run ID: 019dfac9-bf0e-7e6f-ad51-96f909d9dc47
  • Oz run state: SUCCEEDED
  • Completed at: 2026-05-06T01:01:38.065636Z
  • Pushed commit: 4b4e8d7a5f6f3c6a0a3b988bf14ad3670ad82909
  • Uploaded artifact: resolved_review_comments.json (019dfacd-6c13-7dc4-8be2-d8d5b8bcad19)

Despite that, the progress reply was updated to:

I ran into an unexpected error while working on this.

The Vercel KV state for the run remained in-flight with:

  • attempts: 6
  • workflow: respond-to-pr-comment
  • last_error: apply failed: <!doctype html>...<title>429</title>429 Too Many Requests

Expected behavior

A transient Oz API 429 while applying a successful run should not make the user-facing result look like the agent failed after it already pushed commits.

At minimum, optional artifact loading should degrade gracefully, or retry with backoff, when the Oz API rate-limits requests. This matters most for optional artifacts like:

  • pr-metadata.json
  • resolved_review_comments.json
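
One way the retry-then-degrade behavior could look is sketched below. This is a minimal illustration, not the project's code: `RateLimited` is a hypothetical stand-in for the SDK's rate-limit error, and `load_optional_with_backoff`, its parameters, and the delay values are all assumptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for the SDK's rate-limit error."""


def load_optional_with_backoff(
    loader: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call loader(), retrying on RateLimited with exponential backoff.

    Returns None if every attempt is rate-limited, so the caller can
    treat the optional artifact as absent instead of failing the apply.
    """
    for attempt in range(max_attempts):
        try:
            return loader()
        except RateLimited:
            if attempt == max_attempts - 1:
                # Out of retries: degrade to "artifact unavailable".
                return None
            # Delays grow as base_delay * 1, 2, 4, ...
            sleep(base_delay * (2 ** attempt))
    return None
```

Because the apply step runs inside a cron invocation, the total retry budget would probably need to stay well under the function timeout; the defaults above are placeholders, not tuned values.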

Likely cause

try_load_pr_metadata_artifact() and try_load_resolved_review_comments_artifact() catch RuntimeError, ValueError, and httpx.HTTPError, but not oz_agent_sdk.RateLimitError or broader Oz SDK API errors. A 429 from client.agent.runs.retrieve(...) or client.agent.get_artifact(...) can therefore escape the optional artifact helper and abort apply_pr_comment_result().
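
The gap can be illustrated with a small, dependency-free sketch. It assumes, as the analysis above implies, that the SDK's rate-limit error does not derive from any of the currently caught types. `FakeSDKError`, `FakeRateLimitError`, `try_load`, and the two tuples are hypothetical names for illustration only; they are not the real helpers or the real SDK classes.

```python
from typing import Callable, Optional


class FakeSDKError(Exception):
    """Hypothetical stand-in for a broad Oz SDK API error base class."""


class FakeRateLimitError(FakeSDKError):
    """Hypothetical stand-in for oz_agent_sdk.RateLimitError."""


# Mirrors the exception types the report says are caught today
# (httpx.HTTPError is omitted to keep the sketch dependency-free).
CURRENT_CAUGHT = (RuntimeError, ValueError)
# Assumed fix: also treat SDK API errors as "artifact unavailable".
WIDENED_CAUGHT = CURRENT_CAUGHT + (FakeSDKError,)


def try_load(fetch: Callable[[], dict], caught: tuple) -> Optional[dict]:
    """Return the artifact, or None when fetch fails with a caught error."""
    try:
        return fetch()
    except caught:
        return None


def rate_limited_fetch() -> dict:
    # Simulates a 429 surfacing from the SDK call.
    raise FakeRateLimitError("429 Too Many Requests")
```

With `CURRENT_CAUGHT`, calling `try_load(rate_limited_fetch, CURRENT_CAUGHT)` lets the error propagate to the caller; with `WIDENED_CAUGHT` the same call returns `None`, which is the behavior an optional helper is presumably meant to have.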

Relevant flow:

  • core/poll_runs.py calls handler.result_applier(...)
  • core/workflow_adapters.py catches apply exceptions and calls progress.report_error()
  • core/workflows/respond_to_pr_comment.py calls optional artifact helpers while applying successful runs
  • oz/artifacts.py optional artifact helpers do not currently catch Oz SDK rate-limit errors
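
The propagation path in the list above can be reproduced in miniature. Every name below is a simplified, hypothetical stand-in for the role played by the corresponding file; none of it is the actual implementation.

```python
from typing import Callable, List, Optional


class FakeRateLimitError(Exception):
    """Hypothetical stand-in for the SDK's rate-limit error."""


class Progress:
    """Records what would be written to the progress comment."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def report_error(self) -> None:
        self.messages.append("I ran into an unexpected error while working on this.")


def load_optional_artifact() -> Optional[dict]:
    # oz/artifacts.py role: the helper does not catch the rate-limit error.
    raise FakeRateLimitError("429 Too Many Requests")


def apply_result() -> None:
    # core/workflows/respond_to_pr_comment.py role: apply a SUCCEEDED run.
    load_optional_artifact()


def adapter(applier: Callable[[], None], progress: Progress) -> Optional[str]:
    # core/workflow_adapters.py role: any apply exception becomes a
    # user-facing error, regardless of whether the run itself succeeded.
    try:
        applier()
        return None
    except Exception as exc:
        progress.report_error()
        return f"apply failed: {exc}"


progress = Progress()
last_error = adapter(apply_result, progress)  # core/poll_runs.py role
```

In this sketch the error message lands in `progress.messages` even though nothing about the run itself failed, which matches the symptom described in the example.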

Notes

There were also unrelated /api/cron 504s around the same window from a stuck review-pull-request apply failure (019dd819...) with GitHub 422 “Line could not be resolved”, which likely contributed to cron pressure. The specific respond-to-pr-comment run above, however, succeeded and pushed its commit; the failure was in post-run apply.

Metadata

Labels

  • area:workflow (GitHub workflows, Python automation, or Oz integration)
  • bug (Something isn't working)
  • repro:medium (The report looks partially reproducible but has some uncertainty)
  • triaged (Initial Oz triage has been completed for this issue)
