Pull requests: OpenHands/benchmarks

Pull requests list

Capture SWE-bench patches from failed runs
#751 opened Jun 12, 2026 by neubig Member Draft
do not retry for max-limit or stuck-in-loop errors
#750 opened Jun 12, 2026 by yiqingxyq
Capture SWE-bench patches on failed runs
#748 opened Jun 10, 2026 by neubig Member Draft
Add per-instance cost cap to swe-bench runner
#741 opened Jun 8, 2026 by juanmichelini Collaborator
build(deps): bump the version-all group across 1 directory with 4 updates (labels: dependencies, github_actions)
#732 opened Jun 1, 2026 by dependabot Bot
[codex] Add EvoClaw benchmark inference
#705 opened May 7, 2026 by xingyaoww Member Draft
fix: reset BuildKit cache between retries for base/assembly builds
#631 opened Apr 4, 2026 by simonrosenberg Member
3 tasks
Update Claude ACP package references
#629 opened Apr 3, 2026 by simonrosenberg Member
build(deps): bump the version-all group across 1 directory with 21 updates (labels: dependencies, python:uv)
#596 opened Mar 31, 2026 by dependabot Bot
NeMo Evaluator Integration
#455 opened Feb 26, 2026 by simonrosenberg Member
Add security benchmark with ASTRA
#361 opened Jan 26, 2026 by XZ-X
Agentic code search
#141 opened Dec 8, 2025 by adityasoni9998 Contributor