trbl benchmark executing not picked

Takumi Yanagawa edited this page Aug 14, 2025 · 1 revision

Benchmark Executing but Harness Not Picking It Up

If the benchmark status shows Executing but you are sure that no agent harness is working on it, it may be stuck. A typical agent harness log in this case looks like:

[2025-08-14 05:12:29,858 INFO itbench_utilities.agent_harness.agent] The benchmark statuses: ['e989f744-c730-47c0-b251-9158656c5237: Finished', 'deeb73cf-887d-4bc6-9052-e8f238e1d86b: Executing']
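To see which benchmarks the harness still reports as Executing, the status line can be filtered with standard tools. The following is a minimal sketch, assuming the log format matches the sample line above. The function name `executing_ids` is hypothetical and not part of ITBench.

```shell
# Hypothetical helper (not part of ITBench): read agent harness log lines on
# stdin and print the ID of every benchmark reported as "Executing".
# Assumes benchmark IDs are UUIDs formatted as in the sample log line.
executing_ids() {
    grep -oE '[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}: Executing' |
        sed 's/: Executing$//' |
        sort -u
}
```

For example, if the harness output was saved to a file (the file name here is an assumption), `executing_ids < agent_harness.log` prints one stuck-candidate ID per line.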

How to resolve

  • Option 1: Create and approve a new benchmark (by opening a benchmark GitHub issue). The agent harness will pick it up and start benchmarking. If the ITBench service resources are still occupied by the previous benchmark, it may take a while to start.

  • Option 2: Manually release the stuck benchmark (requires appropriate permissions; customers cannot do this themselves).

How to Manually Release a Benchmark

  1. Set shell variables

    endpoint="https://itbench.apps.prod.itbench.res.ibm.com/bench-server"
    agent_token="<token in agent-manifest.json>"
    manifest_endpoint="<manifest_endpoint in agent-manifest.json>"
    benchmark_id="<benchmark_id; you can find it in the comments of the benchmark creation GitHub issue>"
    
  2. Call the API

    curl -X PUT \
        -H "Authorization: Bearer $agent_token" \
        -H "Content-type: application/json" \
        "$endpoint$manifest_endpoint/benchmark-entries/$benchmark_id" \
        -d '{"phase": "NotStarted"}'
    
  3. Verify the state

    The state should change from Executing to NotStarted.

    The agent harness will then pick up this benchmark again and resume benchmarking from the scenario that is ready.
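The release steps can be wrapped in a small script that refuses to call the API when a value is empty, since an empty `manifest_endpoint` or `benchmark_id` would produce the wrong URL. This is a sketch, not an official tool. The function names `release_url` and `release_benchmark` are hypothetical, and only the request shape is taken from the curl command in step 2.

```shell
# Hypothetical helpers (not part of ITBench). They only wrap the documented
# PUT request and add a check that every input is non-empty.

# Print the benchmark-entry URL built from endpoint, manifest_endpoint and
# benchmark_id (arguments 1-3); fail if any of them is empty.
release_url() {
    if [ -z "${1:-}" ] || [ -z "${2:-}" ] || [ -z "${3:-}" ]; then
        echo "endpoint, manifest_endpoint and benchmark_id must all be set" >&2
        return 1
    fi
    printf '%s%s/benchmark-entries/%s\n' "$1" "$2" "$3"
}

# Send the documented request using the shell variables from step 1.
# --fail makes curl exit non-zero when the server returns an HTTP error.
release_benchmark() {
    if [ -z "${agent_token:-}" ]; then
        echo "agent_token must be set" >&2
        return 1
    fi
    release_target=$(release_url "${endpoint:-}" "${manifest_endpoint:-}" "${benchmark_id:-}") || return 1
    curl --fail --silent --show-error -X PUT \
        -H "Authorization: Bearer $agent_token" \
        -H "Content-type: application/json" \
        "$release_target" \
        -d '{"phase": "NotStarted"}'
}
```

After setting the four variables from step 1, running `release_benchmark` sends the same request as step 2. A non-zero exit status means either a variable was empty or the server rejected the request.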