Add aggregate runner log and run-count safety limits #13

@ifsheldon

Summary

Add platform-owned aggregate safety limits for evaluation logs and challenge run manifests. This is not required for the small-scale MVP launch, but should be tracked before broader public usage.

Background

The runner already enforces a mandatory per-container Docker log cap and truncates collected container logs before persistence. We should deliberately not expose log_limit_bytes as a challenge-owner setting, because log caps are platform safety policy rather than benchmark semantics.

The remaining gap is aggregate size: one evaluation can involve multiple setup/build/run/scorer/prepare containers, and challenge-owned static or prepared run manifests can contain many runs. Even with per-container caps, total persisted runner.log size and total invocation count can grow with the number of containers/runs.
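To make the growth concrete, here is a rough worst-case estimate, assuming every container fills its per-container cap. The function name, the 1 MiB cap, and the container counts are hypothetical illustrations, not values taken from the runner.

```python
def worst_case_persisted_log_bytes(
    per_container_cap_bytes: int,
    containers_per_run: int,
    run_count: int,
    shared_containers: int = 0,
) -> int:
    """Upper bound on persisted log size when only per-container caps apply."""
    total_containers = shared_containers + containers_per_run * run_count
    return per_container_cap_bytes * total_containers


MIB = 1024 * 1024

# Hypothetical evaluation: 3 shared containers (e.g. setup, build, prepare)
# plus 2 containers per run (e.g. run and scorer), each capped at 1 MiB.
small_manifest = worst_case_persisted_log_bytes(
    MIB, containers_per_run=2, run_count=5, shared_containers=3
)
large_manifest = worst_case_persisted_log_bytes(
    MIB, containers_per_run=2, run_count=500, shared_containers=3
)

print(small_manifest // MIB, "MiB")  # 13 containers at the cap
print(large_manifest // MIB, "MiB")  # 1003 containers at the cap
```

Under these assumed numbers the per-container cap stays constant while the total scales linearly with the run count, which is the gap an aggregate cap would close.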

Proposed work

  • Add a platform-owned maximum run count for static and prepared run manifests.
  • Add an aggregate per-evaluation persisted log cap.
  • Keep per-container Docker log caps mandatory and platform-owned.
  • Make validation/prepared-manifest errors clear when limits are exceeded.
  • Document the limits in operations and solution protocol docs.
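A minimal sketch of how the first two items might look, assuming limits live in a platform-owned settings object. All names here (PlatformLimits, ManifestLimitError, validate_run_count, AggregateLogBudget) are hypothetical and do not refer to existing runner code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformLimits:
    """Hypothetical platform-owned limits, set by admins, not challenge owners."""
    max_runs_per_manifest: int
    max_evaluation_log_bytes: int


class ManifestLimitError(ValueError):
    """Raised when a static or prepared run manifest exceeds a platform limit."""


def validate_run_count(manifest_kind: str, run_count: int, limits: PlatformLimits) -> None:
    """Reject a manifest whose run count exceeds the platform maximum."""
    if run_count > limits.max_runs_per_manifest:
        raise ManifestLimitError(
            f"{manifest_kind} run manifest has {run_count} runs, which exceeds "
            f"the platform limit of {limits.max_runs_per_manifest}"
        )


class AggregateLogBudget:
    """Tracks the remaining persisted-log budget for one evaluation."""

    def __init__(self, max_bytes: int) -> None:
        self.remaining = max_bytes

    def take(self, chunk: bytes) -> bytes:
        """Return the part of chunk that still fits and charge it to the budget."""
        kept = chunk[: self.remaining]
        self.remaining -= len(kept)
        return kept


limits = PlatformLimits(max_runs_per_manifest=3, max_evaluation_log_bytes=10)
validate_run_count("static", 3, limits)  # within the limit, no error

budget = AggregateLogBudget(limits.max_evaluation_log_bytes)
first = budget.take(b"abcdef")   # fits entirely
second = budget.take(b"ghijkl")  # only 4 bytes of budget remain
third = budget.take(b"x")        # budget exhausted
```

The per-container Docker cap would still apply before logs reach this budget; the budget only bounds the sum across containers. A real implementation would probably also append a truncation notice so readers can tell that output was dropped.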

Non-goals

  • Do not add log_limit_bytes or similar knobs to challenge-owner configs.
  • Do not make these limits benchmark semantics. They should remain admin/platform policy.
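One way to keep that boundary explicit is to reject platform-owned keys during challenge config validation. This is only a sketch under that assumption: the key set and the function name are hypothetical, and the issue itself does not say whether such keys should be rejected or simply ignored.

```python
# Hypothetical set of keys reserved for platform policy.
PLATFORM_OWNED_KEYS = frozenset({"log_limit_bytes"})


def reject_platform_owned_keys(challenge_config: dict) -> None:
    """Raise if a challenge-owner config tries to set a platform-owned limit."""
    offending = sorted(PLATFORM_OWNED_KEYS.intersection(challenge_config))
    if offending:
        raise ValueError(
            "challenge config must not set platform-owned keys: "
            + ", ".join(offending)
        )


reject_platform_owned_keys({"name": "demo-challenge"})  # accepted

try:
    reject_platform_owned_keys({"name": "demo-challenge", "log_limit_bytes": 1})
except ValueError as exc:
    message = str(exc)
    print(message)
```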

Priority

Post-MVP hardening. The small-scale MVP launch can proceed without this, but the issue should be resolved before a broader public launch.
