Skip to content

Releases: tinycomputerai/seaport

v0.4.0

12 Jun 08:14

Choose a tag to compare

0.4.0 (2026-06-12)

Features

  • add timeout multiplier flags for slow hosts (7be54a0)
  • agent setup hook and host-credential passthrough (374b68f)
  • collect task-declared artifacts from the trial container (5cfa485)
  • curated clean-room verifier, per-step separate verifier and setup.sh (df75edb)
  • emit harbor-compatible per-key reward stats in the job result (286fd81)
  • environment and per-step healthcheck (d5fd189)
  • honor task-configured agent/verifier user (46c5aa2)
  • make --strict-resources mirror harbor cpu/memory limits (76a08c7)
  • model verifier rewards as a named map (6e508a6)
  • run multi-step tasks (86277f3)
  • run the verifier in a separate environment when declared (f8519ed)
  • support reward.json and fractional rewards (bdc9ad6)
  • whole-trial retries with a non-retryable default set (cadba66)

Bug Fixes

  • give image pulls at least 30 minutes (19ca62e)
  • honor image WORKDIR and mount task solution at /solution (36e3659)
  • let the verifier reward decide the trial, not the script exit code (1a9a105)
  • request explicit platform for prebuilt foreign-arch images (a9837b9)
  • sweep orphaned trial containers and quiet cleanup races (f01f4f7)

Performance Improvements

  • give each trial a fair share of host CPUs; lower default concurrency (d9c40d4)
  • move /app to docker volumes, boost task resource caps, drop preflight barrier (079c957)

v0.3.0

09 Jun 16:00

Choose a tag to compare

0.3.0 (2026-06-09)

Features

v0.2.0

09 Jun 00:41

Choose a tag to compare

0.2.0 (2026-06-09)

Features

  • add deterministic evaluation core (e5f1b47)
  • add direct nop agent execution (e158cfe)
  • add sandboxed docker execution backend (530513a)
  • add seaport CLI and project naming (e5bc6c9)
  • align task environment parsing with harbor (658a188)
  • group resolution progress under run header (bf94c57)
  • improve run logging (80444fe)
  • make run output terminal friendly (8b2ecaf)
  • pass phase environment variables (f272892)
  • resolve git-backed registry tasks (46b5995)
  • resolve harbor local registry datasets (a8f31c9)
  • resolve remote package datasets (65d0c16)
  • run direct registry and git tasks (76904ae)
  • run harbor-compatible local datasets (6d8b7de)
  • run local oracle tasks (8a6fe5d)
  • run sandboxed external agent commands (48e018b)
  • run task attempts concurrently (94a5e40)
  • show run startup progress (d7c4949)
  • show task durations in run output (f9aa331)
  • stream task execution logs (b07261f)

Bug Fixes

  • align docker runtime with benchmark toolchains (904ed2f)
  • allow sandboxed tmp binaries to execute (1f8a034)
  • build task dockerfiles on native platform (1d09989)
  • expose cobol copybooks at workspace root (a86f426)
  • gate docker api response parsing to unix (411c805)
  • honor task docker resource limits (e84f8a0)
  • infer amd64 platform for legacy Java images (a4af2ce)
  • infer amd64 platform for x86 assembly tasks (8402b98)
  • match benchmark container layout (62e5385)
  • prefer native docker platform when available (903dabf)
  • preserve docker image workspace (7b0867d)
  • record failed trials without stopping runs (3333674)
  • report each task's share of the execution timeline (0cbeb07)
  • retry package archive downloads (ed9efbd)
  • retry transient docker build failures (52f24b9)
  • satisfy formatting and clippy checks (e9925c5)
  • set docker platform for task containers (1235408)
  • stage task files in execution workspaces (6c0f46e)
  • stream docker build progress (c8604f9)
  • use buildkit-compatible network mode (ac36a85)

Performance Improvements

  • adapt default run concurrency (f209a1b)
  • cache docker task environments (acec736)
  • deduplicate docker image pulls (10bfeaf)
  • preflight docker environments (fdf0f00)
  • raise adaptive concurrency cap (9631112)
  • schedule long-running trials first (e1196c1)
  • schedule work by run phase (db643a1)
  • speed up workspace snapshot restores (f3195fb)
  • use a managed buildkit builder (358f7cd)
  • use docker api for control-plane checks (8b5cfea)