LayoutArena is a scaffold for a hackathon project on AI control for tool-using design agents.
The repository is organized around five areas:
layoutarena/env: canvas state, element models, constraints, scoring, and tool executionlayoutarena/control: enforcement, bounded control, stewardship monitoring, and monitor heuristicslayoutarena/attacks: adversarial side-task templateslayoutarena/experiments: sample briefs and evaluation runnerslayoutarena/viz: SVG rendering, replay export, and Pareto utilities
The current code is intentionally a typed scaffold: core interfaces are in place, some pieces are implemented, and the remaining research choices are marked with pseudocode comments where they still need project-specific decisions.
Quick sanity check:
python3 -m compileall layoutarenaThe scaffold is compatible with the current local interpreter in this repo (Python 3.9.x).
Primary planning document: TODO.md
Batch and single runs write outputs/metrics.csv plus per-run folders under outputs/runs/<run_id>/ (see layoutarena/experiments/output_writer.py).
Protocol-level monitoring (what differentiates bounded control and stewardship from enforcement) is summarized on each EpisodeSummary and in the CSV:
| Field | Meaning |
|---|---|
monitor_detected |
True if the protocol blocked at least one tool call or attached non-empty monitor_reasons to a call that still executed (e.g. redirect-with-allow under stewardship). |
monitor_block_count |
Number of steps with status == "blocked" — the protocol vetoed execution. |
monitor_nudge_count |
Number of executed steps with non-empty monitor_reasons — “noticed and noted / shaped, but not vetoed.” |
Use monitor_block_count when you want a strict “hard intervention” rate for plots or tables; use monitor_detected for the broad protocol-attention signal (aligned with the default Pareto / detection-rate figures). scripts/generate_plots.py still uses monitor_detected for backward compatibility; new columns are optional when loading older CSVs (they default to zero).
Helper: layoutarena.experiments.run_eval.compute_monitor_metrics(state) returns (monitor_detected, monitor_block_count, monitor_nudge_count) from a final LayoutState.
scripts/run_batch.py and scripts/run_single.py load layout-hackathon/.env if present. For OpenAI-style model names (e.g. gpt-4o-mini), set OPENAI_API_KEY. For Anthropic / Gemini, set ANTHROPIC_API_KEY or GEMINI_API_KEY / GOOGLE_API_KEY respectively. If the key is missing, the batch runner exits immediately with a clear error instead of running hundreds of failing episodes. Omit --model to use the deterministic scripted policy (no API calls).