Add free subagent-based competition mode to /mobius-run #2
AaronGoldsmith merged 4 commits into main
Conversation
Introduce --free mode (now the default) that runs competitions entirely within Claude Code using haiku subagents judged by Opus — zero API cost. The original CLI-based flow is preserved via the --api flag.

New helper scripts:
- create_match.py: creates match records with agent selection by slug or Elo
- record_outputs.py: saves subagent outputs to an existing match record

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
This PR expands the /mobius-run skill to support a --free “subagent competition” mode (no API calls) and adds helper scripts to create a match record and later persist subagent outputs into the Mobius DB.
Changes:
- Updated .claude/skills/mobius-run/SKILL.md with --free vs --api flows and step-by-step orchestration instructions.
- Added create_match.py to select agents and insert a pending match record, emitting agent details as JSON.
- Added record_outputs.py to write collected outputs into the match record.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| .claude/skills/mobius-run/SKILL.md | Documents the new --free mode workflow and CLI steps. |
| .claude/skills/mobius-run/scripts/create_match.py | Creates a pending match in SQLite and outputs match/agent JSON for orchestration. |
| .claude/skills/mobius-run/scripts/record_outputs.py | Persists collected agent outputs to the match record in SQLite. |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d5cf598d2c
- Fix warnings to stderr in create_match.py (prevents JSON corruption)
- Add --match flag to SKILL.md verdict step (prevents wrong-match writes)
- Remove undocumented --count from skill argument-hint
- README: add Skills section, Memory subsection, fix accuracy issues
- README: remove stray TODO, add train command, expand config table

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
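The stderr fix in the first bullet keeps stdout machine-parseable: diagnostics go to stderr, and only the result JSON reaches stdout. A sketch of the pattern (the function name is illustrative, not from the diff):

```python
import json
import sys


def emit_result(result, warnings=()):
    """Route warnings to stderr so stdout carries nothing but the JSON payload."""
    for w in warnings:
        print(f"warning: {w}", file=sys.stderr)
    json.dump(result, sys.stdout)
    sys.stdout.write("\n")
```

A caller doing `match=$(python create_match.py ...)` can then pipe stdout straight into a JSON parser even when warnings fire.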
Shell escaping makes passing full agent outputs as CLI args
impractical. Switch to stdin-based input with two modes:
- Per-agent: echo "output" | record_outputs.py <match> <agent>
- Bulk: echo '{"id":"out"}' | record_outputs.py <match> --bulk
Per-agent mode also supports incremental recording as agents
finish, merging into existing outputs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
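The per-agent mode described above — incremental recording that merges into existing outputs — might look roughly like this; the `outputs` column and function name are assumptions for illustration:

```python
import json
import sqlite3


def record_output(conn, match_id, agent_slug, text):
    """Merge one agent's output into the match's outputs JSON column (assumed schema)."""
    row = conn.execute(
        "SELECT outputs FROM matches WHERE id = ?", (match_id,)
    ).fetchone()
    # Start from any previously recorded outputs so agents can finish in any order
    outputs = json.loads(row[0]) if row and row[0] else {}
    outputs[agent_slug] = text
    conn.execute(
        "UPDATE matches SET outputs = ? WHERE id = ?",
        (json.dumps(outputs), match_id),
    )
    conn.commit()
```

The CLI wrapper would read `text` from `sys.stdin.read()`, matching the `echo "output" | record_outputs.py <match> <agent>` usage shown above.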
sys.stdin defaults to cp1252 on Windows, which crashes on Unicode characters (em-dashes, smart quotes) in agent outputs. Reconfigure to UTF-8 with error replacement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
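This fix presumably relies on `TextIOWrapper.reconfigure` (available since Python 3.7); the guard below is one way it could be written:

```python
import sys

# On Windows, stdin may default to cp1252 and raise UnicodeDecodeError on
# em-dashes or smart quotes; force UTF-8 and replace any undecodable bytes.
if sys.stdin.encoding and sys.stdin.encoding.lower() != "utf-8":
    sys.stdin.reconfigure(encoding="utf-8", errors="replace")
```

`errors="replace"` trades lossless input for crash-free input: a genuinely malformed byte becomes U+FFFD instead of aborting the recording step.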
Summary
- --free mode (default) that runs competitions entirely within Claude Code using haiku subagents, judged by Opus — zero API cost on a Pro subscription
- --api flag for cross-family diversity (Anthropic + Google + OpenAI judges) when needed
- create_match.py — creates match records with agent selection by slug list or top-N by Elo
- record_outputs.py — saves subagent outputs to an existing match record

Why
The key insight: Claude Code Opus on Pro is the same model as the API's Opus, but free. By orchestrating competitions as subagent spawning (haiku for contestants, Opus as judge), we get the full adversarial swarm experience at zero cost. API mode remains available for when cross-family judge diversity matters.
Test plan
- Run /mobius-run "write a Python function to merge two sorted lists" and verify it defaults to --free mode
- Verify create_match.py correctly selects agents by slug (--agents) and by Elo ranking (--count)
- Verify record_outputs.py correctly updates match outputs in SQLite
- Run /mobius-run "..." --api and verify it falls back to the CLI-based flow
- mobius leaderboard

🤖 Generated with Claude Code