
Add free subagent-based competition mode to /mobius-run #2

Merged
AaronGoldsmith merged 4 commits into main from feature/free-competition-mode
Mar 15, 2026

Conversation

@AaronGoldsmith
Owner

Summary

  • New --free mode (default) that runs competitions entirely within Claude Code using haiku subagents, judged by Opus — zero API cost on a Pro subscription
  • Skill generates task-specific challenger personas on the fly, spawns them as parallel haiku subagents, collects outputs, and judges with Opus — the full competition loop without any API spend
  • API mode preserved via --api flag for cross-family diversity (Anthropic + Google + OpenAI judges) when needed
  • Two new helper scripts:
    • create_match.py — creates match records with agent selection by slug list or top-N by Elo
    • record_outputs.py — saves subagent outputs to an existing match record
  • Same Elo system — free and API competition results feed into the same leaderboard
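
The two agent-selection modes create_match.py is described as supporting can be sketched as follows. This is a hypothetical sketch, not the real script: the table name, column names, and agent slugs are assumptions, and the real Mobius schema may differ.

```python
import sqlite3

# Toy stand-in for the Mobius DB; schema and data are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agents (slug TEXT PRIMARY KEY, elo REAL)")
conn.executemany(
    "INSERT INTO agents VALUES (?, ?)",
    [("edge-case-hunter", 1210.0), ("minimalist", 1180.0),
     ("spec-lawyer", 1325.0), ("refactorer", 1090.0)],
)

def select_by_slugs(conn, slugs):
    """--agents mode: fetch exactly the named agents."""
    marks = ",".join("?" * len(slugs))
    rows = conn.execute(
        f"SELECT slug, elo FROM agents WHERE slug IN ({marks})", slugs)
    return dict(rows.fetchall())

def select_top_n(conn, n):
    """--count mode: fetch the N highest-rated agents by Elo."""
    rows = conn.execute(
        "SELECT slug FROM agents ORDER BY elo DESC LIMIT ?", (n,))
    return [slug for (slug,) in rows.fetchall()]
```

Either way, the selected agents would then be emitted as JSON for the orchestrating skill to spawn as subagents.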

Why

The key insight: Claude Code Opus on Pro is the same model as the API's Opus, but free. By orchestrating competitions as subagent spawning (haiku for contestants, Opus as judge), we get the full adversarial swarm experience at zero cost. API mode remains available for when cross-family judge diversity matters.
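
Since free and API matches feed the same leaderboard, the rating bookkeeping is just a standard Elo update. A minimal sketch for a decisive two-agent match (the K-factor of 32 is the conventional default, not necessarily Mobius's actual setting):

```python
def elo_update(winner, loser, k=32.0):
    """Return updated (winner, loser) ratings after one decisive match."""
    # Expected score of the eventual winner under the logistic Elo model.
    expect_win = 1.0 / (1.0 + 10.0 ** ((loser - winner) / 400.0))
    delta = k * (1.0 - expect_win)
    return winner + delta, loser - delta
```

An upset (low-rated agent beating a high-rated one) moves ratings more than an expected result, which is what lets haiku contestants sort themselves on the board over repeated matches.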

Test plan

  • Run /mobius-run "write a Python function to merge two sorted lists" and verify it defaults to --free mode
  • Verify create_match.py correctly selects agents by slug (--agents) and by Elo ranking (--count)
  • Verify record_outputs.py correctly updates match outputs in SQLite
  • Run /mobius-run "..." --api and verify it falls back to the CLI-based flow
  • Check that Elo updates from free competitions appear on mobius leaderboard

🤖 Generated with Claude Code

Introduce --free mode (now the default) that runs competitions entirely
within Claude Code using haiku subagents judged by Opus — zero API cost.
The original CLI-based flow is preserved via --api flag.

New helper scripts:
- create_match.py: creates match records with agent selection by slug or Elo
- record_outputs.py: saves subagent outputs to an existing match record

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings on March 15, 2026 02:07

Copilot AI left a comment

Pull request overview

This PR expands the /mobius-run skill to support a --free “subagent competition” mode (no API calls) and adds helper scripts to create a match record and later persist subagent outputs into the Mobius DB.

Changes:

  • Updated .claude/skills/mobius-run/SKILL.md with --free vs --api flows and step-by-step orchestration instructions.
  • Added create_match.py to select agents and insert a pending match record, emitting agent details as JSON.
  • Added record_outputs.py to write collected outputs into the match record.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.

File descriptions:

  • .claude/skills/mobius-run/SKILL.md: documents the new --free mode workflow and CLI steps.
  • .claude/skills/mobius-run/scripts/create_match.py: creates a pending match in SQLite and outputs match/agent JSON for orchestration.
  • .claude/skills/mobius-run/scripts/record_outputs.py: persists collected agent outputs to the match record in SQLite.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d5cf598d2c


AaronGoldsmith and others added 3 commits March 14, 2026 19:19
- Fix warnings to stderr in create_match.py (prevents JSON corruption)
- Add --match flag to SKILL.md verdict step (prevents wrong-match writes)
- Remove undocumented --count from skill argument-hint
- README: add Skills section, Memory subsection, fix accuracy issues
- README: remove stray TODO, add train command, expand config table

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shell escaping makes passing full agent outputs as CLI args
impractical. Switch to stdin-based input with two modes:
- Per-agent: echo "output" | record_outputs.py <match> <agent>
- Bulk: echo '{"id":"out"}' | record_outputs.py <match> --bulk

Per-agent mode also supports incremental recording as agents
finish, merging into existing outputs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
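
The merge behavior those two stdin modes imply can be sketched as pure functions over the stored blob (function names and the JSON-object storage format are assumptions about how record_outputs.py persists outputs):

```python
import json

def merge_per_agent(existing_json, agent_id, text):
    """Per-agent mode: fold one agent's output into the stored JSON blob,
    preserving outputs already recorded for other agents."""
    outputs = json.loads(existing_json) if existing_json else {}
    outputs[agent_id] = text
    return json.dumps(outputs)

def merge_bulk(existing_json, stdin_payload):
    """--bulk mode: stdin carries a JSON object mapping agent id -> output."""
    outputs = json.loads(existing_json) if existing_json else {}
    outputs.update(json.loads(stdin_payload))
    return json.dumps(outputs)
```

Incremental recording as agents finish then reduces to reading the current blob, merging, and writing it back in a single UPDATE.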
sys.stdin defaults to cp1252 on Windows, which crashes on
Unicode characters (em-dashes, smart quotes) in agent outputs.
Reconfigure to UTF-8 with error replacement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
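
The failure mode and the fix can be demonstrated without Windows by wrapping the same UTF-8 bytes in text streams with each encoding (io.BytesIO stands in for stdin here):

```python
import io

# Agent output containing an em-dash (U+2014), which UTF-8 encodes as
# the three bytes b'\xe2\x80\x94'.
raw = "verdict \u2014 winner: agent-a".encode("utf-8")

# A cp1252 stream (the legacy Windows console default) decodes those
# bytes to mojibake instead of the em-dash:
mangled = io.TextIOWrapper(io.BytesIO(raw), encoding="cp1252").read()

# Reconfiguring to UTF-8 with errors="replace" reads the text correctly,
# and never raises even if a truly invalid byte sequence slips through:
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="cp1252")
stream.reconfigure(encoding="utf-8", errors="replace")
fixed = stream.read()
```

In the script itself this amounts to one line at startup: sys.stdin.reconfigure(encoding="utf-8", errors="replace").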
@AaronGoldsmith AaronGoldsmith merged commit d75aaa8 into main Mar 15, 2026
2 checks passed
@AaronGoldsmith AaronGoldsmith deleted the feature/free-competition-mode branch March 15, 2026 06:02