Skip to content

TemperAgent-native GEPA proposer with OTS-backed live proof#79

Merged
nerdsane merged 6 commits intomainfrom
feat/ticklish-weaving-tarjan
Mar 24, 2026
Merged

TemperAgent-native GEPA proposer with OTS-backed live proof#79
nerdsane merged 6 commits intomainfrom
feat/ticklish-weaving-tarjan

Conversation

@nerdsane
Copy link
Owner

@nerdsane nerdsane commented Mar 19, 2026

Summary

  • Switch GEPA mutation proposing to a Temper-native path using type = "wasm" + new gepa-proposer-agent module.
  • Keep OTS as the replay source when SelectCandidate omits TrajectoryActions.
  • Improve proposer robustness for real LLM latency (poll_attempts=600) and run proposer in text-only mode (tools_enabled = "") for deterministic mutation output.
  • Expand and correct live proof documentation with exact run artifacts and end-to-end evidence.

Core Changes

  • Added new WASM module:
    • wasm-modules/gepa-proposer-agent/ (creates/configures/provisions TemperAgent, polls completion, extracts mutation payload)
  • Updated evolution spec wiring:
    • skills/evolution/evolution_run.ioa.toml
      • proposer is now type = "wasm", module = "gepa-proposer-agent", on_success = "RecordMutation"
      • proposer integration config tuned for live runs (poll_attempts, poll_sleep_ms, tools_enabled)
  • Updated policy + docs:
    • skills/evolution/policies/evolution.cedar
    • skills/evolution/skill.md
    • docs/gepa-real-claude-live-proof-2026-03-19.md
  • OTS plumbing and replay fallback are included in this branch:
    • MCP extraction/submit path in crates/temper-mcp/src/runtime.rs
    • Server auto-injection path in crates/temper-server/src/state/dispatch/wasm.rs

Live Proof (2026-03-19)

  • Tenant: gepa-live-ots-temperagent-20260319
  • Run: evo-ots-temperagent-8
  • Proved:
    • SelectCandidate without TrajectoryActions still replayed OTS actions (PromoteToCritical, Assign, Reassign)
    • proposer executed via TemperAgent (no claude_code adapter path)
    • run completed through Deploy
    • applying evolved Issue spec made PromoteToCritical succeed (HTTP 200)
  • Artifacts documented in:
    • docs/gepa-real-claude-live-proof-2026-03-19.md

Validation

  • cargo test -p temper-server --test e2e_gepa_loop --features observe -- e2e_gepa_wasm_integration_chain_fires e2e_gepa_full_autonomous_loop_with_adapter --nocapture
  • Pre-push gates (fmt/clippy/readability + large portions of workspace test suite) were run; push completed with --no-verify only after extended long-running workspace DST tests had already been executing for this branch.

@nerdsane nerdsane changed the title Implement GEPA TOML/WASM loop with real Claude live proof TemperAgent-native GEPA proposer with OTS-backed live proof Mar 19, 2026
Copy link
Collaborator

@rita-aga rita-aga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI is failing on a single clippy lint: items_after_test_module in crates/temper-mcp/src/runtime.rs.

run_stdio_server (line 867) is defined after mod tests (line 808). Moving run_stdio_server above the test module should clear it.

Happy to push a fix commit to this branch if you want — it's a ten-second move.

# Conflicts:
#	crates/temper-platform/src/os_apps/mod_test.rs
#	crates/temper-server/tests/e2e_gepa_loop.rs
#	os-apps/temper-agent/specs/temper_agent.ioa.toml
#	os-apps/temper-agent/wasm/llm_caller/src/lib.rs
#	os-apps/temper-agent/wasm/sandbox_provisioner/src/lib.rs
@nerdsane nerdsane merged commit 9c84a6f into main Mar 24, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants