ex: add conformance scaffolding and telemetry instrumentation #18

Open
deepfates wants to merge 1 commit into main from codex/implement-conformance-audit-for-elixir

@deepfates
Owner

Motivation

  • Provide a tests-first foundation to verify Elixir runtime conformance against the language-agnostic tests.yaml spec.
  • Enable deterministic, scripted LLM behavior for reproducible conformance runs via a FakeLLM adapter.
  • Add runtime telemetry so internal lifecycle and gate/code execution events are observable without changing runtime semantics.

Description

  • Add a small conformance scaffold: Cantrip.Conformance.TestCase, Cantrip.Conformance.Loader, Cantrip.Conformance.FakeLLM, Cantrip.Conformance.Expect, and Cantrip.Conformance.Runner under ex/lib/cantrip/conformance/ to load/validate tests.yaml, produce typed test cases, and execute simple cast-style actions.
  • Implement a YAML loader (Cantrip.Conformance.Loader) that converts tests.yaml into Elixir data (uses a small Ruby-based YAML->JSON shim via System.cmd/2 to avoid adding Hex deps in this environment).
  • Instrument runtime telemetry: EntityServer emits [:cantrip, :entity, :start] and [:cantrip, :entity, :stop]; each turn emits [:cantrip, :turn, :start] and [:cantrip, :turn, :stop] with entity_id and turn_number; Circle.execute_gate/3 emits [:cantrip, :gate, :call] and [:cantrip, :gate, :result] with duration; and sandboxed code evaluation is wrapped in [:cantrip, :code, :eval, :start] and [:cantrip, :code, :eval, :stop].
  • Add ExUnit test files under ex/test/ to exercise the loader, FakeLLM, a small runner smoke test, and a telemetry assertion (M25TelemetryTest) that attaches a handler and asserts events/metadata were emitted.
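To illustrate the scripted-LLM idea, here is a minimal sketch of a fake adapter that replays a fixed list of responses in order. It is not the code in this PR: the module name FakeLLMSketch, the complete/2 function, and the :script_exhausted error are hypothetical, and the real Cantrip.Conformance.FakeLLM may expose a different interface.

```elixir
defmodule FakeLLMSketch do
  @moduledoc "Hypothetical stand-in, not the PR's Cantrip.Conformance.FakeLLM."
  use Agent

  # Start the fake with a fixed script of responses, replayed in order.
  def start_link(responses) when is_list(responses) do
    Agent.start_link(fn -> responses end)
  end

  # Pop the next scripted response. The prompt is deliberately ignored,
  # so the same script always yields the same sequence of replies.
  def complete(pid, _prompt) do
    Agent.get_and_update(pid, fn
      [next | rest] -> {{:ok, next}, rest}
      [] -> {{:error, :script_exhausted}, []}
    end)
  end
end

{:ok, llm} = FakeLLMSketch.start_link(["first reply", "second reply"])
{:ok, "first reply"} = FakeLLMSketch.complete(llm, "any prompt")
{:ok, "second reply"} = FakeLLMSketch.complete(llm, "another prompt")
{:error, :script_exhausted} = FakeLLMSketch.complete(llm, "one too many")
```

Returning an explicit error when the script runs out, rather than looping or returning a default, makes a conformance case that consumes more turns than expected fail loudly.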

Testing

  • New ExUnit tests added: ex/test/conformance/loader_test.exs, ex/test/conformance/fake_llm_test.exs, ex/test/conformance_test.exs, and ex/test/m25_telemetry_test.exs (all tagged to exercise the new conformance scaffolding and telemetry).
  • Confirmed tests.yaml contains 71 cases via a quick YAML count (ruby -ryaml -e ...) to match the loader expectation.
  • Attempted cd ex && mix deps.get, but network/SSL restrictions in this environment blocked access to Hex, so dependencies could not be installed.
  • Attempted a targeted mix test run for the new tests, but it was blocked by a local Elixir version mismatch (the project requires ~> 1.19 while the environment provides 1.18.3), so the new tests have not been run here.
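Since the tests could not be run here, the following sketch shows roughly what a telemetry assertion of this kind looks like. It assumes the :telemetry package is already a project dependency and that the file runs under mix test. The module name, the run_gate/2 helper (a stand-in for Circle.execute_gate/3), and the gate metadata key are hypothetical; only the event names and the duration measurement come from the description above.

```elixir
defmodule TelemetrySketchTest do
  use ExUnit.Case, async: false

  # Hypothetical stand-in for Circle.execute_gate/3: emits the two gate
  # events around the call and reports the elapsed native time as duration.
  defp run_gate(gate, fun) do
    meta = %{gate: gate}
    :telemetry.execute([:cantrip, :gate, :call], %{}, meta)
    started = System.monotonic_time()
    result = fun.()
    duration = System.monotonic_time() - started
    :telemetry.execute([:cantrip, :gate, :result], %{duration: duration}, meta)
    result
  end

  # Handler is a named, exported function; :telemetry warns about local funs.
  def forward(event, measurements, metadata, %{pid: pid}) do
    send(pid, {:telemetry_event, event, measurements, metadata})
  end

  test "gate events are emitted with a duration" do
    handler_id = "sketch-#{System.unique_integer([:positive])}"
    events = [[:cantrip, :gate, :call], [:cantrip, :gate, :result]]
    :ok = :telemetry.attach_many(handler_id, events, &__MODULE__.forward/4, %{pid: self()})
    on_exit(fn -> :telemetry.detach(handler_id) end)

    assert run_gate(:echo, fn -> :done end) == :done

    assert_receive {:telemetry_event, [:cantrip, :gate, :call], _, %{gate: :echo}}
    assert_receive {:telemetry_event, [:cantrip, :gate, :result], %{duration: d}, _}
    assert is_integer(d) and d >= 0
  end
end
```

Handlers run synchronously in the process that calls :telemetry.execute/3, so forwarding each event to the test process as a message lets the test assert on it with assert_receive.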
