feat(instrument): typed events Bundle #3 — google_adk + llama_index + ms_agent_framework #151

Closed
mmercuri wants to merge 3 commits into
feat/instrument-typed-events-foundation from
feat/instrument-typed-events-bundle-3-google-adk-llama-index-ms-agent-framework

Summary

Migrates the next 3 framework adapters from emit_dict_event(...) to typed
emit_event(TypedModel.create(...)) against the canonical Pydantic models
from layerlens.instrument._compat.events (PR #129 foundation).
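
The shape of the change at a single emission site can be sketched as follows. This is a minimal illustration, not the layerlens API: the classes are dataclass stand-ins so the snippet runs without layerlens or Pydantic, and the field names, the `create` signature, and `RecordingAdapter` are all assumptions. The DeprecationWarning on the legacy path is inferred from the regression tests described in this PR.

```python
from __future__ import annotations

import warnings
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class AgentInputEvent:
    """Hypothetical stand-in for a canonical typed event model."""

    agent_name: str
    message: str
    metadata: dict[str, Any] = field(default_factory=dict)

    @classmethod
    def create(cls, *, agent_name: str, message: str, **metadata: Any) -> "AgentInputEvent":
        # Extra keyword arguments land on the metadata slot (assumed behaviour).
        return cls(agent_name=agent_name, message=message, metadata=dict(metadata))


class RecordingAdapter:
    """Hypothetical adapter that records whatever it is asked to emit."""

    def __init__(self) -> None:
        self.emitted: list[Any] = []

    def emit_dict_event(self, event_type: str, payload: dict[str, Any]) -> None:
        # Legacy path: untyped payload, flagged as deprecated.
        warnings.warn("emit_dict_event is deprecated", DeprecationWarning, stacklevel=2)
        self.emitted.append({"type": event_type, **payload})

    def emit_event(self, event: Any) -> None:
        # Typed path: the payload is a model instance, not a dict.
        self.emitted.append(event)


adapter = RecordingAdapter()

# Before: adapter.emit_dict_event("agent.input", {"agent_name": "planner", "message": "hi"})
# After:
adapter.emit_event(
    AgentInputEvent.create(agent_name="planner", message="hi", framework="google_adk")
)
```

Framework-specific provenance such as `framework` rides on the metadata slot here, which is one plausible reading of how the adapters in this bundle carry it.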

This is Bundle #3 of 6 in the typed-events migration series. Bundle
#1 (PR #129) shipped the foundation + agno reference. Bundle #2 (PR #138)
shipped autogen + crewai + smolagents. This bundle ships google_adk +
llama_index + ms_agent_framework.

Honest site counts (grep, not estimated)

Per CLAUDE.md item 11 — counts verified with
grep -E 'emit_dict_event' src/.../<adapter>/:

| Adapter | Migration doc estimate | Actual grep count | Delta |
|---|---|---|---|
| google_adk | ~11 | 11 | 0 |
| llama_index | ~12 | 12 | 0 |
| ms_agent_framework | ~12 | 12 | 0 |
| Total | ~35 | 35 | 0 |

All three estimates matched the grep counts exactly in this bundle.
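
Where grep is not available, the same count can be reproduced in Python. This is a minimal sketch that assumes the count is of matching lines in `.py` files under one adapter directory; the function name is hypothetical and the directory to pass in is left to the caller.

```python
from pathlib import Path


def count_emit_dict_event(adapter_dir: Path, needle: str = "emit_dict_event") -> int:
    """Count the lines containing `needle` in every .py file under adapter_dir."""
    total = 0
    for path in sorted(adapter_dir.rglob("*.py")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if needle in line:
                total += 1
    return total
```

Run before the migration, it should reproduce the "Actual grep count" column; run afterwards, it should return 0 for each adapter.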

Per-adapter status

google_adk (commit ffa1aca)

  • 11 emission sites migrated
  • Helpers added: _stringify, _sha256_of, _coerce_to_dict,
    _detect_provider
  • mypy --strict: clean (2 source files)
  • pytest: 15/15 passing
  • Regression tests added: typed-payloads-only assertion +
    no-DeprecationWarning assertion

llama_index (commit 4fc3a29)

  • 12 emission sites migrated
  • Retrieval events mapped onto ToolCallEvent with name="retrieval",
    integration=IntegrationType.LIBRARY (no canonical retrieval shape;
    mirrors agno reference)
  • Helpers added: _stringify, _coerce_to_dict, _sha256_of,
    _detect_provider
  • mypy --strict: clean (2 source files)
  • pytest: 15/15 passing
  • Regression tests added
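
The retrieval mapping described above might look roughly like this. It is a sketch with dataclass stand-ins: only `ToolCallEvent`, `name="retrieval"`, and `IntegrationType.LIBRARY` come from this PR, while the field layout, the enum value, and the `retrieval_to_tool_call` helper are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class IntegrationType(str, Enum):
    """Stand-in enum; only LIBRARY is named in the PR, its value is assumed."""

    LIBRARY = "library"


@dataclass(frozen=True)
class ToolCallEvent:
    """Hypothetical stand-in for the canonical tool-call model."""

    name: str
    integration: IntegrationType
    input: dict[str, Any]
    metadata: dict[str, Any] = field(default_factory=dict)


def retrieval_to_tool_call(query: str, nodes: list[Any]) -> ToolCallEvent:
    """Express a retrieval as a tool call, since no retrieval shape exists."""
    return ToolCallEvent(
        name="retrieval",
        integration=IntegrationType.LIBRARY,
        input={"query": query},
        # result_count placement on metadata is an assumption.
        metadata={"framework": "llama_index", "result_count": len(nodes)},
    )


event = retrieval_to_tool_call("what is typed emission?", ["node-a", "node-b"])
```

Consumers that need to tell retrievals apart from ordinary tool calls can filter on `name == "retrieval"` together with the integration type.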

ms_agent_framework (commit 0917481)

  • 12 emission sites migrated
  • Behavioural change: the previous adapter emitted an ad-hoc
    agent.state.change payload alongside agent.output carrying a
    run_complete / run_failed marker. That payload did not satisfy
    the canonical AgentStateChangeEvent before_hash / after_hash
    contract (no real state mutation to hash at the run boundary).
    The post-migration mapping carries the same signal as run_status
    on AgentOutputEvent.metadata — preserves the cross-cutting
    completion marker without violating the canonical schema.
  • Helpers added: _stringify, _coerce_to_dict, _sha256_of
  • mypy --strict: clean (2 source files)
  • pytest: 16/16 passing
  • Regression tests added
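
The collapse of the run-boundary marker can be illustrated with a small sketch. `AgentOutputEvent` here is a dataclass stand-in and `on_run_end` is a simplified, hypothetical version of the adapter hook; only the `run_status` key and the `run_complete` / `run_failed` values are taken from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class AgentOutputEvent:
    """Hypothetical stand-in for the canonical agent-output model."""

    agent_name: str
    message: str
    metadata: dict[str, Any] = field(default_factory=dict)


def on_run_end(
    emitted: list[Any],
    agent_name: str,
    output: str,
    error: Optional[BaseException] = None,
) -> None:
    """Emit one output event; the completion marker rides on its metadata."""
    status = "run_failed" if error is not None else "run_complete"
    emitted.append(
        AgentOutputEvent(
            agent_name=agent_name,
            message=output,
            metadata={"run_status": status},
        )
    )
    # Note: no separate agent.state.change event is emitted at the run boundary.


events: list[Any] = []
on_run_end(events, "triage", "done")
on_run_end(events, "triage", "", error=RuntimeError("boom"))
```

Each run now produces a single event at the boundary instead of two, so downstream consumers that keyed on agent.state.change for completion would need to read `metadata["run_status"]` instead.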

Combined verification

$ uv run mypy --strict src/layerlens/instrument/adapters/frameworks/google_adk \
    src/layerlens/instrument/adapters/frameworks/llama_index \
    src/layerlens/instrument/adapters/frameworks/ms_agent_framework
Success: no issues found in 6 source files

$ uv run python -m pytest \
    tests/instrument/adapters/frameworks/test_google_adk_adapter.py \
    tests/instrument/adapters/frameworks/test_llama_index_adapter.py \
    tests/instrument/adapters/frameworks/test_ms_agent_framework_adapter.py
46 passed in 1.05s

Multi-tenancy compliance

Per PR #129 foundation: BaseAdapter.emit_event stamps org_id onto
every typed payload before delegating to self._stratix.emit(...).
No emit site in this bundle bypasses the typed path, so every
emission is tenant-scoped by construction.
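
As a rough illustration of why routing everything through the typed path gives tenant scoping for free, consider the sketch below. It is not the PR #129 implementation: `BaseAdapter` is reduced to the one method, the event is a dataclass stand-in, and `FakeStratix` is a hypothetical recorder in place of the real client.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class ToolCallEvent:
    """Hypothetical stand-in for a canonical typed event."""

    name: str
    org_id: Optional[str] = None


class FakeStratix:
    """Hypothetical sink standing in for the real client behind self._stratix."""

    def __init__(self) -> None:
        self.received: list[Any] = []

    def emit(self, event: Any) -> None:
        self.received.append(event)


class BaseAdapter:
    def __init__(self, stratix: FakeStratix, org_id: str) -> None:
        self._stratix = stratix
        self._org_id = org_id

    def emit_event(self, event: ToolCallEvent) -> None:
        # Stamp the tenant in one place, then delegate to the sink.
        stamped = replace(event, org_id=self._org_id)
        self._stratix.emit(stamped)


sink = FakeStratix()
BaseAdapter(sink, org_id="org-123").emit_event(ToolCallEvent(name="search"))
```

Because the stamp is applied inside `emit_event`, an individual emission site has no opportunity to omit it.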

Test plan

  • All 3 adapters: uv run mypy --strict passes
  • All 3 adapters: every emit_dict_event call replaced
    (grep -c 'emit_dict_event' = 0)
  • All 3 adapters: regression test asserts typed payloads only
  • All 3 adapters: regression test asserts no DeprecationWarning
  • All 3 adapters' existing test suites pass (46/46)
  • Reviewer: confirm canonical model field paths match adapter
    emission shape on the wire
  • Reviewer: confirm ms_agent_framework agent.state.change
    collapse is the right call (vs. emitting a synthetic state-
    change with hashed metadata)
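
The two regression assertions in the test plan could be written along these lines. This is a sketch against a hypothetical `FakeAdapter` with a single lifecycle hook, not the tests shipped in this PR; the real tests presumably exercise every lifecycle path of each adapter.

```python
from __future__ import annotations

import warnings
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class AgentInputEvent:
    """Hypothetical stand-in for a canonical typed event."""

    agent_name: str
    message: str


class FakeAdapter:
    """Hypothetical adapter that has already been migrated to the typed path."""

    def __init__(self) -> None:
        self.emitted: list[Any] = []

    def emit_event(self, event: Any) -> None:
        self.emitted.append(event)

    def on_agent_start(self, agent_name: str, message: str) -> None:
        self.emit_event(AgentInputEvent(agent_name=agent_name, message=message))


def test_lifecycle_emits_typed_payloads_only() -> None:
    adapter = FakeAdapter()
    adapter.on_agent_start("planner", "hello")
    assert adapter.emitted, "expected at least one emission"
    assert all(not isinstance(e, dict) for e in adapter.emitted)
    assert all(isinstance(e, AgentInputEvent) for e in adapter.emitted)


def test_emit_does_not_warn_after_migration() -> None:
    adapter = FakeAdapter()
    with warnings.catch_warnings():
        # Turn any DeprecationWarning into an exception so the test fails loudly.
        warnings.simplefilter("error", DeprecationWarning)
        adapter.on_agent_start("planner", "hello")
```

Escalating the warning to an error inside `catch_warnings` keeps the check local to the test and restores the previous filters on exit.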

Bundle progression

mmercuri added 3 commits May 10, 2026 10:39

google_adk (commit ffa1aca)

Migrate all 11 emission sites in google_adk lifecycle.py from
emit_dict_event(...) to emit_event(TypedModel.create(...)) using the
canonical Pydantic models from layerlens.instrument._compat.events
(PR #129 foundation).

Sites migrated (grep-counted, not estimated):
- _before_agent_callback     -> AgentInputEvent
- _after_agent_callback      -> AgentOutputEvent
- _after_model_callback      -> ModelInvokeEvent + CostRecordEvent
- _after_tool_callback       -> ToolCallEvent
- on_agent_start             -> AgentInputEvent
- on_agent_end               -> AgentOutputEvent
- on_handoff                 -> AgentHandoffEvent (sha256:<hex64> format)
- on_tool_use                -> ToolCallEvent
- on_llm_call                -> ModelInvokeEvent
- _emit_agent_config         -> EnvironmentConfigEvent

Total: 11 emission sites (matches estimate).

ADK-specific provenance (framework, agent_name, timestamp_ns,
session_id, description, instruction) is carried on canonical metadata
/ attributes / parameters slots.

Helpers added:
- _stringify: coerce ADK Content/dict/None to canonical message string
- _sha256_of: produce sha256:<hex64> for handoff context hash
- _coerce_to_dict: wrap scalar tool input/output in {value: ...}
- _detect_provider: derive provider from model identifier (Gemini default)

Regression tests added:
- test_google_adk_lifecycle_emits_typed_payloads_only — every emission
  is a Pydantic model instance, not a dict
- test_google_adk_emit_does_not_warn_after_migration — no
  DeprecationWarning fires from any lifecycle path

mypy --strict: clean (2 source files)
pytest: 15/15 passing

llama_index (commit 4fc3a29)

Migrate all 12 emission sites in llama_index lifecycle.py from
emit_dict_event(...) to emit_event(TypedModel.create(...)) using the
canonical Pydantic models from layerlens.instrument._compat.events
(PR #129 foundation).

Sites migrated (grep-counted, not estimated):
- _on_llm_end                -> ModelInvokeEvent + CostRecordEvent (when usage)
- _on_tool_call              -> ToolCallEvent
- _on_retrieval_end          -> ToolCallEvent (name=retrieval, library)
- _on_agent_step_start       -> AgentInputEvent
- _on_agent_step_end         -> AgentOutputEvent
- on_agent_start             -> AgentInputEvent
- on_agent_end               -> AgentOutputEvent
- on_tool_use                -> ToolCallEvent
- on_llm_call                -> ModelInvokeEvent
- on_handoff                 -> AgentHandoffEvent (sha256:<hex64> format)
- _emit_agent_config         -> EnvironmentConfigEvent

Total: 12 emission sites (matches estimate).

LlamaIndex-specific provenance (framework, agent_name, step,
timestamp_ns, tool_type, result_count) is carried on canonical
metadata / attributes / parameters / input slots.

Retrieval events map onto ToolCallEvent with name='retrieval' and
integration=IntegrationType.LIBRARY, mirroring the agno reference
adapter, because the canonical schema has no dedicated retrieval shape.

Helpers added: _stringify, _coerce_to_dict, _sha256_of,
_detect_provider.

Regression tests added:
- test_llama_index_lifecycle_emits_typed_payloads_only
- test_llama_index_emit_does_not_warn_after_migration

mypy --strict: clean (2 source files)
pytest: 15/15 passing

ms_agent_framework (commit 0917481)

Migrate all 12 emission sites in ms_agent_framework lifecycle.py from
emit_dict_event(...) to emit_event(TypedModel.create(...)) using the
canonical Pydantic models from layerlens.instrument._compat.events
(PR #129 foundation).

Sites migrated (grep-counted, not estimated):
- _process_message handoff           -> AgentHandoffEvent
- _process_message FunctionCall      -> ToolCallEvent (input)
- _process_message FunctionResult    -> ToolCallEvent (output)
- _process_message model metadata    -> ModelInvokeEvent
- _process_message usage metadata    -> CostRecordEvent
- on_run_start                       -> AgentInputEvent
- on_run_end (output)                -> AgentOutputEvent
- on_run_end (state.change)          -> COLLAPSED into AgentOutputEvent metadata
- on_tool_use                        -> ToolCallEvent
- on_llm_call                        -> ModelInvokeEvent
- on_handoff                         -> AgentHandoffEvent (sha256:<hex64> format)
- _emit_chat_config                  -> EnvironmentConfigEvent

Total: 12 emission sites (matches estimate).

Behavioural change: the previous adapter emitted an ad-hoc
agent.state.change payload alongside agent.output to carry a
run_complete / run_failed marker. That payload did not satisfy the
canonical AgentStateChangeEvent before_hash / after_hash contract
(the run boundary has no real state mutation to hash). The post-
migration mapping carries the same signal as run_status on the
AgentOutputEvent metadata, preserving the cross-cutting completion
marker without violating the canonical schema. Test coverage updated
to assert agent.state.change is no longer emitted and run_status is
on AgentOutputEvent.metadata.

MS Agent Framework-specific provenance (framework, agent_name,
chat_name, chat_type, timestamp_ns, selection_strategy,
termination_strategy, plugins) is carried on canonical metadata /
attributes / parameters / input slots.

Helpers added: _stringify, _coerce_to_dict, _sha256_of.

Regression tests added:
- test_ms_agent_framework_lifecycle_emits_typed_payloads_only
- test_ms_agent_framework_emit_does_not_warn_after_migration

mypy --strict: clean (2 source files)
pytest: 16/16 passing