instrument: orchestration framework adapters (M1.C part 1) #96

Closed
mmercuri wants to merge 2 commits into feat/instrument-base-foundation from feat/instrument-frameworks-orchestration

@mmercuri

Contributor

Summary

Ports the six orchestration-tier framework adapters from the ateam
reference implementation onto the new layerlens.instrument base
layer:

LangChain, LangGraph, CrewAI, AutoGen, Langfuse, Agentforce

These six frameworks share heavy multi-file packages (lifecycle,
handoff, state, tools, llm, nodes) and benefit from being landed
together for cross-cutting tests. The lighter agent-tier adapters
(Agno, OpenAI Agents, Pydantic-AI, etc.) ship in a follow-up PR
(M1.C part 2).

Scope

  • src/layerlens/instrument/adapters/frameworks/{langchain,langgraph,crewai,autogen,langfuse,agentforce}/
    — per-framework packages
  • tests/instrument/adapters/frameworks/{test_autogen,test_crewai,test_langfuse}_adapter.py
    — unit tests where the framework is importable on the test runner;
    langchain / langgraph / agentforce ride the bulk-ported smoke
    harness in part 2
  • samples/instrument/{langchain,langgraph,crewai,autogen,agentforce}/
    — runnable samples
  • docs/adapters/frameworks-{langchain,langgraph,crewai,autogen,agentforce}.md
    — per-framework integration guide
  • pyproject.toml — six new optional extras (langchain, langgraph,
    crewai, autogen, agentforce, langfuse-importer) with
    python_version markers where the upstream framework requires
    3.9+ or 3.10+; pyright/ruff exclusions for the dynamic monkey-
    patching framework code

Blast radius

  • The default `pip install layerlens` dependency set is unchanged. Each
    framework's heavy deps (langchain, crewai, etc.) are gated behind
    their own extra.
  • No changes to existing public API surface.
  • Importing layerlens.instrument still does NOT pull in any
    framework module (lazy registry lookup).

Test plan

  • uv run pytest tests/instrument/adapters/frameworks/ -x ->
    37 passed (autogen + crewai + langfuse units)
  • langchain / langgraph / agentforce coverage continues in
    part 2 via the bulk-ported smoke harness
  • Reviewer (m-peko) verifies callback/lifecycle wiring

Stacks on

  • feat/instrument-base-foundation (M1.A) — required for the
    BaseAdapter surface this PR consumes.

Linear

LAY-3400 umbrella (M1.C part 1).

mmercuri requested a review from m-peko on April 26, 2026, 02:28
@mmercuri

Contributor Author

Linear: https://linear.app/layerlens/issue/LAY-3400 (Framework adapters part 1 — orchestration tier: LangChain, LangGraph, CrewAI, AutoGen, Langfuse, Agentforce). Stacked on PR #93. Under Apollo M1 epic LAY-3423.

 deferred)

LayerLensCallbackHandler implements serialize_for_replay (returns
a non-stub ReplayableTrace populated with self._trace_events plus the
capture config) AND a working execute_replay coroutine, but
get_adapter_info().capabilities did not declare
AdapterCapability.REPLAY. The atlas-app catalog UI reads that list
to surface replay support, so customers were told they could not
replay traces from LangChain even though the adapter supports it.

PR #119 (brand leak + capability declarations) wired REPLAY for the
adapters that lived on its branch; LangChain was deferred because it
lives on the orchestration source-port branch (PR #96). This closes
that deferral per CLAUDE.md item 5/11.

STREAMING is intentionally NOT declared. Per CLAUDE.md 'no fake
claims', a capability is only declared if the adapter actually
implements it. The LangChain adapter registers
on_chat_model_start / on_llm_start / on_llm_end / on_tool_* /
on_agent_* / on_chain_* — but NOT on_llm_new_token, the LangChain
streaming callback. No per-chunk events flow through this adapter.
Adding STREAMING here would mislead the catalog UI into telling
customers they can stream from an adapter that does not see chunks.
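The "declare only what is implemented" rule can be illustrated with a small audit that compares declared capabilities against the methods that back them. This is a sketch under assumptions, not the adapter's code: the `AdapterCapability` enum below is a stand-in with only the two members discussed here, and `REQUIRED_METHODS`, `SketchHandler`, and `implemented_capabilities` are hypothetical names introduced for illustration.

```python
from enum import Enum
from typing import Dict, FrozenSet, Tuple


class AdapterCapability(Enum):
    # Stand-in for the real enum; only the two members discussed here.
    REPLAY = "replay"
    STREAMING = "streaming"


# Assumed mapping from a capability to the methods that back it.
REQUIRED_METHODS: Dict[AdapterCapability, Tuple[str, ...]] = {
    AdapterCapability.REPLAY: ("serialize_for_replay", "execute_replay"),
    AdapterCapability.STREAMING: ("on_llm_new_token",),
}


class SketchHandler:
    """Toy handler: replay is implemented, per-token streaming is not."""

    declared = frozenset({AdapterCapability.REPLAY})

    def serialize_for_replay(self) -> dict:
        return {"events": []}

    async def execute_replay(self, trace: dict) -> dict:
        return trace


def implemented_capabilities(handler: object) -> FrozenSet[AdapterCapability]:
    """Return the capabilities whose backing methods all exist on the handler."""
    return frozenset(
        cap
        for cap, methods in REQUIRED_METHODS.items()
        if all(callable(getattr(handler, m, None)) for m in methods)
    )
```

With this toy handler, `implemented_capabilities(SketchHandler())` equals `SketchHandler.declared`. Declaring STREAMING without adding `on_llm_new_token` would make the two sets differ, which is the kind of mismatch the commit avoids.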

Tests: added tests/instrument/adapters/frameworks/test_langchain_capabilities.py
with three regressions:
* test_declares_replay_capability — REPLAY surfaces via info().capabilities
* test_does_not_declare_streaming_capability — STREAMING stays absent
  until on_llm_new_token is wired and tested explicitly
* test_get_adapter_info_matches_info_wrapper — info() and
  get_adapter_info() agree on the capability list
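The test file itself is not shown in this conversation. As a rough sketch of what the three regressions assert, the functions below reuse the test names against a toy stand-in. `SketchAdapter`, the `AdapterInfo` shape, and the enum members are assumptions for illustration, not the real layerlens types.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class AdapterCapability(Enum):  # stand-in; the real enum lives in layerlens
    REPLAY = "replay"
    STREAMING = "streaming"


@dataclass(frozen=True)
class AdapterInfo:  # hypothetical shape: only `capabilities` matters here
    name: str
    capabilities: Tuple[AdapterCapability, ...]


class SketchAdapter:
    def get_adapter_info(self) -> AdapterInfo:
        return AdapterInfo("langchain", (AdapterCapability.REPLAY,))

    def info(self) -> AdapterInfo:
        # Thin wrapper, so both entry points report the same list.
        return self.get_adapter_info()


def test_declares_replay_capability() -> None:
    assert AdapterCapability.REPLAY in SketchAdapter().info().capabilities


def test_does_not_declare_streaming_capability() -> None:
    assert AdapterCapability.STREAMING not in SketchAdapter().info().capabilities


def test_get_adapter_info_matches_info_wrapper() -> None:
    adapter = SketchAdapter()
    assert adapter.info().capabilities == adapter.get_adapter_info().capabilities
```

The negative test is the one that guards the catalog UI: it fails as soon as STREAMING is declared, which forces whoever wires `on_llm_new_token` to update the test deliberately.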

Verification:
* uv run --with pytest python -m pytest tests/instrument/adapters/frameworks/test_langchain_capabilities.py -x
    -> 3 passed
* uv run --with pytest python -m pytest tests/instrument/adapters/frameworks/ -x
    -> 40 passed (no regressions in autogen / crewai / langfuse suites)
m-peko closed this on May 21, 2026