feat(instrument): Port Ollama LLM provider adapter (M3) #106

Closed
mmercuri wants to merge 1 commit into main from feat/instrument-providers-ollama

@mmercuri

Contributor

Summary

M3 LLM-provider fan-out — self-contained PR that lands the Ollama
adapter together with the minimum base infrastructure required to
host it.

  • Source: ateam/stratix/sdk/python/adapters/llm_providers/ollama_adapter.py (~259 LOC)
  • Sister M3 PRs (in parallel): OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex, LiteLLM, Cohere, Mistral
  • Template: matches the structural shape used by the OpenAI adapter port (providers/openai_adapter.py + flat _base/)
  • Local-only: Ollama runs on the operator's own hardware. Default endpoint is http://localhost:11434 (overridable via OLLAMA_HOST)
  • Pricing: every Ollama model is recorded as 0.0 USD/token in the canonical pricing manifest with an inline comment explaining why (self-hosted). Optional infra_cost_usd derived from compute duration when cost_per_second is configured by the caller. Token counting is still required — the adapter records prompt_eval_count / eval_count from the daemon response on every call.
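
The endpoint rule in the "Local-only" bullet can be sketched as follows. This is a minimal illustration, not the adapter's code: the function name `resolve_ollama_endpoint` is hypothetical, and adding a scheme to a bare `host:port` value is an assumption about how such values would be handled.

```python
import os
from typing import Mapping, Optional

DEFAULT_OLLAMA_ENDPOINT = "http://localhost:11434"


def resolve_ollama_endpoint(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the Ollama endpoint, preferring OLLAMA_HOST over the default."""
    env = os.environ if environ is None else environ
    host = env.get("OLLAMA_HOST", "").strip()
    if not host:
        return DEFAULT_OLLAMA_ENDPOINT
    # Assumption: a bare "host:port" value is treated as plain HTTP.
    if "://" not in host:
        host = f"http://{host}"
    return host.rstrip("/")
```

Passing the environment as a mapping keeps the lookup testable without mutating `os.environ`.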

What's in the box

| Deliverable | Path |
| --- | --- |
| Adapter | src/layerlens/instrument/adapters/providers/ollama_adapter.py |
| Lazy public export | src/layerlens/instrument/adapters/providers/__init__.py (__getattr__) |
| Pricing entries (zero-cost) | src/layerlens/instrument/adapters/providers/_base/pricing.py |
| Tests (respx HTTP fixtures) | tests/instrument/adapters/providers/test_ollama_adapter.py |
| Sample (mocked + live modes) | samples/instrument/ollama/{__init__.py,main.py,README.md} |
| Docs (incl. ollama serve setup + GPU notes) | docs/adapters/providers/ollama.md |
| pyproject extra | providers-ollama = ["ollama>=0.2"] |
| Base infrastructure | _compat/, instrument/_vendored/, instrument/adapters/_base/, instrument/adapters/providers/_base/ |
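
For readers unfamiliar with the lazy `__getattr__` export listed above, here is a rough sketch of the general technique (module-level `__getattr__`, PEP 562). It is not the contents of `providers/__init__.py`: the helper `install_lazy_exports` and the `demo_providers` module are hypothetical, and `fractions.Fraction` merely stands in for the optional SDK-backed adapter class.

```python
import importlib
import types
from typing import Any, Dict, Tuple


def install_lazy_exports(
    module: types.ModuleType, exports: Dict[str, Tuple[str, str]]
) -> None:
    """Attach a module-level __getattr__ that imports targets on first access."""

    def __getattr__(name: str) -> Any:
        if name not in exports:
            raise AttributeError(
                f"module {module.__name__!r} has no attribute {name!r}"
            )
        target_module, attribute = exports[name]
        value = getattr(importlib.import_module(target_module), attribute)
        setattr(module, name, value)  # cache so later lookups skip __getattr__
        return value

    module.__getattr__ = __getattr__


# Stand-in package: fractions.Fraction plays the role of the optional adapter.
demo_providers = types.ModuleType("demo_providers")
install_lazy_exports(demo_providers, {"Fraction": ("fractions", "Fraction")})

loaded_before_access = "Fraction" in vars(demo_providers)
resolved = demo_providers.Fraction  # first access triggers the import
loaded_after_access = "Fraction" in vars(demo_providers)
```

In a real package the same function body would typically live directly in `__init__.py` as a top-level `def __getattr__(name)`, so that importing the package alone never touches the optional dependency.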

Acceptance

  • uv run pytest tests/instrument/adapters/providers/test_ollama_adapter.py -x: 13 passed
  • uv run pytest tests/instrument/test_lazy_imports.py: 3 passed
  • uv run pytest tests/instrument/test_default_install.py: 3 passed
  • uv run mypy --strict src/layerlens/instrument/adapters/providers/ollama: clean (6 source files)
  • uv run ruff check src/ tests/: clean
  • Lazy-import + default-install guards still pass

Test plan

  • Reviewer confirms ollama is NOT loaded by import layerlens (covered by test_lazy_imports.py::test_layerlens_import_does_not_pull_frameworks)
  • Reviewer confirms pip install layerlens (no extras) does not change the runtime dep set (test_default_install.py)
  • Reviewer optionally runs LAYERLENS_OLLAMA_LIVE=1 python -m samples.instrument.ollama.main against a local ollama serve daemon
  • Reviewer confirms api_cost_usd is exactly 0.0 on every emitted cost.record event (self-hosted invariant)
  • Reviewer confirms the pricing-manifest zero-cost entries match the canonical Ollama tag list

Self-contained M3 fan-out PR that lands the LayerLens Ollama provider
adapter together with the minimum base infrastructure required to
host it. Sister provider PRs (OpenAI, Anthropic, Azure, Bedrock,
Vertex, LiteLLM, Cohere, Mistral) land in parallel.

Source (~259 LOC):
ateam/stratix/sdk/python/adapters/llm_providers/ollama_adapter.py.

Adapter
- src/layerlens/instrument/adapters/providers/ollama_adapter.py wraps
  ollama.Client.{chat,generate,embeddings}, emits model.invoke +
  cost.record events. api_cost_usd is always 0.0 (Ollama is local /
  self-hosted); optional infra_cost_usd derived from
  prompt_eval_duration + eval_duration when cost_per_second is set.
- providers/__init__.py exposes OllamaAdapter via lazy __getattr__ —
  importing the package does NOT load the ollama SDK.
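
The cost rule described above could look roughly like the sketch below. It is illustrative only: `build_cost_record` and the `prompt_tokens` / `completion_tokens` keys are hypothetical names, and the response is modelled as a plain mapping. The Ollama API reports `prompt_eval_duration` and `eval_duration` in nanoseconds, hence the conversion.

```python
from typing import Any, Dict, Mapping, Optional

NANOSECONDS_PER_SECOND = 1_000_000_000


def build_cost_record(
    response: Mapping[str, Any], cost_per_second: Optional[float] = None
) -> Dict[str, Any]:
    """Build a cost record for one call; the API cost is always zero."""
    record: Dict[str, Any] = {
        "api_cost_usd": 0.0,  # self-hosted invariant
        "prompt_tokens": int(response.get("prompt_eval_count", 0)),
        "completion_tokens": int(response.get("eval_count", 0)),
    }
    if cost_per_second is not None:
        # Compute time = prompt evaluation + generation, both in nanoseconds.
        compute_ns = response.get("prompt_eval_duration", 0) + response.get(
            "eval_duration", 0
        )
        record["infra_cost_usd"] = (
            compute_ns / NANOSECONDS_PER_SECOND * cost_per_second
        )
    return record
```

Leaving `infra_cost_usd` out entirely when `cost_per_second` is unset keeps "not configured" distinguishable from "configured and zero".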

Pricing
- _base/pricing.py adds explicit zero-cost entries for the canonical
  Ollama model tags (llama3.x, mistral, mixtral, phi3, qwen2.5,
  gemma2, deepseek-r1, codellama, nomic-embed-text, mxbai-embed-large,
  all-minilm). Comment documents that 0.0 is intentional (self-hosted)
  and distinct from the hosted Bedrock/Together rates listed above.
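
One possible shape for the zero-cost entries, as a sketch under stated assumptions: the manifest layout (USD per token with `input` / `output` keys), the `lookup_pricing` helper, and stripping a `:tag` suffix before lookup are all hypothetical. The `llama3.x` family is omitted because the description does not spell out the individual tags.

```python
from typing import Dict, Optional

# Tags taken from the list above; llama3.x is left out (exact tags not given).
OLLAMA_ZERO_COST_MODELS = (
    "mistral", "mixtral", "phi3", "qwen2.5", "gemma2", "deepseek-r1",
    "codellama", "nomic-embed-text", "mxbai-embed-large", "all-minilm",
)

# 0.0 is intentional: Ollama is self-hosted, so there is no per-token API fee.
PRICING: Dict[str, Dict[str, float]] = {
    name: {"input": 0.0, "output": 0.0} for name in OLLAMA_ZERO_COST_MODELS
}


def lookup_pricing(model: str) -> Optional[Dict[str, float]]:
    """Look up pricing by base model name, ignoring any ':tag' suffix."""
    base = model.split(":", 1)[0]
    return PRICING.get(base)
```

Explicit zero entries let a lookup distinguish "known to be free" from "missing from the manifest", which returns `None` here.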

Tests
- tests/instrument/adapters/providers/test_ollama_adapter.py — 13
  respx-based HTTP-fixture tests covering the chat/generate/embeddings
  paths, error path (HTTP 500 -> ResponseError + policy.violation),
  cost_per_second infra-cost math, endpoint detection from OLLAMA_HOST,
  lazy-import contract, and disconnect lifecycle.
- Lazy-import + default-install + resolved-dep-tree guards all green.

Sample
- samples/instrument/ollama/{__init__.py,main.py,README.md} — runnable
  in mocked mode (default, respx-backed) or live mode
  (LAYERLENS_OLLAMA_LIVE=1) against a real ollama serve daemon.
  Sample pulls the model first, then runs a chat round-trip.

Doc
- docs/adapters/providers/ollama.md — install, quickstart, ollama
  serve setup for macOS/Linux/Windows/Docker, GPU notes (CUDA / ROCm /
  Metal / CPU), env-var reference.

pyproject
- providers-ollama = ["ollama>=0.2"] extra. httpx is already a core
  dep so the extra surface is documented but minimal.
- Per-file ruff override for src/layerlens/instrument/adapters/
  providers/**.py (ARG002 — wrapped SDK callbacks).
- pyright executionEnvironment relaxation for providers (matches the
  existing cli relaxation).

Acceptance
- uv run pytest tests/instrument/adapters/providers/test_ollama_adapter.py: 13 passed
- uv run pytest tests/instrument/test_lazy_imports.py: 3 passed
- uv run pytest tests/instrument/test_default_install.py: 3 passed
- uv run mypy --strict src/layerlens/instrument/adapters/providers: clean
- uv run ruff check src/ tests/: clean
@mmercuri mmercuri requested a review from m-peko April 26, 2026 07:53
@m-peko m-peko closed this May 21, 2026
@m-peko m-peko deleted the feat/instrument-providers-ollama branch July 2, 2026 16:38