62 changes: 52 additions & 10 deletions docs/adapters/frameworks-benchmark_import.md
@@ -1,10 +1,21 @@
# Benchmark import framework adapter
# Benchmark import (data importer)

`layerlens.instrument.adapters.frameworks.benchmark_import.BenchmarkImportAdapter`
imports external benchmark datasets into Stratix evaluation spaces. Unlike
the other framework adapters, this is a **data importer**, not a runtime
instrumentation adapter — it reads benchmarks from disk or from
HuggingFace and produces normalized rows.
the other classes in `layerlens.instrument.adapters.frameworks.*`, this is
a **data importer**, not a runtime instrumentation adapter — it reads
benchmarks from disk or from HuggingFace and produces normalized rows.

> **Architectural note:** `BenchmarkImportAdapter` is deliberately a
> **bare class** — it does NOT extend `BaseAdapter`
> (`src/layerlens/instrument/adapters/frameworks/benchmark_import/adapter.py:70`
> declares `class BenchmarkImportAdapter:` with no superclass). It has no
> `connect()` / `disconnect()` lifecycle, no `AdapterCapability`
> declarations, no sinks, and emits no telemetry events. It is a one-shot
> ETL pipeline rather than a runtime instrumentation adapter. See the
> canonical specification at
> [`docs/adapters/benchmark_import.md`](./benchmark_import.md) §1 for the
> full architectural carve-out.

## Install

@@ -84,15 +95,46 @@ Stratix evaluation schema:
| `difficulty` | `difficulty`, `level` |
| `category` | `category`, `subject`, `topic` |

When no mapping is provided, the adapter applies a small set of automatic
heuristics (case-insensitive name match against the canonical fields).
**Schema mapping is explicit-only in v1.x.** When no `schema_mapping` is
provided, source field names pass through unchanged —
`_apply_schema_mapping` short-circuits with `if not mapping: return record`
(`src/layerlens/instrument/adapters/frameworks/benchmark_import/adapter.py:423-434`).
There is **no automatic heuristic detection** (no case-insensitive
matching, no fuzzy aliasing) in the current implementation. To map a
source column to a canonical field, you must list it explicitly in
`schema_mapping`.

Automatic heuristic detection is on the v1.8 roadmap — see
[`docs/adapters/benchmark_import.md`](./benchmark_import.md) §3.3 and §7
("v1.8 — Schema-heuristic detection").
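
The explicit-only behavior described above can be sketched in a few lines. This is a minimal illustration, not the adapter's code: `apply_schema_mapping` is a hypothetical stand-in for the private `_apply_schema_mapping`, and it assumes the mapping is keyed by source column name with canonical field names as values and that unlisted columns are kept as-is. Check both assumptions against the canonical specification before relying on them.

```python
from typing import Any, Dict, Optional


def apply_schema_mapping(
    record: Dict[str, Any],
    mapping: Optional[Dict[str, str]],
) -> Dict[str, Any]:
    """Hypothetical stand-in for the adapter's private helper (not the real code)."""
    # Described behavior: with no mapping, the record passes through unchanged.
    if not mapping:
        return record
    # Assumption: keys are source column names, values are canonical field names.
    # Assumption: columns not listed in the mapping keep their original names.
    return {mapping.get(key, key): value for key, value in record.items()}


source_row = {"Level": "hard", "subject": "algebra", "question": "2 + 2?"}

# No mapping: nothing is renamed, so "Level" is NOT matched to "difficulty".
unmapped = apply_schema_mapping(source_row, None)

# Explicit mapping: only the listed columns are renamed.
mapped = apply_schema_mapping(
    source_row, {"Level": "difficulty", "subject": "category"}
)
print(unmapped)
print(mapped)
```

The point of the sketch is the first branch: without a mapping there is no case-insensitive or alias matching, so a column such as `Level` stays `Level`.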

## Persistence

If you pass a `store=` argument to `BenchmarkImportAdapter(...)` (something
that exposes `save_benchmark(metadata, records)`), the adapter writes
imported benchmarks through it. Otherwise records are returned to the
caller and held in `adapter._benchmarks` keyed by `benchmark_id`.
If you pass a `store=` argument to `BenchmarkImportAdapter(...)`, the
adapter persists imported benchmarks through it. The `store` is
duck-typed: it must expose a single method
`insert_row(table_name: str, row: dict)`. The adapter writes one row to
the `"benchmarks"` table containing `metadata.model_dump()`, then one
row per imported record to the `"benchmark_records"` table (each record
is augmented with `record["benchmark_id"] = metadata.benchmark_id`
before insertion). See
`src/layerlens/instrument/adapters/frameworks/benchmark_import/adapter.py:436-446`
for the `_persist` implementation. Persistence failures are swallowed
and logged at debug level — `import_*` will still return
`success=True` even if the store raises.
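
As a rough sketch of the write order and failure handling described above, the following restates the steps in standalone Python. The function `persist`, the logger name, and `FailingStore` are hypothetical; metadata is a plain dict here (the real adapter is described as passing `metadata.model_dump()`), and wrapping all writes in a single `try` block is an assumption about the real `_persist`.

```python
import logging
from typing import Any, Dict, List

logger = logging.getLogger("benchmark_import_sketch")  # hypothetical logger name


def persist(store: Any, metadata: Dict[str, Any], records: List[Dict[str, Any]]) -> None:
    """Illustrative restatement of the described persistence steps."""
    if store is None:
        return
    try:
        # One row for the benchmark itself.
        store.insert_row("benchmarks", metadata)
        for record in records:
            # Each record is tagged with its parent benchmark before insertion.
            record["benchmark_id"] = metadata["benchmark_id"]
            store.insert_row("benchmark_records", record)
    except Exception as exc:
        # Failures are swallowed and logged at debug level, so callers never see them.
        logger.debug("persistence failed: %s", exc)


class FailingStore:
    """Hypothetical store whose writes always fail."""

    def insert_row(self, table_name: str, row: Dict[str, Any]) -> None:
        raise RuntimeError("database unavailable")


# The call returns normally even though the first write raised.
persist(FailingStore(), {"benchmark_id": "demo"}, [{"question": "2 + 2?"}])
```

Because the failure is only visible at debug level, a caller who needs to know whether rows actually landed has to check the store directly rather than rely on the import result.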

If `store` is `None`, the adapter only retains `BenchmarkMetadata`
objects in-memory (`adapter._benchmarks`, keyed by `benchmark_id`, used
by `list_benchmarks()` / `get_benchmark()`). The imported records
themselves are NOT retained on the adapter: they are produced inside
`import_*`, reported only as a count in `ImportResult.records_imported`
(not the rows themselves), and otherwise discarded. Callers that need the rows without
a real store should pass a shim whose `insert_row` captures them (e.g.
an in-test list).
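
A capturing shim of the kind mentioned above might look like the following. `CapturingStore` is a hypothetical name, and the two `insert_row` calls are made by hand to stand in for what the adapter is described as writing; in real use the shim would be passed as `store=` to `BenchmarkImportAdapter(...)`.

```python
from collections import defaultdict
from typing import Any, DefaultDict, Dict, List


class CapturingStore:
    """Hypothetical in-memory shim satisfying the duck-typed store contract."""

    def __init__(self) -> None:
        self.tables: DefaultDict[str, List[Dict[str, Any]]] = defaultdict(list)

    def insert_row(self, table_name: str, row: Dict[str, Any]) -> None:
        # Copy the row so later mutation by the caller does not alter the capture.
        self.tables[table_name].append(dict(row))


shim = CapturingStore()

# In real use: adapter = BenchmarkImportAdapter(store=shim), then run an import.
# Here the writes described in this section are simulated by hand.
shim.insert_row("benchmarks", {"benchmark_id": "demo"})
shim.insert_row("benchmark_records", {"benchmark_id": "demo", "question": "2 + 2?"})

rows = shim.tables["benchmark_records"]
print(len(rows))
```

After an import, `shim.tables["benchmark_records"]` would hold the rows that the adapter itself does not retain.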

For the deeper persistence contract, see
[`docs/adapters/benchmark_import.md`](./benchmark_import.md) §3.4
("Persistence contract").

## Events emitted

14 changes: 12 additions & 2 deletions docs/adapters/frameworks-embedding.md
@@ -3,7 +3,17 @@
`layerlens.instrument.adapters.frameworks.embedding.EmbeddingAdapter` and
`VectorStoreAdapter` instrument embedding-creation calls and vector-store
operations across the common providers. They emit `embedding.create` and
`vector_store.query` events with dimension, batch size, and latency metadata.
`retrieval.query` events with dimension, batch size, and latency metadata.

> **Event taxonomy note:** vector-store retrieval is normalized to
> `retrieval.query` regardless of backend — Pinecone, Weaviate, and Chroma
> wrappers all emit the same event type (the literal string
> `"retrieval.query"` is emitted at
> `src/layerlens/instrument/adapters/frameworks/embedding/vector_store_adapter.py`
> lines 167, 199, and 233). Backend-specific differences live in the event
> payload (e.g. Pinecone's `namespace`, Weaviate's `query_type`, Chroma's
> `distance_*`). See the canonical specification at
> [`docs/adapters/embedding.md`](./embedding.md) for the full contract.

## Install

@@ -76,7 +86,7 @@ vs_adapter.connect()
| Event | Layer | When |
|---|---|---|
| `embedding.create` | L3 | Per embedding call. Payload: `provider`, `model`, `batch_size`, `dimensions`, `total_tokens`, `latency_ms`. |
| `vector_store.query` | L3 | Per vector-store query. Payload: `provider`, `top_k`, `result_count`, `latency_ms`, `index_name`. |
| `retrieval.query` | L3 | Per vector-store query. Payload varies by backend (`provider`, `top_k`/`limit`/`n_results`, `result_count`/`match_count`, `has_filter`/`has_where`, `namespace` (Pinecone), `query_type` (Weaviate), `score_min`/`score_max`/`score_mean` (Pinecone), `distance_min`/`distance_max` (Chroma), `latency_ms`). See [`docs/adapters/embedding.md`](./embedding.md) §3.3 for the full payload schema. |
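
Since the `retrieval.query` payload uses different key names depending on the backend, a consumer may want to collapse them into one shape. The sketch below is one way to do that, using only the key names listed in the table row above. The helper names, the output shape, and the example payload values are hypothetical, and the example does not claim which backend emits which key.

```python
from typing import Any, Dict, Optional, Sequence

# Alternative key names listed in the payload description above.
REQUESTED_KEYS = ("top_k", "limit", "n_results")
RETURNED_KEYS = ("result_count", "match_count")
FILTER_KEYS = ("has_filter", "has_where")


def first_present(payload: Dict[str, Any], keys: Sequence[str]) -> Optional[Any]:
    """Return the value of the first key found in the payload, else None."""
    for key in keys:
        if key in payload:
            return payload[key]
    return None


def summarize_retrieval(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Collapse backend-specific key names into one hypothetical common shape."""
    return {
        "provider": payload.get("provider"),
        "requested": first_present(payload, REQUESTED_KEYS),
        "returned": first_present(payload, RETURNED_KEYS),
        "filtered": bool(first_present(payload, FILTER_KEYS)),
        "latency_ms": payload.get("latency_ms"),
    }


# Hand-written example payload; the values are made up for illustration.
example = {
    "provider": "example-backend",
    "n_results": 5,
    "result_count": 4,
    "has_where": True,
    "latency_ms": 12.5,
}
summary = summarize_retrieval(example)
print(summary)
```

Backend-only fields such as `namespace`, `query_type`, or the score and distance statistics are left out of the summary on purpose; a consumer that needs them can read them from the original payload.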

## Dimension tracking
