diff --git a/docs/adapters/benchmark_import.md b/docs/adapters/benchmark_import.md
new file mode 100644
index 0000000..0e327be
--- /dev/null
+++ b/docs/adapters/benchmark_import.md
@@ -0,0 +1,361 @@
+# Benchmark import adapter — specification
+
+> Specification document for `layerlens.instrument.adapters.frameworks.benchmark_import`.
+> Companion to the user-facing reference at `docs/adapters/frameworks-benchmark_import.md`.
+> The depth audit at `A:/tmp/adapter-depth-audit.md` §1.18 flagged this adapter
+> as having no spec; this document closes that gap.
+
+| | |
+|---|---|
+| Spec ID | ADP-074 (benchmark import) |
+| Module | `layerlens.instrument.adapters.frameworks.benchmark_import` |
+| Source | `src/layerlens/instrument/adapters/frameworks/benchmark_import/adapter.py` |
+| Adapter type | **Data importer** — does NOT extend `BaseAdapter`; not a runtime instrumentation adapter |
+| Status | Functional, ports byte-near-identical from ateam |
+| Tests | None dedicated — covered by `tests/instrument/adapters/frameworks/test_bulk_ported_smoke.py::test_benchmark_import_adapter_independent` |
+
+---
+
+## 1. Overview
+
+The benchmark import adapter is a **data importer**. It reads benchmark
+suites from external sources (HuggingFace Datasets, HELM result JSON,
+local CSV/JSON/Parquet files), normalizes the records to LayerLens'
+canonical evaluation schema, and persists them so the platform's
+evaluation runner can consume them like any other dataset.
+
+This adapter is **architecturally distinct** from every other adapter in
+`frameworks/`. The other 17 are runtime instrumentation adapters that
+extend `BaseAdapter`, wrap an in-process framework, and emit telemetry
+events as the wrapped framework runs. `BenchmarkImportAdapter` does none
+of that:
+
+* It does NOT extend `BaseAdapter`. It is a standalone class.
+* It does NOT have a `connect()` / `disconnect()` lifecycle.
+* It does NOT emit any telemetry events.
+* It does NOT wrap any third-party framework's runtime methods.
+* It is closer in shape to the `langfuse` adapter's `importer.py` submodule
+  than to LangChain's callback handler.
+
+In ATEAM the smoke test for this adapter
+(`test_benchmark_import_adapter_independent`) explicitly documents this
+non-`BaseAdapter` shape; the audit (§1.18) classifies the architectural
+inconsistency as intentional but flags that the framework-architecture
+spec lacks a "data importer" carve-out for it. This spec provides that
+carve-out.
+
+The adapter exists because LayerLens evaluation spaces need a way to
+import published benchmark suites — MMLU, SQuAD, HumanEval, and similar
+HuggingFace datasets — without forcing every customer to write their own
+ETL. The downstream evaluation runner (in atlas-app) iterates the
+imported benchmark and produces the `model.invoke` / `evaluation.score`
+events through standard provider adapters.
+
+---
+
+## 2. Capability surface
+
+`BenchmarkImportAdapter` does NOT use the `AdapterCapability` enum and
+does NOT publish an `AdapterInfo` record. The capability surface is
+expressed instead through the public method set listed in §3.
+
+What the adapter CAN do, sourced directly from `adapter.py`:
+
+| Capability                                     | Source           |
+|------------------------------------------------|------------------|
+| Stream a HuggingFace dataset split             | `import_huggingface` (line 101–171). Uses `datasets.load_dataset(streaming=True)` to avoid materializing the whole split. |
+| Parse HELM JSON results                        | `import_helm` (line 175–246). Handles the top-level-list, `{results: …}`, and `{scenarios: […]}` shapes. |
+| Read a CSV file                                | `import_csv` (line 250–295). Uses stdlib `csv.DictReader`; `delimiter` is configurable. |
+| Read a JSON file (array or object-with-key)    | `import_json` (line 297–349). The `records_key` argument selects the array inside an object payload (defaults to `records`, falling back to `data`). |
+| Read a Parquet file                            | `import_parquet` (line 351–405). Lazily imports `pyarrow.parquet`; returns a clear error if `pyarrow` isn't installed. |
+| Apply schema-mapping renames                   | `_apply_schema_mapping` (line 419–431). Pure key-renaming; no value transforms. |
+| List / look up imported benchmarks             | `list_benchmarks` (line 409), `get_benchmark` (line 413). |
+| Persist via an injected store                  | `_persist` (line 433–443). If the constructor receives a `store=` argument with `insert_row(table, row)`, the adapter writes one row to the `benchmarks` table and one row per record to the `benchmark_records` table. |
+
+---
+
+## 3. Contract
+
+### 3.1 Public API
+
+| Member                                          | Description                                                                |
+|-------------------------------------------------|----------------------------------------------------------------------------|
+| `__init__(store: Any \| None = None)`           | Construct with optional persistent store. Without a store, imports stay in memory. |
+| `import_huggingface(dataset_name, split="test", subset=None, schema_mapping=None, max_records=None, tags=None) -> ImportResult` | Streamed HuggingFace import. |
+| `import_helm(path, schema_mapping=None, tags=None) -> ImportResult` | HELM JSON import.            |
+| `import_csv(path, schema_mapping=None, delimiter=",", max_records=None, tags=None) -> ImportResult` | CSV import. |
+| `import_json(path, schema_mapping=None, records_key=None, max_records=None, tags=None) -> ImportResult` | JSON import. |
+| `import_parquet(path, schema_mapping=None, max_records=None, tags=None) -> ImportResult` | Parquet import. |
+| `list_benchmarks() -> list[BenchmarkMetadata]`  | All metadata held in this adapter instance.                                |
+| `get_benchmark(benchmark_id) -> BenchmarkMetadata \| None` | Single lookup.                                                  |
+
+### 3.2 Public data models
+
+`BenchmarkMetadata` and `ImportResult` are Pydantic v1/v2-compatible
+models defined in the same module (`adapter.py:34–63`).
+
+`BenchmarkMetadata`:
+
+| Field                | Type               | Default                                    |
+|----------------------|--------------------|--------------------------------------------|
+| `benchmark_id`       | `str`              | `f"bench-{uuid4().hex[:12]}"`              |
+| `name`               | `str`              | required                                   |
+| `source`             | `str`              | required (`huggingface`/`helm`/`csv`/`json`/`parquet`) |
+| `source_identifier`  | `str`              | empty                                      |
+| `version`            | `str`              | `"1.0.0"`                                  |
+| `record_count`       | `int`              | 0                                          |
+| `schema_mapping`     | `dict[str, str]`   | `{}`                                       |
+| `imported_at`        | `str` (ISO-8601)   | `datetime.now(UTC).isoformat()`            |
+| `imported_by`        | `str`              | empty                                      |
+| `tags`               | `list[str]`        | `[]`                                       |
+
+`ImportResult`:
+
+| Field              | Type                          | Default |
+|--------------------|-------------------------------|---------|
+| `success`          | `bool`                        | `True`  |
+| `benchmark_id`     | `str`                         | empty   |
+| `records_imported` | `int`                         | 0       |
+| `records_skipped`  | `int`                         | 0       |
+| `duration_ms`      | `float`                       | 0.0     |
+| `errors`           | `list[str]`                   | `[]`    |
+| `metadata`         | `Optional[BenchmarkMetadata]` | `None`  |
+
+### 3.3 Schema mapping
+
+`schema_mapping` is a `dict[str, str]` of `source_field → canonical_field`.
+The mapping is applied identically across all five import methods via
+`_apply_schema_mapping`. The mapping is a pure key rename — values pass
+through unmodified, and source keys NOT in the mapping pass through with
+their original names.
+
+There is **no automatic heuristic mapping** in the source. The
+user-reference doc at `docs/adapters/frameworks-benchmark_import.md`
+mentions "automatic heuristics" — this is not implemented in code today
+(`_apply_schema_mapping` short-circuits to `return record` if the mapping
+is None or empty). Treat that doc claim as aspirational pending a future
+enhancement.
+
+The recommended canonical fields, when imports will be consumed by the
+LayerLens evaluation runner:
+
+| Canonical field    | Common source field aliases                            |
+|--------------------|--------------------------------------------------------|
+| `prompt`           | `question`, `input`, `query`                           |
+| `expected_output`  | `answer`, `target`, `reference`, `ground_truth`        |
+| `difficulty`       | `difficulty`, `level`                                  |
+| `category`         | `category`, `subject`, `topic`                         |
+
+### 3.4 Persistence contract
+
+The `store` constructor argument is duck-typed: any object with a single
+method `insert_row(table_name: str, row: dict)` will be accepted. The
+adapter writes:
+
+1. One row to table `"benchmarks"` containing `metadata.model_dump()`.
+2. One row per imported record to table `"benchmark_records"`, each
+   record dict pre-augmented with `record["benchmark_id"] = metadata.benchmark_id`.
+
+If `store` is `None`, the metadata is held in `self._benchmarks` (an
+in-memory dict keyed by `benchmark_id`) but the records themselves are
+NOT retained — they are produced and discarded inside the `import_*`
+method. This is intentional: holding millions of HuggingFace records in
+process memory would defeat the streaming load. Callers who need the
+records without a store should pass a callable shim that captures them
+(e.g. an in-test list).
+
+### 3.5 Error handling
+
+Every `import_*` method follows the same error-handling shape:
+
+* `ImportError` from a lazy dependency import (`datasets`, `pyarrow`)
+  → return `ImportResult(success=False, errors=["… library not installed …"])`.
+* `FileNotFoundError`, `json.JSONDecodeError`, etc. → return
+  `ImportResult(success=False, errors=[…])`.
+* Any other exception → return `ImportResult(success=False, errors=[f"X import failed: {exc}"])`.
+
+The methods do NOT raise. A caller can rely on the `result.success` flag
+and the `result.errors` list. Persistence failures inside `_persist` are
+swallowed and logged at DEBUG (`adapter.py:441–443`); the `ImportResult`
+will still report `success=True` even though nothing reached the store.
+This is a known gap — see §7 roadmap.
+
+---
+
+## 4. What we do NOT support
+
+* **Not a `BaseAdapter`.** No `connect()`, `disconnect()`, `health_check()`,
+  `get_adapter_info()`, `serialize_for_replay()`, sinks, capture config,
+  or capability declarations. The class is freestanding.
+* **No telemetry events.** `BenchmarkImportAdapter` does NOT emit
+  `embedding.create`, `model.invoke`, `evaluation.score`, or any other
+  event type. The downstream evaluation runner is responsible for
+  generating events when it iterates the imported benchmark.
+* **No live mid-run feedback.** This is a one-shot ETL: call
+  `import_csv(...)`, get an `ImportResult`. There is no streaming
+  interface, no progress callback, no per-batch event. For very large
+  datasets the only way to bound memory is via `max_records`.
+* **No MTEB importer.** The user-reference doc's parent-directory listing
+  references "MTEB" generically; there is no `import_mteb(...)` method.
+  MTEB datasets reachable through HuggingFace can be loaded via
+  `import_huggingface("mteb/...")`, but there is no MTEB-specific
+  schema mapping or score normalization. Adding native MTEB support is
+  on the roadmap.
+* **No MMLU-specific importer.** Same situation as MTEB — load it as a
+  HuggingFace dataset (`import_huggingface("cais/mmlu", subset=...)`).
+  No native MMLU-aware mapping ships today.
+* **No BIG-bench, GPQA, HumanEval, SWE-bench, AGIEval, etc. native
+  importers.** Same answer: any of these reachable through HuggingFace
+  Datasets work via `import_huggingface`, but no benchmark-specific
+  schema knowledge ships.
+* **No automatic schema-heuristic detection.** The reference doc
+  describes case-insensitive aliasing; the source does not implement it.
+* **No cross-benchmark deduplication.** Reimporting the same dataset
+  produces a fresh `benchmark_id` and a fresh row in the `benchmarks`
+  table. There is no dedupe key.
+* **No re-import / incremental update.** No cursor, no `since`, no diff
+  mode. Every import is a fresh full-load (within `max_records`).
+* **No per-record validation against the canonical evaluation schema.**
+  The adapter renames keys via `schema_mapping`; it does NOT verify that
+  a `prompt` field exists, that values are non-empty, or that types are
+  what the evaluation runner expects.
+* **No per-record signing or attestation.** Imported records are stored
+  as plain rows. There is no merkle root, no source-hash, no provenance
+  envelope.
+* **No multi-tenant scoping inside the adapter.** The adapter has no
+  `org_id` argument, does not stamp `org_id` onto rows, and does not
+  inject tenant context into the `store.insert_row` call. Multi-tenant
+  scoping is the responsibility of the **store** the caller injects —
+  see §5.
+* **No remote-source authentication.** HuggingFace public datasets work
+  out of the box. Gated datasets require `HUGGINGFACE_HUB_TOKEN` set in
+  the environment of the hosting process; the adapter does not surface
+  this as an explicit kwarg.
+* **No format-conversion outputs.** The adapter does not export to a
+  different format. Records flow in, get persisted, and live in whatever
+  shape the store keeps them.
+* **No sample (in `samples/instrument/benchmark_import/`) for
+  HuggingFace or HELM imports.** Only a CSV sample exists today.
+
+---
+
+## 5. BYOK and multi-tenancy
+
+**BYOK is not applicable.** The adapter does not call any LLM provider
+API and does not need a model API key. HuggingFace dataset loading uses
+the `datasets` library's own credential resolution
+(`HUGGINGFACE_HUB_TOKEN` env var) for gated datasets; this is outside
+the LayerLens BYOK scope per the project's "BYOK = model API keys only"
+convention.
+
+**Multi-tenancy** is the responsibility of the injected `store`. The
+adapter is tenant-agnostic by design — it does not know about
+organizations, users, or tenant scoping. A multi-tenant deployment must:
+
+1. Inject a `store` whose `insert_row` is itself tenant-aware (e.g.
+   wraps the underlying SQL with a `tenant_id` column populated from
+   request context).
+2. Serialize benchmark imports per tenant — do NOT share a single
+   `BenchmarkImportAdapter` instance across tenants because
+   `self._benchmarks` (the in-memory metadata cache) does not carry
+   tenant scoping. Callers running on a per-request basis should
+   construct a fresh adapter per import.
+3. If the platform later wants to surface "all benchmarks for org X",
+   that lookup must go through the store, not through `list_benchmarks()`
+   on a long-lived adapter instance.
+
+The `imported_by` field on `BenchmarkMetadata` is meant to record the
+user who triggered the import; it currently defaults to empty and must
+be populated by the caller before persistence. The platform's CLI/API
+layer is the right place to set it.
+
+---
+
+## 6. Test coverage
+
+* **Dedicated tests:** none.
+* **Bulk smoke coverage:** `tests/instrument/adapters/frameworks/test_bulk_ported_smoke.py::test_benchmark_import_adapter_independent`
+  (lines 170–189). The smoke test exercises:
+  * `BenchmarkMetadata(name="test", source="csv")` constructs and the
+    `benchmark_id` has the expected `bench-` prefix.
+  * `ImportResult(success=True, benchmark_id=…)` constructs.
+  * `BenchmarkImportAdapter()` constructs.
+
+  The smoke test does NOT exercise any actual import path. It is a
+  module-importability and dataclass-shape check, not a behavioral test.
+
+* **What's missing vs ateam:** ateam ships a dedicated integration test
+  exercising at least the CSV path end-to-end (audit §1.18). That file
+  was not ported. Restoring it is on the Tier-2 test-restoration queue
+  (audit §3 item 16).
+
+* **Sample:** `samples/instrument/benchmark_import/main.py` exercises
+  `import_csv` end-to-end against a temporary CSV. There are no samples
+  for HuggingFace, HELM, JSON, or Parquet paths.
+
+A complete test pass for v1.7 would cover:
+
+1. CSV import with explicit `schema_mapping`; verify `records_imported`,
+   `metadata.source == "csv"`, all rows reachable through the schema
+   mapping.
+2. CSV import with `max_records=N`; verify the cap.
+3. CSV import with custom `delimiter="\t"`.
+4. JSON import — array form, object-with-`records`-key form,
+   object-with-default-`data`-key form.
+5. JSON import where the file is not valid JSON → `success=False` and
+   a meaningful error string.
+6. HELM import — the three accepted shapes (top-level list,
+   `{results: …}`, `{scenarios: […]}`).
+7. HELM import where the file is missing → `success=False`.
+8. Parquet import without `pyarrow` installed → returns the expected
+   error string. (Achievable in CI by uninstalling pyarrow in a separate
+   tox env.)
+9. HuggingFace import without `datasets` installed → returns the
+   expected error string.
+10. HuggingFace import with a small public dataset (`squad`,
+    `max_records=2`) — gated behind `LAYERLENS_RUN_NETWORK_TESTS=1` so
+    it doesn't run in default CI.
+11. `_persist` with an injected stub store; verify the two table writes.
+12. `_persist` failure path — store raises; verify the result is still
+    returned (currently `success=True`, see §3.5 known gap).
+13. `list_benchmarks` / `get_benchmark` round-trip after a successful
+    import.
+
+---
+
+## 7. Roadmap
+
+* **v1.7 — Native MTEB importer.** `import_mteb(task_name, ...)` that
+  knows the MTEB task taxonomy and writes a benchmark per task. Likely
+  built on top of `import_huggingface` with task-specific schema
+  mappings.
+* **v1.7 — Native MMLU importer.** Subject-aware mapping
+  (`subject` → `category`), automatic answer-letter normalization
+  (`A`/`B`/`C`/`D` → choice text).
+* **v1.7 — `imported_by` plumbing.** Wire the CLI / API layer to
+  populate `imported_by` from the authenticated user.
+* **v1.7 — Persist-failure surface.** `_persist` errors should set
+  `result.success = False` and append to `result.errors` instead of
+  being swallowed at DEBUG.
+* **v1.7 — Restored test parity with ateam** (Tier-2 item 16).
+* **v1.8 — Schema-heuristic detection.** Implement the case-insensitive
+  alias matching that the user-reference doc currently describes
+  aspirationally.
+* **v1.8 — Per-record validation.** Optional `validate=True` flag that
+  rejects rows missing the canonical `prompt` field after mapping;
+  populate `records_skipped` accordingly.
+* **v1.8 — Source-hash provenance.** Stamp `metadata.source_hash =
+  sha256(file_or_dataset_bytes)` for attestation. Required for
+  audit-trail benchmarks shipped to regulated tenants.
+* **v1.9 — Native HumanEval / SWE-bench / GPQA importers.** Mirrors the
+  MMLU/MTEB pattern.
+* **No-date-set — Incremental import.** Cursor-based reimport for
+  benchmarks that grow over time. Practical only once the store layer
+  exposes a `last_imported_record_id` lookup.
+* **No-date-set — Cross-benchmark dedupe.** Decision pending product
+  input on whether identical question text across benchmarks should
+  collapse or remain distinct.
+
+---
+
+*End of spec.*
diff --git a/docs/adapters/embedding.md b/docs/adapters/embedding.md
new file mode 100644
index 0000000..0484093
--- /dev/null
+++ b/docs/adapters/embedding.md
@@ -0,0 +1,332 @@
+# Embedding adapter — specification
+
+> Specification document for `layerlens.instrument.adapters.frameworks.embedding`.
+> Companion to the user-facing reference at `docs/adapters/frameworks-embedding.md`.
+> This file describes WHAT the adapter is contracted to do, what it explicitly
+> does NOT do, and what is on the roadmap. The depth audit at
+> `A:/tmp/adapter-depth-audit.md` §1.17 flagged this adapter as having no
+> dedicated spec; this document closes that gap.
+
+| | |
+|---|---|
+| Spec ID | ADP-060 (embedding) + ADP-061 (vector store) |
+| Module | `layerlens.instrument.adapters.frameworks.embedding` |
+| Source | `src/layerlens/instrument/adapters/frameworks/embedding/` (`embedding_adapter.py`, `vector_store_adapter.py`) |
+| Adapter type | Provider — runtime wrapping of embedding-API calls and vector-store queries |
+| Status | Functional, ports byte-near-identical from ateam |
+| Tests | None dedicated — covered indirectly via `tests/instrument/adapters/frameworks/test_bulk_ported_smoke.py` |
+
+---
+
+## 1. Overview
+
+The embedding adapter is a runtime instrumentation adapter that wraps the
+client-side methods used to **create vector embeddings** and **query vector
+stores**. It targets the data-preparation and retrieval halves of a typical
+RAG pipeline, not the generative half (the LLM call itself is covered by the
+provider adapters in `layerlens.instrument.adapters.providers.*`).
+
+The module ships two distinct `BaseAdapter` subclasses — they are independent
+and either can be used without the other.
+
+* **`EmbeddingAdapter`** — wraps client methods that turn text into vectors.
+  Today: OpenAI `client.embeddings.create`, Cohere `client.embed`, and any
+  `sentence_transformers.SentenceTransformer.encode`.
+  *(Source: `embedding_adapter.py:118–143`.)*
+* **`VectorStoreAdapter`** — wraps client methods that retrieve vectors by
+  similarity. Today: Pinecone `Index.query`, Weaviate `collection.query.near_vector`
+  and `collection.query.near_text`, and Chroma `collection.query`.
+  *(Source: `vector_store_adapter.py:114–145`.)*
+
+Both adapters use the same monkey-patch + restore pattern as the rest of the
+framework adapters: `connect()` registers the wrapper, `disconnect()` calls
+`_restore_originals()` to put the original methods back. They emit dict events
+through `BaseAdapter.emit_dict_event` rather than typed Pydantic events; the
+adapter is therefore subject to the cross-cutting "typed events" gap called out
+in audit §2 finding 1.
+
+This adapter is conceptually different from a future RAG-retrieval adapter:
+it captures the API call to the embedding/vector-store backend, but it does
+NOT correlate retrieved chunks with the downstream LLM call that consumed
+them. That correlation is the responsibility of the agent-framework adapter
+(LangChain, LlamaIndex, etc.) sitting above it.
+
+---
+
+## 2. Capability surface
+
+Both classes declare a single capability via `AdapterInfo.capabilities`:
+
+| Class              | Declared capability  | Source                              |
+|--------------------|----------------------|-------------------------------------|
+| `EmbeddingAdapter` | `TRACE_MODELS`       | `embedding_adapter.py:101–103`      |
+| `VectorStoreAdapter` | `TRACE_TOOLS`      | `vector_store_adapter.py:97–99`     |
+
+The choice of `TRACE_MODELS` for embeddings reflects that an embedding API
+call is a model invocation against a hosted embedding model
+(`text-embedding-3-small`, `embed-english-v3.0`, etc.). The choice of
+`TRACE_TOOLS` for vector stores reflects that a similarity query is treated
+by the schema as a retrieval tool call rather than a model call.
+
+**Capabilities NOT declared:**
+
+* `STREAMING` — embedding APIs return whole vectors per call; there is no
+  intermediate token stream to capture. This is consistent with the
+  underlying provider APIs (OpenAI `embeddings.create` returns a synchronous
+  `CreateEmbeddingResponse`; Cohere `embed` returns a list; sentence-transformers
+  returns a numpy/torch tensor in one shot).
+* `REPLAY` — both classes implement `serialize_for_replay()` (returning a
+  `ReplayableTrace` containing `self._trace_events`), but neither declares
+  the capability. This is the same `serialize_for_replay()`-without-`REPLAY`-
+  declaration gap that the depth audit flagged across all adapters
+  (audit §2 finding 3); it is tracked in PR #119.
+* `TRACE_HANDOFFS`, `TRACE_STATE`, `TRACE_PROTOCOL_EVENTS` — not relevant to
+  the embedding/vector-store call surface.
+
+---
+
+## 3. Contract
+
+### 3.1 Public API — `EmbeddingAdapter`
+
+| Member                          | Description                                                                       |
+|---------------------------------|-----------------------------------------------------------------------------------|
+| `__init__(stratix=, capture_config=, *, org_id=)` | Standard `BaseAdapter` constructor.                            |
+| `connect()`                     | Marks the adapter healthy; takes no other action (no networking, no discovery).   |
+| `disconnect()`                  | Restores all originals, marks disconnected, closes sinks.                         |
+| `health_check() -> AdapterHealth` | Returns adapter status (`HEALTHY` / `DISCONNECTED` / etc.) and error count.     |
+| `get_adapter_info() -> AdapterInfo` | Returns name, version, framework, capabilities, author, description.          |
+| `serialize_for_replay() -> ReplayableTrace` | Returns the in-memory `_trace_events` for replay.                     |
+| `wrap_openai(client) -> client` | Patches `client.embeddings.create`. Returns the same client.                      |
+| `wrap_cohere(client) -> client` | Patches `client.embed`. Returns the same client.                                  |
+| `wrap_sentence_transformer(model) -> model` | Patches `model.encode`. Returns the same model.                       |
+
+Each `wrap_*` method is **idempotent against missing methods** but **not
+idempotent against repeat calls**: calling `wrap_openai(client)` twice on the
+same client wraps the wrapper. Callers should `disconnect()` (which restores)
+between repeated wrappings, or wrap each client exactly once.
+
+### 3.2 Public API — `VectorStoreAdapter`
+
+| Member                          | Description                                                                       |
+|---------------------------------|-----------------------------------------------------------------------------------|
+| Lifecycle                       | Same as `EmbeddingAdapter` (mirror methods).                                      |
+| `wrap_pinecone(index) -> index` | Patches `index.query`.                                                            |
+| `wrap_weaviate(collection) -> collection` | Patches `collection.query.near_vector` and `collection.query.near_text`.|
+| `wrap_chroma(collection) -> collection` | Patches `collection.query`.                                             |
+
+### 3.3 Events emitted
+
+All events are emitted via `emit_dict_event(event_type, payload)`. They are
+NOT typed Pydantic events; that migration is tracked under audit §3 Tier-1
+recommendation 5.
+
+#### `embedding.create` (L3)
+
+Emitted by `EmbeddingAdapter` after every wrapped call returns.
+
+| Field           | Type    | Source                                                                | Notes |
+|-----------------|---------|-----------------------------------------------------------------------|-------|
+| `provider`      | str     | Hardcoded per wrapper: `"openai"` / `"cohere"` / `"sentence_transformers"` | |
+| `model`         | str     | `kwargs["model"]` (OpenAI/Cohere) or literal `"local"` (ST)          | OpenAI defaults to `"unknown"` if model not in kwargs (`embedding_adapter.py:151`). Cohere defaults to `"embed-english-v3.0"` (line 188). |
+| `batch_size`    | int     | `len(input)` if list, else `1`                                        | |
+| `dimensions`    | int \| None | `len(result.data[0].embedding)` (OpenAI), `len(result.embeddings[0])` (Cohere), `result.shape[1]` (ST) | None if response shape unrecognized. |
+| `total_tokens`  | int     | `result.usage.total_tokens` (OpenAI only)                             | Cohere/ST do not surface token counts in the same way and the field is omitted from those payloads. |
+| `latency_ms`    | float   | Wall-clock measurement around the wrapped call (`time.monotonic`)     | Rounded to 2 decimals. |
+
+#### `retrieval.query` (L3)
+
+Emitted by `VectorStoreAdapter` after every wrapped query returns. The
+**event type is `retrieval.query`, not `vector_store.query`** — the existing
+user-reference doc disagrees with the source on this point; the source
+(`vector_store_adapter.py:169, 201, 235`) is the authority.
+
+| Field           | Type    | Notes                                                                         |
+|-----------------|---------|-------------------------------------------------------------------------------|
+| `provider`      | str     | `"pinecone"` / `"weaviate"` / `"chroma"`                                      |
+| `top_k` / `n_results` / `limit` | int | Field name varies by provider to match the underlying SDK kwarg. |
+| `match_count` / `result_count` | int | Number of hits returned.                                          |
+| `has_filter` / `has_where` | bool | Whether a metadata filter was supplied (Pinecone `filter`, Chroma `where`). |
+| `namespace`     | str     | Pinecone-specific.                                                            |
+| `query_type`    | str     | Weaviate-specific: `"near_vector"` or `"near_text"`.                          |
+| `score_min`/`score_max`/`score_mean` | float \| None | Pinecone — extracted from `result.matches[*].score`.       |
+| `distance_min`/`distance_max` | float \| None | Chroma — from `result["distances"][0]`.                       |
+| `latency_ms`    | float   | Always present.                                                               |
+
+### 3.4 Lifecycle
+
+Both adapters inherit the standard `BaseAdapter` lifecycle. `connect()` is
+trivial (no remote handshake, no SDK discovery — embedding/vector clients
+are passed in by the caller). `disconnect()` restores wrapped methods via
+`_restore_originals()` and is the only operation that matters for clean
+teardown.
+
+### 3.5 Error handling
+
+The wrappers do NOT swallow exceptions. If the wrapped client raises, the
+exception propagates to the caller. The wrapper's `start = time.monotonic()`
+is reached but the `emit_dict_event` after the call is not, so failed calls
+are NOT emitted as `embedding.create` / `retrieval.query` events. The
+underlying `BaseAdapter._error_count` is also not incremented on these
+exceptions because the wrappers do not invoke
+`_record_error()`. This is a known gap shared with most ported adapters
+(audit §2 finding for "no PolicyViolationEvent path").
+
+---
+
+## 4. What we do NOT support
+
+Each item below is an explicit non-goal as of this spec. None of these
+behaviors should be inferred from the source — they are not present.
+
+* **No async embedding clients.** The OpenAI v1 SDK exposes
+  `AsyncOpenAI().embeddings.create()`; this adapter only wraps the sync
+  `OpenAI` client (`hasattr(client, "embeddings")` check at
+  `embedding_adapter.py:120`). The async coroutine method is NOT patched.
+  Async support is on the v1.7 roadmap.
+* **No streaming.** Embedding APIs return whole vectors; there are no
+  intermediate stream events. The `STREAMING` capability is not declared
+  and never will be for this adapter.
+* **No retrieval-correlation events.** Vector-store queries emit a single
+  `retrieval.query` event with aggregate scores — not per-document
+  correlation IDs. An LLM consuming the retrieved chunks has no way to
+  reference them back to a specific `retrieval.query` event without help
+  from the framework adapter above (LangChain/LlamaIndex).
+* **No content capture.** The wrapper records *batch_size* but does not
+  record the input texts themselves, nor the returned vectors. This is
+  intentional for privacy reasons and is consistent with `CaptureConfig`'s
+  `capture_content=False` default for production presets. There is no
+  knob today to opt into capturing input text or returned vectors.
+* **No re-ranker support.** Cohere `rerank`, Voyage `rerank`, and other
+  re-ranking endpoints are NOT wrapped. Only `embed`/`encode` paths.
+* **No Voyage / Mistral-embed / Anthropic-embed / NVIDIA-embed / GTE / BGE
+  hosted endpoints.** Only the three providers listed in §1 are wrapped.
+  Adding new providers requires a new `wrap_<provider>(client)` method
+  plus a `_make_<provider>_wrapper(original)` factory mirroring the
+  existing pattern.
+* **No vector-store mutation tracking.** Pinecone `upsert` / `delete`,
+  Weaviate `data.insert`, Chroma `add` / `update` are NOT wrapped. Only
+  read-side `query` operations are instrumented. Tracking writes is on
+  the v1.7 roadmap.
+* **No Qdrant, Milvus, FAISS, or pgvector adapters.** Only Pinecone /
+  Weaviate / Chroma are supported today.
+* **No batch-vs-single distinction event.** OpenAI lets you submit a
+  batch in a single `embeddings.create`; we record `batch_size` but do
+  not split the call into per-item events.
+* **No typed Pydantic event payloads.** All events are emitted as plain
+  dicts via `emit_dict_event`. Migration to typed events is tracked at
+  audit §3 Tier-1 item 5 (cross-adapter, not adapter-specific).
+* **No OTel `gen_ai.*` semantic conventions.** Audit §2 finding 2 applies.
+  PR #125 tracks the cross-adapter migration.
+* **No `org_id` in event envelopes by default.** The constructor accepts
+  `org_id=` and stores it on `self._org_id`, but the emit path does NOT
+  inject it into the event payload. PR #118 tracks the cross-adapter fix.
+
+---
+
+## 5. BYOK and multi-tenancy
+
+The embedding adapter integrates with provider keys indirectly. The caller
+is responsible for constructing the upstream client (`OpenAI()`, `cohere.Client()`,
+`SentenceTransformer(...)`); the adapter never instantiates a provider client
+of its own and never reads `OPENAI_API_KEY` / `COHERE_API_KEY` / similar
+environment variables. Whatever credential the caller's client has, that's
+what the wrapped call uses.
+
+This means BYOK key resolution for embeddings happens **outside** the
+adapter's scope, in two layers above it:
+
+1. The caller's application chooses which key to use (org-scoped BYOK key,
+   platform-managed key, or environment fallback).
+2. The platform's BYOK store (atlas-app `/api/v1/model-keys`) is responsible
+   for materializing the per-org key into the OpenAI/Cohere client before
+   the embedding call is made.
+
+The adapter's only contribution to multi-tenancy is the `org_id`
+constructor argument (`embedding_adapter.py:69`,
+`vector_store_adapter.py:65`). Today this is stored on the adapter
+instance but **not stamped onto emitted events** — the cross-adapter
+`org_id` propagation work tracked by PR #118 is the fix.
+
+For SaaS deployments where a single shared sink is fed by per-tenant
+adapter instances, the recommended pattern until PR #118 lands is to use
+distinct adapter instances per tenant and let the sink layer (which knows
+its own tenant context from its API-key-driven session) attach the
+`org_id` envelope. Do NOT share a single `EmbeddingAdapter` instance
+across tenants.
+
+---
+
+## 6. Test coverage
+
+* **Dedicated tests:** none.
+* **Bulk smoke coverage:** `tests/instrument/adapters/frameworks/test_bulk_ported_smoke.py:65–67`
+  imports `EmbeddingAdapter` and exercises construction, `connect()`,
+  metadata, and capability declaration via `_PARAMETRIZE_CASES`. The
+  smoke pass for `VectorStoreAdapter` is implicit — it inherits the same
+  base class but is not explicitly listed in the parametrize cases.
+* **What's missing vs ateam:** ateam ships `tests/adapters/embedding/test_integration.py`
+  exercising provider wrapping with mocked clients (audit §1.17). That
+  file was NOT ported. Restoring it is on the Tier-2 test-restoration
+  queue (audit §3 item 16).
+* **Sample:** `samples/instrument/embedding/main.py` exercises
+  `wrap_openai` end-to-end with a real OpenAI client. There is no
+  vector-store sample.
+
+A complete test pass for v1.7 would cover, at minimum:
+
+1. Construction with each `CaptureConfig` preset.
+2. `wrap_openai(client)` with a stubbed client that returns a
+   `CreateEmbeddingResponse`-shaped object — verify the emitted event has
+   the expected `provider`, `model`, `batch_size`, `dimensions`,
+   `total_tokens`, `latency_ms`.
+3. Same for Cohere (`embed`) and sentence-transformers (`encode`).
+4. `wrap_pinecone(index)` with a stub returning `matches=[Mock(score=…)]` —
+   verify `score_min/max/mean` aggregation.
+5. `wrap_weaviate(collection)` for both `near_vector` and `near_text`.
+6. `wrap_chroma(collection)` with the dict-shaped response.
+7. `disconnect()` restores all originals (assert `client.embeddings.create
+   is original`).
+8. Exception in wrapped call does NOT emit an event but does NOT swallow.
+9. `serialize_for_replay()` returns a `ReplayableTrace` with the events
+   accumulated during the session.
+10. Pydantic-compat smoke (`requires_pydantic = PydanticCompat.V1_OR_V2`)
+    holds for both classes.
+
+---
+
+## 7. Roadmap
+
+The following items are explicitly planned for v1.7 or later. Nothing in
+§1–§6 should be read as committing to any of them.
+
+* **v1.7 — Async client support.** Add `wrap_async_openai(client)` and
+  equivalents for AsyncCohere. Wrappers will be `async def` and `await`
+  the underlying call before emitting.
+* **v1.7 — Vector-store write instrumentation.** Wrap Pinecone `upsert`,
+  Weaviate `data.insert`, Chroma `add`/`update`. Emit
+  `vector_store.upsert` / `vector_store.delete` events.
+* **v1.7 — Voyage AI provider.** Add `wrap_voyage(client)` matching the
+  shape of the existing OpenAI wrapper.
+* **v1.7 — Restored test parity with ateam** (Tier-2 item 16).
+* **v1.8 — Qdrant + pgvector adapters.** Both have widely-used Python
+  clients with stable query surfaces; wrapping is straightforward.
+* **v1.8 — Typed Pydantic events** (cross-adapter, audit Tier-1 item 5).
+  Define `EmbeddingCreateEvent` and `RetrievalQueryEvent` Pydantic models;
+  switch the wrappers from `emit_dict_event` to `emit_event`.
+* **v1.8 — OTel `gen_ai.*` semconv** (cross-adapter, audit Tier-1 item 6).
+  Add parallel `gen_ai.system="openai"`, `gen_ai.request.model=…`,
+  `gen_ai.usage.input_tokens=…` attributes alongside the existing flat
+  fields.
+* **No-date-set — Re-ranker support.** Cohere `rerank` and Voyage `rerank`
+  are reasonable candidates. Decision pending product input on whether
+  re-rankers should be modeled as `tool.call` or as a new
+  `retrieval.rerank` event.
+* **No-date-set — Per-chunk retrieval correlation.** Requires upstream
+  framework adapter cooperation (LangChain / LlamaIndex / a future
+  retrieval-orchestration adapter); not solvable inside this adapter
+  alone.
+
+---
+
+*End of spec.*