docs(adapters): embedding + benchmark_import specs by mmercuri · Pull Request #148 · LayerLens/stratix-python

mmercuri · 2026-05-10T16:36:58Z

Summary

Closes the spec gap flagged by the depth audit at A:/tmp/adapter-depth-audit.md for the two framework adapters that previously had no dedicated specification:

docs/adapters/embedding.md (~2,260 words / 332 lines) — covers EmbeddingAdapter (OpenAI / Cohere / sentence-transformers) and VectorStoreAdapter (Pinecone / Weaviate / Chroma) under audit sec 1.17.
docs/adapters/benchmark_import.md (~2,380 words / 361 lines) — covers BenchmarkImportAdapter (HuggingFace / HELM / CSV / JSON / Parquet) under audit sec 1.18, including the architectural carve-out for "data importer" adapters that do not extend BaseAdapter.

These docs sit alongside the existing user-reference docs (frameworks-embedding.md, frameworks-benchmark_import.md). The reference docs are quick-start guides; these specs are the contractual documents. Each spec covers: Overview, Capability surface, Contract (public API + events emitted + lifecycle + error handling), explicit non-goals, BYOK / multi-tenancy, test coverage (current + target), and the v1.7 / v1.8 roadmap.

Honesty notes (per CLAUDE.md "no fake claims")

Each spec calls out aspirational behavior in the source/companion-doc as such, rather than restating it as fact:

Embedding event name — source emits retrieval.query (verified at vector_store_adapter.py:169 / 201 / 235). The companion user-reference doc claims vector_store.query. Spec sec 3.3 names the source as authority.
benchmark_import automatic schema heuristics — companion doc claims case-insensitive aliasing; source _apply_schema_mapping short-circuits on empty mapping with no heuristic fallback. Spec sec 3.3 calls this out and §7 puts heuristic detection on the v1.8 roadmap.
Cross-adapter gaps: typed Pydantic events, OTel gen_ai.* semconv, and org_id envelope propagation are flagged where applicable (audit §2 findings 1, 2, 4) and linked to the in-flight PRs that close them: #125 (feat(instrument): OpenTelemetry GenAI semantic conventions for all LLM-call adapters (spec 07)), #118 (fix(instrument): Propagate org_id through all event emissions (multi-tenancy CLAUDE.md fix)), #119 (fix(instrument): Brand leak in agentforce trust layer YAML + missing STREAMING/REPLAY capability declarations), and #129 (feat(instrument): Typed Pydantic event foundation + agno reference (1/17 adapters)).
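To make the schema-mapping note concrete, here is a minimal sketch of the short-circuit behavior described above. It is not the adapter's code: `apply_schema_mapping` is a hypothetical stand-in for `_apply_schema_mapping`, and the mapping direction (source column to target field) and the pass-through of unlisted keys are assumptions made for illustration.

```python
from typing import Any, Dict


def apply_schema_mapping(record: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Hypothetical stand-in illustrating the behavior the spec describes."""
    # Empty mapping: short-circuit and return the record as-is.
    # No case-insensitive aliasing or other heuristic is attempted.
    if not mapping:
        return dict(record)
    # Explicit mapping (assumed direction: source key -> target key).
    # Keys not listed in the mapping pass through unchanged (assumption).
    return {mapping.get(key, key): value for key, value in record.items()}


record = {"Question": "2+2?", "Answer": "4"}

# With no mapping, "Question" is NOT silently aliased to a canonical field.
unmapped = apply_schema_mapping(record, {})

# With an explicit mapping, the caller decides the target field names.
mapped = apply_schema_mapping(record, {"Question": "input", "Answer": "expected_output"})
```

Under these assumptions, a caller who relies on the companion doc's aliasing claim would get back the original column names untouched, which is why the spec treats heuristic detection as roadmap work rather than current behavior.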

What is explicitly NOT supported

The "do NOT support" sections enumerate non-goals so reviewers and customers can avoid building on assumptions the source does not back:

Embedding: no async clients, no streaming (embeddings have no stream surface), no re-ranker, no Voyage / Mistral-embed / Anthropic-embed, no vector-store mutation tracking (write paths), no Qdrant / Milvus / pgvector / FAISS, no per-chunk retrieval correlation.
Benchmark import: not a BaseAdapter, no telemetry events, no live mid-run feedback, no MTEB / MMLU / HumanEval native importers (work via import_huggingface), no automatic heuristic mapping, no incremental import / cursor, no signing or attestation, no per-record validation.

Test plan

Reviewer reads each spec end-to-end and verifies the "Contract" section against the source files (embedding_adapter.py, vector_store_adapter.py, benchmark_import/adapter.py).
Reviewer confirms the "What we do NOT support" lists match audit A:/tmp/adapter-depth-audit.md §1.17 / §1.18 verdicts (no behavior is claimed that the source does not implement).
Reviewer cross-checks the v1.7 / v1.8 roadmap items against the audit's Tier-1 / Tier-2 cross-cutting work (typed events, OTel semconv, org_id propagation, sample / test parity restoration).
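A reviewer could partly automate the first check with a small script along these lines. This is a rough sketch, not tooling that ships with the PR: the regex only finds quoted dotted lowercase literals, the `emit(...)` call in `source_snippet` is a made-up example line, and the helper names are hypothetical.

```python
import re
from typing import Iterable, Set

# Matches quoted, dotted, lowercase literals such as "retrieval.query".
# Assumption: event names appear in the source as plain string literals.
EVENT_NAME = re.compile(r"""["']([a-z_]+\.[a-z_]+)["']""")


def quoted_event_names(text: str) -> Set[str]:
    """Collect every quoted dotted literal found in the given source text."""
    return set(EVENT_NAME.findall(text))


def unbacked_claims(doc_names: Iterable[str], source_text: str) -> Set[str]:
    """Return event names a doc claims that never appear in the source."""
    return set(doc_names) - quoted_event_names(source_text)


# Made-up source line for illustration; the real emit call may look different.
source_snippet = 'self.emit("retrieval.query", payload)'

# Names a doc might claim: one backed by the source, one not.
claims = unbacked_claims({"retrieval.query", "vector_store.query"}, source_snippet)
```

Here `claims` contains only `vector_store.query`, mirroring the discrepancy the embedding spec corrects. A check like this can flag candidates, but it does not replace reading the Contract section against the source files.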

Closes the spec gap flagged by the depth audit at A:/tmp/adapter-depth-audit.md sec 1.17 (embedding) and sec 1.18 (benchmark_import). These two adapters were the only framework adapters lacking dedicated specifications. The new specs sit alongside the existing user-reference docs (frameworks-embedding.md, frameworks-benchmark_import.md) and describe what the adapter is contracted to do, the explicit non-goals, BYOK/multi-tenancy posture, current test coverage, and the v1.7 / v1.8 roadmap. Per CLAUDE.md, every claim traces back to a line in the source; aspirational behavior is called out as such (e.g. the user-reference doc claim of automatic schema-heuristic mapping in benchmark_import is not implemented). Embedding spec also corrects the event-name discrepancy: the source emits 'retrieval.query', not 'vector_store.query' as the existing user-reference doc claims.

…t.md (4th inaccuracy) The Persistence section claimed save_benchmark(metadata, records) but source uses insert_row(table, row) at adapter.py:441-444. Surfaced as bonus finding by agent a56bee25024184e15 while fixing the prior 3 inaccuracies in this PR. Cross-reference PR #148 canonical spec at docs/adapters/benchmark_import.md.

mmercuri requested a review from m-peko May 10, 2026 16:36

mmercuri mentioned this pull request May 10, 2026

docs(adapters): fix user-doc inaccuracies (closes spec PR #148 audit) #150

Closed

4 tasks

m-peko closed this May 21, 2026

mmercuri wants to merge 1 commit into main from docs/adapter-specs-embedding-benchmark-import

2 participants
