docs(adapters): embedding + benchmark_import specs#148

Closed
mmercuri wants to merge 1 commit into main from docs/adapter-specs-embedding-benchmark-import

@mmercuri

Contributor

Summary

Closes the spec gap flagged by the depth audit at A:/tmp/adapter-depth-audit.md for the two framework adapters that previously had no dedicated specification:

  • docs/adapters/embedding.md (~2,260 words / 332 lines) — covers EmbeddingAdapter (OpenAI / Cohere / sentence-transformers) and VectorStoreAdapter (Pinecone / Weaviate / Chroma) under audit sec 1.17.
  • docs/adapters/benchmark_import.md (~2,380 words / 361 lines) — covers BenchmarkImportAdapter (HuggingFace / HELM / CSV / JSON / Parquet) under audit sec 1.18, including the architectural carve-out for "data importer" adapters that do not extend BaseAdapter.
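
The carve-out for "data importer" adapters can be pictured with a minimal sketch. Everything below is hypothetical and for illustration only: the `BaseAdapter` shown is a stand-in (its `emit` and `connect` methods are assumptions, not the framework's real interface), and `BenchmarkImporter.import_json` is an invented name rather than the actual `BenchmarkImportAdapter` API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class BaseAdapter(ABC):
    """Hypothetical stand-in for a runtime adapter base class."""

    def __init__(self):
        self.events = []  # events emitted while a run is in progress

    def emit(self, name, payload):
        # Assumed event hook; the real base class may look different.
        self.events.append((name, payload))

    @abstractmethod
    def connect(self): ...


@dataclass
class BenchmarkImporter:
    """Hypothetical data importer: a plain class that does not extend
    BaseAdapter, so it has no event hook and no lifecycle methods."""

    records: list = field(default_factory=list)

    def import_json(self, rows):
        # One-shot load of already-parsed rows; nothing is emitted.
        self.records.extend(dict(r) for r in rows)
        return len(self.records)


importer = BenchmarkImporter()
count = importer.import_json([{"prompt": "2+2", "expected": "4"}])
print(count, isinstance(importer, BaseAdapter))
```

The point of the sketch is only the shape: an importer moves data in once and returns, so it has no run to instrument, which is consistent with the "not a BaseAdapter, no telemetry events" non-goals listed for benchmark import.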

These docs sit alongside the existing user-reference docs (frameworks-embedding.md, frameworks-benchmark_import.md). The reference docs are quick-start guides; these specs are the contractual documents, covering Overview, Capability surface, Contract (public API, events emitted, lifecycle, error handling), explicit non-goals, BYOK / multi-tenancy, test coverage (current and target), and the v1.7 / v1.8 roadmap.

Honesty notes (per CLAUDE.md "no fake claims")

Each spec calls out aspirational behavior in the source or companion doc as such, rather than restating it as fact.

What is explicitly NOT supported

The "do NOT support" sections enumerate non-goals so reviewers and customers can avoid building on assumptions the source does not back:

  • Embedding: no async clients, no streaming (embeddings have no stream surface), no re-ranker, no Voyage / Mistral-embed / Anthropic-embed, no vector-store mutation tracking (write paths), no Qdrant / Milvus / pgvector / FAISS, no per-chunk retrieval correlation.
  • Benchmark import: not a BaseAdapter, no telemetry events, no live mid-run feedback, no MTEB / MMLU / HumanEval native importers (work via import_huggingface), no automatic heuristic mapping, no incremental import / cursor, no signing or attestation, no per-record validation.

Test plan

  • Reviewer reads each spec end-to-end and verifies the "Contract" section against the source files (embedding_adapter.py, vector_store_adapter.py, benchmark_import/adapter.py).
  • Reviewer confirms the "What we do NOT support" lists match audit A:/tmp/adapter-depth-audit.md §1.17 / §1.18 verdicts (no behavior is claimed that the source does not implement).
  • Reviewer cross-checks the v1.7 / v1.8 roadmap items against the audit's Tier-1 / Tier-2 cross-cutting work (typed events, OTel semconv, org_id propagation, sample / test parity restoration).

Closes the spec gap flagged by the depth audit at
A:/tmp/adapter-depth-audit.md sec 1.17 (embedding) and sec 1.18
(benchmark_import). These two adapters were the only framework
adapters lacking dedicated specifications.

The new specs sit alongside the existing user-reference docs
(frameworks-embedding.md, frameworks-benchmark_import.md) and
describe what the adapter is contracted to do, the explicit
non-goals, BYOK/multi-tenancy posture, current test coverage,
and the v1.7 / v1.8 roadmap. Per CLAUDE.md, every claim traces
back to a line in the source; aspirational behavior is called
out as such (e.g. the user-reference doc claim of automatic
schema-heuristic mapping in benchmark_import is not implemented).

Embedding spec also corrects the event-name discrepancy: the
source emits 'retrieval.query', not 'vector_store.query' as the
existing user-reference doc claims.
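
A reviewer could pin the corrected event name with a small regression check. The sketch below is an assumption-heavy illustration: `RecordingSink`, `FakeVectorStoreAdapter`, and the payload fields are hypothetical stand-ins, and only the two event-name strings come from this PR.

```python
class RecordingSink:
    """Hypothetical sink that records the name of every emitted event."""

    def __init__(self):
        self.names = []

    def emit(self, name, payload):
        self.names.append(name)


class FakeVectorStoreAdapter:
    """Stand-in for the real adapter; not the actual VectorStoreAdapter."""

    def __init__(self, sink):
        self.sink = sink

    def query(self, vector, top_k=3):
        # Event name as reported for the source; payload shape is assumed.
        self.sink.emit("retrieval.query", {"top_k": top_k, "dim": len(vector)})
        return []


sink = RecordingSink()
FakeVectorStoreAdapter(sink).query([0.1, 0.2, 0.3])

# The spec and the source should agree on this name.
assert sink.names == ["retrieval.query"]
assert "vector_store.query" not in sink.names
```

In a real test the stand-in would be replaced by the adapter from vector_store_adapter.py, wired to whatever event sink the framework provides.
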
mmercuri requested a review from m-peko on May 10, 2026, 16:36
mmercuri added a commit that referenced this pull request on May 10, 2026
…t.md (4th inaccuracy)

The Persistence section claimed save_benchmark(metadata, records) but
source uses insert_row(table, row) at adapter.py:441-444. Surfaced as
bonus finding by agent a56bee25024184e15 while fixing the prior 3
inaccuracies in this PR.

Cross-reference PR #148 canonical spec at docs/adapters/benchmark_import.md.
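
To make the difference between the two persistence shapes concrete, here is a minimal sketch of writing a benchmark through a row-level `insert_row(table, row)` call instead of a single `save_benchmark(metadata, records)` call. The `InMemoryStore` class, the table names, and the `benchmark_id` field are assumptions for illustration; only the `insert_row(table, row)` signature is taken from the commit message above.

```python
class InMemoryStore:
    """Hypothetical store exposing only insert_row(table, row)."""

    def __init__(self):
        self.tables = {}

    def insert_row(self, table, row):
        # Copy the row so later caller mutations do not leak into the store.
        self.tables.setdefault(table, []).append(dict(row))


def persist_benchmark(store, metadata, records):
    """Write one metadata row, then one row per record.

    Table names and the benchmark_id link are illustrative assumptions.
    """
    store.insert_row("benchmarks", metadata)
    for record in records:
        store.insert_row(
            "benchmark_records",
            {**record, "benchmark_id": metadata["id"]},
        )
    return 1 + len(records)  # total rows written


store = InMemoryStore()
written = persist_benchmark(
    store,
    {"id": "demo", "source": "csv"},
    [{"prompt": "a"}, {"prompt": "b"}],
)
print(written)
```

With a row-level API, any grouping of metadata and records is the caller's responsibility, which is why a doc describing a single `save_benchmark` call would mislead readers about the persistence contract.
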
m-peko closed this on May 21, 2026
