Support for outputting embeddings by joefutrelle · Pull Request #2 · WHOIGit/ifcb-inference

joefutrelle · 2026-06-05T16:47:44Z

This pull request adds penultimate-layer embedding output. A one-time ONNX graph surgery exposes the embedding tensor as a second model output, then --embeddings captures it alongside class scores in the same forward pass and writes it per bin.
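The graph surgery can be sketched roughly as follows. This is a minimal illustration, not the code in add_embedding_output.py. It assumes the pre-head tensor is the first input of the last Gemm/MatMul node in the graph's (topologically sorted) node list. The names find_pre_head_tensor and add_embedding_output are hypothetical.

```python
HEAD_OPS = ("Gemm", "MatMul")


def find_pre_head_tensor(nodes):
    """Return the first input of the last Gemm/MatMul node.

    Assumption: the classifier head is the final Gemm/MatMul, and its
    first input (A) carries the activations rather than the weights.
    """
    for node in reversed(list(nodes)):
        if node.op_type in HEAD_OPS and len(node.input) > 0:
            return node.input[0]
    raise ValueError("no Gemm/MatMul node found; pass a tensor name explicitly")


def add_embedding_output(src_path, dst_path, tensor_name=None):
    """Expose one intermediate tensor as an extra graph output (hypothetical helper)."""
    # Imported lazily so the detection helper above has no dependencies.
    import onnx
    from onnx import shape_inference

    model = onnx.load(src_path)
    name = tensor_name or find_pre_head_tensor(model.graph.node)

    if name not in {out.name for out in model.graph.output}:
        # Reuse shape-inferred type info when it is available; otherwise
        # fall back to a bare ValueInfoProto that carries only the name.
        inferred = shape_inference.infer_shapes(model)
        info = next(
            (vi for vi in inferred.graph.value_info if vi.name == name), None
        )
        if info is None:
            info = onnx.ValueInfoProto(name=name)
        model.graph.output.append(info)

    onnx.save(model, dst_path)
    return name
```

Because the new output is appended, the class scores should stay at index 0 and the embedding should appear at index 1 of the session outputs, which matches the outputs[1] capture described in this PR.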

add_embedding_output.py (new): auto-detects the pre-head tensor (final Gemm/MatMul input) and adds it to the model's graph outputs; --tensor-name override.
cli.py: new --embeddings / --embeddings-only / --embeddings-outfile flags; write_embeddings writes Parquet (pid + embedding as fixed_size_list).
sanstorch.py / withtorch.py: capture outputs[1] and accumulate it alongside scores.
pyproject.toml: new [embeddings] extra (pyarrow).
README: install row, options, and an Embeddings section.

Key questions:

Is Parquet the right output format for embeddings?
Is float16 sufficient output precision, or is it a premature space optimization?
Are the README changes accurate and sufficient?
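As background for the float16 question, the following standard-library sketch measures the rounding error from a float16 round trip. It says nothing about this model's actual embeddings: the sample values are arbitrary, and the helper names are hypothetical. float16 keeps a 10-bit mantissa, so values in its normal range round with a relative error of at most 2^-11 (about 0.05%), at 2 bytes per element instead of 4 for float32.

```python
import struct


def to_float16_and_back(x):
    """Round a Python float to IEEE 754 half precision and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def max_relative_error(values):
    """Largest relative rounding error over the non-zero inputs."""
    return max(
        abs(to_float16_and_back(v) - v) / abs(v) for v in values if v != 0.0
    )


# Arbitrary sample values in [0.001, 2.0]; real embeddings may be distributed differently.
samples = [0.001 * k for k in range(1, 2001)]
worst = max_relative_error(samples)
print(f"worst relative error: {worst:.2e} (bound: {2 ** -11:.2e})")
```

Whether that error matters depends on downstream use. Note also that float16 cannot represent magnitudes above 65504, so unnormalized embeddings with large activations would need checking.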

support for outputting embeddings

808508d

joefutrelle requested a review from sbatchelder June 5, 2026 16:47

joefutrelle self-assigned this Jun 5, 2026

pin ONNX IR version to 10 in test fixture

860be15

joefutrelle wants to merge 2 commits into main from output-embeddings
