feat(embeddings): add Gemini embeddings provider support#583
Open
harrisony wants to merge 4 commits into
Open
Conversation
…eJsonResponseBody, and markProviderSuccess - Merge model-level extraBody between provider and alias levels (was silently dropped) - Replace raw response.json() with parseJsonResponseBody (crashes on non-JSON responses) - Add markProviderSuccess call so providers recover from cooldown via embeddings path
…AI transformer, and factory Replace the monolithic EmbeddingsTransformer class with an interface + OpenAIEmbeddingsTransformer + EmbeddingsTransformerFactory pattern, parallel to the existing TransformerFactory. Inline parseRequest() in the route handler. Make usage optional on UnifiedEmbeddingsResponse.
Add GeminiEmbeddingsTransformer implementing the EmbeddingsTransformer interface:
- Single-input: POST /v1beta/models/{model}:embedContent
- Batch: POST /v1beta/models/{model}:batchEmbedContents
- x-goog-api-key auth header instead of Bearer token
- OpenAI-compatible response format conversion (data[].embedding, usage.prompt_tokens)
Wire EmbeddingsTransformerFactory into dispatchEmbeddings: replace the hard-coded OpenAI
URL/header/payload construction with factory-dispatched transformRequest, transformResponse,
getEndpoint, and getAuthHeaders. Adds isPassthrough: true to embeddings plexus metadata.
Supporting fixes:
- Extend router filter to include gemini-type providers for embeddings routing
- Pass input field in probe dispatchEmbeddings call
- Fix base URL resolution via API_TYPE_ALIASES: embeddings (and transcriptions, speech,
images) now resolve their base URL from the chat/gemini URL key when no exact match is
configured, replacing the unpredictable first-key fallback
- Make GeminiEmbeddingsTransformer stateless: remove requestModel instance field, thread
the original request through transformResponse instead
- Add resolveTransformer() to EmbeddingsTransformerFactory and extract API_TYPE_ALIASES
to module scope (from inline in resolveBaseUrl)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Adds a Gemini embeddings provider alongside the existing OpenAI pass-through. The embeddings pipeline now uses a transformer interface + factory (same pattern as chat/completions), so adding more provider types later is just implementing the interface.
How it works
Clients still send standard OpenAI-format
POST /v1/embeddings. When the router selects a Gemini provider (api_type: gemini), theGeminiEmbeddingsTransformer:/v1beta/models/{model}:embedContent(single) or:batchEmbedContents(batch), prependingmodels/if neededcontent: { parts: [{ text }] }shape, with passthrough fortaskType,title, andoutputDimensionalitywhen presentAuthorization: Bearerforx-goog-api-keydata[].embedding,usage.prompt_tokens)What's changed
EmbeddingsTransformerinterface +EmbeddingsTransformerFactory— replaces the old monolithicEmbeddingsTransformerclassOpenAIEmbeddingsTransformer— extracted pass-through behaviorGeminiEmbeddingsTransformer— new, stateless (request threaded throughtransformResponseinstead of stored as instance state)dispatchEmbeddingshardened: merges modelextraBody, usesparseJsonResponseBody, callsmarkProviderSuccessfor cooldown recovery, setsisPassthrough: true, and recordsattemptCount/retryHistorygemini-type providers in embeddings matches; probe sendsinputfield