feat(embeddings): add Gemini embeddings provider support#583

Open: harrisony wants to merge 4 commits into mcowger:main from harrisony:embeddings

Adds a Gemini embeddings provider alongside the existing OpenAI pass-through. The embeddings pipeline now uses a transformer interface + factory (same pattern as chat/completions), so adding more provider types later is just implementing the interface.

How it works

Clients still send standard OpenAI-format POST /v1/embeddings. When the router selects a Gemini provider (api_type: gemini), the GeminiEmbeddingsTransformer:

  • Builds the endpoint: /v1beta/models/{model}:embedContent (single) or :batchEmbedContents (batch), prepending models/ if needed
  • Converts the request body to Gemini's content: { parts: [{ text }] } shape, with passthrough for taskType, title, and outputDimensionality when present
  • Swaps Authorization: Bearer for x-goog-api-key
  • Converts the response back to OpenAI format (data[].embedding, usage.prompt_tokens)
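
The request-side steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the type names, the helper names (`qualifyModel`, `toEmbedContent`), and the rule that any array input counts as a batch are assumptions made for the example.

```typescript
// Hypothetical request shapes for illustration; the PR's real types may differ.
interface OpenAIEmbeddingsRequest {
  model: string;
  input: string | string[];
  taskType?: string;
  title?: string;
  outputDimensionality?: number;
}

interface GeminiEmbedContentRequest {
  model: string;
  content: { parts: { text: string }[] };
  taskType?: string;
  title?: string;
  outputDimensionality?: number;
}

// Prepend "models/" only when the caller has not already done so.
function qualifyModel(model: string): string {
  return model.startsWith("models/") ? model : `models/${model}`;
}

// Assumption: any array input is treated as a batch, a bare string as single.
function getEndpoint(req: OpenAIEmbeddingsRequest): string {
  const method = Array.isArray(req.input) ? "batchEmbedContents" : "embedContent";
  return `/v1beta/${qualifyModel(req.model)}:${method}`;
}

// Build one Gemini embedContent body, passing optional fields through
// only when the client supplied them.
function toEmbedContent(req: OpenAIEmbeddingsRequest, text: string): GeminiEmbedContentRequest {
  const out: GeminiEmbedContentRequest = {
    model: qualifyModel(req.model),
    content: { parts: [{ text }] },
  };
  if (req.taskType !== undefined) out.taskType = req.taskType;
  if (req.title !== undefined) out.title = req.title;
  if (req.outputDimensionality !== undefined) out.outputDimensionality = req.outputDimensionality;
  return out;
}

function transformRequest(
  req: OpenAIEmbeddingsRequest,
): GeminiEmbedContentRequest | { requests: GeminiEmbedContentRequest[] } {
  const input = req.input;
  return Array.isArray(input)
    ? { requests: input.map((text) => toEmbedContent(req, text)) }
    : toEmbedContent(req, input);
}

// Gemini takes the key in x-goog-api-key rather than Authorization: Bearer.
function getAuthHeaders(apiKey: string): Record<string, string> {
  return { "x-goog-api-key": apiKey };
}
```

Keeping the optional fields out of the body when they are absent avoids sending explicit `undefined` or `null` values upstream.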

What's changed

  • EmbeddingsTransformer interface + EmbeddingsTransformerFactory — replaces the old monolithic EmbeddingsTransformer class
  • OpenAIEmbeddingsTransformer — extracted pass-through behavior
  • GeminiEmbeddingsTransformer — new, stateless (request threaded through transformResponse instead of stored as instance state)
  • dispatchEmbeddings hardened: merges model extraBody, uses parseJsonResponseBody, calls markProviderSuccess for cooldown recovery, sets isPassthrough: true, and records attemptCount/retryHistory
  • Router now includes gemini-type providers in embeddings matches; probe sends input field
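
For readers unfamiliar with the interface + factory pattern, here is a rough sketch of its shape. The four method names come from the commit messages in this PR; the signatures, the `EmbeddingsRequest` type, the `OpenAIPassThrough` class, the `/embeddings` path, and the `createEmbeddingsTransformer` function are hypothetical and chosen only to keep the example self-contained.

```typescript
// Hypothetical request shape; the real types in the PR may differ.
interface EmbeddingsRequest {
  model: string;
  input: string | string[];
}

// One implementation per provider api_type. The original request is passed
// to transformResponse so implementations can stay stateless.
interface EmbeddingsTransformer {
  getEndpoint(request: EmbeddingsRequest): string;
  getAuthHeaders(apiKey: string): Record<string, string>;
  transformRequest(request: EmbeddingsRequest): unknown;
  transformResponse(upstream: unknown, request: EmbeddingsRequest): unknown;
}

// Pass-through: the client already speaks the OpenAI format.
class OpenAIPassThrough implements EmbeddingsTransformer {
  getEndpoint(): string {
    return "/embeddings"; // assumed path relative to the provider base URL
  }
  getAuthHeaders(apiKey: string): Record<string, string> {
    return { Authorization: `Bearer ${apiKey}` };
  }
  transformRequest(request: EmbeddingsRequest): unknown {
    return request;
  }
  transformResponse(upstream: unknown): unknown {
    return upstream;
  }
}

// Registry keyed by api_type; supporting a new provider means adding an entry.
const registry = new Map<string, () => EmbeddingsTransformer>([
  ["openai", () => new OpenAIPassThrough()],
]);

function createEmbeddingsTransformer(apiType: string): EmbeddingsTransformer {
  const make = registry.get(apiType);
  if (!make) {
    throw new Error(`No embeddings transformer for api_type "${apiType}"`);
  }
  return make();
}
```

In this sketch, a Gemini implementation would be wired in with a single `registry.set("gemini", ...)` call, leaving the dispatch code untouched.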

…eJsonResponseBody, and markProviderSuccess

- Merge model-level extraBody between provider and alias levels (was silently dropped)
- Replace raw response.json(), which crashes on non-JSON responses, with parseJsonResponseBody
- Add markProviderSuccess call so providers recover from cooldown via embeddings path
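
As a rough illustration of the first two fixes, the sketch below merges three `extraBody` levels and parses an upstream body without throwing. The precedence order (provider, then model, then alias, with later levels winning) is an assumption, and `mergeExtraBody` and `parseJsonBodySafely` are hypothetical stand-ins, not the helpers in this repository.

```typescript
type ExtraBody = Record<string, unknown>;

// Assumed precedence: provider < model < alias. Later spreads override
// earlier ones, and undefined levels contribute nothing.
function mergeExtraBody(provider?: ExtraBody, model?: ExtraBody, alias?: ExtraBody): ExtraBody {
  return { ...provider, ...model, ...alias };
}

type ParsedBody = { ok: true; value: unknown } | { ok: false; error: string };

// Hypothetical stand-in for a safe parser: returns a tagged result instead
// of letting a SyntaxError escape when the upstream sends HTML or plain text.
function parseJsonBodySafely(text: string): ParsedBody {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `Upstream returned a non-JSON body: ${reason}` };
  }
}
```

Returning a tagged result lets the dispatcher record the failure and move on to a retry instead of crashing the request handler.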
…AI transformer, and factory

Replace the monolithic EmbeddingsTransformer class with an interface +
OpenAIEmbeddingsTransformer + EmbeddingsTransformerFactory pattern,
parallel to the existing TransformerFactory. Inline parseRequest() in
the route handler. Make usage optional on UnifiedEmbeddingsResponse.

Add GeminiEmbeddingsTransformer implementing the EmbeddingsTransformer interface:
- Single-input: POST /v1beta/models/{model}:embedContent
- Batch: POST /v1beta/models/{model}:batchEmbedContents
- x-goog-api-key auth header instead of Bearer token
- OpenAI-compatible response format conversion (data[].embedding, usage.prompt_tokens)
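
The response conversion might look roughly like this. The Gemini shapes (`embedding.values` for single, `embeddings[].values` for batch) follow the public API; the function name `toOpenAIResponse` and the optional `promptTokens` argument are assumptions, since the source of the token count is not described here.

```typescript
interface GeminiEmbedding {
  values: number[];
}

// embedContent returns one embedding; batchEmbedContents returns a list.
type GeminiEmbeddingsResponse =
  | { embedding: GeminiEmbedding }
  | { embeddings: GeminiEmbedding[] };

interface OpenAIEmbeddingsResponse {
  object: "list";
  data: { object: "embedding"; index: number; embedding: number[] }[];
  model: string;
  usage?: { prompt_tokens: number; total_tokens: number };
}

// Stateless: the requested model is passed in rather than stored on an instance.
function toOpenAIResponse(
  upstream: GeminiEmbeddingsResponse,
  requestModel: string,
  promptTokens?: number, // assumed to be supplied by the caller when known
): OpenAIEmbeddingsResponse {
  const vectors = "embeddings" in upstream ? upstream.embeddings : [upstream.embedding];
  const result: OpenAIEmbeddingsResponse = {
    object: "list",
    data: vectors.map((v, index) => ({
      object: "embedding" as const,
      index,
      embedding: v.values,
    })),
    model: requestModel,
  };
  // usage is optional, so omit it entirely when no count is available.
  if (promptTokens !== undefined) {
    result.usage = { prompt_tokens: promptTokens, total_tokens: promptTokens };
  }
  return result;
}
```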

Wire EmbeddingsTransformerFactory into dispatchEmbeddings: replace the hard-coded OpenAI
URL/header/payload construction with factory-dispatched transformRequest, transformResponse,
getEndpoint, and getAuthHeaders. Adds isPassthrough: true to embeddings plexus metadata.

Supporting fixes:
- Extend router filter to include gemini-type providers for embeddings routing
- Pass input field in probe dispatchEmbeddings call
- Fix base URL resolution via API_TYPE_ALIASES: embeddings (and transcriptions, speech,
  images) now resolve their base URL from the chat/gemini URL key when no exact match is
  configured, replacing the unpredictable first-key fallback
- Make GeminiEmbeddingsTransformer stateless: remove requestModel instance field, thread
  the original request through transformResponse instead
- Add resolveTransformer() to EmbeddingsTransformerFactory and extract API_TYPE_ALIASES
  to module scope (from inline in resolveBaseUrl)
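
The alias-based base URL lookup could be sketched as below. The contents of the alias table, the key names, and the function signature are assumptions made for illustration; only the idea (exact match first, then a fixed alias order, never "whatever key happens to be first") comes from the commit message.

```typescript
// Assumed alias table: which configured URL keys an API type may fall back to,
// in priority order. The real table in the PR may differ.
const API_TYPE_ALIASES: Record<string, string[]> = {
  embeddings: ["chat", "gemini"],
  transcriptions: ["chat", "gemini"],
  speech: ["chat", "gemini"],
  images: ["chat", "gemini"],
};

// baseUrls is either a single URL or a map from URL key to URL.
function resolveBaseUrl(
  baseUrls: string | Record<string, string>,
  apiType: string,
): string | undefined {
  if (typeof baseUrls === "string") return baseUrls;

  // 1. An exact match always wins.
  const exact = baseUrls[apiType];
  if (exact !== undefined) return exact;

  // 2. Otherwise walk the aliases in a fixed, documented order.
  for (const alias of API_TYPE_ALIASES[apiType] ?? []) {
    const url = baseUrls[alias];
    if (url !== undefined) return url;
  }

  // 3. No match: report it instead of guessing from object key order.
  return undefined;
}
```

Returning `undefined` rather than the first configured key makes a misconfigured provider fail visibly instead of routing requests to an unrelated URL.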