Add Google Gemini TTS API support #37

shoryasethia · 2026-01-07T18:50:28Z

Description

This PR adds support for Google Gemini's Text-to-Speech (TTS) API to Verbi, allowing users to use Google's latest TTS capabilities alongside existing providers.

Changes Made

Added Google Gemini TTS integration to voice_assistant/text_to_speech.py
Updated configuration to support gemini as a TTS model option
Added API key management for Google API
Updated requirements to include google-generativeai
Updated documentation in README

Usage

Set TTS_MODEL = 'gemini' in config.py and add GOOGLE_API_KEY to your .env file.
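For illustration, the two settings above could be checked at startup roughly as follows. This is a minimal sketch, not the code in this PR: `get_google_api_key` is a hypothetical helper, and it assumes the contents of `.env` have already been loaded into the process environment.

```python
import os

# Value the PR says to set in config.py.
TTS_MODEL = "gemini"


def get_google_api_key(env=None):
    """Return GOOGLE_API_KEY from a mapping (defaults to os.environ).

    Hypothetical helper for illustration; the PR's own key handling may differ.
    """
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to your .env file")
    return key
```

Failing early with a clear message avoids a less obvious authentication error on the first TTS request.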

References

Google Gemini Speech Generation Documentation: https://ai.google.dev/gemini-api/docs/speech-generation

shoryasethia · 2026-01-07T19:05:17Z

@PromtEngineer Hello, requesting your review of this PR.

shoryasethia · 2026-01-24T18:35:37Z

Hi @PromtEngineer

PR Update: Gemini TTS Support - Ready for Review

I've updated this PR with significant improvements and fixes. The implementation is now fully tested and working.

What This PR Adds

Google Gemini Text-to-Speech Integration - Adds support for Google's latest TTS capabilities using the Gemini 2.5 Flash Preview TTS model.

Major Changes & Fixes (Jan 24, 2026)

1. Migrated to New Official Google API

Replaced deprecated google-generativeai (v0.8.6) with google-genai (v1.60.0)
Updated all API calls to use the new genai.Client() and types classes
Eliminates deprecation warnings

2. Fixed Deepgram SDK Compatibility

Upgraded deepgram-sdk from v2.12.0 to v5.3.1
Migrated transcription and TTS to use new v5 API (client.listen.v1.media.transcribe_file(), client.speak.v1.audio.generate())
Removed deprecated imports (PrerecordedOptions, FileSource, SpeakOptions)

3. Fixed Audio Format Issues

Added proper WAV file headers for Gemini TTS PCM audio output
Configured for 24kHz, 16-bit, mono format
Resolves pygame playback errors
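The header fix described above can be sketched with Python's standard `wave` module. This is an illustrative sketch rather than the PR's actual code: `pcm_to_wav_bytes` is a hypothetical name, and the input is assumed to be raw little-endian PCM in the 24 kHz, 16-bit, mono format stated above.

```python
import io
import wave


def pcm_to_wav_bytes(pcm, sample_rate=24000, channels=1, sample_width=2):
    """Wrap raw PCM bytes in a WAV (RIFF) container and return the result.

    Defaults match the format described above: 24 kHz, 16-bit (2 bytes), mono.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)  # bytes per sample
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)            # header sizes are filled in on close
    return buf.getvalue()
```

Without the header, a player has no way to know the sample rate, sample width, or channel count of the raw bytes, which is consistent with the playback errors this change addresses.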

Configurable Options

Users can customize the Gemini TTS model and voice settings in config.py.

Add Google Gemini TTS API support

48cd583

shoryasethia added 7 commits January 24, 2026 23:38

Add configurable Gemini TTS model and voice settings

33d5f05

Fix deepgram-sdk compatibility by upgrading to v5 and updating API calls

9a0278b

Merge deepgram SDK v5 fixes into gemini-tts branch

4420d9a

Fix transcription.py import for deepgram SDK v5

907b1aa

Migrate to google-genai package and fix Groq model

df7d41d

Fix Gemini TTS audio format to WAV

5636127

Add proper WAV headers for Gemini TTS PCM audio output

5060437
