@shoryasethia

Description

This PR adds support for Google Gemini's Text-to-Speech (TTS) API to Verbi, so users can select Google's latest TTS capabilities alongside the existing providers.

Changes Made

  • Added Google Gemini TTS integration to voice_assistant/text_to_speech.py
  • Updated configuration to support gemini as a TTS model option
  • Added API key management for Google API
  • Updated requirements to include google-generativeai
  • Updated documentation in README

Usage

Set TTS_MODEL = 'gemini' in config.py and add GOOGLE_API_KEY to your .env file.
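As a rough illustration of the setup described above, the sketch below shows one way the key lookup could be guarded. The helper name `get_google_api_key` is hypothetical and not taken from this PR; it assumes the `.env` file contains a line of the form `GOOGLE_API_KEY=your-key-here` and that its values have already been loaded into the process environment.

```python
import os

# config.py (sketch): select Gemini as the TTS provider, as described in this PR.
TTS_MODEL = "gemini"


def get_google_api_key(env=os.environ):
    """Return GOOGLE_API_KEY from the given mapping, or fail with a clear message.

    Hypothetical helper for illustration; `env` defaults to the process
    environment and can be replaced with a plain dict for testing.
    """
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to your .env file")
    return key
```

Failing early with an explicit message is a design choice here: a missing key is then reported at startup rather than on the first TTS request.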

References

@shoryasethia
Author

@PromtEngineer Hello, requesting your review of this PR.

@shoryasethia
Author

Hi @PromtEngineer

PR Update: Gemini TTS Support - Ready for Review

I've updated this PR with significant improvements and fixes. The implementation is now fully tested and working.

What This PR Adds

Google Gemini Text-to-Speech Integration - Adds support for Google's latest TTS capabilities using the Gemini 2.5 Flash Preview TTS model.

Major Changes & Fixes (Jan 24, 2026)

1. Migrated to New Official Google API

  • Replaced deprecated google-generativeai (v0.8.6) with google-genai (v1.60.0)
  • Updated all API calls to use the new genai.Client() and types classes
  • Eliminates deprecation warnings
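A minimal sketch of what a TTS request through the new client might look like, under several assumptions: the model ID `gemini-2.5-flash-preview-tts`, the voice name `Kore`, and the helper names `build_tts_config` and `synthesize_pcm` are illustrative and not taken from this PR. The PR uses the `types` classes; this sketch passes an equivalent plain dict for the config so the helpers can be exercised without the SDK installed.

```python
def build_tts_config(voice_name="Kore"):
    """Build a request config asking for audio output in a prebuilt voice.

    Plain-dict form of the config; the voice name is an assumed example.
    """
    return {
        "response_modalities": ["AUDIO"],
        "speech_config": {
            "voice_config": {
                "prebuilt_voice_config": {"voice_name": voice_name}
            }
        },
    }


def synthesize_pcm(client, text, model="gemini-2.5-flash-preview-tts",
                   voice_name="Kore"):
    """Send `text` to the model and return the raw audio bytes of the reply.

    `client` is expected to behave like a google-genai `genai.Client()`.
    """
    response = client.models.generate_content(
        model=model,
        contents=text,
        config=build_tts_config(voice_name),
    )
    # The audio is carried as inline data on the first part of the first candidate.
    return response.candidates[0].content.parts[0].inline_data.data


# Usage (requires the google-genai package and a valid API key):
#   from google import genai
#   client = genai.Client(api_key="...")
#   pcm = synthesize_pcm(client, "Hello from Verbi")
```

Passing the client in as an argument keeps the function easy to test with a stub and leaves API key handling to the caller.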

2. Fixed Deepgram SDK Compatibility

  • Upgraded the Deepgram SDK from v2.12.0 to v5.3.1
  • Migrated transcription and TTS to use new v5 API (client.listen.v1.media.transcribe_file(), client.speak.v1.audio.generate())
  • Removed deprecated imports (PrerecordedOptions, FileSource, SpeakOptions)

3. Fixed Audio Format Issues

  • Added proper WAV file headers for Gemini TTS PCM audio output
  • Configured for 24kHz, 16-bit, mono format
  • Resolves pygame playback errors
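The header fix can be sketched with Python's standard `wave` module. This is an illustration of the technique, not necessarily the code in this PR: the function name `pcm_to_wav_bytes` is hypothetical, and the defaults mirror the 24kHz, 16-bit, mono format listed above.

```python
import io
import wave


def pcm_to_wav_bytes(pcm, sample_rate=24000, channels=1, sample_width=2):
    """Wrap raw PCM samples in a WAV container and return the file bytes.

    sample_width is in bytes, so 2 means 16-bit samples.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)  # header sizes are finalized when the writer closes
    return buf.getvalue()
```

Writing the returned bytes to a `.wav` file gives a player such as pygame the sample rate, bit depth, and channel count that headerless PCM data does not carry.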

Configurable Options

Users can customize Gemini TTS in config.py
