Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
# Speechmatics TTS Python Extension

This extension provides text-to-speech functionality using the Speechmatics TTS API.

## Features

- Low-latency speech synthesis (sub-150ms)
- High-quality, natural-sounding voices
- HTTP REST API integration
- Multiple voice options (UK and US English)
- Support for WAV and MP3 output formats
- Production-grade reliability

## Prerequisites

- Speechmatics API key
- Python 3.8+
- aiohttp package

## Configuration

Configure the extension through `property.json`:

```json
{
  "params": {
    "api_key": "your-api-key-here",
    "voice_id": "sarah",
    "output_format": "wav",
    "sample_rate": 16000,
    "base_url": "https://preview.tts.speechmatics.com"
  }
}
```

### Configuration Options

**Parameters inside the `params` object:**
- `api_key` (required): Speechmatics API key
- `voice_id` (required): Voice identifier (sarah, theo, megan, jack)
- `output_format` (optional): Audio format - "wav" or "mp3" (default: "wav")
- `sample_rate` (optional): Audio sample rate in Hz (default: 16000)
- `base_url` (optional): API base URL (default: "https://preview.tts.speechmatics.com")
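
As a rough illustration of how the options above combine, the following sketch applies the documented defaults and rejects missing required keys. The helper `resolve_params` and the constants around it are hypothetical and not part of the extension; the extension's own `validate()` only checks that `api_key` and `voice_id` are present, so the voice and format checks here are an added assumption.

```python
from typing import Any, Dict

# Defaults and allowed values copied from this README.
# The helper below is a hypothetical sketch, not the extension's code.
DEFAULTS: Dict[str, Any] = {
    "output_format": "wav",
    "sample_rate": 16000,
    "base_url": "https://preview.tts.speechmatics.com",
}
KNOWN_VOICES = {"sarah", "theo", "megan", "jack"}
KNOWN_FORMATS = {"wav", "mp3"}


def resolve_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params with defaults applied; raise ValueError on bad input."""
    # Required keys must be present and non-empty.
    for key in ("api_key", "voice_id"):
        if not params.get(key):
            raise ValueError(f"'{key}' is required")

    # Explicit values override the documented defaults.
    resolved = {**DEFAULTS, **params}

    # Stricter than the extension itself: reject values this README does not list.
    if resolved["voice_id"] not in KNOWN_VOICES:
        raise ValueError(f"unknown voice_id: {resolved['voice_id']!r}")
    if resolved["output_format"] not in KNOWN_FORMATS:
        raise ValueError(f"unsupported output_format: {resolved['output_format']!r}")
    return resolved
```

Calling `resolve_params({"api_key": "...", "voice_id": "sarah"})` would fill in `wav`, `16000`, and the preview base URL.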

### Available Voices

| Voice ID | Description |
|----------|-------------|
| `sarah` | English Female (UK) |
| `theo` | English Male (UK) |
| `megan` | English Female (US) |
| `jack` | English Male (US) |

## Getting Started

### 1. Get API Key

Create an API key at the [Speechmatics Portal](https://portal.speechmatics.com/).

### 2. Set Environment Variable

```bash
export SPEECHMATICS_API_KEY=your-api-key-here
```

### 3. Configure Extension

Update your `property.json` with the desired voice and settings.
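
Before starting the app, it can help to confirm that the environment variable from step 2 is actually set. The check below is a minimal sketch; the function name `require_api_key` is hypothetical and not part of the extension or the TEN runtime.

```python
import os
from typing import Mapping


def require_api_key(
    env: Mapping[str, str] = os.environ,
    name: str = "SPEECHMATICS_API_KEY",
) -> str:
    """Return the API key from the environment, or raise if it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the app")
    return key
```

Passing a plain dictionary as `env` makes the check easy to exercise in tests without touching the real environment.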

## API Details

- **Endpoint**: `https://preview.tts.speechmatics.com/generate/{voice_id}`
- **Method**: POST
- **Authentication**: Bearer token
- **Latency**: Sub-150ms
- **Sample Rate**: 16 kHz mono (optimized for voice agents)
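
The details above can be sketched as a small helper that assembles the URL, headers, and body for one request. The function name `build_tts_request` is hypothetical, and the JSON payload shape (`{"text": ...}`) is an assumption that should be checked against the Speechmatics TTS documentation; only the endpoint path, the POST method, and Bearer authentication come from this README.

```python
import json
from typing import Dict, Tuple
from urllib.parse import quote


def build_tts_request(
    api_key: str,
    voice_id: str,
    text: str,
    base_url: str = "https://preview.tts.speechmatics.com",
) -> Tuple[str, Dict[str, str], bytes]:
    """Build (url, headers, body) for a POST to /generate/{voice_id}."""
    # Endpoint pattern from the README: {base_url}/generate/{voice_id}
    url = f"{base_url.rstrip('/')}/generate/{quote(voice_id, safe='')}"
    headers = {
        "Authorization": f"Bearer {api_key}",  # Bearer token authentication
        "Content-Type": "application/json",
    }
    # Assumed payload shape; verify against the official API reference.
    body = json.dumps({"text": text}).encode("utf-8")
    return url, headers, body
```

With `aiohttp`, the result could be sent as `session.post(url, headers=headers, data=body)` and the audio bytes read from the response.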

## Architecture

This extension follows the TEN Framework TTS2 HTTP extension pattern:

- `extension.py`: Main extension class inheriting from `AsyncTTS2HttpExtension`
- `speechmatics_tts.py`: Client implementation with HTTP API integration
- `config.py`: Configuration model with validation
- `addon.py`: Extension addon registration

## License

Apache 2.0

## Contributing

Contributions are welcome! Please submit issues and pull requests to the TEN Framework repository.

## Links

- [Speechmatics TTS Documentation](https://docs.speechmatics.com/text-to-speech/quickstart)
- [Speechmatics Portal](https://portal.speechmatics.com/)
- [TEN Framework](https://github.com/TEN-framework/ten-framework)
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
#
# This file is part of TEN Framework, an open source project.
# Licensed under the Apache License, Version 2.0.
# See the LICENSE file for more information.
#
from . import addon

__all__ = ["addon"]
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
#
# This file is part of TEN Framework, an open source project.
# Licensed under the Apache License, Version 2.0.
# See the LICENSE file for more information.
#
from ten_runtime import (
    Addon,
    register_addon_as_extension,
    TenEnv,
)


@register_addon_as_extension("speechmatics_tts_python")
class SpeechmaticsTTSExtensionAddon(Addon):
    def on_create_instance(self, ten_env: TenEnv, name: str, context) -> None:
        from .extension import SpeechmaticsTTSExtension

        ten_env.log_info("SpeechmaticsTTSExtensionAddon on_create_instance")
        ten_env.on_create_instance_done(SpeechmaticsTTSExtension(name), context)
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
#
# This file is part of TEN Framework, an open source project.
# Licensed under the Apache License, Version 2.0.
# See the LICENSE file for more information.
#
from typing import Any
import copy
from pydantic import Field
from pathlib import Path
from ten_ai_base import utils
from ten_ai_base.tts2_http import AsyncTTS2HttpConfig


class SpeechmaticsTTSConfig(AsyncTTS2HttpConfig):
    """Speechmatics TTS Config"""

    dump: bool = Field(default=False, description="Speechmatics TTS dump")
    dump_path: str = Field(
        default_factory=lambda: str(
            Path(__file__).parent / "speechmatics_tts_in.pcm"
        ),
        description="Speechmatics TTS dump path",
    )
    params: dict[str, Any] = Field(
        default_factory=dict, description="Speechmatics TTS params"
    )

    def update_params(self) -> None:
        """Update configuration from params dictionary"""
        pass

    def to_str(self, sensitive_handling: bool = True) -> str:
        """Convert config to string with optional sensitive data handling."""
        if not sensitive_handling:
            return f"{self}"

        config = copy.deepcopy(self)

        # Encrypt sensitive fields in params
        if config.params and "api_key" in config.params:
            config.params["api_key"] = utils.encrypt(config.params["api_key"])

        return f"{config}"

    def validate(self) -> None:
        """Validate Speechmatics-specific configuration."""
        if "api_key" not in self.params or not self.params["api_key"]:
            raise ValueError("API key is required for Speechmatics TTS")
        if "voice_id" not in self.params or not self.params["voice_id"]:
            raise ValueError("Voice ID is required for Speechmatics TTS")
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
#
# This file is part of TEN Framework, an open source project.
# Licensed under the Apache License, Version 2.0.
# See the LICENSE file for more information.
#
"""
Speechmatics TTS Extension

This extension implements text-to-speech using Speechmatics TTS API.
It provides low-latency, high-quality speech synthesis.
"""

from ten_ai_base.tts2_http import (
    AsyncTTS2HttpExtension,
    AsyncTTS2HttpConfig,
    AsyncTTS2HttpClient,
)
from ten_runtime import AsyncTenEnv

from .config import SpeechmaticsTTSConfig
from .speechmatics_tts import SpeechmaticsTTSClient


class SpeechmaticsTTSExtension(AsyncTTS2HttpExtension):
    """
    Speechmatics TTS Extension implementation.

    Provides text-to-speech synthesis using Speechmatics HTTP API.
    Inherits all common HTTP TTS functionality from AsyncTTS2HttpExtension.
    """

    def __init__(self, name: str) -> None:
        super().__init__(name)
        # Type hints for better IDE support
        self.config: SpeechmaticsTTSConfig = None
        self.client: SpeechmaticsTTSClient = None

    # ============================================================
    # Required method implementations
    # ============================================================

    async def create_config(self, config_json_str: str) -> AsyncTTS2HttpConfig:
        """Create Speechmatics TTS configuration from JSON string."""
        return SpeechmaticsTTSConfig.model_validate_json(config_json_str)

    async def create_client(
        self, config: AsyncTTS2HttpConfig, ten_env: AsyncTenEnv
    ) -> AsyncTTS2HttpClient:
        """Create Speechmatics TTS client."""
        return SpeechmaticsTTSClient(config=config, ten_env=ten_env)

    def vendor(self) -> str:
        """Return vendor name."""
        return "speechmatics"

    def synthesize_audio_sample_rate(self) -> int:
        """Return the sample rate for synthesized audio."""
        return self.config.params.get("sample_rate", 16000)
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
{
  "type": "extension",
  "name": "speechmatics_tts_python",
  "version": "0.1.0",
  "dependencies": [
    {
      "type": "system",
      "name": "ten_runtime_python",
      "version": "0.11"
    },
    {
      "type": "system",
      "name": "ten_ai_base",
      "version": "0.7"
    }
  ],
  "package": {
    "include": [
      "manifest.json",
      "property.json",
      "**.py",
      "README.md",
      "requirements.txt"
    ]
  },
  "api": {
    "interface": [
      {
        "import_uri": "../../system/ten_ai_base/api/tts-interface.json"
      }
    ],
    "property": {
      "properties": {
        "params": {
          "type": "object",
          "properties": {
            "api_key": {
              "type": "string"
            },
            "voice_id": {
              "type": "string"
            },
            "output_format": {
              "type": "string"
            },
            "sample_rate": {
              "type": "int64"
            },
            "base_url": {
              "type": "string"
            }
          }
        }
      }
    }
  }
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
{
  "params": {
    "api_key": "${env:SPEECHMATICS_API_KEY}",
    "voice_id": "sarah",
    "output_format": "wav",
    "sample_rate": 16000,
    "base_url": "https://preview.tts.speechmatics.com"
  }
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
aiohttp>=3.8.0
pydantic>=2.0.0
pytest==8.3.4