Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
121 changes: 121 additions & 0 deletions ai_agents/agents/ten_packages/extension/blaze_stt_python/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,121 @@
# Blaze STT Extension for TEN Framework

Blaze Speech-to-Text (STT) extension for [TEN Framework](https://github.com/TEN-framework/ten-framework).

## Installation

```bash
pip install -r requirements.txt
```

Or install dependencies directly:

```bash
pip install httpx pydantic
```

## Configuration

### Environment Variables

Set the following environment variables:

```bash
export BLAZE_STT_API_URL="http://localhost:8000"
export BLAZE_STT_API_KEY="your-api-key-here" # Optional
```

### Property.json (TEN Framework)

The extension includes a `property.json` file with default configuration that TEN framework can use:

```json
{
"params": {
"api_url": "${env:BLAZE_STT_API_URL}",
"api_key": "${env:BLAZE_STT_API_KEY}",
"language": "vi",
"enable_segments": false,
"enable_refinement": false,
"timeout": 3600
}
}
```

TEN framework will automatically read this file and use environment variables for configuration.

## Usage

### As TEN Framework Extension

```python
from blaze_stt_python import BlazeSTTExtension

# Initialize extension (can accept dict config from TEN framework)
stt = BlazeSTTExtension(config={
"api_url": "http://localhost:8000",
"api_key": "your-api-key",
"language": "vi",
})

# Process audio using TEN framework interface
result = stt.process({
"audio_data": audio_bytes,
"audio_content_type": "audio/wav",
"language": "vi",
})

print(result["transcription"])

# Get extension metadata
metadata = stt.get_metadata()
print(metadata)
```

### As Direct Extension

```python
from blaze_stt_python import BlazeSTTExtension, BlazeSTTConfig

# Initialize extension
config = BlazeSTTConfig(
api_url="http://localhost:8000",
api_key="your-api-key",
default_language="vi",
)
stt = BlazeSTTExtension(config=config)

# Transcribe audio
result = stt.transcribe(
audio_data=audio_bytes,
audio_content_type="audio/wav",
language="vi",
)

print(result["transcription"])
```

## API Reference

### BlazeSTTExtension

**TEN Framework Interface Methods:**
- `process(input_data)` - Process audio and return transcription (TEN framework interface)
- `get_metadata()` - Get extension metadata (TEN framework interface)

**Direct Methods:**

- `transcribe(audio_data, audio_file, audio_content_type, language, enable_segments, enable_refinement, lazy_process)` - Transcribe audio data (bytes) or file (UploadFile)
- `get_job_status(job_id)` - Get transcription job status

## Supported Formats

- `audio/wav` - WAV format
- `audio/mpeg` - MP3 format
- `audio/webm` - WebM format
- `audio/ogg` - OGG format

## License

This extension is provided as-is for use with the TEN Framework and Blaze services.

Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
"""
Blaze STT Extension for TEN Framework

This extension provides Speech-to-Text (STT) functionality using Blaze API.
Implements TEN framework extension interface.
"""

from .blaze_stt import BlazeSTTExtension, BlazeSTTConfig

__all__ = ["BlazeSTTExtension", "BlazeSTTConfig"]
__version__ = "1.0.0"

Loading
Loading