Merged
126 changes: 0 additions & 126 deletions .github/issue-formatter.yml

This file was deleted.

157 changes: 157 additions & 0 deletions backend/app/alembic/versions/049_add_tts_evaluation_tables.py
@@ -0,0 +1,157 @@
"""add tts evaluation tables

Revision ID: 049
Revises: 048
Create Date: 2026-02-14 12:00:00.000000

"""

import sqlalchemy as sa
from alembic import op
from sqlalchemy.dialects import postgresql

# revision identifiers, used by Alembic.
revision = "049"
down_revision = "048"
branch_labels = None
depends_on = None


def upgrade():
    # Create tts_result table
    op.create_table(
        "tts_result",
        sa.Column(
            "id",
            sa.Integer(),
            nullable=False,
            comment="Unique identifier for the TTS result",
        ),
        sa.Column(
            "sample_text",
            sa.Text(),
            nullable=False,
            comment="Input text that will be synthesized to speech",
        ),
        sa.Column(
            "object_store_url",
            sa.String(),
            nullable=True,
            comment="S3 URL of the generated WAV audio file",
        ),
        sa.Column(
            "metadata",
            postgresql.JSONB(astext_type=sa.Text()),
            nullable=True,
            comment="Audio metadata: {duration_seconds, size_bytes}",
        ),
        sa.Column(
            "provider",
            sa.String(length=100),
            nullable=False,
            comment="TTS provider used (e.g., gemini-2.5-pro-preview-tts)",
        ),
        sa.Column(
            "status",
            sa.String(length=20),
            nullable=False,
            server_default="PENDING",
            comment="Result status: PENDING, SUCCESS, FAILED",
        ),
        sa.Column(
            "score",
            postgresql.JSONB(astext_type=sa.Text()),
            nullable=True,
            comment="Extensible evaluation metrics",
        ),
        sa.Column(
            "is_correct",
            sa.Boolean(),
            nullable=True,
            comment="Human feedback flag on audio quality correctness",
        ),
        sa.Column(
            "comment",
            sa.Text(),
            nullable=True,
            comment="Human feedback comment on audio quality",
        ),
        sa.Column(
            "error_message",
            sa.Text(),
            nullable=True,
            comment="Error message if synthesis failed",
        ),
        sa.Column(
            "evaluation_run_id",
            sa.Integer(),
            nullable=False,
            comment="Reference to the evaluation run",
        ),
        sa.Column(
            "organization_id",
            sa.Integer(),
            nullable=False,
            comment="Reference to the organization",
        ),
        sa.Column(
            "project_id",
            sa.Integer(),
            nullable=False,
            comment="Reference to the project",
        ),
        sa.Column(
            "inserted_at",
            sa.DateTime(),
            nullable=False,
            comment="Timestamp when the result was created",
        ),
        sa.Column(
            "updated_at",
            sa.DateTime(),
            nullable=False,
            comment="Timestamp when the result was last updated",
        ),
Comment on lines +103 to +114
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg -n "inserted_at|updated_at" --type=py backend/app/alembic/versions/ -A 3 | head -100

Repository: ProjectTech4DevAI/kaapi-backend

🏁 Script executed:

rg -n "server_default" --type=py backend/app/alembic/versions/ | head -50

🏁 Script executed:

wc -l backend/app/alembic/versions/048_add_tts_evaluation_tables.py

🏁 Script executed:

cat -n backend/app/alembic/versions/048_add_tts_evaluation_tables.py | head -120

inserted_at and updated_at are non-nullable with no server_default — inserts without explicit values will fail.

Both columns are nullable=False but lack a server_default. Any INSERT that doesn't supply these explicitly (raw SQL, test fixtures, bulk operations) will hit a NOT NULL constraint violation. The established pattern across migrations (005, 032, 040) consistently uses server_default=sa.text("now()") for timestamp columns.

🐛 Proposed fix
 sa.Column(
     "inserted_at",
     sa.DateTime(),
     nullable=False,
+    server_default=sa.text("now()"),
     comment="Timestamp when the result was created",
 ),
 sa.Column(
     "updated_at",
     sa.DateTime(),
     nullable=False,
+    server_default=sa.text("now()"),
     comment="Timestamp when the result was last updated",
 ),
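To illustrate the failure mode the reviewer describes, here is a minimal, hypothetical sketch (SQLite in place of Postgres, and the portable `sa.func.now()` in place of the Postgres-specific `sa.text("now()")`): with a server default in place, an insert that omits the timestamp succeeds, whereas without it the same insert would raise a NOT NULL violation.

```python
import sqlalchemy as sa

# Demonstration only: SQLite and sa.func.now() stand in for the real
# Postgres migration, which uses sa.text("now()").
engine = sa.create_engine("sqlite://")
metadata = sa.MetaData()
tts_result = sa.Table(
    "tts_result",
    metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("sample_text", sa.Text, nullable=False),
    # With a server default, inserts that omit the column still succeed.
    sa.Column("inserted_at", sa.DateTime, nullable=False, server_default=sa.func.now()),
)
metadata.create_all(engine)

with engine.begin() as conn:
    # No inserted_at supplied: the database fills it in from the default.
    conn.execute(tts_result.insert().values(sample_text="hello"))
    row = conn.execute(sa.select(tts_result)).one()

assert row.inserted_at is not None
```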

        sa.ForeignKeyConstraint(
            ["evaluation_run_id"],
            ["evaluation_run.id"],
            name="fk_tts_result_run_id",
            ondelete="CASCADE",
        ),
        sa.ForeignKeyConstraint(
            ["organization_id"],
            ["organization.id"],
            ondelete="CASCADE",
        ),
        sa.ForeignKeyConstraint(
            ["project_id"],
            ["project.id"],
            ondelete="CASCADE",
        ),
        sa.PrimaryKeyConstraint("id"),
    )
    op.create_index(
        "ix_tts_result_run_id",
        "tts_result",
        ["evaluation_run_id"],
        unique=False,
    )
    op.create_index(
        "idx_tts_result_feedback",
        "tts_result",
        ["evaluation_run_id", "is_correct"],
        unique=False,
    )
    op.create_index(
        "idx_tts_result_status",
        "tts_result",
        ["evaluation_run_id", "status"],
        unique=False,
    )


def downgrade():
    op.drop_index("idx_tts_result_status", table_name="tts_result")
    op.drop_index("idx_tts_result_feedback", table_name="tts_result")
    op.drop_index("ix_tts_result_run_id", table_name="tts_result")
    op.drop_table("tts_result")
9 changes: 9 additions & 0 deletions backend/app/api/docs/tts_evaluation/create_dataset.md
@@ -0,0 +1,9 @@
Create a new TTS evaluation dataset with text samples.

Required fields:
- **name**: Dataset name
- **samples**: List of text samples, each with a **text** field

Optional fields:
- **description**: Dataset description
- **language_id**: ID of a language from the global languages table
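A request body matching these fields might look like the following sketch (names, texts, and IDs are illustrative, not real data):

```python
# Illustrative payload for the create-dataset endpoint; all values are made up.
payload = {
    "name": "checkout-prompts-v1",
    "samples": [
        {"text": "Your order has been confirmed."},
        {"text": "Thank you for shopping with us."},
    ],
    # Optional fields:
    "description": "Short confirmation prompts for TTS evaluation",
    "language_id": 1,  # hypothetical ID from the global languages table
}

# Required fields per the doc above.
assert {"name", "samples"} <= payload.keys()
assert all("text" in sample for sample in payload["samples"])
```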
3 changes: 3 additions & 0 deletions backend/app/api/docs/tts_evaluation/get_dataset.md
@@ -0,0 +1,3 @@
Get a TTS evaluation dataset by ID.

Returns the dataset, including its sample count.
3 changes: 3 additions & 0 deletions backend/app/api/docs/tts_evaluation/get_result.md
@@ -0,0 +1,3 @@
Get a single TTS synthesis result by ID.

Returns the result including audio URL, metadata, and human feedback status.
4 changes: 4 additions & 0 deletions backend/app/api/docs/tts_evaluation/get_run.md
@@ -0,0 +1,4 @@
Get a TTS evaluation run by ID with optional results.

Query parameters:
- `include_results`: Include synthesis results (default: true)
3 changes: 3 additions & 0 deletions backend/app/api/docs/tts_evaluation/list_datasets.md
@@ -0,0 +1,3 @@
List all TTS evaluation datasets for the current project.

Supports pagination with `limit` and `offset` parameters.
3 changes: 3 additions & 0 deletions backend/app/api/docs/tts_evaluation/list_runs.md
@@ -0,0 +1,3 @@
List TTS evaluation runs for the current project.

Supports filtering by `dataset_id` and `status`, with pagination via `limit` and `offset`.
15 changes: 15 additions & 0 deletions backend/app/api/docs/tts_evaluation/start_evaluation.md
@@ -0,0 +1,15 @@
Start a TTS evaluation run on a dataset.

Required fields:
- **run_name**: Name for this evaluation run
- **dataset_id**: ID of the TTS dataset to evaluate

Optional fields:
- **models**: List of TTS models to use (default: `["gemini-2.5-pro-preview-tts"]`)

The evaluation will:
1. Process each text sample through the specified TTS models
2. Generate speech audio using Gemini Batch API
3. Store WAV audio files in S3 for human review

**Supported models:** `gemini-2.5-pro-preview-tts`
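As a sketch, a minimal request body for this endpoint could look like this (the `dataset_id` value is illustrative):

```python
# Illustrative payload for starting a TTS evaluation run; the dataset_id is made up.
payload = {
    "run_name": "baseline-gemini-tts",
    "dataset_id": 42,
    # Optional; shown for completeness, this list matches the documented default.
    "models": ["gemini-2.5-pro-preview-tts"],
}

# Required fields per the doc above.
assert {"run_name", "dataset_id"} <= payload.keys()
```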
5 changes: 5 additions & 0 deletions backend/app/api/docs/tts_evaluation/update_feedback.md
@@ -0,0 +1,5 @@
Update human feedback on a TTS synthesis result.

Fields:
- **is_correct**: Whether the synthesized audio quality is acceptable (null to clear)
- **comment**: Optional feedback comment
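For example, a feedback update marking a result as incorrect, and a second one clearing the flag, might look like this hypothetical sketch:

```python
# Illustrative feedback payloads; field names are taken from the doc above.
flag_bad = {"is_correct": False, "comment": "Robotic prosody on the second sentence."}
clear_flag = {"is_correct": None}  # null clears earlier feedback

assert flag_bad["is_correct"] is False
assert clear_flag["is_correct"] is None
```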
2 changes: 2 additions & 0 deletions backend/app/api/routes/evaluations/__init__.py
@@ -4,9 +4,11 @@

 from app.api.routes.evaluations import dataset, evaluation
 from app.api.routes.stt_evaluations.router import router as stt_router
+from app.api.routes.tts_evaluations.router import router as tts_router

 router = APIRouter()

 router.include_router(dataset.router)
 router.include_router(stt_router)
+router.include_router(tts_router)
 router.include_router(evaluation.router)