Skip to content

fix(p0): remove dead Python chaos, delete requirements.txt, clean CI whisper flag#365

Open
Coldaine wants to merge 5 commits intomainfrom
fix/p0-python-cleanup-and-dead-whisper
Open

fix(p0): remove dead Python chaos, delete requirements.txt, clean CI whisper flag#365
Coldaine wants to merge 5 commits intomainfrom
fix/p0-python-cleanup-and-dead-whisper

Conversation

@Coldaine
Copy link
Owner

@Coldaine Coldaine commented Mar 1, 2026

User description

Summary

  • Delete vestigial requirements.txt (just had a comment saying "no deps needed" while pyproject.toml listed real deps)
  • Fix pyproject.toml: bound Python <3.13 for PyO3 compatibility; add note to use uv sync
  • Remove dead --whisper flag from local_ci.sh (whisper feature is a dead stub)
  • Add PLAN.md: full state analysis and milestone-based execution plan
  • Add scripts/validate_telemetry_schema.py: enforce coldvox.{subsystem}.{metric}.{unit} naming conventions

Test plan

  • cargo build --workspace --locked passes
  • cargo test --workspace --locked passes
  • scripts/local_ci.sh runs without errors

Addresses P0 items in docs/plans/critical-action-plan.md.

🤖 Generated with Claude Code


PR Type

Bug fix, Enhancement


Description

  • Remove dead whisper feature stub and clean CI pipeline

  • Fix Python version chaos: bound to <3.13 for PyO3 compatibility

  • Delete vestigial requirements.txt, clarify UV-based dependency management

  • Add telemetry schema validator script for metric naming conventions

  • Document state analysis and milestone-based execution plan


Diagram Walkthrough

flowchart LR
  A["P0 Issues<br/>Identified"] -->|"Remove dead code"| B["Delete whisper<br/>feature stub"]
  A -->|"Fix version chaos"| C["Bound Python<br/>to 3.10-3.12"]
  A -->|"Clean config"| D["Delete requirements.txt<br/>clarify UV usage"]
  A -->|"Add tooling"| E["Telemetry schema<br/>validator script"]
  B --> F["Clean CI<br/>pipeline"]
  C --> F
  D --> F
  E --> G["Enforce metric<br/>naming conventions"]
Loading

File Walkthrough

Relevant files
Enhancement
validate_telemetry_schema.py
Add telemetry schema validator for metric naming                 

scripts/validate_telemetry_schema.py

  • New Python script to validate telemetry metric naming conventions
  • Enforces coldvox.{subsystem}.{metric}.{unit} schema pattern
  • Scans Rust crates for metric definitions using regex patterns
  • Supports legacy metrics grandfathering and per-crate scanning
  • Returns exit code 0 on success, 1 on violations with detailed
    reporting
+253/-0 
Bug fix
local_ci.sh
Remove dead whisper flag from CI script                                   

scripts/local_ci.sh

  • Remove dead --whisper flag argument parsing logic
  • Remove conditional whisper feature test execution path
  • Simplify test step to always run standard cargo test --workspace
    --locked
  • Eliminate venv setup requirement for CI pipeline
+1/-15   
Cargo.toml
Remove whisper feature, reorganize and document features 

crates/app/Cargo.toml

  • Remove whisper = [] feature flag (dead stub)
  • Remove no-stt feature flag (unused)
  • Reorganize remaining features with status comments and documentation
    links
  • Mark parakeet as "⚠️ Planned: Not currently functional"
  • Mark moonshine as "✅ Working: Python-based, CPU/GPU"
  • Add inline comments referencing STT overview documentation
+6/-6     
Cargo.toml
Clean up STT features, remove whisper and unused backends

crates/coldvox-stt/Cargo.toml

  • Remove whisper = [] placeholder feature
  • Remove unused features: coqui, leopard, silero-stt
  • Reorganize remaining features with status annotations
  • Add documentation link to docs/domains/stt/stt-overview.md
  • Mark parakeet as "⚠️ Planned: Not currently functional"
  • Mark moonshine as "✅ Working: Python-based, CPU/GPU"
+3/-6     
mise.toml
Remove Python 3.13 from mise, clarify UV ownership             

mise.toml

  • Remove python = "3.13" line that conflicts with PyO3 requirements
  • Add explanatory comments about UV managing Python versions
  • Document that Python <=3.12 is required for PyO3 compatibility
  • Reference .python-version and pyproject.toml as sources of truth
+2/-1     
pyproject.toml
Bound Python version for PyO3, clarify UV usage                   

pyproject.toml

  • Update requires-python constraint from >=3.10 to >=3.10,<3.13
  • Add comment explaining PyO3 requires Python <=3.12
  • Add prominent section documenting UV-based dependency management
  • Add warning against using pip/requirements.txt directly
  • Reference .python-version for required Python version
+5/-1     
requirements.txt
Delete vestigial requirements.txt file                                     

requirements.txt

  • Delete entire file (5 lines removed)
  • File was vestigial with only comments claiming "no deps needed"
  • Actual dependencies are defined in pyproject.toml
  • UV manages Python environment, not pip/requirements.txt
+0/-5     
Documentation
PLAN.md
Add state analysis and milestone execution plan                   

PLAN.md

  • New comprehensive state analysis document for ColdVox project
  • Documents current subsystem status and critical brittleness points
  • Defines 4-milestone execution plan with P0/P1/P2 issue prioritization
  • Details Milestone 1 implementation (Python version, whisper removal,
    telemetry validation)
  • Proposes telemetry naming convention schema with migration examples
+162/-0 

…whisper flag

- Delete requirements.txt (vestigial - only had a comment "no deps needed")
- Fix pyproject.toml: bound Python <3.13 for PyO3 compatibility, add uv sync note
- Remove dead --whisper flag from local_ci.sh (whisper feature is dead code)
- Update Cargo.toml crates to reflect current dependency state
- Add PLAN.md: full state analysis and milestone-based execution plan
- Add scripts/validate_telemetry_schema.py: enforce metric naming conventions

Addresses the P0 items in docs/plans/critical-action-plan.md:
- Python version chaos: now explicitly requires >=3.10,<3.13
- Whisper dead code: removed from CI pipeline (Cargo.toml stub cleanup deferred)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 1, 2026 02:44
@qodo-free-for-open-source-projects

Review Summary by Qodo

Clean up dead code, fix Python version constraints, add telemetry validation

🐞 Bug fix ✨ Enhancement

Grey Divider

Walkthroughs

Description
• Remove dead whisper feature stub from CI and Cargo.toml configuration
• Fix Python version chaos: bound to <3.13 for PyO3 compatibility, remove conflicting mise.toml
  entry
• Delete vestigial requirements.txt; clarify UV as sole Python dependency manager
• Add telemetry schema validator script enforcing coldvox.{subsystem}.{metric}.{unit} naming
• Add comprehensive PLAN.md documenting current state and milestone-based execution roadmap
Diagram
flowchart LR
  A["Python Version Chaos"] -->|Fix mise.toml & pyproject.toml| B["Bounded Python <3.13"]
  C["Dead Whisper Code"] -->|Remove from Cargo.toml & CI| D["Clean Feature Flags"]
  E["Empty requirements.txt"] -->|Delete & clarify UV usage| F["Single Source of Truth"]
  G["No Telemetry Schema"] -->|Add validator script| H["Enforce Naming Convention"]
  B --> I["P0 Issues Resolved"]
  D --> I
  F --> I
  H --> I
Loading

Grey Divider

File Changes

1. scripts/validate_telemetry_schema.py ✨ Enhancement +253/-0

Add telemetry schema validator for metric naming

• New Python script to validate telemetry metric naming conventions
• Enforces coldvox.{subsystem}.{metric_name}.{unit} schema across Rust crates
• Scans telemetry-related crates (coldvox-telemetry, coldvox-stt, coldvox-audio, coldvox-vad)
• Supports legacy metrics grandfathering and provides detailed violation reports with suggestions

scripts/validate_telemetry_schema.py


2. scripts/local_ci.sh 🐞 Bug fix +1/-15

Remove dead whisper feature flag from CI

• Remove dead --whisper flag and associated conditional logic from test execution
• Simplify test invocation to single cargo test command without feature branching
• Eliminate vestigial ensure_venv.sh call for whisper feature

scripts/local_ci.sh


3. PLAN.md 📝 Documentation +162/-0

Add state analysis and milestone-based execution roadmap

• New comprehensive state analysis documenting current subsystem status and brittleness points
• Define four-milestone execution plan with P0/P1/P2 issue prioritization
• Document verified working features and critical failures (Python chaos, whisper stub, parakeet
 errors)
• Specify telemetry naming convention schema and migration path for existing metrics

PLAN.md


View more (5)
4. crates/app/Cargo.toml 🐞 Bug fix +6/-6

Clean up feature flags, remove whisper stub

• Remove whisper feature stub (marked as temporarily stubbed, now deleted)
• Remove unused feature flags (no-stt, coqui, leopard, silero-stt)
• Reorganize remaining features with status annotations (✅ Working, ⚠️ Planned)
• Add reference comments linking to STT overview documentation

crates/app/Cargo.toml


5. crates/coldvox-stt/Cargo.toml 🐞 Bug fix +3/-6

Remove whisper placeholder and unused STT features

• Remove whisper placeholder feature (marked as awaiting new backend implementation)
• Remove unused feature flags (coqui, leopard, silero-stt)
• Add status annotations to remaining features (✅ Working, ⚠️ Planned)
• Add documentation reference comment for STT backends overview

crates/coldvox-stt/Cargo.toml


6. mise.toml 🐞 Bug fix +2/-1

Remove conflicting Python 3.13 from mise.toml

• Remove python = "3.13" entry that conflicts with PyO3 requirement (<=3.12)
• Add explanatory comment clarifying UV manages Python, not mise
• Preserve node and rust tool specifications

mise.toml


7. pyproject.toml 🐞 Bug fix +5/-1

Bound Python version for PyO3, clarify UV usage

• Bound Python version to >=3.10,<3.13 to enforce PyO3 compatibility
• Add prominent comment directing users to use uv sync instead of pip/requirements.txt
• Reference .python-version as source of truth for Python version

pyproject.toml


8. requirements.txt 🐞 Bug fix +0/-5

Delete vestigial empty requirements.txt

• Delete entire file (contained only comments claiming "no deps needed")
• Eliminates misleading documentation that contradicted pyproject.toml dependencies

requirements.txt


Grey Divider

Qodo Logo

@qodo-free-for-open-source-projects
Copy link

qodo-free-for-open-source-projects bot commented Mar 1, 2026

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0) 📎 Requirement gaps (0)

Grey Divider


Action required

1. STT pipeline now unreachable🐞 Bug ✓ Correctness
Description
whisper was removed from crates/app features, but the app’s STT processor and runtime event
wiring are still gated behind #[cfg(feature = "whisper")]. As a result, enabling
moonshine/parakeet features won’t start STT processing and the app will run without
transcription despite STT plugins being registered.
Code

crates/app/Cargo.toml[R68-75]

[features]
default = ["silero", "text-injection"]
+# STT Backends (see docs/domains/stt/stt-overview.md)
+moonshine = ["coldvox-stt/moonshine"]      # ✅ Working: Python-based, CPU/GPU
+parakeet = ["coldvox-stt/parakeet"]        # ⚠️ Planned: Not currently functional
+# Other features
+silero = ["coldvox-vad-silero/silero"]     # ✅ Default: Silero VAD
+text-injection = ["dep:coldvox-text-injection"]  # ✅ Default: Text injection backends
Evidence
The PR removes the only feature flag (whisper) that enables compilation of the STT processor and
runtime STT wiring. Meanwhile, STT plugin factories are registered based on moonshine/parakeet
features, creating a mismatch: plugins can exist, but the processor that drives them is permanently
compiled out.

crates/app/Cargo.toml[68-75]
crates/app/src/stt/mod.rs[1-18]
crates/app/src/stt/processor.rs[53-77]
crates/app/src/runtime.rs[507-599]
crates/app/src/stt/plugin_manager.rs[556-582]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`crates/app` removed the `whisper` Cargo feature, but the unified STT processor module and the runtime wiring that spawns it are still `#[cfg(feature = &amp;quot;whisper&amp;quot;)]`. This makes STT processing impossible to enable (even when `moonshine`/`parakeet` plugins are compiled in).
### Issue Context
- `moonshine`/`parakeet` features currently control plugin registration.
- The processor + runtime plumbing is still gated on the removed `whisper` feature.
### Fix Focus Areas
- crates/app/Cargo.toml[68-80]
- crates/app/src/stt/mod.rs[1-18]
- crates/app/src/stt/processor.rs[53-77]
- crates/app/src/runtime.rs[507-605]
- crates/app/src/bin/tui_dashboard.rs[9-15]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

2. Telemetry validator schema mismatch 🐞 Bug ✓ Correctness
Description
scripts/validate_telemetry_schema.py documents/prints a dot-separated schema
(coldvox.{subsystem}.{metric_name}.{unit}) but its validation logic only checks
underscore-prefixed names (coldvox_...) and will effectively skip validation for dot-form names.
This makes the tool misleading and unlikely to enforce the stated convention.
Code

scripts/validate_telemetry_schema.py[R103-150]

+def validate_metric_name(name: str) -> Tuple[bool, str, str]:
+    """
+    Validate a metric name against the schema.
+
+    Returns: (is_valid, issue, suggestion)
+    """
+    # Skip legacy metrics
+    if name in LEGACY_METRICS:
+        return True, "", ""
+
+    # Skip non-metric strings
+    if not (
+        name.startswith("coldvox")
+        or name.startswith("stt_")
+        or name.startswith("capture_")
+        or name.startswith("chunker_")
+        or name.startswith("vad_")
+    ):
+        return True, "", ""  # Not a metric we care about
+
+    # Check for new schema: coldvox.{subsystem}.{name}.{unit}
+    if name.startswith("coldvox_"):
+        parts = name.split("_")
+        if len(parts) < 3:
+            return (
+                False,
+                "Too few components",
+                f"Use format: coldvox_{{subsystem}}_{{name}}_{{unit}}",
+            )
+
+        subsystem = parts[1]
+        if subsystem not in VALID_SUBSYSTEMS:
+            return (
+                False,
+                f"Invalid subsystem '{subsystem}'",
+                f"Use one of: {VALID_SUBSYSTEMS}",
+            )
+
+        # Check for unit suffix
+        unit = parts[-1]
+        if unit not in VALID_UNITS and not any(u in unit for u in VALID_UNITS):
+            return (
+                False,
+                f"Missing/invalid unit suffix",
+                f"Add unit suffix from: {VALID_UNITS}",
+            )
+
+    return True, "", ""
Evidence
The script claims a dot-separated metric schema, but only validates names starting with coldvox_
and suggests underscore fixes. Additionally, the repo’s telemetry playbook defines dot-separated
naming (..), increasing the inconsistency/confusion.

scripts/validate_telemetry_schema.py[1-7]
scripts/validate_telemetry_schema.py[113-150]
scripts/validate_telemetry_schema.py[228-247]
docs/domains/telemetry/tele-observability-playbook.md[136-147]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The telemetry schema validator’s documented/printed schema is dot-separated, but the implementation validates only underscore-prefixed names and suggests underscore fixes. This mismatch makes the tool misleading and may allow schema violations to slip through.
### Issue Context
The repository telemetry playbook defines dot-separated metric naming (`&amp;lt;domain&amp;gt;.&amp;lt;subsystem&amp;gt;.&amp;lt;metric&amp;gt;`). The new validator should align with the chosen standard.
### Fix Focus Areas
- scripts/validate_telemetry_schema.py[1-7]
- scripts/validate_telemetry_schema.py[113-150]
- scripts/validate_telemetry_schema.py[228-247]
- docs/domains/telemetry/tele-observability-playbook.md[136-147]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

@qodo-code-review
Copy link

qodo-code-review bot commented Mar 1, 2026

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
CI resource exhaustion

Description: The new telemetry validation script recursively reads and regex-scans all .rs files under
crates/
/src without size/time limits, which could enable CI resource exhaustion
(CPU/memory/time) if very large or pathologically-crafted Rust files are introduced into
the repo.
validate_telemetry_schema.py [153-190]

Referred Code
def find_metrics_in_file(file_path: Path) -> List[Tuple[int, str]]:
    """Find all metric names in a Rust source file."""
    metrics = []
    content = file_path.read_text(encoding="utf-8")

    for line_num, line in enumerate(content.split("\n"), 1):
        for pattern in METRIC_PATTERNS:
            for match in pattern.finditer(line):
                metric_name = match.group(1)
                metrics.append((line_num, metric_name))

    return metrics


def scan_crate(crate_path: Path) -> List[MetricViolation]:
    """Scan a crate for metric naming violations."""
    violations = []

    src_dir = crate_path / "src"
    if not src_dir.exists():
        return violations


 ... (clipped 17 lines)
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Unhandled IO errors: The validator performs filesystem reads and path operations without handling exceptions,
which can crash the script without actionable context.

Referred Code
def find_metrics_in_file(file_path: Path) -> List[Tuple[int, str]]:
    """Find all metric names in a Rust source file."""
    metrics = []
    content = file_path.read_text(encoding="utf-8")

    for line_num, line in enumerate(content.split("\n"), 1):
        for pattern in METRIC_PATTERNS:
            for match in pattern.finditer(line):
                metric_name = match.group(1)
                metrics.append((line_num, metric_name))

    return metrics


def scan_crate(crate_path: Path) -> List[MetricViolation]:
    """Scan a crate for metric naming violations."""
    violations = []

    src_dir = crate_path / "src"
    if not src_dir.exists():
        return violations


 ... (clipped 70 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Path traversal risk: The --crate argument is used to build a filesystem path without validating it is a safe
crate name, enabling ..-style traversal to scan unintended directories.

Referred Code
parser.add_argument("--crate", type=str, help="Specific crate to scan")
args = parser.parse_args()

repo_root = Path(__file__).parent.parent
crates_dir = repo_root / "crates"

all_violations = []

if args.crate:
    crate_path = crates_dir / args.crate
    if not crate_path.exists():
        print(f"Error: Crate '{args.crate}' not found at {crate_path}")
        return 1
    all_violations = scan_crate(crate_path)
else:

Learn more about managing compliance generic rules or creating your own custom rules

  • Update
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review
Copy link

qodo-code-review bot commented Mar 1, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Support dot naming convention

Update the validation script to support both dot-separated and
underscore-separated metric names, aligning it with the naming convention
defined in PLAN.md.

scripts/validate_telemetry_schema.py [113-131]

 # Skip non-metric strings
 if not (
-    name.startswith("coldvox")
+    name.startswith(("coldvox_", "coldvox."))
     or name.startswith("stt_")
     or name.startswith("capture_")
     or name.startswith("chunker_")
     or name.startswith("vad_")
 ):
     return True, "", ""  # Not a metric we care about
 
-# Check for new schema: coldvox.{subsystem}.{name}.{unit}
-if name.startswith("coldvox_"):
-    parts = name.split("_")
-    if len(parts) < 3:
+# Check for new schema: coldvox_{subsystem}_{name}_{unit} or
+#                    coldvox.{subsystem}.{name}.{unit}
+if name.startswith(("coldvox_", "coldvox.")):
+    # split on both '.' and '_'
+    parts = re.split(r"[._]", name)
+    if len(parts) < 4:
         return (
             False,
             "Too few components",
-            f"Use format: coldvox_{{subsystem}}_{{name}}_{{unit}}",
+            f"Use format: coldvox{{separator}}{{subsystem}}{{separator}}{{name}}{{separator}}{{unit}}",
         )
  • Apply / Chat
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies that the script fails to support the dot-separated naming convention proposed in PLAN.md and provides a comprehensive fix, significantly improving the script's correctness and alignment with the project's goals.

Medium
Scan all crates automatically

Modify the script to automatically scan all crate directories under crates/
instead of using a hardcoded list, making it more robust to future changes.

scripts/validate_telemetry_schema.py [209-226]

 if args.crate:
     ...
 else:
-    # Scan telemetry crates specifically
-    for crate_name in [
-        "coldvox-telemetry",
-        "coldvox-stt",
-        "coldvox-audio",
-        "coldvox-vad",
-    ]:
-        crate_path = crates_dir / crate_name
-        if crate_path.exists():
-            violations = scan_crate(crate_path)
-            all_violations.extend(violations)
+    # Dynamically scan all crates with source code
+    for crate_path in crates_dir.iterdir():
+        if (crate_path / "src").is_dir():
+            all_violations.extend(scan_crate(crate_path))

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 6

__

Why: The suggestion improves the maintainability and robustness of the script by replacing a hardcoded list of crates with dynamic discovery, ensuring all relevant crates are automatically included.

Low
Possible issue
Strengthen metric unit validation logic

Update the metric unit validation to use an exact match against VALID_UNITS
instead of a permissive substring check to prevent incorrectly validating
metrics.

scripts/validate_telemetry_schema.py [141-148]

 # Check for unit suffix
 unit = parts[-1]
-if unit not in VALID_UNITS and not any(u in unit for u in VALID_UNITS):
+if unit not in VALID_UNITS:
     return (
         False,
-        f"Missing/invalid unit suffix",
-        f"Add unit suffix from: {VALID_UNITS}",
+        f"Missing/invalid unit '{unit}'",
+        f"Use a valid unit suffix from: {VALID_UNITS}",
     )
  • Apply / Chat
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies a logical flaw in the new validation script where a substring match could approve invalid metric units, and the proposed fix enforces a correct, exact match.

Medium
  • Update

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 606e7df3d7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@@ -1,5 +0,0 @@
# Python dependencies for ColdVox tooling and scripts

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restore requirements input or update docs CI installer

Deleting requirements.txt breaks the docs workflow because .github/workflows/docs-ci.yml still runs uv pip install -r requirements.txt in the "Create virtual environment" step; uv pip install --help defines --requirements as reading a requirements file, and running it with a missing file errors out (File not found). Since this commit also adds PLAN.md, it will trigger the docs workflow (**/*.md) and fail before docs validation runs.

Useful? React with 👍 / 👎.

f"Use format: coldvox_{{subsystem}}_{{name}}_{{unit}}",
)

subsystem = parts[1]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Parse multi-token subsystem names in telemetry validator

The validator cannot accept metrics for the declared text_injection subsystem because it does parts = name.split("_") and then validates only parts[1] as the subsystem, so a valid name like coldvox_text_injection_latency_ms is parsed as subsystem text and falsely rejected. This makes the script inconsistent with its own VALID_SUBSYSTEMS and will produce false failures if teams add metrics for that subsystem.

Useful? React with 👍 / 👎.

Runs validate_telemetry_schema.py on pushes that touch telemetry/STT/audio/VAD crates.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment on lines 68 to +75
[features]
default = ["silero", "text-injection"]
# STT Backends (see docs/domains/stt/stt-overview.md)
moonshine = ["coldvox-stt/moonshine"] # ✅ Working: Python-based, CPU/GPU
parakeet = ["coldvox-stt/parakeet"] # ⚠️ Planned: Not currently functional
# Other features
silero = ["coldvox-vad-silero/silero"] # ✅ Default: Silero VAD
text-injection = ["dep:coldvox-text-injection"] # ✅ Default: Text injection backends

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. Stt pipeline now unreachable 🐞 Bug ✓ Correctness

whisper was removed from crates/app features, but the app’s STT processor and runtime event
wiring are still gated behind #[cfg(feature = "whisper")]. As a result, enabling
moonshine/parakeet features won’t start STT processing and the app will run without
transcription despite STT plugins being registered.
Agent Prompt
### Issue description
`crates/app` removed the `whisper` Cargo feature, but the unified STT processor module and the runtime wiring that spawns it are still `#[cfg(feature = "whisper")]`. This makes STT processing impossible to enable (even when `moonshine`/`parakeet` plugins are compiled in).

### Issue Context
- `moonshine`/`parakeet` features currently control plugin registration.
- The processor + runtime plumbing is still gated on the removed `whisper` feature.

### Fix Focus Areas
- crates/app/Cargo.toml[68-80]
- crates/app/src/stt/mod.rs[1-18]
- crates/app/src/stt/processor.rs[53-77]
- crates/app/src/runtime.rs[507-605]
- crates/app/src/bin/tui_dashboard.rs[9-15]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR cleans up legacy Whisper/Python drift and clarifies the supported STT backend setup (Moonshine via PyO3/UV), while introducing an initial telemetry metric naming validator and an execution plan document.

Changes:

  • Removes the dead --whisper path from scripts/local_ci.sh and deletes the vestigial requirements.txt.
  • Tightens Python tooling configuration by bounding pyproject.toml to <3.13 and removing python = "3.13" from mise.toml.
  • Updates Cargo feature flags to remove Whisper stubs and document Moonshine/Parakeet status; adds PLAN.md and a telemetry schema validation script.

Reviewed changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
scripts/validate_telemetry_schema.py Adds a repo script intended to validate telemetry metric naming conventions.
scripts/local_ci.sh Removes the dead --whisper flag handling and runs standard workspace tests only.
requirements.txt Deletes an empty/misleading Python requirements file.
pyproject.toml Bounds supported Python to <3.13 and documents using uv sync.
mise.toml Removes pinned Python runtime to avoid conflicts with PyO3/Python version constraints.
crates/coldvox-stt/Cargo.toml Removes Whisper stub feature and documents Moonshine/Parakeet feature intent.
crates/app/Cargo.toml Removes Whisper/no-stt stub features and clarifies feature grouping/comments.
PLAN.md Adds a state analysis + milestone execution plan document.
Cargo.lock Updates the locked thiserror version.
Comments suppressed due to low confidence (1)

scripts/local_ci.sh:96

  • With the --whisper path removed, scripts/ensure_venv.sh no longer appears to be referenced anywhere in the repo (and it still installs faster-whisper). Consider deleting it or repurposing it for the supported Moonshine/UV workflow to avoid carrying dead, Whisper-specific setup code.
# 6. Run tests
print_step "Running tests..."
if cargo test --workspace --locked; then
    print_success "All tests passed"
else
    print_error "Tests failed"
    exit 1

Telemetry Schema Validator for ColdVox

Validates that metrics follow the naming convention:
coldvox.{subsystem}.{metric_name}.{unit}
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documented schema uses dots (coldvox.{subsystem}.{metric_name}.{unit}), but the implementation/regexes/suggestions in this script are based on underscore-separated names (e.g., coldvox_{subsystem}_{name}_{unit}). This mismatch makes the validator’s output and guidance unreliable; align the docstring + printed schema with what the validator actually enforces (or update the validator to enforce the dot-separated schema used in docs).

Suggested change
coldvox.{subsystem}.{metric_name}.{unit}
coldvox_{subsystem}_{metric_name}_{unit}

Copilot uses AI. Check for mistakes.
Comment on lines +113 to +150
# Skip non-metric strings
if not (
name.startswith("coldvox")
or name.startswith("stt_")
or name.startswith("capture_")
or name.startswith("chunker_")
or name.startswith("vad_")
):
return True, "", "" # Not a metric we care about

# Check for new schema: coldvox.{subsystem}.{name}.{unit}
if name.startswith("coldvox_"):
parts = name.split("_")
if len(parts) < 3:
return (
False,
"Too few components",
f"Use format: coldvox_{{subsystem}}_{{name}}_{{unit}}",
)

subsystem = parts[1]
if subsystem not in VALID_SUBSYSTEMS:
return (
False,
f"Invalid subsystem '{subsystem}'",
f"Use one of: {VALID_SUBSYSTEMS}",
)

# Check for unit suffix
unit = parts[-1]
if unit not in VALID_UNITS and not any(u in unit for u in VALID_UNITS):
return (
False,
f"Missing/invalid unit suffix",
f"Add unit suffix from: {VALID_UNITS}",
)

return True, "", ""
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validate_metric_name() only validates names starting with coldvox_ and returns True for other coldvox... names (including the dot-separated form the header claims to enforce). As written, any coldvox.<...> metric would be treated as valid without checking subsystem/unit components; update the logic to parse/validate the actual schema you want to enforce.

Copilot uses AI. Check for mistakes.
Comment on lines +141 to +148
# Check for unit suffix
unit = parts[-1]
if unit not in VALID_UNITS and not any(u in unit for u in VALID_UNITS):
return (
False,
f"Missing/invalid unit suffix",
f"Add unit suffix from: {VALID_UNITS}",
)
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit validation uses substring matching (any(u in unit for u in VALID_UNITS)), which can incorrectly accept invalid units (e.g., debug contains db). To avoid false passes, validate units as exact matches or suffix matches (e.g., ..._ms, ..._bytes, ..._total) rather than arbitrary substrings.

Copilot uses AI. Check for mistakes.
re.compile(r'"(coldvox_[a-z_]+)"'),
# Atomic store names like stt_transcription_success.store(...)
re.compile(r"(\w+)\.store\s*\("),
# Arc<AtomicU64> field names in structs
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metric extraction patterns don’t match how metrics are currently represented in this repo: telemetry is mostly Arc<Atomic*> fields updated via .fetch_add()/.load() as well as .store(), and there are no counter!/gauge!/histogram! macro usages in the scanned crates. As a result, the validator is likely to find few/no metrics and won’t provide real enforcement; consider targeting the actual call sites (e.g., match \.fetch_add\( / \.load\( and/or parse known metrics structs) or adjust the scope to where string metric names exist.

Suggested change
# Arc<AtomicU64> field names in structs
# Atomic increment names like stt_transcription_success.fetch_add(...)
re.compile(r"(\w+)\.fetch_add\s*\("),
# Atomic load names like stt_transcription_success.load(...)
re.compile(r"(\w+)\.load\s*\("),
# Arc<AtomicU64> field names in structs, annotated via comment with metric name

Copilot uses AI. Check for mistakes.
Comment on lines +199 to +203
"--fix", action="store_true", help="Suggest fixes (not implemented)"
)
parser.add_argument("--crate", type=str, help="Specific crate to scan")
args = parser.parse_args()

Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CLI exposes a --fix flag but it’s explicitly “not implemented” and the flag isn’t used anywhere in the script. This can mislead callers; either implement the behavior (even just printing suggested renames) or remove the flag until it exists.

Suggested change
"--fix", action="store_true", help="Suggest fixes (not implemented)"
)
parser.add_argument("--crate", type=str, help="Specific crate to scan")
args = parser.parse_args()
"--fix",
action="store_true",
help="Emphasize suggested fixes (no automatic changes)",
)
parser.add_argument("--crate", type=str, help="Specific crate to scan")
args = parser.parse_args()
if args.fix:
print(
"⚠️ --fix is advisory only: no files will be modified automatically.\n"
" Use the 'Fix:' suggestions below to manually rename metrics.\n"
)

Copilot uses AI. Check for mistakes.

3. **Remove whisper feature stub**
- Delete `whisper = []` from `crates/app/Cargo.toml`
- Remove dead code: `whisper_plugin.rs`, `whisper_cpp.rs`
Copy link

Copilot AI Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This plan references removing whisper_plugin.rs / whisper_cpp.rs, but those files don’t appear to exist anywhere in the repo anymore. To keep the plan actionable, update these bullets to point at the current Whisper-related stubs/references that still remain (or drop the file names if they’re already gone).

Suggested change
- Remove dead code: `whisper_plugin.rs`, `whisper_cpp.rs`
- Remove any remaining Whisper-related stubs and dead code (e.g., in STT plugin modules)

Copilot uses AI. Check for mistakes.
…cessing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Owner Author

@Coldaine Coldaine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All review comments addressed:

  • docs-ci.yml: Updated to use uv sync instead of deleted requirements.txt; removed stale pip install step
  • whisper cfg gates: Replaced #[cfg(feature="whisper")] with #[cfg(any(feature="moonshine", feature="parakeet"))] throughout runtime.rs and stt/
  • telemetry validator: Fixed split("_") parser to handle multi-token subsystem names like text_injection correctly
  • stt/processor.rs: Fixed STT processor to not gate on removed whisper feature flag

Changes committed and pushed to fix/p0-python-cleanup-and-dead-whisper.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
rubato 1.0 removed SincFixedIn and merged it into a unified Async struct.
Downgrade to 0.16.x which still has the SincFixedIn API.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants