Support loading local model directories by dewana-sl · Pull Request #137 · KittenML/KittenTTS

dewana-sl · 2026-05-21T12:00:26Z

Summary

Adds local model-directory loading for users who have already downloaded KittenTTS model assets.

Details

Adds load_from_local(model_path, backend=None).
Allows KittenTTS("/path/to/model-dir") when the path exists locally.
Validates config.json, model file, and voices file before constructing the ONNX model.
Keeps Hugging Face imports lazy so the local-loading path can be imported and used without initializing the download client.
Documents the local-directory usage in the README.
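
The validation step described above could look roughly like the sketch below. It is not the code from this PR: the helper name `validate_model_dir` and the `config.json` keys `model_file` and `voices` are assumptions made for illustration, and the real config layout may differ.

```python
import json
from pathlib import Path


def validate_model_dir(model_path):
    """Check a local model directory before any ONNX session is built.

    Hypothetical helper: the config keys "model_file" and "voices" are
    assumed names, not confirmed against the KittenTTS config format.
    Returns (config, model_file_path, voices_file_path).
    """
    root = Path(model_path)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")

    config_path = root / "config.json"
    if not config_path.is_file():
        raise FileNotFoundError(f"missing config.json in {root}")
    config = json.loads(config_path.read_text(encoding="utf-8"))

    resolved = {}
    for key in ("model_file", "voices"):  # assumed key names
        name = config.get(key)
        if not name:
            raise ValueError(f"config.json has no '{key}' entry")
        candidate = root / name
        if not candidate.is_file():
            raise FileNotFoundError(f"'{key}' points to a missing file: {candidate}")
        resolved[key] = candidate

    return config, resolved["model_file"], resolved["voices"]
```

Failing early with a specific message keeps a half-populated cache directory from surfacing later as an opaque ONNX runtime error.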

Addresses #132 and covers the local-cache/repeated-download pain behind #20.

Validation

python3 -m unittest -q
Editable install in a fresh virtualenv with declared package dependencies.
Import smoke for kittentts, KittenTTS, load_from_local, and normalize_text.
Real inference smoke from a cached local kitten-tts-nano-0.8-int8 snapshot directory.

namanomar · 2026-06-28T12:31:29Z

Found a significant int8 quality regression specific to `nano` on this branch

I tested this PR locally (Windows, cp314, kitten-inference==0.1.1 from PyPI) and the native engine itself works great — all 8 voices generate correctly, speed control/streaming/generate_to_file all pass, and CPU latency is genuinely ~3-4x faster than the current ONNX path on this machine. Nice work.

While digging into the README's existing note about kitten-tts-nano-0.8-int8 quality issues, I ran a controlled fp32-vs-int8 comparison across all 3 model sizes, all 8 voices, and 80 varied sentences (5,120 generations total) to quantify it. Metrics: RMS energy, sample-to-sample "roughness" (mean squared first difference, normalized by signal energy), and the fraction of spectral energy above 8kHz — all should stay roughly flat between fp32 and a correctly-quantized int8 model.
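
For anyone wanting to reproduce the comparison, here is a minimal sketch of the three metrics as described above. It is not the script used for these numbers. The function names are made up for illustration, the roughness normalization (mean squared first difference divided by mean squared sample value) is one reading of "normalized by signal energy", and the naive DFT keeps the example dependency-free at the cost of speed, so real runs would want an FFT library.

```python
import cmath
import math


def rms(samples):
    """Root-mean-square level of a mono signal."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def roughness(samples):
    """Mean squared first difference divided by mean squared sample value."""
    energy = sum(v * v for v in samples) / len(samples)
    diffs = [(b - a) ** 2 for a, b in zip(samples, samples[1:])]
    return (sum(diffs) / len(diffs)) / energy


def hf_fraction(samples, sample_rate, cutoff_hz=8000.0):
    """Fraction of one-sided spectral energy above cutoff_hz (naive O(n^2) DFT)."""
    n = len(samples)
    total = 0.0
    high = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power = abs(coeff) ** 2
        total += power
        if k * sample_rate / n > cutoff_hz:
            high += power
    return high / total


def relative_delta(int8_value, fp32_value):
    """Relative change of the int8 metric against the fp32 baseline."""
    return (int8_value - fp32_value) / fp32_value
```

Each metric would be computed on the fp32 and int8 output for the same voice and sentence, then compared with `relative_delta`.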

| Engine / model | n | RMS delta (int8 vs fp32) | Roughness delta | HF-spectral delta |
| --- | --- | --- | --- | --- |
| ONNX (main) `nano` | 640 | -6.2% (σ 5.6%) | +6.2% (σ 16.1%) | +22.0% (σ 33.0%) |
| native (this PR) `nano` | 640 | -22.5% (σ 1.9%) | +132.6% (σ 21.7%) | +27.1% (σ 8.9%) |
| native (this PR) `micro` | 640 | -0.3% (σ 0.7%) | -0.1% (σ 1.7%) | +0.4% (σ 3.8%) |
| native (this PR) `mini` | 640 | -0.6% (σ 0.7%) | -0.7% (σ 2.0%) | +0.4% (σ 5.0%) |
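
A sketch of how a table cell of this shape could be produced from per-sample metric values. Two assumptions are baked in: the delta is computed per paired sample and then averaged, and σ is the population standard deviation of those per-sample deltas. The function names are illustrative only.

```python
import statistics


def summarize_deltas(int8_values, fp32_values):
    """Mean and population std of per-sample relative deltas (int8 vs fp32)."""
    deltas = [(q - f) / f for q, f in zip(int8_values, fp32_values)]
    return statistics.mean(deltas), statistics.pstdev(deltas)


def format_cell(mean_delta, sigma):
    """Render a summary as, for example, '-22.5% (σ 1.9%)'."""
    return f"{mean_delta * 100:+.1f}% (σ {sigma * 100:.1f}%)"
```

Pairing by voice and sentence before averaging keeps differences between sentences from inflating σ.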

Two things stand out:

micro and mini int8 are statistically indistinguishable from fp32 under the native engine (deltas under 1%, tiny variance) — so the engine's int8 path is fine in general.
nano int8 specifically shows ~22% lower RMS energy and ~133% higher sample-to-sample roughness, with very low variance on the RMS delta (σ ~2%) across 640 samples. That's not noise; it's a systematic, reproducible effect isolated to one model size. It's also notably worse than the same nano int8 weights running through the existing ONNX engine, which shows a much smaller, noisier ~6% effect.

Since micro/mini/nano int8 all go through the same engine code path and the same model_inference.InferenceModel, the fact that only nano regresses this hard points at something specific to kitten_int8_15m_arch.json (or the corresponding kitten_fp32_15m_arch.json vs int8 weight layout) rather than a general int8-handling bug in the engine.

This is very likely the root cause behind the README's existing "some users have reported issues with kitten-tts-nano-0.8-int8" note — and this PR makes it measurably worse for that model.

Happy to share the full 80-sentence corpus and raw per-sample JSON if useful for debugging the nano int8 arch/weights specifically.

Support loading local model directories

3814ab4

dewana-sl wants to merge 1 commit into KittenML:main from dewana-sl:load-local-models
