
Add qwen3 tts #44517

Open
ShahVandit wants to merge 23 commits into huggingface:main from ShahVandit:add-qwen3-tts

@ShahVandit

What does this PR do?

Adds Qwen3-TTS, a series of text-to-speech models by the Qwen team (Alibaba Group), to Transformers.

Architecture:

  • Qwen3TTSForConditionalGeneration — text to multi-codebook speech codes (talker)
  • Qwen3TTSTokenizerV2Model (12Hz) and Qwen3TTSTokenizerV1Model (25Hz) — codes to audio waveform
  • Qwen3TTSProcessor — text preprocessing
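The two tokenizer rates above determine how many code frames the talker has to produce for a given amount of audio. The following is a minimal, library-free sketch of that arithmetic, not the PR's API. It assumes the nominal 12 Hz and 25 Hz figures can be read as codec frame rates, and `num_codebooks` is a hypothetical placeholder, since the description does not state how many codebooks the models use.

```python
import math

# Nominal frame rates taken from the tokenizer names in the description.
# Treating them as exact codec frame rates is an assumption.
TOKENIZER_FRAME_RATES_HZ = {"V2": 12, "V1": 25}


def num_code_frames(duration_s: float, frame_rate_hz: int) -> int:
    """Number of codec frames needed to cover `duration_s` seconds of audio."""
    return math.ceil(duration_s * frame_rate_hz)


def code_shape(duration_s: float, frame_rate_hz: int, num_codebooks: int) -> tuple:
    """Shape (codebooks, frames) of the multi-codebook code grid.

    `num_codebooks` is hypothetical; the real value depends on the checkpoint.
    """
    return (num_codebooks, num_code_frames(duration_s, frame_rate_hz))


if __name__ == "__main__":
    for name, rate in TOKENIZER_FRAME_RATES_HZ.items():
        print(name, code_shape(2.0, rate, num_codebooks=4))
```

Under these assumptions, the 12 Hz tokenizer needs roughly half as many frames per second of speech as the 25 Hz one, so the talker runs fewer autoregressive steps for the same audio length.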

Features: voice presets, voice design from natural-language descriptions, batch inference, and support for 10 languages

Paper: Qwen3-TTS Technical Report

Before submitting

  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

@eustlb @ebezzam @vasqu

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, qwen3_tts

2 participants