Tongyi Qianwen releases Qwen3-TTS-Flash: a unified architecture for Chinese and English with 17 voices

The Alibaba Tongyi team has announced the release of Qwen3-TTS (including the Qwen3-TTS-Flash variant), a next-generation text-to-speech model. The model synthesizes speech in multiple voices, languages, and dialects, with an emphasis on more natural and expressive output. Official demos and blog posts showcase strong performance in both Chinese and English. A new unified architecture handles multiple languages and dialects within a single model. An online demo and access instructions are now available.

The accompanying product documentation and console page indicate that Qwen3-TTS-Flash offers 17 human-like voices, that a single voice can speak multiple languages and dialects (including Mandarin and some other Chinese dialects), and that API pricing is published. A real-time speech synthesis option (Qwen3-TTS Realtime) is also available to reduce end-to-end latency. Media coverage noted that Qwen3-TTS was released on the same day as Qwen3-Omni and presented the two as key updates to the Tongyi multimodal family.

Frequently Asked Questions

Q: What are the core features of Qwen3-TTS?

A: It combines multiple voices, languages, and dialects in one model, emphasizes natural and expressive speech in Chinese and English, and is available through an online demo and an API.

Q: How does it differ from Qwen-TTS?

A: The official documentation recommends Qwen3-TTS, which covers a wider range of voices and languages (including multiple dialects) and is available in Flash and Realtime variants.

Q: Are the model weights open source?

A: At present the model is offered mainly through the API and the online demo, and the weights have not been released. Refer to the official API documentation and console for usage.

Q: Which languages, dialects, and voices are supported?

A: The documentation lists 17 voices covering Chinese (including some dialects) and several other languages; see the product page for the full list and pricing.

Q: Where can I try it and get updates?

A: You can try it on the official blog and demo page, and find the model and real-time speech options in the Alibaba Cloud Tongyi Qianwen product documentation.
