Google unveils new developments in Gemini audio models: Translate real-time translation, TTS preview, and Native Audio updates

AI information • Admin • 12/13/2025 • 147 views

Google announced that it is bringing Gemini's translation and audio capabilities to Google Translate, while also updating the text-to-speech (TTS) and native audio models in the Gemini 2.5 series. Google Translate is launching a beta of real-time speech-to-speech translation through headphones. It works in both two-way conversation and continuous listening scenarios, and it tries to preserve the speaker's tone, emphasis, and rhythm so that the translation sounds more like a person talking.

The beta is rolling out in stages on Android in the United States, Mexico, and India. It works with any headphones and covers more than 70 languages. Google also said it will expand to iOS and to more countries and regions in 2026.

Separately, on December 10 Google DeepMind released a preview update to text-to-speech for Gemini 2.5 Flash and 2.5 Pro. The update emphasizes closer adherence to style cues, automatic adjustment of speaking pace and pauses based on context, and more consistent character voices in multi-character dialogue. These changes suit multi-speaker scenarios such as podcasts, dubbing, teaching, and customer service.
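To make the multi-speaker TTS scenario more concrete, here is a minimal sketch of how a two-voice script with style cues might be assembled into a request body. It only builds the JSON and sends nothing. The helper names `build_script` and `build_tts_request` are hypothetical, and the field names (`speechConfig`, `multiSpeakerVoiceConfig`, `speakerVoiceConfigs`) and voice names ("Kore", "Puck") are assumptions based on the public Gemini API speech generation documentation; verify them, along with the current preview model name, before relying on this.

```python
import json


def build_script(lines):
    """Render (speaker, style_cue, text) tuples as a plain-text transcript.

    A style cue, when present, is written in parentheses before the line.
    This layout is an illustrative convention, not a required format.
    """
    rendered = []
    for speaker, cue, text in lines:
        prefix = f"({cue}) " if cue else ""
        rendered.append(f"{speaker}: {prefix}{text}")
    return "\n".join(rendered)


def build_tts_request(script, voices):
    """Assemble a request body for multi-speaker speech generation.

    `voices` maps each speaker label used in the script to a voice name.
    Field names are assumed from the Gemini API docs; nothing is sent here.
    """
    speaker_configs = [
        {
            "speaker": name,
            "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": voice}},
        }
        for name, voice in voices.items()
    ]
    return {
        "contents": [{"parts": [{"text": script}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "multiSpeakerVoiceConfig": {
                    "speakerVoiceConfigs": speaker_configs
                }
            },
        },
    }


if __name__ == "__main__":
    script = build_script([
        ("Host", "warmly", "Welcome back to the show."),
        ("Guest", None, "Glad to be here."),
    ])
    body = build_tts_request(script, {"Host": "Kore", "Guest": "Puck"})
    print(json.dumps(body, indent=2))
```

Keeping one fixed voice per speaker label is what allows a character to sound the same across a long dialogue, while the parenthesized cues are where tone or pacing instructions would go.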

For real-time voice interaction, Gemini 2.5 Flash Native Audio has also been updated, with a focus on handling complex workflows, following user instructions, and sustaining natural multi-turn conversations. It is accessible through Google AI Studio, Vertex AI, and other products. Most of these features are still in beta or preview, however, so mistranslations, accent deviations, and unstable style remain possible. Users should also keep privacy in mind and expect background noise to affect quality.

FAQs

Q: What does Google Translate's real-time headphone translation do?

A: Google Translate offers real-time speech-to-speech translation in beta. You hear the translation through headphones, and it tries to preserve the tone and rhythm of the original speech.

Q: In which regions will Google Translate's real-time translation be launched first?

A: The beta is rolling out in stages on Android in the United States, Mexico, and India, with expansion to iOS and to more countries and regions planned for 2026.

Q: What languages does Google Translate's real-time headphone translation support?

A: The beta is said to support more than 70 languages. The specific languages available will be updated gradually and may vary by region and app version.

Q: What has changed in the Text-to-Speech update for Gemini 2.5 Flash and 2.5 Pro?

A: The update focuses on closer adherence to style cues, speaking pace and pauses that better fit the context, and more consistent character voices in multi-speaker scenes.

Q: What is the Gemini 2.5 Flash Native Audio update suited for?

A: The update targets real-time voice agents and conversational applications, emphasizing stronger instruction following, more coherent multi-turn conversations, and better handling of complex task workflows.
