Google releases Gemini 2.5 Flash Live native audio preview for more natural voice conversations

AI information • Admin • 9/24/2025 • 66 views

Google has released a preview of the Gemini 2.5 Flash Live native audio model in a developer update, describing it as the latest iteration of the Gemini Live model, with a focus on more reliable function calling and more natural conversation. The model handles both input and output as native audio, which reduces the latency and distortion associated with traditional ASR/TTS cascades. It can be interrupted and can resume mid-conversation, and it targets scenarios such as real-time voice assistants, customer service agents, and live demonstrations.
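For a client application, handling an interruption typically means discarding model audio that has been received but not yet played, so the assistant stops talking as soon as the user cuts in. The following is a minimal sketch of that idea in Python. The event shapes (`{"type": "audio", ...}` and `{"type": "interrupted"}`) and the `PlaybackBuffer` class are hypothetical and are not taken from the Live API documentation.

```python
from collections import deque
from typing import Any, Deque, Dict, Optional


class PlaybackBuffer:
    """Holds model audio chunks that have been received but not yet played."""

    def __init__(self) -> None:
        self._chunks: Deque[bytes] = deque()

    def handle_event(self, event: Dict[str, Any]) -> None:
        # Assumed (hypothetical) event shapes:
        #   {"type": "audio", "data": bytes}  -> a new chunk of model speech
        #   {"type": "interrupted"}           -> the user started speaking
        if event["type"] == "audio":
            self._chunks.append(event["data"])
        elif event["type"] == "interrupted":
            # Drop everything still queued so stale speech is never played.
            self._chunks.clear()

    def next_chunk(self) -> Optional[bytes]:
        """Return the next chunk to play, or None if nothing is queued."""
        return self._chunks.popleft() if self._chunks else None


if __name__ == "__main__":
    buf = PlaybackBuffer()
    buf.handle_event({"type": "audio", "data": b"hello"})
    buf.handle_event({"type": "audio", "data": b"world"})
    print(buf.next_chunk())               # first queued chunk
    buf.handle_event({"type": "interrupted"})
    print(buf.next_chunk())               # queue was cleared by the interruption
```

After an interruption the buffer simply starts filling again with the next response, which is what lets the conversation resume without replaying old audio.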

According to the official documentation, the Live API supports low-latency, bidirectional streaming with voice, video, and text input. Models can trigger tool calls directly within a conversation and return structured results. The preview is available to try in Google AI Studio, and the Vertex AI and Gemini API documentation have been updated at the same time. Developers can follow the Live API guide to integrate and test it. The changelog lists the native audio model as available in preview as of September 23, 2025.
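To illustrate what a tool call with a structured result can look like on the application side, here is a minimal Python sketch of a local dispatcher. It does not use the Live API itself. The tool `get_order_status`, the shape of the incoming call (`id`, `name`, `args`), and the shape of the returned response are all assumptions made for illustration; the Live API guide defines the actual message formats.

```python
import json
from typing import Any, Callable, Dict


# Hypothetical tool: the name, parameter, and return shape are illustrative only.
def get_order_status(order_id: str) -> Dict[str, Any]:
    """Pretend lookup that a customer service agent might expose."""
    return {"order_id": order_id, "status": "shipped"}


# Registry mapping tool names to local Python callables.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "get_order_status": get_order_status,
}


def handle_tool_call(call: Dict[str, Any]) -> Dict[str, Any]:
    """Run one tool call and wrap the outcome as a structured response.

    `call` uses an assumed shape: {"id": str, "name": str, "args": dict}.
    """
    func = TOOLS.get(call["name"])
    if func is None:
        result: Dict[str, Any] = {"error": f"unknown tool: {call['name']}"}
    else:
        try:
            result = func(**call["args"])
        except TypeError as exc:  # wrong or missing arguments
            result = {"error": str(exc)}
    # Echo the call id and name so the reply can be matched to the request.
    return {"id": call["id"], "name": call["name"], "response": result}


if __name__ == "__main__":
    call = {"id": "call-1", "name": "get_order_status", "args": {"order_id": "A123"}}
    print(json.dumps(handle_tool_call(call), indent=2))
```

Returning errors as structured data, rather than raising, keeps the conversation going: the model can read the error and ask the user for the missing detail.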

Frequently Asked Questions

Q: What are the core improvements of Gemini Live this time?

A: The native audio model is now live, with more stable and accurate function calling. Voice conversations are more natural, and you can interrupt the model and have it pick up again right away.

Q: Where can I experience it?

A: The Live portal of Google AI Studio is now open for online trial.

Q: What inputs and outputs can the Live API handle?

A: Text, audio and video input; text and audio output, supporting real-time two-way streaming.

Q: Is this a stable release?

A: No, it is still in preview. Refer to the official documentation and console for specific capabilities and quotas.

Q: How is it different from previous Gemini Live versions?

A: It uses a single native audio model instead of relying on an STT/TTS cascade, which results in lower latency and more stable tool calling.
