Google’s AI team has officially launched Gemini 3.5 Live Translate, a next-generation real-time speech translation model aimed at more natural, immersive cross-language communication. The model supports more than 70 languages and reproduces the speaker’s intonation, speaking pace, and pitch, so the translated voice sounds close to the original and avoids the mechanical or distorted quality common in synthesized speech.
Unlike traditional speech translation solutions, which suffer from high latency and frequent interruptions, Gemini 3.5 Live Translate uses a streaming inference architecture that balances contextual understanding against response time: it preserves translation accuracy while producing continuous audio output, with end-to-end latency kept within a few seconds. This improves the experience in frequent-use scenarios such as video conferences, everyday conversations, and remote collaboration.
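The trade-off between context and response time can be illustrated with a minimal sketch. This is not Google’s implementation: the `StreamingSegmenter` class, its thresholds, the pause flag, and the `translate` callback are all hypothetical names chosen for illustration. The sketch assumes one simple policy: buffer incoming audio until there is enough context and the speaker pauses, but never wait longer than a fixed upper bound.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

# (audio bytes, duration in ms, whether a pause was detected after this chunk)
Chunk = Tuple[bytes, int, bool]


@dataclass
class StreamingSegmenter:
    """Hypothetical buffering policy for a streaming translator.

    A segment is emitted when at least `min_context_ms` of audio is buffered
    and a pause is seen, or unconditionally once `max_wait_ms` is reached.
    """
    min_context_ms: int = 400   # assumed value, for illustration only
    max_wait_ms: int = 1500     # assumed value, for illustration only
    _buffer: List[bytes] = field(default_factory=list)
    _buffered_ms: int = 0

    def push(self, chunk: bytes, duration_ms: int, is_pause: bool) -> Optional[bytes]:
        """Add one chunk; return a segment if the policy says to emit now."""
        self._buffer.append(chunk)
        self._buffered_ms += duration_ms
        enough_context = self._buffered_ms >= self.min_context_ms
        waited_too_long = self._buffered_ms >= self.max_wait_ms
        if (enough_context and is_pause) or waited_too_long:
            return self.flush()
        return None

    def flush(self) -> Optional[bytes]:
        """Emit whatever is buffered, or None if the buffer is empty."""
        if not self._buffer:
            return None
        segment = b"".join(self._buffer)
        self._buffer.clear()
        self._buffered_ms = 0
        return segment


def translate_stream(
    chunks: Iterable[Chunk],
    translate: Callable[[bytes], str],
    segmenter: Optional[StreamingSegmenter] = None,
) -> Iterator[str]:
    """Yield translations as segments become ready, then drain the tail."""
    segmenter = segmenter or StreamingSegmenter()
    for chunk, duration_ms, is_pause in chunks:
        segment = segmenter.push(chunk, duration_ms, is_pause)
        if segment is not None:
            yield translate(segment)
    tail = segmenter.flush()
    if tail is not None:
        yield translate(tail)


def fake_translate(segment: bytes) -> str:
    """Stand-in for a real model call; simply upper-cases the bytes as text."""
    return segment.decode().upper()
```

Lowering `max_wait_ms` reduces worst-case delay at the cost of handing the translator shorter, less complete segments, while raising `min_context_ms` does the opposite. A production system would tune this balance with far more sophisticated signals than a single pause flag.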
The model is currently rolling out across several core products:
- The new real-time speech translation feature is now available worldwide in the Google Translate app (iOS/Android), and users can enable it free of charge.
- Developers can integrate the model through the Gemini Live API and Google AI Studio to build customized interpretation systems or incorporate it into complex media-stream processing workflows.
- Google Meet Enterprise will begin offering a private preview to select customers this month, supporting real-time simultaneous interpretation across multiple languages during meetings.
In terms of technical highlights, Gemini 3.5 Live Translate natively supports streaming speech processing, so it listens and translates at the same time, and it automatically recognizes and responds to multiple input languages without manual language switching. It also includes a built-in noise suppression module intended to keep accuracy high in challenging acoustic environments such as subway stations or cafés.
Usage is flexible: users worldwide can now try it directly in the Google Translate app. To minimize echo interference, headphones are recommended. Android users can also activate “listening mode”: hold the phone close to your ear, and the translated speech plays through the earpiece, giving a more private and immersive real-time conversation.