Google Translate AI adds real-time voice

What Gemini 3.5 Live Translate Is and Why It Matters

Gemini 3.5 Live Translate is Google’s new real-time speech-to-speech translation system, built into Google Translate and Google Meet. It uses AI to detect, interpret, and speak more than 70 languages as people talk, with no turn-by-turn pauses and no special hardware add-ons. Instead of waiting for a sentence to end, the model processes speech continuously and speaks the translation after only a short delay, preserving intonation, pacing, and pitch so the result sounds more like a human voice than a robotic audio track. The feature turns almost any smartphone into a live speech translator, reaching far more people than earlier attempts that depended on Pixel phones and Pixel Buds.

From Pixel-Only to Any Phone: Hardware Barriers Removed

Previous Google live translation features depended on specific hardware such as Pixel phones and Pixel Buds, which limited how many people could try live speech translation in everyday situations. Those constraints are gone. Gemini 3.5 Live Translate works on Android and iOS with any connected headphones, so it behaves like a platform-wide feature rather than an accessory demo. According to Technobezz, earlier Translate updates on Android still required Pixel Buds, while the new release “lets anyone hold a real-time conversation across languages using nothing more than a smartphone.” This change matters as much as the AI itself: Google can now apply its translation volume (over a trillion words processed per month) to live voice conversations without asking users to buy specific gear first.

How Real-Time AI Voice Translation Works Across 70+ Languages

Gemini 3.5 Live Translate focuses on real-time, speech-to-speech translation that feels conversational. The system automatically detects over 70 languages without manual selection, then outputs translated audio while the speaker is still talking. Google says the model balances the delay needed for context against the need to stay in sync, which reduces the awkward pauses common in older tools. As a result, conversations sound more natural, and the translated voice preserves the speaker’s tone, pacing, and pitch. All AI-generated audio carries a SynthID watermark embedded in the waveform, which is inaudible to listeners but detectable by software. The model is also built to handle noisy settings and overlapping voices, keeping live translation usable in busy streets, stations, and offices, where it is often needed most.
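Google has not published how the model schedules its output, so the following is only a toy illustration of the delay-versus-context trade-off described above. It uses a fixed-lag policy, similar in spirit to the "wait-k" approach from the simultaneous translation literature, and is not Google's method. The names `fixed_lag_stream`, `translate_word`, and `lag` are hypothetical, and the stand-in "translator" simply uppercases each word.

```python
from typing import Callable, Iterable, Iterator


def fixed_lag_stream(
    source_words: Iterable[str],
    translate_word: Callable[[str], str],
    lag: int,
) -> Iterator[tuple[int, str]]:
    """Yield (words_heard_so_far, output_word) pairs.

    The output stays `lag` words behind the speaker: a larger lag gives the
    translator more context before it commits, at the cost of more delay.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")

    buffer: list[str] = []
    emitted = 0
    for word in source_words:
        buffer.append(word)
        # Emit every word that is now at least `lag` words behind the speaker.
        while emitted < len(buffer) - lag:
            yield len(buffer), translate_word(buffer[emitted])
            emitted += 1

    # The speaker has stopped: flush whatever is still waiting in the buffer.
    while emitted < len(buffer):
        yield len(buffer), translate_word(buffer[emitted])
        emitted += 1


if __name__ == "__main__":
    words = ["where", "is", "the", "station"]
    # Hypothetical stand-in for a real translation step.
    for heard, spoken in fixed_lag_stream(words, str.upper, lag=2):
        print(f"after hearing {heard} word(s): speak {spoken}")
```

With `lag=0` the output starts immediately but each word is produced with no look-ahead. With `lag=2` the first output appears only after the third word has been heard. A production system would presumably adapt this delay rather than fix it.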

Hands-Free Conversations: Android Listening Mode and Headphone Support

On both Android and iOS, users can open the Google Translate app, connect any pair of headphones, and tap the Live Translate button in the lower-left corner to start a real-time translation session. The translated speech plays through the headphones, allowing two people to speak and hear each other’s languages in near real time. Android adds another option: a new listening mode that routes translated audio through the phone’s earpiece. You hold the device to your ear as you would on a regular call while the translated voice streams only to you, which is useful when you have no headphones or want privacy in a crowd. Together, these options make Google Translate practical for quick travel questions, customer support calls, and on-the-fly business chats.

Beyond the Phone: Google Meet and Developer Integrations

The rollout is not limited to the Translate app. Google Meet is gaining the same Gemini 3.5 Live Translate engine, expanding from 5 supported languages to over 70 and from English-only language pairs to more than 2,000 language combinations in a single meeting. That means a call can include participants speaking many different languages, with live speech translation bridging them in real time. The Meet feature is in private preview for enterprise customers, with broader availability planned later. Developers also get an entry point through a public preview in the Gemini Live API and Google AI Studio, paving the way for real-time translation inside support hotlines, ride-hailing apps, and education tools. Grab is already piloting the technology to help drivers and passengers communicate during more than 10 million monthly voice calls.
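The "more than 2,000 combinations" figure can be sanity-checked with simple counting. The sketch below assumes exactly 70 languages and that every language can be paired with every other. The source says only "over 70" and "more than 2,000", so both assumptions are illustrative.

```python
from math import comb, perm


def language_pairs(n_languages: int) -> tuple[int, int]:
    """Return (unordered pairs, ordered source-to-target pairs) for n languages."""
    unordered = comb(n_languages, 2)  # {French, Hindi} counted once
    ordered = perm(n_languages, 2)    # French-to-Hindi and Hindi-to-French counted separately
    return unordered, ordered


if __name__ == "__main__":
    # Assumption: exactly 70 languages, all mutually translatable.
    unordered, ordered = language_pairs(70)
    print(f"unordered pairs: {unordered}, directed pairs: {ordered}")
```

Under these assumptions, 70 languages give 2,415 unordered pairs, which is consistent with the reported figure. Counting each translation direction separately would give 4,830.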