Google I/O 2026: New Gemini Audio Tools

By Flixly Team, May 20, 2026
TL;DR

Google I/O 2026 introduced Gemini 3.1 Flash TTS with 80+ languages and multi-speaker support. Flixly integrates these updates directly into its prompt to audio studio, so realistic AI audio can be produced in one workspace. Creators can now move from text prompt to finished voiceover or dialogue track without switching apps.

At Google I/O 2026, Google released Gemini 3.1 Flash TTS, extending its reach to 80+ languages and 30 preset voices while adding native multi-speaker dialogue support. Flixly users gained immediate access to the model through the Text to Speech tool, removing the need to copy outputs between platforms.

How Gemini 3.1 Flash TTS Works in Flixly

Model Capabilities

  • 80+ languages with regional accent variants
  • 30 voice presets covering broadcast, conversational, and character styles
  • Built-in multi-speaker dialogue that keeps turn order and timing consistent

Credit Costs and Duration Limits

Text to Speech costs 1.2 credits per 30 seconds of output. A single generation is capped at 3 minutes; beyond that, the system splits the output into multiple files automatically.
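
The pricing and length limit above reduce to simple arithmetic. The following is a minimal Python sketch for budgeting a script, not a Flixly API: the name `estimate_tts` is hypothetical, and it assumes that every started 30-second block is billed in full and that longer outputs are split into files of at most 3 minutes each. Neither rounding rule is confirmed by Flixly.

```python
from dataclasses import dataclass
from decimal import Decimal
import math

CREDITS_PER_BLOCK = Decimal("1.2")  # stated rate: 1.2 credits per 30 seconds
BLOCK_SECONDS = 30
MAX_FILE_SECONDS = 180  # stated cap: 3 minutes per generated file


@dataclass
class Estimate:
    credits: Decimal
    files: int


def estimate_tts(duration_seconds: int) -> Estimate:
    """Estimate credits and file count for a given output length."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Assumption: every started 30-second block is billed in full.
    blocks = math.ceil(duration_seconds / BLOCK_SECONDS)
    # Assumption: output is split into chunks of at most 3 minutes.
    files = math.ceil(duration_seconds / MAX_FILE_SECONDS)
    return Estimate(credits=CREDITS_PER_BLOCK * blocks, files=files)


for seconds in (60, 180, 400):
    print(seconds, estimate_tts(seconds))
```

`Decimal` is used instead of `float` so that totals such as 7.2 credits come out exact rather than with binary rounding noise.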

Workflow Steps

  1. Open Text to Speech and paste your script.
  2. Select Gemini 3.1 Flash TTS and pick two voices for dialogue.
  3. Add speaker tags such as [Speaker 1] and [Speaker 2].
  4. Set the speed and emotion sliders, then generate.
  5. Download the MP3 or continue editing in the prompt to audio studio.
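
Before pasting a dialogue script, it can help to check locally that every line is attributed to a speaker. The sketch below is a hypothetical pre-flight check in Python, not part of Flixly: it assumes tags appear at the start of a line in the `[Speaker N]` form shown in step 3, and it treats untagged lines as a continuation of the previous turn, which is an assumption about how such scripts are usually written rather than documented Flixly behavior.

```python
import re
from typing import List, Tuple

# Matches a line that begins with a tag such as "[Speaker 1]".
TAG_PATTERN = re.compile(r"^\[(Speaker \d+)\]\s*(.*)$")


def parse_dialogue(script: str) -> List[Tuple[str, str]]:
    """Split a tagged script into (speaker, text) turns, in order."""
    turns: List[Tuple[str, str]] = []
    for raw_line in script.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        match = TAG_PATTERN.match(line)
        if match:
            turns.append((match.group(1), match.group(2)))
        elif turns:
            # Untagged line: append it to the previous speaker's turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, f"{text} {line}".strip())
        else:
            raise ValueError(f"Script must start with a speaker tag: {line!r}")
    return turns


script = """
[Speaker 1] Welcome back to the show.
[Speaker 2] Thanks for having me.
It's great to be here.
"""
turns = parse_dialogue(script)
for speaker, text in turns:
    print(f"{speaker}: {text}")
```

Counting the distinct speakers in `turns` gives a quick check that the script matches the two voices selected in step 2.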

Prompt to Audio Studio Features

The prompt to audio studio inside Flixly combines text input, voice selection, and real-time preview. Users report finishing full podcast episodes in under 45 minutes compared with two hours on separate tools.

Voice Cloning Integration

Use Voice Cloning to import a 30-second sample and match an existing brand voice. Once cloned, the new voice appears in every subsequent generation inside the same workspace.

Multi-Speaker Dialogue Templates

Pre-built templates for interviews, customer support calls, and educational explainers save setup time. Each template ships with speaker roles and pause markers already inserted.

Realistic AI Audio 2026 Benchmarks

The quality of realistic AI audio in 2026 is measured by clarity, accent accuracy, and emotional range. Independent tests placed Gemini 3.1 Flash TTS at 4.6/5 naturalness on podcast scripts and 4.3/5 on technical narration.

Comparison Table

Feature                 Gemini 3.1 Flash TTS  ElevenLabs Multilingual v2  OpenAI TTS HD
Languages supported     80+                   29                          50+
Multi-speaker support   Native                Limited                     Manual
Credit cost (30 sec)    1.2                   2.0                         1.8
Max single file length  3 min                 5 min                       4 min
Real-time preview       Yes                   No                          No

All-in-One AI Audio Workflow

Flixly bundles every audio step in a single dashboard. From script import to final mastering, users stay inside the same environment.

Real Workflow Example

  1. Write a 60-second explainer in Docs.
  2. Copy text into Text to Speech using Gemini 3.1 Flash TTS.
  3. Clone the host voice via Voice Cloning.
  4. Add background music from Music Generation.
  5. Apply Auto Captions and export both the audio and a video-ready SRT file.
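
For readers unfamiliar with the SRT file from the last step: it is a plain-text subtitle format made of numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range and the caption text. The Python sketch below shows that layout only. The function names and sample cues are illustrative, and it says nothing about how Flixly's Auto Captions produces its output.

```python
from typing import List, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: List[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) cues as SRT text with 1-based numbering."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


# Illustrative cues, not real caption output.
print(to_srt([
    (0.0, 2.5, "Welcome to the explainer."),
    (2.5, 6.0, "Here is how it works."),
]))
```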

Credit Efficiency

A full 60-second episode consumes 2.4 credits for voice (two 30-second blocks at 1.2 credits each) and 1.0 credit for music, 3.4 credits in total. That stays below 4 credits before any discounts from monthly plans.

Who Benefits Most

Podcasters, course creators, and marketing teams report consistent time savings. One agency finished three client explainer videos in one afternoon that previously took two days.

Starting with Gemini Audio in Flixly

Sign up to test Gemini 3.1 Flash TTS inside the prompt to audio studio today.

Frequently Asked Questions

What new audio tools did Google announce at I/O 2026?

Google introduced Gemini 3.1 Flash TTS with native multi-speaker dialogue, 80+ language coverage, and 30 voice presets. The model also added improved emotion sliders and real-time preview in supported platforms.

How do I use Gemini audio inside Flixly?

Open the Text to Speech tool, select Gemini 3.1 Flash TTS from the model list, and paste your script with speaker tags. Adjust speed and emotion, then hit generate for immediate playback.

Can Flixly clone my voice with Gemini?

Yes. Use the Voice Cloning tool to upload a 30-second sample. Once processed, the cloned voice becomes available in every Gemini generation inside the same workspace.

What credit cost does Gemini TTS have in Flixly?

Each 30 seconds of Gemini 3.1 Flash TTS output costs 1.2 credits. A three-minute file is six 30-second blocks, or 7.2 credits, before any plan discounts apply.

Is Gemini TTS better than ElevenLabs in 2026?

Gemini 3.1 Flash TTS leads in language count and native multi-speaker handling while ElevenLabs still holds an edge in ultra-fine emotion control. Flixly lets users switch models without leaving the platform.

Can I add background music to Gemini voiceovers?

Yes. After generating the voice track, open Music Generation and layer a track directly beneath it. Export both together as a single mixed file.

Does Flixly support multi-speaker scripts with Gemini?

Flixly supports full multi-speaker scripts when you tag lines with [Speaker 1] and [Speaker 2]. The model maintains correct timing and turn order automatically.
