Veo 3
Google's most advanced video model — the first to generate synchronized audio, dialogue, and sound effects alongside photorealistic video.
Max Duration: 8 seconds
Resolution: 720p, 1080p
Aspect Ratios: 16:9
Audio: Native audio generation
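As a rough illustration, the limits listed above can be written as a small pre-flight check that runs before a generation request is submitted. This is only a sketch: `VideoRequest`, `validate`, and the constant names are hypothetical and do not come from any Flashloop or Google API. Only the numeric limits are taken from the spec list.

```python
from dataclasses import dataclass

# Limits copied from the spec list above; all names here are hypothetical.
MAX_DURATION_S = 8
RESOLUTIONS = ("720p", "1080p")
ASPECT_RATIOS = ("16:9",)


@dataclass
class VideoRequest:
    """Illustrative container for the settings a user might choose."""
    prompt: str
    duration_s: float = 8
    resolution: str = "720p"
    aspect_ratio: str = "16:9"


def validate(req):
    """Return a list of problems; an empty list means the request fits the listed limits."""
    problems = []
    if not req.prompt.strip():
        problems.append("prompt is empty")
    if not 0 < req.duration_s <= MAX_DURATION_S:
        problems.append(f"duration must be over 0 and at most {MAX_DURATION_S} seconds")
    if req.resolution not in RESOLUTIONS:
        problems.append(f"resolution must be one of {', '.join(RESOLUTIONS)}")
    if req.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"aspect ratio must be one of {', '.join(ASPECT_RATIOS)}")
    return problems


if __name__ == "__main__":
    print(validate(VideoRequest("a lighthouse in a storm")))
    print(validate(VideoRequest("a lighthouse", duration_s=12, resolution="4k")))
```

Returning a list of problems rather than raising on the first one lets a caller show every issue at once, which suits a form-style interface.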
What makes Veo 3 unique
First major model with native audio generation — dialogue, sound effects, and ambient noise are generated alongside video in a single pass
Latent Diffusion Transformer architecture processes audio and video latents together at each denoising step
Photorealistic output with accurate real-world physics and cinematic lighting
Strong prompt adherence with accurate text rendering in scenes
Reference image support (Veo 3.1) — up to 3 reference images for character/object/scene consistency
Most popular model on Flashloop with 239K+ videos generated
How Veo 3 works
Veo 3 uses a Latent Diffusion Transformer that compresses video and audio into lower-dimensional latent representations, then applies diffusion across height, width, and time. Audio and video latents are processed together at each denoising step, producing videos with naturally synchronized sound effects, dialogue, and ambient noise — all in a single generation pass.
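The control flow described above, one denoising loop in which audio and video latents are updated together at every step, can be sketched with a toy example. This is not Veo 3's implementation. The "denoiser" here is a stand-in that pulls every value toward a shared average; a real model would use a trained transformer. The latents are plain lists of floats, and every function name is hypothetical. The point is only that both modalities read from the same joint context at each step instead of being generated in separate passes.

```python
import random


def denoise_step(video, audio, t, steps):
    """One toy denoising step that updates both latents from a shared context."""
    # The mean over both modalities stands in for joint attention across them.
    joint = video + audio
    context = sum(joint) / len(joint)
    # Step size grows from 1/steps at t=0 to 1 at the final step.
    rate = 1.0 / (steps - t)
    new_video = [v + rate * (context - v) for v in video]
    new_audio = [a + rate * (context - a) for a in audio]
    return new_video, new_audio


def generate(video_len=6, audio_len=4, steps=8, seed=0):
    """Start from Gaussian noise and run a single joint denoising loop."""
    rng = random.Random(seed)
    video = [rng.gauss(0, 1) for _ in range(video_len)]
    audio = [rng.gauss(0, 1) for _ in range(audio_len)]
    for t in range(steps):
        # Both latents pass through the same step, so they stay coupled.
        video, audio = denoise_step(video, audio, t, steps)
    return video, audio


if __name__ == "__main__":
    video, audio = generate()
    print(video[:2], audio[:2])
```

Because each step reads from both latents, a change in the audio noise also changes the resulting video latents, which is the coupling that a single-pass design relies on for synchronization.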
How to use Veo 3 on Flashloop
Select Veo 3
Open Flashloop and choose Veo 3 from the model selector in the video creator.
Enter your prompt
Describe what you want to see in your video — be as detailed as you like.
Generate & download
Hit generate and your video will be ready in seconds. Download or share directly.
Ready to create with Veo 3?
Start generating videos with Veo 3 on Flashloop — available on iOS, Android, and web.