Gemini Omni
Multimodal video generation from text, up to 7 reference images, or a video clip, with native 4K output and clips of up to 10 seconds.
- Max Duration: 10 seconds
- Resolution: 720p, 1080p, 4K
- Aspect Ratios: 16:9, 9:16
What makes Gemini Omni unique
Up to 7 reference images for character / scene continuity
Video-to-video editing with sub-clip trim range (up to 10s)
Native 4K output
Three input modes in one model: text, images, and video
Where it falls short
- Video-to-video output duration is decided by the model, not the prompt
- Aspect ratios limited to 16:9 and 9:16
- Image and video references share a 7-slot quota (each image = 1, each video = 2)
Best for
Prompt tips
1. Use reference images for identity. One image for the main character, others for wardrobe or location, instead of overloading a single image.
2. When using a reference clip, trim to the moment you want the model to extend; short, focused clips give the cleanest continuation.
3. If you need 4K, write longer prompts that describe lighting and texture; the model has more pixels to fill and richer prompts hold up better.
4. Skip the duration field when using video-to-video; the model picks the length based on your clip.
How Gemini Omni works
Write a prompt describing the scene, camera, and style. Add up to 7 reference images for characters, products, or look-and-feel, or attach a short reference clip with start/end timestamps for video-to-video. Pick 720p, 1080p, or 4K and a duration of 4, 6, 8, or 10 seconds, then generate. When you provide a reference video, the model chooses the output duration itself and your duration setting is ignored.
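The settings described above can be sanity-checked before you generate. The following is a minimal sketch, not Flashloop's or Google's actual API: the function name `resolve_settings`, its parameters, and the dictionary keys are hypothetical and exist only for illustration. Only the allowed values (resolutions, durations, aspect ratios) come from this page.

```python
from typing import Optional

# Allowed values as listed on this page.
RESOLUTIONS = {"720p", "1080p", "4K"}
DURATIONS_S = {4, 6, 8, 10}
ASPECT_RATIOS = {"16:9", "9:16"}


def resolve_settings(
    resolution: str,
    aspect_ratio: str,
    duration_s: Optional[int] = None,
    has_reference_video: bool = False,
) -> dict:
    """Hypothetical helper: validate settings and return the effective ones."""
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio}")

    settings = {"resolution": resolution, "aspect_ratio": aspect_ratio}
    if has_reference_video:
        # Video-to-video: the model picks the length, so any duration is dropped.
        return settings

    if duration_s not in DURATIONS_S:
        raise ValueError(f"unsupported duration: {duration_s}")
    settings["duration_s"] = duration_s
    return settings
```

For example, `resolve_settings("1080p", "9:16", 10, has_reference_video=True)` returns settings without a duration, mirroring the rule that the duration choice is ignored for video-to-video.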
How to use on Flashloop
Select Gemini Omni
Open Flashloop and choose Gemini Omni from the model selector in the video creator.
Write your prompt
Describe what you want to see in your video — be as detailed as you like.
Generate & download
Hit generate and your video will be ready in seconds. Download or share directly.
Frequently asked questions
Three: pure text, up to 7 reference images for image-to-video, or a single reference clip with a trim range for video-to-video. You can mix images with a video reference as long as you stay within the 7-slot quota (each image is 1 slot, the video is 2).
Yes. Gemini Omni outputs natively at 720p, 1080p, or 4K. 4K costs more per generation since rendering is heavier on Google's side.
4, 6, 8, or 10 seconds. When you attach a reference video, the model picks the output duration itself and your duration choice is ignored.
16:9 landscape and 9:16 portrait. There's no square or other ratio at the moment.
Each reference image consumes 1 slot, and a reference video consumes 2 slots. The total must stay at or below 7 — so you can use 7 images alone, or 5 images plus 1 video, or 1 video by itself.
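The slot arithmetic above can be written out as a short check. This is an illustrative sketch only: the names `slots_used` and `fits_quota` are hypothetical and not part of any Flashloop or Gemini API. The one-video limit is an assumption based on this page describing "a single reference clip".

```python
MAX_SLOTS = 7    # total quota stated on this page
IMAGE_COST = 1   # each reference image uses 1 slot
VIDEO_COST = 2   # a reference video uses 2 slots


def slots_used(num_images: int, num_videos: int = 0) -> int:
    """Return the number of quota slots a set of references consumes."""
    if num_images < 0 or num_videos < 0:
        raise ValueError("reference counts cannot be negative")
    return num_images * IMAGE_COST + num_videos * VIDEO_COST


def fits_quota(num_images: int, num_videos: int = 0) -> bool:
    """Check a reference mix against the 7-slot quota."""
    # Assumption: at most one reference video, since the page
    # describes "a single reference clip".
    if num_videos > 1:
        return False
    return slots_used(num_images, num_videos) <= MAX_SLOTS
```

With these definitions, 7 images alone, 5 images plus 1 video, and 1 video by itself all fit, while 6 images plus 1 video (8 slots) does not.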
Other video models
Veo 3
Generate 8-second videos with native dialogue, sound effects, and ambient audio in one pass.
Kling 3.0
Create up to 15-second multi-shot videos with character consistency, 4K 60fps support, and bilingual audio.
Kling 2.6
Turn images into 10-second 1080p videos with 48fps motion control and native speaking, singing, or rapping audio.
Ready to create with Gemini Omni?
Available on iOS, Android, and web.