Seedance 2.0
ByteDance's video model with a Fast/High switch, native audio, up to 7 reference images, and director-level camera control.
- Max Duration: 15 seconds
- Resolution: 480p, 720p
- Aspect Ratios: 16:9, 9:16, 1:1, 4:3, 3:4, 21:9
- Audio: Native audio
What makes Seedance 2.0 unique
- Multi-image reference support: up to 7 reference images for consistent characters and scenes
- Optional first-frame and last-frame control for precise scene anchoring
- Native synchronized audio generation
- Director-level camera control and cinematic shot composition
- Realistic physics simulation: collisions with weight, fabric tearing, believable action sequences
- Native CapCut integration: the same engine behind TikTok's video tools
Best for
Prompt tips
1. Use multiple reference images for consistent character identity across shots.
2. Specify camera movements explicitly; Seedance 2.0 follows director-style instructions well.
3. Include audio cues in your prompt for synchronized sound design.
4. For action scenes, describe physics and weight to leverage its realistic simulation engine.
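One way to apply these tips consistently is to assemble the prompt from separate parts, so the camera, audio, and physics cues are never forgotten. The snippet below is a minimal sketch in Python. The `build_prompt` helper and the "Camera:", "Audio:", and "Physics:" labels are illustrative assumptions, not a documented Seedance 2.0 prompt syntax; the model takes plain text, so any clear wording should work.

```python
def build_prompt(subject, camera=None, audio=None, physics=None):
    """Join the non-empty parts into one prompt string.

    Hypothetical helper: the section labels are a writing aid,
    not keywords the model is known to require.
    """
    parts = [subject.strip()]
    if camera:
        parts.append(f"Camera: {camera.strip()}")
    if audio:
        parts.append(f"Audio: {audio.strip()}")
    if physics:
        parts.append(f"Physics: {physics.strip()}")
    # Strip trailing periods so the separator does not double them up.
    return ". ".join(p.rstrip(".") for p in parts) + "."


prompt = build_prompt(
    "A courier sprints across a wet rooftop",
    camera="slow dolly-in, then whip pan left",
    audio="rain on metal, distant sirens",
    physics="heavy landing, jacket snapping in the wind",
)
print(prompt)
```

Paste the printed text into the prompt field, and attach reference images separately if you need the same character across shots.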
How Seedance 2.0 works
Seedance 2.0 uses a Diffusion Transformer (DiT) architecture that replaces the traditional U-Net backbone with a transformer for better scalability. It features a unified multimodal audio-video joint generation system that processes reference images and audio together to produce cinematic output with synchronized sound.
How to use on Flashloop
Select Seedance 2.0
Open Flashloop and choose Seedance 2.0 from the model selector in the video creator.
Write your prompt
Describe what you want to see in your video — be as detailed as you like.
Generate & download
Hit generate and your video will be ready in seconds. Download or share directly.
Frequently asked questions
Seedance 2.0 prioritizes maximum fidelity. Seedance 2.0 Fast is the same model family with a faster, cheaper generation path — better when iteration speed matters more than peak quality.
Seedance 2.0 uses a unified multimodal system that generates synchronized audio and video together, not sequentially.
You can attach up to 7 reference images per generation. First-frame / last-frame control is also supported (mutually exclusive with multi-image references).
Its camera control and CapCut integration make Seedance 2.0 well suited for advertising and film production workflows.
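The limits stated on this page (15 seconds, 480p or 720p, six aspect ratios, up to 7 reference images, and first-frame / last-frame control being mutually exclusive with multi-image references) can be written down as a pre-flight check. This is a minimal sketch in Python under stated assumptions: `GenerationRequest` and `validate` are hypothetical names for illustration, not part of any Flashloop or ByteDance API, and the requirement that duration be positive is an assumption, since the page states only the maximum.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Limits copied from the spec list on this page.
MAX_DURATION_S = 15
RESOLUTIONS = {"480p", "720p"}
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4", "21:9"}
MAX_REFERENCE_IMAGES = 7


@dataclass
class GenerationRequest:
    """Hypothetical request shape, for illustration only."""
    prompt: str
    duration_s: int = 5
    resolution: str = "720p"
    aspect_ratio: str = "16:9"
    reference_images: List[str] = field(default_factory=list)
    first_frame: Optional[str] = None
    last_frame: Optional[str] = None


def validate(req: GenerationRequest) -> List[str]:
    """Return a list of problems; an empty list means the request fits the limits."""
    problems = []
    # Assumption: duration must be positive; the page only gives the upper bound.
    if not 0 < req.duration_s <= MAX_DURATION_S:
        problems.append(f"duration must be positive and at most {MAX_DURATION_S} seconds")
    if req.resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {req.resolution}")
    if req.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {req.aspect_ratio}")
    if len(req.reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(f"at most {MAX_REFERENCE_IMAGES} reference images allowed")
    # First/last frame anchoring cannot be combined with multi-image references.
    uses_frames = req.first_frame is not None or req.last_frame is not None
    if uses_frames and req.reference_images:
        problems.append("first/last frame control cannot be combined with reference images")
    return problems
```

For example, a request with one reference image plus a first frame would come back with a single problem describing the conflict, while a 10-second 720p request with only a prompt would come back clean.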
Other video models
Veo 3
Generate 8-second videos with native dialogue, sound effects, and ambient audio in one pass.
Kling 3.0
Create up to 15-second multi-shot videos with character consistency, 4K 60fps support, and bilingual audio.
Kling 2.6
Turn images into 10-second 1080p videos with 48fps motion control and native speaking, singing, or rapping audio.
Ready to create with Seedance 2.0?
Available on iOS, Android, and web.