Vidu was first announced in April 2024 and has iterated quickly since. Its Reference-to-Video feature is aimed squarely at the consistent-character problem that trips up many single-shot generation models.
Vidu
by Shengshu Technology (生数科技)
Chinese video model supporting text-to-video, image-to-video, and multi-reference generation, rooted in Tsinghua University research.
- Current flagship: Vidu Q3 (announced Jan 2026), offering long-form generation with native combined audio and video
- Reference-to-Video combines multiple reference images of characters, objects, and scenes to keep them consistent in the generated video
- Reference-to-Video launched globally April 13, 2026
- Up to 1080p output; free-credit tier plus paid Creator Plan and enterprise API