Text-to-image AI is mainstream now, but just waiting in the wings is text-to-video. The pitch for this technology is that you’ll be able to type a description and generate a corresponding video in any style you like.
Current capabilities lag far behind this dream, but for those tracking the tech’s progress, today’s announcement from AI startup Runway of a new AI video generation model is noteworthy nonetheless.
Runway offers a web-based video editor that specializes in AI tools like background removal and pose detection. The company helped develop open-source text-to-image model Stable Diffusion and announced its first AI video editing model, Gen-1, in February.
Gen-1 focused on transforming existing video footage, letting users input a rough 3D animation or shaky smartphone clip and apply an AI-generated overlay. In the clip below, for example, footage of cardboard packaging is paired with an image of an industrial factory to produce a clip that could be used for storyboarding or pitching a more polished feature.
Gen-2, by comparison, seems more focused on generating videos from scratch, though there are plenty of caveats. First, the demo clips shared by Runway are short, unstable, and certainly not photorealistic; second, access is limited. Bloomberg News reports that users will have to sign up to a waitlist for Gen-2 via Runway’s Discord, and a spokesperson for the company, Kelsey Rondenet, told The Verge that Runway will be “providing broad access in the coming weeks.”
In other words, all we have to judge Gen-2 by right now is a demo reel and a handful of clips (most of which were already being advertised as part of Gen-1).