New NVIDIA Research — excerpted from The Rundown

The NVIDIA research team just dropped a new paper on creating high-quality short videos from text prompts. The technique uses Video Latent Diffusion Models (Video LDMs), which run the expensive denoising process in a compressed latent space rather than on raw pixels, keeping compute requirements manageable.
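To make the "latent" part concrete: the diffusion model never denoises full-resolution pixels. An autoencoder first compresses each frame, and generation happens on the much smaller latents. The toy PyTorch encoder below is only a sketch of that compression; its layer sizes and shapes are illustrative assumptions, not NVIDIA's actual model.

```python
# Toy illustration of why latent diffusion is cheap: the denoiser only ever
# sees compressed latents, not full-resolution frames. Layer sizes here are
# made up for the demo; they are not NVIDIA's autoencoder.
import torch
import torch.nn as nn

# Stand-in encoder: downsamples 8x in each spatial dimension and maps the
# 3 RGB channels to 4 latent channels (a typical latent-diffusion setup).
encoder = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1),
    nn.SiLU(),
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
    nn.SiLU(),
    nn.Conv2d(128, 4, kernel_size=3, stride=2, padding=1),
)

# Two frames at the paper's 1280x2048 resolution (a full clip has 113 frames).
frames = torch.randn(2, 3, 1280, 2048)
with torch.no_grad():
    latents = encoder(frames)          # -> (2, 4, 160, 256)

ratio = frames[0].numel() / latents[0].numel()
print(f"latent space holds {ratio:.0f}x fewer values per frame")  # ~48x
# The diffusion model is trained and sampled entirely on `latents`, which is
# what keeps high-resolution video generation computationally feasible.
```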

It can create 113-frame videos at 1280×2048 resolution, rendered at 24 FPS, which works out to clips of roughly 4.7 seconds. The team first trained the model on images only, then added temporal layers and fine-tuned on video so the model learns consistency across frames.
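Here is a minimal sketch of that recipe, assuming a PyTorch-style model: a frozen per-frame (spatial) layer standing in for the pretrained image model, interleaved with a new temporal layer that lets information flow across frames, plus a learned weight that blends the two paths. The class name, the choice of a 1D convolution over time, and the mixing scheme are illustrative assumptions, not NVIDIA's actual architecture.

```python
# Sketch of "pretrain on images, then add a time dimension": a frozen spatial
# layer processes each frame independently (as the image model did), and a new
# temporal layer mixes information across frames. Names/choices are illustrative.
import torch
import torch.nn as nn

class SpatioTemporalBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Stand-in for a pretrained image-diffusion layer; kept frozen.
        self.spatial = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        for p in self.spatial.parameters():
            p.requires_grad = False
        # New layer trained on video: a 1D convolution over the time axis.
        self.temporal = nn.Conv1d(channels, channels, kernel_size=3, padding=1)
        # Learned blend between the image-only path and the temporal path.
        self.alpha = nn.Parameter(torch.ones(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, channels, height, width) latent video
        b, t, c, h, w = x.shape
        # Spatial path: every frame is processed independently.
        spatial = self.spatial(x.reshape(b * t, c, h, w)).reshape(b, t, c, h, w)
        # Temporal path: convolve across the frame axis at each spatial location.
        temporal = spatial.permute(0, 3, 4, 2, 1).reshape(b * h * w, c, t)
        temporal = self.temporal(temporal)
        temporal = temporal.reshape(b, h, w, c, t).permute(0, 4, 3, 1, 2)
        # Blend: the closer `mix` is to 1, the closer the block stays to the
        # behaviour of the original image model.
        mix = torch.sigmoid(self.alpha)
        return mix * spatial + (1.0 - mix) * temporal

block = SpatioTemporalBlock(channels=4)
clip = torch.randn(1, 8, 4, 32, 32)   # tiny 8-frame latent clip for the demo
print(block(clip).shape)              # torch.Size([1, 8, 4, 32, 32])
```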

This new research is impressive. At the current pace of development, we may be able to generate full-length movies from just a handful of text prompts within the next few years.

