
First Open-Source Text-to-Video Model with MIT License

3 min read · Oct 10, 2024

Hey everyone! Today, I’m excited to share some amazing news about a groundbreaking new model in the world of AI: Pyramid Flow SD3. This is the first real open-source text-to-video model that comes with an MIT license! If you’re into AI, video generation, or just love new tech, you’re going to want to hear about this.

Interface for user study of video generative performance

What is Pyramid Flow SD3? 🎥✨

Pyramid Flow SD3 is a 2B-parameter Diffusion Transformer (DiT) that can generate 10-second videos at 768p resolution and 24 frames per second (FPS). That’s right: high-quality videos, created entirely from text! The model doesn’t stop at text-to-video, though. It also supports image-to-video, which makes it versatile for content creators and developers alike.
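To get a feel for what those numbers mean, here is a quick back-of-the-envelope sketch in Python. It only does arithmetic on the specs quoted above and does not call the model. The helper names are made up for illustration, and the 1280x768 frame size is my assumption for what "768p" means here, so adjust it if your setup differs.

```python
# Back-of-the-envelope numbers for a 10-second, 24 FPS, 768p clip.
# These helpers are illustrative only; they are not part of any Pyramid Flow API.

def frame_count(duration_s: float, fps: int) -> int:
    """Number of frames needed to cover duration_s seconds at fps frames per second."""
    return round(duration_s * fps)


def raw_rgb_bytes(width: int, height: int, frames: int) -> int:
    """Size of the clip as uncompressed 8-bit RGB (3 bytes per pixel)."""
    return width * height * 3 * frames


if __name__ == "__main__":
    frames = frame_count(10, 24)
    # Assumption: "768p" is treated as 1280x768 pixels.
    size = raw_rgb_bytes(1280, 768, frames)
    print(f"{frames} frames, about {size / 1e6:.0f} MB uncompressed")
```

Under these assumptions, a single 10-second clip is a few hundred frames and hundreds of megabytes of raw pixels, which gives a sense of how much output the model produces from one text prompt.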

Here is a sample video created with the following prompt:

Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls



Written by Emad Dehnavi
