First Open-Source Text-to-Video Model with MIT License
Hey everyone! Today, I’m excited to share some news about a groundbreaking new model in the world of AI: Pyramid Flow SD3. This is the first open-source text-to-video model that comes with an MIT license, which means you can use, modify, and redistribute it, even commercially. If you’re into AI, video generation, or just love new tech, you’re going to want to hear about this.
What is Pyramid Flow SD3? 🎥✨
Pyramid Flow SD3 is a 2-billion-parameter Diffusion Transformer (DiT) that can generate 10-second videos at 768p resolution and 24 frames per second (FPS). That’s right: high-quality videos, created entirely from text! The model doesn’t stop at text-to-video, though. It also supports image-to-video, which makes it versatile for content creators and developers alike.
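To put those numbers in perspective, here is a rough back-of-the-envelope sketch of how much raw video a single 10-second clip represents. The helper names are made up for illustration, and the 1280x768 frame size is my assumption for what "768p" means here, so treat the result as an estimate rather than an official spec.

```python
def frame_count(duration_s: float, fps: int) -> int:
    """Number of frames in a clip of the given length at a constant frame rate."""
    return round(duration_s * fps)


def raw_size_bytes(frames: int, width: int, height: int, channels: int = 3) -> int:
    """Uncompressed size of the clip, assuming 8 bits per channel (RGB by default)."""
    return frames * width * height * channels


# Figures from the article: 10 seconds at 24 FPS.
frames = frame_count(10, 24)

# Assumption: "768p" is taken to mean 1280x768 pixels per frame.
size = raw_size_bytes(frames, width=1280, height=768)

print(f"{frames} frames, about {size / 1e6:.1f} MB uncompressed")
```

Under these assumptions, that works out to 240 frames and roughly 708 MB of uncompressed pixels generated from a single text prompt.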
Here is a sample video created with the following prompt:
Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls