Fork · Baseten · published Feb 2, 2026

basetenlabs/Wan2.1

forked from Wan-Video/Wan2.1


Description: Wan: Open and Advanced Large-Scale Video Generative Models

License: Apache-2.0

Stars: 0

Forks: 0

Open issues: 0

Created: 2026-02-02T22:41:34Z

Pushed: 2026-02-03T05:37:47Z

Default branch: main

Fork: yes

Parent repository: Wan-Video/Wan2.1

Archived: no

README:

Wan2.1

💜 Wan | 🖥️ GitHub | 🤗 Hugging Face | 🤖 ModelScope | 📑 Technical Report | 📑 Blog | 💬 WeChat Group | 📖 Discord

-----

**Wan: Open and Advanced Large-Scale Video Generative Models**

In this repository, we present Wan2.1, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation. Wan2.1 offers these key features:

  • 👍 SOTA Performance: Wan2.1 consistently outperforms existing open-source models and state-of-the-art commercial solutions across multiple benchmarks.
  • 👍 Supports Consumer-grade GPUs: The T2V-1.3B model requires only 8.19 GB VRAM, making it compatible with almost all consumer-grade GPUs. It can generate a 5-second 480P video on an RTX 4090 in about 4 minutes (without optimization techniques like quantization). Its performance is even comparable to some closed-source models.
  • 👍 Multiple Tasks: Wan2.1 excels in Text-to-Video, Image-to-Video, Video Editing, Text-to-Image, and Video-to-Audio, advancing the field of video generation.
  • 👍 Visual Text Generation: Wan2.1 is the first video model capable of generating both Chinese and English text, featuring robust text generation that enhances its practical applications.
  • 👍 Powerful Video VAE: Wan-VAE delivers exceptional efficiency and performance, encoding and decoding 1080P videos of any length while preserving temporal information, making it an ideal foundation for video and image generation.
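The consumer-GPU figures in the list above can be turned into a quick back-of-the-envelope check. The following is a minimal sketch, not part of the Wan2.1 codebase: the function names and the example card sizes are hypothetical, and the comparison treats the quoted 8.19 GB as a hard threshold against a card's total VRAM, which ignores memory used by the driver, the display, or other processes.

```python
# Figures quoted in the feature list above.
T2V_1_3B_VRAM_GB = 8.19   # stated VRAM requirement for the T2V-1.3B model
CLIP_SECONDS = 5          # length of the example 480P clip
WALL_CLOCK_MINUTES = 4    # approximate generation time on an RTX 4090


def fits_in_vram(gpu_vram_gb: float, required_gb: float = T2V_1_3B_VRAM_GB) -> bool:
    """Naive check (hypothetical helper): total VRAM meets the stated requirement."""
    return gpu_vram_gb >= required_gb


def compute_seconds_per_video_second(
    clip_seconds: float = CLIP_SECONDS,
    wall_clock_minutes: float = WALL_CLOCK_MINUTES,
) -> float:
    """Seconds of generation time spent per second of output video."""
    return wall_clock_minutes * 60 / clip_seconds


# Illustrative card sizes only; these are assumptions, not benchmark results.
example_cards_gb = {"24 GB card": 24.0, "12 GB card": 12.0, "8 GB card": 8.0}

for name, vram_gb in example_cards_gb.items():
    verdict = "fits" if fits_in_vram(vram_gb) else "does not fit without optimization"
    print(f"{name}: {verdict}")

print(f"{compute_seconds_per_video_second():.0f} s of compute per second of video")
```

By this naive measure an 8 GB card falls just short of the unoptimized requirement, so such a card would need optimization techniques like the quantization the README mentions.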

Video Demos

🔥 Latest News!!

  • May 14, 2025: 👋 We introduce Wan2.1 VACE, an all-in-one model for video creation and editing, along with its [inference code](#run-vace), [weights](#model-download), and technical report!
  • Apr 17, 2025: 👋 We introduce Wan2.1 [FLF2V](#run-first-last-frame-to-video-generation) with its inference code and weights!
  • Mar 21, 2025: 👋 We are excited to announce the release of the Wan2.1 technical report. We welcome discussions and feedback!
  • Mar 3, 2025: 👋 Wan2.1's T2V and I2V have been integrated into Diffusers (T2V | I2V). Feel free to give it a try!
  • Feb 27, 2025: 👋 Wan2.1 has been integrated into ComfyUI. Enjoy!
  • Feb 25, 2025: 👋 We've released the inference code and weights of Wan2.1.

Community Works

If your work has improved Wan2.1 and you would like more people to see it, please inform us.

  • Video-As-Prompt, the first unified semantic-controlled video generation model based on Wan2.1-14B-I2V with a Mixture-of-Transformers architecture and in-context controls (e.g., concept, style, motion, camera). Refer to the project page for more examples.
  • LightX2V, a lightweight and efficient video generation framework that integrates Wan2.1 and Wan2.2 and supports multiple engineering acceleration techniques for fast inference. It can run on RTX 5090 and RTX 4060 (8 GB VRAM).
  • DriVerse, an autonomous driving world model based on Wan2.1-14B-I2V, generates future driving videos conditioned on any scene frame and given trajectory. Refer to the project page for more examples.
  • Training-Free-WAN-Editing, built on Wan2.1-T2V-1.3B, enables video editing with image-based training-free methods such as FlowEdit and FlowAlign.
  • Wan-Move, accepted to NeurIPS 2025, is a framework that brings Wan2.1-I2V-14B to SOTA fine-grained, point-level motion control. Refer to their project page for more information.
  • EchoShot, a native multi-shot portrait video generation model based on Wan2.1-T2V-1.3B, generates multiple video clips featuring the same character with highly flexible content control. Refer to their project page for more information.
  • AniCrafter, a human-centric animation model based on Wan2.1-14B-I2V, controls the Video Diffusion Models with 3DGS Avatars to insert and animate anyone into any scene following given motion sequences. Refer to the project page for more examples.
  • HyperMotion, a human image animation framework based on Wan2.1, addresses the challenge of generating complex human body motions in pose-guided animation. Refer to their website for more examples.
  • MagicTryOn, a video virtual try-on framework built upon Wan2.1-14B-I2V, addresses the limitations of existing models in expressing garment details and maintaining dynamic stability during human motion. Refer to their website for more examples.
  • ATI, built on Wan2.1-I2V-14B, is a trajectory-based motion-control framework that unifies object, local, and camera movements in video generation. Refer to their website for more examples.
  • Phantom has developed a unified video generation framework for single and…


Notability

notability 3.0/10

Routine fork, no traction details.