What does this repo signal mean?

Tencent Hunyuan published Tencent-Hunyuan/HunyuanVideo-I2V (Python). A repository signal like this can surface tooling, eval, infrastructure, or model-adjacent work before it appears in a launch post. High-signal details: repo Tencent-Hunyuan/HunyuanVideo-I2V · language Python · notable image-to-video model release with a high star count (1,827 at capture). onlylabs links this event to 1 captured evidence page and 6 related repo signals.

Tencent Hunyuan Repo: Tencent-Hunyuan/HunyuanVideo-I2V

Captured source

GitHub: github.com/Tencent-Hunyuan/HunyuanVideo-I2V

Tencent-Hunyuan/HunyuanVideo-I2V repository metadata

published Mar 4, 2025 · seen 1w · captured 2d · http 200 · method plain

Tencent-Hunyuan/HunyuanVideo-I2V

Description: HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo

Language: Python

License: NOASSERTION

Stars: 1827

Forks: 191

Open issues: 57

Created: 2025-03-04T12:02:05Z

Pushed: 2026-04-07T06:13:09Z

Default branch: main

Fork: no

Archived: no

README:

[Read in Chinese](./README_zh.md)

HunyuanVideo-I2V 🌅

👋 Join our WeChat and Discord

-----

Following the successful open-sourcing of HunyuanVideo, we proudly present HunyuanVideo-I2V, a new image-to-video generation framework to accelerate open-source community exploration!

This repo contains official PyTorch model definitions, pre-trained weights and inference/sampling code. You can find more visualizations on our project page. Meanwhile, we have released the LoRA training code for customizable special effects, which can be used to create more interesting video effects.

> **HunyuanVideo: A Systematic Framework For Large Video Generation Model**

🔥🔥🔥 News!!

Mar 13, 2025: 🚀 We release the parallel inference code for HunyuanVideo-I2V powered by xDiT.
Mar 11, 2025: 🎉 We have updated the LoRA training and inference code after fixing a bug.
Mar 07, 2025: 🔥 We have fixed the bug in our open-source version that caused ID changes. Please try the new model weights of HunyuanVideo-I2V to ensure full visual consistency in the first frame and produce higher quality videos.
Mar 06, 2025: 👋 We release the inference code and model weights of HunyuanVideo-I2V. Download.

🎥 Demo

I2V Demo

First Frame Consistency Demo

| Reference Image | Generated Video |
|:----------------:|:----------------:|

Customizable I2V LoRA Demo

🧩 Community Contributions

If you develop or use HunyuanVideo-I2V in your projects, please let us know.

ComfyUI-Kijai (FP8 Inference, V2V and IP2V Generation): ComfyUI-HunyuanVideoWrapper by Kijai
HunyuanVideoGP (GPU Poor version): HunyuanVideoGP by DeepBeepMeep
xDiT compatibility improvement: xDiT compatibility improvement by pftq and xibosun

📑 Open-source Plan

HunyuanVideo-I2V (Image-to-Video Model)
[x] Inference
[x] Checkpoints
[x] ComfyUI
[x] LoRA training scripts
[x] Multi-GPU sequence-parallel inference (faster inference on more GPUs)

[HunyuanVideo-I2V 🌅](#hunyuanvideo-i2v-)
[🔥🔥🔥 News!!](#-news)
[🎥 Demo](#-demo)
[I2V Demo](#i2v-demo)
[First Frame Consistency Demo](#frist-frame-consistency-demo)
[Customizable I2V LoRA Demo](#customizable-i2v-lora-demo)
[🧩 Community Contributions](#-community-contributions)
[📑 Open-source Plan](#-open-source-plan)
[Contents](#contents)
[HunyuanVideo-I2V Overall Architecture](#hunyuanvideo-i2v-overall-architecture)
[📜 Requirements](#-requirements)
[🛠️ Dependencies and Installation](#️-dependencies-and-installation)
[Installation Guide for Linux](#installation-guide-for-linux)
[🧱 Download Pretrained Models](#-download-pretrained-models)
[🔑 Single-gpu Inference](#-single-gpu-inference)
[Tips for Using Image-to-Video Models](#tips-for-using-image-to-video-models)
[Using Command Line](#using-command-line)
[More Configurations](#more-configurations)
[🎉 Customizable I2V LoRA effects training](#-customizable-i2v-lora-effects-training)
[Requirements](#requirements)
[Environment](#environment)
[Training data construction](#training-data-construction)
[Training](#training)
[Inference](#inference)
[🚀 Parallel Inference on Multiple GPUs by xDiT](#-parallel-inference-on-multiple-gpus-by-xdit)
[Using Command Line](#using-command-line-1)
[🔗 BibTeX](#-bibtex)
[Acknowledgements](#acknowledgements)

---

HunyuanVideo-I2V Overall Architecture

Leveraging the advanced video generation capabilities of HunyuanVideo, we have extended its application to image-to-video generation tasks. To achieve this, we employ a token replace technique to effectively reconstruct and incorporate reference image information into the video generation process.

Because we use a pre-trained Multimodal Large Language Model (MLLM) with a Decoder-Only architecture as the text encoder, we can significantly enhance the model's ability to comprehend the semantic content of the input image and to seamlessly integrate information from both the image and its associated caption. Specifically, the input image is processed by the MLLM to generate semantic image tokens. These tokens are then concatenated with the video latent tokens, enabling full-attention computation across the combined sequence.
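
The concatenation step described above can be illustrated with a toy sketch. This is not the repository's implementation: the token values, the dimensions, the `full_attention` helper, and the use of identity query/key/value projections are all assumptions made to keep the example small and dependency-free. It only shows that once image tokens and video latent tokens share one sequence, every token attends to every other token.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def full_attention(tokens):
    """Toy single-head self-attention with identity Q/K/V projections.

    Hypothetical helper for illustration only; a real model uses learned
    projections, multiple heads, and tensor libraries.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        # Score the query against every token in the combined sequence.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of all token vectors (values equal the tokens here).
        outputs.append(
            [sum(w * value[d] for w, value in zip(row, tokens)) for d in range(dim)]
        )
    return outputs, weights


# Made-up data: 2 semantic image tokens and 3 video latent tokens, dimension 4.
image_tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
video_tokens = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

# Concatenate along the sequence axis, then attend over the whole sequence.
combined = image_tokens + video_tokens
outputs, weights = full_attention(combined)

print(len(combined), "tokens in the combined sequence")
print("attention row for the first video token:", [round(w, 3) for w in weights[2]])
```

No mask is applied, so each attention row has a nonzero weight for every image token and every video token. That is what "full attention" means in this sketch.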

The overall architecture of our system is designed to maximize the synergy between image and text modalities, ensuring a robust and coherent generation of video content from static images. This integration not only improves the fidelity of the generated videos but also enhances the model's ability to interpret and utilize complex multimodal inputs. The overall architecture is as follows.

📜 Requirements

The following table shows the requirements for running the HunyuanVideo-I2V model (batch size = 1) to generate videos:

| Model | Resolution | GPU Peak Memory |
|:----------------:|:-----------:|:----------------:|
| HunyuanVideo-I2V | 720p | 60GB |

An NVIDIA GPU with CUDA support is required.
The model is tested on a single 80G GPU.
Minimum: The minimum GPU memory required is 60GB for 720p.
Recommended: We recommend using a GPU with 80GB of memory for better generation quality.
Tested operating system: Linux
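
As a rough pre-flight check, the thresholds above can be encoded in a small helper. The function name, the constant names, and the return labels are hypothetical; only the 60GB and 80GB figures come from the requirements listed here. The caller is assumed to supply the GPU's total memory in GB from whatever tool they already use.

```python
# Thresholds taken from the requirements above (720p, batch size = 1).
MIN_GB_720P = 60      # stated minimum GPU memory for 720p
RECOMMENDED_GB = 80   # stated recommendation; the model was tested on an 80G GPU


def classify_gpu_memory(total_gb: float) -> str:
    """Hypothetical helper: compare a GPU's total memory to the stated thresholds."""
    if total_gb >= RECOMMENDED_GB:
        return "recommended"
    if total_gb >= MIN_GB_720P:
        return "minimum"
    return "insufficient"


for gb in (24, 64, 80):
    print(f"{gb} GB -> {classify_gpu_memory(gb)}")
```

The check covers only total memory at 720p. It does not account for other processes on the GPU or for memory-saving options the repository may offer.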

🛠️ Dependencies and Installation

Begin by cloning the repository:...

Excerpt shown — open the source for the full document.

Notability

notability 7.0/10

Notable image-to-video model release with strong stars.