What does this repo signal mean?

StepFun published stepfun-ai/Step-Video-TI2V (Python). A repo signal like this can expose tooling, evaluation, infrastructure, or model-adjacent work before it appears in a launch post. High-signal details: repo stepfun-ai/Step-Video-TI2V · language Python · new video generation repo from a notable lab with a decent star count. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

StepFun Repo: stepfun-ai/Step-Video-TI2V

Captured source

GitHub: github.com/stepfun-ai/Step-Video-TI2V

stepfun-ai/Step-Video-TI2V repository metadata

published Mar 6, 2025 · seen 5d · captured 9h · http 200 · method plain

stepfun-ai/Step-Video-TI2V

Language: Python

License: MIT

Stars: 375

Forks: 35

Open issues: 6

Created: 2025-03-06T04:42:11Z

Pushed: 2025-03-20T09:10:36Z

Default branch: main

Fork: no

Archived: no

README:

🔥🔥🔥 News!!

Mar 17, 2025: 👋 We release the inference code and model weights of Step-Video-TI2V.
Mar 17, 2025: 👋 We release a new TI2V benchmark, Step-Video-TI2V-Eval.
Mar 17, 2025: 👋 Step-Video-TI2V has been integrated into ComfyUI-Stepvideo-ti2v. Enjoy!
Mar 17, 2025: 🎉 We have made our technical report available as open source.

Motion Control

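The warhorse jumps (战马跳跃)
The warhorse crouches down (战马蹲下)
The warhorse gallops forward, then turns around (战马向前奔跑，然后转身)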

Motion Dynamics Control

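Two men box each other while the camera orbits around them (两名男子在互相拳击，镜头环绕两人拍摄。) (motion_score: 2)
Two men box each other while the camera orbits around them (两名男子在互相拳击，镜头环绕两人拍摄。) (motion_score: 5)
Two men box each other while the camera orbits around them (两名男子在互相拳击，镜头环绕两人拍摄。) (motion_score: 20)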

🎯 Tips: The default motion_score = 5 is suitable for general use. If you need more stability, set motion_score = 2, though it may lack dynamism in certain movements. For greater movement flexibility, you can use motion_score = 10 or motion_score = 20 to enable more intense actions. Feel free to customize the motion_score based on your creative needs to fit different use cases.
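The tip above can be captured in a small lookup. This is only an illustrative sketch: the preset names (`stable`, `default`, `dynamic`, `intense`) and the function `pick_motion_score` are hypothetical and not part of the repository; only the numeric values come from the tip.

```python
# Hypothetical preset names; only the numeric values come from the README tip.
MOTION_PRESETS = {
    "stable": 2.0,    # more stability, may lack dynamism in some movements
    "default": 5.0,   # suitable for general use
    "dynamic": 10.0,  # greater movement flexibility
    "intense": 20.0,  # more intense actions
}


def pick_motion_score(preset: str = "default") -> float:
    """Return the motion_score value associated with a named preset."""
    try:
        return MOTION_PRESETS[preset]
    except KeyError:
        choices = ", ".join(sorted(MOTION_PRESETS))
        raise ValueError(f"unknown preset {preset!r}; choose from: {choices}") from None


if __name__ == "__main__":
    print(pick_motion_score("stable"))
```

The returned value would be passed as `--motion_score` to the inference script; intermediate values are not ruled out by the tip, so the presets are a convenience rather than a constraint.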

Camera Control

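The camera orbits the girl while she dances (镜头环绕女孩，女孩在跳舞)
The camera slowly pushes in while the girl dances (镜头缓慢推进，女孩在跳舞)
The camera pulls back while the girl dances (镜头拉远，女孩在跳舞)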

Supported Camera Movements

| Camera Movement | Chinese term (运镜方式) |
|--------------------------------|--------------------|
| Fixed Camera | 固定镜头 |
| Pan Up/Down/Left/Right | 镜头上/下/左/右移 |
| Tilt Up/Down/Left/Right | 镜头上/下/左/右摇 |
| Zoom In/Out | 镜头放大/缩小 |
| Dolly In/Out | 镜头推进/拉远 |
| Camera Rotation | 镜头旋转 |
| Tracking Shot | 镜头跟随 |
| Orbit Shot | 镜头环绕 |
| Rack Focus | 焦点转移 |

🔧 Motion Score Considerations: motion_score = 5 or 10 offers smoother and more accurate motion than motion_score = 2, with motion_score = 10 providing the best responsiveness and camera tracking. Choosing the suitable setting enhances motion precision and fluidity.
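The demo captions in this section place a camera term before the subject's action, joined by a full-width comma. A minimal sketch of composing such a prompt is shown here. The dictionary keys and the function `camera_prompt` are hypothetical names, the Chinese terms are taken from the camera-movement table, and whether the model treats these terms as reliable control keywords is an assumption rather than something the README states.

```python
# Chinese terms copied from the camera-movement table; the English keys are hypothetical.
CAMERA_TERMS = {
    "fixed": "固定镜头",
    "dolly_in": "镜头推进",
    "dolly_out": "镜头拉远",
    "tracking": "镜头跟随",
    "orbit": "镜头环绕",
}


def camera_prompt(movement: str, action: str) -> str:
    """Join a camera term and a subject action with a full-width comma."""
    if movement not in CAMERA_TERMS:
        raise ValueError(f"unsupported movement: {movement!r}")
    return f"{CAMERA_TERMS[movement]}，{action}"


if __name__ == "__main__":
    # Builds the "camera pulls back while the girl dances" prompt.
    print(camera_prompt("dolly_out", "女孩在跳舞"))
```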

Anime-Style Generation

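A girl walks forward against a blurred background (女生向前行走，背景是虚化模糊的效果)
A woman blinks, then blows a kiss toward the camera (女人眨眼，然后对着镜头做飞吻的动作。)
The raccoon dog warrior slowly raises both hands as lightning spreads outward from them; the eyes of the spirit-beast apparition behind flash with intense light, and it opens its huge mouth and lets out a low roar (狸猫战士双手缓缓上扬，雷电从手中向四周扩散，身后灵兽影像的双眼闪烁强光，张开巨口发出低吼)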

Step-Video-TI2V excels in anime-style generation, enabling you to explore various anime-style images and create customized videos to match your preferences.

1. [Introduction](#1-introduction)
2. [Model Summary](#2-model-summary)
3. [Model Download](#3-model-download)
4. [Model Usage](#4-model-usage)
5. [Comparisons](#5-Comparisons)
6. [Online Engine](#6-online-engine)
7. [Citation](#7-citation)

1. Introduction

We present Step-Video-TI2V, a state-of-the-art text-driven image-to-video generation model with 30B parameters, capable of generating videos up to 102 frames based on both text and image inputs. We build Step-Video-TI2V-Eval as a new benchmark for the text-driven image-to-video task and compare Step-Video-TI2V with open-source and commercial TI2V engines using this dataset. Experimental results demonstrate the state-of-the-art performance of Step-Video-TI2V in the image-to-video generation task.

2. Model Summary

Step-Video-TI2V is trained based on Step-Video-T2V. To incorporate the image condition as the first frame of the generated video, we encode it into latent representations using Step-Video-T2V’s Video-VAE and concatenate them along the channel dimension of the video latent. Additionally, we introduce a motion score condition, enabling users to control the dynamic level of the video generated from the image condition.
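The channel-wise conditioning described above can be illustrated with a toy NumPy sketch. It is not the repository's implementation: the `(frames, channels, height, width)` layout, the zero-filling of the condition for all frames after the first, and the name `build_ti2v_latent` are assumptions made for illustration. The README only states that the image is encoded by the Video-VAE and concatenated along the channel dimension of the video latent.

```python
import numpy as np


def build_ti2v_latent(video_latent: np.ndarray, image_latent: np.ndarray) -> np.ndarray:
    """Concatenate an image condition onto a video latent along the channel axis.

    video_latent: (frames, channels, height, width)   -- assumed layout
    image_latent: (channels, height, width), the encoded first frame
    """
    frames, channels, height, width = video_latent.shape
    if image_latent.shape != (channels, height, width):
        raise ValueError("image latent must match one frame of the video latent")

    # Assumption: the condition holds the image latent at frame 0 and zeros elsewhere.
    condition = np.zeros_like(video_latent)
    condition[0] = image_latent

    # Channel-wise concatenation doubles the channel count seen by the model.
    return np.concatenate([video_latent, condition], axis=1)


if __name__ == "__main__":
    video = np.random.randn(4, 16, 8, 8).astype(np.float32)
    image = np.random.randn(16, 8, 8).astype(np.float32)
    print(build_ti2v_latent(video, image).shape)
```

Under these assumptions the output has twice as many channels as the input video latent, which is why a model adapted from Step-Video-T2V would need an input layer that accepts the wider tensor.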

3. Model Download

4. Model Usage

📜 4.1 Dependencies and Installation

git clone https://github.com/stepfun-ai/Step-Video-TI2V.git
conda create -n stepvideo python=3.10
conda activate stepvideo

cd Step-Video-TI2V
pip install -e .

🚀 4.2. Inference Scripts

# We assume you have more than 4 GPUs available.
# This command will return the URL for both the caption API and the VAE API. Please use the returned URL in the following command.
python api/call_remote_server.py --model_dir where_you_download_dir &

parallel=4  # or parallel=8; parallel=1 (a single GPU) can also predict the results, although it will take longer
url='127.0.0.1'
model_dir=where_you_download_dir

torchrun --nproc_per_node $parallel run_parallel.py --model_dir $model_dir --vae_url $url --caption_url $url --ulysses_degree $parallel --prompt "笑起来" --first_image_path ./assets/demo.png --infer_steps 50 --cfg_scale 9.0 --time_shift 13.0 --motion_score 5.0
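For scripted or batch runs, the same command can be assembled in Python. The sketch below is a hypothetical wrapper (`build_torchrun_command` is not part of the repository); it uses only the flags and default values that appear in the command above and assumes it is launched from the repository root with the API server already running.

```python
import shlex


def build_torchrun_command(
    model_dir: str,
    url: str,
    prompt: str,
    first_image_path: str,
    parallel: int = 4,
    infer_steps: int = 50,
    cfg_scale: float = 9.0,
    time_shift: float = 13.0,
    motion_score: float = 5.0,
) -> list[str]:
    """Return the torchrun argument list mirroring the README command."""
    return [
        "torchrun", "--nproc_per_node", str(parallel), "run_parallel.py",
        "--model_dir", model_dir,
        "--vae_url", url,          # same URL for both APIs, as in the README
        "--caption_url", url,
        "--ulysses_degree", str(parallel),
        "--prompt", prompt,
        "--first_image_path", first_image_path,
        "--infer_steps", str(infer_steps),
        "--cfg_scale", str(cfg_scale),
        "--time_shift", str(time_shift),
        "--motion_score", str(motion_score),
    ]


if __name__ == "__main__":
    cmd = build_torchrun_command(
        "where_you_download_dir", "127.0.0.1", "笑起来", "./assets/demo.png"
    )
    # Print a shell-quoted version for inspection before running it.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would launch the job; building a list rather than a single string avoids shell-quoting problems with prompts that contain spaces or punctuation.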

We list some more useful configurations for easy usage:

5. Comparisons

To evaluate the performance of Step-Video-TI2V, we leverage VBench-I2V to systematically compare it with recently released leading open-source models. The detailed results, presented in the table below, highlight our model’s superior performance over these models. We present two results for Step-Video-TI2V, with the motion score set to 5 and 10, respectively. As expected, this mechanism effectively balances the motion dynamics and stability (or consistency) of the generated videos. Additionally, we submitted our results to the [VBench-I2V…

Excerpt shown — open the source for the full document.

Notability

notability 6.0/10

New video generation repo from notable lab, decent stars.