You can now fine-tune open-source video models
Replicate Blog
Posted January 24, 2025 by zsxkib, zeke, deepfates, and bfirsh
AI video generation has gotten really good.
Some of the best video models like tencent/hunyuan-video are open-source, and the community has been hard at work building on top of them. We’ve adapted the Musubi Tuner by @kohya_tech to run on Replicate, so you can fine-tune HunyuanVideo on your own visual content.
Never Gonna Give You Up animal edition, courtesy of @flngr and @fofr.
HunyuanVideo is good at capturing the style of the training data, not only in the visual appearance of the imagery and the color grading, but also in the motion of the camera and the way the characters move.
This in-motion style transfer is unique to this implementation: other video models that are trained only on images cannot capture it.
Here are some examples of videos created using different fine-tunes, all with the same settings, size, prompt and seed:
[Example videos: Twin Peaks, Pixar, Cowboy Bebop, Westworld]
You can make your own fine-tuned video model to:
Create videos in a specific visual style
Generate animations of particular characters
Capture specific types of motion or movement
Build custom video effects
In this post, we’ll show you how to gather training data, create a fine-tuned video model, and generate videos with it.
Note
Prefer to learn by watching? Check out Sakib’s 5-minute video demo on YouTube.
Prerequisites
A Replicate account
A video or YouTube URL to use as training data
Step 1: Create your training data
To train a video model, you’ll need a dataset of video clips and text captions describing each video.
This process can be time-consuming, so we’ve created a model to make it easier: zsxkib/create-video-dataset takes a video file or YouTube URL as input, slices it into smaller clips, and generates captions for each clip.
Here’s how to create training data right in your browser with just a few clicks:
Find a YouTube URL (or video file) that you want to use for training.
Go to replicate.com/zsxkib/create-video-dataset
Paste your video URL, or upload a video file from your computer.
Choose a unique trigger word like RCKRLL. Avoid using real words that have existing associations.
Click Run and download the resulting ZIP file.
Optional: Check the logs from this run if you want to see the auto-generated captions for each clip.
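Before training, it can help to confirm that every clip in the downloaded ZIP has a caption. The following is a minimal Python sketch using only the standard library. It assumes each clip sits next to a `.txt` caption file with the same base name, and that clips use one of the listed extensions. Both are assumptions about the ZIP layout, not documented behavior of zsxkib/create-video-dataset, and `pair_clips_with_captions` is a hypothetical helper name.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Assumed clip formats; adjust to whatever your ZIP actually contains.
VIDEO_EXTS = {".mp4", ".mov", ".webm"}


def pair_clips_with_captions(zip_source):
    """Map each video clip in the ZIP to its caption text, or None if missing."""
    pairs = {}
    with zipfile.ZipFile(zip_source) as archive:
        names = archive.namelist()
        # Index caption files by their path without the .txt extension.
        captions = {
            str(PurePosixPath(name).with_suffix("")): name
            for name in names
            if name.lower().endswith(".txt")
        }
        for name in names:
            path = PurePosixPath(name)
            if path.suffix.lower() not in VIDEO_EXTS:
                continue
            caption_name = captions.get(str(path.with_suffix("")))
            text = None
            if caption_name is not None:
                text = archive.read(caption_name).decode("utf-8").strip()
            pairs[name] = text
    return pairs


# Demo with a small in-memory archive standing in for the downloaded ZIP.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as z:
    z.writestr("clip_0001.mp4", b"")
    z.writestr("clip_0001.txt", "a video of RCKRLL, a man dancing")
    z.writestr("clip_0002.mp4", b"")

for clip, caption in pair_clips_with_captions(demo).items():
    print(clip, "->", caption or "MISSING CAPTION")
```

To check a real download, pass its path instead, for example `pair_clips_with_captions("dataset.zip")`, where the filename is a placeholder.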
Step 2: Train your model
Now you’ll create your own fine-tuned video generation model using the training data you just compiled.
Go to replicate.com/zsxkib/hunyuan-video-lora/train
Choose a name for your model.
For the input_videos input, upload the ZIP file you just downloaded.
Enter the same trigger word you used before, e.g. RCKRLL
Adjust training settings (we recommend starting with 2 epochs)
Click Create training
Training typically takes about 5-10 minutes with default settings, though this depends on the size and number of clips.
Step 3: Run your model
Once the training is complete, you can generate new videos in several ways:
Run the model in your browser directly from your model’s page.
Run your model in Replicate’s Playground: go to “Manage models” and type your model name.
Use the API: Go to your model’s page and click the API tab for code snippets.
You can run your model as an API with just a few lines of code.
Here’s an example using the replicate-javascript client:
import Replicate from "replicate"

const replicate = new Replicate()

const model = "your-username/your-model:your-model-version"
const prompt = "A lion dancing on a subway train in the style of RCKRLL"
const output = await replicate.run(model, { input: { prompt } })
console.log(output)
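The example prompt includes the trigger word (RCKRLL), which is presumably what ties the output to the fine-tuned style. Here is a small Python sketch of two sanity checks: whether a trigger word looks unique, and whether a prompt mentions it. The helper names, the four-character minimum, and the tiny word list are all hypothetical illustrations, not part of Replicate's API.

```python
import re

# Tiny illustrative word list; a real check would use a proper dictionary.
COMMON_WORDS = {"rock", "roll", "lion", "style", "video", "train"}


def looks_like_good_trigger(word):
    """Heuristic: a single alphanumeric token, 4+ characters, not a known real word."""
    if not re.fullmatch(r"[A-Za-z0-9]{4,}", word):
        return False
    return word.lower() not in COMMON_WORDS


def with_trigger(prompt, trigger):
    """Return the prompt unchanged if it mentions the trigger, else append it."""
    if re.search(rf"\b{re.escape(trigger)}\b", prompt):
        return prompt
    return f"{prompt.rstrip('. ')} in the style of {trigger}"


print(looks_like_good_trigger("RCKRLL"))
print(with_trigger("A lion dancing on a subway train", "RCKRLL"))
```

The "in the style of" suffix simply mirrors the example prompt; a fine-tune of a character or a motion would likely want different wording.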
Step 4: Experiment for best results
Video fine-tuning is pretty new, so we’re still learning what works best.
Here are some early tips:
Use a unique trigger word that doesn’t have associations with real words.
Experiment with training settings:
More epochs == better quality but longer training time
Adjust the LoRA rank
Increase batch size to speed up training
Use max_steps to control training duration precisely
If training looks like it’s going to take several hours, cancel it and try:
Reducing the number of epochs
Reducing the rank
Increasing batch size
Check the GitHub README for detailed parameter explanations
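To see why these settings trade off against each other, here is a rough back-of-the-envelope sketch in Python. It assumes the trainer takes one optimizer step per batch, makes one pass over every clip per epoch, and treats `max_steps` as a hard cap. Those are simplifying assumptions rather than documented behavior of the trainer (which may repeat or bucket clips), and `estimate_steps` is a hypothetical helper.

```python
import math


def estimate_steps(num_clips, epochs, batch_size, max_steps=None):
    """Estimate optimizer steps: ceil(clips / batch_size) per epoch, times epochs."""
    if num_clips < 1 or epochs < 1 or batch_size < 1:
        raise ValueError("num_clips, epochs, and batch_size must all be at least 1")
    steps = math.ceil(num_clips / batch_size) * epochs
    if max_steps is not None:
        # Assumption: max_steps stops training early once the cap is reached.
        steps = min(steps, max_steps)
    return steps


print(estimate_steps(num_clips=8, epochs=2, batch_size=1))
print(estimate_steps(num_clips=8, epochs=2, batch_size=8))
```

Under these assumptions, 8 clips over 2 epochs come to 16 steps at a batch size of 1 and 2 steps at a batch size of 8. That is the arithmetic behind reducing epochs or increasing batch size when a run looks too long.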
Extra credit: Train new models programmatically
If you want to automate the process or build applications, you can use our API.
Here’s an example of how to train a new model programmatically using the Replicate Python client:
import replicate
import time

# Create a training dataset from a video
dataset = replicate.run(
    "zsxkib/create-video-dataset:4eb83cc8ba563da7032933374444a9a7a6f630b5b1e4f219cf9088f6a4acc138",
    input={
        "video_url": "YOUR_VIDEO_URL",
        "trigger_word": "UNIQUE_TRIGGER",
        "start_time": 10,
        "end_time": 40,
        "num_segments": 8,
        "autocaption": True,
        "autocaption_prefix": "a video of UNIQUE_TRIGGER,",
    },
)

# Create a new model to store the training results
model = replicate.models.create(
    owner="your-username",
    name="your-model-name",
    visibility="public",
    hardware="gpu-t4",
)

# Start training with the processed video
training = replicate.trainings.create(
    model="zsxkib/hunyuan-video-lora",
    version="04279caf015c30a635cabc4077b5bd82c5c706262eb61797a48db139444bcca9",  # Current model version ID
    input={
        "input_videos": dataset.url,
        "trigger_word": "UNIQUE_TRIGGER",
        "epochs": 2,
        "batch_size": 8,
    },
    destination="your-username/your-model-name",  # Where to push the trained model
)

# Wait for training to complete
while training.status not in ["succeeded", "failed", "canceled"]:
    training.reload()
    time.sleep(10)  # Wait 10 seconds between checks

if training.status != "succeeded":
    raise Exception(f"Training failed: {training.error}")

# Generate new videos with your fine-tuned model
output = replicate.run(
    training.output["version"],
    input={
        "prompt": "A video of UNIQUE_TRIGGER in a cyberpunk city",
        "num_frames": 45,
        "frame_rate": 24,
    },
)
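The wait loop in the example polls until the training reaches a terminal status, with no upper bound on how long it waits. In an automated pipeline you may want a timeout. The sketch below is one way to add that, not part of the Replicate client: `wait_for_training` is a hypothetical helper that relies only on the `status` attribute and `reload()` method used in the example, and the default timeout is an arbitrary choice.

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def wait_for_training(training, timeout=3600, interval=10,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until the training reaches a terminal status or the timeout expires.

    `training` is any object with a `status` attribute and a `reload()` method.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while training.status not in TERMINAL_STATUSES:
        if clock() >= deadline:
            raise TimeoutError(
                f"Training still '{training.status}' after {timeout} seconds"
            )
        sleep(interval)
        training.reload()
    return training.status
```

A call such as `wait_for_training(training, timeout=1800)` would replace the `while` loop. The caller still needs to check whether the returned status is "succeeded" before generating videos.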
What’s next?
Fine-tuning video models is in its early days, so we don’t really know yet what’s possible or what could be built on top of it.
Give it a try and show us what you’ve made on Discord, or tag @replicate on X.