What does this repo signal mean?

ByteDance (Doubao/Seed) published ByteDance-Seed/VeOmni (Python). This repository signal exposes tooling, eval, infrastructure, or model-adjacent work before it may appear in a launch post. High-signal details: repo ByteDance-Seed/VeOmni · language Python · high-star ByteDance repo, likely a notable model. onlylabs links this event to 1 captured evidence page and 6 related repo signals. It also maps to Infrastructure in the data-business radar.

ByteDance (Doubao/Seed) Repo: ByteDance-Seed/VeOmni

Captured source

GitHub/github.com/ByteDance-Seed/VeOmni

ByteDance-Seed/VeOmni repository metadata

published Mar 28, 2025 · seen 5d · captured 13h · http 200 · method plain

ByteDance-Seed/VeOmni

Description: VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Language: Python

License: Apache-2.0

Stars: 2003

Forks: 211

Open issues: 95

Created: 2025-03-28T03:42:42Z

Pushed: 2026-06-10T02:20:41Z

Default branch: main

Fork: no

Archived: no
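
The fields listed above correspond to names that GitHub's REST API returns for a repository (`GET /repos/{owner}/{repo}`). As a minimal sketch, assuming a payload shaped like that response, the snippet below pulls out the same fields. The `summarize_repo` helper and the `sample` dictionary are illustrative: the sample is filled in by hand from the values in this capture, not fetched live.

```python
from datetime import datetime, timezone


def summarize_repo(payload: dict) -> dict:
    """Pick the fields shown in the capture out of a GitHub repo payload."""

    def parse_ts(value: str) -> datetime:
        # GitHub timestamps are ISO 8601 strings with a trailing "Z" (UTC).
        parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
        return parsed.replace(tzinfo=timezone.utc)

    # "license" can be null in the API response, so fall back to an empty dict.
    license_info = payload.get("license") or {}
    created = parse_ts(payload["created_at"])
    pushed = parse_ts(payload["pushed_at"])
    return {
        "full_name": payload["full_name"],
        "language": payload.get("language"),
        "license": license_info.get("spdx_id"),
        "stars": payload["stargazers_count"],
        "forks": payload["forks_count"],
        "open_issues": payload["open_issues_count"],
        # Whole days between repository creation and the most recent push.
        "active_days": (pushed - created).days,
        "is_fork": payload["fork"],
        "is_archived": payload["archived"],
    }


# Hand-built sample using the values captured above (not a live API call).
sample = {
    "full_name": "ByteDance-Seed/VeOmni",
    "language": "Python",
    "license": {"spdx_id": "Apache-2.0"},
    "stargazers_count": 2003,
    "forks_count": 211,
    "open_issues_count": 95,
    "created_at": "2025-03-28T03:42:42Z",
    "pushed_at": "2026-06-10T02:20:41Z",
    "default_branch": "main",
    "fork": False,
    "archived": False,
}

summary = summarize_repo(sample)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Deriving `active_days` from the created and pushed timestamps is one simple way to check that a repository is still being maintained, which star counts alone do not show.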

README:

🍪 Overview

VeOmni is a versatile framework for both single- and multi-modal pre-training and post-training. It empowers users to seamlessly scale models of any modality across various accelerators, offering both flexibility and user-friendliness.

Our guiding principles when building VeOmni are:

Flexibility and Modularity: VeOmni is built with a modular design, allowing users to decouple most components and replace them with their own implementations as needed.
Trainer-free: VeOmni supports linear training scripts that avoid rigid, structured trainer classes (e.g., PyTorch-Lightning or HuggingFace Trainer). These training scripts expose the entire training logic to users for maximum transparency and control. In addition, VeOmni provides a basic trainer for training text-only or VLM/omni models, and an RL trainer that serves as a trainer backend in reinforcement learning.

Omni model native: VeOmni enables users to effortlessly scale any omni-model across devices and accelerators.
Torch native: VeOmni is designed to leverage PyTorch’s native functions to the fullest extent, ensuring maximum compatibility and performance.

🔥 Latest News

[2025/11] Our paper OmniScale: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo was accepted by AAAI 2026.
[2025/09] We release v0.1.0, the first official release of VeOmni.
[2025/08] We release VeOmni Tech report and open the [WeChat group](./docs/assets/wechat.png). Feel free to join us!
[2025/04] We release VeOmni!

📚 Key Features

FSDP and FSDP2 backends for training.
Sequence parallelism with DeepSpeed Ulysses, in both non-async and async modes.
Expert parallelism for training large MoE models such as Qwen3-MoE.
Efficient GroupGemm kernel for MoE models; Liger-Kernel.
Compatible with HuggingFace Transformers models such as Qwen3, Qwen3-VL, and Qwen3-MoE.
Dynamic batching strategy and Omnidata processing.
**Torch Distributed Checkpoint** for checkpointing.
Support for both Nvidia GPU and Ascend NPU training.
Experiment tracking with wandb.
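
The README does not describe how the dynamic batching strategy works. One common reading of the term is to group variable-length samples under a fixed token budget rather than a fixed sample count. The sketch below illustrates that idea only. It is not VeOmni's implementation, and `batch_by_token_budget`, `max_tokens`, and the sample lengths are hypothetical names and values chosen for illustration.

```python
from typing import Iterable, List


def batch_by_token_budget(lengths: Iterable[int], max_tokens: int) -> List[List[int]]:
    """Group sample indices so each batch holds at most max_tokens tokens.

    Greedy and order-preserving. A sample longer than the budget is
    placed in a batch of its own rather than dropped.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")

    batches: List[List[int]] = []
    current: List[int] = []
    current_tokens = 0
    for index, length in enumerate(lengths):
        # Close the current batch if adding this sample would exceed the budget.
        if current and current_tokens + length > max_tokens:
            batches.append(current)
            current, current_tokens = [], 0
        current.append(index)
        current_tokens += length
    if current:
        batches.append(current)
    return batches


# Hypothetical token counts for six samples.
sample_lengths = [512, 128, 2048, 64, 900, 300]
batches = batch_by_token_budget(sample_lengths, max_tokens=2048)
print(batches)
```

With a token budget, short samples share a batch while a long sample takes one by itself, so the work per step stays roughly constant even when sequence lengths vary widely.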

📝 Upcoming Features and Changes

VeOmni v0.2 Roadmap https://github.com/ByteDance-Seed/VeOmni/issues/268, https://github.com/ByteDance-Seed/VeOmni/issues/271
Vit balance tool https://github.com/ByteDance-Seed/VeOmni/issues/280
Validation dataset during training https://github.com/ByteDance-Seed/VeOmni/issues/247
RL post training for omni-modality models with VeRL https://github.com/ByteDance-Seed/VeOmni/issues/262

🚀 Getting Started

Documentation

Quick Start

✏️ Supported Models

| Model | Model size | Example config file |
| --- | --- | --- |
| DeepSeek2.5/3/R1 | 236B/671B | [deepseek.yaml](configs/text/deepseek.yaml) |
| Llama3-3.3 | 1B/3B/8B/70B | [llama3.yaml](configs/text/llama3.yaml) |
| Qwen2-3 | 0.5B/1.5B/3B/7B/14B/32B/72B | [qwen2_5.yaml](configs/text/qwen2_5.yaml) |
| Qwen2-3 VL/QVQ | 2B/3B/7B/32B/72B | [qwen3_vl_dense.yaml](configs/multimodal/qwen3_vl/qwen3_vl_dense.yaml) |
| Qwen3-VL MoE | 30BA3B/235BA22B | [qwen3_vl_moe.yaml](configs/multimodal/qwen3_vl/qwen3_vl_moe.yaml) |
| Qwen3-MoE | 30BA3B/235BA22B | [qwen3-moe.yaml](configs/text/qwen3-moe.yaml) |
| Qwen2-3 Omni | 7B/30BA3B | [qwen25_omni.yaml](configs/multimodal/qwen25_omni/qwen25_omni.yaml) |
| Wan | Wan2.1-I2V-14B-480P | [wan_sft.yaml](configs/dit/wan_sft.yaml) |
| Omni Model | Any Modality Training | [seed_omni.yaml](configs/multimodal/omni/seed_omni.yaml) |

To add support for new models to VeOmni, see Support New Models.

⛰️ Performance

For more details, please refer to our paper.

💡 Awesome work using VeOmni

dFactory: Easy and Efficient dLLM Fine-Tuning
LMMs-Engine
UI-TARS: Pioneering Automated GUI Interaction with Native Agents
[OpenHA: A Series of Open-Source Hierarchical Agentic Models in Minecraft](https://arxiv.org/pdf/2509.13347)

🎨 Contributing

Contributions from the community are welcome! Please check out [CONTRIBUTING.md](CONTRIBUTING.md) and our project roadmap (to be updated).

📝 Citation and Acknowledgement

If you find VeOmni useful for your research and applications, feel free to give us a star ⭐ or cite us using:

@article{ma2025veomni,
title={VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo},
author={Ma, Qianli and Zheng, Yaowei and Shi, Zhelun and Zhao, Zhongkai and Jia, Bin and Huang, Ziyue and…

Excerpt shown — open the source for the full document.

Notability

notability 8.0/10

High-star ByteDance repo, likely notable model.