Fork · Nous Research · published May 27, 2026 · seen 5d

NousResearch/Megatron-Bridge

forked from NVIDIA-NeMo/Megatron-Bridge

Description: Training library for Megatron-based models with bidirectional Hugging Face conversion capability

License: Apache-2.0

Stars: 5

Forks: 0

Open issues: 0

Created: 2026-05-27T12:17:08Z

Pushed: 2026-05-27T12:36:35Z

Default branch: main

Fork: yes

Parent repository: NVIDIA-NeMo/Megatron-Bridge

Archived: no

README:

📣 News

  • [05/20/2026] **Nemotron-3 Nano Omni** day-0 branch support is now merged on main! The 30B-A3B MoE multimodal model supports image, video, audio, and text workflows with checkpoint conversion, inference, SFT, and PEFT (LoRA) examples. Read the NVIDIA Blog and see the examples README for the full walkthrough.
  • [05/19/2026] **Nemotron-Labs Diffusion** is now supported on main with autoregressive-to-diffusion conversion, continuous pretraining, checkpoint conversion, and inference workflows. Read the NVIDIA Research blog for the tri-mode language model overview.
  • [05/06/2026] **Gemma 4 VL 26B-A4B** is now supported! Checkpoint conversion, SFT, and PEFT (LoRA) recipes for Google's MoE vision-language model (26B total / 4B active params, 128 experts top-k=8, dual sliding/global attention with K=V tying on full-attention layers) are available on main. See the examples README for the full walkthrough.
  • [04/28/2026] Day 0 support for **Nemotron-3 Nano Omni**, a 30B-A3B MoE multimodal model that jointly processes image, video, audio, and text. Checkpoint conversion, SFT, and LoRA recipes are available on main — see the examples README for the full walkthrough.
  • [04/19/2026] **Qwen3.6-35B-A3B** is now supported! Qwen3.6 uses the same architecture as Qwen3.5 VL MoE (Qwen3_5MoeForConditionalGeneration) and works with the existing Qwen3.5-VL bridge out of the box — no code changes needed. HF→Megatron conversion and inference verified.
  • [04/12/2026] **MiniMax-M2.5 / M2.7** are now supported! Both models share the same architecture as MiniMax-M2 and work with the existing bridge out of the box — checkpoint conversion and inference verified on real FP8 checkpoints.
  • [04/09/2026] **Bailing MoE V2** is now supported! Checkpoint conversion and inference for the Bailing MoE V2 model are available on main. Thank you to @ccclyu for the community contribution!
  • [03/31/2026] Agent Skills for Megatron Bridge! We've added a `skills/` directory with structured guides that AI coding agents (Cursor, Claude Code, Codex, etc.) can use to help you add model support, set up dev environments, tune performance, and more. Try them out, and PRs to improve or add new skills are very welcome!
  • [03/26/2026] **Nemotron 3 Super** is now on main! Checkpoint conversion and SFT/LoRA recipes (120B-A12B) are available in the main branch. Read the blog post.
  • [03/12/2026] Deprecating Python 3.10 support: We're officially dropping Python 3.10 support with the upcoming 0.4.0 release. Downstream applications must raise their lower…
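For downstream projects affected by the Python 3.10 deprecation noted above, a minimal sketch of an interpreter guard might look like the following. The names `DROPPED_VERSION` and `is_supported` are hypothetical and are not part of Megatron Bridge; the only assumption taken from the note is that 3.10 and anything older will no longer be supported once 0.4.0 ships.

```python
import sys

# Assumption for illustration: 3.10 is the newest Python version being dropped,
# so any interpreter strictly newer than (3, 10) passes the check.
DROPPED_VERSION = (3, 10)


def is_supported(version=None):
    """Return True if the (major, minor) pair is newer than the dropped version.

    `version` may be any tuple-like starting with (major, minor); when omitted,
    the running interpreter's version is used.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version)[:2]
    return major_minor > DROPPED_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current} is newer than the dropped 3.10 series.")
    else:
        print(f"Python {current} is at or below the dropped 3.10 series.")
```

A check like this could run at import time or in CI; the packaging-level equivalent would be raising the project's declared minimum Python version, which the truncated note appears to be describing.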

Excerpt shown — open the source for the full document.

Notability

notability 2.0/10

Routine fork with minimal stars.