Fork by Novita AI, published Jan 14, 2026

novitalabs/rllm

forked from rllm-org/rllm

Description: Democratizing Reinforcement Learning for LLMs

License: Apache-2.0

Stars: 0

Forks: 0

Open issues: 0

Created: 2026-01-14T08:23:34Z

Pushed: 2026-03-09T09:24:47Z

Default branch: main

Fork: yes

Parent repository: rllm-org/rllm

Archived: no

README:

rLLM is an open-source framework for post-training language agents via reinforcement learning. With rLLM, you can easily build your custom agents and environments, train them with reinforcement learning, and deploy them for real-world workloads.

Releases 📰

[2025/12/11] We release rLLM v0.2.1, which adds support for the Tinker backend, LoRA and VLM training, and Eval Protocol. We also bumped our verl backend to v0.6.1. [[SDK Blogpost]](https://rllm-project.com/post.html?post=sdk.md)

[2025/10/16] rLLM v0.2 is now officially released! We introduce AgentWorkflowEngine for training over arbitrary agentic programs. It also comes integrated with the official verl-0.5.0, featuring support for Megatron training. Check out this blog post for more.

[2025/07/01] We release `DeepSWE-Preview`, a 32B software engineering (SWE) agent trained purely with RL that achieves 59% on SWEBench-Verified with test-time scaling (42.2% Pass@1), topping the SWEBench leaderboard for open-weight models.

[2025/04/08] We release `DeepCoder-14B-Preview`, a 14B coding model that achieves 60.6% Pass@1 accuracy on LiveCodeBench (+8% improvement), matching the performance of o3-mini-2025-01-31 (Low) and o1-2024-12-17.

[2025/02/10] We release `DeepScaleR-1.5B-Preview`, a 1.5B model that surpasses O1-Preview and achieves 43.1% Pass@1 on AIME. We achieve this by iteratively scaling DeepSeek's GRPO algorithm from 8K→16K→24K context length for thinking.

Getting Started 🎯

rLLM requires Python >= 3.10 (3.11 is needed if using tinker). You can install it directly via pip, build it from source, or run it in a Docker container.
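Before installing, it can help to confirm that the active interpreter meets the version requirement above. The following is a minimal sketch, not part of rLLM: the `MIN_PYTHON` table and the `python_ok` helper are hypothetical names, and mapping the general 3.10 minimum to the verl backend is an assumption based on the sentence above.

```python
import sys

# Minimum interpreter versions, read off the requirement stated above.
# ASSUMPTION: the general ">= 3.10" minimum is applied to the verl backend.
MIN_PYTHON = {"verl": (3, 10), "tinker": (3, 11)}


def python_ok(backend: str, version: tuple = tuple(sys.version_info[:2])) -> bool:
    """Return True if `version` (major, minor) meets the minimum for `backend`."""
    if backend not in MIN_PYTHON:
        raise ValueError(f"unknown backend: {backend!r}")
    return tuple(version[:2]) >= MIN_PYTHON[backend]


if __name__ == "__main__":
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    for backend in MIN_PYTHON:
        status = "ok" if python_ok(backend) else "too old"
        print(f"{backend}: Python {current} is {status}")
```

Run as a script, this prints one line per backend for the current interpreter. Passing an explicit `(major, minor)` tuple checks a version other than the running one.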

There are three ways that you can install rLLM:

Approach A: Direct Installation

uv pip install "rllm[verl] @ git+https://github.com/rllm-org/rllm.git"

_(or replace `verl` above with `tinker` to install with the tinker backend; see below for more details)_

Approach B: Building from Source with uv

Step 1: Clone and Setup Environment

# Clone the repository
git clone https://github.com/rllm-org/rllm.git
cd rllm

# Create a uv environment
uv venv --python 3.11
source .venv/bin/activate

Step 2: Install rLLM with Training Backend

rLLM supports two training backends: verl and tinker. Choose one based on your needs.

_Option I: Using verl as Training Backend_

uv pip install -e ".[verl]"

_Option II: Using tinker as Training Backend_

# can add --torch-backend=cpu to train on CPU-only machines
uv pip install -e ".[tinker]"

Approach C: Installation with Docker 🐳

For a containerized setup, you can use Docker:

# Build the Docker image
docker build -t rllm .

# Create and start the container
docker create --runtime=nvidia --gpus all --net=host --shm-size="10g" --cap-add=SYS_ADMIN -v .:/workspace/rllm -v /tmp:/tmp --name rllm-container rllm sleep infinity
docker start rllm-container

# Enter the container
docker exec -it rllm-container bash

For a more detailed installation guide, including using sglang with the verl backend, please refer to our documentation.
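After installing by any of the approaches above, one way to confirm that the packages landed in the active environment is to query their installed metadata. This is a minimal sketch using only the standard library. `installed_version` is a hypothetical helper, `rllm` is the name used in the pip requirement above, and `verl` and `tinker` are assumed distribution names for the two backends that may differ in practice.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "rllm" comes from the pip requirement above; "verl" and "tinker" are
    # ASSUMED distribution names for the backends and may not match exactly.
    for name in ("rllm", "verl", "tinker"):
        version = installed_version(name)
        print(f"{name}: {version if version is not None else 'not installed'}")
```

A `not installed` line for the backend you did not choose is expected, since only one of the two extras is installed at a time.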

Awesome Projects using rLLM 🔥

Acknowledgements

Our work is done as part of the Berkeley Sky Computing Lab. The rLLM team is generously supported by grants from Laude Institute, AWS, Hyperbolic, Fireworks AI, and Modal. Special thanks go to Together AI for the research partnership and compute support.

Citation

@misc{rllm2025,
title={rLLM: A Framework for Post-Training Language Agents},
author={Sijun Tan and Michael Luo and Colin Cai and Tarun Venkat and Kyle Montgomery and Aaron Hao and Tianhao Wu and Arnav Balyan and Manan Roongta and Chenguang Wang and Li Erran Li and Raluca Ada Popa and Ion Stoica},
year={2025},
howpublished={\url{https://pretty-radio-b75.notion.site/rLLM-A-Framework-for-Post-Training-Language-Agents-21b81902c146819db63cd98a54ba5f31}},
note={Notion Blog}
}

You may also cite our prior work DeepScaleR,…
