NousResearch/Gym
forked from NVIDIA-NeMo/Gym
Description: Build RL environments for LLM training
Language: Python
License: Apache-2.0
Stars: 17
Forks: 2
Open issues: 0
Created: 2026-03-31T01:31:07Z
Pushed: 2026-04-27T20:10:18Z
Default branch: main
Fork: yes
Parent repository: NVIDIA-NeMo/Gym
Archived: no
README:
NeMo Gym
[Requirements](#-requirements) • [Quick Start](#-quick-start) • [Available Environments](#-available-environments) • [Documentation & Resources](#-documentation--resources) • [Community & Support](#-community--support) • [Citations](#-citations)
NeMo Gym is a library for building reinforcement learning (RL) training environments for large language models (LLMs). It provides infrastructure to develop environments, scale rollout collection, and integrate seamlessly with your preferred training framework.
🏆 Why NeMo Gym?
- Scaffolding and patterns to accelerate environment development: multi-step, multi-turn, and user modeling scenarios
- Contribute environments without expert knowledge of the entire RL training loop
- Test environments and throughput end-to-end, independent of the RL training loop
- Interoperable with existing environments, systems, and RL training frameworks
- Growing collection of training environments and datasets for Reinforcement Learning from Verifiable Reward (RLVR)
> [!IMPORTANT]
> NeMo Gym is currently in early development. You should expect evolving APIs, incomplete documentation, and occasional bugs. We welcome contributions and feedback - for any changes, please open an issue first to kick off discussion!
🔗 Ecosystem
NeMo Gym is part of NVIDIA NeMo, NVIDIA's GPU-accelerated platform for building and training generative AI models. NeMo Gym integrates with a growing number of RL training frameworks and environment libraries; see the Ecosystem page for full details and tutorials.
Training Frameworks: NeMo RL • OpenRLHF • Unsloth • more →
Environment Libraries: Reasoning Gym • Aviary • more →
📋 Requirements
NeMo Gym is designed to run on standard development machines:
| Hardware Requirements | Software Requirements |
| --------------------- | --------------------- |
| GPU: Not required for NeMo Gym library operation • GPU may be needed for specific resources servers or model inference (see individual server documentation) | Operating System: • Linux (Ubuntu 20.04+, or equivalent) • macOS (11.0+ for x86_64, 12.0+ for Apple Silicon) • Windows (via WSL2) |
| CPU: Any modern x86_64 or ARM64 processor (e.g., Intel, AMD, Apple Silicon) | Python: 3.12 or higher |
| RAM: Minimum 8 GB (16 GB+ recommended for larger environments) | Git: For cloning the repository |
| Storage: Minimum 5 GB free disk space for installation and basic usage | Internet Connection: Required for downloading dependencies and API access |
Additional Requirements
- API Keys: OpenAI API key with available credits (for the quickstart examples)
  - Other model providers are also supported (Azure OpenAI, self-hosted models via vLLM)
- Ray: Automatically installed as a dependency (no separate setup required)
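The Python and storage minimums listed above can be checked before installing. The following is a minimal sketch, not part of NeMo Gym: the function name `check_requirements` and its parameters are hypothetical, and it covers only the two requirements that the standard library can inspect (Python version and free disk space).

```python
import shutil
import sys

# Minimums taken from the requirements table above.
MIN_PYTHON = (3, 12)
MIN_FREE_GB = 5


def check_requirements(python_version=None, free_bytes=None, path="."):
    """Return a list of problems; an empty list means both checks passed.

    `python_version` and `free_bytes` default to the values of the current
    interpreter and of the filesystem containing `path`. They are exposed as
    parameters only so the checks can be exercised with other values.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free

    problems = []
    if tuple(python_version) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, "
            f"found {python_version[0]}.{python_version[1]}"
        )
    if free_bytes < MIN_FREE_GB * 1024**3:
        problems.append(f"At least {MIN_FREE_GB} GB of free disk space is required")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

Running the script prints one line per unmet requirement and nothing when both checks pass. RAM, GPU, and network availability are not checked here.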
🚀 Quick Start
Install NeMo Gym, start the servers, and collect your first verified rollouts for RL training.
Setup
# Clone the repository
git clone git@github.com:NVIDIA-NeMo/Gym.git
cd Gym
# Install UV (Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
# Create virtual environment
uv venv --python 3.12
source .venv/bin/activate
# Install NeMo Gym
uv sync --extra dev --group docs
Configure Your API Key
Create an env.yaml file that contains your OpenAI API key and the policy model you want to use. Replace your-openai-api-key with your actual key. This file helps keep your secrets out of version control while still making them available to NeMo Gym.
echo "policy_base_url: https://api.openai.com/v1
policy_api_key: your-openai-api-key
policy_model_name: gpt-4.1-2025-04-14" > env.yaml
> [!NOTE]
> We use GPT-4.1 in this quickstart because it provides low latency (no reasoning step) and works reliably out-of-the-box. NeMo Gym is not limited to OpenAI models: you can use self-hosted models via vLLM or any OpenAI-compatible inference server. See the documentation for details.
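A quick check of env.yaml before starting the servers can catch a missing or misnamed key early. The sketch below is illustrative only and is not how NeMo Gym loads its configuration: `parse_flat_yaml` and `missing_keys` are hypothetical helpers, the parser handles only flat `key: value` lines rather than full YAML, and it assumes the three keys shown in the quickstart are the ones you need.

```python
from pathlib import Path

# Keys used by the quickstart env.yaml shown above.
REQUIRED_KEYS = ("policy_base_url", "policy_api_key", "policy_model_name")


def parse_flat_yaml(text):
    """Parse top-level `key: value` lines. This is not a full YAML parser."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if sep:
            values[key.strip()] = value.strip()
    return values


def missing_keys(text):
    """Return the required keys that are absent or empty in `text`."""
    values = parse_flat_yaml(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path("env.yaml")
    if env_path.exists():
        print("Missing keys:", missing_keys(env_path.read_text(encoding="utf-8")))
    else:
        print("env.yaml not found in the current directory")
```

Each key must sit on its own line in env.yaml. If the whole file ends up on a single line, this check reports `policy_api_key` and `policy_model_name` as missing.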
Start Servers
Terminal 1 (start servers):
# Start servers (this will keep running)
config_paths="resources_servers/example_single_tool_call/configs/example_single_tool_call.yaml,\
responses_api_models/openai_model/configs/openai_model.yaml"
ng_run "+config_paths=[${config_paths}]"
Terminal 2 (interact with agent):
# In a NEW terminal, activate environment
source .venv/bin/activate
# Interact with your agent
python responses_api_agents/simple_agent/client.py
Collect Rollouts
Terminal 2 (keep servers running in Terminal 1):
# Create a simple dataset with one query
echo '{"responses_create_params":{"input":[{"role":"developer","content":"You are a helpful assistant."},{"role":"user","content":"What is the weather in Seattle?"}]}}' > weather_query.jsonl
# Collect verified rollouts
ng_collect_rollouts \
+agent_name=example_single_tool_call_simple_agent \
+input_jsonl_fpath=weather_query.jsonl \
+output_jsonl_fpath=weather_rollouts.jsonl
# View the result
cat weather_rollouts.jsonl | python -m json.tool
This generates training data with verification scores!
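For more than one query, writing the input file by hand with `echo` becomes error-prone. The sketch below builds the same `responses_create_params` records in Python and reads a JSONL file back one record per line. The helper names `make_query`, `write_jsonl`, and `read_jsonl` are hypothetical and not part of NeMo Gym, and the sketch makes no assumption about which fields the rollout output contains.

```python
import json


def make_query(user_text, developer_text="You are a helpful assistant."):
    """Build one input record in the shape used by the quickstart example."""
    return {
        "responses_create_params": {
            "input": [
                {"role": "developer", "content": developer_text},
                {"role": "user", "content": user_text},
            ]
        }
    }


def write_jsonl(path, records):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    write_jsonl("weather_query.jsonl", [make_query("What is the weather in Seattle?")])
    for record in read_jsonl("weather_query.jsonl"):
        # Print the top-level keys only; inspect further as needed.
        print(sorted(record.keys()))
```

The same `read_jsonl` call can be pointed at `weather_rollouts.jsonl` after collection. This also sidesteps a limitation of `python -m json.tool`, which by default expects a single JSON document and needs its `--json-lines` option once the file holds more than one record.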
Clean Up Servers
In Terminal 1, where the servers are running, press Ctrl+C to stop the ng_run process.
###…
Excerpt shown — open the source for the full document.