Repo: OpenBMB (MiniCPM), published Jan 8, 2026

OpenBMB/AgentCPM

Description: An End-to-End Infrastructure for Training and Evaluating Various LLM Agents

Language: Python

License: Apache-2.0

Stars: 800

Forks: 69

Open issues: 7

Created: 2026-01-08T06:08:33Z

Pushed: 2026-02-09T13:39:47Z

Default branch: main

Fork: no

Archived: no

README:

Latest News

  • [2026-01-20] 🚀🚀🚀 We have open-sourced AgentCPM-Report, built on MiniCPM4.1-8B, which can rival top closed-source commercial systems for report generation such as Gemini-2.5-pro-DeepResearch.
  • [2026-01-12] 🚀🚀🚀 We have open-sourced AgentCPM-Explore, an agent LLM with only 4B parameters, along with all code for training, inference, and the tool sandbox environment. It has earned a place on eight classic, challenging long-horizon agent leaderboards, including GAIA, HLE, and BrowseComp. Its state-of-the-art performance at this scale enables longer action chains and more accurate deep research, breaking the performance barrier for on-device agents.

Table of Contents

  • [Latest News](#latest-news)
  • [Table of Contents](#table-of-contents)
  • [Overview](#overview)
  • [Model List](#model-list)
  • [AgentCPM-Explore](#agentcpm-explore)
  • [Demo](#demo)
  • [QuickStart](#quickstart)
  • [AgentCPM-Report](#agentcpm-report)
  • [Introduction](#introduction)
  • [Demo](#demo-1)
  • [QuickStart](#quickstart-1)
  • [Docker Deployment](#docker-deployment)
  • [License](#license)
  • [Citation](#citation)
  • [Explore More](#explore-more)

Overview

AgentCPM is a series of open-source LLM agents jointly developed by THUNLP (Tsinghua NLP Lab), Renmin University of China, ModelBest, and the OpenBMB community. To address challenges faced by agents in real-world applications—such as limited long-horizon capability, autonomy, and generalization—we propose a series of model-building approaches. Recently, the team has focused on comprehensively building deep research capabilities for agents, releasing [AgentCPM-Explore](./AgentCPM-Explore), a deep-search LLM agent, and [AgentCPM-Report](./AgentCPM-Report), a deep-research LLM agent.

Model List

| Model | Download Links | Open-Sourced Content | Technical Report | How to Use |
|-------|----------------|----------------------|------------------|------------|
| AgentCPM-Explore | 🤗 Hugging Face; 🤖 ModelScope | [AgentDock](./AgentCPM-Explore/AgentDock): unified tool sandbox management & scheduling platform; [AgentRL](./AgentCPM-Explore/AgentRL): fully asynchronous agent reinforcement learning framework; [AgentToLeaP](./AgentCPM-Explore/AgentToLeaP): one-click evaluation framework for agent tool-learning capability | AgentCPM-Explore: Realizing Long-Horizon Deep Exploration for Edge-Scale Agents | [README.md](./AgentCPM-Explore) |
| AgentCPM-Report | 🤗 Hugging Face; 🤖 ModelScope | UltraRAG: low-code RAG framework | AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research | [README.md](./AgentCPM-Report) |

AgentCPM-Explore

The AgentCPM team has focused on systematically building agents’ deep research capabilities and released AgentCPM-Explore, a deep-search LLM agent. AgentCPM-Explore is the first open-source agent model with 4B parameters to appear on eight widely used long-horizon agent benchmarks, including GAIA and XBench.

Key highlights:

  • SOTA at 4B Scale: Best-in-class among same-size models, matches or surpasses 8B models, rivals some 30B+ and closed-source LLMs.
  • Deep Exploration: 100+ turns of continuous interaction with multi-source cross-validation and dynamic strategy adjustment.
  • End-to-End Open Source: Complete training and evaluation infrastructure for community development and custom extensions.

Demo

Demo examples (sped up):

https://github.com/user-attachments/assets/f2b3bb20-ccd5-4b61-8022-9f6e90992baa

QuickStart

  • Multi-model, multi-tool collaborative environment setup: First, start the AgentDock tool sandbox platform to provide unified MCP (Model Context Protocol) tool services. When working with API-based models, configure the model’s BASE_URL and API_KEY. When working with locally hosted models, ensure the model service is accessible. Configure the required tool parameters in the config.toml file.
  • Launch the environment: Startup works out of the box. The AgentDock unified tool sandbox platform launches all services, including the management dashboard, database, and tool nodes, with a single docker compose up -d command.
  • Run a task: Try the framework's core capabilities through the QuickStart script, which runs a complete agent task without complex configuration.
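Because the steps above depend on the tool sandbox and the model service already being reachable, a quick pre-flight check can save a failed run. The following is a minimal sketch, not part of the repository: `endpoint_reachable` is a hypothetical helper that only tests whether a TCP connection to a URL's host and port succeeds. It does not verify that the service behind the port is actually AgentDock or a working model API.

```python
import socket
from urllib.parse import urlparse


def endpoint_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds.

    Hypothetical helper for illustration; it checks connectivity only,
    not whether the service speaks MCP or an LLM API.
    """
    parsed = urlparse(url)
    host = parsed.hostname
    if host is None:
        return False
    try:
        # Accessing .port raises ValueError when the port is not numeric.
        port = parsed.port
    except ValueError:
        return False
    if port is None:
        # Fall back to the scheme's default port when the URL omits one.
        port = 443 if parsed.scheme == "https" else 80
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, calling `endpoint_reachable("http://localhost:8000")` after `docker compose up -d` gives a rough signal that something is listening at the example tool server address used later in this QuickStart.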

0. Prepare Evaluation Environment (Recommended): We provide a Docker image with all evaluation dependencies pre-installed. It is recommended to pull the image and run it directly:

# 1. Enter the project folder
cd AgentCPM-Explore

# 2. Pull the image (Supports amd64/arm64 architectures)
docker pull yuyangfu/agenttoleap-eval:v2.0

# 3. Start the container (Adjust the -v path as needed)
docker run -dit --name agenttoleap --gpus all --network host -v "$(pwd)":/workspace yuyangfu/agenttoleap-eval:v2.0

# 4. Enter the container
docker exec -it agenttoleap /bin/bash
cd /workspace

1. Configure and run: Open quickstart.py and fill in the [USER CONFIGURATION] section:

  • Custom task: Modify the QUERY variable to the instruction you want to test (e.g., “Check the results of last night’s UEFA Champions League matches”).
  • Model information: Provide your LLM API_KEY, MODEL_NAME, and BASE_URL.
  • Tool service: Set MANAGER_URL to the address of your MCP tool server (e.g., http://localhost:8000; make sure the service is already running).
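The settings listed above might look roughly like the following. This is a sketch under stated assumptions: the variable names come from this README, but every value is a placeholder, and `config_problems` is a hypothetical helper that is not part of quickstart.py. It only catches empty values and malformed URLs before a run is started.

```python
from urllib.parse import urlparse

# Variable names follow the README; all values below are placeholders.
QUERY = "Check the results of last night's UEFA Champions League matches"
API_KEY = "your-api-key"
MODEL_NAME = "your-model-name"
BASE_URL = "https://api.example.com/v1"   # assumed example endpoint
MANAGER_URL = "http://localhost:8000"     # example address from the README


def config_problems(settings: dict[str, str]) -> list[str]:
    """Return human-readable problems found in the settings (hypothetical helper)."""
    problems = []
    # Every setting must be a non-empty string.
    for name in ("QUERY", "API_KEY", "MODEL_NAME", "BASE_URL", "MANAGER_URL"):
        if not settings.get(name, "").strip():
            problems.append(f"{name} is empty")
    # The two service addresses must be http(s) URLs with a host part.
    for name in ("BASE_URL", "MANAGER_URL"):
        value = settings.get(name, "").strip()
        if value:
            parsed = urlparse(value)
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                problems.append(f"{name} is not an http(s) URL: {value!r}")
    return problems
```

A call such as `config_problems({"QUERY": QUERY, "API_KEY": API_KEY, "MODEL_NAME": MODEL_NAME, "BASE_URL": BASE_URL, "MANAGER_URL": MANAGER_URL})` returns an empty list when every field is filled in with a well-formed value.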

After configuration, run:

Excerpt shown — open the source for the full document.
