What does this repo signal mean?

ByteDance (Doubao/Seed) published ByteDance-Seed/BFS-Prover-V2 (Python). This repository signal exposes tooling, eval, infrastructure, or model-adjacent work before it may appear in a launch post. High-signal details: repo ByteDance-Seed/BFS-Prover-V2 · language Python · New formal verification tool by ByteDance. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

ByteDance (Doubao/Seed) Repo: ByteDance-Seed/BFS-Prover-V2

Captured source

source ↗

GitHub/github.com/ByteDance-Seed/BFS-Prover-V2

ByteDance-Seed/BFS-Prover-V2 repository metadata

Source ↗

published Oct 8, 2025seen Jun 5captured Jun 11http 200method plain

ByteDance-Seed/BFS-Prover-V2

Language: Python

License: Apache-2.0

Stars: 50

Forks: 4

Open issues: 0

Created: 2025-10-08T02:34:07Z

Pushed: 2025-10-09T22:39:11Z

Default branch: main

Fork: no

Archived: no

README:

BFS-Prover-V2: Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers

Introduction

We introduce BFS-Prover-V2, the state-of-the-art open-source step-level theorem proving system for Lean4, designed to address the dual challenges of scaling both training and inference in neural theorem proving. BFS-Prover-V2 introduces novel solutions to overcome these limitations through:

1. Training-time scaling: A novel multi-stage expert iteration framework with adaptive tactic-level data filtering and periodic retraining to surmount the performance plateaus that typically curtail long-term post training 2. Inference-time scaling: A planner-enhanced multi-agent tree search system for hierarchical reasoning that scales performance at inference time

BFS-Prover-V2 achieves 95.08\% and 41.4\% on the miniF2F and ProofNet test sets respectively, setting a new state-of-the-art for step-level provers.

Technical Overview

Scaling up training

At the start of each training round, the system evaluates the model's performance to determine whether it has plateaued. If the model continues to improve, it entersan inner expert iteration loop involving rollout, tactic filtering, and refinement. Once improvement stalls, the system transitions to the outer retraining loop, which performs data re-synthesis, data curation, and full retraining from the base checkpoint.

Scaling up inference

The planner agent (a general-purpose reasoning model) decomposes the main theorem into a sequence of simpler subgoals, which are managed in a shared subgoal cache and solved in parallel by multiple prover agents using best-first tree search. Successfully proven subgoals augment the main proof's context, while failures can trigger a dynamic replanning loop.

Benchmark Performance

Use BFS-Prover-V2 in Lean (via LLMLean)

https://github.com/user-attachments/assets/5e4d00b8-6ea1-465c-92dd-bf85a8c48cb1

BFS-Prover-V2 is integrated with LLMLean, enabling interactive theorem proving in VS Code.

Requirements

Lean 4 (Installation)
Ollama (Download)

Installation

1. Pull BFS-Prover-V2 from Ollama (32B) (7B)

# e.g., for 7B model with 8-bit quantization
ollama pull zeyu-zheng/BFS-Prover-V2-7B:q8_0

2. Configure LLMLean for BFS-Prover-V2

# ~/.config/llmlean/config.toml
api = "ollama"
model = "zeyu-zheng/BFS-Prover-V2-7B:q8_0"
mode = "parallel"
prompt = "tacticstate"
responseFormat = "tactic"
numSamples = "5"

3. Add LLMLean to lakefile in your Lean project

-- lakefile.lean
require llmlean from git "https://github.com/cmu-l3/llmlean.git"

# lakefile.toml
[[require]]
name = "llmlean"
git = "https://github.com/cmu-l3/llmlean.git"

4. Use in Lean

import Mathlib
import LLMlean

example : ... := by
llmstep "" -- Get suggestion for next tactic
-- or
llmstep "rw" -- Get suggestion for using tactic `rw` next

Run Proof Search from Repository

Requirements

Python 3.11
Lean 4 (Installation)

(Version compatibility follows LeanDojo 2.1.3, has been tested under Lean 4.10.0.)

CUDA-compatible GPU

Installation

1. Pull BFS-Prover-V2 from Hugging Face (32B) (7B)

2. Clone and install the repository

# Clone the repository
git clone https://github.com/ByteDance-Seed/BFS-Prover-V2.git
cd BFS-Prover-V2

# Install dependencies
pip install .

3. Trace your Lean repository for proof search with LeanDojo

from lean_dojo import *

url = URL
commit = COMMIT_HASH
repo = LeanGitRepo(url, commit)
trace(repo)

4. Prepare the data for proof search

The repository includes sample data for planning and proof search in src/data.

Plan Generation and Replanning

1. Configure the planner model and data path in src/plan/config.yaml

Important: Ensure that the statement_file and dojo_data_file contain matching theorems. Each theorem ID in the statement file must have a corresponding entry in the dojo data file. Mismatched data will cause verification errors.

For a quick start, you can use the provided demo data:

Set dojo_data_file: "src/data/demo_dojo.jsonl"
Set statement_file: "src/data/demo_statements.jsonl"

These demo files contain 7 matching theorems from the miniF2F test set.

2. (Optional) To use local models, launch a vLLM server

vllm serve --model /PATH/TO/YOUR/PLANNER_MODEL --port 8000 --reasoning_parser YOUR_REASONING_PARSER

3. Generate a plan for the theorems in dataset

bash src/plan/run_local_plan.sh

4. (Optional) Replan the theorems when the proof search gets stuck

bash src/plan/run_local_replan.sh

Proof Search

1. Configure the proof search parameters in src/search/run_local_search.sh

For a quick start with demo data, you can use:

Set --file_path to src/data/demo_dojo.jsonl
Set --plan_file to src/data/demo_plan.json

The demo_plan.json contains pre-generated plans for the 7 demo theorems that can be used directly for proof search.

2. Run the proof search:

bash src/search/run_local_search.sh

Citation

@article{xin2025scaling,
title={Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers},
author={Xin, Ran and Zheng, Zeyu and Nie, Yanchen and Yuan, Kun and Xiao, Xia},
journal={arXiv preprint arXiv:2509.06493},
year={2025}
}

License

This project is licensed under the [Apache License 2.0](LICENSE).

Notability

notability 5.0/10

New formal verification tool by ByteDance