deepseek-ai/DeepSpec
Description: DeepSpec: a full-stack codebase for training and evaluating speculative decoding algorithms
Language: Python
License: MIT
Stars: 128
Forks: 6
Open issues: 0
Created: 2026-06-26T12:36:05Z
Pushed: 2026-06-27T04:57:08Z
Default branch: main
Fork: no
Archived: no
README:
DeepSpec
DeepSpec is a full-stack codebase for training and evaluating draft models for speculative decoding. It contains data preparation utilities, draft model implementations, training code, and evaluation scripts.
Environment
Install the Python dependencies:
python -m pip install -r requirements.txt
Data preparation additionally requires an inference engine to serve the target model when regenerating answers; see [scripts/data/README.md](./scripts/data/README.md) for details.
Workflow
Run the stages in order — each stage's output feeds the next:
1. Data Preparation: download prompts, regenerate target answers, and build the target cache.
2. Training: train a draft model against the cached target outputs.
3. Evaluation: measure speculative-decoding acceptance on benchmark tasks.
Data Preparation
See [scripts/data/README.md](./scripts/data/README.md) for the step-by-step data pipeline:
1. download and split training data,
2. regenerate answers,
3. prepare the target cache (storage warning: this can be very large, roughly 38 TB for the default Qwen/Qwen3-4B setting).
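Given the storage warning above, it may be worth checking free disk space before starting the target-cache step. The following is a minimal sketch, not part of DeepSpec: the helper name `has_room` is hypothetical, the cache directory is a placeholder, and 38 TB is read as decimal terabytes (an assumption, since the README does not say which unit convention it uses).

```python
import shutil

# Assumption: "38 TB" is read as 38 * 10**12 bytes (decimal terabytes).
REQUIRED_BYTES = 38 * 10**12


def has_room(path: str, required_bytes: int = REQUIRED_BYTES) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes


if __name__ == "__main__":
    # "." is a placeholder; point this at the intended cache directory.
    target = "."
    free_tb = shutil.disk_usage(target).free / 10**12
    print(f"free: {free_tb:.1f} TB, enough for default cache: {has_room(target)}")
```

Other target models or dataset sizes will need a different threshold; the 38 TB figure applies only to the default Qwen/Qwen3-4B setting.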
Training
bash scripts/train/train.sh
train.sh launches train.py, which spawns one worker per visible GPU. Select the algorithm and target model by pointing config_path at one of the configs under [config/](./config/) (e.g. config/dspark/dspark_qwen3_4b.py); see the script header for the full list of configs, how to override config_path / target_cache_dir, and how to use --opts to override individual config fields. Checkpoints are written to ~/checkpoints///step_*.
Hardware: the default configs and scripts assume a single node with 8 GPUs. To run on fewer GPUs, list fewer devices in CUDA_VISIBLE_DEVICES.
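Because train.py spawns one worker per visible GPU, the worker count follows from CUDA_VISIBLE_DEVICES. The sketch below shows how such a value can be parsed and shortened. It is illustrative only: `visible_gpus` and `restrict_to` are hypothetical helpers, not functions from train.py, and the README does not describe how train.py itself counts devices.

```python
import os
from typing import List, Optional


def visible_gpus(env_value: Optional[str]) -> Optional[List[str]]:
    """Parse a CUDA_VISIBLE_DEVICES value into a list of device IDs.

    Returns None when the variable is unset, meaning CUDA exposes every GPU.
    """
    if env_value is None:
        return None
    return [d.strip() for d in env_value.split(",") if d.strip()]


def restrict_to(n: int, available: List[str]) -> str:
    """Build a CUDA_VISIBLE_DEVICES value that keeps the first n devices."""
    if not 1 <= n <= len(available):
        raise ValueError(f"need 1..{len(available)} devices, got {n}")
    return ",".join(available[:n])


if __name__ == "__main__":
    current = visible_gpus(os.environ.get("CUDA_VISIBLE_DEVICES"))
    print("currently visible:", "all GPUs" if current is None else current)
    # Example: shrink an 8-GPU node to 4 workers.
    print("CUDA_VISIBLE_DEVICES=" + restrict_to(4, [str(i) for i in range(8)]))
```

The resulting value would typically be exported before launching train.sh, assuming the script does not overwrite the variable itself; the script header is the place to confirm that.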
Evaluation
bash scripts/eval/eval.sh
eval.sh runs eval.py against a trained draft checkpoint over the speculative-decoding benchmarks in [eval_datasets/](./eval_datasets/) (gsm8k, math500, aime25, humaneval, mbpp, livecodebench, mt-bench, alpaca, arena-hard-v2). Set:
- target_name_or_path: the target model the draft was trained against (e.g. Qwen/Qwen3-4B),
- draft_name_or_path: the draft checkpoint, e.g. ~/checkpoints/deepspec/dspark_block8_qwen3_4b/step_latest.
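The README does not define the acceptance metric that eval.py reports. As a rough illustration of what "acceptance" commonly means in speculative decoding, the sketch below counts how many drafted tokens match the target model's greedy tokens before the first mismatch, then averages over decoding steps. The function names and the greedy exact-match rule are assumptions for illustration, not DeepSpec's implementation.

```python
from typing import List, Sequence, Tuple


def accepted_length(draft: Sequence[int], target: Sequence[int]) -> int:
    """Count draft tokens accepted under greedy verification.

    A draft token is accepted while it equals the target model's greedy
    token at the same position; the first mismatch rejects the rest.
    """
    n = 0
    for d, t in zip(draft, target):
        if d != t:
            break
        n += 1
    return n


def mean_accepted_length(
    steps: List[Tuple[Sequence[int], Sequence[int]]],
) -> float:
    """Average accepted length over (draft, target) pairs, one per step."""
    if not steps:
        raise ValueError("no decoding steps to average over")
    return sum(accepted_length(d, t) for d, t in steps) / len(steps)


if __name__ == "__main__":
    # Toy token IDs, made up for the example.
    steps = [([1, 2, 3], [1, 2, 4]), ([5, 6], [7, 6])]
    print(mean_accepted_length(steps))
```

A higher average means the target model verifies more drafted tokens per forward pass, which is the source of the speed-up in speculative decoding.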
Supported Algorithms
Currently, DeepSpec includes three draft models: [DSpark](./DSpark_paper.pdf), DFlash and Eagle3.
License
DeepSpec is released under the [MIT License](./LICENSE). It includes code adapted from third-party projects under their own licenses; see [NOTICE](./NOTICE) for the full attribution.
Acknowledgements
DeepSpec builds on the ideas and code of several excellent open-source projects:
- SpecForge (Apache-2.0) — the overall training framework and Eagle3 implementation; portions of the Eagle3 modeling, loss, optimizer, attention, and evaluation code are adapted from it. Adapted files carry an in-file attribution comment, and the full notice is recorded in [NOTICE](./NOTICE).
- DFlash (MIT) — the DFlash draft-model design and training recipe.
- Qwen3 and Gemma — the target model families supported in this repo.
We thank the authors and maintainers of these projects. Contributions of new algorithms are welcome.
Notability
Notability: 5.0/10. New repo with moderate traction.
DeepSeek has a repo signal matching evals and quality, infrastructure.