Model · Snowflake (Arctic) · published Aug 21, 2025

Snowflake/Arctic-LSTM-Speculator-gpt-oss-120b

license apache-2.0 · downloads 78 · likes 5

ArcticSpeculator

Build the fastest OSS vLLM-based speculative decoding system for your own model using ArcticTraining and ArcticInference!

Throughput (tokens/s) of gpt-oss-120b on 8xH100 using vLLM:

| method           | ShareGPT | HumanEval |
|------------------|----------|-----------|
| vLLM V1 Baseline | 220.2    | 220.7     |
| ArcticSpeculator | 377.3    | 400.0     |
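
To read the table as relative speedups rather than raw throughput, divide each ArcticSpeculator figure by the matching baseline. The sketch below is only illustrative arithmetic on the numbers reported above. The dictionary layout and the `speedup` helper are names chosen here for the example, not part of ArcticTraining or ArcticInference.

```python
# Throughput figures (tokens/s) copied from the table above.
# The structure and helper names are illustrative assumptions.
throughput = {
    "ShareGPT": {"baseline": 220.2, "speculator": 377.3},
    "HumanEval": {"baseline": 220.7, "speculator": 400.0},
}


def speedup(baseline: float, speculator: float) -> float:
    """Return the ratio of speculator throughput to baseline throughput."""
    return speculator / baseline


# Speedup per benchmark, rounded to two decimals for display.
speedups = {name: round(speedup(**row), 2) for name, row in throughput.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

On these figures the ratio is roughly 1.7x on ShareGPT and 1.8x on HumanEval.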

For more details about ArcticSpeculator and how to use it:

See all of the speculators we have released via our Speculators Collection

Notability

notability 4.0/10

Low traction, niche speculator model release.