mistralai/Devstral-Small-2507
Devstral Small 1.1
Devstral is an agentic LLM for software engineering tasks, built through a collaboration between Mistral AI and All Hands AI 🙌. Devstral excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. The model achieves remarkable performance on SWE-bench, which positions it as the #1 open source model on this [benchmark](#benchmark-results).
It is fine-tuned from Mistral-Small-3.1 and therefore has a long context window of up to 128k tokens. As a coding agent, Devstral is text-only: the vision encoder was removed before fine-tuning from Mistral-Small-3.1.
For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.
Learn more about Devstral in our blog post.
Updates compared to [`Devstral Small 1.0`](https://huggingface.co/mistralai/Devstral-Small-2505):
- Improved performance; please refer to the [benchmark results](#benchmark-results).
- Devstral Small 1.1 is still great when paired with OpenHands. This new version also generalizes better to other prompts and coding environments.
- Supports Mistral's function calling format.
Key Features:
- Agentic coding: Devstral is designed to excel at agentic coding tasks, making it a great choice for software engineering agents.
- Lightweight: With its compact size of just 24 billion parameters, Devstral is light enough to run on a single RTX 4090 or a Mac with 32GB RAM, making it an appropriate model for local deployment and on-device use.
- Apache 2.0 License: Open license allowing usage and modification for both commercial and non-commercial purposes.
- Context Window: A 128k context window.
- Tokenizer: Utilizes a Tekken tokenizer with a 131k vocabulary size.
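The single-GPU claim in the list above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below estimates weight memory only for 24 billion parameters at a few precisions. The function name is hypothetical, and the estimate assumes nothing but the weights are stored, ignoring the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
# Rough weight-only memory estimate for a 24B-parameter model.
# Assumption: ignores KV cache, activations, and framework overhead.

PARAMS = 24e9  # parameter count stated in the feature list


def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Return the memory needed to hold the weights, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


for label, bits in [("bf16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>5}: {weight_memory_gb(PARAMS, bits):.0f} GB")
```

Under these assumptions, the bf16 weights alone come to about 48 GB, which exceeds the 24 GB of an RTX 4090, so running on that card presumably relies on a quantized build (roughly 12 GB of weights at 4 bits).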
Benchmark Results
SWE-Bench
Devstral Small 1.1 achieves a score of 53.6% on SWE-Bench Verified, outperforming Devstral Small 1.0 by +6.8% and the second-best state-of-the-art model by +11.4%.
| Model              | Agentic Scaffold   | SWE-Bench Verified (%) |
|--------------------|--------------------|------------------------|
| Devstral Small 1.1 | OpenHands Scaffold | 53.6                   |
| Devstral Small 1.0 | OpenHands Scaffold | *46.8*                 |
| GPT-4.1-mini       | OpenAI Scaffold    | 23.6                   |
| Claude 3.5 Haiku   | Anthropic Scaffold | 40.6                   |
| SWE-smith-LM 32B   | SWE-agent Scaffold | 40.2                   |
| Skywork SWE        | OpenHands Scaffold | 38.0                   |
| DeepSWE            | R2E-Gym Scaffold   | 42.2                   |
When evaluated under the same test scaffold (OpenHands, provided by All Hands AI 🙌), Devstral exceeds far larger models such as Deepseek-V3-0324 and Qwen3 235B-A22B.
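The margins quoted in the SWE-Bench paragraph follow directly from the table above. A small sketch, using only the scores listed there (the dictionary and variable names are illustrative), recomputes them:

```python
# SWE-Bench Verified scores (%) copied from the table above.
scores = {
    "Devstral Small 1.1": 53.6,
    "Devstral Small 1.0": 46.8,
    "GPT-4.1-mini": 23.6,
    "Claude 3.5 Haiku": 40.6,
    "SWE-smith-LM 32B": 40.2,
    "Skywork SWE": 38.0,
    "DeepSWE": 42.2,
}

best = scores["Devstral Small 1.1"]

# Gain over the previous release, in percentage points.
gain_over_v1_0 = round(best - scores["Devstral Small 1.0"], 1)

# Strongest competitor, excluding both Devstral releases.
others = {name: s for name, s in scores.items() if not name.startswith("Devstral")}
runner_up = max(others, key=others.get)
gain_over_runner_up = round(best - others[runner_up], 1)

print(gain_over_v1_0)                  # 6.8
print(runner_up, gain_over_runner_up)  # DeepSWE 11.4
```

Note that both margins are differences in percentage points, not relative improvements.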

Usage
We recommend using Devstral with the OpenHands scaffold. You can use it either through our API or by running it locally.
API
Follow these instructions to create a Mistral account and get an API key.
Then run these commands to start the OpenHands docker container.
```bash
# Replace <MY_KEY> with your Mistral API key.
export MISTRAL_API_KEY=<MY_KEY>

# Write the OpenHands settings into the directory that is mounted into the container below.
mkdir -p ~/.openhands && echo '{"language":"en","agent":"CodeActAgent","max_iterations":null,"security_analyzer":null,"confirmation_mode":false,"llm_model":"mistral/devstral-small-2507","llm_api_key":"'$MISTRAL_API_KEY'","remote_runtime_resource_factor":null,"github_token":null,"enable_default_condenser":true}' > ~/.openhands/settings.json

docker pull docker.all-hands.dev/all-hands-ai/runtime:0.48-nikolaik

docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.48-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands:/.openhands \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.48
```

Local inference
The model can also be deployed with the following libraries:
- `vllm (recommended)`: See [here](#vllm-recommended)
- `mistral-inference`: See [here](#mistral-inference)
- `transformers`: See [here](#transformers)
- `LMStudio`: See [here](#lmstudio)
- `llama.cpp`: See [here](#llama.cpp)
- `ollama`: See [here](#ollama)
vLLM (recommended)
Make sure to install [`vLLM >= 0.9.1`](https://github.com/vllm-project/vllm/releases/tag/v0.9.1):

```bash
pip install vllm --upgrade
```

Also make sure to have installed `mistral_common >= 1.7.0`.

```bash
pip install mistral-common --upgrade
```

To check:

```bash
python -c "import mistral_common; print(mistral_common.__version__)"
```
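The two minimum versions above can also be checked in one pass. The following is a minimal sketch using only the standard library. The helper names `parse_version` and `meets_minimum` are hypothetical, and the comparison is deliberately simplified: it looks at the numeric release components only and ignores pre-release or local suffixes.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

# Minimums stated in the text above; keys are the PyPI distribution names.
MINIMUMS = {"vllm": "0.9.1", "mistral-common": "1.7.0"}


def parse_version(text: str) -> tuple[int, ...]:
    """Convert '1.7.0' to (1, 7, 0), stopping at the first non-numeric part."""
    numbers = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        numbers.append(int(digits))
    return tuple(numbers)


def meets_minimum(installed: str, required: str) -> bool:
    """Simplified comparison: pre-release and local suffixes are ignored."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so that '1.7' compares equal to '1.7.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


for package, required in MINIMUMS.items():
    try:
        installed = version(package)
    except PackageNotFoundError:
        print(f"{package}: not installed (need >= {required})")
        continue
    status = "ok" if meets_minimum(installed, required) else "too old"
    print(f"{package}: {installed} ({status}, need >= {required})")
```

If the `packaging` library is available, `packaging.version.Version` handles pre-releases and other edge cases more rigorously than this hand-rolled parser.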
You can also use a ready-to-go Docker image, which is also available on Docker Hub.
_Launch server_
We recommend that you use Devstral in a server/client setting.
1. Spin up a server:
```bash
vllm serve mistralai/Devstral-Small-2507 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --tensor-parallel-size 2
```
2. To query the server from a client, you can use a simple Python snippet.
```python
import requests
import json
from huggingface_hub import hf_hub_download

# Replace <your-server-url> with the host running `vllm serve`.
url = "http://<your-server-url>:8000/v1/chat/completions"
headers = {"Content-Type": "application/json", "Authorization": "Bearer token"}

model = "mistralai/Devstral-Small-2507"


def load_system_prompt(repo_id: str, filename: str) -> str:
    # Download the prompt file from the model repository and read it.
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt


SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "",  # put your instruction for the model here
            },
        ],
    },
]

data = {"model": model, "messages": messages,…
```

(Excerpt truncated here; open the source for the full document.)
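The server command above enables `--tool-call-parser mistral` and `--enable-auto-tool-choice`, which suggests that tool definitions can be passed in the request. The following is a minimal sketch, assuming the server accepts the OpenAI-style `tools` and `tool_choice` fields on the `/v1/chat/completions` endpoint used above. The `read_file` tool, the helper names, and the sample response are all hypothetical and only illustrate the payload shape; nothing here is sent over the network.

```python
import json

MODEL = "mistralai/Devstral-Small-2507"

# Hypothetical tool definition in the OpenAI-style "tools" schema.
READ_FILE_TOOL = {
    "type": "function",
    "function": {
        "name": "read_file",  # illustrative name, not part of any real agent
        "description": "Return the contents of a file in the repository.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}


def build_tool_request(user_text: str) -> dict:
    """Build a chat-completions payload that lets the model pick a tool."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": user_text}],
        "tools": [READ_FILE_TOOL],
        "tool_choice": "auto",
    }


def extract_tool_calls(response: dict) -> list[tuple[str, dict]]:
    """Return (tool name, decoded arguments) pairs from a response body."""
    message = response["choices"][0]["message"]
    calls = message.get("tool_calls") or []
    return [
        (call["function"]["name"], json.loads(call["function"]["arguments"]))
        for call in calls
    ]


# Hand-written example of the response shape, not real model output.
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_0",
                        "type": "function",
                        "function": {
                            "name": "read_file",
                            "arguments": json.dumps({"path": "setup.py"}),
                        },
                    }
                ],
            }
        }
    ]
}

print(json.dumps(build_tool_request("Show me setup.py"), indent=2))
print(extract_tool_calls(sample_response))
```

To use this against a running server, the payload from `build_tool_request` would be posted with the same `url` and `headers` as in the snippet above. The arguments arrive as a JSON string, which is why they are decoded before use.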