mistralai/Devstral-2-123B-Instruct-2512
Devstral 2 123B Instruct 2512
Devstral is an agentic LLM for software engineering tasks. Devstral 2 excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. The model scores 72.2% on SWE-bench Verified (see Benchmark Results).
This model is an Instruct model in FP8, fine-tuned to follow instructions, making it ideal for chat, agentic, and instruction-based tasks in SWE use cases.
For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we invite companies to reach out to us.
Key Features
The Devstral 2 Instruct model offers the following capabilities:
- Agentic Coding: Devstral is designed to excel at agentic coding tasks, making it a great choice for software engineering agents.
- Improved Performance: Devstral 2 is a step up compared to its predecessors.
- Better Generalization: Generalises better to diverse prompts and coding environments.
- Context Window: A 256k context window.
Use Cases
Devstral 2 targets AI code assistants, agentic coding, and software engineering tasks, where it applies complex tool integration and deep codebase understanding in coding environments.
Benchmark Results
| Model/Benchmark | Size (B Parameters) | SWE Bench Verified | SWE Bench Multilingual | Terminal Bench 2 |
|--------------------|---------------------|--------------------|------------------------|------------------|
| Devstral 2 | 123 | 72.2% | 61.3% | 32.6% |
| Devstral Small 2 | 24 | 68.0% | 55.7% | 22.5% |
| | | | | |
| GLM 4.6 | 355 | 68.0% | -- | 24.6% |
| Qwen 3 Coder Plus | 480 | 69.6% | 54.7% | 25.4% |
| MiniMax M2 | 230 | 69.4% | 56.5% | 30.0% |
| Kimi K2 Thinking | 1000 | 71.3% | 61.1% | 35.7% |
| DeepSeek v3.2 | 671 | 73.1% | 70.2% | 46.4% |
| | | | | |
| GPT 5.1 Codex High | -- | 73.7% | -- | 52.8% |
| GPT 5.1 Codex Max | -- | 77.9% | -- | 60.4% |
| Gemini 3 Pro | -- | 76.2% | -- | 54.2% |
| Claude Sonnet 4.5 | -- | 77.2% | 68.0% | 42.8% |
*Benchmark results presented are based on publicly reported values for competitor models.
Usage
Scaffolding
Together with Devstral 2, we are releasing Mistral Vibe, a CLI tool that lets developers use Devstral's capabilities directly in their terminal.
- Mistral Vibe (recommended): Learn how to use it [here](#mistral-vibe)
Devstral 2 can also be used with the following scaffoldings:
You can use Devstral 2 either through our API or by running locally.
Mistral Vibe
The Mistral Vibe CLI is a command-line tool designed to help developers leverage Devstral’s capabilities directly from their terminal.
We recommend installing Mistral Vibe using uv for faster and more reliable dependency management:
uv tool install mistral-vibe
You can also run:
curl -LsSf https://mistral.ai/vibe/install.sh | sh
If you prefer using pip, use:
pip install mistral-vibe
To launch the CLI, navigate to your project's root directory and simply execute:
vibe
If this is your first time running Vibe, it will:
- Create a default configuration file at `~/.vibe/config.toml`.
- Prompt you to enter your API key if it's not already configured. Follow these instructions to create an account and get an API key.
- Save your API key to `~/.vibe/.env` for future use.
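To confirm that the first-run steps above completed, a small check can look for the two files they mention. This is a minimal sketch, not part of Mistral Vibe: the helper name `vibe_setup_status` is hypothetical, and it only tests that the files exist, not that their contents are valid.

```python
from pathlib import Path


def vibe_setup_status(home: Path) -> dict:
    """Report which first-run files exist under <home>/.vibe.

    Hypothetical helper: the paths come from the first-run description,
    everything else is an illustration.
    """
    vibe_dir = home / ".vibe"
    return {
        "config": (vibe_dir / "config.toml").is_file(),
        "env": (vibe_dir / ".env").is_file(),
    }


if __name__ == "__main__":
    # Check the current user's home directory and print one line per file.
    for name, present in vibe_setup_status(Path.home()).items():
        print(f"{name}: {'found' if present else 'missing'}")
```

Passing the home directory as an argument keeps the check easy to run against a temporary directory.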
Local Deployment
The model can also be deployed locally with the libraries below. If local serving gives subpar results, we advise using the Mistral AI API instead.
- `vllm (recommended)`: See [here](#vllm-recommended)
- `sglang`: See [here](#sglang)
- `transformers`: See [here](#transformers)
We're thankful to the llama.cpp team and their community, as well as the LM Studio and Ollama teams, who worked hard to make these models available for their frameworks.
You can now also run Devstral using these (alphabetically ordered) frameworks:
- `llama.cpp`: To use community ones such as Unsloth's or Bartowski's, make sure to use changes from this PR.
- `Ollama`: https://ollama.com/library/devstral-2
If you notice subpar performance with local serving, please submit an issue to the relevant framework so that it can be fixed. In the meantime, we advise using the Mistral AI API.
vLLM (recommended)
Make sure [`mistral_common >= 1.8.6`](https://github.com/mistralai/mistral-common/releases/tag/v1.8.6) is installed. To check:
python -c "import mistral_common; print(mistral_common.__version__)"
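The one-liner prints the installed version but leaves the comparison against 1.8.6 to the reader. The sketch below automates that comparison using only the standard library. The helper names `parse_version` and `meets_minimum` are hypothetical, and the parsing is deliberately simple: it reads the first three numeric components and treats a pre-release such as `1.8.6rc1` as `1.8.6`, which may be too lenient for some setups.

```python
from importlib import metadata

MIN_VERSION = (1, 8, 6)  # minimum mistral_common version named in this section


def parse_version(version: str) -> tuple:
    """Turn '1.8.6' into (1, 8, 6), ignoring non-numeric suffixes like 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:  # pad short versions such as '2.0'
        parts.append(0)
    return tuple(parts)


def meets_minimum(version: str, minimum: tuple = MIN_VERSION) -> bool:
    """Compare numerically, so '1.10.0' correctly ranks above '1.8.6'."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        installed = metadata.version("mistral_common")
    except metadata.PackageNotFoundError:
        print("mistral_common is not installed")
    else:
        verdict = "ok" if meets_minimum(installed) else "too old"
        print(f"mistral_common {installed}: {verdict}")
```

Comparing integer tuples rather than strings avoids the common mistake of ranking `1.10.0` below `1.8.6`.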
_Launch server_
We recommend that you use Devstral in a server/client setting.
1. Spin up a server:
vllm serve mistralai/Devstral-2-123B-Instruct-2512 \
  --tool-call-parser mistral --enable-auto-tool-choice \
  --tensor-parallel-size 8
2. To query the server, you can use a simple Python snippet.
import requests
import json
from huggingface_hub import hf_hub_download

url = "http://:8000/v1/chat/completions"
headers = {"Content-Type": "application/json", "Authorization": "Bearer token"}

model = "mistralai/Devstral-2-123B-Instruct-2512"


def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt


SYSTEM_PROMPT = load_system_prompt(model, "CHAT_SYSTEM_PROMPT.txt")

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "",
            },
        ],
    },
]

data = {"model": model, "messages": messages, "temperature": 0.15}
# Devstral 2 supports tool calling. If you want to use tools, follow this:
# tools = [ # Define tools for vLLM
# {
# "type": "function",
# "function": {
# "name": "git_clone",
#…Excerpt shown — open the source for the full document.
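The excerpt stops before the response is read. As a minimal sketch of that last step, the helper below pulls the assistant text and any tool calls out of a response, assuming the server returns the OpenAI-compatible chat completion shape. The function name `extract_reply`, the sample response, and the `git_clone` arguments are hypothetical and only illustrate the structure.

```python
import json


def extract_reply(response_json: dict) -> dict:
    """Return the assistant text and decoded tool calls from a chat completion.

    Assumes the OpenAI-compatible layout: choices[0].message with optional
    'content' and 'tool_calls' fields.
    """
    message = response_json["choices"][0]["message"]
    tool_calls = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        tool_calls.append(
            {
                "name": function["name"],
                # In this format, arguments arrive as a JSON-encoded string.
                "arguments": json.loads(function["arguments"]),
            }
        )
    return {"content": message.get("content"), "tool_calls": tool_calls}


# Hypothetical response used only to show the expected shape.
example_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "type": "function",
                        "function": {
                            "name": "git_clone",
                            "arguments": '{"url": "https://example.com/repo.git"}',
                        },
                    }
                ],
            }
        }
    ]
}

if __name__ == "__main__":
    print(extract_reply(example_response))
```

With the client snippet's variables, the same helper could be applied to `requests.post(url, headers=headers, data=json.dumps(data)).json()`.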