What does this release signal mean?

Microsoft published ONNX Runtime v1.25.0 (microsoft/onnxruntime), a cross-platform inference engine for optimized ML model execution, on 2026-04-20T18:25:24Z as a stable (non-prerelease) release. This release signal is evidence of what shipped, changed, or was packaged for users. The release notes open with announcements and breaking changes. onlylabs links this event to 1 captured evidence page and 6 related release signals.

Microsoft Release: microsoft/onnxruntime v1.25.0

Captured source

GitHub: github.com/microsoft/onnxruntime

Published Apr 20, 2026 · seen Jun 6 · captured Jun 11 · HTTP 200 · method: plain

ONNX Runtime v1.25.0

Repository: microsoft/onnxruntime

Tag: v1.25.0

Published: 2026-04-20T18:25:24Z

Prerelease: no

Release notes:

📢 Announcements & Breaking Changes

Build & Platform

C++20 is now required to build ONNX Runtime from source. Minimum toolchains: MSVC 19.29+, GCC 10+, Clang 10+. Users of prebuilt packages are unaffected. (#27178)
CUDA minimum version raised to 12.0 — CUDA 11.x is no longer supported. Users pinned to CUDA 11.x should stay on ORT 1.24.x or upgrade their CUDA toolkit/driver. (#27570)
ONNX upgraded to 1.21.0 (#27601)
sympy is now an optional dependency for Python builds. (#27200)
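
The minimum versions listed above can be turned into a quick pre-build check. The following is a minimal sketch and not part of ONNX Runtime: `MIN_TOOLCHAIN`, `MIN_CUDA`, `parse_version`, `at_least`, `toolchain_ok`, and `cuda_ok` are hypothetical names, and version strings are assumed to be plain dotted numbers such as `12.4`.

```python
# Hypothetical pre-build check based on the minimums in the release notes.
# None of these names come from ONNX Runtime itself.

MIN_TOOLCHAIN = {
    "msvc": (19, 29),
    "gcc": (10,),
    "clang": (10,),
}
MIN_CUDA = (12, 0)


def parse_version(text):
    """Turn a dotted version such as '12.4' into a tuple of ints: (12, 4)."""
    return tuple(int(part) for part in text.strip().split("."))


def at_least(version, minimum):
    """Compare two version tuples after padding the shorter one with zeros."""
    width = max(len(version), len(minimum))
    padded_version = version + (0,) * (width - len(version))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_version >= padded_minimum


def toolchain_ok(compiler, version_text):
    """True if the compiler meets the minimum needed to build ORT 1.25.0."""
    return at_least(parse_version(version_text), MIN_TOOLCHAIN[compiler.lower()])


def cuda_ok(version_text):
    """True if the CUDA version meets the 12.0 minimum; 11.x returns False."""
    return at_least(parse_version(version_text), MIN_CUDA)
```

If `cuda_ok` returns False, the release notes suggest staying on ORT 1.24.x or upgrading the CUDA toolkit/driver. Users of prebuilt packages do not need the toolchain check.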

Execution Provider Changes

ArmNN EP has been removed. Users should remove any --use_armnn build flags and migrate to the MLAS/KleidiAI-backed CPU EP or QNN EP for Qualcomm hardware. (#27447)
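
For applications that previously requested the ArmNN EP at runtime, one way to handle the migration is to pick from the suggested replacements in order of preference. This is a minimal sketch under assumptions: `PREFERRED` and `choose_providers` are hypothetical names, and the preference order (QNN first, then CPU) is only an illustration.

```python
# Hypothetical helper: choose execution providers now that the ArmNN EP is gone.
# The strings follow ONNX Runtime's "<Name>ExecutionProvider" naming.

PREFERRED = ["QNNExecutionProvider", "CPUExecutionProvider"]


def choose_providers(available, preferred=PREFERRED):
    """Keep the preferred providers that are actually available, in order."""
    chosen = [name for name in preferred if name in available]
    if not chosen:
        raise RuntimeError("none of the preferred execution providers is available")
    return chosen
```

With the `onnxruntime` Python package installed, `available` would typically be the result of `onnxruntime.get_available_providers()`, and the returned list can be passed as the `providers` argument of `onnxruntime.InferenceSession`.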

API Version

ORT_API_VERSION updated to 25. (#27280)

---

🔒 Security Fixes

Fixed potential integer truncation leading to heap out-of-bounds read/write (#27544)
Addressed Pad Reflect vulnerability (#27652)
Security fix for transpose optimizer (#27555)
Upgraded minimatch 3.1.2 → 3.1.4 for CVE-2026-27904 (#27667)
Hardened shell command handling for constant strings (#27840)
Added validation of onnx::TensorProto data size before allocation (#27547)
Cleaned up external data path validation (#27539)
Fixed misaligned address reads for tensor attributes from raw data buffers (#27312)
Fixed CPU Attention overflow issue (#27822)
Fixed CPU LRN integer overflow issues (#27886)
Additional input validation hardening:
Tile kernel dim overflow (#27566)
Out-of-bounds read in cross entropy (#27568)
TreeEnsembleClassifier attributes (#27571)
AffineGrid (#27572)
EmbedLayerNorm position_ids (#27573)
RotaryEmbedding position_ids (#27597)
RoiAlign batch_indices (#27603)
MaxUnpool indices (#27432)
QMoECPU swiglu OOB (#27748)
SVMClassifier initializer (#27699)
Col2Im SafeInt (#27625)
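
Several of the fixes above follow a common pattern: sizes derived from untrusted model data are validated before anything is allocated or indexed. The following sketch illustrates that general technique only; it is not ONNX Runtime's implementation. `checked_byte_size` and `validate_raw_data` are hypothetical names, and the 64-bit limit stands in for a fixed-width size type, since Python integers do not overflow on their own.

```python
# Illustrative only: overflow-checked size computation for untrusted tensor shapes.

INT64_MAX = 2**63 - 1  # assumed limit, standing in for a fixed-width size type


def checked_byte_size(dims, element_size):
    """Return the byte size for a shape, rejecting negative dims and overflow."""
    total = element_size
    for dim in dims:
        if dim < 0:
            raise ValueError(f"negative dimension: {dim}")
        total *= dim
        if total > INT64_MAX:
            raise OverflowError("tensor byte size exceeds the 64-bit limit")
    return total


def validate_raw_data(dims, element_size, raw_data):
    """Check that the declared shape matches the bytes actually present."""
    expected = checked_byte_size(dims, element_size)
    if expected != len(raw_data):
        raise ValueError(
            f"shape implies {expected} bytes, buffer has {len(raw_data)}"
        )
    return expected
```

The check runs after every multiplication, so an oversized shape is rejected before any buffer is requested.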

---

✨ New Features

🔌 Execution Provider Plugin API & CUDA Plugin EP

ORT 1.25.0 introduces the CUDA Plugin EP — the first core implementation that enables third-party CUDA-backed EPs to be delivered as dynamically loaded plugins without rebuilding ORT.

CUDA Plugin EP: Core implementation (#27816)
CUDA Plugin EP: BFC-style arena and CUDA mempool allocators for stream-aware memory management (#27931)
Plugin EP Sync API for synchronous execution (#27538)
Plugin EP event profiling APIs (#27649)
Plugin EP APIs to retrieve ONNX operator schemas (#27713)
Annotation-based graph partitioning with resource accounting (#27595, #27972)
EP API adapter improvements: header-only adapter, OpKernelInfo::GetConfigOptions, LoggingManager::HasDefaultLogger() (#26879, #26919, #27540, #27541, #27587)
WebGPU EP made compatible with EP API (#26907)

🔧 Core APIs

Per-session thread pool work callbacks API (#27253)
`enable_profiling` in RunOptions (#26846)
KernelInfo string-array attribute APIs for C and C++ (#27599)
OrtModel input support for Compile API (#27332)
Session config to create weightless EPContext models during compilation (#27197)
Compiled model compatibility APIs in example plugin EP (#27088)
Model Package support (preview): Initial infrastructure for automatically selecting compiled EPContext model variants from a packaged collection based on EP, device, and hardware constraints. The directory structure is not yet finalized. (#27786)

📊 New ONNX Ops & Opset Coverage

Attention opset 23 on CUDA with GQA, boolean masks, softcap, and softmax precision (#26466, #27030, #27082,...

Excerpt shown — open the source for the full document.

Notability

Score: 5.0/10

Routine version release of ONNX Runtime.