microsoft/onnxruntime v1.25.0
ONNX Runtime v1.25.0
Repository: microsoft/onnxruntime
Tag: v1.25.0
Published: 2026-04-20T18:25:24Z
Prerelease: no
Release notes:
📢 Announcements & Breaking Changes
Build & Platform
- C++20 is now required to build ONNX Runtime from source. Minimum toolchains: MSVC 19.29+, GCC 10+, Clang 10+. Users of prebuilt packages are unaffected. (#27178)
- CUDA minimum version raised to 12.0 — CUDA 11.x is no longer supported. Users pinned to CUDA 11.x should stay on ORT 1.24.x or upgrade their CUDA toolkit/driver. (#27570)
- ONNX upgraded to 1.21.0 (#27601)
- sympy is now an optional dependency for Python builds. (#27200)
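The minimums above can be checked mechanically before attempting a source build. The following is a minimal sketch and not part of ONNX Runtime: the `MINIMUMS` table copies the versions listed in these notes, while `parse_version` and `meets_minimum` are hypothetical helper names introduced for illustration.

```python
# Hypothetical pre-build check; the version floors are copied from the notes above.
MINIMUMS = {
    "msvc": (19, 29),
    "gcc": (10,),
    "clang": (10,),
    "cuda": (12, 0),
}


def parse_version(text):
    """Turn a dotted version string such as '12.4.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(tool, version_text):
    """Return True if version_text satisfies the 1.25.0 minimum for tool."""
    minimum = MINIMUMS[tool.lower()]
    version = parse_version(version_text)
    # Pad the shorter tuple with zeros so '10' compares equal to '10.0'.
    width = max(len(minimum), len(version))

    def pad(value):
        return value + (0,) * (width - len(value))

    return pad(version) >= pad(minimum)


print(meets_minimum("gcc", "9.4.0"))   # False
print(meets_minimum("cuda", "11.8"))   # False
print(meets_minimum("msvc", "19.29"))  # True
```

A `False` result for `cuda` corresponds to the case described above: stay on ORT 1.24.x or upgrade the CUDA toolkit/driver.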
Execution Provider Changes
- ArmNN EP has been removed. Users should remove any `--use_armnn` build flags and migrate to the MLAS/KleidiAI-backed CPU EP or QNN EP for Qualcomm hardware. (#27447)
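For Python applications that previously requested the ArmNN EP at session creation, one way to migrate is to filter the requested provider list and fall back to the CPU EP. This is a sketch under assumptions: `pick_providers` is a hypothetical helper, and the provider name strings (in particular `ArmNNExecutionProvider`) are assumed to match the identifiers reported by `onnxruntime.get_available_providers()`.

```python
# Hypothetical helper: choose execution providers after the ArmNN EP removal.
REMOVED_PROVIDERS = {"ArmNNExecutionProvider"}  # assumed name of the removed EP
CPU_PROVIDER = "CPUExecutionProvider"


def pick_providers(preferred, available):
    """Keep preferred providers that still exist; append the CPU EP if missing."""
    chosen = [
        name
        for name in preferred
        if name not in REMOVED_PROVIDERS and name in available
    ]
    if CPU_PROVIDER not in chosen:
        chosen.append(CPU_PROVIDER)
    return chosen


# Example with a hard-coded availability list; in a real application this would
# come from onnxruntime.get_available_providers().
available = ["QNNExecutionProvider", "CPUExecutionProvider"]
print(pick_providers(["ArmNNExecutionProvider", "QNNExecutionProvider"], available))
# ['QNNExecutionProvider', 'CPUExecutionProvider']
```

The returned list can be passed as the `providers` argument of `onnxruntime.InferenceSession`.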
API Version
- ORT_API_VERSION updated to 25. (#27280)
---
🔒 Security Fixes
- Fixed potential integer truncation leading to heap out-of-bounds read/write (#27544)
- Addressed Pad Reflect vulnerability (#27652)
- Security fix for transpose optimizer (#27555)
- Upgraded minimatch 3.1.2 → 3.1.4 for CVE-2026-27904 (#27667)
- Hardened shell command handling for constant strings (#27840)
- Added validation of `onnx::TensorProto` data size before allocation (#27547)
- Cleaned up external data path validation (#27539)
- Fixed misaligned address reads for tensor attributes from raw data buffers (#27312)
- Fixed CPU Attention overflow issue (#27822)
- Fixed CPU LRN integer overflow issues (#27886)
- Additional input validation hardening:
  - Tile kernel dim overflow (#27566)
  - Out-of-bounds read in cross entropy (#27568)
  - TreeEnsembleClassifier attributes (#27571)
  - AffineGrid (#27572)
  - EmbedLayerNorm position_ids (#27573)
  - RotaryEmbedding position_ids (#27597)
  - RoiAlign batch_indices (#27603)
  - MaxUnpool indices (#27432)
  - QMoECPU swiglu OOB (#27748)
  - SVMClassifier initializer (#27699)
  - Col2Im SafeInt (#27625)
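Several of the fixes above share one pattern: validate a size computed from untrusted model data before using it to allocate or index. The sketch below illustrates that pattern in Python; it is not the ONNX Runtime implementation. `checked_num_bytes` and `validate_raw_data` are hypothetical names, and the 64-bit limit is an assumption standing in for `size_t` on a 64-bit platform.

```python
# Illustrative only: overflow-checked size validation before allocation.
SIZE_MAX = 2**64 - 1  # assumed size_t range on a 64-bit platform


def checked_num_bytes(dims, element_size):
    """Multiply dims and element_size, rejecting negative dims and overflow."""
    total = element_size
    for dim in dims:
        if dim < 0:
            raise ValueError(f"negative dimension: {dim}")
        total *= dim
        # Python ints never wrap, so the overflow a C++ size_t would hit is
        # modelled as an explicit range check after each multiplication.
        if total > SIZE_MAX:
            raise OverflowError("tensor byte size exceeds size_t range")
    return total


def validate_raw_data(dims, element_size, raw_data):
    """Check that the declared shape matches the bytes actually provided."""
    expected = checked_num_bytes(dims, element_size)
    if expected != len(raw_data):
        raise ValueError(f"shape implies {expected} bytes, got {len(raw_data)}")
    return expected


print(validate_raw_data([2, 3], 4, bytes(24)))  # 24
```

Rejecting the input before any buffer is sized from it is what keeps a crafted shape from producing an undersized allocation followed by an out-of-bounds access.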
---
✨ New Features
🔌 Execution Provider Plugin API & CUDA Plugin EP
ORT 1.25.0 introduces the CUDA Plugin EP — the first core implementation that enables third-party CUDA-backed EPs to be delivered as dynamically loaded plugins without rebuilding ORT.
- CUDA Plugin EP: Core implementation (#27816)
- CUDA Plugin EP: BFC-style arena and CUDA mempool allocators for stream-aware memory management (#27931)
- Plugin EP Sync API for synchronous execution (#27538)
- Plugin EP event profiling APIs (#27649)
- Plugin EP APIs to retrieve ONNX operator schemas (#27713)
- Annotation-based graph partitioning with resource accounting (#27595, #27972)
- EP API adapter improvements: header-only adapter, `OpKernelInfo::GetConfigOptions`, `LoggingManager::HasDefaultLogger()` (#26879, #26919, #27540, #27541, #27587)
- WebGPU EP made compatible with EP API (#26907)
🔧 Core APIs
- Per-session thread pool work callbacks API (#27253)
- `enable_profiling` in RunOptions (#26846)
- KernelInfo string-array attribute APIs for C and C++ (#27599)
- `OrtModel` input support for Compile API (#27332)
- Session config to create weightless EPContext models during compilation (#27197)
- Compiled model compatibility APIs in example plugin EP (#27088)
- Model Package support (preview): Initial infrastructure for automatically selecting compiled EPContext model variants from a packaged collection based on EP, device, and hardware constraints. The directory structure is not yet finalized. (#27786)
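The Model Package item describes choosing a compiled variant by EP, device, and hardware constraints. Because the package layout is not finalized, the following is only a conceptual sketch of such a selection step in plain Python. Every name in it (`Variant`, `select_variant`, the field names, and the file paths) is hypothetical and does not reflect the ORT API or on-disk format.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Variant:
    """Hypothetical description of one compiled model in a package."""
    path: str
    ep: str                 # execution provider the variant was compiled for
    device: Optional[str]   # None means "any device for this EP"


def select_variant(variants: Sequence[Variant], ep: str, device: str) -> Optional[Variant]:
    """Prefer an exact (ep, device) match, then an EP-only match, else None."""
    ep_matches = [v for v in variants if v.ep == ep]
    for v in ep_matches:
        if v.device == device:
            return v
    for v in ep_matches:
        if v.device is None:
            return v
    return None


package = [
    Variant("model_qnn_generic.onnx", "QNNExecutionProvider", None),
    Variant("model_qnn_npu.onnx", "QNNExecutionProvider", "npu"),
    Variant("model_cpu.onnx", "CPUExecutionProvider", None),
]
chosen = select_variant(package, "QNNExecutionProvider", "npu")
print(chosen.path)  # model_qnn_npu.onnx
```

Ranking exact matches above generic ones is a design choice made for this sketch; the preview feature may weigh constraints differently.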
📊 New ONNX Ops & Opset Coverage