What does this release signal mean?

Microsoft published v0.13.0 of microsoft/onnxruntime-genai, its library for running generative AI models on ONNX Runtime, on 2026-04-15T19:56:04Z as a full (non-prerelease) release. A release signal is evidence of what shipped, changed, or was packaged for users. The release notes captured here include new model support (InternLM2, Qwen3-VL, Qwen3.5, and Whisper in the model builder), CUDA 13 support, several WebGPU changes, streaming ASR support for Nemotron, and a number of bug and packaging-pipeline fixes. onlylabs links this event to 1 captured evidence page and 6 related release signals.

Microsoft Release: microsoft/onnxruntime-genai v0.13.0

Captured source

GitHub: github.com/microsoft/onnxruntime-genai

published Apr 15, 2026 · seen Jun 9 · captured Jun 11 · http 200 · method plain

v0.13.0

Repository: microsoft/onnxruntime-genai

Tag: v0.13.0

Published: 2026-04-15T19:56:04Z

Prerelease: no

Release notes:

What's Changed

update WebGPU buffer memory info name by @fs-eire in https://github.com/microsoft/onnxruntime-genai/pull/1957
Add enable_profiling in Runtime Options by @xiaofeihan1 in https://github.com/microsoft/onnxruntime-genai/pull/1949
Fix uninitialized tools variable and improve exception debug messages by @sheller-ms in https://github.com/microsoft/onnxruntime-genai/pull/1971
Add common download to Phi-3 tutorial by @kunal-vaishnavi in https://github.com/microsoft/onnxruntime-genai/pull/1973
Add support for InternLM2 model architecture by @amdrajeevp1 in https://github.com/microsoft/onnxruntime-genai/pull/1958
Update cmake cuda architecture and use win-arm64 pool workaround by @baijumeswani in https://github.com/microsoft/onnxruntime-genai/pull/1976
Update examples after 0.12.0 release by @kunal-vaishnavi in https://github.com/microsoft/onnxruntime-genai/pull/1980
Add CI pipeline for WebGPU EP model testing by @qjia7 in https://github.com/microsoft/onnxruntime-genai/pull/1956
Fix Python nightly build by @kunal-vaishnavi in https://github.com/microsoft/onnxruntime-genai/pull/1981
Add missing Quark 0.11 weight patterns for ChatGLM3 output layer by @poganesh in https://github.com/microsoft/onnxruntime-genai/pull/1983
Support Qwen2.5-VL pre-quantized models in qwen.py by @poganesh in https://github.com/microsoft/onnxruntime-genai/pull/1985
[VitisAI] external_ep_libray support fix for WinML by @akholodnamdcom in https://github.com/microsoft/onnxruntime-genai/pull/1984
Fix guidance bug by @baijumeswani in https://github.com/microsoft/onnxruntime-genai/pull/1988
Fix incorrect batch responses when using multiple prompts by @lnigam in https://github.com/microsoft/onnxruntime-genai/pull/1986
Enable webgpu graph capture in base.py by @qjia7 in https://github.com/microsoft/onnxruntime-genai/pull/1991
Harden CUDA error checking across the codebase by @Copilot in https://github.com/microsoft/onnxruntime-genai/pull/1994
allow pruned models for prefill by @fs-eire in https://github.com/microsoft/onnxruntime-genai/pull/1995
Fix WinML Packaging Pipeline by @baijumeswani in https://github.com/microsoft/onnxruntime-genai/pull/1998
Add small changes after pruning prefill by @kunal-vaishnavi in https://github.com/microsoft/onnxruntime-genai/pull/2000
webgpu: Optimize Copyfrom by @qjia7 in https://github.com/microsoft/onnxruntime-genai/pull/1992
Add support for CUDA 13 by @baijumeswani in https://github.com/microsoft/onnxruntime-genai/pull/2001
add webgpu to qmoe path by @guschmue in https://github.com/microsoft/onnxruntime-genai/pull/2005
Fix ERNIE 4.5 model builder: rope_attrs and config architecture name by @xiaoyao9184 in https://github.com/microsoft/onnxruntime-genai/pull/2007
Bug fix in Continuous Decoding by @chilukam-qti in https://github.com/microsoft/onnxruntime-genai/pull/2008
Update Phi-4 mm README links by @kunal-vaishnavi in https://github.com/microsoft/onnxruntime-genai/pull/2014
Add Qwen3-VL model support + multi-image input support in Qwen VL family by @hanbitmyths in https://github.com/microsoft/onnxruntime-genai/pull/2003
Add Qwen3.5 model support and optimize multi-image handling by @apsonawane in https://github.com/microsoft/onnxruntime-genai/pull/2019
Reuse a single generator via RewindTo(0) in benchmark instead of creating multiple generators by @qjia7 in https://github.com/microsoft/onnxruntime-genai/pull/2002
[RyzenAI] WinML compatibility fix by @akholodnamdcom in https://github.com/microsoft/onnxruntime-genai/pull/2026
Nemotron ASR Support for Streaming by @nenad1002 in https://github.com/microsoft/onnxruntime-genai/pull/1997
[WebGPU] Fix the prefill regression when graph capture is ON by @qjia7 in https://github.com/microsoft/onnxruntime-genai/pull/2021
Support 4 inputs for nemotron model by @jiafatom in https://github.com/microsoft/onnxruntime-genai/pull/2036
Updated java packaging based on python packaging logic by @EPNW-Eric in https://github.com/microsoft/onnxruntime-genai/pull/2029
Fix android packaging pipeline by @baijumeswani in https://github.com/microsoft/onnxruntime-genai/pull/2039
Add OpenAI's Whisper to model builder by @kunal-vaishnavi in https://github.com/microsoft/onnxruntime-genai/pull/2018
[Java] Add a dependency on onnxruntime (#2030) by @EPNW-Eric in https://github.com/microsoft/onnxruntime-genai/pull/2040
Fix mutually exclusive inputs for language models by @kunal-vaishnavi in https://github.com/microsoft/onnxruntime-genai/pull/2046
Decouple plugin execution providers (EPs) from the USE_WINML pre-processor macro by @baijumeswani in https://github.com/microsoft/onnxruntime-genai/pull/2038
Route pipeline model RunOptions through SetRunOption for proper special key handling by @Copilot in https://github.com/microsoft/onnxruntime-genai/pull/2044
Add ort_build_version and ort_build_source parameters to nuget and python packaging pipelines, remove ROCm support by @Copilot in https://github.com/microsoft/onnxruntime-genai/pull/2049
Add batched multi-image vision path and window_size config for Qwen VL by @hanbitmyths in https://github.com/microsoft/onnxruntime-genai/pull/2050
docs: fix formatting and syntax highlighting in documentation by @riddles-the-one in https://github.com/microsoft/onnxruntime-genai/pull/2051
Add Silero VAD Support to Nemotron Streaming ASR by @sayanshaw24 in https://github.com/microsoft/onnxruntime-genai/pull/2035
Add Qwen3.5 hybrid decoder export support (GatedDeltaNet + Attention) by @apsonawane in https://github.com/microsoft/onnxruntime-genai/pull/2043
Add support for QNN stateful models by @qti-ashimaj in https://github.com/microsoft/onnxruntime-genai/pull/2012
Allocate recurrent state via device allocator to enable CUDA graph capture by @apsonawane in https://github.com/microsoft/onnxruntime-genai/pull/2057
Speed up CI pipelines by @Copilot in https://github.com/microsoft/onnxruntime-genai/pull/2052
Fix tool calling for TRT-RTX models by @kunal-vaishnavi in https://github.com/microsoft/onnxruntime-genai/pull/2048
Fix vision pipeline EP hardcoding and pixel_values rank mismatch for Qwen VL models by @apsonawane in https://github.com/microsoft/onnxruntime-genai/pull/2060
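
Every entry in the list above follows the same shape: a change title, then "by @author", then "in" and the pull request URL. As a minimal sketch of how such lines could be split into fields for filtering or counting, the Python below uses a regular expression. The names ChangeEntry, ENTRY_RE, and parse_entry are illustrative assumptions and are not part of onnxruntime-genai or of any GitHub tooling; the sample lines are copied from the list above.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional


class ChangeEntry(NamedTuple):
    """One parsed changelog line (hypothetical helper type for this sketch)."""
    title: str
    author: str
    pr_number: int


# Matches "<title> by @<author> in https://github.com/<owner>/<repo>/pull/<number>".
# The title group is greedy, so the LAST " by @" on the line is treated as the
# separator; a title that itself contains " by " is still split correctly.
ENTRY_RE = re.compile(
    r"^(?P<title>.+) by @(?P<author>[\w-]+) in "
    r"https://github\.com/[\w.-]+/[\w.-]+/pull/(?P<pr>\d+)$"
)


def parse_entry(line: str) -> Optional[ChangeEntry]:
    """Return a ChangeEntry for a changelog line, or None if it does not match."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    return ChangeEntry(match["title"], match["author"], int(match["pr"]))


# Three entries copied from the release notes, plus one non-entry line.
sample_lines = [
    "Add support for CUDA 13 by @baijumeswani in https://github.com/microsoft/onnxruntime-genai/pull/2001",
    "Fix guidance bug by @baijumeswani in https://github.com/microsoft/onnxruntime-genai/pull/1988",
    "Add OpenAI's Whisper to model builder by @kunal-vaishnavi in https://github.com/microsoft/onnxruntime-genai/pull/2018",
    "New Contributors",
]

# Keep only the lines that parse as changelog entries.
entries = [e for e in map(parse_entry, sample_lines) if e is not None]

# Count how many of the sampled pull requests each author contributed.
prs_per_author = Counter(e.author for e in entries)

# Print the entries in pull request order.
for entry in sorted(entries, key=lambda e: e.pr_number):
    print(f"#{entry.pr_number} {entry.title} (@{entry.author})")
```

Lines that do not fit the pattern, such as section headings, return None and are skipped rather than raising an error, which keeps the sketch tolerant of the truncated tail of this capture.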

New Contributors

@sheller-ms made their first contribution in https://github.com/microsoft/onnxruntime-genai/pull/1971

Excerpt shown — open the source for the full document.

Notability

notability 3.0/10

Routine library version release