Release · Microsoft · published Apr 17, 2026

microsoft/Olive v0.12.0


Olive-ai 0.12.0

Repository: microsoft/Olive

Tag: v0.12.0

Published: 2026-04-17T17:47:54Z

Prerelease: no

Release notes:

Olive 0.12.0

New Features

  • olive init interactive wizard (#2346, by @xiaoyu-work): Added a guided CLI experience to help users configure and generate Olive optimization commands more easily.

  • Olive MCP server (#2353, by @xiaoyu-work): Added an MCP server for tool and agent integrations around Olive workflows.

  • QAIRT ORT to Genie workflow (#2358, by @qti-kromero): Added an end-to-end Qualcomm workflow with new preparation, GenAI builder, and encapsulation passes.

  • Qwen3-VL and multi-image Qwen VL support (#2345, by @hanbitmyths): Added export and optimization support for Qwen3-VL and Qwen2.5-VL, plus new ONNX graph surgeries and 8-bit Gather quantization improvements.

  • AutoClip quantization pass (#2324, by @jambayk): Added automatic clipping search for linear layers before quantization.

  • Layer annotation support (#2361, by @yuslepukhin): Added CaptureLayerAnnotations and ONNX propagation so layer metadata can be preserved through conversion.

  • NVModelOptGraphSurgery pass (#2377, by @hthadicherla): Added NVIDIA ModelOpt graph surgery integration for ONNX models.
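
The AutoClip entry above mentions an automatic clipping search before quantization, but the notes do not describe the search itself. As a rough illustration of the general idea only, the sketch below tries a range of clipping thresholds for one weight vector and keeps the one with the lowest round-to-nearest quantization error. The function names, the grid of ratios, and the weight-MSE objective are all assumptions made for this example; they are not taken from Olive's pass, which may score candidates differently (for instance against layer outputs).

```python
def quantize_dequantize(weights, clip, bits=4):
    """Symmetric round-to-nearest quantization of weights clipped to [-clip, clip]."""
    qmax = 2 ** (bits - 1) - 1          # e.g. 7 for signed 4-bit
    scale = clip / qmax
    out = []
    for w in weights:
        w = max(-clip, min(clip, w))    # clip outliers before rounding
        out.append(round(w / scale) * scale)
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def search_clip(weights, bits=4, steps=20, min_ratio=0.5):
    """Grid-search a clipping threshold (hypothetical helper, not Olive's API).

    Tries thresholds from max|w| down to min_ratio * max|w| and returns the
    (threshold, error) pair with the lowest reconstruction error.
    """
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return 0.0, 0.0                 # all-zero weights need no search
    best_clip = max_abs
    best_err = mse(weights, quantize_dequantize(weights, max_abs, bits))
    for i in range(1, steps + 1):
        clip = max_abs * (1.0 - (1.0 - min_ratio) * i / steps)
        err = mse(weights, quantize_dequantize(weights, clip, bits))
        if err < best_err:
            best_clip, best_err = clip, err
    return best_clip, best_err
```

Calling `search_clip(row)` on each row or group of a linear layer's weight matrix would give one threshold per row or group. By construction the returned error is never worse than quantizing with no clipping at all.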

Improvements

  • AMD Quark quantization updates (#2364, by @poganesh): Updated the Quark pass for Quark 0.11, VitisAI LLM fusion, token fusion, and GPT-OSS pre-quantized models.

  • HQQ and RTN external data handling (#2380, by @Lidang-Jiang): Fixed ONNX quantization output correctness when input models store weights as external data.

  • Transformers 5.0+ compatibility (#2328, by @xiaoyu-work): Updated export and training flows for the new DynamicCache format and related argument handling.

  • Telemetry robustness (#2405, by @bmehta001): Fixed Linux and macOS device ID handling, auto-disabled telemetry in CI, and improved cache and exporter reliability.
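
The telemetry entry above says telemetry is now auto-disabled in CI, without saying how CI is detected. One common approach is to look for environment variables that CI systems set on their runners. The sketch below assumes that approach; the variable list and both function names are illustrative and are not taken from Olive's source.

```python
import os

# Assumed variable names: common CI conventions, not read from Olive's code.
# CI is set by many providers, GITHUB_ACTIONS by GitHub Actions,
# TF_BUILD by Azure Pipelines.
CI_ENV_VARS = ("CI", "GITHUB_ACTIONS", "TF_BUILD")


def running_in_ci(environ=None):
    """Return True if any known CI variable is set to a truthy-looking value."""
    env = os.environ if environ is None else environ
    return any(
        env.get(name, "").strip().lower() not in ("", "0", "false")
        for name in CI_ENV_VARS
    )


def telemetry_enabled(user_opt_in=True, environ=None):
    """Hypothetical gate: telemetry runs only if opted in and not in CI."""
    return user_opt_in and not running_in_ci(environ)
```

Passing a plain dict as `environ` makes the check easy to exercise without touching the real process environment.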

Security

  • PyTorch model loading hardening (#2389, by @jambayk): Removed unsafe legacy `torch.load(..., weights_only=False)` loading paths, removed `PYTORCH_ENTIRE_MODEL`, and now requires `model_loader` for PyTorch models.

  • Pydantic v2 migration (#2330, by @shaahji): Migrated Olive to Pydantic v2 across the codebase and updated validators, config patterns, and model serialization.
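
For context on the PyTorch loading entry above: loading with `weights_only=False` goes through Python's pickle machinery, and unpickling untrusted data can run arbitrary code. The standard-library sketch below shows that risk and the allow-list style of mitigation. It is neither PyTorch's nor Olive's implementation; `SAFE_GLOBALS`, `RestrictedUnpickler`, `safe_loads`, and `Exploit` are names invented for this example.

```python
import io
import pickle

# Hypothetical allow-list for illustration; a real loader would allow a
# different, carefully chosen set of types.
SAFE_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global outside the allow-list."""

    def find_class(self, module, name):
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Deserialize bytes while rejecting non-allow-listed callables."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Exploit:
    """Stand-in for a malicious payload: a plain pickle.loads would call print()."""

    def __reduce__(self):
        return (print, ("arbitrary code ran during unpickling",))
```

With this in place, `safe_loads(pickle.dumps(Exploit()))` raises `pickle.UnpicklingError` instead of running the payload, while ordinary containers of numbers and strings still load.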
