What does this release signal mean?

Microsoft published microsoft/Olive v0.10.0. This release signal is evidence of what shipped, changed, or was packaged for users. High-signal details: routine tool release, no major traction · Olive-ai 0.10.0 · Repository: microsoft/Olive · Tag: v0.10.0 · Published: 2025-11-05T19:24:15Z · Prerelease: no · Release notes: New Features - Quark Quantization for ONNX Models.... onlylabs links this event to 1 captured evidence page and 6 related release signals.

Microsoft Release: microsoft/Olive v0.10.0

Captured source

GitHub: github.com/microsoft/Olive

microsoft/Olive v0.10.0

published Nov 5, 2025 · seen Jun 6 · captured Jun 11 · http 200 · method plain

Olive-ai 0.10.0

Repository: microsoft/Olive

Tag: v0.10.0

Published: 2025-11-05T19:24:15Z

Prerelease: no

Release notes:

New Features

Quark Quantization for ONNX Models (#2236) — New QuarkQuantization pass via olive run with support for int8/uint8/int16/uint16/int32/uint32/bf16/bfp16 and CLE/SmoothQuant/AdaRound/AdaQuant.
Embedding Quantization & RTN Improvements (#2238) — Added QuantEmbedding, a composable Rtn pass, and a unified checkpoint format aligned with MatMulNBits/GatherBlockQuantized (block/shape constraints enforced; AutoGPTQ/AutoAWQ export updated to 2D params).
Word Embedding Tying Surgery (#2240) — TieWordEmbeddings ties input embeddings and lm_head for both unquantized (Gemm) and quantized (MatMulNBits + GatherBlockQuantized) graphs.
Custom ONNX Model Naming (#2235) — Allows specifying a custom ONNX model name in the output directory.
Intel OpenVINO Weight Compression Pass (#2180) — Adds NNCF-based weight compression for HF/ONNX models to OpenVINO or compressed ONNX.
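
For readers unfamiliar with the term, the Rtn pass listed above refers to round-to-nearest quantization, and the "block" constraints refer to weights being quantized in fixed-size groups that share one scale. The following is a minimal, standard-library sketch of that general idea only. It is not Olive's implementation and does not reproduce its checkpoint format; the function names, the symmetric signed 4-bit scheme, and the block size are all assumptions chosen for illustration.

```python
def rtn_quantize_blockwise(weights, block_size=4, bits=4):
    """Symmetric round-to-nearest quantization with one scale per block.

    Hypothetical helper for illustration; not part of Olive.
    """
    if len(weights) % block_size != 0:
        raise ValueError("len(weights) must be a multiple of block_size")
    qmax = 2 ** (bits - 1) - 1    # largest signed level, 7 for 4 bits
    qmin = -(2 ** (bits - 1))     # smallest signed level, -8 for 4 bits
    quantized, scales = [], []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        max_abs = max(abs(w) for w in block)
        # An all-zero block gets scale 1.0 to avoid dividing by zero.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        scales.append(scale)
        # Python's round() resolves exact ties to the nearest even integer.
        quantized.extend(max(qmin, min(qmax, round(w / scale))) for w in block)
    return quantized, scales


def dequantize_blockwise(quantized, scales, block_size=4):
    """Map integer levels back to floats using each block's scale."""
    return [q * scales[i // block_size] for i, q in enumerate(quantized)]


weights = [7.0, -7.0, 3.0, 0.0, 14.0, -14.0, 2.0, 0.0]
q, s = rtn_quantize_blockwise(weights)
restored = dequantize_blockwise(q, s)
```

Because each block carries its own scale, a block with large weights (the second one here) does not force coarse steps onto a block with small weights, which is the usual motivation for block-wise schemes.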

Improvements

AIMET Enhancements (#2158, #2187, #2215) — Adds Sequential MSE, enables AIMET in quantize CLI, and supports manual precision overrides.
GPTQ Updates (#2202, #2203) — Supports user-provided module overrides and transformers >= 4.53.
Quantization Export Compatibility (#2218) — Updates checks for ort-genai > 0.9.0 and fixes minor OnnxDAG name clashes.
Torch Dynamo Export Alignment (#2185) — extract_adapter recovers folded LoRA and decomposes DORA-fused Gemm to MatMul for quantization.
Post-Surgery Deduplication (#2228) — Runs DeduplicateHashedInitializersPass after surgeries to remove duplicate initializers.
QNN Execution Provider: GPU Enablement (#2220) — Enables QNN-EP GPU, updates StaticLLM and ContextBinaryGeneration, keeps NPU default.
Run API Ergonomics (#2199) — olive.run() now accepts a dict run_config.
OpenVINO Config Overrides (#2191) — Allows overriding genai_config.json properties in OV encapsulation.
ReplaceAttentionMaskValue Robustness (#2213) — Adds Shape to ALLOWED_CONSUMER_OPS for text-encoder graphs.
Implicit Olive Version Tagging (#2183) — Automatically embeds the Olive version in saved ONNX model protos.
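
The post-surgery deduplication item above (#2228) is about removing initializers that hold identical data, which can appear after graph surgeries such as tying embeddings. The sketch below shows the general hash-based idea using only the standard library. It is an illustration under stated assumptions, not the DeduplicateHashedInitializersPass itself: the function name is hypothetical, and initializers are modelled as a plain mapping from name to raw bytes, whereas a real ONNX pass would also have to compare data type and shape and then rewrite node inputs.

```python
import hashlib


def deduplicate_initializers(initializers):
    """Keep the first initializer for each distinct payload.

    Hypothetical helper for illustration; not part of Olive.
    `initializers` maps a tensor name to its raw bytes.
    Returns (kept, rename_map), where rename_map sends each dropped
    name to the surviving name that holds the same bytes.
    """
    kept = {}
    first_name_by_digest = {}
    rename_map = {}
    for name, raw in initializers.items():
        digest = hashlib.sha256(raw).hexdigest()
        # setdefault returns the name already stored for this digest, if any.
        canonical = first_name_by_digest.setdefault(digest, name)
        if canonical == name:
            kept[name] = raw
        else:
            rename_map[name] = canonical
    return kept, rename_map


tensors = {
    "embed.weight": b"\x01\x02\x03\x04",
    "lm_head.weight": b"\x01\x02\x03\x04",  # same bytes as embed.weight
    "norm.bias": b"\x09\x09",
}
kept, rename_map = deduplicate_initializers(tensors)
```

In this toy input, `rename_map` records that consumers of `lm_head.weight` could be pointed at `embed.weight` instead, so the duplicate payload need not be stored twice.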

Notability

notability 3.0/10

Routine tool release, no major traction