ReleaseMicrosoftMicrosoftpublished Apr 9, 2026seen 2d

microsoft/Foundry-Local v1.0.0

microsoft/Foundry-Local

Open original ↗

Captured source

source ↗
published Apr 9, 2026seen 2dcaptured 10hhttp 200method plain

v1.0.0 Foundry Local - General Availability

Repository: microsoft/Foundry-Local

Tag: v1.0.0

Published: 2026-04-09T23:09:56Z

Prerelease: no

Release notes: We are excited to announce the General Availability of Foundry Local, a unified on-device AI runtime that brings generative AI directly into your applications. All inference runs locally: user data never leaves the device, responses are instant with zero network latency, and everything works offline. No per-token costs, no backend infrastructure.

SDKs

Foundry Local ships production SDKs for C#, JavaScript, Python, and Rust, each providing a consistent API surface for model management, chat completions, audio transcription, and tool calling.

| | SDK | Package | |:-:|-----|---------| | | C# | `Microsoft.AI.Foundry.Local` | | | JavaScript | `foundry-local-sdk` | | 🐍 | Python | `foundry-local-sdk` | | 🦀 | Rust | `foundry-local-sdk` |

WinML Variants

Each SDK also ships a WinML variant that unlocks more GPU and NPU devices on Windows, available through the Windows ML execution provider catalog.

| | SDK | Package | |:-:|-----|---------| | | C# | Microsoft.AI.Foundry.Local.WinML | | | JavaScript | foundry-local-sdk-winml | | 🐍 | Python | foundry-local-sdk-winml | | 🦀 | Rust | foundry-local-sdk with winml feature flag |

Platform Support

| | OS | Architectures | |:-:|-----|---------------| | | Windows | x64, ARM64 | | | macOS | ARM64 | | | Linux | x64 |

What You Can Build

Chat Completions

Full OpenAI-compatible chat completions API with multi-turn conversations, and configurable inference parameters (temperature, top-k, top-p, max tokens, frequency/presence penalty, random seed).

Audio Transcription

On-device speech-to-text. Transcribe audio files with language selection and temperature control.

Embedded Web Server

Start an OpenAI-compatible HTTP server from your application with a single call. Useful for multi-process architectures or bridging to tools that speak the OpenAI REST protocol.

Hardware Acceleration

Powered by ONNX Runtime, Foundry Local automatically detects available hardware and selects the best execution provider, with zero hardware detection code needed in your application.

Supported execution providers:

| Execution Provider | Hardware | Platform | |-------------------|----------|----------| | CPU | Universal fallback | All platforms | | WebGPU | GPU acceleration | Windows x64, macOS arm64 | | CUDA | NVIDIA GPUs | Windows x64, Linux x64 | | OpenVINO | Intel GPUs and NPUs | Windows x64 | | QNN | Qualcomm NPUs | Windows ARM64 | | TensorRT RTX | NVIDIA GPUs | Windows x64 | | VitisAI | AMD NPUs | Windows x64 |

Execution providers can be discovered, downloaded, and registered at runtime through the SDK's discoverEps() and downloadAndRegisterEps() APIs, with per-provider progress callbacks.

Model Catalog & Management

Foundry Local includes a built-in model catalog with popular open-source models, optimized with state-of-the-art quantization and compression for on-device performance.

Model management features:

  • Browse & search the catalog programmatically
  • Multi-variant models - each alias maps to multiple variants optimized for different hardware (CPU, GPU, NPU)
  • Automatic variant selection - the SDK picks the best variant based on what's cached and what hardware is available, with manual override via selectVariant()
  • Download with progress tracking - real-time percentage callbacks
  • Load / unload lifecycle - explicit control over which models are in memory
  • Version management - query the catalog for the latest version of any model

Get Started

| Language | Cross-platform | Windows ML | |----------|---------------|----------------------| | JavaScript | npm install foundry-local-sdk | npm install foundry-local-sdk-winml | | C# | dotnet add package Microsoft.AI.Foundry.Local | dotnet add package Microsoft.AI.Foundry.Local.WinML | | Python | pip install foundry-local-sdk | pip install foundry-local-sdk-winml | | Rust | cargo add foundry-local-sdk | cargo add foundry-local-sdk --features winml |

Excerpt shown — open the source for the full document.