microsoft/Foundry-Local v1.1.0
Repository: microsoft/Foundry-Local
Tag: v1.1.0
Published: 2026-05-05T20:34:34Z
Prerelease: no
Release notes:
🚀 Foundry Local v1.1.0 Release Notes
We're excited to announce Foundry Local v1.1.0 — packed with new capabilities for on-device AI! This release brings expanded platform support, new model types, and performance improvements across the board.
---
🆕 What's New
🎯 .NET netstandard2.0 / net8.0 Support
The C# SDK now targets both `net8.0` and `netstandard2.0`, broadening compatibility to .NET Framework 4.6.1+, .NET Core 2.0+, Xamarin, Unity, and more. Ship on-device AI to virtually any .NET application!
> 📖 [C# SDK Documentation](sdk/cs/README.md)
---
🎙️ Live Audio Transcription
Real-time speech-to-text is here! Stream microphone audio directly to the SDK and receive transcription results as they arrive, with no cloud round-trips and no network latency. Built on the Nemotron ASR model with an OpenAI Realtime-compatible API surface.
Python
audio_client = model.get_audio_client()
session = audio_client.create_live_transcription_session()
session.settings.sample_rate = 16000
session.settings.channels = 1
session.settings.language = "en"
session.start()
# Push audio
session.append(pcm_bytes)
# Read results (typically on a background thread)
for result in session.get_stream():
    print(result.content[0].text)  # transcribed text
    print(result.is_final)  # True for final results
session.stop()
📖 [Python live audio transcription sample](samples/python/live-audio-transcription/)
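The Python sample above pushes `pcm_bytes` into the session without showing how that buffer is produced. A minimal sketch of one way to prepare it follows, assuming the session expects 16-bit signed little-endian PCM (the release notes only specify the sample rate and channel count, so the sample width is an assumption). The helper names `floats_to_pcm16` and `chunk_pcm` are hypothetical and not part of the SDK.

```python
import math
import struct

SAMPLE_RATE = 16000   # matches session.settings.sample_rate in the sample above
CHANNELS = 1          # mono, matches session.settings.channels
BYTES_PER_SAMPLE = 2  # assumption: 16-bit signed little-endian PCM


def floats_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to 16-bit little-endian PCM bytes."""
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def chunk_pcm(pcm_bytes, chunk_ms=100):
    """Yield successive chunks that each cover chunk_ms milliseconds of audio."""
    chunk_size = SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE * chunk_ms // 1000
    for offset in range(0, len(pcm_bytes), chunk_size):
        yield pcm_bytes[offset:offset + chunk_size]


# One second of a 440 Hz test tone standing in for microphone input.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]
pcm_bytes = floats_to_pcm16(tone)
chunks = list(chunk_pcm(pcm_bytes))
```

Each element of `chunks` could then be passed to `session.append(...)` in turn. The 100 ms chunk length is an arbitrary choice for illustration, not a documented requirement.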
JavaScript
const audioClient = model.createAudioClient();
const session = audioClient.createLiveTranscriptionSession();
session.settings.sampleRate = 16000;
session.settings.channels = 1;
session.settings.language = 'en';
await session.start();
// Push audio
await session.append(pcmBytes);
// Read results
for await (const result of session.getStream()) {
  console.log(result.content[0].text); // transcribed text
  console.log(result.is_final); // true for final results
}
await session.stop();
📖 [JavaScript live audio transcription sample](samples/js/live-audio-transcription/)
C#
var audioClient = await model.GetAudioClientAsync();
var session = audioClient.CreateLiveTranscriptionSession();
session.Settings.SampleRate = 16000;
session.Settings.Channels = 1;
session.Settings.Language = "en";
await session.StartAsync();
// Push audio
await session.AppendAsync(pcmBytes);
// Read results
await foreach (var result in session.GetStream())
{
    Console.WriteLine(result.Content[0].Text); // transcribed text
    Console.WriteLine(result.IsFinal); // true for final results
}
await session.StopAsync();
📖 [C# live audio transcription sample](samples/cs/live-audio-transcription/)
Rust
let audio_client = model.create_audio_client();
let session = audio_client.create_live_transcription_session();
session.start(None).await?;
// Push audio
session.append(&pcm_bytes).await?;
// Read results
let mut stream = session.get_stream().await?;
while let Some(result) = stream.next().await {
    let r = result?;
    if let Some(content) = r.content.first() {
        println!("{}", content.text); // transcribed text
        println!("{}", r.is_final); // true for final results
    }
}
session.stop().await?;
📖 [Rust live audio transcription sample](samples/rust/live-audio-transcription/)
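All four samples print every streamed result along with its final flag. An application usually wants a single transcript instead. The sketch below shows one way to fold the stream into text, assuming that a newer interim result supersedes the previous one and that a final result closes the segment (this behavior is an assumption, not something the release notes state). `Content` and `Result` are hypothetical stand-ins that only mirror the `content[0].text` and `is_final` fields used above.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Content:
    """Hypothetical stand-in for one entry of result.content."""
    text: str


@dataclass
class Result:
    """Hypothetical stand-in for a streamed transcription result."""
    content: List[Content]
    is_final: bool


def assemble_transcript(results: Iterable[Result]) -> str:
    """Keep final segments; treat each interim result as a replaceable preview."""
    finalized = []
    pending = ""
    for result in results:
        if not result.content:
            continue  # like the Rust sample, skip results with no content
        text = result.content[0].text
        if result.is_final:
            finalized.append(text)
            pending = ""
        else:
            # Assumption: a newer interim result replaces the previous one.
            pending = text
    if pending:
        finalized.append(pending)  # keep the trailing, not yet finalized text
    return " ".join(part.strip() for part in finalized if part.strip())


demo = [
    Result([Content("hello")], False),
    Result([Content("hello wor")], False),
    Result([Content("Hello world.")], True),
    Result([], False),
    Result([Content("How are")], False),
]
transcript = assemble_transcript(demo)
print(transcript)  # Hello world. How are
```

In a real application the iterable would be `session.get_stream()` rather than the `demo` list.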
---
📐 Embeddings
Generate text embeddings entirely on-device for semantic search, RAG, clustering, and more. The new `qwen3-0.6b-embedding` model delivers high-quality vector representations in a compact footprint.
Python
model = manager.catalog.get_model("qwen3-0.6b-embedding")
model.download()
model.load()
client = model.get_embedding_client()
# Single embedding
response = client.generate_embedding("The quick brown fox jumps over the lazy dog")
embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")
# Batch embeddings
batch_response = client.generate_embeddings([
    "Machine learning is a subset of artificial intelligence",
    "The capital of France is Paris",
    "Rust is a systems programming language",
])
📖 [Python embeddings sample](samples/python/embeddings/)
JavaScript
const model = await manager.catalog.getModel('qwen3-0.6b-embedding');
await model.download();
await model.load();
const embeddingClient = model.createEmbeddingClient();
// Single embedding
const response = await embeddingClient.generateEmbedding(
  'The quick brown fox jumps over the lazy dog'
);
console.log(`Dimensions: ${response.data[0].embedding.length}`);
// Batch embeddings
const batchResponse = await embeddingClient.generateEmbeddings([
  'Machine learning is a subset of artificial intelligence',
  'The capital of France is Paris',
  'Rust is a systems programming language'
]);
📖 [JavaScript embeddings sample](samples/js/embeddings/)
C#
var model = await catalog.GetModelAsync("qwen3-0.6b-embedding");
await model.DownloadAsync();
await model.LoadAsync();
var embeddingClient = await model.GetEmbeddingClientAsync();
// Single embedding
var response = await embeddingClient.GenerateEmbeddingAsync(
    "The quick brown fox jumps over the lazy dog");
var embedding = response.Data[0].Embedding;
Console.WriteLine($"Dimensions: {embedding.Count}");
// Batch embeddings
var batchResponse = await embeddingClient.GenerateEmbeddingsAsync([
    "Machine learning is a subset of artificial intelligence",
    "The capital of France is Paris",
    "Rust is a systems programming language"
]);
📖 [C# embeddings sample](samples/cs/embeddings/)
Rust
📖 [Rust embeddings sample](samples/rust/embeddings/)
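The samples above stop once the vectors are generated. For the semantic search use case mentioned at the top of this section, a common next step is ranking documents by cosine similarity to a query vector. The sketch below uses toy 3-dimensional vectors in place of real model output. In practice each vector would come from `response.data[i].embedding`. The helpers `cosine_similarity` and `rank_by_similarity` are hypothetical and not part of the SDK.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, v))
              for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings returned by the embedding client.
doc_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
query_vec = [0.9, 0.1, 0.0]

ranking = rank_by_similarity(query_vec, doc_vecs)
for index, score in ranking:
    print(f"doc {index}: {score:.3f}")
```

For larger collections, a vector index or a NumPy matrix product would likely replace the pure-Python loop. The plain version is shown here only to keep the example free of dependencies.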
---
👁️ Qwen 3.5 Vision Language Model
Introducing Qwen 3.5 VL — a multimodal vision-language model that runs entirely on-device. Analyze images, understand visual content, and answer questions about what's in a picture — all without sending data to the cloud.
model = manager.catalog.get_model("qwen3.5-vision")
model.download()
model.load()
---
📦 JavaScript SDK — Koffi Dependency Removed
The JavaScript SDK no longer depends on koffi for native interop. This results in a leaner dependency tree, faster installs, and fewer compatibility issues across platforms and Node.js versions.
- ✅ Smaller `node_modules` — no more large native FFI dependency
- ✅ Fewer platform quirks — prebuilt N-API addon replaces runtime FFI binding
- ✅ Faster install times — less to…
Excerpt shown — open the source for the full document.