Release: Cloudflare (Workers AI), published Jun 16, 2026

cloudflare/ai workers-ai-provider@3.2.0


workers-ai-provider@3.2.0

Repository: cloudflare/ai

Tag: workers-ai-provider@3.2.0

Published: 2026-06-16T10:01:44Z

Prerelease: no

Release notes:

Minor Changes

  • #573 `4f19489` Thanks @threepointone! - Add AI Gateway routing for third-party catalog models to createWorkersAI, with capability-driven transport selection, the full provider registry, a bring-your-own-provider wrapper, typed errors, and client/server fallback.

Experimental. This is a substantial new surface for the package — well beyond its original job of wrapping Workers AI — and several behaviors rely on undocumented AI Gateway internals (the cf-aig-run-id resume buffer, per-provider run-path wire formats). Treat the entire third-party / gateway surface as experimental: the API may change, and provider coverage maturity varies (only the run-catalog providers are live-verified end-to-end). It does not affect the existing stable Workers AI / AI Search APIs.

createWorkersAI is the single public entry point. Pass an optional providers array (wire-format plugins from the sub-paths below). When set, a "provider/model" catalog slug passed to the provider (or .chat) is routed through AI Gateway automatically, while @cf/... ids continue to build Workers AI models.

Each slug is resolved against a registry of every AI Gateway provider, and the transport is picked from the requested options: the run path (env.AI.run) for resumable streaming (cf-aig-run-id, the default, on the unified-billing run catalog), or the gateway path (env.AI.gateway(id).run([…])) for BYOK providers, server-side fallback, and caching. Incompatible option combinations (e.g. resume: true with fallback.mode: "server", or resume/transport: "run" on a BYOK provider) throw a clear GatewayDelegateError; resume-disabling combinations warn loudly.

This is fully additive: leaving providers unset preserves the prior behavior exactly, and passing a catalog slug without it throws a helpful error. The chat factory's settings argument is typed from the model id literal: a "provider/model" slug autocompletes DelegateCallOptions, while a @cf/... id autocompletes WorkersAIChatSettings. gateway is optional for catalog routing: when unset, requests use the account's "default" AI Gateway; set gateway (here or per call) to target a specific one.
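The routing rules described here (catalog slugs versus @cf/... ids, and run path versus gateway path) can be modeled as a small decision function. What follows is a minimal sketch under stated assumptions, not the package's implementation: isCatalogSlug, pickTransport, SketchOptions, and SketchDelegateError are hypothetical names, the cache flag is an assumed stand-in for the caching options, and only the option combinations named in the notes are encoded.

```typescript
// Hypothetical model of the routing rules in the release notes.
// None of these names are exports of workers-ai-provider.
type Transport = "run" | "gateway";

interface SketchOptions {
  transport?: Transport; // explicit transport request
  resume?: boolean; // resumable streaming (run path only)
  byok?: boolean; // the provider key is supplied by the caller
  cache?: boolean; // assumed flag standing in for caching options
  fallback?: { mode: "client" | "server" };
}

class SketchDelegateError extends Error {}

// "@cf/..." ids stay on Workers AI; any other id containing "/" is
// treated as a catalog slug in this sketch.
function isCatalogSlug(modelId: string): boolean {
  return !modelId.startsWith("@cf/") && modelId.includes("/");
}

function pickTransport(opts: SketchOptions): Transport {
  // Features that the notes assign to the gateway path.
  const needsGateway =
    opts.byok === true || opts.cache === true || opts.fallback?.mode === "server";

  // The two incompatible combinations the notes call out explicitly.
  if (opts.resume === true && opts.fallback?.mode === "server") {
    throw new SketchDelegateError('resume: true cannot be combined with fallback.mode: "server"');
  }
  if (opts.byok === true && (opts.resume === true || opts.transport === "run")) {
    throw new SketchDelegateError('resume / transport: "run" is not available on a BYOK provider');
  }

  // Combinations the notes do not spell out fall through to the explicit choice.
  if (opts.transport !== undefined) return opts.transport;
  return needsGateway ? "gateway" : "run";
}
```

In this model, pickTransport({}) yields "run" and pickTransport({ byok: true }) yields "gateway". The warn-only, resume-disabling combinations are left out because the notes do not enumerate them.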

New sub-path exports:

  • workers-ai-provider/openai, workers-ai-provider/anthropic, workers-ai-provider/google — provider plugins keyed by wire format. One openai plugin serves the OpenAI-compatible long tail (deepseek, xai/grok, groq, mistral, perplexity, cerebras, openrouter, fireworks) plus the unified-catalog chat providers alibaba (Qwen) and minimax. @ai-sdk/openai, @ai-sdk/anthropic, and @ai-sdk/google are optional peer dependencies; install only the ones whose wire formats you use. The openai plugin is required for the run path (see below). Providers whose gateway-path URL isn't reproducible from the shared builder (cohere, baseten, parallel, azure-openai, google-vertex) and provider-native/non-chat providers are bring-your-own-provider only.
  • workers-ai-provider/gateway: createGatewayFetch / createGatewayProvider wrap any @ai-sdk/* provider so its traffic flows through AI Gateway (provider id detected from the request URL, or set explicitly). Use it for provider-native or non-chat providers the slug routing can't auto-wire (bedrock, replicate, audio/image), or for full control of the underlying provider.
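Detecting the provider id from the request URL can be pictured as a lookup over host patterns. The sketch below assumes a tiny illustrative table; HOST_RULES, hostOf, and detectProvider are hypothetical names, and the three entries are examples only. The package's actual registry and its patterns are not shown in these notes.

```typescript
// Illustrative host table. The real registry covers many more providers,
// and its exact patterns are an unknown here.
interface HostRule {
  providerId: string;
  hostPattern: RegExp;
}

const HOST_RULES: HostRule[] = [
  { providerId: "openai", hostPattern: /^api\.openai\.com$/ },
  { providerId: "anthropic", hostPattern: /^api\.anthropic\.com$/ },
  { providerId: "google-ai-studio", hostPattern: /^generativelanguage\.googleapis\.com$/ },
];

// Extracts the host from an absolute http(s) URL, or undefined if there is none.
function hostOf(requestUrl: string): string | undefined {
  const match = /^https?:\/\/([^\/:?#]+)/i.exec(requestUrl);
  return match?.[1]?.toLowerCase();
}

// Returns the provider id for a request URL, or undefined when no rule
// matches (the caller would then have to set the provider explicitly).
function detectProvider(requestUrl: string): string | undefined {
  const host = hostOf(requestUrl);
  if (host === undefined) return undefined;
  return HOST_RULES.find((rule) => rule.hostPattern.test(host))?.providerId;
}
```

Returning undefined instead of throwing keeps the "or set explicitly" escape hatch open: a wrapper built this way can fall back to a caller-supplied provider id when detection fails.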

The transport types, error classes (WorkersAIGatewayError, WorkersAIFallbackError, GatewayDelegateError), the registry helpers, DelegateCallOptions, and createResumableStream are re-exported from the package root.

Features:

  • Provider registry (GATEWAY_PROVIDERS, findProviderBySlug, detectProviderByUrl) maps slugs to gateway provider ids, wire formats, billing model, and run-catalog membership. Covers every provider in the AI Gateway directory (OpenAI, Anthropic, Google AI Studio/Vertex, xAI, Groq, DeepSeek, Mistral, Perplexity, Cerebras, OpenRouter, Cohere, Baseten, Parallel, Azure OpenAI, Amazon Bedrock, HuggingFace, Replicate, Fal, Ideogram, Cartesia, Deepgram, ElevenLabs — plus Fireworks), with URL host patterns so createGatewayFetch auto-detects each from the wrapped provider's request URL. Also includes the unified-catalog chat providers alibaba (Qwen) and minimax on the resumable run catalog (verified live: OpenAI-wire, cf-aig-run-id on streams); these are run-path only (gatewayPath: false — not native gateway providers), so caching, server-side fallback, and transport: "gateway" are rejected with a clear GatewayDelegateError instead of failing upstream.
  • Metadata & logging: metadata (custom log attributes for spend attribution) and collectLog are first-class call options on both transports. On the run path they fold into the typed gateway options; on the gateway path they become cf-aig-metadata / cf-aig-collect-log headers (bigint metadata values are coerced to strings). Call-level metadata merges over (and wins against) any metadata set via gateway: { metadata }.
  • BYOK — set byok: true (+ supply the key via extraHeaders) to forward the upstream provider key on the gateway path; otherwise provider auth headers are stripped so unified billing / the gateway's stored key applies.
  • Client-side fallback (fallback.mode: "client") keeps resume per leg — a failed pre-stream dispatch falls through to the next model; if all fail, a WorkersAIFallbackError carries the per-attempt tree. Server-side fallback (fallback.mode: "server") routes same-vendor fallbacks through the gateway path.
  • Typed errors: WorkersAIGatewayError (with a coarse code, a recoverable hint, and the parsed CF/provider envelope) and WorkersAIFallbackError (attempt tree). Helpers classifyStatus / extractErrorMessage are exported.
  • Abort + gateway options are passed through on both transports.
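The client-side fallback behavior listed above (a failed pre-stream dispatch falls through to the next model, and a final error carries every attempt) can be sketched as a simple loop. This is a hedged illustration only: withClientFallback, Dispatch, Attempt, and SketchFallbackError are hypothetical names, and the flat attempts array is an assumed simplification of the "attempt tree" that WorkersAIFallbackError is said to carry.

```typescript
// Hypothetical stand-ins; not the package's classes or signatures.
interface Attempt {
  model: string;
  error: string;
}

class SketchFallbackError extends Error {
  readonly attempts: Attempt[];
  constructor(attempts: Attempt[]) {
    super(`all ${attempts.length} models failed`);
    this.attempts = attempts;
  }
}

// `dispatch` stands in for "start a request to this model"; it rejects
// when the request fails before any stream is produced.
type Dispatch<T> = (model: string) => Promise<T>;

async function withClientFallback<T>(models: string[], dispatch: Dispatch<T>): Promise<T> {
  const attempts: Attempt[] = [];
  for (const model of models) {
    try {
      // The first model that dispatches successfully wins.
      return await dispatch(model);
    } catch (err) {
      // Record the failure and fall through to the next model.
      attempts.push({ model, error: err instanceof Error ? err.message : String(err) });
    }
  }
  throw new SketchFallbackError(attempts);
}
```

Because each leg is dispatched independently, each one could keep its own resumable stream, which is one plausible reading of "keeps resume per leg". Failures after a stream has started are outside this sketch.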

On the run path, the response stream is wrapped so a transient mid-stream drop reconnects through the gateway resume endpoint (resume?from=N) transparently — the @ai-sdk parser never sees the break. from is an SSE event index, so the wrapper emits only complete events and realigns on the boundary...
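The reconnect-by-event-index idea can be illustrated with a small async generator. This is a sketch under assumptions, not the package's createResumableStream: resumable and Connect are hypothetical names, connect(from) stands in for "open the stream starting at SSE event index from", events are assumed to be separated by a blank line ("\n\n"), and any thrown error is treated as a transient drop.

```typescript
// Hypothetical: opens the stream so that it starts at event index `from`.
type Connect = (from: number) => AsyncIterable<string>;

// Yields only complete SSE events and reconnects at the next undelivered
// event index when the underlying connection fails mid-stream.
async function* resumable(connect: Connect, maxReconnects = 3): AsyncGenerator<string> {
  let delivered = 0; // number of complete events already emitted
  let reconnects = 0;
  for (;;) {
    let buffer = ""; // partial event text from the current connection only
    try {
      for await (const chunk of connect(delivered)) {
        buffer += chunk;
        let end = buffer.indexOf("\n\n");
        while (end !== -1) {
          const event = buffer.slice(0, end);
          buffer = buffer.slice(end + 2);
          delivered += 1;
          yield event;
          end = buffer.indexOf("\n\n");
        }
      }
      return; // the stream ended cleanly
    } catch (err) {
      if (reconnects >= maxReconnects) throw err;
      reconnects += 1;
      // The partial event left in `buffer` is dropped; the next connection
      // starts at the first event that was not fully delivered.
    }
  }
}
```

Counting only complete events is what keeps the index aligned: a half-received event is never emitted, so requesting from = delivered replays it in full and the downstream parser sees an unbroken sequence.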


Notability

notability 3.0/10

Routine minor release of Cloudflare's Workers AI provider.