Repo · Microsoft · published Oct 8, 2025

microsoft/amplifier-module-provider-anthropic

Description: Reference implementation for an Anthropic provider for the Amplifier project

Language: Python

License: MIT

Stars: 2

Forks: 10

Open issues: 8

Created: 2025-10-08T21:53:28Z

Pushed: 2026-06-10T06:17:31Z

Default branch: main

Fork: no

Archived: no

README:

Amplifier Anthropic Provider Module

Claude model integration for Amplifier via Anthropic API.

Prerequisites

  • Python 3.11+
  • [UV](https://github.com/astral-sh/uv) - Fast Python package manager

Installing UV

# macOS/Linux/WSL
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

Purpose

Provides access to Anthropic's Claude models (Claude 4 series: Sonnet, Opus, Haiku) as an LLM provider for Amplifier.

Contract

  • Module Type: Provider
  • Mount Point: providers
  • Entry Point: amplifier_module_provider_anthropic:mount

Supported Models

  • claude-sonnet-4-5 - Claude Sonnet 4.5 (recommended, default)
  • claude-opus-4-6 - Claude Opus 4.6 (most capable)
  • claude-haiku-4-5 - Claude Haiku 4.5 (fastest, cheapest)

Configuration

[[providers]]
module = "provider-anthropic"
name = "anthropic"

[providers.config]
default_model = "claude-sonnet-4-5"
max_tokens = 8192
temperature = 1.0
debug = false      # Enable standard debug events
raw_debug = false  # Enable ultra-verbose raw API I/O logging

Debug Configuration

Standard Debug (debug: true):

  • Emits llm:request:debug and llm:response:debug events
  • Contains request/response summaries with message counts, model info, usage stats
  • Moderate log volume, suitable for development

Raw Debug (debug: true, raw_debug: true):

  • Emits llm:request:raw and llm:response:raw events
  • Contains complete, unmodified request params and response objects
  • Extreme log volume, use only for deep provider integration debugging
  • Captures the exact data sent to/from Anthropic API before any processing

Example:

providers:
  - module: provider-anthropic
    config:
      debug: true # Enable debug events
      raw_debug: true # Enable raw API I/O capture
      default_model: claude-sonnet-4-5
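
The relationship between the two flags can be sketched as follows. This is an illustration only, not the provider's code: `debug_event_names` is a hypothetical helper, and it assumes that raw events are emitted in addition to the standard debug events and only when `debug` is also enabled, as the two headings above suggest.

```python
def debug_event_names(config: dict) -> list[str]:
    """Return the debug event names a config would enable (hypothetical helper)."""
    events: list[str] = []
    if config.get("debug", False):
        events += ["llm:request:debug", "llm:response:debug"]
        # Assumption: raw events are only emitted when debug is on as well.
        if config.get("raw_debug", False):
            events += ["llm:request:raw", "llm:response:raw"]
    return events
```

Under this assumption, setting `raw_debug: true` without `debug: true` enables nothing.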

Retry and Error Handling

The provider disables SDK built-in retries (max_retries=0) and manages retries itself via amplifier_core.utils.retry.retry_with_backoff(). This gives the provider full control over backoff timing, retry-after header honoring, and per-error-class delay scaling.

Error Translation

SDK exceptions are translated to kernel errors before the retry loop sees them. All translations preserve the original exception as __cause__ for debugging.

| SDK Exception | Condition | Kernel Error | Status | Retryable |
| --- | --- | --- | --- | --- |
| RateLimitError | 429 | RateLimitError | 429 | Yes |
| OverloadedError | 529 | ProviderUnavailableError | 529 | Yes (10× backoff) |
| InternalServerError | 5xx | ProviderUnavailableError | 5xx | Yes |
| AuthenticationError | 401 | AuthenticationError | 401 | No |
| BadRequestError | context length / too many tokens | ContextLengthError | 400 | No |
| BadRequestError | safety / content filter / blocked | ContentFilterError | 400 | No |
| BadRequestError | other | InvalidRequestError | 400 | No |
| APIStatusError | 403 | AccessDeniedError | 403 | No |
| APIStatusError | 404 | NotFoundError | 404 | No |
| APIStatusError | other non-5xx | LLMError | — | No |
| asyncio.TimeoutError | — | LLMTimeoutError | — | Yes |
| Other | — | LLMError | — | Yes |
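
The translation pattern in the table can be sketched with stand-in classes. Everything below is hypothetical: `SdkStatusError` and `KernelError` are placeholders defined here, not the real SDK or kernel classes, and only the status-code rows are covered (the message-based split of 400 errors and the timeout row are left out). The point is the `raise ... from exc` idiom, which is what keeps the original exception available as `__cause__`.

```python
class SdkStatusError(Exception):
    """Stand-in for an SDK exception that carries an HTTP status code."""

    def __init__(self, message: str, status_code: int) -> None:
        super().__init__(message)
        self.status_code = status_code


class KernelError(Exception):
    """Stand-in base class for kernel errors (plays the role of LLMError)."""

    def __init__(self, message: str, status_code: int, retryable: bool) -> None:
        super().__init__(message)
        self.status_code = status_code
        self.retryable = retryable


class RateLimitError(KernelError): ...
class ProviderUnavailableError(KernelError): ...
class AuthenticationError(KernelError): ...
class InvalidRequestError(KernelError): ...
class AccessDeniedError(KernelError): ...
class NotFoundError(KernelError): ...


# (kernel error class, retryable) keyed by exact status code.
_BY_STATUS: dict[int, tuple[type[KernelError], bool]] = {
    429: (RateLimitError, True),
    401: (AuthenticationError, False),
    400: (InvalidRequestError, False),
    403: (AccessDeniedError, False),
    404: (NotFoundError, False),
}


def translate(exc: SdkStatusError) -> KernelError:
    """Map an SDK status error to a kernel error, following the table's status rows."""
    if exc.status_code >= 500:  # includes 529 Overloaded
        cls, retryable = ProviderUnavailableError, True
    else:
        cls, retryable = _BY_STATUS.get(exc.status_code, (KernelError, False))
    return cls(str(exc), exc.status_code, retryable)


def call_translated(fn):
    """Run fn() and re-raise SDK errors as kernel errors, chaining the original."""
    try:
        return fn()
    except SdkStatusError as exc:
        raise translate(exc) from exc  # original is preserved as __cause__
```

A retry loop built on top of this only needs to inspect `retryable` on the kernel error and never touches SDK types.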

Backoff Formula

Each retry delay is computed as follows:

base_delay = min_retry_delay × 2^(attempt - 1)
capped_delay = min(base_delay, max_retry_delay)
scaled_delay = capped_delay × delay_multiplier # 1.0 for most errors, 10.0 for 529
final_delay = max(scaled_delay, retry_after) # server retry-after as floor
sleep = final_delay ± (final_delay × jitter) # randomised ± jitter fraction

Example: 529 Overloaded (10× multiplier, defaults)

| Attempt | base_delay | capped | ×10 | Sleep |
| --- | --- | --- | --- | --- |
| 1 | 1s | 1s | 10s | 10s |
| 2 | 2s | 2s | 20s | 20s |
| 3 | 4s | 4s | 40s | 40s |
| 4 | 8s | 8s | 80s | 80s |
| 5 | 16s | 16s | 160s | 160s |

Total wait ≈ 310s (~5 min), ignoring jitter, before the request is abandoned.

Retry Configuration

providers:
  - module: provider-anthropic
    config:
      max_retries: 5
      min_retry_delay: 1.0
      max_retry_delay: 60.0
      retry_jitter: 0.2
      overloaded_delay_multiplier: 10.0

| Key | Default | Description |
| --- | --- | --- |
| max_retries | 5 | Maximum retry attempts before giving up |
| min_retry_delay | 1.0 | Base delay in seconds for the first retry |
| max_retry_delay | 60.0 | Cap on the base delay (before multiplier) |
| retry_jitter | 0.2 | Jitter fraction (0.0–1.0). Also accepts true (→ 0.2) or false (→ 0.0) for backward compatibility |
| overloaded_delay_multiplier | 10.0 | Multiplier applied to delays for 529 Overloaded errors |

Events

A provider:retry event is emitted before each retry sleep with the following fields:

| Field | Description |
| --- | --- |
| provider | Provider name ("anthropic") |
| model | Model being called |
| attempt | Current retry attempt number |
| max_retries | Configured maximum retries |
| delay | Computed sleep duration in seconds |
| retry_after | Server retry-after value (or null) |
| error_type | Kernel error class name |
| error_message | Error description |

Beta Headers

Anthropic provides experimental features through beta headers. Enable these features by adding the beta_headers configuration field.

Configuration

Single beta header:

providers:
  - module: provider-anthropic
    config:
      default_model: claude-sonnet-4-5
      beta_headers: "context-1m-2025-08-07" # Enable 1M token context window

Multiple beta headers:

providers:
  - module: provider-anthropic
    config:
      default_model: claude-sonnet-4-5
      beta_headers:
        - "context-1m-2025-08-07"
        - "future-feature-header"

1M Token Context Window

Claude Sonnet 4.5 supports a 1M token context window when the context-1m-2025-08-07 beta header is enabled:

providers:
  - module: provider-anthropic
    config:
      default_model: claude-sonnet-4-5
      beta_headers: "context-1m-2025-08-07"
      max_tokens: 8192 # Output tokens remain separate from context window

With this configuration:

  • Context window: Up to 1M tokens of input…
