Anthropic, published Nov 24, 2025

Advanced Tool Use


Introducing advanced tool use on the Claude Developer Platform

Published Nov 24, 2025

We’ve added three new beta features that let Claude discover, learn, and execute tools dynamically. Here’s how they work.

The future of AI agents is one where models work seamlessly across hundreds or thousands of tools. An IDE assistant that integrates git operations, file manipulation, package managers, testing frameworks, and deployment pipelines. An operations coordinator that connects Slack, GitHub, Google Drive, Jira, company databases, and dozens of MCP servers simultaneously.

To be effective, agents need to work with unlimited tool libraries without stuffing every definition into context upfront. Our blog article on using code execution with MCP discussed how tool results and definitions can sometimes consume 50,000+ tokens before an agent reads a request. Agents should discover and load tools on-demand, keeping only what's relevant for the current task.

Agents also need the ability to call tools from code. When using natural language tool calling, each invocation requires a full inference pass, and intermediate results pile up in context whether they're useful or not. Code is a natural fit for orchestration logic, such as loops, conditionals, and data transformations. Agents need the flexibility to choose between code execution and inference based on the task at hand.

Agents also need to learn correct tool usage from examples, not just schema definitions. JSON schemas define what's structurally valid, but can't express usage patterns: when to include optional parameters, which combinations make sense, or what conventions your API expects.

Today, we're releasing three features that make this possible:

- Tool Search Tool, which allows Claude to use search tools to access thousands of tools without consuming its context window
- Programmatic Tool Calling, which allows Claude to invoke tools in a code execution environment, reducing the impact on the model’s context window
- Tool Use Examples, which provides a universal standard for demonstrating how to effectively use a given tool

In internal testing, we’ve found these features have helped us build things that wouldn’t have been possible with conventional tool use patterns. For example, Claude for Excel uses Programmatic Tool Calling to read and modify spreadsheets with thousands of rows without overloading the model’s context window. Based on our experience, we believe these features open up new possibilities for what you can build with Claude.
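To make the contrast with natural language tool calling concrete, here is a minimal Python sketch of orchestration in code. The `read_rows` function is a hypothetical stand-in for a spreadsheet tool (it is not part of any Anthropic API), and the data is synthetic. The point is only that the loop and the aggregation run in code, so a short summary rather than thousands of rows would need to reach the model's context.

```python
from typing import Iterator


def read_rows(start: int, count: int) -> list[dict]:
    """Hypothetical tool: return `count` synthetic spreadsheet rows from `start`."""
    return [{"row": i, "amount": i % 7} for i in range(start, start + count)]


def batches(total: int, size: int) -> Iterator[tuple[int, int]]:
    """Yield (start, count) pairs that cover `total` rows in chunks of `size`."""
    for start in range(0, total, size):
        yield start, min(size, total - start)


def summarize(total_rows: int, batch_size: int = 500) -> dict:
    """Call the tool in a loop and keep only the aggregate, not the rows."""
    running_total = 0
    flagged = 0
    for start, count in batches(total_rows, batch_size):
        for row in read_rows(start, count):
            running_total += row["amount"]
            if row["amount"] == 6:  # arbitrary condition for the example
                flagged += 1
    return {"rows": total_rows, "total": running_total, "flagged": flagged}


summary = summarize(5000)
print(summary)
```

With natural language tool calling, each of the ten batches would be a separate inference pass and every row would land in context; here only the three-field summary is returned.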

Tool Search Tool

The challenge

MCP tool definitions provide important context, but as more servers connect, those tokens can add up. Consider a five-server setup:

- GitHub: 35 tools (~26K tokens)
- Slack: 11 tools (~21K tokens)
- Sentry: 5 tools (~3K tokens)
- Grafana: 5 tools (~3K tokens)
- Splunk: 2 tools (~2K tokens)
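As a quick check, the totals for this setup can be tallied in a few lines of Python. The figures are the approximate ones listed above; the `servers` dictionary is just an illustrative way to hold them.

```python
# (tool count, approximate token cost) per MCP server, as listed above
servers = {
    "GitHub": (35, 26_000),
    "Slack": (11, 21_000),
    "Sentry": (5, 3_000),
    "Grafana": (5, 3_000),
    "Splunk": (2, 2_000),
}

total_tools = sum(tools for tools, _ in servers.values())
total_tokens = sum(tokens for _, tokens in servers.values())

print(f"{total_tools} tools, ~{total_tokens // 1000}K tokens")
```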

That's 58 tools consuming approximately 55K tokens before the conversation even starts. Add more servers like Jira (which alone uses ~17K tokens) and you're quickly approaching 100K+ token overhead. At Anthropic, we've seen tool definitions consume 134K tokens before optimization.

But token cost isn't the only issue. The most common failures are wrong tool selection and incorrect parameters, especially when tools have similar names like notification-send-user vs. notification-send-channel.

Our solution

Instead of loading all tool definitions upfront, the Tool Search Tool discovers tools on-demand. Claude only sees the tools it actually needs for the current task.

Tool Search Tool preserves 191,300 tokens of context compared to 122,800 with Claude’s traditional approach.

Traditional approach:

- All tool definitions loaded upfront (~72K tokens for 50+ MCP tools)
- Conversation history and system prompt compete for remaining space
- Total context consumption: ~77K tokens before any work begins

With the Tool Search Tool:

- Only the Tool Search Tool loaded upfront (~500 tokens)
- Tools discovered on-demand as needed (3-5 relevant tools, ~3K tokens)
- Total context consumption: ~8.7K tokens, preserving 95% of context window
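As a rough check on these figures, the sketch below recomputes the remaining context for both approaches. The 200,000-token window is an assumption inferred from the numbers quoted above (191,300 + 8,700); the text does not state the window size directly.

```python
# Assumption: a 200K-token context window, inferred from 191,300 + 8,700.
CONTEXT_WINDOW = 200_000


def remaining(used_tokens: int, window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for conversation and work after tool overhead."""
    return window - used_tokens


traditional_used = 77_000  # "~77K tokens before any work begins"
tool_search_used = 8_700   # "~8.7K tokens"

for label, used in [("traditional", traditional_used), ("tool search", tool_search_used)]:
    left = remaining(used)
    print(f"{label}: {left:,} tokens left ({left / CONTEXT_WINDOW:.1%} of window)")
```

With the rounded ~77K figure this gives 123,000 tokens for the traditional approach, close to the 122,800 quoted above; the small gap presumably comes from rounding in the published numbers.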

This represents an 85% reduction in token usage while maintaining access to your full tool library. Internal testing showed significant accuracy improvements on MCP evaluations when working with large tool libraries. Opus 4 improved from 49% to 74%, and Opus 4.5 improved from 79.5% to 88.1% with Tool Search Tool enabled.

How the Tool Search Tool works

The Tool Search Tool lets Claude dynamically discover tools instead of loading all definitions upfront. You provide all your tool definitions to the API, but mark tools with defer_loading: true to make them discoverable on-demand. Deferred tools aren't loaded into Claude's context initially. Claude only sees the Tool Search Tool itself plus any tools with defer_loading: false (your most critical, frequently-used tools).

When Claude needs specific capabilities, it searches for relevant tools. The Tool Search Tool returns references to matching tools, which get expanded into full definitions in Claude's context.

For example, if Claude needs to interact with GitHub, it searches for "github," and only github.createPullRequest and github.listIssues get loaded, not your other 50+ tools from Slack, Jira, and Google Drive.

This way, Claude has access to your full tool library while only paying the token cost for tools it actually needs.

Prompt caching note: Tool Search Tool doesn't break prompt caching because deferred tools are excluded from the initial prompt entirely. They're only added to context after Claude searches for them, so your system prompt and core tool definitions remain cacheable.

Implementation:

    {
      "tools": [
        // Include a tool search tool (regex, BM25, or custom)
        {"type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex"},

        // Mark tools for on-demand discovery
        {
          "name": "github.createPullRequest",
          "description": "Create a pull request",
          "input_schema": {...},
          "defer_loading": true
        }
        // ... hundreds more deferred tools with defer_loading: true
      ]
    }
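The sketch below builds the same kind of tool list in Python and simulates, locally, which definitions would be visible before a search and which would match one. It is an illustration under assumptions, not the platform's implementation: `deferred`, `initial_context`, and `regex_search` are hypothetical helpers, the Slack and Jira entries and the placeholder schemas are invented for the example, and the real search behavior (fields matched, ranking, result limits) may differ.

```python
import re

SEARCH_TOOL = {"type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex"}


def deferred(name: str, description: str) -> dict:
    """Build a tool definition marked for on-demand discovery."""
    return {
        "name": name,
        "description": description,
        "input_schema": {"type": "object", "properties": {}},  # placeholder schema
        "defer_loading": True,
    }


tools = [
    SEARCH_TOOL,
    deferred("github.createPullRequest", "Create a pull request"),
    deferred("github.listIssues", "List issues in a repository"),
    deferred("slack.postMessage", "Post a message to a channel"),  # hypothetical tool
    deferred("jira.createTicket", "Create a ticket"),  # hypothetical tool
]


def initial_context(tool_list: list[dict]) -> list[str]:
    """Names visible before any search: every tool not marked defer_loading."""
    return [t["name"] for t in tool_list if not t.get("defer_loading", False)]


def regex_search(tool_list: list[dict], pattern: str) -> list[str]:
    """Hypothetical local stand-in: match deferred tools by name and description."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [
        t["name"]
        for t in tool_list
        if t.get("defer_loading") and rx.search(f'{t["name"]} {t.get("description", "")}')
    ]


print(initial_context(tools))
print(regex_search(tools, "github"))
```

In this simulation only the search tool is visible upfront, and a search for "github" surfaces the two GitHub tools while the Slack and Jira definitions stay out of context.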

For MCP servers, you can defer loading entire servers while keeping specific high-use tools loaded:

    {
      "type": "mcp_toolset",
      "mcp_server_name": "google-drive",
      "default_config": {"defer_loading": true}, # defer loading the entire server
      "configs": {
        "search_files": {
          "defer_loading":…

Excerpt shown — open the source for the full document.