Code Execution With MCP
Engineering at Anthropic
Code execution with MCP: Building more efficient agents
Published Nov 04, 2025

Direct tool calls consume context for each definition and result. Agents scale better by writing code to call tools instead. Here's how it works with MCP.
The Model Context Protocol (MCP) is an open standard for connecting AI agents to external systems. Connecting agents to tools and data traditionally requires a custom integration for each pairing, creating fragmentation and duplicated effort that makes it difficult to scale truly connected systems. MCP provides a universal protocol: developers implement MCP once in their agent and it unlocks an entire ecosystem of integrations.

Since launching MCP in November 2024, adoption has been rapid: the community has built thousands of MCP servers, SDKs are available for all major programming languages, and the industry has adopted MCP as the de facto standard for connecting agents to tools and data.

Today developers routinely build agents with access to hundreds or thousands of tools across dozens of MCP servers. However, as the number of connected tools grows, loading all tool definitions upfront and passing intermediate results through the context window slows down agents and increases costs.

In this blog we'll explore how code execution can enable agents to interact with MCP servers more efficiently, handling more tools while using fewer tokens.

Excessive token consumption from tools makes agents less efficient

As MCP usage scales, there are two common patterns that can increase agent cost and latency: tool definitions overload the context window, and intermediate tool results consume additional tokens.
1. Tool definitions overload the context window

Most MCP clients load all tool definitions upfront directly into context, exposing them to the model using a direct tool-calling syntax. These tool definitions might look like:

    gdrive.getDocument
      Description: Retrieves a document from Google Drive
      Parameters:
        documentId (required, string): The ID of the document to retrieve
        fields (optional, string): Specific fields to return
      Returns: Document object with title, body content, metadata, permissions, etc.

    salesforce.updateRecord
      Description: Updates a record in Salesforce
      Parameters:
        objectType (required, string): Type of Salesforce object (Lead, Contact, Account, etc.)
        recordId (required, string): The ID of the record to update
        data (required, object): Fields to update with their new values
      Returns: Updated record object with confirmation
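Definitions like these add up when every one is loaded upfront. The sketch below is a rough illustration only: `estimateTokens` and `upfrontDefinitionCost` are hypothetical helpers, and the four-characters-per-token ratio is a crude rule of thumb assumed here, not the behavior of any real tokenizer.

```typescript
// Hypothetical helper: crude token estimate, assuming roughly 4 characters per token.
// A real tokenizer will give different numbers; this only illustrates how the cost scales.
function estimateTokens(text: string, charsPerToken: number = 4): number {
  return Math.ceil(text.length / charsPerToken);
}

// Context consumed when every definition is loaded before the first request is read.
function upfrontDefinitionCost(definitions: string[], charsPerToken: number = 4): number {
  return definitions.reduce(
    (total, definition) => total + estimateTokens(definition, charsPerToken),
    0,
  );
}

// Abbreviated stand-ins for the two definitions above.
const loadedDefinitions: string[] = [
  "gdrive.getDocument Description: Retrieves a document from Google Drive",
  "salesforce.updateRecord Description: Updates a record in Salesforce",
];

console.log(`~${upfrontDefinitionCost(loadedDefinitions)} tokens loaded before any request`);
```

Because the cost is a sum over every definition, it grows linearly with the number of connected tools, regardless of how many of them the current task actually uses.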
Tool descriptions occupy more context window space, increasing response time and costs. In cases where agents are connected to thousands of tools, they’ll need to process hundreds of thousands of tokens before reading a request.

2. Intermediate tool results consume additional tokens

Most MCP clients allow models to directly call MCP tools. For example, you might ask your agent: "Download my meeting transcript from Google Drive and attach it to the Salesforce lead."

The model will make calls like:

    TOOL CALL: gdrive.getDocument(documentId: "abc123")
        → returns "Discussed Q4 goals...\n[full transcript text]"
          (loaded into model context)

    TOOL CALL: salesforce.updateRecord(
        objectType: "SalesMeeting",
        recordId: "00Q5f000001abcXYZ",
        data: { "Notes": "Discussed Q4 goals...\n[full transcript text written out]" }
    )
    (model needs to write entire transcript into context again)
Every intermediate result must pass through the model. In this example, the full call transcript flows through twice. For a 2-hour sales meeting, that could mean processing an additional 50,000 tokens. Even larger documents may exceed context window limits, breaking the workflow.

With large documents or complex data structures, models may be more likely to make mistakes when copying data between tool calls.

Figure: The MCP client loads tool definitions into the model's context window and orchestrates a message loop where each tool call and result passes through the model between operations.

Code execution with MCP improves context efficiency

With code execution environments becoming more common for agents, a solution is to present MCP servers as code APIs rather than direct tool calls. The agent can then write code to interact with MCP servers. This approach addresses both challenges: agents can load only the tools they need and process data in the execution environment before passing results back to the model.

There are a number of ways to do this. One approach is to generate a file tree of all available tools from connected MCP servers. Here's an implementation using TypeScript:

    servers
    ├── google-drive
    │   ├── getDocument.ts
    │   ├── ... (other tools)
    │   └── index.ts
    ├── salesforce
    │   ├── updateRecord.ts
    │   ├── ... (other tools)
    │   └── index.ts
    └── ... (other servers)
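The post does not show how such a tree is produced. As a minimal sketch, assuming the client has already listed each server's tools, the generation step could map tool metadata to file paths and wrapper source text. `ToolSpec`, `toCamelCase`, and `generateToolFiles` are hypothetical names, the `server__tool` identifier format is modeled on the `google_drive__get_document` identifier used in this post, and a real generator would presumably also derive typed inputs and outputs from each tool's schema.

```typescript
// Hypothetical description of one tool, as a client might hold it after listing a server's tools.
interface ToolSpec {
  server: string;      // directory name, e.g. "google-drive"
  toolName: string;    // MCP tool name, e.g. "get_document"
  description: string;
}

// "get_document" -> "getDocument"
function toCamelCase(name: string): string {
  return name.replace(/[_-]+([a-zA-Z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

// Returns a map from file path to generated TypeScript source text.
function generateToolFiles(
  tools: ToolSpec[],
  clientImportPath: string = "../../../client.js", // path used in the post's wrapper example
): Record<string, string> {
  const files: Record<string, string> = {};
  const indexLines: Record<string, string[]> = {};

  for (const tool of tools) {
    const fn = toCamelCase(tool.toolName);
    // Assumed identifier format: server name with underscores, "__", then the tool name.
    const mcpId = `${tool.server.replace(/-/g, "_")}__${tool.toolName}`;
    files[`servers/${tool.server}/${fn}.ts`] = [
      `import { callMCPTool } from "${clientImportPath}";`,
      ``,
      `/* ${tool.description} */`,
      `export async function ${fn}(input: object): Promise<unknown> {`,
      `  return callMCPTool('${mcpId}', input);`,
      `}`,
      ``,
    ].join("\n");

    const lines = indexLines[tool.server] ?? [];
    lines.push(`export * from "./${fn}.js";`);
    indexLines[tool.server] = lines;
  }

  // One index.ts per server re-exports every tool in that directory.
  for (const server of Object.keys(indexLines)) {
    files[`servers/${server}/index.ts`] = (indexLines[server] ?? []).join("\n") + "\n";
  }
  return files;
}

const generated = generateToolFiles([
  { server: "google-drive", toolName: "get_document", description: "Read a document from Google Drive" },
  { server: "salesforce", toolName: "update_record", description: "Update a record in Salesforce" },
]);
console.log(Object.keys(generated).sort().join("\n"));
```

Returning the files as an in-memory map keeps the sketch free of filesystem calls; writing each entry to disk would be a separate step.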
Then each tool corresponds to a file, something like:

    // ./servers/google-drive/getDocument.ts
    import { callMCPTool } from "../../../client.js";

    interface GetDocumentInput {
      documentId: string;
    }

    interface GetDocumentResponse {
      content: string;
    }

    /* Read a document from Google Drive */
    export async function getDocument(input: GetDocumentInput): Promise<GetDocumentResponse> {
      return callMCPTool<GetDocumentResponse>('google_drive__get_document', input);
    }
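The contents of `client.js` are not shown in the post. One plausible shape for `callMCPTool`, sketched under stated assumptions: `ToolTransport`, `ToolResult`, and `setTransport` are hypothetical stand-ins for whatever MCP client the harness actually uses, and the sketch assumes the tool returns its payload as JSON in the first text content item.

```typescript
// Hypothetical minimal view of a tool result: content items plus an optional error flag.
interface ToolResult {
  content: Array<{ type: string; text?: string }>;
  isError?: boolean;
}

// Hypothetical transport: anything that can route a tool call to the right MCP server.
interface ToolTransport {
  callTool(request: { name: string; arguments: Record<string, unknown> }): Promise<ToolResult>;
}

let transport: ToolTransport | undefined;

// In a real client module this and callMCPTool would be exported.
function setTransport(t: ToolTransport): void {
  transport = t;
}

async function callMCPTool<T>(name: string, input: object): Promise<T> {
  if (!transport) throw new Error("No MCP transport configured");
  const result = await transport.callTool({
    name,
    arguments: input as Record<string, unknown>,
  });
  const text = result.content.find((item) => item.type === "text")?.text;
  if (result.isError) throw new Error(`Tool ${name} failed: ${text ?? "unknown error"}`);
  if (text === undefined) throw new Error(`Tool ${name} returned no text content`);
  // Assumption: the payload is JSON-encoded text. The cast to T is unchecked.
  return JSON.parse(text) as T;
}

// Usage with an in-memory fake transport (no real MCP server involved).
setTransport({
  async callTool(request) {
    return { content: [{ type: "text", text: JSON.stringify({ content: `stub for ${request.name}` }) }] };
  },
});

callMCPTool<{ content: string }>("google_drive__get_document", { documentId: "abc123" })
  .then((doc) => console.log(doc.content));
```

Injecting the transport keeps the generated wrappers independent of any particular client library, and lets them be exercised with a fake as shown.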
Our Google Drive to Salesforce example above becomes the code:

    // Read transcript from Google Docs and add to Salesforce prospect
    import * as gdrive from './servers/google-drive';
    import * as salesforce from './servers/salesforce';

    const transcript = (await gdrive.getDocument({ documentId: 'abc123' })).content;
    await salesforce.updateRecord({
      objectType: 'SalesMeeting',
      recordId: '00Q5f000001abcXYZ',
      data: { Notes: transcript }
    });
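To see why this keeps the transcript out of the model's context, the same flow can be run against in-memory stand-ins. This is a sketch only: the `gdrive` and `salesforce` objects below are hypothetical fakes rather than the generated wrappers, and `attachTranscript` is an illustrative name. The transcript lives in a local variable, and only a short summary string is produced for the model to read.

```typescript
// In-memory stand-ins for the generated wrappers (hypothetical; real ones would call MCP servers).
const gdrive = {
  async getDocument(input: { documentId: string }): Promise<{ content: string }> {
    const body = "transcript line\n".repeat(1000); // simulated long transcript
    return { content: `[${input.documentId}] Discussed Q4 goals...\n${body}` };
  },
};

const savedRecords: Record<string, Record<string, unknown>> = {};
const salesforce = {
  async updateRecord(input: {
    objectType: string;
    recordId: string;
    data: Record<string, unknown>;
  }): Promise<{ success: boolean }> {
    savedRecords[input.recordId] = input.data;
    return { success: true };
  },
};

// The agent-written script: the transcript never leaves the execution environment.
async function attachTranscript(): Promise<string> {
  const recordId = "00Q5f000001abcXYZ";
  const transcript = (await gdrive.getDocument({ documentId: "abc123" })).content;
  const result = await salesforce.updateRecord({
    objectType: "SalesMeeting",
    recordId,
    data: { Notes: transcript },
  });
  // Only this short summary would be logged back into the model's context.
  return `Updated ${recordId}: success=${result.success}, ${transcript.length} characters attached`;
}

attachTranscript().then((summary) => console.log(summary));
```

The summary stays a few dozen characters long however large the transcript is, which is the property the direct tool-calling flow lacks.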
The agent discovers tools by exploring the filesystem: listing the ./servers/ directory to find available servers (like google-drive and salesforce), then reading the specific tool files it needs (like getDocument.ts and updateRecord.ts) to understand each tool's interface. This lets the agent load only the definitions it needs for the current task. This reduces the token usage from 150,000 tokens to 2,000 tokens, a time and cost saving of 98.7%.

Cloudflare published similar findings, referring to code execution with MCP as "Code Mode." The core insight is the same: LLMs are adept at writing code and developers should take advantage of this strength to build…