OpenAI, published May 21, 2025

New tools and features in the Responses API


Today, we’re adding new built-in tools to the Responses API—our core API primitive for building agentic applications. This includes support for all remote Model Context Protocol (MCP) servers⁠, as well as tools like image generation⁠, Code Interpreter⁠, and improvements to file search⁠. These tools are available across our GPT‑4o series, GPT‑4.1 series, and OpenAI o-series reasoning models. o3 and o4-mini can now call tools and functions directly within their chain-of-thought in the Responses API, producing answers that are more contextually rich and relevant. Using o3 and o4-mini with the Responses API preserves reasoning tokens across requests and tool calls, improving model intelligence and reducing the cost and latency for developers.

We’re also introducing new features in the Responses API that improve reliability, visibility, and privacy for enterprises and developers. These include background mode⁠ to handle long-running tasks asynchronously and more reliably, support for reasoning summaries⁠, and support for encrypted reasoning items⁠.
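
A minimal sketch of how the reasoning-related features might appear as request parameters. It only builds keyword arguments and sends nothing. The `reasoning={"summary": "auto"}`, `store=False`, and `include=["reasoning.encrypted_content"]` settings are assumptions based on the API reference rather than on this post, and `reasoning_options` is a hypothetical helper name, so verify the fields before relying on them.

```python
def reasoning_options(summary=None, encrypted=False):
    """Build extra keyword arguments for client.responses.create().

    Hypothetical helper: the field names below are assumptions taken from
    the API reference, not from this post.
    """
    options = {}
    if summary is not None:
        # Assumed: asks the API to return a reasoning summary, e.g. "auto".
        options["reasoning"] = {"summary": summary}
    if encrypted:
        # Assumed: do not store the response server-side, and return the
        # reasoning items in encrypted form so they can be sent back later.
        options["store"] = False
        options["include"] = ["reasoning.encrypted_content"]
    return options


options = reasoning_options(summary="auto", encrypted=True)
```

The resulting dictionary could then be unpacked into a request, for example `client.responses.create(model="o3", input="...", **options)`.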

Since releasing the Responses API in March 2025 with tools like web search, file search, and computer use, hundreds of thousands of developers have used the API to process trillions of tokens across our models. Customers have used the API to build a variety of agentic applications, including Zencoder⁠’s coding agent, Revi⁠’s market intelligence agent for private equity and investment banking, and MagicSchool AI⁠'s education assistant—all of which use web search to pull relevant, up-to-date information into their app. Now developers can build agents that are even more useful and reliable with access to the new tools and features released today.

New remote MCP server support

We’re adding support for remote MCP servers in the Responses API, building on the release of MCP support in the Agents SDK. MCP is an open protocol that standardizes how applications provide context to LLMs. By supporting MCP servers in the Responses API, developers can connect our models to tools hosted on any MCP server with just a few lines of code. Here is one example of using a remote MCP server with the Responses API today:

Python

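    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Attach a remote MCP server as a tool and let the model call it.
    response = client.responses.create(
        model="gpt-4.1",
        tools=[{
            "type": "mcp",
            "server_label": "shopify",
            "server_url": "https://pitchskin.com/api/mcp",
        }],
        input="Add the Blemish Toner Pads to my cart",
    )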

The Blemish Toner Pads have been added to your cart! You can proceed to checkout here:

Popular remote MCP servers include Cloudflare⁠, HubSpot⁠, Intercom⁠, PayPal⁠, Plaid⁠, Shopify⁠, Stripe⁠, Square⁠, Twilio⁠, Zapier⁠, and more. We expect the ecosystem of remote MCP servers to grow quickly in the coming months, making it easier for developers to build powerful agents that can connect to the tools and data sources their users already rely on. In order to best support the ecosystem and contribute to this developing standard, OpenAI has also joined the steering committee for MCP.

To learn how to spin up your own remote MCP server, check out this guide from Cloudflare⁠. To learn how to use the MCP tool in the Responses API, check out this guide⁠ in our API Cookbook.
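
When an application connects to several remote MCP servers, the tool entries can be built by a small helper. The sketch below is one way to do that, not an official pattern: `mcp_tool` is a hypothetical name, the HTTPS check is a design choice of this sketch, and the optional `allowed_tools` field is an assumption based on the API reference rather than something shown in this post.

```python
from urllib.parse import urlparse


def mcp_tool(server_label, server_url, allowed_tools=None):
    """Build one entry for the `tools` list of a Responses API request."""
    parsed = urlparse(server_url)
    # Sketch-level policy: only accept absolute HTTPS URLs.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"expected an https URL, got {server_url!r}")
    tool = {
        "type": "mcp",
        "server_label": server_label,
        "server_url": server_url,
    }
    if allowed_tools is not None:
        # Assumed field: restricts which server tools the model may call.
        tool["allowed_tools"] = list(allowed_tools)
    return tool


# Same server as in the example above.
tools = [mcp_tool("shopify", "https://pitchskin.com/api/mcp")]
```

The list can be passed as `tools=tools` to `client.responses.create(...)`.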

Updates to image generation, Code Interpreter, and file search

With built-in tools in the Responses API, developers can easily create more capable agents with just a single API call. By calling multiple tools while reasoning, models now achieve significantly higher tool calling performance on industry-standard benchmarks like Humanity’s Last Exam (source). Today, we’re adding new tools including:

  • Image generation: In addition to using the Images API, developers can now access our latest image generation model, gpt-image-1, as a tool within the Responses API. The tool supports real-time streaming, which lets developers see previews of an image as it is being generated, and multi-turn edits, which let developers prompt the model to refine an image step by step.
  • Code Interpreter: Developers can now use the Code Interpreter⁠ tool within the Responses API. This tool is useful for data analysis, solving complex math and coding problems, and helping the models deeply understand and manipulate images (e.g., thinking with images). The ability for models like o3 and o4-mini to use the Code Interpreter tool within their chain-of-thought has resulted in improved performance across several benchmarks including Humanity’s Last Exam (source). Learn more⁠.
  • File search: Developers can now access the file search⁠ tool in our reasoning models. File search enables developers to pull relevant chunks of their documents into the model’s context based on the user query. We’re also introducing updates to the file search tool that allow developers to perform searches across multiple vector stores and support attribute filtering with arrays. Learn more⁠.
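
As a rough illustration of combining the tools listed above in a single request, the sketch below builds a `tools` list without calling the API. The `code_interpreter` entry matches the example later in this post. The `image_generation` type, the `vector_store_ids` and `filters` fields, and the filter shape are assumptions based on the API reference, and the vector store IDs are placeholders.

```python
def builtin_tools(vector_store_ids, attribute_filter=None):
    """Return a `tools` list with image generation, Code Interpreter, and file search."""
    file_search = {
        "type": "file_search",
        # Assumed field: more than one vector store can be searched at once.
        "vector_store_ids": list(vector_store_ids),
    }
    if attribute_filter is not None:
        # Assumed field: restricts results to files with matching attributes.
        file_search["filters"] = attribute_filter
    return [
        {"type": "image_generation"},
        {"type": "code_interpreter", "container": {"type": "auto"}},
        file_search,
    ]


tools = builtin_tools(
    ["vs_docs_placeholder", "vs_faq_placeholder"],  # placeholder IDs
    attribute_filter={"type": "eq", "key": "region", "value": "emea"},
)
```

The model then decides which of these tools to call for a given input.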

New features in the Responses API

In addition to the new tools, we’re also adding support for new features in the Responses API, including:

  • Background mode: As seen in agentic products like Codex, deep research, and Operator, reasoning models can take several minutes to solve complex problems. Developers can now use background mode to build similar experiences on models like o3 without worrying about timeouts or other connectivity issues—background mode kicks off these tasks asynchronously. Developers can either poll these objects to check for completion, or start streaming events whenever their application needs to catch up on the latest state. Learn more⁠.

Python

    response = client.responses.create(
        model="o3",
        input="Write me an extremely long story.",
        reasoning={"effort": "high"},
        background=True,
    )

  • Reasoning summaries: The Responses API can now generate concise, natural-language summaries of the model’s internal chain-of-thought, similar to what you see in ChatGPT. This makes it easier for developers to debug, audit, and build better end-user experiences. Reasoning summaries are available at no additional cost. Learn more⁠.

Python

    response = client.responses.create(
        model="o4-mini",
        tools=[
            {
                "type": "code_interpreter",
                "container": {"type": "auto"}
            }
        ],
        instructions=(
            "You are a personal math tutor. "
            "When asked a math question, run code to answer the…
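
For the background mode described above, polling for completion can be sketched as a small loop. This is a sketch under stated assumptions, not the official client pattern: `wait_for_response` is a hypothetical helper, the `"queued"` and `"in_progress"` status values are assumed from the API reference, and the `retrieve` callable stands in for `client.responses.retrieve`.

```python
import time

# Assumed status values for a response that is still being processed.
PENDING_STATUSES = {"queued", "in_progress"}


def wait_for_response(retrieve, response_id, poll_seconds=2.0,
                      timeout_seconds=600.0, sleep=time.sleep,
                      clock=time.monotonic):
    """Poll `retrieve(response_id)` until the response leaves a pending status."""
    deadline = clock() + timeout_seconds
    response = retrieve(response_id)
    while response.status in PENDING_STATUSES:
        if clock() >= deadline:
            raise TimeoutError(f"response {response_id} still pending")
        sleep(poll_seconds)
        response = retrieve(response_id)
    # The caller should still check whether the final status is a success.
    return response
```

With the official Python client this might be called as `final = wait_for_response(client.responses.retrieve, response.id)`, where `response` is the object returned by the background request.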
