OpenAI, published Jul 17, 2025

Introducing ChatGPT agent

Introducing ChatGPT agent: bridging research and action | OpenAI

July 17, 2025

ChatGPT now thinks and acts, proactively choosing from a toolbox of agentic skills to complete tasks for you using its own computer.

ChatGPT can now do work for you using its own computer, handling complex tasks from start to finish.

You can now ask ChatGPT to handle requests like “look at my calendar and brief me on upcoming client meetings based on recent news,” “plan and buy ingredients to make Japanese breakfast for four,” and “analyze three competitors and create a slide deck.” ChatGPT will intelligently navigate websites, filter results, prompt you to log in securely when needed, run code, conduct analysis, and even deliver editable slideshows and spreadsheets that summarize its findings.

At the core of this new capability is a unified agentic system. It brings together three strengths of earlier breakthroughs: Operator’s ability to interact with websites, deep research’s skill in synthesizing information, and ChatGPT’s intelligence and conversational fluency.

ChatGPT carries out these tasks using its own virtual computer, fluidly shifting between reasoning and action to handle complex workflows from start to finish, all based on your instructions.

Most importantly, you’re always in control. ChatGPT requests permission before taking actions of consequence, and you can easily interrupt, take over the browser, or stop tasks at any point.

Starting today, Pro, Plus, and Team users can activate ChatGPT’s new agentic capabilities directly through the tools dropdown from the composer by selecting ‘agent mode’ at any point in any conversation.

While ChatGPT agent is already a powerful tool for handling complex tasks, today’s launch is just the beginning. We’ll continue to add significant improvements regularly, making it more capable and useful to more people over time.

A natural evolution of Operator and deep research

Previously, Operator and deep research each brought unique strengths: Operator could scroll, click, and type on the web, while deep research excelled at analyzing and summarizing information. But they worked best in different situations: Operator couldn’t dive deep into analysis or write detailed reports, and deep research couldn’t interact with websites to refine results or access content requiring user authentication. In fact, we saw that many queries users attempted with Operator were actually better suited for deep research, so we brought the best of both together.

By integrating these complementary strengths in ChatGPT and introducing additional tools, we’ve unlocked entirely new capabilities within one model. It can now actively engage with websites by clicking, filtering, and gathering more precise, efficient results. You can also naturally transition from a simple conversation to requesting actions directly within the same chat.

An agent that works for you, with you

We’ve equipped ChatGPT agent with a suite of tools: a visual browser that interacts with the web through a graphical user interface, a text-based browser for simpler reasoning-based web queries, a terminal, and direct API access. The agent can also leverage ChatGPT connectors, which allow you to connect apps like Gmail and GitHub so ChatGPT can find information relevant to your prompts and use it in its responses. You can also log in on any website by taking over the browser, allowing it to go deeper and broader in both its research and task execution. Giving ChatGPT these different avenues for accessing and interacting with web information means it can choose the optimal path to most efficiently perform tasks. For instance, it can gather information about your calendar through an API, efficiently reason over large amounts of text using the text-based browser, and interact visually with websites designed primarily for humans.

All this is done using its own virtual computer, which preserves the context necessary for the task, even when multiple tools are used—the model can choose to open a page using the text browser or visual browser, download a file from the web, manipulate it by running a command in the terminal, and then view the output back in the visual browser. The model adapts its approach to carry out tasks with speed, accuracy, and efficiency.

ChatGPT agent is designed for iterative, collaborative workflows, far more interactive and flexible than previous models. As ChatGPT works, you can interrupt at any point to clarify your instructions, steer it toward desired outcomes, or change the task entirely. It will pick up where it left off, now with the new information, but without losing previous progress. Likewise, ChatGPT itself may proactively seek additional details from you when needed to ensure the task remains aligned with your goals. If a task takes longer than anticipated or feels stuck, you can pause it, ask it for a progress summary, or stop it entirely and receive partial results. If you have the ChatGPT app on your phone, it will send you a notification when it’s done with your task.

Broadening real-world utility

These unified agentic capabilities significantly enhance ChatGPT’s usefulness in both everyday and professional contexts. At work, you can automate repetitive tasks, like converting screenshots or dashboards into presentations composed of editable vector elements, rearranging meetings, planning and booking offsites, and updating spreadsheets with new financial data while retaining the same formatting. In your personal life, you can use it to effortlessly plan and book travel itineraries, design and book entire dinner parties, or find specialists and schedule appointments.

The model’s elevated capabilities are reflected in its state-of-the-art (SOTA) performance on evaluations measuring web browsing and real-world task completion capabilities.

On Humanity’s Last Exam⁠*, an evaluation that measures AI’s performance across a broad range of subjects on expert-level questions, the model powering ChatGPT agent scores a new pass@1 SOTA at 41.6. Because the agent plans dynamically and chooses its own tools, it can tackle the same task in different ways across runs. When we scaled this with a simple parallel rollout strategy—running up to eight attempts at once and picking the one with the highest self-reported confidence—the agent’s HLE score increases to…
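The parallel rollout strategy described above can be illustrated with a minimal sketch: run several independent attempts concurrently, then keep the one that reports the highest confidence. This is not OpenAI's implementation. The names `Attempt`, `best_of_n`, and `fake_attempt` are hypothetical, and the confidence values are made-up placeholders rather than evaluation data.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass
class Attempt:
    """One rollout's result (hypothetical structure, for illustration only)."""
    answer: str
    confidence: float  # self-reported by the attempt; assumed to lie in [0, 1]


def best_of_n(run_attempt: Callable[[int], Attempt], n: int = 8) -> Attempt:
    """Run n independent attempts in parallel and return the most confident one."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        # pool.map preserves input order, so attempts[i] belongs to attempt i.
        attempts = list(pool.map(run_attempt, range(n)))
    # Selection uses only the self-reported confidence, not ground truth.
    return max(attempts, key=lambda a: a.confidence)


def fake_attempt(index: int) -> Attempt:
    """Stand-in for a real agent rollout; the numbers below are invented."""
    confidences = [0.42, 0.61, 0.55, 0.90, 0.30, 0.77, 0.48, 0.66]
    return Attempt(
        answer=f"answer-{index}",
        confidence=confidences[index % len(confidences)],
    )


if __name__ == "__main__":
    best = best_of_n(fake_attempt, n=8)
    print(best.answer, best.confidence)
```

Because selection relies on self-reported confidence alone, the approach only helps to the extent that confidence tracks correctness; a confidently wrong attempt would still be chosen.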
