OpenAI

Introducing upgrades to Codex


September 15, 2025

Codex just got faster, more reliable, and better at real-time collaboration and tackling tasks independently anywhere you develop—whether via the terminal, IDE, web, or even your phone.

Get started

$ npm i -g @openai/codex

Update on September 23, 2025: GPT‑5‑Codex is now available to developers using Codex via API key, in addition to developers using Codex through their ChatGPT subscription. GPT‑5‑Codex is priced the same as GPT‑5 and is available in the Responses API only. The underlying model snapshot will be updated regularly. See the Codex developer documentation and changelog for more details.
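As a minimal sketch of what a Responses API call for this model might look like, the snippet below builds (but does not send) an HTTP request using only the Python standard library. The model identifier `gpt-5-codex`, the `OPENAI_API_KEY` environment variable, and the helper name `build_codex_request` are assumptions for illustration; the Codex developer documentation is the place to confirm current values.

```python
import json
import os
import urllib.request

# Responses API endpoint; per the update above, the model is served only here.
RESPONSES_URL = "https://api.openai.com/v1/responses"


def build_codex_request(prompt: str, model: str = "gpt-5-codex") -> urllib.request.Request:
    """Build a POST request for the Responses API without sending it.

    The default model name is an assumed identifier, not taken from the text.
    """
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    api_key = os.environ.get("OPENAI_API_KEY", "")  # assumed env var name
    return urllib.request.Request(
        RESPONSES_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


if __name__ == "__main__":
    req = build_codex_request("Write a function that reverses a linked list.")
    print(req.full_url)
    print(req.data.decode("utf-8"))
    # Sending it would be: urllib.request.urlopen(req)  (needs a valid API key)
```

Building the request separately from sending it keeps the example runnable offline and makes the payload easy to inspect.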

Today, we’re releasing GPT‑5‑Codex—a version of GPT‑5 further optimized for agentic coding in Codex. GPT‑5‑Codex was trained with a focus on real-world software engineering work; it’s equally proficient at quick, interactive sessions and at independently powering through long, complex tasks. Its code review capability can catch critical bugs before they ship. GPT‑5‑Codex is available everywhere you use Codex—it’s the default for cloud tasks and code review, and developers can choose to use it for local tasks via Codex CLI and the IDE extension.

Since we first launched Codex CLI in April and Codex web in May, Codex has steadily evolved into a more effective coding collaborator. Two weeks ago, we unified Codex into a single product experience connected by your ChatGPT account, enabling you to move work seamlessly between your local environment and the cloud without losing context. Codex now works where you develop: in your terminal or IDE, on the web, in GitHub, and even in the ChatGPT iOS app. Codex is included with ChatGPT Plus, Pro, Business, Edu, and Enterprise plans.

With these updates, Codex moves closer to what we’ve been building toward all along—a teammate that understands your context, works alongside you, and reliably takes on work for your team.

GPT‑5‑Codex

GPT‑5‑Codex is a version of GPT‑5 further optimized for agentic software engineering in Codex. It’s trained on complex, real-world engineering tasks such as building full projects from scratch, adding features and tests, debugging, performing large-scale refactors, and conducting code reviews. It’s more steerable, adheres better to AGENTS.md instructions, and produces higher-quality code: just tell it what you need without writing long instructions on style or code cleanliness.

SWE-bench Verified: Historically, including at the time of the GPT‑5 launch, we reported results on 477 SWE-bench Verified tasks because some tasks couldn’t run in our infrastructure. We’ve since fixed this and now report on all 500 tasks.
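To see why the move from 477 to 500 tasks matters when comparing scores, here is a small sketch of the arithmetic. Each of the 23 previously excluded tasks could pass or fail, so a score measured on the 477-task subset only bounds the score on all 500. The function name and the example pass count are hypothetical, not reported results.

```python
SUBSET_TASKS = 477   # tasks previously runnable, per the note above
FULL_TASKS = 500     # all SWE-bench Verified tasks


def full_set_bounds(passed_on_subset: int) -> tuple[float, float]:
    """Return (lowest, highest) possible pass rate on all 500 tasks,
    given the number of tasks passed on the 477-task subset."""
    if not 0 <= passed_on_subset <= SUBSET_TASKS:
        raise ValueError("passed_on_subset must be between 0 and 477")
    missing = FULL_TASKS - SUBSET_TASKS                   # 23 unmeasured tasks
    lowest = passed_on_subset / FULL_TASKS                # all missing tasks fail
    highest = (passed_on_subset + missing) / FULL_TASKS   # all missing tasks pass
    return lowest, highest


if __name__ == "__main__":
    # Hypothetical example: 350 passes on the subset (not a reported figure).
    low, high = full_set_bounds(350)
    print(f"subset rate: {350 / SUBSET_TASKS:.1%}")
    print(f"full-set rate lies in [{low:.1%}, {high:.1%}]")
```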

Code refactoring tasks: Our code refactoring evaluation contains refactor-style tasks from large, established repositories, with tasks in Python, Go, and even OCaml. An example task is a pull request from Gitea that changes 232 files and 3,541 lines to thread a ctx variable through the application logic.

GPT‑5‑Codex adapts how much time it spends thinking more dynamically based on the complexity of the task. The model combines two essential skills for a coding agent: pairing with developers in interactive sessions, and persistent, independent execution on longer tasks. That means Codex will feel snappier on small, well-defined requests or while you are chatting with it, and will work for longer on complex tasks like big refactors. During testing, we've seen GPT‑5‑Codex work independently for more than 7 hours at a time on large, complex tasks, iterating on its implementation, fixing test failures, and ultimately delivering a successful implementation.

On OpenAI employee traffic, we see that for the bottom 10% of user turns sorted by model-generated tokens (including hidden reasoning and final output), GPT‑5‑Codex uses 93.7% fewer tokens than GPT‑5. Conversely, for the top 10%, GPT‑5‑Codex thinks more, spending twice as long reasoning, editing and testing code, and iterating.
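The comparison above can be read as: sort turns by generated tokens, take the bottom (or top) 10%, and compare the two models within that slice. The sketch below shows one way such a figure could be computed. It is an assumption about the methodology, not OpenAI's actual analysis, and the token counts are made up.

```python
def decile_mean(token_counts: list[int], bottom: bool = True) -> float:
    """Mean token count over the bottom (or top) 10% of turns."""
    if not token_counts:
        raise ValueError("need at least one turn")
    ordered = sorted(token_counts)
    k = max(1, len(ordered) // 10)   # size of one decile, at least one turn
    chosen = ordered[:k] if bottom else ordered[-k:]
    return sum(chosen) / len(chosen)


def percent_fewer(baseline: float, candidate: float) -> float:
    """How many percent fewer tokens `candidate` uses than `baseline`."""
    return (1 - candidate / baseline) * 100


if __name__ == "__main__":
    # Made-up per-turn token counts for two models (illustration only).
    baseline_turns = [1000, 1200, 5000, 8000, 9000, 12000, 15000, 20000, 30000, 60000]
    candidate_turns = [63, 900, 4000, 7000, 9500, 13000, 18000, 25000, 40000, 120000]
    low_base = decile_mean(baseline_turns)
    low_cand = decile_mean(candidate_turns)
    print(f"bottom decile: {percent_fewer(low_base, low_cand):.1f}% fewer tokens")
```

A 93.7% reduction means the model generates roughly 6.3% of the baseline's tokens on those turns: about 63 tokens where the baseline would have produced 1,000.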

GPT‑5‑Codex has been trained specifically for conducting code reviews and finding critical flaws. When reviewing, it navigates your codebase, reasons through dependencies, and runs your code and tests in order to validate correctness. We evaluated code review performance on recent commits from popular open-source repositories. For each commit, experienced software engineers evaluated review comments for correctness and importance. We find that comments by GPT‑5‑Codex are less likely to be incorrect or unimportant, reserving more user attention for critical issues.
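One way to summarize such an evaluation is to aggregate the reviewers' labels per model. The sketch below assumes each comment receives two boolean labels, `correct` and `important`. The data class, field names, and sample labels are hypothetical and do not reproduce OpenAI's evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatedComment:
    """One review comment with labels from a human rater (assumed schema)."""
    correct: bool
    important: bool


def summarize(comments: list[RatedComment]) -> dict[str, float]:
    """Return the share of incorrect comments and the share of
    comments that are both correct and important."""
    if not comments:
        raise ValueError("no comments to summarize")
    n = len(comments)
    incorrect = sum(1 for c in comments if not c.correct)
    high_value = sum(1 for c in comments if c.correct and c.important)
    return {"incorrect_rate": incorrect / n, "high_value_rate": high_value / n}


if __name__ == "__main__":
    # Made-up labels for illustration only.
    sample = [
        RatedComment(correct=True, important=True),
        RatedComment(correct=True, important=False),
        RatedComment(correct=False, important=False),
        RatedComment(correct=True, important=True),
    ]
    print(summarize(sample))
```

A reviewer with a lower `incorrect_rate` and a higher `high_value_rate` wastes less of the reader's attention, which is the property the paragraph above describes.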

GPT‑5‑Codex is a reliable partner on front-end tasks. In addition to creating aesthetic desktop apps, GPT‑5‑Codex also shows significant improvements in human preference evaluations when creating mobile websites. When working in the cloud, it can look at images or screenshots you provide as input, visually inspect its progress, and display screenshots of its work to you.

GPT‑5‑Codex was purpose-built for Codex CLI, the Codex IDE extension, the Codex cloud environment, and working in GitHub, and also supports versatile tool use. Unlike GPT‑5, which is a general-purpose model, GPT‑5‑Codex is recommended only for agentic coding tasks in Codex or Codex-like environments.

Updates to Codex

We also recently made some updates to make Codex a better pair programmer, with a revamped Codex CLI and the new Codex IDE extension.

Codex CLI

Codex CLI is open-source, and community feedback over the last few months has been invaluable in shaping its evolution. With this feedback, we’ve rebuilt Codex CLI around agentic coding workflows to make our models more capable and reliable partners. You can now attach and share images (screenshots, wireframes, and diagrams) right in the CLI to build shared context on design decisions and get exactly what you want. For more complex work, Codex now tracks progress with a to-do list and includes tools such as web search and MCP for connecting to external systems, with more accurate tool use overall.

The terminal UI has also been upgraded: tool calls and diffs are better formatted and easier to follow. Approval modes are simplified to three levels: read-only with explicit approvals, auto with full workspace access but requiring approvals outside the workspace, and full access with the ability to read files anywhere and run commands with network access.…
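The three approval levels can be thought of as a small decision table. The following sketch models them as described above. It is an illustrative reading of the text, not Codex CLI's implementation, and `ApprovalMode`, `Action`, and `needs_approval` are hypothetical names. Treating network use in auto mode as requiring approval is also an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class ApprovalMode(Enum):
    READ_ONLY = "read-only"
    AUTO = "auto"
    FULL_ACCESS = "full-access"


@dataclass(frozen=True)
class Action:
    """A requested operation (hypothetical model, not a Codex type)."""
    writes_or_executes: bool   # edits a file or runs a command
    inside_workspace: bool     # target lies within the current workspace
    uses_network: bool = False


def needs_approval(mode: ApprovalMode, action: Action) -> bool:
    """Return True if the user would be asked before the action runs."""
    if mode is ApprovalMode.FULL_ACCESS:
        return False  # nothing is gated, including network access
    if mode is ApprovalMode.READ_ONLY:
        # Reading is free; any edit, command, or network use needs approval.
        return action.writes_or_executes or action.uses_network
    # AUTO: free rein inside the workspace, approval outside it.
    # Assumption: network use is treated like leaving the workspace.
    return not action.inside_workspace or action.uses_network


if __name__ == "__main__":
    edit = Action(writes_or_executes=True, inside_workspace=True)
    for mode in ApprovalMode:
        print(mode.value, needs_approval(mode, edit))
```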
