Accelerating Mathematical and Scientific Discovery with Gemini Deep Think
February 11, 2026
Research
Thang Luong and Vahab Mirrokni
Under direction from expert mathematicians and scientists, Gemini Deep Think is solving professional research problems across mathematics, physics, and computer science

In the summer of 2025, an advanced version of Gemini Deep Think achieved Gold-medal standard at the International Mathematical Olympiad (IMO), and an updated version later obtained similar results at the International Collegiate Programming Contest. These results demonstrated that the model could reason through some of the most challenging math and programming problems designed for students. Since then, Gemini Deep Think mode has moved into science, engineering, and enterprise workflows to tackle more complex, open-ended challenges.

In the last week, our teams published two papers (1, 2) detailing a cross-disciplinary effort to solve professional research problems using Gemini Deep Think mode. These results stem from deep collaboration between mathematicians, physicists, and computer scientists.

The Frontier of Pure Mathematics

Unlike IMO problems, research-level mathematics requires advanced techniques drawn from a vast literature. While foundation models have large knowledge bases, data scarcity often leads to superficial understanding and hallucinations in advanced subjects.

To address this, we built a math research agent (internally codenamed Aletheia), powered by Gemini Deep Think mode. It features a natural language verifier that identifies flaws in candidate solutions and enables an iterative process of generating and revising solutions. Crucially, the agent can admit that it failed to solve a problem, a key feature that improved efficiency for researchers. Additionally, the research agent uses Google Search and web browsing to navigate complex research, preventing spurious citations and computational inaccuracies when synthesizing published literature.
Overview of Aletheia, a math research agent powered by Deep Think that can iteratively generate, verify, and revise solutions to research-level math problems.
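The generate, verify, and revise cycle described above, including the option to admit failure, can be sketched in a few lines. This is a minimal illustration only, not Aletheia's actual implementation: the names `Verdict`, `solve_with_verifier`, `generate`, `verify`, `revise`, and `max_rounds` are all hypothetical, and the three callables stand in for model calls that the post does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Hypothetical verifier output: a pass/fail flag plus a critique."""
    ok: bool            # True when the verifier finds no flaw
    critique: str = ""  # natural-language description of the flaw, if any


def solve_with_verifier(
    problem: str,
    generate: Callable[[str], str],          # assumed: drafts a candidate solution
    verify: Callable[[str, str], Verdict],   # assumed: checks a candidate
    revise: Callable[[str, str, str], str],  # assumed: repairs a candidate
    max_rounds: int = 3,                     # assumed verification budget
) -> Optional[str]:
    """Return a candidate that passes verification, or None to admit failure."""
    candidate = generate(problem)
    for round_index in range(max_rounds):
        verdict = verify(problem, candidate)
        if verdict.ok:
            return candidate
        # Only revise if there is budget left to verify the revision.
        if round_index < max_rounds - 1:
            candidate = revise(problem, candidate, verdict.critique)
    # Budget exhausted: report failure instead of an unverified answer.
    return None


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any model.
    def toy_generate(problem: str) -> str:
        return "draft"

    def toy_verify(problem: str, candidate: str) -> Verdict:
        return Verdict(ok=candidate.endswith("fixed"), critique="gap in step 2")

    def toy_revise(problem: str, candidate: str, critique: str) -> str:
        return candidate + " fixed"

    print(solve_with_verifier("toy problem", toy_generate, toy_verify, toy_revise))
```

Returning `None` rather than the last unverified draft is the design choice that mirrors the "admit failure" behavior: the caller can tell a verified answer apart from a give-up.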
Since achieving IMO Gold-medal standard in July 2025, Gemini Deep Think has progressed rapidly, scoring up to 90% on the IMO-ProofBench Advanced test as inference-time compute scales. We demonstrated that the scaling law continues to hold as we progress beyond Olympiad level into PhD-level exercises (per our internal FutureMath Basic benchmark). Notably, Aletheia demonstrated that higher reasoning quality can be achieved with less inference-time compute.
The latest advanced version of Deep Think, as of Jan 2026, has significantly outperformed the IMO-Gold version (Jul 2025) on Olympiad-level problems. Aletheia makes further leaps in terms of reasoning quality with lower inference-time compute. All results were graded by human experts.
The inference-time scaling law also transfers to PhD-level exercises.
For research-level math, Aletheia has already enabled several advancements, produced with varying levels of autonomy:

Reliable autonomous research. A research paper (Feng26), generated by AI without any human intervention, that calculates certain structure constants in arithmetic geometry called eigenweights.

AI-guided collaboration. A research paper (LeeSeo26) demonstrating human-AI collaboration in proving bounds on systems of interacting particles called independent sets.

An extensive semi-autonomous evaluation (Feng et al., 2026b) of 700 open problems in Bloom’s Erdős Conjectures database, including autonomous solutions to four open questions listed there.

On Erdős-1051, our model solved the problem autonomously and helped lead to a generalization reported in a research paper (BKKKZ26).
The agent also contributed intermediate propositions to two further papers, (FYZ26) and (ACGKMP26). Prior work has also used Gemini for research-level math, at a smaller scale in terms of collaborations and the number of problems tackled.

Following extensive discussions with the mathematical community, we suggest a taxonomy that classifies AI-assisted mathematics research by significance and degree of AI contribution, as a contribution to the wider discussion on responsible documentation, evaluation, and communication of AI-generated results. Level 2 (“publishable quality”) works have been submitted to reputable journals. Currently, we do not claim any Level 3 (“Major Advance”) or Level 4 (“Landmark Breakthrough”) results.
Classification of all AI-assisted mathematics results encompassed in this work. *Works listed as Level 2 in this table have been submitted for publication.
Prompts and model outputs are available here. For discussions on AI contributions, our “Human-AI Interaction card”, and community impact, see our paper.

Expanding to Physics and Computer Science

Gemini Deep Think mode has also demonstrated promise in computer science and physics. The second paper builds on similar agentic reasoning ideas and identifies effective "recipes" for collaboration, specifically the "Advisor" model, where humans guide AI through iterative "Vibe-Proving" cycles to validate intuition and refine proofs. We also detail tactical techniques such as “balanced prompting” (requesting a proof or a refutation in the same prompt, to prevent confirmation bias) and code-assisted verification. These methods, combined with the model's ability to bridge disparate scientific fields through deep structural connections, are transforming how theoretical research is conducted.

This work builds upon our successful deployment of an advanced version of Gemini Deep Think to assist in reviewing CS theory papers for the STOC’26 conference.
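To make the two tactical techniques named above concrete, here is a small sketch. It is illustrative only and not taken from the paper: the prompt wording and the helper names `balanced_prompt`, `is_prime`, and `find_counterexample` are assumptions. The first function builds a prompt that gives proof and refutation equal standing; the second shows code-assisted verification in its simplest form, a brute-force search for a counterexample to a claimed statement.

```python
from typing import Callable, Iterable, Optional


def balanced_prompt(statement: str) -> str:
    """Build a prompt (hypothetical wording) that asks for proof OR refutation."""
    return (
        "Consider the following statement:\n"
        f"{statement}\n\n"
        "Either prove the statement or refute it with a counterexample.\n"
        "Do not assume in advance that it is true or that it is false, "
        "and state clearly which outcome you established."
    )


def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small inputs."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def find_counterexample(
    claim: Callable[[int], bool], candidates: Iterable[int]
) -> Optional[int]:
    """Return the first candidate where the claim fails, or None if none is found."""
    for value in candidates:
        if not claim(value):
            return value
    return None


if __name__ == "__main__":
    statement = "n^2 + n + 41 is prime for every integer n >= 0."
    print(balanced_prompt(statement))

    # Code-assisted check of the same statement on a finite range.
    witness = find_counterexample(lambda n: is_prime(n * n + n + 41), range(100))
    print(witness)  # 40, since 40^2 + 40 + 41 = 1681 = 41^2
```

A search like this can only refute a claim, never prove it: finding no counterexample in a finite range is evidence, not a proof, which is why it complements rather than replaces the proof attempt.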
A schematic overview of the AI reasoning pipeline, illustrating how extensive solution space exploration at the network layer is funneled into structured reasoning and validated by automated and human verification.
Collaborating with experts on 18 research problems, an advanced version of Gemini Deep Think helped resolve long-standing bottlenecks across algorithms, ML and combinatorial optimization, information theory, and economics. Highlights from our “Accelerating Research with Gemini” paper include (corresponding section numbers in paper): Crossing mathematical borders for network puzzles…