Google’s advanced Gemini model, known as Gemini Deep Think, has achieved a gold-medal-equivalent score at the 2025 International Mathematical Olympiad (IMO) by fully solving five of the six problems. The result has been officially certified by the Olympiad organizers.
The IMO is the most prestigious and challenging competition for young mathematicians worldwide. In recent years, it has also become a benchmark for testing the advanced reasoning capabilities of artificial intelligence. Last year, Google’s AlphaProof and AlphaGeometry 2 systems solved four problems, reaching the silver-medal standard. That success came with a major limitation, however: the problems had to be manually translated by experts from natural language into a formal mathematical language (Lean), and the solving process took several days.
This year, Gemini Deep Think understood the problems directly in natural language and produced complete, accurate solutions within the official 4.5-hour competition time limit, just like a human participant. OpenAI’s new model also reported gold-medal-level performance at this Olympiad, but unlike Google, OpenAI did not submit its answers to the official IMO jury; it instead assembled its own independent review committee to grade the solutions.
Google’s AI at the 2025 International Math Olympiad
Gemini’s remarkable success stems from a new architectural approach called Deep Think. Unlike earlier models that follow a single linear chain of reasoning, Gemini Deep Think uses parallel thinking: it explores multiple candidate approaches and solutions simultaneously, compares and combines them, and ultimately selects the best answer.
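Google has not published how Deep Think implements this, so the sketch below is only a minimal illustration of the general idea: best-of-N parallel exploration followed by a comparison step. The functions generate_candidate and score_candidate are hypothetical placeholders for model and verifier calls, not part of any Google API.

```python
import concurrent.futures
import random

def generate_candidate(problem: str, seed: int) -> str:
    """Produce one candidate line of reasoning (toy stand-in for a model call)."""
    rng = random.Random(seed)
    approach = rng.choice(["induction", "contradiction", "invariant", "explicit construction"])
    return f"Attempt via {approach}: {problem}"

def score_candidate(candidate: str) -> float:
    """Estimate how promising a candidate is (toy stand-in).

    A real verifier might check each step of the argument;
    a random score keeps this example self-contained."""
    return random.random()

def parallel_think(problem: str, n_paths: int = 8) -> str:
    """Explore several reasoning paths concurrently, then keep the strongest."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        candidates = list(pool.map(lambda s: generate_candidate(problem, s), range(n_paths)))
    # Compare the finished candidates and select the highest-scoring one.
    return max(candidates, key=score_candidate)

if __name__ == "__main__":
    print(parallel_think("Show that the sequence contains infinitely many primes."))
```

A production system would presumably do more than pick a single winner; the “compares and combines” step suggests that partial arguments from different paths can also be merged, which simple best-of-N selection does not capture.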
To prepare the model, Google DeepMind’s team used reinforcement learning techniques together with a high-quality dataset of detailed, step-by-step mathematical solutions. This taught the AI how to “show its work”, a skill essential for earning full marks at the Olympiad, where graders award points for complete, justified arguments rather than bare answers.
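DeepMind has not described its reward design, so the following is purely illustrative: a hypothetical step-level reward that favors fully justified derivations over bare final answers, one plausible way such training data could shape a reinforcement signal. The weights and the Step structure are assumptions made up for this sketch.

```python
from dataclasses import dataclass

@dataclass
class Step:
    claim: str
    justified: bool  # did the model supply a valid justification for this step?

def solution_reward(steps: list[Step], final_answer_correct: bool) -> float:
    """Reward full, justified derivations over bare final answers.

    The 0.7/0.3 weighting is arbitrary; it only encodes the idea that
    a correct answer with unjustified steps earns partial credit,
    mirroring how IMO graders score complete arguments."""
    if not steps:
        return 0.0
    step_credit = sum(s.justified for s in steps) / len(steps)
    return 0.7 * step_credit + 0.3 * float(final_answer_correct)

# Example: a correct answer with half its steps unjustified
# scores lower than a fully justified derivation.
steps = [Step("Set up the invariant", True), Step("Conclude the bound", False)]
print(solution_reward(steps, final_answer_correct=True))  # 0.65
```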
Perhaps the most striking aspect of Gemini’s performance was its creativity. On one problem, where many human contestants reached for a graduate-level result (Dirichlet’s theorem), Gemini produced a much shorter and simpler solution using only elementary number theory.
The model is not yet flawless, however. It failed to solve the competition’s hardest problem, which only five students managed to crack, because it began its reasoning from an incorrect initial assumption.
Google emphasizes that, unlike some competitors, the entire evaluation and scoring process for this model was carried out by the official IMO committee, and its solutions were held to the same standard as those of human participants. The company plans to release a version of this advanced model first to a group of trusted testers, including mathematicians, and then make it available to Google AI Ultra subscribers.