
AI Triumphs: AlphaGeometry2 Outshines Math Olympiad Gold Medalists

Discover how DeepMind's AlphaGeometry2 has surpassed human gold medalists in the International Mathematical Olympiad, marking a significant milestone in AI development.


Yesterday, Google announced a groundbreaking achievement in the field of artificial intelligence. DeepMind's latest AI system, AlphaGeometry2, has set a new record by surpassing the level of human gold medalists in a large-scale geometry problem test at the International Mathematical Olympiad (IMO).

The research team selected 45 geometry problems from IMO competitions held between 2000 and 2024 and translated them into a set of 50 formalized problems. AlphaGeometry2 solved 42 of the 50, beating the 40.9 problems that an average gold medalist would be expected to solve on the same set. The result is more than a numerical milestone; it marks a substantive leap in AI reasoning capability.

DeepMind's focus on high school mathematics competitions stems from a deep insight: the ability to solve Euclidean geometry problems may be key to building more powerful AI systems. Solving mathematical theorems requires reasoning and strategic decision-making, skills that are crucial for the next generation of general AI models.

In a demonstration during the summer of 2024, AlphaGeometry2, paired with AlphaProof, DeepMind's formal mathematical reasoning model, solved 4 of the 6 problems from that year's IMO. The hybrid approach combines a Gemini-series language model with a specialized symbolic deduction engine: the Gemini model predicts which auxiliary geometric constructions may be useful, while the symbolic engine derives conclusions from them according to strict mathematical rules. The two components run a parallel search, exchanging useful intermediate results through a shared knowledge base.
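The loop described above can be sketched in miniature. This is not DeepMind's code; `propose_construction` and `deduce` are hypothetical stand-ins for the language model and the symbolic engine, and the "knowledge base" is just a Python set.

```python
import random

def propose_construction(facts, rng):
    """Stand-in for the language model: suggest an auxiliary construction."""
    candidates = ["midpoint", "perpendicular", "circle", "reflection"]
    return rng.choice(candidates)

def deduce(facts, goal):
    """Stand-in for the symbolic engine: here, a toy check that either the
    goal is already known or enough facts have accumulated to close it."""
    return goal in facts or len(facts) >= 6

def solve(goal, max_steps=10, seed=0):
    rng = random.Random(seed)
    knowledge = {"given_1", "given_2"}  # shared knowledge base
    for step in range(max_steps):
        if deduce(knowledge, goal):
            return True, step
        # The LM proposes a construction; the engine would then add its
        # logical consequences back into the shared knowledge base.
        knowledge.add(f"{propose_construction(knowledge, rng)}_{step}")
    return False, max_steps

solved, steps = solve("goal")
print(solved, steps)
```

In the real system the search is parallelized across many such proposer/deduction workers, all writing into one shared store; the sketch keeps a single sequential loop for clarity.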

To overcome the challenge of limited geometric training data, the research team generated over 300 million theorems and proofs for training. This large-scale synthetic data training method offers a new model for AI breakthroughs in specific fields. However, AlphaGeometry2's capabilities have limits. It struggles with problems involving variable point numbers, nonlinear equations, and inequalities, solving only 20 out of 29 more challenging IMO candidate problems.
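The synthetic-data idea can be illustrated with a toy pipeline: sample random premises, forward-chain a small rule set, and record any derived fact together with its derivation as a (theorem, proof) training pair. The rules and predicates below are invented for illustration and bear no relation to DeepMind's actual generator.

```python
import random

# Toy deduction rules: (premise_a, premise_b) -> conclusion.
RULES = {
    ("parallel(a,b)", "parallel(b,c)"): "parallel(a,c)",
    ("perp(a,b)", "parallel(b,c)"): "perp(a,c)",
}

def generate_example(rng):
    """Sample premises, apply one rule, and emit a (premises, theorem, proof)
    triple if any rule fires; otherwise discard the sample."""
    vocabulary = sorted({p for pair in RULES for p in pair})
    premises = set(rng.sample(vocabulary, 2))
    for (p, q), conclusion in RULES.items():
        if p in premises and q in premises:
            return sorted(premises), conclusion, [p, q]
    return None

rng = random.Random(1)
dataset = [ex for ex in (generate_example(rng) for _ in range(100)) if ex]
print(len(dataset))
```

Scaled up with a rich geometric rule set and deeper proof chains, this discard-and-retry scheme is how one can mint arbitrarily many labeled theorem/proof pairs without human annotation.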

This breakthrough has also sparked reflection on AI's development path. Traditionally, AI research has followed two main approaches: symbol manipulation based on explicit rules, and neural networks. AlphaGeometry2 adopts a hybrid architecture: its Gemini model is a neural network, while its symbolic engine operates on explicit rules. In tests, OpenAI's o1 model, a purely neural system, failed to solve any of the IMO problems that AlphaGeometry2 answered correctly.

Vince Conitzer, an AI expert at Carnegie Mellon University, noted that while language models make astonishing progress on benchmarks, they still struggle with simple common-sense questions. This highlights the unpredictable nature of AI systems and the need for a better understanding of their potential risks.

The DeepMind team has found preliminary evidence suggesting that AlphaGeometry2's language model component can generate partial solutions without a symbolic engine. However, until computational speed improves and the "hallucination" problem is resolved, symbolic computation will remain essential in mathematical applications.

Key Takeaways:

  1. AlphaGeometry2 surpasses human gold medalists in IMO geometry problems.
  2. The hybrid approach combines language models with symbolic computation.
  3. Large-scale synthetic data training is crucial for AI breakthroughs.
  4. AI's development path involves both symbolic and neural network methods.
  5. Understanding AI's potential risks is essential for future advancements.