AIME 2025
AIME (American Invitational Mathematics Examination) is a prestigious high school mathematics competition that serves as a qualifier for the USA Mathematical Olympiad. Its problems demand creative problem-solving and mathematical insight beyond the standard curriculum, which makes it a strong test of genuine reasoning ability.
How AIME 2025 works
Each AIME exam consists of 15 problems, each with an integer answer from 000 to 999 inclusive. The problems call for creative approaches and mathematical insight rather than routine calculation. Models are scored on the percentage of problems solved correctly. The 2025 exam is the standard reference; some models are also evaluated on AIME 2026.
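The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the tracker's actual grading harness: the function names `normalize_answer` and `aime_score`, the exact-match rule, and the example answer key are all assumptions made for the sketch (the key below is invented and is not the real AIME 2025 key).

```python
def normalize_answer(raw):
    """Parse an AIME-style answer; return an int in 0..999, or None if invalid."""
    text = str(raw).strip()
    # Accept only 1 to 3 ASCII digits, so "073" and "73" both mean 73.
    if not (text.isascii() and text.isdigit()) or len(text) > 3:
        return None
    return int(text)


def aime_score(predictions, answer_key):
    """Return the percentage of problems answered correctly (exact match)."""
    if len(predictions) != len(answer_key):
        raise ValueError("predictions and answer_key must have the same length")
    correct = 0
    for predicted, expected in zip(predictions, answer_key):
        parsed = normalize_answer(predicted)
        if parsed is not None and parsed == normalize_answer(expected):
            correct += 1
    return 100.0 * correct / len(answer_key)


# Hypothetical 15-problem answer key, for illustration only.
example_key = ["070", "588", "016", "117", "279", "504", "821", "077",
               "062", "081", "259", "510", "204", "060", "735"]

# A hypothetical model output: 12 correct, 2 wrong, 1 malformed.
example_predictions = ["70", "588", "16", "117", "279", "504", "821", "77",
                       "62", "81", "259", "510", "205", "061", "n/a"]

example_score = aime_score(example_predictions, example_key)
```

Because every answer is a single integer, grading reduces to exact matching after normalization; there is no partial credit in this sketch.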
What is a good AIME 2025 score?
Reasoning models score 83-100% on AIME, and GPT-5.4 achieved a perfect 100%. General models typically score 7-35%. This is the widest gap between reasoning and general models of any benchmark: a 65+ point spread. A score above 90% indicates exceptional mathematical reasoning capability.
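The text does not say how the 65+ point spread is computed. The sketch below works through three plausible readings of the gap between the two score bands quoted above; the names `GENERAL`, `REASONING`, `midpoint`, and `band_gaps` are assumptions introduced for illustration.

```python
# Score bands quoted in the text, in percentage points.
GENERAL = (7.0, 35.0)      # general models
REASONING = (83.0, 100.0)  # reasoning models


def midpoint(band):
    """Return the midpoint of a (low, high) score band."""
    low, high = band
    return (low + high) / 2


def band_gaps(lower, upper):
    """Return three ways of measuring the gap between two score bands."""
    return {
        # Best general model vs. weakest reasoning model.
        "nearest_edges": upper[0] - lower[1],
        # Typical general model vs. typical reasoning model.
        "midpoints": midpoint(upper) - midpoint(lower),
        # Weakest general model vs. best reasoning model.
        "extremes": upper[1] - lower[0],
    }


gaps = band_gaps(GENERAL, REASONING)
```

Under these assumptions the gap is 48 points between the nearest edges, 70.5 points between the midpoints, and 93 points between the extremes, so the quoted figure is closest to a midpoint-style comparison.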
Why AIME 2025 matters
AIME problems test genuine creative mathematical reasoning, not pattern matching or memorization. The enormous gap between general models (7-35%) and reasoning models (83-100%) makes AIME the starkest discriminator between models with and without reasoning capability. A model that scores highly on AIME demonstrates the ability to approach novel problems creatively, a capability that transfers to other reasoning tasks.
How does AIME 2025 compare to other benchmarks?
AIME is harder than MATH and requires more creative insight. While MATH draws from multiple competition sources at various difficulty levels, AIME uses only the American Invitational Mathematics Examination, one of the hardest standardized math competitions. The score range on AIME is also wider than on MATH (7-100% vs 50-97%), making it an even stronger discriminator between reasoning and general models.
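The range comparison above is simple arithmetic, sketched here using the two ranges quoted in the text. The names `SCORE_RANGES` and `range_width` are assumptions made for this illustration.

```python
# Lowest and highest model scores quoted in the text, in percent.
SCORE_RANGES = {
    "AIME": (7, 100),
    "MATH": (50, 97),
}


def range_width(low, high):
    """Return the width of a score range in percentage points."""
    return high - low


widths = {name: range_width(low, high)
          for name, (low, high) in SCORE_RANGES.items()}
```

With these figures, the AIME range spans 93 points and the MATH range spans 47 points, so AIME scores are spread over roughly twice the width.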
Which AI model has the highest AIME 2025 score?
Top 10 models by AIME 2025
Frequently asked questions
See all benchmark scores in the AI Frontier Model Tracker. Compare across all 8 benchmarks.