
google's agent cracked 9 erdos problems, 44 oeis conjectures


another batch of open math problems has been solved by ai. here's how
google's agent solved 53+ open math problems. what it solved:

• 9 erdos problems out of 353 attempted, including two that had been open for 56 years (these span number theory, combinatorics, and set theory)
• 44 oeis conjectures out of 492 (oeis, the online encyclopedia of integer sequences, is a catalog of number patterns; many entries have formulas or behaviors that are believed true but have never been proven)
• a 15-year-old open problem on hilbert functions in algebraic geometry (about whether a certain sequence of numbers always forms a "hill" shape where the middle is never lower than its neighbors)
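to make the "hill" shape concrete: a minimal sketch, assuming it means the sequence is unimodal (it never goes back up once it has started going down). the function name `is_unimodal` and the sample sequences are made up for illustration and are not taken from the paper.

```python
def is_unimodal(seq):
    """Return True if seq rises (weakly) and then falls (weakly), with no second rise."""
    i = 0
    n = len(seq)
    # Walk up the left side of the hill (ties allowed).
    while i + 1 < n and seq[i] <= seq[i + 1]:
        i += 1
    # Walk down the right side of the hill (ties allowed).
    while i + 1 < n and seq[i] >= seq[i + 1]:
        i += 1
    # If we reached the end, there was no dip followed by another rise.
    return i >= n - 1


hill = [1, 3, 4, 4, 2]      # rises, plateaus, falls
not_hill = [1, 3, 2, 3]     # dips and then rises again
print(is_unimodal(hill), is_unimodal(not_hill))
```

the open question was about whether this kind of property holds for every sequence in a whole family, which a check like this cannot settle; it only tests one sequence at a time.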

how it was done
they built 4 agents of increasing complexity. the simplest and the most complex:
- the basic one is just an llm proposing proofs + the lean compiler checking them in a loop, feeding errors back for correction
- the full-featured agent adds evolutionary search: llm raters rank proof attempts by plausibility, the best ones get combined and mutated, and it can call alphaproof, a specialized olympiad-level prover, as a tool for sub-goals

all runs cost a few hundred dollars per problem
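the basic agent's loop can be sketched roughly like this. this is a minimal sketch, not google's implementation: `prove_with_feedback`, `toy_propose` and `toy_check` are hypothetical names, and the two toy functions stand in for a real llm call and a real lean compiler run.

```python
from typing import Callable, Optional, Tuple

# Assumed interfaces (hypothetical, for illustration only):
#   propose(statement, feedback) -> candidate proof text
#   check(proof) -> (ok, error_message)
Proposer = Callable[[str, Optional[str]], str]
Checker = Callable[[str], Tuple[bool, str]]


def prove_with_feedback(statement: str, propose: Proposer, check: Checker,
                        max_rounds: int = 8):
    """Propose a proof, verify it, and feed errors back until it checks or we give up."""
    feedback = None
    for round_no in range(1, max_rounds + 1):
        candidate = propose(statement, feedback)
        ok, errors = check(candidate)
        if ok:
            return candidate, round_no
        feedback = errors  # the next proposal sees the verifier's complaints
    return None, max_rounds


def toy_propose(statement: str, feedback: Optional[str]) -> str:
    # Stand-in for an llm: first attempt is a placeholder, second attempt fixes it.
    return "sorry" if feedback is None else "by decide"


def toy_check(proof: str) -> Tuple[bool, str]:
    # Stand-in for the lean compiler: reject any proof that still contains `sorry`.
    if "sorry" in proof:
        return False, "error: proof contains 'sorry'"
    return True, ""


proof, rounds = prove_with_feedback("theorem t : 2 + 2 = 4", toy_propose, toy_check)
print(proof, rounds)
```

the key design point is that the verifier, not the model, decides when to stop: a proof only counts once the checker accepts it.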

surprising finding
the basic agent solved almost everything the fancy one did, often at lower cost. the complex evolutionary agent only pulled ahead on the hardest problems, where it cut cost by 2x-5x. as llms get more capable, simple loops with formal verification might be all you need
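for contrast with the simple loop, here is a toy sketch of the rank / combine / mutate step of the evolutionary agent. everything in it is an assumption for illustration: proof attempts are modeled as short lists of tactic names, `toy_rate` stands in for the llm raters, and the alphaproof tool call is left out entirely.

```python
import random

TACTICS = ["intro", "simp", "omega", "rfl"]
TARGET = ["intro", "simp", "omega"]  # hypothetical "good" script the toy rater rewards


def toy_rate(attempt):
    """Stand-in for an llm rater: count positions that match the target script."""
    return sum(1 for got, want in zip(attempt, TARGET) if got == want)


def combine(a, b):
    """Crossover: first part of one parent, rest of the other."""
    cut = len(a) // 2
    return a[:cut] + b[cut:]


def mutate(attempt, rng):
    """Replace one randomly chosen step with a random tactic."""
    child = list(attempt)
    child[rng.randrange(len(child))] = rng.choice(TACTICS)
    return child


def evolve(population, rate, generations=10, keep=2, seed=0):
    """Keep the top-rated attempts each round and refill with mutated combinations."""
    rng = random.Random(seed)
    size = len(population)
    for _ in range(generations):
        ranked = sorted(population, key=rate, reverse=True)
        parents = ranked[:keep]  # the best attempts always survive
        children = []
        while len(parents) + len(children) < size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(combine(a, b), rng))
        population = parents + children
    return max(population, key=rate)


initial = [
    ["rfl", "rfl", "rfl"],
    ["intro", "rfl", "rfl"],
    ["rfl", "simp", "rfl"],
    ["rfl", "rfl", "omega"],
]
best = evolve(initial, toy_rate)
print(best, toy_rate(best))
```

because the top attempts are carried over unchanged, the best score never drops between rounds. the extra machinery costs more rater calls per round, which fits the finding that it only pays off on the hardest problems.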

conclusion
ai is becoming more and more powerful in math. the direction is clear: openai, anthropic and google are all improving their products for it

Diagram of Google's AlphaProof Nexus: mathematician feeds Lean problems to Prover and Rater subagents looping via proof validator and database.
