OpenAI’s general-purpose reasoning model autonomously disproved Erdős’s 80-year-old unit distance conjecture, constructing point configurations with n^(1+δ) unit-distance pairs (δ=0.014).
Key Takeaways
The model disproved the conjecture by finding an infinite family of configurations beating the long-standing square grid lower bound of n^(1+C/log log n).
The proof imports tools from algebraic number theory – infinite class field towers and Golod-Shafarevich theory – into a combinatorial geometry problem, a connection no human had made.
This was not a math-specific system: no Lean formalization, no domain-specific scaffolding, no targeted training on the unit distance problem.
External mathematicians including Fields medalist Tim Gowers verified the proof and wrote a companion paper; Will Sawin (Princeton) derived the explicit exponent δ=0.014.
OpenAI tested varying amounts of test-time compute; success rate data was published, suggesting this required significant inference budget.
Hacker News Comment Review
The “LLMs just interpolate” debate resurfaced sharply: commenters split on whether cross-domain recombination (algebraic number theory into discrete geometry) counts as genuine discovery or sophisticated retrieval.
Several technically detailed commenters raised unanswered operational questions: total inference time, hallucination/false-proof rate before the valid proof, and what “peak compute” means on OpenAI’s published success-rate graph.
A minority view flagged that frontier labs pay experts to generate proprietary novel training data – raising the question of whether relevant algebraic number theory connections were seeded into training rather than derived at inference time.
Notable Comments
@cpard: Cross-domain transfer with zero friction may be the core superpower – the model had no reason to stay inside discrete geometry’s usual toolkit.
@dwa3592: Asks the concrete operational questions OpenAI left out: inference duration, false-proof count, and what the ~50% success rate at peak compute actually implies.
@trostaft: Points out the proof was plain-language in and out – no Lean verification – which is either impressive or a gap depending on your priors about AI proof reliability.