Summary of "Terry Tao - Machine assistance and the future of research mathematics - IPAM at UCLA"
Overview
Terry Tao argues that recent advances in AI combined with formal verification are beginning to change how research mathematics is done — enabling scale, broader participation, and new workflows — after a long period in which mathematics has been relatively conservative in practice. He uses the “Erdős problems” (a ~1,000 problem collection curated by Thomas Bloom) as a concrete dataset to show how AI + human collaboration + verification can sweep many medium-difficulty problems, reveal tool strengths and weaknesses, and produce formally checked results.
Main ideas, concepts, and lessons
- Mathematics has been conservative in practice: textbooks, blackboards, small-group collaboration, and very high standards of proof have persisted for decades.
- Structural obstacles to larger-scale collaboration:
- High barrier to entry.
- Need for rigorous verification.
- Workflows that don’t scale (blackboards and tightly coupled small teams).
- New enabling technologies:
- AI — large models, semantic literature search, automated drafting, numerical experiments, code generation.
- Formal verification / proof assistants — Lean, and tools that convert informal proofs into formal code.
- The combination matters: AI produces many candidate solutions and insights, but formal verification filters out hallucinations and low-quality output — enabling trust and scale.
- Rough new model of mathematical research:
- Human teams still tackle deep, very-hard problems in small groups.
- AI + crowdsourcing can sweep large numbers of medium-difficulty problems (the “long tail”) and clear low-hanging fruit.
- Opens opportunities for non-professional contributors (students, hobbyists, people from other fields).
- Data sets matter: curated problem collections (like the Erdős problems) serve as benchmarks that let the community measure, compare, and accelerate progress.
- Social and governance elements are important: community rules (disclosing AI use, moderation, responsibility for posts) make large-scale participation viable.
- Verification is essential: unverified AI output can be dangerous; formalization and proof-checking are key to reliable adoption.
- Practical benefits already seen: faster literature discovery, faster translation of informal proofs to formal proofs, automated plotting and numerical work, and dozens of problems newly solved or formalized.
Workflows and methodology
Creating a productive large-scale math project
- Assemble a systematic problem data set (e.g., Erdős problems).
- Build an online community/forum with clear rules (AI disclosure, responsible summaries, anti-spam).
- Encourage contributions from a broad pool: professionals, grad students, undergraduates, hobbyists, engineers.
Using AI + humans + verification to solve problems (step-by-step)
- Survey: use AI-enabled semantic literature search to find relevant results (including overlooked or non-English papers).
- Experimentation: generate code with AI, run numerical experiments, and visualize patterns quickly.
- Conjecture formation: humans synthesize numerical/AI suggestions into conjectures and iteratively simplify them.
- AI-assisted solution generation: feed conjectures into AI tools to produce sketches or candidate proofs (or transform problems).
- Human vetting: experts inspect and clean AI output, translating it into a form suitable for formalization.
- Formalization: use a proof assistant / formalizer (e.g., Aristotle + Lean) to convert and check the proof; iterate until it compiles.
- Publication/record: record solved problems, mark formalized solutions, and update the dataset and metrics.
Moderation and quality control
- Require disclosure when AI is used.
- Ask contributors to summarize salient points and be responsible for their posts.
- Use links for long content to avoid forum spam.
- Moderators filter low-quality or clearly incorrect AI outputs but allow disclosed, scrutable AI-generated drafts.
Measuring progress
- Track solved vs. unsolved counts over time.
- Distinguish spikes due to literature discovery from spikes due to new AI capabilities.
- Monitor formalization progress (how many solutions are checked in Lean).
Case studies and concrete examples
The Erdős problems data set
- ~1,000 diverse problems collected and tracked publicly (curated by Thomas Bloom).
- Problems range from deep “acorns” (requiring new ideas) to “marshmallows” (easy with a bit of computation).
- Useful as a population survey to measure where AI helps most.
Literature-discovery wins
- AI tools uncovered many solutions in obscure literature (papers in other languages or fields), producing a substantial source of newly “found” solutions.
Problem 367
- A human contributor produced a construction reducing the problem to proving a specific identity.
- Terry Tao used Gemini to produce the missing step.
- Another participant (Alexi) used a formalizer (Aristotle) to convert the result into a Lean proof.
- Outcome: fully confirmed/formalized solution via a human + AI + formalizer pipeline.
Problems 126 and 1024 (coin-splitting / packing examples)
- Problem setup (illustrative): Alice splits coins into n piles; Bob picks a monotone subsequence (increasing or decreasing) to maximize take; question is the fair fraction C(n).
- Community used linear programming, numeric exploration, and AI-assisted tools (AlphaEvolve) to explore optimal configurations for small n and propose conjectures.
- AI/formalizer produced a proof for the square-number-of-piles case; human analysis linked the result to square-packing literature and other solved Erdős problems.
- Example highlights: creative AI problem transformations (mapping the game to a packing problem) and human recognition of related literature and prior results.
Productivity gains
- Tasks that used to take weeks (manual formalization, exhaustive literature search) can often be done in hours with AI + tooling.
- AI excels at secondary tasks (plots, numerical data, code generation) and is increasingly good at creative proof strategies for some problems.
Tools, terms, and resources
- AI models / systems:
- Gemini — used for proof steps.
- AlphaEvolve — used for numeric/optimization exploration.
- Deep research tools for semantic literature search.
- Formal verification:
- Lean — proof assistant.
- Aristotle — a conversion/formalizer tool used to translate/check proofs in Lean.
- Dataset / community:
- Erdős problems (Thomas Bloom’s website/forum) with community moderation and rules.
- Other references:
- Paul Erdős (historical mathematician).
- Jessica Win (photographer referenced to illustrate mathematics’ conservatism).
- Notable contributors and forum users (e.g., Alexi, Bars, Bis, Lawrence Wu).
- Key phrases:
- “long tail” / medium-difficulty problems.
- “acorns vs marshmallows” (Erdős metaphors).
- Semantic literature search, formal proof checking.
Takeaway conclusions
- AI is not (yet) solving the deepest high-profile open problems, but it is:
- Uncovering overlooked literature and solutions.
- Clearing large numbers of less-attended problems (the long tail).
- Speeding up experiments, visualization, and formalization.
- Enabling broader participation outside traditional professional mathematicians.
- The critical combination is human + AI + formal verification: AI supplies scale and creative suggestions, humans provide judgment and synthesis, and formal proof systems provide trust.
- Community structure and good data sets are crucial: curated problem sets and clear community rules enable productive, scalable collaboration.
- Verification remains non-negotiable: formal proof-checking is what makes large-scale use of AI reliable in mathematics.
Speakers and sources mentioned
- Main speaker: Terry Tao (IPAM at UCLA event)
- People, projects, tools, and sources:
- Paul Erdős; Thomas Bloom (curator of the Erdős problems).
- Jessica Win (photographer referenced in subtitles).
- Gemini (AI model); Aristotle (formalizer); Lean (proof assistant).
- “Nano Banana” (forum/user/AI alias); AlphaEvolve (optimization tool).
- Forum participants: Alexi, Bis, Bars; Lawrence Wu.
- IPAM at UCLA (event host).
Category
Educational
Share this summary
Is the summary off?
If you think the summary is inaccurate, you can reprocess it with the latest model.
Preparing reprocess...