AIs Win Math Olympiad Gold: Prof. Lin Yang (UCLA) – #97
Description
Lin Yang is a professor of computer science at UCLA. Recently, he and his collaborator built an AI pipeline using commercial models such as Gemini, ChatGPT, and Grok that performed at the gold medal level on International Mathematical Olympiad problems. Steve and Lin discuss this research, which relies on "verifier-refiner" LLM instances and large token budgets to reliably solve difficult problems. They also discuss how these methods can be used to advance AI for scientific research, legal analysis, and complex document processing.
https://github.com/lyang36/IMO25/blob/main/IMO25.pdf
https://x.com/hsu_steve/status/1948189075707469942
Chapter markers:
- (00:00) - AIs Win Math Olympiad Gold: Prof. Lin Yang (UCLA) – #97
- (00:57) - Prof. Lin Yang, UCLA
- (04:27) - Journey from Physics to Computer Science: 2 PhDs
- (11:15) - Transition to AI from Theoretical CS
- (13:16) - AI Pipeline Math Olympiad: Gold Medal!
- (28:23) - Probability Amplification
- (29:00) - Applications in Industry and Legal Analysis
- (29:58) - Challenges in Model Reasoning and Verification
- (33:23) - Future of AI in Scientific Research and AGI Speculations
–
Steve Hsu is Professor of Theoretical Physics and of Computational Mathematics, Science, and Engineering at Michigan State University. Previously, he was Senior Vice President for Research and Innovation at MSU and Director of the Institute of Theoretical Science at the University of Oregon. Hsu is a startup founder (SuperFocus.ai, SafeWeb, Genomic Prediction, Othram) and advisor to venture capital and other investment firms. He was educated at Caltech and Berkeley, was a Harvard Junior Fellow, and has held faculty positions at Yale, the University of Oregon, and MSU.
Please send any questions or suggestions to manifold1podcast@gmail.com or Steve on X @hsu_steve.
Announcing this for some friends at Mechanize - a startup that builds environments for training and evaluating frontier LLMs. Its customers include the top AI labs, and it has contributed to the breakthrough in coding capabilities of frontier models.
Mechanize is hiring!
Compensation is extremely competitive: $300-500K for technical roles. They are also seeking smart generalists.
For example:
- Research Engineer, Alignment: Build evals that test for misaligned model behaviors ($500K salary)
- Puzzle Maker: Design interesting and original puzzles that LLMs can’t yet solve ($300K salary)
Mechanize understands that my readership is highly selected. There is a VERY GOOD CHANCE you will be interviewed if you apply via the link above.