I’m excited to welcome Ross Taylor back on the podcast (and sorry for the lack of episodes in general – I have a lot going on!). The first time Ross came on we focused on reasoning (before inference-time scaling and that sort of RL was popular), agents, Galactica, and more from his Llama days. Since then, and especially after DeepSeek R1, Ross and I have talked asynchronously about the happenings of AI, so it’s exciting to do it face to face.
In this episode we cover some of everything:
* Recent AI news (Chinese models and OpenAI’s coming releases)
* “Do and don’t” of LLM training organizations
* Reasoning research and academic blind spots
* Research people aren’t paying enough attention to
* Non language modeling news & other topics
Listen on Apple Podcasts, Spotify, YouTube, and wherever you get your podcasts. For other Interconnects interviews, go here.
The show outline is a mix of my questions and edited assertions that Ross sent me as potential topics.
00:00 Recent AI news
Related reading is on Kimi’s K2 model and thoughts on OpenAI’s forthcoming open release.
* What did you think of Z.ai’s GLM 4.5 model (including an MIT-licensed base model) with very strong scores? And Kimi?
* What will OpenAI’s open model actually be?
* What do you make of the state of the ecosystem?
12:10 “Do and don’t” of LLM training organizations
Related reading is on managing training organizations and the Llama 4 release.
This is one of my favorite topics – I think a lot of great stuff will be written on it in the future. For now, Ross asserts…
* Most major LLM efforts are not talent-bound but politics-bound. Recent failures like Llama 4 are org failures, not talent failures.
* Most labs are chaotic, changing direction every week. Very different picture from the narrative presented online.
* Most labs resemble investment banks or accountancy firms in that they hire smart young people as “soldiers” and deliberately burn them out with extremely long hours.
36:40 Reasoning research and academic blind spots
Related reading is two papers raising questions about the Qwen base models for RL (or a summary blog post I wrote).
I start with: What do you think of o3, and search as something to train with RL?
And Ross asserts…
* Most open reasoning research since R1 has been unhelpful, because there isn’t enough compute to see what matters (the underlying model and the number of iterations).
* The best work has been simple tweaks to GRPO, like overlong filtering and removing the KL divergence term (see the sketch after this list).
* Far too much focus on MATH and code; AIME has only tens of samples, so it is very noisy.
* People are
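For context on those GRPO tweaks, here is a minimal sketch of a sequence-level GRPO-style loss with the two modifications applied: the KL penalty dropped entirely, and overlong (truncated) responses masked out of the loss. The function name and tensor shapes are illustrative assumptions, not taken from any particular paper's code.

```python
import torch

def grpo_loss(logprobs, old_logprobs, rewards, truncated, eps=0.2):
    """Simplified GRPO loss for one group of sampled responses.

    logprobs, old_logprobs: sequence-level log-probs under the current
    and sampling policies; rewards: scalar reward per response;
    truncated: bool mask marking responses cut off at max length.
    """
    # Group-relative advantage: normalize each response's reward against
    # the mean/std of its group (all samples for the same prompt).
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-6)

    # Overlong filtering: zero the loss for responses that hit the max
    # generation length, so truncation noise doesn't penalize them.
    adv = adv * (~truncated).float()

    # PPO-style clipped objective on the importance ratio.
    ratio = torch.exp(logprobs - old_logprobs)
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps)

    # Note what is absent: there is no beta * KL term against a reference
    # policy. "Removing KL divergence" just means dropping that regularizer.
    return -torch.min(ratio * adv, clipped * adv).mean()
```

Real implementations apply the ratio and mask per token; this collapses everything to the sequence level to keep the two tweaks visible.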