Podcast Episodes
How Not to Read a Headline on AI (ft. new Olympiad Gold, GPT-5 …)
Season 2 Episode 23
GPT-5 did what? OpenAI ahead of Google? There are 9 ways to misread the headlines of the last 48 hours, so this video is here to tell you what happen…
8 months ago
Grok 4 - 10 New Things to Know
Season 2 Episode 22
Grok 4 is here, but did you know these 10 things about the new model? From benchmark caveats to soloing science, $300 a month secrets to Grok 5 promi…
8 months, 2 weeks ago
When Will AI Models Blackmail You, and Why?
Season 2 Episode 21
In the last few days Anthropic have released an impressively honest account of how all models blackmail, no matter what goal they have, and despite pro…
9 months ago
Apple’s ‘AI Can’t Reason’ Claim Seen By 13M+, What You Need to Know
Season 2 Episode 20
What to make of those headlines that AI can’t reason, seen by tens of millions? I cover the paper in layman’s terms, what it means and doesn’t mean, …
9 months, 2 weeks ago
AI Accelerates: New Gemini Model + AI Unemployment Stories Analysed
Season 2 Episode 19
There’s a new best language model, so let’s go through the ups and downs of Gemini 2.5 Pro 06-05. Record-breaking common-sense, but dumb mistakes rema…
9 months, 2 weeks ago
Claude 4: Full 120 Page Breakdown … Is it the Best New Model?
Season 2 Episode 18
Not only did I get early access and run my own tests, as per the title I read both the 120 page Claude 4 Opus and Claude 4 Sonnet System Card, and 25…
10 months ago
Google Takes No Prisoners Amid Torrent of AI Announcements
Season 2 Episode 17
Google just announced at least 12 things that are each worthy of a video, but here are the top I/O highlights. From Veo 3 to Deep Research now being …
10 months ago
AI Improves at Self-improving
Season 2 Episode 16
AlphaEvolve is not the first system to exhibit self-improvement, but it may be the most impressive yet. AI is literally improving the hardware, archi…
10 months, 1 week ago
o3 breaks (some) records, but AI becomes pay-to-win
Season 2 Episode 15
A green card, o3 vs Gemini 2.5, 6 Benchmarks and a whole bunch of my thoughts on what on earth is happening in AI, from here to 2030. Plus, how AI is…
11 months ago
o3 and o4-mini - they’re great, but easy to over-hype
Season 2 Episode 14
Critical analysis of the two most powerful new models behind ChatGPT, o3 and o4-mini. Not just the system cards, benchmarks, and my own tests, but so…
11 months, 1 week ago