Season 2 Episode 22
Grok 4 is here, but did you know these 10 things about the new model? From benchmark caveats to soloing science, $300-a-month secrets to Grok 5 promises, here are 10 new things to know in just under 12…
Published on 4 months, 1 week ago
Season 2 Episode 21
In the last few days Anthropic have released an impressively honest account of how all models blackmail, no matter what goal they have, and despite prompt warnings and other preventive measures. But do these …
Published on 4 months, 3 weeks ago
Season 2 Episode 20
What to make of those headlines that AI can’t reason, seen by tens of millions? I cover the paper in layman’s terms, what it means and doesn’t mean, and what’s next.
Thanks to Storyblocks for sponsor…
Published on 5 months, 1 week ago
Season 2 Episode 19
There’s a new best language model, so let’s go through the ups and downs of Gemini 2.5 Pro 06-05. Record-breaking common sense, but dumb mistakes remain. And it’s not even their best model, which rema…
Published on 5 months, 1 week ago
Season 2 Episode 18
Not only did I get early access and run my own tests, as per the title I read both the 120-page Claude 4 Opus and Claude 4 Sonnet System Card, and the 25-page report on ASL-3 being triggered, plus the 2 …
Published on 5 months, 3 weeks ago
Season 2 Episode 17
Google just announced at least 12 things that are each worthy of a video, but here are the top I/O highlights. From Veo 3 to Deep Research now being usable, Deep Think breaking records to Gemini Dif…
Published on 5 months, 4 weeks ago
Season 2 Episode 16
AlphaEvolve is not the first system to exhibit self-improvement, but it may be the most impressive yet. AI is literally improving the hardware, architectures, data and training methods of AI itself. …
Published on 6 months ago
Season 2 Episode 15
A green card, o3 vs Gemini 2.5, 6 Benchmarks and a whole bunch of my thoughts on what on earth is happening in AI, from here to 2030. Plus, how AI is becoming pay-to-win, and why. Crazy times, 14 min…
Published on 6 months, 3 weeks ago
Season 2 Episode 14
Critical analysis of the two most powerful new models behind ChatGPT, o3 and o4-mini. Not just the system cards, benchmarks, and my own tests, but some you may not have seen before. Yes, they can whi…
Published on 7 months ago
Season 2 Episode 13
This pod won’t just be about the release of GPT 4.1 in the last 48 hours, o3 build-up, Kling 2.0, a sneak peek at the next OpenAI model, or even the new Dolphin language tool. It will be about 7 such…
Published on 7 months ago