📆 ThursdAI - Dec 18 - Gemini 3 Flash, Grok Voice, ChatGPT Appstore, Image 1.5 & GPT 5.2 Codex, Meta Sam Audio & more AI news
Hey folks 👋 Alex here, dressed as 🎅 for our pre X-mas episode!
We’re wrapping up 2025, and the AI labs decided they absolutely could NOT let the year end quietly. This week was an absolute banger—we had Gemini 3 Flash dropping with frontier intelligence at flash prices, OpenAI firing off GPT 5.2 Codex as breaking news DURING our show, ChatGPT Images 1.5, Nvidia going all-in on open source with Nemotron 3 Nano, and the voice AI space heating up with Grok Voice and Chatterbox Turbo. Oh, and Google dropped FunctionGemma for all your toaster-to-fridge communication needs (yes, really).
Today’s show was over three and a half hours long because we tried to cover both this week AND the entire year of 2025 (that yearly recap is coming next week—it’s a banger, we went month by month and you’ll really feel the acceleration). For now, let’s dive into just the insanity that was THIS week.
00:00 Introduction and Overview
00:39 Weekly AI News Highlights
01:40 Open Source AI Developments
01:44 Nvidia's Nemotron Series
09:09 Google's Gemini 3 Flash
19:26 OpenAI's GPT Image 1.5
20:33 Infographic and GPT Image 1.5 Discussion
20:53 Nano Banana vs GPT Image 1.5
21:23 Testing and Comparisons of Image Models
23:39 Voice and Audio Innovations
24:22 Grok Voice and Tesla Integration
26:01 Open Source Robotics and Voice Agents
29:44 Meta's SAM Audio Release
32:14 Breaking News: Google Function Gemma
33:23 Weights & Biases Announcement
35:19 Breaking News: OpenAI Codex 5.2 Max
Big Companies' LLM Updates
Google’s Gemini 3 Flash: The High-Speed Intelligence King
If we had to title 2025, as Ryan Carson mentioned on the show, it might just be “The Year of Google’s Comeback.” Remember at the start of the year when we were asking “Where is Google?” Well, they are here. Everywhere.
This week they launched Gemini 3 Flash, and it is rightfully turning heads. This is a frontier-class model—meaning it boasts Pro-level intelligence—but it runs at Flash-level speeds and, most importantly, Flash-level pricing. We are talking $0.50 per 1 million input tokens. That is not a typo. The price-to-intelligence ratio here is simply off the charts.
I’ve been using Gemini 2.5 Flash in production for a while because it was good enough, but Gemini 3 Flash is a different beast. It scores 71 on the Artificial Analysis Intelligence Index (a 13-point jump from the previous Flash), and it achieves 78% on SWE-bench Verified. That actually beats the bigger Gemini 3 Pro on some agentic coding tasks!
What impressed me most, and something Kwindla pointed out, is the tool calling. Previous Gemini models sometimes struggled with complex tool use compared to OpenAI, but Gemini 3 Flash can handle up to 100 simultaneous function calls. It’s fast, it’s smart, and it’s integrated immediately across the entire Google stack—Workspace, Android, Chrome. Google isn’t just releasing models anymore; they are deploying them instantly to billions of users.
For anyone building agents, this combination of speed, intelligence, and a 1-million-token context window (at this price!) makes it the new default workhorse.
Google’s FunctionGemma Open-Source Release
We also got a smaller, quirkier release from Google: FunctionGemma. This is a tiny 270M parameter model. Yes, millions, not billions.
It’s purpose-built for function calling on edge devices. It requires only 500MB of RAM, meaning it can run on your phone, in your browser, or even on a Raspberry Pi. As Nisten joked on the show, this is finally the model that lets your toaster talk to your fridge.
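To make the "function calling on edge devices" idea concrete, here is a minimal sketch of the dispatch loop a tiny on-device model would drive. The model itself is stubbed out with a hardcoded JSON tool call; FunctionGemma's actual prompt and output formats may differ, and the tool names here are invented for illustration.

```python
import json

# Hypothetical tool registry an edge device might expose.
# (These functions and names are illustrative, not from FunctionGemma.)
TOOLS = {
    "set_temperature": lambda device, celsius: f"{device} set to {celsius}C",
    "get_status": lambda device: f"{device}: OK",
}

def fake_model(prompt: str) -> str:
    """Stand-in for the on-device model: in reality the 270M model would
    map the natural-language prompt to a structured tool call."""
    return json.dumps(
        {"name": "set_temperature", "arguments": {"device": "fridge", "celsius": 4}}
    )

def dispatch(prompt: str) -> str:
    """Parse the model's tool call and execute the matching local function."""
    call = json.loads(fake_model(prompt))
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

print(dispatch("Make the fridge colder"))  # fridge set to 4C
```

The whole point of a 270M model is that this loop runs entirely on-device: no tokens leave the toaster.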
Is it going to write a novel? No. But after fine-tuning, it jumped from 58% to 85% accuracy on mobile action tasks. This represents a future where privacy-first age