The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Zero-Shot Auto-Labeling: The End of Annotation for Computer Vision with Jason Corso

Episode 735

Today, we're joined by Jason Corso, co-founder of Voxel51 and professor at the University of Michigan, to explore automated labeling in computer vision. Jason introduces FiftyOne, an open-source plat…

Published 5 months ago

Grokking, Generalization Collapse, and the Dynamics of Training Deep Neural Networks with Charles Martin

Episode 734

Today, we're joined by Charles Martin, founder of Calculation Consulting, to discuss WeightWatcher, an open-source tool for analyzing and improving Deep Neural Networks (DNNs) based on principles fr…

Published 5 months, 1 week ago

Google I/O 2025 Special Edition

Episode 733

Today, I’m excited to share a special crossover edition of the podcast recorded live from Google I/O 2025! In this episode, I join Shawn Wang, aka Swyx, from the Latent Space Podcast to interview Loga…

Published 5 months, 2 weeks ago

RAG Risks: Why Retrieval-Augmented LLMs are Not Safer with Sebastian Gehrmann

Episode 732

Today, we're joined by Sebastian Gehrmann, head of responsible AI in the Office of the CTO at Bloomberg, to discuss AI safety in retrieval-augmented generation (RAG) systems and generative AI in high…

Published 5 months, 3 weeks ago

From Prompts to Policies: How RL Builds Better AI Agents with Mahesh Sathiamoorthy

Episode 731

Today, we're joined by Mahesh Sathiamoorthy, co-founder and CEO of Bespoke Labs, to discuss how reinforcement learning (RL) is reshaping the way we build custom agents on top of foundation models. Ma…

Published 5 months, 4 weeks ago

How OpenAI Builds AI Agents That Think and Act with Josh Tobin

Episode 730

Today, we're joined by Josh Tobin, member of technical staff at OpenAI, to discuss the company’s approach to building AI agents. We cover OpenAI's three agentic offerings—Deep Research for comprehens…

Published 6 months ago

CTIBench: Evaluating LLMs in Cyber Threat Intelligence with Nidhi Rastogi

Episode 729

Today, we're joined by Nidhi Rastogi, assistant professor at Rochester Institute of Technology, to discuss Cyber Threat Intelligence (CTI), focusing on her recent project CTIBench, a benchmark for eval…

Published 6 months, 2 weeks ago

Generative Benchmarking with Kelly Hong

Episode 728

In this episode, Kelly Hong, a researcher at Chroma, joins us to discuss "Generative Benchmarking," a novel approach to evaluating retrieval systems, like RAG applications, using synthetic data. Kell…

Published 6 months, 2 weeks ago

Exploring the Biology of LLMs with Circuit Tracing with Emmanuel Ameisen

Episode 727

In this episode, Emmanuel Ameisen, a research engineer at Anthropic, returns to discuss two recent papers: "Circuit Tracing: Revealing Language Model Computational Graphs" and "On the Biology of a La…

Published 6 months, 4 weeks ago

Teaching LLMs to Self-Reflect with Reinforcement Learning with Maohao Shen

Episode 726

Today, we're joined by Maohao Shen, PhD student at MIT, to discuss his paper, “Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search.” We dig int…

Published 7 months ago
