Podcast Episodes

"Scientific breakthroughs of the year" by technicalities

A couple of years ago, Gavin became frustrated with science journalism. No one was pulling together results across fields; the articles usually did…

5 months, 2 weeks ago

"A high integrity/epistemics political machine?" by Raemon

I have goals that can only be reached via a powerful political machine. Probably a lot of other people around here share them. (Goals include “ensur…

5 months, 2 weeks ago

"How I stopped being sure LLMs are just making up their internal experience (but the topic is still confusing)" by Kaj_Sotala

How it started

I used to think that anything that LLMs said about having something like subjective experience or what it felt like on the inside was…

5 months, 2 weeks ago

“My AGI safety research—2025 review, ’26 plans” by Steven Byrnes

Previous: 2024, 2022

“Our greatest fear should not be of failure, but of succeeding at something that doesn't really matter.” –attributed to DL Mood…

5 months, 2 weeks ago

“Weird Generalization & Inductive Backdoors” by Jorio Cocola, Owain_Evans, dylan_f

This is the abstract and introduction of our new paper.

Links: 📜 Paper, 🐦 Twitter thread, 🌐 Project page, 💻 Code

Authors: Jan Betley*, Jorio Cocola…

5 months, 2 weeks ago

“Insights into Claude Opus 4.5 from Pokémon” by Julian Bradshaw

You may be surprised to learn that ClaudePlaysPokemon is still running today, and that Claude still has…

5 months, 2 weeks ago

“The funding conversation we left unfinished” by jenn

People working in the AI industry are making stupid amounts of money, and word on the street is that Anthropic is going to have some sort of liquidi…

5 months, 2 weeks ago

“The behavioral selection model for predicting AI motivations” by Alex Mallen, Buck

Highly capable AI systems might end up deciding the future. Understanding what will drive those decisions is therefore one of the most important que…

5 months, 2 weeks ago

“Little Echo” by Zvi

I believe that we will win.

An echo of an old ad for the 2014 US men's World Cup team. It did not win.

I was in Berkeley for the 2025 Secular Solsti…

5 months, 3 weeks ago

“A Pragmatic Vision for Interpretability” by Neel Nanda

Executive Summary

The Google DeepMind mechanistic interpretability team has made a strategic pivot over the past year, from ambitious reverse-engin…

5 months, 3 weeks ago
