Podcast Episodes
“Announcing: Iliad’s Fall 2026 Programs” by David Udell, Alexander Gietelink Oldenziel, Leon Lang
The April 2026 Iliad Intensive cohort, at LISA
Iliad, an umbrella organization for applied math for AI alignment, is running several additional prog…
1 week ago
“Data you could have observed but didn’t” by Gretta Duleba
You're running a study that involves keeping records about humans. You have a spreadsheet with rows for each person and columns for height, weight, …
1 week ago
“Claude Opus 4.8: The System Card” by Zvi
Only six weeks after Opus 4.7, we have Opus 4.8.
For everyone, that means another incremental upgrade to Claude. It is once again smarter, and can …
1 week ago
“Retrying vs Resampling in AI Control” by james.lucassen, Adam Kaufman
We’ve just released a new paper: Retrying vs Resampling in AI Control. We revisit the resampling protocols introduced in Ctrl-Z with an up-to-date s…
1 week ago
“AI Researchers, Ask Yourself These 6 Questions to Strengthen Your Moral Muscles” by Max Tegmark
By Max Tegmark & Meia Chita-Tegmark
Of course you have moral principles – but how often do you use them?
I, Meia, am a professor doing psychology r…
1 week, 1 day ago
“Developmental Cognitive Interpretability: A Research Agenda for Modelling Generalisation and Predicting Agent Behaviour” by JasonB, Edward James Young
Summary
Safe deployment of an AI system requires that we can make confident claims about its behaviour on out-of-distribution deployment inputs on t…
1 week, 1 day ago
“Does Claude really care about you?” by Simon Lermen
TLDR: The persona-selection alignment approach — selecting a warm, caring persona from the pretraining distribution and reinforcing it — looks succe…
1 week, 1 day ago
“How can the middle powers avoid getting trounced during the intelligence explosion? A plan.” by Tom Davidson
This is an edited version of a LW shortform.
Superintelligence will likely be developed by US companies; run on US data centres; and be under the ju…
1 week, 1 day ago
“Trees are mostly made of air and a generalizable lesson for AI safety” by zroe1
At the risk of embarrassing myself, I’ll share a confession.
For context, I took five years of Latin: four in high school and one in college. In add…
1 week, 1 day ago
“Advice for making robust-to-training model organisms” by SebastianP, Alek Westover, Vivek Hebbar, Julian Stastny, Dylan Xu
We’d like to develop training techniques that work when applied to future misaligned AI systems. One strategy for studying proposed techniques is to…
1 week, 1 day ago