Podcast Episodes
“Predicting Rare LLM Failures with 30× Fewer Rollouts” by Santiago Aranguri, Francisco Pernice
TL;DR: We estimate how often Qwen 3 4B exhibits rare harmful behaviors with 30× fewer rollouts than naive sampling, using a new method that interpol…
3 weeks, 2 days ago
“Cyber Lack of Security and AI Governance” by Zvi
The real recent story of AI has been the background work being done on Cybersecurity, as we process the Mythos Moment along with GPT-5.5, and figure…
3 weeks, 2 days ago
[Linkpost] “Claude is Now Alignment Pretrained” by RogerDearnaley
This is a link post.
Anthropic are now actively using the approach to alignment often called “Alignment Pretraining” or “Safety Pretraining” — using …
3 weeks, 3 days ago
“The primary sources of near-term cybersecurity risk” by lc
[Some ideas here were developed in conversation with Chris Hacking (real name)]
I have tried and failed to write a longer post many times, so here g…
3 weeks, 3 days ago
“Most “inner work” looks like entertainment.” by Chris Lakin
Imagine you’re looking for a personal trainer.
You open one trainer's webpage and read their testimonials: “I had an experience tied for the most i…
3 weeks, 3 days ago
[Linkpost] “Apollo Update May 2026” by Marius Hobbhahn
This is a link post. We now have an SF office. We're hiring for all technical roles in SF and London! The Scheming Research team focuses on two effort…
3 weeks, 3 days ago
“Voters are surprisingly open to talking about AI risk” by less_raichu
TL;DR: Voters are now surprisingly open to talking about existential risk from AI. This seems to have changed in the last 6 months. When campaigning…
3 weeks, 3 days ago
“Childhood and Education #18: Do The Math” by Zvi
We did reading yesterday. Now we do the math. Math is hard.
It does not have to be this hard.
A large part of the reason math is hard, or boring, …
3 weeks, 4 days ago
“The Owned Ones” by Eliezer Yudkowsky
(An LLM Whisperer placed a strong request that I put this story somewhere not on Twitter, so it could be scraped by robots not owned by Elon Musk. I…
3 weeks, 4 days ago
“Optimisation: Selective versus Predictive” by Raymond Douglas
Looking over my favourite posts, I notice that many of them are making specific versions of a more general claim, which is essentially: don’t confus…
3 weeks, 4 days ago