Podcast Episodes
“An Ambitious Vision for Interpretability” by leogao
The goal of ambitious mechanistic interpretability (AMI) is to fully understand how neural networks work. While some have pivoted towards more pragm…
2 months, 3 weeks ago
“6 reasons why ‘alignment-is-hard’ discourse seems alien to human intuitions, and vice-versa” by Steven Byrnes
Tl;dr
AI alignment has a culture clash. On one side, the “technical-alignment-is-hard” / “rational agents” school-of-thought argues that we should e…
2 months, 3 weeks ago
“Three things that surprised me about technical grantmaking at Coefficient Giving (fka Open Phil)”
Coefficient Giving's (formerly Open Philanthropy's) Technical AI Safety team is hiring grantmakers. I thought this would be a good moment to share some positiv…
2 months, 3 weeks ago
“MIRI’s 2025 Fundraiser” by alexvermeer
MIRI is running its first fundraiser in six years, targeting $6M. The first $1.6M raised will be matched 1:1 via an SFF grant. Fundraiser ends at mi…
2 months, 3 weeks ago
“The Best Lack All Conviction: A Confusing Day in the AI Village”
The AI Village is an ongoing experiment (currently running on weekdays from 10 a.m. to 2 p.m. Pacific time) in which frontier language models are gi…
2 months, 3 weeks ago
“The Boring Part of Bell Labs” by Elizabeth
It took me a long time to realize that Bell Labs was cool. You see, my dad worked at Bell Labs, and he has not done a single cool thing in his life …
2 months, 4 weeks ago
[Linkpost] “The Missing Genre: Heroic Parenthood - You can have kids and still punch the sun”
This is a link post. I stopped reading when I was 30. You can fill in all the stereotypes of a girl with a book glued to her face during every meal, …
2 months, 4 weeks ago
“Writing advice: Why people like your quick bullshit takes better than your high-effort posts”
Right now I’m coaching for Inkhaven, a month-long marathon writing event where our brave residents are writing a blog post every single day for the …
2 months, 4 weeks ago
“Claude 4.5 Opus’ Soul Document”
Summary
As far as I understand and uncovered, a document for the character training for Claude is compressed in Claude's weights. The full document …
2 months, 4 weeks ago
“Unless its governance changes, Anthropic is untrustworthy”
Anthropic is untrustworthy.
This post provides arguments, asks questions, and documents some examples of Anthropic's leadership being misleading and…
2 months, 4 weeks ago