Podcast Episodes
“Instrumental Goals Are A Different And Friendlier Kind Of Thing Than Terminal Goals” by johnswentworth, David Lorell
The Cake
Imagine that I want to bake a chocolate cake, and my sole goal in my entire lightcone and extended mathematical universe is to bake that cak…
1 year, 2 months ago
“A Three-Layer Model of LLM Psychology” by Jan_Kulveit
This post offers an accessible model of psychology of character-trained LLMs like Claude.
Epistemic Status
This is primarily a phenomenological model…
1 year, 2 months ago
“Training on Documents About Reward Hacking Induces Reward Hacking” by evhub
This is a link post. This is a blog post reporting some preliminary work from the Anthropic Alignment Science team, which might be of interest to rese…
1 year, 2 months ago
“AI companies are unlikely to make high-assurance safety cases if timelines are short” by ryan_greenblatt
One hope for keeping existential risks low is to get AI companies to (successfully) make high-assurance safety cases: structured and auditable argume…
1 year, 2 months ago
“Mechanisms too simple for humans to design” by Malmesbury
Cross-posted from Telescopic Turnip
As we all know, humans are terrible at building butterflies. We can make a lot of objectively cool things like nuc…
1 year, 2 months ago
“The Gentle Romance” by Richard_Ngo
This is a link post. A story I wrote about living through the transition to utopia.
This is the one story that I've put the most time and effort into; …
1 year, 2 months ago
“Quotes from the Stargate press conference” by Nikola Jurkovic
This is a link post. Present alongside President Trump:
Sam Altman, Larry Ellison (Oracle executive chairman and CTO), Masayoshi Son (Softbank CEO who be…
1 year, 2 months ago
“The Case Against AI Control Research” by johnswentworth
The AI Control Agenda, in its own words:
… we argue that AI labs should ensure that powerful AIs are controlled. That is, labs should make sure that t…
1 year, 2 months ago
“Don’t ignore bad vibes you get from people” by Kaj_Sotala
I think a lot of people have heard so much about internalized prejudice and bias that they think they should ignore any bad vibes they get about a pe…
1 year, 2 months ago
“[Fiction] [Comic] Effective Altruism and Rationality meet at a Secular Solstice afterparty” by tandem
(Both characters are fictional, loosely inspired by various traits from various real people. Be careful about combining kratom and alcohol.)
The orig…
1 year, 2 months ago