Podcast Episodes
“The ‘Think It Faster’ Exercise” by Raemon
Ultimately, I don’t want to solve complex problems via laborious, complex thinking, if we can help it. Ideally, I'd want to basically intuitively fol…
1 year ago
“So You Want To Make Marginal Progress...” by johnswentworth
Once upon a time, in ye olden days of strange names and before Google Maps, seven friends needed to figure out a driving route from their parking lot…
1 year ago
“What is malevolence? On the nature, measurement, and distribution of dark traits” by David Althaus
Summary
In this post, we explore different ways of understanding and measuring malevolence and explain why individuals with concerning levels of mal…
1 year ago
“How AI Takeover Might Happen in 2 Years” by joshc
I’m not a natural “doomsayer.” But unfortunately, part of my job as an AI safety researcher is to think about the more troubling scenarios.
I’m like …
1 year ago
“Gradual Disempowerment, Shell Games and Flinches” by Jan_Kulveit
Over the past year and a half, I've had numerous conversations about the risks we describe in Gradual Disempowerment. (The shortest useful summary of t…
1 year ago
“Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development” by Jan_Kulveit, Raymond D, Nora_Ammann, Deger Turan, David Scott Krueger (formerly: capybaralet), David Duvenaud
This is a link post. Full version on arXiv | X
Executive summary
AI risk scenarios usually portray a relatively sudden loss of human control to AIs,…
1 year ago
“Planning for Extreme AI Risks” by joshc
This post should not be taken as a polished recommendation to AI companies and instead should be treated as an informal summary of a worldview. The c…
1 year, 1 month ago
“Catastrophe through Chaos” by Marius Hobbhahn
This is a personal post and does not necessarily reflect the opinion of other members of Apollo Research. Many other people have talked about similar…
1 year, 1 month ago
“Will alignment-faking Claude accept a deal to reveal its misalignment?” by ryan_greenblatt
I (and co-authors) recently put out "Alignment Faking in Large Language Models" where we show that when Claude strongly dislikes what it is being tra…
1 year, 1 month ago
“‘Sharp Left Turn’ discourse: An opinionated review” by Steven Byrnes
Summary and Table of Contents
The goal of this post is to discuss the so-called “sharp left turn”, the lessons that we learn from analogizing evoluti…
1 year, 1 month ago