Podcast Episodes
“Short Timelines don’t Devalue Long Horizon Research” by Vladimir_Nesov
Short AI takeoff timelines seem to leave no time for some lines of alignment research to become impactful. But any research rebalances the mix of cu…
1 year ago
“Alignment Faking Revisited: Improved Classifiers and Open Source Extensions” by John Hughes, abhayesian, Akbir Khan, Fabien Roger
In this post, we present a replication and extension of an alignment faking model organism:
Replication: We replicate the alignment faking (AF) pap…
1 year ago
“METR: Measuring AI Ability to Complete Long Tasks” by Zach Stein-Perlman
Summary: We propose measuring AI performance in terms of the length of tasks AI agents can complete. We show that this metric has been consistently …
1 year ago
“Why Have Sentence Lengths Decreased?” by Arjun Panickssery
“In the loveliest town of all, where the houses were white and high and the elm trees were green and higher than the houses, where the front yards …
1 year ago
“AI 2027: What Superintelligence Looks Like” by Daniel Kokotajlo, Thomas Larsen, elifland, Scott Alexander, Jonas V, romeo
In 2021 I wrote what became my most popular blog post: What 2026 Looks Like. I intended to keep writing predictions all the way to AGI and beyond, b…
1 year ago
“OpenAI #12: Battle of the Board Redux” by Zvi
Back when the OpenAI board attempted and failed to fire Sam Altman, we faced a highly hostile information environment. The battle was fought largely …
1 year ago
“The Pando Problem: Rethinking AI Individuality” by Jan_Kulveit
Epistemic status: This post aims at an ambitious target: improving intuitive understanding directly. The model for why this is worth trying is that …
1 year ago
“You will crash your car in front of my house within the next week” by Richard Korzekwa
I'm not writing this to alarm anyone, but it would be irresponsible not to report on something this important. On current trends, every car will be …
1 year ago
“My ‘infohazards small working group’ Signal Chat may have encountered minor leaks” by Linch
Remember: There is no such thing as a pink elephant.
Recently, I was made aware that my “infohazards small working group” Signal chat, an informal c…
1 year ago