Podcast Episodes
"Thoughts on the AI Safety Summit company policy requests and responses" by So8res
Over the next two days, the UK government is hosting an AI Safety Summit focused on “the safe and responsible development of frontier AI”. They reque…
2 years, 6 months ago
[Human Voice] "Book Review: Going Infinite" by Zvi
Support ongoing human narrations of curated posts:
www.patreon.com/LWCurated
Previously: Sadly, FTX
I doubted whether it would be a good use of time to …
2 years, 7 months ago
"Announcing Timaeus" by Jesse Hoogland et al.
Timaeus is a new AI safety research organization dedicated to making fundamental breakthroughs in technical AI alignment using deep ideas from mathem…
2 years, 7 months ago
"Thoughts on responsible scaling policies and regulation" by Paul Christiano
I am excited about AI developers implementing responsible scaling policies; I’ve recently been spending time refining this idea and advocating for it…
2 years, 7 months ago
"AI as a science, and three obstacles to alignment strategies" by Nate Soares
AI used to be a science. In the old days (back when AI didn't work very well), people were attempting to develop a working theory of cognition. Those …
2 years, 7 months ago
"Architects of Our Own Demise: We Should Stop Developing AI" by Roko
Some brief thoughts at a difficult time in the AI risk debate.
Imagine you go back in time to the year 1999 and tell people that in 24 years time, hum…
2 years, 7 months ago
"At 87, Pearl is still able to change his mind" by rotatingpaguro
Judea Pearl is a famous researcher, known for Bayesian networks (the standard way of representing Bayesian models), and his statistical formalization…
2 years, 7 months ago
"We're Not Ready: thoughts on "pausing" and responsible scaling policies" by Holden Karnofsky
Views are my own, not Open Philanthropy’s. I am married to the President of Anthropic and have a financial interest in both Anthropic and OpenAI via …
2 years, 7 months ago
[HUMAN VOICE] "Alignment Implications of LLM Successes: a Debate in One Act" by Zack M Davis
Doomimir: Humanity has made no progress on the alignment problem. Not only…
2 years, 7 months ago
"LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B" by Simon Lermen & Jeffrey Ladish.
Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort, under the mentorship of Jeffrey Ladish.
TL;DR LoRA fine-tunin…
2 years, 7 months ago