
“On restraining AI development for the sake of safety” by Joe Carlsmith

Description

(Podcast version, read by the author, here, or search for "Joe Carlsmith Audio" on your podcast app.

This is the tenth essay in a series I’m calling “How do we solve the alignment problem?”. I’m hoping that the individual essays can be read fairly well on their own, but see this introduction for a summary of the essays that have been released thus far, plus a bit more about the series as a whole.

I work at Anthropic, but I am here speaking only for myself and not for my employer.)

1. Introduction

In the third essay in the series, I distinguished between three key “security factors” for developing advanced AI safely, namely:

  • Safety progress: our ability to develop new levels of AI capability safely.
  • Risk evaluation: our ability to track and forecast the level of risk that a given sort of AI capability development involves.
  • Capability restraint: our ability to steer and restrain AI capability development when doing so is necessary for maintaining safety.

A lot of my focus in the series has been on safety progress – and to a lesser extent, risk evaluation. In this essay, I want to look at capability restraint, in [...]

---

Outline:

(00:38) 1. Introduction

(08:18) 2. Preliminaries

(10:59) 3. AI development isn't necessarily a prisoner's dilemma

(18:03) 4. Forms of capability restraint

(19:26) 4.1. Individual capability restraint

(21:32) 4.2. Collective capability restraint

(26:25) 4.3. Treatment of ongoing AI development

(33:06) 5. Idealized capability restraint

(45:00) 6. Capability restraint in practice

(45:32) 6.1. The likelihood of serious effort

(53:57) 6.2. The efficacy of capability restraint

(55:40) 6.2.1. Compute governance

(58:05) 6.2.2. Algorithmic governance

(01:04:12) 6.2.3. Greenlighting and safety progress

(01:10:44) 6.3. Ways that capability restraint could end up net negative

(01:11:53) 6.3.1. Concentrations of power

(01:17:17) 6.3.2. Ceding competitive advantage to authoritarian countries

(01:19:29) 6.3.3. Other concerns

(01:27:13) 7. Prioritizing capability restraint relative to other security factors

(01:29:43) 8. Conclusion

(01:31:19) Appendix 1: What are we using the time for?

The original text contained 40 footnotes which were omitted from this narration.

---

First published:
March 19th, 2026

Source:
https://www.lesswrong.com/posts/K8jyKcDQbfBjmiAoM/on-restraining-ai-development-for-the-sake-of-safety

---

Narrated by TYPE III AUDIO.
