Episode Details

“Why Even Experts Don’t Know What to Do About AI Risk” by Luc Brinkman, plex

Published 3 days, 21 hours ago
Description

AI Safety veteran Holden Karnofsky thinks there's a 49% chance his actions are making things worse.[1]

In 2025, Jesse Clifton even stepped down as the executive director of the Center on Long-Term Risk for similar reasons.

Even top AI Safety strategists don’t know what will make things better and what will make things worse.

Why is it so hard to improve humanity's odds?

And what can you do to choose your actions?

1) Hidden Failure Lets You Fail Without Knowing It

In AI Safety, impact is hard to measure, and thus lack of impact is often invisible. We call this "hidden failure". With hidden failure, projects fail to have a positive impact but the people doing the project don’t realise it.

To understand where hidden failure comes from, it helps to understand why projects fail in general. The reasons fall on a spectrum:

  • Wrong problem: You're addressing something with little influence on x-risk. For example, researching AI fairness when the core risk is misalignment.
  • Wrong solution: Your solution doesn't solve the problem, even when competently executed. E.g. interpretability research that's technically novel but isn’t actually helpful.
  • Poor execution: Your problem-solution set could be impactful but you're not executing your [...]

---

Outline:

(00:49) 1) Hidden Failure Lets You Fail Without Knowing It

(02:44) 2) Why impact is harder than profit

(03:24) 3) The pre-paradigmatic challenge

The original text contained 8 footnotes which were omitted from this narration.

---

First published:
June 2nd, 2026

Source:
https://www.lesswrong.com/posts/tRRkj249gdDL4mued/why-even-experts-don-t-know-what-to-do-about-ai-risk

---

Narrated by TYPE III AUDIO.

---