Episode Details
"theory uplift differentially benefits safety & is massively underpriced" by Yudhister Kumar
Published 4 days, 12 hours ago
Description
[1] We will likely have near-superhuman mathematics AI by Q1 2027. [1]
[2] Qualitatively, AI mathematics capabilities are developing significantly faster than automated AI R&D capabilities. [2]
[3] Thus, we will likely have a period in which our ability to rigorously & usefully verify and understand model behavior and model outputs improves faster than capabilities themselves develop.
[4] Our ability to take advantage of this period is bottlenecked on the quality of our specification generation infrastructure, elicitation tooling (for proofs & specs etc.), and the institutional capacity for scaling useful outputs with capital.
[5] My understanding is that basically no one [3] is working on building infra that can usefully turn >100 million dollars of compute credits into safety-relevant mathematical output.
[5.1] The number of theory-driven ASI alignment efforts is also comparatively minuscule. ARC is a much better bet now than it was in 2023.
[5.2] My understanding is also that no one is working on developing AI-powered conceptual tooling infrastructure for tackling problems in, for instance, [metaphilosophy](https://www.alignmentforum.org/posts/EByDsY9S3EDhhfFzC/some-thoughts-on-metaphilosophy). This is a much harder problem.
[6] In worlds where alignment is easy, prosaic methods may [...]
The original text contained 3 footnotes which were omitted from this narration.
---
First published:
May 20th, 2026
Source:
https://www.lesswrong.com/posts/KWeAYcDJwfrG7RwBN/theory-uplift-differentially-benefits-safety-and-is
---
Narrated by TYPE III AUDIO.