“How scary is Claude Mythos? 303 pages in 21 minutes” by 80000_Hours
Description
By Robert Wiblin | Watch on YouTube | Listen on Spotify
As we now know, Anthropic has built an AI that can break into almost any computer on Earth. That AI has already found thousands of unknown security vulnerabilities in every major operating system and every major browser. And Anthropic has decided it's too dangerous to release to the public; it would just cause too much harm.
Here are just a few of the things that AI accomplished during testing:
- It found a 27-year-old flaw in the world's most security-hardened operating system that would in effect let it crash all kinds of essential infrastructure.
- Engineers at the company with no particular security training asked it to find vulnerabilities overnight and woke up to working exploits of critical security flaws that could be used to cause real harm.
- It managed to figure out how to build web pages that, when visited by fully updated, fully patched computers, would allow it to write to the operating system kernel — the most important and protected layer of any computer.
We know all this because Anthropic has released hundreds of pages of documentation about this model, which they’ve called Claude Mythos.
[...]
---
Outline:
(01:35) Why people are panicking about computer security
(05:12) Mythos could break out of containment
(07:22) Anthropic is losing billions in revenue by not releasing Mythos
(09:09) Mythos is actually the most aligned model to date, except...
(11:44) Mythos knows when it's being tested
(13:50) Mythos can hide its thoughts
(16:24) Mythos can't be trusted about whether it's untrustworthy
(20:00) Does Mythos advance automated AI R&D?
(22:28) Mythos scares Anthropic
(24:53) Learn more
---
First published:
April 10th, 2026
---
Narrated by TYPE III AUDIO.
---