Podcast Episodes
"Anthropic repeatedly accidentally trained against the CoT, demonstrating inadequate processes" by Alex Mallen, ryan_greenblatt
It turns out that Anthropic accidentally trained against the chain of thought of Claude Mythos Preview in around 8% of training episodes. This is at…
1 month, 2 weeks ago
"The policy surrounding Mythos marks an irreversible power shift" by sil
This post assumes Anthropic isn't lying:
Mythos is the current SOTA; Mythos is potent[1]; Anthropic will not make it publicly available un-nerfed[2]; Anth…
1 month, 2 weeks ago
"Only Law Can Prevent Extinction" by Eliezer Yudkowsky
There's a quote I read as a kid that stuck with me my whole life:
"Remember that all tax revenue is the result of holding a gun to somebody's head. …
1 month, 2 weeks ago
"Dario probably doesn’t believe in superintelligence" by RobertM
Epistemic status: I think this is true but don't think this post is a very strong argument for the case, or particularly interesting to read. But I …
1 month, 2 weeks ago
"Daycare illnesses" by Nina Panickssery
Before I had a baby I was pretty agnostic about the idea of daycare. I could imagine various pros and cons but I didn’t have a strong overall opinio…
1 month, 2 weeks ago
"If Mythos actually made Anthropic employees 4x more productive, I would radically shorten my timelines" by ryan_greenblatt
Anthropic's system card for Mythos Preview says:
It's unclear how we should interpret this. What do they mean by productivity uplift? To what exten…
1 month, 2 weeks ago
"Do not be surprised if LessWrong gets hacked" by RobertM
Or, for that matter, anything else.
This post is meant to be two things:
a PSA about LessWrong's current security posture, from a LessWrong admin[1]…
1 month, 3 weeks ago
"My picture of the present in AI" by ryan_greenblatt
In this post, I'll go through some of my best guesses for the current situation in AI as of the start of April 2026. You can think of this as a scen…
1 month, 3 weeks ago
"The effects of caffeine consumption do not decay with a ~5 hour half-life" by kman
epistemic status: confident in the overall picture, substantial quantitative uncertainty about the relative potency of caffeine and paraxanthine
tld…
1 month, 3 weeks ago
"AIs can now often do massive easy-to-verify SWE tasks and I’ve updated towards shorter timelines" by ryan_greenblatt
I've recently updated towards substantially shorter AI timelines and much faster progress in some areas. [1] The largest updates I've made are (1) …
1 month, 3 weeks ago