Podcast Episodes

“The Unintelligibility is Ours: Notes on Chain-of-Thought” by 1a3orn

Many people seem to think that the chains-of-thought in RL-trained LLMs are under a great deal of "pressure" to cease being English. The idea is tha…

1 week, 4 days ago

“If Mythos actually made Anthropic employees 4x more productive, I would radically shorten my timelines” by ryan_greenblatt

Anthropic's system card for Mythos Preview says:

It's unclear how we should interpret this. What do they mean by productivity uplift? To what exten…

1 week, 4 days ago

“Claude Mythos #2: Cybersecurity and Project Glasswing” by Zvi

Anthropic is not going to release its new most capable model, Claude Mythos, to the public any time soon. Its cyber capabilities are too dangerous t…

1 week, 4 days ago

“Why Control Creates Conflict, and When to Open Instead” by plex

tl;dr: with multiple agents, control attempts tend to create conflict, because control attempts shut down communications channels, which leads to fe…

1 week, 4 days ago

“Reproducing steering against evaluation awareness in a large open-weight model” by Thomas Read, Bronson Schoen, Joseph Bloom

Produced as part of the UK AISI Model Transparency Team. Our team works on ensuring models don't subvert safety assessments, e.g. through evaluation…

1 week, 4 days ago

“Have we already lost? Part 2: Reasons for Doom” by LawrenceC

Written very quickly for the Inkhaven Residency.

As I take the time to reflect on the state of AI Safety in early 2026, one question feels unavoidab…

1 week, 4 days ago

“Model organisms researchers should check whether high LRs defeat their model organisms” by dx26, Sebastian Prasanna, Alek Westover, Vivek Hebbar, Julian Stastny

Thanks to Buck Shlegeris for feedback on a draft of this post.

The goal-guarding hypothesis states that schemers will be able to preserve their goal…

1 week, 5 days ago

“Anthropic did not publish a “risk discussion” of Mythos when required by their RSP” by RobertM

I and some other people noticed a potential discrepancy in Anthropic's announcement of Claude Mythos. The version of the RSP that was operative over…

1 week, 5 days ago

“Claude Mythos: The System Card” by Zvi

Claude Mythos is different.

This is the first model other than GPT-2 that is at first not being released for public use at all.

With GPT-2 the del…

1 week, 5 days ago

“Some takes on UV & cancer” by Steven Byrnes

Table of contents:

Part 1: In which I use my optical physics background to share some hopefully-uncontroversial observations
Part 2: In which I boldl…

1 week, 5 days ago

