Podcast Episodes

“Bringing More Expertise to Bear on Alignment” by Edmund Lau, Geoffrey Irving, Cameron Holmes, David Africa

Preamble

The preamble is less useful for the typical AlignmentForum/LessWrong reader, who may want to skip to the Adversaria vs Basinland section.

On 28…

4 weeks, 1 day ago

[Linkpost] “How to prevent AI’s 2008 moment (We’re hiring)” by felixgaston

This is a link post.

TL;DR: CeSIA, the French Center for AI Safety, is recruiting. French not necessary. Apply by 22 May 2026; Paris or remote in Euro…

4 weeks, 1 day ago

“AI #167: The Prior Restraint Era Begins” by Zvi

The era of training frontier models and then releasing them whenever you wanted?

That was fun while it lasted. It looks likely to be over now. The …

4 weeks, 2 days ago

“Mechanistic estimation for wide random MLPs” by Jacob_Hilton

This post covers joint work with Wilson Wu, George Robinson, Mike Winer, Victor Lecomte and Paul Christiano. Thanks to Geoffrey Irving and Jess Ried…

4 weeks, 2 days ago

“Natural Language Autoencoders Produce Unsupervised Explanations of LLM Activations” by Subhash Kantamneni, kitft, Euan Ong, Sam Marks

Abstract

We introduce Natural Language Autoencoders (NLAs), an unsupervised method for generating natural language explanations of LLM activations. …

4 weeks, 2 days ago

“Try, even if they have you cold” by WalterL

I think smart people try things less often than they should, because of a cached mental pattern where you think of what might go wrong, and you find…

4 weeks, 2 days ago

“A review of “Investigating the consequences of accidentally grading CoT during RL”” by Buck

Last week, OpenAI staff shared an early draft of Investigating the consequences of accidentally grading CoT during RL with Redwood Research staff.

T…

4 weeks, 2 days ago

“There is no evidence you should reapply sunscreen every 2 hours.” by Hide

It's incredible how many consensus guidelines dissolve when you look closely at them.


If you listen to any authority on the subject of sunscreen,…

4 weeks, 2 days ago

“Many individual CEVs are probably quite bad” by Viliam

I was thinking about Habryka's article on Putin's CEV, but I am posting my response here, because the original article is already 3 weeks old.

I am …

1 month ago

“x-risk-themed” by kave

Sometimes, a friend who works around here, at an x-risk-themed organisation, will think about leaving their job. They’ll ask a group of people “what…

1 month ago
