Podcast Episodes

“Announcing the Center for Shared AI Prosperity” by Dylan Matthews

I wanted to share the launch of a project I've been working on with pollster David Shor, Obama/Biden veteran Stef Feldman, political strategist Morr…

3 weeks, 1 day ago

“Risk reports need to address deployment-time spread of misalignment” by Alex Mallen

Risk reports commonly use pre-deployment alignment assessments to measure misalignment risk from an internally deployed AI. However, an AI that genu…

3 weeks, 1 day ago

“Mechanistic estimation for expectations of random products” by Jacob_Hilton

We have developed some relatively general methods for mechanistic estimation competitive with sampling by studying problems that are expressible as …

3 weeks, 1 day ago

“MATS 9 Retrospective & Advice” by beyarkay

I couldn’t find a recent write-up from a MATS alum about what attending MATS was like, so this is the thing that I wish I had. I attended MATS from …

3 weeks, 1 day ago

[Linkpost] “Don’t be too Clever to Take Obvious Advice” by Hide

This is a link post.

An insidious pattern among smart people is feeling that because something is familiar and obvious, you are impervious to ignorin…

3 weeks, 1 day ago

“Verification-Centric AI” by Raemon

"Sometimes the AI just makes stuff up" is a problem I don't really expect to go away. In the near term, AI is going to keep occasionally hallucinatin…

3 weeks, 1 day ago

“Convergent Abstraction Hypothesis” by Jan_Kulveit

Tl;dr

Convergent abstraction hypothesis posits abstractions are often convergent in the sense of convergent evolution: different cognitive systems c…

3 weeks, 1 day ago

“AI #168: Not Leading the Future” by Zvi

This is what a lull looks like at this point. The government is having internal arguments. The models are getting improved internally. The coding ag…

3 weeks, 2 days ago

“Automated Alignment is Harder Than You Think” by Aleksandr Bowkis, Marie_DB, Jacob Pfau, Geoffrey Irving

Summary

This is a summary of a paper published by the alignment team at UK AISI. Read the full paper here.

AI research agents may help solve ASI ali…

3 weeks, 2 days ago

“The safe-to-dangerous shift is a fundamental problem for eval realism; but also for measuring awareness” by Charlie Griffin, Patrick Leask

1) The safe-to-dangerous shift is a fundamental problem for eval realism

Suppose we have a capable and potentially scheming model, and before we dep…

3 weeks, 2 days ago

