Podcast Episodes

“Risk from fitness-seeking AIs: mechanisms and mitigations” by Alex Mallen

Current AIs routinely take unintended actions to score well on tasks: hardcoding test cases, training on the test set, downplaying issues, etc. This…

1 month ago

“Sanity-checking “Incompressible Knowledge Probes”” by Sturb, LawrenceC

Or, did a chief scientist of an AI assistant startup conclusively show that GPT-5.5 has 9.7 trillion parameters?

Introduction

Recently, a paper was …

1 month ago

“AI unemployment and AI extinction are often the same” by KatjaGrace

My sense is that people think of AI existential risk and AI unemployment as distinct issues.

Some people are extremely concerned about extinction a…

1 month ago

“AI risk was not invented by AI CEOs to hype their companies” by KatjaGrace

I hear that many people believe that the idea of advanced AI threatening human existence was invented by AI CEOs to hype their products. I’ve even b…

1 month ago

“Cyborg evals” by Eye You, frmsaul

The low-background steel problem

Modern steel is slightly radioactive. We did a lot of atomic testing in the 40s and 50s, and now our atmosphere has…

1 month ago

“To what extent is Qwen3-32B predicting its persona?” by Arjun Khandelwal, ryan_greenblatt, Alex Mallen

TL;DR

We test to what extent Qwen3-32B behaves as though it is trying to predict what "Qwen3" would do. We do this by using Synthetic Document Finet…

1 month ago

“Research Sabotage in ML Codebases” by egan

One of the main hopes for AI safety is using AIs to automate AI safety research. However, if models are misaligned, then they may sabotage the safet…

1 month, 1 week ago

“Maybe I was too harsh on deep learning theory (three days ago)” by LawrenceC

A few days ago, I reviewed a paper titled “There Will Be a Scientific Theory of Deep Learning”. In it, I expressed appreciation for the authors for …

1 month, 1 week ago

“Notes on Transformer Consciousness” by slavachalnev

Assuming transformers can have conscious experience, what would that experience be like?

Transformers[1] are a structured grid of layers and token p…

1 month, 1 week ago

“On today’s panel with Bernie Sanders” by David Scott Krueger

It's sort of easy to forget how close Bernie Sanders was to becoming the most powerful person in the world. The world we live in feels so much not l…

1 month, 1 week ago
