Podcast Episodes

“Ten people on the inside” by Buck

(Many of these ideas developed in conversation with Ryan Greenblatt)

In a shortform, I described some different levels of resources and buy-in for mis…

1 year, 1 month ago

“Anomalous Tokens in DeepSeek-V3 and r1” by henry

“Anomalous”, “glitch”, or “unspeakable” tokens in an LLM are those that induce bizarre behavior or otherwise don’t behave like regular text.

The Solid…

1 year, 1 month ago

“Tell me about yourself: LLMs are aware of their implicit behaviors” by Martín Soto, Owain_Evans

This is the abstract and introduction of our new paper, with some discussion of implications for AI Safety at the end.

Authors: Jan Betley*, Xuchan …

1 year, 1 month ago

“Instrumental Goals Are A Different And Friendlier Kind Of Thing Than Terminal Goals” by johnswentworth, David Lorell

The Cake

Imagine that I want to bake a chocolate cake, and my sole goal in my entire lightcone and extended mathematical universe is to bake that cak…

1 year, 1 month ago

“A Three-Layer Model of LLM Psychology” by Jan_Kulveit

This post offers an accessible model of psychology of character-trained LLMs like Claude.

Epistemic Status

This is primarily a phenomenological model…

1 year, 1 month ago

“Training on Documents About Reward Hacking Induces Reward Hacking” by evhub

This is a link post. This is a blog post reporting some preliminary work from the Anthropic Alignment Science team, which might be of interest to rese…

1 year, 1 month ago

“AI companies are unlikely to make high-assurance safety cases if timelines are short” by ryan_greenblatt

One hope for keeping existential risks low is to get AI companies to (successfully) make high-assurance safety cases: structured and auditable argume…

1 year, 1 month ago

“Mechanisms too simple for humans to design” by Malmesbury

Cross-posted from Telescopic Turnip

As we all know, humans are terrible at building butterflies. We can make a lot of objectively cool things like nuc…

1 year, 1 month ago

“The Gentle Romance” by Richard_Ngo

This is a link post. A story I wrote about living through the transition to utopia.

This is the one story that I've put the most time and effort into; …

1 year, 1 month ago

“Quotes from the Stargate press conference” by Nikola Jurkovic

This is a link post. Present alongside President Trump:

Sam Altman, Larry Ellison (Oracle executive chairman and CTO), Masayoshi Son (Softbank CEO who be…

1 year, 1 month ago

