Podcast Episodes

“Connecting the Dots: LLMs can Infer & Verbalize Latent Structure from Training Data” by Johannes Treutlein, Owain_Evans

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. This is a link post. TL;DR: We published a new paper on out-of-c…

2 years, 1 month ago

“Boycott OpenAI” by PeterMcCluskey

This is a link post. I have canceled my OpenAI subscription in protest over OpenAI's lack of ethics.

In particular, I object to:

threats to confiscate d…

2 years, 1 month ago

“Sycophancy to subterfuge: Investigating reward tampering in large language models” by evhub, Carson Denison

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. This is a link post. New Anthropic model organisms research pape…

2 years, 1 month ago

“I would have shit in that alley, too” by Declan Molony

After living in a suburb for most of my life, when I moved to a major U.S. city the first thing I noticed was the feces. At first I assumed it was do…

2 years, 1 month ago

“Getting 50% (SoTA) on ARC-AGI with GPT-4o” by ryan_greenblatt

I recently got to 50%[1] accuracy on the public test set for ARC-AGI by having GPT-4o generate …

2 years, 1 month ago

“Why I don’t believe in the placebo effect” by transhumanist_atom_understander

Have you heard this before? In clinical trials, medicines have to be compared to a placebo to separate the effect of the medicine from the psychologi…

2 years, 1 month ago

“Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety)” by Andrew_Critch

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. As an AI researcher who wants to do technical work that helps h…

2 years, 1 month ago

“My AI Model Delta Compared To Christiano” by johnswentworth

Preamble: Delta vs Crux

This section is redundant if you already read My AI Model Delta Compared To Yudkowsky.

I don’t natively think in terms of crux…

2 years, 1 month ago

“My AI Model Delta Compared To Yudkowsky” by johnswentworth

Preamble: Delta vs Crux

I don’t natively think in terms of cruxes. But there's a similar concept which is more natural for me, which I’ll call a delt…

2 years, 1 month ago

“Response to Aschenbrenner’s ‘Situational Awareness’” by Rob Bensinger

(Cross-posted from Twitter.)

My take on Leopold Aschenbrenner's new report: I think Leopold gets it right on a bunch of important counts.

Three that I…

2 years, 1 month ago
