Podcast Episodes

“Will Jesus Christ return in an election year?” by Eric Neyman

Thanks to Jesse Richardson for discussion.

Polymarket asks: will Jesus Christ return in 2025?

In the three days since the market opened, traders ha…

1 year ago

“Good Research Takes are Not Sufficient for Good Strategic Takes” by Neel Nanda

TL;DR Having a good research track record is some evidence of good big-picture takes, but it's weak evidence. Strategic thinking is hard, and requir…

1 year ago

“Intention to Treat” by Alicorn

When my son was three, we enrolled him in a study of a vision condition that runs in my family. They wanted us to put an eyepatch on him for part of…

1 year ago

“On the Rationality of Deterring ASI” by Dan H

I’m releasing a new paper “Superintelligence Strategy” alongside Eric Schmidt (formerly Google), and Alexandr Wang (Scale AI). Below is the executive…

1 year ago

[Linkpost] “METR: Measuring AI Ability to Complete Long Tasks” by Zach Stein-Perlman

This is a link post. Summary: We propose measuring AI performance in terms of the length of tasks AI agents can complete. We show that this metric ha…

1 year ago

“I make several million dollars per year and have hundreds of thousands of followers—what is the straightest line path to utilizing these resources to reduce existential-level AI threats?” by shrimpy

I have, over the last year, become fairly well-known in a small corner of the internet tangentially related to AI.

As a result, I've begun making what…

1 year ago

“Claude Sonnet 3.7 (often) knows when it’s in alignment evaluations” by Nicholas Goldowsky-Dill, Mikita Balesni, Jérémy Scheurer, Marius Hobbhahn

Note: this is a research note based on observations from evaluating Claude Sonnet 3.7. We’re sharing the results of these ‘work-in-progress’ investig…

1 year, 1 month ago

“Levels of Friction” by Zvi

Scott Alexander famously warned us to Beware Trivial Inconveniences.

When you make a thing easy to do, people often do vastly more of it.

When you put …

1 year, 1 month ago

“Why White-Box Redteaming Makes Me Feel Weird” by Zygi Straznickas

There's this popular trope in fiction about a character being mind controlled without losing awareness of what's happening. Think Jessica Jones, The …

1 year, 1 month ago

“Reducing LLM deception at scale with self-other overlap fine-tuning” by Marc Carauleanu, Diogo de Lucena, Gunnar_Zarncke, Judd Rosenblatt, Mike Vaiana, Cameron Berg

This research was conducted at AE Studio and supported by the AI Safety Grants programme administered by Foresight Institute with additional support …

1 year, 1 month ago
