Podcast Episodes

“EIS XIII: Reflections on Anthropic’s SAE Research Circa May 2024” by scasper

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. Part 13 of 12 in the Engineer's Interpretability Sequence.

 TL;D…

1 year, 9 months ago

“What’s Going on With OpenAI’s Messaging?” by ozziegoen

This is a quickly written opinion piece on what I understand about OpenAI. I first posted it to Facebook, where it had some discussion.

 

 Some argum…

1 year, 9 months ago

“Language Models Model Us” by eggsyntax

Produced as part of the MATS Winter 2023-4 program, under the mentorship of @Jessica Rumbelow

One-sentence summary: On a dataset of human-written essa…

1 year, 9 months ago

Jaan Tallinn’s 2023 Philanthropy Overview

This is a link post. to follow up my philanthropic pledge from 2020, i've updated my philanthropy page with 2023 results.

in 2023 my donations funded $4…

1 year, 9 months ago

“OpenAI: Exodus” by Zvi

Previously: OpenAI: Facts From a Weekend, OpenAI: The Battle of the Board, OpenAI: Leaks Confirm the Story, OpenAI: Altman Returns, OpenAI: The Board…

1 year, 9 months ago

DeepMind’s “Frontier Safety Framework” is weak and unambitious

FSF blogpost. Full document (just 6 pages; you should read it). Compare to Anthropic's RSP, OpenAI's RSP ("PF"), and METR's Key Components of an RSP.…

1 year, 9 months ago

Do you believe in hundred dollar bills lying on the ground? Consider humming

Introduction.

[Reminder: I am an internet weirdo with no medical credentials]

A few months ago, I published some crude estimates of the power of nitr…

1 year, 9 months ago

Deep Honesty

Most people avoid saying literally false things, especially if those could be audited, like making up facts or credentials. The reasons for this are …

1 year, 9 months ago

On Not Pulling The Ladder Up Behind You

Epistemic Status: Musing and speculation, but I think there's a real thing here.

1.

When I was a kid, a friend of mine had a tree fort. If you've neve…

1 year, 9 months ago

Mechanistically Eliciting Latent Behaviors in Language Models

Produced as part of the MATS Winter 2024 program, under the mentorship of Alex Turner (TurnTrout).

TL;DR: I introduce a method for eliciting latent be…

1 year, 10 months ago

