Podcast Episodes

Do you believe in hundred dollar bills lying on the ground? Consider humming

Introduction.

[Reminder: I am an internet weirdo with no medical credentials]

A few months ago, I published some crude estimates of the power of nitr…

1 year, 10 months ago

Deep Honesty

Most people avoid saying literally false things, especially if those could be audited, like making up facts or credentials. The reasons for this are …

1 year, 11 months ago

On Not Pulling The Ladder Up Behind You

Epistemic Status: Musing and speculation, but I think there's a real thing here.

1.

When I was a kid, a friend of mine had a tree fort. If you've neve…

1 year, 11 months ago

Mechanistically Eliciting Latent Behaviors in Language Models

Produced as part of the MATS Winter 2024 program, under the mentorship of Alex Turner (TurnTrout).

TL;DR: I introduce a method for eliciting latent be…

1 year, 11 months ago

Ironing Out the Squiggles

Adversarial Examples: A Problem

The apparent successes of the deep learning revolution conceal a dark underbelly. It may seem that we now know how to…

1 year, 11 months ago

Introducing AI Lab Watch

This is a linkpost for https://ailabwatch.org. I'm launching AI Lab Watch. I collected actions for frontier AI labs to improve AI safety, then evaluate…

1 year, 11 months ago

Refusal in LLMs is mediated by a single direction

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. This work was produced as part of Neel Nanda's stream in the ML…

1 year, 11 months ago

Funny Anecdote of Eliezer From His Sister

This comes from a podcast called 18Forty, of which the main demographic is Orthodox Jews. Eliezer's sister (Hannah) came on and talked about her Shev…

1 year, 11 months ago

Thoughts on seed oil

This is a linkpost for https://dynomight.net/seed-oil/. A friend has spent the last three years hounding me about seed oils. Every time I thought I was…

1 year, 11 months ago

Why Would Belief-States Have A Fractal Structure, And Why Would That Matter For Interpretability? An Explainer

Yesterday Adam Shai put up a cool post which… well, take a look at the visual:

Yup, it sure looks like that fractal is very noisily embedded in the re…

1 year, 11 months ago
