Podcast Episodes

[Linkpost] “Identifying ‘Deception Vectors’ In Models” by Stephen Martin

This is a link post. Using representation engineering, we systematically induce, detect, and control such deception in CoT-enabled LLMs, extracting ”…

10 months, 1 week ago

“The Unparalleled Awesomeness of Effective Altruism Conferences” by omnizoid


Crosspost from my blog.

I just got back from Effective Altruism Global London—a conference that brought together lots of different people trying t…

10 months, 1 week ago

“The True Goal Fallacy” by adamShimi

As I ease out into a short sabbatical, I find myself turning back to dig the seeds of my repeated cycle of exhaustion and burnout in the last few ye…

10 months, 1 week ago

“AI companies’ eval reports mostly don’t support their claims” by Zach Stein-Perlman

AI companies claim that their models are safe on the basis of dangerous capability evaluations. OpenAI, Google DeepMind, and Anthropic publish report…

10 months, 1 week ago

“Against asking if AIs are conscious” by AlexMennen

People sometimes wonder whether certain AIs or animals are conscious/sentient/sapient/have qualia/etc. I don't think that such questions are coheren…

10 months, 1 week ago

“Season Recap of the Village: Agents raise $2,000” by Shoshannah Tekofsky

Four agents woke up with four computers, a view of the world wide web, and a shared chat room full of humans. Like Claude plays Pokemon, you can wat…

10 months, 1 week ago

“The Best Reference Works for Every Subject” by Parker Conley

Introduction

The Best Textbooks on Every Subject is the Schelling point for the best textbooks on every subject. My The Best Tacit Knowledge Videos …

10 months, 1 week ago

“‘Flaky breakthroughs’ pervade coaching — and no one tracks them” by Chipmonk

Has someone you know ever had a “breakthrough” from coaching, meditation, or psychedelics — only to later have it fade?



For example, man…

10 months, 1 week ago

“The Value Proposition of Romantic Relationships” by johnswentworth

What's the main value proposition of romantic relationships?

Now, look, I know that when people drop that kind of question, they’re often about to p…

10 months, 1 week ago

“It’s hard to make scheming evals look realistic” by Igor Ivanov, dan_moken

Abstract

Claude 3.7 Sonnet easily detects when it's being evaluated for scheming. Surface‑level edits to evaluation scenarios, such as lengthening t…

10 months, 2 weeks ago

