Podcast Episodes

"Against Almost Every Theory of Impact of Interpretability" by Charbel-Raphaël

I gave a talk about the different risk models, followed by an interpretability presentation, then I got a problematic question, "I don't understand, …

2 years, 6 months ago

"Inflection.ai is a major AGI lab" by Nikola

Inflection.ai (co-founded by DeepMind co-founder Mustafa Suleyman) should be perceived as a frontier LLM lab of similar magnitude as Meta, OpenAI, De…

2 years, 6 months ago

"Feedbackloop-first Rationality" by Raemon

I've been workshopping a new rationality training paradigm. (By "rationality training paradigm", I mean an approach to learning/teaching the skill of…

2 years, 6 months ago

"When can we trust model evaluations?" by evhub

In "Towards understanding-based safety evaluations," I discussed why I think evaluating specifically the alignment of models is likely to require mec…

2 years, 6 months ago

"Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research" by evhub, Nicholas Schiefer, Carson Denison, Ethan Perez

TL;DR: This document lays out the case for research on “model organisms of misalignment” – in vitro demonstrations of the kinds of failures that migh…

2 years, 6 months ago

"My current LK99 questions" by Eliezer Yudkowsky

So this morning I thought to myself, "Okay, now I will actually try to study the LK99 question, instead of betting based on nontechnical priors and m…

2 years, 6 months ago

"The "public debate" about AI is confusing for the general public and for policymakers because it is a three-sided debate" by Adam David Long

Summary of Argument: The public debate among AI experts is confusing because there are, to a first approximation, three sides, not two sides to the d…

2 years, 6 months ago

"ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks" by Beth Barnes

We have just released our first public report. It introduces methodology for assessing the capacity of LLM agents to acquire res…

2 years, 6 months ago

"Thoughts on sharing information about language model capabilities" by paulfchristiano

I believe that sharing information about the capabilities and limits of existing ML systems, and especially language model agents, significantly redu…

2 years, 6 months ago

"Yes, It's Subjective, But Why All The Crabs?" by johnswentworth

Some early biologist, equipped with knowledge of evolution but not much else, might see all these crabs and expect a common ancestral lineage. That’s…

2 years, 6 months ago

