Podcast Episodes
"Against Almost Every Theory of Impact of Interpretability" by Charbel-Raphaël
I gave a talk about the different risk models, followed by an interpretability presentation, then I got a problematic question, "I don't understand, …
2 years, 6 months ago
"Inflection.ai is a major AGI lab" by Nikola
Inflection.ai (co-founded by DeepMind co-founder Mustafa Suleyman) should be perceived as a frontier LLM lab of similar magnitude as Meta, OpenAI, De…
2 years, 6 months ago
"Feedbackloop-first Rationality" by Raemon
I've been workshopping a new rationality training paradigm. (By "rationality training paradigm", I mean an approach to learning/teaching the skill of…
2 years, 6 months ago
"When can we trust model evaluations?" bu evhub
In "Towards understanding-based safety evaluations," I discussed why I think evaluating specifically the alignment of models is likely to require mec…
2 years, 6 months ago
"Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research" by evhub, Nicholas Schiefer, Carson Denison, Ethan Perez
TL;DR: This document lays out the case for research on “model organisms of misalignment” – in vitro demonstrations of the kinds of failures that migh…
2 years, 6 months ago
"My current LK99 questions" by Eliezer Yudkowsky
So this morning I thought to myself, "Okay, now I will actually try to study the LK99 question, instead of betting based on nontechnical priors and m…
2 years, 6 months ago
"The "public debate" about AI is confusing for the general public and for policymakers because it is a three-sided debate" by Adam David Long
Summary of Argument: The public debate among AI experts is confusing because there are, to a first approximation, three sides, not two sides to the d…
2 years, 6 months ago
"ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks" by Beth Barnes
Blogpost version. Paper. We have just released our first public report. It introduces methodology for assessing the capacity of LLM agents to acquire res…
2 years, 6 months ago
"Thoughts on sharing information about language model capabilities" by paulfchristiano
I believe that sharing information about the capabilities and limits of existing ML systems, and especially language model agents, significantly redu…
2 years, 6 months ago
"Yes, It's Subjective, But Why All The Crabs?" by johnswentworth
Some early biologist, equipped with knowledge of evolution but not much else, might see all these crabs and expect a common ancestral lineage. That’s…
2 years, 6 months ago