Podcast Episodes
"Cohabitive Games so Far" by mako yass
A cohabitive game[1] is a partially cooperative, partially competitive multiplayer game that provides an anarchic dojo for development in applied coo…
2 years, 6 months ago
"Announcing MIRI’s new CEO and leadership team" by Gretta Duleba
In 2023, MIRI has shifted focus in the direction of broad public communication—see, for example, our recent TED talk, our piece in TIME magazine “Pau…
2 years, 6 months ago
"Comparing Anthropic's Dictionary Learning to Ours" by Robert_AIZI
Readers may have noticed many similarities between Anthropic's recent publication Towards Monosemanticity: Decomposing Language Models With Dictionar…
2 years, 6 months ago
"Towards Monosemanticity: Decomposing Language Models With Dictionary Learning" by Zac Hatfield-Dodds
Neural networks are trained on data, not programmed to follow rules. We understand the math of the trained network exactly – each neuron in a neural …
2 years, 6 months ago
"Evaluating the historical value misspecification argument" by Matthew Barnett
ETA: I'm not saying that MIRI thought AIs wouldn't understand human values. If there's only one thing you take away from this post, please don't take…
2 years, 6 months ago
"Response to Quintin Pope’s Evolution Provides No Evidence For the Sharp Left Turn" by Zvi
Response to: Evolution Provides No Evidence For the Sharp Left Turn, due to it winning first prize in The Open Philanthropy Worldviews contest.
Quint…
2 years, 6 months ago
"Announcing Dialogues" by Ben Pace
As of today, everyone is able to create a new type of content on LessWrong: Dialogues.
In contrast with posts, which are for monologues, and comment s…
2 years, 6 months ago
"Thomas Kwa's MIRI research experience" by Thomas Kwa and others
Moderator note: the following is a dialogue using LessWrong’s new dialogue feature. The exchange is not completed: new replies might be added continu…
2 years, 6 months ago
"EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem" by Elizabeth
Effective altruism prides itself on truthseeking. That pride is justified in the sense that EA is better at truthseeking than most members of its ref…
2 years, 6 months ago
"How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions" by Jan Brauner et al.
Large language models (LLMs) can "lie", which we define as outputting false statements despite "knowing" the truth in a demonstrable sense. LLMs migh…
2 years, 6 months ago