Podcast Episodes
"Cohabitive Games so Far" by mako yass
A cohabitive game[1] is a partially cooperative, partially competitive multiplayer game that provides an anarchic dojo for development in applied coo…
2 years, 6 months ago
"Announcing MIRI’s new CEO and leadership team" by Gretta Duleba
In 2023, MIRI has shifted focus in the direction of broad public communication—see, for example, our recent TED talk, our piece in TIME magazine “Pau…
2 years, 6 months ago
"Comparing Anthropic's Dictionary Learning to Ours" by Robert_AIZI
Readers may have noticed many similarities between Anthropic's recent publication Towards Monosemanticity: Decomposing Language Models With Dictionar…
2 years, 6 months ago
"Towards Monosemanticity: Decomposing Language Models With Dictionary Learning" by Zac Hatfield-Dodds
Neural networks are trained on data, not programmed to follow rules. We understand the math of the trained network exactly – each neuron in a neural …
2 years, 6 months ago
"Evaluating the historical value misspecification argument" by Matthew Barnett
ETA: I'm not saying that MIRI thought AIs wouldn't understand human values. If there's only one thing you take away from this post, please don't take…
2 years, 6 months ago
"Response to Quintin Pope’s Evolution Provides No Evidence For the Sharp Left Turn" by Zvi
Response to: Evolution Provides No Evidence For the Sharp Left Turn, due to it winning first prize in The Open Philanthropy Worldviews contest.
Quint…
2 years, 6 months ago
"Announcing Dialogues" by Ben Pace
As of today, everyone is able to create a new type of content on LessWrong: Dialogues.
In contrast with posts, which are for monologues, and comment s…
2 years, 6 months ago
"Thomas Kwa's MIRI research experience" by Thomas Kwa and others
Moderator note: the following is a dialogue using LessWrong’s new dialogue feature. The exchange is not completed: new replies might be added continu…
2 years, 6 months ago
"EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem" by Elizabeth
Effective altruism prides itself on truthseeking. That pride is justified in the sense that EA is better at truthseeking than most members of its ref…
2 years, 6 months ago
"How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions" by Jan Brauner et al.
Large language models (LLMs) can "lie", which we define as outputting false statements despite "knowing" the truth in a demonstrable sense. LLMs migh…
2 years, 6 months ago