Podcast Episodes
“Inoculation prompting: Instructing models to misbehave at train-time can improve run-time behavior” by Sam Marks
This is a link post for two papers that came out today:
Inoculation Prompting: Eliciting traits from LLMs during training can suppress them at test…
6 months ago
“Hospitalization: A Review” by Logan Riggs
I woke up Friday morning w/ a very sore left shoulder. I tried stretching it, but my left chest hurt too. Isn't pain on one side a sign of a heart a…
6 months ago
“What, if not agency?” by abramdemski
Sahil has been up to things. Unfortunately, I've seen people put effort into trying to understand and still bounce off. I recently talked to someone…
6 months, 1 week ago
“The Origami Men” by Tomás B.
Of course, you must understand, I couldn't be bothered to act. I know weepers still pretend to try, but I wasn't a weeper, at least not then. It isn…
6 months, 1 week ago
“A non-review of ‘If Anyone Builds It, Everyone Dies’” by boazbarak
I was hoping to write a full review of "If Anyone Builds It, Everyone Dies" (IABIED, Yudkowsky and Soares) but realized I won't have time to do it. S…
6 months, 1 week ago
“Notes on fatalities from AI takeover” by ryan_greenblatt
Suppose misaligned AIs take over. What fraction of people will die? I'll discuss my thoughts on this question and my basic framework for thinking ab…
6 months, 1 week ago
“Nice-ish, smooth takeoff (with imperfect safeguards) probably kills most ‘classic humans’ in a few decades.” by Raemon
I wrote my recent Accelerando post to mostly stand on its own as a takeoff scenario. But, the reason it's on my mind is that, if I imagine being ve…
6 months, 1 week ago
“Omelas Is Perfectly Misread” by Tobias H
The Standard Reading
If you've heard of Le Guin's ‘The Ones Who Walk Away from Omelas’, you probably know the basic idea. It's a go-to story for disc…
6 months, 1 week ago
“Ethical Design Patterns” by AnnaSalamon
Related to: Commonsense Good, Creative Good (and my comment); Ethical Injunctions.
Epistemic status: I’m fairly sure “ethics” does useful work in bu…
6 months, 2 weeks ago
“You’re probably overestimating how well you understand Dunning-Kruger” by abstractapplic
I
The popular conception of Dunning-Kruger is something along the lines of “some people are too dumb to know they’re dumb, and end up thinking they’…
6 months, 2 weeks ago