Podcast Episodes

“Self-fulfilling misalignment data might be poisoning our AI models” by TurnTrout

This is a link post. Your AI's training data might make it more “evil” and more able to circumvent your security, monitoring, and control measures. Ev…

1 year, 1 month ago

“Judgements: Merging Prediction & Evidence” by abramdemski

I recently wrote about complete feedback, an idea which I think is quite important for AI safety. However, my note was quite brief, explaining the id…

1 year, 1 month ago

“The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better” by Thane Ruthenis

First, let me quote my previous ancient post on the topic:

Effective Strategies for Changing Public Opinion

The titular paper is very relevant here. I'…

1 year, 1 month ago

“Power Lies Trembling: a three-book review” by Richard_Ngo

In a previous book review I described exclusive nightclubs as the particle colliders of sociology—places where you can reliably observe extreme force…

1 year, 1 month ago

“Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs” by Jan Betley, Owain_Evans

This is the abstract and introduction of our new paper. We show that finetuning state-of-the-art LLMs on a narrow task, such as writing vulnerable co…

1 year, 1 month ago

“The Paris AI Anti-Safety Summit” by Zvi

It doesn’t look good.

What used to be the AI Safety Summits were perhaps the most promising thing happening towards international coordination for AI …

1 year, 1 month ago

“Eliezer’s Lost Alignment Articles / The Arbital Sequence” by Ruby

Note: this is a static copy of this wiki page. We are also publishing it as a post to ensure visibility.

Circa 2015-2017, a lot of high quality conten…

1 year, 1 month ago

“Arbital has been imported to LessWrong” by RobertM, jimrandomh, Ben Pace, Ruby

Arbital was envisioned as a successor to Wikipedia. The project was discontinued in 2017, but not before many new features had been built and a subst…

1 year, 1 month ago

“How to Make Superbabies” by GeneSmith, kman

We’ve spent the better part of the last two decades unravelling exactly how the human genome works and which specific letter changes in our DNA affec…

1 year, 1 month ago

“A computational no-coincidence principle” by Eric Neyman

Audio note: this article contains 134 uses of LaTeX notation, so the narration may be difficult to follow. There's a link to the original text in t…

1 year, 1 month ago

