Podcast Episodes
“AI #169: New Knowledge” by Zvi
Even in a relatively quiet period, AI is out there creating new knowledge. The new knowledge in question is OpenAI getting us the first truly impres…
2 weeks, 2 days ago
“Loss of Oversight: How AI Systems May Become Harder to Audit, Monitor, and Investigate” by Jordan Taylor, Max H, Ed Fage, Thomas Read, Joseph Bloom
Produced by UK AISI Model Transparency and Situational Awareness teams. If you’re a Research Scientist or Research Engineer, we’re hiring – apply he…
2 weeks, 2 days ago
“Why does off-model SFT degrade capabilities?” by SebastianP, Dylan Xu, Alek Westover, Julian Stastny, Vivek Hebbar
Off-model SFT (Supervised Fine-Tuning on outputs generated by a different model) might be an important method for controlling AI behavior. For insta…
2 weeks, 2 days ago
“Women should be able to open things” by KatjaGrace
I’m pretty annoyed today, for nominal reasons ranging between ‘petty’ and ‘doesn’t even make sense’. I’m not entirely sure how or if to take oneself s…
2 weeks, 2 days ago
“Toward Interoperability of Minimal Programs” by johnswentworth
Assumed background: Kolmogorov complexity and Solomonoff induction.
Suppose I have some data , and I go looking for the models (i.e. programs) which…
2 weeks, 3 days ago
“theory uplift differentially benefits safety & is massively underpriced” by Yudhister Kumar
[1] We will likely have near-superhuman mathematics AI by Q1 2027. [1]
[2] Qualitatively, AI mathematics capabilities are developing significantly…
2 weeks, 3 days ago
“Power-seeking agents will likely be developed” by Alec Harris
I am going to argue that we will likely eventually get AIs that are strongly power-seeking, much more so than current SOTA LLMs.[1]
TLDR
Right now S…
2 weeks, 3 days ago
“Synthetic Persona Pretraining: Alignment from Token Zero” by Julian Minder, Raghav Singhal, Viktor Moskvoretskii, Stefan Krsteski, ashtonanderson, rolandaydin, Robert West
Julian Minder, Viktor Moskvoretskii, Raghav Singhal,
Difan Jiao, Kartik Bali, Yiderigun Borjigin, Shaobo Cui, Stefan Krsteski,
Ashton Anderson, Rola…
2 weeks, 3 days ago
“If AI is normal technology, history is not reassuring.” by Davidmanheim
There's a truism that technology is good - even if it creates winners and losers, it improves the world. Toby Ord argues that the conclusions about …
2 weeks, 3 days ago
“Pythagorean addition” by kqr
TL;DR: Instead of laboriously computing , we can mentally calculate using the alpha-max plus beta-min algorithm, by estimating
and this will be ver…
2 weeks, 3 days ago