
Is GPT-5.1 Really an Upgrade? But Models Can Auto-Hack Govts, so … there’s that


Season 4 Episode 2


A lot was released in the last 36 hours, and it will all affect hundreds of millions of people. Here are 10 details you would miss if you only read the headlines, from GPT-5.1 regressions, to how Claude hacked government agencies, to SIMA 2 and musical Turing tests.

https://assemblyai.com/aiexplained

Chapters:
00:00 - Introduction

00:56 - GPT 5.1 Smarter?

01:47 - Some Regressions

03:22 - Sycophancy?

05:22 - Claude Auto-Hacking 

06:16 - Jailbreaking through Granularity

08:22 - This Will be Re-used

09:30 - Hallucinating Hacker

09:57 - Surprisingly Neutral Tone

12:18 - SIMA 2

14:10 - Alpha Parallels

17:24 - AI Music



GPT 5.1 Announcement: https://openai.com/index/gpt-5-1/

System Card: https://cdn.openai.com/pdf/4173ec8d-1229-47db-96de-06d87147e07e/5_1_system_card.pdf

Benchmarks: https://openai.com/index/gpt-5-1-for-developers/

Simple Bench: https://lmcouncil.ai/benchmarks


Auto-Hacking: https://x.com/AnthropicAI/status/1989033793190277618

https://www.anthropic.com/news/disrupting-AI-espionage

Report: https://assets.anthropic.com/m/ec212e6566a0d47/original/Disrupting-the-first-reported-AI-orchestrated-cyber-espionage-campaign.pdf



SIMA 2 Announcement: https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/

https://x.com/amoufarek/status/1988986075331858693

Scepticism: https://www.technologyreview.com/2025/11/13/1127921/google-deepmind-is-using-gemini-to-train-agents-inside-goat-simulator-3/

Voyager: https://voyager.minedojo.org/


Reuters Music: https://www.reuters.com/legal/litigation/are-you-listening-bots-survey-shows-ai-music-is-virtually-undetectable-2025-11-12/



Published 5 days, 1 hour ago





