“Opus 4.7 Part 1: The Model Card” by Zvi
Description
Less than a week after completing coverage of Claude Mythos, here we are again as Anthropic gives us Claude Opus 4.7.
So here we are, with another 232 pages of light reading.
This post covers the first six sections of the Model Card.
It excludes section seven, model welfare, because there are concerns this time around that need to be expanded into their own post.
The reason model welfare and related topics get their own post this time around is that some things clearly went seriously wrong on that front, in ways they haven’t gone wrong in previous Claude models. Tomorrow’s post is in large part an investigation of that, as best I can from this position, including various hypotheses for what happened.
This post also excludes section eight, capabilities, which will be included in the capabilities and reactions post as per usual.
Consider this the calm before the storm.
Since I likely won’t get to capabilities until Wednesday, for those experiencing first contact with Opus 4.7, a few quick tips:
- Turning off ‘adaptive thinking’ means no thinking, period. Terrible UI. So make sure to keep this on. If you [...]
---
Outline:
(02:28) Here We Go Again: Executive Summary
(03:29) Introduction (1)
(03:56) RSP Evaluations (2)
(04:49) Meanwhile Back With Claude Mythos
(09:15) Economic Capability Index (2.3.7)
(10:00) Alignment Risk (2.4)
(11:55) Cyber (3)
(13:25) Safeguards and Harmlessness (4)
(19:36) Agentic Safety (5)
(21:32) Alignment (6)
(27:51) Decision Theory (6.3.6)
(31:11) System Prompt Changes
(31:39) Mandatory Pliny Jailbreak
(32:07) Onward To Model Welfare and Capabilities
---
First published:
April 20th, 2026
Source:
https://www.lesswrong.com/posts/pfJWdoLxWPzF8tpbp/opus-4-7-part-1-the-model-card
---
Narrated by TYPE III AUDIO.