Episode Details

“Opus 4.7 Part 1: The Model Card” by Zvi

Published 1 day, 6 hours ago

Description

Less than a week after completing coverage of Claude Mythos, here we are again as Anthropic gives us Claude Opus 4.7.

So here we are, with another 232 pages of light reading.

This post covers the first six sections of the Model Card.

It excludes section seven, model welfare, because there are concerns this time around that need to be expanded into their own post.

The reason model welfare and related topics get their own post this time around is that some things clearly went seriously wrong on that front, in ways they haven’t gone wrong in previous Claude models. Tomorrow's post is in large part an investigation of that, as best I can from this position, including various hypotheses for what happened.

This post also excludes section eight, capabilities, which will be included in the capabilities and reactions post as per usual.

Consider this the calm before the storm.

Since I likely won’t get to capabilities until Wednesday, for those experiencing first contact with Opus 4.7, a few quick tips:

Turning off ‘adaptive thinking’ means no thinking, period. Terrible UI. So make sure to keep this on. If you [...]

---

Outline:

(02:28) Here We Go Again: Executive Summary

(03:29) Introduction (1)

(03:56) RSP Evaluations (2)

(04:49) Meanwhile Back With Claude Mythos

(09:15) Economic Capability Index (2.3.7)

(10:00) Alignment Risk (2.4)

(11:55) Cyber (3)

(13:25) Safeguards and Harmlessness (4)

(19:36) Agentic Safety (5)

(21:32) Alignment (6)

(27:51) Decision Theory (6.3.6)

(31:11) System Prompt Changes

(31:39) Mandatory Pliny Jailbreak

(32:07) Onward To Model Welfare and Capabilities

---

First published:
April 20th, 2026

Source:
https://www.lesswrong.com/posts/pfJWdoLxWPzF8tpbp/opus-4-7-part-1-the-model-card

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Portrait of a figure composed of text, mathematical formulas, and musical notation.

Bar graphs showing automated evaluation results for CB-1 threat model across different tasks and models.
