Episode Details

“Sanity-checking “Incompressible Knowledge Probes”” by Sturb, LawrenceC

Published 1 month ago

Description

Or, did a chief scientist of an AI assistant startup conclusively show that GPT-5.5 has 9.7 trillion parameters?

Introduction

Recently, a paper circulated on Twitter claiming to have reverse engineered the parameter counts of many frontier closed-source models, including the newer GPT-5.5 (9.7 trillion parameters) and Claude Opus 4.7 (4.0 trillion parameters) as well as older models such as o1 (3.5T) and gpt-4o (720B). The paper, titled “Incompressible Knowledge Probes: Estimating Black-Box LLM Parameter Counts via Factual Capacity”, introduces a dataset of factual-knowledge questions of varying difficulty, regresses performance on this dataset against parameter count, and then uses this regression to extrapolate from the performance of closed-source frontier models to their parameter counts. A notable fact about this paper is that, unlike most empirical machine learning papers, it is single-authored: Bojie Li, the chief scientist of Pine AI, is the sole author.
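To make the regress-then-extrapolate idea concrete, here is a minimal sketch. It assumes the score is fit as a straight line in log10(parameter count), which may not match the paper's actual regression. The data points, function names, and the 0.65 score are all hypothetical and are not taken from the IKP repository.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_params(score, slope, intercept):
    """Invert the fitted line to map a benchmark score to a parameter count."""
    return 10 ** ((score - intercept) / slope)

# Hypothetical open-weight models: (known parameter count, benchmark score).
# These numbers are invented for illustration only.
open_models = [(1e9, 0.05), (1e10, 0.20), (1e11, 0.35), (1e12, 0.50)]

log_params = [math.log10(p) for p, _ in open_models]
scores = [s for _, s in open_models]
slope, intercept = fit_line(log_params, scores)

# Hypothetical closed model whose score lies above every calibration point.
closed_estimate = estimate_params(0.65, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, estimate={closed_estimate:.3g}")
```

Because the line is inverted through an exponential, a small change in the measured score moves the estimate multiplicatively. In this toy fit, a score shift of 0.15 corresponds to a tenfold change in the estimated parameter count.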

These results were suspicious for many reasons, the primary one being that the paper reads like low-effort, hastily written AI slop. For example, the codebase (https://github.com/19PINE-AI/ikp) was constructed in large part with Claude Code and has many of the hallmarks of code that is almost entirely vibe-coded with little sanity checking (e.g. redundant and inconsistent variable definitions[1] [...]

---

Outline:

(00:19) Introduction

(04:19) Summary of Li's Incompressible Knowledge Probes

(08:04) The IKP dataset.

(11:59) IKP scoring and Regression Methodology

(16:54) Methodological Issues with the IKP paper

(17:24) Per-tier floors to the scoring

(19:27) Ambiguous/incorrect answers to hard questions

(23:21) Corrected model parameter estimates

(24:17) Possible methodological issues that mattered less than we thought

(24:22) Thinking vs non-thinking

(25:59) Different accuracy metrics used in some repository json files

(26:31) Conclusion

(29:46) Discussion

The original text contained 17 footnotes which were omitted from this narration.

---

First published:
May 1st, 2026

Source:
https://www.lesswrong.com/posts/veFMEzDDyWaer2Sms/sanity-checking-incompressible-knowledge-probes

---

Narrated by TYPE III AUDIO.
