“Mechanistic estimation for wide random MLPs” by Jacob_Hilton
Description
This post covers joint work with Wilson Wu, George Robinson, Mike Winer, Victor Lecomte and Paul Christiano. Thanks to Geoffrey Irving and Jess Riedel for comments on the post.
In ARC's latest paper, we study the following problem: given a randomly initialized multilayer perceptron (MLP), produce an estimate for the expected output of the model under Gaussian input. The usual approach to this problem is to sample many possible inputs, run them all through the model, and take the average. Instead, we produce an estimate "mechanistically", without running the model even once. For wide models, our approach produces more accurate estimates, both in theory and in practice.
Paper: Estimating the expected output of wide random MLPs more efficiently than sampling
Code: mlp_cumulant_propagation GitHub repo
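To make the contrast between sampling and a "mechanistic" estimate concrete, here is a minimal sketch. It is not the paper's method and not code from the repo above. It assumes ReLU activations, no biases, Gaussian weights with variance 1/fan_in, and standard normal inputs; all function names are hypothetical. The sampling baseline works at any depth, while the closed-form estimate shown here is exact only for a network with a single hidden layer, where each pre-activation is exactly Gaussian and E[ReLU(z)] = sigma / sqrt(2*pi) for z ~ N(0, sigma^2).

```python
import numpy as np


def init_mlp(widths, rng):
    """Random Gaussian weights with variance 1/fan_in (an assumed init scheme)."""
    return [
        rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_out, n_in))
        for n_in, n_out in zip(widths[:-1], widths[1:])
    ]


def forward(weights, x):
    """Run a batch x of shape (batch, n_in) through the MLP.

    ReLU on hidden layers, linear final layer, no biases (assumptions).
    """
    h = x
    for w in weights[:-1]:
        h = np.maximum(h @ w.T, 0.0)
    return h @ weights[-1].T


def sampling_estimate(weights, n_samples, rng):
    """Baseline: draw Gaussian inputs, run the model, average the outputs."""
    x = rng.standard_normal((n_samples, weights[0].shape[1]))
    return forward(weights, x).mean(axis=0)


def one_hidden_layer_exact(weights):
    """Expected output computed from the weights alone, without running the model.

    Valid only for exactly one hidden ReLU layer and x ~ N(0, I):
    pre-activation i is N(0, ||w1[i]||^2), so its expected ReLU is
    ||w1[i]|| / sqrt(2*pi); the final layer is linear, so expectations pass through.
    """
    w1, w2 = weights
    sigma = np.linalg.norm(w1, axis=1)
    return w2 @ (sigma / np.sqrt(2.0 * np.pi))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = init_mlp([8, 32, 1], rng)
    print("sampling:   ", sampling_estimate(weights, 100_000, rng))
    print("closed form:", one_hidden_layer_exact(weights))
```

With more than one hidden layer, the pre-activations are no longer exactly Gaussian, so this closed form stops being exact; handling deeper networks accurately is the harder problem the paper addresses.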
We are excited about this result as an early step towards our goal of producing mechanistic estimates that outperform random sampling for any trained neural network. Drawing an analogy between this goal and a proof by induction, we see this result as (part of) the "base case": handling networks at initialization. We have a vision for the "inductive step", although we expect that to be much more difficult.
Summary of results
[...]
---
Outline:
(01:29) Summary of results
(04:39) Significance of results
(07:18) Extending to trained networks
(08:36) Conclusion
The original text contained 18 footnotes which were omitted from this narration.
---
First published:
May 7th, 2026
Source:
https://www.lesswrong.com/posts/fsG4m6sRMpomd7Rk6/mechanistic-estimation-for-wide-random-mlps
---
Narrated by TYPE III AUDIO.