“Maybe I was too harsh on deep learning theory (three days ago)” by LawrenceC

Published 1 month, 1 week ago
Description

A few days ago, I reviewed a paper titled “There Will Be a Scientific Theory of Deep Learning”. In my review, I expressed appreciation to the authors for writing the piece, but skepticism about the stronger forms of their titular claims.

Since then I’ve spoken with various past collaborators (via text and in person), and read or reread quite a few deep learning theory papers, including the bombshell Zhang et al. 2016 and Nagarajan et al. 2019 papers that I wrote about on LessWrong.

And the thing is, parts of the infinite-width/depth-limit work turned out to be much more interesting than I thought they were. Perhaps I have judged deep learning theory (a bit) too harshly.

A lot of my impression of the infinite-width and depth-limit work comes from the neural tangent kernel/neural network Gaussian process line of work. This line of work starts from Radford Neal's 1994 paper, where he noted that an infinitely wide single-hidden-layer neural network with random weights is a Gaussian process. In 2017/2018, this work was extended to deep neural networks; Lee et al. showed that a randomly initialized deep neural network was, if you took a certain type of infinite width [...]

---

First published:
April 29th, 2026

Source:
https://www.lesswrong.com/posts/6SRq7mZ97Dwuavwb6/maybe-i-was-too-harsh-on-deep-learning-theory-three-days-ago

---

Narrated by TYPE III AUDIO.
