Episode Details
Neural Networks: The 200-Year-Old Math Behind the AI Revolution
Description
What happens when you build a machine to find the best software engineers in the country and it secretly teaches itself to reject anyone whose resume contains the word "woman"? That actually happened at Amazon in 2018. The machine wasn't programmed to discriminate. It was just ruthlessly executing a mathematical equation trained on a decade of biased hiring data.
This episode strips the mystique from artificial intelligence by tracing neural networks back to their true origins, which turn out to be far older than Silicon Valley. The foundational math, linear regression and the method of least squares, dates to Carl Friedrich Gauss in 1795, who used it to predict planetary movement. The first conceptual neural network model arrived in 1943 from McCulloch and Pitts, followed by Frank Rosenblatt's perceptron in 1958, funded by the U.S. Navy and hailed as the dawn of machine intelligence. Then came the crash. In 1969, Minsky and Papert proved mathematically that these single-layer networks could only solve problems whose answers can be separated by a single straight line through the data, a limitation exposed by a simple logic puzzle called XOR, whose four cases sit on opposite diagonal corners of a square and so cannot be split by any one line. Funding vanished, and the field entered what became known as the AI winter.
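The XOR limitation is easy to see in code. Below is a minimal sketch of a Rosenblatt-style single neuron trained with the classic perceptron learning rule (this is an illustration written for this summary, not code from the episode): it converges on AND, which one straight line can separate, but never finds a zero-error setting for XOR.

```python
# A single linear threshold unit trained with the perceptron learning rule.
# It learns AND (linearly separable) but can never learn XOR, the limitation
# Minsky and Papert proved in 1969.

def train_perceptron(samples, epochs=25, lr=0.1):
    """samples: list of ((x1, x2), label) pairs with 0/1 labels.
    Returns True if the neuron reaches zero errors (a separating line exists)."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), label in samples:
            pred = 1 if (w1 * x1 + w2 * x2 + b) > 0 else 0
            delta = label - pred              # perceptron learning rule
            if delta != 0:
                errors += 1
                w1 += lr * delta * x1
                w2 += lr * delta * x2
                b += lr * delta
        if errors == 0:                       # a perfect straight-line split was found
            return True
    return False                              # never converged within the budget

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND = [(x, int(x[0] and x[1])) for x in inputs]
XOR = [(x, int(x[0] != x[1])) for x in inputs]

print(train_perceptron(AND))   # True: AND is linearly separable
print(train_perceptron(XOR))   # False: no single line separates XOR
```

Adding a hidden layer removes the limitation, which is exactly what the backpropagation era later exploited.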
The resurrection came through backpropagation, an algorithm that traces errors backward through a network and adjusts its internal weights using the chain rule from calculus, a piece of math Leibniz derived in 1673. The episode uses a vivid recipe analogy: the network makes soup, tastes the terrible result, then uses calculus to determine exactly how much to reduce the salt and increase the garlic for the next batch. That learning loop, scaled up by a millionfold increase in computing power between 1991 and 2015 (driven largely by GPUs originally designed for video games), is what produced the deep learning explosion. The 2017 "Attention Is All You Need" paper introduced the transformer architecture, the T in GPT, which lets networks weigh the contextual importance of every word in a sentence against every other word simultaneously.
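The taste-and-adjust loop in the recipe analogy is gradient descent: the chain rule tells the network how much each "ingredient" contributed to the error, and each weight is nudged downhill accordingly. A minimal sketch with made-up numbers (the targets, starting amounts, and learning rate are all illustrative, not from the episode):

```python
# Gradient descent on a toy "recipe": two weights (salt, garlic) feed a
# squared-error loss, and the chain rule gives the adjustment for each.

IDEAL_SALT, IDEAL_GARLIC = 1.0, 3.0   # the (made-up) perfect seasoning

def loss(salt, garlic):
    # How far this batch is from the ideal, as a squared error.
    return (salt - IDEAL_SALT) ** 2 + (garlic - IDEAL_GARLIC) ** 2

def gradients(salt, garlic):
    # Chain rule: d/dw (w - target)^2 = 2 * (w - target).
    return 2 * (salt - IDEAL_SALT), 2 * (garlic - IDEAL_GARLIC)

salt, garlic = 5.0, 0.5   # the first, terrible batch
lr = 0.1                  # how boldly to adjust the recipe each time

for _ in range(50):       # taste, assign blame, adjust, repeat
    g_salt, g_garlic = gradients(salt, garlic)
    salt -= lr * g_salt
    garlic -= lr * g_garlic

print(round(salt, 2), round(garlic, 2))   # → 1.0 3.0, the ideal amounts
```

Backpropagation is this same idea applied through many stacked layers at once, with the chain rule carrying the error signal backward from the output to every weight in the network.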
But the episode doesn't let the technology off the hook. It digs into the black box problem, the uncomfortable reality that no one can fully explain why a deep network reaches a particular decision. It explores dataset bias through the Amazon case, concept drift (when the real world evolves but the training data stays frozen), and the philosophical debate between mathematician Alexander Dewdney, who called neural networks "lazy science," and technology writer Roger Bridgman, who countered that if the opaque table of numbers can safely steer a car, the engineering triumph speaks for itself. The conversation closes with a striking irony: researchers are now inventing entirely new fields of science just to observe and understand the machines they themselves built.
Source credit: Research for this episode included Wikipedia articles and transcript materials accessed 4/7/2026. Wikipedia text is licensed under CC BY-SA 4.0; content here is summarized/adapted in original wording for commentary and educational use.