Episode Details

Melanie Mitchell: Straight Talk on A.I. Large Language Models

Published 2 years, 10 months ago

Description

Transcript with Links

Eric Topol (00:00):

This is Eric Topol, and I'm so excited to have the chance to speak to Melanie Mitchell. Melanie is the Davis Professor of Complexity at the Santa Fe Institute in New Mexico. And I look to her as not just one of the real leaders, but one with balance and thoughtfulness in the high-velocity AI world of large language models that we live in. And just by way of introduction, the way I first got to meet Professor Mitchell was through her book, Artificial Intelligence: A Guide for Thinking Humans. And it sure got me thinking back about four years ago. So welcome, Melanie.

Melanie Mitchell (00:41):

Thanks, Eric. It's great to be here.

The Lead Up to ChatGPT via Transformer Models

Eric Topol (00:43):

Yeah. There's so much to talk about, and you've been right in the middle of many of these things, so that's what makes it especially fun. I thought we'd start off with a little bit of history, because the world has kind of changed since we were both writing and publishing books about AI back in 2019. And in November, when ChatGPT got out there, it signaled there was this big thing called the transformer model. And I don't think many people really know the difference between a transformer model, which had been around for a while but maybe hadn't come to the surface, versus what were just the deep neural networks that ushered in deep learning, which you had so systematically addressed in your book.

Melanie Mitchell (01:29):

Right. Yeah. Transformers were kind of a new thing. I can't remember exactly when they came out, maybe 2018, something like that, from Google. They were an architecture that showed that you didn't really need to have a recurrent neural network in order to deal with language. So that was one of the earlier things: in Google Translate and other language processing systems, people were using recurrent neural networks, networks that sort of had feedback from one time step to the next. But now we have the transformers, which instead use what they call an attention mechanism, where the entire text that the system is dealing with is available all at once. And the name of the paper, in fact, was Attention Is All You Need. And by "attention is all you need" they meant this particular attention mechanism in the neural network, and that was really a revolution and enabled this new era of large language models.

Eric Topol (02:34):

Yeah. And as you aptly pointed out, that was five years ago.
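
Editor's note: the contrast drawn above, between a recurrent network that passes information from one time step to the next and an attention mechanism that sees the entire text at once, can be illustrated with a toy sketch. This is not code from the episode or from the paper. It is a minimal, simplified version of scaled dot-product self-attention that assumes each token's vector serves as its own query, key, and value (real transformers learn separate projections for each). The function names and the example vectors are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Toy self-attention over a list of equal-length token vectors.

    Assumption: queries, keys, and values are the input vectors
    themselves, with no learned projection matrices.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Every position is scored against every other position at once,
        # rather than being reached step by step as in a recurrent network.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The output is a weighted average of all the token vectors.
        output = [
            sum(w * value[i] for w, value in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


# Three made-up two-dimensional "token" vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how much one token attends to every token in the sequence, including itself. The point of the sketch is only that all positions are visible in a single step.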
