Episode Details
ThursdAI - Nov 21 - The fight for the LLM throne, OSS SOTA from AllenAI, Flux new tools, DeepSeek R1 reasoning & more AI news
Description
Hey folks, Alex here, and oof, what a 🔥🔥🔥 show we had today! I got to use my new breaking news button 3 times this show! And not only that, some of you may know that one of my absolute biggest pleasures as a host is to feature the folks who actually make the news on the show!
And now that we're in video format, you actually get to see who they are! So this week I was honored to welcome back our friend and co-host Junyang Lin, a Dev Lead from the Alibaba Qwen team, who came back after launching the incredible Qwen 2.5 Coder and Qwen 2.5 Turbo with 1M context.
We also had breaking news on the show that AI2 (Allen Institute for AI) has fully released SOTA Llama post-trained models, and I was very lucky to get the core contributor on the paper, Nathan Lambert, to join us live and tell us all about this amazing open source effort! You don't want to miss this conversation!
Lastly, we chatted with the CEO of StackBlitz, Eric Simons, about the absolutely incredible lightning-in-a-bottle success of their latest product, bolt.new, and how it opens up a new category of code generation tools.
00:00 Introduction and Welcome
00:58 Meet the Hosts and Guests
02:28 TL;DR Overview
03:21 TL;DR
04:10 Big Companies and APIs
07:47 Agent News and Announcements
08:05 Voice and Audio Updates
08:48 AR, Art, and Diffusion
11:02 Deep Dive into Mistral and Pixtral
29:28 Interview with Nathan Lambert from AI2
30:23 Live Reaction to Tulu 3 Release
30:50 Deep Dive into Tulu 3 Features
32:45 Open Source Commitment and Community Impact
33:13 Exploring the Released Artifacts
33:55 Detailed Breakdown of Datasets and Models
37:03 Motivation Behind Open Source
38:02 Q&A Session with the Community
38:52 Summarizing Key Insights and Future Directions
40:15 Discussion on Long Context Understanding
41:52 Closing Remarks and Acknowledgements
44:38 Transition to Big Companies and APIs
45:03 Weights & Biases: This Week's Buzz
01:02:50 Mistral's New Features and Upgrades
01:07:00 Introduction to DeepSeek and the Whale Giant
01:07:44 DeepSeek's Technological Achievements
01:08:02 Open Source Models and API Announcement
01:09:32 DeepSeek's Reasoning Capabilities
01:12:07 Scaling Laws and Future Predictions
01:14:13 Interview with Eric from Bolt
01:14:41 Breaking News: Gemini Experimental
01:17:26 Interview with Eric Simons - CEO @ Stackblitz
01:19:39 Live Demo of Bolt's Capabilities
01:36:17 Black Forest Labs AI Art Tools
01:40:45 Conclusion and Final Thoughts
As always, the show notes and TL;DR, with all the links I mentioned on the show and the full news roundup, are below the main news recap.
Google & OpenAI fighting for the LMArena crown
I wanted to open with this, as last week I reported that Gemini Exp 1114 had taken over #1 in the LMArena. In less than a week, we saw a new ChatGPT release, called GPT-4o-2024-11-20, reclaim the arena #1 spot!
Focusing specifically on creative writing, this new model, which is now deployed on chat.com and in the API, is definitely more creative according to many folks who've tried it, with OpenAI employees saying "expect qualitative improvements with more natural and engaging writing, thoroughness and readability", and indeed that's what my feed was reporting as well.
I also wanted to mention here that we've seen this happen once before: the last time Gemini peaked at the LMArena, it took less than a week for OpenAI to release