📆 ThursdAI - August 1st - Meta SAM 2 for video, Gemini 1.5 is king now?, GPT-4o Voice is here (for some), new Stability, Apple Intelligence also here & more AI news

Published 1 year, 8 months ago
Description

Starting Monday, Apple released iOS 18.1 with Apple Intelligence, then Meta dropped SAM-2 (Segment Anything Model), and then Google first open sourced Gemma 2 2B and now (literally just 2 hours ago, during the live show) released Gemini 1.5 0801 experimental, which takes #1 on LMSYS arena across multiple categories. To top it all off, we also got a new SOTA image diffusion model called FLUX.1 from ex-Stability folks and their new Black Forest Labs.

This week on the show, we had Joseph Nelson & Piotr Skalski from Roboflow talk in depth about Segment Anything, and as the absolute experts on this topic (Skalski is our returning vision expert), they gave us an incredible deep dive into the importance of dedicated vision models (not VLMs).

We also had Lukas Atkins & Fernando Neto from Arcee AI talk to us about their new DistillKit and explain model distillation in detail, and finally we had Cristiano Giardina, who is one of the lucky few that got access to OpenAI advanced voice mode. His new friend GPT-4o came on the show as well!

Honestly, how can one keep up with all this? By reading ThursdAI of course, that's how! But ⚠️ buckle up, this is going to be a BIG one (I think over 4.5K words, will mark this as the longest newsletter I've penned, I'm sorry, maybe read this one on 2x? 😂)

[ Chapters ]

00:00 Introduction to the Hosts and Their Work

01:22 Special Guests Introduction: Piotr Skalski and Joseph Nelson

04:12 Segment Anything 2: Overview and Capabilities

15:33 Deep Dive: Applications and Technical Details of SAM2

19:47 Combining SAM2 with Other Models

36:16 Open Source AI: Importance and Future Directions

39:59 Introduction to Distillation and DistillKit

41:19 Introduction to DistillKit and Synthetic Data

41:41 Distillation Techniques and Benefits

44:10 Introducing Fernando and Distillation Basics

44:49 Deep Dive into Distillation Process

50:37 Open Source Contributions and Community Involvement

52:04 ThursdAI Show Introduction and This Week's Buzz

53:12 Weights & Biases New Course and San Francisco Meetup

55:17 OpenAI's Advanced Voice Mode and Cristiano's Experience

01:08:04 SearchGPT Release and Comparison with Perplexity

01:11:37 Apple Intelligence Release and On-Device AI Capabilities

01:22:30 Apple Intelligence and Local AI

01:22:44 Breaking News: Black Forest Labs Emerges

01:24:00 Exploring the New Flux Models

01:25:54 Open Source Diffusion Models

01:30:50 LLM Course and Free Resources

01:32:26 FastHTML and Python Development

01:33:26 Friend.com: Always-On Listening Device

01:41:16 Google Gemini 1.5 Pro Takes the Lead

01:48:45 GitHub Models: A New Era

01:50:01 Concluding Thoughts and Farewell

Show Notes & Links

* Open Source LLMs

* Meta gives SAM-2 - segment anything with one shot + video capability! (X, Blog, DEMO)

* Google open sources Gemma 2 2.6B (Blog,
