📆 ThursdAI - Jan 9th - NVIDIA's Tiny Supercomputer, Phi-4 is back, Kokoro TTS & Moondream gaze, ByteDance SOTA lip sync & more AI news

Published 1 year, 2 months ago
Description

Hey everyone, Alex here 👋

This week's ThursdAI was a whirlwind of announcements, from Microsoft finally dropping Phi-4's official weights on Hugging Face (a month late, but who's counting?) to Sam Altman casually mentioning that OpenAI's got AGI in the bag and is now setting its sights on superintelligence. Oh, and NVIDIA? They're casually releasing a $3,000 supercomputer that can run 200B parameter models on your desktop. No big deal.

We had some amazing guests this week too, with Oliver joining us to talk about a new foundation model in genomics and biosurveillance (yes, you read that right - think wastewater and pandemic monitoring!), and then, we've got some breaking news! Vik returned to the show with a brand new Moondream release that can do some pretty wild things. Ever wanted an AI to tell you where someone's looking in a photo? Now you can, thanks to a tiny model that runs on edge devices. 🤯

So buckle up, folks, because we've got a ton to cover. Let's dive into the juicy details of this week's AI madness, starting with open source.

03:10 TL;DR

03:10 Deep Dive into Open Source LLMs

10:58 MetaGene: A New Frontier in AI

20:21 Phi-4: The Latest in Open Source AI

27:46 rStar-Math: Revolutionizing Small LLMs

34:02 Big Companies and AI Innovations

42:25 NVIDIA's Groundbreaking Announcements

43:49 AI Hardware: Building and Comparing Systems

46:06 NVIDIA's New AI Models: Llama Nemotron

47:57 Breaking News: Moondream's Latest Release

50:19 Moondream's Journey and Capabilities

58:41 Weights & Biases: New Evals Course

01:08:29 NVIDIA's World Foundation Models

01:08:29 ByteDance's LatentSync: State-of-the-Art Lip Sync

01:12:54 Kokoro TTS: High-Quality Text-to-Speech

As always, TL;DR section with links and show notes below 👇

Open Source AI & LLMs

Phi-4: Microsoft's "Small" Model Finally Gets its Official Hugging Face Debut

Finally, after a month, we're getting Phi-4 14B on Hugging Face. Until now we've only had bootlegged copies, but Microsoft has finally uploaded it officially. Not only is it now official, it's also officially MIT licensed, which is great!

So, what's the big deal? Well, besides the licensing, it's a 14B parameter, dense decoder-only Transformer with a 16K token context length, trained on a whopping 9.8 trillion tokens. It scored 80.4 on math and 80.6 on MMLU, making it about 10% better than its predecessor, Phi-3, and better than Qwen 2.5's 79.

What's interesting about Phi-4 is that the training data consisted of 40% synthetic data (almost half!).

The vibes are always interesting with Phi models, so we'll keep an eye out. Also notable: the base models weren't released due to "safety issues", and this model was trained for single-turn use cases rather than multi-turn chat applications.

MetaGene-1: AI for Pandemic Monitoring and Pathogen Detection

Now, this one's a bit different. We usually talk about LLMs in this section, but this is more about the "open source" than the "LLM." Prime Intellect, along with folks from USC, released MetaGene-1, a metagenomic foundation model. That's a mouthful, right? Thankfully, we had Oliver Liu, a PhD student at USC, and an author on this paper, join us to explain.

Oliver clarified that the goal is to use AI for "biosurveillance, pandemic monitoring, and pathogen detection." They trained a 7B parameter model on 1.5 trillion base pairs of DNA and RNA sequences from wastewater, creating a model surprisingly capable of zero-shot embedding. Oliver pointed out that while using genomics to pretrain foundation models is not new, MetaGene-1 is, "in its current state, the largest model out there" and is "one of the few decoder only models that are be
