Episode Details

📆 ThursdAI - Spooky Halloween edition with Video!

Published 1 year, 5 months ago

Description

Hey everyone, Happy Halloween! Alex here, coming to you live from my mad scientist lair! For the first-ever live video stream of ThursdAI, I dressed up as a mad scientist and had my co-host, Fester the AI-powered skeleton, join me (as well as my usual co-hosts, haha) in a very energetic and hopefully entertaining video stream!

Since it's Halloween today, Fester (and I) have a very busy schedule, so there's no super-lengthy ThursdAI newsletter today. We're still not in the realm of Gemini being able to write a decent draft that takes everything we talked about and covers all the breaking news, so I'm afraid I will have to wish you a Happy Halloween and ask that you watch/listen to the episode.

The TL;DR and show links from today don't cover all the breaking news, but the major things we saw today (and caught live on the show as Breaking News) were that ChatGPT now has search and Gemini has grounded search as well (it seems OpenAI's streak of releasing something right before Google announces it continues).

Here's a quick trailer of the major things that happened:

This week's buzz - Halloween AI toy with Weave

In this week's buzz, my long-awaited Halloween project is finally live and operational!

I've posted a public Weave dashboard here and the code (that you can run on your Mac!) here

Really looking forward to seeing all the amazing costumes the kiddos come up with and how Gemini will be able to respond to them. Follow along!


OK, and finally, my raw TL;DR notes and links for this week. Happy Halloween everyone, I'm running off to spook the kiddos (and of course record and post about it!)

ThursdAI - Oct 31 - TL;DR

TL;DR of all topics covered:

* Open Source LLMs:
  * Microsoft's OmniParser: SOTA UI parsing (MIT Licensed) 𝕏
    * Groundbreaking model for web automation (MIT license).
    * State-of-the-art UI parsing and understanding.
    * Outperforms GPT-4V in parsing web UI.
    * Designed for web automation tasks.
    * Can be integrated into various development workflows.
  * ZhipuAI's GLM-4-Voice: End-to-end Chinese/English speech 𝕏
    * End-to-end voice model for Chinese and English speech.
    * Open-sourced and readily available.
    * Focuses on direct speech understanding and generation.
    * Potential applications in various speech-related tasks.
  * Meta releases LongVU: Video LM for long videos 𝕏
    * Handles long videos with impressive performance.
    * Uses DINOv2 for downsampling, eliminating redundant scenes.
    * Fuses features using DINOv2 and SigLIP.
    * Select tokens are passed to Qwen2/Llama-3.2-3B.
    * Demo and model are available on Hugging Face.
    * Potential for significant advancements in video understanding.
  * OpenAI's new factuality benchmark (Blog, GitHub)
    * Introducing SimpleQA: new factuality benchmark
    * Goal: high correctness, diversity, challenging for frontier models
    * Question Curation: AI trainers, verified by second trainer
    * Quality Assurance: 3% inherent error rate
    * Topic Diversity: wide range of topics
    * Grading Methodology: "correct", "incorrect", "not attempted"
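
The three SimpleQA grade labels listed above ("correct", "incorrect", "not attempted") are easy to tally. Below is a minimal Python sketch of how a list of such grades could be summarized. The function name `summarize_grades`, the sample data, and the "correct given attempted" ratio are illustrative assumptions, not OpenAI's actual evaluation code.

```python
from collections import Counter

# The three grade labels named in the SimpleQA notes above.
GRADES = ("correct", "incorrect", "not attempted")


def summarize_grades(grades):
    """Return the share of each grade plus accuracy on attempted questions.

    Hypothetical helper for illustration only; the metric names are assumptions.
    """
    counts = Counter(grades)
    unknown = set(counts) - set(GRADES)
    if unknown:
        raise ValueError(f"unexpected grade labels: {sorted(unknown)}")
    total = len(grades)
    if total == 0:
        raise ValueError("no grades to summarize")
    # A question counts as attempted if the model gave any answer at all.
    attempted = counts["correct"] + counts["incorrect"]
    return {
        "correct": counts["correct"] / total,
        "incorrect": counts["incorrect"] / total,
        "not_attempted": counts["not attempted"] / total,
        "correct_given_attempted": (
            counts["correct"] / attempted if attempted else 0.0
        ),
    }


# Made-up grades for four questions, purely for demonstration.
example = ["correct", "incorrect", "not attempted", "correct"]
summary = summarize_grades(example)
print(summary)
```

Keeping "not attempted" separate from "incorrect" lets a model that declines to answer be told apart from one that answers wrongly, which is presumably why the benchmark uses three labels rather than two.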
