2024 in Vision [LS Live @ NeurIPS]
Description
Happy holidays! We’ll be sharing snippets from Latent Space LIVE! through the break, bringing you the best of 2024! We want to express our deepest appreciation to event sponsors AWS, Daylight Computer, Thoth.ai, StrongCompute, Notable Capital, and most of all, all our LS supporters who helped fund the gorgeous venue and A/V production!
For NeurIPS last year we did our standard conference podcast coverage, interviewing the authors of selected papers (which we have now also done for ICLR and ICML). However, we felt that we could be doing more to help AI Engineers 1) get more industry-relevant content, and 2) get a 2024 year-in-review recap from experts. As a result, we organized the first Latent Space LIVE!, our first in-person miniconference, at NeurIPS 2024 in Vancouver.
The single most requested domain was computer vision, and we could think of no one better to help us recap 2024 than our friends at Roboflow, who were among our earliest guests in 2023 and had one of our top episodes again in 2024. Roboflow has since raised a $40m Series B!
Links
Their slides are here:
All the trends and papers they picked:
* Sora (see our Video Diffusion pod) - extending diffusion from images to video
* SAM 2: Segment Anything in Images and Videos (see our SAM2 pod) - extending prompted masks to full video object segmentation
* DETR Dominancy: DETRs show Pareto improvement over YOLOs
  * RT-DETR: DETRs Beat YOLOs on Real-time Object Detection
  * LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection
  * D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement
* MMVP (Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs)