Podcast Episodes
Yume-1.5: A Text-Controlled Interactive World Generation Model
Episode 1539
🤗 Upvotes: 50 | cs.CV
Authors:
Xiaofeng Mao, Zhen Li, Chuanhao Li, Xiaojie Xu, Kaining Ying, Tong He, Jiangmiao …
2 months, 2 weeks ago
SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents
Episode 1538
🤗 Upvotes: 33 | cs.CL, cs.AI, cs.CV, cs.LG, cs.MA
Authors:
Shaofei Cai, Yulei Qin, Haojia Lin, Zihan Xu, Gang Li…
2 months, 2 weeks ago
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
Episode 1537
🤗 Upvotes: 32 | cs.CV
Authors:
Shaocong Xu, Songlin Wei, Qizhe Wei, Zheng Geng, Hong Li, Licheng Shen, Qianpu Su…
2 months, 2 weeks ago
Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion
Episode 1536
🤗 Upvotes: 30 | cs.CV
Authors:
Hau-Shiang Shiu, Chin-Yang Lin, Zhixiang Wang, Chi-Wei Hsiao, Po-Fan Yu, Yu-Chih …
2 months, 2 weeks ago
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone
Episode 1535
🤗 Upvotes: 28 | cs.CV, cs.CL
Authors:
Jiacheng Ye, Shansan Gong, Jiahui Gao, Junming Fan, Shuang Wu, Wei Bi, Hao…
2 months, 2 weeks ago
SpotEdit: Selective Region Editing in Diffusion Transformers
Episode 1534
🤗 Upvotes: 27 | cs.CV, cs.AI
Authors:
Zhibin Qin, Zhenxiong Tan, Zeqing Wang, Songhua Liu, Xinchao Wang
2 months, 2 weeks ago
GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models
Episode 1533
🤗 Upvotes: 21 | cs.CV
Authors:
Bozhou Li, Sihan Yang, Yushuo Guan, Ruichuan An, Xinlong Chen, Yang Shi, Pengfei …
2 months, 2 weeks ago
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion
Episode 1532
🤗 Upvotes: 74 | cs.CV, cs.AI
Authors:
Hoiyeong Jin, Hyojin Jang, Jeongho Kim, Junha Hyung, Kinam Kim, Dongjin Ki…
2 months, 2 weeks ago
Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding
Episode 1531
🤗 Upvotes: 70 | cs.CL
Authors:
Yuqing Li, Jiangnan Li, Zheng Lin, Ziyan Zhou, Junjie Wu, Weiping Wang, Jie Zhou,…
2 months, 2 weeks ago
MAI-UI Technical Report: Real-World Centric Foundation GUI Agents
Episode 1530
🤗 Upvotes: 21 | cs.CV
Authors:
Hanzhang Zhou, Xu Zhang, Panrong Tong, Jianan Zhang, Liangyu Chen, Quyu Kong, Che…
2 months, 2 weeks ago