Podcast Episodes
Diffusion Transformers with Representation Autoencoders
Episode 1282
🤗 Upvotes: 93 | cs.CV, cs.LG
Authors:
Boyang Zheng, Nanye Ma, Shengbang Tong, Saining Xie
6 months, 3 weeks ago
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs
Episode 1281
🤗 Upvotes: 39 | cs.AI
Authors:
Caorui Li, Yu Chen, Yiyan Ji, Jin Xu, Zhenyu Cui, Shihao Li, Yuanxing Zhang, Jiaf…
6 months, 3 weeks ago
Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States
Episode 1280
🤗 Upvotes: 37 | cs.CL
Authors:
Qinglin Zhu, Yizhen Yao, Runcong Zhao, Yanzheng Xiang, Amrutha Saseendran, Chen J…
6 months, 3 weeks ago
Spotlight on Token Perception for Multimodal Reinforcement Learning
Episode 1279
🤗 Upvotes: 31 | cs.CV
Authors:
Siyuan Huang, Xiaoye Qu, Yafu Li, Yun Luo, Zefeng He, Daizong Liu, Yu Cheng
6 months, 3 weeks ago
RLFR: Extending Reinforcement Learning for LLMs with Flow Environment
Episode 1278
🤗 Upvotes: 31 | cs.LG, cs.AI, cs.CL
Authors:
Jinghao Zhang, Naishan Zheng, Ruilin Li, Dongzhou Cheng, Zheming Li…
6 months, 3 weeks ago
DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training
Episode 1277
🤗 Upvotes: 26 | cs.CV
Authors:
Haoran Feng, Dizhe Zhang, Xiangtai Li, Bo Du, Lu Qi
6 months, 3 weeks ago
AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration
Episode 1276
🤗 Upvotes: 26 | cs.CV
Authors:
Xinlong Chen, Yue Ding, Weihong Lin, Jingyun Hua, Linli Yao, Yang Shi, Bozhou Li,…
6 months, 3 weeks ago
InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models
Episode 1275
🤗 Upvotes: 25 | cs.CV
Authors:
Haomin Wang, Jinhui Yin, Qi Wei, Wenguang Zeng, Lixin Gu, Shenglong Ye, Zhangwei …
6 months, 3 weeks ago
BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions
Episode 1274
🤗 Upvotes: 25 | cs.CL, cs.AI
Authors:
Tao Yu, Zhengbo Zhang, Zhiheng Lyu, Junhao Gong, Hongzhu Yi, Xinming Wang,…
6 months, 3 weeks ago
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI
Episode 1273
🤗 Upvotes: 104 | cs.AI, cs.CV, cs.RO
Authors:
Suwhan Choi, Jaeyoon Jung, Haebin Seong, Minchan Kim, Minyeong Kim…
6 months, 3 weeks ago