Podcast Episodes

GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents

Episode 1776

🤗 Upvotes: 106 | cs.CV, cs.AI, cs.HC

Authors:
Mingyu Ouyang, Siyuan Hu, Kevin Qinghong Lin, Hwee Tou Ng, Mike Zh…

2 months, 2 weeks ago

RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time

Episode 1775

🤗 Upvotes: 96 | cs.AI, cs.LG

Authors:
Haozhe Wang, Cong Wei, Weiming Ren, Jiaming Liu, Fangzhen Lin, Wenhu Chen…

2 months, 2 weeks ago

SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

Episode 1774

🤗 Upvotes: 60 | cs.CV, cs.CL

Authors:
Dinging Li, Yingxiu Zhao, Xinrui Cheng, Kangheng Lin, Hongbo Peng, Hongxin…

2 months, 2 weeks ago

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language Environment Simulation

Episode 1773

🤗 Upvotes: 50 | cs.CL

Authors:
Xiaomeng Hu, Yinger Zhang, Fei Huang, Jianhong Tu, Yang Su, Lianghao Deng, Yuxuan…

2 months, 2 weeks ago

Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents

Episode 1772

🤗 Upvotes: 24 | cs.AI, cs.CL

Authors:
Kangsan Kim, Minki Kang, Taeil Kim, Yanlai Yang, Mengye Ren, Sung Ju Hwang…

2 months, 2 weeks ago

From $P(y|x)$ to $P(y)$: Investigating Reinforcement Learning in Pre-train Space

Episode 1771

🤗 Upvotes: 23 | cs.LG, cs.AI, cs.CL

Authors:
Yuqiao Tan, Minzheng Wang, Bo Liu, Zichen Liu, Tian Liang, Shizhu H…

2 months, 2 weeks ago

Exploration and Exploitation Errors Are Measurable for Language Model Agents

Episode 1770

🤗 Upvotes: 22 | cs.AI

Authors:
Jaden Park, Jungtaek Kim, Jongwon Jeong, Robert D. Nowak, Kangwook Lee, Yong Jae …

2 months, 2 weeks ago

ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents

Episode 1769

🤗 Upvotes: 123 | cs.LG, cs.AI, cs.CL, cs.CV

Authors:
Fei Tang, Zhiqiong Lu, Boxuan Zhang, Weiming Lu, Jun Xiao, …

2 months, 2 weeks ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Episode 1768

🤗 Upvotes: 62 | cs.LG, cs.AI, cs.CL

Authors:
Yaxuan Li, Yuxin Zuo, Bingxiang He, Jinqian Zhang, Chaojun Xiao, Ch…

2 months, 2 weeks ago

Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization

Episode 1767

🤗 Upvotes: 27 | cs.AI, cs.LG

Authors:
Jiachen Zhu, Lingyu Yang, Rong Shan, Congmin Zheng, Zeyu Zheng, Weiwen Liu…

2 months, 2 weeks ago

