Podcast Episodes

EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning

Episode 1192

🤗 Upvotes: 98 | cs.LG, cs.CL

Authors:
Xu Wujiang, Wentian Zhao, Zhenting Wang, Li Yu-Jhe, Jin Can, Jin Mingyu, M…

7 months, 1 week ago

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Episode 1191

🤗 Upvotes: 81 | cs.CV, cs.CL

Authors:
Junbo Niu, Zheng Liu, Zhuangcheng Gu, Bin Wang, Linke Ouyang, Zhiyuan Zhao…

7 months, 1 week ago

ReviewScore: Misinformed Peer Review Detection with Large Language Models

Episode 1190

🤗 Upvotes: 54 | cs.CL

Authors:
Hyun Ryu, Doohyuk Jang, Hyemin S. Lee, Joonhyun Jeong, Gyeongman Kim, Donghyeon C…

7 months, 1 week ago

Variational Reasoning for Language Models

Episode 1189

🤗 Upvotes: 51 | cs.CL, cs.AI, cs.LG

Authors:
Xiangxin Zhou, Zichen Liu, Haonan Wang, Chao Du, Min Lin, Chongxuan…

7 months, 1 week ago

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Episode 1188

🤗 Upvotes: 48 | cs.CL, cs.AI, cs.LG

Authors:
Renjie Luo, Zichen Liu, Xiangyan Liu, Chao Du, Min Lin, Wenhu Chen,…

7 months, 1 week ago

MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning

Episode 1187

🤗 Upvotes: 28 | cs.CV, cs.RO

Authors:
Jinkun Hao, Naifu Liang, Zhen Luo, Xudong Xu, Weipeng Zhong, Ran Yi, Yiche…

7 months, 1 week ago

CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning

Episode 1186

🤗 Upvotes: 28 | cs.CV, cs.AI, cs.CL

Authors:
Long Xing, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Jianze Liang, Qido…

7 months, 1 week ago

No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping

Episode 1185

🤗 Upvotes: 27 | cs.CL, cs.AI, cs.LG

Authors:
Thanh-Long V. Le, Myeongho Jeon, Kim Vu, Viet Lai, Eunho Yang…

7 months, 1 week ago

VCRL: Variance-based Curriculum Reinforcement Learning for Large Language Models

Episode 1184

🤗 Upvotes: 95 | cs.LG, cs.CL

Authors:
Guochao Jiang, Wenfeng Feng, Guofeng Quan, Chuzhan Hao, Yuewei Zhang, Guoh…

7 months, 1 week ago

SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines

Episode 1183

🤗 Upvotes: 76 | cs.CL

Authors:
Yizhou Wang, Chen Tang, Han Deng, Jiabei Xiao, Jiaqi Liu, Jianyu Wu, Jun Yao, Pen…

7 months, 1 week ago
