Podcast Episodes
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning
Episode 1192
🤗 Upvotes: 98 | cs.LG, cs.CL
Authors:
Xu Wujiang, Wentian Zhao, Zhenting Wang, Li Yu-Jhe, Jin Can, Jin Mingyu, M…
7 months, 1 week ago
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
Episode 1191
🤗 Upvotes: 81 | cs.CV, cs.CL
Authors:
Junbo Niu, Zheng Liu, Zhuangcheng Gu, Bin Wang, Linke Ouyang, Zhiyuan Zhao…
7 months, 1 week ago
ReviewScore: Misinformed Peer Review Detection with Large Language Models
Episode 1190
🤗 Upvotes: 54 | cs.CL
Authors:
Hyun Ryu, Doohyuk Jang, Hyemin S. Lee, Joonhyun Jeong, Gyeongman Kim, Donghyeon C…
7 months, 1 week ago
Variational Reasoning for Language Models
Episode 1189
🤗 Upvotes: 51 | cs.CL, cs.AI, cs.LG
Authors:
Xiangxin Zhou, Zichen Liu, Haonan Wang, Chao Du, Min Lin, Chongxuan…
7 months, 1 week ago
Language Models Can Learn from Verbal Feedback Without Scalar Rewards
Episode 1188
🤗 Upvotes: 48 | cs.CL, cs.AI, cs.LG
Authors:
Renjie Luo, Zichen Liu, Xiangyan Liu, Chao Du, Min Lin, Wenhu Chen,…
7 months, 1 week ago
MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning
Episode 1187
🤗 Upvotes: 28 | cs.CV, cs.RO
Authors:
Jinkun Hao, Naifu Liang, Zhen Luo, Xudong Xu, Weipeng Zhong, Ran Yi, Yiche…
7 months, 1 week ago
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning
Episode 1186
🤗 Upvotes: 28 | cs.CV, cs.AI, cs.CL
Authors:
Long Xing, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Jianze Liang, Qido…
7 months, 1 week ago
No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping
Episode 1185
🤗 Upvotes: 27 | cs.CL, cs.AI, cs.LG
Authors:
Thanh-Long V. Le, Myeongho Jeon, Kim Vu, Viet Lai, Eunho Yang
7 months, 1 week ago
VCRL: Variance-based Curriculum Reinforcement Learning for Large Language Models
Episode 1184
🤗 Upvotes: 95 | cs.LG, cs.CL
Authors:
Guochao Jiang, Wenfeng Feng, Guofeng Quan, Chuzhan Hao, Yuewei Zhang, Guoh…
7 months, 1 week ago
SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines
Episode 1183
🤗 Upvotes: 76 | cs.CL
Authors:
Yizhou Wang, Chen Tang, Han Deng, Jiabei Xiao, Jiaqi Liu, Jianyu Wu, Jun Yao, Pen…
7 months, 1 week ago