Podcast Episodes

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Episode 836

🤗 Upvotes: 84 | cs.LG, cs.AI, cs.CL

Authors:
Ganqu Cui, Yuchen Zhang, Jiacheng Chen, Lifan Yuan, Zhi Wang, Yuxin…

11 months, 1 week ago

SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Episode 835

🤗 Upvotes: 63 | cs.SE, cs.CL

Authors:
Ibragim Badertdinov, Alexander Golubev, Maksim Nekrashevich, Anton Shevtso…

11 months, 1 week ago

R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Episode 834

🤗 Upvotes: 59 | cs.CL, cs.AI, cs.LG, cs.PF, I.2.7

Authors:
Tianyu Fu, Yi Ge, Yichen You, Enshu Liu, Zhihang Yuan…

11 months, 1 week ago

