Podcast Episodes

StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?

Episode 1222

🤗 Upvotes: 24 | cs.LG, cs.CL

Authors:
Yanxu Chen, Zijun Yao, Yantao Liu, Jin Ye, Jianing Yu, Lei Hou, Juanzi Li

…

9 months ago

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Episode 1221

🤗 Upvotes: 100 | cs.AI, cs.CL

Authors:
Fang Wu, Weihao Xuan, Heli Qi, Ximing Lu, Aaron Tu, Li Erran Li, Yejin Ch…

9 months ago

GEM: A Gym for Agentic LLMs

Episode 1220

🤗 Upvotes: 53 | cs.LG, cs.AI, cs.CL

Authors:
Zichen Liu, Anya Sims, Keyu Duan, Changyu Chen, Simon Yu, Xiangxin …

9 months ago

VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators

Episode 1219

🤗 Upvotes: 52 | cs.RO, cs.CV

Authors:
Hengtao Li, Pengxiang Ding, Runze Suo, Yihao Wang, Zirui Ge, Dongyuan Zang…

9 months ago

Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation

Episode 1218

🤗 Upvotes: 32 | cs.LG, cs.AI, cs.CL

Authors:
Ziniu Li, Congliang Chen, Tianyun Yang, Tian Ding, Ruoyu Sun, Ge Zh…

9 months ago

PIPer: On-Device Environment Setup via Online Reinforcement Learning

Episode 1217

🤗 Upvotes: 26 | cs.SE, cs.AI, cs.LG

Authors:
Alexander Kovrigin, Aleksandra Eliseeva, Konstantin Grotov, Egor Bo…

9 months ago

SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights

Episode 1216

🤗 Upvotes: 25 | cs.LG

Authors:
Lorenz K. Müller, Philippe Bich, Jiawei Zhuang, Ahmet Çelik, Luca Benfenati, Luka…

9 months ago

ACON: Optimizing Context Compression for Long-horizon LLM Agents

Episode 1215

🤗 Upvotes: 21 | cs.AI, cs.CL

Authors:
Minki Kang, Wei-Ning Chen, Dongge Han, Huseyin A. Inan, Lukas Wutschitz, Y…

9 months ago

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Episode 1214

🤗 Upvotes: 124 | cs.CL, cs.AI

Authors:
Zijian Wu, Xiangyan Liu, Xinyuan Zhang, Lingjun Chen, Fanqing Meng, Lingx…

9 months ago

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Episode 1213

🤗 Upvotes: 106 | cs.NE, cs.AI, cs.LG, stat.ML

Authors:
Adrian Kosowski, Przemysław Uznański, Jan Chorowski, Zuza…

9 months ago
