Podcast Episodes
MARS: Enabling Autoregressive Models Multi-Token Generation
Episode 1746
🤗 Upvotes: 25 | cs.CL
Authors:
Ziqi Jin, Lei Wang, Ziwei Luo, Aixin Sun
2 months, 3 weeks ago
Combee: Scaling Prompt Learning for Self-Improving Language Model Agents
Episode 1745
🤗 Upvotes: 22 | cs.AI, cs.CL, cs.LG
Authors:
Hanchen Li, Runyuan He, Qizheng Zhang, Changxiu Ji, Qiuyang Mang, X…
2 months, 3 weeks ago
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding
Episode 1744
🤗 Upvotes: 201 | cs.CV
Authors:
Chaoyou Fu, Haozhi Yuan, Yuhao Dong, Yi-Fan Zhang, Yunhang Shen, Xiaoxing Hu, Xu…
2 months, 3 weeks ago
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents
Episode 1743
🤗 Upvotes: 98 | cs.AI
Authors:
Bowen Ye, Rang Li, Qibin Yang, Yuanxin Liu, Linli Yao, Hanglong Lv, Zhihui Xie, C…
2 months, 3 weeks ago
Learning to Retrieve from Agent Trajectories
Episode 1742
🤗 Upvotes: 55 | cs.IR, cs.AI, cs.CL
Authors:
Yuqi Zhou, Sunhao Dai, Changle Qu, Liang Pang, Jun Xu, Ji-Rong Wen
2 months, 3 weeks ago
ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation
Episode 1741
🤗 Upvotes: 47 | cs.LG
Authors:
Hui Sun, Yun-Ji Zhang, Zheng Xie, Ren-Biao Liu, Yali Du, Xin-Ye Li, Ming Li
2 months, 3 weeks ago
GBQA: A Game Benchmark for Evaluating LLMs as Quality Assurance Engineers
Episode 1740
🤗 Upvotes: 37 | cs.SE, cs.AI
Authors:
Shufan Jiang, Chios Chen, Zhiyang Chen
2 months, 3 weeks ago
Beyond Accuracy: Unveiling Inefficiency Patterns in Tool-Integrated Reasoning
Episode 1739
🤗 Upvotes: 33 | cs.PF, cs.SE
Authors:
Qisheng Su, Shiting Huang, Zhen Fang, Ziyan Chen, Zehui Chen, Feng Zhao
2 months, 3 weeks ago
ThinkTwice: Jointly Optimizing Large Language Models for Reasoning and Self-Refinement
Episode 1738
🤗 Upvotes: 32 | cs.AI
Authors:
Difan Jiao, Qianfeng Wen, Blair Yang, Zhenwei Tang, Ashton Anderson
2 months, 3 weeks ago
Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision
Episode 1737
🤗 Upvotes: 31 | cs.CV
Authors:
Hyunsoo Cha, Wonjung Woo, Byungjun Kim, Hanbyul Joo
2 months, 3 weeks ago