Episode Details
NeurIPS 2019 Deep RL Workshop
Episode 8
Published 6 years ago
Description
Thank you to all the presenters who participated. I covered as many as I could given the time and crowds; if you were not included and wish to be, please email talkrl@pathwayi.com
More details are available on the official NeurIPS Deep RL Workshop site.
- 0:23 Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms; Matthia Sabatelli (University of Liège); Gilles Louppe (University of Liège); Pierre Geurts (University of Liège); Marco Wiering (University of Groningen) [external pdf link].
- 4:16 Single Deep Counterfactual Regret Minimization; Eric Steinberger (University of Cambridge).
- 5:38 On the Convergence of Episodic Reinforcement Learning Algorithms at the Example of RUDDER; Markus Holzleitner (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); José Arjona-Medina (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); Marius-Constantin Dinu (LIT AI Lab / University Linz); Sepp Hochreiter (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria).
- 9:33 Objective Mismatch in Model-based Reinforcement Learning; Nathan Lambert (UC Berkeley); Brandon Amos (Facebook); Omry Yadan (Facebook); Roberto Calandra (Facebook).
- 10:51 Option Discovery using Deep Skill Chaining; Akhil Bagaria (Brown University); George Konidaris (Brown University).
- 13:44 Blue River Controls: A toolkit for Reinforcement Learning Control Systems on Hardware; Kirill Polzounov (University of Calgary); Ramitha Sundar (Blue River Technology); Lee Reden (Blue River Technology).
- 14:52 LeDeepChef: Deep Reinforcement Learning Agent for Families of Text-Based Games; Leonard Adolphs (ETHZ); Thomas Hofmann (ETH Zurich).
- 16:30 Accelerating Training in Pommerman with Imitation and Reinforcement Learning; Hardik Meisheri (TCS Research); Omkar Shelke (TCS Research); Richa Verma (TCS Research); Harshad Khadilkar (TCS Research).
- 17:27 Dream to Control: Learning Behaviors by Latent Imagination; Danijar Hafner (Google); Timothy Lillicrap (DeepMind); Jimmy Ba (University of Toronto); Mohammad Norouzi (Google Brain) [external pdf link].
- 20:48 Adaptive Temperature Tuning for Mellowmax in Deep Reinforcement Learning; Seungchan Kim (Brown University); George Konidaris (Brown University).
- 22:05 Meta-learning curiosity algorithms; Ferran Alet (MIT); Martin Schneider (MIT); Tomas Lozano-Perez (MIT); Leslie Kaelbling (MIT).