TalkRL: The Reinforcement Learning Podcast

53 episodes - English - Latest episode: 18 days ago - ★★★★★ - 7 ratings

TalkRL podcast is All Reinforcement Learning, All the Time.
In-depth interviews with brilliant people at the forefront of RL research and practice.
Guests from places like MILA, OpenAI, MIT, DeepMind, Berkeley, Amii, Oxford, Google Research, Brown, Waymo, Caltech, and Vector Institute.
Hosted by Robin Ranjit Singh Chauhan.

Technology reinforcement learning machine learning artificial intelligence

Episodes

Vincent Moens on TorchRL

April 08, 2024 19:45 - 40 minutes - 36.9 MB

Dr. Vincent Moens is an Applied Machine Learning Research Scientist at Meta, and an author of TorchRL and TensorDict in PyTorch.  Featured References TorchRL: A data-driven decision-making library for PyTorch Albert Bou, Matteo Bettini, Sebastian Dittert, Vikash Kumar, Shagun Sodhani, Xiaomeng Yang, Gianni De Fabritiis, Vincent Moens  Additional References   TorchRL on github   TensorDict Documentation  

Arash Ahmadian on Rethinking RLHF

March 25, 2024 06:46 - 33 minutes - 30.7 MB

Arash Ahmadian is a Researcher at Cohere and Cohere For AI focussed on Preference Training of large language models. He’s also a researcher at the Vector Institute of AI. Featured Reference Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker Additional References Self-Rewarding Language Models, Yuan et al 2024 Re...

Glen Berseth on RL Conference

March 11, 2024 16:00 - 21 minutes - 19.9 MB

Glen Berseth is an assistant professor at the Université de Montréal, a core academic member of the Mila - Quebec AI Institute, a Canada CIFAR AI chair, a member of l'Institut Courtois, and co-director of the Robotics and Embodied AI Lab (REAL).  Featured Links  Reinforcement Learning Conference  Closing the Gap between TD Learning and Supervised Learning--A Generalisation Point of View Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach

Ian Osband

March 07, 2024 19:24 - 1 hour - 62.7 MB

Ian Osband is a Research scientist at OpenAI (ex DeepMind, Stanford) working on decision making under uncertainty.   We spoke about:  - Information theory and RL  - Exploration, epistemic uncertainty and joint predictions  - Epistemic Neural Networks and scaling to LLMs  Featured References  Reinforcement Learning, Bit by Bit  Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen  From Predictions to Decisions: The Importance of Joint ...

Sharath Chandra Raparthy

February 12, 2024 01:43 - 40 minutes - 37.3 MB

Sharath Chandra Raparthy on In-Context Learning for Sequential Decision Tasks, GFlowNets, and more!   Sharath Chandra Raparthy is an AI Resident at FAIR at Meta, and did his Master's at Mila.   Featured Reference  Generalization to New Sequential Decision Making Tasks with In-Context Learning    Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael Henaff, Roberta Raileanu  Additional References   Sharath Chandra Raparthy Homepage   Human-Timescale Adaptation in an...

Pierluca D'Oro and Martin Klissarov

November 13, 2023 17:32 - 57 minutes - 78.9 MB

Pierluca D'Oro and Martin Klissarov on Motif and RLAIF, Noisy Neighborhoods and Return Landscapes, and more!   Pierluca D'Oro is a PhD student at Mila and a visiting researcher at Meta. Martin Klissarov is a PhD student at Mila and McGill and a research scientist intern at Meta.   Featured References  Motif: Intrinsic Motivation from Artificial Intelligence Feedback  Martin Klissarov*, Pierluca D'Oro*, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mi...

Martin Riedmiller

August 22, 2023 16:18 - 1 hour - 102 MB

Martin Riedmiller of Google DeepMind on controlling nuclear fusion plasma in a tokamak with RL, the original Deep Q-Network, Neural Fitted Q-Iteration, Collect and Infer, AGI for control systems, and tons more!   Martin Riedmiller is a research scientist and team lead at DeepMind.    Featured References    Magnetic control of tokamak plasmas through deep reinforcement learning  Jonas Degrave, Federico Felici, Jonas Buchli, Michael Neunert, Brendan Tracey, Francesco Carpanese, Ti...

Max Schwarzer

August 08, 2023 20:22 - 1 hour - 96.6 MB

Max Schwarzer is a PhD student at Mila, with Aaron Courville and Marc Bellemare, interested in RL scaling, representation learning for RL, and RL for science.  Max spent the last 1.5 years at Google Brain/DeepMind, and is now at Apple Machine Learning Research.    Featured References Bigger, Better, Faster: Human-level Atari with human-level efficiency  Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro  Sample-Efficient Re...

Julian Togelius

July 25, 2023 08:00 - 40 minutes - 55.1 MB

Julian Togelius is an Associate Professor of Computer Science and Engineering at NYU, and Cofounder and research director at modl.ai    Featured References   Choose Your Weapon: Survival Strategies for Depressed AI Academics Julian Togelius, Georgios N. Yannakakis Learning Controllable 3D Level Generators Zehua Jiang, Sam Earle, Michael Cerny Green, Julian Togelius PCGRL: Procedural Content Generation via Reinforcement Learning Ahmed Khalifa, Philip Bontrager, Sam Earle, Juli...

Jakob Foerster

May 08, 2023 06:00 - 1 hour - 43.8 MB

Jakob Foerster on Multi-Agent learning, Cooperation vs Competition, Emergent Communication, Zero-shot coordination, Opponent Shaping, agents for Hanabi and Prisoner's Dilemma, and more.   Jakob Foerster is an Associate Professor at University of Oxford.   Featured References   Learning with Opponent-Learning Awareness  Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch   Model-Free Opponent Shaping  Chris Lu, Timon Willi, Christ...

Danijar Hafner 2

April 12, 2023 08:00 - 45 minutes - 31.2 MB

Danijar Hafner on the DreamerV3 agent and world models, the Director agent and hierarchical RL, realtime RL on robots with DayDreamer, and his framework for unsupervised agent design! Danijar Hafner is a PhD candidate at the University of Toronto with Jimmy Ba, a visiting student at UC Berkeley with Pieter Abbeel, and an intern at DeepMind.  He was previously our guest on episode 11.   Featured References    Mastering Diverse Domains through World Models [ blog ] DreamerV...

Jeff Clune

March 27, 2023 14:32 - 1 hour - 48.9 MB

AI Generating Algos, Learning to play Minecraft with Video PreTraining (VPT), Go-Explore for hard exploration, POET and Open Endedness, AI-GAs and ChatGPT, AGI predictions, and lots more!   Professor Jeff Clune is Associate Professor of Computer Science at University of British Columbia, a Canada CIFAR AI Chair and Faculty Member at Vector Institute, and Senior Research Advisor at DeepMind.   Featured References  Video PreTraining (VPT): Learning to Act by Watching Unlabeled Onl...

Natasha Jaques 2

March 14, 2023 06:34 - 46 minutes - 31.6 MB

Hear about why OpenAI cites her work in RLHF and dialog models, approaches to rewards in RLHF, ChatGPT, Industry vs Academia, PsiPhi-Learning, AGI and more!  Dr Natasha Jaques is a Senior Research Scientist at Google Brain. Featured References Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard  Sequence Tutor: ...

Jacob Beck and Risto Vuorio

March 07, 2023 16:19 - 1 hour - 46.1 MB

Jacob Beck and Risto Vuorio on their recent Survey of Meta-Reinforcement Learning.  Jacob and Risto are Ph.D. students at Whiteson Research Lab at University of Oxford.    Featured Reference    A Survey of Meta-Reinforcement Learning Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson    Additional References   VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning, Luisa Zintgraf et al   Mastering Diverse Dom...

John Schulman

October 18, 2022 08:00 - 44 minutes - 30.5 MB

John Schulman is a cofounder of OpenAI, and currently a researcher and engineer at OpenAI. Featured References WebGPT: Browser-assisted question-answering with human feedback Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman Training language models to follow instru...

Sven Mika

August 19, 2022 05:11 - 34 minutes - 28.1 MB

Sven Mika is the Reinforcement Learning Team Lead at Anyscale, and lead committer of RLlib. He holds a PhD in biomathematics, bioinformatics, and computational biology from Witten/Herdecke University.  Featured References RLlib Documentation: RLlib: Industry-Grade Reinforcement Learning Ray: Documentation RLlib: Abstractions for Distributed Reinforcement Learning Eric Liang, Richard Liaw, Philipp Moritz, Robert Nishihara, Roy Fox, Ken Goldberg, Joseph E. Gonzalez, Michael I. Jor...

Karol Hausman and Fei Xia

August 16, 2022 19:05 - 1 hour - 50.7 MB

Karol Hausman is a Senior Research Scientist at Google Brain and an Adjunct Professor at Stanford working on robotics and machine learning. Karol is interested in enabling robots to acquire general-purpose skills with minimal supervision in real-world environments. Fei Xia is a Research Scientist with Google Research. Fei Xia is mostly interested in robot learning in complex and unstructured environments. Previously he has been approaching this problem by learning in realistic and...

Sai Krishna Gottipati

August 01, 2022 02:41 - 1 hour - 46.8 MB

Sai Krishna Gottipati is an RL Researcher at AI Redefined, working on RL, MARL, and human-in-the-loop learning. Featured References Cogment: Open Source Framework For Distributed Multi-actor Training, Deployment & Operations AI Redefined, Sai Krishna Gottipati, Sagar Kurandwad, Clodéric Mars, Gregory Szriftgiser, François Chabot Do As You Teach: A Multi-Teacher Approach to Self-Play in Deep Reinforcement Learning Currently under review Learning to navigate the synthetically accessibl...

Aravind Srinivas 2

May 09, 2022 04:41 - 58 minutes - 40.3 MB

Aravind Srinivas is back!  He is now a Research Scientist at OpenAI. Featured References Decision Transformer: Reinforcement Learning via Sequence Modeling Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch VideoGPT: Video Generation using VQ-VAE and Transformers Wilson Yan, Yunzhi Zhang, Pieter Abbeel, Aravind Srinivas

Rohin Shah

April 12, 2022 02:00 - 1 hour - 77.9 MB

Dr. Rohin Shah is a Research Scientist at DeepMind, and the editor and main contributor of the Alignment Newsletter. Featured References The MineRL BASALT Competition on Learning from Human Feedback Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca Dragan Preferences Implicit in the State of the World Rohin Shah, Dmitrii Krasheninnikov, Jordan A...

Jordan Terry

February 22, 2022 04:44 - 1 hour - 43.9 MB

Jordan Terry is a PhD candidate at University of Maryland, the maintainer of Gym, the maintainer and creator of PettingZoo and the founder of Swarm Labs. Featured References PettingZoo: Gym for Multi-Agent Reinforcement Learning J. K. Terry, Benjamin Black, Nathaniel Grammel, Mario Jayakumar, Ananth Hari, Ryan Sullivan, Luis Santos, Rodrigo Perez, Caroline Horsch, Clemens Dieffendahl, Niall L. Williams, Yashas Lokesh, Praveen Ravi PettingZoo on Github gym on Github Additional R...

Robert Lange

December 20, 2021 09:00 - 1 hour - 48.7 MB

Robert Tjarko Lange is a PhD student working at the Technical University Berlin. Featured References Learning not to learn: Nature versus nurture in silico Lange, R. T., & Sprekeler, H. (2020) On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning Vischer, M. A., Lange, R. T., & Sprekeler, H. (2021). Semantic RL with Action Grammars: Data-Efficient Learning of Hierarchical Task Abstractions Lange, R. T., & Faisal, A. (2019). MLE-Infrastructure on Gi...

NeurIPS 2021 Political Economy of Reinforcement Learning Systems (PERLS) Workshop

November 18, 2021 23:53 - 24 minutes - 16.6 MB

We hear about the idea of PERLS and why it's important to talk about. Political Economy of Reinforcement Learning (PERLS) Workshop at NeurIPS 2021 on Tues Dec 14th  NeurIPS 2021

Amy Zhang

September 27, 2021 17:27 - 1 hour - 55.8 MB

Amy Zhang is a postdoctoral scholar at UC Berkeley and a research scientist at Facebook AI Research. She will be starting as an assistant professor at UT Austin in Spring 2023.  Featured References  Invariant Causal Prediction for Block MDPs  Amy Zhang, Clare Lyle, Shagun Sodhani, Angelos Filos, Marta Kwiatkowska, Joelle Pineau, Yarin Gal, Doina Precup  Multi-Task Reinforcement Learning with Context-based Representations  Shagun Sodhani, Amy Zhang, Joelle Pineau  MBRL-Lib: A Mod...

Xianyuan Zhan

August 30, 2021 20:31 - 41 minutes - 33.3 MB

Xianyuan Zhan is currently a research assistant professor at the Institute for AI Industry Research (AIR), Tsinghua University.  He received his Ph.D. degree at Purdue University. Before joining Tsinghua University, Dr. Zhan worked as a researcher at Microsoft Research Asia (MSRA) and a data scientist at JD Technology.  At JD Technology, he led the research that uses offline RL to optimize real-world industrial systems.  Featured References  DeepThermal: Combustion Optimization fo...

Eugene Vinitsky

August 18, 2021 15:22 - 1 hour - 53 MB

Eugene Vinitsky is a PhD student at UC Berkeley advised by Alexandre Bayen. He has interned at Tesla and DeepMind.   Featured References  A learning agent that acquires social norms from public sanctions in decentralized multi-agent settings  Eugene Vinitsky, Raphael Köster, John P. Agapiou, Edgar Duéñez-Guzmán, Alexander Sasha Vezhnevets, Joel Z. Leibo  Optimizing Mixed Autonomy Traffic Flow With Decentralized Autonomous Vehicles and Multi-Agent RL  Eugene Vinitsky, Nathan Licht...

Jess Whittlestone

July 20, 2021 17:59 - 1 hour - 73.5 MB

Dr. Jess Whittlestone is a Senior Research Fellow at the Centre for the Study of Existential Risk and the Leverhulme Centre for the Future of Intelligence, both at the University of Cambridge.  Featured References  The Societal Implications of Deep Reinforcement Learning  Jess Whittlestone, Kai Arulkumaran, Matthew Crosby  Artificial Canaries: Early Warning Signs for Anticipatory and Democratic Governance of AI  Carla Zoe Cremer, Jess Whittlestone  Additional References  CogX: ...

Aleksandra Faust

July 06, 2021 10:00 - 54 minutes - 43.7 MB

Dr Aleksandra Faust is a Staff Research Scientist and Reinforcement Learning research team co-founder at Google Brain Research. Featured References Reinforcement Learning and Planning for Preference Balancing Tasks  Faust 2014 Learning Navigation Behaviors End-to-End with AutoRL Hao-Tien Lewis Chiang, Aleksandra Faust, Marek Fiser, Anthony Francis Evolving Rewards to Automate Reinforcement Learning  Aleksandra Faust, Anthony Francis, Dar Mehta  Evolving Reinforcement Learn...

Sam Ritter

June 21, 2021 10:00 - 1 hour - 80.7 MB

Sam Ritter is a Research Scientist on the neuroscience team at DeepMind. Featured References Unsupervised Predictive Memory in a Goal-Directed Agent (MERLIN) Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matt Botv...

Thomas Krendl Gilbert

May 17, 2021 11:00 - 1 hour - 57.9 MB

Thomas Krendl Gilbert is a PhD student at UC Berkeley’s Center for Human-Compatible AI, specializing in Machine Ethics and Epistemology.  Featured References  Hard Choices in Artificial Intelligence: Addressing Normative Uncertainty through Sociotechnical Commitments  Roel Dobbe, Thomas Krendl Gilbert, Yonatan Mintz  Mapping the Political Economy of Reinforcement Learning Systems: The Case of Autonomous Vehicles  Thomas Krendl Gilbert  AI Development for the Public Interest: Fro...

Marc G. Bellemare

May 13, 2021 00:00 - 57 minutes - 46.3 MB

Professor Marc G. Bellemare is a Research Scientist at Google Research (Brain team), an Adjunct Professor at McGill University, and a Canada CIFAR AI Chair.  Featured References  The Arcade Learning Environment: An Evaluation Platform for General Agents  Marc G. Bellemare, Yavar Naddaf, Joel Veness, Michael Bowling  Human-level control through deep reinforcement learning  Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves,...

Robert Osazuwa Ness

May 08, 2021 21:00 - 1 hour - 63.1 MB

Robert Osazuwa Ness is an adjunct professor of computer science at Northeastern University, an ML Research Engineer at Gamalon, and the founder of AltDeep School of AI.  He holds a PhD in statistics.  He studied at Johns Hopkins SAIS and then Purdue University.  References  Altdeep School of AI, Altdeep on Twitch, Substack, Robert Ness  Altdeep Causal Generative Machine Learning Minicourse, Free course  Robert Osazuwa Ness on Google Scholar  Gamalon Inc  Causal Reinforcement L...

Marlos C. Machado

April 12, 2021 14:50 - 1 hour - 73.4 MB

Dr. Marlos C. Machado is a research scientist at DeepMind and an adjunct professor at the University of Alberta. He holds a PhD from the University of Alberta and a MSc and BSc from UFMG, in Brazil.  Featured References  Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents  Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew J. Hausknecht, Michael Bowling  Contrastive Behavioral Similarity Embeddings for Gener...

Nathan Lambert

March 22, 2021 22:21 - 50 minutes - 40.6 MB

Nathan Lambert is a PhD Candidate at UC Berkeley.  Featured References  Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning  Nathan O. Lambert, Albert Wilcox, Howard Zhang, Kristofer S. J. Pister, Roberto Calandra  Objective Mismatch in Model-based Reinforcement Learning  Nathan Lambert, Brandon Amos, Omry Yadan, Roberto Calandra  Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning  Nathan O. Lambert, Daniel S. Drew, Joseph Yacon...

Kai Arulkumaran

March 16, 2021 04:44 - 46 minutes - 51.3 MB

Kai Arulkumaran is a researcher at Araya in Tokyo.  Featured References  AlphaStar: An Evolutionary Computation Perspective  Kai Arulkumaran, Antoine Cully, Julian Togelius  Analysing Deep Reinforcement Learning Agents Trained with Domain Randomisation  Tianhong Dai, Kai Arulkumaran, Tamara Gerbert, Samyakh Tukra, Feryal Behbahani, Anil Anthony Bharath  Training Agents using Upside-Down Reinforcement Learning  Rupesh Kumar Srivastava, Pranav Shyam, Filipe Mutz, Wojciech Jaśkowsk...

Michael Dennis

January 26, 2021 05:27 - 1 hour - 48.8 MB

Michael Dennis is a PhD student at the Center for Human-Compatible AI at UC Berkeley, supervised by Professor Stuart Russell.  I'm interested in robustness in RL and multi-agent RL, specifically as it applies to making the interaction between AI systems and society at large to be more beneficial.    --Michael Dennis  Featured References Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design [PAIRED] Michael Dennis, Natasha Jaques, Eugene Vinitsky, Alexand...

Roman Ring

January 11, 2021 12:00 - 42 minutes - 34 MB

Roman Ring is a Research Engineer at DeepMind.  Featured References  Grandmaster level in StarCraft II using multi-agent reinforcement learning  Vinyals et al, 2019  Replicating DeepMind StarCraft II Reinforcement Learning Benchmark with Actor-Critic Methods  Roman Ring, 2018  Additional References  Relational Deep Reinforcement Learning,  Zambaldi et al 2018  StarCraft II: A New Challenge for Reinforcement Learning, Vinyals et al 2017  Safe and Efficient Off-Policy Reinforce...

Shimon Whiteson

December 06, 2020 21:00 - 53 minutes - 43 MB

Shimon Whiteson is a Professor of Computer Science at Oxford University, the head of WhiRL, the Whiteson Research Lab at Oxford, and Head of Research at Waymo UK.  Featured References  VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning  Luisa Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson  Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning  Tabish Rashid, Mikayel Samve...

Aravind Srinivas

September 21, 2020 03:46 - 1 hour - 68.5 MB

Aravind Srinivas is a 3rd year PhD student at UC Berkeley advised by Prof. Abbeel.  He co-created and co-taught a grad course on Deep Unsupervised Learning at Berkeley.  Featured References  Data-Efficient Image Recognition with Contrastive Predictive Coding  Olivier J. Hénaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord  Contrastive Unsupervised Representations for Reinforcement Learning  Aravind Srinivas, Michael Laskin, Pie...

Taylor Killian

August 17, 2020 15:00 - 1 hour - 72.1 MB

Taylor Killian is a Ph.D. student at the University of Toronto and the Vector Institute, and an Intern at Google Brain. Featured References  Direct Policy Transfer with Hidden Parameter Markov Decision Processes Yao, Killian, Konidaris, Doshi-Velez  Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes Killian, Daulton, Konidaris, Doshi-Velez  Transfer Learning Across Patient Variations with Hidden Parameter Markov Decision Processes Killian,...

Nan Jiang

July 06, 2020 15:00 - 1 hour - 57.6 MB

Nan Jiang is an Assistant Professor of Computer Science at University of Illinois.  He was a Postdoc at Microsoft Research, and did his PhD at University of Michigan under Professor Satinder Singh.  Featured References  Reinforcement Learning: Theory and Algorithms Alekh Agarwal, Nan Jiang, Sham M. Kakade  Model-based RL in Contextual Decision Processes: PAC bounds and Exponential Improvements over Model-free Approaches Wen Sun, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John L...

Danijar Hafner

May 14, 2020 10:00 - 2 hours - 71.4 MB

Danijar Hafner is a PhD student at the University of Toronto, and a student researcher at Google Research, Brain Team and the Vector Institute.  He holds a Masters of Research from University College London.  Featured References  A deep learning framework for neuroscience Blake A. Richards, Timothy P. Lillicrap, Philippe Beaudoin, Yoshua Bengio, Rafal Bogacz, Amelia Christensen, Claudia Clopath, Rui Ponte Costa, Archy de Berker, Surya Ganguli, Colleen J. Gillon, Danijar Hafner,...

Csaba Szepesvari

April 05, 2020 16:00 - 48 minutes - 39.1 MB

Csaba Szepesvari is:  Head of the Foundations Team at DeepMind  Professor of Computer Science at the University of Alberta  Canada CIFAR AI Chair  Fellow at the Alberta Machine Intelligence Institute   Co-Author of the book Bandit Algorithms along with Tor Lattimore, and author of the book Algorithms for Reinforcement Learning  References  Bandit based monte-carlo planning, Levente Kocsis, Csaba Szepesvári  Bandit Algorithms, Tor Lattimore, Csaba Szepesvári  Algorithms for ...

Ben Eysenbach

March 30, 2020 16:00 - 49 minutes - 39.6 MB

Ben Eysenbach is a PhD student in the Machine Learning Department at Carnegie Mellon University.  He was a Resident at Google Brain, and studied math and computer science at MIT. He co-founded the ICML Exploration in Reinforcement Learning workshop.  Featured References Diversity is All You Need: Learning Skills without a Reward Function Benjamin Eysenbach, Abhishek Gupta, Julian Ibarz, Sergey Levine Search on the Replay Buffer: Bridging Planning and Reinforcement Learning B...

NeurIPS 2019 Deep RL Workshop

December 20, 2019 07:00 - 56 minutes - 45.1 MB

Thank you to all the presenters who participated.  I covered as many as I could given the time and crowds; if you were not included and wish to be, please email [email protected]  More details on the official NeurIPS Deep RL Workshop site.  0:23  Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms; Matthia Sabatelli (University of Liège); Gilles Louppe (University of Liège); Pierre Geurts (University of...

Scott Fujimoto

November 19, 2019 06:00 - 48 minutes - 38.8 MB

Scott Fujimoto is a PhD student at McGill University and Mila. He is the author of TD3 as well as some of the recent developments in batch deep reinforcement learning.   Featured References  Addressing Function Approximation Error in Actor-Critic Methods  Scott Fujimoto, Herke van Hoof, David Meger  Off-Policy Deep Reinforcement Learning without Exploration  Scott Fujimoto, David Meger, Doina Precup  Benchmarking Batch Deep Reinforcement Learning Algorithms  Scott Fujimoto, Ed...

Jessica Hamrick

November 12, 2019 05:00 - 1 hour - 51.1 MB

Dr. Jessica Hamrick is a Research Scientist at DeepMind. She holds a PhD in Psychology from UC Berkeley.  Featured References  Structured agents for physical construction  Victor Bapst, Alvaro Sanchez-Gonzalez, Carl Doersch, Kimberly L. Stachenfeld, Pushmeet Kohli, Peter W. Battaglia, Jessica B. Hamrick  Analogues of mental simulation and imagination in deep learning  Jessica Hamrick  Additional References  Metacontrol for Adaptive Imagination-Based Optimization  Jessica B. Ha...

Pablo Samuel Castro

October 10, 2019 03:00 - 56 minutes - 45.5 MB

Dr Pablo Samuel Castro is a Staff Research Software Engineer at Google Brain.  He is the main author of the Dopamine RL framework.  Featured References  A Comparative Analysis of Expected and Distributional Reinforcement Learning  Clare Lyle, Pablo Samuel Castro, Marc G. Bellemare   A Geometric Perspective on Optimal Representations for Reinforcement Learning  Marc G. Bellemare, Will Dabney, Robert Dadashi, Adrien Ali Taiga, Pablo Samuel Castro, Nicolas Le Roux, Dale Schuurmans...

Kamyar Azizzadenesheli

September 21, 2019 01:00 - 1 hour - 68.8 MB

Dr. Kamyar Azizzadenesheli is a post-doctorate scholar at Caltech.  His research interest is mainly in the area of Machine Learning, from theory to practice, with the main focus in Reinforcement Learning.  He will be joining Purdue University as an Assistant CS Professor in Fall 2020.  Featured References  Efficient Exploration through Bayesian Deep Q-Networks  Kamyar Azizzadenesheli, Animashree Anandkumar  Surprising Negative Results for Generative Adversarial Tree Search  Kamya...

Antonin Raffin and Ashley Hill

September 05, 2019 00:00 - 34 minutes - 27.9 MB

Antonin Raffin is a researcher at the German Aerospace Center (DLR) in Munich, working in the Institute of Robotics and Mechatronics. His research is on using machine learning for controlling real robots (because simulation is not enough), with a particular interest in reinforcement learning.  Ashley Hill is doing his thesis on improving control algorithms using machine learning for real time gain tuning.  He works mainly with neuroevolution, genetic algorithms, and of course rei...
