Pierluca D'Oro and Martin Klissarov

11/13/23 • 57 min

TalkRL: The Reinforcement Learning Podcast

Pierluca D'Oro and Martin Klissarov on Motif and RLAIF, Noisy Neighborhoods and Return Landscapes, and more!

Pierluca D'Oro is a PhD student at Mila and a visiting researcher at Meta.

Martin Klissarov is a PhD student at Mila and McGill and a research scientist intern at Meta.

Featured References

Motif: Intrinsic Motivation from Artificial Intelligence Feedback
Martin Klissarov*, Pierluca D'Oro*, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff

Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control
Nate Rahn*, Pierluca D'Oro*, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare

To keep doing RL research, stop calling yourself an RL researcher
Pierluca D'Oro

Previous Episode

Martin Riedmiller

Martin Riedmiller of Google DeepMind on controlling nuclear fusion plasma in a tokamak with RL, the original Deep Q-Network, Neural Fitted Q-Iteration, Collect and Infer, AGI for control systems, and tons more!

Martin Riedmiller is a research scientist and team lead at DeepMind.

Featured References

Magnetic control of tokamak plasmas through deep reinforcement learning
Jonas Degrave, Federico Felici, Jonas Buchli, Michael Neunert, Brendan Tracey, Francesco Carpanese, Timo Ewalds, Roland Hafner, Abbas Abdolmaleki, Diego de las Casas, Craig Donner, Leslie Fritz, Cristian Galperti, Andrea Huber, James Keeling, Maria Tsimpoukelli, Jackie Kay, Antoine Merle, Jean-Marc Moret, Seb Noury, Federico Pesamosca, David Pfau, Olivier Sauter, Cristian Sommariva, Stefano Coda, Basil Duval, Ambrogio Fasoli, Pushmeet Kohli, Koray Kavukcuoglu, Demis Hassabis & Martin Riedmiller

Human-level control through deep reinforcement learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis

Neural fitted Q iteration–first experiences with a data efficient neural reinforcement learning method
Martin Riedmiller

Next Episode

Sharath Chandra Raparthy

Sharath Chandra Raparthy on In-Context Learning for Sequential Decision Tasks, GFlowNets, and more!

Sharath Chandra Raparthy is an AI Resident at FAIR at Meta, and did his Master's at Mila.

Featured Reference

Generalization to New Sequential Decision Making Tasks with In-Context Learning
Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael Henaff, Roberta Raileanu

Additional References

Sharath Chandra Raparthy Homepage
Human-Timescale Adaptation in an Open-Ended Task Space, Adaptive Agent Team 2023
Data Distributional Properties Drive Emergent In-Context Learning in Transformers, Chan et al 2022
Decision Transformer: Reinforcement Learning via Sequence Modeling, Chen et al 2021

TalkRL: The Reinforcement Learning Podcast - Pierluca D'Oro and Martin Klissarov

Transcript

Robin 00:00:04.879

TalkRL Podcast is all reinforcement learning, all the time, featuring brilliant guests, both research and applied. Join the conversation on Twitter at @TalkRLPodcast. I'm your host, Robin Chauhan. I'm very excited to welcome our guests today. We have Pierluca D'Oro, a PhD student at Mila and visiting researcher at Meta. And we have Martin Klissarov, a PhD student at Mila and McGill, and a research scientist intern at Meta. Welcome.

Martin<