
Pierluca D'Oro and Martin Klissarov
11/13/23 • 57 min
Pierluca D'Oro and Martin Klissarov on Motif and RLAIF, Noisy Neighborhoods and Return Landscapes, and more!
Pierluca D'Oro is a PhD student at Mila and a visiting researcher at Meta.
Martin Klissarov is a PhD student at Mila and McGill and research scientist intern at Meta.
Featured References
Motif: Intrinsic Motivation from Artificial Intelligence Feedback
Martin Klissarov*, Pierluca D'Oro*, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff
Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control
Nate Rahn*, Pierluca D'Oro*, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare
To keep doing RL research, stop calling yourself an RL researcher
Pierluca D'Oro
Previous Episode

Martin Riedmiller
Martin Riedmiller of Google DeepMind on controlling nuclear fusion plasma in a tokamak with RL, the original Deep Q-Network, Neural Fitted Q-Iteration, Collect and Infer, AGI for control systems, and tons more!
Martin Riedmiller is a research scientist and team lead at DeepMind.
Featured References
Magnetic control of tokamak plasmas through deep reinforcement learning
Jonas Degrave, Federico Felici, Jonas Buchli, Michael Neunert, Brendan Tracey, Francesco Carpanese, Timo Ewalds, Roland Hafner, Abbas Abdolmaleki, Diego de las Casas, Craig Donner, Leslie Fritz, Cristian Galperti, Andrea Huber, James Keeling, Maria Tsimpoukelli, Jackie Kay, Antoine Merle, Jean-Marc Moret, Seb Noury, Federico Pesamosca, David Pfau, Olivier Sauter, Cristian Sommariva, Stefano Coda, Basil Duval, Ambrogio Fasoli, Pushmeet Kohli, Koray Kavukcuoglu, Demis Hassabis & Martin Riedmiller
Human-level control through deep reinforcement learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis
Neural fitted Q iteration – first experiences with a data efficient neural reinforcement learning method
Martin Riedmiller
Next Episode

Sharath Chandra Raparthy
Sharath Chandra Raparthy on In-Context Learning for Sequential Decision Tasks, GFlowNets, and more!
Sharath Chandra Raparthy is an AI Resident at FAIR at Meta, and did his Master's at Mila.
Featured Reference
Generalization to New Sequential Decision Making Tasks with In-Context Learning
Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael Henaff, Roberta Raileanu
Additional References
- Sharath Chandra Raparthy Homepage
- Human-Timescale Adaptation in an Open-Ended Task Space, Adaptive Agent Team 2023
- Data Distributional Properties Drive Emergent In-Context Learning in Transformers, Chan et al 2022
- Decision Transformer: Reinforcement Learning via Sequence Modeling, Chen et al 2021
TalkRL: The Reinforcement Learning Podcast - Pierluca D'Oro and Martin Klissarov
Transcript
TalkRL Podcast is all reinforcement learning, all the time, featuring brilliant guests, both research and applied. Join the conversation on Twitter at @TalkRLPodcast. I'm your host, Robin Chauhan. I'm very excited to welcome our guest today. We have Pierluca D'Oro, a PhD student at Mila and visiting researcher at Meta. And we have Martin Klissarov, a PhD student at Mila and McGill, and a research scientist intern at Meta. Welcome.