Sharath Chandra Raparthy

02/12/24 • 40 min

TalkRL: The Reinforcement Learning Podcast

Sharath Chandra Raparthy on In-Context Learning for Sequential Decision Tasks, GFlowNets, and more!

Sharath Chandra Raparthy is an AI Resident at FAIR at Meta, and did his Master's at Mila.

Featured Reference

Generalization to New Sequential Decision Making Tasks with In-Context Learning
Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael Henaff, Roberta Raileanu

Additional References

Sharath Chandra Raparthy Homepage
Human-Timescale Adaptation in an Open-Ended Task Space, Adaptive Agent Team 2023
Data Distributional Properties Drive Emergent In-Context Learning in Transformers, Chan et al 2022
Decision Transformer: Reinforcement Learning via Sequence Modeling, Chen et al 2021

Transcript

Robin 00:00:04.959

TalkRL Podcast is all reinforcement learning, all the time, featuring brilliant guests, both research and applied. Join the conversation on Twitter at @TalkRLPodcast. I'm your host, Robin Chauhan. I'm very glad to introduce our guest today. I'm here with Sharath Chandra Raparthy, who is an AI Resident at FAIR at Meta, and he did his Master's at Mila. Welcome to the show, Sharath.

Sharath 00:00:33.604

Thank you so much