Log in

goodpods headphones icon

To access all our features

Open the Goodpods app
Close icon
TalkRL: The Reinforcement Learning Podcast - Natasha Jaques 2
plus icon
bookmark

Natasha Jaques 2

03/14/23 • 46 min

TalkRL: The Reinforcement Learning Podcast

Hear about why OpenAI cites her work in RLHF and dialog models, approaches to rewards in RLHF, ChatGPT, Industry vs Academia, PsiPhi-Learning, AGI and more!

Dr Natasha Jaques is a Senior Research Scientist at Google Brain.

Featured References

Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard
Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control
Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck
PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning
Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar

Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience
Marwa Abdulhai, Natasha Jaques, Sergey Levine

Additional References

plus icon
bookmark

Hear about why OpenAI cites her work in RLHF and dialog models, approaches to rewards in RLHF, ChatGPT, Industry vs Academia, PsiPhi-Learning, AGI and more!

Dr Natasha Jaques is a Senior Research Scientist at Google Brain.

Featured References

Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard
Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control
Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck
PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning
Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar

Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience
Marwa Abdulhai, Natasha Jaques, Sergey Levine

Additional References

Previous Episode

undefined - Jacob Beck and Risto Vuorio

Jacob Beck and Risto Vuorio

Jacob Beck and Risto Vuorio on their recent Survey of Meta-Reinforcement Learning. Jacob and Risto are Ph.D. students at Whiteson Research Lab at University of Oxford.

Featured Reference

A Survey of Meta-Reinforcement Learning
Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson

Additional References

Next Episode

undefined - Jeff Clune

Jeff Clune

AI Generating Algos, Learning to play Minecraft with Video PreTraining (VPT), Go-Explore for hard exploration, POET and Open Endedness, AI-GAs and ChatGPT, AGI predictions, and lots more!

Professor Jeff Clune is Associate Professor of Computer Science at University of British Columbia, a Canada CIFAR AI Chair and Faculty Member at Vector Institute, and Senior Research Advisor at DeepMind.

Featured References

Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos [ Blog Post ]
Bowen Baker, Ilge Akkaya, Peter Zhokhov, Joost Huizinga, Jie Tang, Adrien Ecoffet, Brandon Houghton, Raul Sampedro, Jeff Clune

Robots that can adapt like animals
Antoine Cully, Jeff Clune, Danesh Tarapore, Jean-Baptiste Mouret

Illuminating search spaces by mapping elites
Jean-Baptiste Mouret, Jeff Clune
Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions
Rui Wang, Joel Lehman, Aditya Rawal, Jiale Zhi, Yulun Li, Jeff Clune, Kenneth O. Stanley

Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions
Rui Wang, Joel Lehman, Jeff Clune, Kenneth O. Stanley

First return, then explore
Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, Jeff Clune

Episode Comments

Generate a badge

Get a badge for your website that links back to this episode

Select type & size
Open dropdown icon
share badge image

<a href="https://goodpods.com/podcasts/talkrl-the-reinforcement-learning-podcast-217325/natasha-jaques-2-28710926"> <img src="https://storage.googleapis.com/goodpods-images-bucket/badges/generic-badge-1.svg" alt="listen to natasha jaques 2 on goodpods" style="width: 225px" /> </a>

Copy