
Sven Mika
08/19/22 • 34 min
Sven Mika is the Reinforcement Learning Team Lead at Anyscale, and lead committer of RLlib. He holds a PhD in biomathematics, bioinformatics, and computational biology from Witten/Herdecke University.
Featured References
RLlib Documentation: RLlib: Industry-Grade Reinforcement Learning
Ray: Documentation
RLlib: Abstractions for Distributed Reinforcement Learning
Eric Liang, Richard Liaw, Philipp Moritz, Robert Nishihara, Roy Fox, Ken Goldberg, Joseph E. Gonzalez, Michael I. Jordan, Ion Stoica
Episode sponsor: Anyscale
Ray Summit 2022 is coming to San Francisco on August 23-24.
Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib.
Register at raysummit.org and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.
Sven Mika is the Reinforcement Learning Team Lead at Anyscale, and lead committer of RLlib. He holds a PhD in biomathematics, bioinformatics, and computational biology from Witten/Herdecke University.
Featured References
RLlib Documentation: RLlib: Industry-Grade Reinforcement Learning
Ray: Documentation
RLlib: Abstractions for Distributed Reinforcement Learning
Eric Liang, Richard Liaw, Philipp Moritz, Robert Nishihara, Roy Fox, Ken Goldberg, Joseph E. Gonzalez, Michael I. Jordan, Ion Stoica
Episode sponsor: Anyscale
Ray Summit 2022 is coming to San Francisco on August 23-24.
Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib.
Register at raysummit.org and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.
Previous Episode

Karol Hausman and Fei Xia
Karol Hausman is a Senior Research Scientist at Google Brain and an Adjunct Professor at Stanford working on robotics and machine learning. Karol is interested in enabling robots to acquire general-purpose skills with minimal supervision in real-world environments.
Fei Xia is a Research Scientist with Google Research. Fei Xia is mostly interested in robot learning in complex and unstructured environments. Previously he has been approaching this problem by learning in realistic and scalable simulation environments (GibsonEnv, iGibson). Most recently, he has been exploring using foundation models for those challenges.
Featured References
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances [ website ]
Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Kuang-Huei Lee, Sergey Levine, Yao Lu, Linda Luu, Carolina Parada, Peter Pastor, Jornell Quiambao, Kanishka Rao, Jarek Rettinghouse, Diego Reyes, Pierre Sermanet, Nicolas Sievers, Clayton Tan, Alexander Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Mengyuan Yan
Inner Monologue: Embodied Reasoning through Planning with Language Models
Wenlong Huang, Fei Xia, Ted Xiao, Harris Chan, Jacky Liang, Pete Florence, Andy Zeng, Jonathan Tompson, Igor Mordatch, Yevgen Chebotar, Pierre Sermanet, Noah Brown, Tomas Jackson, Linda Luu, Sergey Levine, Karol Hausman, Brian Ichter
Additional References
- Large-scale simulation for embodied perception and robot learning, Xia 2021
- QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation, Kalashnikov et al 2018
- MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale, Kalashnikov et al 2021
- ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation, Xia et al 2020
- Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills, Chebotar et al 2021
- Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language, Zeng et al 2022
Episode sponsor: Anyscale
Ray Summit 2022 is coming to San Francisco on August 23-24.
Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib.
Register at raysummit.org and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.
Next Episode

John Schulman
John Schulman is a cofounder of OpenAI, and currently a researcher and engineer at OpenAI.
Featured References
WebGPT: Browser-assisted question-answering with human feedback
Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman
Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan Lowe
Additional References
- Our approach to alignment research, OpenAI 2022
- Training Verifiers to Solve Math Word Problems, Cobbe et al 2021
- UC Berkeley Deep RL Bootcamp Lecture 6: Nuts and Bolts of Deep RL Experimentation, John Schulman 2017
- Proximal Policy Optimization Algorithms, Schulman 2017
- Optimizing Expectations: From Deep Reinforcement Learning to Stochastic Computation Graphs, Schulman 2016
TalkRL: The Reinforcement Learning Podcast - Sven Mika
Transcript
There's a rise in interest in our finance. We have JPM for example, as well as other companies that we're seeing moving into the space, and trying RL on financial decision making. Ray was actually developed, because of the need, to write a reinforcement learning library. TalkRL.
RobinR l. Talk r l podcast is all reinforcement learning all the time, featuring brilliant guests, both research and applied. Join
If you like this episode you’ll love
Episode Comments
Generate a badge
Get a badge for your website that links back to this episode
<a href="https://goodpods.com/podcasts/talkrl-the-reinforcement-learning-podcast-217325/sven-mika-24475278"> <img src="https://storage.googleapis.com/goodpods-images-bucket/badges/generic-badge-1.svg" alt="listen to sven mika on goodpods" style="width: 225px" /> </a>
Copy