
Harri Valpola: System 2 AI and Planning in Model-Based Reinforcement Learning
05/25/20 • 98 min
In this episode of Machine Learning Street Talk, Tim Scarfe, Yannic Kilcher and Connor Shorten interviewed Harri Valpola, CEO and Founder of Curious AI. We continued our discussion of System 1 and System 2 thinking in Deep Learning, as well as miscellaneous topics around model-based Reinforcement Learning. Dr. Valpola describes some of the challenges of modelling industrial control processes, such as sewage-water filtering and paper mills, with model-based RL. Dr. Valpola and his collaborators recently published “Regularizing Trajectory Optimization with Denoising Autoencoders”, which addresses the tendency of planning algorithms to exploit inaccuracies in their world models!
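To make the paper's core idea concrete, here is a hedged, minimal sketch of a shooting-style planner whose candidate plans are penalized by a denoising autoencoder's reconstruction error, so that plans straying from the training distribution of the learned dynamics model score badly. The `dynamics`, `reward` and `denoise` callables, the shapes and all hyperparameters are illustrative assumptions, not the authors' implementation.
```python
import numpy as np

rng = np.random.default_rng(0)

def plan(dynamics, reward, denoise, s0, horizon=10, n_candidates=500,
         action_dim=2, reg_weight=1.0):
    """Random-shooting planner with a DAE penalty (illustrative sketch).

    dynamics(s, a) -> next state   # learned forward model, assumed given
    reward(s, a)   -> float        # reward model, assumed given
    denoise(x)     -> array        # DAE trained on (state, action) pairs
    """
    best_score, best_actions = -np.inf, None
    for _ in range(n_candidates):
        actions = rng.normal(size=(horizon, action_dim))
        s, score = np.asarray(s0, dtype=float), 0.0
        for a in actions:
            x = np.concatenate([s, a])
            # A trained DAE barely changes in-distribution points, so the
            # reconstruction error penalizes out-of-distribution plans that
            # would otherwise exploit errors in the learned dynamics model.
            score += reward(s, a) - reg_weight * np.sum((denoise(x) - x) ** 2)
            s = dynamics(s, a)
        if score > best_score:
            best_score, best_actions = score, actions
    return best_actions
```
The intuition is that a trained DAE moves out-of-distribution points much further than in-distribution ones, so its reconstruction error acts as a proxy for the (negative) log-likelihood of the imagined trajectory.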
00:00:00 Intro to Harri and Curious AI, System 1/System 2
00:04:50 Background on model-based RL challenges from Tim
00:06:26 Other interesting research papers on model-based RL from Connor
00:08:36 Intro to Curious AI recent NeurIPS paper on model-based RL and denoising autoencoders from Yannic
00:21:00 Main show kick-off, System 1/2
00:31:50 Where does the simulator come from?
00:33:59 Evolutionary priors
00:37:17 Consciousness
00:40:37 How does one build a company like Curious AI?
00:46:42 Deep Q Networks
00:49:04 Planning and Model-based RL
00:53:04 Learning good representations
00:55:55 Typical problem Curious AI might solve in industry
01:00:56 Exploration
01:08:00 Their paper: Regularizing Trajectory Optimization with Denoising Autoencoders
01:13:47 What is epistemic uncertainty?
01:16:44 How would Curious AI develop these models?
01:18:00 Explainability and simulations
01:22:33 How system 2 works in humans
01:26:11 Planning
01:27:04 Advice for starting an AI company
01:31:31 Real world implementation of planning models
01:33:49 Publishing research and openness
We really hope you enjoy this episode. Please subscribe!
Regularizing Trajectory Optimization with Denoising Autoencoders: https://papers.nips.cc/paper/8552-regularizing-trajectory-optimization-with-denoising-autoencoders.pdf
Pulp, Paper & Packaging: A Future Transformed through Deep Learning: https://thecuriousaicompany.com/pulp-paper-packaging-a-future-transformed-through-deep-learning/
Curious AI: https://thecuriousaicompany.com/
Harri Valpola Publications: https://scholar.google.com/citations?user=1uT7-84AAAAJ&hl=en&oi=ao
Some interesting papers around Model-Based RL:
GameGAN: https://cdn.arstechnica.net/wp-content/uploads/2020/05/Nvidia_GameGAN_Research.pdf
Plan2Explore: https://ramanans1.github.io/plan2explore/
World Models: https://worldmodels.github.io/
MuZero: https://arxiv.org/pdf/1911.08265.pdf
PlaNet: A Deep Planning Network for RL: https://ai.googleblog.com/2019/02/introducing-planet-deep-planning.html
Dreamer: Scalable RL using World Models: https://ai.googleblog.com/2020/03/introducing-dreamer-scalable.html
Model Based RL for Atari: https://arxiv.org/pdf/1903.00374.pdf
Previous Episode

ICLR 2020: Yoshua Bengio and the Nature of Consciousness
In this episode of Machine Learning Street Talk, Tim Scarfe, Connor Shorten and Yannic Kilcher react to Yoshua Bengio’s ICLR 2020 keynote “Deep Learning Priors Associated with Conscious Processing”. Bengio takes on many future directions for research in Deep Learning, such as the role of attention in consciousness, sparse factor graphs and causality, and the study of systematic generalization. Bengio also presents big ideas in intelligence that sit on the border between philosophy and practical machine learning, including consciousness in machines and System 1 and System 2 thinking, as described in Daniel Kahneman’s book “Thinking, Fast and Slow”. Like Yann LeCun’s half of the 2020 ICLR keynote, this talk takes on many challenging ideas, and hopefully this video helps you get a better understanding of some of them. Thanks for watching!
Please Subscribe for more videos!
Paper Links:
Link to Talk: https://iclr.cc/virtual_2020/speaker_7.html
The Consciousness Prior: https://arxiv.org/abs/1709.08568
Thinking, Fast and Slow: https://www.amazon.com/Thinking-Fast-Slow-Daniel-Kahneman/dp/0374533555
Systematic Generalization: https://arxiv.org/abs/1811.12889
CLOSURE: Assessing Systematic Generalization of CLEVR Models: https://arxiv.org/abs/1912.05783
Neural Module Networks: https://arxiv.org/abs/1511.02799
Experience Grounds Language: https://arxiv.org/pdf/2004.10151.pdf
Benchmarking Graph Neural Networks: https://arxiv.org/pdf/2003.00982.pdf
On the Measure of Intelligence: https://arxiv.org/abs/1911.01547
Please check out our individual channels as well!
Machine Learning Dojo with Tim Scarfe: https://www.youtube.com/channel/UCXvHuBMbgJw67i5vrMBBobA
Yannic Kilcher: https://www.youtube.com/channel/UCZHmQk67mSJgfCCTn7xBfe
Henry AI Labs: https://www.youtube.com/channel/UCHB9VepY6kYvZjj0Bgxnpbw
00:00:00 Tim and Yannic's takes
00:01:37 Intro to Bengio
00:03:13 System 2, language and Chomsky
00:05:58 Christof Koch on consciousness
00:07:25 Francois Chollet on intelligence and consciousness
00:09:29 Meditation and Sam Harris on consciousness
00:11:35 Connor Intro
00:13:20 Show Main Intro
00:17:55 Priors associated with Conscious Processing
00:26:25 System 1 / System 2
00:42:47 Implicit and Verbalized Knowledge [DON'T MISS THIS!]
01:08:24 Inductive Priors for DL 2.0
01:27:20 Systematic Generalization
01:37:53 Contrast with the Symbolic AI Program
01:54:55 Attention
02:00:25 From Attention to Consciousness
02:05:31 Thoughts, Consciousness, Language
02:06:55 Sparse Factor graph
02:10:52 Sparse Change in Abstract Latent Space
02:15:10 Discovering Cause and Effect
02:20:00 Factorize the joint distribution
02:22:30 RIMs: Modular Computation
02:24:30 Conclusion
#machinelearning #deeplearning
Next Episode

One Shot and Metric Learning - Quadruplet Loss (Machine Learning Dojo)
*Note: this is an episode from Tim's Machine Learning Dojo YouTube channel.
Join Eric Craeymeersch for a wonderful discussion all about ML engineering, computer vision, siamese networks, contrastive loss, one-shot learning and metric learning.
00:00:00 Introduction
00:11:47 ML Engineering Discussion
00:35:59 Intro to the main topic
00:42:13 Siamese Networks
00:48:36 Mining strategies
00:51:15 Contrastive Loss
00:57:44 Triplet loss paper
01:09:35 Quadruplet loss paper
01:25:49 Eric's quadruplet loss Medium article
02:17:32 Metric learning reality check
02:21:06 Engineering discussion II
02:26:22 Outro
In our second paper review call, Tess Ferrandez covered the FaceNet paper from Google, a one-shot siamese network with the so-called triplet loss. It was an interesting change of direction for NN architectures, i.e. using a contrastive loss instead of a fixed number of output classes. Contrastive and self-supervised architectures have been taking over the ML landscape recently, e.g. SimCLR, MoCo and BERT.
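As a refresher for the discussion above, here is a minimal numpy sketch of the FaceNet-style triplet loss on a single (anchor, positive, negative) triple of embedding vectors; the squared-L2 distance and the margin value are illustrative choices, not the paper's exact training setup.
```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """FaceNet-style triplet loss on single embedding vectors (sketch).

    Encourages the anchor-positive distance to be smaller than the
    anchor-negative distance by at least `margin`.
    """
    d_pos = np.sum((anchor - positive) ** 2)  # squared L2 distance
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)
```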
Eric wrote an article about this at the time: https://medium.com/@crimy/one-shot-learning-siamese-networks-and-triplet-loss-with-keras-2885ed022352
He then discovered there was a new approach to one-shot learning in vision using a quadruplet loss and metric learning. Eric wrote a new article and ran several experiments on this: https://medium.com/@crimy/beyond-triplet-loss-one-shot-learning-experiments-with-quadruplet-loss-16671ed51290?source=friends_link&sk=bf41673664ad8a52e322380f2a456e8b
Paper details:
Beyond triplet loss: a deep quadruplet network for person re-identification
https://arxiv.org/abs/1704.01719 (Chen et al. '17)
"Person re-identification (ReID) is an important task in wide area video surveillance which focuses on identifying people across different cameras. Recently, deep learning networks with a triplet loss become a common framework for person ReID. However, the triplet loss pays main attentions on obtaining correct orders on the training set. It still suffers from a weaker generalization capability from the training set to the testing set, thus resulting in inferior performance. In this paper, we design a quadruplet loss, which can lead to the model output with a larger inter-class variation and a smaller intra-class variation compared to the triplet loss. As a result, our model has a better generalization ability and can achieve a higher performance on the testing set. In particular, a quadruplet deep network using a margin-based online hard negative mining is proposed based on the quadruplet loss for the person ReID. In extensive experiments, the proposed network outperforms most of the state-of-the-art algorithms on representative datasets which clearly demonstrates the effectiveness of our proposed method."
Original FaceNet paper:
https://arxiv.org/abs/1503.03832
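To make the quadruplet loss described in the quoted abstract concrete, here is a hedged numpy sketch of the loss on single embedding vectors. It keeps the usual triplet term and adds a second hinge comparing the anchor-positive distance against a negative pair that does not contain the anchor; the margin values are illustrative assumptions, not the paper's settings.
```python
import numpy as np

def quadruplet_loss(anchor, positive, negative1, negative2,
                    margin1=0.3, margin2=0.15):
    """Quadruplet loss in the style of Chen et al. '17 (sketch).

    anchor, positive : same identity
    negative1        : a different identity
    negative2        : a third identity, distinct from the other two
    """
    d_ap = np.sum((anchor - positive) ** 2)       # intra-class distance
    d_an = np.sum((anchor - negative1) ** 2)      # inter-class distance
    d_nn = np.sum((negative1 - negative2) ** 2)   # inter-class, anchor-free
    # Second hinge pushes intra-class distances below inter-class distances
    # even for pairs not involving the anchor, shrinking intra-class
    # variation relative to inter-class variation.
    return max(0.0, d_ap - d_an + margin1) + max(0.0, d_ap - d_nn + margin2)
```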
#deeplearning #machinelearning