
Harri Valpola: System 2 AI and Planning in Model-Based Reinforcement Learning
05/25/20 • 98 min
In this episode of Machine Learning Street Talk, Tim Scarfe, Yannic Kilcher and Connor Shorten interviewed Harri Valpola, CEO and Founder of Curious AI. We continued our discussion of System 1 and System 2 thinking in Deep Learning, as well as miscellaneous topics around model-based Reinforcement Learning. Dr. Valpola describes some of the challenges of modelling industrial control processes, such as sewage-water filtering and paper mills, with model-based RL. Dr. Valpola and his collaborators recently published “Regularizing Trajectory Optimization with Denoising Autoencoders”, which addresses the tendency of planning algorithms to exploit inaccuracies in their world models!
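To make the paper's core idea concrete, here is a hedged, minimal sketch of a shooting-style planner whose candidate plans are penalized by a denoising autoencoder's reconstruction error, so that plans straying from the training distribution of the learned dynamics model score badly. The `dynamics`, `reward` and `denoise` callables, the shapes and all hyperparameters are illustrative assumptions, not the authors' implementation.
```python
import numpy as np

rng = np.random.default_rng(0)

def plan(dynamics, reward, denoise, s0, horizon=10, n_candidates=500,
         action_dim=2, reg_weight=1.0):
    """Random-shooting planner with a DAE penalty (illustrative sketch).

    dynamics(s, a) -> next state   # learned forward model, assumed given
    reward(s, a)   -> float        # reward model, assumed given
    denoise(x)     -> array        # DAE trained on (state, action) pairs
    """
    best_score, best_actions = -np.inf, None
    for _ in range(n_candidates):
        actions = rng.normal(size=(horizon, action_dim))
        s, score = np.asarray(s0, dtype=float), 0.0
        for a in actions:
            x = np.concatenate([s, a])
            # A trained DAE barely changes in-distribution points, so the
            # reconstruction error penalizes out-of-distribution plans that
            # would otherwise exploit errors in the learned dynamics model.
            score += reward(s, a) - reg_weight * np.sum((denoise(x) - x) ** 2)
            s = dynamics(s, a)
        if score > best_score:
            best_score, best_actions = score, actions
    return best_actions
```
The intuition is that a trained DAE moves out-of-distribution points much further than in-distribution ones, so its reconstruction error acts as a proxy for the (negative) log-likelihood of the imagined trajectory.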
00:00:00 Intro to Harri and Curious AI, System 1/System 2
00:04:50 Background on model-based RL challenges from Tim
00:06:26 Other interesting research papers on model-based RL from Connor
00:08:36 Intro to Curious AI recent NeurIPS paper on model-based RL and denoising autoencoders from Yannic
00:21:00 Main show kick-off, System 1/2
00:31:50 Where does the simulator come from?
00:33:59 Evolutionary priors
00:37:17 Consciousness
00:40:37 How does one build a company like Curious AI?
00:46:42 Deep Q Networks
00:49:04 Planning and Model-based RL
00:53:04 Learning good representations
00:55:55 Typical problem Curious AI might solve in industry
01:00:56 Exploration
01:08:00 Their paper: Regularizing Trajectory Optimization with Denoising Autoencoders
01:13:47 What is epistemic uncertainty?
01:16:44 How would Curious AI develop these models?
01:18:00 Explainability and simulations
01:22:33 How system 2 works in humans
01:26:11 Planning
01:27:04 Advice for starting an AI company
01:31:31 Real world implementation of planning models
01:33:49 Publishing research and openness
We really hope you enjoy this episode. Please subscribe!
Regularizing Trajectory Optimization with Denoising Autoencoders: https://papers.nips.cc/paper/8552-regularizing-trajectory-optimization-with-denoising-autoencoders.pdf
Pulp, Paper & Packaging: A Future Transformed through Deep Learning: https://thecuriousaicompany.com/pulp-paper-packaging-a-future-transformed-through-deep-learning/
Curious AI: https://thecuriousaicompany.com/
Harri Valpola Publications: https://scholar.google.com/citations?user=1uT7-84AAAAJ&hl=en&oi=ao
Some interesting papers around Model-Based RL:
GameGAN: https://cdn.arstechnica.net/wp-content/uploads/2020/05/Nvidia_GameGAN_Research.pdf
Plan2Explore: https://ramanans1.github.io/plan2explore/
World Models: https://worldmodels.github.io/
MuZero: https://arxiv.org/pdf/1911.08265.pdf
PlaNet: A Deep Planning Network for RL: https://ai.googleblog.com/2019/02/introducing-planet-deep-planning.html
Dreamer: Scalable RL using World Models: https://ai.googleblog.com/2020/03/introducing-dreamer-scalable.html
Model Based RL for Atari: https://arxiv.org/pdf/1903.00374.pdf
Previous Episode

ICLR 2020: Yoshua Bengio and the Nature of Consciousness
In this episode of Machine Learning Street Talk, Tim Scarfe, Connor Shorten and Yannic Kilcher react to Yoshua Bengio’s ICLR 2020 keynote “Deep Learning Priors Associated with Conscious Processing”. Bengio takes on many future directions for research in Deep Learning, such as the role of attention in consciousness, sparse factor graphs and causality, and the study of systematic generalization. Bengio also presents big ideas in intelligence that sit on the border between philosophy and practical machine learning, including consciousness in machines and System 1 and System 2 thinking, as described in Daniel Kahneman’s book “Thinking, Fast and Slow”. Like Yann LeCun’s half of the 2020 ICLR keynote, this talk takes on many challenging ideas, and hopefully this video helps you get a better understanding of some of them. Thanks for watching!
Please Subscribe for more videos!
Paper Links:
Link to Talk: https://iclr.cc/virtual_2020/speaker_7.html
The Consciousness Prior: https://arxiv.org/abs/1709.08568
Thinking, Fast and Slow: https://www.amazon.com/Thinking-Fast-Slow-Daniel-Kahneman/dp/0374533555
Systematic Generalization: https://arxiv.org/abs/1811.12889
CLOSURE: Assessing Systematic Generalization of CLEVR Models: https://arxiv.org/abs/1912.05783
Neural Module Networks: https://arxiv.org/abs/1511.02799
Experience Grounds Language: https://arxiv.org/pdf/2004.10151.pdf
Benchmarking Graph Neural Networks: https://arxiv.org/pdf/2003.00982.pdf
On the Measure of Intelligence: https://arxiv.org/abs/1911.01547
Please check out our individual channels as well!
Machine Learning Dojo with Tim Scarfe: https://www.youtube.com/channel/UCXvHuBMbgJw67i5vrMBBobA
Yannic Kilcher: https://www.youtube.com/channel/UCZHmQk67mSJgfCCTn7xBfe
Henry AI Labs: https://www.youtube.com/channel/UCHB9VepY6kYvZjj0Bgxnpbw
00:00:00 Tim and Yannic's takes
00:01:37 Intro to Bengio
00:03:13 System 2, language and Chomsky
00:05:58 Christof Koch on consciousness
00:07:25 Francois Chollet on intelligence and consciousness
00:09:29 Meditation and Sam Harris on consciousness
00:11:35 Connor Intro
00:13:20 Show Main Intro
00:17:55 Priors associated with Conscious Processing
00:26:25 System 1 / System 2
00:42:47 Implicit and Verbalized Knowledge [DON'T MISS THIS!]
01:08:24 Inductive Priors for DL 2.0
01:27:20 Systematic Generalization
01:37:53 Contrast with the Symbolic AI Program
01:54:55 Attention
02:00:25 From Attention to Consciousness
02:05:31 Thoughts, Consciousness, Language
02:06:55 Sparse Factor graph
02:10:52 Sparse Change in Abstract Latent Space
02:15:10 Discovering Cause and Effect
02:20:00 Factorize the joint distribution
02:22:30 RIMs: Modular Computation
02:24:30 Conclusion
#machinelearning #deeplearning
Next Episode

One Shot and Metric Learning - Quadruplet Loss (Machine Learning Dojo)
*Note: this is an episode from Tim's Machine Learning Dojo YouTube channel.
Join Eric Craeymeersch for a wonderful discussion all about ML engineering, computer vision, siamese networks, contrastive loss, one-shot learning and metric learning.
00:00:00 Introduction
00:11:47 ML Engineering Discussion
00:35:59 Intro to the main topic
00:42:13 Siamese Networks
00:48:36 Mining strategies
00:51:15 Contrastive Loss
00:57:44 Triplet loss paper
01:09:35 Quadruplet loss paper
01:25:49 Eric's quadruplet loss Medium article
02:17:32 Metric learning reality check
02:21:06 Engineering discussion II
02:26:22 Outro
In our second paper review call, Tess Ferrandez covered the FaceNet paper from Google, a one-shot siamese network with the so-called triplet loss. It was an interesting change of direction for NN architectures, i.e. using a contrastive loss instead of a fixed number of output classes. Contrastive and self-supervised architectures have been taking over the ML landscape recently, e.g. SimCLR, MoCo and BERT.
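As a refresher for the discussion above, here is a minimal numpy sketch of the FaceNet-style triplet loss on a single (anchor, positive, negative) triple of embedding vectors; the squared-L2 distance and the margin value are illustrative choices, not the paper's exact training setup.
```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """FaceNet-style triplet loss on single embedding vectors (sketch).

    Encourages the anchor-positive distance to be smaller than the
    anchor-negative distance by at least `margin`.
    """
    d_pos = np.sum((anchor - positive) ** 2)  # squared L2 distance
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)
```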
Eric wrote an article about this at the time: https://medium.com/@crimy/one-shot-learning-siamese-networks-and-triplet-loss-with-keras-2885ed022352
He then discovered there was a new approach to one-shot learning in vision using a quadruplet loss and metric learning. Eric wrote a new article and ran several experiments on this: https://medium.com/@crimy/beyond-triplet-loss-one-shot-learning-experiments-with-quadruplet-loss-16671ed51290?source=friends_link&sk=bf41673664ad8a52e322380f2a456e8b
Paper details:
Beyond triplet loss: a deep quadruplet network for person re-identification
https://arxiv.org/abs/1704.01719 (Chen et al. '17)
"Person re-identification (ReID) is an important task in wide area video surveillance which focuses on identifying people across different cameras. Recently, deep learning networks with a triplet loss become a common framework for person ReID. However, the triplet loss pays main attentions on obtaining correct orders on the training set. It still suffers from a weaker generalization capability from the training set to the testing set, thus resulting in inferior performance. In this paper, we design a quadruplet loss, which can lead to the model output with a larger inter-class variation and a smaller intra-class variation compared to the triplet loss. As a result, our model has a better generalization ability and can achieve a higher performance on the testing set. In particular, a quadruplet deep network using a margin-based online hard negative mining is proposed based on the quadruplet loss for the person ReID. In extensive experiments, the proposed network outperforms most of the state-of-the-art algorithms on representative datasets which clearly demonstrates the effectiveness of our proposed method."
Original FaceNet paper:
https://arxiv.org/abs/1503.03832
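To make the quadruplet loss described in the quoted abstract concrete, here is a hedged numpy sketch of the loss on single embedding vectors. It keeps the usual triplet term and adds a second hinge comparing the anchor-positive distance against a negative pair that does not contain the anchor; the margin values are illustrative assumptions, not the paper's settings.
```python
import numpy as np

def quadruplet_loss(anchor, positive, negative1, negative2,
                    margin1=0.3, margin2=0.15):
    """Quadruplet loss in the style of Chen et al. '17 (sketch).

    anchor, positive : same identity
    negative1        : a different identity
    negative2        : a third identity, distinct from the other two
    """
    d_ap = np.sum((anchor - positive) ** 2)       # intra-class distance
    d_an = np.sum((anchor - negative1) ** 2)      # inter-class distance
    d_nn = np.sum((negative1 - negative2) ** 2)   # inter-class, anchor-free
    # Second hinge pushes intra-class distances below inter-class distances
    # even for pairs not involving the anchor, shrinking intra-class
    # variation relative to inter-class variation.
    return max(0.0, d_ap - d_an + margin1) + max(0.0, d_ap - d_nn + margin2)
```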
#deeplearning #machinelearning