
Moving NLP Forward With Transformer Models and Attention
08/12/22 • 50 min
What’s the big breakthrough for Natural Language Processing (NLP) that has dramatically advanced machine learning into deep learning? What makes these transformer models unique, and what defines “attention”? This week on the show, Jodie Burchell, developer advocate for data science at JetBrains, continues our talk about how machine learning (ML) models understand and generate text.
This episode is a continuation of the conversation in episode #119. Jodie builds on the concepts of bag-of-words, word2vec, and simple embedding models. We talk about the breakthrough mechanism called “attention,” which allows for parallelization in building models.
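The core of that mechanism can be sketched in a few lines of NumPy. This is an illustrative toy, not code from the episode (the function name and embeddings are invented): scaled dot-product attention, as described in the “Attention Is All You Need” paper linked below, scores every token against every other token in a single matrix multiplication, which is why the computation parallelizes so well.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Toy scaled dot-product attention over one sequence."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)       # every query scored against every key at once
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V                    # each output is a weighted mix of all values

# Three toy token embeddings of dimension 4.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one context-aware vector per token
```

All rows are processed in one shot, unlike a recurrent network, which must walk the sequence token by token.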
We also discuss the two major transformer models, BERT and GPT-3. Jodie continues to share multiple resources to help you keep exploring modeling and NLP with Python.
Course Spotlight: Building a Neural Network & Making Predictions With Python AI
In this step-by-step course, you’ll build a neural network from scratch as an introduction to the world of artificial intelligence (AI) in Python. You’ll learn how to train your neural network and make predictions based on a given dataset.
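As a taste of what the course builds toward, here is a minimal single-neuron prediction in NumPy. The inputs, weights, and bias are invented for illustration; the course develops its own network step by step.

```python
import numpy as np

def sigmoid(x):
    """Squash any real number into the (0, 1) range."""
    return 1 / (1 + np.exp(-x))

def predict(inputs, weights, bias):
    """One neuron: weighted sum of inputs, passed through a sigmoid."""
    return sigmoid(np.dot(inputs, weights) + bias)

inputs = np.array([1.5, 0.8])
weights = np.array([1.2, -0.6])
bias = 0.1

prediction = predict(inputs, weights, bias)
print(round(float(prediction), 3))
```

Training then amounts to nudging `weights` and `bias` to reduce the error of predictions like this one.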
Topics:
- 00:00:00 – Introduction
- 00:02:20 – Where we left off with word2vec...
- 00:03:35 – Example of losing context
- 00:06:50 – Working at scale and adding attention
- 00:12:34 – Multiple levels of training for the model
- 00:14:10 – Attention is the basis for transformer models
- 00:15:07 – BERT (Bidirectional Encoder Representations from Transformers)
- 00:16:29 – GPT (Generative Pre-trained Transformer)
- 00:19:08 – Video Course Spotlight
- 00:20:08 – How far have we moved forward?
- 00:20:41 – Access to GPT-2 via Hugging Face
- 00:23:56 – How to access and use these models?
- 00:30:42 – Cost of training GPT-3
- 00:35:01 – Resources to practice and learn with BERT
- 00:38:19 – GPT-3 and GitHub Copilot
- 00:44:35 – DALL-E is a transformer
- 00:46:13 – Help yourself to the show notes!
- 00:49:19 – How can people follow your work?
- 00:50:03 – Thanks and goodbye
Show Links:
- Recurrent neural network - Wikipedia
- Long short-term memory - Wikipedia
- Vanishing gradient problem - Wikipedia
- Vanishing Gradient Problem | What is Vanishing Gradient Problem?
- Attention Is All You Need | Cornell University
- Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention) – Jay Alammar
- Standing on the Shoulders of Giant Frozen Language Models | Cornell University
- #datalift22 Embeddings paradigm shift: Model training to vector similarity search by Nava Levy - YouTube
- Transformer Neural Networks - EXPLAINED! (Attention is all you need) - YouTube
- BERT 101 - State Of The Art NLP Model Explained
- How GPT3 Works - Easily Explained with Animations - YouTube
- Write With Transformer (GPT2 Live Playground Tool) - Hugging Face
- Language Model with Alpa (OPT-175B, GPT-3-Style Live Playground Tool)
- Big Data | Music
- OpenAI API
- 🤗 Transformers Notebooks (Hugging Face)
- GitHub Copilot: Fly With Python at the Speed of Thought
- GitHub Copilot learned about the daily struggle of JavaScript developers after being trained on billions of lines of code. | Marek Sotak on Twitter
- Jodie Burchell’s Blog - Standard error
- Jodie Burchell 🇦🇺🇩🇪 (@t_redactyl) / Twitter
- JetBrains: Essential tools for software ...
Previous Episode

Inspiring Young People to Learn Python With Mission Encodeable
Is there someone in your life you’d like to inspire to learn Python? Mission Encodeable is a website designed to teach people to code, built by two high-school students. This week on the show, Anna and Harry Wake talk about creating their site and motivating people to start coding.
We discuss why they decided to build the site. Anna and Harry initially felt that the site would be for other students but soon realized it could be helpful for anyone interested in starting to code in Python. We cover the project-based approach and how they implemented the interactive browser-based tool replit.com.
We talk about learning Python in the classroom and how they found additional books and tutorials to supplement their coding education. Anna and Harry also created a resource hub to help teachers take advantage of the site.
Course Spotlight: Rock, Paper, Scissors With Python: A Command Line Game
In this course, you’ll learn to program rock paper scissors in Python from scratch. You’ll learn how to take in user input, make the computer choose a random action, determine a winner, and split your code into functions.
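The winner-determination step described above can be sketched like this (a minimal illustration, not the course’s actual code; names such as `BEATS` are mine):

```python
import random

ACTIONS = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def determine_winner(user, computer):
    """Return who wins a single round."""
    if user == computer:
        return "tie"
    return "user" if BEATS[user] == computer else "computer"

computer_action = random.choice(ACTIONS)  # the computer picks at random
print(determine_winner("rock", computer_action))
```

Splitting the game into small functions like `determine_winner` is what makes the code easy to test and extend.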
Topics:
- 00:00:00 – Introduction
- 00:02:17 – Personal backgrounds
- 00:02:51 – What’s the goal for the site?
- 00:03:54 – How did you come up with the idea?
- 00:05:08 – Where have you shared it?
- 00:06:39 – Projects for each level
- 00:09:28 – How has the response been?
- 00:10:10 – Using replit
- 00:12:56 – Sponsor: CData Software
- 00:13:37 – Design of the site and other tools to create it
- 00:15:49 – Learning Python and classes at school
- 00:17:41 – Did remote school inspire more online exploration?
- 00:19:16 – Myths of how kids learn programming
- 00:23:32 – More about projects
- 00:27:57 – Video Course Spotlight
- 00:29:27 – What other areas of Python do you want to explore?
- 00:33:08 – Teachers using the site
- 00:37:11 – What other resources have you used to learn Python?
- 00:38:52 – What are you excited about in the world of Python?
- 00:40:01 – What do you want to learn next?
- 00:42:06 – Thanks and goodbye
Show Links:
- Mission Encodeable | Free coding tutorials for young people
- Replit - The collaborative browser based IDE
- Make Your First Python Game: Rock, Paper, Scissors! – Real Python
- Figma: the collaborative interface design tool.
- React – A JavaScript library for building user interfaces
- Coding with Minecraft - Al Sweigart
- The Recursive Book of Recursion - Al Sweigart
- Codewars - Achieve mastery through coding practice and developer mentorship
- Advent of Code 2021
- Object-Oriented Programming (OOP) in Python 3 – Real Python
- Python Arcade
- Craig’n’Dave “Unscripted” - Mission Encodeable - YouTube
- Hello World issue 19 — Hello World
- LearningDust: LearningDust 3.15 - Anna & Harry Wake
- Teaching Python Episode 93: Mission Encodeable
- Mission Encodeable (@missionencode) / Twitter
Next Episode

Configuring a Coding Environment on Windows & Using TOML With Python
Have you attempted to set up a Python development environment on Windows before? Would it be helpful to have an easy-to-follow guide to get you started? This week on the show, Christopher Trudeau is here, bringing another batch of PyCoder’s Weekly articles and projects.
We talk about a Real Python tutorial that covers configuring a Windows coding environment. The guide contains valuable suggestions, best practices, and powerful coding tools. It also covers how to use a package manager, the new Windows Terminal, PowerShell Core, and a program to manage multiple versions of Python.
Christopher covers another Real Python tutorial about using TOML in Python. TOML is a configuration file format, used by Python projects in pyproject.toml for building and distributing packages. We discuss how TOML parsing will be added to Python’s standard library in version 3.11.
We cover several other articles and projects from the Python community, on topics including technical writing for developers, a news round-up, a farewell to obsolete Python libraries, uncommon uses of Python in commonly used libraries, a prettier ls, and a project for advanced hot reloading in Python.
Course Spotlight: Python Basics: Finding and Fixing Code Bugs
In this Python Basics video course, you’ll learn how to identify and fix logic errors, or bugs, in your Python code. You’ll use the built-in debugging tools in Python’s Integrated Development and Learning Environment to practice locating and resolving bugs in an example function.
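To give a flavor of the kind of logic error such a course tackles, here is a hypothetical buggy function and its fix (the example is invented for illustration, not taken from the course):

```python
def mean_buggy(values):
    """Buggy mean: the loop range skips the final element."""
    total = 0
    for i in range(len(values) - 1):  # bug: should be range(len(values))
        total += values[i]
    return total / len(values)

def mean_fixed(values):
    """Correct mean of a non-empty sequence."""
    return sum(values) / len(values)

print(mean_buggy([2, 4, 6]), mean_fixed([2, 4, 6]))  # 2.0 4.0
```

Stepping through `mean_buggy` in a debugger makes the dropped final iteration immediately visible.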
Topics:
- 00:00:00 – Introduction
- 00:02:13 – Python 3.10.6 Released
- 00:02:41 – Python 3.11.0rc1 Released
- 00:03:13 – Django 4.1 Released
- 00:04:07 – 10 malicious Python packages exposed in latest repository attack
- 00:05:12 – Protestware: Why Developers Sabotage Their Own Code
- 00:06:41 – Python and TOML: New Best Friends
- 00:16:19 – Say Goodbye to These Obsolete Python Libraries
- 00:25:51 – Video Course Spotlight
- 00:27:26 – Uncommon Uses of Python in Commonly Used Libraries
- 00:37:56 – Your Python Coding Environment on Windows: Setup Guide
- 00:48:20 – Technical Writing for Developers
- 00:55:24 – reloadium: Advanced Hot Reloading for Python
- 00:58:07 – pls: A Prettier ‘ls’
- 01:00:56 – Thanks and goodbye
News:
- Python 3.10.6 Released
- Python 3.11.0rc1 Released
- Django 4.1 Released
- 10 malicious Python packages exposed in latest repository attack | Ars Technica
- Protestware: Why Developers Sabotage Their Own Code
Topic Links:
- Python and TOML: New Best Friends – TOML is a configuration file format that’s becoming increasingly popular in the Python community. In this tutorial, you’ll learn the syntax of TOML and explore how you can work with TOML files in your own projects.
- Say Goodbye to These Obsolete Python Libraries – It’s time to say goodbye to os.path, random, pytz, namedtuple and many more obsolete Python libraries. Start using the latest and greatest ones instead.
- Uncommon Uses of Python in Commonly Used Libraries – To learn more about writing maintainable Python, Eugene has been reading code from some of the more popular Python libraries. This blog post talks about some of the coding patterns he has encountered along the way.
- Your Python Coding Environment on Windows: Setup Guide – With this opinionated guide to setting up a basic, fully featured, and flexible setup for Python coding and contributing to open-source projects when working from Windows, you’ll go from a fresh install to ready to contribute, and even check out a PowerShell script to automate much of the process.
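The standard-library replacements that the obsolete-libraries article argues for can be sketched together in one snippet (the specific paths, values, and class here are illustrative):

```python
from dataclasses import dataclass   # instead of collections.namedtuple
from datetime import datetime
from pathlib import Path            # instead of os.path
from zoneinfo import ZoneInfo       # instead of pytz (Python 3.9+)
import secrets                      # instead of random for security-sensitive values

path = Path("/tmp") / "notes.txt"   # replaces os.path.join("/tmp", "notes.txt")
token = secrets.token_hex(8)        # cryptographically strong random token
berlin = datetime(2022, 8, 12, tzinfo=ZoneInfo("Europe/Berlin"))

@dataclass
class Point:                        # replaces namedtuple("Point", "x y")
    x: float
    y: float

print(path.name, len(token), berlin.tzname(), Point(x=1.0, y=2.0))
```

Each newer tool folds in behavior the older one needed workarounds for, such as `zoneinfo` following the system time zone database directly instead of pytz’s `localize()` dance.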
Discussion:
- Technical Writing for Developers – “The way we write about and around code is arguably as important as the code itself.” This article outlines how programming and writing come together to take your developer skills to the next level.
Projects:
- reloadium: Advanced Hot Reloading for Python
- pls: A Prettier ‘ls’