PyTorch Developer Podcast

Edward Yang, Team PyTorch

The PyTorch Developer Podcast is a place for the PyTorch dev team to do bite-sized (10-20 min) episodes about all sorts of internal development topics in PyTorch.

Top 10 PyTorch Developer Podcast Episodes

Goodpods has curated a list of the 10 best PyTorch Developer Podcast episodes, ranked by the number of listens and likes each episode has garnered from our listeners. If you are listening to PyTorch Developer Podcast for the first time, there's no better place to start than with one of these standout episodes. If you are a fan of the show, vote for your favorite PyTorch Developer Podcast episode by adding your comments to the episode page.

Inductor - Post-grad FX passes

04/12/24 • 24 min

The post-grad FX passes in Inductor run after AOTAutograd has functionalized and normalized the input program into separate forward/backward graphs. As such, they can generally assume that the graph in question is functionalized, except for some mutations to inputs at the end of the graph. At the end of the post-grad passes, special passes reintroduce mutation into the graph before it goes into the rest of Inductor lowering, which is generally aware of mutation. The post-grad FX passes are varied, but they are typically domain-specific passes making local changes to specific parts of the graph.
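For flavor, here is a hedged sketch of what such a local pass can look like. The pattern being rewritten (folding a double negation) is made up for illustration and is not one of Inductor's actual passes; the torch.fx APIs are real:

```python
# Illustrative local FX pass: fold torch.neg(torch.neg(x)) into x.
# The pattern is hypothetical. A functionalized graph means we can
# match patterns without worrying about aliasing or mutation between
# the two ops.
import torch
import torch.fx

def fold_double_neg(gm: torch.fx.GraphModule) -> torch.fx.GraphModule:
    for node in list(gm.graph.nodes):
        if (node.op == "call_function" and node.target is torch.neg
                and isinstance(node.args[0], torch.fx.Node)
                and node.args[0].op == "call_function"
                and node.args[0].target is torch.neg):
            node.replace_all_uses_with(node.args[0].args[0])
            gm.graph.erase_node(node)
    gm.graph.eliminate_dead_code()  # removes the now-unused inner neg
    gm.recompile()
    return gm

def f(x):
    return torch.neg(torch.neg(x)) + 1

gm = torch.fx.symbolic_trace(f)
fold_double_neg(gm)
print(gm.code)  # both negs are gone; only x + 1 remains
```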

CUDA graph trees

03/24/24 • 20 min

CUDA graph trees are the internal implementation of CUDA graphs used in PT2 when you say mode="reduce-overhead". Their primary innovation is that they allow memory to be reused across multiple CUDA graphs, as long as those graphs form a tree structure of potential paths you can go down with the CUDA graph. This greatly reduces the memory usage of CUDA graphs in PT2. There are some operational implications to using CUDA graphs, which are described in the podcast.
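Turning this on requires nothing beyond the compile mode; a minimal usage sketch (requires a CUDA device):

```python
# Minimal usage of CUDA graph trees: they are an internal detail of
# mode="reduce-overhead"; you never construct them directly.
import torch

model = torch.nn.Linear(64, 64).cuda()
compiled = torch.compile(model, mode="reduce-overhead")

x = torch.randn(32, 64, device="cuda")
for _ in range(3):  # early iterations warm up and record the CUDA graph
    out = compiled(x)
torch.cuda.synchronize()
print(out.shape)
```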

Min-cut partitioner

03/17/24 • 15 min

The min-cut partitioner makes decisions about what to save for backwards when splitting the forward and backwards graph from the joint graph traced by AOTAutograd. Crucially, it doesn't actually do a "split"; instead, it decides how much of the joint graph should be used for backwards. I also talk about the backward retracing problem.
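The actual partitioner operates on the AOTAutograd joint graph with real tensor sizes and recomputation costs, but the core framing can be sketched as a toy flow network (networkx is used here purely for illustration, not by PyTorch):

```python
# Toy illustration of the min-cut framing (not PyTorch's implementation):
# decide which intermediates to save for backwards by cutting the joint
# graph between the forward inputs and the loss, with edge capacities
# standing in for tensor sizes.
import networkx as nx

G = nx.DiGraph()
G.add_edge("input", "a", capacity=1024)  # large activation
G.add_edge("a", "b", capacity=16)        # small bottleneck
G.add_edge("b", "c", capacity=1024)      # large again
G.add_edge("c", "loss", capacity=1024)

cut_cost, (fwd_side, bwd_side) = nx.minimum_cut(G, "input", "loss")
saved = [(u, v) for u in fwd_side for v in G[u] if v in bwd_side]
print(cut_cost)  # 16: cheapest to save only the bottleneck
print(saved)     # [('a', 'b')]; everything past it is recomputed
```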

TH

06/16/21 • 11 min

What is TH? Why might you care? What is so horrible about it? What the heck is the generic/ folder? Why are we porting everything to C++? What are some downsides of having ported all our TH code to C++?

Further reading.

  • The TH to ATen porting guide has lots of explanations of old school TH idioms https://github.com/pytorch/pytorch/wiki/TH-to-ATen-porting-guide
  • Old notes about refcounting in TH https://github.com/pytorch/pytorch/blob/master/aten/src/README.md

TorchScript

06/15/21 • 19 min

There is a really good TorchScript overview at https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/OVERVIEW.md, and in this 20 min podcast I want to give you some of the highlights from that document.
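For orientation, a minimal TorchScript example; torch.jit.script compiles the function into the IR that the overview document describes:

```python
# torch.jit.script compiles Python (including control flow) into
# TorchScript IR rather than tracing one execution path.
import torch

@torch.jit.script
def clipped_sum(x: torch.Tensor, threshold: float) -> torch.Tensor:
    if x.max() > threshold:  # the branch is preserved in the IR
        x = x.clamp(max=threshold)
    return x.sum()

print(clipped_sum(torch.randn(4), 1.0))
print(clipped_sum.graph)  # inspect the TorchScript IR
```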


CMake

06/14/21 • 17 min

Why is PyTorch's build so g-dang complicated? How can you avoid having to deal with cmake at all? If you do have to deal with cmake, what are the most important things to know? And if you were going to improve our cmake, how would you go about doing it?

Liner notes.

  • multiple build systems: cmake, buck, xplat buck, ovrsource buck, bazel
    • tools/build_variables.bzl is read from cmake! append_filelist (see the sketch after these notes)
      • but not used uniformly for all components! (ouch!)
  • mashed together ATen and Caffe2 build systems (e.g., main library libtorch_cpu is defined in caffe2/CMakeLists.txt)
  • cmake: not very much syntax, "everything is a function". This means you can look up constructs relatively easily; e.g., even if() is a command
  • the general cmake model: "set a bunch of variables, run a bunch of commands". cmake is VERY GREPPABLE
    • but not everything is in CMakeLists.txt; check *.cmake too
    • the directory structure makes no sense, you really need to grep.
      (doing a lot of set PARENT_SCOPE to propagate stuff)
    • renaming a file? grep for it
    • primary hazard of refactoring: need to make sure all the variables
      are set up at the new location
  • many directories are not recursively globbed; beware when adding new directories
  • old school cmake: literally everything is stuffed in variables (CMAKE_CXX_FLAGS). new school cmake: attach things to targets, things propagate when you depend on targets (public/private dependencies)
  • add_library: the most important thing
  • don't randomly change things and pray: have hypotheses and test them
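To make the build_variables.bzl note concrete: the filelists are plain Starlark (Python-syntax) lists shared by all the build systems. A hypothetical sketch of the shape (the variable name and entries are illustrative, not copied from the real file):

```python
# Hypothetical sketch of a filelist in tools/build_variables.bzl
# (Starlark is Python syntax; this variable name is made up).
# cmake pulls lists like this in via append_filelist, and the Buck/
# Bazel builds read the same lists, so a new source file ideally
# only needs to be registered in one place.
example_jit_sources = [
    "torch/csrc/jit/frontend/lexer.cpp",
    "torch/csrc/jit/ir/ir.cpp",
    "torch/csrc/jit/runtime/interpreter.cpp",
]
```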

Code generation

06/04/21 • 16 min

Why does PyTorch use code generation as part of its build process? Why doesn't it use C++ templates? What things is code generation used for? What are the pros/cons of using code generation? What are some other ways to do the same things we currently do with code generation? (A toy sketch of the schema-driven style follows the outline below.)

Outline:

  • High level: reduce the amount of code in PyTorch, easier to develop
  • Strongly typed python
  • Stuff we're using codegen for
    • Meta point: stuff c++ metaprogramming can't do
    • C++ APIs (functions, methods on classes)
      • Especially for forwarding (operator dot doko)
      • Prototypes for c++ to implement
    • YAML files used by external frameworks for binding (accidental)
    • Python arg parsing
    • pyi generation
    • Autograd classes for saving saved data
    • Otherwise complicated constexpr computation (e.g., parsing JIT
      schema)
  • Pros
    • Better surface syntax (native_functions.yaml, jit schema,
      derivatives.yaml)
    • Better error messages (template messages famously bad)
    • Easier to organize complicated code; esp nontrivial input
      data structure
    • Easier to debug by looking at generated code
  • Cons
    • Not as portable (template can be used by anyone)
    • Less good modeling for C++ type based metaprogramming (we've replicated a crappy version of C++ type system in our codegen)
  • Counterpoints in the design space
    • C++ templates: just as efficient
    • Boxed fallback: simpler, less efficient
  • Open question: can you have best of both worlds, e.g., with partially evaluated interpreters?
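As promised above, a toy sketch of the schema-driven style: turn a simplified, native_functions.yaml-flavored schema string into a C++ prototype. The grammar and type table here are drastically simplified and hypothetical; PyTorch's real codegen handles far more:

```python
# Toy schema-driven codegen (hypothetical grammar, not PyTorch's real
# one): parse "name(Type arg, ...) -> Type" and emit a C++ prototype.
import re

CPP_TYPES = {"Tensor": "at::Tensor", "Scalar": "at::Scalar", "int": "int64_t"}

def cpp_prototype(schema: str) -> str:
    name, args, ret = re.match(r"(\w+)\((.*)\) -> (\w+)", schema).groups()
    params = []
    for arg in args.split(", "):
        if arg == "*":  # marker: remaining arguments are keyword-only
            continue
        ty, ident = arg.split(" ")[:2]
        default = ""
        if "=" in ident:
            ident, default_val = ident.split("=")
            default = f" = {default_val}"
        params.append(f"const {CPP_TYPES[ty]}& {ident}{default}")
    return f"{CPP_TYPES[ret]} {name}({', '.join(params)});"

print(cpp_prototype("add(Tensor self, Tensor other, *, Scalar alpha=1) -> Tensor"))
# at::Tensor add(const at::Tensor& self, const at::Tensor& other,
#                const at::Scalar& alpha = 1);
```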

Why is autograd so complicated

06/03/21 • 15 min

Why is autograd so complicated? What are the constraints and features that go into making it complicated? What's up with it being written in C++? What's with derivatives.yaml and code generation? What's going on with views and mutation? What's up with hooks and anomaly mode? What's reentrant execution? Why is it relevant to checkpointing? What's the distributed autograd engine?
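Two of the features mentioned (hooks and anomaly mode) are easy to demo from the public API; a minimal sketch:

```python
# Minimal demo of two autograd features: tensor hooks and anomaly mode.
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)
x.register_hook(lambda grad: print("grad wrt x:", grad))  # runs per backward

y = (x * x).sum()
y.backward()  # hook prints: grad wrt x: tensor([4., 6.])

# Anomaly mode records forward stack traces so that a NaN produced
# during backward can be attributed to the forward op responsible.
with torch.autograd.detect_anomaly():
    z = x.sqrt().sum()
    z.backward()  # hook fires again; grads accumulate into x.grad
```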


__torch_function__

06/02/21 • 17 min

What is __torch_function__? Why would I want to use it? What does it have to do with keeping extra metadata on Tensors or torch.fx? How is it implemented? Why is __torch_function__ a really popular way of extending functionality in PyTorch? What makes it different from the dispatcher extensibility mechanism? What are some downsides of it being written this way? What are we doing about it?
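A minimal example of the protocol; the wrapper class here is illustrative (it shows the standard metadata-carrying use case), while __torch_function__ itself is the real public hook:

```python
# A non-Tensor wrapper that carries extra metadata through torch.*
# calls via __torch_function__. The class is illustrative; the
# protocol is real.
import torch

class TaggedTensor:
    def __init__(self, data, tag):
        self.data, self.tag = data, tag

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        # Unwrap arguments, run the real op, re-wrap with the first tag.
        tags = [a.tag for a in args if isinstance(a, TaggedTensor)]
        plain = [a.data if isinstance(a, TaggedTensor) else a for a in args]
        return TaggedTensor(func(*plain, **kwargs), tags[0])

t = TaggedTensor(torch.ones(2), tag="from-loader")
out = torch.add(t, torch.ones(2))
print(out.tag, out.data)  # from-loader tensor([2., 2.])
```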


Higher order operators

04/21/24 • 17 min

Higher order operators are a special form of operators in torch.ops which have relaxed input argument requirements: in particular, they can accept any form of argument, including Python callables. Their name is based on their most common use case, which is to represent higher order functions like control flow operators. However, they are also used to implement other variants of basic operators, and can be used to smuggle in Python data that is quite unusual. They are implemented using a Python dispatcher.
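The flagship example is torch.cond, which takes Python callables as branch arguments (exposed as torch.cond in recent PyTorch releases; check your version):

```python
# torch.cond is a higher order operator: its true/false branches are
# Python callables, and torch.compile keeps both branches in the
# graph instead of specializing on the data-dependent predicate.
import torch

def true_fn(x):
    return x.sin()

def false_fn(x):
    return x.cos()

@torch.compile
def f(x):
    return torch.cond(x.sum() > 0, true_fn, false_fn, (x,))

print(f(torch.ones(3)))   # sin branch
print(f(-torch.ones(3)))  # cos branch
```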

FAQ

How many episodes does PyTorch Developer Podcast have?

PyTorch Developer Podcast currently has 83 episodes available.

What topics does PyTorch Developer Podcast cover?

The podcast is about Deep Learning, Podcasts, Technology and Machine Learning.

What is the most popular episode on PyTorch Developer Podcast?

The episode title 'Higher order operators' is the most popular.

What is the average episode length on PyTorch Developer Podcast?

The average episode length on PyTorch Developer Podcast is 16 minutes.

How often are episodes of PyTorch Developer Podcast released?

Episodes of PyTorch Developer Podcast are typically released every 3 days.

When was the first episode of PyTorch Developer Podcast?

The first episode of PyTorch Developer Podcast was released on May 4, 2021.
