![PyTorch Developer Podcast artwork](https://is3-ssl.mzstatic.com/image/thumb/Podcasts115/v4/5d/4e/01/5d4e0127-9482-b8e3-6f59-59eaf50a21d9/mza_11274935194810526674.jpg/100x100bb.jpg)
Batching
PyTorch Developer Podcast
English - August 18, 2021 13:00 - 13 minutes - 12.5 MB - ★★★★★ - 35 ratingsTechnology deep learning machine learning pytorch Homepage Download Apple Podcasts Google Podcasts Overcast Castro Pocket Casts RSS feed
Previous Episode: Multiple dispatch in __torch_function__
Next Episode: DataLoader with multiple workers leaks memory
PyTorch operates on its input data in a batched manner, typically processing multiple batches of an input at once (rather than once at a time, as would be the case in typical programming). In this podcast, we talk a little about the implications of batching operations in this way, and then also about how PyTorch's API is structured for batching (hint: poorly) and how Numpy introduced a concept of ufunc/gufuncs to standardize over broadcasting and batching behavior. There is some overlap between this podcast and previous podcasts about TensorIterator and vmap; you may also be interested in those episodes.
Further reading.
ufuncs and gufuncs https://numpy.org/doc/stable/reference/ufuncs.html and https://numpy.org/doc/stable/reference/c-api/generalized-ufuncs.htmlA brief taxonomy of PyTorch operators by shape behavior http://blog.ezyang.com/2020/05/a-brief-taxonomy-of-pytorch-operators-by-shape-behavior/Related episodes on TensorIterator and vmap https://pytorch-dev-podcast.simplecast.com/episodes/tensoriterator and https://pytorch-dev-podcast.simplecast.com/episodes/vmap