Today, Shen Li (mrshenli) joins me to talk about distributed computation in PyTorch. What is distributed? What kinds of things go into making distributed work in PyTorch? What's up with all of the optimizations people want to do here?

Further reading.

PyTorch distributed overview https://pytorch.org/tutorials/beginner/dist_overview.html
Distributed data parallel https://pytorch.org/docs/stable/notes/ddp.html