Double backwards is PyTorch's way of implementing higher-order differentiation. Why might you want it? How does it work? What are some of the weird things that happen when you use it?
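To make the idea concrete, here is a minimal sketch of taking a second derivative, assuming the standard `torch.autograd.grad` API. The function `y = x ** 3` and the variable names are illustrative choices, not taken from the episode. Passing `create_graph=True` asks autograd to record the backward computation itself, so the resulting gradient can be differentiated again.

```python
import torch

# Illustrative input: a scalar at which we differentiate y = x^3.
x = torch.tensor(3.0, requires_grad=True)
y = x ** 3

# First derivative dy/dx = 3 * x^2.
# create_graph=True records the backward pass so it is itself differentiable.
(dy_dx,) = torch.autograd.grad(y, x, create_graph=True)

# Second derivative d2y/dx2 = 6 * x, obtained by differentiating
# through the first backward pass (the "double backward").
(d2y_dx2,) = torch.autograd.grad(dy_dx, x)

first = dy_dx.item()
second = d2y_dx2.item()
print(first, second)
```

Without `create_graph=True`, the first gradient would be returned detached from the graph, and the second call to `torch.autograd.grad` would raise an error because there would be nothing left to differentiate.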

Further reading:

The epic PR that initially added double backwards support for convolution: https://github.com/pytorch/pytorch/pull/1643