March 14, 2025
PyTorch Internals — Quick Reference
Notes on autograd, tensors, and the PyTorch execution model.
Tensor basics
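import torch
x = torch.randn(3, 4, requires_grad=True)  # leaf tensor tracked by autograd
y = (x ** 2).sum()  # scalar result, so backward() needs no gradient argument
y.backward()  # fills x.grad with dy/dx
print(x.grad)  # equals 2 * x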
Autograd
- Operations build a computational graph dynamically
- .backward() traverses the graph in reverse
- torch.no_grad() disables gradient tracking for inference
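The three points above can be sketched as follows. This is a minimal illustration using standard torch calls; the tensor values and variable names are arbitrary choices, not part of the original notes.

```python
import torch

# Each operation on a tracked tensor records a backward node in the graph.
a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
b = a * 2      # b.grad_fn points at the backward node for the multiply
c = b.sum()    # c.grad_fn points at the backward node for the sum

# backward() walks the graph from c back to the leaf tensor a.
c.backward()
grad_a = a.grad.clone()  # dc/da is 2 for every element

# Inside no_grad() no graph is recorded, so the result is untracked.
with torch.no_grad():
    d = a * 2

print(b.grad_fn is not None, d.requires_grad, grad_a)
```

Because d was computed under torch.no_grad(), it has no grad_fn and calling backward() through it is not possible, which is the intended behavior for inference.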
Device placement
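# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Move the parameters and buffers of model (any nn.Module) to that device
model = model.to(device)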
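A self-contained version of the snippet above might look like this. The nn.Linear model and the input shape are hypothetical stand-ins chosen for illustration; the point is that the inputs are created on the same device as the model.

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in model; any nn.Module is moved the same way.
model = nn.Linear(4, 2).to(device)

# Inputs must live on the same device as the model's parameters.
inputs = torch.randn(8, 4, device=device)
outputs = model(inputs)

print(outputs.shape, outputs.device)
```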
Common pitfalls
- Forgetting .zero_grad() between steps (use optimizer.zero_grad())
- Mixing CPU and GPU tensors
- Not calling model.eval() during inference (affects BatchNorm, Dropout)
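One way to avoid all three pitfalls in a single loop is sketched below. The toy model, random data, learning rate, and step count are assumptions made up for this example, not values from the notes.

```python
import torch
from torch import nn

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy model and data, chosen only for illustration.
model = nn.Sequential(
    nn.Linear(4, 8), nn.ReLU(), nn.Dropout(0.5), nn.Linear(8, 1)
).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Create the data on the model's device to avoid mixing CPU and GPU tensors.
inputs = torch.randn(16, 4, device=device)
targets = torch.randn(16, 1, device=device)

model.train()
for _ in range(3):
    optimizer.zero_grad()  # clear gradients left over from the previous step
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()

model.eval()  # Dropout is a no-op in eval mode
with torch.no_grad():
    first = model(inputs)
    second = model(inputs)

print(torch.equal(first, second))
```

Without the optimizer.zero_grad() call, each backward() would add to the gradients already stored in .grad, so every step would use an accumulated sum rather than the current gradient. Without model.eval(), Dropout would keep randomly zeroing activations and the two inference passes would generally differ.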