Blog

Essays and notes on AI, GPUs, and software I am building.

May 11, 2025

A practical walkthrough of how Flash Attention reduces memory traffic and speeds up transformer training.

Apr 27, 2025

Notes from learning the CUDA memory hierarchy and occupancy, and from writing my first custom kernels.