Build intuition from absolute zero. Every concept from first principles, with interactive simulations, step-by-step math, and quizzes. No prerequisites beyond curiosity.
From absolute zero to understanding every line of Karpathy's 243-line GPT.
Self-attention, multi-head, KV cache, MoE — the architecture behind everything.
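The core of self-attention fits in a few lines. A minimal sketch in plain Python — one head, no learned projections, illustrative only:

```python
import math

def softmax(xs):
    m = max(xs)                       # subtract max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(Q, K, V):
    """Scaled dot-product attention for one head.
    Q, K, V: lists of vectors, one per token."""
    d = len(Q[0])
    out = []
    for q in Q:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)           # attention weights sum to 1
        # output is a weighted average of the value vectors
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out
```

Multi-head attention runs several of these in parallel on learned projections of Q, K, V and concatenates the results.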
Linear recurrence, selective scan — the O(n) alternative to attention.
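The O(n) idea is a single recurrence, h_t = a_t·h_{t-1} + b_t·x_t, scanned left to right. A toy sketch with a scalar state (real state-space models use vector states and input-dependent a_t, b_t):

```python
def linear_scan(a, b, x):
    """Linear recurrence h_t = a_t * h_{t-1} + b_t * x_t.
    One pass over the sequence: O(n), versus attention's O(n^2)."""
    h, out = 0.0, []
    for at, bt, xt in zip(a, b, x):
        h = at * h + bt * xt   # decay old state, mix in the new input
        out.append(h)
    return out
```

"Selective" scan means a_t and b_t are themselves computed from x_t, letting the model choose what to remember and what to forget.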
Add noise, learn to reverse it. The dominant generative paradigm.
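The forward "add noise" half has a closed form: x_t = √(ᾱ_t)·x₀ + √(1−ᾱ_t)·ε. A minimal sketch with an illustrative linear beta schedule (the numbers are conventional defaults, not canonical):

```python
import math, random

rng = random.Random(0)

# Linear beta schedule; alpha_bar[t] is the surviving signal fraction.
T = 100
betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]
alpha_bar, prod = [], 1.0
for b in betas:
    prod *= 1.0 - b
    alpha_bar.append(prod)

def forward_noise(x0, t):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    a = alpha_bar[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for x in x0]
```

Training teaches a network to predict the noise ε from x_t and t; sampling runs the schedule backwards, subtracting predicted noise step by step.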
Straight paths from noise to data — simpler to train and faster to sample than diffusion.
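A "straight path" is literal: interpolate linearly between a noise sample x₀ and a data point x₁, and train a network to predict the constant velocity x₁ − x₀. A rectified-flow-style sketch of one training tuple (illustrative, scalar-free-form):

```python
import random

rng = random.Random(0)

def flow_training_example(x1):
    """One flow-matching training tuple: a point x_t on the straight
    line from noise x0 to data x1, plus the target velocity x1 - x0."""
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]        # noise endpoint
    t = rng.random()                               # random time in [0, 1]
    xt = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    v = [b - a for a, b in zip(x0, x1)]            # what the network regresses
    return x0, xt, t, v
```

Sampling then just integrates the learned velocity field from noise toward data — straight paths need far fewer steps than a diffusion reverse process.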
The secret plumbing — variational inference, codebooks, tokenization.
Generator vs discriminator — the adversarial game.
The glue connecting images and text in a shared space.
Teaching language models to see — vision encoder + LLM fusion.
Foundation models that physically act in the world.
Learning to imagine before acting — prediction as intelligence.
Reconstructing 3D worlds from 2D photographs.
RLHF, DPO, Constitutional AI — making AI do what we want.
The mother of all filters — recursive belief update from first principles.
Track a moving object through noise — the most elegant algorithm in engineering.
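The whole filter is one predict/update cycle. A minimal 1-D sketch (F, Q, H, R are illustrative defaults for tracking a slowly varying value):

```python
def kalman_step(x, P, z, F=1.0, Q=0.01, H=1.0, R=1.0):
    """One cycle of a 1-D Kalman filter.
    x, P: state estimate and its variance; z: noisy measurement."""
    # predict: push the estimate through the motion model, grow uncertainty
    x_pred = F * x
    P_pred = F * P * F + Q
    # update: blend prediction and measurement by the Kalman gain
    K = P_pred * H / (H * P_pred * H + R)
    x_new = x_pred + K * (z - H * x_pred)
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new
```

Feed it a stream of noisy measurements and the estimate homes in on the truth while the variance shrinks — the same two equations, in matrix form, run everything from spacecraft to phone IMUs.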
When the world is nonlinear — linearize with Jacobians.
Beyond linearization — sigma points capture nonlinearity directly.
Sequences with hidden causes — Forward, Viterbi, Baum-Welch.
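The Forward algorithm sums over all hidden-state paths in O(T·n²) instead of enumerating them. A minimal sketch (pi: initial state probabilities, A: transitions, B: emissions — all illustrative):

```python
def forward(obs, pi, A, B):
    """Likelihood P(obs) of an observation sequence under an HMM.
    pi[i]: initial prob, A[i][j]: transition, B[i][o]: emission."""
    n = len(pi)
    # alpha[i] = P(obs so far, hidden state = i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)
```

Viterbi is the same recursion with max in place of sum (best single path); Baum-Welch wraps Forward and its mirror-image Backward pass inside EM to learn A and B from data.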
Prior × likelihood = posterior. The foundation of all inference.
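For a discrete hypothesis space the whole rule is three lines: multiply, sum, divide. A minimal sketch with an illustrative disease-testing example (the 1% prevalence and 95%/5% test rates are made-up numbers):

```python
def posterior(prior, likelihood):
    """Bayes' rule: posterior ∝ prior × likelihood, then normalize."""
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    z = sum(unnorm)                 # the evidence, P(data)
    return [u / z for u in unnorm]

# hypotheses: [disease, healthy]; data: a positive test
prior = [0.01, 0.99]                # 1% base rate (illustrative)
likelihood = [0.95, 0.05]           # P(positive | hypothesis) (illustrative)
post = posterior(prior, likelihood)
```

Even with a 95%-sensitive test, the posterior probability of disease is only about 16% — the base rate dominates, which is exactly the intuition this formula trains.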
Variables with dependencies — DAGs, d-separation, message passing.
The chicken-and-egg problem of mapping while localizing — EKF-SLAM, particle filters, graph-based optimization.
IMU + camera fusion — preintegration, MSCKF, tightly-coupled optimization.
Deep features, neural implicit maps, Gaussian splatting SLAM.
Learned inertial models, transformer-based odometry, foundation models for state estimation.