12 Architecture Overviews

Modern AI Architecture Atlas

A bird's-eye map of every major AI architecture — what it is, how it works, how it's trained, and how it's used. Interactive diagrams, intuitive explanations, and the research context you need to navigate the field. Not a textbook. Not a tutorial. A compass.

12 architectures, ~15 min each, 48+ interactive diagrams

01

Transformer

Self-attention, KV caching, MoE, LoRA, RLHF, speculative decoding — the architecture that powers GPT, Claude, and Gemini.

Attention, RoPE, MoE, LoRA, RLHF

02

State Space Models

Selective scan, linear recurrence, and hybrid SSM-attention architectures like Jamba and Zamba. The O(n) alternative to attention's O(n²) cost in sequence length.

Mamba, S4, Selective Scan, Jamba

03

Diffusion Models

DDPM, DDIM, latent diffusion, classifier-free guidance, ControlNet, consistency models: the dominant paradigm for image and video generation.

DDPM, Latent Diffusion, CFG, ControlNet

04

Flow Matching

Continuous normalizing flows, optimal transport, conditional flow matching — straighter paths, fewer steps. Behind SD3 and Flux.

CNF, OT, CFM, Reflow

05

VAE / VQ-VAE / Tokenizers

Variational inference, ELBO, codebook learning, FSQ: the hidden plumbing behind most generative systems.

VAE, VQ-VAE, FSQ, ELBO

06

GAN

Adversarial training, StyleGAN, spectral normalization, PatchGAN: the approach that dominated image generation before diffusion, still alive at the edges.

Min-Max, StyleGAN, PatchGAN

07

Contrastive / Representation Learning

CLIP, SimCLR, DINO, MAE, SigLIP — the glue that makes multimodal AI possible.

CLIP, DINO, MAE, InfoNCE

08

Vision-Language Models

Visual encoder + LLM fusion, instruction tuning, grounding, OCR-free document understanding — GPT-4V and beyond.

LLaVA, Grounding, Multi-image, Video

09

Vision-Language-Action

Action heads, diffusion policies, action chunking, cross-embodiment training — foundation models that physically act.

RT-2, Diffusion Policy, ACT, OpenVLA

10

World Models

JEPA, Dreamer, Genie, UniSim — learned dynamics, latent imagination, and video prediction as world simulation.

JEPA, Dreamer, MPC, Sim-to-Real

11

NeRF / 3D Gaussian Splatting

Neural radiance fields, 3DGS, generative 3D, SDF representations — 3D understanding from 2D images.

NeRF, 3DGS, SDF, Zero-1-to-3

12

Reward Models / Alignment

RLHF, DPO, KTO, Constitutional AI, process reward models — making AI systems do what we actually want.

RLHF, DPO, Constitutional AI, PRM