From probabilistic reasoning to multiagent systems. Every algorithm derived, visualized, and made interactive. The complete decision-making toolkit.
Bayesian networks, conditional independence, joint distributions.
Variable elimination, belief propagation, sampling methods.
MLE, Bayesian learning, EM algorithm.
Graph search, Markov equivalence classes.
Utility, decision networks, value of information.
MDPs, policy iteration, value iteration.
Parametric methods, tile coding, neural networks.
MCTS, heuristic search, rollout algorithms.
Genetic algorithms, CEM, evolution strategies.
Finite differences, likelihood ratio, REINFORCE.
Natural gradient, trust region, PPO.
GAE, deterministic policy gradient, A3C.
Robustness analysis, adversarial testing.
Bandits, UCB, Thompson sampling.
Bayesian RL, posterior sampling.
Q-learning, SARSA, experience replay, DQN.
Behavioral cloning, DAgger, IRL, GAIL.
Kalman filter, EKF, UKF, particle filters.
POMDPs, conditional plans, alpha vectors.
PBVI, SARSOP, point-based methods.
POMCP, DESPOT, online POMDP solvers.
Finite state controllers, policy graphs.
Game theory, Nash equilibrium, correlated equilibrium.
Stochastic games, MARL, fictitious play.
Dec-POMDPs, I-POMDPs, belief-space games.
Coordination, communication, team decision-making.