The definitive textbook on deep learning, from mathematical foundations through core architectures. Every chapter condensed into interactive lessons with live simulations.
Scalars, vectors, matrices, tensors, eigendecomposition, SVD, the trace operator, and the determinant.
Random variables, probability distributions, Bayes' rule, expectation, variance, common distributions, and information theory.
Overflow, underflow, gradient-based optimization, Jacobians, Hessians, constrained optimization, and linear least squares.
Learning algorithms, capacity, overfitting, underfitting, estimators, bias and variance, maximum likelihood estimation (MLE), Bayesian statistics, SGD, and the curse of dimensionality.
XOR problem, hidden layers, activation functions, output units, backpropagation, and universal approximation.
Parameter norm penalties, dataset augmentation, noise robustness, early stopping, dropout, and batch normalization.
SGD, momentum, learning rate schedules, adaptive methods (Adam, RMSProp), batch normalization, and loss surfaces.
The convolution operation, its motivation, pooling, variants of the basic convolution function, efficient convolution algorithms, and the neuroscientific basis.
Unfolding computational graphs, RNNs, bidirectional RNNs, encoder-decoder architectures, deep recurrent nets, LSTM, and GRU.
Performance metrics, baselines, hyperparameter selection, debugging strategies, and when to gather more data.
Large-scale deep learning, computer vision, speech recognition, NLP, recommender systems, and other applications.