Matrices, eigenvectors, and the geometry of high dimensions. The language of data, networks, and quantum mechanics.
You have three equations with three unknowns. Write them out, solve by substitution. Tedious but possible. Now imagine a thousand equations with a thousand unknowns. That is what real engineering looks like: finite element models, circuit networks, machine learning weight matrices.
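A matrix is a rectangular array of numbers. An m × n matrix has m rows and n columns. We write it as:
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$
where $a_{ij}$ is the entry in row i, column j. A vector is a matrix with one column (column vector) or one row (row vector).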
A square matrix has m = n. The identity matrix I has 1s on the diagonal and 0s elsewhere. For any square matrix A of the same size, AI = IA = A. It is the matrix equivalent of multiplying by 1.
Addition: A + B is defined only when A and B have the same dimensions. Add entry by entry: $(A + B)_{ij} = a_{ij} + b_{ij}$.
Scalar multiplication: cA multiplies every entry by c: $(cA)_{ij} = c \cdot a_{ij}$.
Matrix multiplication: If A is m × n and B is n × p, then AB is m × p. The entry $(AB)_{ij}$ is the dot product of row i of A with column j of B:
$$(AB)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$
Transpose: $A^T$ flips rows and columns: $(A^T)_{ij} = a_{ji}$. A matrix is symmetric if $A^T = A$ (its entries mirror across the main diagonal).
Inverse: If A is square and $A^{-1}$ exists, then $AA^{-1} = A^{-1}A = I$. Not all matrices have inverses. Those that do are called nonsingular or invertible.
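The operations above can be sketched in a few lines of plain Python, with a matrix stored as a list of rows. The helper names `matmul` and `transpose` and the sample matrices are illustrative choices, not part of the lesson; `A_inv` was worked out by hand for this particular A.

```python
def matmul(A, B):
    """Multiply an m x n matrix A by an n x p matrix B (both lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("inner dimensions must match")
    p = len(B[0])
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

def transpose(A):
    """Flip rows and columns."""
    return [list(col) for col in zip(*A)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

AB = matmul(A, B)    # multiplying by B on the right swaps the columns of A
BA = matmul(B, A)    # multiplying by B on the left swaps the rows of A
A_T = transpose(A)

# Hand-computed inverse of A; multiplying it back should give the identity.
A_inv = [[-2, 1], [1.5, -0.5]]
identity = matmul(A, A_inv)
```

Here `AB` and `BA` come out different, which is a reminder that matrix multiplication does not commute in general.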
Gaussian elimination is the workhorse algorithm for solving Ax = b. The idea: use elementary row operations to transform the system into upper triangular form, then back-substitute.
The three elementary row operations are:
| Operation | Effect |
|---|---|
| Swap two rows | Reorder equations (no mathematical change) |
| Multiply a row by a nonzero scalar | Scale an equation |
| Add a multiple of one row to another | Eliminate a variable |
Example: Solve x + 2y = 5, 3x + 4y = 11.
Augmented matrix: [1 2 | 5; 3 4 | 11]. Subtract 3×Row1 from Row2: [1 2 | 5; 0 −2 | −4]. Back-substitute: the second row reads −2y = −4, so y = 2, and the first row then gives x = 5 − 2y = 1.
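The same procedure can be written as a short routine. This is a minimal sketch, assuming a square, nonsingular system; the function name `solve` is illustrative, and exact fractions are used so that the output matches hand calculation. Production solvers work in floating point and choose pivots more carefully.

```python
from fractions import Fraction

def solve(A, b):
    """Solve Ax = b for square, nonsingular A by elimination and back-substitution."""
    n = len(A)
    # Build the augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Row swap: bring a row with a nonzero entry into the pivot position.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Subtract a multiple of the pivot row from each row below it.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    # Back-substitution on the upper triangular system, last row first.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x

# The worked example: x + 2y = 5, 3x + 4y = 11.
x, y = solve([[1, 2], [3, 4]], [5, 11])
```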
[Interactive demo: Gaussian elimination step by step. At each step the highlighted pivot eliminates the entries below it.]
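The rank of a matrix is the number of nonzero rows left after elimination. It tells you the number of independent equations. If rank(A) = rank([A|b]) = n (the number of unknowns), the system has a unique solution. If rank(A) = rank([A|b]) < n, there are infinitely many solutions, and if rank(A) < rank([A|b]), there is none.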
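As a rough illustration, rank can be computed by running elimination and counting pivots. Comparing rank(A) with rank([A|b]) then sorts a system into one of three standard cases: a unique solution, infinitely many solutions, or none. The names `rank` and `classify` are hypothetical helpers, and exact fractions stand in for the floating-point tolerances a numerical library would use.

```python
from fractions import Fraction

def rank(A):
    """Count the pivots (nonzero rows) found while row-reducing A to echelon form."""
    M = [[Fraction(v) for v in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0  # index of the next pivot row
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

def classify(A, b):
    """Compare rank(A) with rank([A|b]) and the number of unknowns n."""
    augmented = [row + [rhs] for row, rhs in zip(A, b)]
    r_a, r_aug, n = rank(A), rank(augmented), len(A[0])
    if r_a < r_aug:
        return "no solution"
    return "unique solution" if r_a == n else "infinitely many solutions"

print(classify([[1, 2], [3, 4]], [5, 11]))
print(classify([[1, 2], [2, 4]], [5, 10]))
print(classify([[1, 2], [2, 4]], [5, 11]))
```

The second system repeats the same equation twice, and the third contradicts itself, which is exactly what the rank comparison detects.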
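The determinant det(A) is a single number computed from a square matrix. For 2×2:
$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$$
For 3×3, expand along the first row (cofactor expansion):
$$\det\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = a_{11}\det\begin{pmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{pmatrix} - a_{12}\det\begin{pmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{pmatrix} + a_{13}\det\begin{pmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix}$$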
Key properties:
| Property | Formula |
|---|---|
| A is invertible | det(A) ≠ 0 |
| Product rule | det(AB) = det(A) · det(B) |
| Transpose | $\det(A^T) = \det(A)$ |
| Inverse | $\det(A^{-1}) = 1/\det(A)$ |
| Row swap | Changes sign of det |
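Cofactor expansion along the first row translates directly into a recursive function, and the properties in the table can be spot-checked on small integer matrices. This is a sketch for illustration only: the matrices A and B are arbitrary examples, and the recursion takes on the order of n! steps, so it is only sensible for small n.

```python
def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j, then recurse with alternating sign.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
B = [[1, 2, 0], [0, 1, 4], [2, 0, 1]]

product_rule_holds = det(matmul(A, B)) == det(A) * det(B)
transpose_rule_holds = det(transpose(A)) == det(A)
swapped = [A[1], A[0], A[2]]               # swap the first two rows
row_swap_flips_sign = det(swapped) == -det(A)
```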
Geometrically, the two column vectors of a 2×2 matrix A span a parallelogram whose area equals |det(A)|. When det(A) = 0, the column vectors are parallel (linearly dependent) and the parallelogram has zero area.
An eigenvector of a matrix A is a nonzero vector v that only gets scaled (not rotated) when multiplied by A:
$$Av = \lambda v$$
The scalar λ is the eigenvalue. "Eigen" is German for "own" or "characteristic." Eigenvectors are the directions that the transformation preserves.
To find eigenvalues, rearrange: $(A - \lambda I)v = 0$. For a nontrivial solution, the coefficient matrix must be singular:
$$\det(A - \lambda I) = 0$$
This is the characteristic equation. For an n×n matrix, it is a degree-n polynomial in λ.
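Example: A = [[4, 1], [2, 3]]. Characteristic equation: $(4-\lambda)(3-\lambda) - 1 \cdot 2 = \lambda^2 - 7\lambda + 10 = (\lambda-5)(\lambda-2) = 0$. Eigenvalues: $\lambda_1 = 5$, $\lambda_2 = 2$. Solving $(A - \lambda I)v = 0$ for each eigenvalue gives the eigenvectors $v_1 = (1, 1)$ and $v_2 = (1, -2)$, up to scaling.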
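For a 2×2 matrix the characteristic polynomial is $\lambda^2 - \mathrm{tr}(A)\lambda + \det(A)$, so the eigenvalues follow from the quadratic formula. The sketch below applies this to the example matrix and checks $Av = \lambda v$ for the eigenvectors (1, 1) and (1, −2), which come from solving $(A - \lambda I)v = 0$ by hand. The function names are illustrative.

```python
import math

def eigenvalues_2x2(A):
    """Real eigenvalues of a 2x2 matrix from lambda^2 - tr(A)*lambda + det(A) = 0."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are complex")
    root = math.sqrt(disc)
    return (tr + root) / 2, (tr - root) / 2

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[4, 1], [2, 3]]
lam1, lam2 = eigenvalues_2x2(A)

v1, v2 = [1, 1], [1, -2]     # hand-computed eigenvectors for lam1 and lam2
Av1 = matvec(A, v1)          # should equal lam1 * v1
Av2 = matvec(A, v2)          # should equal lam2 * v2
```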
[Interactive figure: the plane under the map A, with the eigenvectors drawn as arrows along the directions that are only stretched, not rotated.]
If an n×n matrix A has n linearly independent eigenvectors $v_1, \ldots, v_n$, we can form the matrix $P = [v_1 \mid \cdots \mid v_n]$ and write:
$$A = PDP^{-1}$$
where $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ is the diagonal matrix of eigenvalues. This is diagonalization.
Not all matrices can be diagonalized. A matrix is diagonalizable if and only if it has n linearly independent eigenvectors. Real symmetric matrices ($A^T = A$) are always diagonalizable, and their eigenvalues are always real. Even better, their eigenvectors can be chosen to be mutually orthogonal.
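A quick check of the factorization $A = PDP^{-1}$, using the matrix A = [[4, 1], [2, 3]] with eigenvalues 5 and 2 and hand-computed eigenvectors (1, 1) and (1, −2) as the columns of P. The helper `inverse_2x2` is a hypothetical convenience for integer 2×2 matrices, and fractions keep the arithmetic exact.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse_2x2(M):
    """Inverse of an integer 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[4, 1], [2, 3]]
P = [[1, 1], [1, -2]]        # columns are the eigenvectors (1, 1) and (1, -2)
D = [[5, 0], [0, 2]]         # matching eigenvalues on the diagonal
P_inv = inverse_2x2(P)

reconstructed = matmul(matmul(P, D), P_inv)   # should reproduce A

# Powers become cheap: A^k = P D^k P^{-1}, and D^k raises each eigenvalue to k.
k = 5
D_k = [[5 ** k, 0], [0, 2 ** k]]
A_k = matmul(matmul(P, D_k), P_inv)
```

The order of the columns in P must match the order of the eigenvalues in D; swapping one without the other breaks the identity.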
Quadratic forms: For a symmetric matrix A, the expression $x^TAx$ is a quadratic form. Its behavior is determined by the eigenvalues: if all $\lambda_i > 0$, the form is positive definite (bowl-shaped). If the eigenvalues have mixed signs, the form is indefinite and its graph is a saddle.
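For a symmetric 2×2 matrix [[a, b], [b, d]] the eigenvalues have a closed form, which makes the classification easy to sketch. The function names are illustrative, and the "negative definite" and "semidefinite" branches are the standard remaining cases, included here only for completeness.

```python
import math

def symmetric_eigenvalues_2x2(a, b, d):
    """Eigenvalues of [[a, b], [b, d]], largest first; they are always real."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean + radius, mean - radius

def quad_form(a, b, d, x, y):
    """Evaluate x^T A x for A = [[a, b], [b, d]] at the point (x, y)."""
    return a * x * x + 2 * b * x * y + d * y * y

def classify_form(a, b, d):
    hi, lo = symmetric_eigenvalues_2x2(a, b, d)
    if lo > 0:
        return "positive definite"
    if hi < 0:
        return "negative definite"
    if lo < 0 < hi:
        return "indefinite (saddle)"
    return "semidefinite"

print(classify_form(2, 0, 3))    # eigenvalues 3 and 2
print(classify_form(1, 2, 1))    # eigenvalues 3 and -1
# The saddle takes both signs: positive along (1, 1), negative along (1, -1).
print(quad_form(1, 2, 1, 1, 1), quad_form(1, 2, 1, 1, -1))
```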
Not every matrix is square or diagonalizable. The singular value decomposition (SVD) works for any m×n matrix. It decomposes A as:
$$A = U \Sigma V^T$$
where U is m×m orthogonal, V is n×n orthogonal, and Σ is m×n diagonal with the singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge 0$ on the diagonal.
The nonzero singular values are the square roots of the nonzero eigenvalues of $A^TA$ (equivalently, of $AA^T$). The columns of U are the left singular vectors and the columns of V are the right singular vectors.
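The link between singular values and the eigenvalues of $A^TA$ can be checked by hand for a matrix with two columns, since $A^TA$ is then a symmetric 2×2 matrix. The sketch below is for illustration only, and `singular_values_n2` is a hypothetical helper. In practice one would call a library routine such as `numpy.linalg.svd`, because forming $A^TA$ explicitly loses numerical accuracy.

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def singular_values_n2(A):
    """Singular values of an m x 2 matrix: square roots of the eigenvalues of A^T A."""
    (p, q), (_, r) = matmul(transpose(A), A)   # A^T A = [[p, q], [q, r]] is symmetric
    mean = (p + r) / 2
    radius = math.hypot((p - r) / 2, q)
    # max(..., 0.0) guards against a tiny negative value caused by rounding.
    return math.sqrt(mean + radius), math.sqrt(max(mean - radius, 0.0))

A = [[3, 0], [4, 5]]
s1, s2 = singular_values_n2(A)   # A^T A = [[25, 20], [20, 25]] has eigenvalues 45 and 5

# For a square matrix, the product of the singular values equals |det(A)|.
abs_det = abs(3 * 5 - 0 * 4)
```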
| Application | How SVD is used |
|---|---|
| Data compression | Keep only the largest singular values (low-rank approximation) |
| Pseudoinverse | Solve least-squares problems for non-square or singular A |
| PCA | SVD of centered data matrix gives principal components |
| Recommender systems | Matrix factorization (Netflix prize approach) |
| Numerical rank | Count singular values above a threshold |
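The system $x' = Ax$ has solution $x(t) = e^{At}x(0)$. If $A = PDP^{-1}$, then $e^{At} = P\,\mathrm{diag}(e^{\lambda_1 t}, \ldots, e^{\lambda_n t})\,P^{-1}$. Diagonalization converts a coupled system into n independent equations: in the coordinates $y = P^{-1}x$, each component satisfies $y_i' = \lambda_i y_i$.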
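As a sanity check on the formula for $e^{At}$, the sketch below builds it from a diagonalization and compares the result with a truncated Taylor series $I + At + (At)^2/2! + \cdots$. It assumes the example matrix A = [[4, 1], [2, 3]], whose eigenvalues 5 and 2 and eigenvector matrix P were computed by hand; the function names and the choice of 30 series terms are arbitrary.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[4.0, 1.0], [2.0, 3.0]]
P = [[1.0, 1.0], [1.0, -2.0]]              # eigenvectors (1, 1) and (1, -2) as columns
P_inv = [[2 / 3, 1 / 3], [1 / 3, -1 / 3]]  # hand-computed inverse of P
eigs = [5.0, 2.0]

def expm_diag(t):
    """e^{At} = P diag(e^{5t}, e^{2t}) P^{-1}."""
    E = [[math.exp(eigs[0] * t), 0.0], [0.0, math.exp(eigs[1] * t)]]
    return matmul(matmul(P, E), P_inv)

def expm_series(t, terms=30):
    """Truncated Taylor series for e^{At}, used only as an independent check."""
    At = [[a * t for a in row] for row in A]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # term now holds (At)^k / k!
        term = [[v / k for v in row] for row in matmul(term, At)]
        result = [[r + s for r, s in zip(rr, ss)] for rr, ss in zip(result, term)]
    return result

t = 0.1
via_diag = expm_diag(t)
via_series = expm_series(t)

# Solution of x' = Ax at time t for the initial state x(0) = (1, 0).
x0 = [1.0, 0.0]
x_t = [sum(m * v for m, v in zip(row, x0)) for row in via_diag]
```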
A Markov chain describes random transitions between states. The transition matrix M has entries $m_{ij}$ = probability of going from state j to state i, so each column of M sums to 1. The state after k steps is $x(k) = M^k x(0)$. The steady state is the eigenvector of M with eigenvalue 1, scaled so that its entries sum to 1.
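A small two-state chain shows the steady state emerging from repeated multiplication. The transition probabilities below are made up for illustration (think of two hypothetical states, "sunny" and "rainy"); for this M, solving Mx = x by hand gives the steady state (5/6, 1/6).

```python
def step(M, x):
    """One step of the chain: x(k+1) = M x(k)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# Column j holds the probabilities of leaving state j, so each column sums to 1.
M = [[0.9, 0.5],
     [0.1, 0.5]]

x = [1.0, 0.0]            # start in state 0 with certainty
for _ in range(100):
    x = step(M, x)

# At the steady state, one more step changes nothing (up to rounding).
x_next = step(M, x)
```

Starting from `[0.0, 1.0]` instead leads to the same limit, because the other eigenvalue of this M is 0.4 and its contribution decays geometrically.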
When Ax = b has no exact solution (overdetermined), the least-squares solution minimizes $\|Ax - b\|^2$. It satisfies the normal equations:
$$A^TAx = A^Tb$$
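Fitting a straight line $y \approx c_0 + c_1 t$ is the simplest instance: the columns of A are a column of ones and the column of t values, so the normal equations reduce to a 2×2 system that can be solved directly. The sketch below uses made-up data points (0, 6), (1, 0), (2, 0), and `least_squares_line` is an illustrative name. Numerical libraries usually avoid forming $A^TA$ and use QR or SVD instead.

```python
from fractions import Fraction

def least_squares_line(ts, ys):
    """Fit y ~ c0 + c1*t by solving the 2x2 normal equations A^T A c = A^T y."""
    n = len(ts)
    # With columns (1, ..., 1) and ts, A^T A and A^T y reduce to these sums.
    s_t = sum(ts)
    s_tt = sum(t * t for t in ts)
    s_y = sum(ys)
    s_ty = sum(t * y for t, y in zip(ts, ys))
    det = n * s_tt - s_t * s_t
    if det == 0:
        raise ValueError("all t values are equal; the line is not determined")
    # Cramer's rule on [[n, s_t], [s_t, s_tt]] [c0, c1] = [s_y, s_ty].
    c0 = Fraction(s_tt * s_y - s_t * s_ty) / Fraction(det)
    c1 = Fraction(n * s_ty - s_t * s_y) / Fraction(det)
    return c0, c1

ts, ys = [0, 1, 2], [6, 0, 0]
c0, c1 = least_squares_line(ts, ys)

# The residual b - Ax is orthogonal to both columns of A.
residuals = [y - (c0 + c1 * t) for t, y in zip(ts, ys)]
dot_with_ones = sum(residuals)
dot_with_ts = sum(t * r for t, r in zip(ts, residuals))
```

The two dot products being zero is the normal equations restated: $A^T(b - Ax) = 0$.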
Every 2×2 matrix defines a linear transformation of the plane, visible in how it maps the unit square, the unit circle, and individual points. Rotations, reflections, shears, and projections are all examples.
[Interactive figure: the unit circle (teal) is mapped by A to an ellipse (orange). Eigenvectors are drawn as thick lines, and grid lines show how the whole plane deforms.]
| This lesson | Where it leads |
|---|---|
| Eigenvalues | ODE systems (Ch 2), stability analysis, quantum mechanics |
| Diagonalization | Matrix exponential $e^{At}$, Markov chains, Google PageRank |
| SVD | PCA, dimensionality reduction, recommender systems |
| Determinants | Change of variables in integrals (Jacobian), volume forms |
| Rank | Solvability theory for linear systems, null spaces |
| Least squares | Linear regression, signal processing, curve fitting |
"The introduction of numbers as coordinates is an act of violence." — Hermann Weyl