Kreyszig, Chapters 7–8

Linear Algebra

Matrices, eigenvectors, and the geometry of high dimensions. The language of data, networks, and quantum mechanics.

Prerequisites: Basic algebra + some exposure to systems of equations.
10 Chapters · 6+ Simulations · 10 Quizzes

Chapter 0: Why Matrices?

You have three equations with three unknowns. Write them out, solve by substitution. Tedious but possible. Now imagine a thousand equations with a thousand unknowns. That is what real engineering looks like: finite element models, circuit networks, machine learning weight matrices.

A matrix is a rectangular array of numbers. An m × n matrix has m rows and n columns. We write it as:

A = [a_{ij}],   i = 1,...,m,   j = 1,...,n

where a_{ij} is the entry in row i, column j. A vector is a matrix with one column (column vector) or one row (row vector).

The core idea: Matrices let us write systems of equations as a single equation Ax = b. Hundreds of equations become one compact expression. Better yet, matrix operations have beautiful geometric interpretations: rotations, reflections, projections, scaling.
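As a small illustration of Ax = b, the sketch below stores a 2 × 2 system as a coefficient matrix and a right-hand side, then checks a candidate solution with one matrix-vector product. The helper name `matvec` and the use of plain Python lists are assumptions made for this example, not part of the lesson.

```python
# Coefficient matrix A and right-hand side b for the system
#    x + 2y = 5
#   3x + 4y = 11
A = [[1, 2],
     [3, 4]]
b = [5, 11]

def matvec(A, x):
    """Return the product Ax: one dot product per row of A."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

x = [1, 2]                 # candidate solution (x = 1, y = 2)
print(matvec(A, x) == b)   # True when x satisfies every equation at once
```

Checking a thousand equations would use the same single call; only the sizes of A and b change.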

A square matrix has m = n. The identity matrix I has 1s on the diagonal and 0s elsewhere. For any n × n matrix A and the n × n identity, AI = IA = A. It is the matrix equivalent of multiplying by 1.

What is the entry a_{23} in a matrix?

Chapter 1: Matrix Operations

Addition: A + B is defined only when A and B have the same dimensions. Add entry by entry: (A + B)_{ij} = a_{ij} + b_{ij}.

Scalar multiplication: cA multiplies every entry by c: (cA)_{ij} = c · a_{ij}.

Matrix multiplication: If A is m × n and B is n × p, then AB is m × p. The entry (AB)_{ij} is the dot product of row i of A with column j of B:

(AB)_{ij} = ∑_{k=1}^{n} a_{ik} b_{kj}

Critical warning: Matrix multiplication is not commutative. In general, AB ≠ BA. This is one of the most important differences from ordinary algebra. Rotation by 90° then reflection is not the same as reflection then rotation by 90°.
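The rotation-versus-reflection claim can be checked directly. The following is a minimal sketch of the row-times-column rule; the function name `matmul` and the particular matrices R (rotation by 90° counterclockwise) and F (reflection across the x-axis) are choices made for this example.

```python
def matmul(A, B):
    """Multiply an m x n matrix A by an n x p matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

R = [[0, -1],
     [1,  0]]   # rotation by 90 degrees counterclockwise
F = [[1,  0],
     [0, -1]]   # reflection across the x-axis

print(matmul(F, R))   # rotate first, then reflect: [[0, -1], [-1, 0]]
print(matmul(R, F))   # reflect first, then rotate: [[0, 1], [1, 0]]
```

The rightmost factor acts first, so FR means "rotate, then reflect". The two products differ, which is the non-commutativity in concrete form.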

Transpose: A^T flips rows and columns: (A^T)_{ij} = a_{ji}. A matrix is symmetric if A^T = A (symmetric about the main diagonal).

Inverse: If A is square and A^{-1} exists, then A A^{-1} = A^{-1} A = I. Not all matrices have inverses. Those that do are called nonsingular or invertible.
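For a 2 × 2 matrix the inverse has a closed form: swap the diagonal entries, negate the off-diagonal ones, and divide by ad − bc. That formula is not stated in the lesson; the sketch below assumes it and uses exact fractions so that the check A A^{-1} = I holds without rounding. The names `inverse_2x2` and `matmul` are illustrative.

```python
from fractions import Fraction

def matmul(A, B):
    """Row-times-column product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(A):
    """Inverse of [[a, b], [c, d]], assuming integer or Fraction entries."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular: no inverse exists")
    det = Fraction(det)
    return [[d / det, -b / det],
            [-c / det, a / det]]

A = [[4, 1],
     [2, 3]]
A_inv = inverse_2x2(A)
print(matmul(A, A_inv) == [[1, 0], [0, 1]])   # True
print(matmul(A_inv, A) == [[1, 0], [0, 1]])   # True
```

A matrix such as [[1, 2], [2, 4]] raises the error instead, because its rows are proportional.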

Why is matrix multiplication not commutative (AB ≠ BA in general)?

Chapter 2: Gauss Elimination

The workhorse algorithm for solving Ax = b. The idea: use elementary row operations to transform the system into upper triangular form, then back-substitute.

The three elementary row operations are:

Operation | Effect
Swap two rows | Reorder equations (no mathematical change)
Multiply a row by a nonzero scalar | Scale an equation
Add a multiple of one row to another | Eliminate a variable

Augmented matrix: Write [A | b] and operate on rows. This is exactly the same as manipulating the equations, just more compact.

Example: Solve x + 2y = 5, 3x + 4y = 11.

Augmented: [1 2 | 5; 3 4 | 11]. Subtract 3×Row1 from Row2: [1 2 | 5; 0 −2 | −4]. Back-substitute: y = 2, x = 1.
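The same procedure can be sketched in code for any square, nonsingular system. This is a minimal version under a few assumptions: entries are integers or fractions (so arithmetic is exact), a row swap is used only to avoid a zero pivot, and the function name `gauss_solve` is made up for this example. Production solvers typically pivot on the largest entry for numerical stability.

```python
from fractions import Fraction

def gauss_solve(A, b):
    """Solve Ax = b by forward elimination and back-substitution."""
    n = len(A)
    # Build the augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(A, b)]
    for col in range(n):
        # Row swap: bring a row with a nonzero pivot into position.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular: no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        # Subtract a multiple of the pivot row from each row below it.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * p for x, p in zip(M[r], M[col])]
    # Back-substitute from the last row up.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x

solution = gauss_solve([[1, 2], [3, 4]], [5, 11])
print(solution == [1, 2])   # True: x = 1, y = 2, as in the worked example
```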

Gauss Elimination Stepper

Watch the algorithm eliminate variables step by step. The pivot (orange) eliminates entries below it. Click "Step" to advance.


The rank of a matrix is the number of nonzero rows after elimination. It tells you the number of independent equations. If rank(A) = rank([A|b]) = n (number of unknowns), the system has a unique solution.
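The rank test can be sketched as follows: eliminate, count the pivot rows, and compare rank(A) with rank([A | b]). The names `rank` and `classify` are hypothetical. The "infinitely many" and "no solution" branches go slightly beyond the sentence above, using the standard criterion that a system is consistent exactly when the two ranks agree.

```python
from fractions import Fraction

def rank(M):
    """Count the nonzero rows left after forward elimination."""
    rows = [[Fraction(v) for v in row] for row in M]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # index of the next pivot row
    for col in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * p for x, p in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r

def classify(A, b):
    """Describe the solution set of Ax = b from the two ranks."""
    augmented = [row + [rhs] for row, rhs in zip(A, b)]
    rank_a, rank_aug, n = rank(A), rank(augmented), len(A[0])
    if rank_a < rank_aug:
        return "no solution"
    return "unique solution" if rank_a == n else "infinitely many solutions"

print(classify([[1, 2], [3, 4]], [5, 11]))   # unique solution
print(classify([[1, 2], [2, 4]], [5, 10]))   # infinitely many solutions
print(classify([[1, 2], [2, 4]], [5, 11]))   # no solution
```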

Key insight: Gauss elimination is not just a solution technique. It reveals the structure of the solution space: unique solution, infinitely many solutions, or no solution at all. The rank determines everything.

What does it mean if rank(A) < number of unknowns?

Chapter 3: Determinants

The determinant det(A) is a single number computed from a square matrix. For 2×2:

det [[a, b], [c, d]] = ad − bc

For 3×3, expand along the first row (cofactor expansion):

det(A) = a_{11}(a_{22}a_{33} − a_{23}a_{32}) − a_{12}(a_{21}a_{33} − a_{23}a_{31}) + a_{13}(a_{21}a_{32} − a_{22}a_{31})

Geometric meaning: The absolute value |det(A)| is the volume scaling factor of the linear transformation defined by A. For a 2×2 matrix, |det(A)| is the area of the parallelogram formed by the column vectors. If det(A) = 0, the transformation squashes space into a lower dimension.
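Both formulas translate line for line into code. The sketch below is only meant for hand-sized examples; the function names `det2` and `det3` and the sample matrices are chosen for illustration.

```python
def det2(A):
    """Determinant of [[a, b], [c, d]]."""
    (a, b), (c, d) = A
    return a * d - b * c

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = A
    return (a11 * (a22 * a33 - a23 * a32)
            - a12 * (a21 * a33 - a23 * a31)
            + a13 * (a21 * a32 - a22 * a31))

print(det2([[2, 0.5], [0.5, 2]]))                 # 3.75: area of the parallelogram
print(det2([[1, 2], [2, 4]]))                     # 0: the columns are parallel
print(det3([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))    # 24: volume of a 2 x 3 x 4 box
```

The second matrix has zero determinant because its second column is twice the first, so the parallelogram collapses onto a line.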

Key properties:

Property | Formula
A is invertible | det(A) ≠ 0
Product rule | det(AB) = det(A) · det(B)
Transpose | det(A^T) = det(A)
Inverse | det(A^{-1}) = 1/det(A)
Row swap | Changes sign of det

Determinant as Area

Drag the column vectors of a 2×2 matrix. The parallelogram area equals |det(A)|. When det = 0, the vectors are parallel (linearly dependent).

Initial matrix: A = [[2.0, 0.5], [0.5, 2.0]]

If det(A) = 0, what does it mean geometrically?

Chapter 4: Eigenvalues & Eigenvectors

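An eigenvector of a matrix A is a nonzero vector v that stays on its own line through the origin when multiplied by A. It is only scaled (and reversed if the scale factor is negative), never rotated off that line: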

Av = λv

The scalar λ is the eigenvalue. "Eigen" is German for "own" or "characteristic." Eigenvectors are the directions that the transformation preserves.

To find eigenvalues, rearrange: (A − λI)v = 0. For a nontrivial solution, the coefficient matrix must be singular:

det(A − λI) = 0

This is the characteristic equation. For an n×n matrix, it is a degree-n polynomial in λ.

Key insight: The characteristic equation here is exactly the characteristic equation from second-order ODEs. The connection is deep: solving y'' + py' + qy = 0 is equivalent to finding eigenvalues of the companion matrix [[0, 1], [−q, −p]]. ODEs and linear algebra are two views of the same mathematics.
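The connection can be verified in a few lines. The derivation below assumes the usual substitution x_1 = y, x_2 = y' (the variable names are a choice made here) and expands the 2×2 determinant directly.

```latex
\begin{aligned}
x_1 &= y, \quad x_2 = y' \\
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}' &=
\begin{bmatrix} 0 & 1 \\ -q & -p \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \\
\det(A - \lambda I) &=
\det\begin{bmatrix} -\lambda & 1 \\ -q & -p - \lambda \end{bmatrix}
= (-\lambda)(-p - \lambda) - (1)(-q)
= \lambda^2 + p\lambda + q
\end{aligned}
```

Setting this to zero gives λ^2 + pλ + q = 0, the same polynomial obtained by substituting y = e^{λt} into the ODE.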

Example: A = [[4, 1], [2, 3]]. Characteristic equation: (4−λ)(3−λ) − 2 = λ^2 − 7λ + 10 = (λ−5)(λ−2) = 0. Eigenvalues: λ_1 = 5, λ_2 = 2.
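For a 2×2 matrix the characteristic equation is a quadratic, λ^2 − (trace)λ + det = 0, so the example can be reproduced with the quadratic formula. The sketch below also recovers one eigenvector per eigenvalue and checks Av = λv. The function names are hypothetical, and the code assumes real eigenvalues.

```python
import math

def eigenvalues_2x2(A):
    """Real eigenvalues of a 2x2 matrix from lambda^2 - trace*lambda + det = 0."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are complex")
    root = math.sqrt(disc)
    return (trace + root) / 2, (trace - root) / 2

def eigenvector_2x2(A, lam):
    """One nonzero solution v of (A - lam*I) v = 0, for an eigenvalue lam."""
    (a, b), (c, d) = A
    if b != 0:
        return [b, lam - a]     # first row: (a - lam)*b + b*(lam - a) = 0
    if c != 0:
        return [lam - d, c]     # second row: c*(lam - d) + (d - lam)*c = 0
    return [1, 0] if lam == a else [0, 1]   # diagonal matrix

A = [[4, 1],
     [2, 3]]
for lam in eigenvalues_2x2(A):
    v = eigenvector_2x2(A, lam)
    Av = [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]
    print(lam, v, Av == [lam * v[0], lam * v[1]])
```

For this matrix the loop reports λ = 5 with v = [1, 1] and λ = 2 with v = [1, −2]. Any nonzero multiple of either vector is an equally valid eigenvector.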

Eigenvector Visualizer

The matrix A maps every point. The orange and teal arrows show eigenvectors — directions that only get stretched, not rotated. Adjust the matrix to see how eigenvectors change.

Initial matrix: A = [[2.0, 1.0], [1.0, 2.0]]

If Av = 3v, what is A^2 v?

Chapter 5: Diagonalization

If an n×n matrix A has n linearly independent eigenvectors v_1,...,v_n, we can form the matrix P = [v_1 | ... | v_n] and write:

A = P D P^{-1}

where D = diag(λ_1,...,λ_n) is the diagonal matrix of eigenvalues. This is diagonalization.

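Why diagonalize? Powers become trivial: A^k = P D^k P^{-1}, because every inner P^{-1}P factor cancels. Since D is diagonal, D^k = diag(λ_1^k,...,λ_n^k). Computing A^{1000} then costs only n scalar powers and two matrix products. The same idea underlies the analysis of Google's PageRank, Markov chains, and systems of ODEs.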
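The identity can be checked on the matrix A = [[4, 1], [2, 3]] from the eigenvalue chapter. The sketch assumes the eigenvectors [1, 1] (for λ = 5) and [1, −2] (for λ = 2), which satisfy Av = λv by direct multiplication, and a hand-computed P^{-1}. Function names are illustrative.

```python
from fractions import Fraction

def matmul(A, B):
    """Row-times-column product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[4, 1],
     [2, 3]]
P = [[1,  1],
     [1, -2]]                                 # columns: eigenvectors for 5 and 2
P_inv = [[Fraction(2, 3), Fraction(1, 3)],
         [Fraction(1, 3), Fraction(-1, 3)]]   # inverse of P, computed by hand

def power_by_diagonalization(k):
    """A^k = P D^k P^{-1}, with D^k = diag(5^k, 2^k)."""
    D_k = [[5 ** k, 0],
           [0, 2 ** k]]
    return matmul(matmul(P, D_k), P_inv)

def power_by_repeated_multiplication(k):
    """A^k computed the slow way, as k successive products."""
    result = [[1, 0], [0, 1]]
    for _ in range(k):
        result = matmul(result, A)
    return result

print(power_by_diagonalization(10) == power_by_repeated_multiplication(10))  # True
```

The diagonalized version does the same amount of work for k = 10 as for k = 1000, apart from the size of the integers involved.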

Not all matrices can be diagonalized. A matrix is diagonalizable if and only if it has n linearly independent eigenvectors. Real symmetric matrices (A^T = A) are always diagonalizable, and their eigenvalues are always real. Even better, their eigenvectors can be chosen to be mutually orthogonal.

Quadratic forms: For a symmetric matrix A, the expression x^T A x is a quadratic form. Its behavior is determined by the eigenvalues: if all λ_i > 0, the form is positive definite (bowl-shaped). If mixed signs, it is a saddle.
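For a symmetric 2×2 matrix [[a, b], [b, d]] the eigenvalues have the closed form (a + d)/2 ± sqrt(((a − d)/2)^2 + b^2), so the classification can be sketched directly. The function names are hypothetical, and the "negative definite" and "semidefinite" labels are standard terms added here beyond the two cases the text mentions. With measured data, the comparisons with zero would need a tolerance.

```python
import math

def quadratic_form(A, x):
    """Evaluate x^T A x for a 2x2 matrix A and a 2-vector x."""
    (a, b), (c, d) = A
    x1, x2 = x
    return a * x1 * x1 + (b + c) * x1 * x2 + d * x2 * x2

def classify_symmetric_2x2(A):
    """Classify x^T A x by the signs of the two eigenvalues of A."""
    (a, b), (c, d) = A
    if b != c:
        raise ValueError("A must be symmetric")
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    lam_min, lam_max = mean - radius, mean + radius
    if lam_min > 0:
        return "positive definite"
    if lam_max < 0:
        return "negative definite"
    if lam_min < 0 < lam_max:
        return "indefinite (saddle)"
    return "semidefinite"

print(classify_symmetric_2x2([[2, 1], [1, 2]]))   # eigenvalues 1 and 3
print(classify_symmetric_2x2([[1, 2], [2, 1]]))   # eigenvalues -1 and 3
print(quadratic_form([[1, 2], [2, 1]], [1, -1]))  # -2: the saddle dips below zero
```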

Why are symmetric matrices special in linear algebra?

Chapter 6: Singular Value Decomposition

Not every matrix is square or diagonalizable. The SVD works for any m×n matrix. It decomposes A as:

A = U Σ V^T

where U is m×m orthogonal, V is n×n orthogonal, and Σ is m×n diagonal with the singular values σ_1 ≥ σ_2 ≥ ... ≥ 0 on the diagonal.

Key insight: Every linear transformation can be decomposed into three steps: (1) rotate/reflect the input space (V^T), (2) scale along axes by σ_i (Σ), (3) rotate/reflect the output space (U). The SVD reveals the true geometry of any matrix.

The singular values are the square roots of the eigenvalues of A^T A (the nonzero ones coincide with those of A A^T). The columns of U are the left singular vectors and the columns of V are the right singular vectors.
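That relationship gives a direct, if numerically naive, way to compute singular values of a 2×2 matrix: form A^T A and take square roots of its eigenvalues. The sketch below does this for a sample matrix chosen for this example. Forming A^T A explicitly loses precision on ill-conditioned matrices, so this is an illustration of the definition rather than a recommended algorithm.

```python
import math

def singular_values_2x2(A):
    """Singular values of a 2x2 matrix via the eigenvalues of A^T A."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = (p + r) / 2
    radius = math.hypot((p - r) / 2, q)
    lam_max = mean + radius
    lam_min = max(mean - radius, 0.0)   # guard against tiny negative round-off
    return math.sqrt(lam_max), math.sqrt(lam_min)

A = [[3, 0],
     [4, 5]]
s1, s2 = singular_values_2x2(A)
print(s1, s2)                         # sqrt(45) and sqrt(5)
print(math.isclose(s1 * s2, 15.0))    # product equals |det(A)| = 15
```

The last line uses the fact that the product of the singular values of a square matrix equals the absolute value of its determinant.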

Application | How SVD is used
Data compression | Keep only the largest singular values (low-rank approximation)
Pseudoinverse | Solve least-squares problems for non-square or singular A
PCA | SVD of centered data matrix gives principal components
Recommender systems | Matrix factorization (Netflix prize approach)
Numerical rank | Count singular values above a threshold

What is the geometric interpretation of the SVD?

Chapter 7: Applications

Systems of ODEs (revisited)

The system x' = Ax has solution x(t) = e^{At} x(0). If A = P D P^{-1}, then e^{At} = P · diag(e^{λ_1 t},...,e^{λ_n t}) · P^{-1}. Diagonalization converts a coupled system into n independent equations.
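The decoupling step can be written out explicitly. The derivation below introduces the change of variables y = P^{-1}x (the symbol y is a choice made here) and assumes A is diagonalizable.

```latex
\begin{aligned}
\mathbf{y} &= P^{-1}\mathbf{x} \\
\mathbf{y}' &= P^{-1}\mathbf{x}' = P^{-1}AP\,\mathbf{y} = D\,\mathbf{y} \\
y_i' &= \lambda_i y_i \;\Longrightarrow\; y_i(t) = e^{\lambda_i t}\, y_i(0), \quad i = 1,\dots,n \\
\mathbf{x}(t) &= P\,\mathbf{y}(t)
 = P\,\mathrm{diag}(e^{\lambda_1 t},\dots,e^{\lambda_n t})\,P^{-1}\mathbf{x}(0)
\end{aligned}
```

Each y_i evolves on its own, which is the sense in which the coupled system becomes n independent scalar equations.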

Markov Chains

A Markov chain describes random transitions between states. The transition matrix M has entries m_{ij} = probability of going from state j to state i, so each column of M sums to 1. The state after k steps is x^{(k)} = M^k x^{(0)}. The steady state is the eigenvector of M with eigenvalue 1, scaled so that its entries sum to 1.
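A steady state can be approximated by applying M repeatedly until the state stops changing. The two-state matrix below is made up for this example; its exact steady state, found by solving Mx = x with entries summing to 1, is [5/6, 1/6]. Repeated multiplication converges for this chain, but not for every chain (a chain that cycles periodically is a counterexample).

```python
def step(M, x):
    """One transition: return the product Mx."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

# Column j holds the probabilities of leaving state j, so columns sum to 1.
M = [[0.9, 0.5],
     [0.1, 0.5]]

x = [1.0, 0.0]          # start with certainty in state 1
for _ in range(100):
    x = step(M, x)

print(x)                # close to [5/6, 1/6]
print(step(M, x))       # essentially unchanged: Mx = x
```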

Least Squares

When Ax = b has no exact solution (overdetermined), the least-squares solution minimizes ||Ax − b||^2. It satisfies the normal equations:

A^T A x = A^T b

Connection to machine learning: Linear regression is exactly least squares. When the columns of A are linearly independent, the "weights" are x = (A^T A)^{-1} A^T b, and the matrix (A^T A)^{-1} A^T is the pseudoinverse of A.

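For a straight-line fit y ≈ c0 + c1·t, each row of A is [1, t], so A^T A is 2×2 and the normal equations can be solved by hand. The sketch below does this with exact fractions. The function name, the sample data, and the restriction to integer inputs are assumptions of this example.

```python
from fractions import Fraction

def least_squares_line(ts, ys):
    """Fit y = c0 + c1*t by solving the normal equations A^T A x = A^T b.

    Rows of A are [1, t], so A^T A = [[n, sum t], [sum t, sum t^2]]
    and A^T b = [sum y, sum t*y]. Inputs are assumed to be integers.
    """
    n = len(ts)
    s_t = sum(ts)
    s_tt = sum(t * t for t in ts)
    s_y = sum(ys)
    s_ty = sum(t * y for t, y in zip(ts, ys))
    det = n * s_tt - s_t * s_t
    if det == 0:
        raise ValueError("A^T A is singular: need at least two distinct t values")
    # Cramer's rule on the 2x2 system.
    c0 = Fraction(s_tt * s_y - s_t * s_ty, det)
    c1 = Fraction(n * s_ty - s_t * s_y, det)
    return c0, c1

c0, c1 = least_squares_line([0, 1, 2, 3], [1, 2, 2, 4])
print(c0, c1)   # 9/10 9/10, i.e. the line y = 0.9 + 0.9 t
```

No line passes through all four points, so Ax = b has no exact solution here; the fitted line is the one with the smallest sum of squared residuals.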
What is the steady state of a Markov chain?

Chapter 8: Linear Transformation Lab

Every 2×2 matrix defines a transformation of the plane. Watch how it maps the unit square, the unit circle, and individual points. Explore rotations, reflections, shears, and projections.

2D Transformation Explorer

The teal unit circle becomes the orange ellipse under transformation A. Eigenvectors shown as thick lines. Grid lines show how the whole plane deforms.

Initial matrix: A = [[1.5, 0.5], [0.5, 1.5]]

The SVD in action: The singular values are the lengths of the semi-axes of the output ellipse. The left singular vectors (U) give the orientation of the ellipse. The right singular vectors (V) give the input directions that map to the axes. Every invertible 2×2 matrix maps circles to ellipses; a singular matrix flattens the circle to a line segment or a point.
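The ellipse claim can be checked numerically by pushing points of the unit circle through a matrix and measuring how long the images are. The sketch uses A = [[1.5, 0.5], [0.5, 1.5]], which is symmetric with eigenvalues 2 and 1, so its singular values are also 2 and 1. The one-degree sampling and the function name are choices made for this example.

```python
import math

A = [[1.5, 0.5],
     [0.5, 1.5]]

def image_length(theta):
    """Length of A applied to the unit vector at angle theta."""
    x, y = math.cos(theta), math.sin(theta)
    return math.hypot(A[0][0] * x + A[0][1] * y,
                      A[1][0] * x + A[1][1] * y)

# Sample the unit circle once per degree.
lengths = [image_length(math.radians(deg)) for deg in range(360)]
longest, shortest = max(lengths), min(lengths)

print(longest, shortest)   # approximately 2 and 1: the semi-axes of the ellipse
```

The longest image occurs at 45° and the shortest at 135°, the directions of the two right singular vectors of this matrix.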

Chapter 9: Connections

This lesson | Where it leads
Eigenvalues | ODE systems (Ch 2), stability analysis, quantum mechanics
Diagonalization | Matrix exponential e^{At}, Markov chains, Google PageRank
SVD | PCA, dimensionality reduction, recommender systems
Determinants | Change of variables in integrals (Jacobian), volume forms
Rank | Solvability theory for linear systems, null spaces
Least squares | Linear regression, signal processing, curve fitting

Historical note: Eigenvalues were introduced by Cauchy in 1829 for quadratic forms. The word "eigen" was introduced by Hilbert around 1904. Today eigenvalues are arguably the single most important concept in applied mathematics.

"The introduction of numbers as coordinates is an act of violence." — Hermann Weyl

What is the SVD of a matrix A?