EE269 Lecture 12 — Mert Pilanci, Stanford

Linear Systems & Eigenvectors

How LTI systems are completely characterized by their impulse response — and why complex exponentials are their eigenvectors.

Prerequisites: EE269 Lecture 6 (DFT) + Basic linear algebra (matrices, eigenvalues). That's it.
8
Chapters
6+
Simulations
0
Assumed Knowledge

Chapter 0: Systems Basics

You're building an audio equalizer. The user moves a slider and the bass gets louder. What's actually happening? A system takes the input signal x[n] (the original audio) and produces an output signal y[n] (the modified audio). The system's job: transform one signal into another.

x[n] → [System T] → y[n] = T{x[n]}

Every filter, every effect pedal, every noise cancellation algorithm, every averaging operation is a system. The question that drives this entire lecture: what properties must a system have for us to analyze it completely?

A Simple Example: Moving Average

The M-point moving average takes the average of the current sample and the M-1 previous samples:

y[n] = (1/M) ∑k=0M-1 x[n - k]

For M = 3: y[n] = (x[n] + x[n-1] + x[n-2]) / 3. This smooths the signal — rapid fluctuations get averaged out, slow trends remain.

What Makes a System Useful?

A completely arbitrary system could do anything — map sin(t) to a picture of a cat. That's too general to analyze. We need constraints. Two constraints together give us enormous analytical power:

Linearity: The system respects superposition

Shift-invariance: The system doesn't care when the input happens

A system with BOTH properties is called Linear Time-Invariant (LTI). As we'll prove, LTI systems are completely characterized by a single function: their impulse response h[n]. Know h[n], and you know everything the system will ever do.

The profound result of this lecture: For LTI systems, the operation is ALWAYS convolution (in time domain) or multiplication (in frequency domain). Complex exponentials are the eigenvectors. The DFT diagonalizes any LTI system. This is why Fourier analysis is so central to signal processing.
A System in Action: Moving Average

Input x[n] (noisy signal) passes through an M-point moving average. Adjust M to see how the system smooths more aggressively.

Window M 5
An LTI system is completely characterized by what single function?

Chapter 1: Linearity

A system T is linear if it satisfies the superposition principle:

T{c1x1[n] + c2x2[n]} = c1 T{x1[n]} + c2 T{x2[n]}

This packs two properties into one equation:

Homogeneity: T{c · x[n]} = c · T{x[n]} — scaling the input scales the output

Additivity: T{x1 + x2} = T{x1} + T{x2} — the response to a sum equals the sum of responses

Proving the Moving Average is Linear

Let's prove it rigorously. Define T{x[n]} = (1/M) ∑k=0M-1 x[n-k].

Apply T to a linear combination c1x1 + c2x2:

T{c1x1 + c2x2} = (1/M) ∑k=0M-1 [c1x1[n-k] + c2x2[n-k]]

Distribute the sum:

= c1 (1/M) ∑k=0M-1 x1[n-k] + c2 (1/M) ∑k=0M-1 x2[n-k]
= c1 T{x1} + c2 T{x2}   ✓

The proof works because summation is linear. Any system defined by a weighted sum of input samples is linear.

Key insight: Linearity lets us decompose any input into simple pieces, process each piece separately, then add the results. If we can decompose x[n] into impulses (spoiler: we can!), we only need to know the system's response to a single impulse to predict the output for ANY input.

Nonlinear Systems (for contrast)

Not every useful system is linear. Consider:

Squaring: T{x[n]} = x[n]2. Then T{2x} = 4x2 ≠ 2x2 = 2T{x}. Not linear.

Clipping: T{x[n]} = min(x[n], 1). Violates homogeneity for inputs > 1.

Median filter: T{x[n]} = median(x[n-1], x[n], x[n+1]). Violates additivity.

These are all useful (clipping prevents distortion, median filter removes impulse noise), but we can't analyze them with the elegant machinery we're building.

Linearity Test

Two inputs x1, x2 and scalars c1, c2. Compare T{c1*x1 + c2*x2} vs c1*T{x1} + c2*T{x2}. For a linear system, they're identical.

c1 1.0
c2 0.5
Which of these systems is linear?

Chapter 2: Shift-Invariance

The second crucial property. A system is shift-invariant (or time-invariant) if shifting the input shifts the output by the same amount:

If T{x[n]} = y[n],   then T{x[n - n0]} = y[n - n0]

In words: the system doesn't care when the input happens. A clap at t = 0 produces the same echo as a clap at t = 5 (just shifted by 5).

Why Shift-Invariance Matters

If a system is shift-invariant, its behavior doesn't depend on absolute time. The impulse response at time 0 is the same as the impulse response at time 1000. This means we only need to measure h[n] ONCE, and we know the system's response to an impulse placed ANYWHERE.

Physical intuition: A loudspeaker in a room produces an echo pattern (reverb). If the room doesn't change, the echo pattern is the same regardless of when you clap. That's shift-invariance. But if someone opens a door during the recording, the system changes — it's no longer time-invariant.

Proving the Moving Average is Shift-Invariant

Let y[n] = T{x[n]} = (1/M) ∑k=0M-1 x[n-k].

Now apply T to the shifted input x[n - n0]:

T{x[n - n0]} = (1/M) ∑k=0M-1 x[(n - n0) - k] = (1/M) ∑k=0M-1 x[(n-n0)-k] = y[n - n0]   ✓

The moving average treats all time indices the same — it has no "preferred" time.

Shift-Variant Examples

Modulation: T{x[n]} = x[n] · cos(ωn). The cos(ωn) depends on absolute time n. Shifting x by n0 gives x[n-n0] · cos(ωn) ≠ x[n-n0] · cos(ω(n-n0)).

Downsampling: T{x[n]} = x[2n]. Shift x by 1: x[2n - 1]. But shifting the output by 1 gives x[2(n-1)] = x[2n-2] ≠ x[2n-1]. Not shift-invariant.

LTI = Both Properties Together

A system that is BOTH linear AND shift-invariant is called LTI (Linear Time-Invariant). The moving average is LTI. Most "nice" filters are LTI. The key theorem (next chapter): any LTI system performs convolution.

Shift-Invariance Checker

Shift the input signal and observe whether the output shifts identically. For an LTI system, it should.

Shift n0 0
T{x[n]} = x[n] · n (multiply by the time index). Is this system shift-invariant?

Chapter 3: Convolution

Here's the fundamental theorem of LTI systems: if a system is linear AND shift-invariant, its output is the convolution of the input with the impulse response.

The Derivation

Any discrete signal can be written as a sum of shifted impulses:

x[n] = ∑k=-∞ x[k] · δ[n - k]

This is just a fancy way of saying "x[n] is the sum of spikes, each weighted by x[k] and located at position k."

Now apply the LTI system T to both sides:

y[n] = T{x[n]} = T{∑k x[k] · δ[n-k]}

By linearity, we pull the sum and constants out:

= ∑k x[k] · T{δ[n-k]}

By shift-invariance, T{δ[n-k]} = h[n-k] (the impulse response shifted to position k):

y[n] = ∑k=-∞ x[k] · h[n-k] = (x * h)[n]
This is convolution. The output at time n is computed by: flip h, shift it to position n, multiply point-by-point with x, and sum. The symbol * denotes convolution. For finite signals of length N: y[n] = ∑k=0N-1 h[k] · x[n-k].

Worked Example

Input: x = [1, 2, 3, 4] (length 4). Impulse response: h = [1, 1, 1] / 3 (3-point moving average).

Convolution y = x * h:

• y[0] = h[0]x[0] = (1/3)(1) = 0.33

• y[1] = h[0]x[1] + h[1]x[0] = (1/3)(2) + (1/3)(1) = 1.00

• y[2] = h[0]x[2] + h[1]x[1] + h[2]x[0] = (1/3)(3+2+1) = 2.00

• y[3] = h[0]x[3] + h[1]x[2] + h[2]x[1] = (1/3)(4+3+2) = 3.00

• y[4] = h[1]x[3] + h[2]x[2] = (1/3)(4+3) = 2.33

• y[5] = h[2]x[3] = (1/3)(4) = 1.33

Output length: len(x) + len(h) - 1 = 4 + 3 - 1 = 6.

In Code

python
import numpy as np

def convolve(x, h):
    """Direct convolution of x with h."""
    N, M = len(x), len(h)
    y = np.zeros(N + M - 1)
    for n in range(N + M - 1):
        for k in range(M):
            if 0 <= n - k < N:
                y[n] += h[k] * x[n - k]
    return y

# Verify against numpy
x = np.array([1, 2, 3, 4])
h = np.array([1, 1, 1]) / 3
assert np.allclose(convolve(x, h), np.convolve(x, h))
Convolution Visualizer

Watch h[n] flip and slide across x[n]. The output y[n] builds up one sample at a time.

Position n 0
If x has length N and h has length M, what is the length of y = x * h?

Chapter 4: DFT Domain

Convolution in time takes O(N2) operations. There's a shortcut: the convolution theorem.

y[n] = x[n] * h[n]   ↔   Y[k] = X[k] · H[k]

Convolution in time becomes pointwise multiplication in frequency. This is why the DFT is so powerful for filtering: transform to frequency, multiply by the filter's frequency response H[k], transform back.

The Fast Filtering Recipe

1. DFT
X[k] = DFT{x[n]},   H[k] = DFT{h[n]}
2. Multiply
Y[k] = X[k] · H[k] for each k
3. IDFT
y[n] = IDFT{Y[k]}

Cost: O(N log N) for the two DFTs + O(N) for the multiplication = O(N log N) total. Much faster than O(N2) direct convolution for long signals.

H[k] is the frequency response. It tells you, for each frequency bin k, how much the system amplifies (|H[k]| > 1) or attenuates (|H[k]| < 1) that frequency, and how much it shifts the phase (∠H[k]). Designing a filter = designing H[k].

Designing a Filter via H[k]

Want to remove high frequencies? Set H[k] = 0 for high-frequency bins:

H[k] = 1 for k ≤ kc,   H[k] = 0 for k > kc

This is an ideal lowpass filter. In practice, the sharp cutoff creates ringing artifacts (Gibbs phenomenon), so we use smoother rolloffs.

Worked Example

Signal x = [1, 2, 3, 4] (N = 4). Filter h = [0.5, 0.5, 0, 0] (2-point average, zero-padded to length 4).

DFT of x: X = [10, -2+2j, -2, -2-2j].

DFT of h: H = [1, 0.5-0.5j, 0, 0.5+0.5j].

Product: Y = X · H = [10, -1+1j+1+1j, 0, -1-1j+1-1j] (compute each entry)...

Actually let's just verify: np.fft.ifft(np.fft.fft([1,2,3,4]) * np.fft.fft([0.5,0.5,0,0])) = [2.5, 1.5, 2.5, 3.5]. This is the circular convolution of x and h (length-4 circular, not linear).

Circular vs. linear convolution: DFT multiplication gives circular convolution (wraps around). To get linear convolution, zero-pad both signals to length N + M - 1 before taking the DFT. This is the standard trick.
Filter Designer: Shape H[k] and See the Result

Design a frequency response H[k] by clicking the bars. The input signal passes through your filter — observe the filtered output and the impulse response h[n].

Click a preset or click the H[k] bars to design your own filter.
python
import numpy as np

def fft_convolve(x, h):
    """Fast convolution via FFT (linear, not circular)."""
    N = len(x) + len(h) - 1
    X = np.fft.fft(x, N)
    H = np.fft.fft(h, N)
    return np.real(np.fft.ifft(X * H))

# 10x faster than direct convolution for N > ~64
In the DFT domain, what operation does convolution in time correspond to?

Chapter 5: Eigenvectors of LTI

Here's one of the deepest results in signal processing: complex exponentials are the eigenvectors of ALL LTI systems.

What Does "Eigenvector" Mean for a System?

For a matrix A, an eigenvector v satisfies Av = λv — the matrix just scales v. For a system T, an eigenfunction x[n] satisfies:

T{x[n]} = λ · x[n]

The system just scales the input — it doesn't change its shape. The scaling factor λ is the eigenvalue.

The Proof

Let x[n] = ejωn (a complex exponential at frequency ω). Apply any LTI system with impulse response h[k]:

y[n] = ∑k h[k] · x[n-k] = ∑k h[k] · ejω(n-k)

Factor out ejωn (it doesn't depend on k):

= ejωnk h[k] · e-jωk = ejωn · H(ω)

Where H(ω) = ∑k h[k] e-jωk is the frequency response (the DTFT of h[n]).

The result: T{ejωn} = H(ω) · ejωn. The complex exponential passes through ANY LTI system unchanged in shape — only scaled by H(ω). The eigenvalue is H(ω): it tells you how much the system amplifies or attenuates that specific frequency.

Why This Matters

If complex exponentials are eigenvectors, then decomposing a signal into complex exponentials (= taking the DFT) diagonalizes the system. In the DFT domain, the system is just multiplication by H[k] — which is exactly the convolution theorem!

The DFT is not just a convenient trick. It's the natural representation for LTI systems, just as eigenvectors are the natural basis for matrices.

Analogy: Matrix Diagonalization

For a matrix A with eigendecomposition A = UΛUH:

• To compute Ax, transform to eigenbasis (UHx), multiply by eigenvalues (Λ), transform back (U)

• To filter x with LTI h, transform to DFT (F x), multiply by H[k] (diagonal), transform back (F-1)

The DFT matrix F plays the role of U. The frequency response H[k] plays the role of the diagonal Λ.

Eigenvector Property: Exponential Through LTI

Feed a complex exponential at frequency ω into an LTI system. The output is the SAME exponential, just scaled by H(ω). Change ω to see different eigenvalues.

Frequency ω/π 0.20
If you input e^{jωn} into any LTI system with impulse response h[n], what is the output?

Chapter 6: Circulant Matrices

Let's make the matrix connection explicit. When we write the LTI operation y = h * x in matrix form for finite-length signals, we get a special matrix structure: the circulant matrix.

Matrix Form of Convolution

For signals of length N with circular (periodic) boundary conditions, convolution y[n] = ∑ h[k] x[n-k mod N] can be written as:

y = H · x

Where H is an N×N matrix whose rows are circular shifts of h:

H = [h[0], h[N-1], h[N-2], ..., h[1];   h[1], h[0], h[N-1], ..., h[2];   ...]

For example, with h = [a, b, c, 0] and N = 4:

H = [[a, 0, c, b], [b, a, 0, c], [c, b, a, 0], [0, c, b, a]]

Each row is the previous row shifted one position to the right (with wrap-around). This is a circulant matrix.

The key theorem: Every circulant matrix has the DFT vectors as eigenvectors. Specifically: H = F-1 diag(H[0], H[1], ..., H[N-1]) F, where F is the DFT matrix and H[k] = DFT{h[n]}. This is the matrix version of the convolution theorem.

Eigendecomposition of Circulant Matrices

The DFT matrix F has entries F[m,n] = (1/√N) e-j2πmn/N. The eigendecomposition is:

H = FH Λ F

Where Λ = diag(H[0], H[1], ..., H[N-1]) contains the DFT of the first row (= the frequency response).

This means:

• The eigenvectors of H are the columns of FH = the DFT basis vectors

• The eigenvalues of H are H[k] = DFT{h[n]} = the frequency response

• Multiplying by H (= circular convolution) is diagonalized by the DFT

Why This Matters Practically

Computing Hx directly costs O(N2). But since H = FHΛF:

• Fx = FFT of x: O(N log N)

• Λ(Fx) = pointwise multiply: O(N)

• FH(ΛFx) = inverse FFT: O(N log N)

• Total: O(N log N) instead of O(N2)

Circulant Matrix Visualizer

See how the impulse response h generates a circulant matrix. Each row is a circular shift.

python
import numpy as np
from scipy.linalg import circulant

# Build circulant matrix from h
h = np.array([1, 1, 1, 0, 0, 0, 0, 0]) / 3
H_mat = circulant(h)

# Verify eigendecomposition: eigenvalues = DFT of h
eigenvalues = np.fft.fft(h)
F = np.fft.fft(np.eye(len(h)))  # DFT matrix
# H_mat ≈ F.conj().T @ diag(eigenvalues) @ F / N
What are the eigenvectors of any N×N circulant matrix?

Chapter 7: Mastery

Let's tie it all together. This lecture built a tower of results, each depending on the previous:

Linear + Shift-Invariant
These two properties define LTI systems
Convolution
LTI ⇒ output = input * impulse response
Complex exponentials = eigenvectors
e^{jωn} passes through unchanged (scaled by H(ω))
DFT diagonalizes LTI
Convolution ↔ multiplication in DFT domain
Circulant matrix = eigendecomposition
H = F^H Λ F — the matrix formulation

The Big Picture

DomainSystem OperationCost
Time domainy = h * x (convolution)O(N2)
Frequency domainY[k] = H[k] · X[k] (multiplication)O(N log N) via FFT
Matrix formy = Hx (circulant matrix)O(N2) direct, O(N log N) via FFT
Eigen formH = FHΛF (diagonal in eigenbasis)Same as frequency domain

Connections

Looking back: Lecture 6 (DFT) introduced the transform. Now we know WHY it's so fundamental: it diagonalizes all LTI systems. Lecture 11 (Wavelets) showed that wavelets offer an alternative decomposition — better for nonstationary signals but without the clean convolution theorem.

Looking forward: Lecture 13 (Cepstrum & MFCC) exploits the convolution theorem for speech: if a speech signal is a source convolved with a vocal tract filter, taking the log in the DFT domain separates them into an additive sum.

The takeaway: LTI systems have beautiful structure: convolution, eigenvectors, diagonalization. This structure is why signal processing works — and why the DFT is the single most important tool in the field.
Why is the DFT considered the "natural" basis for analyzing LTI systems?