How LTI systems are completely characterized by their impulse response — and why complex exponentials are their eigenvectors.
You're building an audio equalizer. The user moves a slider and the bass gets louder. What's actually happening? A system takes the input signal x[n] (the original audio) and produces an output signal y[n] (the modified audio). The system's job: transform one signal into another.
Every filter, every effect pedal, every noise cancellation algorithm, every averaging operation is a system. The question that drives this entire lecture: what properties must a system have for us to analyze it completely?
The M-point moving average takes the average of the current sample and the M-1 previous samples:
For M = 3: y[n] = (x[n] + x[n-1] + x[n-2]) / 3. This smooths the signal — rapid fluctuations get averaged out, slow trends remain.
A completely arbitrary system could do anything — map sin(t) to a picture of a cat. That's too general to analyze. We need constraints. Two constraints together give us enormous analytical power:
• Linearity: The system respects superposition
• Shift-invariance: The system doesn't care when the input happens
A system with BOTH properties is called Linear Time-Invariant (LTI). As we'll prove, LTI systems are completely characterized by a single function: their impulse response h[n]. Know h[n], and you know everything the system will ever do.
Input x[n] (noisy signal) passes through an M-point moving average. Adjust M to see how the system smooths more aggressively.
A system T is linear if it satisfies the superposition principle:
This packs two properties into one equation:
• Homogeneity: T{c · x[n]} = c · T{x[n]} — scaling the input scales the output
• Additivity: T{x1 + x2} = T{x1} + T{x2} — the response to a sum equals the sum of responses
Let's prove it rigorously. Define T{x[n]} = (1/M) ∑k=0M-1 x[n-k].
Apply T to a linear combination c1x1 + c2x2:
Distribute the sum:
The proof works because summation is linear. Any system defined by a weighted sum of input samples is linear.
Not every useful system is linear. Consider:
• Squaring: T{x[n]} = x[n]2. Then T{2x} = 4x2 ≠ 2x2 = 2T{x}. Not linear.
• Clipping: T{x[n]} = min(x[n], 1). Violates homogeneity for inputs > 1.
• Median filter: T{x[n]} = median(x[n-1], x[n], x[n+1]). Violates additivity.
These are all useful (clipping prevents distortion, median filter removes impulse noise), but we can't analyze them with the elegant machinery we're building.
Two inputs x1, x2 and scalars c1, c2. Compare T{c1*x1 + c2*x2} vs c1*T{x1} + c2*T{x2}. For a linear system, they're identical.
The second crucial property. A system is shift-invariant (or time-invariant) if shifting the input shifts the output by the same amount:
In words: the system doesn't care when the input happens. A clap at t = 0 produces the same echo as a clap at t = 5 (just shifted by 5).
If a system is shift-invariant, its behavior doesn't depend on absolute time. The impulse response at time 0 is the same as the impulse response at time 1000. This means we only need to measure h[n] ONCE, and we know the system's response to an impulse placed ANYWHERE.
Let y[n] = T{x[n]} = (1/M) ∑k=0M-1 x[n-k].
Now apply T to the shifted input x[n - n0]:
The moving average treats all time indices the same — it has no "preferred" time.
• Modulation: T{x[n]} = x[n] · cos(ωn). The cos(ωn) depends on absolute time n. Shifting x by n0 gives x[n-n0] · cos(ωn) ≠ x[n-n0] · cos(ω(n-n0)).
• Downsampling: T{x[n]} = x[2n]. Shift x by 1: x[2n - 1]. But shifting the output by 1 gives x[2(n-1)] = x[2n-2] ≠ x[2n-1]. Not shift-invariant.
A system that is BOTH linear AND shift-invariant is called LTI (Linear Time-Invariant). The moving average is LTI. Most "nice" filters are LTI. The key theorem (next chapter): any LTI system performs convolution.
Shift the input signal and observe whether the output shifts identically. For an LTI system, it should.
Here's the fundamental theorem of LTI systems: if a system is linear AND shift-invariant, its output is the convolution of the input with the impulse response.
Any discrete signal can be written as a sum of shifted impulses:
This is just a fancy way of saying "x[n] is the sum of spikes, each weighted by x[k] and located at position k."
Now apply the LTI system T to both sides:
By linearity, we pull the sum and constants out:
By shift-invariance, T{δ[n-k]} = h[n-k] (the impulse response shifted to position k):
Input: x = [1, 2, 3, 4] (length 4). Impulse response: h = [1, 1, 1] / 3 (3-point moving average).
Convolution y = x * h:
• y[0] = h[0]x[0] = (1/3)(1) = 0.33
• y[1] = h[0]x[1] + h[1]x[0] = (1/3)(2) + (1/3)(1) = 1.00
• y[2] = h[0]x[2] + h[1]x[1] + h[2]x[0] = (1/3)(3+2+1) = 2.00
• y[3] = h[0]x[3] + h[1]x[2] + h[2]x[1] = (1/3)(4+3+2) = 3.00
• y[4] = h[1]x[3] + h[2]x[2] = (1/3)(4+3) = 2.33
• y[5] = h[2]x[3] = (1/3)(4) = 1.33
Output length: len(x) + len(h) - 1 = 4 + 3 - 1 = 6.
python import numpy as np def convolve(x, h): """Direct convolution of x with h.""" N, M = len(x), len(h) y = np.zeros(N + M - 1) for n in range(N + M - 1): for k in range(M): if 0 <= n - k < N: y[n] += h[k] * x[n - k] return y # Verify against numpy x = np.array([1, 2, 3, 4]) h = np.array([1, 1, 1]) / 3 assert np.allclose(convolve(x, h), np.convolve(x, h))
Watch h[n] flip and slide across x[n]. The output y[n] builds up one sample at a time.
Convolution in time takes O(N2) operations. There's a shortcut: the convolution theorem.
Convolution in time becomes pointwise multiplication in frequency. This is why the DFT is so powerful for filtering: transform to frequency, multiply by the filter's frequency response H[k], transform back.
Cost: O(N log N) for the two DFTs + O(N) for the multiplication = O(N log N) total. Much faster than O(N2) direct convolution for long signals.
Want to remove high frequencies? Set H[k] = 0 for high-frequency bins:
This is an ideal lowpass filter. In practice, the sharp cutoff creates ringing artifacts (Gibbs phenomenon), so we use smoother rolloffs.
Signal x = [1, 2, 3, 4] (N = 4). Filter h = [0.5, 0.5, 0, 0] (2-point average, zero-padded to length 4).
DFT of x: X = [10, -2+2j, -2, -2-2j].
DFT of h: H = [1, 0.5-0.5j, 0, 0.5+0.5j].
Product: Y = X · H = [10, -1+1j+1+1j, 0, -1-1j+1-1j] (compute each entry)...
Actually let's just verify: np.fft.ifft(np.fft.fft([1,2,3,4]) * np.fft.fft([0.5,0.5,0,0])) = [2.5, 1.5, 2.5, 3.5]. This is the circular convolution of x and h (length-4 circular, not linear).
Design a frequency response H[k] by clicking the bars. The input signal passes through your filter — observe the filtered output and the impulse response h[n].
python import numpy as np def fft_convolve(x, h): """Fast convolution via FFT (linear, not circular).""" N = len(x) + len(h) - 1 X = np.fft.fft(x, N) H = np.fft.fft(h, N) return np.real(np.fft.ifft(X * H)) # 10x faster than direct convolution for N > ~64
Here's one of the deepest results in signal processing: complex exponentials are the eigenvectors of ALL LTI systems.
For a matrix A, an eigenvector v satisfies Av = λv — the matrix just scales v. For a system T, an eigenfunction x[n] satisfies:
The system just scales the input — it doesn't change its shape. The scaling factor λ is the eigenvalue.
Let x[n] = ejωn (a complex exponential at frequency ω). Apply any LTI system with impulse response h[k]:
Factor out ejωn (it doesn't depend on k):
Where H(ω) = ∑k h[k] e-jωk is the frequency response (the DTFT of h[n]).
If complex exponentials are eigenvectors, then decomposing a signal into complex exponentials (= taking the DFT) diagonalizes the system. In the DFT domain, the system is just multiplication by H[k] — which is exactly the convolution theorem!
The DFT is not just a convenient trick. It's the natural representation for LTI systems, just as eigenvectors are the natural basis for matrices.
For a matrix A with eigendecomposition A = UΛUH:
• To compute Ax, transform to eigenbasis (UHx), multiply by eigenvalues (Λ), transform back (U)
• To filter x with LTI h, transform to DFT (F x), multiply by H[k] (diagonal), transform back (F-1)
The DFT matrix F plays the role of U. The frequency response H[k] plays the role of the diagonal Λ.
Feed a complex exponential at frequency ω into an LTI system. The output is the SAME exponential, just scaled by H(ω). Change ω to see different eigenvalues.
Let's make the matrix connection explicit. When we write the LTI operation y = h * x in matrix form for finite-length signals, we get a special matrix structure: the circulant matrix.
For signals of length N with circular (periodic) boundary conditions, convolution y[n] = ∑ h[k] x[n-k mod N] can be written as:
Where H is an N×N matrix whose rows are circular shifts of h:
For example, with h = [a, b, c, 0] and N = 4:
Each row is the previous row shifted one position to the right (with wrap-around). This is a circulant matrix.
The DFT matrix F has entries F[m,n] = (1/√N) e-j2πmn/N. The eigendecomposition is:
Where Λ = diag(H[0], H[1], ..., H[N-1]) contains the DFT of the first row (= the frequency response).
This means:
• The eigenvectors of H are the columns of FH = the DFT basis vectors
• The eigenvalues of H are H[k] = DFT{h[n]} = the frequency response
• Multiplying by H (= circular convolution) is diagonalized by the DFT
Computing Hx directly costs O(N2). But since H = FHΛF:
• Fx = FFT of x: O(N log N)
• Λ(Fx) = pointwise multiply: O(N)
• FH(ΛFx) = inverse FFT: O(N log N)
• Total: O(N log N) instead of O(N2)
See how the impulse response h generates a circulant matrix. Each row is a circular shift.
python import numpy as np from scipy.linalg import circulant # Build circulant matrix from h h = np.array([1, 1, 1, 0, 0, 0, 0, 0]) / 3 H_mat = circulant(h) # Verify eigendecomposition: eigenvalues = DFT of h eigenvalues = np.fft.fft(h) F = np.fft.fft(np.eye(len(h))) # DFT matrix # H_mat ≈ F.conj().T @ diag(eigenvalues) @ F / N
Let's tie it all together. This lecture built a tower of results, each depending on the previous:
| Domain | System Operation | Cost |
|---|---|---|
| Time domain | y = h * x (convolution) | O(N2) |
| Frequency domain | Y[k] = H[k] · X[k] (multiplication) | O(N log N) via FFT |
| Matrix form | y = Hx (circulant matrix) | O(N2) direct, O(N log N) via FFT |
| Eigen form | H = FHΛF (diagonal in eigenbasis) | Same as frequency domain |
Looking back: Lecture 6 (DFT) introduced the transform. Now we know WHY it's so fundamental: it diagonalizes all LTI systems. Lecture 11 (Wavelets) showed that wavelets offer an alternative decomposition — better for nonstationary signals but without the clean convolution theorem.
Looking forward: Lecture 13 (Cepstrum & MFCC) exploits the convolution theorem for speech: if a speech signal is a source convolved with a vocal tract filter, taking the log in the DFT domain separates them into an additive sum.