Hartley & Zisserman, Chapter 11

Computation of the Fundamental Matrix F

The 8-point algorithm, normalization, algebraic and geometric minimization, RANSAC for robust estimation, degeneracies, and image rectification.

Prerequisites: Chapter 4 (Estimation) + Chapter 9 (Epipolar Geometry).

Chapter 0: Why Compute F?

The fundamental matrix F encodes the epipolar geometry of two views. Computing it reliably from image correspondences is the first step of any uncalibrated reconstruction pipeline. Get F right, and everything downstream (triangulation, reconstruction, calibration) benefits. Get it wrong, and nothing works.

The challenge: Real correspondences contain outliers (wrong matches). Measurements are noisy. Some scene configurations are degenerate. A robust F estimation algorithm must handle all three issues simultaneously.

The relationship is x'^TFx = 0 for corresponding points x ↔ x'. Each correspondence gives one linear equation in the 9 entries of F. With at least 8 correspondences in general position, F is determined up to scale; 7 suffice if det F = 0 is also imposed.

How many equations does each point correspondence contribute to the linear system for F?

1 equation (from x'^TFx = 0)
2 equations
3 equations

Chapter 1: The 8-Point Algorithm

Expanding x'^TFx = 0 with x = (x, y, 1)^T and x' = (x', y', 1)^T:

x'x f₁₁ + x'y f₁₂ + x' f₁₃ + y'x f₂₁ + y'y f₂₂ + y' f₂₃ + x f₃₁ + y f₃₂ + f₃₃ = 0

This is one linear equation in the 9 unknowns f_ij. Stacking n correspondences gives an n × 9 matrix A and we solve Af = 0.
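As a minimal sketch of how the rows of A are assembled, the snippet below builds one row per correspondence and checks it against a known F. The helper names (`design_row`, `flatten`, `dot`) and the demo matrix are illustrative assumptions, not part of the text; F is flattened row by row as f = (f₁₁, f₁₂, ..., f₃₃).

```python
def design_row(x, xp):
    """Row of A for one correspondence x = (x, y) <-> xp = (x', y')."""
    (u, v), (up, vp) = x, xp
    return [up * u, up * v, up, vp * u, vp * v, vp, u, v, 1.0]


def flatten(F):
    """Flatten a 3x3 matrix row by row into the 9-vector f."""
    return [entry for row in F for entry in row]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


# Illustrative F (an assumption for this demo): the skew-symmetric matrix of
# e' = (1, 0, 0)^T, for which x'^T F x reduces to y - y'.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

matches = [((10.0, 20.0), (35.0, 20.0)),   # same y in both images
           ((50.0, 80.0), (61.0, 80.0))]
A = [design_row(x, xp) for x, xp in matches]
residuals = [dot(row, flatten(F)) for row in A]   # each entry equals x'^T F x
```

In a full estimator, f would instead be recovered from A as its (approximate) null vector; here the known F only serves to confirm that each row encodes the epipolar constraint.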

Case	Null space dim	# solutions for F
n ≥ 8 (general position)	1	Unique (up to scale)
n = 7	2	1 or 3 solutions (imposing det F = 0)

Why "8-point"? F has 9 entries but is defined up to scale, so 8 DOF as a homogeneous matrix. However, det F = 0 imposes an additional constraint, leaving 7 true DOF. Eight correspondences give 8 equations, determining a unique solution in the 8-DOF space. The det = 0 constraint is enforced afterwards.

The 8-point algorithm solves for F using how many linear equations?

7
8 or more (one per correspondence)
11

Chapter 2: Normalization

The un-normalized 8-point algorithm performs poorly in practice. The reason: with pixel coordinates such as (500, 300, 1), the entries in a row of A range from products of two coordinates (of order 10⁵) down to 1. A is then badly conditioned, and the SVD solution becomes highly sensitive to noise.

Hartley's normalized 8-point algorithm is dramatically better:

Step	Action
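1	Translate and scale the points in each image (transformations T and T') so that the centroid is at the origin and the RMS distance from it is √2
2	Run the 8-point algorithm on the normalized points, including the rank-2 enforcement of Chapter 3, to obtain F̃
3	Denormalize: F = T'^T F̃ T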

This normalization step transforms the 8-point algorithm from a curiosity to a practical workhorse. Without it, results can be off by orders of magnitude. With it, the 8-point algorithm often approaches the accuracy of iterative methods.
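Step 1 of the table can be sketched as follows. This is a minimal illustration assuming inhomogeneous pixel coordinates; the function names and the sample points are hypothetical. The same routine would be run separately on each image to obtain T and T'.

```python
import math


def normalizing_transform(points):
    """Return the 3x3 similarity T that moves the centroid to the origin
    and scales the points so their RMS distance from it is sqrt(2)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    s = math.sqrt(2.0 / mean_sq)   # assumes the points are not all identical
    return [[s, 0.0, -s * cx],
            [0.0, s, -s * cy],
            [0.0, 0.0, 1.0]]


def apply_transform(T, points):
    """Apply T to inhomogeneous 2D points (T is a similarity, so w stays 1)."""
    return [(T[0][0] * x + T[0][2], T[1][1] * y + T[1][2]) for x, y in points]


pixels = [(500.0, 300.0), (520.0, 310.0), (100.0, 50.0), (640.0, 480.0)]
T = normalizing_transform(pixels)
normalized = apply_transform(T, pixels)
```

After this step the coordinates are of order 1, so the entries of each row of A have comparable magnitudes.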

What does normalization do to the image coordinates before computing F?

Translates centroid to origin and scales so RMS distance from origin is √2
Divides all coordinates by the image width
Rounds coordinates to the nearest integer

Chapter 3: The Singularity Constraint

The matrix F must satisfy det F = 0 (it has rank 2). The SVD solution of Af = 0 does not guarantee this. We must enforce the rank-2 constraint as a post-processing step.

The method: compute the SVD of the estimated F = UDV^T, where D = diag(σ₁, σ₂, σ₃). Set σ₃ = 0 to get D' = diag(σ₁, σ₂, 0). The corrected F' = UD'V^T is the closest rank-2 matrix to F in Frobenius norm.
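The size of this correction can be read directly from the SVD. As a short worked step (standard linear algebra, not spelled out in the text above):

```latex
F - F' = U\,\operatorname{diag}(0, 0, \sigma_3)\,V^{\top}
\quad\Longrightarrow\quad
\lVert F - F' \rVert_F = \sigma_3 .
```

So the smaller σ₃ is relative to σ₁ and σ₂, the less the linear estimate is changed by the rank-2 projection.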

Why rank 2 matters: If F had rank 3, the epipolar lines would not pass through a common epipole. The geometry would be inconsistent. Enforcing rank 2 ensures that all epipolar lines in each image pass through a single point (the epipole).
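To make the role of the epipole concrete, the sketch below takes a rank-2 matrix, computes its null vector e (so that Fe = 0), and checks that several epipolar lines l = F^T x' all contain e. The matrix and the points are hypothetical values chosen for exact integer arithmetic, and the cross-product shortcut assumes the first two rows of F are linearly independent.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def mat_vec(M, v):
    return tuple(sum(M[i][j] * v[j] for j in range(3)) for i in range(3))


def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]


def right_epipole(F):
    """Null vector e of a rank-2 F (F e = 0), from two independent rows."""
    return cross(F[0], F[1])


# Hypothetical rank-2 matrix (det F = 0), used only for illustration.
F = [[0, -3, -4],
     [3, 0, 4],
     [1, 2, 4]]
e = right_epipole(F)

# Epipolar lines in image 1 are l = F^T x'; each one should contain e.
lines = [mat_vec(transpose(F), xp) for xp in [(10, 20, 1), (300, 40, 1), (5, 7, 1)]]
incidences = [sum(l[k] * e[k] for k in range(3)) for l in lines]   # l . e
```

For a rank-3 matrix no such common point exists, because Fe = 0 has only the trivial solution.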

For the 7-point algorithm: the null space of A is 2-dimensional, giving F = αF₁ + (1−α)F₂. The constraint det(αF₁ + (1−α)F₂) = 0 is a cubic in α, giving 1 or 3 real solutions.
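The cubic in α can be obtained without symbolic algebra. The sketch below recovers its four coefficients from a few determinant evaluations; the function names are illustrative, and F₁, F₂ are assumed to be the two null-space generators already reshaped to 3×3. Root finding is left to any cubic solver.

```python
def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def blend(F1, F2, alpha):
    """alpha * F1 + (1 - alpha) * F2."""
    return [[alpha * F1[i][j] + (1 - alpha) * F2[i][j] for j in range(3)]
            for i in range(3)]


def pencil_cubic(F1, F2):
    """Coefficients (c0, c1, c2, c3) of
    p(a) = det(a F1 + (1 - a) F2) = c0 + c1 a + c2 a^2 + c3 a^3."""
    diff = [[F1[i][j] - F2[i][j] for j in range(3)] for i in range(3)]
    c0 = det3(F2)                     # p(0)
    c3 = det3(diff)                   # leading coefficient, det(F1 - F2)
    p1 = det3(F1)                     # p(1)  = c0 + c1 + c2 + c3
    pm1 = det3(blend(F1, F2, -1))     # p(-1) = c0 - c1 + c2 - c3
    c2 = (p1 + pm1) / 2 - c0
    c1 = (p1 - pm1) / 2 - c3
    return c0, c1, c2, c3


def eval_cubic(coeffs, a):
    c0, c1, c2, c3 = coeffs
    return c0 + a * (c1 + a * (c2 + a * c3))
```

Each real root α of this cubic gives one candidate F = αF₁ + (1−α)F₂, which is how the 7-point method ends up with 1 or 3 solutions.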

How is the rank-2 constraint on F enforced?

Set the smallest singular value to zero in the SVD of F
Add det F = 0 as a row of the equation matrix A
Divide F by its determinant

Chapter 4: Geometric Error and Gold Standard

As with camera estimation, algebraic error does not correspond to anything meaningful geometrically. The Gold Standard approach minimizes the reprojection error:

min_{F, x̂_i, x̂'_i} Σ_i [d(x_i, x̂_i)² + d(x'_i, x̂'_i)²]

subject to x̂'_i^T F x̂_i = 0. This minimizes the sum of squared distances between measured and "corrected" points, where the corrected points exactly satisfy the epipolar constraint.

Sampson error: A first-order approximation to geometric error that avoids estimating the corrected points x̂_i, x̂'_i, so only the parameters of F need to be optimized. It is much cheaper than the Gold Standard and often nearly as accurate. It is computed as:
d_Sampson² = (x'^TFx)² / [(Fx)₁² + (Fx)₂² + (F^Tx')₁² + (F^Tx')₂²]
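The formula translates directly into code. The following is a minimal sketch assuming inhomogeneous pixel inputs and a 3×3 nested-list F; the function names and the demo matrix are illustrative only.

```python
def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]


def sampson_sq(F, x, xp):
    """Squared Sampson distance for x = (x, y) <-> xp = (x', y')."""
    xh = [x[0], x[1], 1.0]
    xph = [xp[0], xp[1], 1.0]
    Fx = mat_vec(F, xh)                    # epipolar line of x in image 2
    Ftxp = mat_vec(transpose(F), xph)      # epipolar line of x' in image 1
    numerator = sum(xph[i] * Fx[i] for i in range(3)) ** 2
    denominator = Fx[0] ** 2 + Fx[1] ** 2 + Ftxp[0] ** 2 + Ftxp[1] ** 2
    return numerator / denominator


# Demo F (an assumption): skew-symmetric matrix of e' = (1, 0, 0)^T,
# for which matching points must share the same y coordinate.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
exact = sampson_sq(F, (10.0, 20.0), (35.0, 20.0))
off_by_two = sampson_sq(F, (10.0, 20.0), (35.0, 22.0))
```

For this F a vertical offset of 2 pixels gives a squared Sampson distance of 2, which matches moving each point by 1 pixel towards the other (1² + 1²).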

What does the Sampson error approximate?

The geometric reprojection error (to first order)
The algebraic error
The rank of F

Chapter 5: RANSAC for Robust F

In practice, point correspondences contain outliers (wrong matches). RANSAC (Random Sample Consensus) handles this by repeatedly sampling minimal subsets.

Step	Action
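1	Randomly select 7 correspondences (the minimal set), or 8 if the linear 8-point solver is used
2	Compute F from this sample; the 7-point solver may return 1 or 3 candidates, each of which is scored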
3	Count inliers: correspondences with Sampson error below threshold
4	Repeat; keep the F with the most inliers
5	Re-estimate F from all inliers (Gold Standard)
6	Guided matching: use the estimated F to find additional correspondences
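How often step 4 must repeat depends on the sample size and the inlier fraction. The sketch below uses the standard RANSAC estimate N = log(1 − p) / log(1 − wˢ), which is not stated in the text above and is introduced here as an assumption; p is the desired confidence, w the inlier fraction, and s the sample size.

```python
import math


def ransac_trials(sample_size, inlier_ratio, confidence=0.99):
    """Number of random samples N such that, with probability `confidence`,
    at least one sample contains only inliers."""
    p_good_sample = inlier_ratio ** sample_size
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good_sample))


trials_7pt = ransac_trials(7, 0.5)
trials_8pt = ransac_trials(8, 0.5)
```

With half the matches being outliers, this gives 588 samples for 7-point sets and 1177 for 8-point sets, one practical reason to prefer the 7-point minimal solver inside RANSAC.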

Guided matching: Once F is estimated, it constrains where matches can appear. For each unmatched feature in image 1, search only along its epipolar line in image 2. This typically doubles the number of correspondences, further improving the final F estimate.
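The search restriction can be sketched as a simple filter: keep only those features of image 2 that lie within a narrow band around the epipolar line l' = Fx. The function names, the band width, and the demo values below are hypothetical; a real matcher would still compare descriptors among the surviving candidates.

```python
import math


def epipolar_line(F, x):
    """Line l' = F x in image 2 for a point x = (x, y) in image 1."""
    xh = (x[0], x[1], 1.0)
    return tuple(sum(F[i][j] * xh[j] for j in range(3)) for i in range(3))


def distance_to_line(line, point):
    """Perpendicular distance from point = (x, y) to line = (a, b, c)."""
    a, b, c = line
    return abs(a * point[0] + b * point[1] + c) / math.hypot(a, b)


def candidates_near_line(F, x, features, max_dist):
    """Features of image 2 within max_dist pixels of the epipolar line of x."""
    line = epipolar_line(F, x)
    return [f for f in features if distance_to_line(line, f) <= max_dist]


# Demo F (an assumption): epipolar lines are the horizontal rows y' = y.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
features_2 = [(35.0, 20.5), (40.0, 25.0), (90.0, 19.2)]
kept = candidates_near_line(F, (10.0, 20.0), features_2, max_dist=1.0)
```

Here the feature at (40, 25) is rejected because it lies 5 pixels from the line y = 20, while the other two remain as candidates.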

What is the purpose of guided matching after RANSAC?

Use the estimated F to find additional correspondences by searching along epipolar lines
Remove all outliers
Estimate the camera matrices

Chapter 6: Degeneracies

Some point configurations cannot uniquely determine F, even with unlimited noise-free data.

Configuration	Null space dim	Solutions
General position, n ≥ 8	1	Unique F
Points + cameras on ruled quadric	2	1 or 3 solutions (including 7-point case)
All points on a plane (or camera rotation only)	3	2-parameter family: F = [t]_×H

The planar degeneracy: If all scene points lie on a plane, the correspondences are related by a homography x' = Hx, and x'^T F x = 0 holds for every F = SH with S skew-symmetric (that is, S = [t]_× for any vector t). S has three parameters, one of which is absorbed by the overall scale of F, leaving a 2-parameter family of solutions. This is the most common degeneracy in practice (e.g., a wall, a road surface). RANSAC with a homography model can detect this case.
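A quick numerical check makes the degeneracy tangible: for matches generated by a homography, every F = [t]_×H satisfies the epipolar constraint exactly, whatever t is. The homography, points, and helper names below are hypothetical, with integer values so the arithmetic is exact.

```python
def skew(t):
    """Skew-symmetric matrix [t]_x, so that skew(t) applied to v equals t x v."""
    return [[0, -t[2], t[1]],
            [t[2], 0, -t[0]],
            [-t[1], t[0], 0]]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def epipolar_residual(F, x, xp):
    """x'^T F x for homogeneous 3-vectors x and x'."""
    Fx = mat_vec(F, x)
    return sum(xp[i] * Fx[i] for i in range(3))


# Hypothetical homography and homogeneous points.
H = [[2, 1, 3], [0, 1, -4], [1, 0, 5]]
points = [[1, 2, 1], [7, -3, 1], [4, 4, 1], [0, 9, 1]]
matches = [(x, mat_vec(H, x)) for x in points]      # x' = H x

residuals = {
    t: [epipolar_residual(mat_mul(skew(t), H), x, xp) for x, xp in matches]
    for t in [(1, 0, 0), (0, 1, 0), (3, -2, 7)]
}
```

All residuals are zero for all three choices of t, so the data alone cannot single out one F.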

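Pure rotation (no translation) is also degenerate: the camera centres coincide, so there is no epipolar geometry and F is undefined (with t = 0, the essential matrix E = [t]_×R is the zero matrix). Corresponding points are instead related by the homography H = K'RK⁻¹.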

When all scene points lie on a plane, what happens to the F estimation?

F has a 2-parameter family of solutions; the points are related by a homography instead
F can still be uniquely determined
F is the identity matrix

Chapter 7: Special Cases

Special camera motions simplify the computation of F:

Motion	F form	DOF	Min points
Pure translation	F = [e]_× (skew-symmetric)	2	2
Planar motion	det F_s = 0	6	6
Calibrated (essential E)	σ₁ = σ₂, σ₃ = 0	5	5
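The pure-translation row can be illustrated with a two-point solver. Under pure translation each image point moves along a line through the epipole, so the epipole is the intersection of two such lines and F is its skew-symmetric matrix. This is a sketch under that assumption, using noise-free, hypothetical homogeneous coordinates; the function names are illustrative.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def skew(e):
    return [[0, -e[2], e[1]],
            [e[2], 0, -e[0]],
            [-e[1], e[0], 0]]


def epipole_from_translation(match1, match2):
    """Epipole (homogeneous) as the intersection of the lines x1 x1' and x2 x2'."""
    (x1, xp1), (x2, xp2) = match1, match2
    l1 = cross(x1, xp1)     # line joining x1 and x1'
    l2 = cross(x2, xp2)     # line joining x2 and x2'
    return cross(l1, l2)    # assumes the two lines are distinct


def residual(F, x, xp):
    Fx = [sum(F[i][j] * x[j] for j in range(3)) for i in range(3)]
    return sum(xp[i] * Fx[i] for i in range(3))


# Two matches whose motion lines both pass through the point (100, 50).
m1 = ((0, 0, 1), (20, 10, 1))
m2 = ((0, 100, 1), (20, 90, 1))
e = epipole_from_translation(m1, m2)
F = skew(e)
epipole_xy = (e[0] / e[2], e[1] / e[2])
```

With noisy data one would use more than two matches and find the point closest to all the lines in a least-squares sense.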

The calibrated case: With calibrated cameras, compute the essential matrix E instead of F. The 8-point algorithm works, but the constraint that E's two non-zero singular values are equal must be enforced. The closest essential matrix to an arbitrary E has singular values ((a+b)/2, (a+b)/2, 0) where a ≥ b ≥ c are the original singular values.

Other entities besides points can constrain F. Vanishing points (images of parallel lines) provide one constraint each. Epipolar tangency of curves and surface outlines also constrains F. But general line correspondences provide no constraint on F: the planes back-projected from any two image lines always intersect in a 3D line, so any pair of image lines is consistent with any F.

Do line correspondences between two images constrain the fundamental matrix F?

No: two planes in 3-space always intersect, so line correspondences provide no constraint on F
Yes: each line gives 2 constraints
Yes: each line gives 1 constraint

Chapter 8: Image Rectification

Image rectification transforms two images so that corresponding epipolar lines become horizontal scan lines (y = y'). This simplifies stereo matching to a 1D search along each row.

Rectification works by applying a projective transformation H to the first image and H' to the second, such that:

Property	After rectification
Epipoles	Both at infinity: e = e' = (1, 0, 0)^T
Epipolar lines	Horizontal: y = y' (corresponding rasters)
F	F = [e']_× = standard skew-symmetric form

Why rectify? Dense stereo matching (computing a disparity for every pixel) requires searching for a correspondence at each pixel. Rectification reduces this from a 2D search to a 1D search along each row, making stereo matching orders of magnitude faster. Most dense stereo pipelines therefore rectify their images first.
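The 1D search can be sketched on a single pair of rectified rows. The example below uses a sum-of-absolute-differences window cost and assumes that a pixel at column x in the left row matches column x − d in the right row; both choices, along with the function name and the toy data, are illustrative assumptions rather than something the text prescribes.

```python
def best_disparity(left_row, right_row, x, half_window, max_disparity):
    """Disparity d in [0, max_disparity] minimising the sum of absolute
    differences between windows centred at left_row[x] and right_row[x - d]."""
    best_d, best_cost = None, None
    for d in range(max_disparity + 1):
        lo_l, hi_l = x - half_window, x + half_window
        lo_r, hi_r = lo_l - d, hi_l - d
        if lo_r < 0 or lo_l < 0 or hi_l >= len(left_row) or hi_r >= len(right_row):
            continue   # window falls outside one of the rows
        cost = sum(abs(left_row[lo_l + k] - right_row[lo_r + k])
                   for k in range(2 * half_window + 1))
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


left = [1, 2, 3, 9, 5, 7, 4, 6, 8, 2]
right = left[2:] + [0, 0]   # toy rectified row: same content shifted by 2 pixels
disparity = best_disparity(left, right, x=5, half_window=1, max_disparity=3)
```

Only `max_disparity + 1` candidates are examined per pixel, instead of a 2D neighbourhood, which is the saving that rectification buys.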

After image rectification, where are the epipoles?

At infinity: e = e' = (1, 0, 0)^T, so epipolar lines are horizontal
At the image centres
At the principal points

Chapter 9: Connections

Link	Connection
Ch 4 → Ch 11	DLT, normalization, and RANSAC all extend from homography estimation to F estimation
Ch 11 → Ch 12	Once F is computed, triangulation recovers 3D structure
Ch 11 → Ch 13	F and plane-induced homographies are related: given H, F can be computed from two further correspondences of points off the plane
Ch 11 → Ch 18	Bundle adjustment refines F (and all cameras) over many views

"The normalized 8-point algorithm: a rehabilitation of an old idea that works."

— Hartley & Zisserman, Chapter 11

What is the single most important practical improvement to the 8-point algorithm?

Normalization of coordinates before solving the linear system
Using more than 8 points
Enforcing the rank-2 constraint
